author: CoprDistGit <infra@openeuler.org> 2023-06-20 04:56:52 +0000
committer: CoprDistGit <infra@openeuler.org> 2023-06-20 04:56:52 +0000
commit: 35aaeaccacbdfc6c6add2bd62e95dc5110be6451 (patch)
tree: cb96f4c1f36a7fb3a93a09decfd81b9e5e7d64ef
parent: 8df1fd119f7ac2aa40b633f9ca2f2c261b1652f7 (diff)
automatic import of python-kuro2sudachi (openeuler20.03)
-rw-r--r-- .gitignore 1
-rw-r--r-- python-kuro2sudachi.spec 419
-rw-r--r-- sources 1
3 files changed, 421 insertions, 0 deletions
diff --git a/.gitignore b/.gitignore
index e69de29..feb86e4 100644
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1 @@
+/kuro2sudachi-0.3.6.tar.gz
diff --git a/python-kuro2sudachi.spec b/python-kuro2sudachi.spec
new file mode 100644
index 0000000..108216e
--- /dev/null
+++ b/python-kuro2sudachi.spec
@@ -0,0 +1,419 @@
+%global _empty_manifest_terminate_build 0
+Name: python-kuro2sudachi
+Version: 0.3.6
+Release: 1
+Summary: please add a summary manually as the author left a blank one
+License: Apache-2.0
+URL: http://github.com/po3rin/kuro2sudachi
+Source0: https://mirrors.aliyun.com/pypi/web/packages/d8/7c/c2bd0bc055f4b22dff32040519c4f7f1727b2c053db0c23f160dad2df0e5/kuro2sudachi-0.3.6.tar.gz
+BuildArch: noarch
+
+Requires: python3-jaconv
+Requires: python3-importlib-metadata
+Requires: python3-sudachipy
+Requires: python3-sudachidict_full
+Requires: python3-SudachiDict-core
+
+%description
+# kuro2sudachi
+
+[![PyPi version](https://img.shields.io/pypi/v/kuro2sudachi.svg)](https://pypi.python.org/pypi/kuro2sudachi/)
+![PyTest](https://github.com/po3rin/kuro2sudachi/workflows/PyTest/badge.svg)
+[![](https://img.shields.io/badge/python-3.7+-blue.svg)](https://www.python.org/downloads/release/python-390/)
+
+kuro2sudachi lets you convert a kuromoji user dictionary to a Sudachi user dictionary.
+
+## Usage
+
+```sh
+$ pip install kuro2sudachi
+$ kuro2sudachi kuromoji_dict.txt -o sudachi_user_dict.txt
+```
+
+## Custom pos convert dict
+
+You can override the conversion config by supplying a JSON file.
+
+```json
+{
+ "固有名詞": {
+ "sudachi_pos": "名詞,固有名詞,一般,*,*,*",
+ "left_id": 4786,
+ "right_id": 4786,
+ "cost": 5000
+ },
+ "名詞": {
+ "sudachi_pos": "名詞,普通名詞,一般,*,*,*",
+ "left_id": 5146,
+ "right_id": 5146,
+ "cost": 5000
+ }
+}
+
+```
+
+```sh
+$ kuro2sudachi kuromoji_dict.txt -o sudachi_user_dict.txt -c convert_config.json
+```
+
+To ignore unsupported POS errors and invalid-format lines, use the `--ignore` flag.
+
+## Dictionary type
+
+You can specify the Sudachi dictionary used for tokenization with the `-s` option (default: core).
+
+```sh
+$ pip install sudachidict_full
+$ kuro2sudachi kuromoji_dict.txt -o sudachi_user_dict.txt -s full
+```
+
+## Auto Splitting
+
+kuro2sudachi supports auto splitting.
+
+```json
+{
+ "名詞": {
+ "sudachi_pos": "名詞,普通名詞,一般,*,*,*",
+ "left_id": 5146,
+ "right_id": 5146,
+ "cost": 5000,
+ "split_mode": "C",
+ "unit_div_mode": [
+ "A", "B"
+ ]
+ }
+}
+```
+
+The output includes unit division info.
+
+```sh
+$ cat kuromoji_dict.txt
+融合たんぱく質,融合たんぱく質,融合たんぱく質,名詞
+発作性心房細動,発作性心房細動,発作性心房細動,名詞
+
+$ kuro2sudachi kuromoji_dict.txt -o sudachi_user_dict.txt -c convert_config.json --ignore
+
+$ cat sudachi_user_dict.txt
+融合たんぱく質,4786,4786,5000,融合たんぱく質,名詞,普通名詞,一般,*,*,*,,融合たんぱく質,*,C,"融合,名詞,普通名詞,サ変可能,*,*,*,ユウゴウ/たんぱく,名詞,普通名詞,一般,*,*,*,タンパク/質,接尾辞,名詞的,一般,*,*,*,シツ","融合,名詞,普通名詞,サ変可能,*,*,*,ユウゴウ/たんぱく質,名詞,普通名詞,一般,*,*,*,タンパクシツ",*
+発作性心房細動,4786,4786,5000,発作性心房細動,名詞,普通名詞,一般,*,*,*,,発作性心房細動,*,C,"発作,名詞,普通名詞,一般,*,*,*,ホッサ/性,接尾辞,名詞的,一般,*,*,*,セイ/心房,名詞,普通名詞,一般,*,*,*,シンボウ/細動,名詞,普通名詞,一般,*,*,*,サイドウ","発作,名詞,普通名詞,一般,*,*,*,ホッサ/性,接尾辞,名詞的,一般,*,*,*,セイ/心房,名詞,普通名詞,一般,*,*,*,シンボウ/細動,名詞,普通名詞,一般,*,*,*,サイドウ",*
+```
+
+## Splitting Words defined by kuromoji
+
+Currently, the CLI does not support word splitting defined by kuromoji. Therefore, the split representation of kuromoji is ignored.
+
+```
+中咽頭ガン,中咽頭 ガン,チュウイントウ ガン,カスタム名詞
+↓
+中咽頭ガン,4786,4786,7000,中咽頭ガン,名詞,固有名詞,一般,*,*,*,チュウイントウガン,中咽頭ガン,*,*,*,*,*
+```
+
+# For Developers
+
+Run the tests:
+
+```sh
+$ poetry install
+$ poetry run pytest
+```
+
+Run the kuro2sudachi command:
+
+```sh
+$ poetry run kuro2sudachi tests/kuromoji_dict_test.txt -o sudachi_user_dict.txt
+```
+
+## TODO
+
+- [ ] split mode
+- [ ] default rewrite
+
+
+%package -n python3-kuro2sudachi
+Summary: please add a summary manually as the author left a blank one
+Provides: python-kuro2sudachi
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-kuro2sudachi
+# kuro2sudachi
+
+[![PyPi version](https://img.shields.io/pypi/v/kuro2sudachi.svg)](https://pypi.python.org/pypi/kuro2sudachi/)
+![PyTest](https://github.com/po3rin/kuro2sudachi/workflows/PyTest/badge.svg)
+[![](https://img.shields.io/badge/python-3.7+-blue.svg)](https://www.python.org/downloads/release/python-390/)
+
+kuro2sudachi lets you convert a kuromoji user dictionary to a Sudachi user dictionary.
+
+## Usage
+
+```sh
+$ pip install kuro2sudachi
+$ kuro2sudachi kuromoji_dict.txt -o sudachi_user_dict.txt
+```
+
+## Custom pos convert dict
+
+You can override the conversion config by supplying a JSON file.
+
+```json
+{
+ "固有名詞": {
+ "sudachi_pos": "名詞,固有名詞,一般,*,*,*",
+ "left_id": 4786,
+ "right_id": 4786,
+ "cost": 5000
+ },
+ "名詞": {
+ "sudachi_pos": "名詞,普通名詞,一般,*,*,*",
+ "left_id": 5146,
+ "right_id": 5146,
+ "cost": 5000
+ }
+}
+
+```
+
+```sh
+$ kuro2sudachi kuromoji_dict.txt -o sudachi_user_dict.txt -c convert_config.json
+```
+
+To ignore unsupported POS errors and invalid-format lines, use the `--ignore` flag.
+
+## Dictionary type
+
+You can specify the Sudachi dictionary used for tokenization with the `-s` option (default: core).
+
+```sh
+$ pip install sudachidict_full
+$ kuro2sudachi kuromoji_dict.txt -o sudachi_user_dict.txt -s full
+```
+
+## Auto Splitting
+
+kuro2sudachi supports auto splitting.
+
+```json
+{
+ "名詞": {
+ "sudachi_pos": "名詞,普通名詞,一般,*,*,*",
+ "left_id": 5146,
+ "right_id": 5146,
+ "cost": 5000,
+ "split_mode": "C",
+ "unit_div_mode": [
+ "A", "B"
+ ]
+ }
+}
+```
+
+The output includes unit division info.
+
+```sh
+$ cat kuromoji_dict.txt
+融合たんぱく質,融合たんぱく質,融合たんぱく質,名詞
+発作性心房細動,発作性心房細動,発作性心房細動,名詞
+
+$ kuro2sudachi kuromoji_dict.txt -o sudachi_user_dict.txt -c convert_config.json --ignore
+
+$ cat sudachi_user_dict.txt
+融合たんぱく質,4786,4786,5000,融合たんぱく質,名詞,普通名詞,一般,*,*,*,,融合たんぱく質,*,C,"融合,名詞,普通名詞,サ変可能,*,*,*,ユウゴウ/たんぱく,名詞,普通名詞,一般,*,*,*,タンパク/質,接尾辞,名詞的,一般,*,*,*,シツ","融合,名詞,普通名詞,サ変可能,*,*,*,ユウゴウ/たんぱく質,名詞,普通名詞,一般,*,*,*,タンパクシツ",*
+発作性心房細動,4786,4786,5000,発作性心房細動,名詞,普通名詞,一般,*,*,*,,発作性心房細動,*,C,"発作,名詞,普通名詞,一般,*,*,*,ホッサ/性,接尾辞,名詞的,一般,*,*,*,セイ/心房,名詞,普通名詞,一般,*,*,*,シンボウ/細動,名詞,普通名詞,一般,*,*,*,サイドウ","発作,名詞,普通名詞,一般,*,*,*,ホッサ/性,接尾辞,名詞的,一般,*,*,*,セイ/心房,名詞,普通名詞,一般,*,*,*,シンボウ/細動,名詞,普通名詞,一般,*,*,*,サイドウ",*
+```
+
+## Splitting Words defined by kuromoji
+
+Currently, the CLI does not support word splitting defined by kuromoji. Therefore, the split representation of kuromoji is ignored.
+
+```
+中咽頭ガン,中咽頭 ガン,チュウイントウ ガン,カスタム名詞
+↓
+中咽頭ガン,4786,4786,7000,中咽頭ガン,名詞,固有名詞,一般,*,*,*,チュウイントウガン,中咽頭ガン,*,*,*,*,*
+```
+
+# For Developers
+
+Run the tests:
+
+```sh
+$ poetry install
+$ poetry run pytest
+```
+
+Run the kuro2sudachi command:
+
+```sh
+$ poetry run kuro2sudachi tests/kuromoji_dict_test.txt -o sudachi_user_dict.txt
+```
+
+## TODO
+
+- [ ] split mode
+- [ ] default rewrite
+
+
+%package help
+Summary: Development documents and examples for kuro2sudachi
+Provides: python3-kuro2sudachi-doc
+%description help
+# kuro2sudachi
+
+[![PyPi version](https://img.shields.io/pypi/v/kuro2sudachi.svg)](https://pypi.python.org/pypi/kuro2sudachi/)
+![PyTest](https://github.com/po3rin/kuro2sudachi/workflows/PyTest/badge.svg)
+[![](https://img.shields.io/badge/python-3.7+-blue.svg)](https://www.python.org/downloads/release/python-390/)
+
+kuro2sudachi lets you convert a kuromoji user dictionary to a Sudachi user dictionary.
+
+## Usage
+
+```sh
+$ pip install kuro2sudachi
+$ kuro2sudachi kuromoji_dict.txt -o sudachi_user_dict.txt
+```
+
+## Custom pos convert dict
+
+You can override the conversion config by supplying a JSON file.
+
+```json
+{
+ "固有名詞": {
+ "sudachi_pos": "名詞,固有名詞,一般,*,*,*",
+ "left_id": 4786,
+ "right_id": 4786,
+ "cost": 5000
+ },
+ "名詞": {
+ "sudachi_pos": "名詞,普通名詞,一般,*,*,*",
+ "left_id": 5146,
+ "right_id": 5146,
+ "cost": 5000
+ }
+}
+
+```
+
+```sh
+$ kuro2sudachi kuromoji_dict.txt -o sudachi_user_dict.txt -c convert_config.json
+```
+
+To ignore unsupported POS errors and invalid-format lines, use the `--ignore` flag.
+
+## Dictionary type
+
+You can specify the Sudachi dictionary used for tokenization with the `-s` option (default: core).
+
+```sh
+$ pip install sudachidict_full
+$ kuro2sudachi kuromoji_dict.txt -o sudachi_user_dict.txt -s full
+```
+
+## Auto Splitting
+
+kuro2sudachi supports auto splitting.
+
+```json
+{
+ "名詞": {
+ "sudachi_pos": "名詞,普通名詞,一般,*,*,*",
+ "left_id": 5146,
+ "right_id": 5146,
+ "cost": 5000,
+ "split_mode": "C",
+ "unit_div_mode": [
+ "A", "B"
+ ]
+ }
+}
+```
+
+The output includes unit division info.
+
+```sh
+$ cat kuromoji_dict.txt
+融合たんぱく質,融合たんぱく質,融合たんぱく質,名詞
+発作性心房細動,発作性心房細動,発作性心房細動,名詞
+
+$ kuro2sudachi kuromoji_dict.txt -o sudachi_user_dict.txt -c convert_config.json --ignore
+
+$ cat sudachi_user_dict.txt
+融合たんぱく質,4786,4786,5000,融合たんぱく質,名詞,普通名詞,一般,*,*,*,,融合たんぱく質,*,C,"融合,名詞,普通名詞,サ変可能,*,*,*,ユウゴウ/たんぱく,名詞,普通名詞,一般,*,*,*,タンパク/質,接尾辞,名詞的,一般,*,*,*,シツ","融合,名詞,普通名詞,サ変可能,*,*,*,ユウゴウ/たんぱく質,名詞,普通名詞,一般,*,*,*,タンパクシツ",*
+発作性心房細動,4786,4786,5000,発作性心房細動,名詞,普通名詞,一般,*,*,*,,発作性心房細動,*,C,"発作,名詞,普通名詞,一般,*,*,*,ホッサ/性,接尾辞,名詞的,一般,*,*,*,セイ/心房,名詞,普通名詞,一般,*,*,*,シンボウ/細動,名詞,普通名詞,一般,*,*,*,サイドウ","発作,名詞,普通名詞,一般,*,*,*,ホッサ/性,接尾辞,名詞的,一般,*,*,*,セイ/心房,名詞,普通名詞,一般,*,*,*,シンボウ/細動,名詞,普通名詞,一般,*,*,*,サイドウ",*
+```
+
+## Splitting Words defined by kuromoji
+
+Currently, the CLI does not support word splitting defined by kuromoji. Therefore, the split representation of kuromoji is ignored.
+
+```
+中咽頭ガン,中咽頭 ガン,チュウイントウ ガン,カスタム名詞
+↓
+中咽頭ガン,4786,4786,7000,中咽頭ガン,名詞,固有名詞,一般,*,*,*,チュウイントウガン,中咽頭ガン,*,*,*,*,*
+```
+
+# For Developers
+
+Run the tests:
+
+```sh
+$ poetry install
+$ poetry run pytest
+```
+
+Run the kuro2sudachi command:
+
+```sh
+$ poetry run kuro2sudachi tests/kuromoji_dict_test.txt -o sudachi_user_dict.txt
+```
+
+## TODO
+
+- [ ] split mode
+- [ ] default rewrite
+
+
+%prep
+%autosetup -n kuro2sudachi-0.3.6
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+ find usr/lib -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+ find usr/lib64 -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+ find usr/bin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+ find usr/sbin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+ find usr/share/man -type f -printf "\"/%h/%f.gz\"\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-kuro2sudachi -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Tue Jun 20 2023 Python_Bot <Python_Bot@openeuler.org> - 0.3.6-1
+- Package Spec generated
diff --git a/sources b/sources
new file mode 100644
index 0000000..42b94d4
--- /dev/null
+++ b/sources
@@ -0,0 +1 @@
+a42e7c7414877ad3fd942709102162fd kuro2sudachi-0.3.6.tar.gz