| author | CoprDistGit <infra@openeuler.org> | 2023-04-10 20:22:04 +0000 |
|---|---|---|
| committer | CoprDistGit <infra@openeuler.org> | 2023-04-10 20:22:04 +0000 |
| commit | 88320c823fd14d1e566858a473bc9f1ff4719c6c | |
| tree | 1daa96ec7a1cb68548219b247ee8da6f42662cdc | |
| parent | 6798afa4d3c79e8679e05950ca2793bea0c3efc5 | |
automatic import of python-spacy-lookups-data
| -rw-r--r-- | .gitignore | 1 |
| -rw-r--r-- | python-spacy-lookups-data.spec | 289 |
| -rw-r--r-- | sources | 1 |
3 files changed, 291 insertions, 0 deletions
@@ -0,0 +1 @@
+/spacy_lookups_data-1.0.3.tar.gz
diff --git a/python-spacy-lookups-data.spec b/python-spacy-lookups-data.spec
new file mode 100644
index 0000000..cd473af
--- /dev/null
+++ b/python-spacy-lookups-data.spec
@@ -0,0 +1,289 @@
+%global _empty_manifest_terminate_build 0
+Name: python-spacy-lookups-data
+Version: 1.0.3
+Release: 1
+Summary: Additional lookup tables and data resources for spaCy
+License: MIT
+URL: https://spacy.io
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/72/cc/0fc40728ca86612c60de09c450eb4c91973677ab22ef4a5f93f96ef3050e/spacy_lookups_data-1.0.3.tar.gz
+BuildArch: noarch
+
+Requires: python3-setuptools
+
+%description
+<a href="https://explosion.ai"><img src="https://explosion.ai/assets/img/logo.svg" width="125" height="125" align="right" /></a>
+
+# spaCy lookups data
+
+This repository contains additional data files to be used with
+[spaCy](https://spacy.io) v2.2+. When it's installed in the same environment as
+spaCy, this package makes the resources for each language available as an entry
+point, which spaCy checks when setting up the `Vocab` and `Lookups`.
+
+Feel free to submit pull requests to update the data. For issues related to the
+data, lookups and integration, please use the
+[spaCy issue tracker](https://github.com/explosion/spaCy/issues).
+
+[>)](https://dev.azure.com/explosion-ai/public/_build?definitionId=12)
+[](https://github.com/explosion/spacy-lookups-data/releases)
+[](https://pypi.org/project/spacy-lookups-data/)
+[](https://anaconda.org/conda-forge/spacy-lookups-data)
+
+## FAQ
+
+### Why does this exist?
+
+The main purpose of this package is to make the default spaCy installation
+smaller and not force every user to download large data files for _all_
+languages by default. Lookups data is now either provided **via the pre-trained
+models** (which serialize out their vocabulary and lookup tables) or by
+**explicitly installing this package** or `spacy[lookups]`.
+
+### When should I install this?
+
+You should install this package if you want to use lemmatization for languages
+that don't yet have a [pretrained model](https://spacy.io/models) available for
+download and don't rely on third-party libraries for lemmatization – for
+example, **Turkish**, **Swedish** or **Croatian**
+([see data files](spacy_lookups_data/data)).
+
+If you are training new models with spaCy, you should probably install this,
+since it contains lemmatization and normalization data for 25+ languages that
+is no longer included as part of the main spaCy library. In particular, you
+should install it if you're creating a **blank model** and you want it to
+include lemmatization and normalization data. Once you've saved out the model
+(e.g. via `nlp.to_disk`), it will include the lookup tables as part of its
+`Vocab`.
+
+### Is this package only for lemmatization?
+
+This package used to only be for lemmatization, but it has been extended to
+include normalization data for many languages. As of v0.3.1 it also includes
+optional probability and Brown cluster data that used to be distributed with
+provided models in spaCy v2.2 but is no longer included in spaCy v2.3. In the
+future it may include other lookup lists and tables as well, e.g. large
+tokenizer exception files.
+
+## Running tests
+
+This package now also includes all
+[data-specific tests](spacy_lookups_data/tests). The test suite depends on
+spaCy.
+
+```bash
+pip install -r requirements.txt
+python -m pytest spacy_lookups_data
+```
+
+If you've installed the package in your spaCy environment, you can also run the
+tests like this:
+
+```bash
+python -m pytest --pyargs spacy_lookups_data
+```
+
+
+
+
+%package -n python3-spacy-lookups-data
+Summary: Additional lookup tables and data resources for spaCy
+Provides: python-spacy-lookups-data
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-spacy-lookups-data
+<a href="https://explosion.ai"><img src="https://explosion.ai/assets/img/logo.svg" width="125" height="125" align="right" /></a>
+
+# spaCy lookups data
+
+This repository contains additional data files to be used with
+[spaCy](https://spacy.io) v2.2+. When it's installed in the same environment as
+spaCy, this package makes the resources for each language available as an entry
+point, which spaCy checks when setting up the `Vocab` and `Lookups`.
+
+Feel free to submit pull requests to update the data. For issues related to the
+data, lookups and integration, please use the
+[spaCy issue tracker](https://github.com/explosion/spaCy/issues).
+
+[>)](https://dev.azure.com/explosion-ai/public/_build?definitionId=12)
+[](https://github.com/explosion/spacy-lookups-data/releases)
+[](https://pypi.org/project/spacy-lookups-data/)
+[](https://anaconda.org/conda-forge/spacy-lookups-data)
+
+## FAQ
+
+### Why does this exist?
+
+The main purpose of this package is to make the default spaCy installation
+smaller and not force every user to download large data files for _all_
+languages by default. Lookups data is now either provided **via the pre-trained
+models** (which serialize out their vocabulary and lookup tables) or by
+**explicitly installing this package** or `spacy[lookups]`.
+
+### When should I install this?
+
+You should install this package if you want to use lemmatization for languages
+that don't yet have a [pretrained model](https://spacy.io/models) available for
+download and don't rely on third-party libraries for lemmatization – for
+example, **Turkish**, **Swedish** or **Croatian**
+([see data files](spacy_lookups_data/data)).
+
+If you are training new models with spaCy, you should probably install this,
+since it contains lemmatization and normalization data for 25+ languages that
+is no longer included as part of the main spaCy library. In particular, you
+should install it if you're creating a **blank model** and you want it to
+include lemmatization and normalization data. Once you've saved out the model
+(e.g. via `nlp.to_disk`), it will include the lookup tables as part of its
+`Vocab`.
+
+### Is this package only for lemmatization?
+
+This package used to only be for lemmatization, but it has been extended to
+include normalization data for many languages. As of v0.3.1 it also includes
+optional probability and Brown cluster data that used to be distributed with
+provided models in spaCy v2.2 but is no longer included in spaCy v2.3. In the
+future it may include other lookup lists and tables as well, e.g. large
+tokenizer exception files.
+
+## Running tests
+
+This package now also includes all
+[data-specific tests](spacy_lookups_data/tests). The test suite depends on
+spaCy.
+
+```bash
+pip install -r requirements.txt
+python -m pytest spacy_lookups_data
+```
+
+If you've installed the package in your spaCy environment, you can also run the
+tests like this:
+
+```bash
+python -m pytest --pyargs spacy_lookups_data
+```
+
+
+
+
+%package help
+Summary: Development documents and examples for spacy-lookups-data
+Provides: python3-spacy-lookups-data-doc
+%description help
+<a href="https://explosion.ai"><img src="https://explosion.ai/assets/img/logo.svg" width="125" height="125" align="right" /></a>
+
+# spaCy lookups data
+
+This repository contains additional data files to be used with
+[spaCy](https://spacy.io) v2.2+. When it's installed in the same environment as
+spaCy, this package makes the resources for each language available as an entry
+point, which spaCy checks when setting up the `Vocab` and `Lookups`.
+
+Feel free to submit pull requests to update the data. For issues related to the
+data, lookups and integration, please use the
+[spaCy issue tracker](https://github.com/explosion/spaCy/issues).
+
+[>)](https://dev.azure.com/explosion-ai/public/_build?definitionId=12)
+[](https://github.com/explosion/spacy-lookups-data/releases)
+[](https://pypi.org/project/spacy-lookups-data/)
+[](https://anaconda.org/conda-forge/spacy-lookups-data)
+
+## FAQ
+
+### Why does this exist?
+
+The main purpose of this package is to make the default spaCy installation
+smaller and not force every user to download large data files for _all_
+languages by default. Lookups data is now either provided **via the pre-trained
+models** (which serialize out their vocabulary and lookup tables) or by
+**explicitly installing this package** or `spacy[lookups]`.
+
+### When should I install this?
+
+You should install this package if you want to use lemmatization for languages
+that don't yet have a [pretrained model](https://spacy.io/models) available for
+download and don't rely on third-party libraries for lemmatization – for
+example, **Turkish**, **Swedish** or **Croatian**
+([see data files](spacy_lookups_data/data)).
+
+If you are training new models with spaCy, you should probably install this,
+since it contains lemmatization and normalization data for 25+ languages that
+is no longer included as part of the main spaCy library. In particular, you
+should install it if you're creating a **blank model** and you want it to
+include lemmatization and normalization data. Once you've saved out the model
+(e.g. via `nlp.to_disk`), it will include the lookup tables as part of its
+`Vocab`.
+
+### Is this package only for lemmatization?
+
+This package used to only be for lemmatization, but it has been extended to
+include normalization data for many languages. As of v0.3.1 it also includes
+optional probability and Brown cluster data that used to be distributed with
+provided models in spaCy v2.2 but is no longer included in spaCy v2.3. In the
+future it may include other lookup lists and tables as well, e.g. large
+tokenizer exception files.
+
+## Running tests
+
+This package now also includes all
+[data-specific tests](spacy_lookups_data/tests). The test suite depends on
+spaCy.
+
+```bash
+pip install -r requirements.txt
+python -m pytest spacy_lookups_data
+```
+
+If you've installed the package in your spaCy environment, you can also run the
+tests like this:
+
+```bash
+python -m pytest --pyargs spacy_lookups_data
+```
+
+
+
+
+%prep
+%autosetup -n spacy-lookups-data-1.0.3
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+	find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+	find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+	find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+	find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+	find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-spacy-lookups-data -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Mon Apr 10 2023 Python_Bot <Python_Bot@openeuler.org> - 1.0.3-1
+- Package Spec generated
@@ -0,0 +1 @@
+13923e6101e68c2d39d1a78a4cb7deb8 spacy_lookups_data-1.0.3.tar.gz
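The package description in the spec says the resources for each language are exposed as entry points that spaCy checks when setting up the `Vocab` and `Lookups`. A minimal sketch of how one might confirm that the installed RPM registered those entry points is shown here. The group name `spacy_lookups` is an assumption that is not stated anywhere in this commit, and `lookup_entry_point_names` is a hypothetical helper, not part of spaCy or of this package.

```python
from importlib.metadata import entry_points


def lookup_entry_point_names(group="spacy_lookups"):
    """Return the sorted names of installed entry points in `group`.

    The default group name is an assumption; adjust it if the package
    registers its tables under a different group.
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry_points() returns a selectable collection.
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: entry_points() returns a dict keyed by group.
        selected = eps.get(group, [])
    return sorted(ep.name for ep in selected)


if __name__ == "__main__":
    names = lookup_entry_point_names()
    if names:
        print("lookup entry points:", ", ".join(names))
    else:
        print("no entry points found in this group")
```

An empty result would suggest either that the package is not installed in the active environment or that the assumed group name is wrong.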
