author CoprDistGit <infra@openeuler.org> 2023-05-05 11:11:55 +0000
committer CoprDistGit <infra@openeuler.org> 2023-05-05 11:11:55 +0000
commit 9ca9b8d2eff00bf4521bb3e27dd90e7dc9e9a097
tree d227ad4852ad9976f09bb05d6a115973899100c9
parent 2938b3d0baf14338c43c708ce639d51dfc62bb48
automatic import of python-spacymoji (openeuler20.03)
-rw-r--r-- .gitignore 1
-rw-r--r-- python-spacymoji.spec 386
-rw-r--r-- sources 1
3 files changed, 388 insertions, 0 deletions
diff --git a/.gitignore b/.gitignore
index e69de29..d3b964b 100644
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1 @@
+/spacymoji-3.0.1.tar.gz
diff --git a/python-spacymoji.spec b/python-spacymoji.spec
new file mode 100644
index 0000000..bfe5e8c
--- /dev/null
+++ b/python-spacymoji.spec
@@ -0,0 +1,386 @@
+%global _empty_manifest_terminate_build 0
+Name: python-spacymoji
+Version: 3.0.1
+Release: 1
+Summary: spaCy pipeline component for adding emoji meta data to Doc, Token and Span objects
+License: MIT
+URL: https://github.com/explosion/spacymoji
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/2d/69/91125a437c48a2c5d40ff89a7adc659dcc4e371223f83540bf1ae990ffd3/spacymoji-3.0.1.tar.gz
+BuildArch: noarch
+
+Requires: python3-spacy
+Requires: python3-emoji
+
+%description
+# spacymoji: emoji for spaCy
+
+[spaCy](https://spacy.io) extension and pipeline component
+for adding emoji meta data to `Doc` objects. Detects emoji consisting of one
+or more unicode characters, and can optionally merge multi-char emoji (combined
+pictures, emoji with skin tone modifiers) into one token. Human-readable emoji
+descriptions are added as a custom attribute, and an optional lookup table can
+be provided for your own descriptions. The extension sets the custom `Doc`,
+`Token` and `Span` attributes `._.is_emoji`, `._.emoji_desc`, `._.has_emoji` and `._.emoji`. You can read more about custom pipeline components and extension attributes [here](https://spacy.io/usage/processing-pipelines).
+
+Emoji are matched using spaCy's [`PhraseMatcher`](https://spacy.io/api/phrasematcher), and looked up in the data
+table provided by the [`emoji` package](https://github.com/carpedm20/emoji).
+
+[![Azure Pipelines](https://img.shields.io/azure-devops/build/explosion-ai/public/22/master.svg?logo=azure-pipelines&style=flat-square&label=build)](https://dev.azure.com/explosion-ai/public/_build?definitionId=22)
+[![Current Release Version](https://img.shields.io/github/release/explosion/spacymoji.svg?style=flat-square&logo=github)](https://github.com/explosion/spacymoji/releases)
+[![pypi Version](https://img.shields.io/pypi/v/spacymoji.svg?style=flat-square&logo=pypi&logoColor=white)](https://pypi.org/project/spacymoji/)
+
+# ⏳ Installation
+
+`spacymoji` requires `spacy` v3.0.0 or higher. For spaCy v2.x, install `spacymoji==2.0.0`.
+
+```bash
+pip install spacymoji
+```
+
+# ☝️ Usage
+
+Import the component and add it anywhere in your pipeline using the string
+name of the `"emoji"` component factory:
+
+```python
+import spacy
+
+nlp = spacy.load("en_core_web_sm")
+nlp.add_pipe("emoji", first=True)
+doc = nlp("This is a test 😻 👍🏿")
+assert doc._.has_emoji is True
+assert doc[2:5]._.has_emoji is True
+assert doc[0]._.is_emoji is False
+assert doc[4]._.is_emoji is True
+assert doc[5]._.emoji_desc == "thumbs up dark skin tone"
+assert len(doc._.emoji) == 2
+assert doc._.emoji[1] == ("👍🏿", 5, "thumbs up dark skin tone")
+```
+
+`spacymoji` only cares about the token text, so you can use it on a blank
+`Language` instance (it should work for all
+[available languages](https://spacy.io/usage/models#languages)!), or in
+a loaded trained pipeline. If your pipeline
+includes a tagger, parser and entity recognizer, make sure to add the emoji
+component as `first=True`, so the spans are merged right after tokenization,
+and _before_ the document is parsed. If your text contains a lot of emoji, this
+might even give you a nice boost in parser accuracy.
+
+## Available attributes
+
+The extension sets attributes on the `Doc`, `Span` and `Token`. You can
+change the attribute names (and other parameters of the Emoji component) by passing
+them via the `config` parameter in the `nlp.add_pipe(...)` method. For more details
+on custom components and attributes, see the
+[processing pipelines documentation](https://spacy.io/usage/processing-pipelines#custom-components).
+
+| Attribute | Type | Description |
+| -------------------- | -------------------------- | ------------------------------------------------------------- |
+| `Token._.is_emoji` | bool | Whether the token is an emoji. |
+| `Token._.emoji_desc` | str | A human-readable description of the emoji. |
+| `Doc._.has_emoji` | bool | Whether the document contains emoji. |
+| `Doc._.emoji` | List[Tuple[str, int, str]] | `(emoji, index, description)` tuples of the document's emoji. |
+| `Span._.has_emoji` | bool  | Whether the span contains emoji. |
+| `Span._.emoji` | List[Tuple[str, int, str]] | `(emoji, index, description)` tuples of the span's emoji. |
+
+## Settings
+
+You can configure the `emoji` factory by setting any of the following parameters in
+the `config` dictionary:
+
+| Setting | Type | Description |
+| ------------- | ------------------------- | -------------------------------------------------------------------------------------------------------------------------------------- |
+| `attrs` | Tuple[str, str, str, str] | Attributes to set on the `._` property. Defaults to `('has_emoji', 'is_emoji', 'emoji_desc', 'emoji')`. |
+| `pattern_id` | str | ID of match pattern, defaults to `'EMOJI'`. Can be changed to avoid ID conflicts. |
+| `merge_spans` | bool | Merge spans containing multi-character emoji, defaults to `True`. Will only merge combined emoji resulting in one icon, not sequences. |
+| `lookup` | Dict[str, str] | Optional lookup table that maps emoji strings to custom descriptions, e.g. translations or other annotations. |
+
+```python
+emoji_config = {"attrs": ("has_e", "is_e", "e_desc", "e"), "lookup": {"👨‍🎤": "David Bowie"}}
+nlp.add_pipe("emoji", first=True, config=emoji_config)
+doc = nlp("We can be 👨‍🎤 heroes")
+assert doc[3]._.is_e
+assert doc[3]._.e_desc == "David Bowie"
+```
+
+If you're training a pipeline, you can define the component config in your [`config.cfg`](https://spacy.io/usage/training):
+
+```ini
+[nlp]
+pipeline = ["emoji", "ner"]
+# ...
+
+[components.emoji]
+factory = "emoji"
+merge_spans = false
+```
+
+
+
+
+%package -n python3-spacymoji
+Summary: spaCy pipeline component for adding emoji meta data to Doc, Token and Span objects
+Provides: python-spacymoji
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-spacymoji
+# spacymoji: emoji for spaCy
+
+[spaCy](https://spacy.io) extension and pipeline component
+for adding emoji meta data to `Doc` objects. Detects emoji consisting of one
+or more unicode characters, and can optionally merge multi-char emoji (combined
+pictures, emoji with skin tone modifiers) into one token. Human-readable emoji
+descriptions are added as a custom attribute, and an optional lookup table can
+be provided for your own descriptions. The extension sets the custom `Doc`,
+`Token` and `Span` attributes `._.is_emoji`, `._.emoji_desc`, `._.has_emoji` and `._.emoji`. You can read more about custom pipeline components and extension attributes [here](https://spacy.io/usage/processing-pipelines).
+
+Emoji are matched using spaCy's [`PhraseMatcher`](https://spacy.io/api/phrasematcher), and looked up in the data
+table provided by the [`emoji` package](https://github.com/carpedm20/emoji).
+
+[![Azure Pipelines](https://img.shields.io/azure-devops/build/explosion-ai/public/22/master.svg?logo=azure-pipelines&style=flat-square&label=build)](https://dev.azure.com/explosion-ai/public/_build?definitionId=22)
+[![Current Release Version](https://img.shields.io/github/release/explosion/spacymoji.svg?style=flat-square&logo=github)](https://github.com/explosion/spacymoji/releases)
+[![pypi Version](https://img.shields.io/pypi/v/spacymoji.svg?style=flat-square&logo=pypi&logoColor=white)](https://pypi.org/project/spacymoji/)
+
+# ⏳ Installation
+
+`spacymoji` requires `spacy` v3.0.0 or higher. For spaCy v2.x, install `spacymoji==2.0.0`.
+
+```bash
+pip install spacymoji
+```
+
+# ☝️ Usage
+
+Import the component and add it anywhere in your pipeline using the string
+name of the `"emoji"` component factory:
+
+```python
+import spacy
+
+nlp = spacy.load("en_core_web_sm")
+nlp.add_pipe("emoji", first=True)
+doc = nlp("This is a test 😻 👍🏿")
+assert doc._.has_emoji is True
+assert doc[2:5]._.has_emoji is True
+assert doc[0]._.is_emoji is False
+assert doc[4]._.is_emoji is True
+assert doc[5]._.emoji_desc == "thumbs up dark skin tone"
+assert len(doc._.emoji) == 2
+assert doc._.emoji[1] == ("👍🏿", 5, "thumbs up dark skin tone")
+```
+
+`spacymoji` only cares about the token text, so you can use it on a blank
+`Language` instance (it should work for all
+[available languages](https://spacy.io/usage/models#languages)!), or in
+a loaded trained pipeline. If your pipeline
+includes a tagger, parser and entity recognizer, make sure to add the emoji
+component as `first=True`, so the spans are merged right after tokenization,
+and _before_ the document is parsed. If your text contains a lot of emoji, this
+might even give you a nice boost in parser accuracy.
+
+## Available attributes
+
+The extension sets attributes on the `Doc`, `Span` and `Token`. You can
+change the attribute names (and other parameters of the Emoji component) by passing
+them via the `config` parameter in the `nlp.add_pipe(...)` method. For more details
+on custom components and attributes, see the
+[processing pipelines documentation](https://spacy.io/usage/processing-pipelines#custom-components).
+
+| Attribute | Type | Description |
+| -------------------- | -------------------------- | ------------------------------------------------------------- |
+| `Token._.is_emoji` | bool | Whether the token is an emoji. |
+| `Token._.emoji_desc` | str | A human-readable description of the emoji. |
+| `Doc._.has_emoji` | bool | Whether the document contains emoji. |
+| `Doc._.emoji` | List[Tuple[str, int, str]] | `(emoji, index, description)` tuples of the document's emoji. |
+| `Span._.has_emoji` | bool  | Whether the span contains emoji. |
+| `Span._.emoji` | List[Tuple[str, int, str]] | `(emoji, index, description)` tuples of the span's emoji. |
+
+## Settings
+
+You can configure the `emoji` factory by setting any of the following parameters in
+the `config` dictionary:
+
+| Setting | Type | Description |
+| ------------- | ------------------------- | -------------------------------------------------------------------------------------------------------------------------------------- |
+| `attrs` | Tuple[str, str, str, str] | Attributes to set on the `._` property. Defaults to `('has_emoji', 'is_emoji', 'emoji_desc', 'emoji')`. |
+| `pattern_id` | str | ID of match pattern, defaults to `'EMOJI'`. Can be changed to avoid ID conflicts. |
+| `merge_spans` | bool | Merge spans containing multi-character emoji, defaults to `True`. Will only merge combined emoji resulting in one icon, not sequences. |
+| `lookup` | Dict[str, str] | Optional lookup table that maps emoji strings to custom descriptions, e.g. translations or other annotations. |
+
+```python
+emoji_config = {"attrs": ("has_e", "is_e", "e_desc", "e"), "lookup": {"👨‍🎤": "David Bowie"}}
+nlp.add_pipe("emoji", first=True, config=emoji_config)
+doc = nlp("We can be 👨‍🎤 heroes")
+assert doc[3]._.is_e
+assert doc[3]._.e_desc == "David Bowie"
+```
+
+If you're training a pipeline, you can define the component config in your [`config.cfg`](https://spacy.io/usage/training):
+
+```ini
+[nlp]
+pipeline = ["emoji", "ner"]
+# ...
+
+[components.emoji]
+factory = "emoji"
+merge_spans = false
+```
+
+
+
+
+%package help
+Summary: Development documents and examples for spacymoji
+Provides: python3-spacymoji-doc
+%description help
+# spacymoji: emoji for spaCy
+
+[spaCy](https://spacy.io) extension and pipeline component
+for adding emoji meta data to `Doc` objects. Detects emoji consisting of one
+or more unicode characters, and can optionally merge multi-char emoji (combined
+pictures, emoji with skin tone modifiers) into one token. Human-readable emoji
+descriptions are added as a custom attribute, and an optional lookup table can
+be provided for your own descriptions. The extension sets the custom `Doc`,
+`Token` and `Span` attributes `._.is_emoji`, `._.emoji_desc`, `._.has_emoji` and `._.emoji`. You can read more about custom pipeline components and extension attributes [here](https://spacy.io/usage/processing-pipelines).
+
+Emoji are matched using spaCy's [`PhraseMatcher`](https://spacy.io/api/phrasematcher), and looked up in the data
+table provided by the [`emoji` package](https://github.com/carpedm20/emoji).
+
+[![Azure Pipelines](https://img.shields.io/azure-devops/build/explosion-ai/public/22/master.svg?logo=azure-pipelines&style=flat-square&label=build)](https://dev.azure.com/explosion-ai/public/_build?definitionId=22)
+[![Current Release Version](https://img.shields.io/github/release/explosion/spacymoji.svg?style=flat-square&logo=github)](https://github.com/explosion/spacymoji/releases)
+[![pypi Version](https://img.shields.io/pypi/v/spacymoji.svg?style=flat-square&logo=pypi&logoColor=white)](https://pypi.org/project/spacymoji/)
+
+# ⏳ Installation
+
+`spacymoji` requires `spacy` v3.0.0 or higher. For spaCy v2.x, install `spacymoji==2.0.0`.
+
+```bash
+pip install spacymoji
+```
+
+# ☝️ Usage
+
+Import the component and add it anywhere in your pipeline using the string
+name of the `"emoji"` component factory:
+
+```python
+import spacy
+
+nlp = spacy.load("en_core_web_sm")
+nlp.add_pipe("emoji", first=True)
+doc = nlp("This is a test 😻 👍🏿")
+assert doc._.has_emoji is True
+assert doc[2:5]._.has_emoji is True
+assert doc[0]._.is_emoji is False
+assert doc[4]._.is_emoji is True
+assert doc[5]._.emoji_desc == "thumbs up dark skin tone"
+assert len(doc._.emoji) == 2
+assert doc._.emoji[1] == ("👍🏿", 5, "thumbs up dark skin tone")
+```
+
+`spacymoji` only cares about the token text, so you can use it on a blank
+`Language` instance (it should work for all
+[available languages](https://spacy.io/usage/models#languages)!), or in
+a loaded trained pipeline. If your pipeline
+includes a tagger, parser and entity recognizer, make sure to add the emoji
+component as `first=True`, so the spans are merged right after tokenization,
+and _before_ the document is parsed. If your text contains a lot of emoji, this
+might even give you a nice boost in parser accuracy.
+
+## Available attributes
+
+The extension sets attributes on the `Doc`, `Span` and `Token`. You can
+change the attribute names (and other parameters of the Emoji component) by passing
+them via the `config` parameter in the `nlp.add_pipe(...)` method. For more details
+on custom components and attributes, see the
+[processing pipelines documentation](https://spacy.io/usage/processing-pipelines#custom-components).
+
+| Attribute | Type | Description |
+| -------------------- | -------------------------- | ------------------------------------------------------------- |
+| `Token._.is_emoji` | bool | Whether the token is an emoji. |
+| `Token._.emoji_desc` | str | A human-readable description of the emoji. |
+| `Doc._.has_emoji` | bool | Whether the document contains emoji. |
+| `Doc._.emoji` | List[Tuple[str, int, str]] | `(emoji, index, description)` tuples of the document's emoji. |
+| `Span._.has_emoji` | bool  | Whether the span contains emoji. |
+| `Span._.emoji` | List[Tuple[str, int, str]] | `(emoji, index, description)` tuples of the span's emoji. |
+
+## Settings
+
+You can configure the `emoji` factory by setting any of the following parameters in
+the `config` dictionary:
+
+| Setting | Type | Description |
+| ------------- | ------------------------- | -------------------------------------------------------------------------------------------------------------------------------------- |
+| `attrs` | Tuple[str, str, str, str] | Attributes to set on the `._` property. Defaults to `('has_emoji', 'is_emoji', 'emoji_desc', 'emoji')`. |
+| `pattern_id` | str | ID of match pattern, defaults to `'EMOJI'`. Can be changed to avoid ID conflicts. |
+| `merge_spans` | bool | Merge spans containing multi-character emoji, defaults to `True`. Will only merge combined emoji resulting in one icon, not sequences. |
+| `lookup` | Dict[str, str] | Optional lookup table that maps emoji strings to custom descriptions, e.g. translations or other annotations. |
+
+```python
+emoji_config = {"attrs": ("has_e", "is_e", "e_desc", "e"), "lookup": {"👨‍🎤": "David Bowie"}}
+nlp.add_pipe("emoji", first=True, config=emoji_config)
+doc = nlp("We can be 👨‍🎤 heroes")
+assert doc[3]._.is_e
+assert doc[3]._.e_desc == "David Bowie"
+```
+
+If you're training a pipeline, you can define the component config in your [`config.cfg`](https://spacy.io/usage/training):
+
+```ini
+[nlp]
+pipeline = ["emoji", "ner"]
+# ...
+
+[components.emoji]
+factory = "emoji"
+merge_spans = false
+```
+
+
+
+
+%prep
+%autosetup -n spacymoji-3.0.1
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+ find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+ find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+ find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+ find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+ find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-spacymoji -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Fri May 05 2023 Python_Bot <Python_Bot@openeuler.org> - 3.0.1-1
+- Package Spec generated
diff --git a/sources b/sources
new file mode 100644
index 0000000..7eaf6c4
--- /dev/null
+++ b/sources
@@ -0,0 +1 @@
+2c81778d55a5603b891e2c8edf4a4ade spacymoji-3.0.1.tar.gz
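
Note on the `sources` entry: the line pairs a 32-character hex string with the tarball name, which looks like an MD5 digest followed by a filename. The Python sketch below checks a downloaded tarball against such a line under that assumption. The helper names `parse_sources_line` and `md5_matches` are hypothetical and not part of any openEuler tooling.

```python
import hashlib
from pathlib import Path
from typing import Tuple


def parse_sources_line(line: str) -> Tuple[str, str]:
    """Split a '<hex-digest> <filename>' line into (digest, filename)."""
    digest, filename = line.split(maxsplit=1)
    return digest.lower(), filename.strip()


def md5_matches(path: Path, expected_hex: str) -> bool:
    """Hash the file in chunks and compare it with the expected MD5 digest."""
    md5 = hashlib.md5()
    with path.open("rb") as handle:
        # Read in fixed-size chunks so large tarballs are not loaded at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            md5.update(chunk)
    return md5.hexdigest() == expected_hex.lower()


# The line added to `sources` in this commit.
digest, filename = parse_sources_line(
    "2c81778d55a5603b891e2c8edf4a4ade spacymoji-3.0.1.tar.gz"
)
```

With the tarball in the current directory, `md5_matches(Path(filename), digest)` would report whether the file matches the recorded digest. Whether the build system itself performs this check is not stated in the commit.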