%global _empty_manifest_terminate_build 0
Name: python-spacymoji
Version: 3.0.1
Release: 1
Summary: spaCy pipeline component for adding emoji meta data to Doc, Token and Span objects
License: MIT
URL: https://github.com/explosion/spacymoji
Source0: https://mirrors.nju.edu.cn/pypi/web/packages/2d/69/91125a437c48a2c5d40ff89a7adc659dcc4e371223f83540bf1ae990ffd3/spacymoji-3.0.1.tar.gz
BuildArch: noarch

Requires: python3-spacy
Requires: python3-emoji

%description
# spacymoji: emoji for spaCy

[spaCy](https://spacy.io) extension and pipeline component for adding emoji meta data to `Doc` objects. Detects emoji consisting of one or more Unicode characters, and can optionally merge multi-char emoji (combined pictures, emoji with skin tone modifiers) into one token. Human-readable emoji descriptions are added as a custom attribute, and an optional lookup table can be provided for your own descriptions. The extension sets the custom `Doc`, `Token` and `Span` attributes `._.is_emoji`, `._.emoji_desc`, `._.has_emoji` and `._.emoji`. You can read more about custom pipeline components and extension attributes [here](https://spacy.io/usage/processing-pipelines).

Emoji are matched using spaCy's [`PhraseMatcher`](https://spacy.io/api/phrasematcher), and looked up in the data table provided by the [`emoji` package](https://github.com/carpedm20/emoji).

[![Azure Pipelines](https://img.shields.io/azure-devops/build/explosion-ai/public/22/master.svg?logo=azure-pipelines&style=flat-square&label=build)](https://dev.azure.com/explosion-ai/public/_build?definitionId=22)
[![Current Release Version](https://img.shields.io/github/release/explosion/spacymoji.svg?style=flat-square&logo=github)](https://github.com/explosion/spacymoji/releases)
[![pypi Version](https://img.shields.io/pypi/v/spacymoji.svg?style=flat-square&logo=pypi&logoColor=white)](https://pypi.org/project/spacymoji/)

# ⏳ Installation

`spacymoji` requires `spacy` v3.0.0 or higher. For spaCy v2.x, install `spacymoji==2.0.0`.

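The version rule above can be written down as a small helper. This is only an illustrative sketch: `spacymoji_requirement` is a hypothetical name that is not part of `spacymoji`, and it assumes the spaCy version is passed in as a plain version string such as `"3.5.0"`.

```python
def spacymoji_requirement(spacy_version: str) -> str:
    """Map a spaCy version string to a pip requirement for spacymoji.

    Hypothetical helper for illustration; not part of spacymoji itself.
    """
    # Only the major version matters for the rule stated above.
    major = int(spacy_version.split(".")[0])
    if major >= 3:
        return "spacymoji"  # unpinned: requires spaCy v3.0.0 or higher
    if major == 2:
        return "spacymoji==2.0.0"  # pinned release for spaCy v2.x
    raise ValueError(f"no spacymoji release is documented for spaCy {spacy_version}")


print(spacymoji_requirement("3.5.0"))
print(spacymoji_requirement("2.3.7"))
```

If spaCy is already installed, its version string could be read with `importlib.metadata.version("spacy")` from the standard library. For a spaCy v3 environment, the requirement is the plain install:
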
```bash
pip install spacymoji
```

# ☝️ Usage

Import the component and add it anywhere in your pipeline using the string name of the `"emoji"` component factory:

```python
import spacy

nlp = spacy.load("en_core_web_sm")
nlp.add_pipe("emoji", first=True)
doc = nlp("This is a test 😻 👍🏿")
assert doc._.has_emoji is True
assert doc[2:5]._.has_emoji is True
assert doc[0]._.is_emoji is False
assert doc[4]._.is_emoji is True
assert doc[5]._.emoji_desc == "thumbs up dark skin tone"
assert len(doc._.emoji) == 2
assert doc._.emoji[1] == ("👍🏿", 5, "thumbs up dark skin tone")
```

`spacymoji` only cares about the token text, so you can use it on a blank `Language` instance (it should work for all [available languages](https://spacy.io/usage/models#languages)!), or add it to a loaded pipeline. If your pipeline includes a tagger, parser and entity recognizer, make sure to add the emoji component with `first=True`, so the spans are merged right after tokenization, and _before_ the document is parsed. If your text contains a lot of emoji, this might even give you a nice boost in parser accuracy.

## Available attributes

The extension sets attributes on the `Doc`, `Span` and `Token`. You can change the attribute names (and other parameters of the Emoji component) by passing them via the `config` parameter in the `nlp.add_pipe(...)` method. For more details on custom components and attributes, see the [processing pipelines documentation](https://spacy.io/usage/processing-pipelines#custom-components).

| Attribute            | Type                       | Description                                                   |
| -------------------- | -------------------------- | ------------------------------------------------------------- |
| `Token._.is_emoji`   | bool                       | Whether the token is an emoji.                                |
| `Token._.emoji_desc` | str                        | A human-readable description of the emoji.                    |
| `Doc._.has_emoji`    | bool                       | Whether the document contains emoji.                          |
| `Doc._.emoji`        | List[Tuple[str, int, str]] | `(emoji, index, description)` tuples of the document's emoji. |
| `Span._.has_emoji`   | bool                       | Whether the span contains emoji.                              |
| `Span._.emoji`       | List[Tuple[str, int, str]] | `(emoji, index, description)` tuples of the span's emoji.     |

## Settings

You can configure the `emoji` factory by setting any of the following parameters in the `config` dictionary:

| Setting       | Type                      | Description |
| ------------- | ------------------------- | ----------- |
| `attrs`       | Tuple[str, str, str, str] | Attributes to set on the `._` property. Defaults to `('has_emoji', 'is_emoji', 'emoji_desc', 'emoji')`. |
| `pattern_id`  | str                       | ID of match pattern, defaults to `'EMOJI'`. Can be changed to avoid ID conflicts. |
| `merge_spans` | bool                      | Merge spans containing multi-character emoji, defaults to `True`. Will only merge combined emoji resulting in one icon, not sequences. |
| `lookup`      | Dict[str, str]            | Optional lookup table that maps emoji strings to custom descriptions, e.g. translations or other annotations. |

```python
import spacy

# A blank pipeline is enough here, since only the token text matters.
nlp = spacy.blank("en")
# Custom attribute names plus a lookup table with a custom description.
emoji_config = {"attrs": ("has_e", "is_e", "e_desc", "e"), "lookup": {"👨‍🎤": "David Bowie"}}
nlp.add_pipe("emoji", first=True, config=emoji_config)
doc = nlp("We can be 👨‍🎤 heroes")
assert doc[3]._.is_e
assert doc[3]._.e_desc == "David Bowie"
```

If you're training a pipeline, you can define the component config in your [`config.cfg`](https://spacy.io/usage/training):

```ini
[nlp]
pipeline = ["emoji", "ner"]
# ...

[components.emoji]
factory = "emoji"
merge_spans = false
```

%package -n python3-spacymoji
Summary: spaCy pipeline component for adding emoji meta data to Doc, Token and Span objects
Provides: python-spacymoji
BuildRequires: python3-devel
BuildRequires: python3-setuptools
BuildRequires: python3-pip
%description -n python3-spacymoji
# spacymoji: emoji for spaCy

[spaCy](https://spacy.io) extension and pipeline component for adding emoji meta data to `Doc` objects.
Detects emoji consisting of one or more Unicode characters, and can optionally merge multi-char emoji (combined pictures, emoji with skin tone modifiers) into one token. Human-readable emoji descriptions are added as a custom attribute, and an optional lookup table can be provided for your own descriptions. The extension sets the custom `Doc`, `Token` and `Span` attributes `._.is_emoji`, `._.emoji_desc`, `._.has_emoji` and `._.emoji`. You can read more about custom pipeline components and extension attributes [here](https://spacy.io/usage/processing-pipelines).

Emoji are matched using spaCy's [`PhraseMatcher`](https://spacy.io/api/phrasematcher), and looked up in the data table provided by the [`emoji` package](https://github.com/carpedm20/emoji).

[![Azure Pipelines](https://img.shields.io/azure-devops/build/explosion-ai/public/22/master.svg?logo=azure-pipelines&style=flat-square&label=build)](https://dev.azure.com/explosion-ai/public/_build?definitionId=22)
[![Current Release Version](https://img.shields.io/github/release/explosion/spacymoji.svg?style=flat-square&logo=github)](https://github.com/explosion/spacymoji/releases)
[![pypi Version](https://img.shields.io/pypi/v/spacymoji.svg?style=flat-square&logo=pypi&logoColor=white)](https://pypi.org/project/spacymoji/)

# ⏳ Installation

`spacymoji` requires `spacy` v3.0.0 or higher. For spaCy v2.x, install `spacymoji==2.0.0`.

```bash
pip install spacymoji
```

# ☝️ Usage

Import the component and add it anywhere in your pipeline using the string name of the `"emoji"` component factory:

```python
import spacy

nlp = spacy.load("en_core_web_sm")
nlp.add_pipe("emoji", first=True)
doc = nlp("This is a test 😻 👍🏿")
assert doc._.has_emoji is True
assert doc[2:5]._.has_emoji is True
assert doc[0]._.is_emoji is False
assert doc[4]._.is_emoji is True
assert doc[5]._.emoji_desc == "thumbs up dark skin tone"
assert len(doc._.emoji) == 2
assert doc._.emoji[1] == ("👍🏿", 5, "thumbs up dark skin tone")
```

`spacymoji` only cares about the token text, so you can use it on a blank `Language` instance (it should work for all [available languages](https://spacy.io/usage/models#languages)!), or add it to a loaded pipeline. If your pipeline includes a tagger, parser and entity recognizer, make sure to add the emoji component with `first=True`, so the spans are merged right after tokenization, and _before_ the document is parsed. If your text contains a lot of emoji, this might even give you a nice boost in parser accuracy.

## Available attributes

The extension sets attributes on the `Doc`, `Span` and `Token`. You can change the attribute names (and other parameters of the Emoji component) by passing them via the `config` parameter in the `nlp.add_pipe(...)` method. For more details on custom components and attributes, see the [processing pipelines documentation](https://spacy.io/usage/processing-pipelines#custom-components).

| Attribute            | Type                       | Description                                                   |
| -------------------- | -------------------------- | ------------------------------------------------------------- |
| `Token._.is_emoji`   | bool                       | Whether the token is an emoji.                                |
| `Token._.emoji_desc` | str                        | A human-readable description of the emoji.                    |
| `Doc._.has_emoji`    | bool                       | Whether the document contains emoji.                          |
| `Doc._.emoji`        | List[Tuple[str, int, str]] | `(emoji, index, description)` tuples of the document's emoji. |
| `Span._.has_emoji`   | bool                       | Whether the span contains emoji.                              |
| `Span._.emoji`       | List[Tuple[str, int, str]] | `(emoji, index, description)` tuples of the span's emoji.     |

## Settings

You can configure the `emoji` factory by setting any of the following parameters in the `config` dictionary:

| Setting       | Type                      | Description |
| ------------- | ------------------------- | ----------- |
| `attrs`       | Tuple[str, str, str, str] | Attributes to set on the `._` property. Defaults to `('has_emoji', 'is_emoji', 'emoji_desc', 'emoji')`. |
| `pattern_id`  | str                       | ID of match pattern, defaults to `'EMOJI'`. Can be changed to avoid ID conflicts. |
| `merge_spans` | bool                      | Merge spans containing multi-character emoji, defaults to `True`. Will only merge combined emoji resulting in one icon, not sequences. |
| `lookup`      | Dict[str, str]            | Optional lookup table that maps emoji strings to custom descriptions, e.g. translations or other annotations. |

```python
import spacy

# A blank pipeline is enough here, since only the token text matters.
nlp = spacy.blank("en")
# Custom attribute names plus a lookup table with a custom description.
emoji_config = {"attrs": ("has_e", "is_e", "e_desc", "e"), "lookup": {"👨‍🎤": "David Bowie"}}
nlp.add_pipe("emoji", first=True, config=emoji_config)
doc = nlp("We can be 👨‍🎤 heroes")
assert doc[3]._.is_e
assert doc[3]._.e_desc == "David Bowie"
```

If you're training a pipeline, you can define the component config in your [`config.cfg`](https://spacy.io/usage/training):

```ini
[nlp]
pipeline = ["emoji", "ner"]
# ...

[components.emoji]
factory = "emoji"
merge_spans = false
```

%package help
Summary: Development documents and examples for spacymoji
Provides: python3-spacymoji-doc
%description help
# spacymoji: emoji for spaCy

[spaCy](https://spacy.io) extension and pipeline component for adding emoji meta data to `Doc` objects. Detects emoji consisting of one or more Unicode characters, and can optionally merge multi-char emoji (combined pictures, emoji with skin tone modifiers) into one token.
Human-readable emoji descriptions are added as a custom attribute, and an optional lookup table can be provided for your own descriptions. The extension sets the custom `Doc`, `Token` and `Span` attributes `._.is_emoji`, `._.emoji_desc`, `._.has_emoji` and `._.emoji`. You can read more about custom pipeline components and extension attributes [here](https://spacy.io/usage/processing-pipelines).

Emoji are matched using spaCy's [`PhraseMatcher`](https://spacy.io/api/phrasematcher), and looked up in the data table provided by the [`emoji` package](https://github.com/carpedm20/emoji).

[![Azure Pipelines](https://img.shields.io/azure-devops/build/explosion-ai/public/22/master.svg?logo=azure-pipelines&style=flat-square&label=build)](https://dev.azure.com/explosion-ai/public/_build?definitionId=22)
[![Current Release Version](https://img.shields.io/github/release/explosion/spacymoji.svg?style=flat-square&logo=github)](https://github.com/explosion/spacymoji/releases)
[![pypi Version](https://img.shields.io/pypi/v/spacymoji.svg?style=flat-square&logo=pypi&logoColor=white)](https://pypi.org/project/spacymoji/)

# ⏳ Installation

`spacymoji` requires `spacy` v3.0.0 or higher. For spaCy v2.x, install `spacymoji==2.0.0`.

```bash
pip install spacymoji
```

# ☝️ Usage

Import the component and add it anywhere in your pipeline using the string name of the `"emoji"` component factory:

```python
import spacy

nlp = spacy.load("en_core_web_sm")
nlp.add_pipe("emoji", first=True)
doc = nlp("This is a test 😻 👍🏿")
assert doc._.has_emoji is True
assert doc[2:5]._.has_emoji is True
assert doc[0]._.is_emoji is False
assert doc[4]._.is_emoji is True
assert doc[5]._.emoji_desc == "thumbs up dark skin tone"
assert len(doc._.emoji) == 2
assert doc._.emoji[1] == ("👍🏿", 5, "thumbs up dark skin tone")
```

`spacymoji` only cares about the token text, so you can use it on a blank `Language` instance (it should work for all [available languages](https://spacy.io/usage/models#languages)!), or add it to a loaded pipeline. If your pipeline includes a tagger, parser and entity recognizer, make sure to add the emoji component with `first=True`, so the spans are merged right after tokenization, and _before_ the document is parsed. If your text contains a lot of emoji, this might even give you a nice boost in parser accuracy.

## Available attributes

The extension sets attributes on the `Doc`, `Span` and `Token`. You can change the attribute names (and other parameters of the Emoji component) by passing them via the `config` parameter in the `nlp.add_pipe(...)` method. For more details on custom components and attributes, see the [processing pipelines documentation](https://spacy.io/usage/processing-pipelines#custom-components).

| Attribute            | Type                       | Description                                                   |
| -------------------- | -------------------------- | ------------------------------------------------------------- |
| `Token._.is_emoji`   | bool                       | Whether the token is an emoji.                                |
| `Token._.emoji_desc` | str                        | A human-readable description of the emoji.                    |
| `Doc._.has_emoji`    | bool                       | Whether the document contains emoji.                          |
| `Doc._.emoji`        | List[Tuple[str, int, str]] | `(emoji, index, description)` tuples of the document's emoji. |
| `Span._.has_emoji`   | bool                       | Whether the span contains emoji.                              |
| `Span._.emoji`       | List[Tuple[str, int, str]] | `(emoji, index, description)` tuples of the span's emoji.     |

## Settings

You can configure the `emoji` factory by setting any of the following parameters in the `config` dictionary:

| Setting       | Type                      | Description |
| ------------- | ------------------------- | ----------- |
| `attrs`       | Tuple[str, str, str, str] | Attributes to set on the `._` property. Defaults to `('has_emoji', 'is_emoji', 'emoji_desc', 'emoji')`. |
| `pattern_id`  | str                       | ID of match pattern, defaults to `'EMOJI'`. Can be changed to avoid ID conflicts. |
| `merge_spans` | bool                      | Merge spans containing multi-character emoji, defaults to `True`. Will only merge combined emoji resulting in one icon, not sequences. |
| `lookup`      | Dict[str, str]            | Optional lookup table that maps emoji strings to custom descriptions, e.g. translations or other annotations. |

```python
import spacy

# A blank pipeline is enough here, since only the token text matters.
nlp = spacy.blank("en")
# Custom attribute names plus a lookup table with a custom description.
emoji_config = {"attrs": ("has_e", "is_e", "e_desc", "e"), "lookup": {"👨‍🎤": "David Bowie"}}
nlp.add_pipe("emoji", first=True, config=emoji_config)
doc = nlp("We can be 👨‍🎤 heroes")
assert doc[3]._.is_e
assert doc[3]._.e_desc == "David Bowie"
```

If you're training a pipeline, you can define the component config in your [`config.cfg`](https://spacy.io/usage/training):

```ini
[nlp]
pipeline = ["emoji", "ner"]
# ...

[components.emoji]
factory = "emoji"
merge_spans = false
```

%prep
%autosetup -n spacymoji-3.0.1

%build
%py3_build

%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
	find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
	find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
	find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
	find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
	find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .

%files -n python3-spacymoji -f filelist.lst
%dir %{python3_sitelib}/*

%files help -f doclist.lst
%{_docdir}/*

%changelog
* Fri May 05 2023 Python_Bot - 3.0.1-1
- Package Spec generated