%global _empty_manifest_terminate_build 0
Name: python-demoji
Version: 1.1.0
Release: 1
Summary: Accurately remove and replace emojis in text strings
License: Apache-2.0
URL: https://github.com/bsolomon1124/demoji
Source0: https://mirrors.nju.edu.cn/pypi/web/packages/9d/62/e6de96cf1ef2c6ac91b84a51af151d791f874529d8b146d3587771d05727/demoji-1.1.0.tar.gz
BuildArch: noarch

Requires: python3-importlib-resources
Requires: python3-ujson

%description
## Major Changes in Version 1.x

Version 1.x of `demoji` now bundles Unicode data in the package at install time rather than requiring a download of the codes from unicode.org at runtime. Please see the [CHANGELOG.md](CHANGELOG.md) for detail and be familiar with the changes before updating from 0.x to 1.x.

To report any regressions, please [open a GitHub issue](https://github.com/bsolomon1124/demoji/issues/new?assignees=&labels=&template=bug_report.md&title=).

## Basic Usage

`demoji` exports several text-related functions for find-and-replace functionality with emojis:

```python
>>> tweet = """\
>>> demoji.findall(tweet)
{
    "🔥": "fire",
    "🌋": "volcano",
    "👨🏽\u200d⚖️": "man judge: medium skin tone",
    "🎅🏾": "Santa Claus: medium-dark skin tone",
    "🇲🇽": "flag: Mexico",
    "👹": "ogre",
    "🤡": "clown face",
    "🇳🇮": "flag: Nicaragua",
    "🚣🏼": "person rowing boat: medium-light skin tone",
    "🐂": "ox",
}
```

See [below](#reference) for function API.

## Command-line Use

You can use `demoji` or `python -m demoji` to replace emojis in file(s) or stdin with their `:code:` equivalents:

```bash
$ cat out.txt
All done! ✨ 🍰 ✨
$ demoji out.txt
All done! :sparkles: :shortcake: :sparkles:
$ echo 'All done! ✨ 🍰 ✨' | demoji
All done! :sparkles: :shortcake: :sparkles:
$ demoji -
we didnt start the 🔥
we didnt start the :fire:
```

## Reference

```python
findall(string: str) -> Dict[str, str]
```

Find emojis within `string`. Return a mapping of `{emoji: description}`.
```python
findall_list(string: str, desc: bool = True) -> List[str]
```

Find emojis within `string`. Return a list (with possible duplicates).
If `desc` is True, the list contains description codes.
If `desc` is False, the list contains emojis.

```python
replace(string: str, repl: str = "") -> str
```

Replace emojis in `string` with `repl`.

```python
replace_with_desc(string: str, sep: str = ":") -> str
```

Replace emojis in `string` with their description codes. The codes are surrounded by `sep`.

```python
last_downloaded_timestamp() -> datetime.datetime
```

Show the timestamp of last download for the emoji data bundled with the package.

## Footnote: Emoji Sequences

Numerous emojis that look like single Unicode characters are actually multi-character sequences. Examples:

- The keycap 2️⃣ is actually 3 characters, U+0032 (the ASCII digit 2), U+FE0F (variation selector), and U+20E3 (combining enclosing keycap).
- The flag of Scotland is 7 component characters, `b'\\U0001f3f4\\U000e0067\\U000e0062\\U000e0073\\U000e0063\\U000e0074\\U000e007f'` in full escaped notation.

(You can see any of these through `s.encode("unicode-escape")`.)

`demoji` is careful to handle this and should find the full sequences rather than their incomplete subcomponents.

The way it does this is to sort emoji codes by their length, and then compile a concatenated regular expression that will greedily search for longer emojis first, falling back to shorter ones if not found. This is not by any means a super-optimized way of searching as it has O(N^2) properties, but the focus is on accuracy and completeness.
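The longest-first matching described above can be illustrated with a minimal sketch. This is not the package's actual source: the four-entry `EMOJI_CODES` table and the helper names `build_pattern` and `find_descriptions` are hypothetical, chosen only to show why sorting by length before building the alternation matters.

```python
import re

# Hypothetical miniature table; the real package bundles a far larger one.
EMOJI_CODES = {
    "\U0001f64b": "person raising hand",
    "\U0001f64b\u200d\u2642\ufe0f": "man raising hand",
    "\U0001f64b\u200d\u2640\ufe0f": "woman raising hand",
    "2\ufe0f\u20e3": "keycap: 2",
}


def build_pattern(codes):
    """Compile one alternation with the longest sequences listed first."""
    ordered = sorted(codes, key=len, reverse=True)
    return re.compile("|".join(re.escape(code) for code in ordered))


PATTERN = build_pattern(EMOJI_CODES)


def find_descriptions(text):
    """Return the description of each emoji found, in order of appearance."""
    return [EMOJI_CODES[match.group(0)] for match in PATTERN.finditer(text)]


sample = "\U0001f64b, \U0001f64b\u200d\u2642\ufe0f and 2\ufe0f\u20e3"
print(find_descriptions(sample))
```

Python's `re` module tries alternatives from left to right, so listing longer sequences first keeps a ZWJ sequence from being matched as its shorter base emoji. If the same table were sorted shortest-first, the second emoji in `sample` would be reported as the plain base emoji.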
```python
>>> from pprint import pprint
>>> seq = """\
... I bet you didn't know that 🙋, 🙋‍♂️, and 🙋‍♀️ are three different emojis.
... """
>>> pprint(seq.encode('unicode-escape'))  # Python 3
(b"I bet you didn't know that \\U0001f64b, \\U0001f64b\\u200d\\u2642\\ufe0f,"
 b' and \\U0001f64b\\u200d\\u2640\\ufe0f are three different emojis.\\n')
```

# Changelog

## 1.1.0

- Add a `__main__.py` to allow running `python -m demoji`; add an entry-point `demoji` command; permit stdin (`-`), file name(s), or piped stdin. Contribution by @jap.

## 1.0.0

**This is a backwards-incompatible release with several substantial changes.**

The largest change is that `demoji` now bundles a static copy of Unicode emoji data with the package at install time, rather than requiring a runtime download of the codes from unicode.org.

Changes below are grouped by their corresponding [Semantic Versioning](https://semver.org/) identifier.

SemVer MAJOR:

- Drop support for Python 2 and Python 3.5
- The `demoji` package now bundles emoji data that is distributed with the package at install time, rather than requiring a download of the codes from the unicode.org site at runtime (closes #23)
- As a result of the above change, the following functions are **removed** from the `demoji` API:
  - `download_codes()`
  - `parse_unicode_sequence()`
  - `parse_unicode_range()`
  - `stream_unicodeorg_emojifile()`

SemVer MINOR:

- The `demoji.DIRECTORY` and `demoji.CACHEPATH` attributes are deprecated due to no longer being functionally in use by the package.
  Accessing them will warn with a `FutureWarning`, and these attributes may be removed completely in a future release
- `demoji` can now be installed with optional `ujson` support for faster loading of emoji data from file (versus the standard library's `json`, which is the default); use `python -m pip install demoji[ujson]`
- The dependencies `requests` and `colorama` have been removed completely
- `importlib_resources` (a backport module) is now required for Python < 3.7
- The `EMOJI_VERSION` attribute, newly added to `demoji`, is a `str` denoting the Unicode database version in use

SemVer PATCH:

- Fix a typo in `demoji.__all__` to properly include `demoji.findall_list()`
- Internal change: Functions that call `set_emoji_pattern()` are now decorated with a `@cache_setter` to set the cache
- Some unit tests have been removed to update the change in behavior from downloading codes to bundling codes with install
- Update README to reflect bundling behavior

## 0.4.0

- Update emoji source list to version 13.1. (See 5090eb5.)
- Formally support Python 3.9. (See 6e9c34c.)
- Bugfix: ensure that `demoji.last_downloaded_timestamp()` returns correct UTC time. (See 6c8ad15.)

## 0.3.0

- Feature: add `findall_list()` and `replace_with_desc()` functions. (See 7cea333.)
- Modernize setup config to use `setup.cfg`. (See 8f141e7.)

## 0.2.1

- Tox: formally add Python 3.8 tests.

## 0.2.0

- Windows: use the [colorama] package to support printing ANSI escape sequences on Windows; this introduces colorama as a dependency. (See cd343c1.)
- Setup: Fix a bug in `setup.py` that would require dependencies to be installed _prior to_ installation of `demoji` in order to find the `__version__`. (See d5f429c.)
- Python 2 + Windows support: use `io.open(..., encoding='utf-8')` consistently in `setup.py`. (See 1efec5d.)
- Distribution: use a universal wheel in PyPI release. (See 8636a32.)

[colorama]: https://github.com/tartley/colorama

## 0.1.5

- Performance improvement: use `re.escape()` rather than failing to compile a small subset of codes.
- Remove an unused constant in `__init__.py`.

%package -n python3-demoji
Summary: Accurately remove and replace emojis in text strings
Provides: python-demoji
BuildRequires: python3-devel
BuildRequires: python3-setuptools
BuildRequires: python3-pip

%description -n python3-demoji
## Major Changes in Version 1.x

Version 1.x of `demoji` now bundles Unicode data in the package at install time rather than requiring a download of the codes from unicode.org at runtime. Please see the [CHANGELOG.md](CHANGELOG.md) for detail and be familiar with the changes before updating from 0.x to 1.x.

To report any regressions, please [open a GitHub issue](https://github.com/bsolomon1124/demoji/issues/new?assignees=&labels=&template=bug_report.md&title=).

## Basic Usage

`demoji` exports several text-related functions for find-and-replace functionality with emojis:

```python
>>> tweet = """\
>>> demoji.findall(tweet)
{
    "🔥": "fire",
    "🌋": "volcano",
    "👨🏽\u200d⚖️": "man judge: medium skin tone",
    "🎅🏾": "Santa Claus: medium-dark skin tone",
    "🇲🇽": "flag: Mexico",
    "👹": "ogre",
    "🤡": "clown face",
    "🇳🇮": "flag: Nicaragua",
    "🚣🏼": "person rowing boat: medium-light skin tone",
    "🐂": "ox",
}
```

See [below](#reference) for function API.

## Command-line Use

You can use `demoji` or `python -m demoji` to replace emojis in file(s) or stdin with their `:code:` equivalents:

```bash
$ cat out.txt
All done! ✨ 🍰 ✨
$ demoji out.txt
All done! :sparkles: :shortcake: :sparkles:
$ echo 'All done! ✨ 🍰 ✨' | demoji
All done! :sparkles: :shortcake: :sparkles:
$ demoji -
we didnt start the 🔥
we didnt start the :fire:
```

## Reference

```python
findall(string: str) -> Dict[str, str]
```

Find emojis within `string`. Return a mapping of `{emoji: description}`.
```python
findall_list(string: str, desc: bool = True) -> List[str]
```

Find emojis within `string`. Return a list (with possible duplicates).
If `desc` is True, the list contains description codes.
If `desc` is False, the list contains emojis.

```python
replace(string: str, repl: str = "") -> str
```

Replace emojis in `string` with `repl`.

```python
replace_with_desc(string: str, sep: str = ":") -> str
```

Replace emojis in `string` with their description codes. The codes are surrounded by `sep`.

```python
last_downloaded_timestamp() -> datetime.datetime
```

Show the timestamp of last download for the emoji data bundled with the package.

## Footnote: Emoji Sequences

Numerous emojis that look like single Unicode characters are actually multi-character sequences. Examples:

- The keycap 2️⃣ is actually 3 characters, U+0032 (the ASCII digit 2), U+FE0F (variation selector), and U+20E3 (combining enclosing keycap).
- The flag of Scotland is 7 component characters, `b'\\U0001f3f4\\U000e0067\\U000e0062\\U000e0073\\U000e0063\\U000e0074\\U000e007f'` in full escaped notation.

(You can see any of these through `s.encode("unicode-escape")`.)

`demoji` is careful to handle this and should find the full sequences rather than their incomplete subcomponents.

The way it does this is to sort emoji codes by their length, and then compile a concatenated regular expression that will greedily search for longer emojis first, falling back to shorter ones if not found. This is not by any means a super-optimized way of searching as it has O(N^2) properties, but the focus is on accuracy and completeness.
```python
>>> from pprint import pprint
>>> seq = """\
... I bet you didn't know that 🙋, 🙋‍♂️, and 🙋‍♀️ are three different emojis.
... """
>>> pprint(seq.encode('unicode-escape'))  # Python 3
(b"I bet you didn't know that \\U0001f64b, \\U0001f64b\\u200d\\u2642\\ufe0f,"
 b' and \\U0001f64b\\u200d\\u2640\\ufe0f are three different emojis.\\n')
```

# Changelog

## 1.1.0

- Add a `__main__.py` to allow running `python -m demoji`; add an entry-point `demoji` command; permit stdin (`-`), file name(s), or piped stdin. Contribution by @jap.

## 1.0.0

**This is a backwards-incompatible release with several substantial changes.**

The largest change is that `demoji` now bundles a static copy of Unicode emoji data with the package at install time, rather than requiring a runtime download of the codes from unicode.org.

Changes below are grouped by their corresponding [Semantic Versioning](https://semver.org/) identifier.

SemVer MAJOR:

- Drop support for Python 2 and Python 3.5
- The `demoji` package now bundles emoji data that is distributed with the package at install time, rather than requiring a download of the codes from the unicode.org site at runtime (closes #23)
- As a result of the above change, the following functions are **removed** from the `demoji` API:
  - `download_codes()`
  - `parse_unicode_sequence()`
  - `parse_unicode_range()`
  - `stream_unicodeorg_emojifile()`

SemVer MINOR:

- The `demoji.DIRECTORY` and `demoji.CACHEPATH` attributes are deprecated due to no longer being functionally in use by the package.
  Accessing them will warn with a `FutureWarning`, and these attributes may be removed completely in a future release
- `demoji` can now be installed with optional `ujson` support for faster loading of emoji data from file (versus the standard library's `json`, which is the default); use `python -m pip install demoji[ujson]`
- The dependencies `requests` and `colorama` have been removed completely
- `importlib_resources` (a backport module) is now required for Python < 3.7
- The `EMOJI_VERSION` attribute, newly added to `demoji`, is a `str` denoting the Unicode database version in use

SemVer PATCH:

- Fix a typo in `demoji.__all__` to properly include `demoji.findall_list()`
- Internal change: Functions that call `set_emoji_pattern()` are now decorated with a `@cache_setter` to set the cache
- Some unit tests have been removed to update the change in behavior from downloading codes to bundling codes with install
- Update README to reflect bundling behavior

## 0.4.0

- Update emoji source list to version 13.1. (See 5090eb5.)
- Formally support Python 3.9. (See 6e9c34c.)
- Bugfix: ensure that `demoji.last_downloaded_timestamp()` returns correct UTC time. (See 6c8ad15.)

## 0.3.0

- Feature: add `findall_list()` and `replace_with_desc()` functions. (See 7cea333.)
- Modernize setup config to use `setup.cfg`. (See 8f141e7.)

## 0.2.1

- Tox: formally add Python 3.8 tests.

## 0.2.0

- Windows: use the [colorama] package to support printing ANSI escape sequences on Windows; this introduces colorama as a dependency. (See cd343c1.)
- Setup: Fix a bug in `setup.py` that would require dependencies to be installed _prior to_ installation of `demoji` in order to find the `__version__`. (See d5f429c.)
- Python 2 + Windows support: use `io.open(..., encoding='utf-8')` consistently in `setup.py`. (See 1efec5d.)
- Distribution: use a universal wheel in PyPI release. (See 8636a32.)

[colorama]: https://github.com/tartley/colorama

## 0.1.5

- Performance improvement: use `re.escape()` rather than failing to compile a small subset of codes.
- Remove an unused constant in `__init__.py`.

%package help
Summary: Development documents and examples for demoji
Provides: python3-demoji-doc

%description help
## Major Changes in Version 1.x

Version 1.x of `demoji` now bundles Unicode data in the package at install time rather than requiring a download of the codes from unicode.org at runtime. Please see the [CHANGELOG.md](CHANGELOG.md) for detail and be familiar with the changes before updating from 0.x to 1.x.

To report any regressions, please [open a GitHub issue](https://github.com/bsolomon1124/demoji/issues/new?assignees=&labels=&template=bug_report.md&title=).

## Basic Usage

`demoji` exports several text-related functions for find-and-replace functionality with emojis:

```python
>>> tweet = """\
>>> demoji.findall(tweet)
{
    "🔥": "fire",
    "🌋": "volcano",
    "👨🏽\u200d⚖️": "man judge: medium skin tone",
    "🎅🏾": "Santa Claus: medium-dark skin tone",
    "🇲🇽": "flag: Mexico",
    "👹": "ogre",
    "🤡": "clown face",
    "🇳🇮": "flag: Nicaragua",
    "🚣🏼": "person rowing boat: medium-light skin tone",
    "🐂": "ox",
}
```

See [below](#reference) for function API.

## Command-line Use

You can use `demoji` or `python -m demoji` to replace emojis in file(s) or stdin with their `:code:` equivalents:

```bash
$ cat out.txt
All done! ✨ 🍰 ✨
$ demoji out.txt
All done! :sparkles: :shortcake: :sparkles:
$ echo 'All done! ✨ 🍰 ✨' | demoji
All done! :sparkles: :shortcake: :sparkles:
$ demoji -
we didnt start the 🔥
we didnt start the :fire:
```

## Reference

```python
findall(string: str) -> Dict[str, str]
```

Find emojis within `string`. Return a mapping of `{emoji: description}`.

```python
findall_list(string: str, desc: bool = True) -> List[str]
```

Find emojis within `string`. Return a list (with possible duplicates).
If `desc` is True, the list contains description codes.
If `desc` is False, the list contains emojis.

```python
replace(string: str, repl: str = "") -> str
```

Replace emojis in `string` with `repl`.

```python
replace_with_desc(string: str, sep: str = ":") -> str
```

Replace emojis in `string` with their description codes. The codes are surrounded by `sep`.

```python
last_downloaded_timestamp() -> datetime.datetime
```

Show the timestamp of last download for the emoji data bundled with the package.

## Footnote: Emoji Sequences

Numerous emojis that look like single Unicode characters are actually multi-character sequences. Examples:

- The keycap 2️⃣ is actually 3 characters, U+0032 (the ASCII digit 2), U+FE0F (variation selector), and U+20E3 (combining enclosing keycap).
- The flag of Scotland is 7 component characters, `b'\\U0001f3f4\\U000e0067\\U000e0062\\U000e0073\\U000e0063\\U000e0074\\U000e007f'` in full escaped notation.

(You can see any of these through `s.encode("unicode-escape")`.)

`demoji` is careful to handle this and should find the full sequences rather than their incomplete subcomponents.

The way it does this is to sort emoji codes by their length, and then compile a concatenated regular expression that will greedily search for longer emojis first, falling back to shorter ones if not found. This is not by any means a super-optimized way of searching as it has O(N^2) properties, but the focus is on accuracy and completeness.

```python
>>> from pprint import pprint
>>> seq = """\
... I bet you didn't know that 🙋, 🙋‍♂️, and 🙋‍♀️ are three different emojis.
... """
>>> pprint(seq.encode('unicode-escape'))  # Python 3
(b"I bet you didn't know that \\U0001f64b, \\U0001f64b\\u200d\\u2642\\ufe0f,"
 b' and \\U0001f64b\\u200d\\u2640\\ufe0f are three different emojis.\\n')
```

# Changelog

## 1.1.0

- Add a `__main__.py` to allow running `python -m demoji`; add an entry-point `demoji` command; permit stdin (`-`), file name(s), or piped stdin. Contribution by @jap.

## 1.0.0

**This is a backwards-incompatible release with several substantial changes.**

The largest change is that `demoji` now bundles a static copy of Unicode emoji data with the package at install time, rather than requiring a runtime download of the codes from unicode.org.

Changes below are grouped by their corresponding [Semantic Versioning](https://semver.org/) identifier.

SemVer MAJOR:

- Drop support for Python 2 and Python 3.5
- The `demoji` package now bundles emoji data that is distributed with the package at install time, rather than requiring a download of the codes from the unicode.org site at runtime (closes #23)
- As a result of the above change, the following functions are **removed** from the `demoji` API:
  - `download_codes()`
  - `parse_unicode_sequence()`
  - `parse_unicode_range()`
  - `stream_unicodeorg_emojifile()`

SemVer MINOR:

- The `demoji.DIRECTORY` and `demoji.CACHEPATH` attributes are deprecated due to no longer being functionally in use by the package.
  Accessing them will warn with a `FutureWarning`, and these attributes may be removed completely in a future release
- `demoji` can now be installed with optional `ujson` support for faster loading of emoji data from file (versus the standard library's `json`, which is the default); use `python -m pip install demoji[ujson]`
- The dependencies `requests` and `colorama` have been removed completely
- `importlib_resources` (a backport module) is now required for Python < 3.7
- The `EMOJI_VERSION` attribute, newly added to `demoji`, is a `str` denoting the Unicode database version in use

SemVer PATCH:

- Fix a typo in `demoji.__all__` to properly include `demoji.findall_list()`
- Internal change: Functions that call `set_emoji_pattern()` are now decorated with a `@cache_setter` to set the cache
- Some unit tests have been removed to update the change in behavior from downloading codes to bundling codes with install
- Update README to reflect bundling behavior

## 0.4.0

- Update emoji source list to
  version 13.1. (See 5090eb5.)
- Formally support Python 3.9. (See 6e9c34c.)
- Bugfix: ensure that `demoji.last_downloaded_timestamp()` returns correct UTC time. (See 6c8ad15.)

## 0.3.0

- Feature: add `findall_list()` and `replace_with_desc()` functions. (See 7cea333.)
- Modernize setup config to use `setup.cfg`. (See 8f141e7.)

## 0.2.1

- Tox: formally add Python 3.8 tests.

## 0.2.0

- Windows: use the [colorama] package to support printing ANSI escape sequences on Windows; this introduces colorama as a dependency. (See cd343c1.)
- Setup: Fix a bug in `setup.py` that would require dependencies to be installed _prior to_ installation of `demoji` in order to find the `__version__`. (See d5f429c.)
- Python 2 + Windows support: use `io.open(..., encoding='utf-8')` consistently in `setup.py`. (See 1efec5d.)
- Distribution: use a universal wheel in PyPI release. (See 8636a32.)

[colorama]: https://github.com/tartley/colorama

## 0.1.5

- Performance improvement: use `re.escape()` rather than failing to compile a small subset of codes.
- Remove an unused constant in `__init__.py`.

%prep
%autosetup -n demoji-1.1.0

%build
%py3_build

%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
	find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
	find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
	find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
	find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
	find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .

%files -n python3-demoji -f filelist.lst
%dir %{python3_sitelib}/*

%files help -f doclist.lst
%{_docdir}/*

%changelog
* Tue Apr 11 2023 Python_Bot - 1.1.0-1
- Package Spec generated