| field | value | date |
|---|---|---|
| author | CoprDistGit <infra@openeuler.org> | 2023-05-18 07:11:58 +0000 |
| committer | CoprDistGit <infra@openeuler.org> | 2023-05-18 07:11:58 +0000 |
| commit | 2e7c3b3f5de7a94a048b53356419a48b167e4c78 | |
| tree | ff7e03cfa23f86f27ba0e849b35f587aea86f970 /python-smaz-py3.spec | |
| parent | f7129c17ecb2a523996f4a23a58c0eb342c6fbfb | |
automatic import of python-smaz-py3
Diffstat (limited to 'python-smaz-py3.spec')
| -rw-r--r-- | python-smaz-py3.spec | 410 |
1 files changed, 410 insertions, 0 deletions
diff --git a/python-smaz-py3.spec b/python-smaz-py3.spec new file mode 100644 index 0000000..2f9bea7 --- /dev/null +++ b/python-smaz-py3.spec @@ -0,0 +1,410 @@ +%global _empty_manifest_terminate_build 0 +Name: python-smaz-py3 +Version: 1.1.2 +Release: 1 +Summary: Small string compression using smaz, supports Python 3. +License: BSD +URL: https://github.com/originell/smaz-py3 +Source0: https://mirrors.nju.edu.cn/pypi/web/packages/01/e5/c7672eeb7e969d1ffa05bbdacab25c877b27eeddfa5e7e45b757b39fce5d/smaz-py3-1.1.2.tar.gz + + +%description +# smaz-py3 + +Small string compression using [_smaz_](https://github.com/antirez/smaz) compression +algorithm. + +This library wraps the original C code, so it should be quite fast. It also has a +testsuite that uses [hypothesis](https://hypothesis.readthedocs.io/en/latest/) based +property testing - a fancy way of saying that the tests are run with randomly +generated strings using most of unicode, to better guard against edge cases. + +## Why do I need this? + +You are working with tons of short strings (text messages, urls,...) and want to save +space. + +According to the original code and notes, it achieves the best compression with english +strings (up to 50%) that do not contain a ton of numbers. However, any other language +might just work as well (allegedly still up to 30%). + +Note that in certain cases it is possible that the compression increases the size. +Keep that in mind and maybe first run some tests. Measuring size is explained in the +example below as well. + +## How do I use this? + +Let's install: + +```sh +$ pip install smaz-py3 +``` + +_Note_: the `-py3` is important. There is an original release, kudos to Benjamin +Sergeant, but it does not work with Python 3+. + +Now, a usage example. + +```python +import smaz +# First we compress our example sentence. +compressed = smaz.compress("The quick brown fox jumps over the lazy dog.") +# The output is raw bytes. As can be seen in the decompress() call below. 
+# Now, we decompress these raw bytes again. This should return our example sentence. +decompressed = smaz.decompress(b'H\x00\xfeq&\x83\xfek^sA)\xdc\xfa\x00\xfej&-<\x95\xe7\r\x0b\x89\xdbG\x18\x06;n') +# This does not fail, which means we have successfully compressed and decompressed +# without damaging anything. +assert decompressed == "The quick brown fox jumps over the lazy dog." +``` + +How much did we compress? + +```python +# First, we get the actual byte size of our example string. +original_size = len("The quick brown fox jumps over the lazy dog.".encode("utf-8")) # 44 bytes +# As `compressed` is already raw bytes, we can also call len() on this +compressed_size = len(compressed) # 31 bytes +compression_ratio = 1 - compressed_size / original_size # 0.295 +``` + +So we saved about 30% (0.295 \* 100 and some rounding 😉). + +If the compression ratio would be below 0, we would have actually increased the +string. Yes, this can happen. Again, smaz works best on _small_ strings. + +### A small note about NULL bytes + +Currently, `smaz-py3` does not support strings with NULL bytes (`\x00`) in compression: + +```python +>>> import smaz +>>> smaz.compress("The quick brown fox\x00 jumps over the lazy dog.") +Traceback (most recent call last): + File "<stdin>", line 1, in <module> +ValueError: embedded null character +``` + +My reasoning behind this is that in most scenarios you want to clean that away +beforehand anyways. If you think this is wrong, please open up an +[issue on github](https://github.com/originell/smaz-py3). I am happy for further input! + +## Migrating from Python 2 `smaz` + +If you have been using the [Python 2 `smaz` library](https://pypi.org/project/smaz/), +this Python 3 version exposes the same API, so it is a drop-in replacement. + +**Important**: While developing this extension, I think I found a bug in the original +library. 
Using Python 2.7.16: + +```python +>>> import smaz +>>> smaz.compress("The quick brown fox jumps over the lazy dog.") +'H' # this is wrong. +>>> small = smaz.compress("The quick brown fox jumps over the lazy dog.") +>>> smaz.decompress(small) +'The' # information lost. +``` + +So, if you are actually upgrading from this, please make sure that you are not +affected by this. `smaz-py3` is not prone to this bug. + +Behind the scenes, smaz uses NULL bytes in compression. However, when converting from +C back to a Python string object, NULL is used to mark the end of the string. The +above sentence, compressed, has the NULL byte right after the `H` (`H\x00\xfeq…`). +That's why it stops right then and there. Again, `smaz-py3` is not affected by this, +mostly because I got lucky in choosing this example sentence. + +## Credits + +Credit where credit is due. First to [antirez's SMAZ compression](https://github.com/antirez/smaz) +and to the [original python 2 wrapper](https://pypi.org/project/smaz/) by Benjamin +Sergeant. + + + + +%package -n python3-smaz-py3 +Summary: Small string compression using smaz, supports Python 3. +Provides: python-smaz-py3 +BuildRequires: python3-devel +BuildRequires: python3-setuptools +BuildRequires: python3-pip +BuildRequires: python3-cffi +BuildRequires: gcc +BuildRequires: gdb +%description -n python3-smaz-py3 +# smaz-py3 + +Small string compression using [_smaz_](https://github.com/antirez/smaz) compression +algorithm. + +This library wraps the original C code, so it should be quite fast. It also has a +testsuite that uses [hypothesis](https://hypothesis.readthedocs.io/en/latest/) based +property testing - a fancy way of saying that the tests are run with randomly +generated strings using most of unicode, to better guard against edge cases. + +## Why do I need this? + +You are working with tons of short strings (text messages, urls,...) and want to save +space. 
+ +According to the original code and notes, it achieves the best compression with english +strings (up to 50%) that do not contain a ton of numbers. However, any other language +might just work as well (allegedly still up to 30%). + +Note that in certain cases it is possible that the compression increases the size. +Keep that in mind and maybe first run some tests. Measuring size is explained in the +example below as well. + +## How do I use this? + +Let's install: + +```sh +$ pip install smaz-py3 +``` + +_Note_: the `-py3` is important. There is an original release, kudos to Benjamin +Sergeant, but it does not work with Python 3+. + +Now, a usage example. + +```python +import smaz +# First we compress our example sentence. +compressed = smaz.compress("The quick brown fox jumps over the lazy dog.") +# The output is raw bytes. As can be seen in the decompress() call below. +# Now, we decompress these raw bytes again. This should return our example sentence. +decompressed = smaz.decompress(b'H\x00\xfeq&\x83\xfek^sA)\xdc\xfa\x00\xfej&-<\x95\xe7\r\x0b\x89\xdbG\x18\x06;n') +# This does not fail, which means we have successfully compressed and decompressed +# without damaging anything. +assert decompressed == "The quick brown fox jumps over the lazy dog." +``` + +How much did we compress? + +```python +# First, we get the actual byte size of our example string. +original_size = len("The quick brown fox jumps over the lazy dog.".encode("utf-8")) # 44 bytes +# As `compressed` is already raw bytes, we can also call len() on this +compressed_size = len(compressed) # 31 bytes +compression_ratio = 1 - compressed_size / original_size # 0.295 +``` + +So we saved about 30% (0.295 \* 100 and some rounding 😉). + +If the compression ratio would be below 0, we would have actually increased the +string. Yes, this can happen. Again, smaz works best on _small_ strings. 
+ +### A small note about NULL bytes + +Currently, `smaz-py3` does not support strings with NULL bytes (`\x00`) in compression: + +```python +>>> import smaz +>>> smaz.compress("The quick brown fox\x00 jumps over the lazy dog.") +Traceback (most recent call last): + File "<stdin>", line 1, in <module> +ValueError: embedded null character +``` + +My reasoning behind this is that in most scenarios you want to clean that away +beforehand anyways. If you think this is wrong, please open up an +[issue on github](https://github.com/originell/smaz-py3). I am happy for further input! + +## Migrating from Python 2 `smaz` + +If you have been using the [Python 2 `smaz` library](https://pypi.org/project/smaz/), +this Python 3 version exposes the same API, so it is a drop-in replacement. + +**Important**: While developing this extension, I think I found a bug in the original +library. Using Python 2.7.16: + +```python +>>> import smaz +>>> smaz.compress("The quick brown fox jumps over the lazy dog.") +'H' # this is wrong. +>>> small = smaz.compress("The quick brown fox jumps over the lazy dog.") +>>> smaz.decompress(small) +'The' # information lost. +``` + +So, if you are actually upgrading from this, please make sure that you are not +affected by this. `smaz-py3` is not prone to this bug. + +Behind the scenes, smaz uses NULL bytes in compression. However, when converting from +C back to a Python string object, NULL is used to mark the end of the string. The +above sentence, compressed, has the NULL byte right after the `H` (`H\x00\xfeq…`). +That's why it stops right then and there. Again, `smaz-py3` is not affected by this, +mostly because I got lucky in choosing this example sentence. + +## Credits + +Credit where credit is due. First to [antirez's SMAZ compression](https://github.com/antirez/smaz) +and to the [original python 2 wrapper](https://pypi.org/project/smaz/) by Benjamin +Sergeant. 
+ + + + +%package help +Summary: Development documents and examples for smaz-py3 +Provides: python3-smaz-py3-doc +%description help +# smaz-py3 + +Small string compression using [_smaz_](https://github.com/antirez/smaz) compression +algorithm. + +This library wraps the original C code, so it should be quite fast. It also has a +testsuite that uses [hypothesis](https://hypothesis.readthedocs.io/en/latest/) based +property testing - a fancy way of saying that the tests are run with randomly +generated strings using most of unicode, to better guard against edge cases. + +## Why do I need this? + +You are working with tons of short strings (text messages, urls,...) and want to save +space. + +According to the original code and notes, it achieves the best compression with english +strings (up to 50%) that do not contain a ton of numbers. However, any other language +might just work as well (allegedly still up to 30%). + +Note that in certain cases it is possible that the compression increases the size. +Keep that in mind and maybe first run some tests. Measuring size is explained in the +example below as well. + +## How do I use this? + +Let's install: + +```sh +$ pip install smaz-py3 +``` + +_Note_: the `-py3` is important. There is an original release, kudos to Benjamin +Sergeant, but it does not work with Python 3+. + +Now, a usage example. + +```python +import smaz +# First we compress our example sentence. +compressed = smaz.compress("The quick brown fox jumps over the lazy dog.") +# The output is raw bytes. As can be seen in the decompress() call below. +# Now, we decompress these raw bytes again. This should return our example sentence. +decompressed = smaz.decompress(b'H\x00\xfeq&\x83\xfek^sA)\xdc\xfa\x00\xfej&-<\x95\xe7\r\x0b\x89\xdbG\x18\x06;n') +# This does not fail, which means we have successfully compressed and decompressed +# without damaging anything. +assert decompressed == "The quick brown fox jumps over the lazy dog." +``` + +How much did we compress? 
+ +```python +# First, we get the actual byte size of our example string. +original_size = len("The quick brown fox jumps over the lazy dog.".encode("utf-8")) # 44 bytes +# As `compressed` is already raw bytes, we can also call len() on this +compressed_size = len(compressed) # 31 bytes +compression_ratio = 1 - compressed_size / original_size # 0.295 +``` + +So we saved about 30% (0.295 \* 100 and some rounding 😉). + +If the compression ratio would be below 0, we would have actually increased the +string. Yes, this can happen. Again, smaz works best on _small_ strings. + +### A small note about NULL bytes + +Currently, `smaz-py3` does not support strings with NULL bytes (`\x00`) in compression: + +```python +>>> import smaz +>>> smaz.compress("The quick brown fox\x00 jumps over the lazy dog.") +Traceback (most recent call last): + File "<stdin>", line 1, in <module> +ValueError: embedded null character +``` + +My reasoning behind this is that in most scenarios you want to clean that away +beforehand anyways. If you think this is wrong, please open up an +[issue on github](https://github.com/originell/smaz-py3). I am happy for further input! + +## Migrating from Python 2 `smaz` + +If you have been using the [Python 2 `smaz` library](https://pypi.org/project/smaz/), +this Python 3 version exposes the same API, so it is a drop-in replacement. + +**Important**: While developing this extension, I think I found a bug in the original +library. Using Python 2.7.16: + +```python +>>> import smaz +>>> smaz.compress("The quick brown fox jumps over the lazy dog.") +'H' # this is wrong. +>>> small = smaz.compress("The quick brown fox jumps over the lazy dog.") +>>> smaz.decompress(small) +'The' # information lost. +``` + +So, if you are actually upgrading from this, please make sure that you are not +affected by this. `smaz-py3` is not prone to this bug. + +Behind the scenes, smaz uses NULL bytes in compression. 
However, when converting from +C back to a Python string object, NULL is used to mark the end of the string. The +above sentence, compressed, has the NULL byte right after the `H` (`H\x00\xfeq…`). +That's why it stops right then and there. Again, `smaz-py3` is not affected by this, +mostly because I got lucky in choosing this example sentence. + +## Credits + +Credit where credit is due. First to [antirez's SMAZ compression](https://github.com/antirez/smaz) +and to the [original python 2 wrapper](https://pypi.org/project/smaz/) by Benjamin +Sergeant. + + + + +%prep +%autosetup -n smaz-py3-1.1.2 + +%build +%py3_build + +%install +%py3_install +install -d -m755 %{buildroot}/%{_pkgdocdir} +if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi +if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi +if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi +if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi +pushd %{buildroot} +if [ -d usr/lib ]; then + find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst +fi +if [ -d usr/lib64 ]; then + find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst +fi +if [ -d usr/bin ]; then + find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst +fi +if [ -d usr/sbin ]; then + find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst +fi +touch doclist.lst +if [ -d usr/share/man ]; then + find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst +fi +popd +mv %{buildroot}/filelist.lst . +mv %{buildroot}/doclist.lst . + +%files -n python3-smaz-py3 -f filelist.lst +%dir %{python3_sitearch}/* + +%files help -f doclist.lst +%{_docdir}/* + +%changelog +* Thu May 18 2023 Python_Bot <Python_Bot@openeuler.org> - 1.1.2-1 +- Package Spec generated |
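The README embedded in the spec makes two checkable claims: the example sentence shrinks from 44 to 31 bytes, and a NUL-terminated C string read of the compressed output stops after `H`. The sketch below reproduces both using only the standard library and the compressed bytes quoted in the README, so `smaz` itself is not needed. The helpers `compression_ratio` and `as_c_string` are hypothetical names introduced here, and `ctypes.create_string_buffer` is only a stand-in for how C code treats NUL bytes. It is not the code of either wrapper.

```python
import ctypes

SENTENCE = "The quick brown fox jumps over the lazy dog."
# Compressed bytes copied verbatim from the README example in the spec.
COMPRESSED = (
    b'H\x00\xfeq&\x83\xfek^sA)\xdc\xfa\x00\xfej&-<\x95\xe7\r\x0b\x89\xdbG\x18\x06;n'
)


def compression_ratio(original: str, compressed: bytes) -> float:
    """Return the fraction of bytes saved; a negative value means growth.

    Hypothetical helper, mirroring the arithmetic shown in the README.
    """
    original_size = len(original.encode("utf-8"))
    if original_size == 0:
        raise ValueError("original string must not be empty")
    return 1 - len(compressed) / original_size


def as_c_string(data: bytes) -> bytes:
    """Read `data` the way a NUL-terminated C string is read.

    Hypothetical helper: the buffer's `.value` stops at the first NUL byte,
    while `.raw` would expose the full underlying memory.
    """
    buf = ctypes.create_string_buffer(data)
    return buf.value


ratio = compression_ratio(SENTENCE, COMPRESSED)
truncated = as_c_string(COMPRESSED)

print(f"saved {ratio:.1%}")  # saved 29.5%
print(truncated)             # b'H'
```

The truncated result matches the `'H'` shown in the README's Python 2 session, which is consistent with its explanation that the second byte of the compressed output is NUL.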
