| author | CoprDistGit <infra@openeuler.org> | 2023-05-17 04:47:42 +0000 |
|---|---|---|
| committer | CoprDistGit <infra@openeuler.org> | 2023-05-17 04:47:42 +0000 |
| commit | 0c49dc9b371b95aee8a820daed3b5c4bc7e17621 (patch) | |
| tree | 90dbf21ac1455c540e8cc11efea8dd8c433f2338 /python-npy-append-array.spec | |
| parent | bdfd66f08338ee6acb604e31c4e1fe356b49b2b5 (diff) | |
automatic import of python-npy-append-array
Diffstat (limited to 'python-npy-append-array.spec')
| -rw-r--r-- | python-npy-append-array.spec | 342 |
1 file changed, 342 insertions, 0 deletions
diff --git a/python-npy-append-array.spec b/python-npy-append-array.spec
new file mode 100644
index 0000000..2373fd2
--- /dev/null
+++ b/python-npy-append-array.spec
@@ -0,0 +1,342 @@
+%global _empty_manifest_terminate_build 0
+Name: python-npy-append-array
+Version: 0.9.16
+Release: 1
+Summary: Create Numpy .npy files by appending on the growth axis
+License: MIT
+URL: https://github.com/xor2k/npy-append-array
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/da/70/e40e52bfca6e0aba60b2e12aee2ce7e86e9e40f818c72bfa9f6dc4b81dd4/npy-append-array-0.9.16.tar.gz
+BuildArch: noarch
+
+
+%description
+# NpyAppendArray
+
+Create Numpy `.npy` files by appending on the growth axis (0 for C order, -1
+for Fortran order). It behaves like `numpy.concatenate` with the difference
+that the result is stored out-of-memory in a `.npy` file and can be reused for
+further appending. After creation, the file can then be read with memory
+mapping (e.g. by adding `mmap_mode="r"`), which altogether makes it possible
+to create and read files (optionally) larger than the machine's main memory.
+
+Some possible applications:
+1. Efficiently create large `.npy` (optionally database-like) files
+    * Handling of offsets is not included; it can be done in an extra array
+    * Large legacy files can be made appendable by calling `ensure_appendable`
+    * This can (optionally) be performed in-place to minimize disk space usage
+2. Create binary log files (optionally on low-memory embedded devices)
+    * Check the option `rewrite_header_on_append=False` for extra efficiency
+    * Binary log files can be accessed very efficiently without parsing
+    * Incomplete files can be recovered efficiently by calling `recover`
+
+Another feature of this library is the (above mentioned) `recover` function,
+which makes incomplete `.npy` files readable by `numpy.load` again, no matter
+whether they should be appended to or not.
+
+Incomplete files can be the result of broken downloads or unfinished writes.
+Recovery works by rewriting the header and inferring the growth axis (see
+above) from the file size. As the data length may not be evenly divisible by
+the non-append-axis shape, incomplete entries can either be ignored
+(`zerofill_incomplete=False`), which probably makes sense in most scenarios.
+Alternatively, to squeeze out as much information from the file as possible,
+`zerofill_incomplete=True` can be used, which fills the incomplete last
+append-axis item with zeros.
+
+Since version 0.9.14, `ValueError` is raised instead of `TypeError` to be
+more consistent with Numpy.
+
+NpyAppendArray can be used in multithreaded environments.
+
+## Installation
+```bash
+conda install -c conda-forge npy-append-array
+```
+or
+```bash
+pip install npy-append-array
+```
+## Usage
+
+```python
+from npy_append_array import NpyAppendArray
+import numpy as np
+
+arr1 = np.array([[1,2],[3,4]])
+arr2 = np.array([[1,2],[3,4],[5,6]])
+
+filename = 'out.npy'
+
+with NpyAppendArray(filename) as npaa:
+    npaa.append(arr1)
+    npaa.append(arr2)
+    npaa.append(arr2)
+
+data = np.load(filename, mmap_mode="r")
+
+print(data)
+```
+
+## Concurrency
+Concurrency can be achieved by multithreading: a single `NpyAppendArray`
+object (per file) needs to be created. Then, `append` can be called from
+multiple threads, and locks will ensure that file writes do not happen in
+parallel. When using a `with` statement, make sure the `join` happens within
+it (see `test.py`).
+
+Multithreaded writes are not the pinnacle of what is technically possible
+with modern operating systems. It would be highly desirable to use `async`
+file writes. However, although modules like `aiofile` exist, this is
+currently not supported natively by Python or Numpy; see
+
+https://github.com/python/cpython/issues/76742
+
+## Implementation Details
+NpyAppendArray contains a modified, partial version of `format.py` from the
+Numpy package. It ensures that array headers are created with 21
+(`=len(str(8*2**64-1))`) bytes of spare space. This makes it possible to fit
+an array of maxed-out dimensions (for a 64 bit machine) without increasing
+the array header size, so the header can simply be rewritten as data is
+appended to the end of the `.npy` file.
+
+## Supported Systems
+Tested with Ubuntu Linux, macOS and Windows.
+
+
+%package -n python3-npy-append-array
+Summary: Create Numpy .npy files by appending on the growth axis
+Provides: python-npy-append-array
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-npy-append-array
+# NpyAppendArray
+
+Create Numpy `.npy` files by appending on the growth axis (0 for C order, -1
+for Fortran order). It behaves like `numpy.concatenate` with the difference
+that the result is stored out-of-memory in a `.npy` file and can be reused for
+further appending. After creation, the file can then be read with memory
+mapping (e.g. by adding `mmap_mode="r"`), which altogether makes it possible
+to create and read files (optionally) larger than the machine's main memory.
+
+Some possible applications:
+1. Efficiently create large `.npy` (optionally database-like) files
+    * Handling of offsets is not included; it can be done in an extra array
+    * Large legacy files can be made appendable by calling `ensure_appendable`
+    * This can (optionally) be performed in-place to minimize disk space usage
+2. Create binary log files (optionally on low-memory embedded devices)
+    * Check the option `rewrite_header_on_append=False` for extra efficiency
+    * Binary log files can be accessed very efficiently without parsing
+    * Incomplete files can be recovered efficiently by calling `recover`
+
+Another feature of this library is the (above mentioned) `recover` function,
+which makes incomplete `.npy` files readable by `numpy.load` again, no matter
+whether they should be appended to or not.
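The recovery idea described above can be sketched in a few lines of plain Python: count how many complete append-axis entries the data section of the file holds, and either drop or zero-pad the trailing partial entry. This is a simplified illustration only, not the library's actual code; the function name `infer_complete_rows` and its parameters are hypothetical.

```python
def infer_complete_rows(file_bytes, header_bytes, row_shape, itemsize,
                        zerofill_incomplete=False):
    """Infer how many append-axis entries the data section holds.

    Simplified sketch of the shape inference performed by recovery;
    the real `recover` additionally rewrites the .npy header.
    """
    # Bytes per entry on the growth axis: product of the non-append-axis
    # dimensions times the dtype item size.
    row_bytes = itemsize
    for dim in row_shape:
        row_bytes *= dim
    data_bytes = file_bytes - header_bytes
    complete, remainder = divmod(data_bytes, row_bytes)
    # zerofill_incomplete=True keeps the trailing partial entry
    # (conceptually zero-padded); False simply ignores it.
    if remainder and zerofill_incomplete:
        complete += 1
    return complete

# A float64 array with non-append-axis shape (2,) has 16-byte entries;
# 40 data bytes after a 128-byte header hold 2 complete entries plus 8 bytes.
print(infer_complete_rows(128 + 40, 128, (2,), 8))                            # 2
print(infer_complete_rows(128 + 40, 128, (2,), 8, zerofill_incomplete=True))  # 3
```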
+
+Incomplete files can be the result of broken downloads or unfinished writes.
+Recovery works by rewriting the header and inferring the growth axis (see
+above) from the file size. As the data length may not be evenly divisible by
+the non-append-axis shape, incomplete entries can either be ignored
+(`zerofill_incomplete=False`), which probably makes sense in most scenarios.
+Alternatively, to squeeze out as much information from the file as possible,
+`zerofill_incomplete=True` can be used, which fills the incomplete last
+append-axis item with zeros.
+
+Since version 0.9.14, `ValueError` is raised instead of `TypeError` to be
+more consistent with Numpy.
+
+NpyAppendArray can be used in multithreaded environments.
+
+## Installation
+```bash
+conda install -c conda-forge npy-append-array
+```
+or
+```bash
+pip install npy-append-array
+```
+## Usage
+
+```python
+from npy_append_array import NpyAppendArray
+import numpy as np
+
+arr1 = np.array([[1,2],[3,4]])
+arr2 = np.array([[1,2],[3,4],[5,6]])
+
+filename = 'out.npy'
+
+with NpyAppendArray(filename) as npaa:
+    npaa.append(arr1)
+    npaa.append(arr2)
+    npaa.append(arr2)
+
+data = np.load(filename, mmap_mode="r")
+
+print(data)
+```
+
+## Concurrency
+Concurrency can be achieved by multithreading: a single `NpyAppendArray`
+object (per file) needs to be created. Then, `append` can be called from
+multiple threads, and locks will ensure that file writes do not happen in
+parallel. When using a `with` statement, make sure the `join` happens within
+it (see `test.py`).
+
+Multithreaded writes are not the pinnacle of what is technically possible
+with modern operating systems. It would be highly desirable to use `async`
+file writes. However, although modules like `aiofile` exist, this is
+currently not supported natively by Python or Numpy; see
+
+https://github.com/python/cpython/issues/76742
+
+## Implementation Details
+NpyAppendArray contains a modified, partial version of `format.py` from the
+Numpy package. It ensures that array headers are created with 21
+(`=len(str(8*2**64-1))`) bytes of spare space. This makes it possible to fit
+an array of maxed-out dimensions (for a 64 bit machine) without increasing
+the array header size, so the header can simply be rewritten as data is
+appended to the end of the `.npy` file.
+
+## Supported Systems
+Tested with Ubuntu Linux, macOS and Windows.
+
+
+%package help
+Summary: Development documents and examples for npy-append-array
+Provides: python3-npy-append-array-doc
+%description help
+# NpyAppendArray
+
+Create Numpy `.npy` files by appending on the growth axis (0 for C order, -1
+for Fortran order). It behaves like `numpy.concatenate` with the difference
+that the result is stored out-of-memory in a `.npy` file and can be reused for
+further appending. After creation, the file can then be read with memory
+mapping (e.g. by adding `mmap_mode="r"`), which altogether makes it possible
+to create and read files (optionally) larger than the machine's main memory.
+
+Some possible applications:
+1. Efficiently create large `.npy` (optionally database-like) files
+    * Handling of offsets is not included; it can be done in an extra array
+    * Large legacy files can be made appendable by calling `ensure_appendable`
+    * This can (optionally) be performed in-place to minimize disk space usage
+2. Create binary log files (optionally on low-memory embedded devices)
+    * Check the option `rewrite_header_on_append=False` for extra efficiency
+    * Binary log files can be accessed very efficiently without parsing
+    * Incomplete files can be recovered efficiently by calling `recover`
+
+Another feature of this library is the (above mentioned) `recover` function,
+which makes incomplete `.npy` files readable by `numpy.load` again, no matter
+whether they should be appended to or not.
+
+Incomplete files can be the result of broken downloads or unfinished writes.
+Recovery works by rewriting the header and inferring the growth axis (see
+above) from the file size. As the data length may not be evenly divisible by
+the non-append-axis shape, incomplete entries can either be ignored
+(`zerofill_incomplete=False`), which probably makes sense in most scenarios.
+Alternatively, to squeeze out as much information from the file as possible,
+`zerofill_incomplete=True` can be used, which fills the incomplete last
+append-axis item with zeros.
+
+Since version 0.9.14, `ValueError` is raised instead of `TypeError` to be
+more consistent with Numpy.
+
+NpyAppendArray can be used in multithreaded environments.
+
+## Installation
+```bash
+conda install -c conda-forge npy-append-array
+```
+or
+```bash
+pip install npy-append-array
+```
+## Usage
+
+```python
+from npy_append_array import NpyAppendArray
+import numpy as np
+
+arr1 = np.array([[1,2],[3,4]])
+arr2 = np.array([[1,2],[3,4],[5,6]])
+
+filename = 'out.npy'
+
+with NpyAppendArray(filename) as npaa:
+    npaa.append(arr1)
+    npaa.append(arr2)
+    npaa.append(arr2)
+
+data = np.load(filename, mmap_mode="r")
+
+print(data)
+```
+
+## Concurrency
+Concurrency can be achieved by multithreading: a single `NpyAppendArray`
+object (per file) needs to be created. Then, `append` can be called from
+multiple threads, and locks will ensure that file writes do not happen in
+parallel. When using a `with` statement, make sure the `join` happens within
+it (see `test.py`).
+
+Multithreaded writes are not the pinnacle of what is technically possible
+with modern operating systems. It would be highly desirable to use `async`
+file writes. However, although modules like `aiofile` exist, this is
+currently not supported natively by Python or Numpy; see
+
+https://github.com/python/cpython/issues/76742
+
+## Implementation Details
+NpyAppendArray contains a modified, partial version of `format.py` from the
+Numpy package. It ensures that array headers are created with 21
+(`=len(str(8*2**64-1))`) bytes of spare space. This makes it possible to fit
+an array of maxed-out dimensions (for a 64 bit machine) without increasing
+the array header size, so the header can simply be rewritten as data is
+appended to the end of the `.npy` file.
+
+## Supported Systems
+Tested with Ubuntu Linux, macOS and Windows.
+
+
+%prep
+%autosetup -n npy-append-array-0.9.16
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+    find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+    find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+    find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+    find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+    find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-npy-append-array -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Wed May 17 2023 Python_Bot <Python_Bot@openeuler.org> - 0.9.16-1
+- Package Spec generated
