Diffstat (limited to 'python-ddsketch.spec')
| -rw-r--r-- | python-ddsketch.spec | 511 |
1 files changed, 511 insertions, 0 deletions
diff --git a/python-ddsketch.spec b/python-ddsketch.spec
new file mode 100644
index 0000000..9765ed2
--- /dev/null
+++ b/python-ddsketch.spec
@@ -0,0 +1,511 @@
+%global _empty_manifest_terminate_build 0
+Name: python-ddsketch
+Version: 2.0.4
+Release: 1
+Summary: Distributed quantile sketches
+License: Apache Software License
+URL: http://github.com/datadog/sketches-py
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/c7/18/668af158f4a464f220f93aca4c87d68f1bb2271fb9b0211ac1b500a65af4/ddsketch-2.0.4.tar.gz
+BuildArch: noarch
+
+Requires: python3-six
+Requires: python3-typing
+Requires: python3-protobuf
+Requires: python3-protobuf
+
+%description
+# ddsketch
+
+This repo contains the Python implementation of the distributed quantile sketch
+algorithm DDSketch [1]. DDSketch has relative-error guarantees for any quantile
+q in [0, 1]. That is if the true value of the qth-quantile is `x` then DDSketch
+returns a value `y` such that `|x-y| / x < e` where `e` is the relative error
+parameter. (The default here is set to 0.01.) DDSketch is also fully mergeable,
+meaning that multiple sketches from distributed systems can be combined in a
+central node.
+
+Our default implementation, `DDSketch`, is guaranteed [1] to not grow too large
+in size for any data that can be described by a distribution whose tails are
+sub-exponential.
+
+We also provide implementations (`LogCollapsingLowestDenseDDSketch` and
+`LogCollapsingHighestDenseDDSketch`) where the q-quantile will be accurate up to
+the specified relative error for q that is not too small (or large). Concretely,
+the q-quantile will be accurate up to the specified relative error as long as it
+belongs to one of the `m` bins kept by the sketch. If the data is time in
+seconds, the default of `m = 2048` covers 80 microseconds to 1 year.
+
+## Installation
+
+To install this package, run `pip install ddsketch`, or clone the repo and run
+`python setup.py install`. This package depends on `numpy` and `protobuf`. (The
+protobuf dependency can be removed if it's not applicable.)
+
+## Usage
+```
+from ddsketch import DDSketch
+
+sketch = DDSketch()
+```
+Add values to the sketch
+```
+import numpy as np
+
+values = np.random.normal(size=500)
+for v in values:
+    sketch.add(v)
+```
+Find the quantiles of `values` to within the relative error.
+```
+quantiles = [sketch.get_quantile_value(q) for q in [0.5, 0.75, 0.9, 1]]
+```
+Merge another `DDSketch` into `sketch`.
+```
+another_sketch = DDSketch()
+other_values = np.random.normal(size=500)
+for v in other_values:
+    another_sketch.add(v)
+sketch.merge(another_sketch)
+```
+The quantiles of `values` concatenated with `other_values` are still accurate to within the relative error.
+
+## Development
+
+To work on ddsketch a Python interpreter must be installed. It is recommended to use the provided development
+container (requires [docker](https://www.docker.com/)) which includes all the required Python interpreters.
+
+    docker-compose run dev
+
+Or, if developing outside of docker then it is recommended to use a virtual environment:
+
+    pip install virtualenv
+    virtualenv --python=3 .venv
+    source .venv/bin/activate
+
+
+### Testing
+
+To run the tests install `riot`:
+
+    pip install riot
+
+Replace the Python version with the interpreter(s) available.
+
+    # Run tests with Python 3.9
+    riot run -p3.9 test
+
+### Release notes
+
+New features, bug fixes, deprecations and other breaking changes must have
+release notes included.
+
+To generate a release note for the change:
+
+    riot run reno new <short-description-of-change-no-spaces>
+
+Edit the generated file to include notes on the changes made in the commit/PR
+and commit it.
+
+
+### Formatting
+
+Format code with
+
+    riot run fmt
+
+
+### Type-checking
+
+Type checking is done with [mypy](http://mypy-lang.org/):
+
+    riot run mypy
+
+
+### Linting
+
+Lint the code with [flake8](https://flake8.pycqa.org/en/latest/):
+
+    riot run flake8
+
+
+### Protobuf
+
+The protobuf is stored in the go repository: https://github.com/DataDog/sketches-go/blob/master/ddsketch/pb/ddsketch.proto
+
+Install the minimum required protoc and generate the Python code:
+
+```sh
+docker run -v $PWD:/code -it ubuntu:18.04 /bin/bash
+apt update && apt install protobuf-compiler # default is 3.0.0
+protoc --proto_path=ddsketch/pb/ --python_out=ddsketch/pb/ ddsketch/pb/ddsketch.proto
+```
+
+
+### Releasing
+
+1. Generate the release notes and use [`pandoc`](https://pandoc.org/) to format
+them for GitHub:
+```bash
+    git checkout master && git pull
+    riot run -s reno report --no-show-source | pandoc -f rst -t gfm --wrap=none
+```
+    Copy the output into a new release: https://github.com/DataDog/sketches-py/releases/new.
+
+2. Enter a tag for the release (following [`semver`](https://semver.org)) (e.g. `v1.1.3`, `v1.0.3`, `v1.2.0`).
+3. Use the tag without the `v` as the title.
+4. Save the release as a draft and pass the link to someone else to give a quick review.
+5. If all looks good hit publish.
+
+
+## References
+[1] Charles Masson and Jee E Rim and Homin K. Lee. DDSketch: A fast and fully-mergeable quantile sketch with relative-error guarantees. PVLDB, 12(12): 2195-2205, 2019. (The code referenced in the paper, including our implementation of the Greenwald-Khanna (GK) algorithm, can be found at: https://github.com/DataDog/sketches-py/releases/tag/v0.1 )
+
+
+%package -n python3-ddsketch
+Summary: Distributed quantile sketches
+Provides: python-ddsketch
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-ddsketch
+# ddsketch
+
+This repo contains the Python implementation of the distributed quantile sketch
+algorithm DDSketch [1]. DDSketch has relative-error guarantees for any quantile
+q in [0, 1]. That is if the true value of the qth-quantile is `x` then DDSketch
+returns a value `y` such that `|x-y| / x < e` where `e` is the relative error
+parameter. (The default here is set to 0.01.) DDSketch is also fully mergeable,
+meaning that multiple sketches from distributed systems can be combined in a
+central node.
+
+Our default implementation, `DDSketch`, is guaranteed [1] to not grow too large
+in size for any data that can be described by a distribution whose tails are
+sub-exponential.
+
+We also provide implementations (`LogCollapsingLowestDenseDDSketch` and
+`LogCollapsingHighestDenseDDSketch`) where the q-quantile will be accurate up to
+the specified relative error for q that is not too small (or large). Concretely,
+the q-quantile will be accurate up to the specified relative error as long as it
+belongs to one of the `m` bins kept by the sketch. If the data is time in
+seconds, the default of `m = 2048` covers 80 microseconds to 1 year.
+
+## Installation
+
+To install this package, run `pip install ddsketch`, or clone the repo and run
+`python setup.py install`. This package depends on `numpy` and `protobuf`. (The
+protobuf dependency can be removed if it's not applicable.)
+
+## Usage
+```
+from ddsketch import DDSketch
+
+sketch = DDSketch()
+```
+Add values to the sketch
+```
+import numpy as np
+
+values = np.random.normal(size=500)
+for v in values:
+    sketch.add(v)
+```
+Find the quantiles of `values` to within the relative error.
+```
+quantiles = [sketch.get_quantile_value(q) for q in [0.5, 0.75, 0.9, 1]]
+```
+Merge another `DDSketch` into `sketch`.
+```
+another_sketch = DDSketch()
+other_values = np.random.normal(size=500)
+for v in other_values:
+    another_sketch.add(v)
+sketch.merge(another_sketch)
+```
+The quantiles of `values` concatenated with `other_values` are still accurate to within the relative error.
+
+## Development
+
+To work on ddsketch a Python interpreter must be installed. It is recommended to use the provided development
+container (requires [docker](https://www.docker.com/)) which includes all the required Python interpreters.
+
+    docker-compose run dev
+
+Or, if developing outside of docker then it is recommended to use a virtual environment:
+
+    pip install virtualenv
+    virtualenv --python=3 .venv
+    source .venv/bin/activate
+
+
+### Testing
+
+To run the tests install `riot`:
+
+    pip install riot
+
+Replace the Python version with the interpreter(s) available.
+
+    # Run tests with Python 3.9
+    riot run -p3.9 test
+
+### Release notes
+
+New features, bug fixes, deprecations and other breaking changes must have
+release notes included.
+
+To generate a release note for the change:
+
+    riot run reno new <short-description-of-change-no-spaces>
+
+Edit the generated file to include notes on the changes made in the commit/PR
+and commit it.
+
+
+### Formatting
+
+Format code with
+
+    riot run fmt
+
+
+### Type-checking
+
+Type checking is done with [mypy](http://mypy-lang.org/):
+
+    riot run mypy
+
+
+### Linting
+
+Lint the code with [flake8](https://flake8.pycqa.org/en/latest/):
+
+    riot run flake8
+
+
+### Protobuf
+
+The protobuf is stored in the go repository: https://github.com/DataDog/sketches-go/blob/master/ddsketch/pb/ddsketch.proto
+
+Install the minimum required protoc and generate the Python code:
+
+```sh
+docker run -v $PWD:/code -it ubuntu:18.04 /bin/bash
+apt update && apt install protobuf-compiler # default is 3.0.0
+protoc --proto_path=ddsketch/pb/ --python_out=ddsketch/pb/ ddsketch/pb/ddsketch.proto
+```
+
+
+### Releasing
+
+1. Generate the release notes and use [`pandoc`](https://pandoc.org/) to format
+them for GitHub:
+```bash
+    git checkout master && git pull
+    riot run -s reno report --no-show-source | pandoc -f rst -t gfm --wrap=none
+```
+    Copy the output into a new release: https://github.com/DataDog/sketches-py/releases/new.
+
+2. Enter a tag for the release (following [`semver`](https://semver.org)) (e.g. `v1.1.3`, `v1.0.3`, `v1.2.0`).
+3. Use the tag without the `v` as the title.
+4. Save the release as a draft and pass the link to someone else to give a quick review.
+5. If all looks good hit publish.
+
+
+## References
+[1] Charles Masson and Jee E Rim and Homin K. Lee. DDSketch: A fast and fully-mergeable quantile sketch with relative-error guarantees. PVLDB, 12(12): 2195-2205, 2019. (The code referenced in the paper, including our implementation of the Greenwald-Khanna (GK) algorithm, can be found at: https://github.com/DataDog/sketches-py/releases/tag/v0.1 )
+
+
+%package help
+Summary: Development documents and examples for ddsketch
+Provides: python3-ddsketch-doc
+%description help
+# ddsketch
+
+This repo contains the Python implementation of the distributed quantile sketch
+algorithm DDSketch [1]. DDSketch has relative-error guarantees for any quantile
+q in [0, 1]. That is if the true value of the qth-quantile is `x` then DDSketch
+returns a value `y` such that `|x-y| / x < e` where `e` is the relative error
+parameter. (The default here is set to 0.01.) DDSketch is also fully mergeable,
+meaning that multiple sketches from distributed systems can be combined in a
+central node.
+
+Our default implementation, `DDSketch`, is guaranteed [1] to not grow too large
+in size for any data that can be described by a distribution whose tails are
+sub-exponential.
+
+We also provide implementations (`LogCollapsingLowestDenseDDSketch` and
+`LogCollapsingHighestDenseDDSketch`) where the q-quantile will be accurate up to
+the specified relative error for q that is not too small (or large). Concretely,
+the q-quantile will be accurate up to the specified relative error as long as it
+belongs to one of the `m` bins kept by the sketch. If the data is time in
+seconds, the default of `m = 2048` covers 80 microseconds to 1 year.
+
+## Installation
+
+To install this package, run `pip install ddsketch`, or clone the repo and run
+`python setup.py install`. This package depends on `numpy` and `protobuf`. (The
+protobuf dependency can be removed if it's not applicable.)
+
+## Usage
+```
+from ddsketch import DDSketch
+
+sketch = DDSketch()
+```
+Add values to the sketch
+```
+import numpy as np
+
+values = np.random.normal(size=500)
+for v in values:
+    sketch.add(v)
+```
+Find the quantiles of `values` to within the relative error.
+```
+quantiles = [sketch.get_quantile_value(q) for q in [0.5, 0.75, 0.9, 1]]
+```
+Merge another `DDSketch` into `sketch`.
+```
+another_sketch = DDSketch()
+other_values = np.random.normal(size=500)
+for v in other_values:
+    another_sketch.add(v)
+sketch.merge(another_sketch)
+```
+The quantiles of `values` concatenated with `other_values` are still accurate to within the relative error.
+
+## Development
+
+To work on ddsketch a Python interpreter must be installed. It is recommended to use the provided development
+container (requires [docker](https://www.docker.com/)) which includes all the required Python interpreters.
+
+    docker-compose run dev
+
+Or, if developing outside of docker then it is recommended to use a virtual environment:
+
+    pip install virtualenv
+    virtualenv --python=3 .venv
+    source .venv/bin/activate
+
+
+### Testing
+
+To run the tests install `riot`:
+
+    pip install riot
+
+Replace the Python version with the interpreter(s) available.
+
+    # Run tests with Python 3.9
+    riot run -p3.9 test
+
+### Release notes
+
+New features, bug fixes, deprecations and other breaking changes must have
+release notes included.
+
+To generate a release note for the change:
+
+    riot run reno new <short-description-of-change-no-spaces>
+
+Edit the generated file to include notes on the changes made in the commit/PR
+and commit it.
+
+
+### Formatting
+
+Format code with
+
+    riot run fmt
+
+
+### Type-checking
+
+Type checking is done with [mypy](http://mypy-lang.org/):
+
+    riot run mypy
+
+
+### Linting
+
+Lint the code with [flake8](https://flake8.pycqa.org/en/latest/):
+
+    riot run flake8
+
+
+### Protobuf
+
+The protobuf is stored in the go repository: https://github.com/DataDog/sketches-go/blob/master/ddsketch/pb/ddsketch.proto
+
+Install the minimum required protoc and generate the Python code:
+
+```sh
+docker run -v $PWD:/code -it ubuntu:18.04 /bin/bash
+apt update && apt install protobuf-compiler # default is 3.0.0
+protoc --proto_path=ddsketch/pb/ --python_out=ddsketch/pb/ ddsketch/pb/ddsketch.proto
+```
+
+
+### Releasing
+
+1. Generate the release notes and use [`pandoc`](https://pandoc.org/) to format
+them for GitHub:
+```bash
+    git checkout master && git pull
+    riot run -s reno report --no-show-source | pandoc -f rst -t gfm --wrap=none
+```
+    Copy the output into a new release: https://github.com/DataDog/sketches-py/releases/new.
+
+2. Enter a tag for the release (following [`semver`](https://semver.org)) (e.g. `v1.1.3`, `v1.0.3`, `v1.2.0`).
+3. Use the tag without the `v` as the title.
+4. Save the release as a draft and pass the link to someone else to give a quick review.
+5. If all looks good hit publish.
+
+
+## References
+[1] Charles Masson and Jee E Rim and Homin K. Lee. DDSketch: A fast and fully-mergeable quantile sketch with relative-error guarantees. PVLDB, 12(12): 2195-2205, 2019. (The code referenced in the paper, including our implementation of the Greenwald-Khanna (GK) algorithm, can be found at: https://github.com/DataDog/sketches-py/releases/tag/v0.1 )
+
+
+%prep
+%autosetup -n ddsketch-2.0.4
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+	find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+	find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+	find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+	find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+	find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-ddsketch -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Mon May 29 2023 Python_Bot <Python_Bot@openeuler.org> - 2.0.4-1
+- Package Spec generated
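
Note on the packaged description: the relative-error and mergeability claims in the `%description` text can be illustrated with a small standalone sketch. The code below is not the ddsketch library's implementation; `TinySketch` is a hypothetical toy class written for this note, restricted to positive values, that only borrows the method names `add`, `merge` and `get_quantile_value` from the README. It assumes the logarithmic bucketing described in the DDSketch paper [1], where bucket boundaries are powers of `gamma = (1 + e) / (1 - e)` for relative error `e`.

```python
import math
from collections import Counter


class TinySketch:
    """Toy log-bucketed quantile sketch (positive values only, illustration)."""

    def __init__(self, relative_accuracy=0.01):
        self.alpha = relative_accuracy
        # Consecutive bucket boundaries differ by a factor of gamma.
        self.gamma = (1 + relative_accuracy) / (1 - relative_accuracy)
        self.buckets = Counter()  # bucket index -> number of values
        self.count = 0

    def add(self, value):
        if value <= 0:
            raise ValueError("this toy sketch only handles positive values")
        # Bucket i covers the interval (gamma**(i - 1), gamma**i].
        index = math.ceil(math.log(value) / math.log(self.gamma))
        self.buckets[index] += 1
        self.count += 1

    def merge(self, other):
        # Equal gamma means identical bucket boundaries, so merging two
        # sketches is just adding their per-bucket counts.
        if other.gamma != self.gamma:
            raise ValueError("cannot merge sketches with different accuracy")
        self.buckets.update(other.buckets)
        self.count += other.count

    def get_quantile_value(self, q):
        if self.count == 0:
            return None
        rank = q * (self.count - 1)
        seen = 0
        for index in sorted(self.buckets):
            seen += self.buckets[index]
            if seen > rank:
                # This representative is within a factor (1 +/- alpha)
                # of every value that falls in bucket `index`.
                return 2 * self.gamma ** index / (self.gamma + 1)
```

Because only bucket counts are stored, the memory used depends on the number of distinct buckets rather than on the number of values added, and a merged sketch answers quantile queries with the same relative error as a sketch built from all values directly.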
