Diffstat (limited to 'python-datasynthesizer.spec')
-rw-r--r-- | python-datasynthesizer.spec | 408 |
1 files changed, 408 insertions, 0 deletions
diff --git a/python-datasynthesizer.spec b/python-datasynthesizer.spec
new file mode 100644
index 0000000..60cdb62
--- /dev/null
+++ b/python-datasynthesizer.spec
@@ -0,0 +1,408 @@
+%global _empty_manifest_terminate_build 0
+Name: python-DataSynthesizer
+Version: 0.1.11
+Release: 1
+Summary: Generate synthetic data that simulate a given dataset.
+License: MIT license
+URL: https://github.com/DataResponsibly/DataSynthesizer
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/be/ce/43a60603f66c4d63d03cff76f037067d2d1484dd05f45bfb75bb0d297bcf/DataSynthesizer-0.1.11.tar.gz
+BuildArch: noarch
+
+Requires: python3-numpy
+Requires: python3-pandas
+Requires: python3-scikit-learn
+Requires: python3-matplotlib
+Requires: python3-seaborn
+Requires: python3-dateutil
+
+%description
+[](https://pypi.python.org/pypi/DataSynthesizer) [](https://travis-ci.com/DataResponsibly/DataSynthesizer)
+
+# DataSynthesizer
+
+DataSynthesizer generates synthetic data that simulates a given dataset.
+
+> It aims to facilitate the collaborations between data scientists and owners of sensitive data. It applies Differential Privacy techniques to achieve strong privacy guarantee.
+>
+> For more details, please refer to [DataSynthesizer: Privacy-Preserving Synthetic Datasets](docs/cr-datasynthesizer-privacy.pdf)
+
+### Install DataSynthesizer
+
+```bash
+pip install DataSynthesizer
+```
+
+### Usage
+
+##### Assumptions for the Input Dataset
+
+1. The input dataset is a table in first normal form ([1NF](https://en.wikipedia.org/wiki/First_normal_form)).
+2. When implementing differential privacy, DataSynthesizer injects noises into the statistics within **active domain** that are the values presented in the table.
+
+##### Use Jupyter Notebook
+
+After installing DataSynthesizer and [Jupyter Notebook](https://jupyter.org/install), open and try the demos in `./notebooks/`
+
+- [DataSynthesizer__random_mode.ipynb](notebooks/DataSynthesizer__random_mode.ipynb)
+- [DataSynthesizer__independent_attribute_mode.ipynb](notebooks/DataSynthesizer__independent_attribute_mode.ipynb)
+- [DataSynthesizer__correlated_attribute_mode.ipynb](notebooks/DataSynthesizer__correlated_attribute_mode.ipynb)
+
+##### Use Web UI
+
+The [dataResponsiblyUI](https://github.com/DataResponsibly/dataResponsiblyUI) is a Django project that includes DataSynthesizer. Please follow the steps in [Run the Web UIs locally](https://github.com/DataResponsibly/dataResponsiblyUI#run-the-web-uis-locally) and run DataSynthesizer by visiting http://127.0.0.1:8000/synthesizer in a browser.
+
+
+
+# History
+
+## 0.1.0 - 2020-06-11
+
+* First release on PyPI.
+
+## 0.1.1 - 2020-07-05
+
+### Bugs Fixed
+
+* Numpy error when synthesising data with unique identifiers. - [Issue #23](https://github.com/DataResponsibly/DataSynthesizer/issues/23) by @raids
+
+## 0.1.2 - 2020-07-19
+
+### Bugs Fixed
+
+* infer_distribution() for string attributes fails to sort index of varying types. - [Issue #24](https://github.com/DataResponsibly/DataSynthesizer/issues/24) by @raids
+
+## 0.1.3 - 2020-09-13
+
+### Bugs Fixed
+
+* The dataframes are not appended into the full space in get_noisy_distribution_of_attributes(). - [Issue #26](https://github.com/DataResponsibly/DataSynthesizer/issues/26) by @zjroth
+
+## 0.1.4 - 2021-01-14
+
+### Bugs Fixed
+
+* Fix a bug in candidate key identification.
+
+## 0.1.5 - 2021-03-11
+
+### What's New
+
+* Downgrade required Python from >=3.8 to >=3.7.
+
+## 0.1.6 - 2021-03-11
+
+### What's New
+
+* Update example notebooks.
+
+## 0.1.7 - 2021-03-31
+
+### Bugs Fixed
+
+* Fixed an error in Laplace noise parameter. - [Issue #34](https://github.com/DataResponsibly/DataSynthesizer/issues/34) by @ganevgv
+
+## 0.1.8 - 2021-04-09
+
+### Bugs Fixed
+
+* The randomness seeding is effective across the entire project now.
+
+## 0.1.9 - 2021-07-18
+
+### Bugs Fixed
+
+* Optimized the datetime datatype detection.
+
+## 0.1.10 - 2021-11-15
+
+### Bugs Fixed
+
+* Seed the randomness in `greedy_bayes()`.
+
+## 0.1.11 - 2022-03-31
+
+### Bugs Fixed
+
+* Fixed a bug in DateTime generation. - [Issue #37](https://github.com/DataResponsibly/DataSynthesizer/issues/37) by @artemgur
+
+
+
+
+%package -n python3-DataSynthesizer
+Summary: Generate synthetic data that simulate a given dataset.
+Provides: python-DataSynthesizer
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-DataSynthesizer
+[](https://pypi.python.org/pypi/DataSynthesizer) [](https://travis-ci.com/DataResponsibly/DataSynthesizer)
+
+# DataSynthesizer
+
+DataSynthesizer generates synthetic data that simulates a given dataset.
+
+> It aims to facilitate the collaborations between data scientists and owners of sensitive data. It applies Differential Privacy techniques to achieve strong privacy guarantee.
+>
+> For more details, please refer to [DataSynthesizer: Privacy-Preserving Synthetic Datasets](docs/cr-datasynthesizer-privacy.pdf)
+
+### Install DataSynthesizer
+
+```bash
+pip install DataSynthesizer
+```
+
+### Usage
+
+##### Assumptions for the Input Dataset
+
+1. The input dataset is a table in first normal form ([1NF](https://en.wikipedia.org/wiki/First_normal_form)).
+2. When implementing differential privacy, DataSynthesizer injects noises into the statistics within **active domain** that are the values presented in the table.
+
+##### Use Jupyter Notebook
+
+After installing DataSynthesizer and [Jupyter Notebook](https://jupyter.org/install), open and try the demos in `./notebooks/`
+
+- [DataSynthesizer__random_mode.ipynb](notebooks/DataSynthesizer__random_mode.ipynb)
+- [DataSynthesizer__independent_attribute_mode.ipynb](notebooks/DataSynthesizer__independent_attribute_mode.ipynb)
+- [DataSynthesizer__correlated_attribute_mode.ipynb](notebooks/DataSynthesizer__correlated_attribute_mode.ipynb)
+
+##### Use Web UI
+
+The [dataResponsiblyUI](https://github.com/DataResponsibly/dataResponsiblyUI) is a Django project that includes DataSynthesizer. Please follow the steps in [Run the Web UIs locally](https://github.com/DataResponsibly/dataResponsiblyUI#run-the-web-uis-locally) and run DataSynthesizer by visiting http://127.0.0.1:8000/synthesizer in a browser.
+
+
+
+# History
+
+## 0.1.0 - 2020-06-11
+
+* First release on PyPI.
+
+## 0.1.1 - 2020-07-05
+
+### Bugs Fixed
+
+* Numpy error when synthesising data with unique identifiers. - [Issue #23](https://github.com/DataResponsibly/DataSynthesizer/issues/23) by @raids
+
+## 0.1.2 - 2020-07-19
+
+### Bugs Fixed
+
+* infer_distribution() for string attributes fails to sort index of varying types. - [Issue #24](https://github.com/DataResponsibly/DataSynthesizer/issues/24) by @raids
+
+## 0.1.3 - 2020-09-13
+
+### Bugs Fixed
+
+* The dataframes are not appended into the full space in get_noisy_distribution_of_attributes(). - [Issue #26](https://github.com/DataResponsibly/DataSynthesizer/issues/26) by @zjroth
+
+## 0.1.4 - 2021-01-14
+
+### Bugs Fixed
+
+* Fix a bug in candidate key identification.
+
+## 0.1.5 - 2021-03-11
+
+### What's New
+
+* Downgrade required Python from >=3.8 to >=3.7.
+
+## 0.1.6 - 2021-03-11
+
+### What's New
+
+* Update example notebooks.
+
+## 0.1.7 - 2021-03-31
+
+### Bugs Fixed
+
+* Fixed an error in Laplace noise parameter. - [Issue #34](https://github.com/DataResponsibly/DataSynthesizer/issues/34) by @ganevgv
+
+## 0.1.8 - 2021-04-09
+
+### Bugs Fixed
+
+* The randomness seeding is effective across the entire project now.
+
+## 0.1.9 - 2021-07-18
+
+### Bugs Fixed
+
+* Optimized the datetime datatype detection.
+
+## 0.1.10 - 2021-11-15
+
+### Bugs Fixed
+
+* Seed the randomness in `greedy_bayes()`.
+
+## 0.1.11 - 2022-03-31
+
+### Bugs Fixed
+
+* Fixed a bug in DateTime generation. - [Issue #37](https://github.com/DataResponsibly/DataSynthesizer/issues/37) by @artemgur
+
+
+
+
+%package help
+Summary: Development documents and examples for DataSynthesizer
+Provides: python3-DataSynthesizer-doc
+%description help
+[](https://pypi.python.org/pypi/DataSynthesizer) [](https://travis-ci.com/DataResponsibly/DataSynthesizer)
+
+# DataSynthesizer
+
+DataSynthesizer generates synthetic data that simulates a given dataset.
+
+> It aims to facilitate the collaborations between data scientists and owners of sensitive data. It applies Differential Privacy techniques to achieve strong privacy guarantee.
+>
+> For more details, please refer to [DataSynthesizer: Privacy-Preserving Synthetic Datasets](docs/cr-datasynthesizer-privacy.pdf)
+
+### Install DataSynthesizer
+
+```bash
+pip install DataSynthesizer
+```
+
+### Usage
+
+##### Assumptions for the Input Dataset
+
+1. The input dataset is a table in first normal form ([1NF](https://en.wikipedia.org/wiki/First_normal_form)).
+2. When implementing differential privacy, DataSynthesizer injects noises into the statistics within **active domain** that are the values presented in the table.
+
+##### Use Jupyter Notebook
+
+After installing DataSynthesizer and [Jupyter Notebook](https://jupyter.org/install), open and try the demos in `./notebooks/`
+
+- [DataSynthesizer__random_mode.ipynb](notebooks/DataSynthesizer__random_mode.ipynb)
+- [DataSynthesizer__independent_attribute_mode.ipynb](notebooks/DataSynthesizer__independent_attribute_mode.ipynb)
+- [DataSynthesizer__correlated_attribute_mode.ipynb](notebooks/DataSynthesizer__correlated_attribute_mode.ipynb)
+
+##### Use Web UI
+
+The [dataResponsiblyUI](https://github.com/DataResponsibly/dataResponsiblyUI) is a Django project that includes DataSynthesizer. Please follow the steps in [Run the Web UIs locally](https://github.com/DataResponsibly/dataResponsiblyUI#run-the-web-uis-locally) and run DataSynthesizer by visiting http://127.0.0.1:8000/synthesizer in a browser.
+
+
+
+# History
+
+## 0.1.0 - 2020-06-11
+
+* First release on PyPI.
+
+## 0.1.1 - 2020-07-05
+
+### Bugs Fixed
+
+* Numpy error when synthesising data with unique identifiers. - [Issue #23](https://github.com/DataResponsibly/DataSynthesizer/issues/23) by @raids
+
+## 0.1.2 - 2020-07-19
+
+### Bugs Fixed
+
+* infer_distribution() for string attributes fails to sort index of varying types. - [Issue #24](https://github.com/DataResponsibly/DataSynthesizer/issues/24) by @raids
+
+## 0.1.3 - 2020-09-13
+
+### Bugs Fixed
+
+* The dataframes are not appended into the full space in get_noisy_distribution_of_attributes(). - [Issue #26](https://github.com/DataResponsibly/DataSynthesizer/issues/26) by @zjroth
+
+## 0.1.4 - 2021-01-14
+
+### Bugs Fixed
+
+* Fix a bug in candidate key identification.
+
+## 0.1.5 - 2021-03-11
+
+### What's New
+
+* Downgrade required Python from >=3.8 to >=3.7.
+
+## 0.1.6 - 2021-03-11
+
+### What's New
+
+* Update example notebooks.
+
+## 0.1.7 - 2021-03-31
+
+### Bugs Fixed
+
+* Fixed an error in Laplace noise parameter. - [Issue #34](https://github.com/DataResponsibly/DataSynthesizer/issues/34) by @ganevgv
+
+## 0.1.8 - 2021-04-09
+
+### Bugs Fixed
+
+* The randomness seeding is effective across the entire project now.
+
+## 0.1.9 - 2021-07-18
+
+### Bugs Fixed
+
+* Optimized the datetime datatype detection.
+
+## 0.1.10 - 2021-11-15
+
+### Bugs Fixed
+
+* Seed the randomness in `greedy_bayes()`.
+
+## 0.1.11 - 2022-03-31
+
+### Bugs Fixed
+
+* Fixed a bug in DateTime generation. - [Issue #37](https://github.com/DataResponsibly/DataSynthesizer/issues/37) by @artemgur
+
+
+
+
+%prep
+%autosetup -n DataSynthesizer-0.1.11
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+	find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+	find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+	find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+	find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+	find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-DataSynthesizer -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Mon May 29 2023 Python_Bot <Python_Bot@openeuler.org> - 0.1.11-1
+- Package Spec generated
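
The `%install` scriptlet in this spec builds `filelist.lst` by running `find -printf` from inside the build root, so each relative path becomes an absolute install path for `%files -f`. The following is a minimal sketch of that step outside rpmbuild. It assumes GNU find (`-printf` is not POSIX), and the temporary directory and the `demo` file names are made up for illustration; they are not part of the package.

```shell
#!/bin/sh
set -eu

# Hypothetical stand-in for %{buildroot}; not a path rpmbuild would use.
buildroot="$(mktemp -d)"
mkdir -p "$buildroot/usr/lib/python3/site-packages/demo" "$buildroot/usr/bin"
touch "$buildroot/usr/lib/python3/site-packages/demo/__init__.py"
touch "$buildroot/usr/bin/demo-tool"

# Same pattern as the spec: run find from inside the build root so that
# %h is a relative directory, then prefix "/" to get an absolute path.
cd "$buildroot"
: > filelist.lst
for dir in usr/lib usr/lib64 usr/bin usr/sbin; do
    if [ -d "$dir" ]; then
        find "$dir" -type f -printf "/%h/%f\n" >> filelist.lst
    fi
done

# Sort only to make the printed order deterministic.
result="$(sort filelist.lst)"
printf '%s\n' "$result"

# Clean up the throwaway directory.
cd / && rm -rf "$buildroot"
```

Directories that do not exist (here `usr/lib64` and `usr/sbin`) are skipped, which is why the spec guards each `find` with a `[ -d ... ]` test.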