%global _empty_manifest_terminate_build 0
Name: python-DataSynthesizer
Version: 0.1.11
Release: 1
Summary: Generate synthetic data that simulates a given dataset.
License: MIT license
URL: https://github.com/DataResponsibly/DataSynthesizer
Source0: https://mirrors.aliyun.com/pypi/web/packages/be/ce/43a60603f66c4d63d03cff76f037067d2d1484dd05f45bfb75bb0d297bcf/DataSynthesizer-0.1.11.tar.gz
BuildArch: noarch

Requires: python3-numpy
Requires: python3-pandas
Requires: python3-scikit-learn
Requires: python3-matplotlib
Requires: python3-seaborn
Requires: python3-dateutil

%description
[![PyPi Shield](https://img.shields.io/pypi/v/DataSynthesizer.svg)](https://pypi.python.org/pypi/DataSynthesizer)
[![Travis CI Shield](https://travis-ci.com/DataResponsibly/DataSynthesizer.svg?branch=master)](https://travis-ci.com/DataResponsibly/DataSynthesizer)

# DataSynthesizer

DataSynthesizer generates synthetic data that simulates a given dataset.

> It aims to facilitate collaboration between data scientists and owners of sensitive data. It applies Differential Privacy techniques to achieve a strong privacy guarantee.
>
> For more details, please refer to [DataSynthesizer: Privacy-Preserving Synthetic Datasets](docs/cr-datasynthesizer-privacy.pdf)

### Install DataSynthesizer

```bash
pip install DataSynthesizer
```

### Usage

##### Assumptions for the Input Dataset

1. The input dataset is a table in first normal form ([1NF](https://en.wikipedia.org/wiki/First_normal_form)).
2. When implementing differential privacy, DataSynthesizer injects noise into the statistics within the **active domain**, that is, the values present in the table.

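
The second assumption can be illustrated with a minimal, standard-library-only sketch. It is not DataSynthesizer's own code: the helper names `laplace_noise` and `noisy_distribution`, the sample column, and the sensitivity of 1 per count are all assumptions made for illustration. It adds Laplace noise to the counts of the values that actually occur in a column, then clips and renormalizes them into a distribution.

```python
import random
from collections import Counter


def laplace_noise(rng, scale):
    # The difference of two i.i.d. exponential draws with mean `scale`
    # follows a Laplace distribution centred at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_distribution(values, epsilon, seed=None):
    """Hypothetical helper: noisy distribution over the active domain."""
    rng = random.Random(seed)
    counts = Counter(values)  # keys are the active domain (observed values)
    scale = 1.0 / epsilon     # assumption: each count has sensitivity 1
    # Add noise to every observed count and clip negatives to zero.
    noisy = {v: max(c + laplace_noise(rng, scale), 0.0)
             for v, c in counts.items()}
    total = sum(noisy.values())
    if total == 0:
        # Everything was clipped away: fall back to a uniform distribution.
        return {v: 1.0 / len(noisy) for v in noisy}
    return {v: c / total for v, c in noisy.items()}


column = ["A", "B", "A", "C", "A", "B"]
dist = noisy_distribution(column, epsilon=1.0, seed=0)
```

Because only observed values receive noise in this sketch, a value that never appears in the column (say `"D"`) gets no probability mass at all, which is what restricting the statistics to the active domain implies.
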
##### Use Jupyter Notebook

After installing DataSynthesizer and [Jupyter Notebook](https://jupyter.org/install), open and try the demos in `./notebooks/`:

- [DataSynthesizer__random_mode.ipynb](notebooks/DataSynthesizer__random_mode.ipynb)
- [DataSynthesizer__independent_attribute_mode.ipynb](notebooks/DataSynthesizer__independent_attribute_mode.ipynb)
- [DataSynthesizer__correlated_attribute_mode.ipynb](notebooks/DataSynthesizer__correlated_attribute_mode.ipynb)

##### Use Web UI

The [dataResponsiblyUI](https://github.com/DataResponsibly/dataResponsiblyUI) is a Django project that includes DataSynthesizer. Please follow the steps in [Run the Web UIs locally](https://github.com/DataResponsibly/dataResponsiblyUI#run-the-web-uis-locally) and run DataSynthesizer by visiting http://127.0.0.1:8000/synthesizer in a browser.

# History

## 0.1.0 - 2020-06-11

* First release on PyPI.

## 0.1.1 - 2020-07-05

### Bugs Fixed

* NumPy error when synthesising data with unique identifiers.
  - [Issue #23](https://github.com/DataResponsibly/DataSynthesizer/issues/23) by @raids

## 0.1.2 - 2020-07-19

### Bugs Fixed

* `infer_distribution()` for string attributes fails to sort index of varying types.
  - [Issue #24](https://github.com/DataResponsibly/DataSynthesizer/issues/24) by @raids

## 0.1.3 - 2020-09-13

### Bugs Fixed

* The dataframes are not appended into the full space in `get_noisy_distribution_of_attributes()`.
  - [Issue #26](https://github.com/DataResponsibly/DataSynthesizer/issues/26) by @zjroth

## 0.1.4 - 2021-01-14

### Bugs Fixed

* Fix a bug in candidate key identification.

## 0.1.5 - 2021-03-11

### What's New

* Downgrade required Python from >=3.8 to >=3.7.

## 0.1.6 - 2021-03-11

### What's New

* Update example notebooks.

## 0.1.7 - 2021-03-31

### Bugs Fixed

* Fixed an error in Laplace noise parameter.
  - [Issue #34](https://github.com/DataResponsibly/DataSynthesizer/issues/34) by @ganevgv

## 0.1.8 - 2021-04-09

### Bugs Fixed

* The randomness seeding is effective across the entire project now.

## 0.1.9 - 2021-07-18

### Bugs Fixed

* Optimized the datetime datatype detection.

## 0.1.10 - 2021-11-15

### Bugs Fixed

* Seed the randomness in `greedy_bayes()`.

## 0.1.11 - 2022-03-31

### Bugs Fixed

* Fixed a bug in DateTime generation.
  - [Issue #37](https://github.com/DataResponsibly/DataSynthesizer/issues/37) by @artemgur

%package -n python3-DataSynthesizer
Summary: Generate synthetic data that simulates a given dataset.
Provides: python-DataSynthesizer
BuildRequires: python3-devel
BuildRequires: python3-setuptools
BuildRequires: python3-pip

%description -n python3-DataSynthesizer
[![PyPi Shield](https://img.shields.io/pypi/v/DataSynthesizer.svg)](https://pypi.python.org/pypi/DataSynthesizer)
[![Travis CI Shield](https://travis-ci.com/DataResponsibly/DataSynthesizer.svg?branch=master)](https://travis-ci.com/DataResponsibly/DataSynthesizer)

# DataSynthesizer

DataSynthesizer generates synthetic data that simulates a given dataset.

> It aims to facilitate collaboration between data scientists and owners of sensitive data. It applies Differential Privacy techniques to achieve a strong privacy guarantee.
>
> For more details, please refer to [DataSynthesizer: Privacy-Preserving Synthetic Datasets](docs/cr-datasynthesizer-privacy.pdf)

### Install DataSynthesizer

```bash
pip install DataSynthesizer
```

### Usage

##### Assumptions for the Input Dataset

1. The input dataset is a table in first normal form ([1NF](https://en.wikipedia.org/wiki/First_normal_form)).
2. When implementing differential privacy, DataSynthesizer injects noise into the statistics within the **active domain**, that is, the values present in the table.

##### Use Jupyter Notebook

After installing DataSynthesizer and [Jupyter Notebook](https://jupyter.org/install), open and try the demos in `./notebooks/`:

- [DataSynthesizer__random_mode.ipynb](notebooks/DataSynthesizer__random_mode.ipynb)
- [DataSynthesizer__independent_attribute_mode.ipynb](notebooks/DataSynthesizer__independent_attribute_mode.ipynb)
- [DataSynthesizer__correlated_attribute_mode.ipynb](notebooks/DataSynthesizer__correlated_attribute_mode.ipynb)

##### Use Web UI

The [dataResponsiblyUI](https://github.com/DataResponsibly/dataResponsiblyUI) is a Django project that includes DataSynthesizer. Please follow the steps in [Run the Web UIs locally](https://github.com/DataResponsibly/dataResponsiblyUI#run-the-web-uis-locally) and run DataSynthesizer by visiting http://127.0.0.1:8000/synthesizer in a browser.

# History

## 0.1.0 - 2020-06-11

* First release on PyPI.

## 0.1.1 - 2020-07-05

### Bugs Fixed

* NumPy error when synthesising data with unique identifiers.
  - [Issue #23](https://github.com/DataResponsibly/DataSynthesizer/issues/23) by @raids

## 0.1.2 - 2020-07-19

### Bugs Fixed

* `infer_distribution()` for string attributes fails to sort index of varying types.
  - [Issue #24](https://github.com/DataResponsibly/DataSynthesizer/issues/24) by @raids

## 0.1.3 - 2020-09-13

### Bugs Fixed

* The dataframes are not appended into the full space in `get_noisy_distribution_of_attributes()`.
  - [Issue #26](https://github.com/DataResponsibly/DataSynthesizer/issues/26) by @zjroth

## 0.1.4 - 2021-01-14

### Bugs Fixed

* Fix a bug in candidate key identification.

## 0.1.5 - 2021-03-11

### What's New

* Downgrade required Python from >=3.8 to >=3.7.

## 0.1.6 - 2021-03-11

### What's New

* Update example notebooks.

## 0.1.7 - 2021-03-31

### Bugs Fixed

* Fixed an error in Laplace noise parameter.
  - [Issue #34](https://github.com/DataResponsibly/DataSynthesizer/issues/34) by @ganevgv

## 0.1.8 - 2021-04-09

### Bugs Fixed

* The randomness seeding is effective across the entire project now.

## 0.1.9 - 2021-07-18

### Bugs Fixed

* Optimized the datetime datatype detection.

## 0.1.10 - 2021-11-15

### Bugs Fixed

* Seed the randomness in `greedy_bayes()`.

## 0.1.11 - 2022-03-31

### Bugs Fixed

* Fixed a bug in DateTime generation.
  - [Issue #37](https://github.com/DataResponsibly/DataSynthesizer/issues/37) by @artemgur

%package help
Summary: Development documents and examples for DataSynthesizer
Provides: python3-DataSynthesizer-doc

%description help
[![PyPi Shield](https://img.shields.io/pypi/v/DataSynthesizer.svg)](https://pypi.python.org/pypi/DataSynthesizer)
[![Travis CI Shield](https://travis-ci.com/DataResponsibly/DataSynthesizer.svg?branch=master)](https://travis-ci.com/DataResponsibly/DataSynthesizer)

# DataSynthesizer

DataSynthesizer generates synthetic data that simulates a given dataset.

> It aims to facilitate collaboration between data scientists and owners of sensitive data. It applies Differential Privacy techniques to achieve a strong privacy guarantee.
>
> For more details, please refer to [DataSynthesizer: Privacy-Preserving Synthetic Datasets](docs/cr-datasynthesizer-privacy.pdf)

### Install DataSynthesizer

```bash
pip install DataSynthesizer
```

### Usage

##### Assumptions for the Input Dataset

1. The input dataset is a table in first normal form ([1NF](https://en.wikipedia.org/wiki/First_normal_form)).
2. When implementing differential privacy, DataSynthesizer injects noise into the statistics within the **active domain**, that is, the values present in the table.

##### Use Jupyter Notebook

After installing DataSynthesizer and [Jupyter Notebook](https://jupyter.org/install), open and try the demos in `./notebooks/`:

- [DataSynthesizer__random_mode.ipynb](notebooks/DataSynthesizer__random_mode.ipynb)
- [DataSynthesizer__independent_attribute_mode.ipynb](notebooks/DataSynthesizer__independent_attribute_mode.ipynb)
- [DataSynthesizer__correlated_attribute_mode.ipynb](notebooks/DataSynthesizer__correlated_attribute_mode.ipynb)

##### Use Web UI

The [dataResponsiblyUI](https://github.com/DataResponsibly/dataResponsiblyUI) is a Django project that includes DataSynthesizer. Please follow the steps in [Run the Web UIs locally](https://github.com/DataResponsibly/dataResponsiblyUI#run-the-web-uis-locally) and run DataSynthesizer by visiting http://127.0.0.1:8000/synthesizer in a browser.

# History

## 0.1.0 - 2020-06-11

* First release on PyPI.

## 0.1.1 - 2020-07-05

### Bugs Fixed

* NumPy error when synthesising data with unique identifiers.
  - [Issue #23](https://github.com/DataResponsibly/DataSynthesizer/issues/23) by @raids

## 0.1.2 - 2020-07-19

### Bugs Fixed

* `infer_distribution()` for string attributes fails to sort index of varying types.
  - [Issue #24](https://github.com/DataResponsibly/DataSynthesizer/issues/24) by @raids

## 0.1.3 - 2020-09-13

### Bugs Fixed

* The dataframes are not appended into the full space in `get_noisy_distribution_of_attributes()`.
  - [Issue #26](https://github.com/DataResponsibly/DataSynthesizer/issues/26) by @zjroth

## 0.1.4 - 2021-01-14

### Bugs Fixed

* Fix a bug in candidate key identification.

## 0.1.5 - 2021-03-11

### What's New

* Downgrade required Python from >=3.8 to >=3.7.

## 0.1.6 - 2021-03-11

### What's New

* Update example notebooks.

## 0.1.7 - 2021-03-31

### Bugs Fixed

* Fixed an error in Laplace noise parameter.
  - [Issue #34](https://github.com/DataResponsibly/DataSynthesizer/issues/34) by @ganevgv

## 0.1.8 - 2021-04-09

### Bugs Fixed

* The randomness seeding is effective across the entire project now.

## 0.1.9 - 2021-07-18

### Bugs Fixed

* Optimized the datetime datatype detection.

## 0.1.10 - 2021-11-15

### Bugs Fixed

* Seed the randomness in `greedy_bayes()`.

## 0.1.11 - 2022-03-31

### Bugs Fixed

* Fixed a bug in DateTime generation.
  - [Issue #37](https://github.com/DataResponsibly/DataSynthesizer/issues/37) by @artemgur

%prep
%autosetup -n DataSynthesizer-0.1.11

%build
%py3_build

%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
	find usr/lib -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
	find usr/lib64 -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
	find usr/bin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
	find usr/sbin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
	find usr/share/man -type f -printf "\"/%h/%f.gz\"\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .

%files -n python3-DataSynthesizer -f filelist.lst
%dir %{python3_sitelib}/*

%files help -f doclist.lst
%{_docdir}/*

%changelog
* Fri Jun 09 2023 Python_Bot - 0.1.11-1
- Package Spec generated