%global _empty_manifest_terminate_build 0
Name: python-anonypy
Version: 0.1.7
Release: 1
Summary: Anonymization library for python
License: MIT License
URL: https://github.com/glassonion1/anonypy
Source0: https://mirrors.nju.edu.cn/pypi/web/packages/04/5d/adc4824c45316d48c1448082a8144ba07378f5b22dfb08a5b3bd112e7e49/anonypy-0.1.7.tar.gz
BuildArch: noarch

%description
# AnonyPy
Anonymization library for Python.

AnonyPy provides the following privacy-preserving techniques for anonymization:

- K Anonymity
- L Diversity
- T Closeness

## The Anonymization method
- The anonymization method aims to make each individual record indistinguishable from the other records in its group by using generalization and suppression.
- Turning a dataset into a k-anonymous (and possibly l-diverse or t-close) dataset is a complex problem, and finding the optimal partition into k-anonymous groups is an NP-hard problem.
- AnonyPy uses the "Mondrian" algorithm to partition the original data into smaller and smaller groups.
- The algorithm assumes that we have converted all attributes into numerical or categorical values and that we are able to measure the “span” of a given attribute Xi.
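
The partitioning idea described in the list above can be sketched in plain Python. This is a simplified illustration, not AnonyPy's actual implementation: the helper names `span`, `split`, and `mondrian` are hypothetical, rows are assumed to be dictionaries, spans are not normalized across columns, and `k` is assumed to be at least 1.

```python
from statistics import median


def is_numeric(values):
    """True when every value is an int or a float."""
    return all(isinstance(v, (int, float)) for v in values)


def span(rows, column):
    """Span of a column: value range for numbers, distinct count for categories."""
    values = [row[column] for row in rows]
    if is_numeric(values):
        return max(values) - min(values)
    return len(set(values))


def split(rows, column):
    """Split rows in two: at the median for numbers, by halving the sorted categories otherwise."""
    values = [row[column] for row in rows]
    if is_numeric(values):
        mid = median(values)
        left = [r for r in rows if r[column] < mid]
        right = [r for r in rows if r[column] >= mid]
    else:
        cats = sorted(set(values))
        left_cats = set(cats[: len(cats) // 2])
        left = [r for r in rows if r[column] in left_cats]
        right = [r for r in rows if r[column] not in left_cats]
    return left, right


def mondrian(rows, columns, k):
    """Recursively partition rows; keep a split only if both halves hold at least k rows."""
    # Try columns from the widest span to the narrowest (unnormalized, an assumption of this sketch).
    for column in sorted(columns, key=lambda c: span(rows, c), reverse=True):
        left, right = split(rows, column)
        if len(left) >= k and len(right) >= k:
            return mondrian(left, columns, k) + mondrian(right, columns, k)
    # No column yields a valid split, so this group is final.
    return [rows]
```

Calling `mondrian(rows, ["col1", "col2"], 2)` on a list of row dictionaries returns a list of groups. Each group keeps at least `k` rows as long as the input itself has at least `k` rows, because a split is accepted only when both halves satisfy that bound.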
## Install
```
$ pip install anonypy
```

## Usage
```python
import anonypy
import pandas as pd

data = [
    [6, "1", "test1", "x", 20],
    [6, "1", "test1", "x", 30],
    [8, "2", "test2", "x", 50],
    [8, "2", "test3", "w", 45],
    [8, "1", "test2", "y", 35],
    [4, "2", "test3", "y", 20],
    [4, "1", "test3", "y", 20],
    [2, "1", "test3", "z", 22],
    [2, "2", "test3", "y", 32],
]
columns = ["col1", "col2", "col3", "col4", "col5"]
categorical = set(("col2", "col3", "col4"))


def main():
    df = pd.DataFrame(data=data, columns=columns)

    # Categorical columns must be marked as such before anonymization.
    for name in categorical:
        df[name] = df[name].astype("category")

    feature_columns = ["col1", "col2", "col3"]
    sensitive_column = "col4"

    p = anonypy.Preserver(df, feature_columns, sensitive_column)
    rows = p.anonymize_k_anonymity(k=2)

    dfn = pd.DataFrame(rows)
    print(dfn)


if __name__ == "__main__":
    main()
```

Original data
```bash
   col1 col2   col3 col4  col5
0     6    1  test1    x    20
1     6    1  test1    x    30
2     8    2  test2    x    50
3     8    2  test3    w    45
4     8    1  test2    y    35
5     4    2  test3    y    20
6     4    1  test3    y    20
7     2    1  test3    z    22
8     2    2  test3    y    32
```

The anonymized data is shown below (2-anonymity is guaranteed).
```bash
  col1 col2         col3 col4  count
0  2-4    2        test3    y      2
1  2-4    1        test3    y      1
2  2-4    1        test3    z      1
3  6-8    1  test1,test2    x      2
4  6-8    1  test1,test2    y      1
5    8    2  test3,test2    w      1
6    8    2  test3,test2    x      1
```

%package -n python3-anonypy
Summary: Anonymization library for python
Provides: python-anonypy
BuildRequires: python3-devel
BuildRequires: python3-setuptools
BuildRequires: python3-pip

%description -n python3-anonypy
# AnonyPy
Anonymization library for Python.

AnonyPy provides the following privacy-preserving techniques for anonymization:

- K Anonymity
- L Diversity
- T Closeness

## The Anonymization method
- The anonymization method aims to make each individual record indistinguishable from the other records in its group by using generalization and suppression.
- Turning a dataset into a k-anonymous (and possibly l-diverse or t-close) dataset is a complex problem, and finding the optimal partition into k-anonymous groups is an NP-hard problem.
- AnonyPy uses the "Mondrian" algorithm to partition the original data into smaller and smaller groups.
- The algorithm assumes that we have converted all attributes into numerical or categorical values and that we are able to measure the “span” of a given attribute Xi.

## Install
```
$ pip install anonypy
```

## Usage
```python
import anonypy
import pandas as pd

data = [
    [6, "1", "test1", "x", 20],
    [6, "1", "test1", "x", 30],
    [8, "2", "test2", "x", 50],
    [8, "2", "test3", "w", 45],
    [8, "1", "test2", "y", 35],
    [4, "2", "test3", "y", 20],
    [4, "1", "test3", "y", 20],
    [2, "1", "test3", "z", 22],
    [2, "2", "test3", "y", 32],
]
columns = ["col1", "col2", "col3", "col4", "col5"]
categorical = set(("col2", "col3", "col4"))


def main():
    df = pd.DataFrame(data=data, columns=columns)

    # Categorical columns must be marked as such before anonymization.
    for name in categorical:
        df[name] = df[name].astype("category")

    feature_columns = ["col1", "col2", "col3"]
    sensitive_column = "col4"

    p = anonypy.Preserver(df, feature_columns, sensitive_column)
    rows = p.anonymize_k_anonymity(k=2)

    dfn = pd.DataFrame(rows)
    print(dfn)


if __name__ == "__main__":
    main()
```

Original data
```bash
   col1 col2   col3 col4  col5
0     6    1  test1    x    20
1     6    1  test1    x    30
2     8    2  test2    x    50
3     8    2  test3    w    45
4     8    1  test2    y    35
5     4    2  test3    y    20
6     4    1  test3    y    20
7     2    1  test3    z    22
8     2    2  test3    y    32
```

The anonymized data is shown below (2-anonymity is guaranteed).
```bash
  col1 col2         col3 col4  count
0  2-4    2        test3    y      2
1  2-4    1        test3    y      1
2  2-4    1        test3    z      1
3  6-8    1  test1,test2    x      2
4  6-8    1  test1,test2    y      1
5    8    2  test3,test2    w      1
6    8    2  test3,test2    x      1
```

%package help
Summary: Development documents and examples for anonypy
Provides: python3-anonypy-doc

%description help
# AnonyPy
Anonymization library for Python.

AnonyPy provides the following privacy-preserving techniques for anonymization:

- K Anonymity
- L Diversity
- T Closeness

## The Anonymization method
- The anonymization method aims to make each individual record indistinguishable from the other records in its group by using generalization and suppression.
- Turning a dataset into a k-anonymous (and possibly l-diverse or t-close) dataset is a complex problem, and finding the optimal partition into k-anonymous groups is an NP-hard problem.
- AnonyPy uses the "Mondrian" algorithm to partition the original data into smaller and smaller groups.
- The algorithm assumes that we have converted all attributes into numerical or categorical values and that we are able to measure the “span” of a given attribute Xi.

## Install
```
$ pip install anonypy
```

## Usage
```python
import anonypy
import pandas as pd

data = [
    [6, "1", "test1", "x", 20],
    [6, "1", "test1", "x", 30],
    [8, "2", "test2", "x", 50],
    [8, "2", "test3", "w", 45],
    [8, "1", "test2", "y", 35],
    [4, "2", "test3", "y", 20],
    [4, "1", "test3", "y", 20],
    [2, "1", "test3", "z", 22],
    [2, "2", "test3", "y", 32],
]
columns = ["col1", "col2", "col3", "col4", "col5"]
categorical = set(("col2", "col3", "col4"))


def main():
    df = pd.DataFrame(data=data, columns=columns)

    # Categorical columns must be marked as such before anonymization.
    for name in categorical:
        df[name] = df[name].astype("category")

    feature_columns = ["col1", "col2", "col3"]
    sensitive_column = "col4"

    p = anonypy.Preserver(df, feature_columns, sensitive_column)
    rows = p.anonymize_k_anonymity(k=2)

    dfn = pd.DataFrame(rows)
    print(dfn)


if __name__ == "__main__":
    main()
```

Original data
```bash
   col1 col2   col3 col4  col5
0     6    1  test1    x    20
1     6    1  test1    x    30
2     8    2  test2    x    50
3     8    2  test3    w    45
4     8    1  test2    y    35
5     4    2  test3    y    20
6     4    1  test3    y    20
7     2    1  test3    z    22
8     2    2  test3    y    32
```

The anonymized data is shown below (2-anonymity is guaranteed).
```bash
  col1 col2         col3 col4  count
0  2-4    2        test3    y      2
1  2-4    1        test3    y      1
2  2-4    1        test3    z      1
3  6-8    1  test1,test2    x      2
4  6-8    1  test1,test2    y      1
5    8    2  test3,test2    w      1
6    8    2  test3,test2    x      1
```

%prep
%autosetup -n anonypy-0.1.7

%build
%py3_build

%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
	find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
	find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
	find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
	find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
	find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .

%files -n python3-anonypy -f filelist.lst
%dir %{python3_sitelib}/*

%files help -f doclist.lst
%{_docdir}/*

%changelog
* Thu May 18 2023 Python_Bot - 0.1.7-1
- Package Spec generated