%global _empty_manifest_terminate_build 0
Name:		python-anonypy
Version:	0.1.7
Release:	1
Summary:	Anonymization library for python
License:	MIT License
URL:		https://github.com/glassonion1/anonypy
Source0:	https://mirrors.nju.edu.cn/pypi/web/packages/04/5d/adc4824c45316d48c1448082a8144ba07378f5b22dfb08a5b3bd112e7e49/anonypy-0.1.7.tar.gz
BuildArch:	noarch


%description
# AnonyPy
Anonymization library for python.
AnonyPy provides following privacy preserving techniques for the anonymization.
- K Anonymity
- L Diversity
- T Closeness

## The Anonymization method
- Anonymization method aims at making the individual record be indistinguishable among a group record by using techniques of generalization and suppression.
- Turning a dataset into a k-anonymous (and possibly l-diverse or t-close) dataset is a complex problem, and finding the optimal partition into k-anonymous groups is an NP-hard problem.
- AnonyPy uses "Mondrian" algorithm to partition the original data into smaller and smaller groups
- The algorithm assumes that we have converted all attributes into numerical or categorical values and that we are able to measure the “span” of a given attribute Xi.

## Install
```
$ pip install anonypy
```

## Usage
```python
import anonypy
import pandas as pd

data = [
    [6, "1", "test1", "x", 20],
    [6, "1", "test1", "x", 30],
    [8, "2", "test2", "x", 50],
    [8, "2", "test3", "w", 45],
    [8, "1", "test2", "y", 35],
    [4, "2", "test3", "y", 20],
    [4, "1", "test3", "y", 20],
    [2, "1", "test3", "z", 22],
    [2, "2", "test3", "y", 32],
]

columns = ["col1", "col2", "col3", "col4", "col5"]
categorical = set(("col2", "col3", "col4"))

def main():
    df = pd.DataFrame(data=data, columns=columns)

    for name in categorical:
        df[name] = df[name].astype("category")

    feature_columns = ["col1", "col2", "col3"]
    sensitive_column = "col4"

    p = anonypy.Preserver(df, feature_columns, sensitive_column)
    rows = p.anonymize_k_anonymity(k=2)

    dfn = pd.DataFrame(rows)
    print(dfn)
```

Original data
```bash
   col1 col2   col3 col4  col5
0     6    1  test1    x    20
1     6    1  test1    x    30
2     8    2  test2    x    50
3     8    2  test3    w    45
4     8    1  test2    y    35
5     4    2  test3    y    20
6     4    1  test3    y    20
7     2    1  test3    z    22
8     2    2  test3    y    32
```

The created anonymized data is below(Guarantee 2-anonymity).
```bash
  col1 col2         col3 col4  count
0  2-4    2        test3    y      2
1  2-4    1        test3    y      1
2  2-4    1        test3    z      1
3  6-8    1  test1,test2    x      2
4  6-8    1  test1,test2    y      1
5    8    2  test3,test2    w      1
6    8    2  test3,test2    x      1
```


%package -n python3-anonypy
Summary:	Anonymization library for python
Provides:	python-anonypy
BuildRequires:	python3-devel
BuildRequires:	python3-setuptools
BuildRequires:	python3-pip
%description -n python3-anonypy
# AnonyPy
Anonymization library for python.
AnonyPy provides following privacy preserving techniques for the anonymization.
- K Anonymity
- L Diversity
- T Closeness

## The Anonymization method
- Anonymization method aims at making the individual record be indistinguishable among a group record by using techniques of generalization and suppression.
- Turning a dataset into a k-anonymous (and possibly l-diverse or t-close) dataset is a complex problem, and finding the optimal partition into k-anonymous groups is an NP-hard problem.
- AnonyPy uses "Mondrian" algorithm to partition the original data into smaller and smaller groups
- The algorithm assumes that we have converted all attributes into numerical or categorical values and that we are able to measure the “span” of a given attribute Xi.

## Install
```
$ pip install anonypy
```

## Usage
```python
import anonypy
import pandas as pd

data = [
    [6, "1", "test1", "x", 20],
    [6, "1", "test1", "x", 30],
    [8, "2", "test2", "x", 50],
    [8, "2", "test3", "w", 45],
    [8, "1", "test2", "y", 35],
    [4, "2", "test3", "y", 20],
    [4, "1", "test3", "y", 20],
    [2, "1", "test3", "z", 22],
    [2, "2", "test3", "y", 32],
]

columns = ["col1", "col2", "col3", "col4", "col5"]
categorical = set(("col2", "col3", "col4"))

def main():
    df = pd.DataFrame(data=data, columns=columns)

    for name in categorical:
        df[name] = df[name].astype("category")

    feature_columns = ["col1", "col2", "col3"]
    sensitive_column = "col4"

    p = anonypy.Preserver(df, feature_columns, sensitive_column)
    rows = p.anonymize_k_anonymity(k=2)

    dfn = pd.DataFrame(rows)
    print(dfn)
```

Original data
```bash
   col1 col2   col3 col4  col5
0     6    1  test1    x    20
1     6    1  test1    x    30
2     8    2  test2    x    50
3     8    2  test3    w    45
4     8    1  test2    y    35
5     4    2  test3    y    20
6     4    1  test3    y    20
7     2    1  test3    z    22
8     2    2  test3    y    32
```

The created anonymized data is below(Guarantee 2-anonymity).
```bash
  col1 col2         col3 col4  count
0  2-4    2        test3    y      2
1  2-4    1        test3    y      1
2  2-4    1        test3    z      1
3  6-8    1  test1,test2    x      2
4  6-8    1  test1,test2    y      1
5    8    2  test3,test2    w      1
6    8    2  test3,test2    x      1
```


%package help
Summary:	Development documents and examples for anonypy
Provides:	python3-anonypy-doc
%description help
# AnonyPy
Anonymization library for python.
AnonyPy provides following privacy preserving techniques for the anonymization.
- K Anonymity
- L Diversity
- T Closeness

## The Anonymization method
- Anonymization method aims at making the individual record be indistinguishable among a group record by using techniques of generalization and suppression.
- Turning a dataset into a k-anonymous (and possibly l-diverse or t-close) dataset is a complex problem, and finding the optimal partition into k-anonymous groups is an NP-hard problem.
- AnonyPy uses "Mondrian" algorithm to partition the original data into smaller and smaller groups
- The algorithm assumes that we have converted all attributes into numerical or categorical values and that we are able to measure the “span” of a given attribute Xi.

## Install
```
$ pip install anonypy
```

## Usage
```python
import anonypy
import pandas as pd

data = [
    [6, "1", "test1", "x", 20],
    [6, "1", "test1", "x", 30],
    [8, "2", "test2", "x", 50],
    [8, "2", "test3", "w", 45],
    [8, "1", "test2", "y", 35],
    [4, "2", "test3", "y", 20],
    [4, "1", "test3", "y", 20],
    [2, "1", "test3", "z", 22],
    [2, "2", "test3", "y", 32],
]

columns = ["col1", "col2", "col3", "col4", "col5"]
categorical = set(("col2", "col3", "col4"))

def main():
    df = pd.DataFrame(data=data, columns=columns)

    for name in categorical:
        df[name] = df[name].astype("category")

    feature_columns = ["col1", "col2", "col3"]
    sensitive_column = "col4"

    p = anonypy.Preserver(df, feature_columns, sensitive_column)
    rows = p.anonymize_k_anonymity(k=2)

    dfn = pd.DataFrame(rows)
    print(dfn)
```

Original data
```bash
   col1 col2   col3 col4  col5
0     6    1  test1    x    20
1     6    1  test1    x    30
2     8    2  test2    x    50
3     8    2  test3    w    45
4     8    1  test2    y    35
5     4    2  test3    y    20
6     4    1  test3    y    20
7     2    1  test3    z    22
8     2    2  test3    y    32
```

The created anonymized data is below(Guarantee 2-anonymity).
```bash
  col1 col2         col3 col4  count
0  2-4    2        test3    y      2
1  2-4    1        test3    y      1
2  2-4    1        test3    z      1
3  6-8    1  test1,test2    x      2
4  6-8    1  test1,test2    y      1
5    8    2  test3,test2    w      1
6    8    2  test3,test2    x      1
```


%prep
%autosetup -n anonypy-0.1.7

%build
%py3_build

%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
	find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
	find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
	find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
	find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
	find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .

%files -n python3-anonypy -f filelist.lst
%dir %{python3_sitelib}/*

%files help -f doclist.lst
%{_docdir}/*

%changelog
* Thu May 18 2023 Python_Bot <Python_Bot@openeuler.org> - 0.1.7-1
- Package Spec generated