automatic import of python-pd-helper (openeuler20.03)

author: CoprDistGit <infra@openeuler.org> 2023-06-20 03:23:08 +0000
committer: CoprDistGit <infra@openeuler.org> 2023-06-20 03:23:08 +0000
commit: d3b2f539be2262c8443e10582ea83be13807d99e (patch)
tree: 9e5dde79823a4b9af25c4e28043fb4fd1e4453ae
parent: a10c6fe95d993d42bc7341d5584766885699f203 (diff)
3 files changed, 387 insertions, 0 deletions
diff --git a/.gitignore b/.gitignore
index e69de29..2872264 100644
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1 @@
+/pd_helper-1.0.0.tar.gz
diff --git a/python-pd-helper.spec b/python-pd-helper.spec
new file mode 100644
index 0000000..a24de3d
--- /dev/null
+++ b/python-pd-helper.spec
@@ -0,0 +1,385 @@
+%global _empty_manifest_terminate_build 0
+Name:		python-pd-helper
+Version:	1.0.0
+Release:	1
+Summary:	A helpful script to optimize a Pandas DataFrame.
+License:	MIT License
+URL:		https://github.com/justinhchae/pd-helper
+Source0:	https://mirrors.aliyun.com/pypi/web/packages/91/90/e3db69d9c398cecc805a93885b8494974a7f1f579a5a62340148379be1d5/pd_helper-1.0.0.tar.gz
+BuildArch:	noarch
+
+Requires:	python3-pandas
+Requires:	python3-numpy
+Requires:	python3-tqdm
+Requires:	python3-shortuuid
+
+%description
+# pd-helper
+
+ A helpful package to streamline Pandas DataFrame optimization.
+
+ Save 50-75% on DataFrame memory usage by running the optimizer. 
+
+ Autoconfigure dtypes for appropriate data types in each column with **helper**.
+
+ Generate a random DataFrame of controlled random variables for testing with **maker**.
+
+## Install
+ ```bash
+ pip install pd-helper
+ ```
+
+## Basic Usage to Iterate over DataFrame
+```python
+from pd_helper.maker import MakeData 
+from pd_helper.helper import optimize
+faker = MakeData()
+
+if __name__ == "__main__":
+   # MakeData() generates a fake dataframe, convenient for testing
+   df = faker.make_df()
+   df = optimize(df)
+```
+## Better Usage With Multiprocessing
+```python
+from pd_helper.maker import MakeData 
+from pd_helper.helper import optimize
+faker = MakeData()
+
+if __name__ == "__main__":
+   # MakeData() generates a fake dataframe, convenient for testing
+   df = faker.make_df()
+   df = optimize(df, enable_mp=True)
+```
+
+## Specify Special Mappings
+```python
+from pd_helper.maker import MakeData 
+from pd_helper.helper import optimize
+faker = MakeData()
+
+if __name__ == "__main__":
+   # MakeData() generates a fake dataframe, convenient for testing
+   df = faker.make_df()
+   special_mappings = {'string': ['object_id'],
+                       'category': ['item_name']}
+
+   # Columns listed in special_mappings are cast to the given dtype instead of following the optimize ruleset.
+   df = optimize(df,
+                 enable_mp=True,
+                 special_mappings=special_mappings
+                 )
+```
+
+
+## Sample Results with Helper
+
+```bash
+Starting with 175.63 MB memory.
+
+After optimization.
+
+Ending with 65.33 MB memory.
+```
+
+## Generating a Randomly Imperfect DataFrame with Maker
+
+ Maker provides a class, MakeData(), to generate a table of made-up records. 
+
+ Each row is an event where an item was retrieved. 
+
+ Options are available to make the table imperfectly random in various ways.
+
+ Sample table below:
+
+|  | Retrieved Date  | Item Name | Retrieved | Condition | Sector |
+| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
+| Example | 2019-01-01, 2019-03-4  | Toaster, Lighter  | True, False  | Junk, Excellent  | 1, 2 |
+| Data Type | String  | String  | String  | String | Integer |
+
+
+## References
+
+* Pandas Categorical: <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Categorical.html>
+
+* Pandas Pickle: <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_pickle.html>
+
+* Pandas CSV: <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html>
+
+* Pandas Datetime: <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html>
+
+### TODO
+
+* Improve efficiency of iterating on DataFrame.
+
+* Allow user to toggle logging.
+
+* Provide tools for imputing missing data.
+
+
+
+
+%package -n python3-pd-helper
+Summary:	A helpful script to optimize a Pandas DataFrame.
+Provides:	python-pd-helper
+BuildRequires:	python3-devel
+BuildRequires:	python3-setuptools
+BuildRequires:	python3-pip
+%description -n python3-pd-helper
+# pd-helper
+
+ A helpful package to streamline Pandas DataFrame optimization.
+
+ Save 50-75% on DataFrame memory usage by running the optimizer. 
+
+ Autoconfigure dtypes for appropriate data types in each column with **helper**.
+
+ Generate a random DataFrame of controlled random variables for testing with **maker**.
+
+## Install
+ ```bash
+ pip install pd-helper
+ ```
+
+## Basic Usage to Iterate over DataFrame
+```python
+from pd_helper.maker import MakeData 
+from pd_helper.helper import optimize
+faker = MakeData()
+
+if __name__ == "__main__":
+   # MakeData() generates a fake dataframe, convenient for testing
+   df = faker.make_df()
+   df = optimize(df)
+```
+## Better Usage With Multiprocessing
+```python
+from pd_helper.maker import MakeData 
+from pd_helper.helper import optimize
+faker = MakeData()
+
+if __name__ == "__main__":
+   # MakeData() generates a fake dataframe, convenient for testing
+   df = faker.make_df()
+   df = optimize(df, enable_mp=True)
+```
+
+## Specify Special Mappings
+```python
+from pd_helper.maker import MakeData 
+from pd_helper.helper import optimize
+faker = MakeData()
+
+if __name__ == "__main__":
+   # MakeData() generates a fake dataframe, convenient for testing
+   df = faker.make_df()
+   special_mappings = {'string': ['object_id'],
+                       'category': ['item_name']}
+
+   # Columns listed in special_mappings are cast to the given dtype instead of following the optimize ruleset.
+   df = optimize(df,
+                 enable_mp=True,
+                 special_mappings=special_mappings
+                 )
+```
+
+
+## Sample Results with Helper
+
+```bash
+Starting with 175.63 MB memory.
+
+After optimization.
+
+Ending with 65.33 MB memory.
+```
+
+## Generating a Randomly Imperfect DataFrame with Maker
+
+ Maker provides a class, MakeData(), to generate a table of made-up records. 
+
+ Each row is an event where an item was retrieved. 
+
+ Options are available to make the table imperfectly random in various ways.
+
+ Sample table below:
+
+|  | Retrieved Date  | Item Name | Retrieved | Condition | Sector |
+| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
+| Example | 2019-01-01, 2019-03-4  | Toaster, Lighter  | True, False  | Junk, Excellent  | 1, 2 |
+| Data Type | String  | String  | String  | String | Integer |
+
+
+## References
+
+* Pandas Categorical: <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Categorical.html>
+
+* Pandas Pickle: <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_pickle.html>
+
+* Pandas CSV: <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html>
+
+* Pandas Datetime: <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html>
+
+### TODO
+
+* Improve efficiency of iterating on DataFrame.
+
+* Allow user to toggle logging.
+
+* Provide tools for imputing missing data.
+
+
+
+
+%package help
+Summary:	Development documents and examples for pd-helper
+Provides:	python3-pd-helper-doc
+%description help
+# pd-helper
+
+ A helpful package to streamline Pandas DataFrame optimization.
+
+ Save 50-75% on DataFrame memory usage by running the optimizer. 
+
+ Autoconfigure dtypes for appropriate data types in each column with **helper**.
+
+ Generate a random DataFrame of controlled random variables for testing with **maker**.
+
+## Install
+ ```bash
+ pip install pd-helper
+ ```
+
+## Basic Usage to Iterate over DataFrame
+```python
+from pd_helper.maker import MakeData 
+from pd_helper.helper import optimize
+faker = MakeData()
+
+if __name__ == "__main__":
+   # MakeData() generates a fake dataframe, convenient for testing
+   df = faker.make_df()
+   df = optimize(df)
+```
+## Better Usage With Multiprocessing
+```python
+from pd_helper.maker import MakeData 
+from pd_helper.helper import optimize
+faker = MakeData()
+
+if __name__ == "__main__":
+   # MakeData() generates a fake dataframe, convenient for testing
+   df = faker.make_df()
+   df = optimize(df, enable_mp=True)
+```
+
+## Specify Special Mappings
+```python
+from pd_helper.maker import MakeData 
+from pd_helper.helper import optimize
+faker = MakeData()
+
+if __name__ == "__main__":
+   # MakeData() generates a fake dataframe, convenient for testing
+   df = faker.make_df()
+   special_mappings = {'string': ['object_id'],
+                       'category': ['item_name']}
+
+   # Columns listed in special_mappings are cast to the given dtype instead of following the optimize ruleset.
+   df = optimize(df,
+                 enable_mp=True,
+                 special_mappings=special_mappings
+                 )
+```
+
+
+## Sample Results with Helper
+
+```bash
+Starting with 175.63 MB memory.
+
+After optimization.
+
+Ending with 65.33 MB memory.
+```
+
+## Generating a Randomly Imperfect DataFrame with Maker
+
+ Maker provides a class, MakeData(), to generate a table of made-up records. 
+
+ Each row is an event where an item was retrieved. 
+
+ Options are available to make the table imperfectly random in various ways.
+
+ Sample table below:
+
+|  | Retrieved Date  | Item Name | Retrieved | Condition | Sector |
+| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
+| Example | 2019-01-01, 2019-03-4  | Toaster, Lighter  | True, False  | Junk, Excellent  | 1, 2 |
+| Data Type | String  | String  | String  | String | Integer |
+
+
+## References
+
+* Pandas Categorical: <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Categorical.html>
+
+* Pandas Pickle: <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_pickle.html>
+
+* Pandas CSV: <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html>
+
+* Pandas Datetime: <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html>
+
+### TODO
+
+* Improve efficiency of iterating on DataFrame.
+
+* Allow user to toggle logging.
+
+* Provide tools for imputing missing data.
+
+
+
+
+%prep
+%autosetup -n pd_helper-1.0.0
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+	find usr/lib -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+	find usr/lib64 -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+	find usr/bin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+	find usr/sbin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+	find usr/share/man -type f -printf "\"/%h/%f.gz\"\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-pd-helper -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Tue Jun 20 2023 Python_Bot <Python_Bot@openeuler.org> - 1.0.0-1
+- Package Spec generated
diff --git a/sources b/sources
new file mode 100644
index 0000000..350b4f4
--- /dev/null
+++ b/sources
@@ -0,0 +1 @@
+94d0e1ee5ebbcec038bfd5adfc91ec97  pd_helper-1.0.0.tar.gz
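
Note on the `sources` entry above: the line pairs a 32-character hex digest with the tarball name, which looks like the usual MD5 checksum layout. A minimal sketch of how such an entry could be checked locally is below. It assumes the digest is MD5 and that the tarball sits in the current directory. The helper names `parse_sources_line` and `md5_matches` are hypothetical and are not part of any openEuler tooling.

```python
import hashlib
from pathlib import Path
from typing import Tuple


def parse_sources_line(line: str) -> Tuple[str, str]:
    """Split a '<digest>  <filename>' entry into (digest, filename)."""
    digest, filename = line.split(maxsplit=1)
    return digest.lower(), filename.strip()


def md5_matches(path: Path, expected: str, chunk_size: int = 1 << 20) -> bool:
    """Hash the file in chunks and compare with the expected hex digest."""
    md5 = hashlib.md5()
    with path.open("rb") as fh:
        # Read fixed-size chunks so large tarballs are not loaded at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest() == expected


if __name__ == "__main__":
    # Entry copied from the sources file in this commit.
    entry = "94d0e1ee5ebbcec038bfd5adfc91ec97  pd_helper-1.0.0.tar.gz"
    digest, filename = parse_sources_line(entry)
    tarball = Path(filename)
    if tarball.exists():
        print(filename, "OK" if md5_matches(tarball, digest) else "MISMATCH")
    else:
        print(filename, "not found; fetch Source0 first")
```

Running the script next to a downloaded `pd_helper-1.0.0.tar.gz` reports whether the file matches the recorded digest. MD5 only guards against accidental corruption here, not against tampering.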