author CoprDistGit <infra@openeuler.org> 2023-06-20 03:23:08 +0000
committer CoprDistGit <infra@openeuler.org> 2023-06-20 03:23:08 +0000
commit d3b2f539be2262c8443e10582ea83be13807d99e
tree 9e5dde79823a4b9af25c4e28043fb4fd1e4453ae
parent a10c6fe95d993d42bc7341d5584766885699f203
automatic import of python-pd-helper [openeuler20.03]
-rw-r--r-- .gitignore 1
-rw-r--r-- python-pd-helper.spec 385
-rw-r--r-- sources 1
3 files changed, 387 insertions, 0 deletions
diff --git a/.gitignore b/.gitignore
index e69de29..2872264 100644
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1 @@
+/pd_helper-1.0.0.tar.gz
diff --git a/python-pd-helper.spec b/python-pd-helper.spec
new file mode 100644
index 0000000..a24de3d
--- /dev/null
+++ b/python-pd-helper.spec
@@ -0,0 +1,385 @@
+%global _empty_manifest_terminate_build 0
+Name: python-pd-helper
+Version: 1.0.0
+Release: 1
+Summary: A helpful script to optimize a Pandas DataFrame.
+License: MIT License
+URL: https://github.com/justinhchae/pd-helper
+Source0: https://mirrors.aliyun.com/pypi/web/packages/91/90/e3db69d9c398cecc805a93885b8494974a7f1f579a5a62340148379be1d5/pd_helper-1.0.0.tar.gz
+BuildArch: noarch
+
+Requires: python3-pandas
+Requires: python3-numpy
+Requires: python3-tqdm
+Requires: python3-shortuuid
+
+%description
+# pd-helper
+
+ A helpful package to streamline Pandas DataFrame optimization.
+
+ Save 50-75% on DataFrame memory usage by running the optimizer.
+
+ Autoconfigure dtypes for appropriate data types in each column with **helper**.
+
+ Generate a random DataFrame of controlled random variables for testing with **maker**.
+
+## Install
+ ```bash
+ pip install pd-helper
+ ```
+
+## Basic Usage to Iterate over DataFrame
+```python
+from pd_helper.maker import MakeData
+from pd_helper.helper import optimize
+faker = MakeData()
+
+if __name__ == "__main__":
+ # MakeData() generates a fake dataframe, convenient for testing
+ df = faker.make_df()
+ df = optimize(df)
+```
+## Better Usage With Multiprocessing
+```python
+from pd_helper.maker import MakeData
+from pd_helper.helper import optimize
+faker = MakeData()
+
+if __name__ == "__main__":
+ # MakeData() generates a fake dataframe, convenient for testing
+ df = faker.make_df()
+ df = optimize(df, enable_mp=True)
+```
+
+## Specify Special Mappings
+```python
+from pd_helper.maker import MakeData
+from pd_helper.helper import optimize
+faker = MakeData()
+
+if __name__ == "__main__":
+ # MakeData() generates a fake dataframe, convenient for testing
+ df = faker.make_df()
+ special_mappings = {'string': ['object_id'],
+ 'category': ['item_name']}
+
+ # special mappings are applied instead of the optimize ruleset, and those columns are returned with the mapped dtypes.
+ df = optimize(df
+ , enable_mp=True,
+ special_mappings=special_mappings
+ )
+```
+
+
+## Sample Results with Helper
+
+```bash
+Starting with 175.63 MB memory.
+
+After optimization.
+
+Ending with 65.33 MB memory.
+```
+
+## Generating a Randomly Imperfect DataFrame with Maker
+
+ Maker provides a class, MakeData(), to generate a table of made-up records.
+
+ Each row is an event where an item was retrieved.
+
+ Options to make the table imperfectly random in various ways.
+
+ Sample table below:
+
+| | Retrieved Date | Item Name | Retrieved | Condition | Sector |
+| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
+| Example | 2019-01-01, 2019-03-4 | Toaster, Lighter | True, False | Junk, Excellent | 1, 2 |
+| Data Type | String | String | String | String | Integer |
+
+
+## References
+
+* Pandas Categorical: <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Categorical.html>
+
+* Pandas Pickle: <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_pickle.html>
+
+* Pandas CSV: <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html>
+
+* Pandas Datetime: <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html>
+
+### TODO
+
+* Improve efficiency of iterating on DataFrame.
+
+* Allow user to toggle logging.
+
+* Provide tools for imputing missing data.
+
+
+
+
+%package -n python3-pd-helper
+Summary: A helpful script to optimize a Pandas DataFrame.
+Provides: python-pd-helper
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-pd-helper
+# pd-helper
+
+ A helpful package to streamline Pandas DataFrame optimization.
+
+ Save 50-75% on DataFrame memory usage by running the optimizer.
+
+ Autoconfigure dtypes for appropriate data types in each column with **helper**.
+
+ Generate a random DataFrame of controlled random variables for testing with **maker**.
+
+## Install
+ ```bash
+ pip install pd-helper
+ ```
+
+## Basic Usage to Iterate over DataFrame
+```python
+from pd_helper.maker import MakeData
+from pd_helper.helper import optimize
+faker = MakeData()
+
+if __name__ == "__main__":
+ # MakeData() generates a fake dataframe, convenient for testing
+ df = faker.make_df()
+ df = optimize(df)
+```
+## Better Usage With Multiprocessing
+```python
+from pd_helper.maker import MakeData
+from pd_helper.helper import optimize
+faker = MakeData()
+
+if __name__ == "__main__":
+ # MakeData() generates a fake dataframe, convenient for testing
+ df = faker.make_df()
+ df = optimize(df, enable_mp=True)
+```
+
+## Specify Special Mappings
+```python
+from pd_helper.maker import MakeData
+from pd_helper.helper import optimize
+faker = MakeData()
+
+if __name__ == "__main__":
+ # MakeData() generates a fake dataframe, convenient for testing
+ df = faker.make_df()
+ special_mappings = {'string': ['object_id'],
+ 'category': ['item_name']}
+
+ # special mappings are applied instead of the optimize ruleset, and those columns are returned with the mapped dtypes.
+ df = optimize(df
+ , enable_mp=True,
+ special_mappings=special_mappings
+ )
+```
+
+
+## Sample Results with Helper
+
+```bash
+Starting with 175.63 MB memory.
+
+After optimization.
+
+Ending with 65.33 MB memory.
+```
+
+## Generating a Randomly Imperfect DataFrame with Maker
+
+ Maker provides a class, MakeData(), to generate a table of made-up records.
+
+ Each row is an event where an item was retrieved.
+
+ Options to make the table imperfectly random in various ways.
+
+ Sample table below:
+
+| | Retrieved Date | Item Name | Retrieved | Condition | Sector |
+| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
+| Example | 2019-01-01, 2019-03-4 | Toaster, Lighter | True, False | Junk, Excellent | 1, 2 |
+| Data Type | String | String | String | String | Integer |
+
+
+## References
+
+* Pandas Categorical: <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Categorical.html>
+
+* Pandas Pickle: <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_pickle.html>
+
+* Pandas CSV: <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html>
+
+* Pandas Datetime: <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html>
+
+### TODO
+
+* Improve efficiency of iterating on DataFrame.
+
+* Allow user to toggle logging.
+
+* Provide tools for imputing missing data.
+
+
+
+
+%package help
+Summary: Development documents and examples for pd-helper
+Provides: python3-pd-helper-doc
+%description help
+# pd-helper
+
+ A helpful package to streamline Pandas DataFrame optimization.
+
+ Save 50-75% on DataFrame memory usage by running the optimizer.
+
+ Autoconfigure dtypes for appropriate data types in each column with **helper**.
+
+ Generate a random DataFrame of controlled random variables for testing with **maker**.
+
+## Install
+ ```bash
+ pip install pd-helper
+ ```
+
+## Basic Usage to Iterate over DataFrame
+```python
+from pd_helper.maker import MakeData
+from pd_helper.helper import optimize
+faker = MakeData()
+
+if __name__ == "__main__":
+ # MakeData() generates a fake dataframe, convenient for testing
+ df = faker.make_df()
+ df = optimize(df)
+```
+## Better Usage With Multiprocessing
+```python
+from pd_helper.maker import MakeData
+from pd_helper.helper import optimize
+faker = MakeData()
+
+if __name__ == "__main__":
+ # MakeData() generates a fake dataframe, convenient for testing
+ df = faker.make_df()
+ df = optimize(df, enable_mp=True)
+```
+
+## Specify Special Mappings
+```python
+from pd_helper.maker import MakeData
+from pd_helper.helper import optimize
+faker = MakeData()
+
+if __name__ == "__main__":
+ # MakeData() generates a fake dataframe, convenient for testing
+ df = faker.make_df()
+ special_mappings = {'string': ['object_id'],
+ 'category': ['item_name']}
+
+ # special mappings are applied instead of the optimize ruleset, and those columns are returned with the mapped dtypes.
+ df = optimize(df
+ , enable_mp=True,
+ special_mappings=special_mappings
+ )
+```
+
+
+## Sample Results with Helper
+
+```bash
+Starting with 175.63 MB memory.
+
+After optimization.
+
+Ending with 65.33 MB memory.
+```
+
+## Generating a Randomly Imperfect DataFrame with Maker
+
+ Maker provides a class, MakeData(), to generate a table of made-up records.
+
+ Each row is an event where an item was retrieved.
+
+ Options to make the table imperfectly random in various ways.
+
+ Sample table below:
+
+| | Retrieved Date | Item Name | Retrieved | Condition | Sector |
+| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
+| Example | 2019-01-01, 2019-03-4 | Toaster, Lighter | True, False | Junk, Excellent | 1, 2 |
+| Data Type | String | String | String | String | Integer |
+
+
+## References
+
+* Pandas Categorical: <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Categorical.html>
+
+* Pandas Pickle: <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_pickle.html>
+
+* Pandas CSV: <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html>
+
+* Pandas Datetime: <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html>
+
+### TODO
+
+* Improve efficiency of iterating on DataFrame.
+
+* Allow user to toggle logging.
+
+* Provide tools for imputing missing data.
+
+
+
+
+%prep
+%autosetup -n pd_helper-1.0.0
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+ find usr/lib -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+ find usr/lib64 -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+ find usr/bin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+ find usr/sbin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+ find usr/share/man -type f -printf "\"/%h/%f.gz\"\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-pd-helper -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Tue Jun 20 2023 Python_Bot <Python_Bot@openeuler.org> - 1.0.0-1
+- Package Spec generated
diff --git a/sources b/sources
new file mode 100644
index 0000000..350b4f4
--- /dev/null
+++ b/sources
@@ -0,0 +1 @@
+94d0e1ee5ebbcec038bfd5adfc91ec97 pd_helper-1.0.0.tar.gz
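
Note on the `sources` entry above: the line pairs a hex digest with the tarball name. The digest is 32 hex digits long, which is consistent with MD5, but the diff does not name the algorithm, so treat that as an assumption. A minimal Python sketch of how a downloaded tarball could be checked against such a line follows; the helper names `parse_sources_line`, `file_md5`, and `verify_source` are hypothetical and are not part of any openEuler tooling.

```python
import hashlib
from pathlib import Path


def parse_sources_line(line):
    """Split a '<hex digest> <filename>' line into its two fields."""
    digest, filename = line.split(None, 1)
    return digest.lower(), filename.strip()


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    md5 = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def verify_source(line, directory="."):
    """Compare the file named on a sources line with its recorded digest.

    Assumes the digest is MD5; the sources file itself does not say so.
    """
    expected, filename = parse_sources_line(line)
    return file_md5(Path(directory) / filename) == expected


if __name__ == "__main__":
    entry = "94d0e1ee5ebbcec038bfd5adfc91ec97 pd_helper-1.0.0.tar.gz"
    # Only parses the entry; verification needs the tarball on disk.
    print(parse_sources_line(entry))
```

With the tarball in the working directory, `verify_source(entry)` would return `True` only if the file's digest matches the recorded one.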