%global _empty_manifest_terminate_build 0
Name: python-pd-helper
Version: 1.0.0
Release: 1
Summary: A helpful script to optimize a Pandas DataFrame.
License: MIT License
URL: https://github.com/justinhchae/pd-helper
Source0: https://mirrors.aliyun.com/pypi/web/packages/91/90/e3db69d9c398cecc805a93885b8494974a7f1f579a5a62340148379be1d5/pd_helper-1.0.0.tar.gz
BuildArch: noarch

Requires: python3-pandas
Requires: python3-numpy
Requires: python3-tqdm
Requires: python3-shortuuid

%description
# pd-helper
A helpful package to streamline Pandas DataFrame optimization.
Save 50-75% on DataFrame memory usage by running the optimizer.
Autoconfigure dtypes for appropriate data types in each column with **helper**.
Generate a random DataFrame of controlled random variables for testing with **maker**.

## Install
```bash
pip install pd-helper
```

## Basic Usage to Iterate over DataFrame
```python
from pd_helper.maker import MakeData
from pd_helper.helper import optimize

faker = MakeData()

if __name__ == "__main__":
    # MakeData() generates a fake dataframe, convenient for testing
    df = faker.make_df()
    df = optimize(df)
```

## Better Usage With Multiprocessing
```python
from pd_helper.maker import MakeData
from pd_helper.helper import optimize

faker = MakeData()

if __name__ == "__main__":
    # MakeData() generates a fake dataframe, convenient for testing
    df = faker.make_df()
    df = optimize(df, enable_mp=True)
```

## Specify Special Mappings
```python
from pd_helper.maker import MakeData
from pd_helper.helper import optimize

faker = MakeData()

if __name__ == "__main__":
    # MakeData() generates a fake dataframe, convenient for testing
    df = faker.make_df()
    special_mappings = {'string': ['object_id'],
                        'category': ['item_name']}
    # Special mappings are applied instead of the optimize ruleset,
    # and the mapped columns are still returned.
    df = optimize(df, enable_mp=True, special_mappings=special_mappings)
```

## Sample Results with Helper
```bash
Starting with 175.63 MB memory.
After optmization.
Ending with 65.33 MB memory.
```

## Generating a Randomly Imperfect DataFrame with Maker
Maker provides a class, MakeData(), to generate a table of made-up records.
Each row is an event where an item was retrieved.
Options are available to make the table imperfectly random in various ways.

Sample table below:

| | Retrieved Date | Item Name | Retrieved | Condition | Sector |
| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
| Example | 2019-01-01, 2019-03-4 | Toaster, Lighter | True, False | Junk, Excellent | 1, 2 |
| Data Type | String | String | String | String | Integer |

## References
* Pandas Categorical:
* Pandas Pickle:
* Pandas CSV:
* Pandas Datetime:

### TODO
* Improve efficiency of iterating on DataFrame.
* Allow user to toggle logging.
* Provide tools for imputing missing data.

%package -n python3-pd-helper
Summary: A helpful script to optimize a Pandas DataFrame.
Provides: python-pd-helper
BuildRequires: python3-devel
BuildRequires: python3-setuptools
BuildRequires: python3-pip

%description -n python3-pd-helper
# pd-helper
A helpful package to streamline Pandas DataFrame optimization.
Save 50-75% on DataFrame memory usage by running the optimizer.
Autoconfigure dtypes for appropriate data types in each column with **helper**.
Generate a random DataFrame of controlled random variables for testing with **maker**.

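As a rough illustration of the kind of dtype conversion described above, the sketch below uses plain pandas only. It is not pd-helper's own code: the function name `shrink_dtypes` and the `max_unique_ratio` threshold are hypothetical, chosen only for this example.

```python
import pandas as pd
from pandas.api import types as pdt


def shrink_dtypes(df: pd.DataFrame, max_unique_ratio: float = 0.5) -> pd.DataFrame:
    """Return a copy of df with smaller dtypes (hypothetical helper, not pd-helper's API)."""
    out = df.copy()
    for col in out.columns:
        series = out[col]
        if pdt.is_integer_dtype(series):
            # Pick the smallest integer type that can hold the values.
            out[col] = pd.to_numeric(series, downcast="integer")
        elif pdt.is_float_dtype(series):
            # Downcasting to float32 can lose precision; acceptable for a sketch.
            out[col] = pd.to_numeric(series, downcast="float")
        elif pdt.is_object_dtype(series) or pdt.is_string_dtype(series):
            # Columns with few distinct values are stored more compactly as categories.
            if series.nunique() / max(len(series), 1) <= max_unique_ratio:
                out[col] = series.astype("category")
    return out


if __name__ == "__main__":
    df = pd.DataFrame({
        "sector": [1, 2, 3, 4] * 2500,
        "condition": ["Junk", "Excellent"] * 5000,
    })
    before = df.memory_usage(deep=True).sum()
    after = shrink_dtypes(df).memory_usage(deep=True).sum()
    print(f"before: {before} bytes, after: {after} bytes")
```

The actual savings depend on the data: low-cardinality string columns and small integers shrink the most, while columns of mostly unique strings are left unchanged by this sketch.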
## Install
```bash
pip install pd-helper
```

## Basic Usage to Iterate over DataFrame
```python
from pd_helper.maker import MakeData
from pd_helper.helper import optimize

faker = MakeData()

if __name__ == "__main__":
    # MakeData() generates a fake dataframe, convenient for testing
    df = faker.make_df()
    df = optimize(df)
```

## Better Usage With Multiprocessing
```python
from pd_helper.maker import MakeData
from pd_helper.helper import optimize

faker = MakeData()

if __name__ == "__main__":
    # MakeData() generates a fake dataframe, convenient for testing
    df = faker.make_df()
    df = optimize(df, enable_mp=True)
```

## Specify Special Mappings
```python
from pd_helper.maker import MakeData
from pd_helper.helper import optimize

faker = MakeData()

if __name__ == "__main__":
    # MakeData() generates a fake dataframe, convenient for testing
    df = faker.make_df()
    special_mappings = {'string': ['object_id'],
                        'category': ['item_name']}
    # Special mappings are applied instead of the optimize ruleset,
    # and the mapped columns are still returned.
    df = optimize(df, enable_mp=True, special_mappings=special_mappings)
```

## Sample Results with Helper
```bash
Starting with 175.63 MB memory.
After optmization.
Ending with 65.33 MB memory.
```

## Generating a Randomly Imperfect DataFrame with Maker
Maker provides a class, MakeData(), to generate a table of made-up records.
Each row is an event where an item was retrieved.
Options are available to make the table imperfectly random in various ways.

Sample table below:

| | Retrieved Date | Item Name | Retrieved | Condition | Sector |
| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
| Example | 2019-01-01, 2019-03-4 | Toaster, Lighter | True, False | Junk, Excellent | 1, 2 |
| Data Type | String | String | String | String | Integer |

## References
* Pandas Categorical:
* Pandas Pickle:
* Pandas CSV:
* Pandas Datetime:

### TODO
* Improve efficiency of iterating on DataFrame.
* Allow user to toggle logging.
* Provide tools for imputing missing data.

%package help
Summary: Development documents and examples for pd-helper
Provides: python3-pd-helper-doc

%description help
# pd-helper
A helpful package to streamline Pandas DataFrame optimization.
Save 50-75% on DataFrame memory usage by running the optimizer.
Autoconfigure dtypes for appropriate data types in each column with **helper**.
Generate a random DataFrame of controlled random variables for testing with **maker**.

## Install
```bash
pip install pd-helper
```

## Basic Usage to Iterate over DataFrame
```python
from pd_helper.maker import MakeData
from pd_helper.helper import optimize

faker = MakeData()

if __name__ == "__main__":
    # MakeData() generates a fake dataframe, convenient for testing
    df = faker.make_df()
    df = optimize(df)
```

## Better Usage With Multiprocessing
```python
from pd_helper.maker import MakeData
from pd_helper.helper import optimize

faker = MakeData()

if __name__ == "__main__":
    # MakeData() generates a fake dataframe, convenient for testing
    df = faker.make_df()
    df = optimize(df, enable_mp=True)
```

## Specify Special Mappings
```python
from pd_helper.maker import MakeData
from pd_helper.helper import optimize

faker = MakeData()

if __name__ == "__main__":
    # MakeData() generates a fake dataframe, convenient for testing
    df = faker.make_df()
    special_mappings = {'string': ['object_id'],
                        'category': ['item_name']}
    # Special mappings are applied instead of the optimize ruleset,
    # and the mapped columns are still returned.
    df = optimize(df, enable_mp=True, special_mappings=special_mappings)
```

## Sample Results with Helper
```bash
Starting with 175.63 MB memory.
After optmization.
Ending with 65.33 MB memory.
```

## Generating a Randomly Imperfect DataFrame with Maker
Maker provides a class, MakeData(), to generate a table of made-up records.
Each row is an event where an item was retrieved.
Options are available to make the table imperfectly random in various ways.

Sample table below:

| | Retrieved Date | Item Name | Retrieved | Condition | Sector |
| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
| Example | 2019-01-01, 2019-03-4 | Toaster, Lighter | True, False | Junk, Excellent | 1, 2 |
| Data Type | String | String | String | String | Integer |

## References
* Pandas Categorical:
* Pandas Pickle:
* Pandas CSV:
* Pandas Datetime:

### TODO
* Improve efficiency of iterating on DataFrame.
* Allow user to toggle logging.
* Provide tools for imputing missing data.

%prep
%autosetup -n pd_helper-1.0.0

%build
%py3_build

%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
    find usr/lib -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
    find usr/lib64 -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
    find usr/bin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
    find usr/sbin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
    find usr/share/man -type f -printf "\"/%h/%f.gz\"\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .

%files -n python3-pd-helper -f filelist.lst
%dir %{python3_sitelib}/*

%files help -f doclist.lst
%{_docdir}/*

%changelog
* Tue Jun 20 2023 Python_Bot - 1.0.0-1
- Package Spec generated