%global _empty_manifest_terminate_build 0
Name:		python-pandas-aws
Version:	0.1.6
Release:	1
Summary:	AWS made easy for data scientists: use a pandas DataFrame with AWS services
License:	MIT
URL:		https://github.com/FlorentPajot/pandas-aws
Source0:	https://mirrors.aliyun.com/pypi/web/packages/bb/99/352369f0265066eeb3d222f312beb555df5b6676fa0a30d93e6edceeabe6/pandas-aws-0.1.6.tar.gz
BuildArch:	noarch

Requires:	python3-boto3
Requires:	python3-pandas
Requires:	python3-fastparquet
Requires:	python3-pyarrow
Requires:	python3-xlsxwriter
Requires:	python3-xlrd
Requires:	python3-psycopg2

%description
[![Build Status](https://travis-ci.com/FlorentPajot/pandas-aws.svg?branch=master)](https://travis-ci.com/FlorentPajot/pandas-aws) [![codecov](https://codecov.io/gh/FlorentPajot/pandas-aws/branch/master/graph/badge.svg)](https://codecov.io/gh/FlorentPajot/pandas-aws)

# Pandas AWS - AWS made easy for data scientists

Pandas AWS makes it super easy to use a pandas.DataFrame along with AWS services.

## Working with S3

First, create an S3 client to be used later and define a bucket:

```
from pandas_aws import get_client

s3 = get_client('s3')
MY_BUCKET = 'pandas-aws-bucket'
```

Example 1: get a DataFrame from a Parquet file stored in S3

```
from pandas_aws.s3 import get_df

df_from_parquet_file = get_df(s3, MY_BUCKET, 'my_parquet_file_path', format='parquet')
```

Example 2: get a DataFrame from multiple CSV files (with the same schema) stored in S3

```
from pandas_aws.s3 import get_df_from_keys

df_from_list = get_df_from_keys(s3, MY_BUCKET, prefix='my-folder', suffix='.csv')
```

Example 3: put a DataFrame into S3 using the xlsx (Excel) file format

```
from pandas_aws.s3 import put_df

put_df(s3, my_dataframe, MY_BUCKET, 'target_file_path', format='xlsx')
```

Example 4: put a DataFrame into S3 using a multipart upload

```
from pandas_aws.s3 import put_df

put_df(s3, my_dataframe, MY_BUCKET, 'target_file_path', format='csv', compression='gzip', parts=8)
```

# Installing pandas-aws

## Pip installation

You can use pip to install the package: `pip install pandas-aws`

# Contributing to pandas-aws

## Git clone

We use the `develop` branch as the release branch, so `git clone` the repository and `git checkout develop` to get the latest version in development.

```
git clone git@github.com:FlorentPajot/pandas-aws.git
```

## Preparing your environment

Pandas AWS uses `poetry` to manage dependencies, so `poetry` is required:

`curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python`

Create a separate Python environment, for example using `pyenv` along with `pyenv-virtualenv` and Python 3.7.7:

```
pyenv install 3.7.7
pyenv virtualenv 3.7.7 pandas-aws
pyenv activate pandas-aws
```

Check your environment using:

```
which python   # should show something like .pyenv/shims/python
python -V      # should show Python 3.7.7 (or whichever version you selected)
pip list       # should show little besides pip and setuptools
```

In case you encounter a problem, check the `pyenv` documentation. Then, after your `git clone` of the project repository, install dependencies with poetry:

`poetry install`

## Guidelines

Todo

## Requires

The project needs the following system dependency:

- libpq-dev (psycopg2 dependency)

%package -n python3-pandas-aws
Summary:	AWS made easy for data scientists: use a pandas DataFrame with AWS services
Provides:	python-pandas-aws
BuildRequires:	python3-devel
BuildRequires:	python3-setuptools
BuildRequires:	python3-pip
%description -n python3-pandas-aws
[![Build Status](https://travis-ci.com/FlorentPajot/pandas-aws.svg?branch=master)](https://travis-ci.com/FlorentPajot/pandas-aws) [![codecov](https://codecov.io/gh/FlorentPajot/pandas-aws/branch/master/graph/badge.svg)](https://codecov.io/gh/FlorentPajot/pandas-aws)

# Pandas AWS - AWS made easy for data scientists

Pandas AWS makes it super easy to use a pandas.DataFrame along with AWS services.
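For intuition, the package's `get_df_from_keys` helper returns one DataFrame built from several same-schema CSV objects; the concatenation step it performs can be sketched locally with plain pandas (a sketch with illustrative data — the S3 listing and download are what pandas-aws adds on top):

```python
# Local sketch: concatenate several same-schema CSVs into one DataFrame,
# mirroring what pandas-aws does for the S3 keys matching a prefix/suffix.
import io

import pandas as pd

# Two CSV payloads standing in for two S3 objects with the same schema.
csv_parts = [
    "id,value\n1,10\n2,20\n",
    "id,value\n3,30\n4,40\n",
]

# Parse each payload, then stack them into a single DataFrame.
frames = [pd.read_csv(io.StringIO(part)) for part in csv_parts]
df = pd.concat(frames, ignore_index=True)

print(len(df))  # 4 rows collected from 2 "files"
```

The `ignore_index=True` matters: without it the concatenated frame keeps each part's 0-based index, producing duplicate index labels.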
## Working with S3

First, create an S3 client to be used later and define a bucket:

```
from pandas_aws import get_client

s3 = get_client('s3')
MY_BUCKET = 'pandas-aws-bucket'
```

Example 1: get a DataFrame from a Parquet file stored in S3

```
from pandas_aws.s3 import get_df

df_from_parquet_file = get_df(s3, MY_BUCKET, 'my_parquet_file_path', format='parquet')
```

Example 2: get a DataFrame from multiple CSV files (with the same schema) stored in S3

```
from pandas_aws.s3 import get_df_from_keys

df_from_list = get_df_from_keys(s3, MY_BUCKET, prefix='my-folder', suffix='.csv')
```

Example 3: put a DataFrame into S3 using the xlsx (Excel) file format

```
from pandas_aws.s3 import put_df

put_df(s3, my_dataframe, MY_BUCKET, 'target_file_path', format='xlsx')
```

Example 4: put a DataFrame into S3 using a multipart upload

```
from pandas_aws.s3 import put_df

put_df(s3, my_dataframe, MY_BUCKET, 'target_file_path', format='csv', compression='gzip', parts=8)
```

# Installing pandas-aws

## Pip installation

You can use pip to install the package: `pip install pandas-aws`

# Contributing to pandas-aws

## Git clone

We use the `develop` branch as the release branch, so `git clone` the repository and `git checkout develop` to get the latest version in development.

```
git clone git@github.com:FlorentPajot/pandas-aws.git
```

## Preparing your environment

Pandas AWS uses `poetry` to manage dependencies, so `poetry` is required:

`curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python`

Create a separate Python environment, for example using `pyenv` along with `pyenv-virtualenv` and Python 3.7.7:

```
pyenv install 3.7.7
pyenv virtualenv 3.7.7 pandas-aws
pyenv activate pandas-aws
```

Check your environment using:

```
which python   # should show something like .pyenv/shims/python
python -V      # should show Python 3.7.7 (or whichever version you selected)
pip list       # should show little besides pip and setuptools
```

In case you encounter a problem, check the `pyenv` documentation. Then, after your `git clone` of the project repository, install dependencies with poetry:

`poetry install`

## Guidelines

Todo

## Requires

The project needs the following system dependency:

- libpq-dev (psycopg2 dependency)

%package help
Summary:	Development documents and examples for pandas-aws
Provides:	python3-pandas-aws-doc
%description help
[![Build Status](https://travis-ci.com/FlorentPajot/pandas-aws.svg?branch=master)](https://travis-ci.com/FlorentPajot/pandas-aws) [![codecov](https://codecov.io/gh/FlorentPajot/pandas-aws/branch/master/graph/badge.svg)](https://codecov.io/gh/FlorentPajot/pandas-aws)

# Pandas AWS - AWS made easy for data scientists

Pandas AWS makes it super easy to use a pandas.DataFrame along with AWS services.
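The gzip-compressed CSV serialization used in the multipart-upload example (`format='csv', compression='gzip'`) can be reproduced locally with plain pandas — a sketch with made-up data; the actual S3 multipart upload is what pandas-aws handles for you:

```python
# Round-trip a DataFrame through a gzip-compressed CSV file, the same
# on-disk representation as format='csv', compression='gzip' in pandas-aws.
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3], "value": [10.0, 20.0, 30.0]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.csv.gz")
    # pandas gzips the CSV bytes itself; no explicit gzip module needed.
    df.to_csv(path, index=False, compression="gzip")
    restored = pd.read_csv(path, compression="gzip")

print(restored.equals(df))  # True: the round trip preserves the data
```

`index=False` keeps the synthetic row index out of the file, so reading it back yields the original columns unchanged.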
## Working with S3

First, create an S3 client to be used later and define a bucket:

```
from pandas_aws import get_client

s3 = get_client('s3')
MY_BUCKET = 'pandas-aws-bucket'
```

Example 1: get a DataFrame from a Parquet file stored in S3

```
from pandas_aws.s3 import get_df

df_from_parquet_file = get_df(s3, MY_BUCKET, 'my_parquet_file_path', format='parquet')
```

Example 2: get a DataFrame from multiple CSV files (with the same schema) stored in S3

```
from pandas_aws.s3 import get_df_from_keys

df_from_list = get_df_from_keys(s3, MY_BUCKET, prefix='my-folder', suffix='.csv')
```

Example 3: put a DataFrame into S3 using the xlsx (Excel) file format

```
from pandas_aws.s3 import put_df

put_df(s3, my_dataframe, MY_BUCKET, 'target_file_path', format='xlsx')
```

Example 4: put a DataFrame into S3 using a multipart upload

```
from pandas_aws.s3 import put_df

put_df(s3, my_dataframe, MY_BUCKET, 'target_file_path', format='csv', compression='gzip', parts=8)
```

# Installing pandas-aws

## Pip installation

You can use pip to install the package: `pip install pandas-aws`

# Contributing to pandas-aws

## Git clone

We use the `develop` branch as the release branch, so `git clone` the repository and `git checkout develop` to get the latest version in development.

```
git clone git@github.com:FlorentPajot/pandas-aws.git
```

## Preparing your environment

Pandas AWS uses `poetry` to manage dependencies, so `poetry` is required:

`curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python`

Create a separate Python environment, for example using `pyenv` along with `pyenv-virtualenv` and Python 3.7.7:

```
pyenv install 3.7.7
pyenv virtualenv 3.7.7 pandas-aws
pyenv activate pandas-aws
```

Check your environment using:

```
which python   # should show something like .pyenv/shims/python
python -V      # should show Python 3.7.7 (or whichever version you selected)
pip list       # should show little besides pip and setuptools
```

In case you encounter a problem, check the `pyenv` documentation. Then, after your `git clone` of the project repository, install dependencies with poetry:

`poetry install`

## Guidelines

Todo

## Requires

The project needs the following system dependency:

- libpq-dev (psycopg2 dependency)

%prep
%autosetup -n pandas-aws-0.1.6

%build
%py3_build

%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
	find usr/lib -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
	find usr/lib64 -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
	find usr/bin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
	find usr/sbin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
	find usr/share/man -type f -printf "\"/%h/%f.gz\"\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .

%files -n python3-pandas-aws -f filelist.lst
%dir %{python3_sitelib}/*

%files help -f doclist.lst
%{_docdir}/*

%changelog
* Tue Jun 20 2023 Python_Bot - 0.1.6-1
- Package Spec generated