%global _empty_manifest_terminate_build 0
Name: python-mrjob
Version: 0.7.4
Release: 1
Summary: Python MapReduce framework
License: Apache-2.0
URL: https://github.com/Yelp/mrjob
Source0: https://mirrors.nju.edu.cn/pypi/web/packages/2c/ed/207853d1ebc6b549551d12db35e471289d26cac2cdae363419357294d3c5/mrjob-0.7.4.tar.gz
BuildArch: noarch
Requires: python3-PyYAML
Requires: python3-boto3
Requires: python3-botocore
Requires: python3-google-cloud-dataproc
Requires: python3-google-cloud-logging
Requires: python3-google-cloud-storage
Requires: python3-rapidjson
Requires: python3-simplejson
Requires: python3-ujson
%description
mrjob is a Python 2.7/3.4+ package that helps you write and run Hadoop
Streaming jobs.
Documentation is available for both the stable version (v0.7.4) and the development version.
mrjob fully supports Amazon's Elastic MapReduce (EMR) service, which allows you
to buy time on a Hadoop cluster on an hourly basis. mrjob has basic support for Google Cloud
Dataproc (Dataproc), which allows you to buy time on a Hadoop cluster on a minute-by-minute
basis. It also works with your own Hadoop cluster.
Some important features:
* Run jobs on EMR, Google Cloud Dataproc, your own Hadoop cluster, or locally (for testing).
* Write multi-step jobs (one map-reduce step feeds into the next)
* Easily launch Spark jobs on EMR or your own Hadoop cluster
* Duplicate your production environment inside Hadoop
* Upload your source tree and put it in your job's ``$PYTHONPATH``
* Run make and other setup scripts
* Set environment variables (e.g. ``$TZ``)
* Easily install Python packages from tarballs (EMR only)
* Setup handled transparently by ``mrjob.conf`` config file
* Automatically interpret error logs
* SSH tunnel to the Hadoop job tracker (EMR only)
* Minimal setup
  * To run on EMR, set ``$AWS_ACCESS_KEY_ID`` and ``$AWS_SECRET_ACCESS_KEY``
  * To run on Dataproc, set ``$GOOGLE_APPLICATION_CREDENTIALS``
  * No setup needed to use mrjob on your own Hadoop cluster
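As a quick illustration of the API described above, the canonical word-count job
from the mrjob documentation looks roughly like this (a minimal sketch; the file
name ``mr_word_count.py`` is only an assumed example)::

    from mrjob.job import MRJob

    class MRWordFrequencyCount(MRJob):
        """Count characters, words, and lines across all input files."""

        def mapper(self, _, line):
            # Each mapper call receives one line of raw text input.
            yield "chars", len(line)
            yield "words", len(line.split())
            yield "lines", 1

        def reducer(self, key, values):
            # Sum the per-line counts emitted by the mappers for each key.
            yield key, sum(values)

    if __name__ == '__main__':
        MRWordFrequencyCount.run()

Run it locally with ``python mr_word_count.py input.txt``, or on EMR with
``python mr_word_count.py -r emr input.txt`` once the AWS credentials above are
set; ``-r dataproc`` and ``-r hadoop`` select the other runners.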
%package -n python3-mrjob
Summary: Python MapReduce framework
Provides: python-mrjob
BuildRequires: python3-devel
BuildRequires: python3-setuptools
BuildRequires: python3-pip
%description -n python3-mrjob
mrjob is a Python 2.7/3.4+ package that helps you write and run Hadoop
Streaming jobs.
Documentation is available for both the stable version (v0.7.4) and the development version.
mrjob fully supports Amazon's Elastic MapReduce (EMR) service, which allows you
to buy time on a Hadoop cluster on an hourly basis. mrjob has basic support for Google Cloud
Dataproc (Dataproc), which allows you to buy time on a Hadoop cluster on a minute-by-minute
basis. It also works with your own Hadoop cluster.
Some important features:
* Run jobs on EMR, Google Cloud Dataproc, your own Hadoop cluster, or locally (for testing).
* Write multi-step jobs (one map-reduce step feeds into the next)
* Easily launch Spark jobs on EMR or your own Hadoop cluster
* Duplicate your production environment inside Hadoop
* Upload your source tree and put it in your job's ``$PYTHONPATH``
* Run make and other setup scripts
* Set environment variables (e.g. ``$TZ``)
* Easily install Python packages from tarballs (EMR only)
* Setup handled transparently by ``mrjob.conf`` config file
* Automatically interpret error logs
* SSH tunnel to the Hadoop job tracker (EMR only)
* Minimal setup
  * To run on EMR, set ``$AWS_ACCESS_KEY_ID`` and ``$AWS_SECRET_ACCESS_KEY``
  * To run on Dataproc, set ``$GOOGLE_APPLICATION_CREDENTIALS``
  * No setup needed to use mrjob on your own Hadoop cluster
%package help
Summary: Development documents and examples for mrjob
Provides: python3-mrjob-doc
%description help
mrjob is a Python 2.7/3.4+ package that helps you write and run Hadoop
Streaming jobs.
Documentation is available for both the stable version (v0.7.4) and the development version.
mrjob fully supports Amazon's Elastic MapReduce (EMR) service, which allows you
to buy time on a Hadoop cluster on an hourly basis. mrjob has basic support for Google Cloud
Dataproc (Dataproc), which allows you to buy time on a Hadoop cluster on a minute-by-minute
basis. It also works with your own Hadoop cluster.
Some important features:
* Run jobs on EMR, Google Cloud Dataproc, your own Hadoop cluster, or locally (for testing).
* Write multi-step jobs (one map-reduce step feeds into the next)
* Easily launch Spark jobs on EMR or your own Hadoop cluster
* Duplicate your production environment inside Hadoop
* Upload your source tree and put it in your job's ``$PYTHONPATH``
* Run make and other setup scripts
* Set environment variables (e.g. ``$TZ``)
* Easily install Python packages from tarballs (EMR only)
* Setup handled transparently by ``mrjob.conf`` config file
* Automatically interpret error logs
* SSH tunnel to the Hadoop job tracker (EMR only)
* Minimal setup
  * To run on EMR, set ``$AWS_ACCESS_KEY_ID`` and ``$AWS_SECRET_ACCESS_KEY``
  * To run on Dataproc, set ``$GOOGLE_APPLICATION_CREDENTIALS``
  * No setup needed to use mrjob on your own Hadoop cluster
%prep
%autosetup -n mrjob-0.7.4
%build
%py3_build
%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
    find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
    find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
    find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
    find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
    find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .
%files -n python3-mrjob -f filelist.lst
%dir %{python3_sitelib}/*
%files help -f doclist.lst
%{_docdir}/*
%changelog
* Sun Apr 23 2023 Python_Bot - 0.7.4-1
- Package Spec generated