author     CoprDistGit <infra@openeuler.org>  2023-04-10 13:23:33 +0000
committer  CoprDistGit <infra@openeuler.org>  2023-04-10 13:23:33 +0000
commit     7f97180ebb80aff752923a8c77ebe55f39c8c889 (patch)
tree       74d4caea36e3d6169c7ef9c242efad5611e298c0
parent     a0e46b4847ac0f29dd95ee944c5170403bbe5071 (diff)
automatic import of python-spark-sklearn
-rw-r--r--  .gitignore                   1
-rw-r--r--  python-spark-sklearn.spec  105
-rw-r--r--  sources                      1
3 files changed, 107 insertions, 0 deletions
diff --git a/.gitignore b/.gitignore
index e69de29..f3e01c3 100644
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1 @@
+/spark-sklearn-0.3.0.tar.gz
diff --git a/python-spark-sklearn.spec b/python-spark-sklearn.spec
new file mode 100644
index 0000000..3a200be
--- /dev/null
+++ b/python-spark-sklearn.spec
@@ -0,0 +1,105 @@
+%global _empty_manifest_terminate_build 0
+Name: python-spark-sklearn
+Version: 0.3.0
+Release: 1
+Summary: Integration tools for running scikit-learn on Spark
+License: Apache-2.0
+URL: https://github.com/databricks/spark-sklearn
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/b0/3f/34b8dec7d2cfcfe0ba99d637b4f2d306c1ca0b404107c07c829e085f6b38/spark-sklearn-0.3.0.tar.gz
+BuildArch: noarch
+
+
+%description
+This package contains some tools to integrate the `Spark computing framework <https://spark.apache.org/>`_
+with the popular `scikit-learn machine learning library <https://scikit-learn.org/stable/>`_. Among other things, it can:
+- train and evaluate multiple scikit-learn models in parallel. It is a distributed analog to the
+  `multicore implementation <https://pythonhosted.org/joblib/parallel.html>`_ included by default in ``scikit-learn``
+- convert Spark DataFrames seamlessly into numpy ``ndarray`` or sparse matrices
+- (experimental) distribute SciPy's sparse matrices as a dataset of sparse vectors
+It focuses on problems that have a small amount of data and that can be run in parallel.
+For small datasets, it distributes the search for estimator parameters (``GridSearchCV`` in scikit-learn)
+using Spark. For datasets that do not fit in memory, we recommend using the distributed implementation in
+`Spark MLlib <https://spark.apache.org/docs/latest/api/python/pyspark.mllib.html>`_.
+This package distributes simple tasks like grid-search cross-validation.
+It does not distribute individual learning algorithms (unlike Spark MLlib).
+
+%package -n python3-spark-sklearn
+Summary: Integration tools for running scikit-learn on Spark
+Provides: python-spark-sklearn
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-spark-sklearn
+This package contains some tools to integrate the `Spark computing framework <https://spark.apache.org/>`_
+with the popular `scikit-learn machine learning library <https://scikit-learn.org/stable/>`_. Among other things, it can:
+- train and evaluate multiple scikit-learn models in parallel. It is a distributed analog to the
+  `multicore implementation <https://pythonhosted.org/joblib/parallel.html>`_ included by default in ``scikit-learn``
+- convert Spark DataFrames seamlessly into numpy ``ndarray`` or sparse matrices
+- (experimental) distribute SciPy's sparse matrices as a dataset of sparse vectors
+It focuses on problems that have a small amount of data and that can be run in parallel.
+For small datasets, it distributes the search for estimator parameters (``GridSearchCV`` in scikit-learn)
+using Spark. For datasets that do not fit in memory, we recommend using the distributed implementation in
+`Spark MLlib <https://spark.apache.org/docs/latest/api/python/pyspark.mllib.html>`_.
+This package distributes simple tasks like grid-search cross-validation.
+It does not distribute individual learning algorithms (unlike Spark MLlib).
+
+%package help
+Summary: Development documents and examples for spark-sklearn
+Provides: python3-spark-sklearn-doc
+%description help
+This package contains some tools to integrate the `Spark computing framework <https://spark.apache.org/>`_
+with the popular `scikit-learn machine learning library <https://scikit-learn.org/stable/>`_. Among other things, it can:
+- train and evaluate multiple scikit-learn models in parallel. It is a distributed analog to the
+  `multicore implementation <https://pythonhosted.org/joblib/parallel.html>`_ included by default in ``scikit-learn``
+- convert Spark DataFrames seamlessly into numpy ``ndarray`` or sparse matrices
+- (experimental) distribute SciPy's sparse matrices as a dataset of sparse vectors
+It focuses on problems that have a small amount of data and that can be run in parallel.
+For small datasets, it distributes the search for estimator parameters (``GridSearchCV`` in scikit-learn)
+using Spark. For datasets that do not fit in memory, we recommend using the distributed implementation in
+`Spark MLlib <https://spark.apache.org/docs/latest/api/python/pyspark.mllib.html>`_.
+This package distributes simple tasks like grid-search cross-validation.
+It does not distribute individual learning algorithms (unlike Spark MLlib).
+
+%prep
+%autosetup -n spark-sklearn-0.3.0
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+ find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+ find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+ find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+ find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+ find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-spark-sklearn -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Mon Apr 10 2023 Python_Bot <Python_Bot@openeuler.org> - 0.3.0-1
+- Package Spec generated
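
The %description in the spec above advertises distributing scikit-learn's ``GridSearchCV`` over Spark. As a point of reference, a minimal usage sketch of the packaged library follows; it assumes a running pyspark SparkContext ``sc`` and that ``spark_sklearn.GridSearchCV`` mirrors scikit-learn's estimator API while taking the SparkContext as its first constructor argument, and the parameter grid is purely illustrative.

    # Sketch only: distribute a scikit-learn grid search over Spark.
    # Assumes `sc` is an existing pyspark SparkContext and that
    # spark_sklearn.GridSearchCV is a drop-in for sklearn's GridSearchCV.
    from sklearn import svm, datasets
    from spark_sklearn import GridSearchCV

    iris = datasets.load_iris()
    param_grid = {"kernel": ("linear", "rbf"), "C": [1, 10]}
    clf = GridSearchCV(sc, svm.SVC(gamma="auto"), param_grid)
    clf.fit(iris.data, iris.target)   # candidate fits are scheduled as Spark tasks
    print(clf.best_params_)
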
diff --git a/sources b/sources
new file mode 100644
index 0000000..8b12113
--- /dev/null
+++ b/sources
@@ -0,0 +1 @@
+4460d6c8402a5b46d361c442c2e47f19 spark-sklearn-0.3.0.tar.gz
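
The ``sources`` file added above uses the classic dist-git format: an MD5 digest followed by the tarball name. A minimal verification sketch, assuming the tarball has been downloaded next to a local copy of ``sources`` (paths and the helper name are illustrative):

    # Sketch only: check downloaded tarballs against a dist-git `sources` file.
    import hashlib

    def verify_sources(sources_path="sources", tarball_dir="."):
        with open(sources_path) as fh:
            for line in fh:
                if not line.strip():
                    continue
                digest, name = line.split()      # "<md5> <filename>"
                with open(f"{tarball_dir}/{name}", "rb") as tarball:
                    actual = hashlib.md5(tarball.read()).hexdigest()
                status = "OK" if actual == digest else f"MISMATCH ({actual})"
                print(name, status)

    verify_sources()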