%global _empty_manifest_terminate_build 0
Name:           python-databricks-connect
Version:        11.3.7
Release:        1
Summary:        Databricks Connect Client
License:        Databricks Proprietary License
URL:            https://pypi.org/project/databricks-connect/
Source0:        https://mirrors.nju.edu.cn/pypi/web/packages/c1/14/e9fdd8338b501d266eecc42ce4949eb3d0e6dc492e86707e4b4553b53693/databricks-connect-11.3.7.tar.gz
BuildArch:      noarch

%description
Databricks Connect allows you to write jobs using Spark native APIs and have
them execute remotely on a Databricks cluster instead of in the local Spark
session. For example, when you run the DataFrame command
``spark.read.parquet(...).groupBy(...).agg(...).show()`` using Databricks
Connect, the parsing and planning of the job runs on your local machine. Then,
the logical representation of the job is sent to the Spark server running in
Databricks for execution in the cluster.

With Databricks Connect, you can:

- Run large-scale Spark jobs from any Python, Java, Scala, or R application.
  Anywhere you can ``import pyspark``, ``import org.apache.spark``, or
  ``require(SparkR)``, you can now run Spark jobs directly from your
  application, without needing to install any IDE plugins or use Spark
  submission scripts.
- Step through and debug code in your IDE even when working with a remote
  cluster.
- Iterate quickly when developing libraries. You do not need to restart the
  cluster after changing Python or Java library dependencies in Databricks
  Connect, because each client session is isolated from the others in the
  cluster.
- Shut down idle clusters without losing work. Because the client session is
  decoupled from the cluster, it is unaffected by cluster restarts or
  upgrades, which would normally cause you to lose all the variables, RDDs,
  and DataFrame objects defined in a notebook.

%package -n python3-databricks-connect
Summary:        Databricks Connect Client
Provides:       python-databricks-connect
BuildRequires:  python3-devel
BuildRequires:  python3-setuptools
BuildRequires:  python3-pip
%description -n python3-databricks-connect
Databricks Connect allows you to write jobs using Spark native APIs and have
them execute remotely on a Databricks cluster instead of in the local Spark
session. For example, when you run the DataFrame command
``spark.read.parquet(...).groupBy(...).agg(...).show()`` using Databricks
Connect, the parsing and planning of the job runs on your local machine. Then,
the logical representation of the job is sent to the Spark server running in
Databricks for execution in the cluster.

With Databricks Connect, you can:

- Run large-scale Spark jobs from any Python, Java, Scala, or R application.
  Anywhere you can ``import pyspark``, ``import org.apache.spark``, or
  ``require(SparkR)``, you can now run Spark jobs directly from your
  application, without needing to install any IDE plugins or use Spark
  submission scripts.
- Step through and debug code in your IDE even when working with a remote
  cluster.
- Iterate quickly when developing libraries. You do not need to restart the
  cluster after changing Python or Java library dependencies in Databricks
  Connect, because each client session is isolated from the others in the
  cluster.
- Shut down idle clusters without losing work. Because the client session is
  decoupled from the cluster, it is unaffected by cluster restarts or
  upgrades, which would normally cause you to lose all the variables, RDDs,
  and DataFrame objects defined in a notebook.
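
For example, once the client has been configured (e.g. with
``databricks-connect configure``), a minimal usage sketch looks like the
following; the table name is a placeholder::

    from pyspark.sql import SparkSession

    # With databricks-connect installed, the builder returns a session backed
    # by the configured remote Databricks cluster rather than a local Spark.
    spark = SparkSession.builder.getOrCreate()

    # Parsing and planning happen locally; execution runs on the cluster.
    df = spark.read.table("samples.nyctaxi.trips")
    df.groupBy("pickup_zip").count().show()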
%package help
Summary:        Development documents and examples for databricks-connect
Provides:       python3-databricks-connect-doc
%description help
Databricks Connect allows you to write jobs using Spark native APIs and have
them execute remotely on a Databricks cluster instead of in the local Spark
session. For example, when you run the DataFrame command
``spark.read.parquet(...).groupBy(...).agg(...).show()`` using Databricks
Connect, the parsing and planning of the job runs on your local machine. Then,
the logical representation of the job is sent to the Spark server running in
Databricks for execution in the cluster.

With Databricks Connect, you can:

- Run large-scale Spark jobs from any Python, Java, Scala, or R application.
  Anywhere you can ``import pyspark``, ``import org.apache.spark``, or
  ``require(SparkR)``, you can now run Spark jobs directly from your
  application, without needing to install any IDE plugins or use Spark
  submission scripts.
- Step through and debug code in your IDE even when working with a remote
  cluster.
- Iterate quickly when developing libraries. You do not need to restart the
  cluster after changing Python or Java library dependencies in Databricks
  Connect, because each client session is isolated from the others in the
  cluster.
- Shut down idle clusters without losing work. Because the client session is
  decoupled from the cluster, it is unaffected by cluster restarts or
  upgrades, which would normally cause you to lose all the variables, RDDs,
  and DataFrame objects defined in a notebook.

%prep
%autosetup -n databricks-connect-11.3.7

%build
%py3_build

%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
    find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
    find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
    find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
    find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
    find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .

%files -n python3-databricks-connect -f filelist.lst
%dir %{python3_sitelib}/*

%files help -f doclist.lst
%{_docdir}/*

%changelog
* Mon Apr 10 2023 Python_Bot - 11.3.7-1
- Package Spec generated