%global _empty_manifest_terminate_build 0
Name: python-databricks-connect
Version: 11.3.7
Release: 1
Summary: Databricks Connect Client
License: Databricks Proprietary License
URL: https://pypi.org/project/databricks-connect/
Source0: https://mirrors.nju.edu.cn/pypi/web/packages/c1/14/e9fdd8338b501d266eecc42ce4949eb3d0e6dc492e86707e4b4553b53693/databricks-connect-11.3.7.tar.gz
BuildArch: noarch
%description
Databricks Connect allows you to write
jobs using Spark native APIs and have them execute remotely on a Databricks
cluster instead of in the local Spark session.
For example, when you run the DataFrame command
``spark.read.parquet(...).groupBy(...).agg(...).show()`` using Databricks
Connect, the parsing and planning of the job runs on your local machine. Then,
the logical representation of the job is sent to the Spark server running in
Databricks for execution in the cluster.
With Databricks Connect, you can:
- Run large-scale Spark jobs from any Python, Java, Scala, or R application.
Anywhere you can ``import pyspark``, ``import org.apache.spark``, or
``require(SparkR)``, you can now run Spark jobs directly from your
application, without needing to install any IDE plugins or use Spark
submission scripts.
- Step through and debug code in your IDE even when working with a remote
cluster.
- Iterate quickly when developing libraries. You do not need to restart the
cluster after changing Python or Java library dependencies in Databricks
Connect, because each client session is isolated from the others in the
cluster.
- Shut down idle clusters without losing work. Because the client session is
decoupled from the cluster, it is unaffected by cluster restarts or upgrades,
which would normally cause you to lose all the variables, RDDs, and DataFrame
objects defined in a notebook.
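A minimal usage sketch (assuming the client has already been configured, for
example via ``databricks-connect configure``; the path and column name are
placeholders)::

    from pyspark.sql import SparkSession

    # With Databricks Connect configured, the builder attaches to the
    # remote Databricks cluster instead of starting a local Spark session.
    spark = SparkSession.builder.getOrCreate()

    # Parsing and planning happen locally; execution runs on the cluster.
    df = spark.read.parquet("dbfs:/path/to/data")   # placeholder path
    df.groupBy("some_column").count().show()        # placeholder column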
%package -n python3-databricks-connect
Summary: Databricks Connect Client
Provides: python-databricks-connect
BuildRequires: python3-devel
BuildRequires: python3-setuptools
BuildRequires: python3-pip
%description -n python3-databricks-connect
Databricks Connect allows you to write
jobs using Spark native APIs and have them execute remotely on a Databricks
cluster instead of in the local Spark session.
For example, when you run the DataFrame command
``spark.read.parquet(...).groupBy(...).agg(...).show()`` using Databricks
Connect, the parsing and planning of the job runs on your local machine. Then,
the logical representation of the job is sent to the Spark server running in
Databricks for execution in the cluster.
With Databricks Connect, you can:
- Run large-scale Spark jobs from any Python, Java, Scala, or R application.
Anywhere you can ``import pyspark``, ``import org.apache.spark``, or
``require(SparkR)``, you can now run Spark jobs directly from your
application, without needing to install any IDE plugins or use Spark
submission scripts.
- Step through and debug code in your IDE even when working with a remote
cluster.
- Iterate quickly when developing libraries. You do not need to restart the
cluster after changing Python or Java library dependencies in Databricks
Connect, because each client session is isolated from the others in the
cluster.
- Shut down idle clusters without losing work. Because the client session is
decoupled from the cluster, it is unaffected by cluster restarts or upgrades,
which would normally cause you to lose all the variables, RDDs, and DataFrame
objects defined in a notebook.
%package help
Summary: Development documents and examples for databricks-connect
Provides: python3-databricks-connect-doc
%description help
Databricks Connect allows you to write
jobs using Spark native APIs and have them execute remotely on a Databricks
cluster instead of in the local Spark session.
For example, when you run the DataFrame command
``spark.read.parquet(...).groupBy(...).agg(...).show()`` using Databricks
Connect, the parsing and planning of the job runs on your local machine. Then,
the logical representation of the job is sent to the Spark server running in
Databricks for execution in the cluster.
With Databricks Connect, you can:
- Run large-scale Spark jobs from any Python, Java, Scala, or R application.
Anywhere you can ``import pyspark``, ``import org.apache.spark``, or
``require(SparkR)``, you can now run Spark jobs directly from your
application, without needing to install any IDE plugins or use Spark
submission scripts.
- Step through and debug code in your IDE even when working with a remote
cluster.
- Iterate quickly when developing libraries. You do not need to restart the
cluster after changing Python or Java library dependencies in Databricks
Connect, because each client session is isolated from the others in the
cluster.
- Shut down idle clusters without losing work. Because the client session is
decoupled from the cluster, it is unaffected by cluster restarts or upgrades,
which would normally cause you to lose all the variables, RDDs, and DataFrame
objects defined in a notebook.
%prep
%autosetup -n databricks-connect-11.3.7
%build
%py3_build
%install
%py3_install
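# Copy any upstream doc/example directories into the package documentation directory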
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
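# Collect installed files from the library and binary directories into filelist.lst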
pushd %{buildroot}
if [ -d usr/lib ]; then
find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
fi
touch doclist.lst
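# Man pages are compressed by rpm's brp-compress at packaging time, hence the .gz suffix recorded below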
if [ -d usr/share/man ]; then
find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .
%files -n python3-databricks-connect -f filelist.lst
%dir %{python3_sitelib}/*
%files help -f doclist.lst
%{_docdir}/*
%changelog
* Mon Apr 10 2023 Python_Bot <Python_Bot@openeuler.org> - 11.3.7-1
- Package Spec generated