author    CoprDistGit <infra@openeuler.org>    2023-05-05 09:50:22 +0000
committer CoprDistGit <infra@openeuler.org>    2023-05-05 09:50:22 +0000
commit    8714f97e501ece05fb56986265738eb7373106dd (patch)
tree      c1189af2e57136652a3e5dfb7efdc4b25544bd43
parent    f6072fe2535f7d294f476d0231cbfbf78434a8f9 (diff)
automatic import of python-pydbr openeuler20.03
-rw-r--r--    .gitignore    1
-rw-r--r--    python-pydbr.spec    1481
-rw-r--r--    sources    1
3 files changed, 1483 insertions, 0 deletions
diff --git a/.gitignore b/.gitignore
index e69de29..1348298 100644
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1 @@
+/pydbr-0.0.7.tar.gz
diff --git a/python-pydbr.spec b/python-pydbr.spec
new file mode 100644
index 0000000..7461c35
--- /dev/null
+++ b/python-pydbr.spec
@@ -0,0 +1,1481 @@
+%global _empty_manifest_terminate_build 0
+Name: python-pydbr
+Version: 0.0.7
+Release: 1
+Summary: Databricks client SDK with command line client for Databricks REST APIs
+License: MIT License
+URL: https://github.com/ivangeorgiev/pydbr
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/09/c6/618f1b2cacaa50ebae807f4e862bd61e8f32b5ca0d44fb2422ce74156274/pydbr-0.0.7.tar.gz
+BuildArch: noarch
+
+Requires: python3-click
+Requires: python3-requests
+
+%description
+# pydbr
+Databricks client SDK for Python with command line interface for Databricks REST APIs.
+
+{:toc}
+
+## Introduction
+
+The pydbr (short for Python-Databricks) package provides a Python SDK for the Databricks REST APIs:
+
+* dbfs
+* workspace
+* jobs
+* runs
+
+The package also comes with a CLI, which can be very helpful in automation.
+
+## Installation
+
+```bash
+$ pip install pydbr
+```
+
+
+
+## Databricks CLI
+
+The Databricks command line client provides a convenient way to interact with a Databricks cluster from the command line. A very popular use of this approach is in automation tasks, such as DevOps pipelines or third-party workflow managers.
+
+You can call the Databricks CLI using the convenient shell command `pydbr`:
+
+```bash
+$ pydbr --help
+```
+
+or using the Python module:
+
+```bash
+$ python -m pydbr.cli --help
+```
+
+To connect to the Databricks cluster, you can supply arguments at the command line:
+
+* `--bearer-token`
+* `--url`
+* `--cluster-id`
+
+Alternatively, you can define environment variables. Command line arguments take precedence.
+
+```bash
+export DATABRICKS_URL='https://westeurope.azuredatabricks.net/'
+export DATABRICKS_BEARER_TOKEN='dapixyz89u9ufsdfd0'
+export DATABRICKS_CLUSTER_ID='1234-456778-abc234'
+export DATABRICKS_ORG_ID='87287878293983984'
+```
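Command line arguments take precedence over these variables; the precedence rule can be sketched in a few lines of Python (the `resolve_setting` helper is illustrative, not part of pydbr):

```python
import os

def resolve_setting(cli_value, env_name):
    """Return the CLI value when given, otherwise fall back to the environment."""
    if cli_value is not None:
        return cli_value
    return os.environ.get(env_name)

# The environment provides a default...
os.environ["DATABRICKS_URL"] = "https://westeurope.azuredatabricks.net/"
print(resolve_setting(None, "DATABRICKS_URL"))
# ...but an explicit --url argument wins.
print(resolve_setting("https://other.azuredatabricks.net/", "DATABRICKS_URL"))
```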
+
+
+
+### DBFS
+
+#### List DBFS items
+
+```bash
+# List items on DBFS
+pydbr dbfs ls --json-indent 3 FileStore/movielens
+```
+
+```json
+[
+ {
+ "path": "/FileStore/movielens/ml-latest-small",
+ "is_dir": true,
+ "file_size": 0,
+ "is_file": false,
+ "human_size": "0 B"
+ }
+]
+```
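The `human_size` field is a human-readable rendering of `file_size`. A plausible implementation of such formatting (the exact rounding pydbr uses is an assumption) is:

```python
def human_size(num_bytes):
    """Format a byte count with binary units; pydbr's exact format is assumed."""
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if num_bytes < 1024 or unit == "TB":
            if unit == "B":
                return f"{num_bytes} B"
            return f"{num_bytes:.1f} {unit}"
        num_bytes /= 1024

print(human_size(0))     # 0 B
print(human_size(2048))  # 2.0 KB
```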
+
+#### Download file from DBFS
+
+```bash
+# Download a file and print to STDOUT
+pydbr dbfs get ml-latest-small/movies.csv
+```
+
+#### Download directory from DBFS
+
+```bash
+# Download recursively entire directory and store locally
+pydbr dbfs get -o ml-local ml-latest-small
+```
+
+
+
+### Workspace
+
+Databricks workspace contains notebooks and other items.
+
+#### List workspace
+
+```bash
+####################
+# List workspace
+# Default path is root - '/'
+$ pydbr workspace ls
+# A leading '/' is added automatically
+$ pydbr workspace ls 'Users'
+# Space-indented JSON output with a number of spaces
+$ pydbr workspace --json-indent 4 ls
+# Custom indent string
+$ pydbr workspace ls --json-indent='>'
+```
+
+
+
+#### Export items from Databricks workspace
+
+```bash
+#####################
+# Export workspace items
+# Export everything in source format using defaults: format=SOURCE, path=/
+pydbr workspace export -o ./.dev/export
+# Export everything in DBC format
+pydbr workspace export -f DBC -o ./.dev/export
+# When the path is a folder, the export is recursive
+pydbr workspace export -o ./.dev/export-utils 'Utils'
+# Export a single item
+pydbr workspace export -o ./.dev/GetML 'Utils/Download MovieLens.py'
+```
+
+
+
+### Runs
+
+This command group implements the [`jobs/runs` Databricks REST API](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-submit).
+
+#### Submit a notebook
+
+Implements: [Databricks REST runs/submit](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-submit)
+
+```bash
+$ pydbr runs submit "Utils/Download MovieLens"
+```
+
+```
+{"run_id": 4}
+```
+
+You can retrieve the run information using `runs get`:
+
+```bash
+$ pydbr runs get 4 -i 3
+```
+
+
+
+If you need to pass parameters, use the `--parameters` or `-p` option and specify JSON text.
+
+```bash
+$ pydbr runs submit -p '{"run_tag":"20250103"}' "Utils/Download MovieLens"
+```
+
+You can also refer to parameters in a JSON file:
+
+```bash
+$ pydbr runs submit -p '@params.json' "Utils/Download MovieLens"
+```
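Since the `-p` value is plain JSON text, it can be generated with any JSON serializer; for example, producing both the inline form and a `params.json` file for the `@params.json` form:

```python
import json

params = {"run_tag": "20250103"}

# Inline form: pass the serialized text directly after -p
inline = json.dumps(params)
print(inline)  # {"run_tag": "20250103"}

# File form: write the same JSON to params.json and pass '@params.json'
with open("params.json", "w") as f:
    json.dump(params, f)
```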
+
+You can use the parameters in the notebook, and they will also be visible in the run metadata:
+
+```bash
+pydbr runs get-output -i 3 8
+```
+
+```json
+{
+ "notebook_output": {
+ "result": "Downloaded files (tag: 20250103): README.txt, links.csv, movies.csv, ratings.csv, tags.csv",
+ "truncated": false
+ },
+ "error": null,
+ "metadata": {
+ "job_id": 8,
+ "run_id": 8,
+ "creator_user_name": "your.name@gmail.com",
+ "number_in_job": 1,
+ "original_attempt_run_id": null,
+ "state": {
+ "life_cycle_state": "TERMINATED",
+ "result_state": "SUCCESS",
+ "state_message": ""
+ },
+ "schedule": null,
+ "task": {
+ "notebook_task": {
+ "notebook_path": "/Utils/Download MovieLens",
+ "base_parameters": {
+ "run_tag": "20250103"
+ }
+ }
+ },
+ "cluster_spec": {
+ "existing_cluster_id": "xxxx-yyyyyy-zzzzzz"
+ },
+ "cluster_instance": {
+ "cluster_id": "xxxx-yyyyyy-zzzzzzzz",
+ "spark_context_id": "8734983498349834"
+ },
+ "overriding_parameters": null,
+ "start_time": 1592067357734,
+ "setup_duration": 0,
+ "execution_duration": 11000,
+ "cleanup_duration": 0,
+ "trigger": null,
+ "run_name": "pydbr-1592067355",
+ "run_page_url": "https://westeurope.azuredatabricks.net/?o=89349849834#job/8/run/1",
+ "run_type": "SUBMIT_RUN"
+ }
+}
+```
+
+
+
+#### Get run metadata
+
+Implements: [Databricks REST runs/get](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-get)
+
+```bash
+$ pydbr runs get -i 3 6
+```
+
+```json
+{
+ "job_id": 6,
+ "run_id": 6,
+ "creator_user_name": "your.name@gmail.com",
+ "number_in_job": 1,
+ "original_attempt_run_id": null,
+ "state": {
+ "life_cycle_state": "TERMINATED",
+ "result_state": "SUCCESS",
+ "state_message": ""
+ },
+ "schedule": null,
+ "task": {
+ "notebook_task": {
+ "notebook_path": "/Utils/Download MovieLens"
+ }
+ },
+ "cluster_spec": {
+ "existing_cluster_id": "xxxx-yyyyy-zzzzzz"
+ },
+ "cluster_instance": {
+ "cluster_id": "xxxx-yyyyy-zzzzzz",
+ "spark_context_id": "783487348734873873"
+ },
+ "overriding_parameters": null,
+ "start_time": 1592062497162,
+ "setup_duration": 0,
+ "execution_duration": 11000,
+ "cleanup_duration": 0,
+ "trigger": null,
+ "run_name": "pydbr-1592062494",
+ "run_page_url": "https://westeurope.azuredatabricks.net/?o=398348734873487#job/6/run/1",
+ "run_type": "SUBMIT_RUN"
+}
+```
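In this metadata, `start_time` is a Unix timestamp in milliseconds, and the `*_duration` fields are milliseconds as well; converting them with the standard library:

```python
from datetime import datetime, timezone

start_time = 1592062497162        # "start_time": Unix epoch in milliseconds
execution_duration = 11000        # "*_duration" fields are milliseconds too

started = datetime.fromtimestamp(start_time / 1000, tz=timezone.utc)
print(started.isoformat())        # 2020-06-13T15:34:57.162000+00:00
print(execution_duration / 1000)  # 11.0 (seconds)
```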
+
+
+
+#### List Runs
+
+Implements: [Databricks REST runs/list](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-list)
+
+```bash
+$ pydbr runs ls
+```
+
+
+
+To get only the runs for a particular job:
+
+```bash
+# Get job with job-id=4
+$ pydbr runs ls 4 -i 3
+```
+
+```json
+{
+ "runs": [
+ {
+ "job_id": 4,
+ "run_id": 4,
+ "creator_user_name": "your.name@gmail.com",
+ "number_in_job": 1,
+ "original_attempt_run_id": null,
+ "state": {
+ "life_cycle_state": "PENDING",
+ "state_message": ""
+ },
+ "schedule": null,
+ "task": {
+ "notebook_task": {
+ "notebook_path": "/Utils/Download MovieLens"
+ }
+ },
+ "cluster_spec": {
+ "existing_cluster_id": "xxxxx-yyyy-zzzzzzz"
+ },
+ "cluster_instance": {
+ "cluster_id": "xxxxx-yyyy-zzzzzzz"
+ },
+ "overriding_parameters": null,
+ "start_time": 1592058826123,
+ "setup_duration": 0,
+ "execution_duration": 0,
+ "cleanup_duration": 0,
+ "trigger": null,
+ "run_name": "pydbr-1592058823",
+ "run_page_url": "https://westeurope.azuredatabricks.net/?o=abcdefghasdf#job/4/run/1",
+ "run_type": "SUBMIT_RUN"
+ }
+ ],
+ "has_more": false
+}
+```
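Because the output is plain JSON, it composes well with further scripting. A sketch that filters a runs listing by `life_cycle_state` (the inline sample mirrors the structure above, trimmed to the relevant fields):

```python
import json

listing = json.loads("""
{
  "runs": [
    {"run_id": 4, "state": {"life_cycle_state": "PENDING"}},
    {"run_id": 3, "state": {"life_cycle_state": "TERMINATED"}}
  ],
  "has_more": false
}
""")

# Keep only run ids whose life cycle state is PENDING
pending = [r["run_id"] for r in listing["runs"]
           if r["state"]["life_cycle_state"] == "PENDING"]
print(pending)  # [4]
```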
+
+
+
+#### Export run
+
+Implements: [Databricks REST runs/export](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-export)
+
+```bash
+$ pydbr runs export --content-only 4 > .dev/run-view.html
+```
+
+
+
+#### Get run output
+
+Implements: [Databricks REST runs/get-output](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-get-output)
+
+```bash
+$ pydbr runs get-output -i 3 6
+```
+
+```json
+{
+ "notebook_output": {
+ "result": "Downloaded files: README.txt, links.csv, movies.csv, ratings.csv, tags.csv",
+ "truncated": false
+ },
+ "error": null,
+ "metadata": {
+ "job_id": 5,
+ "run_id": 5,
+ "creator_user_name": "your.name@gmail.com",
+ "number_in_job": 1,
+ "original_attempt_run_id": null,
+ "state": {
+ "life_cycle_state": "TERMINATED",
+ "result_state": "SUCCESS",
+ "state_message": ""
+ },
+ "schedule": null,
+ "task": {
+ "notebook_task": {
+ "notebook_path": "/Utils/Download MovieLens"
+ }
+ },
+ "cluster_spec": {
+ "existing_cluster_id": "xxxx-yyyyy-zzzzzzz"
+ },
+ "cluster_instance": {
+ "cluster_id": "xxxx-yyyyy-zzzzzzz",
+ "spark_context_id": "8973498743973498"
+ },
+ "overriding_parameters": null,
+ "start_time": 1592062147101,
+ "setup_duration": 1000,
+ "execution_duration": 11000,
+ "cleanup_duration": 0,
+ "trigger": null,
+ "run_name": "pydbr-1592062135",
+ "run_page_url": "https://westeurope.azuredatabricks.net/?o=89798374987987#job/5/run/1",
+ "run_type": "SUBMIT_RUN"
+ }
+}
+```
+
+
+
+To get only the exit output:
+
+```bash
+$ pydbr runs get-output -r 6
+```
+
+```
+Downloaded files: README.txt, links.csv, movies.csv, ratings.csv, tags.csv
+```
+
+
+
+## Python Client SDK for Databricks REST APIs
+
+To implement your own Databricks REST API client, you can use the Python Client SDK for Databricks REST APIs.
+
+### Create Databricks connection
+
+```python
+# Get Databricks workspace connection
+dbc = pydbr.connect(
+ bearer_token='dapixyzabcd09rasdf',
+ url='https://westeurope.azuredatabricks.net')
+```
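Under the hood, such a connection talks to the Databricks REST API over HTTPS with a bearer token. A hand-rolled equivalent of a DBFS listing request, built (but not sent) with the standard library; the `/api/2.0/dbfs/list` endpoint comes from the public Databricks REST API, and the token is the placeholder from the example above:

```python
from urllib.parse import urlencode
from urllib.request import Request

url = "https://westeurope.azuredatabricks.net"
token = "dapixyzabcd09rasdf"  # placeholder token, as in the connect() example

# Databricks REST endpoints authenticate with a Bearer token header
query = urlencode({"path": "/FileStore"})
req = Request(
    f"{url}/api/2.0/dbfs/list?{query}",
    headers={"Authorization": f"Bearer {token}"},
)

print(req.full_url)
print(req.get_header("Authorization"))
```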
+
+### DBFS
+
+```python
+# Get list of items at path /FileStore
+dbc.dbfs.ls('/FileStore')
+
+# Check if file or directory exists
+dbc.dbfs.exists('/path/to/heaven')
+
+# Make a directory and its parents
+dbc.dbfs.mkdirs('/path/to/heaven')
+
+# Delete a directory recursively
+dbc.dbfs.rm('/path', recursive=True)
+
+# Download a file block starting at offset 1024 with size 2048
+dbc.dbfs.read('/data/movies.csv', 1024, 2048)
+
+# Download entire file
+dbc.dbfs.read_all('/data/movies.csv')
+```
+
+### Databricks workspace
+
+```python
+# List root workspace directory
+dbc.workspace.ls('/')
+
+# Check if workspace item exists
+dbc.workspace.exists('/explore')
+
+# Check if workspace item is a directory
+dbc.workspace.is_directory('/')
+
+# Export notebook in default (SOURCE) format
+dbc.workspace.export('/my_notebook')
+
+# Export notebook in HTML format
+dbc.workspace.export('/my_notebook', 'HTML')
+```
+
+
+
+## Build and publish
+
+```bash
+pip install wheel twine
+python setup.py sdist bdist_wheel
+python -m twine upload dist/*
+```
+
+
+
+%package -n python3-pydbr
+Summary: Databricks client SDK with command line client for Databricks REST APIs
+Provides: python-pydbr
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-pydbr
+# pydbr
+Databricks client SDK for Python with command line interface for Databricks REST APIs.
+
+{:toc}
+
+## Introduction
+
+The pydbr (short for Python-Databricks) package provides a Python SDK for the Databricks REST APIs:
+
+* dbfs
+* workspace
+* jobs
+* runs
+
+The package also comes with a CLI, which can be very helpful in automation.
+
+## Installation
+
+```bash
+$ pip install pydbr
+```
+
+
+
+## Databricks CLI
+
+The Databricks command line client provides a convenient way to interact with a Databricks cluster from the command line. A very popular use of this approach is in automation tasks, such as DevOps pipelines or third-party workflow managers.
+
+You can call the Databricks CLI using the convenient shell command `pydbr`:
+
+```bash
+$ pydbr --help
+```
+
+or using the Python module:
+
+```bash
+$ python -m pydbr.cli --help
+```
+
+To connect to the Databricks cluster, you can supply arguments at the command line:
+
+* `--bearer-token`
+* `--url`
+* `--cluster-id`
+
+Alternatively, you can define environment variables. Command line arguments take precedence.
+
+```bash
+export DATABRICKS_URL='https://westeurope.azuredatabricks.net/'
+export DATABRICKS_BEARER_TOKEN='dapixyz89u9ufsdfd0'
+export DATABRICKS_CLUSTER_ID='1234-456778-abc234'
+export DATABRICKS_ORG_ID='87287878293983984'
+```
+
+
+
+### DBFS
+
+#### List DBFS items
+
+```bash
+# List items on DBFS
+pydbr dbfs ls --json-indent 3 FileStore/movielens
+```
+
+```json
+[
+ {
+ "path": "/FileStore/movielens/ml-latest-small",
+ "is_dir": true,
+ "file_size": 0,
+ "is_file": false,
+ "human_size": "0 B"
+ }
+]
+```
+
+#### Download file from DBFS
+
+```bash
+# Download a file and print to STDOUT
+pydbr dbfs get ml-latest-small/movies.csv
+```
+
+#### Download directory from DBFS
+
+```bash
+# Download recursively entire directory and store locally
+pydbr dbfs get -o ml-local ml-latest-small
+```
+
+
+
+### Workspace
+
+Databricks workspace contains notebooks and other items.
+
+#### List workspace
+
+```bash
+####################
+# List workspace
+# Default path is root - '/'
+$ pydbr workspace ls
+# A leading '/' is added automatically
+$ pydbr workspace ls 'Users'
+# Space-indented JSON output with a number of spaces
+$ pydbr workspace --json-indent 4 ls
+# Custom indent string
+$ pydbr workspace ls --json-indent='>'
+```
+
+
+
+#### Export items from Databricks workspace
+
+```bash
+#####################
+# Export workspace items
+# Export everything in source format using defaults: format=SOURCE, path=/
+pydbr workspace export -o ./.dev/export
+# Export everything in DBC format
+pydbr workspace export -f DBC -o ./.dev/export
+# When the path is a folder, the export is recursive
+pydbr workspace export -o ./.dev/export-utils 'Utils'
+# Export a single item
+pydbr workspace export -o ./.dev/GetML 'Utils/Download MovieLens.py'
+```
+
+
+
+### Runs
+
+This command group implements the [`jobs/runs` Databricks REST API](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-submit).
+
+#### Submit a notebook
+
+Implements: [Databricks REST runs/submit](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-submit)
+
+```bash
+$ pydbr runs submit "Utils/Download MovieLens"
+```
+
+```
+{"run_id": 4}
+```
+
+You can retrieve the run information using `runs get`:
+
+```bash
+$ pydbr runs get 4 -i 3
+```
+
+
+
+If you need to pass parameters, use the `--parameters` or `-p` option and specify JSON text.
+
+```bash
+$ pydbr runs submit -p '{"run_tag":"20250103"}' "Utils/Download MovieLens"
+```
+
+You can also refer to parameters in a JSON file:
+
+```bash
+$ pydbr runs submit -p '@params.json' "Utils/Download MovieLens"
+```
+
+You can use the parameters in the notebook, and they will also be visible in the run metadata:
+
+```bash
+pydbr runs get-output -i 3 8
+```
+
+```json
+{
+ "notebook_output": {
+ "result": "Downloaded files (tag: 20250103): README.txt, links.csv, movies.csv, ratings.csv, tags.csv",
+ "truncated": false
+ },
+ "error": null,
+ "metadata": {
+ "job_id": 8,
+ "run_id": 8,
+ "creator_user_name": "your.name@gmail.com",
+ "number_in_job": 1,
+ "original_attempt_run_id": null,
+ "state": {
+ "life_cycle_state": "TERMINATED",
+ "result_state": "SUCCESS",
+ "state_message": ""
+ },
+ "schedule": null,
+ "task": {
+ "notebook_task": {
+ "notebook_path": "/Utils/Download MovieLens",
+ "base_parameters": {
+ "run_tag": "20250103"
+ }
+ }
+ },
+ "cluster_spec": {
+ "existing_cluster_id": "xxxx-yyyyyy-zzzzzz"
+ },
+ "cluster_instance": {
+ "cluster_id": "xxxx-yyyyyy-zzzzzzzz",
+ "spark_context_id": "8734983498349834"
+ },
+ "overriding_parameters": null,
+ "start_time": 1592067357734,
+ "setup_duration": 0,
+ "execution_duration": 11000,
+ "cleanup_duration": 0,
+ "trigger": null,
+ "run_name": "pydbr-1592067355",
+ "run_page_url": "https://westeurope.azuredatabricks.net/?o=89349849834#job/8/run/1",
+ "run_type": "SUBMIT_RUN"
+ }
+}
+```
+
+
+
+#### Get run metadata
+
+Implements: [Databricks REST runs/get](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-get)
+
+```bash
+$ pydbr runs get -i 3 6
+```
+
+```json
+{
+ "job_id": 6,
+ "run_id": 6,
+ "creator_user_name": "your.name@gmail.com",
+ "number_in_job": 1,
+ "original_attempt_run_id": null,
+ "state": {
+ "life_cycle_state": "TERMINATED",
+ "result_state": "SUCCESS",
+ "state_message": ""
+ },
+ "schedule": null,
+ "task": {
+ "notebook_task": {
+ "notebook_path": "/Utils/Download MovieLens"
+ }
+ },
+ "cluster_spec": {
+ "existing_cluster_id": "xxxx-yyyyy-zzzzzz"
+ },
+ "cluster_instance": {
+ "cluster_id": "xxxx-yyyyy-zzzzzz",
+ "spark_context_id": "783487348734873873"
+ },
+ "overriding_parameters": null,
+ "start_time": 1592062497162,
+ "setup_duration": 0,
+ "execution_duration": 11000,
+ "cleanup_duration": 0,
+ "trigger": null,
+ "run_name": "pydbr-1592062494",
+ "run_page_url": "https://westeurope.azuredatabricks.net/?o=398348734873487#job/6/run/1",
+ "run_type": "SUBMIT_RUN"
+}
+```
+
+
+
+#### List Runs
+
+Implements: [Databricks REST runs/list](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-list)
+
+```bash
+$ pydbr runs ls
+```
+
+
+
+To get only the runs for a particular job:
+
+```bash
+# Get job with job-id=4
+$ pydbr runs ls 4 -i 3
+```
+
+```json
+{
+ "runs": [
+ {
+ "job_id": 4,
+ "run_id": 4,
+ "creator_user_name": "your.name@gmail.com",
+ "number_in_job": 1,
+ "original_attempt_run_id": null,
+ "state": {
+ "life_cycle_state": "PENDING",
+ "state_message": ""
+ },
+ "schedule": null,
+ "task": {
+ "notebook_task": {
+ "notebook_path": "/Utils/Download MovieLens"
+ }
+ },
+ "cluster_spec": {
+ "existing_cluster_id": "xxxxx-yyyy-zzzzzzz"
+ },
+ "cluster_instance": {
+ "cluster_id": "xxxxx-yyyy-zzzzzzz"
+ },
+ "overriding_parameters": null,
+ "start_time": 1592058826123,
+ "setup_duration": 0,
+ "execution_duration": 0,
+ "cleanup_duration": 0,
+ "trigger": null,
+ "run_name": "pydbr-1592058823",
+ "run_page_url": "https://westeurope.azuredatabricks.net/?o=abcdefghasdf#job/4/run/1",
+ "run_type": "SUBMIT_RUN"
+ }
+ ],
+ "has_more": false
+}
+```
+
+
+
+#### Export run
+
+Implements: [Databricks REST runs/export](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-export)
+
+```bash
+$ pydbr runs export --content-only 4 > .dev/run-view.html
+```
+
+
+
+#### Get run output
+
+Implements: [Databricks REST runs/get-output](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-get-output)
+
+```bash
+$ pydbr runs get-output -i 3 6
+```
+
+```json
+{
+ "notebook_output": {
+ "result": "Downloaded files: README.txt, links.csv, movies.csv, ratings.csv, tags.csv",
+ "truncated": false
+ },
+ "error": null,
+ "metadata": {
+ "job_id": 5,
+ "run_id": 5,
+ "creator_user_name": "your.name@gmail.com",
+ "number_in_job": 1,
+ "original_attempt_run_id": null,
+ "state": {
+ "life_cycle_state": "TERMINATED",
+ "result_state": "SUCCESS",
+ "state_message": ""
+ },
+ "schedule": null,
+ "task": {
+ "notebook_task": {
+ "notebook_path": "/Utils/Download MovieLens"
+ }
+ },
+ "cluster_spec": {
+ "existing_cluster_id": "xxxx-yyyyy-zzzzzzz"
+ },
+ "cluster_instance": {
+ "cluster_id": "xxxx-yyyyy-zzzzzzz",
+ "spark_context_id": "8973498743973498"
+ },
+ "overriding_parameters": null,
+ "start_time": 1592062147101,
+ "setup_duration": 1000,
+ "execution_duration": 11000,
+ "cleanup_duration": 0,
+ "trigger": null,
+ "run_name": "pydbr-1592062135",
+ "run_page_url": "https://westeurope.azuredatabricks.net/?o=89798374987987#job/5/run/1",
+ "run_type": "SUBMIT_RUN"
+ }
+}
+```
+
+
+
+To get only the exit output:
+
+```bash
+$ pydbr runs get-output -r 6
+```
+
+```
+Downloaded files: README.txt, links.csv, movies.csv, ratings.csv, tags.csv
+```
+
+
+
+## Python Client SDK for Databricks REST APIs
+
+To implement your own Databricks REST API client, you can use the Python Client SDK for Databricks REST APIs.
+
+### Create Databricks connection
+
+```python
+# Get Databricks workspace connection
+dbc = pydbr.connect(
+ bearer_token='dapixyzabcd09rasdf',
+ url='https://westeurope.azuredatabricks.net')
+```
+
+### DBFS
+
+```python
+# Get list of items at path /FileStore
+dbc.dbfs.ls('/FileStore')
+
+# Check if file or directory exists
+dbc.dbfs.exists('/path/to/heaven')
+
+# Make a directory and its parents
+dbc.dbfs.mkdirs('/path/to/heaven')
+
+# Delete a directory recursively
+dbc.dbfs.rm('/path', recursive=True)
+
+# Download a file block starting at offset 1024 with size 2048
+dbc.dbfs.read('/data/movies.csv', 1024, 2048)
+
+# Download entire file
+dbc.dbfs.read_all('/data/movies.csv')
+```
+
+### Databricks workspace
+
+```python
+# List root workspace directory
+dbc.workspace.ls('/')
+
+# Check if workspace item exists
+dbc.workspace.exists('/explore')
+
+# Check if workspace item is a directory
+dbc.workspace.is_directory('/')
+
+# Export notebook in default (SOURCE) format
+dbc.workspace.export('/my_notebook')
+
+# Export notebook in HTML format
+dbc.workspace.export('/my_notebook', 'HTML')
+```
+
+
+
+## Build and publish
+
+```bash
+pip install wheel twine
+python setup.py sdist bdist_wheel
+python -m twine upload dist/*
+```
+
+
+
+%package help
+Summary: Development documents and examples for pydbr
+Provides: python3-pydbr-doc
+%description help
+# pydbr
+Databricks client SDK for Python with command line interface for Databricks REST APIs.
+
+{:toc}
+
+## Introduction
+
+The pydbr (short for Python-Databricks) package provides a Python SDK for the Databricks REST APIs:
+
+* dbfs
+* workspace
+* jobs
+* runs
+
+The package also comes with a CLI, which can be very helpful in automation.
+
+## Installation
+
+```bash
+$ pip install pydbr
+```
+
+
+
+## Databricks CLI
+
+The Databricks command line client provides a convenient way to interact with a Databricks cluster from the command line. A very popular use of this approach is in automation tasks, such as DevOps pipelines or third-party workflow managers.
+
+You can call the Databricks CLI using the convenient shell command `pydbr`:
+
+```bash
+$ pydbr --help
+```
+
+or using the Python module:
+
+```bash
+$ python -m pydbr.cli --help
+```
+
+To connect to the Databricks cluster, you can supply arguments at the command line:
+
+* `--bearer-token`
+* `--url`
+* `--cluster-id`
+
+Alternatively, you can define environment variables. Command line arguments take precedence.
+
+```bash
+export DATABRICKS_URL='https://westeurope.azuredatabricks.net/'
+export DATABRICKS_BEARER_TOKEN='dapixyz89u9ufsdfd0'
+export DATABRICKS_CLUSTER_ID='1234-456778-abc234'
+export DATABRICKS_ORG_ID='87287878293983984'
+```
+
+
+
+### DBFS
+
+#### List DBFS items
+
+```bash
+# List items on DBFS
+pydbr dbfs ls --json-indent 3 FileStore/movielens
+```
+
+```json
+[
+ {
+ "path": "/FileStore/movielens/ml-latest-small",
+ "is_dir": true,
+ "file_size": 0,
+ "is_file": false,
+ "human_size": "0 B"
+ }
+]
+```
+
+#### Download file from DBFS
+
+```bash
+# Download a file and print to STDOUT
+pydbr dbfs get ml-latest-small/movies.csv
+```
+
+#### Download directory from DBFS
+
+```bash
+# Download recursively entire directory and store locally
+pydbr dbfs get -o ml-local ml-latest-small
+```
+
+
+
+### Workspace
+
+Databricks workspace contains notebooks and other items.
+
+#### List workspace
+
+```bash
+####################
+# List workspace
+# Default path is root - '/'
+$ pydbr workspace ls
+# A leading '/' is added automatically
+$ pydbr workspace ls 'Users'
+# Space-indented JSON output with a number of spaces
+$ pydbr workspace --json-indent 4 ls
+# Custom indent string
+$ pydbr workspace ls --json-indent='>'
+```
+
+
+
+#### Export items from Databricks workspace
+
+```bash
+#####################
+# Export workspace items
+# Export everything in source format using defaults: format=SOURCE, path=/
+pydbr workspace export -o ./.dev/export
+# Export everything in DBC format
+pydbr workspace export -f DBC -o ./.dev/export
+# When the path is a folder, the export is recursive
+pydbr workspace export -o ./.dev/export-utils 'Utils'
+# Export a single item
+pydbr workspace export -o ./.dev/GetML 'Utils/Download MovieLens.py'
+```
+
+
+
+### Runs
+
+This command group implements the [`jobs/runs` Databricks REST API](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-submit).
+
+#### Submit a notebook
+
+Implements: [Databricks REST runs/submit](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-submit)
+
+```bash
+$ pydbr runs submit "Utils/Download MovieLens"
+```
+
+```
+{"run_id": 4}
+```
+
+You can retrieve the run information using `runs get`:
+
+```bash
+$ pydbr runs get 4 -i 3
+```
+
+
+
+If you need to pass parameters, use the `--parameters` or `-p` option and specify JSON text.
+
+```bash
+$ pydbr runs submit -p '{"run_tag":"20250103"}' "Utils/Download MovieLens"
+```
+
+You can also refer to parameters in a JSON file:
+
+```bash
+$ pydbr runs submit -p '@params.json' "Utils/Download MovieLens"
+```
+
+You can use the parameters in the notebook, and they will also be visible in the run metadata:
+
+```bash
+pydbr runs get-output -i 3 8
+```
+
+```json
+{
+ "notebook_output": {
+ "result": "Downloaded files (tag: 20250103): README.txt, links.csv, movies.csv, ratings.csv, tags.csv",
+ "truncated": false
+ },
+ "error": null,
+ "metadata": {
+ "job_id": 8,
+ "run_id": 8,
+ "creator_user_name": "your.name@gmail.com",
+ "number_in_job": 1,
+ "original_attempt_run_id": null,
+ "state": {
+ "life_cycle_state": "TERMINATED",
+ "result_state": "SUCCESS",
+ "state_message": ""
+ },
+ "schedule": null,
+ "task": {
+ "notebook_task": {
+ "notebook_path": "/Utils/Download MovieLens",
+ "base_parameters": {
+ "run_tag": "20250103"
+ }
+ }
+ },
+ "cluster_spec": {
+ "existing_cluster_id": "xxxx-yyyyyy-zzzzzz"
+ },
+ "cluster_instance": {
+ "cluster_id": "xxxx-yyyyyy-zzzzzzzz",
+ "spark_context_id": "8734983498349834"
+ },
+ "overriding_parameters": null,
+ "start_time": 1592067357734,
+ "setup_duration": 0,
+ "execution_duration": 11000,
+ "cleanup_duration": 0,
+ "trigger": null,
+ "run_name": "pydbr-1592067355",
+ "run_page_url": "https://westeurope.azuredatabricks.net/?o=89349849834#job/8/run/1",
+ "run_type": "SUBMIT_RUN"
+ }
+}
+```
+
+
+
+#### Get run metadata
+
+Implements: [Databricks REST runs/get](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-get)
+
+```bash
+$ pydbr runs get -i 3 6
+```
+
+```json
+{
+ "job_id": 6,
+ "run_id": 6,
+ "creator_user_name": "your.name@gmail.com",
+ "number_in_job": 1,
+ "original_attempt_run_id": null,
+ "state": {
+ "life_cycle_state": "TERMINATED",
+ "result_state": "SUCCESS",
+ "state_message": ""
+ },
+ "schedule": null,
+ "task": {
+ "notebook_task": {
+ "notebook_path": "/Utils/Download MovieLens"
+ }
+ },
+ "cluster_spec": {
+ "existing_cluster_id": "xxxx-yyyyy-zzzzzz"
+ },
+ "cluster_instance": {
+ "cluster_id": "xxxx-yyyyy-zzzzzz",
+ "spark_context_id": "783487348734873873"
+ },
+ "overriding_parameters": null,
+ "start_time": 1592062497162,
+ "setup_duration": 0,
+ "execution_duration": 11000,
+ "cleanup_duration": 0,
+ "trigger": null,
+ "run_name": "pydbr-1592062494",
+ "run_page_url": "https://westeurope.azuredatabricks.net/?o=398348734873487#job/6/run/1",
+ "run_type": "SUBMIT_RUN"
+}
+```
+
+
+
+#### List Runs
+
+Implements: [Databricks REST runs/list](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-list)
+
+```bash
+$ pydbr runs ls
+```
+
+
+
+To get only the runs for a particular job:
+
+```bash
+# Get job with job-id=4
+$ pydbr runs ls 4 -i 3
+```
+
+```json
+{
+ "runs": [
+ {
+ "job_id": 4,
+ "run_id": 4,
+ "creator_user_name": "your.name@gmail.com",
+ "number_in_job": 1,
+ "original_attempt_run_id": null,
+ "state": {
+ "life_cycle_state": "PENDING",
+ "state_message": ""
+ },
+ "schedule": null,
+ "task": {
+ "notebook_task": {
+ "notebook_path": "/Utils/Download MovieLens"
+ }
+ },
+ "cluster_spec": {
+ "existing_cluster_id": "xxxxx-yyyy-zzzzzzz"
+ },
+ "cluster_instance": {
+ "cluster_id": "xxxxx-yyyy-zzzzzzz"
+ },
+ "overriding_parameters": null,
+ "start_time": 1592058826123,
+ "setup_duration": 0,
+ "execution_duration": 0,
+ "cleanup_duration": 0,
+ "trigger": null,
+ "run_name": "pydbr-1592058823",
+ "run_page_url": "https://westeurope.azuredatabricks.net/?o=abcdefghasdf#job/4/run/1",
+ "run_type": "SUBMIT_RUN"
+ }
+ ],
+ "has_more": false
+}
+```
+
+
+
+#### Export run
+
+Implements: [Databricks REST runs/export](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-export)
+
+```bash
+$ pydbr runs export --content-only 4 > .dev/run-view.html
+```
+
+
+
+#### Get run output
+
+Implements: [Databricks REST runs/get-output](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-get-output)
+
+```bash
+$ pydbr runs get-output -i 3 6
+```
+
+```json
+{
+ "notebook_output": {
+ "result": "Downloaded files: README.txt, links.csv, movies.csv, ratings.csv, tags.csv",
+ "truncated": false
+ },
+ "error": null,
+ "metadata": {
+ "job_id": 5,
+ "run_id": 5,
+ "creator_user_name": "your.name@gmail.com",
+ "number_in_job": 1,
+ "original_attempt_run_id": null,
+ "state": {
+ "life_cycle_state": "TERMINATED",
+ "result_state": "SUCCESS",
+ "state_message": ""
+ },
+ "schedule": null,
+ "task": {
+ "notebook_task": {
+ "notebook_path": "/Utils/Download MovieLens"
+ }
+ },
+ "cluster_spec": {
+ "existing_cluster_id": "xxxx-yyyyy-zzzzzzz"
+ },
+ "cluster_instance": {
+ "cluster_id": "xxxx-yyyyy-zzzzzzz",
+ "spark_context_id": "8973498743973498"
+ },
+ "overriding_parameters": null,
+ "start_time": 1592062147101,
+ "setup_duration": 1000,
+ "execution_duration": 11000,
+ "cleanup_duration": 0,
+ "trigger": null,
+ "run_name": "pydbr-1592062135",
+ "run_page_url": "https://westeurope.azuredatabricks.net/?o=89798374987987#job/5/run/1",
+ "run_type": "SUBMIT_RUN"
+ }
+}
+```
+
+
+
+To get only the exit output:
+
+```bash
+$ pydbr runs get-output -r 6
+```
+
+```
+Downloaded files: README.txt, links.csv, movies.csv, ratings.csv, tags.csv
+```
+
+
+
+## Python Client SDK for Databricks REST APIs
+
+To implement your own Databricks REST API client, you can use the Python Client SDK for Databricks REST APIs.
+
+### Create Databricks connection
+
+```python
+# Get Databricks workspace connection
+dbc = pydbr.connect(
+ bearer_token='dapixyzabcd09rasdf',
+ url='https://westeurope.azuredatabricks.net')
+```
+
+### DBFS
+
+```python
+# Get list of items at path /FileStore
+dbc.dbfs.ls('/FileStore')
+
+# Check if file or directory exists
+dbc.dbfs.exists('/path/to/heaven')
+
+# Make a directory and its parents
+dbc.dbfs.mkdirs('/path/to/heaven')
+
+# Delete a directory recursively
+dbc.dbfs.rm('/path', recursive=True)
+
+# Download a file block starting at offset 1024 with size 2048
+dbc.dbfs.read('/data/movies.csv', 1024, 2048)
+
+# Download entire file
+dbc.dbfs.read_all('/data/movies.csv')
+```
+
+### Databricks workspace
+
+```python
+# List root workspace directory
+dbc.workspace.ls('/')
+
+# Check if workspace item exists
+dbc.workspace.exists('/explore')
+
+# Check if workspace item is a directory
+dbc.workspace.is_directory('/')
+
+# Export notebook in default (SOURCE) format
+dbc.workspace.export('/my_notebook')
+
+# Export notebook in HTML format
+dbc.workspace.export('/my_notebook', 'HTML')
+```
+
+
+
+## Build and publish
+
+```bash
+pip install wheel twine
+python setup.py sdist bdist_wheel
+python -m twine upload dist/*
+```
+
+
+
+%prep
+%autosetup -n pydbr-0.0.7
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+ find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+ find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+ find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+ find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+ find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-pydbr -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Fri May 05 2023 Python_Bot <Python_Bot@openeuler.org> - 0.0.7-1
+- Package Spec generated
diff --git a/sources b/sources
new file mode 100644
index 0000000..1a0fd7f
--- /dev/null
+++ b/sources
@@ -0,0 +1 @@
+ae9a413b96a519fd5647cf9a241e070a pydbr-0.0.7.tar.gz