author | CoprDistGit <infra@openeuler.org> | 2023-05-05 09:50:22 +0000
---|---|---
committer | CoprDistGit <infra@openeuler.org> | 2023-05-05 09:50:22 +0000
commit | 8714f97e501ece05fb56986265738eb7373106dd (patch) |
tree | c1189af2e57136652a3e5dfb7efdc4b25544bd43 /python-pydbr.spec |
parent | f6072fe2535f7d294f476d0231cbfbf78434a8f9 (diff) |
automatic import of python-pydbr (openeuler20.03)
Diffstat (limited to 'python-pydbr.spec')
-rw-r--r-- | python-pydbr.spec | 1481 |
1 file changed, 1481 insertions, 0 deletions
diff --git a/python-pydbr.spec b/python-pydbr.spec
new file mode 100644
index 0000000..7461c35
--- /dev/null
+++ b/python-pydbr.spec
@@ -0,0 +1,1481 @@
+%global _empty_manifest_terminate_build 0
+Name: python-pydbr
+Version: 0.0.7
+Release: 1
+Summary: Databricks client SDK with command line client for Databricks REST APIs
+License: MIT License
+URL: https://github.com/ivangeorgiev/pydbr
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/09/c6/618f1b2cacaa50ebae807f4e862bd61e8f32b5ca0d44fb2422ce74156274/pydbr-0.0.7.tar.gz
+BuildArch: noarch
+
+Requires: python3-click
+Requires: python3-requests
+
+%description
+# pydbr
+Databricks client SDK for Python with a command line interface for the Databricks REST APIs.
+
+{:toc}
+
+## Introduction
+
+The pydbr (short for Python-Databricks) package provides a Python SDK for the Databricks REST APIs:
+
+* dbfs
+* workspace
+* jobs
+* runs
+
+The package also comes with a CLI that is very helpful in automation.
+
+## Installation
+
+```bash
+$ pip install pydbr
+```
+
+## Databricks CLI
+
+The Databricks command line client provides a convenient way to interact with a Databricks cluster from the command line. This approach is very popular in automation tasks, such as DevOps pipelines or third-party workflow managers.
+
+You can call the Databricks CLI using the `pydbr` shell command:
+
+```bash
+$ pydbr --help
+```
+
+or using the Python module:
+
+```bash
+$ python -m pydbr.cli --help
+```
+
+To connect to a Databricks cluster, you can supply arguments at the command line:
+
+* `--bearer-token`
+* `--url`
+* `--cluster-id`
+
+Alternatively, you can define environment variables. Command line arguments take precedence.
+
+```bash
+export DATABRICKS_URL='https://westeurope.azuredatabricks.net/'
+export DATABRICKS_BEARER_TOKEN='dapixyz89u9ufsdfd0'
+export DATABRICKS_CLUSTER_ID='1234-456778-abc234'
+export DATABRICKS_ORG_ID='87287878293983984'
+```
+
+### DBFS
+
+#### List DBFS items
+
+```bash
+# List items on DBFS
+pydbr dbfs ls --json-indent 3 FileStore/movielens
+```
+
+```json
+[
+   {
+      "path": "/FileStore/movielens/ml-latest-small",
+      "is_dir": true,
+      "file_size": 0,
+      "is_file": false,
+      "human_size": "0 B"
+   }
+]
+```
+
+#### Download a file from DBFS
+
+```bash
+# Download a file and print it to STDOUT
+pydbr dbfs get ml-latest-small/movies.csv
+```
+
+#### Download a directory from DBFS
+
+```bash
+# Recursively download an entire directory and store it locally
+pydbr dbfs get -o ml-local ml-latest-small
+```
+
+### Workspace
+
+The Databricks workspace contains notebooks and other items.
+
+#### List the workspace
+
+```bash
+####################
+# List workspace items
+# The default path is the root: '/'
+$ pydbr workspace ls
+# A leading '/' is added automatically
+$ pydbr workspace ls 'Users'
+# Space-indented JSON output with the given number of spaces
+$ pydbr workspace --json-indent 4 ls
+# Custom indent string
+$ pydbr workspace ls --json-indent='>'
+```
+
+#### Export items from the Databricks workspace
+
+```bash
+#####################
+# Export workspace items
+# Export everything in source format using the defaults: format=SOURCE, path=/
+pydbr workspace export -o ./.dev/export
+# Export everything in DBC format
+pydbr workspace export -f DBC -o ./.dev/export
+# When the path is a folder, the export is recursive
+pydbr workspace export -o ./.dev/export-utils 'Utils'
+# Export a single item
+pydbr workspace export -o ./.dev/GetML 'Utils/Download MovieLens.py'
+```
+
+### Runs
+
+This command group implements the [`jobs/runs` Databricks REST API](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-submit).
+
+#### Submit a notebook
+
+Implements: [Databricks REST runs/submit](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-submit)
+
+```bash
+$ pydbr runs submit "Utils/Download MovieLens"
+```
+
+```
+{"run_id": 4}
+```
+
+You can retrieve the run information using `runs get`:
+
+```bash
+$ pydbr runs get 4 -i 3
+```
+
+If you need to pass parameters, use the `--parameters` or `-p` option and specify JSON text:
+
+```bash
+$ pydbr runs submit -p '{"run_tag":"20250103"}' "Utils/Download MovieLens"
+```
+
+You can also refer to parameters in a JSON file:
+
+```bash
+$ pydbr runs submit -p '@params.json' "Utils/Download MovieLens"
+```
+
+You can use the parameters in the notebook, and they are also visible in the run metadata:
+
+```bash
+pydbr runs get-output -i 3 8
+```
+
+```json
+{
+  "notebook_output": {
+    "result": "Downloaded files (tag: 20250103): README.txt, links.csv, movies.csv, ratings.csv, tags.csv",
+    "truncated": false
+  },
+  "error": null,
+  "metadata": {
+    "job_id": 8,
+    "run_id": 8,
+    "creator_user_name": "your.name@gmail.com",
+    "number_in_job": 1,
+    "original_attempt_run_id": null,
+    "state": {
+      "life_cycle_state": "TERMINATED",
+      "result_state": "SUCCESS",
+      "state_message": ""
+    },
+    "schedule": null,
+    "task": {
+      "notebook_task": {
+        "notebook_path": "/Utils/Download MovieLens",
+        "base_parameters": {
+          "run_tag": "20250103"
+        }
+      }
+    },
+    "cluster_spec": {
+      "existing_cluster_id": "xxxx-yyyyyy-zzzzzz"
+    },
+    "cluster_instance": {
+      "cluster_id": "xxxx-yyyyyy-zzzzzzzz",
+      "spark_context_id": "8734983498349834"
+    },
+    "overriding_parameters": null,
+    "start_time": 1592067357734,
+    "setup_duration": 0,
+    "execution_duration": 11000,
+    "cleanup_duration": 0,
+    "trigger": null,
+    "run_name": "pydbr-1592067355",
+    "run_page_url": "https://westeurope.azuredatabricks.net/?o=89349849834#job/8/run/1",
+    "run_type": "SUBMIT_RUN"
+  }
+}
+```
+
+#### Get run metadata
+
+Implements: [Databricks REST runs/get](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-get)
+
+```bash
+$ pydbr runs get -i 3 6
+```
+
+```json
+{
+  "job_id": 6,
+  "run_id": 6,
+  "creator_user_name": "your.name@gmail.com",
+  "number_in_job": 1,
+  "original_attempt_run_id": null,
+  "state": {
+    "life_cycle_state": "TERMINATED",
+    "result_state": "SUCCESS",
+    "state_message": ""
+  },
+  "schedule": null,
+  "task": {
+    "notebook_task": {
+      "notebook_path": "/Utils/Download MovieLens"
+    }
+  },
+  "cluster_spec": {
+    "existing_cluster_id": "xxxx-yyyyy-zzzzzz"
+  },
+  "cluster_instance": {
+    "cluster_id": "xxxx-yyyyy-zzzzzz",
+    "spark_context_id": "783487348734873873"
+  },
+  "overriding_parameters": null,
+  "start_time": 1592062497162,
+  "setup_duration": 0,
+  "execution_duration": 11000,
+  "cleanup_duration": 0,
+  "trigger": null,
+  "run_name": "pydbr-1592062494",
+  "run_page_url": "https://westeurope.azuredatabricks.net/?o=398348734873487#job/6/run/1",
+  "run_type": "SUBMIT_RUN"
+}
+```
+
+#### List runs
+
+Implements: [Databricks REST runs/list](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-list)
+
+```bash
+$ pydbr runs ls
+```
+
+To get only the runs for a particular job:
+
+```bash
+# Get the runs for the job with job-id=4
+$ pydbr runs ls 4 -i 3
+```
+
+```json
+{
+  "runs": [
+    {
+      "job_id": 4,
+      "run_id": 4,
+      "creator_user_name": "your.name@gmail.com",
+      "number_in_job": 1,
+      "original_attempt_run_id": null,
+      "state": {
+        "life_cycle_state": "PENDING",
+        "state_message": ""
+      },
+      "schedule": null,
+      "task": {
+        "notebook_task": {
+          "notebook_path": "/Utils/Download MovieLens"
+        }
+      },
+      "cluster_spec": {
+        "existing_cluster_id": "xxxxx-yyyy-zzzzzzz"
+      },
+      "cluster_instance": {
+        "cluster_id": "xxxxx-yyyy-zzzzzzz"
+      },
+      "overriding_parameters": null,
+      "start_time": 1592058826123,
+      "setup_duration": 0,
+      "execution_duration": 0,
+      "cleanup_duration": 0,
+      "trigger": null,
+      "run_name": "pydbr-1592058823",
+      "run_page_url": "https://westeurope.azuredatabricks.net/?o=abcdefghasdf#job/4/run/1",
+      "run_type": "SUBMIT_RUN"
+    }
+  ],
+  "has_more": false
+}
+```
+
+#### Export a run
+
+Implements: [Databricks REST runs/export](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-export)
+
+```bash
+$ pydbr runs export --content-only 4 > .dev/run-view.html
+```
+
+#### Get run output
+
+Implements: [Databricks REST runs/get-output](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-get-output)
+
+```bash
+$ pydbr runs get-output -i 3 6
+```
+
+```json
+{
+  "notebook_output": {
+    "result": "Downloaded files: README.txt, links.csv, movies.csv, ratings.csv, tags.csv",
+    "truncated": false
+  },
+  "error": null,
+  "metadata": {
+    "job_id": 5,
+    "run_id": 5,
+    "creator_user_name": "your.name@gmail.com",
+    "number_in_job": 1,
+    "original_attempt_run_id": null,
+    "state": {
+      "life_cycle_state": "TERMINATED",
+      "result_state": "SUCCESS",
+      "state_message": ""
+    },
+    "schedule": null,
+    "task": {
+      "notebook_task": {
+        "notebook_path": "/Utils/Download MovieLens"
+      }
+    },
+    "cluster_spec": {
+      "existing_cluster_id": "xxxx-yyyyy-zzzzzzz"
+    },
+    "cluster_instance": {
+      "cluster_id": "xxxx-yyyyy-zzzzzzz",
+      "spark_context_id": "8973498743973498"
+    },
+    "overriding_parameters": null,
+    "start_time": 1592062147101,
+    "setup_duration": 1000,
+    "execution_duration": 11000,
+    "cleanup_duration": 0,
+    "trigger": null,
+    "run_name": "pydbr-1592062135",
+    "run_page_url": "https://westeurope.azuredatabricks.net/?o=89798374987987#job/5/run/1",
+    "run_type": "SUBMIT_RUN"
+  }
+}
+```
+
+To get only the exit output:
+
+```bash
+$ pydbr runs get-output -r 6
+```
+
+```
+Downloaded files: README.txt, links.csv, movies.csv, ratings.csv, tags.csv
+```
+
+## Python Client SDK for Databricks REST APIs
+
+To implement your own Databricks REST API client, you can use the Python client SDK; under the hood it wraps plain HTTPS calls, as the sketch below illustrates.
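+
+A minimal sketch, assuming only the `requests` package: the same call the CLI's `runs get` makes, against the documented `jobs/runs/get` endpoint. The URL, bearer token, and run ID are placeholders.
+
+```python
+# Minimal sketch of the REST call behind `runs get`, using requests only.
+# The workspace URL, bearer token, and run ID below are placeholders.
+import requests
+
+DATABRICKS_URL = "https://westeurope.azuredatabricks.net"
+BEARER_TOKEN = "dapixyz89u9ufsdfd0"  # placeholder token
+
+def get_run(run_id: int) -> dict:
+    """Fetch run metadata from the documented jobs/runs/get endpoint."""
+    response = requests.get(
+        f"{DATABRICKS_URL}/api/2.0/jobs/runs/get",
+        headers={"Authorization": f"Bearer {BEARER_TOKEN}"},
+        params={"run_id": run_id},
+    )
+    response.raise_for_status()
+    return response.json()
+
+print(get_run(6)["state"]["life_cycle_state"])  # e.g. 'TERMINATED'
+```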
+
+### Create a Databricks connection
+
+```python
+import pydbr
+
+# Get a Databricks workspace connection
+dbc = pydbr.connect(
+    bearer_token='dapixyzabcd09rasdf',
+    url='https://westeurope.azuredatabricks.net')
+```
+
+### DBFS
+
+```python
+# Get the list of items at path /FileStore
+dbc.dbfs.ls('/FileStore')
+
+# Check if a file or directory exists
+dbc.dbfs.exists('/path/to/heaven')
+
+# Make a directory and its parents
+dbc.dbfs.mkdirs('/path/to/heaven')
+
+# Delete a directory recursively
+dbc.dbfs.rm('/path', recursive=True)
+
+# Download a 2048-byte file block starting at offset 1024
+dbc.dbfs.read('/data/movies.csv', 1024, 2048)
+
+# Download the entire file
+dbc.dbfs.read_all('/data/movies.csv')
+```
+
+### Databricks workspace
+
+```python
+# List the root workspace directory
+dbc.workspace.ls('/')
+
+# Check if a workspace item exists
+dbc.workspace.exists('/explore')
+
+# Check if a workspace item is a directory
+dbc.workspace.is_directory('/')
+
+# Export a notebook in the default (SOURCE) format
+dbc.workspace.export('/my_notebook')
+
+# Export a notebook in HTML format
+dbc.workspace.export('/my_notebook', 'HTML')
+```
+
+## Build and publish
+
+```bash
+pip install wheel twine
+python setup.py sdist bdist_wheel
+python -m twine upload dist/*
+```
+
+%package -n python3-pydbr
+Summary: Databricks client SDK with command line client for Databricks REST APIs
+Provides: python-pydbr
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-pydbr
+# pydbr
+Databricks client SDK for Python with a command line interface for the Databricks REST APIs.
+
+{:toc}
+
+## Introduction
+
+The pydbr (short for Python-Databricks) package provides a Python SDK for the Databricks REST APIs:
+
+* dbfs
+* workspace
+* jobs
+* runs
+
+The package also comes with a CLI that is very helpful in automation.
+
+## Installation
+
+```bash
+$ pip install pydbr
+```
+
+## Databricks CLI
+
+The Databricks command line client provides a convenient way to interact with a Databricks cluster from the command line. This approach is very popular in automation tasks, such as DevOps pipelines or third-party workflow managers.
+
+You can call the Databricks CLI using the `pydbr` shell command:
+
+```bash
+$ pydbr --help
+```
+
+or using the Python module:
+
+```bash
+$ python -m pydbr.cli --help
+```
+
+To connect to a Databricks cluster, you can supply arguments at the command line:
+
+* `--bearer-token`
+* `--url`
+* `--cluster-id`
+
+Alternatively, you can define environment variables. Command line arguments take precedence.
+
+```bash
+export DATABRICKS_URL='https://westeurope.azuredatabricks.net/'
+export DATABRICKS_BEARER_TOKEN='dapixyz89u9ufsdfd0'
+export DATABRICKS_CLUSTER_ID='1234-456778-abc234'
+export DATABRICKS_ORG_ID='87287878293983984'
+```
+
+### DBFS
+
+#### List DBFS items
+
+```bash
+# List items on DBFS
+pydbr dbfs ls --json-indent 3 FileStore/movielens
+```
+
+```json
+[
+   {
+      "path": "/FileStore/movielens/ml-latest-small",
+      "is_dir": true,
+      "file_size": 0,
+      "is_file": false,
+      "human_size": "0 B"
+   }
+]
+```
+
+#### Download a file from DBFS
+
+```bash
+# Download a file and print it to STDOUT
+pydbr dbfs get ml-latest-small/movies.csv
+```
+
+#### Download a directory from DBFS
+
+```bash
+# Recursively download an entire directory and store it locally
+pydbr dbfs get -o ml-local ml-latest-small
+```
+
+### Workspace
+
+The Databricks workspace contains notebooks and other items.
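+
+For automation scripts, you can also drive the CLI from Python and parse its JSON output. A hedged sketch, assuming `pydbr` is on the PATH, the `DATABRICKS_*` environment variables above are exported, and `workspace ls` (documented below) prints JSON:
+
+```python
+# Hedged sketch: call the pydbr CLI from Python and parse its JSON output.
+# Assumes the DATABRICKS_* connection variables are already exported.
+import json
+import subprocess
+
+def workspace_ls(path: str = "/") -> list:
+    """List workspace items by shelling out to the pydbr CLI."""
+    result = subprocess.run(
+        ["pydbr", "workspace", "ls", path],
+        capture_output=True, text=True, check=True,
+    )
+    return json.loads(result.stdout)
+
+for item in workspace_ls("/Users"):
+    print(item)
+```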
+
+#### List the workspace
+
+```bash
+####################
+# List workspace items
+# The default path is the root: '/'
+$ pydbr workspace ls
+# A leading '/' is added automatically
+$ pydbr workspace ls 'Users'
+# Space-indented JSON output with the given number of spaces
+$ pydbr workspace --json-indent 4 ls
+# Custom indent string
+$ pydbr workspace ls --json-indent='>'
+```
+
+#### Export items from the Databricks workspace
+
+```bash
+#####################
+# Export workspace items
+# Export everything in source format using the defaults: format=SOURCE, path=/
+pydbr workspace export -o ./.dev/export
+# Export everything in DBC format
+pydbr workspace export -f DBC -o ./.dev/export
+# When the path is a folder, the export is recursive
+pydbr workspace export -o ./.dev/export-utils 'Utils'
+# Export a single item
+pydbr workspace export -o ./.dev/GetML 'Utils/Download MovieLens.py'
+```
+
+### Runs
+
+This command group implements the [`jobs/runs` Databricks REST API](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-submit).
+
+#### Submit a notebook
+
+Implements: [Databricks REST runs/submit](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-submit)
+
+```bash
+$ pydbr runs submit "Utils/Download MovieLens"
+```
+
+```
+{"run_id": 4}
+```
+
+You can retrieve the run information using `runs get`:
+
+```bash
+$ pydbr runs get 4 -i 3
+```
+
+If you need to pass parameters, use the `--parameters` or `-p` option and specify JSON text:
+
+```bash
+$ pydbr runs submit -p '{"run_tag":"20250103"}' "Utils/Download MovieLens"
+```
+
+You can also refer to parameters in a JSON file:
+
+```bash
+$ pydbr runs submit -p '@params.json' "Utils/Download MovieLens"
+```
+
+You can use the parameters in the notebook, and they are also visible in the run metadata:
+
+```bash
+pydbr runs get-output -i 3 8
+```
+
+```json
+{
+  "notebook_output": {
+    "result": "Downloaded files (tag: 20250103): README.txt, links.csv, movies.csv, ratings.csv, tags.csv",
+    "truncated": false
+  },
+  "error": null,
+  "metadata": {
+    "job_id": 8,
+    "run_id": 8,
+    "creator_user_name": "your.name@gmail.com",
+    "number_in_job": 1,
+    "original_attempt_run_id": null,
+    "state": {
+      "life_cycle_state": "TERMINATED",
+      "result_state": "SUCCESS",
+      "state_message": ""
+    },
+    "schedule": null,
+    "task": {
+      "notebook_task": {
+        "notebook_path": "/Utils/Download MovieLens",
+        "base_parameters": {
+          "run_tag": "20250103"
+        }
+      }
+    },
+    "cluster_spec": {
+      "existing_cluster_id": "xxxx-yyyyyy-zzzzzz"
+    },
+    "cluster_instance": {
+      "cluster_id": "xxxx-yyyyyy-zzzzzzzz",
+      "spark_context_id": "8734983498349834"
+    },
+    "overriding_parameters": null,
+    "start_time": 1592067357734,
+    "setup_duration": 0,
+    "execution_duration": 11000,
+    "cleanup_duration": 0,
+    "trigger": null,
+    "run_name": "pydbr-1592067355",
+    "run_page_url": "https://westeurope.azuredatabricks.net/?o=89349849834#job/8/run/1",
+    "run_type": "SUBMIT_RUN"
+  }
+}
+```
+
+#### Get run metadata
+
+Implements: [Databricks REST runs/get](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-get)
+
+```bash
+$ pydbr runs get -i 3 6
+```
+
+```json
+{
+  "job_id": 6,
+  "run_id": 6,
+  "creator_user_name": "your.name@gmail.com",
+  "number_in_job": 1,
+  "original_attempt_run_id": null,
+  "state": {
+    "life_cycle_state": "TERMINATED",
+    "result_state": "SUCCESS",
+    "state_message": ""
+  },
+  "schedule": null,
+  "task": {
+    "notebook_task": {
+      "notebook_path": "/Utils/Download MovieLens"
+    }
+  },
+  "cluster_spec": {
+    "existing_cluster_id": "xxxx-yyyyy-zzzzzz"
+  },
+  "cluster_instance": {
+    "cluster_id": "xxxx-yyyyy-zzzzzz",
+    "spark_context_id": "783487348734873873"
+  },
+  "overriding_parameters": null,
+  "start_time": 1592062497162,
+  "setup_duration": 0,
+  "execution_duration": 11000,
+  "cleanup_duration": 0,
+  "trigger": null,
+  "run_name": "pydbr-1592062494",
+  "run_page_url": "https://westeurope.azuredatabricks.net/?o=398348734873487#job/6/run/1",
+  "run_type": "SUBMIT_RUN"
+}
+```
+
+#### List runs
+
+Implements: [Databricks REST runs/list](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-list)
+
+```bash
+$ pydbr runs ls
+```
+
+To get only the runs for a particular job:
+
+```bash
+# Get the runs for the job with job-id=4
+$ pydbr runs ls 4 -i 3
+```
+
+```json
+{
+  "runs": [
+    {
+      "job_id": 4,
+      "run_id": 4,
+      "creator_user_name": "your.name@gmail.com",
+      "number_in_job": 1,
+      "original_attempt_run_id": null,
+      "state": {
+        "life_cycle_state": "PENDING",
+        "state_message": ""
+      },
+      "schedule": null,
+      "task": {
+        "notebook_task": {
+          "notebook_path": "/Utils/Download MovieLens"
+        }
+      },
+      "cluster_spec": {
+        "existing_cluster_id": "xxxxx-yyyy-zzzzzzz"
+      },
+      "cluster_instance": {
+        "cluster_id": "xxxxx-yyyy-zzzzzzz"
+      },
+      "overriding_parameters": null,
+      "start_time": 1592058826123,
+      "setup_duration": 0,
+      "execution_duration": 0,
+      "cleanup_duration": 0,
+      "trigger": null,
+      "run_name": "pydbr-1592058823",
+      "run_page_url": "https://westeurope.azuredatabricks.net/?o=abcdefghasdf#job/4/run/1",
+      "run_type": "SUBMIT_RUN"
+    }
+  ],
+  "has_more": false
+}
+```
+
+#### Export a run
+
+Implements: [Databricks REST runs/export](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-export)
+
+```bash
+$ pydbr runs export --content-only 4 > .dev/run-view.html
+```
+
+#### Get run output
+
+Implements: [Databricks REST runs/get-output](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-get-output)
+
+```bash
+$ pydbr runs get-output -i 3 6
+```
+
+```json
+{
+  "notebook_output": {
+    "result": "Downloaded files: README.txt, links.csv, movies.csv, ratings.csv, tags.csv",
+    "truncated": false
+  },
+  "error": null,
+  "metadata": {
+    "job_id": 5,
+    "run_id": 5,
+    "creator_user_name": "your.name@gmail.com",
+    "number_in_job": 1,
+    "original_attempt_run_id": null,
+    "state": {
+      "life_cycle_state": "TERMINATED",
+      "result_state": "SUCCESS",
+      "state_message": ""
+    },
+    "schedule": null,
+    "task": {
+      "notebook_task": {
+        "notebook_path": "/Utils/Download MovieLens"
+      }
+    },
+    "cluster_spec": {
+      "existing_cluster_id": "xxxx-yyyyy-zzzzzzz"
+    },
+    "cluster_instance": {
+      "cluster_id": "xxxx-yyyyy-zzzzzzz",
+      "spark_context_id": "8973498743973498"
+    },
+    "overriding_parameters": null,
+    "start_time": 1592062147101,
+    "setup_duration": 1000,
+    "execution_duration": 11000,
+    "cleanup_duration": 0,
+    "trigger": null,
+    "run_name": "pydbr-1592062135",
+    "run_page_url": "https://westeurope.azuredatabricks.net/?o=89798374987987#job/5/run/1",
+    "run_type": "SUBMIT_RUN"
+  }
+}
+```
+
+To get only the exit output:
+
+```bash
+$ pydbr runs get-output -r 6
+```
+
+```
+Downloaded files: README.txt, links.csv, movies.csv, ratings.csv, tags.csv
+```
+
+## Python Client SDK for Databricks REST APIs
+
+To implement your own Databricks REST API client, you can use the Python client SDK. A common need in such clients is waiting for a run to finish; a hedged sketch follows.
+
+### Create a Databricks connection
+
+```python
+import pydbr
+
+# Get a Databricks workspace connection
+dbc = pydbr.connect(
+    bearer_token='dapixyzabcd09rasdf',
+    url='https://westeurope.azuredatabricks.net')
+```
+
+### DBFS
+
+```python
+# Get the list of items at path /FileStore
+dbc.dbfs.ls('/FileStore')
+
+# Check if a file or directory exists
+dbc.dbfs.exists('/path/to/heaven')
+
+# Make a directory and its parents
+dbc.dbfs.mkdirs('/path/to/heaven')
+
+# Delete a directory recursively
+dbc.dbfs.rm('/path', recursive=True)
+
+# Download a 2048-byte file block starting at offset 1024
+dbc.dbfs.read('/data/movies.csv', 1024, 2048)
+
+# Download the entire file
+dbc.dbfs.read_all('/data/movies.csv')
+```
+
+### Databricks workspace
+
+```python
+# List the root workspace directory
+dbc.workspace.ls('/')
+
+# Check if a workspace item exists
+dbc.workspace.exists('/explore')
+
+# Check if a workspace item is a directory
+dbc.workspace.is_directory('/')
+
+# Export a notebook in the default (SOURCE) format
+dbc.workspace.export('/my_notebook')
+
+# Export a notebook in HTML format
+dbc.workspace.export('/my_notebook', 'HTML')
+```
+
+## Build and publish
+
+```bash
+pip install wheel twine
+python setup.py sdist bdist_wheel
+python -m twine upload dist/*
+```
+
+%package help
+Summary: Development documents and examples for pydbr
+Provides: python3-pydbr-doc
+%description help
+# pydbr
+Databricks client SDK for Python with a command line interface for the Databricks REST APIs.
+
+{:toc}
+
+## Introduction
+
+The pydbr (short for Python-Databricks) package provides a Python SDK for the Databricks REST APIs:
+
+* dbfs
+* workspace
+* jobs
+* runs
+
+The package also comes with a CLI that is very helpful in automation.
+
+## Installation
+
+```bash
+$ pip install pydbr
+```
+
+## Databricks CLI
+
+The Databricks command line client provides a convenient way to interact with a Databricks cluster from the command line. This approach is very popular in automation tasks, such as DevOps pipelines or third-party workflow managers.
+
+You can call the Databricks CLI using the `pydbr` shell command:
+
+```bash
+$ pydbr --help
+```
+
+or using the Python module:
+
+```bash
+$ python -m pydbr.cli --help
+```
+
+To connect to a Databricks cluster, you can supply arguments at the command line:
+
+* `--bearer-token`
+* `--url`
+* `--cluster-id`
+
+Alternatively, you can define environment variables. Command line arguments take precedence.
+
+```bash
+export DATABRICKS_URL='https://westeurope.azuredatabricks.net/'
+export DATABRICKS_BEARER_TOKEN='dapixyz89u9ufsdfd0'
+export DATABRICKS_CLUSTER_ID='1234-456778-abc234'
+export DATABRICKS_ORG_ID='87287878293983984'
+```
+
+### DBFS
+
+#### List DBFS items
+
+```bash
+# List items on DBFS
+pydbr dbfs ls --json-indent 3 FileStore/movielens
+```
+
+```json
+[
+   {
+      "path": "/FileStore/movielens/ml-latest-small",
+      "is_dir": true,
+      "file_size": 0,
+      "is_file": false,
+      "human_size": "0 B"
+   }
+]
+```
+
+#### Download a file from DBFS
+
+```bash
+# Download a file and print it to STDOUT
+pydbr dbfs get ml-latest-small/movies.csv
+```
+
+#### Download a directory from DBFS
+
+```bash
+# Recursively download an entire directory and store it locally
+pydbr dbfs get -o ml-local ml-latest-small
+```
+
+### Workspace
+
+The Databricks workspace contains notebooks and other items.
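+
+A small end-to-end sketch with the SDK calls documented further below: export one notebook and save it to a local file. The token is a placeholder, and treating the export result as already-decoded text is an assumption.
+
+```python
+# Hedged sketch: save an exported notebook locally using the SDK calls
+# documented in this README. The bearer token is a placeholder, and the
+# export result is treated as decoded text, which is an assumption.
+import pydbr
+
+dbc = pydbr.connect(
+    bearer_token='dapixyzabcd09rasdf',
+    url='https://westeurope.azuredatabricks.net')
+
+source = dbc.workspace.export('/Utils/Download MovieLens')  # SOURCE format
+with open('Download_MovieLens.py', 'w') as notebook_file:
+    notebook_file.write(source)
+```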
+
+#### List the workspace
+
+```bash
+####################
+# List workspace items
+# The default path is the root: '/'
+$ pydbr workspace ls
+# A leading '/' is added automatically
+$ pydbr workspace ls 'Users'
+# Space-indented JSON output with the given number of spaces
+$ pydbr workspace --json-indent 4 ls
+# Custom indent string
+$ pydbr workspace ls --json-indent='>'
+```
+
+#### Export items from the Databricks workspace
+
+```bash
+#####################
+# Export workspace items
+# Export everything in source format using the defaults: format=SOURCE, path=/
+pydbr workspace export -o ./.dev/export
+# Export everything in DBC format
+pydbr workspace export -f DBC -o ./.dev/export
+# When the path is a folder, the export is recursive
+pydbr workspace export -o ./.dev/export-utils 'Utils'
+# Export a single item
+pydbr workspace export -o ./.dev/GetML 'Utils/Download MovieLens.py'
+```
+
+### Runs
+
+This command group implements the [`jobs/runs` Databricks REST API](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-submit).
+
+#### Submit a notebook
+
+Implements: [Databricks REST runs/submit](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-submit)
+
+```bash
+$ pydbr runs submit "Utils/Download MovieLens"
+```
+
+```
+{"run_id": 4}
+```
+
+You can retrieve the run information using `runs get`:
+
+```bash
+$ pydbr runs get 4 -i 3
+```
+
+If you need to pass parameters, use the `--parameters` or `-p` option and specify JSON text:
+
+```bash
+$ pydbr runs submit -p '{"run_tag":"20250103"}' "Utils/Download MovieLens"
+```
+
+You can also refer to parameters in a JSON file:
+
+```bash
+$ pydbr runs submit -p '@params.json' "Utils/Download MovieLens"
+```
+
+You can use the parameters in the notebook, and they are also visible in the run metadata:
+
+```bash
+pydbr runs get-output -i 3 8
+```
+
+```json
+{
+  "notebook_output": {
+    "result": "Downloaded files (tag: 20250103): README.txt, links.csv, movies.csv, ratings.csv, tags.csv",
+    "truncated": false
+  },
+  "error": null,
+  "metadata": {
+    "job_id": 8,
+    "run_id": 8,
+    "creator_user_name": "your.name@gmail.com",
+    "number_in_job": 1,
+    "original_attempt_run_id": null,
+    "state": {
+      "life_cycle_state": "TERMINATED",
+      "result_state": "SUCCESS",
+      "state_message": ""
+    },
+    "schedule": null,
+    "task": {
+      "notebook_task": {
+        "notebook_path": "/Utils/Download MovieLens",
+        "base_parameters": {
+          "run_tag": "20250103"
+        }
+      }
+    },
+    "cluster_spec": {
+      "existing_cluster_id": "xxxx-yyyyyy-zzzzzz"
+    },
+    "cluster_instance": {
+      "cluster_id": "xxxx-yyyyyy-zzzzzzzz",
+      "spark_context_id": "8734983498349834"
+    },
+    "overriding_parameters": null,
+    "start_time": 1592067357734,
+    "setup_duration": 0,
+    "execution_duration": 11000,
+    "cleanup_duration": 0,
+    "trigger": null,
+    "run_name": "pydbr-1592067355",
+    "run_page_url": "https://westeurope.azuredatabricks.net/?o=89349849834#job/8/run/1",
+    "run_type": "SUBMIT_RUN"
+  }
+}
+```
+
+#### Get run metadata
+
+Implements: [Databricks REST runs/get](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-get)
+
+```bash
+$ pydbr runs get -i 3 6
+```
+
+```json
+{
+  "job_id": 6,
+  "run_id": 6,
+  "creator_user_name": "your.name@gmail.com",
+  "number_in_job": 1,
+  "original_attempt_run_id": null,
+  "state": {
+    "life_cycle_state": "TERMINATED",
+    "result_state": "SUCCESS",
+    "state_message": ""
+  },
+  "schedule": null,
+  "task": {
+    "notebook_task": {
+      "notebook_path": "/Utils/Download MovieLens"
+    }
+  },
+  "cluster_spec": {
+    "existing_cluster_id": "xxxx-yyyyy-zzzzzz"
+  },
+  "cluster_instance": {
+    "cluster_id": "xxxx-yyyyy-zzzzzz",
+    "spark_context_id": "783487348734873873"
+  },
+  "overriding_parameters": null,
+  "start_time": 1592062497162,
+  "setup_duration": 0,
+  "execution_duration": 11000,
+  "cleanup_duration": 0,
+  "trigger": null,
+  "run_name": "pydbr-1592062494",
+  "run_page_url": "https://westeurope.azuredatabricks.net/?o=398348734873487#job/6/run/1",
+  "run_type": "SUBMIT_RUN"
+}
+```
+
+#### List runs
+
+Implements: [Databricks REST runs/list](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-list)
+
+```bash
+$ pydbr runs ls
+```
+
+To get only the runs for a particular job:
+
+```bash
+# Get the runs for the job with job-id=4
+$ pydbr runs ls 4 -i 3
+```
+
+```json
+{
+  "runs": [
+    {
+      "job_id": 4,
+      "run_id": 4,
+      "creator_user_name": "your.name@gmail.com",
+      "number_in_job": 1,
+      "original_attempt_run_id": null,
+      "state": {
+        "life_cycle_state": "PENDING",
+        "state_message": ""
+      },
+      "schedule": null,
+      "task": {
+        "notebook_task": {
+          "notebook_path": "/Utils/Download MovieLens"
+        }
+      },
+      "cluster_spec": {
+        "existing_cluster_id": "xxxxx-yyyy-zzzzzzz"
+      },
+      "cluster_instance": {
+        "cluster_id": "xxxxx-yyyy-zzzzzzz"
+      },
+      "overriding_parameters": null,
+      "start_time": 1592058826123,
+      "setup_duration": 0,
+      "execution_duration": 0,
+      "cleanup_duration": 0,
+      "trigger": null,
+      "run_name": "pydbr-1592058823",
+      "run_page_url": "https://westeurope.azuredatabricks.net/?o=abcdefghasdf#job/4/run/1",
+      "run_type": "SUBMIT_RUN"
+    }
+  ],
+  "has_more": false
+}
+```
+
+#### Export a run
+
+Implements: [Databricks REST runs/export](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-export)
+
+```bash
+$ pydbr runs export --content-only 4 > .dev/run-view.html
+```
+
+#### Get run output
+
+Implements: [Databricks REST runs/get-output](https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-get-output)
+
+```bash
+$ pydbr runs get-output -i 3 6
+```
+
+```json
+{
+  "notebook_output": {
+    "result": "Downloaded files: README.txt, links.csv, movies.csv, ratings.csv, tags.csv",
+    "truncated": false
+  },
+  "error": null,
+  "metadata": {
+    "job_id": 5,
+    "run_id": 5,
+    "creator_user_name": "your.name@gmail.com",
+    "number_in_job": 1,
+    "original_attempt_run_id": null,
+    "state": {
+      "life_cycle_state": "TERMINATED",
+      "result_state": "SUCCESS",
+      "state_message": ""
+    },
+    "schedule": null,
+    "task": {
+      "notebook_task": {
+        "notebook_path": "/Utils/Download MovieLens"
+      }
+    },
+    "cluster_spec": {
+      "existing_cluster_id": "xxxx-yyyyy-zzzzzzz"
+    },
+    "cluster_instance": {
+      "cluster_id": "xxxx-yyyyy-zzzzzzz",
+      "spark_context_id": "8973498743973498"
+    },
+    "overriding_parameters": null,
+    "start_time": 1592062147101,
+    "setup_duration": 1000,
+    "execution_duration": 11000,
+    "cleanup_duration": 0,
+    "trigger": null,
+    "run_name": "pydbr-1592062135",
+    "run_page_url": "https://westeurope.azuredatabricks.net/?o=89798374987987#job/5/run/1",
+    "run_type": "SUBMIT_RUN"
+  }
+}
+```
+
+To get only the exit output:
+
+```bash
+$ pydbr runs get-output -r 6
+```
+
+```
+Downloaded files: README.txt, links.csv, movies.csv, ratings.csv, tags.csv
+```
+
+## Python Client SDK for Databricks REST APIs
+
+To implement your own Databricks REST API client, you can use the Python client SDK. The sketch below shows how to fetch just a notebook's exit result.
+
+### Create a Databricks connection
+
+```python
+import pydbr
+
+# Get a Databricks workspace connection
+dbc = pydbr.connect(
+    bearer_token='dapixyzabcd09rasdf',
+    url='https://westeurope.azuredatabricks.net')
+```
+
+### DBFS
+
+```python
+# Get the list of items at path /FileStore
+dbc.dbfs.ls('/FileStore')
+
+# Check if a file or directory exists
+dbc.dbfs.exists('/path/to/heaven')
+
+# Make a directory and its parents
+dbc.dbfs.mkdirs('/path/to/heaven')
+
+# Delete a directory recursively
+dbc.dbfs.rm('/path', recursive=True)
+
+# Download a 2048-byte file block starting at offset 1024
+dbc.dbfs.read('/data/movies.csv', 1024, 2048)
+
+# Download the entire file
+dbc.dbfs.read_all('/data/movies.csv')
+```
+
+### Databricks workspace
+
+```python
+# List the root workspace directory
+dbc.workspace.ls('/')
+
+# Check if a workspace item exists
+dbc.workspace.exists('/explore')
+
+# Check if a workspace item is a directory
+dbc.workspace.is_directory('/')
+
+# Export a notebook in the default (SOURCE) format
+dbc.workspace.export('/my_notebook')
+
+# Export a notebook in HTML format
+dbc.workspace.export('/my_notebook', 'HTML')
+```
+
+## Build and publish
+
+```bash
+pip install wheel twine
+python setup.py sdist bdist_wheel
+python -m twine upload dist/*
+```
+
+%prep
+%autosetup -n pydbr-0.0.7
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+	find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+	find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+	find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+	find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+	find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-pydbr -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Fri May 05 2023 Python_Bot <Python_Bot@openeuler.org> - 0.0.7-1
+- Package Spec generated