automatic import of python-soda-sql-core openeuler20.03

author: CoprDistGit <infra@openeuler.org> 2023-05-05 07:29:37 +0000
committer: CoprDistGit <infra@openeuler.org> 2023-05-05 07:29:37 +0000
commit: 20241071398ebd85c43d682bfacf102eab0661ba (patch)
tree: 1bfaab22490f539b725984bc9160e293a02fb061
parent: 1c66c67f8c22baecfad56d2afc203cf640f1b9f8 (diff)
3 files changed, 554 insertions, 0 deletions
diff --git a/.gitignore b/.gitignore
index e69de29..54441a3 100644
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1 @@
+/soda-sql-core-2.2.2.tar.gz
diff --git a/python-soda-sql-core.spec b/python-soda-sql-core.spec
new file mode 100644
index 0000000..51b7b40
--- /dev/null
+++ b/python-soda-sql-core.spec
@@ -0,0 +1,552 @@
+%global _empty_manifest_terminate_build 0
+Name:		python-soda-sql-core
+Version:	2.2.2
+Release:	1
+Summary:	Soda SQL library & CLI
+License:	Apache Software License
+URL:		https://pypi.org/project/soda-sql-core/
+Source0:	https://mirrors.nju.edu.cn/pypi/web/packages/bb/5b/16a61b4e206e03f78ecea9e8bdff5f96601d119cb891777156f997b51587/soda-sql-core-2.2.2.tar.gz
+BuildArch:	noarch
+
+Requires:	python3-markupsafe
+Requires:	python3-Jinja2
+Requires:	python3-click
+Requires:	python3-pyyaml
+Requires:	python3-requests
+Requires:	python3-Deprecated
+Requires:	python3-opentelemetry-api
+Requires:	python3-opentelemetry-exporter-otlp-proto-http
+Requires:	python3-protobuf
+
+%description
+<p align="center"><img src="https://raw.githubusercontent.com/sodadata/docs/main/assets/images/soda-banner.png" alt="Soda logo" /></p>
+
+<h1 align="center">Soda SQL</h1>
+<p align="center"><b>Data testing, monitoring and profiling for SQL accessible data.</b></p>
+
+<p align="center">
+  <a href="https://github.com/sodadata/soda-sql/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-Apache%202-blue.svg" alt="License: Apache 2.0"></a>
+  <a href="https://join.slack.com/t/soda-community/shared_invite/zt-m77gajo1-nXJF7JtbbRht2zwaiLb9pg"><img alt="Slack" src="https://img.shields.io/badge/chat-slack-green.svg"></a>
+  <a href="https://pypi.org/project/soda-sql/"><img alt="Pypi Soda SQL" src="https://img.shields.io/badge/pypi-soda%20sql-green.svg"></a>
+  <a href="https://github.com/sodadata/soda-sql/actions/workflows/build.yml"><img alt="Build soda-sql" src="https://github.com/sodadata/soda-sql/actions/workflows/build.yml/badge.svg"></a>
+</p>
+
+**What does Soda SQL do?**
+
+Soda SQL allows you to
+
+ * Stop your pipeline when bad data is detected
+ * Extract metrics and column profiles through super efficient SQL
+ * Keep full control over metrics and queries through declarative config files
+
+**Why Soda SQL?**
+
+To protect against silent data issues for the consumers of your data,
+it's best practice to profile and test your data:
+
+ * as it lands in your warehouse,
+ * after every important data processing step,
+ * right before consumption.
+
+This way you will prevent delivery of bad data to downstream consumers.
+You will spend less time firefighting and gain a better reputation.
+
+**How does Soda SQL work?**
+
+Soda SQL is a Command Line Interface (CLI) and a Python library to measure
+and test your data using SQL.
+
+As input, Soda SQL uses YAML configuration files that include:
+ * SQL connection details
+ * What metrics to compute
+ * What tests to run on the measurements
+
+Based on those configuration files, Soda SQL will perform scans.  A scan
+performs all measurements and runs all tests associated with one table.  Typically
+a scan is executed after new data has arrived.  All soda-sql configuration files
+can be checked into your version control system as part of your pipeline
+code.
+
+> Want to try Soda SQL? Head over to our ['Quick start tutorial'](https://docs.soda.io/soda-sql/getting-started/5_min_tutorial.html) and get started straight away!
+
+**"[Show me the metrics](https://www.youtube.com/watch?v=1-mOKMq19zU)"**
+
+Let's walk through an example. Simple metrics and tests can be configured in scan YAML configuration
+files. An example of the contents of such a file:
+
+```yaml
+metrics:
+    - row_count
+    - missing_count
+    - missing_percentage
+    - values_count
+    - values_percentage
+    - valid_count
+    - valid_percentage
+    - invalid_count
+    - invalid_percentage
+    - min
+    - max
+    - avg
+    - sum
+    - min_length
+    - max_length
+    - avg_length
+    - distinct
+    - unique_count
+    - duplicate_count
+    - uniqueness
+    - maxs
+    - mins
+    - frequent_values
+    - histogram
+columns:
+    ID:
+        metrics:
+            - distinct
+            - duplicate_count
+        valid_format: uuid
+        tests:
+            duplicate_count == 0
+    CATEGORY:
+        missing_values:
+            - N/A
+            - No category
+        tests:
+            missing_percentage < 3
+    SIZE:
+        tests:
+            max - min < 20
+sql_metrics:
+    - sql: |
+        SELECT sum(volume) as total_volume_us
+        FROM CUSTOMER_TRANSACTIONS
+        WHERE country = 'US'
+      tests:
+        - total_volume_us > 5000
+```
+
+Based on these configuration files, Soda SQL will scan your data
+each time new data arrives, like this:
+
+```bash
+$ soda scan ./soda/metrics my_warehouse my_dataset
+Soda 1.0 scan for dataset my_dataset on prod my_warehouse
+  | SELECT column_name, data_type, is_nullable
+  | FROM information_schema.columns
+  | WHERE lower(table_name) = 'customers'
+  |   AND table_catalog = 'datasource.database'
+  |   AND table_schema = 'datasource.schema'
+  - 0.256 seconds
+Found 4 columns: ID, NAME, CREATE_DATE, COUNTRY
+  | SELECT
+  |  COUNT(*),
+  |  COUNT(CASE WHEN ID IS NULL THEN 1 END),
+  |  COUNT(CASE WHEN ID IS NOT NULL AND ID regexp '\b[0-9a-f]{8}\b-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-\b[0-9a-f]{12}\b' THEN 1 END),
+  |  MIN(LENGTH(ID)),
+  |  AVG(LENGTH(ID)),
+  |  MAX(LENGTH(ID)),
+  | FROM customers
+  - 0.557 seconds
+row_count : 23543
+missing   : 23
+invalid   : 0
+min_length: 9
+avg_length: 9
+max_length: 9
+
+...more queries...
+
+47 measurements computed
+23 tests executed
+All is good. No tests failed. Scan took 23.307 seconds
+```
+
+The next step is to add Soda SQL scans in your favorite
+data pipeline orchestration solution like:
+
+* Airflow
+* AWS Glue
+* Prefect
+* Dagster
+* Fivetran
+* Matillion
+* Luigi
+
+If you like the goals of this project, encourage us! Star [sodadata/soda-sql on GitHub](https://github.com/sodadata/soda-sql).
+
+> Next, head over to our ['Quick start tutorial'](https://docs.soda.io/soda-sql/getting-started/5_min_tutorial.html) and get your first project going!
+
+
+%package -n python3-soda-sql-core
+Summary:	Soda SQL library & CLI
+Provides:	python-soda-sql-core
+BuildRequires:	python3-devel
+BuildRequires:	python3-setuptools
+BuildRequires:	python3-pip
+%description -n python3-soda-sql-core
+<p align="center"><img src="https://raw.githubusercontent.com/sodadata/docs/main/assets/images/soda-banner.png" alt="Soda logo" /></p>
+
+<h1 align="center">Soda SQL</h1>
+<p align="center"><b>Data testing, monitoring and profiling for SQL accessible data.</b></p>
+
+<p align="center">
+  <a href="https://github.com/sodadata/soda-sql/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-Apache%202-blue.svg" alt="License: Apache 2.0"></a>
+  <a href="https://join.slack.com/t/soda-community/shared_invite/zt-m77gajo1-nXJF7JtbbRht2zwaiLb9pg"><img alt="Slack" src="https://img.shields.io/badge/chat-slack-green.svg"></a>
+  <a href="https://pypi.org/project/soda-sql/"><img alt="Pypi Soda SQL" src="https://img.shields.io/badge/pypi-soda%20sql-green.svg"></a>
+  <a href="https://github.com/sodadata/soda-sql/actions/workflows/build.yml"><img alt="Build soda-sql" src="https://github.com/sodadata/soda-sql/actions/workflows/build.yml/badge.svg"></a>
+</p>
+
+**What does Soda SQL do?**
+
+Soda SQL allows you to
+
+ * Stop your pipeline when bad data is detected
+ * Extract metrics and column profiles through super efficient SQL
+ * Keep full control over metrics and queries through declarative config files
+
+**Why Soda SQL?**
+
+To protect against silent data issues for the consumers of your data,
+it's best practice to profile and test your data:
+
+ * as it lands in your warehouse,
+ * after every important data processing step,
+ * right before consumption.
+
+This way you will prevent delivery of bad data to downstream consumers.
+You will spend less time firefighting and gain a better reputation.
+
+**How does Soda SQL work?**
+
+Soda SQL is a Command Line Interface (CLI) and a Python library to measure
+and test your data using SQL.
+
+As input, Soda SQL uses YAML configuration files that include:
+ * SQL connection details
+ * What metrics to compute
+ * What tests to run on the measurements
+
+Based on those configuration files, Soda SQL will perform scans.  A scan
+performs all measurements and runs all tests associated with one table.  Typically
+a scan is executed after new data has arrived.  All soda-sql configuration files
+can be checked into your version control system as part of your pipeline
+code.
+
+> Want to try Soda SQL? Head over to our ['Quick start tutorial'](https://docs.soda.io/soda-sql/getting-started/5_min_tutorial.html) and get started straight away!
+
+**"[Show me the metrics](https://www.youtube.com/watch?v=1-mOKMq19zU)"**
+
+Let's walk through an example. Simple metrics and tests can be configured in scan YAML configuration
+files. An example of the contents of such a file:
+
+```yaml
+metrics:
+    - row_count
+    - missing_count
+    - missing_percentage
+    - values_count
+    - values_percentage
+    - valid_count
+    - valid_percentage
+    - invalid_count
+    - invalid_percentage
+    - min
+    - max
+    - avg
+    - sum
+    - min_length
+    - max_length
+    - avg_length
+    - distinct
+    - unique_count
+    - duplicate_count
+    - uniqueness
+    - maxs
+    - mins
+    - frequent_values
+    - histogram
+columns:
+    ID:
+        metrics:
+            - distinct
+            - duplicate_count
+        valid_format: uuid
+        tests:
+            duplicate_count == 0
+    CATEGORY:
+        missing_values:
+            - N/A
+            - No category
+        tests:
+            missing_percentage < 3
+    SIZE:
+        tests:
+            max - min < 20
+sql_metrics:
+    - sql: |
+        SELECT sum(volume) as total_volume_us
+        FROM CUSTOMER_TRANSACTIONS
+        WHERE country = 'US'
+      tests:
+        - total_volume_us > 5000
+```
+
+Based on these configuration files, Soda SQL will scan your data
+each time new data arrives, like this:
+
+```bash
+$ soda scan ./soda/metrics my_warehouse my_dataset
+Soda 1.0 scan for dataset my_dataset on prod my_warehouse
+  | SELECT column_name, data_type, is_nullable
+  | FROM information_schema.columns
+  | WHERE lower(table_name) = 'customers'
+  |   AND table_catalog = 'datasource.database'
+  |   AND table_schema = 'datasource.schema'
+  - 0.256 seconds
+Found 4 columns: ID, NAME, CREATE_DATE, COUNTRY
+  | SELECT
+  |  COUNT(*),
+  |  COUNT(CASE WHEN ID IS NULL THEN 1 END),
+  |  COUNT(CASE WHEN ID IS NOT NULL AND ID regexp '\b[0-9a-f]{8}\b-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-\b[0-9a-f]{12}\b' THEN 1 END),
+  |  MIN(LENGTH(ID)),
+  |  AVG(LENGTH(ID)),
+  |  MAX(LENGTH(ID)),
+  | FROM customers
+  - 0.557 seconds
+row_count : 23543
+missing   : 23
+invalid   : 0
+min_length: 9
+avg_length: 9
+max_length: 9
+
+...more queries...
+
+47 measurements computed
+23 tests executed
+All is good. No tests failed. Scan took 23.307 seconds
+```
+
+The next step is to add Soda SQL scans in your favorite
+data pipeline orchestration solution like:
+
+* Airflow
+* AWS Glue
+* Prefect
+* Dagster
+* Fivetran
+* Matillion
+* Luigi
+
+If you like the goals of this project, encourage us! Star [sodadata/soda-sql on GitHub](https://github.com/sodadata/soda-sql).
+
+> Next, head over to our ['Quick start tutorial'](https://docs.soda.io/soda-sql/getting-started/5_min_tutorial.html) and get your first project going!
+
+
+%package help
+Summary:	Development documents and examples for soda-sql-core
+Provides:	python3-soda-sql-core-doc
+%description help
+<p align="center"><img src="https://raw.githubusercontent.com/sodadata/docs/main/assets/images/soda-banner.png" alt="Soda logo" /></p>
+
+<h1 align="center">Soda SQL</h1>
+<p align="center"><b>Data testing, monitoring and profiling for SQL accessible data.</b></p>
+
+<p align="center">
+  <a href="https://github.com/sodadata/soda-sql/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-Apache%202-blue.svg" alt="License: Apache 2.0"></a>
+  <a href="https://join.slack.com/t/soda-community/shared_invite/zt-m77gajo1-nXJF7JtbbRht2zwaiLb9pg"><img alt="Slack" src="https://img.shields.io/badge/chat-slack-green.svg"></a>
+  <a href="https://pypi.org/project/soda-sql/"><img alt="Pypi Soda SQL" src="https://img.shields.io/badge/pypi-soda%20sql-green.svg"></a>
+  <a href="https://github.com/sodadata/soda-sql/actions/workflows/build.yml"><img alt="Build soda-sql" src="https://github.com/sodadata/soda-sql/actions/workflows/build.yml/badge.svg"></a>
+</p>
+
+**What does Soda SQL do?**
+
+Soda SQL allows you to
+
+ * Stop your pipeline when bad data is detected
+ * Extract metrics and column profiles through super efficient SQL
+ * Keep full control over metrics and queries through declarative config files
+
+**Why Soda SQL?**
+
+To protect against silent data issues for the consumers of your data,
+it's best practice to profile and test your data:
+
+ * as it lands in your warehouse,
+ * after every important data processing step,
+ * right before consumption.
+
+This way you will prevent delivery of bad data to downstream consumers.
+You will spend less time firefighting and gain a better reputation.
+
+**How does Soda SQL work?**
+
+Soda SQL is a Command Line Interface (CLI) and a Python library to measure
+and test your data using SQL.
+
+As input, Soda SQL uses YAML configuration files that include:
+ * SQL connection details
+ * What metrics to compute
+ * What tests to run on the measurements
+
+Based on those configuration files, Soda SQL will perform scans.  A scan
+performs all measurements and runs all tests associated with one table.  Typically
+a scan is executed after new data has arrived.  All soda-sql configuration files
+can be checked into your version control system as part of your pipeline
+code.
+
+> Want to try Soda SQL? Head over to our ['Quick start tutorial'](https://docs.soda.io/soda-sql/getting-started/5_min_tutorial.html) and get started straight away!
+
+**"[Show me the metrics](https://www.youtube.com/watch?v=1-mOKMq19zU)"**
+
+Let's walk through an example. Simple metrics and tests can be configured in scan YAML configuration
+files. An example of the contents of such a file:
+
+```yaml
+metrics:
+    - row_count
+    - missing_count
+    - missing_percentage
+    - values_count
+    - values_percentage
+    - valid_count
+    - valid_percentage
+    - invalid_count
+    - invalid_percentage
+    - min
+    - max
+    - avg
+    - sum
+    - min_length
+    - max_length
+    - avg_length
+    - distinct
+    - unique_count
+    - duplicate_count
+    - uniqueness
+    - maxs
+    - mins
+    - frequent_values
+    - histogram
+columns:
+    ID:
+        metrics:
+            - distinct
+            - duplicate_count
+        valid_format: uuid
+        tests:
+            duplicate_count == 0
+    CATEGORY:
+        missing_values:
+            - N/A
+            - No category
+        tests:
+            missing_percentage < 3
+    SIZE:
+        tests:
+            max - min < 20
+sql_metrics:
+    - sql: |
+        SELECT sum(volume) as total_volume_us
+        FROM CUSTOMER_TRANSACTIONS
+        WHERE country = 'US'
+      tests:
+        - total_volume_us > 5000
+```
+
+Based on these configuration files, Soda SQL will scan your data
+each time new data arrives, like this:
+
+```bash
+$ soda scan ./soda/metrics my_warehouse my_dataset
+Soda 1.0 scan for dataset my_dataset on prod my_warehouse
+  | SELECT column_name, data_type, is_nullable
+  | FROM information_schema.columns
+  | WHERE lower(table_name) = 'customers'
+  |   AND table_catalog = 'datasource.database'
+  |   AND table_schema = 'datasource.schema'
+  - 0.256 seconds
+Found 4 columns: ID, NAME, CREATE_DATE, COUNTRY
+  | SELECT
+  |  COUNT(*),
+  |  COUNT(CASE WHEN ID IS NULL THEN 1 END),
+  |  COUNT(CASE WHEN ID IS NOT NULL AND ID regexp '\b[0-9a-f]{8}\b-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-\b[0-9a-f]{12}\b' THEN 1 END),
+  |  MIN(LENGTH(ID)),
+  |  AVG(LENGTH(ID)),
+  |  MAX(LENGTH(ID)),
+  | FROM customers
+  - 0.557 seconds
+row_count : 23543
+missing   : 23
+invalid   : 0
+min_length: 9
+avg_length: 9
+max_length: 9
+
+...more queries...
+
+47 measurements computed
+23 tests executed
+All is good. No tests failed. Scan took 23.307 seconds
+```
+
+The next step is to add Soda SQL scans in your favorite
+data pipeline orchestration solution like:
+
+* Airflow
+* AWS Glue
+* Prefect
+* Dagster
+* Fivetran
+* Matillion
+* Luigi
+
+If you like the goals of this project, encourage us! Star [sodadata/soda-sql on GitHub](https://github.com/sodadata/soda-sql).
+
+> Next, head over to our ['Quick start tutorial'](https://docs.soda.io/soda-sql/getting-started/5_min_tutorial.html) and get your first project going!
+
+
+%prep
+%autosetup -n soda-sql-core-2.2.2
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+	find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+	find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+	find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+	find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+	find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-soda-sql-core -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Fri May 05 2023 Python_Bot <Python_Bot@openeuler.org> - 2.2.2-1
+- Package Spec generated
diff --git a/sources b/sources
new file mode 100644
index 0000000..d2f2fb0
--- /dev/null
+++ b/sources
@@ -0,0 +1 @@
+978d3398f6f03c9d505cda51c94db1c6  soda-sql-core-2.2.2.tar.gz
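
The `sources` entry pairs a 32-character hex digest with the tarball name; the digest length suggests MD5, which is an assumption here rather than something the commit states. Below is a minimal sketch of how such a line could be checked against a downloaded tarball. The helper names `parse_sources_line`, `md5_of_file` and `md5_matches` are hypothetical and not part of any openEuler tooling.

```python
import hashlib
from pathlib import Path


def parse_sources_line(line: str) -> tuple[str, str]:
    """Split a '<digest>  <filename>' line into (digest, filename)."""
    digest, filename = line.split(maxsplit=1)
    return digest.lower(), filename.strip()


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so a large tarball is never fully in memory."""
    hasher = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def md5_matches(sources_line: str, directory: Path) -> bool:
    """Return True if the file named in the line has the recorded digest."""
    digest, filename = parse_sources_line(sources_line)
    return md5_of_file(directory / filename) == digest
```

For this commit, the call would look like `md5_matches("978d3398f6f03c9d505cda51c94db1c6  soda-sql-core-2.2.2.tar.gz", Path("."))`, assuming the tarball sits in the current directory.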
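
The `%install` section of the spec builds `filelist.lst` by running `find ... -type f -printf "/%h/%f\n"` over `usr/lib`, `usr/lib64`, `usr/bin` and `usr/sbin` inside the buildroot. The sketch below reproduces that walk in Python to show what ends up in the list. The function name `build_filelist` is hypothetical. The output is sorted for determinism, which `find` itself does not guarantee.

```python
import os
from pathlib import Path

# Directories the spec's %install section scans, relative to the buildroot.
SCANNED_DIRS = ("usr/lib", "usr/lib64", "usr/bin", "usr/sbin")


def build_filelist(buildroot: Path) -> list[str]:
    """Return absolute install paths of regular files under the scanned dirs."""
    entries = []
    for rel in SCANNED_DIRS:
        base = buildroot / rel
        if not base.is_dir():
            continue  # mirrors the `if [ -d ... ]` guards in the spec
        # os.walk does not descend into symlinked directories by default.
        for root, _dirs, files in os.walk(base):
            for name in files:
                full = Path(root) / name
                # `find -type f` matches regular files only, not symlinks.
                if full.is_symlink() or not full.is_file():
                    continue
                entries.append("/" + full.relative_to(buildroot).as_posix())
    return sorted(entries)
```

Each returned entry corresponds to one line of `filelist.lst`, which the spec then passes to `%files -n python3-soda-sql-core -f filelist.lst`.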