 .gitignore                |   1 +
 python-cdm-connector.spec | 258 +++++++++++++++++++++++++
 sources                   |   1 +
 3 files changed, 260 insertions(+), 0 deletions(-)
diff --git a/.gitignore b/.gitignore
index e69de29..9d717b3 100644
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1 @@
+/cdm-connector-0.0.6.70.tar.gz
diff --git a/python-cdm-connector.spec b/python-cdm-connector.spec
new file mode 100644
index 0000000..52b6ca0
--- /dev/null
+++ b/python-cdm-connector.spec
@@ -0,0 +1,258 @@
+%global _empty_manifest_terminate_build 0
+Name: python-cdm-connector
+Version: 0.0.6.70
+Release: 1
+Summary: A Python package to read and write files in CDM format. Customized for SkyPoint use cases.
+License: GPL-3.0
+URL: https://github.com/skypointcloud/skypoint-python-cdm-connector
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/f9/6e/b4595933644029689cd3cd32e2aab235b23d1de4506bc4a35f02901819aa/cdm-connector-0.0.6.70.tar.gz
+BuildArch: noarch
+
+Requires: python3-pandas
+Requires: python3-azure-storage-blob
+Requires: python3-numpy
+Requires: python3-retry
+Requires: python3-boto3
+Requires: python3-botocore
+
+%description
+# skypoint-python-cdm-connector
+Python Spark CDM Connector by SkyPoint.
+
+Apache Spark connector for the Microsoft Azure Common Data Model. Reading and writing are supported; the connector is a work in progress. Please file issues for any bugs you find.
+
+For more information about the Azure Common Data Model, see [this page](https://docs.microsoft.com/en-us/common-data-model/data-lake).
+
+We support Azure Data Lake Storage (ADLS) and AWS S3 as storage, historical data preservation using snapshots of the schema and data files, and usage within PySpark, Azure Functions, etc.
+
+*Upcoming: support for incremental data refresh handling, [CDM 1.1](https://docs.microsoft.com/en-us/common-data-model/cdm-manifest), and Google Cloud Storage.*
+
+## Example
+
+1. See the sample usage file skypoint_python_cdm.py.
+2. Dynamically add/remove entities, annotations, and attributes.
+3. Pass a Reader or Writer object for whichever storage account you would like to read data from or write data to.
+4. The code below shows a basic write example.
+
+```python
+from datetime import datetime
+
+import pandas as pd
+
+# (Model is imported from the cdm-connector package)
+
+# Initialize an empty model
+m = Model()
+
+# Sample dataframe
+df = {"country": ["Brazil", "Russia", "India", "China", "South Africa", "ParaSF"],
+      "currentTime": [datetime.now()] * 6,
+      "area": [8.516, 17.10, 3.286, 9.597, 1.221, 2.222],
+      "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria", "ParaSF"],
+      "population": [200.4, 143.5, 1252, 1357, 52.98, 12.34]}
+df = pd.DataFrame(df)
+
+# Generate entity from the dataframe
+entity = Model.generate_entity(df, "customEntity")
+
+# Add generated entity to model
+m.add_entity(entity)
+
+# Add model level annotation
+# Annotation can be added at entity level as well as attribute level
+Model.add_annotation("modelJsonAnnotation", "modelJsonAnnotationValue", m)
+
+
+# Create an ADLSWriter to write into ADLS
+writer = ADLSWriter("ACCOUNT_NAME", "ACCOUNT_KEY",
+ "CONTAINER_NAME", "STORAGE_NAME", "DATAFLOW_NAME")
+
+# Write data as well as model.json in ADLS storage
+m.write_to_storage("customEntity", df, writer)
+```
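+
+For the read path, a reader object is passed in the same way as the writer above. The sketch below is illustrative only: the class and method names (`ADLSReader`, `read_from_storage`) are assumptions for this example, not confirmed API of this package.
+
+```python
+# Create an ADLSReader (hypothetical name) for the same storage account
+reader = ADLSReader("ACCOUNT_NAME", "ACCOUNT_KEY",
+                    "CONTAINER_NAME", "STORAGE_NAME", "DATAFLOW_NAME")
+
+# Read the entity's data back as a dataframe (hypothetical method name)
+df_read = m.read_from_storage("customEntity", reader)
+```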
+
+## Contributing
+
+This project welcomes contributions and suggestions.
+
+## References
+
+[Model.json version1 schema](https://github.com/microsoft/CDM/blob/master/docs/schema/modeljsonschema.json)
+
+[A clean implementation for Python Objects from/to model.json file](https://github.com/Azure-Samples/cdm-azure-data-services-integration/blob/master/CDM/python/CdmModel.py)
+
+
+
+
+%package -n python3-cdm-connector
+Summary: A Python package to read and write files in CDM format. Customized for SkyPoint use cases.
+Provides: python-cdm-connector
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-cdm-connector
+# skypoint-python-cdm-connector
+Python Spark CDM Connector by SkyPoint.
+
+Apache Spark connector for the Microsoft Azure Common Data Model. Reading and writing are supported; the connector is a work in progress. Please file issues for any bugs you find.
+
+For more information about the Azure Common Data Model, see [this page](https://docs.microsoft.com/en-us/common-data-model/data-lake).
+
+We support Azure Data Lake Storage (ADLS) and AWS S3 as storage, historical data preservation using snapshots of the schema and data files, and usage within PySpark, Azure Functions, etc.
+
+*Upcoming: support for incremental data refresh handling, [CDM 1.1](https://docs.microsoft.com/en-us/common-data-model/cdm-manifest), and Google Cloud Storage.*
+
+## Example
+
+1. See the sample usage file skypoint_python_cdm.py.
+2. Dynamically add/remove entities, annotations, and attributes.
+3. Pass a Reader or Writer object for whichever storage account you would like to read data from or write data to.
+4. The code below shows a basic write example.
+
+```python
+from datetime import datetime
+
+import pandas as pd
+
+# (Model is imported from the cdm-connector package)
+
+# Initialize an empty model
+m = Model()
+
+# Sample dataframe
+df = {"country": ["Brazil", "Russia", "India", "China", "South Africa", "ParaSF"],
+      "currentTime": [datetime.now()] * 6,
+      "area": [8.516, 17.10, 3.286, 9.597, 1.221, 2.222],
+      "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria", "ParaSF"],
+      "population": [200.4, 143.5, 1252, 1357, 52.98, 12.34]}
+df = pd.DataFrame(df)
+
+# Generate entity from the dataframe
+entity = Model.generate_entity(df, "customEntity")
+
+# Add generated entity to model
+m.add_entity(entity)
+
+# Add model level annotation
+# Annotation can be added at entity level as well as attribute level
+Model.add_annotation("modelJsonAnnotation", "modelJsonAnnotationValue", m)
+
+
+# Create an ADLSWriter to write into ADLS
+writer = ADLSWriter("ACCOUNT_NAME", "ACCOUNT_KEY",
+ "CONTAINER_NAME", "STORAGE_NAME", "DATAFLOW_NAME")
+
+# Write data as well as model.json in ADLS storage
+m.write_to_storage("customEntity", df, writer)
+```
+
+## Contributing
+
+This project welcomes contributions and suggestions.
+
+## References
+
+[Model.json version1 schema](https://github.com/microsoft/CDM/blob/master/docs/schema/modeljsonschema.json)
+
+[A clean implementation for Python Objects from/to model.json file](https://github.com/Azure-Samples/cdm-azure-data-services-integration/blob/master/CDM/python/CdmModel.py)
+
+
+
+
+%package help
+Summary: Development documents and examples for cdm-connector
+Provides: python3-cdm-connector-doc
+%description help
+# skypoint-python-cdm-connector
+Python Spark CDM Connector by SkyPoint.
+
+Apache Spark connector for the Microsoft Azure Common Data Model. Reading and writing are supported; the connector is a work in progress. Please file issues for any bugs you find.
+
+For more information about the Azure Common Data Model, see [this page](https://docs.microsoft.com/en-us/common-data-model/data-lake).
+
+We support Azure Data Lake Storage (ADLS) and AWS S3 as storage, historical data preservation using snapshots of the schema and data files, and usage within PySpark, Azure Functions, etc.
+
+*Upcoming: support for incremental data refresh handling, [CDM 1.1](https://docs.microsoft.com/en-us/common-data-model/cdm-manifest), and Google Cloud Storage.*
+
+## Example
+
+1. See the sample usage file skypoint_python_cdm.py.
+2. Dynamically add/remove entities, annotations, and attributes.
+3. Pass a Reader or Writer object for whichever storage account you would like to read data from or write data to.
+4. The code below shows a basic write example.
+
+```python
+from datetime import datetime
+
+import pandas as pd
+
+# (Model is imported from the cdm-connector package)
+
+# Initialize an empty model
+m = Model()
+
+# Sample dataframe
+df = {"country": ["Brazil", "Russia", "India", "China", "South Africa", "ParaSF"],
+      "currentTime": [datetime.now()] * 6,
+      "area": [8.516, 17.10, 3.286, 9.597, 1.221, 2.222],
+      "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria", "ParaSF"],
+      "population": [200.4, 143.5, 1252, 1357, 52.98, 12.34]}
+df = pd.DataFrame(df)
+
+# Generate entity from the dataframe
+entity = Model.generate_entity(df, "customEntity")
+
+# Add generated entity to model
+m.add_entity(entity)
+
+# Add model level annotation
+# Annotation can be added at entity level as well as attribute level
+Model.add_annotation("modelJsonAnnotation", "modelJsonAnnotationValue", m)
+
+
+# Create an ADLSWriter to write into ADLS
+writer = ADLSWriter("ACCOUNT_NAME", "ACCOUNT_KEY",
+ "CONTAINER_NAME", "STORAGE_NAME", "DATAFLOW_NAME")
+
+# Write data as well as model.json in ADLS storage
+m.write_to_storage("customEntity", df, writer)
+```
+
+## Contributing
+
+This project welcomes contributions and suggestions.
+
+## References
+
+[Model.json version1 schema](https://github.com/microsoft/CDM/blob/master/docs/schema/modeljsonschema.json)
+
+[A clean implementation for Python Objects from/to model.json file](https://github.com/Azure-Samples/cdm-azure-data-services-integration/blob/master/CDM/python/CdmModel.py)
+
+
+
+
+%prep
+%autosetup -n cdm-connector-0.0.6.70
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+ find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+ find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+ find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+ find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+ find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-cdm-connector -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Wed May 10 2023 Python_Bot <Python_Bot@openeuler.org> - 0.0.6.70-1
+- Package Spec generated
diff --git a/sources b/sources
new file mode 100644
index 0000000..82df664
--- /dev/null
+++ b/sources
@@ -0,0 +1 @@
+3fc95c65f9c080366a84ded82809e7ae cdm-connector-0.0.6.70.tar.gz