-rw-r--r--  .gitignore                 |   1
-rw-r--r--  python-cdm-connector.spec  | 258
-rw-r--r--  sources                    |   1
3 files changed, 260 insertions, 0 deletions
@@ -0,0 +1 @@
+/cdm-connector-0.0.6.70.tar.gz
diff --git a/python-cdm-connector.spec b/python-cdm-connector.spec
new file mode 100644
index 0000000..52b6ca0
--- /dev/null
+++ b/python-cdm-connector.spec
@@ -0,0 +1,258 @@
+%global _empty_manifest_terminate_build 0
+Name: python-cdm-connector
+Version: 0.0.6.70
+Release: 1
+Summary: A Python package to read and write files in CDM format. Customized for SkyPoint use cases.
+License: GPL-3.0
+URL: https://github.com/skypointcloud/skypoint-python-cdm-connector
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/f9/6e/b4595933644029689cd3cd32e2aab235b23d1de4506bc4a35f02901819aa/cdm-connector-0.0.6.70.tar.gz
+BuildArch: noarch
+
+Requires: python3-pandas
+Requires: python3-azure-storage-blob
+Requires: python3-numpy
+Requires: python3-retry
+Requires: python3-boto3
+Requires: python3-botocore
+
+%description
+# skypoint-python-cdm-connector
+Python Spark CDM Connector by SkyPoint.
+
+Apache Spark connector for the Microsoft Azure "Common Data Model". Reading and writing is supported and it is a work in progress. Please file issues for any bugs that you find.
+
+For more information about the Azure Common Data Model, check out [this page](https://docs.microsoft.com/en-us/common-data-model/data-lake). <br>
+
+We support Azure Data Lake Service (ADLS) and AWS S3 as storage, historical data preservation using snapshots of the schema & data files and usage within PySpark, Azure Functions etc.
+
+*Upcoming Support for incremental data refresh handling, [CDM 1.1](https://docs.microsoft.com/en-us/common-data-model/cdm-manifest) and Google Cloud (Cloud Storage). <br>
+
+## Example
+
+1. Please look into the sample usage file skypoint_python_cdm.py
+2. Dynamically add/remove entities, annotations and attributes
+3. Pass Reader and Writer object for any storage account you like to write/read data to/from.
+4. Check out the below code for basic read and write examples.
+
+```python
+# Initialize empty model
+m = Model()
+
+# Sample dataframe
+df = {"country": ["Brazil", "Russia", "India", "China", "South Africa", "ParaSF"],
+      "currentTime": [datetime.now(), datetime.now(), datetime.now(), datetime.now(), datetime.now(), datetime.now()],
+      "area": [8.516, 17.10, 3.286, 9.597, 1.221, 2.222],
+      "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria", "ParaSF"],
+      "population": [200.4, 143.5, 1252, 1357, 52.98, 12.34] }
+df = pd.DataFrame(df)
+
+# Generate entity from the dataframe
+entity = Model.generate_entity(df, "customEntity")
+
+# Add generated entity to model
+m.add_entity(entity)
+
+# Add model level annotation
+# Annotation can be added at entity level as well as attribute level
+Model.add_annotation("modelJsonAnnotation", "modelJsonAnnotationValue", m)
+
+
+# Create an ADLSWriter to write into ADLS
+writer = ADLSWriter("ACCOUNT_NAME", "ACCOUNT_KEY",
+                    "CONTAINER_NAME", "STORAGE_NAME", "DATAFLOW_NAME")
+
+# Write data as well as model.json in ADLS storage
+m.write_to_storage("customEntity", df, writer)
+```
+
+## Contributing
+
+This project welcomes contributions and suggestions.
+
+## References
+
+[Model.json version1 schema](https://github.com/microsoft/CDM/blob/master/docs/schema/modeljsonschema.json)
+
+[A clean implementation for Python Objects from/to model.json file](https://github.com/Azure-Samples/cdm-azure-data-services-integration/blob/master/CDM/python/CdmModel.py)
+
+
+
+
+%package -n python3-cdm-connector
+Summary: A Python package to read and write files in CDM format. Customized for SkyPoint use cases.
+Provides: python-cdm-connector
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-cdm-connector
+# skypoint-python-cdm-connector
+Python Spark CDM Connector by SkyPoint.
+
+Apache Spark connector for the Microsoft Azure "Common Data Model". Reading and writing is supported and it is a work in progress. Please file issues for any bugs that you find.
+
+For more information about the Azure Common Data Model, check out [this page](https://docs.microsoft.com/en-us/common-data-model/data-lake). <br>
+
+We support Azure Data Lake Service (ADLS) and AWS S3 as storage, historical data preservation using snapshots of the schema & data files and usage within PySpark, Azure Functions etc.
+
+*Upcoming Support for incremental data refresh handling, [CDM 1.1](https://docs.microsoft.com/en-us/common-data-model/cdm-manifest) and Google Cloud (Cloud Storage). <br>
+
+## Example
+
+1. Please look into the sample usage file skypoint_python_cdm.py
+2. Dynamically add/remove entities, annotations and attributes
+3. Pass Reader and Writer object for any storage account you like to write/read data to/from.
+4. Check out the below code for basic read and write examples.
+
+```python
+# Initialize empty model
+m = Model()
+
+# Sample dataframe
+df = {"country": ["Brazil", "Russia", "India", "China", "South Africa", "ParaSF"],
+      "currentTime": [datetime.now(), datetime.now(), datetime.now(), datetime.now(), datetime.now(), datetime.now()],
+      "area": [8.516, 17.10, 3.286, 9.597, 1.221, 2.222],
+      "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria", "ParaSF"],
+      "population": [200.4, 143.5, 1252, 1357, 52.98, 12.34] }
+df = pd.DataFrame(df)
+
+# Generate entity from the dataframe
+entity = Model.generate_entity(df, "customEntity")
+
+# Add generated entity to model
+m.add_entity(entity)
+
+# Add model level annotation
+# Annotation can be added at entity level as well as attribute level
+Model.add_annotation("modelJsonAnnotation", "modelJsonAnnotationValue", m)
+
+
+# Create an ADLSWriter to write into ADLS
+writer = ADLSWriter("ACCOUNT_NAME", "ACCOUNT_KEY",
+                    "CONTAINER_NAME", "STORAGE_NAME", "DATAFLOW_NAME")
+
+# Write data as well as model.json in ADLS storage
+m.write_to_storage("customEntity", df, writer)
+```
+
+## Contributing
+
+This project welcomes contributions and suggestions.
+
+## References
+
+[Model.json version1 schema](https://github.com/microsoft/CDM/blob/master/docs/schema/modeljsonschema.json)
+
+[A clean implementation for Python Objects from/to model.json file](https://github.com/Azure-Samples/cdm-azure-data-services-integration/blob/master/CDM/python/CdmModel.py)
+
+
+
+
+%package help
+Summary: Development documents and examples for cdm-connector
+Provides: python3-cdm-connector-doc
+%description help
+# skypoint-python-cdm-connector
+Python Spark CDM Connector by SkyPoint.
+
+Apache Spark connector for the Microsoft Azure "Common Data Model". Reading and writing is supported and it is a work in progress. Please file issues for any bugs that you find.
+
+For more information about the Azure Common Data Model, check out [this page](https://docs.microsoft.com/en-us/common-data-model/data-lake). <br>
+
+We support Azure Data Lake Service (ADLS) and AWS S3 as storage, historical data preservation using snapshots of the schema & data files and usage within PySpark, Azure Functions etc.
+
+*Upcoming Support for incremental data refresh handling, [CDM 1.1](https://docs.microsoft.com/en-us/common-data-model/cdm-manifest) and Google Cloud (Cloud Storage). <br>
+
+## Example
+
+1. Please look into the sample usage file skypoint_python_cdm.py
+2. Dynamically add/remove entities, annotations and attributes
+3. Pass Reader and Writer object for any storage account you like to write/read data to/from.
+4. Check out the below code for basic read and write examples.
+
+```python
+# Initialize empty model
+m = Model()
+
+# Sample dataframe
+df = {"country": ["Brazil", "Russia", "India", "China", "South Africa", "ParaSF"],
+      "currentTime": [datetime.now(), datetime.now(), datetime.now(), datetime.now(), datetime.now(), datetime.now()],
+      "area": [8.516, 17.10, 3.286, 9.597, 1.221, 2.222],
+      "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria", "ParaSF"],
+      "population": [200.4, 143.5, 1252, 1357, 52.98, 12.34] }
+df = pd.DataFrame(df)
+
+# Generate entity from the dataframe
+entity = Model.generate_entity(df, "customEntity")
+
+# Add generated entity to model
+m.add_entity(entity)
+
+# Add model level annotation
+# Annotation can be added at entity level as well as attribute level
+Model.add_annotation("modelJsonAnnotation", "modelJsonAnnotationValue", m)
+
+
+# Create an ADLSWriter to write into ADLS
+writer = ADLSWriter("ACCOUNT_NAME", "ACCOUNT_KEY",
+                    "CONTAINER_NAME", "STORAGE_NAME", "DATAFLOW_NAME")
+
+# Write data as well as model.json in ADLS storage
+m.write_to_storage("customEntity", df, writer)
+```
+
+## Contributing
+
+This project welcomes contributions and suggestions.
+
+## References
+
+[Model.json version1 schema](https://github.com/microsoft/CDM/blob/master/docs/schema/modeljsonschema.json)
+
+[A clean implementation for Python Objects from/to model.json file](https://github.com/Azure-Samples/cdm-azure-data-services-integration/blob/master/CDM/python/CdmModel.py)
+
+
+
+
+%prep
+%autosetup -n cdm-connector-0.0.6.70
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+	find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+	find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+	find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+	find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+	find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-cdm-connector -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Wed May 10 2023 Python_Bot <Python_Bot@openeuler.org> - 0.0.6.70-1
+- Package Spec generated
@@ -0,0 +1 @@
+3fc95c65f9c080366a84ded82809e7ae cdm-connector-0.0.6.70.tar.gz
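Review note: the `%install` scriptlet in the spec builds `filelist.lst` and `doclist.lst` by running `find ... -type f -printf "/%h/%f\n"` inside the buildroot, so every installed file is recorded as an absolute path. The Python sketch below only approximates that logic to make the resulting lists easier to reason about; it is not part of the package. The function name `list_packaged_files` and the sample paths are hypothetical. The `.gz` suffix on man pages is assumed to reflect RPM compressing man pages after `%install`.

```python
import os
import tempfile


def list_packaged_files(buildroot, subdirs, suffix=""):
    """Approximate `find <subdir> -type f -printf "/%h/%f<suffix>\\n"` run from buildroot."""
    paths = []
    for subdir in subdirs:
        top = os.path.join(buildroot, subdir)
        if not os.path.isdir(top):  # mirrors the `[ -d ... ]` guard in the spec
            continue
        for dirpath, _dirnames, filenames in os.walk(top):
            for name in filenames:
                full = os.path.join(dirpath, name)
                # `-type f` matches regular files only, so skip symlinks
                if os.path.isfile(full) and not os.path.islink(full):
                    rel = os.path.relpath(full, buildroot)
                    paths.append("/" + rel.replace(os.sep, "/") + suffix)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical install layout, for illustration only
        for rel in ("usr/lib/python3/site-packages/pkg/__init__.py",
                    "usr/share/man/man1/tool.1"):
            path = os.path.join(root, rel)
            os.makedirs(os.path.dirname(path), exist_ok=True)
            open(path, "w").close()

        code_dirs = ("usr/lib", "usr/lib64", "usr/bin", "usr/sbin")
        print(list_packaged_files(root, code_dirs))
        print(list_packaged_files(root, ("usr/share/man",), suffix=".gz"))
```

Because `%files -n python3-cdm-connector -f filelist.lst` consumes the list verbatim, any file installed outside those four directories would not be picked up by the file list.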
