%global _empty_manifest_terminate_build 0
Name: python-cdm-connector
Version: 0.0.6.70
Release: 1
Summary: A Python package to read and write files in CDM format. Customized for SkyPoint use cases.
License: GPL-3.0
URL: https://github.com/skypointcloud/skypoint-python-cdm-connector
Source0: https://mirrors.nju.edu.cn/pypi/web/packages/f9/6e/b4595933644029689cd3cd32e2aab235b23d1de4506bc4a35f02901819aa/cdm-connector-0.0.6.70.tar.gz
BuildArch: noarch
Requires: python3-pandas
Requires: python3-azure-storage-blob
Requires: python3-numpy
Requires: python3-retry
Requires: python3-boto3
Requires: python3-botocore
%description
# skypoint-python-cdm-connector
Python Spark CDM Connector by SkyPoint.
An Apache Spark connector for the Microsoft Azure Common Data Model. Reading and writing are supported; the connector is a work in progress, so please file issues for any bugs you find.
For more information about the Azure Common Data Model, check out [this page](https://docs.microsoft.com/en-us/common-data-model/data-lake).
We support Azure Data Lake Storage (ADLS) and AWS S3 as storage backends, historical data preservation through snapshots of the schema and data files, and usage within PySpark, Azure Functions, and similar environments.
*Upcoming: support for incremental data refresh handling, [CDM 1.1](https://docs.microsoft.com/en-us/common-data-model/cdm-manifest), and Google Cloud Storage.*
## Example
1. See the sample usage file skypoint_python_cdm.py.
2. Dynamically add or remove entities, annotations, and attributes.
3. Pass a Reader or Writer object for whichever storage account you want to read data from or write data to.
4. The code below shows a basic write example; a sketch of the read path follows it.
```python
from datetime import datetime

import pandas as pd

# Model and ADLSWriter are provided by this package; see
# skypoint_python_cdm.py for the exact import path.

# Initialize an empty model
m = Model()

# Sample dataframe
data = {"country": ["Brazil", "Russia", "India", "China", "South Africa", "ParaSF"],
        "currentTime": [datetime.now(), datetime.now(), datetime.now(),
                        datetime.now(), datetime.now(), datetime.now()],
        "area": [8.516, 17.10, 3.286, 9.597, 1.221, 2.222],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria", "ParaSF"],
        "population": [200.4, 143.5, 1252, 1357, 52.98, 12.34]}
df = pd.DataFrame(data)

# Generate an entity from the dataframe
entity = Model.generate_entity(df, "customEntity")

# Add the generated entity to the model
m.add_entity(entity)

# Add a model-level annotation (annotations can also be added at the
# entity and attribute levels)
Model.add_annotation("modelJsonAnnotation", "modelJsonAnnotationValue", m)

# Create an ADLSWriter to write into ADLS
writer = ADLSWriter("ACCOUNT_NAME", "ACCOUNT_KEY",
                    "CONTAINER_NAME", "STORAGE_NAME", "DATAFLOW_NAME")

# Write the data as well as model.json to ADLS storage
m.write_to_storage("customEntity", df, writer)
```
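Reading follows the same pattern with a Reader object. The sketch below is illustrative only: the `ADLSReader` class and `read_from_storage` method are assumed names, not confirmed against the package, so check skypoint_python_cdm.py for the actual API.
```python
# Illustrative read sketch -- ADLSReader and read_from_storage are assumed
# names, not confirmed against the package; see skypoint_python_cdm.py.
reader = ADLSReader("ACCOUNT_NAME", "ACCOUNT_KEY",
                    "CONTAINER_NAME", "STORAGE_NAME", "DATAFLOW_NAME")

# Read the entity's data back from ADLS as a pandas dataframe
df_read = m.read_from_storage("customEntity", reader)
print(df_read.head())
```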
## Contributing
This project welcomes contributions and suggestions.
## References
- [Model.json version 1 schema](https://github.com/microsoft/CDM/blob/master/docs/schema/modeljsonschema.json)
- [A clean implementation of Python objects from/to a model.json file](https://github.com/Azure-Samples/cdm-azure-data-services-integration/blob/master/CDM/python/CdmModel.py)
%package -n python3-cdm-connector
Summary: A Python package to read and write files in CDM format. Customized for SkyPoint use cases.
Provides: python-cdm-connector
BuildRequires: python3-devel
BuildRequires: python3-setuptools
BuildRequires: python3-pip
%description -n python3-cdm-connector
# skypoint-python-cdm-connector
Python Spark CDM Connector by SkyPoint.
An Apache Spark connector for the Microsoft Azure Common Data Model. Reading and writing are supported; the connector is a work in progress, so please file issues for any bugs you find.
For more information about the Azure Common Data Model, check out [this page](https://docs.microsoft.com/en-us/common-data-model/data-lake).
We support Azure Data Lake Storage (ADLS) and AWS S3 as storage backends, historical data preservation through snapshots of the schema and data files, and usage within PySpark, Azure Functions, and similar environments.
*Upcoming: support for incremental data refresh handling, [CDM 1.1](https://docs.microsoft.com/en-us/common-data-model/cdm-manifest), and Google Cloud Storage.*
## Example
1. See the sample usage file skypoint_python_cdm.py.
2. Dynamically add or remove entities, annotations, and attributes.
3. Pass a Reader or Writer object for whichever storage account you want to read data from or write data to.
4. The code below shows a basic write example; a sketch of the read path follows it.
```python
from datetime import datetime

import pandas as pd

# Model and ADLSWriter are provided by this package; see
# skypoint_python_cdm.py for the exact import path.

# Initialize an empty model
m = Model()

# Sample dataframe
data = {"country": ["Brazil", "Russia", "India", "China", "South Africa", "ParaSF"],
        "currentTime": [datetime.now(), datetime.now(), datetime.now(),
                        datetime.now(), datetime.now(), datetime.now()],
        "area": [8.516, 17.10, 3.286, 9.597, 1.221, 2.222],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria", "ParaSF"],
        "population": [200.4, 143.5, 1252, 1357, 52.98, 12.34]}
df = pd.DataFrame(data)

# Generate an entity from the dataframe
entity = Model.generate_entity(df, "customEntity")

# Add the generated entity to the model
m.add_entity(entity)

# Add a model-level annotation (annotations can also be added at the
# entity and attribute levels)
Model.add_annotation("modelJsonAnnotation", "modelJsonAnnotationValue", m)

# Create an ADLSWriter to write into ADLS
writer = ADLSWriter("ACCOUNT_NAME", "ACCOUNT_KEY",
                    "CONTAINER_NAME", "STORAGE_NAME", "DATAFLOW_NAME")

# Write the data as well as model.json to ADLS storage
m.write_to_storage("customEntity", df, writer)
```
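Reading follows the same pattern with a Reader object. The sketch below is illustrative only: the `ADLSReader` class and `read_from_storage` method are assumed names, not confirmed against the package, so check skypoint_python_cdm.py for the actual API.
```python
# Illustrative read sketch -- ADLSReader and read_from_storage are assumed
# names, not confirmed against the package; see skypoint_python_cdm.py.
reader = ADLSReader("ACCOUNT_NAME", "ACCOUNT_KEY",
                    "CONTAINER_NAME", "STORAGE_NAME", "DATAFLOW_NAME")

# Read the entity's data back from ADLS as a pandas dataframe
df_read = m.read_from_storage("customEntity", reader)
print(df_read.head())
```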
## Contributing
This project welcomes contributions and suggestions.
## References
- [Model.json version 1 schema](https://github.com/microsoft/CDM/blob/master/docs/schema/modeljsonschema.json)
- [A clean implementation of Python objects from/to a model.json file](https://github.com/Azure-Samples/cdm-azure-data-services-integration/blob/master/CDM/python/CdmModel.py)
%package help
Summary: Development documents and examples for cdm-connector
Provides: python3-cdm-connector-doc
%description help
# skypoint-python-cdm-connector
Python Spark CDM Connector by SkyPoint.
An Apache Spark connector for the Microsoft Azure Common Data Model. Reading and writing are supported; the connector is a work in progress, so please file issues for any bugs you find.
For more information about the Azure Common Data Model, check out [this page](https://docs.microsoft.com/en-us/common-data-model/data-lake).
We support Azure Data Lake Storage (ADLS) and AWS S3 as storage backends, historical data preservation through snapshots of the schema and data files, and usage within PySpark, Azure Functions, and similar environments.
*Upcoming: support for incremental data refresh handling, [CDM 1.1](https://docs.microsoft.com/en-us/common-data-model/cdm-manifest), and Google Cloud Storage.*
## Example
1. See the sample usage file skypoint_python_cdm.py.
2. Dynamically add or remove entities, annotations, and attributes.
3. Pass a Reader or Writer object for whichever storage account you want to read data from or write data to.
4. The code below shows a basic write example; a sketch of the read path follows it.
```python
from datetime import datetime

import pandas as pd

# Model and ADLSWriter are provided by this package; see
# skypoint_python_cdm.py for the exact import path.

# Initialize an empty model
m = Model()

# Sample dataframe
data = {"country": ["Brazil", "Russia", "India", "China", "South Africa", "ParaSF"],
        "currentTime": [datetime.now(), datetime.now(), datetime.now(),
                        datetime.now(), datetime.now(), datetime.now()],
        "area": [8.516, 17.10, 3.286, 9.597, 1.221, 2.222],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria", "ParaSF"],
        "population": [200.4, 143.5, 1252, 1357, 52.98, 12.34]}
df = pd.DataFrame(data)

# Generate an entity from the dataframe
entity = Model.generate_entity(df, "customEntity")

# Add the generated entity to the model
m.add_entity(entity)

# Add a model-level annotation (annotations can also be added at the
# entity and attribute levels)
Model.add_annotation("modelJsonAnnotation", "modelJsonAnnotationValue", m)

# Create an ADLSWriter to write into ADLS
writer = ADLSWriter("ACCOUNT_NAME", "ACCOUNT_KEY",
                    "CONTAINER_NAME", "STORAGE_NAME", "DATAFLOW_NAME")

# Write the data as well as model.json to ADLS storage
m.write_to_storage("customEntity", df, writer)
```
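Reading follows the same pattern with a Reader object. The sketch below is illustrative only: the `ADLSReader` class and `read_from_storage` method are assumed names, not confirmed against the package, so check skypoint_python_cdm.py for the actual API.
```python
# Illustrative read sketch -- ADLSReader and read_from_storage are assumed
# names, not confirmed against the package; see skypoint_python_cdm.py.
reader = ADLSReader("ACCOUNT_NAME", "ACCOUNT_KEY",
                    "CONTAINER_NAME", "STORAGE_NAME", "DATAFLOW_NAME")

# Read the entity's data back from ADLS as a pandas dataframe
df_read = m.read_from_storage("customEntity", reader)
print(df_read.head())
```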
## Contributing
This project welcomes contributions and suggestions.
## References
- [Model.json version 1 schema](https://github.com/microsoft/CDM/blob/master/docs/schema/modeljsonschema.json)
- [A clean implementation of Python objects from/to a model.json file](https://github.com/Azure-Samples/cdm-azure-data-services-integration/blob/master/CDM/python/CdmModel.py)
%prep
%autosetup -n cdm-connector-0.0.6.70
%build
%py3_build
%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .
%files -n python3-cdm-connector -f filelist.lst
%dir %{python3_sitelib}/*
%files help -f doclist.lst
%{_docdir}/*
%changelog
* Wed May 10 2023 Python_Bot - 0.0.6.70-1
- Package Spec generated