1 files changed, 576 insertions, 0 deletions
diff --git a/python-easyexplore.spec b/python-easyexplore.spec
new file mode 100644
index 0000000..83db513
--- /dev/null
+++ b/python-easyexplore.spec
@@ -0,0 +1,576 @@
+%global _empty_manifest_terminate_build 0
+Name:		python-easyexplore
+Version:	0.7.4
+Release:	1
+Summary:	Toolbox for easy and effective data exploration
+License:	GNU
+URL:		https://github.com/GianniBalistreri/easyexplore
+Source0:	https://mirrors.nju.edu.cn/pypi/web/packages/94/d1/ef8855b0162a01a248c5d82e7820cba6997877eeca9cd537cfe405307ac7/easyexplore-0.7.4.tar.gz
+BuildArch:	noarch
+
+Requires:	python3-boto3
+Requires:	python3-dask
+Requires:	python3-fsspec
+Requires:	python3-geojson
+Requires:	python3-google-cloud-storage
+Requires:	python3-ipywidgets
+Requires:	python3-joblib
+Requires:	python3-kaleido
+Requires:	python3-networkx
+Requires:	python3-numpy
+Requires:	python3-pandas
+Requires:	python3-plotly
+Requires:	python3-pyarrow
+Requires:	python3-pyLDAvis
+Requires:	python3-pyod
+Requires:	python3-s3fs
+Requires:	python3-scikit-image
+Requires:	python3-scipy
+Requires:	python3-scikit-learn
+Requires:	python3-sqlalchemy
+Requires:	python3-statsmodels
+Requires:	python3-toolz
+Requires:	python3-wheel
+Requires:	python3-xlrd
+
+%description
+# EasyExplore
+
+## Description:
+Toolbox for easy and effective data exploration in Python. It is designed primarily for Jupyter notebooks, but it can also be used in any Python module.
+
+## Table of Contents:
+1. Installation
+2. Requirements
+3. Introduction
+    - Practical Usage
+    - Utilities
+        - DataImporter
+        - DataExporter
+    - DataExplorer
+    - DataVisualizer
+
+
+## 1. Installation:
+You can easily install EasyExplore via pip install easyexplore on every operating system.
+
+## 2. Requirements:
+ - dask>=2.23.0
+ - geojson>=2.5.0
+ - ipywidgets>=0.5.1
+ - joblib>=0.14.1
+ - networkx>=2.2
+ - numpy>=1.18.1
+ - pandas>=1.1.0
+ - plotly>=4.5.4
+ - pyod>=0.7.7.1
+ - psutil>=5.5.1
+ - scipy>=1.4.1
+ - scikit-learn>=0.23.1
+ - sqlalchemy>=1.3.15
+ - statsmodels>=0.9.0
+ - wheel>=0.35.1
+ - xlrd>=1.2.0
+
+## 3. Introduction:
+ - Practical Usage:
+ 
+ EasyExplore is designed as a wrapper which helps Data Scientists to explore data more conveniently and efficiently.
+ 
+ - Data Importer:
+ 
+ You can easily import a data set from several files as well as databases into a Pandas or dask DataFrame.
+ 
+ - Data Exporter:
+ 
+ You can easily export a data set from a Pandas DataFrame or other data objects into several files or databases.
+ 
+ - Data Explorer:
+ 
+ Explore your data set quickly and efficiently using the DataExplorer:
+
+    -- Data Typing:
+
+        Check whether the data types represented by Pandas match the real data types occurring in the data
+
+    -- Data Health Check:
+
+        Check the health of the data set in order to detect, describe and visualize ...
+            ... the amount of missing or invalid data vs. valid observations
+            ... the amount of duplicated data
+            ... the amount of invariant data
+
+    -- Data Distribution:
+
+        Describing and visualizing statistical distribution of ...
+            ... categorical features
+            ... continuous features
+            ... date features
+
+    -- Outlier Detection:
+
+        Analyze outliers or anomalies of continuous features using univariate and multivariate methods:
+            a) Univariate: Examines outlier values for each feature separately using the interquartile range (IQR)
+            b) Multivariate: Examines outliers for each possible feature pair combined, using a range of machine learning algorithms. For further information, see the documentation of the PyOD package, which is used under the hood.
+
+    -- Categorical Breakdown Statistics:
+
+        Descriptive statistics of continuous features grouped by values of each categorical feature in the data set:
+
+
+    -- Correlation:
+
+        Correlation analysis of continuous features. For analyzing multi-collinearity there is a partial correlation method implemented. The differences between marginal and partial correlations are inspected by visualizing the differences of the coefficients in a heat map as well.
+
+    -- Geo Statistics:
+
+        Descriptive statistics of continuous features grouped by values of each geo feature in the data set. Additionally, a geo map (OpenStreetMap) is generated to visualize the statistical distribution.
+
+    -- Text Analyzer:
+
+        Analyze potential text features and generate various numerical features from those
+
+- Data Visualizer:
+
+Visualize your data set very easily using Plot.ly, an interactive visualization library, under the hood. The DataVisualizer is an efficient wrapper to abstract the most important elements for data exploration:
+
+    -- Table Chart:
+        Visualize matrix (Pandas DataFrame) as an interactive table
+
+    -- Heat Map:
+        Visualize value range of continuous features as heat map
+
+    -- Geo Map:
+        Visualize statistics of categorical and continuous features as interactive OpenStreetMap
+
+    -- Contour Chart:
+        Visualize value ranges of at least two continuous features as contours
+
+    -- Pie Chart:
+        Visualize occurrences of values of categorical features as an interactive pie chart
+
+    -- Bar Chart:
+        Visualize occurrences of values of categorical features as an interactive bar chart
+
+    -- Histogram:
+        Visualize distribution of continuous features as an interactive histogram
+
+    -- Box-Whisker-Plot:
+        Visualize descriptive statistics of continuous features as an interactive box-whisker-plot
+
+    -- Violin Chart:
+        Visualize descriptive statistics of continuous features as an interactive violin chart
+
+    -- Parallel Category Chart:
+        Visualize relationships between categorical features interactively. It can also be used for mixed relations between values of categorical and continuous features by using brushing.
+
+    -- Parallel Coordinate Chart:
+        Visualize relationships between ranges of continuous features interactively. It can also be used for mixed relations between values of categorical and ranges of continuous features.
+
+    -- Scatter Chart:
+        Visualize values of continuous features interactively.
+
+    -- Scatter3D Chart:
+        Visualize values of three continuous features in one chart interactively.
+
+    -- Joint Distribution Chart:
+        Visualize values of two continuous features interactively, including contours and histogram for each continuous feature.
+
+    -- Ridgeline Chart:
+        Visualize changes in distribution of continuous features on certain time steps separately.
+
+    -- Line Chart:
+        Visualize distribution after certain time steps as an interactive line chart.
+
+    -- Candlestick Chart:
+        Visualize descriptive statistics for each time step as an interactive candlestick chart.
+
+    -- Dendrogram:
+        Visualize hierarchical clusters.
+
+    -- Silhouette Chart:
+        Visualize partitioned clusters.
+
+## 4. Examples:
+
+Check the Jupyter notebook for examples. Happy exploring :)
+
+
+%package -n python3-easyexplore
+Summary:	Toolbox for easy and effective data exploration
+Provides:	python-easyexplore
+BuildRequires:	python3-devel
+BuildRequires:	python3-setuptools
+BuildRequires:	python3-pip
+%description -n python3-easyexplore
+# EasyExplore
+
+## Description:
+Toolbox for easy and effective data exploration in Python. It is designed primarily for Jupyter notebooks, but it can also be used in any Python module.
+
+## Table of Contents:
+1. Installation
+2. Requirements
+3. Introduction
+    - Practical Usage
+    - Utilities
+        - DataImporter
+        - DataExporter
+    - DataExplorer
+    - DataVisualizer
+
+
+## 1. Installation:
+You can easily install EasyExplore via pip install easyexplore on every operating system.
+
+## 2. Requirements:
+ - dask>=2.23.0
+ - geojson>=2.5.0
+ - ipywidgets>=0.5.1
+ - joblib>=0.14.1
+ - networkx>=2.2
+ - numpy>=1.18.1
+ - pandas>=1.1.0
+ - plotly>=4.5.4
+ - pyod>=0.7.7.1
+ - psutil>=5.5.1
+ - scipy>=1.4.1
+ - scikit-learn>=0.23.1
+ - sqlalchemy>=1.3.15
+ - statsmodels>=0.9.0
+ - wheel>=0.35.1
+ - xlrd>=1.2.0
+
+## 3. Introduction:
+ - Practical Usage:
+ 
+ EasyExplore is designed as a wrapper which helps Data Scientists to explore data more conveniently and efficiently.
+ 
+ - Data Importer:
+ 
+ You can easily import a data set from several files as well as databases into a Pandas or dask DataFrame.
+ 
+ - Data Exporter:
+ 
+ You can easily export a data set from a Pandas DataFrame or other data objects into several files or databases.
+ 
+ - Data Explorer:
+ 
+ Explore your data set quickly and efficiently using the DataExplorer:
+
+    -- Data Typing:
+
+        Check whether the data types represented by Pandas match the real data types occurring in the data
+
+    -- Data Health Check:
+
+        Check the health of the data set in order to detect, describe and visualize ...
+            ... the amount of missing or invalid data vs. valid observations
+            ... the amount of duplicated data
+            ... the amount of invariant data
+
+    -- Data Distribution:
+
+        Describing and visualizing statistical distribution of ...
+            ... categorical features
+            ... continuous features
+            ... date features
+
+    -- Outlier Detection:
+
+        Analyze outliers or anomalies of continuous features using univariate and multivariate methods:
+            a) Univariate: Examines outlier values for each feature separately using the interquartile range (IQR)
+            b) Multivariate: Examines outliers for each possible feature pair combined, using a range of machine learning algorithms. For further information, see the documentation of the PyOD package, which is used under the hood.
+
+    -- Categorical Breakdown Statistics:
+
+        Descriptive statistics of continuous features grouped by values of each categorical feature in the data set:
+
+
+    -- Correlation:
+
+        Correlation analysis of continuous features. For analyzing multi-collinearity there is a partial correlation method implemented. The differences between marginal and partial correlations are inspected by visualizing the differences of the coefficients in a heat map as well.
+
+    -- Geo Statistics:
+
+        Descriptive statistics of continuous features grouped by values of each geo feature in the data set. Additionally, a geo map (OpenStreetMap) is generated to visualize the statistical distribution.
+
+    -- Text Analyzer:
+
+        Analyze potential text features and generate various numerical features from those
+
+- Data Visualizer:
+
+Visualize your data set very easily using Plot.ly, an interactive visualization library, under the hood. The DataVisualizer is an efficient wrapper to abstract the most important elements for data exploration:
+
+    -- Table Chart:
+        Visualize matrix (Pandas DataFrame) as an interactive table
+
+    -- Heat Map:
+        Visualize value range of continuous features as heat map
+
+    -- Geo Map:
+        Visualize statistics of categorical and continuous features as interactive OpenStreetMap
+
+    -- Contour Chart:
+        Visualize value ranges of at least two continuous features as contours
+
+    -- Pie Chart:
+        Visualize occurrences of values of categorical features as an interactive pie chart
+
+    -- Bar Chart:
+        Visualize occurrences of values of categorical features as an interactive bar chart
+
+    -- Histogram:
+        Visualize distribution of continuous features as an interactive histogram
+
+    -- Box-Whisker-Plot:
+        Visualize descriptive statistics of continuous features as an interactive box-whisker-plot
+
+    -- Violin Chart:
+        Visualize descriptive statistics of continuous features as an interactive violin chart
+
+    -- Parallel Category Chart:
+        Visualize relationships between categorical features interactively. It can also be used for mixed relations between values of categorical and continuous features by using brushing.
+
+    -- Parallel Coordinate Chart:
+        Visualize relationships between ranges of continuous features interactively. It can also be used for mixed relations between values of categorical and ranges of continuous features.
+
+    -- Scatter Chart:
+        Visualize values of continuous features interactively.
+
+    -- Scatter3D Chart:
+        Visualize values of three continuous features in one chart interactively.
+
+    -- Joint Distribution Chart:
+        Visualize values of two continuous features interactively, including contours and histogram for each continuous feature.
+
+    -- Ridgeline Chart:
+        Visualize changes in distribution of continuous features on certain time steps separately.
+
+    -- Line Chart:
+        Visualize distribution after certain time steps as an interactive line chart.
+
+    -- Candlestick Chart:
+        Visualize descriptive statistics for each time step as an interactive candlestick chart.
+
+    -- Dendrogram:
+        Visualize hierarchical clusters.
+
+    -- Silhouette Chart:
+        Visualize partitioned clusters.
+
+## 4. Examples:
+
+Check the Jupyter notebook for examples. Happy exploring :)
+
+
+%package help
+Summary:	Development documents and examples for easyexplore
+Provides:	python3-easyexplore-doc
+%description help
+# EasyExplore
+
+## Description:
+Toolbox for easy and effective data exploration in Python. It is designed primarily for Jupyter notebooks, but it can also be used in any Python module.
+
+## Table of Contents:
+1. Installation
+2. Requirements
+3. Introduction
+    - Practical Usage
+    - Utilities
+        - DataImporter
+        - DataExporter
+    - DataExplorer
+    - DataVisualizer
+
+
+## 1. Installation:
+You can easily install EasyExplore via pip install easyexplore on every operating system.
+
+## 2. Requirements:
+ - dask>=2.23.0
+ - geojson>=2.5.0
+ - ipywidgets>=0.5.1
+ - joblib>=0.14.1
+ - networkx>=2.2
+ - numpy>=1.18.1
+ - pandas>=1.1.0
+ - plotly>=4.5.4
+ - pyod>=0.7.7.1
+ - psutil>=5.5.1
+ - scipy>=1.4.1
+ - scikit-learn>=0.23.1
+ - sqlalchemy>=1.3.15
+ - statsmodels>=0.9.0
+ - wheel>=0.35.1
+ - xlrd>=1.2.0
+
+## 3. Introduction:
+ - Practical Usage:
+ 
+ EasyExplore is designed as a wrapper which helps Data Scientists to explore data more conveniently and efficiently.
+ 
+ - Data Importer:
+ 
+ You can easily import a data set from several files as well as databases into a Pandas or dask DataFrame.
+ 
+ - Data Exporter:
+ 
+ You can easily export a data set from a Pandas DataFrame or other data objects into several files or databases.
+ 
+ - Data Explorer:
+ 
+ Explore your data set quickly and efficiently using the DataExplorer:
+
+    -- Data Typing:
+
+        Check whether the data types represented by Pandas match the real data types occurring in the data
+
+    -- Data Health Check:
+
+        Check the health of the data set in order to detect, describe and visualize ...
+            ... the amount of missing or invalid data vs. valid observations
+            ... the amount of duplicated data
+            ... the amount of invariant data
+
+    -- Data Distribution:
+
+        Describing and visualizing statistical distribution of ...
+            ... categorical features
+            ... continuous features
+            ... date features
+
+    -- Outlier Detection:
+
+        Analyze outliers or anomalies of continuous features using univariate and multivariate methods:
+            a) Univariate: Examines outlier values for each feature separately using the interquartile range (IQR)
+            b) Multivariate: Examines outliers for each possible feature pair combined, using a range of machine learning algorithms. For further information, see the documentation of the PyOD package, which is used under the hood.
+
+    -- Categorical Breakdown Statistics:
+
+        Descriptive statistics of continuous features grouped by values of each categorical feature in the data set:
+
+
+    -- Correlation:
+
+        Correlation analysis of continuous features. For analyzing multi-collinearity there is a partial correlation method implemented. The differences between marginal and partial correlations are inspected by visualizing the differences of the coefficients in a heat map as well.
+
+    -- Geo Statistics:
+
+        Descriptive statistics of continuous features grouped by values of each geo feature in the data set. Additionally, a geo map (OpenStreetMap) is generated to visualize the statistical distribution.
+
+    -- Text Analyzer:
+
+        Analyze potential text features and generate various numerical features from those
+
+- Data Visualizer:
+
+Visualize your data set very easily using Plot.ly, an interactive visualization library, under the hood. The DataVisualizer is an efficient wrapper to abstract the most important elements for data exploration:
+
+    -- Table Chart:
+        Visualize matrix (Pandas DataFrame) as an interactive table
+
+    -- Heat Map:
+        Visualize value range of continuous features as heat map
+
+    -- Geo Map:
+        Visualize statistics of categorical and continuous features as interactive OpenStreetMap
+
+    -- Contour Chart:
+        Visualize value ranges of at least two continuous features as contours
+
+    -- Pie Chart:
+        Visualize occurrences of values of categorical features as an interactive pie chart
+
+    -- Bar Chart:
+        Visualize occurrences of values of categorical features as an interactive bar chart
+
+    -- Histogram:
+        Visualize distribution of continuous features as an interactive histogram
+
+    -- Box-Whisker-Plot:
+        Visualize descriptive statistics of continuous features as an interactive box-whisker-plot
+
+    -- Violin Chart:
+        Visualize descriptive statistics of continuous features as an interactive violin chart
+
+    -- Parallel Category Chart:
+        Visualize relationships between categorical features interactively. It can also be used for mixed relations between values of categorical and continuous features by using brushing.
+
+    -- Parallel Coordinate Chart:
+        Visualize relationships between ranges of continuous features interactively. It can also be used for mixed relations between values of categorical and ranges of continuous features.
+
+    -- Scatter Chart:
+        Visualize values of continuous features interactively.
+
+    -- Scatter3D Chart:
+        Visualize values of three continuous features in one chart interactively.
+
+    -- Joint Distribution Chart:
+        Visualize values of two continuous features interactively, including contours and histogram for each continuous feature.
+
+    -- Ridgeline Chart:
+        Visualize changes in distribution of continuous features on certain time steps separately.
+
+    -- Line Chart:
+        Visualize distribution after certain time steps as an interactive line chart.
+
+    -- Candlestick Chart:
+        Visualize descriptive statistics for each time step as an interactive candlestick chart.
+
+    -- Dendrogram:
+        Visualize hierarchical clusters.
+
+    -- Silhouette Chart:
+        Visualize partitioned clusters.
+
+## 4. Examples:
+
+Check the Jupyter notebook for examples. Happy exploring :)
+
+
+%prep
+%autosetup -n easyexplore-0.7.4
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+	find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+	find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+	find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+	find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+	find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-easyexplore -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Wed May 10 2023 Python_Bot <Python_Bot@openeuler.org> - 0.7.4-1
+- Package Spec generated