diff options
author | CoprDistGit <infra@openeuler.org> | 2023-05-10 07:18:59 +0000 |
---|---|---|
committer | CoprDistGit <infra@openeuler.org> | 2023-05-10 07:18:59 +0000 |
commit | addc4bfddaf4d6648dbbf847a2ee24241ad140ab (patch) | |
tree | 817f556ce592ccaef6b7d749423e76193d106cad /python-easyexplore.spec | |
parent | 99b414d74f4fc37b6726cca4d30280047dc17f79 (diff) |
automatic import of python-easyexplore
Diffstat (limited to 'python-easyexplore.spec')
-rw-r--r-- | python-easyexplore.spec | 576 |
1 files changed, 576 insertions, 0 deletions
diff --git a/python-easyexplore.spec b/python-easyexplore.spec new file mode 100644 index 0000000..83db513 --- /dev/null +++ b/python-easyexplore.spec @@ -0,0 +1,576 @@ +%global _empty_manifest_terminate_build 0 +Name: python-easyexplore +Version: 0.7.4 +Release: 1 +Summary: Toolbox for easy and effective data exploration +License: GNU +URL: https://github.com/GianniBalistreri/easyexplore +Source0: https://mirrors.nju.edu.cn/pypi/web/packages/94/d1/ef8855b0162a01a248c5d82e7820cba6997877eeca9cd537cfe405307ac7/easyexplore-0.7.4.tar.gz +BuildArch: noarch + +Requires: python3-boto3 +Requires: python3-dask +Requires: python3-fsspec +Requires: python3-geojson +Requires: python3-google-cloud-storage +Requires: python3-ipywidgets +Requires: python3-joblib +Requires: python3-kaleido +Requires: python3-networkx +Requires: python3-numpy +Requires: python3-pandas +Requires: python3-plotly +Requires: python3-pyarrow +Requires: python3-pyLDAvis +Requires: python3-pyod +Requires: python3-s3fs +Requires: python3-scikit-image +Requires: python3-scipy +Requires: python3-scikit-learn +Requires: python3-sqlalchemy +Requires: python3-statsmodels +Requires: python3-toolz +Requires: python3-wheel +Requires: python3-xlrd + +%description +# EasyExplore + +## Description: +Toolbox for easy and effective data exploration in Python. It is designed to work with Jupyter notebooks especially, but it can also be used in any python module. + +## Table of Content: +1. Installation +2. Requirements +3. Introduction + - Practical Usage + - Utilities + - DataImporter + - DataExporter + - DataExplorer + - DataVisualizer + + +## 1. Installation: +You can easily install EasyExplore via pip install easyexplore on every operating system. + +## 2. Requirements: + - dask>=2.23.0 + - geojson>=2.5.0 + - ipywidgets>=0.5.1 + - joblib>=0.14.1 + - networkx>=2.2 + - numpy>=1.18.1 + - pandas>=1.1.0 + - plotly>=4.5.4 + - pyod>=0.7.7.1 + - psutil>=5.5.1 + - scipy>=1.4.1 + - scikit-learn>=0.23.1 + - sqlalchemy>=1.3.15 + - statsmodels>=0.9.0 + - wheel>=0.35.1 + - xlrd>=1.2.0 + +## 3. Introduction: + - Practical Usage: + + EasyExplore is designed as a wrapper which helps Data Scientists to explore data more convinient and efficient. + + - Data Importer: + + You can easily import data set from several files as well as databases into a Pandas or dask DataFrame. + + - Data Exporter: + + You can easily import data set from Pandas DataFrame or other data objects into several files or databases. + + - Data Explorer: + + Explore your data set quickly and efficiently using the DataExplorer: + + -- Data Typing: + + Check whether represented data types of Pandas is equal to the real data types occuring in the data + + -- Data Health Check: + + Check the health of the data set in order to detecting, describing and visualizing ... + ... the ammount of missing or invalid data vs. valid observations + ... the amount of duplicated data + ... the amount of invariant data + + -- Data Distribution: + + Describing and visualizing statistical distribution of ... + ... categorical features + ... continuous features + ... date features + + -- Outlier Detection: + + Analyze outliers or anomalies of continuous features using univariate and multivariate methods: + a) Univariate: Examines outlier values for each features separately using Inter-Quantile-Range (IQR) + b) Multivarite: Examines outliers for each possible feature pair combined using a bunch of different machine learning algorithms. For further information just look at the PyOD packages documentation, because it is used under the hood. + + -- Categorical Breakdown Statistics: + + Descriptive statistics of continuous features grouped by values of each categorical feature in the data set: + + + -- Correlation: + + Correlation analysis of continuous features. For analyzing multi-collinearity there is a partial correlation method implemented. The differences between marginal and partial correlations are inspected by visualizing the differences of the coefficients in a heat map as well. + + -- Geo Statistics: + + Descriptive statistics of continuous features grouped by values of each geo features in the data set. Additionally, there is a geo map (OpenStreetMap) generated to visualize statistical distribution. + + -- Text Analyzer: + + Analyze potential text features and generate various numerical features from those + +- Data Visualizer: + +Visualize your data set very easily using Plot.ly an interactive visualization library under the hood. The DataVisualizer is an efficient wrapper to abstract the most important elements for data exploration: + + -- Table Chart: + Visualize matrix (Pandas DataFrame) as an interactive table + + -- Heat Map: + Visualize value range of continuous features as heat map + + -- Geo Map: + Visualize statistics of categorical and continuous features as interactive OpenStreetMap + + -- Contour Chart: + Visualize value ranges of at least two continuous features as contours + + -- Pie Chart: + Visualize occurances of values of categorical features as an interactive pie chart + + -- Bar Chart: + Visualize occurances of values of categorical features as an interactive bar chart + + -- Histogram: + Visualize distribution of continuous features as an interactive histogram + + -- Box-Whisker-Plot: + Visualize descriptive statistics of continuous features as an interactive box-whisker-plot + + -- Violin Chart: + Visualize descriptive statistics of continuous features as an interactive violin chart + + -- Parallel Category Chart: + Visualize relationships interactively between categorical features especially, but it can also be used for mixed relations between values of categorical and continuous features by using brushing as well. + + -- Parallel Coordinate Chart: + Visualize relationships interactively between ranges of continuous features especially, but it can also be used for mixed relations between values of categorical and ranges of continuous features as well. + + -- Scatter Chart: + Visualize values of continuous features interactively. + + -- Scatter3D Chart: + Visualize values of three continuous features in one chart interactively. + + -- Joint Distribution Chart: + Visualize values of two continuous features interactively, including contours and histogram for each continuous feature. + + -- Ridgeline Chart: + Visualize changes in distribution of continuous features on certain time steps separately. + + -- Line Chart: + Visualize distribution after certain time steps as an interactive line chart. + + -- Candlestick Chart: + Visualize descritive statistics for each time steps as an interactive candlestick chart. + + -- Dendrogram: + Visualize hierarchical clusters. + + -- Silhoutte Chart: + Visualize partitionized clusters. + +## 4. Examples: + +Check the jupyter notebook for examples. Happy exploring :) + + +%package -n python3-easyexplore +Summary: Toolbox for easy and effective data exploration +Provides: python-easyexplore +BuildRequires: python3-devel +BuildRequires: python3-setuptools +BuildRequires: python3-pip +%description -n python3-easyexplore +# EasyExplore + +## Description: +Toolbox for easy and effective data exploration in Python. It is designed to work with Jupyter notebooks especially, but it can also be used in any python module. + +## Table of Content: +1. Installation +2. Requirements +3. Introduction + - Practical Usage + - Utilities + - DataImporter + - DataExporter + - DataExplorer + - DataVisualizer + + +## 1. Installation: +You can easily install EasyExplore via pip install easyexplore on every operating system. + +## 2. Requirements: + - dask>=2.23.0 + - geojson>=2.5.0 + - ipywidgets>=0.5.1 + - joblib>=0.14.1 + - networkx>=2.2 + - numpy>=1.18.1 + - pandas>=1.1.0 + - plotly>=4.5.4 + - pyod>=0.7.7.1 + - psutil>=5.5.1 + - scipy>=1.4.1 + - scikit-learn>=0.23.1 + - sqlalchemy>=1.3.15 + - statsmodels>=0.9.0 + - wheel>=0.35.1 + - xlrd>=1.2.0 + +## 3. Introduction: + - Practical Usage: + + EasyExplore is designed as a wrapper which helps Data Scientists to explore data more convinient and efficient. + + - Data Importer: + + You can easily import data set from several files as well as databases into a Pandas or dask DataFrame. + + - Data Exporter: + + You can easily import data set from Pandas DataFrame or other data objects into several files or databases. + + - Data Explorer: + + Explore your data set quickly and efficiently using the DataExplorer: + + -- Data Typing: + + Check whether represented data types of Pandas is equal to the real data types occuring in the data + + -- Data Health Check: + + Check the health of the data set in order to detecting, describing and visualizing ... + ... the ammount of missing or invalid data vs. valid observations + ... the amount of duplicated data + ... the amount of invariant data + + -- Data Distribution: + + Describing and visualizing statistical distribution of ... + ... categorical features + ... continuous features + ... date features + + -- Outlier Detection: + + Analyze outliers or anomalies of continuous features using univariate and multivariate methods: + a) Univariate: Examines outlier values for each features separately using Inter-Quantile-Range (IQR) + b) Multivarite: Examines outliers for each possible feature pair combined using a bunch of different machine learning algorithms. For further information just look at the PyOD packages documentation, because it is used under the hood. + + -- Categorical Breakdown Statistics: + + Descriptive statistics of continuous features grouped by values of each categorical feature in the data set: + + + -- Correlation: + + Correlation analysis of continuous features. For analyzing multi-collinearity there is a partial correlation method implemented. The differences between marginal and partial correlations are inspected by visualizing the differences of the coefficients in a heat map as well. + + -- Geo Statistics: + + Descriptive statistics of continuous features grouped by values of each geo features in the data set. Additionally, there is a geo map (OpenStreetMap) generated to visualize statistical distribution. + + -- Text Analyzer: + + Analyze potential text features and generate various numerical features from those + +- Data Visualizer: + +Visualize your data set very easily using Plot.ly an interactive visualization library under the hood. The DataVisualizer is an efficient wrapper to abstract the most important elements for data exploration: + + -- Table Chart: + Visualize matrix (Pandas DataFrame) as an interactive table + + -- Heat Map: + Visualize value range of continuous features as heat map + + -- Geo Map: + Visualize statistics of categorical and continuous features as interactive OpenStreetMap + + -- Contour Chart: + Visualize value ranges of at least two continuous features as contours + + -- Pie Chart: + Visualize occurances of values of categorical features as an interactive pie chart + + -- Bar Chart: + Visualize occurances of values of categorical features as an interactive bar chart + + -- Histogram: + Visualize distribution of continuous features as an interactive histogram + + -- Box-Whisker-Plot: + Visualize descriptive statistics of continuous features as an interactive box-whisker-plot + + -- Violin Chart: + Visualize descriptive statistics of continuous features as an interactive violin chart + + -- Parallel Category Chart: + Visualize relationships interactively between categorical features especially, but it can also be used for mixed relations between values of categorical and continuous features by using brushing as well. + + -- Parallel Coordinate Chart: + Visualize relationships interactively between ranges of continuous features especially, but it can also be used for mixed relations between values of categorical and ranges of continuous features as well. + + -- Scatter Chart: + Visualize values of continuous features interactively. + + -- Scatter3D Chart: + Visualize values of three continuous features in one chart interactively. + + -- Joint Distribution Chart: + Visualize values of two continuous features interactively, including contours and histogram for each continuous feature. + + -- Ridgeline Chart: + Visualize changes in distribution of continuous features on certain time steps separately. + + -- Line Chart: + Visualize distribution after certain time steps as an interactive line chart. + + -- Candlestick Chart: + Visualize descritive statistics for each time steps as an interactive candlestick chart. + + -- Dendrogram: + Visualize hierarchical clusters. + + -- Silhoutte Chart: + Visualize partitionized clusters. + +## 4. Examples: + +Check the jupyter notebook for examples. Happy exploring :) + + +%package help +Summary: Development documents and examples for easyexplore +Provides: python3-easyexplore-doc +%description help +# EasyExplore + +## Description: +Toolbox for easy and effective data exploration in Python. It is designed to work with Jupyter notebooks especially, but it can also be used in any python module. + +## Table of Content: +1. Installation +2. Requirements +3. Introduction + - Practical Usage + - Utilities + - DataImporter + - DataExporter + - DataExplorer + - DataVisualizer + + +## 1. Installation: +You can easily install EasyExplore via pip install easyexplore on every operating system. + +## 2. Requirements: + - dask>=2.23.0 + - geojson>=2.5.0 + - ipywidgets>=0.5.1 + - joblib>=0.14.1 + - networkx>=2.2 + - numpy>=1.18.1 + - pandas>=1.1.0 + - plotly>=4.5.4 + - pyod>=0.7.7.1 + - psutil>=5.5.1 + - scipy>=1.4.1 + - scikit-learn>=0.23.1 + - sqlalchemy>=1.3.15 + - statsmodels>=0.9.0 + - wheel>=0.35.1 + - xlrd>=1.2.0 + +## 3. Introduction: + - Practical Usage: + + EasyExplore is designed as a wrapper which helps Data Scientists to explore data more convinient and efficient. + + - Data Importer: + + You can easily import data set from several files as well as databases into a Pandas or dask DataFrame. + + - Data Exporter: + + You can easily import data set from Pandas DataFrame or other data objects into several files or databases. + + - Data Explorer: + + Explore your data set quickly and efficiently using the DataExplorer: + + -- Data Typing: + + Check whether represented data types of Pandas is equal to the real data types occuring in the data + + -- Data Health Check: + + Check the health of the data set in order to detecting, describing and visualizing ... + ... the ammount of missing or invalid data vs. valid observations + ... the amount of duplicated data + ... the amount of invariant data + + -- Data Distribution: + + Describing and visualizing statistical distribution of ... + ... categorical features + ... continuous features + ... date features + + -- Outlier Detection: + + Analyze outliers or anomalies of continuous features using univariate and multivariate methods: + a) Univariate: Examines outlier values for each features separately using Inter-Quantile-Range (IQR) + b) Multivarite: Examines outliers for each possible feature pair combined using a bunch of different machine learning algorithms. For further information just look at the PyOD packages documentation, because it is used under the hood. + + -- Categorical Breakdown Statistics: + + Descriptive statistics of continuous features grouped by values of each categorical feature in the data set: + + + -- Correlation: + + Correlation analysis of continuous features. For analyzing multi-collinearity there is a partial correlation method implemented. The differences between marginal and partial correlations are inspected by visualizing the differences of the coefficients in a heat map as well. + + -- Geo Statistics: + + Descriptive statistics of continuous features grouped by values of each geo features in the data set. Additionally, there is a geo map (OpenStreetMap) generated to visualize statistical distribution. + + -- Text Analyzer: + + Analyze potential text features and generate various numerical features from those + +- Data Visualizer: + +Visualize your data set very easily using Plot.ly an interactive visualization library under the hood. The DataVisualizer is an efficient wrapper to abstract the most important elements for data exploration: + + -- Table Chart: + Visualize matrix (Pandas DataFrame) as an interactive table + + -- Heat Map: + Visualize value range of continuous features as heat map + + -- Geo Map: + Visualize statistics of categorical and continuous features as interactive OpenStreetMap + + -- Contour Chart: + Visualize value ranges of at least two continuous features as contours + + -- Pie Chart: + Visualize occurances of values of categorical features as an interactive pie chart + + -- Bar Chart: + Visualize occurances of values of categorical features as an interactive bar chart + + -- Histogram: + Visualize distribution of continuous features as an interactive histogram + + -- Box-Whisker-Plot: + Visualize descriptive statistics of continuous features as an interactive box-whisker-plot + + -- Violin Chart: + Visualize descriptive statistics of continuous features as an interactive violin chart + + -- Parallel Category Chart: + Visualize relationships interactively between categorical features especially, but it can also be used for mixed relations between values of categorical and continuous features by using brushing as well. + + -- Parallel Coordinate Chart: + Visualize relationships interactively between ranges of continuous features especially, but it can also be used for mixed relations between values of categorical and ranges of continuous features as well. + + -- Scatter Chart: + Visualize values of continuous features interactively. + + -- Scatter3D Chart: + Visualize values of three continuous features in one chart interactively. + + -- Joint Distribution Chart: + Visualize values of two continuous features interactively, including contours and histogram for each continuous feature. + + -- Ridgeline Chart: + Visualize changes in distribution of continuous features on certain time steps separately. + + -- Line Chart: + Visualize distribution after certain time steps as an interactive line chart. + + -- Candlestick Chart: + Visualize descritive statistics for each time steps as an interactive candlestick chart. + + -- Dendrogram: + Visualize hierarchical clusters. + + -- Silhoutte Chart: + Visualize partitionized clusters. + +## 4. Examples: + +Check the jupyter notebook for examples. Happy exploring :) + + +%prep +%autosetup -n easyexplore-0.7.4 + +%build +%py3_build + +%install +%py3_install +install -d -m755 %{buildroot}/%{_pkgdocdir} +if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi +if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi +if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi +if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi +pushd %{buildroot} +if [ -d usr/lib ]; then + find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst +fi +if [ -d usr/lib64 ]; then + find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst +fi +if [ -d usr/bin ]; then + find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst +fi +if [ -d usr/sbin ]; then + find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst +fi +touch doclist.lst +if [ -d usr/share/man ]; then + find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst +fi +popd +mv %{buildroot}/filelist.lst . +mv %{buildroot}/doclist.lst . + +%files -n python3-easyexplore -f filelist.lst +%dir %{python3_sitelib}/* + +%files help -f doclist.lst +%{_docdir}/* + +%changelog +* Wed May 10 2023 Python_Bot <Python_Bot@openeuler.org> - 0.7.4-1 +- Package Spec generated |