automatic import of python-amplo

author: CoprDistGit <infra@openeuler.org> 2023-04-11 12:40:16 +0000
committer: CoprDistGit <infra@openeuler.org> 2023-04-11 12:40:16 +0000
commit: a5199dbe07dd85bbb6cdb2429621d582f580ca83 (patch)
tree: c5477ef8852b83c65ae9a5dd458c9d72019c1880
parent: 149e8d8e0f1db89d0618c9a4c4694de980891459 (diff)
3 files changed, 591 insertions, 0 deletions
diff --git a/.gitignore b/.gitignore
index e69de29..d358a5d 100644
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1 @@
+/Amplo-0.17.0.tar.gz
diff --git a/python-amplo.spec b/python-amplo.spec
new file mode 100644
index 0000000..76dc57c
--- /dev/null
+++ b/python-amplo.spec
@@ -0,0 +1,589 @@
+%global _empty_manifest_terminate_build 0
+Name:		python-Amplo
+Version:	0.17.0
+Release:	1
+Summary:	Fully automated end to end machine learning pipeline
+License:	GNU General Public License v3 (GPLv3)
+URL:		https://github.com/nielsuit227/AutoML
+Source0:	https://mirrors.nju.edu.cn/pypi/web/packages/4a/40/f178bed9ff3276ccb073ca265efd1672b8901bcb6a16dedd489f8ebf1e84/Amplo-0.17.0.tar.gz
+BuildArch:	noarch
+
+Requires:	python3-azure-core
+Requires:	python3-azure-storage-blob
+Requires:	python3-catboost
+Requires:	python3-cleanlab
+Requires:	python3-colorlog
+Requires:	python3-joblib
+Requires:	python3-lightgbm
+Requires:	python3-numba
+Requires:	python3-numpy
+Requires:	python3-optuna
+Requires:	python3-pandas
+Requires:	python3-polars
+Requires:	python3-pyarrow
+Requires:	python3-pytest
+Requires:	python3-pywavelets
+Requires:	python3-requests
+Requires:	python3-scikit-learn
+Requires:	python3-scipy
+Requires:	python3-setuptools
+Requires:	python3-shap
+Requires:	python3-tqdm
+Requires:	python3-xgboost
+Requires:	python3-flake8
+Requires:	python3-mypy
+Requires:	python3-types-chardet
+Requires:	python3-types-colorama
+Requires:	python3-types-decorator
+Requires:	python3-types-psycopg2
+Requires:	python3-types-Pygments
+Requires:	python3-types-PyMySQL
+Requires:	python3-types-python-dateutil
+Requires:	python3-types-pytz
+Requires:	python3-types-redis
+Requires:	python3-types-requests
+Requires:	python3-types-setuptools
+Requires:	python3-types-six
+Requires:	python3-types-urllib3
+
+%description
+# Amplo - AutoML (for Machine Data)
+
+[![image](https://img.shields.io/pypi/v/amplo.svg)](https://pypi.python.org/pypi/amplo)
+[![PyPI - License](https://img.shields.io/pypi/l/virtualenv?style=flat-square)](https://opensource.org/licenses/MIT)
+![](https://img.shields.io/badge/python-%3E%3D3.9%2C%3C4.0-blue)
+![](https://tokei.rs/b1/github/nielsuit227/automl)
+![](https://img.shields.io/pypi/dm/amplo)
+
+Welcome to the Automated Machine Learning package `amplo`. Amplo's AutoML is designed specifically for machine data and
+works very well with tabular time series data (especially unbalanced classification!).
+
+Though this is a standalone Python package, Amplo's AutoML is also available on Amplo's Smart Maintenance Platform.
+With a graphical user interface and various data connectors, it is the ideal place for service engineers to get started
+on Predictive Maintenance.
+
+Amplo's AutoML Pipeline contains the entire Machine Learning development cycle, including exploratory data analysis,
+data cleaning, feature extraction, feature selection, model selection, hyperparameter optimization, stacking,
+version control, production-ready models and documentation. It comes with additional tools such as interval analysers,
+drift detectors, data quality checks, etc.
+
+## 1. Downloading Amplo
+
+The easiest way is to install our Python package through [PyPI](https://pypi.org/project/amplo/):
+
+```bash
+pip install amplo
+```
+
+## 2. Usage
+
+Usage is very simple with Amplo's AutoML Pipeline.
+
+```python
+from amplo import Pipeline
+from sklearn.datasets import make_classification
+from sklearn.datasets import make_regression
+
+x, y = make_classification()
+pipeline = Pipeline()
+pipeline.fit(x, y)
+yp = pipeline.predict_proba(x)
+
+x, y = make_regression()
+pipeline = Pipeline()
+pipeline.fit(x, y)
+yp = pipeline.predict(x)
+```
+
+## 3. Amplo AutoML Features
+
+### Interval Analyser
+
+```python
+from amplo.automl import IntervalAnalyser
+```
+
+Interval analyser for log file classification. When log files have to be classified and there is not enough
+data for time series methods (such as LSTMs, ROCKET, WEASEL or BOSS), one needs to fall back to classical
+machine learning models, which work better with fewer samples. This raises the problem of which samples to
+classify. Simply classifying every sample and accumulating the results may greatly degrade
+classification performance. Therefore, we introduce this interval analyser. By using an approximate K-Nearest
+Neighbors algorithm, one can estimate the strength of correlation for every sample inside a log. This estimate
+allows for better interval selection for classical machine learning models.
+
+To use this interval analyser, place each log in a folder named after its class, and put all class folders in one parent folder, e.g.:
+
+```text
++-- Parent Folder
+|   +-- Class_1
+|       +-- Log_1.*
+|       +-- Log_2.*
+|   +-- Class_2
+|       +-- Log_3.*
+```
+
+### Data Processing
+
+```python
+from amplo.automl import DataProcessor
+```
+
+Automated Data Cleaning:
+
+- Infers & converts data types (integer, floats, categorical, datetime)
+- Reformats column names
+- Removes duplicate columns and rows
+- Handles missing values by:
+  - Removing columns
+  - Removing rows
+  - Interpolating
+  - Filling with zeros
+- Removes outliers using:
+  - Clipping
+  - Z-score
+  - Quantiles
+- Removes constant columns
+
+### Feature Processing
+
+```python
+from amplo.automl import FeatureProcessor
+```
+
+Automatically extracts and selects features and removes collinear features.
+Included Feature Extraction algorithms:
+
+- Multiplicative Features
+- Dividing Features
+- Additive Features
+- Subtractive Features
+- Trigonometric Features
+- K-Means Features
+- Lagged Features
+- Differencing Features
+- Inverse Features
+- Datetime Features
+
+Included Feature Selection algorithms:
+
+- Random Forest Feature Importance (Threshold and Increment)
+- Predictive Power Score
+
+### Sequencing
+
+```python
+from amplo.automl import Sequencer
+```
+
+For time series regression problems, it is often useful to include multiple previous samples instead of just the latest.
+This class sequences the data based on which time steps you want included in the input and output.
+This is also useful when working with tensors, as it can return a tensor that fits directly into a Recurrent Neural Network.
+
+### Modelling
+
+```python
+from amplo.automl import Modeller
+```
+
+Runs various regression or classification models.
+Includes:
+
+- Scikit's Linear Model
+- Scikit's Random Forest
+- Scikit's Bagging
+- Scikit's GradientBoosting
+- Scikit's HistGradientBoosting
+- DMLC's XGBoost
+- Yandex's CatBoost
+- Microsoft's LightGBM
+- Stacking Models
+
+### Grid Search
+
+```python
+from amplo.grid_search import OptunaGridSearch
+```
+
+Contains a hyperparameter optimizer with extended predefined model parameters:
+
+- Optuna's Tree-structured Parzen Estimator (TPE)
+
+
+%package -n python3-Amplo
+Summary:	Fully automated end to end machine learning pipeline
+Provides:	python-Amplo
+BuildRequires:	python3-devel
+BuildRequires:	python3-setuptools
+BuildRequires:	python3-pip
+%description -n python3-Amplo
+# Amplo - AutoML (for Machine Data)
+
+[![image](https://img.shields.io/pypi/v/amplo.svg)](https://pypi.python.org/pypi/amplo)
+[![PyPI - License](https://img.shields.io/pypi/l/virtualenv?style=flat-square)](https://opensource.org/licenses/MIT)
+![](https://img.shields.io/badge/python-%3E%3D3.9%2C%3C4.0-blue)
+![](https://tokei.rs/b1/github/nielsuit227/automl)
+![](https://img.shields.io/pypi/dm/amplo)
+
+Welcome to the Automated Machine Learning package `amplo`. Amplo's AutoML is designed specifically for machine data and
+works very well with tabular time series data (especially unbalanced classification!).
+
+Though this is a standalone Python package, Amplo's AutoML is also available on Amplo's Smart Maintenance Platform.
+With a graphical user interface and various data connectors, it is the ideal place for service engineers to get started
+on Predictive Maintenance.
+
+Amplo's AutoML Pipeline contains the entire Machine Learning development cycle, including exploratory data analysis,
+data cleaning, feature extraction, feature selection, model selection, hyperparameter optimization, stacking,
+version control, production-ready models and documentation. It comes with additional tools such as interval analysers,
+drift detectors, data quality checks, etc.
+
+## 1. Downloading Amplo
+
+The easiest way is to install our Python package through [PyPI](https://pypi.org/project/amplo/):
+
+```bash
+pip install amplo
+```
+
+## 2. Usage
+
+Usage is very simple with Amplo's AutoML Pipeline.
+
+```python
+from amplo import Pipeline
+from sklearn.datasets import make_classification
+from sklearn.datasets import make_regression
+
+x, y = make_classification()
+pipeline = Pipeline()
+pipeline.fit(x, y)
+yp = pipeline.predict_proba(x)
+
+x, y = make_regression()
+pipeline = Pipeline()
+pipeline.fit(x, y)
+yp = pipeline.predict(x)
+```
+
+## 3. Amplo AutoML Features
+
+### Interval Analyser
+
+```python
+from amplo.automl import IntervalAnalyser
+```
+
+Interval analyser for log file classification. When log files have to be classified and there is not enough
+data for time series methods (such as LSTMs, ROCKET, WEASEL or BOSS), one needs to fall back to classical
+machine learning models, which work better with fewer samples. This raises the problem of which samples to
+classify. Simply classifying every sample and accumulating the results may greatly degrade
+classification performance. Therefore, we introduce this interval analyser. By using an approximate K-Nearest
+Neighbors algorithm, one can estimate the strength of correlation for every sample inside a log. This estimate
+allows for better interval selection for classical machine learning models.
+
+To use this interval analyser, place each log in a folder named after its class, and put all class folders in one parent folder, e.g.:
+
+```text
++-- Parent Folder
+|   +-- Class_1
+|       +-- Log_1.*
+|       +-- Log_2.*
+|   +-- Class_2
+|       +-- Log_3.*
+```
+
+### Data Processing
+
+```python
+from amplo.automl import DataProcessor
+```
+
+Automated Data Cleaning:
+
+- Infers & converts data types (integer, floats, categorical, datetime)
+- Reformats column names
+- Removes duplicate columns and rows
+- Handles missing values by:
+  - Removing columns
+  - Removing rows
+  - Interpolating
+  - Filling with zeros
+- Removes outliers using:
+  - Clipping
+  - Z-score
+  - Quantiles
+- Removes constant columns
+
+### Feature Processing
+
+```python
+from amplo.automl import FeatureProcessor
+```
+
+Automatically extracts and selects features and removes collinear features.
+Included Feature Extraction algorithms:
+
+- Multiplicative Features
+- Dividing Features
+- Additive Features
+- Subtractive Features
+- Trigonometric Features
+- K-Means Features
+- Lagged Features
+- Differencing Features
+- Inverse Features
+- Datetime Features
+
+Included Feature Selection algorithms:
+
+- Random Forest Feature Importance (Threshold and Increment)
+- Predictive Power Score
+
+### Sequencing
+
+```python
+from amplo.automl import Sequencer
+```
+
+For time series regression problems, it is often useful to include multiple previous samples instead of just the latest.
+This class sequences the data based on which time steps you want included in the input and output.
+This is also useful when working with tensors, as it can return a tensor that fits directly into a Recurrent Neural Network.
+
+### Modelling
+
+```python
+from amplo.automl import Modeller
+```
+
+Runs various regression or classification models.
+Includes:
+
+- Scikit's Linear Model
+- Scikit's Random Forest
+- Scikit's Bagging
+- Scikit's GradientBoosting
+- Scikit's HistGradientBoosting
+- DMLC's XGBoost
+- Yandex's CatBoost
+- Microsoft's LightGBM
+- Stacking Models
+
+### Grid Search
+
+```python
+from amplo.grid_search import OptunaGridSearch
+```
+
+Contains a hyperparameter optimizer with extended predefined model parameters:
+
+- Optuna's Tree-structured Parzen Estimator (TPE)
+
+
+%package help
+Summary:	Development documents and examples for Amplo
+Provides:	python3-Amplo-doc
+%description help
+# Amplo - AutoML (for Machine Data)
+
+[![image](https://img.shields.io/pypi/v/amplo.svg)](https://pypi.python.org/pypi/amplo)
+[![PyPI - License](https://img.shields.io/pypi/l/virtualenv?style=flat-square)](https://opensource.org/licenses/MIT)
+![](https://img.shields.io/badge/python-%3E%3D3.9%2C%3C4.0-blue)
+![](https://tokei.rs/b1/github/nielsuit227/automl)
+![](https://img.shields.io/pypi/dm/amplo)
+
+Welcome to the Automated Machine Learning package `amplo`. Amplo's AutoML is designed specifically for machine data and
+works very well with tabular time series data (especially unbalanced classification!).
+
+Though this is a standalone Python package, Amplo's AutoML is also available on Amplo's Smart Maintenance Platform.
+With a graphical user interface and various data connectors, it is the ideal place for service engineers to get started
+on Predictive Maintenance.
+
+Amplo's AutoML Pipeline contains the entire Machine Learning development cycle, including exploratory data analysis,
+data cleaning, feature extraction, feature selection, model selection, hyperparameter optimization, stacking,
+version control, production-ready models and documentation. It comes with additional tools such as interval analysers,
+drift detectors, data quality checks, etc.
+
+## 1. Downloading Amplo
+
+The easiest way is to install our Python package through [PyPI](https://pypi.org/project/amplo/):
+
+```bash
+pip install amplo
+```
+
+## 2. Usage
+
+Usage is very simple with Amplo's AutoML Pipeline.
+
+```python
+from amplo import Pipeline
+from sklearn.datasets import make_classification
+from sklearn.datasets import make_regression
+
+x, y = make_classification()
+pipeline = Pipeline()
+pipeline.fit(x, y)
+yp = pipeline.predict_proba(x)
+
+x, y = make_regression()
+pipeline = Pipeline()
+pipeline.fit(x, y)
+yp = pipeline.predict(x)
+```
+
+## 3. Amplo AutoML Features
+
+### Interval Analyser
+
+```python
+from amplo.automl import IntervalAnalyser
+```
+
+Interval analyser for log file classification. When log files have to be classified and there is not enough
+data for time series methods (such as LSTMs, ROCKET, WEASEL or BOSS), one needs to fall back to classical
+machine learning models, which work better with fewer samples. This raises the problem of which samples to
+classify. Simply classifying every sample and accumulating the results may greatly degrade
+classification performance. Therefore, we introduce this interval analyser. By using an approximate K-Nearest
+Neighbors algorithm, one can estimate the strength of correlation for every sample inside a log. This estimate
+allows for better interval selection for classical machine learning models.
+
+To use this interval analyser, place each log in a folder named after its class, and put all class folders in one parent folder, e.g.:
+
+```text
++-- Parent Folder
+|   +-- Class_1
+|       +-- Log_1.*
+|       +-- Log_2.*
+|   +-- Class_2
+|       +-- Log_3.*
+```
+
+### Data Processing
+
+```python
+from amplo.automl import DataProcessor
+```
+
+Automated Data Cleaning:
+
+- Infers & converts data types (integer, floats, categorical, datetime)
+- Reformats column names
+- Removes duplicate columns and rows
+- Handles missing values by:
+  - Removing columns
+  - Removing rows
+  - Interpolating
+  - Filling with zeros
+- Removes outliers using:
+  - Clipping
+  - Z-score
+  - Quantiles
+- Removes constant columns
+
+### Feature Processing
+
+```python
+from amplo.automl import FeatureProcessor
+```
+
+Automatically extracts and selects features and removes collinear features.
+Included Feature Extraction algorithms:
+
+- Multiplicative Features
+- Dividing Features
+- Additive Features
+- Subtractive Features
+- Trigonometric Features
+- K-Means Features
+- Lagged Features
+- Differencing Features
+- Inverse Features
+- Datetime Features
+
+Included Feature Selection algorithms:
+
+- Random Forest Feature Importance (Threshold and Increment)
+- Predictive Power Score
+
+### Sequencing
+
+```python
+from amplo.automl import Sequencer
+```
+
+For time series regression problems, it is often useful to include multiple previous samples instead of just the latest.
+This class sequences the data based on which time steps you want included in the input and output.
+This is also useful when working with tensors, as it can return a tensor that fits directly into a Recurrent Neural Network.
+
+### Modelling
+
+```python
+from amplo.automl import Modeller
+```
+
+Runs various regression or classification models.
+Includes:
+
+- Scikit's Linear Model
+- Scikit's Random Forest
+- Scikit's Bagging
+- Scikit's GradientBoosting
+- Scikit's HistGradientBoosting
+- DMLC's XGBoost
+- Yandex's CatBoost
+- Microsoft's LightGBM
+- Stacking Models
+
+### Grid Search
+
+```python
+from amplo.grid_search import OptunaGridSearch
+```
+
+Contains a hyperparameter optimizer with extended predefined model parameters:
+
+- Optuna's Tree-structured Parzen Estimator (TPE)
+
+
+%prep
+%autosetup -n Amplo-0.17.0
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+	find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+	find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+	find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+	find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+	find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-Amplo -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Tue Apr 11 2023 Python_Bot <Python_Bot@openeuler.org> - 0.17.0-1
+- Package Spec generated
diff --git a/sources b/sources
new file mode 100644
index 0000000..459f121
--- /dev/null
+++ b/sources
@@ -0,0 +1 @@
+534ea089ead20cc21926be044afcce44  Amplo-0.17.0.tar.gz
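
The `sources` entry above pairs an MD5 digest with the tarball name. A minimal sketch of how such a line could be checked against a downloaded file, assuming the `<md5-hex>  <filename>` layout seen in this diff; the helper names `parse_sources_line`, `md5_of_file` and `verify_sources_line` are hypothetical and not part of any packaging tool:

```python
import hashlib
from pathlib import Path


def parse_sources_line(line: str) -> tuple[str, str]:
    """Split a '<md5-hex>  <filename>' line into (digest, filename)."""
    digest, filename = line.split(maxsplit=1)
    return digest.lower(), filename.strip()


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so a large tarball is not read into memory at once."""
    hasher = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_sources_line(line: str, directory: Path) -> bool:
    """Return True if the named file exists in `directory` and matches the digest."""
    expected, filename = parse_sources_line(line)
    candidate = directory / filename
    return candidate.is_file() and md5_of_file(candidate) == expected
```

Calling `verify_sources_line("534ea089ead20cc21926be044afcce44  Amplo-0.17.0.tar.gz", Path("."))` would report whether a tarball in the current directory matches the recorded digest. MD5 only guards against accidental corruption here, not against deliberate tampering.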