| author    | CoprDistGit <infra@openeuler.org>                | 2023-05-05 09:16:28 +0000 |
|-----------|--------------------------------------------------|---------------------------|
| committer | CoprDistGit <infra@openeuler.org>                | 2023-05-05 09:16:28 +0000 |
| commit    | a8cf6d535dcd400a4639b17a3d633647e0e99ac9 (patch) |                           |
| tree      | e897efcac1234058ea528af525fe982a61b9c38e         |                           |
| parent    | 5ef97c37a883de3f613c21d826994e57b6ebdd17 (diff)  |                           |
automatic import of python-mlregression (openeuler20.03)
| -rw-r--r-- | .gitignore               | 1   |
| -rw-r--r-- | python-mlregression.spec | 462 |
| -rw-r--r-- | sources                  | 1   |

3 files changed, 464 insertions, 0 deletions
@@ -0,0 +1 @@
+/mlregression-0.1.10.tar.gz
diff --git a/python-mlregression.spec b/python-mlregression.spec
new file mode 100644
index 0000000..99cd786
--- /dev/null
+++ b/python-mlregression.spec
@@ -0,0 +1,462 @@
+%global _empty_manifest_terminate_build 0
+Name: python-mlregression
+Version: 0.1.10
+Release: 1
+Summary: Machine learning regression off-the-shelf
+License: MIT License
+URL: https://github.com/muhlbach/mlregression
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/ec/b8/85a0e6d41da0cc6934acc4fd78d72a1b8b32f32c3e19bd584b6d99c394bb/mlregression-0.1.10.tar.gz
+BuildArch: noarch
+
+
+%description
+
+# *** ATTENTION ***
+Don't immediately run `pip install mlregression`. See the _Installation_ section.
+
+# Machine learning regression (mlregression)
+
+Machine Learning Regression (mlregression) is an off-the-shelf implementation of the most popular ML methods that automatically takes care of fitting and parameter tuning.
+
+Currently, the __fully__ implemented models include:
+- Ensemble trees (Random forests, XGBoost, LightGBM, GradientBoostingRegressor, ExtraTreesRegressor)
+- Penalized regression (Ridge, Lasso, ElasticNet, Lars, LassoLars)
+- Neural nets (simple neural nets with 1-5 hidden layers, ReLU activation, and early stopping)
+
+_NB!_ When using penalized regressions, consider using the native CV implementations from scikit-learn for speed, e.g., simply set `estimator="LassoCV"` as in Example 1.
+
+Scikit-learn regressors (together with `XGBoost` and `LightGBM`) can be estimated by setting the `estimator` argument to the estimator's name as a string, as in Example 1 (`estimator="RandomForestRegressor"`).
+Alternatively, one can provide an instance of an estimator, e.g., `estimator=RandomForestRegressor()`. Again, this is fully automated for most scikit-learn regressors, but for non-standard methods one would also have to provide a parameter grid, e.g., `param_grid={...}`.
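The README mentions that non-standard estimators need an explicit parameter grid but never shows one. As a purely illustrative sketch — the `MLRegressor(param_grid=...)` call is an assumption based on the text above, not a verified signature — such a grid is simply a dict mapping hyperparameter names to candidate values (here, real `RandomForestRegressor` hyperparameters):

```python
# Hypothetical illustration: the keys are genuine RandomForestRegressor
# hyperparameters; the MLRegressor(param_grid=...) call below is assumed
# from the README text and is left commented out.
param_grid = {
    "n_estimators": [100, 500],     # number of trees in the forest
    "max_depth": [None, 5, 10],     # depth limits per tree
    "max_features": ["sqrt", 1.0],  # features considered at each split
}

# Assumed usage, mirroring the pattern described above:
# mlreg = MLRegressor(estimator=RandomForestRegressor(), param_grid=param_grid)

# The grid spans 2 * 3 * 2 = 12 candidate parametrizations, of which
# 'max_n_models' would be examined via cross-validation
n_candidates = 1
for values in param_grid.values():
    n_candidates *= len(values)
print(n_candidates)  # 12
```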
+
+Please contact the authors below if you find any bugs or have any suggestions for improvement. Thank you!
+
+Author: Nicolaj Søndergaard Mühlbach (n.muhlbach at gmail dot com, muhlbach at mit dot edu)
+
+## Code dependencies
+This code has the following dependencies:
+
+- Python >=3.6
+- numpy >=1.19
+- pandas >=1.3
+- scikit-learn >=1
+- scikit-learn-intelex >=2021.3
+- daal >=2021.3
+- daal4py >=2021.3
+- tbb >=2021.4
+- xgboost >=1.5
+- lightgbm >=3.2
+
+
+## Installation
+Before calling `pip install mlregression`, we recommend using `conda` to install the dependencies. In our experience, the following command works like a charm:
+```
+conda install -c conda-forge numpy">=1.19" pandas">=1.3" scikit-learn">=1" scikit-learn-intelex">=2021.3" daal">=2021.3" daal4py">=2021.3" tbb">=2021.4" xgboost">=1.5" lightgbm">=3.2" --force-reinstall
+```
+After this, install `mlregression` by calling `pip install mlregression`.
+Note that without these dependencies, the package will not work. As of now, installing the dependencies via `pip install` alone does not work either: we use the Intel® Extension for Scikit-learn to massively speed up computations, and its dependencies are not properly installed via `pip install`.
+
+## Usage
+We demonstrate the use of __mlregression__ below, using random forests, XGBoost, and LightGBM as underlying regressors.
+
+```python
+#------------------------------------------------------------------------------
+# Libraries
+#------------------------------------------------------------------------------
+# Standard
+from sklearn.datasets import make_regression
+from sklearn.model_selection import train_test_split
+
+# This library
+from mlregression.mlreg import MLRegressor
+
+#------------------------------------------------------------------------------
+# Data
+#------------------------------------------------------------------------------
+# Generate data
+X, y = make_regression(n_samples=500,
+                       n_features=10,
+                       n_informative=5,
+                       n_targets=1,
+                       bias=0.0,
+                       coef=False,
+                       random_state=1991)
+
+X_train, X_test, y_train, y_test = train_test_split(X, y)
+
+#------------------------------------------------------------------------------
+# Example 1: Prediction
+#------------------------------------------------------------------------------
+# Specify any of the following estimators:
+"""
+"LinearRegression",
+"RidgeCV", "LassoCV", "ElasticNetCV",
+"RandomForestRegressor", "ExtraTreesRegressor", "GradientBoostingRegressor",
+"XGBRegressor", "LGBMRegressor",
+"MLPRegressor",
+"""
+
+# For instance, pick "RandomForestRegressor"
+estimator = "RandomForestRegressor"
+# Note that 'estimator' may also be an instance of a class, e.g., RandomForestRegressor(),
+# provided it is imported first, e.g.,
+from sklearn.ensemble import RandomForestRegressor
+
+# Instantiate the model and choose the number of parametrizations to examine using
+# cross-validation ('max_n_models') and the number of cross-validation folds ('n_cv_folds')
+mlreg = MLRegressor(estimator=estimator,
+                    n_cv_folds=5,
+                    max_n_models=2)
+
+# Fit
+mlreg.fit(X=X_train, y=y_train)
+
+# Predict
+y_hat = mlreg.predict(X=X_test)
+
+# Access all the usual attributes
+mlreg.best_score_
+mlreg.best_estimator_
+
+# Compute the score
+mlreg.score(X=X_test, y=y_test)
+
+#------------------------------------------------------------------------------
+# Example 2: Cross-fitting
+#------------------------------------------------------------------------------
+# Instantiate the model and choose the number of parametrizations to examine using
+# cross-validation ('max_n_models'), the number of cross-validation folds ('n_cv_folds'),
+# AND the number of cross-fitting folds ('n_cf_folds')
+mlreg = MLRegressor(estimator=estimator,
+                    n_cv_folds=5,
+                    max_n_models=2,
+                    n_cf_folds=2)
+
+# Cross-fit
+mlreg.cross_fit(X=X_train, y=y_train)
+
+# Extract in-sample predictions that are estimated in an out-of-sample way (i.e., via cross-fitting)
+y_hat = mlreg.y_pred_cf_
+
+# Likewise, extract the residualized outcomes used in, e.g., double machine learning.
+# This is \tilde{Y} = Y - E[Y|X=x]
+y_res = mlreg.y_res_cf_
+```
+
+<!-- ## Example
+We provide an example script in `demo.py`. -->
+
+
+
+
+%package -n python3-mlregression
+Summary: Machine learning regression off-the-shelf
+Provides: python-mlregression
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-mlregression
+
+# *** ATTENTION ***
+Don't immediately run `pip install mlregression`. See the _Installation_ section.
+
+# Machine learning regression (mlregression)
+
+Machine Learning Regression (mlregression) is an off-the-shelf implementation of the most popular ML methods that automatically takes care of fitting and parameter tuning.
+
+Currently, the __fully__ implemented models include:
+- Ensemble trees (Random forests, XGBoost, LightGBM, GradientBoostingRegressor, ExtraTreesRegressor)
+- Penalized regression (Ridge, Lasso, ElasticNet, Lars, LassoLars)
+- Neural nets (simple neural nets with 1-5 hidden layers, ReLU activation, and early stopping)
+
+_NB!_ When using penalized regressions, consider using the native CV implementations from scikit-learn for speed, e.g., simply set `estimator="LassoCV"` as in Example 1.
+
+Scikit-learn regressors (together with `XGBoost` and `LightGBM`) can be estimated by setting the `estimator` argument to the estimator's name as a string, as in Example 1 (`estimator="RandomForestRegressor"`).
+Alternatively, one can provide an instance of an estimator, e.g., `estimator=RandomForestRegressor()`. Again, this is fully automated for most scikit-learn regressors, but for non-standard methods one would also have to provide a parameter grid, e.g., `param_grid={...}`.
+
+Please contact the authors below if you find any bugs or have any suggestions for improvement. Thank you!
+
+Author: Nicolaj Søndergaard Mühlbach (n.muhlbach at gmail dot com, muhlbach at mit dot edu)
+
+## Code dependencies
+This code has the following dependencies:
+
+- Python >=3.6
+- numpy >=1.19
+- pandas >=1.3
+- scikit-learn >=1
+- scikit-learn-intelex >=2021.3
+- daal >=2021.3
+- daal4py >=2021.3
+- tbb >=2021.4
+- xgboost >=1.5
+- lightgbm >=3.2
+
+
+## Installation
+Before calling `pip install mlregression`, we recommend using `conda` to install the dependencies. In our experience, the following command works like a charm:
+```
+conda install -c conda-forge numpy">=1.19" pandas">=1.3" scikit-learn">=1" scikit-learn-intelex">=2021.3" daal">=2021.3" daal4py">=2021.3" tbb">=2021.4" xgboost">=1.5" lightgbm">=3.2" --force-reinstall
+```
+After this, install `mlregression` by calling `pip install mlregression`.
+Note that without installing the dependencies, the package will not work.
As of now, installing the dependencies via `pip install` does not work. The reason is that we use the Intel® Extension for Scikit-learn to massively speed up computations, and its dependencies are not properly installed via `pip install`.
+
+## Usage
+We demonstrate the use of __mlregression__ below, using random forests, XGBoost, and LightGBM as underlying regressors.
+
+```python
+#------------------------------------------------------------------------------
+# Libraries
+#------------------------------------------------------------------------------
+# Standard
+from sklearn.datasets import make_regression
+from sklearn.model_selection import train_test_split
+
+# This library
+from mlregression.mlreg import MLRegressor
+
+#------------------------------------------------------------------------------
+# Data
+#------------------------------------------------------------------------------
+# Generate data
+X, y = make_regression(n_samples=500,
+                       n_features=10,
+                       n_informative=5,
+                       n_targets=1,
+                       bias=0.0,
+                       coef=False,
+                       random_state=1991)
+
+X_train, X_test, y_train, y_test = train_test_split(X, y)
+
+#------------------------------------------------------------------------------
+# Example 1: Prediction
+#------------------------------------------------------------------------------
+# Specify any of the following estimators:
+"""
+"LinearRegression",
+"RidgeCV", "LassoCV", "ElasticNetCV",
+"RandomForestRegressor", "ExtraTreesRegressor", "GradientBoostingRegressor",
+"XGBRegressor", "LGBMRegressor",
+"MLPRegressor",
+"""
+
+# For instance, pick "RandomForestRegressor"
+estimator = "RandomForestRegressor"
+# Note that 'estimator' may also be an instance of a class, e.g., RandomForestRegressor(),
+# provided it is imported first, e.g.,
+from sklearn.ensemble import RandomForestRegressor
+
+# Instantiate the model and choose the number of parametrizations to examine using
+# cross-validation ('max_n_models') and the number of cross-validation folds ('n_cv_folds')
+mlreg = MLRegressor(estimator=estimator,
+                    n_cv_folds=5,
+                    max_n_models=2)
+
+# Fit
+mlreg.fit(X=X_train, y=y_train)
+
+# Predict
+y_hat = mlreg.predict(X=X_test)
+
+# Access all the usual attributes
+mlreg.best_score_
+mlreg.best_estimator_
+
+# Compute the score
+mlreg.score(X=X_test, y=y_test)
+
+#------------------------------------------------------------------------------
+# Example 2: Cross-fitting
+#------------------------------------------------------------------------------
+# Instantiate the model and choose the number of parametrizations to examine using
+# cross-validation ('max_n_models'), the number of cross-validation folds ('n_cv_folds'),
+# AND the number of cross-fitting folds ('n_cf_folds')
+mlreg = MLRegressor(estimator=estimator,
+                    n_cv_folds=5,
+                    max_n_models=2,
+                    n_cf_folds=2)
+
+# Cross-fit
+mlreg.cross_fit(X=X_train, y=y_train)
+
+# Extract in-sample predictions that are estimated in an out-of-sample way (i.e., via cross-fitting)
+y_hat = mlreg.y_pred_cf_
+
+# Likewise, extract the residualized outcomes used in, e.g., double machine learning.
+# This is \tilde{Y} = Y - E[Y|X=x]
+y_res = mlreg.y_res_cf_
+```
+
+<!-- ## Example
+We provide an example script in `demo.py`. -->
+
+
+
+
+%package help
+Summary: Development documents and examples for mlregression
+Provides: python3-mlregression-doc
+%description help
+
+# *** ATTENTION ***
+Don't immediately run `pip install mlregression`. See the _Installation_ section.
+
+# Machine learning regression (mlregression)
+
+Machine Learning Regression (mlregression) is an off-the-shelf implementation of the most popular ML methods that automatically takes care of fitting and parameter tuning.
+
+Currently, the __fully__ implemented models include:
+- Ensemble trees (Random forests, XGBoost, LightGBM, GradientBoostingRegressor, ExtraTreesRegressor)
+- Penalized regression (Ridge, Lasso, ElasticNet, Lars, LassoLars)
+- Neural nets (simple neural nets with 1-5 hidden layers, ReLU activation, and early stopping)
+
+_NB!_ When using penalized regressions, consider using the native CV implementations from scikit-learn for speed, e.g., simply set `estimator="LassoCV"` as in Example 1.
+
+Scikit-learn regressors (together with `XGBoost` and `LightGBM`) can be estimated by setting the `estimator` argument to the estimator's name as a string, as in Example 1 (`estimator="RandomForestRegressor"`).
+Alternatively, one can provide an instance of an estimator, e.g., `estimator=RandomForestRegressor()`. Again, this is fully automated for most scikit-learn regressors, but for non-standard methods one would also have to provide a parameter grid, e.g., `param_grid={...}`.
+
+Please contact the authors below if you find any bugs or have any suggestions for improvement. Thank you!
+
+Author: Nicolaj Søndergaard Mühlbach (n.muhlbach at gmail dot com, muhlbach at mit dot edu)
+
+## Code dependencies
+This code has the following dependencies:
+
+- Python >=3.6
+- numpy >=1.19
+- pandas >=1.3
+- scikit-learn >=1
+- scikit-learn-intelex >=2021.3
+- daal >=2021.3
+- daal4py >=2021.3
+- tbb >=2021.4
+- xgboost >=1.5
+- lightgbm >=3.2
+
+
+## Installation
+Before calling `pip install mlregression`, we recommend using `conda` to install the dependencies. In our experience, the following command works like a charm:
+```
+conda install -c conda-forge numpy">=1.19" pandas">=1.3" scikit-learn">=1" scikit-learn-intelex">=2021.3" daal">=2021.3" daal4py">=2021.3" tbb">=2021.4" xgboost">=1.5" lightgbm">=3.2" --force-reinstall
+```
+After this, install `mlregression` by calling `pip install mlregression`.
+Note that without installing the dependencies, the package will not work.
As of now, installing the dependencies via `pip install` does not work. The reason is that we use the Intel® Extension for Scikit-learn to massively speed up computations, and its dependencies are not properly installed via `pip install`.
+
+## Usage
+We demonstrate the use of __mlregression__ below, using random forests, XGBoost, and LightGBM as underlying regressors.
+
+```python
+#------------------------------------------------------------------------------
+# Libraries
+#------------------------------------------------------------------------------
+# Standard
+from sklearn.datasets import make_regression
+from sklearn.model_selection import train_test_split
+
+# This library
+from mlregression.mlreg import MLRegressor
+
+#------------------------------------------------------------------------------
+# Data
+#------------------------------------------------------------------------------
+# Generate data
+X, y = make_regression(n_samples=500,
+                       n_features=10,
+                       n_informative=5,
+                       n_targets=1,
+                       bias=0.0,
+                       coef=False,
+                       random_state=1991)
+
+X_train, X_test, y_train, y_test = train_test_split(X, y)
+
+#------------------------------------------------------------------------------
+# Example 1: Prediction
+#------------------------------------------------------------------------------
+# Specify any of the following estimators:
+"""
+"LinearRegression",
+"RidgeCV", "LassoCV", "ElasticNetCV",
+"RandomForestRegressor", "ExtraTreesRegressor", "GradientBoostingRegressor",
+"XGBRegressor", "LGBMRegressor",
+"MLPRegressor",
+"""
+
+# For instance, pick "RandomForestRegressor"
+estimator = "RandomForestRegressor"
+# Note that 'estimator' may also be an instance of a class, e.g., RandomForestRegressor(),
+# provided it is imported first, e.g.,
+from sklearn.ensemble import RandomForestRegressor
+
+# Instantiate the model and choose the number of parametrizations to examine using
+# cross-validation ('max_n_models') and the number of cross-validation folds ('n_cv_folds')
+mlreg = MLRegressor(estimator=estimator,
+                    n_cv_folds=5,
+                    max_n_models=2)
+
+# Fit
+mlreg.fit(X=X_train, y=y_train)
+
+# Predict
+y_hat = mlreg.predict(X=X_test)
+
+# Access all the usual attributes
+mlreg.best_score_
+mlreg.best_estimator_
+
+# Compute the score
+mlreg.score(X=X_test, y=y_test)
+
+#------------------------------------------------------------------------------
+# Example 2: Cross-fitting
+#------------------------------------------------------------------------------
+# Instantiate the model and choose the number of parametrizations to examine using
+# cross-validation ('max_n_models'), the number of cross-validation folds ('n_cv_folds'),
+# AND the number of cross-fitting folds ('n_cf_folds')
+mlreg = MLRegressor(estimator=estimator,
+                    n_cv_folds=5,
+                    max_n_models=2,
+                    n_cf_folds=2)
+
+# Cross-fit
+mlreg.cross_fit(X=X_train, y=y_train)
+
+# Extract in-sample predictions that are estimated in an out-of-sample way (i.e., via cross-fitting)
+y_hat = mlreg.y_pred_cf_
+
+# Likewise, extract the residualized outcomes used in, e.g., double machine learning.
+# This is \tilde{Y} = Y - E[Y|X=x]
+y_res = mlreg.y_res_cf_
+```
+
+<!-- ## Example
+We provide an example script in `demo.py`.
-->
+
+
+
+
+%prep
+%autosetup -n mlregression-0.1.10
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+    find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+    find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+    find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+    find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+    find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-mlregression -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Fri May 05 2023 Python_Bot <Python_Bot@openeuler.org> - 0.1.10-1
+- Package Spec generated
@@ -0,0 +1 @@
+8c52fe19428f99c12468e6e7c30d8bff mlregression-0.1.10.tar.gz