%global _empty_manifest_terminate_build 0
Name: python-autots
Version: 0.5.6
Release: 1
Summary: Automated Time Series Forecasting
License: MIT License
URL: https://github.com/winedarksea/AutoTS
Source0: https://mirrors.nju.edu.cn/pypi/web/packages/2a/c4/981d9696d3d7070515a61439d58964cc893aa22a6604f69dd9cb7ad5a4fa/autots-0.5.6.tar.gz
BuildArch: noarch

Requires: python3-numpy
Requires: python3-pandas
Requires: python3-statsmodels
Requires: python3-scikit-learn
Requires: python3-holidays
Requires: python3-prophet
Requires: python3-fredapi
Requires: python3-tensorflow
Requires: python3-xgboost
Requires: python3-lightgbm
Requires: python3-joblib
Requires: python3-scipy
Requires: python3-arch
Requires: python3-numexpr
Requires: python3-bottleneck
Requires: python3-yfinance
Requires: python3-pytrends
Requires: python3-matplotlib
Requires: python3-requests

%description
# AutoTS

AutoTS is a time series package for Python designed for rapidly deploying high-accuracy forecasts at scale.

In 2023, AutoTS won the M6 forecasting competition, delivering the highest-performing investment decisions across 12 months of stock market forecasting.

There are dozens of forecasting models usable in the `sklearn` style of `.fit()` and `.predict()`. These include naive, statistical, machine learning, and deep learning models. Additionally, there are over 30 time-series-specific transforms usable in the `sklearn` style of `.fit()`, `.transform()`, and `.inverse_transform()`. All of these function directly on pandas DataFrames, without the need for conversion to proprietary objects.

All models support forecasting multivariate (multiple time series) outputs and also support probabilistic (upper/lower bound) forecasts. Most models can readily scale to tens and even hundreds of thousands of input series. Many models also support passing in user-defined exogenous regressors.
These models are all designed for integration in an AutoML feature search which automatically finds the best models, preprocessing, and ensembling for a given dataset through genetic algorithms. Horizontal and mosaic style ensembles are the flagship ensembling types, allowing each series to receive the most accurate possible models while still maintaining scalability.

A combination of metrics and cross-validation options, the ability to apply subsets and weighting, regressor generation tools, simulation forecasting mode, event risk forecasting, live datasets, template import and export, plotting, and a collection of data shaping parameters round out the available feature set.

## Table of Contents
* [Installation](https://github.com/winedarksea/AutoTS#installation)
* [Basic Use](https://github.com/winedarksea/AutoTS#basic-use)
* [Tips for Speed and Large Data](https://github.com/winedarksea/AutoTS#tips-for-speed-and-large-data)
* Extended Tutorial [GitHub](https://github.com/winedarksea/AutoTS/blob/master/extended_tutorial.md) or [Docs](https://winedarksea.github.io/AutoTS/build/html/source/tutorial.html)
* [Production Example](https://github.com/winedarksea/AutoTS/blob/master/production_example.py)

## Installation
```
pip install autots
```
This includes dependencies for basic models, but [additional packages](https://github.com/winedarksea/AutoTS/blob/master/extended_tutorial.md#installation-and-dependency-versioning) are required for some models and methods.

## Basic Use
Input data for AutoTS is expected to come in either a *long* or a *wide* format:

- The *wide* format is a `pandas.DataFrame` with a `pandas.DatetimeIndex` and each column a distinct series.
- The *long* format has three columns:
  - Date (ideally already in pandas-recognized `datetime` format)
  - Series ID. For a single time series, `series_id` can be `= None`.
  - Value
- For *long* data, the column name for each of these is passed to `.fit()` as `date_col`, `id_col`, and `value_col`.
No parameters are needed for *wide* data. Lower-level functions are only designed for *wide* style data.

```python
# also load: _hourly, _monthly, _weekly, _yearly, or _live_daily
from autots import AutoTS, load_daily

# sample datasets can be used in either of the long or wide import shapes
long = False
df = load_daily(long=long)

model = AutoTS(
    forecast_length=21,
    frequency='infer',
    prediction_interval=0.9,
    ensemble=None,
    model_list="fast",  # "superfast", "default", "fast_parallel"
    transformer_list="fast",  # "superfast",
    drop_most_recent=1,
    max_generations=4,
    num_validations=2,
    validation_method="backwards"
)
model = model.fit(
    df,
    date_col='datetime' if long else None,
    value_col='value' if long else None,
    id_col='series_id' if long else None,
)

prediction = model.predict()
# plot a sample
prediction.plot(
    model.df_wide_numeric,
    series=model.df_wide_numeric.columns[0],
    start_date="2019-01-01",
)
# Print the details of the best model
print(model)

# point forecasts dataframe
forecasts_df = prediction.forecast
# upper and lower forecasts
forecasts_up, forecasts_low = prediction.upper_forecast, prediction.lower_forecast

# accuracy of all tried model results
model_results = model.results()
# and aggregated from cross validation
validation_results = model.results("validation")
```

The lower-level API, in particular the large section of time series transformers in the scikit-learn style, can also be utilized independently from the AutoML framework.

Check out [extended_tutorial.md](https://winedarksea.github.io/AutoTS/build/html/source/tutorial.html) for a more detailed guide to features.
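To make the two input layouts concrete, here is a minimal sketch (not part of the AutoTS examples) of reshaping *long* data into the *wide* shape with plain pandas; the column names `datetime`, `series_id`, and `value` are assumptions chosen to match the `.fit()` arguments above:

```python
import pandas as pd

# long format: one row per (date, series) observation
long_df = pd.DataFrame({
    "datetime": pd.to_datetime(
        ["2023-01-01", "2023-01-02", "2023-01-01", "2023-01-02"]
    ),
    "series_id": ["a", "a", "b", "b"],
    "value": [1.0, 2.0, 10.0, 20.0],
})

# wide format: DatetimeIndex rows, one column per series
wide_df = long_df.pivot(index="datetime", columns="series_id", values="value")
print(wide_df)
```

`DataFrame.pivot` is a standard pandas call; AutoTS performs an equivalent reshaping internally when long-format column names are passed to `.fit()`, so either layout reaches the same wide representation.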
Also take a look at the [production_example.py](https://github.com/winedarksea/AutoTS/blob/master/production_example.py)

## Tips for Speed and Large Data:
* Use appropriate model lists, especially the predefined lists:
  * `superfast` (simple naive models) and `fast` (more complex but still faster models, optimized for many series)
  * `fast_parallel` (a combination of `fast` and `parallel`) or `parallel`, given many CPU cores are available
    * `n_jobs` usually gets pretty close with `='auto'` but adjust as necessary for the environment
  * see a dict of predefined lists (some defined for internal use) with `from autots.models.model_list import model_lists`
* Use the `subset` parameter when there are many similar series; `subset=100` will often generalize well for tens of thousands of similar series.
  * if using `subset`, passing `weights` for series will weight subset selection towards higher priority series.
  * if limited by RAM, the search can be distributed by running multiple instances of AutoTS on different batches of data, having first imported a pretrained template as a starting point for all.
* Set `model_interrupt=True`, which skips over the current model when a `KeyboardInterrupt` (i.e. `ctrl+c`) is pressed (although if the interrupt falls between generations it will stop the entire training).
* Use the `result_file` argument of `.fit()`, which will save progress after each generation - helpful if a long training run is interrupted. Use `import_results` to recover.
* While transformations are pretty fast, setting `transformer_max_depth` to a lower number (say, 2) will increase speed. Also utilize `transformer_list='fast'` or `'superfast'`.
* Check out [this example](https://github.com/winedarksea/AutoTS/discussions/76) of using AutoTS with pandas UDF.
* Ensembles are obviously slower to predict because they run many models; 'distance' ensembles are roughly 2x slower and 'simple' ensembles 3x-5x slower.
* `ensemble='horizontal-max'` with `model_list='no_shared_fast'` can scale relatively well given many CPU cores, because each model is only run on the series it is needed for.
* Reducing `num_validations` and `models_to_validate` will decrease runtime but may lead to poorer model selections.
* For datasets with many records, upsampling (for example, from daily to monthly frequency forecasts) can reduce training time if appropriate.
  * this can be done by adjusting `frequency` and `aggfunc`, but is probably best done before passing data into AutoTS.
* It will be faster if NaNs are already filled. If a search for an optimal NaN fill method is not required, fill any NaNs with a satisfactory method before passing data to the class.
* Set `runtime_weighting` in `metric_weighting` to a higher value. This will guide the search towards faster models, although it may come at the expense of accuracy.

## How to Contribute:
* Give feedback on where you find the documentation confusing
* Use AutoTS and...
  * Report errors and request features by adding Issues on GitHub
  * Post the top model templates for your data (to help improve the starting templates)
  * Feel free to recommend different search grid parameters for your favorite models
* And, of course, contribute to the codebase directly on GitHub.

*Also known as Project CATS (Catlin's Automated Time Series) hence the logo.*

%package -n python3-autots
Summary: Automated Time Series Forecasting
Provides: python-autots
BuildRequires: python3-devel
BuildRequires: python3-setuptools
BuildRequires: python3-pip

%description -n python3-autots
# AutoTS

AutoTS is a time series package for Python designed for rapidly deploying high-accuracy forecasts at scale.

In 2023, AutoTS won the M6 forecasting competition, delivering the highest-performing investment decisions across 12 months of stock market forecasting.

There are dozens of forecasting models usable in the `sklearn` style of `.fit()` and `.predict()`.
These include naive, statistical, machine learning, and deep learning models. Additionally, there are over 30 time-series-specific transforms usable in the `sklearn` style of `.fit()`, `.transform()`, and `.inverse_transform()`. All of these function directly on pandas DataFrames, without the need for conversion to proprietary objects.

All models support forecasting multivariate (multiple time series) outputs and also support probabilistic (upper/lower bound) forecasts. Most models can readily scale to tens and even hundreds of thousands of input series. Many models also support passing in user-defined exogenous regressors.

These models are all designed for integration in an AutoML feature search which automatically finds the best models, preprocessing, and ensembling for a given dataset through genetic algorithms. Horizontal and mosaic style ensembles are the flagship ensembling types, allowing each series to receive the most accurate possible models while still maintaining scalability.

A combination of metrics and cross-validation options, the ability to apply subsets and weighting, regressor generation tools, simulation forecasting mode, event risk forecasting, live datasets, template import and export, plotting, and a collection of data shaping parameters round out the available feature set.
## Table of Contents
* [Installation](https://github.com/winedarksea/AutoTS#installation)
* [Basic Use](https://github.com/winedarksea/AutoTS#basic-use)
* [Tips for Speed and Large Data](https://github.com/winedarksea/AutoTS#tips-for-speed-and-large-data)
* Extended Tutorial [GitHub](https://github.com/winedarksea/AutoTS/blob/master/extended_tutorial.md) or [Docs](https://winedarksea.github.io/AutoTS/build/html/source/tutorial.html)
* [Production Example](https://github.com/winedarksea/AutoTS/blob/master/production_example.py)

## Installation
```
pip install autots
```
This includes dependencies for basic models, but [additional packages](https://github.com/winedarksea/AutoTS/blob/master/extended_tutorial.md#installation-and-dependency-versioning) are required for some models and methods.

## Basic Use
Input data for AutoTS is expected to come in either a *long* or a *wide* format:

- The *wide* format is a `pandas.DataFrame` with a `pandas.DatetimeIndex` and each column a distinct series.
- The *long* format has three columns:
  - Date (ideally already in pandas-recognized `datetime` format)
  - Series ID. For a single time series, `series_id` can be `= None`.
  - Value
- For *long* data, the column name for each of these is passed to `.fit()` as `date_col`, `id_col`, and `value_col`.

No parameters are needed for *wide* data. Lower-level functions are only designed for *wide* style data.
```python
# also load: _hourly, _monthly, _weekly, _yearly, or _live_daily
from autots import AutoTS, load_daily

# sample datasets can be used in either of the long or wide import shapes
long = False
df = load_daily(long=long)

model = AutoTS(
    forecast_length=21,
    frequency='infer',
    prediction_interval=0.9,
    ensemble=None,
    model_list="fast",  # "superfast", "default", "fast_parallel"
    transformer_list="fast",  # "superfast",
    drop_most_recent=1,
    max_generations=4,
    num_validations=2,
    validation_method="backwards"
)
model = model.fit(
    df,
    date_col='datetime' if long else None,
    value_col='value' if long else None,
    id_col='series_id' if long else None,
)

prediction = model.predict()
# plot a sample
prediction.plot(
    model.df_wide_numeric,
    series=model.df_wide_numeric.columns[0],
    start_date="2019-01-01",
)
# Print the details of the best model
print(model)

# point forecasts dataframe
forecasts_df = prediction.forecast
# upper and lower forecasts
forecasts_up, forecasts_low = prediction.upper_forecast, prediction.lower_forecast

# accuracy of all tried model results
model_results = model.results()
# and aggregated from cross validation
validation_results = model.results("validation")
```

The lower-level API, in particular the large section of time series transformers in the scikit-learn style, can also be utilized independently from the AutoML framework.

Check out [extended_tutorial.md](https://winedarksea.github.io/AutoTS/build/html/source/tutorial.html) for a more detailed guide to features.
Also take a look at the [production_example.py](https://github.com/winedarksea/AutoTS/blob/master/production_example.py)

## Tips for Speed and Large Data:
* Use appropriate model lists, especially the predefined lists:
  * `superfast` (simple naive models) and `fast` (more complex but still faster models, optimized for many series)
  * `fast_parallel` (a combination of `fast` and `parallel`) or `parallel`, given many CPU cores are available
    * `n_jobs` usually gets pretty close with `='auto'` but adjust as necessary for the environment
  * see a dict of predefined lists (some defined for internal use) with `from autots.models.model_list import model_lists`
* Use the `subset` parameter when there are many similar series; `subset=100` will often generalize well for tens of thousands of similar series.
  * if using `subset`, passing `weights` for series will weight subset selection towards higher priority series.
  * if limited by RAM, the search can be distributed by running multiple instances of AutoTS on different batches of data, having first imported a pretrained template as a starting point for all.
* Set `model_interrupt=True`, which skips over the current model when a `KeyboardInterrupt` (i.e. `ctrl+c`) is pressed (although if the interrupt falls between generations it will stop the entire training).
* Use the `result_file` argument of `.fit()`, which will save progress after each generation - helpful if a long training run is interrupted. Use `import_results` to recover.
* While transformations are pretty fast, setting `transformer_max_depth` to a lower number (say, 2) will increase speed. Also utilize `transformer_list='fast'` or `'superfast'`.
* Check out [this example](https://github.com/winedarksea/AutoTS/discussions/76) of using AutoTS with pandas UDF.
* Ensembles are obviously slower to predict because they run many models; 'distance' ensembles are roughly 2x slower and 'simple' ensembles 3x-5x slower.
* `ensemble='horizontal-max'` with `model_list='no_shared_fast'` can scale relatively well given many CPU cores, because each model is only run on the series it is needed for.
* Reducing `num_validations` and `models_to_validate` will decrease runtime but may lead to poorer model selections.
* For datasets with many records, upsampling (for example, from daily to monthly frequency forecasts) can reduce training time if appropriate.
  * this can be done by adjusting `frequency` and `aggfunc`, but is probably best done before passing data into AutoTS.
* It will be faster if NaNs are already filled. If a search for an optimal NaN fill method is not required, fill any NaNs with a satisfactory method before passing data to the class.
* Set `runtime_weighting` in `metric_weighting` to a higher value. This will guide the search towards faster models, although it may come at the expense of accuracy.

## How to Contribute:
* Give feedback on where you find the documentation confusing
* Use AutoTS and...
  * Report errors and request features by adding Issues on GitHub
  * Post the top model templates for your data (to help improve the starting templates)
  * Feel free to recommend different search grid parameters for your favorite models
* And, of course, contribute to the codebase directly on GitHub.

*Also known as Project CATS (Catlin's Automated Time Series) hence the logo.*

%package help
Summary: Development documents and examples for autots
Provides: python3-autots-doc

%description help
# AutoTS

AutoTS is a time series package for Python designed for rapidly deploying high-accuracy forecasts at scale.

In 2023, AutoTS won the M6 forecasting competition, delivering the highest-performing investment decisions across 12 months of stock market forecasting.

There are dozens of forecasting models usable in the `sklearn` style of `.fit()` and `.predict()`. These include naive, statistical, machine learning, and deep learning models.
Additionally, there are over 30 time-series-specific transforms usable in the `sklearn` style of `.fit()`, `.transform()`, and `.inverse_transform()`. All of these function directly on pandas DataFrames, without the need for conversion to proprietary objects.

All models support forecasting multivariate (multiple time series) outputs and also support probabilistic (upper/lower bound) forecasts. Most models can readily scale to tens and even hundreds of thousands of input series. Many models also support passing in user-defined exogenous regressors.

These models are all designed for integration in an AutoML feature search which automatically finds the best models, preprocessing, and ensembling for a given dataset through genetic algorithms. Horizontal and mosaic style ensembles are the flagship ensembling types, allowing each series to receive the most accurate possible models while still maintaining scalability.

A combination of metrics and cross-validation options, the ability to apply subsets and weighting, regressor generation tools, simulation forecasting mode, event risk forecasting, live datasets, template import and export, plotting, and a collection of data shaping parameters round out the available feature set.
## Table of Contents
* [Installation](https://github.com/winedarksea/AutoTS#installation)
* [Basic Use](https://github.com/winedarksea/AutoTS#basic-use)
* [Tips for Speed and Large Data](https://github.com/winedarksea/AutoTS#tips-for-speed-and-large-data)
* Extended Tutorial [GitHub](https://github.com/winedarksea/AutoTS/blob/master/extended_tutorial.md) or [Docs](https://winedarksea.github.io/AutoTS/build/html/source/tutorial.html)
* [Production Example](https://github.com/winedarksea/AutoTS/blob/master/production_example.py)

## Installation
```
pip install autots
```
This includes dependencies for basic models, but [additional packages](https://github.com/winedarksea/AutoTS/blob/master/extended_tutorial.md#installation-and-dependency-versioning) are required for some models and methods.

## Basic Use
Input data for AutoTS is expected to come in either a *long* or a *wide* format:

- The *wide* format is a `pandas.DataFrame` with a `pandas.DatetimeIndex` and each column a distinct series.
- The *long* format has three columns:
  - Date (ideally already in pandas-recognized `datetime` format)
  - Series ID. For a single time series, `series_id` can be `= None`.
  - Value
- For *long* data, the column name for each of these is passed to `.fit()` as `date_col`, `id_col`, and `value_col`.

No parameters are needed for *wide* data. Lower-level functions are only designed for *wide* style data.
```python
# also load: _hourly, _monthly, _weekly, _yearly, or _live_daily
from autots import AutoTS, load_daily

# sample datasets can be used in either of the long or wide import shapes
long = False
df = load_daily(long=long)

model = AutoTS(
    forecast_length=21,
    frequency='infer',
    prediction_interval=0.9,
    ensemble=None,
    model_list="fast",  # "superfast", "default", "fast_parallel"
    transformer_list="fast",  # "superfast",
    drop_most_recent=1,
    max_generations=4,
    num_validations=2,
    validation_method="backwards"
)
model = model.fit(
    df,
    date_col='datetime' if long else None,
    value_col='value' if long else None,
    id_col='series_id' if long else None,
)

prediction = model.predict()
# plot a sample
prediction.plot(
    model.df_wide_numeric,
    series=model.df_wide_numeric.columns[0],
    start_date="2019-01-01",
)
# Print the details of the best model
print(model)

# point forecasts dataframe
forecasts_df = prediction.forecast
# upper and lower forecasts
forecasts_up, forecasts_low = prediction.upper_forecast, prediction.lower_forecast

# accuracy of all tried model results
model_results = model.results()
# and aggregated from cross validation
validation_results = model.results("validation")
```

The lower-level API, in particular the large section of time series transformers in the scikit-learn style, can also be utilized independently from the AutoML framework.

Check out [extended_tutorial.md](https://winedarksea.github.io/AutoTS/build/html/source/tutorial.html) for a more detailed guide to features.
Also take a look at the [production_example.py](https://github.com/winedarksea/AutoTS/blob/master/production_example.py)

## Tips for Speed and Large Data:
* Use appropriate model lists, especially the predefined lists:
  * `superfast` (simple naive models) and `fast` (more complex but still faster models, optimized for many series)
  * `fast_parallel` (a combination of `fast` and `parallel`) or `parallel`, given many CPU cores are available
    * `n_jobs` usually gets pretty close with `='auto'` but adjust as necessary for the environment
  * see a dict of predefined lists (some defined for internal use) with `from autots.models.model_list import model_lists`
* Use the `subset` parameter when there are many similar series; `subset=100` will often generalize well for tens of thousands of similar series.
  * if using `subset`, passing `weights` for series will weight subset selection towards higher priority series.
  * if limited by RAM, the search can be distributed by running multiple instances of AutoTS on different batches of data, having first imported a pretrained template as a starting point for all.
* Set `model_interrupt=True`, which skips over the current model when a `KeyboardInterrupt` (i.e. `ctrl+c`) is pressed (although if the interrupt falls between generations it will stop the entire training).
* Use the `result_file` argument of `.fit()`, which will save progress after each generation - helpful if a long training run is interrupted. Use `import_results` to recover.
* While transformations are pretty fast, setting `transformer_max_depth` to a lower number (say, 2) will increase speed. Also utilize `transformer_list='fast'` or `'superfast'`.
* Check out [this example](https://github.com/winedarksea/AutoTS/discussions/76) of using AutoTS with pandas UDF.
* Ensembles are obviously slower to predict because they run many models; 'distance' ensembles are roughly 2x slower and 'simple' ensembles 3x-5x slower.
* `ensemble='horizontal-max'` with `model_list='no_shared_fast'` can scale relatively well given many CPU cores, because each model is only run on the series it is needed for.
* Reducing `num_validations` and `models_to_validate` will decrease runtime but may lead to poorer model selections.
* For datasets with many records, upsampling (for example, from daily to monthly frequency forecasts) can reduce training time if appropriate.
  * this can be done by adjusting `frequency` and `aggfunc`, but is probably best done before passing data into AutoTS.
* It will be faster if NaNs are already filled. If a search for an optimal NaN fill method is not required, fill any NaNs with a satisfactory method before passing data to the class.
* Set `runtime_weighting` in `metric_weighting` to a higher value. This will guide the search towards faster models, although it may come at the expense of accuracy.

## How to Contribute:
* Give feedback on where you find the documentation confusing
* Use AutoTS and...
  * Report errors and request features by adding Issues on GitHub
  * Post the top model templates for your data (to help improve the starting templates)
  * Feel free to recommend different search grid parameters for your favorite models
* And, of course, contribute to the codebase directly on GitHub.
*Also known as Project CATS (Catlin's Automated Time Series) hence the logo.*

%prep
%autosetup -n autots-0.5.6

%build
%py3_build

%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
	find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
	find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
	find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
	find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
	find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .

%files -n python3-autots -f filelist.lst
%dir %{python3_sitelib}/*

%files help -f doclist.lst
%{_docdir}/*

%changelog
* Tue Apr 25 2023 Python_Bot - 0.5.6-1
- Package Spec generated