%global _empty_manifest_terminate_build 0
Name: python-nowcasting-dataset
Version: 3.7.21
Release: 1
Summary: Nowcasting Dataset
License: MIT
URL: https://pypi.org/project/nowcasting-dataset/
Source0: https://mirrors.nju.edu.cn/pypi/web/packages/7a/41/a339689cef755fdd46b336e6d6cdd4ce6e0738fbdf55266319d230a12074/nowcasting_dataset-3.7.21.tar.gz
BuildArch: noarch

Requires: python3-rioxarray
Requires: python3-numpy
Requires: python3-pandas
Requires: python3-geopandas
Requires: python3-zarr
Requires: python3-xarray
Requires: python3-h5netcdf
Requires: python3-gcsfs
Requires: python3-dask
Requires: python3-pvlib
Requires: python3-pyproj
Requires: python3-pytest
Requires: python3-coverage
Requires: python3-pytest-cov
Requires: python3-jedi
Requires: python3-mypy
Requires: python3-pydantic
Requires: python3-tqdm
Requires: python3-fsspec
Requires: python3-pathy
Requires: python3-opencv-contrib-python-headless
Requires: python3-gitpython
Requires: python3-pyresample
Requires: python3-nowcasting-datamodel
Requires: python3-scipy
Requires: python3-pyaml-env
Requires: python3-black
Requires: python3-plotly
Requires: python3-gcsfs
Requires: python3-matplotlib
Requires: python3-ipykernel
Requires: python3-pre-commit

%description
# nowcasting_dataset

[![All Contributors](https://img.shields.io/badge/all_contributors-9-orange.svg?style=flat-square)](#contributors-)
[![codecov](https://codecov.io/gh/openclimatefix/nowcasting_dataset/branch/main/graph/badge.svg?token=X0P4KTHWVA)](https://codecov.io/gh/openclimatefix/nowcasting_dataset)

Pre-prepare batches of data for use in machine learning training.

This code combines several data sources including:

* Satellite imagery (EUMETSAT SEVIRI RSS 5-minutely data of UK)
* Numerical Weather Predictions (NWPs. UK Met Office UKV model from CEDA)
* Solar PV power timeseries data (from PVOutput.org, downloaded using our [pvoutput Python code](https://github.com/openclimatefix/pvoutput).)
* Estimated total solar PV generation for each of the ~350 "grid supply points" (GSPs) in Britain from [Sheffield Solar's PV Live Regional API](https://www.solar.sheffield.ac.uk/pvlive/regional/).
* Topographic data.
* The Sun's azimuth and angle.

This repo doesn't contain the ML models themselves. Please see [this page for an overview](https://github.com/openclimatefix/nowcasting) of the Open Climate Fix solar PV nowcasting project, and how our code repositories fit together.

# User manual

## Installation

### `conda`

From within the cloned `nowcasting_dataset` directory:

```shell
conda env create -f environment.yml
conda activate nowcasting_dataset
pip install -e .
```

### `pip`

A (probably older) version is also available through `pip install nowcasting-dataset`.

### PV Live API

If you also want to install [PVLive](https://github.com/SheffieldSolar/PV_Live-API), use `pip install git+https://github.com/SheffieldSolar/PV_Live-API`.

### Pre-commit

A pre-commit hook has been installed which makes `black` run with every commit. You need to install `black` and `pre-commit` (these will be installed by `conda` or `pip` when installing `nowcasting_dataset`) and run `pre-commit install` in this repo.

## Testing

To test using the small amount of data stored in this repo: `py.test -s`

To output debug logs while running the tests, run `py.test --log-cli-level=10`

To test using the full dataset on Google Cloud, add the `--use_cloud_data` switch.

## docker

Test using a docker file and database:

```
docker stop $(docker ps -a -q)
docker-compose -f test-docker-compose.yml build
docker-compose -f test-docker-compose.yml run dataset
```

## Downloading data

### Satellite data

Use [Satip](https://github.com/openclimatefix/Satip) to download native EUMETSAT SEVIRI RSS data from EUMETSAT's API and then convert to an intermediate file format.

### PV data from PVOutput.org

Download PV timeseries data from PVOutput.org using [our PVOutput code](https://github.com/openclimatefix/pvoutput).

### OCF uk_pv dataset

PV solar generation data from the UK. This dataset contains data from 1311 PV systems from 2018-01-01 to 2021-10-27. The time series of solar generation is in 5-minute chunks. This data is collected from live PV systems in the UK. We have obfuscated the location of the PV systems for privacy.

![](./docs/uk_pv_locations.jpg)

### Numerical weather predictions from the UK Met Office

Please use our [`nwp`](https://github.com/openclimatefix/nwp) code to download UKV NWPs and convert to Zarr.

### GSP-level estimates of PV outturn from PV Live Regional

TODO - GSP

### Topographical data

1. Make an account at the [USGS EarthExplorer](https://earthexplorer.usgs.gov/) website
2. Create a region of the world to download data for, in our case, the spatial extent of the SEVIRI RSS image
3. Select the data products you want, in this case SRTM elevation maps
4. Download all the SRTM files that cover that area

There does not seem to be an automated way to do this selecting and downloading, so this might take a while.

## Configure `nowcasting_dataset` to point to the downloaded data

Copy and modify one of the config yaml files in [`nowcasting_dataset/config/`](https://github.com/openclimatefix/nowcasting_dataset/tree/main/nowcasting_dataset/config).

## Prepare ML batches

Run [`scripts/prepare_ml_data.py --help`](https://github.com/openclimatefix/nowcasting_dataset/blob/main/scripts/prepare_ml_data.py) to learn how to run the `prepare_ml_data.py` script.

## What exactly is in each batch?

Please see the `data_sources/<data_source_name>/<data_source_name>_model.py` files (where `<data_source_name>` is one of {datetime, metadata, gsp, nwp, pv, satellite, sun, topographic}) for documentation about the different data fields in each example / batch.
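
As a rough illustration of where that per-data-source documentation would be found, the Python sketch below prints one expected model file path per data source. The `model_file_path` helper, the `data_sources/<name>/<name>_model.py` layout, and the `nowcasting_dataset` package root are assumptions made for this example; none of them is part of the library's API, so check the repository for the actual locations.

```python
from pathlib import PurePosixPath

# Data source names as listed in the section above.
DATA_SOURCE_NAMES = (
    "datetime", "metadata", "gsp", "nwp",
    "pv", "satellite", "sun", "topographic",
)


def model_file_path(name: str, package_root: str = "nowcasting_dataset") -> PurePosixPath:
    """Return the assumed location of the model file for one data source.

    Hypothetical helper: the directory layout is an assumption for illustration.
    """
    if name not in DATA_SOURCE_NAMES:
        raise ValueError(f"Unknown data source: {name!r}")
    return PurePosixPath(package_root) / "data_sources" / name / f"{name}_model.py"


if __name__ == "__main__":
    # Print one candidate path per data source.
    for source_name in DATA_SOURCE_NAMES:
        print(model_file_path(source_name))
```

Validating the name against the known list makes a typo fail early instead of producing a path that does not exist.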

# History of nowcasting_dataset

When we first started writing `nowcasting_dataset`, our intention was to load and align data from these three datasets on-the-fly during ML training. But it just isn't quite fast enough to keep a modern GPU constantly fed with data when loading multiple satellite channels and multiple NWP parameters.

So, now, this code is used to pre-prepare thousands of batches, and save these batches to disk, each as a separate NetCDF file. These files can then be loaded super-quickly at training time. The end result is a 12x speedup in training.

## Contributors ✨

Thanks goes to these wonderful people ([emoji key](https://allcontributors.org/docs/en/emoji-key)):

* Jack Kelly 💻
* Jacob Bieker 💻
* Peter Dudfield 💻
* Flo 💻
* Rohan Nuttall 💻
* Nasser Benabderrazik 💻
* Shanmukh Chava 💻
* Rishi Kumar Ray 💻
* JanEbbing 💻

This project follows the [all-contributors](https://github.com/all-contributors/all-contributors) specification. Contributions of any kind welcome!

%package -n python3-nowcasting-dataset
Summary: Nowcasting Dataset
Provides: python-nowcasting-dataset
BuildRequires: python3-devel
BuildRequires: python3-setuptools
BuildRequires: python3-pip
%description -n python3-nowcasting-dataset
# nowcasting_dataset

[![All Contributors](https://img.shields.io/badge/all_contributors-9-orange.svg?style=flat-square)](#contributors-)
[![codecov](https://codecov.io/gh/openclimatefix/nowcasting_dataset/branch/main/graph/badge.svg?token=X0P4KTHWVA)](https://codecov.io/gh/openclimatefix/nowcasting_dataset)

Pre-prepare batches of data for use in machine learning training.

This code combines several data sources including:

* Satellite imagery (EUMETSAT SEVIRI RSS 5-minutely data of UK)
* Numerical Weather Predictions (NWPs. UK Met Office UKV model from CEDA)
* Solar PV power timeseries data (from PVOutput.org, downloaded using our [pvoutput Python code](https://github.com/openclimatefix/pvoutput).)
* Estimated total solar PV generation for each of the ~350 "grid supply points" (GSPs) in Britain from [Sheffield Solar's PV Live Regional API](https://www.solar.sheffield.ac.uk/pvlive/regional/).
* Topographic data.
* The Sun's azimuth and angle.

This repo doesn't contain the ML models themselves. Please see [this page for an overview](https://github.com/openclimatefix/nowcasting) of the Open Climate Fix solar PV nowcasting project, and how our code repositories fit together.

# User manual

## Installation

### `conda`

From within the cloned `nowcasting_dataset` directory:

```shell
conda env create -f environment.yml
conda activate nowcasting_dataset
pip install -e .
```

### `pip`

A (probably older) version is also available through `pip install nowcasting-dataset`.

### PV Live API

If you also want to install [PVLive](https://github.com/SheffieldSolar/PV_Live-API), use `pip install git+https://github.com/SheffieldSolar/PV_Live-API`.

### Pre-commit

A pre-commit hook has been installed which makes `black` run with every commit. You need to install `black` and `pre-commit` (these will be installed by `conda` or `pip` when installing `nowcasting_dataset`) and run `pre-commit install` in this repo.

## Testing

To test using the small amount of data stored in this repo: `py.test -s`

To output debug logs while running the tests, run `py.test --log-cli-level=10`

To test using the full dataset on Google Cloud, add the `--use_cloud_data` switch.

## docker

Test using a docker file and database:

```
docker stop $(docker ps -a -q)
docker-compose -f test-docker-compose.yml build
docker-compose -f test-docker-compose.yml run dataset
```

## Downloading data

### Satellite data

Use [Satip](https://github.com/openclimatefix/Satip) to download native EUMETSAT SEVIRI RSS data from EUMETSAT's API and then convert to an intermediate file format.

### PV data from PVOutput.org

Download PV timeseries data from PVOutput.org using [our PVOutput code](https://github.com/openclimatefix/pvoutput).

### OCF uk_pv dataset

PV solar generation data from the UK. This dataset contains data from 1311 PV systems from 2018-01-01 to 2021-10-27. The time series of solar generation is in 5-minute chunks. This data is collected from live PV systems in the UK. We have obfuscated the location of the PV systems for privacy.

![](./docs/uk_pv_locations.jpg)

### Numerical weather predictions from the UK Met Office

Please use our [`nwp`](https://github.com/openclimatefix/nwp) code to download UKV NWPs and convert to Zarr.

### GSP-level estimates of PV outturn from PV Live Regional

TODO - GSP

### Topographical data

1. Make an account at the [USGS EarthExplorer](https://earthexplorer.usgs.gov/) website
2. Create a region of the world to download data for, in our case, the spatial extent of the SEVIRI RSS image
3. Select the data products you want, in this case SRTM elevation maps
4. Download all the SRTM files that cover that area

There does not seem to be an automated way to do this selecting and downloading, so this might take a while.

## Configure `nowcasting_dataset` to point to the downloaded data

Copy and modify one of the config yaml files in [`nowcasting_dataset/config/`](https://github.com/openclimatefix/nowcasting_dataset/tree/main/nowcasting_dataset/config).

## Prepare ML batches

Run [`scripts/prepare_ml_data.py --help`](https://github.com/openclimatefix/nowcasting_dataset/blob/main/scripts/prepare_ml_data.py) to learn how to run the `prepare_ml_data.py` script.

## What exactly is in each batch?

Please see the `data_sources/<data_source_name>/<data_source_name>_model.py` files (where `<data_source_name>` is one of {datetime, metadata, gsp, nwp, pv, satellite, sun, topographic}) for documentation about the different data fields in each example / batch.

# History of nowcasting_dataset

When we first started writing `nowcasting_dataset`, our intention was to load and align data from these three datasets on-the-fly during ML training. But it just isn't quite fast enough to keep a modern GPU constantly fed with data when loading multiple satellite channels and multiple NWP parameters.

So, now, this code is used to pre-prepare thousands of batches, and save these batches to disk, each as a separate NetCDF file. These files can then be loaded super-quickly at training time. The end result is a 12x speedup in training.

## Contributors ✨

Thanks goes to these wonderful people ([emoji key](https://allcontributors.org/docs/en/emoji-key)):

* Jack Kelly 💻
* Jacob Bieker 💻
* Peter Dudfield 💻
* Flo 💻
* Rohan Nuttall 💻
* Nasser Benabderrazik 💻
* Shanmukh Chava 💻
* Rishi Kumar Ray 💻
* JanEbbing 💻

This project follows the [all-contributors](https://github.com/all-contributors/all-contributors) specification. Contributions of any kind welcome!

%package help
Summary: Development documents and examples for nowcasting-dataset
Provides: python3-nowcasting-dataset-doc
%description help
# nowcasting_dataset

[![All Contributors](https://img.shields.io/badge/all_contributors-9-orange.svg?style=flat-square)](#contributors-)
[![codecov](https://codecov.io/gh/openclimatefix/nowcasting_dataset/branch/main/graph/badge.svg?token=X0P4KTHWVA)](https://codecov.io/gh/openclimatefix/nowcasting_dataset)

Pre-prepare batches of data for use in machine learning training.

This code combines several data sources including:

* Satellite imagery (EUMETSAT SEVIRI RSS 5-minutely data of UK)
* Numerical Weather Predictions (NWPs. UK Met Office UKV model from CEDA)
* Solar PV power timeseries data (from PVOutput.org, downloaded using our [pvoutput Python code](https://github.com/openclimatefix/pvoutput).)
* Estimated total solar PV generation for each of the ~350 "grid supply points" (GSPs) in Britain from [Sheffield Solar's PV Live Regional API](https://www.solar.sheffield.ac.uk/pvlive/regional/).
* Topographic data.
* The Sun's azimuth and angle.

This repo doesn't contain the ML models themselves. Please see [this page for an overview](https://github.com/openclimatefix/nowcasting) of the Open Climate Fix solar PV nowcasting project, and how our code repositories fit together.

# User manual

## Installation

### `conda`

From within the cloned `nowcasting_dataset` directory:

```shell
conda env create -f environment.yml
conda activate nowcasting_dataset
pip install -e .
```

### `pip`

A (probably older) version is also available through `pip install nowcasting-dataset`.

### PV Live API

If you also want to install [PVLive](https://github.com/SheffieldSolar/PV_Live-API), use `pip install git+https://github.com/SheffieldSolar/PV_Live-API`.

### Pre-commit

A pre-commit hook has been installed which makes `black` run with every commit. You need to install `black` and `pre-commit` (these will be installed by `conda` or `pip` when installing `nowcasting_dataset`) and run `pre-commit install` in this repo.

## Testing

To test using the small amount of data stored in this repo: `py.test -s`

To output debug logs while running the tests, run `py.test --log-cli-level=10`

To test using the full dataset on Google Cloud, add the `--use_cloud_data` switch.

## docker

Test using a docker file and database:

```
docker stop $(docker ps -a -q)
docker-compose -f test-docker-compose.yml build
docker-compose -f test-docker-compose.yml run dataset
```

## Downloading data

### Satellite data

Use [Satip](https://github.com/openclimatefix/Satip) to download native EUMETSAT SEVIRI RSS data from EUMETSAT's API and then convert to an intermediate file format.

### PV data from PVOutput.org

Download PV timeseries data from PVOutput.org using [our PVOutput code](https://github.com/openclimatefix/pvoutput).

### OCF uk_pv dataset

PV solar generation data from the UK. This dataset contains data from 1311 PV systems from 2018-01-01 to 2021-10-27. The time series of solar generation is in 5-minute chunks. This data is collected from live PV systems in the UK. We have obfuscated the location of the PV systems for privacy.

![](./docs/uk_pv_locations.jpg)

### Numerical weather predictions from the UK Met Office

Please use our [`nwp`](https://github.com/openclimatefix/nwp) code to download UKV NWPs and convert to Zarr.

### GSP-level estimates of PV outturn from PV Live Regional

TODO - GSP

### Topographical data

1. Make an account at the [USGS EarthExplorer](https://earthexplorer.usgs.gov/) website
2. Create a region of the world to download data for, in our case, the spatial extent of the SEVIRI RSS image
3. Select the data products you want, in this case SRTM elevation maps
4. Download all the SRTM files that cover that area

There does not seem to be an automated way to do this selecting and downloading, so this might take a while.

## Configure `nowcasting_dataset` to point to the downloaded data

Copy and modify one of the config yaml files in [`nowcasting_dataset/config/`](https://github.com/openclimatefix/nowcasting_dataset/tree/main/nowcasting_dataset/config).

## Prepare ML batches

Run [`scripts/prepare_ml_data.py --help`](https://github.com/openclimatefix/nowcasting_dataset/blob/main/scripts/prepare_ml_data.py) to learn how to run the `prepare_ml_data.py` script.

## What exactly is in each batch?

Please see the `data_sources/<data_source_name>/<data_source_name>_model.py` files (where `<data_source_name>` is one of {datetime, metadata, gsp, nwp, pv, satellite, sun, topographic}) for documentation about the different data fields in each example / batch.

# History of nowcasting_dataset

When we first started writing `nowcasting_dataset`, our intention was to load and align data from these three datasets on-the-fly during ML training. But it just isn't quite fast enough to keep a modern GPU constantly fed with data when loading multiple satellite channels and multiple NWP parameters.

So, now, this code is used to pre-prepare thousands of batches, and save these batches to disk, each as a separate NetCDF file. These files can then be loaded super-quickly at training time. The end result is a 12x speedup in training.

## Contributors ✨

Thanks goes to these wonderful people ([emoji key](https://allcontributors.org/docs/en/emoji-key)):

* Jack Kelly 💻
* Jacob Bieker 💻
* Peter Dudfield 💻
* Flo 💻
* Rohan Nuttall 💻
* Nasser Benabderrazik 💻
* Shanmukh Chava 💻
* Rishi Kumar Ray 💻
* JanEbbing 💻

This project follows the [all-contributors](https://github.com/all-contributors/all-contributors) specification. Contributions of any kind welcome!

%prep
%autosetup -n nowcasting-dataset-3.7.21

%build
%py3_build

%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
    find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
    find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
    find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
    find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
    find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .

%files -n python3-nowcasting-dataset -f filelist.lst
%dir %{python3_sitelib}/*

%files help -f doclist.lst
%{_docdir}/*

%changelog
* Sun Apr 23 2023 Python_Bot - 3.7.21-1
- Package Spec generated