Diffstat (limited to 'python-nowcasting-dataset.spec')
-rw-r--r-- python-nowcasting-dataset.spec 632
1 files changed, 632 insertions, 0 deletions
diff --git a/python-nowcasting-dataset.spec b/python-nowcasting-dataset.spec
new file mode 100644
index 0000000..d8f8696
--- /dev/null
+++ b/python-nowcasting-dataset.spec
@@ -0,0 +1,632 @@
+%global _empty_manifest_terminate_build 0
+Name: python-nowcasting-dataset
+Version: 3.7.21
+Release: 1
+Summary: Nowcasting Dataset
+License: MIT
+URL: https://pypi.org/project/nowcasting-dataset/
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/7a/41/a339689cef755fdd46b336e6d6cdd4ce6e0738fbdf55266319d230a12074/nowcasting_dataset-3.7.21.tar.gz
+BuildArch: noarch
+
+Requires: python3-rioxarray
+Requires: python3-numpy
+Requires: python3-pandas
+Requires: python3-geopandas
+Requires: python3-zarr
+Requires: python3-xarray
+Requires: python3-h5netcdf
+Requires: python3-gcsfs
+Requires: python3-dask
+Requires: python3-pvlib
+Requires: python3-pyproj
+Requires: python3-pytest
+Requires: python3-coverage
+Requires: python3-pytest-cov
+Requires: python3-jedi
+Requires: python3-mypy
+Requires: python3-pydantic
+Requires: python3-tqdm
+Requires: python3-fsspec
+Requires: python3-pathy
+Requires: python3-opencv-contrib-python-headless
+Requires: python3-gitpython
+Requires: python3-pyresample
+Requires: python3-nowcasting-datamodel
+Requires: python3-scipy
+Requires: python3-pyaml-env
+Requires: python3-black
+Requires: python3-plotly
+Requires: python3-gcsfs
+Requires: python3-matplotlib
+Requires: python3-ipykernel
+Requires: python3-pre-commit
+
+%description
+# nowcasting_dataset
+<!-- ALL-CONTRIBUTORS-BADGE:START - Do not remove or modify this section -->
+[![All Contributors](https://img.shields.io/badge/all_contributors-9-orange.svg?style=flat-square)](#contributors-)
+<!-- ALL-CONTRIBUTORS-BADGE:END -->
+
+[![codecov](https://codecov.io/gh/openclimatefix/nowcasting_dataset/branch/main/graph/badge.svg?token=X0P4KTHWVA)](https://codecov.io/gh/openclimatefix/nowcasting_dataset)
+
+
+Pre-prepare batches of data for use in machine learning training.
+
+This code combines several data sources including:
+
+* Satellite imagery (EUMETSAT SEVIRI RSS 5-minutely data of UK)
+* Numerical Weather Predictions (NWPs. UK Met Office UKV model from CEDA)
+* Solar PV power timeseries data (from PVOutput.org, downloaded using
+ our [pvoutput Python code](https://github.com/openclimatefix/pvoutput).)
+* Estimated total solar PV generation for each of the ~350 "grid supply points"
+ (GSPs) in Britain from [Sheffield Solar's PV Live Regional API](https://www.solar.sheffield.ac.uk/pvlive/regional/).
+* Topographic data.
+* The Sun's azimuth and angle.
+
+This repo doesn't contain the ML models themselves. Please see [this
+page for an overview](https://github.com/openclimatefix/nowcasting) of
+the Open Climate Fix solar PV nowcasting project, and how our code
+repositories fit together.
+
+
+# User manual
+
+## Installation
+
+### `conda`
+
+From within the cloned `nowcasting_dataset` directory:
+
+```shell
+conda env create -f environment.yml
+conda activate nowcasting_dataset
+pip install -e .
+```
+
+### `pip`
+
+A (probably older) version is also available through `pip install nowcasting-dataset`.
+
+
+### PV Live API
+If you also want to install [PVLive](https://github.com/SheffieldSolar/PV_Live-API), use `pip install git+https://github.com/SheffieldSolar/PV_Live-API`.
+
+
+### Pre-commit
+
+A pre-commit hook is configured that runs `black` on every commit. You need to install
+`black` and `pre-commit` (both are installed by `conda` or `pip` when installing
+`nowcasting_dataset`) and then run `pre-commit install` in this repo.
+
+
+## Testing
+
+To test using the small amount of data stored in this repo: `py.test -s`
+
+To output debug logs while running the tests, run `py.test --log-cli-level=10`.
+
+To test using the full dataset on Google Cloud, add the `--use_cloud_data` switch.
+
+## docker
+
+Test using a Docker file and database:
+
+```
+docker stop $(docker ps -a -q)
+docker-compose -f test-docker-compose.yml build
+docker-compose -f test-docker-compose.yml run dataset
+```
+
+## Downloading data
+
+### Satellite data
+
+Use [Satip](https://github.com/openclimatefix/Satip) to download
+ native EUMETSAT SEVIRI RSS data from EUMETSAT's API and then convert
+ to an intermediate file format.
+
+
+### PV data from PVOutput.org
+
+Download PV timeseries data from PVOutput.org using
+[our PVOutput code](https://github.com/openclimatefix/pvoutput).
+
+### OCF uk_pv dataset
+
+PV solar generation data from the UK. This dataset contains data from 1311 PV systems from 2018-01-01 to 2021-10-27. The time series of solar generation is in 5-minute chunks. The data is collected from live PV systems in the UK. We have obfuscated the locations of the PV systems for privacy.
+
+![](./docs/uk_pv_locations.jpg)
+
+
+### Numerical weather predictions from the UK Met Office
+
+Please use our [`nwp`](https://github.com/openclimatefix/nwp) code to download UKV NWPs and convert to Zarr.
+
+
+### GSP-level estimates of PV outturn from PV Live Regional
+
+TODO - GSP
+
+
+### Topographical data
+
+1. Make an account at the [USGS EarthExplorer](https://earthexplorer.usgs.gov/) website
+2. Define the region of the world to download data for; in our case, the spatial extent of the SEVIRI RSS image
+3. Select the data products you want; in this case, SRTM elevation maps
+4. Download all the SRTM files that cover that area
+
+There does not seem to be an automated way to do this selecting and downloading, so this might take a while.
+
+
+## Configure `nowcasting_dataset` to point to the downloaded data
+
+Copy and modify one of the YAML config files in
+[`nowcasting_dataset/config/`](https://github.com/openclimatefix/nowcasting_dataset/tree/main/nowcasting_dataset/config).
+
+
+## Prepare ML batches
+
+Run [`scripts/prepare_ml_data.py --help`](https://github.com/openclimatefix/nowcasting_dataset/blob/main/scripts/prepare_ml_data.py)
+to learn how to run the `prepare_ml_data.py` script.
+
+
+## What exactly is in each batch?
+
+Please see the `data_sources/<modality>/<modality>_model.py` files
+(where `<modality>` is one of {datetime, metadata, gsp, nwp, pv,
+satellite, sun, topographic}) for documentation about the different
+data fields in each example / batch.
+
+
+# History of nowcasting_dataset
+
+When we first started writing `nowcasting_dataset`, our intention was
+to load and align data from these three datasets on-the-fly during ML
+training. But it just isn't quite fast enough to keep a modern GPU constantly fed
+with data when loading multiple satellite channels and multiple NWP
+parameters. So, now, this code is used to pre-prepare thousands of
+batches, and save these batches to disk, each as a separate NetCDF
+file. These files can then be loaded super-quickly at training time.
+The end result is a 12x speedup in training.
+
+## Contributors ✨
+
+Thanks go to these wonderful people ([emoji key](https://allcontributors.org/docs/en/emoji-key)):
+
+<!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section -->
+<!-- prettier-ignore-start -->
+<!-- markdownlint-disable -->
+<table>
+ <tr>
+ <td align="center"><a href="http://jack-kelly.com"><img src="https://avatars.githubusercontent.com/u/460756?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Jack Kelly</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=JackKelly" title="Code">💻</a></td>
+ <td align="center"><a href="https://www.jacobbieker.com"><img src="https://avatars.githubusercontent.com/u/7170359?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Jacob Bieker</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=jacobbieker" title="Code">💻</a></td>
+ <td align="center"><a href="https://github.com/peterdudfield"><img src="https://avatars.githubusercontent.com/u/34686298?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Peter Dudfield</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=peterdudfield" title="Code">💻</a></td>
+ <td align="center"><a href="https://github.com/flowirtz"><img src="https://avatars.githubusercontent.com/u/6052785?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Flo</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=flowirtz" title="Code">💻</a></td>
+ <td align="center"><a href="https://rohancalum.github.io/"><img src="https://avatars.githubusercontent.com/u/42122330?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Rohan Nuttall</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=rohancalum" title="Code">💻</a></td>
+ <td align="center"><a href="https://github.com/lenassero"><img src="https://avatars.githubusercontent.com/u/21358816?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Nasser Benabderrazik</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=lenassero" title="Code">💻</a></td>
+ <td align="center"><a href="https://github.com/vnshanmukh"><img src="https://avatars.githubusercontent.com/u/67438038?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Shanmukh Chava</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=vnshanmukh" title="Code">💻</a></td>
+ </tr>
+ <tr>
+ <td align="center"><a href="https://github.com/RishiKumarRay"><img src="https://avatars.githubusercontent.com/u/87641376?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Rishi Kumar Ray</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=RishiKumarRay" title="Code">💻</a></td>
+ <td align="center"><a href="https://github.com/JanEbbing"><img src="https://avatars.githubusercontent.com/u/5873110?v=4?s=100" width="100px;" alt=""/><br /><sub><b>JanEbbing</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=JanEbbing" title="Code">💻</a></td>
+ </tr>
+</table>
+
+<!-- markdownlint-restore -->
+<!-- prettier-ignore-end -->
+
+<!-- ALL-CONTRIBUTORS-LIST:END -->
+
+This project follows the [all-contributors](https://github.com/all-contributors/all-contributors) specification. Contributions of any kind welcome!
+
+
+%package -n python3-nowcasting-dataset
+Summary: Nowcasting Dataset
+Provides: python-nowcasting-dataset
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-nowcasting-dataset
+# nowcasting_dataset
+<!-- ALL-CONTRIBUTORS-BADGE:START - Do not remove or modify this section -->
+[![All Contributors](https://img.shields.io/badge/all_contributors-9-orange.svg?style=flat-square)](#contributors-)
+<!-- ALL-CONTRIBUTORS-BADGE:END -->
+
+[![codecov](https://codecov.io/gh/openclimatefix/nowcasting_dataset/branch/main/graph/badge.svg?token=X0P4KTHWVA)](https://codecov.io/gh/openclimatefix/nowcasting_dataset)
+
+
+Pre-prepare batches of data for use in machine learning training.
+
+This code combines several data sources including:
+
+* Satellite imagery (EUMETSAT SEVIRI RSS 5-minutely data of UK)
+* Numerical Weather Predictions (NWPs. UK Met Office UKV model from CEDA)
+* Solar PV power timeseries data (from PVOutput.org, downloaded using
+ our [pvoutput Python code](https://github.com/openclimatefix/pvoutput).)
+* Estimated total solar PV generation for each of the ~350 "grid supply points"
+ (GSPs) in Britain from [Sheffield Solar's PV Live Regional API](https://www.solar.sheffield.ac.uk/pvlive/regional/).
+* Topographic data.
+* The Sun's azimuth and angle.
+
+This repo doesn't contain the ML models themselves. Please see [this
+page for an overview](https://github.com/openclimatefix/nowcasting) of
+the Open Climate Fix solar PV nowcasting project, and how our code
+repositories fit together.
+
+
+# User manual
+
+## Installation
+
+### `conda`
+
+From within the cloned `nowcasting_dataset` directory:
+
+```shell
+conda env create -f environment.yml
+conda activate nowcasting_dataset
+pip install -e .
+```
+
+### `pip`
+
+A (probably older) version is also available through `pip install nowcasting-dataset`.
+
+
+### PV Live API
+If you also want to install [PVLive](https://github.com/SheffieldSolar/PV_Live-API), use `pip install git+https://github.com/SheffieldSolar/PV_Live-API`.
+
+
+### Pre-commit
+
+A pre-commit hook is configured that runs `black` on every commit. You need to install
+`black` and `pre-commit` (both are installed by `conda` or `pip` when installing
+`nowcasting_dataset`) and then run `pre-commit install` in this repo.
+
+
+## Testing
+
+To test using the small amount of data stored in this repo: `py.test -s`
+
+To output debug logs while running the tests, run `py.test --log-cli-level=10`.
+
+To test using the full dataset on Google Cloud, add the `--use_cloud_data` switch.
+
+## docker
+
+Test using a Docker file and database:
+
+```
+docker stop $(docker ps -a -q)
+docker-compose -f test-docker-compose.yml build
+docker-compose -f test-docker-compose.yml run dataset
+```
+
+## Downloading data
+
+### Satellite data
+
+Use [Satip](https://github.com/openclimatefix/Satip) to download
+ native EUMETSAT SEVIRI RSS data from EUMETSAT's API and then convert
+ to an intermediate file format.
+
+
+### PV data from PVOutput.org
+
+Download PV timeseries data from PVOutput.org using
+[our PVOutput code](https://github.com/openclimatefix/pvoutput).
+
+### OCF uk_pv dataset
+
+PV solar generation data from the UK. This dataset contains data from 1311 PV systems from 2018-01-01 to 2021-10-27. The time series of solar generation is in 5-minute chunks. The data is collected from live PV systems in the UK. We have obfuscated the locations of the PV systems for privacy.
+
+![](./docs/uk_pv_locations.jpg)
+
+
+### Numerical weather predictions from the UK Met Office
+
+Please use our [`nwp`](https://github.com/openclimatefix/nwp) code to download UKV NWPs and convert to Zarr.
+
+
+### GSP-level estimates of PV outturn from PV Live Regional
+
+TODO - GSP
+
+
+### Topographical data
+
+1. Make an account at the [USGS EarthExplorer](https://earthexplorer.usgs.gov/) website
+2. Define the region of the world to download data for; in our case, the spatial extent of the SEVIRI RSS image
+3. Select the data products you want; in this case, SRTM elevation maps
+4. Download all the SRTM files that cover that area
+
+There does not seem to be an automated way to do this selecting and downloading, so this might take a while.
+
+
+## Configure `nowcasting_dataset` to point to the downloaded data
+
+Copy and modify one of the YAML config files in
+[`nowcasting_dataset/config/`](https://github.com/openclimatefix/nowcasting_dataset/tree/main/nowcasting_dataset/config).
+
+
+## Prepare ML batches
+
+Run [`scripts/prepare_ml_data.py --help`](https://github.com/openclimatefix/nowcasting_dataset/blob/main/scripts/prepare_ml_data.py)
+to learn how to run the `prepare_ml_data.py` script.
+
+
+## What exactly is in each batch?
+
+Please see the `data_sources/<modality>/<modality>_model.py` files
+(where `<modality>` is one of {datetime, metadata, gsp, nwp, pv,
+satellite, sun, topographic}) for documentation about the different
+data fields in each example / batch.
+
+
+# History of nowcasting_dataset
+
+When we first started writing `nowcasting_dataset`, our intention was
+to load and align data from these three datasets on-the-fly during ML
+training. But it just isn't quite fast enough to keep a modern GPU constantly fed
+with data when loading multiple satellite channels and multiple NWP
+parameters. So, now, this code is used to pre-prepare thousands of
+batches, and save these batches to disk, each as a separate NetCDF
+file. These files can then be loaded super-quickly at training time.
+The end result is a 12x speedup in training.
+
+## Contributors ✨
+
+Thanks go to these wonderful people ([emoji key](https://allcontributors.org/docs/en/emoji-key)):
+
+<!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section -->
+<!-- prettier-ignore-start -->
+<!-- markdownlint-disable -->
+<table>
+ <tr>
+ <td align="center"><a href="http://jack-kelly.com"><img src="https://avatars.githubusercontent.com/u/460756?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Jack Kelly</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=JackKelly" title="Code">💻</a></td>
+ <td align="center"><a href="https://www.jacobbieker.com"><img src="https://avatars.githubusercontent.com/u/7170359?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Jacob Bieker</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=jacobbieker" title="Code">💻</a></td>
+ <td align="center"><a href="https://github.com/peterdudfield"><img src="https://avatars.githubusercontent.com/u/34686298?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Peter Dudfield</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=peterdudfield" title="Code">💻</a></td>
+ <td align="center"><a href="https://github.com/flowirtz"><img src="https://avatars.githubusercontent.com/u/6052785?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Flo</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=flowirtz" title="Code">💻</a></td>
+ <td align="center"><a href="https://rohancalum.github.io/"><img src="https://avatars.githubusercontent.com/u/42122330?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Rohan Nuttall</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=rohancalum" title="Code">💻</a></td>
+ <td align="center"><a href="https://github.com/lenassero"><img src="https://avatars.githubusercontent.com/u/21358816?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Nasser Benabderrazik</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=lenassero" title="Code">💻</a></td>
+ <td align="center"><a href="https://github.com/vnshanmukh"><img src="https://avatars.githubusercontent.com/u/67438038?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Shanmukh Chava</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=vnshanmukh" title="Code">💻</a></td>
+ </tr>
+ <tr>
+ <td align="center"><a href="https://github.com/RishiKumarRay"><img src="https://avatars.githubusercontent.com/u/87641376?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Rishi Kumar Ray</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=RishiKumarRay" title="Code">💻</a></td>
+ <td align="center"><a href="https://github.com/JanEbbing"><img src="https://avatars.githubusercontent.com/u/5873110?v=4?s=100" width="100px;" alt=""/><br /><sub><b>JanEbbing</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=JanEbbing" title="Code">💻</a></td>
+ </tr>
+</table>
+
+<!-- markdownlint-restore -->
+<!-- prettier-ignore-end -->
+
+<!-- ALL-CONTRIBUTORS-LIST:END -->
+
+This project follows the [all-contributors](https://github.com/all-contributors/all-contributors) specification. Contributions of any kind welcome!
+
+
+%package help
+Summary: Development documents and examples for nowcasting-dataset
+Provides: python3-nowcasting-dataset-doc
+%description help
+# nowcasting_dataset
+<!-- ALL-CONTRIBUTORS-BADGE:START - Do not remove or modify this section -->
+[![All Contributors](https://img.shields.io/badge/all_contributors-9-orange.svg?style=flat-square)](#contributors-)
+<!-- ALL-CONTRIBUTORS-BADGE:END -->
+
+[![codecov](https://codecov.io/gh/openclimatefix/nowcasting_dataset/branch/main/graph/badge.svg?token=X0P4KTHWVA)](https://codecov.io/gh/openclimatefix/nowcasting_dataset)
+
+
+Pre-prepare batches of data for use in machine learning training.
+
+This code combines several data sources including:
+
+* Satellite imagery (EUMETSAT SEVIRI RSS 5-minutely data of UK)
+* Numerical Weather Predictions (NWPs. UK Met Office UKV model from CEDA)
+* Solar PV power timeseries data (from PVOutput.org, downloaded using
+ our [pvoutput Python code](https://github.com/openclimatefix/pvoutput).)
+* Estimated total solar PV generation for each of the ~350 "grid supply points"
+ (GSPs) in Britain from [Sheffield Solar's PV Live Regional API](https://www.solar.sheffield.ac.uk/pvlive/regional/).
+* Topographic data.
+* The Sun's azimuth and angle.
+
+This repo doesn't contain the ML models themselves. Please see [this
+page for an overview](https://github.com/openclimatefix/nowcasting) of
+the Open Climate Fix solar PV nowcasting project, and how our code
+repositories fit together.
+
+
+# User manual
+
+## Installation
+
+### `conda`
+
+From within the cloned `nowcasting_dataset` directory:
+
+```shell
+conda env create -f environment.yml
+conda activate nowcasting_dataset
+pip install -e .
+```
+
+### `pip`
+
+A (probably older) version is also available through `pip install nowcasting-dataset`.
+
+
+### PV Live API
+If you also want to install [PVLive](https://github.com/SheffieldSolar/PV_Live-API), use `pip install git+https://github.com/SheffieldSolar/PV_Live-API`.
+
+
+### Pre-commit
+
+A pre-commit hook is configured that runs `black` on every commit. You need to install
+`black` and `pre-commit` (both are installed by `conda` or `pip` when installing
+`nowcasting_dataset`) and then run `pre-commit install` in this repo.
+
+
+## Testing
+
+To test using the small amount of data stored in this repo: `py.test -s`
+
+To output debug logs while running the tests, run `py.test --log-cli-level=10`.
+
+To test using the full dataset on Google Cloud, add the `--use_cloud_data` switch.
+
+## docker
+
+Test using a Docker file and database:
+
+```
+docker stop $(docker ps -a -q)
+docker-compose -f test-docker-compose.yml build
+docker-compose -f test-docker-compose.yml run dataset
+```
+
+## Downloading data
+
+### Satellite data
+
+Use [Satip](https://github.com/openclimatefix/Satip) to download
+ native EUMETSAT SEVIRI RSS data from EUMETSAT's API and then convert
+ to an intermediate file format.
+
+
+### PV data from PVOutput.org
+
+Download PV timeseries data from PVOutput.org using
+[our PVOutput code](https://github.com/openclimatefix/pvoutput).
+
+### OCF uk_pv dataset
+
+PV solar generation data from the UK. This dataset contains data from 1311 PV systems from 2018-01-01 to 2021-10-27. The time series of solar generation is in 5-minute chunks. The data is collected from live PV systems in the UK. We have obfuscated the locations of the PV systems for privacy.
+
+![](./docs/uk_pv_locations.jpg)
+
+
+### Numerical weather predictions from the UK Met Office
+
+Please use our [`nwp`](https://github.com/openclimatefix/nwp) code to download UKV NWPs and convert to Zarr.
+
+
+### GSP-level estimates of PV outturn from PV Live Regional
+
+TODO - GSP
+
+
+### Topographical data
+
+1. Make an account at the [USGS EarthExplorer](https://earthexplorer.usgs.gov/) website
+2. Define the region of the world to download data for; in our case, the spatial extent of the SEVIRI RSS image
+3. Select the data products you want; in this case, SRTM elevation maps
+4. Download all the SRTM files that cover that area
+
+There does not seem to be an automated way to do this selecting and downloading, so this might take a while.
+
+
+## Configure `nowcasting_dataset` to point to the downloaded data
+
+Copy and modify one of the YAML config files in
+[`nowcasting_dataset/config/`](https://github.com/openclimatefix/nowcasting_dataset/tree/main/nowcasting_dataset/config).
+
+
+## Prepare ML batches
+
+Run [`scripts/prepare_ml_data.py --help`](https://github.com/openclimatefix/nowcasting_dataset/blob/main/scripts/prepare_ml_data.py)
+to learn how to run the `prepare_ml_data.py` script.
+
+
+## What exactly is in each batch?
+
+Please see the `data_sources/<modality>/<modality>_model.py` files
+(where `<modality>` is one of {datetime, metadata, gsp, nwp, pv,
+satellite, sun, topographic}) for documentation about the different
+data fields in each example / batch.
+
+
+# History of nowcasting_dataset
+
+When we first started writing `nowcasting_dataset`, our intention was
+to load and align data from these three datasets on-the-fly during ML
+training. But it just isn't quite fast enough to keep a modern GPU constantly fed
+with data when loading multiple satellite channels and multiple NWP
+parameters. So, now, this code is used to pre-prepare thousands of
+batches, and save these batches to disk, each as a separate NetCDF
+file. These files can then be loaded super-quickly at training time.
+The end result is a 12x speedup in training.
+
+## Contributors ✨
+
+Thanks go to these wonderful people ([emoji key](https://allcontributors.org/docs/en/emoji-key)):
+
+<!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section -->
+<!-- prettier-ignore-start -->
+<!-- markdownlint-disable -->
+<table>
+ <tr>
+ <td align="center"><a href="http://jack-kelly.com"><img src="https://avatars.githubusercontent.com/u/460756?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Jack Kelly</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=JackKelly" title="Code">💻</a></td>
+ <td align="center"><a href="https://www.jacobbieker.com"><img src="https://avatars.githubusercontent.com/u/7170359?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Jacob Bieker</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=jacobbieker" title="Code">💻</a></td>
+ <td align="center"><a href="https://github.com/peterdudfield"><img src="https://avatars.githubusercontent.com/u/34686298?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Peter Dudfield</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=peterdudfield" title="Code">💻</a></td>
+ <td align="center"><a href="https://github.com/flowirtz"><img src="https://avatars.githubusercontent.com/u/6052785?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Flo</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=flowirtz" title="Code">💻</a></td>
+ <td align="center"><a href="https://rohancalum.github.io/"><img src="https://avatars.githubusercontent.com/u/42122330?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Rohan Nuttall</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=rohancalum" title="Code">💻</a></td>
+ <td align="center"><a href="https://github.com/lenassero"><img src="https://avatars.githubusercontent.com/u/21358816?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Nasser Benabderrazik</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=lenassero" title="Code">💻</a></td>
+ <td align="center"><a href="https://github.com/vnshanmukh"><img src="https://avatars.githubusercontent.com/u/67438038?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Shanmukh Chava</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=vnshanmukh" title="Code">💻</a></td>
+ </tr>
+ <tr>
+ <td align="center"><a href="https://github.com/RishiKumarRay"><img src="https://avatars.githubusercontent.com/u/87641376?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Rishi Kumar Ray</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=RishiKumarRay" title="Code">💻</a></td>
+ <td align="center"><a href="https://github.com/JanEbbing"><img src="https://avatars.githubusercontent.com/u/5873110?v=4?s=100" width="100px;" alt=""/><br /><sub><b>JanEbbing</b></sub></a><br /><a href="https://github.com/openclimatefix/nowcasting_dataset/commits?author=JanEbbing" title="Code">💻</a></td>
+ </tr>
+</table>
+
+<!-- markdownlint-restore -->
+<!-- prettier-ignore-end -->
+
+<!-- ALL-CONTRIBUTORS-LIST:END -->
+
+This project follows the [all-contributors](https://github.com/all-contributors/all-contributors) specification. Contributions of any kind welcome!
+
+
+%prep
+%autosetup -n nowcasting-dataset-3.7.21
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+ find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+ find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+ find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+ find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+ find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-nowcasting-dataset -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Tue Apr 11 2023 Python_Bot <Python_Bot@openeuler.org> - 3.7.21-1
+- Package Spec generated