| author | CoprDistGit <infra@openeuler.org> | 2023-05-31 06:14:11 +0000 |
|---|---|---|
| committer | CoprDistGit <infra@openeuler.org> | 2023-05-31 06:14:11 +0000 |
| commit | 2c26f2365bd978087db89b204cf758aefd158f54 (patch) | |
| tree | 5646e07bb5da32118d6f98fa8305a677c2c6c9bb | |
| parent | 8fd51d7b90fb64b2090e8d2eeb441d3a0565f858 (diff) | |
automatic import of python-cdp-scrapers
| -rw-r--r-- | .gitignore | 1 |
| -rw-r--r-- | python-cdp-scrapers.spec | 354 |
| -rw-r--r-- | sources | 1 |
3 files changed, 356 insertions, 0 deletions
@@ -0,0 +1 @@
+/cdp-scrapers-0.6.1.tar.gz
diff --git a/python-cdp-scrapers.spec b/python-cdp-scrapers.spec
new file mode 100644
index 0000000..2b2b8b4
--- /dev/null
+++ b/python-cdp-scrapers.spec
@@ -0,0 +1,354 @@
+%global _empty_manifest_terminate_build 0
+Name: python-cdp-scrapers
+Version: 0.6.1
+Release: 1
+Summary: Scratchpad for scraper development and general utilities.
+License: MIT license
+URL: https://github.com/CouncilDataProject/cdp-scrapers
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/6d/9d/e4630d79e8378ee1cd210d94a36b18c1ff27617dbba380ff5273304e0de7/cdp-scrapers-0.6.1.tar.gz
+BuildArch: noarch
+
+Requires: python3-beautifulsoup4
+Requires: python3-cdp-backend
+Requires: python3-defusedxml
+Requires: python3-pytz
+Requires: python3-requests
+Requires: python3-clean-text
+Requires: python3-civic-scraper
+Requires: python3-beautifulsoup4
+Requires: python3-cdp-backend
+Requires: python3-defusedxml
+Requires: python3-pytz
+Requires: python3-requests
+Requires: python3-clean-text
+Requires: python3-civic-scraper
+Requires: python3-selenium
+Requires: python3-webdriver-manager
+Requires: python3-black
+Requires: python3-flake8
+Requires: python3-flake8-debugger
+Requires: python3-pytest
+Requires: python3-pytest-cov
+Requires: python3-pytest-raises
+Requires: python3-tox
+Requires: python3-bump2version
+Requires: python3-ipython
+Requires: python3-m2r2
+Requires: python3-Sphinx
+Requires: python3-sphinx-rtd-theme
+Requires: python3-twine
+Requires: python3-wheel
+Requires: python3-selenium
+Requires: python3-webdriver-manager
+Requires: python3-selenium
+Requires: python3-webdriver-manager
+Requires: python3-black
+Requires: python3-flake8
+Requires: python3-flake8-debugger
+Requires: python3-pytest
+Requires: python3-pytest-cov
+Requires: python3-pytest-raises
+Requires: python3-tox
+Requires: python3-bump2version
+Requires: python3-ipython
+Requires: python3-m2r2
+Requires: python3-Sphinx
+Requires: python3-sphinx-rtd-theme
+Requires: python3-twine
+Requires: python3-wheel
+Requires: python3-selenium
+Requires: python3-webdriver-manager
+Requires: python3-black
+Requires: python3-flake8
+Requires: python3-flake8-debugger
+Requires: python3-pytest
+Requires: python3-pytest-cov
+Requires: python3-pytest-raises
+Requires: python3-tox
+
+%description
+## Council Data Project
+Council Data Project is an open-source project dedicated to providing journalists,
+activists, researchers, and all members of each community we serve with the tools they
+need to stay informed and hold their Council Members accountable.
+For more information about Council Data Project, please visit
+[our website](https://councildataproject.org/).
+## About
+`cdp-scrapers` is a collection of utilities and in-progress or actively maintained
+CDP instance event scrapers. The purpose of this library is to give new CDP instance
+maintainers a plethora of quick examples for getting started on developing their
+event scraper functions.
+## Quick Start
+### Legistar
+General Legistar utility functions.
+```python
+from cdp_scrapers.legistar_utils import get_legistar_events_for_timespan
+from cdp_scrapers.instances import get_seattle_events
+from datetime import datetime
+# Get all events (and minutes item and voting details)
+# for a provided timespan for a legistar client
+# Returns List[Dict]
+seattle_legistar_events = get_legistar_events_for_timespan(
+    client="seattle",
+    timezone="America/Los_Angeles",
+    start=datetime(2021, 7, 12),
+    end=datetime(2021, 7, 14),
+)
+# Or parse and convert to CDP EventIngestionModel
+seattle_cdp_parsed_events = get_seattle_events(
+    from_dt=datetime(2021, 7, 12),
+    to_dt=datetime(2021, 7, 14),
+)
+```
+### Scrapers
+#### Event Scraper Structure
+
+In our current event scraper structure, the main function `get_events`
+gathers all the required data and calls the `get_content_uris` function to return the
+required video data.
+If your city uses Legistar and the Legistar data is publicly available:
+- You may be able to reuse our scraper with minimal modifications, such as providing the
+correct Legistar client ID for your municipality.
+- If the returned Legistar data is missing only the EventVideoPath field used for
+`Session.video_uri`, you will only need to implement `get_content_uris`.
+If your city does not use Legistar:
+- You will need to build your own event scraper.
+Example of a completed scraper: [cdp_scrapers.instances.seattle.SeattleScraper](https://councildataproject.org/cdp-scrapers/_modules/cdp_scrapers/instances/seattle.html#SeattleScraper)
+For more details about creating a custom scraper for your municipality's Legistar data,
+please visit [here](https://councildataproject.org/cdp-scrapers/legistar_scraper.html).
+If you would like to deploy a CDP instance or would like to use this library as a
+method for retrieving formatted legislative data, please feel free to contribute a new
+custom municipality scraper!
+#### Creating a Custom Scraper
+If it isn't possible to use our generalized Legistar tooling to write your scraper,
+you will need to create your own event scraper to proceed with the deployment.
+1. Please see our documentation on the
+[minimum data required for CDP event ingestion](https://councildataproject.org/cdp-backend/ingestion_models.html)
+to understand what data your scraper should return.
+2. From there, begin with our
+[empty custom scraper function template](https://councildataproject.org/cdp-scrapers/_modules/cdp_scrapers/instances/empty.html#get_events)
+and fill in your scraper.
+3. After your scraper is completed, you can create a pull request to add your scraper
+into the [cdp-scrapers repo](https://github.com/CouncilDataProject/cdp-scrapers)
+so it can be added into the final repo for your CDP instance.
+4. Our automated action will run your scraper to verify it returns the correct data.
+If it is successful, you may proceed to the next deployment step. If not, we will
+automatically share the error message so you can fix the issue and the scraper can be
+tested again afterwards.
+## Installation
+**Stable Release:** `pip install cdp-scrapers`<br>
+**Development Head:** `pip install git+https://github.com/CouncilDataProject/cdp-scrapers.git`
+## Documentation
+For full package documentation please visit [councildataproject.org/cdp-scrapers](https://councildataproject.org/cdp-scrapers).
+## Development
+Refer to [CONTRIBUTING.md](CONTRIBUTING.md) for information related to developing the code.
+**MIT license**
+
+%package -n python3-cdp-scrapers
+Summary: Scratchpad for scraper development and general utilities.
+Provides: python-cdp-scrapers
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-cdp-scrapers
+## Council Data Project
+Council Data Project is an open-source project dedicated to providing journalists,
+activists, researchers, and all members of each community we serve with the tools they
+need to stay informed and hold their Council Members accountable.
+For more information about Council Data Project, please visit
+[our website](https://councildataproject.org/).
+## About
+`cdp-scrapers` is a collection of utilities and in-progress or actively maintained
+CDP instance event scrapers. The purpose of this library is to give new CDP instance
+maintainers a plethora of quick examples for getting started on developing their
+event scraper functions.
+## Quick Start
+### Legistar
+General Legistar utility functions.
+```python
+from cdp_scrapers.legistar_utils import get_legistar_events_for_timespan
+from cdp_scrapers.instances import get_seattle_events
+from datetime import datetime
+# Get all events (and minutes item and voting details)
+# for a provided timespan for a legistar client
+# Returns List[Dict]
+seattle_legistar_events = get_legistar_events_for_timespan(
+    client="seattle",
+    timezone="America/Los_Angeles",
+    start=datetime(2021, 7, 12),
+    end=datetime(2021, 7, 14),
+)
+# Or parse and convert to CDP EventIngestionModel
+seattle_cdp_parsed_events = get_seattle_events(
+    from_dt=datetime(2021, 7, 12),
+    to_dt=datetime(2021, 7, 14),
+)
+```
+### Scrapers
+#### Event Scraper Structure
+
+In our current event scraper structure, the main function `get_events`
+gathers all the required data and calls the `get_content_uris` function to return the
+required video data.
+If your city uses Legistar and the Legistar data is publicly available:
+- You may be able to reuse our scraper with minimal modifications, such as providing the
+correct Legistar client ID for your municipality.
+- If the returned Legistar data is missing only the EventVideoPath field used for
+`Session.video_uri`, you will only need to implement `get_content_uris`.
+If your city does not use Legistar:
+- You will need to build your own event scraper.
+Example of a completed scraper: [cdp_scrapers.instances.seattle.SeattleScraper](https://councildataproject.org/cdp-scrapers/_modules/cdp_scrapers/instances/seattle.html#SeattleScraper)
+For more details about creating a custom scraper for your municipality's Legistar data,
+please visit [here](https://councildataproject.org/cdp-scrapers/legistar_scraper.html).
+If you would like to deploy a CDP instance or would like to use this library as a
+method for retrieving formatted legislative data, please feel free to contribute a new
+custom municipality scraper!
+#### Creating a Custom Scraper
+If it isn't possible to use our generalized Legistar tooling to write your scraper,
+you will need to create your own event scraper to proceed with the deployment.
+1. Please see our documentation on the
+[minimum data required for CDP event ingestion](https://councildataproject.org/cdp-backend/ingestion_models.html)
+to understand what data your scraper should return.
+2. From there, begin with our
+[empty custom scraper function template](https://councildataproject.org/cdp-scrapers/_modules/cdp_scrapers/instances/empty.html#get_events)
+and fill in your scraper.
+3. After your scraper is completed, you can create a pull request to add your scraper
+into the [cdp-scrapers repo](https://github.com/CouncilDataProject/cdp-scrapers)
+so it can be added into the final repo for your CDP instance.
+4. Our automated action will run your scraper to verify it returns the correct data.
+If it is successful, you may proceed to the next deployment step. If not, we will
+automatically share the error message so you can fix the issue and the scraper can be
+tested again afterwards.
+## Installation
+**Stable Release:** `pip install cdp-scrapers`<br>
+**Development Head:** `pip install git+https://github.com/CouncilDataProject/cdp-scrapers.git`
+## Documentation
+For full package documentation please visit [councildataproject.org/cdp-scrapers](https://councildataproject.org/cdp-scrapers).
+## Development
+Refer to [CONTRIBUTING.md](CONTRIBUTING.md) for information related to developing the code.
+**MIT license**
+
+%package help
+Summary: Development documents and examples for cdp-scrapers
+Provides: python3-cdp-scrapers-doc
+%description help
+## Council Data Project
+Council Data Project is an open-source project dedicated to providing journalists,
+activists, researchers, and all members of each community we serve with the tools they
+need to stay informed and hold their Council Members accountable.
+For more information about Council Data Project, please visit
+[our website](https://councildataproject.org/).
+## About
+`cdp-scrapers` is a collection of utilities and in-progress or actively maintained
+CDP instance event scrapers. The purpose of this library is to give new CDP instance
+maintainers a plethora of quick examples for getting started on developing their
+event scraper functions.
+## Quick Start
+### Legistar
+General Legistar utility functions.
+```python
+from cdp_scrapers.legistar_utils import get_legistar_events_for_timespan
+from cdp_scrapers.instances import get_seattle_events
+from datetime import datetime
+# Get all events (and minutes item and voting details)
+# for a provided timespan for a legistar client
+# Returns List[Dict]
+seattle_legistar_events = get_legistar_events_for_timespan(
+    client="seattle",
+    timezone="America/Los_Angeles",
+    start=datetime(2021, 7, 12),
+    end=datetime(2021, 7, 14),
+)
+# Or parse and convert to CDP EventIngestionModel
+seattle_cdp_parsed_events = get_seattle_events(
+    from_dt=datetime(2021, 7, 12),
+    to_dt=datetime(2021, 7, 14),
+)
+```
+### Scrapers
+#### Event Scraper Structure
+
+In our current event scraper structure, the main function `get_events`
+gathers all the required data and calls the `get_content_uris` function to return the
+required video data.
+If your city uses Legistar and the Legistar data is publicly available:
+- You may be able to reuse our scraper with minimal modifications, such as providing the
+correct Legistar client ID for your municipality.
+- If the returned Legistar data is missing only the EventVideoPath field used for
+`Session.video_uri`, you will only need to implement `get_content_uris`.
+If your city does not use Legistar:
+- You will need to build your own event scraper.
+Example of a completed scraper: [cdp_scrapers.instances.seattle.SeattleScraper](https://councildataproject.org/cdp-scrapers/_modules/cdp_scrapers/instances/seattle.html#SeattleScraper)
+For more details about creating a custom scraper for your municipality's Legistar data,
+please visit [here](https://councildataproject.org/cdp-scrapers/legistar_scraper.html).
+If you would like to deploy a CDP instance or would like to use this library as a
+method for retrieving formatted legislative data, please feel free to contribute a new
+custom municipality scraper!
+#### Creating a Custom Scraper
+If it isn't possible to use our generalized Legistar tooling to write your scraper,
+you will need to create your own event scraper to proceed with the deployment.
+1. Please see our documentation on the
+[minimum data required for CDP event ingestion](https://councildataproject.org/cdp-backend/ingestion_models.html)
+to understand what data your scraper should return.
+2. From there, begin with our
+[empty custom scraper function template](https://councildataproject.org/cdp-scrapers/_modules/cdp_scrapers/instances/empty.html#get_events)
+and fill in your scraper.
+3. After your scraper is completed, you can create a pull request to add your scraper
+into the [cdp-scrapers repo](https://github.com/CouncilDataProject/cdp-scrapers)
+so it can be added into the final repo for your CDP instance.
+4. Our automated action will run your scraper to verify it returns the correct data.
+If it is successful, you may proceed to the next deployment step. If not, we will
+automatically share the error message so you can fix the issue and the scraper can be
+tested again afterwards.
+## Installation
+**Stable Release:** `pip install cdp-scrapers`<br>
+**Development Head:** `pip install git+https://github.com/CouncilDataProject/cdp-scrapers.git`
+## Documentation
+For full package documentation please visit [councildataproject.org/cdp-scrapers](https://councildataproject.org/cdp-scrapers).
+## Development
+Refer to [CONTRIBUTING.md](CONTRIBUTING.md) for information related to developing the code.
+**MIT license**
+
+%prep
+%autosetup -n cdp-scrapers-0.6.1
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+    find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+    find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+    find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+    find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+    find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-cdp-scrapers -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Wed May 31 2023 Python_Bot <Python_Bot@openeuler.org> - 0.6.1-1
+- Package Spec generated
@@ -0,0 +1 @@
+66cafb88e6f7b44b33f7e0fb872b1ff6 cdp-scrapers-0.6.1.tar.gz
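The "Event Scraper Structure" notes embedded in the %description above (reuse the Legistar scraper, override `get_content_uris` when only the video locations are missing) can be illustrated with a rough sketch. This is not part of the packaged spec; it assumes that `LegistarScraper` and a `ContentURIs` container are importable from `cdp_scrapers.legistar_utils`, as the linked SeattleScraper example suggests, and the video-page parsing is reduced to a placeholder.

```python
# Hedged sketch only -- assumes LegistarScraper and ContentURIs are exported by
# cdp_scrapers.legistar_utils; verify names and signatures against the package.
from typing import Dict, List, Optional

from cdp_scrapers.legistar_utils import ContentURIs, LegistarScraper


class ExampleCityScraper(LegistarScraper):
    """Scraper for a hypothetical Legistar-backed city whose event payloads
    lack EventVideoPath, so only the video locations must be supplied."""

    def __init__(self):
        # "example" is a hypothetical Legistar client ID, not a real one.
        super().__init__(client="example", timezone="America/Los_Angeles")

    def get_content_uris(self, legistar_ev: Dict) -> List[ContentURIs]:
        # EventInSiteURL is a standard field in Legistar event payloads;
        # a real implementation would fetch that page (e.g. requests +
        # BeautifulSoup) and extract the recording URL. The value returned
        # here is only a placeholder showing the expected return shape.
        video_page_url: Optional[str] = legistar_ev.get("EventInSiteURL")
        if not video_page_url:
            return []
        return [ContentURIs(video_uri=f"{video_page_url}#video", caption_uri=None)]
```

If the municipality's Legistar feed already includes EventVideoPath, no override should be needed at all, per the README notes above.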
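For the non-Legistar path, the README points at an empty `get_events` template and the cdp-backend ingestion models. The sketch below is likewise only illustrative: `fetch_meetings` is a hypothetical stand-in for site-specific scraping, and the exact `EventIngestionModel`/`Body`/`Session` fields should be checked against the ingestion-model documentation linked in the README.

```python
# Hedged sketch of a custom (non-Legistar) get_events function.
from datetime import datetime
from typing import Dict, List, Optional

from cdp_backend.pipeline.ingestion_models import Body, EventIngestionModel, Session


def fetch_meetings(from_dt: Optional[datetime], to_dt: Optional[datetime]) -> List[Dict]:
    # Hypothetical stand-in for your city's agenda/video portal scraping
    # (requests + BeautifulSoup, Selenium, etc.); returns static sample data.
    return [
        {
            "committee": "City Council",
            "start": datetime(2021, 7, 12, 9, 30),
            "video_url": "https://example.gov/meetings/2021-07-12.mp4",
        }
    ]


def get_events(
    from_dt: Optional[datetime] = None,
    to_dt: Optional[datetime] = None,
    **kwargs,
) -> List[EventIngestionModel]:
    # Map each scraped meeting onto the minimal ingestion model: a Body
    # (committee) plus at least one Session with a datetime and video URI.
    return [
        EventIngestionModel(
            body=Body(name=meeting["committee"]),
            sessions=[
                Session(
                    session_datetime=meeting["start"],
                    video_uri=meeting["video_url"],
                    session_index=0,
                )
            ],
        )
        for meeting in fetch_meetings(from_dt, to_dt)
    ]
```

A scraper structured this way can then go through the pull-request and automated-check flow described in steps 3 and 4 of the README.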