author CoprDistGit <infra@openeuler.org> 2023-05-31 06:14:11 +0000
committer CoprDistGit <infra@openeuler.org> 2023-05-31 06:14:11 +0000
commit 2c26f2365bd978087db89b204cf758aefd158f54
tree 5646e07bb5da32118d6f98fa8305a677c2c6c9bb
parent 8fd51d7b90fb64b2090e8d2eeb441d3a0565f858
automatic import of python-cdp-scrapers
-rw-r--r-- .gitignore 1
-rw-r--r-- python-cdp-scrapers.spec 354
-rw-r--r-- sources 1
3 files changed, 356 insertions, 0 deletions
diff --git a/.gitignore b/.gitignore
index e69de29..9852d2f 100644
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1 @@
+/cdp-scrapers-0.6.1.tar.gz
diff --git a/python-cdp-scrapers.spec b/python-cdp-scrapers.spec
new file mode 100644
index 0000000..2b2b8b4
--- /dev/null
+++ b/python-cdp-scrapers.spec
@@ -0,0 +1,354 @@
+%global _empty_manifest_terminate_build 0
+Name: python-cdp-scrapers
+Version: 0.6.1
+Release: 1
+Summary: Scratchpad for scraper development and general utilities.
+License: MIT license
+URL: https://github.com/CouncilDataProject/cdp-scrapers
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/6d/9d/e4630d79e8378ee1cd210d94a36b18c1ff27617dbba380ff5273304e0de7/cdp-scrapers-0.6.1.tar.gz
+BuildArch: noarch
+
+Requires: python3-beautifulsoup4
+Requires: python3-cdp-backend
+Requires: python3-defusedxml
+Requires: python3-pytz
+Requires: python3-requests
+Requires: python3-clean-text
+Requires: python3-civic-scraper
+Requires: python3-beautifulsoup4
+Requires: python3-cdp-backend
+Requires: python3-defusedxml
+Requires: python3-pytz
+Requires: python3-requests
+Requires: python3-clean-text
+Requires: python3-civic-scraper
+Requires: python3-selenium
+Requires: python3-webdriver-manager
+Requires: python3-black
+Requires: python3-flake8
+Requires: python3-flake8-debugger
+Requires: python3-pytest
+Requires: python3-pytest-cov
+Requires: python3-pytest-raises
+Requires: python3-tox
+Requires: python3-bump2version
+Requires: python3-ipython
+Requires: python3-m2r2
+Requires: python3-Sphinx
+Requires: python3-sphinx-rtd-theme
+Requires: python3-twine
+Requires: python3-wheel
+Requires: python3-selenium
+Requires: python3-webdriver-manager
+Requires: python3-selenium
+Requires: python3-webdriver-manager
+Requires: python3-black
+Requires: python3-flake8
+Requires: python3-flake8-debugger
+Requires: python3-pytest
+Requires: python3-pytest-cov
+Requires: python3-pytest-raises
+Requires: python3-tox
+Requires: python3-bump2version
+Requires: python3-ipython
+Requires: python3-m2r2
+Requires: python3-Sphinx
+Requires: python3-sphinx-rtd-theme
+Requires: python3-twine
+Requires: python3-wheel
+Requires: python3-selenium
+Requires: python3-webdriver-manager
+Requires: python3-black
+Requires: python3-flake8
+Requires: python3-flake8-debugger
+Requires: python3-pytest
+Requires: python3-pytest-cov
+Requires: python3-pytest-raises
+Requires: python3-tox
+
+%description
+## Council Data Project
+Council Data Project is an open-source project dedicated to providing journalists,
+activists, researchers, and all members of each community we serve with the tools they
+need to stay informed and hold their Council Members accountable.
+For more information about Council Data Project, please visit
+[our website](https://councildataproject.org/).
+## About
+`cdp-scrapers` is a collection of utilities and in-progress or actively maintained
+CDP instance event scrapers. The purpose of this library is to give new CDP instance
+maintainers a ready set of examples for getting started on developing their
+event scraper functions.
+## Quick Start
+### Legistar
+General Legistar utility functions.
+```python
+from cdp_scrapers.legistar_utils import get_legistar_events_for_timespan
+from cdp_scrapers.instances import get_seattle_events
+from datetime import datetime
+# Get all events (and minutes item and voting details)
+# for a provided timespan for a legistar client
+# Returns List[Dict]
+seattle_legistar_events = get_legistar_events_for_timespan(
+ client="seattle",
+ timezone="America/Los_Angeles",
+ start=datetime(2021, 7, 12),
+ end=datetime(2021, 7, 14),
+)
+# Or parse and convert to CDP EventIngestionModel
+seattle_cdp_parsed_events = get_seattle_events(
+ from_dt=datetime(2021, 7, 12),
+ to_dt=datetime(2021, 7, 14),
+)
+```
+### Scrapers
+#### Event Scraper Structure
+![get_events](./images/get_events.png)
+Our current event scraper structure is shown above. The main function `get_events`
+gathers all the required data and calls the `get_content_uris` function to retrieve the
+required video data.
+If your city uses Legistar and the Legistar data is publicly available:
+- You may be able to reuse our scraper with minimal modifications, such as providing the
+correct Legistar client ID for your municipality.
+- If the returned Legistar data lacks only the EventVideoPath field used for the
+`Session.video_uri` data, you will only need to implement `get_content_uris`.
+If your city does not use Legistar:
+- You will need to build your own event scraper.
+Example of a completed scraper: [cdp_scrapers.instances.seattle.SeattleScraper](https://councildataproject.org/cdp-scrapers/_modules/cdp_scrapers/instances/seattle.html#SeattleScraper)
+For more details about creating a custom scraper for your municipality's Legistar data,
+please visit [here](https://councildataproject.org/cdp-scrapers/legistar_scraper.html).
+If you would like to deploy a CDP instance or would like to use this library as a
+method for retrieving formatted legislative data, please feel free to contribute a new
+custom municipality scraper!
+#### Creating a Custom Scraper
+If it isn't possible to use our generalized Legistar tooling to write your scraper,
+you will need to create your own event scraper to proceed with the deployment.
+1. Please see our documentation on the
+[minimum data required for CDP event ingestion](https://councildataproject.org/cdp-backend/ingestion_models.html)
+to understand what data your scraper should return.
+2. From there, begin with our
+[empty custom scraper function template](https://councildataproject.org/cdp-scrapers/_modules/cdp_scrapers/instances/empty.html#get_events)
+and fill in your scraper.
+3. After your scraper is completed, you can create a pull request to add your scraper
+into the [cdp-scrapers repo](https://github.com/CouncilDataProject/cdp-scrapers)
+so it can be added into the final repo for your CDP instance.
+4. Our automated action will run your scraper to verify it returns the correct data.
+If it is successful, you may proceed to the next deployment step. If not, we will
+automatically share the error message so you can fix the issue and the scraper can be
+tested again afterwards.
+## Installation
+**Stable Release:** `pip install cdp-scrapers`<br>
+**Development Head:** `pip install git+https://github.com/CouncilDataProject/cdp-scrapers.git`
+## Documentation
+For full package documentation please visit [councildataproject.org/cdp-scrapers](https://councildataproject.org/cdp-scrapers).
+## Development
+Refer to [CONTRIBUTING.md](CONTRIBUTING.md) for information related to developing the code.
+**MIT license**
+
+%package -n python3-cdp-scrapers
+Summary: Scratchpad for scraper development and general utilities.
+Provides: python-cdp-scrapers
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-cdp-scrapers
+## Council Data Project
+Council Data Project is an open-source project dedicated to providing journalists,
+activists, researchers, and all members of each community we serve with the tools they
+need to stay informed and hold their Council Members accountable.
+For more information about Council Data Project, please visit
+[our website](https://councildataproject.org/).
+## About
+`cdp-scrapers` is a collection of utilities and in-progress or actively maintained
+CDP instance event scrapers. The purpose of this library is to give new CDP instance
+maintainers a ready set of examples for getting started on developing their
+event scraper functions.
+## Quick Start
+### Legistar
+General Legistar utility functions.
+```python
+from cdp_scrapers.legistar_utils import get_legistar_events_for_timespan
+from cdp_scrapers.instances import get_seattle_events
+from datetime import datetime
+# Get all events (and minutes item and voting details)
+# for a provided timespan for a legistar client
+# Returns List[Dict]
+seattle_legistar_events = get_legistar_events_for_timespan(
+ client="seattle",
+ timezone="America/Los_Angeles",
+ start=datetime(2021, 7, 12),
+ end=datetime(2021, 7, 14),
+)
+# Or parse and convert to CDP EventIngestionModel
+seattle_cdp_parsed_events = get_seattle_events(
+ from_dt=datetime(2021, 7, 12),
+ to_dt=datetime(2021, 7, 14),
+)
+```
+### Scrapers
+#### Event Scraper Structure
+![get_events](./images/get_events.png)
+Our current event scraper structure is shown above. The main function `get_events`
+gathers all the required data and calls the `get_content_uris` function to retrieve the
+required video data.
+If your city uses Legistar and the Legistar data is publicly available:
+- You may be able to reuse our scraper with minimal modifications, such as providing the
+correct Legistar client ID for your municipality.
+- If the returned Legistar data lacks only the EventVideoPath field used for the
+`Session.video_uri` data, you will only need to implement `get_content_uris`.
+If your city does not use Legistar:
+- You will need to build your own event scraper.
+Example of a completed scraper: [cdp_scrapers.instances.seattle.SeattleScraper](https://councildataproject.org/cdp-scrapers/_modules/cdp_scrapers/instances/seattle.html#SeattleScraper)
+For more details about creating a custom scraper for your municipality's Legistar data,
+please visit [here](https://councildataproject.org/cdp-scrapers/legistar_scraper.html).
+If you would like to deploy a CDP instance or would like to use this library as a
+method for retrieving formatted legislative data, please feel free to contribute a new
+custom municipality scraper!
+#### Creating a Custom Scraper
+If it isn't possible to use our generalized Legistar tooling to write your scraper,
+you will need to create your own event scraper to proceed with the deployment.
+1. Please see our documentation on the
+[minimum data required for CDP event ingestion](https://councildataproject.org/cdp-backend/ingestion_models.html)
+to understand what data your scraper should return.
+2. From there, begin with our
+[empty custom scraper function template](https://councildataproject.org/cdp-scrapers/_modules/cdp_scrapers/instances/empty.html#get_events)
+and fill in your scraper.
+3. After your scraper is completed, you can create a pull request to add your scraper
+into the [cdp-scrapers repo](https://github.com/CouncilDataProject/cdp-scrapers)
+so it can be added into the final repo for your CDP instance.
+4. Our automated action will run your scraper to verify it returns the correct data.
+If it is successful, you may proceed to the next deployment step. If not, we will
+automatically share the error message so you can fix the issue and the scraper can be
+tested again afterwards.
+## Installation
+**Stable Release:** `pip install cdp-scrapers`<br>
+**Development Head:** `pip install git+https://github.com/CouncilDataProject/cdp-scrapers.git`
+## Documentation
+For full package documentation please visit [councildataproject.org/cdp-scrapers](https://councildataproject.org/cdp-scrapers).
+## Development
+Refer to [CONTRIBUTING.md](CONTRIBUTING.md) for information related to developing the code.
+**MIT license**
+
+%package help
+Summary: Development documents and examples for cdp-scrapers
+Provides: python3-cdp-scrapers-doc
+%description help
+## Council Data Project
+Council Data Project is an open-source project dedicated to providing journalists,
+activists, researchers, and all members of each community we serve with the tools they
+need to stay informed and hold their Council Members accountable.
+For more information about Council Data Project, please visit
+[our website](https://councildataproject.org/).
+## About
+`cdp-scrapers` is a collection of utilities and in-progress or actively maintained
+CDP instance event scrapers. The purpose of this library is to give new CDP instance
+maintainers a ready set of examples for getting started on developing their
+event scraper functions.
+## Quick Start
+### Legistar
+General Legistar utility functions.
+```python
+from cdp_scrapers.legistar_utils import get_legistar_events_for_timespan
+from cdp_scrapers.instances import get_seattle_events
+from datetime import datetime
+# Get all events (and minutes item and voting details)
+# for a provided timespan for a legistar client
+# Returns List[Dict]
+seattle_legistar_events = get_legistar_events_for_timespan(
+ client="seattle",
+ timezone="America/Los_Angeles",
+ start=datetime(2021, 7, 12),
+ end=datetime(2021, 7, 14),
+)
+# Or parse and convert to CDP EventIngestionModel
+seattle_cdp_parsed_events = get_seattle_events(
+ from_dt=datetime(2021, 7, 12),
+ to_dt=datetime(2021, 7, 14),
+)
+```
+### Scrapers
+#### Event Scraper Structure
+![get_events](./images/get_events.png)
+Our current event scraper structure is shown above. The main function `get_events`
+gathers all the required data and calls the `get_content_uris` function to retrieve the
+required video data.
+If your city uses Legistar and the Legistar data is publicly available:
+- You may be able to reuse our scraper with minimal modifications, such as providing the
+correct Legistar client ID for your municipality.
+- If the returned Legistar data lacks only the EventVideoPath field used for the
+`Session.video_uri` data, you will only need to implement `get_content_uris`.
+If your city does not use Legistar:
+- You will need to build your own event scraper.
+Example of a completed scraper: [cdp_scrapers.instances.seattle.SeattleScraper](https://councildataproject.org/cdp-scrapers/_modules/cdp_scrapers/instances/seattle.html#SeattleScraper)
+For more details about creating a custom scraper for your municipality's Legistar data,
+please visit [here](https://councildataproject.org/cdp-scrapers/legistar_scraper.html).
+If you would like to deploy a CDP instance or would like to use this library as a
+method for retrieving formatted legislative data, please feel free to contribute a new
+custom municipality scraper!
+#### Creating a Custom Scraper
+If it isn't possible to use our generalized Legistar tooling to write your scraper,
+you will need to create your own event scraper to proceed with the deployment.
+1. Please see our documentation on the
+[minimum data required for CDP event ingestion](https://councildataproject.org/cdp-backend/ingestion_models.html)
+to understand what data your scraper should return.
+2. From there, begin with our
+[empty custom scraper function template](https://councildataproject.org/cdp-scrapers/_modules/cdp_scrapers/instances/empty.html#get_events)
+and fill in your scraper.
+3. After your scraper is completed, you can create a pull request to add your scraper
+into the [cdp-scrapers repo](https://github.com/CouncilDataProject/cdp-scrapers)
+so it can be added into the final repo for your CDP instance.
+4. Our automated action will run your scraper to verify it returns the correct data.
+If it is successful, you may proceed to the next deployment step. If not, we will
+automatically share the error message so you can fix the issue and the scraper can be
+tested again afterwards.
+## Installation
+**Stable Release:** `pip install cdp-scrapers`<br>
+**Development Head:** `pip install git+https://github.com/CouncilDataProject/cdp-scrapers.git`
+## Documentation
+For full package documentation please visit [councildataproject.org/cdp-scrapers](https://councildataproject.org/cdp-scrapers).
+## Development
+Refer to [CONTRIBUTING.md](CONTRIBUTING.md) for information related to developing the code.
+**MIT license**
+
+%prep
+%autosetup -n cdp-scrapers-0.6.1
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+ find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+ find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+ find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+ find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+ find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-cdp-scrapers -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Wed May 31 2023 Python_Bot <Python_Bot@openeuler.org> - 0.6.1-1
+- Package Spec generated
diff --git a/sources b/sources
new file mode 100644
index 0000000..8f96171
--- /dev/null
+++ b/sources
@@ -0,0 +1 @@
+66cafb88e6f7b44b33f7e0fb872b1ff6 cdp-scrapers-0.6.1.tar.gz
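
The `sources` entry above pairs a 32-character hexadecimal digest with the tarball name. Assuming that digest is an MD5 checksum of the downloaded archive (the diff does not state the algorithm), a local check could be sketched roughly as follows. `parse_sources_line` and `md5_matches` are hypothetical helper names introduced for illustration, not part of any packaging tool.

```python
import hashlib
from pathlib import Path
from typing import Tuple


def parse_sources_line(line: str) -> Tuple[str, str]:
    """Split a '<hex digest> <filename>' line into its two fields."""
    digest, filename = line.split(maxsplit=1)
    return digest.lower(), filename.strip()


def md5_matches(path: Path, expected_hex: str, chunk_size: int = 1 << 20) -> bool:
    """Return True when the file's MD5 digest equals expected_hex.

    Assumption: the digest in `sources` is MD5; swap in another hashlib
    constructor if the repository uses a different algorithm.
    """
    md5 = hashlib.md5()
    with path.open("rb") as handle:
        # Read in chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest() == expected_hex.lower()


# The line below is copied from the `sources` file in this commit.
digest, filename = parse_sources_line(
    "66cafb88e6f7b44b33f7e0fb872b1ff6 cdp-scrapers-0.6.1.tar.gz"
)
```

After downloading the archive named in `Source0`, calling `md5_matches(Path(filename), digest)` would report whether the local file agrees with the recorded digest, under the MD5 assumption stated above.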