%global _empty_manifest_terminate_build 0
Name:           python-Resiliparse
Version:        0.14.3
Release:        1
Summary:        A collection of robust and fast processing tools for parsing and analyzing (not only) web archive data.
License:        Apache License 2.0
URL:            https://pypi.org/project/Resiliparse/
Source0:        https://mirrors.nju.edu.cn/pypi/web/packages/19/99/c73e123faa6159aaebad490f76c133e8ef744560d4d02c663e3a04e030d8/Resiliparse-0.14.3.tar.gz

Requires:       python3-fastwarc
Requires:       python3-resiliparse[beam,cli]
Requires:       python3-boto3
Requires:       python3-elasticsearch
Requires:       python3-apache-beam[aws]
Requires:       python3-click
Requires:       python3-joblib
Requires:       python3-tqdm
Requires:       python3-beautifulsoup4
Requires:       python3-langid
Requires:       python3-selectolax
Requires:       python3-pytest
Requires:       python3-pytest-cov

%description
# ChatNoir Resiliparse

A collection of robust and fast processing tools for parsing and analyzing (not only) web archive data.

Resiliparse is a part of the [ChatNoir web analytics toolkit](https://github.com/chatnoir-eu/).

## Installing Resiliparse
Pre-built Resiliparse binaries can be installed from PyPi:
```bash
pip install resiliparse
```

## Building Resiliparse From Source
You can compile Resiliparse either from the PyPi source package or directly from this repository, though in any case, you need to install all required build-time dependencies first. On Ubuntu, this is done as follows:
```bash
# Add Lexbor repository
curl -L https://lexbor.com/keys/lexbor_signing.key | sudo apt-key add -
echo "deb https://packages.lexbor.com/ubuntu/ $(lsb_release -sc) liblexbor" | \
    sudo tee /etc/apt/sources.list.d/lexbor.list

# Install build dependencies
sudo apt update
sudo apt install build-essential python3-dev libuchardet-dev liblexbor-dev libre2-dev
```
To build and install Resiliparse from PyPi, run
```bash
pip install --no-binary resiliparse resiliparse
```
That's it. If you prefer to build and install directly from this repository instead, run:
```bash
pip install -e resiliparse
```
To build the wheels without installing them, run:
```bash
pip wheel -e resiliparse
# Or:
pip install build && python -m build --wheel resiliparse
```

## Usage Instructions
For detailed usage instructions, please consult the [Resiliparse User Manual](https://resiliparse.chatnoir.eu/en/latest/).

## Cite Us
If you use ChatNoir or Resiliparse, please consider citing our [ECIR 2018 demo paper](https://webis.de/downloads/publications/papers/bevendorff_2018.pdf):
```bibtex
@InProceedings{bevendorff:2018,
  address =   {Berlin Heidelberg New York},
  author =    {Janek Bevendorff and Benno Stein and Matthias Hagen and Martin Potthast},
  booktitle = {Advances in Information Retrieval. 40th European Conference on IR Research (ECIR 2018)},
  editor =    {Leif Azzopardi and Allan Hanbury and Gabriella Pasi and Benjamin Piwowarski},
  month =     mar,
  publisher = {Springer},
  series =    {Lecture Notes in Computer Science},
  site =      {Grenoble, France},
  title =     {{Elastic ChatNoir: Search Engine for the ClueWeb and the Common Crawl}},
  year =      2018
}
```

%package -n python3-Resiliparse
Summary:        A collection of robust and fast processing tools for parsing and analyzing (not only) web archive data.
Provides:       python-Resiliparse
BuildRequires:  python3-devel
BuildRequires:  python3-setuptools
BuildRequires:  python3-pip
BuildRequires:  python3-cffi
BuildRequires:  gcc
BuildRequires:  gdb

%description -n python3-Resiliparse
# ChatNoir Resiliparse

A collection of robust and fast processing tools for parsing and analyzing (not only) web archive data.

Resiliparse is a part of the [ChatNoir web analytics toolkit](https://github.com/chatnoir-eu/).

## Installing Resiliparse
Pre-built Resiliparse binaries can be installed from PyPi:
```bash
pip install resiliparse
```

## Building Resiliparse From Source
You can compile Resiliparse either from the PyPi source package or directly from this repository, though in any case, you need to install all required build-time dependencies first. On Ubuntu, this is done as follows:
```bash
# Add Lexbor repository
curl -L https://lexbor.com/keys/lexbor_signing.key | sudo apt-key add -
echo "deb https://packages.lexbor.com/ubuntu/ $(lsb_release -sc) liblexbor" | \
    sudo tee /etc/apt/sources.list.d/lexbor.list

# Install build dependencies
sudo apt update
sudo apt install build-essential python3-dev libuchardet-dev liblexbor-dev libre2-dev
```
To build and install Resiliparse from PyPi, run
```bash
pip install --no-binary resiliparse resiliparse
```
That's it. If you prefer to build and install directly from this repository instead, run:
```bash
pip install -e resiliparse
```
To build the wheels without installing them, run:
```bash
pip wheel -e resiliparse
# Or:
pip install build && python -m build --wheel resiliparse
```

## Usage Instructions
For detailed usage instructions, please consult the [Resiliparse User Manual](https://resiliparse.chatnoir.eu/en/latest/).

## Cite Us
If you use ChatNoir or Resiliparse, please consider citing our [ECIR 2018 demo paper](https://webis.de/downloads/publications/papers/bevendorff_2018.pdf):
```bibtex
@InProceedings{bevendorff:2018,
  address =   {Berlin Heidelberg New York},
  author =    {Janek Bevendorff and Benno Stein and Matthias Hagen and Martin Potthast},
  booktitle = {Advances in Information Retrieval. 40th European Conference on IR Research (ECIR 2018)},
  editor =    {Leif Azzopardi and Allan Hanbury and Gabriella Pasi and Benjamin Piwowarski},
  month =     mar,
  publisher = {Springer},
  series =    {Lecture Notes in Computer Science},
  site =      {Grenoble, France},
  title =     {{Elastic ChatNoir: Search Engine for the ClueWeb and the Common Crawl}},
  year =      2018
}
```

%package help
Summary:        Development documents and examples for Resiliparse
Provides:       python3-Resiliparse-doc

%description help
# ChatNoir Resiliparse

A collection of robust and fast processing tools for parsing and analyzing (not only) web archive data.

Resiliparse is a part of the [ChatNoir web analytics toolkit](https://github.com/chatnoir-eu/).

## Installing Resiliparse
Pre-built Resiliparse binaries can be installed from PyPi:
```bash
pip install resiliparse
```

## Building Resiliparse From Source
You can compile Resiliparse either from the PyPi source package or directly from this repository, though in any case, you need to install all required build-time dependencies first. On Ubuntu, this is done as follows:
```bash
# Add Lexbor repository
curl -L https://lexbor.com/keys/lexbor_signing.key | sudo apt-key add -
echo "deb https://packages.lexbor.com/ubuntu/ $(lsb_release -sc) liblexbor" | \
    sudo tee /etc/apt/sources.list.d/lexbor.list

# Install build dependencies
sudo apt update
sudo apt install build-essential python3-dev libuchardet-dev liblexbor-dev libre2-dev
```
To build and install Resiliparse from PyPi, run
```bash
pip install --no-binary resiliparse resiliparse
```
That's it. If you prefer to build and install directly from this repository instead, run:
```bash
pip install -e resiliparse
```
To build the wheels without installing them, run:
```bash
pip wheel -e resiliparse
# Or:
pip install build && python -m build --wheel resiliparse
```

## Usage Instructions
For detailed usage instructions, please consult the [Resiliparse User Manual](https://resiliparse.chatnoir.eu/en/latest/).

## Cite Us
If you use ChatNoir or Resiliparse, please consider citing our [ECIR 2018 demo paper](https://webis.de/downloads/publications/papers/bevendorff_2018.pdf):
```bibtex
@InProceedings{bevendorff:2018,
  address =   {Berlin Heidelberg New York},
  author =    {Janek Bevendorff and Benno Stein and Matthias Hagen and Martin Potthast},
  booktitle = {Advances in Information Retrieval. 40th European Conference on IR Research (ECIR 2018)},
  editor =    {Leif Azzopardi and Allan Hanbury and Gabriella Pasi and Benjamin Piwowarski},
  month =     mar,
  publisher = {Springer},
  series =    {Lecture Notes in Computer Science},
  site =      {Grenoble, France},
  title =     {{Elastic ChatNoir: Search Engine for the ClueWeb and the Common Crawl}},
  year =      2018
}
```

%prep
%autosetup -n Resiliparse-0.14.3

%build
%py3_build

%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
    find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
    find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
    find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
    find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
    find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .

%files -n python3-Resiliparse -f filelist.lst
%dir %{python3_sitearch}/*

%files help -f doclist.lst
%{_docdir}/*

%changelog
* Tue Apr 11 2023 Python_Bot - 0.14.3-1
- Package Spec generated