%global _empty_manifest_terminate_build 0
Name:		python-html-to-json
Version:	2.0.0
Release:	1
Summary:	Convert html to json.
License:	MIT License
URL:		https://github.com/fhightower/html-to-json
Source0:	https://mirrors.aliyun.com/pypi/web/packages/da/83/c425c27e4c8f4b622901f8b58ad48e53be14a080d341a70c67570f1ec30a/html_to_json-2.0.0.tar.gz
BuildArch:	noarch

Requires:	python3-bs4

%description
# HTML to JSON

[![PyPI](https://img.shields.io/pypi/v/html-to-json.svg)](https://pypi.python.org/pypi/html-to-json)
[![Build Status](https://travis-ci.com/fhightower/html-to-json.svg?branch=main)](https://travis-ci.com/fhightower/html-to-json)
[![codecov](https://codecov.io/gh/fhightower/html-to-json/branch/main/graph/badge.svg?token=V0WOIXRGMM)](https://codecov.io/gh/fhightower/html-to-json)

Convert HTML and/or HTML tables to JSON.

## Installation

```
pip install html-to-json
```

## Usage

### HTML to JSON

```python
import html_to_json

html_string = """<head>
    <title>Test site</title>
</head>"""
output_json = html_to_json.convert(html_string)
print(output_json)
```

When calling the `html_to_json.convert` function, you can skip capturing the text values from the HTML by passing the keyword argument `capture_element_values=False`. You can also skip capturing the attributes of the elements by passing `capture_element_attributes=False`.

#### Example

Example input:

```html
<head>
    <title>Floyd Hightower's Projects</title>
    <meta charset="UTF-8">
    <meta name="description" content="Floyd Hightower's Projects">
    <meta name="keywords" content="projects,fhightower,Floyd,Hightower">
</head>
```

Example output:

```json
{
    "head": [
        {
            "title": [
                {
                    "_value": "Floyd Hightower's Projects"
                }
            ],
            "meta": [
                {
                    "_attributes": {
                        "charset": "UTF-8"
                    }
                },
                {
                    "_attributes": {
                        "name": "description",
                        "content": "Floyd Hightower's Projects"
                    }
                },
                {
                    "_attributes": {
                        "name": "keywords",
                        "content": "projects,fhightower,Floyd,Hightower"
                    }
                }
            ]
        }
    ]
}
```

### HTML Tables to JSON

In addition to converting HTML to JSON, this library can also intelligently convert HTML tables to JSON.

Currently, this library can handle three types of tables:

A. Those with [table headers](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/th) in the first row
B. Those with table headers in the first column
C. Those without table headers

Tables of type A and B are diagrammed below:

![This package can handle tables with the headers in the first row or headers in the first column](./html_table_varieties.jpg)

#### Example

This code:

```python
import html_to_json

html_string = """
<table>
    <tr>
        <th>#</th>
        <th>Malware</th>
        <th>MD5</th>
        <th>Date Added</th>
    </tr>
    <tr>
        <td>25548</td>
        <td>DarkComet</td>
        <td>034a37b2a2307f876adc9538986d7b86</td>
        <td>July 9, 2018, 6:25 a.m.</td>
    </tr>
    <tr>
        <td>25547</td>
        <td>DarkComet</td>
        <td>706eeefbac3de4d58b27d964173999c3</td>
        <td>July 7, 2018, 6:25 a.m.</td>
    </tr>
</table>
"""
tables = html_to_json.convert_tables(html_string)
print(tables)
```

will produce this output:

```json
[
    [
        {
            "#": "25548",
            "Malware": "DarkComet",
            "MD5": "034a37b2a2307f876adc9538986d7b86",
            "Date Added": "July 9, 2018, 6:25 a.m."
        },
        {
            "#": "25547",
            "Malware": "DarkComet",
            "MD5": "706eeefbac3de4d58b27d964173999c3",
            "Date Added": "July 7, 2018, 6:25 a.m."
        }
    ]
]
```

## Credits

This package was created with [Cookiecutter](https://github.com/audreyr/cookiecutter) and fhightower's [Python project template](https://github.com/fhightower-templates/python-project-template).

%package -n python3-html-to-json
Summary:	Convert html to json.
Provides:	python-html-to-json
BuildRequires:	python3-devel
BuildRequires:	python3-setuptools
BuildRequires:	python3-pip

%description -n python3-html-to-json
# HTML to JSON

[![PyPI](https://img.shields.io/pypi/v/html-to-json.svg)](https://pypi.python.org/pypi/html-to-json)
[![Build Status](https://travis-ci.com/fhightower/html-to-json.svg?branch=main)](https://travis-ci.com/fhightower/html-to-json)
[![codecov](https://codecov.io/gh/fhightower/html-to-json/branch/main/graph/badge.svg?token=V0WOIXRGMM)](https://codecov.io/gh/fhightower/html-to-json)

Convert HTML and/or HTML tables to JSON.

## Installation

```
pip install html-to-json
```

## Usage

### HTML to JSON

```python
import html_to_json

html_string = """<head>
    <title>Test site</title>
</head>"""
output_json = html_to_json.convert(html_string)
print(output_json)
```

When calling the `html_to_json.convert` function, you can skip capturing the text values from the HTML by passing the keyword argument `capture_element_values=False`. You can also skip capturing the attributes of the elements by passing `capture_element_attributes=False`.

#### Example

Example input:

```html
<head>
    <title>Floyd Hightower's Projects</title>
    <meta charset="UTF-8">
    <meta name="description" content="Floyd Hightower's Projects">
    <meta name="keywords" content="projects,fhightower,Floyd,Hightower">
</head>
```

Example output:

```json
{
    "head": [
        {
            "title": [
                {
                    "_value": "Floyd Hightower's Projects"
                }
            ],
            "meta": [
                {
                    "_attributes": {
                        "charset": "UTF-8"
                    }
                },
                {
                    "_attributes": {
                        "name": "description",
                        "content": "Floyd Hightower's Projects"
                    }
                },
                {
                    "_attributes": {
                        "name": "keywords",
                        "content": "projects,fhightower,Floyd,Hightower"
                    }
                }
            ]
        }
    ]
}
```

### HTML Tables to JSON

In addition to converting HTML to JSON, this library can also intelligently convert HTML tables to JSON.

Currently, this library can handle three types of tables:

A. Those with [table headers](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/th) in the first row
B. Those with table headers in the first column
C. Those without table headers

Tables of type A and B are diagrammed below:

![This package can handle tables with the headers in the first row or headers in the first column](./html_table_varieties.jpg)

#### Example

This code:

```python
import html_to_json

html_string = """
<table>
    <tr>
        <th>#</th>
        <th>Malware</th>
        <th>MD5</th>
        <th>Date Added</th>
    </tr>
    <tr>
        <td>25548</td>
        <td>DarkComet</td>
        <td>034a37b2a2307f876adc9538986d7b86</td>
        <td>July 9, 2018, 6:25 a.m.</td>
    </tr>
    <tr>
        <td>25547</td>
        <td>DarkComet</td>
        <td>706eeefbac3de4d58b27d964173999c3</td>
        <td>July 7, 2018, 6:25 a.m.</td>
    </tr>
</table>
"""
tables = html_to_json.convert_tables(html_string)
print(tables)
```

will produce this output:

```json
[
    [
        {
            "#": "25548",
            "Malware": "DarkComet",
            "MD5": "034a37b2a2307f876adc9538986d7b86",
            "Date Added": "July 9, 2018, 6:25 a.m."
        },
        {
            "#": "25547",
            "Malware": "DarkComet",
            "MD5": "706eeefbac3de4d58b27d964173999c3",
            "Date Added": "July 7, 2018, 6:25 a.m."
        }
    ]
]
```

## Credits

This package was created with [Cookiecutter](https://github.com/audreyr/cookiecutter) and fhightower's [Python project template](https://github.com/fhightower-templates/python-project-template).

%package help
Summary:	Development documents and examples for html-to-json
Provides:	python3-html-to-json-doc

%description help
# HTML to JSON

[![PyPI](https://img.shields.io/pypi/v/html-to-json.svg)](https://pypi.python.org/pypi/html-to-json)
[![Build Status](https://travis-ci.com/fhightower/html-to-json.svg?branch=main)](https://travis-ci.com/fhightower/html-to-json)
[![codecov](https://codecov.io/gh/fhightower/html-to-json/branch/main/graph/badge.svg?token=V0WOIXRGMM)](https://codecov.io/gh/fhightower/html-to-json)

Convert HTML and/or HTML tables to JSON.

## Installation

```
pip install html-to-json
```

## Usage

### HTML to JSON

```python
import html_to_json

html_string = """<head>
    <title>Test site</title>
</head>"""
output_json = html_to_json.convert(html_string)
print(output_json)
```

When calling the `html_to_json.convert` function, you can skip capturing the text values from the HTML by passing the keyword argument `capture_element_values=False`. You can also skip capturing the attributes of the elements by passing `capture_element_attributes=False`.

#### Example

Example input:

```html
<head>
    <title>Floyd Hightower's Projects</title>
    <meta charset="UTF-8">
    <meta name="description" content="Floyd Hightower's Projects">
    <meta name="keywords" content="projects,fhightower,Floyd,Hightower">
</head>
```

Example output:

```json
{
    "head": [
        {
            "title": [
                {
                    "_value": "Floyd Hightower's Projects"
                }
            ],
            "meta": [
                {
                    "_attributes": {
                        "charset": "UTF-8"
                    }
                },
                {
                    "_attributes": {
                        "name": "description",
                        "content": "Floyd Hightower's Projects"
                    }
                },
                {
                    "_attributes": {
                        "name": "keywords",
                        "content": "projects,fhightower,Floyd,Hightower"
                    }
                }
            ]
        }
    ]
}
```

### HTML Tables to JSON

In addition to converting HTML to JSON, this library can also intelligently convert HTML tables to JSON.

Currently, this library can handle three types of tables:

A. Those with [table headers](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/th) in the first row
B. Those with table headers in the first column
C. Those without table headers

Tables of type A and B are diagrammed below:

![This package can handle tables with the headers in the first row or headers in the first column](./html_table_varieties.jpg)

#### Example

This code:

```python
import html_to_json

html_string = """
<table>
    <tr>
        <th>#</th>
        <th>Malware</th>
        <th>MD5</th>
        <th>Date Added</th>
    </tr>
    <tr>
        <td>25548</td>
        <td>DarkComet</td>
        <td>034a37b2a2307f876adc9538986d7b86</td>
        <td>July 9, 2018, 6:25 a.m.</td>
    </tr>
    <tr>
        <td>25547</td>
        <td>DarkComet</td>
        <td>706eeefbac3de4d58b27d964173999c3</td>
        <td>July 7, 2018, 6:25 a.m.</td>
    </tr>
</table>
"""
tables = html_to_json.convert_tables(html_string)
print(tables)
```

will produce this output:

```json
[
    [
        {
            "#": "25548",
            "Malware": "DarkComet",
            "MD5": "034a37b2a2307f876adc9538986d7b86",
            "Date Added": "July 9, 2018, 6:25 a.m."
        },
        {
            "#": "25547",
            "Malware": "DarkComet",
            "MD5": "706eeefbac3de4d58b27d964173999c3",
            "Date Added": "July 7, 2018, 6:25 a.m."
        }
    ]
]
```

## Credits

This package was created with [Cookiecutter](https://github.com/audreyr/cookiecutter) and fhightower's [Python project template](https://github.com/fhightower-templates/python-project-template).

%prep
%autosetup -n html_to_json-2.0.0

%build
%py3_build

%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
	find usr/lib -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
	find usr/lib64 -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
	find usr/bin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
	find usr/sbin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
	find usr/share/man -type f -printf "\"/%h/%f.gz\"\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .

%files -n python3-html-to-json -f filelist.lst
%dir %{python3_sitelib}/*

%files help -f doclist.lst
%{_docdir}/*

%changelog
* Fri Jun 09 2023 Python_Bot - 2.0.0-1
- Package Spec generated