%global _empty_manifest_terminate_build 0
Name: python-json-flattener
Version: 0.1.9
Release: 1
Summary: Python library for denormalizing nested dicts or json objects to tables and back
License: BSD
URL: https://github.com/cmungall/json-flattener
Source0: https://mirrors.nju.edu.cn/pypi/web/packages/6d/77/b00e46d904818826275661a690532d3a3a43a4ded0264b2d7fcdb5c0feea/json_flattener-0.1.9.tar.gz
BuildArch: noarch
Requires: python3-click
Requires: python3-pyyaml
%description
# json-flattener
Python library for denormalizing/flattening lists of complex objects to tables/data frames, with roundtripping
## Notebook Example
[EXAMPLE.ipynb](https://github.com/cmungall/json-flattener/blob/main/EXAMPLE.ipynb)
## Description
Given YAML/JSON/JSON-Lines such as:
```yaml
- id: S001
  name: Lord of the Rings
  genres:
    - fantasy
  creator:
    name: JRR Tolkein
    from_country: England
  books:
    - id: S001.1
      name: Fellowship of the Ring
      price: 5.99
      summary: Hobbits
    - id: S001.2
      name: The Two Towers
      price: 5.99
      summary: More hobbits
    - id: S001.3
      name: Return of the King
      price: 6.99
      summary: Yet more hobbits
- id: S002
  name: The Culture Series
  genres:
    - scifi
  creator:
    name: Ian M Banks
    from_country: Scotland
  books:
    - id: S002.1
      name: Consider Phlebas
      price: 5.99
    - id: S002.2
      name: Player of Games
      price: 5.99
```
Denormalize using the `jfl` command:
```bash
jfl flatten -C creator=flat -C books=multivalued -i examples/books1.yaml -o examples/books1-flattened.tsv
```
|id|name|genres|creator_name|creator_from_country|books_name|books_summary|books_price|books_id|creator_genres|
|---|---|---|---|---|---|---|---|---|---|
|S001|Lord of the Rings|[fantasy]|JRR Tolkein|England|[Fellowship of the Ring\|The Two Towers\|Return of the King]|[Hobbits\|More hobbits\|Yet more hobbits]|[5.99\|5.99\|6.99]|[S001.1\|S001.2\|S001.3]||
|S002|The Culture Series|[scifi]|Ian M Banks|Scotland|[Consider Phlebas\|Player of Games]||[5.99\|5.99]|[S002.1\|S002.2]||
Convert back to JSON/YAML:
```bash
jfl unflatten -C creator=flat -C books=multivalued -i examples/books1.tsv -o examples/books1.yaml
```
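The `-C` options are repeated on `unflatten`, presumably because column names alone are ambiguous: `creator_from_country` could be key `from_country` under `creator`, or key `country` under `creator_from`. The following is a minimal sketch of the reverse mapping in plain Python, not the library's implementation. The function `unflatten_row` and its parameters are hypothetical, and it assumes multivalued cells have already been parsed into Python lists.

```python
from typing import Any, Dict, List


def unflatten_row(row: Dict[str, Any], flat_keys: List[str],
                  multivalued_keys: List[str], sep: str = "_") -> Dict[str, Any]:
    """Rebuild one nested object from a flat row (hypothetical helper, not the jfl API)."""
    obj: Dict[str, Any] = {}
    for column, value in row.items():
        # Find the configured parent key that this column belongs to, if any
        parent = next((k for k in flat_keys + multivalued_keys
                       if column.startswith(k + sep)), None)
        if parent is None:
            # Plain top-level column
            obj[column] = value
            continue
        inner_key = column[len(parent) + len(sep):]
        if parent in flat_keys:
            # creator_name -> obj["creator"]["name"]
            obj.setdefault(parent, {})[inner_key] = value
        else:
            # books_name -> obj["books"][i]["name"], matched by position
            items = obj.setdefault(parent, [])
            for i, v in enumerate(value):
                while len(items) <= i:
                    items.append({})
                if v is not None:
                    items[i][inner_key] = v
    return obj


row = {
    "id": "S002",
    "creator_name": "Ian M Banks",
    "creator_from_country": "Scotland",
    "books_id": ["S002.1", "S002.2"],
    "books_name": ["Consider Phlebas", "Player of Games"],
}
restored = unflatten_row(row, flat_keys=["creator"], multivalued_keys=["books"])
print(restored["creator"])   # {'name': 'Ian M Banks', 'from_country': 'Scotland'}
print(restored["books"][1])  # {'id': 'S002.2', 'name': 'Player of Games'}
```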
This library also allows complex fields to be serialized directly as JSON or YAML; by default `_json` is appended to the key. For example:
```bash
jfl flatten -C creator=json -C books=json -i examples/books1.yaml -o examples/books1-jsonified.tsv
```
|id|name|genres|creator_json|books_json|
|---|---|---|---|---|
|S001|Lord of the Rings|[fantasy]|{\"name\": \"JRR Tolkein\", \"from_country\": \"England\"}|[{\"id\": \"S001.1\", \"name\": \"Fellowship of the Ring\", \"summary\": \"Hobbits\", \"price\": 5.99}, {\"id\": \"S001.2\", \"name\": \"The Two Towers\", \"summary\": \"More hobbits\", \"price\": 5.99}, {\"id\": \"S001.3\", \"name\": \"Return of the King\", \"summary\": \"Yet more hobbits\", \"price\": 6.99}]|
|S002|The Culture Series|[scifi]|{\"name\": \"Ian M Banks\", \"from_country\": \"Scotland\"}|[{\"id\": \"S002.1\", \"name\": \"Consider Phlebas\", \"price\": 5.99}, {\"id\": \"S002.2\", \"name\": \"Player of Games\", \"price\": 5.99}]|
|S003|Book of the New Sun|[scifi, fantasy]|{\"name\": \"Gene Wolfe\", \"genres\": [\"scifi\", \"fantasy\"], \"from_country\": \"USA\"}|[{\"id\": \"S003.1\", \"name\": \"Shadow of the Torturer\"}, {\"id\": \"S003.2\", \"name\": \"Claw of the Conciliator\", \"price\": 6.99}]|
|S004|Example with single book||{\"name\": \"Ms Writer\", \"genres\": [\"romance\"], \"from_country\": \"USA\"}|[{\"id\": \"S004.1\", \"name\": \"Blah\"}]|
|S005|Example with no books||{\"name\": \"Mr Unproductive\", \"genres\": [\"romance\", \"scifi\", \"fantasy\"], \"from_country\": \"USA\"}||
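The effect of `creator=json` and `books=json` can be approximated with the standard `json` module. This is only an illustrative sketch: the function `jsonify_keys` and its `suffix` parameter are hypothetical names, not part of the library.

```python
import json
from typing import Any, Dict, Iterable


def jsonify_keys(obj: Dict[str, Any], keys: Iterable[str],
                 suffix: str = "_json") -> Dict[str, Any]:
    """Serialize the selected keys as JSON strings (hypothetical helper, not the jfl API)."""
    selected = set(keys)
    row: Dict[str, Any] = {}
    for key, value in obj.items():
        if key in selected:
            # The whole nested value becomes a single string cell
            row[key + suffix] = json.dumps(value)
        else:
            row[key] = value
    return row


series = {
    "id": "S002",
    "name": "The Culture Series",
    "creator": {"name": "Ian M Banks", "from_country": "Scotland"},
}
row = jsonify_keys(series, ["creator", "books"])
print(row["creator_json"])  # {"name": "Ian M Banks", "from_country": "Scotland"}

# The original value is recovered by parsing the cell again
assert json.loads(row["creator_json"]) == series["creator"]
```

Unlike flattening, this keeps arbitrarily deep structures intact, at the cost of cells that spreadsheet and command-line tools cannot easily filter on.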
The primary use case is to go from a rich *normalized* data model (as Python objects, JSON, or YAML) to a flatter representation that is amenable to processing with:
* Solr/Lucene
* Pandas/R Dataframes
* Excel/Google sheets
* Unix cut/grep/cat/etc
* Simple denormalized SQL database representations
The target denormalized format is a list of rows / a data matrix, where each cell is either an atom or a list of atoms.
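In the flattened table shown earlier, a list cell is written as `[a|b|c]`. A minimal sketch of rendering and parsing such cells follows. The helpers `render_cell` and `parse_cell` are hypothetical and do not handle a delimiter that occurs inside a value; the library's actual escaping rules may differ.

```python
from typing import Any, List, Union


def render_cell(value: Any, delimiter: str = "|") -> str:
    """Render an atom or a list of atoms as one text cell (hypothetical helper)."""
    if isinstance(value, list):
        return "[" + delimiter.join(str(v) for v in value) + "]"
    return "" if value is None else str(value)


def parse_cell(text: str, delimiter: str = "|") -> Union[str, List[str]]:
    """Parse a text cell back into a string or a list of strings (hypothetical helper)."""
    if text.startswith("[") and text.endswith("]"):
        inner = text[1:-1]
        return inner.split(delimiter) if inner else []
    return text


print(render_cell([5.99, 5.99, 6.99]))  # [5.99|5.99|6.99]
print(parse_cell("[S002.1|S002.2]"))    # ['S002.1', 'S002.2']
```

Note that parsing yields strings only: in this sketch the price `5.99` comes back as `"5.99"`, so a real round trip needs some type information in addition to the cell text.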
## Method
* Each top level key becomes a column
    * if the key value is a dict/object, then flatten
        * by default a '_' is used to separate the parent key from the inner key
        * e.g. the composition of `creator` and `from_country` becomes `creator_from_country`
        * currently one level of flattening is supported
    * if the key value is a list of atomic entities, then leave as is
    * if the key value is a list of dicts/objects, then flatten each key of this inner dict into a list
        * e.g. if `books` is a list of book objects, and `name` is a key on book, then `books_name` is a list of names of each book
        * order is significant - the first element of `books_name` is matched to the first element of `books_price`, etc
* Allow any key to be serialized as yaml/json/pickle if configured
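The rules above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: the function name `flatten_row` and the `sep` parameter are hypothetical, only one level of nesting is handled, and a key missing from one list element is padded with `None` so that positions stay aligned.

```python
from typing import Any, Dict, List


def flatten_row(obj: Dict[str, Any], sep: str = "_") -> Dict[str, Any]:
    """Flatten one object by a single level (hypothetical helper, not the jfl API)."""
    row: Dict[str, Any] = {}
    for key, value in obj.items():
        if isinstance(value, dict):
            # dict/object: prefix each inner key with the parent key
            for inner_key, inner_value in value.items():
                row[f"{key}{sep}{inner_key}"] = inner_value
        elif isinstance(value, list) and value and all(isinstance(v, dict) for v in value):
            # list of dicts: collect the inner keys in first-seen order
            inner_keys: List[str] = []
            for item in value:
                for inner_key in item:
                    if inner_key not in inner_keys:
                        inner_keys.append(inner_key)
            # one list-valued column per inner key, aligned by position
            for inner_key in inner_keys:
                row[f"{key}{sep}{inner_key}"] = [item.get(inner_key) for item in value]
        else:
            # atoms and lists of atoms are left as is
            row[key] = value
    return row


series = {
    "id": "S002",
    "name": "The Culture Series",
    "genres": ["scifi"],
    "creator": {"name": "Ian M Banks", "from_country": "Scotland"},
    "books": [
        {"id": "S002.1", "name": "Consider Phlebas", "price": 5.99},
        {"id": "S002.2", "name": "Player of Games", "price": 5.99},
    ],
}
flat = flatten_row(series)
print(flat["creator_from_country"])  # Scotland
print(flat["books_name"])            # ['Consider Phlebas', 'Player of Games']
```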
## Command line usage (TODO)
## Usage from Python
Documentation coming soon: see test folder for now
## Use within LinkML
## Comparison
### Pandas json_normalize
- https://pandas.pydata.org/pandas-docs/version/0.25.0/reference/api/pandas.io.json.json_normalize.html
### Java json-flattener
https://github.com/wnameless/json-flattener
### Python
### csvjson
https://csvjson.com/json2csv
%package -n python3-json-flattener
Summary: Python library for denormalizing nested dicts or json objects to tables and back
Provides: python-json-flattener
BuildRequires: python3-devel
BuildRequires: python3-setuptools
BuildRequires: python3-pip
%description -n python3-json-flattener
# json-flattener
Python library for denormalizing/flattening lists of complex objects to tables/data frames, with roundtripping
%package help
Summary: Development documents and examples for json-flattener
Provides: python3-json-flattener-doc
%description help
# json-flattener
Python library for denormalizing/flattening lists of complex objects to tables/data frames, with roundtripping
%prep
%autosetup -n json-flattener-0.1.9
%build
%py3_build
%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .
%files -n python3-json-flattener -f filelist.lst
%dir %{python3_sitelib}/*
%files help -f doclist.lst
%{_docdir}/*
%changelog
* Tue May 30 2023 Python_Bot - 0.1.9-1
- Package Spec generated