%global _empty_manifest_terminate_build 0
Name:           python-amsterdam-schema-tools
Version:        5.9.3
Release:        1
Summary:        Tools to work with Amsterdam Schema.
License:        Mozilla Public 2.0
URL:            https://github.com/amsterdam/schema-tools
Source0:        https://mirrors.nju.edu.cn/pypi/web/packages/4d/24/9b16605372327f85f59868308c2dc3f6e93f5faf2941312efe6702b27d8c/amsterdam-schema-tools-5.9.3.tar.gz
BuildArch:      noarch

Requires:       python3-sqlalchemy
Requires:       python3-geoalchemy2
Requires:       python3-psycopg2
Requires:       python3-pg-grant
Requires:       python3-click
Requires:       python3-deepdiff
Requires:       python3-jsonlines
Requires:       python3-jsonschema[format]
Requires:       python3-shapely
Requires:       python3-string-utils
Requires:       python3-dateutil
Requires:       python3-requests
Requires:       python3-jinja2
Requires:       python3-mappyfile
Requires:       python3-methodtools
Requires:       python3-jsonpath-rw
Requires:       python3-orjson
Requires:       python3-more-ds
Requires:       python3-factory-boy
Requires:       python3-build
Requires:       python3-twine
Requires:       python3-environ
Requires:       python3-django
Requires:       python3-django-gisserver
Requires:       python3-django-environ
Requires:       python3-django-db-comments
Requires:       python3-factory-boy
Requires:       python3-confluent-kafka
Requires:       python3-types-requests
Requires:       python3-types-click
Requires:       python3-types-python-dateutil
Requires:       python3-flake8
Requires:       python3-flake8-colors
Requires:       python3-flake8-raise
Requires:       python3-flake8-bandit
Requires:       python3-flake8-bugbear
Requires:       python3-flake8-builtins
Requires:       python3-flake8-comprehensions
Requires:       python3-flake8-docstrings
Requires:       python3-flake8-implicit-str-concat
Requires:       python3-flake8-print
Requires:       python3-flake8-rst
Requires:       python3-flake8-string-format
Requires:       python3-flake8-logging-format
Requires:       python3-pytest
Requires:       python3-pytest-cov
Requires:       python3-pytest-django
Requires:       python3-pytest-sqlalchemy

%description
# amsterdam-schema-tools

Set of libraries and tools to work with Amsterdam schema.

Install the package with: `pip install amsterdam-schema-tools`. This installs
the library and a command-line tool called `schema`, with various subcommands.
A listing can be obtained from `schema --help`.

Subcommands that talk to a PostgreSQL database expect either a `DATABASE_URL`
environment variable or a command-line option `--db-url` with a DSN.

Many subcommands want to know where to find schema files. Most will look in a
directory of schemas denoted by the `SCHEMA_URL` environment variable or the
`--schema-url` command-line option. E.g.,

    schema create tables --schema-url=myschemas mydataset

will try to load the schema for `mydataset` from
`myschemas/mydataset/dataset.json`.

## Generate amsterdam schema from existing database tables

The `--prefix` argument controls whether table prefixes are removed in the
schema, because that is required for Django models.

As an example, we can generate a BAG schema. Point `DATABASE_URL` to the
`bag_v11` database and then run:

    schema show tablenames | sort | awk '/^bag_/{print}' | xargs schema introspect db bag --prefix bag_ | jq

**jq** formats the output nicely, and it can be redirected directly to the
correct directory in the schemas repository.
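A minimal end-to-end sketch of that workflow is shown below; the DSN and the
output path are placeholders for illustration, not prescribed values:

```bash
# Point DATABASE_URL at the source database (placeholder DSN)
export DATABASE_URL=postgresql://user:password@localhost:5432/bag_v11

# Introspect every table whose name starts with "bag_" and pretty-print
# the resulting schema with jq; the output path is only an example
schema show tablenames | sort | awk '/^bag_/{print}' \
    | xargs schema introspect db bag --prefix bag_ | jq > bag/dataset.json
```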
## Express amsterdam schema information in relational tables

Amsterdam schema is expressed as jsonschema. However, to make it easier for
people with a more relational mind- or toolset, it is possible to express
amsterdam schema as a set of relational tables. These tables are
*meta_dataset*, *meta_table* and *meta_field*. It is possible to convert a
jsonschema into the relational table structure and vice versa.

This command converts a dataset from an existing dataset in jsonschema format:

    schema import schema

To convert from relational tables back to jsonschema:

    schema show schema

## Generating amsterdam schema from existing GeoJSON files

The following commands can be used to inspect and import the GeoJSON files:

    schema introspect geojson *.geojson > schema.json
    edit schema.json  # fine-tune the table names
    schema import geojson schema.json file1.geojson
    schema import geojson schema.json file2.geojson

## Importing GOB events

The schematools library has a module that reads GOB events into database
tables that are defined by an Amsterdam schema. This module can be used to
read GOB events from a Kafka stream. It is also possible to read GOB events
from a batch file with line-separated events using:

    schema import events

## Export datasets

Datasets can be exported to different file formats. Currently supported are
geopackage, csv and jsonlines. The command for exporting the dataset tables is:

    schema export [geopackage|csv|jsonlines]

The command has several command-line options that can be used. Documentation
about these flags can be shown using the `--help` option.

## Schema Tools as a pre-commit hook

Included in the project is a `pre-commit` hook that can validate schema files
in a project such as
[amsterdam-schema](https://github.com/Amsterdam/amsterdam-schema).

To configure it, extend the `.pre-commit-config.yaml` in the project with the
schema file definitions as follows:

```yaml
  - repo: https://github.com/Amsterdam/schema-tools
    rev: v3.5.0
    hooks:
      - id: validate-schema
        args: ['https://schemas.data.amsterdam.nl/schema@v1.2.0#']
        exclude: |
          (?x)^(
            schema.+|          # exclude meta schemas
            datasets/index.json
          )$
```

`args` is a one-element list containing the URL to the Amsterdam Meta Schema.

`validate-schema` will only process `json` files. However, not all `json`
files are Amsterdam schema files. To exclude files or directories, use
`exclude` with a pattern.
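Once configured, the hook can also be run by hand with the standard
`pre-commit` commands; the following is an illustrative sketch:

```bash
# Install the git hook into the current repository (one-time setup)
pre-commit install

# Run only the schema validation hook against all files in the repository
pre-commit run validate-schema --all-files
```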
`pre-commit` depends on properly tagged revisions of its hooks. Hence, we
should not only bump version numbers on updates to this package, but also
commit a tag with the version number; see below.

## Doing a release

(This is for schema-tools developers.)

We use GitHub pull requests. If your PR should produce a new release of
schema-tools, make sure one of the commits increments the version number in
``setup.cfg`` appropriately. Then,

* merge the commit in GitHub, after review;
* pull the code from GitHub and merge it into the master branch,
  ``git checkout master && git fetch origin && git merge --ff-only origin/master``;
* tag the release X.Y.Z with ``git tag -a vX.Y.Z -m "Bump to vX.Y.Z"``;
* push the tag to GitHub with ``git push origin --tags``;
* release to PyPI: ``make upload`` (requires the PyPI secret).

## Mocking data

The schematools library contains two Django management commands to generate
mock data. The first one is `create_mock_data`, which generates mock data for
all the datasets that are found at the configured schema location `SCHEMA_URL`
(where `SCHEMA_URL` can be configured to point to a path on the local
filesystem).

The `create_mock_data` command processes all datasets. However, it is possible
to limit this by adding positional arguments. These positional arguments can
be dataset ids or paths to the location of the `dataset.json` on the local
filesystem. Furthermore, the command has some options, e.g. to change the
default number of generated records (`--size`) or to reverse the meaning of
the positional arguments using `--exclude`. To avoid duplicate primary keys
on subsequent runs, the `--start-at` option can be used to start
autonumbering of primary keys at an offset.

E.g. to generate 5 records for the `bag` and `gebieden` datasets, starting
the autonumbering of primary keys at 50:

```
django create_mock_data bag gebieden --size 5 --start-at 50
```

To generate records for all datasets, except for the `fietspaaltjes` dataset:

```
django create_mock_data fietspaaltjes --exclude  # or -x
```

To generate records for the `bbga` dataset, by loading the schema from the
local filesystem:

```
django create_mock_data /datasets.json
```

During record generation in `create_mock_data`, the relations are not added,
so foreign key fields will be filled with NULL values. There is a second
management command `relate_mock_data` that can be used to add the relations.
This command supports positional arguments for datasets in the same way as
`create_mock_data`. Furthermore, the command also has the `--exclude` option
to reverse the meaning of the positional dataset arguments.

E.g. to add relations to all datasets:

```
django relate_mock_data
```

To add relations for `bag` and `gebieden` only:

```
django relate_mock_data bag gebieden
```

To add relations for all datasets except `meetbouten`:

```
django relate_mock_data meetbouten --exclude  # or -x
```

NB. When only a subset of the datasets is being mocked, the command can fail
when datasets that are involved in a relation are missing, so make sure to
include all relevant datasets.

For convenience, an additional management command `truncate_tables` has been
added, to truncate all tables.

%package -n python3-amsterdam-schema-tools
Summary:        Tools to work with Amsterdam Schema.
Provides:       python-amsterdam-schema-tools
BuildRequires:  python3-devel
BuildRequires:  python3-setuptools
BuildRequires:  python3-pip

%description -n python3-amsterdam-schema-tools
# amsterdam-schema-tools

Set of libraries and tools to work with Amsterdam schema.

Install the package with: `pip install amsterdam-schema-tools`. This installs
the library and a command-line tool called `schema`, with various subcommands.
A listing can be obtained from `schema --help`.

Subcommands that talk to a PostgreSQL database expect either a `DATABASE_URL`
environment variable or a command-line option `--db-url` with a DSN.

Many subcommands want to know where to find schema files. Most will look in a
directory of schemas denoted by the `SCHEMA_URL` environment variable or the
`--schema-url` command-line option. E.g.,

    schema create tables --schema-url=myschemas mydataset

will try to load the schema for `mydataset` from
`myschemas/mydataset/dataset.json`.
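Equivalently, both locations can be supplied through the environment instead
of the command-line options. A small sketch (the DSN is a placeholder):

```bash
# Directory containing <dataset>/dataset.json files
export SCHEMA_URL=myschemas

# Placeholder DSN for the target PostgreSQL database
export DATABASE_URL=postgresql://user:password@localhost:5432/mydb

# With both variables set, --schema-url and --db-url can be omitted
schema create tables mydataset
```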
## Generate amsterdam schema from existing database tables

The `--prefix` argument controls whether table prefixes are removed in the
schema, because that is required for Django models.

As an example, we can generate a BAG schema. Point `DATABASE_URL` to the
`bag_v11` database and then run:

    schema show tablenames | sort | awk '/^bag_/{print}' | xargs schema introspect db bag --prefix bag_ | jq

**jq** formats the output nicely, and it can be redirected directly to the
correct directory in the schemas repository.

## Express amsterdam schema information in relational tables

Amsterdam schema is expressed as jsonschema. However, to make it easier for
people with a more relational mind- or toolset, it is possible to express
amsterdam schema as a set of relational tables. These tables are
*meta_dataset*, *meta_table* and *meta_field*. It is possible to convert a
jsonschema into the relational table structure and vice versa.

This command converts a dataset from an existing dataset in jsonschema format:

    schema import schema

To convert from relational tables back to jsonschema:

    schema show schema

## Generating amsterdam schema from existing GeoJSON files

The following commands can be used to inspect and import the GeoJSON files:

    schema introspect geojson *.geojson > schema.json
    edit schema.json  # fine-tune the table names
    schema import geojson schema.json file1.geojson
    schema import geojson schema.json file2.geojson

## Importing GOB events

The schematools library has a module that reads GOB events into database
tables that are defined by an Amsterdam schema. This module can be used to
read GOB events from a Kafka stream. It is also possible to read GOB events
from a batch file with line-separated events using:

    schema import events

## Export datasets

Datasets can be exported to different file formats. Currently supported are
geopackage, csv and jsonlines. The command for exporting the dataset tables is:

    schema export [geopackage|csv|jsonlines]

The command has several command-line options that can be used. Documentation
about these flags can be shown using the `--help` option.
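As a sketch, an export run could be driven as follows; the exact flags
(output target, dataset selection) depend on the chosen format, so treat
`--help` as the authoritative reference rather than these lines:

```bash
# Show the options accepted by the csv exporter
schema export csv --help

# Run the exporter once the relevant options are known
schema export csv
```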
## Schema Tools as a pre-commit hook

Included in the project is a `pre-commit` hook that can validate schema files
in a project such as
[amsterdam-schema](https://github.com/Amsterdam/amsterdam-schema).

To configure it, extend the `.pre-commit-config.yaml` in the project with the
schema file definitions as follows:

```yaml
  - repo: https://github.com/Amsterdam/schema-tools
    rev: v3.5.0
    hooks:
      - id: validate-schema
        args: ['https://schemas.data.amsterdam.nl/schema@v1.2.0#']
        exclude: |
          (?x)^(
            schema.+|          # exclude meta schemas
            datasets/index.json
          )$
```

`args` is a one-element list containing the URL to the Amsterdam Meta Schema.

`validate-schema` will only process `json` files. However, not all `json`
files are Amsterdam schema files. To exclude files or directories, use
`exclude` with a pattern.

`pre-commit` depends on properly tagged revisions of its hooks. Hence, we
should not only bump version numbers on updates to this package, but also
commit a tag with the version number; see below.

## Doing a release

(This is for schema-tools developers.)

We use GitHub pull requests. If your PR should produce a new release of
schema-tools, make sure one of the commits increments the version number in
``setup.cfg`` appropriately. Then,

* merge the commit in GitHub, after review;
* pull the code from GitHub and merge it into the master branch,
  ``git checkout master && git fetch origin && git merge --ff-only origin/master``;
* tag the release X.Y.Z with ``git tag -a vX.Y.Z -m "Bump to vX.Y.Z"``;
* push the tag to GitHub with ``git push origin --tags``;
* release to PyPI: ``make upload`` (requires the PyPI secret).

## Mocking data

The schematools library contains two Django management commands to generate
mock data. The first one is `create_mock_data`, which generates mock data for
all the datasets that are found at the configured schema location `SCHEMA_URL`
(where `SCHEMA_URL` can be configured to point to a path on the local
filesystem).

The `create_mock_data` command processes all datasets. However, it is possible
to limit this by adding positional arguments. These positional arguments can
be dataset ids or paths to the location of the `dataset.json` on the local
filesystem. Furthermore, the command has some options, e.g. to change the
default number of generated records (`--size`) or to reverse the meaning of
the positional arguments using `--exclude`. To avoid duplicate primary keys
on subsequent runs, the `--start-at` option can be used to start
autonumbering of primary keys at an offset.

E.g. to generate 5 records for the `bag` and `gebieden` datasets, starting
the autonumbering of primary keys at 50:

```
django create_mock_data bag gebieden --size 5 --start-at 50
```

To generate records for all datasets, except for the `fietspaaltjes` dataset:

```
django create_mock_data fietspaaltjes --exclude  # or -x
```

To generate records for the `bbga` dataset, by loading the schema from the
local filesystem:

```
django create_mock_data /datasets.json
```

During record generation in `create_mock_data`, the relations are not added,
so foreign key fields will be filled with NULL values. There is a second
management command `relate_mock_data` that can be used to add the relations.
This command supports positional arguments for datasets in the same way as
`create_mock_data`. Furthermore, the command also has the `--exclude` option
to reverse the meaning of the positional dataset arguments.

E.g. to add relations to all datasets:

```
django relate_mock_data
```

To add relations for `bag` and `gebieden` only:

```
django relate_mock_data bag gebieden
```

To add relations for all datasets except `meetbouten`:

```
django relate_mock_data meetbouten --exclude  # or -x
```

NB. When only a subset of the datasets is being mocked, the command can fail
when datasets that are involved in a relation are missing, so make sure to
include all relevant datasets.

For convenience, an additional management command `truncate_tables` has been
added, to truncate all tables.
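Putting the two commands together, a typical mock-data workflow (using only
the options documented above; the record count is arbitrary) looks like:

```
# Generate 10 mock records per table for the bag and gebieden datasets
django create_mock_data bag gebieden --size 10

# Fill in the foreign keys that were left NULL during generation
django relate_mock_data bag gebieden
```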
%package help
Summary:        Development documents and examples for amsterdam-schema-tools
Provides:       python3-amsterdam-schema-tools-doc

%description help
# amsterdam-schema-tools

Set of libraries and tools to work with Amsterdam schema.

Install the package with: `pip install amsterdam-schema-tools`. This installs
the library and a command-line tool called `schema`, with various subcommands.
A listing can be obtained from `schema --help`.

Subcommands that talk to a PostgreSQL database expect either a `DATABASE_URL`
environment variable or a command-line option `--db-url` with a DSN.

Many subcommands want to know where to find schema files. Most will look in a
directory of schemas denoted by the `SCHEMA_URL` environment variable or the
`--schema-url` command-line option. E.g.,

    schema create tables --schema-url=myschemas mydataset

will try to load the schema for `mydataset` from
`myschemas/mydataset/dataset.json`.

## Generate amsterdam schema from existing database tables

The `--prefix` argument controls whether table prefixes are removed in the
schema, because that is required for Django models.

As an example, we can generate a BAG schema. Point `DATABASE_URL` to the
`bag_v11` database and then run:

    schema show tablenames | sort | awk '/^bag_/{print}' | xargs schema introspect db bag --prefix bag_ | jq

**jq** formats the output nicely, and it can be redirected directly to the
correct directory in the schemas repository.

## Express amsterdam schema information in relational tables

Amsterdam schema is expressed as jsonschema. However, to make it easier for
people with a more relational mind- or toolset, it is possible to express
amsterdam schema as a set of relational tables. These tables are
*meta_dataset*, *meta_table* and *meta_field*. It is possible to convert a
jsonschema into the relational table structure and vice versa.

This command converts a dataset from an existing dataset in jsonschema format:

    schema import schema

To convert from relational tables back to jsonschema:

    schema show schema

## Generating amsterdam schema from existing GeoJSON files

The following commands can be used to inspect and import the GeoJSON files:

    schema introspect geojson *.geojson > schema.json
    edit schema.json  # fine-tune the table names
    schema import geojson schema.json file1.geojson
    schema import geojson schema.json file2.geojson

## Importing GOB events

The schematools library has a module that reads GOB events into database
tables that are defined by an Amsterdam schema. This module can be used to
read GOB events from a Kafka stream. It is also possible to read GOB events
from a batch file with line-separated events using:

    schema import events

## Export datasets

Datasets can be exported to different file formats. Currently supported are
geopackage, csv and jsonlines. The command for exporting the dataset tables is:

    schema export [geopackage|csv|jsonlines]

The command has several command-line options that can be used. Documentation
about these flags can be shown using the `--help` option.

## Schema Tools as a pre-commit hook

Included in the project is a `pre-commit` hook that can validate schema files
in a project such as
[amsterdam-schema](https://github.com/Amsterdam/amsterdam-schema).

To configure it, extend the `.pre-commit-config.yaml` in the project with the
schema file definitions as follows:

```yaml
  - repo: https://github.com/Amsterdam/schema-tools
    rev: v3.5.0
    hooks:
      - id: validate-schema
        args: ['https://schemas.data.amsterdam.nl/schema@v1.2.0#']
        exclude: |
          (?x)^(
            schema.+|          # exclude meta schemas
            datasets/index.json
          )$
```

`args` is a one-element list containing the URL to the Amsterdam Meta Schema.

`validate-schema` will only process `json` files. However, not all `json`
files are Amsterdam schema files. To exclude files or directories, use
`exclude` with a pattern.

`pre-commit` depends on properly tagged revisions of its hooks. Hence, we
should not only bump version numbers on updates to this package, but also
commit a tag with the version number; see below.

## Doing a release

(This is for schema-tools developers.)

We use GitHub pull requests. If your PR should produce a new release of
schema-tools, make sure one of the commits increments the version number in
``setup.cfg`` appropriately. Then,

* merge the commit in GitHub, after review;
* pull the code from GitHub and merge it into the master branch,
  ``git checkout master && git fetch origin && git merge --ff-only origin/master``;
* tag the release X.Y.Z with ``git tag -a vX.Y.Z -m "Bump to vX.Y.Z"``;
* push the tag to GitHub with ``git push origin --tags``;
* release to PyPI: ``make upload`` (requires the PyPI secret).
## Mocking data

The schematools library contains two Django management commands to generate
mock data. The first one is `create_mock_data`, which generates mock data for
all the datasets that are found at the configured schema location `SCHEMA_URL`
(where `SCHEMA_URL` can be configured to point to a path on the local
filesystem).

The `create_mock_data` command processes all datasets. However, it is possible
to limit this by adding positional arguments. These positional arguments can
be dataset ids or paths to the location of the `dataset.json` on the local
filesystem. Furthermore, the command has some options, e.g. to change the
default number of generated records (`--size`) or to reverse the meaning of
the positional arguments using `--exclude`. To avoid duplicate primary keys
on subsequent runs, the `--start-at` option can be used to start
autonumbering of primary keys at an offset.

E.g. to generate 5 records for the `bag` and `gebieden` datasets, starting
the autonumbering of primary keys at 50:

```
django create_mock_data bag gebieden --size 5 --start-at 50
```

To generate records for all datasets, except for the `fietspaaltjes` dataset:

```
django create_mock_data fietspaaltjes --exclude  # or -x
```

To generate records for the `bbga` dataset, by loading the schema from the
local filesystem:

```
django create_mock_data /datasets.json
```

During record generation in `create_mock_data`, the relations are not added,
so foreign key fields will be filled with NULL values. There is a second
management command `relate_mock_data` that can be used to add the relations.
This command supports positional arguments for datasets in the same way as
`create_mock_data`. Furthermore, the command also has the `--exclude` option
to reverse the meaning of the positional dataset arguments.

E.g. to add relations to all datasets:

```
django relate_mock_data
```

To add relations for `bag` and `gebieden` only:

```
django relate_mock_data bag gebieden
```

To add relations for all datasets except `meetbouten`:

```
django relate_mock_data meetbouten --exclude  # or -x
```

NB. When only a subset of the datasets is being mocked, the command can fail
when datasets that are involved in a relation are missing, so make sure to
include all relevant datasets.

For convenience, an additional management command `truncate_tables` has been
added, to truncate all tables.

%prep
%autosetup -n amsterdam-schema-tools-5.9.3

%build
%py3_build

%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
	find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
	find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
	find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
	find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
	find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .

%files -n python3-amsterdam-schema-tools -f filelist.lst
%dir %{python3_sitelib}/*

%files help -f doclist.lst
%{_docdir}/*

%changelog
* Tue Apr 25 2023 Python_Bot - 5.9.3-1
- Package Spec generated