author     CoprDistGit <infra@openeuler.org>  2023-04-10 15:23:04 +0000
committer  CoprDistGit <infra@openeuler.org>  2023-04-10 15:23:04 +0000
commit     640fb96d52ebf736abd59cc2393b58b9ee97ff34 (patch)
tree       8545111a9135aad6d7f6f623f23f0c4853725c64
parent     0ff3c3a8510ebeb9cee8bd07e085370a37ae8e87 (diff)
automatic import of python-tableschema
-rw-r--r--  .gitignore               |    1
-rw-r--r--  python-tableschema.spec  | 4682
-rw-r--r--  sources                  |    1
3 files changed, 4684 insertions, 0 deletions
diff --git a/.gitignore b/.gitignore
index e69de29..cf11c34 100644
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1 @@
+/tableschema-1.20.2.tar.gz
diff --git a/python-tableschema.spec b/python-tableschema.spec
new file mode 100644
index 0000000..2450cd9
--- /dev/null
+++ b/python-tableschema.spec
@@ -0,0 +1,4682 @@
+%global _empty_manifest_terminate_build 0
+Name: python-tableschema
+Version: 1.20.2
+Release: 1
+Summary: A utility library for working with Table Schema in Python
+License: MIT
+URL: https://github.com/frictionlessdata/tableschema-py
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/86/94/bea498476f3e4b65e7b46e200af68749f0a762d8bccd048898e0267b3859/tableschema-1.20.2.tar.gz
+BuildArch: noarch
+
+Requires: python3-six
+Requires: python3-click
+Requires: python3-requests
+Requires: python3-cached-property
+Requires: python3-dateutil
+Requires: python3-jsonschema
+Requires: python3-unicodecsv
+Requires: python3-isodate
+Requires: python3-rfc3986
+Requires: python3-tabulator
+Requires: python3-mock
+Requires: python3-pylama
+Requires: python3-pytest
+Requires: python3-pytest-cov
+
+%description
+# tableschema-py
+
+[![Travis](https://travis-ci.org/frictionlessdata/tableschema-py.svg?branch=master)](https://travis-ci.org/frictionlessdata/tableschema-py)
+[![Coveralls](http://img.shields.io/coveralls/frictionlessdata/tableschema-py.svg?branch=master)](https://coveralls.io/r/frictionlessdata/tableschema-py?branch=master)
+[![PyPi](https://img.shields.io/pypi/v/tableschema.svg)](https://pypi.python.org/pypi/tableschema)
+[![Github](https://img.shields.io/badge/github-master-brightgreen)](https://github.com/frictionlessdata/tableschema-py)
+[![Gitter](https://img.shields.io/gitter/room/frictionlessdata/chat.svg)](https://gitter.im/frictionlessdata/chat)
+
+A Python implementation of the [Table Schema](http://specs.frictionlessdata.io/table-schema/) standard.
+
+> **[Important Notice]** We have released [Frictionless Framework](https://github.com/frictionlessdata/frictionless-py). This framework provides improved `tableschema` functionality extended to be a complete data solution. The change is not breaking for existing software, so no actions are required. Please read the [Migration Guide](https://framework.frictionlessdata.io/docs/development/migration) from `tableschema` to Frictionless Framework.
+> - we continue to bug-fix `tableschema@1.x` in this [repository](https://github.com/frictionlessdata/tableschema-py), and it remains available on [PyPi](https://pypi.org/project/tableschema/) as before
+> - please note that the API of `frictionless@3.x`, which we're working on at the moment, is not stable
+> - we will release `frictionless@4.x` by the end of 2020 as the first SemVer/stable version
+
+## Features
+
+- `Table` to work with data tables described by Table Schema
+- `Schema` representing Table Schema
+- `Field` representing Table Schema field
+- `validate` to validate Table Schema
+- `infer` to infer Table Schema from data
+- built-in command-line interface to validate and infer schemas
+- storage/plugins system to connect tables to different storage backends like SQL Database
+
+## Contents
+
+<!--TOC-->
+
+ - [Getting Started](#getting-started)
+ - [Installation](#installation)
+ - [Documentation](#documentation)
+ - [Introduction](#introduction)
+ - [Working with Table](#working-with-table)
+ - [Working with Schema](#working-with-schema)
+ - [Working with Field](#working-with-field)
+ - [API Reference](#api-reference)
+ - [`cli`](#cli)
+ - [`Table`](#table)
+ - [`Schema`](#schema)
+ - [`Field`](#field)
+ - [`Storage`](#storage)
+ - [`validate`](#validate)
+ - [`infer`](#infer)
+ - [`FailedCast`](#failedcast)
+ - [`DataPackageException`](#datapackageexception)
+ - [`TableSchemaException`](#tableschemaexception)
+ - [`LoadError`](#loaderror)
+ - [`ValidationError`](#validationerror)
+ - [`CastError`](#casterror)
+ - [`IntegrityError`](#integrityerror)
+ - [`UniqueKeyError`](#uniquekeyerror)
+ - [`RelationError`](#relationerror)
+ - [`UnresolvedFKError`](#unresolvedfkerror)
+ - [`StorageError`](#storageerror)
+ - [Experimental](#experimental)
+ - [Contributing](#contributing)
+ - [Changelog](#changelog)
+
+<!--TOC-->
+
+## Getting Started
+
+### Installation
+
+The package uses semantic versioning, which means that major versions can include breaking changes. It's highly recommended to specify a `tableschema` version range in your `setup/requirements` file, e.g. `tableschema>=1.0,<2.0`.
+
+```bash
+$ pip install tableschema
+```
+
+## Documentation
+
+### Introduction
+
+Let's start with a simple example:
+
+```python
+from tableschema import Table
+
+# Create table
+table = Table('path.csv', schema='schema.json')
+
+# Print schema descriptor
+print(table.schema.descriptor)
+
+# Print cast rows in a dict form
+for keyed_row in table.iter(keyed=True):
+ print(keyed_row)
+```
+
+### Working with Table
+
+A table is a core concept in the tabular data world. It represents data with metadata (a Table Schema). Let's see how we can use it in practice.
+
+Suppose we have a local CSV file. Inline data and remote links are also supported by the `Table` class (except local files for in-browser usage, of course). But say it's `data.csv` for now:
+
+```csv
+city,location
+london,"51.50,-0.11"
+paris,"48.85,2.30"
+rome,N/A
+```
+
+Let's create and read a table instance. We use the `Table` constructor and the `table.read` method with the `keyed` option to get an array of keyed rows:
+
+```python
+table = Table('data.csv')
+table.headers # ['city', 'location']
+table.read(keyed=True)
+# [
+# {city: 'london', location: '51.50,-0.11'},
+# {city: 'paris', location: '48.85,2.30'},
+# {city: 'rome', location: 'N/A'},
+# ]
+```
+
+As we can see, our locations are just strings, but they should be geopoints. Also, Rome's location is not available; it's the string `N/A` rather than `None`. First we have to infer a Table Schema:
+
+```python
+table.infer()
+table.schema.descriptor
+# { fields:
+# [ { name: 'city', type: 'string', format: 'default' },
+# { name: 'location', type: 'geopoint', format: 'default' } ],
+# missingValues: [ '' ] }
+table.read(keyed=True)
+# Fails with a data validation error
+```
+
+Let's fix the "not available" location. There is a `missingValues` property in the Table Schema specification. As a first try, we set `missingValues` to `N/A` in `table.schema.descriptor`. The schema descriptor can be changed in-place, but all changes must also be committed using `table.schema.commit()`:
+
+```python
+table.schema.descriptor['missingValues'] = 'N/A'
+table.schema.commit()
+table.schema.valid # false
+table.schema.errors
+# [<ValidationError: "'N/A' is not of type 'array'">]
+```
+
+As good citizens, we've decided to check our schema descriptor's validity. And it's not valid! We should use an array for the `missingValues` property. Also, don't forget to include the empty string as a valid missing value:
+
+```python
+table.schema.descriptor['missingValues'] = ['', 'N/A']
+table.schema.commit()
+table.schema.valid # true
+```
+
+All good. It looks like we're ready to read our data again:
+
+```python
+table.read(keyed=True)
+# [
+# {city: 'london', location: [51.50,-0.11]},
+# {city: 'paris', location: [48.85,2.30]},
+# {city: 'rome', location: null},
+# ]
+```
+
+Now we see that:
+- locations are arrays with numeric latitude and longitude
+- Rome's location is a native Python `None`
+
+And because there are no errors after reading, we can be sure that our data is valid against our schema. Let's save it:
+
+```python
+table.schema.save('schema.json')
+table.save('data.csv')
+```
+
+Our `data.csv` looks the same because it has been serialized back to CSV format. But now we have `schema.json`:
+
+```json
+{
+ "fields": [
+ {
+ "name": "city",
+ "type": "string",
+ "format": "default"
+ },
+ {
+ "name": "location",
+ "type": "geopoint",
+ "format": "default"
+ }
+ ],
+ "missingValues": [
+ "",
+ "N/A"
+ ]
+}
+
+```
+
+If we decide to improve it even more, we can update the schema file and then open the table again, this time providing a schema path:
+
+```python
+table = Table('data.csv', schema='schema.json')
+# Continue the work
+```
+
+As already mentioned, a given schema can be used to *validate* data (see the [Schema](#schema) section for schema specification details). In the default mode, invalid data rows immediately trigger an [exception](#exceptions) in the `table.iter()`/`table.read()` methods.
+
+Suppose this schema-invalid local file `invalid_data.csv`:
+```csv
+key,value
+zero,0
+one,not_an_integer
+two,2
+```
+
+We're going to validate the data against the following schema:
+```python
+table = Table(
+ 'invalid_data.csv',
+ schema={'fields': [{'name': 'key'}, {'name': 'value', 'type': 'integer'}]})
+```
+
+Iterating over the data triggers an exception due to the failed cast of `'not_an_integer'` to `int`:
+```python
+for row in table.iter():
+ print(row)
+
+# Traceback (most recent call last):
+# ...
+# tableschema.exceptions.CastError: There are 1 cast errors (see exception.errors) for row "3"
+```
+
+Hint: The row number count starts with 1 and also includes header lines.
+
+(Note: You can optionally switch off `iter()`/`read()` value casting using the `cast` parameter; see the reference below.)
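+
+For instance, here is a quick sketch on the `invalid_data.csv` table above; with `cast=False` the raw string values come back untouched:
+
+```python
+for row in table.iter(cast=False):
+    print(row)
+
+# ['zero', '0']
+# ['one', 'not_an_integer']
+# ['two', '2']
+```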
+
+By providing a custom exception handler (a callable) to those methods, you can handle raised exceptions at your own discretion, i.e. "fail late" and, for example, gather a validation report on the whole data:
+
+```python
+errors = []
+def exc_handler(exc, row_number=None, row_data=None, error_data=None):
+ errors.append((exc, row_number, row_data, error_data))
+
+for row in table.iter(exc_handler=exc_handler):
+ print(row)
+
+# ['zero', 0]
+# ['one', FailedCast('not_an_integer')]
+# ['two', 2]
+
+print(errors)
+
+# [(CastError('There are 1 cast errors (see exception.errors) for row "3"',),
+# 3,
+# OrderedDict([('key', 'one'), ('value', 'not_an_integer')]),
+# OrderedDict([('value', 'not_an_integer')]))]
+```
+
+Note that
+
+- Data rows are yielded even though the data is schema-invalid; this is because our custom exception handler chooses not to raise exceptions (but rather collects them in the errors list).
+- Data field values that can't be cast properly (if the `iter()`/`read()` `cast` parameter is set to True, which is the default) are wrapped into a `FailedCast` "value holder". This allows distinguishing uncast values from successfully cast values on the data consumer side. `FailedCast` instances can only be yielded when custom exception handling is in place.
+- The custom exception handler callable must support a function signature as specified in the `iter()`/`read()` sections of the `Table` class API reference.
+
+### Working with Schema
+
+A model of a schema with helpful methods for working with the schema and supported data. Schema instances can be initialized with a schema source as a URL to a JSON file or a JSON object. The schema is initially validated (see [validate](#validate) below). By default, validation errors are stored in `schema.errors`, but in strict mode they are raised instantly.
+
+Let's create a blank schema. It's not valid because the `descriptor.fields` property is required by the [Table Schema](http://specs.frictionlessdata.io/table-schema/) specification:
+
+```python
+schema = Schema()
+schema.valid # false
+schema.errors
+# [<ValidationError: "'fields' is a required property">]
+```
+
+To avoid creating a schema descriptor by hand, we will use the `schema.infer` method to infer the descriptor from given data:
+
+```python
+schema.infer([
+ ['id', 'age', 'name'],
+ ['1','39','Paul'],
+ ['2','23','Jimmy'],
+ ['3','36','Jane'],
+ ['4','28','Judy'],
+])
+schema.valid # true
+schema.descriptor
+#{ fields:
+# [ { name: 'id', type: 'integer', format: 'default' },
+# { name: 'age', type: 'integer', format: 'default' },
+# { name: 'name', type: 'string', format: 'default' } ],
+# missingValues: [ '' ] }
+```
+
+Now we have an inferred schema and it's valid. We can cast data rows against it; string input will be cast to the corresponding types:
+
+```python
+schema.cast_row(['5', '66', 'Sam'])
+# [ 5, 66, 'Sam' ]
+```
+
+But if we try to provide a missing value for the `age` field, the cast will fail because the only valid "missing" value is the empty string. Let's update our schema:
+
+```python
+schema.cast_row(['6', 'N/A', 'Walt'])
+# Cast error
+schema.descriptor['missingValues'] = ['', 'N/A']
+schema.commit()
+schema.cast_row(['6', 'N/A', 'Walt'])
+# [ 6, None, 'Walt' ]
+```
+
+We can save the schema to a local file, and resume work on it at any time by loading it from that file:
+
+```python
+schema.save('schema.json')
+schema = Schema('schema.json')
+```
+
+### Working with Field
+
+```python
+from tableschema import Field
+
+# Init field
+field = Field({'name': 'name', 'type': 'number'})
+
+# Cast a value
+field.cast_value('12345') # -> 12345
+```
+
+Data values can be cast to native Python objects with a `Field` instance. Field instances can be initialized with [field descriptors](https://specs.frictionlessdata.io/table-schema/). This allows formats and constraints to be defined.
+
+Casting a value checks that the value is of the expected type, is in the correct format, and complies with any constraints imposed by the schema. E.g. a date value (in ISO 8601 format) can be cast with a DateType instance. Values that can't be cast will raise an `InvalidCastError` exception.
+
+Casting a value that doesn't meet the constraints will raise a `ConstraintError` exception.
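+
+As a hedged sketch (the `constraints` descriptor and field values here are illustrative; we catch the documented `TableSchemaException` base class since the concrete error subclass may vary):
+
+```python
+from tableschema import Field, exceptions
+
+# An integer field with an illustrative 'minimum' constraint
+field = Field({'name': 'age', 'type': 'integer', 'constraints': {'minimum': 18}})
+
+field.cast_value('21')  # -> 21
+try:
+    field.cast_value('12')  # violates the 'minimum' constraint
+except exceptions.TableSchemaException as exception:
+    print(exception)
+```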
+
+## API Reference
+
+### `cli`
+```python
+cli()
+```
+Command-line interface
+
+```
+Usage: tableschema [OPTIONS] COMMAND [ARGS]...
+
+Options:
+ --help Show this message and exit.
+
+Commands:
+ infer Infer a schema from data.
+ info Return info on this version of Table Schema
+ validate Validate that a supposed schema is in fact a Table Schema.
+```
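+
+For example (a sketch, assuming the `data.csv` and `schema.json` files from the tutorial above):
+
+```bash
+$ tableschema infer data.csv
+$ tableschema validate schema.json
+```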
+
+
+### `Table`
+```python
+Table(self,
+ source,
+ schema=None,
+ strict=False,
+ post_cast=[],
+ storage=None,
+ **options)
+```
+Table representation
+
+__Arguments__
+- __source (str/list[])__: data source one of:
+ - local file (path)
+ - remote file (url)
+ - array of arrays representing the rows
+- __schema (any)__: data schema in all forms supported by `Schema` class
+- __strict (bool)__: strictness option to pass to `Schema` constructor
+- __post_cast (function[])__: list of post cast processors
+- __storage (None)__: storage name like `sql` or `bigquery`
+- __options (dict)__: `tabulator` or storage's options
+
+__Raises__
+- `TableSchemaException`: raises on any error
+
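+A short sketch of the inline-data form of `source` (an array of arrays; `headers=1` is a `tabulator` option telling it that the first row holds the headers):
+
+```python
+from tableschema import Table
+
+rows = [
+    ['id', 'city'],
+    ['1', 'london'],
+    ['2', 'paris'],
+]
+schema = {'fields': [
+    {'name': 'id', 'type': 'integer'},
+    {'name': 'city', 'type': 'string'},
+]}
+table = Table(rows, schema=schema, headers=1)
+table.read()  # [[1, 'london'], [2, 'paris']]
+```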
+
+
+#### `table.hash`
+Table's SHA256 hash if it's available.
+
+It is available only once the table has been read in full, e.g. via `table.read`; otherwise it returns `None`.
+In the middle of an iteration it returns the hash of the contents read so far.
+
+__Returns__
+
+`str/None`: SHA256 hash
+
+
+
+#### `table.headers`
+Table's headers if available
+
+__Returns__
+
+`str[]`: headers
+
+
+
+#### `table.schema`
+Returns schema class instance if available
+
+__Returns__
+
+`Schema`: schema
+
+
+
+#### `table.size`
+Table's size in BYTES if it's available
+
+It is available only once the table has been read in full, e.g. via `table.read`; otherwise it returns `None`.
+In the middle of an iteration it returns the size of the contents read so far.
+
+__Returns__
+
+`int/None`: size in BYTES
+
+
+
+#### `table.iter`
+```python
+table.iter(keyed=False,
+ extended=False,
+ cast=True,
+ integrity=False,
+ relations=False,
+ foreign_keys_values=False,
+ exc_handler=None)
+```
+Iterates through the table data and emits rows cast based on table schema.
+
+__Arguments__
+
+
+ keyed (bool):
+ yield keyed rows in a form of `{header1: value1, header2: value2}`
+ (default is false; the form of rows is `[value1, value2]`)
+
+ extended (bool):
+        yield extended rows in a form of `[rowNumber, [header1, header2], [value1, value2]]`
+ (default is false; the form of rows is `[value1, value2]`)
+
+ cast (bool):
+ disable data casting if false
+ (default is true)
+
+ integrity (dict):
+ dictionary in a form of `{'size': <bytes>, 'hash': '<sha256>'}`
+ to check integrity of the table when it's read completely.
+ Both keys are optional.
+
+ relations (dict):
+ dictionary of foreign key references in a form
+ of `{resource1: [{field1: value1, field2: value2}, ...], ...}`.
+        If provided, foreign key fields will be checked and resolved
+        to one of their references (/!\ one-to-many fks are not completely resolved).
+
+ foreign_keys_values (dict):
+ three-level dictionary of foreign key references optimized
+ to speed up validation process in a form of
+ `{resource1: {(fk_field1, fk_field2): {(value1, value2): {one_keyedrow}, ... }}}`.
+ If not provided but relations is true, it will be created
+ before the validation process by *index_foreign_keys_values* method
+
+ exc_handler (func):
+ optional custom exception handler callable.
+ Can be used to defer raising errors (i.e. "fail late"), e.g.
+ for data validation purposes. Must support the signature below
+
+__Custom exception handler__
+
+
+```python
+def exc_handler(exc, row_number=None, row_data=None, error_data=None):
+ '''Custom exception handler (example)
+
+ # Arguments:
+ exc(Exception):
+ Deferred exception instance
+ row_number(int):
+ Data row number that triggers exception exc
+ row_data(OrderedDict):
+ Invalid data row source data
+ error_data(OrderedDict):
+ Data row source data field subset responsible for the error, if
+ applicable (e.g. invalid primary or foreign key fields). May be
+ identical to row_data.
+ '''
+ # ...
+```
+
+__Raises__
+- `TableSchemaException`: base class of any error
+- `CastError`: data cast error
+- `IntegrityError`: integrity checking error
+- `UniqueKeyError`: unique key constraint violation
+- `UnresolvedFKError`: unresolved foreign key reference error
+
+__Returns__
+
+`Iterator[list]`: yields rows
+
+
+
+#### `table.read`
+```python
+table.read(keyed=False,
+ extended=False,
+ cast=True,
+ limit=None,
+ integrity=False,
+ relations=False,
+ foreign_keys_values=False,
+ exc_handler=None)
+```
+Read the whole table and return as array of rows
+
+> It has the same API as `table.iter` except for the following:
+
+__Arguments__
+- __limit (int)__: limit count of rows to read and return
+
+__Returns__
+
+`list[]`: returns rows
+
+
+
+#### `table.infer`
+```python
+table.infer(limit=100,
+ confidence=0.75,
+ missing_values=[''],
+ guesser_cls=None,
+ resolver_cls=None)
+```
+Infer a schema for the table.
+
+It will infer and set Table Schema to `table.schema` based on table data.
+
+__Arguments__
+- __limit (int)__: limit rows sample size
+- __confidence (float)__: how many casting errors are allowed (as a ratio, between 0 and 1)
+- __missing_values (str[])__: list of missing values (by default `['']`)
+- __guesser_cls (class)__: you can implement inferring strategies by
+ providing type-guessing and type-resolving classes [experimental]
+- __resolver_cls (class)__: you can implement inferring strategies by
+ providing type-guessing and type-resolving classes [experimental]
+
+__Returns__
+
+`dict`: Table Schema descriptor
+
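+A short usage sketch (file name and arguments are illustrative):
+
+```python
+table = Table('data.csv')
+descriptor = table.infer(limit=50, missing_values=['', 'N/A'])
+table.schema.valid  # True if the inferred schema is valid
+```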
+
+
+#### `table.save`
+```python
+table.save(target, storage=None, **options)
+```
+Save data source to file locally in CSV format with `,` (comma) delimiter
+
+> To save schema use `table.schema.save()`
+
+__Arguments__
+- __target (str)__: saving target (e.g. file path)
+- __storage (None/str)__: storage name like `sql` or `bigquery`
+- __options (dict)__: `tabulator` or storage options
+
+__Raises__
+- `TableSchemaException`: raises an error if there is a saving problem
+
+__Returns__
+
+`True/Storage`: returns true or storage instance
+
+
+
+#### `table.index_foreign_keys_values`
+```python
+table.index_foreign_keys_values(relations)
+```
+Creates a three-level dictionary of foreign key references
+
+We create them optimized to speed up validation process in a form of
+`{resource1: {(fk_field1, fk_field2): {(value1, value2): {one_keyedrow}, ... }}}`.
+
+For each foreign key in the schema it will iterate through the corresponding
+`relations['resource']` to create an index (i.e. a dict) of existing values
+for the foreign fields, storing one keyed row for each value combination.
+
+The optimization relies on indexing the possible values of each foreign key
+in a hashmap to speed up later resolution.
+
+This method is public to allow creating the index once and applying it
+to multiple tables sharing the same schema
+(typically [grouped resources in datapackage](https://github.com/frictionlessdata/datapackage-py#group))
+
+__Notes__
+
+- the second key of the output is a tuple of the foreign fields,
+ a proxy identifier of the foreign key
+- the same relation resource can be indexed multiple times,
+  as a schema can contain more than one foreign key
+  pointing to the same resource
+
+__Arguments__
+- __relations (dict)__:
+ dict of foreign key references in a form of
+ `{resource1: [{field1: value1, field2: value2}, ...], ...}`.
+ It must contain all resources pointed in the foreign keys schema definition.
+
+__Returns__
+
+`dict`:
+ returns a three-level dictionary of foreign key references
+ optimized to speed up validation process in a form of
+ `{resource1: {(fk_field1, fk_field2): {(value1, value2): {one_keyedrow}, ... }}})`
+
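+A hedged sketch of the intended workflow (resource and field names are illustrative): build the index once, then pass it to each read:
+
+```python
+relations = {
+    'cities': [
+        {'id': 1, 'name': 'london'},
+        {'id': 2, 'name': 'paris'},
+    ],
+}
+fk_values = table.index_foreign_keys_values(relations)
+rows = table.read(relations=relations, foreign_keys_values=fk_values)
+```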
+
+### `Schema`
+```python
+Schema(self, descriptor={}, strict=False)
+```
+Schema representation
+
+__Arguments__
+- __descriptor (str/dict)__: schema descriptor one of:
+ - local path
+ - remote url
+ - dictionary
+- __strict (bool)__: flag to specify validation behaviour:
+ - if false, errors will not be raised but instead collected in `schema.errors`
+ - if true, validation errors are raised immediately
+
+__Raises__
+- `TableSchemaException`: raises any error that occurs during the process
+
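+A small sketch of the two validation behaviours:
+
+```python
+from tableschema import Schema, exceptions
+
+# Non-strict (default): errors are collected
+schema = Schema({})
+schema.valid   # False
+schema.errors  # [<ValidationError: "'fields' is a required property">]
+
+# Strict: validation errors are raised immediately
+try:
+    Schema({}, strict=True)
+except exceptions.TableSchemaException as exception:
+    print(exception)
+```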
+
+
+#### `schema.descriptor`
+Schema's descriptor
+
+__Returns__
+
+`dict`: descriptor
+
+
+
+#### `schema.errors`
+Validation errors
+
+Always empty in strict mode.
+
+__Returns__
+
+`Exception[]`: validation errors
+
+
+
+#### `schema.field_names`
+Schema's field names
+
+__Returns__
+
+`str[]`: an array of field names
+
+
+
+#### `schema.fields`
+Schema's fields
+
+__Returns__
+
+`Field[]`: an array of field instances
+
+
+
+#### `schema.foreign_keys`
+Schema's foreign keys
+
+__Returns__
+
+`dict[]`: foreign keys
+
+
+
+#### `schema.headers`
+Schema's field names
+
+__Returns__
+
+`str[]`: an array of field names
+
+
+
+#### `schema.missing_values`
+Schema's missing values
+
+__Returns__
+
+`str[]`: missing values
+
+
+
+#### `schema.primary_key`
+Schema's primary keys
+
+__Returns__
+
+`str[]`: primary keys
+
+
+
+#### `schema.valid`
+Validation status
+
+Always true in strict mode.
+
+__Returns__
+
+`bool`: validation status
+
+
+
+#### `schema.get_field`
+```python
+schema.get_field(name)
+```
+Get schema's field by name.
+
+> Use `table.update_field` if you want to modify the field descriptor
+
+__Arguments__
+- __name (str)__: schema field name
+
+__Returns__
+
+`Field/None`: `Field` instance or `None` if not found
+
+
+
+#### `schema.add_field`
+```python
+schema.add_field(descriptor)
+```
+Add new field to schema.
+
+The schema descriptor will be validated with the newly added field descriptor.
+
+__Arguments__
+- __descriptor (dict)__: field descriptor
+
+__Raises__
+- `TableSchemaException`: raises any error that occurs during the process
+
+__Returns__
+
+`Field/None`: added `Field` instance or `None` if not added
+
+
+
+#### `schema.update_field`
+```python
+schema.update_field(name, update)
+```
+Update existing descriptor field by name
+
+__Arguments__
+- __name (str)__: schema field name
+- __update (dict)__: update to apply to field's descriptor
+
+__Returns__
+
+`bool`: true on success and false if no field is found to be modified
+
+
+
+#### `schema.remove_field`
+```python
+schema.remove_field(name)
+```
+Remove field resource by name.
+
+The schema descriptor will be validated after field descriptor removal.
+
+__Arguments__
+- __name (str)__: schema field name
+
+__Raises__
+- `TableSchemaException`: raises any error that occurs during the process
+
+__Returns__
+
+`Field/None`: removed `Field` instance or `None` if not found
+
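+Taken together, a short sketch of field manipulation (field names are illustrative):
+
+```python
+from tableschema import Schema
+
+schema = Schema({'fields': [{'name': 'id', 'type': 'integer'}]})
+schema.add_field({'name': 'city', 'type': 'string'})
+schema.update_field('city', {'title': 'City'})
+schema.commit()
+schema.remove_field('city')
+schema.field_names  # ['id']
+```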
+
+
+#### `schema.cast_row`
+```python
+schema.cast_row(row, fail_fast=False, row_number=None, exc_handler=None)
+```
+Cast row based on field types and formats.
+
+__Arguments__
+- __row (any[])__: data row as an array of values
+
+__Returns__
+
+`any[]`: returns cast data row
+
+
+
+#### `schema.infer`
+```python
+schema.infer(rows,
+ headers=1,
+ confidence=0.75,
+ guesser_cls=None,
+ resolver_cls=None)
+```
+Infer and set `schema.descriptor` based on data sample.
+
+__Arguments__
+- __rows (list[])__: array of arrays representing rows.
+- __headers (int/str[])__: data sample headers (one of):
+ - row number containing headers (`rows` should contain headers rows)
+ - array of headers (`rows` should NOT contain headers rows)
+- __confidence (float)__: how many casting errors are allowed (as a ratio, between 0 and 1)
+- __guesser_cls (class)__: you can implement inferring strategies by
+ providing type-guessing and type-resolving classes [experimental]
+- __resolver_cls (class)__: you can implement inferring strategies by
+ providing type-guessing and type-resolving classes [experimental]
+
+__Returns__
+
+`dict`: Table Schema descriptor
+
+
+
+#### `schema.commit`
+```python
+schema.commit(strict=None)
+```
+Update schema instance if there are in-place changes in the descriptor.
+
+__Example__
+
+
+```python
+from tableschema import Schema
+descriptor = {'fields': [{'name': 'my_field', 'title': 'My Field', 'type': 'string'}]}
+schema = Schema(descriptor)
+print(schema.get_field('my_field').descriptor['type']) # string
+
+# Update descriptor by field position
+schema.descriptor['fields'][0]['type'] = 'number'
+# Update descriptor by field name
+schema.update_field('my_field', {'title': 'My Pretty Field'}) # True
+
+# Changes are not committed yet
+print(schema.get_field('my_field').descriptor['type']) # string
+print(schema.get_field('my_field').descriptor['title']) # My Field
+
+# Commit change
+schema.commit()
+print(schema.get_field('my_field').descriptor['type']) # number
+print(schema.get_field('my_field').descriptor['title']) # My Pretty Field
+
+```
+
+__Arguments__
+- __strict (bool)__: alter `strict` mode for further work
+
+__Raises__
+- `TableSchemaException`: raises any error that occurs during the process
+
+__Returns__
+
+`bool`: true on success and false if not modified
+
+
+
+#### `schema.save`
+```python
+schema.save(target, ensure_ascii=True)
+```
+Save schema descriptor to target destination.
+
+__Arguments__
+- __target (str)__: path where to save a descriptor
+
+__Raises__
+- `TableSchemaException`: raises any error that occurs during the process
+
+__Returns__
+
+`bool`: true on success
+
+
+
+### `Field`
+```python
+Field(self, descriptor, missing_values=[''], schema=None)
+```
+Field representation
+
+__Arguments__
+- __descriptor (dict)__: schema field descriptor
+- __missing_values (str[])__: an array of strings representing missing values
+
+__Raises__
+- `TableSchemaException`: raises any error that occurs during the process
+
+
+
+#### `field.constraints`
+Field constraints
+
+__Returns__
+
+`dict`: dict of field constraints
+
+
+
+#### `field.descriptor`
+Field's descriptor
+
+__Returns__
+
+`dict`: descriptor
+
+
+
+#### `field.format`
+Field format
+
+__Returns__
+
+`str`: field format
+
+
+
+#### `field.missing_values`
+Field's missing values
+
+__Returns__
+
+`str[]`: missing values
+
+
+
+#### `field.name`
+Field name
+
+__Returns__
+
+`str`: field name
+
+
+
+#### `field.required`
+Whether field is required
+
+__Returns__
+
+`bool`: true if required
+
+
+
+#### `field.schema`
+Returns a schema instance if the field belongs to some schema
+
+__Returns__
+
+`Schema`: field's schema
+
+
+
+#### `field.type`
+Field type
+
+__Returns__
+
+`str`: field type
+
+
+
+#### `field.cast_value`
+```python
+field.cast_value(value, constraints=True)
+```
+Cast given value according to the field type and format.
+
+__Arguments__
+- __value (any)__: value to cast against field
+- __constraints (bool/str[])__: constraints configuration
+  - it can be set to false to disable constraint checks
+  - it can be an array of constraints to check, e.g. ['minimum', 'maximum']
+
+__Raises__
+- `TableSchemaException`: raises any error that occurs during the process
+
+__Returns__
+
+`any`: returns cast value
+
+
+
+#### `field.test_value`
+```python
+field.test_value(value, constraints=True)
+```
+Test whether a value is compliant with the field.
+
+__Arguments__
+- __value (any)__: value to cast against field
+- __constraints (bool/str[])__: constraints configuration
+
+__Returns__
+
+`bool`: whether the value is compliant with the field
+
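+For example (a sketch):
+
+```python
+from tableschema import Field
+
+field = Field({'name': 'age', 'type': 'integer'})
+field.test_value('42')   # True
+field.test_value('foo')  # False
+```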
+
+### `Storage`
+```python
+Storage(self, **options)
+```
+Storage factory/interface
+
+__For users__
+
+
+> Use `Storage.connect` to instantiate a storage
+
+For instantiation of concrete storage instances,
+`tableschema.Storage` provides a unified factory method `connect`
+(which uses the plugin system under the hood):
+
+```python
+# pip install tableschema_sql
+from tableschema import Storage
+
+storage = Storage.connect('sql', **options)
+storage.create('bucket', descriptor)
+storage.write('bucket', rows)
+storage.read('bucket')
+```
+
+__For integrators__
+
+
+The library includes an interface declaration for implementing tabular `Storage`.
+This interface allows using different data storage systems, like an SQL database,
+with the `tableschema.Table` class (load/save), as well as at the data package level:
+
+![Storage](https://raw.githubusercontent.com/frictionlessdata/tableschema-py/master/data/storage.png)
+
+An implementor must follow the `tableschema.Storage` interface
+to write their own storage backend. Concrete storage backends
+may include additional functionality specific to the concrete storage system.
+See `plugins` below to learn how to integrate a custom storage plugin into your workflow.
+
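+As a hedged illustration, a minimal in-memory backend could look roughly like this (method names follow the `Storage` interface documented below; the behaviour is purely illustrative):
+
+```python
+from tableschema import Storage
+
+class InmemoryStorage(Storage):
+    def __init__(self, **options):
+        self.__descriptors = {}
+        self.__rows = {}
+
+    @property
+    def buckets(self):
+        return list(self.__descriptors)
+
+    def create(self, bucket, descriptor, force=False):
+        self.__descriptors[bucket] = descriptor
+        self.__rows[bucket] = []
+
+    def delete(self, bucket=None, ignore=False):
+        self.__descriptors.pop(bucket, None)
+        self.__rows.pop(bucket, None)
+
+    def describe(self, bucket, descriptor=None):
+        if descriptor is not None:
+            self.__descriptors[bucket] = descriptor
+        return self.__descriptors[bucket]
+
+    def iter(self, bucket):
+        return iter(self.__rows[bucket])
+
+    def read(self, bucket):
+        return list(self.__rows[bucket])
+
+    def write(self, bucket, rows):
+        self.__rows[bucket].extend(rows)
+```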
+
+
+#### `storage.buckets`
+Return list of storage bucket names.
+
+A `bucket` is a special term which has almost the same meaning as `table`.
+You should consider `bucket` as a `table` stored in the `storage`.
+
+__Raises__
+- `exceptions.StorageError`: raises on any error
+
+__Returns__
+
+`str[]`: return list of bucket names
+
+
+
+#### `storage.connect`
+```python
+storage.connect(name, **options)
+```
+Create tabular `storage` based on storage name.
+
+> This method is static: `Storage.connect()`
+
+__Arguments__
+- __name (str)__: storage name like `sql`
+- __options (dict)__: concrete storage options
+
+__Raises__
+- `StorageError`: raises on any error
+
+__Returns__
+
+`Storage`: returns `Storage` instance
+
+
+
+#### `storage.create`
+```python
+storage.create(bucket, descriptor, force=False)
+```
+Create one/multiple buckets.
+
+__Arguments__
+- __bucket (str/list)__: bucket name or list of bucket names
+- __descriptor (dict/dict[])__: schema descriptor or list of descriptors
+- __force (bool)__: whether to delete and re-create already existing buckets
+
+__Raises__
+- `exceptions.StorageError`: raises on any error
+
+
+
+#### `storage.delete`
+```python
+storage.delete(bucket=None, ignore=False)
+```
+Delete one/multiple/all buckets.
+
+__Arguments__
+- __bucket (str/list/None)__: bucket name or list of bucket names to delete.
+    If `None`, all buckets will be deleted
+- __ignore (bool)__: don't raise an error on non-existent bucket deletion
+
+__Raises__
+- `exceptions.StorageError`: raises on any error
+
+
+
+#### `storage.describe`
+```python
+storage.describe(bucket, descriptor=None)
+```
+Get/set bucket's Table Schema descriptor
+
+__Arguments__
+- __bucket (str)__: bucket name
+- __descriptor (dict/None)__: schema descriptor to set
+
+__Raises__
+- `exceptions.StorageError`: raises on any error
+
+__Returns__
+
+`dict`: returns Table Schema descriptor
+
+
+
+#### `storage.iter`
+```python
+storage.iter(bucket)
+```
+Return an iterator of typed values based on the schema of this bucket.
+
+__Arguments__
+- __bucket (str)__: bucket name
+
+__Raises__
+- `exceptions.StorageError`: raises on any error
+
+__Returns__
+
+`list[]`: yields data rows
+
+
+
+#### `storage.read`
+```python
+storage.read(bucket)
+```
+Read typed values based on the schema of this bucket.
+
+__Arguments__
+- __bucket (str)__: bucket name
+
+__Raises__
+- `exceptions.StorageError`: raises on any error
+
+__Returns__
+
+`list[]`: returns data rows
+
+
+
+#### `storage.write`
+```python
+storage.write(bucket, rows)
+```
+This method writes data rows into `storage`.
+
+It should store values of unsupported types as strings internally (like csv does).
+
+__Arguments__
+- __bucket (str)__: bucket name
+- __rows (list[])__: data rows to write
+
+__Raises__
+- `exceptions.StorageError`: raises on any error
+
+
+### `validate`
+```python
+validate(descriptor)
+```
+Validate descriptor
+
+__Arguments__
+- __descriptor (dict)__: schema descriptor to validate
+
+__Raises__
+- `ValidationError`: on validation errors
+
+__Returns__
+
+`bool`: True
+
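+For example (a sketch):
+
+```python
+from tableschema import validate, exceptions
+
+try:
+    validate({'fields': [{'name': 'id', 'type': 'integer'}]})  # -> True
+except exceptions.ValidationError as exception:
+    for error in exception.errors:
+        print(error)
+```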
+
+### `infer`
+```python
+infer(source,
+ headers=1,
+ limit=100,
+ confidence=0.75,
+ missing_values=[''],
+ guesser_cls=None,
+ resolver_cls=None,
+ **options)
+```
+Infer source schema.
+
+__Arguments__
+- __source (any)__: source as path, url or inline data
+- __headers (int/str[])__: headers rows number or headers list
+- __confidence (float)__: how many casting errors are allowed (as a ratio, between 0 and 1)
+- __missing_values (str[])__: list of missing values (by default `['']`)
+- __guesser_cls (class)__: you can implement inferring strategies by
+ providing type-guessing and type-resolving classes [experimental]
+- __resolver_cls (class)__: you can implement inferring strategies by
+ providing type-guessing and type-resolving classes [experimental]
+
+__Raises__
+- `TableSchemaException`: raises any error that occurs during the process
+
+__Returns__
+
+`dict`: returns schema descriptor
+
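+For example (a sketch; the file name is illustrative):
+
+```python
+from tableschema import infer
+
+descriptor = infer('data.csv', limit=500)
+print(descriptor)
+```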
+
+### `FailedCast`
+```python
+FailedCast(self, value)
+```
+Wrap an original data field value that failed to be properly cast.
+
+FailedCast allows further processing/yielding of values while still making it
+possible to distinguish uncast values on the consuming side.
+
+Delegates attribute access and the basic rich comparison methods to the
+underlying object. Supports default user-defined class hashability, i.e. it
+is hashable based on object identity (not based on the wrapped value).
+
+__Arguments__
+- __value (any)__: value
+
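+A consuming-side sketch (assuming a `table` and a custom exception handler as in the examples above):
+
+```python
+from tableschema import FailedCast
+
+# The no-op handler defers errors so FailedCast values can be yielded
+for row in table.iter(exc_handler=lambda exc, *args, **kwargs: None):
+    for value in row:
+        if isinstance(value, FailedCast):
+            print('failed cast:', value)
+```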
+
+### `DataPackageException`
+```python
+DataPackageException(self, message, errors=[])
+```
+Base class for all DataPackage/TableSchema exceptions.
+
+If there are multiple errors, they can be read from the exception object:
+
+```python
+try:
+    pass  # some library action
+except DataPackageException as exception:
+    if exception.multiple:
+        for error in exception.errors:
+            pass  # handle each nested error
+```
+
+
+
+#### `datapackageexception.errors`
+List of nested errors
+
+__Returns__
+
+`DataPackageException[]`: list of nested errors
+
+
+
+#### `datapackageexception.multiple`
+Whether it's a nested exception
+
+__Returns__
+
+`bool`: whether it's a nested exception
+
+
+
+### `TableSchemaException`
+```python
+TableSchemaException(self, message, errors=[])
+```
+Base class for all TableSchema exceptions.
+
+
+### `LoadError`
+```python
+LoadError(self, message, errors=[])
+```
+All loading errors.
+
+
+### `ValidationError`
+```python
+ValidationError(self, message, errors=[])
+```
+All validation errors.
+
+
+### `CastError`
+```python
+CastError(self, message, errors=[])
+```
+All value cast errors.
+
+
+### `IntegrityError`
+```python
+IntegrityError(self, message, errors=[])
+```
+All integrity errors.
+
+
+### `UniqueKeyError`
+```python
+UniqueKeyError(self, message, errors=[])
+```
+Unique key constraint violation (CastError subclass)
+
+
+### `RelationError`
+```python
+RelationError(self, message, errors=[])
+```
+All relations errors.
+
+
+### `UnresolvedFKError`
+```python
+UnresolvedFKError(self, message, errors=[])
+```
+Unresolved foreign key reference error (RelationError subclass).
+
+
+### `StorageError`
+```python
+StorageError(self, message, errors=[])
+```
+All storage errors.
+
+## Experimental
+
+> This API is experimental and can be changed/removed in the future
+
+There is an experimental environment variable `TABLESCHEMA_PRESERVE_MISSING_VALUES` which, if set, affects how data casting works.
+
+By default, missing values are resolved to `None`. When this flag is set, missing values are passed through as-is. For example:
+
+> missing_values.py
+
+```python
+from tableschema import Field
+
+field = Field({'type': 'number'}, missing_values=['-'])
+print(field.cast_value('3'))
+print(field.cast_value('-'))
+```
+
+Running this script in different modes:
+
+```bash
+$ python missing_values.py
+3
+None
+$ TABLESCHEMA_PRESERVE_MISSING_VALUES=1 python missing_values.py
+3
+-
+```
+
+The flag affects all of the library's APIs and software built on top of `tableschema`. For example, Data Package Pipelines:
+
+```bash
+$ TABLESCHEMA_PRESERVE_MISSING_VALUES=1 dpp run ./my_pipeline
+```
+
+## Contributing
+
+> The project follows the [Open Knowledge International coding standards](https://github.com/okfn/coding-standards).
+
+The recommended way to get started is to create and activate a project virtual environment.
+To install the package and development dependencies into the active environment:
+
+```bash
+$ make install
+```
+
+To run tests with linting and coverage:
+
+```bash
+$ make test
+```
+
+## Changelog
+
+Only breaking and the most important changes are described here. The full changelog and documentation for all released versions can be found in the nicely formatted [commit history](https://github.com/frictionlessdata/tableschema-py/commits/master).
+
+#### v1.20
+
+- Added a `--json` flag to the CLI (#287)
+
+#### v1.19
+
+- Deduplicate field names if guessing in infer
+
+#### v1.18
+
+- Publish `field.ERROR/cast_function/check_functions`
+
+#### v1.17
+
+- Added `schema.missing_values` and `field.missing_values`
+
+#### v1.16
+
+- Fixed the way we parse `geopoint`:
+ - as a string it can be in 3 forms ("default", "array", "object") BUT
+ - as native object it can only be a list/tuple
+
+#### v1.15
+
+- Added an experimental `TABLESCHEMA_PRESERVE_MISSING_VALUES` environment variable flag
+
+#### v1.14
+
+- Allow providing custom guesser and resolver to `table.infer` and `infer`
+
+#### v1.13
+
+- Added `missing_values` argument to the `infer` function (#269)
+
+#### v1.12
+
+- Support optional custom exception handling for table.iter/read (#259)
+
+#### v1.11
+
+- Added `preserve_missing_values` parameter to `field.cast_value`
+
+#### v1.10
+
+- Added an ability to check table's integrity while reading
+
+#### v1.9
+
+- Implemented the `table.size` and `table.hash` properties
+
+#### v1.8
+
+- Added `table.index_foreign_keys_values` and improved foreign key checks performance
+
+#### v1.7
+
+- Added `field.schema` property
+
+#### v1.6
+
+- In `strict` mode raise an exception if there are problems in field construction
+
+#### v1.5
+
+- Allow providing custom guesser and resolver to schema infer
+
+#### v1.4
+
+- Added `schema.update_field` method
+
+#### v1.3
+
+- Support datetime with no time for date casting
+
+#### v1.2
+
+- Support floats like 1.0 for integer casting
+
+#### v1.1
+
+- Added the `confidence` parameter to `infer`
+
+#### v1.0
+
+- The library has been rebased on the Frictionless Data specs v1 - https://frictionlessdata.io/specs/table-schema/
+
+
+
+%package -n python3-tableschema
+Summary: A utility library for working with Table Schema in Python
+Provides: python-tableschema
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-tableschema
+# tableschema-py
+
+[![Travis](https://travis-ci.org/frictionlessdata/tableschema-py.svg?branch=master)](https://travis-ci.org/frictionlessdata/tableschema-py)
+[![Coveralls](http://img.shields.io/coveralls/frictionlessdata/tableschema-py.svg?branch=master)](https://coveralls.io/r/frictionlessdata/tableschema-py?branch=master)
+[![PyPi](https://img.shields.io/pypi/v/tableschema.svg)](https://pypi.python.org/pypi/tableschema)
+[![Github](https://img.shields.io/badge/github-master-brightgreen)](https://github.com/frictionlessdata/tableschema-py)
+[![Gitter](https://img.shields.io/gitter/room/frictionlessdata/chat.svg)](https://gitter.im/frictionlessdata/chat)
+
+A Python implementation of the [Table Schema](http://specs.frictionlessdata.io/table-schema/) standard.
+
+> **[Important Notice]** We have released [Frictionless Framework](https://github.com/frictionlessdata/frictionless-py). This framework provides improved `tableschema` functionality extended to be a complete data solution. The change in not breaking for the existing software so no actions are required. Please read the [Migration Guide](https://framework.frictionlessdata.io/docs/development/migration) from `tableschema` to Frictionless Framework.
+> - we continue to bug-fix `tableschema@1.x` in this [repository](https://github.com/frictionlessdata/tableschema-py) as well as it's available on [PyPi](https://pypi.org/project/tableschema/) as it was before
+> - please note that `frictionless@3.x` version's API, we're working on at the moment, is not stable
+> - we will release `frictionless@4.x` by the end of 2020 to be the first SemVer/stable version
+
+## Features
+
+- `Table` to work with data tables described by Table Schema
+- `Schema` representing Table Schema
+- `Field` representing Table Schema field
+- `validate` to validate Table Schema
+- `infer` to infer Table Schema from data
+- built-in command-line interface to validate and infer schemas
+- storage/plugins system to connect tables to different storage backends like SQL Database
+
+## Contents
+
+<!--TOC-->
+
+ - [Getting Started](#getting-started)
+ - [Installation](#installation)
+ - [Documentation](#documentation)
+ - [Introduction](#introduction)
+ - [Working with Table](#working-with-table)
+ - [Working with Schema](#working-with-schema)
+ - [Working with Field](#working-with-field)
+ - [API Reference](#api-reference)
+ - [`cli`](#cli)
+ - [`Table`](#table)
+ - [`Schema`](#schema)
+ - [`Field`](#field)
+ - [`Storage`](#storage)
+ - [`validate`](#validate)
+ - [`infer`](#infer)
+ - [`FailedCast`](#failedcast)
+ - [`DataPackageException`](#datapackageexception)
+ - [`TableSchemaException`](#tableschemaexception)
+ - [`LoadError`](#loaderror)
+ - [`ValidationError`](#validationerror)
+ - [`CastError`](#casterror)
+ - [`IntegrityError`](#integrityerror)
+ - [`UniqueKeyError`](#uniquekeyerror)
+ - [`RelationError`](#relationerror)
+ - [`UnresolvedFKError`](#unresolvedfkerror)
+ - [`StorageError`](#storageerror)
+ - [Experimental](#experimental)
+ - [Contributing](#contributing)
+ - [Changelog](#changelog)
+
+<!--TOC-->
+
+## Getting Started
+
+### Installation
+
+The package uses semantic versioning. It means that major versions could include breaking changes. It's highly recommended to specify `tableschema` version range in your `setup/requirements` file e.g. `tableschema>=1.0,<2.0`.
+
+```bash
+$ pip install tableschema
+```
+
+## Documentation
+
+### Introduction
+
+Let's start with a simple example:
+
+```python
+from tableschema import Table
+
+# Create table
+table = Table('path.csv', schema='schema.json')
+
+# Print schema descriptor
+print(table.schema.descriptor)
+
+# Print cast rows in a dict form
+for keyed_row in table.iter(keyed=True):
+ print(keyed_row)
+```
+
+### Working with Table
+
+A table is a core concept in a tabular data world. It represents data with metadata (Table Schema). Let's see how we can use it in practice.
+
+Consider we have some local csv file. It could be inline data or from a remote link - all supported by the `Table` class (except local files for in-brower usage of course). But say it's `data.csv` for now:
+
+```csv
+city,location
+london,"51.50,-0.11"
+paris,"48.85,2.30"
+rome,N/A
+```
+
+Let's create and read a table instance. We use the static `Table.load` method and the `table.read` method with the `keyed` option to get an array of keyed rows:
+
+```python
+table = Table('data.csv')
+table.headers # ['city', 'location']
+table.read(keyed=True)
+# [
+# {city: 'london', location: '51.50,-0.11'},
+# {city: 'paris', location: '48.85,2.30'},
+# {city: 'rome', location: 'N/A'},
+# ]
+```
+
+As we can see, our locations are just strings. But they should be geopoints. Also, Rome's location is not available, but it's just a string `N/A` instead of `None`. First we have to infer Table Schema:
+
+```python
+table.infer()
+table.schema.descriptor
+# { fields:
+# [ { name: 'city', type: 'string', format: 'default' },
+# { name: 'location', type: 'geopoint', format: 'default' } ],
+# missingValues: [ '' ] }
+table.read(keyed=True)
+# Fails with a data validation error
+```
+
+Let's fix the "not available" location. There is a `missingValues` property in Table Schema specification. As a first try we set `missingValues` to `N/A` in `table.schema.descriptor`. The schema descriptor can be changed in-place, but all changes should also be committed using `table.schema.commit()`:
+
+```python
+table.schema.descriptor['missingValues'] = 'N/A'
+table.schema.commit()
+table.schema.valid # false
+table.schema.errors
+# [<ValidationError: "'N/A' is not of type 'array'">]
+```
+
+As a good citizens we've decided to check our schema descriptor's validity. And it's not valid! We should use an array for the `missingValues` property. Also, don't forget to include "empty string" as a valid missing value:
+
+```python
+table.schema.descriptor['missingValues'] = ['', 'N/A']
+table.schema.commit()
+table.schema.valid # true
+```
+
+All good. It looks like we're ready to read our data again:
+
+```python
+table.read(keyed=True)
+# [
+# {city: 'london', location: [51.50,-0.11]},
+# {city: 'paris', location: [48.85,2.30]},
+# {city: 'rome', location: null},
+# ]
+```
+
+Now we see that:
+- locations are arrays with numeric latitude and longitude
+- Rome's location is a native Python `None`
+
+And because there are no errors after reading, we can be sure that our data is valid against our schema. Let's save it:
+
+```python
+table.schema.save('schema.json')
+table.save('data.csv')
+```
+
+Our `data.csv` looks the same because it has been stringified back to `csv` format. But now we have `schema.json`:
+
+```json
+{
+ "fields": [
+ {
+ "name": "city",
+ "type": "string",
+ "format": "default"
+ },
+ {
+ "name": "location",
+ "type": "geopoint",
+ "format": "default"
+ }
+ ],
+ "missingValues": [
+ "",
+ "N/A"
+ ]
+}
+
+```
+
+If we decide to improve it even more we could update the schema file and then open it again. But now providing a schema path:
+
+```python
+table = Table('data.csv', schema='schema.json')
+# Continue the work
+```
+
+As already mentioned a given schema can be used to *validate* data (see the [Schema](#schema) section for schema specification details). In default mode invalid data rows immediately trigger an [exception](#exceptions) in the `table.iter()`/`table.write()` methods.
+
+Suppose this schema-invalid local file `invalid_data.csv`:
+```csv
+key,value
+zero,0
+one,not_an_integer
+two,2
+```
+
+We're going to validate the data against the following schema:
+```python
+table = Table(
+ 'invalid_data.csv',
+ schema={'fields': [{'name': 'key'}, {'name': 'value', 'type': 'integer'}]})
+```
+
+Iterating over the data triggers an exception due to the failed cast of `'not_an_integer'` to `int`:
+```python
+for row in table.iter():
+ print(row)
+
+# Traceback (most recent call last):
+# ...
+# tableschema.exceptions.CastError: There are 1 cast errors (see exception.errors) for row "3"
+```
+
+Hint: The row number count starts with 1 and also includes header lines.
+
+(Note: You can optionally switch off `iter()`/`read()` value casting using the cast parameter, see reference below.)
+
+By providing a custom exception handler (a callable) to those methods you can treat occurring exceptions at your own discretion, i.e. to "fail late" and e.g. gather a validation report on the whole data:
+
+```python
+errors = []
+def exc_handler(exc, row_number=None, row_data=None, error_data=None):
+ errors.append((exc, row_number, row_data, error_data))
+
+for row in table.iter(exc_handler=exc_handler):
+ print(row)
+
+# ['zero', 0]
+# ['one', FailedCast('not_an_integer')]
+# ['two', 2]
+
+print(errors)
+
+# [(CastError('There are 1 cast errors (see exception.errors) for row "3"',),
+# 3,
+# OrderedDict([('key', 'one'), ('value', 'not_an_integer')]),
+# OrderedDict([('value', 'not_an_integer')]))]
+```
+
+Note that
+
+- Data rows are yielded even though the data is schema-invalid; this is due to our custom expression handler choosing not to raise exceptions (but rather collect them in the errors list).
+- Data field values that can't get casted properly (if `iter()`/`read()` cast parameter is set to True, which is the default) are wrapped into a `FailedCast` "value holder". This allows for distinguishing uncasted values from successfully casted values on the data consumer side. `FailedCast` instances can only get yielded when custom exception handling is in place.
+- The custom exception handler callable must support a function signature as specified in the `iter()`/`read()` sections of the `Table` class API reference.
+
+### Working with Schema
+
+A model of a schema with helpful methods for working with the schema and supported data. Schema instances can be initialized with a schema source as a url to a JSON file or a JSON object. The schema is initially validated (see [validate](#validate) below). By default validation errors will be stored in `schema.errors` but in a strict mode it will be instantly raised.
+
+Let's create a blank schema. It's not valid because `descriptor.fields` property is required by the [Table Schema](http://specs.frictionlessdata.io/table-schema/) specification:
+
+```python
+schema = Schema()
+schema.valid # false
+schema.errors
+# [<ValidationError: "'fields' is a required property">]
+```
+
+To avoid creating a schema descriptor by hand we will use a `schema.infer` method to infer the descriptor from given data:
+
+```python
+schema.infer([
+ ['id', 'age', 'name'],
+ ['1','39','Paul'],
+ ['2','23','Jimmy'],
+ ['3','36','Jane'],
+ ['4','28','Judy'],
+])
+schema.valid # true
+schema.descriptor
+#{ fields:
+# [ { name: 'id', type: 'integer', format: 'default' },
+# { name: 'age', type: 'integer', format: 'default' },
+# { name: 'name', type: 'string', format: 'default' } ],
+# missingValues: [ '' ] }
+```
+
+Now we have an inferred schema and it's valid. We can cast data rows against our schema. We provide a string input which will be cast correspondingly:
+
+```python
+schema.cast_row(['5', '66', 'Sam'])
+# [ 5, 66, 'Sam' ]
+```
+
+But if we try provide some missing value to the `age` field, the cast will fail because the only valid "missing" value is an empty string. Let's update our schema:
+
+```python
+schema.cast_row(['6', 'N/A', 'Walt'])
+# Cast error
+schema.descriptor['missingValues'] = ['', 'N/A']
+schema.commit()
+schema.cast_row(['6', 'N/A', 'Walt'])
+# [ 6, None, 'Walt' ]
+```
+
+We can save the schema to a local file, and resume work on it at any time by loading it from that file:
+
+```python
+schema.save('schema.json')
+schema = Schema('schema.json')
+```
+
+### Working with Field
+
+```python
+from tableschema import Field
+
+# Init field
+field = Field({'name': 'name', 'type': 'number'})
+
+# Cast a value
+field.cast_value('12345') # -> 12345
+```
+
+Data values can be cast to native Python objects with a Field instance. Type instances can be initialized with [field descriptors](https://specs.frictionlessdata.io/table-schema/). This allows formats and constraints to be defined.
+
+Casting a value will check the value is of the expected type, is in the correct format, and complies with any constraints imposed by a schema. E.g. a date value (in ISO 8601 format) can be cast with a DateType instance. Values that can't be cast will raise an `InvalidCastError` exception.
+
+Casting a value that doesn't meet the constraints will raise a `ConstraintError` exception.
+
+## API Reference
+
+### `cli`
+```python
+cli()
+```
+Command-line interface
+
+```
+Usage: tableschema [OPTIONS] COMMAND [ARGS]...
+
+Options:
+ --help Show this message and exit.
+
+Commands:
+ infer Infer a schema from data.
+ info Return info on this version of Table Schema
+ validate Validate that a supposed schema is in fact a Table Schema.
+```
+
+
+### `Table`
+```python
+Table(self,
+ source,
+ schema=None,
+ strict=False,
+ post_cast=[],
+ storage=None,
+ **options)
+```
+Table representation
+
+__Arguments__
+- __source (str/list[])__: data source one of:
+ - local file (path)
+ - remote file (url)
+ - array of arrays representing the rows
+- __schema (any)__: data schema in all forms supported by `Schema` class
+- __strict (bool)__: strictness option to pass to `Schema` constructor
+- __post_cast (function[])__: list of post cast processors
+- __storage (None)__: storage name like `sql` or `bigquery`
+- __options (dict)__: `tabulator` or storage's options
+
+__Raises__
+- `TableSchemaException`: raises on any error
+
+
+
+#### `table.hash`
+Table's SHA256 hash if it's available.
+
+If it's already read using e.g. `table.read`, otherwise returns `None`.
+In the middle of an iteration it returns hash of already read contents
+
+__Returns__
+
+`str/None`: SHA256 hash
+
+
+
+#### `table.headers`
+Table's headers is available
+
+__Returns__
+
+`str[]`: headers
+
+
+
+#### `table.schema`
+Returns schema class instance if available
+
+__Returns__
+
+`Schema`: schema
+
+
+
+#### `table.size`
+Table's size in BYTES if it's available
+
+If it's already read using e.g. `table.read`, otherwise returns `None`.
+In the middle of an iteration it returns size of already read contents
+
+__Returns__
+
+`int/None`: size in BYTES
+
+
+
+#### `table.iter`
+```python
+table.iter(keyed=False,
+ extended=False,
+ cast=True,
+ integrity=False,
+ relations=False,
+ foreign_keys_values=False,
+ exc_handler=None)
+```
+Iterates through the table data and emits rows cast based on table schema.
+
+__Arguments__
+
+
+ keyed (bool):
+ yield keyed rows in a form of `{header1: value1, header2: value2}`
+ (default is false; the form of rows is `[value1, value2]`)
+
+ extended (bool):
+ yield extended rows in a for of `[rowNumber, [header1, header2], [value1, value2]]`
+ (default is false; the form of rows is `[value1, value2]`)
+
+ cast (bool):
+ disable data casting if false
+ (default is true)
+
+ integrity (dict):
+ dictionary in a form of `{'size': <bytes>, 'hash': '<sha256>'}`
+ to check integrity of the table when it's read completely.
+ Both keys are optional.
+
+ relations (dict):
+ dictionary of foreign key references in a form
+ of `{resource1: [{field1: value1, field2: value2}, ...], ...}`.
+ If provided, foreign key fields will checked and resolved
+ to one of their references (/!\ one-to-many fk are not completely resolved).
+
+ foreign_keys_values (dict):
+ three-level dictionary of foreign key references optimized
+ to speed up validation process in a form of
+ `{resource1: {(fk_field1, fk_field2): {(value1, value2): {one_keyedrow}, ... }}}`.
+ If not provided but relations is true, it will be created
+ before the validation process by *index_foreign_keys_values* method
+
+ exc_handler (func):
+ optional custom exception handler callable.
+ Can be used to defer raising errors (i.e. "fail late"), e.g.
+ for data validation purposes. Must support the signature below
+
+__Custom exception handler__
+
+
+```python
+def exc_handler(exc, row_number=None, row_data=None, error_data=None):
+ '''Custom exception handler (example)
+
+ # Arguments:
+ exc(Exception):
+ Deferred exception instance
+ row_number(int):
+ Data row number that triggers exception exc
+ row_data(OrderedDict):
+ Invalid data row source data
+ error_data(OrderedDict):
+ Data row source data field subset responsible for the error, if
+ applicable (e.g. invalid primary or foreign key fields). May be
+ identical to row_data.
+ '''
+ # ...
+```
+
+__Raises__
+- `TableSchemaException`: base class of any error
+- `CastError`: data cast error
+- `IntegrityError`: integrity checking error
+- `UniqueKeyError`: unique key constraint violation
+- `UnresolvedFKError`: unresolved foreign key reference error
+
+__Returns__
+
+`Iterator[list]`: yields rows
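+
+A short sketch (assuming a `table` instance as constructed above):
+
+```python
+# Iterate lazily over cast rows in keyed form
+for keyed_row in table.iter(keyed=True):
+    print(keyed_row)
+```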
+
+
+
+#### `table.read`
+```python
+table.read(keyed=False,
+ extended=False,
+ cast=True,
+ limit=None,
+ integrity=False,
+ relations=False,
+ foreign_keys_values=False,
+ exc_handler=None)
+```
+Read the whole table and return as array of rows
+
+> It has the same API as `table.iter`, except for the following additional argument:
+
+__Arguments__
+- __limit (int)__: limit count of rows to read and return
+
+__Returns__
+
+`list[]`: returns rows
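+
+For example (assuming a `table` instance):
+
+```python
+# Read at most 10 cast rows
+rows = table.read(limit=10)
+```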
+
+
+
+#### `table.infer`
+```python
+table.infer(limit=100,
+ confidence=0.75,
+ missing_values=[''],
+ guesser_cls=None,
+ resolver_cls=None)
+```
+Infer a schema for the table.
+
+It will infer and set Table Schema to `table.schema` based on table data.
+
+__Arguments__
+- __limit (int)__: limit rows sample size
+- __confidence (float)__: how many casting errors are allowed (as a ratio, between 0 and 1)
+- __missing_values (str[])__: list of missing values (by default `['']`)
+- __guesser_cls (class)__: you can implement inferring strategies by
+ providing type-guessing and type-resolving classes [experimental]
+- __resolver_cls (class)__: you can implement inferring strategies by
+ providing type-guessing and type-resolving classes [experimental]
+
+__Returns__
+
+`dict`: Table Schema descriptor
+
+
+
+#### `table.save`
+```python
+table.save(target, storage=None, **options)
+```
+Save the data source to a local file in CSV format with a `,` (comma) delimiter
+
+> To save schema use `table.schema.save()`
+
+__Arguments__
+- __target (str)__: saving target (e.g. file path)
+- __storage (None/str)__: storage name like `sql` or `bigquery`
+- __options (dict)__: `tabulator` or storage options
+
+__Raises__
+- `TableSchemaException`: raises an error if there is a saving problem
+
+__Returns__
+
+`True/Storage`: returns true or storage instance
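+
+A minimal sketch (`output.csv` is an assumed target path):
+
+```python
+# Write the table's data to a local CSV file
+table.save('output.csv')
+```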
+
+
+
+#### `table.index_foreign_keys_values`
+```python
+table.index_foreign_keys_values(relations)
+```
+Creates a three-level dictionary of foreign key references
+
+The index is optimized to speed up the validation process and has the form
+`{resource1: {(fk_field1, fk_field2): {(value1, value2): {one_keyedrow}, ... }}}`.
+
+For each foreign key of the schema it will iterate through the corresponding
+`relations['resource']` to create an index (i.e. a dict) of existing values
+for the foreign fields and store one keyed row for each value combination.
+
+The optimization relies on the indexation of possible values for one foreign key
+in a hashmap to later speed up resolution.
+
+This method is public to allow creating the index once to apply it
+on multiple tables sharing the same schema
+(typically [grouped resources in datapackage](https://github.com/frictionlessdata/datapackage-py#group))
+
+__Notes__
+
+- the second key of the output is a tuple of the foreign fields,
+ a proxy identifier of the foreign key
+- the same relation resource can be indexed multiple times
+ as a schema can contain more than one foreign key
+ pointing to the same resource
+
+__Arguments__
+- __relations (dict)__:
+ dict of foreign key references in a form of
+ `{resource1: [{field1: value1, field2: value2}, ...], ...}`.
+ It must contain all resources referenced in the schema's foreign keys definition.
+
+__Returns__
+
+`dict`:
+ returns a three-level dictionary of foreign key references
+ optimized to speed up validation process in a form of
+ `{resource1: {(fk_field1, fk_field2): {(value1, value2): {one_keyedrow}, ... }}}`
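+
+A hedged sketch (the `people` resource name and rows are assumed to match the schema's foreign keys definition):
+
+```python
+relations = {'people': [{'id': 1, 'name': 'alice'}, {'id': 2, 'name': 'bob'}]}
+# Build the index once, then reuse it when reading multiple tables
+index = table.index_foreign_keys_values(relations)
+rows = table.read(relations=relations, foreign_keys_values=index)
+```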
+
+
+### `Schema`
+```python
+Schema(self, descriptor={}, strict=False)
+```
+Schema representation
+
+__Arguments__
+- __descriptor (str/dict)__: schema descriptor one of:
+ - local path
+ - remote url
+ - dictionary
+- __strict (bool)__: flag to specify validation behaviour:
+ - if false, errors will not be raised but instead collected in `schema.errors`
+ - if true, validation errors are raised immediately
+
+__Raises__
+- `TableSchemaException`: raise any error that occurs during the process
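+
+A minimal usage sketch:
+
+```python
+from tableschema import Schema
+
+# Create a schema from an in-memory descriptor
+schema = Schema({'fields': [{'name': 'id', 'type': 'integer'}]})
+```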
+
+
+
+#### `schema.descriptor`
+Schema's descriptor
+
+__Returns__
+
+`dict`: descriptor
+
+
+
+#### `schema.errors`
+Validation errors
+
+Always empty in strict mode.
+
+__Returns__
+
+`Exception[]`: validation errors
+
+
+
+#### `schema.field_names`
+Schema's field names
+
+__Returns__
+
+`str[]`: an array of field names
+
+
+
+#### `schema.fields`
+Schema's fields
+
+__Returns__
+
+`Field[]`: an array of field instances
+
+
+
+#### `schema.foreign_keys`
+Schema's foreign keys
+
+__Returns__
+
+`dict[]`: foreign keys
+
+
+
+#### `schema.headers`
+Schema's field names
+
+__Returns__
+
+`str[]`: an array of field names
+
+
+
+#### `schema.missing_values`
+Schema's missing values
+
+__Returns__
+
+`str[]`: missing values
+
+
+
+#### `schema.primary_key`
+Schema's primary keys
+
+__Returns__
+
+`str[]`: primary keys
+
+
+
+#### `schema.valid`
+Validation status
+
+Always true in strict mode.
+
+__Returns__
+
+`bool`: validation status
+
+
+
+#### `schema.get_field`
+```python
+schema.get_field(name)
+```
+Get schema's field by name.
+
+> Use `schema.update_field` if you want to modify the field descriptor
+
+__Arguments__
+- __name (str)__: schema field name
+
+__Returns__
+
+`Field/None`: `Field` instance or `None` if not found
+
+
+
+#### `schema.add_field`
+```python
+schema.add_field(descriptor)
+```
+Add new field to schema.
+
+The schema descriptor will be validated with the newly added field descriptor.
+
+__Arguments__
+- __descriptor (dict)__: field descriptor
+
+__Raises__
+- `TableSchemaException`: raises any error that occurs during the process
+
+__Returns__
+
+`Field/None`: added `Field` instance or `None` if not added
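+
+For example (continuing the `schema` sketch above):
+
+```python
+# Append a new string field to the schema
+schema.add_field({'name': 'email', 'type': 'string'})
+```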
+
+
+
+#### `schema.update_field`
+```python
+schema.update_field(name, update)
+```
+Update an existing field descriptor by name
+
+__Arguments__
+- __name (str)__: schema field name
+- __update (dict)__: update to apply to field's descriptor
+
+__Returns__
+
+`bool`: true on success and false if no field is found to be modified
+
+
+
+#### `schema.remove_field`
+```python
+schema.remove_field(name)
+```
+Remove a field from the schema by name.
+
+The schema descriptor will be validated after field descriptor removal.
+
+__Arguments__
+- __name (str)__: schema field name
+
+__Raises__
+- `TableSchemaException`: raises any error that occurs during the process
+
+__Returns__
+
+`Field/None`: removed `Field` instance or `None` if not found
+
+
+
+#### `schema.cast_row`
+```python
+schema.cast_row(row, fail_fast=False, row_number=None, exc_handler=None)
+```
+Cast row based on field types and formats.
+
+__Arguments__
+- __row (any[])__: data row as an array of values
+
+__Returns__
+
+`any[]`: returns cast data row
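+
+For example (continuing the sketch above, where the schema has an integer `id` field and a string `email` field):
+
+```python
+schema.cast_row(['1', 'john@example.com'])
+# [1, 'john@example.com']
+```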
+
+
+
+#### `schema.infer`
+```python
+schema.infer(rows,
+ headers=1,
+ confidence=0.75,
+ guesser_cls=None,
+ resolver_cls=None)
+```
+Infer and set `schema.descriptor` based on data sample.
+
+__Arguments__
+- __rows (list[])__: array of arrays representing rows.
+- __headers (int/str[])__: data sample headers (one of):
+ - row number containing headers (`rows` should contain headers rows)
+ - array of headers (`rows` should NOT contain headers rows)
+- __confidence (float)__: how many casting errors are allowed (as a ratio, between 0 and 1)
+- __guesser_cls (class)__: you can implement inferring strategies by
+ providing type-guessing and type-resolving classes [experimental]
+- __resolver_cls (class)__: you can implement inferring strategies by
+ providing type-guessing and type-resolving classes [experimental]
+
+__Returns__
+
+`dict`: Table Schema descriptor
+
+
+
+#### `schema.commit`
+```python
+schema.commit(strict=None)
+```
+Update schema instance if there are in-place changes in the descriptor.
+
+__Example__
+
+
+```python
+from tableschema import Schema
+descriptor = {'fields': [{'name': 'my_field', 'title': 'My Field', 'type': 'string'}]}
+schema = Schema(descriptor)
+print(schema.get_field('my_field').descriptor['type']) # string
+
+# Update descriptor by field position
+schema.descriptor['fields'][0]['type'] = 'number'
+# Update descriptor by field name
+schema.update_field('my_field', {'title': 'My Pretty Field'}) # True
+
+# Changes are not committed yet
+print(schema.get_field('my_field').descriptor['type']) # string
+print(schema.get_field('my_field').descriptor['title']) # My Field
+
+# Commit change
+schema.commit()
+print(schema.get_field('my_field').descriptor['type']) # number
+print(schema.get_field('my_field').descriptor['title']) # My Pretty Field
+
+```
+
+__Arguments__
+- __strict (bool)__: alter `strict` mode for further work
+
+__Raises__
+- `TableSchemaException`: raises any error that occurs during the process
+
+__Returns__
+
+`bool`: true on success and false if not modified
+
+
+
+#### `schema.save`
+```python
+schema.save(target, ensure_ascii=True)
+```
+Save schema descriptor to target destination.
+
+__Arguments__
+- __target (str)__: path where to save a descriptor
+- __ensure_ascii (bool)__: if true, non-ASCII characters are escaped in the saved JSON (passed through to the JSON encoder)
+
+__Raises__
+- `TableSchemaException`: raises any error that occurs during the process
+
+__Returns__
+
+`bool`: true on success
+
+
+
+### `Field`
+```python
+Field(self, descriptor, missing_values=[''], schema=None)
+```
+Field representation
+
+__Arguments__
+- __descriptor (dict)__: schema field descriptor
+- __missing_values (str[])__: an array of strings representing missing values
+- __schema (Schema)__: the schema this field belongs to, if any
+
+__Raises__
+- `TableSchemaException`: raises any error that occurs during the process
+
+
+
+#### `field.constraints`
+Field constraints
+
+__Returns__
+
+`dict`: dict of field constraints
+
+
+
+#### `field.descriptor`
+Field's descriptor
+
+__Returns__
+
+`dict`: descriptor
+
+
+
+#### `field.format`
+Field format
+
+__Returns__
+
+`str`: field format
+
+
+
+#### `field.missing_values`
+Field's missing values
+
+__Returns__
+
+`str[]`: missing values
+
+
+
+#### `field.name`
+Field name
+
+__Returns__
+
+`str`: field name
+
+
+
+#### `field.required`
+Whether field is required
+
+__Returns__
+
+`bool`: true if required
+
+
+
+#### `field.schema`
+Returns a schema instance if the field belongs to some schema
+
+__Returns__
+
+`Schema`: field's schema
+
+
+
+#### `field.type`
+Field type
+
+__Returns__
+
+`str`: field type
+
+
+
+#### `field.cast_value`
+```python
+field.cast_value(value, constraints=True)
+```
+Cast given value according to the field type and format.
+
+__Arguments__
+- __value (any)__: value to cast against field
+- __constraints (bool/str[])__: constraints configuration
+ - it can be set to `False` to disable constraint checks
+ - it can be an array of constraints to check e.g. `['minimum', 'maximum']`
+
+__Raises__
+- `TableSchemaException`: raises any error that occurs during the process
+
+__Returns__
+
+`any`: returns cast value
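+
+A short sketch (the `age` field descriptor is illustrative):
+
+```python
+from tableschema import Field
+
+field = Field({'name': 'age', 'type': 'integer', 'constraints': {'minimum': 18}})
+field.cast_value('42')                     # -> 42
+field.cast_value('42', constraints=False)  # -> 42, with constraint checks disabled
+```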
+
+
+
+#### `field.test_value`
+```python
+field.test_value(value, constraints=True)
+```
+Test whether a value is compliant with the field.
+
+__Arguments__
+- __value (any)__: value to cast against field
+- __constraints (bool/str[])__: constraints configuration
+
+__Returns__
+
+`bool`: whether the value is compliant with the field
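+
+Continuing the `field` sketch from `field.cast_value` above:
+
+```python
+field.test_value('21')   # True
+field.test_value('abc')  # False: not an integer
+field.test_value('16')   # False: violates the minimum constraint
+```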
+
+
+### `Storage`
+```python
+Storage(self, **options)
+```
+Storage factory/interface
+
+__For users__
+
+
+> Use `Storage.connect` to instantiate a storage
+
+For instantiation of concrete storage instances,
+`tableschema.Storage` provides a unified factory method `connect`
+(which uses the plugin system under the hood):
+
+```python
+# pip install tableschema_sql
+from tableschema import Storage
+
+storage = Storage.connect('sql', **options)
+storage.create('bucket', descriptor)
+storage.write('bucket', rows)
+storage.read('bucket')
+```
+
+__For integrators__
+
+
+The library includes an interface declaration for implementing tabular `Storage`.
+This interface allows different data storage systems like SQL to be used
+with the `tableschema.Table` class (load/save) as well as on the data package level:
+
+![Storage](https://raw.githubusercontent.com/frictionlessdata/tableschema-py/master/data/storage.png)
+
+An implementor must follow the `tableschema.Storage` interface
+to write their own storage backend. Concrete storage backends
+may include additional functionality specific to the concrete storage system.
+See `plugins` below to learn how to integrate a custom storage plugin into your workflow.
+
+
+
+#### `storage.buckets`
+Return list of storage bucket names.
+
+A `bucket` is a special term with almost the same meaning as `table`.
+Think of a `bucket` as a `table` stored in the `storage`.
+
+__Raises__
+- `exceptions.StorageError`: raises on any error
+
+__Returns__
+
+`str[]`: return list of bucket names
+
+
+
+#### `storage.connect`
+```python
+storage.connect(name, **options)
+```
+Create tabular `storage` based on storage name.
+
+> This method is static: `Storage.connect()`
+
+__Arguments__
+- __name (str)__: storage name like `sql`
+- __options (dict)__: concrete storage options
+
+__Raises__
+- `StorageError`: raises on any error
+
+__Returns__
+
+`Storage`: returns `Storage` instance
+
+
+
+#### `storage.create`
+```python
+storage.create(bucket, descriptor, force=False)
+```
+Create one/multiple buckets.
+
+__Arguments__
+- __bucket (str/list)__: bucket name or list of bucket names
+- __descriptor (dict/dict[])__: schema descriptor or list of descriptors
+- __force (bool)__: whether to delete and re-create already existing buckets
+
+__Raises__
+- `exceptions.StorageError`: raises on any error
+
+
+
+#### `storage.delete`
+```python
+storage.delete(bucket=None, ignore=False)
+```
+Delete one/multiple/all buckets.
+
+__Arguments__
+- __bucket (str/list/None)__: bucket name or list of bucket names to delete.
+ If `None`, all buckets will be deleted
+- __ignore (bool)__: don't raise an error on non-existent bucket deletion
+
+__Raises__
+- `exceptions.StorageError`: raises on any error
+
+
+
+#### `storage.describe`
+```python
+storage.describe(bucket, descriptor=None)
+```
+Get/set bucket's Table Schema descriptor
+
+__Arguments__
+- __bucket (str)__: bucket name
+- __descriptor (dict/None)__: schema descriptor to set
+
+__Raises__
+- `exceptions.StorageError`: raises on any error
+
+__Returns__
+
+`dict`: returns Table Schema descriptor
+
+
+
+#### `storage.iter`
+```python
+storage.iter(bucket)
+```
+Return an iterator of typed values based on the schema of this bucket.
+
+__Arguments__
+- __bucket (str)__: bucket name
+
+__Raises__
+- `exceptions.StorageError`: raises on any error
+
+__Returns__
+
+`list[]`: yields data rows
+
+
+
+#### `storage.read`
+```python
+storage.read(bucket)
+```
+Read typed values based on the schema of this bucket.
+
+__Arguments__
+- __bucket (str)__: bucket name
+
+__Raises__
+- `exceptions.StorageError`: raises on any error
+
+__Returns__
+
+`list[]`: returns data rows
+
+
+
+#### `storage.write`
+```python
+storage.write(bucket, rows)
+```
+This method writes data rows into `storage`.
+
+It should store values of unsupported types as strings internally (like csv does).
+
+__Arguments__
+- __bucket (str)__: bucket name
+- __rows (list[])__: data rows to write
+
+__Raises__
+- `exceptions.StorageError`: raises on any error
+
+
+### `validate`
+```python
+validate(descriptor)
+```
+Validate a Table Schema descriptor.
+
+__Arguments__
+- __descriptor (dict)__: schema descriptor
+
+__Raises__
+- `ValidationError`: on validation errors
+
+__Returns__
+
+`bool`: returns `True` if the descriptor is valid
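+
+A minimal usage sketch (the inline descriptor is illustrative):
+
+```python
+from tableschema import validate, exceptions
+
+try:
+    valid = validate({'fields': [{'name': 'id'}]})
+except exceptions.ValidationError as exception:
+    for error in exception.errors:
+        print(error)  # handle individual errors
+```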
+
+
+### `infer`
+```python
+infer(source,
+ headers=1,
+ limit=100,
+ confidence=0.75,
+ missing_values=[''],
+ guesser_cls=None,
+ resolver_cls=None,
+ **options)
+```
+Infer source schema.
+
+__Arguments__
+- __source (any)__: source as path, url or inline data
+- __headers (int/str[])__: headers rows number or headers list
+- __limit (int)__: limit rows sample size
+- __confidence (float)__: how many casting errors are allowed (as a ratio, between 0 and 1)
+- __missing_values (str[])__: list of missing values (by default `['']`)
+- __guesser_cls (class)__: you can implement inferring strategies by
+ providing type-guessing and type-resolving classes [experimental]
+- __resolver_cls (class)__: you can implement inferring strategies by
+ providing type-guessing and type-resolving classes [experimental]
+
+__Raises__
+- `TableSchemaException`: raises any error that occurs during the process
+
+__Returns__
+
+`dict`: returns schema descriptor
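+
+A minimal usage sketch (`data.csv` is an assumed local file):
+
+```python
+from tableschema import infer
+
+descriptor = infer('data.csv', limit=500)
+```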
+
+
+### `FailedCast`
+```python
+FailedCast(self, value)
+```
+Wrap an original data field value that failed to be properly cast.
+
+FailedCast allows further processing/yielding of values while still being able
+to distinguish uncast values on the consuming side.
+
+Delegates attribute access and the basic rich comparison methods to the
+underlying object. Supports the default hashability of user-defined classes, i.e.
+it is hashable based on object identity (not on the wrapped value).
+
+__Arguments__
+- __value (any)__: value
+
+
+### `DataPackageException`
+```python
+DataPackageException(self, message, errors=[])
+```
+Base class for all DataPackage/TableSchema exceptions.
+
+If there are multiple errors, they can be read from the exception object:
+
+```python
+from tableschema.exceptions import DataPackageException
+
+try:
+    pass  # some library action that may raise
+except DataPackageException as exception:
+    if exception.multiple:
+        for error in exception.errors:
+            pass  # handle each nested error
+```
+
+
+
+#### `datapackageexception.errors`
+List of nested errors
+
+__Returns__
+
+`DataPackageException[]`: list of nested errors
+
+
+
+#### `datapackageexception.multiple`
+Whether it's a nested exception
+
+__Returns__
+
+`bool`: whether it's a nested exception
+
+
+
+### `TableSchemaException`
+```python
+TableSchemaException(self, message, errors=[])
+```
+Base class for all TableSchema exceptions.
+
+
+### `LoadError`
+```python
+LoadError(self, message, errors=[])
+```
+All loading errors.
+
+
+### `ValidationError`
+```python
+ValidationError(self, message, errors=[])
+```
+All validation errors.
+
+
+### `CastError`
+```python
+CastError(self, message, errors=[])
+```
+All value cast errors.
+
+
+### `IntegrityError`
+```python
+IntegrityError(self, message, errors=[])
+```
+All integrity errors.
+
+
+### `UniqueKeyError`
+```python
+UniqueKeyError(self, message, errors=[])
+```
+Unique key constraint violation (CastError subclass).
+
+
+### `RelationError`
+```python
+RelationError(self, message, errors=[])
+```
+All relations errors.
+
+
+### `UnresolvedFKError`
+```python
+UnresolvedFKError(self, message, errors=[])
+```
+Unresolved foreign key reference error (RelationError subclass).
+
+
+### `StorageError`
+```python
+StorageError(self, message, errors=[])
+```
+All storage errors.
+
+## Experimental
+
+> This API is experimental and can be changed/removed in the future
+
+There is an experimental environment variable `TABLESCHEMA_PRESERVE_MISSING_VALUES` which, if it is set, affects how data casting works.
+
+By default, missing values are resolved to `None` values. When this flag is set, missing values are passed through as is. For example:
+
+> missing_values.py
+
+```python
+from tableschema import Field
+
+field = Field({'type': 'number'}, missing_values=['-'])
+print(field.cast_value('3'))
+print(field.cast_value('-'))
+```
+
+Running this script in different modes:
+
+```bash
+$ python missing_values.py
+3
+None
+$ TABLESCHEMA_PRESERVE_MISSING_VALUES=1 python missing_values.py
+3
+-
+```
+
+The flag affects all of the library's APIs and software built on top of `tableschema`. For example, Data Package Pipelines:
+
+```bash
+$ TABLESCHEMA_PRESERVE_MISSING_VALUES=1 dpp run ./my_pipeline
+```
+
+## Contributing
+
+> The project follows the [Open Knowledge International coding standards](https://github.com/okfn/coding-standards).
+
+The recommended way to get started is to create and activate a project virtual environment.
+To install the package and development dependencies into the active environment:
+
+```bash
+$ make install
+```
+
+To run tests with linting and coverage:
+
+```bash
+$ make test
+```
+
+## Changelog
+
+Only breaking and the most important changes are described here. The full changelog and documentation for all released versions can be found in the nicely formatted [commit history](https://github.com/frictionlessdata/tableschema-py/commits/master).
+
+#### v1.20
+
+- Added --json flag to the CLI (#287)
+
+#### v1.19
+
+- Deduplicate field names if guessing in infer
+
+#### v1.18
+
+- Publish `field.ERROR/cast_function/check_functions`
+
+#### v1.17
+
+- Added `schema.missing_values` and `field.missing_values`
+
+#### v1.16
+
+- Fixed the way we parse `geopoint`:
+ - as a string it can be in 3 forms ("default", "array", "object") BUT
+ - as native object it can only be a list/tuple
+
+#### v1.15
+
+- Added an experimental `TABLESCHEMA_PRESERVE_MISSING_VALUES` environment variable flag
+
+#### v1.14
+
+- Allow providing custom guesser and resolver to `table.infer` and `infer`
+
+#### v1.13
+
+- Added `missing_values` argument to the `infer` function (#269)
+
+#### v1.12
+
+- Support optional custom exception handling for table.iter/read (#259)
+
+#### v1.11
+
+- Added `preserve_missing_values` parameter to `field.cast_value`
+
+#### v1.10
+
+- Added an ability to check table's integrity while reading
+
+#### v1.9
+
+- Implemented the `table.size` and `table.hash` properties
+
+#### v1.8
+
+- Added `table.index_foreign_keys_values` and improved foreign key checks performance
+
+#### v1.7
+
+- Added `field.schema` property
+
+#### v1.6
+
+- In `strict` mode raise an exception if there are problems in field construction
+
+#### v1.5
+
+- Allow providing custom guesser and resolver to schema infer
+
+#### v1.4
+
+- Added `schema.update_field` method
+
+#### v1.3
+
+- Support datetime with no time for date casting
+
+#### v1.2
+
+- Support floats like 1.0 for integer casting
+
+#### v1.1
+
+- Added the `confidence` parameter to `infer`
+
+#### v1.0
+
+- The library has been rebased on the Frictionless Data specs v1 - https://frictionlessdata.io/specs/table-schema/
+
+
+
+%package help
+Summary: Development documents and examples for tableschema
+Provides: python3-tableschema-doc
+%description help
+# tableschema-py
+
+[![Travis](https://travis-ci.org/frictionlessdata/tableschema-py.svg?branch=master)](https://travis-ci.org/frictionlessdata/tableschema-py)
+[![Coveralls](http://img.shields.io/coveralls/frictionlessdata/tableschema-py.svg?branch=master)](https://coveralls.io/r/frictionlessdata/tableschema-py?branch=master)
+[![PyPi](https://img.shields.io/pypi/v/tableschema.svg)](https://pypi.python.org/pypi/tableschema)
+[![Github](https://img.shields.io/badge/github-master-brightgreen)](https://github.com/frictionlessdata/tableschema-py)
+[![Gitter](https://img.shields.io/gitter/room/frictionlessdata/chat.svg)](https://gitter.im/frictionlessdata/chat)
+
+A Python implementation of the [Table Schema](http://specs.frictionlessdata.io/table-schema/) standard.
+
+> **[Important Notice]** We have released [Frictionless Framework](https://github.com/frictionlessdata/frictionless-py). This framework provides improved `tableschema` functionality extended to be a complete data solution. The change is not breaking for existing software, so no action is required. Please read the [Migration Guide](https://framework.frictionlessdata.io/docs/development/migration) from `tableschema` to Frictionless Framework.
+> - we continue to bug-fix `tableschema@1.x` in this [repository](https://github.com/frictionlessdata/tableschema-py), and it remains available on [PyPi](https://pypi.org/project/tableschema/) as before
+> - please note that the `frictionless@3.x` API, which we're working on at the moment, is not stable
+> - we will release `frictionless@4.x` by the end of 2020 to be the first SemVer/stable version
+
+## Features
+
+- `Table` to work with data tables described by Table Schema
+- `Schema` representing Table Schema
+- `Field` representing Table Schema field
+- `validate` to validate Table Schema
+- `infer` to infer Table Schema from data
+- built-in command-line interface to validate and infer schemas
+- storage/plugins system to connect tables to different storage backends like SQL databases
+
+## Contents
+
+<!--TOC-->
+
+ - [Getting Started](#getting-started)
+ - [Installation](#installation)
+ - [Documentation](#documentation)
+ - [Introduction](#introduction)
+ - [Working with Table](#working-with-table)
+ - [Working with Schema](#working-with-schema)
+ - [Working with Field](#working-with-field)
+ - [API Reference](#api-reference)
+ - [`cli`](#cli)
+ - [`Table`](#table)
+ - [`Schema`](#schema)
+ - [`Field`](#field)
+ - [`Storage`](#storage)
+ - [`validate`](#validate)
+ - [`infer`](#infer)
+ - [`FailedCast`](#failedcast)
+ - [`DataPackageException`](#datapackageexception)
+ - [`TableSchemaException`](#tableschemaexception)
+ - [`LoadError`](#loaderror)
+ - [`ValidationError`](#validationerror)
+ - [`CastError`](#casterror)
+ - [`IntegrityError`](#integrityerror)
+ - [`UniqueKeyError`](#uniquekeyerror)
+ - [`RelationError`](#relationerror)
+ - [`UnresolvedFKError`](#unresolvedfkerror)
+ - [`StorageError`](#storageerror)
+ - [Experimental](#experimental)
+ - [Contributing](#contributing)
+ - [Changelog](#changelog)
+
+<!--TOC-->
+
+## Getting Started
+
+### Installation
+
+The package uses semantic versioning, which means that major versions could include breaking changes. It's highly recommended to specify a `tableschema` version range in your `setup/requirements` file, e.g. `tableschema>=1.0,<2.0`.
+
+```bash
+$ pip install tableschema
+```
+
+## Documentation
+
+### Introduction
+
+Let's start with a simple example:
+
+```python
+from tableschema import Table
+
+# Create table
+table = Table('path.csv', schema='schema.json')
+
+# Print schema descriptor
+print(table.schema.descriptor)
+
+# Print cast rows in a dict form
+for keyed_row in table.iter(keyed=True):
+ print(keyed_row)
+```
+
+### Working with Table
+
+A table is a core concept in the tabular data world. It represents data with metadata (Table Schema). Let's see how we can use it in practice.
+
+Suppose we have a local CSV file. It could also be inline data or a remote link - all supported by the `Table` class. Say it's `data.csv` for now:
+
+```csv
+city,location
+london,"51.50,-0.11"
+paris,"48.85,2.30"
+rome,N/A
+```
+
+Let's create and read a table instance. We use the `Table` constructor and the `table.read` method with the `keyed` option to get an array of keyed rows:
+
+```python
+table = Table('data.csv')
+table.headers # ['city', 'location']
+table.read(keyed=True)
+# [
+# {city: 'london', location: '51.50,-0.11'},
+# {city: 'paris', location: '48.85,2.30'},
+# {city: 'rome', location: 'N/A'},
+# ]
+```
+
+As we can see, our locations are just strings, but they should be geopoints. Also, Rome's location is not available: it's the string `N/A` instead of `None`. First we have to infer the Table Schema:
+
+```python
+table.infer()
+table.schema.descriptor
+# { fields:
+# [ { name: 'city', type: 'string', format: 'default' },
+# { name: 'location', type: 'geopoint', format: 'default' } ],
+# missingValues: [ '' ] }
+table.read(keyed=True)
+# Fails with a data validation error
+```
+
+Let's fix the "not available" location. There is a `missingValues` property in the Table Schema specification. As a first try, we set `missingValues` to `N/A` in `table.schema.descriptor`. The schema descriptor can be changed in-place, but all changes must also be committed using `table.schema.commit()`:
+
+```python
+table.schema.descriptor['missingValues'] = 'N/A'
+table.schema.commit()
+table.schema.valid # false
+table.schema.errors
+# [<ValidationError: "'N/A' is not of type 'array'">]
+```
+
+As good citizens, we've decided to check our schema descriptor's validity. And it's not valid! We should use an array for the `missingValues` property. Also, don't forget to include the empty string as a valid missing value:
+
+```python
+table.schema.descriptor['missingValues'] = ['', 'N/A']
+table.schema.commit()
+table.schema.valid # true
+```
+
+All good. It looks like we're ready to read our data again:
+
+```python
+table.read(keyed=True)
+# [
+# {city: 'london', location: [51.50,-0.11]},
+# {city: 'paris', location: [48.85,2.30]},
+# {city: 'rome', location: null},
+# ]
+```
+
+Now we see that:
+- locations are arrays with numeric latitude and longitude
+- Rome's location is a native Python `None`
+
+And because there are no errors after reading, we can be sure that our data is valid against our schema. Let's save it:
+
+```python
+table.schema.save('schema.json')
+table.save('data.csv')
+```
+
+Our `data.csv` looks the same because it has been stringified back to `csv` format. But now we have `schema.json`:
+
+```json
+{
+ "fields": [
+ {
+ "name": "city",
+ "type": "string",
+ "format": "default"
+ },
+ {
+ "name": "location",
+ "type": "geopoint",
+ "format": "default"
+ }
+ ],
+ "missingValues": [
+ "",
+ "N/A"
+ ]
+}
+
+```
+
+If we decide to improve it even more, we can update the schema file and then open the table again, this time providing the schema path:
+
+```python
+table = Table('data.csv', schema='schema.json')
+# Continue the work
+```
+
+As already mentioned, a given schema can be used to *validate* data (see the [Schema](#schema) section for schema specification details). In the default mode, invalid data rows immediately trigger an [exception](#exceptions) in the `table.iter()`/`table.read()` methods.
+
+Suppose this schema-invalid local file `invalid_data.csv`:
+```csv
+key,value
+zero,0
+one,not_an_integer
+two,2
+```
+
+We're going to validate the data against the following schema:
+```python
+table = Table(
+ 'invalid_data.csv',
+ schema={'fields': [{'name': 'key'}, {'name': 'value', 'type': 'integer'}]})
+```
+
+Iterating over the data triggers an exception due to the failed cast of `'not_an_integer'` to `int`:
+```python
+for row in table.iter():
+ print(row)
+
+# Traceback (most recent call last):
+# ...
+# tableschema.exceptions.CastError: There are 1 cast errors (see exception.errors) for row "3"
+```
+
+Hint: The row number count starts with 1 and also includes header lines.
+
+(Note: You can optionally switch off `iter()`/`read()` value casting using the cast parameter, see reference below.)
+
+By providing a custom exception handler (a callable) to those methods, you can handle exceptions at your own discretion, i.e. "fail late" and e.g. gather a validation report on the whole data:
+
+```python
+errors = []
+def exc_handler(exc, row_number=None, row_data=None, error_data=None):
+ errors.append((exc, row_number, row_data, error_data))
+
+for row in table.iter(exc_handler=exc_handler):
+ print(row)
+
+# ['zero', 0]
+# ['one', FailedCast('not_an_integer')]
+# ['two', 2]
+
+print(errors)
+
+# [(CastError('There are 1 cast errors (see exception.errors) for row "3"',),
+# 3,
+# OrderedDict([('key', 'one'), ('value', 'not_an_integer')]),
+# OrderedDict([('value', 'not_an_integer')]))]
+```
+
+Note that
+
+- Data rows are yielded even though the data is schema-invalid; this is because our custom exception handler chooses not to raise exceptions (but rather collects them in the errors list).
+- Data field values that can't be cast properly (if the `iter()`/`read()` `cast` parameter is set to True, which is the default) are wrapped in a `FailedCast` "value holder". This allows uncast values to be distinguished from successfully cast values on the data consumer side. `FailedCast` instances can only be yielded when custom exception handling is in place.
+- The custom exception handler callable must support a function signature as specified in the `iter()`/`read()` sections of the `Table` class API reference.
+
+### Working with Schema
+
+A model of a schema with helpful methods for working with the schema and supported data. Schema instances can be initialized with a schema source: a url to a JSON file or a JSON object. The schema is validated on instantiation (see [validate](#validate) below). By default validation errors are stored in `schema.errors`, but in strict mode they are raised immediately.
+
+Let's create a blank schema. It's not valid because the `fields` property is required by the [Table Schema](http://specs.frictionlessdata.io/table-schema/) specification:
+
+```python
+from tableschema import Schema
+
+schema = Schema()
+schema.valid # false
+schema.errors
+# [<ValidationError: "'fields' is a required property">]
+```
+
+To avoid creating a schema descriptor by hand, we will use the `schema.infer` method to infer the descriptor from the given data:
+
+```python
+schema.infer([
+ ['id', 'age', 'name'],
+ ['1','39','Paul'],
+ ['2','23','Jimmy'],
+ ['3','36','Jane'],
+ ['4','28','Judy'],
+])
+schema.valid # true
+schema.descriptor
+#{ fields:
+# [ { name: 'id', type: 'integer', format: 'default' },
+# { name: 'age', type: 'integer', format: 'default' },
+# { name: 'name', type: 'string', format: 'default' } ],
+# missingValues: [ '' ] }
+```
+
+Now we have an inferred schema and it's valid. We can cast data rows against our schema. We provide a string input which will be cast correspondingly:
+
+```python
+schema.cast_row(['5', '66', 'Sam'])
+# [ 5, 66, 'Sam' ]
+```
+
+But if we try to provide a missing value for the `age` field, the cast will fail because the only valid "missing" value so far is an empty string. Let's update our schema:
+
+```python
+schema.cast_row(['6', 'N/A', 'Walt'])
+# Cast error
+schema.descriptor['missingValues'] = ['', 'N/A']
+schema.commit()
+schema.cast_row(['6', 'N/A', 'Walt'])
+# [ 6, None, 'Walt' ]
+```
+
+We can save the schema to a local file, and resume work on it at any time by loading it from that file:
+
+```python
+schema.save('schema.json')
+schema = Schema('schema.json')
+```
+
+### Working with Field
+
+```python
+from tableschema import Field
+
+# Init field
+field = Field({'name': 'name', 'type': 'number'})
+
+# Cast a value
+field.cast_value('12345') # -> 12345
+```
+
+Data values can be cast to native Python objects with a `Field` instance. Field instances can be initialized with [field descriptors](https://specs.frictionlessdata.io/table-schema/). This allows formats and constraints to be defined.
+
+Casting a value will check that the value is of the expected type, is in the correct format, and complies with any constraints imposed by the schema. E.g. a date value (in ISO 8601 format) can be cast with a field of type `date`. Values that can't be cast will raise a `CastError` exception.
+
+Casting a value that doesn't meet the constraints will also raise a `CastError` exception, as shown in the sketch below.
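+
+A short sketch of constraint checking (the descriptor below is illustrative):
+
+```python
+from tableschema import Field
+
+field = Field({'name': 'code', 'type': 'string', 'constraints': {'maxLength': 2}})
+field.cast_value('ab')   # -> 'ab'
+field.cast_value('abc')  # raises a cast error: the maxLength constraint fails
+```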
+
+## API Reference
+
+### `cli`
+```python
+cli()
+```
+Command-line interface
+
+```
+Usage: tableschema [OPTIONS] COMMAND [ARGS]...
+
+Options:
+ --help Show this message and exit.
+
+Commands:
+ infer Infer a schema from data.
+ info Return info on this version of Table Schema
+ validate Validate that a supposed schema is in fact a Table Schema.
+```
+
+
+### `Table`
+```python
+Table(self,
+ source,
+ schema=None,
+ strict=False,
+ post_cast=[],
+ storage=None,
+ **options)
+```
+Table representation
+
+__Arguments__
+- __source (str/list[])__: data source one of:
+ - local file (path)
+ - remote file (url)
+ - array of arrays representing the rows
+- __schema (any)__: data schema in all forms supported by `Schema` class
+- __strict (bool)__: strictness option to pass to `Schema` constructor
+- __post_cast (function[])__: list of post cast processors
+- __storage (None/str)__: storage name like `sql` or `bigquery`
+- __options (dict)__: `tabulator` or storage's options
+
+__Raises__
+- `TableSchemaException`: raises on any error
+
+
+
+#### `table.hash`
+Table's SHA256 hash if it's available.
+
+It is available if the table has already been read, e.g. using `table.read`; otherwise it returns `None`.
+In the middle of an iteration it returns the hash of the contents read so far.
+
+__Returns__
+
+`str/None`: SHA256 hash
+
+
+
+#### `table.headers`
+Table's headers if available
+
+__Returns__
+
+`str[]`: headers
+
+
+
+#### `table.schema`
+Returns schema class instance if available
+
+__Returns__
+
+`Schema`: schema
+
+
+
+#### `table.size`
+Table's size in BYTES if it's available
+
+It is available if the table has already been read, e.g. using `table.read`; otherwise it returns `None`.
+In the middle of an iteration it returns the size of the contents read so far.
+
+__Returns__
+
+`int/None`: size in BYTES
+
+
+
+#### `table.iter`
+```python
+table.iter(keyed=False,
+ extended=False,
+ cast=True,
+ integrity=False,
+ relations=False,
+ foreign_keys_values=False,
+ exc_handler=None)
+```
+Iterates through the table data and emits rows cast based on table schema.
+
+__Arguments__
+
+
+ keyed (bool):
+ yield keyed rows in a form of `{header1: value1, header2: value2}`
+ (default is false; the form of rows is `[value1, value2]`)
+
+ extended (bool):
+ yield extended rows in a form of `[rowNumber, [header1, header2], [value1, value2]]`
+ (default is false; the form of rows is `[value1, value2]`)
+
+ cast (bool):
+ disable data casting if false
+ (default is true)
+
+ integrity (dict):
+ dictionary in a form of `{'size': <bytes>, 'hash': '<sha256>'}`
+ to check integrity of the table when it's read completely.
+ Both keys are optional.
+
+ relations (dict):
+ dictionary of foreign key references in a form
+ of `{resource1: [{field1: value1, field2: value2}, ...], ...}`.
+ If provided, foreign key fields will be checked and resolved
+ to one of their references (/!\ one-to-many foreign keys are not completely resolved).
+
+ foreign_keys_values (dict):
+ three-level dictionary of foreign key references optimized
+ to speed up validation process in a form of
+ `{resource1: {(fk_field1, fk_field2): {(value1, value2): {one_keyedrow}, ... }}}`.
+ If not provided but `relations` is given, it will be created
+ before the validation process by the `index_foreign_keys_values` method
+
+ exc_handler (func):
+ optional custom exception handler callable.
+ Can be used to defer raising errors (i.e. "fail late"), e.g.
+ for data validation purposes. Must support the signature below
+
+__Custom exception handler__
+
+
+```python
+def exc_handler(exc, row_number=None, row_data=None, error_data=None):
+ '''Custom exception handler (example)
+
+ # Arguments:
+ exc(Exception):
+ Deferred exception instance
+ row_number(int):
+ Data row number that triggers exception exc
+ row_data(OrderedDict):
+ Invalid data row source data
+ error_data(OrderedDict):
+ Data row source data field subset responsible for the error, if
+ applicable (e.g. invalid primary or foreign key fields). May be
+ identical to row_data.
+ '''
+ # ...
+```
+
+__Raises__
+- `TableSchemaException`: base class of any error
+- `CastError`: data cast error
+- `IntegrityError`: integrity checking error
+- `UniqueKeyError`: unique key constraint violation
+- `UnresolvedFKError`: unresolved foreign key reference error
+
+__Returns__
+
+`Iterator[list]`: yields rows
+
+
+
+#### `table.read`
+```python
+table.read(keyed=False,
+ extended=False,
+ cast=True,
+ limit=None,
+ integrity=False,
+ relations=False,
+ foreign_keys_values=False,
+ exc_handler=None)
+```
+Read the whole table and return as array of rows
+
+> It has the same API as `table.iter`, except for the following additional argument:
+
+__Arguments__
+- __limit (int)__: limit count of rows to read and return
+
+__Returns__
+
+`list[]`: returns rows
+
+
+
+#### `table.infer`
+```python
+table.infer(limit=100,
+ confidence=0.75,
+ missing_values=[''],
+ guesser_cls=None,
+ resolver_cls=None)
+```
+Infer a schema for the table.
+
+It will infer and set Table Schema to `table.schema` based on table data.
+
+__Arguments__
+- __limit (int)__: limit rows sample size
+- __confidence (float)__: how many casting errors are allowed (as a ratio, between 0 and 1)
+- __missing_values (str[])__: list of missing values (by default `['']`)
+- __guesser_cls (class)__: you can implement inferring strategies by
+ providing type-guessing and type-resolving classes [experimental]
+- __resolver_cls (class)__: you can implement inferring strategies by
+ providing type-guessing and type-resolving classes [experimental]
+
+__Returns__
+
+`dict`: Table Schema descriptor
+
+
+
+#### `table.save`
+```python
+table.save(target, storage=None, **options)
+```
+Save the data source to a local file in CSV format with a `,` (comma) delimiter
+
+> To save schema use `table.schema.save()`
+
+__Arguments__
+- __target (str)__: saving target (e.g. file path)
+- __storage (None/str)__: storage name like `sql` or `bigquery`
+- __options (dict)__: `tabulator` or storage options
+
+__Raises__
+- `TableSchemaException`: raises an error if there is a saving problem
+
+__Returns__
+
+`True/Storage`: returns true or storage instance
+
+
+
+#### `table.index_foreign_keys_values`
+```python
+table.index_foreign_keys_values(relations)
+```
+Creates a three-level dictionary of foreign key references
+
+The index is optimized to speed up the validation process and has the form
+`{resource1: {(fk_field1, fk_field2): {(value1, value2): {one_keyedrow}, ... }}}`.
+
+For each foreign key of the schema it will iterate through the corresponding
+`relations['resource']` to create an index (i.e. a dict) of existing values
+for the foreign fields and store one keyed row for each value combination.
+
+The optimization relies on the indexation of possible values for one foreign key
+in a hashmap to later speed up resolution.
+
+This method is public to allow creating the index once to apply it
+on multiple tables sharing the same schema
+(typically [grouped resources in datapackage](https://github.com/frictionlessdata/datapackage-py#group))
+
+__Notes__
+
+- the second key of the output is a tuple of the foreign fields,
+ a proxy identifier of the foreign key
+- the same relation resource can be indexed multiple times
+ as a schema can contain more than one foreign key
+ pointing to the same resource
+
+__Arguments__
+- __relations (dict)__:
+ dict of foreign key references in a form of
+ `{resource1: [{field1: value1, field2: value2}, ...], ...}`.
+ It must contain all resources referenced in the schema's foreign keys definition.
+
+__Returns__
+
+`dict`:
+ returns a three-level dictionary of foreign key references
+ optimized to speed up validation process in a form of
+ `{resource1: {(fk_field1, fk_field2): {(value1, value2): {one_keyedrow}, ... }}}`
+
+
+### `Schema`
+```python
+Schema(self, descriptor={}, strict=False)
+```
+Schema representation
+
+__Arguments__
+- __descriptor (str/dict)__: schema descriptor one of:
+ - local path
+ - remote url
+ - dictionary
+- __strict (bool)__: flag to specify validation behaviour:
+ - if false, errors will not be raised but instead collected in `schema.errors`
+ - if true, validation errors are raised immediately
+
+__Raises__
+- `TableSchemaException`: raise any error that occurs during the process
+
+
+
+#### `schema.descriptor`
+Schema's descriptor
+
+__Returns__
+
+`dict`: descriptor
+
+
+
+#### `schema.errors`
+Validation errors
+
+Always empty in strict mode.
+
+__Returns__
+
+`Exception[]`: validation errors
+
+
+
+#### `schema.field_names`
+Schema's field names
+
+__Returns__
+
+`str[]`: an array of field names
+
+
+
+#### `schema.fields`
+Schema's fields
+
+__Returns__
+
+`Field[]`: an array of field instances
+
+
+
+#### `schema.foreign_keys`
+Schema's foreign keys
+
+__Returns__
+
+`dict[]`: foreign keys
+
+
+
+#### `schema.headers`
+Schema's field names
+
+__Returns__
+
+`str[]`: an array of field names
+
+
+
+#### `schema.missing_values`
+Schema's missing values
+
+__Returns__
+
+`str[]`: missing values
+
+
+
+#### `schema.primary_key`
+Schema's primary keys
+
+__Returns__
+
+`str[]`: primary keys
+
+
+
+#### `schema.valid`
+Validation status
+
+Always true in strict mode.
+
+__Returns__
+
+`bool`: validation status
+
+
+
+#### `schema.get_field`
+```python
+schema.get_field(name)
+```
+Get schema's field by name.
+
+> Use `schema.update_field` if you want to modify the field descriptor
+
+__Arguments__
+- __name (str)__: schema field name
+
+__Returns__
+
+`Field/None`: `Field` instance or `None` if not found
+
+
+
+#### `schema.add_field`
+```python
+schema.add_field(descriptor)
+```
+Add new field to schema.
+
+The schema descriptor will be validated with the newly added field descriptor.
+
+__Arguments__
+- __descriptor (dict)__: field descriptor
+
+__Raises__
+- `TableSchemaException`: raises any error that occurs during the process
+
+__Returns__
+
+`Field/None`: added `Field` instance or `None` if not added
+
+
+
+#### `schema.update_field`
+```python
+schema.update_field(name, update)
+```
+Update an existing field descriptor by name
+
+__Arguments__
+- __name (str)__: schema field name
+- __update (dict)__: update to apply to field's descriptor
+
+__Returns__
+
+`bool`: true on success and false if no field is found to be modified
+
+
+
+#### `schema.remove_field`
+```python
+schema.remove_field(name)
+```
+Remove a field from the schema by name.
+
+The schema descriptor will be validated after field descriptor removal.
+
+__Arguments__
+- __name (str)__: schema field name
+
+__Raises__
+- `TableSchemaException`: raises any error that occurs during the process
+
+__Returns__
+
+`Field/None`: removed `Field` instance or `None` if not found
+
+
+
+#### `schema.cast_row`
+```python
+schema.cast_row(row, fail_fast=False, row_number=None, exc_handler=None)
+```
+Cast row based on field types and formats.
+
+__Arguments__
+- __row (any[])__: data row as an array of values
+
+__Returns__
+
+`any[]`: returns cast data row
+
+
+
+#### `schema.infer`
+```python
+schema.infer(rows,
+ headers=1,
+ confidence=0.75,
+ guesser_cls=None,
+ resolver_cls=None)
+```
+Infer and set `schema.descriptor` based on data sample.
+
+__Arguments__
+- __rows (list[])__: array of arrays representing rows.
+- __headers (int/str[])__: data sample headers (one of):
+ - row number containing headers (`rows` should contain headers rows)
+ - array of headers (`rows` should NOT contain headers rows)
+- __confidence (float)__: how many casting errors are allowed (as a ratio, between 0 and 1)
+- __guesser_cls (class)__: you can implement inferring strategies by
+ providing type-guessing and type-resolving classes [experimental]
+- __resolver_cls (class)__: you can implement inferring strategies by
+ providing type-guessing and type-resolving classes [experimental]
+
+__Returns__
+
+`dict`: Table Schema descriptor
+
+
+
+#### `schema.commit`
+```python
+schema.commit(strict=None)
+```
+Update schema instance if there are in-place changes in the descriptor.
+
+__Example__
+
+
+```python
+from tableschema import Schema
+descriptor = {'fields': [{'name': 'my_field', 'title': 'My Field', 'type': 'string'}]}
+schema = Schema(descriptor)
+print(schema.get_field('my_field').descriptor['type']) # string
+
+# Update descriptor by field position
+schema.descriptor['fields'][0]['type'] = 'number'
+# Update descriptor by field name
+schema.update_field('my_field', {'title': 'My Pretty Field'}) # True
+
+# Changes are not committed yet
+print(schema.get_field('my_field').descriptor['type']) # string
+print(schema.get_field('my_field').descriptor['title']) # My Field
+
+# Commit change
+schema.commit()
+print(schema.get_field('my_field').descriptor['type']) # number
+print(schema.get_field('my_field').descriptor['title']) # My Pretty Field
+
+```
+
+__Arguments__
+- __strict (bool)__: alter `strict` mode for further work
+
+__Raises__
+- `TableSchemaException`: raises any error that occurs during the process
+
+__Returns__
+
+`bool`: true on success and false if not modified
+
+
+
+#### `schema.save`
+```python
+schema.save(target, ensure_ascii=True)
+```
+Save schema descriptor to target destination.
+
+__Arguments__
+- __target (str)__: path where to save a descriptor
+- __ensure_ascii (bool)__: if true, non-ASCII characters are escaped in the saved JSON (passed through to the JSON encoder)
+
+__Raises__
+- `TableSchemaException`: raises any error that occurs during the process
+
+__Returns__
+
+`bool`: true on success
+
+
+
+### `Field`
+```python
+Field(self, descriptor, missing_values=[''], schema=None)
+```
+Field representation
+
+__Arguments__
+- __descriptor (dict)__: schema field descriptor
+- __missing_values (str[])__: an array of strings representing missing values
+- __schema (Schema)__: the schema this field belongs to, if any
+
+__Raises__
+- `TableSchemaException`: raises any error that occurs during the process
+
+
+
+#### `field.constraints`
+Field constraints
+
+__Returns__
+
+`dict`: dict of field constraints
+
+
+
+#### `field.descriptor`
+Field's descriptor
+
+__Returns__
+
+`dict`: descriptor
+
+
+
+#### `field.format`
+Field format
+
+__Returns__
+
+`str`: field format
+
+
+
+#### `field.missing_values`
+Field's missing values
+
+__Returns__
+
+`str[]`: missing values
+
+
+
+#### `field.name`
+Field name
+
+__Returns__
+
+`str`: field name
+
+
+
+#### `field.required`
+Whether field is required
+
+__Returns__
+
+`bool`: true if required
+
+
+
+#### `field.schema`
+Returns a schema instance if the field belongs to some schema
+
+__Returns__
+
+`Schema`: field's schema
+
+
+
+#### `field.type`
+Field type
+
+__Returns__
+
+`str`: field type
+
+
+
+#### `field.cast_value`
+```python
+field.cast_value(value, constraints=True)
+```
+Cast given value according to the field type and format.
+
+__Arguments__
+- __value (any)__: value to cast against field
+- __constraints (bool/str[])__: constraints configuration
+ - it can be set to `False` to disable constraint checks
+ - it can be an array of constraints to check e.g. `['minimum', 'maximum']`
+
+__Raises__
+- `TableSchemaException`: raises any error that occurs during the process
+
+__Returns__
+
+`any`: returns cast value
+
+
+
+#### `field.test_value`
+```python
+field.test_value(value, constraints=True)
+```
+Test whether a value is compliant with the field.
+
+__Arguments__
+- __value (any)__: value to cast against field
+- __constraints (bool/str[])__: constraints configuration
+
+__Returns__
+
+`bool`: whether the value is compliant with the field
+
+
+### `Storage`
+```python
+Storage(self, **options)
+```
+Storage factory/interface
+
+__For users__
+
+
+> Use `Storage.connect` to instantiate a storage
+
+For instantiation of concrete storage instances,
+`tableschema.Storage` provides a unified factory method `connect`
+(which uses the plugin system under the hood):
+
+```python
+# pip install tableschema_sql
+from tableschema import Storage
+
+storage = Storage.connect('sql', **options)
+storage.create('bucket', descriptor)
+storage.write('bucket', rows)
+storage.read('bucket')
+```
+
+__For integrators__
+
+
+The library includes an interface declaration for implementing tabular `Storage`.
+This interface allows different data storage systems like SQL to be used
+with the `tableschema.Table` class (load/save) as well as on the data package level:
+
+![Storage](https://raw.githubusercontent.com/frictionlessdata/tableschema-py/master/data/storage.png)
+
+An implementor must follow the `tableschema.Storage` interface
+to write their own storage backend. Concrete storage backends
+may include additional functionality specific to the concrete storage system.
+See `plugins` below to learn how to integrate a custom storage plugin into your workflow.
+
+
+
+#### `storage.buckets`
+Return list of storage bucket names.
+
+A `bucket` is a special term with almost the same meaning as `table`.
+Think of a `bucket` as a `table` stored in the `storage`.
+
+__Raises__
+- `exceptions.StorageError`: raises on any error
+
+__Returns__
+
+`str[]`: return list of bucket names
+
+
+
+#### `storage.connect`
+```python
+storage.connect(name, **options)
+```
+Create tabular `storage` based on storage name.
+
+> This method is static: `Storage.connect()`
+
+__Arguments__
+- __name (str)__: storage name like `sql`
+- __options (dict)__: concrete storage options
+
+__Raises__
+- `StorageError`: raises on any error
+
+__Returns__
+
+`Storage`: returns `Storage` instance
+
+
+
+#### `storage.create`
+```python
+storage.create(bucket, descriptor, force=False)
+```
+Create one/multiple buckets.
+
+__Arguments__
+- __bucket (str/list)__: bucket name or list of bucket names
+- __descriptor (dict/dict[])__: schema descriptor or list of descriptors
+- __force (bool)__: whether to delete and re-create already existing buckets
+
+__Raises__
+- `exceptions.StorageError`: raises on any error
+
+
+
+#### `storage.delete`
+```python
+storage.delete(bucket=None, ignore=False)
+```
+Delete one/multiple/all buckets.
+
+__Arguments__
+- __bucket (str/list/None)__: bucket name or list of bucket names to delete.
+ If `None`, all buckets will be deleted
+- __ignore (bool)__: don't raise an error on non-existent bucket deletion
+
+__Raises__
+- `exceptions.StorageError`: raises on any error
+
+
+
+#### `storage.describe`
+```python
+storage.describe(bucket, descriptor=None)
+```
+Get/set bucket's Table Schema descriptor
+
+__Arguments__
+- __bucket (str)__: bucket name
+- __descriptor (dict/None)__: schema descriptor to set
+
+__Raises__
+- `exceptions.StorageError`: raises on any error
+
+__Returns__
+
+`dict`: returns Table Schema descriptor
+
+
+
+#### `storage.iter`
+```python
+storage.iter(bucket)
+```
+Return an iterator of typed values based on the schema of this bucket.
+
+__Arguments__
+- __bucket (str)__: bucket name
+
+__Raises__
+- `exceptions.StorageError`: raises on any error
+
+__Returns__
+
+`list[]`: yields data rows
+
+
+
+#### `storage.read`
+```python
+storage.read(bucket)
+```
+Read typed values based on the schema of this bucket.
+
+__Arguments__
+- __bucket (str)__: bucket name
+
+__Raises__
+- `exceptions.StorageError`: raises on any error
+
+__Returns__
+
+`list[]`: returns data rows
+
+
+
+#### `storage.write`
+```python
+storage.write(bucket, rows)
+```
+This method writes data rows into `storage`.
+
+It should store values of unsupported types as strings internally (like csv does).
+
+__Arguments__
+- __bucket (str)__: bucket name
+- __rows (list[])__: data rows to write
+
+__Raises__
+- `exceptions.StorageError`: raises on any error
+
+
+### `validate`
+```python
+validate(descriptor)
+```
+Validate a Table Schema descriptor.
+
+__Arguments__
+- __descriptor (dict)__: schema descriptor
+
+__Raises__
+- `ValidationError`: on validation errors
+
+__Returns__
+
+`bool`: returns `True` if the descriptor is valid
+
+
+### `infer`
+```python
+infer(source,
+ headers=1,
+ limit=100,
+ confidence=0.75,
+ missing_values=[''],
+ guesser_cls=None,
+ resolver_cls=None,
+ **options)
+```
+Infer source schema.
+
+__Arguments__
+- __source (any)__: source as path, url or inline data
+- __headers (int/str[])__: headers rows number or headers list
+- __limit (int)__: limit rows sample size
+- __confidence (float)__: how many casting errors are allowed (as a ratio, between 0 and 1)
+- __missing_values (str[])__: list of missing values (by default `['']`)
+- __guesser_cls (class)__: you can implement inferring strategies by
+ providing type-guessing and type-resolving classes [experimental]
+- __resolver_cls (class)__: you can implement inferring strategies by
+ providing type-guessing and type-resolving classes [experimental]
+
+__Raises__
+- `TableSchemaException`: raises any error that occurs during the process
+
+__Returns__
+
+`dict`: returns schema descriptor
+
+
+### `FailedCast`
+```python
+FailedCast(self, value)
+```
+Wrap an original data field value that failed to be properly cast.
+
+FailedCast allows further processing/yielding of values while still being able
+to distinguish uncast values on the consuming side.
+
+Delegates attribute access and the basic rich comparison methods to the
+underlying object. Supports the default hashability of user-defined classes, i.e.
+it is hashable based on object identity (not on the wrapped value).
+
+__Arguments__
+- __value (any)__: value
+
+
+### `DataPackageException`
+```python
+DataPackageException(self, message, errors=[])
+```
+Base class for all DataPackage/TableSchema exceptions.
+
+If there are multiple errors, they can be read from the exception object:
+
+```python
+from tableschema.exceptions import DataPackageException
+
+try:
+    pass  # some library action that may raise
+except DataPackageException as exception:
+    if exception.multiple:
+        for error in exception.errors:
+            pass  # handle each nested error
+```
+
+
+
+#### `datapackageexception.errors`
+List of nested errors
+
+__Returns__
+
+`DataPackageException[]`: list of nested errors
+
+
+
+#### `datapackageexception.multiple`
+Whether it's a nested exception
+
+__Returns__
+
+`bool`: whether it's a nested exception
+
+
+
+### `TableSchemaException`
+```python
+TableSchemaException(self, message, errors=[])
+```
+Base class for all TableSchema exceptions.
+
+
+### `LoadError`
+```python
+LoadError(self, message, errors=[])
+```
+All loading errors.
+
+
+### `ValidationError`
+```python
+ValidationError(self, message, errors=[])
+```
+All validation errors.
+
+
+### `CastError`
+```python
+CastError(self, message, errors=[])
+```
+All value cast errors.
+
+
+### `IntegrityError`
+```python
+IntegrityError(self, message, errors=[])
+```
+All integrity errors.
+
+
+### `UniqueKeyError`
+```python
+UniqueKeyError(self, message, errors=[])
+```
+Unique key constraint violation (a `CastError` subclass).
+
+
+### `RelationError`
+```python
+RelationError(self, message, errors=[])
+```
+All relation errors.
+
+
+### `UnresolvedFKError`
+```python
+UnresolvedFKError(self, message, errors=[])
+```
+Unresolved foreign key reference error (a `RelationError` subclass).
+
+
+### `StorageError`
+```python
+StorageError(self, message, errors=[])
+```
+All storage errors.
+
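+A short sketch of catching one of these exceptions during value casting (the field descriptor is illustrative):
+
+```python
+from tableschema import Field, exceptions
+
+field = Field({'name': 'id', 'type': 'integer'})
+try:
+    field.cast_value('not-a-number')
+except exceptions.CastError as exception:
+    print(exception)
+```
+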
+## Experimental
+
+> This API is experimental and can be changed/removed in the future
+
+There is an experimental environment variable, `TABLESCHEMA_PRESERVE_MISSING_VALUES`, which, if set, affects how data casting works.
+
+By default, missing values are resolved to `None`. When this flag is set, missing values are passed through as-is. For example:
+
+> missing_values.py
+
+```python
+from tableschema import Field
+
+field = Field({'type': 'number'}, missing_values=['-'])
+print(field.cast_value('3'))
+print(field.cast_value('-'))
+```
+
+Running this script in different modes:
+
+```bash
+$ python missing_values.py
+3
+None
+$ TABLESCHEMA_PRESERVE_MISSING_VALUES=1 python missing_values.py
+3
+-
+```
+
+The flag affects all of the library's APIs and any software built on top of `tableschema`. For example, Data Package Pipelines:
+
+```bash
+$ TABLESCHEMA_PRESERVE_MISSING_VALUES=1 dpp run ./my_pipeline
+```
+
+## Contributing
+
+> The project follows the [Open Knowledge International coding standards](https://github.com/okfn/coding-standards).
+
+The recommended way to get started is to create and activate a project virtual environment.
+To install the package and development dependencies into the active environment:
+
+```bash
+$ make install
+```
+
+To run tests with linting and coverage:
+
+```bash
+$ make test
+```
+
+## Changelog
+
+Only breaking and the most important changes are described here. The full changelog and documentation for all released versions can be found in the nicely formatted [commit history](https://github.com/frictionlessdata/tableschema-py/commits/master).
+
+#### v1.20
+
+- Added a `--json` flag to the CLI (#287)
+
+#### v1.19
+
+- Deduplicate field names if guessing in infer
+
+#### v1.18
+
+- Publish `field.ERROR/cast_function/check_functions`
+
+#### v1.17
+
+- Added `schema.missing_values` and `field.missing_values`
+
+#### v1.16
+
+- Fixed the way we parse `geopoint`:
+  - as a string it can be in 3 forms ("default", "array", "object"), but
+  - as a native object it can only be a list/tuple
+
+#### v1.15
+
+- Added an experimental `TABLESCHEMA_PRESERVE_MISSING_VALUES` environment variable flag
+
+#### v1.14
+
+- Allow providing custom guesser and resolver to `table.infer` and `infer`
+
+#### v1.13
+
+- Added `missing_values` argument to the `infer` function (#269)
+
+#### v1.12
+
+- Support optional custom exception handling for table.iter/read (#259)
+
+#### v1.11
+
+- Added `preserve_missing_values` parameter to `field.cast_value`
+
+#### v1.10
+
+- Added an ability to check table's integrity while reading
+
+#### v1.9
+
+- Implemented the `table.size` and `table.hash` properties
+
+#### v1.8
+
+- Added `table.index_foreign_keys_values` and improved foreign key checks performance
+
+#### v1.7
+
+- Added `field.schema` property
+
+#### v1.6
+
+- In `strict` mode raise an exception if there are problems in field construction
+
+#### v1.5
+
+- Allow providing custom guesser and resolver to schema infer
+
+#### v1.4
+
+- Added `schema.update_field` method
+
+#### v1.3
+
+- Support datetime with no time for date casting
+
+#### v1.2
+
+- Support floats like 1.0 for integer casting
+
+#### v1.1
+
+- Added the `confidence` parameter to `infer`
+
+#### v1.0
+
+- The library has been rebased on the Frictionless Data specs v1 - https://frictionlessdata.io/specs/table-schema/
+
+
+
+%prep
+%autosetup -n tableschema-1.20.2
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+ find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+ find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+ find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+ find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+ find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-tableschema -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Mon Apr 10 2023 Python_Bot <Python_Bot@openeuler.org> - 1.20.2-1
+- Package Spec generated
diff --git a/sources b/sources
new file mode 100644
index 0000000..b75a0ae
--- /dev/null
+++ b/sources
@@ -0,0 +1 @@
+dc93f6a59b1099157e79fcb402dc6077 tableschema-1.20.2.tar.gz