-rw-r--r--  .gitignore              |    1
-rw-r--r--  python-datapackage.spec | 4981
-rw-r--r--  sources                 |    1
3 files changed, 4983 insertions, 0 deletions
diff --git a/.gitignore b/.gitignore
index e69de29..22bd678 100644
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1 @@
+/datapackage-1.15.2.tar.gz
diff --git a/python-datapackage.spec b/python-datapackage.spec
new file mode 100644
index 0000000..d64c8f1
--- /dev/null
+++ b/python-datapackage.spec
@@ -0,0 +1,4981 @@
+%global _empty_manifest_terminate_build 0
+Name: python-datapackage
+Version: 1.15.2
+Release: 1
+Summary: Utilities to work with Data Packages as defined on specs.frictionlessdata.io
+License: MIT
+URL: https://github.com/frictionlessdata/datapackage-py
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/c7/83/ef759d618745503e66c33aa9e218a954c3637a1b9a700ac9b368b5c9908b/datapackage-1.15.2.tar.gz
+BuildArch: noarch
+
+Requires: python3-six
+Requires: python3-click
+Requires: python3-chardet
+Requires: python3-requests
+Requires: python3-jsonschema
+Requires: python3-unicodecsv
+Requires: python3-jsonpointer
+Requires: python3-tableschema
+Requires: python3-tabulator
+Requires: python3-cchardet
+Requires: python3-mock
+Requires: python3-pylama
+Requires: python3-pytest
+Requires: python3-pytest-cov
+Requires: python3-httpretty
+Requires: python3-tableschema-sql
+
+%description
+# datapackage-py
+
+[![Travis](https://travis-ci.org/frictionlessdata/datapackage-py.svg?branch=master)](https://travis-ci.org/frictionlessdata/datapackage-py)
+[![Coveralls](https://coveralls.io/repos/github/frictionlessdata/datapackage-py/badge.svg?branch=master)](https://coveralls.io/github/frictionlessdata/datapackage-py?branch=master)
+[![PyPi](https://img.shields.io/pypi/v/datapackage.svg)](https://pypi.python.org/pypi/datapackage)
+[![Github](https://img.shields.io/badge/github-master-brightgreen)](https://github.com/frictionlessdata/datapackage-py)
+[![Gitter](https://img.shields.io/gitter/room/frictionlessdata/chat.svg)](https://gitter.im/frictionlessdata/chat)
+
+A library for working with [Data Packages](http://specs.frictionlessdata.io/data-package/).
+
+> **[Important Notice]** We have released [Frictionless Framework](https://github.com/frictionlessdata/frictionless-py). This framework provides improved `datapackage` functionality extended to be a complete data solution. The change is not breaking for existing software, so no action is required. Please read the [Migration Guide](https://framework.frictionlessdata.io/docs/development/migration) from `datapackage` to Frictionless Framework.
+> - we continue to bug-fix `datapackage@1.x` in this [repository](https://github.com/frictionlessdata/datapackage-py), and it remains available on [PyPi](https://pypi.org/project/datapackage/) as before
+> - please note that the `frictionless@3.x` API, which we're working on at the moment, is not yet stable
+> - we will release `frictionless@4.x` by the end of 2020 to be the first SemVer/stable version
+
+## Features
+
+ - `Package` class for working with data packages
+ - `Resource` class for working with data resources
+ - `Profile` class for working with profiles
+ - `validate` function for validating data package descriptors
+ - `infer` function for inferring data package descriptors
+
+## Contents
+
+<!--TOC-->
+
+ - [Getting Started](#getting-started)
+ - [Installation](#installation)
+ - [Documentation](#documentation)
+ - [Introduction](#introduction)
+ - [Working with Package](#working-with-package)
+ - [Working with Resource](#working-with-resource)
+ - [Working with Group](#working-with-group)
+ - [Working with Profile](#working-with-profile)
+ - [Working with Foreign Keys](#working-with-foreign-keys)
+ - [Working with validate/infer](#working-with-validateinfer)
+ - [Frequently Asked Questions](#frequently-asked-questions)
+ - [API Reference](#api-reference)
+ - [`cli`](#cli)
+ - [`Package`](#package)
+ - [`Resource`](#resource)
+ - [`Group`](#group)
+ - [`Profile`](#profile)
+ - [`validate`](#validate)
+ - [`infer`](#infer)
+ - [`DataPackageException`](#datapackageexception)
+ - [`TableSchemaException`](#tableschemaexception)
+ - [`LoadError`](#loaderror)
+ - [`CastError`](#casterror)
+ - [`IntegrityError`](#integrityerror)
+ - [`RelationError`](#relationerror)
+ - [`StorageError`](#storageerror)
+ - [Contributing](#contributing)
+ - [Changelog](#changelog)
+
+<!--TOC-->
+
+## Getting Started
+
+### Installation
+
+The package uses semantic versioning, which means major versions could include breaking changes. It's highly recommended to specify a `datapackage` version range in your `setup/requirements` file, e.g. `datapackage>=1.0,<2.0`.
+
+```bash
+$ pip install datapackage
+```
+
+#### OSX 10.14+
+If you receive an error about the `cchardet` package when installing datapackage on Mac OSX 10.14 (Mojave) or higher, follow these steps:
+1. Make sure you have the latest Xcode by running the following in a terminal: `xcode-select --install`
+2. Then go to [https://developer.apple.com/download/more/](https://developer.apple.com/download/more/) and download the `command line tools`. Note, this requires an Apple ID.
+3. Then, in a terminal, run `open /Library/Developer/CommandLineTools/Packages/macOS_SDK_headers_for_macOS_10.14.pkg`
+You can read more about these steps in this [post](https://stackoverflow.com/questions/52509602/cant-compile-c-program-on-a-mac-after-upgrade-to-mojave).
+
+## Documentation
+
+### Introduction
+
+Let's start with a simple example:
+
+```python
+from datapackage import Package
+
+package = Package('datapackage.json')
+package.get_resource('resource').read()
+```
+
+### Working with Package
+
+A class for working with data packages. It provides various capabilities like loading local or remote data packages, inferring a data package descriptor, saving a data package descriptor, and more.
+
+Suppose we have some local CSV files in a `data` directory. Let's create a data package based on this data using the `Package` class:
+
+> data/cities.csv
+
+```csv
+city,location
+london,"51.50,-0.11"
+paris,"48.85,2.30"
+rome,"41.89,12.51"
+```
+
+> data/population.csv
+
+```csv
+city,year,population
+london,2017,8780000
+paris,2017,2240000
+rome,2017,2860000
+```
+
+First we create a blank data package:
+
+```python
+package = Package()
+```
+
+Now we're ready to infer a data package descriptor based on data files we have. Because we have two csv files we use glob pattern `**/*.csv`:
+
+```python
+package.infer('**/*.csv')
+package.descriptor
+#{ profile: 'tabular-data-package',
+# resources:
+# [ { path: 'data/cities.csv',
+# profile: 'tabular-data-resource',
+# encoding: 'utf-8',
+# name: 'cities',
+# format: 'csv',
+# mediatype: 'text/csv',
+# schema: [Object] },
+# { path: 'data/population.csv',
+# profile: 'tabular-data-resource',
+# encoding: 'utf-8',
+# name: 'population',
+# format: 'csv',
+# mediatype: 'text/csv',
+# schema: [Object] } ] }
+```
+
+The `infer` method has found all our files and inspected them to extract useful metadata like profile, encoding, format, Table Schema, etc. Let's tweak it a little bit:
+
+```python
+package.descriptor['resources'][1]['schema']['fields'][1]['type'] = 'year'
+package.commit()
+package.valid # true
+```
+
+Because our resources are tabular, we can read them as tabular data:
+
+```python
+package.get_resource('population').read(keyed=True)
+#[ { city: 'london', year: 2017, population: 8780000 },
+# { city: 'paris', year: 2017, population: 2240000 },
+# { city: 'rome', year: 2017, population: 2860000 } ]
+```
+
+Let's save our data package to disk as a zip file:
+
+```python
+package.save('datapackage.zip')
+```
+
+To continue working with the data package, we just load it again, but this time using the local `datapackage.zip`:
+
+```python
+package = Package('datapackage.zip')
+# Continue the work
+```
+
+That was only a basic introduction to the `Package` class. To learn more, take a look at the `Package` class API reference.
+
+### Working with Resource
+
+A class for working with data resources. You can read or iterate tabular resources using the `iter/read` methods, and read any resource as bytes using the `raw_iter/raw_read` methods.
+
+Suppose we have a local CSV file. It could just as well be inline data or a remote link - all supported by the `Resource` class (except local files for in-browser usage, of course). But say it's `data.csv` for now:
+
+```csv
+city,location
+london,"51.50,-0.11"
+paris,"48.85,2.30"
+rome,N/A
+```
+
+Let's create and read a resource. Because the resource is tabular, we can use the `resource.read` method with the `keyed` option to get an array of keyed rows:
+
+```python
+from datapackage import Resource
+
+resource = Resource({'path': 'data.csv'})
+resource.tabular # true
+resource.read(keyed=True)
+# [
+# {city: 'london', location: '51.50,-0.11'},
+# {city: 'paris', location: '48.85,2.30'},
+# {city: 'rome', location: 'N/A'},
+# ]
+resource.headers
+# ['city', 'location']
+# (reading has to be started first)
+```
+
+As we can see, our locations are just strings, but they should be geopoints. Also, Rome's location is not available, but it's just an `'N/A'` string instead of Python's `None`. First we have to infer the resource metadata:
+
+```python
+resource.infer()
+resource.descriptor
+#{ path: 'data.csv',
+# profile: 'tabular-data-resource',
+# encoding: 'utf-8',
+# name: 'data',
+# format: 'csv',
+# mediatype: 'text/csv',
+# schema: { fields: [ [Object], [Object] ], missingValues: [ '' ] } }
+resource.read(keyed=True)
+# Fails with a data validation error
+```
+
+Let's fix the not-available location. There is a `missingValues` property in the Table Schema specification. As a first try, we set `missingValues` to `N/A` in the resource descriptor's schema. The resource descriptor can be changed in-place, but all changes should be committed by `resource.commit()`:
+
+```python
+resource.descriptor['schema']['missingValues'] = 'N/A'
+resource.commit()
+resource.valid # False
+resource.errors
+# [<ValidationError: "'N/A' is not of type 'array'">]
+```
+
+As good citizens, we've decided to check our resource descriptor's validity. And it's not valid! We should use an array for the `missingValues` property. Also, don't forget to keep the empty string as a missing value:
+
+```python
+resource.descriptor['schema']['missingValues'] = ['', 'N/A']
+resource.commit()
+resource.valid # true
+```
+
+All good. It looks like we're ready to read our data again:
+
+```python
+resource.read(keyed=True)
+# [
+# {city: 'london', location: [51.50,-0.11]},
+# {city: 'paris', location: [48.85,2.30]},
+# {city: 'rome', location: None},
+# ]
+```
+
+Now we see that:
+- locations are arrays with numeric latitude and longitude
+- Rome's location is a native Python `None`
+
+And because there are no errors on data reading, we can be sure that our data is valid against our schema. Let's save our resource descriptor:
+
+```python
+resource.save('dataresource.json')
+```
+
+Let's check the newly-created `dataresource.json`. It contains the path to our data file, the inferred metadata, and our `missingValues` tweak:
+
+```json
+{
+ "path": "data.csv",
+ "profile": "tabular-data-resource",
+ "encoding": "utf-8",
+ "name": "data",
+ "format": "csv",
+ "mediatype": "text/csv",
+ "schema": {
+ "fields": [
+ {
+ "name": "city",
+ "type": "string",
+ "format": "default"
+ },
+ {
+ "name": "location",
+ "type": "geopoint",
+ "format": "default"
+ }
+ ],
+ "missingValues": [
+ "",
+ "N/A"
+ ]
+ }
+}
+```
+
+If we decide to improve it even more, we can update the `dataresource.json` file and then open it again using the local file name:
+
+```python
+resource = Resource('dataresource.json')
+# Continue the work
+```
+
+That was only a basic introduction to the `Resource` class. To learn more, take a look at the `Resource` class API reference.
+
+### Working with Group
+
+A class representing a group of tabular resources. Groups can be used to read multiple resources as one, or to export them, for example, to a database as one table. To define a group, add a `group: <name>` field to the corresponding resources. The group's metadata will be created from the "leading" resource's metadata (the first resource with the group name).
+
+Consider we have a data package with two tables partitioned by a year and a shared schema stored separately:
+
+> cars-2017.csv
+
+```csv
+name,value
+bmw,2017
+tesla,2017
+nissan,2017
+```
+
+> cars-2018.csv
+
+```csv
+name,value
+bmw,2018
+tesla,2018
+nissan,2018
+```
+
+> cars.schema.json
+
+```json
+{
+ "fields": [
+ {
+ "name": "name",
+ "type": "string"
+ },
+ {
+ "name": "value",
+ "type": "integer"
+ }
+ ]
+}
+```
+
+> datapackage.json
+
+```json
+{
+ "name": "datapackage",
+ "resources": [
+ {
+ "group": "cars",
+ "name": "cars-2017",
+ "path": "cars-2017.csv",
+ "profile": "tabular-data-resource",
+ "schema": "cars.schema.json"
+ },
+ {
+ "group": "cars",
+ "name": "cars-2018",
+ "path": "cars-2018.csv",
+ "profile": "tabular-data-resource",
+ "schema": "cars.schema.json"
+ }
+ ]
+}
+```
+
+Let's read the resources separately:
+
+```python
+package = Package('datapackage.json')
+package.get_resource('cars-2017').read(keyed=True) == [
+ {'name': 'bmw', 'value': 2017},
+ {'name': 'tesla', 'value': 2017},
+ {'name': 'nissan', 'value': 2017},
+]
+package.get_resource('cars-2018').read(keyed=True) == [
+ {'name': 'bmw', 'value': 2018},
+ {'name': 'tesla', 'value': 2018},
+ {'name': 'nissan', 'value': 2018},
+]
+```
+
+On the other hand, these resources are defined with a `group: cars` field, which means we can treat them as a group:
+
+```python
+package = Package('datapackage.json')
+package.get_group('cars').read(keyed=True) == [
+ {'name': 'bmw', 'value': 2017},
+ {'name': 'tesla', 'value': 2017},
+ {'name': 'nissan', 'value': 2017},
+ {'name': 'bmw', 'value': 2018},
+ {'name': 'tesla', 'value': 2018},
+ {'name': 'nissan', 'value': 2018},
+]
+```
+
+We can use this approach when we need to save the data package to storage, for example, to a SQL database. There is a `merge_groups` flag to enable the grouping behaviour:
+
+```python
+from sqlalchemy import create_engine
+from datapackage import Package
+
+engine = create_engine('sqlite://')  # engine is a SQLAlchemy engine
+package = Package('datapackage.json')
+package.save(storage='sql', engine=engine)
+# SQL tables:
+# - cars-2017
+# - cars-2018
+package.save(storage='sql', engine=engine, merge_groups=True)
+# SQL tables:
+# - cars
+```
+
+### Working with Profile
+
+A component to represent JSON Schema profile from [Profiles Registry]( https://specs.frictionlessdata.io/schemas/registry.json):
+
+```python
+profile = Profile('data-package')
+
+profile.name # data-package
+profile.jsonschema # JSON Schema contents
+
+try:
+ valid = profile.validate(descriptor)
+except exceptions.ValidationError as exception:
+    for error in exception.errors:
+        pass  # handle individual error
+```
+
+### Working with Foreign Keys
+
+The library supports foreign keys described in the [Table Schema](http://specs.frictionlessdata.io/table-schema/#foreign-keys) specification. This means that if your data package descriptor uses the `resources[].schema.foreignKeys` property for some resources, data integrity will be checked on reading operations.
+
+Consider we have a data package:
+
+```python
+DESCRIPTOR = {
+ 'resources': [
+ {
+ 'name': 'teams',
+ 'data': [
+ ['id', 'name', 'city'],
+ ['1', 'Arsenal', 'London'],
+ ['2', 'Real', 'Madrid'],
+ ['3', 'Bayern', 'Munich'],
+ ],
+ 'schema': {
+ 'fields': [
+ {'name': 'id', 'type': 'integer'},
+ {'name': 'name', 'type': 'string'},
+ {'name': 'city', 'type': 'string'},
+ ],
+ 'foreignKeys': [
+ {
+ 'fields': 'city',
+ 'reference': {'resource': 'cities', 'fields': 'name'},
+ },
+ ],
+ },
+ }, {
+ 'name': 'cities',
+ 'data': [
+ ['name', 'country'],
+ ['London', 'England'],
+ ['Madrid', 'Spain'],
+ ],
+ },
+ ],
+}
+```
+
+Let's check relations for a `teams` resource:
+
+```python
+from datapackage import Package
+
+package = Package(DESCRIPTOR)
+teams = package.get_resource('teams')
+teams.check_relations()
+# tableschema.exceptions.RelationError: Foreign key "['city']" violation in row "4"
+```
+
+As we can see, there is a foreign key violation. That's because our lookup table `cities` doesn't have the city `Munich`, but we have a team from there. We need to fix that in the `cities` resource:
+
+```python
+package.descriptor['resources'][1]['data'].append(['Munich', 'Germany'])
+package.commit()
+teams = package.get_resource('teams')
+teams.check_relations()
+# True
+```
+
+Fixed! But a check operation is not all that's available. We can use the `relations` argument of the `resource.iter/read` methods to dereference a resource's relations:
+
+```python
+teams.read(keyed=True, relations=True)
+#[{'id': 1, 'name': 'Arsenal', 'city': {'name': 'London', 'country': 'England'}},
+# {'id': 2, 'name': 'Real', 'city': {'name': 'Madrid', 'country': 'Spain'}},
+# {'id': 3, 'name': 'Bayern', 'city': {'name': 'Munich', 'country': 'Germany'}}]
+```
+
+Instead of a plain city name, we get a dictionary containing the city's data. The `resource.iter/read` methods will fail with the same error as `resource.check_relations` if there is an integrity issue, but only if the `relations=True` flag is passed.
+
+### Working with validate/infer
+
+A standalone function to validate a data package descriptor:
+
+```python
+from datapackage import validate, exceptions
+
+try:
+ valid = validate(descriptor)
+except exceptions.ValidationError as exception:
+    for error in exception.errors:
+        pass  # handle individual error
+```
+
+A standalone function to infer a data package descriptor:
+
+```python
+from datapackage import infer
+
+descriptor = infer('**/*.csv')
+#{ profile: 'tabular-data-package',
+# resources:
+# [ { path: 'data/cities.csv',
+# profile: 'tabular-data-resource',
+# encoding: 'utf-8',
+# name: 'cities',
+# format: 'csv',
+# mediatype: 'text/csv',
+# schema: [Object] },
+# { path: 'data/population.csv',
+# profile: 'tabular-data-resource',
+# encoding: 'utf-8',
+# name: 'population',
+# format: 'csv',
+# mediatype: 'text/csv',
+# schema: [Object] } ] }
+```
+
+### Frequently Asked Questions
+
+#### Accessing data behind a proxy server?
+
+Before calling `package = Package("https://xxx.json")`, set these environment variables:
+
+```python
+import os
+
+os.environ["HTTP_PROXY"] = 'xxx'
+os.environ["HTTPS_PROXY"] = 'xxx'
+```
+
+## API Reference
+
+### `cli`
+```python
+cli()
+```
+Command-line interface
+
+```
+Usage: datapackage [OPTIONS] COMMAND [ARGS]...
+
+Options:
+ --version Show the version and exit.
+ --help Show this message and exit.
+
+Commands:
+ infer
+ validate
+```
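+
+For example (a sketch - the exact argument shapes are assumptions, not taken from the usage text above):
+
+```bash
+$ datapackage validate datapackage.json
+$ datapackage infer '**/*.csv' > datapackage.json
+```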
+
+
+### `Package`
+```python
+Package(self,
+ descriptor=None,
+ base_path=None,
+ strict=False,
+ unsafe=False,
+ storage=None,
+ schema=None,
+ default_base_path=None,
+ **options)
+```
+Package representation
+
+__Arguments__
+- __descriptor (str/dict)__: data package descriptor as local path, url or object
+- __base_path (str)__: base path for all relative paths
+- __strict (bool)__: strict flag to alter validation behavior.
+ Setting it to `True` leads to throwing errors
+ on any operation with invalid descriptor
+- __unsafe (bool)__:
+    if `True`, unsafe paths will be allowed. For more information see
+    https://specs.frictionlessdata.io/data-resource/#data-location.
+    Defaults to `False`
+- __storage (str/tableschema.Storage)__: storage name like `sql` or storage instance
+- __options (dict)__: storage options to use for storage creation
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
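+
+A minimal usage sketch (the descriptor file name is illustrative):
+
+```python
+from datapackage import Package
+
+# strict=True makes any operation with an invalid descriptor raise immediately
+package = Package('datapackage.json', strict=True)
+```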
+
+
+
+#### `package.base_path`
+Package's base path
+
+__Returns__
+
+`str/None`: returns the data package base path
+
+
+
+#### `package.descriptor`
+Package's descriptor
+
+__Returns__
+
+`dict`: descriptor
+
+
+
+#### `package.errors`
+Validation errors
+
+Always empty in strict mode.
+
+__Returns__
+
+`Exception[]`: validation errors
+
+
+
+#### `package.profile`
+Package's profile
+
+__Returns__
+
+`Profile`: an instance of `Profile` class
+
+
+
+#### `package.resource_names`
+Package's resource names
+
+__Returns__
+
+`str[]`: returns an array of resource names
+
+
+
+#### `package.resources`
+Package's resources
+
+__Returns__
+
+`Resource[]`: returns an array of `Resource` instances
+
+
+
+#### `package.valid`
+Validation status
+
+Always true in strict mode.
+
+__Returns__
+
+`bool`: validation status
+
+
+
+#### `package.get_resource`
+```python
+package.get_resource(name)
+```
+Get data package resource by name.
+
+__Arguments__
+- __name (str)__: data resource name
+
+__Returns__
+
+`Resource/None`: returns a `Resource` instance or `None` if not found
+
+
+
+#### `package.add_resource`
+```python
+package.add_resource(descriptor)
+```
+Add new resource to data package.
+
+The data package descriptor will be validated with newly added resource descriptor.
+
+__Arguments__
+- __descriptor (dict)__: data resource descriptor
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
+
+__Returns__
+
+`Resource/None`: returns the added `Resource` instance or `None` if not added
+
+
+
+#### `package.remove_resource`
+```python
+package.remove_resource(name)
+```
+Remove data package resource by name.
+
+The data package descriptor will be validated after resource descriptor removal.
+
+__Arguments__
+- __name (str)__: data resource name
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
+
+__Returns__
+
+`Resource/None`: returns the removed `Resource` instance or `None` if not found
+
+
+
+#### `package.get_group`
+```python
+package.get_group(name)
+```
+Returns a group of tabular resources by name.
+
+For more information about groups see [Group](#group).
+
+__Arguments__
+- __name (str)__: name of a group of resources
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
+
+__Returns__
+
+`Group/None`: returns a `Group` instance or `None` if not found
+
+
+
+#### `package.infer`
+```python
+package.infer(pattern=False)
+```
+Infer a data package metadata.
+
+> Argument `pattern` works only for local files
+
+If `pattern` is not provided, only existing resources will be inferred
+(adding metadata like encoding, profile, etc.). If `pattern` is provided,
+new resources with file names matching the pattern will be added and inferred.
+It commits changes to the data package instance.
+
+__Arguments__
+- __pattern (str)__: glob pattern for new resources
+
+__Returns__
+
+`dict`: returns data package descriptor
+
+
+
+#### `package.commit`
+```python
+package.commit(strict=None)
+```
+Update data package instance if there are in-place changes in the descriptor.
+
+__Example__
+
+
+```python
+package = Package({
+ 'name': 'package',
+ 'resources': [{'name': 'resource', 'data': ['data']}]
+})
+
+package.name # package
+package.descriptor['name'] = 'renamed-package'
+package.name # package
+package.commit()
+package.name # renamed-package
+```
+
+__Arguments__
+- __strict (bool)__: alter `strict` mode for further work
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
+
+__Returns__
+
+`bool`: returns true on success and false if not modified
+
+
+
+#### `package.save`
+```python
+package.save(target=None,
+ storage=None,
+ merge_groups=False,
+ to_base_path=False,
+ **options)
+```
+Saves this data package
+
+It saves to storage if the `storage` argument is passed, saves this data
+package's descriptor to a JSON file if the `target` argument ends with
+`.json`, or saves this data package to a zip file otherwise.
+
+__Example__
+
+
+It creates a zip file at `target` with the contents
+of this Data Package and its resources. Every resource whose content
+lives in the local filesystem will be copied to the zip file.
+Consider the following Data Package descriptor:
+
+```json
+{
+ "name": "gdp",
+ "resources": [
+ {"name": "local", "format": "CSV", "path": "data.csv"},
+ {"name": "inline", "data": [4, 8, 15, 16, 23, 42]},
+ {"name": "remote", "url": "http://someplace.com/data.csv"}
+ ]
+}
+```
+
+The final structure of the zip file will be:
+
+```
+./datapackage.json
+./data/local.csv
+```
+
+The contents of `datapackage.json` will be the same as the
+returned `datapackage.descriptor`. The resources' file names are generated
+based on their `name` and `format` fields if they exist.
+If a resource has no `name`, `resource-X` will be used,
+where `X` is the index of the resource in the `resources` list (starting at zero).
+If a resource has a `format`, it will be lowercased and appended to the `name`,
+becoming "`name.format`".
+
+__Arguments__
+- __target (string/filelike)__:
+ the file path or a file-like object where
+ the contents of this Data Package will be saved into.
+- __storage (str/tableschema.Storage)__:
+ storage name like `sql` or storage instance
+- __merge_groups (bool)__:
+    save all the group's tabular resources into one bucket
+ if a storage is provided (for example into one SQL table).
+ Read more about [Group](#group).
+- __to_base_path (bool)__:
+ save the package to the package's base path
+ using the "<base_path>/<target>" route
+- __options (dict)__:
+ storage options to use for storage creation
+
+__Raises__
+- `DataPackageException`: raises if there was some error writing the package
+
+__Returns__
+
+`bool/Storage`: on success return true or a `Storage` instance
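+
+A short sketch of the three save modes described above (the targets and the SQLAlchemy `engine` are illustrative):
+
+```python
+package.save('datapackage.json')            # target ends with .json: descriptor only
+package.save('datapackage.zip')             # otherwise: zip with descriptor and local resources
+package.save(storage='sql', engine=engine)  # storage passed: save to a storage backend
+```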
+
+### `Resource`
+```python
+Resource(self,
+ descriptor={},
+ base_path=None,
+ strict=False,
+ unsafe=False,
+ storage=None,
+ package=None,
+ **options)
+```
+Resource representation
+
+__Arguments__
+- __descriptor (str/dict)__: data resource descriptor as local path, url or object
+- __base_path (str)__: base path for all relative paths
+- __strict (bool)__:
+ strict flag to alter validation behavior. Setting it to `true`
+ leads to throwing errors on any operation with invalid descriptor
+- __unsafe (bool)__:
+    if `True`, unsafe paths will be allowed. For more information see
+    https://specs.frictionlessdata.io/data-resource/#data-location.
+    Defaults to `False`
+- __storage (str/tableschema.Storage)__: storage name like `sql` or storage instance
+- __options (dict)__: storage options to use for storage creation
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
+
+
+
+#### `resource.data`
+Return resource data
+
+
+#### `resource.descriptor`
+Resource's descriptor
+
+__Returns__
+
+`dict`: descriptor
+
+
+
+#### `resource.errors`
+Validation errors
+
+Always empty in strict mode.
+
+__Returns__
+
+`Exception[]`: validation errors
+
+
+
+#### `resource.group`
+Group name
+
+__Returns__
+
+`str`: group name
+
+
+
+#### `resource.headers`
+Resource's headers
+
+> Only for tabular resources (reading has to be started first or it's `None`)
+
+__Returns__
+
+`str[]/None`: returns data source headers
+
+
+
+#### `resource.inline`
+Whether resource inline
+
+__Returns__
+
+`bool`: returns true if resource is inline
+
+
+
+#### `resource.local`
+Whether resource local
+
+__Returns__
+
+`bool`: returns true if resource is local
+
+
+
+#### `resource.multipart`
+Whether resource multipart
+
+__Returns__
+
+`bool`: returns true if resource is multipart
+
+
+
+#### `resource.name`
+Resource name
+
+__Returns__
+
+`str`: name
+
+
+
+#### `resource.package`
+Package instance if the resource belongs to some package
+
+__Returns__
+
+`Package/None`: a package instance if available
+
+
+
+#### `resource.profile`
+Resource's profile
+
+__Returns__
+
+`Profile`: an instance of `Profile` class
+
+
+
+#### `resource.remote`
+Whether resource remote
+
+__Returns__
+
+`bool`: returns true if resource is remote
+
+
+
+#### `resource.schema`
+Resource's schema
+
+> Only for tabular resources
+
+For tabular resources it returns `Schema` instance to interact with data schema.
+Read API documentation - [tableschema.Schema](https://github.com/frictionlessdata/tableschema-py#schema).
+
+__Returns__
+
+`tableschema.Schema`: schema
+
+
+
+#### `resource.source`
+Resource's source
+
+Combination of `resource.source` and `resource.inline/local/remote/multipart`
+provides predictable interface to work with resource data.
+
+__Returns__
+
+`list/str`: returns `data` or `path` property
+
+
+
+#### `resource.table`
+Return resource table
+
+
+#### `resource.tabular`
+Whether resource tabular
+
+__Returns__
+
+`bool`: returns true if resource is tabular
+
+
+
+#### `resource.valid`
+Validation status
+
+Always true in strict mode.
+
+__Returns__
+
+`bool`: validation status
+
+
+
+#### `resource.iter`
+```python
+resource.iter(integrity=False, relations=False, **options)
+```
+Iterates through the resource data and emits rows cast based on table schema.
+
+> Only for tabular resources
+
+__Arguments__
+
+
+ keyed (bool):
+ yield keyed rows in a form of `{header1: value1, header2: value2}`
+ (default is false; the form of rows is `[value1, value2]`)
+
+ extended (bool):
+        yield extended rows in a form of `[rowNumber, [header1, header2], [value1, value2]]`
+ (default is false; the form of rows is `[value1, value2]`)
+
+ cast (bool):
+ disable data casting if false
+ (default is true)
+
+ integrity (bool):
+ if true actual size in BYTES and SHA256 hash of the file
+ will be checked against `descriptor.bytes` and `descriptor.hash`
+ (other hashing algorithms are not supported and will be skipped silently)
+
+ relations (bool):
+ if true foreign key fields will be checked and resolved to its references
+
+ foreign_keys_values (dict):
+ three-level dictionary of foreign key references optimized
+ to speed up validation process in a form of
+ `{resource1: {(fk_field1, fk_field2): {(value1, value2): {one_keyedrow}, ... }}}`.
+ If not provided but relations is true, it will be created
+ before the validation process by *index_foreign_keys_values* method
+
+ exc_handler (func):
+ optional custom exception handler callable.
+ Can be used to defer raising errors (i.e. "fail late"), e.g.
+ for data validation purposes. Must support the signature below
+
+__Custom exception handler__
+
+
+```python
+def exc_handler(exc, row_number=None, row_data=None, error_data=None):
+ '''Custom exception handler (example)
+
+ # Arguments:
+ exc(Exception):
+ Deferred exception instance
+ row_number(int):
+ Data row number that triggers exception exc
+ row_data(OrderedDict):
+ Invalid data row source data
+ error_data(OrderedDict):
+ Data row source data field subset responsible for the error, if
+ applicable (e.g. invalid primary or foreign key fields). May be
+ identical to row_data.
+ '''
+ # ...
+```
+
+__Raises__
+- `DataPackageException`: base class of any error
+- `CastError`: data cast error
+- `IntegrityError`: integrity checking error
+- `UniqueKeyError`: unique key constraint violation
+- `UnresolvedFKError`: unresolved foreign key reference error
+
+__Returns__
+
+`Iterator[list]`: yields rows
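+
+A sketch of "fail late" reading using the `exc_handler` hook documented above (the collection logic is illustrative):
+
+```python
+errors = []
+
+def exc_handler(exc, row_number=None, row_data=None, error_data=None):
+    errors.append((row_number, exc))  # defer the error instead of raising
+
+for row in resource.iter(keyed=True, exc_handler=exc_handler):
+    pass  # process valid rows; inspect `errors` afterwards
+```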
+
+
+
+#### `resource.read`
+```python
+resource.read(integrity=False,
+ relations=False,
+ foreign_keys_values=False,
+ **options)
+```
+Read the whole resource and return as array of rows
+
+> Only for tabular resources
+> It has the same API as `resource.iter` except for
+
+__Arguments__
+- __limit (int)__: limit count of rows to read and return
+
+__Returns__
+
+`list[]`: returns rows
+
+
+
+#### `resource.check_integrity`
+```python
+resource.check_integrity()
+```
+Checks resource integrity
+
+> Only for tabular resources
+
+It checks size in BYTES and SHA256 hash of the file
+against `descriptor.bytes` and `descriptor.hash`
+(other hashing algorithms are not supported and will be skipped silently).
+
+__Raises__
+- `exceptions.IntegrityError`: raises if there are integrity issues
+
+__Returns__
+
+`bool`: returns True if no issues
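+
+A sketch of handling an integrity failure (assuming the descriptor declares `bytes`/`hash`):
+
+```python
+from datapackage import exceptions
+
+try:
+    resource.check_integrity()
+except exceptions.IntegrityError as exception:
+    pass  # size or SHA256 mismatch against the descriptor
+```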
+
+
+
+#### `resource.check_relations`
+```python
+resource.check_relations(foreign_keys_values=False)
+```
+Check relations
+
+> Only for tabular resources
+
+It checks foreign keys and raises an exception if there are integrity issues.
+
+__Raises__
+- `exceptions.RelationError`: raises if there are relation issues
+
+__Returns__
+
+`bool`: returns True if no issues
+
+
+
+#### `resource.drop_relations`
+```python
+resource.drop_relations()
+```
+Drop relations
+
+> Only for tabular resources
+
+Remove relations data from memory
+
+__Returns__
+
+`bool`: returns True
+
+
+
+#### `resource.raw_iter`
+```python
+resource.raw_iter(stream=False)
+```
+Iterate over data chunks as bytes.
+
+If `stream` is true, a file-like object will be returned.
+
+__Arguments__
+- __stream (bool)__: File-like object will be returned
+
+__Returns__
+
+`bytes[]/filelike`: returns bytes[]/filelike
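+
+A minimal sketch of consuming a resource as raw byte chunks (chunk sizes are an implementation detail):
+
+```python
+size = 0
+for chunk in resource.raw_iter():
+    size += len(chunk)
+```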
+
+
+
+#### `resource.raw_read`
+```python
+resource.raw_read()
+```
+Returns resource data as bytes.
+
+__Returns__
+
+`bytes`: returns resource data in bytes
+
+
+
+#### `resource.infer`
+```python
+resource.infer(**options)
+```
+Infer resource metadata
+
+Like name, format, mediatype, encoding, schema, and profile.
+It commits these changes into the resource instance.
+
+__Arguments__
+- __options__:
+ options will be passed to `tableschema.infer` call,
+ for more control on results (e.g. for setting `limit`, `confidence` etc.).
+
+__Returns__
+
+`dict`: returns resource descriptor
+
+
+
+#### `resource.commit`
+```python
+resource.commit(strict=None)
+```
+Update resource instance if there are in-place changes in the descriptor.
+
+__Arguments__
+- __strict (bool)__: alter `strict` mode for further work
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
+
+__Returns__
+
+`bool`: returns true on success and false if not modified
+
+
+
+#### `resource.save`
+```python
+resource.save(target, storage=None, to_base_path=False, **options)
+```
+Saves this resource
+
+Into storage if `storage` argument is passed or
+saves this resource's descriptor to json file otherwise.
+
+__Arguments__
+- __target (str)__:
+ path where to save a resource
+- __storage (str/tableschema.Storage)__:
+ storage name like `sql` or storage instance
+- __to_base_path (bool)__:
+ save the resource to the resource's base path
+ using the "<base_path>/<target>" route
+- __options (dict)__:
+ storage options to use for storage creation
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
+
+__Returns__
+
+`bool`: returns true on success
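+
+A sketch of the `to_base_path` option (the target name is illustrative):
+
+```python
+resource.save('dataresource.json', to_base_path=True)  # writes to <base_path>/dataresource.json
+```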
+
+### `Group`
+```python
+Group(self, resources)
+```
+Group representation
+
+__Arguments__
+- __resources (Resource[])__: list of TABULAR resources
+
+
+
+#### `group.headers`
+Group's headers
+
+__Returns__
+
+`str[]/None`: returns headers
+
+
+
+#### `group.name`
+Group name
+
+__Returns__
+
+`str`: name
+
+
+
+#### `group.schema`
+Group's schema
+
+__Returns__
+
+`tableschema.Schema`: schema
+
+
+
+#### `group.iter`
+```python
+group.iter(**options)
+```
+Iterates through the group data and emits rows cast based on table schema.
+
+> It concatenates all the resources and has the same API as `resource.iter`
+
+
+
+#### `group.read`
+```python
+group.read(limit=None, **options)
+```
+Read the whole group and return as array of rows
+
+> It concatenates all the resources and has the same API as `resource.read`
+
+
+
+#### `group.check_relations`
+```python
+group.check_relations()
+```
+Check group's relations
+
+The same as `resource.check_relations` but without the optional
+argument *foreign_keys_values*. This method will test the foreignKeys of the
+whole group at once, optimizing the process by creating the *foreign_keys_values*
+hashmap only once before testing the set of resources.
+
+
+### `Profile`
+```python
+Profile(self, profile)
+```
+Profile representation
+
+__Arguments__
+- __profile (str)__: profile name in registry or URL to JSON Schema
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
+
+
+
+#### `profile.jsonschema`
+JSONSchema content
+
+__Returns__
+
+`dict`: returns profile's JSON Schema contents
+
+
+
+#### `profile.name`
+Profile name
+
+__Returns__
+
+`str/None`: name if available
+
+
+
+#### `profile.validate`
+```python
+profile.validate(descriptor)
+```
+Validate a data package `descriptor` against the profile.
+
+__Arguments__
+- __descriptor (dict)__: retrieved and dereferenced data package descriptor
+
+__Raises__
+- `ValidationError`: raises if not valid
+__Returns__
+
+`bool`: returns True if valid
+
+
+### `validate`
+```python
+validate(descriptor)
+```
+Validate a data package descriptor.
+
+__Arguments__
+- __descriptor (str/dict)__: package descriptor (one of):
+ - local path
+ - remote url
+ - object
+
+__Raises__
+- `ValidationError`: raises on invalid
+
+__Returns__
+
+`bool`: returns true on valid
+
+
+### `infer`
+```python
+infer(pattern, base_path=None)
+```
+Infer a data package descriptor.
+
+> Argument `pattern` works only for local files
+
+__Arguments__
+- __pattern (str)__: glob file pattern
+
+__Returns__
+
+`dict`: returns data package descriptor
+
+
+### `DataPackageException`
+```python
+DataPackageException(self, message, errors=[])
+```
+Base class for all DataPackage/TableSchema exceptions.
+
+If there are multiple errors, they can be read from the exception object:
+
+```python
+try:
+    pass  # lib action
+except DataPackageException as exception:
+    if exception.multiple:
+        for error in exception.errors:
+            pass  # handle error
+```
+
+
+
+#### `datapackageexception.errors`
+List of nested errors
+
+__Returns__
+
+`DataPackageException[]`: list of nested errors
+
+
+
+#### `datapackageexception.multiple`
+Whether it's a nested exception
+
+__Returns__
+
+`bool`: whether it's a nested exception
+
+
+
+### `TableSchemaException`
+```python
+TableSchemaException(self, message, errors=[])
+```
+Base class for all TableSchema exceptions.
+
+
+### `LoadError`
+```python
+LoadError(self, message, errors=[])
+```
+All loading errors.
+
+
+### `CastError`
+```python
+CastError(self, message, errors=[])
+```
+All value cast errors.
+
+
+### `IntegrityError`
+```python
+IntegrityError(self, message, errors=[])
+```
+All integrity errors.
+
+
+### `RelationError`
+```python
+RelationError(self, message, errors=[])
+```
+All relations errors.
+
+
+### `StorageError`
+```python
+StorageError(self, message, errors=[])
+```
+All storage errors.
+
+
+## Contributing
+
+> The project follows the [Open Knowledge International coding standards](https://github.com/okfn/coding-standards).
+
+The recommended way to get started is to create and activate a project virtual environment.
+To install the package and development dependencies into the active environment:
+
+```bash
+$ make install
+```
+
+To run tests with linting and coverage:
+
+```bash
+$ make test
+```
+
+## Changelog
+
+Only breaking and the most important changes are described here. The full changelog and documentation for all released versions can be found in the nicely formatted [commit history](https://github.com/frictionlessdata/datapackage-py/commits/master).
+
+#### v1.15
+
+> WARNING: it can be breaking for some setups, please read the discussions below
+
+- Fixed header management according to the specs:
+ - https://github.com/frictionlessdata/datapackage-py/pull/257
+ - https://github.com/frictionlessdata/datapackage-py/issues/256
+ - https://github.com/frictionlessdata/forum/issues/1
+
+#### v1.14
+
+- Added experimental options for picking/skipping fields/rows
+
+#### v1.13
+
+- Add `unsafe` option to Package and Resource (#262)
+
+#### v1.12
+
+- Use `chardet` for encoding detection by default. For `cchardet`: `pip install datapackage[cchardet]`
+
+#### v1.11
+
+- `resource/package.save` now accepts a `to_base_path` argument (#254)
+- `package.save` now returns a `Storage` instance if available
+
+#### v1.10
+
+- Added an ability to check tabular resource's integrity
+
+#### v1.9
+
+- Added `resource.package` property
+
+#### v1.8
+
+- Added support for [groups of resources](#group)
+
+#### v1.7
+
+- Added support for [compression of resources](https://frictionlessdata.io/specs/patterns/#compression-of-resources)
+
+#### v1.6
+
+- Added support for custom request session
+
+#### v1.5
+
+Updated behaviour:
+- Added support for Python 3.7
+
+#### v1.4
+
+New API added:
+- added `skip_rows` support to the resource descriptor
+
+#### v1.3
+
+New API added:
+- property `package.base_path` is now publicly available
+
+#### v1.2
+
+Updated behaviour:
+- CLI command `$ datapackage infer` now outputs only a JSON-formatted data package descriptor.
+
+#### v1.1
+
+New API added:
+- Added an integration between `Package/Resource` and the `tableschema.Storage` - https://github.com/frictionlessdata/tableschema-py#storage. It allows loading and saving data packages from/to different storages like SQL/BigQuery/etc.
+
+
+
+%package -n python3-datapackage
+Summary: Utilities to work with Data Packages as defined on specs.frictionlessdata.io
+Provides: python-datapackage
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-datapackage
+# datapackage-py
+
+[![Travis](https://travis-ci.org/frictionlessdata/datapackage-py.svg?branch=master)](https://travis-ci.org/frictionlessdata/datapackage-py)
+[![Coveralls](https://coveralls.io/repos/github/frictionlessdata/datapackage-py/badge.svg?branch=master)](https://coveralls.io/github/frictionlessdata/datapackage-py?branch=master)
+[![PyPi](https://img.shields.io/pypi/v/datapackage.svg)](https://pypi.python.org/pypi/datapackage)
+[![Github](https://img.shields.io/badge/github-master-brightgreen)](https://github.com/frictionlessdata/datapackage-py)
+[![Gitter](https://img.shields.io/gitter/room/frictionlessdata/chat.svg)](https://gitter.im/frictionlessdata/chat)
+
+A library for working with [Data Packages](http://specs.frictionlessdata.io/data-package/).
+
+> **[Important Notice]** We have released [Frictionless Framework](https://github.com/frictionlessdata/frictionless-py). This framework provides improved `datapackage` functionality extended to be a complete data solution. The change is not breaking for existing software, so no action is required. Please read the [Migration Guide](https://framework.frictionlessdata.io/docs/development/migration) from `datapackage` to Frictionless Framework.
+> - we continue to bug-fix `datapackage@1.x` in this [repository](https://github.com/frictionlessdata/datapackage-py), and it remains available on [PyPi](https://pypi.org/project/datapackage/) as before
+> - please note that the `frictionless@3.x` API, which we're working on at the moment, is not yet stable
+> - we will release `frictionless@4.x` by the end of 2020 to be the first SemVer/stable version
+
+## Features
+
+ - `Package` class for working with data packages
+ - `Resource` class for working with data resources
+ - `Profile` class for working with profiles
+ - `validate` function for validating data package descriptors
+ - `infer` function for inferring data package descriptors
+
+## Contents
+
+<!--TOC-->
+
+ - [Getting Started](#getting-started)
+ - [Installation](#installation)
+ - [Documentation](#documentation)
+ - [Introduction](#introduction)
+ - [Working with Package](#working-with-package)
+ - [Working with Resource](#working-with-resource)
+ - [Working with Group](#working-with-group)
+ - [Working with Profile](#working-with-profile)
+ - [Working with Foreign Keys](#working-with-foreign-keys)
+ - [Working with validate/infer](#working-with-validateinfer)
+ - [Frequently Asked Questions](#frequently-asked-questions)
+ - [API Reference](#api-reference)
+ - [`cli`](#cli)
+ - [`Package`](#package)
+ - [`Resource`](#resource)
+ - [`Group`](#group)
+ - [`Profile`](#profile)
+ - [`validate`](#validate)
+ - [`infer`](#infer)
+ - [`DataPackageException`](#datapackageexception)
+ - [`TableSchemaException`](#tableschemaexception)
+ - [`LoadError`](#loaderror)
+ - [`CastError`](#casterror)
+ - [`IntegrityError`](#integrityerror)
+ - [`RelationError`](#relationerror)
+ - [`StorageError`](#storageerror)
+ - [Contributing](#contributing)
+ - [Changelog](#changelog)
+
+<!--TOC-->
+
+## Getting Started
+
+### Installation
+
+The package uses semantic versioning, which means major versions could include breaking changes. It's highly recommended to specify a `datapackage` version range in your `setup/requirements` file, e.g. `datapackage>=1.0,<2.0`.
+
+```bash
+$ pip install datapackage
+```
+
+#### OSX 10.14+
+If you receive an error about the `cchardet` package when installing datapackage on Mac OSX 10.14 (Mojave) or higher, follow these steps:
+1. Make sure you have the latest Xcode by running the following in a terminal: `xcode-select --install`
+2. Then go to [https://developer.apple.com/download/more/](https://developer.apple.com/download/more/) and download the `command line tools`. Note, this requires an Apple ID.
+3. Then, in a terminal, run `open /Library/Developer/CommandLineTools/Packages/macOS_SDK_headers_for_macOS_10.14.pkg`
+You can read more about these steps in this [post](https://stackoverflow.com/questions/52509602/cant-compile-c-program-on-a-mac-after-upgrade-to-mojave).
+
+## Documentation
+
+### Introduction
+
+Let's start with a simple example:
+
+```python
+from datapackage import Package
+
+package = Package('datapackage.json')
+package.get_resource('resource').read()
+```
+
+### Working with Package
+
+A class for working with data packages. It provides various capabilities like loading local or remote data packages, inferring a data package descriptor, saving a data package descriptor, and more.
+
+Suppose we have some local CSV files in a `data` directory. Let's create a data package based on this data using the `Package` class:
+
+> data/cities.csv
+
+```csv
+city,location
+london,"51.50,-0.11"
+paris,"48.85,2.30"
+rome,"41.89,12.51"
+```
+
+> data/population.csv
+
+```csv
+city,year,population
+london,2017,8780000
+paris,2017,2240000
+rome,2017,2860000
+```
+
+First we create a blank data package:
+
+```python
+package = Package()
+```
+
+Now we're ready to infer a data package descriptor based on data files we have. Because we have two csv files we use glob pattern `**/*.csv`:
+
+```python
+package.infer('**/*.csv')
+package.descriptor
+#{ profile: 'tabular-data-package',
+# resources:
+# [ { path: 'data/cities.csv',
+# profile: 'tabular-data-resource',
+# encoding: 'utf-8',
+# name: 'cities',
+# format: 'csv',
+# mediatype: 'text/csv',
+# schema: [Object] },
+# { path: 'data/population.csv',
+# profile: 'tabular-data-resource',
+# encoding: 'utf-8',
+# name: 'population',
+# format: 'csv',
+# mediatype: 'text/csv',
+# schema: [Object] } ] }
+```
+
+The `infer` method has found all our files and inspected them to extract useful metadata like profile, encoding, format, Table Schema, etc. Let's tweak it a little bit:
+
+```python
+package.descriptor['resources'][1]['schema']['fields'][1]['type'] = 'year'
+package.commit()
+package.valid # true
+```
+
+Because our resources are tabular, we can read them as tabular data:
+
+```python
+package.get_resource('population').read(keyed=True)
+#[ { city: 'london', year: 2017, population: 8780000 },
+# { city: 'paris', year: 2017, population: 2240000 },
+# { city: 'rome', year: 2017, population: 2860000 } ]
+```
+
+Let's save our data package to disk as a zip file:
+
+```python
+package.save('datapackage.zip')
+```
+
+To continue working with the data package, we just load it again, but this time using the local `datapackage.zip`:
+
+```python
+package = Package('datapackage.zip')
+# Continue the work
+```
+
+That was only a basic introduction to the `Package` class. To learn more, take a look at the `Package` class API reference.
+
+### Working with Resource
+
+A class for working with data resources. You can read or iterate tabular resources using the `iter/read` methods, and read any resource as bytes using the `raw_iter/raw_read` methods.
+
+Suppose we have a local CSV file. It could just as well be inline data or a remote link - all supported by the `Resource` class (except local files for in-browser usage, of course). But say it's `data.csv` for now:
+
+```csv
+city,location
+london,"51.50,-0.11"
+paris,"48.85,2.30"
+rome,N/A
+```
+
+Let's create and read a resource. Because the resource is tabular, we can use the `resource.read` method with the `keyed` option to get an array of keyed rows:
+
+```python
+from datapackage import Resource
+
+resource = Resource({'path': 'data.csv'})
+resource.tabular # true
+resource.read(keyed=True)
+# [
+# {city: 'london', location: '51.50,-0.11'},
+# {city: 'paris', location: '48.85,2.30'},
+# {city: 'rome', location: 'N/A'},
+# ]
+resource.headers
+# ['city', 'location']
+# (reading has to be started first)
+```
+
+As we can see, our locations are just strings, but they should be geopoints. Also, Rome's location is not available, but it's just an `'N/A'` string instead of Python's `None`. First we have to infer the resource metadata:
+
+```python
+resource.infer()
+resource.descriptor
+#{ path: 'data.csv',
+# profile: 'tabular-data-resource',
+# encoding: 'utf-8',
+# name: 'data',
+# format: 'csv',
+# mediatype: 'text/csv',
+# schema: { fields: [ [Object], [Object] ], missingValues: [ '' ] } }
+resource.read(keyed=True)
+# Fails with a data validation error
+```
+
+Let's fix the not-available location. There is a `missingValues` property in the Table Schema specification. As a first try, we set `missingValues` to `N/A` in the resource descriptor's schema. The resource descriptor can be changed in-place, but all changes should be committed by `resource.commit()`:
+
+```python
+resource.descriptor['schema']['missingValues'] = 'N/A'
+resource.commit()
+resource.valid # False
+resource.errors
+# [<ValidationError: "'N/A' is not of type 'array'">]
+```
+
+As good citizens, we've decided to check our resource descriptor's validity. And it's not valid! We should use an array for the `missingValues` property. Also, don't forget to keep the empty string as a missing value:
+
+```python
+resource.descriptor['schema']['missingValues'] = ['', 'N/A']
+resource.commit()
+resource.valid # true
+```
+
+All good. It looks like we're ready to read our data again:
+
+```python
+resource.read(keyed=True)
+# [
+# {city: 'london', location: [51.50,-0.11]},
+# {city: 'paris', location: [48.85,2.30]},
+# {city: 'rome', location: None},
+# ]
+```
+
+Now we see that:
+- locations are arrays with numeric latitude and longitude
+- Rome's location is a native Python `None`
+
+And because there are no errors on data reading, we can be sure that our data is valid against our schema. Let's save our resource descriptor:
+
+```python
+resource.save('dataresource.json')
+```
+
+Let's check the newly-created `dataresource.json`. It contains the path to our data file, the inferred metadata, and our `missingValues` tweak:
+
+```json
+{
+ "path": "data.csv",
+ "profile": "tabular-data-resource",
+ "encoding": "utf-8",
+ "name": "data",
+ "format": "csv",
+ "mediatype": "text/csv",
+ "schema": {
+ "fields": [
+ {
+ "name": "city",
+ "type": "string",
+ "format": "default"
+ },
+ {
+ "name": "location",
+ "type": "geopoint",
+ "format": "default"
+ }
+ ],
+ "missingValues": [
+ "",
+ "N/A"
+ ]
+ }
+}
+```
+
+If we decide to improve it even more, we can update the `dataresource.json` file and then open it again using the local file name:
+
+```python
+resource = Resource('dataresource.json')
+# Continue the work
+```
+
+That was only a basic introduction to the `Resource` class. To learn more, take a look at the `Resource` class API reference.
+
+### Working with Group
+
+A class representing a group of tabular resources. Groups can be used to read multiple resources as one, or to export them, for example, to a database as one table. To define a group, add a `group: <name>` field to the corresponding resources. The group's metadata will be created from the "leading" resource's metadata (the first resource with the group name).
+
+Consider we have a data package with two tables partitioned by a year and a shared schema stored separately:
+
+> cars-2017.csv
+
+```csv
+name,value
+bmw,2017
+tesla,2017
+nissan,2017
+```
+
+> cars-2018.csv
+
+```csv
+name,value
+bmw,2018
+tesla,2018
+nissan,2018
+```
+
+> cars.schema.json
+
+```json
+{
+ "fields": [
+ {
+ "name": "name",
+ "type": "string"
+ },
+ {
+ "name": "value",
+ "type": "integer"
+ }
+ ]
+}
+```
+
+> datapackage.json
+
+```json
+{
+ "name": "datapackage",
+ "resources": [
+ {
+ "group": "cars",
+ "name": "cars-2017",
+ "path": "cars-2017.csv",
+ "profile": "tabular-data-resource",
+ "schema": "cars.schema.json"
+ },
+ {
+ "group": "cars",
+ "name": "cars-2018",
+ "path": "cars-2018.csv",
+ "profile": "tabular-data-resource",
+ "schema": "cars.schema.json"
+ }
+ ]
+}
+```
+
+Let's read the resources separately:
+
+```python
+package = Package('datapackage.json')
+package.get_resource('cars-2017').read(keyed=True) == [
+ {'name': 'bmw', 'value': 2017},
+ {'name': 'tesla', 'value': 2017},
+ {'name': 'nissan', 'value': 2017},
+]
+package.get_resource('cars-2018').read(keyed=True) == [
+ {'name': 'bmw', 'value': 2018},
+ {'name': 'tesla', 'value': 2018},
+ {'name': 'nissan', 'value': 2018},
+]
+```
+
+On the other hand, these resources are defined with a `group: cars` field, which means we can treat them as a group:
+
+```python
+package = Package('datapackage.json')
+package.get_group('cars').read(keyed=True) == [
+ {'name': 'bmw', 'value': 2017},
+ {'name': 'tesla', 'value': 2017},
+ {'name': 'nissan', 'value': 2017},
+ {'name': 'bmw', 'value': 2018},
+ {'name': 'tesla', 'value': 2018},
+ {'name': 'nissan', 'value': 2018},
+]
+```
+
+We can use this approach when we need to save the data package to storage, for example, to a SQL database. There is a `merge_groups` flag to enable the grouping behaviour:
+
+```python
+from sqlalchemy import create_engine
+from datapackage import Package
+
+engine = create_engine('sqlite://')  # engine is a SQLAlchemy engine
+package = Package('datapackage.json')
+package.save(storage='sql', engine=engine)
+# SQL tables:
+# - cars-2017
+# - cars-2018
+package.save(storage='sql', engine=engine, merge_groups=True)
+# SQL tables:
+# - cars
+```
+
+### Working with Profile
+
+A component to represent JSON Schema profile from [Profiles Registry]( https://specs.frictionlessdata.io/schemas/registry.json):
+
+```python
+profile = Profile('data-package')
+
+profile.name # data-package
+profile.jsonschema # JSON Schema contents
+
+try:
+ valid = profile.validate(descriptor)
+except exceptions.ValidationError as exception:
+    for error in exception.errors:
+        pass  # handle individual error
+```
+
+### Working with Foreign Keys
+
+The library supports foreign keys described in the [Table Schema](http://specs.frictionlessdata.io/table-schema/#foreign-keys) specification. This means that if your data package descriptor uses the `resources[].schema.foreignKeys` property for some resources, data integrity will be checked on reading operations.
+
+Consider we have a data package:
+
+```python
+DESCRIPTOR = {
+ 'resources': [
+ {
+ 'name': 'teams',
+ 'data': [
+ ['id', 'name', 'city'],
+ ['1', 'Arsenal', 'London'],
+ ['2', 'Real', 'Madrid'],
+ ['3', 'Bayern', 'Munich'],
+ ],
+ 'schema': {
+ 'fields': [
+ {'name': 'id', 'type': 'integer'},
+ {'name': 'name', 'type': 'string'},
+ {'name': 'city', 'type': 'string'},
+ ],
+ 'foreignKeys': [
+ {
+ 'fields': 'city',
+ 'reference': {'resource': 'cities', 'fields': 'name'},
+ },
+ ],
+ },
+ }, {
+ 'name': 'cities',
+ 'data': [
+ ['name', 'country'],
+ ['London', 'England'],
+ ['Madrid', 'Spain'],
+ ],
+ },
+ ],
+}
+```
+
+Let's check relations for a `teams` resource:
+
+```python
+from datapackage import Package
+
+package = Package(DESCRIPTOR)
+teams = package.get_resource('teams')
+teams.check_relations()
+# tableschema.exceptions.RelationError: Foreign key "['city']" violation in row "4"
+```
+
+As we can see, there is a foreign key violation. That's because our lookup table `cities` doesn't have the city `Munich`, but we have a team from there. We need to fix that in the `cities` resource:
+
+```python
+package.descriptor['resources'][1]['data'].append(['Munich', 'Germany'])
+package.commit()
+teams = package.get_resource('teams')
+teams.check_relations()
+# True
+```
+
+Fixed! But a check operation is not all that's available. We can use the `relations` argument of the `resource.iter/read` methods to dereference a resource's relations:
+
+```python
+teams.read(keyed=True, relations=True)
+#[{'id': 1, 'name': 'Arsenal', 'city': {'name': 'London', 'country': 'England'}},
+# {'id': 2, 'name': 'Real', 'city': {'name': 'Madrid', 'country': 'Spain'}},
+# {'id': 3, 'name': 'Bayern', 'city': {'name': 'Munich', 'country': 'Germany'}}]
+```
+
+Instead of a plain city name, we get a dictionary containing the city's data. The `resource.iter/read` methods will fail with the same error as `resource.check_relations` if there is an integrity issue, but only if the `relations=True` flag is passed.
+
+### Working with validate/infer
+
+A standalone function to validate a data package descriptor:
+
+```python
+from datapackage import validate, exceptions
+
+try:
+ valid = validate(descriptor)
+except exceptions.ValidationError as exception:
+    for error in exception.errors:
+        pass  # handle individual error
+```
+
+A standalone function to infer a data package descriptor:
+
+```python
+from datapackage import infer
+
+descriptor = infer('**/*.csv')
+#{ profile: 'tabular-data-package',
+# resources:
+# [ { path: 'data/cities.csv',
+# profile: 'tabular-data-resource',
+# encoding: 'utf-8',
+# name: 'cities',
+# format: 'csv',
+# mediatype: 'text/csv',
+# schema: [Object] },
+# { path: 'data/population.csv',
+# profile: 'tabular-data-resource',
+# encoding: 'utf-8',
+# name: 'population',
+# format: 'csv',
+# mediatype: 'text/csv',
+# schema: [Object] } ] }
+```
+
+### Frequently Asked Questions
+
+#### Accessing data behind a proxy server?
+
+Before calling `package = Package("https://xxx.json")`, set these environment variables:
+
+```python
+import os
+
+os.environ["HTTP_PROXY"] = 'xxx'
+os.environ["HTTPS_PROXY"] = 'xxx'
+```
+
+## API Reference
+
+### `cli`
+```python
+cli()
+```
+Command-line interface
+
+```
+Usage: datapackage [OPTIONS] COMMAND [ARGS]...
+
+Options:
+ --version Show the version and exit.
+ --help Show this message and exit.
+
+Commands:
+ infer
+ validate
+```
+
+
+### `Package`
+```python
+Package(self,
+ descriptor=None,
+ base_path=None,
+ strict=False,
+ unsafe=False,
+ storage=None,
+ schema=None,
+ default_base_path=None,
+ **options)
+```
+Package representation
+
+__Arguments__
+- __descriptor (str/dict)__: data package descriptor as local path, url or object
+- __base_path (str)__: base path for all relative paths
+- __strict (bool)__: strict flag to alter validation behavior.
+ Setting it to `True` leads to throwing errors
+ on any operation with invalid descriptor
+- __unsafe (bool)__:
+    if `True` unsafe paths will be allowed. For more information see
+    https://specs.frictionlessdata.io/data-resource/#data-location.
+    Defaults to `False`
+- __storage (str/tableschema.Storage)__: storage name like `sql` or storage instance
+- __options (dict)__: storage options to use for storage creation
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
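+
+For illustration, a minimal sketch of the `strict` flag (the valid inline descriptor is borrowed from the `package.commit` example below; the invalid one is hypothetical):
+
+```python
+from datapackage import Package, exceptions
+
+# Default, lenient mode: problems are collected on package.errors
+package = Package({'resources': [{'name': 'resource', 'data': ['data']}]})
+print(package.valid)  # True
+
+# Strict mode: an invalid descriptor raises immediately
+try:
+    Package({'resources': 'not-a-list'}, strict=True)
+except exceptions.DataPackageException as error:
+    print(error)
+```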
+
+
+
+#### `package.base_path`
+Package's base path
+
+__Returns__
+
+`str/None`: returns the data package base path
+
+
+
+#### `package.descriptor`
+Package's descriptor
+
+__Returns__
+
+`dict`: descriptor
+
+
+
+#### `package.errors`
+Validation errors
+
+Always empty in strict mode.
+
+__Returns__
+
+`Exception[]`: validation errors
+
+
+
+#### `package.profile`
+Package's profile
+
+__Returns__
+
+`Profile`: an instance of `Profile` class
+
+
+
+#### `package.resource_names`
+Package's resource names
+
+__Returns__
+
+`str[]`: returns an array of resource names
+
+
+
+#### `package.resources`
+Package's resources
+
+__Returns__
+
+`Resource[]`: returns an array of `Resource` instances
+
+
+
+#### `package.valid`
+Validation status
+
+Always true in strict mode.
+
+__Returns__
+
+`bool`: validation status
+
+
+
+#### `package.get_resource`
+```python
+package.get_resource(name)
+```
+Get data package resource by name.
+
+__Arguments__
+- __name (str)__: data resource name
+
+__Returns__
+
+`Resource/None`: returns a `Resource` instance or `None` if not found
+
+
+
+#### `package.add_resource`
+```python
+package.add_resource(descriptor)
+```
+Add new resource to data package.
+
+The data package descriptor will be validated with newly added resource descriptor.
+
+__Arguments__
+- __descriptor (dict)__: data resource descriptor
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
+
+__Returns__
+
+`Resource/None`: returns the added `Resource` instance or `None` if not added
+
+
+
+#### `package.remove_resource`
+```python
+package.remove_resource(name)
+```
+Remove data package resource by name.
+
+The data package descriptor will be validated after resource descriptor removal.
+
+__Arguments__
+- __name (str)__: data resource name
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
+
+__Returns__
+
+`Resource/None`: returns the removed `Resource` instance or `None` if not found
+
+
+
+#### `package.get_group`
+```python
+package.get_group(name)
+```
+Returns a group of tabular resources by name.
+
+For more information about groups see [Group](#group).
+
+__Arguments__
+- __name (str)__: name of a group of resources
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
+
+__Returns__
+
+`Group/None`: returns a `Group` instance or `None` if not found
+
+
+
+#### `package.infer`
+```python
+package.infer(pattern=False)
+```
+Infer data package metadata.
+
+> Argument `pattern` works only for local files
+
+If `pattern` is not provided, only existing resources will be inferred
+(adding metadata like encoding, profile etc). If `pattern` is provided,
+new resources with file names matching the pattern will be added and inferred.
+It commits changes to the data package instance.
+
+__Arguments__
+- __pattern (str)__: glob pattern for new resources
+
+__Returns__
+
+`dict`: returns data package descriptor
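+
+A small sketch contrasting both modes, assuming csv files under the current directory as in the introduction sections:
+
+```python
+from datapackage import Package
+
+package = Package()
+# With a pattern: add and infer any resources matching the glob
+package.infer('**/*.csv')
+# Without a pattern (default): (re)infer metadata for already-added resources only
+package.infer()
+print(package.descriptor)
+```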
+
+
+
+#### `package.commit`
+```python
+package.commit(strict=None)
+```
+Update data package instance if there are in-place changes in the descriptor.
+
+__Example__
+
+
+```python
+package = Package({
+ 'name': 'package',
+ 'resources': [{'name': 'resource', 'data': ['data']}]
+})
+
+package.name # package
+package.descriptor['name'] = 'renamed-package'
+package.name # package
+package.commit()
+package.name # renamed-package
+```
+
+__Arguments__
+- __strict (bool)__: alter `strict` mode for further work
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
+
+__Returns__
+
+`bool`: returns true on success and false if not modified
+
+
+
+#### `package.save`
+```python
+package.save(target=None,
+ storage=None,
+ merge_groups=False,
+ to_base_path=False,
+ **options)
+```
+Saves this data package.
+
+It saves to storage if the `storage` argument is passed, saves this data
+package's descriptor to a json file if the `target` argument ends with
+`.json`, or saves this data package to a zip file otherwise.
+
+__Example__
+
+
+Saving to a zip file creates an archive at `target` with the contents
+of this Data Package and its resources. Every resource whose content
+lives in the local filesystem will be copied into the zip file.
+Consider the following Data Package descriptor:
+
+```json
+{
+ "name": "gdp",
+ "resources": [
+ {"name": "local", "format": "CSV", "path": "data.csv"},
+ {"name": "inline", "data": [4, 8, 15, 16, 23, 42]},
+ {"name": "remote", "url": "http://someplace.com/data.csv"}
+ ]
+}
+```
+
+The final structure of the zip file will be:
+
+```
+./datapackage.json
+./data/local.csv
+```
+
+The contents of `datapackage.json` will be the same as the package's
+`descriptor` property. The resources' file names are generated
+based on their `name` and `format` fields if they exist.
+If a resource has no `name`, `resource-X` will be used,
+where `X` is the index of the resource in the `resources` list (starting at zero).
+If a resource has a `format`, it'll be lowercased and appended to the `name`,
+becoming "`name.format`".
+
+__Arguments__
+- __target (string/filelike)__:
+ the file path or a file-like object where
+ the contents of this Data Package will be saved into.
+- __storage (str/tableschema.Storage)__:
+ storage name like `sql` or storage instance
+- __merge_groups (bool)__:
+    save all the group's tabular resources into one bucket
+ if a storage is provided (for example into one SQL table).
+ Read more about [Group](#group).
+- __to_base_path (bool)__:
+ save the package to the package's base path
+ using the "<base_path>/<target>" route
+- __options (dict)__:
+ storage options to use for storage creation
+
+__Raises__
+- `DataPackageException`: raises if there was some error writing the package
+
+__Returns__
+
+`bool/Storage`: on success return true or a `Storage` instance
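+
+A sketch of the three modes, assuming a local `datapackage.json` and an SQLAlchemy engine for the SQL storage (mirroring the group example above):
+
+```python
+from datapackage import Package
+from sqlalchemy import create_engine
+
+package = Package('datapackage.json')
+
+# target ends with .json -> only the descriptor is written
+package.save('datapackage.json')
+# any other target -> a zip archive with the descriptor and local data
+package.save('datapackage.zip')
+# storage -> tables are written to the storage; a Storage instance is returned
+storage = package.save(storage='sql', engine=create_engine('sqlite://'))
+```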
+
+### `Resource`
+```python
+Resource(self,
+ descriptor={},
+ base_path=None,
+ strict=False,
+ unsafe=False,
+ storage=None,
+ package=None,
+ **options)
+```
+Resource representation
+
+__Arguments__
+- __descriptor (str/dict)__: data resource descriptor as local path, url or object
+- __base_path (str)__: base path for all relative paths
+- __strict (bool)__:
+ strict flag to alter validation behavior. Setting it to `true`
+ leads to throwing errors on any operation with invalid descriptor
+- __unsafe (bool)__:
+    if `True` unsafe paths will be allowed. For more information see
+    https://specs.frictionlessdata.io/data-resource/#data-location.
+    Defaults to `False`
+- __storage (str/tableschema.Storage)__: storage name like `sql` or storage instance
+- __options (dict)__: storage options to use for storage creation
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
+
+
+
+#### `resource.data`
+Return resource data
+
+
+#### `resource.descriptor`
+Resource's descriptor
+
+__Returns__
+
+`dict`: descriptor
+
+
+
+#### `resource.errors`
+Validation errors
+
+Always empty in strict mode.
+
+__Returns__
+
+`Exception[]`: validation errors
+
+
+
+#### `resource.group`
+Group name
+
+__Returns__
+
+`str`: group name
+
+
+
+#### `resource.headers`
+Resource's headers
+
+> Only for tabular resources (reading has to be started first or it's `None`)
+
+__Returns__
+
+`str[]/None`: returns data source headers
+
+
+
+#### `resource.inline`
+Whether resource inline
+
+__Returns__
+
+`bool`: returns true if resource is inline
+
+
+
+#### `resource.local`
+Whether resource local
+
+__Returns__
+
+`bool`: returns true if resource is local
+
+
+
+#### `resource.multipart`
+Whether resource multipart
+
+__Returns__
+
+`bool`: returns true if resource is multipart
+
+
+
+#### `resource.name`
+Resource name
+
+__Returns__
+
+`str`: name
+
+
+
+#### `resource.package`
+Package instance if the resource belongs to some package
+
+__Returns__
+
+`Package/None`: a package instance if available
+
+
+
+#### `resource.profile`
+Resource's profile
+
+__Returns__
+
+`Profile`: an instance of `Profile` class
+
+
+
+#### `resource.remote`
+Whether resource remote
+
+__Returns__
+
+`bool`: returns true if resource is remote
+
+
+
+#### `resource.schema`
+Resource's schema
+
+> Only for tabular resources
+
+For tabular resources it returns `Schema` instance to interact with data schema.
+Read API documentation - [tableschema.Schema](https://github.com/frictionlessdata/tableschema-py#schema).
+
+__Returns__
+
+`tableschema.Schema`: schema
+
+
+
+#### `resource.source`
+Resource's source
+
+The combination of `resource.source` and `resource.inline/local/remote/multipart`
+provides a predictable interface for working with resource data.
+
+__Returns__
+
+`list/str`: returns `data` or `path` property
+
+
+
+#### `resource.table`
+Return resource table
+
+
+#### `resource.tabular`
+Whether resource tabular
+
+__Returns__
+
+`bool`: returns true if resource is tabular
+
+
+
+#### `resource.valid`
+Validation status
+
+Always true in strict mode.
+
+__Returns__
+
+`bool`: validation status
+
+
+
+#### `resource.iter`
+```python
+resource.iter(integrity=False, relations=False, **options)
+```
+Iterates through the resource data and emits rows cast based on table schema.
+
+> Only for tabular resources
+
+__Arguments__
+
+
+ keyed (bool):
+ yield keyed rows in a form of `{header1: value1, header2: value2}`
+ (default is false; the form of rows is `[value1, value2]`)
+
+ extended (bool):
+        yield extended rows in a form of `[rowNumber, [header1, header2], [value1, value2]]`
+ (default is false; the form of rows is `[value1, value2]`)
+
+ cast (bool):
+ disable data casting if false
+ (default is true)
+
+ integrity (bool):
+ if true actual size in BYTES and SHA256 hash of the file
+ will be checked against `descriptor.bytes` and `descriptor.hash`
+ (other hashing algorithms are not supported and will be skipped silently)
+
+ relations (bool):
+ if true foreign key fields will be checked and resolved to its references
+
+ foreign_keys_values (dict):
+ three-level dictionary of foreign key references optimized
+ to speed up validation process in a form of
+ `{resource1: {(fk_field1, fk_field2): {(value1, value2): {one_keyedrow}, ... }}}`.
+ If not provided but relations is true, it will be created
+ before the validation process by *index_foreign_keys_values* method
+
+ exc_handler (func):
+ optional custom exception handler callable.
+ Can be used to defer raising errors (i.e. "fail late"), e.g.
+ for data validation purposes. Must support the signature below
+
+__Custom exception handler__
+
+
+```python
+def exc_handler(exc, row_number=None, row_data=None, error_data=None):
+ '''Custom exception handler (example)
+
+ # Arguments:
+ exc(Exception):
+ Deferred exception instance
+ row_number(int):
+ Data row number that triggers exception exc
+ row_data(OrderedDict):
+ Invalid data row source data
+ error_data(OrderedDict):
+ Data row source data field subset responsible for the error, if
+ applicable (e.g. invalid primary or foreign key fields). May be
+ identical to row_data.
+ '''
+ # ...
+```
+
+__Raises__
+- `DataPackageException`: base class of any error
+- `CastError`: data cast error
+- `IntegrityError`: integrity checking error
+- `UniqueKeyError`: unique key constraint violation
+- `UnresolvedFKError`: unresolved foreign key reference error
+
+__Returns__
+
+`Iterator[list]`: yields rows
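+
+For example, a minimal keyed iteration over a hypothetical inline resource (a sketch; `infer` is called first so that a schema exists and values can be cast):
+
+```python
+from datapackage import Resource
+
+resource = Resource({'name': 'cities', 'data': [
+    ['city', 'population'],
+    ['london', '8780000'],
+    ['paris', '2240000'],
+]})
+resource.infer()  # infer a schema so values are cast on iteration
+for row in resource.iter(keyed=True):
+    print(row)  # e.g. {'city': 'london', 'population': 8780000}
+```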
+
+
+
+#### `resource.read`
+```python
+resource.read(integrity=False,
+ relations=False,
+ foreign_keys_values=False,
+ **options)
+```
+Read the whole resource and return as array of rows
+
+> Only for tabular resources
+> It has the same API as `resource.iter` except for the following argument:
+
+__Arguments__
+- __limit (int)__: limit count of rows to read and return
+
+__Returns__
+
+`list[]`: returns rows
+
+
+
+#### `resource.check_integrity`
+```python
+resource.check_integrity()
+```
+Checks resource integrity
+
+> Only for tabular resources
+
+It checks size in BYTES and SHA256 hash of the file
+against `descriptor.bytes` and `descriptor.hash`
+(other hashing algorithms are not supported and will be skipped silently).
+
+__Raises__
+- `exceptions.IntegrityError`: raises if there are integrity issues
+
+__Returns__
+
+`bool`: returns True if no issues
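+
+A sketch of a failing check, assuming a local `data.csv` whose real size differs from the deliberately wrong `bytes` value declared below:
+
+```python
+from datapackage import Resource, exceptions
+
+# A descriptor that declares an intentionally wrong size (hypothetical file)
+resource = Resource({'name': 'data', 'path': 'data.csv', 'bytes': 1})
+try:
+    resource.check_integrity()
+except exceptions.IntegrityError as error:
+    print(error)  # reports the size mismatch
+```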
+
+
+
+#### `resource.check_relations`
+```python
+resource.check_relations(foreign_keys_values=False)
+```
+Check relations
+
+> Only for tabular resources
+
+It checks foreign keys and raises an exception if there are integrity issues.
+
+__Raises__
+- `exceptions.RelationError`: raises if there are relation issues
+
+__Returns__
+
+`bool`: returns True if no issues
+
+
+
+#### `resource.drop_relations`
+```python
+resource.drop_relations()
+```
+Drop relations
+
+> Only for tabular resources
+
+Remove relations data from memory
+
+__Returns__
+
+`bool`: returns True
+
+
+
+#### `resource.raw_iter`
+```python
+resource.raw_iter(stream=False)
+```
+Iterate over data chunks as bytes.
+
+If `stream` is true, a file-like object will be returned.
+
+__Arguments__
+- __stream (bool)__: return a file-like object instead of an iterator of byte chunks
+
+__Returns__
+
+`bytes[]/filelike`: returns bytes[]/filelike
+
+
+
+#### `resource.raw_read`
+```python
+resource.raw_read()
+```
+Returns resource data as bytes.
+
+__Returns__
+
+`bytes`: returns resource data in bytes
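+
+As an example, raw reading can be used to hash a resource yourself (a sketch, assuming a local `data.csv`):
+
+```python
+import hashlib
+from datapackage import Resource
+
+resource = Resource({'name': 'data', 'path': 'data.csv'})
+
+# Whole file at once
+digest = hashlib.sha256(resource.raw_read()).hexdigest()
+
+# Or chunk by chunk, without loading everything into memory
+hasher = hashlib.sha256()
+for chunk in resource.raw_iter():
+    hasher.update(chunk)
+assert hasher.hexdigest() == digest
+```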
+
+
+
+#### `resource.infer`
+```python
+resource.infer(**options)
+```
+Infer resource metadata
+
+Like name, format, mediatype, encoding, schema and profile.
+It commits these changes to the resource instance.
+
+__Arguments__
+- __options__:
+ options will be passed to `tableschema.infer` call,
+ for more control on results (e.g. for setting `limit`, `confidence` etc.).
+
+__Returns__
+
+`dict`: returns resource descriptor
+
+
+
+#### `resource.commit`
+```python
+resource.commit(strict=None)
+```
+Update resource instance if there are in-place changes in the descriptor.
+
+__Arguments__
+- __strict (bool)__: alter `strict` mode for further work
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
+
+__Returns__
+
+`bool`: returns true on success and false if not modified
+
+
+
+#### `resource.save`
+```python
+resource.save(target, storage=None, to_base_path=False, **options)
+```
+Saves this resource
+
+It saves to storage if the `storage` argument is passed, or
+saves this resource's descriptor to a json file otherwise.
+
+__Arguments__
+- __target (str)__:
+ path where to save a resource
+- __storage (str/tableschema.Storage)__:
+ storage name like `sql` or storage instance
+- __to_base_path (bool)__:
+ save the resource to the resource's base path
+ using the "<base_path>/<target>" route
+- __options (dict)__:
+ storage options to use for storage creation
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
+
+__Returns__
+
+`bool`: returns true on success
+
+### `Group`
+```python
+Group(self, resources)
+```
+Group representation
+
+__Arguments__
+- __resources (Resource[])__: list of tabular resources
+
+
+
+#### `group.headers`
+Group's headers
+
+__Returns__
+
+`str[]/None`: returns headers
+
+
+
+#### `group.name`
+Group name
+
+__Returns__
+
+`str`: name
+
+
+
+#### `group.schema`
+Group's schema
+
+__Returns__
+
+`tableschema.Schema`: schema
+
+
+
+#### `group.iter`
+```python
+group.iter(**options)
+```
+Iterates through the group data and emits rows cast based on table schema.
+
+> It concatenates all the resources and has the same API as `resource.iter`
+
+
+
+#### `group.read`
+```python
+group.read(limit=None, **options)
+```
+Read the whole group and return as array of rows
+
+> It concatenates all the resources and has the same API as `resource.read`
+
+
+
+#### `group.check_relations`
+```python
+group.check_relations()
+```
+Check group's relations
+
+The same as `resource.check_relations` but without the optional
+argument *foreign_keys_values*. This method tests the foreignKeys of the
+whole group at once, optimizing the process by creating the foreign_keys_values
+hashmap only once before testing the set of resources.
+
+
+### `Profile`
+```python
+Profile(self, profile)
+```
+Profile representation
+
+__Arguments__
+- __profile (str)__: profile name in registry or URL to JSON Schema
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
+
+
+
+#### `profile.jsonschema`
+JSONSchema content
+
+__Returns__
+
+`dict`: returns profile's JSON Schema contents
+
+
+
+#### `profile.name`
+Profile name
+
+__Returns__
+
+`str/None`: name if available
+
+
+
+#### `profile.validate`
+```python
+profile.validate(descriptor)
+```
+Validate a data package `descriptor` against the profile.
+
+__Arguments__
+- __descriptor (dict)__: retrieved and dereferenced data package descriptor
+
+__Raises__
+- `ValidationError`: raises if not valid
+
+__Returns__
+
+`bool`: returns True if valid
+
+
+### `validate`
+```python
+validate(descriptor)
+```
+Validate a data package descriptor.
+
+__Arguments__
+- __descriptor (str/dict)__: package descriptor (one of):
+ - local path
+ - remote url
+ - object
+
+__Raises__
+- `ValidationError`: raises on invalid
+
+__Returns__
+
+`bool`: returns true on valid
+
+
+### `infer`
+```python
+infer(pattern, base_path=None)
+```
+Infer a data package descriptor.
+
+> Argument `pattern` works only for local files
+
+__Arguments__
+- __pattern (str)__: glob file pattern
+
+__Returns__
+
+`dict`: returns data package descriptor
+
+
+### `DataPackageException`
+```python
+DataPackageException(self, message, errors=[])
+```
+Base class for all DataPackage/TableSchema exceptions.
+
+If there are multiple errors, they can be read from the exception object:
+
+```python
+try:
+    pass  # lib action
+except DataPackageException as exception:
+    if exception.multiple:
+        for error in exception.errors:
+            pass  # handle error
+```
+
+
+
+#### `datapackageexception.errors`
+List of nested errors
+
+__Returns__
+
+`DataPackageException[]`: list of nested errors
+
+
+
+#### `datapackageexception.multiple`
+Whether it's a nested exception
+
+__Returns__
+
+`bool`: whether it's a nested exception
+
+
+
+### `TableSchemaException`
+```python
+TableSchemaException(self, message, errors=[])
+```
+Base class for all TableSchema exceptions.
+
+
+### `LoadError`
+```python
+LoadError(self, message, errors=[])
+```
+All loading errors.
+
+
+### `CastError`
+```python
+CastError(self, message, errors=[])
+```
+All value cast errors.
+
+
+### `IntegrityError`
+```python
+IntegrityError(self, message, errors=[])
+```
+All integrity errors.
+
+
+### `RelationError`
+```python
+RelationError(self, message, errors=[])
+```
+All relations errors.
+
+
+### `StorageError`
+```python
+StorageError(self, message, errors=[])
+```
+All storage errors.
+
+
+## Contributing
+
+> The project follows the [Open Knowledge International coding standards](https://github.com/okfn/coding-standards).
+
+Recommended way to get started is to create and activate a project virtual environment.
+To install package and development dependencies into active environment:
+
+```bash
+$ make install
+```
+
+To run tests with linting and coverage:
+
+```bash
+$ make test
+```
+
+## Changelog
+
+Only breaking and the most important changes are described here. The full changelog and documentation for all released versions can be found in the nicely formatted [commit history](https://github.com/frictionlessdata/datapackage-py/commits/master).
+
+#### v1.15
+
+> WARNING: it can be breaking for some setups, please read the discussions below
+
+- Fixed header management according to the specs:
+ - https://github.com/frictionlessdata/datapackage-py/pull/257
+ - https://github.com/frictionlessdata/datapackage-py/issues/256
+ - https://github.com/frictionlessdata/forum/issues/1
+
+#### v1.14
+
+- Add experimental options for picking/skipping fields/rows
+
+#### v1.13
+
+- Add `unsafe` option to Package and Resource (#262)
+
+#### v1.12
+
+- Use `chardet` for encoding detection by default. For `cchardet`: `pip install datapackage[cchardet]`
+
+#### v1.11
+
+- `resource/package.save` now accept a `to_base_path` argument (#254)
+- `package.save` now returns a `Storage` instance if available
+
+#### v1.10
+
+- Added an ability to check tabular resource's integrity
+
+#### v1.9
+
+- Added `resource.package` property
+
+#### v1.8
+
+- Added support for [groups of resources](#group)
+
+#### v1.7
+
+- Added support for [compression of resources](https://frictionlessdata.io/specs/patterns/#compression-of-resources)
+
+#### v1.6
+
+- Added support for custom request session
+
+#### v1.5
+
+Updated behaviour:
+- Added support for Python 3.7
+
+#### v1.4
+
+New API added:
+- added `skip_rows` support to the resource descriptor
+
+#### v1.3
+
+New API added:
+- property `package.base_path` is now publicly available
+
+#### v1.2
+
+Updated behaviour:
+- CLI command `$ datapackage infer` now outputs only a JSON-formatted data package descriptor.
+
+#### v1.1
+
+New API added:
+- Added an integration between `Package/Resource` and the `tableschema.Storage` - https://github.com/frictionlessdata/tableschema-py#storage. It allows loading and saving data packages from/to different storages like SQL/BigQuery/etc.
+
+
+
+%package help
+Summary: Development documents and examples for datapackage
+Provides: python3-datapackage-doc
+%description help
+# datapackage-py
+
+[![Travis](https://travis-ci.org/frictionlessdata/datapackage-py.svg?branch=master)](https://travis-ci.org/frictionlessdata/datapackage-py)
+[![Coveralls](https://coveralls.io/repos/github/frictionlessdata/datapackage-py/badge.svg?branch=master)](https://coveralls.io/github/frictionlessdata/datapackage-py?branch=master)
+[![PyPi](https://img.shields.io/pypi/v/datapackage.svg)](https://pypi.python.org/pypi/datapackage)
+[![Github](https://img.shields.io/badge/github-master-brightgreen)](https://github.com/frictionlessdata/datapackage-py)
+[![Gitter](https://img.shields.io/gitter/room/frictionlessdata/chat.svg)](https://gitter.im/frictionlessdata/chat)
+
+A library for working with [Data Packages](http://specs.frictionlessdata.io/data-package/).
+
+> **[Important Notice]** We have released [Frictionless Framework](https://github.com/frictionlessdata/frictionless-py). This framework provides improved `datapackage` functionality extended to be a complete data solution. The change is not breaking for existing software, so no actions are required. Please read the [Migration Guide](https://framework.frictionlessdata.io/docs/development/migration) from `datapackage` to Frictionless Framework.
+> - we continue to bug-fix `datapackage@1.x` in this [repository](https://github.com/frictionlessdata/datapackage-py), and it remains available on [PyPi](https://pypi.org/project/datapackage/) as before
+> - please note that the `frictionless@3.x` API, which we're working on at the moment, is not yet stable
+> - we will release `frictionless@4.x` by the end of 2020 as the first SemVer/stable version
+
+## Features
+
+ - `Package` class for working with data packages
+ - `Resource` class for working with data resources
+ - `Profile` class for working with profiles
+ - `validate` function for validating data package descriptors
+ - `infer` function for inferring data package descriptors
+
+## Contents
+
+<!--TOC-->
+
+ - [Getting Started](#getting-started)
+ - [Installation](#installation)
+ - [Documentation](#documentation)
+ - [Introduction](#introduction)
+ - [Working with Package](#working-with-package)
+ - [Working with Resource](#working-with-resource)
+ - [Working with Group](#working-with-group)
+ - [Working with Profile](#working-with-profile)
+ - [Working with Foreign Keys](#working-with-foreign-keys)
+ - [Working with validate/infer](#working-with-validateinfer)
+ - [Frequently Asked Questions](#frequently-asked-questions)
+ - [API Reference](#api-reference)
+ - [`cli`](#cli)
+ - [`Package`](#package)
+ - [`Resource`](#resource)
+ - [`Group`](#group)
+ - [`Profile`](#profile)
+ - [`validate`](#validate)
+ - [`infer`](#infer)
+ - [`DataPackageException`](#datapackageexception)
+ - [`TableSchemaException`](#tableschemaexception)
+ - [`LoadError`](#loaderror)
+ - [`CastError`](#casterror)
+ - [`IntegrityError`](#integrityerror)
+ - [`RelationError`](#relationerror)
+ - [`StorageError`](#storageerror)
+ - [Contributing](#contributing)
+ - [Changelog](#changelog)
+
+<!--TOC-->
+
+## Getting Started
+
+### Installation
+
+The package uses semantic versioning, which means that major versions could include breaking changes. It's highly recommended to specify a `datapackage` version range in your `setup/requirements` file, e.g. `datapackage>=1.0,<2.0`.
+
+```bash
+$ pip install datapackage
+```
+
+#### OSX 10.14+
+If you receive an error about the `cchardet` package when installing datapackage on Mac OSX 10.14 (Mojave) or higher, follow these steps:
+1. Make sure you have the latest Xcode by running the following in a terminal: `xcode-select --install`
+2. Then go to [https://developer.apple.com/download/more/](https://developer.apple.com/download/more/) and download the `command line tools`. Note, this requires an Apple ID.
+3. Then, in terminal, run `open /Library/Developer/CommandLineTools/Packages/macOS_SDK_headers_for_macOS_10.14.pkg`
+You can read more about these steps in this [post](https://stackoverflow.com/questions/52509602/cant-compile-c-program-on-a-mac-after-upgrade-to-mojave).
+
+## Documentation
+
+### Introduction
+
+Let's start with a simple example:
+
+```python
+from datapackage import Package
+
+package = Package('datapackage.json')
+package.get_resource('resource').read()
+```
+
+### Working with Package
+
+A class for working with data packages. It provides various capabilities like loading a local or remote data package, inferring a data package descriptor, saving a data package descriptor, and many more.
+
+Suppose we have some local csv files in a `data` directory. Let's create a data package based on this data using the `Package` class:
+
+> data/cities.csv
+
+```csv
+city,location
+london,"51.50,-0.11"
+paris,"48.85,2.30"
+rome,"41.89,12.51"
+```
+
+> data/population.csv
+
+```csv
+city,year,population
+london,2017,8780000
+paris,2017,2240000
+rome,2017,2860000
+```
+
+First we create a blank data package:
+
+```python
+package = Package()
+```
+
+Now we're ready to infer a data package descriptor based on the data files we have. Because we have two csv files, we use the glob pattern `**/*.csv`:
+
+```python
+package.infer('**/*.csv')
+package.descriptor
+#{ profile: 'tabular-data-package',
+# resources:
+# [ { path: 'data/cities.csv',
+# profile: 'tabular-data-resource',
+# encoding: 'utf-8',
+# name: 'cities',
+# format: 'csv',
+# mediatype: 'text/csv',
+# schema: [Object] },
+# { path: 'data/population.csv',
+# profile: 'tabular-data-resource',
+# encoding: 'utf-8',
+# name: 'population',
+# format: 'csv',
+# mediatype: 'text/csv',
+# schema: [Object] } ] }
+```
+
+The `infer` method has found all our files and inspected them to extract useful metadata like profile, encoding, format, Table Schema etc. Let's tweak it a little bit:
+
+```python
+package.descriptor['resources'][1]['schema']['fields'][1]['type'] = 'year'
+package.commit()
+package.valid # true
+```
+
+Because our resources are tabular, we can read them as tabular data:
+
+```python
+package.get_resource('population').read(keyed=True)
+#[ { city: 'london', year: 2017, population: 8780000 },
+# { city: 'paris', year: 2017, population: 2240000 },
+# { city: 'rome', year: 2017, population: 2860000 } ]
+```
+
+Let's save our data package to disk as a zip file:
+
+```python
+package.save('datapackage.zip')
+```
+
+To continue working with the data package we just load it again, but this time using the local `datapackage.zip`:
+
+```python
+package = Package('datapackage.zip')
+# Continue the work
+```
+
+That was only a basic introduction to the `Package` class. To learn more, take a look at the `Package` class API reference.
+
+### Working with Resource
+
+A class for working with data resources. You can read or iterate tabular resources using the `iter/read` methods, and read any resource as bytes using the `raw_iter/raw_read` methods.
+
+Suppose we have a local csv file. Inline data or a remote link would work as well - all are supported by the `Resource` class (except local files for in-browser usage, of course). But say it's `data.csv` for now:
+
+```csv
+city,location
+london,"51.50,-0.11"
+paris,"48.85,2.30"
+rome,N/A
+```
+
+Let's create and read a resource. Because the resource is tabular, we can use the `resource.read` method with the `keyed` option to get an array of keyed rows:
+
+```python
+from datapackage import Resource
+
+resource = Resource({'path': 'data.csv'})
+resource.tabular # true
+resource.read(keyed=True)
+# [
+# {city: 'london', location: '51.50,-0.11'},
+# {city: 'paris', location: '48.85,2.30'},
+# {city: 'rome', location: 'N/A'},
+# ]
+resource.headers
+# ['city', 'location']
+# (reading has to be started first)
+```
+
+As we can see, our locations are just strings, but they should be geopoints. Also, Rome's location is not available, yet it's just an `N/A` string instead of Python `None`. First we have to infer the resource metadata:
+
+```python
+resource.infer()
+resource.descriptor
+#{ path: 'data.csv',
+# profile: 'tabular-data-resource',
+# encoding: 'utf-8',
+# name: 'data',
+# format: 'csv',
+# mediatype: 'text/csv',
+# schema: { fields: [ [Object], [Object] ], missingValues: [ '' ] } }
+resource.read(keyed=True)
+# Fails with a data validation error
+```
+
+Let's fix the not-available location. There is a `missingValues` property in the Table Schema specification. As a first try we set `missingValues` to `N/A` in `resource.descriptor.schema`. The resource descriptor can be changed in-place, but all changes should be committed with `resource.commit()`:
+
+```python
+resource.descriptor['schema']['missingValues'] = 'N/A'
+resource.commit()
+resource.valid # False
+resource.errors
+# [<ValidationError: "'N/A' is not of type 'array'">]
+```
+
+As good citizens, we've decided to check our resource descriptor's validity. And it's not valid! We should use an array for the `missingValues` property. Also, don't forget to include an empty string as a missing value:
+
+```python
+resource.descriptor['schema']['missingValues'] = ['', 'N/A']
+resource.commit()
+resource.valid # true
+```
+
+All good. It looks like we're ready to read our data again:
+
+```python
+resource.read(keyed=True)
+# [
+# {city: 'london', location: [51.50,-0.11]},
+# {city: 'paris', location: [48.85,2.30]},
+#   {city: 'rome', location: None},
+# ]
+```
+
+Now we see that:
+- locations are arrays with numeric latitude and longitude
+- Rome's location is a native Python `None`
+
+And because there are no errors on data reading, we can be sure that our data is valid against our schema. Let's save our resource descriptor:
+
+```python
+resource.save('dataresource.json')
+```
+
+Let's check the newly created `dataresource.json`. It contains the path to our data file, inferred metadata, and our `missingValues` tweak:
+
+```json
+{
+ "path": "data.csv",
+ "profile": "tabular-data-resource",
+ "encoding": "utf-8",
+ "name": "data",
+ "format": "csv",
+ "mediatype": "text/csv",
+ "schema": {
+ "fields": [
+ {
+ "name": "city",
+ "type": "string",
+ "format": "default"
+ },
+ {
+ "name": "location",
+ "type": "geopoint",
+ "format": "default"
+ }
+ ],
+ "missingValues": [
+ "",
+ "N/A"
+ ]
+ }
+}
+```
+
+If we decide to improve it even more, we can update the `dataresource.json` file and then open it again using the local file name:
+
+```python
+resource = Resource('dataresource.json')
+# Continue the work
+```
+
+That was only a basic introduction to the `Resource` class. To learn more, take a look at the `Resource` class API reference.
+
+### Working with Group
+
+A class representing a group of tabular resources. Groups can be used to read multiple resources as one, or to export them, for example, to a database as one table. To define a group, add a `group: <name>` field to the corresponding resources. The group's metadata will be created from the "leading" resource's metadata (the first resource with the group name).
+
+Suppose we have a data package with two tables partitioned by year and a shared schema stored separately:
+
+> cars-2017.csv
+
+```csv
+name,value
+bmw,2017
+tesla,2017
+nissan,2017
+```
+
+> cars-2018.csv
+
+```csv
+name,value
+bmw,2018
+tesla,2018
+nissan,2018
+```
+
+> cars.schema.json
+
+```json
+{
+ "fields": [
+ {
+ "name": "name",
+ "type": "string"
+ },
+ {
+ "name": "value",
+ "type": "integer"
+ }
+ ]
+}
+```
+
+> datapackage.json
+
+```json
+{
+ "name": "datapackage",
+ "resources": [
+ {
+ "group": "cars",
+ "name": "cars-2017",
+ "path": "cars-2017.csv",
+ "profile": "tabular-data-resource",
+ "schema": "cars.schema.json"
+ },
+ {
+ "group": "cars",
+ "name": "cars-2018",
+ "path": "cars-2018.csv",
+ "profile": "tabular-data-resource",
+ "schema": "cars.schema.json"
+ }
+ ]
+}
+```
+
+Let's read the resources separately:
+
+```python
+package = Package('datapackage.json')
+package.get_resource('cars-2017').read(keyed=True) == [
+ {'name': 'bmw', 'value': 2017},
+ {'name': 'tesla', 'value': 2017},
+ {'name': 'nissan', 'value': 2017},
+]
+package.get_resource('cars-2018').read(keyed=True) == [
+ {'name': 'bmw', 'value': 2018},
+ {'name': 'tesla', 'value': 2018},
+ {'name': 'nissan', 'value': 2018},
+]
+```
+
+On the other hand, these resources are defined with a `group: cars` field. That means we can treat them as a group:
+
+```python
+package = Package('datapackage.json')
+package.get_group('cars').read(keyed=True) == [
+ {'name': 'bmw', 'value': 2017},
+ {'name': 'tesla', 'value': 2017},
+ {'name': 'nissan', 'value': 2017},
+ {'name': 'bmw', 'value': 2018},
+ {'name': 'tesla', 'value': 2018},
+ {'name': 'nissan', 'value': 2018},
+]
+```
+
+We can use this approach when we need to save the data package to a storage, for example, to a SQL database. There is a `merge_groups` flag to enable the grouping behaviour:
+
+```python
+package = Package('datapackage.json')
+package.save(storage='sql', engine=engine)
+# SQL tables:
+# - cars-2017
+# - cars-2018
+package.save(storage='sql', engine=engine, merge_groups=True)
+# SQL tables:
+# - cars
+```
+
+### Working with Profile
+
+A component representing a JSON Schema profile from the [Profiles Registry](https://specs.frictionlessdata.io/schemas/registry.json):
+
+```python
+from datapackage import Profile, exceptions
+
+profile = Profile('data-package')
+
+profile.name # data-package
+profile.jsonschema # JSON Schema contents
+
+try:
+ valid = profile.validate(descriptor)
+except exceptions.ValidationError as exception:
+ for error in exception.errors:
+        pass  # handle individual error
+```
+
+### Working with Foreign Keys
+
+The library supports foreign keys described in the [Table Schema](http://specs.frictionlessdata.io/table-schema/#foreign-keys) specification. This means that if your data package descriptor uses the `resources[].schema.foreignKeys` property for some resources, data integrity will be checked on reading operations.
+
+Suppose we have a data package:
+
+```python
+DESCRIPTOR = {
+ 'resources': [
+ {
+ 'name': 'teams',
+ 'data': [
+ ['id', 'name', 'city'],
+ ['1', 'Arsenal', 'London'],
+ ['2', 'Real', 'Madrid'],
+ ['3', 'Bayern', 'Munich'],
+ ],
+ 'schema': {
+ 'fields': [
+ {'name': 'id', 'type': 'integer'},
+ {'name': 'name', 'type': 'string'},
+ {'name': 'city', 'type': 'string'},
+ ],
+ 'foreignKeys': [
+ {
+ 'fields': 'city',
+ 'reference': {'resource': 'cities', 'fields': 'name'},
+ },
+ ],
+ },
+ }, {
+ 'name': 'cities',
+ 'data': [
+ ['name', 'country'],
+ ['London', 'England'],
+ ['Madrid', 'Spain'],
+ ],
+ },
+ ],
+}
+```
+
+Let's check relations for a `teams` resource:
+
+```python
+from datapackage import Package
+
+package = Package(DESCRIPTOR)
+teams = package.get_resource('teams')
+teams.check_relations()
+# tableschema.exceptions.RelationError: Foreign key "['city']" violation in row "4"
+```
+
+As we can see, there is a foreign key violation: our lookup table `cities` doesn't contain the city `Munich`, but we have a team from there. We need to fix the `cities` resource:
+
+```python
+package.descriptor['resources'][1]['data'].append(['Munich', 'Germany'])
+package.commit()
+teams = package.get_resource('teams')
+teams.check_relations()
+# True
+```
+
+Fixed! And checking is not the only operation available: we can pass the `relations` argument to the `resource.iter/read` methods to dereference a resource's relations:
+
+```python
+teams.read(keyed=True, relations=True)
+#[{'id': 1, 'name': 'Arsenal', 'city': {'name': 'London', 'country': 'England'}},
+# {'id': 2, 'name': 'Real', 'city': {'name': 'Madrid', 'country': 'Spain'}},
+# {'id': 3, 'name': 'Bayern', 'city': {'name': 'Munich', 'country': 'Germany'}}]
+```
+
+Instead of a plain city name we get a dictionary containing the city data. If there is an integrity issue, the `resource.iter/read` methods fail with the same error as `resource.check_relations`, but only when the `relations=True` flag is passed.
+
+### Working with validate/infer
+
+A standalone function to validate a data package descriptor:
+
+```python
+from datapackage import validate, exceptions
+
+try:
+ valid = validate(descriptor)
+except exceptions.ValidationError as exception:
+ for error in exception.errors:
+        pass  # handle individual error
+```
+
+A standalone function to infer a data package descriptor:
+
+```python
+from datapackage import infer
+
+descriptor = infer('**/*.csv')
+#{ profile: 'tabular-data-package',
+# resources:
+# [ { path: 'data/cities.csv',
+# profile: 'tabular-data-resource',
+# encoding: 'utf-8',
+# name: 'cities',
+# format: 'csv',
+# mediatype: 'text/csv',
+# schema: [Object] },
+# { path: 'data/population.csv',
+# profile: 'tabular-data-resource',
+# encoding: 'utf-8',
+# name: 'population',
+# format: 'csv',
+# mediatype: 'text/csv',
+# schema: [Object] } ] }
+```
+
+### Frequently Asked Questions
+
+#### Accessing data behind a proxy server?
+
+Before calling `package = Package("https://xxx.json")`, set these environment variables:
+
+```python
+import os
+
+os.environ["HTTP_PROXY"] = 'xxx'
+os.environ["HTTPS_PROXY"] = 'xxx'
+```
+
+## API Reference
+
+### `cli`
+```python
+cli()
+```
+Command-line interface
+
+```
+Usage: datapackage [OPTIONS] COMMAND [ARGS]...
+
+Options:
+ --version Show the version and exit.
+ --help Show this message and exit.
+
+Commands:
+ infer
+ validate
+```
+
+
+### `Package`
+```python
+Package(self,
+ descriptor=None,
+ base_path=None,
+ strict=False,
+ unsafe=False,
+ storage=None,
+ schema=None,
+ default_base_path=None,
+ **options)
+```
+Package representation
+
+__Arguments__
+- __descriptor (str/dict)__: data package descriptor as local path, url or object
+- __base_path (str)__: base path for all relative paths
+- __strict (bool)__: strict flag to alter validation behavior.
+ Setting it to `True` leads to throwing errors
+ on any operation with invalid descriptor
+- __unsafe (bool)__:
+    if `True` unsafe paths will be allowed. For more information see
+    https://specs.frictionlessdata.io/data-resource/#data-location.
+    Defaults to `False`
+- __storage (str/tableschema.Storage)__: storage name like `sql` or storage instance
+- __options (dict)__: storage options to use for storage creation
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
+
+
+
+#### `package.base_path`
+Package's base path
+
+__Returns__
+
+`str/None`: returns the data package base path
+
+
+
+#### `package.descriptor`
+Package's descriptor
+
+__Returns__
+
+`dict`: descriptor
+
+
+
+#### `package.errors`
+Validation errors
+
+Always empty in strict mode.
+
+__Returns__
+
+`Exception[]`: validation errors
+
+
+
+#### `package.profile`
+Package's profile
+
+__Returns__
+
+`Profile`: an instance of `Profile` class
+
+
+
+#### `package.resource_names`
+Package's resource names
+
+__Returns__
+
+`str[]`: returns an array of resource names
+
+
+
+#### `package.resources`
+Package's resources
+
+__Returns__
+
+`Resource[]`: returns an array of `Resource` instances
+
+
+
+#### `package.valid`
+Validation status
+
+Always true in strict mode.
+
+__Returns__
+
+`bool`: validation status
+
+
+
+#### `package.get_resource`
+```python
+package.get_resource(name)
+```
+Get data package resource by name.
+
+__Arguments__
+- __name (str)__: data resource name
+
+__Returns__
+
+`Resource/None`: returns a `Resource` instance or `None` if not found
+
+
+
+#### `package.add_resource`
+```python
+package.add_resource(descriptor)
+```
+Add new resource to data package.
+
+The data package descriptor will be validated with newly added resource descriptor.
+
+__Arguments__
+- __descriptor (dict)__: data resource descriptor
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
+
+__Returns__
+
+`Resource/None`: returns the added `Resource` instance or `None` if not added
+
+
+
+#### `package.remove_resource`
+```python
+package.remove_resource(name)
+```
+Remove data package resource by name.
+
+The data package descriptor will be validated after resource descriptor removal.
+
+__Arguments__
+- __name (str)__: data resource name
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
+
+__Returns__
+
+`Resource/None`: returns the removed `Resource` instance or `None` if not found
+
+
+
+#### `package.get_group`
+```python
+package.get_group(name)
+```
+Returns a group of tabular resources by name.
+
+For more information about groups see [Group](#group).
+
+__Arguments__
+- __name (str)__: name of a group of resources
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
+
+__Returns__
+
+`Group/None`: returns a `Group` instance or `None` if not found
+
+
+
+#### `package.infer`
+```python
+package.infer(pattern=False)
+```
+Infer data package metadata.
+
+> Argument `pattern` works only for local files
+
+If `pattern` is not provided, only existing resources will be inferred
+(adding metadata like encoding, profile etc). If `pattern` is provided,
+new resources with file names matching the pattern will be added and inferred.
+It commits changes to the data package instance.
+
+__Arguments__
+- __pattern (str)__: glob pattern for new resources
+
+__Returns__
+
+`dict`: returns data package descriptor
+
+
+
+#### `package.commit`
+```python
+package.commit(strict=None)
+```
+Update data package instance if there are in-place changes in the descriptor.
+
+__Example__
+
+
+```python
+package = Package({
+ 'name': 'package',
+ 'resources': [{'name': 'resource', 'data': ['data']}]
+})
+
+package.name # package
+package.descriptor['name'] = 'renamed-package'
+package.name # package
+package.commit()
+package.name # renamed-package
+```
+
+__Arguments__
+- __strict (bool)__: alter `strict` mode for further work
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
+
+__Returns__
+
+`bool`: returns true on success and false if not modified
+
+
+
+#### `package.save`
+```python
+package.save(target=None,
+ storage=None,
+ merge_groups=False,
+ to_base_path=False,
+ **options)
+```
+Saves this data package.
+
+It saves to storage if the `storage` argument is passed, saves this data
+package's descriptor to a json file if the `target` argument ends with
+`.json`, or saves this data package to a zip file otherwise.
+
+__Example__
+
+
+Saving to a zip file creates an archive at `target` with the contents
+of this Data Package and its resources. Every resource whose content
+lives in the local filesystem will be copied into the zip file.
+Consider the following Data Package descriptor:
+
+```json
+{
+ "name": "gdp",
+ "resources": [
+ {"name": "local", "format": "CSV", "path": "data.csv"},
+ {"name": "inline", "data": [4, 8, 15, 16, 23, 42]},
+ {"name": "remote", "url": "http://someplace.com/data.csv"}
+ ]
+}
+```
+
+The final structure of the zip file will be:
+
+```
+./datapackage.json
+./data/local.csv
+```
+
+The contents of `datapackage.json` will be the same as the package's
+`descriptor` property. The resources' file names are generated
+based on their `name` and `format` fields if they exist.
+If a resource has no `name`, `resource-X` will be used,
+where `X` is the index of the resource in the `resources` list (starting at zero).
+If a resource has a `format`, it'll be lowercased and appended to the `name`,
+becoming "`name.format`".
+
+__Arguments__
+- __target (string/filelike)__:
+ the file path or a file-like object where
+ the contents of this Data Package will be saved into.
+- __storage (str/tableschema.Storage)__:
+ storage name like `sql` or storage instance
+- __merge_groups (bool)__:
+    save all the group's tabular resources into one bucket
+ if a storage is provided (for example into one SQL table).
+ Read more about [Group](#group).
+- __to_base_path (bool)__:
+ save the package to the package's base path
+ using the "<base_path>/<target>" route
+- __options (dict)__:
+ storage options to use for storage creation
+
+__Raises__
+- `DataPackageException`: raises if there was some error writing the package
+
+__Returns__
+
+`bool/Storage`: on success return true or a `Storage` instance
+
+### `Resource`
+```python
+Resource(self,
+ descriptor={},
+ base_path=None,
+ strict=False,
+ unsafe=False,
+ storage=None,
+ package=None,
+ **options)
+```
+Resource representation
+
+__Arguments__
+- __descriptor (str/dict)__: data resource descriptor as local path, url or object
+- __base_path (str)__: base path for all relative paths
+- __strict (bool)__:
+ strict flag to alter validation behavior. Setting it to `true`
+ leads to throwing errors on any operation with invalid descriptor
+- __unsafe (bool)__:
+    if `True` unsafe paths will be allowed. For more information see
+    https://specs.frictionlessdata.io/data-resource/#data-location.
+    Defaults to `False`
+- __storage (str/tableschema.Storage)__: storage name like `sql` or storage instance
+- __options (dict)__: storage options to use for storage creation
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
+
+
+
+#### `resource.data`
+Return resource data
+
+
+#### `resource.descriptor`
+Resource's descriptor
+
+__Returns__
+
+`dict`: descriptor
+
+
+
+#### `resource.errors`
+Validation errors
+
+Always empty in strict mode.
+
+__Returns__
+
+`Exception[]`: validation errors
+
+
+
+#### `resource.group`
+Group name
+
+__Returns__
+
+`str`: group name
+
+
+
+#### `resource.headers`
+Resource's headers
+
+> Only for tabular resources (reading has to be started first or it's `None`)
+
+__Returns__
+
+`str[]/None`: returns data source headers
+
+
+
+#### `resource.inline`
+Whether resource inline
+
+__Returns__
+
+`bool`: returns true if resource is inline
+
+
+
+#### `resource.local`
+Whether resource local
+
+__Returns__
+
+`bool`: returns true if resource is local
+
+
+
+#### `resource.multipart`
+Whether resource multipart
+
+__Returns__
+
+`bool`: returns true if resource is multipart
+
+
+
+#### `resource.name`
+Resource name
+
+__Returns__
+
+`str`: name
+
+
+
+#### `resource.package`
+Package instance if the resource belongs to some package
+
+__Returns__
+
+`Package/None`: a package instance if available
+
+
+
+#### `resource.profile`
+Resource's profile
+
+__Returns__
+
+`Profile`: an instance of `Profile` class
+
+
+
+#### `resource.remote`
+Whether resource remote
+
+__Returns__
+
+`bool`: returns true if resource is remote
+
+
+
+#### `resource.schema`
+Resource's schema
+
+> Only for tabular resources
+
+For tabular resources it returns `Schema` instance to interact with data schema.
+Read API documentation - [tableschema.Schema](https://github.com/frictionlessdata/tableschema-py#schema).
+
+__Returns__
+
+`tableschema.Schema`: schema
+
+
+
+#### `resource.source`
+Resource's source
+
+The combination of `resource.source` and `resource.inline/local/remote/multipart`
+provides a predictable interface for working with resource data.
+
+__Returns__
+
+`list/str`: returns `data` or `path` property
+
+
+
+#### `resource.table`
+Return resource table
+
+
+#### `resource.tabular`
+Whether resource tabular
+
+__Returns__
+
+`bool`: returns true if resource is tabular
+
+
+
+#### `resource.valid`
+Validation status
+
+Always true in strict mode.
+
+__Returns__
+
+`bool`: validation status
+
+
+
+#### `resource.iter`
+```python
+resource.iter(integrity=False, relations=False, **options)
+```
+Iterates through the resource data and emits rows cast based on table schema.
+
+> Only for tabular resources
+
+__Arguments__
+
+
+ keyed (bool):
+ yield keyed rows in a form of `{header1: value1, header2: value2}`
+ (default is false; the form of rows is `[value1, value2]`)
+
+ extended (bool):
+        yield extended rows in a form of `[rowNumber, [header1, header2], [value1, value2]]`
+ (default is false; the form of rows is `[value1, value2]`)
+
+ cast (bool):
+ disable data casting if false
+ (default is true)
+
+ integrity (bool):
+ if true actual size in BYTES and SHA256 hash of the file
+ will be checked against `descriptor.bytes` and `descriptor.hash`
+ (other hashing algorithms are not supported and will be skipped silently)
+
+ relations (bool):
+ if true foreign key fields will be checked and resolved to its references
+
+ foreign_keys_values (dict):
+ three-level dictionary of foreign key references optimized
+ to speed up validation process in a form of
+ `{resource1: {(fk_field1, fk_field2): {(value1, value2): {one_keyedrow}, ... }}}`.
+ If not provided but relations is true, it will be created
+ before the validation process by *index_foreign_keys_values* method
+
+ exc_handler (func):
+ optional custom exception handler callable.
+ Can be used to defer raising errors (i.e. "fail late"), e.g.
+ for data validation purposes. Must support the signature below
+
+__Custom exception handler__
+
+
+```python
+def exc_handler(exc, row_number=None, row_data=None, error_data=None):
+ '''Custom exception handler (example)
+
+ # Arguments:
+ exc(Exception):
+ Deferred exception instance
+ row_number(int):
+ Data row number that triggers exception exc
+ row_data(OrderedDict):
+ Invalid data row source data
+ error_data(OrderedDict):
+ Data row source data field subset responsible for the error, if
+ applicable (e.g. invalid primary or foreign key fields). May be
+ identical to row_data.
+ '''
+ # ...
+```
+
+__Raises__
+- `DataPackageException`: base class of any error
+- `CastError`: data cast error
+- `IntegrityError`: integrity checking error
+- `UniqueKeyError`: unique key constraint violation
+- `UnresolvedFKError`: unresolved foreign key reference error
+
+__Returns__
+
+`Iterator[list]`: yields rows
+
+
+
+#### `resource.read`
+```python
+resource.read(integrity=False,
+ relations=False,
+ foreign_keys_values=False,
+ **options)
+```
+Read the whole resource and return as array of rows
+
+> Only for tabular resources
+> It has the same API as `resource.iter` except for the following argument:
+
+__Arguments__
+- __limit (int)__: limit count of rows to read and return
+
+__Returns__
+
+`list[]`: returns rows
+
+
+
+#### `resource.check_integrity`
+```python
+resource.check_integrity()
+```
+Checks resource integrity
+
+> Only for tabular resources
+
+It checks size in BYTES and SHA256 hash of the file
+against `descriptor.bytes` and `descriptor.hash`
+(other hashing algorithms are not supported and will be skipped silently).
+
+__Raises__
+- `exceptions.IntegrityError`: raises if there are integrity issues
+
+__Returns__
+
+`bool`: returns True if no issues
+
+
+
+#### `resource.check_relations`
+```python
+resource.check_relations(foreign_keys_values=False)
+```
+Check relations
+
+> Only for tabular resources
+
+It checks foreign keys and raises an exception if there are integrity issues.
+
+__Raises__
+- `exceptions.RelationError`: raises if there are relation issues
+
+__Returns__
+
+`bool`: returns True if no issues
+
+
+
+#### `resource.drop_relations`
+```python
+resource.drop_relations()
+```
+Drop relations
+
+> Only for tabular resources
+
+Remove relations data from memory
+
+__Returns__
+
+`bool`: returns True
+
+
+
+#### `resource.raw_iter`
+```python
+resource.raw_iter(stream=False)
+```
+Iterate over data chunks as bytes.
+
+If `stream` is true, a file-like object will be returned.
+
+__Arguments__
+- __stream (bool)__: return a file-like object instead of an iterator of byte chunks
+
+__Returns__
+
+`bytes[]/filelike`: returns bytes[]/filelike
+
+
+
+#### `resource.raw_read`
+```python
+resource.raw_read()
+```
+Returns resource data as bytes.
+
+__Returns__
+
+`bytes`: returns resource data in bytes
+
+
+
+#### `resource.infer`
+```python
+resource.infer(**options)
+```
+Infer resource metadata
+
+Like name, format, mediatype, encoding, schema and profile.
+It commits these changes to the resource instance.
+
+__Arguments__
+- __options__:
+ options will be passed to `tableschema.infer` call,
+ for more control on results (e.g. for setting `limit`, `confidence` etc.).
+
+__Returns__
+
+`dict`: returns resource descriptor
+
+
+
+#### `resource.commit`
+```python
+resource.commit(strict=None)
+```
+Update resource instance if there are in-place changes in the descriptor.
+
+__Arguments__
+- __strict (bool)__: alter `strict` mode for further work
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
+
+__Returns__
+
+`bool`: returns true on success and false if not modified
+
+
+
+#### `resource.save`
+```python
+resource.save(target, storage=None, to_base_path=False, **options)
+```
+Saves this resource
+
+It saves to storage if the `storage` argument is passed, or
+saves this resource's descriptor to a json file otherwise.
+
+__Arguments__
+- __target (str)__:
+ path where to save a resource
+- __storage (str/tableschema.Storage)__:
+ storage name like `sql` or storage instance
+- __to_base_path (bool)__:
+ save the resource to the resource's base path
+ using the "<base_path>/<target>" route
+- __options (dict)__:
+ storage options to use for storage creation
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
+
+__Returns__
+
+`bool`: returns true on success
+
+### `Group`
+```python
+Group(self, resources)
+```
+Group representation
+
+__Arguments__
+- __resources (Resource[])__: list of tabular resources
+
+
+
+#### `group.headers`
+Group's headers
+
+__Returns__
+
+`str[]/None`: returns headers
+
+
+
+#### `group.name`
+Group name
+
+__Returns__
+
+`str`: name
+
+
+
+#### `group.schema`
+Group's schema
+
+__Returns__
+
+`tableschema.Schema`: schema
+
+
+
+#### `group.iter`
+```python
+group.iter(**options)
+```
+Iterates through the group data and emits rows cast based on table schema.
+
+> It concatenates all the resources and has the same API as `resource.iter`
+
+
+
+#### `group.read`
+```python
+group.read(limit=None, **options)
+```
+Read the whole group and return as array of rows
+
+> It concatenates all the resources and has the same API as `resource.read`
+
+
+
+#### `group.check_relations`
+```python
+group.check_relations()
+```
+Check group's relations
+
+The same as `resource.check_relations` but without the optional
+argument *foreign_keys_values*. This method tests the foreignKeys of the
+whole group at once, optimizing the process by creating the foreign_keys_values
+hashmap only once before testing the set of resources.
+
+
+### `Profile`
+```python
+Profile(self, profile)
+```
+Profile representation
+
+__Arguments__
+- __profile (str)__: profile name in registry or URL to JSON Schema
+
+__Raises__
+- `DataPackageException`: raises error if something goes wrong
+
+
+
+#### `profile.jsonschema`
+JSONSchema content
+
+__Returns__
+
+`dict`: returns profile's JSON Schema contents
+
+
+
+#### `profile.name`
+Profile name
+
+__Returns__
+
+`str/None`: name if available
+
+
+
+#### `profile.validate`
+```python
+profile.validate(descriptor)
+```
+Validate a data package `descriptor` against the profile.
+
+__Arguments__
+- __descriptor (dict)__: retrieved and dereferenced data package descriptor
+
+__Raises__
+- `ValidationError`: raises if the descriptor is not valid
+
+__Returns__
+
+`bool`: returns True if valid
+
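+For example, validating a descriptor against the registry's `data-package`
+profile (a minimal sketch; the descriptor is illustrative):
+
+```python
+from datapackage import Profile
+from datapackage.exceptions import ValidationError
+
+profile = Profile('data-package')
+print(profile.name)  # 'data-package'
+
+descriptor = {'name': 'test', 'resources': [{'name': 'data', 'path': 'data.csv'}]}
+try:
+    profile.validate(descriptor)
+except ValidationError as error:
+    print(error)
+```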
+
+### `validate`
+```python
+validate(descriptor)
+```
+Validate a data package descriptor.
+
+__Arguments__
+- __descriptor (str/dict)__: package descriptor (one of):
+ - local path
+ - remote url
+ - object
+
+__Raises__
+- `ValidationError`: raises if the descriptor is invalid
+
+__Returns__
+
+`bool`: returns true on valid
+
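+For example (a minimal sketch; the descriptor path is illustrative):
+
+```python
+from datapackage import validate
+from datapackage.exceptions import ValidationError
+
+try:
+    valid = validate('datapackage.json')
+except ValidationError as exception:
+    for error in exception.errors:
+        print(error)
+```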
+
+### `infer`
+```python
+infer(pattern, base_path=None)
+```
+Infer a data package descriptor.
+
+> Argument `pattern` works only for local files
+
+__Arguments__
+- __pattern (str)__: glob file pattern
+
+__Returns__
+
+`dict`: returns data package descriptor
+
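+For example, building a descriptor from all local CSV files (a minimal sketch;
+the pattern and base path are illustrative):
+
+```python
+from datapackage import infer
+
+descriptor = infer('**/*.csv', base_path='data')
+print(descriptor['resources'])
+```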
+
+### `DataPackageException`
+```python
+DataPackageException(self, message, errors=[])
+```
+Base class for all DataPackage/TableSchema exceptions.
+
+If there are multiple errors, they can be read from the exception object:
+
+```python
+from datapackage.exceptions import DataPackageException
+
+try:
+    pass  # library action goes here
+except DataPackageException as exception:
+    if exception.multiple:
+        for error in exception.errors:
+            pass  # handle each nested error
+```
+
+
+
+#### `datapackageexception.errors`
+List of nested errors
+
+__Returns__
+
+`DataPackageException[]`: list of nested errors
+
+
+
+#### `datapackageexception.multiple`
+Whether it's a nested exception
+
+__Returns__
+
+`bool`: whether it's a nested exception
+
+
+
+### `TableSchemaException`
+```python
+TableSchemaException(self, message, errors=[])
+```
+Base class for all TableSchema exceptions.
+
+
+### `LoadError`
+```python
+LoadError(self, message, errors=[])
+```
+All loading errors.
+
+
+### `CastError`
+```python
+CastError(self, message, errors=[])
+```
+All value cast errors.
+
+
+### `IntegrityError`
+```python
+IntegrityError(self, message, errors=[])
+```
+All integrity errors.
+
+
+### `RelationError`
+```python
+RelationError(self, message, errors=[])
+```
+All relations errors.
+
+
+### `StorageError`
+```python
+StorageError(self, message, errors=[])
+```
+All storage errors.
+
+
+## Contributing
+
+> The project follows the [Open Knowledge International coding standards](https://github.com/okfn/coding-standards).
+
+The recommended way to get started is to create and activate a project virtual environment.
+To install the package and development dependencies into the active environment:
+
+```bash
+$ make install
+```
+
+To run tests with linting and coverage:
+
+```bash
+$ make test
+```
+
+## Changelog
+
+Only breaking and the most important changes are described here. The full changelog and documentation for all released versions can be found in the nicely formatted [commit history](https://github.com/frictionlessdata/datapackage-py/commits/master).
+
+#### v1.15
+
+> WARNING: this release can be breaking for some setups, please read the discussions below
+
+- Fixed header management according to the specs:
+ - https://github.com/frictionlessdata/datapackage-py/pull/257
+ - https://github.com/frictionlessdata/datapackage-py/issues/256
+ - https://github.com/frictionlessdata/forum/issues/1
+
+#### v1.14
+
+- Added experimental options for picking/skipping fields/rows
+
+#### v1.13
+
+- Add `unsafe` option to Package and Resource (#262)
+
+#### v1.12
+
+- Use `chardet` for encoding detection by default. For `cchardet`: `pip install datapackage[cchardet]`
+
+#### v1.11
+
+- `resource/package.save` now accepts a `to_base_path` argument (#254)
+- `package.save` now returns a `Storage` instance if available
+
+#### v1.10
+
+- Added the ability to check a tabular resource's integrity
+
+#### v1.9
+
+- Added `resource.package` property
+
+#### v1.8
+
+- Added support for [groups of resources](#group)
+
+#### v1.7
+
+- Added support for [compression of resources](https://frictionlessdata.io/specs/patterns/#compression-of-resources)
+
+#### v1.6
+
+- Added support for custom request session
+
+#### v1.5
+
+Updated behaviour:
+- Added support for Python 3.7
+
+#### v1.4
+
+New API added:
+- added `skip_rows` support to the resource descriptor
+
+#### v1.3
+
+New API added:
+- property `package.base_path` is now publicly available
+
+#### v1.2
+
+Updated behaviour:
+- CLI command `$ datapackage infer` now outputs only a JSON-formatted data package descriptor.
+
+#### v1.1
+
+New API added:
+- Added an integration between `Package/Resource` and the `tableschema.Storage` - https://github.com/frictionlessdata/tableschema-py#storage. It allows loading and saving data packages from/to different storages like SQL/BigQuery/etc.
+
+
+
+%prep
+%autosetup -n datapackage-1.15.2
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
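+# Record every installed file so the %files sections below can consume the
+# lists generated here via -f filelist.lst / -f doclist.lst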
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+ find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+ find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+ find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+ find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+ find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-datapackage -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Tue Apr 11 2023 Python_Bot <Python_Bot@openeuler.org> - 1.15.2-1
+- Package Spec generated
diff --git a/sources b/sources
new file mode 100644
index 0000000..c86ed3a
--- /dev/null
+++ b/sources
@@ -0,0 +1 @@
+928167c3f467787b984afbd7b1543638 datapackage-1.15.2.tar.gz