Metadata in Table Schema
| Authors | Christophe Benz, Johan Richer |
|---|---|
Overview
Table Schemas need their own metadata so that they can stand alone and be interpreted without relying on other contextual information (Data Package metadata, for example). Describing schemas with structured metadata would help users understand them and would encourage their sharing and reuse.
Currently it is possible to add custom properties to a Table Schema, but the lack of consensus about those properties restricts common tooling and wider adoption.
Use cases
- Documentation: generating Markdown documentation from the schema itself requires contextual information (description, version, authors…) to be retrievable from the schema.
- Cataloging: open data standardisation can be increased by making Table Schemas easier to share, for example by letting users search and categorise them (by keywords, countries, full-text…) in catalogs.
- Machine readability: tools like Goodtables could use catalogs to access Table Schemas in order to help users validate tabular files against existing schemas. Metadata would be needed for tools to find and read those schemas.
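The documentation use case can be sketched in a few lines of Python. This is a minimal illustration, not the scripts used by any existing project: the function name `schema_to_markdown`, the sample schema and all of its values are hypothetical, and `contributors` is assumed to be a list of objects with a `title` key, as in Data Package.

```python
def schema_to_markdown(schema: dict) -> str:
    """Render a short Markdown page from schema-level metadata and fields."""
    # Prefer the human-readable title, fall back to the identifier.
    heading = schema.get("title") or schema.get("name") or "Untitled schema"
    lines = [f"# {heading}", ""]
    if schema.get("description"):
        lines += [schema["description"], ""]
    # Simple string properties are listed only when present.
    for label, key in (("Version", "version"), ("Country", "countryCode"),
                       ("Homepage", "homepage")):
        if schema.get(key):
            lines.append(f"- {label}: {schema[key]}")
    if schema.get("keywords"):
        lines.append("- Keywords: " + ", ".join(schema["keywords"]))
    # Assumption: contributors are objects with a "title" key.
    names = [c["title"] for c in schema.get("contributors", []) if c.get("title")]
    if names:
        lines.append("- Contributors: " + ", ".join(names))
    fields = schema.get("fields", [])
    if fields:
        lines += ["", "| Field | Type | Description |", "|---|---|---|"]
        for field in fields:
            # Table Schema treats a missing field type as "string".
            lines.append(f"| {field['name']} | {field.get('type', 'string')} "
                         f"| {field.get('description', '')} |")
    return "\n".join(lines) + "\n"


# Hypothetical schema, for illustration only.
example_schema = {
    "name": "example-stations",
    "title": "Example stations",
    "description": "A placeholder schema used to illustrate the pattern.",
    "version": "0.1.0",
    "countryCode": "FR",
    "keywords": ["example", "stations"],
    "contributors": [{"title": "Jane Doe", "role": "author"}],
    "fields": [{"name": "id", "description": "Station identifier"}],
}

print(schema_to_markdown(example_schema))
```

Because every property is read with `get`, a schema that carries only part of the metadata still renders; missing entries are simply left out of the page.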
Specification
This pattern introduces the following properties to the Table Schema spec (using the Frictionless Data core dictionary as much as possible):
name
: An identifier string for this schema.
title
: A human-readable title for this schema.
description
: A text description for this schema.
keywords
: The keyword(s) that describe this schema. Tags are useful to categorise and catalog schemas.
countryCode
: The ISO 3166-1 alpha-2 code for the country where this schema is primarily used. Since open data schemas are very country-specific, it’s useful to have this information in a structured way.
homepage
: The home on the web that is related to this schema.
path
: A fully qualified URL for this schema. The direct path to the schema itself helps tools access it (i.e. for machine readability).
image
: An image to represent this schema. An optional illustration can be useful, for example in catalogs, to differentiate schemas in a list.
licenses
: The license(s) under which this schema is published.
resources
: Example tabular data resource(s) validated or invalidated against this schema. Oftentimes, schemas are shared with example resources to illustrate them, with valid or even invalid files (e.g. with constraint errors).
sources
: The source(s) used to create this schema. In some cases, schemas are created after a legal text or some draft specification in a human-readable document. In those cases, it’s useful to share them with the schema.
created
: The datetime on which this schema was created.
lastModified
: The datetime on which this schema was last modified.
version
: A unique version number for this schema.
contributors
: The contributors to this schema.
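A tool consuming these properties may want to run a few light checks before relying on them. The sketch below is one possible approach and is not part of the pattern: the function name `check_metadata` is hypothetical, and it assumes that `keywords` is a list of strings and that `created` and `lastModified` are ISO 8601 strings, which the property descriptions above do not spell out.

```python
import re
from datetime import datetime
from typing import List
from urllib.parse import urlparse

# Checks only the shape of an ISO 3166-1 alpha-2 code, not whether it is assigned.
COUNTRY_CODE = re.compile(r"[A-Z]{2}")


def check_metadata(schema: dict) -> List[str]:
    """Return a list of human-readable problems found in the schema metadata."""
    problems = []
    for key in ("name", "title", "description"):
        if key in schema and not isinstance(schema[key], str):
            problems.append(f"{key}: expected a string")
    code = schema.get("countryCode")
    if code is not None and not (isinstance(code, str) and COUNTRY_CODE.fullmatch(code)):
        problems.append("countryCode: expected two uppercase letters")
    # Assumption: keywords is a list of strings.
    keywords = schema.get("keywords")
    if keywords is not None and not (
        isinstance(keywords, list) and all(isinstance(k, str) for k in keywords)
    ):
        problems.append("keywords: expected a list of strings")
    path = schema.get("path")
    if path is not None:
        parsed = urlparse(path) if isinstance(path, str) else None
        if not (parsed and parsed.scheme in ("http", "https") and parsed.netloc):
            problems.append("path: expected a fully qualified URL")
    # Assumption: datetimes are ISO 8601 strings. Before Python 3.11,
    # fromisoformat rejects some valid forms such as a trailing "Z".
    for key in ("created", "lastModified"):
        if key in schema:
            try:
                datetime.fromisoformat(schema[key])
            except (TypeError, ValueError):
                problems.append(f"{key}: expected an ISO 8601 datetime")
    return problems


# Hypothetical input with two deliberate mistakes.
print(check_metadata({"name": "example", "countryCode": "fra", "path": "schema.json"}))
```

Returning a list of messages rather than raising on the first problem lets a catalog or validator report everything that is wrong with a schema in one pass.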
Example schema
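As an illustration, a schema carrying these properties might look like the following sketch, built as a Python dictionary and serialised to JSON. Every value is a hypothetical placeholder, and the object shapes used for `licenses`, `resources`, `sources` and `contributors` are assumed to follow Data Package conventions rather than being defined by this pattern.

```python
import json

# Every value below is a hypothetical placeholder.
schema = {
    "name": "example-stations",
    "title": "Example stations",
    "description": "A placeholder schema used to illustrate the metadata properties.",
    "keywords": ["example", "stations"],
    "countryCode": "FR",
    "homepage": "https://example.org/schemas/example-stations",
    "path": "https://example.org/schemas/example-stations/schema.json",
    "image": "https://example.org/schemas/example-stations/logo.png",
    # Assumed shape: objects with name, title and path, as in Data Package.
    "licenses": [
        {
            "name": "CC-BY-4.0",
            "title": "Creative Commons Attribution 4.0",
            "path": "https://creativecommons.org/licenses/by/4.0/",
        }
    ],
    # One valid and one deliberately invalid example file.
    "resources": [
        {"name": "valid-example", "path": "https://example.org/schemas/example-stations/valid.csv"},
        {"name": "invalid-example", "path": "https://example.org/schemas/example-stations/invalid.csv"},
    ],
    "sources": [
        {"title": "Hypothetical draft specification", "path": "https://example.org/specification.pdf"}
    ],
    "created": "2019-01-15",
    "lastModified": "2019-06-01",
    "version": "0.1.0",
    "contributors": [
        {"title": "Jane Doe", "email": "jane.doe@example.org", "role": "author"}
    ],
    # Regular Table Schema content follows the metadata.
    "fields": [
        {"name": "id", "type": "string", "description": "Station identifier"},
        {"name": "opened", "type": "date", "description": "Opening date"},
    ],
}

print(json.dumps(schema, indent=2, ensure_ascii=False))
```

The metadata properties sit at the top level of the descriptor, next to `fields`, so the schema can be read without any surrounding Data Package.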
Implementations
The following links are actual examples already using this pattern, although they are not fully aligned with our proposal. The aim is to help Table Schema users converge on a common pattern before considering a change to the spec.
- @OpenDataFrance has initiated the creation of Table Schemas to standardise common French open data datasets. Their Markdown documentation is generated automatically from the schemas (using some scripts), including contextual information.
- A tool called Validata was developed, based on Goodtables, to help French open data producers follow the schemas. It uses metadata from the schemas to present them.
- @Etalab has launched schema.data.gouv.fr, an official open data schema catalog, which is specific to France. It needs additional metadata in the schemas to validate them.
- Example Table Schema from @Etalab using metadata properties.