Resource Manifests
About
A data resource is uniquely identified by 3 elements:
- a data uri
- a media type
- a data definition
A resource manifest is a manifest that groups these 3 elements.
Default
This section defines the optionality and default value of each element.
| Element | Mandatory | Default |
|---|---|---|
| data uri | Yes | |
| media type | No | Derived from the data uri. Example: for a books.csv file, the media type is text/csv |
| data definition | No | Takes the default value for the kind of resource (i.e. for its media type). Example: for a csv, the default separator is the comma. |
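The defaulting rules in the table above can be sketched roughly as follows. This is an illustration only, not Tabulify's implementation: the function `resolve_elements` and both lookup tables are hypothetical names introduced here, and the tables cover only the csv case mentioned in this section.

```python
from pathlib import PurePosixPath

# Hypothetical lookup tables, limited to the csv case described in this section.
MEDIA_TYPE_BY_EXTENSION = {".csv": "text/csv"}
DEFAULT_DATA_DEF_BY_MEDIA_TYPE = {"text/csv": {"delimiter-character": ","}}


def resolve_elements(data_uri, media_type=None, data_def=None):
    """Return the 3 elements of a data resource with defaults filled in."""
    if not data_uri:
        raise ValueError("the data uri is mandatory")
    # A data uri such as books.csv@md is a path, then '@', then a connection name.
    path = data_uri.rsplit("@", 1)[0]
    if media_type is None:
        # Derive the media type from the file extension of the path.
        extension = PurePosixPath(path).suffix.lower()
        media_type = MEDIA_TYPE_BY_EXTENSION.get(extension)
    # Start from the defaults of the media type, then apply explicit values on top.
    resolved_def = dict(DEFAULT_DATA_DEF_BY_MEDIA_TYPE.get(media_type, {}))
    resolved_def.update(data_def or {})
    return {"data-uri": data_uri, "media-type": media_type, "data-def": resolved_def}


print(resolve_elements("books.csv@md"))
```

Only the data uri is required. Any value passed explicitly wins over the derived default.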
Kind
In a kind manifest:
- the media type is defined/derived from the kind value
- the data-uri is optional
With Data Uri
Example for a CSV
Note that the data uri books.csv@md locates the csv file books.csv in the md connection (manifest directory), that is, in the same directory as the manifest.
    kind: csv
    spec:
      data-uri: books.csv@md
      data-def:
        logical-name: favorite_books
        header-row-id: 1
        delimiter-character: ','
        columns:
          - name: asin
            type: varchar
            precision: 20
          - name: description
            type: varchar
          - name: price
            type: double
          - name: group
            type: varchar
When using this manifest, you select the original resource with the data uri of the manifest.
Example:
    tabul data print books-with-data-uri--csv.yml@howto

    books.csv@md
    asin          description                                 price group
    ------------- ------------------------------------------- ----- ----------
    B007US9NA8    The New Encyclopedia of Modern Bodybuilding 27.18 Sports
    1439199191    How To Win Friends And Influence People     27.18 Psychology
    9780465050659 The Design of Everyday Things               9.99  Design
    9780465050659 The Design of Everyday Things               9.99  Psychology
    B00555X8OA    Thinking, Fast and Slow                     8.85  Decision
    B00555X8OA    Thinking, Fast and Slow                     8.85  Psychology
Without Data Uri
Without data-uri, the default value is derived from the manifest file name.
- The resource is expected to be stored in the same directory as the manifest (i.e. in the md connection)
- The file extension is derived from the kind value
For example, for the manifest books--csv.yml, the data uri would be books.csv@md
    kind: csv
    spec:
      data-def:
        logical-name: favorite_books
        header-row-id: 1
        delimiter-character: ','
        columns:
          - name: asin
            type: varchar
            precision: 20
          - name: description
            type: varchar
          - name: price
            type: double
          - name: group
            type: varchar
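The derivation of the default data uri described above can be sketched as follows. This is a minimal illustration, not Tabulify's code: the function name `default_data_uri` is hypothetical, and it assumes the manifest file name follows the `<name>--<kind>.yml` pattern seen in the books--csv.yml example.

```python
def default_data_uri(manifest_file_name: str, kind: str) -> str:
    """Derive the default data uri of a kind manifest that has no data-uri."""
    # Drop the manifest extension: books--csv.yml -> books--csv
    stem = manifest_file_name.rsplit(".", 1)[0]
    # Assumption: the resource name is the part before the '--<kind>' suffix.
    suffix = f"--{kind}"
    name = stem[: -len(suffix)] if stem.endswith(suffix) else stem
    # The extension comes from the kind; the resource lives in the md connection.
    return f"{name}.{kind}@md"


print(default_data_uri("books--csv.yml", "csv"))
```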
When using this manifest, to select the resource, you can use the data-uri:
- of the manifest
- or of the original resource
Example:
- From the data uri of the manifest
    tabul data print books--csv.yml@howto

    books.csv@md
    asin          description                                 price group
    ------------- ------------------------------------------- ----- ----------
    B007US9NA8    The New Encyclopedia of Modern Bodybuilding 27.18 Sports
    1439199191    How To Win Friends And Influence People     27.18 Psychology
    9780465050659 The Design of Everyday Things               9.99  Design
    9780465050659 The Design of Everyday Things               9.99  Psychology
    B00555X8OA    Thinking, Fast and Slow                     8.85  Decision
    B00555X8OA    Thinking, Fast and Slow                     8.85  Psychology
- From the data uri of the csv
    tabul data print books.csv@howto

    books.csv@howto
    asin          description                                 price group
    ------------- ------------------------------------------- ----- ----------
    B007US9NA8    The New Encyclopedia of Modern Bodybuilding 27.18 Sports
    1439199191    How To Win Friends And Influence People     27.18 Psychology
    9780465050659 The Design of Everyday Things               9.99  Design
    9780465050659 The Design of Everyday Things               9.99  Psychology
    B00555X8OA    Thinking, Fast and Slow                     8.85  Decision
    B00555X8OA    Thinking, Fast and Slow                     8.85  Psychology
Other
We also support the resource and data_def manifests for backward compatibility, but whenever possible we recommend the kind manifest because:
- it is more versatile
- kind manifests will get a schema in a future version, which enables value completion in your editor (also known as IntelliSense)
In total, there are 3 types of manifest that define the 3 data resource elements:
| Type | Description |
|---|---|
| kind | the manifest with only 2 elements, where the media type is derived from the kind of manifest |
| resource | the generic manifest that defines all 3 elements |
| data_def | a manifest that defines only the data definition and that should live next to its resource |
In other words, each resource manifest type has the following properties:
| Type | Define Data Uri | Define Media Type | Define Data Def | Resource that should be selected |
|---|---|---|---|---|
| Kind | Optional | Yes | Yes | manifest and resource data uri |
| Resource | Yes | Yes | Yes | manifest data uri |
| Data Def | No | No | Yes | resource data uri |
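The last column of the table above can be read as a small lookup. The sketch below is only an illustration of that column: the names `SELECTABLE_BY_MANIFEST_TYPE` and `selectable_data_uris` are hypothetical and are not part of the tabul CLI.

```python
# Which data uri(s) select the resource, per manifest type (last column of the table).
SELECTABLE_BY_MANIFEST_TYPE = {
    "kind": ("manifest", "resource"),
    "resource": ("manifest",),
    "data-def": ("resource",),
}


def selectable_data_uris(manifest_type, manifest_uri, resource_uri):
    """Return the data uris that can be passed to a command for this manifest type."""
    uris = {"manifest": manifest_uri, "resource": resource_uri}
    return [uris[key] for key in SELECTABLE_BY_MANIFEST_TYPE[manifest_type]]


print(selectable_data_uris("data-def", "books--data-def.yml@howto", "books.csv@howto"))
```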
Resource
The resource manifest specifies all 3 elements:
- a data uri
- a media type
- a data definition
    kind: resource
    spec:
      data-uri: books.csv@md
      media-type: text/csv
      data-def:
        logical-name: favorite_books
        header-row-id: 1
        delimiter-character: ','
        columns:
          - name: asin
            type: varchar
            precision: 20
          - name: description
            type: varchar
          - name: price
            type: double
          - name: group
            type: varchar
When using this manifest, you select the resource with the data uri of the manifest.
Example:
    tabul data print books--resource.yml@howto

    books.csv@md
    asin          description                                 price group
    ------------- ------------------------------------------- ----- ----------
    B007US9NA8    The New Encyclopedia of Modern Bodybuilding 27.18 Sports
    1439199191    How To Win Friends And Influence People     27.18 Psychology
    9780465050659 The Design of Everyday Things               9.99  Design
    9780465050659 The Design of Everyday Things               9.99  Psychology
    B00555X8OA    Thinking, Fast and Slow                     8.85  Decision
    B00555X8OA    Thinking, Fast and Slow                     8.85  Psychology
Data Def
In a data-def manifest, only the data definition is present.
Example for a CSV:
- if you have a file called books.csv in the directory howto
- you would create a file called books--data-def.yml in the same howto directory
    kind: data-def
    spec:
      logical-name: favorite_books
      header-row-id: 1
      delimiter-character: ','
      columns:
        - name: asin
          type: varchar
          precision: 20
        - name: description
          type: varchar
        - name: price
          type: double
        - name: group
          type: varchar
This manifest should be located next to the data resource because it defines only the data definition of that resource.
To use this manifest, you use the data uri of the original resource.
Example:
    tabul data print books.csv@howto

    books.csv@howto
    asin          description                                 price group
    ------------- ------------------------------------------- ----- ----------
    B007US9NA8    The New Encyclopedia of Modern Bodybuilding 27.18 Sports
    1439199191    How To Win Friends And Influence People     27.18 Psychology
    9780465050659 The Design of Everyday Things               9.99  Design
    9780465050659 The Design of Everyday Things               9.99  Psychology
    B00555X8OA    Thinking, Fast and Slow                     8.85  Decision
    B00555X8OA    Thinking, Fast and Slow                     8.85  Psychology
This approach has:
- the advantage that you operate on the file, not on the manifest
- the disadvantage that you must remember to move the manifest when you move the file
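One way to soften the disadvantage above is to move the file and its data-def manifest together. The helper below is a hypothetical sketch, not a tabul feature: both function names are invented for illustration, and it assumes the sibling manifest is named `<name>--data-def.yml`, as in the books example.

```python
import shutil
from pathlib import Path
from typing import Optional


def find_data_def_manifest(resource: Path) -> Optional[Path]:
    """Return the data-def manifest sitting next to the resource, if any."""
    # Assumed naming convention: books.csv -> books--data-def.yml in the same directory.
    candidate = resource.with_name(f"{resource.stem}--data-def.yml")
    return candidate if candidate.is_file() else None


def move_with_manifest(resource: Path, target_dir: Path) -> Path:
    """Move a resource and, when present, its data-def manifest to target_dir."""
    # Look up the manifest before the resource is moved.
    manifest = find_data_def_manifest(resource)
    target_dir.mkdir(parents=True, exist_ok=True)
    moved = Path(shutil.move(str(resource), str(target_dir / resource.name)))
    if manifest is not None:
        shutil.move(str(manifest), str(target_dir / manifest.name))
    return moved
```

Calling `move_with_manifest(Path("howto/books.csv"), Path("archive"))` would then keep the data definition next to the file in its new location.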