Data Definition (DataDef.yml)

About

A data definition is a manifest file property or argument that defines the metadata of a data resource (i.e. its attributes and data structure).

Example

Manifest

Example in a kind resource manifest file

kind: csv
spec:
  data-def:
    logical-name: favorite_books
    header-row-id: 1
    delimiter-character: ','
    columns:
      - name: asin
        type: varchar
        precision: 20
      - name: description
        type: varchar
      - name: price
        type: double
      - name: group
        type: varchar


Operation Argument

Example as an argument of the define step in a pipeline manifest

kind: pipeline
spec:
  steps:
    - operation: "define"
      arguments:
        # The data def argument
        data-def:
          logical-name: "colors"
          columns: ["id","color"]

CLI Options

You can also set data definition attributes with the tabul CLI options.

For example, to set a semicolon as the delimiter of a CSV file:

tabul data head \
  --attribute delimiter-character=';' \
  books-semicolon.csv@howto

Usage

Resource Manifest

A data definition may be defined in any resource manifest (i.e. a YAML file that defines a resource).

Operation Argument

This format is used in the data-def argument of a pipeline step.

Example:
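A minimal sketch of a define step whose data-def argument uses the expanded column form described under Format. It assumes that the define step accepts the same name/type column entries as a resource manifest; the varchar types are illustrative and not taken from the original example.

```yaml
kind: pipeline
spec:
  steps:
    - operation: "define"
      arguments:
        data-def:
          logical-name: "colors"
          # Expanded column form (assumed to be accepted in place of ["id","color"])
          columns:
            - name: id
              type: varchar
            - name: color
              type: varchar
```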

Tabul CLI Option

You can set the data definition with the following tabul options

Format

The following data definition file shows the structure common to all data definition files: it defines the name of the tabular structure and its columns.

Scalar

At the root, you can set any scalar attribute, such as the common attributes.

Example:

logical-name: LogicalName

where

Relational Structure

Columns

columns:
  - name: column_name1
    type: date
    ansi-type: date
    precision:
    scale:
    comment:
    position: 1
  - name: column_name2
    type: varchar
    precision:
    scale: 0
    comment: A comment

where:

Primary Columns

primary-columns: [ "column_name1", "column_name2" ]

where primary-columns defines a list of column names that compose the primary key.
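
As a sketch, the manifest below combines primary-columns with a column list. It assumes that primary-columns sits at the root of data-def, next to columns; the resource and column names are reused from the manifest example at the top of this page.

```yaml
kind: csv
spec:
  data-def:
    logical-name: favorite_books
    columns:
      - name: asin
        type: varchar
        precision: 20
      - name: description
        type: varchar
    # Assumption: declared at the same level as columns
    primary-columns: [ "asin" ]
```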

Extra Attributes

Each data resource type may need additional information about a table or a column. This information can be added at either level (table or column) as an attribute (i.e. a property).

DataResourceAttribute1: value1
columns:
    - name: column_name
      ColumnProperty1: value1
....

where:

A Property value may be:

The column data generators use them to add the data-supplier argument.