---json
{
    "page_id": "tinebxd98hfmbtsflf7eu"
}
---
====== Data Definition (DataDef.yml) ======

===== About =====
A ''data definition'' is a [[#manifest|manifest file property]] or [[#argument|argument]] that defines the [[docs:resource:metadata|metadata]] of a [[resource|data resource]] (i.e. its [[attribute|attributes]] and [[structure|data structure]]).

===== Example =====
==== Manifest ====
Example in a [[docs:resource:manifest#kind|kind resource manifest file]]:

<code yaml>
kind: csv
spec:
  data-def:
    logical-name: favorite_books
    header-row-id: 1
    delimiter-character: ','
    columns:
      - name: asin
        type: varchar
        precision: 20
      - name: description
        type: varchar
      - name: price
        type: double
      - name: group
        type: varchar
</code>

==== Operation Argument ====
Example as an argument of the [[:docs:op:define|define step]] in a [[:docs:flow:pipeline|pipeline manifest]]:

<code yaml>
kind: pipeline
spec:
  steps:
    - operation: "define"
      arguments:
        # The data def argument
        data-def:
          logical-name: "colors"
          columns: ["id","color"]
</code>

==== Cli Options ====
You can also set data definition attributes with the [[#tabul cli option|Tabul CLI options]]. For example, to set a semicolon as the separator of a [[:docs:resource:csv|CSV]]:

<code bash>
tabul data head \
  --attribute delimiter-character=';' \
  books-semicolon.csv@howto
</code>

===== Usage =====
==== Resource Manifest ====
A ''data definition'' may be defined in any [[docs:resource:manifest|resource manifest]] (i.e. a YAML file that defines a resource).

==== Operation Argument ====
This format is used as the ''data-def'' argument of a pipeline step.
Examples:
  * the [[docs:op:define|define operation]]
  * the [[docs:op:select|select operation]]
  * the [[:docs:op:enrich|enrich operation]]

==== Tabul Cli Option ====
You can set the data definition with the following [[docs:tabul:option|tabul options]]:
  * ''source-attribute'' or ''target-attribute'' in a [[docs:tabul:data:transfer|transfer command]]
  * ''attribute'' for the other [[docs:tabul:data:start|data commands]]

===== Format =====
The following sections show the structure common to all data definition files: the name of the tabular structure and its columns.

==== Scalar ====
At the root, you can set any scalar attribute, such as the [[docs:resource:attribute|common attributes]].

Example:

<code yaml>
logical-name: LogicalName
</code>

where:
  * ''logical-name'' is the [[logical_name|logical name]] of the [[resource|resource]] (defaults to the name of the file without structure information).

==== Relational Structure ====
=== Columns ===

<code yaml>
columns:
  - name: column_name1
    type: date
    ansi-type: date
    precision:
    scale:
    comment:
    position: 1
  - name: column_name2
    type: varchar
    precision:
    scale: 0
    comment: A comment
</code>

where:
  * ''columns'' defines the [[docs:resource:column|columns]]
  * ''name'' is the name of the column
  * ''type'' is the [[docs:data_type:data_type|data type name]] of the connection (default: ''varchar'')
  * ''ansi-type'' is the [[:docs:data_type:data_type#ansi|ansi type of data]] (by default, derived from the ''type'')
  * ''precision'' is the precision of the data type (default: the default precision of the data type)
  * ''scale'' is the scale of the data type (default: the default scale of the data type)
  * ''comment'' is a comment on the column
  * ''position'' is the physical column position

=== Primary Columns ===

<code yaml>
primary-columns: [ "column_name1", "column_name2" ]
</code>

where ''primary-columns'' defines the list of column names that compose the primary key.

===== Extra Attributes =====
Each data resource type may need additional information about a table or a column.
This information can be added at each level (table or column) as an attribute (i.e. a property).

<code yaml>
DataResourceAttribute1: value1
columns:
  - name: column_name
    ColumnProperty1: value1
    ....
</code>

where:
  * ''DataResourceAttribute1'' is a [[attribute|resource attribute]]
  * ''ColumnProperty1'' is a column attribute

A property value may be:
  * a scalar (i.e. a single value)
  * a list
  * or a mapping

The [[docs:generator:data-supplier|column data generators]] use these extra attributes to add the ''data-supplier'' argument.
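
The three value shapes listed above can be sketched in one data definition. This is a minimal illustration only: the attribute names ''example-scalar-attribute'', ''example-list-attribute'' and ''example-mapping-attribute'' are hypothetical placeholders, not attributes known to any resource type, and the actual keys accepted by ''data-supplier'' are not shown here.

<code yaml>
# Hypothetical attribute names, for illustration only
logical-name: favorite_books
# Resource-level extra attribute holding a scalar value
example-scalar-attribute: value1
columns:
  - name: asin
    type: varchar
    # Column-level extra attribute holding a list value
    example-list-attribute: [ "a", "b" ]
  - name: price
    type: double
    # Column-level extra attribute holding a mapping value
    example-mapping-attribute:
      key1: value1
      key2: value2
</code>

Which extra attributes are actually recognized depends on the data resource type, so check the documentation of the resource or generator before relying on a name.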