About
A data resource generator is a content data resource that generates data.
Syntax
A generator is a resource manifest file:
- terminated by the suffix --generator.yml
- where the metadata field contains the data definition file syntax and adds:
  - the max-record-count attribute (the maximum number of records generated)
  - the data-supplier column attribute node, where the data supplier properties are set.
Example:
kind: generator
# Spec follows the data definition format
spec:
  # The maximum number of records generated
  max-record-count: 30
  # The number of records generated for each resource in a generate stream supplier
  stream-record-count: 10
  # The columns definition
  columns:
    - name: columnName
      comment: A column with a sequence integer generator and its properties
      data-supplier: # the data supplier
        type: sequence
        arguments:
          start: 3
          step: 2
          maxTick: 5
    - name: columnName2
      ........
See the column supplier page for all the generation types that you can choose for a column.
Attributes
max-record-count
max-record-count is an attribute that defines the maximum number of records generated.
stream-record-count
stream-record-count is an attribute that defines the number of records generated in a generate stream operation.
Count
The count attribute is calculated and defines how many records a generator would generate.
You can see it with the data info command.
For example, with a sequence generator, the uncapped count is the maximum value of the column's data type. For an integer, this is 2147483647.
tabul data info --strict-selection count--generator.yml@howto
Information about the data resource (count@memgen)
attribute           value                                                                                            description
------------------- ------------------------------------------------------------------------------------------------ -----------------------------------------------------
MAX_RECORD_COUNT    100                                                                                              The maximum of records generated
SIZE                100                                                                                              The size
SIZE_NOT_CAPPED     2147483647                                                                                       The number of records without max
STREAM_RECORD_COUNT                                                                                                  The records generated in a stream
ABSOLUTE_PATH       count                                                                                            The absolute path on the data system
ACCESS_TIME                                                                                                          The access time (access time)
COMMENT                                                                                                              A comment
CONNECTION          memgen                                                                                           The connection name
COUNT               100                                                                                              The number of records
CREATION_TIME       2025-11-10 19:48:59.159433321                                                                    The creation time (birth time)
DATA_URI            count@memgen                                                                                     The data uri
KIND                generator                                                                                        The kind of media
LOGICAL_NAME        count                                                                                            The logical name
MD5                 ef69caaaeea9c17120821a9eb6c7f1de                                                                 The Md5 hash
MEDIA_SUBTYPE       vnd.tabulify.generator+yaml                                                                      The media subType
MEDIA_TYPE          text/vnd.tabulify.generator+yaml                                                                 The media type
NAME                count                                                                                            The name of the data resource
PARENT              gen                                                                                              The parent
PATH                count                                                                                            The relative path to the default connection path
SHA384              0c9b6656498be26d413bf3563198f01be3236d017f75943f9406922d08ba4ec137ffde15d2e95dcb4d77f9d6cd6eec79 The Sha384 hash
SHA384_INTEGRITY    sha384-DJtmVkmL4m1BO/NWMZjwG+MjbQF/dZQ/lAaSLQi6TsE3/94V0uldy013+dbNbux5                          The sha384 value used in the html integrity attribute
TABULAR_TYPE        data                                                                                             The tabular type
UPDATE_TIME                                                                                                          The last update time (modify time)
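To make the relation between these attributes concrete, a manifest along the following lines would be consistent with the output above. This is only a sketch: the actual content of count--generator.yml is not shown on this page, the column name id is hypothetical, and the keys are the ones used in the Syntax example.

```yaml
# Hypothetical content for count--generator.yml (an assumption, not the real file)
kind: generator
spec:
  # Caps the count: COUNT and SIZE are reported as 100
  max-record-count: 100
  columns:
    # Hypothetical column name
    - name: id
      # An uncapped integer sequence is reported as 2147483647 (SIZE_NOT_CAPPED)
      type: integer
      data-supplier:
        type: sequence
```

Under this reading, the reported count is the smaller of SIZE_NOT_CAPPED and MAX_RECORD_COUNT, which matches the values 2147483647, 100 and 100 in the output.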
MediaType
Manifest file
The media type of a generator manifest file is text/vnd.tabulify.generator+yaml
Manifest fragment
The media type of a fragment in a define step (i.e. an inline generator) should be text/vnd.tabulify.generator+yaml-fragment
Example:
kind: pipeline
spec:
  steps:
    - name: "Define"
      operation: "define"
      args:
        data-resource:
          # The below media-type has `fragment` in its extension
          # It's a special media-type that makes it possible to define a generator as a yaml in a `define` step
          media-type: text/vnd.tabulify.generator+yaml-fragment
          data-def:
            logical-name: "my-sequence"
            max-record-count: 5
            columns:
              - name: "id"
                type: integer
                data-supplier:
                  type: sequence
    - name: "Print"
      comment: "Print the sequence"
      operation: "print"
Creation
Generators can be created:
- manually by creating a yaml file
- automatically from pipeline input with the fill data operation