Data Resource - Generator Manifest

About

A data resource generator is a content data resource that generates data.

Syntax

A generator manifest is a file whose name ends with the suffix --generator.yml
and whose metadata (the spec node in the example below) follows the data definition file syntax and adds:
- the max-record-count attribute: the maximum number of records generated
- the stream-record-count attribute: the number of records generated per poll in a stream
- the data-supplier column attribute: the node where the data supplier properties are set.

Example:

kind: generator
# Spec follows the data definition format
spec:
  # The maximum number of records generated
  max-record-count: 30
  # The number of records generated for each resource in a generate stream supplier
  stream-record-count: 10
  # The columns definition
  columns:
    - name: columnName
      comment: A column with a sequence integer generator and its properties
      data-supplier: # the data supplier
        type: sequence
        arguments:
          start: 3
          step: 2
          maxTick: 5
    - name: columnName2
      ........

See the column supplier page for all the types of generation that you can choose for a column.
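To make the start and step arguments of the example above more concrete, here is a minimal Python sketch of what a sequence supplier capped by max-record-count could produce. It is an illustration, not Tabulify's implementation: the function name sequence is hypothetical, the behavior (start, start + step, ...) is an assumption, and maxTick is left out because its semantics are not described on this page.

```python
from itertools import count, islice


def sequence(start, step, max_record_count):
    """Return the first max_record_count values of start, start + step, ...

    Hypothetical helper: it only mimics the assumed behavior of a
    `sequence` data supplier capped by `max-record-count`.
    """
    # count() is an infinite arithmetic progression; islice() applies the cap
    return list(islice(count(start, step), max_record_count))


if __name__ == "__main__":
    # start and step come from the manifest example; the cap of 5 is arbitrary
    print(sequence(start=3, step=2, max_record_count=5))  # [3, 5, 7, 9, 11]
```

The cap is applied outside the progression itself, which mirrors how max-record-count is declared at the generator level rather than on the column.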

Attributes

max-record-count

max-record-count is an attribute that defines the maximum number of records generated.

stream-record-count

stream-record-count is an attribute that defines the number of records generated per poll in a generate stream operation.

Count

The count attribute is calculated and defines how many records a generator would generate.

You can see it with the data info command.

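For example, a sequence generator without a cap would report the maximum value of its data type. For an integer, this is 2147483647 (shown as SIZE_NOT_CAPPED in the output below, where max-record-count limits COUNT to 100).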

tabul data info --strict-selection count--generator.yml@howto

Information about the data resource (count@memgen)
attribute             value                                                                                              description
-------------------   ------------------------------------------------------------------------------------------------   -----------------------------------------------------
MAX_RECORD_COUNT      100                                                                                                The maximum of records generated
SIZE                  100                                                                                                The size
SIZE_NOT_CAPPED       2147483647                                                                                         The number of records without max
STREAM_RECORD_COUNT                                                                                                      The records generated in a stream
ABSOLUTE_PATH         count                                                                                              The absolute path on the data system
ACCESS_TIME                                                                                                              The access time (access time)
COMMENT                                                                                                                  A comment
CONNECTION            memgen                                                                                             The connection name
COUNT                 100                                                                                                The number of records
CREATION_TIME         2025-11-10 19:48:59.159433321                                                                      The creation time (birth time)
DATA_URI              count@memgen                                                                                       The data uri
KIND                  generator                                                                                          The kind of media
LOGICAL_NAME          count                                                                                              The logical name
MD5                   ef69caaaeea9c17120821a9eb6c7f1de                                                                   The Md5 hash
MEDIA_SUBTYPE         vnd.tabulify.generator+yaml                                                                        The media subType
MEDIA_TYPE            text/vnd.tabulify.generator+yaml                                                                   The media type
NAME                  count                                                                                              The name of the data resource
PARENT                gen                                                                                                The parent
PATH                  count                                                                                              The relative path to the default connection path
SHA384                0c9b6656498be26d413bf3563198f01be3236d017f75943f9406922d08ba4ec137ffde15d2e95dcb4d77f9d6cd6eec79   The Sha384 hash
SHA384_INTEGRITY      sha384-DJtmVkmL4m1BO/NWMZjwG+MjbQF/dZQ/lAaSLQi6TsE3/94V0uldy013+dbNbux5                            The sha384 value used in the html integrity attribute
TABULAR_TYPE          data                                                                                               The tabular type
UPDATE_TIME                                                                                                              The last update time (modify time)
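The relation between COUNT, MAX_RECORD_COUNT and SIZE_NOT_CAPPED in the output above can be sketched in a few lines of Python. This is an assumption inferred from the values shown (100, 100 and 2147483647), not the actual calculation performed by tabul; the function name effective_count is hypothetical.

```python
INTEGER_MAX = 2**31 - 1  # 2147483647, the largest 32-bit signed integer


def effective_count(size_not_capped, max_record_count=None):
    """Return the number of records a generator would produce.

    Assumption: the count is the uncapped size limited by
    max-record-count when that attribute is set.
    """
    if max_record_count is None:
        return size_not_capped
    return min(size_not_capped, max_record_count)


if __name__ == "__main__":
    print(effective_count(INTEGER_MAX, max_record_count=100))  # 100
    print(effective_count(INTEGER_MAX))  # 2147483647
```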

MediaType

Manifest file

The media type of a generator manifest file is text/vnd.tabulify.generator+yaml

Manifest fragment

The media type of a fragment in a define step (i.e. an inline generator) should be text/vnd.tabulify.generator+yaml-fragment

Example:

kind: pipeline
spec:
  steps:
    - name: "Define"
      operation: "define"
      args:
        data-resource:
          # The media-type below ends with the `fragment` suffix
          # It's a special media-type that makes it possible to define a generator as inline yaml in a `define` step
          media-type: text/vnd.tabulify.generator+yaml-fragment
          data-def:
            logical-name: "my-sequence"
            max-record-count: 5
            columns:
              - name: "id"
                type: integer
                data-supplier:
                  type: sequence
    - name: "Print"
      comment: "Print the sequence"
      operation: "print"

Creation

Generators can be created:

- manually, by creating a yaml file
- automatically, from pipeline input with the fill data operation