This how-to shows you how to generate random data from a list of values with the column histogram generator.
Tip: The random column generator can also generate random values for all primary data types (number, date, string). See Tabulify - How to generate random data
To generate data, you need to create a generator file that will describe the data to be generated.
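The data resource generator below defines an id column and two histogram columns.
The two bucket columns (bucket_map and bucket_list) are equivalent.
Both define the same three buckets (blue, red and green), each with an equal chance of being picked: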
kind: generator
spec:
  MaxRecordCount: 10
  Columns:
    - name: id
      type: integer
      comment: An id column to easily see the number of values generated
      data-supplier:
        type: sequence
    - name: bucket_map
      type: varchar
      comment: A column with a random color generator and a map of values with the chance factor
      data-supplier:
        type: histogram
        arguments:
          Buckets:
            blue: 1
            red: 1
            green: 1
    - name: bucket_list
      type: varchar
      comment: A column with a random color generator and a list of values
      data-supplier:
        type: histogram
        arguments:
          Buckets:
            - blue
            - red
            - green
With the data print command, we can print the 10 generated records (30 values across the three columns).
tabul data print histogram_random--generator.yml@howto
id   bucket_map   bucket_list
--   ----------   -----------
1    blue         green
2    blue         green
3    red          red
4    blue         red
5    green        blue
6    green        red
7    red          blue
8    green        red
9    green        blue
10   green        green
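
To make the idea of a chance factor concrete, here is a minimal Python sketch of weighted (histogram) sampling. It is a conceptual illustration only, not Tabulify's implementation: the function names normalize_buckets and histogram_sample are hypothetical, and treating a plain list as a map where every value has a weight of 1 is an assumption based on the equivalence described above.

```python
import random
from collections import Counter


def normalize_buckets(buckets):
    """Return a {value: weight} map from either a map or a plain list.

    Assumption for this sketch: a plain list means every value has weight 1.
    """
    if isinstance(buckets, dict):
        return dict(buckets)
    return {value: 1 for value in buckets}


def histogram_sample(buckets, count, rng):
    """Draw `count` values; each value is picked with probability weight / total."""
    weights = normalize_buckets(buckets)
    values = list(weights)
    return rng.choices(values, weights=[weights[v] for v in values], k=count)


# The two bucket definitions from the generator file above.
bucket_map = {"blue": 1, "red": 1, "green": 1}
bucket_list = ["blue", "red", "green"]

# A seeded generator keeps the sketch reproducible.
rng = random.Random(42)
sample = histogram_sample(bucket_map, 10, rng)
print(sample)

# With unequal chance factors, blue should appear about three times as often as red.
skewed = Counter(histogram_sample({"blue": 3, "red": 1}, 1000, rng))
print(skewed)
```

With equal weights, every color has the same probability on each draw, which is why a small sample of 10 rows can still look uneven, as in the output above.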
Because a generator is just a data resource, you can use it in every data operation.
See How to use a generator in a data operation