---json
{
    "aliases": [
        { "path": ":howto:pipeline:enrich:enrich" }
    ],
    "page_id": "zv16wr10aztteapuiyeoc"
}
---

====== How to add information about the selected resources with the Enrich operation? ======

===== About =====

[[:docs:op:enrich|enrich]] is an intermediate operation that adds [[:docs:resource:virtual_column|virtual columns]] to its [[:docs:flow:input|inputs]] using [[:docs:generator:data-supplier|data suppliers]].

Enrich accepts a single argument, ''data-def'', where you define the extra columns, called [[:docs:resource:virtual_column|virtual columns]], each with its own [[:docs:generator:data-supplier|data supplier]].

A [[:docs:generator:data-supplier|data supplier]] is a function that supplies a value to a column.

In the steps below, we add to the input resources:
  * the input file name
  * the input file extension
  * and an increasing sequence

===== Steps =====

==== The input example ====

In this example, we showcase the [[:docs:op:enrich|enrich operation]] with ''enrich-me.md'', a [[:docs:resource:text|text file]] located in the ''pipeline/enrich'' subdirectory of the [[:docs:connection:howto|howto directory]].

With the [[:docs:tabul:data:concat|cat]] command

<code bash>
tabul data cat pipeline/enrich/enrich-me.md@howto
</code>

We can see the content:

<code>
This is a file used in the [enrich pipeline](../enrich.yml)
for demonstration
</code>

==== The pipeline ====

In this example, the [[:docs:flow:pipeline|pipeline]] will:
  * select a markdown [[:docs:resource:text|text file]] with the [[:docs:op:define|define operation]]
  * [[:docs:op:enrich|enrich]] its records with 3 [[:docs:resource:virtual_column|virtual columns]]:
    * the input ''file name'' thanks to the [[:docs:generator:meta|meta data supplier]]
    * the input ''file extension'' thanks to the [[:howto:generator:expression|expression data supplier]]
    * a column ''line_id'' with an increasing sequence thanks to the [[:docs:generator:sequence|sequence data supplier]]
  * and [[:docs:op:print|print]] the records

<code yaml>
kind: pipeline
spec:
  steps:
    - operation: 'define'
      arguments:
        data-resource:
          data-uri: 'pipeline/enrich/enrich-me.md@howto'
    - operation: 'enrich'
      arguments:
        data-def:
          columns:
            - name: file_name
              data-supplier:
                type: meta
                arguments:
                  attribute: name
            - name: file_extension
              data-supplier:
                type: expression
                arguments:
                  column-variable: file_name
                  expression: "file_name.split('.').pop()"
            - name: line_id
              type: integer
              data-supplier:
                type: sequence
    - operation: 'print'
</code>

==== The execution result ====

By [[:docs:tabul:flow:execute|executing]] the pipeline, we get 4 columns: the ''lines'' column of the text file plus the 3 virtual columns added by enrich.
  * the ''lines'' column with the content of the file (''lines'' is the value of the [[:docs:resource:text|text file column-name attribute]])
  * the ''file_name'' column with the name of the input
  * the ''file_extension'' column with the extension of the input
  * the ''line_id'' column with the sequence value. The file has 2 lines, so there are 2 records.

<code bash>
tabul flow execute --no-results pipeline/enrich.yml@howto
</code>

<code>
pipeline/enrich/enrich-me.md@howto
lines                                                         file_name      file_extension   line_id
-----------------------------------------------------------   ------------   --------------   -------
This is a file used in the [enrich pipeline](../enrich.yml)   enrich-me.md   md                     1
for demonstration                                             enrich-me.md   md                     2
</code>
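
To make the role of each data supplier concrete, the sketch below reproduces the enrichment by hand in JavaScript. It is only an illustration, not how the enrich operation is implemented: the helper name ''enrichLines'' is hypothetical, the expression is assumed to follow JavaScript semantics, and the sequence is assumed to start at 1 and increase by 1, as the result above suggests.

```javascript
// Hypothetical helper (not part of Tabulify): emulates by hand the three
// data suppliers used in the pipeline above.
function enrichLines(fileName, lines) {
  let sequence = 0; // assumption: the sequence starts at 1 and steps by 1
  return lines.map((line) => {
    sequence += 1;
    return {
      lines: line,                               // original column of the text file
      file_name: fileName,                       // meta supplier: the "name" attribute
      file_extension: fileName.split('.').pop(), // expression supplier: text after the last dot
      line_id: sequence,                         // sequence supplier: increasing integer
    };
  });
}

const records = enrichLines('enrich-me.md', [
  'This is a file used in the [enrich pipeline](../enrich.yml)',
  'for demonstration',
]);

// Display the enriched records as a table
console.table(records);
```

Each record carries the original ''lines'' value plus the three virtual columns, which matches the shape of the printed result.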