Data Operation - Execute

About

execute is an intermediate operation that executes each input runtime and returns the results of the executions.

It's equivalent to a transfer operation:

Howto

Usage

It allows you to:

Arguments

Name                        Default                         Description
execution-mode              See execution_mode              The mode of execution (transfer or load)
error-data-uri              See error_data_uri              The location of the report for unexpected errors, if any occur
fail-on-error               true                            If an error happens, the step returns a bad exit code
output-type                 results                         The output data resource type (inputs, results, targets). Here, results means the output result (the results of all executions), NOT the runtime result (the result of a single runtime execution)
output-result-columns                                       The list of columns for the output result
processing-type             batch                           The processing type
runtime-result-persistence  See runtime_result_persistence  If true, the runtime results are stored
stop-early                  true                            If true, the execution stops at the first error
strict-input                true                            If true, an input that is not a runtime throws an error
strict-execution            false                           If true, an execution that returns a bad exit value throws an error
target-data-uri             See target_data_uri             A template data uri that defines the location of the runtime results
target-data-def             Optional                        Metadata data attributes in data definition form
target-media-type           Optional                        The media type of the target

Execution Mode

Execute has two execution modes: transfer and load.

Execution Mode Example

For a Sql SELECT query, if the execution mode is:

  • transfer: the statement is executed and the count is calculated by iterating locally over the result set,
  • load: the statement is wrapped in a COUNT statement that is executed to retrieve the count.
# example of wrapped count statement
select count(*) from (the original select)

Note that the difference in performance metrics between the two modes lets you measure the cost of fetching the data over your network.
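The two ways of counting can be imitated outside the tool. The following is a minimal sketch that uses Python's built-in sqlite3 module with an in-memory database. The table name `items`, the query and the helper names `count_by_transfer` and `count_by_load` are hypothetical, and the sketch only illustrates the idea, not how execute is implemented.

```python
import sqlite3


def count_by_transfer(conn, select_sql):
    """Run the statement and count the rows by iterating over the result set locally."""
    count = 0
    for _row in conn.execute(select_sql):
        count += 1
    return count


def count_by_load(conn, select_sql):
    """Wrap the statement in a COUNT statement so that a single row is fetched."""
    wrapped = f"select count(*) from ({select_sql})"
    return conn.execute(wrapped).fetchone()[0]


# Hypothetical sample data: 1000 rows with ids 0..999
conn = sqlite3.connect(":memory:")
conn.execute("create table items (id integer, name text)")
conn.executemany(
    "insert into items values (?, ?)",
    [(i, f"item-{i}") for i in range(1000)],
)

query = "select id, name from items where id % 2 = 0"
transfer_count = count_by_transfer(conn, query)
load_count = count_by_load(conn, query)
print(transfer_count, load_count)
```

Both functions return the same count. Only the first one moves every row to the client, which is why timing them separately gives a rough idea of the fetch cost.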

Transfer

In transfer mode, execute will:

  • execute the executable,
  • fetch, store and report the result,
  • count the number of records in the result.

Load

In load mode, execute will:

  • create a count request and execute it,
  • get the count.

Default Execution Mode

The default value is based on the tabular-type of the inputs.

Default mode of execution  Tabular type
load                       data
transfer                   command (exit code column is present)

Target Data Uri

In this operation, the targets are the results of each runtime execution (known as runtime results).

You can define where they are stored with the target-data-uri argument, a template data uri.

Default value:

execute/${pipeline_start_time}-pipe-${pipeline_logical_name}/${input_logical_name}.log@tmp

Note that the runtime result is stored only if the runtime-result-persistence has the value true.
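To illustrate how such a template resolves to a concrete location, here is a minimal sketch that uses Python's standard `string.Template`, which happens to share the `${name}` placeholder syntax. The helper `expand_data_uri` and the variable values (including the timestamp format) are assumptions made for the example; the tool's own substitution rules may differ.

```python
from string import Template

# Default target-data-uri template, copied from the documentation
DEFAULT_TARGET_DATA_URI = (
    "execute/${pipeline_start_time}-pipe-${pipeline_logical_name}"
    "/${input_logical_name}.log@tmp"
)


def expand_data_uri(template, variables):
    """Substitute ${name} placeholders; raises KeyError if a variable is missing."""
    return Template(template).substitute(variables)


# Hypothetical values, for illustration only
example = expand_data_uri(
    DEFAULT_TARGET_DATA_URI,
    {
        "pipeline_start_time": "20240101-120000",
        "pipeline_logical_name": "nightly",
        "input_logical_name": "query_1",
    },
)
print(example)
```

Each input gets its own file name through `${input_logical_name}`, so the results of different runtimes in the same pipeline run do not overwrite each other.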

Error Data Uri

If any unexpected error occurs, the error log is stored in the location defined by the error-data-uri argument.

The error-data-uri value is a template data uri.

The default value is:

execute/${pipeline_start_time}-pipe-${pipeline_logical_name}/${input_logical_name}-err.log@tmp

Runtime Result Persistence

If true, the results of the runtime executions are stored in the location defined by the target data uri.

Note that this attribute only has an effect in the transfer execution mode and has no effect in the load execution mode.

Strict input

The input should be a runtime.

If the input is not a runtime and strict-input is:

  • true, an error is thrown
  • false, and the input data path is:
    • a file, it's transformed into an executable that is executed at its own connection (i.e. in its own working directory)
    • not a file, an error is thrown
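The decision described above can be sketched as a small function. This is only an illustration of the rules as stated: the `Input` class, the `resolve_input` helper and the returned strings are hypothetical and do not reflect the tool's API.

```python
from dataclasses import dataclass


@dataclass
class Input:
    """Hypothetical stand-in for an input data path."""
    name: str
    is_runtime: bool
    is_file: bool


def resolve_input(inp, strict_input=True):
    """Return a description of what would be executed, or raise ValueError."""
    if inp.is_runtime:
        # A runtime is always accepted
        return f"runtime:{inp.name}"
    if strict_input:
        raise ValueError(f"{inp.name} is not a runtime (strict-input is true)")
    if inp.is_file:
        # A file is wrapped as an executable run at its own connection
        return f"executable:{inp.name}"
    raise ValueError(f"{inp.name} is neither a runtime nor a file")


print(resolve_input(Input("report.sql", is_runtime=False, is_file=True), strict_input=False))
```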

Runtime Result Persistence

runtime-result-persistence specifies whether the results of the execution of a runtime should be stored.

This argument lets you choose not to store this result in loop execution mode.

By default, it's:

Output Result Columns

You can choose the columns of the output result with the output-result-columns argument.

Name                     Count execution mode default  Loop execution mode default  Description
runtime_data_uri         x                             x                            The runtime data uri (i.e. the uri of the input executed)
exit_code                x                             x                            The exit code (if 0, no error was seen)
count                    x                             x                            The record count
latency                  x                             x                            The duration as a human-readable string
data_uri                                               x                            The target_data_uri or the error data uri
error_message            x                             x                            The error message, if any

Other possible values, not chosen by default:

latency_millis                                                                      The duration in milliseconds
result_data_uri                                                                     The target_data_uri where the runtime result is stored
runtime_executable_path                                                             The executable data uri
runtime_connection                                                                  The runtime connection
start_time                                                                          The start time (timestamp)
end_time                                                                            The end time (timestamp)
error_data_uri                                                                      The error data uri

Cli

The execute operation is also available from the CLI with the data execute command.

Note

Execution on records

The execute operation does not yet directly support the execution of scripts stored in records, but you can achieve it by:




Related HowTo
Database HowTo - How to load your database with the TPCDS benchmark

This howto will show you how to load a relational database schema in order to create a benchmark with the data query command
