Data Operation - Unzip
About
unzip is an intermediate stream operation that unzips (unpacks) archives.
Example
Cli
The unzip operation is also available via the tabul data unzip command.
Arguments
| Argument | Default | Description |
|---|---|---|
| entry-selector | | A list of glob patterns. If set, only the archive entries that match are extracted |
| strip-components | 0 | Number of parts stripped from the entry path to calculate the destination path relative to the destination directory (equivalent to strip-components in tar) |
| target-data-uri | ${entry_path}@tmp | A template data uri that defines the destination directory, where input and entry attributes may be used. For instance: ${input_logical_name}@tmp |
| Flow Property | | |
| output-type | results | The output: targets (the extracted entries are passed); inputs (the archive inputs are passed); results (the results of the extraction are passed) |
| stream-type | map | The type of stream operation: map produces one output per archive; split produces one output per archive entry |
target-data-uri
The target-data-uri defines the location of the extracted entry.
The path is therefore mandatory and must be unique (by default, ${entry_path}).
The following variables may be used in the template data uri:
- all resource attributes of the input path. For instance, for the logical name: ${input_logical_name}@tmp
- the following entry attributes:
  - ${entry_path}: the entry path
  - ${entry_N}: the matched group if there is a match with the entry-selector, where N is the position of the matched group
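The substitution described above can be sketched in Python. This is only an illustration of how a `${name}` template resolves to a data uri, not the tabul implementation: the function name `resolve_target_uri` and the attribute values are hypothetical, and Python's `string.Template` is used because it happens to share the `${name}` syntax.

```python
from string import Template


def resolve_target_uri(template, attributes):
    """Replace every ${name} placeholder with its value from attributes.

    Hypothetical helper: raises KeyError when the template uses an unknown variable.
    """
    return Template(template).substitute(attributes)


# Example values (assumptions): one attribute of the input resource, one of the entry.
attributes = {
    "input_logical_name": "archive",
    "entry_path": "world/foo.txt",
}

print(resolve_target_uri("${entry_path}@tmp", attributes))
print(resolve_target_uri("${input_logical_name}/${entry_path}@tmp", attributes))
```

With these values, the default template resolves to world/foo.txt@tmp, and the second template places the entry under a directory named after the input.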
strip-components
strip-components removes one or more leading names from the entry path used in the target-data-uri.
It is generally used to remove the root directory from the path.
For instance, if your entry path is archive-name/file.txt, setting strip-components to 1 will:
- remove the archive-name part
- set the entry path to file.txt
Note that if you know that the root directory is called archive-name, you can also remove it with a matched group back reference. For instance:
- entry-selector: archive-name/*
- target-data-uri: ${entry_1}@tmp
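The effect of strip-components on an entry path can be sketched as follows. The helper `strip_components` is hypothetical and only mirrors the behavior described above. Returning `None` when no path part remains is an assumption borrowed from tar, which skips such entries; the source does not say what tabul does in that case.

```python
from pathlib import PurePosixPath
from typing import Optional


def strip_components(entry_path: str, count: int) -> Optional[str]:
    """Drop the first `count` parts of an archive entry path.

    Returns None when nothing is left (assumption: such an entry would be skipped).
    """
    parts = PurePosixPath(entry_path).parts
    if count >= len(parts):
        return None
    return "/".join(parts[count:])


print(strip_components("archive-name/file.txt", 0))  # archive-name/file.txt
print(strip_components("archive-name/file.txt", 1))  # file.txt
print(strip_components("archive-name/file.txt", 2))  # None
```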
Results
If you set output-type to results, you will get a data resource with the following columns:
| Columns | Description |
|---|---|
| target_data_uri | the data uri of the extracted archive entry |
| entry_path | the archive entry path |
| entry_media_type | the archive entry media type |
| entry_size | the archive entry size |
| entry_update_time | the archive entry update time |
Example:
| target_data_uri | entry_path | entry_media_type | entry_size | entry_update_time |
|---|---|---|---|---|
| world/empty.txt@tmp | world/empty.txt | text/plain | 0 | 2025-08-19 08:57:53.0 |
| world/foo.txt@tmp | world/foo.txt | text/plain | 5 | 2025-08-19 08:57:42.0 |
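To make the columns concrete, here is a minimal Python sketch that derives the same kind of rows from a ZIP archive with the standard `zipfile` module. It is not how tabul computes its results: the function `list_results`, the fixed `@tmp` suffix, and the in-memory sample archive are assumptions made for the illustration.

```python
import io
import mimetypes
import zipfile
from datetime import datetime


def list_results(archive, target_suffix="@tmp"):
    """Build one result row per file entry (hypothetical helper)."""
    rows = []
    for info in archive.infolist():
        if info.is_dir():
            continue  # directories are not extracted entries
        media_type, _ = mimetypes.guess_type(info.filename)
        rows.append({
            "target_data_uri": info.filename + target_suffix,
            "entry_path": info.filename,
            "entry_media_type": media_type,
            "entry_size": info.file_size,
            "entry_update_time": datetime(*info.date_time),
        })
    return rows


# Sample archive built in memory. ZIP timestamps have a 2-second resolution,
# so even seconds are used to get the same values back.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(zipfile.ZipInfo("world/empty.txt", date_time=(2025, 8, 19, 8, 57, 52)), b"")
    archive.writestr(zipfile.ZipInfo("world/foo.txt", date_time=(2025, 8, 19, 8, 57, 42)), b"hello")

with zipfile.ZipFile(buffer) as archive:
    rows = list_results(archive)

for row in rows:
    print(row)
```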
Important Note
Overwritten Extraction Mode
The target path is the path to which an entry is extracted.
If the target path already exists, the existing file is overwritten by the extracted entry.
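Because existing files are overwritten, it can be useful to check for conflicts before extracting. The sketch below shows one way to do that with Python's standard library on a local directory. The helper `existing_targets` is hypothetical, and the local filesystem stands in for the target data uri; tabul itself is not involved.

```python
import io
import tempfile
import zipfile
from pathlib import Path


def existing_targets(archive, destination):
    """Return the entry paths whose target file already exists under destination."""
    return [
        info.filename
        for info in archive.infolist()
        if not info.is_dir() and (destination / info.filename).exists()
    ]


# Build a small in-memory archive for the demonstration.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("world/empty.txt", b"")
    archive.writestr("world/foo.txt", b"hello")

with tempfile.TemporaryDirectory() as tmp:
    destination = Path(tmp)
    # Simulate a file left over from a previous extraction.
    (destination / "world").mkdir()
    (destination / "world" / "foo.txt").write_text("stale")

    with zipfile.ZipFile(buffer) as archive:
        conflicts = existing_targets(archive, destination)
        archive.extractall(destination)  # zipfile also overwrites existing files

    overwritten_content = (destination / "world" / "foo.txt").read_text()

print(conflicts)            # ['world/foo.txt']
print(overwritten_content)  # hello
```

Listing the conflicts first leaves the decision (rename, skip, or accept the overwrite) to the caller before any file is replaced.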