---json
{
    "page_id": "zppr4dosafwlgjt7pwacw"
}
---
====== Data Operation - Unzip ======

===== About =====
''unzip'' is a [[:docs:flow:processing-type#stream|stream]] [[:docs:flow:intermediate|intermediate operation]] that unzips (unpacks):
  * its [[:docs:resource:archive|archive]] [[:docs:flow:input|input]]
  * into a [[:docs:flow:target|target]] [[:docs:resource:directory|directory]] defined by the [[#target-data-uri|target-data-uri]] argument

===== Example =====
  * [[:howto:mysql:sample_schema|]]

===== Cli =====
The ''unzip'' operation is also available via the [[:docs:tabul:data:unzip|tabul data unzip]] command.

===== Arguments =====

^ Argument ^ Default ^ Description ^
| ''entry-selector'' | | A list of [[:docs:common:globbing|glob patterns]]. If set, only the archive entries that match are extracted |
| [[#strip-components]] | ''0'' | Number of leading parts stripped from the [[:docs:resource:archive-entry#entry-path|entry path]] to compute the destination path relative to the destination directory (equivalent to ''strip-components'' in tar) |
| [[#target-data-uri]] | ''%%${entry_path}@tmp%%'' | A [[docs:flow:template_data_uri|template data uri]] that defines the destination [[:docs:resource:directory|directory]], in which ''input'' and ''entry'' attributes may be used \\ For instance ''%%${input_logical_name}@tmp%%'' |
^ Flow Property ^^^
| ''output-type'' | ''results'' | The [[:docs:flow:output|output]] \\ - ''targets'', the extracted entries are passed \\ - ''inputs'', the archive inputs are passed \\ - ''results'', the [[#results|results]] of the extraction are passed |
| ''stream-type'' | ''map'' | The type of stream operation \\ * ''map'' will produce one output per archive \\ * ''split'' will produce one output per archive entry |

==== target-data-uri ====
The ''target-data-uri'' defines the location of the extracted entry.
The path is therefore mandatory and must be unique (by default, ''%%${entry_path}%%'').

The following variables may be used in the [[:docs:flow:template_data_uri|template data uri]]:
  * all [[:docs:resource:attribute|resource attributes]] of the ''input'' path
    * For instance, for the [[:docs:resource:logical_name|logical name]]: ''%%${input_logical_name}@tmp%%''
  * the following ''entry'' attributes:
    * ''%%${entry_path}%%'': the [[:docs:resource:archive-entry#path|entry path]]
    * ''%%${entry_N}%%'': the [[:docs:flow:template_string#glob_matched_groups|matched group]] if there is a match with the ''entry-selector'', where ''N'' is the matched group position

==== strip-components ====
''strip-components'' removes one or more leading names from the ''entry path'' used in the [[#target-data-uri|target-data-uri]]. It is generally used to remove the root directory from the path.

For instance,
  * if your ''entry path'' is ''archive-name/file.txt'',
  * setting ''strip-components'' to ''1'' will:
    * remove the ''archive-name'' part
    * set the ''entry path'' to ''file.txt''

Note that if you know that the root directory is called ''archive-name'', you can also remove it with a [[:docs:flow:template_string#glob_matched_groups|matched group backreference]].
For example:
  * ''entry-selector'': ''archive-name/*''
  * ''target-data-uri'': ''%%${entry_1}@tmp%%''

===== Results =====
If you set ''output-type'' to ''results'', you get a data resource with the following columns:

^ Column ^ Description ^
| ''target_data_uri'' | the [[:docs:resource:data_uri|data uri]] of the extracted archive entry |
| ''entry_path'' | the [[..:resource:archive-entry#path|archive entry path]] |
| ''entry_media_type'' | the [[..:resource:archive-entry#media_type|archive entry media type]] |
| ''entry_size'' | the [[..:resource:archive-entry#size|archive entry size]] |
| ''entry_update_time'' | the [[..:resource:archive-entry#columns|archive entry update time]] |

Example:
<code>
target_data_uri      entry_path       entry_media_type  entry_size  entry_update_time
-------------------  ---------------  ----------------  ----------  ---------------------
world/empty.txt@tmp  world/empty.txt  text/plain                 0  2025-08-19 08:57:53.0
world/foo.txt@tmp    world/foo.txt    text/plain                 5  2025-08-19 08:57:42.0
</code>

===== Important Note =====
==== Overwritten Extraction Mode ====
The target path is the path where an entry is extracted. If a file already exists at this target path, it is overwritten by the extracted entry.
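
To make the overwrite rule and the ''strip-components'' computation described on this page concrete, here is a minimal Python sketch built on the standard ''zipfile'' module. It is an illustration only, not the implementation of the ''unzip'' operation: the function names ''strip_components'' and ''unzip'' and the in-memory test archive are assumptions made for this example.

```python
import io
import tempfile
import zipfile
from pathlib import Path, PurePosixPath


def strip_components(entry_path: str, count: int) -> str:
    """Drop the first `count` names from an archive entry path (hypothetical helper)."""
    parts = PurePosixPath(entry_path).parts
    return "/".join(parts[count:])


def unzip(archive: zipfile.ZipFile, target_dir: Path, count: int = 0) -> list[str]:
    """Extract every file entry under target_dir, overwriting existing files."""
    extracted = []
    for info in archive.infolist():
        if info.is_dir():
            continue
        relative = strip_components(info.filename, count)
        if not relative:
            continue  # the whole path was stripped, nothing to write
        target = target_dir / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        # write_bytes replaces any file already present at the target path
        target.write_bytes(archive.read(info))
        extracted.append(relative)
    return extracted


# Build a small in-memory archive with a root directory called archive-name
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("archive-name/file.txt", "hello")

with tempfile.TemporaryDirectory() as tmp:
    target_dir = Path(tmp)
    # A file already exists at the target path before the extraction
    (target_dir / "file.txt").write_text("old content")
    with zipfile.ZipFile(buffer) as zf:
        extracted = unzip(zf, target_dir, count=1)
    content = (target_dir / "file.txt").read_text()

print(extracted)  # ['file.txt']
print(content)    # hello
```

With ''count=1'' the entry ''archive-name/file.txt'' lands at ''file.txt'', and the pre-existing ''file.txt'' is replaced by the extracted content. The sketch does not guard against entry paths that contain ''..''.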