How to define an archive entry as a data resource?

About

This howto shows you how to define an entry in an archive as a data resource.

Scenario

The world-db.tar.gz archive of the MySQL world sample schema contains a file called world.sql. Rather than extracting it each time, we want to use it directly as a data resource.

Steps

Determine the entry path

The first step is to determine the entry path of your file in the archive.

You can list the content of an archive with the tabul print command:

tabul data print https://downloads.mysql.com/docs/world-db.tar.gz

[email protected]\docs
path                 media_type          size   update_time
------------------   ---------------   ------   ---------------------
world-db/            inode/directory        0   2025-10-31 23:05:45.0
world-db/world.sql   application/sql   398629   2025-10-31 23:05:45.0

In this listing, we can see that world.sql is located at the path world-db/world.sql.
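To make the idea of an entry path concrete, here is a minimal Python sketch that lists the entries of a tar.gz archive with the standard tarfile module. It is an illustration only, not how Tabulify works internally. The archive is a small in-memory stand-in that mimics the layout shown above, and the helper names (build_sample_archive, list_entries) are hypothetical.

```python
import io
import tarfile


def build_sample_archive() -> bytes:
    """Build a tiny in-memory tar.gz that mimics the layout of world-db.tar.gz."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        # Directory entry: world-db/
        directory = tarfile.TarInfo("world-db")
        directory.type = tarfile.DIRTYPE
        archive.addfile(directory)
        # File entry: world-db/world.sql (stand-in content, not the real dump)
        content = b"-- stand-in for the real dump\nSELECT 1;\n"
        entry = tarfile.TarInfo("world-db/world.sql")
        entry.size = len(content)
        archive.addfile(entry, io.BytesIO(content))
    return buffer.getvalue()


def list_entries(archive_bytes: bytes) -> list:
    """Return (path, is_directory, size) for every entry of a tar.gz archive."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as archive:
        return [(m.name, m.isdir(), m.size) for m in archive.getmembers()]


entries = list_entries(build_sample_archive())
for path, is_directory, size in entries:
    # Show directories with a trailing slash, as in the listing above
    print(path + ("/" if is_directory else ""), size)
```

The path printed for the file entry is the value to use as the entry path.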

Manifest

The second step is to define the archive entry. In Tabulify, you define it by creating an archive entry manifest.

In this example, the manifest is called archive/world-sql--archive-entry.yml.

With the cat command, we can see its content.

tabul data cat archive/world-sql--archive-entry.yml@howto

kind: archive-entry
spec:
  # The data uri of the archive
  data-uri: https://downloads.mysql.com/docs/world-db.tar.gz
  data-def:
    # The path of the file in the archive
    entry-path: world-db/world.sql

In this manifest,

the data-uri defines the data uri of the archive
the entry-path (in data-def) defines the path of the entry in the archive
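As an illustration of what these two fields express, the following Python sketch reads a single entry from a tar.gz archive without unpacking the rest to disk. It is a sketch under assumptions: the manifest is mirrored as a plain dict, the archive is a small in-memory stand-in rather than the real download, and the helper names are hypothetical. It does not reflect Tabulify's implementation.

```python
import io
import tarfile

# The manifest fields, mirrored as a plain dict (assumption: no YAML parsing here)
manifest = {
    "kind": "archive-entry",
    "spec": {
        "data-uri": "https://downloads.mysql.com/docs/world-db.tar.gz",
        "data-def": {"entry-path": "world-db/world.sql"},
    },
}


def make_stand_in_archive(entry_path: str, content: bytes) -> bytes:
    """Create an in-memory tar.gz holding one file, standing in for the download."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        info = tarfile.TarInfo(entry_path)
        info.size = len(content)
        archive.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


def read_entry(archive_bytes: bytes, entry_path: str) -> bytes:
    """Read one entry from a tar.gz archive by its path."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as archive:
        member = archive.extractfile(entry_path)
        if member is None:
            raise ValueError(f"{entry_path} is not a regular file")
        return member.read()


entry_path = manifest["spec"]["data-def"]["entry-path"]
archive_bytes = make_stand_in_archive(entry_path, b"SELECT 1;\n")
sql_text = read_entry(archive_bytes, entry_path).decode("utf-8")
print(sql_text)
```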

Runtime Data URI

In the third step, we define the archive entry resource as a runtime resource.

Why? Because the manifest is a runtime: it must be executed to produce the entry.

To:

execute the manifest archive/world-sql--archive-entry.yml
located in the howto directory
and extract the entry into the tmp directory,

we define the following runtime data uri:

(archive/world-sql--archive-entry.yml@howto)@tmp

The entry will be extracted into the tmp directory, at the path tmp directory/entry-path (here, world-db/world.sql under the tmp directory).
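To show how the three parts of this runtime data uri fit together, here is a small Python sketch that splits it into the manifest path, the manifest connection, and the target connection. The regular expression and the function name are hypothetical and only cover the shape used in this howto; this is not Tabulify's own parser.

```python
import re

# Hypothetical pattern for the shape "(path@connection)@target" used in this howto
RUNTIME_URI = re.compile(
    r"^\((?P<path>[^@()]+)@(?P<connection>[^@()]+)\)@(?P<target>[^@()]+)$"
)


def split_runtime_uri(uri: str) -> dict:
    """Split a runtime data uri into manifest path, connection and target."""
    match = RUNTIME_URI.match(uri)
    if match is None:
        raise ValueError(f"not a runtime data uri of the expected shape: {uri}")
    return match.groupdict()


parts = split_runtime_uri("(archive/world-sql--archive-entry.yml@howto)@tmp")
print(parts)

# The extracted entry is then addressed by its entry path in the target connection
entry_path = "world-db/world.sql"
result_uri = f"{entry_path}@{parts['target']}"
print(result_uri)
```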

Usage

With this runtime data uri, you can use the archive entry as a resource.

For instance, you can look at the first records of the SQL file with the head command:

tabul data head '(archive/world-sql--archive-entry.yml@howto)@tmp'
# The quotes are mandatory in bash because parentheses are bash tokens (i.e. they start a subshell)

The first 10 rows of the data resource (world-db/world.sql@tmp):
name             subset           category   sql
--------------   --------------   --------   ------------------------------------------------------------------
script_comment   script_comment   comment    -- MySQL dump 10.13  Distrib 8.0.19, for osx10.14 (x86_64)\n
script_comment   script_comment   comment    --\n
script_comment   script_comment   comment    -- Host: 127.0.0.1    Database: world\n
script_comment   script_comment   comment    -- ------------------------------------------------------\n
script_comment   script_comment   comment    -- Server version	8.0.19-debug\n
unknown          unknown          unknown    /*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */
unknown          unknown          unknown    /*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */
unknown          unknown          unknown    /*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */
unknown          unknown          unknown    /*!50503 SET NAMES utf8mb4 */
unknown          unknown          unknown    /*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */
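If you build the head command above from a script rather than typing it, the quoting of the runtime data uri can be delegated to a library. The following Python sketch uses the standard shlex module to produce a shell-safe command line. It only builds the string and does not run tabul.

```python
import shlex

runtime_uri = "(archive/world-sql--archive-entry.yml@howto)@tmp"
command = ["tabul", "data", "head", runtime_uri]

# shlex.join quotes only the arguments that contain shell-special characters,
# such as the parentheses of the runtime data uri
command_line = shlex.join(command)
print(command_line)
# prints: tabul data head '(archive/world-sql--archive-entry.yml@howto)@tmp'
```

When the command is passed as a list to subprocess.run without a shell, no quoting is needed at all, because the parentheses are never interpreted by bash.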