About
Tpcds offers 100 queries. We will load the tpcds data and execute the query number 11.
Steps
Start Postgres
The next steps will be using the postgres howto connection.
- You need to start the postgres docker image.
tabul service start postgres
1 service was started
Name Type
-------- ------
postgres docker
- Ping your connection to check that the database is up.
tabul connection ping postgres
The connection (postgres) has been pinged successfully
Query file location
The file containing this query is located below the tpcds-query connection.
To get the absolute path and other information, you can use the data info. Example:
tabul data info postgres/query_11.sql@tpcds_query
Information about the data resource (postgres/query_11.sql@tpcds_query)
attribute value description
---------------- ------------------------------------------------------------------------------------------------ ----------------------------------------------------------------------
CHARACTER_SET UTF-8 The character set of the file
COLUMN_NAME sql The name of the column when the text content is returned on one column
END_OF_RECORD ;\r, ;\r\n, ;\n, /\r, /\r\n, /\n, \rgo\r, \r\ngo\r\n, \ngo\n, \rGO\r, \r\nGO\r\n, \nGO\n The End Of Record string sequence
PARSING_MODE SQL How to parse the SQL files
URI file:///opt/tabulify/resources/tpcds_query/postgres/query_11.sql The URI of the file
ABSOLUTE_PATH /opt/tabulify/resources/tpcds_query/postgres/query_11.sql The absolute path on the data system
ACCESS_TIME 2025-10-12 16:18:26.09177789 The access time (access time)
COMMENT A comment
CONNECTION tpcds_query The connection name
COUNT 1 The number of records
CREATION_TIME 2025-10-10 19:00:36.984447554 The creation time (birth time)
DATA_URI postgres/query_11.sql@tpcds_query The data uri
KIND sql The kind of media
LOGICAL_NAME query_11 The logical name
MD5 25d9c65fe0726c0d7b6e23655379e00d The Md5 hash
MEDIA_SUBTYPE sql The media subType
MEDIA_TYPE text/sql The media type
NAME query_11.sql The name of the data resource
PARENT postgres The parent
PATH postgres/query_11.sql The relative path to the default connection path
SHA384 ab8b87103f126f50388a56d6da667840599b7aaadef0a6fbb1340f3764ffb2169d1e4f8ff471c3ebcbe4a4b944970e83 The Sha384 hash
SHA384_INTEGRITY sha384-q4uHED8Sb1A4ilbW2mZ4QFmbeqre8Kb7sTQPN2T/shadHk+P9HHD68vkpLlElw6D The sha384 value used in the html integrity attribute
SIZE 2819 The number of byte
TABULAR_TYPE data The tabular type
UPDATE_TIME 2025-10-10 19:00:36.984447554 The last update time (modify time)
This query shows customers that have generated growing sales from one year to another in the s and w sales type. Below is an excerpt.
tabul data print --type text --pipe-mode postgres/query_11.sql@tpcds_query
with year_total as (
select c_customer_id customer_id
,c_first_name customer_first_name
,c_last_name customer_last_name
,c_preferred_cust_flag customer_preferred_cust_flag
,c_birth_country customer_birth_country
,c_login customer_login
,c_email_address customer_email_address
,d_year dyear
,sum(ss_ext_list_price-ss_ext_discount_amt) year_total
,'s' sale_type
from customer
,store_sales
,date_dim
where c_customer_sk = ss_customer_sk
and ss_sold_date_sk = d_date_sk
group by c_customer_id
,c_first_name
,c_last_name
,c_preferred_cust_flag
,c_birth_country
,c_login
,c_email_address
,d_year
union all
select c_customer_id customer_id
,c_first_name customer_first_name
,c_last_name customer_last_name
,c_preferred_cust_flag customer_preferred_cust_flag
,c_birth_country customer_birth_country
,c_login customer_login
,c_email_address customer_email_address
,d_year dyear
,sum(ws_ext_list_price-ws_ext_discount_amt) year_total
,'w' sale_type
from customer
,web_sales
,date_dim
where c_customer_sk = ws_bill_customer_sk
and ws_sold_date_sk = d_date_sk
group by c_customer_id
,c_first_name
,c_last_name
,c_preferred_cust_flag
,c_birth_country
,c_login
,c_email_address
,d_year
)
select
t_s_secyear.customer_id
,t_s_secyear.customer_first_name
,t_s_secyear.customer_last_name
,t_s_secyear.customer_email_address
from year_total t_s_firstyear
,year_total t_s_secyear
,year_total t_w_firstyear
,year_total t_w_secyear
where t_s_secyear.customer_id = t_s_firstyear.customer_id
and t_s_firstyear.customer_id = t_w_secyear.customer_id
and t_s_firstyear.customer_id = t_w_firstyear.customer_id
and t_s_firstyear.sale_type = 's'
and t_w_firstyear.sale_type = 'w'
and t_s_secyear.sale_type = 's'
and t_w_secyear.sale_type = 'w'
and t_s_firstyear.dyear = 2001
and t_s_secyear.dyear = 2001+1
and t_w_firstyear.dyear = 2001
and t_w_secyear.dyear = 2001+1
and t_s_firstyear.year_total > 0
and t_w_firstyear.year_total > 0
and case when t_w_firstyear.year_total > 0 then t_w_secyear.year_total / t_w_firstyear.year_total else 0.0 end
> case when t_s_firstyear.year_total > 0 then t_s_secyear.year_total / t_s_firstyear.year_total else 0.0 end
order by t_s_secyear.customer_id
,t_s_secyear.customer_first_name
,t_s_secyear.customer_last_name
,t_s_secyear.customer_email_address
limit 100;
Loading the sales data
To prevent any conflict, you can execute the below command to drop all tables from the postgres howto connection
tabul data drop --no-strict-selection *@postgres
Loading the data with the transfer command. The query_11 is a sales query, we are then loading the sales data and their dependencies
tabul data transfer --with-dependencies *sales@tpcds @postgres
where:
- *sales@tpcds is a data selector that:
- selects all tables that finish with the word sales (ie *sales)
- in the TPC-DS connection (ie tpcds)
- --with-dependencies add the dependencies of the selection.
- @postgres defines the target connection (the postgres howto connection)
Transfer results
input target latency record_count error_code error_message
---------------------------- ------------------------------- ------- ------------ ---------- -------------
call_center@tpcds call_center@postgres 0.92s 2
catalog_page@tpcds catalog_page@postgres 0.507s 11718
catalog_sales@tpcds catalog_sales@postgres 14.293s 89807
customer@tpcds customer@postgres 0.161s 1000
customer_address@tpcds customer_address@postgres 0.63s 1000
customer_demographics@tpcds customer_demographics@postgres 35.549s 1920800
date_dim@tpcds date_dim@postgres 4.945s 73049
household_demographics@tpcds household_demographics@postgres 0.175s 7200
income_band@tpcds income_band@postgres 0.37s 20
item@tpcds item@postgres 0.298s 2000
promotion@tpcds promotion@postgres 0.54s 3
ship_mode@tpcds ship_mode@postgres 0.40s 20
store@tpcds store@postgres 0.42s 2
store_sales@tpcds store_sales@postgres 11.813s 120527
time_dim@tpcds time_dim@postgres 2.141s 86400
warehouse@tpcds warehouse@postgres 0.96s 1
web_page@tpcds web_page@postgres 0.55s 2
web_sales@tpcds web_sales@postgres 2.86s 11876
web_site@tpcds web_site@postgres 0.47s 2
Executing the query
A query is selected via a runtime selector where:
- the first part is a data selector that locates one or more files that contains the query
- and the second part is the connection where the query should run.
In the below script selector
(postgres/query_11.sql@tpcds_query)@postgres
- postgres/query_11.sql@tpcds_query is a data selector that select the file query_11.sql in the subdirectory postgres of the tpcds-queries connection.
- sqlite is the connection where the query is executed.
A query data uri can be used with all command.
Example with the data print command.
tabul data print '(postgres/query_11.sql@tpcds_query)@postgres'
# The quotes are mandatory because parenthesis have a meaning in Bash (ie they start a subshell)
(postgres/query_11.sql@tpcds_query)@postgres
customer_id customer_first_name customer_last_name customer_email_address
---------------- -------------------- ------------------------------ --------------------------------------------------
AAAAAAAABMDAAAAA Jose Guzman [email protected]
AAAAAAAAFLAAAAAA Joseph Riley [email protected]
AAAAAAAAFNCAAAAA Jennifer Fitzpatrick [email protected]
AAAAAAAAJCAAAAAA Maxine Carlson [email protected]
AAAAAAAAJGBAAAAA Bonnie Harrison [email protected]
AAAAAAAAKMDAAAAA Alexander White [email protected]
AAAAAAAAOFCAAAAA Timothy Grogan [email protected]
AAAAAAAAPCBAAAAA Luz Ransom [email protected]
AAAAAAAAPDDAAAAA Reginald Garcia [email protected]