Learning Tabulify - Step 7 - How to execute a SQL Query
Query
For Tabulify, a SQL Query is a runtime resource that when executed returns a data set in tabular format (known as the result set).
The selector of a runtime is known as a runtime selector. This selector is composed of a data selector inside its expression.
The syntax is:
(sqlSelector)@connection
where:
- @connection defined the connection where the SQL Statements will be executed.
In the next steps, we will show this concepts by running a query of the Tpcds benchmark.
Query 11
Tpcds offers 100 queries that are all located in the tpcds-queries connection
We will execute the query 11 that lists the customers that have generated growing sales from one year to another in the s and w sales type.
Below is the query 11 located at sqlite/query_11.sql@tpcds_query
tabul data print --type text --pipe-mode sqlite/query_11.sql@tpcds_query
# --type text disable sql statement parsing
# --pipe-mode disallow the headers
with year_total as (
select c_customer_id customer_id
,c_first_name customer_first_name
,c_last_name customer_last_name
,c_preferred_cust_flag customer_preferred_cust_flag
,c_birth_country customer_birth_country
,c_login customer_login
,c_email_address customer_email_address
,d_year dyear
,sum(ss_ext_list_price-ss_ext_discount_amt) year_total
,'s' sale_type
from customer
,store_sales
,date_dim
where c_customer_sk = ss_customer_sk
and ss_sold_date_sk = d_date_sk
group by c_customer_id
,c_first_name
,c_last_name
To make the query 11 executable, we need to add the execution connection.
The executable data uri will be then:
(sqlite/query_11.sql@tpcds_query)@connection
Printing the query
Printing the query against the sqlite connection will show us the customers that have generated growing sales from one year to another in the s and w sales type.
tabul data print '(sqlite/query_11.sql@tpcds_query)@sqlite'
# The quotes are only mandatory in bash because parenthesis are a bash token (ie subshell)
(sqlite/query_11.sql@tpcds_query)@sqlite
customer_id customer_first_name customer_last_name customer_email_address
---------------- ------------------- ------------------ ----------------------------------
AAAAAAAABMDAAAAA Jose Guzman [email protected]
AAAAAAAAFLAAAAAA Joseph Riley [email protected]
AAAAAAAAJGBAAAAA Bonnie Harrison [email protected]
AAAAAAAAKMDAAAAA Alexander White [email protected]
AAAAAAAAOFCAAAAA Timothy Grogan [email protected]
A query is just a data resource
Don't forget that:
- an runtime is just a data resource.
A query can therefore be used on every command as any other data resource.
How to copy a query into another table
For instance, if you want to copy its output to a table called query_11, you would issue the following command:
tabul data copy '(sqlite/query_11.sql@tpcds_query)@sqlite' query_11@sqlite
# The quotes are only mandatory in bash because parenthesis are a bash token (ie subshell)
And the content of the new table is just the same as the output of the query 11.
tabul data print query_11@sqlite
query_11@sqlite
customer_id customer_first_name customer_last_name customer_email_address
---------------- ------------------- ------------------ ----------------------------------
AAAAAAAABMDAAAAA Jose Guzman [email protected]
AAAAAAAAFLAAAAAA Joseph Riley [email protected]
AAAAAAAAJGBAAAAA Bonnie Harrison [email protected]
AAAAAAAAKMDAAAAA Alexander White [email protected]
AAAAAAAAOFCAAAAA Timothy Grogan [email protected]
Let's drop it to keep a clean schema
tabul data drop query_11@sqlite
Dropped data resources
data_uri media_type
--------------- ----------
query_11@sqlite table
How to download a query as a CSV
For instance, if you want to copy its output:
- to a CSV called query_11.csv
- into the current directory (cd)
you would issue the following command
tabul data copy '(sqlite/query_11.sql@tpcds_query)@sqlite' query_11.csv@cd
# The quotes are only mandatory in bash because parenthesis are a bash token (ie subshell)
And the content of the new csv is also just the same than the output of the query 11.
tabul data print query_11.csv@cd
query_11.csv@cd
customer_id customer_first_name customer_last_name customer_email_address
---------------- ------------------- ------------------ ----------------------------------
AAAAAAAABMDAAAAA Jose Guzman [email protected]
AAAAAAAAFLAAAAAA Joseph Riley [email protected]
AAAAAAAAJGBAAAAA Bonnie Harrison [email protected]
AAAAAAAAKMDAAAAA Alexander White [email protected]
AAAAAAAAOFCAAAAA Timothy Grogan [email protected]
Next
Now that we can query our data, we are going to learn how to modify the content of a table