Learning Tabulify - Step 6 - How to transfer Data Resources
Data Transfer
In Tabulify, when you want to manipulate data, it's called a transfer.
For data manipulation, the transfer command is the only command you need.
The transfer command is the most powerfull command and we will introduce it first with a copy.
Sqlite HowTo Connection
Tabulify comes pre-installed with connections used in the HowTo. We call this connections, the Howto's connections.
In this learning guide, we will use as target connection, the local sqlite how-to connection
tabul connection info sqlite
Information about the connection (sqlite)
Attribute Value Description
-------------------------- --------------------------------------------------- --------------------------------------------------------------------
BOOLEAN_DATA_TYPE Native Boolean data type used to store boolean values
BUILDER_CACHE_ENABLED true Enable or disable the builder cache
CHAR_DEFAULT_PRECISION 1 Default CHAR precision
COMMENT The sqlite default connection A connection description
DATABASE_MAJOR_VERSION The major version number of the database
DATABASE_MINOR_VERSION The minor version number of the database
DATABASE_PRODUCT_NAME The name of the database
DATABASE_PRODUCT_VERSION The version of the database
DATE_DATA_TYPE NATIVE Date data type used to store date values
DRIVER org.sqlite.JDBC The jdbc driver class
DRIVER_NAME The driver name
DRIVER_VERSION The driver version
JDBC_MAJOR_VERSION The major version number of JDBC
JDBC_MINOR_VERSION The minor version number of JDBC
LOGIN_STATEMENTS Statements that runs after a connection has been established
MAX_CONCURRENT_THREAD The maximum number of threads that can be created against the system
MAX_NAME_IN_PATH The maximum number of names in a path
NAME sqlite The name of the connection
NAME_QUOTING_DISABLED_CASE UPPERCASE The case to apply when quoting is disabled
NAME_QUOTING_ENABLED true Enable quoting of names
NATIVES Native Driver attributes (jdbc properties, ...)
NCHAR_DEFAULT_PRECISION 1 Default NCHAR precision
NVARCHAR_DEFAULT_PRECISION 0 Default NVARCHAR precision
ORIGIN CONF The origin of the connection
PASSWORD The user password
SUPPORT_BATCH_UPDATES If the system supports batch SQL updates
SUPPORT_NAMED_PARAMETERS If the system supports named parameters in the SQL statement
TIMESTAMP_DATA_TYPE NATIVE Timestamp data type used to store timestamp values
TIME_DATA_TYPE NATIVE Time format data type to store time values
TYPE_AFFINITY_CONVERSION false If true, the type name are converted to affinity name
URI jdbc:sqlite:////home/tabulify/.tabul/sqlite.sqlite3 The uri of the connection
USER The user name
VARCHAR_DEFAULT_PRECISION 0 Default VARCHAR precision
WORKING_PATH The working path (Schema for database, directory for file system)
Copy
For the purpose of this learning guide, we will copy the sales data from tpcds to the sqlite connection with the copy command that is an alias of the transfer command.
tabul data copy --with-dependencies *sales@tpcds @sqlite
where:
- tabul data copy points to the copy command
- *sales@tpcds is a data selector that select all table that ends with sales in the tpcds connection (We have covered this subject in the how to select data resource page)
- --with-dependencies select the dependencies of the selected data resources
- @sqlite defines the sqlite connection as the target connection
At the end of the process, you get a list of all tables where data were loaded and the number of row that were inserted.
Transfer results
input target latency record_count error_code error_message
---------------------------- ----------------------------- ------- ------------ ---------- -------------
call_center@tpcds call_center@sqlite 0.60s 2
catalog_page@tpcds catalog_page@sqlite 0.220s 11718
catalog_sales@tpcds catalog_sales@sqlite 6.675s 89807
customer@tpcds customer@sqlite 0.52s 1000
customer_address@tpcds customer_address@sqlite 0.42s 1000
customer_demographics@tpcds customer_demographics@sqlite 19.117s 1920800
date_dim@tpcds date_dim@sqlite 3.594s 73049
household_demographics@tpcds household_demographics@sqlite 0.66s 7200
income_band@tpcds income_band@sqlite 0.24s 20
item@tpcds item@sqlite 0.185s 2000
promotion@tpcds promotion@sqlite 0.17s 3
ship_mode@tpcds ship_mode@sqlite 0.25s 20
store@tpcds store@sqlite 0.21s 2
store_sales@tpcds store_sales@sqlite 5.516s 120527
time_dim@tpcds time_dim@sqlite 1.29s 86400
warehouse@tpcds warehouse@sqlite 0.96s 1
web_page@tpcds web_page@sqlite 0.22s 2
web_sales@tpcds web_sales@sqlite 0.846s 11876
web_site@tpcds web_site@sqlite 0.19s 2
Copy Verification
To check the copy, we will list the tables in the sqlite connection with the list command and the count attribute to see the number of rows by table.
tabul data list --attribute count *@sqlite
# or with short options
# tabul data list -a count *@sqlite
path media_type count
---------------------- ---------- -------
call_center table 2
catalog_page table 11718
catalog_sales table 89807
customer table 1000
customer_address table 1000
customer_demographics table 1920800
date_dim table 73049
household_demographics table 7200
income_band table 20
item table 2000
promotion table 3
ship_mode table 20
store table 2
store_sales table 120527
time_dim table 86400
warehouse table 1
web_page table 2
web_sales table 11876
web_site table 2
Next
Now, that we got data, we can discover how to execute a Sql Query