Tabul - Data Diff Command (Compare)

About

data diff is a tabul command that compares (diffs) one or several source data resources against a target data resource.

You can diff:

Example
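As a hedged illustration that uses only an option listed in the help text of the Syntax section, the sketch below wraps a structure-only comparison in a shell function. The function name `diff_structure` is hypothetical and the selectors are the ones from the help examples. The quoting matters because unquoted parentheses are a syntax error in POSIX shells.

```shell
# Hypothetical helper (not part of tabul): compare only the structure
# (attribute names) of a source selector ($1) against a target data uri ($2).
diff_structure() {
  tabul data diff \
    --diff-data-origin structure \
    "$1" "$2"
}

# Usage (quote the selectors so the shell does not interpret the parentheses):
# diff_structure '(queryFile.sql)@sqlite' 'table@postgres'
```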

Data Selector (Source)

The data selector can choose any content resource such as:

Syntax

tabul data diff -h
Tabul data diff
===============

Performs a comparison (diff) between data resources.

To have a meaningful output, we recommend that the data resources to diff
are sorted in ascending order on the unique columns.

Driver Columns (Unique Columns):
It's highly recommended to set the `--driver-columns` option.
The `driver columns` are the record identifier and drive the gathering of the records to compare.
Without the option, the default driver columns are:
  * all columns in order for a record diff
  * the column name for a structure diff

With the `--report-type` option, you can control the output:
  * `summary` will return a summary report,
  * `unified` will return a unified report (default, record grain)

With the `--diff-data-origin` option, you can control the origin of data:
  * `record` will perform a comparison on the records of the data resource,
  * `structure` will perform a comparison on the structure of the data resource (by attribute name)
  * `attributes` will perform a comparison on the attributes of the data resource (by attribute key)

Exit:
If there is a non-equality (i.e. a diff), the process will exit with an error status, unless --no-fail is set.



Examples
--------

 1 - Data diff between two different queries located in the current directory:


    tabul data diff (queryFile1.sql)@sqlite (queryFile2.sql)@sqlite


 2 - Data diff between a query and a table on two different systems


    tabul data diff (queryFile.sql)@sqlite table@postgres




Syntax
------


    tabul data diff [options|flags] <source-selector...> <target-data-uri>


where:


  Arguments:

    <source-selector...>                                    A data selector that selects the data resources to compare

    <target-data-uri>                                       A target data uri (Example: table@connection or foo.csv@cd)


  Options:

    -do,--diff-data-origin <record|structure|attributes>    The data origin of the diff (record, structure or attributes)

    --driver-columns <columnName>                           The column names that drive the diff (unique columns normally)

    -et,--equality-type <loss|strict>                       The type of equality (loss, strict)

    --max-change-count <value>                              The maximum number of changes detected before stopping the diff

    --no-colors                                             If set, the output is not colored

    --no-fail                                               If set, a diff with inequality will not fail

    -rt,--report-type <cell|unified|summary>                The type of report

    --sparse                                                If set, the unified report output will be sparse (ie only the changes are seen, no context)


  Data Definition Options:

    -sa,--source-attribute <attributeName=value>            Set a source attribute

    -ta,--target-attribute <attributeName=value>            Set a target attribute


  Selection Options:

    --strict-selection                                      If set the selection will return an error if no data resources have been selected

    -wd,--with-dependencies                                 If set, the dependencies will be also selected


  Global Options:

    -ah,--app-home <path>                                   The app home directory (default to the .tabul.yml file directory)

    -vf,--conf <path>                                       The path to a configuration file

    -ee,--exec-env <name>                                   The execution environment (prod or dev)

    -h,--help                                               Print this help

    -l,--log-level <error|warning|tip|info|fine>            Set the log level

    -ns,--not-strict                                        A minor error will not stop the process.

    -odu,--output-data-uri <outputDataUri>                  defines the output data uri for the feedback data (default: console)

    -oo,--output-operation <dataOperation>                  defines the data operations (replace, truncate) on an existing output resource before transfer.

    -oop,--output-transfer-operation <transferOperation>    defines the output transfer operation (insert, update, merge, copy). Default to `copy` for a file system and `insert` for a database.

    -pp,--passphrase <passphrase>                           A passphrase (master password) to decrypt the encrypted vault values (Env: TABUL_PASSPHRASE)

    --pipe-mode                                             Use pipe mode if you want to pipe the output in a shell. Pipe mode will not print the headers (ie column name) and will not make the control character visible

    -v,--version                                            Print version information
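
The exit behaviour described in the help above (an error status when a difference is found) lends itself to gating a script step. The following is a minimal sketch under stated assumptions: the function name `check_same_data` and the driver column `id` are hypothetical, and the exact value of the error status is not documented here, so the code only distinguishes zero from non-zero.

```shell
# check_same_data SOURCE_SELECTOR TARGET_URI DRIVER_COLUMN
# Hypothetical wrapper (not part of tabul).
# Returns 0 when tabul reports no difference, otherwise the non-zero status
# returned by tabul (a difference, or any other error).
check_same_data() {
  if tabul data diff --driver-columns "$3" "$1" "$2"; then
    echo "no difference"
  else
    status=$?
    echo "difference or error (exit status $status)" >&2
    return "$status"
  fi
}

# Usage: the driver column `id` is an assumption about the data.
# check_same_data '(queryFile.sql)@sqlite' 'table@postgres' id
```

The selectors are quoted because unquoted parentheses are a syntax error in POSIX shells. A non-zero status may also come from an unrelated failure (for example a connection error), so the message does not claim that a difference was found.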