---json
{
    "aliases": [
        {
            "path": ":docs:tabul:data:compare"
        }
    ],
    "page_id": "vky10xrtqhc7d0te81bi4"
}
---

====== Tabul - Data Diff Command (Compare) ======

===== About =====
''data diff'' is a [[docs:tabul:command|tabul command]] that performs a [[docs:op:diff|data diff (compare)]] against one or several [[docs:resource:resource|data resources]].

You can diff:
  * the [[docs:resource:content|content of a data resource]]
  * the [[docs:resource:structure|structure of a data resource]]

===== Example =====
  * [[howto:getting_started:10_resource_comparison]]

===== Data Selector (Source) =====
The [[docs:flow:data_selector|data selector]] can choose any [[docs:resource:content|content resource]] such as:
  * [[docs:resource:sql_table|table]]
  * [[docs:resource:request|request]]
  * [[docs:resource:file|file]]

===== Syntax =====
<code bash>
tabul data diff -h
</code>
<code>
Tabul data diff
===============
Performs a comparison (diff) between data resources.

To have a meaningful output, we recommend that the data resources to diff
have been sorted in ascending order on the unique columns.

Driver Columns (Unique Columns):
It's highly recommended to set the `--driver-columns` option.
The `driver columns` are the record identifier and drive the gathering of the records to compare.
Without driver columns, all columns are taken in order.
By default, this is
  * all columns in order for a record diff
  * the column name for a structure diff

With the `--report-type` option, you can control the output:
  * `summary` will return a summary report,
  * `unified` will return a unified report (default, record grain)

With the `--diff-data-origin` option, you can control the origin of data:
  * `record` will perform a comparison on the records of the data resource,
  * `structure` will perform a comparison on the structure of the data resource (by attribute name)
  * `attributes` will perform a comparison on the attributes of the data resource (by attribute key)

Exit:
If there is a non-equality (ie a diff), the process will exit with an error status,
unless --no-fail is set.

Examples
--------
1 - Data diff between two different queries located in the current directory:

    tabul data diff (queryFile1.sql)@sqlite (queryFile2.sql)@sqlite

2 - Data diff between a query and a table on two different systems

    tabul data diff (queryFile.sql)@sqlite table@postgres

Syntax
------

    tabul data diff [options|flags]

where:

  Arguments:
    A data selector that selects the data resources to compare
    A target data uri (Example: table@connection or foo.csv@cd)

  Options:
    -do,--diff-data-origin              The data origin of the diff (record, structure or attributes)
    --driver-columns                    The column names that drive the diff (unique columns normally)
    -et,--equality-type                 The type of equality (loss, strict)
    --max-change-count                  The maximum number of changes detected before stopping the diff
    --no-colors                         If set, the output is not colored
    --no-fail                           When set, a diff with inequality will not fail
    -rt,--report-type                   The type of report
    --sparse                            If set, the unified report output will be sparse (ie only the changes are seen, no context)

  Data Definition Options:
    -sa,--source-attribute              Set a source attribute
    -ta,--target-attribute              Set a target attribute

  Selection Options:
    --strict-selection                  If set, the selection will return an error if no data resources have been selected
    -wd,--with-dependencies             If set, the dependencies will also be selected

  Global Options:
    -ah,--app-home                      The app home directory (default to the .tabul.yml file directory)
    -vf,--conf                          The path to a configuration file
    -ee,--exec-env                      The execution environment (prod or dev)
    -h,--help                           Print this help
    -l,--log-level                      Set the log level
    -ns,--not-strict                    A minor error will not stop the process.
    -odu,--output-data-uri              Defines the output data uri for the feedback data (default: console)
    -oo,--output-operation              Defines the data operations (replace, truncate) on an existing output resource before transfer.
    -oop,--output-transfer-operation    Defines the output transfer operation (insert, update, merge, copy). Default to `copy` for a file system and `insert` for a database.
    -pp,--passphrase                    A passphrase (master password) to decrypt the encrypted vault values (Env: TABUL_PASSPHRASE)
    --pipe-mode                         Use pipe mode if you want to pipe the output in a shell. Pipe mode will not print the headers (ie column name) and will not make the control characters visible
    -v,--version                        Print version information
</code>
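
===== Exit Status in a Script =====
The help text states that a diff with an inequality makes the process exit with an error status unless ''--no-fail'' is set. The following is a minimal sketch of how a script could react to that status. ''report_diff'' is a hypothetical helper name and is not part of tabul, and the sketch assumes that a zero exit status means "no difference", which the help text implies but does not spell out. A non-zero status may also come from another error, so the message stays generic.

```shell
#!/usr/bin/env bash
# Sketch only: `report_diff` is a hypothetical helper, not a tabul feature.
# It runs the command given as arguments and reports on its exit status.
report_diff() {
  local status=0
  # Capture the exit status without aborting when `set -e` is active.
  "$@" || status=$?
  if [ "$status" -eq 0 ]; then
    echo "no difference"
  else
    echo "difference or error (exit status $status)"
  fi
  return "$status"
}

# Usage, reusing the second example of the help text.
# The selector is quoted because bash treats unquoted parentheses specially:
#   report_diff tabul data diff '(queryFile.sql)@sqlite' table@postgres
```

When the diff is only informational, passing ''--no-fail'' to ''tabul data diff'' should, per the help text, keep an inequality from producing the error status, so a helper like this would not be needed.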