Data Diff

Compare your datasets to detect reconciliation issues

Telmai allows you to compute the delta between two tables. The delta includes information about:

  • # of new records

  • # of missing records

  • # of changed records

  • Details on schema changes

  • Delta data in parquet format
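To make the categories above concrete, here is a minimal sketch of a keyed comparison in plain Python. It is an illustration only, not Telmai's implementation: the function name `diff_tables`, the sample rows, and the assumption that "new" means present only in the target table while "missing" means present only in the source table are all hypothetical.

```python
from typing import Any

Row = dict[str, Any]


def diff_tables(source: list[Row], target: list[Row], id_attr: str) -> dict[str, list]:
    """Compare two tables (lists of row dicts) keyed by an ID attribute.

    Assumed convention for this sketch: "new" = only in target,
    "missing" = only in source, "changed" = same ID but different values.
    """
    # Index both tables by the ID attribute for keyed lookups.
    src = {row[id_attr]: row for row in source}
    tgt = {row[id_attr]: row for row in target}

    new_ids = sorted(tgt.keys() - src.keys())
    missing_ids = sorted(src.keys() - tgt.keys())
    changed_ids = sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k])

    # A very rough notion of schema change: column names seen on each side.
    src_cols = set().union(*(row.keys() for row in source))
    tgt_cols = set().union(*(row.keys() for row in target))

    return {
        "new": new_ids,
        "missing": missing_ids,
        "changed": changed_ids,
        "added_columns": sorted(tgt_cols - src_cols),
        "removed_columns": sorted(src_cols - tgt_cols),
    }


# Hypothetical sample data.
source = [
    {"id": 1, "name": "Ada", "city": "London"},
    {"id": 2, "name": "Bob", "city": "Paris"},
    {"id": 3, "name": "Cy", "city": "Rome"},
]
target = [
    {"id": 1, "name": "Ada", "city": "London"},
    {"id": 2, "name": "Bob", "city": "Berlin"},
    {"id": 4, "name": "Di", "city": "Oslo", "email": "di@example.com"},
]

result = diff_tables(source, target, "id")
print(result)
```

In this sample, record 4 is new, record 3 is missing, record 2 is changed, and `email` shows up as an added column. The sketch also shows why an ID attribute is needed: without a key, there is no way to tell a changed record from one missing record plus one new record.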

Telmai’s Data Diff feature runs as part of a table’s regular scan. You will first need to connect the two datasets you want to compare. Then, update the configuration of the table you want to compare (the target table) using these steps:

  1. Define the ID attribute

  2. You will be prompted to fill in the following details:

    1. Source table: Dataset you want to compare to

    2. Result Destination: Output for parquet files (S3, Azure Blob, or GCP storage)

  3. Once these details are selected, you will be prompted to fill in more details about the associated bucket

  4. The next scan will analyze the deltas between the two datasets, and alerts will be created if deltas exist

Data Diff Alert Example

If any differences are detected between the source and target datasets, a “Data Difference” alert is created, similar to the picture below:

Clicking on the alert shows more details on the changed schema and records, similar to the picture below:

Lastly, you can navigate to the output parquet files to see more details on the changed records.
