Data Quality External Reporting
This page explains the Centralized DQ Monitor Scan Reporting feature, which generates a comprehensive, standardized report for every Data Quality (DQ) Monitor scan and appends the results to a designated external table. This mechanism centralizes DQ metrics from various sources and monitors, providing a unified, historical view of data quality performance.
Report Structure (External Destination Table Schema)
The external table is the central repository for all DQ scan results. It provides a standardized format that allows users to query and analyze DQ performance across all projects, data assets, and monitors.
| Column Name | Data Type | Description |
| --- | --- | --- |
| project_id | Long | Identifier for the project where the data asset resides. |
| data_asset_id | String | Identifier for the specific data asset (e.g., table/stream) being monitored. |
| monitor_Id | Long | Unique identifier for the Monitor that performed the check. This is the primary key for tracking the rule execution. |
| scan_timestamp | Timestamp | The exact time the DQ job completed execution. |
| total_records_failed | Integer | Count of records that failed the specific Monitor's check. |
| total_records_scanned | Integer | Total count of records processed by the job for this check. |
| record_id_attribute_name | String | The name of the primary key/ID attribute used to uniquely identify records in the data asset. |
| record_id_sample | Array of Strings | A sample of up to 100 failed record IDs. This sample aids in immediate investigation and debugging. |
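For readers consuming the report in code, the columns described above can be modeled as a typed record. The following is a minimal sketch, not an official client: the class name DQScanReportRow, the failure_rate helper, and the sample values are hypothetical, and the Python types are only an assumed mapping of the documented data types.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class DQScanReportRow:
    """One row of the external destination table (illustrative model only)."""

    project_id: int                  # Long
    data_asset_id: str               # String
    monitor_Id: int                  # Long; spelled as in the schema
    scan_timestamp: datetime         # Timestamp: when the DQ job completed
    total_records_failed: int        # Integer
    total_records_scanned: int       # Integer
    record_id_attribute_name: str    # String
    record_id_sample: List[str] = field(default_factory=list)  # up to 100 IDs

    def failure_rate(self) -> float:
        """Hypothetical helper: fraction of scanned records that failed."""
        if self.total_records_scanned == 0:
            return 0.0
        return self.total_records_failed / self.total_records_scanned


# Made-up example values, for illustration only.
row = DQScanReportRow(
    project_id=1,
    data_asset_id="orders",
    monitor_Id=42,
    scan_timestamp=datetime(2024, 1, 1, tzinfo=timezone.utc),
    total_records_failed=5,
    total_records_scanned=200,
    record_id_attribute_name="order_id",
    record_id_sample=["A1", "A2"],
)
```

A derived value such as failure_rate is not stored in the table; it is computed here from total_records_failed and total_records_scanned.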
User Actionability
To check DQ status: Users should query the External Destination Table, filtering by data_asset_id and scan_timestamp.
To debug failures: Users can use the record_id_sample along with the record_id_attribute_name to look up the failing records directly in the source data asset for diagnosis.
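The two workflows above can be sketched as follows, assuming the report rows have already been fetched from the external table into Python dictionaries keyed by column name. How the rows are fetched depends on where the table is hosted and is not shown. The function names latest_scans and failing_record_lookup and the sample rows are hypothetical.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List


def latest_scans(rows: List[Dict[str, Any]], data_asset_id: str,
                 since: datetime) -> List[Dict[str, Any]]:
    """Filter by data_asset_id and scan_timestamp, newest scan first."""
    matches = [
        r for r in rows
        if r["data_asset_id"] == data_asset_id and r["scan_timestamp"] >= since
    ]
    return sorted(matches, key=lambda r: r["scan_timestamp"], reverse=True)


def failing_record_lookup(row: Dict[str, Any]) -> Dict[str, Any]:
    """Pair the ID attribute name with the sampled IDs for a source lookup."""
    return {
        "attribute": row["record_id_attribute_name"],
        "ids": list(row["record_id_sample"]),
    }


# Made-up rows, for illustration only.
rows = [
    {"data_asset_id": "orders", "monitor_Id": 42,
     "scan_timestamp": datetime(2024, 1, 1, tzinfo=timezone.utc),
     "total_records_failed": 0, "total_records_scanned": 100,
     "record_id_attribute_name": "order_id", "record_id_sample": []},
    {"data_asset_id": "orders", "monitor_Id": 42,
     "scan_timestamp": datetime(2024, 1, 2, tzinfo=timezone.utc),
     "total_records_failed": 2, "total_records_scanned": 120,
     "record_id_attribute_name": "order_id", "record_id_sample": ["A7", "B3"]},
    {"data_asset_id": "customers", "monitor_Id": 7,
     "scan_timestamp": datetime(2024, 1, 2, tzinfo=timezone.utc),
     "total_records_failed": 1, "total_records_scanned": 50,
     "record_id_attribute_name": "customer_id", "record_id_sample": ["C9"]},
]

recent = latest_scans(rows, "orders", datetime(2024, 1, 1, tzinfo=timezone.utc))
lookup = failing_record_lookup(recent[0])
```

The lookup result names the attribute to filter on and the IDs to search for in the source data asset. Because record_id_sample holds at most 100 IDs, it may not cover every failed record.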
Configuring Reporting
Reporting can only be configured via APIs. Please refer to DQ Reporting APIs.