Dataset Health Overview
This page describes the Overview page, its components, and how to get the most out of it.
Use this page to understand the status of a given dataset and its attributes at a given upload.
First, let's note the health KPIs this page displays:
KPI | Scope | Description |
---|---|---|
Record count | Table | Number of records in a given upload |
Total record count | Table | This KPI helps to track changes in the total size of the monitored table and is only available when CDC is enabled (i.e., the “delta only” flag is on). If the flag is off or the monitored source is not an SQL database, this field will be set to N/A. |
Table Freshness | Table | Time between dataset upload time and dataset update time. Note: the source dataset metadata must support update time. |
Record Freshness | Table | Record-level freshness is defined by setting an expectation on a timestamp attribute within the data source (e.g., “Record Update Date” attribute to be no more than 1 month from now). If the timestamp attribute is not configured, this KPI will be N/A. |
Completeness | Attribute/Table | Percent of records where the attribute value is not null, not empty, and not one of the user-defined placeholders like N/A. |
Correctness | Attribute | Percent of records where the attribute value meets all expectations set for the attribute. Correctness is calculated only for attributes where Expectations are set; otherwise, the default is 100%. |
Accuracy | Attribute | The ratio of business metric dimensions where no drift was detected to the total number of dimensions. If multiple business metrics are defined for an attribute, the minimum accuracy is picked for this attribute. |
Uniqueness | Attribute | The ratio of records with a unique ID to the total number of records. |
Duplicates | Attribute | Count of duplicate values |
Unique | Attribute | Count of unique values |
Distinct | Attribute | Count of distinct values |
Empty | Attribute | Count of empty values |
Cardinality | Attribute | Value cardinality ratio (Low, Med, High) |
Note: for all attribute-level metrics, the table-level metric is calculated as the average across columns.
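To make the Completeness, Correctness, and table-level definitions above concrete, here is a minimal Python sketch. It is only an illustration of the definitions, not the product's implementation: the function names, the `upload` sample data, the placeholder set, and the age expectation are all hypothetical, and every attribute is assumed to have at least one record.

```python
from statistics import mean

# Assumed placeholder set; in practice these are user-defined.
PLACEHOLDERS = {"N/A"}


def completeness(values, placeholders=PLACEHOLDERS):
    """Percent of values that are not null, not empty, and not a placeholder."""
    present = [
        v for v in values
        if v is not None and v != "" and v not in placeholders
    ]
    return 100.0 * len(present) / len(values)


def correctness(values, expectations):
    """Percent of values meeting every expectation; 100% when none are set."""
    if not expectations:
        return 100.0
    passing = [v for v in values if all(check(v) for check in expectations)]
    return 100.0 * len(passing) / len(values)


def table_level(per_attribute):
    """Average an attribute-level metric across all columns."""
    return mean(per_attribute.values())


# Hypothetical upload with two attributes and four records.
upload = {
    "country": ["US", "DE", "N/A", None],
    "age": [34, 51, 27, 130],
}
# Hypothetical expectation: age must fall between 0 and 120.
expectations = {"age": [lambda v: v is not None and 0 <= v <= 120]}

completeness_by_attr = {
    name: completeness(vals) for name, vals in upload.items()
}
correctness_by_attr = {
    name: correctness(vals, expectations.get(name, []))
    for name, vals in upload.items()
}

print(completeness_by_attr)               # {'country': 50.0, 'age': 100.0}
print(correctness_by_attr)                # {'country': 100.0, 'age': 75.0}
print(table_level(completeness_by_attr))  # 75.0
print(table_level(correctness_by_attr))   # 87.5
```

In this sample, `country` has no expectations, so its Correctness defaults to 100%, and the table-level values are simply the averages of the per-attribute values.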
Now to see these KPIs, let's go over the UI:
Selector Component
This component allows you to select:
Dataset
Attributes to calculate metrics on
Segments to filter on
Specific upload
Health KPIs Summary
This component provides a summary of calculated metrics on table and attribute scopes. Only the calculated
Attribute(s) KPIs
Finally, for every attribute (or column), the corresponding health KPIs are calculated.