Telmai Academy
Uniqueness
The number of records that can be identified uniquely based on a predefined key.
Good to know:
Uniqueness is the inverse of duplication: the fewer duplicate records a dataset contains, the higher its uniqueness.
Measuring Uniqueness
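Prerequisite: a primary key must be defined.
Formula: duplicate rate = 1 - primary_key_count / total_row_count, where primary_key_count is the number of distinct primary key values. Uniqueness is the complement: primary_key_count / total_row_count.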
Common unit of Measure:
Value count or percentage
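The measurement above can be sketched in a few lines of Python. This is a minimal illustration, assuming primary_key_count means the number of distinct key values; the function name `uniqueness` and the sample rows are hypothetical and not part of any Telmai API.

```python
def uniqueness(rows, key):
    """Return (distinct_key_count, total_row_count, uniqueness_ratio)."""
    total = len(rows)
    # Number of distinct values observed in the key column.
    distinct = len({row[key] for row in rows})
    # Convention assumed here: an empty dataset has no duplicates.
    ratio = distinct / total if total else 1.0
    return distinct, total, ratio


# Hypothetical sample data modeled on the tables in this page.
rows = [
    {"key": 22, "student_name": "John snow"},
    {"key": 22, "student_name": "John snow"},
    {"key": 37, "student_name": "John snow"},
]

distinct, total, ratio = uniqueness(rows, "key")
duplicate_rate = 1 - ratio  # complement of uniqueness
print(f"{distinct} distinct keys in {total} rows: "
      f"uniqueness {ratio:.1%}, duplicate rate {duplicate_rate:.1%}")
```

The function returns both the counts and the ratio, so the result can be reported either as a value count or as a percentage.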
Examples
Uniqueness problems can occur in three different ways:
One record with one key value occurs more than once in a dataset (duplicate with identical key values). The two records are not unique.
Key | Student Name
22 | John snow
22 | John snow
Datastore constraints, such as a primary key or unique constraint, can often prevent this issue.
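As an illustration of such a constraint, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table name `students` and column names are assumptions made for this example; other datastores express the same idea with their own syntax.

```python
import sqlite3

# In-memory database used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (student_key INTEGER PRIMARY KEY, student_name TEXT)"
)
conn.execute("INSERT INTO students VALUES (22, 'John snow')")

try:
    # A second row with the same key violates the primary key constraint.
    conn.execute("INSERT INTO students VALUES (22, 'John snow')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
print(rejected, row_count)
```

The duplicate insert is rejected at write time, so the table still holds a single row for key 22.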
Multiple records with the same values occur more than once in a dataset under different keys (duplicate with different key values). The object John is not unique in the dataset.
Key | Student Name
22 | John snow
37 | John snow
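One rough way to surface candidates for this kind of duplicate is to group rows by their non-key values and flag any value that appears under more than one key. The sketch below assumes exact matching after trimming and lowercasing; the function name `duplicates_across_keys` is hypothetical. A match only marks a suspect, since two different students can share a name.

```python
from collections import defaultdict


def duplicates_across_keys(rows, key, value_fields):
    """Map each normalized value tuple to the keys it appears under,
    keeping only values seen under more than one key."""
    keys_by_value = defaultdict(set)
    for row in rows:
        # Light normalization: trim whitespace and ignore case.
        value = tuple(str(row[f]).strip().lower() for f in value_fields)
        keys_by_value[value].add(row[key])
    return {v: sorted(k) for v, k in keys_by_value.items() if len(k) > 1}


rows = [
    {"key": 22, "student_name": "John snow"},
    {"key": 37, "student_name": "John snow"},
]

suspects = duplicates_across_keys(rows, "key", ["student_name"])
print(suspects)
```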
A record has the same key as another record, and both occur in a dataset (false duplicate). Key 22 is not unique.
Key | Student Name
22 | John snow
22 | Jane Doe
Most often, users will need sophisticated master data management and identity resolution systems to resolve duplicates like these.
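Before reaching for a full master data management system, a simple check can at least flag keys that point to more than one distinct record. The sketch below is a starting point only, not an identity resolution method; the function name `conflicting_keys` and the second student name are hypothetical.

```python
from collections import defaultdict


def conflicting_keys(rows, key):
    """Return the sorted keys that are attached to more than one distinct record."""
    records_by_key = defaultdict(set)
    for row in rows:
        # Represent the non-key part of the row as a hashable tuple.
        rest = tuple(sorted((f, v) for f, v in row.items() if f != key))
        records_by_key[row[key]].add(rest)
    return sorted(k for k, r in records_by_key.items() if len(r) > 1)


# Hypothetical sample: key 22 is shared by two different students.
rows = [
    {"key": 22, "student_name": "John snow"},
    {"key": 22, "student_name": "Jane Doe"},
    {"key": 37, "student_name": "John snow"},
]

print(conflicting_keys(rows, "key"))
```

Exact repeats of the same record under one key are not reported here, because they collapse to a single distinct record; those belong to the first kind of duplicate.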
Related dimension:
Consistency
Last modified
7mo ago