Introduction - Indicators of Data Quality
This chapter focuses on six key data quality indicators: completeness, uniqueness, timeliness, validity, accuracy and consistency.
Data Quality Indicators
A chef preparing a gourmet meal might use a thermometer to check the temperature of the meat. A mechanic working on a car might use a dipstick to check the engine's oil level. What is the equivalent instrument for a data team that needs to measure data quality?
Data quality is more important than ever, but as technology evolves, the way we talk about data quality is struggling to keep up.
Historically, many data quality indicators have been adopted, including Accuracy, Validity, Completeness, Consistency, Reliability, Timeliness, Uniqueness, Accessibility, Confidentiality, Relevance, and Integrity, among others.
However, there is no standardization of their names or descriptions.
In an attempt to move toward standardization, DAMA NL Foundations conducted a comprehensive survey of over 60 quality dimensions and published it in DDQ-Research-2020. Among the many dimensions surveyed, a small subset emerged as the most critical.
These are referred to as the primary or critical dimensions. They are:
Completeness, Validity, Accuracy, Consistency, Uniqueness, and Timeliness.
This chapter focuses on these six widely used dimensions and their measurements, which we refer to as data quality indicators (DQI).
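To make the idea of a measurement concrete, here is a minimal sketch of how two of the six dimensions could be turned into numeric indicators. The definitions used here (completeness as the share of non-missing values, uniqueness as the share of distinct values among the non-missing ones) are common working assumptions, not definitions taken from this chapter, and the sample records and function names are hypothetical.

```python
# Hypothetical sample records; None marks a missing value.
records = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 3, "email": "a@example.com"},
    {"id": 4, "email": "b@example.com"},
]


def completeness(rows, column):
    """Share of rows whose value in `column` is present (not None).

    Assumed definition for illustration only.
    """
    if not rows:
        return 0.0
    present = sum(1 for row in rows if row.get(column) is not None)
    return present / len(rows)


def uniqueness(rows, column):
    """Share of distinct values among the non-missing values in `column`.

    Assumed definition for illustration only.
    """
    values = [row[column] for row in rows if row.get(column) is not None]
    if not values:
        return 0.0
    return len(set(values)) / len(values)


if __name__ == "__main__":
    for col in ("id", "email"):
        print(col, completeness(records, col), uniqueness(records, col))
```

Each indicator is a ratio between 0 and 1, which makes scores comparable across columns and datasets. The other four dimensions generally need extra context, such as a validation rule, a reference source, or a timestamp, so they are left out of this sketch.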