Data Quality Score

Overview

The Data Quality (DQ) Score is a normalized, business-relevant measure of data health that provides a single indicator of a dataset's fitness for use. Expressed as a percentage from 0% to 100%, the DQ Score helps organizations quickly assess and monitor the overall quality of their data assets.

Purpose

The DQ Score methodology ensures:

- Consistency: Standardized measurement across all data assets
- Normalization: Comparable scores regardless of data volume or complexity
- Business Relevance: Weighted dimensions that reflect organizational priorities
- Actionability: Clear identification of data quality issues requiring attention

Core Dimensions

The DQ Score is calculated as a weighted average of four fundamental data quality dimensions:

Completeness

Measures the extent to which required data fields are populated.

Metric: Percentage of required fields containing values
Score Range: 0-100%

Validity

Measures compliance with defined business rules and data constraints.

Metric: Percentage of records passing validation rules
Score Range: 0-100%

Freshness (Timeliness)

Measures whether data is up-to-date and meets timeliness requirements.

Metric: Binary indicator of freshness incidents
Score Range: 0 or 100 by default (a configured penalty score, e.g., 80, can replace 0)

Integrity (Incident Health)

Measures operational stability through the volume of open data quality incidents.

Metric: Count of open incidents relative to a configurable threshold (I_max)
Score Range: 0-100

Score Calculation

Normalization Process

All source metrics are normalized to scores between 0 and 100 before being integrated into the final DQ Score calculation.

Completeness (S_C)

S_C = Completeness percentage

Directly uses the percentage of populated required fields.
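As an illustration, the completeness percentage could be computed as follows. This is a minimal sketch, not the product's implementation: the function name, the record layout (a list of dictionaries), and the rule that `None` and empty strings count as unpopulated are all assumptions made for the example.

```python
from typing import Any, Mapping, Sequence


def completeness_score(
    records: Sequence[Mapping[str, Any]],
    required_fields: Sequence[str],
) -> float:
    """Return the percentage (0-100) of required field slots that hold a value."""
    total_slots = len(records) * len(required_fields)
    if total_slots == 0:
        # Assumption: with nothing to check, report the dataset as complete.
        return 100.0
    populated = sum(
        1
        for record in records
        for field in required_fields
        # Assumption: None and "" are treated as missing; 0 and False count as values.
        if record.get(field) not in (None, "")
    )
    return 100.0 * populated / total_slots


sample = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 3},
]
s_c = completeness_score(sample, ["id", "email"])
print(round(s_c, 2))  # 4 of 6 required slots are populated
```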

Validity (S_V)

S_V = Validity percentage

Directly uses the percentage of records passing business rules.
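The validity percentage can be sketched in the same spirit. The example below assumes that rules are plain Python predicates and that a record counts as valid only when it passes every rule; both are illustrative choices rather than documented behavior.

```python
from typing import Any, Callable, Mapping, Sequence

# Hypothetical rule type: a predicate that returns True when a record is valid.
Rule = Callable[[Mapping[str, Any]], bool]


def validity_score(records: Sequence[Mapping[str, Any]], rules: Sequence[Rule]) -> float:
    """Return the percentage (0-100) of records that pass every rule."""
    if not records:
        # Assumption: an empty dataset has no violations.
        return 100.0
    passing = sum(1 for record in records if all(rule(record) for rule in rules))
    return 100.0 * passing / len(records)


rules = [
    lambda r: r.get("amount", -1) >= 0,
    lambda r: r.get("currency") in {"USD", "EUR"},
]
orders = [
    {"amount": 10, "currency": "USD"},
    {"amount": 5, "currency": "EUR"},
    {"amount": -3, "currency": "USD"},  # fails the amount rule
    {"amount": 7, "currency": "EUR"},
]
s_v = validity_score(orders, rules)
print(s_v)  # 3 of 4 records pass
```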

Freshness (S_F)

If no freshness incidents exist: S_F = 100
If freshness incidents exist: S_F = 0 by default, or a configured penalty score (e.g., 80)
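The freshness rule can be expressed as a small function. This is a sketch under stated assumptions: the function and parameter names are hypothetical, and the range check on the penalty score is added for safety rather than taken from the product.

```python
def freshness_score(has_freshness_incident: bool, penalty_score: float = 0.0) -> float:
    """Return 100 when the data is fresh, otherwise the configured penalty score."""
    if not 0 <= penalty_score <= 100:
        raise ValueError("penalty_score must be between 0 and 100")
    return float(penalty_score) if has_freshness_incident else 100.0


print(freshness_score(False))                   # no incidents
print(freshness_score(True))                    # default penalty
print(freshness_score(True, penalty_score=80))  # softer, configured penalty
```

A softer penalty such as 80 keeps a single late load from driving the dimension to zero, which may suit datasets where freshness matters less.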

Integrity/Incidents (S_Inc)

S_Inc = MAX(0, 100 × (1 - (I_current / I_max)))

Where:
- I_current = Number of open incidents
- I_max = Maximum tolerable incident threshold (default: 20)

Constraints:

If I_current ≥ I_max, then S_Inc = 0
If I_current = 0, then S_Inc = 100
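The incident formula and its two constraints can be checked with a short sketch. The function and parameter names are illustrative, and the input validation is an assumption of this example.

```python
def incident_score(open_incidents: int, max_incidents: int = 20) -> float:
    """S_Inc = MAX(0, 100 * (1 - I_current / I_max))."""
    if max_incidents <= 0:
        raise ValueError("max_incidents (I_max) must be positive")
    if open_incidents < 0:
        raise ValueError("open_incidents (I_current) cannot be negative")
    return max(0.0, 100.0 * (1 - open_incidents / max_incidents))


for count in (0, 5, 10, 20, 25):
    print(count, incident_score(count))
```

With the default threshold of 20, each open incident costs 5 points, and any count at or above the threshold is clamped to 0 rather than going negative.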

Final DQ Score Formula

DQ Score = ((S_C × W_C) + (S_V × W_V) + (S_F × W_F) + (S_Inc × W_Inc)) / W_Total

Where:
- S = Normalized dimension score (0-100)
- W = Dimension weight
- W_Total = W_C + W_V + W_F + W_Inc

Output: Value between 0 and 100
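A minimal sketch of the weighted average follows. The dictionary keys and function name are assumptions for the example; the weights shown are the documented defaults (Validity 40, Completeness 30, Integrity 20, Freshness 10).

```python
from typing import Mapping, Optional

DEFAULT_WEIGHTS = {"completeness": 30, "validity": 40, "freshness": 10, "integrity": 20}


def dq_score(
    scores: Mapping[str, float],
    weights: Optional[Mapping[str, float]] = None,
) -> float:
    """Weighted average of normalized dimension scores (each 0-100)."""
    weights = DEFAULT_WEIGHTS if weights is None else weights
    total_weight = sum(weights.values())  # W_Total
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(scores[dim] * w for dim, w in weights.items())
    return weighted_sum / total_weight


scores = {"completeness": 95, "validity": 90, "freshness": 100, "integrity": 75}
overall = dq_score(scores)
# (95*30 + 90*40 + 100*10 + 75*20) / 100 = 8950 / 100
print(overall)
```

Dividing by W_Total means the result stays on the 0-100 scale even if the supplied weights do not sum to exactly 100.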

Default Weights

Telmai provides industry-standard default weights that prioritize data accuracy and fundamental usability:

| Dimension | Default Weight | Rationale |
| --- | --- | --- |
| Validity | 40% | Highest priority - measures compliance with critical business rules |
| Completeness | 30% | Second priority - measures availability of required information |
| Integrity (Incidents) | 20% | High priority penalty - reflects operational stability and issue volume |
| Freshness | 10% | Contextual priority - importance varies by use case |
| TOTAL | 100% | Simplifies calculation denominator |
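Because the default weights sum to 100, each weight is also the maximum number of points its dimension can remove from a perfect score. The sketch below makes that visible by reporting the points lost per dimension; the function name and dictionary keys are hypothetical.

```python
from typing import Dict, Mapping, Optional

DEFAULT_WEIGHTS = {"validity": 40, "completeness": 30, "integrity": 20, "freshness": 10}


def points_lost(
    scores: Mapping[str, float],
    weights: Optional[Mapping[str, float]] = None,
) -> Dict[str, float]:
    """Points each dimension subtracts from a perfect DQ Score of 100."""
    weights = DEFAULT_WEIGHTS if weights is None else weights
    total_weight = sum(weights.values())
    return {dim: (100 - scores[dim]) * w / total_weight for dim, w in weights.items()}


# A dataset with an active freshness incident (S_F = 0) and 5 open incidents (S_Inc = 75).
scores = {"validity": 90, "completeness": 95, "integrity": 75, "freshness": 0}
lost = points_lost(scores)
for dim, pts in lost.items():
    print(f"{dim}: -{pts:.1f}")
print("DQ Score:", 100 - sum(lost.values()))
```

In this example the freshness incident costs the full 10 points that dimension carries, while the validity shortfall of 10 percentage points costs 4.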

Weight Customization

Weights can be adjusted per dataset to reflect specific business requirements:

- Real-time systems: Increase Freshness weight (e.g., 25-30%)
- Analytical systems: Prioritize Completeness and Validity
- Mission-critical systems: Increase Integrity/Incidents weight
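Raising one weight means lowering others so the total remains 100. One possible approach, shown here purely as an assumption and not as product behavior, is to scale the remaining weights proportionally. The helper name `rebalance_weights` is hypothetical.

```python
from typing import Dict, Mapping

DEFAULT_WEIGHTS = {"validity": 40, "completeness": 30, "integrity": 20, "freshness": 10}


def rebalance_weights(
    weights: Mapping[str, float], dimension: str, new_weight: float
) -> Dict[str, float]:
    """Set one dimension's weight and scale the others so the total is 100."""
    if dimension not in weights:
        raise KeyError(f"unknown dimension: {dimension}")
    if not 0 <= new_weight <= 100:
        raise ValueError("new_weight must be between 0 and 100")
    others_total = sum(w for d, w in weights.items() if d != dimension)
    if others_total <= 0:
        raise ValueError("the other weights must sum to a positive value")
    scale = (100 - new_weight) / others_total
    rebalanced = {d: w * scale for d, w in weights.items() if d != dimension}
    rebalanced[dimension] = float(new_weight)
    return rebalanced


# Real-time profile: raise Freshness from 10 to 30.
realtime = rebalance_weights(DEFAULT_WEIGHTS, "freshness", 30)
print({d: round(w, 2) for d, w in realtime.items()})
```

Proportional scaling preserves the relative ordering of the other dimensions; teams may prefer to set every weight by hand instead.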

Configuration

Dimension Weights

The system allows dynamic, per-dataset configuration of dimension weights:

1. Navigate to your dataset settings
2. Select Data Quality Score Configuration
3. Adjust weights to match business priorities
4. Document justification for non-default weights

Note: All four weights must sum to 100.
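When weights are prepared outside the UI (for example, in a script), a quick check of the sum-to-100 rule can catch mistakes early. The sketch below is illustrative: the key names are hypothetical, and the non-negativity check is an assumption rather than a documented constraint.

```python
import math
from typing import Mapping

REQUIRED_DIMENSIONS = ("completeness", "validity", "freshness", "integrity")


def validate_weights(weights: Mapping[str, float]) -> None:
    """Raise ValueError unless all four weights are present and sum to 100."""
    missing = [d for d in REQUIRED_DIMENSIONS if d not in weights]
    if missing:
        raise ValueError(f"missing weights for: {', '.join(missing)}")
    if any(weights[d] < 0 for d in REQUIRED_DIMENSIONS):
        # Assumption: negative weights are not meaningful.
        raise ValueError("weights must be non-negative")
    total = sum(weights[d] for d in REQUIRED_DIMENSIONS)
    if not math.isclose(total, 100.0, abs_tol=1e-9):
        raise ValueError(f"weights must sum to 100, got {total}")


validate_weights({"completeness": 30, "validity": 40, "freshness": 10, "integrity": 20})
print("default weights are valid")
```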

Incident Threshold (I_max)

Configure the maximum tolerable incident threshold per dataset:

- Default: 20 open incidents
- Low-tolerance assets: 5-10 incidents
- High-volume assets: 30-50 incidents

The threshold should reflect:

- Dataset criticality
- Typical incident volumes
- Business impact tolerance
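To see how the threshold changes the outcome, the sketch below scores the same number of open incidents under three I_max values picked from the ranges above. The profile labels and function name are illustrative only.

```python
def incident_score(open_incidents: int, max_incidents: int) -> float:
    """S_Inc = MAX(0, 100 * (1 - I_current / I_max))."""
    if max_incidents <= 0:
        raise ValueError("max_incidents (I_max) must be positive")
    return max(0.0, 100.0 * (1 - open_incidents / max_incidents))


open_incidents = 5
profiles = {"low-tolerance": 10, "default": 20, "high-volume": 40}
by_profile = {name: incident_score(open_incidents, i_max) for name, i_max in profiles.items()}
for name, score in by_profile.items():
    print(f"{name} (I_max={profiles[name]}): S_Inc = {score}")
```

Five open incidents halve the Integrity score on a low-tolerance asset but barely dent it on a high-volume one, which is why the threshold should track dataset criticality.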

Best Practices

Interpreting DQ Scores

| Score Range | Quality Level | Recommended Action |
| --- | --- | --- |
| 90-100 | Excellent | Maintain current practices |
| 75-89 | Good | Monitor trends, address minor issues |
| 60-74 | Fair | Investigate dimension contributors, plan improvements |
| 0-59 | Poor | Immediate attention required, escalate issues |
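For reporting or alerting, the bands can be mapped in code. The table lists whole-number ranges, so this sketch assumes each band's lower bound is inclusive and that fractional scores (for example, 89.5) fall into the band below the next threshold; the function name is hypothetical.

```python
def quality_level(score: float) -> str:
    """Map a DQ Score (0-100) to its quality level."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 90:
        return "Excellent"
    if score >= 75:
        return "Good"
    if score >= 60:
        return "Fair"
    return "Poor"


for value in (96, 89.5, 74, 42):
    print(value, quality_level(value))
```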

To start using this feature, refer to the DQ Score APIs.
