Uniqueness

The number of records that can be identified uniquely based on a predefined key.

Good to know: Uniqueness is the inverse of duplicates.

Measuring Uniqueness

Prerequisite for this is that primary key defined

Formula : 1 – primary_key_count / total_row_count

Common unit of Measure: Value count or percentage

Examples

Three different problems uniqueness could have occurred:

One record with one key value occurs more than once in a dataset (duplicate with identical key values). The two records are not unique.

Key | Student Name

22 | John snow

22 | John snow

Often times Datastore constraints can easily help avoid this issue.

Multiple records with same values occur more than once in a dataset (duplicate with different key values). Object John is not unique in the dataset.

Key | Student Name

22 | John snow

37 | John snow

A record has the same key as another record, and both occur in a dataset (false duplicate). Key 22 is not unique.

Key | Student Name

22 | John snow

37 | John snow

Most often users will need use sophisticated master data management and Identify resolution systems for resolving duplicates like these.

Related dimension: Consistency

Last updated