
Data Health Metrics

Telmai automatically calculates a comprehensive set of health metrics to monitor your data. These metrics include:

Total Record Count

The total size of the monitored table. This is only available when CDC is enabled (i.e., the “delta only” flag is on). If the flag is off or the monitored source is not an SQL database, this field will be set to N/A.

Record Count

Reflects the size of the delta when CDC is enabled; otherwise, it represents the size of the entire dataset.

Completeness

Percentage of values that are not null, missing, or placeholders, tracked at both the data source and attribute levels:

  • Attribute level: Percentage of records where the attribute is not null, empty, or a placeholder (e.g., N/A).

  • Data Source level: Compounded average of attribute-level completeness within a data source (see the sketch after this list).
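
A minimal sketch of how these two levels could be computed, assuming completeness is the share of values that are not null, empty, or a known placeholder; the placeholder set and averaging scheme below are illustrative assumptions, not Telmai's actual implementation:

```python
# Illustrative sketch only: the placeholder token set is an assumption.
PLACEHOLDERS = {"n/a", "na", "none", "null", "-"}

def attribute_completeness(values):
    """Fraction of values that are not null, empty, or a placeholder."""
    if not values:
        return 1.0
    valid = sum(
        1 for v in values
        if v is not None
        and str(v).strip() != ""
        and str(v).strip().lower() not in PLACEHOLDERS
    )
    return valid / len(values)

def source_completeness(columns):
    """Average of attribute-level completeness across all attributes of a source."""
    scores = [attribute_completeness(vals) for vals in columns.values()]
    return sum(scores) / len(scores) if scores else 1.0

# Example: one null and one placeholder out of four values -> 50% completeness
print(attribute_completeness(["alice", None, "N/A", "bob"]))  # 0.5
```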

Correctness

Correctness is tracked at both the data source and attribute levels. It is calculated only for attributes where Expectations are set; otherwise, it defaults to 100%.

  • Attribute level: Percentage of records where the attribute meets all set expectations.

  • Source level: Compounded average of attribute-level correctness within a data source.
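
As a rough sketch, attribute-level correctness can be viewed as the share of records that satisfy every expectation configured for the attribute. The predicate-style expectations below are hypothetical stand-ins for illustration, not Telmai's Expectations API:

```python
# Illustrative sketch: expectations modeled as plain Python predicates.
def attribute_correctness(values, expectations):
    if not expectations:   # no Expectations configured -> default 100%
        return 1.0
    if not values:
        return 1.0
    passing = sum(1 for v in values if all(check(v) for check in expectations))
    return passing / len(values)

# Example: a hypothetical "age between 0 and 120" expectation
ages = [34, 29, -1, 310, 57]
print(attribute_correctness(ages, [lambda v: 0 <= v <= 120]))  # 0.6
```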

Freshness

Freshness is tracked at both the record and table levels.

  • Record level: Determined by setting an expectation on a timestamp attribute (e.g., “Record Update Date” should be no more than one month old). Configure this in the Advanced section of the Edit Connection menu.

  • Table level: Based on the time since the table was last updated.
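
As an illustration of the record-level check, the sketch below computes the share of records whose update timestamp satisfies a "no more than one month old" expectation; the 30-day threshold and input structure are assumptions for the example:

```python
from datetime import datetime, timedelta, timezone

# Illustrative sketch: fraction of records whose update timestamp is recent
# enough to satisfy the configured freshness expectation (here, 30 days).
def record_freshness(update_times, max_age=timedelta(days=30)):
    if not update_times:
        return 1.0
    now = datetime.now(timezone.utc)
    fresh = sum(1 for ts in update_times if now - ts <= max_age)
    return fresh / len(update_times)

# Example: one of two records is older than the 30-day expectation -> 50%
stamps = [datetime.now(timezone.utc) - timedelta(days=3),
          datetime.now(timezone.utc) - timedelta(days=90)]
print(record_freshness(stamps))  # 0.5
```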

Uniqueness

Uniqueness of records based on an ID attribute. This KPI measures the ratio of records with a unique ID value to the total number of records. For example, if 2 out of 10 records share the same ID value, uniqueness is 80%.

An attribute can be marked as an ID attribute in the Advanced section of the Edit Connection menu. If no ID attribute is configured, this KPI is N/A.
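
The worked example above can be reproduced with a small sketch; the counting approach is an illustrative assumption about how the ratio is formed:

```python
from collections import Counter

# Illustrative sketch: uniqueness as the ratio of records whose ID value is not
# shared with any other record to the total number of records.
def uniqueness(id_values):
    if not id_values:
        return 1.0
    counts = Counter(id_values)
    unique_records = sum(1 for v in id_values if counts[v] == 1)
    return unique_records / len(id_values)

# 2 of 10 records share the same ID -> 8 unique records -> 80%
ids = ["a", "b", "c", "d", "e", "f", "g", "h", "dup", "dup"]
print(uniqueness(ids))  # 0.8
```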

Accuracy

The accuracy of the values is determined through analysis of historical data. It detects discrepancies when current values deviate from predictions. For example, if the revenue for the company Acme Inc grew slowly from $4M to $5M over the past year but today’s observation is $20M, that value is considered inaccurate. Accuracy is calculated only for attributes that have “Custom Metrics” configured; otherwise, the default is 100%.

  • Attribute level: Ratio of dimensions with no drift detected to the total number of dimensions. If multiple custom metrics are defined, the minimum accuracy is used.

  • Data Source level: Compounded average of attribute-level accuracy within a data source.
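
A minimal sketch of how these ratios could combine, with drift detection itself abstracted away; the per-dimension drift flags and the input structure are assumptions for illustration:

```python
# Illustrative sketch: accuracy per custom metric is the share of monitored
# dimensions with no drift detected; the attribute takes the minimum across
# its configured custom metrics.
def metric_accuracy(drift_flags):
    """drift_flags: one boolean per monitored dimension (True = drift detected)."""
    if not drift_flags:
        return 1.0
    return sum(1 for drifted in drift_flags if not drifted) / len(drift_flags)

def attribute_accuracy(custom_metrics):
    """custom_metrics: list of drift-flag lists, one per configured custom metric."""
    if not custom_metrics:   # no Custom Metrics configured -> default 100%
        return 1.0
    return min(metric_accuracy(flags) for flags in custom_metrics)

# Example: one metric drifts on 1 of 4 dimensions, another on none -> 75%
print(attribute_accuracy([[False, True, False, False], [False, False]]))  # 0.75
```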

These metrics can be found in different places across the application, depending on the scope you’re looking at. If lightweight scan is enabled, only a subset of these metrics is measured, based on the source type.