
Data Quality Metrics


Last updated 10 days ago

Telmai enables users to monitor a wide range of data quality metrics and receive alerts when anomalies or issues are detected. Alerts serve as warnings that something may not be as expected. Some alerts can be configured to trigger notifications, while others are displayed in the Telmai UI for informational purposes. Additionally, some alerts may initiate remediation actions, such as segregating good data from bad, triggering a circuit breaker for the pipeline, and more.

What is monitored?

  1. Out-of-the-Box Metrics: Telmai automatically monitors a variety of predefined metrics related to table metadata and health KPIs (see Data Health KPIs).

  2. Custom Metrics: Users can define their own metrics to monitor specific data quality concerns.

Each monitored metric is validated against a set of policies, which define thresholds, scope, notification settings, and more. If a metric violates a policy, an alert is generated. The flow diagram below illustrates how alerts are created.
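As a rough illustration of the flow described above, the sketch below checks metric values from a scan against a list of policies and collects an alert for each violation. The `Policy` and `Alert` classes, their fields, and the `evaluate` function are hypothetical names chosen for this example, not part of the Telmai API, and the sketch assumes simple static thresholds only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Policy:
    """Hypothetical policy: an allowed range for one metric."""
    metric: str
    min_value: Optional[float] = None  # lower bound, None means unbounded
    max_value: Optional[float] = None  # upper bound, None means unbounded
    notify: bool = False               # whether a violation should send a notification


@dataclass
class Alert:
    """Hypothetical alert raised when a metric violates a policy."""
    metric: str
    value: float
    reason: str
    notify: bool


def evaluate(metrics: dict[str, float], policies: list[Policy]) -> list[Alert]:
    """Compare each metric value with its policies and collect violations."""
    alerts = []
    for policy in policies:
        value = metrics.get(policy.metric)
        if value is None:
            continue  # this scan did not produce the metric
        if policy.min_value is not None and value < policy.min_value:
            alerts.append(Alert(policy.metric, value, f"below {policy.min_value}", policy.notify))
        elif policy.max_value is not None and value > policy.max_value:
            alerts.append(Alert(policy.metric, value, f"above {policy.max_value}", policy.notify))
    return alerts


# Illustrative values only
scan_metrics = {"completeness": 91.5, "record_count": 12000}
policies = [
    Policy("completeness", min_value=95.0, notify=True),
    Policy("record_count", min_value=1000),
]
alerts = evaluate(scan_metrics, policies)
for alert in alerts:
    print(alert)
```

Here only the completeness policy is violated, so a single alert is produced, and it is flagged for notification because that policy sets `notify`.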

Out-of-the-Box Metrics

  • Record Level Freshness: Percentage of outdated records in the scan, based on defined freshness criteria

  • Table Level Freshness: Time elapsed since the last change in the monitored table at the time of the scan

  • Record Count: Number of records being scanned

  • Total Table Records Count: Total number of records in the monitored table. This may be larger than the record count if the scan involves only a subset of data, such as when delta detection is configured

  • Correctness: Percentage of records that meet defined data quality rules

  • Completeness: Percentage of records with non-null/non-empty values

  • Record ID Uniqueness: Percentage of unique records based on the configured ID attribute

  • Uniqueness: Percentage of unique values within an attribute
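To make the percentage-based definitions above concrete, here is a minimal sketch that computes completeness, uniqueness, and correctness over a small in-memory list of records. The function names and the sample data are made up for illustration. The exact formulas Telmai uses are not given here, so the sketch assumes that uniqueness means distinct non-empty values divided by non-empty values, and that a rule is a simple predicate on a record.

```python
def completeness(records, attribute):
    """Percentage of records whose attribute is neither None nor an empty string."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(attribute) not in (None, ""))
    return 100.0 * filled / len(records)


def uniqueness(records, attribute):
    """Assumed definition: distinct non-empty values over all non-empty values, as a percentage."""
    values = [r.get(attribute) for r in records if r.get(attribute) not in (None, "")]
    if not values:
        return 0.0
    return 100.0 * len(set(values)) / len(values)


def correctness(records, rule):
    """Percentage of records for which the rule (a predicate function) returns True."""
    if not records:
        return 0.0
    return 100.0 * sum(1 for r in records if rule(r)) / len(records)


# Illustrative sample data
records = [
    {"id": 1, "city": "Paris"},
    {"id": 2, "city": "Paris"},
    {"id": 3, "city": ""},
    {"id": 4, "city": "Oslo"},
]

city_completeness = completeness(records, "city")   # 3 of 4 records have a city
id_uniqueness = uniqueness(records, "id")            # all 4 ids are distinct
city_correctness = correctness(records, lambda r: r["city"] in {"Paris", "Oslo"})
```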

User Defined Metrics

Users can create custom metrics to track specific anomalies. To add a new custom metric:

  1. Select the Dataset: Choose the dataset you want to monitor

  2. Navigate to “Alerting Policies”: Go to the “Metrics” tab

  3. Add a Custom Metric:

    • Click the “+ Custom Metric” button

    • A new window will open, allowing you to define the metric

  4. Define the Metric:

    • Name: Enter a name for the metric

    • Description: Provide a brief explanation of the metric

    • Expression: An aggregation written in SQL syntax:

      • Attribute names must be wrapped in backticks (`)

      • A maximum of 4 group by dimensions is supported

      • Example: SUM(`sales`) group by `region`, `country`

      • See the Aggregation Examples section below for more examples

    • Click Validate and Save to save the metric

Telmai will start monitoring this metric in future scans.

Important: Defining a metric here makes it available for tracking and alerting. However, no alerts will be generated until a policy is created using this metric.
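The two expression rules listed in step 4 (backtick-wrapped attribute names and at most 4 group by dimensions) can be checked locally before pasting an expression into the UI. The sketch below is a hypothetical pre-check only. It is not Telmai's validator, it inspects only the group by clause, and it assumes attribute names do not contain commas.

```python
import re

MAX_GROUP_BY_DIMENSIONS = 4  # limit stated in the steps above


def group_by_items(expression):
    """Return the comma-separated items that follow 'group by', or [] if there is no such clause."""
    parts = re.split(r"\bgroup\s+by\b", expression, maxsplit=1, flags=re.IGNORECASE)
    if len(parts) == 1:
        return []
    return [item.strip() for item in parts[1].split(",")]


def check_expression(expression):
    """Return a list of human-readable problems; an empty list means no problem was found."""
    problems = []
    items = group_by_items(expression)
    for item in items:
        # Each dimension must look like `name`
        if not re.fullmatch(r"`[^`]+`", item):
            problems.append(f"group by item {item!r} is not wrapped in backticks")
    if len(items) > MAX_GROUP_BY_DIMENSIONS:
        problems.append(
            f"{len(items)} group by dimensions, maximum is {MAX_GROUP_BY_DIMENSIONS}"
        )
    return problems


print(check_expression("SUM(`sales`) group by `region`, `country`"))
print(check_expression("avg(`age`) group by school"))
```

The final word on whether an expression is accepted still comes from the Validate and Save step.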

Supported aggregations

The following aggregation functions are supported:

min
max
count
avg
sum
distinct
variance
median
stddev

Aggregation Examples

// Count distinct age for different cities
count(distinct(`Age`)) group by `City`

// Sum of salaries over total count
sum(`salaries`)/count(*)

// Average age for each school and district
avg(`age`) group by `school`, `district`
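To show what the first two expressions above would produce, the following sketch reproduces them in plain Python over a few made-up rows. The sample data is hypothetical and contains no null values, since how nulls are treated by each aggregation is not specified here.

```python
from collections import defaultdict

# Illustrative rows only
rows = [
    {"City": "Austin", "Age": 30, "salaries": 100},
    {"City": "Austin", "Age": 30, "salaries": 200},
    {"City": "Austin", "Age": 41, "salaries": 300},
    {"City": "Boston", "Age": 25, "salaries": 400},
]

# Equivalent of: count(distinct(`Age`)) group by `City`
ages_by_city = defaultdict(set)
for row in rows:
    ages_by_city[row["City"]].add(row["Age"])
distinct_ages = {city: len(ages) for city, ages in ages_by_city.items()}

# Equivalent of: sum(`salaries`)/count(*)
salary_per_record = sum(row["salaries"] for row in rows) / len(rows)

print(distinct_ages)
print(salary_per_record)
```

A grouped expression yields one value per group (here one per city), while an expression without group by yields a single value for the whole scan.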