
Profiling Data



Before starting data profiling or monitoring, Telmai requires you to specify which attributes to monitor. When a data source is created, its schema is analyzed (nested JSON attributes are excluded). All attributes are disabled by default, so the first step is to enable the ones you need:

  1. Navigate to the Configuration Page

  2. Select the desired Data Source

  3. Click the lightning icon next to each attribute you want to profile. Alternatively, click “Select All” to enable all attributes

  4. Open the context menu for the same data source and select “Scan”

You have two scan options:

  • Single Data Scan

  • Historical Data Scan

Single Data Scan

This option launches a job that scans the current data; the job can be tracked and stopped if needed.

Historical Data Scan

For SQL-based sources with a configured schedule, you can run a Historical scan. This option automatically partitions the records added over the past 10 schedule periods and launches 10 independent scans, one per period, letting you analyze historical data dynamics and metric behavior over time.
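The period breakdown described above can be sketched as follows. This is an illustrative Python snippet, not Telmai code: it assumes a daily schedule and simply computes the 10 time windows that the independent scans would cover.

```python
from datetime import date, timedelta

def historical_windows(today, period=timedelta(days=1), count=10):
    """Compute the (start, end) window for each of the last `count`
    schedule periods, most recent first. Each window would be
    scanned by its own independent job."""
    windows = []
    end = today
    for _ in range(count):
        start = end - period
        windows.append((start, end))
        end = start
    return windows

# For a daily schedule on 2025-06-10, the most recent window covers June 9-10.
wins = historical_windows(date(2025, 6, 10))
print(len(wins))   # 10 independent scan windows
print(wins[0])     # (datetime.date(2025, 6, 9), datetime.date(2025, 6, 10))
```

A weekly schedule would use `period=timedelta(weeks=1)` instead; the window count stays at 10 regardless of the period length.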

For historical scans to work, the Timestamp Attribute (set under Edit Connection → Advanced → Timestamp Attribute) must be configured at the source. The column's data type should match the appropriate type for the specific database.

| Database   | Timestamp Data Type      | Date Data Type |
| ---------- | ------------------------ | -------------- |
| BigQuery   | TIMESTAMP                | DATE           |
| Snowflake  | TIMESTAMP_TZ             | DATE           |
| Iceberg    | TIMESTAMP WITH TIME ZONE | DATE           |
| Redshift   | TIMESTAMPTZ              | DATE           |
| Athena     | timestamp with time zone | date           |
| Databricks | TIMESTAMP                | DATE           |
| SQL Server | DATETIMEOFFSET           | DATE           |
| Trino      | timestamp with time zone | date           |
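When configuring the Timestamp Attribute, it can help to check the column's declared type against the table above before triggering a historical scan. Below is a minimal Python sketch; the mapping mirrors the table, while the helper function and its case-insensitive comparison are illustrative assumptions, not Telmai's actual validation logic.

```python
# Expected timestamp data type per database, taken from the table above.
EXPECTED_TIMESTAMP_TYPE = {
    "BigQuery": "TIMESTAMP",
    "Snowflake": "TIMESTAMP_TZ",
    "Iceberg": "TIMESTAMP WITH TIME ZONE",
    "Redshift": "TIMESTAMPTZ",
    "Athena": "timestamp with time zone",
    "Databricks": "TIMESTAMP",
    "SQL Server": "DATETIMEOFFSET",
    "Trino": "timestamp with time zone",
}

def is_valid_timestamp_type(database: str, column_type: str) -> bool:
    """Return True if `column_type` matches the expected timestamp type
    for `database`, compared case-insensitively (an assumption here)."""
    expected = EXPECTED_TIMESTAMP_TYPE.get(database)
    return expected is not None and column_type.lower() == expected.lower()

print(is_valid_timestamp_type("Snowflake", "TIMESTAMP_TZ"))  # True
print(is_valid_timestamp_type("Redshift", "TIMESTAMP"))      # False
```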