# Basics of profiling

Data profiling is **the process of examining, analyzing, and creating useful summaries of any dataset**. The process yields a high-level overview that helps uncover data quality issues, risks, and overall trends.

Data profiling produces critical insights into data that data teams can then leverage to build data products.

With the tremendous amount of data available today, DataOps teams are increasingly overwhelmed by the information they have collected. As a result, they fail to take full advantage of their data, and its value and usefulness diminish.

Data profiling speeds up the design and development of an analytical cloud platform by identifying all of the transformations and data cleansing activities required to transition the data safely.

Data profiling helps organize and manage big data to unlock its full potential and deliver powerful insights.

![High level profiling information](/files/NiFrsnGG7D0IorBOkJxV)

Common outcomes of profiling data:

* Detailed structural schema analysis
* Data quality analysis to identify data content problems and risk areas
* Distribution of data values and patterns to identify the different standards and rules inherent to the data
* Redundant attributes that are either empty/incomplete, or have not been maintained
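
The outcomes listed above can be illustrated with a minimal sketch in plain Python. It is not how any particular profiling tool works; the function names (`value_pattern`, `profile_column`, `profile_rows`) and the sample records are hypothetical and exist only for illustration. For each column it reports the observed types (structure), the share of missing values (quality and empty attributes), and the most common values and value shapes (distribution and patterns).

```python
from collections import Counter


def value_pattern(value):
    """Map a value to a shape signature: letters -> 'A', digits -> '9', other characters kept."""
    return "".join(
        "A" if ch.isalpha() else "9" if ch.isdigit() else ch for ch in str(value)
    )


def profile_column(values):
    """Summarize one column: completeness, distinct count, types, top values, top patterns."""
    total = len(values)
    # Assumption: None and empty or whitespace-only strings count as missing
    present = [v for v in values if v is not None and str(v).strip() != ""]
    missing = total - len(present)
    return {
        "count": total,
        "missing": missing,
        "missing_ratio": missing / total if total else 0.0,
        "distinct": len(set(present)),
        "types": sorted({type(v).__name__ for v in present}),
        "top_values": Counter(present).most_common(3),
        "patterns": Counter(value_pattern(v) for v in present).most_common(3),
    }


def profile_rows(rows):
    """Profile every column found in a list of dict records."""
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    return {col: profile_column([row.get(col) for row in rows]) for col in columns}


# Hypothetical sample records, invented for illustration
rows = [
    {"id": 1, "country": "US", "fax": ""},
    {"id": 2, "country": "us", "fax": None},
    {"id": 3, "country": "DE", "fax": ""},
    {"id": 4, "country": None, "fax": ""},
]

report = profile_rows(rows)
for column, summary in report.items():
    print(column, summary)
```

In this sample, the `fax` column is entirely missing, which flags it as a candidate redundant attribute, while `country` mixes `US` and `us`, the kind of inconsistent standard that a value distribution makes visible.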

One of the biggest challenges that data profiling addresses is scoping and assessing the risks associated with a data migration or integration project.

Many open source libraries are available today for generating static profiling information. One such library, pandas-profiling, can be downloaded from <https://github.com/pandas-profiling/pandas-profiling>.

For more interactive data analysis, refer to the next chapter.

