Compound Attributes

Telmai support for creating compound attributes using the base attributes of the source. Users can add custom rule expressions to allow measuring attribute correctness. Compound attributes are transformation of one or more attribute to allow for more complex analysis. Once a compound attribute is created, expectations can be set in the same way it is done for normal attributes. Business metrics how ever cannot use these compound attributes. We will add support for this down the lane.

Lets take a use case and talk through the usage of this feature. Below is a table which shows the order table of an imaginary enterprise.

If one wanted to monitor the 2 day delivery timeline set for products with the above data; the only options were:

  1. add a view which would have this computed column.

  2. create a table or query which would generate the data with the difference between the order-date and delivery-date.

The first approach means that one has to reach out to whoever is managing the database and it would mean an additional setup. The second would mean that the query had to be written and maintained by somebody who might or might not know the nuances of SQL. Even though both of solutions are possible, the simplicity of doing this against the existing data is lost. This is where the complex attribute feature comes in. Lets see how this can be done with an example.

Adding a Compound Attribute

To simplify this, we created compound attributes. Here's how you can do the same using the feature:

  1. Onboard a dataset similar to what's documented in Data on-boarding page.

  2. Navigate to Configuration page and select the corresponding dataset.

    • This will now prompt you to select the attributes and the transformation

    • Attribute ID: Compound attribute name

    • Attributes to compound: List of base attributes dataset

    • Expression: Desired ransformation

  3. For this example, we will set the following:

    • Once this is set, we can go ahead and create the attribute

input-details
Attribute ID = ORDER_DELIVERY_COMMITMENT_MET
Attributes to compound = DELIVERY_DATE, ORDER_DATE
Expression =  
if( datediff("DELIVERY_DATE","ORDER_DATE") <= 3 , 'true' , 'false') <, 'false')
  1. Once the attribute is created, we can now set the expectation rules by navigating to Navigate to Business Rules

    • Add attribute ORDER_DELIVERY_COMMITMEENT_MET, and set expectation for rejecting false values.

  2. Set an alert policy for the DQI correctness with the rule to alert in case correctness is less than 100%. This is optional, but help one get an alert in case the rule fails.

  3. Run the Upload Job and navigate to Overview section.

    • You will see that for the attribute ORDER_DELIVERY_COMMITMENT_MET, the correctness value would not be 100% since the row has records where the condition is not met.

  1. Clicking on the attribute would take one to the investigator where one can see the rows the rule which was set has failed - id’s with 12000001 and 12000006 where the delivery date is more than the 3 days.

Supported functions:

This feature supports a whitelist set of SQL functions and expressions. In general only aggregate functions are not supported. Please click here to see the full list of whitelisted functions.

Usage Notes:

  • As user inputs an expression, an evaluation is done to ensure a valid SQL expression

  • There are some cases where the validation will pass, but will fail during executions due to casting related errors. To mitigate, the values should be cast appropriately to the expected type.

  • The expression output must always be string type. In case it is not, users need to cast to a string explicitly. Example:

    • If the expression has contains function; the output is boolean contains( last_name , 'Smith')

      The user needs to cast the value string cast ( contains( last_name , 'Smith') as string)

  • All parameters are treated as strings. User needs to cast the parameters to corresponding function param type. Example:

    • If user wants to use the date_add function which takes in number (my_number) the and a date column (which is a string) The user will need to cast my_number first like this: date_add('2016-07-30', cast (my_number as int))

Examples

Example 1:

DATE_ADD('2016-07-30', my_number)

The my_number column is a string, so it needs an explicit cast since the date_add takes in a number as the parameter

DATE_ADD('2016-07-30', CAST (my_number as INT))

Example 2:

LENGTH(CONCAT_WS(\",\", firstname, lastname)) > 100

The output is a boolean but needs to be marked as string so that we can store

CAST(LENGTH(CONCAT_WS(\",\", firstname, lastname)) > 100 as STRING)

Example 3:

IF( DATEDIFF(current_timestamp(), orderdate) > 10 , true , false)

Convert boolean to string values

IF( DATEDIFF(current_timestamp(), orderdate) > 10 , 'true' , 'false' )

Example 4:

IF( ISNOTNULL(country) and ISNOTNULL(city) and price != 0 and (price <= 100)

),

'true', 'false')

price is a string, so cast it to double first or cast comparator to string

IF( ISNOTNULL(country) and ISNOTNULL(city) and price != '0' and (CAST(price as DOUBLE) <= 100)

),

'true', 'false')

Last updated