Overview
This section covers basic data-related features of data models.
OCPM Catalog Tables allow accessing metadata about an object-centric process mining perspective (formerly: data model), including registered tables.
"OCPM":"Object_Type"
"OCPM":"Event_Type"
"OCPM":"Event_Object_Type_Relations"
Standard aggregation functions group input rows together and calculate a single value for each group.
This aggregate function calculates the average of a set of values.
AVG ( table.column )
This aggregate function calculates the number of rows in a specified table.
COUNT_TABLE ( table )
This function counts the rows of a column per group. It can be applied on columns of any data type.
COUNT ( table.column )
This function calculates the number of distinct elements per group. COUNT DISTINCT can be applied on any data type.
COUNT ( DISTINCT table.column )
Returns the first element of the specified source column for each element in a group.
FIRST ( table.input_column [, ORDER BY table.column [ASC|DESC] ] )
In general, only one aggregation is calculated for a query. With a global aggregation it is possible to calculate an additional aggregation function over all the values of a column.
GLOBAL ( aggregation )
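For example, a grouped aggregation can be set in relation to the overall total. The table and column names in this sketch are illustrative:
SUM ( "Orders"."Value" ) / GLOBAL ( SUM ( "Orders"."Value" ) )
This hypothetical expression returns each group's share of the total value.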
Returns the last element of the specified source column for each element in a group.
LAST ( table.input_column [, ORDER BY table.column [ASC|DESC] ] )
This function calculates the maximum value per group. MAX can be applied on any data type.
MAX ( table.column )
This function calculates the median per group. Median can be applied to INT, FLOAT or DATE columns.
MEDIAN ( table.column )
This function calculates the minimum value per group. MIN can be applied on any data type.
MIN ( table.column )
This function finds the most frequent elements per group. Mode can be applied on any data type. For multi-modal input, where there are several result candidates, the element with the smallest value is chosen. For elements of type STRING, the smallest value is determined by its lexicographical order.
MODE ( table.column )
PRODUCT calculates the product over an INT or FLOAT column.
PRODUCT ( table.column )
This function calculates the given quantile value per group. QUANTILE can be applied to INT, FLOAT or DATE columns.
QUANTILE ( table.column, quantile )
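For example, QUANTILE ( "Orders"."Value", 0.5 ) (table and column names are illustrative) returns the 0.5 quantile per group, which is equivalent to the median.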
This function calculates the standard deviation of a set of samples. The standard deviation uses the "n-1" method. Standard deviation can be applied to INT or FLOAT data types.
STDEV ( table.column )
The STRING_AGG operator returns the concatenation of strings from the specified source column for each element in a group. The delimiter is always inserted between the concatenated strings. Multiple ORDER BY expressions can be used to determine the order of the concatenation.
STRING_AGG ( source_table.column, delimiter [, ORDER BY source_table.column [ASC|DESC] ] )
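A hypothetical example, assuming an illustrative "Orders" table, concatenates all statuses of a group in chronological order:
STRING_AGG ( "Orders"."Status", ', ', ORDER BY "Orders"."Timestamp" )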
SUM calculates the sum over an INT or FLOAT column.
SUM ( table.column )
This function calculates the trimmed mean over an INT or FLOAT column.
TRIMMED_MEAN ( table.column [, lower_cutoff [, upper_cutoff ] ] )
VAR calculates the variance per group. The variance is the square of the standard deviation. Variance can be applied to INT or FLOAT columns.
VAR ( table.column )
Window functions calculate an aggregate value based on a range of neighboring rows.
RUNNING_TOTAL sums up all entries of a given column and returns all intermediate sums. It can be applied to INT or FLOAT columns.
RUNNING_TOTAL ( table.column )
Moving operators calculate a value across a range of neighboring rows. A range of neighboring rows used for the calculation of a moving operator result is called a window.
This function calculates the average for each window. It can be applied to INT or FLOAT columns.
MOVING_AVG ( table.column, start, end )
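For example, assuming start and end are row offsets relative to the current row, the following sketch (with illustrative names) averages the current row and the two preceding rows:
MOVING_AVG ( "Sales"."Amount", -2, 0 )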
This function counts the distinct values for each window. It can be applied to any data type.
MOVING_COUNT_DISTINCT ( table.column, start, end )
This function counts the values for each window. It can be applied to any data type.
MOVING_COUNT ( table.column, start, end )
Calculates the maximum for each window. It can be applied to any data type.
MOVING_MAX ( table.column, start, end )
This function calculates the median for each window. It can be applied to any data type.
MOVING_MEDIAN ( table.column, start, end )
Calculates the minimum for each window. It can be applied to any data type.
MOVING_MIN ( table.column, start, end )
This function calculates the standard deviation for each window. It can be applied to INT or FLOAT columns.
MOVING_STDEV ( table.column, start, end )
Calculates the sum for each window. It can be applied to INT or FLOAT columns.
MOVING_SUM ( table.column, start, end )
Calculates the trimmed mean with a cutoff of 10% for each window. It can be applied to INT or FLOAT columns.
MOVING_TRIMMED_MEAN ( table.column, start, end )
Calculates the variance for each window. It can be applied to INT or FLOAT columns.
MOVING_VAR ( table.column, start, end )
The Pull-Up functions (PU-functions) allow you to aggregate a column based on another table. You can define the target table to which the entries of a column from another table are pulled, and you can explicitly define the basis on which calculations are executed.
Calculates the average of the specified source column for each element in the given target table.
PU_AVG ( target_table, source_table.column [, filter_expression] )
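A minimal sketch with illustrative table names, pulling the average activity duration from an "Activities" table up to each row of a "Cases" table:
PU_AVG ( "Cases", "Activities"."Duration" )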
Calculates the number of distinct elements in the specified source column for each element in the given target table.
PU_COUNT_DISTINCT ( target_table, source_table.column [, filter_expression] )
Calculates the number of elements in the specified source column for each element in the given target table.
PU_COUNT ( target_table, source_table.column [, filter_expression] )
Returns the first element of the specified source column for each element in the given target table. An order by expression can be set to define the order that should be used to determine the first element.
PU_FIRST ( target_table, source_table.column [, filter_expression] [, ORDER BY source_table.column [ASC|DESC] ] )
Returns the last element of the specified source column for each element of the given target table. An order by expression can be set to define the order that should be used to determine the last element.
PU_LAST ( target_table, source_table.column [, filter_expression] [, ORDER BY source_table.column [ASC|DESC] ] )
Calculates the maximum of the specified source column for each element in the given target table.
PU_MAX ( target_table, source_table.column [, filter_expression] )
Calculates the median of the specified source column for each element in the given target table.
PU_MEDIAN ( target_table, source_table.column [, filter_expression] )
Calculates the mode of the specified source column for each element in the given target table. For multi-modal input, where there are several result candidates, the element with the smallest value is chosen. For elements of type STRING, the smallest value is determined by its lexicographical order.
PU_MODE ( target_table, source_table.column [, filter_expression] )
Calculates the minimum of the specified source column for each element in the given target table.
PU_MIN ( target_table, source_table.column [, filter_expression] )
Calculates the product of the specified source column for each element in the given target table.
PU_PRODUCT ( target_table, source_table.column [, filter_expression] )
Calculates the quantile of the specified source column for each element in the given target table.
PU_QUANTILE ( target_table, source_table.column, quantile [, filter_expression] )
Calculates the trimmed mean of the specified source column for each element in the given target table.
PU_TRIMMED_MEAN ( target_table, source_table.column [, lower_cutoff [, upper_cutoff ] ] [, filter_expression ] )
The PU_STRING_AGG operator returns the concatenation of strings from the specified source column for each element in the given target table. The delimiter is always inserted between the concatenated strings. Multiple ORDER BY expressions can be used to determine the order of the concatenation.
PU_STRING_AGG ( target_table, source_table.column, delimiter [, filter_expression ] [, ORDER BY source_table.column [ASC|DESC] ] )
Calculates the sum of the specified source column for each element in the given target table.
PU_SUM ( target_table, source_table.column [, filter_expression] )
Calculates the standard deviation of the specified source column for each element in the given target table. The standard deviation uses the "n-1" method. Standard deviation can be applied to INT or FLOAT columns.
PU_STDEV ( target_table, source_table.column [, filter_expression] )
Besides using a table from the data model as a target table inside a PU-function, generated tables can be passed.
To aggregate all values from a source column to a single value, the CONSTANT function can be used as a target table inside a PU-function. The resulting value is a constant.
CONSTANT()
The DOMAIN_TABLE function can be used to create a temporary table from various column(s), which can be used as a target table inside all PU-functions.
DOMAIN_TABLE ( table.column, ... )
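For example, the following sketch (with illustrative names) counts orders per distinct country by using a temporary domain table as the pull-up target:
PU_COUNT ( DOMAIN_TABLE ( "Orders"."Country" ), "Orders"."OrderId" )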
This section includes functions related to currency manipulation.
This operator converts the document currency of SAP tables using the internal tables TCURR, TCURF and TCURX to perform the conversion.
CURRENCY_CONVERT_SAP ( to_currency, exchange_rate_type, mandt_column, document_currency_column, document_date_column, amount_column [, source_system_main_table_column ])
This operator converts currencies based on a provided CURRENCY_CONVERSION_RATES table.
CURRENCY_CONVERT ( amount, FROM ( from_currency ), TO ( to_currency ), date [, currency_conversion_rates [, exchange_rate_type ] ] )
This operator converts the currency value as stored by SAP to the actual value based on the internal TCURX table.
CURRENCY_SAP ( amount_column, document_currency_type_column [, source_system_column ] )
This operator converts quantity units based on a QUANTITY_CONVERSION_RATES table, which is optionally provided by the user. If not provided by the user, a default QUANTITY_CONVERSION_RATES table will be used.
QUANTITY_CONVERT ( amount, FROM(from_unit), TO(to_unit) [, identifier, [ quantity_conversion_rates ] ] )
This section includes Celonis Analysis specific functions.
COMMON_TABLE allows you to refer to the common table of multiple expressions in the data model without using its exact table name. This operator can be helpful when the common table of multiple expressions is unknown. The given input expressions have to be columns.
COMMON_TABLE ( expression [, expression, ...] )
Saved formulas are reusable PQL statements which can be defined in the frontend. With KPI, a saved formula can be called. The result is the same as if the saved formula had been placed directly where KPI is called.
KPI ( formula_name (, argument )* )
Returns the Celonis username of the current user.
USER_NAME()
This section contains data flow operators.
BIND pulls a column or constant to a specified table. In the case of a column, this requires that the owner table of the column and the specified target table have a direct or indirect 1:N relationship.
BIND( target_table, value)
BIND_FILTERS pulls a filter to a specified table. Multiple filters on a table are merged together by a logical AND.
BIND_FILTERS( target_table , condition [, condition ] *)
CASE WHEN evaluates a list of conditions and returns result expressions based on these conditions.
CASE WHEN condition THEN result_expression [ WHEN condition THEN result_expression ]* [ ELSE result_expression ] END
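A hypothetical example that buckets values into size categories (table and column names are illustrative):
CASE WHEN "Orders"."Value" > 1000 THEN 'large' WHEN "Orders"."Value" > 100 THEN 'medium' ELSE 'small' END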
The COALESCE function returns the first non-NULL element from a set of values.
COALESCE (table.column1, ..., table.columnN )
Filters can be defined as Analysis filters, Sheet filters or Component filters. If a query is sent to Celonis, all active filters are propagated to the requested table(s). Multiple filters on a table are merged together by a logical AND.
FILTER [FORCED] condition;
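For example, a filter restricting an illustrative "Orders" table to high-value rows:
FILTER "Orders"."Value" > 100;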
FILTER_TO_NULL immediately propagates the FILTER to the specified column by setting all violating rows to NULL.
FILTER_TO_NULL ( table.column )
The GREATEST function returns the greatest element that is not NULL from a set of values.
GREATEST ( table.column1 , ... , table.columnN )
The LEAST function returns the least element that is not NULL from a set of values.
LEAST ( table.column1 , ... , table.columnN )
The LOOKUP operator allows you to lookup matching rows of a source column to a target table where the target and source tables cannot have a pre-defined join relationship in the Data Model. As no join between the tables exists, LOOKUP implicitly disregards filters from the source side.
LOOKUP ( target_table, source_column, (join_condition) [, (join_condition),... ] )
(target_table.column, source_table.column)
The MULTI CASE WHEN operator evaluates a list of conditions and returns a new column containing result expressions based on these conditions. In contrast to the CASE WHEN operator, the result expressions for all conditions that evaluate to true are returned. Furthermore, the MULTI CASE WHEN operator creates a new table that is joined back to the common table of all inputs.
MULTI CASE WHEN condition THEN result_expression [ WHEN condition THEN result_expression ]* [ ELSE result_expression ] END
REMAP_INTS allows you to map values of a column of type INT.
REMAP_INTS ( table.column, [ old_value1, new_value1 ] , ... [, others_value ] )
REMAP_VALUES allows you to map values of a column of type STRING.
REMAP_VALUES ( table.column, [ old_value1, new_value1 ] , ... [, others_value ] )
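A hypothetical example that normalizes country codes and maps all remaining values to 'Other' (names are illustrative):
REMAP_VALUES ( "Orders"."Country", [ 'DE', 'Germany' ], [ 'US', 'United States' ], 'Other' )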
The UNIQUE_ID function returns a unique non-negative INT for each unique tuple in the combination of the input columns.
UNIQUE_ID ( table.column1 , ... , table.columnN )
This section includes data generation functions.
GENERATE_RANGE creates values inside a range based on a given step size. It can be used to generate values for dropdown components.
GENERATE_RANGE ( step_size , range_start , range_end )
RANGE_APPEND creates values inside a range based on a given step size and appends them to the given column.
RANGE_APPEND ( table.column, step_size [, range_start [, range_end ] ] )
With RANGE, a column that is not related to any table in the Data Model can be created. It is filled with a range of INT or DATE values.
RANGE ( start_value, step_size, step_count )
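Reading the signature, RANGE ( 0, 10, 5 ) would produce the five values 0, 10, 20, 30 and 40.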
Data type conversion can be explicit by using custom functions, or implicit depending on context.
TO_DATE converts STRING input to DATE output. You must define the FORMAT. If the input does not match the FORMAT, the function returns NULL.
TO_DATE ( table.column, FORMAT ( format ) )
TO_FLOAT converts STRING input to FLOAT output. If the input does not represent a float number (or the value of the representation is outside of the supported FLOAT range), the function returns NULL.
TO_FLOAT ( table.column )
TO_INT converts STRING input to INT output. If the input does not represent an integer number (or the value of the representation is outside of the supported INT range), the function returns NULL.
TO_INT ( table.column )
TO_STRING converts INT or DATE input to STRING output. For DATE input, you must specify the output format.
TO_STRING ( table.column [, FORMAT ( format ) ] )
TO_NULLABLE_INT evaluates a boolean condition input and produces an INT output. The function returns 1 if the condition evaluates to TRUE, 0 if the condition evaluates to FALSE, and NULL if the condition evaluates to NULL.
TO_NULLABLE_INT ( condition )
This section includes a variety of datetime functions.
The TIMELINE_COLUMN and TIMELINE_TABLE functions create a continuous timeline for a given time unit (e.g., days). They can be used to aggregate values from one or multiple tables onto one common time axis. This makes it easy to show the trend of a KPI over a period of time.
TIMELINE_COLUMN ( timeunit, time_columns [, partition ] [, time_range] )
TIMELINE_TABLE ( timeunit, time_columns [, partition ] [, time_range] )
DateTime Calendars can be used to define the calendar which is used for DATE calculations.
FACTORY_CALENDAR produces a new factory calendar with the configuration passed to it.
FACTORY_CALENDAR ( calendar_table.start_date_column, calendar_table.end_date_column [, filter_condition ] )
FACTORY_CALENDAR ( calendar_table [, filter_condition ] )
A call to INTERSECT takes two arbitrary calendars as input and produces a new intersected calendar (its type depends on the input calendars).
INTERSECT ( lhs_calendar, rhs_calendar )
WEEKDAY_CALENDAR produces a new weekday calendar using the given configuration.
WEEKDAY_CALENDAR ( day [ day_time ] ... )
WEEKDAY_CALENDAR ( calendar_table )
WORKDAY_CALENDAR produces a new workday calendar (i.e., an SAP TFACS-style calendar, sometimes called an SAP Factory Calendar) using the TFACS table passed to it.
WORKDAY_CALENDAR ( TFACS_table [, TFACS_entry_identifier ] )
DateTime Constant Functions allow you to use the current date or time in a query.
The HOUR_NOW function returns the hour of the current time of the app-server in the specified time zone.
HOUR_NOW ( [time_zone_id] )
The MINUTE_NOW function returns the minute of the current time of the app-server in the specified time zone.
MINUTE_NOW ( [time_zone_id] )
The TODAY function returns the current date in the specified timezone.
TODAY( [time_zone_id] )
DateTime Difference Functions can be used to calculate the time difference between two DATEs.
DATE_BETWEEN determines whether the first date is between the second date and the third date.
DATE_BETWEEN ( table.column1, table.column2, table.column3 )
DATEDIFF computes the difference between two dates in the specified time unit.
DATEDIFF ( time_unit, table.date_column1, table.date_column2 )
DAYS_BETWEEN computes the difference between two dates in days.
DAYS_BETWEEN ( table.column1, table.column2 )
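For example, with illustrative names, the number of days each order stayed open:
DAYS_BETWEEN ( "Orders"."CreatedAt", "Orders"."ClosedAt" )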
HOURS_BETWEEN computes the difference between two timestamps in hours. If a calendar is specified, only the passed time in the calendar is considered.
HOURS_BETWEEN ( start_column, end_column [, calendar_specification [, calendar_id_column]] )
MILLIS_BETWEEN computes the difference between two timestamps in milliseconds. If a calendar is specified, only the passed time in the calendar is considered.
MILLIS_BETWEEN ( start_column, end_column [, calendar_specification [, calendar_id_column]] )
MINUTES_BETWEEN computes the difference between two timestamps in minutes. If a calendar is specified, only the passed time in the calendar is considered.
MINUTES_BETWEEN ( start_column, end_column [, calendar_specification [, calendar_id_column]] )
MONTHS_BETWEEN computes the difference between two dates in months.
MONTHS_BETWEEN ( table.column1, table.column2 )
SECONDS_BETWEEN computes the difference between two timestamps in seconds. If a calendar is specified, only the passed time in the calendar is considered.
SECONDS_BETWEEN ( start_column, end_column [, calendar_specification [, calendar_id_column]] )
The WORKDAYS_BETWEEN function determines the number of workdays between two given dates. When the start date is before the end date, the end date is not counted. When the start date is after the end date, the start date is not counted and a negative value is returned.
WORKDAYS_BETWEEN ( start_date, end_date, calendar_specification [, calendar_id_column])
WORKDAYS_BETWEEN ( calendar_id, start_date, end_date [, INVALID_TO_NULL] )
YEARS_BETWEEN computes the difference between two dates in years.
YEARS_BETWEEN ( table.column1, table.column2 )
DateTime Modification Functions can be used to modify DATEs.
ADD_DAYS adds a given number of days to a given date.
ADD_DAYS ( table.base_column, table.days_column )
ADD_HOURS adds a given number of hours to a given timestamp. If a calendar is specified, only the passed time in the calendar is considered.
ADD_HOURS ( start_column, hours_column [, calendar_specification [, calendar_id_column]])
ADD_MILLIS adds a given number of milliseconds to a given timestamp. If a calendar is specified, only the passed time in the calendar is considered.
ADD_MILLIS ( start_column, milliseconds_column [, calendar_specification [, calendar_id_column]])
ADD_MINUTES adds a given number of minutes to a given timestamp. If a calendar is specified, only the passed time in the calendar is considered.
ADD_MINUTES ( start_column, minutes_column [, calendar_specification [, calendar_id_column]])
ADD_MONTHS adds a given number of months to a given date.
ADD_MONTHS ( table.base_column, table.months_column )
ADD_SECONDS adds a given number of seconds to a given timestamp. If a calendar is specified, only the passed time in the calendar is considered.
ADD_SECONDS ( start_column, seconds_column [, calendar_specification [, calendar_id_column]])
The ADD_WORKDAYS function adds a given number of workdays to a given date. If the start date is not a workday, ADD_WORKDAYS returns the same result as if the start date were the next workday instead (even when 0 days are added).
ADD_WORKDAYS ( date, number_of_days, calendar_specification [, calendar_id_column])
ADD_WORKDAYS ( workday_calendar_config, date, number_of_days )
ADD_YEARS adds a given number of years to a given date.
ADD_YEARS ( table.base_column, table.years_column )
CONVERT_TIMEZONE converts a given date from one specified timezone to another. The function takes Daylight Saving Time into account.
CONVERT_TIMEZONE ( table.date_column [, from_timezone_id], to_timezone_id )
This section contains functions to extract the different parts of a DATE and to map DATEs to other data types and vice versa.
CALENDAR_WEEK returns the calendar week of a given date. Celonis follows the standard ISO 8601 for numbering the calendar weeks.
CALENDAR_WEEK ( table.column )
DATE_MATCH(column, [YEARS], [QUARTERS], [MONTHS], [WEEKS], [DAYS])
Returns the day of the week as a number: 0 for Sunday, 1 for Monday, 2 for Tuesday, and so on.
DAY_OF_WEEK ( table.column )
Returns the number of days in the given month of the given timestamp.
DAYS_IN_MONTH( table.column )
Returns the day of a given date.
DAY( table.column )
Returns the hour of a given date.
HOURS ( table.column )
IN_CALENDAR checks whether a given date is within a calendar.
IN_CALENDAR ( timestamp_column, calendar_specification [, calendar_id_column])
Returns the milliseconds of a given date.
MILLIS ( table.column )
Returns the minutes of a given date.
MINUTES ( table.column )
Returns the month of a given date.
MONTH ( table.column )
Returns the quarter of a given date.
QUARTER ( table.column )
The REMAP_TIMESTAMPS function counts the number of time units that have passed for given dates since the epoch (1970-01-01 00:00:00.000). The timestamps for which to calculate the passed time, as well as the time unit to use, are given as parameters to the function call. Additionally, a CALENDAR configuration can be specified to restrict the dates considered in the calculation. For example, using WEEKDAY_CALENDAR allows only certain valid weekdays to be considered.
REMAP_TIMESTAMPS ( table.column, time_unit [, calendar_specification [, calendar_id_column ] ] )
Returns the seconds of a given date.
SECONDS ( table.column)
Deprecated: Please use TO_DATE instead.
TO_TIMESTAMP converts a string into a DATE. The format of the input string can be defined.
TO_TIMESTAMP ( table.column, format )
Returns the year of a given date.
YEAR ( table.column )
DateTime rounding functions round a DATE down to the chosen time unit by setting all subordinate time units to their ground state. They can be applied to DATE columns and always return a DATE column.
This function rounds a DATE input down to the year by setting its MONTH, DAY, HOUR, MINUTE, SECOND and MILLISECOND values to their ground state.
ROUND_YEAR ( table.date_column )
This function rounds a DATE input down to the month by setting its DAY, HOUR, MINUTE, SECOND and MILLISECOND values to their ground state.
ROUND_MONTH ( table.date_column )
This function rounds a DATE input down to the day by setting its HOUR, MINUTE, SECOND and MILLISECOND values to their ground state.
ROUND_DAY ( table.date_column )
This function rounds a DATE input down to the hour by setting its MINUTE, SECOND and MILLISECOND values to their ground state.
ROUND_HOUR ( table.date_column )
This function rounds a DATE input down to the minute by setting its SECOND and MILLISECOND values to their ground state.
ROUND_MINUTE ( table.date_column )
This function rounds a DATE input down to the second by setting its MILLISECOND value to its ground state.
ROUND_SECOND ( table.date_column )
This function rounds a DATE input down to the quarter by setting its subordinate time units to their ground state.
ROUND_QUARTER ( table.date_column )
This function rounds a DATE input down to the week by setting its HOUR, MINUTE, SECOND and MILLISECOND values to their ground state. The DAY value is set to the Monday of the week of that date.
ROUND_WEEK ( table.date_column )
Logical operators are used to combine boolean expressions into more complex boolean expressions.
The AND operator returns true if both logical expressions are true.
logical expression AND logical expression
NOT negates a logical expression.
NOT (logical expression)
The OR operator returns true if at least one of the two logical expressions is true.
logical expression OR logical expression
This section comprises standard machine learning functions.
KMeans finds clusters based on the K-means++ algorithm. It assigns a number to every cluster.
KMEANS ( k )
KMEANS ( TRAIN_[FILTERED_]KM ( [ EXCLUDED ( table.exclude_column, ... ), ] INPUT ( table.input_column, ... ), k ), CLUSTER ( table.cluster_column, ... ) )
The Linear Regression function models the relationship between input variables by fitting a linear equation to the input data.
LINEAR_REGRESSION ( TRAIN_[FILTERED_]LM ( INPUT ( table.input_column, ... ), OUTPUT ( table.output_column ) ), PREDICT ( table.predict_column, ... ) )
This section comprises standard mathematical functions including basic arithmetic operations.
The ABC function performs an ABC Analysis. Such an analysis divides entries in a column into three groups (A, B, C). Group A represents the values with the highest share of the sum of all values. Group C represents the values with the lowest share. The size of each group is defined with thresholds.
[FILTERED_]ABC ( table.column [,A [, B ] ] )
ABS is a mathematical function that returns the absolute (positive) value of a specified numeric expression.
ABS ( table.column )
The addition operator computes the sum of two values.
ADD ( table.column1 , table.column2 )
table.column1 + table.column2
The CEIL function returns the smallest integer greater than or equal to the specified numeric expression.
CEIL ( table.column )
The division operator computes the quotient of two values.
DIV ( table.column1 , table.column2 )
table.column1 / table.column2
The FLOOR function returns the greatest integer smaller than or equal to the specified numeric expression.
FLOOR ( table.column )
INVERSE negates the input number.
INVERSE ( table.column )
The LOG function returns the logarithm of the specified float expression.
LOG ( table.column [, base] )
The modulo operator returns the remainder of a division of both arguments.
MODULO ( dividend, divisor )
dividend % divisor
The multiplication operator computes the product of two values.
MULT ( table.column1 , table.column2 )
table.column1 * table.column2
The POWER function returns the value of the specified expression to the specified power.
POWER ( table.column, exponent )
The ROUND function rounds a numeric value to the nearest integer, or to the given precision if specified.
ROUND ( column [, precision ] )
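For example, ROUND ( "Orders"."Price", 2 ) (with illustrative names) rounds each price to two decimal places, while ROUND ( "Orders"."Price" ) rounds to the nearest integer.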
The SQRT function calculates the positive square root of the specified value.
SQRT ( table.column )
The SQUARE function calculates the square for each input value.
SQUARE ( table.column )
The subtraction operator computes the difference of two values.
SUB ( table.column1 , table.column2 )
table.column1 - table.column2
Object Link enables the analysis of graph data. An example for such data is a bill of materials.
LINK_FILTER enables object-based recursive filtering on the Object Link graph.
LINK_FILTER ( filter_expression, {ANCESTORS|DESCENDANTS} [, number_of_hops] )
LINK_FILTER ( BIND_FILTERS ( object_link_table, filter_expression ), {ANCESTORS|DESCENDANTS} [, number_of_hops] )
LINK_FILTER_ORDERED ( filter_expression, {ANCESTORS|DESCENDANTS} [, number_of_hops] )
LINK_FILTER_ORDERED ( BIND_FILTERS ( activity_table, filter_expression ), {ANCESTORS|DESCENDANTS} [, number_of_hops] )
The LINK_PATH_SOURCE and LINK_PATH_TARGET operators return tables containing the links, joined to the source or target object respectively, that were used during LINK_PATH traversal.
LINK_PATH_SOURCE ( link_path_table , mapping_table1.column [, mapping_table2.column ] ... )
LINK_PATH_SOURCE ( link_path_table.column , mapping_table1.column [, mapping_table2.column ] ... )
LINK_PATH_TARGET ( link_path_table , mapping_table1.column [, mapping_table2.column ] ... )
LINK_PATH_TARGET ( link_path_table.column , mapping_table1.column [, mapping_table2.column ] ... )
The LINK_PATH operator creates an internal activity and case table to represent individual paths calculated by traversing the Object Link graph. The resulting activity table is joined towards the input table and to the internal case table in an N:1 fashion. Additional tables containing link attributes are generated and can be accessed via the LINK_PATH_SOURCE/TARGET operators.
LINK_PATH ( input_table.column [, direction ] [, CONSTRAINED BY ( [ START ( start_objects_expression ) ] [, END ( end_objects_expression ) ] [, LENGTH ( comparison ) ] [, ALL ( all_objects_expression ) ] [, WITH/WITHOUT CYCLES] ) ] )
The LINK_OBJECTS operator creates a new table containing all rows of the input table that are part of the Object Link graph.
LINK_OBJECTS ( input_table.column )[.COLUMN]
LINK_SOURCE and LINK_TARGET output the source or target objects of links that are generated according to the Object Link configuration of the data model. LINK_ATTRIBUTES returns link values.
LINK_SOURCE ( object_link_table1.column [, object_link_table2.column] ... )
LINK_TARGET ( object_link_table1.column [, object_link_table2.column] ... )
LINK_ATTRIBUTES ( mapping_table1.column [, mapping_table2.column] ... )
LINK_SOURCE( BIND ( object_link_table1, table1.column ) [, BIND ( object_link_table2, table2.column ) ] ...)
LINK_TARGET( BIND ( object_link_table1, table1.column ) [, BIND ( object_link_table2, table2.column ) ] ...)
Predicate functions and operators check if the input argument satisfies a condition.
BETWEEN ... AND ... returns true for all cases of a column that lie in the given closed interval, and false otherwise. The reverse is true for NOT BETWEEN ... AND ....
table.column BETWEEN inclusive start of interval AND inclusive end of interval
table.column NOT BETWEEN inclusive start of interval AND inclusive end of interval
The IN_LIKE predicate determines whether one of multiple string patterns on the right matches the left expression string. The left expression can be either a column or a constant string. The pattern list has to contain at least one entry, and all entries have to be STRING values. The output can be negated by using NOT IN_LIKE.
value/column IN_LIKE ( value1/column1, ... )
value/column NOT IN_LIKE ( value1/column1, ... )
IN returns true for all cases of a column or a value that match any of the values of the match list. An item in the match list can also be another column. Returns false otherwise. The reverse is true for NOT IN.
value/column IN ( value1/column1, ... )
value/column NOT IN ( value1/column1, ... )
The ISNULL function returns an INT which indicates whether the input value is NULL. If the input value is NULL, the function returns 1; otherwise it returns 0.
ISNULL ( table.column )
The LIKE predicate determines whether the right expression string matches the left expression string. The output can be negated by using NOT LIKE.
table.column LIKE pattern
table.column NOT LIKE pattern
MULTI_IN returns true for all cases where one or multiple columns match any tuple of the match list. Returns false otherwise. The reverse is true for NOT MULTI_IN.
MULTI_IN ( (table.column, ...), (value1_1, ...), (value2_1, ...), ...)
NOT MULTI_IN ( (table.column, ...), (value1_1, ...), (value2_1, ...), ...)
Process functions perform calculations based on the Case-Id and the Activity column of the event log.
Deprecated: Please use INDEX_ACTIVITY_TYPE instead.
The ACTIVATION_COUNT operator returns, for every activity in every case, how many times this activity has already occurred at a given point in the process.
ACTIVATION_COUNT ( activity_table.column )
ACTIVITY_LAG returns the row of the activity table that precedes the current row by offset number of rows within a case. The resulting column is a temporary column of the activity table.
ACTIVITY_LAG ( activity_table.column [, offset ] )
ACTIVITY_LEAD returns the row of the activity table that follows the current row by offset number of rows within a case. The resulting column is a temporary column of the activity table.
ACTIVITY_LEAD ( activity_table.column [, offset ] )
BPMN_CONFORMS provides a way for simple binary yes/no conformance checking on a BPMN model to distinguish fitting traces from non-fitting traces.
BPMN_CONFORMS ( flattened_events_table.column , bpmn_model, [ ALLOW( filter_or_shorthand [, filter_or_shorthand ... ] ) ] )
This operator crops cases to a range of activities. All values outside this range are mapped to null. Values within the range are flagged with 1, in contrast to CALC_CROP_TO_NULL which returns the input values. The result is an INT column that is activity-based and returns 1 for the activities inside the specified range and null otherwise.
CALC_CROP ( begin_range_specifier TO end_range_specifier, activity_table.string_column )
This operator crops cases to a range of activities. All values outside this range are mapped to null. Values within the range are kept, in contrast to CALC_CROP which maps them to 1. The result is a temporary STRING column of the activity table. It contains activities of the specified range and null otherwise.
CALC_CROP_TO_NULL ( begin_range_specifier TO end_range_specifier, activity_table.string_column )
The CLUSTER_VARIANTS operator groups similar process variants (or traces) into clusters. Variants which cannot be assigned to a cluster are marked as noise.
CLUSTER_VARIANTS ( variant_column, MIN_PTS, EPSILON )
CONFORMANCE flags whether activities are conformant with the given model. READABLE returns violation descriptions in English. To use the model from the conformance checker sheet in an Analysis, the CONFORMANCE query can be written to a variable.
CONFORMANCE ( activity_table.string_column, [ places ], [ transitions ], [ edges ], [ mapping ], [ start_places ], [ end_places ] )
READABLE ( conformance_query )
CREATE_EVENTLOG returns an activity table that is based on a given lead object and a set of included event types.
CREATE_EVENTLOG ( object_table_name [ FILTER lead_object_filter ] , INCLUDE [included_event_1 [ VIA ( relationship_1 ) ] [ FILTER event_filter_1 ], included_event_2 [ VIA ( relationship_2 ) ] [ FILTER event_filter_2 ], ...]).column_name
The DEFAULT ACTIVITY_COLUMN statement can be used to temporarily overwrite the default eventlog for the execution of a single query. The specified column is used whenever an operator would use the activity column of the default eventlog.
DEFAULT ACTIVITY_COLUMN activity_table.column;
SOURCE and TARGET functions provide a way to combine values from two different rows of the activity table into the same row, e.g. for calculating the throughput time between consecutive events inside a case.
SOURCE ( activity_table.column [, activity_table.filter_column ] [, edge_configuration [ WITH START( [ start_value ] ) ] ] )
SOURCE ( activity_table.column [, activity_table.filter_column ] , WITH START( [ start_value ] ) )
The ESTIMATE_CLUSTER_PARAMS operator estimates appropriate values for the MIN_PTS parameter of the CLUSTER_VARIANTS operator.
ESTIMATE_CLUSTER_PARAMS ( table.variant_column, epsilon, number_of_values, recursion_depth )
For each row of a dynamically created eventlog, EVENTLOG_SOURCE_TABLE will return the activity table name from which the row originates.
EVENTLOG_SOURCE_TABLE ( dynamically-created-eventlog.column )
MATCH_ACTIVITIES flags cases with certain activities without taking the exact order of activities into account. If the order is of importance, use MATCH_PROCESS_REGEX or MATCH_PROCESS.
MATCH_ACTIVITIES([ activity_table.string_column, ] [STARTING activity_list ] [, ] [NODE activity_list ] [, ] [NODE_ANY activity_list ] [, ] [ENDING activity_list ] [, ] [EXCLUDING activity_list ] [, ] [EXCLUDING_ALL activity_list ])
MERGE_EVENTLOG merges two columns of the same type from two activity tables into one common column. The sorting of entries in the resulting column is based on timestamps. Depending on the join scenario, the result of MERGE_EVENTLOG might contain some duplicated activities. If this is unwanted, then MERGE_EVENTLOG_DISTINCT can be used. Please note that MERGE_EVENTLOG and MERGE_EVENTLOG_DISTINCT only differ regarding duplicated activities and otherwise behave the same.
MERGE_EVENTLOG ( target_table.column [, FILTER target_table_filter_expression ] , source_table.column [, FILTER source_table_filter_expression ] )
MERGE_EVENTLOG_DISTINCT ( target_table.column [, FILTER target_table_filter_expression ] , source_table.column [, FILTER source_table_filter_expression ] )
PROCESS EQUALS matches the variants of a process based on simple expressions. PROCESS EQUALS is less powerful than MATCH_PROCESS_REGEX and MATCH_PROCESS but it is also simpler to use.
PROCESS [ ON activity_table.string_column ] [ NOT ] EQUALS [ start ] activity ( to activity )* [ end ]
Filters the variants based on a regular expression defined over the activities.
MATCH_PROCESS_REGEX ( activity_table.string_column, regular_expression )
MATCH_PROCESS matches the variants of a process against a given pattern.
MATCH_PROCESS ( [ activity_table.string_column ,] node (, node)* CONNECTED BY edge (, edge)* )
Deprecated: Please use INDEX_ACTIVITY_ORDER instead.
PROCESS_ORDER returns the position of each activity within a case. Only non-null activities are counted.
PROCESS_ORDER ( activity_table.column )
Deprecated: Deprecated alias for CREATE_EVENTLOG, please use CREATE_EVENTLOG instead.
PROJECT_ON_OBJECT returns an activity table that is based on a given lead object and a set of included event types.
PROJECT_ON_OBJECT ( object_table_name [ FILTER lead_object_filter ] , INCLUDE [included_event_1 [ VIA ( relationship_1 ) ] [ FILTER event_filter_1 ], included_event_2 [ VIA ( relationship_2 ) ] [ FILTER event_filter_2 ], ...]).column_name
CALC_REWORK counts the number of activities per case. A filter expression can be specified to restrict the activities that should be taken into account. The result is an INT column in which the value of each row gives the number of activities for the respective case ID. The resulting column is temporarily added to the case table.
CALC_REWORK ()
CALC_REWORK ( filter_condition )
CALC_REWORK ( activity_table.column )
CALC_REWORK ( filter_condition , activity_table.column )
CALC_THROUGHPUT is used to calculate, for each case, the time between two activities. The activities at which the calculated throughput time starts and ends can be configured through range specifiers. The resulting column is temporarily added to the case table and contains the throughput time between the two specified activities per case.
CALC_THROUGHPUT ( begin_range_specifier TO end_range_specifier, timestamps [, activity_table.string_column ] )
CALC_THROUGHPUT ( object_table_name , begin_event_specifier TO end_event_specifier , time_unit )
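A common pattern, sketched here with an illustrative activity table, measures the duration of each whole case in days:
CALC_THROUGHPUT ( CASE_START TO CASE_END, REMAP_TIMESTAMPS ( "Activities"."Timestamp", DAYS ) )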
TRANSIT_COLUMN computes transition edges between related cases from two different processes.
TRANSIT_COLUMN(TIMESTAMP_INTERLEAVED_MINER(activity_table_a.activity_column, activity_table_b.activity_column), activity_column )
TRANSIT_COLUMN(TIMESTAMP_NONINTERLEAVED_MINER(activity_table_a.activity_column, activity_table_b.activity_column), activity_column )
TRANSIT_COLUMN(MATCH_MINER(activity_table_a.activity_column, activity_table_b.activity_column, activity_table_a.out_msg, activity_table_b.in_msg), activity_column )
TRANSIT_COLUMN(MANUAL_MINER(activity_table_a.activity_column, activity_table_b.activity_column, [manual_value_1, manual_value_2], ...), activity_column )
VARIANT aggregates all activities of a case into a string, which represents a variant of the process. SHORTENED can be used to shorten self-loops to a maximum number of occurrences. The resulting column is case-based and contains one value for each case that gives the (shortened) string of activities for this case.
VARIANT ( activity_table.string_column )
SHORTENED ( VARIANT ( activity_table.column ) [, max_cycle_length ] )
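For example, with an illustrative activity table, SHORTENED ( VARIANT ( "Activities"."Activity" ), 2 ) would collapse self-loops to at most two occurrences.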
Index functions create integer indices based on a column.
The INDEX_ACTIVITY_LOOP_REVERSE operator returns how many times every activity has occurred in direct succession in every case, counting in reverse order.
INDEX_ACTIVITY_LOOP_REVERSE ( activity_table.column )
The INDEX_ACTIVITY_LOOP operator returns how many times every activity has occurred in direct succession in every case.
INDEX_ACTIVITY_LOOP ( activity_table.column )
INDEX_ACTIVITY_ORDER_REVERSE returns the reverse position of each activity within a case. Only non-null activities are counted.
INDEX_ACTIVITY_ORDER_REVERSE ( activity_table.column )
INDEX_ACTIVITY_ORDER returns the position of each activity within a case. Only non-null activities are counted.
INDEX_ACTIVITY_ORDER ( activity_table.column )
The INDEX_ACTIVITY_TYPE_REVERSE operator returns, for every activity in every case, how many times this activity has already occurred at a given point in the process, counting in reversed order.
INDEX_ACTIVITY_TYPE_REVERSE ( activity_table.column )
The INDEX_ACTIVITY_TYPE operator returns, for every activity in every case, how many times this activity has already occurred at a given point in the process.
INDEX_ACTIVITY_TYPE ( activity_table.column )
Process reference functions can be used to reference the case table or configured activity columns based on an expression.
ACTIVITY_COLUMN allows you to refer to the activity column of an activity table in the data model without using the exact column name.
ACTIVITY_COLUMN ( [ expression ] )
ACTIVITY_TABLE allows you to refer to the activity table in the data model without using its exact table name. This operator is helpful in data models with multiple activity/case tables.
ACTIVITY_TABLE ( [ expression ] )
CASE_ID_COLUMN allows you to refer to the case ID column of an activity table in the data model without using the exact column name.
CASE_ID_COLUMN ( [ expression ] )
CASE_TABLE allows you to refer to the case table in the data model without using its exact table name. This operator is helpful in data models with multiple activity/case tables.
CASE_TABLE ( [ expression ] )
END_TIMESTAMP_COLUMN allows you to refer to the end timestamp column of an activity table in the data model without using the exact column name. If no end timestamp column is defined for the target activity table, the timestamp column is referenced.
END_TIMESTAMP_COLUMN ( [ expression ] )
SORTING_COLUMN allows you to refer to the sorting column of an activity table in the data model without using the exact column name.
SORTING_COLUMN ( [ expression ] )
TIMESTAMP_COLUMN allows you to refer to the timestamp column of an activity table in the data model without using the exact column name.
TIMESTAMP_COLUMN ( [ expression ] )
Set operators operate on sets of column values and table rows that may also contain multiple identical elements.
The UNION_ALL_PULLBACK function returns the column corresponding to the index requested from the UNION_ALL table.
UNION_ALL_PULLBACK ( expression, index )
The UNION_ALL_TABLE function returns the table formed from the vertical concatenation of the columns of the input tables.
UNION_ALL_TABLE ( table1 , ... , tableN )
The UNION_ALL function returns the vertical concatenation of columns.
UNION_ALL ( table.column1 , ... , table.columnN )
Static PQL functions provide a lightweight way to check general properties of a data model or expression before actually executing a PQL query. For example, you can check the data type of a data model column and based on the type modify the actual PQL query to be executed.
ARGUMENT_COUNT counts the number of arguments (i.e., constants or expressions) passed to it.
ARGUMENT_COUNT ( [ Expression [ , ... ] ] )
COLUMN_TYPE returns the type of a data model column.
COLUMN_TYPE ( "TABLE"."COLUMN" )
IF evaluates a list of static conditions and replaces itself with the query for the first static condition that is true.
IF static_condition THEN calculation [ ELSE IF static_condition THEN calculation ]* [ ELSE calculation ] END
STATIC CASE WHEN evaluates a list of static conditions and replaces itself with the expression for the first static condition that is true.
STATIC CASE WHEN static_condition THEN expression [ WHEN static_condition THEN expression ]* ELSE expression END
The three statistical functions currently available, QNORM, ZSCORE, and BUCKET_LOWER_BOUND/BUCKET_UPPER_BOUND, are used to map or standardize a column under assumptions about the statistical properties of the data, or, in the case of BUCKET_LOWER_BOUND/BUCKET_UPPER_BOUND, to sort the column values into suitable histogram buckets.
BUCKET_UPPER_BOUND and BUCKET_LOWER_BOUND calculate histogram buckets and identify the buckets in which the input values reside.
BUCKET_UPPER_BOUND ( table.column [, SUGGESTED_WIDTH ( suggested_width ) ] [, SUGGESTED_TICKS ( suggested_ticks ) ] [, MAX_COUNT ( max_count ) ] )
BUCKET_UPPER_BOUND ( table.column [, SUGGESTED_COUNT ( suggested_count ) ] [, SUGGESTED_TICKS ( suggested_ticks ) ] [, MAX_COUNT ( max_count ) ] )
BUCKET_LOWER_BOUND ( table.column [, SUGGESTED_WIDTH ( suggested_width ) ] [, SUGGESTED_TICKS ( suggested_ticks ) ] [, MAX_COUNT ( max_count ) ] )
BUCKET_LOWER_BOUND ( table.column [, SUGGESTED_COUNT ( suggested_count ) ] [, SUGGESTED_TICKS ( suggested_ticks ) ] [, MAX_COUNT ( max_count ) ] )
BUCKET_UPPER_BOUND ( )
BUCKET_LOWER_BOUND ( )
Evaluates the quantile function of values in a FLOAT column assuming the normal distribution N(0,1).
QNORM ( table.column )
This function calculates the z-score over an INT or FLOAT column. The output type is always FLOAT.
ZSCORE ( table.column [, PARTITION BY ( partition_column, ... )] )
Celonis provides multiple string functions that allow you to modify STRING data in columns and constants.
CLUSTER_STRINGS clusters strings in a column based on their similarity.
CLUSTER_STRINGS ( table.column , EDIT_THRESHOLD ( edit_distance ) [, WEIGHT ( weighted_tokens , weight ) ] [, PARTITION BY ( partition_column, ...)])
CONCAT and || return the concatenation of two or more strings.
CONCAT ( table.column1, ..., table.columnN )
table.column1 || table.column2 [ || table.columnN ]*
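A hypothetical example that builds a full name from two columns (names are illustrative):
CONCAT ( "Contacts"."FirstName", ' ', "Contacts"."LastName" )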
DELETE_CHARACTERS returns a string with the characters defined in the match string removed.
DELETE_CHARACTERS ( table.column, match_string )
LEFT returns the first n characters of a string. If the string is shorter than n characters, the whole string is returned.
LEFT ( table.column, n )
LEN returns the size of the string, i.e., the number of characters.
LEN ( table.column )
LOWER returns the lower-cased version of a string.
LOWER ( table.column )
LTRIM returns the string with leading whitespace removed. Optionally pass a string argument containing Unicode characters to be trimmed from the string.
LTRIM ( table.column [, characters ])
MAP_CHARACTERS returns a string with the characters defined in the match string replaced by the corresponding character.
MAP_CHARACTERS ( table.column, match_string, replace_string )
MATCH_STRINGS finds the top-k best matching strings from another column for each string in the input column.
MATCH_STRINGS (table1.input_column, table2.match_column [, TOP_K ( top_k )] [, SEPARATOR ( separator ) ])
PATINDEX searches for the pattern-substring and returns its index (starting from 1) within the input if it was found (zero otherwise).
PATINDEX ( table.column, pattern [, occurrence ] )
REPLACE returns the string with all occurrences of the pattern substring replaced by the replace string.
REPLACE ( table.column, pattern, replace )
REVERSE returns the reversed string.
REVERSE ( table.column )
RIGHT returns the last n characters of a string. If the string is shorter than n characters, the whole string is returned.
RIGHT ( table.column, n )
RTRIM returns the string with trailing whitespace removed. Optionally pass a string argument containing Unicode characters to be trimmed from the string.
RTRIM ( table.column [, characters ] )
STRINGHASH computes a cryptographic hash of a string, encoded using base64 encoding. The returned hash may change in the future.
STRINGHASH ( table.column )
STRING_SPLIT splits the string into sub-strings based on pattern and returns the split-index-th split.
STRING_SPLIT ( table.column, pattern, split-index )
Deprecated: Please use TO_INT or TO_FLOAT instead.
STR_TO_INT converts the input strings to integers. If the input is not a number (or outside of the supported range), null is returned. A decimal point is allowed in input, but decimal places are ignored/truncated.
STR_TO_INT ( table.column )
SUBSTRING returns the substring of size size, starting at the character with zero-based index start.
SUBSTRING ( table.column, start, size )
UPPER returns the upper-cased version of a string.
UPPER ( table.column )
Window functions and operators perform calculations in a partitioned table.
INDEX_ORDER returns a column with integer indices, starting from 1. The indices indicate the order of the rows.
INDEX_ORDER ( column [, ORDER BY ( sort_column [sorting], ... )] [, PARTITION BY ( partition_column, ... )] )
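For example, the following sketch (with illustrative names) numbers orders chronologically within each region:
INDEX_ORDER ( "Orders"."OrderId", ORDER BY ( "Orders"."CreatedAt" ), PARTITION BY ( "Orders"."Region" ) )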
INTERPOLATE can be used to interpolate missing values (i.e. NULL values). It is possible to specify a column-based ordering and partitioning. The output type is INT if the input column is INT, and FLOAT otherwise.
INTERPOLATE ( column, interpolation_type [, ORDER BY ( sort_column [sorting], ... )] [, PARTITION BY ( partition_column, ... )] )
LAG returns the row that precedes the current row by offset number of rows. It is possible to specify a column-based ordering and partitioning. Null values are skipped.
LAG ( column [, ORDER BY ( sort_column [sorting], ... )] [, PARTITION BY ( partition_column, ... )] [, offset ] )
LEAD returns the row that follows the current row by offset number of rows. It is possible to specify a column-based ordering and partitioning. Null values are skipped.
LEAD ( table.column [, ORDER BY ( order_column [sorting], ... )] [, PARTITION BY ( partition_column, ... )] [, offset ] )
RUNNING_SUM returns the sum of the previous rows. It is possible to specify a column-based ordering and partitioning. Null values are skipped.
RUNNING_SUM ( column [, ORDER BY ( sort_column [sorting], ... )] [, PARTITION BY ( partition_column, ... )] )
WINDOW_AVG can be used to calculate the average over a user-defined window. It is possible to specify a column-based ordering and partitioning.
WINDOW_AVG ( column, start, end [, ORDER BY ( sort_column [sorting], ... )] [, PARTITION BY ( partition_column, ... )] )