Usual Oracle Analytical Functions
Continued from previous post Oracle Analytical Functions (1)
In this article we are going to examine some of the most usual Oracle Analytical Functions. There is a list of functions below.
| Function | Description |
|
lag |
To access prior row in a partition or result set |
| lead | To access later row in a partition or result set |
|
first_value |
To access first row in a partition or result set |
| Last_value | To access last row in a partition or result set |
|
nth_value
|
To access any arbitrary row in a partition or result set |
| rank | To rank the rows in a sort order. Ranks are not skipped in the case of ties |
| dense_rank | To rank the rows in a sort order. Ranks are not skipped in the case of ties |
|
row_number |
To sort the rows and add a unique number to each row. This is a nondeterministic function |
|
ratio_to_report |
To compute the ratio of value to the report |
| percent_rank | To compute the rank of value normalized to a value between 0 and 1 |
|
percentile_cont |
To retrieve the value matching with the specified percent_rank. Reverse of percent_rank function |
|
percentile_dist |
To retrieve the value matching with the specified percent_rank. Assumes discreet distribution model |
| ntile | To group rows into units |
| listagg | To convert column values from different rows into a list format |
NOTE. All examples use the HR Oracle sample schema.
Aggregation functions
Aggregation functions can operate in analytic mode or conventional non-analytic mode.
Aggregation functions in non-analytic mode reduce the result set to fewer rows.
However, in analytic mode, aggregation functions do not reduce the result set but can fetch both aggregated and non-aggregated columns in the same row.
Usual Analytical Functions
Using ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW clause:
SELECT first_name,
last_name,
salary,
hire_date,
d.department_id,
department_name,
SUM(salary) OVER (PARTITION BY e.department_id ORDER BY salary, hire_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) sum_salary
FROM employees e
INNER JOIN departments d
ON e.department_id = d.department_id
WHERE d.department_id = 100;
Using ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING clause:
SELECT first_name,
last_name,
salary,
hire_date,
d.department_id,
department_name,
SUM(salary) OVER (PARTITION BY e.department_id ORDER BY salary, hire_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) sum_salary
FROM employees e
INNER JOIN departments d
ON e.department_id = d.department_id
WHERE d.department_id = 100;
Using ROWS BETWEEN N PRECEDING AND N FOLLOWING clause (for granular window specifications):
SELECT first_name,
last_name,
salary,
hire_date,
d.department_id,
department_name,
SUM(salary) OVER (PARTITION BY e.department_id ORDER BY salary, hire_date ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) sum_salary
FROM employees e
INNER JOIN departments d
ON e.department_id = d.department_id
WHERE d.department_id = 100;
Lead and Lag Functions
Lag and lead functions have ability to access a prior row or a latter row in the result set.
These functions do not support the windowing clause. Only the partition by and order by clauses are supported with these two functions.
Syntax is as follows:
lag | lead (expression, offset, default ) over (partition-clause order-by-clause)
It is possible to access any row within a data partition by specifying a different offset. The lag and lead functions are using an offset of one to access the first prior row and the first next row.
SELECT first_name,
last_name,
salary,
department_name,
LAG(salary, 1, salary) OVER (PARTITION BY e.department_id ORDER BY salary, hire_date) salary_prev_rec,
LEAD(salary, 1, salary) OVER (PARTITION BY e.department_id ORDER BY salary, hire_date) salary_nex_rec
FROM employees e
INNER JOIN departments d
ON e.department_id = d.department_id
WHERE d.department_id = 100;
first_value and last_value
These functions are useful in calculating the maximum and minimum values in an ordered result set. The first_value function retrieves the column value from the first row in a window of rows; the last_value function retrieves the column value from the last row in that window.
Syntax for the first_value/last_value functions is as follows:
first_value | last_value(expression) over (partition-clause order-by-clause windowing-clause)
Aggregation can be performed at a different level with a different partitioning clause as in the following example:
SELECT first_name,
last_name,
salary,
department_name,
FIRST_VALUE(salary) OVER (PARTITION BY e.department_id ORDER BY salary, hire_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) min_salary,
LAST_VALUE(salary) OVER (PARTITION BY e.department_id ORDER BY salary, hire_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) max_salary
FROM employees e
INNER JOIN departments d
ON e.department_id = d.department_id
WHERE d.department_id = 100;
To be continued…
Bibliography:
Database Data Warehousing Guide/SQL for Analysis and Reporting
Pro Oracle SQL by Morton, K., Osborne, K., Sands, R., Shamsudeen, R., Still, J.