Usual Oracle Analytical Functions

Continued from previous post Oracle Analytical Functions (1)

In this article we are going to examine some of the most usual Oracle Analytical Functions. There is a list of functions below.

Function Description

lag

To access prior row in a partition or result set
lead To access later row in a partition or result set

first_value

To access first row in a partition or result set
Last_value To access last row in a partition or result set

nth_value

To access any arbitrary row in a partition or result set
rank To rank the rows in a sort order. Ranks are not skipped in the case of ties
dense_rank To rank the rows in a sort order. Ranks are not skipped in the case of ties

row_number

To sort the rows and add a unique number to each row. This is a nondeterministic function

ratio_to_report

To compute the ratio of value to the report
percent_rank To compute the rank of value normalized to a value between 0 and 1

percentile_cont

To retrieve the value matching with the specified percent_rank. Reverse of percent_rank function

percentile_dist

To retrieve the value matching with the specified percent_rank. Assumes discreet distribution model
ntile To group rows into units
listagg To convert column values from different rows into a list format

 

NOTE. All examples use the HR Oracle sample schema.

Aggregation functions

Aggregation functions can operate in analytic mode or conventional non-analytic mode.

Aggregation functions in non-analytic mode reduce the result set to fewer rows.

However, in analytic mode, aggregation functions do not reduce the result set but can fetch both aggregated and non-aggregated columns in the same row.

Usual Analytical Functions

Using ROWS BETWEEN UNBOUNDED PRECEDING  AND CURRENT ROW clause:

SELECT first_name,
last_name,
salary,
hire_date,
d.department_id,
department_name,
SUM(salary) OVER (PARTITION BY e.department_id ORDER BY salary, hire_date ROWS BETWEEN UNBOUNDED PRECEDING  AND CURRENT ROW) sum_salary
FROM employees e
INNER JOIN departments d
ON e.department_id = d.department_id
WHERE d.department_id = 100;

Using ROWS BETWEEN UNBOUNDED PRECEDING  AND UNBOUNDED FOLLOWING clause:

SELECT first_name,
last_name,
salary,
hire_date,
d.department_id,
department_name,
SUM(salary) OVER (PARTITION BY e.department_id ORDER BY salary, hire_date ROWS BETWEEN UNBOUNDED PRECEDING  AND UNBOUNDED FOLLOWING) sum_salary
FROM employees e
INNER JOIN departments d
ON e.department_id = d.department_id
WHERE d.department_id = 100;

Using ROWS BETWEEN N PRECEDING  AND N FOLLOWING clause (for granular window specifications):

SELECT first_name,
last_name,
salary,
hire_date,
d.department_id,
department_name,
SUM(salary) OVER (PARTITION BY e.department_id ORDER BY salary, hire_date ROWS BETWEEN 1 PRECEDING  AND 1 FOLLOWING) sum_salary
FROM employees e
INNER JOIN departments d
ON e.department_id = d.department_id
WHERE d.department_id = 100;

Lead and Lag Functions

Lag and lead functions have ability to access a prior row or a latter row in the result set.

These functions do not support the windowing clause. Only the partition by and order by clauses are supported with these two functions.

Syntax is as follows:

lag | lead (expression, offset, default ) over (partition-clause order-by-clause)

It is possible to access any row within a data partition by specifying a different offset. The lag and lead functions are using an offset of one to access the first prior row and the first next row.

SELECT first_name,
last_name,
salary,
department_name,
LAG(salary, 1, salary) OVER (PARTITION BY e.department_id ORDER BY salary, hire_date) salary_prev_rec,
LEAD(salary, 1, salary) OVER (PARTITION BY e.department_id ORDER BY salary, hire_date) salary_nex_rec
FROM employees e
INNER JOIN departments d
ON e.department_id = d.department_id
WHERE d.department_id = 100;

first_value and last_value

These functions are useful in calculating the maximum and minimum values in an ordered result set. The first_value function retrieves the column value from the first row in a window of rows; the last_value function retrieves the column value from the last row in that window.

Syntax for the first_value/last_value functions is as follows:

first_value | last_value(expression) over (partition-clause order-by-clause windowing-clause)

Aggregation can be performed at a different level with a different partitioning clause as in the following example:

SELECT first_name,
last_name,
salary,
department_name,
FIRST_VALUE(salary) OVER (PARTITION BY e.department_id ORDER BY salary, hire_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) min_salary,
LAST_VALUE(salary) OVER (PARTITION BY e.department_id ORDER BY salary, hire_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) max_salary
FROM employees e
INNER JOIN departments d
ON e.department_id = d.department_id
WHERE d.department_id = 100;

To be continued…

 

Bibliography:

Database Data Warehousing Guide/SQL for Analysis and Reporting

Pro Oracle SQL by Morton, K., Osborne, K., Sands, R., Shamsudeen, R., Still, J.

Leave a Reply

Your email address will not be published. Required fields are marked *

Post comment