SQL Analytic Functions

CUME_DIST windowing function

Syntax

CUME_DIST ( ) OVER ( [ partition_by_clause ] [ order_by_clause ] )

Requires partition_by_clause and/or order_by_clause.

Description

The analytic function CUME_DIST calculates the cumulative distribution of a value within a group of values. The cumulative distribution of a value is the relative position of a specified value in a group of values. With a value set ordered by ascending values, the CUME_DIST of a value for a given row is defined as the number of rows with values less than or equal to the value in the given row, divided by the number of rows evaluated in the partition. The result of the function is float.

Example

select empno, deptno, projno, cume_dist() over (partition by deptno, projno order by empno) from emp;

       EMPNO   DEPTNO  PROJNO              CUME_DIST()
       -----   ------  ------              -----------
        7782       10     101     1.0000000000000000000
        7839       10     102     0.5000000000000000000
        7934       10     102     1.0000000000000000000
        7329       20     101     0.2000000000000000000
        7566       20     101     0.4000000000000000000
        7788       20     101     0.6000000000000000000
        7876       20     101     0.8000000000000000000
        7902       20     101     1.0000000000000000000
        7698       30     101     1.0000000000000000000
        7844       30     102     0.5000000000000000000
        7900       30     102     1.0000000000000000000
        7499       30     103     0.3333333333333330000
        7521       30     103     0.6666666666666670000
        7654       30     103     1.0000000000000000000

See Also

 

FIRST_VALUE windowing function

Syntax

FIRST_VALUE ( [scalar_expression ] ) OVER ( [ partition_by_clause ] [ order_by_clause ] )

Requires partition_by_clause and/or order_by_clause.

Description

The analytic function FIRST_VALUE returns the first value of an ordered set of values. The argument column_ref or expression can be of any type. It must result in a single value. The result of the function is of the same data type as that of the argument. The result can have a null value.

Example

select empno, deptno, projno, sal, first_value(sal) over (partition by deptno, projno order by sal) lowest_sal from emp;

       EMPNO    DEPTNO    PROJNO        SAL        LOWEST_SAL
       -----    ------    ------        ---       -----------
        7782        10       101     2450.00          2450.00
        7934        10       102     1300.00          1300.00
        7839        10       102     5000.00          1300.00
        7329        20       101      800.00           800.00
        7876        20       101     1100.00           800.00
        7566        20       101     2975.00           800.00
        7788        20       101     3000.00           800.00
        7902        20       101     3000.00           800.00
        7698        30       101     2850.00          2850.00
        7900        30       102      950.00           950.00
        7844        30       102     1500.00           950.00
        7521        30       103     1250.00          1250.00
        7654        30       103     1250.00          1250.00
        7499        30       103     1600.00          1250.00

Note

Analytic functions are recalculated for each new value in the ordered result set. Therefore, first_value() will return the same value for each row in a partition. The return value for last_value() within a partition will change as each new value in the order_by_clause is returned. The same value will be returned for each row in the result set with the same value from the order_by_clause.

See Also

 

LAG windowing function

Declaration

LAG (scalar_expression [ , offset ] [ , default ] ) [ IGNORE NULLS | RESPECT NULLS ] OVER ( [ partition_by_clause ] [ order_by_clause ] )

Requires partition_by_clause and/or order_by_clause.

Description

LAG accesses data from a previous row in the same result set without the use of a self-join.

Use this analytic function in a SELECT statement to compare values in the current row with values in a previous row:

  • expression - Defines the value the function returns from the specified row.
  • offset - a positive or negative value depending on before or after current row.
  • default - default is the value used when there is no row at the specified position.
  • order_by (required)
ROWS BETWEEN xxx AND yyy - ROWS and RANGE BETWEEN clauses may change the first row in the partition.
RANGE BETWEEN xxx AND yyy - ROWS and RANGE BETWEEN clauses may change the first row in the partition.

 lag() and lead() return a value from a row that is before or after the current row in the current partition. lag() and lead() are synonyms except they treat the direction of the offset oppositely. That is, a negative offset value in lag() returns the same result as a positive offset in lead().

lag(expr, offset, default)  ==  lead(expr, -offset, default)
  • A positive offset in lag( offset ) is the number of rows before the current row.
  • A negative offset in lag( offset ) is the number of rows after the current row.
  • A positive offset in lead( offset ) is the number of rows after the current row.
  • A negative offset in lead( offset ) is the number of rows before the current row.

Example


SELECT
    Sale_Date,
    Product_ID,
    Price,
    LAG(Price, 1, 0) OVER (
        PARTITION BY Product_ID
        ORDER BY Sale_Date
    ) AS Previous_Price,
    Price - LAG(Price, 1, 0) OVER (
        PARTITION BY Product_ID
        ORDER BY Sale_Date
    ) AS Price_Change
FROM
    Sales_History
WHERE
    Product_ID = 'Product_A'
ORDER BY
    Sale_Date;


Sale_Date	Product_ID	Price	Previous_Price	Price_Change
2024-01-01	Product_A	10.00	0.00			10.00
2024-01-15	Product_A	12.00	10.00			2.00
2024-01-20	Product_A	11.00	12.00			-1.00
2024-02-05	Product_A	11.00	11.00			0.00 

See Also

LEAD

 

LEAD windowing function

Declaration

LEAD ( scalar_expression [ , offset ] [ , default ] ) [ IGNORE NULLS | RESPECT NULLS ] OVER ( [ partition_by_clause ] [ order_by_clause ] )

Requires partition_by_clause and/or order_by_clause.

Description

LEAD accesses data from a subsequent row in the same result set without the use of a self-join.

Use this analytic function in a SELECT statement to compare values in the current row with values in a following row.

  • expression - defines the value the function returns from the specified row.
  • offset - a positive or negative value depending on before or after current row.
  • default - default is the value used when there is no row at the specified position.
  • order_by (required)
ROWS BETWEEN xxx AND yyy - ROWS and RANGE BETWEEN clauses may change the first row in the partition.
RANGE BETWEEN xxx AND yyy - ROWS and RANGE BETWEEN clauses may change the first row in the partition.

 lag() and lead() return a value from a row that is before or after the current row in the current partition. lag() and lead() are synonyms except they treat the direction of the offset oppositely. That is, a negative offset value in lag() returns the same result as a positive offset in lead().

lag(expr, offset, default)  ==  lead(expr, -offset, default)
  • A positive offset in lag( offset ) is the number of rows before the current row.
  • A negative offset in lag( offset ) is the number of rows after the current row.
  • A positive offset in lead( offset ) is the number of rows after the current row.
  • A negative offset in lead( offset ) is the number of rows before the current row.

Example

SELECT
    Sale_Date,
    Product_ID,
    Price,
    LEAD(Sale_Date, 1) OVER (
        PARTITION BY Product_ID
        ORDER BY Sale_Date
    ) AS Next_Sale_Date,
    -- Calculate the difference in days between the next sale and the current sale
    JULIANDAY(LEAD(Sale_Date, 1) OVER (
        PARTITION BY Product_ID
        ORDER BY Sale_Date
    )) - JULIANDAY(Sale_Date) AS Days_Until_Next_Sale
FROM
    Sales_History
WHERE
    Product_ID = 'Product_B'
ORDER BY
    Sale_Date;

Sale_Date	Product_ID	Price	Next_Sale_Date	Days_Until_Next_Sale
2024-03-01	Product_B	25.00	2024-03-05		4
2024-03-05	Product_B	26.50	2024-03-12		7
2024-03-12	Product_B	25.00	2024-04-01		20
2024-04-01	Product_B	27.00	NULL			NULL

See Also

LAG

 

LAST_VALUE windowing function

Syntax

LAST_VALUE ( [scalar_expression ] ) OVER ( [ partition_by_clause ] [ order_by_clause ] )

Requires partition_by_clause and/or order_by_clause.

Description

The analytic function LAST_VALUE returns the last value of an ordered set of values. The argument column_ref or expression can be of any type. It must result in a single value. The result of the function is of the same data type as that of the argument. The result can have a null value.

Example

select empno, deptno, projno, hiredate, sal, last_value(hiredate) over (partition by deptno, projno order by sal) last_date from emp;

       EMPNO  DEPTNO    PROJNO       HIREDATE        SAL      LAST_DATE
       -----  ------    ------       --------        ---   ------------
        7782      10       101      06/09/1981   2450.00      06/09/1981
        7934      10       102      01/23/1982   1300.00      01/23/1982
        7839      10       102      11/17/1981   5000.00      11/17/1981
        7329      20       101      12/17/1980    800.00      12/17/1980
        7876      20       101      01/12/1983   1100.00      01/12/1983
        7566      20       101      04/02/1981   2975.00      04/02/1981
        7788      20       101      12/09/1982   3000.00      12/03/1981
        7902      20       101      12/03/1981   3000.00      12/03/1981
        7698      30       101      05/01/1981   2850.00      05/01/1981
        7900      30       102      12/03/1981    950.00      12/03/1981
        7844      30       102      09/08/1981   1500.00      09/08/1981
        7521      30       103      02/22/1981   1250.00      09/28/1981
        7654      30       103      09/28/1981   1250.00      09/28/1981
        7499      30       103      02/20/1981   1600.00      02/20/1981

Note

Analytic functions are recalculated for each new value in the ordered result set. Therefore, first_value() will return the same value for each row in a partition. The return value for last_value() within a partition will change as each new value in the order_by_clause is returned. The same value will be returned for each row in the result set with the same value from the order_by_clause.

See Also

 

PERCENTILE_CONT windowing function

Declaration

PERCENTILE_CONT ( numeric_literal ) WITHIN GROUP ( [ order_by_clause ] ) OVER ( [ partition_by_clause ] )

Requires partition_by_clause and/or order_by_clause.

Description

The window function PERCENTILE_CONT assumes a continuous distribution of the values in the window and takes a percentile value to calculate a value. The value is interpolated, and while it will fall within the range of the window, the value itself may not exist within the result set.

• The argument numeric_literal must be a value between 0.0 and 1.0 .

• WITHIN GROUP specifies a set of numeric values which will be ordered, possibly partitioned, and then used to compute the percentile result.

• The OVER clause of the PERCENTILE_CONT function supports only the partition_by_clause. An ORDER BY clause may NOT be specified within the OVER clause.

Example

select c1, c3, percentile_cont(0.6) within group (order by c1) over (partition by c3) from pc2;

C1 C3 PERCENTILE_CONT(0.600000)
-- -- -------------------------
   1  1.6000000000000000000
1  1  1.6000000000000000000
2  1  1.6000000000000000000
4  2  4.6000000000000000000
5  2  4.6000000000000000000
   3
6 records selected

 

PERCENTILE_DISC windowing function

Declaration

PERCENTILE_DISC ( numeric_literal ) WITHIN GROUP ( [ order_by_clause ] ) OVER ( [ partition_by_clause ] )

Requires partition_by_clause and/or order_by_clause.

Description

The window function PERCENTILE_DISC computes a percentile for the sorted values within a window. The result is a column value that exists within the window which represents the value with the smallest CUME_DIST value that is greater than or equal to the percentile value specified by the numeric_literal argument.

• The argument numeric_literal must be a value between 0.0 and 1.0 .

• WITHIN GROUP specifies a set of numeric values which will be ordered, possibly partitioned and then used to compute the percentile result.

• The OVER clause of the PERCENTILE_CONT function supports only the partition_by_clause. An ORDER BY clause may NOT be specified within the OVER clause.

Example

select c1, c3, percentile_disc(0.6) within group (order by c1) over (partition by c3) from pc2;

C1 C3 PERCENTILE_D
-- -- ------------
   1  2
1  1  2
2  1  2
4  2  5
5  2  5
   3
6 records selected

 

PERCENT_RANK windowing function

Syntax

PERCENT_RANK ( ) OVER ( [ partition_by_clause ] [ order by_clause ] )

Requires partition_by_clause and/or order_by_clause.

Description

The analytic function PERCENT_RANK calculates the relative rank of a value within a result set. The result of the function is float.

Example

select empno, deptno, projno, percent_rank() over (partition by deptno, projno order by empno) from emp;

       EMPNO   DEPTNO   PROJNO   	      PERCENT_RANK()
       -----   ------   ------               -----------
        7782       10      101     0.0000000000000000000
        7839       10      102     0.0000000000000000000
        7934       10      102     1.0000000000000000000
        7329       20      101     0.0000000000000000000
        7566       20      101     0.2500000000000000000
        7788       20      101     0.5000000000000000000
        7876       20      101     0.7500000000000000000
        7902       20      101     1.0000000000000000000
        7698       30      101     0.0000000000000000000
        7844       30      102     0.0000000000000000000
        7900       30      102     1.0000000000000000000
        7499       30      103     0.0000000000000000000
        7521       30      103     0.5000000000000000000
        7654       30      103     1.0000000000000000000

See Also