云数据库 GAUSSDB-窗口函数:窗口函数

时间:2024-11-02 18:52:10

窗口函数

窗口函数与OVER语句一起使用。OVER语句用于对数据进行分组,并对组内元素进行排序。窗口函数用于给组内的值生成序号。

窗口函数中的order by后面必须跟字段名,若order by后面跟数字,该数字会被按照常量处理,因此对目标列没有起到排序的作用。

  • RANK()

    描述:RANK函数为各组内值生成跳跃排序序号,其中,相同的值具有相同序号。

    返回值类型:BIGINT

    示例:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    openGauss=# CREATE TABLE rank_t1(a int, b int);
    
    openGauss=# INSERT INTO rank_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6);
    
    openGauss=# SELECT a,b,RANK() OVER(PARTITION BY a ORDER BY b) FROM rank_t1;
     a | b | rank 
    ---+---+------
     1 | 1 |    1
     1 | 1 |    1
     1 | 2 |    3
     1 | 3 |    4
     2 | 4 |    1
     2 | 5 |    2
     3 | 6 |    1
    (7 rows)
    
    openGauss=# DROP TABLE rank_t1;
    
  • ROW_NUMBER()

    描述:ROW_NUMBER函数为各组内值生成连续排序序号,其中,相同的值其序号也不相同。

    返回值类型:BIGINT

    示例:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    openGauss=# CREATE TABLE row_number_t1(a int, b int);
    
    openGauss=# INSERT INTO row_number_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6);
    
    openGauss=# SELECT a,b,ROW_NUMBER() OVER(PARTITION BY a ORDER BY b) FROM row_number_t1;
     a | b | row_number 
    ---+---+------------
     1 | 1 |          1
     1 | 1 |          2
     1 | 2 |          3
     1 | 3 |          4
     2 | 4 |          1
     2 | 5 |          2
     3 | 6 |          1
    (7 rows)
    
    openGauss=# DROP TABLE row_number_t1;
    
  • DENSE_RANK()

    描述:DENSE_RANK函数为各组内值生成连续排序序号,其中,相同的值具有相同序号。

    返回值类型:BIGINT

    示例:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    openGauss=# CREATE TABLE dense_rank_t1(a int, b int);
    
    openGauss=# INSERT INTO dense_rank_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6);
    
    openGauss=# SELECT a,b,DENSE_RANK() OVER(PARTITION BY a ORDER BY b) FROM dense_rank_t1;
     a | b | dense_rank 
    ---+---+------------
     1 | 1 |          1
     1 | 1 |          1
     1 | 2 |          2
     1 | 3 |          3
     2 | 4 |          1
     2 | 5 |          2
     3 | 6 |          1
    (7 rows)
    
    openGauss=# DROP TABLE dense_rank_t1;
    
  • PERCENT_RANK()

    描述:PERCENT_RANK函数为各组内对应值生成相对序号,即根据公式 (rank - 1) / (totalrows - 1)计算所得的值。其中rank为该值依据RANK函数所生成的对应序号,totalrows为该分组内的总元素个数。

    返回值类型:DOUBLE PRECISION

    示例:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    openGauss=# CREATE TABLE percent_rank_t1(a int, b int);
    
    openGauss=# INSERT INTO percent_rank_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6);
    
    openGauss=# SELECT a,b,PERCENT_RANK() OVER(PARTITION BY a ORDER BY b) FROM percent_rank_t1;
     a | b |   percent_rank   
    ---+---+------------------
     1 | 1 |                0
     1 | 1 |                0
     1 | 2 | .666666666666667
     1 | 3 |                1
     2 | 4 |                0
     2 | 5 |                1
     3 | 6 |                0
    (7 rows)
    
    openGauss=# DROP TABLE percent_rank_t1;
    
  • CUME_DIST()

    描述:CUME_DIST函数为各组内对应值生成累积分布序号。即根据公式(小于等于当前值的数据行数)/(该分组总行数totalrows)计算所得的相对序号。

    返回值类型:DOUBLE PRECISION

    示例:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    openGauss=# CREATE TABLE cume_dist_t1(a int, b int);
    
    openGauss=# INSERT INTO cume_dist_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6);
    
    openGauss=# SELECT a,b,CUME_DIST() OVER(PARTITION BY a ORDER BY b) FROM cume_dist_t1;
     a | b | cume_dist 
    ---+---+-----------
     1 | 1 |        .5
     1 | 1 |        .5
     1 | 2 |       .75
     1 | 3 |         1
     2 | 4 |        .5
     2 | 5 |         1
     3 | 6 |         1
    (7 rows)
    
    openGauss=# DROP TABLE cume_dist_t1;
    
  • NTILE(num_buckets integer)

    描述:NTILE函数根据num_buckets integer将有序的数据集合平均分配到num_buckets所指定数量的桶中,并将桶号分配给每一行。分配时应尽量做到平均分配。

    返回值类型:INTEGER

    示例:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    openGauss=# CREATE TABLE ntile_t1(a int, b int);
    
    openGauss=# INSERT INTO ntile_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6);
    
    openGauss=# SELECT a,b,NTILE(2) OVER(PARTITION BY a ORDER BY b) FROM ntile_t1;
     a | b | ntile 
    ---+---+-------
     1 | 1 |     1
     1 | 1 |     1
     1 | 2 |     2
     1 | 3 |     2
     2 | 4 |     1
     2 | 5 |     2
     3 | 6 |     1
    (7 rows)
    
    openGauss=# DROP TABLE ntile_t1;
    
  • LAG(value any [, offset integer [, default any ]])

    描述:LAG函数为各组内对应值生成滞后值。即当前值对应的行数往前偏移offset位后所得行的value值作为序号。若经过偏移后行数不存在,则对应结果取为default值。若无指定,在默认情况下,offset取为1,default值取为NULL。default值的类型需要与value值的类型保持一致。

    返回值类型:与参数数据类型相同

    示例:

    -- 建表并插入数据
    openGauss=# CREATE TABLE ta1 (hire_date date, last_name varchar(20), department_id int);
    CREATE TABLE
    openGauss=# INSERT INTO ta1 values('07-DEC-02', 'Raphaely', 30);
    INSERT 0 1
    openGauss=# INSERT INTO ta1 values('24-JUL-05', 'Tobias',  30);
    INSERT 0 1
    openGauss=# INSERT INTO ta1 values('24-DEC-05', 'Baida',  30);
    INSERT 0 1
    openGauss=# INSERT INTO ta1 values('18-MAY-03', 'Khoo', 30);
    INSERT 0 1
    openGauss=# INSERT INTO ta1 values('15-NOV-06', 'Himuro',  30);
    INSERT 0 1
    openGauss=# INSERT INTO ta1 values('10-AUG-07', 'Colmenares',  30);
    INSERT 0 1
    openGauss=# INSERT INTO ta1 values('10-MAY-07', 'yq',  11);
    INSERT 0 1
    openGauss=# INSERT INTO ta1 values('10-MAY-08', 'zi',  11);
    INSERT 0 1
    openGauss=# INSERT INTO ta1 values('', 'yq1',  30);
    INSERT 0 1
    openGauss=# INSERT INTO ta1 values(null, 'yq2',  30);
    INSERT 0 1
    openGauss=# INSERT INTO ta1 values('10-DEC-07', 'yq3',  30);
    INSERT 0 1
    openGauss=# INSERT INTO ta1 values(null, null,  11);
    INSERT 0 1
    openGauss=# INSERT INTO ta1 values(null, null,  11);
    INSERT 0 1
    
    -- 调用LAG,指定offset=3, default=null
    openGauss=# SELECT hire_date, last_name, department_id, lag(hire_date, 3, null)  OVER (PARTITION BY department_id ORDER BY last_name) AS "NextHired" FROM ta1 ORDER BY department_id;
          hire_date      | last_name  | department_id |      NextHired
    ---------------------+------------+---------------+---------------------
     2007-05-10 00:00:00 | yq         |            11 |
     2008-05-10 00:00:00 | zi         |            11 |
                         |            |            11 |
                         |            |            11 | 2007-05-10 00:00:00
     2005-12-24 00:00:00 | Baida      |            30 |
     2007-08-10 00:00:00 | Colmenares |            30 |
     2006-11-15 00:00:00 | Himuro     |            30 |
     2003-05-18 00:00:00 | Khoo       |            30 | 2005-12-24 00:00:00
     2002-12-07 00:00:00 | Raphaely   |            30 | 2007-08-10 00:00:00
     2005-07-24 00:00:00 | Tobias     |            30 | 2006-11-15 00:00:00
                         | yq1        |            30 | 2003-05-18 00:00:00
                         | yq2        |            30 | 2002-12-07 00:00:00
     2007-12-10 00:00:00 | yq3        |            30 | 2005-07-24 00:00:00
    (13 rows)
    
    openGauss=# DROP TABLE ta1;
  • LEAD(value any [, offset integer [, default any ]])

    描述:LEAD函数为各组内对应值生成提前值。即当前值对应的行数向后偏移offset位后所得行的value值作为序号。若经过向后偏移后行数超过当前组内的总行数,则对应结果取为default值。若无指定,在默认情况下,offset取为1,default值取为NULL。default值的类型需要与value值的类型保持一致。

    返回值类型:与参数数据类型相同。

    示例:

    openGauss=# CREATE TABLE ta1 (hire_date date, last_name varchar(20), department_id int);
    CREATE TABLE
    openGauss=# INSERT INTO ta1 values('07-DEC-02', 'Raphaely', 30);
    INSERT 0 1
    openGauss=# INSERT INTO ta1 values('24-JUL-05', 'Tobias',  30);
    INSERT 0 1
    openGauss=# INSERT INTO ta1 values('24-DEC-05', 'Baida',  30);
    INSERT 0 1
    openGauss=# INSERT INTO ta1 values('18-MAY-03', 'Khoo', 30);
    INSERT 0 1
    openGauss=# INSERT INTO ta1 values('15-NOV-06', 'Himuro',  30);
    INSERT 0 1
    openGauss=# INSERT INTO ta1 values('10-AUG-07', 'Colmenares',  30);
    INSERT 0 1
    openGauss=# INSERT INTO ta1 values('10-MAY-07', 'yq',  11);
    INSERT 0 1
    openGauss=# INSERT INTO ta1 values('10-MAY-08', 'zi',  11);
    INSERT 0 1
    openGauss=# INSERT INTO ta1 values('', 'yq1',  30);
    INSERT 0 1
    openGauss=# INSERT INTO ta1 values(null, 'yq2',  30);
    INSERT 0 1
    openGauss=# INSERT INTO ta1 values('10-DEC-07', 'yq3',  30);
    INSERT 0 1
    openGauss=# INSERT INTO ta1 values(null, null,  11);
    INSERT 0 1
    openGauss=# INSERT INTO ta1 values(null, null,  11);
    INSERT 0 1
    
    -- 调用LEAD,指定offset=2
    openGauss=# SELECT hire_date, last_name, department_id, lead(hire_date, 2) OVER (PARTITION BY department_id ORDER BY last_name) AS "NextHired" FROM ta1 ORDER BY department_id;
          hire_date      | last_name  | department_id |      NextHired
    ---------------------+------------+---------------+---------------------
     2007-05-10 00:00:00 | yq         |            11 |
     2008-05-10 00:00:00 | zi         |            11 |
                         |            |            11 |
                         |            |            11 |
     2005-12-24 00:00:00 | Baida      |            30 | 2006-11-15 00:00:00
     2007-08-10 00:00:00 | Colmenares |            30 | 2003-05-18 00:00:00
     2006-11-15 00:00:00 | Himuro     |            30 | 2002-12-07 00:00:00
     2003-05-18 00:00:00 | Khoo       |            30 | 2005-07-24 00:00:00
     2002-12-07 00:00:00 | Raphaely   |            30 |
     2005-07-24 00:00:00 | Tobias     |            30 |
                         | yq1        |            30 | 2007-12-10 00:00:00
                         | yq2        |            30 |
     2007-12-10 00:00:00 | yq3        |            30 |
    (13 rows)
    
    openGauss=# DROP TABLE ta1;
  • FIRST_VALUE(value any)

    描述:FIRST_VALUE函数取各组内的第一个值作为返回结果。

    返回值类型:与参数数据类型相同。

    示例:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    openGauss=# CREATE TABLE first_value_t1(a int, b int);
    
    openGauss=# INSERT INTO first_value_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6);
    
    openGauss=# SELECT a,b,FIRST_VALUE(b) OVER(PARTITION BY a ORDER BY b) FROM first_value_t1;
     a | b | first_value 
    ---+---+-------------
     1 | 1 |           1
     1 | 1 |           1
     1 | 2 |           1
     1 | 3 |           1
     2 | 4 |           4
     2 | 5 |           4
     3 | 6 |           6
    (7 rows)
    
    openGauss=# DROP TABLE first_value_t1;
    
  • LAST_VALUE(value any)

    描述:LAST_VALUE函数取各组内的最后一个值作为返回结果。

    返回值类型:与参数数据类型相同。

    示例:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    openGauss=# CREATE TABLE last_value_t1(a int, b int);
    
    openGauss=# INSERT INTO last_value_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6);
    
    openGauss=# SELECT a,b,LAST_VALUE(b) OVER(PARTITION BY a ORDER BY b) FROM last_value_t1;
     a | b | last_value 
    ---+---+------------
     1 | 1 |          1
     1 | 1 |          1
     1 | 2 |          2
     1 | 3 |          3
     2 | 4 |          4
     2 | 5 |          5
     3 | 6 |          6
    (7 rows)
    
    openGauss=# DROP TABLE last_value_t1;
    
  • DELTA

    描述:返回当前行和前一行的差值。

    参数:numeric

    返回值类型:numeric

  • NTH_VALUE(value any, nth integer)

    描述:NTH_VALUE函数返回该组内的第nth行作为结果。若该行不存在,则默认返回NULL。

    返回值类型:与参数数据类型相同。

    示例:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    openGauss=# CREATE TABLE nth_value_t1(a int, b int);
    
    openGauss=# INSERT INTO nth_value_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6);
    
    openGauss=# SELECT a,b,NTH_VALUE(b, 2) OVER(PARTITION BY a order by b) FROM nth_value_t1;
     a | b | nth_value 
    ---+---+-----------
     1 | 1 |         1
     1 | 1 |         1
     1 | 2 |         1
     1 | 3 |         1
     2 | 4 |          
     2 | 5 |         5
     3 | 6 |          
    (7 rows)
    
    openGauss=# DROP TABLE nth_value_t1;
    
support.huaweicloud.com/centralized-devg-v2-gaussdb/gaussdb_42_0303.html