Data Lake Insight (DLI) - Window: OVER WINDOW

Time: 2024-11-16 13:21:40

OVER WINDOW

An Over Window differs from a Group Window in that an Over Window outputs one record for every input row.
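
The difference in output cardinality can be illustrated with a minimal Python sketch. It is not DLI or Flink code: the sample rows and the helper names `group_sum` and `over_sum` are hypothetical, and the whole input is treated as a single group window for simplicity.

```python
from collections import defaultdict

# Hypothetical sample rows (name, amount), listed in arrival order.
orders = [("a", 10), ("b", 5), ("a", 20), ("a", 7), ("b", 1)]

def group_sum(rows):
    """Group-style aggregation: one output record per key."""
    totals = defaultdict(int)
    for name, amount in rows:
        totals[name] += amount
    return sorted(totals.items())

def over_sum(rows):
    """Over-style aggregation: one output record per input row,
    carrying the running sum of that row's key so far."""
    totals = defaultdict(int)
    out = []
    for name, amount in rows:
        totals[name] += amount
        out.append((name, amount, totals[name]))
    return out

group_result = group_sum(orders)
over_result = over_sum(orders)
print(len(orders), len(group_result), len(over_result))
```

In this sketch the grouped result collapses five input rows into two records, while the over-style result keeps all five rows and attaches an aggregate to each.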

Syntax

OVER (
  [PARTITION BY partition_name]
  ORDER BY proctime|rowtime(ROWS number PRECEDING) |(RANGE (BETWEEN INTERVAL '1' SECOND PRECEDING AND CURRENT ROW | UNBOUNDED preceding))
)

Syntax Description

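Table 3 Parameters

Parameter       Description
PARTITION BY    Specifies the key used for grouping. Each group is computed independently.
ORDER BY        Specifies whether processing time or event time is used as the timestamp of the data.
ROWS            Count window.
RANGE           Time window.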
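
To make the difference between a count window (ROWS) and a time window (RANGE) concrete, here is a small Python sketch of the two frame types for a single partition. It is an illustration only: the function names and the sample events are hypothetical, timestamps are assumed to be distinct and already sorted, and the lower bound of the time frame is assumed to be inclusive.

```python
def rows_window(events, n_preceding):
    """Count window: sum the current row and up to n_preceding earlier rows."""
    out = []
    for i in range(len(events)):
        start = max(0, i - n_preceding)
        out.append(sum(amount for _, amount in events[start:i + 1]))
    return out

def range_window(events, seconds):
    """Time window: sum rows whose timestamp lies in [t - seconds, t],
    where t is the timestamp of the current row."""
    out = []
    for i, (t, _) in enumerate(events):
        out.append(sum(amount for ts, amount in events[:i + 1] if ts >= t - seconds))
    return out

# Hypothetical events for one partition: (timestamp in seconds, amount).
events = [(0, 1), (1, 2), (10, 4), (11, 8)]

rows_sums = rows_window(events, 1)    # frame: 1 preceding row + current row
range_sums = range_window(events, 1)  # frame: last 1 second up to current row
print(rows_sums)   # [1, 3, 6, 12]
print(range_sums)  # [1, 3, 4, 12]
```

The third event shows the distinction: the count window still includes the previous row (2 + 4 = 6), while the time window excludes it because that row is 9 seconds older than the current one.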

Precautions

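  • All aggregate functions in the same SELECT statement must use the same window definition.
  • Over windows currently support computation over preceding rows only (PRECEDING). FOLLOWING is not supported.
  • ORDER BY must be specified, on either processing time or event time.
  • Aggregating a constant, for example sum(2), is not supported.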

Example

-- Compute the count and sum from the start of the rule up to now (in proctime)
insert into temp SELECT name,
    count(amount) OVER (PARTITION BY name ORDER BY proctime RANGE UNBOUNDED preceding) as cnt1,
    sum(amount) OVER (PARTITION BY name ORDER BY proctime RANGE UNBOUNDED preceding) as cnt2
    FROM Orders;

-- Compute the count and sum over the current row and the 4 preceding rows (in proctime)
insert into temp SELECT name,
    count(amount) OVER (PARTITION BY name ORDER BY proctime ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) as cnt1,
    sum(amount) OVER (PARTITION BY name ORDER BY proctime ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) as cnt2
    FROM Orders;

-- Compute the count and sum over the last 60 seconds (in event time); the event time is the timeattr field of Orders
insert into temp SELECT name,
    count(amount) OVER (PARTITION BY name ORDER BY timeattr RANGE BETWEEN INTERVAL '60' SECOND PRECEDING AND CURRENT ROW) as cnt1,
    sum(amount) OVER (PARTITION BY name ORDER BY timeattr RANGE BETWEEN INTERVAL '60' SECOND PRECEDING AND CURRENT ROW) as cnt2
    FROM Orders;
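
The ROWS frame in the second statement can be tried locally with Python's built-in sqlite3 module, as a rough stand-in for a streaming engine. This is a sketch under assumptions: SQLite is not Flink, the `seq` column is a hypothetical substitute for proctime, the sample data is invented, and the bundled SQLite must be version 3.25 or later for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# seq is a hypothetical ordering column standing in for proctime.
conn.execute("CREATE TABLE Orders (seq INTEGER, name TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [(i, "a", 10) for i in range(1, 8)],  # seven rows for one name
)

rows = conn.execute(
    """
    SELECT seq,
           count(amount) OVER (PARTITION BY name ORDER BY seq
                               ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) AS cnt1,
           sum(amount)   OVER (PARTITION BY name ORDER BY seq
                               ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) AS cnt2
    FROM Orders
    ORDER BY seq
    """
).fetchall()
conn.close()

counts = [cnt for _, cnt, _ in rows]
totals = [total for _, _, total in rows]
print(counts)  # [1, 2, 3, 4, 5, 5, 5]
print(totals)  # [10, 20, 30, 40, 50, 50, 50]
```

The count levels off at 5 rather than 4, because the frame `ROWS BETWEEN 4 PRECEDING AND CURRENT ROW` contains the current row in addition to the four rows before it.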

support.huaweicloud.com/sqlref-flink-dli/dli_08_0218.html