MAPREDUCE服务 MRS-FlinkSQL OVER窗口支持超期退窗:使用方法

时间:2024-11-28 01:44:41

使用方法

配置Flink作业时,可通过在FlinkServer Web UI的作业开发界面添加自定义参数“over.window.interval”,且值配置为大于或等于“0”时开启窗口支持数据超期功能,创建作业可参考如何创建FlinkServer作业。该设置会对作业中的所有over窗口生效,建议对单over窗口的作业使用此功能。

  • SQL示例:
    CREATE TABLE OverSource (
      `rowtime` TIMESTAMP_LTZ(3),
      `groupId` INT,
      `value` INT,
      `name` STRING,
      `additional_field` STRING,
      `proctime` as PROCTIME(),
      WATERMARK FOR rowtime AS rowtime
    ) WITH (
     'connector' = 'kafka',
     'topic' = 'test_source',
     'properties.bootstrap.servers' = 'Kafka的Broker实例业务IP:Kafka端口号',  
     'properties.group.id' = 'testGroup', 
     'scan.startup.mode' = 'latest-offset', 
     'format' = 'csv',  
     'properties.sasl.kerberos.service.name' = 'kafka',   --FlinkServer所在集群为非安全模式去掉此参数
     'properties.security.protocol' = 'SASL_PLAINTEXT',  --FlinkServer所在集群为非安全模式去掉此参数
     'properties.kerberos.domain.name' = 'hadoop.系统 域名 ' --FlinkServer所在集群为非安全模式去掉此参数
    );
    
    CREATE TABLE LD_SINK(
     `name` STRING, `groupId` INT, `rowtime` TIMESTAMP_LTZ(3),`count_zw` BIGINT,`sum_zw` BIGINT
    ) WITH ( 
     'connector' = 'print'
    );
    
    SELECT
      `name`,
      `groupId`,
      COUNT(`value`) OVER (
        PARTITION BY groupId
        ORDER BY
          proctime RANGE BETWEEN INTERVAL '10' second PRECEDING
          AND CURRENT ROW
      ) as count_zw,
      SUM(`value`) OVER (
        PARTITION BY groupId
        ORDER BY
          proctime RANGE BETWEEN INTERVAL '10' second PRECEDING
          AND CURRENT ROW
      ) as sum_zw
    FROM
      OverSource
support.huaweicloud.com/cmpntguide-lts-mrs/mrs_01_248951.html