MapReduce Service (MRS) - Flink Performance Tuning Suggestions: Optimizing the State Backend with Table-Level JTL

Date: 2024-06-17 09:21:36

Optimizing the State Backend with Table-Level JTL

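This section applies to MRS 3.3.0 and later versions.

In a Flink two-stream inner join, use this feature when the business logic allows a record to be removed from the state backend once it has been joined a single time.

This feature applies only to stream-stream inner joins.

A hint lets you set the join count separately for the left table and the right table: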

  • Hint format:
    table_path /*+ OPTIONS(key=val [, key=val]*) */  
    
    key:
         stringLiteral 
    val:
         stringLiteral
  • Example configuration in an SQL statement:
    CREATE TABLE user_info (`user_id` VARCHAR, `user_name` VARCHAR) WITH (
      'connector' = 'kafka',
      'topic' = 'user_info_001',
      'properties.bootstrap.servers' = '192.168.64.138:21005',
      'properties.group.id' = 'testGroup',
      'scan.startup.mode' = 'latest-offset',
      'value.format' = 'csv'
    );
    CREATE TABLE print (
      `user_id` VARCHAR,
      `user_name` VARCHAR,
      `score` INT
    ) WITH ('connector' = 'print');
    CREATE TABLE user_score (user_id VARCHAR, score INT) WITH (
      'connector' = 'kafka',
      'topic' = 'user_score_001',
      'properties.bootstrap.servers' = '192.168.64.138:21005',
      'properties.group.id' = 'testGroup',
      'scan.startup.mode' = 'latest-offset',
      'value.format' = 'csv'
    );
    INSERT INTO
      print
    SELECT
      t.user_id,
      t.user_name,
      d.score
    FROM
      user_info as t
      JOIN 
      -- Set the JTL join count separately for the left table and the right table
      /*+ OPTIONS('eliminate-state.left.threshold'='1','eliminate-state.right.threshold'='1') */
      user_score as d ON t.user_id = d.user_id;
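
The effect of the two thresholds in the hint above can be pictured with a small Python model. This is only an illustrative sketch, not MRS or Flink code: the class `ThresholdJoin`, its methods, and the exact eviction rule are assumptions made for the example. The model assumes that an arriving row joins every row stored for the same key on the other side, and that a row is dropped from state once it has been matched as many times as the threshold of its side.

```python
from collections import defaultdict


class ThresholdJoin:
    """Toy model (hypothetical) of an inner join whose state rows
    are dropped after a fixed number of matches per side."""

    def __init__(self, left_threshold, right_threshold):
        self.threshold = {"left": left_threshold, "right": right_threshold}
        # state[side][key] -> list of (row, times_joined_so_far)
        self.state = {"left": defaultdict(list), "right": defaultdict(list)}

    def process(self, side, key, row):
        """Feed one row from `side`; return the joined (left, right) pairs."""
        other = "right" if side == "left" else "left"
        results, survivors, matches = [], [], 0
        # Take the other side's rows for this key out of state and join them.
        for stored_row, count in self.state[other].pop(key, []):
            pair = (row, stored_row) if side == "left" else (stored_row, row)
            results.append(pair)
            matches += 1
            # Keep the stored row only while it is below its side's threshold.
            if count + 1 < self.threshold[other]:
                survivors.append((stored_row, count + 1))
        if survivors:
            self.state[other][key] = survivors
        # Store the arriving row only if it has not reached its own threshold.
        if matches < self.threshold[side]:
            self.state[side][key].append((row, matches))
        return results

    def state_size(self):
        """Total number of rows currently held in state on both sides."""
        return sum(len(rows) for side in self.state.values() for rows in side.values())


if __name__ == "__main__":
    join = ThresholdJoin(left_threshold=1, right_threshold=1)
    print(join.process("left", "u1", "alice"))   # []
    print(join.process("right", "u1", 90))       # [('alice', 90)]
    print(join.state_size())                     # 0
```

In this model, with both thresholds set to 1, the pair is emitted once and neither row remains in state afterwards, which is where the state saving comes from. The trade-off is that a later `user_score` row for the same `user_id` finds nothing to join with until a new `user_info` row arrives, so the setting only fits workloads where a single join per row is acceptable.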
support.huaweicloud.com/devg-rule-mrs/mrs_07_450173.html