MapReduce Service (MRS) - Configuring Hudi Partition Concurrency Control: Using the Partition Concurrency Mechanism

Time: 2024-07-02 16:40:06

Using the Partition Concurrency Mechanism

To enable partition-level concurrent writes, set the parameter hoodie.support.partition.lock=true.

Examples:

Enabling partition-level concurrent writes through the Spark DataSource API:

upsert_data.write.format("hudi").
option("hoodie.datasource.write.table.type", "COPY_ON_WRITE").
option("hoodie.datasource.write.precombine.field", "col2").
option("hoodie.datasource.write.recordkey.field", "primary_key").
option("hoodie.datasource.write.partitionpath.field", "col0").
option("hoodie.upsert.shuffle.parallelism", 4).
option("hoodie.datasource.write.hive_style_partitioning", "true").
option("hoodie.support.partition.lock", "true").
option("hoodie.table.name", "tb_test_cow").
mode("Append").save(s"/tmp/huditest/tb_test_cow")
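
When several jobs write to the same table, it can help to keep the write options in one place so that every writer enables the partition lock consistently. The following is a minimal sketch of that idea, not part of the MRS documentation: the object name HudiPartitionLockOptions and the method writeOptions are hypothetical, and the option values are copied from the example above. It assumes the standard DataFrameWriter.options(Map) method in Spark.

```scala
// Hypothetical helper: the object and method names are illustrative only
// and are not part of Hudi or MRS. Option values come from the example above.
object HudiPartitionLockOptions {

  // Builds the Hudi write options with partition-level concurrent writes enabled.
  def writeOptions(tableName: String): Map[String, String] = Map(
    "hoodie.datasource.write.table.type" -> "COPY_ON_WRITE",
    "hoodie.datasource.write.precombine.field" -> "col2",
    "hoodie.datasource.write.recordkey.field" -> "primary_key",
    "hoodie.datasource.write.partitionpath.field" -> "col0",
    "hoodie.upsert.shuffle.parallelism" -> "4",
    "hoodie.datasource.write.hive_style_partitioning" -> "true",
    // The switch described in this section.
    "hoodie.support.partition.lock" -> "true",
    "hoodie.table.name" -> tableName
  )
}

// Usage, assuming upsert_data is an existing Spark DataFrame:
// upsert_data.write.format("hudi")
//   .options(HudiPartitionLockOptions.writeOptions("tb_test_cow"))
//   .mode("Append")
//   .save("/tmp/huditest/tb_test_cow")
```

Passing the options as a single map keeps the lock setting from being dropped by accident in one of the writers; the behavior is otherwise the same as chaining the individual option calls.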

Enabling partition-level concurrent writes in spark-sql:

set hoodie.support.partition.lock=true;
insert into hudi_table1 select 1,1,1;
support.huaweicloud.com/cmpntguide-lts-mrs/mrs_01_248924.html