MapReduce Service (MRS) - Hive HCatalog Application Development Quick Start: Compiling and Running the Program
Compiling and Running the Program
- Compile the HCatalog sample program:
- In the IDEA Maven tool window, select the clean lifecycle phase and run the Maven build.
- Select the package lifecycle phase and run the Maven build.
Figure 2 Packaging the sample program
When "BUILD SUCCESS" is displayed, the compilation succeeded.
After a successful compilation, the JAR package "hcatalog-example-XXX.jar" is generated in the "target" directory of the sample project.
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 03:30 min
[INFO] Finished at: 2023-05-17T20:22:44+08:00
[INFO] ------------------------------------------------------------------------
- Log in to the Hive Beeline CLI and create the source table and the result table to be analyzed by the HCatalog program.
source /opt/client/bigdata_env
kinit hiveuser
beeline
create table t1(col1 int);
create table t2(col1 int,col2 int);
Insert test data into the source table t1:
insert into table t1 select 1 union all select 1 union all select 2 union all select 2 union all select 3;
select * from t1;
+----------+
| t1.col1  |
+----------+
| 1        |
| 1        |
| 2        |
| 2        |
| 3        |
+----------+
- Upload the exported JAR package to a specified path on the Linux node where the cluster client is installed, for example "/opt/hive_demo".
- To simplify subsequent operations, configure the sample program directory, client component directories, and related paths as environment variables.
Exit the Beeline CLI and run the following commands:
export HCAT_CLIENT=/opt/hive_demo
export HADOOP_HOME=/opt/client/HDFS/hadoop
export HIVE_HOME=/opt/client/Hive/Beeline
export HCAT_HOME=$HIVE_HOME/../HCatalog
export LIB_JARS=$HCAT_HOME/lib/hive-hcatalog-core-XXX.jar,$HCAT_HOME/lib/hive-metastore-XXX.jar,$HCAT_HOME/lib/hive-standalone-metastore-XXX.jar,$HIVE_HOME/lib/hive-exec-XXX.jar,$HCAT_HOME/lib/libfb303-XXX.jar,$HCAT_HOME/lib/slf4j-api-XXX.jar,$HCAT_HOME/lib/jdo-api-XXX.jar,$HCAT_HOME/lib/antlr-runtime-XXX.jar,$HCAT_HOME/lib/datanucleus-api-jdo-XXX.jar,$HCAT_HOME/lib/datanucleus-core-XXX.jar,$HCAT_HOME/lib/datanucleus-rdbms-fi-XXX.jar,$HCAT_HOME/lib/log4j-api-XXX.jar,$HCAT_HOME/lib/log4j-core-XXX.jar,$HIVE_HOME/lib/commons-lang-XXX.jar,$HIVE_HOME/lib/hive-exec-XXX.jar
export HADOOP_CLASSPATH=$HCAT_HOME/lib/hive-hcatalog-core-XXX.jar:$HCAT_HOME/lib/hive-metastore-XXX.jar:$HCAT_HOME/lib/hive-standalone-metastore-XXX.jar:$HIVE_HOME/lib/hive-exec-XXX.jar:$HCAT_HOME/lib/libfb303-XXX.jar:$HADOOP_HOME/etc/hadoop:$HCAT_HOME/conf:$HCAT_HOME/lib/slf4j-api-XXX.jar:$HCAT_HOME/lib/jdo-api-XXX.jar:$HCAT_HOME/lib/antlr-runtime-XXX.jar:$HCAT_HOME/lib/datanucleus-api-jdo-XXX.jar:$HCAT_HOME/lib/datanucleus-core-XXX.jar:$HCAT_HOME/lib/datanucleus-rdbms-fi-XXX.jar:$HCAT_HOME/lib/log4j-api-XXX.jar:$HCAT_HOME/lib/log4j-core-XXX.jar:$HIVE_HOME/lib/commons-lang-XXX.jar:$HIVE_HOME/lib/hive-exec-XXX.jar
Replace the version number "XXX" in the JAR package names specified in LIB_JARS and HADOOP_CLASSPATH with the actual versions in your environment.
- Submit the job using the Yarn client.
yarn --config $HADOOP_HOME/etc/hadoop jar $HCAT_CLIENT/hcatalog-example-XXX.jar com.huawei.bigdata.HCatalogExample -libjars $LIB_JARS t1 t2
...
2023-05-18 20:05:56,691 INFO mapreduce.Job: The url to track the job: https://host-192-168-64-122:26001/proxy/application_1683438782910_0008/
2023-05-18 20:05:56,692 INFO mapreduce.Job: Running job: job_1683438782910_0008
2023-05-18 20:06:07,250 INFO mapreduce.Job: Job job_1683438782910_0008 running in uber mode : false
2023-05-18 20:06:07,253 INFO mapreduce.Job:  map 0% reduce 0%
2023-05-18 20:06:15,362 INFO mapreduce.Job:  map 25% reduce 0%
2023-05-18 20:06:16,386 INFO mapreduce.Job:  map 50% reduce 0%
2023-05-18 20:06:35,999 INFO mapreduce.Job:  map 100% reduce 0%
2023-05-18 20:06:42,048 INFO mapreduce.Job:  map 100% reduce 100%
2023-05-18 20:06:43,136 INFO mapreduce.Job: Job job_1683438782910_0008 completed successfully
2023-05-18 20:06:44,118 INFO mapreduce.Job: Counters: 54
...
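The source of the shipped sample is not reproduced in this walkthrough, but a job like `com.huawei.bigdata.HCatalogExample` is typically wired up with the standard Apache HCatalog MapReduce API. The sketch below is a hypothetical illustration of that pattern, not the actual sample code: the class name `HCatalogExampleDriver`, the `default` database, and the inner `Map`/`Reduce` classes are all assumptions. It reads rows from the first table argument, counts occurrences of `col1`, and writes `(col1, count)` records into the second table; compiling and running it requires the cluster classpath configured above.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.hive.hcatalog.data.DefaultHCatRecord;
import org.apache.hive.hcatalog.data.HCatRecord;
import org.apache.hive.hcatalog.data.schema.HCatSchema;
import org.apache.hive.hcatalog.mapreduce.HCatInputFormat;
import org.apache.hive.hcatalog.mapreduce.HCatOutputFormat;
import org.apache.hive.hcatalog.mapreduce.OutputJobInfo;

// Hypothetical driver sketch; the shipped HCatalogExample may differ.
public class HCatalogExampleDriver extends Configured implements Tool {

    // Emits (col1, 1) for every input row read through HCatalog.
    public static class Map
            extends Mapper<WritableComparable, HCatRecord, IntWritable, IntWritable> {
        @Override
        protected void map(WritableComparable key, HCatRecord value, Context ctx)
                throws java.io.IOException, InterruptedException {
            int col1 = (Integer) value.get(0);
            ctx.write(new IntWritable(col1), new IntWritable(1));
        }
    }

    // Sums the counts per key and writes an HCatalog record (col1, count).
    public static class Reduce
            extends Reducer<IntWritable, IntWritable, WritableComparable, HCatRecord> {
        @Override
        protected void reduce(IntWritable key, Iterable<IntWritable> values, Context ctx)
                throws java.io.IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            HCatRecord record = new DefaultHCatRecord(2);
            record.set(0, key.get());
            record.set(1, sum);
            ctx.write(null, record);
        }
    }

    @Override
    public int run(String[] args) throws Exception {
        String inputTable = args[0];   // e.g. "t1"
        String outputTable = args[1];  // e.g. "t2"
        Job job = Job.getInstance(getConf(), "GroupByCount");
        // Read rows of the input table through HCatalog instead of raw HDFS files.
        HCatInputFormat.setInput(job, "default", inputTable);
        job.setInputFormatClass(HCatInputFormat.class);
        job.setJarByClass(HCatalogExampleDriver.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setMapOutputKeyClass(IntWritable.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(WritableComparable.class);
        job.setOutputValueClass(DefaultHCatRecord.class);
        // Write (col1, count) records into the output table through HCatalog,
        // reusing the table's own schema from the metastore.
        HCatOutputFormat.setOutput(job, OutputJobInfo.create("default", outputTable, null));
        HCatSchema schema = HCatOutputFormat.getTableSchema(job.getConfiguration());
        HCatOutputFormat.setSchema(job, schema);
        job.setOutputFormatClass(HCatOutputFormat.class);
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new HCatalogExampleDriver(), args));
    }
}
```

Because the driver implements `Tool` and is launched through `ToolRunner`, generic options such as `-libjars $LIB_JARS` in the `yarn jar` command above are parsed automatically and the listed JARs are shipped to the tasks.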
- After the job completes, return to the Hive Beeline CLI and query table t2 to view the analysis result.
select * from t2;
+----------+----------+
| t2.col1  | t2.col2  |
+----------+----------+
| 1        | 2        |
| 2        | 2        |
| 3        | 1        |
+----------+----------+
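Comparing this result with the contents of t1 shows what the sample job computes: the number of occurrences of each distinct `col1` value, i.e. the equivalent of `select col1, count(*) from t1 group by col1`. Stripped of the MapReduce plumbing, the core aggregation can be sketched in plain Java (`GroupByCount` is a hypothetical illustration, not the shipped sample code):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Minimal sketch of the group-by-count the sample job performs:
// the mappers emit (col1, 1) pairs and the reducers sum them,
// which collapses to counting occurrences per distinct value.
public class GroupByCount {
    static Map<Integer, Integer> countValues(List<Integer> col1Values) {
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        for (int v : col1Values) {
            counts.merge(v, 1, Integer::sum);  // increment the count for v
        }
        return counts;
    }
}
```

Applied to t1's data (1, 1, 2, 2, 3), this yields the pairs (1, 2), (2, 2), and (3, 1) seen in t2 above.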