MapReduce Service MRS - Spark on Yarn: FileNotFoundException when submitting a task with spark-submit in client mode: Answer

Time: 2024-06-19 16:04:40

Answer

Cause analysis:

When a task runs in yarn-client mode, the Spark Driver runs on the local host. Its logging is configured through -Dlog4j.configuration=./log4j-executor.properties, and log4j-executor.properties directs log output to the file ${spark.yarn.app.container.log.dir}/stdout. Because the Driver runs locally, ${spark.yarn.app.container.log.dir} is not set and resolves to an empty string, so the output path degenerates to /stdout. A non-root user has no permission to create or modify files under the root directory, and the Driver therefore throws a FileNotFoundException.

In yarn-cluster mode, by contrast, the Spark Driver runs inside the ApplicationMaster, and the ApplicationMaster sets spark.yarn.app.container.log.dir through a -D option when it starts. The output directory is then valid, so tasks run in yarn-cluster mode do not hit the FileNotFoundException.
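For illustration, the appender definition in log4j-executor.properties looks roughly like the hypothetical excerpt below (the appender name and layout are illustrative, not the shipped file's exact content). log4j 1.x resolves ${...} placeholders against JVM system properties and substitutes an empty string for an unset property, which is how the File path collapses to /stdout when the Driver runs locally:

# Hypothetical excerpt of log4j-executor.properties
log4j.rootCategory=INFO, sparklog
log4j.appender.sparklog=org.apache.log4j.RollingFileAppender
# With spark.yarn.app.container.log.dir unset, this resolves to "/stdout"
log4j.appender.sparklog.File=${spark.yarn.app.container.log.dir}/stdout
log4j.appender.sparklog.layout=org.apache.log4j.PatternLayout
log4j.appender.sparklog.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n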

Solution:

Note: $SPARK_HOME below defaults to /opt/client/Spark/spark.

Solution 1: switch the log configuration file manually. In $SPARK_HOME/conf/spark-defaults.conf, edit the -Dlog4j.configuration entry of the spark.driver.extraJavaOptions configuration item (it is -Dlog4j.configuration=./log4j-executor.properties by default): for yarn-client mode change it to -Dlog4j.configuration=./log4j.properties; for yarn-cluster mode change it back to -Dlog4j.configuration=./log4j-executor.properties.
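For reference, the edited line in spark-defaults.conf would read roughly as follows in each mode (a simplified sketch; keep any other -D options already present in spark.driver.extraJavaOptions unchanged):

# yarn-client mode: the Driver runs locally, so use the local log4j config
spark.driver.extraJavaOptions -Dlog4j.configuration=./log4j.properties
# yarn-cluster mode: the Driver runs inside the ApplicationMaster container
spark.driver.extraJavaOptions -Dlog4j.configuration=./log4j-executor.properties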

Solution 2: modify the launcher script $SPARK_HOME/bin/spark-class. Add the following lines directly below the #!/usr/bin/env bash line of spark-class.

# Judge mode: client or cluster; default: client
argv=`echo "$@" | tr '[:upper:]' '[:lower:]'`
if [[ "$argv" =~ "--master" ]]; then
    # Keep only the arguments after "--master"
    mode=`echo "$argv" | sed -e 's/.*--master //'`
    master=`echo "$mode" | awk '{print $1}'`
    case $master in
    "yarn")
        if [[ "$mode" =~ "--deploy-mode" ]]; then
            # Assumes "--deploy-mode <mode>" directly follows "--master yarn"
            deploy=`echo "$mode" | awk '{print $3}'`
        else
            deploy="client"
        fi
        ;;
    "yarn-client"|"local")
        deploy="client"
        ;;
    "yarn-cluster")
        deploy="cluster"
        ;;
    esac
else
    deploy="client"
fi
# Rewrite the log4j configuration in spark-defaults.conf to match the deploy mode
number=`sed -n -e '/spark.driver.extraJavaOptions/=' "$SPARK_HOME/conf/spark-defaults.conf"`
if [ "$deploy"x = "client"x ]; then
    sed -i "${number}s/-Dlog4j.configuration=.*properties /-Dlog4j.configuration=.\/log4j.properties /g" "$SPARK_HOME/conf/spark-defaults.conf"
else
    sed -i "${number}s/-Dlog4j.configuration=.*properties /-Dlog4j.configuration=.\/log4j-executor.properties /g" "$SPARK_HOME/conf/spark-defaults.conf"
fi

These lines have the same effect as Solution 1: they detect the YARN deploy mode and rewrite the -Dlog4j.configuration entry of the spark.driver.extraJavaOptions configuration item in $SPARK_HOME/conf/spark-defaults.conf accordingly, as the usage sketch below shows.
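As a usage sketch (the application JAR and class name are hypothetical): spark-submit invokes spark-class to launch org.apache.spark.deploy.SparkSubmit, so with the modified script the configuration is switched automatically before the Driver JVM starts.

# Hypothetical submissions showing the automatic switch
spark-submit --master yarn --deploy-mode client --class com.example.MyApp ./myapp.jar
# -> spark-defaults.conf now carries -Dlog4j.configuration=./log4j.properties
spark-submit --master yarn --deploy-mode cluster --class com.example.MyApp ./myapp.jar
# -> spark-defaults.conf now carries -Dlog4j.configuration=./log4j-executor.properties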

support.huaweicloud.com/devg-mrs/mrs_06_0431.html