MapReduce Service (MRS) - Flink client command fails with "Could not connect to the leading JobManager": Background and Symptoms

Date: 2024-08-27 10:23:52

Background and Symptoms

After a Flink cluster is created, running the yarn-session.sh command hangs for a while and then fails with the following error:

2018-09-20 22:51:16,842 | WARN  | [main] | Unable to get ClusterClient status from Application Client | org.apache.flink.yarn.YarnClusterClient (YarnClusterClient.java:253) 
org.apache.flink.util.FlinkException: Could not connect to the leading JobManager. Please check that the JobManager is running.
	at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:861)
	at org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:248)
	at org.apache.flink.yarn.YarnClusterClient.waitForClusterToBeReady(YarnClusterClient.java:516)
	at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:717)
	at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:514)
	at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:511)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
	at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
	at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:511)
Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the leader gateway.
	at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:79)
	at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:856)
	... 10 common frames omitted
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
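
When triaging a trace like the one above, it can help to walk the "Caused by" chain down to the innermost exception, since that is usually the most specific symptom. The following is a minimal sketch in Python, not a tool shipped with MRS or Flink; the names `exception_chain`, `EXC_LINE`, and `SAMPLE` are hypothetical and chosen only for illustration, and the sample text is an abbreviated copy of the log above.

```python
import re

# Matches the first line of a Java exception, with an optional "Caused by: " prefix.
# Stack frames ("at ...") and log-header lines do not match this pattern.
EXC_LINE = re.compile(r'^(?:Caused by: )?([\w.$]+(?:Exception|Error)): (.*)$')


def exception_chain(log_text):
    """Return [(class_name, message), ...] ordered from outermost to innermost cause."""
    chain = []
    for line in log_text.splitlines():
        match = EXC_LINE.match(line.strip())
        if match:
            chain.append((match.group(1), match.group(2)))
    return chain


# Abbreviated copy of the client log shown above (illustrative input only).
SAMPLE = """\
org.apache.flink.util.FlinkException: Could not connect to the leading JobManager. Please check that the JobManager is running.
\tat org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:861)
Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the leader gateway.
\t... 10 common frames omitted
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
"""

if __name__ == "__main__":
    # Print each exception indented by its depth in the cause chain.
    for depth, (cls, msg) in enumerate(exception_chain(SAMPLE)):
        print("  " * depth + cls.rsplit(".", 1)[-1] + ": " + msg)
```

For this log, the innermost entry is the `java.util.concurrent.TimeoutException`: the client waited 10000 milliseconds for the leader JobManager gateway and received no answer. The trace alone does not say why the JobManager was unreachable.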
support.huaweicloud.com/trouble-mrs/mrs_03_0135.html