Cloud Server Content Highlights

  • Answer: The application attempts to collect a large amount of data to the Driver. If the Driver does not have enough memory to hold that data, an OOM (OutOfMemoryError) exception is thrown, and the Driver then spends its time in garbage collection, trying to reclaim space for the returned data, which leaves the application hanging for a long time. Solution: To force the application to exit in this OOM scenario, add the following to the "spark.driver.extraJavaOptions" configuration item in the client configuration file "$SPARK_HOME/conf/spark-defaults.conf" when launching the Spark Core application: -XX:OnOutOfMemoryError='kill -9 %p'
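    As a minimal sketch, the resulting entry in "$SPARK_HOME/conf/spark-defaults.conf" might look like this (if spark.driver.extraJavaOptions already lists other JVM options, append the flag to that same line rather than adding a second entry):

        # Kill the Driver JVM as soon as it throws OutOfMemoryError
        spark.driver.extraJavaOptions  -XX:OnOutOfMemoryError='kill -9 %p'

    The HotSpot option -XX:OnOutOfMemoryError runs the given command the first time the JVM throws an OutOfMemoryError; %p expands to the JVM's process ID, so 'kill -9 %p' terminates the hung Driver process immediately instead of letting it spin in GC.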
  • Question: A Spark Core application collects a large amount of data to the Driver. When the Driver runs out of memory, the application hangs instead of exiting, with the following log output (the same stack trace is then printed a second time by the thread's uncaught-exception handler):

    16/04/19 15:56:22 ERROR Utils: Uncaught exception in thread task-result-getter-2
    java.lang.OutOfMemoryError: Java heap space
        at java.lang.reflect.Array.newArray(Native Method)
        at java.lang.reflect.Array.newInstance(Array.java:75)
        at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1671)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1345)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
        at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1707)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1345)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
        at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:71)
        at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:91)
        at org.apache.spark.scheduler.DirectTaskResult.value(TaskResult.scala:94)
        at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:66)
        at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$1.apply(TaskResultGetter.scala:57)
        at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$1.apply(TaskResultGetter.scala:57)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1716)
        at org.apache.spark.scheduler.TaskResultGetter$$anon$3.run(TaskResultGetter.scala:56)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
    Exception in thread "task-result-getter-2" java.lang.OutOfMemoryError: Java heap space
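    For context, this failure mode typically comes from pulling an entire distributed dataset back to the Driver in a single call. The following Scala sketch illustrates the pattern (application name and data size are hypothetical, not from the original report):

        import org.apache.spark.{SparkConf, SparkContext}

        object CollectToDriver {
          def main(args: Array[String]): Unit = {
            val sc = new SparkContext(new SparkConf().setAppName("CollectToDriver"))
            // Build a large distributed dataset, then pull every partition back to
            // the Driver. collect() materializes the whole result in Driver memory;
            // if it exceeds the Driver heap, deserializing the task results throws
            // java.lang.OutOfMemoryError: Java heap space, as in the log above.
            val data = sc.parallelize(1 to 100000000).map(i => (i, "x" * 100))
            val all = data.collect() // OOM risk on the Driver
            println(all.length)
            sc.stop()
          }
        }

    With -XX:OnOutOfMemoryError='kill -9 %p' set as described in the answer, a run of this job that overflows the Driver heap is killed outright instead of hanging in GC.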