spark on yarn 错误
使用spark on yarn跑任务的时候,出现了异常错误,错误如下:
hduser@xxx:/data1/hadoop/spark/bin$ ./spark-shell --master yarn --deploy-mode client 2020-04-13 10:04:25 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable Setting default log level to "WARN". To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel). 2020-04-13 10:10:10 ERROR YarnClientSchedulerBackend:70 - The YARN application has already ended! It might have been killed or the Application Master may have failed to start. Check the YARN application logs for more details. 2020-04-13 10:10:10 ERROR SparkContext:91 - Error initializing SparkContext. org.apache.spark.SparkException: Application application_1586511331938_0025 failed 2 times due to AM Container for appattempt_1586511331938_0025_000002 exited with exitCode: 1 For more detailed output, check application tracking page:http://xxx:8088/proxy/application_1586511331938_0025/Then, click on links to logs of each attempt. Diagnostics: Exception from container-launch. Container id: container_1586511331938_0025_02_000159 Exit code: 1 Stack trace: ExitCodeException exitCode=1: at org.apache.hadoop.util.Shell.runCommand(Shell.java:538) at org.apache.hadoop.util.Shell.run(Shell.java:455) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Container exited with a non-zero exit code 1 Failing this attempt. Failing the application. at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:94) at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:63) at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:178) at org.apache.spark.SparkContext.<init>(SparkContext.scala:501) at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2520) at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:935) at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:926) at scala.Option.getOrElse(Option.scala:121) at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:926) at org.apache.spark.repl.Main$.createSparkSession(Main.scala:106) at $line3.$read$$iw$$iw.<init>(<console>:15) at $line3.$read$$iw.<init>(<console>:43) at $line3.$read.<init>(<console>:45) at $line3.$read$.<init>(<console>:49) at $line3.$read$.<clinit>(<console>) at $line3.$eval$.$print$lzycompute(<console>:7) at $line3.$eval$.$print(<console>:6) at $line3.$eval.$print(<console>) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:793) at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1054) at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:645) at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:644) at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31) at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19) at scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:644) at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:576) at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:572) at scala.tools.nsc.interpreter.IMain$$anonfun$quietRun$1.apply(IMain.scala:231) at scala.tools.nsc.interpreter.IMain$$anonfun$quietRun$1.apply(IMain.scala:231) at scala.tools.nsc.interpreter.IMain.beQuietDuring(IMain.scala:221) at scala.tools.nsc.interpreter.IMain.quietRun(IMain.scala:231) at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1$$anonfun$apply$mcV$sp$1.apply(SparkILoop.scala:109) at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1$$anonfun$apply$mcV$sp$1.apply(SparkILoop.scala:109) at scala.collection.immutable.List.foreach(List.scala:392) at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply$mcV$sp(SparkILoop.scala:109) at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:109) at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:109) at scala.tools.nsc.interpreter.ILoop.savingReplayStack(ILoop.scala:91) at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:108) at org.apache.spark.repl.SparkILoop$$anonfun$process$1$$anonfun$org$apache$spark$repl$SparkILoop$$anonfun$$loopPostInit$1$1.apply$mcV$sp(SparkILoop.scala:211) at org.apache.spark.repl.SparkILoop$$anonfun$process$1$$anonfun$org$apache$spark$repl$SparkILoop$$anonfun$$loopPostInit$1$1.apply(SparkILoop.scala:199) at org.apache.spark.repl.SparkILoop$$anonfun$process$1$$anonfun$org$apache$spark$repl$SparkILoop$$anonfun$$loopPostInit$1$1.apply(SparkILoop.scala:199) at scala.tools.nsc.interpreter.ILoop$$anonfun$mumly$1.apply(ILoop.scala:189) at scala.tools.nsc.interpreter.IMain.beQuietDuring(IMain.scala:221) at scala.tools.nsc.interpreter.ILoop.mumly(ILoop.scala:186) at org.apache.spark.repl.SparkILoop$$anonfun$process$1.org$apache$spark$repl$SparkILoop$$anonfun$$loopPostInit$1(SparkILoop.scala:199) at org.apache.spark.repl.SparkILoop$$anonfun$process$1$$anonfun$startup$1$1.apply(SparkILoop.scala:267) at org.apache.spark.repl.SparkILoop$$anonfun$process$1$$anonfun$startup$1$1.apply(SparkILoop.scala:247) at org.apache.spark.repl.SparkILoop$$anonfun$process$1.withSuppressedSettings$1(SparkILoop.scala:235) at org.apache.spark.repl.SparkILoop$$anonfun$process$1.startup$1(SparkILoop.scala:247) at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:282) at org.apache.spark.repl.SparkILoop.runClosure(SparkILoop.scala:159) at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:182) at org.apache.spark.repl.Main$.doMain(Main.scala:78) at org.apache.spark.repl.Main$.main(Main.scala:58) at org.apache.spark.repl.Main.main(Main.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849) at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167) at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195) at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86) at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 2020-04-13 10:10:10 WARN YarnSchedulerBackend$YarnSchedulerEndpoint:66 - Attempted to request executors before the AM has registered! 2020-04-13 10:10:11 WARN MetricsSystem:66 - Stopping a MetricsSystem that is not running 2020-04-13 10:10:11 ERROR Main:91 - Failed to initialize Spark session. org.apache.spark.SparkException: Application application_1586511331938_0025 failed 2 times due to AM Container for appattempt_1586511331938_0025_000002 exited with exitCode: 1 For more detailed output, check application tracking page:http://xxx:8088/proxy/application_1586511331938_0025/Then, click on links to logs of each attempt. Diagnostics: Exception from container-launch. Container id: container_1586511331938_0025_02_000159 Exit code: 1 Stack trace: ExitCodeException exitCode=1: at org.apache.hadoop.util.Shell.runCommand(Shell.java:538) at org.apache.hadoop.util.Shell.run(Shell.java:455) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Container exited with a non-zero exit code 1 Failing this attempt. Failing the application. at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:94) at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:63) at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:178) at org.apache.spark.SparkContext.<init>(SparkContext.scala:501) at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2520) at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:935) at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:926) at scala.Option.getOrElse(Option.scala:121) at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:926) at org.apache.spark.repl.Main$.createSparkSession(Main.scala:106) at $line3.$read$$iw$$iw.<init>(<console>:15) at $line3.$read$$iw.<init>(<console>:43) at $line3.$read.<init>(<console>:45) at $line3.$read$.<init>(<console>:49) at $line3.$read$.<clinit>(<console>) at $line3.$eval$.$print$lzycompute(<console>:7) at $line3.$eval$.$print(<console>:6) at $line3.$eval.$print(<console>) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:793) at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1054) at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:645) at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:644) at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31) at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19) at scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:644) at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:576) at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:572) at scala.tools.nsc.interpreter.IMain$$anonfun$quietRun$1.apply(IMain.scala:231) at scala.tools.nsc.interpreter.IMain$$anonfun$quietRun$1.apply(IMain.scala:231) at scala.tools.nsc.interpreter.IMain.beQuietDuring(IMain.scala:221) at scala.tools.nsc.interpreter.IMain.quietRun(IMain.scala:231) at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1$$anonfun$apply$mcV$sp$1.apply(SparkILoop.scala:109) at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1$$anonfun$apply$mcV$sp$1.apply(SparkILoop.scala:109) at scala.collection.immutable.List.foreach(List.scala:392) at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply$mcV$sp(SparkILoop.scala:109) at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:109) at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:109) at scala.tools.nsc.interpreter.ILoop.savingReplayStack(ILoop.scala:91) at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:108) at org.apache.spark.repl.SparkILoop$$anonfun$process$1$$anonfun$org$apache$spark$repl$SparkILoop$$anonfun$$loopPostInit$1$1.apply$mcV$sp(SparkILoop.scala:211) at org.apache.spark.repl.SparkILoop$$anonfun$process$1$$anonfun$org$apache$spark$repl$SparkILoop$$anonfun$$loopPostInit$1$1.apply(SparkILoop.scala:199) at org.apache.spark.repl.SparkILoop$$anonfun$process$1$$anonfun$org$apache$spark$repl$SparkILoop$$anonfun$$loopPostInit$1$1.apply(SparkILoop.scala:199) at scala.tools.nsc.interpreter.ILoop$$anonfun$mumly$1.apply(ILoop.scala:189) at scala.tools.nsc.interpreter.IMain.beQuietDuring(IMain.scala:221) at scala.tools.nsc.interpreter.ILoop.mumly(ILoop.scala:186) at org.apache.spark.repl.SparkILoop$$anonfun$process$1.org$apache$spark$repl$SparkILoop$$anonfun$$loopPostInit$1(SparkILoop.scala:199) at org.apache.spark.repl.SparkILoop$$anonfun$process$1$$anonfun$startup$1$1.apply(SparkILoop.scala:267) at org.apache.spark.repl.SparkILoop$$anonfun$process$1$$anonfun$startup$1$1.apply(SparkILoop.scala:247) at org.apache.spark.repl.SparkILoop$$anonfun$process$1.withSuppressedSettings$1(SparkILoop.scala:235) at org.apache.spark.repl.SparkILoop$$anonfun$process$1.startup$1(SparkILoop.scala:247) at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:282) at org.apache.spark.repl.SparkILoop.runClosure(SparkILoop.scala:159) at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:182) at org.apache.spark.repl.Main$.doMain(Main.scala:78) at org.apache.spark.repl.Main$.main(Main.scala:58) at org.apache.spark.repl.Main.main(Main.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849) at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167) at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195) at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86) at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
分析:
通过上述的分析,可以看到是由于yarn 在启动AM的时候,AM没有正常的启动,导致spark运行失败,但是,去yarn的8088界面查看saprk任务容器日志,没有相应的错误日志输出,怀疑是由于AM容器没有启动,也就导致没有容器日志,后来配置spark 历史日志,发现也没有对应的错误日志出现,同时,查看resourcemanager以及nodemanager的日志,同样没有爆出具体导致任务失败的原因。
导致这个问题一直卡住。
根据网上其他博客,碰到的上述这个问题,基本都是说由于yarn的资源不够,关闭到yarn资源监测就行,但是我这是线上集群,整个集群内存,CPU完全是够的,跑mr任务是完全没问题。
终上:报上诉的问题,不一定是由于资源不够导致的。
解决:
由于hadoop集群是以前同事部署的,使用的hadoop2.6.0,java使用的1.7,而在部署spark on yarn的时候,在spark环境里面使用的java1.8,所以导致了失败(这个不提示版本不一致是真难受),所以在配置spark on yarn的时候,java环境变量一定要配置相同。
修改hadoop-env.sh里面的JAVA_HOME的值,配置与spark 里面一值,同时重新启动yarn,问题解决
集群hadoop版本是2.6.0,spark版本是2.4.0,,我在部署的时候使用的是spark2.4.0-hadoop-2.7.0版本的spark包,当时也怀疑是由于hadoop版本不匹配,所以去网上下载spark-2.4.0-hadoop-2.6.0版本spark包,但是发现,下载速度真的是慢的要死,所以这里把spark-2.4.0-hadoop-2.6.0版本的包分享出来,给需要的朋友使用
spark-2.4-hadoop2.6.0 链接 :
链接:https://pan.baidu.com/s/1iPECAwfmmene_efowN6MjA
提取码:8l4y
借鉴:https://www.cnblogs.com/braveym/p/8571142.html