Submitting in YARN mode: pyspark.zip cannot be found

Source: 3-5 RDD Characteristics as Reflected in the Source Code

gyy_

2019-12-04

/home/hadoop/app/spark-2.4.4-bin-hadoop2.7/bin/spark-submit --master yarn --name spark0402 ~/local_scripts/scripts/spark0402.py hdfs://node1:9000/hello.txt hdfs://node1:9000/output

The key part of the error output is:
file:/home/hadoop/.sparkStaging/application_1575014477999_0002/pyspark.zip does not exist

19/12/03 05:14:10 INFO SparkContext: Successfully stopped SparkContext
Traceback (most recent call last):
  File "/home/hadoop/local_scripts/scripts/test.py", line 14, in <module>
    sc = SparkContext(conf=conf)
  File "/home/hadoop/app/spark-2.4.4-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/context.py", line 136, in __init__
  File "/home/hadoop/app/spark-2.4.4-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/context.py", line 198, in _do_init
  File "/home/hadoop/app/spark-2.4.4-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/context.py", line 306, in _initialize_context
  File "/home/hadoop/app/spark-2.4.4-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1525, in __call__
  File "/home/hadoop/app/spark-2.4.4-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: org.apache.spark.SparkException: Application application_1575014477999_0002 failed 2 times due to AM Container for appattempt_1575014477999_0002_000002 exited with  exitCode: -1000
For more detailed output, check application tracking page:http://node1:8088/cluster/app/application_1575014477999_0002Then, click on links to logs of each attempt.
Diagnostics: File file:/home/hadoop/.sparkStaging/application_1575014477999_0002/pyspark.zip does not exist
java.io.FileNotFoundException: File file:/home/hadoop/.sparkStaging/application_1575014477999_0002/pyspark.zip does not exist
	

1 Answer

Michael_PK

2019-12-04

Run a simple word count (wc) example first to make sure your YARN cluster itself is healthy. Judging from this log, the problem is most likely with your YARN setup.
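To make that check concrete: the suggestion is probably the classic Hadoop word count (e.g. the hadoop-mapreduce-examples jar), which exercises YARN independently of Spark. Since the course works in Python, a PySpark equivalent is sketched below as well. This is a minimal sketch; the file name wc_check.py and the output path are illustrative, and the hdfs://node1:9000 addresses are simply reused from the question.

# wc_check.py -- a minimal word count used only to verify that YARN submission works.
import sys

from pyspark import SparkConf, SparkContext

if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: wc_check.py <input> <output>", file=sys.stderr)
        sys.exit(1)

    conf = SparkConf().setAppName("yarn-wc-check")
    sc = SparkContext(conf=conf)

    counts = (sc.textFile(sys.argv[1])                    # e.g. hdfs://node1:9000/hello.txt
                .flatMap(lambda line: line.split(" "))
                .map(lambda word: (word, 1))
                .reduceByKey(lambda a, b: a + b))

    counts.saveAsTextFile(sys.argv[2])                    # output dir must not already exist
    sc.stop()

Submit it the same way as the original job, for example with spark-submit --master yarn wc_check.py plus the input and output paths. If this also fails with the ".sparkStaging ... does not exist" error, the problem lies in the cluster or submit-side configuration rather than in the application code.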

Michael_PK replied to gyy_ (2019-12-04, 2 replies in total):

That workaround does work, but it hardcodes the configuration in the code. I suspect the real problem is still the Hadoop configuration, which is why some parameters cannot be found.
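For context on the two points above: the file:/home/hadoop/.sparkStaging/... path in the error shows that Spark resolved the staging directory against the local filesystem instead of HDFS, which typically happens when the submitting machine's Hadoop configuration (HADOOP_CONF_DIR pointing at core-site.xml and yarn-site.xml) is not visible to spark-submit. The snippet below is only a guess at what the "hardcoded in code" workaround might look like, since gyy_'s actual reply is not shown here; spark.hadoop.fs.defaultFS and spark.yarn.stagingDir are real Spark settings, and the hdfs://node1:9000 address comes from the question.

# A guessed sketch of the hardcoded workaround being discussed (assumption:
# the original reply from gyy_ is not shown on this page).
from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setAppName("spark0402")
        # Make the driver-side Hadoop configuration use HDFS as the default FS,
        # so the YARN client resolves paths against HDFS instead of file:/.
        .set("spark.hadoop.fs.defaultFS", "hdfs://node1:9000")
        # Place the .sparkStaging directory on HDFS explicitly.
        .set("spark.yarn.stagingDir", "hdfs://node1:9000/user/hadoop"))

sc = SparkContext(conf=conf)
# ... rest of the job ...
sc.stop()

The cleaner fix Michael_PK is pointing at is to leave the code alone and instead export HADOOP_CONF_DIR (the directory containing core-site.xml and yarn-site.xml) in the shell or in spark-env.sh before running spark-submit, so that fs.defaultFS and the other cluster parameters are picked up from the Hadoop configuration rather than hardcoded per job.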

Python 3 in Action: Spark Big Data Analysis and Job Scheduling

Develop and tune Spark applications with Python 3, and master Azkaban job scheduling.
