Spark cluster execution error
Source: 2-13 Developing the First Spark Application with IDEA and Maven

慕容3565349
2022-07-03
package com.wsx.spark

import org.apache.spark.{SparkConf, SparkContext}

object SparkTest {
  val SPARK_URL = "spark://192.168.72.132:7077"
  val HADOOP_URI = "hdfs://192.168.72.132:9000"
  val FILE_PATH = "/input/wc.input"
  val OUTPUT = "output"

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("SparkHomework").setMaster(SPARK_URL)
    conf.setJars(Seq("C:\\Users\\wusx\\Desktop\\wsx\\Code\\review\\hadoop-train\\target\\hadoop-train-1.0.jar"))
    val sc = new SparkContext(conf)
    println(sc)
    println(HADOOP_URI + FILE_PATH)
    val counts = sc.textFile(HADOOP_URI + FILE_PATH).flatMap(line => line.split(","))
      .foreach(println)
    //  .map(x => (x, 1))
    //  .reduceByKey((x, y) => x + y)
    // Check whether the output file already exists
    // val file = new File(OUTPUT)
    // if (file.exists) {
    //   file.delete()
    // }
    // counts.foreach(println)
    sc.stop()
  }
}
Error:
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 6) (192.168.72.134 executor 2): java.lang.ClassCastException: cannot assign instance of java.lang.invoke.SerializedLambda to field org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in instance of org.apache.spark.rdd.MapPartitionsRDD
Caused by: java.lang.ClassCastException: cannot assign instance of java.lang.invoke.SerializedLambda to field org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in instance of org.apache.spark.rdd.MapPartitionsRDD
Teacher, how can I solve this problem?
1 Answer
Michael_PK
2022-07-04
Strong recommendation: while you are still writing the code, do not connect to standalone or YARN; run it in local mode directly.
Once it passes your tests, package the jar and then submit it to the other deployment modes.
First get the code working the way we did it in class, and if there is still an exception, paste it here.
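For reference, a minimal sketch of the local-mode workflow the answer describes. The object name SparkLocalTest is hypothetical; the HDFS URI and input path are copied from the question, and the word-count steps mirror the commented-out lines in the original code.

package com.wsx.spark

import org.apache.spark.{SparkConf, SparkContext}

object SparkLocalTest {
  def main(args: Array[String]): Unit = {
    // local[*] runs the driver and executors in a single JVM, so there is no jar
    // to ship and no driver/executor version mismatch to chase while developing
    val conf = new SparkConf()
      .setAppName("SparkHomework")
      .setMaster("local[*]")
    val sc = new SparkContext(conf)

    val counts = sc.textFile("hdfs://192.168.72.132:9000/input/wc.input")
      .flatMap(line => line.split(","))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    counts.collect().foreach(println) // bring results back to the driver before printing
    sc.stop()
  }
}

Once the logic passes locally, build the jar (for example with mvn package) and pass the cluster master to spark-submit instead of hard-coding it and calling setJars, roughly: spark-submit --master spark://192.168.72.132:7077 --class com.wsx.spark.SparkTest target/hadoop-train-1.0.jar (the class name and jar path here are taken from the question and may differ in your project).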