Programmatic approach to Interoperating fails

Source: 4-12 Implementation approach 2

pain7

2020-07-30

def inferSchema(spark: SparkSession): Unit = {
    import spark.implicits._

    val dataFrame: DataFrame = spark.read.text("/Users/pain/Documents/bigdata/spark/spark-learning/input/people.txt")
    val personDF: DataFrame = dataFrame.map(_.getString(0).split(",")).map(x => Person(x(0), x(1).trim.toLong)).toDF()
    personDF.show()

    personDF.createOrReplaceTempView("people")
    val oldPersonDF: DataFrame = spark.sql("select name, age from people where age > 22")
    oldPersonDF.map(x => "Name: " + x.getAs[String]("name")).show()
}

def specifySchema(spark: SparkSession): Unit = {
    import spark.implicits._

    val dataFrame: DataFrame = spark.read.text("/Users/pain/Documents/bigdata/spark/spark-learning/input/people.txt")
    val rowDF: DataFrame = dataFrame.map(_.getString(0).split(",")).map(x => Row(x(0), x(1).trim.toLong)).toDF()

    val structType: StructType = StructType(StructField("name", StringType, true) :: StructField("age", LongType, true) :: Nil)
    val df: DataFrame = spark.createDataFrame(rowDF.rdd, structType)

    df.show()
}

case class Person(name: String, age: Long)

Calling the inferSchema method works fine, but calling specifySchema reports the following error:
[Image: screenshot of the error]

1 answer

Michael_PK

2020-07-30

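I don't see anything wrong with this code. Try this: copy my code directly from git, run it, and compare the order of the statements with yours.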

Michael_PK replied to pain7:
I think this is very likely a data type problem caused by _.getString(0).split(",")
2020-07-30
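
The error screenshot is not available, so the following is only a guess at the cause. In specifySchema the second map runs on a Dataset and returns Row, and as far as I know spark.implicits._ does not provide an implicit Encoder[Row], so that line usually fails to compile with an "Unable to find encoder" message. inferSchema is unaffected because Person is a case class, which does get an encoder. A minimal sketch of one way around it is to build an RDD[Row] first and pass it to createDataFrame together with the StructType. The object name SpecifySchemaSketch, the path parameter, and the temp-file sample data are assumptions made up for this sketch and are not from the original post.

```scala
import java.nio.file.Files

import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{LongType, StringType, StructField, StructType}

object SpecifySchemaSketch {

  // `path` is a parameter introduced for this sketch; the original post
  // hard-codes the location of people.txt.
  def specifySchema(spark: SparkSession, path: String): DataFrame = {
    // textFile returns an RDD[String]; RDD.map needs no Encoder,
    // so producing Row objects here compiles without extra implicits.
    val rowRDD: RDD[Row] = spark.sparkContext
      .textFile(path)
      .map(_.split(","))
      .map(x => Row(x(0), x(1).trim.toLong))

    // Same schema as in the question: name as String, age as Long.
    val structType: StructType = StructType(
      StructField("name", StringType, nullable = true) ::
        StructField("age", LongType, nullable = true) :: Nil)

    spark.createDataFrame(rowRDD, structType)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("SpecifySchemaSketch")
      .getOrCreate()

    // Hypothetical sample data written to a temp file, in the
    // "name, age" layout the question's code appears to expect.
    val sample = Files.createTempFile("people", ".txt")
    Files.write(sample, "Michael, 29\nAndy, 30\nJustin, 19\n".getBytes("UTF-8"))

    specifySchema(spark, sample.toString).show()
    spark.stop()
  }
}
```

If the real error turns out to be a runtime one instead (for example a line in people.txt without a comma, or an age that is not a number), this change alone would not fix it, and the input file would be the next thing to check.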