Error when submitting the code a second time

Source: 4-14 - Operator case study in practice: refactoring the word count

慕九州8702158

2020-09-27

$ ./pyspark --master local[2] --jars ~/lib/elasticsearch-spark-20_2.11-6.3.0.jar

The code is:

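from pyspark.sql.types import StringType
from pyspark.sql.functions import udf


def get_grade(value):
    # Map a PM2.5 reading to an air-quality grade label.
    # Missing or negative readings get no grade; without this guard a
    # negative value would fall through to the "Moderate" branch.
    if value is None or value < 0:
        return None
    if value <= 50:
        return "Healthy"
    elif value <= 100:
        return "Moderate"
    elif value <= 150:
        return "Unhealthy for sensitive groups"
    elif value <= 200:
        return "Unhealthy"
    elif value <= 300:
        return "Very unhealthy"
    elif value <= 500:
        return "Hazardous"
    else:
        return "Beyond index"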



data2017 = spark.read.format("csv").option("header", "true").option("inferSchema", "true").load("/data/Beijing_2017_HourlyPM25_created20170803.csv").select("Year","Month","Day","Hour","Value","QC Name")

grade_function_udf = udf(get_grade, StringType())


group2017 = data2017.withColumn("Grade", grade_function_udf(data2017['value'])).groupBy("Grade").count()


res2017 = group2017.select("Grade", "count", group2017["count"]/data2017.count())
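For reference, what the grouping and ratio steps compute can be sketched in plain Python, with no JVM involved. This is only an illustration under stated assumptions: `grade`, `grade_shares`, `BOUNDS`, and `LABELS` are hypothetical names, the labels are English stand-ins for the grade names, and treating missing or negative readings as ungraded is a choice of this sketch.

```python
from bisect import bisect_left
from collections import Counter

# Inclusive upper bounds of each grade band, and the matching labels.
# LABELS has one more entry than BOUNDS: the last label covers values > 500.
BOUNDS = [50, 100, 150, 200, 300, 500]
LABELS = [
    "Healthy",
    "Moderate",
    "Unhealthy for sensitive groups",
    "Unhealthy",
    "Very unhealthy",
    "Hazardous",
    "Beyond index",
]


def grade(value):
    """Return the grade label for a reading, or None if it is missing or negative."""
    if value is None or value < 0:
        return None
    # bisect_left finds the first band whose upper bound is >= value.
    return LABELS[bisect_left(BOUNDS, value)]


def grade_shares(values):
    """Return {grade: (count, count / total rows)}, like groupBy + count + ratio."""
    counts = Counter(grade(v) for v in values)
    total = len(values)
    return {g: (n, n / total) for g, n in counts.items()}


sample = [12, 50, 75, 160, 501, -999, None, 30]
for label, (count, share) in grade_shares(sample).items():
    print(label, count, share)
```

As in the Spark version, the denominator is the total number of rows, so ungraded rows still count toward the total.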

The error occurs right after the data2017 statement:

>>> data2017 = spark.read.format("csv").option("header", "true").option("inferSchema", "true").load("/data/Beijing_2017_HourlyPM25_created20170803.csv").select("Year","Month","Day","Hour","Value","QC Name")

Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000f8100000, 29884416, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 29884416 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /home/wangwei/app/spark-2.3.1-bin-2.6.0-cdh5.7.0/bin/hs_err_pid5227.log
ERROR:root:Exception while sending command.
Traceback (most recent call last):
  File "/home/wangwei/app/spark-2.3.1-bin-2.6.0-cdh5.7.0/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1159, in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/wangwei/app/spark-2.3.1-bin-2.6.0-cdh5.7.0/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 985, in send_command
    response = connection.send_command(command)
  File "/home/wangwei/app/spark-2.3.1-bin-2.6.0-cdh5.7.0/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1164, in send_command
    "Error while receiving", e, proto.ERROR_ON_RECEIVE)
py4j.protocol.Py4JNetworkError: Error while receiving
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/wangwei/app/spark-2.3.1-bin-2.6.0-cdh5.7.0/python/pyspark/sql/readwriter.py", line 166, in load
    return self._df(self._jreader.load(path))
  File "/home/wangwei/app/spark-2.3.1-bin-2.6.0-cdh5.7.0/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
  File "/home/wangwei/app/spark-2.3.1-bin-2.6.0-cdh5.7.0/python/pyspark/sql/utils.py", line 63, in deco
    return f(*a, **kw)
  File "/home/wangwei/app/spark-2.3.1-bin-2.6.0-cdh5.7.0/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 336, in get_return_value
py4j.protocol.Py4JError: An error occurred while calling o36.load


1 answer

Michael_PK

2020-09-27

There is insufficient memory for the Java Runtime Environment to continue



The machine has run out of memory.
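The failed mmap in the log was only 29884416 bytes (about 28.5 MiB), which suggests the host had almost no free memory left when the JVM asked for more. A minimal sketch for checking this on a Linux machine before starting pyspark is shown here; the helper names and the 1024 MB threshold are assumptions for illustration (1 GB is Spark's default driver memory), not something from the course.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB for most keys)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def available_mb(info):
    """Estimate available memory in MB, falling back to MemFree if MemAvailable is absent."""
    kb = info.get("MemAvailable", info.get("MemFree", 0))
    return kb // 1024


def enough_memory(info, required_mb=1024):
    """Return True if the estimated available memory covers required_mb (hypothetical threshold)."""
    return available_mb(info) >= required_mb


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the kernel's memory summary (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())


# Usage on a Linux host:
#   info = read_meminfo()
#   print(available_mb(info), "MB available; enough:", enough_memory(info))
```

If the check fails, the options are to free memory by stopping other processes (for example a pyspark shell left over from an earlier session) or to move to a machine with more RAM.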

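Michael_PK replied to 慕九州8702158:
You can choose the usage-based billing option. It is very cheap: shut the machine down when you are done and start it again when you need it.
2020-09-27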
