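Asking the instructor for help: wc runs fine locally but reports a syntax error on the server
Source: 4-14 - Operators in practice: refactoring the word count example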
Luoters
2021-10-04
if len(sys.argv) != 3: print("Usage: wordcount <input> <output>", file=sys.stderr) sys.exit(-1) is reported as a syntax error.
The code is as follows:
import sys
from pyspark import SparkConf, SparkContext

if __name__ == '__main__':
    if len(sys.argv) != 3:
        print("Usage: wordcount <input> <output>", file=sys.stderr)
        sys.exit(-1)

    conf = SparkConf()
    sc = SparkContext(conf=conf)
    conf.set("spark.executor.heartbeatInterval", "3600s")

    def wcResult():
        word_count = sc.textFile(sys.argv[1]).flatMap(lambda line: line.split(" ")) \
            .map(lambda x: (x, 1)).reduceByKey(lambda a, b: a + b)
        output = word_count.collect()
        for (word, count) in output:
            print("%s: %i" % (word, count))

    def wcSaveFile():
        sc.textFile(sys.argv[1]).flatMap(lambda line: line.split(" ")) \
            .map(lambda x: (x, 1)).reduceByKey(lambda a, b: a + b).saveAsTextFile(sys.argv[2])

    #wcResult()
    wcSaveFile()
    sc.stop()
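Syntax error (yet, surprisingly, it runs locally):
I spent a long time looking and still cannot tell what is wrong. With that check commented out the job runs, but the result written to the output file seems to be wrong.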
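One possible cause, offered as an assumption since the post does not say which Python version the server runs: under Python 2, print is a statement, so print(..., file=sys.stderr) is rejected as a syntax error, while Python 3 accepts it. A minimal sketch of a usage check that parses under both versions by calling sys.stderr.write instead; the helper name check_args is hypothetical and not part of the course code:

```python
import sys

def check_args(argv):
    """Return True if argv holds the script name plus <input> and <output>."""
    if len(argv) != 3:
        # sys.stderr.write is an ordinary method call, valid in Python 2 and 3.
        sys.stderr.write("Usage: wordcount <input> <output>\n")
        return False
    return True
```

In the script this would be called as check_args(sys.argv), followed by sys.exit(-1) when it returns False. Printing sys.version at the top of the script on the server would show which interpreter is actually being used.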
1 answer
Michael_PK
2021-10-09
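Check whether the delimiter your code splits on is a space or a tab. Could that be what is causing the words to be split incorrectly?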
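To illustrate the delimiter point, here is a minimal sketch in plain Python (no Spark needed), assuming a hypothetical tab-separated sample line. line.split(" ") splits only on single space characters, while line.split() with no argument splits on any run of whitespace, tabs included:

```python
# Hypothetical sample line: words separated by tabs rather than spaces.
line = "hello\tworld\thello"

# Splitting on a single space finds no space, so the whole line stays one token.
by_space = line.split(" ")

# Splitting with no argument treats any run of whitespace (spaces, tabs) as a separator.
by_whitespace = line.split()

print(by_space)
print(by_whitespace)
```

If the input file does turn out to be tab-separated, the corresponding change in the job would be to use lambda line: line.split() in the flatMap step.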