Clustering evaluation metrics
Source: 9-1 Overview of the KMeans algorithm
慕娘1225113
2019-05-16
Does Spark ML provide any evaluation metrics for clustering results, such as the silhouette coefficient?
1 answer
Wotchin
2019-05-18
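Yes, it does. The usual evaluation metric for KMeans is the Silhouette score (silhouette coefficient), which you can compute with Spark's evaluator class, ClusteringEvaluator: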
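import org.apache.spark.ml.clustering.KMeans
import org.apache.spark.ml.evaluation.ClusteringEvaluator

// Train a k-means model; `dataset` is a DataFrame with a "features" vector column
val kmeans = new KMeans().setK(2).setSeed(1L)
val model = kmeans.fit(dataset)

// Make predictions
val predictions = model.transform(dataset)

// Evaluate clustering by computing Silhouette score
val evaluator = new ClusteringEvaluator()
val silhouette = evaluator.evaluate(predictions)
println(s"Silhouette with squared euclidean distance = $silhouette")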
The code above is excerpted from the Spark source tree:
"examples/src/main/scala/org/apache/spark/examples/ml/KMeansExample.scala"
Beyond that, the evaluator can also compute the score using other distance measures; see the API documentation:
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.ml.evaluation.ClusteringEvaluator
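To illustrate switching the distance measure, here is a minimal sketch rather than anything taken from the Spark examples. It assumes Spark 2.4 or later (where `ClusteringEvaluator.setDistanceMeasure` accepts "squaredEuclidean" and "cosine"), a local master, and a made-up toy dataset; the object and method names (`SilhouetteDistanceSketch`, `silhouetteByMeasure`) are hypothetical.

```scala
import org.apache.spark.ml.clustering.KMeans
import org.apache.spark.ml.evaluation.ClusteringEvaluator
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession

object SilhouetteDistanceSketch {

  // Fit KMeans on toy data and score the same predictions with both distance measures.
  def silhouetteByMeasure(spark: SparkSession): Map[String, Double] = {
    // Hypothetical toy data: two groups of non-zero 2-D points
    val dataset = spark.createDataFrame(Seq(
      Vectors.dense(1.0, 0.1), Vectors.dense(1.1, 0.2), Vectors.dense(0.9, 0.1),
      Vectors.dense(0.1, 1.0), Vectors.dense(0.2, 1.1), Vectors.dense(0.1, 0.9)
    ).map(Tuple1.apply)).toDF("features")

    val model = new KMeans().setK(2).setSeed(1L).fit(dataset)
    val predictions = model.transform(dataset)

    val evaluator = new ClusteringEvaluator()
      .setFeaturesCol("features")
      .setPredictionCol("prediction")
      .setMetricName("silhouette")

    // Same predictions, evaluated once per distance measure
    Seq("squaredEuclidean", "cosine").map { measure =>
      measure -> evaluator.setDistanceMeasure(measure).evaluate(predictions)
    }.toMap
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SilhouetteDistanceSketch")
      .master("local[*]") // assumption: running locally
      .getOrCreate()
    try {
      silhouetteByMeasure(spark).foreach { case (measure, score) =>
        println(s"Silhouette ($measure) = $score")
      }
    } finally {
      spark.stop()
    }
  }
}
```

A silhouette value lies between -1 and 1, and values closer to 1 indicate tighter, better-separated clusters. The same evaluator can be reused to compare models trained with different values of k.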