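ES 7.5.2 relevance scoring question: none of the material I found gives the right numbers, so I'd like to ask the instructor
Source: 7-2 Relevance scoring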

苦瓜苦也
2020-03-15
GET test_search_relevance/_settings
{
  "test_search_relevance" : {
    "settings" : {
      "index" : {
        "creation_date" : "1584190500424",
        "number_of_shards" : "3",
        "number_of_replicas" : "1",
        "uuid" : "SnNbrQ_JR7GKa0tQPqSgZg",
        "version" : {
          "created" : "7050299"
        },
        "provided_name" : "test_search_relevance"
      }
    }
  }
}
PUT test_search_relevance/_bulk
{"index":{"_id":1}}
{"name":"hello"}
{"index":{"_id":2}}
{"name":"hello,world!"}
{"index":{"_id":3}}
{"name":"hello,world! a beautiful world"}
GET /test_search_relevance/_search
{
  "explain": true,
  "query": {
    "match": {
      "name": "hello"
    }
  }
}
Excerpt of the response:
{
  "_shard" : "[test_search_relevance][2]",
  "_node" : "cSYFpZ2ZSwWfwgItKZ8zHA",
  "_index" : "test_search_relevance",
  "_type" : "_doc",
  "_id" : "1",
  "_score" : 0.2876821,
  "_source" : {
    "name" : "hello"
  },
  "_explanation" : {
    "value" : 0.2876821,
    "description" : "weight(name:hello in 0) [PerFieldSimilarity], result of:",
    "details" : [
      {
        "value" : 0.2876821,
        "description" : "score(freq=1.0), product of:",
        "details" : [
          {
            "value" : 2.2,
            "description" : "boost",
            "details" : [ ]
          },
          {
            "value" : 0.2876821,
            "description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
            "details" : [
              {
                "value" : 1,
                "description" : "n, number of documents containing term",
                "details" : [ ]
              },
              {
                "value" : 1,
                "description" : "N, total number of documents with field",
                "details" : [ ]
              }
            ]
          },
          {
            "value" : 0.45454544,
            "description" : "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
            "details" : [
              {
                "value" : 1.0,
                "description" : "freq, occurrences of term within document",
                "details" : [ ]
              },
              {
                "value" : 1.2,
                "description" : "k1, term saturation parameter",
                "details" : [ ]
              },
              {
                "value" : 0.75,
                "description" : "b, length normalization parameter",
                "details" : [ ]
              },
              {
                "value" : 1.0,
                "description" : "dl, length of field",
                "details" : [ ]
              },
              {
                "value" : 1.0,
                "description" : "avgdl, average length of field",
                "details" : [ ]
              }
            ]
          }
        ]
      }
    ]
  }
}
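Question 1) How is max_score in the response calculated?
Question 2) How is the hits._score value 0.2876821 calculated?
Question 3) How is the _explanation.value of 0.2876821 calculated? The IDF value inside it does work out to 0.2876821:
import math
print(math.log(1 + (1 - 1 + 0.5) / (1 + 0.5)))  # 0.28768207245178085
The TF value inside it likewise works out to 0.45454544:
print(1.0 / (1.0 + 1.2 * (1 - 0.75 + 0.75 * 1.0 / 1.0)))  # 0.45454545454545453
One document I found while researching the calculation describes it as IDF(qi) * R(qi,d).
But IDF(qi) * R(qi,d) = 0.2876821 * 0.45454544 = 0.13076458672462402, which clearly does not match the values in the three questions above. I don't know how those three values are calculated.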
1 answer
rockybean
2020-03-20
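max_score is the score of the highest-scoring document among all the hits.
As for how the score itself is calculated, you should be able to work it out by following the detailed explanation in the response.
If your final result does not match, there is probably a problem in your calculation steps. I'll need to find some time to look into this.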
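Taking the explain output in the question literally, the line "score(freq=1.0), product of:" lists three factors: boost (2.2), idf and tf. The sketch below multiplies all three using only the values from that excerpt, next to the two-factor product from the question. The function names are made up for illustration. Note that 2.2 happens to equal k1 + 1 for the k1 = 1.2 shown in the same output; that this is where the boost comes from is an assumption here, not something the output states.

```python
import math

def idf(n, N):
    # Formula copied from the explain output: log(1 + (N - n + 0.5) / (n + 0.5))
    return math.log(1 + (N - n + 0.5) / (n + 0.5))

def tf(freq, k1, b, dl, avgdl):
    # Formula copied from the explain output:
    # freq / (freq + k1 * (1 - b + b * dl / avgdl))
    return freq / (freq + k1 * (1 - b + b * dl / avgdl))

def explain_score(boost, idf_value, tf_value):
    # "score(freq=1.0), product of:" boost, idf, tf
    return boost * idf_value * tf_value

def max_score(hits):
    # max_score as described in the answer: the highest _score among the hits
    return max(hit["_score"] for hit in hits)

# Values taken from the explain excerpt for document _id 1
idf_value = idf(n=1, N=1)
tf_value = tf(freq=1.0, k1=1.2, b=0.75, dl=1.0, avgdl=1.0)

two_factor = idf_value * tf_value                    # about 0.1307, as in the question
three_factor = explain_score(2.2, idf_value, tf_value)  # about 0.2877, matching _score

print(two_factor)
print(three_factor)

# Only the one hit shown in the excerpt is used here
print(max_score([{"_id": "1", "_score": 0.2876821}]))
```

With freq = 1 and dl = avgdl, tf is 1 / 2.2, so boost * tf is about 1 and the score comes out numerically equal to the idf. That would explain why _score, _explanation.value and the idf all show 0.2876821 for this document.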