Querying data after tokenization

Source: 6-8 - Split-brain problem

simons_fan

2020-12-17

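Hello, I have a question and would appreciate your help:
The text value "新建文件夹" is split by the ik_smart analyzer into two terms, 新建 and 文件夹. When a user searches for "新建文", nothing is found, and the match query I am using does not help either. The index structure and the query are as follows: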

{
	"settings": {
		"number_of_shards": 3,
		"number_of_replicas": 2
	},
	"mappings": {
		"_doc": {
			"properties": {
				"id": {
					"type": "keyword"
				},
				"documentId": {
					"type": "integer"
				},
				"name": {
					"type": "text",
					"analyzer": "ik_smart"
				},
				"content": {
					"type": "text",
					"analyzer": "ik_smart"
				},
				"labelName": {
					"type": "text"
				},
				"ownerName": {
					"type": "text"
				},
				"lastBrowseUser": {
					"type": "text"
				},
				"createTime": {
					"type": "date"
				},
				"updateTime": {
					"type": "date"
				},
				"lastBrowseTime": {
					"type": "date"
				},
				"type": {
					"type": "keyword"
				}
			}
		}
	}
}

{
	"bool": {
		"must": [
			{
				"multi_match": {
					"query": "新建文",
					"fields": [
						"name^1.0"
					],
					"type": "best_fields",
					"operator": "OR",
					"slop": 0,
					"prefix_length": 0,
					"max_expansions": 50,
					"zero_terms_query": "NONE",
					"auto_generate_synonyms_phrase_query": true,
					"fuzzy_transpositions": true,
					"boost": 1.0
				}
			}
		],
		"adjust_pure_negative": true,
		"boost": 1.0
	}
}

1 answer

rockybean

2020-12-18

Accepted

Check how "新建文" itself is tokenized. You can see this with the _analyze API:

GET _analyze
{
  "analyzer": "ik_smart",
  "text": ["新建文"]
}

The result is as follows:

{
  "tokens" : [
    {
      "token" : "新",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "CN_CHAR",
      "position" : 0
    },
    {
      "token" : "建文",
      "start_offset" : 1,
      "end_offset" : 3,
      "type" : "CN_WORD",
      "position" : 1
    }
  ]
}
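
To make the mismatch concrete, here is a minimal Python sketch that compares the two token lists quoted in this thread. It is only a simplified model of term matching, not how Elasticsearch is implemented: it assumes the query text is analyzed with the same ik_smart analyzer as the name field (the default when no separate search analyzer is set), and the helper name matching_terms is made up for illustration.

```python
# Tokens reported in this thread for the ik_smart analyzer.
indexed_terms = {"新建", "文件夹"}  # "新建文件夹" at index time (from the question)
query_terms = ["新", "建文"]        # "新建文" at search time (from the _analyze output)


def matching_terms(query_terms, indexed_terms):
    """Return the query terms that also exist as indexed terms.

    Simplified model: a match query with operator OR needs at least
    one query term to be present among the field's indexed terms.
    """
    return [term for term in query_terms if term in indexed_terms]


shared = matching_terms(query_terms, indexed_terms)
print(shared)        # the terms common to both lists
print(bool(shared))  # whether the document would be a hit in this model
```

Neither 新 nor 建文 is among the indexed terms 新建 and 文件夹, so the list is empty and the document is not returned. A query such as 新建, which the question reports as one of the indexed terms, would match in this model.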