Problem querying data after tokenization

Source: 6-8 - Split-brain problem

simons_fan

2020-12-17

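Hello, I have a question and would appreciate your help:
The text value "新建文件夹", when analyzed with the ik_smart analyzer, is split into two terms: 新建 and 文件夹. When a user searches for "新建文", nothing is found, and a match query returns nothing either. The index definition and the query are as follows: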

{
	"settings": {
		"number_of_shards": 3,
		"number_of_replicas": 2
	},
	"mappings": {
		"_doc": {
			"properties": {
				"id": {
					"type": "keyword"
				},
				"documentId": {
					"type": "integer"
				},
				"name": {
					"type": "text",
					"analyzer": "ik_smart"
				},
				"content": {
					"type": "text",
					"analyzer": "ik_smart"
				},
				"labelName": {
					"type": "text"
				},
				"ownerName": {
					"type": "text"
				},
				"lastBrowseUser": {
					"type": "text"
				},
				"createTime": {
					"type": "date"
				},
				"updateTime": {
					"type": "date"
				},
				"lastBrowseTime": {
					"type": "date"
				},
				"type": {
					"type": "keyword"
				}
			}
		}
	}
}

{
	"bool": {
		"must": [
			{
				"multi_match": {
					"query": "新建文",
					"fields": [
						"name^1.0"
					],
					"type": "best_fields",
					"operator": "OR",
					"slop": 0,
					"prefix_length": 0,
					"max_expansions": 50,
					"zero_terms_query": "NONE",
					"auto_generate_synonyms_phrase_query": true,
					"fuzzy_transpositions": true,
					"boost": 1.0
				}
			}
		],
		"adjust_pure_negative": true,
		"boost": 1.0
	}
}
1 answer

rockybean

2020-12-18

You can check how 新建文 is tokenized by calling the _analyze API:

GET _analyze
{
  "analyzer": "ik_smart",
  "text": ["新建文"]
}

The result is:

{
  "tokens" : [
    {
      "token" : "新",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "CN_CHAR",
      "position" : 0
    },
    {
      "token" : "建文",
      "start_offset" : 1,
      "end_offset" : 3,
      "type" : "CN_WORD",
      "position" : 1
    }
  ]
}

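Seen this way, it is expected that the search finds nothing: the query terms 新 and 建文 match neither of the indexed terms 新建 and 文件夹.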
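
To make the mismatch concrete, here is a minimal Python sketch of the term comparison. It does not call Elasticsearch or the IK plugin: the token lists are copied by hand from this thread, and `match_or` is a hypothetical helper that only imitates the OR semantics of the query, not the real scoring logic.

```python
# Tokens are hard-coded from the thread, not produced by a real analyzer.
# Index side: ik_smart output for "新建文件夹", as stated in the question.
indexed_terms = {"新建", "文件夹"}
# Query side: ik_smart output for "新建文", from the _analyze result above.
query_terms = ["新", "建文"]


def match_or(query_terms, indexed_terms):
    """Hypothetical helper: return the query terms that exist in the index.

    With operator OR, a document matches if at least one term is returned.
    """
    return [term for term in query_terms if term in indexed_terms]


hits = match_or(query_terms, indexed_terms)
print(hits)        # no query term exists in the index
print(bool(hits))  # so the document is not a match

# Assumption: a query that analyzes to the single term "新建" would match.
print(match_or(["新建"], indexed_terms))
```

The point of the sketch is that matching happens on whole analyzed terms, so a query is only found when at least one of its terms is identical to a term stored at index time.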


simons_fan
Thank you very much!
2020-12-19