The custom analyzer from 3-11 doesn't take effect

Source: 3-11 Custom Analyzers

rick_and_

2021-02-27

I typed this in along with the video, but it doesn't take effect:

PUT test_index_2
{
	"settings": {
		"analysis": {
			"analyzer": {
				"my_custom_analyzer": {
					"type": "custom",
					"char_filter": [
						"emoticons"
					],
					"tokenizer": "punctuation",
					"filter": [
						"lowercase",
						"english_stop"
					]
				}
			},
			"tokenizer": {
				"punctuation": {
					"type": "pattern",
					"pattern": "[.,?!]"
				}
			},
			"char_filter": {
				"emoticons": {
					"type": "mapping",
					"mappings": [
						":)=>happy",
						":(=>cry"
					]
				}
			},
			"filter": {
				"english_stop": {
					"type": "stop",
					"stopwords": "_english_"
				}
			}
		}
	}
}
2 Answers

rockybean

2021-02-28

This is because the `:)` in your settings and the one in your test text don't match: the one in the settings is the half-width (English) form, while the one you tested with is the full-width (Chinese) form. See below:

GET test_index_2/_analyze
{
  "analyzer": "my_custom_analyzer",
  "text": "hi, he drink milk! :)"
}
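To make the mismatch concrete, here is a minimal Python sketch of what this analysis chain does. The char filter mapping, pattern tokenizer, lowercase filter, and stop filter are re-implemented by hand, and the stopword set is a small illustrative subset, not the real `_english_` list:

```python
import re

# Hand-rolled sketch of the custom analyzer above (assumptions noted in comments).
def analyze(text):
    # char_filter "emoticons": replace emoticons before tokenizing.
    # Only the half-width ":)" / ":(" forms are mapped, exactly as in the settings.
    for src, dst in ((":)", "happy"), (":(", "cry")):
        text = text.replace(src, dst)
    # tokenizer "punctuation": a pattern tokenizer SPLITS on the regex,
    # so each chunk between [.,?!] characters becomes one token.
    tokens = [t for t in re.split(r"[.,?!]", text) if t]
    # token filters: lowercase, then drop tokens that equal a stopword
    # (tiny illustrative subset of the _english_ list).
    stopwords = {"a", "an", "and", "the", "he", "is", "it", "to"}
    return [t.lower() for t in tokens if t.strip().lower() not in stopwords]

# Half-width emoticon matches the mapping:
print(analyze("hi, he drink milk! :)"))    # ['hi', ' he drink milk', ' happy']
# Full-width emoticon (：）) does not, so it passes through unchanged:
print(analyze("hi, he drink milk! ：）"))  # ['hi', ' he drink milk', ' ：）']
```

Note that the whole phrase ` he drink milk` stays as one token: the pattern tokenizer only splits on punctuation, and the stop filter compares whole tokens, so `he` is never isolated for the stop filter to remove.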




PUT test_index_2
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": {
          "type": "custom",
          "char_filter": [ "emoticons" ],
          "tokenizer": "punctuation",
          "filter": [ "lowercase", "english_stop" ]
        }
      },
      "tokenizer": {
        "punctuation": {
          "type": "pattern",
          "pattern": "[.,?!]"
        }
      },
      "char_filter": {
        "emoticons": {
          "type": "mapping",
          "mappings": [ ":)=>happy", ":(=>cry" ]
        }
      },
      "filter": {
        "english_stop": {
          "type": "stop",
          "stopwords": "_english_"
        }
      }
    }
  }
}



rockybean

2021-02-27

What version of Elasticsearch are you running?

What error message do you get?

rick_and_ · 2021-02-27

Version 7.x. No error is reported, but the tokenization doesn't take effect.

POST test_index_1/_analyze
{
  "analyzer": "my_analyzer",
  "text": "hi, he drink milk! :)"
}

The result looks like this:

{
  "tokens": [
    { "token": "hi", "start_offset": 0, "end_offset": 2, "type": "word", "position": 0 },
    { "token": " he drink milk", "start_offset": 3, "end_offset": 17, "type": "word", "position": 1 },
    { "token": " :)", "start_offset": 18, "end_offset": 21, "type": "word", "position": 2 }
  ]
}
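One way to see that the two emoticons are genuinely different characters (and hence why the mapping char filter never fires on the full-width one) is to print their code points:

```python
# The half-width and full-width emoticons look alike but are different code points.
print([hex(ord(c)) for c in ":)"])    # half-width: ['0x3a', '0x29']
print([hex(ord(c)) for c in "：）"])  # full-width: ['0xff1a', '0xff09']
```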

Course: Elastic Stack from Beginner to Practice — Build a Data Analysis System Hands-On