Problem with ES 7.x

Source: 3-10 - Custom Analysis: TokenFilter

悟亦凡

2022-04-10

I'm hitting the error shown in the screenshot. My rough understanding is that from version 7.0 onward the difference between the ngram maximum and minimum lengths may not exceed 1. How do I explicitly configure the allowed length difference in the request syntax?
Related screenshot: [error screenshot; a text version is posted in a reply below]


2 answers

rockybean

2022-04-10

You can test it with the following approach: index.max_ngram_diff is an index-level setting (its default is 1 in 7.0+), so raise it in the index settings when creating an index and then run _analyze against that index.

PUT test_ngram_filter
{
  "settings": {
    "index": {
      "max_ngram_diff": 10
    },
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "stop",
            "lowercase",
            "3_5_grams"
          ]
        }
      },
      "filter": {
        "3_5_grams": {
          "type": "ngram",
          "min_gram": 3,
          "max_gram": 5
        }
      }
    }
  }
}

POST test_ngram_filter/_analyze
{
  "analyzer": "my_custom_analyzer",
  "text": "a Hello World"
}
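
Note that because index.max_ngram_diff is an index-level setting, a bare POST /_analyze (with no index) always runs with the default limit of 1 and the limit cannot be raised inline in that request. As a sketch (assuming the test_ngram_filter index above has been created), running the original 2/4-gram filter chain through that index's _analyze endpoint should succeed, since the index raises max_ngram_diff to 10:

POST test_ngram_filter/_analyze
{
  "text": ["a Hello World"],
  "tokenizer": "standard",
  "filter": [
    "stop",
    "lowercase",
    {
      "type": "ngram",
      "min_gram": 2,
      "max_gram": 4
    }
  ]
}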



rockybean

2022-04-10

Could you paste a text version? The screenshot resolution is too low to read.

悟亦凡
Request sample:

POST /_analyze
{
  "text": ["a Hello World"],
  "tokenizer": "standard",
  "filter": [
    "stop",
    "lowercase",
    {
      "type": "ngram",
      "min_gram": 2,
      "max_gram": 4
    }
  ]
}

Error returned:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "The difference between max_gram and min_gram in NGram Tokenizer must be less than or equal to: [1] but was [2]. This limit can be set by changing the [index.max_ngram_diff] index level setting."
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "The difference between max_gram and min_gram in NGram Tokenizer must be less than or equal to: [1] but was [2]. This limit can be set by changing the [index.max_ngram_diff] index level setting."
  },
  "status": 400
}
2022-04-10
1 reply
