Learning Elasticsearch

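I. Specifying analyzers
1. Elasticsearch supports using one analyzer when indexing documents and a different one when searching.
2. For indexing, set analyzer on the field in the mapping. When a document is indexed, Elasticsearch first checks whether the field defines analyzer; if it does not, the default analyzer is used (standard, unless the index settings define a different default).
3. For searching, set search_analyzer on the field. At query time Elasticsearch first checks whether the field defines search_analyzer; if it does not, it checks whether the field defines analyzer and uses that; only when neither is set does it fall back to the default analyzer. Two details refine this order: an analyzer passed in the query itself overrides everything else, and a default_search analyzer in the index settings, if defined, is checked before the field's analyzer.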
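
The lookup order above can be illustrated with a small mapping. This is a minimal sketch and not part of the original demo: the index name analyzer_demo and its field names are made up for illustration, and only built-in analyzers are used so that no plugin is required.

```
# Hypothetical index, for illustration only
PUT analyzer_demo
{
  "settings": {
    "analysis": {
      "analyzer": {
        "default": { "type": "simple" }
      }
    }
  },
  "mappings": {
    "properties": {
      "both_set":   { "type": "text", "analyzer": "whitespace", "search_analyzer": "standard" },
      "index_only": { "type": "text", "analyzer": "whitespace" },
      "neither":    { "type": "text" }
    }
  }
}
```

With this mapping, both_set is indexed with whitespace and searched with standard, index_only uses whitespace in both phases, and neither falls back to the index default (simple here, or standard if the default entry is left out).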

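II. How Elasticsearch uses analyzers
1. When a document is indexed, each text field is analyzed and the resulting tokens are written to the inverted index. This is where the analyzer named by analyzer is applied.
2. When a query runs, the input for a text field is analyzed first and the resulting tokens are then looked up in the inverted index. This is where the analyzer named by search_analyzer is applied.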
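
To see what each phase produces, the _analyze API returns the tokens emitted for a piece of text. The requests below are a minimal sketch, assuming the IK analysis plugin is installed (it provides the ik_max_word and ik_smart tokenizers); the sample text is arbitrary.

```
# Fine-grained segmentation, a common choice for index time
GET _analyze
{
  "tokenizer": "ik_max_word",
  "text": "中华人民共和国国歌"
}

# Coarse-grained segmentation, a common choice for search time
GET _analyze
{
  "tokenizer": "ik_smart",
  "text": "中华人民共和国国歌"
}
```

ik_max_word performs the finest-grained split and ik_smart the coarsest, so comparing the two token lists shows why one might index with the former and search with the latter.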

III. Demo (Chinese + pinyin analyzers; requires the IK and pinyin analysis plugins)

PUT order
{
   "mappings" : {
      "properties" : {
      
        "fullTitle" : {
          "type" : "text",
          "fields" : {
            "suggest" : {
              "type" : "completion",
              "analyzer" : "ik_smart_pinyin",
              "preserve_separators" : true,
              "preserve_position_increments" : true,
              "max_input_length" : 50
            }
          },
          
          "analyzer" : "ik_max_word_pinyin",
          "search_analyzer": "ik_smart_pinyin", 
          "store": true
        },
        "id" : {
          "type" : "long"
        },
        "longTitle" : {
          "type" : "text",
          "copy_to" : [
            "fullTitle"
          ],
          "analyzer" : "ik_max_word_pinyin",
          "search_analyzer": "ik_smart_pinyin"
        },
        "title" : {
          "type" : "text",
          "copy_to" : [
            "fullTitle"
          ],
          "analyzer" : "ik_max_word_pinyin",
          "search_analyzer": "ik_smart_pinyin"
        },
        "creator":{
          "type": "text",
          "fields": {
            "mykeyword":{
              "type":"keyword"
            }
          }
        },
        "price":{
          "type": "float"
        }
      }
    },
   "settings" : {
      "index" : {
        "refresh_interval" : "5s",
        "number_of_shards" : "5",
        "number_of_replicas": "1"
      },
      "analysis": {
        "analyzer": {
          "ik_smart_pinyin": {
            "type": "custom",
            "tokenizer": "ik_smart",
            "filter": ["my_pinyin"]
          },
          "ik_max_word_pinyin": {
            "type": "custom",
            "tokenizer": "ik_max_word",
            "filter": ["my_pinyin"]
          }
        },
        "filter": {
          "my_pinyin": {
            "type": "pinyin",
            "keep_separate_first_letter": false,
            "keep_full_pinyin": true,
            "keep_original": true,
            "limit_first_letter_length": 16,
            "lowercase": true,
            "remove_duplicated_term": true,
            "keep_joined_full_pinyin": true
          }
        }
      }
    }
}
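
Once the index exists, requests like the following can be used to exercise it. This is a sketch under assumptions: the IK and pinyin plugins are installed, the document values are invented for illustration, and the exact tokens and hits depend on the plugin versions, so no output is shown.

```
# Index a sample document (field values are made up)
PUT order/_doc/1?refresh=true
{
  "id": 1,
  "title": "刘德华",
  "longTitle": "刘德华演唱会门票",
  "creator": "admin",
  "price": 99.5
}

# Tokens produced at index time for the title field (its analyzer is ik_max_word_pinyin)
GET order/_analyze
{
  "field": "title",
  "text": "刘德华"
}

# Tokens produced at search time (the search analyzer is named explicitly)
GET order/_analyze
{
  "analyzer": "ik_smart_pinyin",
  "text": "刘德华"
}

# Full-text search with pinyin input; the query text goes through ik_smart_pinyin
GET order/_search
{
  "query": {
    "match": { "title": "liudehua" }
  }
}

# Exact match on the keyword sub-field of creator
GET order/_search
{
  "query": {
    "term": { "creator.mykeyword": "admin" }
  }
}
```

Comparing the two _analyze responses is a quick way to check that the index-time and search-time token streams overlap enough for the match query to find the document.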

 

posted @ 2022-05-20 16:14  b̶i̶n̶g̶.