Elasticsearch Study Notes
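I. Specifying analyzers
1. Elasticsearch supports using one analyzer at index time and a different one at search time.
2. For index time, set the analyzer parameter on the field mapping. When indexing a document, ES first checks whether the field defines analyzer; if it does not, ES falls back to its default analyzer (the built-in standard analyzer, unless the index settings define a different default).
3. For search time, set the search_analyzer parameter on the field mapping. When running a query, ES first checks whether the field defines search_analyzer; if it does not, ES checks whether the field defines analyzer and uses that; only when neither is set does ES fall back to its default analyzer.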
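The lookup order described above can be sketched as two small Python functions. This is only a simplified model of these notes, not Elasticsearch's actual implementation: it ignores an analyzer passed in the query itself and any index-level default settings, and the function names and the DEFAULT_ANALYZER constant are hypothetical.

```python
# Simplified model of the fallback order described in these notes.
# Assumption: "standard" stands in for "the ES default analyzer".
DEFAULT_ANALYZER = "standard"


def resolve_index_analyzer(field_mapping: dict) -> str:
    """Index time: use the field's analyzer, else the default."""
    return field_mapping.get("analyzer") or DEFAULT_ANALYZER


def resolve_search_analyzer(field_mapping: dict) -> str:
    """Search time: search_analyzer, then analyzer, then the default."""
    return (
        field_mapping.get("search_analyzer")
        or field_mapping.get("analyzer")
        or DEFAULT_ANALYZER
    )


title = {
    "type": "text",
    "analyzer": "ik_max_word_pinyin",
    "search_analyzer": "ik_smart_pinyin",
}
creator = {"type": "text"}

print(resolve_index_analyzer(title))     # ik_max_word_pinyin
print(resolve_search_analyzer(title))    # ik_smart_pinyin
print(resolve_search_analyzer(creator))  # standard
```

A field that sets only analyzer resolves to that same analyzer on both sides, which is why search_analyzer is needed only when the two should differ.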
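II. When ES uses an analyzer
1. When a document is inserted, each text field is analyzed and the resulting tokens are written to the inverted index; this is where the analyzer specified by analyzer may be used.
2. At query time, the input for a text field is analyzed first and the resulting tokens are then looked up in the inverted index; this is where the analyzer specified by search_analyzer may be used.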
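To make the two steps above concrete, here is a toy inverted index in Python with separate index-time and search-time analyzers. The two analyzer functions are invented for illustration (a prefix-emitting analyzer and a plain lowercase splitter); they are not how the IK or pinyin analyzers work, and the class is not an ES API.

```python
from collections import defaultdict


def toy_index_analyzer(text: str) -> list:
    """Index time: emit every prefix of every lowercased word."""
    tokens = []
    for word in text.lower().split():
        tokens.extend(word[:n] for n in range(1, len(word) + 1))
    return tokens


def toy_search_analyzer(text: str) -> list:
    """Search time: emit whole lowercased words only."""
    return text.lower().split()


class ToyInvertedIndex:
    def __init__(self):
        # token -> set of document ids containing that token
        self.postings = defaultdict(set)

    def add(self, doc_id: int, text: str) -> None:
        for token in toy_index_analyzer(text):
            self.postings[token].add(doc_id)

    def search(self, query: str) -> set:
        hits = set()
        for token in toy_search_analyzer(query):
            hits |= self.postings.get(token, set())
        return hits


index = ToyInvertedIndex()
index.add(1, "Elasticsearch guide")
index.add(2, "Elastic stack")

print(sorted(index.search("elastic")))  # [1, 2]
print(sorted(index.search("gui")))      # [1]
```

The index side deliberately produces more tokens than the search side. The demo mapping in these notes follows the same pattern: ik_max_word at index time and ik_smart at search time.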
III. Demo (using Chinese + pinyin analyzers)
PUT order
{
  "mappings": {
    "properties": {
      "fullTitle": {
        "type": "text",
        "fields": {
          "suggest": {
            "type": "completion",
            "analyzer": "ik_smart_pinyin",
            "preserve_separators": true,
            "preserve_position_increments": true,
            "max_input_length": 50
          }
        },
        "analyzer": "ik_max_word_pinyin",
        "search_analyzer": "ik_smart_pinyin",
        "store": true
      },
      "id": {
        "type": "long"
      },
      "longTitle": {
        "type": "text",
        "copy_to": [
          "fullTitle"
        ],
        "analyzer": "ik_max_word_pinyin",
        "search_analyzer": "ik_smart_pinyin"
      },
      "title": {
        "type": "text",
        "copy_to": [
          "fullTitle"
        ],
        "analyzer": "ik_max_word_pinyin",
        "search_analyzer": "ik_smart_pinyin"
      },
      "creator": {
        "type": "text",
        "fields": {
          "mykeyword": {
            "type": "keyword"
          }
        }
      },
      "price": {
        "type": "float"
      }
    }
  },
  "settings": {
    "index": {
      "refresh_interval": "5s",
      "number_of_shards": "5",
      "number_of_replicas": "1"
    },
    "analysis": {
      "analyzer": {
        "ik_smart_pinyin": {
          "type": "custom",
          "tokenizer": "ik_smart",
          "filter": ["my_pinyin"]
        },
        "ik_max_word_pinyin": {
          "type": "custom",
          "tokenizer": "ik_max_word",
          "filter": ["my_pinyin"]
        }
      },
      "filter": {
        "my_pinyin": {
          "type": "pinyin",
          "keep_separate_first_letter": false,
          "keep_full_pinyin": true,
          "keep_original": true,
          "limit_first_letter_length": 16,
          "lowercase": true,
          "remove_duplicated_term": true,
          "keep_joined_full_pinyin": true
        }
      }
    }
  }
}
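
One way to check the demo index is to compare the tokens each custom analyzer produces and then run a match query. The Python sketch below only builds and prints the request bodies for the _analyze and _search endpoints; it sends nothing. It assumes the order index above exists and that the IK and pinyin analysis plugins are installed. The helper names and the sample text "手机" are made up for illustration.

```python
import json

INDEX = "order"  # index name from the demo above


def analyze_request(analyzer: str, text: str) -> tuple:
    """Body for the _analyze API: which tokens does this analyzer emit?"""
    return "GET", f"/{INDEX}/_analyze", {"analyzer": analyzer, "text": text}


def match_request(field: str, text: str) -> tuple:
    """Body for a match query; the field's search analyzer is applied to text."""
    return "GET", f"/{INDEX}/_search", {"query": {"match": {field: text}}}


requests_to_try = [
    analyze_request("ik_max_word_pinyin", "手机"),  # index-time analyzer
    analyze_request("ik_smart_pinyin", "手机"),     # search-time analyzer
    match_request("title", "shouji"),               # pinyin input as the query
]

for method, path, body in requests_to_try:
    print(method, path)
    print(json.dumps(body, ensure_ascii=False, indent=2))
```

The printed method, path, and body can be pasted into a console or sent with any HTTP client. Comparing the two _analyze responses shows how the index-time and search-time token sets differ for the same input.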