Elasticsearch 5.x: Introduction to Query Suggestions, Suggesters, and a Java API Implementation
Reference: http://www.cnblogs.com/leeSmall/p/9206646.html
Reference (key): https://elasticsearch.cn/article/142
Reference (official docs): https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-completion.html
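I. Introduction to Query Suggestions
1. What are query suggestions?
Query suggestions give users a better search experience. They mainly cover two things: spell checking, and automatic suggestion of query terms (autocomplete).
Spell checking, as shown in the figure:
Automatic suggestion of query terms (autocomplete):
2. The query suggestion API in ES
Query suggestions also use the _search endpoint. In the DSL, the suggest node defines the suggestions you want.
Example 1: defining a single suggestion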
POST twitter/_search
{
  "query": {
    "match": { "message": "tring out Elasticsearch" }
  },
  "suggest": {
    "my-suggestion": {
      "text": "tring out Elasticsearch",
      "term": {
        "field": "message"
      }
    }
  }
}
In this request, suggest defines the suggestion query, my-suggestion is the name given to one suggestion, text is the query text, term selects the term suggester, and field names the field from which suggested terms are drawn.
Example 2: defining several suggestions
POST _search
{
  "suggest": {
    "my-suggest-1": {
      "text": "tring out Elasticsearch",
      "term": { "field": "message" }
    },
    "my-suggest-2": {
      "text": "kmichy",
      "term": { "field": "user" }
    }
  }
}
Example 3: several suggestions can share one global query text
POST _search
{
  "suggest": {
    "text": "tring out Elasticsearch",
    "my-suggest-1": {
      "term": { "field": "message" }
    },
    "my-suggest-2": {
      "term": { "field": "user" }
    }
  }
}
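As a rough sketch of how the body in Example 3 could be assembled in code, the Java snippet below builds it from one global text and a map of suggestion names to fields. The class SuggestBodySketch and its methods are hypothetical names introduced for this illustration, not part of any Elasticsearch client, and the escaping only handles quotes and backslashes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: builds a suggest body with one global text and several term suggesters. */
final class SuggestBodySketch {

    /** Wraps a value in double quotes, escaping only backslashes and quotes (enough for this sketch). */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /**
     * @param globalText  text shared by every suggestion
     * @param nameToField suggestion name mapped to the field that supplies term suggestions
     */
    static String build(String globalText, Map<String, String> nameToField) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"suggest\":{\"text\":").append(quote(globalText));
        for (Map.Entry<String, String> e : nameToField.entrySet()) {
            sb.append(',').append(quote(e.getKey()))
              .append(":{\"term\":{\"field\":").append(quote(e.getValue())).append("}}");
        }
        return sb.append("}}").toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the suggestions in insertion order.
        Map<String, String> suggestions = new LinkedHashMap<>();
        suggestions.put("my-suggest-1", "message");
        suggestions.put("my-suggest-2", "user");
        System.out.println(build("tring out Elasticsearch", suggestions));
    }
}
```

The printed string has the same structure as Example 3 and could be sent to the _search endpoint with any HTTP client.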
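II. Introduction to Suggesters
1. Term suggester
The term suggester analyzes the input text into tokens and runs a fuzzy lookup for each token to provide term suggestions. By default it offers no suggestions for tokens that already exist in the index; for tokens that do not exist, it ranks the fuzzy matches and returns a limited number of them as suggestions.
Common suggestion options:
Example 1: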
POST twitter/_search
{
  "query": {
    "match": { "message": "tring out Elasticsearch" }
  },
  "suggest": {
    "my-suggestion": {
      "text": "tring out Elasticsearch",
      "term": {
        "field": "message"
      }
    }
  }
}
Here my-suggestion is the suggestion name, text is the query text, term selects the term suggester, and field is the field that supplies the suggested terms.
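To make the term suggester's behavior more concrete, here is a small self-contained sketch of the idea described above: split the text into tokens, skip tokens that are already indexed, and for the rest rank indexed terms by edit distance and then by frequency. This is only an illustration and not Elasticsearch's actual implementation; TermSuggestSketch, the maxEdits and size parameters, and the toy term-frequency map are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of term suggestion by fuzzy matching. */
final class TermSuggestSketch {

    /** Classic Levenshtein distance: inserts, deletes, and substitutions each cost 1. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Returns, per lowercased token, an empty list if the token is indexed; otherwise up to
     * `size` indexed terms within `maxEdits`, closest first, then most frequent first.
     */
    static Map<String, List<String>> suggest(String text, Map<String, Integer> termFreq,
                                             int maxEdits, int size) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String token : text.toLowerCase().trim().split("\\s+")) {
            if (token.isEmpty()) continue;
            List<String> options = new ArrayList<>();
            if (!termFreq.containsKey(token)) {
                for (String term : termFreq.keySet()) {
                    if (editDistance(token, term) <= maxEdits) options.add(term);
                }
                options.sort(Comparator
                        .comparingInt((String t) -> editDistance(token, t))
                        .thenComparingInt(t -> -termFreq.get(t))
                        .thenComparing(Comparator.<String>naturalOrder()));
                if (options.size() > size) options = new ArrayList<>(options.subList(0, size));
            }
            result.put(token, options);
        }
        return result;
    }

    public static void main(String[] args) {
        // Toy "index": term -> frequency (made-up values for the illustration).
        Map<String, Integer> termFreq = new HashMap<>();
        termFreq.put("trying", 2);
        termFreq.put("out", 2);
        termFreq.put("elasticsearch", 2);
        System.out.println(suggest("tring out Elasticsearch", termFreq, 2, 5));
    }
}
```

With this toy index only the misspelled token gets options, which mirrors the default behavior of suggesting only for terms that are missing from the index.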
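2. Phrase suggester
The phrase suggester builds on the term suggester and additionally weighs the relationships among multiple terms, such as whether they occur together in the indexed source text, how close they are to each other, and how frequent they are.
Example:
POST twitter/_search
{
  "query": {
    "match": { "message": "tring out Elasticsearch" }
  },
  "suggest": {
    "my-suggestion": {
      "text": "tring out Elasticsearch",
      "phrase": {
        "field": "message"
      }
    }
  }
}
Result: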
{
  "took": 30,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 2,
    "max_score": 1.113083,
    "hits": [
      {
        "_index": "twitter",
        "_type": "tweet",
        "_id": "4",
        "_score": 1.113083,
        "_source": {
          "user": "kimchy",
          "postDate": "2018-07-23T07:29:57.653Z",
          "message": "trying out Elasticsearch"
        }
      },
      {
        "_index": "twitter",
        "_type": "tweet",
        "_id": "7",
        "_score": 0.98382175,
        "_source": {
          "user": "yuchen20",
          "postDate": "2018-07-23T08:12:05.604Z",
          "message": "trying out Elasticsearch"
        }
      }
    ]
  },
  "suggest": {
    "my-suggestion": [
      {
        "text": "tring out Elasticsearch",
        "offset": 0,
        "length": 23,
        "options": [
          {
            "text": "trying out elasticsearch",
            "score": 0.5118434
          }
        ]
      }
    ]
  }
}
The suggest section at the end of the response carries the suggestions.
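The idea that a phrase suggestion weighs how terms occur together can be sketched very roughly as follows: take candidate corrections for each position, form every combination, and prefer the phrase whose adjacent word pairs appear most often in the indexed text. This is a simplified stand-in and not the scoring Elasticsearch uses; PhraseSuggestSketch, the log(1 + count) score, and the toy bigram counts are all assumptions for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration: pick the candidate phrase with the best word-pair support. */
final class PhraseSuggestSketch {

    /** Scores a phrase by summing log(1 + count) over its adjacent word pairs. */
    static double score(List<String> words, Map<String, Integer> bigramCounts) {
        double total = 0.0;
        for (int i = 0; i + 1 < words.size(); i++) {
            String bigram = words.get(i) + " " + words.get(i + 1);
            total += Math.log(1 + bigramCounts.getOrDefault(bigram, 0));
        }
        return total;
    }

    /** Enumerates every combination of per-position candidates and returns the best-scoring one. */
    static String bestPhrase(List<List<String>> candidates, Map<String, Integer> bigramCounts) {
        List<List<String>> phrases = new ArrayList<>();
        phrases.add(new ArrayList<>());
        for (List<String> options : candidates) {
            List<List<String>> extended = new ArrayList<>();
            for (List<String> prefix : phrases) {
                for (String option : options) {
                    List<String> next = new ArrayList<>(prefix);
                    next.add(option);
                    extended.add(next);
                }
            }
            phrases = extended;
        }
        List<String> best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (List<String> phrase : phrases) {
            double s = score(phrase, bigramCounts);
            if (s > bestScore) {   // ties keep the earlier combination
                bestScore = s;
                best = phrase;
            }
        }
        return best == null ? "" : String.join(" ", best);
    }

    public static void main(String[] args) {
        // Candidates per position: the original token plus possible corrections (made up).
        List<List<String>> candidates = Arrays.asList(
                Arrays.asList("tring", "trying", "string"),
                Arrays.asList("out"),
                Arrays.asList("elasticsearch"));
        // Toy counts of adjacent word pairs in the indexed messages (made up).
        Map<String, Integer> bigramCounts = new HashMap<>();
        bigramCounts.put("trying out", 2);
        bigramCounts.put("out elasticsearch", 2);
        System.out.println(bestPhrase(candidates, bigramCounts));
    }
}
```

Enumerating all combinations is exponential in the number of tokens, so this only suits short inputs; it is meant to show why "trying" beats "string" once neighboring words are taken into account.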
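3. Completion suggester (autocomplete)
This suggester is designed for the autocomplete scenario. Here every character the user types triggers an immediate request to the backend to look up matches, so when users type quickly the backend has to respond very fast. For that reason it uses a different data structure from the two suggesters above: instead of relying on the inverted index, the analyzed data is encoded into an FST (finite state transducer) and stored alongside the index. For an index in the open state, ES loads the whole FST into memory, which makes prefix lookups extremely fast. However, an FST can only serve prefix lookups, and that is the main limitation of the Completion Suggester.
Official documentation: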
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-completion.html
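To illustrate what a prefix-only lookup means in practice, the sketch below keeps the inputs in a sorted in-memory set and walks the contiguous range of entries that start with the prefix. A sorted set is used here purely as a stand-in for the FST; PrefixLookupSketch is a hypothetical class, and lowercasing the input is an assumption that loosely mimics an analyzer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical stand-in for a prefix structure: a sorted set instead of an FST. */
final class PrefixLookupSketch {
    private final TreeSet<String> inputs = new TreeSet<>();

    /** Stores one suggestion input, lowercased so lookups are case-insensitive. */
    void add(String input) {
        inputs.add(input.toLowerCase());
    }

    /** Returns every stored input that starts with the prefix, in sorted order. */
    List<String> complete(String prefix) {
        String from = prefix.toLowerCase();
        List<String> matches = new ArrayList<>();
        // All strings sharing a prefix are adjacent in sorted order, starting at the prefix itself.
        for (String candidate : inputs.tailSet(from, true)) {
            if (!candidate.startsWith(from)) break;
            matches.add(candidate);
        }
        return matches;
    }

    public static void main(String[] args) {
        PrefixLookupSketch index = new PrefixLookupSketch();
        index.add("Nevermind");
        index.add("Nirvana");
        index.add("lucene solr");
        index.add("lucene so cool");
        System.out.println(index.complete("nir"));
        System.out.println(index.complete("lucene s"));
        // A query for text in the middle of an input, such as "solr", finds nothing.
        System.out.println(index.complete("solr"));
    }
}
```

The last lookup shows the limitation mentioned above: text in the middle of an input cannot be matched, only its beginning.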
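Example 1:
To use autocomplete, the field that supplies completion suggestions needs a special design: its type must be completion. Set up the mapping first:
PUT index/
{
  "mappings": {
    "completion": {
      "properties": {
        "title": {
          "type": "text",
          "analyzer": "ik_smart"
        },
        "title_suggest": {
          "type": "completion",
          "analyzer": "ik_smart",
          "search_analyzer": "ik_smart"
        }
      }
    }
  }
}
The key part is title_suggest: this is the field we will later query for completions. Its type must be set to completion, and the analyzer should be chosen to fit your use case.
Index the data: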
POST /index/completion/_bulk
{ "index": {} }
{ "title": "背景天安门广场大学", "title_suggest": "背景天安门广场大学" }
{ "index": {} }
{ "title": "北京天安门", "title_suggest": "北京天安门" }
{ "index": {} }
{ "title": "北京鸟巢", "title_suggest": "北京鸟巢" }
{ "index": {} }
{ "title": "奥林匹克公园", "title_suggest": "奥林匹克公园" }
{ "index": {} }
{ "title": "奥林匹克森林公园", "title_suggest": "奥林匹克森林公园" }
{ "index": {} }
{ "title": "北京奥林匹克公园", "title_suggest": "北京奥林匹克公园" }
{ "index": {} }
{ "title": "北京奥林匹克公园", "title_suggest": { "input": "我爱中国", "weight": 100 } }
When indexing, you can add a weight to the suggest field to raise its position in the ranking.
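The _bulk body is newline-delimited JSON: each action line and each source line sits on its own line, and the final line must also end with a newline. As a minimal sketch of generating such a body, the hypothetical helper below writes one action line and one source line per title, reusing the title as the suggest input as in the data above. BulkBodySketch is an illustrative name, and its escaping only covers quotes and backslashes.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: builds a newline-delimited _bulk body for title/title_suggest documents. */
final class BulkBodySketch {

    /** Wraps a value in double quotes, escaping only backslashes and quotes (enough for this sketch). */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Emits an action line and a source line per title; every line ends with a newline. */
    static String build(List<String> titles) {
        StringBuilder sb = new StringBuilder();
        for (String title : titles) {
            sb.append("{\"index\":{}}\n");
            sb.append("{\"title\":").append(quote(title))
              .append(",\"title_suggest\":").append(quote(title)).append("}\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Titles are taken from the command line, one document per argument.
        System.out.print(build(Arrays.asList(args)));
    }
}
```

Saved as BulkBodySketch.java, it could be run for example as java BulkBodySketch.java 北京天安门 北京鸟巢, and the output sent as the body of POST /index/completion/_bulk.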
Search for completions:
POST /index/completion/_search
{
  "size": 0,
  "suggest": {
    "blog-suggest": {
      "prefix": "北京",
      "completion": {
        "field": "title_suggest"
      }
    }
  }
}
Result:
{
  "took": 3,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": { "total": 0, "max_score": 0, "hits": [] },
  "suggest": {
    "blog-suggest": [
      {
        "text": "北京",
        "offset": 0,
        "length": 2,
        "options": [
          {
            "text": "北京天安门",
            "_index": "index",
            "_type": "completion",
            "_id": "AWSRo_hn9K_aupETR6FR",
            "_score": 1,
            "_source": { "title": "北京天安门", "title_suggest": "北京天安门" }
          },
          {
            "text": "北京奥林匹克公园",
            "_index": "index",
            "_type": "completion",
            "_id": "AWSRo_hn9K_aupETR6FV",
            "_score": 1,
            "_source": { "title": "北京奥林匹克公园", "title_suggest": "北京奥林匹克公园" }
          },
          {
            "text": "北京鸟巢",
            "_index": "index",
            "_type": "completion",
            "_id": "AWSRo_hn9K_aupETR6FS",
            "_score": 1,
            "_source": { "title": "北京鸟巢", "title_suggest": "北京鸟巢" }
          }
        ]
      }
    ]
  }
}
Example 2:
Create the mapping:
PUT music
{
  "mappings": {
    "docc": {
      "properties": {
        "suggest": { "type": "completion" },
        "title": { "type": "keyword" }
      }
    }
  }
}
input specifies the input terms; weight specifies the ranking value (optional).
PUT music/docc/1?refresh
{
  "suggest": {
    "input": [ "Nevermind", "Nirvana" ],
    "weight": 34
  }
}
Specify a different weight for each input:
PUT music/docc/1?refresh
{
  "suggest": [
    { "input": "Nevermind", "weight": 10 },
    { "input": "Nirvana", "weight": 3 }
  ]
}
Add a document with duplicate inputs:
PUT music/docc/2?refresh
{
  "suggest": {
    "input": [ "Nevermind", "Nirvana" ],
    "weight": 20
  }
}
Query suggestions by prefix:
POST music/_search?pretty
{
  "suggest": {
    "song-suggest": {
      "prefix": "nir",
      "completion": {
        "field": "suggest"
      }
    }
  }
}
To remove duplicates from the suggestion results, set "skip_duplicates": true. This option is supported in 6.x but not in 5.x.
POST music/_search?pretty
{
  "suggest": {
    "song-suggest": {
      "prefix": "nir",
      "completion": {
        "field": "suggest",
        "skip_duplicates": true
      }
    }
  }
}
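Since skip_duplicates is not available on 5.x, one option there is to deduplicate on the client after the response arrives. The sketch below shows one way this could look, assuming the highest-weighted copy of each suggestion text is the one to keep. SkipDuplicatesSketch and its Option class are hypothetical types for this illustration and are unrelated to the Elasticsearch client's own Option classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical client-side deduplication of completion suggestions. */
final class SkipDuplicatesSketch {

    /** One suggestion option: the suggested text and the weight (score) it was returned with. */
    static final class Option {
        final String text;
        final int weight;

        Option(String text, int weight) {
            this.text = text;
            this.weight = weight;
        }
    }

    /** Keeps only the highest-weighted option per distinct text, sorted by weight descending. */
    static List<Option> dedupe(List<Option> options) {
        Map<String, Option> best = new LinkedHashMap<>();
        for (Option o : options) {
            Option seen = best.get(o.text);
            if (seen == null || o.weight > seen.weight) {
                best.put(o.text, o);
            }
        }
        List<Option> result = new ArrayList<>(best.values());
        result.sort((a, b) -> Integer.compare(b.weight, a.weight));
        return result;
    }

    public static void main(String[] args) {
        // Two documents both suggest "Nirvana", with the weights used in the examples above.
        List<Option> raw = Arrays.asList(new Option("Nirvana", 3), new Option("Nirvana", 20));
        for (Option o : dedupe(raw)) {
            System.out.println(o.text + " (" + o.weight + ")");
        }
    }
}
```

In a real client the list would be filled from the options of the song-suggest entry before being passed to dedupe.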
Suggestion documents can also store phrases:
PUT music/docc/3?refresh
{
  "suggest": {
    "input": [ "lucene solr", "lucene so cool", "lucene elasticsearch" ],
    "weight": 20
  }
}
PUT music/docc/4?refresh
{
  "suggest": {
    "input": [ "lucene solr cool", "lucene elasticsearch" ],
    "weight": 10
  }
}
Query:
POST music/_search?pretty
{
  "suggest": {
    "song-suggest": {
      "prefix": "lucene s",
      "completion": {
        "field": "suggest"
      }
    }
  }
}
III. Java API
## Elasticsearch 5.x: Introduction to the Query Suggestion Java API and Suggesters
Reference: http://www.mamicode.com/info-detail-2347270.html
package com.youlan.es.util;

import java.util.concurrent.ExecutionException;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.rest.RestStatus;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.suggest.*;
import org.elasticsearch.search.suggest.completion.CompletionSuggestion;
import org.elasticsearch.search.suggest.phrase.PhraseSuggestion;
import org.elasticsearch.search.suggest.term.TermSuggestion;

public class SuggestDemo {

    private static final Logger logger = LogManager.getRootLogger();

    // Spell checking (English)
    public static void termSuggest(TransportClient client) {
        // 1. Create the search request
        // SearchRequest searchRequest = new SearchRequest();
        SearchRequest searchRequest = new SearchRequest("twitter");

        // 2. Build the request body with SearchSourceBuilder; its methods cover
        //    all the different kinds of queries, so they are worth reading through.
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
        sourceBuilder.size(0);

        // Term suggestion; the text is what the user typed into the search box
        SuggestionBuilder<?> termSuggestionBuilder =
                SuggestBuilders.termSuggestion("message").text("tring out Elticsearch");
        SuggestBuilder suggestBuilder = new SuggestBuilder();
        suggestBuilder.addSuggestion("suggest_user", termSuggestionBuilder);
        sourceBuilder.suggest(suggestBuilder);
        searchRequest.source(sourceBuilder);

        try {
            // 3. Send the request
            SearchResponse searchResponse = client.search(searchRequest).get();

            // 4. Handle the response: check the status, then read the suggestions
            if (RestStatus.OK.equals(searchResponse.status())) {
                Suggest suggest = searchResponse.getSuggest();
                TermSuggestion termSuggestion = suggest.getSuggestion("suggest_user");
                for (TermSuggestion.Entry entry : termSuggestion.getEntries()) {
                    logger.info("text: " + entry.getText().string());
                    for (TermSuggestion.Entry.Option option : entry) {
                        String suggestText = option.getText().string(); // suggested text
                        logger.info("   suggest option : " + suggestText);
                    }
                }
            }
        } catch (InterruptedException | ExecutionException e) {
            logger.error(e);
        }
        /*
          "suggest": {
            "my-suggestion": [
              { "text": "tring", "offset": 0, "length": 5,
                "options": [ { "text": "trying", "score": 0.8, "freq": 2 } ] },
              { "text": "out", "offset": 6, "length": 3, "options": [] },
              { "text": "elasticsearch", "offset": 10, "length": 13, "options": [] }
            ]
          }
        */
    }

    // Phrase suggestion
    public static void phraseSuggest(TransportClient client) {
        // 1. Create the search request
        SearchRequest searchRequest = new SearchRequest("twitter");

        // 2. Build the request body
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
        sourceBuilder.size(0);
        SuggestionBuilder<?> phraseSuggestBuilder =
                SuggestBuilders.phraseSuggestion("message").text("tring out");
        SuggestBuilder suggestBuilder = new SuggestBuilder();
        suggestBuilder.addSuggestion("my-suggestion", phraseSuggestBuilder);
        sourceBuilder.suggest(suggestBuilder);
        searchRequest.source(sourceBuilder);

        try {
            // 3. Send the request
            SearchResponse searchResponse = client.search(searchRequest).get();

            // 4. Handle the response: check the status, then read the suggestions
            if (RestStatus.OK.equals(searchResponse.status())) {
                Suggest suggest = searchResponse.getSuggest();
                PhraseSuggestion phraseSuggestion = suggest.getSuggestion("my-suggestion");
                for (PhraseSuggestion.Entry entry : phraseSuggestion) {
                    logger.info("text: " + entry.getText().string());
                    for (PhraseSuggestion.Entry.Option option : entry) {
                        String suggestText = option.getText().string();
                        logger.info("   suggest option : " + suggestText);
                    }
                }
            }
        } catch (InterruptedException e) {
            logger.error("Request failed: " + e);
        } catch (ExecutionException e) {
            logger.error(e);
        }
    }

    // Autocomplete
    public static void completionSuggester(TransportClient client) {
        // 1. Create the search request
        // SearchRequest searchRequest = new SearchRequest();
        SearchRequest searchRequest = new SearchRequest("music");

        // 2. Build the request body with SearchSourceBuilder
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
        sourceBuilder.size(0);

        // Completion suggestion, equivalent to:
        /*
          POST music/_search?pretty
          {
            "suggest": {
              "song-suggest": {
                "prefix": "lucene s",
                "completion": { "field": "suggest", "skip_duplicates": true }
              }
            }
          }
        */
        SuggestionBuilder<?> completionSuggestionBuilder =
                SuggestBuilders.completionSuggestion("suggest").prefix("lucene s");
        // .skipDuplicates(true) removes duplicates on 6.x
        SuggestBuilder suggestBuilder = new SuggestBuilder();
        suggestBuilder.addSuggestion("song-suggest", completionSuggestionBuilder);
        sourceBuilder.suggest(suggestBuilder);
        searchRequest.source(sourceBuilder);

        try {
            // 3. Send the request
            SearchResponse searchResponse = client.search(searchRequest).get();

            // 4. Handle the response: check the status, then read the suggestions
            if (RestStatus.OK.equals(searchResponse.status())) {
                Suggest suggest = searchResponse.getSuggest();
                CompletionSuggestion completionSuggestion = suggest.getSuggestion("song-suggest");
                for (CompletionSuggestion.Entry entry : completionSuggestion.getEntries()) {
                    logger.info("text: " + entry.getText().string());
                    for (CompletionSuggestion.Entry.Option option : entry) {
                        String suggestText = option.getText().string();
                        logger.info("   suggest option : " + suggestText);
                    }
                }
            }
        } catch (InterruptedException | ExecutionException e) {
            logger.error(e);
        }
        /*
          Result:
          {
            "took": 7,
            "timed_out": false,
            "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
            "hits": { "total": 0, "max_score": 0, "hits": [] },
            "suggest": {
              "song-suggest": [
                {
                  "text": "lucene s",
                  "offset": 0,
                  "length": 8,
                  "options": [
                    {
                      "text": "lucene so cool",
                      "_index": "music", "_type": "docc", "_id": "3", "_score": 20,
                      "_source": {
                        "suggest": {
                          "input": [ "lucene solr", "lucene so cool", "lucene elasticsearch" ],
                          "weight": 20
                        }
                      }
                    },
                    {
                      "text": "lucene solr cool",
                      "_index": "music", "_type": "docc", "_id": "4", "_score": 10,
                      "_source": {
                        "suggest": {
                          "input": [ "lucene solr cool", "lucene elasticsearch" ],
                          "weight": 10
                        }
                      }
                    }
                  ]
                }
              ]
            }
          }
        */
    }

    public static void main(String[] args) {
        // EsClient is a helper class in the same package (not shown here) that returns a connected TransportClient.
        EsClient esClient = new EsClient();
        try (TransportClient client = esClient.getConnection()) {
            logger.info("---------------- Spell checking: termSuggest ----------------------");
            termSuggest(client);
            logger.info("------------------ Phrase suggestion: phraseSuggest --------------------");
            phraseSuggest(client);
            logger.info("------------------ Autocomplete: completionSuggester --------------------");
            completionSuggester(client);
        } catch (Exception e) {
            logger.error(e);
        }
    }
}