Elasticsearch Search
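1 DSL Search
DSL (Domain Specific Language) is the JSON-based query syntax provided by Elasticsearch (ES). A search request carries a JSON body in this format, and different JSON structures express different search requirements.
1.1. Search all records with pagination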
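// Imports used by this example (assumes the Elasticsearch 6.x high-level REST client and JUnit 4)
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.junit.Test;
import java.text.SimpleDateFormat;
import java.util.Map;

// `client` is assumed to be a RestHighLevelClient field initialized elsewhere in the test class.
@Test
public void testSearchAll() throws Exception {
    // Search request for the xc_course index
    SearchRequest searchRequest = new SearchRequest("xc_course");
    searchRequest.types("doc");
    // Search source builder
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    // Pagination
    // Page number (1-based)
    int page = 1;
    // Number of records per page
    int size = 2;
    // Index of the first record to return
    int from = (page - 1) * size;
    searchSourceBuilder.from(from); // index of the first record, starting at 0
    searchSourceBuilder.size(size); // number of records per page
    // Search method
    searchSourceBuilder.query(QueryBuilders.matchAllQuery()); // match all documents
    // Source filtering: the first argument lists the fields to include, the second the fields to exclude
    searchSourceBuilder.fetchSource(new String[]{"name","studymodel","price","timestamp"}, new String[]{});
    // Attach the search source to the request
    searchRequest.source(searchSourceBuilder);
    // Execute the search (sends an HTTP request to ES)
    SearchResponse search = client.search(searchRequest);
    // Search results
    SearchHits hits = search.getHits();
    // Total number of matching records
    long totalHits = hits.getTotalHits();
    // Top matching documents
    SearchHit[] hits1 = hits.getHits();
    SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
    for (SearchHit hit : hits1) {
        // Primary key of the document
        String id = hit.getId();
        // Source document content
        Map<String, Object> sourceAsMap = hit.getSourceAsMap();
        System.out.println(sourceAsMap);
        // Date
        // Date timestamp = dateFormat.parse((String) sourceAsMap.get("timestamp"));
        // System.out.println(timestamp);
    }
}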
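The offset arithmetic used for pagination above can be checked in isolation. The following is a minimal sketch, assuming 1-based page numbers as in the test; `PageOffset` is a hypothetical helper introduced here for illustration and is not part of the Elasticsearch client.

```java
// Sketch only: PageOffset is a hypothetical helper, not an Elasticsearch class.
final class PageOffset {
    private PageOffset() {}

    /** Converts a 1-based page number into the 0-based "from" offset. */
    static int from(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        return (page - 1) * size;
    }

    public static void main(String[] args) {
        // With 2 records per page, print the offsets of pages 1 and 2.
        System.out.println(PageOffset.from(1, 2));
        System.out.println(PageOffset.from(2, 2));
    }
}
```

The result would be passed to `searchSourceBuilder.from(...)`, with `size` passed unchanged to `searchSourceBuilder.size(...)`.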
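1.2. Term Query
A term query is an exact-match query: the search keyword is matched as a whole against the terms in the index and is not analyzed (split into words) first.
Set the search method in 1.1 to:
QueryBuilders.termQuery("name","spring") // query by name: matches documents whose name field contains the term "spring"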
1.3. Exact match by id
Change the search method in 1.1 to:
QueryBuilders.termsQuery("_id", "1", "2") // match documents by primary key; "1" and "2" are placeholder ids
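1.4. match Query
1. Basic usage
A match query performs full-text search: the search string is first analyzed into terms, and each term is then looked up in the index.
The difference from a term query is that a match query analyzes the search keyword before searching, whereas a term query matches the keyword as a whole.
Send: POST http://localhost:9200/xc_course/doc/_search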
{ "query":
{ "match" :
{ "description" :
{ "query" : "spring开发", "operator" : "or" }
}
}
}
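operator: "or" means a document matches if at least one term appears in it; "and" means a document matches only if every term appears in it.
query: the search keywords. Multiple English words can be separated by spaces or half-width commas; Chinese keywords may be separated by commas or written without a separator.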
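The difference between the two operator values can be illustrated with a small in-memory model. This is only a sketch under simplifying assumptions: the document and the query are given as already-analyzed term collections, and `MatchOperatorSketch` is a hypothetical class, not how Elasticsearch evaluates a match query internally.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch only: a simplified model of the match query "operator" parameter.
final class MatchOperatorSketch {
    enum Operator { OR, AND }

    private MatchOperatorSketch() {}

    /** Returns true if the document terms satisfy the query terms under the given operator. */
    static boolean matches(Set<String> documentTerms, List<String> queryTerms, Operator operator) {
        if (operator == Operator.AND) {
            // "and": every query term must appear in the document
            return documentTerms.containsAll(queryTerms);
        }
        // "or": one matching query term is enough
        return queryTerms.stream().anyMatch(documentTerms::contains);
    }

    public static void main(String[] args) {
        Set<String> documentTerms = new HashSet<>(Arrays.asList("spring", "框架"));
        List<String> queryTerms = Arrays.asList("spring", "开发");
        System.out.println(matches(documentTerms, queryTerms, Operator.OR));
        System.out.println(matches(documentTerms, queryTerms, Operator.AND));
    }
}
```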
2. minimum_should_match
Specifies the proportion of query terms that a document must match:
{ "query":
{ "match" :
{ "description" :
{
"query" : "spring开发框架", "minimum_should_match": "80%"
}
}
}
}
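Setting "minimum_should_match": "80%" means that 80% of the query terms must match. "spring开发框架" is analyzed into three terms, so 3 * 0.8 = 2.4, which is rounded down to 2: at least two of the three terms must match in a document.
Java implementation:
// Set the search method
searchSourceBuilder.query(QueryBuilders.matchQuery("description","spring开发框架")
.minimumShouldMatch("80%"));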
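The percentage arithmetic described above can be reproduced with integer division. This is a minimal sketch, assuming a positive percentage and a known number of query terms; `MinimumShouldMatch.requiredTerms` is a hypothetical helper that only mirrors the rounding rule and is not the Elasticsearch implementation.

```java
// Sketch only: reproduces the "percentage, rounded down" rule for minimum_should_match.
final class MinimumShouldMatch {
    private MinimumShouldMatch() {}

    /** Number of query terms that must match, given the term count and a percentage (0 to 100). */
    static int requiredTerms(int termCount, int percent) {
        if (termCount < 0 || percent < 0 || percent > 100) {
            throw new IllegalArgumentException("termCount must be >= 0 and percent in [0, 100]");
        }
        // Integer division discards the fractional part, i.e. rounds down.
        return termCount * percent / 100;
    }

    public static void main(String[] args) {
        // Three terms at 80%: 2.4 is rounded down.
        System.out.println(requiredTerms(3, 80));
    }
}
```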
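1.5. multi_match Query
termQuery and matchQuery match only one field at a time. This section covers multiMatchQuery, which matches several fields in one query.
1. Basic usage
A single-field query matches the keywords against one field; a multi-field query matches the keywords against several fields.
Example: match the keywords "spring css" against the name and description fields.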
{
"query": {
"multi_match" : {
"query" : "spring css",
"minimum_should_match": "50%",
"fields": [ "name", "description" ]
}
}
}
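2. Boost
When matching several fields, you can raise the boost (weight) of a field so that matches in it contribute more to the score.
For example, if name has a higher boost than description, documents that match the keyword in name are ranked ahead of those that match only in description.
Java client:
// Set the search method
searchSourceBuilder.query(QueryBuilders.multiMatchQuery("spring css","name","description")
.minimumShouldMatch("80%") // minimum proportion of terms that must match
.field("name",10)); // boost the name field by a factor of 10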
1.6. Bool query
Example: run a match query on name and description, and an exact (term) query on studymodel.
{ "_source" :["name", "studymodel", "description"],
"from" : 0,
"size" : 1,
"query":
{ "bool" :
{ "must":
[
{ "multi_match" :
{ "query" : "spring框架",
"minimum_should_match": "50%",
"fields": [ "name^10", "description" ]
}
},
{
"term":{
"studymodel" : "201001"
}
}
]
}
}
}
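must: all of the listed query clauses must match (this is the most commonly used).
should: "or" semantics; it is enough for one of the listed clauses to match.
must_not: none of the listed clauses may match.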
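The three clause types can be modelled with plain Java predicates. The following is a rough sketch only: `BoolSketch` is a hypothetical in-memory class that decides match or no match and ignores scoring. It also assumes that should clauses are required only when no must clause is present, which is the default behaviour of minimum_should_match in a bool query.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Sketch only: BoolSketch is a hypothetical in-memory model, not an Elasticsearch class.
final class BoolSketch {
    private final List<Predicate<Map<String, Object>>> must = new ArrayList<>();
    private final List<Predicate<Map<String, Object>>> should = new ArrayList<>();
    private final List<Predicate<Map<String, Object>>> mustNot = new ArrayList<>();

    BoolSketch must(Predicate<Map<String, Object>> clause) { must.add(clause); return this; }
    BoolSketch should(Predicate<Map<String, Object>> clause) { should.add(clause); return this; }
    BoolSketch mustNot(Predicate<Map<String, Object>> clause) { mustNot.add(clause); return this; }

    /** Returns true if the document satisfies the combined clauses. */
    boolean matches(Map<String, Object> doc) {
        boolean allMust = must.stream().allMatch(c -> c.test(doc));
        boolean noneMustNot = mustNot.stream().noneMatch(c -> c.test(doc));
        // Assumption: should clauses are only required when there is no must clause.
        boolean shouldOk = should.isEmpty() || !must.isEmpty()
                || should.stream().anyMatch(c -> c.test(doc));
        return allMust && noneMustNot && shouldOk;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("name", "spring");
        doc.put("studymodel", "201001");
        boolean hit = new BoolSketch()
                .must(d -> "spring".equals(d.get("name")))
                .must(d -> "201001".equals(d.get("studymodel")))
                .matches(doc);
        System.out.println(hit);
    }
}
```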
// Additional imports for this example
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.MultiMatchQueryBuilder;
import org.elasticsearch.index.query.TermQueryBuilder;

/**
 * BoolQuery
 */
@Test
public void testBoolQuery() throws Exception {
    // Search request
    SearchRequest searchRequest = new SearchRequest("xc_course");
    searchRequest.types("doc");
    // Search source builder
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    // Build the search method with a BoolQuery
    // 1. Define a multiMatchQuery
    MultiMatchQueryBuilder multiMatchQueryBuilder = QueryBuilders.multiMatchQuery("spring css", "name", "description")
            .minimumShouldMatch("80%") // minimum proportion of terms that must match
            .field("name", 10);
    // 2. Define a termQuery
    TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("studymodel", "201001");
    // BoolQuery
    BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
    boolQueryBuilder.must(multiMatchQueryBuilder);
    boolQueryBuilder.must(termQueryBuilder); // both conditions must be satisfied
    searchSourceBuilder.query(boolQueryBuilder);
    // Source filtering: the first argument lists the fields to include, the second the fields to exclude
    searchSourceBuilder.fetchSource(new String[]{"name","studymodel","price","timestamp"}, new String[]{});
    // Attach the search source to the request
    searchRequest.source(searchSourceBuilder);
    // Execute the search (sends an HTTP request to ES)
    SearchResponse search = client.search(searchRequest);
    // Search results
    SearchHits hits = search.getHits();
    // Total number of matching records
    long totalHits = hits.getTotalHits();
    // Top matching documents
    SearchHit[] hits1 = hits.getHits();
    SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
    for (SearchHit hit : hits1) {
        // Primary key of the document
        String id = hit.getId();
        // Source document content
        Map<String, Object> sourceAsMap = hit.getSourceAsMap();
        System.out.println(sourceAsMap);
    }
}
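1.7. Filters
A filter narrows down the search results. It only decides whether a document matches and does not compute a relevance score, so filters are faster than queries and are easy to cache. Prefer filters where possible, either on their own or combined with queries.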
{
"_source" : [ "name", "studymodel", "description","price"],
"query": {
"bool" : {
"must":[{
"multi_match" : {
"query" : "spring框架",
"minimum_should_match": "50%",
"fields": [ "name^10", "description" ]
}} ],
"filter": [ {
"term": { "studymodel": "201001" }},
{ "range": {
"price": {
"gte": 60 ,"lte" : 100
}}} ] } }
}
range: range filter; keeps the records whose price is greater than or equal to 60 and less than or equal to 100.
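The inclusive bounds of gte and lte can be illustrated on a plain list of prices. This is a minimal sketch: `RangeFilterSketch` is a hypothetical helper and the price values are made up for illustration. Like a filter, it answers only yes or no for each record and computes no score.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Sketch only: models the inclusive bounds of a range filter on in-memory values.
final class RangeFilterSketch {
    private RangeFilterSketch() {}

    /** Keeps the prices p with gte <= p <= lte (both bounds inclusive). */
    static List<Double> filterByPrice(List<Double> prices, double gte, double lte) {
        return prices.stream()
                .filter(p -> p >= gte && p <= lte)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Illustrative prices only; 60.0 and 100.0 sit exactly on the bounds and are kept.
        List<Double> prices = Arrays.asList(38.6, 60.0, 88.6, 100.0, 120.0);
        System.out.println(filterByPrice(prices, 60, 100));
    }
}
```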
/**
 * filter
 */
@Test
public void testBoolQueryByFilter() throws Exception {
    // Search request
    SearchRequest searchRequest = new SearchRequest("xc_course");
    searchRequest.types("doc");
    // Search source builder
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    // Build the search method with a BoolQuery
    // 1. Define a multiMatchQuery
    MultiMatchQueryBuilder multiMatchQueryBuilder = QueryBuilders.multiMatchQuery("spring css", "name", "description")
            .minimumShouldMatch("80%") // minimum proportion of terms that must match
            .field("name", 10);
    // BoolQuery
    BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
    boolQueryBuilder.must(multiMatchQueryBuilder);
    // 2. Define the filters
    boolQueryBuilder.filter(QueryBuilders.termQuery("studymodel","201001"));
    boolQueryBuilder.filter(QueryBuilders.rangeQuery("price").gte(60).lte(100));
    searchSourceBuilder.query(boolQueryBuilder);
    // Source filtering: the first argument lists the fields to include, the second the fields to exclude
    searchSourceBuilder.fetchSource(new String[]{"name","studymodel","price","timestamp"}, new String[]{});
    // Attach the search source to the request
    searchRequest.source(searchSourceBuilder);
    // Execute the search (sends an HTTP request to ES)
    SearchResponse search = client.search(searchRequest);
    // Search results
    SearchHits hits = search.getHits();
    // Total number of matching records
    long totalHits = hits.getTotalHits();
    // Top matching documents
    SearchHit[] hits1 = hits.getHits();
    SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
    for (SearchHit hit : hits1) {
        // Primary key of the document
        String id = hit.getId();
        // Source document content
        Map<String, Object> sourceAsMap = hit.getSourceAsMap();
        System.out.println(sourceAsMap);
    }
}
1.8. Sorting (sort)
{
"_source" : [ "name", "studymodel", "description","price"],
"query": {
"bool" : {
"filter": [ {
"range": {
"price": {
"gte": 0 ,"lte" : 100}}}
] } },
"sort" : [ {
"studymodel" : "desc" },
{ "price" : "asc" }
]
}
Java client:
// Requires: import org.elasticsearch.search.sort.SortOrder;
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// BoolQuery
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
// Define a filter
boolQueryBuilder.filter(QueryBuilders.rangeQuery("price").gte(60).lte(100));
searchSourceBuilder.query(boolQueryBuilder);
// Add sorting: studymodel descending first, then price ascending
searchSourceBuilder.sort("studymodel", SortOrder.DESC);
searchSourceBuilder.sort("price",SortOrder.ASC);
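The effect of two sort keys (studymodel descending, then price ascending) can be previewed with a plain Java comparator. This is a sketch on in-memory data only: `SortSketch` and its `Course` class are hypothetical, and the sample values are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Sketch only: models a two-key sort on in-memory objects.
final class SortSketch {
    static final class Course {
        final String name;
        final String studymodel;
        final double price;

        Course(String name, String studymodel, double price) {
            this.name = name;
            this.studymodel = studymodel;
            this.price = price;
        }
    }

    private SortSketch() {}

    /** Returns the course names ordered by studymodel descending, then price ascending. */
    static List<String> sortedNames(List<Course> courses) {
        List<Course> copy = new ArrayList<>(courses);
        copy.sort(Comparator.comparing((Course c) -> c.studymodel).reversed()
                .thenComparingDouble(c -> c.price));
        return copy.stream().map(c -> c.name).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Course> courses = Arrays.asList(
                new Course("a", "201001", 88.0),
                new Course("b", "201002", 60.0),
                new Course("c", "201001", 38.0));
        System.out.println(sortedNames(courses));
    }
}
```

The second key only matters for records that tie on the first key, which is also how the order of the `sort(...)` calls should be read.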
1.9. Highlighting
{
"_source" : [ "name", "studymodel", "description","price"],
"query": {
"bool" : {
"must":[{
"multi_match" : {
"query" : "开发框架",
"minimum_should_match": "50%",
"fields": [ "name^10", "description" ]
}} ],
"filter": [ {
"term": { "studymodel": "201001" }},
{ "range": {
"price": {
"gte": 60 ,"lte" : 100
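}}} ] } },
"sort" : [ {
"studymodel" : "desc" },
{ "price" : "asc" }
],
"highlight": {
"pre_tags": ["<tag1>"],
"post_tags": ["</tag1>"],
"fields": { "name": {}, "description":{} } }
}
When a term from "开发框架" appears in name or description, it is highlighted: each matched term is wrapped in the prefix given by pre_tags and the suffix given by post_tags.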
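What the prefix and suffix tags do to a field value can be shown with simple string replacement. This is a deliberately simplified sketch: `HighlightSketch` is a hypothetical helper that wraps literal occurrences of a single term, whereas Elasticsearch highlights analyzed terms and returns fragments.

```java
// Sketch only: wraps literal occurrences of one term in the given tags.
final class HighlightSketch {
    private HighlightSketch() {}

    /** Surrounds every occurrence of term in text with preTag and postTag. */
    static String highlight(String text, String term, String preTag, String postTag) {
        if (term.isEmpty()) {
            return text; // nothing to highlight
        }
        return text.replace(term, preTag + term + postTag);
    }

    public static void main(String[] args) {
        System.out.println(highlight("spring开发框架", "开发", "<tag1>", "</tag1>"));
    }
}
```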
java client:
// Additional imports for this example
import org.elasticsearch.common.text.Text;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightField;

/**
 * Highlighting
 */
@Test
public void testHighlight() throws Exception {
    // Search request
    SearchRequest searchRequest = new SearchRequest("xc_course");
    searchRequest.types("doc");
    // Search source builder
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    // Build the search method with a BoolQuery
    // 1. Define a multiMatchQuery
    MultiMatchQueryBuilder multiMatchQueryBuilder = QueryBuilders.multiMatchQuery("开发框架", "name", "description")
            .minimumShouldMatch("80%") // minimum proportion of terms that must match
            .field("name", 10);
    // BoolQuery
    BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
    boolQueryBuilder.must(multiMatchQueryBuilder);
    // Define a filter
    boolQueryBuilder.filter(QueryBuilders.rangeQuery("price").gte(0).lte(100));
    searchSourceBuilder.query(boolQueryBuilder);
    // Source filtering: the first argument lists the fields to include, the second the fields to exclude
    searchSourceBuilder.fetchSource(new String[]{"name","studymodel","price","timestamp"}, new String[]{});
    // Configure highlighting
    HighlightBuilder highlightBuilder = new HighlightBuilder();
    highlightBuilder.preTags("<Tag>");
    highlightBuilder.postTags("</Tag>");
    highlightBuilder.fields().add(new HighlightBuilder.Field("name"));
    searchSourceBuilder.highlighter(highlightBuilder);
    // Attach the search source to the request
    searchRequest.source(searchSourceBuilder);
    // Execute the search (sends an HTTP request to ES)
    SearchResponse search = client.search(searchRequest);
    // Search results
    SearchHits hits = search.getHits();
    // Total number of matching records
    long totalHits = hits.getTotalHits();
    // Top matching documents
    SearchHit[] hits1 = hits.getHits();
    SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
    for (SearchHit hit : hits1) {
        // Primary key of the document
        String id = hit.getId();
        // Source document content
        Map<String, Object> sourceAsMap = hit.getSourceAsMap();
        // Extract the highlighted name field; stays null if name was not highlighted
        String name = null;
        Map<String, HighlightField> highlightFields = hit.getHighlightFields();
        if (highlightFields != null) {
            HighlightField nameHighlightField = highlightFields.get("name");
            if (nameHighlightField != null) {
                // Join all highlighted fragments into one string
                Text[] fragments = nameHighlightField.getFragments();
                StringBuilder stringBuilder = new StringBuilder();
                for (Text fra : fragments) {
                    stringBuilder.append(fra);
                }
                name = stringBuilder.toString();
            }
        }
        System.out.println(name);
    }
}
}