Common Elasticsearch queries
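query DSL

match query

    { "match": { "tweet": "About Search" } }

Note: a match query searches one specified field and analyzes the query text before matching. For exact-value matching, prefer a filter clause, because filter results can be cached.

match_phrase query

    { "query": { "match_phrase": { "title": "quick brown fox" } } }

Note: the query text is still analyzed into terms, but unlike match (which by default accepts a document containing any of the terms), match_phrase only matches when all terms appear next to each other in the same order.
Reference: https://blog.csdn.net/liuxiao723846/article/details/78365078?locationNum=2&fps=1

multi_match query: run the same query against several fields

    { "multi_match": { "query": "full text search", "fields": [ "title", "body" ] } }

bool query

    must: the clause must match, and it contributes to the score.
    filter: like must, but it does not contribute to the score.
    must_not: the clause must not match.
    should: optional clauses; with minimum_should_match set to 1 (as below), a document has to match at least one of them.

    POST _search
    {
      "query": {
        "bool": {
          "must": { "term": { "user": "kimchy" } },
          "filter": { "term": { "tag": "tech" } },
          "must_not": { "range": { "age": { "gte": 10, "lte": 20 } } },
          "should": [
            { "term": { "tag": "wow" } },
            { "term": { "tag": "elasticsearch" } }
          ],
          "minimum_should_match": 1,
          "boost": 1.0
        }
      }
    }

prefix query: matches terms that start with the given characters

    { "query": { "prefix": { "hostname": "wx" } } }

wildcard query: matches terms against a wildcard pattern

    { "query": { "wildcard": { "postcode": "W?F*HW" } } }

Note: ? matches any single character, * matches zero or more characters.

regexp query: matches terms against a regular expression

    { "query": { "regexp": { "postcode": "W[0-9].+" } } }

Notes:
    1. Fields and terms: the field value "Quick brown fox" produces the terms "quick", "brown" and "fox".
    2. prefix, wildcard and regexp queries operate on terms, not on the whole field value.

---------------------------------------------------------------------

filter DSL

term filter: exact match

    { "query": { "term": { "age": 26 } } }

terms filter: match any of several values

    { "query": { "terms": { "status": [ 304, 302 ] } } }

range filter

    { "range": { "age": { "gte": 20, "lt": 30 } } }

Range operators:
    gt  :: greater than
    gte :: greater than or equal to
    lt  :: less than
    lte :: less than or equal to

exists/missing filter: test whether a field is present

    { "exists": { "field": "title" } }

Note: newer Elasticsearch versions no longer provide missing; the equivalent is an exists clause inside bool must_not.

bool filter: combines the results of several filter conditions with boolean logic

    {
      "bool": {
        "must": { "term": { "folder": "inbox" } },
        "must_not": { "term": { "tag": "spam" } },
        "should": [
          { "term": { "starred": true } },
          { "term": { "unread": true } }
        ]
      }
    }

Notes:
    must     :: every condition has to match, equivalent to AND.
    must_not :: no condition may match, equivalent to NOT.
    should   :: at least one condition has to match, equivalent to OR.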
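The AND / NOT / OR reading of must, must_not and should can be checked on a few in-memory documents. The following is only a sketch of that logic in Python, not how Elasticsearch evaluates queries internally. The helpers `term`, `bool_match` and `search` and the sample documents are made up for illustration, and the default of requiring one should clause is an assumption that mirrors the note above (in Elasticsearch the effective default depends on the version and on the other clauses present).

```python
def term(field, value):
    """Return a predicate that is true when doc[field] equals value exactly."""
    return lambda doc: doc.get(field) == value


def bool_match(doc, must=(), must_not=(), should=(), minimum_should_match=1):
    """Evaluate a simplified bool clause set against one document.

    minimum_should_match=1 is an assumption taken from the notes above;
    it is only applied when there are should clauses at all.
    """
    if not all(clause(doc) for clause in must):      # AND
        return False
    if any(clause(doc) for clause in must_not):      # NOT
        return False
    if should:                                       # OR (at least N)
        hits = sum(1 for clause in should if clause(doc))
        return hits >= minimum_should_match
    return True


def search(docs, **clauses):
    """Return the ids of all documents accepted by bool_match."""
    return [d["id"] for d in docs if bool_match(d, **clauses)]


# Hypothetical sample data, shaped like the inbox example.
docs = [
    {"id": 1, "folder": "inbox", "tag": "news", "starred": True, "unread": False},
    {"id": 2, "folder": "inbox", "tag": "spam", "starred": True, "unread": True},
    {"id": 3, "folder": "inbox", "tag": "news", "starred": False, "unread": False},
    {"id": 4, "folder": "sent", "tag": "news", "starred": True, "unread": True},
]

clauses = dict(
    must=[term("folder", "inbox")],
    must_not=[term("tag", "spam")],
    should=[term("starred", True), term("unread", True)],
)

print(search(docs, **clauses))                            # [1]
print(search(docs, minimum_should_match=0, **clauses))    # [1, 3]
```

Document 3 only appears once the should clauses become optional, which is the difference between treating should as a strict OR and treating it as a scoring hint.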
must example: put multiple conditions in an array []

    {
      "query": {
        "bool": {
          "must": [
            { "term": { "category": "38" } },
            { "range": { "time": { "gte": "1539827880", "lt": "1539827881" } } }
          ],
          "must_not": [],
          "should": []
        }
      },
      "from": 0,
      "size": 10,
      "sort": [],
      "aggs": {}
    }
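Going back to the earlier note that prefix, wildcard and regexp queries work on terms rather than on the whole field value, here is a small Python sketch of what that means in practice. The `analyze` function is a deliberately crude stand-in for an analyzer (lowercase, split on non-alphanumerics), not the real Elasticsearch standard analyzer, and all function names are hypothetical.

```python
import re


def analyze(text):
    """Crude analyzer stand-in: lowercase and split on non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def wildcard_to_regex(pattern):
    """Translate a wildcard pattern: ? is one character, * is zero or more."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")
        elif ch == "*":
            parts.append(".*")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def wildcard_matches(field_value, pattern):
    """True if any single term of the field matches the whole pattern."""
    rx = wildcard_to_regex(pattern)
    return any(rx.fullmatch(t) for t in analyze(field_value))


def prefix_matches(field_value, prefix):
    """True if any single term of the field starts with the prefix."""
    return any(t.startswith(prefix) for t in analyze(field_value))


value = "Quick brown fox"
print(analyze(value))                          # ['quick', 'brown', 'fox']
print(wildcard_matches(value, "br?wn"))        # True: matches the term "brown"
print(wildcard_matches(value, "quick br*"))    # False: no single term contains a space
print(prefix_matches(value, "fo"))             # True: matches the term "fox"
print(prefix_matches(value, "quick b"))        # False: the prefix spans two terms
```

A pattern that spans several words cannot match, because each word was indexed as its own term.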
Aggregations:
1. Select the records for a given time and compute sum(connection), where connection is a numeric field

    {
      "query": { "match": { "time": 1539827880 } },
      "aggs": {
        "connections": { "sum": { "field": "connection" } }
      }
    }
2. Select the records in a time range, group them by category, compute the total connection for each category, and sort the categories by result (descending)

    {
      "query": { "range": { "time": { "gte": 1539827880, "lt": 1539827881 } } },
      "aggs": {
        "_result": {
          "terms": {
            "field": "category",
            "order": { "result": "desc" }
          },
          "aggs": {
            "result": { "sum": { "field": "connection" } }
          }
        }
      }
    }
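To make clear what this request computes, the same grouping can be reproduced on a few in-memory records. This is a sketch in Python of the range filter, the terms grouping, the sum and the descending order; it is not how Elasticsearch executes the aggregation. The function name `terms_sum` and the sample records are invented, and the default `size=10` reflects the terms aggregation's default bucket count.

```python
from collections import defaultdict


def terms_sum(docs, time_gte, time_lt, size=10):
    """Group docs in [time_gte, time_lt) by category and sum connection."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for doc in docs:
        if time_gte <= doc["time"] < time_lt:       # the range query
            totals[doc["category"]] += doc["connection"]
            counts[doc["category"]] += 1
    buckets = [
        {"key": key, "doc_count": counts[key], "result": {"value": totals[key]}}
        for key in totals
    ]
    # "order": {"result": "desc"} sorts buckets by the sub-aggregation value.
    buckets.sort(key=lambda b: b["result"]["value"], reverse=True)
    return buckets[:size]


# Hypothetical records; the last one falls outside the time range.
docs = [
    {"time": 1539827880, "category": "38", "connection": 5},
    {"time": 1539827880, "category": "12", "connection": 9},
    {"time": 1539827880, "category": "38", "connection": 7},
    {"time": 1539827881, "category": "12", "connection": 100},
]

buckets = terms_sum(docs, 1539827880, 1539827881)
for b in buckets:
    print(b["key"], b["doc_count"], b["result"]["value"])
# 38 2 12
# 12 1 9
```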
3. Multi-level aggregation: the top 5 categories by total connection and, within each category, the top 5 ip values by total connection

    {
      "size": 0,
      "query": { "range": { "time": { "gte": 1539741480, "lte": 1539827880 } } },
      "aggs": {
        "_result": {
          "terms": { "field": "category", "order": { "result": "desc" }, "size": 5 },
          "aggs": {
            "result": { "sum": { "field": "connection" } },
            "IPs": {
              "terms": { "field": "ip", "order": { "ip_result": "desc" }, "size": 5 },
              "aggs": {
                "ip_result": { "sum": { "field": "connection" } }
              }
            }
          }
        }
      }
    }
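The response to a nested aggregation like this mirrors the request: each category bucket carries its own result value plus an IPs object with its own buckets. Below is a sketch in Python of flattening such a response into rows. The response dictionary is hand-written sample data (not real output), and `flatten_buckets` is a hypothetical helper name.

```python
def flatten_buckets(response):
    """Turn the nested category/ip buckets into flat (category, total, ip, ip_total) rows."""
    rows = []
    for cat in response["aggregations"]["_result"]["buckets"]:
        for ip in cat["IPs"]["buckets"]:
            rows.append(
                (cat["key"], cat["result"]["value"], ip["key"], ip["ip_result"]["value"])
            )
    return rows


# Hand-written sample in the shape of an aggregation response.
response = {
    "aggregations": {
        "_result": {
            "buckets": [
                {
                    "key": "38",
                    "doc_count": 3,
                    "result": {"value": 30.0},
                    "IPs": {
                        "buckets": [
                            {"key": "10.0.0.1", "doc_count": 2, "ip_result": {"value": 20.0}},
                            {"key": "10.0.0.2", "doc_count": 1, "ip_result": {"value": 10.0}},
                        ]
                    },
                },
                {
                    "key": "12",
                    "doc_count": 1,
                    "result": {"value": 4.0},
                    "IPs": {
                        "buckets": [
                            {"key": "10.0.0.3", "doc_count": 1, "ip_result": {"value": 4.0}},
                        ]
                    },
                },
            ]
        }
    }
}

rows = flatten_buckets(response)
for row in rows:
    print(row)
# ('38', 30.0, '10.0.0.1', 20.0)
# ('38', 30.0, '10.0.0.2', 10.0)
# ('12', 4.0, '10.0.0.3', 4.0)
```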
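# DE log: first filter the logs with type=2, then sum repeat_times per source_address (that is, the number of times a log from that source_address occurred), sorted in descending order

    {
      "size": 0,
      "query": {
        "bool": {
          "must": [ { "term": { "type": 2 } } ]
        }
      },
      "aggs": {
        "all_source_address": {
          "terms": {
            "field": "source_address",
            "order": { "repeat_times_total": "desc" }
          },
          "aggs": {
            "repeat_times_total": { "sum": { "field": "repeat_times" } }
          }
        }
      }
    }

Note: the order key refers to the name of the sub-aggregation (repeat_times_total), so the buckets are sorted by the aggregated value.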
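Standard JSON has no comment syntax, so one way to keep the ordering note right next to the request is to build the body in code and serialize it. The sketch below does this in Python with only the standard library; `build_de_log_request` is a hypothetical helper name, and sending the body to a cluster is left out on purpose.

```python
import json


def build_de_log_request(log_type=2):
    """Build the DE log aggregation body as a plain dict."""
    return {
        "size": 0,
        "query": {"bool": {"must": [{"term": {"type": log_type}}]}},
        "aggs": {
            "all_source_address": {
                "terms": {
                    "field": "source_address",
                    # Order by the sub-aggregation defined below, not by a document field.
                    "order": {"repeat_times_total": "desc"},
                },
                "aggs": {
                    "repeat_times_total": {"sum": {"field": "repeat_times"}},
                },
            }
        },
    }


body = json.dumps(build_de_log_request(), indent=2)
print(body)
```

The printed string is valid JSON and can be pasted into any HTTP client as the request body.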