es 查询
前提
ES在查询过程中比较多遇到符合查询,既需要多个字段过滤也需要特殊情况处理。
bool(组合查询)
- must 所有的语句都 必须(must) 匹配,与 AND 、= 等价。
- must_not 所有的语句都 不能(must not) 匹配,与 NOT 、!= 等价。
- should 至少有一个语句要匹配,与 OR 等价。
字段 | 介绍 | 类型 |
---|---|---|
must | 文档必须匹配must查询条件 | 数组 |
should | 文档应该匹配should子句查询的一个或多个 | 数组 |
must_not | 文档不能匹配该查询条件 | 数组 |
filter | 过滤器,文 | 档必须匹配该过滤条件,跟must子句的唯一区别是,filter不影响查询的score |
案例:
查看sql 状态为EXCEPTION 和 开始时间大于2021-04-22 日的数据。
select * from hadoop_impala where (queryState = "EXCEPTION" and startTime > '2021-04-22');
答案:
query = {
"query": {
"bool": {
"must": [
{
"match": {
"queryState": "EXCEPTION"
}
},
{
"range": {
"startTime": {
"gt": dtt
}
}
}
]
}
}
}
聚合查询
先分组,再计算每组的总和值
- 案例一:
select sum(peakUserMemory), proxyuser from presto group by proxyuser;
- 结果:
{
"query": {
},
"aggs": {
"proxyusers": {
"terms": {
"field": "proxyuser",
"size": 1000
},
"aggs": {
"sum_memory": {
"sum": {
"field": "peakUserMemory"
}
}
}
}
}
}
分组统计排序取topn
GET /hadoop_impala*/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"queryType": {
"value": "QUERY"
}
}
}
]
}
},
"aggs": {
"proxyusers": {
"terms": {
"field": "parse_sql",
"size": 10000
},
"aggs": {
"top_sales_hits": {
"top_hits": {
"sort": [
{
"durationMillis": {
"order": "desc"
}
}
],
"_source": {
"includes": [ "durationMillis", "parse_sql" ]
},
"size": 2
}
}
}
}
}
}
having
GET cars/_search
{
"size": 0,
"aggs": {
"colors": {
"terms": {
"field": "color"
},
"aggs": {
"avg_price": {
"avg": {
"field": "price"
}
},
"bucket_filter": {
"bucket_selector": {
"buckets_path": {
"avgPrice": "avg_price"
},
"script": "params.avgPrice>4000"
}
}
}
}
}
}