Notes on complex Elasticsearch queries

All API calls below are based on Elasticsearch 5.5.

JSON document format

{
  "_index": "zipkin-2017-09-06",
  "_type": "span",
  "_id": "AV5WSb1lKwYfgxikh_Fp",
  "_score": null,
  "_source": {
    "timestamp_millis": 1504686226897,
    "traceId": "58d858be36d2493e",
    "id": "eb5e8ee2ff39eaa7",
    "name": "close",
    "parentId": "47622e0c4229a48b",
    "timestamp": 1504686226897000,
    "duration": 2,
    "binaryAnnotations": [
      {
        "key": "ip",
        "value": "127.0.0.1",
        "endpoint": {
          "serviceName": "redis",
          "ipv4": "127.0.0.1",
          "port": 20880
        }
      },
      {
        "key": "lc",
        "value": "unknown",
        "endpoint": {
          "serviceName": "redis",
          "ipv4": "127.0.0.1",
          "port": 20880
        }
      },
      {
        "key": "service",
        "value": "redis",
        "endpoint": {
          "serviceName": "redis",
          "ipv4": "127.0.0.1",
          "port": 20880
        }
      }
    ]
  },
  "fields": {
    "timestamp_millis": [
      1504686226897
    ]
  },
  "sort": [
    1504686226897
  ]
}

1. OR query format

{"query":{"bool":{"should":[{},{},{}...]}},"size":400,"from":0,"sort":[{"timestamp":{"order":"desc","unmapped_type":"boolean"}}]}

A document is a hit as soon as it matches any one of the clauses inside should. For example:

{"query":{"bool":{"should":[{"match":{"traceId":"6edb691b4bc775b1"}},{"match":{"traceId":"7e5b391r4bc775b1"}}]}},"size":400,"from":0,"sort":[{"timestamp":{"order":"desc","unmapped_type":"boolean"}}]}

A document is a hit as long as its traceId equals either of the two values.
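
A minimal Python sketch of the same request body, built as a plain dict and serialized with the standard json module. The helper name or_query is made up for illustration; only the JSON it produces comes from the query above.

```python
import json


def or_query(trace_ids, size=400):
    """Build a bool/should body that matches any of the given traceId values.

    or_query is a hypothetical helper, not part of any Elasticsearch client.
    """
    return {
        "query": {
            "bool": {
                # One match clause per candidate value; any single match is a hit.
                "should": [{"match": {"traceId": t}} for t in trace_ids]
            }
        },
        "size": size,
        "from": 0,
        "sort": [{"timestamp": {"order": "desc", "unmapped_type": "boolean"}}],
    }


body = or_query(["6edb691b4bc775b1", "7e5b391r4bc775b1"])
print(json.dumps(body))
```

The printed string can be sent as the body of a _search request against the zipkin index.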

 

2. AND query format

{"query":{"bool":{"must":[{},{},{}...]}},"size":400,"from":0,"sort":[{"timestamp":{"order":"desc","unmapped_type":"boolean"}}]}

must means a document is a hit only if it matches every clause in the list. For example:

{"query":{"bool":{"must":[{"range":{"timestamp":{"gte":1504581280866000,"lte":1504581280878000,"format":"date_time_no_millis"}}}, {"match":{"traceId":"6edb691b4bc775b1"}}],"must_not": {"exists": { "field": "parentId"  } }}},"size":400,"from":0,"sort":[{"timestamp":{"order":"desc","unmapped_type":"boolean"}}]}

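Here the traceId must be 6edb691b4bc775b1 and the timestamp must fall between 1504581280866000 and 1504581280878000. The example also carries a must_not clause, which section 3 explains.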
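
As a rough sketch, the must part of that query could be assembled in Python like this. The function name and_query and its parameters are assumptions for illustration, and the "format" hint from the original query is left out for brevity.

```python
import json


def and_query(trace_id, start, end, size=400):
    """Build a bool/must body: traceId match AND timestamp within [start, end].

    and_query is a hypothetical helper; start and end use the same unit
    as the timestamp field in the sample document.
    """
    return {
        "query": {
            "bool": {
                # Every clause in must has to match for a document to be a hit.
                "must": [
                    {"range": {"timestamp": {"gte": start, "lte": end}}},
                    {"match": {"traceId": trace_id}},
                ]
            }
        },
        "size": size,
        "from": 0,
        "sort": [{"timestamp": {"order": "desc", "unmapped_type": "boolean"}}],
    }


body = and_query("6edb691b4bc775b1", 1504581280866000, 1504581280878000)
print(json.dumps(body))
```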

 

3. Checking whether a key is present

"must_not": {"exists": { "field": "parentId"  } }

This restricts the results to documents that do not have the parentId key. In a full query:

{"query":{ "bool":{"must":[{"range":{"timestamp":{"gte":1504581280866000,"lte":1504581280878000,"format":"date_time_no_millis"}}},  {"match":{"traceId":"6edb691b4bc775b1"}}],"must_not": {"exists": { "field": "parentId"  } }}},   "size":400,"from":0,"sort":[{"timestamp":{"order":"desc","unmapped_type":"boolean"}}]}

 

PS: must, should, and must_not are siblings; all of them sit directly inside bool.
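
To make the sibling layout concrete, here is a small Python sketch that puts must and must_not next to each other inside one bool. The helper name root_span_query is an assumption for illustration; it is meant to select spans of a trace that have no parentId.

```python
import json


def root_span_query(trace_id, size=400):
    """Build a body for spans of one trace that have no parentId key.

    root_span_query is a hypothetical helper, not a library function.
    """
    bool_clause = {
        # must and must_not are keys of the same bool object.
        "must": [{"match": {"traceId": trace_id}}],
        "must_not": {"exists": {"field": "parentId"}},
    }
    return {"query": {"bool": bool_clause}, "size": size, "from": 0}


body = root_span_query("6edb691b4bc775b1")
print(json.dumps(body))
```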

 

4. Nested query

{"query":{ "bool":{"must":[{"range":{"timestamp":{"gte":1504581280866000,"lte":1504581280878000,"format":"date_time_no_millis"}}},  {"match":{"traceId":"6edb691b4bc775b1"}},{"nested": {"path": "binaryAnnotations" ,"query": { "bool": {"must": [{ "match": { "binaryAnnotations.key": "service" }},{ "match": { "binaryAnnotations.value": "WebRequest" }}] } }}}],"must_not": {"exists": { "field": "parentId"  } }}},   "size":400,"from":0,"sort":[{"timestamp":{"order":"desc","unmapped_type":"boolean"}}]}

A nested query is a clause like match or range, so it goes inside must, should, and so on. The nested clause on its own:

{"nested": {"path": "binaryAnnotations" ,"query": { "bool": {"must": [{ "match": { "binaryAnnotations.key": "service" }},{ "match": { "binaryAnnotations.value": "WebRequest" }}] } }}}

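Our JSON document has a binaryAnnotations key whose value is an array of objects. A nested query must name that array in path (binaryAnnotations here) and then put the conditions in an inner query, which uses the same syntax as the outer one. Note that a nested query only works when the field is mapped with type nested.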
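
The nested clause follows a fixed pattern, so it is easy to generate. The following Python sketch builds it from a key/value pair; the helper name annotation_clause is made up for illustration.

```python
import json


def annotation_clause(key, value, path="binaryAnnotations"):
    """Build a nested clause matching one array element with this key and value.

    annotation_clause is a hypothetical helper; path defaults to the
    array field used in the sample document.
    """
    return {
        "nested": {
            "path": path,
            "query": {
                "bool": {
                    # Both conditions must hold on the same array element.
                    "must": [
                        {"match": {path + ".key": key}},
                        {"match": {path + ".value": value}},
                    ]
                }
            },
        }
    }


clause = annotation_clause("service", "WebRequest")
print(json.dumps(clause))
```

The result is meant to be appended to the must (or should) list of the outer bool, next to the match and range clauses.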

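5. Nested query with multiple conditions

Suppose we want two conditions on binaryAnnotations to hold at the same time. Each condition gets its own nested clause inside must, because a single nested clause is evaluated against one array element at a time: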

{"query":{ "bool":{"must":[{"range":{"timestamp":{"gte":1504581280866000,"lte":1504581280878000,"format":"date_time_no_millis"}}},  {"match":{"traceId":"6edb691b4bc775b1"}},{"nested": {"path": "binaryAnnotations" ,"query": { "bool": {"must": [{ "match": { "binaryAnnotations.key": "service" }},{ "match": { "binaryAnnotations.value": "WebRequest" }}] } }}},{"nested": {"path": "binaryAnnotations" ,"query": { "bool": {"must": [{ "match": { "binaryAnnotations.key": "ip" }},{ "match": { "binaryAnnotations.value": "127.0.0.1" }}] } }}}],"must_not": {"exists": { "field": "parentId"  } }}},   "size":400,"from":0,"sort":[{"timestamp":{"order":"desc","unmapped_type":"boolean"}}]}
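
A hedged Python sketch of the same idea: one nested clause is generated per key/value pair and all of them are placed in the outer must list. The names annotation_clause and annotations_query are assumptions for illustration, and the range and must_not parts of the query above are omitted to keep the example short.

```python
import json


def annotation_clause(key, value, path="binaryAnnotations"):
    """Nested clause matching one array element with this key and value (hypothetical helper)."""
    return {
        "nested": {
            "path": path,
            "query": {
                "bool": {
                    "must": [
                        {"match": {path + ".key": key}},
                        {"match": {path + ".value": value}},
                    ]
                }
            },
        }
    }


def annotations_query(trace_id, pairs, size=400):
    """Require the traceId plus one matching array element per (key, value) pair."""
    must = [{"match": {"traceId": trace_id}}]
    # A separate nested clause per pair lets different elements satisfy each one.
    must.extend(annotation_clause(k, v) for k, v in pairs)
    return {"query": {"bool": {"must": must}}, "size": size, "from": 0}


body = annotations_query(
    "6edb691b4bc775b1",
    [("service", "WebRequest"), ("ip", "127.0.0.1")],
)
print(json.dumps(body))
```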

 

6. Deduplicated query

{"query":{"bool":{"must":[ {"match":{"name":"query"}} ]}}, "aggs": {"traceId": {"terms": {"field": "traceId","size": 10  }}}, "size":10,"from":0,"sort":[{"timestamp":{"order":"desc","unmapped_type":"boolean"}}]}

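Deduplication uses an aggs clause, which is a sibling of query. This query selects the records with name=query and groups them by traceId with a terms aggregation. The distinct traceId values (at most 10 here) come back as buckets in the aggregations part of the response, while the hits themselves are not deduplicated.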
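
As an illustration, the Python sketch below builds the aggregation body and pulls the distinct values out of a response. The function names are assumptions, and sample_response is invented data that only mimics the shape of a terms aggregation result (buckets with key and doc_count).

```python
import json


def distinct_query(name, field="traceId", size=10):
    """Build a body that filters by name and buckets the hits by field (hypothetical helper)."""
    return {
        "query": {"bool": {"must": [{"match": {"name": name}}]}},
        # aggs sits next to query, not inside it.
        "aggs": {field: {"terms": {"field": field, "size": size}}},
        "size": size,
        "from": 0,
    }


def distinct_values(response, field="traceId"):
    """Return the bucket keys, i.e. the distinct values of the aggregated field."""
    return [b["key"] for b in response["aggregations"][field]["buckets"]]


body = distinct_query("query")
print(json.dumps(body))

# Invented response fragment, shaped like a terms aggregation result.
sample_response = {
    "aggregations": {
        "traceId": {
            "buckets": [
                {"key": "6edb691b4bc775b1", "doc_count": 3},
                {"key": "58d858be36d2493e", "doc_count": 1},
            ]
        }
    }
}
print(distinct_values(sample_response))
```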

posted on 2017-09-07 10:31  devilwind