[ElasticSearch] ES Operations: Sum Bucket Aggregation

I've recently picked up a lot of new ES query tricks from a colleague, so here is a quick summary.

Sum Bucket Aggregation


Use case: get the sum of a specified metric across all buckets produced by a grouping.


For example: group by some field and get the total document count across the top 1000 buckets.

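You could do it the clumsy way: define a variable, loop over the buckets, read each count, and add them up. That is not very elegant, though, and since ES provides this out of the box, you can just call it.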
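For comparison, the loop-and-sum approach could look roughly like the sketch below. It is a minimal illustration, assuming the bucket counts have already been pulled out of the response into a map; the class name `ManualBucketSum`, the method `sumDocCounts`, and the sample keys and counts are hypothetical and not part of any ES API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ManualBucketSum {

    // Adds up the doc_count of every bucket, the way a hand-written loop would.
    static long sumDocCounts(Map<String, Long> docCountByKey) {
        long total = 0L;
        for (long docCount : docCountByKey.values()) {
            total += docCount;
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical bucket keys and counts, for illustration only.
        Map<String, Long> buckets = new LinkedHashMap<>();
        buckets.put("a", 10L);
        buckets.put("b", 7L);
        buckets.put("c", 3L);
        System.out.println(sumDocCounts(buckets)); // prints 20
    }
}
```

This works, but every bucket has to travel to the client before it can be added up, whereas sum_bucket returns the total as part of the same response.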

 

Reference: https://xiaoxiami.gitbook.io/elasticsearch/ji-chu/36aggregationsju-he-fen-679029/363guan-dao-ju-540828-pipeline-aggregations/zong-he-tong-ju-540828-sum-bucket-aggregation

 

Example 1, DSL:

"aggs": {
    "all": {
        "terms": {
            "field": "topics",
            "size": 5
        }
    },
    "sum": {
        "sum_bucket": {
            "buckets_path": "all>_count"
        }
    }
}

Result:

      "aggregations": {
            "all": {
              "doc_count_error_upper_bound": 11656,
              "sum_other_doc_count": 2575137,
              "buckets": [
                {
                  "key": "xx",
                  "doc_count": 129636
                },
                {
                  "key": "xxx",
                  "doc_count": 41586
                },
                {
                  "key": "xxxx",
                  "doc_count": 39196
                },
                {
                  "key": "xxxxx",
                  "doc_count": 38775
                },
                {
                  "key": "xxxxxx",
                  "doc_count": 23163
                }
              ]
            },
            "sum": {
              "value": 272356
            }
         }

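The value of sum is the total of the buckets' doc_count values: 129636 + 41586 + 39196 + 38775 + 23163 = 272356. Only the returned buckets are added up; the documents counted in sum_other_doc_count are not included.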

 

Java version using rest-high-level-client:

        // Search-level settings: match everything, return no hits (aggregations only), 120 s timeout.
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder()
                .query(new MatchAllQueryBuilder())
                .size(0)
                .timeout(TimeValue.timeValueMillis(120000));
        // Terms aggregation "all": top 5 values of the "topics" field.
        TermsAggregationBuilder terms = AggregationBuilders.terms("all").field("topics").size(5);
        // Sibling pipeline aggregation "sum": adds up the doc_count (_count) of every "all" bucket.
        SumBucketPipelineAggregationBuilder sumBucket = new SumBucketPipelineAggregationBuilder("sum", "all>_count");
        sourceBuilder.aggregation(terms).aggregation(sumBucket);
        SearchRequest request = new SearchRequest(xxIndex)
                .types(xxType)
                .source(sourceBuilder);
        SearchResponse response = esClient.getClient().search(request);
        // The sum_bucket result is parsed as a ParsedSimpleValue; read its numeric value.
        Map<String, Aggregation> map = response.getAggregations().getAsMap();
        double sum = ((ParsedSimpleValue) map.get("sum")).value();

 

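Besides count, other numeric metrics can be summed as well. For example, you can sum a field within each bucket and then get the total of those sums across all buckets.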

Example 2, DSL:

      "aggs": {
            "all": {
              "terms": {
                "field": "topics",
                "size": 5
              },
              "aggs": {
                "friends_cnt": {
                  "sum": {
                    "field": "friends_cnt"
                  }
                }
              }
            },
            "sum":{
              "sum_bucket":{
                "buckets_path":"all>friends_cnt"
              }
            }
          }

Result:

           "aggregations": {
            "all": {
              "doc_count_error_upper_bound": 11656,
              "sum_other_doc_count": 2575137,
              "buckets": [
                {
                  "key": "xx",
                  "doc_count": 129636,
                  "friends_cnt": {
                    "value": 55291503
                  }
                },
                {
                  "key": "xxx",
                  "doc_count": 41586,
                  "friends_cnt": {
                    "value": 21381248
                  }
                },
                {
                  "key": "xxxx",
                  "doc_count": 39196,
                  "friends_cnt": {
                    "value": 14668921
                  }
                },
                {
                  "key": "xxxxx",
                  "doc_count": 38775,
                  "friends_cnt": {
                    "value": 19805247
                  }
                },
                {
                  "key": "xxxxxx",
                  "doc_count": 23163,
                  "friends_cnt": {
                    "value": 10268415
                  }
                }
              ]
            },
            "sum": {
              "value": 121415334
            }
          }        
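To make the role of buckets_path concrete, here is a small in-memory sketch of how a two-part path picks one value per bucket and adds them up. It is a simplified model and not how ES implements the aggregation: the `BucketsPathSketch` class, its `Bucket` type, and the `sumBucket` and `exampleAggs` methods are all hypothetical names, and only paths of the form `agg>metric` are handled. The numbers are copied from the result above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BucketsPathSketch {

    // Simplified stand-in for one terms bucket: doc_count plus named sub-aggregation values.
    static final class Bucket {
        final long docCount;
        final Map<String, Double> metrics = new HashMap<>();

        Bucket(long docCount) {
            this.docCount = docCount;
        }

        Bucket metric(String name, double value) {
            metrics.put(name, value);
            return this;
        }
    }

    // Sums one value per bucket for a two-part path "<aggName>><metric>".
    // "_count" selects the bucket's doc_count; any other name selects a sub-aggregation value.
    static double sumBucket(Map<String, List<Bucket>> aggs, String bucketsPath) {
        String[] parts = bucketsPath.split(">");
        if (parts.length != 2 || !aggs.containsKey(parts[0])) {
            throw new IllegalArgumentException("unsupported buckets_path: " + bucketsPath);
        }
        double total = 0;
        for (Bucket bucket : aggs.get(parts[0])) {
            if ("_count".equals(parts[1])) {
                total += bucket.docCount;
            } else {
                Double value = bucket.metrics.get(parts[1]);
                if (value == null) {
                    throw new IllegalArgumentException("no metric named " + parts[1]);
                }
                total += value;
            }
        }
        return total;
    }

    // The five buckets from the result above (keys omitted, they do not affect the sum).
    static Map<String, List<Bucket>> exampleAggs() {
        List<Bucket> all = new ArrayList<>();
        all.add(new Bucket(129636).metric("friends_cnt", 55291503));
        all.add(new Bucket(41586).metric("friends_cnt", 21381248));
        all.add(new Bucket(39196).metric("friends_cnt", 14668921));
        all.add(new Bucket(38775).metric("friends_cnt", 19805247));
        all.add(new Bucket(23163).metric("friends_cnt", 10268415));
        Map<String, List<Bucket>> aggs = new HashMap<>();
        aggs.put("all", all);
        return aggs;
    }

    public static void main(String[] args) {
        Map<String, List<Bucket>> aggs = exampleAggs();
        System.out.println((long) sumBucket(aggs, "all>_count"));      // prints 272356
        System.out.println((long) sumBucket(aggs, "all>friends_cnt")); // prints 121415334
    }
}
```

The same five buckets give 272356 for `all>_count` and 121415334 for `all>friends_cnt`, matching the sum values in the two results.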

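In Java: only the first aggregation needs to change (add a sub-aggregation to it), and the "_count" in the sum bucket's path is replaced with the sub-aggregation's name.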

        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder()
                .query(new MatchAllQueryBuilder())
                .size(0)
                .timeout(TimeValue.timeValueMillis(120000));
        // Terms aggregation "all" with a sum sub-aggregation "friends_cnt" computed inside each bucket.
        TermsAggregationBuilder terms = AggregationBuilders.terms("all").field("topics").size(5)
                .subAggregation(new SumAggregationBuilder("friends_cnt").field("friends_cnt"));
        // The path now points at the sub-aggregation instead of "_count".
        SumBucketPipelineAggregationBuilder sumBucket = new SumBucketPipelineAggregationBuilder("sum", "all>friends_cnt");
        sourceBuilder.aggregation(terms).aggregation(sumBucket);
        SearchRequest request = new SearchRequest(xxIndex)
                .types(xxType)
                .source(sourceBuilder);
        SearchResponse response = esClient.getClient().search(request);
        // Read the total of the per-bucket sums and return it as a string.
        Map<String, Aggregation> map = response.getAggregations().getAsMap();
        double sum = ((ParsedSimpleValue) map.get("sum")).value();
        return Double.toString(sum);

 

posted @ 2020-05-09 16:09  念欲似毒