ElasticSearch Tokenized Full-Text Search - Aggregation Queries (cardinality)

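Contents

ElasticSearch Tokenized Full-Text Search - Overview
ElasticSearch Tokenized Full-Text Search - Installing ES, Kibana, and IK
ElasticSearch Tokenized Full-Text Search - Basic Restful Operations
ElasticSearch Tokenized Full-Text Search - Java SpringBoot ES Index Operations
ElasticSearch Tokenized Full-Text Search - Java SpringBoot ES Document Operations
ElasticSearch Tokenized Full-Text Search - Test Data Preparation
ElasticSearch Tokenized Full-Text Search - term and terms Queries
ElasticSearch Tokenized Full-Text Search - match, match_all, and multimatch Queries
ElasticSearch Tokenized Full-Text Search - id, ids, prefix, fuzzy, wildcard, range, and regexp Queries
ElasticSearch Tokenized Full-Text Search - Scroll Deep Pagination
ElasticSearch Tokenized Full-Text Search - delete-by-query
ElasticSearch Tokenized Full-Text Search - Compound Queries
ElasticSearch Tokenized Full-Text Search - filter Queries
ElasticSearch Tokenized Full-Text Search - Highlight Queries
ElasticSearch Tokenized Full-Text Search - Aggregation Queries (cardinality)
ElasticSearch Tokenized Full-Text Search - Geo (Latitude/Longitude) Queries
ElasticSearch Tokenized Full-Text Search - Search Keyword Autocompletion (suggest)
ElasticSearch Tokenized Full-Text Search - Complete SpringBoot Demo with Source Code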

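Data Preparation

See: ElasticSearch Tokenized Full-Text Search - Test Data Preparation

Aggregation Queries

Aggregation queries in ES are similar to aggregate queries in MySQL, but more powerful: ES offers many more ways to compute statistics over the data.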

# Restful syntax of an ES aggregation query
POST /index/type/_search
{
   "aggs":{
       "name (agg)":{
            "agg_type":{
               "property":"value"
            }
       }
   }
}
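The nesting in the syntax above (aggs, then a name of your choice, then the aggregation type, then its properties) can be sketched in plain Java. This is only an illustration of the structure: the class AggRequestBody and its methods are hypothetical helpers, not part of any Elasticsearch client, and the tiny JSON writer handles nested maps and unescaped strings only.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper that mirrors the aggs -> name -> agg_type -> properties nesting.
class AggRequestBody {

    // Wraps the given properties as {"aggs": {name: {aggType: properties}}}.
    static Map<String, Object> aggBody(String name, String aggType, Map<String, Object> properties) {
        Map<String, Object> typeLevel = new LinkedHashMap<>();
        typeLevel.put(aggType, properties);
        Map<String, Object> nameLevel = new LinkedHashMap<>();
        nameLevel.put(name, typeLevel);
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("aggs", nameLevel);
        return root;
    }

    // Minimal JSON writer: supports nested maps and plain strings, performs no escaping.
    static String toJson(Object value) {
        if (value instanceof Map) {
            return ((Map<?, ?>) value).entrySet().stream()
                    .map(e -> "\"" + e.getKey() + "\":" + toJson(e.getValue()))
                    .collect(Collectors.joining(",", "{", "}"));
        }
        return "\"" + value + "\"";
    }

    public static void main(String[] args) {
        Map<String, Object> properties = new LinkedHashMap<>();
        properties.put("field", "province");
        // Prints {"aggs":{"agg":{"cardinality":{"field":"province"}}}}
        System.out.println(toJson(aggBody("agg", "cardinality", properties)));
    }
}
```

The name level ("agg" here) is the key under which the result comes back in the "aggregations" part of the response, which is why the Java examples in this article read the result with get("agg").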

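Distinct Count Query (Cardinality)

Distinct count, i.e. Cardinality: ES first deduplicates the values of one specified field across the matched documents, then counts how many distinct values remain. The result is an approximation computed with the HyperLogLog++ algorithm.

# Distinct count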
POST /sms-logs-index/_search
{
  "aggs": {
    "agg": {
      "cardinality": {
        "field": "province"
      }
    }
  }
}

Java

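// Imports assume the 7.x high-level REST client and JUnit 5.
// ESClient is this series' own helper class, not part of the Elasticsearch client.
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.metrics.Cardinality;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.junit.jupiter.api.Test;

@Test
void cardinalityQuery() throws Exception {
    String indexName = "sms-logs-index";
    RestHighLevelClient client = ESClient.getClient();

    //1. Create the SearchRequest object
    SearchRequest request = new SearchRequest(indexName);

    //2. Specify the aggregation: distinct count on the "province" field
    SearchSourceBuilder builder = new SearchSourceBuilder();
    builder.aggregation(AggregationBuilders.cardinality("agg").field("province"));

    request.source(builder);

    //3. Execute the query
    SearchResponse resp = client.search(request, RequestOptions.DEFAULT);

    //4. Print the result, looked up by the aggregation name "agg"
    Cardinality agg = resp.getAggregations().get("agg");
    long value = agg.getValue();
    System.out.println(value);
}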
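To make the meaning of the result concrete, here is a minimal in-memory sketch of what a distinct count is. It is not how ES computes it: ES uses the approximate HyperLogLog++ algorithm, while this sketch is exact and only practical for small data. The class name and the sample province values are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration: exact distinct count over one field's values.
class CardinalitySketch {

    // Deduplicates the values with a set, then counts what remains.
    static long distinctCount(List<String> fieldValues) {
        Set<String> seen = new HashSet<>(fieldValues);
        return seen.size();
    }

    public static void main(String[] args) {
        // Made-up "province" values, one per document
        List<String> provinces = Arrays.asList("Beijing", "Shanghai", "Beijing", "Wuhan", "Shanghai");
        // Five documents, three distinct provinces: prints 3
        System.out.println(distinctCount(provinces));
    }
}
```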

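Range Statistics (range)

Counts the documents that fall within given ranges, for example how many documents have a value of a given field in 0~100, 100~200, and 200~300 respectively.
Range statistics work on plain numeric values, on date values, and on IP values.
The corresponding aggregation types are range, date_range, and ip_range.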

# Numeric range statistics
POST /sms-logs-index/_search
{
  "aggs": {
    "agg": {
      "range": {
        "field": "fee",
        "ranges": [
          {
            "to": 20
          },
          {
            "from": 20, # from is inclusive (to is exclusive)
            "to": 30
          },
          {
            "from": 30
          }
        ]
      }
    }
  }
}

# Date range statistics
POST /sms-logs-index/_search
{
  "aggs": {
    "agg": {
      "date_range": {
        "field": "createDate",
        "format":"yyyy",
        "ranges": [
          {
            "to": "2023"  # documents dated before 2023
          }, 
          {
            "from": "2023" # documents dated 2023 or later
          }
        ]
      }
    }
  }
}

# IP range statistics
POST /sms-logs-index/_search
{
  "aggs": {
    "agg": {
      "ip_range": {
        "field": "ipAddr",
        "ranges": [
          {
            "to": "172.16.0.4"
          }, 
          {
            "from": "172.16.0.4"
          }
        ]
      }
    }
  }
}

Java

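// Imports assume the 7.x high-level REST client and JUnit 5.
// ESClient is this series' own helper class, not part of the Elasticsearch client.
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.range.Range;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.junit.jupiter.api.Test;

@Test
void rangeQuery() throws Exception {
    String indexName = "sms-logs-index";
    RestHighLevelClient client = ESClient.getClient();

    //1. Create the SearchRequest object
    SearchRequest request = new SearchRequest(indexName);

    //2. Specify the aggregation: three ranges on the "fee" field
    SearchSourceBuilder builder = new SearchSourceBuilder();
    builder.aggregation(AggregationBuilders.range("agg").field("fee")
            .addUnboundedTo(20)
            .addRange(20, 30)
            .addUnboundedFrom(30));

    request.source(builder);

    //3. Execute the query
    SearchResponse resp = client.search(request, RequestOptions.DEFAULT);

    //4. Print each bucket of the result
    Range agg = resp.getAggregations().get("agg");
    for (Range.Bucket bucket : agg.getBuckets()) {
        String key = bucket.getKeyAsString();
        Object from = bucket.getFrom();
        Object to = bucket.getTo();
        long docCount = bucket.getDocCount();
        System.out.println(String.format("Key:%s From: %s  to: %s DocCount: %s", key, from, to, docCount));
    }
}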
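The boundary behavior of the three ranges (from is inclusive, to is exclusive) is easy to get wrong, so here is a small plain-Java sketch of the same bucketing rule. It is an in-memory illustration only: the class name, the bucket labels, and the sample fee values are made up, and the labels are not the keys ES returns.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the bucketing rule: from inclusive, to exclusive.
class RangeBucketSketch {

    // Counts fees into the buckets (-inf, 20), [20, 30), and [30, +inf).
    static Map<String, Long> bucketFees(double[] fees) {
        Map<String, Long> buckets = new LinkedHashMap<>();
        buckets.put("<20", 0L);
        buckets.put("[20,30)", 0L);
        buckets.put(">=30", 0L);
        for (double fee : fees) {
            String key;
            if (fee < 20) {
                key = "<20";
            } else if (fee < 30) {
                key = "[20,30)";   // a fee of exactly 20 lands here, not in "<20"
            } else {
                key = ">=30";      // a fee of exactly 30 lands here, not in "[20,30)"
            }
            buckets.merge(key, 1L, Long::sum);
        }
        return buckets;
    }

    public static void main(String[] args) {
        // Made-up fee values, chosen to include both boundary values
        double[] fees = {17, 20, 25, 30, 45};
        // Prints {<20=1, [20,30)=2, >=30=2}
        System.out.println(bucketFees(fees));
    }
}
```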

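Statistics Aggregation Query (extended_stats)

It returns statistics for a specified field, such as the maximum, minimum, average, sum of squares, and so on.

# Statistics aggregation query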
POST /sms-logs-index/_search
{
  "aggs": {
    "agg": {
      "extended_stats": {
        "field": "fee"
      }
    }
  }
}

Response

"aggregations" : {
    "agg" : {
      "count" : 8,
      "min" : 17.0,
      "max" : 45.0,
      "avg" : 31.25,
      "sum" : 250.0,
      "sum_of_squares" : 8468.0,
      "variance" : 81.9375,
      "variance_population" : 81.9375,
      "variance_sampling" : 93.64285714285714,
      "std_deviation" : 9.051933495115836,
      "std_deviation_population" : 9.051933495115836,
      "std_deviation_sampling" : 9.676923950453322,
      "std_deviation_bounds" : {
        "upper" : 49.35386699023167,
        "lower" : 13.146133009768327,
        "upper_population" : 49.35386699023167,
        "lower_population" : 13.146133009768327,
        "upper_sampling" : 50.60384790090664,
        "lower_sampling" : 11.896152099093356
      }
    }
  }

Java

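// Imports assume the 7.x high-level REST client and JUnit 5.
// ESClient is this series' own helper class, not part of the Elasticsearch client.
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.metrics.ExtendedStats;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.junit.jupiter.api.Test;

@Test
void extendedQuery() throws Exception {
    String indexName = "sms-logs-index";
    RestHighLevelClient client = ESClient.getClient();

    //1. Create the SearchRequest object
    SearchRequest request = new SearchRequest(indexName);

    //2. Specify the aggregation: extended statistics on the "fee" field
    SearchSourceBuilder builder = new SearchSourceBuilder();
    builder.aggregation(AggregationBuilders.extendedStats("agg").field("fee"));
    request.source(builder);

    //3. Execute the query
    SearchResponse resp = client.search(request, RequestOptions.DEFAULT);

    //4. Print the max and min from the result
    ExtendedStats agg = resp.getAggregations().get("agg");
    double max = agg.getMax();
    double min = agg.getMin();
    System.out.println(String.format("Max:%s Min: %s ", max, min));
}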
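Most values in the extended_stats response shown earlier can be derived from just three of them: count, sum, and sum_of_squares. The sketch below recomputes them with the textbook formulas, using the numbers from that response (count 8, sum 250.0, sum_of_squares 8468.0) and assuming the default of 2 standard deviations for std_deviation_bounds. The class and method names are made up, and this is not necessarily how ES computes the values internally.

```java
// Hypothetical illustration: re-deriving extended_stats values from count, sum, and sum of squares.
class ExtendedStatsSketch {

    static double avg(long count, double sum) {
        return sum / count;
    }

    // Population variance: E[x^2] - (E[x])^2
    static double variancePopulation(long count, double sum, double sumOfSquares) {
        double mean = sum / count;
        return sumOfSquares / count - mean * mean;
    }

    // Sampling variance: divides by (count - 1) instead of count
    static double varianceSampling(long count, double sum, double sumOfSquares) {
        return (sumOfSquares - sum * sum / count) / (count - 1);
    }

    // Upper bound of the std_deviation_bounds interval: avg + sigma * stdDeviation
    static double upperBound(double avg, double stdDeviation, double sigma) {
        return avg + sigma * stdDeviation;
    }

    public static void main(String[] args) {
        long count = 8;
        double sum = 250.0;
        double sumOfSquares = 8468.0;

        double mean = avg(count, sum);                                    // 31.25
        double varPop = variancePopulation(count, sum, sumOfSquares);     // 81.9375
        double varSamp = varianceSampling(count, sum, sumOfSquares);      // about 93.64
        double stdPop = Math.sqrt(varPop);                                // about 9.05

        System.out.println("avg = " + mean);
        System.out.println("variance_population = " + varPop);
        System.out.println("variance_sampling = " + varSamp);
        System.out.println("std_deviation_population = " + stdPop);
        System.out.println("upper bound = " + upperBound(mean, stdPop, 2.0)); // about 49.35
    }
}
```

In the response, variance and std_deviation carry the same values as their _population counterparts, which is why only the population and sampling variants are recomputed here.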

Official documentation: https://www.elastic.co/guide/cn/elasticsearch/reference/index.html

posted @ 2023-03-21 09:44  VipSoft