DSL: Elasticsearch Nested Objects

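Reference:

An analysis of Elasticsearch nested objects

When an indexed document in Elasticsearch contains an array of child objects, queries and aggregations over that array need special care.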

https://es.xiaoleilu.com/402_Nested/30_Nested_objects.html

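Nested objects

In Elasticsearch, creating, deleting, or updating a single document is atomic, so it makes sense to store closely related entities in the same document.

For example, we could store an order and all of its order lines in one document, or store a blog post and all of its comments together by passing an array of comments: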

PUT /my_index/blogpost/1
{
  "title": "Nest eggs",
  "body": "Making your money work...",
  "tags": [
    "cash",
    "shares"
  ],
  "comments": [
    {
      "name": "John Smith",
      "comment": "Great article",
      "age": 28,
      "stars": 4,
      "date": "2014-09-01"
    },
    {
      "name": "Alice White",
      "comment": "More like this please",
      "age": 31,
      "stars": 5,
      "date": "2014-10-22"
    }
  ]
}
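
If we rely on dynamic mapping, the comments field is automatically created as a field of type object.

Because all of the content is in the same document, there is no need to join blog posts and comments at query time, so searches perform well.

The problem is that the document above would match a query like this: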

GET my_index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "comments.age": 31
          }
        },
        {
          "match": {
            "comments.name": "John Smith"
          }
        }
      ]
    }
  }
}

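But John Smith is 28; it is Alice who is 31!

The reason for this cross-object match, as discussed for arrays of inner objects, is that our nicely structured JSON document is flattened in the index into a simple key-value form like this: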

{
  "title":            [ eggs, nest ],
  "body":             [ making, money, work, your ],
  "tags":             [ cash, shares ],
  "comments.name":    [ alice, john, smith, white ],
  "comments.comment": [ article, great, like, more, please, this ],
  "comments.age":     [ 28, 31 ],
  "comments.stars":   [ 4, 5 ],
  "comments.date":    [ 2014-09-01, 2014-10-22 ]
}

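The correlation between Alice and 31, or between John and 2014-09-01, has been irretrievably lost. Fields of type object are useful for storing a single object, but from a search point of view they are of no use for storing an array of objects, because the association between the fields of each element is not kept.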
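
The loss of correlation can be reproduced outside Elasticsearch. The following is a minimal in-memory sketch, not how Lucene actually stores anything: the class name FlattenedMatchDemo and its methods are hypothetical, and names are compared as whole strings rather than analyzed tokens. It flattens the two comments into multi-valued fields and then checks a "must" query one field at a time.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Elasticsearch.
final class FlattenedMatchDemo {

    // Flatten an array of objects into one multi-valued entry per "prefix.key".
    static Map<String, List<Object>> flatten(String prefix, List<Map<String, Object>> objects) {
        Map<String, List<Object>> flat = new HashMap<>();
        for (Map<String, Object> object : objects) {
            for (Map.Entry<String, Object> field : object.entrySet()) {
                flat.computeIfAbsent(prefix + "." + field.getKey(), k -> new ArrayList<>())
                    .add(field.getValue());
            }
        }
        return flat;
    }

    // Each condition only needs SOME value of its field to match,
    // regardless of which original object that value came from.
    static boolean matchesFlattened(Map<String, List<Object>> flat, Map<String, Object> conditions) {
        for (Map.Entry<String, Object> condition : conditions.entrySet()) {
            List<Object> values = flat.get(condition.getKey());
            if (values == null || !values.contains(condition.getValue())) {
                return false;
            }
        }
        return true;
    }

    // The two comments from the blog post example, reduced to name and age.
    static List<Map<String, Object>> sampleComments() {
        Map<String, Object> john = new HashMap<>();
        john.put("name", "John Smith");
        john.put("age", 28);
        Map<String, Object> alice = new HashMap<>();
        alice.put("name", "Alice White");
        alice.put("age", 31);
        List<Map<String, Object>> comments = new ArrayList<>();
        comments.add(john);
        comments.add(alice);
        return comments;
    }

    public static void main(String[] args) {
        Map<String, List<Object>> flat = flatten("comments", sampleComments());
        Map<String, Object> query = new HashMap<>();
        query.put("comments.name", "John Smith");
        query.put("comments.age", 31);
        // Matches even though no single comment is both "John Smith" and 31.
        System.out.println(matchesFlattened(flat, query)); // prints "true"
    }
}
```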

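This is the problem that nested objects are designed to solve. By mapping the comments field as type nested instead of type object, each nested object is indexed as a hidden separate document, something like this: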

{ <1>
  "comments.name":    [ john, smith ],
  "comments.comment": [ article, great ],
  "comments.age":     [ 28 ],
  "comments.stars":   [ 4 ],
  "comments.date":    [ 2014-09-01 ]
}
{ <2>
  "comments.name":    [ alice, white ],
  "comments.comment": [ like, more, please, this ],
  "comments.age":     [ 31 ],
  "comments.stars":   [ 5 ],
  "comments.date":    [ 2014-10-22 ]
}
{ <3>
  "title":            [ eggs, nest ],
  "body":             [ making, money, work, your ],
  "tags":             [ cash, shares ]
}
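
<1> The first nested object

<2> The second nested object

<3> The root or parent document

Because each nested object is indexed separately, the fields within an object keep their relationship to one another. A query can be written so that it matches only when all of its conditions are met within the same nested object.

Not only that: because of the way nested objects are indexed, joining the nested documents to the root document at query time is fast, almost as fast as querying a single document.

These extra nested documents are hidden; we cannot access them directly. To add, update, or remove a nested object, we have to reindex the whole document. Keep in mind that a search request returns the whole document, not just the matching nested object.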
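
To make the difference from flattened matching concrete, here is a small in-memory sketch of nested-style matching, where every condition has to hold inside one and the same object. The class NestedMatchDemo and its helpers are hypothetical names used only for illustration, and values are compared as whole strings rather than analyzed tokens.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Elasticsearch.
final class NestedMatchDemo {

    // True only if a single object satisfies ALL conditions at once.
    static boolean matchesNested(List<Map<String, Object>> objects, Map<String, Object> conditions) {
        for (Map<String, Object> object : objects) {
            boolean allMatch = true;
            for (Map.Entry<String, Object> condition : conditions.entrySet()) {
                if (!condition.getValue().equals(object.get(condition.getKey()))) {
                    allMatch = false;
                    break;
                }
            }
            if (allMatch) {
                return true;
            }
        }
        return false;
    }

    // Builds a name/age map; used both for comments and for query conditions.
    static Map<String, Object> nameAndAge(String name, int age) {
        Map<String, Object> map = new HashMap<>();
        map.put("name", name);
        map.put("age", age);
        return map;
    }

    // The two comments from the blog post example, reduced to name and age.
    static List<Map<String, Object>> sampleComments() {
        List<Map<String, Object>> comments = new ArrayList<>();
        comments.add(nameAndAge("John Smith", 28));
        comments.add(nameAndAge("Alice White", 31));
        return comments;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> comments = sampleComments();
        // No single comment is both "John Smith" and 31.
        System.out.println(matchesNested(comments, nameAndAge("John Smith", 31)));  // prints "false"
        // Alice's comment satisfies both conditions.
        System.out.println(matchesNested(comments, nameAndAge("Alice White", 31))); // prints "true"
    }
}
```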

Example

Creating documents that contain nested objects

Create the mapping

PUT devicelog_22
{
 "mappings" : {
      "log" : {
        "properties" : {
          "Items" : {
            "type":"nested",
            "properties" : {
              "name" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              },
              "unit" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              },
              "value" : {
                "type" : "double"
              }
            }
          },
          "OperationDateTime" : {
            "type" : "date",
            "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis",
            "ignore_malformed":false
          },
          "systemId" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          }
        }
      }
    }
}

Bulk insert the data

POST /devicelog_22/log/_bulk
{"index":{"_id":1}}
{"systemId":"001","OperationDateTime":"2020-03-02 10:34:03","Items":[{"name":"k1","value": 11},{"name":"k2","value": 12},{"dd":"k3","value":33}]}
{"index":{"_id":2}}
{"systemId":"001","OperationDateTime":"2020-03-02 11:34:03","Items":[{"name":"k1","value": 22},{"name":"k2","value": 33},{"dd":"k3","value":44}]}
{"index":{"_id":3}}
{"systemId":"001","OperationDateTime":"2020-03-02 16:34:03","Items":[{"name":"k1","value": 45.3},{"name":"k2","value": 89.333},{"dd":"k3","value":18.33}]}
{"index":{"_id":4}}
{"systemId":"001","OperationDateTime":"2020-03-03 15:34:03","Items":[{"name":"k1","value": 45.3},{"name":"k2","value": 89.333},{"dd":"k3","value":18.33}]}
{"index":{"_id":5}}
{"systemId":"001","OperationDateTime":"2020-03-03 18:34:03","Items":[{"name":"k1","value": 222.3},{"name":"k2","value": 33.333},{"dd":"k3","value":55.33}]}

Aggregation analysis

GET devicelog_22/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "item.name": {
      "nested": {
        "path": "Items"
      },
      "aggs": {
        "terms": {
          "terms": {
            "field": "Items.name.keyword",
            "size": 10
          },
          "aggs": {
            "sum-aggs": {
              "sum": {
                "field": "Items.value"
              }
            }
          }
        }
      }
    }
  }
}
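
As a rough cross-check of what this aggregation computes, the sketch below reproduces the terms plus sum logic in memory over the five bulk-inserted documents. It is an approximation under stated assumptions: the class NestedTermsSumDemo and its members are hypothetical, buckets are ordered by key rather than by document count, and Items entries that only carry "dd" (no "name") are skipped because they have no value for the terms field.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names come from Elasticsearch.
final class NestedTermsSumDemo {

    // One nested Items entry; name is null for entries that only carry "dd".
    static final class Item {
        final String name;
        final double value;

        Item(String name, double value) {
            this.name = name;
            this.value = value;
        }
    }

    // Returns name -> {count of nested entries, sum of value}.
    static Map<String, double[]> termsWithSum(Item[][] documents) {
        Map<String, double[]> buckets = new TreeMap<>();
        for (Item[] items : documents) {
            for (Item item : items) {
                if (item.name == null) {
                    continue; // no value for the terms field, so no bucket
                }
                double[] bucket = buckets.computeIfAbsent(item.name, k -> new double[2]);
                bucket[0] += 1;
                bucket[1] += item.value;
            }
        }
        return buckets;
    }

    // The Items arrays of the five documents inserted with _bulk.
    static Item[][] sampleDocuments() {
        return new Item[][] {
            { new Item("k1", 11), new Item("k2", 12), new Item(null, 33) },
            { new Item("k1", 22), new Item("k2", 33), new Item(null, 44) },
            { new Item("k1", 45.3), new Item("k2", 89.333), new Item(null, 18.33) },
            { new Item("k1", 45.3), new Item("k2", 89.333), new Item(null, 18.33) },
            { new Item("k1", 222.3), new Item("k2", 33.333), new Item(null, 55.33) },
        };
    }

    public static void main(String[] args) {
        for (Map.Entry<String, double[]> bucket : termsWithSum(sampleDocuments()).entrySet()) {
            System.out.println(bucket.getKey()
                + " count=" + (long) bucket.getValue()[0]
                + " sum=" + bucket.getValue()[1]);
        }
    }
}
```

With the sample data, each of k1 and k2 appears in five nested entries, and the sums come to roughly 345.9 for k1 and 256.999 for k2, subject to floating-point rounding.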

Filter query on nested objects

GET devicelog_22/_search
{
  "query": {
    "nested": {
      "path": "Items",
      "query": {
        "bool": {
          "must": [
            {"term": {
              "Items.dd": {
                "value": "k3"
              }
            }}
          ]
        }
      }
    }
  }
}

Java query

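// Imports assume the 6.x high-level REST client; several of these packages moved in 7.x.
import java.util.Map;
import java.util.concurrent.TimeUnit;
import org.apache.lucene.search.join.ScoreMode;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.index.query.NestedQueryBuilder;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.rest.RestStatus;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.histogram.DateHistogramAggregationBuilder;
import org.elasticsearch.search.aggregations.bucket.histogram.DateHistogramInterval;
import org.elasticsearch.search.aggregations.bucket.histogram.Histogram;
import org.elasticsearch.search.aggregations.metrics.tophits.TopHits;
import org.elasticsearch.search.aggregations.metrics.tophits.TopHitsAggregationBuilder;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.sort.SortBuilders;
import org.elasticsearch.search.sort.SortOrder;
import org.joda.time.DateTimeZone;
import org.junit.Test;

    // "client" (a RestHighLevelClient) and EsDocumentList are defined elsewhere in the author's project.
    @Test
    public void test2() {
        try {
            SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

            // Match documents that have at least one nested Items entry whose name is "k1".
            QueryBuilder queryBuilderHave = QueryBuilders.termQuery("Items.name", "k1");
            NestedQueryBuilder nestedQueryBuilder =
                    QueryBuilders.nestedQuery("Items", queryBuilderHave, ScoreMode.None);
            QueryBuilder queryBuilder = QueryBuilders.boolQuery().must(nestedQueryBuilder);

            // Within each bucket, keep only the most recent document.
            TopHitsAggregationBuilder aggregation1 = AggregationBuilders.topHits("details").size(1)
                    .sort(SortBuilders.fieldSort("OperationDateTime").order(SortOrder.DESC)).fetchSource(true);

            // One bucket per day, with day boundaries taken in UTC+8.
            DateHistogramAggregationBuilder aggregation = AggregationBuilders.dateHistogram("agg")
                    .keyed(true).format("yyyy-MM-dd").field("OperationDateTime")
                    .dateHistogramInterval(DateHistogramInterval.DAY)
                    .timeZone(DateTimeZone.forOffsetHours(8)).subAggregation(aggregation1);

            // size(0): only the aggregation results are needed, not the top-level hits.
            searchSourceBuilder.size(0).fetchSource(false)
                    .timeout(new TimeValue(1000, TimeUnit.SECONDS)).query(queryBuilder).aggregation(aggregation);

            // No index is named here, so the request is not restricted to devicelog_22.
            SearchRequest searchRequest = new SearchRequest();
            searchRequest.source(searchSourceBuilder);
            SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
            if (searchResponse.status() == RestStatus.OK) {
                Histogram histogram = searchResponse.getAggregations().get("agg");
                EsDocumentList esDocumentList = new EsDocumentList();
                // Collect the _source of the latest document from every daily bucket.
                for (Histogram.Bucket entry : histogram.getBuckets()) {
                    TopHits topHits = entry.getAggregations().get("details");
                    for (SearchHit hit : topHits.getHits().getHits()) {
                        Map<String, Object> mapResult = hit.getSourceAsMap();
                        esDocumentList.add(mapResult);
                    }
                }
                System.out.println("11");
            }
        } catch (Exception ex) {
            System.out.println(ex);
        }
    }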

 

Posted 2020-04-06 19:30 by 弱水三千12138