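1 Introduction

1.1 Overview

In Elasticsearch, a document carries metadata in addition to its data. Metadata is written whenever a document is created, although some of it is configured when the mapping is created.

Metadata defines how each added document is handled, much like a table schema in a relational database.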

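1.2 Metadata fields

The sections below cover these meta-fields in turn: _all, _field_names, _id, _index, _parent, _routing, _source, _type, and _uid.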

2 _all

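2.1 Overview

The _all field is a catch-all field that concatenates the values of all other fields, separated by spaces. It is analyzed and indexed, but not stored. Use it when you want to return documents that contain a keyword without naming a specific field to search.

Note: _all was deprecated in 6.0.0, and it can no longer be configured on indices created in 7.0 or later.

2.2 Using copy_to to get the same behavior as _all

1) Add the mapping

The full_content field receives the values copied from the other three fields.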

PUT mytest2
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "copy_to": "full_content"
      },
      "name": {
        "type": "text",
        "copy_to": "full_content"
      },
      "desc": {
        "type": "text",
        "copy_to": "full_content"
      },
      "full_content": {
        "type": "text"
      }
    }
  }
}

2) Add data

PUT mytest2/_doc/1
{
  "title":"JAVA a",
  "name":"JON",
  "desc":"JAVA niu bi"
}

3) Search for documents containing JAVA

GET mytest2/_search
{
  "query": {
    "match": {
      "full_content": "JAVA"
    }
  }
}
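
To make the effect of copy_to concrete, here is a minimal Python sketch that models it with a toy inverted index. It is not how Elasticsearch is implemented: the helper names (analyze, index_document, match) and the lowercase whitespace tokenizer are assumptions made purely for illustration. The point is that tokens from title, name, and desc are also indexed under full_content, so a single match query on full_content finds any of them.

```python
from collections import defaultdict

# Toy mapping: source field -> copy_to target (mirrors the mytest2 mapping).
COPY_TO = {"title": "full_content", "name": "full_content", "desc": "full_content"}


def analyze(text):
    """Very rough stand-in for an analyzer: lowercase + whitespace split."""
    return text.lower().split()


def index_document(inverted_index, doc_id, source):
    """Index each field, and also index its tokens under the copy_to target."""
    for field, value in source.items():
        targets = [field]
        if field in COPY_TO:
            targets.append(COPY_TO[field])
        for target in targets:
            for token in analyze(value):
                inverted_index[(target, token)].add(doc_id)


def match(inverted_index, field, text):
    """Return ids of documents containing any token of `text` in `field`."""
    hits = set()
    for token in analyze(text):
        hits |= inverted_index.get((field, token), set())
    return hits


index = defaultdict(set)
index_document(index, "1", {"title": "JAVA a", "name": "JON", "desc": "JAVA niu bi"})

print(match(index, "full_content", "JAVA"))  # {'1'}
print(match(index, "full_content", "JON"))   # {'1'}, token came from `name`
print(match(index, "title", "JON"))          # set(), `title` itself is unchanged
```

In the real feature the copied values are only indexed; the _source of the document is not modified, so full_content does not appear in search hits.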

3 _field_names

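3.1 Overview

The _field_names field indexes the names of all fields in a document that hold a non-null value. It is mainly used by the exists query.

Note: _field_names adds some index-time overhead, so if indexing speed matters and exists queries are not needed, disabling it may be worth considering. Newer releases have reduced this overhead and deprecated the option, so check the documentation for your version first.

3.2 Examples

1) Add data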

PUT mytest2/_doc/2
{
  "title": "This is a document"
}

PUT mytest2/_doc/3
{
  "title": "This is another document",
  "body": "This document has a body"
}

2) Query with exists

Find documents whose body field is not null

GET mytest2/_search
{
  "query": {
    "exists": {
      "field": "body"
    }
  }
}

Query result

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "mytest2",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "title" : "This is another document",
          "body" : "This document has a body"
        }
      }
    ]
  }
}
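
The exists query behaves like a filter over the names of fields that hold a non-null value. The following is a minimal Python sketch of that idea under a simplified model; the helper names field_names and exists, and the extra document with id 4, are hypothetical, and the real rules (for example around null_value or arrays containing nulls) are more detailed than shown here.

```python
def field_names(source):
    """Names of fields whose value is not null (None) and not an empty list."""
    return {name for name, value in source.items() if value is not None and value != []}


def exists(docs, field):
    """Ids of documents that have a non-null value for `field`."""
    return sorted(doc_id for doc_id, source in docs.items() if field in field_names(source))


docs = {
    "2": {"title": "This is a document"},
    "3": {"title": "This is another document", "body": "This document has a body"},
    "4": {"title": "No body here", "body": None},  # hypothetical extra document
}

print(exists(docs, "body"))   # ['3']
print(exists(docs, "title"))  # ['2', '3', '4']
```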

4 _id

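4.1 Overview

Every document has an _id that uniquely identifies it within its index, so the document can be looked up with the GET API or the ids query.

If no _id is supplied, Elasticsearch generates an ID string automatically.

The _id field can be queried with the following query types: term, terms, match, query_string, and simple_query_string.

4.2 Examples

1) Add data, specifying 3a as the id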

PUT mytest2/_doc/3a
{
  "title": "aaaaaaaa",
  "body": "bbbbbbbbbbbbbbbb"
}

2) Query

GET mytest2/_search
{
  "query": {
    "term": {
      "_id": {
        "value": "3a"
      }
    }
  }
}

Query result

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "mytest2",
        "_type" : "_doc",
        "_id" : "3a",
        "_score" : 1.0,
        "_source" : {
          "title" : "aaaaaaaa",
          "body" : "bbbbbbbbbbbbbbbb"
        }
      }
    ]
  }
}

5 _index

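5.1 Overview

When searching across multiple indices, it is sometimes useful to restrict a query to documents from specific indices. The _index field makes this possible: the index name can be used in term and terms queries, aggregations, scripts, and sorting.

_index is a virtual field; it is not actually added to the Lucene index. It supports term and terms queries (and therefore match, query_string, and simple_query_string as well), but not prefix, wildcard, regexp, or fuzzy queries. Newer releases have relaxed this restriction for some of these query types.

5.2 Examples

1) Add data to two indices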

PUT mytest3/_doc/1
{
  "text": "Document in index 1"
}

PUT mytest4/_doc/2
{
  "text": "Document in index 2"
}

2) Query

Query, aggregate, and sort on the index name, and use a script to add a field

GET mytest3,mytest4/_search
{
  "query": {
    "terms": {
      "_index": ["mytest3", "mytest4"] 
    }
  },
  "aggs": {
    "indices": {
      "terms": {
        "field": "_index", 
        "size": 10
      }
    }
  },
  "sort": [
    {
      "_index": { 
        "order": "asc"
      }
    }
  ],
  "script_fields": {
    "index_name": {
      "script": {
        "lang": "painless",
        "inline": "doc['_index']" 
      }
    }
  }
}

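Query result. The deprecation warning on the first line is emitted because the script in the request uses the inline key; source is the current name for it.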

#! Deprecation: Deprecated field [inline] used, expected [source] instead
{
  "took" : 15,
  "timed_out" : false,
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "mytest3",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : null,
        "fields" : {
          "index_name" : [
            "mytest3"
          ]
        },
        "sort" : [
          "mytest3"
        ]
      },
      {
        "_index" : "mytest4",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : null,
        "fields" : {
          "index_name" : [
            "mytest4"
          ]
        },
        "sort" : [
          "mytest4"
        ]
      }
    ]
  },
  "aggregations" : {
    "indices" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "mytest3",
          "doc_count" : 1
        },
        {
          "key" : "mytest4",
          "doc_count" : 1
        }
      ]
    }
  }
}

6 _parent

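6.1 Overview

_parent establishes a parent-child relationship between documents in the same index. It was removed in 6.0 in favor of the join field type.

6.2 Examples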

7 _routing

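7.1 Overview

The routing value. Elasticsearch uses the following formula to decide which shard a document is assigned to:

shard_num = hash(_routing) % num_primary_shards

By default, the _routing value is the document's _id. If the document has a parent, the parent's ID (_parent) is used instead.

A custom routing scheme can be implemented by supplying a custom routing value for each document.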
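
The routing formula can be tried out with a short Python sketch. Elasticsearch uses its own hash function internally; zlib.crc32 below is only a stand-in, and the function name shard_for and the shard count of 5 are assumptions for illustration, so the shard numbers printed here will not match a real cluster.

```python
import zlib


def shard_for(routing, num_primary_shards):
    """shard_num = hash(_routing) % num_primary_shards, with CRC32 as a stand-in hash."""
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


NUM_PRIMARY_SHARDS = 5  # hypothetical index setting

# Without a custom routing value, the document _id is used as the routing value.
for doc_id in ["1", "2", "3a"]:
    print(doc_id, "->", shard_for(doc_id, NUM_PRIMARY_SHARDS))

# With routing=user1, both documents are guaranteed to share a shard.
shard_a = shard_for("user1", NUM_PRIMARY_SHARDS)
shard_b = shard_for("user1", NUM_PRIMARY_SHARDS)
print(shard_a == shard_b)  # True
```

Because the number of primary shards is the modulus, the same routing value can map to a different shard if that number changes.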

7.2 Examples

1) Add data and specify a routing value

PUT example/_doc/1?routing=user1
{
    "id":1, 
    "name": "username1"
}

PUT example/_doc/2?routing=user1
{
    "id":1, 
    "name": "username1"
}

2) Query by routing value

GET example/_search
{
    "query": {
        "match": {
            "_routing":"user1"
        }
    }
}

Result

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 0.18232156,
    "hits" : [
      {
        "_index" : "example",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.18232156,
        "_routing" : "user1",
        "_source" : {
          "id" : 1,
          "name" : "username1"
        }
      },
      {
        "_index" : "example",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.18232156,
        "_routing" : "user1",
        "_source" : {
          "id" : 1,
          "name" : "username1"
        }
      }
    ]
  }
}

8 _source

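8.1 Overview

_source stores the original value of the document as it was indexed. It is enabled by default and can be disabled.

In general, do not disable it unless you are sure you can do without the following:

The update, update_by_query, and reindex APIs
Highlighting
Backing up data, changing the mapping, or upgrading the index
Debugging queries or aggregations by inspecting the original fields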
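
Why these operations depend on _source can be illustrated with a toy model in Python. This is a sketch under stated assumptions, not Elasticsearch's implementation: the ToyIndex class and its methods are hypothetical. It models a partial update as "read the original document, merge the change, index the result again", which is impossible once the original document is no longer kept.

```python
class ToyIndex:
    """Toy model of an index that may or may not keep the original JSON."""

    def __init__(self, source_enabled=True):
        self.source_enabled = source_enabled
        self.sources = {}   # _id -> original document (only if _source is enabled)
        self.indexed = {}   # _id -> set of field names (stand-in for index structures)

    def index(self, doc_id, doc):
        self.indexed[doc_id] = set(doc)
        if self.source_enabled:
            self.sources[doc_id] = dict(doc)

    def update(self, doc_id, partial):
        """Partial update: read the old _source, merge, then index the result again."""
        if not self.source_enabled:
            raise RuntimeError("cannot update: _source is disabled")
        merged = {**self.sources[doc_id], **partial}
        self.index(doc_id, merged)
        return merged


with_source = ToyIndex(source_enabled=True)
with_source.index("1", {"id": 1, "name": "username1"})
print(with_source.update("1", {"name": "renamed"}))  # {'id': 1, 'name': 'renamed'}

without_source = ToyIndex(source_enabled=False)
without_source.index("1", {"id": 1, "name": "username1"})
try:
    without_source.update("1", {"name": "renamed"})
except RuntimeError as err:
    print(err)  # cannot update: _source is disabled
```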

8.2 Examples

1) Add data

PUT source-example/_doc/1
{
    "id":1, 
    "name": "username1"
}

2) Query

GET source-example/_search
{
    "query": {
        "match": {
            "id":"1"
        }
    }
}

Query result; _source holds the original data that was added

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "source-example",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "id" : 1,
          "name" : "username1"
        }
      }
    ]
  }
}

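9 _type

Every indexed document has a _type field, which can be used in queries, aggregations, scripts, and sorting. Mapping types have since been phased out, which is why every response in this article shows the default type _doc.

10 _uid

Every document has a _type and an _id. In older versions the _id field was not indexed itself, but its value could be reached through the _uid field. The _id acts as the document's identity number.
In those versions _id could be used in certain queries, such as term, terms, query_string, and simple_query_string, but not in aggregations, scripts, or sorting, where _uid had to be used instead. _uid was deprecated in 6.0.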
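
For reference, older versions built the _uid value by joining the type and the id with a # character. A minimal Python sketch of that composition follows; the function names make_uid and split_uid and the type name blog are hypothetical and only serve the illustration.

```python
def make_uid(doc_type, doc_id):
    """Compose a _uid value in the type#id form used by older versions."""
    return f"{doc_type}#{doc_id}"


def split_uid(uid):
    """Recover (type, id); split on the first '#' only, since an id may contain '#'."""
    doc_type, doc_id = uid.split("#", 1)
    return doc_type, doc_id


uid = make_uid("blog", "3a")
print(uid)             # blog#3a
print(split_uid(uid))  # ('blog', '3a')
```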

Posted on 2023-02-13 15:30 by 金天黑日
