Elasticsearch (2): Basic Document CRUD and Bulk Operations

Based on the course 《Elasticsearch核心技术与实战》 (Elasticsearch Core Technology and Practice).


## Create a document

Two ways are supported: letting Elasticsearch generate the document `_id`, or specifying the `_id` yourself.

* Calling `POST index_name/_doc` makes the system generate the document `_id` automatically.

```
# Create a document; the _id is generated automatically
POST users/_doc
{
  "user" : "Mike",
  "post_date" : "2019-04-15T14:12:12",
  "message" : "trying out Kibana"
}
```

```
# Response
{
  "_index" : "users",
  "_type" : "_doc",
  "_id" : "TyPHr20BkakgvNgYZu2L",   # auto-generated document _id
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 1,
  "_primary_term" : 1
}
```

* With `PUT index_name/_create/_id` or `PUT index_name/_doc/_id?op_type=create`, the create operation is requested explicitly in the URI. If a document with that `_id` already exists, the operation fails.

```
# 1. Create a document with a given _id; fails if the _id already exists
PUT users/_create/1
{
  "user" : "Jack",
  "post_date" : "2019-05-15T14:12:12",
  "message" : "trying out Elasticsearch"
}

# 2. Create a document with a given _id; fails if the _id already exists
PUT users/_doc/1?op_type=create
{
  "user" : "Jack",
  "post_date" : "2019-05-15T14:12:12",
  "message" : "trying out Elasticsearch"
}
```

```
# Error returned when the _id already exists
{
  "error": {
    "root_cause": [
      {
        "type": "version_conflict_engine_exception",
        "reason": "[1]: version conflict, document already exists (current version [1])",
        "index_uuid": "ohLNyzUmTv6cm-Ih9kH0bw",
        "shard": "0",
        "index": "users"
      }
    ],
    "type": "version_conflict_engine_exception",
    "reason": "[1]: version conflict, document already exists (current version [1])",
    "index_uuid": "ohLNyzUmTv6cm-Ih9kH0bw",
    "shard": "0",
    "index": "users"
  },
  "status": 409
}
```

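## Index a document

Index differs from Create: if the document does not exist, a new document is indexed. Otherwise the existing document is deleted, the new document is indexed in its place, and the version is incremented by 1. Use `PUT index_name/_doc/_id`.

```
PUT users/_doc/1
{
  "user" : "Mike"
}
```

```
# Response
{
  "_index" : "users",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 3,   # version incremented
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 4,
  "_primary_term" : 2
}
```
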
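## Update a document

From the caller's point of view, Update does not replace the whole document: the fields given under `doc` are merged into the existing `_source` (internally Elasticsearch still reindexes the merged document as a new version). The document being updated must already exist, and the changes must be wrapped in `doc`.

```
# Update API
POST index_name/_update/_id
{
  "doc": {
    "field1": "value1",
    "field2": "value2"
  }
}
```

```
# Update the document with _id=1
POST users/_update/1
{
  "doc": {
    "post_date" : "2019-05-15T14:12:12",
    "message" : "trying out Elasticsearch"
  }
}
```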
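
To make the difference between Create, Index and Update concrete, here is a small in-memory sketch. It is a toy model only, not how Elasticsearch stores documents: the class `ToyIndex` and both exception names are made up for illustration, the merge is shallow (the real Update API merges nested objects recursively), and an update that changes nothing still bumps the version here.

```python
class VersionConflict(Exception):
    """Stands in for the HTTP 409 version_conflict_engine_exception."""


class DocumentMissing(Exception):
    """Stands in for the error returned when updating a missing document."""


class ToyIndex:
    """Hypothetical in-memory model of create / index / update semantics."""

    def __init__(self):
        self.docs = {}      # _id -> _source
        self.versions = {}  # _id -> _version

    def _write(self, doc_id, source):
        self.docs[doc_id] = source
        self.versions[doc_id] = self.versions.get(doc_id, 0) + 1
        return self.versions[doc_id]

    def create(self, doc_id, source):
        # Create fails when the _id is already taken.
        if doc_id in self.docs:
            raise VersionConflict(doc_id)
        return self._write(doc_id, dict(source))

    def index(self, doc_id, source):
        # Index replaces the whole document, or creates it if missing.
        return self._write(doc_id, dict(source))

    def update(self, doc_id, partial):
        # Update requires an existing document and merges fields into it.
        if doc_id not in self.docs:
            raise DocumentMissing(doc_id)
        return self._write(doc_id, {**self.docs[doc_id], **partial})


idx = ToyIndex()
idx.create("1", {"user": "Jack", "message": "trying out Elasticsearch"})
idx.index("1", {"user": "Mike"})                       # "message" is gone now
idx.update("1", {"message": "trying out Elasticsearch"})
print(idx.docs["1"], idx.versions["1"])
# {'user': 'Mike', 'message': 'trying out Elasticsearch'} 3
```

The point to notice is that `index` drops every field not present in the new body, while `update` keeps them.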

## Get a document

Fetch a document by its ID with `GET index_name/_doc/_id`.

```
# Get the document by ID
GET users/_doc/1
```

```
# Response
{
  "_index" : "users",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 2,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "user" : "Jack",
    "post_date" : "2019-05-15T14:12:12",
    "message" : "trying out Elasticsearch"
  }
}
```

## Delete a document

Delete a document by its ID with `DELETE index_name/_doc/_id`.

```
# Delete the document
DELETE users/_doc/1
```

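
Outside the Kibana console, the same single-document calls are plain HTTP requests. The sketch below only builds `urllib.request.Request` objects, so it runs without a cluster. `BASE_URL` and all helper names are assumptions made up for this example; adjust the URL (and add authentication) for a real cluster.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:9200"  # assumption: a local, unsecured node


def build_request(method, path, body=None):
    """Build (but do not send) a JSON request against BASE_URL."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return Request(f"{BASE_URL}/{path}", data=data, headers=headers, method=method)


def create_auto_id(index, doc):
    return build_request("POST", f"{index}/_doc", doc)


def create_with_id(index, doc_id, doc):
    return build_request("PUT", f"{index}/_create/{doc_id}", doc)


def index_doc(index, doc_id, doc):
    return build_request("PUT", f"{index}/_doc/{doc_id}", doc)


def update_doc(index, doc_id, partial):
    # The partial document must be wrapped in "doc".
    return build_request("POST", f"{index}/_update/{doc_id}", {"doc": partial})


def get_doc(index, doc_id):
    return build_request("GET", f"{index}/_doc/{doc_id}")


def delete_doc(index, doc_id):
    return build_request("DELETE", f"{index}/_doc/{doc_id}")


req = update_doc("users", "1", {"message": "trying out Elasticsearch"})
print(req.get_method(), req.full_url)
# POST http://localhost:9200/users/_update/1
```

To actually send a request, pass it to `urllib.request.urlopen(req)` and read the JSON response.
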
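## Bulk operations - bulk

Bulk operations reduce the overhead of network round trips and improve performance.

* A single API call can operate on different indices.
* Four operation types are supported: `index`, `create`, `update`, and `delete`.
* The index can be specified in the URI or in the request body.
* If a single operation fails, the other operations are not affected.
* The response contains the result of every individual operation.
* Do not send too much data in one request. A common recommendation is 1,000 to 5,000 documents per request; if your documents are large, reduce the batch size accordingly. A request size of 5 to 15 MB is recommended. By default a request body cannot exceed 100 MB (the `http.max_content_length` setting); a larger request is rejected with an error.
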
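
The size advice above can be turned into a small batching helper. This is a minimal sketch, assuming every document is written with an `index` action into one index. The function names and the defaults (1,000 documents, 5 MB) are choices made for this example, picked from the lower end of the recommended ranges.

```python
import json


def to_ndjson(lines):
    """Serialize dicts one per line; a bulk body must end with a newline."""
    return "".join(json.dumps(line) + "\n" for line in lines)


def chunk_index_actions(index, docs, max_docs=1000, max_bytes=5 * 1024 * 1024):
    """Yield bulk bodies that stay within max_docs and max_bytes.

    docs is an iterable of (_id, _source) pairs. A single document larger
    than max_bytes is still sent, alone in its own batch.
    """
    batch, batch_docs, batch_bytes = [], 0, 0
    for doc_id, source in docs:
        # One action line followed by one source line per document.
        pair = to_ndjson([{"index": {"_index": index, "_id": doc_id}}, source])
        size = len(pair.encode("utf-8"))
        if batch and (batch_docs + 1 > max_docs or batch_bytes + size > max_bytes):
            yield "".join(batch)
            batch, batch_docs, batch_bytes = [], 0, 0
        batch.append(pair)
        batch_docs += 1
        batch_bytes += size
    if batch:
        yield "".join(batch)


docs = [(str(i), {"field1": f"value{i}"}) for i in range(5)]
for body in chunk_index_actions("test", docs, max_docs=2):
    print(body.count("\n") // 2, "documents in this batch")
```

Each yielded string can be sent as the body of `POST _bulk`. In practice the best batch size depends on the cluster, so treat these limits as a starting point to tune.
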
```
# Bulk operation
POST _bulk
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
{ "delete" : { "_index" : "test", "_id" : "2" } }
{ "create" : { "_index" : "test2", "_id" : "3" } }
{ "field1" : "value3" }
{ "update" : {"_id" : "1", "_index" : "test"} }
{ "doc" : {"field2" : "value2"} }
```

```
# Response
{
  "took" : 227,
  "errors" : false,
  "items" : [
    {
      "index" : {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "1",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 0,
        "_primary_term" : 1,
        "status" : 201
      }
    },
    {
      "delete" : {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "2",
        "_version" : 1,
        "result" : "not_found",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 1,
        "_primary_term" : 1,
        "status" : 404
      }
    },
    {
      "create" : {
        "_index" : "test2",
        "_type" : "_doc",
        "_id" : "3",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 0,
        "_primary_term" : 1,
        "status" : 201
      }
    },
    {
      "update" : {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "1",
        "_version" : 2,
        "result" : "updated",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 2,
        "_primary_term" : 1,
        "status" : 200
      }
    }
  ]
}
```
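
Because one failed operation does not abort the rest, the caller has to check the per-item results. Below is a minimal sketch of that check, assuming failed items carry an `error` object as they do in bulk responses. The helper name `failed_items` is made up, and the third item in the sample is a hypothetical failure added for illustration; it does not come from the response above.

```python
def failed_items(bulk_response):
    """Return (position, action, _id, status) for every item with an error."""
    failures = []
    for position, item in enumerate(bulk_response["items"]):
        # Each item has exactly one key: index, create, update or delete.
        action, result = next(iter(item.items()))
        if "error" in result:
            failures.append((position, action, result["_id"], result["status"]))
    return failures


# Trimmed-down response: the first two items mirror the output above,
# the third is a hypothetical failed create.
sample = {
    "took": 227,
    "errors": True,
    "items": [
        {"index": {"_index": "test", "_id": "1", "result": "created", "status": 201}},
        {"delete": {"_index": "test", "_id": "2", "result": "not_found", "status": 404}},
        {"create": {"_index": "test2", "_id": "3", "status": 409,
                    "error": {"type": "version_conflict_engine_exception"}}},
    ],
}
print(failed_items(sample))  # [(2, 'create', '3', 409)]
```

Note that the delete of a missing document reports status 404 but is not treated as an error, which matches `"errors" : false` in the response shown earlier.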

## Batch reads - mget

mget retrieves multiple documents by a list of document `_id`s.

```
# mget operation
GET /_mget
{
  "docs" : [
    {
      "_index" : "test",
      "_id" : "1"
    },
    {
      "_index" : "test",
      "_id" : "2"
    }
  ]
}

# Specify the index in the URI
GET /test/_mget
{
  "docs" : [
    {
      "_id" : "1"
    },
    {
      "_id" : "2"
    }
  ]
}

GET /_mget
{
  "docs" : [
    {
      "_index" : "test",
      "_id" : "1",
      "_source" : false
    },
    {
      "_index" : "test",
      "_id" : "2",
      "_source" : ["field3", "field4"]
    },
    {
      "_index" : "test",
      "_id" : "3",
      "_source" : {
        "include": ["user"],
        "exclude": ["user.location"]
      }
    }
  ]
}
```

```
# Response
{
  "docs" : [
    {
      "_index" : "test",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 4,
      "_seq_no" : 5,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "field1" : "value1",
        "field2" : "value2"
      }
    },
    {
      "_index" : "test",
      "_type" : "_doc",
      "_id" : "2",
      "found" : false
    }
  ]
}
```
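
An mget response mixes found and missing documents, so client code usually maps each `_id` to its `_source` or to nothing. Here is a minimal sketch; both helper names are made up for this example, and the sample response is a trimmed copy of the one above.

```python
def mget_body(ids, index=None):
    """Build the request body for _mget; omit index when it is in the URI."""
    docs = []
    for doc_id in ids:
        entry = {"_id": doc_id}
        if index is not None:
            entry["_index"] = index
        docs.append(entry)
    return {"docs": docs}


def sources_by_id(mget_response):
    """Map each _id to its _source, or to None when the document is missing."""
    return {
        doc["_id"]: (doc.get("_source") if doc.get("found") else None)
        for doc in mget_response["docs"]
    }


response = {
    "docs": [
        {"_index": "test", "_id": "1", "found": True,
         "_source": {"field1": "value1", "field2": "value2"}},
        {"_index": "test", "_id": "2", "found": False},
    ]
}
print(mget_body(["1", "2"], index="test"))
print(sources_by_id(response))
# {'1': {'field1': 'value1', 'field2': 'value2'}, '2': None}
```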



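## Batch searches - msearch

msearch runs several searches in one request and returns the documents matching each query. Every search consists of two lines: a header line, which may name the index or be `{}` to use the index from the URI, followed by a line with the query body.

```
POST kibana_sample_data_ecommerce/_msearch
{}
{"query" : {"match_all" : {}},"size":1}
{"index" : "kibana_sample_data_flights"}
{"query" : {"match_all" : {}},"size":2}
```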
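
Like bulk, the msearch body is newline-delimited JSON, which is easy to get wrong when written by hand. The sketch below builds it from (header, body) pairs. The helper name `msearch_body` is made up, and the `size` values are only illustrative.

```python
import json


def msearch_body(searches):
    """Serialize (header, body) pairs into an NDJSON msearch request body."""
    lines = []
    for header, body in searches:
        lines.append(json.dumps(header))  # {} means "use the index in the URI"
        lines.append(json.dumps(body))
    return "\n".join(lines) + "\n"  # the body must end with a newline


body = msearch_body([
    ({}, {"query": {"match_all": {}}, "size": 1}),
    ({"index": "kibana_sample_data_flights"}, {"query": {"match_all": {}}, "size": 2}),
])
print(body, end="")
```

Building the body from pairs guarantees that no search is left with a header but no query line.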




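## Common error responses

Problem | Cause
---|:--
Unable to connect | Network failure or the cluster is down
Connection cannot be closed | Network failure or node error
429 | The cluster is too busy
4xx | The request body is malformed
500 | Internal cluster error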
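
One way to act on this table in client code is to retry only when the cluster reports it is too busy (429), backing off between attempts, and to surface everything else immediately. This is a sketch under that assumption, not a recommended policy for every workload. `call_with_retry` and its parameters are made up for this example, and `send` stands for any function that performs the request and returns the HTTP status code.

```python
import time


def call_with_retry(send, max_retries=5, base=0.5, cap=30.0, sleep=time.sleep):
    """Call send() and retry with exponential backoff while it returns 429.

    Other statuses (2xx, other 4xx, 500) are returned to the caller at once.
    """
    for attempt in range(max_retries + 1):
        status = send()
        if status != 429 or attempt == max_retries:
            return status
        # Wait base, 2*base, 4*base, ... seconds, never more than cap.
        sleep(min(cap, base * 2 ** attempt))


# Simulated cluster: busy twice, then fine. Delays are recorded, not slept.
statuses = iter([429, 429, 200])
delays = []
print(call_with_retry(lambda: next(statuses), sleep=delays.append), delays)
# 200 [0.5, 1.0]
```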
posted @ 2019-10-09 22:40 牧汜