Elasticsearch in Depth: REST APIs, Document APIs, Update API
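The update API (_update) lets you update a document based on a supplied script. The operation gets the document from the index, runs the script (with an optional script language and parameters), and indexes the result; the script can also delete the document or skip the operation. It uses versioning to make sure that no update happened between the get and the reindex.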
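My understanding: the Elasticsearch Update API updates an existing document by replacing or modifying part of it. With the Update API you can:
1. Update part of an existing document instead of the whole document, which cuts network round trips and the amount of data transferred.
2. Avoid scripts for simple field changes by sending a partial document, since scripts can bring performance problems and potential security problems.
3. Create the document when it does not exist.
The basic syntax of a partial update is: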
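POST /{index}/_update/{id}
{ "doc": { "field": "value" } }
where:
1. {index}: the name of the index that holds the document to update.
2. {id}: the ID of the document to update.
3. "doc": the part of the document to update. It can contain one or more fields.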
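The Update API also supports optional parameters, for example the query parameters _source, retry_on_conflict, and routing, and the request-body fields script, upsert, and detect_noop. With the Update API you update part of a document rather than replacing the whole document. Because an _update call needs fewer network round trips than a separate get followed by an index, it can also make updates more efficient.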
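As a minimal sketch of the syntax above, the following Python builds (but does not send) a partial-update request using only the standard library. The helper name build_partial_update and the host http://localhost:9200 are assumptions made for illustration; they are not part of Elasticsearch or of any official client.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_partial_update(base_url, index, doc_id, fields, retry_on_conflict=0):
    """Build a POST /{index}/_update/{id} request carrying a partial document."""
    url = f"{base_url}/{quote(index, safe='')}/_update/{quote(doc_id, safe='')}"
    if retry_on_conflict:
        # retry_on_conflict travels in the query string, not in the body
        url += f"?retry_on_conflict={retry_on_conflict}"
    body = json.dumps({"doc": fields}).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})


# Hypothetical local node; nothing is sent over the network here.
req = build_partial_update("http://localhost:9200", "test", "1",
                           {"name": "new_name"}, retry_on_conflict=3)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the returned object to urllib.request.urlopen would send it, assuming a node is listening at that address.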
https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-update.html
Updates a document using the specified script.
1. Request
https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-update.html#docs-update-api-request
POST /<index>/_update/<_id>
2. Prerequisites
https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-update.html#docs-update-api-prereqs
If the Elasticsearch security features are enabled, you must have the index or write index privilege for the target index or index alias.
3. Description
https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-update.html#update-api-desc
Enables you to script document updates. The script can update, delete, or skip modifying the document. The update API also supports passing a partial document, which is merged into the existing document. To fully replace an existing document, use the index API.
This operation:
- Gets the document (collocated with the shard) from the index.
- Runs the specified script.
- Indexes the result.
The document must still be reindexed, but using update removes some network round trips and reduces the chance of a version conflict between the get and the index operation.
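The get, run script, index sequence and the version check can be sketched with a small in-memory model. This is an illustration only: TinyIndex, VersionConflict, and the script-as-Python-function convention are hypothetical names invented here, and real Elasticsearch runs Painless scripts on the shard rather than Python callables.

```python
import copy


class VersionConflict(Exception):
    """The document changed between the get step and the index step."""


class TinyIndex:
    """Hypothetical in-memory stand-in for one shard; not an Elasticsearch client."""

    def __init__(self):
        self.docs = {}  # doc_id -> (version, source)

    def index(self, doc_id, source, expected_version=None):
        current = self.docs.get(doc_id, (0, None))[0]
        if expected_version is not None and expected_version != current:
            raise VersionConflict(doc_id)
        self.docs[doc_id] = (current + 1, source)

    def update(self, doc_id, script, params, retry_on_conflict=0):
        for attempt in range(retry_on_conflict + 1):
            version, source = self.docs[doc_id]                 # 1. get
            ctx = {"_source": copy.deepcopy(source), "op": "index"}
            script(ctx, params)                                 # 2. run the script
            if ctx["op"] == "noop":
                return "noop"
            if ctx["op"] == "delete":
                del self.docs[doc_id]
                return "deleted"
            try:                                                # 3. index the result
                self.index(doc_id, ctx["_source"], expected_version=version)
                return "updated"
            except VersionConflict:
                if attempt == retry_on_conflict:
                    raise  # retries exhausted, surface the conflict


def increment(ctx, params):
    ctx["_source"]["counter"] += params["count"]


idx = TinyIndex()
idx.index("1", {"counter": 1, "tags": ["red"]})
result = idx.update("1", increment, {"count": 4})
print(result, idx.docs["1"])
```

In this model a conflict is retried from the get step, which is roughly the role the retry_on_conflict parameter plays in the real API.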
The _source field must be enabled to use update. In addition to _source, you can access the following variables through the ctx map: _index, _type, _id, _version, _routing, and _now (the current timestamp).
4. Examples
https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-update.html#update-api-example
First, index a simple document:
curl -X PUT "localhost:9200/test/_doc/1?pretty" -H 'Content-Type: application/json' -d' { "counter" : 1, "tags" : ["red"] }'
To increment the counter, you can submit an update request with the following script:
curl -X POST "localhost:9200/test/_update/1?pretty" -H 'Content-Type: application/json' -d' { "script" : { "source": "ctx._source.counter += params.count", "lang": "painless", "params" : { "count" : 4 } } }'
Similarly, you could use an update script to add a tag to the list of tags (this is just a list, so the tag is added even if it already exists):
curl -X POST "localhost:9200/test/_update/1?pretty" -H 'Content-Type: application/json' -d' { "script": { "source": "ctx._source.tags.add(params.tag)", "lang": "painless", "params": { "tag": "blue" } } }'
You could also remove a tag from the list of tags. The Painless function to remove a tag takes the array index of the element you want to remove. To avoid a possible runtime error, you first need to make sure the tag exists. If the list contains duplicates of the tag, this script removes only one occurrence.
curl -X POST "localhost:9200/test/_update/1?pretty" -H 'Content-Type: application/json' -d' { "script": { "source": "if (ctx._source.tags.contains(params.tag)) { ctx._source.tags.remove(ctx._source.tags.indexOf(params.tag)) }", "lang": "painless", "params": { "tag": "blue" } } }'
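The guarded removal in the script above can be mirrored in plain Python to see why only one occurrence disappears. This is an analogy, not Painless: the function name remove_tag is made up for illustration, and Python's list.index and list.pop stand in for the indexOf and remove(int) calls the script uses.

```python
def remove_tag(tags, tag):
    """Remove one occurrence of tag, mirroring the guarded script above."""
    if tag in tags:                 # contains(params.tag)
        position = tags.index(tag)  # indexOf(params.tag): first match only
        tags.pop(position)          # remove(int): delete by position
    return tags


print(remove_tag(["red", "blue", "blue"], "blue"))  # ['red', 'blue']
print(remove_tag(["red"], "green"))                 # ['red']
```

Without the guard, Python's index raises ValueError for a missing tag, which plays the same role as the runtime error the text warns about.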
You can also add and remove fields from a document. For example, this script adds the field new_field:
curl -X POST "localhost:9200/test/_update/1?pretty" -H 'Content-Type: application/json' -d' { "script" : "ctx._source.new_field = \u0027value_of_new_field\u0027" }'
Conversely, this script removes the field new_field:
curl -X POST "localhost:9200/test/_update/1?pretty" -H 'Content-Type: application/json' -d' { "script" : "ctx._source.remove(\u0027new_field\u0027)" }'
The following script removes a subfield from an object field:
curl -X POST "localhost:9200/test/_update/1?pretty" -H 'Content-Type: application/json' -d' { "script": "ctx._source[\u0027my-object\u0027].remove(\u0027my-subfield\u0027)" }'
Instead of updating the document, you can also change the operation that is executed from within the script. For example, this request deletes the document if the tags field contains green; otherwise it does nothing (noop):
curl -X POST "localhost:9200/test/_update/1?pretty" -H 'Content-Type: application/json' -d' { "script": { "source": "if (ctx._source.tags.contains(params.tag)) { ctx.op = \u0027delete\u0027 } else { ctx.op = \u0027noop\u0027 }", "lang": "painless", "params": { "tag": "green" } } }'
5. Update part of a document
https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-update.html#_update_part_of_a_document
The following partial update adds a new field to the existing document:
curl -X POST "localhost:9200/test/_update/1?pretty" -H 'Content-Type: application/json' -d' { "doc": { "name": "new_name" } }'
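How a partial document is merged into the existing one can be sketched as a recursive dictionary merge. The sketch assumes that object fields are merged key by key while scalars and arrays are replaced outright; merge_partial is a hypothetical helper, not an Elasticsearch function.

```python
def merge_partial(existing, partial):
    """Return a new dict with partial merged into existing, recursing into objects."""
    merged = dict(existing)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_partial(merged[key], value)  # object: merge keys
        else:
            merged[key] = value  # scalar or array: replace the old value
    return merged


existing = {"counter": 5, "tags": ["red"], "my-object": {"a": 1}}
updated = merge_partial(existing, {"name": "new_name", "my-object": {"b": 2}})
print(updated)
```

Fields that the partial document does not mention, such as counter and tags here, are left as they were.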
6. Detect noop updates
https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-update.html#_detect_noop_updates
By default, an update that would not change anything is detected as such and returns "result": "noop":
curl -X POST "localhost:9200/test/_update/1?pretty" -H 'Content-Type: application/json' -d' { "doc": { "name": "new_name" } }'
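My understanding: in the Update API, if an update request would not change the document's content, Elasticsearch detects this and returns a JSON response that contains "result": "noop". Here noop means "no operation": Elasticsearch found that nothing actually needs to be written. This is useful for large batches of updates, because unnecessary writes are skipped, which saves time and resources. For example, when updating an index that holds thousands of documents, detecting the documents that need no change helps the job finish sooner and avoids needless overhead. Note that a request can still cause a real write even when the content ends up unchanged. As far as I can tell, this detection applies to partial doc updates; a scripted update is written unless the script itself sets ctx.op to noop.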
If the value of name is already new_name, the update request is ignored and the result element in the response returns noop:
{ "_shards": { "total": 0, "successful": 0, "failed": 0 }, "_index": "test", "_id": "1", "_version": 2, "_primary_term": 1, "_seq_no": 1, "result": "noop" }
You can disable this behavior by setting "detect_noop": false:
curl -X POST "localhost:9200/test/_update/1?pretty" -H 'Content-Type: application/json' -d' { "doc": { "name": "new_name" }, "detect_noop": false }'
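The effect of detect_noop can be modeled in a few lines. This sketch assumes flat documents (no nested objects) so that a shallow merge is enough, and apply_doc is a hypothetical name used only here.

```python
def apply_doc(existing, doc, detect_noop=True):
    """Merge a flat partial doc and report the result the way the response does."""
    merged = {**existing, **doc}  # shallow merge: flat documents only
    if detect_noop and merged == existing:
        return existing, "noop"   # nothing changed, so nothing is written
    return merged, "updated"


stored = {"counter": 5, "name": "new_name"}
print(apply_doc(stored, {"name": "new_name"}))
print(apply_doc(stored, {"name": "new_name"}, detect_noop=False))
print(apply_doc(stored, {"name": "other"}))
```

With detect_noop disabled, the second call reports an update even though the stored content is identical.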
7. Upsert
https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-update.html#upserts
If the document does not already exist, the contents of the upsert element are inserted as a new document. If the document exists, the script is executed:
curl -X POST "localhost:9200/test/_update/1?pretty" -H 'Content-Type: application/json' -d' { "script": { "source": "ctx._source.counter += params.count", "lang": "painless", "params": { "count": 4 } }, "upsert": { "counter": 1 } }'
8. Scripted Upsert
https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-update.html#scripted_upsert
To run the script whether or not the document exists, set scripted_upsert to true:
curl -X POST "localhost:9200/test/_update/1?pretty" -H 'Content-Type: application/json' -d' { "scripted_upsert": true, "script": { "source": "if ( ctx.op == \u0027create\u0027 ) {\n ctx._source.counter = params.count\n} else {\n ctx._source.counter += params.count\n}", "params": { "count": 4 } }, "upsert": {} }'
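The difference between a plain upsert and a scripted upsert can be sketched as a decision on whether the script runs for a missing document. The model below is an assumption-laden illustration: update_with_upsert and counter_script are hypothetical names, None stands for "document does not exist", and the script is a Python function rather than Painless.

```python
import copy


def update_with_upsert(existing, script, params, upsert, scripted_upsert=False):
    """Return the stored source after the update; existing is None if missing."""
    if existing is None:
        if not scripted_upsert:
            return copy.deepcopy(upsert)  # insert upsert as-is, script is skipped
        ctx = {"op": "create", "_source": copy.deepcopy(upsert)}
    else:
        ctx = {"op": "index", "_source": copy.deepcopy(existing)}
    script(ctx, params)
    return ctx["_source"]


def counter_script(ctx, params):
    # Python rendering of the script in the request above
    if ctx["op"] == "create":
        ctx["_source"]["counter"] = params["count"]
    else:
        ctx["_source"]["counter"] += params["count"]


print(update_with_upsert(None, counter_script, {"count": 4}, {"counter": 1}))
print(update_with_upsert({"counter": 1}, counter_script, {"count": 4}, {"counter": 1}))
print(update_with_upsert(None, counter_script, {"count": 4}, {}, scripted_upsert=True))
```

The three calls cover a missing document with a plain upsert, an existing document, and a missing document with scripted_upsert enabled.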
9. Doc as upsert
https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-update.html#doc_as_upsert
Instead of sending a partial doc plus an upsert doc, you can set doc_as_upsert to true to use the contents of doc as the upsert value:
curl -X POST "localhost:9200/test/_update/1?pretty" -H 'Content-Type: application/json' -d' { "doc": { "name": "new_name" }, "doc_as_upsert": true }'
Using ingest pipelines with doc_as_upsert is not supported.
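As a rough model of doc_as_upsert, the sketch below treats a missing document as None and uses a shallow merge for flat documents. The function name apply_doc_as_upsert is hypothetical, and the KeyError merely stands in for the document-missing error that the API returns when neither an upsert nor doc_as_upsert is supplied.

```python
def apply_doc_as_upsert(existing, doc, doc_as_upsert=False):
    """Return the stored source after a partial update of a flat document."""
    if existing is None:
        if not doc_as_upsert:
            # stand-in for the API's document-missing error
            raise KeyError("document missing")
        return dict(doc)          # doc itself becomes the new document
    return {**existing, **doc}    # document exists: ordinary partial update


print(apply_doc_as_upsert(None, {"name": "new_name"}, doc_as_upsert=True))
print(apply_doc_as_upsert({"counter": 5}, {"name": "new_name"}, doc_as_upsert=True))
```

When the document already exists, the flag makes no difference and the partial document is merged as usual.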