Elasticsearch Topics in Depth: REST APIs / Document APIs / Delete by query API

https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-delete-by-query.html

Deletes documents that match the specified query.

        curl -X POST "localhost:9200/my-index-000001/_delete_by_query?pretty" -H 'Content-Type: application/json' -d'
        {
          "query": {
            "match": {
              "user.id": "elkbee"
            }
          }
        }'
    

1、Request

https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-delete-by-query.html#docs-delete-by-query-api-request

        POST /<target>/_delete_by_query
    

2、Prerequisites

https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-delete-by-query.html#docs-delete-by-query-api-prereqs

If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or alias:

    • read
    • delete or write

3、Description

https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-delete-by-query.html#docs-delete-by-query-api-desc

You can specify the query criteria in the request URI or the request body using the same syntax as the Search API.

When you submit a delete by query request, Elasticsearch gets a snapshot of the data stream or index when it begins processing the request and deletes matching documents using internal versioning. If a document changes between the time that the snapshot is taken and the delete operation is processed, it results in a version conflict and the delete operation fails.

My understanding: the snapshot Elasticsearch takes when it starts processing a delete by query is the basis of the delete operation; it reflects the documents that matched the query at that moment. Once the snapshot exists, Elasticsearch starts deleting. If a document is changed while the operation is running, its version no longer matches the version in the snapshot, a version conflict occurs, and the delete operation cannot complete. In that case you may need to rerun the delete. To avoid the problem, you can pause writes to the affected index while the delete runs and resume them afterwards, or set conflicts=proceed so that conflicts are counted instead of aborting the request. Delete by query is a powerful and convenient way to remove unwanted documents, but it carries some risk, so use it with care and know how you will handle version conflicts.

Documents with a version equal to 0 cannot be deleted using delete by query because internal versioning does not support 0 as a valid version number.

While processing a delete by query request, Elasticsearch performs multiple search requests sequentially to find all of the matching documents to delete. A bulk delete request is performed for each batch of matching documents. If a search or bulk request is rejected, the requests are retried up to 10 times, with exponential back off. If the maximum retry limit is reached, processing halts and all failed requests are returned in the response. Any delete requests that completed successfully still stick; they are not rolled back.
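The retry behaviour described above can be pictured with a small Python sketch. It is only an illustration of "retry up to 10 times with exponential back off": the initial delay of 0.5 seconds, the doubling factor, and the helper names `backoff_delays` and `call_with_retries` are assumptions made for this example, not Elasticsearch's actual internal schedule.

```python
import time
from typing import Callable, List


def backoff_delays(initial: float = 0.5, retries: int = 10) -> List[float]:
    """Return the wait before each retry, doubling every time (assumed schedule)."""
    return [initial * (2 ** attempt) for attempt in range(retries)]


def call_with_retries(send: Callable[[], bool],
                      initial: float = 0.5,
                      retries: int = 10,
                      sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call send() once, then retry up to `retries` times while it is rejected.

    send() returns True when the request was accepted and False when it was
    rejected. Returns False if the retry limit is reached.
    """
    if send():
        return True
    for delay in backoff_delays(initial, retries):
        sleep(delay)
        if send():
            return True
    return False


# Waits that would precede the first four retries under these assumptions.
print(backoff_delays(0.5, 4))
```

The `sleep` argument is injectable so that the loop can be exercised without actually waiting.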

You can opt to count version conflicts instead of halting and returning by setting conflicts to proceed. Note that if you opt to count version conflicts, the operation could attempt to delete more documents from the source than max_docs until it has successfully deleted max_docs documents or has gone through every document in the source query.

4、Refreshing shards

https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-delete-by-query.html#_refreshing_shards

Specifying the refresh parameter refreshes all shards involved in the delete by query once the request completes. This is different from the delete API’s refresh parameter, which causes just the shard that received the delete request to be refreshed. Unlike the delete API, it does not support wait_for.

My understanding: in short, the refresh parameter forces a refresh of every shard touched by the delete once the request completes, so the changes become visible to search immediately. The delete API's refresh parameter, by contrast, refreshes only the shard that received the request. Note that delete by query does not support wait_for, so you cannot ask the request to wait for the next scheduled refresh; if you do not set refresh, the deletions may not show up in search results right away.

5、Running delete by query asynchronously

https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-delete-by-query.html#docs-delete-by-query-task-api

If the request contains wait_for_completion=false, Elasticsearch performs some preflight checks, launches the request, and returns a task you can use to cancel or get the status of the task. Elasticsearch creates a record of this task as a document at .tasks/task/${taskId}. When you are done with a task, you should delete the task document so Elasticsearch can reclaim the space.

6、Waiting for active shards

https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-delete-by-query.html#_waiting_for_active_shards

wait_for_active_shards controls how many copies of a shard must be active before proceeding with the request. See Active shards for details. timeout controls how long each write request waits for unavailable shards to become available. Both work exactly the way they work in the Bulk API. Delete by query uses scrolled searches, so you can also specify the scroll parameter to control how long it keeps the search context alive, for example ?scroll=10m. The default is 5 minutes.
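The URL parameters discussed in sections 4 to 6 (refresh, wait_for_completion, timeout, scroll) are all plain query-string parameters on the same endpoint. The following Python sketch only assembles such a URL and does not send anything; the helper name `delete_by_query_url` and the base address `http://localhost:9200` are assumptions for illustration.

```python
from urllib.parse import urlencode


def delete_by_query_url(target: str,
                        base: str = "http://localhost:9200",
                        **params) -> str:
    """Build a _delete_by_query URL; booleans are rendered as true/false."""
    rendered = {
        key: str(value).lower() if isinstance(value, bool) else str(value)
        for key, value in params.items()
    }
    query = urlencode(rendered)
    url = f"{base}/{target}/_delete_by_query"
    return f"{url}?{query}" if query else url


# Refresh all involved shards, run asynchronously, keep the scroll context
# alive for 10 minutes, and wait up to 1 minute for unavailable shards.
print(delete_by_query_url(
    "my-index-000001",
    refresh=True,
    wait_for_completion=False,
    scroll="10m",
    timeout="1m",
))
```

The resulting string can be passed to curl in place of the URLs used in the examples in this post.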

7、Throttling delete requests

https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-delete-by-query.html#docs-delete-by-query-throttle

To control the rate at which delete by query issues batches of delete operations, you can set requests_per_second to any positive decimal number. This pads each batch with a wait time to throttle the rate. Set requests_per_second to -1 to disable throttling.

Throttling uses a wait time between batches so that the internal scroll requests can be given a timeout that takes the request padding into account. The padding time is the difference between the batch size divided by the requests_per_second and the time spent writing. By default the batch size is 1000, so if requests_per_second is set to 500:

        target_time = 1000 / 500 per second = 2 seconds
        wait_time = target_time - write_time = 2 seconds - 0.5 seconds = 1.5 seconds

Since the batch is issued as a single _bulk request, large batch sizes cause Elasticsearch to create many requests and wait before starting the next set. This is "bursty" instead of "smooth".
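The padding arithmetic above is easy to reproduce. Here is a minimal Python sketch of it; the function name `throttle_wait_seconds` is made up for this example, and clamping the result at zero when a write takes longer than the target time is my assumption rather than something stated in the documentation.

```python
def throttle_wait_seconds(batch_size: int,
                          requests_per_second: float,
                          write_seconds: float) -> float:
    """Padding added after one batch so the average rate matches the target."""
    if requests_per_second <= 0:
        # requests_per_second = -1 disables throttling, so there is no padding.
        return 0.0
    target_seconds = batch_size / requests_per_second
    # Assumption: never wait a negative amount if the write was slow.
    return max(0.0, target_seconds - write_seconds)


# Default batch size 1000, requests_per_second=500, write took 0.5 seconds.
print(throttle_wait_seconds(1000, 500, 0.5))  # 1.5
```

With the same rate, a smaller batch size gives shorter and more frequent waits, which is the "smooth" behaviour that large batches lose.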

8、Slicing

https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-delete-by-query.html#docs-delete-by-query-slice

Delete by query supports sliced scroll to parallelize the delete process. This can improve efficiency and provide a convenient way to break the request down into smaller parts.

Setting slices to auto chooses a reasonable number for most data streams and indices. If you’re slicing manually or otherwise tuning automatic slicing, keep in mind that:

    • Query performance is most efficient when the number of slices is equal to the number of shards in the index or backing index. If that number is large (for example, 500), choose a lower number as too many slices hurts performance. Setting slices higher than the number of shards generally does not improve efficiency and adds overhead.

    • Delete performance scales linearly across available resources with the number of slices.

Whether query or delete performance dominates the runtime depends on the documents being deleted and cluster resources.

9、Request body

https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-delete-by-query.html#docs-delete-by-query-api-request-body

query: (Optional, query object) Specifies the documents to delete using the Query DSL.

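10、Response body

https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-delete-by-query.html#docs-delete-by-query-api-response-body

The JSON response looks like this:

        {
          "took" : 147,                 // Milliseconds from the start to the end of the whole operation.
          "timed_out": false,           // Set to true if any request executed during the delete by query timed out.
          "total": 119,                 // Number of documents that were successfully processed.
          "deleted": 119,               // Number of documents that were successfully deleted.
          "batches": 1,                 // Number of scroll responses pulled back by the delete by query.
          "version_conflicts": 0,       // Number of version conflicts that the delete by query hit.
          "noops": 0,                   // Always zero for delete by query. It exists only so that delete by query, update by query, and reindex return responses with the same structure.
          "retries": {                  // Number of retries attempted by delete by query. bulk is the number of bulk actions retried and search is the number of search actions retried.
            "bulk": 0,
            "search": 0
          },
          "throttled_millis": 0,        // Milliseconds the request slept to conform to requests_per_second.
          "requests_per_second": -1.0,  // Number of requests per second effectively executed during the delete by query.
          "throttled_until_millis": 0,  // Should always be zero in a _delete_by_query response. It only has meaning when using the Task API, where it indicates the next time (in milliseconds since epoch) that a throttled request will be executed again to conform to requests_per_second.
          "failures" : [ ]              // Array of failures if there were any unrecoverable errors during the process. If it is non-empty, the request aborted because of those failures. Delete by query is implemented using batches, and any failure causes the entire process to abort, but all failures in the current batch are collected into the array. You can use the conflicts option to prevent the delete by query from aborting on version conflicts.
        }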
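As a rough illustration of how a client might read these fields, the Python sketch below turns a response into a list of warnings. The function name `check_delete_response` and the wording of the messages are invented for this example; only the field names come from the response shown above.

```python
from typing import List


def check_delete_response(response: dict) -> List[str]:
    """Return human-readable warnings for a delete by query response."""
    problems = []
    failures = response.get("failures") or []
    if failures:
        # A non-empty failures array means the request was aborted.
        problems.append(f"{len(failures)} failure(s); the request was aborted")
    if response.get("timed_out"):
        problems.append("at least one request timed out")
    conflicts = response.get("version_conflicts", 0)
    if conflicts:
        problems.append(f"{conflicts} version conflict(s)")
    remaining = response.get("total", 0) - response.get("deleted", 0)
    if remaining > 0:
        problems.append(f"{remaining} matching document(s) were not deleted")
    return problems


# The sample response from this section, without the inline comments.
sample = {
    "took": 147, "timed_out": False, "total": 119, "deleted": 119,
    "batches": 1, "version_conflicts": 0, "noops": 0,
    "retries": {"bulk": 0, "search": 0},
    "throttled_millis": 0, "requests_per_second": -1.0,
    "throttled_until_millis": 0, "failures": [],
}
print(check_delete_response(sample))  # []
```

An empty list means every matched document was deleted without conflicts, timeouts, or failures.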
                    

11、Example

https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-delete-by-query.html#docs-delete-by-query-api-example

11.1、Delete all documents from the my-index-000001 data stream or index:

        curl -X POST "localhost:9200/my-index-000001/_delete_by_query?conflicts=proceed&pretty" -H 'Content-Type: application/json' -d'
        {
          "query": {
            "match_all": {}
          }
        }'
    

11.2、Delete documents from multiple data streams or indices:

        curl -X POST "localhost:9200/my-index-000001,my-index-000002/_delete_by_query?pretty" -H 'Content-Type: application/json' -d'
        {
          "query": {
            "match_all": {}
          }
        }'

11.3、Limit the delete by query operation to shards that match a particular routing value:

        curl -X POST "localhost:9200/my-index-000001/_delete_by_query?routing=1&pretty" -H 'Content-Type: application/json' -d'
        {
          "query": {
            "range": {
              "age": {
                "gte": 10
              }
            }
          }
        }'

11.4、By default _delete_by_query uses scroll batches of 1000. You can change the batch size with the scroll_size URL parameter:

        curl -X POST "localhost:9200/my-index-000001/_delete_by_query?scroll_size=5000&pretty" -H 'Content-Type: application/json' -d'
        {
          "query": {
            "term": {
              "user.id": "kimchy"
            }
          }
        }'
    

11.5、Delete a document using a unique attribute:

        curl -X POST "localhost:9200/my-index-000001/_delete_by_query?pretty" -H 'Content-Type: application/json' -d'
        {
          "query": {
            "term": {
              "user.id": "kimchy"
            }
          },
          "max_docs": 1
        }'

12、Slice manually

https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-delete-by-query.html#docs-delete-by-query-manual-slice

Slice a delete by query manually by providing a slice id and total number of slices:

        curl -X POST "localhost:9200/my-index-000001/_delete_by_query?pretty" -H 'Content-Type: application/json' -d'
        {
          "slice": {
            "id": 0,
            "max": 2
          },
          "query": {
            "range": {
              "http.response.bytes": {
                "lt": 2000000
              }
            }
          }
        }'

        curl -X POST "localhost:9200/my-index-000001/_delete_by_query?pretty" -H 'Content-Type: application/json' -d'
        {
          "slice": {
            "id": 1,
            "max": 2
          },
          "query": {
            "range": {
              "http.response.bytes": {
                "lt": 2000000
              }
            }
          }
        }'

You can verify that this works with:

        curl -X GET "localhost:9200/_refresh?pretty"
        curl -X POST "localhost:9200/my-index-000001/_search?size=0&filter_path=hits.total&pretty" -H 'Content-Type: application/json' -d'
        {
          "query": {
            "range": {
              "http.response.bytes": {
                "lt": 2000000
              }
            }
          }
        }'
            
    

This returns a sensible total like this one:

        {
          "hits": {
            "total": {
              "value": 0,
              "relation": "eq"
            }
          }
        }
        

13、Use automatic slicing

https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-delete-by-query.html#docs-delete-by-query-automatic-slice

You can also let delete-by-query automatically parallelize using sliced scroll to slice on _id. Use slices to specify the number of slices to use:

        curl -X POST "localhost:9200/my-index-000001/_delete_by_query?refresh&slices=5&pretty" -H 'Content-Type: application/json' -d'
        {
          "query": {
            "range": {
              "http.response.bytes": {
                "lt": 2000000
              }
            }
          }
        }'
        

You can also verify that this works with:

        curl -X POST "localhost:9200/my-index-000001/_search?size=0&filter_path=hits.total&pretty" -H 'Content-Type: application/json' -d'
        {
          "query": {
            "range": {
              "http.response.bytes": {
                "lt": 2000000
              }
            }
          }
        }'

This returns a sensible total like this one:

        {
          "hits": {
            "total": {
              "value": 0,
              "relation": "eq"
            }
          }
        }
      

Setting slices to auto will let Elasticsearch choose the number of slices to use. This setting will use one slice per shard, up to a certain limit. If there are multiple source data streams or indices, it will choose the number of slices based on the index or backing index with the smallest number of shards.

Adding slices to _delete_by_query creates sub-requests, which has some quirks:

  1. You can see these requests in the Tasks APIs. These sub-requests are "child" tasks of the task for the request with slices.

  2. Fetching the status of the task for the request with slices only contains the status of completed slices.

    My understanding: when you run a delete by query with slices and fetch its task status, the returned information covers only the sub-requests that have completed. Each slice is an independent sub-request running in parallel, so the parent status does not reflect slices that are still running or waiting. To see the full picture, either wait until all sub-requests finish or collect the status of each slice's child task yourself.

  3. These sub-requests are individually addressable for things like cancellation and rethrottling.

  4. Rethrottling the request with slices will rethrottle the unfinished sub-request proportionally.

    My understanding: when you rethrottle a sliced delete by query, the new rate is divided among the sub-requests that have not finished, each receiving a share in proportion to its part of the overall request.

  5. Canceling the request with slices will cancel each sub-request.

  6. Due to the nature of slices each sub-request won’t get a perfectly even portion of the documents. All documents will be addressed, but some slices may be larger than others. Expect larger slices to have a more even distribution.

  7. Parameters like requests_per_second and max_docs on a request with slices are distributed proportionally to each sub-request. Combine that with the point above about distribution being uneven and you should conclude that using max_docs with slices might not result in exactly max_docs documents being deleted.

    My understanding: on a request with slices, parameters such as requests_per_second and max_docs are divided among the sub-requests. If you use max_docs to cap the number of deleted documents, the number actually deleted may therefore not equal the value you specified. The cause is the uneven distribution: a slice that holds fewer matching documents than its share of max_docs deletes fewer than its share, and the other slices do not make up the difference.

  8. Each sub-request gets a slightly different snapshot of the source data stream or index though these are all taken at approximately the same time.

    My understanding: with slices, delete by query runs several sub-requests in parallel, and each takes its own snapshot of the source data stream or index. The snapshots are taken at roughly the same time, but documents may be indexed or deleted in between, so the sub-requests can see slightly different data. Combined with the uneven distribution noted above, this means a sliced delete by query with max_docs may not delete exactly the number of documents you asked for, so choose the parameters carefully.
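To make the max_docs point concrete, here is a small Python sketch. It assumes, purely for illustration, that max_docs is split evenly between the slices using integer division; the exact way Elasticsearch divides the value is not described in this post, and the helper name `deleted_with_slices` is hypothetical.

```python
from typing import List


def deleted_with_slices(matching_per_slice: List[int], max_docs: int) -> int:
    """Total deletions when each slice may delete at most its share of max_docs.

    Assumption: every slice receives max_docs // number_of_slices.
    """
    if not matching_per_slice:
        raise ValueError("at least one slice is required")
    share = max_docs // len(matching_per_slice)
    # A slice cannot delete more documents than it actually matches.
    return sum(min(matching, share) for matching in matching_per_slice)


# Two uneven slices with 700 and 300 matching documents, max_docs=800:
# each slice may delete up to 400, so the total is 400 + 300.
print(deleted_with_slices([700, 300], max_docs=800))  # 700
```

Even though 1000 documents match and max_docs is 800, only 700 are deleted under these assumptions, because the smaller slice cannot use its full share.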

14、Get the status of a delete by query operation

https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-delete-by-query.html#docs-delete-by-query-rethrottle

Use the tasks API to get the status of a delete by query operation:

        curl -X GET "localhost:9200/_tasks?detailed=true&actions=*/delete/byquery&pretty"

The response looks like:

{
    "nodes" : {
      "r1A2WoRbTwKZ516z6NEs5A" : {
        "name" : "r1A2WoR",
        "transport_address" : "127.0.0.1:9300",
        "host" : "127.0.0.1",
        "ip" : "127.0.0.1:9300",
        "attributes" : {
          "testattr" : "test",
          "portsfile" : "true"
        },
        "tasks" : {
          "r1A2WoRbTwKZ516z6NEs5A:36619" : {
            "node" : "r1A2WoRbTwKZ516z6NEs5A",
            "id" : 36619,
            "type" : "transport",
            "action" : "indices:data/write/delete/byquery",
            "status" : {    
              "total" : 6154,
              "updated" : 0,
              "created" : 0,
              "deleted" : 3500,
              "batches" : 36,
              "version_conflicts" : 0,
              "noops" : 0,
              "retries": 0,
              "throttled_millis": 0
            },
            "description" : ""
          }
        }
      }
    }
  }
  

The status object contains the actual status. It is just like the response JSON with the important addition of the total field. total is the total number of operations that the delete by query expects to perform. You can estimate the progress by adding the updated, created, and deleted fields. The request will finish when their sum is equal to the total field.
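The progress estimate described above amounts to one division. A minimal Python sketch follows; the function name `progress` is made up for this example, and treating a total of 0 as "complete" is my own choice.

```python
def progress(status: dict) -> float:
    """Fraction of the expected operations that have been performed so far."""
    total = status.get("total", 0)
    if total == 0:
        # Nothing to do; treated as complete here (an assumption).
        return 1.0
    done = (status.get("updated", 0)
            + status.get("created", 0)
            + status.get("deleted", 0))
    return done / total


# The status object from the task response shown above.
status = {
    "total": 6154, "updated": 0, "created": 0, "deleted": 3500,
    "batches": 36, "version_conflicts": 0, "noops": 0,
    "retries": 0, "throttled_millis": 0,
}
print(f"{progress(status):.1%} done")
```

Note that with conflicts=proceed, documents skipped because of version conflicts are not counted in deleted, so the sum may stay below total.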

With the task id you can look up the task directly:

        curl -X GET "localhost:9200/_tasks/r1A2WoRbTwKZ516z6NEs5A:36619?pretty"

The advantage of this API is that it integrates with wait_for_completion=false to transparently return the status of completed tasks. If the task is completed and wait_for_completion=false was set on it, then it’ll come back with a results or an error field. The cost of this feature is the document that wait_for_completion=false creates at .tasks/task/${taskId}. It is up to you to delete that document.

15、Cancel a delete by query operation

https://www.elastic.co/guide/en/elasticsearch/reference/8.8/docs-delete-by-query.html#docs-delete-by-query-cancel-task-api

Any delete by query can be canceled using the task cancel API:

        curl -X POST "localhost:9200/_tasks/r1A2WoRbTwKZ516z6NEs5A:36619/_cancel?pretty"

The task ID can be found using the tasks API.

Cancellation should happen quickly but might take a few seconds. The task status API above will continue to list the delete by query task until this task checks that it has been cancelled and terminates itself.

My understanding: cancellation is usually fast but can take a few seconds. After you cancel a delete by query task, it needs a moment to notice the cancellation, terminate, and clean up, and the task status API keeps listing it until then. So after cancelling, you can poll the task status API to check whether the task has actually terminated.
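The polling idea can be sketched in Python as follows. The sketch makes no HTTP calls: `list_task_ids` stands in for whatever function you use to call the tasks API, and both helper names (`task_ids`, `wait_until_gone`), the attempt count, and the interval are assumptions chosen for illustration. `task_ids` relies on the nodes/tasks layout of the tasks API response shown in section 14.

```python
import time
from typing import Callable, Iterable, List


def task_ids(tasks_response: dict) -> List[str]:
    """Collect the task ids from a tasks API response (nodes -> tasks -> id)."""
    return [
        task_id
        for node in tasks_response.get("nodes", {}).values()
        for task_id in node.get("tasks", {})
    ]


def wait_until_gone(list_task_ids: Callable[[], Iterable[str]],
                    task_id: str,
                    attempts: int = 10,
                    interval: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until task_id is no longer listed; False if it is still there."""
    for attempt in range(attempts):
        if task_id not in set(list_task_ids()):
            return True
        if attempt < attempts - 1:
            sleep(interval)
    return False


# Stand-in for two consecutive tasks API responses after a cancel request.
fake_responses = iter([
    {"nodes": {"r1A2WoRbTwKZ516z6NEs5A": {"tasks": {"r1A2WoRbTwKZ516z6NEs5A:36619": {}}}}},
    {"nodes": {}},
])
gone = wait_until_gone(lambda: task_ids(next(fake_responses)),
                       "r1A2WoRbTwKZ516z6NEs5A:36619",
                       sleep=lambda seconds: None)
print(gone)  # True
```

In real use, `list_task_ids` would query `_tasks?actions=*/delete/byquery` and pass the parsed JSON to `task_ids`.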

posted @ 2023-06-02 17:11  左扬