Elasticsearch In-Depth Series: REST APIs / Cluster APIs / Cluster allocation explain API (the API that explains how nodes are chosen for a shard)

https://www.elastic.co/guide/en/elasticsearch/reference/8.8/cluster-allocation-explain.html#cluster-allocation-explain

Provides an explanation for a shard’s current allocation.

curl -X GET "localhost:9200/_cluster/allocation/explain?pretty" -H 'Content-Type: application/json' -d'
        {
          "index": "my-index-000001",
          "shard": 0,
          "primary": false,
          "current_node": "my-node"
        }'

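My understanding: the cluster allocation explain API (documented here for Elasticsearch 8.8) explains the current allocation of one specific shard. The response shows the shard's state, the node it currently sits on if it is assigned, and the allocation decision made for each node. Cluster administrators can use it to diagnose why a shard is unassigned or why it stays where it is, which helps when tracking down cluster health problems.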

1. Request

https://www.elastic.co/guide/en/elasticsearch/reference/8.8/cluster-allocation-explain.html#cluster-allocation-explain-api-request

        GET  _cluster/allocation/explain
        POST _cluster/allocation/explain
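
The same request can be assembled from Python's standard library. The following is only a sketch: the helper name `build_explain_request` is hypothetical, the base URL `http://localhost:9200` is assumed from the curl example above, and the request is built but not sent, so it runs without a cluster.

```python
import json
import urllib.request


def build_explain_request(index, shard, primary, current_node=None,
                          base_url="http://localhost:9200"):
    """Build (but do not send) a POST to _cluster/allocation/explain.

    The helper name and the default base_url are assumptions made for
    this sketch; the body fields mirror the curl example above.
    """
    body = {"index": index, "shard": shard, "primary": primary}
    if current_node is not None:
        # Only include current_node when the caller names one.
        body["current_node"] = current_node
    return urllib.request.Request(
        url=f"{base_url}/_cluster/allocation/explain",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_explain_request("my-index-000001", 0, primary=False,
                                current_node="my-node")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually send it (needs a reachable cluster):
#   with urllib.request.urlopen(request) as response:
#       print(json.load(response))
```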

2. Description

https://www.elastic.co/guide/en/elasticsearch/reference/8.8/cluster-allocation-explain.html#cluster-allocation-explain-api-desc

The purpose of the cluster allocation explain API is to provide explanations for shard allocations in the cluster. For unassigned shards, the explain API provides an explanation for why the shard is unassigned. For assigned shards, the explain API provides an explanation for why the shard is remaining on its current node and has not moved or rebalanced to another node. This API can be very useful when attempting to diagnose why a shard is unassigned or why a shard continues to remain on its current node when you might expect otherwise.

3. Unassigned primary shard

https://www.elastic.co/guide/en/elasticsearch/reference/8.8/cluster-allocation-explain.html#_unassigned_primary_shard

The following request gets an allocation explanation for an unassigned primary shard.

curl -X GET "localhost:9200/_cluster/allocation/explain?pretty" -H 'Content-Type: application/json' -d'
        {
          "index": "my-index-000001",
          "shard": 0,
          "primary": true
        }'

The API response indicates the shard can only be allocated to a nonexistent node.

    // This shard is unassigned: Elasticsearch has not allocated it to any node.
        {
            "index" : "my-index-000001",
            "shard" : 0,
            "primary" : true,
            "current_state" : "unassigned",        // The current state of the shard.
            "unassigned_info" : {
              "reason" : "INDEX_CREATED",          // The reason the shard originally became unassigned.
              "at" : "2017-01-04T18:08:16.600Z",
              "last_allocation_status" : "no"
            },
            "can_allocate" : "no",                 // Whether the shard can be allocated.
            "allocate_explanation" : "Elasticsearch isn't allowed to allocate this shard to any of the nodes in the cluster. Choose a node to which you expect this shard to be allocated, find this node in the node-by-node explanation, and address the reasons which prevent Elasticsearch from allocating this shard there.",
            "node_allocation_decisions" : [
              {
                "node_id" : "8qt2rY-pT6KNZB3-hGfLnw",
                "node_name" : "node-0",
                "transport_address" : "127.0.0.1:9401",
                "node_attributes" : {},
                "node_decision" : "no",             // Whether the shard can be allocated to this particular node.
                "weight_ranking" : 1,
                "deciders" : [
                  {
                    "decider" : "filter",           // The decider that led to the no decision for this node.
                    "decision" : "NO",
                    "explanation" : "node does not match index setting [index.routing.allocation.include] filters [_name:\"nonexistent_node\"]"   // Why the decider returned a no decision, with a hint pointing to the setting that led to it.
                  }
                ]
              }
            ]
          }
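
When a cluster has many nodes, the node-by-node section gets long, and it can help to pull out only the deciders that said NO. The sketch below assumes the response has already been parsed into a Python dict (the `//` annotations in the sample above are not valid JSON and would have to be removed first). The function name `blocking_deciders` is hypothetical, and `sample` is a trimmed copy of the response shown above.

```python
def blocking_deciders(explanation):
    """Return (node_name, decider, explanation) for every NO decision."""
    blockers = []
    for node in explanation.get("node_allocation_decisions", []):
        # Nodes that could accept the shard may have no "deciders" key at all.
        for decider in node.get("deciders", []):
            if decider.get("decision") == "NO":
                blockers.append(
                    (node["node_name"], decider["decider"], decider["explanation"])
                )
    return blockers


# Trimmed copy of the sample response above, without the annotations.
sample = {
    "index": "my-index-000001",
    "shard": 0,
    "primary": True,
    "current_state": "unassigned",
    "can_allocate": "no",
    "node_allocation_decisions": [
        {
            "node_name": "node-0",
            "node_decision": "no",
            "deciders": [
                {
                    "decider": "filter",
                    "decision": "NO",
                    "explanation": "node does not match index setting "
                                   "[index.routing.allocation.include] filters "
                                   "[_name:\"nonexistent_node\"]",
                }
            ],
        }
    ],
}

for node_name, decider, why in blocking_deciders(sample):
    print(f"{node_name}: {decider} -> {why}")
```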

The following response contains an allocation explanation for an unassigned primary shard that was previously allocated.

{
    "index" : "my-index-000001",
    "shard" : 0,
    "primary" : true,
    "current_state" : "unassigned",
    "unassigned_info" : {
      "reason" : "NODE_LEFT",
      "at" : "2017-01-04T18:03:28.464Z",
      "details" : "node_left[OIWe8UhhThCK0V5XfmdrmQ]",
      "last_allocation_status" : "no_valid_shard_copy"
    },
    "can_allocate" : "no_valid_shard_copy",
    "allocate_explanation" : "Elasticsearch can't allocate this shard because there are no copies of its data in the cluster. Elasticsearch will allocate this shard when a node holding a good copy of its data joins the cluster. If no such node is available, restore this index from a recent snapshot."
}

4. Unassigned replica shard

https://www.elastic.co/guide/en/elasticsearch/reference/8.8/cluster-allocation-explain.html#_unassigned_replica_shard

The following response contains an allocation explanation for a replica that’s unassigned due to delayed allocation.

        {
            "index" : "my-index-000001",
            "shard" : 0,
            "primary" : false,
            "current_state" : "unassigned",
            "unassigned_info" : {
              "reason" : "NODE_LEFT",
              "at" : "2017-01-04T18:53:59.498Z",
              "details" : "node_left[G92ZwuuaRY-9n8_tc-IzEg]",
              "last_allocation_status" : "no_attempt"
            },
            "can_allocate" : "allocation_delayed",
            "allocate_explanation" : "The node containing this shard copy recently left the cluster. Elasticsearch is waiting for it to return. If the node does not return within [%s] then Elasticsearch will allocate this shard to another node. Please wait.",
            "configured_delay" : "1m",                  // The configured delay before allocating a replica shard that is missing because the node holding it left the cluster.
            "configured_delay_in_millis" : 60000,
            "remaining_delay" : "59.8s",                // The remaining delay before allocating the replica shard.
            "remaining_delay_in_millis" : 59824,
            "node_allocation_decisions" : [
              {
                "node_id" : "pmnHu_ooQWCPEFobZGbpWw",
                "node_name" : "node_t2",
                "transport_address" : "127.0.0.1:9402",
                "node_decision" : "yes"
              },
              {
                "node_id" : "3sULLVJrRneSg0EfBB-2Ew",
                "node_name" : "node_t0",
                "transport_address" : "127.0.0.1:9400",
                "node_decision" : "no",
                "store" : {                             // Information about the shard data found on the node.
                  "matching_size" : "4.2kb",
                  "matching_size_in_bytes" : 4325
                },
                "deciders" : [
                  {
                    "decider" : "same_shard",
                    "decision" : "NO",
                    "explanation" : "a copy of this shard is already allocated to this node [[my-index-000001][0], node[3sULLVJrRneSg0EfBB-2Ew], [P], s[STARTED], a[id=eV9P8BN1QPqRc3B4PLx6cg]]"
                  }
                ]
              }
            ]
          }
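
For a delayed replica, the fields of most interest are usually `can_allocate` and `remaining_delay_in_millis`. As a rough sketch, assuming the response is already parsed into a dict, a small helper (the name `remaining_delay_seconds` is hypothetical) can report how long Elasticsearch will keep waiting:

```python
def remaining_delay_seconds(explanation):
    """Return the remaining allocation delay in seconds, or None.

    None means the shard is not in the "allocation_delayed" state,
    so the delay fields are not expected to be present.
    """
    if explanation.get("can_allocate") != "allocation_delayed":
        return None
    return explanation["remaining_delay_in_millis"] / 1000.0


# Values taken from the sample response above.
delayed = {
    "can_allocate": "allocation_delayed",
    "configured_delay_in_millis": 60000,
    "remaining_delay_in_millis": 59824,
}
print(remaining_delay_seconds(delayed))
print(remaining_delay_seconds({"can_allocate": "no"}))
```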

5. Assigned shard

https://www.elastic.co/guide/en/elasticsearch/reference/8.8/cluster-allocation-explain.html#_assigned_shard

The following response contains an allocation explanation for an assigned shard. The response indicates the shard is not allowed to remain on its current node and must be reallocated.

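        // This allocation explanation shows why the shard may not remain on its current node and why it cannot move to another node.
        {
            "index" : "my-index-000001",                // Name of the index the shard belongs to.
            "shard" : 0,                                // Shard ID.
            "primary" : true,                           // Whether this is the primary shard.
            "current_state" : "started",                // Current state of the shard; here it is started.
            "current_node" : {                          // ID, name, and transport address of the node currently holding the shard.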
              "id" : "8lWJeJ7tSoui0bxrwuNhTA",
              "name" : "node_t1",
              "transport_address" : "127.0.0.1:9401"
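            },
            "can_remain_on_current_node" : "no",        // Whether the shard is allowed to remain on its current node.
            "can_remain_decisions" : [                  // The deciders that do not allow the shard to remain on its current node, with their explanations.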
              {
                "decider" : "filter",
                "decision" : "NO",
                "explanation" : "node does not match index setting [index.routing.allocation.include] filters [_name:\"nonexistent_node\"]"
              }
            ],
            "can_move_to_other_node" : "no",            // Whether the shard is allowed to move to another node.
            // Explains why the shard, although it may not remain on its current node, cannot be moved to another node.
            "move_explanation" : "This shard may not remain on its current node, but Elasticsearch isn't allowed to move it to another node. Choose a node to which you expect this shard to be allocated, find this node in the node-by-node explanation, and address the reasons which prevent Elasticsearch from allocating this shard there.",
            "node_allocation_decisions" : [             // The other nodes that were considered for this shard, with the decision for each.
              {
                "node_id" : "_P8olZS8Twax9u6ioN-GGA",
                "node_name" : "node_t0",
                "transport_address" : "127.0.0.1:9400",
                "node_decision" : "no",
                "weight_ranking" : 1,
                "deciders" : [
                  {
                    "decider" : "filter",
                    "decision" : "NO",
                    "explanation" : "node does not match index setting [index.routing.allocation.include] filters [_name:\"nonexistent_node\"]"
                  }
                ]
              }
            ]
          }
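
The fields that matter for an assigned shard can be condensed into a short summary. This is only an illustrative sketch: the function name `describe_assigned_shard` and the shape of the returned dict are assumptions, and `stuck` is a trimmed copy of the sample response above.

```python
def describe_assigned_shard(explanation):
    """Summarise whether an assigned shard can stay put or move elsewhere."""
    return {
        "node": explanation["current_node"]["name"],
        "can_remain": explanation.get("can_remain_on_current_node") == "yes",
        "can_move": explanation.get("can_move_to_other_node") == "yes",
        # Names of the deciders that forbid staying on the current node.
        "remain_blockers": [
            d["decider"] for d in explanation.get("can_remain_decisions", [])
        ],
    }


# Trimmed copy of the sample response above, without the annotations.
stuck = {
    "current_state": "started",
    "current_node": {"id": "8lWJeJ7tSoui0bxrwuNhTA", "name": "node_t1"},
    "can_remain_on_current_node": "no",
    "can_remain_decisions": [{"decider": "filter", "decision": "NO"}],
    "can_move_to_other_node": "no",
}
print(describe_assigned_shard(stuck))
```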

The following response contains an allocation explanation for a shard that must remain on its current node. Moving the shard to another node would not improve cluster balance.

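{
    "index" : "my-index-000001",        // Name of the index the shard belongs to.
    "shard" : 0,                        // Shard number.
    "primary" : true,                   // true for a primary shard, false for a replica.
    "current_state" : "started",        // Current state of the shard; started means the shard is running.
    "current_node" : {                  // The node currently holding the shard: node ID, name, transport address, and weight ranking.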
      "id" : "wLzJm4N4RymDkBYxwWoJsg",
      "name" : "node_t0",
      "transport_address" : "127.0.0.1:9400",
      "weight_ranking" : 1
    },
    "can_remain_on_current_node" : "yes",   // Whether the shard can remain on its current node.
    "can_rebalance_cluster" : "yes",        // Whether rebalancing is allowed on the cluster.
    "can_rebalance_to_other_node" : "no",   // Whether the shard can be rebalanced to another node.

    // Explanatory text: the shard cannot be rebalanced to another node because no node that permits allocation would improve the cluster balance. If you expect the shard to move, find the target node in the node-by-node explanation and address what prevents Elasticsearch from rebalancing the shard there.
    "rebalance_explanation" : "Elasticsearch cannot rebalance this shard to another node since there is no node to which allocation is permitted which would improve the cluster balance. If you expect this shard to be rebalanced to another node, find this node in the node-by-node explanation and address the reasons which prevent Elasticsearch from rebalancing this shard there.",

    // The list of allocation decisions, one entry per candidate node, each with the node ID, name, transport address, node decision, and weight ranking. Here the single entry has the decision worse_balance: moving the shard there would not improve the cluster balance, so that node is not chosen.
    "node_allocation_decisions" : [
      {
        "node_id" : "oE3EGFc8QN-Tdi5FFEprIA",
        "node_name" : "node_t1",
        "transport_address" : "127.0.0.1:9401",
        "node_decision" : "worse_balance",          
        "weight_ranking" : 1
      }
    ]
}
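
On a larger cluster, a tally of `node_decision` values gives a quick view of why rebalancing is not happening. A minimal sketch, assuming the response is already parsed into a dict; the function name `count_node_decisions` is hypothetical.

```python
from collections import Counter


def count_node_decisions(explanation):
    """Count how many nodes returned each node_decision value."""
    return Counter(
        node["node_decision"]
        for node in explanation.get("node_allocation_decisions", [])
    )


# Trimmed copy of the sample response above, without the annotations.
balanced = {
    "can_rebalance_to_other_node": "no",
    "node_allocation_decisions": [
        {"node_name": "node_t1", "node_decision": "worse_balance", "weight_ranking": 1}
    ],
}
print(count_node_decisions(balanced))
```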

6. No arguments

https://www.elastic.co/guide/en/elasticsearch/reference/8.8/cluster-allocation-explain.html#_no_arguments

If you call the API with no arguments, Elasticsearch retrieves an allocation explanation for an arbitrary unassigned primary or replica shard.

        curl -X GET "localhost:9200/_cluster/allocation/explain?pretty"

If the cluster contains no unassigned shards, the API returns a 400 error.
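
A script that polls this endpoint has to treat the 400 as a normal outcome rather than a crash. The sketch below uses only the standard library. The function name `explain_any_unassigned`, the `opener` parameter (there so the call can be exercised without a live cluster), and the base URL are all assumptions. It also simplifies by treating every 400 as "no unassigned shards", although a malformed request would produce a 400 as well.

```python
import json
import urllib.error
import urllib.request


def explain_any_unassigned(base_url="http://localhost:9200",
                           opener=urllib.request.urlopen):
    """Call the explain API with no arguments.

    Returns the parsed explanation, or None when the cluster answers
    with HTTP 400 (assumed here to mean there are no unassigned shards).
    """
    url = f"{base_url}/_cluster/allocation/explain"
    try:
        with opener(url) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 400:
            return None
        raise  # Anything else (401, 500, ...) is a real problem.


# Example against a live cluster (not executed here):
#   explanation = explain_any_unassigned()
#   print("all shards assigned" if explanation is None else explanation["index"])
```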

posted @ 2023-06-24 21:40  左扬