Fixing a Red Elasticsearch Cluster Status

Summary

I. When the cluster turns red, narrow the problem down level by level:

Cluster level: /_cluster/health.
Index level: /_cluster/health?pretty&level=indices.
Shard level: /_cluster/health?pretty&level=shards.
Recovery progress: /_recovery?pretty.
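
The four checks above can be wrapped in two small shell helpers. This is only a sketch: the function names `es_health` and `es_recovery` and the `ES_URL` variable are made up for illustration, and the host is the same `xxxx` placeholder used elsewhere in this post.

```shell
# Assumption: ES_URL points at any node of the cluster (placeholder host).
ES_URL="${ES_URL:-http://xxxx:9200}"

# Print cluster health at the requested level: cluster (default), indices, or shards.
es_health() {
    level="${1:-cluster}"
    curl -s "${ES_URL}/_cluster/health?pretty&level=${level}"
}

# Print ongoing and completed shard recoveries.
es_recovery() {
    curl -s "${ES_URL}/_recovery?pretty"
}

# Usage, from coarse to fine:
#   es_health
#   es_health indices
#   es_health shards
#   es_recovery
```

Going from cluster to index to shard level shows which index, and then which shard, is responsible for the red status.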

II. Troubleshooting unassigned shards

Diagnose first with /_cluster/allocation/explain.
Then try to reassign the shards with /_cluster/reroute.
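
As a rough sketch of these two steps, the helpers below ask the cluster why one particular shard is unassigned and then retry allocations that previously failed. The function names and `ES_URL` are hypothetical; the index name, shard number and primary flag are whatever the health output points at.

```shell
# Assumption: ES_URL points at any node of the cluster (placeholder host).
ES_URL="${ES_URL:-http://xxxx:9200}"

# Ask why one specific shard is unassigned.
# $1 = index name, $2 = shard number, $3 = true for the primary, false for a replica.
es_explain_shard() {
    curl -s -XGET "${ES_URL}/_cluster/allocation/explain?pretty" \
        -H 'Content-Type: application/json' \
        -d "{\"index\": \"$1\", \"shard\": $2, \"primary\": $3}"
}

# Ask the cluster to retry shards whose earlier allocation attempts failed.
es_retry_failed() {
    curl -s -XPOST "${ES_URL}/_cluster/reroute?retry_failed=true&pretty"
}

# Usage:
#   es_explain_shard a_index 0 true
#   es_retry_failed
```

Reading the explain output before rerouting matters: if the reason is, for example, a full disk or a missing node, a retry will not help until that cause is fixed.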

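III. Replaying the data

If the shards really cannot be recovered, the only option left is to rebuild the index. One possible approach:
First, create a backup index: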

curl -XPUT 'http://xxxx:9200/a_index_copy' -H 'Content-Type: application/json' -d '{
    "settings": {
        "index": {
            "number_of_shards": 3,
            "number_of_replicas": 2
        }
    }
}'

Use reindex to copy the data that is still available into it:

POST _reindex
{
    "source": {
        "index": "a_index"
    },
    "dest": {
        "index": "a_index_copy",
        "op_type": "create"
    }
}
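
Before the source index is deleted, it may be worth checking that the copy holds as many documents as the source can still serve. The sketch below compares the two counts through `_cat/count`; the helper names and `ES_URL` are hypothetical, and on a red index the source count may only reflect the shards that are still available.

```shell
# Assumption: ES_URL points at any node of the cluster (placeholder host).
ES_URL="${ES_URL:-http://xxxx:9200}"

# Print the document count of one index as a bare number.
es_doc_count() {
    curl -s "${ES_URL}/_cat/count/$1?h=count" | tr -d '[:space:]'
}

# Succeed only if both indices report the same, non-empty document count.
es_counts_match() {
    src_count=$(es_doc_count "$1")
    dst_count=$(es_doc_count "$2")
    echo "$1: ${src_count}  $2: ${dst_count}"
    [ -n "$src_count" ] && [ "$src_count" = "$dst_count" ]
}

# Usage:
#   es_counts_match a_index a_index_copy && echo "counts match"
```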

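Delete the a_index index. This has to happen before the alias step, because an alias cannot have the same name as an existing index.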

curl -XDELETE 'http://xxxx:9200/a_index'

Add the alias a_index to a_index_copy:

curl -XPOST 'http://xxxx:9200/_aliases' -H 'Content-Type: application/json' -d '
{
    "actions": [
        {"add": {"index": "a_index_copy", "alias": "a_index"}}
    ]
}'
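
To confirm that the alias landed where expected, one option is to query it back. This is a minimal sketch with a hypothetical helper name; it does a plain string match on the response rather than real JSON parsing.

```shell
# Assumption: ES_URL points at any node of the cluster (placeholder host).
ES_URL="${ES_URL:-http://xxxx:9200}"

# Succeed if the response for alias $1 mentions index $2.
es_alias_points_to() {
    curl -s "${ES_URL}/_alias/$1" | grep -q "\"$2\""
}

# Usage:
#   es_alias_points_to a_index a_index_copy && echo "alias is in place"
```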

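IV. Translog summary

The translog helps prevent data loss when a node runs into problems.
Design goals:

1. Help a node recover quickly after a failure.

2. Support flush: operations that have not yet been flushed stay in the translog, so no data is lost before or during a flush.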
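
To see how much a given index currently holds in its translog, and to trigger a flush by hand, something like the following could be used. The helper names and `ES_URL` are hypothetical; the index name is passed as the first argument.

```shell
# Assumption: ES_URL points at any node of the cluster (placeholder host).
ES_URL="${ES_URL:-http://xxxx:9200}"

# Show translog statistics (operation count, size) for one index.
es_translog_stats() {
    curl -s "${ES_URL}/$1/_stats/translog?pretty"
}

# Force a flush: commit pending operations to Lucene so the translog can be trimmed.
es_flush() {
    curl -s -XPOST "${ES_URL}/$1/_flush?pretty"
}

# Usage:
#   es_translog_stats a_index
#   es_flush a_index
```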

posted @ 2022-06-09 19:18 西门运维
