一、调整节点磁盘水位线
1、ES默认会根据data节点磁盘使用空间情况分配新shards,或将节点上已有shards迁移到其它节点上;
-
cluster.routing.allocation.disk.watermark.low:意味着ES不会为磁盘使用率超过此值的节点分配新shards,支持动态调整,默认85%或;
curl -H 'Content-Type:application/json' -XPUT http://127.0.0.1:9200/_cluster/settings -d '{"transient" : {"cluster.routing.allocation.disk.watermark.low":"85%"}}'
-
cluster.routing.allocation.disk.watermark.high:意味着ES会为磁盘使用率超过此值的节点迁出shards或重分配shards,支持动态调整,默认90%;
curl -H 'Content-Type:application/json' -XPUT http://127.0.0.1:9200/_cluster/settings -d '{"transient" : {"cluster.routing.allocation.disk.watermark.high":"90%"}}'
-
cluster.routing.allocation.disk.watermark.flood_stage:意味着ES会对磁盘使用率超过此值的节点上的所有索引设置只读(index.blocks.read_only_allow_delete),拒绝新数据写入,支持动态调整,默认95%。
curl -H 'Content-Type:application/json' -XPUT http://127.0.0.1:9200/_cluster/settings -d '{"transient" : {"cluster.routing.allocation.disk.watermark.flood_stage":"95%"}}'
2、以上3个控制节点磁盘使用率的参数也支持将百分比值修改成磁盘空间剩余的绝对值,API如下
PUT _cluster/settings { "transient": { "cluster.routing.allocation.disk.watermark.low": "100gb", "cluster.routing.allocation.disk.watermark.high": "50gb", "cluster.routing.allocation.disk.watermark.flood_stage": "10gb" } }
二、集群中Shards分配与恢复
1、cluster.routing.allocation.enable:控制shards分配规则,默认all,需要快速重启data节点的情况下建议将值设置为none;
- all:允许分配集群中所有的分片;
- none:不允许分片集群中所有的分片;
- primaries:只允许分配集群中索引的新分片;
-
new_primaries:只允许分配集群中新索引的主分片;
curl -H 'Content-Type:application/json' -XPUT http://127.0.0.1:9200/_cluster/settings -d '{"transient" : {"cluster.routing.allocation.enable":"all"}}'
2、分片恢复
-
cluster.routing.allocation.node_concurrent_incoming_recoveries:分片恢复过程中单节点允许多少并发分片传入数,默认2;
curl -H 'Content-Type:application/json' -XPUT http://127.0.0.1:9200/_cluster/settings -d '{"transient" : {"cluster.routing.allocation.node_concurrent_incoming_recoveries":"2"}}'
-
cluster.routing.allocation.node_concurrent_outgoing_recoveries:分片恢复过程中单节点允许多少并发分片传出数,默认2;
curl -H 'Content-Type:application/json' -XPUT http://127.0.0.1:9200/_cluster/settings -d '{"transient" : {"cluster.routing.allocation.node_concurrent_outgoing_recoveries":"2"}}'
-
cluster.routing.allocation.node_concurrent_recoveries:同时设置分片传入与传出的并发数;
curl -H 'Content-Type:application/json' -XPUT http://127.0.0.1:9200/_cluster/settings -d '{"transient" : {"cluster.routing.allocation.node_concurrent_recoveries":"4"}}'
-
indices.recovery.max_bytes_per_sec:设置恢复过程中节点间每秒传输速率,默认40mb,如果主机网卡及磁盘IO配置高,可以适当调高此值,以提高分片恢复速率;
curl -H 'Content-Type:application/json' -XPUT http://127.0.0.1:9200/_cluster/settings -d '{"transient" : {"indices.recovery.max_bytes_per_sec":"40mb"}}'
-
cluster.routing.allocation.node_initial_primaries_recoveries: ES7.9 设置分片恢复速率;
curl -H 'Content-Type:application/json' -XPUT http://127.0.0.1:9200/_cluster/settings -d '{"transient" : {"cluster.routing.allocation.node_initial_primaries_recoveries":"10"}}'
3、控制分片分配到单个data节点的数量
-
index.routing.allocation.total_shards_per_node:控制indices的分片分配到单个节点的分片数;
curl -H 'Content-Type:application/json' -XPUT http://127.0.0.1:9200/indices_test/_settings -d '{"index.routing.allocation.total_shards_per_node":"3"}'
- cluster.routing.allocation.total_shards_per_node:控制集群中所有分片分配到单个节点的分片数,不常用。
PUT _cluster/settings
{
"persistent": {
"cluster.max_shards_per_node":2000
}
}
cluster.max_shards_per_node:控制集群中每个data节点上的分片数量,默认1000
PUT _cluster/settings
{
"transient" : {
"cluster.routing.allocation.exclude._ip" : "10.0.0.1" }
}
action.search.shard_count.limit:控制一次查询覆盖的shards数量
{
"persistent" : {
"action.search.shard_count.limit" : "800"
}
}
curl -H 'Content-Type:application/json' -XPUT http://127.0.0.1:9200/_cluster/settings -d '{"transient" : {"cluster.routing.allocation.exclude._ip":"127.0.0.1"}}'
4、手动执行集群分片重平衡
POST /_cluster/reroute?retry_failed=true
添加参数
- explain:如果使用?explain查询参数,则返回结果会包含一个为什么可以执行或者为什么不能执行的解释信息;
- retry_failed:如果使用?retry_failed查询参数,则将尝试对之前分配失败的分片重试一次分配;
- timeout:等待响应的超时时间,如果超时则请求失败并返回错误,默认30s。
手动迁移分片API
POST /_cluster/reroute
{
"commands" : [
{
"move" : {
"index" : "apm-7.4.0-prod-3-2021.04.14",
"shard" : 1,
"from_node" : "es-cn-n6w24rnm500dh5ljz-74a4d33e-0002",
"to_node" : "es-cn-n6w24rnm500dh5ljz-74a4d33e-0001"
}
}
]
}
5、查看shard未分配原因
GET _cluster/allocation/explain?pretty
三、集群中Shards rebalance
1、cluster.routing.rebalance.enable:控制集群shards rebalance参数,默认all,需要快速重启节点的情况下建议将值设置为none;
- all:允许集群中所有shards进行rebalance;
- none:不允许集群中所有indices的shards进行rebalance;
- primaries:只允许集群中主shards进行rebalance;
-
replicas:只允许集群中副本shards进行rebalance;
curl -H 'Content-Type:application/json' -XPUT http://127.0.0.1:9200/_cluster/settings -d '{"transient" : {"cluster.routing.rebalance.enable":"all"}}'
2、cluster.routing.allocation.cluster_concurrent_rebalance:允许集群中分片rebalance的并发数量,默认是2,集群扩容新data节点后通过调大此参数值让分片更快迁移到新节点上,快速达到集群分片Rebalance;
curl -H 'Content-Type:application/json' -XPUT http://127.0.0.1:9200/_cluster/settings -d '{"transient" : {"cluster.routing.allocation.cluster_concurrent_rebalance":"2"}}'
3、调整分片分配策略
cluster.routing.allocation.balance.index:倾向indices内分片平衡分配到各数据节点,默认0.55f
cluster.routing.allocation.balance.shard:倾向集群所有分片平衡分配到各数据节点,默认0.45fcurl -H 'Content-Type:application/json' -XPUT http://127.0.0.1:9200/_cluster/settings -d '{"transient" : {"cluster.routing.allocation.balance.index":"0.8f"}}' curl -H 'Content-Type:application/json' -XPUT http://127.0.0.1:9200/_cluster/settings -d '{"transient" : {"cluster.routing.allocation.balance.shard":"0.2f"}}'
四、索引settings
1、调整索引分片数
curl -H 'Content-Type:application/json' -XPUT http://127.0.0.1:9200/indices_test/_settings -d '{"index":{"number_of_replicas" : 1}}'
2、调整索引refresh频率
curl -H 'Content-Type:application/json' -XPUT http://127.0.0.1:9200/indices_test/_settings -d '{"index":{"refresh_interval" : "30s"}}'
3、调整索引translog flush策略
curl -H 'Content-Type:application/json' -XPUT http://127.0.0.1:9200/indices_test/_settings -d '{"index":{"translog.durability" : "async","translog.flush_threshold_size":"1gb"}}'
4、设置每个节点上每个索引的分片(主+副本)数量,
curl -H 'Content-Type:application/json' -XPUT http://127.0.0.1:9200/indices_test/_settings -d '{"index.routing.allocation.total_shards_per_node":5}'
参考官网:https://www.elastic.co/guide/en/elasticsearch/reference/master/modules-cluster.html#cluster-shard-allocation-filtering
五、reindex
1、集群内部reindex源索引到目标索引

POST _reindex?wait_for_completion=false
{
"source": {
"index": "indices_sour",
"query": {"match_all": {}},
"size": 2000
},
"dest": {
"index": "indices_dest"
}
}
2、从其它集群中reindex索引到本集群中

POST _reindex?wait_for_completion=false
{
"source": {
"remote": {
"host": "",
"username": "elastic",
"password": "xxxxxx"
},
"index": "indices_sour",
"query": {"match_all": {}},
"size": 2000
},
"dest": {
"index": "indices_dest"
}
}
POST _reindex?wait_for_completion=false
{
"source": {
"index": "apm-7.4.0-prod-2021.04.14",
"query": {
"bool": {
"must": [],
"filter": [
{
"match_all": {}
},
{
"range": {
"@timestamp": {
"gte": "2021-04-14T08:00:00.000Z",
"lte": "2021-04-14T14:00:00.000Z",
"format": "strict_date_optional_time"
}
}
}
],
"should": [],
"must_not": []
}
},
"size": 10000
},
"dest": {
"index": "apm-7.4.0-prod-3-2021.04.14"
}
}
3、查看reindex任务的进度
GET _tasks/reindex_id
六、template
1、创建index template

PUT _template/us_data
{
"order": 5,
"index_patterns": ["*"],
"settings": {
"index": {
"refresh_interval": "60s",
"number_of_replicas": "1",
"translog": {
"flush_threshold_size": "2gb",
"sync_interval": "120s",
"durability": "async"
}
}
},
"mappings": {},
"aliases": {}
}
七、日志级别设置

PUT /_cluster/settings
{
"transient":{
"logger._root":"INFO"
}
}
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· DeepSeek 开源周回顾「GitHub 热点速览」
· 物流快递公司核心技术能力-地址解析分单基础技术分享
· .NET 10首个预览版发布:重大改进与新特性概览!
· AI与.NET技术实操系列(二):开始使用ML.NET
· 单线程的Redis速度为什么快?