Elasticsearch index tuning

1. `index.refresh_interval`: "30s" (consider raising it)

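This parameter controls how long after a write the data becomes searchable; the default is 1s. Every refresh produces a new Lucene segment, which in turn leads to frequent merges. If the workload does not need near-real-time search, raise this value. In my own tuning it made a clear difference: CPU usage dropped sharply.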
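
As a minimal sketch of applying this setting, the Python snippet below builds the JSON body for the index settings API (`PUT /<index>/_settings`). The index name `logs-demo` and the helper name are hypothetical, and the snippet only constructs the request; it does not contact a cluster.

```python
import json


def refresh_interval_body(interval: str) -> str:
    """Return the JSON body that sets index.refresh_interval."""
    return json.dumps({"index": {"refresh_interval": interval}})


# Hypothetical index name, used only for illustration.
index_name = "logs-demo"
path = f"/{index_name}/_settings"

# Relax refresh to every 30 seconds; passing "1s" would restore the default.
print("PUT", path, refresh_interval_body("30s"))
```

The same body can be sent with curl or any HTTP client.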

2. translog tuning

To avoid losing data, ES by default flushes the translog to disk whenever an index, bulk, delete, or update request completes. This improves data safety at some cost to performance.

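    {
      "index": {
          "translog": {
              "flush_threshold_size": "1gb",
              "sync_interval": "30s",
              "durability": "async"
          }
      }
    }

Here `flush_threshold_size` is the translog size that triggers a flush, `sync_interval` is the (raised) interval between syncs to disk, and `durability: async` makes those syncs asynchronous.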

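Lucene persists changes to disk only on commit, because writing to disk on every operation would be too expensive. If a failure happens before a commit, every change since the previous commit would be lost.
To prevent that, ES records each change in the translog. After a failure, the changes recorded in the translog since the last commit can be replayed, so as little data as possible is lost.
An ES flush performs a Lucene commit and starts a new translog, so the translog holds the operations between one commit and the next.
Flush frequency is governed by translog size: once the translog reaches a threshold, a flush runs. The setting is index.translog.flush_threshold_size, with a default of 512mb; raising it to 1gb here reduces the number of flushes.
The translog is itself a file that has to be written to disk. How it is persisted is set by index.translog.durability and index.translog.sync_interval. By default index.translog.durability=request, meaning the translog is written to disk on every request. That lowers the risk of data loss but costs more disk I/O.
Here the translog is persisted asynchronously, written to disk once every 30 seconds.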
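
The trade-off described above can be sketched in Python. The helper names are hypothetical, and the "worst case" is a simplification: it assumes that with `durability: async` acknowledged writes may remain unsynced for up to one `sync_interval`, and that with `durability: request` they are synced before the request returns.

```python
def translog_settings(durability: str, sync_interval_s: int, flush_threshold: str) -> dict:
    """Build an index settings body for the three translog options discussed above."""
    if durability not in ("request", "async"):
        raise ValueError(f"unknown durability: {durability!r}")
    return {
        "index": {
            "translog": {
                "durability": durability,
                "sync_interval": f"{sync_interval_s}s",
                "flush_threshold_size": flush_threshold,
            }
        }
    }


def worst_case_unsynced_seconds(settings: dict) -> int:
    """Rough upper bound on how long an acknowledged write may sit unsynced."""
    translog = settings["index"]["translog"]
    if translog["durability"] == "request":
        return 0  # synced before the request is acknowledged
    return int(translog["sync_interval"].rstrip("s"))


tuned = translog_settings("async", 30, "1gb")
print(worst_case_unsynced_seconds(tuned))
```

With the values used in this section, a crash could cost up to roughly 30 seconds of acknowledged writes, which is the price paid for the lower disk I/O.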

3. index.store.throttle.type: "none"

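When running bulk imports where search performance does not matter, you want indexing to run as fast as the disk allows, so turn off merge throttling entirely (this removes the rate limit; merging itself still happens). Re-enable throttling once indexing is finished.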

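Indexing first produces many small segments, which are merged asynchronously in the background.
Merging is I/O-intensive; when disk I/O is weak, merges hurt both query and indexing performance.
index.store.throttle.type and index.store.throttle.max_bytes_per_sec limit the disk bandwidth that merges may use, at the node level or the index level, so that merges do not saturate the disk and starve other operations.
Another article on ES 2.x index tuning notes that if query performance is not a concern, index.store.throttle.type can be set to none, meaning merges are not rate-limited.
By default, merges are limited to 20 MB/s of disk bandwidth.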

4. indices.store.throttle.max_bytes_per_sec: "100mb" (when using SSDs)

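The ES defaults are deliberate here: search performance should not suffer because of merges. In some cases, though, such as SSDs or index-heavy workloads, the limit is too low. The default is 20mb/s, which is reasonable for spinning disks. On SSDs, consider raising it to 100-200mb/s.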
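
To get a feel for what these limits mean, here is a rough back-of-the-envelope calculation in Python. It is only a sketch: the 5 GB merge size is a made-up example, `mb` is assumed to mean 1024 * 1024 bytes, and the result is a lower bound that ignores read I/O and any other disk activity.

```python
MB = 1024 * 1024
GB = 1024 * MB


def merge_write_seconds(merged_bytes: int, max_bytes_per_sec: int) -> float:
    """Lower bound on the time needed to write merged_bytes under a throttle."""
    return merged_bytes / max_bytes_per_sec


# Hypothetical 5 GB merge under the default limit and two SSD-style limits.
for limit_mb in (20, 100, 200):
    seconds = merge_write_seconds(5 * GB, limit_mb * MB)
    print(f"{limit_mb} MB/s -> {seconds:.1f} s")
```

The gap between the limits is why the default can let merges fall behind on fast disks under heavy indexing.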

5. index.merge.scheduler.max_thread_count: 1 (when using spinning disks rather than SSDs)

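Goal: reduce the disk load caused by concurrent merges.
An index consists of multiple shards, and each shard is divided into many segments; the segment is the smallest unit of index storage.
When indexing, ES first produces small segments.
A large number of segments hurts search performance (every query has to visit many segments), so ES merges small segments in the background to keep queries fast. The merge itself consumes a lot of disk I/O, however, and can slow queries while it runs.
index.merge.scheduler.max_thread_count controls the number of concurrent merge threads. On SSDs, which handle concurrent I/O well, the system default max(1, min(4, availableProcessors / 2)) is fine; on ordinary spinning disks set it to 1.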
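An overall explanation:

Segment merges are expensive and consume a lot of disk I/O. They are scheduled in the background because they can take a long time, especially for large segments. That is usually acceptable, because merges of large segments are relatively rare.

Sometimes, however, merging falls behind the ingest rate. When that happens, ES automatically throttles indexing requests to a single thread, which prevents a segment explosion (a huge number of segments piling up before they can be merged). ES then writes INFO-level log messages beginning with "now throttling indexing".

The ES defaults are deliberate here: search performance should not suffer because of merges. In some cases, though, such as SSDs or index-heavy workloads, the limit is too low. The default is 20mb/s, which is reasonable for spinning disks. On SSDs, consider raising it to 100-200mb/s via indices.store.throttle.max_bytes_per_sec: "100mb", and test several values to find the one that suits your hardware.

If you are running a bulk import and do not care about search performance, you want indexing to run as fast as the disk allows, so set indices.store.throttle.type: "none". This disables merge throttling entirely (merging itself still happens). Re-enable it once indexing is finished.

If you are using spinning disks rather than SSDs, add the following to elasticsearch.yml: index.merge.scheduler.max_thread_count: 1. Spinning disks are slow at concurrent I/O, so the number of threads hitting the disk at once should be reduced. The setting above allows three threads in total (max_thread_count + 2). On SSDs you can ignore it; the default, Math.min(3, Runtime.getRuntime().availableProcessors() / 2), works well.

Finally, raise index.translog.flush_threshold_size from its default (200mb here) to something larger, such as 1gb. This lets larger segments accumulate before a flush occurs. Building larger segments means fewer flushes and fewer merges. The defaults quoted in this passage (200mb, and the min(3, ...) formula) differ from those given in sections 2 and 5, presumably because the passages describe different ES versions.

All of these measures aim at reducing disk I/O in order to improve indexing performance.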
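
The two default formulas quoted in this section can be evaluated side by side. The Python sketch below simply transcribes them as written (Java integer division becomes `//`); the function names are hypothetical, and which formula applies to a given ES release is not something this sketch tries to settle.

```python
def default_merge_threads_capped_at_4(available_processors: int) -> int:
    """max(1, min(4, availableProcessors / 2)), as quoted in section 5."""
    return max(1, min(4, available_processors // 2))


def default_merge_threads_capped_at_3(available_processors: int) -> int:
    """Math.min(3, availableProcessors / 2), as quoted in the overall explanation."""
    return min(3, available_processors // 2)


for cpus in (1, 2, 4, 8, 16):
    print(
        cpus,
        default_merge_threads_capped_at_4(cpus),
        default_merge_threads_capped_at_3(cpus),
    )
```

Either way, a machine with eight or more cores ends up with several concurrent merge threads by default, which is why the explicit value of 1 is suggested for spinning disks.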

6. indices.memory.index_buffer_size: "20%"

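Goal: reduce the likelihood of forced writes to disk.
This setting specifies how much memory is used for indexing. Newly indexed data is first held in memory; when the buffer fills up, its contents are written to disk as a segment. A larger buffer therefore reduces how often such forced writes happen.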
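
For a sense of scale, the following Python sketch converts a percentage-style value into bytes. It assumes the percentage is taken from the node's JVM heap; the 8 GB heap and the function name are hypothetical.

```python
GB = 1024 ** 3


def index_buffer_bytes(heap_bytes: int, percent: int) -> int:
    """Bytes of heap covered by a percentage-style index_buffer_size."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    return heap_bytes * percent // 100


heap = 8 * GB  # hypothetical heap size
print(f"{index_buffer_bytes(heap, 20) / GB:.2f} GB of indexing buffer")
```

Memory given to the indexing buffer is no longer available to the rest of the heap, so the percentage is worth raising in steps rather than all at once.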

7. `index.number_of_replicas: 0`

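During a bulk import, each document is sent in full to the replica nodes, where the indexing process is repeated verbatim. Every replica therefore also performs analysis, indexing, and potentially merging. If instead you index with zero replicas and enable replicas only after the import finishes, recovery is essentially a byte-for-byte network transfer, which is far more efficient than repeating the indexing work.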
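
The before-and-after sequence can be sketched as two calls to the index settings API. The Python snippet below only builds the requests; the index name `logs-demo`, the final replica count, and the helper names are assumptions made for illustration.

```python
import json


def replica_settings(count: int) -> str:
    """Return the JSON body that sets index.number_of_replicas."""
    if count < 0:
        raise ValueError("replica count cannot be negative")
    return json.dumps({"index": {"number_of_replicas": count}})


def bulk_import_steps(index: str, final_replicas: int) -> list:
    """Requests to issue before and after a large import, in order."""
    path = f"/{index}/_settings"
    return [
        ("PUT", path, replica_settings(0)),               # before the import
        ("PUT", path, replica_settings(final_replicas)),  # after the import
    ]


for method, path, body in bulk_import_steps("logs-demo", 1):
    print(method, path, body)
```

The import itself runs between the two requests.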

8. Thread pool tuning (omitted)

Common Elasticsearch configuration and performance parameters

cluster.name: estest  # cluster name
node.name: "testanya"  # node name

node.master: false  # whether the node is master-eligible
node.data: true  # whether the node stores data

index.store.type: niofs  # file access method
index.cache.field.type: soft  # cache type

bootstrap.mlockall: true  # lock memory to prevent swapping

gateway.type: local  # local storage

gateway.recover_after_nodes: 3  # start recovery once 3 data nodes are up

gateway.recover_after_time: 5m  # start recovery after 5 minutes

gateway.expected_nodes: 4  # start recovery once 4 ES nodes are up

cluster.routing.allocation.node_initial_primaries_recoveries: 8  # concurrent shard recoveries
cluster.routing.allocation.node_concurrent_recoveries: 2  # concurrent recoveries per node

indices.recovery.max_bytes_per_sec: 250mb  # maximum bandwidth for data transfer between nodes
indices.recovery.concurrent_streams: 8  # concurrent streams for reading data files

discovery.zen.ping.multicast.enabled: false  # disable multicast
discovery.zen.ping.unicast.hosts: ["192.168.169.11:9300", "192.168.169.12:9300"]

discovery.zen.fd.ping_interval: 10s  # interval between node liveness checks
discovery.zen.fd.ping_timeout: 120s  # liveness check timeout
discovery.zen.fd.ping_retries: 6  # retries after a liveness timeout

http.cors.enabled: true  # enable CORS for monitoring tools

index.analysis.analyzer.ik.type: "ik"  # IK analyzer

Thread pool settings

threadpool.index.type: fixed  # indexing thread pool type
threadpool.index.size: 64  # pool size (suggested: 2 to 3 times the CPU count)
threadpool.index.queue_size: 1000  # queue size

threadpool.search.size: 64  # search thread pool size
threadpool.search.type: fixed  # search thread pool type
threadpool.search.queue_size: 1000  # queue size

threadpool.get.type: fixed  # get thread pool type
threadpool.get.size: 32  # get thread pool size
threadpool.get.queue_size: 1000  # queue size

threadpool.bulk.type: fixed  # bulk request thread pool type
threadpool.bulk.size: 32  # bulk request thread pool size
threadpool.bulk.queue_size: 1000  # queue size

threadpool.flush.type: fixed  # flush thread pool type
threadpool.flush.size: 32  # flush thread pool size
threadpool.flush.queue_size: 1000  # queue size

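indices.store.throttle.type: merge
indices.store.throttle.type: none  # store throttle type (these two lines are alternatives; keep only one)
indices.store.throttle.max_bytes_per_sec: 500mb  # maximum disk write bandwidth

index.merge.scheduler.max_thread_count: 8  # maximum number of merge threads
index.translog.flush_threshold_size: 600MB  # translog size threshold that triggers a flush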


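Use the bulk API to speed up ingestion.
For the initial indexing run, set replicas to 0.

Increase threadpool.index.queue_size, e.g. to 1000
Increase indices.memory.index_buffer_size, e.g. to 20%
Set index.translog.durability: async so the translog is written to disk asynchronously, which speeds up writes
Increase index.translog.flush_threshold_size, e.g. to 600MB
Increase index.translog.flush_threshold_ops, e.g. to 500000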
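
As an illustration of the first tip, the bulk API takes a newline-delimited body in which every document is preceded by an action line. The Python sketch below builds such a body for `POST /_bulk`; the index name and documents are made up, and older ES versions also expect a `_type` in the action line, which is left out here.

```python
import json


def bulk_body(index: str, docs: list) -> str:
    """Build a newline-delimited bulk body with one index action per document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))                           # document source
    return "\n".join(lines) + "\n"  # the body must end with a newline


# Hypothetical documents, for illustration only.
docs = [{"msg": "first"}, {"msg": "second"}]
print(bulk_body("logs-demo", docs), end="")
```

Sending many documents per request amortises the per-request overhead; the best batch size depends on document size and hardware, so it is worth measuring.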

curl -XPUT '127.0.0.1:9200/_cluster/settings' -d '{
    "transient": {
        "index.indexing.slowlog.threshold.index.warn": "10s",
        "index.indexing.slowlog.threshold.index.info": "5s",
        "index.indexing.slowlog.threshold.index.debug": "2s",
        "index.indexing.slowlog.threshold.index.trace": "500ms",
        "index.indexing.slowlog.level": "info",
        "index.indexing.slowlog.source": "1000",
        "indices.memory.index_buffer_size": "20%"
    }
}'

curl -XPUT '127.0.0.1:9200/_cluster/settings' -d '{
    "transient": {
        "index.search.slowlog.threshold.query.warn": "10s",
        "index.search.slowlog.threshold.query.info": "5s",
        "index.search.slowlog.threshold.query.debug": "2s",
        "index.search.slowlog.threshold.query.trace": "500ms",
        "index.search.slowlog.threshold.fetch.warn": "1s",
        "index.search.slowlog.threshold.fetch.info": "800ms",
        "index.search.slowlog.threshold.fetch.debug": "500ms",
        "index.search.slowlog.threshold.fetch.trace": "200ms"
    }
}'
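
To make the query thresholds above concrete, the Python sketch below maps an operation's duration to the level it would be logged at. It assumes the most severe level whose threshold is exceeded wins and that nothing is logged below the trace threshold; the function name is hypothetical.

```python
# Query slowlog thresholds from the request above, in milliseconds,
# ordered from most to least severe.
QUERY_THRESHOLDS_MS = [
    ("warn", 10_000),
    ("info", 5_000),
    ("debug", 2_000),
    ("trace", 500),
]


def slowlog_level(took_ms: int):
    """Return the slowlog level for a duration, or None if it is not logged."""
    for level, threshold_ms in QUERY_THRESHOLDS_MS:
        if took_ms > threshold_ms:
            return level
    return None


for took in (12_000, 6_000, 2_500, 600, 100):
    print(took, slowlog_level(took))
```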

When taking nodes offline, exclude every node whose name ends in -2 from the cluster:

curl -XPUT http://127.0.0.1:9200/_cluster/settings -d '{
    "transient": {
        "cluster.routing.allocation.enable": "all",
        "cluster.routing.allocation.exclude._name": "*-2"
    }
}'

curl -XPUT ip:9200/_cluster/settings -d '{
    "transient": {
        "logger.discovery": "DEBUG"
    },
    "persistent": {
        "discovery.zen.minimum_master_nodes": 2
    }
}'

Take a specified batch of nodes offline:

curl -XPUT 127.0.0.1:9200/_cluster/settings -d '{
    "transient": {
        "cluster.routing.allocation.exclude._name": "atest11-2,atest12-2,anatest13-2,antest14-2" 
    }

}'

curl -XPUT 127.0.0.1:9200/_cluster/settings -d '{
    "transient": {
        "cluster.routing.allocation.exclude._name": "test_aa73_2,test_aa73" 
    }

}'

curl -XPUT 127.0.0.1:9200/_cluster/settings -d '{
    "transient": {
        "cluster.routing.allocation.exclude._name": "" 
    }

}'
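
Before applying an exclusion like the ones above, it can help to preview which nodes a pattern would hit. The Python sketch below approximates the matching with `fnmatch`; this is an assumption, since `fnmatch` also understands `?` and `[...]`, which the cluster setting may not. The node names come from the examples above and the helper names are hypothetical.

```python
import json
from fnmatch import fnmatchcase


def exclude_body(patterns: list) -> str:
    """Build the transient settings body; an empty list clears the exclusion."""
    value = ",".join(patterns)
    return json.dumps({"transient": {"cluster.routing.allocation.exclude._name": value}})


def preview_excluded(node_names: list, patterns: list) -> list:
    """Return the node names that match at least one pattern."""
    return [n for n in node_names if any(fnmatchcase(n, p) for p in patterns)]


nodes = ["atest11-1", "atest11-2", "atest12-2", "test_aa73"]
print(preview_excluded(nodes, ["*-2"]))
print(exclude_body(["atest11-2", "atest12-2"]))
```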

curl -XPUT 127.0.0.1:9200/_cluster/settings -d '{
    "transient": {
        "cluster.routing.allocation.cluster_concurrent_rebalance": 10
    }

}'

curl -XPUT 127.0.0.1:9200/_cluster/settings -d '{
    "transient": {
        "indices.store.throttle.type": "none",
        "index.store.type": "niofs",
        "index.cache.field.type": "soft",
        "indices.store.throttle.max_bytes_per_sec": "500mb",
        "index.translog.flush_threshold_size": "600MB",
        "threadpool.flush.type": "fixed",
        "threadpool.flush.size": 32,
        "threadpool.flush.queue_size": 1000
    }
}'

curl -XPUT 127.0.0.1:9200/_cluster/settings -d '{
    "transient": {
        "index.indexing.slowlog.level": "warn" 
    }

}'

Moving shards
curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
    "commands" : [ {
        "move" :
            {
              "index" : "test_aa_20160529", "shard" : 4,
              "from_node" : "node1", "to_node" : "node2"
            }
        },
        {
          "allocate" : {
              "index" : "test", "shard" : 1, "node" : "node3"
          }
        }
    ]
}'

curl -XPUT 127.0.0.1:9200/_cluster/settings -d '
{
  "transient": {
    "logger.indices.recovery": "DEBUG"
  }
}'

curl -XPUT 127.0.0.1:9200/_cluster/settings -d '
{
  "transient": {
    "cluster.routing.allocation.node_initial_primaries_recoveries": "100" 
  }
}'

curl -XPUT '127.0.0.1:9200/_cluster/settings' -d '{
    "transient": {
        "indices.memory.index_buffer_size": "20%"
    }
}'

curl -XPUT '127.0.0.1:9200/_cluster/settings' -d '{
    "transient": {
        "index.indexing.slowlog.level": "info"
    }
}'

References:

https://www.elastic.co/guide/cn/elasticsearch/guide/current/indexing-performance.html

http://www.voidcn.com/article/p-cpddjjyf-kv.html

http://www.voidcn.com/article/p-kehaizma-mb.html

http://www.voidcn.com/article/p-ypzfonym-bcq.html

http://www.voidcn.com/article/p-ubmbspny-od.html

http://www.voidcn.com/article/p-bwwyyoyx-mc.html

https://cloud.tencent.com/developer/article/1511890

https://www.it610.com/article/1280062656044089344.htm

posted @ 2021-02-22 11:53 fat_girl_spring