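Elasticsearch hot-warm data separation experiment

hot node: handles indexing and the writing of new documents.

warm node: holds read-only indices that are queried less frequently.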

Hot node

We can use hot nodes for indexing:

Indexing is CPU- and I/O-intensive, so hot nodes should be powerful servers
They use faster storage than warm nodes

Warm node

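Use warm nodes for older, read-only indices:

They tend to use large attached disks (usually spinning disks)
Large amounts of data may require additional nodes to meet performance requirements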

Shard filtering

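Shard filtering is an Elasticsearch capability that lets us place the indices we choose on the nodes we choose. It involves two settings:

In the elasticsearch.yml configuration file, use node.attr to give the node an attribute: hot or warm.

In the index settings, use index.routing.allocation to assign the index to nodes that meet the requirement.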

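Dynamic setting	Assigns the index to nodes whose {attr} has
index.routing.allocation.include.{attr}	at least one of the values
index.routing.allocation.exclude.{attr}	none of the values
index.routing.allocation.require.{attr}	all of the values

As the table shows, include means the node attribute matches at least one of the listed values, exclude means it matches none of them, and require means it must match all of them. The values are simply tags that we use to label nodes, and we are free to define these tags to suit our own setup.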

These are the three rules for assigning an index to nodes. The experiment below uses require.
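The three rules differ only in one segment of the setting key, so the request body can be generated. The following is a minimal sketch, not part of the original experiment; the function name allocation_filter_body is hypothetical, and the host and index name in the usage comment are placeholders.

```shell
#!/bin/bash
# Build the JSON body for one shard-allocation filter setting.
# $1: rule (include | exclude | require)
# $2: node attribute name, as declared with node.attr.<name>
# $3: tag value, or a comma-separated list of values
allocation_filter_body() {
  rule=$1 attr=$2 value=$3
  case "$rule" in
    include|exclude|require) ;;
    *) echo "unknown rule: $rule" >&2; return 1 ;;
  esac
  printf '{"index.routing.allocation.%s.%s": "%s"}\n' "$rule" "$attr" "$value"
}

# Usage against a live cluster (placeholder host and index):
#   curl -H 'Content-Type: application/json' -X PUT \
#     "http://localhost:9200/index-2019.10.19/_settings" \
#     -d "$(allocation_filter_body require temperature hot)"
```

Because these are dynamic index settings, the same call can be repeated later with a different value to move the index.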

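OS version: CentOS 7

Node plan:

Hot data node: 192.168.2.4

Warm data node: 192.168.2.190

PS: We do not use three storage tiers (hot, warm, cold) here; two tiers, hot and warm, are usually enough.

Configuration of the hot data node, 192.168.2.4: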

cluster.name: my-application

node.name: node-1
node.attr.rack: r1

node.attr.temperature: hot

path.data: ./data/
path.logs: ./logs

node.master: true
node.data: true
node.ingest: true

bootstrap.memory_lock: true

network.host: 0.0.0.0
http.port: 9200

cluster.initial_master_nodes: 
  - 192.168.2.4:9300
  - 192.168.2.190:9300
  
discovery.seed_hosts:
  - 192.168.2.4
  - 192.168.2.190
  
gateway.recover_after_nodes: 1
#action.destructive_requires_name: true
############# X-Pack settings ####################
#xpack.security.enabled: true
#xpack.security.transport.ssl.enabled: true
xpack.security.audit.enabled: true
xpack.security.audit.logfile.events.include: access_denied, access_granted, anonymous_access_denied, authentication_failed,connection_denied, tampered_request, run_as_denied, run_as_granted
xpack.security.audit.logfile.emit_node_host_address: true
xpack.security.audit.logfile.emit_node_host_name: true
xpack.sql.enabled: true
xpack.ilm.enabled: true

Configuration of the warm data node, 192.168.2.190:

cluster.name: my-application

node.name: es11
node.attr.rack: r1

node.attr.temperature: warm

path.data: ./data/
path.logs: ./logs

node.master: true
node.data: true
node.ingest: true

bootstrap.memory_lock: true

network.host: 0.0.0.0
http.port: 9200

cluster.initial_master_nodes: 
  - 192.168.2.4:9300
  - 192.168.2.190:9300
  
discovery.seed_hosts:
  - 192.168.2.4
  - 192.168.2.190
gateway.recover_after_nodes: 1

#action.destructive_requires_name: true

############# X-Pack settings ####################
#xpack.security.enabled: true
#xpack.security.transport.ssl.enabled: true
xpack.security.audit.enabled: true
xpack.security.audit.logfile.events.include: access_denied, access_granted, anonymous_access_denied, authentication_failed,connection_denied, tampered_request, run_as_denied, run_as_granted
xpack.security.audit.logfile.emit_node_host_address: true
xpack.security.audit.logfile.emit_node_host_name: true
xpack.sql.enabled: true
xpack.ilm.enabled: true

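Create an index and move its data to the hot node:

# Create the index (the URL is quoted so the shell leaves the ? alone)
curl -H 'Content-Type: application/json' -X PUT 'http://localhost:9200/index-2019.10.19?pretty'
# Require its shards to live on nodes tagged temperature=hot
curl -H 'Content-Type: application/json' -X PUT 'http://localhost:9200/index-2019.10.19/_settings' -d '
{
  "index.routing.allocation.require.temperature": "hot"
}'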
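To confirm that the shards really ended up on the intended node, one option is to read the _cat/shards output. The sketch below assumes the columns are requested as index,shard,prirep,state,node through the h parameter; the function name started_shards_on_node is hypothetical. With a single hot node, replica shards have nowhere to go, so under the default of one replica they would be expected to stay UNASSIGNED. The check therefore only looks at STARTED shards.

```shell
#!/bin/bash
# Read _cat/shards rows (columns: index shard prirep state node) on stdin.
# Succeed only if at least one shard is STARTED and every STARTED shard
# sits on the node named in $1.
started_shards_on_node() {
  awk -v want="$1" '
    $4 == "STARTED" { seen++; if ($5 != want) bad++ }
    END { exit ((seen > 0 && bad == 0) ? 0 : 1) }
  '
}

# Usage against a live cluster (node-1 is the hot node's node.name):
#   curl -s "http://localhost:9200/_cat/shards/index-2019.10.19?h=index,shard,prirep,state,node" \
#     | started_shards_on_node node-1 && echo "all started shards are on node-1"
```

The same check works after moving the index to the warm tier by passing es11 instead of node-1.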

To move the data of index-2019.10.19 to the warm node, run the following command:

curl -H 'Content-Type: application/json' -X PUT http://localhost:9200/index-2019.10.19/_settings -d '
{
  "index.routing.allocation.require.temperature": "warm"
}'

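Once the workflow works end to end, we can write a script that tags the indices from 7 days ago so that they are stored on the ES warm node (large-capacity HDD):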

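#!/bin/bash

# Date suffix of the indices created 7 days ago (requires GNU date)
day=$(date -d '7 days ago' +"%Y.%m.%d")
# echo "${day}"

# Quote the URL so the shell does not try to expand the * wildcard
curl -H 'Content-Type: application/json' -X PUT "http://192.168.2.4:9200/*-${day}/_settings" -d '
{
  "index.routing.allocation.require.temperature": "warm"
}'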
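The script above only matches indices dated exactly 7 days ago, so a day on which it does not run leaves that day's indices on the hot node. A possible variant, sketched here under the assumption that every index name ends in -YYYY.MM.DD, selects all indices on or before a cutoff date. The function name indices_up_to is hypothetical.

```shell
#!/bin/bash
# Print index names from stdin whose trailing "-YYYY.MM.DD" date is on or
# before the cutoff in $1. Dates in this format sort correctly as strings.
indices_up_to() {
  awk -v cutoff="$1" '
    {
      n = split($0, parts, "-")
      day = parts[n]
      # Skip names without a date suffix (for example system indices)
      if (n > 1 && day ~ /^[0-9][0-9][0-9][0-9]\.[0-9][0-9]\.[0-9][0-9]$/ &&
          (day "") <= (cutoff "")) print
    }
  '
}

# Usage against a live cluster (requires GNU date):
#   cutoff=$(date -d '7 days ago' +%Y.%m.%d)
#   curl -s "http://192.168.2.4:9200/_cat/indices?h=index" | indices_up_to "$cutoff" |
#   while read -r idx; do
#     curl -H 'Content-Type: application/json' -X PUT \
#       "http://192.168.2.4:9200/${idx}/_settings" \
#       -d '{"index.routing.allocation.require.temperature": "warm"}'
#   done
```

Applying the warm setting again to an index that already has it changes nothing, so the loop can safely run every day.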

In addition, ES 7 provides the index lifecycle management (ILM) feature, which can be configured in the Kibana UI. See the official Elasticsearch documentation for details.
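For reference, an ILM policy can also be created through the REST API instead of Kibana. The sketch below is an assumption about how the 7-day rule from this experiment might look as a policy: the warm phase uses the allocate action to require temperature=warm. The function name warm_policy_body and the policy name hot-warm-7d are hypothetical, and the policy has not been tested on the cluster described here.

```shell
#!/bin/bash
# Print an ILM policy body whose warm phase moves an index to nodes tagged
# temperature=warm once the index reaches the age given in $1 (e.g. 7d).
warm_policy_body() {
  cat <<EOF
{
  "policy": {
    "phases": {
      "warm": {
        "min_age": "$1",
        "actions": {
          "allocate": { "require": { "temperature": "warm" } }
        }
      }
    }
  }
}
EOF
}

# Usage against a live cluster (placeholder host and policy name):
#   curl -H 'Content-Type: application/json' -X PUT \
#     "http://localhost:9200/_ilm/policy/hot-warm-7d" -d "$(warm_policy_body 7d)"
```

An index only follows the policy once its index.lifecycle.name setting points at the policy name.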

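Shard filtering by hardware

As mentioned above, node.attr accepts arbitrary attributes. So far we have used hot and warm as values of the temperature attribute; the examples in this section call that attribute my_temp. We can also define an attribute that describes the hardware, my_server, with the values small, medium, and large. A cluster can then consist of nodes that each carry several attributes.

Each node in such a cluster may have a different combination of attributes. We can allocate an index to nodes that have two or more attributes at once as follows: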

PUT my_index1
{
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1,
    "index.routing.allocation.include.my_server": "medium",
    "index.routing.allocation.require.my_temp": "hot"
  }
}

As shown above, my_index1 is allocated to a node that must have the hot attribute and must also have the medium attribute. In the figure referenced above, only node1 meets these requirements.
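Before creating such an index, it can help to check which nodes actually carry both tags. The sketch below reads _cat/nodeattrs output, assuming the columns are requested as node,attr,value through the h parameter. The function name nodes_matching is hypothetical, and any node layout fed to it for testing is made up for illustration rather than taken from the original figure.

```shell
#!/bin/bash
# Read _cat/nodeattrs rows (columns: node attr value) on stdin and print,
# in sorted order, the nodes that carry both my_temp=$1 and my_server=$2.
nodes_matching() {
  awk -v temp="$1" -v size="$2" '
    $2 == "my_temp"   && $3 == temp { t[$1] = 1 }
    $2 == "my_server" && $3 == size { s[$1] = 1 }
    END { for (n in t) if (n in s) print n }
  ' | sort
}

# Usage against a live cluster (placeholder host):
#   curl -s "http://localhost:9200/_cat/nodeattrs?h=node,attr,value" \
#     | nodes_matching hot medium
```

If the output is empty, no node satisfies both filters and the shards of the new index would be expected to stay unassigned.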

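Summary: In this article we covered how to use shard filtering to control where our indices are allocated. In practice this may feel a little cumbersome, because we have to manage the allocation ourselves. The technique can be combined with my earlier article "Elasticsearch: rollover API". Elasticsearch has in fact already automated this for us: in the next article, I will show how to use an index lifecycle policy to manage our indices automatically.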

Posted 2021-01-15 11:54 by fat_girl_spring
