crush class experiment

Tags (space-separated): ceph, ceph experiments, crushmap


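The luminous release of ceph adds a new feature, crush class, which can also be described as intelligent disk grouping: each disk is automatically tagged with an attribute according to its type and then classified. There is no need to edit the crushmap by hand, which greatly reduces manual work. To see how cumbersome the old procedure was, see: ceph crushmap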

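Every osd device in ceph can be associated with one class. By default, the device type is detected automatically when the osd is created, and the device is assigned to the matching class. There are usually three classes: hdd, ssd and nvme.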
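On Linux the kernel exposes a per-disk rotational flag, and the automatically assigned class appears to follow it. The sketch below is only a rough approximation of that idea, not ceph's actual detection code: the helper name guess_class is made up for illustration, and the mapping (1 to hdd, 0 to ssd) is an assumption that ignores nvme entirely.

```shell
#!/bin/sh
# Hypothetical helper: map the kernel's rotational flag to a likely class.
# Assumption: 1 (spinning disk) -> hdd, 0 (non-rotational) -> ssd.
guess_class() {
    case "$1" in
        1) echo hdd ;;
        0) echo ssd ;;
        *) echo unknown ;;
    esac
}

# On a real host the flag can be read from sysfs (sdb is a placeholder):
#   guess_class "$(cat /sys/block/sdb/queue/rotational)"
guess_class 1   # prints: hdd
guess_class 0   # prints: ssd
```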

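The test environment has no ssd or nvme devices, so this experiment relabels some osds with the ssd class and treats them as if they were ssd devices.

I. Test environment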

[root@node3 ~]# cat /etc/redhat-release 
CentOS Linux release 7.3.1611 (Core) 
[root@node3 ~]# ceph -v
ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)

II. Modifying the crush class:

1. View the current cluster layout:

[root@node3 ~]# ceph osd tree
ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF 
-1       0.05878 root default                           
-3       0.01959     host node1                         
 0   hdd 0.00980         osd.0      up  1.00000 1.00000 
 3   hdd 0.00980         osd.3      up  1.00000 1.00000 
-5       0.01959     host node2                         
 1   hdd 0.00980         osd.1      up  1.00000 1.00000 
 4   hdd 0.00980         osd.4      up  1.00000 1.00000 
-7       0.01959     host node3                         
 2   hdd 0.00980         osd.2      up  1.00000 1.00000 
 5   hdd 0.00980         osd.5      up  1.00000 1.00000 

The second column is CLASS, and every osd is of type hdd.
Listing the crush classes confirms that hdd is the only one:

[root@node3 ~]# ceph osd crush class ls
[
    "hdd"
]

2. Remove the class from osd.0, osd.1 and osd.2:

[root@node3 ~]# for i in 0 1 2;do ceph osd crush rm-device-class osd.$i;done
done removing class of osd(s): 0
done removing class of osd(s): 1
done removing class of osd(s): 2

Run ceph osd tree again to check the class of osd.0, osd.1 and osd.2:

[root@node3 ~]# ceph osd tree
ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF 
-1       0.05878 root default                           
-3       0.01959     host node1                         
 0       0.00980         osd.0      up  1.00000 1.00000 
 3   hdd 0.00980         osd.3      up  1.00000 1.00000 
-5       0.01959     host node2                         
 1       0.00980         osd.1      up  1.00000 1.00000 
 4   hdd 0.00980         osd.4      up  1.00000 1.00000 
-7       0.01959     host node3                         
 2       0.00980         osd.2      up  1.00000 1.00000 
 5   hdd 0.00980         osd.5      up  1.00000 1.00000 

The class of osd.0, osd.1 and osd.2 is now empty.

3. Set the class of osd.0, osd.1 and osd.2 to ssd:

[root@node3 ~]# for i in 0 1 2;do ceph osd crush set-device-class ssd osd.$i;done
set osd(s) 0 to class 'ssd'
set osd(s) 1 to class 'ssd'
set osd(s) 2 to class 'ssd'
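Steps 2 and 3 go together: the old class is removed first and the new one is set afterwards (on this release set-device-class is expected to refuse an osd that already has a class, which is why the removal comes first). A minimal sketch of that pairing follows; reclass_cmds is a hypothetical helper that only prints the two commands per osd, so the result can be reviewed before it is piped to sh.

```shell
#!/bin/sh
# Hypothetical helper: print the rm/set command pair for each osd id.
# usage: reclass_cmds <new-class> <osd-id>...
reclass_cmds() {
    new_class=$1
    shift
    for id in "$@"; do
        echo "ceph osd crush rm-device-class osd.$id"
        echo "ceph osd crush set-device-class $new_class osd.$id"
    done
}

# Dry run for the three osds used in this experiment.
reclass_cmds ssd 0 1 2
# To apply on a cluster: reclass_cmds ssd 0 1 2 | sh
```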

Run ceph osd tree again to check the class of osd.0, osd.1 and osd.2:

[root@node3 ~]# ceph osd tree
ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF 
-1       0.05878 root default                           
-3       0.01959     host node1                         
 3   hdd 0.00980         osd.3      up  1.00000 1.00000 
 0   ssd 0.00980         osd.0      up  1.00000 1.00000 
-5       0.01959     host node2                         
 4   hdd 0.00980         osd.4      up  1.00000 1.00000 
 1   ssd 0.00980         osd.1      up  1.00000 1.00000 
-7       0.01959     host node3                         
 5   hdd 0.00980         osd.5      up  1.00000 1.00000 
 2   ssd 0.00980         osd.2      up  1.00000 1.00000 

The class of osd.0, osd.1 and osd.2 is now ssd.
List the crush classes again:

[root@node3 ~]# ceph osd crush class ls
[
    "hdd",
    "ssd"
]

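A new class named ssd has appeared in the list.

4. Create a crush rule that places data only on ssd devices:

Create a rule named rule-ssd under the root named default, with host as the failure domain and ssd as the device class: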
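Instead of reading the tree by eye, the class of each osd can be pulled out of the ceph osd tree text. The sketch below assumes the column layout shown above (ID, CLASS, WEIGHT, NAME); osds_by_class is a hypothetical helper name, and the sample rows are copied from the step 3 output.

```shell
#!/bin/sh
# Print "<class> <osd>" pairs from "ceph osd tree" text read on stdin.
# Hypothetical helper; on a cluster: ceph osd tree | osds_by_class
osds_by_class() {
    awk '
        $4 ~ /^osd\.[0-9]+$/ { print $2, $4 }          # ID CLASS WEIGHT NAME ...
        $3 ~ /^osd\.[0-9]+$/ { print "(none)", $3 }    # CLASS column is empty
    ' | LC_ALL=C sort
}

# Sample rows copied from the tree in step 3.
osds_by_class <<'EOF'
ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF
-1       0.05878 root default
-3       0.01959     host node1
 3   hdd 0.00980         osd.3      up  1.00000 1.00000
 0   ssd 0.00980         osd.0      up  1.00000 1.00000
EOF
```

Rows whose CLASS column is empty (as in step 2) have one field fewer, which is why the osd name is checked in both the third and the fourth field.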

[root@node3 ~]# ceph osd crush rule create-replicated rule-ssd default  host ssd 

List the rules in the cluster:

[root@node3 ~]# ceph osd crush rule ls
replicated_rule
rule-ssd

A new rule named rule-ssd has appeared.
Download the cluster crushmap with the commands below to see what changed:

[root@node3 ~]# ceph osd getcrushmap -o crushmap
20
[root@node3 ~]# crushtool -d crushmap -o crushmap
[root@node3 ~]# cat crushmap
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54

# devices
device 0 osd.0 class ssd
device 1 osd.1 class ssd
device 2 osd.2 class ssd
device 3 osd.3 class hdd
device 4 osd.4 class hdd
device 5 osd.5 class hdd

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host node1 {
        id -3           # do not change unnecessarily
        id -4 class hdd         # do not change unnecessarily
        id -9 class ssd         # do not change unnecessarily
        # weight 0.020
        alg straw2
        hash 0  # rjenkins1
        item osd.0 weight 0.010
        item osd.3 weight 0.010
}
host node2 {
        id -5           # do not change unnecessarily
        id -6 class hdd         # do not change unnecessarily
        id -10 class ssd                # do not change unnecessarily
        # weight 0.020
        alg straw2
        hash 0  # rjenkins1
        item osd.1 weight 0.010
        item osd.4 weight 0.010
}
host node3 {
        id -7           # do not change unnecessarily
        id -8 class hdd         # do not change unnecessarily
        id -11 class ssd                # do not change unnecessarily
        # weight 0.020
        alg straw2
        hash 0  # rjenkins1
        item osd.2 weight 0.010
        item osd.5 weight 0.010
}
root default {
        id -1           # do not change unnecessarily
        id -2 class hdd         # do not change unnecessarily
        id -12 class ssd                # do not change unnecessarily
        # weight 0.059
        alg straw2
        hash 0  # rjenkins1
        item node1 weight 0.020
        item node2 weight 0.020
        item node3 weight 0.020
}

# rules
rule replicated_rule {
        id 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}
rule rule-ssd {
        id 1
        type replicated
        min_size 1
        max_size 10
        step take default class ssd
        step chooseleaf firstn 0 type host
        step emit
}

# end crush map

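Under root default there is a new line: id -12 class ssd (each host bucket has likewise gained an id for class ssd). Under rules there is a new rule, rule-ssd, with id 1, whose take step is restricted to class ssd.

5. Create a pool that uses the rule-ssd rule: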
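The difference between the two rules is the take step. A small sketch, assuming the decompiled text format shown above, lists each rule with the root it takes and the class it is restricted to; rule_classes is a hypothetical helper name.

```shell
#!/bin/sh
# Print "<rule> <root> <class>" for every take step in a decompiled crushmap.
# Hypothetical helper; rules without a class restriction are shown as "(any)".
rule_classes() {
    awk '
        $1 == "rule" { name = $2 }
        $1 == "step" && $2 == "take" {
            cls = ($4 == "class") ? $5 : "(any)"
            print name, $3, cls
        }
    '
}

# The two rules from the crushmap above.
rule_classes <<'EOF'
rule replicated_rule {
        id 0
        step take default
        step chooseleaf firstn 0 type host
        step emit
}
rule rule-ssd {
        id 1
        step take default class ssd
        step chooseleaf firstn 0 type host
        step emit
}
EOF
```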

[root@node3 ~]# ceph osd pool create ssdpool 64 64 rule-ssd
pool 'ssdpool' created

The details of ssdpool show crush_rule 1, which is rule-ssd:

[root@node3 ~]# ceph osd pool ls detail
pool 1 'ssdpool' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 64 pgp_num 64 last_change 39 flags hashpspool stripe_width 0
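With more than one pool, the rule id is easy to miss in that long line. The following sketch extracts the pool name and its crush_rule id from the ceph osd pool ls detail text; pool_rules is a hypothetical helper, and it assumes the field layout shown above.

```shell
#!/bin/sh
# Print "<pool> <crush_rule id>" for each pool line read on stdin.
# Hypothetical helper; on a cluster: ceph osd pool ls detail | pool_rules
pool_rules() {
    awk '$1 == "pool" {
        for (i = 1; i < NF; i++)
            if ($i == "crush_rule") print $3, $(i + 1)
    }' | tr -d "'"
}

# The line from the output above.
pool_rules <<'EOF'
pool 1 'ssdpool' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 64 pgp_num 64 last_change 39 flags hashpspool stripe_width 0
EOF
```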

6. Create an object to test ssdpool:

Create an object named test and put it into ssdpool:

[root@node3 ~]# rados -p ssdpool ls
[root@node3 ~]# echo "hahah" >test.txt
[root@node3 ~]# rados -p ssdpool put test test.txt 
[root@node3 ~]# rados -p ssdpool ls
test

Look up the osd set for the object:

[root@node3 ~]# ceph osd map ssdpool test
osdmap e46 pool 'ssdpool' (1) object 'test' -> pg 1.40e8aab5 (1.35) -> up ([1,2,0], p1) acting ([1,2,0], p1)

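The object's osd set [1,2,0] consists entirely of the osds labelled ssd, so the experiment is verified. A crush class is in effect a label that identifies the disk type.

III. References:

  1. ceph luminous new feature: intelligent disk grouping
  2. CRUSH MAPS
  3. ceph crushmap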
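The same check can be scripted. The sketch below pulls the acting set out of the ceph osd map line and tests every id against a list of allowed osds; both helper names are hypothetical, and the allowed list is written out by hand here (on a cluster it could presumably be taken from ceph osd crush class ls-osd ssd).

```shell
#!/bin/sh
# Print the acting osd ids, one per line, from "ceph osd map" text on stdin.
acting_osds() {
    sed -n 's/.* acting (\[\([0-9,]*\)].*/\1/p' | tr ',' '\n'
}

# Succeed only if every id read on stdin is in the space-separated list $1.
all_in_class() {
    allowed=" $1 "
    while read -r id; do
        case "$allowed" in
            *" $id "*) ;;
            *) echo "osd.$id is outside the class"; return 1 ;;
        esac
    done
}

# The line from the output above, checked against the three relabelled osds.
line="osdmap e46 pool 'ssdpool' (1) object 'test' -> pg 1.40e8aab5 (1.35) -> up ([1,2,0], p1) acting ([1,2,0], p1)"
if printf '%s\n' "$line" | acting_osds | all_in_class "0 1 2"; then
    echo "all replicas are on ssd-class osds"
fi
```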

posted on 2017-11-08 15:07  sisimi_2017