部署前先考虑mysql在Kubernetes中需要的过程细节
维持MySql主从集群状态需要依赖ip,而且配置好后是固定的,但是,在Kubernetes中pod重启后ip就变了,而实时监控po ip又很困难。可不可以利用类似动态dns那种ip与域名自动映射,到时候直接用固定域名访问不是就可以解决。所以可以借助COREDNS的能力。因为换了概念,把主从配置的ip换成域名。那么域名就需要是固定。
集群搭建,我需要对外提供四层服务那么就需要NodePort Service。而维持集群状态,我需要集群发现功能。目的是把域名给我解析成地址,那么就要Headless Service
因为集群重启有序,并且涉及数据持久化数据是独立存储,挂掉恢复或者重启还能自动连接。那么结合最上面一条综合考虑,满足这些需求的就只剩sts(sts控制的pod与容器名字刚好就是sts.name + 编号)。因为deploy创建的pod名字与容器内名字不固定(命名规则:deploy.name+rs.name+hash)重启po ip也会变,并且持久化也会因为重启而且不能自动恢复,最严重启动无序导致集群混乱异常。而恰恰这些原因都是部署集群最关注的地方。
组件缩写:
deploy: Deployment
sts: StatefulSet
po: Pod
cm: ConfigMap
sc: StorageClass
pv: PersistentVolume
pvc: PersistentVolumeClaim
1、部署需求
1). 是主从复制集群
2). 有1个或者多个节点(本例单主)
3). 从节点可以横向扩缩
4). 所有写操作只能在主节点,读在所有节点都可读取
5). 如果MySql容器进程挂掉或主从异常能重启自动恢复
6). 主节点挂掉能自动恢复
Master: write
Slave: read
Master index = 0
Slave index = 1 or 2
index根据hostname抓取后面的id取值即可,因为sts的pod内容器命名规则是sts name+id形成,id就可以作为索引。因为sts中pod启动顺序是有序的,所以不怕混乱,挂掉恢复也会保持。
Xtrabackup
注:都知道这个是mysql备份工具,有一点要注意,8.0版本不能备份之前版本的mysql,只能备份8.0以上的mysql数据库
Xtrabackup备份后会生成一些文件
1)、 xtrabackup开头的都是
[root@master mysql-statefulset-pod]# kubectl exec -it pod/mysql-0 -n mysql -c mysql -- bash
root@mysql-1:/var/lib/mysql# ls
'#innodb_temp' ca-key.pem ib_logfile0 mysql-0-bin.index private_key.pem undo_002 xtrabackup_master_key_id
auto.cnf ca.pem ib_logfile1 mysql-1-relay-bin.000001 public_key.pem xtrabackup_backupfiles xtrabackup_slave_info
backup-my.cnf change_master_to.sql.orig ibdata1 mysql-1-relay-bin.000002 server-cert.pem xtrabackup_binlog_info xtrabackup_tablespaces
binlog.000001 client-cert.pem ibtmp1 mysql-1-relay-bin.index server-key.pem xtrabackup_checkpoints
binlog.000002 client-key.pem mysql mysql.ibd sys xtrabackup_info
binlog.index ib_buffer_pool mysql-0-bin.000004 performance_schema undo_001 xtrabackup_logfile
root@mysql-1:/var/lib/mysql#
2)、 备份的信息
root@mysql-1:/var/lib/mysql# cat xtrabackup_info
uuid = 1668c169-a3d1-11eb-a3a8-72033cfab7a8
name =
tool_name = xtrabackup
tool_command = --backup --slave-info --stream=xbstream --host=127.0.0.1 --user=root
tool_version = 8.0.9
ibbackup_version = 8.0.9
server_version = 8.0.19
start_time = 2021-04-23 01:13:12
end_time = 2021-04-23 01:13:18
lock_time = 0
binlog_pos = filename 'mysql-0-bin.000004', position '155'
innodb_from_lsn = 0
innodb_to_lsn = 31267893
partial = N
incremental = N
format = xbstream
compressed = N
encrypted = N
3)、 主从CHANGE MASTER TO中的MASTER_LOG_FILE与MASTER_LOG_POS的信息就在这取
root@mysql-1:/var/lib/mysql# cat xtrabackup_binlog_info
mysql-0-bin.000004 155
4)、 这个文件在恢复数据时会先检查它,如果没有就报错。同样这个在用xtrabackup增量备份的时候会检查,其中的LAST_LSN如果大于这个值就开始增量备份
root@mysql-1:/var/lib/mysql# cat xtrabackup_checkpoints
backup_type = full-prepared
from_lsn = 0
to_lsn = 31267404
last_lsn = 31267414
flushed_lsn = 0
5)、 这个文件是空的,可以在后面判断是主从复制时使用。是从主节点来的数据,还是从从节点来的数据。主节点过来的数据,虽然有这个文件但是文件是空的,而从节点过来的是有数据的。
root@mysql-1:/var/lib/mysql# cat xtrabackup_slave_info
root@mysql-1:/var/lib/mysql#
正文:
Kubernetes跑Mysql自动主从流程
1、 先从xtrabackup中截取MASTER_LOG_FILE and MASTER_LOG_POS
2、 判断机器索引是否为0,如果是那么为0,那么就是主节点,跳过。否则就是从节点,那么就需要执行CHANGE MASTER TO配置。
3、启动从节点
4、加入其他从节点
目前涉及有三个点:
1、主节点与从节点my.cnf配置不同需要区分开
2、节点之间需要传输数据
3、从节点第一次执行需要一些初始设定,而且重启需要防止第一个从节点二次初始化。(例如:CHANGE MASTER TO)
先解决第一个问题
1)、 配置文件不同,那就在一个cm下以不同的key,定义不同的配置。然后根据判断index后copy过去,可以用initcontainer搞这个事情
apiVersion: v1
kind: ConfigMap
metadata:
name: mysql
namespace: mysql
labels:
app: mysql
data:
master.cnf: |
[client]
default-character-set=utf8mb4
[mysql]
default-character-set=utf8mb4
[mysqld]
log-bin
binlog_expire_logs_seconds=2592000
max_connections=1000000
binlog_format=MIXED
default-time-zone='+8:00'
character-set-client-handshake=FALSE
character-set-server=utf8mb4
collation-server=utf8mb4_unicode_ci
init_connect='SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci'
slave.cnf: |
[client]
default-character-set=utf8mb4
[mysql]
default-character-set=utf8mb4
[mysqld]
super-read-only
max_connections=1000000
binlog_format=MIXED
# 下面三条结尾介绍配置原因
relay_log_info_repository=table
master_info_repository=table
relay_log_recovery = 1
default-time-zone='+8:00'
character-set-client-handshake=FALSE
character-set-server=utf8mb4
collation-server=utf8mb4_unicode_ci
init_connect='SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci'
判断索引
initContainers:
- name: init-mysql
image: mysql:8.0.19
imagePullPolicy: IfNotPresent
command:
- bash
- "-c"
- |
set -ex
# 从hostname中去index,例子:mysql-0去的就是后边的0。 [[ `hostname` =~ -([0-9]+)$ ]] || exit 1
# BASH_REMATCH[x]取组,配合shell中的( )使用,匹配的括号内正则匹配的值。
# 取值赋给myinde变量,如果严格来写,记得变量大写。我比较懒,都小写了。 myindex=${BASH_REMATCH[1]}
echo [mysqld] > /mnt/conf.d/server-id.cnf
echo server-id=$((100 + ${myindex})) >> /mnt/conf.d/server-id.cnf
if [[ ${myindex} -eq 0 ]]; then
cp /mnt/config-map/master.cnf /mnt/conf.d/
else
cp /mnt/config-map/slave.cnf /mnt/conf.d/
fi
2)、 传输文件可以用ncat来进行
只接受文件,但是这个实际还有一层意思。是向mysql-[id] 3307发送请求。initcontainer下的clone-mysql
ncat --recv-only mysql-$((${myindex}-1)).mysql 3307
监听3307端口,收到请求,只发送数据。container下的xtrabackup
exec ncat --listen --keep-open --send-only --max-conns=1 3307
3)、 从节点第一次执行需要一些初始设定,并防止重启或恢复后二次初始化主从配置。(例如:CHANGE MASTER TO)
containers: - name: xtrabackup image: mzmuer/xtrabackup:1.0 imagePullPolicy: IfNotPresent ports: - name: xtrabackup containerPort: 3307 command: - bash - "-c" - | set -ex
# 进入数据目录(或者说要备份或恢复的目录) cd /var/lib/mysql
# 如果这个文件有数据,证明,这是从从节点传输过来的。(-s)如果有数据就成立
# 如果没有数据有xtrabackup_binlog_info,就会认为是从主节点过来的,但是若不是,会因为判断错误导致change_master_to.sql.in再次被重写,影响后续第一个从节点之后所有以后节点的主从配置(因为第二次copy的binlog中是slave的binlog信息与偏移量)
# 所以xtrabackup_binlog_info这个存在会影响判断变更结果,所以需要删除。 if [[ -s xtrabackup_slave_info ]]; then
# 直接改名 mv xtrabackup_slave_info change_master_to.sql.in
# 如果不删除会继续下面elif的判断,不删除的条件成立会继续执行导致文件变更。 rm -f xtrabackup_binlog_info
# 如果这个文件存在,证明数据是从主节点传输过来的。 elif [[ -f xtrabackup_binlog_info ]]; then
# 那么,取其中值因为中间是空格分割所以用了[[:space:]]配合两端()用BASH_REMATCH取值。 [[ `cat xtrabackup_binlog_info` =~ ^(.*?)[[:space:]]+(.*?)$ ]] || exit 1
# 取到的值以换行回车方式存储到change_master_to.sql.in echo -e "CHANGE MASTER TO\nMASTER_LOG_FILE='${BASH_REMATCH[1]}',\nMASTER_LOG_POS=${BASH_REMATCH[2]}" > change_master_to.sql.in
# 删除为了如果容器重启第一个从节点二次初始化。 rm -f xtrabackup_binlog_info fi if [[ -f change_master_to.sql.in ]]; then echo "Waiting for mysqld to be ready (accepting connections)" until mysql -h 127.0.0.1 -e "SELECT 1"; do sleep 1; done echo "Initializing replication from clone position" mv change_master_to.sql.in change_master_to.sql.orig mysql -h 127.0.0.1 <<EOF $(<change_master_to.sql.orig), MASTER_HOST='mysql-0.mysql', MASTER_USER='repl', MASTER_PASSWORD='repl123.', MASTER_CONNECT_RETRY=10; START SLAVE; EOF echo "master slave config ok" fi
配置对外提供四层服务与维持集群状态的Service(两个)
apiVersion: v1
kind: Service
metadata:
name: mysql
namespace: mysql
labels:
app: mysql
spec:
selector:
app: mysql
ports:
- name: mysql
port: 3306
clusterIP: None
---
apiVersion: v1
kind: Service
metadata:
name: mysql-read
namespace: mysql
labels:
app: mysql
spec:
selector:
app: mysql
ports:
- name: mysql
port: 3306
先写主题框架:
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: mysql
namespace: mysql
spec:
replicas: 3
selector:
matchLabels:
app: mysql
serviceName: mysql
template:
metadata:
labels:
app: mysql
spec:
# 初始化容器 initContainers:
# 初始化master或slave配置文件用 - name: init-mysql
image: mysql:8.0.19
# 本地有镜像就直接用没有才下载 imagePullPolicy: IfNotPresent
# 从节点复制备份文件用 - name: clone-mysql
image: mzmuer/xtrabackup:1.0
imagePullPolicy: IfNotPresent
containers:
# 主要服务容器 - name: mysql
image: mysql:8.0.19
imagePullPolicy: IfNotPresent
ports:
# 容器接口 - name: mysql
containerPort: 3306
# master or slave初始化主从配置 - name: xtrabackup
image: mzmuer/xtrabackup:1.0
imagePullPolicy: IfNotPresent
ports:
- name: xtrabackup
containerPort: 3307
# 初始化配置文件所有的共享存储 volumes:
- name: conf
emptyDir: {}
# 数据文件存储 - name: config-map
configMap:
# cm的名字 name: mysql
# 创建pvc使用这个模板 volumeClaimTemplates:
- metadata:
name: data
spec:
# 存储类名字sc名 storageClassName: "managed-nfs-storage"
# 匹配pv的一个参数,读写权限 accessModes: ["ReadWriteMany"]
resources:
requests:
# 匹配pv的指标容量 storage: 10Gi
填写初始化配置文件容器:
initContainers:
- name: init-mysql
image: mysql:8.0.19
imagePullPolicy: IfNotPresent
command:
- bash
- "-c"
- |
set -ex
[[ `hostname` =~ -([0-9]+)$ ]] || exit 1
myindex=${BASH_REMATCH[1]}
# 两个细节两个配置文件,第一个server-id.cnf里面记录的是集群内的成员id用来区分实例。 echo [mysqld] > /mnt/conf.d/server-id.cnf
# 向刚刚的server-id.cnf中写入server-id=100+id。 echo server-id=$((100 + ${myindex})) >> /mnt/conf.d/server-id.cnf
# 如果index为0,那么就拷贝cm中的master配置过去,其他就拷贝slave配置过去 if [[ ${myindex} -eq 0 ]]; then
cp /mnt/config-map/master.cnf /mnt/conf.d/
else
cp /mnt/config-map/slave.cnf /mnt/conf.d/
fi
volumeMounts:
- name: conf
mountPath: /mnt/conf.d
- name: config-map
mountPath: /mnt/config-map
注:所有MySQL实例都要配置一个独一无二的server-id。默认值为0,当server-id=0时,虽然会记录二进制日志,如果在主节点,会拒绝所有的从节点连接。如果是从节点,会拒绝链接其他实例
server_id不能相同,因为用来决定是否执行event,相同就不执行
填写初始化复制文件容器:
initContainers:
- name: clone-mysql
image: mzmuer/xtrabackup:1.0
imagePullPolicy: IfNotPresent
command:
- bash
- "-c"
- |
set -ex
# 如果目录存在证明已经有数据了,退出。否则执行下一个判断。
[[ -d /var/lib/mysql/mysql ]] && exit 0
# 右侧正则表达式如果不成立退出
[[ `hostname` =~ -([0-9]+)$ ]] || exit 1
# 取上面正则表达式匹配的值
myindex=${BASH_REMATCH[1]}
# 如果索引等于0是是主节点,退出
[[ ${myindex} -eq 0 ]] && exit 0
# 向mysql-[id] 3307 发起请求并只接受数据,且解压数据到/var/lib/mysql中(x解压C指定目录)
ncat --recv-only mysql-$((${myindex}-1)).mysql 3307 | xbstream -x -C /var/lib/mysql
# 恢复数据
xtrabackup --prepare --target-dir=/var/lib/mysql
# --redo-only 加了会导致未提交的事务不回滚
volumeMounts:
- name: data
mountPath: /var/lib/mysql
subPath: mysql
- name: conf
mountPath: /etc/mysql/conf.d
填写主服务容器:
containers:
- name: mysql
image: mysql:8.0.19
imagePullPolicy: IfNotPresent
args: ["--default-authentication-plugin=mysql_native_password"]
env:
- name: MYSQL_ALLOW_EMPTY_PASSWORD
value: "1"
ports:
- name: mysql
containerPort: 3306
volumeMounts:
- name: data
mountPath: /var/lib/mysql
subPath: mysql
- name: conf
mountPath: /etc/mysql/conf.d
resources:
requests:
cpu: 250m
memory: 256Mi
limits:
cpu: 500m
memory: 512Mi
livenessProbe:
exec:
command: ["mysqladmin", "ping"]
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 5
readinessProbe:
exec:
command: ["mysql", "-h", "127.0.0.1", "-e", "SELECT 1"]
initialDelaySeconds: 5
periodSeconds: 2
timeoutSeconds: 1
填写初始化配置容器:
containers:
- name: xtrabackup
image: mzmuer/xtrabackup:1.0
imagePullPolicy: IfNotPresent
ports:
- name: xtrabackup
containerPort: 3307
command:
- bash
- "-c"
- |
set -ex
# 进入数据目录(或者说要备份或恢复的目录)
cd /var/lib/mysql
# 如果这个文件有数据,证明,这是从从节点传输过来的。(-s)如果有数据就成立
# 如果没有数据有xtrabackup_binlog_info,就会认为是从主节点过来的,但是若不是,会因为判断错误导致change_master_to.sql.in再次被重写,影响后续第一个从节点之后所有以后节点的主从配置(因为第二次copy的binlog中是slave的binlog信息与偏移量)
# 所以xtrabackup_binlog_info这个存在会影响判断变更结果,所以需要删除。
if [[ -s xtrabackup_slave_info ]]; then
# 直接改名 mv xtrabackup_slave_info change_master_to.sql.in
# 如果不删除会继续下面elif的判断,不删除的条件成立会继续执行导致文件变更。 rm -f xtrabackup_binlog_info
# 如果这个文件存在,证明数据是从主节点传输过来的。 elif [[ -f xtrabackup_binlog_info ]]; then
# 那么,取其中值因为中间是空格分割所以用了[[:space:]]配合两端()用BASH_REMATCH取值。 [[ `cat xtrabackup_binlog_info` =~ ^(.*?)[[:space:]]+(.*?)$ ]] || exit 1
# 取到的值以换行回车方式存储到change_master_to.sql.in echo -e "CHANGE MASTER TO\nMASTER_LOG_FILE='${BASH_REMATCH[1]}',\nMASTER_LOG_POS=${BASH_REMATCH[2]}" > change_master_to.sql.in
# 删除为了如果容器重启第一个从节点二次重写change_master_to.sql.in执行后续初始化 rm -f xtrabackup_binlog_info fi if [[ -f change_master_to.sql.in ]]; then echo "Waiting for mysqld to be ready (accepting connections)"
# 检测服务是否启动 until mysql -h 127.0.0.1 -e "SELECT 1"; do sleep 1; done echo "Initializing replication from clone position"
# 重命名防止重启在此检测初始化 mv change_master_to.sql.in change_master_to.sql.orig
# 初始化主从配置 mysql -h 127.0.0.1 <<EOF $(<change_master_to.sql.orig), MASTER_HOST='mysql-0.mysql', MASTER_USER='repl', MASTER_PASSWORD='repl123.', MASTER_CONNECT_RETRY=10; START SLAVE; EOF echo "master slave config ok" fi [[ `hostname` =~ -([0-9]+)$ ]] || exit 1 myindex=${BASH_REMATCH[1]}
# 如果是主节点配置主从同步账户 if [[ ${myindex} -eq 0 ]]; then mysql -h 127.0.0.1 <<EOF use mysql; delete from user where user='repl' and host='%';flush privileges; CREATE USER 'repl'@'%' IDENTIFIED WITH mysql_native_password BY 'repl123.'; GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';flush privileges; SELECT Host, User, plugin from mysql.user; EOF else
# 获取主从健康状态 Slave_healthy=`mysql -h 127.0.0.1 -e 'show slave status\G' 2> /dev/null |grep -E "Slave_IO_Running|Slave_SQL_Running"|awk '{print $2}'|grep -c Yes` if [[ ${Slave_healthy} -eq 2 ]]; then echo "master slave healthy ok"
# 主从运行中需要停止刷新reloy_log然后开启从节点 elif [[ ${Slave_healthy} -eq 1 ]]; then echo "master slave healthy no" mysql -h 127.0.0.1 <<EOF stop slave; reset slave; start slave; EOF echo "master slave healthy revert"
# 主从为启动可能是reloy_log还是上一个节点信息导致,直接刷新开启从节点 elif [[ ${Slave_healthy} -eq 0 ]]; then echo "master slave healthy no" mysql -h 127.0.0.1 <<EOF reset slave; start slave; EOF echo "master slave healthy revert" fi fi
# 监听 3307 端口,如果收到请求。只发送备份数据。 exec ncat --listen --keep-open --send-only --max-conns=1 3307 -c \ "xtrabackup --backup --slave-info --stream=xbstream --host=127.0.0.1 --user=root" volumeMounts: - name: data mountPath: /var/lib/mysql subPath: mysql - name: conf mountPath: /etc/mysql/conf.d resources: requests: cpu: 100m memory: 100Mi limits: cpu: 200m memory: 200Mi
存储用的nfs,但是安装了插件使用了储存类,如果出异常,排错方法如下:
1、现在nfs服务端查看/etc/exports配置文件,没有配置加配置
2、查看nfs服务状态,如果正常。使用exportfs -v查看信息
3、如果nfs服务状态异常,重启服务记得先启动rpcbind后启动nfs
4、上述正常,在其他设备上showmount -e [nfs ip]查看nfs可用挂载目录信息
都正常后,安装rbac deployment 与 class待sc出现即可使用nfs存储类
到此整体部署结束。
注:每次尾从节点有一个大坑。
如果CM配置文件slave.cnf部分没有relay_log_info_repository=table、master_info_repository=table和relay_log_recovery=1那么最后一个从节点,始终不会正常,原因我说在下面。
1 [root@master ~]# kubectl logs pod/mysql-2 -n mysql -c mysql
2 2021-04-23 08:53:50+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 8.0.19-1debian10 started.
3 2021-04-23 08:53:50+00:00 [Note] [Entrypoint]: Switching to dedicated user 'mysql'
4 2021-04-23 08:53:50+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 8.0.19-1debian10 started.
5 2021-04-23T08:53:50.504572Z 0 [Warning] [MY-000081] [Server] option 'max_connections': unsigned value 1000000 adjusted to 100000.
6 2021-04-23T08:53:50.969604Z 0 [Warning] [MY-011070] [Server] 'Disabling symbolic links using --skip-symbolic-links (or equivalent) is the default. Consider not using this option as it' is deprecated and will be removed in a future release.
7 2021-04-23T08:53:50.969742Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.19) starting as process 1
8 2021-04-23T08:53:52.853035Z 0 [System] [MY-010229] [Server] Starting XA crash recovery...
9 2021-04-23T08:53:52.867197Z 0 [System] [MY-010232] [Server] XA crash recovery finished.
10 2021-04-23T08:53:52.937037Z 0 [Warning] [MY-010075] [Server] No existing UUID has been found, so we assume that this is the first time that this server has been started. Generating a new UUID: 6dda8985-a411-11eb-9ed4-f223f4afe76a.
11 2021-04-23T08:53:53.754834Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
12 2021-04-23T08:53:53.964524Z 0 [Warning] [MY-011810] [Server] Insecure configuration for --pid-file: Location '/var/run/mysqld' in the path is accessible to all OS users. Consider choosing a different directory.
13 2021-04-23T08:53:54.042613Z 0 [Warning] [MY-010604] [Repl] Neither --relay-log nor --relay-log-index were used; so replication may break when this MySQL server acts as a slave and has his hostname changed!! Please use '--relay-log=mysql-2-relay-bin' to avoid this problem.
14 2021-04-23T08:53:54.106498Z 0 [ERROR] [MY-010544] [Repl] Failed to open the relay log './mysql-1-relay-bin.000002' (relay_log_pos 204).
15 2021-04-23T08:53:54.106684Z 0 [ERROR] [MY-011059] [Repl] Could not find target log file mentioned in relay log info in the index file './mysql-2-relay-bin.index' during relay log initialization.
16 2021-04-23T08:53:54.111596Z 0 [ERROR] [MY-010426] [Repl] Slave: Failed to initialize the master info structure for channel ''; its record may still be present in 'mysql.slave_master_info' table, consider deleting it.
17 2021-04-23T08:53:54.111764Z 0 [ERROR] [MY-010529] [Repl] Failed to create or recover replication info repositories.
18 2021-04-23T08:53:54.114091Z 0 [System] [MY-010931] [Server] /usr/sbin/mysqld: ready for connections. Version: '8.0.19' socket: '/var/run/mysqld/mysqld.sock' port: 3306 MySQL Community Server - GPL.
19 2021-04-23T08:53:54.177303Z 0 [System] [MY-011323] [Server] X Plugin ready for connections. Socket: '/var/run/mysqld/mysqlx.sock' bind-address: '::' port: 33060
20 2021-04-23T08:53:54.449627Z 9 [System] [MY-010597] [Repl] 'CHANGE MASTER TO FOR CHANNEL '' executed'. Previous state master_host='mysql-0.mysql', master_port= 3306, master_log_file='mysql-0-bin.000004', master_log_pos= 155, master_bind=''. New state master_host='mysql-0.mysql', master_port= 3306, master_log_file='mysql-0-bin.000004', master_log_pos= 155, master_bind=''.
标注部分提示找不到mysql-2-relay-bin.index文件。不是没有,但是不是在这个节点,是在上一个节点。因为是冷备文件恢复,所以导致Relay_log内容还是上一个节点的Relay_log,所以启动必报错。
而mysql-1-relay-bin.000002就是第一个从节点的Relay_log_name
1、relay log 文件:由IO thread线程从主库读取的二进制日志事件组成,该日志被Slave上的SQL thread线程执行,从而实现数据的复制。
2、master info log:该文件保存slave连接master的状态以及配置信息。在my.cnf 配置master-info-repository=table使用。这些信息会被写入slave_master_info 表中。替换早期版本的相关文件。
3、relay log info log:该文件保存slave上relay log的执行位置。在my.cnf中配置 relay-log-info-repository=table使用。用slave_relay_log_info表替换早期版本的相关文件。当slave上执行start slave时,就会读取表中的位置信息。
使用表替换原来的文件,是为了崩溃后的安全复制。大大提高意外情况下从库的可靠性。
reset slave作用:
1、删除slave_master_info ,slave_relay_log_info两个表中数据。
2、删除所有relay log文件,并重新创建新的relay log文件。
3、不会改变gtid_executed 或者 gtid_purged的值。
relay_log_recovery
打开该参数,在数据库启动后立即启动自动relay log恢复。在恢复过程中,创建一个新的relay log文件。将sql线程的位置初始化到新的relay log,并将i/o线程初始化到sql线程位置。
mysql在运行过程中,从库上可能会出现以外宕机的情况,那么在从库启动后,必须能够恢复到停止之前的状态。i/o线程必须恢复到接受事务的位置,sql线程必须恢复到一直执行事务的位置。
用表来存储这些信息,并且把这些表设置为innodb引擎,通过使用事务性存储引擎,总能够恢复这个信息。可以配置参数master_info_repository=table和relay_log_info_repository=table使从库信息存储到表中。从库如何从宕机状态恢复到正确状态,取决与从库是单线程还是多线程,relay_log_recover参数值,以及master_auto_position的使用方式。
1,单线程模式的复制
当基于gtid模式复制的时候,并且设置了master_auto_posion参数和relay_log_recover=0,使用该设置,其中relay_log_info_repository和其他变量的设置都不会影响恢复。
当基于传统模式复制的情况下,需要设置relay_log_recovery=1和relay_log_info_repository=table。
2,多线程模式的复制
当基于gtid模式复制的时候,并且设置了master_auto_position和relay_log_recover=0,使用该配置,其relay_log_info_repository和其他变量的设置都会影响恢复。
当基于传统模式复制的时候,请设置relay_log_recover=1,sync_relay_log=1和relay_log_info_repository=table。
现在从节点查看slave_relay_log_info然后做记录。
[root@master ~]# kubectl exec -it pod/mysql-1 -n mysql -c mysql -- bash
1 mysql> select * from slave_relay_log_info\G
2 *************************** 1. row ***************************
3 Number_of_lines: 11
4 Relay_log_name: ./mysql-1-relay-bin.000002
5 Relay_log_pos: 204
6 Master_log_name: mysql-0-bin.000004
7 Master_log_pos: 155
8 Sql_delay: 0
9 Number_of_workers: 0
10 Id: 1
11 Channel_name:
12 Privilege_checks_username: NULL
13 Privilege_checks_hostname: NULL
14 Require_row_format: 0
15 1 row in set (0.00 sec)
查看尾从节点的slave_relay_log_info然后再次做记录。
[root@master ~]# kubectl exec -it pod/mysql-2 -n mysql -c mysql -- bash
1 mysql> select * from slave_relay_log_info\G
2 *************************** 1. row ***************************
3 Number_of_lines: 11
4 Relay_log_name: ./mysql-1-relay-bin.000002
5 Relay_log_pos: 204
6 Master_log_name: mysql-0-bin.000004
7 Master_log_pos: 155
8 Sql_delay: 0
9 Number_of_workers: 0
10 Id: 1
11 Channel_name:
12 Privilege_checks_username: NULL
13 Privilege_checks_hostname: NULL
14 Require_row_format: 0
15 1 row in set (0.00 sec)
查看正常之后的。
[root@master ~]# kubectl exec -it pod/mysql-2 -n mysql -c mysql -- bash
1 mysql> select * from slave_relay_log_info\G
2 *************************** 1. row ***************************
3 Number_of_lines: 11
4 Relay_log_name: ./mysql-2-relay-bin.000007
5 Relay_log_pos: 373
6 Master_log_name: mysql-0-bin.000004
7 Master_log_pos: 155
8 Sql_delay: 0
9 Number_of_workers: 0
10 Id: 1
11 Channel_name:
12 Privilege_checks_username: NULL
13 Privilege_checks_hostname: NULL
14 Require_row_format: 0
15 1 row in set (0.00 sec)
附上sts完整yaml代码
1 apiVersion: apps/v1
2 kind: StatefulSet
3 metadata:
4 name: mysql
5 namespace: mysql
6 spec:
7 replicas: 3
8 selector:
9 matchLabels:
10 app: mysql
11 serviceName: mysql
12 template:
13 metadata:
14 labels:
15 app: mysql
16 spec:
17 initContainers:
18 - name: init-mysql
19 image: mysql:8.0.19
20 imagePullPolicy: IfNotPresent
21 command:
22 - bash
23 - "-c"
24 - |
25 set -ex
26 [[ `hostname` =~ -([0-9]+)$ ]] || exit 1
27 myindex=${BASH_REMATCH[1]}
28 echo [mysqld] > /mnt/conf.d/server-id.cnf
29 echo server-id=$((100 + ${myindex})) >> /mnt/conf.d/server-id.cnf
30 if [[ ${myindex} -eq 0 ]]; then
31 cp /mnt/config-map/master.cnf /mnt/conf.d/
32 else
33 cp /mnt/config-map/slave.cnf /mnt/conf.d/
34 fi
35 volumeMounts:
36 - name: conf
37 mountPath: /mnt/conf.d
38 - name: config-map
39 mountPath: /mnt/config-map
40 - name: clone-mysql
41 image: mzmuer/xtrabackup:1.0
42 imagePullPolicy: IfNotPresent
43 command:
44 - bash
45 - "-c"
46 - |
47 set -ex
48 [[ -d /var/lib/mysql/mysql ]] && exit 0
49 [[ `hostname` =~ -([0-9]+)$ ]] || exit 1
50 myindex=${BASH_REMATCH[1]}
51 [[ ${myindex} -eq 0 ]] && exit 0
52 ncat --recv-only mysql-$((${myindex}-1)).mysql 3307 | xbstream -x -C /var/lib/mysql
53 xtrabackup --prepare --target-dir=/var/lib/mysql
54 # --redo-only 加了会导致为提交的事务不回滚
55 volumeMounts:
56 - name: data
57 mountPath: /var/lib/mysql
58 subPath: mysql
59 - name: conf
60 mountPath: /etc/mysql/conf.d
61 containers:
62 - name: mysql
63 image: mysql:8.0.19
64 imagePullPolicy: IfNotPresent
65 args: ["--default-authentication-plugin=mysql_native_password"]
66 env:
67 - name: MYSQL_ALLOW_EMPTY_PASSWORD
68 value: "1"
69 ports:
70 - name: mysql
71 containerPort: 3306
72 volumeMounts:
73 - name: data
74 mountPath: /var/lib/mysql
75 subPath: mysql
76 - name: conf
77 mountPath: /etc/mysql/conf.d
78 resources:
79 requests:
80 cpu: 250m
81 memory: 256Mi
82 limits:
83 cpu: 500m
84 memory: 512Mi
85 livenessProbe:
86 exec:
87 command: ["mysqladmin", "ping"]
88 initialDelaySeconds: 30
89 periodSeconds: 10
90 timeoutSeconds: 5
91 readinessProbe:
92 exec:
93 command: ["mysql", "-h", "127.0.0.1", "-e", "SELECT 1"]
94 initialDelaySeconds: 5
95 periodSeconds: 2
96 timeoutSeconds: 1
97 - name: xtrabackup
98 image: mzmuer/xtrabackup:1.0
99 imagePullPolicy: IfNotPresent
100 ports:
101 - name: xtrabackup
102 containerPort: 3307
103 command:
104 - bash
105 - "-c"
106 - |
107 set -ex
108 mkdir /test
109 cd /var/lib/mysql
110 if [[ -s xtrabackup_slave_info ]]; then
111 mv xtrabackup_slave_info change_master_to.sql.in
112 rm -f xtrabackup_binlog_info
113 elif [[ -f xtrabackup_binlog_info ]]; then
114 [[ `cat xtrabackup_binlog_info` =~ ^(.*?)[[:space:]]+(.*?)$ ]] || exit 1
115 echo -e "CHANGE MASTER TO\nMASTER_LOG_FILE='${BASH_REMATCH[1]}',\nMASTER_LOG_POS=${BASH_REMATCH[2]}" > change_master_to.sql.in
116 rm -f xtrabackup_binlog_info
117 fi
118 if [[ -f change_master_to.sql.in ]]; then
119 echo "Waiting for mysqld to be ready (accepting connections)"
120 until mysql -h 127.0.0.1 -e "SELECT 1"; do sleep 1; done
121 echo "Initializing replication from clone position"
122 mv change_master_to.sql.in change_master_to.sql.orig
123 mysql -h 127.0.0.1 <<EOF
124 $(<change_master_to.sql.orig),
125 MASTER_HOST='mysql-0.mysql',
126 MASTER_USER='repl',
127 MASTER_PASSWORD='repl123.',
128 MASTER_CONNECT_RETRY=10;
129 START SLAVE;
130 EOF
131 echo "master slave config ok"
132 fi
133 [[ `hostname` =~ -([0-9]+)$ ]] || exit 1
134 myindex=${BASH_REMATCH[1]}
135 if [[ ${myindex} -eq 0 ]]; then
136 mysql -h 127.0.0.1 <<EOF
137 use mysql;
138 delete from user where user='repl' and host='%';flush privileges;
139 CREATE USER 'repl'@'%' IDENTIFIED WITH mysql_native_password BY 'repl123.';
140 GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';flush privileges;
141 SELECT Host, User, plugin from mysql.user;
142 EOF
143 else
144 Slave_healthy=`mysql -h 127.0.0.1 -e 'show slave status\G' 2> /dev/null |grep -E "Slave_IO_Running|Slave_SQL_Running"|awk '{print $2}'|grep -c Yes`
145 if [[ ${Slave_healthy} -eq 2 ]]; then
146 echo "master slave healthy ok"
147 elif [[ ${Slave_healthy} -eq 1 ]]; then
148 echo "master slave healthy no"
149 mysql -h 127.0.0.1 <<EOF
150 stop slave;
151 reset slave;
152 start slave;
153 EOF
154 echo "master slave healthy revert"
155 elif [[ ${Slave_healthy} -eq 0 ]]; then
156 echo "master slave healthy no"
157 mysql -h 127.0.0.1 <<EOF
158 reset slave;
159 start slave;
160 EOF
161 echo "master slave healthy revert"
162 fi
163 fi
164 exec ncat --listen --keep-open --send-only --max-conns=1 3307 -c \
165 "xtrabackup --backup --slave-info --stream=xbstream --host=127.0.0.1 --user=root"
166 volumeMounts:
167 - name: data
168 mountPath: /var/lib/mysql
169 subPath: mysql
170 - name: conf
171 mountPath: /etc/mysql/conf.d
172 resources:
173 requests:
174 cpu: 100m
175 memory: 100Mi
176 limits:
177 cpu: 200m
178 memory: 200Mi
179 volumes:
180 - name: conf
181 emptyDir: {}
182 - name: config-map
183 configMap:
184 name: mysql
185 volumeClaimTemplates:
186 - metadata:
187 name: data
188 spec:
189 storageClassName: "managed-nfs-storage"
190 accessModes: ["ReadWriteMany"]
191 resources:
192 requests:
193 storage: 10Gi
作者:K&