MySQL Master-Slave Replication and Clustering
Goal: master the principles and implementation of master-slave replication and build a MySQL master-slave cluster.
Drawback of a single MySQL node: a single point of failure can take the whole system down, leaving it unable even to serve queries.
Benefit of extending to a MySQL cluster: the same data is kept on two DB servers, and master-slave replication, based on the binlog, keeps the copies consistent.
A proxy can be added in front of the read/write-splitting / replication architecture, so only one IP/domain is exposed while the routing happens internally.
1. Eliminates the single point of failure.
2. Improves read/write performance and raises the throughput ceiling.
3. Read/write splitting deploys writes (update servers) and reads (query servers) separately, getting more out of the query replicas ---> the open-source mycat can serve as the proxy.
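The routing rule behind read/write splitting can be sketched as a tiny shell function. This is a hypothetical illustration only — the backend addresses and the keyword-based rule are assumptions; a real proxy such as mycat also has to handle transactions, session state, and SQL hints:

```shell
#!/bin/bash
# Sketch: route a SQL statement to a writer or a reader backend the way a
# read/write-splitting proxy would. Host addresses are illustrative.
WRITER="10.0.0.132:3306"   # master: INSERT/UPDATE/DELETE/DDL
READER="10.0.0.128:3306"   # slave:  SELECT/SHOW

route() {
    # Lower-case the statement and dispatch on its first keyword.
    local stmt first
    stmt=$(echo "$1" | tr '[:upper:]' '[:lower:]')
    first=${stmt%% *}
    case "$first" in
        select|show) echo "$READER" ;;
        *)           echo "$WRITER" ;;
    esac
}

route "SELECT * FROM teachers"
route "INSERT INTO teachers VALUES (9,'x',20,'M')"
```

Anything that is not a plain read falls through to the writer, which is the safe default when in doubt.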
How MySQL master-slave replication works: asynchronous replication
The default replication mode is asynchronous, so data can be lost between master and slave: asynchrony makes no real-time guarantee about when changes arrive.
When a large volume of data is replicated at once — for example a stored procedure inserting millions of rows — the replication delay can grow very large, easily past a minute:
[hellodb]>show slave status\G
        Seconds_Behind_Master: 21
1. Enable the binlog
Enable binary logging on the master (both the log_bin and sql_log_bin switches); ideally enable it on both master and slave.
On a write, the transaction log (redo log) is written first, then the binlog.
show variables like "%log_bin%";
show variables like "%sql_log_bin%";
set global sql_log_bin=1;
vim /etc/my.cnf
log_bin=/data/mysql/binlog
systemctl restart mysqld
flush logs;   -- rotate the log files, starting a new binlog
2. The dump thread and its peers
Server threads involved: the master's dump thread, the slave's IO thread, and the slave's SQL thread.
Asynchronous mode: the write is applied and the client (local socket or remote tool) gets its reply immediately. Async is usually preferred because it is efficient, and databases normally serve concurrent reads and writes.
Fully synchronous mode: the write only returns OK after the data has been replicated to the slave.
The binlog is shipped to the slave along with its position (pos), keeping the positions consistent.
Once shipped, the binlog is written to the relay log (relay_log) on the slave, and the SQL thread replays it into the slave's database to complete the update. The relay log is simply where the master's shipped binlog — that is, the master's data changes — is stored.
-- create a user for replication, allowed to access all databases
-- export the binlog starting from a given pos
mysqlbinlog --start-position=524 /data/mysql/binlog/rocky-bin.000003 > /backup/binlog.sql
Replication topologies
1. The usual first choice: one master, one slave; well balanced.
2. One master, two slaves: adds redundancy, but replication lag still exists.
3. Two masters, each the other's standby: double-write conflicts are possible. With a read/write-splitting proxy, master1+2 serve as read/write nodes, and if one fails the other is promoted immediately and keeps serving.
On the proxy, simply point read and write operations at the right node IPs.
Version caveats when building a master-slave pair: versions should ideally match. If they differ, the slave should run the higher version, because binlog formats differ between versions and a higher version can read a lower version's binlog.
slave: higher version (5.7 → 8.0); ideally a fresh install followed by importing a full backup
master: lower version (5.7 → 8.0)
Building a MySQL master-slave replication environment
Environment: master 10.0.0.132. Basic steps. Note: the replication relationship survives a service restart; after a slave is shut down and booted again it catches up immediately, with some delay if the data volume is large.
slave: 10.0.0.128
version: MySQL 8.0.30
1. Master
1) enable the binary log
2) set server_id=
3) create a user with the REPLICATION SLAVE privilege
4) take a full backup with --master-data=1 (or --master-data=2, which writes the binlog pos as a comment)
2. Slave
1) set server_id=
2) source /data/full.sql to import the backup
3) CHANGE MASTER TO to point at the master:
MASTER_HOST='10.0.0.8',
MASTER_USER='repluser',
MASTER_PASSWORD='123456',
MASTER_PORT=3306,
MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=30489887,
MASTER_CONNECT_RETRY=5;
4) start slave;
stop slave;
reset slave all;   ## reset the slave configuration
show master status: replication resumes from the pos recorded after the last DML statement
start slave;
3. Dual-master architecture
1. Enable the binlog
On versions before 8.0, the binlog must be enabled manually.
## 8.0 enables the binlog by default
show variables like "%log_bin%";
show variables like "%sql_log_bin%";
set global sql_log_bin=1;
vim /etc/my.cnf
log_bin=/data/mysql/binlog/master-bin   ---> prefix for the binlog files
systemctl restart mysqld
flush logs;   -- rotate the log files, starting a new binlog
2. Basic master/slave node configuration
## on the master, set the hostname
hostnamectl set-hostname master
## set server-id in the mysqld config; every node's server-id must differ, otherwise the nodes cannot be told apart
A convenient choice is the last octet of the node's IP address.
vim /etc/my.cnf
[mysqld]
server-id=132
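The last-octet convention above can be sketched in shell. The IP is hard-coded here as an assumption for illustration; a real script might parse `hostname -I` instead:

```shell
# Sketch: derive a unique server-id from the last octet of the node's IP.
# The IP is hard-coded for illustration only.
IP="10.0.0.132"
SERVER_ID=${IP##*.}          # strip everything through the last dot -> 132
echo "server-id=$SERVER_ID"  # this is the line that goes under [mysqld]
```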
## check the master's binary log info; no database needs to be specified — all newly written data is replicated starting from this binlog pos (--start-position=157)
[(none)]>show master status;
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| rocky-bin.000019 | 157 | | | |
+------------------+----------+--------------+------------------+-------------------+
3. Create and authorize the replication account on the master
revoke takes privileges away. Examples:
## view the privileges of user catyer
show grants for 'catyer'@'10.0.0.%';
+-----------------------------------------------------------+
| Grants for catyer@10.0.0.% |
+-----------------------------------------------------------+
| GRANT USAGE ON *.* TO `catyer`@`10.0.0.%` |
| GRANT ALL PRIVILEGES ON `zabbix`.* TO `catyer`@`10.0.0.%` |
+-----------------------------------------------------------+
## revoke the DELETE privilege
revoke delete on hellodb.* from catyer@'10.0.0.%';
flush privileges;
## revoke all privileges
revoke all on hellodb.* from catyer@'10.0.0.%';
flush privileges;
Grant syntax differs across versions; the GRANT must be executed, otherwise the master cannot create a dump thread for the slave.
## around 5.7, one statement both creates the account and grants
grant replication slave on *.* to sync@'%' identified by '123';
## MySQL 8.0: create the user first, then grant
create user sync@'%' identified by '123';
show grants for sync@'%';
## revoke all privileges, grant the replication privilege, flush
revoke all on *.* from sync@'%';
grant replication slave on *.* to sync@'%';
flush privileges;
[(none)]>show grants for sync@'%';
+----------------------------------------------+
| Grants for sync@% |
+----------------------------------------------+
| GRANT REPLICATION SLAVE ON *.* TO `sync`@`%` |
+----------------------------------------------+
1 row in set (0.01 sec)
Without this grant, replication fails — it would mean the sync user lacks the replication privilege.
4. Take a full backup on the master and import it on the slave so the initial data matches
Consider dumping every database to its own .sql file, then importing them in batch — via a script or a MySQL stored procedure (or a transaction); a script is usually the simplest.
This matters because the slave may be a fresh instance whose initial data differs: replicating without the backup would deliver only new data, never the historical data.
Single-database backup:
#!/bin/bash
TIME=`date +%Y-%m-%d_%H-%M-%S`
DIR=/data/backup
DB="hellodb"
PASS=123
HOST=10.0.0.128
## -B selects the named database (adds CREATE DATABASE/USE to the dump)
[ -d $DIR ] && echo "dir exist" || mkdir -p $DIR
mysqldump -uroot -p$PASS --single-transaction --source-data=2 -B $DB -F | gzip > ${DIR}/${DB}.${TIME}.sql.gz
#mysqldump -usync -p$PASS -h$HOST --single-transaction --source-data=2 -A | gzip > /data/backup/backup.sql.gz
scp ${DIR}/${DB}.${TIME}.sql.gz $HOST:/data/backup
Multi-database backup, pushed to several hosts with expect-automated ssh:
#!/bin/bash
TIME=`date +%F_%T`
DIR=/data/backup
DB="hellodb"
PASS=123
## hosts to push the backup to
HOST=(10.0.0.128
10.0.0.129)
[ -d $DIR ] && echo "dir exist" || mkdir -p $DIR
mysqldump -uroot -p$PASS --single-transaction --source-data=2 -A -F > ${DIR}/${DB}_${TIME}_all.sql
#mysqldump -usync -p$PASS -h$HOST --single-transaction --source-data=2 -A | gzip > /data/backup/backup.sql.gz
for IP in ${HOST[*]};do
#push the ssh key to the target host without interaction
expect <<EOF
spawn ssh-copy-id root@$IP
expect {
"yes/no" {send "yes\n";exp_continue}
"password" {send "123\n"}
}
expect eof
EOF
scp ${DIR}/${DB}_${TIME}_all.sql $IP:/data/backup
echo "$IP copy success"
done
Full backup with -A: just source it on the slave, no database needs to be selected
## -A dumps all databases, -F flushes (rotates) the binlog
#!/bin/bash
TIME=`date +%Y-%m-%d_%H-%M-%S`
DIR=/data/backup
DB="hellodb"
PASS=123
HOST=10.0.0.128
[ -d $DIR ] && echo "dir exist" || mkdir -p $DIR
mysqldump -uroot -p$PASS --single-transaction --source-data=2 -A -F > ${DIR}/${DB}_${TIME}_all.sql
#mysqldump -usync -p$PASS -h$HOST --single-transaction --source-data=2 -A | gzip > /data/backup/backup.sql.gz
scp ${DIR}/${DB}_${TIME}_all.sql $HOST:/data/backup
[zabbix]>source /data/backup/hellodb_2022-09-11_09-15-11_all.sql
Check the master's dump thread among the running threads:
show processlist;
[(none)]>show processlist;
+----+-----------------+-----------+------+---------+------+------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+-----------------+-----------+------+---------+------+------------------------+------------------+
| 5 | event_scheduler | localhost | NULL | Daemon | 7920 | Waiting on empty queue | NULL |
| 52 | root | localhost | NULL | Query | 0 | init | show processlist |
+----+-----------------+-----------+------+---------+------+------------------------+------------------+
Single-table backup: dump one table from a database
Several tables can be dumped the same way.
## suppose the data in the slave's table was accidentally wiped
truncate table stu;
[hellodb]>select * from stu;
Empty set (0.00 sec)
## dump the single table, then restore it
mysqldump -uroot -p123 hellodb stu > /data/stu.sql
scp /data/stu.sql 10.0.0.128:/data/
source /data/stu.sql
[hellodb]>select * from stu where stuid=25;
+-------+-------------+-----+--------+
| StuID | Name | Age | Gender |
+-------+-------------+-----+--------+
| 25 | Sun Dasheng | 100 | M |
+-------+-------------+-----+--------+
## dumping several tables
mysqldump -uroot -p123 hellodb stu teachers job > /data/stu.sql
5. Configure the slave and import the backup; replication is thread-based
Import the dump first, then record the binlog file name and pos from show master status, then run CHANGE MASTER.
vim /etc/my.cnf
read-only   ## read only: stops ordinary users from modifying data, but not root
CHANGE MASTER TO
MASTER_HOST='10.0.0.132',
MASTER_USER='sync',
MASTER_PASSWORD='123',
MASTER_PORT=3306,
MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=30489887,
MASTER_CONNECT_RETRY=5;
## uncomment the CHANGE MASTER line from the dump and copy it over; on the slave, add the binlog file/pos options
CHANGE MASTER TO
MASTER_HOST='10.0.0.132',
MASTER_USER='sync',
MASTER_PASSWORD='123',
MASTER_PORT=3306,
MASTER_LOG_FILE='rocky-bin.000783', MASTER_LOG_POS=197;
Disable binary logging for the session, import the dump, then inspect the slave status: a relay log now exists (the file where the master's shipped binlog lands), but Slave_IO is still not running until the replication threads are started.
set sql_log_bin=0;
source /data/backup/hellodb_2022-09-11_10-48-26_all.sql
show slave status\G
[zabbix]>show slave status\G
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 10.0.0.132
Master_User: sync
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: rocky-bin.000021
Read_Master_Log_Pos: 157
Relay_Log_File: master-relay-bin.000001
Relay_Log_Pos: 4
Relay_Master_Log_File: rocky-bin.000021
Slave_IO_Running: No
Slave_SQL_Running: No
start slave;
show processlist;
IO thread + SQL thread: the IO thread receives replication events from the master, and the SQL thread applies them to the slave's data; the query thread shows "waiting for updates".
On the master: the binlog dump thread.
6. Test the replication
Write on the master and check the slave; running a stored procedure makes a good load test.
(When the stored procedure was run here, the slave's SQL thread promptly broke — analyzed below.)
[hellodb]>select * from teachers;
+-----+---------------+-----+--------+
| TID | Name | Age | Gender |
+-----+---------------+-----+--------+
| 1 | Song Jiang | 45 | M |
| 2 | Zhang Sanfeng | 94 | M |
| 3 | Miejue Shitai | 77 | F |
| 4 | Lin Chaoying | 93 | F |
| 5 | catyer | 16 | M |
| 6 | catyer2 | 17 | M |
+-----+---------------+-----+--------+
## write on the master: delete a row and it disappears from the slave too
[hellodb]>delete from teachers where tid=6;
Query OK, 1 row affected (0.00 sec)
[hellodb]>select * from teachers;
+-----+---------------+-----+--------+
| TID | Name | Age | Gender |
+-----+---------------+-----+--------+
| 1 | Song Jiang | 45 | M |
| 2 | Zhang Sanfeng | 94 | M |
| 3 | Miejue Shitai | 77 | F |
| 4 | Lin Chaoying | 93 | F |
| 5 | catyer | 16 | M |
+-----+---------------+-----+--------+
5 rows in set (0.00 sec)
## test with a stored procedure
insert into teachers(name,age,gender) values('a',22,'M');
select * from teachers;
Fixing Slave_IO_Running: No — usually the replication account lacks privileges
## check from the slave
[(none)]>show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for source to send event
Master_Host: 10.0.0.132
Master_User: sync
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: rocky-bin.000022
Read_Master_Log_Pos: 60979553
Relay_Log_File: master-relay-bin.000002
Relay_Log_Pos: 326
Relay_Master_Log_File: rocky-bin.000022
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
## grant on the master
grant replication slave on *.* to sync@'%';
flush privileges;
[(none)]>show grants for sync@'%';
+----------------------------------------------+
| Grants for sync@% |
+----------------------------------------------+
| GRANT REPLICATION SLAVE ON *.* TO `sync`@`%` |
+----------------------------------------------+
1 row in set (0.01 sec)
Fixing Slave_SQL_Running: No
Re-run CHANGE MASTER on the slave: record the master's current binlog pos, adjust, and execute.
Check the error log (mysql-error.log; its location is configurable).
## stop replication on the slave
stop slave;
## check the binlog pos on the master
show master status;
[hellodb]>show master status;
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| rocky-bin.000022 | 60979553 | | | |
+------------------+----------+--------------+------------------+-------------------+
## manually set the pos on the slave and execute
CHANGE MASTER TO
MASTER_HOST='10.0.0.132',
MASTER_USER='sync',
MASTER_PASSWORD='123',
MASTER_PORT=3306,
MASTER_LOG_FILE='rocky-bin.000784', MASTER_LOG_POS=197;
## start the slave
start slave;
## the cause can also be a table-structure mismatch, requiring the columns to be fixed
2022-09-11T03:39:17.143632Z 28 [Warning] [MY-010584] [Repl] Slave SQL for channel '': ... The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state. A restart should restore consistency automatically, although using non-transactional storage for data or info tables or DDL queries could lead to problems. In such cases you have to examine your data (see documentation for details). Error_code: MY-001756
desc testlog;
show create table testlog;
Fixing SQL_Running=No after running a stored procedure, diagnosed from the slave status error
https://www.cnblogs.com/Knight7971/p/9970807.html
The main tools are the error log and the performance_schema.replication_applier_status_by_worker table, which record what stopped the applier.
Before replication starts, any new object should exist on the master but not on the slave; even an empty pre-existing table on the slave will conflict.
## first the symptom: SQL_RUNNING is No and the status points at the error log
[hellodb]>show slave status\G
Seconds_Behind_Master: 21
Slave_SQL_Running: No
Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction 'ANONYMOUS' at master log rocky-bin.000023, end_log_pos 444.
See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
## the error log hints at a table problem; check the worker table next
tail -f mysql.log
2022-09-11T03:34:59.334704Z 23 [ERROR] [MY-013146] [Repl] Slave SQL for channel '': Worker 1 failed executing transaction 'ANONYMOUS' at master log rocky-bin.000022, end_log_pos 60979827; Column 1 of table 'hellodb.testlog' cannot be converted from type 'char(30(bytes))' to type 'char(40(bytes) utf8mb4)', Error_code: MY-013146
## found it: the testlog table already exists on the slave, so the replicated CREATE cannot be applied
[hellodb]>select * from performance_schema.replication_applier_status_by_worker\G
LAST_ERROR_MESSAGE: Worker 1 failed executing transaction 'ANONYMOUS' at master log rocky-bin.000023, end_log_pos 444;
Error 'Table 'testlog' already exists' on query. Default database: 'hellodb'. Query: 'create table testlog (id int auto_increment primary key,name char(30),salary int default 20) character set utf8mb4'
## drop the table, stop the slave, fix the MASTER info, then start again
drop table testlog;
stop slave;
## manually set the pos on the slave and execute
CHANGE MASTER TO
MASTER_HOST='10.0.0.132',
MASTER_USER='sync',
MASTER_PASSWORD='123',
MASTER_PORT=3306,
MASTER_LOG_FILE='rocky-bin.000024', MASTER_LOG_POS=157;
start slave;
## run the stored procedure again; first create the table on the source
use hellodb;
create table testlog (id int auto_increment primary key,name char(30),salary int default 20) character set utf8mb4;
delimiter $$
create procedure sp_testlog()
begin
declare i int;
set i = 1;
while i <= 100000
do insert into testlog(name,salary) values (concat('wang',FLOOR(RAND() * 100000)),FLOOR(RAND() * 1000000));
set i = i +1;
end while;
end$$
delimiter ;
source /root/testlog.sql
call sp_testlog();
## the slave lags about 21s during the load, which is already well optimized
show slave status\G
[hellodb]>select count(*) from testlog;
+----------+
| count(*) |
+----------+
| 36626 |
+----------+
Seconds_Behind_Master: 21
## the stored procedure's rows finish replicating
[hellodb]>select count(*) from testlog;
+----------+
| count(*) |
+----------+
| 100000 |
+----------+
show processlist;
49 | sync | 10.0.0.128:60752 | NULL | Binlog Dump | 1475 | Source has sent all binlog to replica; waiting for more updates | NULL
wait for events
ss -nt shows the established TCP connections: ESTABLISHED to 10.0.0.128 on the service port.
7. Replication blocked by key conflicts; skipping errors
Skip a specified number of failed events:
show variables like '%sql_slave_skip_counter%';
## skip N errors
set global sql_slave_skip_counter=N;
Sometimes the slave is not set read-only and a row is inserted there directly, say id=9; when the master later inserts its own id=9, the two rows are different, the keys collide, and replication blocks on the duplicate entry.
show slave status\G
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
## insert a row on the slave; it takes id 8
insert into teachers(name,age,gender) values('tea',11,'M');
[hellodb]>select * from teachers where tid=8;
+-----+------+-----+--------+
| TID | Name | Age | Gender |
+-----+------+-----+--------+
| 8 | tea | 11 | M |
+-----+------+-----+--------+
## insert a row on the master
insert into teachers(name,age,gender) values('teach',20,'M');
## the two tid=8 rows now differ; check the error
[hellodb]>select * from teachers;
+-----+---------------+-----+--------+
| TID | Name | Age | Gender |
+-----+---------------+-----+--------+
| 1 | Song Jiang | 45 | M |
| 2 | Zhang Sanfeng | 94 | M |
| 3 | Miejue Shitai | 77 | F |
| 4 | Lin Chaoying | 93 | F |
| 5 | catyer | 16 | M |
| 7 | teach | 20 | M |
| 8 | teach | 22 | M |
+-----+---------------+-----+--------+
7 rows in set (0.00 sec)
[hellodb]>select * from teachers;
+-----+---------------+-----+--------+
| TID | Name | Age | Gender |
+-----+---------------+-----+--------+
| 1 | Song Jiang | 45 | M |
| 2 | Zhang Sanfeng | 94 | M |
| 3 | Miejue Shitai | 77 | F |
| 4 | Lin Chaoying | 93 | F |
| 5 | catyer | 16 | M |
| 7 | uuu | 11 | M |
| 8 | tea | 11 | M |
+-----+---------------+-----+--------+
7 rows in set (0.00 sec)
## entry 7 blocked replication outright: once the 7th event failed, nothing after it could be applied
## in fact both 7 and 8 conflict, so two errors must be skipped
show slave status\G
[hellodb]>select * from performance_schema.replication_applier_status_by_worker\G
LAST_ERROR_MESSAGE: Worker 1 failed executing transaction 'ANONYMOUS' at master log rocky-bin.000024, end_log_pos 30689864; Could not execute Write_rows event on table hellodb.teachers;
Duplicate entry '7' for key 'teachers.PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log rocky-bin.000024, end_log_pos 30689864
Temporary workaround: skip the errors
stop slave;
## skip 2 errors
set global sql_slave_skip_counter=2;
select @@sql_slave_skip_counter;
start slave;
show slave status\G
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
## insert on the master; it now arrives
[hellodb]>insert into teachers(name,age,gender) values('teach9',20,'M');
[hellodb]>select * from teachers where tid=9;
+-----+--------+-----+--------+
| TID | Name | Age | Gender |
+-----+--------+-----+--------+
| 9 | teach9 | 20 | M |
+-----+--------+-----+--------+
## the proper fix is editing the conflicting rows by hand; with several rows, fix each one
update xxx set ... where ...;
Dual-master implementation: each node is the other's master
Run CHANGE MASTER on the original master too, making it a slave of its slave. This enables two-way sync but invites data conflicts — for example, users unknowingly writing to two different masters rather than one copy — so it is not recommended.
Writing to a single node is safer and conflict-free.
Environment: master 10.0.0.132, slave 10.0.0.128; now 132 also replicates from 128.
1. On the slave, check MySQL's binlog and record its pos
show master status;
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| slave-bin.000004 | 157 | | | |
+------------------+----------+--------------+------------------+-------------------+
2. Run CHANGE MASTER (on 132)
CHANGE MASTER TO
MASTER_HOST='10.0.0.128',
MASTER_USER='sync',
MASTER_PASSWORD='123',
MASTER_PORT=3306,
MASTER_LOG_FILE='slave-bin.000004', MASTER_LOG_POS=157;
## start the slave
start slave;
## reset if needed
reset slave all;
Semi-synchronous replication (important): the write returns success once one slave has it, or once the timeout expires, speeding up transaction commit ---> used in master-slave (primary/standby) architectures.
Advantage of async replication: fast responses; success is returned as soon as the write lands on the master.
Drawback of async: if the master's dump thread dies mid-sync, nothing replicates and the slave falls behind; and even delivered events still need the slave's SQL thread to persist them, so until that finishes the slave is also not current.
Semi-sync adds a plugin that checks whether the slave has persisted the data: the master only reports success after the slave's transaction has committed and been written to disk, or after the timeout expires.
Flow (enhanced semi-sync): checking the sync before commit fixes the case where the master reports success but a follow-up read on a (possibly read-only) slave finds no data.
1. The client (local socket or remote user) submits a transaction (DML) and the SQL executes.
2. The binary log is written.
3. The master waits for the dump to replicate to the slave.
4. The transaction commits and is written to disk.
5. Success is returned.
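The commit decision in step 3-5 can be sketched as "wait for an ACK, or time out and fall back to async". This is a simulation only — the ack file stands in for the slave's network ACK, and the names are illustrative:

```shell
# Sketch of the semi-sync commit decision: wait for a slave ACK up to a
# timeout, otherwise degrade to asynchronous behaviour and return anyway.
# /tmp/slave.ack simulates the ACK; a real master gets it over the network.
TIMEOUT_MS=10000          # mirrors the rpl_semi_sync_master_timeout default
ACK_FILE=/tmp/slave.ack

commit_txn() {
    local waited=0
    # Poll in 10ms steps until the ACK arrives or the timeout is used up.
    while [ ! -f "$ACK_FILE" ] && [ "$waited" -lt "$TIMEOUT_MS" ]; do
        sleep 0.01
        waited=$((waited + 10))
    done
    if [ -f "$ACK_FILE" ]; then
        echo "committed (semi-sync, slave acked)"
    else
        echo "committed (degraded to async after ${TIMEOUT_MS}ms)"
    fi
}

touch "$ACK_FILE"   # simulate a healthy slave acknowledging promptly
commit_txn
rm -f "$ACK_FILE"
```

Either branch returns success to the client — semi-sync never blocks commits forever, which is exactly what the 10s insert test below demonstrates.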

Building a one-master, two-slave test environment: rpl_semi_sync_master_enabled
master: 132; slave1: 128; slave2: 131
Plugin: rpl_semi_sync_master (inspect with show plugins)
1. Sync the master/slave data and start the slaves
## configuration on the master
show variables like '%rpl_semi_sync_master_enabled%';
vim /etc/my.cnf
rpl_semi_sync_master_enabled=on
rpl_semi_sync_master_timeout=
## list the built-in plugins, then install the new one
1. Built-ins include the binlog, password/authentication plugins, daemons, storage engines, etc.
[(none)]>show plugins;
+---------------------------------+----------+--------------------+---------+---------+
| Name | Status | Type | Library | License |
+---------------------------------+----------+--------------------+---------+---------+
| binlog | ACTIVE | STORAGE ENGINE | NULL | GPL |
| mysql_native_password | ACTIVE | AUTHENTICATION | NULL | GPL |
| sha256_password | ACTIVE | AUTHENTICATION | NULL | GPL |
| caching_sha2_password | ACTIVE | AUTHENTICATION | NULL | GPL |
| sha2_cache_cleaner | ACTIVE | AUDIT | NULL | GPL |
| daemon_keyring_proxy_plugin | ACTIVE | DAEMON | NULL | GPL |
| CSV | ACTIVE | STORAGE ENGINE | NULL | GPL |
2. Install the semi-sync plugin rpl_semi_sync_master   ## master and slaves each need their plugin
install plugin rpl_semi_sync_master soname 'semisync_master.so';
Query OK, 0 rows affected, 1 warning (0.07 sec)
show plugins;
rpl_semi_sync_master | ACTIVE | REPLICATION | semisync_master.so | GPL
3. Edit the config file
vim /etc/my.cnf
rpl_semi_sync_master_enabled=on   ## turn on semi-sync
rpl_semi_sync_master_timeout=10000   ## timeout in milliseconds; default 10s
[(none)]>select @@rpl_semi_sync_master_timeout;
+--------------------------------+
| @@rpl_semi_sync_master_timeout |
+--------------------------------+
| 10000 |
+--------------------------------+
systemctl restart mysqld
4. Check the plugin state
[(none)]>select @@rpl_semi_sync_master_enabled;
+--------------------------------+
| @@rpl_semi_sync_master_enabled |
+--------------------------------+
| 1 |
+--------------------------------+
5. Full backup to the slaves and confirm the binlog pos   ## run on master and slave1, scp to the slaves
grep -i "change master" hellodb_2022-09-12_11:45:11_all.sql
show master status;
-- CHANGE MASTER TO MASTER_LOG_FILE='rocky-bin.000030', MASTER_LOG_POS=157;
stop slave;
source /data/backup/hellodb_2022-09-12_11:45:11_all.sql
CHANGE MASTER TO
MASTER_HOST='10.0.0.132',
MASTER_USER='sync',
MASTER_PASSWORD='123',
MASTER_PORT=3306,
MASTER_LOG_FILE='rocky-bin.000030', MASTER_LOG_POS=157;
start slave;
show slave status \G;
## confirm both Slave_IO and Slave_SQL are running
## if the slave has sync problems, reset and retry:
stop slave;
reset slave all;
start slave;
2. Install the plugin on the slaves
With the plugin installed on both sides and no timeout configured, the default 10s applies.
## install / uninstall on the master
install plugin rpl_semi_sync_master soname 'semisync_master.so';
uninstall plugin rpl_semi_sync_master;
## install on the slaves
install plugin rpl_semi_sync_slave soname 'semisync_slave.so';
select @@rpl_semi_sync_master_enabled;
+--------------------------------+
| @@rpl_semi_sync_master_enabled |
+--------------------------------+
| 0 |
+--------------------------------+
1. Set the variables dynamically
set global rpl_semi_sync_master_enabled=ON;
set global rpl_semi_sync_slave_enabled=ON;
show plugins;
[(none)]>select @@rpl_semi_sync_master_enabled;
+--------------------------------+
| @@rpl_semi_sync_master_enabled |
+--------------------------------+
| 1 |
+--------------------------------+
2. Persist the setting in the config file on master and slaves
vim /etc/my.cnf
[mysqld]
rpl_semi_sync_master_enabled
systemctl restart mysqld
#slave config file
[mysqld]
rpl_semi_sync_slave_enabled
systemctl restart mysqld
## verify on the master; clients=2 means both slaves are connected
[hellodb]>show status like '%semi%';
+--------------------------------------------+-------+
| Variable_name | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients | 2 |
| Rpl_semi_sync_master_net_avg_wait_time | 0 |
| Rpl_semi_sync_master_net_wait_time | 0 |
| Rpl_semi_sync_master_net_waits | 0 |
| Rpl_semi_sync_master_no_times | 0 |
| Rpl_semi_sync_master_no_tx | 0 |
| Rpl_semi_sync_master_status | ON |
| Rpl_semi_sync_master_timefunc_failures | 0 |
| Rpl_semi_sync_master_tx_avg_wait_time | 0 |
| Rpl_semi_sync_master_tx_wait_time | 0 |
| Rpl_semi_sync_master_tx_waits | 0 |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0 |
| Rpl_semi_sync_master_wait_sessions | 0 |
| Rpl_semi_sync_master_yes_tx | 0 |
+--------------------------------------------+-------+
14 rows in set (0.00 sec)
3. Test the timeout: if no slave responds within the window, the write still returns success
The timeout is a global variable, set in milliseconds.
Stop replication on both slaves and observe:
stop slave;
[hellodb]>select @@rpl_semi_sync_master_timeout;
+--------------------------------+
| @@rpl_semi_sync_master_timeout |
+--------------------------------+
| 10000 |
+--------------------------------+
## change the global variable
set global rpl_semi_sync_master_timeout=2000;
Query OK, 0 rows affected (0.00 sec)
## with no slave responding, the insert only returns after the 10s timeout (the value is settable)
[hellodb]>insert into teachers(name,age,gender) values('a',22,'M');
Query OK, 1 row affected (10.00 sec)
## with a slave responding, the insert returns instantly, proving semi-sync replication works
start slave;
[hellodb]>insert into teachers(name,age,gender) values('a',22,'M');
Query OK, 1 row affected (0.02 sec)
## MySQL's default wait point: commit only after the sync (AFTER_SYNC)
[hellodb]>show variables like 'Rpl_semi_sync_master_wait_point';
+---------------------------------+------------+
| Variable_name | Value |
+---------------------------------+------------+
| rpl_semi_sync_master_wait_point | AFTER_SYNC |
+---------------------------------+------------+
Replication filter: binlog-do-db
Controls which databases produce binary log entries.
vim /etc/my.cnf
binlog_do_db=db1   ## allowlist
binlog_ignore_db=db2   ## blocklist
GTID (MySQL 5.7+): concurrent transaction replication with automatic binlog positioning
Benefit: with GTID enabled, nobody in the replication pair has to track the binlog pos; the transaction's binlog position is found automatically and new changes replicate automatically.
Multiple dump threads are supported, greatly reducing lag.
Transaction uniqueness: GTID = server_uuid + transaction_id, globally unique.
server_uuid comes from /data/mysql/auto.cnf:
[root@rocky mysql]#cat auto.cnf
[auto]
server-uuid=5aa76b1b-1b77-11ed-9219-000c29333d85
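The GTID structure can be shown by splitting one apart in shell. The transaction id here is made up for illustration; the uuid is the one from auto.cnf above:

```shell
# Sketch: a GTID is server_uuid:transaction_id and is globally unique.
GTID="5aa76b1b-1b77-11ed-9219-000c29333d85:231"   # txn id 231 is illustrative
UUID=${GTID%%:*}   # before the colon: the server_uuid from auto.cnf
TXN=${GTID##*:}    # after the colon: the transaction sequence number
echo "uuid=$UUID txn=$TXN"
```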
Enable GTID; both master and slave must turn on GTID_MODE
vim /etc/my.cnf
[mysqld]
gtid_mode=on
enforce_gtid_consistency
systemctl restart mysqld
## check the master's pos
[hellodb]>show master status;
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| rocky-bin.000032 | 157 | | | |
+------------------+----------+--------------+------------------+-------------------+
## slave configuration
stop slave;
CHANGE MASTER TO
MASTER_HOST='10.0.0.132',
MASTER_USER='sync',
MASTER_PASSWORD='123',
MASTER_PORT=3306,
MASTER_AUTO_POSITION=1;
#MASTER_LOG_FILE='slave-bin.000004', MASTER_LOG_POS=157;   ---> no longer needed
start slave;
## or start over with
reset slave all;
## already in sync with the master; replication is now much easier to set up
[(none)]>show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for source to send event
Master_Host: 10.0.0.132
Master_User: sync
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: rocky-bin.000032
Read_Master_Log_Pos: 157
Monitoring replication health with bash
Monitoring software needs a few indicators, all extractable from bash:
## replication health: the two threads, checked on the slave
mysql -uroot -p123 -e 'show slave status\G' | awk -F':' '/Slave_IO_Running/{print $2}'
mysql -uroot -p123 -e 'show slave status\G' | awk -F':' '/Slave_SQL_Running/{print $2}'
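The two one-liners above can be combined into a small health check. The status text is simulated here (the real script would capture `mysql -e 'show slave status\G'`) so the parsing logic runs on its own; the threshold and messages are illustrative:

```shell
#!/bin/bash
# Replication health check (sketch). STATUS simulates the relevant lines of
# `show slave status\G`; in production it would come from the mysql client.
STATUS='Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Seconds_Behind_Master: 21'

io=$(awk -F': *'  '/Slave_IO_Running/{print $2}'       <<<"$STATUS")
sql=$(awk -F': *' '/Slave_SQL_Running/{print $2}'      <<<"$STATUS")
lag=$(awk -F': *' '/Seconds_Behind_Master/{print $2}'  <<<"$STATUS")

if [ "$io" = "Yes" ] && [ "$sql" = "Yes" ]; then
    echo "replication threads OK, lag=${lag}s"
else
    echo "replication BROKEN: IO=$io SQL=$sql"
fi
```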
show master status;     ## the master's binlog
show binlog events;     ## inspect the binlog's contents
mysqlbinlog rocky-bin.000003
show slave status\G
show processlist;
stop slave;
reset slave all;
start slave;
1. Configure GTID so binlog positions are identified automatically.
2. Avoid huge transactions to reduce replication lag.
3. Reduce lock contention.
4. Keep the replication relationship in place.
Causes of master-slave data drift:
binlog format (row- vs statement-level), binlog disabled, replication broken (show slave status), service down, version mismatch (the slave should be the newer one), etc.
The replication account sync missing the GRANT REPLICATION SLAVE privilege on the master.
Monitor the heartbeat between master and slave and the slave status regularly.
----> recovery: take a fresh full backup, import it, rebuild the replication relationship, etc.
mycat: a database proxy implementation
The proxy server holds the routing logic: write requests go to the master; read requests go to the slaves (update servers vs query servers).
Vertical splitting: split databases/tables with weak relationships apart; closely related tables like stu and teacher are awkward to separate. Database splits should be agreed with the developers and follow business boundaries.
Horizontal splitting: spread data across different servers to relieve performance bottlenecks, or combine it with read/write splitting.
---> The proxy decides which node data lands on; the proxy tier itself should be a redundant pair to avoid its own single point of failure.
Core function: a MySQL proxy fronting many backend MySQL servers, enabling large MySQL server clusters.
mycat's core protocol: JDBC; it connects to the major databases and abstracts over the database layer.
How it works: front-end load balancing (HAProxy) ---> a redundant mycat proxy tier ---> requests forwarded to 1..N backend MySQL nodes (also redundant). Much like needing two nginx pods for forwarding in a K8s cluster: if one pod dies, the other keeps forwarding requests.
mycat read/write splitting:
1. Write requests: forwarded to the backend master (update server).
2. Read requests: forwarded to the backend query servers (read nodes).

Implementing read/write splitting with mycat

Environment: client ubuntu 10.0.0.131
mycat node: 10.0.0.130
master: 10.0.0.132
slave: 10.0.0.128
1. Download the tarball, unpack it to the target directory, and put its bin directory on PATH.
The same pattern applies to any compiled-from-source service (nginx, tomcat, etc.): add the path to the environment so nobody has to cd into the bin directory.
tar xf Mycat-server-1.6.7.4-release-20200105164103-linux.tar.gz -C /mycat
## contents of the package: pre-built, ready to use
bin catlet conf lib logs version.txt
## add the binary's directory to PATH so mycat can be run without its full path
Original path: /apps/mycat/bin/mycat
vim /etc/profile.d/mycat.sh so the shell resolves the mycat command through PATH:
PATH=/apps/mycat/bin:$PATH
## load the new environment
. /etc/profile.d/mycat.sh    (or: source /etc/profile.d/mycat.sh)
[11:01:44 root@proxy bin]#mycat
Usage: /apps/mycat/bin/mycat { console | start | stop | restart | status | dump }
2. Install the Java runtime and change mycat's default port
Give the mycat node more memory first; the Java process is memory-hungry.
yum -y install java
## Java 8
java -version
openjdk version "1.8.0_342"
OpenJDK Runtime Environment (build 1.8.0_342-b07)
OpenJDK 64-Bit Server VM (build 25.342-b07, mixed mode)
Start mycat: by default it listens on two ports, 8066 (service) and 9066 (management); both are configurable.
Below, the exposed service port is changed to the MySQL-standard 3306.
mycat start
ss -ntl | grep 8066
LISTEN 0 100 :::8066 :::*
## ss -ntlp shows which process owns each listening TCP port (the users column)
netstat -ntlp
[11:15:41 root@proxy ~]#ss -ntlp | grep 8066
LISTEN 0 100 :::8066 :::* users:(("java",pid=11766,fd=86))
[11:15:51 root@proxy ~]#ss -ntlp | grep 9066
LISTEN 0 100 :::9066 :::* users:(("java",pid=11766,fd=82))
## mycat's log: logs/wrapper.log
[11:18:26 root@proxy logs]#tail -f wrapper.log
STATUS | wrapper | 2022/09/17 11:14:07 | --> Wrapper Started as Daemon
STATUS | wrapper | 2022/09/17 11:14:07 | Launching a JVM...
INFO | jvm 1 | 2022/09/17 11:14:08 | Wrapper (Version 3.2.3) http://wrapper.tanukisoftware.org
INFO | jvm 1 | 2022/09/17 11:14:08 | Copyright 1999-2006 Tanuki Software, Inc. All Rights Reserved.
INFO | jvm 1 | 2022/09/17 11:14:08 |
INFO | jvm 1 | 2022/09/17 11:14:09 | MyCAT Server startup successfully. see logs in logs/mycat.log
Change mycat's port: the block ships commented out, so copy out a fresh one and set the client-facing port. Exposed service port: 3306; management port: 9066.
Default mycat user: root, default password 123456, changed here to 123.
Search the file with /serverPort and /user:
/conf/server.xml
<!-- <property name="serverPort">8066</property> <property name="managerPort">9066</property> -->
<property name="serverPort">3306</property>
<property name="managerPort">9066</property>
<user name="root" defaultAccount="true">
<property name="password">123</property>
<property name="schemas">TESTDB</property>
<property name="defaultSchema">TESTDB</property>
3. Create the mycat connection user on the backend MySQL servers
Grant read/write privileges on the hellodb database, since the proxy performs this database's split reads and writes.
[hellodb]>create user mycat@'%' identified by '123';
Query OK, 0 rows affected (0.01 sec)
[hellodb]>grant all on hellodb.* to mycat@'%';
Query OK, 0 rows affected (0.00 sec)
[hellodb]>flush privileges;
Query OK, 0 rows affected (0.01 sec)
4. Implement read/write splitting in schema.xml
Edit schema.xml: set writeHost and readHost to the backend nodes.
schema name: the database name mycat shows to clients.
<?xml version="1.0"?>
<!DOCTYPE mycat:schema SYSTEM "schema.dtd">
<mycat:schema xmlns:mycat="http://io.mycat/">
<schema name="TESTDB" checkSQLschema="false" sqlMaxLimit="100" dataNode="dn1">
</schema>
<dataNode name="dn1" dataHost="localhost1" database="hellodb" /> <!-- backend DB name -->
<dataHost name="localhost1" maxCon="1000" minCon="10" balance="1"
writeType="0" dbType="mysql" dbDriver="native" switchType="1" slaveThreshold="100">
<heartbeat>select user()</heartbeat>
<writeHost host="host1" url="10.0.0.132:3306" user="mycat" password="123"> <!-- backend DB1 (master) IP -->
<readHost host="host2" url="10.0.0.128:3306" user="mycat" password="123" /> <!-- backend DB2 (slave) IP -->
</writeHost>
</dataHost>
</mycat:schema>
Verify the server.xml and schema.xml changes
## restart the service
mycat restart
## the "jvm 1" lines in the log show it really is a Java application
tail -f /apps/mycat/logs/wrapper.log
INFO | jvm 1 | 2022/09/17 14:15:46 | MyCAT Server startup successfully. see logs in logs/mycat.log
## confirm the listener moved to 3306
ss -ntlp | grep 3306
LISTEN 0 100 :::3306 :::* users:(("java",pid=50461,fd=86))
## connect to the mycat proxy from any network-reachable client; the visible database is really hellodb's content, readable and writable
mysql -uroot -p123 -h 10.0.0.130
mysql> show databases;
+----------+
| DATABASE |
+----------+
| TESTDB |
+----------+
1 row in set (0.00 sec)
mysql> show tables;
+-------------------+
| Tables_in_hellodb |
+-------------------+
| classes |
| coc |
| courses |
| scores |
| stu |
| teachers |
| testlog |
| toc |
+-------------------+
8 rows in set (0.01 sec)
5. Verify that mycat splits reads and writes between server1 and server2, using the general log
## enable general_log on server1 and server2
[hellodb]>select @@general_log;
+---------------+
| @@general_log |
+---------------+
| 0 |
+---------------+
set global general_log=1;
show variables like '%general_log%';
[hellodb]>show variables like '%general_log%';
+------------------+-----------------------+
| Variable_name | Value |
+------------------+-----------------------+
| general_log | ON |
| general_log_file | /data/mysql/rocky.log |
+------------------+-----------------------+
[(none)]>show variables like '%general_log%';
+------------------+------------------------+
| Variable_name | Value |
+------------------+------------------------+
| general_log | ON |
| general_log_file | /data/mysql/master.log |
+------------------+------------------------+
Watch both logs from the shell:
tail -f /data/mysql/rocky.log
tail -f /data/mysql/master.log
Write through mycat from the ubuntu client:
insert into teachers(name,age,gender) values('mycat',11,'M');
## the write is logged on master 132; the slave records only the replicated transaction
2022-09-17T06:40:37.211746Z 112 Query insert into teachers(name,age,gender) values('mycat',11,'M')
2022-09-17T06:40:37.215598Z 7 Query BEGIN
2022-09-17T06:40:37.216535Z 7 Query COMMIT /* implicit, from Xid_log_event */
## check which server answers reads via its server id: it is slave 128
select @@server_id;
mysql> select @@server_id;
+-------------+
| @@server_id |
+-------------+
| 128 |
+-------------+
## now try an update: the column becomes 132's server id, proving the read/write split works
mysql> update teachers set age=@@server_id where tid=13;
Query OK, 1 row affected (0.01 sec)
mysql> select * from teachers where tid=13;
+-----+-------+-----+--------+
| TID | Name | Age | Gender |
+-----+-------+-----+--------+
| 13 | mycat | 132 | M |
+-----+-------+-----+--------+
What the heartbeat `select user()` is for
mycat periodically sends this health-check query to the backend MySQL servers; if a result comes back, the MySQL service is considered alive.
2022-09-17T06:33:46.730245Z 20 Query select user()
2022-09-17T06:33:56.730453Z 16 Query select user()
2022-09-17T06:34:06.731198Z 18 Query select user()
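The heartbeat loop can be sketched with a stub in place of the real query. probe() here is a hypothetical stand-in — a real check would run the mysql client against each backend:

```shell
# Sketch of mycat's heartbeat: probe each backend with `select user()`.
# probe() is a stub; the real form would be something like:
#   mysql -h "$1" -umycat -p123 -e 'select user()'
probe() {
    echo "mycat@$1"   # simulated successful result row
}

for backend in 10.0.0.132 10.0.0.128; do
    if probe "$backend" >/dev/null; then
        echo "$backend alive"
    else
        echo "$backend down"
    fi
done
```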
mycat's current pain points
1. The proxy node itself is a single point of failure.
2. It cannot promote a slave when the backend master dies; MHA-style high availability covers that.
Key takeaways: implementing master-slave replication, keeping the data in sync, recording and configuring the binlog pos; from 5.7 onward, GTID can track the binlog pos automatically, making replication much simpler to set up.
High availability with MySQL MHA
A slave automatically takes over the master's job — essentially SQL Server's Always On idea.
Always On high availability: automatic failover; A is primary, B standby; when A fails, traffic moves to B automatically (or the service is drained by other means). The Always On cluster exposes a virtual IP, so applications only connect to that one address; which update/query server does the work behind it does not matter.
----> Always On availability groups
Recovering data:
Whether via MHA or an Always On scheme, replication always lags; when a node dies and a slave is promoted to new master, some data loss is inevitable. MHA or the cluster manager recovers as much as possible from the redo log/binlog, and the rest can be restored by hand.
The MySQL approach: MHA ----> very common; usually one master and one slave, though one-master-many-slaves also works. Alibaba Cloud's RDS primary/standby offering is probably built on a similar mechanism.
Reliability of the HA scheme: it must be very dependable; zero data loss is unrealistic, but loss should be minimized. Which database and HA scheme fit depends on the business; different types of DB need different HA designs.
How MHA works
1. Use select-based health checks on the master-slave cluster; once the master goes down, the manager saves the master's binary logs and keeps monitoring health
2. Identify which slave has the most up-to-date data, since a slave applies changes through its SQL thread and that thread may not have flushed everything to disk yet
3. Apply the relay log (relay_log) differences to the other nodes
4. Ship the latest binlog events saved from the master to all slaves --->this binlog-sync step can also be done by hand, copying the master's binlog over (when the machine itself is fine and only the service died)
5. Promote one of the slaves to be the new master (done by the manager)
6. Repoint the other slaves at the new master (done by the manager)
7. Remove the failed node and start repairing it
MHA's client-facing address: a virtual IP (VIP), much like SQL Server's Always On -- the cluster exposes a VIP that carries the writes and reads, and the VIP automatically follows the new master
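Steps 2 and 5 above boil down to ranking the surviving slaves. A simplified, hypothetical sketch (MHA's real election also applies relay-log diffs first and honors no_master/latest-slave rules) of picking the new master by candidate flag and read binlog position:

```python
# Hypothetical sketch of MHA-style new-master election: prefer hosts flagged
# candidate_master=1, then pick whichever slave has read the most binlog
# (file name, then position). Real MHA is considerably more involved.
def elect_new_master(slaves):
    # slaves: list of dicts with host, log_file, log_pos, candidate_master
    def rank(s):
        return (s["candidate_master"], s["log_file"], s["log_pos"])
    return max(slaves, key=rank)["host"]

slaves = [
    {"host": "10.0.0.128", "log_file": "rocky-bin.000003", "log_pos": 500, "candidate_master": 1},
    {"host": "10.0.0.129", "log_file": "rocky-bin.000003", "log_pos": 900, "candidate_master": 0},
]
print(elect_new_master(slaves))  # 10.0.0.128: the candidate_master flag outranks position
```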
Installing MHA and setting up the MySQL replication environment
The MHA packages split into a manager and a node (agent) package: the manager supervises one or more MySQL clusters, and every MySQL host in a cluster, master and slaves alike, runs the node package
manager package: CentOS 7 machines only
node package: also works on CentOS 8 and later
MHA checks cluster health the same way: select-based health checks; once the master stops responding, it is considered down and a slave is promoted to master
This is similar to mycat probing with select user(); Always On nodes surely run health checks too
Environment (a client can be added as well):
10.0.0.130 manager/mycat
10.0.0.132 master
10.0.0.128 slave1
10.0.0.129 slave2
1. Configure the master-slave relationship (replication) in my.cnf
Basic master configuration: enable GTID so transactions and binlog positions are tracked automatically; it must be on at both master and slaves, otherwise the slave errors out
[mysqld]
server-id=132 ##unique server id
datadir=/data/mysql ##data directory
log-bin=/data/mysql/binlog/rocky-bin ##binlog directory and file-name prefix
socket=/data/mysql/mysql.sock
log-error=/data/mysql/mysql.log ##error log
pid-file=/data/mysql/mysql.pid
#skip-grant-tables
default_authentication_plugin=mysql_native_password
#general_log ##enable general_log; can also be done at runtime with set global general_log=1
rpl_semi_sync_master_enabled ##semi-synchronous replication; a timeout can also be configured
gtid_mode=on ##enable GTID: transactions are identified automatically, no manual binlog pos tracking
enforce_gtid_consistency
[client]
socket=/data/mysql/mysql.sock
Basic slave configuration: GTID must be enabled on both master and slave, otherwise the slave errors out
[mysqld]
server-id=128
log-bin=/data/mysql/binlog/slave-bin
datadir=/data/mysql
socket=/data/mysql/mysql.sock
log-error=/data/mysql/mysql.log
pid-file=/data/mysql/mysql.pid
rpl_semi_sync_slave_enabled ##semi-synchronous replication; a timeout can also be configured
gtid_mode=on ##enable GTID on the slave as well
enforce_gtid_consistency
read_only
[client]
socket=/data/mysql/mysql.sock
2. Install the manager and node packages
#!/bin/bash
host=(10.0.0.128
10.0.0.132
10.0.0.129)
##Use expect to scp the rpm package to every host non-interactively
for i in ${host[*]};do
expect <<EOF
spawn scp mha4mysql-node-0.58-0.el7.centos.noarch.rpm $i:/root
expect {
"yes/no" { send "yes\n"; exp_continue }
"password" { send "123\n" };
}
expect eof
EOF
done
yum -y install mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
yum -y install mha4mysql-node-0.58-0.el7.centos.noarch.rpm
3. Set up SSH key trust among the MySQL hosts, and from the manager to every MySQL node
Generate your own ssh key first and copy it to each node; known_hosts already lists all 4 machines from the copy just done
Then copy your .ssh to the other machines' /root directory:
:/root/.ssh copies the .ssh directory itself
:/root/.ssh/* copies only the contents of the .ssh directory
ssh-keygen
ssh-copy-id 10.0.0.130
[10:34:49 root@proxy ~]#cat .ssh/known_hosts
10.0.0.132 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBOEAlD1m567k7Q2QEem7PjJOu2e91dNnnvu3BieR3f3TVFQdqsWqGI/qCgJVLQ7KWWPNFy137uYUwviZa7d8Q/Q=
10.0.0.128 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBGNiEH9kMJZU+HXl2fpHO2JQ9tIbiVh5qljJS+ZoGh1iNzX3S7pGM/Dd16BkSVJgrgMyXOdr832C7na6SIgoylM=
10.0.0.129 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBOf8Uo6Tm0vy+2g9ctYsTwc/tfWcXHQtY5EywK0odPY1KFdXvw/bi75eyTbBqMUnX90RY36V+SV2CF05KCP1FJ0=
10.0.0.130 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBOf8Uo6Tm0vy+2g9ctYsTwc/tfWcXHQtY5EywK0odPY1KFdXvw/bi75eyTbBqMUnX90RY36V+SV2CF05KCP1FJ0=
##Distribute the ssh credentials
#!/bin/bash
host=(10.0.0.128
10.0.0.132
10.0.0.129)
##Use expect to run ssh-copy-id first; the key must be installed before scp works without a password
for i in ${host[*]};do
expect <<EOF
spawn ssh-copy-id $i
expect {
"yes/no" { send "yes\n"; exp_continue }
"password" { send "123\n" };
}
expect eof
EOF
scp /root/.ssh $i:/root
done
4. Manager configuration
app1.cnf:
##Create the working directory for the config and logs
mkdir -p /data/mastermha/app1
vim /data/mastermha/app1.cnf
##app1.cnf contents
[server default]
user=mhauser ##create this user on the master; it replicates to the slaves automatically
password=123
manager_workdir=/data/mastermha/app1/
manager_log=/data/mastermha/app1/manager.log ##manager log, created automatically
remote_workdir=/data/mastermha/app1/
ssh_user=root
repl_user=sync
repl_password=123
ping_interval=1
master_ip_failover_script=/usr/local/bin/master_ip_failover
report_script=/usr/local/bin/sendmail.sh
check_repl_delay=0
master_binlog_dir=/data/mysql/binlog ##binlog location
[server1]
hostname=10.0.0.132
candidate_master=1 ##prefer this host when electing a new master
[server2]
hostname=10.0.0.128
candidate_master=1 ##prefer this host when electing a new master
[server3]
hostname=10.0.0.129
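app1.cnf is INI-style: globals live under [server default] and each [serverN] section names one cluster host. A small sketch using Python's configparser on a trimmed-down, hypothetical copy of the file, just to illustrate the layout (MHA itself parses it with its own perl code):

```python
import configparser

# Trimmed, hypothetical app1.cnf-style content used only to show the structure.
APP1 = """
[server default]
user=mhauser
repl_user=sync
ping_interval=1

[server1]
hostname=10.0.0.132
candidate_master=1

[server2]
hostname=10.0.0.128
"""

cfg = configparser.ConfigParser()
cfg.read_string(APP1)
# every section except the global one describes a cluster host
hosts = [cfg[s]["hostname"] for s in cfg.sections() if s != "server default"]
print(hosts)  # ['10.0.0.132', '10.0.0.128']
```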
##Check that binlog is on; general_log can also be enabled to watch the traffic
[(none)]>show variables like '%log_bin%';
##Enable general_log on both master and slaves
select @@general_log;
set global general_log=ON;
Grant the mha account on every MySQL cluster node
MHA may modify the cluster's configuration, so the account needs all privileges on all databases, and it must match what app1.cnf declares
##Create the user on the master; it replicates to the slaves automatically
[hellodb]>create user mhauser@'%' identified by '123';
Query OK, 0 rows affected (0.01 sec)
[hellodb]>grant all on *.* to mhauser@'%';
Query OK, 0 rows affected (0.00 sec)
[hellodb]>flush privileges;
Query OK, 0 rows affected (0.00 sec)
[hellodb]>select user,host from mysql.user;
+------------------+-----------+
| user | host |
+------------------+-----------+
| mhauser | % |
| mycat | % |
| root | % |
| sync | % |
| wp | % |
| catyer | 10.0.0.% |
| mysql.infoschema | localhost |
| mysql.session | localhost |
| mysql.sys | localhost |
| root | localhost |
| wpuser | localhost |
+------------------+-----------+
11 rows in set (0.00 sec)
Add a VIP on the master, following the perl script below
When the master goes down, the VIP floats to the new master
Put the perl script under /usr/local/bin; the VIP is added temporarily:
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip"; ##adds a secondary address to the NIC; change the interface name to match your host
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";
##Configuration on the master
ifconfig ens224
ifconfig ens224:1 10.0.0.100/24
2: ens224: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 00:0c:29:33:3d:85 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.132/24 brd 10.0.0.255 scope global noprefixroute ens224
valid_lft forever preferred_lft forever
inet 10.0.0.100/24 brd 10.0.0.255 scope global secondary ens224:1
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fe33:3d85/64 scope link noprefixroute
valid_lft forever preferred_lft forever
The perl script: /usr/local/bin/master_ip_failover
Put the script at that path and remember to make it executable:
chmod +x /usr/local/bin/master_ip_failover
The script:
#!/usr/bin/env perl
# Copyright (C) 2011 DeNA Co.,Ltd.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
## Note: This is a sample script and is not complete. Modify the script based on your environment.
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
use MHA::DBHelper;
my (
$command, $ssh_user, $orig_master_host,
$orig_master_ip, $orig_master_port, $new_master_host,
$new_master_ip, $new_master_port, $new_master_user,
$new_master_password
);
my $vip = '10.0.0.100/24'; ##the VIP presented to applications
my $key = "1";
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip"; ##adds a secondary address to the NIC; change the interface name to match your host
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
'new_master_user=s' => \$new_master_user,
'new_master_password=s' => \$new_master_password,
);
exit &main();
sub main {
if ( $command eq "stop" || $command eq "stopssh" ) {
# $orig_master_host, $orig_master_ip, $orig_master_port are passed.
# If you manage master ip address at global catalog database,
# invalidate orig_master_ip here.
my $exit_code = 1;
eval {
# updating global catalog, etc
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
# all arguments are passed.
# If you manage master ip address at global catalog database,
# activate new_master_ip here.
# You can also grant write access (create user, set read_only=0, etc) here.
my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
`ssh $ssh_user\@$orig_master_host \" $ssh_start_vip \"`;
exit 0;
}
else {
&usage();
exit 1;
}
}
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
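The perl script above ultimately just composes two ifconfig commands and runs them over ssh: drop the VIP on the old master, add it on the new one. A hypothetical Python rendering of that command construction (NIC name, user, and addresses are the sample values and must match your environment):

```python
# Hypothetical rendering of the failover script's VIP handling: the manager
# ssh-es to the old master to take the VIP down and to the new master to add it.
VIP, KEY, NIC = "10.0.0.100/24", "1", "ens33"  # change NIC to match your host

def start_vip_cmd(host):
    # equivalent of $ssh_start_vip run on the new master
    return f"ssh root@{host} '/sbin/ifconfig {NIC}:{KEY} {VIP}'"

def stop_vip_cmd(host):
    # equivalent of $ssh_stop_vip run on the old master
    return f"ssh root@{host} '/sbin/ifconfig {NIC}:{KEY} down'"

print(stop_vip_cmd("10.0.0.132"))
print(start_vip_cmd("10.0.0.128"))
```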
Sending alert mail automatically
Configure auth for a 163 mailbox; update the postfix package first so the mail service is not too old
vim /etc/mail.rc
set from=catyer@163.com
set smtp=smtp.163.com
set smtp-auth-user=catyer@163.com
set smtp-auth-password=SOKFDDZOQCDEZCYY
yum -y install postfix
vim sendmail.sh
##remember to make it executable
chmod +x sendmail.sh
echo "MHA is down!" | mail -s "MHA Warning" catyer@163.com
5. Check the app1.cnf syntax
Both checks finish successfully and verify connectivity to every node
##Check SSH connectivity between the nodes
masterha_check_ssh --conf=/root/mastermha/app1.cnf
##Check replication
masterha_check_repl --conf=/root/mastermha/app1.cnf
An exception is thrown: super_read_only cannot be checked, because the libmysqlclient.so.18 dependency is missing
Nothing needs changing in MySQL itself -- the only problem is the missing libmysqlclient.so.18 library
It is not that super_read_only is off; that is unrelated
Checking if super_read_only is defined and turned on..install_driver(mysql) failed: Can't load '/usr/lib64/perl5/vendor_perl/auto/DBD/mysql/mysql.so' for module DBD::mysql: libmysqlclient.so.18: cannot open shared object file: No such file or directory at /usr/lib64/perl5/DynaLoader.pm line 190.
at (eval 11) line 3.
Compilation failed in require at (eval 11) line 3
vim +244 /usr/share/perl5/vendor_perl/MHA/SlaveUtil.pm
my $sth = $dbh->prepare("SELECT 0 as Value");
#$dbh->do("SET GLOBAL super_read_only=off;");
$dbh->do("SET GLOBAL read_only=on;");
my $sth = $dbh->prepare("SELECT \@\@global.super_read_only as Value");
set global super_read_only=ON;
my $sth = $dbh->prepare("SELECT 1 as Value"); ##--->the heartbeat check
ln -s /usr/local/mysql/lib/libmysqlclient.so.21.2.30 /usr/lib64/libmysqlclient.so.21.2.30
wget -O /usr/lib/libmysqlclient.so.18 http://files.directadmin.com/services/es_7.0_64/libmysqlclient.so.18
set global super_read_only=OFF;
Fix: with a binary MySQL install the compat package mysql-community-libs-compat-8.0.30-1.el7.x86_64.rpm is missing; the copy downloaded from the official site would not install due to unmet dependencies
yum -y install mysql-community-libs
Last metadata expiration check: 0:45:58 ago on Sun 18 Sep 2022 15:49:55.
No match for argument: mysql-community-libs
Error: Unable to find a match: mysql-community-libs
##Install mysql-8.0.30 via yum on another machine and copy its /usr/lib64/mysql/libmysqlclient.so.18 over
host=(10.0.0.128
10.0.0.132
10.0.0.129)
for i in ${host[*]};do
scp /usr/lib64/mysql/libmysqlclient.so.18 $i:/usr/lib64/mysql/
done
[root@master mysql]#ls
libmysqlclient.a libmysqlclient.so libmysqlclient.so.18 libmysqlservices.a
check ok
6. Start the MHA process and watch the log; after a host in the cluster dies, the MHA process has to be started again
Run on the manager node:
masterha_manager --conf=/root/mastermha/app1.cnf --remove_dead_master_conf --ignore_last_failover
##Run it in the background in production
nohup masterha_manager --conf=/root/mastermha/app1.cnf --remove_dead_master_conf --ignore_last_failover &
Check the master's general_log: health-check entries appear there, and only the MASTER is health-checked
[hellodb]>show variables like '%general_log%';
tail -f /data/mysql/rocky.log
tail -f /data/mysql/master.log
tail -f /data/mysql/slave1.log
2022-09-18T08:56:24.776722Z 205 Query SELECT 1 As Value
2022-09-18T08:56:25.777208Z 205 Query SELECT 1 As Value
2022-09-18T08:56:26.777319Z 205 Query SELECT 1 As Value
tail -f /data/mastermha/app1/manager.log
ping select until mysql doesn't respond ##health check; only the MASTER is pinged, so SLAVE1 and SLAVE2 log nothing
[16:53:19 root@proxy ~]#tail -f /data/mastermha/app1/manager.log
Checking the Status of the script.. OK
SIOCSIFADDR: No such device
eth0:1: ERROR while getting interface flags: No such device
SIOCSIFNETMASK: No such device
Sun Sep 18 16:55:27 2022 - [info] OK.
Sun Sep 18 16:55:27 2022 - [warning] shutdown_script is not defined.
Sun Sep 18 16:55:27 2022 - [info] Set master ping interval 1 seconds.
Sun Sep 18 16:55:27 2022 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Sun Sep 18 16:55:27 2022 - [info] Starting ping health check on 10.0.0.132(10.0.0.132:3306)..
Sun Sep 18 16:55:27 2022 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
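The "Ping(SELECT) succeeded, waiting until MySQL doesn't respond" line reflects a loop: issue the probe every ping_interval seconds and declare the master dead only after several consecutive failures. A simplified sketch with an injected probe function (the threshold and probe are hypothetical, not MHA's exact internals):

```python
# Simplified sketch of the manager's ping loop: run the probe repeatedly and
# trigger failover after max_fails consecutive failures; any success resets
# the counter. A real loop would sleep ping_interval seconds between probes.
def monitor(probe, max_fails=3):
    fails = 0
    while True:
        if probe():          # e.g. "SELECT 1 As Value" against the master
            fails = 0
        else:
            fails += 1
            if fails >= max_fails:
                return "failover"

probe_results = iter([True, True, False, False, False])
print(monitor(lambda: next(probe_results)))  # failover
```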
7. Test MHA: simulate the master going down
The MHA process is one-shot: once the master dies and failover runs, the process must be restarted
Run a stored procedure on the master; the data replicates to the slaves automatically.
Because candidate_master=1 was configured in app1.cnf, 128 replaces 132 as the new master
call sp_testlog;
select count(*) from testlog;
##Stop the MySQL service
systemctl stop mysqld
##A failover report is written to manager.log
tail -f /data/mastermha/app1/manager.log
----- Failover Report -----
app1: MySQL Master failover 10.0.0.132 to 10.0.0.128 succeeded
Master 10.0.0.132 is down!
Check MHA Manager logs at proxy:/data/mastermha/app1/manager.log for details.
Check 10.0.0.128: it has switched to master; verify how much of the data was synchronized
If new data was written while the MASTER was down, the old master needs a full dump-and-restore before it can rejoin the cluster
##The new master no longer has slave status
[hellodb]>show slave status \G;
Empty set, 1 warning (0.00 sec)
##On the surviving slave, the master has already switched to 128
[hellodb]>show slave status \G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for source to send event
Master_Host: 10.0.0.128
Master_User: sync
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: slave-bin.000013
Read_Master_Log_Pos: 35534754
Relay_Log_File: slave1-relay-bin.000002
Relay_Log_Pos: 326
Relay_Master_Log_File: slave-bin.000013
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
##Check the stored procedure's data for consistency; the counts basically match
[hellodb]>select count(*) from testlog;
+----------+
| count(*) |
+----------+
| 115030 |
+----------+
##Processlist on the new master shows the dump thread serving the 129 slave
| 111 | sync | 10.0.0.129:45244 | NULL | Binlog Dump | 1413 | Source has sent all binlog to replica; waiting for more updates | NULL |
##If nothing was changed and no new data was written, the old master can rejoin directly with change master
[hellodb]>flush logs;
Query OK, 0 rows affected (0.00 sec)
[hellodb]>show master status;
+------------------+----------+--------------+------------------+-----------------------------------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-----------------------------------------------+
| slave-bin.000014 | 197 | | | 5aa76b1b-1b77-11ed-9219-000c29333d85:1-115049 |
+------------------+----------+--------------+------------------+-----------------------------------------------+
##Point the old master at the new master
CHANGE MASTER TO
MASTER_HOST='10.0.0.128',
MASTER_USER='sync',
MASTER_PASSWORD='123',
MASTER_PORT=3306,
MASTER_LOG_FILE='slave-bin.000014', MASTER_LOG_POS=197;
start slave;
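Rejoining the old master is mechanical: take File and Position from `show master status` on the new master and substitute them into CHANGE MASTER TO. A hypothetical helper that formats the statement (names and defaults are illustrative):

```python
# Hypothetical helper: format the CHANGE MASTER TO statement from the
# File/Position values shown by `show master status` on the new master.
def change_master_sql(host, user, password, log_file, log_pos, port=3306):
    return (
        "CHANGE MASTER TO "
        f"MASTER_HOST='{host}', MASTER_USER='{user}', "
        f"MASTER_PASSWORD='{password}', MASTER_PORT={port}, "
        f"MASTER_LOG_FILE='{log_file}', MASTER_LOG_POS={log_pos};"
    )

print(change_master_sql("10.0.0.128", "sync", "123", "slave-bin.000014", 197))
```

With GTID enabled (as configured above), MASTER_AUTO_POSITION=1 could replace the explicit file/pos pair entirely.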
##The node status is OK
[(none)]>show slave status \G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for source to send event
Master_Host: 10.0.0.128
Master_User: sync
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: slave-bin.000014
Read_Master_Log_Pos: 197
Relay_Log_File: rocky-relay-bin.000002
Relay_Log_Pos: 326
Relay_Master_Log_File: slave-bin.000014
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
TiDB: a distributed database
Distributed tables
MySQL's maximum connections: show variables like '%connection%'
Default maximum concurrent connections: 151
[(none)]>show variables like '%connection%';
+-----------------------------------+----------------------+
| Variable_name | Value |
+-----------------------------------+----------------------+
| character_set_connection | utf8mb4 |
| collation_connection | utf8mb4_0900_ai_ci |
| connection_memory_chunk_size | 8912 |
| connection_memory_limit | 18446744073709551615 |
| global_connection_memory_limit | 18446744073709551615 |
| global_connection_memory_tracking | OFF |
| max_connections | 151 |
| max_user_connections | 0 |
| mysqlx_max_connections | 100 |
+-----------------------------------+----------------------+
mysqlslap: MySQL's load-testing tool
Test case: run the auto-generated workload with 1000 concurrent clients
mysqlslap -a -c 1000
[root@rocky ~]#mysqlslap -uroot -p123 -a -c 1000
mysqlslap: [Warning] Using a password on the command line interface can be insecure.
Benchmark
Average number of seconds to run all queries: 17.270 seconds
Minimum number of seconds to run all queries: 17.270 seconds
Maximum number of seconds to run all queries: 17.270 seconds
Number of clients running queries: 1000
Average number of queries per client: 0
Raising MySQL's maximum concurrent connections
set global max_connections=5000;
vim /etc/my.cnf
max_connections=5000
[(none)]>set global max_connections=5000;
Query OK, 0 rows affected (0.00 sec)
[(none)]>show databases;
+--------------------+
| Database |
+--------------------+
| catyer |
| hellodb |
| information_schema |
| mysql |
| mysqlslap |
| performance_schema |
| sys |
| trigger_pra |
| wordpress |
| zabbix |
+--------------------+
10 rows in set (0.02 sec)
Author: Catyer
Link: https://www.cnblogs.com/catyer/p/16728923.html
License: CC BY-NC-ND 2.5 China Mainland