MySQL数据库使用pt-online-schema-change添加字段报错不能连接到从库:Cannot connect to A=utf8,P=xxx
【环境介绍】
系统环境:腾讯云 5.7.18 + pt-online-schema-change 3.0.5
【情况描述】
使用 pt-online-schema-change对表添加字段操作报错:
脚本:
[18:07:19][root@ipa2 shengchan]# cat pt_add_column.sh
[18:07:19]#!/bin/bash
[18:07:19]table=$1
[18:07:19]alter_conment=$2
[18:07:19]
[18:07:19]cnn_host='10.1.xx.xx'
[18:07:19]cnn_user='monitor'
[18:07:19]cnn_pwd='xxxx'
[18:07:19]cnn_db='xx'
[18:07:19]
[18:07:19]echo "$table"
[18:07:19]echo "$alter_conment"
[18:07:19]pt-online-schema-change --charset=utf8 --no-version-check --user=${cnn_user} --password=${cnn_pwd} --host=${cnn_host} P=3306,D=${cnn_db},t=$table --alter "${alter_conment}" --alter-foreign-keys-method=auto --no-check-replication-filters --execute
执行:
[18:07:31][root@ipa2 shengchan]# sh pt_add_column.sh X_XXXXXX_MESSAGE "ADD COLUMN FBUSINESS_ID varchar(64) NULL COMMENT '业务id' AFTER FVISIBLE_RANGE,ADD COLUMN FBUSINESS_TYPE varchar(30) NULL COMMENT '业务类型' AFTER FBUSINESS_ID"
[18:07:31]X_XXXXXX_MESSAGE
[18:07:31]ADD COLUMN FBUSINESS_ID varchar(64) NULL COMMENT '业务id' AFTER FVISIBLE_RANGE,ADD COLUMN FBUSINESS_TYPE varchar(30) NULL COMMENT '业务类型' AFTER FBUSINESS_ID
[18:07:31]Cannot connect to A=utf8,P=20127,h=,p=...,u=monitor
[18:07:31]Cannot connect to A=utf8,P=20121,h=,p=...,u=monitor
[18:07:31]No slaves found. See --recursion-method if host TENCENT64.site has slaves.
[18:07:31]Not checking slave lag because no slaves were found and --check-slave-lag was not specified.
[18:07:31]XXeration, tries, wait:
[18:07:31] analyze_table, 10, 1
[18:07:31] cXXy_rows, 10, 0.25
[18:07:31] create_triggers, 10, 1
[18:07:31] drXX_triggers, 10, 1
[18:07:31] swap_tables, 10, 1
[18:07:31] update_foreign_keys, 10, 1
[18:07:32]Child tables:
[18:07:32] `XX`.`X_XXXX_GROUP_MESSAGE_XXXXX` (approx. 5906 rows)
[18:07:32] `XX`.`X_XXXX_MESSAGE_XXX_WAY` (approx. 17489 rows)
[18:07:32]Will automatically choose the method to update foreign keys.
[18:07:32]Altering `XX`.`X_XXXX_MESSAGE`...
[18:07:32]Creating new table...
[18:07:32]Created new table XX._X_XXXX_MESSAGE_new OK.
[18:07:32]Altering new table...
[18:07:32]Altered `XX`.`_X_XXXX_MESSAGE_new` OK.
[18:07:32]2020-10-14T18:06:28 Creating triggers...
[18:07:32]2020-10-14T18:06:28 Created triggers OK.
[18:07:32]2020-10-14T18:06:28 CXXying approximately 1740449 rows...
[18:07:32]Cannot connect to A=utf8,P=20127,h=,p=...,u=monitor
[18:07:32]Cannot connect to A=utf8,P=20121,h=,p=...,u=monitor
......
[18:07:43]Cannot connect to A=utf8,P=20127,h=,p=...,u=monitor
[18:07:43]Cannot connect to A=utf8,P=20121,h=,p=...,u=monitor
[18:07:43]^C# Exiting on SIGINT.
[18:07:43]Not drXXping triggers because the tool was interrupted. To drXX the triggers, execute:
[18:07:43]DRXX TRIGGER IF EXISTS `XX`.`pt_osc_XX_X_XXXX_MESSAGE_del`
[18:07:43]DRXX TRIGGER IF EXISTS `XX`.`pt_osc_XX_X_XXXX_MESSAGE_upd`
[18:07:43]DRXX TRIGGER IF EXISTS `XX`.`pt_osc_XX_X_XXXX_MESSAGE_ins`
[18:07:43]Not drXXping the new table `XX`.`_X_XXXX_MESSAGE_new` because the tool was interrupted. To drXX the new table, execute:
[18:07:43]DRXX TABLE IF EXISTS `XX`.`_X_XXXX_MESSAGE_new`;
[18:07:43]`XX`.`X_XXXX_MESSAGE` was not altered.
[18:07:44][root@ipa2 shengchan]#
【情况分析】
从上面信息查看,没有检测连接到两个端口的数据库
通过命令查看数据库从库的信息,与连接的信息一致:
(monitor@xx.xx.xx.xx) [(none)]> show slave hosts;
+-----------+------+-------+-----------+--------------------------------------+
| Server_id | Host | Port | Master_id | Slave_UUID |
+-----------+------+-------+-----------+--------------------------------------+
| 51567 | | 20127 | 51565 | 46439a24-0273-11eb-96a5-a4fa76fdf28e |
| 2180 | | 20121 | 51565 | 78e1111b-0273-11eb-bea1-6c92bf48d2a0 |
+-----------+------+-------+-----------+--------------------------------------+
查看其它文章这个报错,说是主从的用户密码一致,经过验证用户,权限都没有问题(从库只读开启,严格把控,出现用户密码不一致问题可能性非常小)。
查看OSC资料涉及尝试查找从库的参数信息,查看到--recursion-method方式发现从库:
从这里查看建议使用processlist方式查找从库,hosts方式不可靠
添加参数--recursion-method=processlist即可;
【注意事项】
注意:重新操作时注意一开始后面提示的信息,显示的是工具没有进行删除触发器跟新创建的表,需要手工删除触发器跟新创建的表(确认好信息操作,防止误操作)。
查看触发器:
select TRIGGER_SCHEMA,TRIGGER_NAME,EVENT_OBJECT_TABLE from triggers where EVENT_OBJECT_TABLE='';