MySQL The instance is already part of another Replication Group

2024-06-13 16:16  潇湘隐者

While re-adding an instance to a MySQL InnoDB Cluster (test environment: MySQL 8.0.35), I ran into the following error: "The instance 'dbu03:3306' is already part of another Replication Group".

MySQL  10.160.2.55:3306 ssl  JS > cluster.addInstance('cdmin@10.160.2.62:3306')
ERROR: RuntimeError: The instance 'dbu03:3306' is already part of another Replication Group
Cluster.addInstance: The instance 'dbu03:3306' is already part of another Replication Group (RuntimeError)
MySQL  10.160.2.55:3306 ssl  JS > 

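So how do we fix this? According to the official documentation, this situation is caused by a bug. The workaround is to connect with MySQL Shell to the affected instance (the one that fails to join the cluster; 10.160.2.62 in this case)

and then run the following commands: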

shell.options.verbose=3
shell.options["dba.logSql"]=2
shell.options["logLevel"]=8
\sql
stop group_replication;
\js
dba.dropMetadataSchema();

The actual session looked like this:

 MySQL  10.160.2.62:3306 ssl  JS > shell.options.verbose=3
3
 MySQL  10.160.2.62:3306 ssl  JS > shell.options["dba.logSql"]=2
2
 MySQL  10.160.2.62:3306 ssl  JS > shell.options["logLevel"]=8
8
 MySQL  10.160.2.62:3306 ssl  JS > \sql
Switching to SQL mode... Commands end with ;
Fetching global names for auto-completion... Press ^C to stop.
 MySQL  10.160.2.62:3306 ssl  SQL > stop group_replication;
Query OK, 0 rows affected (0.0050 sec)
 MySQL  10.160.2.62:3306 ssl  SQL > \js
Switching to JavaScript mode...
 MySQL  10.160.2.62:3306 ssl  JS > \js
 MySQL  10.160.2.62:3306 ssl  JS > dba.dropMetadataSchema();
verbose: 2024-06-13T08:33:55Z: Connecting to MySQL at: mysql://cdmin@10.160.2.62:3306?connect-timeout=5000
verbose: 2024-06-13T08:33:55Z: Dba.dropMetadataSchema: tid=530: CONNECTED: 10.160.2.62:3306
verbose: 2024-06-13T08:33:55Z: Dba.dropMetadataSchema: tid=530: SQL: SET SESSION `autocommit` = 1
verbose: 2024-06-13T08:33:55Z: Dba.dropMetadataSchema: tid=530: SQL: SET SESSION `sql_mode` = 'ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION'
verbose: 2024-06-13T08:33:55Z: Dba.dropMetadataSchema: tid=530: SQL: SET SESSION `group_replication_consistency` = 'EVENTUAL'
verbose: 2024-06-13T08:33:55Z: Dba.dropMetadataSchema: tid=530: SQL: SET SESSION `collation_connection` = 'utf8mb4_0900_ai_ci'
verbose: 2024-06-13T08:33:55Z: Dba.dropMetadataSchema: tid=530: SQL: SET SESSION `group_concat_max_len` = 1073741824
verbose: 2024-06-13T08:33:55Z: Dba.dropMetadataSchema: tid=530: SQL: SELECT COALESCE(@@report_host, @@hostname),  COALESCE(@@report_port, @@port)
verbose: 2024-06-13T08:33:55Z: Dba.dropMetadataSchema: tid=530: SQL: SELECT @@server_uuid
verbose: 2024-06-13T08:33:55Z: Metadata operations will use dbu03:3306
verbose: 2024-06-13T08:33:55Z: Metadata operations will use dbu03:3306
verbose: 2024-06-13T08:33:55Z: Dba.dropMetadataSchema: tid=530: SQL: SHOW DATABASES LIKE 'mysql_innodb_cluster_metadata'
verbose: 2024-06-13T08:33:55Z: Dba.dropMetadataSchema: tid=530: SQL: SHOW DATABASES LIKE 'mysql_innodb_cluster_metadata_bkp'
verbose: 2024-06-13T08:33:55Z: Dba.dropMetadataSchema: tid=530: SQL: select count(*) FROM performance_schema.replication_group_members WHERE MEMBER_ID = @@server_uuid AND MEMBER_STATE NOT IN ('OFFLINE''UNREACHABLE')
verbose: 2024-06-13T08:33:55Z: Instance type check: dbu03:3306: GR is installed but not active
verbose: 2024-06-13T08:33:55Z: Dba.dropMetadataSchema: tid=530: SQL: SELECT
    c.channel_name, c.host, c.port, c.user,
    s.source_uuid, s.group_name, s.last_heartbeat_timestamp,
    s.service_state io_state, st.processlist_state io_thread_state,
    s.last_error_number io_errno, s.last_error_message io_errmsg,
    s.last_error_timestamp io_errtime,
    co.service_state co_state, cot.processlist_state co_thread_state,
    co.last_error_number co_errno, co.last_error_message co_errmsg,
    co.last_error_timestamp co_errtime,
    w.service_state w_state, wt.processlist_state w_thread_state,
    w.last_error_number w_errno, w.last_error_message w_errmsg,
    w.last_error_timestamp w_errtime,
    /*!80011 TIMEDIFF(NOW(6),
      IF(TIMEDIFF(s.LAST_QUEUED_TRANSACTION_START_QUEUE_TIMESTAMP,
          s.LAST_HEARTBEAT_TIMESTAMP) >= 0,
        s.LAST_QUEUED_TRANSACTION_START_QUEUE_TIMESTAMP,
        s.LAST_HEARTBEAT_TIMESTAMP
      )) as time_since_last_message,
    IF(s.LAST_QUEUED_TRANSACTION='' OR s.LAST_QUEUED_TRANSACTION=latest_w.LAST_APPLIED_TRANSACTION,
      'IDLE',
      'APPLYING') as applier_busy_state,
    IF(s.LAST_QUEUED_TRANSACTION='' OR s.LAST_QUEUED_TRANSACTION=latest_w.LAST_APPLIED_TRANSACTION,
      NULL,
      TIMEDIFF(latest_w.LAST_APPLIED_TRANSACTION_END_APPLY_TIMESTAMP,
        latest_w.LAST_APPLIED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP)
      ) as lag_from_original,
    IF(s.LAST_QUEUED_TRANSACTION='' OR s.LAST_QUEUED_TRANSACTION=latest_w.LAST_APPLIED_TRANSACTION,
      NULL,
      TIMEDIFF(latest_w.LAST_APPLIED_TRANSACTION_END_APPLY_TIMESTAMP,
        latest_w.LAST_APPLIED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP)
      ) as lag_from_immediate,
    */
    GTID_SUBTRACT(s.RECEIVED_TRANSACTION_SET, @@global.gtid_executed)
      as queued_gtid_set_to_apply
  FROM performance_schema.replication_connection_configuration c
  JOIN performance_schema.replication_connection_status s
    ON c.channel_name = s.channel_name
  LEFT JOIN performance_schema.replication_applier_status_by_coordinator co
    ON c.channel_name = co.channel_name
  JOIN performance_schema.replication_applier_status a
    ON c.channel_name = a.channel_name
  JOIN performance_schema.replication_applier_status_by_worker w
    ON c.channel_name = w.channel_name
  LEFT JOIN
  /* if parallel replication, fetch owner of most recently applied tx */
    (SELECT *
      FROM performance_schema.replication_applier_status_by_worker
      /*!80011 ORDER BY LAST_APPLIED_TRANSACTION_END_APPLY_TIMESTAMP DESC */
      LIMIT 1) latest_w
    ON c.channel_name = latest_w.channel_name
  LEFT JOIN performance_schema.threads st
    ON s.thread_id = st.thread_id
  LEFT JOIN performance_schema.threads cot
    ON co.thread_id = cot.thread_id
  LEFT JOIN performance_schema.threads wt
    ON w.thread_id = wt.thread_id
ORDER BY channel_name
verbose: 2024-06-13T08:33:55Z: Dba.dropMetadataSchema: tid=530: SQL: SHOW SLAVE HOSTS
verbose: 2024-06-13T08:33:55Z: Refreshing metadata cache from 'dbu03:3306'
verbose: 2024-06-13T08:33:55Z: Dba.dropMetadataSchema: tid=530: SQL: SHOW DATABASES LIKE 'mysql_innodb_cluster_metadata'
verbose: 2024-06-13T08:33:55Z: Dba.dropMetadataSchema: tid=530: SQL: SHOW DATABASES LIKE 'mysql_innodb_cluster_metadata_bkp'
verbose: 2024-06-13T08:33:55Z: Dba.dropMetadataSchema: tid=530: SQL: SHOW DATABASES LIKE 'mysql_innodb_cluster_metadata'
verbose: 2024-06-13T08:33:55Z: Dba.dropMetadataSchema: tid=530: SQL: SHOW DATABASES LIKE 'mysql_innodb_cluster_metadata_bkp'
verbose: 2024-06-13T08:33:55Z: DONE!
verbose: 2024-06-13T08:33:55Z: Dba.dropMetadataSchema: tid=530: SQL: SHOW DATABASES LIKE 'mysql_innodb_cluster_metadata'
verbose: 2024-06-13T08:33:55Z: Dba.dropMetadataSchema: tid=530: SQL: SHOW DATABASES LIKE 'mysql_innodb_cluster_metadata_bkp'
verbose: 2024-06-13T08:33:55Z: Dba.dropMetadataSchema: tid=530: SQL: select count(*) FROM performance_schema.replication_group_members WHERE MEMBER_ID = @@server_uuid AND MEMBER_STATE NOT IN ('OFFLINE''UNREACHABLE')
verbose: 2024-06-13T08:33:55Z: Instance type check: dbu03:3306: GR is installed but not active
verbose: 2024-06-13T08:33:55Z: Dba.dropMetadataSchema: tid=530: SQL: SELECT
    c.channel_name, c.host, c.port, c.user,
    s.source_uuid, s.group_name, s.last_heartbeat_timestamp,
    s.service_state io_state, st.processlist_state io_thread_state,
    s.last_error_number io_errno, s.last_error_message io_errmsg,
    s.last_error_timestamp io_errtime,
    co.service_state co_state, cot.processlist_state co_thread_state,
    co.last_error_number co_errno, co.last_error_message co_errmsg,
    co.last_error_timestamp co_errtime,
    w.service_state w_state, wt.processlist_state w_thread_state,
    w.last_error_number w_errno, w.last_error_message w_errmsg,
    w.last_error_timestamp w_errtime,
    /*!80011 TIMEDIFF(NOW(6),
      IF(TIMEDIFF(s.LAST_QUEUED_TRANSACTION_START_QUEUE_TIMESTAMP,
          s.LAST_HEARTBEAT_TIMESTAMP) >= 0,
        s.LAST_QUEUED_TRANSACTION_START_QUEUE_TIMESTAMP,
        s.LAST_HEARTBEAT_TIMESTAMP
      )) as time_since_last_message,
    IF(s.LAST_QUEUED_TRANSACTION='' OR s.LAST_QUEUED_TRANSACTION=latest_w.LAST_APPLIED_TRANSACTION,
      'IDLE',
      'APPLYING') as applier_busy_state,
    IF(s.LAST_QUEUED_TRANSACTION='' OR s.LAST_QUEUED_TRANSACTION=latest_w.LAST_APPLIED_TRANSACTION,
      NULL,
      TIMEDIFF(latest_w.LAST_APPLIED_TRANSACTION_END_APPLY_TIMESTAMP,
        latest_w.LAST_APPLIED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP)
      ) as lag_from_original,
    IF(s.LAST_QUEUED_TRANSACTION='' OR s.LAST_QUEUED_TRANSACTION=latest_w.LAST_APPLIED_TRANSACTION,
      NULL,
      TIMEDIFF(latest_w.LAST_APPLIED_TRANSACTION_END_APPLY_TIMESTAMP,
        latest_w.LAST_APPLIED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP)
      ) as lag_from_immediate,
    */
    GTID_SUBTRACT(s.RECEIVED_TRANSACTION_SET, @@global.gtid_executed)
      as queued_gtid_set_to_apply
  FROM performance_schema.replication_connection_configuration c
  JOIN performance_schema.replication_connection_status s
    ON c.channel_name = s.channel_name
  LEFT JOIN performance_schema.replication_applier_status_by_coordinator co
    ON c.channel_name = co.channel_name
  JOIN performance_schema.replication_applier_status a
    ON c.channel_name = a.channel_name
  JOIN performance_schema.replication_applier_status_by_worker w
    ON c.channel_name = w.channel_name
  LEFT JOIN
  /* if parallel replication, fetch owner of most recently applied tx */
    (SELECT *
      FROM performance_schema.replication_applier_status_by_worker
      /*!80011 ORDER BY LAST_APPLIED_TRANSACTION_END_APPLY_TIMESTAMP DESC */
      LIMIT 1) latest_w
    ON c.channel_name = latest_w.channel_name
  LEFT JOIN performance_schema.threads st
    ON s.thread_id = st.thread_id
  LEFT JOIN performance_schema.threads cot
    ON co.thread_id = cot.thread_id
  LEFT JOIN performance_schema.threads wt
    ON w.thread_id = wt.thread_id
ORDER BY channel_name
verbose: 2024-06-13T08:33:55Z: Dba.dropMetadataSchema: tid=530: SQL: SHOW SLAVE HOSTS
Dba.dropMetadataSchema: This function is not available through a session to a standalone instance (MYSQLSH 51300)
 MySQL  10.160.2.62:3306 ssl  JS > 

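If the commands above do not succeed (as happened here, where dba.dropMetadataSchema() failed with "This function is not available through a session to a standalone instance (MYSQLSH 51300)"), we have to connect to the database and run the following statements manually: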

stop group_replication;
drop schema mysql_innodb_cluster_metadata;
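
Before going back to the primary, it can help to confirm that the manual cleanup actually took effect. The following is only a minimal sketch and is not taken from the Oracle note: `cleanupState` is a hypothetical helper name, and it assumes MySQL Shell's JS mode, where the global `session` offers `runSql()` and the result offers `fetchOne()`. The first query is modeled on the membership check visible in the verbose log above.

```javascript
// Sketch: report whether the instance still looks like a cluster member.
// `session` is assumed to be any object with runSql(sql) whose result has
// fetchOne(), as the global session in MySQL Shell's JS mode does.
function cleanupState(session) {
  // Count this server as active if it is listed with a live member state.
  const row = session.runSql(
    "SELECT COUNT(*) FROM performance_schema.replication_group_members " +
    "WHERE MEMBER_ID = @@server_uuid " +
    "AND MEMBER_STATE NOT IN ('OFFLINE', 'UNREACHABLE')"
  ).fetchOne();
  const active = row ? Number(row[0]) : 0;

  // No row comes back when the metadata schema has been dropped.
  const schemaRow = session.runSql(
    "SHOW DATABASES LIKE 'mysql_innodb_cluster_metadata'"
  ).fetchOne();

  return { grActive: active > 0, metadataPresent: Boolean(schemaRow) };
}

// In MySQL Shell (JS mode), connected to 10.160.2.62:
//   print(cleanupState(session));
```

If both fields come back `false`, Group Replication is stopped and the metadata schema is gone, which is the state the two statements above are meant to produce.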

Then run the following commands on the primary node to add the instance back into the MySQL InnoDB Cluster.

var cluster=dba.getCluster()
cluster.addInstance('cdmin@10.160.2.62:3306')
cluster.status()
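
Reading the full `cluster.status()` output by eye works, but a small helper can make the check explicit. This is a hedged sketch rather than part of the documented workaround: `notOnlineMembers` is a hypothetical name, and it assumes the usual status layout in which `defaultReplicaSet.topology` is a map keyed by "host:port" and each entry has a `status` field. I have only reasoned about it against a plain object of that shape.

```javascript
// Sketch: list every member whose status is not ONLINE.
// Assumes status.defaultReplicaSet.topology maps "host:port" labels to
// entries that carry a `status` string (an assumption about the layout).
function notOnlineMembers(status) {
  const topology = (status.defaultReplicaSet || {}).topology || {};
  return Object.keys(topology)
    .filter(function (label) { return topology[label].status !== 'ONLINE'; })
    .map(function (label) { return label + ' is ' + topology[label].status; });
}

// In MySQL Shell (JS mode) on the primary:
//   var cluster = dba.getCluster();
//   print(notOnlineMembers(cluster.status()));
```

An empty list means every member in the topology reports ONLINE. A freshly added instance may show up here for a while as it recovers, so an entry right after `addInstance()` is not necessarily a failure.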

According to the Oracle support note "The instance 'mysqlNode:3306' is already part of another Replication Group, How To Solve? (Doc ID 2809308.1)"[1], this problem is caused by Bug #33294010 "clusterset: rejoinInstance() on broken GR member fail"[2], which has been fixed in 8.1.0.

References

[1] https://support.oracle.com/epmos/faces/DocumentDisplay?_afrLoop=425530417327847&id=2809308.1&_afrWindowMode=0&_adf.ctrl-state=3s7yvgwzw_4

[2] https://support.oracle.com/epmos/faces/BugDisplay?_afrLoop=426942735497044&id=33294010&_afrWindowMode=0&_adf.ctrl-state=3s7yvgwzw_53