MySQL Cluster 日常维护
一. 数据备份
相信大家都熟悉mysql的日常备份工具,比如mysqldump对数据库进行逻辑备份。这个方法同样适用MySQL Cluster,备份方法和其他存储引擎一样,唯一的区别是在任意的SQL节点备份数据。我们来看一个例子。
我这里在SQL节点192.168.0.70上面使用mysqldump备份test库:
[root@192.168.0.70 ~]# mysqldump --single-transaction --skip-comments test > /data/bak.sql
DROP TABLE IF EXISTS `t1`; /*!40101 SET @saved_cs_client = @@character_set_client */; /*!40101 SET character_set_client = utf8 */; CREATE TABLE `t1` ( `name` varchar(20) DEFAULT NULL ) ENGINE=ndbcluster DEFAULT CHARSET=latin1; /*!40101 SET character_set_client = @saved_cs_client */; LOCK TABLES `t1` WRITE; /*!40000 ALTER TABLE `t1` DISABLE KEYS */; INSERT INTO `t1` VALUES ('yayun'),('atlas'); /*!40000 ALTER TABLE `t1` ENABLE KEYS */; UNLOCK TABLES; /*!40103 SET TIME_ZONE=@OLD_TIME_ZONE */;
可以看见导出的SQL和其他引擎没有区别,这里就不详细介绍了。我们重点介绍cluster的物理备份方法。还是以前面搭建的cluster环境为例子。当集群全部启动以后,进入管理节点执行start backup命令启动备份。输入?可以查看各种帮助选项。
ndb_mgm> ? --------------------------------------------------------------------------- NDB Cluster -- Management Client -- Help --------------------------------------------------------------------------- HELP Print help text HELP COMMAND Print detailed help for COMMAND(e.g. SHOW) SHOW Print information about cluster CREATE NODEGROUP <id>,<id>... Add a Nodegroup containing nodes DROP NODEGROUP <NG> Drop nodegroup with id NG START BACKUP [NOWAIT | WAIT STARTED | WAIT COMPLETED] START BACKUP [<backup id>] [NOWAIT | WAIT STARTED | WAIT COMPLETED] START BACKUP [<backup id>] [SNAPSHOTSTART | SNAPSHOTEND] [NOWAIT | WAIT STARTED | WAIT COMPLETED] Start backup (default WAIT COMPLETED,SNAPSHOTEND) ABORT BACKUP <backup id> Abort backup SHUTDOWN Shutdown all processes in cluster CLUSTERLOG ON [<severity>] ... Enable Cluster logging CLUSTERLOG OFF [<severity>] ... Disable Cluster logging CLUSTERLOG TOGGLE [<severity>] ... Toggle severity filter on/off CLUSTERLOG INFO Print cluster log information <id> START Start data node (started with -n) <id> RESTART [-n] [-i] [-a] [-f] Restart data or management server node <id> STOP [-a] [-f] Stop data or management server node ENTER SINGLE USER MODE <id> Enter single user mode EXIT SINGLE USER MODE Exit single user mode <id> STATUS Print status <id> CLUSTERLOG {<category>=<level>}+ Set log level for cluster log PURGE STALE SESSIONS Reset reserved nodeid's in the mgmt server CONNECT [<connectstring>] Connect to management server (reconnect if already connected) <id> REPORT <report-type> Display report for <report-type> QUIT Quit management client <severity> = ALERT | CRITICAL | ERROR | WARNING | INFO | DEBUG <category> = STARTUP | SHUTDOWN | STATISTICS | CHECKPOINT | NODERESTART | CONNECTION | INFO | ERROR | CONGESTION | DEBUG | BACKUP | SCHEMA <level> = 0 - 15 <id> = ALL | Any database node id For detailed help on COMMAND, use HELP COMMAND. ndb_mgm>
输入start backup启动备份,如下:
ndb_mgm> START BACKUP Waiting for completed, this may take several minutes Node 2: Backup 1 started from node 1 Node 2: Backup 1 started from node 1 completed StartGCP: 1232 StopGCP: 1235 #Records: 2061 #LogRecords: 0 Data: 51864 bytes Log: 0 bytes ndb_mgm>
在备份日志中,需要注意"Backup 1",它表示该备份的唯一ID,如果做第二次备份,备份ID会变成"Backup 2"。当日志中显示"Node 2: Backup 1 started from node 1 completed"时,本次备份结束。备份的数据保存在每个NDB(数据节点)下,具体的备份路径是$MYSQL_HOME/data/BACKUP/BACKUP-ID。这是默认的,我的数据放在/data/mysql-cluster/data/,所以备份也在该目录下面。上面的备份可以在两个数据节点都可以看到:
[root@192.168.0.40 BACKUP-1]# pwd /data/mysql-cluster/data/BACKUP/BACKUP-1 [root@192.168.0.40 BACKUP-1]# ll total 48 -rw-r--r-- 1 root root 25660 Apr 16 22:24 BACKUP-1-0.3.Data -rw-r--r-- 1 root root 14096 Apr 16 22:24 BACKUP-1.3.ctl -rw-r--r-- 1 root root 52 Apr 16 22:24 BACKUP-1.3.log [root@192.168.0.40 BACKUP-1]#
[root@192.168.0.50 BACKUP-1]# pwd /data/mysql-cluster/data/BACKUP/BACKUP-1 [root@192.168.0.50 BACKUP-1]# ll total 48 -rw-r--r-- 1 root root 26748 Apr 16 22:24 BACKUP-1-0.2.Data -rw-r--r-- 1 root root 14096 Apr 16 22:24 BACKUP-1.2.ctl -rw-r--r-- 1 root root 52 Apr 16 22:24 BACKUP-1.2.log [root@192.168.0.50 BACKUP-1]#
大家或许发现了以上两个节点的备份文件名不一样,一个包含".2",另外一个包含".3",这个数字表明了此备份是哪个数据节点上的数据。下面清楚的显示了ID号。
[ndbd(NDB)] 2 node(s) id=2 @192.168.0.50 (mysql-5.5.19 ndb-7.2.4, Nodegroup: 0, Master) id=3 @192.168.0.40 (mysql-5.5.19 ndb-7.2.4, Nodegroup: 0)
对于大数据量的备份,MySQL Cluster还提供了几个备份的参数可供调整,这些参数需要写在config.ini的[NDBD DEFAULT]或者[NDBD]组中,对各参数的具体说明如下:
(1)BackupDataBufferSize:将数据写入磁盘之前用于对数据进行缓冲处理的内存量。
(2)BackupLogBufferSzie: 将日志记录写入磁盘之前用于对其进行缓冲处理的内存量。
(3)BackupMemory: 在数据库节点中为备份分配的总内存。它应是分配给备份数据缓冲的内存和分配给日志缓冲的内存之和。
(4)BackupWriteSize: 每次写入磁盘的块大小,适用于备份数据缓冲和备份日志缓冲。
二. 数据恢复
对于用"start backup" 进行备份的cluster,必须使用ndb_restore工具进行数据恢复。ndb_restore是mysql cluster带的管理工具,在shell中执行,而不是ndb_mgm工具中的一个命令。
我们上面已经备份数据了,我们人为删除test库下面的表t1,然后恢复数据:
mysql> drop table test.t1; Query OK, 1 row affected (0.64 sec)
mysql>
(1)在数据节点2(192.168.0.50)的shell命令中执行如下命令:
[root@192.168.0.50 ~]# ndb_restore -b 1 -n 2 -c host=192.168.0.30:1186 -m -r /data/mysql-cluster/data/BACKUP/BACKUP-1/ Backup Id = 1 Nodeid = 2 backup path = /data/mysql-cluster/data/BACKUP/BACKUP-1/ Opening file '/data/mysql-cluster/data/BACKUP/BACKUP-1/BACKUP-1.2.ctl' File size 14096 bytes Backup version in files: ndb-6.3.11 ndb version: mysql-5.5.19 ndb-7.2.4 Stop GCP of Backup: 1234 Connected to ndb!! Successfully restored table `test/def/t1` Successfully restored table event REPL$test/t1 Opening file '/data/mysql-cluster/data/BACKUP/BACKUP-1/BACKUP-1-0.2.Data' File size 26748 bytes _____________________________________________________ Processing data in table: mysql/def/NDB$BLOB_7_3(8) fragment 0 _____________________________________________________ Processing data in table: mysql/def/ndb_index_stat_sample(5) fragment 0 _____________________________________________________ Processing data in table: sys/def/NDB$EVENTS_0(3) fragment 0 _____________________________________________________ Processing data in table: mysql/def/ndb_apply_status(9) fragment 0 _____________________________________________________ Processing data in table: mysql/def/ndb_index_stat_head(4) fragment 0 _____________________________________________________ Processing data in table: test/def/t1(10) fragment 0 _____________________________________________________ Processing data in table: sys/def/SYSTAB_0(2) fragment 0 _____________________________________________________ Processing data in table: mysql/def/ndb_schema(7) fragment 0 Opening file '/data/mysql-cluster/data/BACKUP/BACKUP-1/BACKUP-1.2.log' File size 52 bytes Restored 1 tuples and 0 log entries NDBT_ProgramExit: 0 - OK [root@192.168.0.50 ~]#
其中命令行中的各参数意义如下:
参数 说明 -b 备份id -n NDB节点id -m 恢复表定义 -r 备份路径 -c cluster管理节点连接
因为是第一个节点恢复,所以需要加参数-m来恢复表定义,这样在其他节点恢复时就不需要再加该参数,否则会报如下错误:
Unable to find table: `t1` Restore: Failed to restore table: `test/def/t1` ... Exiting
(2)在节点3(192.168.0.40)再进行恢复:
[root@192.168.0.40 ~]# ndb_restore -b 1 -n 3 -c host=192.168.0.30:1186 -r /data/mysql-cluster/data/BACKUP/BACKUP-1/ Backup Id = 1 Nodeid = 3 backup path = /data/mysql-cluster/data/BACKUP/BACKUP-1/ Opening file '/data/mysql-cluster/data/BACKUP/BACKUP-1/BACKUP-1.3.ctl' File size 14096 bytes Backup version in files: ndb-6.3.11 ndb version: mysql-5.5.19 ndb-7.2.4 Stop GCP of Backup: 1234 Connected to ndb!! Opening file '/data/mysql-cluster/data/BACKUP/BACKUP-1/BACKUP-1-0.3.Data' File size 25660 bytes _____________________________________________________ Processing data in table: mysql/def/NDB$BLOB_7_3(8) fragment 1 _____________________________________________________ Processing data in table: mysql/def/ndb_index_stat_sample(5) fragment 1 _____________________________________________________ Processing data in table: sys/def/NDB$EVENTS_0(3) fragment 1 _____________________________________________________ Processing data in table: mysql/def/ndb_apply_status(9) fragment 1 _____________________________________________________ Processing data in table: mysql/def/ndb_index_stat_head(4) fragment 1 _____________________________________________________ Processing data in table: test/def/t1(10) fragment 1 _____________________________________________________ Processing data in table: sys/def/SYSTAB_0(2) fragment 1 _____________________________________________________ Processing data in table: mysql/def/ndb_schema(7) fragment 1 Opening file '/data/mysql-cluster/data/BACKUP/BACKUP-1/BACKUP-1.3.log' File size 52 bytes Restored 1 tuples and 0 log entries NDBT_ProgramExit: 0 - OK [root@192.168.0.40 ~]#
(4)查看数据是否恢复(任意SQL节点,这里选择192.168.0.70):
mysql> select * from t1; +-------+ | name | +-------+ | yayun | | atlas | +-------+ 2 rows in set (0.05 sec) mysql>
此时数据已经恢复正常。
三 .日志管理
MySQL Cluster提供了两种日志,分别是集群日志(cluster log)和节点日志(node log)。前者记录了所有Cluster节点生成的日志,后者仅仅记录了数据节点的本地事件。在大多数情况下,一般推荐使用集群日志,因为它在一个地方记录了所有节点的数据,更方便进行管理。节点日志一般在开发过程中使用,或者用来调试程序代码。
clusterlog一般记录在配置文件(config.ini)所在的目录下,文件名格式为ndb_<nodeid>_cluster.log,其中nodeid为管理节点号。
下面是测试环境中的一段clusterlog:
[root@192.168.0.30 mysql-cluster]# tail -n 10 ndb_1_cluster.log #Tablespaces: 0 #Logfilegroups: 0 #datafiles: 0 #undofiles: 0 2014-04-16 23:21:48 [MgmtSrvr] INFO -- Node 6: Restore data: backup 3 from node 3 #Records: 3201 Data: 44814 bytes 2014-04-16 23:21:48 [MgmtSrvr] INFO -- Node 6: Restore log: backup 3 from node 3 #Records: 0 Data: 0 bytes 2014-04-16 23:21:48 [MgmtSrvr] INFO -- Node 6: Restore completed: backup 3 from node 3 2014-04-16 23:21:48 [MgmtSrvr] ALERT -- Node 2: Node 6 Disconnected 2014-04-16 23:21:48 [MgmtSrvr] INFO -- Node 2: Communication to Node 6 closed 2014-04-16 23:21:48 [MgmtSrvr] INFO -- Node 3: Communication to Node 6 closed 2014-04-16 23:21:48 [MgmtSrvr] ALERT -- Node 3: Node 6 Disconnected 2014-04-16 23:21:51 [MgmtSrvr] INFO -- Node 3: Communication to Node 6 opened 2014-04-16 23:21:52 [MgmtSrvr] INFO -- Node 2: Communication to Node 6 opened [root@192.168.0.30 mysql-cluster]#
可以使用ndb_mgm客户端管理工具打开或者关闭日志,具体操作如下:
[root@192.168.0.30 ~]# ndb_mgm -- NDB Cluster -- Management Client -- ndb_mgm> clusterlog info Connected to Management Server at: localhost:1186 Severities enabled: INFO WARNING ERROR CRITICAL ALERT ndb_mgm>
当前日志是打开的,可以用clusterlog off命令关闭日志:
ndb_mgm> clusterlog off Cluster logging is disabled ndb_mgm>
查看日志状态发现已关闭:
ndb_mgm> clusterlog info Cluster logging is disabled. ndb_mgm>
执行命令clusterlog on将日志打开:
ndb_mgm> clusterlog on Cluster logging is enabled. ndb_mgm> clusterlog info Severities enabled: INFO WARNING ERROR CRITICAL ALERT ndb_mgm>
cluster中的日志有很多类型,可以按照如下类别进行过滤,使得日志只记录我们关心的信息。
(1)Category(类别):可以是STARTUP,SHUTDOWN,STATISTICS,CHECKPOINT,NODERESTART,CONNECTION,ERROR或者INFO中的任意值。具体的信息大家可以查阅mysql相关文档。
(2)Priority(优先级):由从1-15的数字表示,1表示最重要,而15表示最不重要。每种Category都有一个 默认的优先级阀值,如下面所示,优先级阀值以下的日志将被记录,反之,优先级阀值以上的日志将不会被记录。
类别 默认阀值(所有数据节点) STARTUP 7 SHUTDOWN 7 STATISTICS 7 CHECKPOINT 7 NODERESTART 7 CONNECTION 7 ERROR 15 INFO 7
(3)Severity Level(严重级别):可以是下面的值之一,ALERT,CRITICAL,ERROR,WARNING,INFO或DEBUG。这些值的含义如下:
严重级别 事件定义 ALERT 应立刻更正的情况,如损坏的系统数据库 CRITICAL 临界状况,如设备错误或资源不足 ERROR 应予以更正的状况,如配置错误 WARNING 不能称为错误的情况,但仍需要特别处理 INFO 通报性信息 DEBUG 调试信息,用于NDB Cluster开发
这3种分类可以让我们从3个不同的角度来对日志进行过滤。过滤的方法可以用ndb_mgm工具来完成,具体设置方法如下。
(1)node_id CLUSTERLOG category=threshold:用于小于等于threshold的优先级将category事件记录到cluster日志。node_id可以为ALL(所有节点)。或者只指定某个节点。
(2)CLUSTERLOG TOGGLE severity_level:使得指定的severity_level打开或者关闭。
例如,要将测试环境中的节点2的STARTUP事件只记录级别为3以下的日志,可以进入ndb_mgm后执行命令:
ndb_mgm> 2 clusterlog startup=3 Executing CLUSTERLOG STARTUP=3 on node 2 OK! ndb_mgm>
如果要在cluster中过滤掉DEBUG和INFO信息,可以执行如下命令:
ndb_mgm> clusterlog toggle debug info DEBUG enabled INFO disabled ndb_mgm>
然后查看日志状态,发现DEBUG和INFO信息已经不存在了。
ndb_mgm> clusterlog info Severities enabled: DEBUG WARNING ERROR CRITICAL ALERT ndb_mgm>