[Case] How to restore an instance that has been deleted



 
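Problem: the backup is a physical backup of a distributed instance, and it cannot be restored by hand.

Approach: create a new instance first, move the backup into that instance's path, then run the restore.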

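1. Create a new instance in Chitu (赤兔).
2. On HDFS, create the directories that match the new instance, here group_1610847854_45 and set_1610848126_1. If there are several shards, create them for every shard.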
hadoop fs -mkdir -p /tdsqlbackup/tdsqlzk/group_1610847854_45/autocoldbackup/sets/set_1610848126_1/binlog
hadoop fs -mkdir -p /tdsqlbackup/tdsqlzk/group_1610847854_45/autocoldbackup/sets/set_1610848126_1/errlog
hadoop fs -mkdir -p /tdsqlbackup/tdsqlzk/group_1610847854_45/autocoldbackup/sets/set_1610848126_1/slowlog
hadoop fs -mkdir -p /tdsqlbackup/tdsqlzk/group_1610847854_45/autocoldbackup/sets/set_1610848126_1/xtrabackup
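
With several shards, the four directories above have to be repeated for every set. Here is a minimal sketch that only prints the commands (a dry run), so they can be reviewed before anything is created. The function name `print_mkdir_cmds` is hypothetical, and the base path is simply copied from the commands above.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print the mkdir commands for one group and any number of sets.
# BACKUP_ROOT is copied from the commands above; adjust it for your cluster.
BACKUP_ROOT="/tdsqlbackup/tdsqlzk"

print_mkdir_cmds() {
    local group="$1"; shift
    local set_id sub
    for set_id in "$@"; do
        for sub in binlog errlog slowlog xtrabackup; do
            echo "hadoop fs -mkdir -p ${BACKUP_ROOT}/${group}/autocoldbackup/sets/${set_id}/${sub}"
        done
    done
}

# Example: one group with one set; pass more set IDs for more shards.
print_mkdir_cmds group_1610847854_45 set_1610848126_1
```

Once the printed lines look right, they can be piped to `sh` to run them.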

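3. Copy the backup file and the log files to the corresponding directories (note: when copying, the shard number has to be added to the file name).
Copy the physical backup file and the binlog: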
hadoop fs -cp -d  /tdsqlbackup/tdsqlzk/group_1610846781_37/autocoldbackup/sets/set_1610847062_1/xtrabackup/xtrabackup+1610847791+20210117+094311+10.85.10.53+4002+171279522+20210117+094325++xbstream.lz4  /tdsqlbackup/tdsqlzk/group_1610847854_45/autocoldbackup/sets/set_1610848126_1/xtrabackup/xtrabackup+1610847791+20210117+094311+10.85.10.53+4002+171279522+20210117+094325+1+xbstream.lz4
hadoop fs -cp -d  /tdsqlbackup/tdsqlzk/group_1610846781_37/autocoldbackup/sets/set_1610847062_1/xtrabackup/xtrabackup+1610847791+20210117+094311+10.85.10.53+4002+171279522+20210117+094325++xbstream.lz4.ok  /tdsqlbackup/tdsqlzk/group_1610847854_45/autocoldbackup/sets/set_1610848126_1/xtrabackup/xtrabackup+1610847791+20210117+094311+10.85.10.53+4002+171279522+20210117+094325+1+xbstream.lz4.ok

hadoop fs -cp -d  /tdsqlbackup/tdsqlzk/group_1610846781_37/autocoldbackup/sets/set_1610847062_1/binlog/binlog+1610847031+20210117+093031+10.85.10.53+4002+171279522+binlog.000001.lz4  /tdsqlbackup/tdsqlzk/group_1610847854_45/autocoldbackup/sets/set_1610848126_1/binlog/binlog+1610847031+20210117+093031+10.85.10.53+4002+171279522+binlog.000001.lz4
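
In the copy commands above, the only change to the xtrabackup file name is that the empty field before `xbstream.lz4` (`++`) becomes the shard number (`+1+`); the binlog name is left as it is. Here is a small sketch of that renaming, assuming the empty field always sits directly before `xbstream.lz4`. `add_shard_number` is a hypothetical helper name.

```shell
# Insert the shard number into the empty "++" field that precedes xbstream.lz4.
# Works for both the backup file and its .ok marker.
add_shard_number() {
    local name="$1" shard="$2"
    printf '%s\n' "$name" | sed "s/++xbstream\.lz4/+${shard}+xbstream.lz4/"
}

src="xtrabackup+1610847791+20210117+094311+10.85.10.53+4002+171279522+20210117+094325++xbstream.lz4"
add_shard_number "$src" 1
add_shard_number "$src.ok" 1
```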

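4. Run the restore in Chitu. If you choose the current time as the rollback time, the matching binlog must exist.
  In testing, the rollback time apparently has to fall after the point in time of at least one binlog; otherwise the restore reports    can not get route  in the retreat time
    My guesses: 1. The chosen time must be covered by a binlog.
                2. The chosen time must be later than the creation of the new instance, and the restore simply replays until no binlog is left (testing suggests this one is more likely).
                   End time of the new instance's creation: under Log Management -> Console Operation Log, find the "create distributed instance" task flow; task start time + execution time = time the instance creation finished.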
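
The end-time arithmetic (task start time + execution time) can be done on the command line. This is a sketch assuming GNU `date` and an execution time given in seconds; the function name and the sample values are made up for illustration and are not taken from the console.

```shell
# Creation end time = task start time + execution time (in seconds).
# Requires GNU date (-d). Times are treated as UTC only so that the
# arithmetic does not depend on the local time zone.
creation_end_time() {
    local start_epoch
    start_epoch=$(date -u -d "$1" +%s) || return 1
    date -u -d "@$((start_epoch + $2))" '+%Y-%m-%d %H:%M:%S'
}

# Hypothetical values: the task started at 11:30:00 and ran for 95 seconds.
creation_end_time "2021-01-17 11:30:00" 95
```

The rollback time chosen in Chitu should then be later than the printed value.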
  
  
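#How it works
1. During the restore, clicking Next triggers a flush of the logs, so an available instance is required.
2. The restore scans for usable backup files; it does not depend on the file name and does not match file names,
  so it is enough to cp the backup file into the directory.
  
3. Restoring to a specified point in time did not succeed; this needs further testing.
  Using the current time (the default) restored successfully.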

4. Without binlog the restore does not succeed either.
   The restore explicitly looks for binlog:
    ERROR 57093,backupfilemgn.cpp:228:pickValidBinlog,tid:0x7f6f16d5e880,serverid is not legal serverid = 171279522
    ERROR 57093,binlogmgn.cpp:67:getAvailableBinlog,tid:0x7f6f16d5e880,unable to get the available binlog
    ERROR 57093,recoverset.cpp:161:recover,tid:0x7f6f16d5e880,Binlog Recover Failed:Binlog Error:Check the binlog failure on HDFS.

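#Note: Chitu sometimes returns
   Hint:	【-2007】the zookeeper operation timeout
   You can wait and check again once the background job has finished, because the task has already been dispatched.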

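#Summary:
1. When restoring, choose "use the current time" (the default).
2. The restore needs at least one binlog file.
  Without binlog, the data in the backup is still restored, but the status is left unavailable; you can try adjusting it manually.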
  


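Test 1: the rollback time must be set to a point after the new instance was created
#1. Check when creation of the new instance finished:

#2. Rollback time set to minute 52: rollback succeeded
#Problem: the log reports that a backup file was found but none is available; the specified time may have been too early.
#         Tested again with the current time, without specifying a time manually: the restore succeeded.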
[2021-01-17 11:36:25 190425] ERROR 81370,xtrabackupmgn.cpp:128:getAvailableImage,tid:0x7fe7bbf4c880,There are 1 of backup files , but no one available.
[2021-01-17 11:36:25 190564] ERROR 81370,recoverset.cpp:153:recover,tid:0x7fe7bbf4c880,Xtrabackup Recover Failed:Xtrabackup Error:Unable to get available image.
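
The failed run above ends with a `Recover Failed` line that names the stage, and the binlog failure quoted earlier has the same shape. Here is a sketch that reads a log from standard input and reports which stage failed. The function name is hypothetical, and the patterns cover only the two messages seen in these notes.

```shell
# Report which restore stage failed, based on the "Recover Failed" lines
# quoted in these notes. Reads the log on stdin.
failed_stage() {
    local log
    log=$(cat)
    case "$log" in
        *"Xtrabackup Recover Failed"*) echo "xtrabackup: no usable backup image" ;;
        *"Binlog Recover Failed"*)     echo "binlog: no usable binlog on HDFS" ;;
        *)                             echo "no known failure line found" ;;
    esac
}

# Example with the last log line of this test.
echo 'ERROR 81370,recoverset.cpp:153:recover,tid:0x7fe7bbf4c880,Xtrabackup Recover Failed:Xtrabackup Error:Unable to get available image.' | failed_stage
```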

Test 2: backup file only, no binlog
1. Delete the binlog
[tdsql@tdsql1 ~]$ hadoop fs -rm /tdsqlbackup/tdsqlzk/group_1610847854_45/autocoldbackup/sets/set_1610848126_1/binlog/*                       # */
Deleted /tdsqlbackup/tdsqlzk/group_1610847854_45/autocoldbackup/sets/set_1610848126_1/binlog/binlog+1610847031+20210117+093031+10.85.10.53+4002+171279522+binlog.000001.lz4

2. Restore again without specifying a time: this also fails
 [2021-01-17 12:54:55 457702] ERROR 57093,backupfilemgn.cpp:228:pickValidBinlog,tid:0x7f6f16d5e880,serverid is not legal serverid = 171279522
[2021-01-17 12:54:55 457818] ERROR 57093,binlogmgn.cpp:67:getAvailableBinlog,tid:0x7f6f16d5e880,unable to get the available binlog
[2021-01-17 12:54:55 457863] ERROR 57093,recoverset.cpp:161:recover,tid:0x7f6f16d5e880,Binlog Recover Failed:Binlog Error:Check the binlog failure on HDFS.

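3. In fact, the data in the backup has already been restored; only the status is unavailable.

1. The set status is stuck in "rolling back"
2. The instance status is "unavailable"
Find the corresponding group and set in zk
#1. Change the set status from rolling back (101) to normal (0)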
get /tdsqlzk/group_1610858871_151/sets/set@set_1610859155_1/setrun@set_1610859155_1
[zk: localhost:2181(CONNECTED) 8] get /tdsqlzk/group_1610858871_151/sets/set@set_1610859155_1/setrun@set_1610859155_1
{"cgroup_cpu":"","degrade_flag":0,"history_ids":[{"id":"set_1610859155_1","timestamp":1610859155,"type":0}],"kpstatus":101,"master":{"alive":"0","city":"default","election":true,"hb_err":"0","idc":"IDC_CQ_YB_9527_01","idc_weight":"100","losthbtime":"0","name":"10.85.10.51_4013","sqlasyn":"1","weight":"1","zone":"default"},"password":"gYNPpki%7Q5WDD3WhL","proxy":[{"name":"10.85.10.51_15002"},{"name":"10.85.10.53_15002"},{"name":"10.85.10.52_15002"}],"read_only":"0","resource_info":{"cpu":100,"data_disk":16000,"log_disk":4000,"mem":1500},"set":"set_1610859155_1","slave":[{"alive":"0","city":"default","election":true,"hb_err":"0","idc":"IDC_CQ_YB_9527_02","idc_weight":"100","losthbtime":"0","name":"10.85.10.52_4013","sqlasyn":"1","weight":"1","zone":"default"},{"alive":"0","city":"default","election":true,"hb_err":"0","idc":"IDC_CQ_YB_9527_03","idc_weight":"100","losthbtime":"0","name":"10.85.10.53_4013","sqlasyn":"1","weight":"1","zone":"default"}],"specid":32768,"status":0,"uniqueid":"unique_1326296854_1610859155","user":"tdsqlsys_normal"}

Set the value manually, changing kpstatus to 0

[zk: localhost:2181(CONNECTED) 8] set /tdsqlzk/group_1610858871_151/sets/set@set_1610859155_1/setrun@set_1610859155_1 {"cgroup_cpu":"","degrade_flag":0,"history_ids":[{"id":"set_1610859155_1","timestamp":1610859155,"type":0}],"kpstatus":0,"master":{"alive":"0","city":"default","election":true,"hb_err":"0","idc":"IDC_CQ_YB_9527_01","idc_weight":"100","losthbtime":"0","name":"10.85.10.51_4013","sqlasyn":"1","weight":"1","zone":"default"},"password":"gYNPpki%7Q5WDD3WhL","proxy":[{"name":"10.85.10.51_15002"},{"name":"10.85.10.53_15002"},{"name":"10.85.10.52_15002"}],"read_only":"0","resource_info":{"cpu":100,"data_disk":16000,"log_disk":4000,"mem":1500},"set":"set_1610859155_1","slave":[{"alive":"0","city":"default","election":true,"hb_err":"0","idc":"IDC_CQ_YB_9527_02","idc_weight":"100","losthbtime":"0","name":"10.85.10.52_4013","sqlasyn":"1","weight":"1","zone":"default"},{"alive":"0","city":"default","election":true,"hb_err":"0","idc":"IDC_CQ_YB_9527_03","idc_weight":"100","losthbtime":"0","name":"10.85.10.53_4013","sqlasyn":"1","weight":"1","zone":"default"}],"specid":32768,"status":0,"uniqueid":"unique_1326296854_1610859155","user":"tdsqlsys_normal"}
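
Editing a JSON value this long by hand is error-prone. Here is a sketch that rewrites only the `kpstatus` field in the text returned by `get`, so the result can be pasted into the `set` command. It assumes the compact `"kpstatus":101` form shown above, and `reset_kpstatus` is a hypothetical helper name.

```shell
# Replace "kpstatus":101 (rolling back) with "kpstatus":0 (normal) in the JSON
# read from stdin; every other field is passed through unchanged.
reset_kpstatus() {
    sed 's/"kpstatus":101\([,}]\)/"kpstatus":0\1/'
}

# Shortened sample value, for illustration only.
echo '{"degrade_flag":0,"kpstatus":101,"status":0}' | reset_kpstatus
```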

2. Change the instance status from unavailable to normal (0)
get /tdsqlzk/group_1610858871_151/routes/status@group
{"kpstatus":"30000","status":"1"}

set /tdsqlzk/group_1610858871_151/routes/status@group {"status":"0"}

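3. Test the login. Here I used a newly created account, and the login works. If the instance is usable, it is a good idea to take a logical backup of the data at this point.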

[root@tdsql1 ~]# mysql -utest -ptest -h10.85.10.51 -P15002
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MySQL connection id is 4561
Server version: 5.7.17-11-V2.0R540D002-20191226-1152-log Source distribution
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MySQL [(none)]> 





posted @ 2022-02-16 10:15  www.cqdba.cn