DG(DataGuard)正常切换
1.IP信息
数据库 | 主机名 | 办公IP | 产线IP | ILO IP | 办公VIP | 产线VIP |
---|---|---|---|---|---|---|
主 | monkey01 | 192.168.1.183 | 192.168.2.183 | 192.168.3.183 | 192.168.1.33 | 192.168.2.33 |
库 | monkey02 | 192.168.1.184 | 192.168.2.184 | 192.168.3.184 | ||
备 | dg_monkey01 | 192.168.1.31 | 192.168.2.31 | 192.168.3.31 | 192.168.1.178 | 192.168.2.178 |
库 | dg_monkey02 | 192.168.1.32 | 192.168.2.32 | 192.168.3.32 |
2.坏块检查
坏块检查应该提前在备库进行检查,至少提前一天。
# 进入rman
rman target / nocatalog
# 执行检查
run {
allocate channel d1 type disk;
allocate channel d2 type disk;
allocate channel d3 type disk;
allocate channel d4 type disk;
backup validate check logical database;
}
# 查看(在read only状态就可以看)
SELECT * from V$DATABASE_BLOCK_CORRUPTION;
3.oracle patch检查
# 11204
opatch lsinv|grep -i 18685209
opatch lsinv|grep -i 18841764
opatch lsinv|grep -i 19769489
opatch lsinv|grep -i 12578873
opatch lsinv|grep -i 26392168
opatch lsinv|grep -i 30070097
# 19c
p6880880_122010_Linux-x86-64.zip
p34419443_190000_Linux-x86-64.zip
4.SPP检查
运维工具---》SPP管理
5.DG相关参数检查
DG参数可以参考:DataGuard(DG)参数详解
# db_unique_name
主库: monkey
备库: monkeystdby
# log_archive_config
主备库:dg_config=(monkey,monkeystdby)
# log_archive_dest_1
主库:'location=/archlog/ valid_for=(all_logfiles,all_roles) db_unique_name=monkey';
备库:'location=/archlog/ valid_for=(all_logfiles,all_roles) db_unique_name=monkeystdby';
# log_archive_dest_2
主库:'SERVICE=monkeystdby ASYNC VALID_FOR=(online_logfiles,primary_role) DB_UNIQUE_NAME=monkeystdby';
备库:'service=monkey async valid_for=(online_logfiles, primary_role) db_unique_name=monkey';
# log_archive_dest_state_1
主备库:ENABLE
# log_archive_dest_state_2
主备库:ENABLE
# fal_client monkey为主库tnsnames.ora中本库的tns名,monkeystdby为备库tnsname.ora指向本库的tns名
主库:monkey
备库:monkeystdby
# fal_server monkey为备库tnsnames.ora中指向主库的tns名,monkeystdby为主库tnsname.ora指向备库的tns名
主库:monkeystdby
备库:monkey
# log_archive_max_processes
主备库:30
# standby_file_management
主库:AUTO
备库:AUTO
6.修改错误参数
如果第三步参数有问题,在切换前纠正好
7.关闭备份
8.关闭EM,OGG
emctl stop dbconsole
# 如果数据库安装了多个版本的ogg,都要关闭
cd /oggdata/ogg11
ggsci
stop *
stop mgr
cd /oggdata/ogg19
ggsci
stop *
stop mgr
# 保存info *,detail,info *命令的结果,后面重新创建19ogg有用
9.禁用排程
# 备份
crontab -l > crontab_backup.txt
# 禁用
crontab -l | sed '/^$/! s/^/#/' | crontab -
10.主库checkpoint
sqlplus / as sysdba
alter system archive log current;
alter system checkpoint;
ps -ef|grep -i local=no|grep -v grep|awk '{print $2}'|xargs kill -9
ps -ef|grep -i local=no|grep -v grep
alter system checkpoint;
11.冻结数据库
# 主库
hagrp -freeze monkey-cluster
# 备库
hagrp -freeze monkey-cluster
12.切换
主库:
alter system archive log current;
SELECT SWITCHOVER_STATUS FROM V$DATABASE;
-- 如果结果为TO STANDBY
alter database commit to switchover to physical standby;
-- 如果结果为SESSIONS ACTIVE
alter database commit to switchover to physical standby with session shutdown;
-- 打开
shutdown immediate;
startup nomount;
alter database mount standby database;
备库:
alter database recover managed standby database cancel;
select switchover_status from v$database;
-- 如果结果为TO PRIMARY
alter database commit to switchover to primary;
-- 如果结果为SESSIONS ACTIVE
alter database commit to switchover to primary with session shutdown;
-- 打开
select open_mode,database_role from V$database;
shutdown immediate;
startup;
13.应用patch
DG库oracle打完patch,由于数据库无法read write模式打开,因此需要在打开后应用
cd /oracle/11204/rdbms/admin
sqlplus / as sysdba
@catbundle.sql psu apply
14.主备库更换VIP
14.1.禁用日志传输
alter system set log_archive_dest_state_2=defer;
14.2.原主库下VIP,更新VIP
hagrp -unfreeze monkey-cluster
hares -offline monkey-lsn -sys monkey01
hares -offline off-vip -sys monkey01
hares -offline sfc-vip -sys monkey01
haconf -makerw
hares -modify off-vip Address 192.168.1.178
hares -modify sfc-vip Address 192.168.2.178
haconf -dump -makero
14.3.现主库下VIP,更新VIP
hagrp -unfreeze monkey-cluster
hares -offline monkey-lsn -sys dg_monkey01
hares -offline off-ip -sys dg_monkey01
hares -offline sfc-ip -sys dg_monkey01
haconf -makerw
hares -modify off-ip Address 192.168.1.33
hares -modify sfc-ip Address 192.168.2.33
haconf -dump -makero
15.修改hosts
主库主备机,备库主备机都要修改(主要调整VIP部分)
16.修改主备库DG相关参数
此时主库的db_unique_name为monkeystdby,为了避免歧义,准备改回来
16.1.主库修改参数
alter system set db_unique_name=monkey scope=spfile;
alter system set log_archive_dest_1='location=/archlog valid_for=(all_logfiles,all_roles) db_unique_name=monkey' scope=spfile;
alter system set log_archive_dest_2='SERVICE=monkeystdby async valid_for=(online_logfiles,primary_role) db_unique_name=monkeystdby' scope=spfile;
alter system set fal_client=monkey;
alter system set fal_server=monkeystdby;
16.2.备库修改参数
alter system set db_unique_name=monkeystdby scope=spfile;
alter system set log_archive_dest_1='location=/archlog valid_for=(all_logfiles,all_roles) db_unique_name=monkeystdby' scope=spfile;
alter system set log_archive_dest_2='SERVICE=monkey async valid_for=(online_logfiles,primary_role) db_unique_name=monkey' scope=spfile;
alter system set fal_client=monkeystdby;
alter system set fal_server=monkey;
16.3.修改监听
tnsnames.ora中无需更改,monkey和monkeystdby两个tns本来配置的就是主库和备库的IP
listener.ora要修改,修改为对应的IP
16.4.重启数据库
# 主库
hagrp -unfreeze monkey-cluster
# 备库
hagrp -unfreeze monkey-cluster
# 主备库数据库都要重启
17.上线VIP
现主库:
hares -online off-ip -sys dg_monkey01
hares -online sfc-ip -sys dg_monkey01
hares -online Netlsnr -sys dg_monkey01
现备库
hares -online off-vip -sys monkey01
hares -online sfc-vip -sys monkey01
hares -online monkey-lsn -sys monkey01
18.主库打开日志传输
-- 主库打开
alter system set log_archive_dest_state_2=enable;
alter system switch logfile;
-- 备库应用
alter database recover managed standby database disconnect;
19.切换测试
19.1.同步文件
spfile,orapwd,listener.ora,tnsnames.ora传送至备机
19.2.切换
hagrp -switch monkey-cluster -to dg_monkey02
20.排程
20.1.把脚本传送过来
scp -r /scripts/* 192.168.1.31:/scripts/
20.2.把备份的排程传送过来
su - oracle
crontab /tmp/crontab_backup.txt
21.ogg处理
把主库的ogg目录传送过来,打开即可
针对19ogg,需要重新创建
注意
- 创建的目录要和原主库的一样,端口一样
- 把trail目录,定义文件传送到对应目录下
- 新建的进程注意启动时间和原来的一样(使用前面保存的命令文件)
22.编译失效对象
cd /oracle/11204/rdbms/admin
sqlplus / as sysdba
@utlrp.sql
23.重建EM
一定要确保hosts文件正确,不然创建EM会报错
emca -config dbcontrol db -repos recreate
24.全库收集统计信息
BEGIN
DBMS_STATS.gather_database_stats;
END;
/
25.收尾
23.1.备库文件同步
备库的spfile等主备机也要同步
23.2.排程是否执行
尤其注意排程的脚本是否存在,root下也要注意一下
23.3.ogg是否正常走
running不代表就走了,也要看看lag
23.4.表空间够不够
UNDO,TEMP等
23.5.redo
redo有没有多组,够不够
23.6.日志
oracle的alert日志,OS的message日志
24.监控
切换后的机器是否有监控,监控是否开启(zabbix等)
25.归档
归档目录是否正确,restore point是否删除