如何利用Direct NFS克隆数据库
CloneDB是Oracle 11.2.0.3推出的一项新特性,它利用的了11g新引入的Direct NFS。它直接利用目标数据库的备份,无需将备份COPY到克隆环境下,使得一个备份可以克隆多个不同用途的数据库。它同时采用了copy-on-write技术,使得备份中只有被修改了的block才会被应用到克隆数据库中,极大的节省了克隆数据库的空间,并提升了数据库克隆的速度。
下面对该项特性进行了测试,共涉及三台服务器,采用的均是32位RHEL 6.3操作系统,数据库版本11.2.0.4。
NFS服务器: 主机名:node3.being.com IP: 192.168.2.13 共享路径:/u01 生产数据库: 主机名: node1.being.com IP: 192.168.2.11 ORACLE_HOME: /u01/app/oracle/product/11.2.0.4/db_1 ORACLE_SID:prod 挂载点:/backup/prod/ 克隆数据库: 主机名:node2.being.com IP:192.168.2.12 ORACLE_HOME:/u01/app/oracle/product/11.2.0.4/db_1 ORACLE_SID:orcl 挂载点:/prod/backup
一、准备NFS环境
首先查看一下node3的磁盘情况
[root@node3 ~]# df -h Filesystem Size Used Avail Use% Mounted on /dev/sda2 5.8G 3.9G 1.6G 72% / tmpfs 504M 0 504M 0% /dev/shm /dev/sda4 20G 172M 19G 1% /u01
在这里,打算将/u01作为NFS磁盘共享出去。
在NFS Server的主配置文件中/etc/exports添加以下内容:
/u01 *(rw,no_root_squash)
在这里,注意/u01的权限,刚开始/u01默认的是755权限,node1挂载后,无法创建文件,报“Permission denied”。
修改/u01的权限
[root@node3 ~]# chmod 757 /u01/
重启rpcbind和nfs服务
[root@node3 ~]# /etc/init.d/rpcbind restart
Stopping rpcbind: [FAILED]
Starting rpcbind: [ OK ]
[root@node3 ~]# /etc/init.d/nfs restart
Shutting down NFS daemon: [FAILED]
Shutting down NFS mountd: [FAILED]
Shutting down NFS quotas: [FAILED]
Starting NFS services: [ OK ]
Starting NFS quotas: [ OK ]
Starting NFS mountd: [ OK ]
Starting NFS daemon: [ OK ]
查看配置是否成功
[root@node3 ~]# showmount -e 192.168.2.13 Export list for 192.168.2.13: /u01 *
将rpcbind和nfs服务设为开机自启动
[root@node3 ~]# chkconfig rpcbind on
[root@node3 ~]# chkconfig nfs on
二、在生产数据库上挂载NFS文件系统
以root身份登录,创建备份目录
[root@node1 ~]# mkdir /backup/prod
挂载node3提供的nfs磁盘
[root@node1 ~]# showmount -e 192.168.2.13 Export list for 192.168.2.13: /u01 * [root@node1 ~]# mount 192.168.2.13:/u01 /backup/prod/ [root@node1 ~]# df -h Filesystem Size Used Avail Use% Mounted on /dev/sda2 7.7G 3.6G 3.8G 49% / tmpfs 400M 184M 217M 46% /dev/shm /dev/sda1 194M 27M 158M 15% /boot /dev/sda4 28G 11G 16G 41% /u01 192.168.2.13:/u01 20G 172M 19G 1% /backup/prod
挂载后,无法给/backup/prod修改文件属主和属组,譬如我想将该目录设置为oracle用户专享的,结果如下:
[root@node1 ~]# chown oracle.oinstall /backup/prod/ chown: changing ownership of `/backup/prod/': Invalid argument
后来发现实在无此必要,node1中/backup/prod的权限和node3中的/u01权限是相对应的,都是757。我以oracle用户的身份在该目录下创建文件,文件的属主和属组就分别是oracle和oinstall。
设置为开机自动挂载
在/etc/rc.local添加以下内容:
mount 192.168.2.13:/u01 /backup/prod/
三、在克隆数据库上挂载NFS文件系统
以root身份登录,创建备份目录,名字随意
[root@node2 ~]# mkdir -p /prod/backup
挂载node3提供的NFS共享盘
[root@node2 ~]# mount 192.168.2.13:/u01 /prod/backup/ [root@node2 ~]# df -h Filesystem Size Used Avail Use% Mounted on /dev/sda3 9.7G 6.2G 3.0G 68% / tmpfs 400M 0 400M 0% /dev/shm /dev/sda1 194M 27M 158M 15% /boot /dev/sdb 20G 4.2G 15G 23% /u01 192.168.2.13:/u01 20G 172M 19G 1% /prod/backup
同样设置开机自动挂载
在/etc/rc.local中添加以下内容:
mount 192.168.2.13:/u01 /prod/backup/
四、克隆数据库上启用Direct NFS
登录oracle账户,新建$ORACLE_HOME/dbs/oranfstab文件,内容如下:
[oracle@node2 ~]$ cat $ORACLE_HOME/dbs/oranfstab server:node3.being.com path:192.168.2.13 export:/u01 mount:/prod/backup
修改libodm11.so库文件的链接对象
[oracle@node2 lib]$ mv libodm11.so libodm11.so.bak [oracle@node2 lib]$ ln -s libnfsodm11.so libodm11.so
五、备份生产数据库
支持三种备份方式:在线热备(类似于alter database begin backup)、offline 备份(需关闭数据库)、rman “BACKUP AS COPY”(不支持rman备份集)
在这里,我们选用rman备份
以oracle身份登录node1
[oracle@node1 ~]$ rman target /
RMAN> sql 'alter database begin backup'; RMAN> backup as copy database format '/backup/prod/%U';
RMAN> sql 'alter database end backup';
备份完毕后,查看/backup/prod中的备份文件
[oracle@node1 ~]$ ll /backup/prod/ total 1294792 -rw-r-----. 1 oracle asmadmin 98304 Apr 24 00:26 0cq553i6_1_1 -rw-r-----. 1 oracle asmadmin 9748480 Apr 24 00:26 cf_D-PROD_id-289579616_0aq553hp -rw-r-----. 1 oracle asmadmin 513810432 Apr 24 00:26 data_D-PROD_I-289579616_TS-SYSAUX_FNO-2_08q553gd -rw-r-----. 1 oracle asmadmin 765468672 Apr 24 00:25 data_D-PROD_I-289579616_TS-SYSTEM_FNO-1_07q553dp -rw-r-----. 1 oracle asmadmin 31465472 Apr 24 00:26 data_D-PROD_I-289579616_TS-UNDOTBS1_FNO-3_09q553hj -rw-r-----. 1 oracle asmadmin 5251072 Apr 24 00:26 data_D-PROD_I-289579616_TS-USERS_FNO-4_0bq553i5 drwx------. 2 root root 16384 Apr 23 19:45 lost+found
其中cf_D-PROD_id-289579616_0aq553hp为控制文件的备份
六、为克隆数据库创建pfile
登录生产数据库
[oracle@node1 ~]$ sqlplus / as sysdba
生成pfile
SQL> create pfile='/backup/prod/pfile.ora' from spfile;
修改参数文件,除将所有涉及到prod的地方改为orcl外,同时新增了一个参数CLONEDB=TRUE,这个是必需的,最后参数文件内容如下:
[oracle@node1 ~]$ cat /backup/prod/pfile.ora orcl.__db_cache_size=83886080 orcl.__java_pool_size=4194304 orcl.__large_pool_size=8388608 orcl.__oracle_base='/u01/app/oracle'#ORACLE_BASE set from environment orcl.__pga_aggregate_target=125829120 orcl.__sga_target=188743680 orcl.__shared_io_pool_size=0 orcl.__shared_pool_size=83886080 orcl.__streams_pool_size=0 *.audit_file_dest='/u01/app/oracle/admin/orcl/adump' *.audit_trail='db' *.compatible='11.2.0.4.0' *.control_files='/u01/app/oracle/oradata/orcl/control01.ctl','/u01/app/oracle/fast_recovery_area/orcl/control02.ctl' *.db_block_size=8192 *.db_domain='' *.db_name='orcl' *.db_recovery_file_dest='/u01/app/oracle/fast_recovery_area' *.db_recovery_file_dest_size=4353687552 *.diagnostic_dest='/u01/app/oracle' *.dispatchers='(PROTOCOL=TCP) (SERVICE=orclXDB)' *.memory_target=314572800 *.open_cursors=300 *.processes=150 *.remote_login_passwordfile='EXCLUSIVE' *.undo_tablespace='UNDOTBS1' CLONEDB=TRUE
在克隆数据库中新建参数文件中涉及的目录
[oracle@node2 admin]$ mkdir -p /u01/app/oracle/admin/orcl/adump [oracle@node2 admin]$ mkdir -p /u01/app/oracle/oradata/orcl/ [oracle@node2 admin]$ mkdir -p /u01/app/oracle/fast_recovery_area/orcl
七、克隆数据库
ORACLE官方文档中提供了两种方式,一种是利用oracle自身提供的脚本,一个是手动创建。
在这里,我们选择脚本方式,通过脚本执行过程中的输出,对比ORACLE官方文档,我们可以发现两者其实是一样的,只不过手动创建所需要的命令写到脚本里面去了。
首先在克隆数据库所在的主机上设置环境变量
[oracle@node2 ~]$ export MASTER_COPY_DIR=/prod/backup [oracle@node2 ~]$ export CLONE_FILE_CREATE_DEST=/u01/app/oracle/oradata/orcl [oracle@node2 ~]$ export CLONEDB_NAME=orcl
其中MASTER_COPY_DIR指的是备份所在的目录,CLONE_FILE_CREATE_DEST指的是数据文件、日志文件、控制文件所在的目录,CLONEDB_NAME指定数据库名
执行数据库提供的perl脚本,关于脚本所在的目录及其执行的方式,我们来看看官方的说明:
因为我的数据库版本为11.2.0.4,故直接COPY到home目录下
[oracle@node2 ~]$ cp /u01/app/oracle/product/11.2.0.1/db_1/rdbms/install/clonedb.pl ~
执行脚本
[oracle@node2 ~]$ perl clonedb.pl /prod/backup/pfile.ora crtdb.sql dbren.sql
在/home/oracle下会生成两个文件,crtdb.sql和dbren.sql
登录数据库,执行crtdb.sql
[oracle@node2 ~]$ sqlplus / as sysdba SQL*Plus: Release 11.2.0.4.0 Production on Fri Apr 24 18:26:46 2015 Copyright (c) 1982, 2013, Oracle. All rights reserved. Connected to an idle instance. SQL> @crtdb.sql SQL> SET FEEDBACK 1 SQL> SET NUMWIDTH 10 SQL> SET LINESIZE 80 SQL> SET TRIMSPOOL ON SQL> SET TAB OFF SQL> SET PAGESIZE 100 SQL> SQL> STARTUP NOMOUNT PFILE=/u01/app/oracle/oradata/orcl/initorcl.ora ORACLE instance started. Total System Global Area 313860096 bytes Fixed Size 1364340 bytes Variable Size 272633484 bytes Database Buffers 33554432 bytes Redo Buffers 6307840 bytes SQL> CREATE CONTROLFILE REUSE SET DATABASE orcl RESETLOGS 2 MAXLOGFILES 32 3 MAXLOGMEMBERS 2 4 MAXINSTANCES 1 5 MAXLOGHISTORY 908 6 LOGFILE 7 GROUP 1 '/u01/app/oracle/oradata/orcl/orcl_log1.log' SIZE 100M BLOCKSIZE 512, 8 GROUP 2 '/u01/app/oracle/oradata/orcl/orcl_log2.log' SIZE 100M BLOCKSIZE 512 9 DATAFILE 10 '/prod/backup/0cq553i6_1_1', 11 '/prod/backup/cf_D-PROD_id-289579616_0aq553hp', 12 '/prod/backup/data_D-PROD_I-289579616_TS-SYSAUX_FNO-2_08q553gd', 13 '/prod/backup/data_D-PROD_I-289579616_TS-SYSTEM_FNO-1_07q553dp', 14 '/prod/backup/data_D-PROD_I-289579616_TS-UNDOTBS1_FNO-3_09q553hj', 15 '/prod/backup/data_D-PROD_I-289579616_TS-USERS_FNO-4_0bq553i5', 16 '/prod/backup/lost+found', 17 '/prod/backup/pfile.ora' 18 CHARACTER SET WE8DEC; CREATE CONTROLFILE REUSE SET DATABASE orcl RESETLOGS * ERROR at line 1: ORA-01503: CREATE CONTROLFILE failed ORA-00058: DB_BLOCK_SIZE must be 16384 to mount this database (not 8192)
报以上错误,再来看看datafile下面的文件,然后它把/prod/backup下面的所有的文件都包括其中了,包括控制文件的备份和lost+found文件,而这并不是数据文件,关闭数据库,删除该目录下的其它文件,将pfile move到/home/oracle下,重新执行clonedb.pl脚本
SQL> shutdown immediate ORA-01507: database not mounted ORACLE instance shut down. SQL> quit Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - Production With the Partitioning, OLAP, Data Mining and Real Application Testing options [oracle@node2 ~]$ cd /prod/backup/ [oracle@node2 backup]$ ls 0cq553i6_1_1 cf_D-PROD_id-289579616_0aq553hp data_D-PROD_I-289579616_TS-SYSAUX_FNO-2_08q553gd data_D-PROD_I-289579616_TS-SYSTEM_FNO-1_07q553dp data_D-PROD_I-289579616_TS-UNDOTBS1_FNO-3_09q553hj data_D-PROD_I-289579616_TS-USERS_FNO-4_0bq553i5 lost+found pfile.ora [oracle@node2 backup]$ rm 0cq553i6_1_1
[oracle@node2 backup]$ rm cf_D-PROD_id-289579616_0aq553hp [oracle@node2 backup]$ rm -rf lost+found/ [oracle@node2 backup]$ mv pfile.ora ~ [oracle@node2 backup]$ cd ~ [oracle@node2 ~]$ perl clonedb.pl pfile.ora crtdb.sql dbren.sql
重新执行crtdb.sql脚本,结果如下:
[oracle@node2 ~]$ sqlplus / as sysdba SQL*Plus: Release 11.2.0.4.0 Production on Mon Apr 27 18:13:39 2015 Copyright (c) 1982, 2013, Oracle. All rights reserved. Connected to an idle instance. SQL> @crtdb.sql SQL> SET FEEDBACK 1 SQL> SET NUMWIDTH 10 SQL> SET LINESIZE 80 SQL> SET TRIMSPOOL ON SQL> SET TAB OFF SQL> SET PAGESIZE 100 SQL> SQL> STARTUP NOMOUNT PFILE=/u01/app/oracle/oradata/orcl/initorcl.ora ORACLE instance started. Total System Global Area 313860096 bytes Fixed Size 1364340 bytes Variable Size 272633484 bytes Database Buffers 33554432 bytes Redo Buffers 6307840 bytes SQL> CREATE CONTROLFILE REUSE SET DATABASE orcl RESETLOGS 2 MAXLOGFILES 32 3 MAXLOGMEMBERS 2 4 MAXINSTANCES 1 5 MAXLOGHISTORY 908 6 LOGFILE 7 GROUP 1 '/u01/app/oracle/oradata/orcl/orcl_log1.log' SIZE 100M BLOCKSIZE 512, 8 GROUP 2 '/u01/app/oracle/oradata/orcl/orcl_log2.log' SIZE 100M BLOCKSIZE 512 9 DATAFILE 10 '/prod/backup/data_D-PROD_I-289579616_TS-SYSAUX_FNO-2_0pq5eur7', 11 '/prod/backup/data_D-PROD_I-289579616_TS-SYSTEM_FNO-1_0oq5eunu', 12 '/prod/backup/data_D-PROD_I-289579616_TS-UNDOTBS1_FNO-3_0rq5euss', 13 '/prod/backup/data_D-PROD_I-289579616_TS-USERS_FNO-4_0qq5eusl' 14 CHARACTER SET WE8DEC; Control file created.
执行dbren.sql脚本
SQL> @dbren.sql SQL> declare 2 begin 3 dbms_dnfs.clonedb_renamefile('/prod/backup/data_D-PROD_I-289579616_TS-SYSAUX_FNO-2_0pq5eur7' , '/u01/app/oracle/oradata/orcl/ora_data_orcl0.dbf'); 4 dbms_dnfs.clonedb_renamefile('/prod/backup/data_D-PROD_I-289579616_TS-SYSTEM_FNO-1_0oq5eunu' , '/u01/app/oracle/oradata/orcl/ora_data_orcl1.dbf'); 5 dbms_dnfs.clonedb_renamefile('/prod/backup/data_D-PROD_I-289579616_TS-UNDOTBS1_FNO-3_0rq5euss' , '/u01/app/oracle/oradata/orcl/ora_data_orcl2.dbf'); 6 dbms_dnfs.clonedb_renamefile('/prod/backup/data_D-PROD_I-289579616_TS-USERS_FNO-4_0qq5eusl' , '/u01/app/oracle/oradata/orcl/ora_data_orcl3.dbf'); 7 end; 8 / PL/SQL procedure successfully completed. SQL> show errors; No errors. SQL> alter database open resetlogs; alter database open resetlogs * ERROR at line 1: ORA-01190: control file or data file 1 is from before the last RESETLOGS ORA-01110: data file 1: '/u01/app/oracle/oradata/orcl/ora_data_orcl1.dbf' SQL> drop tablespace TEMP; drop tablespace TEMP * ERROR at line 1: ORA-01109: database not open SQL> create temporary tablespace TEMP; create temporary tablespace TEMP * ERROR at line 1: ORA-01109: database not open
下面我们来看看三个检查点:
1. 系统检查点
SQL> select checkpoint_change# from v$database; CHECKPOINT_CHANGE# ------------------ 791020 1 row selected.
2. 数据文件检查点
SQL> select file#,checkpoint_change#,last_change# from v$datafile; FILE# CHECKPOINT_CHANGE# LAST_CHANGE# ---------- ------------------ ------------ 1 791020 791020 2 791020 791020 3 791020 791020 4 791020 791020 4 rows selected.
3. 数据文件头检查点
SQL> select CHECKPOINT_CHANGE# from v$datafile_header; CHECKPOINT_CHANGE# ------------------ 791019 791019 791019 791019 4 rows selected.
大家发现没有?系统检查点和数据文件检查点是一致的,但数据文件头检查点是不一致的。
总结:
尝试了很多次,包括关闭数据库进行数据库冷备,数据文件头检查点和上述两个检查点的数值总是不一致,而且总是比上述检查点的数值少1,怀疑这是个bug,总之,这次实验就这样不了了之了,后续有时间的话再行测试。同时,官方文档中有一段话:
Because CloneDB databases use the backup files as their backend storage, the backup files must be available to each CloneDB database for it to run.If the backup files become unavailable, then the CloneDB databases return errors.
意思就是在运行克隆数据库的时候,备份文件必须是可用的,如果删除了备份文件或者卸载了NFS文件系统,则克隆数据库将不可用。
很好奇,可惜没有机会进行测试。
尽管该实验最后没有成功,但反复做了很多次,中间结果并不像实验步骤这样直观,有以下几点主要注意:
1> 刚开始做测试的时候,写的是/u01 *(rw),后来启用Direct NFS,创建tablespace做测试时会报以下错误:
SQL> create tablespace test datafile '/backup/prod/test.dbf' size 2m; create tablespace test datafile '/backup/prod/test.dbf' size 2m * ERROR at line 1: ORA-01119: error in creating database file '/backup/prod/test.dbf' ORA-27086: unable to lock file - already in use Linux Error: 37: No locks available Additional information: 10
后来改为/u01 *(rw,no_root_squash)就没有报该错误,RAC文档里推荐的是/u01 *(rw,sync,no_wdelay,insecure,insecure_locks,no_root_squash)。
2> 如果采用RMAN备份的话,一定要在备份模式下,即'alter database begin backup',如果直接backup as copy database format '/backup/prod/%U',在执行
dbren.sql脚本时,会出现以下错误:
SQL> @dbren.sql SQL> declare 2 begin 3 dbms_dnfs.clonedb_renamefile('/prod/backup/data_D-PROD_I-289579616_TS-SYSAUX_FNO-2_08q553gd' , '/u01/app/oracle/oradata/orcl/ora_data_orcl0.dbf'); 4 dbms_dnfs.clonedb_renamefile('/prod/backup/data_D-PROD_I-289579616_TS-SYSTEM_FNO-1_07q553dp' , '/u01/app/oracle/oradata/orcl/ora_data_orcl1.dbf'); 5 dbms_dnfs.clonedb_renamefile('/prod/backup/data_D-PROD_I-289579616_TS-UNDOTBS1_FNO-3_09q553hj' , '/u01/app/oracle/oradata/orcl/ora_data_orcl2.dbf'); 6 dbms_dnfs.clonedb_renamefile('/prod/backup/data_D-PROD_I-289579616_TS-USERS_FNO-4_0bq553i5' , '/u01/app/oracle/oradata/orcl/ora_data_orcl3.dbf'); 7 end; 8 / PL/SQL procedure successfully completed. SQL> show errors; No errors. SQL> alter database open resetlogs; alter database open resetlogs * ERROR at line 1: ORA-01152: file 1 was not restored from a sufficiently old backup ORA-01110: data file 1: '/u01/app/oracle/oradata/orcl/ora_data_orcl1.dbf'
3> 如果数据库启用了Direct NFS,在告警日志中会有如下信息:Oracle instance running with ODM: Oracle Direct NFS ODM Library Version 3.0。
同时,可查看v$dnfs_servers视图。
SQL> select * from v$dnfs_servers; ID SVRNAME DIRNA MNTPORT NFSPORT WTMAX RTMAX --- -------------------- ----- ---------- ---------- ---------- ------ 1 node3.being.com /u01 44046 2049 32768 32768
注意:并不是启动了Direct NFS,该视图就会填充,可通过创建测试表空间来验证create tablespace test datafile '/backup/prod/test.dbf' size 2m,如果能创建成功,该视图也会被填充的。
参考文档:
1. Direct NFS (DNFS) CloneDB in Oracle Database 11g Release 2 (Patchset 11.2.0.2 onward)