DBA入门之Oracle数据库的关闭
1.正常关闭数据库的状况
下面我们来看看数据库是怎样根据SCN和Checkpoint来进行一致性判断及恢复控制的。
有关SCN和CheckPoint的相关知识大家可以参考:
DBA入门之认识Oracle SCN(System Change Number)
我们知道,在控制文件和数据文件头上,对于每个数据文件都有一个“Checkpoint SCN”和“Stop SCN”。这些Checkpoint和SCN至关重要,Oracle通过比较这些SCN值来确定数据库是否需要恢复。
下面是来自一个Clean Shutdown的数据库的控制文件和数据文件头的内容。
因为数据库在关闭之前执行了完全检查点,所以线程检查点SCN和所有数据文件检查点SCN和数据文件Stop SCN都一致。
首先通过shutdown immediate关闭数据库,然后Mount状态转储获取控制文件内容:
[oracle@czjie ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Thu Nov 17 22:26:17 2011
Copyright (c) 1982, 2007, Oracle. All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.
Total System Global Area 218103808 bytes
Fixed Size 1266680 bytes
Variable Size 121637896 bytes
Database Buffers 92274688 bytes
Redo Buffers 2924544 bytes
Database mounted.
SQL> alter session set events 'Immediate trace name controlf level 12';
Session altered.
SQL> select u_dump.value || '/' || db_name.value || '_ora_' || v$process.spid || nvl2(v$process.traceid, '_' || v$process.traceid, null )|| '.trc' "Trace File" from v$parameter u_dump cross join v$parameter db_name cross join v$process join v$session on v$process.addr = v$session.paddr where u_dump.name = 'user_dump_dest' and db_name.name = 'db_name' and v$session.audsid=sys_context('userenv','sessionid');
Trace File
--------------------------------------------------------------------------------
/opt/ora10g/admin/ORCL/udump/ORCL_ora_3899.trc
这个trace文件里就记录了控制文件的详细内容。
(1) 关于数据库的相关信息。
***************************************************************************
DATABASE ENTRY
***************************************************************************
(size = 316, compat size = 316, section max = 1, section in-use = 1,
last-recid= 0, old-recno = 0, last-recno = 0)
(extent = 1, blkno = 1, numrecs = 1)
11/10/2011 22:40:12
DB Name "ORCL"
Database flags = 0x00404001 0x00001000
Controlfile Creation Timestamp 11/10/2011 22:40:17
Incmplt recovery scn: 0x0000.00000000
Resetlogs scn: 0x0000.0006ce7b Resetlogs Timestamp 11/10/2011 22:40:23
Prior resetlogs scn: 0x0000.00000001 Prior resetlogs Timestamp 06/30/2005 19:09:40
Redo Version: compatible=0xa200100
#Data files = 4, #Online files = 4
Database checkpoint: Thread=1 scn: 0x0000.000e6957 –这里就是检查点SCK
Threads: #Enabled=1, #Open=0, Head=0, Tail=0
enabled threads: 01000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Max log members = 3, Max data members = 1
Arch list: Head=2, Tail=2, Force scn: 0x0000.000de15cscn: 0x0000.000e47c4
Activation ID: 1294635980
Controlfile Checkpointed at scn: 0x0000.000e6899 11/17/2011 22:25:51
thread:0 rba:(0x0.0.0)
enabled threads: 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
(2) Redo信息。
***************************************************************************
REDO THREAD RECORDS
***************************************************************************
(size = 256, compat size = 256, section max = 8, section in-use = 1,
last-recid= 0, old-recno = 0, last-recno = 0)
(extent = 1, blkno = 9, numrecs = 8)
THREAD #1 - status:0xe thread links forward:0 back:0
#logs:3 first:1 last:3 current:2 last used seq#:0x27
enabled at scn: 0x0000.0006ce7b 11/10/2011 22:40:23
disabled at scn: 0x0000.00000000 01/01/1988 00:00:00
opened at 11/17/2011 22:25:46 by instance ORCL
Checkpointed at scn: 0x0000.000e6957 11/17/2011 22:26:32 –检查点信息
thread:1 rba:(0x27.764f.10)
enabled threads: 01000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
log history: 38
restore point keep sequence: 0
(3) 数据文件的检查点信息。
抽取一个数据文件的信息作为实例:
DATA FILE #3:
(name #5) /opt/ora10g/oradata/ORCL/sysaux01.dbf
creation size=0 block size=8192 status=0xe head=5 tail=5 dup=1
tablespace 2, index=3 krfil=3 prev_file=0
unrecoverable scn: 0x0000.00000000 01/01/1988 00:00:00
Checkpoint cnt:120 scn: 0x0000.000e6957 11/17/2011 22:26:32 --检查点SCN
Stop scn: 0x0000.000e6957 11/17/2011 22:26:32 --Stop SCN
Creation Checkpointed at scn: 0x0000.000019d1 06/30/2005 19:10:27
thread:0 rba:(0x0.0.0)
enabled threads: 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Offline scn: 0x0000.0006ce7a prev_range: 0
Online Checkpointed at scn: 0x0000.0006ce7b 11/10/2011 22:40:23
thread:1 rba:(0x1.2.0)
enabled threads: 01000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Hot Backup end marker scn: 0x0000.00000000
aux_file is NOT DEFINED
注意这里,数据库正常关闭后,由于执行了完整检查点,数据文件处于一致的状态,检查点SCN在此等于Stop SCN。
在此情况下,由于数据库处于一致状态,如果数据文件没有损失,下次启动Oracle就能够通过验证,顺利启动。
2.数据库异常关闭的情况
通过shutdown abort可以模拟一次异常,当使用shutdown abort方式关闭数据库时,Oracle会立即中断所有事物,关闭当前所有数据库连接,不执行检查点,理解关闭数据库。使用这种方式关闭数据库和断电以前的故障类似,数据库在下次启动时必须执行实例恢复才能启动,除非在特别紧急的情况下,否则不建议使用这种方式关闭数据库。
来看以下测试:
SQL> shutdown abort;
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.
Total System Global Area 218103808 bytes
Fixed Size 1266680 bytes
Variable Size 121637896 bytes
Database Buffers 92274688 bytes
Redo Buffers 2924544 bytes
Database mounted.
SQL> alter session set events 'immediate trace name controlf level 12';
Session altered.
SQL> select u_dump.value || '/' || db_name.value || '_ora_' || v$process.spid || nvl2(v$process.traceid, '_' || v$process.traceid, null )|| '.trc' "Trace File" from v$parameter u_dump cross join v$parameter db_name cross join v$process join v$session on v$process.addr = v$session.paddr where u_dump.name = 'user_dump_dest' and db_name.name = 'db_name' and v$session.audsid=sys_context('userenv','sessionid');
Trace File
--------------------------------------------------------------------------------
/opt/ora10g/admin/ORCL/udump/ORCL_ora_4002.trc
看看此时控制文件的内容。
1. 数据库的相关信息
在Database Entry部分,可以看到数据库的Thread Checkpoint信息:
***************************************************************************
DATABASE ENTRY
***************************************************************************
(size = 316, compat size = 316, section max = 1, section in-use = 1,
last-recid= 0, old-recno = 0, last-recno = 0)
(extent = 1, blkno = 1, numrecs = 1)
11/10/2011 22:40:12
DB Name "ORCL"
Database flags = 0x00404001 0x00001000
Controlfile Creation Timestamp 11/10/2011 22:40:17
Incmplt recovery scn: 0x0000.00000000
Resetlogs scn: 0x0000.0006ce7b Resetlogs Timestamp 11/10/2011 22:40:23
Prior resetlogs scn: 0x0000.00000001 Prior resetlogs Timestamp 06/30/2005 19:09:40
Redo Version: compatible=0xa200100
#Data files = 4, #Online files = 4
Database checkpoint: Thread=1 scn: 0x0000.000e6958 --检查点信息
Threads: #Enabled=1, #Open=1, Head=1, Tail=1
enabled threads: 01000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Max log members = 3, Max data members = 1
Arch list: Head=2, Tail=2, Force scn: 0x0000.000de15cscn: 0x0000.000e47c4
Activation ID: 1294635980
Controlfile Checkpointed at scn: 0x0000.000e6980 11/17/2011 22:39:39
thread:0 rba:(0x0.0.0)
enabled threads: 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
2. 控制文件记录的redo信息
在控制文件中,也可以找到REDO THREAD的检查点信息
***************************************************************************
REDO THREAD RECORDS
***************************************************************************
(size = 256, compat size = 256, section max = 8, section in-use = 1,
last-recid= 0, old-recno = 0, last-recno = 0)
(extent = 1, blkno = 9, numrecs = 8)
THREAD #1 - status:0xf thread links forward:0 back:0
#logs:3 first:1 last:3 current:2 last used seq#:0x27
enabled at scn: 0x0000.0006ce7b 11/10/2011 22:40:23
disabled at scn: 0x0000.00000000 01/01/1988 00:00:00
opened at 11/17/2011 22:39:36 by instance ORCL
Checkpointed at scn: 0x0000.000e6958 11/17/2011 22:39:36 –检查点信息
thread:1 rba:(0x27.764f.10)
enabled threads: 01000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
log history: 38
restore point keep sequence: 0
3. 数据文件检查点信息
同样,一下是控制文件中记录的数据文件检查点信息:
DATA FILE #3:
(name #5) /opt/ora10g/oradata/ORCL/sysaux01.dbf
creation size=0 block size=8192 status=0xe head=5 tail=5 dup=1
tablespace 2, index=3 krfil=3 prev_file=0
unrecoverable scn: 0x0000.00000000 01/01/1988 00:00:00
Checkpoint cnt:121 scn: 0x0000.000e6958 11/17/2011 22:39:36 –控制点信息
Stop scn: 0xffff.ffffffff 11/17/2011 22:26:32
Creation Checkpointed at scn: 0x0000.000019d1 06/30/2005 19:10:27
thread:0 rba:(0x0.0.0)
enabled threads: 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Offline scn: 0x0000.0006ce7a prev_range: 0
Online Checkpointed at scn: 0x0000.0006ce7b 11/10/2011 22:40:23
thread:1 rba:(0x1.2.0)
enabled threads: 01000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Hot Backup end marker scn: 0x0000.00000000
aux_file is NOT DEFINED
注意此处,由于数据库是一场关闭,数据库没有完成最后的检查点,数据文件在Stop SCN仍然为无穷大(ffffffff)。
在以上的信息中,各部分的Checkpoint SCN都一直,但是数据文件的Stop SCN不等于Checkpoint SCN,这意味着数据库上一次关闭没有执行完全教案差点,是异常关闭。此时启动数据库需要进行恢复。
4. 数据库的实例恢复
在数据库异常关闭之后,下次启动时,Oracle会自动执行实例恢复(Instance Recovery),实例恢复包括两个步骤:Cache Recovery和Transaction Recovery。
继续以上的测试,启动数据库之后可以从alert_<sid>.log文件中获得数据库关于恢复的相关信息:
SQL> alter database open;
Database altered.
Thu Nov 17 22:51:54 2011
alter database open
Thu Nov 17 22:51:54 2011
Beginning crash recovery of 1 threads
Thu Nov 17 22:51:54 2011
Started redo scan
Thu Nov 17 22:51:54 2011
Completed redo scan
190 redo blocks read, 72 data blocks need recovery
Thu Nov 17 22:51:55 2011
Started redo application at
Thread 1: logseq 39, block 30288
Thu Nov 17 22:51:55 2011
Recovery of Online Redo Log: Thread 1 Group 2 Seq 39 Reading mem 0
Mem# 0: /opt/ora10g/oradata/ORCL/redo02.log
Thu Nov 17 22:51:55 2011
Completed redo application
Thu Nov 17 22:51:55 2011
Completed crash recovery at
Thread 1: logseq 39, block 30478, scn 964614
72 data blocks read, 72 data blocks written, 190 redo blocks read
Thu Nov 17 22:51:55 2011
注意到Oracle在恢复过程中,首先读取日志,从最后完成的检查点开始,应用所有重做记录,这个过程叫前滚(Rolling Forward),也就是Cache Recovery过程,完成前滚之后,数据库可以被打开提供访问和使用。
此后进入实例恢复的第二阶段,Oracle回滚未提交的事务。Oracle使用两个特点来增加这个恢复阶段的效率,这两个特点是Fast-Start On-Demand Rollback和Fast-Start Parallel Rollback(这些特点是Fast-Start Fault Recovery的组成部分,仅在Oracle 8i自会后的企业版中可用)。
使用Fast-Start On-Demand Rollback特点,Oracle自动允许在数据库打开之后开始新的事务,这通常只需要很短的Cache Recovery时间。如果一个用户试图访问被一场终止进程锁定的记录,Oracle回滚那些新事物请求的记录,也就是说,因需求而回滚,然而,新事务不需要等待漫长的事务回滚时间。在Fast-Start On-Demand Rollback中,后台进程SMON充当一个调度员,使用多个服务器进程并行回滚一个事务集。
Fast-Start Parallel Rollback的一个特殊形式是内部事务恢复(Intra-Transaction Recovery)。在内部事务恢复中,一个大的事务可以被拆分,非配给几个服务器进程并行回滚。可以通过初始化参数FAST_START_PARALLEL_ROLLBACK来控制并行回滚,该参数有3个参数值。
FALSE:禁止Fast-Start Parallel Rollback。
LOW:限制恢复进程不能超过2倍的CPU_COUNT。
HIGH:限制恢复进程不能超过4个的CPU_COUNT.
作者:czjie
出处:http://www.cnblogs.com/czjie/
本文版权归作者和博客园共有,欢迎转载,但未经作者同意必须保留此段声明,且在文章页面明显位置给出原文连接,否则保留追究法律责任的权利。