oracle alter index rebuild offline与online

oracle index build online与offline
测试环境为oracle 11.2.0.4

--sql test
SQL> conn test/test
create table test.rb_test(id number,con varchar2(20));

begin
for i in 1..1000000 loop
insert into test.rb_test values(i,'ok');
commit;
end loop;
end;
/
SQL> select count(*) from test.rb_test;

  COUNT(*)
----------
   1000000
SQL> create index idx_rb_test on test.rb_test(id);

1 测试默认的rebuild,即rebuild offline

SQL> alter index idx_rb_test rebuild;

Index altered.
SQL> select sid,type,id1,id2,lmode,request,block from v$lock where type in('DL','TM','TX');
SQL> /

       SID TY         ID1    ID2     LMODE      REQUEST      BLOCK
---------- -- ---------- ---------- ---------- ---------- ----------
       133 DL       87286      0         3        0       0
       133 DL       87286      0         3        0       0
       133 TM       87286      0         4        0       0
       133 TX      524318       7858         6        0       0
这时候持有的LMODE 4的锁,即在rebuild 期间我们不能执行DML 操作

2 对表进行dml操作,不提交,在rebuild

--sesion 1
SQL> update  rb_test set con='hello test!' where id=1828;

1 row updated.
--session 2
SQL> alter index idx_rb_test rebuild; ##这时候rebuild报错
alter index idx_rb_test rebuild
            *
ERROR at line 1:
ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired
--session 1
SQL> commit;

Commit complete.
--session 2
SQL> /  ##执行成功

Index altered

3 在索引rebuild 期间,我们去update table,并查看v$lock

--session 1
SQL> set timing on
SQL> update  rb_test set con='hello test!' where id=1830;

1 row updated.

Elapsed: 00:00:00.00
SQL> commit;

Commit complete.

Elapsed: 00:00:00.01

SQL> alter index idx_rb_test rebuild;

Index altered.

Elapsed: 00:00:08.74

--session 2
SQL> update  rb_test set con='hello test!' where id=1830; ##会被133阻塞,该144等待REQUEST mode为3的lock

2 rows updated.

Elapsed: 00:00:07.67

--session 3
SQL> /

       SID TY         ID1    ID2     LMODE      REQUEST      BLOCK
---------- -- ---------- ---------- ---------- ---------- ----------
       133 DL       87286      0         3        0       0
       133 DL       87286      0         3        0       0
       144 TM       87286      0         0        3       0 ---REQUEST mode为3的lock
       133 TM       87286      0         4        0       1
       133 TX      458785      17460         6        0       0

Elapsed: 00:00:00.10

##session1 执行完成之后,只有session2的dml操作,持有2个锁,对表加模式为3的tm锁,对数据行的模式为6的tx锁

SQL> /

       SID TY         ID1    ID2     LMODE      REQUEST      BLOCK
---------- -- ---------- ---------- ---------- ---------- ----------
       144 TM       87286      0         3        0       0
       144 TX      262164      17266         6        0       0

Elapsed: 00:00:00.07

4 rebuild online

step 1: update rb_test set con='hello test!' where id=1833; no commit;
step 2: rebuild online
step 3: update
SQL> select sys_context('userenv','sid') from dual;
--session 1,sid=133

SQL> update  rb_test set con='hello test!' where id=1833;

2 rows updated.

Elapsed: 00:00:00.01
SQL> select sid,type,id1,id2,lmode,request,block from v$lock where type in('DL','TM','TX');

       SID TY         ID1    ID2     LMODE      REQUEST      BLOCK
---------- -- ---------- ---------- ---------- ---------- ----------
       133 TM       87286      0         3        0       0
       133 TX       65554      17262         6        0       0

Elapsed: 00:00:00.06
--session 2,sid=144
SQL> alter index idx_rb_test rebuild online; ##hang住
--session 3,sid=146
SQL> update  rb_test set con='hello test!' where id=1837;

2 rows updated.
这里session 3 顺利执行,没有什么锁等待(11.2.0.4)
查看锁情况
SQL> /

       SID TY         ID1    ID2     LMODE      REQUEST      BLOCK
---------- -- ---------- ---------- ---------- ---------- ----------
       144 TX       65554      17262         0        4       0 ##这里144即session2在等待请求锁为4的锁
       144 DL       87286      0         3        0       0
       144 DL       87286      0         3        0       0
       144 TM       87286      0         2        0       0
       144 TM       87308      0         4        0       0
       133 TM       87286      0         3        0       0
       144 TX      131102      30599         6        0       0
       133 TX       65554      17262         6        0       1

8 rows selected.

当前有大量事务在操作,这时候去rebuild online,这个操作就会需要去拿TX 4的lock,
在oracle 11g之前,这个rebuild online之后的dml操作都会锁等待,之后的事务没有办法进行,解决方法就是找到阻止rebuild online的拿到想tx 4的锁。
把它kill掉。这样rebuild online操作就会继续执行,后续的dml操作就可以继续。
这点和rebuild offline的区别,后者直接报错返回ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired

--session 1 提交
SQL> commit;

Commit complete.

Elapsed: 00:00:00.01
SQL> select sid,type,id1,id2,lmode,request,block from v$lock where type in('DL','TM','TX');

       SID TY         ID1    ID2     LMODE      REQUEST      BLOCK
---------- -- ---------- ---------- ---------- ---------- ----------
       144 TX      393247      22992         0        4       0
       144 DL       87286      0         3        0       0
       144 DL       87286      0         3        0       0
       146 TM       87286      0         3        0       0
       144 TM       87286      0         2        0       0
       144 TM       87308      0         4        0       0
       146 TX      393247      22992         6        0       1
       144 TX      131102      30599         6        0       0
133支持的锁已经释放掉,144等待146的tx 4的锁
--session 3 提交
SQL> commit;

Commit complete.

Elapsed: 00:00:00.00
--sesion 2完成
SQL> alter index idx_rb_test rebuild online;

Index altered.

Elapsed: 00:02:29.98
由此可见rebuild online 是个需要谨慎使用的命令

引用 参考

--https://blog.csdn.net/tianlesoftware/article/details/6538928
--http://www.itpub.net/thread-1445427-1-1.html

kill 掉正在rebuild online 的session。 可能会导致在下次rebuild index或者drop,analyze 的时候报ORA-08104的错误。 
因为在异常终止online rebuild操作的时候,oracle没来得及清理相应的临时段和标志位,系统认为online rebuild操作还在进行造成的

SQL> alter index idx_rb_test rebuild online; ##CTRL+C
^Calter index idx_rb_test rebuild online
*
ERROR at line 1:
ORA-01013: user requested cancel of current operation


Elapsed: 00:00:03.08

SQL> exec dbms_stats.gather_table_stats('TEST','RB_TEST');

PL/SQL procedure successfully completed.

Elapsed: 00:00:01.09
SQL> alter index idx_rb_test rebuild online;

Index altered.

Elapsed: 00:00:03.99
SQL> /  ##执行期间用另外的session kill掉该session
alter index idx_rb_test rebuild online
*
ERROR at line 1:
ORA-00028: your session has been killed
ORA-00028: your session has been killed
--sesion 2
SQL> alter system kill session '133,56593';

System altered.

Elapsed: 00:00:01.01
SQL> conn test/test
Connected.
SQL> alter index idx_rb_test rebuild online;

Index altered.

Elapsed: 00:00:05.58
还是没有报错
SQL> alter index idx_rb_test rebuild online; ##这次关闭ssh
再次连接执行
SQL> conn test/test 
Connected.
SQL> alter index idx_rb_test rebuild online;

Index altered.

SQL> 
还是没有报错
如果报错的话
ERROR at line 1:

ORA-08104: this index object 53367 is being online built or rebuilt
SQL> select obj#,flags from ind$ where obj#=53367;
OBJ#      FLAGS

---------- ----------

     53367        514
sql.bsq 是个总的说明,在dcore.bsq 里找到了ind$的创建SQL:
/* mutable flags: anything permanent should go into property */
/* unusable (dls) : 0×01 */
/* analyzed : 0×02 */
/* no logging : 0×04 */
/* index is currently being built : 0×08 */
/* index creation was incomplete : 0×10 */
/* key compression enabled : 0×20 */
/* user-specified stats : 0×40 */
/* secondary index on IOT : 0×80 */
/* index is being online built : 0×100 */
/* index is being online rebuilt : 0×200 */
/* index is disabled : 0×400 */
/* global stats : 0×800 */
/* fake index(internal) : 0×1000 */
/* index on UROWID column(s) : 0×2000 */
/* index with large key : 0×4000 */
/* move partitioned rows in base table : 0×8000 */
/* index usage monitoring enabled : 0×10000 */
这里的0×200 等是十六进制来表示的。 Flags 是514, 其16进制是是202,514=0×202,表示该索引状态为index is being online rebuilt : 0×200 + analyzed : 0×02
SQL> select to_char(514,'xxxxxxxxxxxxxxx') from dual;

TO_CHAR(514,'XXX
----------------
         202
 在上面,我们说减去512. 512 的16进制是200. 对应的是:/* index is being online rebuilt : 0×200 */。 所以,我们在rebuild的时候,会对flags 加上512.
 MOS 803008.1 上的说明:
SMON should cleanup the failed online index rebuild operation and so correct this. However, 
if the table is highly active with transactions, SMON may not be able to get the required lock and so the index will not get cleaned up. 
In such situations, you can manually cleanup the failed index rebuild using the DBMS_REPAIR.ONLINE_INDEX_CLEAN procedure.
To do this, if activity on the problem table can be stopped, then simply execute:
connect / as sysdba
select dbms_repar.online_index_clean(<problem index object_id>) from dual; 
exit

select dbms_repair.online_index_clean(53367) from dual;

DECLARE
RetVal BOOLEAN;
OBJECT_ID BINARY_INTEGER;
WAIT_FOR_LOCK BINARY_INTEGER;
BEGIN
OBJECT_ID := 53367;
WAIT_FOR_LOCK := NULL;
RetVal := SYS.DBMS_REPAIR.ONLINE_INDEX_CLEAN ();
COMMIT;
END;
/

select obj#,flags from ind$ where obj#=53367;
OBJ#     FLAGS

---------- ----------

53367      2
在执行clean命令的时候,可能会遇到:
ORA-00054: resource busy and acquire with NOWAIT specified
多执行几次就ok了。 应该也是和这个锁有关。

========
- Online Index rebuild takes a long time.
- ONLINE INDEX REBUILD SCANS THE BASE TABLE AND NOT THE INDEX
Symptoms:
=========
Performance issues while rebuilding very large indexes.
- The offline rebuilds of their index is relatively quick -finishes in 15 minutes.
- Issuing index rebuild ONLINE statement => finishes in about an hour.
- This behavior of ONLINE index rebuilds makes it a non-option for large tables
as it just takes too long to scan the table to rebuild the index. The offline may not be feasible due to due to the 24/7 nature of the database.
- This may be a loss of functionality for such situations.
- If we attempt to simultaneously ONLINE rebuild the same indexes we may encounter hanging behavior indefinitely (or more than 6 hours).
DIAGNOSTIC ANALYSIS:
--------------------
We can trace the sessions rebuilding the indexes with 10046 level 12.
Comparing the IO reads for the index-rebuild and the index-rebuild-online reveals the following:
-ONLINE index rebuilds
It scans the base table and it doesn't scan the blocks of the index.
-OFFLINE index rebuilds
It scans the index for the build operation.
- This behaviour is across all versions.
Cause
Cause/Explanation
=============
When you rebuild index online,
- it will do a full tablescan on the base table.
- At the same time it will maintain a journal table for DML data, which has
changed during this index rebuilding operation.
So it should take longer time, specially if you do lots of DML on the same table,while rebuilding index online.
On the other hand, while rebuilding the index without online option, Oracle will grab the index in X-mode and rebuild a new index segment by
selecting the data from the old index. So here we are
- not allowing any DML on the table hence there is no journal table involved
- and it is doing an index scan
Hence it will be pretty fast.
Fix
Solution/Conclusion:
===========
- The ONLINE index rebuild reads the base table, and this is by design.
- Rebuilding index ONLINE is pretty slow.
- Rebuilding index offline is very fast, but it prevents any DML on the base table.

-- rebuild index online的时候,会选择全表扫描,同时会维护一个中间日志表,用来记录在rebuild 期间的增量数据,原理类似于物化视图日志,
日志表是一个索引组织表(IOT),这张中间表只有插入,不会有删除和修改操作,而且只有主键条件查询,正是IOT最合适的场景。
--rebuild offline时,选择的6模式的X 锁,它根据old index 来rebuild。 因此不允许进行DML,也就没有中间表。因此也比较块。

DML操作一般要加两个锁,一个是对表加模式为3的TM锁,一个是对数据行的模式为6的TX锁。只要操作的不是同一行数据,是互不阻塞的
在rebuild index online 的开始和结束阶段时,需要短暂的对表持有模式为4的TM锁的,当获取到4级别的锁之后,才降为2级。
如果rebuild online一直没获取到4级别的锁,那么相关的DML全部产生等待。 在执行期间只持有模式2的TM锁,不会阻塞DML操作。
在Oracle 11g之后,oracle做了特殊处理,后续的dml不会被rebuild online的4级别锁阻塞.

所以如果在执行rebuild index online前长事务,并且并发量比较大,则一旦执行alter index rebuild online,可能因为长事务阻塞,
可能导致系统瞬间出现大量的锁,对于压力比较大的系统,这是一个不小的风险。这是需要迅速找出导致阻塞的会话kill掉,
rebuild index online一旦执行,不可轻易中断,否则可能遇到ORA-08104

 

posted @ 2019-05-20 17:49  春困秋乏夏打盹  阅读(2495)  评论(0编辑  收藏  举报