Oracle性能调整指导纲要

Oracle性能调整指导纲要

一、Oracle性能优化方法论

1．为什么（what）要优化（系统慢了？慢是表象）

2．怎样（how）优化？（需要找到慢的原因）

a．是系统的问题？

b．是数据库的问题？

3．谁（who）来优化？

a．系统架构师（系统架构设计的有问题，麻烦大了）

b．系统管理员（os、storage问题）

c．数据库管理员（db问题）

d．应用程序设计人员（应用设计问题）

e．应用程序开发人员（代码书写问题）

是谁的问题由谁来承担，找到理由拒绝让他们说不…

4．什么时候（when）优化？（优化时是否需要停应用：7×24）

5．制定相应的调整策略

6．组建调整的人员

7．开始调整

8．观察性能的变化

9．再调整再观察，直到性能可以接受

10．调整结束，撰写调整报告，开会总结，分享经验

二、 oracle是如何工作的（How Oracle Works）

How Oracle Works
The following example describes the most basic level of operations that Oracle performs. This illustrates an Oracle configuration where the user and associated server process are on separate computers (connected through a network).
An instance has started on the computer running Oracle (often called the host or database server).
A computer running an application (a local computer or client workstation) runs the application in a user process. The client application attempts to establish a connection to the server using the proper Oracle Net Services driver.
The server is running the proper Oracle Net Services driver. The server detects the connection request from the application and creates a dedicated server process on behalf of the user process.
The user runs a SQL statement and commits the transaction. For example, the user changes a name in a row of a table.
The server process receives the statement and checks the shared pool for any shared SQL area that contains a similar SQL statement. If a shared SQL area is found, then the server process checks the user's access privileges to the requested data, and the previously existing shared SQL area is used to process the statement. If not, then a new shared SQL area is allocated for the statement, so it can be parsed and processed.
The server process retrieves any necessary data values from the actual datafile (table) or those stored in the SGA.
The server process modifies data in the system global area. The DBWn process writes modified blocks permanently to disk when doing so is efficient. Because the transaction is committed, the LGWR process immediately records the transaction in the redo log file.
If the transaction is successful, then the server process sends a message across the network to the application. If it is not successful, then an error message is transmitted.
Throughout this entire procedure, the other background processes run, watching for conditions that require intervention. In addition, the database server manages other users' transactions and prevents contention between transactions that request the same data.

三、合理使用计算机资源

和计算机交互无非是和下面5种重要资源交互：

1．File (disk)

IO问题无疑是计算机发展的最大瓶颈，因此在部署storage时一定要考虑io的吞吐量(throughout)

2．Memory (为什么计算机要使用内存？减少物理io)

3．Process

a．thread（windows）

b．process（unix/linux）

4．Network（*）

5．Cpu （*）

1.过度消耗cpu资源的sql诊断和调整

2.latch等待会消耗大量cpu

3.大量解析消耗cpu资源

三、 IO调整

1．IO调整的原则

a．最小化io的访问（要贯穿整个应用的设计）

b．均衡各类文件的分布(ctl,dbf,redo,archivedlog,tempfile)

c．合理使用temporary tablespace（group）

2．存储级调整

a．存储规划不合理，导致io过于集中

b．寻找证据

1）iostat

2）Topas

3）Select * from v$filestat

4）Select * from v$tempstat

c．Sa(system administrator)或者厂商协助调整

3．Db级调整

a．了解数据库的物理结构

b．合理规划和io相关的文件分布

c．均衡（分散）io

d．文件分布的原则

1）controlfile单独存放，mirror不要过多，2份足以

2）redo单独存放，不要放在raid 5上，member不要过多，2份足以

3）undo单独存放

4）datafile尽可能的要分散，尤其是分区表和分区索引所存储的数据文件

5）tempfile文件单独存放，不同的用户使用不同的临时表空间，最好能使用temporary tablespace group

6）index 和data分离，对io影响不大，便于管理（index的原理rowid）

7）archivelog和所有其他文件分离，要务必和redo绝对分离

四、 内存调整

1．os级调整

a．寻找证据

1）vmstat （换页严重 page in page out）

2）topas

b．明确os内存不足的原因

2．db级调整

a．了解sga的作用及组成部分

1）advisor的激活和使用

2）调整sga_target

3）调整buffer cache

4）调整shared pool

b．参数sga_max_size的作用，9.2引入该参数之后可以动态调整内存

c．lock_sga的作用，防止sga被os swap出去，windows下不起作用

d．了解pga的作用

调整pga_aggregate_target

e．理解work area的重要概念有助于理解和调整pga

五、 process调整

1．进程的种类和作用

a）background

b）server

c）user（client）

2．专用模式和共享模式

3. 合理的调整process参数

4. windows客户端v$session里字段process的含义

六、 block空间使用调整

1．Dmt和lmt

2．Assm(bitmap)和mssm（freelist）

3．Pctfree和pctused的作用以及相互制约关系

七、 lock和latch争夺

1．row level lock

2．table level lock(表级锁的兼容性)

3．Block（阻塞）和deadlock

4．latch(在wait event中介绍)

八、 statistics

1．什么是统计信息?它的作用是什么？

2．实例级的统计信息v$sysstat

3．Session级别的统计信息v$sesstat

4．Optimizer statistics（在应用调整里介绍）

5．Optimizer statistics对oracle生成执行计划产生重要影响

6．系统统计信息

九、 wait event

1. 什么是等待事件（v$event）

2. oracle通过等待事件来反应系统的性能问题

3. 通过等待事件诊断系统性能问题

十、 Metric(度量、指标)和baseline

1．Metric比statistics粒度更细，可以从多角度分析问题

2．Metric的引入主要为了实现proactive（积极、主动）维护数据库

3．Baseline（基线）的作用，Baseline的引入为了方便、准确的给Metric设定值，从而使系统根据Metric产生更有意义的alert，根据这些alert我们能够事先维护系统最终把问题处理在萌芽状态

十一、 statspack、awr、addm、ash

工欲善其事，必先利其器

Oracle中重要性能诊断工具介绍

十二、 Checkpoint和redo调整

1．实例恢复

2．控制实例恢复的时间其实就是控制checkpoint发生的频率

3．调整redo的大小

数据库逻辑结构设计和应用级别的调整

十三、应用设计

1. 在设计表时严格遵守第三范式

2. 理解不同的应用对数据的存储和访问方式以便使用不同的oracle table type

3. oracle 支持的table type

a) heap

b) iot

c) partition

d) cluster

e) temporary

4. 合理使用index

5. 合理使用view

View可以增加程序的可读性、控制数据的安全访问，也是面向对象特性的一个体现，但是view的使用在某种程度上对sql性能没有任何提高

6. 合理使用trigger

a. 在大型的、复杂的应用里尽可能的少使用trigger，使用它方便的同时可能不仅会使应用的灵活性受到限制同时可能会对性能产生影响

b. 如果trigger实现的功能很复杂，那么最好把复杂的业务逻辑通过procedure或者function来实现之后再在trigger里调用process或者function

7. 要尽可能的多使用procedure、function或者package

a. procedure、function，package进行了预编译

b. procedure、function，package中sql自动使用绑定变量

8. 尽可能的多使用synonym

Synonym的引入是oracle面向对象思想的又一重要体现，在procedure、function，package里尽可能的多使用Synonym以减少由于对象（如表）名字改变而大量修改引用该对象的procedure、function，package

9. 少使用sequence

使用sequence有时候确实很方便，不过sequence不能保证绝对的连续，而且如果对sequence的一些特性（如cache）控制不好会对性能产生影响

10.合理使用外键

在大型的、复杂的应用里尽可能的少使用外键，外键的使用会使数据的完整性、一致性得到有力保障，但是它的使用有时会使应用的灵活性受到很大限制，如果不使用它，那么数据的完整性和一致性完全需要我们应用开发人员自己来控制，这对应用设计和开发人员都提出了很高的要求，这个到底是否使用根据自己的应用特点和团队技术实力自行选择

11.绑定变量

a. 在以oltp为主的应用系统中对使用绑定变量，因为在oltp系统里都是一些小的事务对应的sql，sql的解析可能消耗的资源往往大于执行消耗的资源，因此我们尽可能的要减少解消耗的资源对系统性能的影响

b. 在以olap为主的应用系统里通常不需要使用绑定变量，因为在olap应用系统里往往都是一些大的事务对用的长时间执行的sql，sql的执行时间远远大于sql的解析时间，因此使用绑定变量没有太多意义

十四、应用调整

1.Optimizer介绍(cbo，rbo)

2.Rbo介绍

3.Execute plan

a.set auto trace [on|traceonly {explain|statistics}]

b.explain PLAN [ SET STATEMENT_ID [=] < string literal > ] [ INTO < table_name > ]
FOR < sql_statement >

创建plan_table: $ORACLE_HOME/rdbms/admin/utlxplan.sql

select*from table(dbms_xplan.display('PLAN_TABLE','TEST','ADVANCED'));

c.v$sql_plan

d.10046 event

e.plsql developer F5

4.Index的原理

rowid的格式以及在index leaf node中的存储

5.Index的种类

a. b*tree

b. bitmap

6.Index的scan

a.Unique scan

b．Scan scan

c．Desc scn

d．Index full scan

e．Fast full scan

f．Index skip scan

7.Cbo介绍

8.Optimizer statistics

Dbms_stats介绍

9.Table access sort

a. ordered

b. leading

c. use_hash

d. use_nl

e. use_merge

10.Optimizer hint

a.index

b.no_index

c.full

d.rull

11.Table join method

a.loop nested join

b.sort merge join

c.hash join

12.Plan stability

13.Stored outline

14.确定high load sql

a.v$sqlarea

b.ASH

15.Sql trace+tkprof

十五、sql调整指导公式

用友著名sql调优专家李奎阳（pub id：anlinew）总结了他多年来的sql调优公式：T=S/V

原文连接：

http://www.itpub.net/viewthread.php?tid=931723&highlight=%2Banlinew

下面是引用他的解释：

我的SQL调优公式T＝S/V

比较愚钝，从事调优工作这么些年，处理过的性能问题也不少，也就总结出这么个公式来

S指sql所需访问的资源总量，V指sql单位时间所能访问的资源量，T自然就是SQL执行所需时间了这个公式很简单，调优的目标是减少T，T＝S/V，那么我们要调的对象自然就要放到这里的S和V上，调优的方法就是围绕减少S和增大V增大V（单位时间所能访问的资源量），在硬件设备不变的情况下所能做的只有充分利用现有资源，如：通过调整SGA充分利用物理MEMORY，通过并行处理充分利用CPU，通过调整IO分布充分利用硬盘处理能力，还有。。。。升级设备？减少S（sql所需访问的资源总量），这通常是调优工作的重中之重，SQL调优的主要目的就是围绕着如何减少S在进行，在ORACLE中，所需访问的资源以block记，一条SQL执行所读写的block数直接影响到SQL的执行时间，如何知道SQL执行所读写的block数呢，简单的方法如下：
set autotrace traceonly
SQL> select count(*) from test;已用时间: 00: 00: 10.01执行计划
----------------------------------------------------------
Plan hash value: 1950795681

-------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 19342 (1)| 00:03:53 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| TEST | 5285K| 19342 (1)| 00:03:53 |
-------------------------------------------------------------------统计信息
----------------------------------------------------------
204 recursive calls
0 db block gets
70655 consistent gets
70616 physical reads
0 redo size
422 bytes sent via SQL*Net to client
416 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
6 sorts (memory)
0 sorts (disk)
1 rows processed通过上面的信息我们可以看到此SQL访问的block数为70655，并且基本上是物理读，其执行时间为00: 00: 10.01大家可能一眼就看出来了，这条SQL执行了全表扫描，加索引优化就可以了，没错，索引访问正是减少SQL所需访问资源的一个主要途径其效果也很明显已用时间: 00: 00: 01.89执行计划
----------------------------------------------------------
Plan hash value: 826211483

-------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3089 (2)| 00:00:38 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | INDEX FAST FULL SCAN| I_TEST1 | 5285K| 3089 (2)| 00:00:38 |
-------------------------------------------------------------------------统计信息
----------------------------------------------------------
1 recursive calls
0 db block gets
11218 consistent gets
11197 physical reads
0 redo size
422 bytes sent via SQL*Net to client
416 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed其访问的block数变为11218，其执行时间随之变为00: 00: 01.89通过减少S，我们可以看到T得到明显的减小上面情况是在V不变的情况下（都是物理读）的差别再看看V最大化的结果

SQL> select count(*) from test;执行计划
----------------------------------------------------------
Plan hash value: 826211483

-------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3089 (2)| 00:00:38 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | INDEX FAST FULL SCAN| I_TEST1 | 5285K| 3089 (2)| 00:00:38 |
-------------------------------------------------------------------------统计信息
----------------------------------------------------------
0 recursive calls
0 db block gets
11218 consistent gets
0 physical reads
0 redo size
422 bytes sent via SQL*Net to client
416 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed已用时间: 00: 00: 00.79

SQL> select /*+parallel(test 2)*/ count(*) from test;执行计划
----------------------------------------------------------
Plan hash value: 826211483

-------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3089 (2)| 00:00:38 |

posted @ 2010-10-26 08:32 无双的小宝阅读(927) 评论(0) 编辑收藏举报

刷新页面返回顶部

无双小宝

看着看着就淡了，想着想着就忘了

Oracle性能调整指导纲要

数据库逻辑结构设计和应用级别的调整

公告