Oracle Alerts 与 Metrics(警告与度量)说明
一.Alerts 和 Metrics说明
官方链接:
Monitoring and Tuning the Database
http://docs.oracle.com/cd/E11882_01/server.112/e10897/montune.htm#CACCIHAB
Alerts help you monitor your database. Most alerts notify youof when particular metric thresholds are exceeded.For each alert, you can set critical and warning threshold values. Thesethreshold values are meant to be boundary values that when exceeded, indicatethat the system is in an undesirable state. For example, when a tablespacebecomes 97 percent full, this can be considered undesirable, and OracleDatabase generates a critical alert.
--Alerts 可以帮助我们监控数据库,大部分alerts 信息都是超过metric thresholds。 对于每一个alert,都可以设置critical和 warning threshold值。 当系统达到这个alert时,就说明这个指标已经处于异常状态。 比如当表空间使用了97%,那么就可以认为已经异常,数据库就会生成一条critical alert。
这些信息在OEM 上都可以很直观的查看。
企业管理器(OEM)介绍: Grid Control 和DatabaseControl
http://blog.csdn.net/tianlesoftware/article/details/5704374
http://blog.csdn.net/tianlesoftware/article/details/4702978
Other alertscorrespond to database events such as Snapshot Too Old or Resumable Sessionsuspended. These types of alerts indicate that the event has occurred.
另一种alerts与db events 相关,如snapshot too old或resumable sessionsuspended。 当发生这些alert时,就会生成对应的alert。
In addition tonotification, you can set alerts to perform some action such as running ascript. For instance, scripts that shrink tablespace objects can be useful fora Tablespace Usage warning alert.
By default, Oracle Database issues severalalerts, including the following:
--默认情况下,oracle会发布如下的alerts:
(1) Archive Area Used (warning at80 percent full)
(2) Broken Job Count and Failed JobCount (warning when goes above 0)
(3) Current Open Cursors Count(warning when goes above 1200)
(4) Dump Area Used (warning at 95percent full)
(5) Session Limit Usage (warning at90 percent, critical at 97 percent)
(6) Tablespace Space Used (warningat 85 percent full, critical at 97 percent full)
You can modify these alerts and others bysetting their metrics.
这些alerts 和其他的Metrics 值都可以使用DBMS_SERVER_ALERT包的SET_THRESHOLD,GET_THRESHOLD 方法来修改和查看,也可以直接使用OEM来进行修改。
与Alerts 相关的数据字典如下表:
View | Description |
DBA_THRESHOLDS | Lists the threshold settings defined for the instance |
DBA_OUTSTANDING_ALERTS | Describes the outstanding alerts in the database |
DBA_ALERT_HISTORY | Lists a history of alerts that have been cleared |
V$ALERT_TYPES | Provides information such as group and type for each alert |
V$METRICNAME | Contains the names, identifiers, and other information about the system metrics |
V$METRIC | Contains system-level metric values |
V$METRIC_HISTORY | Contains a history of system-level metric values |
小结一下:
Alert 信息的产生分两种: OEM 产生 和 DB Server 产生。
(1)OEM 是通过EMD(OEMdaemon)直接访问SGA 获取。
(2)DB Server 是通过MMON 进程获取。 Server产生的alert信息在alert_que中排队,该队列的主要consumer是DBCONSOLE。
DB Server产生的alerts 有分两种:基于threshold values 和 event。
(1)基于threshold的alert信息可以在DBA_OUTSTANDING_ALERTS中。当被clear的时候,会进入DBA_ALERT_HISTORY,而alert history被purge也是基于AWR信息的purge策略。
(2)基于事件的alert比如下面这些:SnapshotToo Old, ResumableSession Suspended。这部分alert信息由DBCONSOLE控制,因为DBCONSOLE会把相关信息存放在自己的repository中。
这里要强调一点MMON 进程,在之前整理的Blog里有说明:
http://blog.csdn.net/tianlesoftware/article/details/6711694
The manageabilitymonitor process (MMON) performs many tasks related to the AutomaticWorkload Repository (AWR). For example, MMON writes when a metric violatesits threshold value, taking snapshots, and capturing statistics value forrecently modified SQL objects.
The manageabilitymonitor lite process (MMNL) writes statistics from the Active SessionHistory (ASH) buffer in the SGA to disk. MMNL writes to disk when the ASHbuffer is full.
From: http://docs.oracle.com/cd/E11882_01/server.112/e25789/process.htm#CNCPT89087
通过官网上的这段说明,可以很清楚的理解MMON具体的工作:MMONwrites when a metric violates its threshold value, taking snapshots,and capturing statistics value for recently modified SQL objects.
MMON主要执行一些于AWR 相关的任务。如检查metric,产生快照,捕捉统计信息. 这里分Base Statistics和Metrics:
(1) base statistics是指收集的裸信息。
(2) Metrics是基于base statistics的第二层信息,可以跟踪数据库的活动变化情况。该统计信息每分钟由MMON进程进行更新。METRIC的值由MMON进程计算,保存在内存中一个小时。
Metrics每分钟更新记录可以通过相关的视图来确认,这些视图有:
V$SYSMETRIC,
V$SESSMETRIC,
V$SERVICEMETRIC,
V$METRICNAME
V$FILEMETRIC,
V$EVENTMETRIC,
V$WAITCLASSMETRIC
V$SYSMETRIC_HISTORY,DBA_HIST_*
如:
SYS@anqing1(rac1)> alter session setnls_date_format='yyyy-mm-dd hh24:mi:ss';
Session altered.
SYS@anqing1(rac1)> selectbegin_time,end_time from V$EVENTMETRIC where rownum=1;
BEGIN_TIME END_TIME
------------------- -------------------
2011-12-28 01:27:08 2011-12-28 01:28:07
通过以上的查询,可以确认其时间间隔是每分钟一次。
系统每秒对v$session进行抽样,放在SGA中的ASH区,该ASH区采用的是循环使用的机制,V$ACTIVE_SESSION_HISTORY中包含的是活动会话的信息。不是所有的ASH数据被写入磁盘,因为数据量太大,所以进行了过滤。通常每60分钟(快照默认时间)MMON进程会写这些信息,当ASH BUFFER满的话MMNL进程会写。
默认情况下AWR快照一个小时收集一次,在Oracle 10g里,快照保存7天,11g 保留8天。AWR产生snapshot的频率和保留策略修改参考:
Oracle AWR(AutomaticWorkload Repository) 说明
http://blog.csdn.net/tianlesoftware/article/details/4682300
http://blog.csdn.net/tianlesoftware/article/details/4668723
二. 相关的视图
之前整理过一篇根据dba_errors视图来查看报警的文章:
http://blog.csdn.net/tianlesoftware/article/details/6433418
DBA_ERRORS:describes the current errors on allstored objects in the database.
dba_errors 视图记录的是数据库对象上当前存在的错误,这些错误信息包含对象名称,类型,所有者,错误类型和错误原因,比如存储过程有语法错误,没有编译通过,这时就会在dba_errors里留下记录。 这里我们看的是其他的几个相关视图。
2.1 V$SYSMETRIC_HISTORY
V$SYSMETRIC_HISTORY displaysall system metric values available in the database. Both long duration(60-second with 1 hour history) and short duration (15-second with one-intervalonly) metrics are displayed by this view.
Column | Datatype | Description |
BEGIN_TIME | DATE | Begin time of the interval |
END_TIME | DATE | End time of the interval |
INTSIZE_CSEC | NUMBER | Interval size (in hundredths of a second) |
GROUP_ID | NUMBER | Metric group ID |
METRIC_ID | NUMBER | Metric ID |
METRIC_NAME | VARCHAR2(64) | Metric name |
VALUE | NUMBER | Metric value |
METRIC_UNIT | VARCHAR2(64) | Metric unit description |
http://docs.oracle.com/cd/E11882_01/server.112/e25513/dynviews_3091.htm
2.2 DBA_OUTSTANDING_ALERTS
DBA_OUTSTANDING_ALERTS describesalerts which the server considers to be outstanding.
http://docs.oracle.com/cd/E11882_01/server.112/e25513/statviews_4167.htm
在Alerts 体系中,基于threshold 的alert 信息可以通过dba_outstanding_alerts视图查看,当这个警告被clear 后,信息会被转移,那么该警告信息可以通过dba_alert_history视图查看, 该视图里的历史数据也有有限的,其存放周期也受AWR的purge 策略影响。
比如我们的表空间使用率过高,超过了threshold 值,那么可以通过 DBA_OUTSTANDING_ALERTS查看到,当我们添加数据文件之后,警告接触,我们就可以从dba_alert_history视图里查看。
2.3 DBA_ALERT_HISTORY
DBA_ALERT_HISTORY describesa time-limited history of alerts which are no longer outstanding.
http://docs.oracle.com/cd/E11882_01/server.112/e25513/statviews_3044.htm
示例:
select object_type,reason,creation_time,suggested_action,metric_value from dba_alert_history
-------------------------------------------------------------------------------------------------------
版权所有,文章允许转载,但必须以链接方式注明源地址,否则追究法律责任!
Email: tianlesoftware@gmail.com
Skype: tianlesoftware
Blog: http://www.tianlesoftware.com
Weibo: http://weibo.com/tianlesoftware
Twitter: http://twitter.com/tianlesoftware
Facebook: http://www.facebook.com/tianlesoftware
-------加群需要在备注说明Oracle表空间和数据文件的关系,否则拒绝申请----
DBA1 群:62697716(满); DBA2 群:62697977(满) DBA3 群:62697850(满)
DBA 超级群:63306533(满); DBA4 群:83829929 DBA5群: 142216823
DBA6 群:158654907 DBA7 群:172855474