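Oracle GoldenGate Series, Part 14: Monitoring GoldenGate Status
I. Monitoring with commands
The main monitoring commands are covered in the subsections below.
Note that STATS reports processing statistics, whereas STATUS reports the current run state of a process.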
1.1 Monitoring an Extract recovery
If Extract abends when a long-running transaction is open, it can seem to take a long time to recover when it is started again. To recover its processing state, Extract must search back through the online and archived logs (if necessary) to find the first log record for that long-running transaction. The farther back in time that the transaction started, the longer the recovery takes, in general, and Extract can appear to be stalled.
-- In short: if Extract abends while a long-running transaction is open, the next start can spend a long time in recovery.
-- During recovery, Extract searches the online and archived logs for the first log record of that transaction, so the recovery can look slow or stalled.
To confirm that Extract is recovering properly, use the SEND EXTRACT command with the STATUS option. One of the following status notations appears, and you can follow the progress as Extract changes its log read position over the course of the recovery.
To check the recovery status of Extract, use either of the following commands:
GGSCI> send extract_name status
or:
GGSCI> send extract extract_name status
The command reports one of the following three statuses:
(1) In recovery[1] – Extract is recovering to its checkpoint in the transaction log.
(2) In recovery[2] – Extract is recovering from its checkpoint to the end of the trail.
(3) Recovery complete – The recovery is finished, and normal processing will resume.
Example:
GGSCI (gg1) 12> send extract ext1 status
Sending STATUS request to EXTRACT EXT1 ...
EXTRACT EXT1 (PID 5269)
Current status: In recovery[1]: At EOF
Current read position:
Sequence #: 24
RBA: 6921352
Timestamp: 2011-11-17 20:17:20.000000
Current write position:
Sequence #: 0
RBA: 0
Timestamp: 2011-11-17 16:56:31.777616
Extract Trail: /u01/ggate/dirdat/lt
GGSCI (gg1) 13> send ext1 status
Sending STATUS request to EXTRACT EXT1 ...
EXTRACT EXT1 (PID 5269)
Current status: In recovery[1]: At EOF
Current read position:
Sequence #: 24
RBA: 6921352
Timestamp: 2011-11-17 20:17:20.000000
Current write position:
Sequence #: 0
RBA: 0
Timestamp: 2011-11-17 16:56:31.777616
Extract Trail: /u01/ggate/dirdat/lt
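When recovery has to be watched from a script rather than by eye, the status line can be picked out of the command output. The following is only a minimal sketch: it assumes the output has already been captured as text in the layout shown above, and the function name and the phase labels are made up for illustration, not part of GoldenGate.

```python
import re

# Sample copied from the SEND ... STATUS output shown above.
SAMPLE_OUTPUT = """\
Sending STATUS request to EXTRACT EXT1 ...
EXTRACT EXT1 (PID 5269)
Current status: In recovery[1]: At EOF
Current read position:
Sequence #: 24
RBA: 6921352
"""

# Hypothetical labels for the three documented status notations.
PHASES = (
    ("in recovery[1]", "recovering to checkpoint"),
    ("in recovery[2]", "recovering to end of trail"),
    ("recovery complete", "recovery complete"),
)


def recovery_phase(status_output):
    """Classify the 'Current status:' line of captured SEND ... STATUS output."""
    match = re.search(r"^\s*Current status:\s*(.+)$", status_output, re.MULTILINE)
    if match is None:
        return "no status line found"
    status = match.group(1).strip().lower()
    for prefix, label in PHASES:
        if status.startswith(prefix):
            return label
    return "not in recovery"


print(recovery_phase(SAMPLE_OUTPUT))
```

Calling this repeatedly on fresh output, together with the read position, gives a rough view of whether the recovery is still moving.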
1.2 Monitoring lag
Lag statistics show you how well the Oracle GoldenGate processes are keeping pace with the amount of data that is being generated by the business applications. With this information, you can diagnose suspected problems and tune the performance of the Oracle GoldenGate processes to minimize the latency between the source and target databases.
-- Lag statistics indicate how well the GoldenGate processes keep up with the volume of data being generated.
For Extract, lag is the difference, in seconds, between the time that a record was processed by Extract (based on the system clock) and the timestamp of that record in the data source.
-- For Extract, lag is therefore a measure of how quickly Extract reacts to changes in the data source. The unit is seconds.
For Replicat, lag is the difference, in seconds, between the time that the last record was processed by Replicat (based on the system clock) and the timestamp of the record in the trail.
-- Replicat lag is measured the same way, but against the record's timestamp in the trail file. The unit is seconds.
Lag statistics can be viewed with either of the following two syntaxes:
(1) LAG {EXTRACT | REPLICAT | ER} {<group | wildcard>}
(2) SEND {EXTRACT | REPLICAT} {<group | wildcard>}, GETLAG
Note that the INFO command also returns a lag statistic, but it is taken from the last checkpointed record rather than from the record currently being processed, so it is less accurate than the value returned by LAG or SEND ... GETLAG.
Example:
GGSCI (gg1) 20> lag er *
Sending GETLAG request to EXTRACT DPUMP ...
No records yet processed.
At EOF, no more records to process.
Sending GETLAG request to EXTRACT EXT1 ...
Last record lag: 21 seconds.
At EOF, no more records to process.
GGSCI (gg1) 21> send ext1 getlag
Sending GETLAG request to EXTRACT EXT1 ...
Last record lag: 21 seconds.
At EOF, no more records to process.
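To feed these figures into a monitoring script, the per-group lag can be pulled out of the captured text. This is a rough sketch that assumes the output looks exactly like the sample above (lag reported in whole seconds); the function name is hypothetical.

```python
import re

# Sample copied from the "lag er *" output shown above.
SAMPLE_OUTPUT = """\
Sending GETLAG request to EXTRACT DPUMP ...
No records yet processed.
At EOF, no more records to process.
Sending GETLAG request to EXTRACT EXT1 ...
Last record lag: 21 seconds.
At EOF, no more records to process.
"""

HEADER = re.compile(r"Sending GETLAG request to (?:EXTRACT|REPLICAT)\s+(\S+)")
LAG_LINE = re.compile(r"Last record lag:\s*(\d+)\s+seconds", re.IGNORECASE)


def parse_lag(output):
    """Map each group name to its last-record lag in seconds.

    A group that has not processed any records yet maps to None.
    """
    lags = {}
    current = None
    for line in output.splitlines():
        header = HEADER.search(line)
        if header:
            current = header.group(1)
            lags[current] = None
            continue
        lag = LAG_LINE.search(line)
        if lag and current is not None:
            lags[current] = int(lag.group(1))
    return lags


print(parse_lag(SAMPLE_OUTPUT))
```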
Three groups of parameters control lag reporting and alerting:
(1) Use the LAGREPORTMINUTES or LAGREPORTHOURS parameter to specify the interval at which Manager checks for Extract and Replicat lag.
-- These two parameters set how often Manager checks Extract and Replicat lag.
(2) Use the LAGCRITICALSECONDS, LAGCRITICALMINUTES, or LAGCRITICALHOURS parameter to specify a lag threshold that is considered critical, and to force a warning message to the error log when the threshold is reached. This parameter affects Extract and Replicat processes on the local system.
-- These three parameters set the lag threshold. Once lag reaches it, the lag is treated as critical and a warning is forced into the error log. Only Extract and Replicat processes on the local system are affected.
(3) Use the LAGINFOSECONDS, LAGINFOMINUTES, or LAGINFOHOURS parameter to specify how often to report lag information to the error log. If the lag is greater than the value specified with the LAGCRITICAL parameter, Manager reports the lag as critical; otherwise, it reports the lag as an informational message. A value of zero (0) forces a message at the frequency specified with the LAGREPORTMINUTES or LAGREPORTHOURS parameter.
-- These three parameters set how often lag information is written to the error log.
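The critical-versus-informational rule described in (3) can be illustrated with a few lines of code. This is only a sketch of the decision as worded above, not Manager's actual implementation; the function names and the sample threshold of 5 minutes are assumptions chosen for the example.

```python
# Seconds per unit, keyed by the suffix of the parameter name.
UNIT_SECONDS = {"SECONDS": 1, "MINUTES": 60, "HOURS": 3600}


def to_seconds(parameter, value):
    """Convert a value given with e.g. LAGCRITICALMINUTES into seconds."""
    name = parameter.upper()
    for suffix, factor in UNIT_SECONDS.items():
        if name.endswith(suffix):
            return value * factor
    raise ValueError(f"cannot tell the unit of {parameter!r}")


def classify_lag(lag_seconds, critical_parameter, critical_value):
    """Return 'critical' if the lag exceeds the LAGCRITICAL threshold,
    otherwise 'informational', following the wording in (3) above."""
    threshold = to_seconds(critical_parameter, critical_value)
    return "critical" if lag_seconds > threshold else "informational"


# The 21-second lag from the earlier example against an assumed 5-minute threshold.
print(classify_lag(21, "LAGCRITICALMINUTES", 5))
print(classify_lag(400, "LAGCRITICALMINUTES", 5))
```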
1.3 Monitoring processing volume
The volume statistics show you the amount of data that is being processed by an Oracle GoldenGate process, and how fast it is being moved through the Oracle GoldenGate system. With this information, you can diagnose suspected problems and tune the performance of the Oracle GoldenGate processes.
1.3.1 Viewing volume statistics
Syntax:
STATS {EXTRACT | REPLICAT | ER} {<group | wildcard>} [TABLE {<name | wildcard>}]
Example:
GGSCI (gg1) 22> stats er ext1
Sending STATS request to EXTRACT EXT1 ...
Start of Statistics at 2011-11-18 16:30:35.
DDL replication statistics (for all trails):
*** Total statistics since extract started ***
Operations 0.00
Mapped operations 0.00
Unmapped operations 0.00
Other operations 0.00
Excluded operations 0.00
Output to /u01/ggate/dirdat/lt:
Extracting from DAVE.PDBA to DAVE.PDBA:
*** Total statistics since 2011-11-18 15:13:17 ***
Total inserts 0.00
Total updates 0.00
Total deletes 1.00
Total discards 0.00
Total operations 1.00
*** Daily statistics since 2011-11-18 15:13:17 ***
Total inserts 0.00
Total updates 0.00
Total deletes 1.00
Total discards 0.00
Total operations 1.00
*** Hourly statistics since 2011-11-18 16:00:00 ***
No database operations have been performed.
*** Latest statistics since 2011-11-18 15:13:17 ***
Total inserts 0.00
Total updates 0.00
Total deletes 1.00
Total discards 0.00
Total operations 1.00
End of Statistics.
GGSCI (gg1) 23> stats extract ext1 table pdba
Sending STATS request to EXTRACT EXT1 ...
Start of Statistics at 2011-11-18 16:31:17.
DDL replication statistics (for all trails):
*** Total statistics since extract started ***
Operations 0.00
Mapped operations 0.00
Unmapped operations 0.00
Other operations 0.00
Excluded operations 0.00
Output to /u01/ggate/dirdat/lt:
End of Statistics.
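If the counters need to be compared over time, the STATS output can be turned into a small dictionary. The sketch below assumes the plain-count layout shown above and a single table in the output; it ignores the DDL section and would need adjusting for the REPORTRATE format. The function name is hypothetical.

```python
import re

# Shortened sample in the layout of the "stats er ext1" output above.
SAMPLE_OUTPUT = """\
DDL replication statistics (for all trails):
*** Total statistics since extract started ***
Operations 0.00
Extracting from DAVE.PDBA to DAVE.PDBA:
*** Total statistics since 2011-11-18 15:13:17 ***
Total inserts 0.00
Total updates 0.00
Total deletes 1.00
Total discards 0.00
Total operations 1.00
*** Hourly statistics since 2011-11-18 16:00:00 ***
No database operations have been performed.
End of Statistics.
"""

# Only section headers that carry a date are table statistics.
SECTION = re.compile(
    r"\*\*\*\s*(Total|Daily|Hourly|Latest) statistics since\s+\d{4}-\d{2}-\d{2}"
)
COUNTER = re.compile(
    r"^\s*Total (inserts|updates|deletes|discards|operations)\s+([\d.]+)\s*$"
)


def parse_stats(output):
    """Return {section: {counter: value}} for the table statistics sections."""
    sections = {}
    current = None
    for line in output.splitlines():
        section = SECTION.search(line)
        if section:
            current = section.group(1).lower()
            sections[current] = {}
            continue
        counter = COUNTER.match(line)
        if counter and current is not None:
            sections[current][counter.group(1)] = float(counter.group(2))
    return sections


print(parse_stats(SAMPLE_OUTPUT))
```

A section that reports "No database operations have been performed." simply comes back as an empty dictionary.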
1.3.2 Viewing the processing rate
Syntax:
STATS {EXTRACT | REPLICAT | ER} {<group | wildcard>}, REPORTRATE {HR | MIN | SEC}
-- HR / MIN / SEC = per hour / per minute / per second
Example:
GGSCI (gg1) 24> stats er ext1,reportrate min
Sending STATS request to EXTRACT EXT1 ...
Start of Statistics at 2011-11-18 16:34:36.
DDL replication statistics (for all trails):
*** Total statistics since extract started ***
Operations 0.00
Mapped operations 0.00
Unmapped operations 0.00
Other operations 0.00
Excluded operations 0.00
Output to /u01/ggate/dirdat/lt:
Extracting from DAVE.PDBA to DAVE.PDBA:
*** Total statistics since 2011-11-18 15:13:17 ***
Total inserts/minute: 0.00
Total updates/minute: 0.00
Total deletes/minute: 0.01
Total discards/minute: 0.00
Total operations/minute: 0.01
*** Daily statistics since 2011-11-18 15:13:17 ***
Total inserts/minute: 0.00
Total updates/minute: 0.00
Total deletes/minute: 0.01
Total discards/minute: 0.00
Total operations/minute: 0.01
*** Hourly statistics since 2011-11-18 16:00:00 ***
No database operations have been performed.
*** Latest statistics since 2011-11-18 15:13:17 ***
Total inserts/minute: 0.00
Total updates/minute: 0.00
Total deletes/minute: 0.01
Total discards/minute: 0.00
Total operations/minute: 0.01
End of Statistics.
1.3.3 Viewing total operations for a single table since startup
Syntax:
STATS {EXTRACT | REPLICAT | ER} {<group | wildcard>}, TOTALSONLY <table>
Example:
GGSCI (gg1) 25> stats er ext1,totalsonly pdba
Sending STATS request to EXTRACT EXT1 ...
Start of Statistics at 2011-11-18 16:37:51.
DDL replication statistics (for all trails):
*** Total statistics since extract started ***
Operations 0.00
Mapped operations 0.00
Unmapped operations 0.00
Other operations 0.00
Excluded operations 0.00
Output to /u01/ggate/dirdat/lt:
Cumulative totals for specified table(s):
*** Total statistics since 2011-11-18 15:13:17 ***
No database operations have been performed.
*** Daily statistics since 2011-11-18 15:13:17 ***
No database operations have been performed.
*** Hourly statistics since 2011-11-18 16:00:00 ***
No database operations have been performed.
*** Latest statistics since 2011-11-18 15:13:17 ***
No database operations have been performed.
End of Statistics.
1.3.4 To limit the types of statistics that are displayed
Syntax:
STATS {EXTRACT | REPLICAT | ER} {<group | wildcard>}, {TOTAL | DAILY | HOURLY | LATEST}
Example:
GGSCI (gg1) 28> stats ext1 total
Sending STATS request to EXTRACT EXT1 ...
Start of Statistics at 2011-11-18 16:44:52.
DDL replication statistics (for all trails):
*** Total statistics since extract started ***
Operations 0.00
Mapped operations 0.00
Unmapped operations 0.00
Other operations 0.00
Excluded operations 0.00
Output to /u01/ggate/dirdat/lt:
Extracting from DAVE.PDBA to DAVE.PDBA:
*** Total statistics since 2011-11-18 15:13:17 ***
Total inserts 0.00
Total updates 0.00
Total deletes 1.00
Total discards 0.00
Total operations 1.00
End of Statistics.
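Tip:
In these commands the EXTRACT or REPLICAT keyword does not have to be given, and the comma after the group name can also be left out; GGSCI resolves both automatically.
1.3.5 To clear all filters that were set with previous options
Syntax:
STATS {EXTRACT | REPLICAT | ER} {<group | wildcard>}, RESET
1.3.6 To send interim statistics to the report file
Syntax:
SEND {EXTRACT | REPLICAT | ER} {<group | wildcard>}, REPORT
II. Using the error log
The error log is stored in the GoldenGate installation directory:
gg1:/u01/ggate> ll ggserr.log
-rw-rw-rw- 1 oracle oinstall 149756 Nov 18 16:44 ggserr.log
The GoldenGate error log lets you view the following information: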
(1) a history of GGSCI commands
(2) Oracle GoldenGate processes that started and stopped
(3) processing that was performed
(4) errors that occurred
(5) informational and warning messages
Because the error log shows events as they occurred in sequence, it is a good tool for detecting the cause (or causes) of an error. For example, you might discover that:
(1) someone stopped a process
(2) a process failed to make a TCP/IP or database connection
(3) a process could not open a file
2.1 To view the error log
Use any of the following:
(1) Standard shell command to view the ggserr.log file within the root Oracle GoldenGate directory
(2) Oracle GoldenGate Director
(3) VIEW GGSEVT command in GGSCI
Syntax: VIEW GGSEVT
Example:
GGSCI (gg1) 29> view ggsevt
2011-11-08 20:08:12 INFO OGG-00987 Oracle GoldenGate Command Interpreter for Oracle: GGSCI command (oracle): edit params mgr.
2011-11-08 20:11:09 INFO OGG-00987 Oracle GoldenGate Command Interpreter for Oracle: GGSCI command (oracle): start manager.
2011-11-08 20:11:11 INFO OGG-00983 Oracle GoldenGate Manager for Oracle, mgr.prm: Manager started (port 7809).
2011-11-08 20:36:22 INFO OGG-00987 Oracle GoldenGate Command Interpreter for Oracle: GGSCI command (oracle): add extract ext1 tranlog, begin now.
2011-11-08 20:36:47 INFO OGG-01749 Oracle GoldenGate Command Interpreter for Oracle: Successfully registered EXTRACT EXT1 to start managing log retention at SCN 1121060.
2011-11-08 20:37:16 INFO OGG-00987 Oracle GoldenGate Command Interpreter for Oracle: GGSCI command (oracle): add exttrail /u01/ggate/dirdat/lt extract ext1.
2.2 To filter the error log
The error log can become very large, but you can filter it based on a keyword. For example, this filter shows only errors:
$ more ggserr.log | grep ERROR
Example:
gg1:/u01/ggate> more ggserr.log | grep ERROR
2011-11-09 21:00:32 ERROR OGG-01224 Oracle GoldenGate Capture for Oracle, ext1.prm: TCP/IP error 113 (No route to host).
2011-11-09 21:00:33 ERROR OGG-01668 Oracle GoldenGate Capture for Oracle, ext1.prm: PROCESS ABENDING.
2011-11-15 20:51:50 ERROR OGG-01203 Oracle GoldenGate Capture for Oracle, ext2.prm: EXTRACT abending.
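The same kind of filter can be written in a script when the timestamp, message code, and text are needed as separate fields. This is a minimal sketch that assumes each entry sits on one line in the "date time severity code message" layout seen above; the function name is hypothetical, and the sample lines are taken from the listings in this section.

```python
import re

# Sample entries in the layout of ggserr.log shown above.
SAMPLE_LINES = [
    "2011-11-08 20:11:11 INFO OGG-00983 Oracle GoldenGate Manager for Oracle, mgr.prm: Manager started (port 7809).",
    "2011-11-09 21:00:32 ERROR OGG-01224 Oracle GoldenGate Capture for Oracle, ext1.prm: TCP/IP error 113 (No route to host).",
    "2011-11-09 21:00:33 ERROR OGG-01668 Oracle GoldenGate Capture for Oracle, ext1.prm: PROCESS ABENDING.",
]

LOG_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+(\w+)\s+(OGG-\d+)\s+(.*)$"
)


def filter_log(lines, severity="ERROR"):
    """Return (timestamp, code, message) for every entry of the given severity."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group(2) == severity:
            hits.append((match.group(1), match.group(3), match.group(4)))
    return hits


for timestamp, code, message in filter_log(SAMPLE_LINES):
    print(timestamp, code, message)
```

To run it against the real file, pass an open file object for ggserr.log instead of the sample list.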
Because the error log will continue to grow as you use Oracle GoldenGate, consider archiving and deleting the oldest entries in the file.
NOTE:
The Collector process might stop reporting to the log on UNIX systems after the log has been cleaned up. To get reporting started again, restart the Collector process after the cleanup.
III. Using process reports
A process report lets you view the following:
(1) parameters in use
(2) table and column mapping
(3) database information
(4) runtime messages and errors
(5) runtime statistics for the number of operations processed
Every Extract, Replicat, and Manager process generates a report file at the end of each run. The report can help you diagnose problems that occurred during the run, such as invalid mapping syntax, SQL errors, and connection errors.
-- Every Extract, Replicat, and Manager process writes a report file when a run ends; use it to review what happened during that run.
3.1 To view a process report
Use any of the following:
(1) standard shell command for viewing a text file
(2) Oracle GoldenGate Director
(3) VIEW REPORT command in GGSCI
Syntax:
VIEW REPORT {<group> | <file name> | MGR}
Where:
(1) <group> shows an Extract or Replicat report that has the default name, which is the name of the associated group.
(2) <file name> shows any Extract or Replicat report file that matches a given path name. Must be used if a non-default report name was assigned with the REPORT option of the ADD EXTRACT or ADD REPLICAT command when the group was created.
(3) MGR shows the Manager process report.
Report names are in upper case if the operating system is case-sensitive. By default, reports have a file extension of .rpt, for example EXTORA.rpt. The default location is the dirrpt sub-directory of the Oracle GoldenGate directory.
-- On a case-sensitive operating system, report names are upper case. The default extension is .rpt and the default location is the dirrpt directory under the GoldenGate installation directory.
Example:
GGSCI (gg1) 30> view report ext1
***********************************************************************
Oracle GoldenGate Capture for Oracle
Version 11.1.1.1 OGGCORE_11.1.1_PLATFORMS_110421.2040
Linux, x64, 64bit (optimized), Oracle 11g on Apr 30 2011 18:52:51
Copyright (C) 1995, 2011, Oracle and/or its affiliates. All rights reserved.
Starting at 2011-11-18 13:30:22
***********************************************************************
Operating System Version:
Linux
Version #1 SMP Tue Aug 18 15:59:52 EDT 2009, Release 2.6.18-164.el5xen
Node: gg1
Machine: x86_64
soft limit hard limit
Address Space Size : unlimited unlimited
Heap Size : unlimited unlimited
File Size : unlimited unlimited
CPU Time : unlimited unlimited
…..
3.2 To determine the name and location of a process report
Use the INFO command in GGSCI.
Syntax:
INFO <group>, DETAIL
3.3 To view information if a process abends without a report
Run the process from the command shell of the operating system (not GGSCI) to send the information to the terminal.
-- If a process abends without generating a report, run it directly from the operating system shell with the following syntax to see what went wrong:
<process> paramfile <path name>.prm
Where:
(1) <process> is either Extract or Replicat.
(2) paramfile <path name>.prm is the fully qualified name of the parameter file.
Example:
gg1:/u01/ggate> extract paramfile /u01/ggate/dirdat/ext1.prm
Source Context :
SourceModule : [ggstd.util.file]
SourceID :[/scratch/sganti/view_storage/sganti_core_lin64/oggcore/OpenSys/src/gglib/ggstd/fileutl.c]
SourceFunction :[ggOpenFile]
SourceLine : [681]
ThreadBacktrace : [8]elements
:[extract(CMessageContext::AddThreadContext()+0x26) [0x66a416]]
:[extract(CMessageFactory::CreateMessage(CSourceContext*, unsigned int,...)+0x7b2) [0x660ee2]]
:[extract(_MSG_ERR_FILE_OPEN_ERROR(CSourceContext*, char const*,CMessageFactory::MessageDisposition)+0x92) [0x633952]]
:[extract(ggOpenFile(char const*, char const*)+0x7e) [0x58851e]]
: [extract[0x512f63]]
: [extract(main+0x1a8) [0x5254a8]]
:[/lib64/libc.so.6(__libc_start_main+0xf4) [0x34fa41d994]]
:[extract(__gxx_personality_v0+0x1f2) [0x4f2bda]]
2011-11-18 17:23:31 ERROR OGG-01091 Unable to open file "/u01/ggate/dirdat/ext1.prm" (error 2, No such file or directory).
2011-11-18 17:23:31 ERROR OGG-01668 PROCESS ABENDING.
3.4 Scheduling runtime statistics in the process report
By default, runtime statistics are written to the report once, at the end of each run. For long or continuous runs, you can use optional parameters to view these statistics on a regular basis, without waiting for the end of the run.
-- By default, runtime statistics are written to the report only when the process ends. For a long-running process, optional parameters let you see them without stopping the process.
3.4.1 To set a schedule for reporting runtime statistics
Use the REPORT parameter in the Extract or Replicat parameter file to specify a day and time to generate runtime statistics in the report.
-- With REPORT set in the Extract or Replicat parameter file, the statistics are written to the report on the schedule you specify.
3.4.2 To send runtime statistics to the report on demand
Use the SEND EXTRACT or SEND REPLICAT command with the REPORT option to view current runtime statistics when needed.
-- In other words, send extract or send replicat with the report option writes the current runtime statistics of the process to its report.
Example:
GGSCI (gg1) 35> send ext1 report
Sending REPORT request to EXTRACT EXT1 ...
Request processed.
3.5 Viewing record counts in the process report
Use the REPORTCOUNT parameter to report a count of transaction records that Extract or Replicat processed since startup. Each transaction record represents a logical database operation that was performed within a transaction that was captured by Oracle GoldenGate. The record count is printed to the report file and to the screen.
-- REPORTCOUNT shows how many transaction records the process has handled since startup; each record is one logical database operation captured by GoldenGate.
3.6 Managing process reports
Once created, a report file must remain in its original location for Oracle GoldenGate to operate properly after processing has started.
Whenever a process starts, Oracle GoldenGate creates a new report file and ages the previous one by appending a sequence number to the name. The numbers increment from 0 (the previous one) to 9 (the oldest).
No process ever has more than ten aged reports and one active report. After the tenth aged report, the oldest is deleted when a new report is created. Set up an archiving schedule for aged report files in case they are needed to resolve a service request.
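The aging scheme is easier to picture with a small simulation. The sketch below only models the rule as described above (one active report, at most ten aged ones numbered 0 to 9); it does not touch any files, the function names are hypothetical, and the group name EXTORA is borrowed from the EXTORA.rpt example earlier in this section.

```python
def start_run(reports, new_report, max_aged=10):
    """Return the report list after a process start.

    reports[0] is the active report and reports[1:] are the aged ones,
    newest first. The new report becomes active, every existing report
    ages by one position, and anything beyond max_aged aged reports is
    dropped.
    """
    return [new_report] + reports[:max_aged]


def name_reports(group, reports):
    """Pair each report with its file name: GROUP.rpt for the active one,
    then GROUP0.rpt (most recent aged report) through GROUP9.rpt (oldest)."""
    names = [f"{group}.rpt"] + [f"{group}{n}.rpt" for n in range(len(reports) - 1)]
    return list(zip(names, reports))


# Simulate twelve process starts for the assumed group EXTORA.
reports = []
for run in range(1, 13):
    reports = start_run(reports, f"run {run}")

for name, content in name_reports("EXTORA", reports):
    print(name, "<-", content)
```

After the twelfth start the report from the first run is gone, which is why aged reports worth keeping should be archived elsewhere before they rotate out.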
3.6.1 To prevent an Extract or Replicat report file from becoming too large
Use the REPORTROLLOVER parameter to force report files to age on a regular schedule, instead of when a process starts. For long or continuous runs, setting an aging schedule controls the size of the active report file and provides a more predictable set of archives that can be included in your archiving routine.
3.6.2 To prevent SQL errors from filling up the Replicat report
Use the WARNRATE parameter to set a threshold for the number of SQL errors that can be tolerated on any target table before being reported to the process report and to the error log. The errors are reported as a warning. If your environment can tolerate a large number of these errors, increasing WARNRATE helps to minimize the size of those files.
IV. Using the discard file
Use a discard file to capture information about Oracle GoldenGate operations that failed. This information can help you to resolve data errors, such as those that involve invalid column mapping.
-- The discard file holds records of GoldenGate operations that failed.
A discard file contains the following information:
(1) The database error message
(2) The sequence number of the data source or trail file
(3) The relative byte address of the record in the data source or trail file
(4) The details of the discarded operation, such as column values of a DML statement or the text of a DDL statement.
A discard file can be used for Extract or Replicat, but it is most useful for Replicat to log operations that could not be reconstructed or applied.
-- A discard file can be configured for both Extract and Replicat, but in practice it is mostly used with Replicat.
4.1 To use a discard file
Include the DISCARDFILE parameter in the Extract or Replicat parameter file. You must supply a name for the file. The parameter has options that control the maximum file size, after which the process abends, and whether new content overwrites or appends to existing content.
-- Both Extract and Replicat accept DISCARDFILE. A file name is mandatory; the options set the maximum file size (the process abends once it is exceeded) and whether new content overwrites or is appended to the existing file.
Syntax:
DISCARDFILE <file name> [, APPEND | PURGE] [, MAXBYTES <n> | MEGABYTES <n>]
NOTE:
To prevent the need to perform manual maintenance of discard files, use either the PURGE or APPEND option. Otherwise, you must specify a different discard file name before starting each process run, because Oracle GoldenGate will not write to an existing discard file.
-- Using PURGE or APPEND avoids manual upkeep of the discard file and lets the process start normally. Without either option, a new discard file name has to be specified before each run, because GoldenGate will not write to an existing discard file.
4.2 To view a discard file
Use either of the following:
(1) Standard shell command to view the file by name
(2) VIEW REPORT command in GGSCI, with the discard file name as input
Syntax:
VIEW REPORT <file name>
GGSCI (gg2) 4> view params rep1
replicat rep1
ASSUMETARGETDEFS
userid ggate@gg2,password ggate
discardfile /u01/ggate/dirdat/rep1_discard.txt, append, megabytes 10
--HANDLECOLLISIONS
ddl include all
ddlerror default ignore retryop
map dave.pdba, target dave.pdba;
Example:
GGSCI (gg2) 5> view report /u01/ggate/dirdat/rep1_discard.txt
Oracle GoldenGate Delivery for Oracle process started, group REP1 discard file opened: 2011-11-08 20:51:55
Oracle GoldenGate Delivery for Oracle process started, group REP1 discard file opened: 2011-11-09 10:39:47
Oracle GoldenGate Delivery for Oracle process started, group REP1 discard file opened: 2011-11-16 11:23:44
4.3 To manage discard files
Use the DISCARDROLLOVER parameter to set a schedule for aging discard files. For long or continuous runs, setting an aging schedule prevents the discard file from filling up and causing the process to abend, and it provides a predictable set of archives that can be included in your archiving routine.
Syntax:
DISCARDROLLOVER {AT <hh:mi> | ON <day of week> | AT <hh:mi> ON <day of week>}
-------------------------------------------------------------------------------------------------------
All rights reserved. This article may be reproduced, but only with a link to the original source; otherwise legal action may be taken.
Blog: http://blog.csdn.net/tianlesoftware
Weibo: http://weibo.com/tianlesoftware
Email: tianlesoftware@gmail.com
Skype: tianlesoftware