Oracle SQL optimization(摘自SAP Note 766349)
Symptom
1. What is SQL optimization?
2. How important is SQL optimization?
3. Which criteria are used to determine the load of an SQL statement?
4. Where can I find information on the executed SQL statements?
5. How do I find the most processing-intensive SQL statements overall?
6. How do I find information on an SQL statement that is currently running?
7. How can I determine what values are hidden behind the bind variables?
8. Why does the SQL statement appear different on database level than in the ABAP source code?
9. How many processing-intensive SQL statements should be observed as part of SQL optimization?
10. What are selection and join conditions?
11. What is a run schedule?
12. How can I display the run schedule for an SQL statement?
13. What are the characteristic components of a run schedule?
14. What is an optimal run schedule?
15. What utilities are available for further analysis of a processing-intensive SQL statement?
16. How can I optimize a processing-intensive SQL statement?
17. What must be considered in terms of the index design?
18. What analysis tools are available at Oracle level?
19. What information do the "Estimated Costs" and "Estimated Rows" provide for an SQL statement?
20. Is it possible to determine a buffer quality for individual tables?
21. What happens when there is an overflow of the Buffer Gets?
22. Can I see a sample analysis?
23. Where can I find further information on SQL optimization?
Frequently asked questions Solution
1. What is SQL optimization?
SQL optimization (or SQL statement tuning or Shared Cursor cache analysis) refers to the identification, analysis and optimization of SQL statements that are responsible for the highest load in relation to I/O and CPU consumption on database level. These statements are also called "processing-intensive SQL statements".
Due to the complexity of the topic, this note can only offer an outline description of an analysis. However, SAP offers various books, training courses and services on the subject (see also the final question in this note).
2. How important is SQL optimization?
Regular analysis and optimization of processing-intensive SQL statements is the MOST important basis for efficient system operation. Resource-intensive SQL statements are directly responsible for increased I/O and CPU activities and indirectly responsible for subsequent problems such as buffer busy waits or unusually high I/O times.
3. Which criteria are used to determine the load of an SQL statement?
The two main criteria for the load that creates an SQL statement are:
- Number of disk reads (or physical reads)
- Number of blocks that were not in the memory of the Oracle buffer and therefore had to be imported from the disk
- Measure of the I/O load of an SQL statement
- Number of buffer gets (or logical reads or reads)
- Number of the blocks read in Oracle Buffer Pool in the memory
- Measure of the CPU and memory load of an SQL statement
The following criteria are also relevant and should not be neglected, even though they generally have less influence upon the database performance:
- Number of processed rows
- Number of table entries returned by an SQL statement
- Measurement for the network load of an SQL statement
- Number of executions
- Measurement for the communication load with the database
- CPU-Time
- Measurement of the CPU load of an SQL statement
- Elapsed-Time
- Measurement of the actual duration of an SQL statement
- Number of direct writes (for Oracle >=10g)
- Measurement for PSAPTEMPT and direct path write activities (for example sorting, LOB accesses, parallel query, hash-joins, bitmap operations)
The SQL statements most relevant for SQL optimization are "Disk Read" and "Buffer Get". It is irrelevant whether a statement is executed often, causing only a small load each time or whether it is executed just once causing a large load. It is the total load triggered by the statement that is decisive and not the load per execution.
In the R/3 environment, statements with bind variables are parsed (:A0, :A1 ...). These bind variables may contain other specific values. Nevertheless, all statements that have only varying values are considered as an IN statement (on a bind variable level).
4. Where can I find information on the executed SQL statements?
The central source of information for SQL statements is the Shared Cursor Cache, which you can access through an SAP system by selecting
Transaction ST04
-> Detail Analysis Menu
-> SQL Request
-> Detail Analysis Menu
-> SQL Request
This cache includes the following information for all SQL statements executed since the last database start that were not displaced again from the shared cursor cache:
- Executions: Number of executions
- Disk reads: Number of blocks read from the disk
- Disk reads/execution: Number of blocks read by the disk per execution
- Buffer gets: Number of blocks read from the buffer pool
- Buffer gets/execution: Number of blocks read per execution from the pool buffer
- Rows processed: Number of processed rows
- Rows processed/execution: Number of rows processed for each execution
- Buffer gets/Row: Number of blocks read per processed row from the buffer pool
- CPU Time: Consumed CPU time (SAP basis >= 6.40)
- Elapsed Time: Duration of the execution (SAP basis >= 6.40)
- SQL statement: Text of the SQL statement
- Program name: Name of the calling program
- "Callpoint in the ABAP program" button: Exact location in the ABAP source code when the statement call originates
There are also buttons to define a reset time and start an evaluation since this reset time only.
5. How do I find the most processing-intensive SQL statements overall?
You receive the SQL statements that are currently most processing intensive in terms of one of the three load criteria by sorting the Shared Cursor Cache entries into disk reads, buffer gets or processed rows. It is best to sort according to the three criteria one after the other and to optimize the statements with the highest load in each case.
To automatically determine the most processing-intensive statements, you can also use the RSORADLD report (or /SDF/RSORADLD), that lists all statements by default that are responsible for more than 2% of the disk reads or 5% of the buffer gets.
The SAP Early Watch Alert Services also gives you an overview of the most processing-intensive SQL statements.
6. How do I find information on an SQL statement that is currently running?
If a work process is busy accessing a table in SM50/SM66 for an extended period, you can determine the relevant Oracle session using the Client-PID as described in Note 618868.
To obtain details of the block accesses, you can determine this statement in the second step in the Shared Cursor Cache.
7. How can I determine what values are hidden behind the bind variables?
In some cases the ABAP source code contains literals or constants that are transferred 1:1 to the database. Therefore an examination of the ABAP source code will provide information about the values transferred to the database.
Otherwise, up to Oracle 9i you can only determine the content of the bind variables by taking measures such as the following BEFORE executing an SQL statement:
- Activate an SQL trace using Transaction ST05
- J2EE Environment: Activate an SQL trace using a web browser URL (http://<hostname>:<port>/SQLTrace) or using SAP J2EE Engine Visual Administrator
- Activate an ORADEBUG trace (Note 613872)
- Debug the ABAP program that launches the SQL statement
No further information about the contents of the bind variables can be obtained until 9i.
As of Oracle 10g, the view V$SQL_BIND_CAPTURE is available, which stores the contents of bind variables for the executed SQL statements. With this information, you can use the following query to determine the bind variable contents specified for an SQL statement:
SELECT
SUBSTR(SBC.NAME, 1, 10) BIND,
SUBSTR(SBC.VALUE_STRING, 1, 50) VALUE,
COUNT(*) "NUMBER"
FROM
V$SQL_BIND_CAPTURE SBC, V$SQL S
WHERE
S.SQL_TEXT LIKE '<sql_statement_pattern>' AND
S.SQL_ID = SBC.SQL_ID
GROUP BY NAME, VALUE_STRING
ORDER BY 1, 2;
SELECT
SUBSTR(SBC.NAME, 1, 10) BIND,
SUBSTR(SBC.VALUE_STRING, 1, 50) VALUE,
COUNT(*) "NUMBER"
FROM
V$SQL_BIND_CAPTURE SBC, V$SQL S
WHERE
S.SQL_TEXT LIKE '<sql_statement_pattern>' AND
S.SQL_ID = SBC.SQL_ID
GROUP BY NAME, VALUE_STRING
ORDER BY 1, 2;
The bind variables are updated in V$SQL_BIND_CAPTURE every 15 minutes at the earliest. If you require a faster refresh for analysis purposes, you can temporarily set the underlying underscore parameter _CURSOR_BIND_CAPTURE_INTERVAL to a second value than 900 (that is, the default setting of 15 minutes):
ALTER SYSTEM SET "_CURSOR_BIND_CAPTURE_INTERVAL"=<seconds>;
ALTER SYSTEM SET "_CURSOR_BIND_CAPTURE_INTERVAL"=<seconds>;
By default, only the first 400 byte of the bind variable content of an SQL statement are saved. For statements with a lot of bind variables, this may mean that the values of the last bind variables are not captured. If you require further variable content beyond 400 byte, you can set the _CURSOR_BIND_CAPTURE_AREA_SIZE parameter to a value <bytes> of more than 400, for example:ALTER SYSTEM SET "_CURSOR_BIND_CAPTURE_AREA_SIZE"=<bytes>;
8. Why does the SQL statement appear different on database level than in the ABAP source code?
The Open-SQL statements executed from R/3 are transferred to the database via the database interface (DBI). In many cases, the statement is modified in the DBI before being transferred to the database:
- If a column with an empty variable is compared in the WHERE section, the DBI omits this condition.
- When you use FOR ALL ENTRIES, the program distributes a value list depending on the DBI parameters described in Note 48230 (particularly rsdb/max_blocking_factor, rsdb/max_in_blocking_factor) into statements with short IN lists or OR linkages.
Note that RSPARAM always displays a value of -1 for these parameters when you use the default values. You can determine the value actually in use by referring to the dev_w* workprocess trace (Transaction ST11).
If the FOR ALL ENTRIES list is empty, all data is generally selected from the current client ("SELECT ... FROM <table> WHERE MANDT = :A0"). All conditions in the WHERE part are ignored.
- Accesses to R/3 pool and cluster tables (that are not available as independent tables at an Oracle level) are converted into accesses on the related table pools or table clusters.
- If tables in R/3 are completely or generically buffered, the buffers are reloaded, if necessary, with special DBI statements (for example, "SELECT * FROM <table> WHERE MANDT = :A0 ORDER BY <primary_key_fields>" for completely buffered tables) that may be completely different to the statement from the ABAP source code.
- The DBI provides ABAP statements that have "SELECT SINGLE" or "UP TO ROWS" without ORDER BY with the FIRST_ROWS hint and "WHERE ROWNUM <= ..." (Note 135048).
- Some operations (for example, kernel procedures, bulk operations, generations) can generate SQL statements although there is no regular SQL statement in the ABAP source code.
- Even when you use conversion exits, there can be significant differences between ABAP source code and database-side statements (such as additional conditions that are not explicitly present in the ABAP source code).
- IN conditions from the ABAP source code may be converted into any number of possible conditions on the database level, depending on your selection criteria: "=", "IN", "LIKE", "BETWEEN", ">", "<", ">=", "<="
- Columns that appear in both the selection list and the WHERE condition are removed from the selection list if it is clear from the WHERE condition what the column's value must be.
- If an expression ends with a space followed by a placeholder, the system generates an OR connection as follows:
SQL statement: ... WHERE <column> LIKE '<string> %'
Statement after DBI transformation: ... WHERE (<column> LIKE '<string> %' OR <column> LIKE '<string>')
9. How many processing-intensive SQL statements should be observed as part of SQL optimization?
There is no single, general response to this question. The answer depends primarily on the corresponding rows of optimization and the current system status. Therefore, in one case, optimization of a single highly processing-intensive SQL statement can ensure that a previously barely usable system function again, while in another case, more than 50 statements must be optimized to achieve the required improvement in performance.
Indicators regarding whether a system is already well tuned or not are outlined in Note 618868.
10. What are selection and join conditions?
During a selection condition, a column is compared with one or several specific values (e.g. "MANDT = :A0", "BDAT > '20050821'"). In join conditions, two column values from different tables are compared (for example, "T_00.MANDT = T_01.MANDT")
Only the selection conditions can ever be used to enter a join. Only when you access the inner table of a nested loop join (see below) can the join conditions also be used.
Selection and join conditions are specified in the WHERE part of the SQL statement and - in the case of views - are also specified within the view definition in Transaction SE11. These view conditions MUST NOT be overlooked when analyzing SQL statements.
11. What is a run schedule?
A run schedule (or explain, access path) shows how Oracle accesses the requested data (index access, full table scan, join type ...).
12. How can I display the run schedule for an SQL statement?
Up to and including Oracle 8i, the run schedule could be determined using the Oracle Explain function. As of Oracle 9i it is also buffered in the V$SQL_PLAN performance view.
From the shared cursor cache overview, you can go to the execution plan using the Explain button or by clicking on the SQL statement and selecting the Explain button on the dialog box.
Whether the explains displayed in R/3 are based on V$SQL_PLAN or the previous explain depends on the R/3 Release and Support Package version. If the display of the R/3 explain does not contain any further information, the normal Oracle explain is used. If you find explicit information concerning V$SQL_PLAN, the data comes from this view.
13. What are the characteristic components of a run schedule?
- TABLE ACCESS FULL
During a full table scan, all table blocks up to the high water mark (last ever filled block) are read
- INDEX RANGE SCAN
Depending on the selection conditions, a subarea of an index is read
- INDEX UNIQUE SCAN
In the selection conditions, all key fields of a primary index are specified with "=" so that no more than one entry of the index is returned.
- INDEX FULL SCAN
The entire index tree is read from left to right (-> sorted result)
- INDEX FAST FULL SCAN (as of Oracle 9i)
The blocks of the index are read in the sequence on the hard disk (due to DB_FILE_MULTIBLOCK_READ_COUNT this is fater than INDEX FULL SCAN but the result is not sorted)
- INDEX SKIP SCAN (as of Oracle 9i)
This is comparable with several INDEX RANGE SCANs and also takes into account index fields behind an index field that is not specified in the selection conditions or is not specified with "=".
- TABLE ACCESS BY INDEX ROWID
Read the complete data record of the table based on the ROWID that was determined with a preceding index access
- NESTED LOOPS
Two tables are joined, where the system is accessed using the table in the run schedule directly under NESTED LOOPS and the second table is accessed with all of the entries that are returned by this table.
- SORT JOIN / MERGE JOIN
With the sort merge join, the relevant entries are read in parallel from both tables. The two resulting quantities are then sorted and finally combined for the final resulting quantity.
- MERGE JOIN CARTESIAN / BUFFER SORT
A Cartesian MERGE JOIN (which often appears in relation to a BUFFER SORT) generates the Cartesian product of the two resulting quantities. If both of the resulting quantities have x and y elements, the Cartesian product result contains x * y records.
- HASH JOIN
A hash table is set up from the access table (the table directly under HASH JOIN in the run schedule). The entries of the second table are then checked on this hash table. If the join conditions are fulfilled, the data record will be returned.
- VIEW index$_join$_001
A line such as this in the execution plan indicates an index join. You can use this join type to combine columns from several indexes using a hash join, before the table needs to be accessed. Index joins are especially useful if there are several indexes that are moderately selective, but that are very selective if used in combination.
14. What is an optimal run schedule?
An optimal run schedule only needs a minimum of blocks to be read. The following is especially important for optimal data access:
- In most cases, an index access is better than a full table scan. A full table scan is mainly useful when a large portion of the table entries must be read.
- Otherwise, you must ensure that the selective conditions are supported by an index. The greater the selectivity of a condition, the more the resulting quantity will be restricted. Note: there is no direct relationship between the selectivity and the number of different values of a column. For example, if a column only ever contains the default value ' ' (and has therefore only a single value), but in the selection conditions you are searching using a value other than ' ', the condition is very selective. On the other hand, if a column contains 1,000,000 different values, but 90% of the entries have the value '000000000000000', a comparison of the column using '000000000000000' will be very unselective.
Also refer to the criteria specified below for the index design.
- Hash joins are deactivated in the standard R/3 system (HASH_JOIN_ENABLE is set to FALSE). They are only really used effectively in BW environments.
- Merge Join Cartesians should hardly ever occur. If they do, there is likely a serious error (completely obsolete statistics, missing Join or selection conditions, and so on). Also, ensure that the Oracle parameter OPTIMIZER_INDEX_CACHING has a value of 50 or lower.
- Nested loop joins are usually preferable to sort merge joins. Sort merge joins are only useful if the expected resulting quantities are small, based on the selection conditions for both tables, and there is no good join condition.
- The access table of a nested loop join and a hash join must be the table with the smaller resulting set.
15. What utilities are available for further analysis of a processing-intensive SQL statement?
The following examinations can be used for further analysis, depending on the analyses in the shared cursor cache in ST04:
a) Actual values in bind variables
To decide what conditions really are selective (that is, which significantly limit the resulting quantity), typical values must be determined in the bind variables. Use transaction ST05 to create an SQL trace that includes the execution of the statement. The trace must be started on the instance on which the transaction is executed with the relevant statement. Selecting the statement and choosing the "Replace Variables" button gives you a statement with the actual values, instead of binding variables.
If no SQL trace is possible ad hoc, you can also go to the calling ABAP program, because you are working in various places with literals instead of ABAP variables, which displays the values used in the plaintext.
b) Number of suitable table entries
You can use the actual values from an SQL statement to determine the selectivity by selecting the number of table entries that match the conditions. This can happen - for the current client - using transaction SE16. To do this, enter the individual conditions for the table and select the "Number of Entries" button.
At Oracle level, you can determine the number of suitable entries using the following command:
SELECT COUNT(*) FROM <table> WHERE <condition_list>;
SELECT COUNT(*) FROM <table> WHERE <condition_list>;
The advantage of this method it that unlike SE16, it is not limited to the current client.
Caution: A large number of blocks are imported with these methods, which can increase the I/O load. Do not execute these actions on a large scale when the system has a heavy load.
c) Column value distribution
Use Transaction DB05 for a rough overview of what and how often value combinations of one or several columns appear in the table.
DB05 does not state WHICH column combination appears how often. In addition, the frequency is only broken down by intervals (1-10, 11-100, ...). broken down. The result of a DB05 analysis is therefore only of limited use.
It is particularly useful for columns with a limited number of value combinations to determine each of these combinations including the exact number of occurrences. This can be done as follows at Oracle level:
ALTER SESSION SET DB_FILE_MULTIBLOCK_READ_COUNT = 128;
SELECT <column1>, ..., <columnN>, COUNT(*) FROM <table>
GROUP BY <column1>, ..., <columnN>;
ALTER SESSION SET DB_FILE_MULTIBLOCK_READ_COUNT = 128;
SELECT <column1>, ..., <columnN>, COUNT(*) FROM <table>
GROUP BY <column1>, ..., <columnN>;
The ALTER SESSION accelerates the subsequent SELECT (during a full table scan, the system then reads 128 instead of only 8 or 32 blocks for each I/O request from the disk). Since the parameter affects the cost accounting, it can only be set to 128 here, and not generally.
As of Release 4.6C, you can use transaction TAANA to determine the value distribution of columns.
Caution: A large number of blocks are imported with these methods, which can increase the I/O load. Do not execute these actions on a large scale when the system has a heavy load.
d) Index fragmentation
To prevent a large number of block accesses being caused by a fragmented index, check the index fragmentation as described in Note 771929.
e) Source of the SQL statement
You can use the report name and source code button in the shared cursor cache to reach the callpoint in ABAP. If it is still not clear there in which context the source code will run, a where-used list can return further information.
Information on the triggering transaction can also be determined by capturing a current query in SM66 or by a permanent SQL trace restricted to the table using ST05.
If no report name is displayed in the shared cursor cache, the statement is called from a tool or script outside the R/3 system. However, there is also a number of external tools that can be identified with a separate name.
If branching using a source code button fails, you may have dynamically generated source code.
If several reports execute an identical statement, the report that first executed the statement is always displayed.
You cannot branch to the source code for statements that originated in the R/3 kernel.
f) Block accesses for a specific statement execution
To test alternative access plans or partial accesses of a join, SQL statements can be executed at Oracle level. To ensure that it is easy to find their key performance indicators (such as the number of buffer gets and disk reads for each execution) in the shared cursor cache, you should add a unique and easily identifiable comment to the statement to be executed:
SELECT /* <comment> */ ...
SELECT /* <comment> */ ...
Now you can easily search for the statement by using the search term <comment> in the Shared Cursor Cache.
g) Access paths for varied SQL statements
In various cases it can be useful to determine the access path for varied SQL statements, for instance:
- Effect of hints
- Effect of more or fewer OR concatenations or IN list entries
- Effect of changed Oracle parameters
- Determining the costs for partial statements
You can use the "Enter SQL statement" button to determine the access path for any statement in transaction ST05. However, you can only change dynamic Oracle parameters with newer releases and Support Packages.
16. How can I optimize a processing-intensive SQL statement?
Before carrying out technical tuning of a processing intensive SQL statement, you should always check from the application view to what extent the SQL statement is even useful. For example, if the SQL statement is the result of an inadequately filled input template, a poorly-qualified SE16 SELECT (report /1BCDWB/DB*), an incorrect query (report AQ*) or a design error in a customer-specific development, the statement must be optimized from a non-technical view or avoided completely. If the statement comes from the standard SAP system, you can check whether a general solution is already available during an SAP note search.
Below is a general overview of classes of expensive SQL statements and possible technical tuning approaches:
- Accesses to objects of the Oracle DDIC (DBA views)
Queries on DBA views are often very processing-intensive. For statements on DBA views, it does not make sense to optimize the access by creating new indexes or specifying hints.
These accesses are usually caused by monitoring tools (transaction RZ20, BR*TOOLS, external monitoring tools). You must therefore first determine the initializing tool and check to what extent the query can be completely deactivated or the frequency of the execution can be reduced.
If long runtimes and a high number of BUFFER GETs occur during DBA_SEGMENT access in particular, then refer to Note 871455.
- Accesses within the F4 search help (M_V* views)
Search help runtimes may vary, depending on the selection criteria. Since all possible queries cannot be optimized to the same extent, it is important to limit query constellations to the main query constellations only. Furthermore, you can implement a &SUBSTITUTE LITERALS& hint in accordance with Note 1008433. This hint partially results in optimal accesses, even if there are no histograms.
- Large number of executions
If an SQL statement is processing intensive, primarily due to the large number of executions, you must check whether the number of executions can be reduced in the application.
If the statement in question contains a FOR ALL ENTRIES expression, it may also be possible to increase the blocking factors as described in Note 881083.
- Large number of processed rows for each execution
If an SQL statement reads a very large number of data records with each execution, you must check in the application whether the scope of the dataset can be reduced (for example, by specifying more conditions in the WHERE part).
- Large number of buffer gets in each execution
Check if the use of resources by Oracle is optimal. Consider the following issues:
- Does the Oracle Optimizer fail to choose the current best possible access path?
If a detailed analysis of the SQL statement means that Oracle does not decide on the optimal access path, you should first check whether the CBO statistics correctly reflect the current dataset. To make sure, you can generate new statistics that are as accurate as possible (Note 588668). You should also ensure that a statistical run is scheduled at least once a week as described in Note 132861.
If the access path is corrected and no side effects occur, you can also create statistics as described in Note 756335 on tables that do not usually receive any statistics as described in 122718.
If the access path is also not optimal with good statistics, this is often due to "features" of the RBO and CBO, which are described in Note 176754. In suche cases you can improve Oracle's access by using hints (see Note 130480) or by adjusting the statistic values (see Note 724545).
Additional reasons for an incorrect access path are the wrong Oracle parameter settings or the use of an obsolete patch set. For more information, see Note 618868.
To understand the CBO decisions better, also refer to Note 750631, which contains approximation formulas for the cost accounting.
Under certain conditions, the displayed access path is also incorrect because Oracle performs another access internally. For more information, see Note 723879.
- Can the access be optimized by creating an additional index?
If the existing selective conditions are not sufficiently supported by an index, a corresponding index can be created for the optimization or - if no side effects are possible - an existing index can be adjusted.
Note that in the R/3 system, some central tables are accessed using special index tables (Note 185530 for SD, Note 191492 for MM/WM, and Note 187906 for PP/PM). In the cases described, you should open the initial screen using an index table, so that no other index must be created. The same applies for accesses on the table BSEG, for the index tables such as BSIS, BSAS, BSID, and BSAD.
- Is the number of buffer gets inexplicably high?
If possible, roughly calculate how many blocks must be read for the statement. A maximum of 5 block accesses are required per execution for each INDEX UNIQUE SCAN (or, more specifically: BLEVEL of index + 1).
If a significantly greater number of blocks is read with INDEX RANGE SCANs that your estimates would lead you to expect, this may be due to an index fragmentation. For more information, see Note 771929.
If changes take placc to data in the same block even as it is accessed, the consistent reading mechanism and its associated access to CR blocks ("Consistent Read") may lead to significantly increased block accesses. Refer to Note 913247 for more detailed information.
Another possible reason for a large number of Buffer Gets is a high volume of network communication between the database and SAP. This effect is usually negligible, however.
When you use LIKE conditions, a placehold at the beginning or in the middle of the search template may cause a greater number of buffer gets, since Oracle can only optimally use placeholders that are at the end of the search term.
If you execute mass-data operations ("FROM TABLE <internal_table>" expressions in the ABAP statement), the underlying SQL statement is executed for each data record of <internal_table>, while the system only records a single executeion in the shared cursor cache. This causes an unnecessarily high number of block accesses for each execution.
In individual cases, the high number of buffer gets is caused by the actual run schedule differing from the displayed run schedule. This can be checked as described in Note 723879.
- Large number of disk reads per execution
A high number of disk reads is often associated with full table scans, because the blocks imported during a full table scan are suppressed more quickly than blocks that were read during an index access. In such a case, you must use appropriate measures to force an index access (create a new index, adjust the statistics, hints, ...).
A large number of disk reads often occurs in connection with index range scans and subsequent table accesses with ROWID if the clustering factor of the index is high. For more information, see Note 832343.
To avoid blocks of a critical table being suppressed from the buffer pool after some time, and later being read again from disk, you can set up a keep pool as described in Note 762808.
If the application needs to read a lot, or all, of the entries of a table repeatedly, and the table is not too large, consider buffering the table completely in the SAP system. You must ensure that the delayed SAP buffer synchronization will cause no problems for the application, and that the table's change rate is not too high.
Otherwise the same applies during the analysis of statements with a high number of disk reads as for statements with a high number of buffer gets.
- High CPU-time
The usual cause of high CPU load is a large number of BUFFER GETs. Note 712624 contains analysis options and more characteristic reasons for increased CPU consumption.
- High Elapsed time
In many cases, a high elapsed time that cannot be explained by DISK READs or BUFFER GETs is caused by lock situations, especially enqueues (refer to Note 745639). If necessary, you can use ORADEBUG (refer to Note 613872) to create a trace of the statement in question to obtain more information.
- Large number of direct write operations
Statements with a high number of direct write operations can be analyzed as follows (depending on the triggering operation):
- (Sorting: Check whether the statement must sort an unnecessarily large amount of data, or if the sorting can be completely avoided [for example, by choosing an index that supports the sorting sequence).
- LOB accesses: Check whether it would be useful to cache LOB data, as described in Note 563359.
- Parallel query: Check whether the relevant SQL statement can be processed usefully without a parallel query (see Note 651060).
- Hash operations: Check whether the data scope of a hash operation can be reduced, or if the hash operation can be completely avoided (for example, by avoiding an undesirable hash join).
- Bitmap operations: Check whether a bitmap operation (for example star transformation) can be usefully optimized.
17. What must be considered in terms of the index design?
For more information, see Note 912620.
18. What analysis tools are available at Oracle level?
For various reasons, an SQL analysis may be useful at database level directly (for example, if no R/3 functions are available in a Java-only environment). Therefore, useful Oracle functions for SQL analysis are described below:
- Creating an SQL trace
You can use ORADEBUG (Note 613872) or DBMS_SYSTEM (Note 602667) to create an SQL trace of any session.
- Displaying an Explain plan
With the following command, the run schedule is displayed for all subsequent SQL statements, instead of them being executed:
SET AUTOTRACE TRACEONLY EXPLAIN
You can deactivate this mode with:
SET AUTOTRACE OFF
If you want to create an Explain for a statement with bind variables, the above approach does not work. You can use the following option instead:
EXPLAIN PLAN FOR <sql_statement>;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
- Display block accesses of an SQL statement
Among other things, the following command counts the Disk Reads and Buffer Gets of the subsequent SQL statements (without the actual result of the query being returned):
SET AUTOTRACE TRACEONLY
You can deactivate this mode with:
SET AUTOTRACE OFF
- PLAN statistics (Oracle >=9i)
As of Oracle 9i you can collect detailed PLAN statistics such as processed records, buffer gets, disk reads or required time. Since the collection of these extensive statistics has a significant effect on database performance, this option is normally deactivated. You can use SQLPLUS to activate and evaluate these PLAN statistics on session level for individual tests. To do this, proceed as follows:
- Activating the PLAN statistics:
ALTER SESSION SET STATISTICS_LEVEL=ALL;
- Executing the relevant SQL statements
- Query from V$SQL_PLAN_STATISTICS_ALL
SELECT
SUBSTR(LPAD(' ', PS.DEPTH) || OPERATION || ' ' ||
OPTIONS || DECODE(OBJECT_NAME, NULL, NULL, ' (' ||
OBJECT_NAME || ')'), 1, 40) ACTION,
PS.COST,
PS.CARDINALITY,
PS.LAST_OUTPUT_ROWS "ROWS_PROC",
PS.LAST_CR_BUFFER_GETS + PS.LAST_CU_BUFFER_GETS BUFFER_GETS,
PS.LAST_DISK_READS DISK_READS,
PS.LAST_ELAPSED_TIME TIME
FROM V$SQL_PLAN_STATISTICS_ALL PS, V$SQL S WHERE
S.SQL_TEXT LIKE '<sql_statement_pattern>' AND
PS.ADDRESS = S.ADDRESS
ORDER BY PS.ID;
The columns COST and CARDINALITY contain the CBO estimates, while the columns ROWS_PROC, BUFFER_GETS, DISK_READS and TIME (in microseconds) display the actual statistics of the last execution.
- Deactivating the PLAN statistics:
ALTER SESSION SET STATISTICS_LEVEL=TYPICAL;
- PLAN-Statistiken (Oracle >= 10g)
As of Oracle 10g, you can also use the GATHER_PLAN_STATISTICS hint:
- Execute the SQL statement using the GATHER_PLAN_STATISTICS hint:
SELECT /*+ GATHER_PLAN_STATISTICS */ ...
- Indicate the execution plan using the following command:
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,
NULL, 'ALLSTATS'));
The records and the time for each execution step are contained in the columns "A-Rows" and "A-Time".
- Sample data
In some situations, it is useful to get an impression of the data composition without having to scan the entire table or without having to rely on how representative the first records of a table are. You can use the following command to randomly select data records:
SELECT * FROM <table> SAMPLE (<percent>);
SELECT * FROM <table> SAMPLE (<percent>);
19. What information do the "Estimated Costs" and "Estimated Rows" provide for an SQL statement?
An EXPLAIN displays "Estimated Costs" and "Estimated Rows", which are simply the CBO's calculation results (refer to Note 7550631). Since these results are based upon a number of assumptions (column values are distributed equally, statistics), and depend upon the database parameter settings, the calculated costs and rows are useful only within a margin for error. High "Estimated Costs" and "Estimated Rows" are therefore neither a satisfactory nor a necessary indication of an expensive SQL statement. Also, the calculated costs have no actual effect upon the performance - the deciding costs are always the actual ones, in the form of BUFFER GETs, DISK READs, or processed rows.
20. Is it possible to determine a buffer quality for individual tables?
The Oracle Buffer Quality (see Note 618868 for more information) is a global statistic of the database. Since individual SQL statements can strongly affect this global value (both positively and negatively), a table-specific buffer quality is desirable. You can use the following SQL statement (based on V$SEGMENT_STATISTICS) to determine the 20 tables that have the poorest buffer quality (this query only considers tables with at least 10,000 buffer gets):
SELECT * FROM
( SELECT
S1.OBJECT_NAME OBJECT,
S1.VALUE LOGICAL_READS,
S2.VALUE PHYSICAL_READS,
ROUND((S1.VALUE - S2.VALUE) / S1.VALUE * 100, 2) "QUALITY (%)"
FROM
V$SEGMENT_STATISTICS S1, V$SEGMENT_STATISTICS S2
WHERE
S1.OBJ# = S2.OBJ# AND
S1.VALUE > 10000 AND
S1.STATISTIC_NAME = 'logical reads' AND
S2.STATISTIC_NAME = 'physical reads'
ORDER BY 4 ASC
)
WHERE ROWNUM <=20;
SELECT * FROM
( SELECT
S1.OBJECT_NAME OBJECT,
S1.VALUE LOGICAL_READS,
S2.VALUE PHYSICAL_READS,
ROUND((S1.VALUE - S2.VALUE) / S1.VALUE * 100, 2) "QUALITY (%)"
FROM
V$SEGMENT_STATISTICS S1, V$SEGMENT_STATISTICS S2
WHERE
S1.OBJ# = S2.OBJ# AND
S1.VALUE > 10000 AND
S1.STATISTIC_NAME = 'logical reads' AND
S2.STATISTIC_NAME = 'physical reads'
ORDER BY 4 ASC
)
WHERE ROWNUM <=20;
21. What happens when there is an overflow of the Buffer Get?
For Oracle Release 9i or lower, if the SQL statement executes more than two billion Buffer Gets, the value overflows in the Shared Cursor Cache in the negative area. For Oracle Release 10g, the limit is four billion. When the system reaches this limit, the count begins at 0 again. In both cases, the result is that the system can no longer immediately recognize a large number of Buffer Gets. In addition, in many cases, the Shared Cursor Cache analysis ignores statements with negative Buffer Gets values, and as a result, the most expensive SQL statements regarding Buffer Gets are not even displayed. To recognize the affected SQL statements despite the overflow, you can select the following options:
- Check whether there are accesses with a negative number of Buffer Gets:
SELECT * FROM V$SQL WHERE BUFFER_GETS < 0;
- Check the plausibility of the data - if the system displays a number of Buffer Gets that is significantly lower than is possible, an overflow may be responsible.
- When you use Oracle 10g, also consider the AWR data from DBA_HIST_SQLSTAT that was aggregated on the basis of Snapshot, since overflows between two snapshots (generally found in one-hour intervals) are improbable.
22. Can I see a sample analysis?
The following SQL statement guarantees several thousand disk reads and buffer gets per execution, although hardly any entries are returned:
SELECT *
FROM "LTAP" T_00 , "LTAK" T_01
WHERE
( T_01 . "MANDT" = :A0 AND
T_01 . "LGNUM" = T_00 . "LGNUM" AND
T_01 . "TANUM" = T_00 . "TANUM" ) AND
T_00 . "MANDT" = :A1 AND
T_00 . "LGNUM" = :A2 AND
T_00 . "PQUIT" = :A3
SELECT *
FROM "LTAP" T_00 , "LTAK" T_01
WHERE
( T_01 . "MANDT" = :A0 AND
T_01 . "LGNUM" = T_00 . "LGNUM" AND
T_01 . "TANUM" = T_00 . "TANUM" ) AND
T_00 . "MANDT" = :A1 AND
T_00 . "LGNUM" = :A2 AND
T_00 . "PQUIT" = :A3
The execution path contains a sort merge join:
SELECT STATEMENT ( Estimated Costs = 1,712 , Estimated #Rows = 52,660 )
MERGE JOIN
TABLE ACCESS BY INDEX ROWID LTAP
INDEX RANGE SCAN LTAP~0
SORT JOIN
TABLE ACCESS BY INDEX ROWID LTAK
INDEX RANGE SCAN LTAK~R
SELECT STATEMENT ( Estimated Costs = 1,712 , Estimated #Rows = 52,660 )
MERGE JOIN
TABLE ACCESS BY INDEX ROWID LTAP
INDEX RANGE SCAN LTAP~0
SORT JOIN
TABLE ACCESS BY INDEX ROWID LTAK
INDEX RANGE SCAN LTAK~R
A nested loop join is most likely preferable to a sort merge join. Because - leaving aside the unselective MANDT - there are only specifications for LTAP selection conditions in the WHERE part (LGNUM, PQUIT columns), you can check whether LTAP is suitable as an access table for a nested loop join. A suitable index already exists:
NONUNIQUE Index LTAP~M
Column Name #Distinct
MANDT 1
LGNUM 7
PQUIT 2
MATNR 24.280
NONUNIQUE Index LTAP~M
Column Name #Distinct
MANDT 1
LGNUM 7
PQUIT 2
MATNR 24.280
As you can see in the #Distinct column, neither LGNUM nor PQUIT have many different values. However, this information does not yet allow any direct conclusions to be drawn on the selectivity of the conditions.
Even without an SQL trace, in this case you can already read the PQUIT value ' ' from the ABAP source code:
SELECT LTAP~LGNUM LTAP~TANUM LTAP~TAPOS
LTAP~VLTYP LTAP~NLTYP LTAK~BWLVS
LTAK~BDATU LTAK~BZEIT
INTO CORRESPONDING FIELDS OF TABLE IT_LTAP
FROM LTAP JOIN LTAK
ON LTAK~LGNUM = LTAP~LGNUM AND
LTAK~TANUM = LTAP~TANUM
WHERE LTAP~LGNUM = P_LGNUM
AND LTAP~PQUIT = ' '.
SELECT LTAP~LGNUM LTAP~TANUM LTAP~TAPOS
LTAP~VLTYP LTAP~NLTYP LTAK~BWLVS
LTAK~BDATU LTAK~BZEIT
INTO CORRESPONDING FIELDS OF TABLE IT_LTAP
FROM LTAP JOIN LTAK
ON LTAK~LGNUM = LTAP~LGNUM AND
LTAK~TANUM = LTAP~TANUM
WHERE LTAP~LGNUM = P_LGNUM
AND LTAP~PQUIT = ' '.
Now you can determine the exact value distribution for LGNUM and PQUIT using a GROUP BY:
SELECT LGNUM, PQUIT, COUNT(*) FROM LTAP GROUP BY LGNUM, PQUIT;
LGN P COUNT(*)
--- - ----------
I01 X 10
I02 5
I02 X 33955
I03 X 3088
P01 164
P01 X 81941
S01 2
S01 X 67807
S02 10
S02 X 3201
W01 33
W01 X 139158
SELECT LGNUM, PQUIT, COUNT(*) FROM LTAP GROUP BY LGNUM, PQUIT;
LGN P COUNT(*)
--- - ----------
I01 X 10
I02 5
I02 X 33955
I03 X 3088
P01 164
P01 X 81941
S01 2
S01 X 67807
S02 10
S02 X 3201
W01 33
W01 X 139158
You can tell that only very few entries ever exist for the used condition PQUIT = ' '. The condition is therefore very selective, although PQUIT only assumes two different values. A nested loop join with access through LTAP is therefore far preferable to a sort merge join.
The following options are available to persuade Oracle to use a nested loop join:
- Specification of a hint (see Note 772497)
- Sufficiently increase the statistics value for the distinct values of PQUIT, to make access using the LTAP~M index more attractive for Oracle (Note 724545).
One solution within the application is to split the join into two separate statements as described in Note 187935.
As an alternative, you can also use a SUBSTITUTE_LITERALS hint and histograms (refer to Note 811852).
23. Where can I find further information on SQL optimization?
In addition to covering general performance topics, the final section of Note 618868 contains references to books, SAP training and SAP services that have the SQL optimization as a central topic.
相关连接:http://blog.csdn.net/CompassButton/archive/2006/07/31/1004912.aspx