代码改变世界

Oracle:Using the DBMS_STATS-package

2011-07-19 13:19  Tracy.  阅读(481)  评论(0编辑  收藏  举报

原文地址

http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:735625536552

You Asked

We are doing performance-testing of SQL-statements in our project. When running 
explain-plans on the statements, the results are often not realistic due to the fact that 
we have not loaded the tables with data, and therefore the indexes, in most cases, will 
not be used as they will in production.

I found the DBMS_STATS-package and am wondering if I can use this package to virtually 
load my test-environment when using the explain-plan. Our Oracle-support team doubt that 
it will be possible to extract the data using GET/EXPORT_TABLE/INDEX/COLUMN_STATS, 
modifying the data and then using SET/IMPORT_TABLE/INDEX/COLUMN_STATS to put new 
statistics back into the data-dictionary-tables.

Are there any descriptions on the tables the CREATE_STAT_TABLE makes?

Regards,
Harald 

and we said...

That table structure is not documented and is not designed to be modified by anything 
other then the DBMS_STATS package.  Its sole purpose is to take stats from one database 
to another via export/import in the dbms_stats package.  Attempting to plug your own 
values in there would be difficult at best and potentially harmful at worst.

Instead of trying to insert/update rows in that table -- you should just use the other 
entries in the dbms_stats package.  For example, to set the statistics for a table -- you 
would use the dbms_stats.set_table_stats procedure.  It can either put the stats you want 
into the data dictionary directly OR you can have this procedure stuff those stats in the 
table created by CREATE_STAT_TABLE.  You use the formal named arguments in the 
set_table_stats procedure and they'll take care of stuffing them into the table itself.   
You lose NO functionality by doing this -- there would be NOTHING to be gained by 
accessing the table created directly.

You would use the SET_COLUMN|INDEX|TABLE_STATS to set whatever stats you wanted on the 
data.  There would be no reason to modify existing data.  You can see all of the 
"existing" data by querying the data dictionary. 
 
Is it used to set statistics and test the performance i.e. the execution path.

Actually, We are interested in knowing how we can use and where we can use 
dbms_stats.set_table_stats to the maximum benefit. 


Followup   September 29, 2001 - 2pm Central time zone:

You can use it to transfer the stats from your production system (with all of the data) to your 
test system.  You might do that to review the query plans generated by new/altered queries in your 
application before putting them into production -- so you can see what might be different.

I use it to show the effects of having differeing numbers of rows/blocks in tables.  For example, 
in order to demonstrate a point -- I might analyze an empty table -- run a query and show the plan. 
 Then, I will use dbms_stats to trick the optimizer into thinking there are LOTS and LOTS of rows 
and blocks in there and how the different plans.  You can use this to see how the optimizer will 
change its mind about the proper plan over time.

Another use I've found for it is with temporary tables.  You cannot really analyze them but what 
you can do is create a "real" table -- fill it up with representative data -- and then export these 
stats using dbms_stats.  Later, you create the temporary table and import these other stats you 
saved.  Now the optimizer will use these stats for the temporary table when developing query plans 
that involve it.  This could be important if you put lots of rows into a temp table and then use it 
in subsquent SQL statments -- the optimizer would otherwise think the temporary table is "small" 
-----------------------------------------------------------------------------------------------------
Hi

I have tested a few times and always find that dbms_stats is slower than ordinary analyze even 
specifying parallel degree, why is that? we are using 8.1.6.3

Also is it possible analyze a table using analyze table xx compute statistics in parallel? 



Followup   November 1, 2001 - 4pm Central time zone:

Geez, no test case, no parameters, not even "we find it X% slower" -- nothing.

Perhaps the tables you used were trivial in size (parallel can take much longer).  Perhaps you were 
comparing apples with oranges (eg: dbms_stats by default does "for all columns size 1" -- did your 
analyze do that too?)

No, you cannot analyze in parallel, only dbms_stats can.

Also, I do not see it being significantly slower (given that dbms_stats will just issue an analyze 
in this particular case you would expect that to be so!):

ops$tkyte@ORA717DEV.US.ORACLE.COM> variable n number
ops$tkyte@ORA717DEV.US.ORACLE.COM> 
ops$tkyte@ORA717DEV.US.ORACLE.COM> analyze table t delete statistics;

Table analyzed.

ops$tkyte@ORA717DEV.US.ORACLE.COM> 
ops$tkyte@ORA717DEV.US.ORACLE.COM> exec :n := dbms_utility.get_time

PL/SQL procedure successfully completed.

ops$tkyte@ORA717DEV.US.ORACLE.COM> exec dbms_stats.gather_table_stats( user, 'T', method_opt => 
'for all indexed columns', cascade => true );

PL/SQL procedure successfully completed.

ops$tkyte@ORA717DEV.US.ORACLE.COM> 
ops$tkyte@ORA717DEV.US.ORACLE.COM> exec dbms_output.put_line( (dbms_utility.get_time-:n) || ' hsecs 
' );
22762 hsecs

PL/SQL procedure successfully completed.

ops$tkyte@ORA717DEV.US.ORACLE.COM> 
ops$tkyte@ORA717DEV.US.ORACLE.COM> analyze table t delete statistics;

Table analyzed.

ops$tkyte@ORA717DEV.US.ORACLE.COM> 
ops$tkyte@ORA717DEV.US.ORACLE.COM> exec :n := dbms_utility.get_time

PL/SQL procedure successfully completed.

ops$tkyte@ORA717DEV.US.ORACLE.COM> 
ops$tkyte@ORA717DEV.US.ORACLE.COM> analyze table t
  2  compute statistics
  3  for table
  4  for all indexed columns
  5  for all indexes
  6  /

Table analyzed.

ops$tkyte@ORA717DEV.US.ORACLE.COM> 
ops$tkyte@ORA717DEV.US.ORACLE.COM> exec dbms_output.put_line( (dbms_utility.get_time-:n) || ' hsecs 
' );
22732 hsecs

PL/SQL procedure successfully completed.

ops$tkyte@ORA717DEV.US.ORACLE.COM> select num_Rows from user_tables where table_name = 'T';

  NUM_ROWS
----------
   1093312


and when I do it in parallel -- I see a HUGE difference:


ops$tkyte@ORA717DEV.US.ORACLE.COM> variable n number
ops$tkyte@ORA717DEV.US.ORACLE.COM> 
ops$tkyte@ORA717DEV.US.ORACLE.COM> analyze table t delete statistics;

Table analyzed.

ops$tkyte@ORA717DEV.US.ORACLE.COM> 
ops$tkyte@ORA717DEV.US.ORACLE.COM> exec :n := dbms_utility.get_time

PL/SQL procedure successfully completed.

ops$tkyte@ORA717DEV.US.ORACLE.COM> exec dbms_stats.gather_table_stats( user, 'T', method_opt => 
'for all indexed columns size 1', cascade => true, degree=>8 );

PL/SQL procedure successfully completed.

ops$tkyte@ORA717DEV.US.ORACLE.COM> 
ops$tkyte@ORA717DEV.US.ORACLE.COM> exec dbms_output.put_line( (dbms_utility.get_time-:n) || ' hsecs 
' );
2525 hsecs

PL/SQL procedure successfully completed.

I think you were using tables that were so small as to be insignificant in even timing them.

Btw, table t is:

create table t as select * from all_objects;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;

create index t_idx1 on t(object_id);
create index t_idx2 on t(object_name);