Why is my SYSTEM Tablespace so Big? August 3, 2009

Posted by mwidlake in internals, VLDB.
Tags: data dictionary, statistics, VLDB
trackback

How big is your SYSTEM tablespace? 512M? 1GB? 4GB? Bigger? Is it bigger than the whole of the first critical database you ever looked after? Why has the SYSTEM tablespace got so big?

You can easily check the size of the SYSTEM tablespace with any oracle development/admin GUI or with this script.

@dbf_lst
Enter value for ts_name: system
old   8: where tablespace_name like upper('&ts_name'||'%')
new   8: where tablespace_name like upper('system'||'%')
continue> 
 
FILE_NAME                                  F_ID  TS_NAME         SIZE_M   THOU_BL
----------------------------------------- ----- ---------------- -------- -------
+DISCG/sid/datafile/system.272.596722817      1 SYSTEM             24,920   3,190

That is 24GB. It is not the biggest one I have seen, but it is up there in the top 3.

Not so many versions ago, it was not unusual to have a system tablespace of less then 128MB in size. In fact, if you did not use stored PL/SQL this was probably the case up to and through 8i and into 9i. So – it’s PL/SQL causing the bloat? No, probably not. Ahhh, what about all that AWR automatic real-time monitoring gubbins you hear so much about? Nope, that goes into SYSAUX.

No, it’s probably down to one or both of two internal tables, SYS.AUD$ and SYS.HISTGRM$.

Let’s check out what are the biggest objects in this particular SYSTEM tablespace:

select owner,segment_name,segment_type
 ,bytes/(1024*1024) size_m
 from dba_segments
 where tablespace_name = 'SYSTEM'
 and    bytes/(1024*1024) &gt;1
 order by size_m desc
 
OWNER        SEGMENT_NAME         SEGMENT_TYPE         SIZE_M
------------ -------------------- ------------------ --------
SYS          C_OBJ#_INTCOL#       CLUSTER              13,313
SYS          AUD$                 TABLE                 5,688
SYS          I_H_OBJ#_COL#        INDEX                 1,863
SYS          I_AUD1               INDEX                 1,606
SYS          HIST_HEAD$           TABLE                   311
SYS          SOURCE$              TABLE                   224
SYS          IDL_UB1$             TABLE                   224
SYS          C_FILE#_BLOCK#       CLUSTER                 208
SYS          INDPART$             TABLE                   160
SYS          OBJ$                 TABLE                   144
SYS          I_HH_OBJ#_COL#       INDEX                   128
SYS          I_HH_OBJ#_INTCOL#    INDEX                   128
SYS          I_OBJ2               INDEX                    80
SYS          I_FILE#_BLOCK#       INDEX                    62
SYS          TABPART$             TABLE                    57
SYS          I_OBJ1               INDEX                    49
{snip}

To be a little obtuse, I’ll start with the second largest object, SYS.AUD$, which you can see is getting close to 6GB in size {Oh, and SYS.I_AUD1, the fourth largest object, is the PK index on AUD$}.

AUD$ is the underlying table that holds all of the system auditing information that is turned on and off with the SQL AUDIT command. You can use AUDIT to monitor particular users, changes to system priveleges, access on key tables or particular activity like table changes. Go check the SQL manual if you want to know more.
The records for this captured activity all go into this table. On this particular system a few key priveleges and logons by certain users are the only things being audited, but the data has built up over 3 or 4 years.

On version 9 and 10 the options for managing this table are…poor. Bascially, you have to do it yourself. I think I am right to say that it is the only SYS-owned table that Oracle Corp say it is OK for you to move into another tablespace yourself and that you are allowed to delete and truncate data from. If you AUDIT a lot of things or activities you probably need to delete some data. {If you AUDIT a lot of things or activities and you don’t delete any of the data, you probably did not turn on AUDIT very long ago or have not checked the size of your SYSTEM tablespace for a while…}.

Oh, and one final thing on the AUD$ table. It is lacking indexes – it has only one {ignoring LOB indexes}. This is a good thing as indexes slow down inserts quite significantly {I use a rule of thumb that if it takes “N“ms to insert a record into a table, it will take “2.2*N“ms if there is an index on the table, “3.4*N“ms if there are two indexes, “4.6*N“ms if there are three… Get the idea? However, I have not checked this for real in years. Treat this statement as a potential myth}. I would recommend you do not add further indexes to AUD$. If you need to investigate the information in there, I would strongly suggest you extract what you want into a new table with a CTAS statement, then index that copy table and do your work on that. You do not want to slow down further everything that is already slowed down a little by being audited.

What about the largest thing in the SYSTEM tablespace? It’s SYS.C_OBJ#_INTCOL# nd is over 13GB in size. I said it would be SYS.HISTGRM$ didn’t I?

Well, SYS.C_OBJ#_INTCOL# is a cluster. Clusters are structures, segments, that hold one or more tables. All records for the same key across all those tables are held together, in the same block or blocks. The key is either a normal index or a hash. The idea is that it speeds up selecting records from different tables with the same key – they are all in the same block or blocks. It does work too, but they seem to have fallen out of favour with Oracle designers, but are worth considering.

In this particular cluster there is only one table, SYS.HISTGRM$:

select owner,table_name
from dba_tables
where cluster_name='C_OBJ#_INTCOL#'
 
OWNER                          TABLE_NAME
------------------------------ ------------------------
SYS                            HISTGRM$
1 row selected.

So, SYS.HISTGRM$ is the only table in the cluster. It {and it’s cluster index, on columns OBJ# and INTCOL#} are hiding in this cluster segment. It IS the biggest thing in this particular SYSTEM tablespace. The only index, other than the cluster index, on the table SYS.HISTGRM$ is I_H_OBJ#_COL# and is the third largest object in the SYSTEM tablespace.

--ind_cols
select
INDEX_OWNER                 ind_owner
,INDEX_NAME                  ind_Name
,TABLE_NAME                  tab_Name
,substr(column_position,1,3) Psn
,COLUMN_NAME                 Col_Name
from dba_ind_columns
where table_name like upper(nvl('&tab_name','WHOOPS')||'%')
order by 3,1,2,4,5
--eof
@ind_cols
Enter value for tab_name: histgrm$
 
IND_OWNER   IND_NAME           TAB_NAME           PSN       COL_NAME
----------- ------------------ ------------------ --------- --------------
SYS         I_H_OBJ#_COL#      HISTGRM$           1         OBJ#
                                                  2         COL#
2 rows selected.
Elapsed: 00:00:02.07
 
-- you find the cluster index by looking on DBA_INDEXES and
--DBA_IND_COLUMNS where the "TABLE_NAME" is the name of the cluster
@ind_cols
Enter value for tab_name: C_OBJ#_INTCOL#
 
IND_OWNER   IND_NAME           TAB_NAME           PSN       COL_NAME
----------- ------------------ ------------------ --------- --------------
SYS         I_OBJ#_INTCOL#     C_OBJ#_INTCOL#     1         OBJ#
                                                  2         INTCOL#
2 rows selected.
Elapsed: 00:00:00.93

What is HISTGRM$? Do you think it looks like it could be something to do with column Histograms? Well it is, it’s the underlying data dictionary table that holds column histograms. It started to become common to see a large HISTGRM$ table with Oracle 9 systems with many partitions {and of course subpartitions, if used}, where statistics were gathered on these partitioned tables aggressively and included a METHOD_OPT clause that causes column statistics to be collected. eg FOR ALL INDEXED COLUMNS SIZE 200 or FOR ALL COLUMNS SIZE 100.

Where most people have seen it balloon in size though is with Oracle 10 and the introduction of…the Automated Stats gathering job {so, there is a connection to previous recent posts 🙂 }.
Every night, every weekend, any partition or sub-partiton that is 10% or more different to last time stats were gathered, get stats gathered. The automatic job gathers states with METHOD_OPT=AUTO. Which means your Oracle software decides which columns to gather histogram stats on and how many buckets it collects.

Whoever wrote that part of the stats gathering feature really liked to gather information.

If you have lots of partitions {or sub-partitions} and gather stats with the METHOD_OPT=AUTO, the two factors added together result in a massive HISTGRM$ table and thus a massive cluster.

Is space important? No, of course not, disc acreage is cheap, just buy bigger discs ;-). {That, of course, was a highly questionable suggestion}.
Being serious, I’m personally not too concerend about my SYSTEM tablespace being that big, after all compared to the size of databases that use thousands of partitions, 32GB is usually not a major concern, but I know of people who did not expect their system tablespace to get that big and have issues with simple running out of strage available to them.

I am more concerned about the impact on parse times and general data dictionary perforance of such large objects. I’ve found the data dictionary can be very slow for some queries when there are large numbers of extents, objects, columns…Histograms are just in there with the mix. I’ve had to work around such slow data dictionary access several times in the past and I suspect {though have not had time to in any way support this with evidence} that parsing is slowed down by unneccessary histograms. As an example of this, which actually has nothing to do with SYS.HISTGRM$, if you look back up this post at the code listing the indexes on the cluster and table, the code took between a second and two seconds to run. Normally it takes less than 1/10 of a second.

Can you do anything about the size of these objects? Well, with SYS.AUD$ Oracle allow you to delete from it and move the table to a new tablespace. If you use the SQL AUDIT command, you should include housekeeping the SYS.AUD$ table in your overall implementation.

SYS.C_OBJ#_INTCOL# and SYS.HISTGRM in it are slightly trickier. It is not easy to reduce the space used by these objects and thus the SYSTEM tablespace once they have grown large, but you can stop them getting bigger by collecting fewer column histogram statistics on your partitioned tables. Obviously that is not as simple as it sounds, but many sites with large numbers of objects do turn off the automated stats gathering and run their own code.