Hardware Monitoring & Troubleshooting: take care of the system cache
I just read in the SAP Performance Optimization Guide that when we do hardware monitoring and optimization, we should keep an eye on the ‘system cache’. I couldn’t find the configuration path the book describes on our internal server, so I finally asked Google for help. The following interesting story really helped me understand why we should always keep in mind that the system cache might be the troublemaker.
Limiting system cache size in Windows Server 2003 By Kelvin Tan
On a consulting gig, I was recently asked to investigate a strange problem with a Lucene server on Windows Server 2003.
The Lucene index was periodically refreshed by running a new instance of the app, then killing the old one via “taskkill”. Worked fine, except the available memory displayed by Task Manager somehow steadily decreased with every app refresh, and it would run out of memory every 2-3 days. However, just by killing _all_ java processes, the memory would be magically reclaimed and available memory would immediately jump up to the correct amount.
Well, it turns out that the culprit was Windows’ file system cache, and the default Windows Server 2003 settings, which give priority to the System Cache over application processes. So we ended up with a huge file system cache and not enough memory to start a new application process.
Here are some links which were helpful in troubleshooting this problem, and its eventual solution.
Initially I was trying to find where the missing memory went, since the “Available physical memory” value didn’t match process totals given to me by the “Processes” tab.
Everything you always wanted to know about Task Manager but were afraid to ask helped me determine that “Commit Charge” was what I was really interested in, and that value did indeed match process totals. So it wasn’t some memory leak then.
After taking another look at Task Manager, I realized the biggest culprit was System Cache.
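The reasoning in the last few paragraphs can be sketched as a small calculation. This is only an illustration: the function name and every number below are hypothetical, not readings from the server in the story. It shows how much in-use physical memory is left unexplained once the per-process totals are subtracted, which is the gap that pointed to the System Cache.

```python
def unaccounted_mb(total_mb, available_mb, process_mb):
    """Return in-use physical memory (MB) that no listed process explains.

    All inputs are values read by hand from Task Manager; the names
    are hypothetical and chosen for this sketch only.
    """
    in_use = total_mb - available_mb   # what the machine says is used
    return in_use - sum(process_mb)    # minus what the processes claim


# Hypothetical readings: 4 GB machine, 300 MB available,
# three processes using 900, 400 and 250 MB.
gap = unaccounted_mb(4096, 300, [900, 400, 250])
print(gap)  # 2246
```

If a gap like this is large and roughly matches the System Cache figure in Task Manager, the cache rather than a leaking process is the likely owner of the missing memory.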
http://smallvoid.com/article/winnt-system-cache.html gives an overview of the Windows system cache, and, in particular, lists these tools:
- SysInternals CacheSet - It can only reset the cache to a certain size from where it can grow or shrink again.
- Uwe Sieber - NtCacheSet - Periodically resets the cache working set, just like CacheSet but does it with a specified interval.
- Uwe Sieber - SetSystemFileCacheSize - Sets a permanent upper limit for the file cache using the SetSystemFileCacheSize API (Win2k3 only).
Both CacheSet and SetSystemFileCacheSize worked in setting an upper limit on the file cache size, and that solved our problem of the missing memory.
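For readers who want to see roughly what such a tool does, here is a minimal sketch of calling the Win32 SetSystemFileCacheSize function from Python via ctypes. It is not the code of either tool mentioned above. The helper names and the 64 MB / 512 MB sizes are assumptions made for illustration, and the call is expected to fail unless the process runs with sufficient rights (as far as I know, the SE_INCREASE_QUOTA_NAME privilege must be enabled, which this sketch does not do).

```python
import ctypes
import sys

# Flag value as defined in the Windows SDK headers.
FILE_CACHE_MAX_HARD_ENABLE = 0x1

MB = 1024 * 1024


def cache_limit_args(min_mb, max_mb):
    """Build (min_bytes, max_bytes, flags) for a hard upper cache limit."""
    if not 0 < min_mb <= max_mb:
        raise ValueError("need 0 < min_mb <= max_mb")
    return min_mb * MB, max_mb * MB, FILE_CACHE_MAX_HARD_ENABLE


def set_file_cache_limit(min_mb, max_mb):
    """Ask Windows to cap the system file cache (hypothetical helper)."""
    if sys.platform != "win32":
        raise OSError("SetSystemFileCacheSize is only available on Windows")
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    fn = kernel32.SetSystemFileCacheSize
    # BOOL SetSystemFileCacheSize(SIZE_T min, SIZE_T max, DWORD flags)
    fn.argtypes = [ctypes.c_size_t, ctypes.c_size_t, ctypes.c_ulong]
    fn.restype = ctypes.c_int
    lo, hi, flags = cache_limit_args(min_mb, max_mb)
    if not fn(lo, hi, flags):
        # Surface the Win32 error (e.g. a missing privilege) to the caller.
        raise ctypes.WinError(ctypes.get_last_error())


if __name__ == "__main__" and sys.platform == "win32":
    # Hypothetical sizes: keep at least 64 MB, never exceed 512 MB.
    set_file_cache_limit(64, 512)
```

The sizes should be chosen per server; a limit that is too low may hurt file I/O performance, so treat the numbers above as placeholders.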
Raymond Zhang
If you would like to discuss any of these ideas with me, please contact me at raymond.zhang@sap.com