Fanr


Creating a Performance Baseline - Part 1

You'll often hear that you should monitor the performance of SQL Server. You may read a little about performance monitoring, and you may turn on a few counters or perform a query against a dynamic management view that you know about. But, you may still wonder "Are these numbers good or bad?"

To determine if something is bad, you need to know what it looks like when it is good. Sounds obvious doesn't it? By creating a performance baseline, you can learn what your numbers are when your system is performing well. A performance baseline includes a single performance chart that is accompanied by an interpretation of the results, based on your environment.

To establish your performance baseline against Workforce Central, you'll need to find a time when the performance of your SQL Server environment is considered normal. For example, no users are complaining about slow responses, no backups or large jobs are running, and no "special" processing is taking place. Once you find that time, you'll need to collect a range of Windows Performance Monitor (perfmon) counters, information from dynamic management views, and maybe even a small SQL Server Profiler trace. Then, you can use the results of your collection as the starting point for subsequent performance collections. How do the new numbers compare to the baseline numbers, when everything was fine? Did one counter go up or down? Did several numbers change? Having something to compare the current numbers with can help you identify the source of new performance bottlenecks.
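The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not a monitoring tool: the function name compare_to_baseline and all of the sample values are assumptions made up for the example, and the counter averages are assumed to have been collected already.

```python
def compare_to_baseline(baseline, current):
    """Return the percent change of each counter relative to its baseline value."""
    changes = {}
    for name, base_value in baseline.items():
        if name not in current or base_value == 0:
            continue  # skip counters that are missing or cannot be expressed as a ratio
        changes[name] = (current[name] - base_value) / base_value * 100.0
    return changes


# Hypothetical averages, for illustration only
baseline = {"Avg. Disk sec/Read": 0.008, "% Processor Time": 40.0, "Page life expectancy": 1200.0}
current = {"Avg. Disk sec/Read": 0.024, "% Processor Time": 42.0, "Page life expectancy": 300.0}

changes = compare_to_baseline(baseline, current)
# Print the counters that moved the most first
for name, pct in sorted(changes.items(), key=lambda item: -abs(item[1])):
    print(f"{name}: {pct:+.1f}% vs. baseline")
```

Sorting by the size of the change puts the counters that moved the most at the top, which is usually where to start looking.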

What Should You Monitor?

The actual counters, dynamic management views, or SQL Server Profiler trace events that you should collect are based on your system setup. But, the counters that we list below are a good place to start. If you capture these counters, you should have enough information to determine if you are having a performance issue—and if you are having an issue, which area is the source.

Note: Many of the counters listed below include a threshold. These thresholds are not set in stone, and your actual values may differ. Treat a standard threshold as a starting point: if your values run a little higher or a little lower while the system is performing well, the values that you see during your performance baseline collection become your new thresholds.

Monitoring the Disk Subsystem

There are several methods to monitor the disk subsystem. Since the disk subsystem is getting more and more complex each year, we recommend that database administrators monitor the following Performance Monitor counters to understand the latency of their disk I/O requests.

These two counters should provide you with enough information to determine if you have disk I/O bottlenecks. You should monitor these counters at both the host layer and the virtual server layer.

As with all performance monitoring, you should monitor on a continuous basis and then report daily or hourly, as your situation demands. This prevents nonpeak hours from watering down the peak-hour readings and distorting the performance measurements.
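To see why the reporting interval matters, here is a small Python sketch that averages the same samples once over the whole period and once per hour. The function name hourly_averages and the sample values are hypothetical. In practice the samples would come from a Performance Monitor log.

```python
from collections import defaultdict
from datetime import datetime


def hourly_averages(samples):
    """Group (timestamp, value) samples by hour and average each group."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return {hour: sum(values) / len(values) for hour, values in sorted(buckets.items())}


# Hypothetical Avg. Disk sec/Read samples (in seconds): a quiet hour and a busy hour
samples = [
    (datetime(2010, 3, 23, 3, 0), 0.004),
    (datetime(2010, 3, 23, 3, 30), 0.006),
    (datetime(2010, 3, 23, 14, 0), 0.030),
    (datetime(2010, 3, 23, 14, 30), 0.050),
]

overall = sum(value for _, value in samples) / len(samples)
per_hour = hourly_averages(samples)

print(f"whole-period average: {overall * 1000:.1f} ms")
for hour, average in per_hour.items():
    print(f"{hour:%Y-%m-%d %H}:00 average: {average * 1000:.1f} ms")
```

With these made-up numbers, the whole-period average sits well below the busy-hour average, so a single figure for the period would understate the latency that users saw at peak.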

LogicalDisk(*): Avg. Disk Sec/Read.

This counter measures the average time, in seconds, of a read of data from the disk. The thresholds below are in milliseconds, so a counter value of 0.010 corresponds to 10 ms.

Thresholds:

· Less than 10 milliseconds (ms) = very good

· Between 10 and 20 ms = okay

· Between 20 and 50 ms = slow, needs attention

· Greater than 50 ms = serious I/O bottleneck
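The thresholds above translate directly into a small lookup. The Python sketch below is illustrative only: the function name classify_disk_latency is made up, and because the list does not say which band the exact boundaries (10, 20, and 50 ms) belong to, assigning them to the better band is an assumption. The same bands apply to Avg. Disk Sec/Write.

```python
def classify_disk_latency(avg_disk_sec):
    """Map an Avg. Disk Sec/Read or Sec/Write sample (in seconds) to a rating."""
    latency_ms = avg_disk_sec * 1000.0  # the counter reports seconds; the thresholds are in ms
    if latency_ms < 10:
        return "very good"
    if latency_ms <= 20:  # assumption: exactly 20 ms still counts as okay
        return "okay"
    if latency_ms <= 50:  # assumption: exactly 50 ms still counts as slow
        return "slow, needs attention"
    return "serious I/O bottleneck"


# Hypothetical samples, in seconds
for sample in (0.004, 0.015, 0.035, 0.080):
    print(f"{sample * 1000:.0f} ms -> {classify_disk_latency(sample)}")
```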

Note that Avg. Disk Sec/Read is a server-wide counter and therefore cannot be obtained through SQL Server 2005 dynamic management views.

LogicalDisk(*): Avg. Disk Sec/Write.

This counter measures the average time, in seconds, of a write of data to the disk. The thresholds below are in milliseconds, so a counter value of 0.010 corresponds to 10 ms.

Thresholds:

· Less than 10 ms = very good

· Between 10 and 20 ms = okay

· Between 20 and 50 ms = slow, needs attention

· Greater than 50 ms = serious I/O bottleneck

Note that Avg. Disk Sec/Write is a server-wide counter and therefore cannot be obtained through SQL Server 2005 dynamic management views.

Monitoring the CPU

We also recommend that database administrators monitor the following counters to understand the utilization of the host and virtual server CPU resources.

These three counters should provide you with enough information to determine if you have CPU bottlenecks. You should monitor these counters at both the host layer and the virtual server layer.

As with all performance monitoring, you should monitor on a continuous basis and then report less frequently—either daily or hourly. This gives you a more accurate picture of your performance.

System: Context Switches/sec (below 1,000 per processor).

A high rate of context switching indicates resource queuing. In environments that have high rates of context switching, you should be careful to limit the applications and services that are placed on the boxes that are displaying the high context switch rates. The average value for this counter should remain below 1,000 per processor. Higher rates indicate that there may be CPU pressure, but this value should not be used as a single indicator of CPU pressure.
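Because the System counter is reported for the whole machine while the threshold is per processor, the average has to be divided by the processor count before comparing. A minimal Python sketch, with hypothetical function names and sample values:

```python
def context_switches_per_processor(avg_switches_per_sec, processor_count):
    """Normalize the system-wide context switch rate by the number of processors."""
    if processor_count <= 0:
        raise ValueError("processor_count must be positive")
    return avg_switches_per_sec / processor_count


def exceeds_context_switch_threshold(avg_switches_per_sec, processor_count, threshold=1000):
    """True when the per-processor average is above the threshold (1,000 by default)."""
    return context_switches_per_processor(avg_switches_per_sec, processor_count) > threshold


# Hypothetical averages on an 8-processor server
print(exceeds_context_switch_threshold(12000, 8))  # 1,500 per processor
print(exceeds_context_switch_threshold(6000, 8))   # 750 per processor
```

As the text says, treat a True result as a prompt to look at the other CPU counters, not as proof of CPU pressure on its own.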

To use SQL Server dynamic management views instead of Performance Monitor for this counter, run the following query.

 

-- Will work against databases in 80 compatibility mode
-- context_switches_count is a running total per scheduler, not a per-second rate
SELECT 'Context Switching by Scheduler' AS report_name;

SELECT cpu_id                 -- ID of CPU if affinity mask in use (255 otherwise)
      ,is_online              -- Whether being used by SQL Server
      ,context_switches_count
      ,current_tasks_count
      ,current_workers_count
      ,active_workers_count
      ,pending_disk_io_count
FROM sys.dm_os_schedulers;    -- the trailing ",*" was dropped because it repeated every column above
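The context_switches_count column is a running total, so a single run of the query does not give a rate. One way to get a rate is to run the query twice and divide the difference by the elapsed time. The Python sketch below assumes that the two result sets have already been loaded into dictionaries keyed by cpu_id. The function name and the numbers are hypothetical. Note that these figures cover only SQL Server schedulers, so they are not directly comparable to the system-wide Performance Monitor counter.

```python
def context_switch_rates(first_snapshot, second_snapshot, elapsed_seconds):
    """Return context switches per second for each cpu_id between two snapshots."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    rates = {}
    for cpu_id, later_count in second_snapshot.items():
        earlier_count = first_snapshot.get(cpu_id)
        if earlier_count is None or later_count < earlier_count:
            continue  # scheduler not in the first snapshot, or the total was reset
        rates[cpu_id] = (later_count - earlier_count) / elapsed_seconds
    return rates


# Hypothetical context_switches_count values taken 60 seconds apart
first = {0: 1000, 1: 5000}
second = {0: 31000, 1: 5600}
rates = context_switch_rates(first, second, 60)
print(rates)
```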

Processor(_Total): % Privileged Time (below 10 percent).

This counter measures the percentage of elapsed time that the process threads spent executing code in privileged mode. Average values above 10 percent indicate possible CPU pressure.

You should capture this counter for each individual processor.

Note that Privileged Time is a server-wide counter and therefore cannot be obtained through SQL Server 2005 dynamic management views.

Processor(_Total): % Processor Time (below 80 percent).

This counter measures the percentage of elapsed time that all process threads used the processor to execute instructions. Average values above 80 percent indicate possible CPU pressure.

You should capture this counter for each individual processor.

Note that Processor Time is a server-wide counter and therefore cannot be obtained through SQL Server 2005 dynamic management views.
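Since both processor counters are captured per processor, the two limits can be checked together. The following Python sketch is an illustration under assumptions: the function name, the input layout (processor instance mapped to a pair of averages), and the sample values are all made up.

```python
PRIVILEGED_TIME_LIMIT = 10.0  # percent, from the text above
PROCESSOR_TIME_LIMIT = 80.0   # percent, from the text above


def cpu_pressure_flags(per_processor_averages):
    """Return {processor: [counters whose average exceeds its limit]}."""
    flags = {}
    for processor, (privileged_pct, processor_pct) in per_processor_averages.items():
        exceeded = []
        if privileged_pct > PRIVILEGED_TIME_LIMIT:
            exceeded.append("Privileged Time")
        if processor_pct > PROCESSOR_TIME_LIMIT:
            exceeded.append("Processor Time")
        if exceeded:
            flags[processor] = exceeded
    return flags


# Hypothetical averages: processor instance -> (Privileged Time %, Processor Time %)
averages = {"0": (4.0, 55.0), "1": (12.5, 91.0), "2": (11.0, 60.0)}
flags = cpu_pressure_flags(averages)
print(flags)
```

Processors that stay within both limits are left out of the result, so an empty dictionary means that nothing needs attention.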

To Be Continued

We will continue to talk about performance baseline and tracking memory usage tomorrow. Sign up for our RSS feed so you know when the blog post is published.

Creating a Performance Baseline - Part 2

This is a continuation of yesterday's post. Today's topic is how to get a performance baseline for SQL Server's memory usage.

Why Do You Need a Memory Usage Performance Baseline?

We are going to show you how to monitor memory usage. Without a baseline, you won't know if SQL Server is using more or less memory than it should to run at full throttle.

Monitoring Memory

Database platforms are designed to consume as much memory as you will allow them to consume. This means that it is often critical to monitor memory use when you deal with SQL Server. Luckily, the SQL Server development team has included multiple methods to monitor memory for an instance of SQL Server.

We recommend that you watch the following SQL Server and memory metrics to understand the utilization of the virtual server memory resources:

· SQL Server: Buffer Manager: Page Life Expectancy

· Memory: Pages/sec

· SQL Server: Buffer Manager: Free Pages

· Memory: Available Mbytes

· Memory: Free System Page Table Entries

· SQL Server: Memory Manager: Memory Grants Pending

· Paging File(_Total): % Usage

You should monitor these counters at the virtual server layer.

Important: As with all performance monitoring, you will need to monitor on a continuous basis, but you should record your measurements in small time blocks, such as daily or hourly. This approach will prevent nonpeak hours from watering down the peak-hour readings and giving you a false view of performance.

SQL Server: Buffer Manager: Page Life Expectancy.

Page life expectancy is the number of seconds a page will stay in the buffer pool without references. Values below 300 indicate possible memory pressure on the server.

In SQL Server 2005 and SQL Server 2008, the sys.dm_os_performance_counters dynamic management view returns the current value for the page life expectancy.

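-- Page Life Expectancy
-- Will work against databases in 80 compatibility mode
-- Filter on the Buffer Manager object so that per-node (Buffer Node) rows are excluded
SELECT *
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page life expectancy'
AND [object_name] LIKE '%Buffer Manager%'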

Memory: Pages/sec.

The Pages/sec counter shows the rate at which pages are read from or written to disk to resolve hard page faults. This counter is a primary indicator of the kinds of faults that cause system-wide delays. It is the sum of Memory\Pages Input/sec and Memory\Pages Output/sec. It is counted in numbers of pages, so it can be compared to other counts of pages, such as Memory\Page Faults/sec, without conversion. It includes pages that are retrieved to satisfy faults in the file system cache (usually requested by applications) and noncached mapped memory files.

A high value here (greater than 20) means that hard page faults (in which the operating system goes to the disk to resolve memory) are costing disk I/O and CPU resources. To resolve this problem, add more RAM, or remove other applications from the database server. Alternatively, limit the amount of memory that is available to SQL Server. Note that some paging will usually be present because of the way that the operating system works.

An average value of around 5 should be normal.

Pages/sec is a server-wide counter, so you can't obtain it through SQL Server 2005 dynamic management views.

SQL Server: Buffer Manager: Free Pages.

The Free Pages counter shows the total number of free pages on all free lists. Minimum values below 640 indicate memory pressure.

In SQL Server 2005 and SQL Server 2008, the sys.dm_os_performance_counters dynamic management view returns the current value for Free Pages.

-- Free Pages
-- Will work against databases in 80 compatibility mode
SELECT *
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Free pages'
AND [object_name] LIKE '%Buffer Manager%' -- matches the stored casing, so it also works under a case-sensitive collation

Memory: Available Mbytes.

The Available Mbytes counter shows the amount of memory, in megabytes, that is available to processes that are running on the computer (Memory\Available Bytes reports the same quantity in bytes). This value should remain above 128. If the value falls below 128 and the system is also paging, memory pressure may be a problem. To check for paging, use the Paging File(_Total): % Usage performance counter.

Available Mbytes is a server-wide counter, so you can't obtain it through SQL Server 2005 dynamic management views.

Memory: Free System Page Table Entries.

The Free System Page Table Entries counter shows the number of page table entries that are not currently in use by the system. This counter indicates memory pressure if it falls below 3,000. Typically, for systems that are not using the /3GB switch, this counter should be between 80,000 and 140,000. For systems that are using the /3GB switch, this counter should be about 15,000.

Free System Page Table Entries is a server-wide counter, so you can't obtain it through SQL Server 2005 dynamic management views.

SQL Server: Memory Manager: Memory Grants Pending.

The Memory Grants Pending counter shows the current number of processes that are waiting for a workspace memory grant. This value should remain around 0.

In SQL Server 2005 and SQL Server 2008, the sys.dm_os_performance_counters dynamic management view returns the current value for Memory Grants Pending.

-- Memory Grants Pending
-- Will work against databases in 80 compatibility mode
SELECT *
FROM sys.dm_os_performance_counters
WHERE counter_name LIKE 'Memory Grants Pending%'

Paging File(_Total): % Usage.

The Paging File(_Total): % Usage counter shows the percentage of the page file that is in use. This value should be below 70 percent; values over 70 percent indicate memory pressure. The use of the paging file with SQL Server indicates that the Lock pages in memory policy setting has not been enabled for the SQL Server service account.

Paging file usage is a server-wide counter, so you can't obtain it through SQL Server 2005 dynamic management views.

Metrics Reference Baseline

So where is your baseline? Watch these metrics over a period of time, and see where your deployment hangs out most often. You want the values to be in the target measurement range that we show in the following table. If they are not, it's a good indication that you need to tweak something to squeeze the best performance from Workforce Central on SQL Server.

Metric                                               Target measurement
---------------------------------------------------  --------------------
SQL Server: Buffer Manager: Page Life Expectancy     Greater than 300
Memory: Pages/sec                                    Less than 20
SQL Server: Buffer Manager: Free Pages               Greater than 640
Memory: Available Mbytes                             Greater than 128
Memory: Free System Page Table Entries               Greater than 3,000
SQL Server: Memory Manager: Memory Grants Pending    0
Paging File(_Total): % Usage                         Less than 70 percent
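The targets in the table can be written down as data and checked against whatever averages you collect. The Python sketch below is a minimal illustration: the dictionary keys, the function name out_of_range, and the observed values are hypothetical, and only the comparison operators and target numbers come from the table.

```python
import operator

# Metric -> (comparison that must hold, target value), taken from the table above
TARGETS = {
    "Page Life Expectancy": (operator.gt, 300),
    "Pages/sec": (operator.lt, 20),
    "Free Pages": (operator.gt, 640),
    "Available Mbytes": (operator.gt, 128),
    "Free System Page Table Entries": (operator.gt, 3000),
    "Memory Grants Pending": (operator.eq, 0),
    "Paging File % Usage": (operator.lt, 70),
}


def out_of_range(observed):
    """Return the metrics in observed that miss their target, in table order."""
    return [
        name
        for name, (meets_target, target) in TARGETS.items()
        if name in observed and not meets_target(observed[name], target)
    ]


# Hypothetical averages from one collection period
observed = {
    "Page Life Expectancy": 250,
    "Pages/sec": 5,
    "Free Pages": 2000,
    "Available Mbytes": 512,
    "Memory Grants Pending": 2,
    "Paging File % Usage": 35,
}
problems = out_of_range(observed)
print(problems)
```

Metrics that were not collected are skipped, not reported as failures, so a missing counter does not look like a problem.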

from http://blogs.msdn.com/b/kronos/archive/2010/03/23/creating-a-performance-baseline-part-1.aspx
posted on 2011-04-09 14:04 by Fanr_Zh