[zz]Driver Verifier Flags
http://www.osronline.com/article.cfm?article=325
Special Pool
By enabling the Special Pool option, you enable two safeguards for one of the most insidious types of driver error: memory corruption.
The first set of potentially memory corrupting errors that this option will catch is buffer overruns: accessing memory past the end of a valid address range. Driver Verifier catches these by adding what are called "guard pages" to the tail of every allocation that the driver makes. Driver Verifier then marks these pages as "no access" so that an access violation will occur if the driver touches them. If the access violation does trigger, Verifier traps it and bugchecks the system in a more controlled way than usual. By that we mean the bugcheck code will be very explicit about the error and the stack trace will pinpoint the offending code exactly. This is important, because it is very common for a driver that is writing to a random location off the end of its buffer to corrupt another driver in the system. When a situation like that happens, the system will bugcheck and typically blame the wrong driver. These types of blue screens are extremely hard to debug and even harder to explain to your customers. Note that according to the DDK documentation, you can alternatively use the GFlags utility to have the guard pages added to the head of the allocations instead of the tail. This would allow you to catch buffer underrun errors (accessing memory before a valid address range), which are less common.
The other set of potentially memory corrupting errors that Special Pool will catch is accesses to memory after it has been freed. This is another problem that is particularly tricky to track down in the wild, because it can easily go undetected for long periods of time. It generally only causes a problem if the system is under heavy load and the address is quickly recycled to another driver (or even the same driver!) in the system. Driver Verifier plays a pretty cool trick in order to catch these errors. It frees the memory that is backing the allocation, but leaves the virtual-to-physical address mapping (i.e. the PTE) in place and marked as "no access". This means that if the driver then attempts to access the memory, an access violation will occur and the system will bugcheck.
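To make the two Special Pool tricks concrete, here is a rough user-mode analog, assuming a POSIX system with mmap and mprotect. It is not how Driver Verifier is implemented; the names guarded_block, guarded_alloc and guarded_free are hypothetical and exist only for this sketch. The buffer is placed so that it ends exactly where an inaccessible guard page begins, and "freeing" it leaves the address range reserved but inaccessible.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical bookkeeping for one guarded allocation. */
typedef struct {
    void *base;           /* start of the whole mapping              */
    size_t map_len;       /* data pages plus one trailing guard page */
    unsigned char *user;  /* pointer handed to the caller            */
    size_t size;          /* number of bytes the caller asked for    */
} guarded_block;

/* Allocates `size` bytes that end right at a no-access guard page.
 * Returns 0 on success, -1 on failure. The returned pointer may be
 * unaligned; this sketch ignores alignment to keep the tail exact. */
int guarded_alloc(guarded_block *blk, size_t size)
{
    assert(blk != NULL && size > 0);
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t data_len = (size + page - 1) / page * page;

    unsigned char *base = mmap(NULL, data_len + page, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return -1;

    /* The page after the data becomes the guard page. */
    if (mprotect(base + data_len, page, PROT_NONE) != 0) {
        munmap(base, data_len + page);
        return -1;
    }

    blk->base = base;
    blk->map_len = data_len + page;
    blk->user = base + data_len - size;  /* last byte touches the guard page */
    blk->size = size;
    return 0;
}

/* "Frees" the block by making the whole range no-access instead of
 * unmapping it, so a later stale access faults rather than silently
 * landing in recycled memory. Returns 0 on success. */
int guarded_free(guarded_block *blk)
{
    assert(blk != NULL);
    return mprotect(blk->base, blk->map_len, PROT_NONE);
}
```

In this sketch, writing to blk.user[blk.size] or touching the buffer after guarded_free would raise an access fault at the offending instruction, which is the user-mode counterpart of the immediate bugcheck described above. Unlike Verifier, the sketch does not release the backing memory on free.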
Special Pool is not a magic bullet for a couple of reasons though. First of all, it will not catch stray pointer accesses that point to valid allocations. It is such a common practice for one component in the system to allocate memory and pass it for use in another component that checking for something like this would be impossible. Also, as has been previously reported in The NT Insider, when enabling Special Pool for your driver, your pool allocation tags are not preserved. This means that if you are trying to track down memory leak issues, it’s probably best to not test with Special Pool enabled.
Pool Tracking
The Pool Tracking option enables one check that is similar to the Special Pool overrun check and another to track resource cleanup.
The overrun check in Pool Tracking does essentially what the Special Pool check does: it adds a page to the tail of memory allocations. The difference is that the guard pages are not marked as "no access." Instead, they are filled in with a particular pattern. When the memory block is freed, the pattern is checked and, if it has been modified, the system bugchecks. This is slightly less helpful than the Special Pool option because it only catches the corruption after the fact, making it more difficult to find the true source.
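The pattern-at-the-tail idea can be sketched in ordinary user-mode C. This is only an illustration under our own assumptions: the names tracked_alloc and tracked_free, the 16-byte tail and the 0xA5 fill value are hypothetical and are not what Driver Verifier actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical values for this sketch only. */
#define TAIL_BYTES 16
#define TAIL_FILL  0xA5

/* Header kept in front of the caller's buffer so the free path knows
 * where the tail starts. The union keeps the buffer suitably aligned. */
typedef union {
    size_t size;
    max_align_t align;
} tracked_header;

/* Allocates `size` usable bytes followed by a patterned tail. */
void *tracked_alloc(size_t size)
{
    unsigned char *raw = malloc(sizeof(tracked_header) + size + TAIL_BYTES);
    if (raw == NULL)
        return NULL;
    ((tracked_header *)raw)->size = size;
    unsigned char *user = raw + sizeof(tracked_header);
    memset(user + size, TAIL_FILL, TAIL_BYTES);
    return user;
}

/* Frees the block and reports whether the tail pattern survived:
 * 1 if intact, 0 if something wrote past the end of the buffer. */
int tracked_free(void *ptr)
{
    assert(ptr != NULL);
    unsigned char *user = ptr;
    tracked_header *hdr = (tracked_header *)(user - sizeof(tracked_header));
    int intact = 1;
    for (size_t i = 0; i < TAIL_BYTES; i++) {
        if (user[hdr->size + i] != TAIL_FILL) {
            intact = 0;
            break;
        }
    }
    free(hdr);
    return intact;
}
```

Note that the overrun is only noticed inside tracked_free, long after the bad write happened. That is exactly the weakness described above: you learn that corruption occurred, but not which line of code caused it.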
The other check that Pool Tracking enables concerns driver unloading. When the driver is unloaded, Pool Tracking makes sure that all of the resources allocated by the driver have been freed. If the driver is unloaded and it has not freed all of its memory resources, the system bugchecks and indicates how much memory has been leaked. Further, if pool tagging has been enabled, the pool tags of the leaked memory allocations are also indicated. This option is extremely helpful if your driver supports being unloaded, but if you are a file system driver, for example, this check does not provide any additional help.
Force IRQL Checking
We always stress in our classes that you cannot write a driver if you do not understand IRQLs. If you spend a few minutes browsing the NTDEV and NTFSD newsgroups, it will quickly become obvious that not everyone has taken an OSR seminar. But even if you know the rules like the back of your hand, you still need to obey them, and the Force IRQL Checking option can help you do just that.
Force IRQL Checking enforces the number one IRQL rule: you must not touch any pageable memory at IRQL DISPATCH_LEVEL or above. The reason for this, of course, is that if the pageable memory happens to not be resident, a DISPATCH_LEVEL software interrupt must be executed to bring the page into memory. If the code that is currently running is already at DISPATCH_LEVEL or above, the DISPATCH_LEVEL interrupt cannot run and the page fault cannot be satisfied. Because the Memory Manager aggressively caches pages, it is entirely possible that this bug will go unnoticed during your testing because the pages have already been faulted in at an earlier time by a thread running at a proper IRQL.
The way that Driver Verifier enforces the pageable memory and IRQL rule is by paging out all pageable memory after every IRQL raised to DISPATCH_LEVEL or above. This ensures that all accesses to memory regions marked as pageable at an elevated IRQL generate a DRIVER_IRQL_NOT_LESS_OR_EQUAL bugcheck.
I/O Verification
I/O Verification gets broken down into two creatively named levels: Level 1 and Level 2. On Windows XP you always get both Level 1 and Level 2 when you select I/O Verification from the Driver Verifier GUI, but on Windows 2000, Level 2 must be explicitly enabled (see the DDK docs for details on how to do this).
When Level 1 I/O Verification is enabled, all IRPs are allocated out of special pool, which is helpful in catching some common errors (if you’ve ever tried to fill in the current stack location of an IRP that you’ve allocated, then you definitely want to flip on Level 1 I/O Verification). Other Level 1 checks include:
- Calling IoCompleteRequest on an IRP with a cancel routine still set
- Calling IoCallDriver from a dispatch routine at a different IRQL than you were called at
- Calling IoCallDriver with an invalid device object
Level 2 I/O Verification expands upon Level 1 I/O Verification with one difference: If a kernel debugger is attached, Level 2 I/O Verifications will not bugcheck the system. Instead, an ASSERT is issued with a detailed description of the error and, in some cases, even a URL where you can get more information. If you choose to ignore these errors, the machine will continue to run, potentially giving you the ability to fix your code and reload your driver without a reboot. This is quite a convenience, to say the least. Also, Level 2 I/O Verification comprises over fifty I/O checks. Here are some good ones:
- Calling IoCallDriver on an IRP with a cancel routine still set
- Deleting a device that is attached to a lower device without first calling IoDetachDevice
- Completing IRP_MJ_PNP requests that you don’t handle, instead of passing them down
- Manually copying a stack location instead of using IoCopyCurrentIrpStackLocationToNext and not clearing the upper driver’s completion routine
Enhanced I/O Verification
Enhanced I/O Verification is a feature added to Driver Verifier in Windows XP to add to the laundry list of I/O checks done by Driver Verifier. These checks are reported in the same way as the Level 2 I/O Verifications in that they appear as ASSERTs when a kernel debugger is attached and can be ignored without bug checking the system.
Does the golden rule we violated in Bug #2, "If you mark the IRP as pending you must return STATUS_PENDING," sometimes escape you? If so, Enhanced I/O Verification is your friend as it monitors your IRPs and ensures that you follow this rule. Another neat trick that this option enables is mixing up the PnP load order of devices in the system. This ensures that just because driver A starts before driver B on every system you’ve run your driver on, you don’t code to that fact.
This is also the option that will trap Bug #1, by sending bogus PnP, Power, and WMI IRPs to your stack to check for proper processing of IRPs of each type.
Deadlock Detection
Deadlock Detection is another Driver Verifier option that was added to Windows XP. Enabling it causes Verifier to track all of your driver’s acquires and releases of spinlocks, mutexes and fast mutexes and ensures that a locking hierarchy is in place and is followed. An interesting thing to note here is that Deadlock Detection is constantly monitoring your acquires and releases, and building a large graph of the use of your locks throughout the driver. If it finds a potential deadlock condition, it will bugcheck the system. What this means is that your code, as written, may never hit a deadlock, but if there’s a potential for it, the system will still bugcheck.
The thinking here is that you should have a locking hierarchy in place and always follow it, even if in some places you "know better." This provides a more robust code base that is less prone to develop locking issues in the future.
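The idea of flagging a potential deadlock that never actually happened is easier to see in code. Below is a minimal single-threaded sketch of a lock-order graph, written under our own assumptions: locks are small integers, and the names lock_tracker, tracker_acquire and tracker_release are hypothetical. It is not Verifier's algorithm, just the general technique of recording "A was held while B was acquired" and refusing any later acquire that would close a cycle.

```c
#include <assert.h>
#include <string.h>

#define MAX_LOCKS 8   /* hypothetical limits for this sketch */
#define MAX_HELD  8

typedef struct {
    /* edge[a][b] != 0 means lock a was once held while b was acquired */
    unsigned char edge[MAX_LOCKS][MAX_LOCKS];
    int held[MAX_HELD];   /* locks currently held */
    int held_count;
} lock_tracker;

void tracker_init(lock_tracker *t)
{
    memset(t, 0, sizeof *t);
}

/* Depth-first search: can `to` be reached from `from` in the order graph? */
static int reachable(const lock_tracker *t, int from, int to,
                     unsigned char *seen)
{
    if (from == to)
        return 1;
    seen[from] = 1;
    for (int next = 0; next < MAX_LOCKS; next++) {
        if (t->edge[from][next] && !seen[next] &&
            reachable(t, next, to, seen))
            return 1;
    }
    return 0;
}

/* Records an acquire. Returns 1 if it is consistent with every ordering
 * seen so far, 0 if it contradicts one (a potential deadlock). */
int tracker_acquire(lock_tracker *t, int lock)
{
    assert(lock >= 0 && lock < MAX_LOCKS && t->held_count < MAX_HELD);
    int ok = 1;
    for (int i = 0; i < t->held_count; i++) {
        unsigned char seen[MAX_LOCKS] = {0};
        /* Adding held -> lock closes a cycle if held is already
         * reachable from lock. */
        if (reachable(t, lock, t->held[i], seen))
            ok = 0;
        else
            t->edge[t->held[i]][lock] = 1;
    }
    t->held[t->held_count++] = lock;
    return ok;
}

/* Records a release; the ordering edges are deliberately kept. */
void tracker_release(lock_tracker *t, int lock)
{
    for (int i = 0; i < t->held_count; i++) {
        if (t->held[i] == lock) {
            t->held[i] = t->held[--t->held_count];
            return;
        }
    }
}
```

With this tracker, taking lock 0 then lock 1, releasing both, and later taking lock 1 then lock 0 is reported on the final acquire, even though a single thread running that sequence can never deadlock with itself. The edges survive the releases, which is what lets the graph remember a hierarchy across the whole run.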
If your system does bugcheck due to Deadlock Detection, the !deadlock WinDBG command may be used to get detailed information revealing why the bugcheck occurred.
DMA Checking (a.k.a. DMA Verification a.k.a. HAL Verification)
Also only available in XP and later, DMA Checking enables a wide array of checks that ensure proper use of the DMA APIs. One nice feature that you get with DMA Checking is that it causes all DMA transfers to be double-buffered by Verifier. Though the chances are small that this will discover bugs in your code, it guarantees that your driver will work properly on PAE systems with greater than 4GB of RAM.
An exhaustive list of the checks can be found in the DDK documentation, but here are some of the more interesting ones:
- Catching buffer overruns and underruns on the DMA buffer
- Checking for proper allocation and destruction of adapters, common buffers and scatter gather lists
- Checking for proper use of map registers
- Checking for use of valid DMA buffers (i.e. ensuring they are not NULL or pageable)
The DDK documentation lists over twenty checks that DMA Checking performs on your DMA operations when enabled, so you will, without a doubt, want to enable this option if you’re writing a driver that supports DMA.
The !dma WinDBG extension knows about DMA Checking and can be used to get extended information about DMA adapters currently being verified.
Low Resources Simulation
Low Resources Simulation is the one test that we generally recommend that you do not enable until your final rounds of testing. Enabling this option will result in random failures for memory requests. What your driver does in these situations and how gracefully it must handle them is entirely device and environment specific, but at the very least the system should not bug check because of a NULL pointer dereference.
How your driver should handle low resource conditions is device specific. For most drivers, just checking for NULL and returning an error is sufficient. However, there are several types of drivers that need to be fully capable of handling these situations by falling back to memory that was previously allocated when resources were not scarce. No testing cycle is complete until your driver has proven to not bring down the entire system because of a call to ExAllocatePoolWithTag failing.
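Both strategies can be sketched in user-mode C with a fault-injecting allocator standing in for Low Resources Simulation. Everything here is hypothetical and for illustration only: faulty_alloc, set_fail_every, make_context, emergency_pool and the other names are ours, and the injector fails every Nth request deterministically where Verifier fails requests at random.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fault injector: when fail_every is N > 0, every Nth
 * request fails. A value of 0 disables injection. */
static unsigned fail_every = 0;
static unsigned request_count = 0;

void set_fail_every(unsigned n)
{
    fail_every = n;
    request_count = 0;
}

void *faulty_alloc(size_t size)
{
    request_count++;
    if (fail_every != 0 && request_count % fail_every == 0)
        return NULL;              /* simulated low-resource failure */
    return malloc(size);
}

/* Strategy 1: check for NULL and return an error to the caller. */
int make_context(int **out)
{
    assert(out != NULL);
    int *ctx = faulty_alloc(sizeof *ctx);
    if (ctx == NULL)
        return -1;                /* fail the request, do not dereference */
    *ctx = 0;
    *out = ctx;
    return 0;
}

/* Strategy 2: fall back to memory set aside while resources were plentiful. */
typedef struct {
    unsigned char reserve[64];
    int reserve_in_use;
} emergency_pool;

void *alloc_or_reserve(emergency_pool *pool, size_t size)
{
    void *p = faulty_alloc(size);
    if (p != NULL)
        return p;
    if (!pool->reserve_in_use && size <= sizeof pool->reserve) {
        pool->reserve_in_use = 1;
        return pool->reserve;
    }
    return NULL;                  /* reserve exhausted too */
}

void release_buffer(emergency_pool *pool, void *p)
{
    if (p == (void *)pool->reserve)
        pool->reserve_in_use = 0; /* reserve becomes available again */
    else
        free(p);
}
```

The single 64-byte reserve is deliberately simplistic. A driver that must make forward progress would size its reserve for the worst-case request and serialize access to it, but the shape of the fallback is the same.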
Disk Integrity Checking
Disk Integrity Checking was added to Driver Verifier in Server 2003. If you are working with a driver in the storage stack, this option can be extremely helpful in finding data corruption errors. Every time a sector is read from or written to the disk, this check computes a CRC of the sector and, if the sector has been previously accessed, compares it to the sector’s previous CRC. If the CRCs don’t match, the system will bugcheck. As you can imagine, enabling this option puts a serious strain on the resources of the system, so it should generally not be enabled during day-to-day testing and development.
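As a rough picture of the bookkeeping involved, here is a user-mode sketch of a per-sector CRC table. It rests on several assumptions of ours: a tiny 64-sector disk, a standard bitwise CRC-32, and the interpretation that writes record a new CRC while reads compare against the recorded one. The names crc_table, on_write and on_read are hypothetical, and the source does not say which CRC Verifier actually computes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE  512
#define SECTOR_COUNT 64   /* hypothetical tiny disk for this sketch */

typedef struct {
    uint32_t crc[SECTOR_COUNT];        /* last known CRC per sector        */
    unsigned char seen[SECTOR_COUNT];  /* nonzero once a sector was touched */
} crc_table;

/* Bitwise CRC-32 using the reflected polynomial 0xEDB88320. */
uint32_t crc32_bytes(const unsigned char *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* A write legitimately changes the data, so just record the new CRC. */
void on_write(crc_table *t, unsigned sector, const unsigned char *data)
{
    assert(sector < SECTOR_COUNT);
    t->crc[sector] = crc32_bytes(data, SECTOR_SIZE);
    t->seen[sector] = 1;
}

/* A read must match what was last seen for that sector.
 * Returns 1 if consistent (or first access), 0 on a mismatch. */
int on_read(crc_table *t, unsigned sector, const unsigned char *data)
{
    assert(sector < SECTOR_COUNT);
    uint32_t now = crc32_bytes(data, SECTOR_SIZE);
    if (t->seen[sector] && t->crc[sector] != now)
        return 0;                 /* data changed behind our back */
    t->crc[sector] = now;
    t->seen[sector] = 1;
    return 1;
}
```

Even in this toy form, every single transfer pays for a full pass over the sector data plus a table lookup, which gives some feel for why the option is so expensive on a real system.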
IRP Logging
IRP Logging was also added in Server 2003 and is sort of an oddball Driver Verifier option. What it does is keep a copy of the last twenty IRPs that the driver being verified has received in a circular buffer. You can then extract the information about the last twenty IRPs to a text file by using the DC2WMIParser utility. There doesn’t appear to be anything documented about how to retrieve this info from the debugger, and Driver Verifier usually lets you know that something went wrong by bugchecking the system, so we’re not quite sure that enabling this option is very useful. But it’s there, so check it out and see if it suits any of your needs.
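If you have not worked with one before, a "last twenty" circular buffer looks roughly like the sketch below. The record layout and the names irp_record, irp_log, irp_log_add and irp_log_snapshot are hypothetical; we do not know what Verifier actually stores for each IRP.

```c
#include <assert.h>
#include <stddef.h>

#define LOG_CAPACITY 20

/* Hypothetical record; the real log's contents are not documented here. */
typedef struct {
    unsigned char major_function;
    unsigned long sequence;
} irp_record;

typedef struct {
    irp_record slots[LOG_CAPACITY];
    size_t next;    /* slot the next record will be written to */
    size_t count;   /* number of valid records, at most LOG_CAPACITY */
} irp_log;

/* Adds a record, overwriting the oldest one once the log is full. */
void irp_log_add(irp_log *log, irp_record rec)
{
    assert(log != NULL);
    log->slots[log->next] = rec;
    log->next = (log->next + 1) % LOG_CAPACITY;
    if (log->count < LOG_CAPACITY)
        log->count++;
}

/* Copies the records into `out`, oldest first, and returns how many
 * were copied. `out` must have room for LOG_CAPACITY records. */
size_t irp_log_snapshot(const irp_log *log, irp_record *out)
{
    assert(log != NULL && out != NULL);
    size_t start = (log->next + LOG_CAPACITY - log->count) % LOG_CAPACITY;
    for (size_t i = 0; i < log->count; i++)
        out[i] = log->slots[(start + i) % LOG_CAPACITY];
    return log->count;
}
```

After 25 records numbered 1 through 25 have been added, a snapshot holds records 6 through 25: the first five have been overwritten, which is the trade-off of keeping the log at a fixed size.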