Solving Windows 7 crashes with debugging tools
Tackle that blue screen of death with free Microsoft software
By Dirk A. D. Smith | Network World US | Published: 14:12, 18 April 2011
Configure WinDbg to locate symbols
There is an amazing number of symbol table files for Windows, because every build of the operating system, even a one-off variant, results in a new file. Fortunately, WinDbg can handle this for you, but you must configure it with the correct search path. To do this, launch WinDbg and select the following:
File > Symbol file path
Then enter the following path: (Make sure that your firewall allows access to msdl.microsoft.com)
Note that the address between the asterisks is where you want the symbols stored for future reference. For example, I store the symbols in a folder called symbols at the root of my C drive, thus:
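A path of this kind follows the pattern SRV*local folder*server URL. As a minimal sketch in Python, the helper below assembles such a string for a cache folder at c:\symbols. The function name build_symbol_path is hypothetical, and the server URL is assumed to be Microsoft's public symbol server on msdl.microsoft.com, so check it against current Microsoft documentation before relying on it.

```python
# Sketch: assemble a WinDbg symbol path of the form SRV*<local cache>*<server URL>.
# build_symbol_path is a hypothetical helper, not part of WinDbg.

# Assumption: the public Microsoft symbol server lives at this address.
MS_SYMBOL_SERVER = "https://msdl.microsoft.com/download/symbols"


def build_symbol_path(local_cache: str, server: str = MS_SYMBOL_SERVER) -> str:
    """Return a symbol path that caches downloaded symbols in local_cache."""
    cache = local_cache.strip()  # a stray space in the path is a common typo
    if not cache or "*" in cache:
        raise ValueError("local cache folder must be non-empty and contain no '*'")
    return f"SRV*{cache}*{server}"


if __name__ == "__main__":
    print(build_symbol_path(r"c:\symbols"))
```

The resulting string is what would be pasted into the File > Symbol file path dialogue.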
When opening a memory dump, WinDbg looks at the executable files (.exe, .dll, etc.) and extracts version information. It then sends a request containing this version information to the symbol server at Microsoft, which locates the precise symbol tables to draw information from. It won't download all symbols for the specific operating system you are troubleshooting; it will download only what it needs. Alternatively, you can opt to download and store the complete symbol file from Microsoft.
This, however, will run from about 600MB to nearly 800MB for each version of the operating system you analyse. In contrast, WinDbg downloaded less than 100MB to analyse several versions of the operating system on my test machine. Even with the low cost of hard drives these days, the space saving is significant.
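To make the version-based lookup described above a little more concrete, here is a rough Python sketch of how a symbol store request for one executable is commonly keyed. It assumes the key is built from two fields of the file's PE header, the build timestamp and the image size, which is the layout used by common symbol-store tooling. The function name and the sample header values are hypothetical.

```python
# Sketch: form a symbol-store lookup path for one executable image.
# Assumption: the key is the PE header's TimeDateStamp written as 8 uppercase
# hex digits, followed by SizeOfImage in hex with no padding.


def pe_lookup_path(file_name: str, time_date_stamp: int, size_of_image: int) -> str:
    """Return the relative path of a binary inside a symbol store."""
    key = f"{time_date_stamp:08X}{size_of_image:x}"
    return f"{file_name}/{key}/{file_name}"


if __name__ == "__main__":
    # Hypothetical header values, for illustration only.
    print(pe_lookup_path("ntoskrnl.exe", 0x4CE7951A, 0x5EA000))
```

Because the key changes with every build, two machines at different patch levels request different files, which is why the debugger fetches only what a given dump needs.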
About dump files
A memory dump file is a snapshot of what the system had in memory when it crashed. Though perhaps the least attractive and correspondingly least intuitive thing you are likely ever to look at, it is your best friend when the operating system falls over. Windows creates three different sizes of memory dumps: minidumps, kernel dumps and full dumps.
1. Small or minidump
Windows 7 minidumps are 256 Kbytes, which is tiny by any standard, although they have grown since the Windows 2000/XP days, when they were only 64K. One of the reasons they are so small is that they do not contain any of the binary or executable files that were in memory at the time of the failure. However, those files are critically important for subsequent analysis by the debugger. As long as you are debugging on the machine that created the dump file, WinDbg can find them in the System Root folders (unless the binaries were changed by a system update after the dump file was created). Alternatively, the debugger should be able to locate them through SymServ. Properly configured, Windows 7 creates and saves a minidump for every crash event as well as a kernel dump (described below).
2. Kernel dump
Kernel dumps are roughly equal in size to the RAM occupied by the Windows 7 kernel. On my notebook, a kernel dump runs about 344MB and, compressed, is just over 100MB. One advantage of a kernel dump is that it contains the binaries. As a default, I would always have the system save the latest kernel dump. Remember that while saving it, the system will also save a minidump.
3. Complete or full dump
A full memory dump is about equal in size to the amount of installed RAM. With many systems having multiple GBs, this can quickly become a storage issue, especially if you are having more than the occasional crash. Normally I do not advise saving full memory dumps because they take so much space and are generally unneeded.
However, Microsoft's Vachon advises that "if you are trying to debug a very complex problem, such as an RPC issue between multiple services in the box and you want to see what the services are doing in User Mode, the full memory dump can be very helpful." Therefore, stick to the kernel dump but be prepared to switch the setting to generate a full dump on occasion.
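To check which of the three dump types a machine is currently set to write, the setting can be read from the Registry. The Python sketch below assumes the value is stored as CrashDumpEnabled under the CrashControl key, with 0 to 3 meaning none, complete, kernel and small dump respectively. The function names are hypothetical, and the Registry read only works on Windows.

```python
import sys

# Assumption: meanings of the CrashDumpEnabled value on Windows 7.
DUMP_TYPES = {
    0: "none",
    1: "complete (full) dump",
    2: "kernel dump",
    3: "small dump (minidump)",
}

# Assumed location of the setting under HKEY_LOCAL_MACHINE.
CRASH_CONTROL_KEY = r"SYSTEM\CurrentControlSet\Control\CrashControl"


def describe_dump_setting(value: int) -> str:
    """Translate a CrashDumpEnabled value into a readable dump type."""
    return DUMP_TYPES.get(value, f"unrecognised value {value}")


def read_dump_setting() -> int:
    """Read CrashDumpEnabled from the Registry (Windows only)."""
    import winreg  # standard library module, present only on Windows

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL_KEY) as key:
        value, _value_type = winreg.QueryValueEx(key, "CrashDumpEnabled")
    return int(value)


if __name__ == "__main__":
    if sys.platform == "win32":
        print(describe_dump_setting(read_dump_setting()))
    else:
        print("Registry lookup skipped: not running on Windows")
```

The sketch only reads the setting; switching between kernel and full dumps is still best done through the system's Startup and Recovery options.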
What if you do not have a memory dump to work with?
If you do not have a memory dump to look at, do not worry: you can make the system crash! The simplest way (without having to change Registry settings) is to run a cool tool called NotMyFault (thank you, Mark Russinovich and the team at SysInternals). It provides a selection of options to load a misbehaving driver (which requires administrative privileges).
But remember...it WILL CREATE A SYSTEM CRASH! So prepare your system and be sure to have anyone who is using the system log off for a few minutes. Save any files that contain information you might otherwise lose and close applications. If you have configured your system as described above, it should work fine. The machine should go down, reboot and you will have both a minidump and a kernel dump to look at. I've used it plenty of times and had no problems.
Download NotMyFault and force a system crash:
- Download the NotMyFault tool from the following Microsoft website and extract the files to a folder: http://download.sysinternals.com/Files/Notmyfault.zip
- Right click on NotMyFault.exe or at the Command Prompt type NotMyFault. If you get the message "You don't have permission to open this file" then try again but when right-clicking select "Run as Administrator".
- From the menu select "High IRQL fault (kernelmode)" and click the Do Bug button. This will generate a memory dump file and a "Stop D1" error.
- Sit back... your system will be back momentarily and you will have both a minidump and a kernel dump to view.
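Once the machine is back up, the new dump files need to be found. As a small Python sketch, the helper below looks in the locations Windows uses by default, assuming the kernel dump is written to MEMORY.DMP in the system root and minidumps to a Minidump subfolder. The function name find_dump_files is hypothetical, and the locations can differ if the dump settings were changed.

```python
import os
from pathlib import Path
from typing import List


def find_dump_files(system_root: str) -> List[Path]:
    """List the kernel dump (if any) followed by minidumps, sorted by name."""
    root = Path(system_root)
    found: List[Path] = []

    # Assumed default location of the kernel or full dump.
    kernel_dump = root / "MEMORY.DMP"
    if kernel_dump.is_file():
        found.append(kernel_dump)

    # Assumed default folder for minidumps.
    minidump_dir = root / "Minidump"
    if minidump_dir.is_dir():
        found.extend(sorted(minidump_dir.glob("*.dmp")))

    return found


if __name__ == "__main__":
    for dump in find_dump_files(os.environ.get("SystemRoot", r"C:\Windows")):
        print(dump)
```

Reading these folders may itself require an elevated prompt, for the same reason WinDbg sometimes has to be run as administrator.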
Load a dump file:
If you get the message "You don't have permission to open this file", relaunch WinDbg by right clicking on it and selecting Run as administrator.
Once the debugger is running, select the menu option File > Open crash dump and point it to open the memory dump you want to analyse. When offered to Save information for workspace select Yes if you want it to remember where the dump file is.
WinDbg looks for the Windows symbol files for that precise build of Windows. It references the symbol file path, accesses microsoft.com and displays the results.
NOTE: If the debugger seems busy, it is probably the first time a dump file for a specific machine has been opened, and WinDbg is downloading symbols from SymServ. The next time a dump is opened for the same machine, the debugger will likely seem much faster, since the symbol files will be available locally.
A Command window will appear. This is where the crash analysis will be displayed. At the lower left will be a kd> prompt. To the right of the prompt is a single-line window where you will enter commands.
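The same steps can also be driven without the graphical interface by using kd.exe, the console kernel debugger that ships alongside WinDbg. The Python sketch below only builds the argument list. It assumes the -z (dump file), -y (symbol path) and -c (initial command) options and the !analyze -v command; the function name, the default debugger path and the sample dump path are hypothetical.

```python
from typing import List


def kd_command_line(
    dump_file: str,
    symbol_path: str,
    debugger: str = "kd.exe",            # assumed to be on the PATH
    commands: str = "!analyze -v; q",    # run a verbose analysis, then quit
) -> List[str]:
    """Build an argument list that opens a dump and runs debugger commands."""
    return [debugger, "-z", dump_file, "-y", symbol_path, "-c", commands]


if __name__ == "__main__":
    args = kd_command_line(
        r"C:\Windows\MEMORY.DMP",
        r"SRV*c:\symbols*https://msdl.microsoft.com/download/symbols",
    )
    print(args)
```

On a machine with the debugging tools installed, the list could be passed to subprocess.run to capture the analysis as text.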
Possible error messages
You may get the message:
*** ERROR: Symbol file could not be found. Defaulted to export symbols for ntoskrnl.exe -
One of the following three things is usually wrong:
- Your path is incorrect. Check that there are no typos or other errors (such as a stray space) in the symbol file path you entered earlier
- Your connection failed. Check that your Internet connection is working properly
- Your firewall blocked access to the symbol files, or the symbol files were damaged during retrieval
If your path and connection are solid, then it's likely that the problem is your firewall. If a firewall initially blocks WinDbg from downloading a symbol table, it can result in a corrupted symbol file. If unblocking the firewall and attempting to download the symbol file again does not work, the symbol file remains damaged. The quickest fix is to close WinDbg, delete the symbols folder (which you most likely set at c:\symbols), and unblock the firewall. Now, reopen WinDbg and a dump file. The debugger will recreate the folder and re-download the symbols.
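The delete step above can be sketched in Python as follows. The function name reset_symbol_cache is hypothetical. It assumes WinDbg is already closed, and it refuses to delete a drive root in case the wrong folder is passed by mistake.

```python
import shutil
from pathlib import Path


def reset_symbol_cache(cache_dir: str) -> bool:
    """Delete the local symbol cache; return False if it does not exist."""
    path = Path(cache_dir)
    if not path.is_dir():
        return False

    # Safety check: never remove a drive root such as C:\ by accident.
    resolved = path.resolve()
    if resolved == Path(resolved.anchor):
        raise ValueError(f"refusing to delete root folder {resolved}")

    shutil.rmtree(path)
    return True
```

Calling reset_symbol_cache(r"c:\symbols") would remove the folder used in this article; the debugger recreates it the next time it downloads symbols.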
If you see this message:
***** Kernel symbols are WRONG. Please fix symbols to do analysis.
then WinDbg was unable to retrieve the proper symbols and it will resort to using the default symbol table. But as the warning suggests, it cannot produce accurate results. Remember that symbol tables are generated when programs are compiled, so there is a symbol table file for every Windows version, patch, hot fix and so on. Go back up to the section above and ensure you have the right path set, the connection is good and it is not blocked.
Look through WinDbg's output. You may see an error message similar to the following, which indicates that it could not locate information on myfault.sys:
Unable to load image \??\C:\Windows\system32\drivers\myfault.sys, Win32 error 0n2
*** WARNING: Unable to verify timestamp for myfault.sys
*** ERROR: Module load completed but symbols could not be loaded for myfault.sys
This means that the debugger was looking for information on myfault.sys. However, since it is effectively a third-party driver (OK, it is made by Microsoft, but it is certainly not a regular Microsoft product), there are no symbols for it (Microsoft does not store symbols for all third-party drivers). You can ignore this error message. Vendors do not typically ship drivers with symbol files, and they aren't necessary to your work. You can pinpoint the problem driver without them.
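The three kinds of message discussed in this section can be told apart mechanically. The Python sketch below scans debugger output that has been saved as text and returns a short note for each message it recognises. The function name and the wording of the notes are hypothetical; the matched phrases are taken from the messages quoted above.

```python
import re
from typing import List


def triage_symbol_messages(output: str) -> List[str]:
    """Return one note per symbol-related message found in debugger output."""
    notes: List[str] = []

    if "Kernel symbols are WRONG" in output:
        notes.append("kernel symbols wrong: recheck path, connection and firewall")

    if "Defaulted to export symbols for" in output:
        notes.append("export symbols used: recheck path, connection and firewall")

    # Per-module failures, such as the myfault.sys example, name the module last.
    for module in re.findall(r"symbols could not be loaded for (\S+)", output):
        notes.append(f"no symbols for {module}: usually safe to ignore for third-party drivers")

    return notes


if __name__ == "__main__":
    sample = (
        "*** WARNING: Unable to verify timestamp for myfault.sys\n"
        "*** ERROR: Module load completed but symbols could not be loaded for myfault.sys\n"
    )
    for note in triage_symbol_messages(sample):
        print(note)
```

Only the first two notes call for action; a missing symbol file for a single third-party driver does not stop the analysis.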
When you have WinDbg open a dump file, it automatically runs a basic analysis. Without even giving the debugger any direct commands (other than to open a specific dump file) it has named a suspect as shown in the screen below.