copyright 2004 Hans De Smaele

Contents

Introduction
  Topics covered
    Debugging and Instrumenting scripts
    Using tools to do first analyses
    Symbols and symbol servers
    Calling conventions
    Stack tracing and related subjects
    Debugging of "free builds"
    Creation and debugging of "dump files"
    Working with ILDASM and ILASM
    Other useful WinDbg extensions
    Remote debugging
    Debug scenarios
  The lifecycle of a computer application
  "Production debugging" versus "development debugging"
Debugging and instrumenting scripting code
  Introduction
  Installing the debugger
  Debugging a vbscript
  Instrumenting code
    Using the SAITools component
Diagnostic tools
  Introduction
  Preventing and solving DLL Hell problems
    The "Change Journal"
    What kind of DLL Hell can we encounter?
    From where are DLL's loaded?
    The Global Assembly Cache
    Working with assemblies in the cache
  Tools to find DLL Hell problems
    The Dependency Walker
    The Process Explorer
    The DLL Spy
    The Assembly Binding Log Viewer (FUSLOGVW.EXE)
    FileMon and RegMon
  Using the 'WinDbg' debugger
    Introduction
    See what is loaded during the execution of an application
    Looking at the execution of an application
Symbols and symbol servers
  Introduction
  Compiler and linker settings
  Symbol servers
  Settings in the debugger
Calling conventions
  Introduction
  STDCALL Calling convention (__stdcall)
  CDECL Calling convention (__cdecl)
  FASTCALL Calling convention (__fastcall)
  THISCALL Calling convention
  Naked functions
Debugging applications
  Introduction
  Installing WinDbg
  Exercise 1 : Unwind1 (unwind the stack)
    Debugging unwind1.exe
    The debug session explained
    Unwind1 source code
  Exercise 2 : CrashDivider (a crashing .NET application)
    Introduction
    First and second chance exception handling
    Debugging "CrashDivider"
    The exercise explained
    The CrashDivider code
Debugging "free builds" analyzing "crash dumps"
  Introduction
  Creating a dump
  The debug scenario
  The exercise explained
Debugging .NET applications where symbols are missing
  Introduction
  The exercise
Remote debugging
  Introduction
  Visual Studio 6.0 remote debugging setup
  Remote debugging with WinDbg
Some debugging scenario's
  Introduction
  Debugging heap corruption
  Debugging spinning threads (100% cpu usage)
  Debugging (finding) memory leaks
  Debugging a NT service during startup
  Debugging a COM+ application in combination with IIS
  Debugging .NET problems
Using the OEM Tools
  Introduction
  Userdump
  Genedump
  At last
  Round-up
  Literature
    Books
    Patterns and practices
    MSDN articles
    Knowledge base articles
    OSR Online
    Other sources

Application debugging in a production environment
Version 1.1

1 Introduction

The document you have at hand is written for people who have to support and maintain a production environment. In this course we'll try to answer questions like: "yesterday it still worked and now it doesn't anymore, what happened?" or "why does it work on this computer and not on another one?".

Each chapter can be seen as a stand-alone part. However, some background knowledge, such as the use of symbols and calling conventions, is essential to follow this course.

Every part of this paper is explained as a walkthrough. This lets the student use the document as reference material and/or as a workbook to learn particular debugging scenarios.

This guide in no way attempts to replace the manuals that come with the debuggers. The WinDbg debugger comes with such outstanding and complete documentation that it is impossible to do better than that.

The reader of this document doesn't need to be proficient in writing applications, nor does she/he need to be an expert in assembler or another language to follow this course. However, as with most things in life: "the more you know in advance, the better". The topics covered in this guide are rather complicated, but working with the WinDbg debugger is explained step by step.

Topics covered.

Starting with easy things like debugging and instrumenting vbscript code, more difficult scenarios are explained as well:

Debugging and Instrumenting scripts.
In this chapter you will learn how to debug scripting code: how to walk through the code, how to set breakpoints, how to watch and modify variable values, ... Also, a component is discussed that can be used to add information to the code that can be very useful during debugging.
Using tools to do first analyses.
Often we can find out very quickly what went wrong by using some basic diagnostic tools. For example, disk or registry access can be shown using FileMon and RegMon. Lots of problems related to DLL Hell can be diagnosed using the Dependency Walker, network problems can be tested using a simple ping, and a network sniffer can be used if needed. And, last but not least, the WinDbg debugger can be used in almost every debugging scenario where other tools are not appropriate.
Symbols and symbol servers.
What are symbols, what information do they contain, how do we create them, how do we use them, ... All this, and how a symbol server can be helpful when we're working with symbols, is explained here. This is an important topic, and it must be understood before starting with the remaining chapters.

Calling conventions.
In this chapter the passing of parameters between functions, be it in the main program or in a DLL, is discussed. If you're serious about debugging, then you must understand and recognize calling conventions.

Stack tracing and related subjects.
This is the most important part of this course. In this chapter you learn how to walk the stack, how to use the WinDbg debugger, and how to read and analyze assembler code. Samples are given for native and .NET applications; the exercises are done on the checked build (debug version) of the applications.

Debugging of "free builds"
"Free builds" are the release versions of applications. Using the debugger, we'll walk through this kind of application and examine the differences between the free build and the checked build of a program.

Creation and debugging of "dump files".
In a production environment, there is often no time to debug the running application (or maybe the program just crashed, so you can't debug it anymore). In these cases, debugging must be done on a snapshot of the failing process. In this chapter, the usage of ADPlus and the situations where it can be used are explained.

Working with ILDASM and ILASM
When you have to debug a .NET application for which the symbols are missing, you can regenerate the intermediate language using ILDASM. Then, you can reassemble it using ILASM with the option /DEBUG. In this way, you're rebuilding the application with symbols for the intermediate language. That's what's explained here.

Other useful WinDbg extensions
In previous chapters, the usage of the Logger and SOS extensions is thoroughly explained. However, there are more interesting WinDbg extensions. In this chapter we have a look at the OEM extensions for our favorite debugger. Also, other OEM tools are discussed.

Remote debugging
How to do remote debugging (and how to set up the debuggers for this) is explained in this chapter. We discuss this for three debuggers:

- Visual Studio 6.0
- Visual Studio.NET 2003
- WinDbg
Debug scenarios
In this part we explain how to debug some common problems. Situations like 100% cpu usage, deadlocks, memory leaks, memory corruption, ... are handled.

To finalize this course, a preview is given of the successor of this course: Advanced debugging techniques: how are applications seen by the kernel.

The lifecycle of a computer application.

When we talk about production debugging, we talk about finding and pinpointing problems in applications that did work at some point in time. The short overview below describes the lifetime of a computer program, and the steps where things can go wrong.

1. At the very beginning, the analysts analyze the tasks that must be done and define the process to build the application.
2. The developers use the analysts' work to put the program into code.
3. The application gets tested by the developers, by dedicated testing teams, and is finally tested by the end users (UAT = User Acceptance Testing).
4. The program goes into production.
5. Bugs found in the application are fixed, new features are added to the program, algorithms are modified, ...
6. The application becomes obsolete and goes completely or partially out of production.
If the application is built in a "state of the art" way, and if all test scenarios are followed, then it can be expected that the application will work as intended. But, as we all know, the risk is very high that the application will fail at some point. Reasons for this can be:

1. The analysts didn't understand (or misunderstood) what was expected. As a result, parts of the application must be rewritten.
2. Despite thorough testing, not all the bugs in the application were discovered. It's quite possible that some bugs couldn't be found during the test phase, because of different hardware, different load on the application, different network topology, different data used, ...
3. The application gets influenced by other things. The installation of other programs on the same server, or the installation of service packs, can lead to the so-called "DLL Hell" problems. Or maybe the application starts to work differently after security settings were modified, ...
People who have to debug these problems are mostly confronted with problems of the third type. This kind of problem is tackled in chapter 3, where we discuss several tools to diagnose application problems.

Problems of the second type pop up after the application has been in production for some time. When there really are bugs in the application that weren't found during the development and testing phases, you're confronted with the hardest type of problem. Especially when the application doesn't crash but shows odd behavior, debugging with a debugger is needed. Chapter 4 and the remaining chapters are all dedicated to solving this kind of bug. Mostly, the WinDbg debugger will be used in the debugging scenarios because it's currently the most powerful debugger at hand for these tasks.
"Production debugging" versus "development debugging"

A developer has his toolkit at hand. And even more important: the developer has access to the source code of the application. However, there are more differences between production and development debugging:
Debugging during development                           | Debugging during production
-------------------------------------------------------+---------------------------------------------------------------
The IDE is available                                   | The IDE is (most of the time) not available
Source code is available                               | Source code is (often) not available
Code can be modified to test                           | Code can't be modified
Debugging is done on the "checked build"               | Debugging is done on the "free build"
This is "live" debugging                               | Debugging must often be done on a dump file
Breakpoints can be set                                 | If working on a dump file, you can't set breakpoints
The used data is known                                 | The used data is not known
The person who's doing the debugging has time to debug | There is no time. Every minute of downtime costs a lot of money.
One of the most important things you should keep in mind is that you're not debugging code (as a developer does), but that you're debugging the used data. Indeed, before the application came into production, several tests were done on the program. Since the application passed these tests with the used data, there must be some problem with the currently used data, and that data must be found. Then, the developer can use that data to retest the program and to find the faults in it.
Note: Only use a debugger like WinDbg as a last resort. First of all, try to pinpoint the problem using log files, release management papers, diagnostic tools, ... simply because debugging with a debugger in a production environment can be very difficult.
2 Debugging and instrumenting scripting code.

Introduction.

Debugging a script is not that difficult, because it is interpreted code that is readable. However, this doesn't change the fact that a script can contain bugs or that the logic used in a script isn't doing what we expect it to do. Furthermore, it isn't always easy to find out why a script runs fine on one computer (or one day) and not on another computer. Reasons for this can be:

- Security settings that differ between the computers (or that changed since the last time the script ran)
- The existence (or nonexistence) of a particular file
- A missing registry value
- ...
For all these problems, the functionality of the script must be debugged. And this can become rather difficult in situations like a "computer startup script", because it's not easy to follow the execution steps in such a scenario.

In this chapter, we'll discuss how to debug a vbscript using the script debugger. Also, a small component is discussed that can help you add valuable debug information to your scripts (and other applications).

Installing the debugger

The script debugger can be found on the Windows 2000 operating system disk. You can install it using the "Add/Remove Programs" applet in the control panel. Then, select "Add/Remove Windows Components" and check the "Script Debugger" (see figure 2.1).

For other operating systems, or when you don't have the install disk at hand, you can download the script debugger from the Microsoft website at the url http://msdn.microsoft.com/scripting. At that site, select the option "downloads" and choose the script debugger. Make sure you select the right version for the operating system you're running: there are versions of this debugger for WinNT 4.0, Windows 2000 and WinXP (and even for Win9x). The 32-bit version of the debugger is called scd10en.exe at the download site. You can use this as a search word on the website if the url above is broken.
Note: It is possible that the script debugger doesn't work anymore after the installation of the .NET framework or Visual Studio.NET. This shouldn't be a problem since Visual Studio.NET comes with a very good script debugger itself.
Figure 2.1 The script debugger installation option for Windows 2000

Debugging a vbscript

As an example we use a script that attempts to open a file in the c:\temp folder of the computer. To make sure that the script fails, the file that must be opened doesn't exist. Also, to show some of the things that can be done with the debugger, some otherwise useless code is added to the script as well. Let's have a look at the code:

    option explicit
    call main

    sub main
        dim FriendlyWords : FriendlyWords = "Hello everybody"
        call ShowFriendlyWords(FriendlyWords)
    end sub

    sub ShowFriendlyWords(Greetings)
        Msgbox Greetings & ", I gonna open a file for you..."
        call OpenIt()
    end sub

    sub OpenIt()
        dim MyFile : MyFile = "C:\temp\sai.txt"
        const ForReading = 1, ForWriting = 2, ForAppending = 8
        dim fso, f
        set fso = CreateObject("Scripting.FileSystemObject")
        stop
        set f = fso.OpenTextFile(MyFile,ForWriting,false)
        f.write "Hello"
        f.close
    end sub

The script's execution starts with the main subroutine, which defines a local variable FriendlyWords and gives it the value "Hello everybody". Then, the subroutine ShowFriendlyWords() is called with the message as a parameter. In this routine, the message is displayed and then the subroutine OpenIt() is called. If the file c:\temp\sai.txt doesn't exist, the script will fail. The error message displayed when running the script with wscript.exe is:
Figure 2.2 Error dialog when executing the script.

As you can see, the reason why the script fails is explained, and the line where the problem occurred is given. What you don't get is the name of the file that was not found, but you can easily find this in the code of the script. However, this would be more difficult if the name of the file were a variable and not a hard-coded filename. Thanks to the debugger, we can see the values of the variables in the script and, even better, we can modify them.

A script can be started with the command-line parameter //X or //D. When started with //D, as in cscript.exe c:\sai\writetest.vbs //D, the debugger only comes up when an error is encountered or when the keyword stop is reached in the script (see our script). When the parameter //X is used, the debugger starts immediately when the script gets executed. Figure 2.3 shows the debugger in action.
Figure 2.3 The script debugger in action.
As you can see, the script execution is interrupted at the stop keyword. One of the windows available in the script debugger shows the source code of the script. Another important window is the Command Window. In this window, the values of variables can be consulted and altered. Consulting the value of a variable is done using the question mark, and giving it a new value is done using the = sign (see figure).

In large scripts, the Call Stack window is very important as well. This window shows which subroutine/function called another one, which makes it most valuable as a way to track code execution. Unfortunately, the Call Stack window doesn't show the parameters passed between functions, but you can find these in the code window.

When using the script debugger, you'll find out very soon that it is useful, but has its limitations. It is more appropriate to use another debugger like Visual Studio.NET or Visual Interdev (for Visual Studio 6.0 users), but these debuggers are harder to install on a production platform (and don't come for free). Maybe the best solution is the instrumentation of the vbscript code.

Instrumenting code.

Something a developer should avoid is the popup of a small dialog like the one we've seen in figure 2.2. This can be avoided using the on error resume next statement (or using a try/catch construction in languages that offer one, such as JScript). Thanks to this kind of mechanism, the developer can add code that handles errors during execution. Part of this code can be instrumentation, making it easier to follow the execution flow of the code.

Developers can simply use the MsgBox function at strategic places in the code. The displayed information can then be used to track and understand what is going on in the code. The problem with this approach is that when a MsgBox call is forgotten, the end user sees this information during execution, and the function requires user interaction as well.

A better solution is the usage of a log file. The developer can consult this file after the script has executed, to see what went wrong. However, the problem with this is that such a log file must be cleaned up and that the developer must wait until the script execution has ended before the log can be examined.

An even better solution is therefore the usage of the OutputDebugString API. This API (Application Programming Interface) reports the given parameter data to an attached debugger. If there is no debugger attached, the API simply does nothing. The usage of this API is very straightforward in C/C++ and VB code, but not in a scripting language, where APIs can't be called. For this, you need a component like the one below:

Using the SAITools component.

This COM component makes it possible to use the OutputDebugString API in scripts. Also, the methods of the component allow you to modify its behavior. During the creation (instantiation) of the SAITools.DebugObj object, the object checks if a certain registry value (HKLM/SOFTWARE/SAI value Debug) is set to "1" (see figure 2.4). If this value is set, then the OutputDebugString API is used inside the WriteDebugString method. If this value is set to "0" or if the value is not found, nothing is written. If every script and application uses this method, then it is possible to put the computer in debug mode just by setting this value.

A second method, UseApplicationFlag, can be used to modify the behavior of the Debug registry value as discussed above. Using this method, an additional check is done to see if the given "ApplicationFlag" is set to 0, 1 or 2:

- if it is set to 0, nothing changes;
- if it is set to 1, debug information is always sent, whatever the value of Debug is;
- if it is set to 2, debug information is never sent (most useful when working with passwords and the like in a script).
Figure 2.4 Registry settings, used by the SAITools component
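The rules described above can be summarized in a few lines of vbscript. The sketch below is only an illustration of those rules, not the actual code of the SAITools component: the function names ShouldWriteDebug and ReadDebugValue are hypothetical, and because the text doesn't say where the application flag is stored, its value is simply passed in as a parameter.

```vbscript
Option Explicit

' Hypothetical helper: reads the Debug value under HKLM\SOFTWARE\SAI.
' Returns Empty when the key or the value doesn't exist.
Function ReadDebugValue()
    Dim shell
    Set shell = CreateObject("WScript.Shell")
    ReadDebugValue = Empty
    On Error Resume Next      ' RegRead raises an error when the value is missing
    ReadDebugValue = shell.RegRead("HKLM\SOFTWARE\SAI\Debug")
    On Error GoTo 0
End Function

' Hypothetical helper: applies the rules from the text.
'   debugValue : content of the Debug registry value ("1", "0" or Empty)
'   appFlag    : value of the application flag (0, 1 or 2)
Function ShouldWriteDebug(debugValue, appFlag)
    Select Case CLng(appFlag)
        Case 1
            ShouldWriteDebug = True                     ' always send
        Case 2
            ShouldWriteDebug = False                    ' never send
        Case Else
            ShouldWriteDebug = (CStr(debugValue) = "1") ' follow the Debug value
    End Select
End Function
```

A script could then call ShouldWriteDebug(ReadDebugValue(), 0) to find out whether debug output would be written when no application flag overrides the Debug value.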
The script below shows you how to use the SAITools component. This script can also be used as a template for other scripts.
    Option Explicit

    Dim DebugObj

    Call main

    ' ===================================
    ' here we start our script...
    ' ===================================
    sub main()
        call InitDebugObj()
        DbgInfo "Starting script..."

        ' this is always useful, check if there are commandline parameters
        Dim ObjArgs
        Set ObjArgs = wscript.Arguments
        DbgInfo "The number of arguments is : " & objArgs.Count
        if objArgs.Count <> 0 then
            DbgInfo "The commandline parameters are :"
            dim x
            for x = 0 to objArgs.count-1
                DbgInfo objArgs(x)
            next
        end if

        ' from here, you are on your own...
        call MyStuff()

        ' this piece of code may not be seen by the debugger
        ' (guarded, so the script keeps working when the component is not installed)
        if IsObject(DebugObj) then DebugObj.UseApplicationFlag "SecretCode"
        DbgInfo "This is not seen by the debugger, since SecretCode as value 2"
        if IsObject(DebugObj) then DebugObj.ReInitialize

        ' tell us that the script is done
        DbgInfo "Script ended successfully"
    end sub

    sub MyStuff()
        on error resume next

        DbgInfo "Entering routine MyStuff()"

        dim MyFile
        MyFile = "C:\temp\SAI.TXT"

        ' open the file for writing
        const ForReading=1, ForWriting=2, ForAppending=8
        dim fso, f

        set fso = CreateObject("Scripting.FileSystemObject")
        if IsObject(fso) then
            set f = fso.OpenTextFile(MyFile,ForWriting,false)
            if IsObject(f) then
                f.Write "Hello World"
                f.close
            else
                DbgInfo "Error opening file " & MyFile
            end if
        else
            DbgInfo "Error creating the FileSystemObject object"
        end if

        DbgInfo "Leaving routine MyStuff()"

    end sub

    ' ====================================
    ' helper functions
    ' ====================================
    sub InitDebugObj()
        on error resume next
        set debugobj = CreateObject("SAITools.DbgObj")
        on error goto 0
    end sub

    sub DbgInfo(DebugString)
        if IsObject(debugobj) then
            DebugObj.WriteDebugString DebugString & chr(13) & chr(10)
        end if
    end sub
As long as there is no debugger attached, the debug information is not displayed. There is a powerful tool, called DbgView, that acts like a debugger. It can be downloaded for free from the SysInternals website (http://www.sysinternals.com). This tool allows you to see, even remotely, the output that is sent by the OutputDebugString API. When using this tool, the following output is displayed when executing the script above.
Figure 2.5 DebugView in action.
3 Diagnostic tools

Introduction

During the lifecycle of a computer program, many things can go wrong. However, every application problem is related to one (or both) of the following causes : there are bugs in the application, and/or a modification was made to the platform where the application is running.
In the first part of this chapter, we'll have a look at the problems caused by modifying something on the computer. We'll discuss how to prevent these problems and how to diagnose them. Then, in the second part, we use some tools that can give us a better idea about what is going on inside the program.

Preventing and solving DLL Hell problems

The "Change Journal"

The proverb says : "Prevention is better than cure". This means that you must avoid installing DLL's that don't match existing applications. The easiest way to find out what DLL's are replaced or updated during the installation of a new application is to take a snapshot of the system before and after the installation. Then, you can create the delta between these two snapshots to find out what was modified. This is the approach used by the DUPS (DLL Universal Problem Solver) solution, as explained in MSDN. This system, written by Rick Anderson, consists of four tools :

DLister : creates a list of all the DLL's on the system (enumerates them and gets the name, file size, version information and date for every DLL found).
DComp : creates a list with the delta between two snapshots.
Dtxt2Db : imports the files created with DLister and DComp into a database for further use.
DlgDtxt2Db : a GUI version of the Dtxt2Db tool.
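The snapshot-and-delta idea behind DLister and DComp can be illustrated in a few lines. The Python sketch below is not the DUPS code; the DllInfo record, the compare_snapshots function and the sample paths are hypothetical and only show how two snapshots can be compared.

```python
from typing import Dict, List, NamedTuple


class DllInfo(NamedTuple):
    size: int      # file size in bytes
    version: str   # version resource, e.g. "5.1.2600.0"
    date: str      # last-modified date


# A snapshot maps the full path of a DLL to the data recorded for it.
Snapshot = Dict[str, DllInfo]


def compare_snapshots(before: Snapshot, after: Snapshot) -> Dict[str, List[str]]:
    """Return the paths that were added, removed or changed between two snapshots."""
    added = sorted(p for p in after if p not in before)
    removed = sorted(p for p in before if p not in after)
    changed = sorted(p for p in after if p in before and after[p] != before[p])
    return {"added": added, "removed": removed, "changed": changed}


if __name__ == "__main__":
    before = {
        r"c:\winnt\system32\a.dll": DllInfo(1000, "1.0.0.0", "2003-01-01"),
        r"c:\winnt\system32\b.dll": DllInfo(2000, "2.0.0.0", "2003-01-01"),
    }
    after = {
        r"c:\winnt\system32\a.dll": DllInfo(1200, "1.1.0.0", "2004-03-01"),
        r"c:\winnt\system32\c.dll": DllInfo(3000, "1.0.0.0", "2004-03-01"),
    }
    print(compare_snapshots(before, after))
```

Windows paths are case-insensitive, so a real tool would normalise the case of the paths before comparing; the sketch assumes the paths are already stored in lower case.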
This tool is still useful on an older system. However, on Windows 2000, XP and Windows 2003 platforms, this tool is obsolete. When the disks on your system are formatted using the NTFS file system, the best tool at hand is the Change Journal. The Change Journal is integrated in NTFS volumes and allows you to track, in real time, additions, modifications and deletions of files and folders on every disk. Each record in the Change Journal takes about 80 to 100 bytes, and you can configure how big the Change Journal may grow. This system is already used by virus scanners, indexing systems, replication managers and backup systems. The only problem with the Change Journal is that there isn't any application that a system administrator can use for it. If you want to use the Change Journal, you have to write a tool yourself. An example of such a tool is on the MSDN Magazine website (http://msdn.microsoft.com/msdnmag). This example, CJTest, records all the actions you do on a disk. When the CJTest application is started for the first time on a disk, a dialog is displayed indicating that a full scan of the disk is required (figure 3.1).
Figure 3.1 The CJTest initial dialog.

Soon, the disk is scanned and the Change Journal is active now. To show how it works, a folder and file are created, modified by Notepad, and deleted again (figure 3.2).
Figure 3.2 Some commands to test the Change Journal's functionality.

Each time a command is given, the Change Journal gets updated, and this is displayed in the CJTest application (figure 3.3 below). Of course, you don't need to keep an application like CJTest running on your desktop all the time. As soon as you activate the Change Journal, it stays active until you deactivate it. Together with CJTest comes another sample, CJDump, that shows you how to list all the records that were recorded by the Change Journal. So, the Change Journal doesn't prevent you from installing a DLL or other file, but at least it keeps a log of the files and folders that are created, deleted and modified. This, together with Auditing enabled on your W2K OS, will give you a lot of information that you can use to find out why an application doesn't work anymore after the installation of additional files.
Note: The Change Journal doesn't give information about the content of a file. Only information about creations, deletions and changes is recorded.
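Because each record takes roughly 80 to 100 bytes, the configured maximum size of the Change Journal determines how much history it can hold. The Python sketch below only does this arithmetic; the function name and the 32 MB journal size are assumptions chosen for the example, not recommended settings.

```python
from typing import Tuple


def estimated_record_capacity(max_journal_bytes: int,
                              min_record_bytes: int = 80,
                              max_record_bytes: int = 100) -> Tuple[int, int]:
    """Return (fewest, most) records that fit in a journal of the given size."""
    fewest = max_journal_bytes // max_record_bytes  # every record is large
    most = max_journal_bytes // min_record_bytes    # every record is small
    return fewest, most


if __name__ == "__main__":
    size = 32 * 1024 * 1024  # hypothetical maximum journal size of 32 MB
    low, high = estimated_record_capacity(size)
    print(f"A {size} byte journal holds between {low} and {high} records")
```

On a busy disk this number of records can be consumed quickly, which is worth keeping in mind when choosing the maximum size.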
Figure 3.3 The Change Journal data, displayed in CJTest.

What kind of DLL Hell can we encounter ?

If, despite our caution, DLL Hell occurs, the first thing you should do is determine what kind of DLL Hell is occurring. The different kinds of DLL Hell you can come across are :

A DLL is replaced (overwritten) by an older version of the same DLL. In this case you get a message telling you that one or more functions that the application is calling don't exist anymore. This is the easiest kind of DLL Hell : just replace the old version with the newer version of the DLL.

A new DLL has functional modifications when compared with the previous version. This happens when the new DLL is not 100% backwards compatible with the previous version. In this case, the different versions of the DLL must be installed side by side. In this scenario, every application 'gets' its own DLL in the same folder as where the application is located. A .local file can be used if needed (see later). This kind of DLL Hell can be difficult to find, also because it doesn't always lead to a program crash (also called a hard fault). Sometimes only the output of the application is influenced (soft fault), and it can take a long time before this kind of problem is even noticed.

The ordinals in the new DLL don't match the ordinals in the previous version. This situation rarely occurs, and the solution is again the side by side installation of the DLL versions.
From where are DLL's loaded ?

Since there have been so many problems with DLL Hell, Microsoft has made a serious effort to overcome these problems, starting with the Windows 2000 operating system. However, you must understand how DLL's get loaded by applications before you can choose the best approach to solve the problem. The OS Loader looks for (and finds) a DLL in the following order :

1. The folder where the application that will use the DLL is stored.
2. The current working directory of this application.
3. The Windows System folder.
4. The Windows folder.
5. The folders found in the path environment variable.
However, this 'seek' sequence can be modified by the calling application when explicit binding between the application and the DLL is used. When using the LoadLibraryEx API with the value LOAD_WITH_ALTERED_SEARCH_PATH as the third parameter, the load sequence is :

1. The folder specified in the first parameter of the LoadLibraryEx API.
2. The current folder for the application.
3. The Windows System folder.
4. The Windows folder.
5. The folders found in the path environment variable.
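The two search sequences described above differ only in their first entry. The following Python sketch models them with plain strings, so it can be run on any machine; it does not call the Windows loader, and the function names and sample folders are hypothetical.

```python
import ntpath
from typing import Iterable, List, Optional


def dll_search_order(app_dir: str, current_dir: str, system_dir: str,
                     windows_dir: str, path_dirs: Iterable[str],
                     altered_dir: Optional[str] = None) -> List[str]:
    """Return the folders in the order in which they are searched.

    altered_dir models LOAD_WITH_ALTERED_SEARCH_PATH: the folder of the DLL
    path passed to LoadLibraryEx takes the place of the application folder.
    """
    first = altered_dir if altered_dir is not None else app_dir
    return [first, current_dir, system_dir, windows_dir, *path_dirs]


def find_dll(name: str, order: Iterable[str],
             existing_files: Iterable[str]) -> Optional[str]:
    """Return the first folder\\name combination that exists, or None."""
    wanted = {f.lower() for f in existing_files}  # Windows paths ignore case
    for folder in order:
        candidate = ntpath.join(folder, name)
        if candidate.lower() in wanted:
            return candidate
    return None


if __name__ == "__main__":
    files = [r"C:\WINDOWS\system32\SAI.DLL", r"C:\Tools\SAI.DLL"]
    normal = dll_search_order(r"C:\App", r"C:\Work", r"C:\WINDOWS\system32",
                              r"C:\WINDOWS", [r"C:\Tools"])
    altered = dll_search_order(r"C:\App", r"C:\Work", r"C:\WINDOWS\system32",
                               r"C:\WINDOWS", [r"C:\Tools"],
                               altered_dir=r"C:\Tools")
    print(find_dll("SAI.DLL", normal, files))
    print(find_dll("SAI.DLL", altered, files))
```

In the sample data the same DLL name resolves to two different files, depending on the search sequence. This is exactly the situation in which DLL Hell appears.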
A third way to modify the order in which DLL's are found is using the so-called "Known DLL's". These are ordinary DLL's, with the only exception that the OS always looks in the same folder to load them. The list of DLL's that are treated as "Known DLL's" can be found in the registry key : HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\KnownDLLs. Figure 3.4 shows the content of this registry key on a Windows 2003 Standard Server.
Figure 3.4 The list of "Known DLL's"

When the API LoadLibrary or LoadLibraryEx is used, the function first checks whether the DLL name given as parameter includes the .DLL extension. If the extension isn't present, the DLL gets loaded using the rules we've just discussed. However, if the .DLL extension is given, the function looks for the DLL name, without the extension, in the registry (see figure 3.4). Again, if the name is not found, the usual rules are followed. If the DLL name is found in the list of Known DLL's, then the DLL is loaded whose name corresponds with the data value in the registry. Usually, this data value is identical to the name of the registry value, but with the extension added. The OS loader then looks for that file in the folder given by the DllDirectory value in the registry (see figure 3.4). The default for this value is %SystemRoot%\System32.
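The Known DLL's lookup described above can be expressed as a short function. This Python sketch follows the steps as given in the text and uses a dictionary instead of the real registry; resolve_known_dll and the sample data are hypothetical.

```python
import ntpath
from typing import Dict, Optional


def resolve_known_dll(requested: str, known_dlls: Dict[str, str],
                      dll_directory: str) -> Optional[str]:
    """Return the full path when the request is served from the Known DLL's list.

    None means that the usual search rules apply.
    known_dlls maps the registry value name (no extension) to its data value.
    """
    base, ext = ntpath.splitext(requested)
    if ext.lower() != ".dll":
        return None                      # no .DLL extension: normal rules
    lookup = {name.lower(): data for name, data in known_dlls.items()}
    data = lookup.get(base.lower())      # look up the name without extension
    if data is None:
        return None                      # not a Known DLL: normal rules
    return ntpath.join(dll_directory, data)


if __name__ == "__main__":
    known = {"kernel32": "kernel32.dll", "user32": "user32.dll"}
    folder = r"C:\WINDOWS\System32"
    for name in ("KERNEL32.DLL", "kernel32", "sai.dll"):
        print(name, "->", resolve_known_dll(name, known, folder))
```

Only the first request in the usage example is resolved through the list; the other two fall back to the normal search sequence.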
A fourth way to influence the way a DLL is loaded is by using a file with the same name as the executable program, followed by the .local extension. This file must be put in the same folder as the executable. When the loader sees this file, all DLL's and COM components are loaded from this folder. Thanks to all this, there is always a technique that can be used to undo the effect of DLL Hell. It is always possible to make sure that the application loads the right DLL.

The Global Assembly Cache.

.NET applications avoid the usage of native DLL's (unmanaged code). In .NET, assemblies are used. An assembly can consist of one or more files that together form a logical group. Each assembly consists of compiled code and a so-called manifest file. An assembly doesn't need to be registered, unlike COM components. The assembly is placed in the same folder as the application that is using it and is then called a private assembly. If the assembly should be used by more than one application, it is placed in the Global Assembly Cache (GAC). An application will use the right assembly, because it uses the assembly it was bound to during the compilation / linking process, together with the content of the manifest file that accompanies the assembly. The search algorithm a .NET application uses is :

1. The application path.
2. The 'Bin' folder below the application path.
3. The folder that has the same name as the assembly, below the application path (for example : the assembly SAI.DLL can then be found in C:\MyApplication\SAI).
4. The Global Assembly Cache.
The DLL Hell problem in the GAC is solved by the .NET runtime, using the version information that exists for the assembly. This version number consists of :

Major number
Minor number
Build number
Revision number
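When the build and revision numbers are left to the compiler (the "*" wildcard of the AssemblyVersion attribute, used in the exercise that follows), they are not arbitrary. By default, the build number is the number of days since 1 January 2000 and the revision number is the number of seconds since midnight, divided by two. The Python sketch below decodes such a version number on that assumption; the function name is hypothetical, and the sample values are the two versions that appear in the configuration file later in this chapter.

```python
from datetime import datetime, timedelta


def decode_auto_version(version: str) -> datetime:
    """Return the local build time encoded in a major.minor.build.revision string.

    Assumes build and revision were generated by the compiler ("1.0.*").
    """
    major, minor, build, revision = (int(part) for part in version.split("."))
    # build = days since 2000-01-01, revision = seconds since midnight / 2
    return datetime(2000, 1, 1) + timedelta(days=build, seconds=revision * 2)


if __name__ == "__main__":
    for version in ("1.0.1559.35955", "1.1.1559.39043"):
        print(version, "was built around", decode_auto_version(version))
```

This is useful in production debugging: the version number alone tells you when an automatically versioned assembly was compiled.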
Placing the assembly in the GAC can be done by using the program GACUTIL.EXE or by dragging and dropping the assembly into the GAC. The GAC itself can be found under the system folder in %SystemRoot%\Assembly. Several versions of an assembly can be placed 'side by side' in the GAC, and an application can choose which version it will use by using a configuration file. What you see in the Explorer is an illusion. Indeed, it is impossible to place two (or more) files with the same name in the same folder. A special Explorer view shows the assemblies in the GAC in such a way that it seems that they are installed in the same folder. The exercise below explains this whole process in detail.

Working with assemblies in the cache.

In this exercise, a small assembly is made (SAI.DLL) that is consumed by an application (SAITest.exe). The code for the assembly, written in C#, is very simple. There is only one method inside the assembly, SayHello(), that returns a string with version information. The first time, we'll give the string the value : Hello from version 1.0. Then, we put this assembly in the GAC. After that, we make an application that calls this method. As you may expect, the program will display a message that shows the string value.

using System;

namespace sai
{
    public class SaiDemo
    {
        public string SayHello()
        {
            String me = "Hello from version 1.0";
            return me;
        }
    }
}
As we all know, the Class1.cs is not the only file that makes the assembly solution. There is another important file : AssemblyInfo.cs. The content for this file is displayed hereunder. Important in this file is the version information : [assembly: AssemblyVersion("1.0.*")] This line indicates that the assembly version has 1 as major number, 0 as minor number and that the revision and build number may be generated by the compiler. Here is the code :
using System.Reflection;
using System.Runtime.CompilerServices;

//
// General Information about an assembly is controlled through the following
// set of attributes. Change these attribute values to modify the information
// associated with an assembly.
//
[assembly: AssemblyTitle("")]
[assembly: AssemblyDescription("")]
[assembly: AssemblyConfiguration("")]
[assembly: AssemblyCompany("")]
[assembly: AssemblyProduct("")]
[assembly: AssemblyCopyright("")]
[assembly: AssemblyTrademark("")]
[assembly: AssemblyCulture("")]

//
// Version information for an assembly consists of the following four values:
//
//      Major Version
//      Minor Version
//      Build Number
//      Revision
//
// You can specify all the values or you can default the Revision and Build
// Numbers by using the '*' as shown below:

[assembly: AssemblyVersion("1.0.*")]

//
// In order to sign your assembly you must specify a key to use. Refer to the
// Microsoft .NET Framework documentation for more information on assembly
// signing.
//
// Use the attributes below to control which key is used for signing.
//
// Notes:
//   (*) If no key is specified, the assembly is not signed.
//   (*) KeyName refers to a key that has been installed in the Crypto Service
//       Provider (CSP) on your machine. KeyFile refers to a file which contains
//       a key.
//   (*) If the KeyFile and the KeyName values are both specified, the
//       following processing occurs:
//       (1) If the KeyName can be found in the CSP, that key is used.
//       (2) If the KeyName does not exist and the KeyFile does exist, the key
//           in the KeyFile is installed into the CSP and used.
//   (*) In order to create a KeyFile, you can use the sn.exe (Strong Name)
//       utility.
//       When specifying the KeyFile, the location of the KeyFile should be
//       relative to the project output directory which is
//       %Project Directory%\obj\<configuration>. For example, if your KeyFile is
//       located in the project directory, you would specify the AssemblyKeyFile
//       attribute as [assembly: AssemblyKeyFile("..\\..\\mykey.snk")]
//   (*) Delay Signing is an advanced option - see the Microsoft .NET Framework
//       documentation for more information on this.
//
[assembly: AssemblyDelaySign(false)]
[assembly: AssemblyKeyFile("C:\\HANS\\PRIVATE\\SAI\\SAI.SNK")]
[assembly: AssemblyKeyName("")]

The line [assembly: AssemblyKeyFile(...)] points to a file that is generated using the SN.EXE tool, which generates a public/private key pair that can be used to sign the assembly. The code for the application that uses this assembly is :

using System;
using System.Drawing;
using System.Collections;
using System.ComponentModel;
using System.Windows.Forms;
using System.Data;
namespace SaiTest
{
    /// <summary>
    /// Summary description for Form1.
    /// </summary>
    public class Form1 : System.Windows.Forms.Form
    {
        private System.Windows.Forms.Label label1;
        /// <summary>
        /// Required designer variable.
        /// </summary>
        private System.ComponentModel.Container components = null;

        public Form1()
        {
            //
            // Required for Windows Form Designer support
            //
            InitializeComponent();

            //
            // Call the assembly and show the returned string in the label
            //
            sai.SaiDemo ss = new sai.SaiDemo();
            this.label1.Text = ss.SayHello();
        }

        /// <summary>
        /// Clean up any resources being used.
        /// </summary>
        protected override void Dispose( bool disposing )
        {
            if( disposing )
            {
                if (components != null)
                {
                    components.Dispose();
                }
            }
            base.Dispose( disposing );
        }

        #region Windows Form Designer generated code
        /// <summary>
        /// Required method for Designer support - do not modify
        /// the contents of this method with the code editor.
        /// </summary>
        private void InitializeComponent()
        {
            this.label1 = new System.Windows.Forms.Label();
            this.SuspendLayout();
            //
            // label1
            //
            this.label1.Location = new System.Drawing.Point(24, 48);
            this.label1.Name = "label1";
            this.label1.Size = new System.Drawing.Size(248, 23);
            this.label1.TabIndex = 0;
            this.label1.Text = "label1";
            //
            // Form1
            //
            this.AutoScaleBaseSize = new System.Drawing.Size(5, 13);
            this.ClientSize = new System.Drawing.Size(292, 101);
            this.Controls.Add(this.label1);
            this.Name = "Form1";
            this.Text = "Form1";
            this.Load += new System.EventHandler(this.Form1_Load);
            this.ResumeLayout(false);

        }
        #endregion

        /// <summary>
        /// The main entry point for the application.
        /// </summary>
        [STAThread]
        static void Main()
        {
            Application.Run(new Form1());
        }

        // Empty handler: InitializeComponent subscribes to the Load event,
        // so this method must exist for the code to compile.
        private void Form1_Load(object sender, System.EventArgs e)
        {
        }
    }
}
When running the application, SaiTest.exe is bound to the SAI.DLL assembly, version 1.0. Thus, the output is as in figure 3.5. Then, update the SAI.CS and SaiAssemblyInfo files so that the version information becomes 1.1. Put this assembly in the GAC as well (figure 3.6).
Figure 3.5 SAITest output, bound with release 1.0 of the assembly
Figure 3.6 Both versions of the SAI.DLL assembly in the GAC
Currently, there are two versions of the assembly in the GAC, and the application is still using the first version. If you want the application to load the newest version of the assembly, you can use the .NET configuration MMC snap-in for this (figure 3.7).
Figure 3.7 The .NET Configuration mmc snap-in.
Use the "Add an Application to Configure" task and then select the application you want to configure from the dialog box, or browse for the application (figure 3.8). After you have selected the application, it is added to the list of configured applications (figure 3.9). Then, you can follow the wizard to select the assembly you want to bind to during the execution of the application (fig. 3.10 to 3.12). When you follow these steps, the application will have an output as displayed :
Figure 3.8 Selection dialog box to configure an application
Figure 3.9 The SaiTest application is added to the list of configured applications.
Figure 3.10 The next step
Figure 3.11 Select the assembly
Figure 3.12 Finished, the application is configured to use the new version of the assembly.
The wizard created a file with the same name as the application, followed by the extension config (thus : SaiTest.exe.config), in the same folder as the application. The content of this file is :

<?xml version="1.0"?>
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <assemblyIdentity name="sai" publicKeyToken="059e2af93d0cd983" />
        <bindingRedirect oldVersion="1.0.1559.35955" newVersion="1.1.1559.39043"/>
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>
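To check what such a configuration file does without starting the application, the redirect can be read back with a few lines of code. The Python sketch below parses the XML shown above and reports the version that a request would be redirected to. It only handles an exact oldVersion match (real configuration files may also contain version ranges), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

ASM_NS = {"asm": "urn:schemas-microsoft-com:asm.v1"}

CONFIG = """<?xml version="1.0"?>
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <assemblyIdentity name="sai" publicKeyToken="059e2af93d0cd983" />
        <bindingRedirect oldVersion="1.0.1559.35955" newVersion="1.1.1559.39043"/>
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>
"""


def redirected_version(config_xml: str, assembly_name: str, requested: str) -> str:
    """Return the version that is loaded when 'requested' is asked for."""
    root = ET.fromstring(config_xml)
    path = "./runtime/asm:assemblyBinding/asm:dependentAssembly"
    for dependent in root.iterfind(path, ASM_NS):
        identity = dependent.find("asm:assemblyIdentity", ASM_NS)
        if identity is None or identity.get("name") != assembly_name:
            continue
        for redirect in dependent.iterfind("asm:bindingRedirect", ASM_NS):
            if redirect.get("oldVersion") == requested:  # exact match only
                return redirect.get("newVersion")
    return requested  # no redirect found: the requested version is used


if __name__ == "__main__":
    print(redirected_version(CONFIG, "sai", "1.0.1559.35955"))
```

A version that is not mentioned in the file is returned unchanged, which mirrors the situation before the wizard was used.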
Using the technique above, it's very easy to configure a .NET application in such a way that DLL Hell becomes history. If, for one reason or another, you need to alter one of these assemblies, you can find it below the assembly folder. As said before, it looks like both versions of the assembly are stored in the same folder. When looking at this using the command prompt, you see that each version of the assembly has its own subfolder (figure 3.13).
Figure 3.13 The subfolders for several versions of the same assembly
Now, we know how to solve DLL Hell. However, before you can solve a problem with a DLL, you have to know which DLL is causing trouble. In the rest of this chapter, we'll discuss some tools that can help us to find out what files (DLL's, assemblies) are loaded by an application and where things go wrong.

Tools to find DLL Hell problems.

The Dependency Walker

This tool is one of the best known and most used applications for getting information about the APIs available in a DLL. The Dependency Walker also gives very detailed information about which DLLs are loaded by the application. The Diagnostics disk that comes with your Microsoft Windows OS includes the Dependency Walker. However, it's better to download the latest version of this tool from the website http://www.dependencywalker.com. The version you find there has features not found in the version on the CD. Since DLLs can be loaded implicitly and explicitly, you need a version of the Dependency Walker that can "run" the application, which the latest version does. Thanks to this, you can see how and when DLLs are loaded by the application. When using the "profile" option, the Dependency Walker acts like a real debugger. Figure 3.14 shows this tool in action. In the given example, you can see that the application StaticSAI.exe needs the DLLs USER32.DLL, KERNEL32.DLL and SAI.DLL. These DLLs in turn need other DLLs, for example : SAI.DLL needs SAI2.DLL. Also, as you can see in figure 3.14 (and this is very interesting for the rest of this course), SAI2.DLL doesn't have symbol files available and its preferred load address is the same as that of SAI.DLL. The consequences of this will be discussed later in this course.
Figure 3.14 The Dependency Walker

The Process Explorer

This application, downloadable from http://www.sysinternals.com, is more complete than the Dependency Walker. It also gives information about the CPU usage, handle usage, loaded DLLs, the application tree, and so on. You can see an example of this in figure 3.15.
Figure 3.15 The Process Explorer

If you have the Dependency Walker installed in the same folder as the Process Explorer, it's possible to launch the Dependency Walker from the DLL menu. Furthermore, you can choose to show only .NET applications in the Process Explorer. Since the Process Explorer also comes with a complete help file, we don't discuss it further in this manual.

The DLL Spy

This tool, written by Christophe Nasarre and explained in the MSDN magazine, also deserves a place in your toolkit. It does the opposite of the previous tools : where the Dependency Walker and the Process Explorer give information about processes and the DLLs that are linked with them, the DLL Spy shows all the processes that are using one specific DLL. This is most useful when you want to delete (or recompile) a DLL and you get the message "Access Denied". Thanks to this tool, you can very easily find all the applications that are currently using the DLL you're working with. You can download this tool, together with the source code, from the MSDN website http://msdn.microsoft.com/msdnmag. At the same place, you find all the information you need about its usage. Figure 3.16 shows the DLL Spy.
Figure 3.16 DLL Spy

The Assembly Binding Log Viewer (FUSLOGVW.EXE)

Programs using the .NET framework don't load only DLLs but also assemblies. If you want to know what assemblies are loaded by an application, and whether this was successful, you should use the Assembly Binding Log Viewer (figure 3.17). You can find this tool in the C:\Program Files\Microsoft Visual Studio.NET 2003\SDK\v1.1\Bin folder. However, if you want a very detailed output, you need to set some registry values below the key HKLM\Software\Microsoft\Fusion. These values are :

LogResourceBinds (DWORD != 0) : shows all bindings that failed
ForceLog (DWORD != 0) : shows all bindings
Figure 3.17 Fuslogvw.exe

FileMon and RegMon

These tools, also downloadable from http://www.sysinternals.com, allow you to see what files are accessed on your file system and/or what registry information is read or updated. These tools give you so much information that you must use the 'filter' options that are available on the menus. To see the tools in action, I've written a small application (ToolTest.exe) that accesses a file and the registry. Running this application with FileMon and RegMon activated gives an output like figures 3.18 and 3.19 below.
Figure 3.18 FileMon output for the ToolTest application

In the figure above, you see that the application is queried at startup for all kinds of information. Then, an attempt is made to open the manifest file for this application and to open the .local file for the application. Also, for each operation, you see whether it succeeded or failed. The figure below shows the way the registry deals with this. When you look at the code for the 'ToolTest' application, you see the registry key and value we're using in it, and how the value is read and set during the execution of the application.
Fig 3.19 RegMon in action
Using the 'WinDbg' debugger

Introduction

As we've seen so far, tools like the Dependency Walker, Process Explorer, DLL Spy, FileMon and RegMon can give us a lot of information about the DLL's used on our system, and about how files and the registry are accessed. On the other hand, there are still a lot of open questions :

What code is executed inside an application ?
What happens on the computer, in the OS, when an application is running ?
Does an application modify something that other applications may need ?
Are there other applications modifying things that 'our' application may need ?
...
If we want an answer to these questions, we need a 'real' debugger. And if you don't want to examine the 'failing' application alone, but also the influence of other applications on the application that you have to debug, then you need a debugger that is powerful enough to give this information. Such a debugger is WinDbg. WinDbg is a so-called hybrid debugger, since it can debug user applications and kernel code. Because this debugger is easy to install and is one of the most powerful debuggers available, it is our debugger of choice during the rest of this course. There is very little information available about how to use WinDbg, and if there is an article dedicated to it, you'll always read that learning to work with WinDbg is a steep, almost vertical learning curve. However, once you know how to use it, you'll never change. Also, it comes with a very good and very complete help file that explains many debugging techniques and has a complete reference list of all available commands. In this chapter, we use WinDbg to see what DLL's get loaded during the startup of an application (and/or the OS). Then, we use it to check what API's are called in these DLL's. However, if you want to see all this at work, you must be sure that the debugger is installed correctly and that the needed (and correct) symbol files are loaded. So, if you're a novice with WinDbg, I strongly suggest that you first read the beginning of chapter 6 (installing the debugger) and chapter 4 (symbol files).

See what is loaded during the execution of an application.

In this exercise, we have a look at what files (DLL's) are loaded during the execution of an application. We'll also find out what functions are called at that time. The program we are using for this demonstration is notepad. If you want to see what DLL's (and drivers) are loaded, you must configure the debugger to do kernel debugging.
Please follow the steps below :

Run : attrib -s -h -r BOOT.INI (the file is located in the root folder of the startup disk).
Copy the line of text you find below the [operating systems] section and add this as a new line below the first one.
Modify the last part of the new line so that the parameter /fastdetect is replaced with /debugport=COM1 /baudrate=115200. At the same time, change the text between the double quotes in this line (for example : Windows 2000 Debug mode).
This added line gives you the possibility to choose between normal startup and startup in debug mode (see figure 3.20).
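The edit of the BOOT.INI line described in the steps above is a simple text transformation. The Python sketch below performs it on a string, so you can see what the new line should look like before touching the real file. The function name and the sample entry are hypothetical; the entry on your own system will differ.

```python
def make_debug_entry(entry: str, description: str,
                     port: str = "COM1", baudrate: int = 115200) -> str:
    """Return a copy of an [operating systems] entry that boots in debug mode."""
    location, _, rest = entry.partition("=")
    # rest looks like: "Microsoft Windows 2000 Server" /fastdetect
    first_quote = rest.index('"')
    second_quote = rest.index('"', first_quote + 1)
    switches = rest[second_quote + 1:].split()
    # /fastdetect is replaced by the two kernel debugging switches
    switches = [s for s in switches if s.lower() != "/fastdetect"]
    switches += [f"/debugport={port}", f"/baudrate={baudrate}"]
    return f'{location}="{description}" ' + " ".join(switches)


if __name__ == "__main__":
    original = (r'multi(0)disk(0)rdisk(0)partition(1)\WINNT='
                r'"Microsoft Windows 2000 Server" /fastdetect')
    print(original)
    print(make_debug_entry(original, "Windows 2000 Debug mode"))
```

Both lines stay in the file: the original one for a normal startup, the generated one for a startup in debug mode.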
Figure 3.20 Startup options (choose the second option for kernel debugging).

Do not start the operating system yet. First of all, start the debugger. To do so, just start WinDbg and select File / Kernel debug on the menu. Then, fill in the values that correspond to the line you've just added to the BOOT.INI file on the debuggee (the computer you want to debug). For the example above, /debugport=COM1 /baudrate=115200, the settings will be as in figure 3.21.
Note: the exercise can also be done on one single computer, but then you will see the loading of DLL's only for the application you selected. Furthermore, connecting two computers can be done using a 1394 interface also. Please, consult the WinDbg help file for more information and options. See also chapter 6 for information about the usage of a virtual pc in these scenarios.
Figure 3.21 The kernel debugging dialog

When you click the OK button, the debugger waits for a connection. Start up the debuggee now and select the option Microsoft Windows 2000 Debug Mode. Soon, information about the debuggee is displayed in the debugger. If you want to stop the debuggee, just press CTRL-Break (or select Debug / Break on the menu). After you have done so, the system that you want to debug seems to be frozen. Now, you're ready to give commands to the debugger. There are three kinds of commands :

Ordinary debugging commands like g (go), k (stack trace), dd (display dword), ...
Commands that control the behavior of the debugger itself. These are prefixed with a dot (.). Examples are : .sympath+ somefolder, .logopen, .reload, ...
Commands that give debugging instructions that are not part of the debugger itself, but that are available in the so-called debugger extensions. These commands start with an exclamation point. Examples are : !process 0 0, !gflag +sls, ...
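The three kinds of commands can be told apart by their first character alone. The small Python sketch below illustrates this rule; it is not part of WinDbg, and the function name and the returned labels are hypothetical.

```python
def command_kind(command: str) -> str:
    """Classify a debugger command by its prefix."""
    text = command.strip()
    if not text:
        raise ValueError("empty command")
    if text.startswith("."):
        return "debugger"   # controls the debugger itself
    if text.startswith("!"):
        return "extension"  # implemented in a debugger extension DLL
    return "ordinary"       # regular debugging command


if __name__ == "__main__":
    for cmd in ("g", "k", ".reload", ".logopen", "!process 0 0", "!gflag +sls"):
        print(f"{cmd:14} -> {command_kind(cmd)}")
```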
For our demonstration, be sure the correct symbols are loaded and then type !gflag +sls (without a trailing dot). This instruction tells the OS that we want to see the loading of files (SLS = Show Loader Snaps). If you give the !gflag command without a parameter, the current global flags settings are displayed. Then type the g command (go) to let the debuggee continue. When you now start an application (like notepad), you get an overwhelming amount of information displayed in the debugger. You can stop this behavior by giving the command !gflag -sls. You can activate the same functionality by setting the global flags on the debuggee as well. For this, just run the gflags.exe tool that comes with the debugger. You can set this flag for the whole operating system (as we just did) or for just one application. When you want to do this for one application only, you must start the debugger in user mode (not in kernel mode as we did), or you can use the gflags.exe tool and set the show loader snaps flag for one particular application. The usage of the gflags.exe tool will be discussed during other debugging scenarios later in this course. Figure 3.22 shows the gflags.exe tool that comes with the latest debugging tools.
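Behind the scenes, the global flags are a single bit mask, and +sls and -sls set and clear one bit in it (the Show Loader Snaps flag has the value 0x2). The Python sketch below models only this bit arithmetic; it does not talk to a debugger, and apply_gflag and the FLAG_BITS table are hypothetical names.

```python
FLG_SHOW_LDR_SNAPS = 0x00000002  # abbreviation: sls

# Only the flag used in this exercise is listed here.
FLAG_BITS = {"sls": FLG_SHOW_LDR_SNAPS}


def apply_gflag(current: int, argument: str) -> int:
    """Return the new global flags value after an argument like +sls or -sls."""
    sign, name = argument[0], argument[1:]
    bit = FLAG_BITS[name.lower()]
    if sign == "+":
        return current | bit    # set the flag
    if sign == "-":
        return current & ~bit   # clear the flag
    raise ValueError("argument must start with + or -")


if __name__ == "__main__":
    flags = 0
    flags = apply_gflag(flags, "+sls")
    print(f"after +sls : 0x{flags:08x}")
    flags = apply_gflag(flags, "-sls")
    print(f"after -sls : 0x{flags:08x}")
```

Other flags that are already set are left untouched, which is why the + and - forms are preferred over writing a complete new value.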
Note: Earlier versions of the gflags.exe tool have another look, but the functionality remains the same. However, the newest version has more options.
Application debugging in a production environment Version 1.1 34
Fig 3.22 Gflags.exe

Looking at the execution of an application

Now we know how to see which DLLs are loaded by an application. This time, we'll try to find out what an application is doing while it executes. More specifically, we'll have a look at which APIs are called inside the application. For this, the WinDbg debugger is used again, but this time in user mode. Start WinDbg and select File / Open Executable (use ToolTest.exe again for this demo). Fill in the full path and application name in the dialog (figure 3.23). In this dialog, you can also give command-line parameters and a working directory.
Figure 3.23 The "Open Executable" dialog in WinDbg copyright 2004 Hans De Smaele
After pressing the "Open" button, the application starts and you immediately get information about the DLLs that are loaded, together with their load addresses. When you compare this information with the information returned by the Dependency Walker, the data must be equal. After showing the DLL information, the application stops and you get an indication that a break instruction was encountered. When you don't start the debugger with the g option (g = go), the debugger always stops at this point. I mostly use this to set additional breakpoints. For this exercise, we don't need additional breakpoints, since we want to get as much information as possible about the whole execution of the program. What we will do is load an extension that allows us to record which APIs are called by the program. In the command window, type the following commands (each followed by <enter>) :
1. .load logexts.dll
2. .chain
3. !logm i *
4. !loge
5. g
6. !logd
7. !logb f
Below, each line is explained :
1. .load logexts.dll is the command that loads the extension logexts.dll. Because the command starts with a dot, this is a command that gives an instruction to the debugger itself.
2. .chain is another command given to the debugger. It shows all the loaded debugger extensions and the place they take in the chain. When different extensions have the same command, the command that belongs to the most recently loaded extension is executed, unless you specify the command together with the extension (e.g. !MyExts.MyCmd).
3. !logm i * is an extension command, since it starts with an exclamation point. This command includes all modules in the logging : m stands for modules, i for include, and the * means all modules.
4. !loge is an extension command that enables (e = enable) the logging functionality.
5. g is a normal debugging command (no dot or exclamation point). This command means go, so the program starts running. We get some additional information on the screen, showing that other DLLs get loaded. In this example, the program stops execution after displaying a message box.
6. !logd is the extension command that disables (d = disable) the logging functionality. It's the opposite of the !loge command.
7. !logb f is an extension command that flushes the buffers (b = buffers, f = flush).
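The three kinds of commands can be told apart by their first character alone. The following Python fragment is only an illustrative sketch of that rule, not part of WinDbg; the function name classify_command and the labels it returns are made up for this example.

```python
def classify_command(command):
    """Classify a debugger command by its first character (illustrative only)."""
    text = command.strip()
    if not text:
        raise ValueError("empty command")
    if text.startswith("."):
        return "meta"        # instruction for the debugger itself
    if text.startswith("!"):
        return "extension"   # implemented in a debugger extension DLL
    return "ordinary"        # normal debugging command such as g or k


# The seven commands of the logging session described above.
session = [".load logexts.dll", ".chain", "!logm i *", "!loge", "g", "!logd", "!logb f"]
for line in session:
    print(f"{line:20} -> {classify_command(line)}")
```

Running the fragment lists each command of the session next to its kind, which is a quick way to check your understanding of the prefixes.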
Now all information is stored on disk, so we can stop the debugger. Give the q (quit) command for this. During this exercise, we instructed the debugger to send the output to an lgv file. It's also possible to send the output to the screen or to a text file. An lgv file is a file that we can read using the logviewer.exe tool (also installed during the installation of the debugging tools). Open the file ToolTest.exe.lgv with the logviewer program, located in the c:\debuggers folder. The ToolTest.exe.lgv file is stored in the folder Desktop\LogExts, unless you specified another output folder for it (see the debugger help file for this). Using logviewer, you can see almost all APIs that were called during the execution of the program. And not only the APIs : the parameter values used and the return value of each API call are recorded as well. Figure 3.24 gives an indication of the output in logviewer.

Fig 3.24 Logviewer, with the log from ToolTest.exe loaded.

When you compare this kind of log between different computers, you can find out rather easily why an application behaves differently on one specific computer. For this, the logviewer can export the content to a file, so that you can create a diff file, for example. Once again, please refer to the help file for more information about the usage of this tool. Now we know which DLLs are loaded by an application and which APIs are called in the application and its DLLs. The only thing we don't know yet is why these instructions are called. How to find this out using the debugger is explained in later chapters, when we discuss stack tracing. But first, some theory !
4 Symbols and symbol servers

Introduction

Symbol files are files that contain debugging information. There are several kinds of symbol files, and the kind of debugging information they contain differs. Some compilers / linkers can't create separate symbol files (or they are configured not to create them). In this case, debugging information is stored in the executable file itself. However, since this kind of symbols doesn't contain a lot of information, and because the size of the executable itself grows enormously, this isn't the preferred symbol format. Furthermore, since the debugging information is stored in the executable itself, it's easier for hackers to reverse engineer the application. Where symbols are kept in separate files, a distinction can be made between two types : DBG files and PDB files. PDB files are the most complete, and this is the kind of symbol file that is generated during the compile / link process with Microsoft's Visual Studio. The symbols that Microsoft makes available for debugging are sometimes of the DBG type and sometimes of the PDB type. These symbol files can be downloaded from the website http://www.microsoft.com/whdc/ddk/debugging/symbols.mspx. Or, when your debugging platform has access to the internet, you can configure the debugger to download the correct symbols from Microsoft's symbol server. The needed symbol files are then downloaded automatically at the moment you need them. Symbols created for the applications built in your own company can also be stored on a symbol server at your company. How to set up such a symbol server is fully explained in the documentation that comes with WinDbg. Those who wonder why symbols are needed will see during this course that it is very hard (sometimes even almost impossible) to debug without them.
When there are no symbols available, an instruction looks like this in the debugger :

004027AD call dword ptr ds:[409060h]

With the correct symbols available, the same instruction is seen like this :

004027AD call dword ptr MyFunction (409060h)

The debugger sees that a function call is made through the address 0x409060. Thanks to the symbol files, the debugger can find out that this address corresponds with a function called MyFunction. Of course, this makes it much easier to interpret the instruction in the debugger. Additionally, when the symbol file has the PDB format, the available information is :
- Conversion of an address into the corresponding function and/or variable names.
- Translation of a program address into a line number in the source file (if available).
- Information about the passed parameters and variables, and where they can be found on the stack.
- Information about the size and type of variables.
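The address-to-name conversion can be pictured as a search in a sorted table of symbol start addresses. The Python fragment below is a minimal sketch of that idea, not the way the debugger really stores its symbols; the table SYMBOLS and the function nearest_symbol are hypothetical, and the two entries are the addresses that the ln command reports for unwind1.exe later in this course.

```python
import bisect

# Hypothetical symbol table: (start address, name), sorted by address.
SYMBOLS = [
    (0x00401200, "unwind1!CallFast"),
    (0x004012C0, "unwind1!scanf"),
]


def nearest_symbol(address):
    """Return 'name+0xoffset' for the closest symbol at or below address."""
    starts = [start for start, _ in SYMBOLS]
    index = bisect.bisect_right(starts, address) - 1
    if index < 0:
        return None  # the address lies below every known symbol
    start, name = SYMBOLS[index]
    offset = address - start
    return name if offset == 0 else f"{name}+0x{offset:x}"


print(nearest_symbol(0x00401260))
```

Without a symbol table, only the raw address is left, which is exactly the difference between the two call instructions shown above.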
Compiler and linker settings When symbol files generation is wanted during the compiling / linking process (and why shouldn't it ?), some project settings are required. The figures below show these settings for Microsoft Visual C++ 6.0, Visual C++.NET, Visual Basic 6.0, Visual Basic.NET and C#.
Figure 4.1 Microsoft Visual C++ 6.0 Compiler settings
Figure 4.2 Microsoft Visual C++ 6.0 Linker settings
Figure 4.3 Microsoft Visual C++.NET compiler settings
Figure 4.4 Microsoft Visual C++.NET linker settings
Figure 4.5 Microsoft Visual Basic 6.0 settings
Figure 4.6 Microsoft Visual Basic.NET settings
Fig 4.7 Microsoft C# settings.
Symbol servers

Symbol files, generated during the compile / link process, can be used by the debugger. The symbols can be loaded explicitly from a folder (see below), or the symbol files can be stored in a symbol server for later usage. The advantage of the latter approach is that the symbols, with the correct version, get loaded automatically into the debugger when needed. A symbol server is included in the 'Debugging Tools for Windows' package (of which WinDbg is a part). The tool SymStore.exe is an application that allows us to store symbols in such a way that the symbol server can use them. The usage of this tool is explained in great detail in the help files that come with the package. However, there is at least one command you must know : the command that allows you to add a symbol file to the symbol store (the command below must be given on one line) :

SymStore add /r /f \\DevServer\Share\DbgTest\release\*.* /s \\SymServer\Symbols /t "DbgTest" /v 1.0.0.1 /c "Symstore demo"

In this command line :
- SymStore is the name of the application (SymStore.exe)
- add is the keyword saying that symbols must be added (the opposite keyword is del = delete)
- /r means that all files must be added recursively
- /f indicates which files must be added (in our example : all files below the release folder of our project)
- /s <path name> is the root of our symbol store
- /t <project name> is the name of the project we want to add
- /v gives version information about the project
- /c is free text that can be added as a comment

Settings in the debugger

If you want to use symbols in the debugger (and you surely should !), there are several ways to tell the debugger where to find these symbols. The easiest (and best) way to do so is to use an environment variable.
An example of this is (everything on one line) :

_NT_SYMBOL_PATH=srv*c:\debuggers\symbols*http://msdl.microsoft.com/download/symbols;srv*c:\debuggers\symbols\MyApplications*\\SymbolserverName\Symbols;c:\program files\Microsoft Visual Studio.NET 2003\SDK\v1.1;c:\debuggers

In the example above, the environment variable _NT_SYMBOL_PATH points to four different locations where symbols can be found. The locations are separated by semicolons. The first two locations point to symbol stores that are reached through the symbol server. When you see the letters srv*, you know that a symbol store is used. The part that follows the first asterisk is the local folder where the downloaded symbols are stored (for the first location : c:\debuggers\symbols). After the second asterisk comes the location of the symbol store itself (for the first location : http://msdl.microsoft.com/download/symbols). The second store is \\SymbolserverName\Symbols, a share on a server that holds the symbols we've put there with SymStore.exe. The two other locations are ordinary folders : a subfolder of Visual Studio.NET 2003 and c:\debuggers. Another way to set the symbol file path is by using the WinDbg GUI. Select File / Symbol File Path and enter the path information. Don't forget to check the 'Reload' checkbox to make your settings active. Figure 4.8 shows an example of these settings.
Figure 4.8 The Symbol Search Path dialog of WinDbg

A third way to set or modify the symbol path is by using the .sympath command in the debugger. The syntax is :

.sympath+ c:\SomeFolder\SomeSubfolder
.reload

In the syntax above, the + sign means that the path must be added to the existing list. Without the + sign, .sympath replaces the whole list with the path you give, which is also the way to remove a path from it. The .reload command is needed to make your new settings active.
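To make the structure of a symbol path value more tangible, here is a small Python sketch that splits the _NT_SYMBOL_PATH example of this chapter into its locations. It is only an illustration of the layout described above and not the debugger's own parser; the function parse_symbol_path and the dictionary keys are invented for this example, and only the srv*cache*store form used in this chapter is recognised.

```python
EXAMPLE = (
    r"srv*c:\debuggers\symbols*http://msdl.microsoft.com/download/symbols;"
    r"srv*c:\debuggers\symbols\MyApplications*\\SymbolserverName\Symbols;"
    r"c:\program files\Microsoft Visual Studio.NET 2003\SDK\v1.1;"
    r"c:\debuggers"
)


def parse_symbol_path(value):
    """Split a symbol path into entries (sketch, handles srv*cache*store only)."""
    entries = []
    for element in value.split(";"):
        element = element.strip()
        if not element:
            continue  # ignore empty elements such as a trailing semicolon
        parts = element.split("*")
        if parts[0].lower() == "srv" and len(parts) == 3:
            entries.append({"kind": "symbol server", "cache": parts[1], "store": parts[2]})
        else:
            entries.append({"kind": "folder", "path": element})
    return entries


for entry in parse_symbol_path(EXAMPLE):
    print(entry)
```

The output shows two symbol server entries, each with a local cache folder and a store, followed by two plain folders.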
5 Calling conventions

Introduction

A computer application is made of several functions (or subroutines). If all the code were located in one single routine, it would be very hard to maintain and thousands of lines of code would be needed. Also, not all the code resides in one single file : large chunks of code are stored in DLLs, COM components, assemblies, and so on. These so-called libraries are often written in different languages (C++, Visual Basic, Pascal, C#, ...) and/or compiled with different flavors of compilers. To make these libraries work together, parameters and return values must be passed between their functions. The way in which these parameters are passed is called the calling convention. There are five different calling conventions, and you must know and recognize them to do debugging at the assembler level. In this chapter the five calling conventions are explained, and in the next chapter we'll work with them in a first debug session. Each function, depending on the calling convention used, has a so-called prologue and epilogue section. Their purpose is :

Prologue code
- Set up the frame pointer
- Store the registers' values on the stack
- Reserve space for local variables

Epilogue code
- Restore the registers' values
- Clean up the space reserved for the local variables
A typical function looks like this :
    PUSH EBP          ; store the old frame pointer on the stack
    MOV  EBP, ESP     ; set the new frame pointer for this function
    SUB  ESP, 0x10    ; reserve space for local variables
    PUSH ESI
    PUSH EDI
    PUSH ECX          ; store the register values
                      ; here comes the code for the function
    POP  ECX
    POP  EDI
    POP  ESI          ; restore the register values saved during the prologue
    MOV  ESP, EBP
    POP  EBP          ; restore the frame pointer
    RET  8            ; clean up and return
In the following sections, each calling convention is explained.

STDCALL Calling convention (__stdcall)

This convention is used when the number of arguments that must be passed is known in advance. The characteristics are :
- Arguments are passed from right to left
- The called function cleans up the stack
- The function name is prefixed by an underscore
- The @-sign postfixes the function name, followed by the number of bytes passed in the arguments (for example : _MyFunction@12)
- No case translation is performed
- This calling convention can't work with a variable argument list
CDECL Calling convention (__cdecl)

This convention is used when a variable number of parameters must be passed to the function. The characteristics of this convention are :
- It's the 'default' calling convention for the 'C' language
- Arguments are passed from right to left
- The calling function cleans up the stack
- An underscore prefixes the function name (e.g. _MyFunction)
- No case translation is performed
- This convention can work with a variable number of parameters
FASTCALL Calling convention (__fastcall)

This convention is only used with Intel CPUs. It is faster because the first two parameters are passed in registers and not on the stack. The characteristics are :
- It's the "default" calling convention for Borland Delphi compilers
- Arguments are passed from right to left
- The first two DWORD parameters are passed via the registers ECX and EDX
- The called function cleans up the stack
- The function name is prefixed with a @-sign and postfixed with another @-sign, followed by the number of bytes in the argument list (e.g. @MyFunc@12)
- No case translation is performed
- This convention can't work with a variable number of parameters
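The name decorations of the three conventions described so far are regular enough to be recognised mechanically. The Python fragment below is a rough sketch that applies only the decoration rules listed above; the function describe_decorated_name is hypothetical, and real symbol names (C++ names in particular) can follow other schemes that this sketch does not cover.

```python
import re


def describe_decorated_name(name):
    """Guess (convention, plain name, argument bytes) from a decorated C name."""
    match = re.fullmatch(r"@(\w+)@(\d+)", name)
    if match:  # @Name@bytes
        return ("__fastcall", match.group(1), int(match.group(2)))
    match = re.fullmatch(r"_(\w+)@(\d+)", name)
    if match:  # _Name@bytes
        return ("__stdcall", match.group(1), int(match.group(2)))
    match = re.fullmatch(r"_(\w+)", name)
    if match:  # _Name, no byte count because the caller cleans up
        return ("__cdecl", match.group(1), None)
    return ("unknown", name, None)


for decorated in ["_MyFunction@12", "_MyFunction", "@MyFunc@12"]:
    print(decorated, "->", describe_decorated_name(decorated))
```

Since every parameter takes at least four bytes on the stack in these conventions, a byte count of 12 hints at three DWORD-sized parameters.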
THISCALL Calling convention

This convention is the "default" way of passing parameters to C++ member functions that use a fixed number of parameters. It isn't a 'real' calling convention, because it can't be specified like the previous calling conventions. THISCALL is not a keyword like, for example, STDCALL or CDECL. Characteristics of this calling convention are :
- Arguments are passed from right to left
- The called function cleans up the stack
- There is no name decoration
- The 'this' pointer is stored in the ECX register
- Automatically used with C++ class members, unless otherwise specified
- COM methods use the standard calling convention (__stdcall)
Naked functions

This method is used when the generation of prologue and epilogue code must be avoided. This is needed when (for example) C code must talk with legacy assembler code. The characteristics are :
- Arguments are passed from right to left
- The developer is responsible for the specific prologue and epilogue code
- The calling function cleans up the stack
Recognizing these calling conventions is very important when debugging at the assembler level. In the next chapter we start debugging a first failing application.
6 Debugging applications

Introduction

In this chapter we discuss two debugging scenarios :
- Debugging a native C++ application
- Debugging a C# .NET application
For these examples, we're working with the checked build (debug version) of the applications we'll debug. In the next chapter we have a look at the differences between the checked build and the free build (release version). The goal of this chapter is to get comfortable with the WinDbg debugger and to understand how a debug session works. During the second exercise we also discuss the usage of the SOS extension, which makes it possible to debug .NET applications with WinDbg. However, before we can start with all this, a few words about the setup of the debugger.

Installing WinDbg

The installation of the debugger is very straightforward : simply double-click the msi package that you've downloaded from the debugger website (http://www.microsoft.com/whdc/ddk/debugging/default.mspx) and choose custom setup. This gives you the option to install everything that comes with the debugger, even the SDK that allows you to build your own debugger extensions. Also, I install the debugger in the folder C:\Debuggers and not in the C:\Program Files\... folder. When I have to give the full path of the debugger, this is shorter to type. The next thing I do is add the folder C:\Debuggers to the path environment variable. Especially when using the SOS extension, this makes the setup much easier, as you'll see later. As a last step before using the debugger, create an environment variable _NT_SYMBOL_PATH and give it the values described in chapter 4. Kernel mode debugging can be done on one single computer when you're running Windows XP or Windows 2003, or when you're using the LiveKD debugger (downloadable from http://www.sysinternals.com) in combination with WinDbg. Also, you can use Microsoft's Virtual PC 2004 to have two operating systems running simultaneously on your computer. This makes it possible to do 'real' kernel debugging between, for example, the host OS and a virtual computer. The settings needed to do this are described in chapter 11.
Exercise 1 : Unwind1 (unwind the stack)

In this example, general debugging techniques are discussed. More specifically, this section explains how to unwind the stack, how to unassemble parts of an application, how to use the symbol files and how calling conventions come into play. You can consider this one of the most important parts of this course. The application itself is rather simplistic, but it behaves like any other failing application in a production environment. It works well for a while, and then suddenly it crashes. Without using the source files, we'll find out what happened in the program. After starting the program (a console application written in C++ and compiled with Microsoft's Visual Studio 6.0 C++ compiler), we get a screen like the one shown in figure 6.1. As you can see, after operating well for a while, it goes down (figure 6.2).
Figure 6.1 Application Unwind1.exe in action
Figure 6.2 ... until it goes down.
Depending on the operating system used, the settings that control how an application crash is handled, and the kind of problem that occurred, a dialog like the one in figure 6.2 is displayed. Another dialog you'll often see when a problem occurs is the one in figure 6.3 (see below). This dialog is also seen when the operating system reboots, if Windows 2003 is the running operating system. As you can see in figures 6.2 and 6.3, the address where the error occurred is given. With this address, it is possible to find out which instruction was executed at that place. If you click the 'click here' hyperlink on the dialog (as seen in figure 6.3), you get more information that can be used to diagnose the problem (see figures 6.4 and 6.5).
Figure 6.3 Another dialog with information about the program failure.
Figure 6.4 More detailed information about where it went wrong
Figure 6.5 Here you see in what file the information about the crash is stored.
As you can see in figure 6.4, the error occurred in the application unwind1.exe, and more specifically in the module unwind1.exe. Furthermore, you see the offset where it happened. If you have a look at the Dependency Walker window for this application, you can see that the application is loaded at address 0x00400000 (see figure 6.6). Adding the offset 0x00001260 to that address gives the crash address 0x00401260, as displayed in figure 6.2.
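The relation between a module's load address, an offset and an absolute address is simple arithmetic, and it works in both directions. The following Python sketch illustrates it with the module ranges that the debugger reports for this application in the next section; the table MODULES and the helper functions are hypothetical names used only for this example.

```python
# (start address, end address, module name), taken from the ModLoad lines
# that WinDbg prints when unwind1.exe is opened.
MODULES = [
    (0x00400000, 0x0042F000, "unwind1.exe"),
    (0x77E40000, 0x77F34000, "kernel32.dll"),
    (0x77F40000, 0x77FFA000, "ntdll.dll"),
]


def to_address(module, offset):
    """Absolute address of an offset inside a loaded module."""
    for start, _end, name in MODULES:
        if name == module:
            return start + offset
    raise KeyError(module)


def locate(address):
    """Return (module name, offset) for an absolute address, or None."""
    for start, end, name in MODULES:
        if start <= address < end:
            return name, address - start
    return None


print(f"0x{to_address('unwind1.exe', 0x1260):08x}")
print(locate(0x00401260))
```

The second function is handy when an error dialog only gives an absolute address : it tells you in which module to look, and at which offset.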
Figure 6.6 The dependency walker application, with unwind1.exe loaded.

In the next section, we'll use WinDbg to find out what actually happened during the execution of the program.

Debugging unwind1.exe

Start WinDbg and select the menu option File / Open Executable. Then select the checked build of the unwind1.exe application as the file to execute (see figure 6.7).
Figure 6.7 The File / Open Executable dialog box in WinDbg.

In this dialog, you can also give program arguments, and you can set the starting directory as well. When you now click the 'Open' button, the following output is displayed in the command window of the debugger :
Microsoft (R) Windows Debugger Version 6.3.0011.2
Copyright (c) Microsoft Corporation. All rights reserved.

CommandLine: C:\SAI\Debugging\unwind1\Debug\unwind1.exe
Symbol search path is: e:\Symbols\Windows2003;e:\Symbols\Windows2000;e:\Symbols\Windows2000\SP3;
    e:\Symbols\Windows2000\SP4;srv*e:\Symbols\Other*http://msdl.microsoft.com/download/symbols;
    srv*e:\Symbols\Other*\\s05074\symbols
Executable search path is:
ModLoad: 00400000 0042f000 unwind1.exe
ModLoad: 77f40000 77ffa000 ntdll.dll
ModLoad: 77e40000 77f34000 C:\WINDOWS\system32\kernel32.dll
(cb0.438): Break instruction exception - code 80000003 (first chance)
eax=77fc35ef ebx=7ffdf000 ecx=00000006 edx=77f8ed40 esi=77fc23b4 edi=00241f08
eip=77f43847 esp=0012fb60 ebp=0012fc48 iopl=0 nv up ei pl nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
ntdll!DbgBreakPoint:
77f43847 cc int 3
This information tells us the version of the debugger (as you can see, the latest version of WinDbg is used here). Then, the command line is displayed. This is the full pathname of the application that we're debugging. The next line (here displayed on three lines, for formatting reasons) gives us the search path where symbols can be found. The next lines show us the loaded modules, together with their start and end addresses, from which the size of the application follows (here 0x0042f000 - 0x00400000 = 0x2f000 bytes). The last information we get at this point is the content of the registers. The last instruction (int 3) means that a breakpoint was encountered. You can see this from the ntdll!DbgBreakPoint text. When WinDbg is not started with the startup option g, the debugger will always break into the application at this point. I prefer to start a debug session this way, because I can then give additional instructions at this point. The instructions most often given here are :
- Adapt the symbol path
- Open a debug log file
- Set breakpoints
- ...
Hereunder you can see the complete debug session. I've added the line numbers later, since this makes it easier to explain what is done on each line during the session.
1 : Opened log file 'e:\unwind.txt'
2 : 0:000> ln 0x00401260
3 : *** WARNING: Unable to verify checksum for unwind1.exe
4 : c:\sai\debugging\unwind1\unwind.cpp(71)+0x4
5 : (00401200) unwind1!CallFast+0x60 | (004012c0) unwind1!scanf
6 : 0:000> bp unwind1!CallFast
7 : 0:000> bl

170 : 0040124d c745f400000000 mov dword ptr [ebp-0xc],0x0
171 : 00401254 eb06 jmp unwind1!CallFast+0x5c (0040125c)
172 : 00401256 8b4508 mov eax,[ebp+0x8]
173 : 00401259 8945f4 mov [ebp-0xc],eax
174 : 0040125c 8b45f8 mov eax,[ebp-0x8]
175 : 0040125f 99 cdq
176 : 00401260 f77df4 idiv dword ptr [ebp-0xc]
177 : 00401263 8945f0 mov [ebp-0x10],eax
178 : 00401266 8b4df0 mov ecx,[ebp-0x10]
179 : 00401269 51 push ecx
180 : 0040126a 8b55f4 mov edx,[ebp-0xc]
181 : 0040126d 52 push edx
182 : 0040126e 8b45f8 mov eax,[ebp-0x8]
183 : 00401271 50 push eax
184 : 00401272 6888514200 push 0x425188
185 : 00401277 e8a4000000 call unwind1!printf (00401320)
186 : 0040127c 83c410 add esp,0x10
187 : 0040127f b803000000 mov eax,0x3
188 : 00401284 5f pop edi
189 : 00401285 5e pop esi
190 : 00401286 5b pop ebx
191 : 00401287 83c450 add esp,0x50
192 : 0040128a 3bec cmp ebp,esp
193 : 0040128c e80f010000 call unwind1!_chkesp (004013a0)
194 : 00401291 8be5 mov esp,ebp
195 : 00401293 5d pop ebp
196 : 00401294 c20400 ret 0x4
197 : 00401297 cc int 3
198 : 00401298 cc int 3
199 : 00401299 cc int 3
200 : 0040129a cc int 3
201 : 0040129b cc int 3
202 : 0040129c cc int 3
203 : 0040129d cc int 3
204 : 0040129e cc int 3
205 : 0040129f cc int 3
206 : 004012a0 cc int 3
207 : 0:000> .logclose
208 : Closing open log file e:\unwind.txt
The debug session explained

As you can see at line 1, a new log file is created. The advantage of working with log files is that you can pick up the debug session again at a later point in time, if you can't find the problem directly. Also, it can be most useful to keep the log files as a reference for later debugging sessions (and problems). For this exercise, we don't use symbols from a symbol store. Therefore, the debugger is explicitly told where to find the symbols for this application. Here, I gave this instruction before opening the log file (sorry for this) with the commands :

.sympath+ c:\sai\debugging\unwind1\debug
.reload

Do not forget to issue the .reload command, otherwise the symbol path is not really updated. Since we saw that the program went down at address 0x00401260, a check is made to find out which function is nearest to that address. This is done with the ln command, as you can see at line 2. At line 5, you see that the nearest function is unwind1!CallFast. Therefore, we put a breakpoint at the beginning of this function (line 6). Then, on line 7, the bl command is given (bl = breakpoints list). On line 8 you see that there is actually 1 breakpoint set, with the number 0. This makes sense, since we've just set this breakpoint. On line 9, the go command is given, and on line 10 you can see that breakpoint 0 is hit. Lines 11 to 15 give the current values in the registers of the CPU. Starting from here, we'll examine which functions were already called by the application, so that we can reconstruct what the program did until now. The first thing we do is execute the ln eip command. This gives us the nearest function, seen from where the application is at this moment (that's why we give eip, the instruction pointer, as parameter for the ln command). This is done at line 16.
As a result, the debugger shows that we're at the beginning of the CallFast function (again, that's just what we expected). This is seen on lines 18 to 20. On line 21, the double word that the esp register points to is displayed. Because the return address of the current function is stored on the stack, this value (address) points into the calling function. The obtained value, 004011d2 in this case, corresponds with the function CallWithStd, as you can see when executing the ln command on line 30. The register EBP holds the address of the previous frame pointer. We ask for this at line 33. Then, at line 42, we ask again for the name of the function that is found at the returned address (00401163). There we find the CallWithCDecl function. Just as you can do with linked lists, you can repeat this sequence until the stack is completely unwound (lines 45, 57, 69 and 80). This way of working is slow, but you must know how this mechanism works in case you don't have all the symbols available. Knowing the load addresses of the DLLs that the application is loading (and the offsets of the functions in these DLLs), you can reconstruct the stack (with more or less effort). The load addresses and offsets of functions can be obtained using a tool like the Dependency Walker.
Note: Using the DDS ESP command, you get a perfect impression about how values are stored on the stack. You can use this to find back called functions also.
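The comparison with a linked list can be made concrete with a small simulation. The Python sketch below walks a chain of saved frame pointers in a simulated piece of stack memory. It is a simplified model and not what the debugger does internally : the dictionary STACK and the function walk_frames are hypothetical, the frame and return addresses are the ones shown in the stack trace of this session, and the layout assumes that every function has executed the standard prologue (saved EBP at [EBP], return address at [EBP+4]).

```python
# Simulated stack memory: address -> 32-bit value stored at that address.
STACK = {
    0x0012FE60: 0x0012FEB8, 0x0012FE64: 0x004011D2,
    0x0012FEB8: 0x0012FF18, 0x0012FEBC: 0x00401163,
    0x0012FF18: 0x0012FF80, 0x0012FF1C: 0x004010CA,
    0x0012FF80: 0x0012FFC0, 0x0012FF84: 0x004014C9,
    0x0012FFC0: 0x0012FFF0, 0x0012FFC4: 0x77E4F38C,
    0x0012FFF0: 0x00000000, 0x0012FFF4: 0x00000000,
}


def walk_frames(ebp, memory, limit=64):
    """Follow the saved-EBP chain and collect (frame pointer, return address)."""
    frames = []
    while ebp != 0 and len(frames) < limit:
        saved_ebp = memory[ebp]           # [EBP]   = frame pointer of the caller
        return_address = memory[ebp + 4]  # [EBP+4] = return address into the caller
        frames.append((ebp, return_address))
        ebp = saved_ebp
    return frames


for frame_pointer, return_address in walk_frames(0x0012FE60, STACK):
    print(f"{frame_pointer:08x} {return_address:08x}")
```

Each printed return address can then be fed to the ln command to find the name of the calling function, which is the manual procedure described above.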
Of course, the debugger has built-in commands to unwind the stack. When you give the k command (line 89), you get the whole stack trace back. Another useful command is kb (line 97). This command does the same as the k command, but also returns the first three double words that are used for parameter passing. The kv command does the same, but displays the calling convention used as well (line 105). Thanks to the symbol format used (PDB format), the line numbers in the source code that correspond with the beginning of the functions are available (and displayed) as well.
Note: Another way to find out which calling conventions are used, and how many parameter bytes are passed in each call, is to give the following commands :

0:000> .symopt- 0x2
*** WARNING: Unable to verify checksum for unwind1.exe
Symbol options are 30235
0:000> .symopt+ 0x4000
*** WARNING: Unable to verify checksum for unwind1.exe
Symbol options are 34235
0:000> k
ChildEBP RetAddr
0012fe60 004011d2 unwind1!@CallFast@12 [c:\sai\debugging\unwind1\unwind.cpp @ 58]
0012feb8 00401163 unwind1!_CallWithStd@12+0x42 [c:\sai\debugging\unwind1\unwind.cpp @ 54]
0012ff18 004010ca unwind1!_CallWithCDecl+0x43 [c:\sai\debugging\unwind1\unwind.cpp @ 45]
0012ff80 004014c9 unwind1!_main+0x9a [c:\sai\debugging\unwind1\unwind.cpp @ 31]
0012ffc0 77e4f38c unwind1!_mainCRTStartup+0xe9 [crt0.c @ 206]
0012fff0 00000000 kernel32!_BaseProcessStart@4+0x23

Do not forget to reset these symbol options afterwards !
At this point, we know which functions are called by the program, which calling conventions are used and how many parameter bytes are passed between the functions. So, it's time to have a look at the values of these parameters. Since the FastCall calling convention passes the first two parameters in the ECX and EDX registers, we simply display these registers (the r command at line 113). As you see, the value in the ECX register is 0x00425150. This address lies within the address range that is used by the application (between 00400000 and 0042f000, as we've seen previously). This value points to the text that is passed as first parameter. With the db 0x00425150 command, we see the string in memory (line 119). The da 0x00425150 command displays the same string in ASCII format (line 128). The value in the EDX register is 0x3c. Using the .formats command, we easily find the decimal value for this (60). This is indeed the first value that we typed into the program (line 131). The third parameter is passed on the stack and has the value 6. This is seen directly using the kb command (line 99). The first double word on the stack is the return address of the function, and the second double word is then the first parameter passed on the stack (for a FastCall convention call). Since the return address of a function can be found at [EBP+0x4], the first stack parameter is located at [EBP+0x8]. Following this principle, the second stack parameter can be found at [EBP+0xC], and the third at [EBP+0x10]. This is another way to retrieve the values of the passed parameters.
Note: Parameters are found at positive offsets from the EBP register, whereas local variables are found at negative offsets from the EBP register.
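The offset rule can be summarised in a few lines of code. The sketch below is only an illustration under stated assumptions : 32-bit x86, a standard EBP stack frame and parameters of 4 bytes each. The Convention type and the parameterLocation function are hypothetical names, not debugger commands.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical names, introduced only for this illustration.
enum class Convention { Cdecl, Stdcall, Fastcall };

// Where to look for parameter number `index` (0-based) of a function,
// assuming 32-bit x86, a standard EBP frame and 4-byte parameters.
std::string parameterLocation(Convention conv, int index) {
    assert(index >= 0);

    int stackSlot = index;
    if (conv == Convention::Fastcall) {
        // The first two parameters travel in registers.
        if (index == 0) return "ECX";
        if (index == 1) return "EDX";
        stackSlot = index - 2;
    }

    // [EBP] holds the saved EBP, [EBP+0x4] the return address,
    // so the first stack parameter sits at [EBP+0x8].
    // Cdecl and stdcall differ in who cleans up the stack, not in layout.
    char buffer[32];
    std::snprintf(buffer, sizeof buffer, "[EBP+0x%X]",
                  static_cast<unsigned>(8 + 4 * stackSlot));
    return buffer;
}
```

For the CallFast function of this exercise, parameterLocation(Convention::Fastcall, 2) yields [EBP+0x8], which is where the value 6 was found.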
Now, we have the values of the passed parameters. This is the most important thing to know before debugging further. The next step is to analyze the code. More specifically, why is the program crashing with these parameter values ? For this, just use the u (unassemble) command, followed by the name of the function that you want to disassemble.
Note: You can also work with addresses instead of function names. More specifically, if you don't have the right symbols, you MUST use addresses.
For this exercise, the unassemble command is given at line 141. As you can see, the disassembled function starts with the prologue of the function. At lines 155 and 156 you can see that the values stored in EDX and ECX are copied into two local variables. At lines 166 and 167 you see that two other local variables are initialized with the value 0. We know from the beginning that the application crashes at address 0x00401260. When we look at the instruction at that address, we see that the value in EAX is divided by the value in [EBP-0xC]. When we step back, we see that [EBP-0xC] gets the value 0 at line 170. Stepping further back shows us why : if the third parameter has the value 6, then [EBP-0xC] gets the value 0. You can see this at line 168. The only reason why a programmer would do this is for educational purposes, I guess. Of course, there are other windows in WinDbg that you can use to see the memory, the registers and the local variables, but a very interesting one is the "call stack" window. When you open this window, you see the unwound stack. If you click on one of the functions there, you get the source code of the function !
Note: Make sure you've set File / Source Path correctly, otherwise the debugger will search for the source file at the location that is stored in the symbols (see the kv command).
This concludes the first exercise. Without having the source code at hand, we found out which functions were called, which parameters were used, what values these parameters had, and finally why the application crashed. This is what production debugging is all about ! To make it easier to follow the course, the source code for this application is given below.

Unwind1 source code
#include <windows.h>
#include <stdio.h>

/* use the extern "C" to turn off C++ name decoration. */
extern "C"
{
    int CallWithCDecl(char* szMessage, int a, int b);
    int __stdcall CallWithStd(char* szMessage, int a, int b);
    int __fastcall CallFast(char* szMessage, int a, int b);
}

int main(void)
{
    int iNumber1 = 0, iNumber2 = 0;

    printf("Divider application : (press 0 - 0 to stop)\n\n");

    while(1)
    {
        printf("Give an integer, please : ");
        scanf("%d",&iNumber1);
        printf("Give another integer, please : "); // give here a 6
        scanf("%d",&iNumber2);

        /* ... the remainder of main() is not reproduced in this listing ... */

int CallWithCDecl(char* szMessage, int a, int b)
{
    printf("%s %d, %d\n",szMessage,a,b);

    CallWithStd("Now in the CallWithStd function, parameters are :",a,b);

    return 1;
}

int __stdcall CallWithStd(char* szMessage, int a, int b)
{
    printf("%s %d, %d\n",szMessage,a,b);

    CallFast("Now in the CallFast function, parameters are :", a, b);

    return 2;
}

int __fastcall CallFast(char* szMessage, int a, int b)
{
    printf("%s %d, %d\n",szMessage,a, b);

    int iDivider = 0, iResult = 0;
    if(b == 6)
    {
        iDivider = 0;
    }
    else
    {
        iDivider = b;
    }

    iResult = a / iDivider;

    printf("Result of the division of %d by %d is : %d\n",a,iDivider,iResult);

    return 3;
}
Exercise 2 : CrashDivider (a crashing .NET application)

Introduction

In this example we have a look at how we can debug a .NET application with WinDbg. The program we're going to debug is written in C#, and we let it crash on a division (just like in the previous exercise). However, whereas the previous program stopped with a "stop report" dialog, the CrashDivider application gives an exception error dialog (see figure 6.8).
Figure 6.8 The "exception error" dialog.

During this exercise, we'll again use the debug version of the application. In the next chapter, the free build of this application will be used. For this exercise, we don't start the debugger using the shortcut. The best method is to start the debugger from the Visual Studio.NET 2003 command prompt. This prompt can be started from Start / Programs / Microsoft Visual Studio.NET 2003 / Visual Studio.NET Tools / Visual Studio.NET 2003 command prompt. Also make sure that the folder where WinDbg can be found is added to the path environment variable. When you now start WinDbg from the command prompt, all settings are right to use the SOS extension we need for debugging the .NET application.

First and second chance exception handling.

When debugging an application, it's possible to choose whether occurring faults are handled by the application itself or by the debugger. Indeed, the term "first and second chance exception handling" just tells who gets the first chance to handle the error. Second chance exception handling is only possible when a debugger is attached to the program. After loading the application that must be debugged, open the dialog to set the event filters (Debug / Event filters), and set CLR exceptions to 'enabled' but not to 'handled' (see figure 6.9).
Figure 6.9 CLR exception handling settings

Additionally, you can also add the CLR notification exception. This setting is only available starting with version 6.3.5.1 of WinDbg.

Debugging "CrashDivider"

Just as with the previous debugging exercise, line numbers are added to the debug log, because this makes it easier to explain the debug scenario. Start WinDbg from the Visual Studio.NET command prompt, load the CrashDivider application in the debugger, set the event filters as explained above, add the path for the symbol files, and there we go !

1 : Opened log file 'c:\temp\crashdivider.txt'
2 : 0:000> .sympath+ c:\sai\debugging\crashdivider\bin\debug
3 : Symbol search path is: e:\Symbols\Windows2003;e:\Symbols\Windows2000;e:\Symbols\Windows2000\SP3;e:\Symbols\Windows2000\SP4;srv*e:\Symbols\Other*http://msdl.microsoft.com/download/symbols;srv*e:\Symbols\Other*\\s05074\symbols;c:\sai\debugging\crashdivider\bin\debug
4 : 0:000> .reload
5 : Reloading current modules
6 : ......
7 : 0:000> g
8 : ModLoad: 77290000 772d9000 C:\WINDOWS\system32\SHLWAPI.dll
9 : ModLoad: 77c00000 77c44000 C:\WINDOWS\system32\GDI32.dll
10 : ModLoad: 77d00000 77d8f000 C:\WINDOWS\system32\USER32.dll
11 : ModLoad: 77ba0000 77bf4000 C:\WINDOWS\system32\msvcrt.dll
12 : ModLoad: 791b0000 79412000 C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\mscorwks.dll
13 : ModLoad: 7c340000 7c396000 C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\MSVCR71.dll
14 : ModLoad: 79040000 79085000 C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\fusion.dll
15 : ModLoad: 77380000 77b5d000 C:\WINDOWS\system32\SHELL32.dll
16 : ModLoad: 70ad0000 70bb6000 C:\WINDOWS\WinSxS\x86_Microsoft.Windows.Common-Controls_6595b64144ccf1df_6.0.100.0_x-ww_8417450B\comctl32.dll
17 : ModLoad: 79780000 79980000 c:\windows\microsoft.net\framework\v1.1.4322\mscorlib.dll
18 : ModLoad: 79980000 79ca6000 c:\windows\assembly\nativeimages1_v1.1.4322\mscorlib\1.0.5000.0__b77a5c561934e089_6b93758c\mscorlib.dll
19 : ModLoad: 79510000 79523000 C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\mscorsn.dll
20 : ModLoad: 77160000 77285000 C:\WINDOWS\system32\ole32.dll
21 : ModLoad: 63000000 63014000 C:\WINDOWS\system32\SynTPFcs.dll
22 : ModLoad: 77b90000 77b98000 C:\WINDOWS\system32\VERSION.dll
23 : ModLoad: 744f0000 7453b000 C:\WINDOWS\system32\MSCTF.dll
24 : ModLoad: 79430000 7947c000 C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\MSCORJIT.DLL
25 : ModLoad: 51a70000 51af0000 C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\diasymreader.dll
26 : (c7c.b60): CLR exception - code e0434f4d (first chance)
27 : First chance exceptions are reported before any exception handling.
28 : This exception may be expected and handled.
29 : eax=0012f52c ebx=00000007 ecx=00148918 edx=0001290f esi=00000000 edi=00000000
30 : eip=77e649d3 esp=0012f528 ebp=0012f57c iopl=0 nv up ei pl zr na po nc
31 : cs=001b ss=0023 ds=0023 es=0023 fs=0038 gs=0000 efl=00000246
32 : KERNEL32!RaiseException+0x51:
33 : 77e649d3 5e pop esi
34 : 0:000> kb
35 : ChildEBP RetAddr Args to Child
36 : 0012f57c 7921020d e0434f4d 00000001 00000000 KERNEL32!RaiseException+0x51
37 : 0012f5d4 792ed555 04a43f98 00000000 04a43f98 mscorwks!RaiseTheException+0xa0
The exercise explained.

The first thing that must be done is opening a log file. At line 2, the symbol path is adjusted, so that the symbols for the application can be found. To make these settings active, the .reload command is issued (line 4). Once again, it's better to set up a symbol store in your enterprise ; thanks to that, you never have to worry about loading the right symbols. At line 7, the go command is issued. Immediately, we see which DLL's and assemblies are loaded by the application (lines 8 to 25). Thanks to the full path names, you can directly see the version of the framework that is used by this application (version v1.1.4322). In this exercise, two values were given : 70 and 7. Let's see if we can find these values back, and more interesting : why is an exception thrown by the application ? Because the event filter is set so that the debugger is notified first, the debugger breaks in on the first chance exception. If the event filter had not been set this way, the application (or the underlying framework) would have tried to handle the exception first, just as when no debugger is attached. Since a debugger is attached in this case, it would then have been given a second chance to handle an exception that the application leaves unhandled (remember that second chance exception handling is only possible if a debugger is attached). If there is no debugger attached, and if there is no exception handling coded in the application, the exception dialog (see fig 6.8) appears. At line 26, you see that a CLR exception has occurred.
Note: restart the debugging session without the event filter set. You'll see that a first chance exception is fired, directly followed by a second chance exception.
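The order in which the debugger and the application get their chance can be written down as a small decision function. This is a deliberately simplified toy model, not a Windows or WinDbg API : the Scenario structure and the whoDealsWithException and demoChances functions are hypothetical names, and the model assumes that an exception marked as 'handled' in the event filter never reaches the application.

```cpp
#include <cassert>
#include <string>

// Toy model of exception dispatching; all names are hypothetical.
struct Scenario {
    bool debuggerAttached;            // is a debugger attached to the process ?
    bool debuggerHandlesFirstChance;  // event filter set to 'handled'
    bool applicationHasHandler;       // a matching handler exists in the code
};

std::string whoDealsWithException(const Scenario& s) {
    // First chance : the debugger is notified before any handler runs.
    if (s.debuggerAttached && s.debuggerHandlesFirstChance)
        return "debugger (first chance)";

    // Otherwise the exception handling of the application gets its turn.
    if (s.applicationHasHandler)
        return "application handler";

    // Second chance : only possible when a debugger is attached.
    if (s.debuggerAttached)
        return "debugger (second chance)";

    // Nobody handled it : the exception dialog appears.
    return "unhandled: exception dialog";
}

// Usage : the situation of this exercise (debugger attached, CLR
// exceptions 'enabled' but not 'handled', no handler in the program).
void demoChances() {
    assert(whoDealsWithException({true, false, false}) ==
           "debugger (second chance)");
}
```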
Just as with a native application (no .NET involved), the debugger directly gives us information about the CPU registers, the stack pointer, the instruction pointer, the base pointer, and so on. At line 34, the kb command is given, so that we can see the stack trace. Lines 36, 37 and 38 show the function calls that make the exception dialog appear. At line 39, a warning is given that the following frames may be wrong. Also, lines 40, 41 and 42 don't have a descriptive text with the function names. Rather, just addresses are given. This is because the code that is located there is executed by the framework (this is .NET code). Because of this, the debugger can't find the function names. The reason is that the application is JIT compiled (Just In Time). The frames that follow can be resolved again. At line 55, the SOS extension is loaded into the debugger. This extension (SOS = Son Of Strike) is made to allow WinDbg to debug .NET code. Because we started WinDbg from the Visual Studio.NET command prompt, there is no need to give a full path name when loading the SOS extension.
Note: starting with WinDbg version 6.3.11.3, an SOS extension comes with the debugger. However, during this exercise, the SOS extension that comes with Visual Studio.NET 2003 is used.
The first thing to do, after loading the SOS extension, is to ask for the number of threads in the application. This is done at line 56 (!threads). Because this command is an extension command, it must start with a bang sign (!). Line 58 tells us that the SOS table version 5 is loaded. This is the table that corresponds with the .NET framework 1.1 (other framework versions have different table versions as well). As you can see, two managed threads are active in the application : a "finalizer" thread that is found in every .NET application (responsible for the 'clean up' of the application), and another thread where the System.ArgumentException occurred (lines 66 and 67). Since we're interested in this last thread, this is the thread we'll work on. As you can see, this thread is running in a single threaded apartment (STA). The thread state (here 6020) indicates this as well.
Note: Information about thread states can be found in the appendix of the "Patterns and Practices" book titled : "Production Debugging for .NET Framework Applications".
The next command used is !clrstack all (line 68). At line 69 you see that we are indeed working on the right thread (thread 0). With the option all, all information is shown : CPU register values, function parameters and local variables. For example, the function CrashDivider.Class1.TheDivider has three parameters (a string object and two I4 objects). Also, the address for this function (EIP = 0x071a0179) is the address that couldn't be resolved with the kb command (line 72). At line 75, information is given about the file and line number where this function can be found. Because the checked build of the application is used here, all information about the parameters and the local variables is available (their names and values). You can see that the values 70 and 7 are used for the I4 (int32) parameters. The names of the local variables are c and k. The string parameter is an object of type class System.String and has the name szMessage. The content of this string can be found at address 0x04a43dd8 (fig 6.10).
Figure 6.10 The System.String object "szMessage" as seen in memory.

You see directly that the string is in Unicode format. Also, the string is prefixed with some other data. This is because it isn't just a string (array of characters) as used in C/C++ applications, but a .NET object. Thus, when you give the !dumpobj 0x04a43dd8 command, you'll get all the information about this object. When you want to see more than just the managed stack, use the !dumpstack command (line 98). This command gives us the complete stack trace, and thus we can also find the managed functions in it (lines 110, 111 and 112). As you see, this command gives a lot of information, but it is most valuable when parts of the stack trace are missing due to missing symbols. Now we know which functions are called during the execution of the program, what kind of parameters are used, the values of these parameters and the local variables that are used. If we want to find out why an exception was thrown, we have to look at the assembler code again. However, because we're dealing with .NET code, we need the !u command for this (and not the u command). This command is issued at line 179. The given address (0x071a0179) is the place where the instruction pointer was at the moment of the exception. Since the IP points to the next instruction to execute, you see that the function mscorwks!JIT_Throw was called (at line 207). The !u command displays the complete function for us, together with the start address of the function and the prologue (line 182). When we read the assembler code, we see at line 197 that the value in [ebp+0x8] is compared with the value 10 (decimal). If this value is greater than or equal to 10, the division is performed, otherwise the exception is thrown ! The value at [ebp+0x8] is 7, being the third passed parameter. Since the function cleans up the stack itself (ret 0x4), the method follows the fastcall calling convention (the first two parameters are passed through registers). Attention : the first two parameters are not passed here using the ECX and EDX registers, but via the EDI and ESI registers.
This is caused by code optimization by the compiler and this is something we'll see very often when working with free builds. Since we've seen that EDI, ESI and EBX are pushed onto the stack, let's have a look at which objects are on the stack : execute the command !dumpstackobjects (line 244). In return, the debugger displays all the objects on the stack. One of these objects (04a43f98) is a System.ArgumentException object. When you dump the content of this object (!dumpobj 04a43f98 on line 262), you get a lot of information about it. One of the parts of the object is another object, called _message (line 273). When you dump this object in turn (line 285), you can see the message string, having the value "Throwing an exception" (line 291). The syntax on line 301 is used to show all threads, and to display stack information about these threads. Combining debugger instructions like this is very common with WinDbg. More information about the usage of the SOS.DLL extension can be found in the extension itself (type !clr10\sos.help after loading the extension, when using the SOS extension that comes with the debugger). Further information can be found in the web page at C:\Program Files\Microsoft Visual Studio.NET 2003\SDK\v1.1\Tool Developers Guide\Samples\sos\sos.htm.

The CrashDivider code.

When you look at the code below, you'll see that an exception was indeed thrown.
using System;

namespace CrashDivider
{
    /// <summary>
    /// Summary description for Class1.
    /// </summary>
    class Class1
    {
        /// <summary>
        /// The main entry point for the application.
        /// </summary>
        [STAThread]
        static void Main(string[] args)
        {
            Console.Write("Enter an integer : ");
            int i = Int32.Parse(Console.ReadLine());
            Console.Write("Enter another integer : ");
            int j = Int32.Parse(Console.ReadLine());
            JustAFunction("Now in just another function",i,j);
            Console.Write("Press <enter> to stop...");
            Console.Read();
        }

        static void JustAFunction(string szMessage, int a, int b)
        {
            Console.WriteLine(szMessage);
            TheDivider("Now ready to divide...",a,b);
            return;
        }

        static void TheDivider(string szMessage, int a, int b)
        {
            Console.WriteLine(szMessage);
            int c, k;
            if(b < 10)
            {
                throw new ArgumentException("Throwing an exception", "x");
                // c = 0;
            }
            else
            {
                c = b;
            }

            k = a / c;

            Console.WriteLine("The division of {0} by {1} is : {2}",a, c ,k);
        }
    }
}
So far, we've discussed techniques to debug applications that run in the context of a debugger. Also, the checked build of the applications has been used so far. However, in a production environment, the free build of the program is used. Also, you can hardly ask the end user to start the application from within a debugger. A first solution would be to define WinDbg as the default debugger. When a problem occurs, it isn't the famous Dr. Watson but WinDbg that is started. But once again : what can an end user do with the debugger ? And then : can you ask the end user not to touch anything anymore until you have the time to go to her/his desk and start debugging the application ? And what about NT services and COM+ components on a server ? As said before : every moment that an application (and especially a server application) can't run costs a lot of money. This means that the program must be restarted as soon as possible and that debugging must be done "off line". Only in very rare situations, where the bug is very hard to find, can you consider debugging on the server itself. For all other scenarios, create a dump file and analyze that.

Creating a dump

When people hear about a dump, they directly think of the famous BSOD (Blue Screen Of Death) memory dump. This dump, when configured this way (see later), writes the full memory content to disk, so that the people doing the debugging can analyze it. Of course, you shouldn't bring down the whole server just to debug one COM component or so. You can create application dumps for applications that remain "hanging" or that "crash". The tool we use for this is ADPLUS.VBS. This script comes with the debugging tools (same package as WinDbg) and can be adapted to your own needs if wanted (see later). The syntax to use ADPlus for a crashing application is :

ADPlus.vbs -crash -pn crashdivider.exe -quiet

In this command, -crash indicates that ADPlus will be used for a crashing application.
Otherwise, the -hang option should be used. -pn is followed by the name of the process that must be monitored, where pn stands for process name. If you want to debug only one instance of a certain application, use the option -p <PID>, where PID is the process identifier of your application (as found in the Task Manager, for example). The last option (-quiet) is used so that we don't see any dialog on the screen. If you want to take several dumps for a problem, in an automated way, you should avoid having dialog boxes popping up. Running ADPlus without any parameter gives you an overview of the possibilities of the script. More detailed information can be found in the help files that come with the debugger. Live debugging and analyzing a dump are not so different. The only things you can't do in a dump are setting breakpoints and modifying variables and data flow. However, you've seen that until now, we didn't really need this. Modifying variables, changing program flow and setting breakpoints is something that must be done during the development phase, and not during production debugging. For now, just start CrashDivider.exe (the free build) and when the program is ready to accept user input, start ADPlus with the same syntax as described above. Figure 7.1 shows the command prompt, just after issuing the command. As soon as you give a value smaller than 10 (as we've seen in the previous chapter), the application will crash.
Figure 7.1 The debugger attached to the "crashdivider" application

When you run the ADPlus script, you see on the task bar that the CDB debugger is active. When the application crashes, the dump is generated by CDB and after the generation of the dump, CDB stops.
Note: The CDB debugger is a console debugger that can be used to debug user applications. You cannot use it to do kernel debugging. The commands used in CDB are the same as used in WinDbg.
Since no parameters regarding the output folder were given to ADPlus, a folder is created in the same folder as where ADPlus is located. The name of the created folder is Crash_Mode__Date_mm-dd-yyyy__Time_hh:mm:ssmm. In this folder, you find a report about the debugger instructions, a list with all running processes at the time of the crash, a log file, two dump files and an additional folder holding a configuration file for the debugger. Of course, in our case, we're mostly interested in the generated dump files. Because the ADPlus script was not configured to catch the bug directly, the debugger created a "2nd chance .NET CLR" dump. During the next debug session, we'll use this dump for the analysis of the problem, but we could have used the "1st chance process shutdown" dump as well.
349 : 0709013b 8bd8 mov ebx,eax
350 : 0709013d b90807b979 mov ecx,0x79b90708
351 : 07090142 e8d11e34f9 call 003d2018
352 : 07090147 8bd0 mov edx,eax
353 : 07090149 897204 mov [edx+0x4],esi
354 : 0709014c 8bf2 mov esi,edx
355 : 0709014e b90807b979 mov ecx,0x79b90708
356 : 07090153 e8c01e34f9 call 003d2018
357 : 07090158 8bd0 mov edx,eax
358 : 0709015a 897a04 mov [edx+0x4],edi
359 : 0709015d 8bfa mov edi,edx
360 : 0709015f b90807b979 mov ecx,0x79b90708
361 : 07090164 e8af1e34f9 call 003d2018
362 : 07090169 8bd0 mov edx,eax
363 : 0709016b 895a04 mov [edx+0x4],ebx
364 : 0709016e 8bc2 mov eax,edx
365 : 07090170 56 push esi
366 : 07090171 57 push edi
367 : 07090172 50 push eax
368 : 07090173 8b0d2c20a405 mov ecx,[05a4202c]
369 : 07090179 8b15a011a405 mov edx,[05a411a0]
370 : 0709017f 8b01 mov eax,[ecx]
371 : 07090181 ff90e8000000 call dword ptr [eax+0xe8]
372 : 07090187 5b pop ebx
373 : 07090188 5e pop esi
374 : 07090189 5f pop edi
375 : 0709018a c20400 ret 0x4
376 : 0709018d b98c7bb879 mov ecx,0x79b87b8c
377 : 07090192 e8811e34f9 call 003d2018
378 : 07090197 8bf0 mov esi,eax
379 : 07090199 ff359c11a405 push dword ptr [05a4119c]
380 : 0709019f 8b159811a405 mov edx,[05a41198]
381 : 070901a5 8bce mov ecx,esi
382 : 070901a7 ff15007cb879 call dword ptr [mscorlib_79980000+0x207c00 (79b87c00)]
383 : 0:000> .logclose
384 : Closing open log file c:\temp\dump.txt

The exercise explained

Just as with the previous debugging scenario, we give the command kb to see the stack trace. We see that information is missing for managed code. For this, the SOS extension is loaded. Then we look at how many threads there are and ask for the managed stack of the failing thread (lines 2, 8, 24, 25 and 37). The most remarkable thing you see is that the function 'JustAFunction' doesn't exist anymore. When using the !dumpstack command, you see that CrashDivider.Class1.Main is directly calling CrashDivider.Class1.TheDivider (line 68). This is due to compiler optimization.
The compiler considered the 'JustAFunction' function too simple. For this reason, it made the function a so-called 'inline function'. When you take a closer look at the content of the main function, you see that the code of 'JustAFunction' is now in the main function. Use the !u 070900e5 command to see the content of the main function.
Note: The address of the main function can be found at line 47 or 68.
At line 222 you see that the code of the second function is indeed embedded in main ("Now in just another function"). When looking at line 59 (in the !dumpstack output), you see that the function is prefixed by a so called 'Method description'. The value you find here, can be used with the !DumpMD command (dump method description). This gives you more information about the function. In the output, you see this at line 327. At line 333, you see the virtual address of the main function (VA). Using the given address, you can disassemble the code in native mode (not managed). If you don't get VA information, but IL RVA (Intermediate Language Relative Virtual Address), then the code is not JIT compiled yet.
copyright 2004 Hans De Smaele
8 Debugging .NET applications where symbols are missing.

Introduction

Sooner or later you'll have to debug a .NET application for which you don't have the symbols available. The reason for this can be :
- Your company doesn't have rules yet about the generation of symbol files during the build process.
- You have to debug an application that isn't made by your company, and where the creator of the program didn't deliver the symbols together with the application.

Debugging without symbols is difficult. You must, just as with native applications, use the addresses you find on the stack. Then, you should try to look up these addresses with a tool like the 'Dependency Walker'. Because all this is so difficult and time consuming, it's far better to reverse the application into the intermediate language (IL code) and then to reassemble it with the options set so that symbols are generated. In the next exercise, I show you how this is done, using the CrashDivider application.

The exercise

Open a Visual Studio.NET 2003 command prompt and navigate to the folder where the free build of CrashDivider can be found. From there, execute the following command :

Ildasm crashdivider.exe /out=crashdivider.il

After executing this command, the ildasm program (intermediate language disassembler) creates the intermediate code for the CrashDivider application. Next to the il file, a resource file is generated with version information and so on. Here below, you see the generated code :
// Microsoft (R) .NET Framework IL Disassembler. Version 1.1.4322.573
// Copyright (C) Microsoft Corporation 1998-2002. All rights reserved.

// =============== CLASS MEMBERS DECLARATION ===================
// note that class flags, 'extends' and 'implements' clauses
// are provided here for information only

.namespace CrashDivider
{
  .class private auto ansi beforefieldinit Class1
         extends [mscorlib]System.Object
  {
    .method private hidebysig static void Main(string[] args) cil managed
    {
      .entrypoint
      .custom instance void [mscorlib]System.STAThreadAttribute::.ctor() = ( 01 00 00 00 )
      // Code size 71 (0x47)
      .maxstack 3
      .locals init ([0] int32 i, [1] int32 j)
      IL_0000: ldstr "Enter an integer : "
      IL_0005: call void [mscorlib]System.Console::Write(string)
      IL_000a: call string [mscorlib]System.Console::ReadLine()
      IL_000f: call int32 [mscorlib]System.Int32::Parse(string)
      IL_0014: stloc.0
      IL_0015: ldstr "Enter another integer : "
      IL_001a: call void [mscorlib]System.Console::Write(string)
      IL_001f: call string [mscorlib]System.Console::ReadLine()
      IL_0024: call int32 [mscorlib]System.Int32::Parse(string)
      IL_0029: stloc.1
      IL_002a: ldstr "Now in just another function"
      IL_002f: ldloc.0
      IL_0030: ldloc.1
      IL_0031: call void CrashDivider.Class1::JustAFunction(string, int32, int32)
      IL_0036: ldstr "Press <enter> to stop..."
      IL_003b: call void [mscorlib]System.Console::Write(string)
      IL_0040: call int32 [mscorlib]System.Console::Read()
      IL_0045: pop
      IL_0046: ret
    } // end of method Class1::Main

    .method private hidebysig static void JustAFunction(string szMessage, int32 a, int32 b) cil managed
    {
      // Code size 19 (0x13)
      .maxstack 3
      IL_0000: ldarg.0
      IL_0001: call void [mscorlib]System.Console::WriteLine(string)
      IL_0006: ldstr "Now ready to divide..."
      IL_000b: ldarg.1
      IL_000c: ldarg.2
      IL_000d: call void CrashDivider.Class1::TheDivider(string, int32, int32)
      IL_0012: ret
    } // end of method Class1::JustAFunction
Now, the il file can be assembled again, with debug information included. After this, symbol files for the application will be available. Just execute the command :

Ilasm /DEBUG /RESOURCE=crashdivider.res crashdivider.il

The output generated by this program can be seen in figure 8.1.
Figure 8.1 The output, generated by the ilasm program.

When you now start the program under the debugger, you see that the symbols point to the il file. This is not as easy to read as C# code, but at least easier than assembler code. You can see a part of the debug scenario here below.
Microsoft (R) Windows Debugger Version 6.3.0017.0 Copyright (c) Microsoft Corporation. All rights reserved.
CommandLine: C:\SAI\Debugging\CrashDivider\bin\Release\crashdivider.EXE
Symbol search path is: e:\Symbols\Windows2003;e:\Symbols\Windows2000;e:\Symbols\Windows2000\SP3;e:\Symbols\Windows2000\SP4;srv*e:\Symbols\Other*http://msdl.microsoft.com/download/symbols;srv*e:\Symbols\Other*\\s05074\symbols
Executable search path is:
ModLoad: 00400000 0040a000 crashdivider.exe
ModLoad: 77f40000 77ffa000 ntdll.dll
ModLoad: 79170000 79196000 C:\WINDOWS\system32\mscoree.dll
ModLoad: 77e40000 77f34000 C:\WINDOWS\system32\KERNEL32.dll
ModLoad: 77da0000 77e30000 C:\WINDOWS\system32\ADVAPI32.dll
ModLoad: 77c50000 77cf5000 C:\WINDOWS\system32\RPCRT4.dll
(87c.aa0): Break instruction exception - code 80000003 (first chance)
eax=77fc35ef ebx=7ffdf000 ecx=00000004 edx=77f8ed10 esi=77fc23b4 edi=00241f08
eip=77f43847 esp=0012fb60 ebp=0012fc48 iopl=0 nv up ei pl nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
ntdll!DbgBreakPoint:
77f43847 cc int 3
0:000> .sympath+ c:\sai\debugging\crashdivider\bin\release
Symbol search path is: e:\Symbols\Windows2003;e:\Symbols\Windows2000;e:\Symbols\Windows2000\SP3;e:\Symbols\Windows2000\SP4;srv*e:\Symbols\Other*http://msdl.microsoft.com/download/symbols;srv*e:\Symbols\Other*\\s05074\symbols;c:\sai\debugging\crashdivider\bin\release
0:000> .reload
As you can see in the listing of the il code, a test is done to see if the second integer parameter is smaller than 10 (IL_0009: bge.s IL_001b branches to the division when the value is greater than or equal to 10).
9 Remote debugging

Introduction

Remote debugging can be most useful in scenarios like :

- During the development stage, the debugging of a GUI application can be easier via remote debugging.
- The computer where the fault occurred is on a remote site that you can't (or don't want to) access.
- The bug that occurred is so difficult that a 'live debugging session' is required (for example : you need to set breakpoints somewhere).
However, one thing is for sure : debugging is time consuming. Only when the problem is on a station that just runs one single application, and the application doesn't work anymore (doesn't even start anymore), you can consider debugging on that station itself. Of course, you can't install the complete Visual Studio.NET 2003 development environment on that computer, just to find the bug. For this, remote debugging can be a solution (even when the two computers are located side by side). Once you have a connection between the two computers, you can start debugging just as you would during the development stage. Here below, you can find what files must be installed on the debuggee (the station where the fault occurs).

Visual Studio.NET remote components setup

The installation of the remote components is very easy : just take the Visual Studio.NET 2003 DVD and run setup.exe. Choose at the dialog the option "Remote components setup" (see figure 9.1). Of course, don't forget that the .NET framework must be installed on the computer where the bug occurs (but why should you do .NET debugging on a station where a .NET application can't run ?).

If you don't have the VS.NET 2003 DVD at hand, just copy the components below from your computer to the debuggee :

- MSVCR71.DLL (located in %systemroot%\system32)
- MSVCI71.DLL (same folder)
- MSVCP71.DLL (same folder)
- MSVCMON.EXE (located in <vs.net install dir>\common7\packages\debugger)
- NATDBGDM.DLL (same folder)
- NATDBGTLNET.DLL (same folder)

On the debuggee, start the program MSVCMON.EXE and there you go !
Figure 9.1 Choose here the "Remote Components Setup"

Visual Studio 6.0 remote debugging setup

To do remote debugging with VS 6.0, you need the following files :

- MSVCMON.EXE
- MSVCRT.DLL
- TLN0T.DLL
- DM.DLL
- MSVCP60.DLL
- MSDIS110.DLL
- PSAPI.DLL (only needed on a Microsoft Windows NT computer).

Here also, start MSVCMON.EXE and there you go !

Remote debugging with WinDbg.

Personally, I like WinDbg the most to do remote debugging. Not only is this the most powerful debugger, but it has also the most possibilities to connect computers with each other. In some scenarios, you can connect up to 5 computers to debug a station !
Note: Check the WinDbg documentation for more information about how to set up remote debugging.

An easy way to debug user applications (not kernel problems) with WinDbg is by installing WinDbg on the debugging station and "DbgSrv" (process server) on the remote computer. Give the command below on the remote computer :

DbgSrv -t tcp:port=9999

And on the debugging computer, where WinDbg is located, run :

WinDbg -premote tcp:server=<remote station name>,port=9999 -p <PID>

And there you go !
10 Some debugging scenarios.

Introduction

The things you learned so far allow you to find most of the bugs (if not all of them !). However, some bugs can be found more easily if you use a specific approach. In this chapter we discuss some debugging scenarios and give a kind of 'best practice' to find these bugs. The kinds of bugs we'll discuss here are :

- Debugging heap corruption
- Debugging spinning threads (100 % cpu usage)
- Debugging memory leaks
- Debugging a NT service during service startup
- Debugging COM+ applications (in combination with IIS).
- .NET debugging scenarios.
Debugging heap corruption

For this exercise we use a small application that writes too far in memory. When you execute the program, you get a screen like figure 10.1.

Figure 10.1 The dialog box you get after a heap corruption occurred.

This screen always shows that a heap corruption has occurred. Another screen that points to heap corruption is a dialog box with the message "Error : access violation C0000005.". Let's try to debug this !

Start WinDbg, select File / Open Executable and navigate to the "Heapcorruption.exe" program. Start the application with the "g" command. The application starts and we get a small dialog box. Press the "Go !" button. You can see that the debugger breaks into the program with an error message about the HEAP (line 31 in the debugging listing). After giving the "kb" command, you can see at line 49 that an error was detected at line 118 in the source code (at the end of the function OnButton1). The debug listing is here below :
The listing (partial) for the application is displayed here below :
void CHeapcorruptionDlg::OnButton1()
{
    // TODO: Add your control notification handler code here
    int i = 0;
    LPVOID p;
    PTCHAR cp;

    m_Listbox.AddString("Start");

    m_Listbox.AddString("About to Global Alloc");

    p = GlobalAlloc(GMEM_FIXED,32);

    cp = (PTCHAR) p;

    m_Listbox.AddString("About to fill buffer");
    for (i = 0; i < 1000; i++)
        *cp++ = 'a';

    m_Listbox.AddString("About to free memory");
    GlobalFree(p);

    m_Listbox.AddString("Complete");
}
As you can see, the error is not at the end of the function ! Somewhere in the function, a memory block of 32 bytes is reserved - p = GlobalAlloc(GMEM_FIXED,32) - and somewhat later the loop that counts from 0 through 999 writes far beyond the end of that block.

When you activate heap checking on your computer, the debugger stops at the place in the code where the error really is. To do so, you must use the "gflags.exe" program. Start Gflags.exe and fill in the dialog as in figure 10.2.
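The overrun in OnButton1 comes down to a loop bound that ignores the size of the allocation. As a minimal sketch in portable C++ (this is not the author's code : the names "fill_bounded" and "demo_fill" are hypothetical, and a std::vector stands in for the block returned by GlobalAlloc), a fill routine that respects the block size could look like this :

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the original application): writes at most
// `capacity` characters, however many the caller asks for.
std::size_t fill_bounded(char* buffer, std::size_t capacity, std::size_t requested)
{
    std::size_t written = 0;
    // Stop at the end of the block instead of at a hard-coded count.
    for (; written < requested && written < capacity; ++written)
        buffer[written] = 'a';
    return written;
}

// Mirrors the listing: a 32-byte block and a request for 1000 characters.
std::size_t demo_fill()
{
    std::vector<char> block(32);   // stands in for GlobalAlloc(GMEM_FIXED, 32)
    return fill_bounded(block.data(), block.size(), 1000);
}
```

With this bound, a request for 1000 characters stops after the 32 characters that fit in the block, so nothing is written past its end.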
Figure 10.2 Gflags.exe settings to detect heap corruption.

Do not give the full path name for the "heapcorruption.exe" application, but just type the name of the image. After you've selected the check boxes as in the figure, click the "Apply" button (clicking on the OK or Cancel button just ends the gflags application). Now, use WinDbg again and start a new debugging session for the application. As you can see in the debugging listing below, the debugger reports an access violation error and points this time to the correct line in the function.
CommandLine: C:\SAI\Debugging\heapcorruption\Release\heapcorruption.exe
Symbol search path is: e:\Symbols\Windows2003;e:\Symbols\Windows2000;e:\Symbols\Windows2000\SP3;e:\Symbols\Windows2000\SP4;srv*e:\Symbols\Other*http://msdl.microsoft.com/download/symbols;srv*e:\Symbols\Other*\\s05074\symbols;c:\sai\debugging\heapcorruption\release
Executable search path is:
ModLoad: 00400000 00405000 heapcorruption.exe
ModLoad: 77f40000 77ffa000 ntdll.dll
Page heap: pid 0xA58: page heap enabled with flags 0x3.
ModLoad: 77e40000 77f34000 C:\WINDOWS\system32\kernel32.dll
ModLoad: 73d20000 73e13000 C:\WINDOWS\system32\MFC42.DLL
ModLoad: 77ba0000 77bf4000 C:\WINDOWS\system32\msvcrt.dll
ModLoad: 77c00000 77c44000 C:\WINDOWS\system32\GDI32.dll
ModLoad: 77d00000 77d8f000 C:\WINDOWS\system32\USER32.dll
(a58.2dc): Break instruction exception - code 80000003 (first chance)
eax=77fc35ef ebx=7ffdf000 ecx=00000005 edx=77f8ed20 esi=77fc23b4 edi=00c18fb0
eip=77f43847 esp=0012fb60 ebp=0012fc48 iopl=0 nv up ei pl nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
ntdll!DbgBreakPoint:
77f43847 cc int 3
0:000> .sympath+ c:\sai\debugging\heapcorruption\release
Symbol search path is: e:\Symbols\Windows2003;e:\Symbols\Windows2000;e:\Symbols\Windows2000\SP3;e:\Symbols\Windows2000\SP4;srv*e:\Symbols\Other*http://msdl.microsoft.com/download/symbols;srv*e:\Symbols\Other*\\s05074\symbols;c:\sai\debugging\heapcorruption\release
0:000> .reload
Reloading current modules
.......
0:000> g
ModLoad: 70bc0000 70c50000 C:\WINDOWS\WinSxS\x86_Microsoft.Windows.Common-Controls_6595b64144ccf1df_5.82.0.0_x-ww_8A69BA05\COMCTL32.DLL
ModLoad: 77da0000 77e30000 C:\WINDOWS\system32\ADVAPI32.dll
ModLoad: 77c50000 77cf5000 C:\WINDOWS\system32\RPCRT4.dll
ModLoad: 63000000 63014000 C:\WINDOWS\system32\SynTPFcs.dll
ModLoad: 77b90000 77b98000 C:\WINDOWS\system32\VERSION.dll
ModLoad: 744f0000 7453b000 C:\WINDOWS\system32\MSCTF.dll
ModLoad: 770e0000 7715d000 C:\WINDOWS\system32\OLEAUT32.DLL
ModLoad: 77160000 77285000 C:\WINDOWS\system32\ole32.dll
(a58.2dc): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=01581000 ebx=00000001 ecx=00000020 edx=01581000 esi=00402310 edi=0012fe4c
eip=0040148c esp=0012f84c ebp=0012f85c iopl=0 nv up ei ng nz ac pe cy
cs=001b ss=0023 ds=0023 es=0023 fs=0038 gs=0000 efl=00010293
*** WARNING: Unable to verify checksum for heapcorruption.exe
heapcorruption!CHeapcorruptionDlg::OnButton1+0x71:
0040148c c60261 mov byte ptr [edx],0x61 ds:0023:01581000=??
0:000> kb
ChildEBP RetAddr Args to Child
0012f85c 73d223db 00402310 00000111 0012f89c heapcorruption!CHeapcorruptionDlg::OnButton1+0x71 [C:\SAI\Debugging\heapcorruption\heapcorruptionDlg.cpp @ 113]
0012f86c 73d222ed 0012fe4c 000003e9 00000000 MFC42!_AfxDispatchCmdMsg+0x80
0012f89c 73d88194 000003e9 00000000 00000000 MFC42!CCmdTarget::OnCmdMsg+0x108
0012f8c0 73d23097 000003e9 00000000 00000000 MFC42!CDialog::OnCmdMsg+0x1b
0012f910 73d21b4e 00000000 000f06ba 0012fe4c MFC42!CWnd::OnCommand+0x51
0012f990 73d21afd 00000111 000003e9 000f06ba MFC42!CWnd::OnWndMsg+0x2f
0012f9b0 73d21a78 00000111 000003e9 000f06ba MFC42!CWnd::WindowProc+0x22
0012fa10 73d219d0 00000000 00310780 00000111 MFC42!AfxCallWndProc+0x91
0012fa30 73daf562 00310780 00000111 000003e9 MFC42!AfxWndProc+0x34
*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\system32\USER32.dll -
0012fa5c 77d0612f 00310780 00000111 000003e9 MFC42!AfxWndProcBase+0x39
WARNING: Stack unwind information not available. Following frames may be wrong.
0012fa88 77d069a5 73daf529 00310780 00000111 USER32+0x612f
0012fb00 77d0881b 00000000 73daf529 00310780 USER32!GetMessageW+0x1ee
0012fb3c 77d0887d 000bf1d0 010b7e88 000003e9 USER32!MapWindowPoints+0x1a7
0012fb5c 77d346b2 00310780 00000111 000003e9 USER32!SendMessageW+0x47
0012fc04 77d13a43 00000001 00000202 00000000 USER32!DefMDIChildProcA+0x5f5
0012fc24 77d0612f 000f06ba 00000202 00000000 USER32!DialogBoxIndirectParamW+0x386
0012fc50 77d069a5 77d139e5 000f06ba 00000202 USER32+0x612f
0012fcc8 77d06689 00000000 77d139e5 000f06ba USER32!GetMessageW+0x1ee
0012fd30 77d06704 004030cc 00000000 77d08d19 USER32!TranslateMessage+0x520
0012fd60 77d0ffcc 00310780 010bf588 004030cc USER32!DispatchMessageW+0xb
Debugging spinning threads (100% cpu usage).

For this exercise, we use an application called "ThreadSpin.exe". This application creates a number of worker threads, one of which is using almost all the cpu. Diagnosing which thread is using the cpu can be done with a combined use of the Performance Monitor and WinDbg. The steps to find this kind of bug are explained in great detail (with lots of screen prints). The output can be slightly different, depending on the operating system used. Here, Windows 2003 Standard server is used.

Start the "Performance Monitor (perfmon)", delete any counters that are already set and then open "Counter Logs" under "Performance Logs and Alerts". At the right side, create a new log ("New Log Settings...") (see figure 10.3).
Figure 10.3 Create a new log in the Performance Monitor

After giving a name for the new log, choose the "Add" button (or "Add Counters" button) (see figure 10.4).

Figure 10.4 The "Add Counters" dialog, as seen in Microsoft Windows 2003 Standard Server.

Now, add as counters "All counters" and "All instances" for the performance objects "Processor", "Process" and "Thread" (figure 10.5 shows these settings for "Processor"). After you added all the counters, click the "Close" button. The dialog box you see should have the content as seen in figure 10.6. Don't forget to set (at least for this exercise) the interval counter at 1 second. If you want to choose another folder and filename (than the standard = c:\perflogs), you can do so on the "Log Files" tab.
Figure 10.5 The "Add Counters" dialog for the "Processor" object
Figure 10.6 The dialog box, after adding all the needed counters.

As soon as you click the "Ok" button, the performance counters logging is activated. Now start the "ThreadSpin.exe" application and click the "Go !" button. The application starts to put a heavy load on the processor. You can verify this with the "Task Manager" (figure 10.7). The output of the application itself can be seen in figure 10.8.

Figure 10.7 The cpu is at 100% after launching the "threadspin" application.

The output of the application is like figure 10.8. You can see that the application is somewhere in an endless loop.
Figure 10.8 The application's output

Open a command prompt (make sure the folder where the debugging tools are installed is added to the path) and type the following command :

ADplus -hang -pn ThreadSpin.exe -o c:\perflogs

The ADPlus script instructs the CDB debugger to take a memory snapshot of the process with name "ThreadSpin.exe". The output is written to the "c:\perflogs" folder. The ADPlus script displays a dialog as in figure 10.9. If you don't want to see this message box, add the -quiet option to the command above.

Figure 10.9 The ADPlus dialog, when using the "hang" mode.

Click the "OK" button on the dialog and you'll see that the CDB debugger disappears. This means that the memory content is written to disk. Now, you can close the application. If the application doesn't respond anymore, use the "Kill" utility that comes with the debugger tools to end the application (see figure 10.10).
Figure 10.10 Ending the Threadspin application with the "kill" utility.
Note: In some circumstances, you can leave the application running. The CDB debugger doesn't break your application.
Now, stop the logging in Performance Monitor. Then, select in the Performance Monitor window the "System Monitor" in the navigation pane, and select the option "Properties" in the rightmost part (using a 'right mouse click') (see figure 10.11).
Figure 10.11

Then, select the log file that is just created under the "Source" tab (figure 10.12).
Figure 10.12 Select the just created log file.

Now, you can add the collected counters data to "System Monitor". When you add the process "ThreadSpin", you see that it is this application that is indeed consuming the cpu cycles. If you want to know which thread is responsible for this, select as "Performance object" the object "Thread" and select for the counters "% Processor Time" and "ID Thread". As instances, you must select all thread instances for the "Threadspin" application (figure 10.13).

Figure 10.13 The thread information selected.

After you've added the counter's instances, click the "light bulb" on the toolbar. The outcome of this is that the selected thread is now highlighted. Step through the list of threads until you see a thread that is consuming more cpu than the others. Since thread 0 is the main thread of execution, and because all output goes through this thread, it's quite normal that this thread is consuming a lot of cpu. So, look further for another thread that has a higher cpu consumption than the others. On the figure below (10.14) you can see that it is thread 18 that has a higher consumption.

Figure 10.14 Thread 18 has a higher cpu consumption than the others.

Now, find for this thread the corresponding thread id. You must do this because the thread number (here 18) is not consistent during the whole program execution. As threads can end and others can be created, the thread numbers are reused by newly created threads. For this reason, you need the thread id to work with. As you see on the figure below (10.15), the thread id is 2492 (decimal) for thread 18.
Figure 10.15 The found thread id for thread number 18.

Now, start WinDbg and open the dump file that is generated by ADPlus (select File / Open Crash Dump). This dump can be debugged in the same way as we debugged the other applications and dump files. The first thing you should do is ask for all threads (~ command). Then, in the list of threads, search for the thread with thread id 2492 decimal. Since the thread ids are displayed in hex, just convert 2492 into hex using the "? 0n2492" evaluation command. As you see in the debug listing, this corresponds with the hex value 9bc. You see that thread id 9bc corresponds with thread 18. Switch to this thread (~18s) and ask the stack trace for this thread. In this stack trace, you find the function and line number where the thread is consuming so many cpu cycles.
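The decimal-to-hex step can also be checked outside the debugger. The sketch below is only an illustration (the helper name "thread_id_to_hex" is made up) : it formats a decimal id as lowercase hex without prefix, which is how the ids appear in the "~" thread list.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: formats a decimal process or thread id as lowercase
// hex without a prefix, matching the "Id: 868.9bc" notation of the thread list.
std::string thread_id_to_hex(unsigned long id)
{
    std::ostringstream out;
    out << std::hex << id;
    return out.str();
}
```

For the thread id 2492 reported by the Performance Monitor this gives "9bc", and for the process id 2152 that appears in the dump file name it gives "868".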
Loading Dump File [C:\perflogs\Hang_Mode__Date_06-13-2004__Time_09-32-4040\PID-2152__THREADSPIN.EXE__full_2004-06-13_09-32-42-310_0868.dmp]
User Mini Dump File with Full Memory: Only application data is available

Comment: 'Full dump in Hang Mode for THREADSPIN.EXE_running_on_L90004'
Windows Server 2003 Version 3790 UP Free x86 compatible
Product: Server, suite: TerminalServer SingleUserTS
Debug session time: Sun Jun 13 09:32:42 2004
System Uptime: 4 days 1:12:29.045
Process Uptime: 0 days 0:00:38.000
Symbol search path is: e:\Symbols\Windows2003;e:\Symbols\Windows2000;e:\Symbols\Windows2000\SP3;e:\Symbols\Windows2000\SP4;srv*e:\Symbols\Other*http://msdl.microsoft.com/download/symbols;srv*e:\Symbols\Other*\\s05074\symbols
Executable search path is:
................
(868.478): Wake debugger - code 80000007 (!!! second chance !!!)
eax=00000200 ebx=77d09134 ecx=00414878 edx=00000000 esi=004030bc edi=004030bc
eip=7ffe0304 esp=0012fd9c ebp=0012fdc0 iopl=0 nv up ei pl zr na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=0038 gs=0000 efl=00000246
SharedUserData!SystemCallStub+0x4:
7ffe0304 c3 ret
0:000> .sympath+ c:\sai\debugging\threadspin\release
Symbol search path is: e:\Symbols\Windows2003;e:\Symbols\Windows2000;e:\Symbols\Windows2000\SP3;e:\Symbols\Windows2000\SP4;srv*e:\Symbols\Other*http://msdl.microsoft.com/download/symbols;srv*e:\Symbols\Other*\\s05074\symbols;c:\sai\debugging\threadspin\release
0:000> .reload
................
0:000> ~
.  0 Id: 868.478 Suspend: 1 Teb: 7ffde000 Unfrozen
   1 Id: 868.a20 Suspend: 1 Teb: 7ffdd000 Unfrozen
   2 Id: 868.c68 Suspend: 1 Teb: 7ffdc000 Unfrozen
   3 Id: 868.b30 Suspend: 1 Teb: 7ffdb000 Unfrozen
   4 Id: 868.ca8 Suspend: 1 Teb: 7ffda000 Unfrozen
   5 Id: 868.f18 Suspend: 1 Teb: 7ffd9000 Unfrozen
   6 Id: 868.438 Suspend: 1 Teb: 7ffd8000 Unfrozen
   7 Id: 868.c08 Suspend: 1 Teb: 7ffd7000 Unfrozen
   8 Id: 868.b90 Suspend: 1 Teb: 7ffd6000 Unfrozen
   9 Id: 868.be0 Suspend: 1 Teb: 7ffd5000 Unfrozen
  10 Id: 868.c18 Suspend: 1 Teb: 7ffd4000 Unfrozen
  11 Id: 868.f58 Suspend: 1 Teb: 7ffaf000 Unfrozen
  12 Id: 868.c2c Suspend: 1 Teb: 7ffae000 Unfrozen
  13 Id: 868.79c Suspend: 1 Teb: 7ffad000 Unfrozen
  14 Id: 868.e54 Suspend: 1 Teb: 7ffac000 Unfrozen
  15 Id: 868.b34 Suspend: 1 Teb: 7ffab000 Unfrozen
  16 Id: 868.d40 Suspend: 1 Teb: 7ffaa000 Unfrozen
  17 Id: 868.ae8 Suspend: 1 Teb: 7ffa9000 Unfrozen
  18 Id: 868.9bc Suspend: 1 Teb: 7ffa8000 Unfrozen
  19 Id: 868.f1c Suspend: 1 Teb: 7ffa7000 Unfrozen
  20 Id: 868.88c Suspend: 1 Teb: 7ffa6000 Unfrozen
0:000> ? 0n2492
Evaluate expression: 2492 = 000009bc
0:000> ~18s
*** ERROR: Symbol file could not be found. Defaulted to export symbols for user32.dll -
eax=e22da400 ebx=00000000 ecx=00414ad0 edx=00000000 esi=0063b3a8 edi=00000180
eip=7ffe0304 esp=01fdfeb4 ebp=01fdfeec iopl=0 nv up ei pl zr na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=0038 gs=0000 efl=00000246
SharedUserData!SystemCallStub+0x4:
7ffe0304 c3 ret
0:018> kb
ChildEBP RetAddr Args to Child
01fdfeb0 77d082d5 77d0afa6 0017063c 00000180 SharedUserData!SystemCallStub+0x4
WARNING: Stack unwind information not available. Following frames may be wrong.
01fdfeec 77d097ae 0063b3a8 00000180 00000000 user32!DefWindowProcW+0xa5
*** WARNING: Unable to verify checksum for
01fdff0c 0040189f 0017063c 00000180 00000000 user32!SendMessageA+0x47
01fdff28 0040156f 01fdff38 00000000 20746944 ThreadSpin!CListBox::AddString+0x1f [C:\Program Files\Microsoft Visual Studio\VC98\MFC\INCLUDE\afxwin2.inl @ 669]
01fdffb8 77e4a990 00000000 00000000 00000000 ThreadSpin!ThreadFunc2+0x54 [C:\SAI\Debugging\ThreadSpin\ThreadSpinDlg.cpp @ 169]
01fdffec 00000000 0040151b 00000000 00000000 kernel32!BaseThreadStart+0x34

You can see that the executed instruction is in function "ThreadSpin!ThreadFunc2", and that this function (and the line that is executed) can be found in the "ThreadSpinDlg" file, at line 169. The code for this function is displayed hereunder.
    // Now output results
    for (int i = 0; i < 5; i+1)
    {
        sprintf(x,"Dit is test %d",i);
        g_pListBox->AddString(x);
    }

    return 0;
}
As you can see, the last part of the loop contains a bug. The instruction "i+1" should be "i++" or "i=i+1". As written, the value of i never changes, so the condition "i < 5" stays true forever.

Debugging (finding) memory leaks.

There are several ways to find (hunt) memory leaks. One of these is the "umdh.exe" application that comes with the debugging package. Since this tool is thoroughly explained in the debugger's help, we don't discuss this tool here. The program we're using here is "LeakDiag.exe" (and related tools). It's the only tool used during this workshop that isn't downloadable. If you need this tool, just ask for it at Microsoft's PSS (Product Support Services).

To demonstrate this tool, we'll use a small application, written in Microsoft Visual Basic 6.0. Most of you will think that you can't write memory leaks in VB 6.0, but as you'll see : a bad developer can write memory leaks in any application, with any compiler. The test program has two buttons : one to allocate memory that will be released again, and another one that allocates memory without releasing it (memory leak).
Figure 10.16 The "leaking" application

Again, we're using the Performance Monitor, but this time in combination with "LeakDiag". First of all, start the application "Project1.exe" and then start "Perfmon". Select the object "Process", the counter "Private Bytes" and the instance "Project1". Then, press the buttons on the "Project1" application several times. You can see now that there is indeed memory leaking away (see figure 10.17).

Figure 10.17

Now, start the "LeakDiag" tool and select the "Tools / Options" menu item. Fill in the folder path where the symbol files for the leaking application can be found. Make sure that you check the "Resolve symbols when logging" checkbox (figure 10.18).

Figure 10.18 The Tools/Options dialog box for LeakDiag

Select the application "Project1.exe" (the leaking one) and choose "Windows Heap Allocator" as the memory allocator that we want to monitor. Then, press the "Start" button. Check the current value for the "Private bytes" in the Performance monitor. Now, press the "Leak It" button in the VB application "Project1" several times and look again at the value for "Private bytes". Then, press the "Log" button in LeakDiag. The result of pressing the "Log" button is that a log file is made (this can take a while). As soon as the "Log" button is enabled again, press the "Leak It" button several times again, followed by pressing the "Log" button. Repeat this action several times (3 to 4 times) and then click the "Stop" button on the "LeakDiag" application. After this, click the "Unload" button (also in "LeakDiag"). You can see that the code, injected by LeakDiag, is now removed from the application. Performance monitor, LeakDiag and the leaking application can be stopped at this point.

Now use the application "LDParser" and open the first log file. You'll see three windows, as in figure 10.19.
Figure 10.19 The LDParser application.

The window "Allocation Size Details" shows the allocations, grouped by size, and the number of allocations of each size that occurred. The "Stacks" window shows the different call stacks that allocated memory, but didn't give back that memory. The "Frame Details" window shows the stack frames for the selected stack. It is in this window that we see, thanks to our symbol file, that the memory was allocated in the application "Project1.exe", the function "Form1__Command1_Click" and at line 40 in the file "Form1.frm". This is a very clear indication where the memory leak occurred ! The code below shows this (badly written) application.
Dim a()

Private Sub Command1_Click()
    x = UBound(a) + 1
    ReDim Preserve a(x)
    a(x) = Space(8192)
End Sub

Private Sub Command2_Click()
    Form1.MousePointer = 11
    Dim b
    b = Space(8192)
    Sleep 10000
    b = Space(0)
    Form1.MousePointer = 0
End Sub

Private Sub Form_Load()
    ReDim a(0)
    a(0) = Space(8192)
End Sub

As you can see, the global variable "a" grows by another string of 8192 characters, each time the Command1 button is pressed. When you have several logs available, you can use a third tool (LDGrapher) to see the memory consumption. This tool displays the number of allocations and the used memory (see figure 10.20).
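The same two patterns can be written down in a few lines of portable C++. This is only a sketch for comparison, not a translation the author provides : the names "g_blocks", "leak_it", "allocate_and_release" and "retained_characters" are hypothetical, and std::string stands in for the VB strings.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stands in for the module-level array "a" of the VB program.
std::vector<std::string> g_blocks;

// Pattern of Command1_Click: every call appends a block to a container that
// lives as long as the process, so the memory is never handed back.
void leak_it()
{
    g_blocks.push_back(std::string(8192, ' '));
}

// Pattern of Command2_Click: the block is local and is released when the
// function returns.
std::size_t allocate_and_release()
{
    std::string b(8192, ' ');
    return b.size();
}

// Total number of characters still held by the global container.
std::size_t retained_characters()
{
    std::size_t total = 0;
    for (const std::string& block : g_blocks)
        total += block.size();
    return total;
}
```

Each call to leak_it() raises the retained total by 8192 characters, while any number of calls to allocate_and_release() leaves it unchanged. That difference is what the "Private Bytes" counter shows over time.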
Figure 10.20 The LDGrapher application

Debugging a NT service during startup.

Debugging a NT service is comparable with the debugging of a normal application. You can debug a crash dump, or you can attach the debugger to the running service (just like with any other application). However, there is a difference when you have to debug a service during startup. A service is started by the SCM (Service Control Manager), and when the service isn't started yet after 30 seconds, the SCM assumes that the startup failed (and thus, the service is not started). Also, when we want to start the service under control of the debugger (and not under control of the SCM), then a few things have to be modified.

The first thing to do is setting the necessary values to allow the debugger to interact with the service. Start "GFlags.exe" and make the settings as in figure 10.21.
Figure 10.21 The GFlags settings to interact with a NT service.

Click the "Apply" button to make the settings active, and then click the "OK" button to close GFlags.exe. You can check whether the settings are really set by inspecting the registry key :

HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\MyService.exe

If this registry entry is missing, then something went wrong and you have to redo the procedure as explained above.

The next step is to start the "services mmc snap-in", select the service you want to debug, and check the option "Allow service to interact with the desktop". When you start the service now, the debugger will start as well, so that you can debug the service. However, after 30 seconds, the SCM will complain that the service can't be started and the service will be closed by the SCM. To avoid this, you must modify the time limit (30 seconds) that the SCM will wait for a "starting" service. To do so, open "Regedit" (or Regedt32), navigate to "HKLM\SYSTEM\CurrentControlSet\Control" and create there a new REG_DWORD value with the name "ServicesPipeTimeout" (without quotes). Give this value a high number (like 3600000). The effect of this is that the SCM will now give you 1 hour to debug the service's startup.
Note: You must restart your computer for this setting to take effect.
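The ServicesPipeTimeout value is expressed in milliseconds, which is why 3600000 corresponds to one hour. A minimal sketch of that arithmetic in C++ (the names "kServicesPipeTimeout" and "timeout_seconds" are made up for this illustration) :

```cpp
#include <chrono>

// The example value from the text, interpreted as milliseconds.
constexpr std::chrono::milliseconds kServicesPipeTimeout{3600000};

// Converts a timeout in milliseconds to whole seconds.
constexpr long long timeout_seconds(std::chrono::milliseconds timeout)
{
    return std::chrono::duration_cast<std::chrono::seconds>(timeout).count();
}
```

Here timeout_seconds(kServicesPipeTimeout) yields 3600 seconds, or one hour, while the 30 seconds mentioned above correspond to a value of 30000.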
When you restart the service now, the debugger is started as well and you have one hour to debug it. Set the symbol and source path for the service and give the go command. If the service doesn't crash immediately, you can set breakpoints, you can dump all thread stacks (~kb), you can find and solve the problem. After finding the problem, recompile and relink the service. Then, remove the settings you made in the registry and you can use the service in a normal fashion.

Debugging a COM+ application in combination with IIS

Debugging a COM+ application is done in the same way as a normal application. When the COM+ application runs "out of process" then you must stop the failing COM+ application. Then, start the Task Manager and restart the COM+ application. When restarting it, you see in the Task Manager the Process Identifier (PID) for the just started DLLHost.exe application. Now, you can debug this DLLHost instance (or you can dump it for later examination).

The scenario below can be used to find memory leaks in an IIS application. However, be aware of the fact that "growing" memory usage during the first 24 hours after the start of the application can be considered normal. IIS is caching a lot of data and the TTL (Time To Live) for this cache is 24 hours. If you are sure that there is a memory leak, then do the following :

1. Start "tlist.exe -k" to get a list of all processes. The -k option shows you all MTS packages in each process. When you have the PID for the leaking process, you can find here the name for this process.
2. Isolate the virtual directories : set these on "High (Isolated)".
3. Isolate the COM/COM+ DLLs until you find the component that is leaking memory.
4. Start the Performance Monitor, select "Performance Logs and Alerts", right-click "Counter logs", select "New Log Settings", give it a name, press the "Add" button, select "All counters" and "All instances" and add the following list of objects : Active Server Pages, Memory, Process, Processor, Thread, Web Service, Internet Information Services Global.
5. Set the interval on 5 minutes and click the "OK" button.
6. After you saw that memory was leaked, take a dump as you have done during the previous exercise about memory leakage. Then, use "LeakDiag" to find the leak.
Debugging .NET problems

To debug .NET applications, there exists an outstanding document that you can use as a guide book. This document, freely downloadable from the Microsoft website, is "Production Debugging for .NET Framework Applications". The topics below are covered in this book :

- The differences between IIS 5.0 and IIS 6.0
- Debugging .NET memory problems, including the working of the Garbage Collector
- Debugging contention problems (deadlocks)
- Debugging crashing .NET applications.

Since this book is really a complete "walkthrough" that you can use as manual or study book (and that you can follow easily thanks to the knowledge you built up during this workshop) we don't explain these scenarios in our workshop. Together with the "patterns and practices" document "Production Debugging for .NET Framework Applications", there comes an msi-package with the used exercises and tools, so that you can experiment on your own. Some parts of the document are a bit "outdated", because there are references to the .NET Framework 1.0. However, if you follow the recommendations about setup and usage of the SOS extension as seen in this workshop, you shouldn't have any problem getting things to work.

11 Using the OEM Tools

Introduction

The debugger package that Microsoft makes available for you consists of the following programs and tools :

- WinDbg
- ADPlus.vbs
- TList.exe
- UMDH.exe
- ...
Next to these tools, there are other tools available that work closely together with the debugging package. These tools are the so called OEM tools. The OEM tools are a package made of specific WinDbg extensions and some programs that can be used to find bugs. Most of these tools are made to help during "kernel debugging". But, some of these tools can be used, together with WinDbg, to find and solve some nasty problems. Nowadays, a lot of these tools are superseded by the WinDbg package itself, but it is still interesting to have a closer look at two applications that come with the OEM tools. The tools are very well documented (help comes with them), so we'll only have a quick look at them in this workshop.

Userdump

This tool allows you to do just the same as what "tlist.exe" does, but it gives you also the possibility to "dump" one or more processes. And just as ADPlus.vbs can dump a "hanging" process, Userdump can do this as well. After the setup of "Userdump", you can configure it via the Control Panel (figure 11.1). The applet in the control panel gives you the possibility to "monitor" applications. Thanks to this, you can define actions when a certain exception occurs. You define these actions by using a set of so called "rules". Figures 11.2, 11.3 and 11.4 show you the way you can control one or several programs using Userdump.
Figure 11.1 The Userdump applet in the control panel
Figure 11.2
Figure 11.3
Figure 11.4
Besides catching exceptions that occur in applications, Userdump can also dump applications that no longer respond (hanging programs). You configure this on the second tab of the Userdump dialog. Figure 11.5 shows the settings for this.
Figure 11.5

Genedump

This tool lets you extract information about a user process from a full memory dump. This can be very handy when your computer "hangs" completely. In that case, you can use a keyboard combination that forces a BSOD (Blue Screen Of Death). When your OS is set to generate a full memory dump, Genedump can extract the information about one single process from this dump. The keyboard combination to generate a BSOD is CTRL + SCROLL LOCK + SCROLL LOCK: hold down the CTRL key and press SCROLL LOCK twice. The CTRL key must be the rightmost CTRL key on your keyboard. This keyboard combination only works when the value "CrashOnCtrlScroll" is set to 0x01 (REG_DWORD) under the key HKLM\SYSTEM\CurrentControlSet\Services\i8042prt\Parameters.

12 At last

Round-up

Debugging applications in a production environment is a difficult task. I think the steps below are the best sequence of actions to start troubleshooting an application that is running in production.
- Use your good sense, don't panic.
- Ask the customer what the problem is, and since when it has occurred.
- Be sure that nobody modified the configuration, installed service packs, adapted security settings, ran application fixes, ...
- Do you have the problem on several computers?
- Do you have the problem with several users (on several computers)?
- Can you reproduce the problem in the production environment?
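As a side note to the Genedump section: the "CrashOnCtrlScroll" setting described there can be captured in a .reg file, so that it can be reviewed before it is imported on a machine. The sketch below only builds the text of such a file; it does not touch the registry. The key path and value name are taken from the text above, while the function name `build_reg_file` is a hypothetical helper chosen for this illustration. It is assumed that the file is imported with the registry editor afterwards and that the machine is restarted before the keyboard combination is tried.

```python
# Minimal sketch: generate the text of a .reg file that enables the
# keyboard-initiated crash (CrashOnCtrlScroll) described in the text.
# Nothing is written to the registry here; the result is only a string.

# Key path and value name as given in the text (HKLM spelled out in full,
# because .reg files use the long root key name).
KEY_PATH = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\i8042prt\Parameters"
VALUE_NAME = "CrashOnCtrlScroll"


def build_reg_file(key_path: str = KEY_PATH,
                   value_name: str = VALUE_NAME,
                   value: int = 1) -> str:
    """Return .reg file text that sets one REG_DWORD value under key_path."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("a REG_DWORD must fit in 32 bits")
    lines = [
        "Windows Registry Editor Version 5.00",  # standard .reg header
        "",
        f"[{key_path}]",
        f'"{value_name}"=dword:{value:08x}',     # dword values are 8 hex digits
        "",
    ]
    return "\r\n".join(lines)
```

Calling `build_reg_file()` and saving the result under a name such as `crashonctrlscroll.reg` (an arbitrary example name) gives a file that can be inspected by hand first. Generating the text rather than writing to the registry directly keeps the change visible and easy to undo, which matters on a production machine.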
Once you have gone through this checklist and are sure that there is a problem, ask the following questions:
- Is the source of the application available?
- Are symbol files available?
- Can the problem be reproduced in a development environment?
Once you have an answer to these questions, you can decide to debug the application with WinDbg (or another debugger that you have at your disposal and are used to working with). The techniques we used during this course (stack tracing, reading assembler code, ...) can be used when debugging "kernel code problems" as well. How to debug device drivers and "core components" of the Microsoft operating system is discussed in the second part of this course: Kernel Debugging with WinDbg.

Literature

Books
- Debugging Applications for Microsoft .NET and Microsoft Windows (ISBN 0-7356-1536-5)
- Debugging Applications (ISBN 0-7356-0886-5)
- Programming Applications for Microsoft Windows (ISBN 1-5723-1996-8)
- Inside Microsoft .NET IL Assembler (ISBN 0-7356-1547-0)
- Inside Windows 2000, third edition (ISBN 0-7356-1021-5)
- The Windows 2000 Device Driver Book (ISBN 0-1302-0431-5)
- Programming the Windows Driver Model, second edition (ISBN 0-7356-1803-8)
- The revolutionary guide to assembly language (ISBN 1-8744-1612-5)
- Intel Architecture software developer's manual: Basic Architecture, Instruction set reference and developers guide

Patterns and practices
- Production Debugging for .NET Framework Applications

MSDN articles
- SOS is not an ABBA song anymore (June 2003)
- ILDASM is your new best friend (May 2001)

Knowledge base articles
- First and second chance exception handling (105675)
- Use ADPlus to troubleshoot "Hangs" and "Crashes" (286350)
- Troubleshoot Third-Party .NET-connected Language Applications (312400)

OSR Online
- Stacking the deck - finding your way through the stack (NT Insider, vol 9, issue 6)
- Don't call us - calling conventions for the x86 (NT Insider, vol 10, issue 1)
- Wild speculations - debugging another crash dump (NT Insider, vol 10, issue 2)

Other sources

I want to thank all the people at Microsoft who help me and others to solve bugs and problems in Windows applications and systems. I especially thank the people at Microsoft Belgium and those of the Windows Support team in the US (who wrote WinDbg) for their support and feedback.