Part 3
The final piece of the puzzle lives in kernel-mode, and provides the events and
structures that we have seen until now so that debugging can work at all. Dbgk does not
rely on KD, and is an entirely different component which, since Windows XP, provides its
own object and system calls to manage it. Previous versions of Windows did not have
such an object, and relied instead on a static data structure that would be analyzed in
memory and then used for various notifications sent through Windows’s Local
Procedure Call (LPC) mechanisms.
One of the great things about the availability of these system calls and the presence of a
debug object is the fact that kernel-mode drivers can also engage in user-mode
debugging. Although this was probably not one of the goals of this new design, it is a
feature that should be of interest to some. Although the actual Nt* calls are not
exported, they can still be accessed by a driver which knows their system call ID. Even
though this number changes between each OS version, it is relatively easy to keep a
table in the driver. By adding a TDI interface to such a driver, it could be possible to
develop a high-speed remote debugger driver, which would have no user-mode
components at all and thus allow debugging every single process on the machine
remotely.
The first thing we’ll do is take a look at the actual object which implements user-mode
debugging, the DEBUG_OBJECT:
//
// Debug Object
//
typedef struct _DEBUG_OBJECT
{
    KEVENT EventsPresent;
    FAST_MUTEX Mutex;
    LIST_ENTRY EventList;
    union
    {
        ULONG Flags;
        struct
        {
            UCHAR DebuggerInactive:1;
            UCHAR KillProcessOnExit:1;
        };
    };
} DEBUG_OBJECT, *PDEBUG_OBJECT;
As you can see, the object itself is a rather lightweight wrapper around the actual event
on which user-mode uses WaitForDebugEvent, a list of actual debug events, a lock, and
certain flags relating to this debugging session, such as whether or not the debugger is
connected, and if the process should be killed when disconnecting. Therefore, the
structure we are more interested in is the DEBUG_EVENT structure:
//
// Debug Event
//
typedef struct _DEBUG_EVENT
{
    LIST_ENTRY EventList;
    KEVENT ContinueEvent;
    CLIENT_ID ClientId;
    PEPROCESS Process;
    PETHREAD Thread;
    NTSTATUS Status;
    ULONG Flags;
    PETHREAD BackoutThread;
    DBGKM_MSG ApiMsg;
} DEBUG_EVENT, *PDEBUG_EVENT;
It’s this structure that contains all the data related to a debugging event. Since many
events can be queued up before a caller does a WaitForDebugEvent in user-mode,
debug events must be linked together with the DEBUG_OBJECT, and that’s what the
event list is for.
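The queueing described above can be illustrated with a small user-mode model. This is only a sketch: list_entry is a hypothetical stand-in for the kernel's LIST_ENTRY, queued_event is a trimmed-down stand-in for DEBUG_EVENT, and none of the helper names below exist in the kernel. It shows how an embedded link lets events be chained off a list head and recovered in arrival order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-mode stand-in for the kernel's LIST_ENTRY. */
typedef struct list_entry {
    struct list_entry *flink;  /* next entry */
    struct list_entry *blink;  /* previous entry */
} list_entry;

/* Hypothetical, trimmed-down debug event: only the link and a PID. */
typedef struct queued_event {
    list_entry link;
    unsigned long pid;
} queued_event;

/* Recover the containing structure from a pointer to its embedded link. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* An empty list is a head that points to itself in both directions. */
static void list_init(list_entry *head)
{
    head->flink = head;
    head->blink = head;
}

/* Append at the tail so that events are seen in arrival order. */
static void list_push_tail(list_entry *head, list_entry *entry)
{
    assert(head != NULL && entry != NULL);
    entry->flink = head;
    entry->blink = head->blink;
    head->blink->flink = entry;
    head->blink = entry;
}

/* Walk the queue and copy out up to max PIDs; returns the number copied. */
static size_t list_collect_pids(const list_entry *head,
                                unsigned long *out, size_t max)
{
    size_t n = 0;
    const list_entry *e;

    for (e = head->flink; e != head && n < max; e = e->flink) {
        const queued_event *ev = CONTAINER_OF(e, queued_event, link);
        out[n++] = ev->pid;
    }
    return n;
}
```

Because the link is embedded in the event itself, queueing an event needs no extra allocation, which is presumably why the kernel uses the same pattern.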
Some of the other members hold the PID and TID of the event from which the
notification came, as well as pointers to the respective process and thread objects in
kernel-mode. The event in this structure is used internally to notify the kernel when a
response to a debugger message is available. This response usually comes in the form of
a ContinueDebugEvent call from Win32, which will signal the event.
The final structure contained in a debug event is the actual API message being sent,
which contains the data that user-mode will see, and is the kernel-mode representation
of the DBGUI_WAIT_STATE_CHANGE structure. That’s right, the kernel has yet another
way of representing debug events, and it too will need to be converted later so that the
Native Debugging interface can understand it.
The good thing, as seen in the structure below, is that most of the fields have remained
constant, and that the kernel internally still uses the DBGKM structures which were
already shown in the DbgUi structure. However, instead of using DBG_STATE constants,
the kernel uses another set of constants called API Message Numbers, which are
shown below:
//
// Debug Message API Number
//
typedef enum _DBGKM_APINUMBER
{
    DbgKmExceptionApi = 0,
    DbgKmCreateThreadApi = 1,
    DbgKmCreateProcessApi = 2,
    DbgKmExitThreadApi = 3,
    DbgKmExitProcessApi = 4,
    DbgKmLoadDllApi = 5,
    DbgKmUnloadDllApi = 6,
    DbgKmErrorReportApi = 7,
    DbgKmMaxApiNumber = 8,
} DBGKM_APINUMBER;
These API Numbers are self-explanatory and are still kept for compatibility with the old
LPC mechanism. The kernel will convert them to the actual debug states expected by
DbgUi. Now let’s look at the Debug Message structure itself, which matches the same
message used in previous versions of Windows on top of the LPC mechanism:
//
// LPC Debug Message
//
typedef struct _DBGKM_MSG
{
    PORT_MESSAGE h;
    DBGKM_APINUMBER ApiNumber;
    ULONG ReturnedStatus;
    union
    {
        DBGKM_EXCEPTION Exception;
        DBGKM_CREATE_THREAD CreateThread;
        DBGKM_CREATE_PROCESS CreateProcess;
        DBGKM_EXIT_THREAD ExitThread;
        DBGKM_EXIT_PROCESS ExitProcess;
        DBGKM_LOAD_DLL LoadDll;
        DBGKM_UNLOAD_DLL UnloadDll;
    };
} DBGKM_MSG, *PDBGKM_MSG;
Note of course, that for our purposes, we can ignore the PORT_MESSAGE part of the
structure, as we will not be looking at DbgSs (the component that handles LPC Messages
and the layer that wrapped DbgUi before Windows XP).
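Returning to the API numbers for a moment, the conversion from a kernel API number to a DbgUi debug state can be sketched as a simple mapping. This is a standalone model and not the kernel's routine. The dbg_state enumerator names and their order are assumed from the Native API layer covered earlier in the series. The split of exceptions into breakpoint, single-step, and generic states based on the exception code is also an assumption. Returning DbgIdle for numbers without a matching state is a choice made only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Kernel API numbers, as listed in DBGKM_APINUMBER. */
typedef enum {
    DbgKmExceptionApi = 0,
    DbgKmCreateThreadApi = 1,
    DbgKmCreateProcessApi = 2,
    DbgKmExitThreadApi = 3,
    DbgKmExitProcessApi = 4,
    DbgKmLoadDllApi = 5,
    DbgKmUnloadDllApi = 6,
    DbgKmErrorReportApi = 7,
    DbgKmMaxApiNumber = 8
} dbgkm_apinumber;

/* Assumed user-mode debug states (names and order not shown in this part). */
typedef enum {
    DbgIdle,
    DbgReplyPending,
    DbgCreateThreadStateChange,
    DbgCreateProcessStateChange,
    DbgExitThreadStateChange,
    DbgExitProcessStateChange,
    DbgExceptionStateChange,
    DbgBreakpointStateChange,
    DbgSingleStepStateChange,
    DbgLoadDllStateChange,
    DbgUnloadDllStateChange
} dbg_state;

#define STATUS_BREAKPOINT_CODE  0x80000003u
#define STATUS_SINGLE_STEP_CODE 0x80000004u

/* Map an API number (plus the exception code, if any) to a debug state. */
static dbg_state convert_api_number(dbgkm_apinumber api, uint32_t exception_code)
{
    switch (api) {
    case DbgKmExceptionApi:
        /* Assumption: exceptions are split further by their code. */
        if (exception_code == STATUS_BREAKPOINT_CODE)
            return DbgBreakpointStateChange;
        if (exception_code == STATUS_SINGLE_STEP_CODE)
            return DbgSingleStepStateChange;
        return DbgExceptionStateChange;
    case DbgKmCreateThreadApi:  return DbgCreateThreadStateChange;
    case DbgKmCreateProcessApi: return DbgCreateProcessStateChange;
    case DbgKmExitThreadApi:    return DbgExitThreadStateChange;
    case DbgKmExitProcessApi:   return DbgExitProcessStateChange;
    case DbgKmLoadDllApi:       return DbgLoadDllStateChange;
    case DbgKmUnloadDllApi:     return DbgUnloadDllStateChange;
    default:
        /* No matching state in this model (for example, error reports). */
        return DbgIdle;
    }
}
```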
Now that we know what the structures look like, we can start looking at some of the
Native API functions (system calls) which wrap the Debug Object as if it were just another
object such as an event or semaphore.
The first system call that is required and that we’ve seen from DbgUi in Part 2 is
NtCreateDebugObject, which will return a handle to the debug object that can be
attached and waited on later. The implementation is rather simple:
NTSTATUS
NTAPI
NtCreateDebugObject(OUT PHANDLE DebugHandle,
IN ACCESS_MASK DesiredAccess,
IN POBJECT_ATTRIBUTES ObjectAttributes,
IN BOOLEAN KillProcessOnExit)
{
KPROCESSOR_MODE PreviousMode = ExGetPreviousMode();
PDEBUG_OBJECT DebugObject;
HANDLE hDebug;
NTSTATUS Status = STATUS_SUCCESS;
PAGED_CODE();
/* Return Status */
DBGKTRACE(DBGK_OBJECT_DEBUG, "Handle: %p DebugObject: %p\n",
hDebug, DebugObject);
return Status;
}
One of the interesting things in using this API directly from user-mode is that it allows
naming the debug object so that it can be inserted in the object directory namespace.
Unfortunately, there does not exist an NtOpenDebugObject call, so the name cannot be
used for lookups, but this can be stored internally, or the objects can be inserted into a
tree such as \DebugObjects which can later be enumerated.
Mixed with the fact that the DbgUi layer can be thus skipped, this means that debug
object handles need not be stored in the TEB, and this gives the debugger writer the
ability to write a debugger which is debugging multiple processes at the same time,
seamlessly switching between debug objects, and implementing a custom
WaitForDebugEvent by using WaitForMultipleObjects which can receive debugging
events from multiple processes.
Handling the messages and handles from all these processes can be slightly daunting,
but using a model similar to how kernel32 stores per-thread data in the TEB, it can be
modeled to additionally store per-process data. The end result would be a powerful
debugger offering a capability that others don’t yet support.
The attach call, NtDebugActiveProcess, is also simple, and relies on much larger internal
routines to perform most of the work. First of all, one of the problems with attaching is
that the process might have already created multiple new threads, as well as loaded
various DLLs. The system cannot anticipate which process will be debugged, so it does not
queue these debug events anywhere. Instead, the Dbgk module must scan each thread and
module, and send the appropriate “fake” event message to the debugger. For example, when
attaching to a process, this is what will generate the storm of DLL load events so that
the debugger can know what’s going on.
Without going into the internals of the Dbgkp calls, DLL load messages are sent by
looping over the loader list contained in the PEB, through Peb->Ldr. There is a
hard-coded maximum of 500 DLLs, so that the list won’t be looped indefinitely. Instead
of using the DLL name contained in the corresponding LDR_DATA_TABLE_ENTRY
structures, however, an internal API called MmGetFileNameForAddress is used, which
will find the VAD for the DLL’s base address, and use it to get the SECTION_OBJECT
associated with it. From this SECTION_OBJECT, the Memory Manager can find the
FILE_OBJECT, and then use ObQueryNameString to query the full name of the DLL,
which can be used to open the handle that user-mode will receive.
Note that the NamePointer parameter of the Load DLL structure is not filled out,
although it easily could be.
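The reason for the 500-entry cap is easier to see in a small model. The loader list lives in user-mode memory that the target can corrupt, so a walk that only stops when it gets back to the list head might never terminate. The sketch below is not the Dbgk code: module_link and count_modules_bounded are hypothetical names, and a singly linked ring stands in for the real doubly linked loader list.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MODULES 500  /* the iteration cap mentioned in the text */

/* Hypothetical stand-in for one entry of the loader's module list. */
typedef struct module_link {
    struct module_link *next;
} module_link;

/*
 * Count the entries reachable from head, stopping when the walk returns
 * to head, hits a NULL pointer, or reaches the cap. A real implementation
 * would post one load message per visited entry instead of counting.
 */
static size_t count_modules_bounded(const module_link *head)
{
    size_t visited = 0;
    const module_link *e;

    assert(head != NULL);
    for (e = head->next;
         e != NULL && e != head && visited < MAX_MODULES;
         e = e->next) {
        visited++;
    }
    return visited;
}
```

With a well-formed ring the function returns the real module count. With a ring that cycles without ever reaching the head again, it gives up after 500 entries instead of spinning forever.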
For looping newly created threads, the helper PsGetNextProcessThread API is used,
which will loop every thread. For the first thread, this will generate a Create Process
debug event, while each subsequent thread will result in Create Thread messages. For
the process, event data is retrieved from the SectionBaseAddress pointer, which has the
base image pointer. For threads, the only data returned is the start address, saved
already in ETHREAD.
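The thread walk described above comes down to a simple rule: the first thread stands for the process, and every later thread is reported on its own. A minimal sketch of that rule follows, using the API number values from DBGKM_APINUMBER. The fake_attach_event type and build_attach_events function are hypothetical, and the real routine carries much more data per event.

```c
#include <assert.h>
#include <stddef.h>

/* Values taken from the DBGKM_APINUMBER enumeration. */
enum {
    API_CREATE_THREAD = 1,   /* DbgKmCreateThreadApi */
    API_CREATE_PROCESS = 2   /* DbgKmCreateProcessApi */
};

/* Hypothetical, simplified view of one fake attach message. */
typedef struct {
    int api_number;
    unsigned long tid;
} fake_attach_event;

/*
 * Fill out[] with one message per thread ID: a create-process message
 * for the first thread, create-thread messages for the rest.
 * Returns the number of messages written (always equal to count).
 */
static size_t build_attach_events(const unsigned long *tids, size_t count,
                                  fake_attach_event *out)
{
    size_t i;

    assert(count == 0 || (tids != NULL && out != NULL));
    for (i = 0; i < count; i++) {
        out[i].api_number = (i == 0) ? API_CREATE_PROCESS : API_CREATE_THREAD;
        out[i].tid = tids[i];
    }
    return count;
}
```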
The second part of DbgkpSetProcessDebugObject will parse any debug events that have
already been associated with the object. This means all those fake messages we just sent.
This will mean acquiring the rundown protection for each thread, as well as checking for
various race conditions or not-fully-inserted threads which might’ve been picked up.
Finally, the PEB is modified so that the BeingDebugged flag is enabled. Once the routine
is done, the debug object is fully associated with the target.
NTSTATUS
NTAPI
NtRemoveProcessDebug(IN HANDLE ProcessHandle,
IN HANDLE DebugHandle)
{
PEPROCESS Process;
PDEBUG_OBJECT DebugObject;
KPROCESSOR_MODE PreviousMode = KeGetPreviousMode();
NTSTATUS Status;
PAGED_CODE();
DBGKTRACE(DBGK_PROCESS_DEBUG, "Process: %p Handle: %p\n",
ProcessHandle, DebugHandle);
Like most rich NT Objects, Dbgk provides an interface to the debug object, and allows
some of its settings to be queried or modified. For now, only the set routine is
implemented, and it supports a single flag – whether or not detaching should kill the
process. This is how the corresponding Win32 API (DebugSetProcessKillOnExit) is
supported, through the NtSetInformationDebugObject system call:
NTSTATUS
NTAPI
NtSetInformationDebugObject(IN HANDLE DebugHandle,
IN DEBUGOBJECTINFOCLASS
DebugObjectInformationClass,
IN PVOID DebugInformation,
IN ULONG DebugInformationLength,
OUT PULONG ReturnLength OPTIONAL)
{
PDEBUG_OBJECT DebugObject;
KPROCESSOR_MODE PreviousMode = ExGetPreviousMode();
NTSTATUS Status = STATUS_SUCCESS;
PDEBUG_OBJECT_KILL_PROCESS_ON_EXIT_INFORMATION DebugInfo =
DebugInformation;
PAGED_CODE();
/* Return Status */
return Status;
}
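Since the body of the routine is not shown, here is a rough user-mode model of what a set routine with a single supported class might do. Everything in it is an assumption made for illustration: the model_* names, the flag bit, the class value, and the payload layout are hypothetical, and only the three status code values are real NTSTATUS constants. The actual system call would also have to probe the caller's buffer and take the debug object lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STATUS_SUCCESS              0x00000000u
#define STATUS_INFO_LENGTH_MISMATCH 0xC0000004u
#define STATUS_INVALID_PARAMETER    0xC000000Du

/* Hypothetical flag bit standing in for KillProcessOnExit. */
#define MODEL_FLAG_KILL_ON_EXIT 0x2u

/* Hypothetical information class value; only one class is modeled. */
#define MODEL_CLASS_KILL_ON_EXIT 1u

/* Assumed payload: a single 32-bit boolean. */
typedef struct {
    uint32_t kill_process_on_exit;
} model_kill_on_exit_info;

/* Hypothetical debug object reduced to its flags. */
typedef struct {
    uint32_t flags;
} model_debug_object;

/* Validate the class and length, then set or clear the flag. */
static uint32_t model_set_information(model_debug_object *obj,
                                      uint32_t info_class,
                                      const void *info, size_t length)
{
    const model_kill_on_exit_info *kill_info = info;

    assert(obj != NULL);
    if (info_class != MODEL_CLASS_KILL_ON_EXIT)
        return STATUS_INVALID_PARAMETER;
    if (info == NULL || length != sizeof(*kill_info))
        return STATUS_INFO_LENGTH_MISMATCH;

    if (kill_info->kill_process_on_exit)
        obj->flags |= MODEL_FLAG_KILL_ON_EXIT;
    else
        obj->flags &= ~MODEL_FLAG_KILL_ON_EXIT;
    return STATUS_SUCCESS;
}
```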
Finally, the last functionality provided by the debug object is the wait and continue calls,
which implement the bi-directional channel through which a debugger can receive
events, modify the target state or its own internal data, and then resume execution.
These calls are a bit more involved because of the inherent synchronization issues.
First, let’s explore NtWaitForDebugEvent:
NTSTATUS
NTAPI
NtWaitForDebugEvent(IN HANDLE DebugHandle,
IN BOOLEAN Alertable,
IN PLARGE_INTEGER Timeout OPTIONAL,
OUT PDBGUI_WAIT_STATE_CHANGE StateChange)
{
KPROCESSOR_MODE PreviousMode = ExGetPreviousMode();
LARGE_INTEGER SafeTimeOut;
PEPROCESS Process;
LARGE_INTEGER StartTime;
PETHREAD Thread;
BOOLEAN GotEvent;
LARGE_INTEGER NewTime;
PDEBUG_OBJECT DebugObject;
DBGUI_WAIT_STATE_CHANGE WaitStateChange;
NTSTATUS Status = STATUS_SUCCESS;
PDEBUG_EVENT DebugEvent, DebugEvent2;
PLIST_ENTRY ListHead, NextEntry, NextEntry2;
PAGED_CODE();
DBGKTRACE(DBGK_OBJECT_DEBUG, "Handle: %p\n", DebugHandle);
/* Check flags */
if (!(DebugEvent->Flags &
(DEBUG_EVENT_FLAGS_USED | DEBUG_EVENT_FLAGS_INACTIVE )))
{
/* We got an event */
GotEvent = TRUE;
/* Set flag */
DebugEvent->Flags |= DEBUG_EVENT_FLAGS_INACTIVE;
}
else
{
/* Unsignal the event */
KeClearEvent(&DebugObject->EventsPresent);
}
/* Set success */
Status = STATUS_SUCCESS;
}
/* Subtract times */
SafeTimeOut.QuadPart += (NewTime.QuadPart -
StartTime.QuadPart);
StartTime = NewTime;
/* Return status */
return Status;
}
The first thing that happens is that a wait on the debug object is done, or more
specifically, on the EventsPresent event. When this wait is satisfied, the object is locked
and the API first makes sure that it hasn’t become inactive before the lock was acquired.
Once this is confirmed, the current debug events are parsed, and the system call
ensures that the debug event hasn’t already been used (processed) and that it hasn’t
been made inactive. If these flags check out, then the list is parsed again to make sure
there aren’t any other events for the same process. If any are found, then this event is
marked as inactive, and nothing is sent. This seems to block multiple events from
being sent if they’re for the same process.
Once a debug event has been obtained, the process and thread are referenced and the
structure is converted to a DbgUi Wait State Change structure, and the event is marked
as used. After the debug object lock is released, DbgkOpenHandles is called in the
success case when a debug event is found, which will open the right handles that user-
mode expects in the DbgUi structure, after which the extra references to the process
and thread are dropped.
After the wait is done, the DbgUi structure is copied back to the caller.
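The selection logic described in the last few paragraphs can be modeled in a few lines. This is a sketch based on the prose above, not the kernel code. An array replaces the linked list, the flag bit values are hypothetical, and the rescan is assumed to cover only the events queued ahead of the candidate. The wait_model_event and pick_event names do not exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit values for the two flags discussed in the text. */
#define EVENT_FLAG_USED     0x1u
#define EVENT_FLAG_INACTIVE 0x2u

/* Simplified debug event: owner IDs and flags only. */
typedef struct {
    unsigned long pid;
    unsigned long tid;
    unsigned flags;
} wait_model_event;

/*
 * Pick the next event to hand to the debugger.
 * Returns its index and marks it used, or returns -1 if nothing can be
 * delivered. A candidate whose process already has an earlier event in
 * the queue is marked inactive and skipped.
 */
static int pick_event(wait_model_event *events, size_t count)
{
    size_t i, j;

    assert(count == 0 || events != NULL);
    for (i = 0; i < count; i++) {
        int blocked = 0;

        /* Skip events already delivered or parked. */
        if (events[i].flags & (EVENT_FLAG_USED | EVENT_FLAG_INACTIVE))
            continue;

        /* Rescan the entries queued ahead of this one. */
        for (j = 0; j < i; j++) {
            if (events[j].pid == events[i].pid) {
                events[i].flags |= EVENT_FLAG_INACTIVE;
                blocked = 1;
                break;
            }
        }

        if (!blocked) {
            events[i].flags |= EVENT_FLAG_USED;
            return (int)i;
        }
    }
    return -1;
}
```

With two queued events for one process and one for another, the first call delivers the first event, the second call parks the second event and delivers the third, and a further call finds nothing to deliver.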
The final API, NtDebugContinue, allows continuing from a debug event, and its
implementation serves to remove the debug event sent and to wake the target. It is
implemented like this:
NTSTATUS
NTAPI
NtDebugContinue(IN HANDLE DebugHandle,
IN PCLIENT_ID AppClientId,
IN NTSTATUS ContinueStatus)
{
KPROCESSOR_MODE PreviousMode = ExGetPreviousMode();
PDEBUG_OBJECT DebugObject;
NTSTATUS Status = STATUS_SUCCESS;
PDEBUG_EVENT DebugEvent = NULL, DebugEventToWake = NULL;
PLIST_ENTRY ListHead, NextEntry;
BOOLEAN NeedsWake = FALSE;
CLIENT_ID ClientId;
PAGED_CODE();
DBGKTRACE(DBGK_OBJECT_DEBUG, "Handle: %p Status: %p\n",
DebugHandle, ContinueStatus);
/* Compare process ID */
if (DebugEvent->ClientId.UniqueProcess ==
AppClientId->UniqueProcess)
{
/* Check if we already found a match */
if (NeedsWake)
{
/* Wake it up and break out */
DebugEvent->Flags &= ~DEBUG_EVENT_FLAGS_USED;
KeSetEvent(&DebugEvent->ContinueEvent,
IO_NO_INCREMENT,
FALSE);
break;
}
/* Return status */
return Status;
}
First, it is responsible for validating the continuation status that is used, against a
limited number of recognized status codes. Then, each debug event is looped. If the process
and thread IDs match, then the debug event is removed from the list, and the routine
remembers that an event was found. If any additional events are found for the same
process, then the inactive flag is removed. Recall that this flag was added by the wait
routine, to prevent multiple events for the same process from being sent together.
Once all debug events are parsed, the target is woken, which basically means
resuming the thread if the message was a fake thread create message, releasing its
rundown protection, and then either freeing the debug event or notifying whoever was
waiting on it (depending on how it was created).
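The continue path can be modeled the same way. The sketch below follows the description above rather than the kernel source. The five status values are the documented DBG_* continue codes, and treating exactly these five as the recognized set is an assumption. The flag bits, the continue_model_event type, and both function names are hypothetical, and an array with a removed marker stands in for unlinking from the event list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DBG_EXCEPTION_HANDLED     0x00010001u
#define DBG_CONTINUE              0x00010002u
#define DBG_TERMINATE_THREAD      0x40010003u
#define DBG_TERMINATE_PROCESS     0x40010004u
#define DBG_EXCEPTION_NOT_HANDLED 0x80010001u

/* Hypothetical bit values for the two event flags. */
#define EVENT_FLAG_USED     0x1u
#define EVENT_FLAG_INACTIVE 0x2u

typedef struct {
    unsigned long pid;
    unsigned long tid;
    unsigned flags;
    int removed;  /* stands in for unlinking from the event list */
} continue_model_event;

/* Accept only the continue codes assumed to be recognized. */
static int is_valid_continue_status(uint32_t status)
{
    return status == DBG_CONTINUE ||
           status == DBG_EXCEPTION_HANDLED ||
           status == DBG_EXCEPTION_NOT_HANDLED ||
           status == DBG_TERMINATE_THREAD ||
           status == DBG_TERMINATE_PROCESS;
}

/*
 * Returns the index of the event that was continued, -1 if no delivered
 * event matches the IDs, or -2 if the status code is rejected.
 */
static int model_continue(continue_model_event *ev, size_t count,
                          unsigned long pid, unsigned long tid,
                          uint32_t status)
{
    int found = -1;
    size_t i;

    if (!is_valid_continue_status(status))
        return -2;

    assert(count == 0 || ev != NULL);
    for (i = 0; i < count; i++) {
        if (ev[i].removed || ev[i].pid != pid)
            continue;
        if (found >= 0) {
            /* Another event for the same process: let the wait side see it. */
            ev[i].flags &= ~EVENT_FLAG_INACTIVE;
            break;
        }
        if (ev[i].tid == tid && (ev[i].flags & EVENT_FLAG_USED)) {
            ev[i].removed = 1;
            found = (int)i;
        }
    }
    return found;
}
```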
This concludes the last section on the User-Mode Debugging implementation in Windows
XP and higher, and since this section focused on kernel-mode, let’s see what we’ve
learned from this section:
Dbgk is the component in the kernel that handles all the support code for the
debugging functionality.
The implementation is exposed through an NT Object called DEBUG_OBJECT and
provides various system calls to access it.
It is possible to write a kernel-mode debugger for user-mode applications.
It is possible to write a debugger that can debug multiple applications at the
same time, as long as the DbgUi layer is skipped and re-implemented using
system calls by hand.
The kernel uses its own version of the wait state change structure, encapsulated
in a DEBUG_EVENT structure.
The kernel still has legacy support for LPC based DbgSs debugging.
The kernel opens all the handles that are present in the event structures, and
user-mode is responsible for closing them.
The kernel is written such that only one event for the same process is sent at one
time.
The kernel needs to parse the PEB Loader Data to get a list of loaded DLLs, and
has a hard coded limit of 500 loop iterations.
This also brings us to the end of this series on debugging internals, and the author hopes
that readers have enjoyed this thorough analysis and will be able to write better, more
flexible, more powerful debuggers which can utilize some of these internals to provide
greater power to users and developers.