Stories by Yarden Shafir on Medium

Security Research and the Creative Process

I get asked pretty often about my research process, how I find research ideas and how I approach a new idea or project. I don’t find those questions especially useful — the answers are usually very specific and not necessarily helpful to anyone not focusing on my specific corner of infosec or working exactly the way I like to work. So instead of answering these questions here, I will talk about something that I think the security industry doesn’t focus on enough — creativity. Because creativity is at the base of all that we do: finding a new research project, bypassing a security mitigation, hunting for a source of a bug, and pretty much everything else. This is the point where a lot of you jump in to say “but I’m just not creative!” and I politely disagree and tell you that regardless of your basic level of creativity, there are a few tricks that can improve it, or at least make your work more interesting and fun.

These are tricks that I mostly learned while practicing circus and then translated to my tech work and research. You can apply them to pretty much anything you do, technical or not.

Can Limitations Be a Good Thing?

This could sound counter-intuitive but adding artificial roadblocks can encourage your brain to look at a problem from a different perspective and build different paths around it. Think of tying your right arm behind your back — yes it will be extremely inconvenient, but it will also force you to learn how to get more functionality from your left, or learn how to do things single-handedly when you used to use two hands for them. Maybe you’ll even learn to use your legs or other body parts for help.

In a similar way, adding limitations to your research can force you to try new things, learn new techniques or maybe even develop a whole new thing to overcome the “roadblock”. A few practical examples of that would be:

1. Creating a kernel exploit which patches the token’s privileges, but can only patch the “Present” bits and not the “Enabled” bits, demonstrated here: Exploiting a “Simple” Vulnerability, Part 2 — What If We Made Exploitation Harder? — Winsider Seminars & Solutions Inc. (windows-internals.com)

2. Writing a code injector, but injecting code into a process where Dynamic Code Guard is enabled — meaning no dynamic code is allowed so you can’t allocate and write a shellcode into arbitrary memory.

3. Writing test malware without using any of your usual persistence methods.

4. Creating a tool to monitor the system for malicious activity that is not allowed to load a kernel driver and can only run in user mode.

An easy way to enforce real-world limitations on an offensive project would be to enable security mitigations — you could take an exploit that was written for an old operating system, or one that enables almost no security mitigations, and slowly enable newer mitigations and find ways to bypass them. First enable ASLR, then CFG, Dynamic Code Guard… you can work your way up to a fully patched Windows 10 machine enabling VBS, HVCI, etc.

Solve the Maze Backwards

Do you know the feeling of having a new project, knowing what you need to do and then spending the next five hours staring at an empty IDE and feeling overwhelmed? Yeah, me too. Starting a project is always the hardest part. But you don’t actually need to start it at the beginning. Remember being a kid, solving the maze on the back of the cereal box and always starting from the end because for some reason it made it so much easier? (Please tell me I’m not the only one who did that). Good news — you can do the same thing with your research project!

Let’s imagine you’re trying to write a kernel exploit that needs to do the following things:

1. Find the base address of ntoskrnl.exe.

2. Search for the offset of a specific global variable that you want to overwrite.

3. Write a shellcode into memory.

4. Patch the global variable from step 2 to point to your shellcode.

Now let’s say you have no idea how to do step #1, doing binary searches is gross and not fun so you’re not excited about step #2 and you hate writing shellcodes so you’re really not looking forward to step #3. However, step #4 seems great and you know exactly how to do it. Sadly it’s all the way on the other side of steps 1–3 and you don’t know how you’ll ever manage to find the motivation to get all the way there.

You don’t have to — you can “cheat” and use a debugger to find the address of the global variable in memory and hard-code it. Then you can use the debugger again to write a simple “int 3” instruction somewhere in memory and hard-code that address as well. Then you implement step #4 as if steps 1–3 are already done. This method is way more fun and should give your brain the dopamine boost it needs to at least try to implement one of the other steps. Besides, adding features to existing code is about 1000x easier than writing code where nothing exists.

You don’t have to start from the end either — you can pick any step that seems the easiest or the most fun and start from there and slowly build the steps around it later.

Give Yourself a Break

This advice is given so often it’s basically a cliché, but it’s true so I feel like I have to say it: if you’ve been staring at your project for an hour and made no progress, stop. Go for a walk, take a shower, take a nap, go have that lunch you probably skipped because you were too focused on work, or, if you feel too guilty to step away from the computer, at least work on a different task. Preferably something quick and easy, to make your brain feel that you achieved something and allow you to take an actual break. Allowing your brain to rest and focus on other things will make it happy, and the solution to the issue you’ve been stuck on will probably come to you while you’re not working anyway.

These tips might not work for everyone, but they usually work for me when I’m stuck or out of ideas, so I hope at least some of them will work for you too!

Windows Debugger API — The End of Versioned Structures


Some time ago I was introduced to the Windows debugger API and found it incredibly useful for projects that focus on forensics or analysis of data on a machine. This API allows us to open a dump file taken on any Windows machine and read information from it using the symbols that match the specific modules contained in the dump.

This API can be used in live debugging as well, either user-mode debugging of a process or kernel debugging. This post will show how to use it to analyze a memory dump, but this can be converted to live debugging relatively easily.

The main benefit of the debugger API is that it uses the specific symbols for the Windows version that it is running against, letting us write code that will work with any Windows version without having to keep an ever-growing header of structures for different versions, choose the right one, and update our code every time a structure changes. For example, a common data structure to look at on Windows is the process, represented in the kernel by the EPROCESS structure. This structure changes in almost every Windows build, meaning that fields inside it keep moving around. A field we are interested in might be at offset 0x100 in one Windows version, 0x120 in another, 0x108 in another, and so on. If we use the wrong offset the driver will not work properly and is very likely to accidentally crash the system. By using the symbols, we also receive the correct size and type of each structure and its sub-structures, so a nested structure getting larger, or a field changing its type (for example, being a push lock in one version and a spin lock in another), will be handled correctly by the debugger API without any code changes on our side.
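To make the problem concrete, here is a minimal sketch of the kind of per-build offset table that hard-coded offsets force on us. The build numbers are hypothetical and the offsets are just the example values from the paragraph above, not real EPROCESS offsets; LookupFieldOffset is a made-up helper name.

```cpp
#include <cstdint>
#include <map>

// Hypothetical build numbers mapped to made-up field offsets.
// These are NOT real EPROCESS offsets, only the example values from the text.
static const std::map<std::uint32_t, std::uint32_t> kFieldOffsetByBuild = {
    {10000, 0x100},
    {10001, 0x120},
    {10002, 0x108},
};

// Looks up the hard-coded offset for a build.
// Returns false when the build is not in the table, which is exactly
// what happens every time a new Windows build is released.
bool LookupFieldOffset(std::uint32_t build, std::uint32_t* offset)
{
    auto it = kFieldOffsetByBuild.find(build);
    if (it == kFieldOffsetByBuild.end())
    {
        return false;
    }
    *offset = it->second;
    return true;
}
```

Every new build means another row in the table, and a missing row means the tool either refuses to run or reads the wrong memory.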

The debugger API avoids this problem entirely by using symbols, so we can write our code once and it will run successfully on dumps taken from every possible Windows version without any need for updates when new builds are released. Also, it runs in user mode, so it doesn’t have all the inherent risks that kernel-mode code carries with it, and since it can operate on a dump file, it doesn’t have to run on the machine that it analyzes. This can be a huge benefit, as sometimes we can’t run our debugging tools on the machine we are interested in. It also lets us do extremely complicated things on much faster machines, such as analyzing a dump, or many dumps, in the cloud.

The main disadvantage of it is that the interface is not as easy as just using the types directly, and it takes some effort to get used to it. It also means slightly uglier, less readable code, unless you create macros around some of the calls.

In this post we’ll learn how to write a simple program that opens a memory dump, iterates over all the processes and prints the name and PID of each one. For anyone not familiar with process representation in the Windows kernel, all the processes are linked together by a linked list (that is, a LIST_ENTRY structure that points to the next entry and the previous entry). The head of this list is the nt!PsActiveProcessHead symbol, and each process is linked into it through the ActiveProcessLinks field of its EPROCESS structure. Of course, the symbol is not exported and the EPROCESS structure is not available in any of the public headers, so implementing this in a driver will require some hard-coded offsets and version checks to get the right offsets for each version. Or we can use the debugger API instead!

To access all of this functionality we’ll need to include DbgEng.h and link against DbgEng.lib. And this is the right time for an important tip shared by Alex Ionescu: the debugging-related DLLs supplied by Windows are unstable and will often simply not work at all, leaving you confused and wondering what you did wrong and why your code that was perfectly good yesterday is suddenly failing. WinDbg comes with its own versions of all the DLLs required for this functionality, which are way better. So you’ll want to copy Dbgeng.dll, Dbghelp.dll and Symsrv.dll from the directory where windbg.exe is into the output directory of this project. Do whatever you need to remember to always use the DLLs that come with WinDbg. This will save you a lot of time and frustration later.

Now that we have that covered we can start writing the code. Before we can access the dump file, we need to initialize 4 basic variables:

IDebugClient* debugClient = nullptr;
IDebugSymbols* debugSymbols = nullptr;
IDebugDataSpaces* dataSpaces = nullptr;
IDebugControl* debugControl = nullptr;

These will let us open the dump, access its memory and the symbols for all the modules in it and use them to parse the contents of the dump. First, we call DebugCreate to initialize the debugClient variable:

DebugCreate(__uuidof(IDebugClient), (PVOID*)&debugClient);

Note that all the functions we’ll use here return an HRESULT that should be validated using SUCCEEDED(result). In this post I will skip those validations to keep the code smaller and easier to read, but in any real program these should not be skipped.
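As a rough idea of what that validation could look like, here is a minimal sketch of a helper that turns a failed call into an exception. To keep it compilable without the Windows headers it uses stand-ins: HResult and Succeeded mirror HRESULT and the SUCCEEDED macro (success means a non-negative value), and ThrowIfFailed is a hypothetical helper name, not part of the debugger API.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Stand-in for the Windows HRESULT type so this sketch compiles anywhere.
using HResult = std::int32_t;

// Same test the SUCCEEDED macro performs: non-negative values mean success.
inline bool Succeeded(HResult hr)
{
    return hr >= 0;
}

// Hypothetical helper: throws with the name of the failing step
// and the error code when the call did not succeed.
inline void ThrowIfFailed(HResult hr, const char* step)
{
    if (!Succeeded(hr))
    {
        throw std::runtime_error(std::string(step) +
                                 " failed, HRESULT=" +
                                 std::to_string(hr));
    }
}
```

In a real program each call from this post would be wrapped, for example ThrowIfFailed(DebugCreate(...), "DebugCreate"), with the real HRESULT and SUCCEEDED in place of the stand-ins.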

After we initialized debugClient we can use it to initialize the other 3:

debugClient->QueryInterface(__uuidof(IDebugSymbols),
                            (PVOID*)&debugSymbols);
debugClient->QueryInterface(__uuidof(IDebugDataSpaces),
                            (PVOID*)&dataSpaces);
debugClient->QueryInterface(__uuidof(IDebugControl),
                            (PVOID*)&debugControl);

There, setup done. We can open our dump file with debugClient->OpenDumpFile and then wait until all symbol files are loaded:

debugClient->OpenDumpFile(DumpFilePath);
debugControl->WaitForEvent(DEBUG_WAIT_DEFAULT, 0);

Once the dump is loaded we can start reading it. The module we are most interested in here is nt: we are going to use the PsActiveProcessHead symbol as well as the EPROCESS structure that belongs to it. So we need to get the base of the module using dataSpaces->ReadDebuggerData. This function receives 4 arguments: Index, Buffer, BufferSize and DataSize. The last one is an optional output parameter, telling us how many bytes were written, or if the buffer wasn’t large enough, how many bytes are needed. To keep things simple we will always pass nullptr as DataSize, since we know in advance the needed sizes for all of our data. The second and third arguments are pretty clear, so no need to say much about them. And for the first argument we need to look at the list of options found in DbgEng.h:

// Indices for ReadDebuggerData interface
#define DEBUG_DATA_KernBase 24
#define DEBUG_DATA_BreakpointWithStatusAddr 32
#define DEBUG_DATA_SavedContextAddr 40
#define DEBUG_DATA_KiCallUserModeAddr 56
#define DEBUG_DATA_KeUserCallbackDispatcherAddr 64
#define DEBUG_DATA_PsLoadedModuleListAddr 72
#define DEBUG_DATA_PsActiveProcessHeadAddr 80
#define DEBUG_DATA_PspCidTableAddr 88
#define DEBUG_DATA_ExpSystemResourcesListAddr 96
#define DEBUG_DATA_ExpPagedPoolDescriptorAddr 104
#define DEBUG_DATA_ExpNumberOfPagedPoolsAddr 112
...

These are all commonly used symbols, so they get their own index to make querying their value faster and easier. Later in this post we’ll see how we can get the value of a symbol that is less common and isn’t on this list.

The first index on this list is, conveniently, DEBUG_DATA_KernBase. So we create a variable to get the base address of the nt module and call ReadDebuggerData:

ULONG64 kernBase;
dataSpaces->ReadDebuggerData(DEBUG_DATA_KernBase,
&kernBase,
sizeof(kernBase),
nullptr);

Next, we want to iterate over all the processes and print information about them. To do that we need the EPROCESS type. One annoying thing about the debugger API is that it doesn’t allow us to use types like we would if they were in a header file. We can’t declare a variable of type EPROCESS and access its fields. Instead we need to access memory through a type ID and the offsets inside the type. For example, if we want to access the ImageFileName field inside a process we will need to read the information that’s found at processAddr + imageFileNameOffset. But we are getting a bit ahead of ourselves. First we need to get the type ID of _EPROCESS using debugSymbols->GetTypeId, which receives the module base, type name and an output argument for the type ID. As the name suggests, this function doesn’t give us the type itself, only an identifier that we’ll use to get offsets inside the structure:

ULONG EPROCESS;
debugSymbols->GetTypeId(kernBase, "_EPROCESS", &EPROCESS);

Now let’s get the offsets of the fields inside the EPROCESS so we can easily access them. Since we want to print the name and PID of each process we’ll need the ImageFileName and UniqueProcessId fields, in addition to ActiveProcessLinks so we can iterate over the processes. To get those we’ll call debugSymbols->GetFieldOffset, which receives the module base, type ID, field name and an output argument that will receive the field offset:

ULONG imageFileNameOffset;
ULONG uniquePidOffset;
ULONG activeProcessLinksOffset;
debugSymbols->GetFieldOffset(kernBase,
                             EPROCESS,
                             "ImageFileName",
                             &imageFileNameOffset);
debugSymbols->GetFieldOffset(kernBase,
                             EPROCESS,
                             "UniqueProcessId",
                             &uniquePidOffset);
debugSymbols->GetFieldOffset(kernBase,
                             EPROCESS,
                             "ActiveProcessLinks",
                             &activeProcessLinksOffset);

To start iterating the process list we need to read PsActiveProcessHead. You might have noticed earlier that this symbol has an index in DbgEng.h so it can be read directly using ReadDebuggerData. But for this example we won’t read it that way, and instead show how to read it like a symbol that doesn’t have an index. So first we need to get the symbol offset in the dump file, using debugSymbols->GetOffsetByName:

ULONG64 activeProcessHead;
debugSymbols->GetOffsetByName("nt!PsActiveProcessHead",
                              &activeProcessHead);

This doesn’t give us the actual value yet, only the offset of this symbol. To get the value we’ll need to read the memory that this address points to from the dump using dataSpaces->ReadVirtual, which receives an address to read from, Buffer, BufferSize and an optional output argument BytesRead. We know that this symbol points to a LIST_ENTRY structure so we can just define a local linked list and read the variable into it. In this case we got lucky — the LIST_ENTRY structure is documented. If this symbol contained a non-documented structure this process would require a couple more steps and be a bit more painful.

LIST_ENTRY activeProcessLinks;
dataSpaces->ReadVirtual(activeProcessHead,
&activeProcessLinks,
sizeof(activeProcessLinks),
nullptr);

Now we have almost everything we need to start iterating the process list! We’ll define a local process variable and use it to store the address of the current process we’re looking at. Right now activeProcessLinks.Flink points to the first process in the system, and in each iteration it will point to the next one, but it won’t point to the beginning of the EPROCESS. It points to the ActiveProcessLinks field, so to get to the beginning of the structure we’ll need to subtract the offset of the ActiveProcessLinks field from the address (basically what the CONTAINING_RECORD macro would do if we could use it here). Notice that we are using a ULONG64 here on purpose, instead of a ULONG_PTR, to save us the pain of pointer arithmetic and to avoid casts in future function calls, since most debugger API functions receive arguments as ULONG64:

ULONG64 process;
process = (ULONG64)activeProcessLinks.Flink - activeProcessLinksOffset;
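If the subtraction looks like magic, the following sketch shows the same arithmetic on a local structure, where we can check the result. ListEntry and FakeProcess are simplified stand-ins made up for this example, not the real Windows definitions, and ContainingProcess is a hypothetical helper name.

```cpp
#include <cstddef>
#include <cstdint>

// Simplified stand-in for LIST_ENTRY.
struct ListEntry
{
    ListEntry* Flink;
    ListEntry* Blink;
};

// Made-up structure with the links embedded in the middle,
// the same way ActiveProcessLinks is embedded in an EPROCESS.
struct FakeProcess
{
    std::uint64_t UniqueProcessId;
    char ImageFileName[15];
    ListEntry ActiveProcessLinks;
};

// Given the address of the embedded links field, recover the address
// of the structure that contains it by subtracting the field offset.
std::uint64_t ContainingProcess(std::uint64_t linksAddress)
{
    return linksAddress - offsetof(FakeProcess, ActiveProcessLinks);
}
```

The only difference in the debugger case is that the offset comes from GetFieldOffset at run time instead of from offsetof at compile time.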

The process iteration is pretty simple: for each process we want to read the ImageFileName value and UniqueProcessId value, and then read the next process pointer from ActiveProcessLinks. Notice that we cannot access any data in the debugger directly. The addresses we have are meaningless in the context of our current process (they are also kernel addresses, and our application is running in user mode and not necessarily on the right machine). To access any of the memory we need to call dataSpaces->ReadVirtual, or any of the other debugger functions that let us read data, and we will have to do that for each value in each process.

Generally we don’t have to read each value separately, we can also read the whole EPROCESS structure with debugSymbols->ReadTypedDataVirtual for each process and then access the fields by their offsets. But the EPROCESS structure is very large and we only need a few specific fields, so reading the whole structure is pretty wasteful and not necessary in this case.

We now have everything we need to implement our process iteration:

//
// ImageFileName holds at most 15 bytes and is not guaranteed to be
// null-terminated, so keep one extra zeroed byte for printf.
//
CHAR imageFileName[16] = { 0 };
ULONG64 uniquePid = 0;
//
// activeProcessLinks was already declared when we read
// PsActiveProcessHead, so it is reused here.
//
do
{
    //
    // Read process name, pid and activeProcessLinks
    // for the current process
    //
    dataSpaces->ReadVirtual(process + imageFileNameOffset,
                            imageFileName,
                            sizeof(imageFileName) - 1,
                            nullptr);
    dataSpaces->ReadVirtual(process + uniquePidOffset,
                            &uniquePid,
                            sizeof(uniquePid),
                            nullptr);
    dataSpaces->ReadVirtual(process + activeProcessLinksOffset,
                            &activeProcessLinks,
                            sizeof(activeProcessLinks),
                            nullptr);
    //
    // uniquePid is a ULONG64, so print it with %llu
    //
    printf("Current process name: %s, pid: %llu\n",
           imageFileName,
           uniquePid);
    //
    // Get the next process from the list and
    // subtract activeProcessLinksOffset
    // to get to the start of the EPROCESS.
    //
    process = (ULONG64)activeProcessLinks.Flink - activeProcessLinksOffset;
} while ((ULONG64)activeProcessLinks.Flink != activeProcessHead);
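To experiment with the walk without a dump file, the same logic can be sketched against a fake address space. Everything below is a made-up stand-in for illustration: FakeDump::Read only imitates the shape of a read-by-address call, the record layout is invented, and BuildFakeDump and CollectIds are hypothetical helpers. Unlike the loop above, this version checks for the list head before reading each entry, so an empty list is handled too.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// A fake "dump": a flat byte buffer whose indices play the role of addresses.
struct FakeDump
{
    std::vector<std::uint8_t> memory;

    // Copies `size` bytes found at `address`, or fails if out of range.
    bool Read(std::uint64_t address, void* buffer, std::size_t size) const
    {
        if (address > memory.size() || size > memory.size() - address)
        {
            return false;
        }
        std::memcpy(buffer, memory.data() + address, size);
        return true;
    }
};

// Invented layout: the list head lives at address 0 (Flink, then Blink),
// followed by 24-byte records: an 8-byte id, then Flink, then Blink.
constexpr std::uint64_t kHeadAddress = 0;
constexpr std::uint64_t kFirstRecord = 16;
constexpr std::uint64_t kRecordSize = 24;
constexpr std::uint64_t kIdOffset = 0;
constexpr std::uint64_t kLinksOffset = 8;

static void Put64(std::vector<std::uint8_t>& mem, std::uint64_t address, std::uint64_t value)
{
    std::memcpy(mem.data() + address, &value, sizeof(value));
}

// Builds a fake dump holding one record per id, linked in a circular list.
FakeDump BuildFakeDump(const std::vector<std::uint64_t>& ids)
{
    FakeDump dump;
    dump.memory.assign(kFirstRecord + kRecordSize * ids.size(), 0);
    std::uint64_t previousLinks = kHeadAddress;
    for (std::size_t i = 0; i < ids.size(); ++i)
    {
        const std::uint64_t record = kFirstRecord + kRecordSize * i;
        const std::uint64_t links = record + kLinksOffset;
        Put64(dump.memory, record + kIdOffset, ids[i]);
        Put64(dump.memory, previousLinks, links);      // previous entry's Flink
        Put64(dump.memory, links + 8, previousLinks);  // this entry's Blink
        previousLinks = links;
    }
    Put64(dump.memory, previousLinks, kHeadAddress);      // last Flink closes the circle
    Put64(dump.memory, kHeadAddress + 8, previousLinks);  // head's Blink
    return dump;
}

// Walks the circular list starting at `listHead` and collects every record's id.
std::vector<std::uint64_t> CollectIds(const FakeDump& dump, std::uint64_t listHead)
{
    std::vector<std::uint64_t> ids;
    std::uint64_t flink = 0;
    if (!dump.Read(listHead, &flink, sizeof(flink)))
    {
        return ids;
    }
    while (flink != listHead)
    {
        // Flink points at the embedded links, not at the start of the record.
        const std::uint64_t record = flink - kLinksOffset;
        std::uint64_t id = 0;
        if (!dump.Read(record + kIdOffset, &id, sizeof(id)))
        {
            break;
        }
        ids.push_back(id);
        if (!dump.Read(record + kLinksOffset, &flink, sizeof(flink)))
        {
            break;
        }
    }
    return ids;
}
```

Calling CollectIds(BuildFakeDump({4, 472, 580}), kHeadAddress) returns the three ids in list order, which is the same traversal the real loop performs through ReadVirtual.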

That’s it, that’s all we need to print the name and PID of every process in the dump.

If you run this, you might notice that a few of the process names look incomplete. This is because the ImageFileName field only has the first 15 bytes of the process name, while the full name is saved in an OBJECT_NAME_INFORMATION structure (which is actually just a UNICODE_STRING) in SeAuditProcessCreationInfo.ImageFileName. But in this post I wanted to keep things simple so we’ll use ImageFileName here.

Now we only have one last part left — being good developers and cleaning up after ourselves:

if (debugClient != nullptr)
{
debugClient->EndSession(DEBUG_END_ACTIVE_DETACH);
debugClient->Release();
}
if (debugSymbols != nullptr)
{
debugSymbols->Release();
}
if (dataSpaces != nullptr)
{
dataSpaces->Release();
}
if (debugControl != nullptr)
{
debugControl->Release();
}

This was a very brief, but hopefully helpful, introduction to the debugger API. There are endless more options available with it, and looking at DbgEng.h or at the official documentation should reveal a lot more. I hope you all find this as useful as I do and will find new and interesting things to use it for.


Windows Debugger API — The End of Versioned Structures was originally published in The Startup on Medium, where people are continuing the conversation by highlighting and responding to this story.

WinDbg — the Fun Way: Part 1


A while ago, WinDbg added support for a new debugger data model, which completely changed the way we can use WinDbg. No more horrible MASM commands and obscure syntax. No more copying addresses or parameters to a Notepad file so that you can use them in the next commands without scrolling up. No more running the same command over and over with different addresses to iterate over a list or an array.

This is part 1 of this guide, because I didn’t actually think anyone would read through 8000 words of me explaining WinDbg commands. So you get 2 posts of 4000 words! That’s better, right?

In this first post we will learn the basics of how to use this new data model — using custom registers and new built-in registers, iterating over objects, searching them and filtering them and customizing them with anonymous types. And finally we will learn how to parse arrays and lists in a much nicer and easier way than you’re used to.

And in the next post we’ll learn the more complicated and fancier methods and features that this data model gives us. Now that we all know what to expect and have grabbed another cup of coffee, let’s start!

This data model, accessed in WinDbg through the dx command, is an extremely powerful tool, able to define custom variables, structures and functions and use a wide range of new capabilities. It also lets us search and filter information with LINQ, a query syntax that resembles database query languages such as SQL.

This data model is documented and even has usage examples on GitHub. Additionally, all of its modules have documentation that can be viewed in the debugger with dx -v <method> (though you will get the same documentation if you run dx <method> without the -v flag):

dx -v Debugger.Utility.Collections.FromListEntry
Debugger.Utility.Collections.FromListEntry [FromListEntry(ListEntry, [<ModuleName | ModuleObject>], TypeName, FieldExpression) — Method which converts a LIST_ENTRY specified by the ‘ListEntry’ parameter of types whose name is specified by the string ‘TypeName’ and whose embedded links within that type are accessed via an expression specified by the string ‘FieldExpression’ into a collection object. If an optional module name or object is specified, the type name is looked up in the context of such module]

There has also been some external documentation, but I felt like there were things that needed further explanation and that this feature is worth more attention than it receives.

Custom Registers

First, NatVis adds the option for custom registers. Kind of like MASM had @$t1, @$t2, @$t3 , etc. Only now you can call them whatever name you want, and they can have a type of your choice:

dx @$myString = "My String"
dx @$myInt = 123

We can see all our variables with dx @$vars and remove them with dx @$vars.Remove("var name"), or clear all with dx @$vars.Clear(). We can also use dx to handle more complicated structures, such as an EPROCESS. As you might know, symbols in public PDBs don’t have type information. With the old debugger, this wasn’t always a problem, since in MASM there are no types anyway, and we could use the poi operator to dereference a pointer.

0: kd> dt nt!_EPROCESS poi(nt!PsInitialSystemProcess)
+0x000 Pcb : _KPROCESS
+0x2e0 ProcessLock : _EX_PUSH_LOCK
+0x2e8 UniqueProcessId : (null)
...

But things got messier when the variable isn’t a pointer, like with PsIdleProcess:

0: kd> dt nt!_KPROCESS @@masm(nt!PsIdleProcess)
+0x000 Header : _DISPATCHER_HEADER
+0x018 ProfileListHead : _LIST_ENTRY [ 0x00000048`0411017e - 0x00000000`00000004 ]
+0x028 DirectoryTableBase : 0xffffb10b`79f08010
+0x030 ThreadListHead : _LIST_ENTRY [ 0x00001388`00000000 - 0xfffff801`1b401000 ]
+0x040 ProcessLock : 0
+0x044 ProcessTimerDelay : 0
+0x048 DeepFreezeStartTime : 0xffffe880`00000000
...

We first have to use explicit MASM operators to get the address of PsIdleProcess and then print it as a KPROCESS. With dx we can be smarter and cast symbols directly, using C-style casts. But when we try to cast nt!PsInitialSystemProcess to a pointer to an EPROCESS:

dx @$systemProc = (nt!_EPROCESS*)nt!PsInitialSystemProcess
Error: No type (or void) for object at Address 0xfffff8074ef843a0

We get an error.

Like I mentioned, symbols have no type. And we can’t cast something with no type. So we need to take the address of the symbol, and cast it to a pointer to the type we want (In this case, PsInitialSystemProcess is already a pointer to an EPROCESS so we need to cast its address to a pointer to a pointer to an EPROCESS).

dx @$systemProc = *(nt!_EPROCESS**)&nt!PsInitialSystemProcess

Now that we have a typed variable, we can access its fields like we would do in C:

0: kd> dx @$systemProc->ImageFileName
@$systemProc->ImageFileName               [Type: unsigned char [15]]
[0] : 0x53 [Type: unsigned char]
[1] : 0x79 [Type: unsigned char]
[2] : 0x73 [Type: unsigned char]
[3] : 0x74 [Type: unsigned char]
[4] : 0x65 [Type: unsigned char]
[5] : 0x6d [Type: unsigned char]
[6] : 0x0 [Type: unsigned char]

And we can cast that to get a nicer output:

dx (char*)@$systemProc->ImageFileName
(char*)@$systemProc->ImageFileName                 : 0xffffc10c8e87e710 : "System" [Type: char *]

We can also use ToDisplayString to cast it from a char* to a string. We have two options — ToDisplayString("s"), which will cast it to a string and keep the quotes as part of the string, or ToDisplayString("sb"), which will remove them:

dx ((char*)@$systemProc->ImageFileName).ToDisplayString("s")
((char*)@$systemProc->ImageFileName).ToDisplayString("s") : "System"
Length : 0x8
dx ((char*)@$systemProc->ImageFileName).ToDisplayString("sb")
((char*)@$systemProc->ImageFileName).ToDisplayString("sb") : System
Length : 0x6

Built-in Registers

This is fun, but for processes (and a few other things) there is an even easier way. Together with NatVis’ implementation in WinDbg we got some “free” registers already containing some useful information — curframe, curprocess, cursession, curstack and curthread. It’s not hard to guess their contents by their names, but let’s take a look:

@$curframe

Gives us information about the current frame. I never actually used it myself, but it might be useful:

dx -r1 @$curframe.Attributes
@$curframe.Attributes                
InstructionOffset : 0xfffff8074ebda1e1
ReturnOffset : 0xfffff80752ad2b61
FrameOffset : 0xfffff80751968830
StackOffset : 0xfffff80751968838
FuncTableEntry : 0x0
Virtual : 1
FrameNumber : 0x0

@$curprocess

A container with information about the current process. This is not an EPROCESS (though it does contain it). It contains easily accessible information about the current process, like its threads, loaded modules, handles, etc.

dx @$curprocess
@$curprocess                 : System [Switch To]
KernelObject [Type: _EPROCESS]
Name : System
Id : 0x4
Handle : 0xf0f0f0f0
Threads
Modules
Environment
Devices
Io

In KernelObject we have the EPROCESS, but we can also use the other fields. For example, we can access all the handles held by the process through @$curprocess.Io.Handles, which will lead us to an array of handles, indexed by their handle number:

dx @$curprocess.Io.Handles
@$curprocess.Io.Handles                
[0x4]
[0x8]
[0xc]
[0x10]
[0x14]
[0x18]
[0x1c]
[0x20]
[0x24]
[0x28]
...

System has a lot of handles, these are just the first few! Let’s just take a look at the first one (which we can also access through @$curprocess.Io.Handles[0x4]):

dx @$curprocess.Io.Handles.First()
@$curprocess.Io.Handles.First()                
Handle : 0x4
Type : Process
GrantedAccess : Delete | ReadControl | WriteDac | WriteOwner | Synch | Terminate | CreateThread | VMOp | VMRead | VMWrite | DupHandle | CreateProcess | SetQuota | SetInfo | QueryInfo | SetPort
Object [Type: _OBJECT_HEADER]

We can see the handle, the type of object the handle is for, its granted access, and we even have a pointer to the object itself (or its object header, to be precise)!

There are plenty more things to find under this register, and I encourage you to investigate them, but I will not show all of them.

By the way, have we mentioned already that dx allows tab completion?

@$cursession

As its name suggests, this register gives us information about the current debugger session:

dx @$cursession
@$cursession                 : Remote KD: KdSrv:Server=@{<Local>},Trans=@{NET:Port=55556,Key=1.2.3.4,Target=192.168.251.21}
Processes
Id : 0
Devices
Attributes

So, we can get information about our debugger session, which is always fun. But there are more useful things to be found here, such as the Processes field, which is an array of all processes, indexed by their PID. Let’s pick one of them:

dx @$cursession.Processes[0x1d8]
@$cursession.Processes[0x1d8]                 : smss.exe [Switch To]
KernelObject [Type: _EPROCESS]
Name : smss.exe
Id : 0x1d8
Handle : 0xf0f0f0f0
Threads
Modules
Environment
Devices
Io

Now we can get all that useful information about every single process! We can also search through processes by filtering them based on a search (such as by their name, specific modules loaded into them, strings in their command line, etc.). But I will explain all of that later.

@$curstack

This register contains a single field, Frames, which shows us the current stack in an easily handled way:

dx @$curstack.Frames
@$curstack.Frames                
[0x0] : nt!DbgBreakPointWithStatus + 0x1 [Switch To]
[0x1] : kdnic!TXTransmitQueuedSends + 0x125 [Switch To]
[0x2] : kdnic!TXSendCompleteDpc + 0x14d [Switch To]
[0x3] : nt!KiProcessExpiredTimerList + 0x169 [Switch To]
[0x4] : nt!KiRetireDpcList + 0x4e9 [Switch To]
[0x5] : nt!KiIdleLoop + 0x7e [Switch To]

@$curthread

Gives us information about the current thread, just like @$curprocess:

dx @$curthread
@$curthread                 : nt!DbgBreakPointWithStatus+0x1 (fffff807`4ebda1e1)  [Switch To]
KernelObject [Type: _ETHREAD]
Id : 0x0
Stack
Registers
Environment

It contains the ETHREAD in KernelObject, but also contains the TEB in Environment, and can show us the thread ID, stack and registers.

dx @$curthread.Registers
@$curthread.Registers                
User
Kernel
SIMD
FloatingPoint

We have them conveniently separated into user, kernel, SIMD and FloatingPoint registers, and we can look at each separately:

dx -r1 @$curthread.Registers.Kernel
@$curthread.Registers.Kernel
cr0 : 0x80050033
cr2 : 0x207b8f7abbe
cr3 : 0x6d4002
cr4 : 0x370678
cr8 : 0xf
gdtr : 0xffff9d815ffdbfb0
gdtl : 0x57
idtr : 0xffff9d815ffd9000
idtl : 0xfff
tr : 0x40
ldtr : 0x0
kmxcsr : 0x1f80
kdr0 : 0x0
kdr1 : 0x0
kdr2 : 0x0
kdr3 : 0x0
kdr6 : 0xfffe0ff0
kdr7 : 0x400
xcr0 : 0x1f

Searching and Filtering

A very useful thing that NatVis allows us to do, which we briefly mentioned before, is searching, filtering and ordering information in an SQL-like way through Select, Where, OrderBy and more.

For example, let’s try to find all the processes that don’t enable high entropy ASLR. This information is stored in the EPROCESS->MitigationFlags field, and the value for HighEntropyASLREnabled is 0x20 (all values can be found here and in the public symbols).

First, we’ll declare a new register with that value, just to make things more readable:

0: kd> dx @$highEntropyAslr = 0x20
@$highEntropyAslr = 0x20 : 32

And then create our query to iterate over all processes and only pick ones where the HighEntropyASLREnabled bit is not set:

dx -r1 @$cursession.Processes.Where(p => (p.KernelObject.MitigationFlags & @$highEntropyAslr) == 0)
@$cursession.Processes.Where(p => (p.KernelObject.MitigationFlags & @$highEntropyAslr) == 0)                
[0x910] : spoolsv.exe [Switch To]
[0xb40] : IpOverUsbSvc.exe [Switch To]
[0x1610] : explorer.exe [Switch To]
[0x1d8c] : OneDrive.exe [Switch To]

Or we can check the flag directly through MitigationFlagsValues and get the same results:

dx -r1 @$cursession.Processes.Where(p => (p.KernelObject.MitigationFlagsValues.HighEntropyASLREnabled == 0))
@$cursession.Processes.Where(p => (p.KernelObject.MitigationFlagsValues.HighEntropyASLREnabled == 0))                
[0x910] : spoolsv.exe [Switch To]
[0xb40] : IpOverUsbSvc.exe [Switch To]
[0x1610] : explorer.exe [Switch To]
[0x1d8c] : OneDrive.exe [Switch To]
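
The two queries agree because MitigationFlagsValues is a bit-field view over the same value as MitigationFlags, so testing the 0x20 mask and reading the named bit are the same check. Here is a minimal Python model of that equivalence. The process names and flag values are made up for illustration, and nothing here talks to a debugger:

```python
# Model only: process names and flag values are invented for illustration.
HIGH_ENTROPY_ASLR = 0x20  # mask for HighEntropyASLREnabled, as stated above

processes = {
    "with_aslr.exe": 0x21,     # bit 5 set
    "without_aslr.exe": 0x01,  # bit 5 clear
    "no_flags.exe": 0x00,
}


def bit_field(value, position):
    """Read a single bit, the way a named one-bit field exposes it."""
    return (value >> position) & 1


# Equivalent of Where(p => (MitigationFlags & mask) == 0)
by_mask = [name for name, flags in processes.items()
           if (flags & HIGH_ENTROPY_ASLR) == 0]

# Equivalent of Where(p => MitigationFlagsValues.HighEntropyASLREnabled == 0)
by_field = [name for name, flags in processes.items()
            if bit_field(flags, 5) == 0]

print(by_mask)
print(by_field)
```

The mask form is handy when you want to test several bits at once, while the named field reads better for a single flag.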

We can also use Select() to only show certain attributes of things we iterate over. Here we choose to see only the number of threads each process has:

dx @$cursession.Processes.Select(p => p.Threads.Count())
@$cursession.Processes.Select(p => p.Threads.Count())                
[0x0] : 0x6
[0x4] : 0xeb
[0x78] : 0x4
[0x1d8] : 0x5
[0x244] : 0xe
[0x294] : 0x8
[0x2a0] : 0x10
[0x2f8] : 0x9
[0x328] : 0xa
[0x33c] : 0xd
[0x3a8] : 0x2c
[0x3c0] : 0x8
[0x3c8] : 0x8
[0x204] : 0x15
[0x300] : 0x1d
[0x444] : 0x3f
...

We can also see everything in decimal by adding , d to the end of the command, to specify the output format (we can also use b for binary, o for octal or s for string):

dx @$cursession.Processes.Select(p => p.Threads.Count()), d
@$cursession.Processes.Select(p => p.Threads.Count()), d                
[0] : 6
[4] : 235
[120] : 4
[472] : 5
[580] : 14
[660] : 8
[672] : 16
[760] : 9
[808] : 10
[828] : 13
[936] : 44
[960] : 8
[968] : 8
[516] : 21
[768] : 29
[1092] : 63
...

Or, in a slightly more complicated example, see the ideal processor for each thread running in a certain process (I chose a process at random, just to see something that is not the System process):

dx -r1 @$cursession.Processes[0x1b2c].Threads.Select(t => t.Environment.EnvironmentBlock.CurrentIdealProcessor.Number)
@$cursession.Processes[0x1b2c].Threads.Select(t => t.Environment.EnvironmentBlock.CurrentIdealProcessor.Number)                
[0x1b30] : 0x1 [Type: unsigned char]
[0x1b40] : 0x2 [Type: unsigned char]
[0x1b4c] : 0x3 [Type: unsigned char]
[0x1b50] : 0x4 [Type: unsigned char]
[0x1b48] : 0x5 [Type: unsigned char]
[0x1b5c] : 0x0 [Type: unsigned char]
[0x1b64] : 0x1 [Type: unsigned char]

We can also use OrderBy to get nicer results, for example to get a list of all processes sorted by alphabetical order:

dx -r1 @$cursession.Processes.OrderBy(p => p.Name)
@$cursession.Processes.OrderBy(p => p.Name)                
[0x1848] : ApplicationFrameHost.exe [Switch To]
[0x0] : Idle [Switch To]
[0xb40] : IpOverUsbSvc.exe [Switch To]
[0x106c] : LogonUI.exe [Switch To]
[0x754] : MemCompression [Switch To]
[0x187c] : MicrosoftEdge.exe [Switch To]
[0x1b94] : MicrosoftEdgeCP.exe [Switch To]
[0x1b7c] : MicrosoftEdgeSH.exe [Switch To]
[0xb98] : MsMpEng.exe [Switch To]
[0x1158] : NisSrv.exe [Switch To]
[0x1d8c] : OneDrive.exe [Switch To]
[0x78] : Registry [Switch To]
[0x1ed0] : RuntimeBroker.exe [Switch To]
...

If we want them in a descending order, we can use OrderByDescending.

But what if we want to pick more than one attribute to see? There is a solution for that too.

Anonymous Types

We can declare a type of our own, that will be unnamed and only used in the scope of our query, using this syntax: Select(x => new { var1 = x.A, var2 = x.B, ...}).

We’ll try it out on one of our previous examples. Let’s say for each process we want to show a process name and its thread count:

dx @$cursession.Processes.Select(p => new {Name = p.Name, ThreadCount = p.Threads.Count()})
@$cursession.Processes.Select(p => new {Name = p.Name, ThreadCount = p.Threads.Count()})                
[0x0]
[0x4]
[0x78]
[0x1d8]
[0x244]
[0x294]
[0x2a0]
[0x2f8]
[0x328]
...

But now we only see the process container, not the actual information. To see the information itself we need to go one layer deeper, by using -r2. The number specifies the output recursion level. The default is -r1, -r0 will show no output, -r2 will show two levels, etc.

dx -r2 @$cursession.Processes.Select(p => new {Name = p.Name, ThreadCount = p.Threads.Count()})
@$cursession.Processes.Select(p => new {Name = p.Name, ThreadCount = p.Threads.Count()})                
[0x0]
Name : Idle
ThreadCount : 0x6
[0x4]
Name : System
ThreadCount : 0xeb
[0x78]
Name : Registry
ThreadCount : 0x4
[0x1d8]
Name : smss.exe
ThreadCount : 0x5
[0x244]
Name : csrss.exe
ThreadCount : 0xe
[0x294]
Name : wininit.exe
ThreadCount : 0x8
[0x2a0]
Name : csrss.exe
ThreadCount : 0x10
...

This already looks much better, but we can make it look even nicer with the new grid view, accessed through the -g flag:

dx -g @$cursession.Processes.Select(p => new {Name = p.Name, ThreadCount = p.Threads.Count()})
Output of dx -g @$cursession.Processes.Select(p => new {Name = p.Name, ThreadCount = p.Threads.Count()})

OK, this just looks awesome. And yes, these headings are clickable and will sort the table!

And if we want to see the PIDs and thread numbers in decimal we can just add , d to the end of the command:

dx -g @$cursession.Processes.Select(p => new {Name = p.Name, ThreadCount = p.Threads.Count()}),d
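
If you are more at home in a scripting language, the query operators used so far map closely onto ordinary collection operations. The sketch below is a Python analogy only. It uses a hand-written stand-in for the process list (a few names and thread counts borrowed from the output above) and does not run inside the debugger:

```python
# Analogy only: a hand-written stand-in for @$cursession.Processes.
processes = [
    {"pid": 0x0, "name": "Idle", "threads": 6},
    {"pid": 0x4, "name": "System", "threads": 235},
    {"pid": 0x78, "name": "Registry", "threads": 4},
    {"pid": 0x1d8, "name": "smss.exe", "threads": 5},
]

# Where(p => ...): keep only the elements matching a condition.
busy = [p for p in processes if p["threads"] > 4]

# Select(p => new {Name = ..., ThreadCount = ...}): project each element
# into a small throwaway record, much like an anonymous type.
summary = [{"Name": p["name"], "ThreadCount": p["threads"]} for p in busy]

# OrderBy(...) and OrderByDescending(...)
by_name = sorted(summary, key=lambda r: r["Name"])
by_count_desc = sorted(summary, key=lambda r: r["ThreadCount"], reverse=True)

# A format specifier such as ", d" only changes how numbers are displayed.
for row in by_name:
    print(f'{row["Name"]:<10} hex={row["ThreadCount"]:#x} dec={row["ThreadCount"]}')
```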

Arrays and Lists

dx also gives us a new, much easier way to handle arrays and lists.
Let’s look at arrays first, where the syntax is dx *(TYPE(*)[Size])<pointer to array start>.

For this example, we will dump the contents of PsInvertedFunctionTable, which contains an array of up to 256 cached modules in its TableEntry field.

First, we will get the pointer of this symbol and cast it to _INVERTED_FUNCTION_TABLE:

dx @$inverted = (nt!_INVERTED_FUNCTION_TABLE*)&nt!PsInvertedFunctionTable
@$inverted = (nt!_INVERTED_FUNCTION_TABLE*)&nt!PsInvertedFunctionTable                 : 0xfffff8074ef9b010 [Type: _INVERTED_FUNCTION_TABLE *]
[+0x000] CurrentSize : 0xbe [Type: unsigned long]
[+0x004] MaximumSize : 0x100 [Type: unsigned long]
[+0x008] Epoch : 0x19e [Type: unsigned long]
[+0x00c] Overflow : 0x0 [Type: unsigned char]
[+0x010] TableEntry [Type: _INVERTED_FUNCTION_TABLE_ENTRY [256]]

Now we can create our array. Unfortunately, the size of the array has to be static and can’t use a register, so we need to input it manually, based on CurrentSize (or just set it to 256, which is the size of the whole array). And we can use the grid view to print it nicely:

dx -g @$tableEntry = *(nt!_INVERTED_FUNCTION_TABLE_ENTRY(*)[0xbe])@$inverted->TableEntry
Output of dx -g @$tableEntry = *(nt!_INVERTED_FUNCTION_TABLE_ENTRY(*)[0xbe])@$inverted->TableEntry

Alternatively, we can use the Take() method, which receives a number and prints that amount of elements from a collection, and get the same result:

dx -g @$inverted->TableEntry->Take(@$inverted->CurrentSize)

We can also do the same thing to see the UserInvertedFunctionTable (right after we switch to a user process that’s not System), starting from nt!KeUserInvertedFunctionTable:

dx @$inverted = *(nt!_INVERTED_FUNCTION_TABLE**)&nt!KeUserInvertedFunctionTable
@$inverted = *(nt!_INVERTED_FUNCTION_TABLE**)&nt!KeUserInvertedFunctionTable                 : 0x7ffa19e3a4d0 [Type: _INVERTED_FUNCTION_TABLE *]
[+0x000] CurrentSize : 0x2 [Type: unsigned long]
[+0x004] MaximumSize : 0x200 [Type: unsigned long]
[+0x008] Epoch : 0x6 [Type: unsigned long]
[+0x00c] Overflow : 0x0 [Type: unsigned char]
[+0x010] TableEntry [Type: _INVERTED_FUNCTION_TABLE_ENTRY [256]]
dx -g @$inverted->TableEntry->Take(@$inverted->CurrentSize)

And of course we can use Select(), Where() or other functions to filter, sort or select only specific fields for our output and get tailored results that fit exactly what we need.

The next thing to handle is lists. Windows is full of linked lists and you can find them everywhere, linking processes, threads, modules, DPCs, IRPs, and more.

Fortunately the new data model has a very useful Debugger method, Debugger.Utility.Collections.FromListEntry, which takes in a linked list head, the type of the list’s elements and the name of the field in this type containing the LIST_ENTRY structure, and returns a container of all the list contents.

So, for our example let’s dump all the handle tables in the system. Our starting point will be the symbol nt!HandleTableListHead, the type of the objects in the list is nt!_HANDLE_TABLE and the field linking the list is HandleTableList:

dx -r2 Debugger.Utility.Collections.FromListEntry(*(nt!_LIST_ENTRY*)&nt!HandleTableListHead, "nt!_HANDLE_TABLE", "HandleTableList")
Debugger.Utility.Collections.FromListEntry(*(nt!_LIST_ENTRY*)&nt!HandleTableListHead, "nt!_HANDLE_TABLE", "HandleTableList")                
[0x0] [Type: _HANDLE_TABLE]
[+0x000] NextHandleNeedingPool : 0x3400 [Type: unsigned long]
[+0x004] ExtraInfoPages : 0 [Type: long]
[+0x008] TableCode : 0xffff8d8dcfd18001 [Type: unsigned __int64]
[+0x010] QuotaProcess : 0x0 [Type: _EPROCESS *]
[+0x018] HandleTableList [Type: _LIST_ENTRY]
[+0x028] UniqueProcessId : 0x4 [Type: unsigned long]
[+0x02c] Flags : 0x0 [Type: unsigned long]
[+0x02c ( 0: 0)] StrictFIFO : 0x0 [Type: unsigned char]
[+0x02c ( 1: 1)] EnableHandleExceptions : 0x0 [Type: unsigned char]
[+0x02c ( 2: 2)] Rundown : 0x0 [Type: unsigned char]
[+0x02c ( 3: 3)] Duplicated : 0x0 [Type: unsigned char]
[+0x02c ( 4: 4)] RaiseUMExceptionOnInvalidHandleClose : 0x0 [Type: unsigned char]
[+0x030] HandleContentionEvent [Type: _EX_PUSH_LOCK]
[+0x038] HandleTableLock [Type: _EX_PUSH_LOCK]
[+0x040] FreeLists [Type: _HANDLE_TABLE_FREE_LIST [1]]
[+0x040] ActualEntry [Type: unsigned char [32]]
[+0x060] DebugInfo : 0x0 [Type: _HANDLE_TRACE_DEBUG_INFO *]
[0x1] [Type: _HANDLE_TABLE]
[+0x000] NextHandleNeedingPool : 0x400 [Type: unsigned long]
[+0x004] ExtraInfoPages : 0 [Type: long]
[+0x008] TableCode : 0xffff8d8dcb651000 [Type: unsigned __int64]
[+0x010] QuotaProcess : 0xffffb90a530e4080 [Type: _EPROCESS *]
[+0x018] HandleTableList [Type: _LIST_ENTRY]
[+0x028] UniqueProcessId : 0x78 [Type: unsigned long]
[+0x02c] Flags : 0x10 [Type: unsigned long]
[+0x02c ( 0: 0)] StrictFIFO : 0x0 [Type: unsigned char]
[+0x02c ( 1: 1)] EnableHandleExceptions : 0x0 [Type: unsigned char]
[+0x02c ( 2: 2)] Rundown : 0x0 [Type: unsigned char]
[+0x02c ( 3: 3)] Duplicated : 0x0 [Type: unsigned char]
[+0x02c ( 4: 4)] RaiseUMExceptionOnInvalidHandleClose : 0x1 [Type: unsigned char]
[+0x030] HandleContentionEvent [Type: _EX_PUSH_LOCK]
[+0x038] HandleTableLock [Type: _EX_PUSH_LOCK]
[+0x040] FreeLists [Type: _HANDLE_TABLE_FREE_LIST [1]]
[+0x040] ActualEntry [Type: unsigned char [32]]
[+0x060] DebugInfo : 0x0 [Type: _HANDLE_TRACE_DEBUG_INFO *]
...
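
To make it clearer what FromListEntry does with those three arguments, here is a small Python model of walking a circular LIST_ENTRY list. It follows Flink from the head until it points back at the head, and subtracts the offset of the linking field from each entry address to recover the start of the containing structure (the job the CONTAINING_RECORD macro does in C). The addresses are invented; the 0x18 offset is the HandleTableList offset shown in the output above. This sketches the traversal idea and is not the debugger's actual implementation:

```python
# Model only: "flink" maps the address of each LIST_ENTRY to its Flink value.
HANDLE_TABLE_LIST_OFFSET = 0x18    # offset of HandleTableList in _HANDLE_TABLE

LIST_HEAD = 0x1000                 # stand-in for nt!HandleTableListHead
TABLES = [0x2000, 0x3000, 0x4000]  # made-up _HANDLE_TABLE addresses

# Build a circular list: head -> table0 -> table1 -> table2 -> head
flink = {}
previous = LIST_HEAD
for table in TABLES:
    entry = table + HANDLE_TABLE_LIST_OFFSET
    flink[previous] = entry
    previous = entry
flink[previous] = LIST_HEAD


def from_list_entry(head, field_offset):
    """Return the base address of every structure linked into the list."""
    results = []
    current = flink[head]
    while current != head:  # the head itself is not an element
        results.append(current - field_offset)
        current = flink[current]
    return results


found = from_list_entry(LIST_HEAD, HANDLE_TABLE_LIST_OFFSET)
print([hex(address) for address in found])
```

This is also why the method needs the field name: without the offset of the LIST_ENTRY inside the structure, it could not turn a list pointer back into a structure pointer.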

See the QuotaProcess field? That field points to the process that this handle table belongs to. Since every process has a handle table, this allows us to enumerate all the processes on the system in a way that’s not widely known. This method has been used by rootkits in the past to enumerate processes without being detected by EDR products. So to implement this we just need to Select() the QuotaProcess from each entry in our handle table list. To create a nicer looking output we can also create an anonymous container with the process name, PID and EPROCESS pointer:

dx -r2 (Debugger.Utility.Collections.FromListEntry(*(nt!_LIST_ENTRY*)&nt!HandleTableListHead, "nt!_HANDLE_TABLE", "HandleTableList")).Select(h => new { Object = h.QuotaProcess, Name = ((char*)h.QuotaProcess->ImageFileName).ToDisplayString("s"), PID = (__int64)h.QuotaProcess->UniqueProcessId})
(Debugger.Utility.Collections.FromListEntry(*(nt!_LIST_ENTRY*)&nt!HandleTableListHead, "nt!_HANDLE_TABLE", "HandleTableList")).Select(h => new { Object = h.QuotaProcess, Name = ((char*)h.QuotaProcess->ImageFileName).ToDisplayString("s"), PID = (__int64)h.QuotaProcess->UniqueProcessId})
[0x0]            : Unspecified error (0x80004005)
[0x1]
Object : 0xffffb10b70906080 [Type: _EPROCESS *]
Name : "Registry"
PID : 120 [Type: __int64]
[0x2]
Object : 0xffffb10b72eba0c0 [Type: _EPROCESS *]
Name : "smss.exe"
PID : 584 [Type: __int64]
[0x3]
Object : 0xffffb10b76586140 [Type: _EPROCESS *]
Name : "csrss.exe"
PID : 696 [Type: __int64]
[0x4]
Object : 0xffffb10b77132140 [Type: _EPROCESS *]
Name : "wininit.exe"
PID : 772 [Type: __int64]
[0x5]
Object : 0xffffb10b770a2080 [Type: _EPROCESS *]
Name : "csrss.exe"
PID : 780 [Type: __int64]
[0x6]
Object : 0xffffb10b7716d080 [Type: _EPROCESS *]
Name : "winlogon.exe"
PID : 852 [Type: __int64]
...

The first result is the table belonging to the System process and it does not have a QuotaProcess, which is the reason this query returns an error for it. But it should work perfectly for every other entry in the list. If we want to make our output prettier, we can filter out entries where QuotaProcess == 0 before we do the Select():

dx -r2 (Debugger.Utility.Collections.FromListEntry(*(nt!_LIST_ENTRY*)&nt!HandleTableListHead, "nt!_HANDLE_TABLE", "HandleTableList")).Where(h => h.QuotaProcess != 0).Select(h => new { Object = h.QuotaProcess, Name = ((char*)h.QuotaProcess->ImageFileName).ToDisplayString("s"), PID = h.QuotaProcess->UniqueProcessId})
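
The order of the operators matters here: the projection dereferences QuotaProcess, so the null entry has to be removed before the projection runs. A rough Python model of the two variants follows. The dictionaries are stand-ins for the handle tables, and None stands in for a null pointer:

```python
# Model only: None stands in for a null QuotaProcess pointer.
handle_tables = [
    {"QuotaProcess": None},  # the System process table
    {"QuotaProcess": {"ImageFileName": "Registry", "UniqueProcessId": 120}},
    {"QuotaProcess": {"ImageFileName": "smss.exe", "UniqueProcessId": 584}},
]


def project(table):
    """Equivalent of the Select() body: it dereferences QuotaProcess."""
    process = table["QuotaProcess"]
    return {"Name": process["ImageFileName"], "PID": process["UniqueProcessId"]}


# Select() alone: the projection fails for the element with a null pointer.
unfiltered = []
for table in handle_tables:
    try:
        unfiltered.append(project(table))
    except TypeError:
        unfiltered.append("error")

# Where(h => h.QuotaProcess != 0) first, then Select(): nothing fails.
filtered = [project(t) for t in handle_tables if t["QuotaProcess"] is not None]

print(unfiltered[0])
print(filtered)
```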

As we already showed before, we can also print this list in a grid view or use any LINQ queries to make the output match our needs.

This is the end of our first part, but don’t worry, the second part is right here, and it contains all the fancy new dx methods such as a new disassembler, defining our own methods, conditional breakpoints that actually work, and more.

WinDbg — the Fun Way: Part 2

Welcome to part 2 of me trying to make you enjoy debugging on Windows (wow, I’m a nerd)!

In the first part we got to know the basics of the new debugger data model: using the new objects, having custom registers, searching and filtering output, declaring anonymous types and parsing lists and arrays. In this part we will learn how to use legacy commands with dx, get to know the amazing new disassembler, create synthetic methods and types, see the fancy changes to breakpoints and use the filesystem from within the debugger.

This sounds like a lot. Because it is. So let’s start!

Legacy Commands

This new data model completely changes the debugging experience. But sometimes you do need to use one of the old commands or extensions that we all got used to, and that don’t have a matching functionality under dx.

But we can still use these under dx with Debugger.Utility.Control.ExecuteCommand, which lets us run a legacy command as part of a dx query. For example, we can use the legacy u command to unassemble the address that is pointed to by RIP in our second stack frame.

Since dx output is decimal by default and legacy commands only take hex input we first need to convert it to hex using ToDisplayString("x"):

dx Debugger.Utility.Control.ExecuteCommand("u " + @$curstack.Frames[1].Attributes.InstructionOffset.ToDisplayString("x"))
Debugger.Utility.Control.ExecuteCommand("u " + @$curstack.Frames[1].Attributes.InstructionOffset.ToDisplayString("x"))                
[0x0] : kdnic!TXTransmitQueuedSends+0x125:
[0x1] : fffff807`52ad2b61 4883c430 add rsp,30h
[0x2] : fffff807`52ad2b65 5b pop rbx
[0x3] : fffff807`52ad2b66 c3 ret
[0x4] : fffff807`52ad2b67 4c8d4370 lea r8,[rbx+70h]
[0x5] : fffff807`52ad2b6b 488bd7 mov rdx,rdi
[0x6] : fffff807`52ad2b6e 488d4b60 lea rcx,[rbx+60h]
[0x7] : fffff807`52ad2b72 4c8b15d7350000 mov r10,qword ptr [kdnic!_imp_ExInterlockedInsertTailList (fffff807`52ad6150)]
[0x8] : fffff807`52ad2b79 e8123af8fb call nt!ExInterlockedInsertTailList (fffff807`4ea56590)

Another useful legacy command is !irp. This command supplies us with a lot of information about IRPs, so no need to work hard to recreate it with dx.

So we will try to run !irp for all IRPs in the lsass.exe process. Let’s walk through that:

First, we need to find the process container for lsass.exe. We already know how to do that using Where(). Then we’ll pick the first process returned. Usually there should only be one lsass anyway, unless there are server silos on the machine:

dx @$lsass = @$cursession.Processes.Where(p => p.Name == "lsass.exe").First()

Then we need to iterate over IrpList for each thread in the process and get the IRPs themselves. We can easily do that with FromListEntry() that we’ve seen already. Then we only pick the threads that have IRPs in their list:

dx -r4 @$irpThreads = @$lsass.Threads.Select(t => new {irp = Debugger.Utility.Collections.FromListEntry(t.KernelObject.IrpList, "nt!_IRP", "ThreadListEntry")}).Where(t => t.irp.Count() != 0)
@$irpThreads = @$lsass.Threads.Select(t => new {irp = Debugger.Utility.Collections.FromListEntry(t.KernelObject.IrpList, "nt!_IRP", "ThreadListEntry")}).Where(t => t.irp.Count() != 0)
[0x384]
irp
[0x0] [Type: _IRP]
[<Raw View>] [Type: _IRP]
IoStack : Size = 12, Current IRP_MJ_DIRECTORY_CONTROL / 0x2 for Device for "\FileSystem\Ntfs"
CurrentThread : 0xffffb90a59477080 [Type: _ETHREAD *]
[0x1] [Type: _IRP]
[<Raw View>] [Type: _IRP]
IoStack : Size = 12, Current IRP_MJ_DIRECTORY_CONTROL / 0x2 for Device for "\FileSystem\Ntfs"
CurrentThread : 0xffffb90a59477080 [Type: _ETHREAD *]

We can stop here for a moment, click on IoStack for one of the IRPs (or run with -r5 to see all of them) and get the stack in a nice container we can work with:

dx @$irpThreads.First().irp[0].IoStack
@$irpThreads.First().irp[0].IoStack                 : Size = 12, Current IRP_MJ_DIRECTORY_CONTROL / 0x2 for Device for "\FileSystem\Ntfs"
[0] : IRP_MJ_CREATE / 0x0 for {...} [Type: _IO_STACK_LOCATION]
[1] : IRP_MJ_CREATE / 0x0 for {...} [Type: _IO_STACK_LOCATION]
[2] : IRP_MJ_CREATE / 0x0 for {...} [Type: _IO_STACK_LOCATION]
[3] : IRP_MJ_CREATE / 0x0 for {...} [Type: _IO_STACK_LOCATION]
[4] : IRP_MJ_CREATE / 0x0 for {...} [Type: _IO_STACK_LOCATION]
[5] : IRP_MJ_CREATE / 0x0 for {...} [Type: _IO_STACK_LOCATION]
[6] : IRP_MJ_CREATE / 0x0 for {...} [Type: _IO_STACK_LOCATION]
[7] : IRP_MJ_CREATE / 0x0 for {...} [Type: _IO_STACK_LOCATION]
[8] : IRP_MJ_CREATE / 0x0 for {...} [Type: _IO_STACK_LOCATION]
[9] : IRP_MJ_CREATE / 0x0 for {...} [Type: _IO_STACK_LOCATION]
[10] : IRP_MJ_DIRECTORY_CONTROL / 0x2 for Device for "\FileSystem\Ntfs" [Type: _IO_STACK_LOCATION]
[11] : IRP_MJ_DIRECTORY_CONTROL / 0x2 for Device for "\FileSystem\FltMgr" [Type: _IO_STACK_LOCATION]

And as the final step we will iterate over every thread, and over every IRP in them, and ExecuteCommand !irp <irp address>. Here too we need casting and ToDisplayString("x") to match the format expected by legacy commands (the output of !irp is very long so we trimmed it down to focus on the interesting data):

dx -r3 @$irpThreads.Select(t => t.irp.Select(i => Debugger.Utility.Control.ExecuteCommand("!irp " + ((__int64)&i).ToDisplayString("x"))))
@$irpThreads.Select(t => t.irp.Select(i => Debugger.Utility.Control.ExecuteCommand("!irp " + ((__int64)&i).ToDisplayString("x"))))                
[0x384]
[0x0]
[0x0] : Irp is active with 12 stacks 11 is current (= 0xffffb90a5b8f4d40)
[0x1] : No Mdl: No System Buffer: Thread ffffb90a59477080: Irp stack trace.
[0x2] : cmd flg cl Device File Completion-Context
[0x3] : [N/A(0), N/A(0)]
...
[0x34] : Irp Extension present at 0xffffb90a5b8f4dd0:
[0x1]
[0x0] : Irp is active with 12 stacks 11 is current (= 0xffffb90a5bd24840)
[0x1] : No Mdl: No System Buffer: Thread ffffb90a59477080: Irp stack trace.
[0x2] : cmd flg cl Device File Completion-Context
[0x3] : [N/A(0), N/A(0)]
...
[0x34] : Irp Extension present at 0xffffb90a5bd248d0:

Most of the information given to us by !irp we can get by parsing the IRPs with dx and dumping the IoStack for each. But there are a few things that are harder to get with dx and that the legacy command gives us directly, such as the existence and address of an IrpExtension and information about a possible Mdl linked to the Irp.

Disassembler

We used the u command as an example, though in this case there actually is functionality implementing this in dx, through Debugger.Utility.Code.CreateDisassembler and DisassembleBlocks, creating iterable and searchable disassembly:

dx -r3 Debugger.Utility.Code.CreateDisassembler().DisassembleBlocks(@$curstack.Frames[1].Attributes.InstructionOffset)
Debugger.Utility.Code.CreateDisassembler().DisassembleBlocks(@$curstack.Frames[1].Attributes.InstructionOffset)                
[0xfffff80752ad2b61] : Basic Block [0xfffff80752ad2b61 - 0xfffff80752ad2b67)
StartAddress : 0xfffff80752ad2b61
EndAddress : 0xfffff80752ad2b67
Instructions
[0xfffff80752ad2b61] : add rsp,30h
[0xfffff80752ad2b65] : pop rbx
[0xfffff80752ad2b66] : ret
[0xfffff80752ad2b67] : Basic Block [0xfffff80752ad2b67 - 0xfffff80752ad2b7e)
StartAddress : 0xfffff80752ad2b67
EndAddress : 0xfffff80752ad2b7e
Instructions
[0xfffff80752ad2b67] : lea r8,[rbx+70h]
[0xfffff80752ad2b6b] : mov rdx,rdi
[0xfffff80752ad2b6e] : lea rcx,[rbx+60h]
[0xfffff80752ad2b72] : mov r10,qword ptr [kdnic!__imp_ExInterlockedInsertTailList (fffff80752ad6150)]
[0xfffff80752ad2b79] : call ntkrnlmp!ExInterlockedInsertTailList (fffff8074ea56590)
[0xfffff80752ad2b7e] : Basic Block [0xfffff80752ad2b7e - 0xfffff80752ad2b80)
StartAddress : 0xfffff80752ad2b7e
EndAddress : 0xfffff80752ad2b80
Instructions
[0xfffff80752ad2b7e] : jmp kdnic!TXTransmitQueuedSends+0xd0 (fffff80752ad2b0c)
[0xfffff80752ad2b80] : Basic Block [0xfffff80752ad2b80 - 0xfffff80752ad2b81)
StartAddress : 0xfffff80752ad2b80
EndAddress : 0xfffff80752ad2b81
Instructions
...

And the cleaned-up version, picking only the instructions and flattening the tree:

dx -r2 Debugger.Utility.Code.CreateDisassembler().DisassembleBlocks(@$curstack.Frames[1].Attributes.InstructionOffset).Select(b => b.Instructions).Flatten()
Debugger.Utility.Code.CreateDisassembler().DisassembleBlocks(@$curstack.Frames[1].Attributes.InstructionOffset).Select(b => b.Instructions).Flatten()                
[0x0]
[0xfffff80752ad2b61] : add rsp,30h
[0xfffff80752ad2b65] : pop rbx
[0xfffff80752ad2b66] : ret
[0x1]
[0xfffff80752ad2b67] : lea r8,[rbx+70h]
[0xfffff80752ad2b6b] : mov rdx,rdi
[0xfffff80752ad2b6e] : lea rcx,[rbx+60h]
[0xfffff80752ad2b72] : mov r10,qword ptr [kdnic!__imp_ExInterlockedInsertTailList (fffff80752ad6150)]
[0xfffff80752ad2b79] : call ntkrnlmp!ExInterlockedInsertTailList (fffff8074ea56590)
[0x2]
[0xfffff80752ad2b7e] : jmp kdnic!TXTransmitQueuedSends+0xd0 (fffff80752ad2b0c)
[0x3]
[0xfffff80752ad2b80] : int 3
[0x4]
[0xfffff80752ad2b81] : int 3
...

Synthetic Methods

Another functionality that we get with this debugger data model is to create functions of our own and use them, with this syntax:

0: kd> dx @$multiplyByThree = (x => x * 3)
@$multiplyByThree = (x => x * 3)
0: kd> dx @$multiplyByThree(5)
@$multiplyByThree(5) : 15

Or we can have functions taking multiple arguments:

0: kd> dx @$add = ((x, y) => x + y)
@$add = ((x, y) => x + y)
0: kd> dx @$add(5, 7)
@$add(5, 7)      : 12

Or if we want to really go a few levels up, we can apply these functions to the disassembly output we saw earlier to find all writes into memory in ZwSetInformationProcess. For that there are a few checks we need to apply to each instruction to know whether or not it’s a write into memory:

· Does it have at least 2 operands?
For example, ret will have zero and jmp <address> will have one. We only care about cases where one value is being written into some location, which will always require two operands. To verify that we will check for each instruction Operands.Count() > 1.

· Is this a memory reference?
We are only interested in writes into memory and want to ignore instructions like mov r10, rcx. To do that, we will check for each instruction that its Operands[0].Attributes.IsMemoryReference == true.
We check Operands[0] because that will be the destination. If we wanted to find memory reads we would have checked the source, which is in Operands[1].

· Is the destination operand an output?
We want to filter out instructions where memory is referenced but not written into. To check that we will use Operands[0].Attributes.IsOutput == true.

· As our last filter we want to ignore memory writes into the stack, which will look like mov [rsp+0x18], 1 or mov [rbp-0x10], rdx.
We will check the register of the first operand and make sure its index is not the rsp index (0x14) or rbp index (0x15).

We will write a function, @$isMemWrite, that receives a block and only returns the instructions that contain a memory write, based on these checks. Then we can create a disassembler, disassemble our target function and only print the memory writes in it:

dx -r0 @$rspId = 0x14
dx -r0 @$rbpId = 0x15
dx -r0 @$isMemWrite = (b => b.Instructions.Where(i => i.Operands.Count() > 1 && i.Operands[0].Attributes.IsOutput && i.Operands[0].Registers[0].Id != @$rspId && i.Operands[0].Registers[0].Id != @$rbpId && i.Operands[0].Attributes.IsMemoryReference))
dx -r0 @$findMemWrite = (a => Debugger.Utility.Code.CreateDisassembler().DisassembleBlocks(a).Select(b => @$isMemWrite(b)))
dx -r2 @$findMemWrite(&nt!ZwSetInformationProcess).Where(b => b.Count() != 0)
@$findMemWrite(&nt!ZwSetInformationProcess).Where(b => b.Count() != 0)                
[0xfffff8074ebd23d4]
[0xfffff8074ebd23e9] : mov qword ptr [r10+80h],rax
[0xfffff8074ebd23f5] : mov qword ptr [r10+44h],rax
[0xfffff8074ebd2415]
[0xfffff8074ebd2421] : mov qword ptr [r10+98h],r8
[0xfffff8074ebd2428] : mov qword ptr [r10+0F8h],r9
[0xfffff8074ebd2433] : mov byte ptr gs:[5D18h],al
[0xfffff8074ebd25b2]
[0xfffff8074ebd25c3] : mov qword ptr [rcx],rax
[0xfffff8074ebd25c9] : mov qword ptr [rcx+8],rax
[0xfffff8074ebd25d0] : mov qword ptr [rcx+10h],rax
[0xfffff8074ebd25d7] : mov qword ptr [rcx+18h],rax
[0xfffff8074ebd25df] : mov qword ptr [rcx+0A0h],rax
[0xfffff8074ebd263f]
[0xfffff8074ebd264f] : and byte ptr [rax+5],0FDh
[0xfffff8074ebd26da]
[0xfffff8074ebd26e3] : mov qword ptr [rcx],rax
[0xfffff8074ebd26e9] : mov qword ptr [rcx+8],rax
[0xfffff8074ebd26f0] : mov qword ptr [rcx+10h],rax
[0xfffff8074ebd26f7] : mov qword ptr [rcx+18h],rax
[0xfffff8074ebd26ff] : mov qword ptr [rcx+0A0h],rax
[0xfffff8074ebd2708] : mov word ptr [rcx+72h],ax
...
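
The four checks are easier to reason about when written out one per line. The following Python model applies the same logic to a handful of hand-built instructions. The Operand and Instruction classes are hypothetical stand-ins for the debugger's objects, and the r10 and rcx register ids are invented; only the rsp (0x14) and rbp (0x15) ids come from the text above:

```python
from dataclasses import dataclass, field
from typing import List

RSP_ID = 0x14  # register ids given in the text above
RBP_ID = 0x15
R10_ID = 0x0A  # hypothetical id, for illustration only
RCX_ID = 0x02  # hypothetical id, for illustration only


@dataclass
class Operand:
    is_output: bool
    is_memory_reference: bool
    register_ids: List[int] = field(default_factory=list)


@dataclass
class Instruction:
    text: str
    operands: List[Operand] = field(default_factory=list)


def is_mem_write(ins: Instruction) -> bool:
    # 1. A write needs a destination and a source.
    if len(ins.operands) < 2:
        return False
    dest = ins.operands[0]
    # 2 and 3. The destination must be memory, and must be written to.
    if not (dest.is_memory_reference and dest.is_output):
        return False
    # 4. Skip writes addressed through the stack registers.
    first_reg = dest.register_ids[0] if dest.register_ids else None
    return first_reg not in (RSP_ID, RBP_ID)


source = Operand(False, False, [RCX_ID])
block = [
    Instruction("ret"),
    Instruction("jmp kdnic!TXTransmitQueuedSends+0xd0", [Operand(False, False)]),
    Instruction("mov r10,rcx", [Operand(True, False, [R10_ID]), source]),
    Instruction("cmp qword ptr [r10+8],rcx", [Operand(False, True, [R10_ID]), source]),
    Instruction("mov qword ptr [rsp+18h],rcx", [Operand(True, True, [RSP_ID]), source]),
    Instruction("mov qword ptr [r10+80h],rcx", [Operand(True, True, [R10_ID]), source]),
]

writes = [ins.text for ins in block if is_mem_write(ins)]
print(writes)
```

Unlike the dx one-liner, this model tolerates a destination operand with no registers at all, which is a choice made here for robustness rather than something taken from the original query.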

As another project combining almost everything mentioned here, we can try to create a version of !apc using dx. To simplify we will only look for kernel APCs. To do that, we have a few steps:

  • Iterate over all the processes using @$cursession.Processes to find the ones containing threads where KTHREAD.ApcState.KernelApcPending is set to 1.
  • Make a container in the process with only the threads that have pending kernel APCs. Ignore the rest.
  • For each of these threads, iterate over KTHREAD.ApcState.ApcListHead[0] (contains the kernel APCs) and gather interesting information about them. We can do that with the FromListEntry() method we’ve seen earlier.
    To make our container as similar as possible to !apc, we will only get KernelRoutine and RundownRoutine, though in your implementation you might find there are other fields that interest you as well.
  • To make the container easier to navigate, collect process name, ID and EPROCESS address, and thread ID and ETHREAD address.
  • In our implementation we implemented a few helper functions:
    @$printLn: runs the legacy command ln with the supplied address, to get information about the symbol
    @$extractBetween: extracts a string between two other strings, will be used for getting a substring from the output of @$printLn
    @$printSymbol: sends an address to @$printLn and extracts only the symbol name using @$extractBetween
    @$apcsForThread: finds all kernel APCs for a thread and creates a container with their KernelRoutine and RundownRoutine.

We then got all the processes that have threads with pending kernel APCs and saved them into the @$procWithKernelApc register, and then in a separate command got the APC information using @$apcsForThread. We also cast the EPROCESS and ETHREAD pointers to void* so dx doesn’t print the whole structure when we print the final result.

This was our way of solving this problem, but there can be others, and yours doesn’t have to be identical to ours!

The script we came up with is:

dx -r0 @$printLn = (a => Debugger.Utility.Control.ExecuteCommand("ln "+((__int64)a).ToDisplayString("x")))
dx -r0 @$extractBetween = ((x,y,z) => x.Substring(x.IndexOf(y) + y.Length, x.IndexOf(z) - x.IndexOf(y) - y.Length))
dx -r0 @$printSymbol = (a => @$extractBetween(@$printLn(a)[3], " ", "|"))
dx -r0 @$apcsForThread = (t => new {TID = t.Id, Object = (void*)&t.KernelObject, Apcs = Debugger.Utility.Collections.FromListEntry(*(nt!_LIST_ENTRY*)&t.KernelObject.Tcb.ApcState.ApcListHead[0], "nt!_KAPC", "ApcListEntry").Select(a => new { Kernel = @$printSymbol(a.KernelRoutine), Rundown = @$printSymbol(a.RundownRoutine)})})
dx -r0 @$procWithKernelApc = @$cursession.Processes.Select(p => new {Name = p.Name, PID = p.Id, Object = (void*)&p.KernelObject, ApcThreads = p.Threads.Where(t => t.KernelObject.Tcb.ApcState.KernelApcPending != 0)}).Where(p => p.ApcThreads.Count() != 0)
dx -r6 @$procWithKernelApc.Select(p => new { Name = p.Name, PID = p.PID, Object = p.Object, ApcThreads = p.ApcThreads.Select(t => @$apcsForThread(t))})
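
The index arithmetic in @$extractBetween is the densest part of the script, so here is the same calculation as a Python sketch. Substring(start, length) becomes a slice from start to start + length. The sample line is hypothetical and only shaped roughly like the line of ln output that @$printSymbol picks, with a made-up neighbouring symbol:

```python
def extract_between(text, left, right):
    """Return the part of text after the first left and before the first right."""
    start = text.index(left) + len(left)
    length = text.index(right) - text.index(left) - len(left)
    return text[start:start + length]


# Hypothetical sample, loosely modelled on a line of ln output.
sample = "(fffff807`4eb86010)   nt!EmpCheckErrataList   |  (fffff807`4eb86020)   nt!SomeOtherSymbol"

raw = extract_between(sample, " ", "|")
symbol = raw.strip()
print(repr(raw))
print(symbol)
```

With this sample the extracted text keeps the padding spaces around the name, so a strip-like step helps if the result is going to be compared against other strings. Both delimiters are searched from the start of the text, as in the dx version, so the right delimiter is assumed to appear after the left one.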

And it produces the following output:

dx -r6 @$procWithKernelApc.Select(p => new { Name = p.Name, PID = p.PID, Object = p.Object, ApcThreads = p.ApcThreads.Select(t => @$apcsForThread(t))})
@$procWithKernelApc.Select(p => new { Name = p.Name, PID = p.PID, Object = p.Object, ApcThreads = p.ApcThreads.Select(t => @$apcsForThread(t))})                
[0x15b8]
Name : SearchUI.exe
Length : 0xc
PID : 0x15b8
Object : 0xffffb90a5b1300c0 [Type: void *]
ApcThreads
[0x159c]
TID : 0x159c
Object : 0xffffb90a5b14f080 [Type: void *]
Apcs
[0x0]
Kernel : nt!EmpCheckErrataList
Rundown : nt!EmpCheckErrataList
[0x1528]
TID : 0x1528
Object : 0xffffb90a5aa6b080 [Type: void *]
Apcs
[0x0]
Kernel : nt!EmpCheckErrataList
Rundown : nt!EmpCheckErrataList
[0x16b4]
TID : 0x16b4
Object : 0xffffb90a59f1e080 [Type: void *]
Apcs
[0x0]
Kernel : nt!EmpCheckErrataList
Rundown : nt!EmpCheckErrataList
[0x16a0]
TID : 0x16a0
Object : 0xffffb90a5b141080 [Type: void *]
Apcs
[0x0]
Kernel : nt!EmpCheckErrataList
Rundown : nt!EmpCheckErrataList
[0x16b8]
TID : 0x16b8
Object : 0xffffb90a5aab20c0 [Type: void *]
Apcs
[0x0]
Kernel : nt!EmpCheckErrataList
Rundown : nt!EmpCheckErrataList
[0x1740]
TID : 0x1740
Object : 0xffffb90a5ab362c0 [Type: void *]
Apcs
[0x0]
Kernel : nt!EmpCheckErrataList
Rundown : nt!EmpCheckErrataList
[0x1780]
TID : 0x1780
Object : 0xffffb90a5b468080 [Type: void *]
Apcs
[0x0]
Kernel : nt!EmpCheckErrataList
Rundown : nt!EmpCheckErrataList
[0x1778]
TID : 0x1778
Object : 0xffffb90a5b6f7080 [Type: void *]
Apcs
[0x0]
Kernel : nt!EmpCheckErrataList
Rundown : nt!EmpCheckErrataList
[0x17d0]
TID : 0x17d0
Object : 0xffffb90a5b1e8080 [Type: void *]
Apcs
[0x0]
Kernel : nt!EmpCheckErrataList
Rundown : nt!EmpCheckErrataList
[0x17d4]
TID : 0x17d4
Object : 0xffffb90a5b32f080 [Type: void *]
Apcs
[0x0]
Kernel : nt!EmpCheckErrataList
Rundown : nt!EmpCheckErrataList
[0x17f8]
TID : 0x17f8
Object : 0xffffb90a5b32e080 [Type: void *]
Apcs
[0x0]
Kernel : nt!EmpCheckErrataList
Rundown : nt!EmpCheckErrataList
[0xb28]
TID : 0xb28
Object : 0xffffb90a5b065600 [Type: void *]
Apcs
[0x0]
Kernel : nt!EmpCheckErrataList
Rundown : nt!EmpCheckErrataList
[0x1850]
TID : 0x1850
Object : 0xffffb90a5b6a5080 [Type: void *]
Apcs
[0x0]
Kernel : nt!EmpCheckErrataList
Rundown : nt!EmpCheckErrataList

Not as pretty as !apc, but still pretty

We can also print it as a table, receive information about the processes and be able to explore the APCs of each process separately:

dx -g @$procWithKernelApc.Select(p => new { Name = p.Name, PID = p.PID, Object = p.Object, ApcThreads = p.ApcThreads.Select(t => @$apcsForThread(t))})

But wait, what are all these APCs withnt!EmpCheckErrataList? And why does SearchUI.exe have all of them? What does this process have to do with erratas?

The secret is that these are not actually APCs meant to call nt!EmpCheckErrataList. And no, the symbols are not wrong.

The thing we see here is happening because the compiler is being smart: when it sees a few different functions that have the same code, it makes them all point to the same piece of code, instead of duplicating this code multiple times. You might think that this is not a thing that would happen very often, but let’s look at the disassembly for nt!EmpCheckErrataList (the old way this time):

u EmpCheckErrataList
nt!EmpCheckErrataList:
fffff807`4eb86010 c20000 ret 0
fffff807`4eb86013 cc int 3
fffff807`4eb86014 cc int 3
fffff807`4eb86015 cc int 3
fffff807`4eb86016 cc int 3

This is actually just a stub. It might be a function that has not been implemented yet (probably the case for this one) or a function that is meant to be a stub for a good reason. The function that is the real KernelRoutine/RundownRoutine of these APCs is nt!KiSchedulerApcNop, and is meant to be a stub on purpose, and has been for many years. And we can see it has the same code and points to the same address:

u nt!KiSchedulerApcNop
nt!EmpCheckErrataList:
fffff807`4eb86010 c20000 ret 0
fffff807`4eb86013 cc int 3
fffff807`4eb86014 cc int 3
fffff807`4eb86015 cc int 3
fffff807`4eb86016 cc int 3
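
To make the folding concrete, here is a minimal C sketch. The two function names are hypothetical stand-ins, not the real kernel routines. Both functions have identical bodies, so a linker that folds identical code (for example MSVC with /OPT:ICF) is allowed to keep a single copy. Once that happens both names resolve to one address, and the debugger shows whichever symbol it happens to pick:

```c
#include <assert.h>

/* Hypothetical stand-ins for two unrelated stubs whose bodies are identical. */
void EmpCheckErrataListLike(void) { }
void KiSchedulerApcNopLike(void) { }

typedef void (*stub_fn)(void);

/* Returns 1 if both names ended up at the same address, 0 otherwise.
   The answer depends entirely on the compiler and linker settings. */
int stubs_share_address(void)
{
    /* volatile keeps the compiler from deciding the comparison at build time */
    volatile stub_fn first = EmpCheckErrataListLike;
    volatile stub_fn second = KiSchedulerApcNopLike;

    assert(first != 0 && second != 0);
    return first == second;
}
```

With folding disabled this normally returns 0. With folding enabled it may return 1, which is the situation we are looking at in the kernel.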

So why do we see so many of these APCs?

When a thread is being suspended, the system creates a semaphore and queues an APC to the thread that will wait on that semaphore. The thread will be waiting until someone asks to resume it, and then the system will free the semaphore and the thread will stop waiting and will resume. The APC itself doesn’t need to do much, but it must have a KernelRoutine and a RundownRoutine, so the system places a stub there. In the symbols this stub receives the name of one of the functions that have this “code”, this time nt!EmpCheckErrataList, but it can be a different one in the next version.

Anyone interested in the suspension mechanism can look at ReactOS. The code for these functions changed a bit since, and the stub function was renamed from KiSuspendNop to KiSchedulerApcNop, but the general design stayed similar.

But I got distracted, this is not what this blog was supposed to be talking about. Let’s get back to WinDbg and synthetic functions:

Synthetic Types

After covering synthetic methods, we can also add our own named types and use them to parse data where the type is not available to us.

For example, let’s try to print the PspCreateProcessNotifyRoutine array, which holds all the registered process notify routines: functions that are registered by drivers and will receive a notification whenever a process starts. But this array doesn’t contain pointers to the registered routines. Instead it contains pointers to the undocumented EX_CALLBACK_ROUTINE_BLOCK structure.

So to parse this array, we need to make sure WinDbg knows this type — to do that we use Synthetic Types. We start by creating a header file containing all the types we want to define (I used c:\temp\header.h). In this case it’s just EX_CALLBACK_ROUTINE_BLOCK, that we can find in ReactOS:

typedef struct _EX_CALLBACK_ROUTINE_BLOCK
{
    _EX_RUNDOWN_REF RundownProtect;
    void* Function;
    void* Context;
} EX_CALLBACK_ROUTINE_BLOCK, *PEX_CALLBACK_ROUTINE_BLOCK;

Now we can ask WinDbg to load it and add the types to the nt module:

dx Debugger.Utility.Analysis.SyntheticTypes.ReadHeader("c:\\temp\\header.h", "nt")
Debugger.Utility.Analysis.SyntheticTypes.ReadHeader("c:\\temp\\header.h", "nt")                 : ntkrnlmp.exe(header.h)
ReturnEnumsAsObjects : false
RegisterSyntheticTypeModels : false
Module : ntkrnlmp.exe
Header : header.h
Types

This gives us an object which lets us see all the types added to this module.
Now that we defined the type, we can use it with CreateInstance:

dx Debugger.Utility.Analysis.SyntheticTypes.CreateInstance("_EX_CALLBACK_ROUTINE_BLOCK", *(__int64*)&nt!PspCreateProcessNotifyRoutine)
Debugger.Utility.Analysis.SyntheticTypes.CreateInstance("_EX_CALLBACK_ROUTINE_BLOCK", *(__int64*)&nt!PspCreateProcessNotifyRoutine)                
RundownProtect [Type: _EX_RUNDOWN_REF]
Function : 0xfffff8074cbdff50 [Type: void *]
Context : 0x6e496c4102031400 [Type: void *]

It’s important to notice that CreateInstance only takes __int64 inputs so any other type has to be cast. It’s good to know this in advance because the error messages these modules return are not always easy to understand.

Now, if we look at our output, and specifically at Context, something seems weird. And actually if we try to dump Function we will see it doesn’t point to any code:

dq 0xfffff8074cbdff50
fffff807`4cbdff50 ????????`???????? ????????`????????
fffff807`4cbdff60 ????????`???????? ????????`????????
fffff807`4cbdff70 ????????`???????? ????????`????????
fffff807`4cbdff80 ????????`???????? ????????`????????
fffff807`4cbdff90 ????????`???????? ????????`????????

So what happened?
The problem is not our cast to EX_CALLBACK_ROUTINE_BLOCK, but the address we are casting. If we dump the values in PspCreateProcessNotifyRoutine we might see what it is:

dx ((void**[0x40])&nt!PspCreateProcessNotifyRoutine).Where(a => a != 0)
((void**[0x40])&nt!PspCreateProcessNotifyRoutine).Where(a => a != 0)                
[0] : 0xffffb90a530504ef [Type: void * *]
[1] : 0xffffb90a532a512f [Type: void * *]
[2] : 0xffffb90a53da9d5f [Type: void * *]
[3] : 0xffffb90a53da9ccf [Type: void * *]
[4] : 0xffffb90a53e5d15f [Type: void * *]
[5] : 0xffffb90a571469ef [Type: void * *]
[6] : 0xffffb90a5714722f [Type: void * *]
[7] : 0xffffb90a571473df [Type: void * *]
[8] : 0xffffb90a597d989f [Type: void * *]

The lowest half-byte in all of these is 0xF, while we know that pointers in x64 machines are always aligned to 8 bytes, and usually to 0x10. This is because I oversimplified it earlier: these are not pointers to EX_CALLBACK_ROUTINE_BLOCK, they are actually EX_CALLBACK structures (another type that is not in the public pdb), each containing an EX_FAST_REF, which keeps a small reference count in the low bits of the pointer. But to make this example simpler we will treat them as simple pointers that have been ORed with 0xF, since this is good enough for our purposes. If you ever choose to write a driver that will handle PspCreateProcessNotifyRoutine please do not use this hack, look into ReactOS and do things properly. 😊
So to fix our command we just need to align the addresses to 0x10 before casting them. To do that we do:

<address> & 0xFFFFFFFFFFFFFFF0

Or the nicer version:

<address> & ~0xF
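
For anyone who prefers to see the masking as code, here is a minimal C sketch of the same idea. The helper names are hypothetical, and the sketch only assumes what we observed above: the low 4 bits of each array entry are not part of the address.

```c
#include <assert.h>
#include <stdint.h>

/* The low 4 bits of each array entry are not part of the address. */
#define LOW_BITS_MASK 0xFULL

/* Clears the low 4 bits, aligning the value down to 0x10. */
uint64_t strip_low_bits(uint64_t value)
{
    return value & ~LOW_BITS_MASK;
}

/* Extracts the low 4 bits that were packed into the value. */
unsigned low_bits(uint64_t value)
{
    return (unsigned)(value & LOW_BITS_MASK);
}

/* Turns a non-empty array entry into the address we want to cast. */
uint64_t to_block_address(uint64_t array_entry)
{
    assert(array_entry != 0); /* empty slots are filtered out first */
    return strip_low_bits(array_entry);
}
```

Feeding it the first entry from the dump, 0xffffb90a530504ef, gives 0xffffb90a530504e0 with low bits 0xF. One thing to watch for in C: a plain ~0xF is a signed int and gets sign-extended to 64 bits, so it works, but ~0xFU would be zero-extended and would wipe out the upper half of the address.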

Let’s use that in our command:

dx Debugger.Utility.Analysis.SyntheticTypes.CreateInstance("_EX_CALLBACK_ROUTINE_BLOCK", (*(__int64*)&nt!PspCreateProcessNotifyRoutine) & ~0xf)
Debugger.Utility.Analysis.SyntheticTypes.CreateInstance("_EX_CALLBACK_ROUTINE_BLOCK", (*(__int64*)&nt!PspCreateProcessNotifyRoutine) & ~0xf)                
RundownProtect [Type: _EX_RUNDOWN_REF]
Function : 0xfffff8074ea7f310 [Type: void *]
Context : 0x0 [Type: void *]

This looks better. Let’s check that Function actually points to a function this time:

ln 0xfffff8074ea7f310
Browse module
Set bu breakpoint
(fffff807`4ea7f310)   nt!ViCreateProcessCallback   |  (fffff807`4ea7f330)   nt!RtlStringCbLengthW
Exact matches:
nt!ViCreateProcessCallback (void)

Looks much better! Now we can define this cast as a synthetic method and get the function addresses for all routines in the array:

dx -r0 @$getCallbackRoutine = (a => Debugger.Utility.Analysis.SyntheticTypes.CreateInstance("_EX_CALLBACK_ROUTINE_BLOCK", (__int64)(a & ~0xf)))
dx ((void**[0x40])&nt!PspCreateProcessNotifyRoutine).Where(a => a != 0).Select(a => @$getCallbackRoutine(a).Function)
((void**[0x40])&nt!PspCreateProcessNotifyRoutine).Where(a => a != 0).Select(a => @$getCallbackRoutine(a).Function)                
[0] : 0xfffff8074ea7f310 [Type: void *]
[1] : 0xfffff8074ff97220 [Type: void *]
[2] : 0xfffff80750a41330 [Type: void *]
[3] : 0xfffff8074f8ab420 [Type: void *]
[4] : 0xfffff8075106d9f0 [Type: void *]
[5] : 0xfffff807516dd930 [Type: void *]
[6] : 0xfffff8074ff252c0 [Type: void *]
[7] : 0xfffff807520b6aa0 [Type: void *]
[8] : 0xfffff80753a63cf0 [Type: void *]

But this would be more fun if we could see the symbols instead of the addresses. We already know how to get the symbols by executing the legacy command ln, but this time we will do it with .printf. First we will write a helper function @$getsym which will run the command .printf "%y", <address>:

dx -r0 @$getsym = (x => Debugger.Utility.Control.ExecuteCommand(".printf\"%y\", " + ((__int64)x).ToDisplayString("x"))[0])

Then we will send every function address to this method, to print the symbol:

dx ((void**[0x40])&nt!PspCreateProcessNotifyRoutine).Where(a => a != 0).Select(a => @$getsym(@$getCallbackRoutine(a).Function))
((void**[0x40])&nt!PspCreateProcessNotifyRoutine).Where(a => a != 0).Select(a => @$getsym(@$getCallbackRoutine(a).Function))                
[0] : nt!ViCreateProcessCallback (fffff807`4ea7f310)
[1] : cng!CngCreateProcessNotifyRoutine (fffff807`4ff97220)
[2] : WdFilter!MpCreateProcessNotifyRoutineEx (fffff807`50a41330)
[3] : ksecdd!KsecCreateProcessNotifyRoutine (fffff807`4f8ab420)
[4] : tcpip!CreateProcessNotifyRoutineEx (fffff807`5106d9f0)
[5] : iorate!IoRateProcessCreateNotify (fffff807`516dd930)
[6] : CI!I_PEProcessNotify (fffff807`4ff252c0)
[7] : dxgkrnl!DxgkProcessNotify (fffff807`520b6aa0)
[8] : peauth+0x43cf0 (fffff807`53a63cf0)

There, much nicer!

Breakpoints

Conditional Breakpoint

Conditional breakpoints are a huge pain-point when debugging. And with the old MASM syntax they’re almost impossible to use. I spent hours trying to get them to work the way I wanted to, but the command turns out to be so awful that I can’t even understand what I was trying to do, not to mention why it doesn’t filter anything or how to fix it.

Well, those days are over. We can now use dx queries for conditional breakpoints with the following syntax: bp /w "dx query" <address>.

For example, let’s say we are trying to debug an issue involving process opens by Wow64 processes. The function NtOpenProcess is called all the time, but we only care about calls done by Wow64 processes, which are not the majority of processes on modern systems. So to avoid helplessly going through 100 debugger breaks until we get lucky, or struggling with MASM-style conditional breakpoints, we can do this instead:

bp /w "@$curprocess.KernelObject.WoW64Process != 0" nt!NtOpenProcess

We then let the machine run, and when the breakpoint is hit we can check if it worked:

Breakpoint 3 hit
nt!NtOpenProcess:
fffff807`2e96b7e0 4883ec38 sub rsp,38h
dx @$curprocess.KernelObject.WoW64Process
@$curprocess.KernelObject.WoW64Process                 : 0xffffc10f5163b390 [Type: _EWOW64PROCESS *]
[+0x000] Peb : 0xf88000 [Type: void *]
[+0x008] Machine : 0x14c [Type: unsigned short]
[+0x00c] NtdllType : PsWowX86SystemDll (1) [Type: _SYSTEM_DLL_TYPE]
dx @$curprocess.Name
@$curprocess.Name : IpOverUsbSvc.exe
Length : 0x10

The process that triggered our breakpoint is a WoW64 process!
For anyone who has ever tried using conditional breakpoints with MASM, this is a life-changing addition.

Other Breakpoint Options

There are a few other interesting breakpoint options found under Debugger.Utility.Control:

  • SetBreakpointAtSourceLocation: allows us to set a breakpoint in a module whose source file is available to us, with this syntax: dx Debugger.Utility.Control.SetBreakpointAtSourceLocation("MyModule!myFile.cpp", "172")
  • SetBreakpointAtOffset: sets a breakpoint at an offset inside a function: dx Debugger.Utility.Control.SetBreakpointAtOffset("NtOpenFile", 8, "nt")
  • SetBreakpointForReadWrite: similar to the legacy ba command but with more readable syntax, this lets us set a breakpoint that issues a debug break whenever anyone reads or writes to an address. Its default configuration is type = Hardware Write and size = 1.
    For example, let’s try to break on every read of Ci!g_CiOptions, a variable whose size is 4 bytes:
dx Debugger.Utility.Control.SetBreakpointForReadWrite(&Ci!g_CiOptions, "Hardware Read", 0x4)

We let the machine keep running and almost immediately our breakpoint is hit:

0: kd> g
Breakpoint 0 hit
CI!CiValidateImageHeader+0x51b:
fffff807`2f6fcb1b 740c je CI!CiValidateImageHeader+0x529 (fffff807`2f6fcb29)

CI!CiValidateImageHeader read this global variable when validating an image header. In this specific example, we will see reads of this variable very often. Writes into it are the more interesting case, as they can show us an attempt to tamper with signature validation.

An interesting thing to notice about these commands is that they don’t just set a breakpoint, they actually return it as an object we can control, which has attributes like IsEnabled, Condition (allowing us to set a condition), PassCount (telling us how many times this breakpoint has been hit) and more.

FileSystem

Under Debugger.Utility we have the FileSystem module, letting us query and control the file system on the host machine (not the machine we are debugging) from within the debugger:

dx -r1 Debugger.Utility.FileSystem
Debugger.Utility.FileSystem
CreateFile       [CreateFile(path, [disposition]) - Creates a file at the specified path and returns a file object.  'disposition' can be one of 'CreateAlways' or 'CreateNew']
CreateTempFile   [CreateTempFile() - Creates a temporary file in the %TEMP% folder and returns a file object]
CreateTextReader [CreateTextReader(file | path, [encoding]) - Creates a text reader over the specified file.  If a path is passed instead of a file, a file is opened at the specified path.  'encoding' can be 'Utf16', 'Utf8', or 'Ascii'.  'Ascii' is the default]
CreateTextWriter [CreateTextWriter(file | path, [encoding]) - Creates a text writer over the specified file.  If a path is passed instead of a file, a file is created at the specified path.  'encoding' can be 'Utf16', 'Utf8', or 'Ascii'.  'Ascii' is the default]
CurrentDirectory : C:\WINDOWS\system32
DeleteFile       [DeleteFile(path) - Deletes a file at the specified path]
FileExists       [FileExists(path) - Checks for the existance of a file at the specified path]
OpenFile         [OpenFile(path) - Opens a file read/write at the specified path]
TempDirectory    : C:\Users\yshafir\AppData\Local\Temp

We can create files, open them, write into them, delete them or check if a file exists in a certain path. To see a simple example, let’s dump the contents of our current directory — C:\Windows\System32:

dx -r1 Debugger.Utility.FileSystem.CurrentDirectory.Files
Debugger.Utility.FileSystem.CurrentDirectory.Files                
[0x0] : C:\WINDOWS\system32\07409496-a423-4a3e-b620-2cfb01a9318d_HyperV-ComputeNetwork.dll
[0x1] : C:\WINDOWS\system32\1
[0x2] : C:\WINDOWS\system32\103
[0x3] : C:\WINDOWS\system32\108
[0x4] : C:\WINDOWS\system32\11
[0x5] : C:\WINDOWS\system32\113
...
[0x44] : C:\WINDOWS\system32\93
[0x45] : C:\WINDOWS\system32\98
[0x46] : C:\WINDOWS\system32\@AppHelpToast.png
[0x47] : C:\WINDOWS\system32\@AudioToastIcon.png
[0x48] : C:\WINDOWS\system32\@BackgroundAccessToastIcon.png
[0x49] : C:\WINDOWS\system32\@bitlockertoastimage.png
[0x4a] : C:\WINDOWS\system32\@edptoastimage.png
[0x4b] : C:\WINDOWS\system32\@EnrollmentToastIcon.png
[0x4c] : C:\WINDOWS\system32\@language_notification_icon.png
[0x4d] : C:\WINDOWS\system32\@optionalfeatures.png
[0x4e] : C:\WINDOWS\system32\@VpnToastIcon.png
[0x4f] : C:\WINDOWS\system32\@WiFiNotificationIcon.png
[0x50] : C:\WINDOWS\system32\@windows-hello-V4.1.gif
[0x51] : C:\WINDOWS\system32\@WindowsHelloFaceToastIcon.png
[0x52] : C:\WINDOWS\system32\@WindowsUpdateToastIcon.contrast-black.png
[0x53] : C:\WINDOWS\system32\@WindowsUpdateToastIcon.contrast-white.png
[0x54] : C:\WINDOWS\system32\@WindowsUpdateToastIcon.png
[0x55] : C:\WINDOWS\system32\@WirelessDisplayToast.png
[0x56] : C:\WINDOWS\system32\@WwanNotificationIcon.png
[0x57] : C:\WINDOWS\system32\@WwanSimLockIcon.png
[0x58] : C:\WINDOWS\system32\aadauthhelper.dll
[0x59] : C:\WINDOWS\system32\aadcloudap.dll
[0x5a] : C:\WINDOWS\system32\aadjcsp.dll
[0x5b] : C:\WINDOWS\system32\aadtb.dll
[0x5c] : C:\WINDOWS\system32\aadWamExtension.dll
[0x5d] : C:\WINDOWS\system32\AboutSettingsHandlers.dll
[0x5e] : C:\WINDOWS\system32\AboveLockAppHost.dll
[0x5f] : C:\WINDOWS\system32\accessibilitycpl.dll
[0x60] : C:\WINDOWS\system32\accountaccessor.dll
[0x61] : C:\WINDOWS\system32\AccountsRt.dll
[0x62] : C:\WINDOWS\system32\AcGenral.dll
...

We can choose to delete one of these files:

dx -r1 Debugger.Utility.FileSystem.CurrentDirectory.Files[1].Delete()

Or delete it through DeleteFile:

dx Debugger.Utility.FileSystem.DeleteFile("C:\\WINDOWS\\system32\\71")

Notice that in this module paths have to have double backslash (“\\”), as they would if we had called the Win32 API ourselves.

As a last exercise we’ll put together a few of the things we learned here. We’re going to create a breakpoint on a kernel variable, get the symbol that accessed it from the stack and write that symbol into a file on our host machine.

Let’s break it down into steps:

  • Open a file to write the results to.
  • Create a text writer, which we will use to write into the file.
  • Create a breakpoint for access into a variable. In this case we’ll choose nt!PsInitialSystemProcess and set a breakpoint for read access. We will use the old MASM syntax to run a dx command every time the breakpoint is hit and move on: ba r4 <address> "dx <command>; g"
    Our command will use @$curstack to get the address that accessed the variable, and then use the @$getsym helper function we wrote earlier to find the symbol for it. We’ll use our text writer to write the result into the file.
  • Finally, we will close the file.

Putting it all together:

dx -r0 @$getsym = (x => Debugger.Utility.Control.ExecuteCommand(".printf\"%y\", " + ((__int64)x).ToDisplayString("x"))[0])
dx @$tmpFile = Debugger.Utility.FileSystem.TempDirectory.OpenFile("log.txt")
dx @$txtWriter = Debugger.Utility.FileSystem.CreateTextWriter(@$tmpFile)
ba r4 nt!PsInitialSystemProcess "dx @$txtWriter.WriteLine(@$getsym(@$curstack.Frames[0].Attributes.InstructionOffset)); g"

We let the machine run for as long as we want, and when we want to stop the logging we can disable or clear the breakpoint and close the file with dx @$tmpFile.Close().

Now we can open our @$tmpFile and look at the results:

That’s it! What an amazingly easy way to log information about the debugger!

So that’s the end of our WinDbg series! All the scripts in this series will be uploaded to a github repo, as well as some new ones not included here. I suggest you investigate this data model further, because we didn’t even cover all the different methods it contains. Write cool tools of your own and share them with the world :)

And as long as this guide was, these are not even all the possible options in the new data model. And I didn’t even mention the new support for Javascript! You can get more information about using Javascript in WinDbg and the new and exciting support for TTD (time travel debugging) in this excellent post.

Adventures in avoiding (list) head

Working with lists is hard. I can never get them right the first time and keep finding myself having to draw them to understand how they work, or forgetting to advance to the next entry and getting stuck in a loop. Every single time. Can you believe someone is actually paying me to write code? That runs in the kernel?

Anyway, I worked a lot with lists recently in a few projects that I might publish some day when I find the inner motivation to finish them. And I had the same problem in a few of them: I didn’t start iterating over the list from its head, but from a random item, without knowing where my list head was. And knowing where the list head is can be important.

Take this example — we want to parse the kernel process list and want to get the value of Process->DiskCounters->BytesRead for each process:

This should work fine for any normal process:

But what will happen when we reach the list head?

The list head is not a part of a real EPROCESS structure and it is surrounded by other, unrelated variables. If we try to treat it like a normal EPROCESS we will read these and might try to use them as pointers and dereference them, which will crash sooner or later.

But a useful thing to remember is that there is one significant difference between the list head and the rest of the list — lists connect data structures that are allocated in the pool, while the list head will be a global variable in the driver that manages the list (in our example, ntoskrnl.exe has nt!PsActiveProcessHead as a global variable, used to access the process list).

There is no easy way that I know of to check if an address is in the pool or not, but we can use a trick and call RtlPcToFileHeader. This function receives an address and writes the base address of the image it’s in into an output parameter. So we can do:

We can also verify that the list head is inside the image it’s supposed to be in, by getting the image base address from a known symbol and comparing:
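
To illustrate the idea outside the kernel, here is a minimal user-mode C sketch. Everything in it is a toy model: toy_process stands in for EPROCESS, and in_image is a simple range check standing in for the RtlPcToFileHeader call, which is an assumption made only so the example can run anywhere. The walk starts from an arbitrary entry and skips the one entry that lives inside the “image”, which is the list head:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct list_entry {
    struct list_entry *flink;
    struct list_entry *blink;
} list_entry;

/* Toy stand-in for EPROCESS: the links are embedded inside the record. */
typedef struct {
    uint64_t   bytes_read;
    list_entry links;
} toy_process;

/* Same idea as CONTAINING_RECORD: go from the links back to the record. */
#define RECORD_OF(entry) \
    ((toy_process *)((char *)(entry) - offsetof(toy_process, links)))

void ring_init(list_entry *head)
{
    head->flink = head->blink = head;
}

void ring_insert_tail(list_entry *head, list_entry *entry)
{
    entry->flink = head;
    entry->blink = head->blink;
    head->blink->flink = entry;
    head->blink = entry;
}

/* Stand-in for the RtlPcToFileHeader check: is addr inside [base, base + size)? */
int in_image(const void *addr, const void *base, size_t size)
{
    uintptr_t a = (uintptr_t)addr;
    uintptr_t b = (uintptr_t)base;
    return a >= b && a - b < size;
}

/* Walks the whole ring from any entry and sums bytes_read,
   skipping the entry that lives inside the image (the list head). */
uint64_t sum_bytes_read(list_entry *start, const void *image_base, size_t image_size)
{
    uint64_t total = 0;
    list_entry *cur = start;

    assert(start != NULL);
    do {
        if (!in_image(cur, image_base, image_size))
            total += RECORD_OF(cur)->bytes_read;
        cur = cur->flink; /* always advance, or we loop forever */
    } while (cur != start);

    return total;
}
```

In a real driver the in_image call would be replaced by the RtlPcToFileHeader check described above. The rest of the loop stays the same, and it gives the same result no matter which entry we start from.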

Windows RS3 added the useful RtlPcToFileName function, that makes our code a bit prettier:

Yes, More Callbacks — The Kernel Extension Mechanism

Recently I had to write a kernel-mode driver. This has made a lot of people very angry and been widely regarded as a bad move. (Douglas Adams, paraphrased)

Like any other piece of code written by me, this driver had several major bugs which caused some interesting side effects. Specifically, it prevented some other drivers from loading properly and caused the system to crash.

As it turns out, many drivers assume their initialization routine (DriverEntry) is always successful, and don’t take it well when this assumption breaks. j00ru documented some of these cases a few years ago in his blog, and many of them are still relevant in current Windows versions. However, these buggy drivers are not really the issue here, and j00ru covered it better than I could anyway. Instead I focused on just one of these drivers, which caught my attention and dragged me into researching the so-called “windows kernel host extensions” mechanism.

The lucky driver is Bam.sys (Background Activity Moderator) — a new driver which was introduced in Windows 10 version 1709 (RS3). When its DriverEntry fails mid-way, the call stack leading to the system crash looks like this:

From this crash dump, we can see that Bam.sys registered a process creation callback and forgot to unregister it before unloading. Then, when a process was created / terminated, the system tried to call this callback, encountered a stale pointer and crashed.

The interesting thing here is not the crash itself, but rather how Bam.sys registers this callback. Normally, process creation callbacks are registered via nt!PsSetCreateProcessNotifyRoutine(Ex), which adds the callback to the nt!PspCreateProcessNotifyRoutine array. Then, whenever a process is being created or terminated, nt!PspCallProcessNotifyRoutines iterates over this array and calls all of the registered callbacks. However, if we run for example “!wdbgark.wa_systemcb /type process” in WinDbg, we’ll see that the callback used by Bam.sys is not found in this array.

Instead, Bam.sys uses a whole other mechanism to register its callbacks.

If we take a look at nt!PspCallProcessNotifyRoutines, we can see an explicit reference to some variable named nt!PspBamExtensionHost (there is a similar one referring to the Dam.sys driver). It retrieves a so-called “extension table” using this “extension host” and calls the first function in the extension table, which is bam!BampCreateProcessCallback.

If we open Bam.sys in IDA, we can easily find bam!BampCreateProcessCallback and search for its xrefs. Conveniently, it only has one, in bam!BampRegisterKernelExtension:

As suspected, Bam!BampCreateProcessCallback is not registered via the normal callback registration mechanism. It is actually being stored in a function table named Bam!BampKernelCalloutTable, which is later being passed, together with some other parameters (we’ll talk about them in a minute) to the undocumented nt!ExRegisterExtension function.

I tried to search for any documentation or hints for what this function was responsible for, or what this “extension” is, and couldn’t find much. The only useful resource I found was the leaked ntosifs.h header file, which contains the prototype for nt!ExRegisterExtension as well as the layout of the _EX_EXTENSION_REGISTRATION_1 structure.

Prototype for nt!ExRegisterExtension and _EX_EXTENSION_REGISTRATION_1, as supplied in ntosifs.h:

NTKERNELAPI NTSTATUS ExRegisterExtension (
    _Outptr_ PEX_EXTENSION *Extension,
    _In_ ULONG RegistrationVersion,
    _In_ PVOID RegistrationInfo
);
typedef struct _EX_EXTENSION_REGISTRATION_1 {
    USHORT ExtensionId;
    USHORT ExtensionVersion;
    USHORT FunctionCount;
    VOID *FunctionTable;
    PVOID *HostInterface;
    PVOID DriverObject;
} EX_EXTENSION_REGISTRATION_1, *PEX_EXTENSION_REGISTRATION_1;

After a bit of reverse engineering, I figured that the formal input parameter “PVOID RegistrationInfo” is actually of type PEX_EXTENSION_REGISTRATION_1.

The pseudo-code of nt!ExRegisterExtension is shown in appendix B, but here are the main points:

  1. nt!ExRegisterExtension extracts the ExtensionId and ExtensionVersion members of the RegistrationInfo structure and uses them to locate a matching host in nt!ExpHostList (using the nt!ExpFindHost function, whose pseudo-code appears in appendix B).
  2. Then, the function verifies that the number of functions supplied in RegistrationInfo->FunctionCount matches the expected number set in the host’s structure. It also makes sure that the host’s FunctionTable field has not already been initialized. Basically, this check means that an extension cannot be registered twice.
  3. If everything seems OK, the host’s FunctionTable field is set to point to the FunctionTable supplied in RegistrationInfo.
  4. Additionally, RegistrationInfo->HostInterface is set to point to some data found in the host structure. This data is interesting, and we’ll discuss it soon.
  5. Eventually, the fully initialized host is returned to the caller via an output parameter.
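
The five steps above can be sketched as a small user-mode C model. This is not the real implementation: all type names, field names and return codes here are hypothetical, and the locking, reference counting and rundown protection that the real function performs are left out. It only shows the order of the checks:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*callback_fn)(void);

/* Toy version of a host: only the fields the five steps touch. */
typedef struct toy_host {
    struct toy_host *next;
    uint16_t extension_id;
    uint16_t extension_version;
    uint16_t function_count;      /* how many callbacks this host expects */
    const void *host_interface;   /* data handed back to the driver */
    callback_fn *function_table;  /* NULL until an extension registers */
} toy_host;

/* Toy version of the registration info supplied by the driver. */
typedef struct {
    uint16_t extension_id;
    uint16_t extension_version;
    uint16_t function_count;
    callback_fn *function_table;
    const void **host_interface;  /* out: receives the host's interface */
} toy_registration;

/* Hypothetical return codes, not real NTSTATUS values. */
enum { REG_OK = 0, REG_NO_HOST = -1, REG_BAD_COUNT = -2, REG_ALREADY_REGISTERED = -3 };

/* Step 1: locate the host matching the id and version. */
toy_host *find_host(toy_host *list, uint16_t id, uint16_t version)
{
    for (toy_host *h = list; h != NULL; h = h->next)
        if (h->extension_id == id && h->extension_version == version)
            return h;
    return NULL;
}

int register_extension(toy_host *host_list, const toy_registration *info,
                       toy_host **extension)
{
    assert(info != NULL && extension != NULL);

    toy_host *host = find_host(host_list, info->extension_id,
                               info->extension_version);
    if (host == NULL)
        return REG_NO_HOST;

    /* Step 2: the function count must match and the host must be unused. */
    if (host->function_count != info->function_count)
        return REG_BAD_COUNT;
    if (host->function_table != NULL)
        return REG_ALREADY_REGISTERED;

    /* Step 3: the host now points at the driver's callbacks. */
    host->function_table = info->function_table;

    /* Step 4: hand the host interface back to the driver, if requested. */
    if (info->host_interface != NULL)
        *info->host_interface = host->host_interface;

    /* Step 5: return the host through the output parameter. */
    *extension = host;
    return REG_OK;
}
```

A second call with the same registration info fails with REG_ALREADY_REGISTERED, which matches the observation that an extension cannot be registered twice.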

We saw that nt!ExRegisterExtension searches for a host that matches RegistrationInfo. The question now is, where do these hosts come from?

  • During its initialization, NTOS performs several calls to nt!ExRegisterHost. In every call it passes a structure identifying a single driver from a list of predetermined drivers (full list in appendix A). For example, here is the call which initializes a host for Bam.sys:
  • nt!ExRegisterHost allocates a structure of type _HOST_LIST_ENTRY (unofficial name, coined by me), initializes it with data supplied by the caller, and adds it to the end of nt!ExpHostList. The _HOST_LIST_ENTRY structure is undocumented, and looks something like this:
typedef struct _HOST_LIST_ENTRY
{
    _LIST_ENTRY List;
    DWORD RefCount;
    USHORT ExtensionId;
    USHORT ExtensionVersion;
    USHORT FunctionCount;       // number of callbacks that the extension contains
    POOL_TYPE PoolType;         // where this host is allocated
    PVOID HostInterface;        // table of unexported nt functions, to be used
                                // by the driver to which this extension belongs
    PVOID FunctionAddress;      // optional, rarely used. This callback is called
                                // before and after an extension for this host
                                // is registered / unregistered
    PVOID ArgForFunction;       // will be sent to the function saved here
    _EX_RUNDOWN_REF RundownRef;
    _EX_PUSH_LOCK Lock;
    PVOID FunctionTable;        // a table of the callbacks that the driver “registers”
    DWORD Flags;                // Only uses one bit. Not sure about its meaning.
} HOST_LIST_ENTRY, *PHOST_LIST_ENTRY;
  • When one of the predetermined drivers loads, it registers an extension using nt!ExRegisterExtension and supplies a RegistrationInfo structure, containing a table of functions (as we saw Bam.sys doing). This table of functions will be placed in the FunctionTable member of the matching host. These functions will be called by NTOS on certain occasions, which makes them some kind of callbacks.

Earlier we saw that part of nt!ExRegisterExtension functionality is to set RegistrationInfo->HostInterface (which contains a global variable in the calling driver) to point to some data found in the host structure. Let’s get back to that.

Every driver which registers an extension has a host initialized for it by NTOS. This host contains, among other things, a HostInterface, pointing to a predetermined table of unexported NTOS functions. Different drivers receive different HostInterfaces, and some don’t receive one at all.

For example, this is the HostInterface that Bam.sys receives:

So the “kernel extensions” mechanism is actually a bi-directional communication port: The driver supplies a list of “callbacks”, to be called on different occasions, and receives a set of functions for its own internal use.

To stick with the example of Bam.sys, let’s take a look at the callbacks that it supplies:

  • BampCreateProcessCallback
  • BampSetThrottleStateCallback
  • BampGetThrottleStateCallback
  • BampSetUserSettings
  • BampGetUserSettingsHandle

The host initialized for Bam.sys “knows” in advance that it should receive a table of 5 functions. These functions must be laid out in the exact order presented here, since they are called according to their index. We can see this in the following case, where the function found in nt!PspBamExtensionHost->FunctionTable[4] is called:
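
Calling by index can be modeled with a short C sketch. The enum names, the single int-based callback signature and the toy callbacks are all assumptions made for illustration, since the real callbacks have different prototypes. Only the ordering of the five entries comes from the list above:

```c
#include <assert.h>
#include <stddef.h>

/* Indices follow the order of the callback list; the names are hypothetical. */
enum bam_callback_index {
    BAM_CREATE_PROCESS = 0,
    BAM_SET_THROTTLE_STATE,
    BAM_GET_THROTTLE_STATE,
    BAM_SET_USER_SETTINGS,
    BAM_GET_USER_SETTINGS_HANDLE,
    BAM_CALLBACK_COUNT
};

/* One simplified signature for every callback, for illustration only. */
typedef int (*extension_callback)(int arg);

/* Toy callbacks: each returns a value that identifies its slot. */
int toy_create_process(int arg)           { return 0 + arg; }
int toy_set_throttle_state(int arg)       { return 100 + arg; }
int toy_get_throttle_state(int arg)       { return 200 + arg; }
int toy_set_user_settings(int arg)        { return 300 + arg; }
int toy_get_user_settings_handle(int arg) { return 400 + arg; }

/* The table must be laid out in exactly the order the host expects. */
const extension_callback toy_bam_table[BAM_CALLBACK_COUNT] = {
    toy_create_process,
    toy_set_throttle_state,
    toy_get_throttle_state,
    toy_set_user_settings,
    toy_get_user_settings_handle
};

typedef struct {
    size_t function_count;
    const extension_callback *function_table; /* NULL until registered */
} toy_host;

/* Calls the callback at the given index. Returns 0 and stores the callback's
   return value in *result, or returns -1 if no table is registered or the
   index is out of range. */
int call_extension(const toy_host *host, size_t index, int arg, int *result)
{
    assert(host != NULL && result != NULL);
    if (host->function_table == NULL || index >= host->function_count)
        return -1;
    *result = host->function_table[index](arg);
    return 0;
}
```

Since the caller only knows the index, swapping two entries in the table would silently call the wrong function, which is why the layout has to match what the host expects.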

To conclude, there exists a mechanism to “extend” NTOS by means of registering specific callbacks and retrieving unexported functions to be used by certain predetermined drivers.

I don’t know if there is any practical use for this knowledge, but I thought it was interesting enough to share. If you find anything useful / interesting to do with this mechanism, I’d love to know :)

Appendix A — Extension hosts initialized by NTOS:

Appendix B — functions pseudo-code:

Appendix C — structures definitions:

typedef struct _HOST_INFORMATION
{
    USHORT ExtensionId;
    USHORT ExtensionVersion;
    DWORD FunctionCount;
    POOL_TYPE PoolType;
    PVOID HostInterface;
    PVOID FunctionAddress;
    PVOID ArgForFunction;
    PVOID unk;
} HOST_INFORMATION, *PHOST_INFORMATION;

typedef struct _HOST_LIST_ENTRY
{
    _LIST_ENTRY List;
    DWORD RefCount;
    USHORT ExtensionId;
    USHORT ExtensionVersion;
    USHORT FunctionCount;       // number of callbacks that the extension contains
    POOL_TYPE PoolType;         // where this host is allocated
    PVOID HostInterface;        // table of unexported nt functions, to be used
                                // by the driver to which this extension belongs
    PVOID FunctionAddress;      // optional, rarely used. This callback is called
                                // before and after an extension for this host
                                // is registered / unregistered
    PVOID ArgForFunction;       // will be sent to the function saved here
    _EX_RUNDOWN_REF RundownRef;
    _EX_PUSH_LOCK Lock;
    PVOID FunctionTable;        // a table of the callbacks that the driver “registers”
    DWORD Flags;                // Only uses one flag. Not sure about its meaning.
} HOST_LIST_ENTRY, *PHOST_LIST_ENTRY;

typedef struct _EX_EXTENSION_REGISTRATION_1
{
    USHORT ExtensionId;
    USHORT ExtensionVersion;
    USHORT FunctionCount;
    PVOID FunctionTable;
    PVOID *HostTable;
    PVOID DriverObject;
} EX_EXTENSION_REGISTRATION_1, *PEX_EXTENSION_REGISTRATION_1;

Yes, More Callbacks — The Kernel Extension Mechanism was originally published in Yarden_Shafir on Medium, where people are continuing the conversation by highlighting and responding to this story.
