In-Memory Disassembly for EDR/AV Unhooking

2 April 2023 at 21:33

Recap on EDR Hooks

EDR hooking isn’t new, and it’s likely not new to you (otherwise see here). There are plenty of samples online for unhooking NTDLL, though most rely on direct syscalls or on mapping a clean copy of ntdll from disk or KnownDlls.

Below we’ll walk through the hooks of a particular AV (Sophos AV) and determine why many of the public methods fail, and how we created our Rust PoC to work against the self-protection techniques of similar hooking engines.

Why Most NTDLL Unhooking Methods Fail (In This Case)

All public samples I’ve found that don’t use direct syscalls rely on using VirtualProtect (or NtProtectVirtualMemory) without avoiding any hooks placed on NtProtectVirtualMemory.

This is a chicken-and-egg problem: they need NtProtectVirtualMemory to unhook, yet NtProtectVirtualMemory itself is hooked. They’re forced to go through a hooked function in order to unhook, which gives the EDR/AV a chance to detect and prevent the operation (which Sophos AV does).

One way to avoid this (the only way that I’ve found public samples for) relies on direct syscalls. Instead of modifying memory to unhook, you ship your own syscall stubs and issue syscalls directly from your own module, so you no longer call any APIs in NTDLL and therefore never hit the NTDLL hooks.

There are a few concerns with this approach. One is that not all NTDLL functions are syscalls (see PssNtCaptureSnapshot below).

`PssNtCaptureSnapshot` in NTDLL — notably more than a syscall

Secondly, inline syscalls themselves can be flagged as suspicious, though again this is the only public method I’ve found that would work against this target.

A (Publicly) New Method: In-Memory Disassembly

It’s no secret we love Rust, so when we developed a Rust sample for unhooking against this target, we found a nice reason to publish unhooking without direct syscalls that also avoids NtProtectVirtualMemory hooks, something not found in other public samples.

This solution is based on the fact that the original code blocks that were replaced by hooks still live somewhere in memory. They have to, as the AV/EDR may permit calls to go through if it deems them legitimate.

So we use in-memory disassembly to identify patterns that lead to the original (unhooked) code blocks and find the original, unhooked NtProtectVirtualMemory function. With that in hand, we can apply the rest of our unhooking logic to remove all EDR hooks.

Let’s identify the patterns in Sophos that lead to the original unhooked blocks:

`NtProtectVirtualMemory` hook, contains a direct jump followed by an indirect jump

The start of each hooked function in ntdll is a direct JMP, followed by an indirect JMP to somewhere outside of NTDLL (in this case, into hmpalert.dll).

This means detection of hooked functions is as simple as finding the JMP at the start of the function (we can leverage this to easily enumerate all hooked functions).
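The check described above can be sketched in a few lines of C. This is a minimal illustration, not the code from our Rust PoC: it assumes the direct jump is encoded as `JMP rel32` (opcode 0xE9) and that the host is little-endian, and `decode_jmp_rel32` is a name made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* If `code` (located at virtual address `addr`) begins with JMP rel32
   (opcode 0xE9), store the jump destination in *target and return 1.
   Otherwise return 0. Assumes a little-endian host. */
int decode_jmp_rel32(const uint8_t *code, size_t len,
                     uint64_t addr, uint64_t *target)
{
    int32_t rel;

    if (len < 5 || code[0] != 0xE9)
        return 0;
    memcpy(&rel, code + 1, sizeof rel);   /* signed 32-bit displacement */
    *target = addr + 5 + (int64_t)rel;    /* relative to the next instruction */
    return 1;
}
```

An unhooked syscall stub normally starts with `mov r10, rcx` (4C 8B D1), so it fails this check, while a function whose first byte is 0xE9 is a candidate hook whose target can then be inspected further.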

To enumerate the exported functions in NTDLL (and search for hooks), we get a handle to NTDLL via the PEB, then walk NTDLL’s export table.

The next problem: How do we identify the original unhooked code blocks for a particular function? Let’s look at further disassembly in hmpalert’s hook:

Further disassembly of `NtProtectVirtualMemory`, identifying the original syscall stub

Further disassembly shows a pattern: an indirect pointer load into RAX, followed by an indirect call that JMPs to the pointer stored in RAX, which in this case is the original syscall stub for NtProtectVirtualMemory.
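To make the "pointer load into RAX" step concrete, here is a rough C sketch that resolves where such a load reads its pointer from. It is a simplification under stated assumptions: it assumes the load is encoded as `mov rax, qword ptr [rip+disp32]` (bytes 48 8B 05 followed by a 32-bit displacement), it scans raw bytes instead of using a real disassembler as our PoC does, and `find_rax_load_slot` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scan `code` (mapped at virtual address `addr`) for the first
   `mov rax, qword ptr [rip+disp32]` (48 8B 05 <disp32>) and return the
   address of the pointer slot it reads, or 0 if none is found.
   Assumes a little-endian host. */
uint64_t find_rax_load_slot(const uint8_t *code, size_t len, uint64_t addr)
{
    for (size_t i = 0; i + 7 <= len; i++) {
        if (code[i] == 0x48 && code[i + 1] == 0x8B && code[i + 2] == 0x05) {
            int32_t disp;
            memcpy(&disp, code + i + 3, sizeof disp);
            /* RIP points at the next instruction, 7 bytes after the start */
            return addr + i + 7 + (int64_t)disp;
        }
    }
    return 0;
}
```

Reading the 8 bytes stored at the returned address would give the candidate address of the original code block. A byte scan like this can match in the middle of another instruction, which is one reason to prefer proper disassembly.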

This pattern is similar for hooked non-syscall functions. Let’s look at PssNtCaptureSnapshot:

Note how the hook starts the same way, a jump followed by an indirect jump into hmpalert.dll (outside of NTDLL).

Continuing disassembly we see:

`PssNtCaptureSnapshot` original code block

We see a similar pattern here, RAX is loaded with a pointer followed by an indirect call that JMPs to the address stored in RAX, which is the original code block of PssNtCaptureSnapshot without hooks!

As we can identify these patterns to locate the unhooked original functions using a disassembler, we simply translated that logic into Rust code that uses in-memory disassembly to identify the original code blocks at runtime.

Once we locate the unhooked/original functions at runtime, we replace the hooks from the EDR/AV with our own hook that JMPs into the unpatched originals, for example:
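As an illustration of what such a replacement hook can look like, the C sketch below builds a 14-byte absolute jump: `jmp qword ptr [rip+0]` (FF 25 00 00 00 00) followed by the 8-byte target address. The exact encoding our PoC writes is not stated here, so treat this as one possible choice; `write_abs_jmp` and `ABS_JMP_SIZE` are names invented for the example, and a little-endian host is assumed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ABS_JMP_SIZE 14

/* Write `jmp qword ptr [rip+0]` followed by the 8-byte absolute target.
   Returns the number of bytes written, or 0 if `cap` is too small.
   Assumes a little-endian host. */
size_t write_abs_jmp(uint8_t *dst, size_t cap, uint64_t target)
{
    static const uint8_t opcode[6] = { 0xFF, 0x25, 0x00, 0x00, 0x00, 0x00 };

    if (cap < ABS_JMP_SIZE)
        return 0;
    memcpy(dst, opcode, sizeof opcode);
    memcpy(dst + sizeof opcode, &target, sizeof target);
    return ABS_JMP_SIZE;
}
```

This only produces the bytes. Writing them over a function in ntdll still requires making the page writable first, which is exactly what the recovered, unhooked NtProtectVirtualMemory is for. This form of jump leaves all registers untouched, at the cost of being longer than a 5-byte relative jump.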

After our unhooking, the JMPs at the beginning of the previously hooked functions redirect to the original code blocks we found in memory! We avoided direct syscalls, and we didn’t need to use any hooked APIs prior to unhooking (avoiding the previously mentioned chicken-egg problem).

Sample Code

Our sample code can be found here: https://github.com/Signal-Labs/iat_unhook_sample, note the unhook_exports function.

Windows Hotpatching & Process Injection

Blog - Signal Labs

christopher vella

6 March 2023 at 07:01

Summary

I spent a day reversing syscalls and Kernel entry points to find new ways to share memory or inject code between threads & processes. Along the way I decided to dive into Hotpatching on Windows and write a PoC, which turned out to be a great exercise in working with the PE format.

Below I talk through Hotpatching, resulting in a PoC & PE32/32+ helper code.

What is hotpatching?

Hotpatching is a method supported throughout Windows for modifying running PE32/32+ objects in memory, including Kernel drivers & processes (even those in the Secure Kernel), usually to permit patching or updating code without requiring restarts. This is documented in a few places, including the PE32/32+ format (see an overview from Microsoft here: https://techcommunity.microsoft.com/t5/windows-os-platform-blog/hotpatching-on-windows/ba-p/2959541).

Enabling Hotpatch Support

Hotpatching is not enabled by default in all versions of the OS. It’s currently supported in Insider builds and in the Azure Edition of Server builds (Azure Editions are downloadable as ISOs or deployable in Azure directly; more information here: https://learn.microsoft.com/en-us/azure/automanage/automanage-hotpatch and here: https://learn.microsoft.com/en-us/windows-server/get-started/enable-hotpatch-azure-edition).

If you’re working with the raw ISO or an Insider build, you may need to set registry keys to turn on Hotpatching (as per the links above). Insider builds only require the registry keys to be set, which are included in the PoC at the end.

Hotpatching also relies on the presence of the Secure Kernel, though you can leverage Hotpatching without the Secure Kernel if you set an additional registry key.

All the information about Hotpatching can be gleaned from public documentation, including PE information here: https://microsoft.github.io/windows-docs-rs/doc/windows/Win32/System/SystemServices/struct.IMAGE_HOT_PATCH_INFO.html, here: https://learn.microsoft.com/en-us/windows/win32/api/winnt/ns-winnt-image_load_config_directory64 & through reversing of the NtManageHotPatch syscall in ntdll.dll and ntoskrnl.exe.

Pseudo Overview of the Hotpatching Process

In addition to the documentation above, below is my own general walkthrough of the Hotpatching process based on my PoC development.

Patches are created via the NtManageHotPatch syscall. This syscall takes multiple parameters that determine which operation to perform. When we call it to create a patch, it expects to load a PE32/32+ file describing the patch.

These Hotpatch PE32/32+ files are like regular PE32/32+ executable images, except that they include Hotpatch entries (including the IMAGE_HOT_PATCH_INFO struct here: https://microsoft.github.io/windows-docs-rs/doc/windows/Win32/System/SystemServices/struct.IMAGE_HOT_PATCH_INFO.html, the IMAGE_HOT_PATCH_BASE struct here: https://microsoft.github.io/windows-docs-rs/doc/windows/Win32/System/SystemServices/struct.IMAGE_HOT_PATCH_BASE.html & other IMAGE_HOT_PATCH_* structures).

The Hotpatch entries start with the IMAGE_HOT_PATCH_INFO struct, which is stored in a section of the PE32/32+ file. It is pointed to by the HotPatchTableOffset field of the IMAGE_LOAD_CONFIG_DIRECTORY64 structure, which is in turn located via the Load Config DataDirectory entry in the OptionalHeader of the PE32/32+ file.

The PE file is mapped into a system section by the Kernel, which parses the file to determine the offsets of the Hotpatch structures in the file. It then creates a Hotpatch entry in a list.

A Hotpatch is valid for images with a certain checksum and timedatestamp: it applies to any image whose OptionalHeader checksum and IMAGE_FILE_HEADER timedatestamp match. This is how a Hotpatch file tells the system which image it is valid for. For example, if we wanted to patch kernelbase.dll, we’d read the checksum and timedatestamp from kernelbase.dll and set the OriginalCheckSum and OriginalTimeDateStamp fields of our IMAGE_HOT_PATCH_BASE struct to those values.
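Reading those two values out of a target image can be sketched as follows. This is a simplified illustration and not the helper code from the PoC repository: it works on a file already read into memory, assumes a little-endian host, and the `PatchTargetId` type and `read_patch_target_id` function are names invented for the example. The offsets follow the PE layout: e_lfanew at 0x3C, TimeDateStamp 8 bytes after the PE signature, and CheckSum 64 bytes into the optional header (the same for PE32 and PE32+).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t time_date_stamp;  /* IMAGE_FILE_HEADER.TimeDateStamp */
    uint32_t checksum;         /* IMAGE_OPTIONAL_HEADER.CheckSum */
} PatchTargetId;

/* Read a 32-bit value from a possibly unaligned position. */
static uint32_t read_u32(const uint8_t *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Returns 1 and fills *id on success, 0 if the buffer is not a plausible PE. */
int read_patch_target_id(const uint8_t *file, size_t len, PatchTargetId *id)
{
    uint32_t pe_off;

    if (len < 0x40 || file[0] != 'M' || file[1] != 'Z')
        return 0;
    pe_off = read_u32(file + 0x3C);               /* e_lfanew */
    if (pe_off > len || len - pe_off < 24 + 68)   /* headers up to CheckSum */
        return 0;
    if (memcmp(file + pe_off, "PE\0\0", 4) != 0)
        return 0;
    /* 4-byte signature, then Machine (2) and NumberOfSections (2) */
    id->time_date_stamp = read_u32(file + pe_off + 8);
    /* optional header starts 24 bytes after the signature */
    id->checksum = read_u32(file + pe_off + 24 + 64);
    return 1;
}
```

The two fields returned here are the values you would copy into OriginalTimeDateStamp and OriginalCheckSum.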

Additionally, if the patch PE contains the exported function __PatchMainCallout__, it will be automatically invoked after the patch is loaded in a process.

Once the patch is loaded into the Kernel, depending on the type of patch it may automatically be applied to all running processes as the Kernel enumerates processes and calls a notification callback in ntdll.dll to handle checking for patches.

Limitations & Notes

While Hotpatching is a powerful feature, permitting code changes to multiple parts of the system, there are two main limitations (for non-Microsoft users):

1. Administrator privileges are generally required to enable Hotpatching
2. To globally apply a Hotpatched PE, the PE is required to be at least Microsoft signed or higher (preventing common injection of unsigned DLLs)

For 2. above, the PE does not need to be signed to be loaded into the Kernel list of Hotpatches, and you can still map the Hotpatch into your process by utilizing NtManageHotPatch, which provides a way to map the section handle of your Hotpatch regardless of its signature.

There are other behaviors of Hotpatching not mentioned here, such as targeting processes by user SID, or the fact that processes may continually attempt to load your Hotpatch regardless of validity (which can cause processes to stop launching, or even act as a targeted DoS against certain processes).

Additionally, if you target kernelbase, the Hotpatched PE can be loaded outside of the typical notification callback, such as in LdrpInitializeKernel32Functions instead, which has its own interesting properties not discussed here.

I could also foresee uses of this by EDRs to supply ntdll & kernelbase patches, instead of their current approach of injecting + hooking.

Code Samples

Code samples (not complete) for Hotpatching (+ partial helper code for working with PE32 files) are included here: https://github.com/Signal-Labs/Hotpatching_PoC, the Hotpatch loader will take the compiled hotpatch_replace_vs file (which is expected to already have a LoadConfig table, which is possible if you compile with /GS for example) and create a new file that’s a clone with a Hotpatch entry (a partially valid entry, just enough to get it loaded as a Hotpatch record in the Kernel). It also includes support for enabling Hotpatching if you run as Admin.

Memory Corruption in vmware-vmx.exe

christopher vella

30 September 2022 at 18:11

Preface: Hypervisor Bugs?

Firstly — while the below is a memory corruption bug in VMware’s vmware-vmx.exe process, it is benign (not quite exploitable, partly why I’m comfortable dropping it here) but it is fun to talk about and came from personal VMware fuzzing adventures.

The Bug: A Tale of Two

To first reach the memory corruption bug, we have to take vmware-vmx.exe down the path of panicking!

This is because the actual memory corruption lives in VMware’s zlib.dll. This module is used when vmware-vmx.exe encounters an error and proceeds to create a crash-dump of itself and compress it to disk.

Snippet of the coredmp logic in vmware-vmx’s panic handler

Looking at xrefs to VMware’s panic handler, there are quite a few ways to make the vmware-vmx process panic.

We just need to hit one! How can we do this? We need our first bug to hit the panic handler!
Turns out my VMware fuzzer found such a bug, and it’s in the LSI Logic handler in the process, in particular this line:

So this initial bug isn’t anything crazy. It’s essentially an ASSERT due to unexpected/malformed input. This bug alone will just crash our own VM and as such doesn’t really constitute a bug itself (unless you can do something else with it; I had ideas of continually restarting/crashing my own VM to take up crash-dump / log space on the host, for instance).

However, my fuzzer didn’t report this as an ASSERT; it found an actual memory corruption bug! Turns out that during the panicking process, in this instance, data is sent to the deflate function in VMware’s zlib (for compression), and this code actually has an out-of-bounds read!

What happens here is that a buffer is looped over a set of iterations and for each iteration we read 8 bytes from the buffer, however it goes one iteration too far and attempts to read past the bounds of the buffer on the final iteration:
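To illustrate this class of bug in general terms (this is not VMware’s or zlib’s code), here is a small C sketch of a loop that consumes a buffer 8 bytes at a time. The function name `fold_qwords` and the XOR-fold it performs are made up for the example; the loop condition is the part that matters.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* XOR-fold a buffer 8 bytes at a time. The loop condition is the point:
   `i + 8 <= len` stops before a read that would cross the end of the
   buffer, whereas a condition such as `i < len` would allow one extra
   8-byte read whenever len is not a multiple of 8. */
uint64_t fold_qwords(const uint8_t *buf, size_t len, size_t *reads)
{
    uint64_t acc = 0;
    size_t n = 0;

    for (size_t i = 0; i + 8 <= len; i += 8) {
        uint64_t chunk;
        memcpy(&chunk, buf + i, sizeof chunk);
        acc ^= chunk;
        n++;
    }
    if (reads)
        *reads = n;   /* number of full 8-byte reads performed */
    return acc;
}
```

With a 20-byte buffer the bounded loop performs two reads and leaves the 4-byte tail alone. A loop that runs one iteration too far would instead read 8 bytes starting at offset 16, 4 of which lie past the end of the buffer.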

Repro for Yourself

Want to test this yourself? Grab the bootx64.efi (uefi bootloader) I’ve made below, you can create a VMware vmdk and plant this in the vmdk’s “\EFI\Boot” folder, such that when you mount the vmdk the structure is “\EFI\Boot\bootx64.efi”.

When you launch the VM my bootloader will run and repro the crash for you (it’s not a 100% success rate, probably about 80%).

bootx64.efi: https://github.com/Kharos102/VmwareBlogBug

Fuzzing WeChat’s Wxam Parser

christopher vella

7 August 2022 at 19:30

Background

WeChat (if you haven’t heard of it) is a super popular chat app similar to the likes of WhatsApp, and runs on iOS, Android, Windows and macOS.
Being a chat app, it handles various file formats like images and videos, and also proprietary formats like “Wxam” (which honestly I haven’t researched before, so you’ll see how I approached that).

You’ll also see below some of the challenges I had in harnessing the target, and how the fuzzer framework I initially chose had to be replaced because it lacked support for certain functionality that WeChat uses (and how I debugged this).

Researching the Target

Now that we know what WeChat is we can look at how I decided to write a fuzzer (in 1 day!) for this target!
It started by deciding I wanted to blog about fuzzing something. Previously I’ve had blogs on logic bugs, and I wanted to balance that with some cool fuzzing target I hadn’t looked at before, so I started by browsing ZDI to see if any listed targets were interesting.

I noticed a few entries for WeChat like the below:

Now at this point I know what WeChat is, but I have no idea what WXAM is (but it’s safe to guess it’s some format that gets parsed).

So my next step was to simply install WeChat in a VM! Note that here I’m targeting the Windows build of WeChat, for the following reasons:

I want this to be quick; it’s primarily for this blog post and I know I can fuzz Windows targets faster than iOS/Android
If this parser also exists on other platforms, it probably isn’t much different (potentially if I find the bug on Windows, it’ll exist on the other platforms)

Now it’s installed and I have a bunch of executables and DLL files in C:\Program Files (x86)\Tencent\WeChat, so how do I find the WXAM parsing functionality?

Finding the Target

A good starting point may be to dump all the imported & exported functions from all the executables and DLLs and search for anything with the name “wxam” in it, but I went a different route — I simply guessed and opened the DLL that sounded interesting in IDA!

For me, looking at the list of DLLs I spotted “WeChatWin.dll”, this sounds like a main DLL for WeChat that handles certain Windows specific APIs or something? Who knows, but it stood out more than some of the other DLLs, so I opened this in IDA.

This DLL took a while to load as it’s pretty large (~40 MB). Once done, the first thing I did was search the functions, imports & exports for the name “wxam”, and there I found:

wxam2pic imported function shown in WeChatWin.dll

We spot an imported function named “wxam2pic” that lives in “VoipEngine.dll” — nice! This is a great starting point, it even sounds like a parser.

Before I look at wxam2pic in VoipEngine, I first examine cross-references to this import within WeChatWin.dll to see how WeChatWin uses it. I spot two functions that call it, including this one:

Scrolling to the top of this function we spot:

Don’t you love debug prints?
This string alone implies the function we’re looking at is a “WxAMDecoderHelper”, specifically this function handles the “DecodeWxam” functionality — Awesome! This is exactly the type of function that corresponds with the ZDI entries we saw.

There’s something else notable about this function, look at how IDA shows the prototype:

It’s a custom calling convention!

This means if we were to target this function for fuzzing directly, we’d have to match this custom parameter passing convention instead of Visual Studio’s provided options (fastcall, cdecl, etc).

Instead, I took a look at the function that calls this function, and I got:

(Note: ignore the function name itself, I named it this from what I saw!)

Nice, this function uses a standard calling convention (fastcall), takes only two arguments and calls the DecodeWxam function (handling the custom calling convention for us!)

We also see from the debug print that this function appears to decode the Wxam and then re-encode it as a jpeg, this would be a great function to fuzz!

(Note: There’s another decoder that transforms the Wxam to a GIF! We’re not going to look at that one in this blog, but it’s essentially the same).

Reversing the Target Function

Alright, so I want to fuzz this function as it appears to take a Wxam file and parse it. Let’s analyze the parameters.

Let’s view cross-references to this function to see how it’s called:

(Note: I named the read_file function myself, if you open this function you see a simple CreateFile + ReadFile operation on the provided fName variable!)

From this, I see the following:

A filename is provided to the function I named “read_file”, and a buffer is returned in v11
The buffer and a value are passed to “isWxGF”; this function reads a header and the flag to determine whether we should parse it further or not
- Actually, it turns out the input structure is a 32-bit input buffer pointer followed by a 32-bit input size. So isWxGF takes (pBuffer, buf_sz)
If we pass the “isWxGF” check, we call the decoder function, passing through:
- The address of an input structure that contains (pBuffer, buf_sz); the pseudocode looks similar to
  - InputStruct inputStruct = (pBuffer, buf_sz)
  - Where the first input to the decoder function is a pointer to our inputStruct
- A pointer to an int containing the value 0
  - This pointer seems to be some output from the decoder; if it’s non-zero, it’s assumed to be another valid pointer

This seems super easy to fuzz:

We can fuzz using shared-memory mode in a fuzzer like WinAFL
Our fuzz function will:
- Call isWxGF, and if successful:
- Call the decoder
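The shape of that fuzz function can be sketched in C as below. This is not the harness from my repository: the prototypes are hypothetical (the real functions are non-exported, and their exact signatures and calling conventions come from reversing), `fuzz_one` is a name invented here, and the `demo_*` functions are stand-ins so the sketch runs without WeChatWin.dll. In particular, "DEMO" is a placeholder magic and not the header isWxGF actually checks.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the (pBuffer, buf_sz) pair described above. In the 32-bit target
   both fields are 32 bits wide; here the pointer has the host's width. */
typedef struct {
    const uint8_t *data;
    uint32_t size;
} InputStruct;

/* Hypothetical prototypes for the two target functions. */
typedef int (*is_wxgf_fn)(const uint8_t *data, uint32_t size);
typedef int (*decode_fn)(InputStruct *input, int *out);

/* One fuzz iteration: gate on the header check, then call the decoder.
   Returns 0 if the input was rejected, 1 if the decoder was reached. */
int fuzz_one(is_wxgf_fn is_wxgf, decode_fn decode,
             const uint8_t *data, uint32_t size)
{
    InputStruct input = { data, size };
    int out = 0;   /* decoder output slot, starts at 0 as in the caller */

    if (!is_wxgf(input.data, input.size))
        return 0;
    decode(&input, &out);
    return 1;
}

/* Stand-ins so the sketch runs without the DLL. "DEMO" is a placeholder
   magic, not the real header that isWxGF checks. */
static int demo_is_wxgf(const uint8_t *data, uint32_t size)
{
    return size >= 4 && memcmp(data, "DEMO", 4) == 0;
}

static int demo_decode(InputStruct *input, int *out)
{
    *out = (int)input->size;
    return 0;
}
```

In a real harness the two function pointers would be resolved from the loaded DLL (for example as base address plus an offset found while reversing), and `fuzz_one` would be the function handed to the fuzzer as the target.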

So I wrote a harness to do this in WinAFL, however:

This usually means our program is crashing before reaching our fuzz function.

So I run WinAFL under WinDBG and see an invalid address dereference when trying to load the “WeChatWin.dll” file!

I analyze the DLL entry point and spot:

I see, this DLL uses CRT (also thread-local storage) — this causes issues with DynamoRIO (which I was using with WinAFL).

This can be confirmed by compiling my executable with CRT support and noting that WinAFL crashes before our process main executes at all!

So this means we can’t use DynamoRIO, our options include:

Use WinAFL in IntelPT mode (I’m using an AMD CPU, so no go here)
Use a different fuzzer

Well, I chose a different fuzzer.

I could have gone the snapshot route with Nyx or what-the-fuzz, but instead I decided to try Jackalope.

This has a very similar command line to WinAFL, and uses TinyInst for instrumentation (no DynamoRIO!)

Upon trying this, it worked:

It’s fuzzing, and we are getting new coverage!

At this point I stopped. I got the fuzzer working well enough that I was happy for the day. Next steps would include:

Analyzing coverage, ensuring we’re not hitting any roadblocks
Checking stability / determinism, ensuring there are no globals we need to reset
- Or just throwing this into a snapshot fuzzer
Reversing the WXAM format and creating a better corpus, and a format-aware mutator

Also note that in the isWxGF function, I noted the header bytes it checks for and ensured my initial corpus had that header (so we start with an input that successfully passes that check).

There are other things I did in the harness, which are general fuzzing things like obtaining the non-exported function pointers to our target functions we wanted to fuzz.

I’ve included the harness I used below, along with the Jackalope command line I used to kick off fuzzing, feel free to take this and expand on it or view coverage to see how far it gets!

Overall this was a fun half a day exercise at quickly writing a basic fuzzing harness based on some ZDI entry.

Update — Android Bugs!

So, turns out some of the bugs I found from this fuzzer were reproducible on Android:

@h0mbre_ @domenuk ezpz? pic.twitter.com/CiLoUxQtb5
— Christopher (@Kharosx0) August 8, 2022

Files

I put all the files on my Github: https://github.com/Kharos102/BasicWXAMFuzzer

Want to Learn Fuzzing?

We offer Vulnerability Research & Fuzzing trainings live or self-paced (For our self-paced trainings, see: https://signal-labs.thinkific.com/collections)

For any questions, feel free to contact us!

Rediscovering Epic Games 0-Days (Forever Unpatched?)

christopher vella

6 July 2022 at 22:30

How It Started

So one day I was browsing ZDI (usually it’s the same sort of targets: lots of Foxit bugs, Adobe, Ivanti, etc.) and noticed a couple of entries by @izobashi (ZDI-22-537, ZDI-22-538) for Epic Games Launcher. Two things stood out:

It wasn’t patched at the time of advisory release (which means no patch in the 120 days since it was reported; maybe unpatched forever?)
They were file overwrite and file deletion bugs, which can be leveraged for LPE, and they affected the installer (these bugs are common and very familiar to me)

Now as a gamer (albeit not one with Epic’s launcher installed) I’ve had the displeasure of noting multiple vulnerabilities in gaming-related software (alongside a strong dislike for anti-cheats in my kernel or acting as a hypervisor, though I understand why they do it), so I figured I’d check whether I could find these same bugs in the latest version of Epic’s launcher.

Although we don’t have a PoC or really any detailed information from the ZDI listings, the bugs are familiar enough that we can jump right in with our trusty ProcMon and see what we find.

Finding the bugs

After installing Epic’s launcher I immediately find the installer in C:\Windows\Installer (shh! it’s a secret directory). I know this is Epic’s MSI because the signature matches Epic, as expected:

Bugs in installers are common for a few reasons:

They typically auto-elevate to SYSTEM (even if you’re just a lowly non-administrative user)
They can be executed in install / remove / repair modes that perform various operations, including file operations (copy, rename, delete) and can run arbitrary bundled scripts
People don’t spend security $$$ on hardening their installers? (I don’t know, but it sure seems like it)

Before we go any further, let’s configure our ProcMon. But first, why ProcMon?

Tells us what processes are doing (to an extent)
1. What files they’re accessing
2. What permissions they’re operating at
3. What files / registry entries they’re reading / writing / deleting
Is filterable
1. Write rules to only show / capture what you’re interested in

Now to configure ProcMon, what are we looking for exactly?

File creation/opening events that satisfy the following:
1. Operating on folders or files we can control
  1. Why? So we can redirect them via symlinks of course
    1. If it overwrites a file in C:\windows\system32, how would our lowly non-admin user control it?
    2. If it overwrites a file in C:\users\lowly_user\Desktop that we can control, it’s a different story!
    3. (Or any other location we have write or similar access to)
2. ?? (There are more potentially interesting events, like paths or files that don’t exist that we may create, etc… but for this exercise we don’t care)

Now you may think if we exclude the following folders, that’d be good enough to meet our requirements above:

The point of the above is:

Capture CreateFile and Load Image operations
Ensure username contains NT (e.g. NT AUTHORITY\SYSTEM)
Exclude folders we can’t modify / control:
- C:\ProgramData\Microsoft
- C:\Program Files*
- C:\Windows\

Can you think what the problem with the above excluded directories is?

…

….

Well actually there are multiple (for example, C:\windows\temp is typically user-writable! Meaning we actually can have some control over the contents of this directory, yet in the above filters we exclude it, although this isn’t an issue for this particular example).

The actual issue is excluding all of C:\Program Files, because Epic actually applies a permissive DACL on c:\Program Files (x86)\Epic Games\Launcher and its subfolders! (Not a great thing to do in general…)

This can be verified with icacls:

(Tip: Enumerate ACLs on everything -> install software -> enumerate again -> diff!)

Ok so let’s ensure the path C:\Program Files (x86)\Epic Games\Launcher is included in our ProcMon filter and start capturing. (In this case I’m going to remove the exclude for C:\Program Files and specifically include the launcher path above. Once we have the trace we can play with the filters to see other interesting events too, like searching for operations that begin with Set to see renames, deletions, etc.)

Let’s right click the .msi file and press Repair, wait for it to complete and see if anything interesting happens.

Well that’s a lot.

To be honest it’s not surprising: the .msi in repair mode is there to, well, “repair” its files (which is typically achieved by replacing them with a pre-packaged good version).

Since the ACLs on this folder are weak, what would happen if we were to redirect one of these files elsewhere?

To test this, I’m going to show the two 0-days; first is the file overwrite.

Let’s start by grabbing the symbolic link testing tools from GPZ and compiling them.

Now let’s turn a folder (in this case, C:\Program Files (x86)\Epic Games\Launcher\Engine\Binaries\Win32) into a symlink pointing to \RPC Control:

(If you can’t delete Win32, try stopping Epic’s running processes first)

Ok now let’s try the repair operation again and see what happens.

Ok so it’s looking for a DLL in our Win32 folder. However, Win32 now points to \RPC Control, and there’s no \RPC Control\msvcp140_2.dll for the target to obtain a handle to.

Let’s try creating this and redirecting it to C:\Windows\System32\License.rtf as an example. First, let’s note the size of our License.rtf file:

Ok so yours is likely not 7 bytes like mine, but note that mine only allows modification by Administrators or higher; users just have RX.

Now let’s create the link to it:

Now press Retry on the msi error, and you’ll notice it continues and pops up another error (for a different DLL!)

However, note that License.rtf has been overwritten!

This is the first 0-day, arbitrary file overwrite!
To ensure this sticks, we can now delete Win32, recreate it as a regular folder (mkdir Win32) and press Retry, this should cause the installer to continue without any more errors and leave the file overwritten.
However, we can turn this into an arbitrary deletion vulnerability by causing the target to now delete License.rtf!

We can do this by simply pressing Cancel instead of retry! The target MSI will rollback its operations, and this will cause it to delete the overwritten file entirely!

With these two bugs (file overwrite + file deletion) we can actually achieve LPE; there are other posts on how to do this (e.g. https://www.zerodayinitiative.com/blog/2022/3/16/abusing-arbitrary-file-deletes-to-escalate-privilege-and-other-great-tricks).

Who’s taking bets on how long these bugs will remain 0-days in Epic’s launcher?

Announcing Self-Paced Trainings!

christopher vella

30 April 2022 at 16:44

Self-paced trainings are arriving for all existing public trainings. This includes:

Vulnerability Research & Fuzzing
Reverse Engineering
Offensive Tool Development
Misc workshops

This change comes from both interest from previous students & my own preference to learn via pre-recorded content.

Features of self-paced trainings include:

Pre-recorded content that matches the 4-day live training versions
- Includes all the materials you’d normally get in the 4-day live version
- Includes a free seat on the next 4-day live version (pending seat availability)
Unlimited discussions via email/twitter/discord with instructor
Free and paid workshops / mini-trainings on various topics
- I also take requests on workshops / mini-trainings / topics you’d like to see

Different platforms for hosting the self-paced versions have been considered. Currently we’re experimenting with the Thinkific platform and are in the process of modifying & uploading all the recorded content (I recently relocated from Australia to the USA, which has delayed the self-paced development a bit, but a lot of content is currently uploaded).

While the self-paced versions are being edited and uploaded, I’m offering access to them at a discounted rate (20% off!). This gets you:

Access to draft versions of the training content as they’re developed
Lifetime Access to the training once completed

Once a particular training has been finalized, the discount for it will no longer be offered.

You can find the draft self-paced training offerings (as they’re developed) here: https://signal-labs.thinkific.com/collections

(Link will be updated when training is finalized)

For any questions feel free to contact us via email at [email protected]

Happy Hacking!

Finding a Kernel 0-day in VMware vCenter Converter via Static Reverse Engineering

christopher vella

26 January 2022 at 22:40

I posted a poll on twitter (Christopher on Twitter: "Next blog topic?" / Twitter) to decide on what this blog post would be about, and the results indicated it should be about Kernel driver reversing.

I figured I’d make it a bit more exciting by finding a new Kernel 0-day to integrate into the blog post, and so I started thinking what driver would be a fun target.
I’ve reversed VMware drivers before, primarily ones relating to their Hypervisor, but I’ve also used their vCenter Converter tool before and wondered what attack surface that introduces when installed.

Turns out it installs a Kernel component (vstor2-x64.sys) which is interactable via low-privileged users, we can see this driver installed with the name “vstor2-mntapi20-shared” in the “Driver” directory using Sysinternals’ WinObj.exe tool.

To confirm low-privileged users can interact with this driver, we take a look at the “Device” directory.
Drivers have various ways of communicating with user-land code, one common method is for the driver to expose a device that user-land code can open a handle to (using the CreateFile APIs), we find the device with the same name, double-click it and view its security attributes:

We see in the device security properties that the “everyone” group has read & write permissions, this means low-privileged users can obtain a handle to the device and use it to communicate to the driver.

Note that the driver and device names in these directories are set in the driver’s DriverEntry when it is loaded by Windows. First the device is created using IoCreateDevice, usually followed by a symbolic link creation using IoCreateSymbolicLink to give access to user-land code.

When a user-land process wants to communicate with a device driver, it will obtain a file handle to the device. In this case the code would look like:

#define USR_DEVICE_NAME L"\\\\.\\vstor2-mntapi20-shared"

HANDLE hDevice = CreateFileW(USR_DEVICE_NAME,
                             GENERIC_READ | GENERIC_WRITE,
                             FILE_SHARE_READ | FILE_SHARE_WRITE,
                             NULL,
                             OPEN_EXISTING,
                             0,
                             NULL);

This code results in the driver’s IRP_MJ_CREATE dispatch handler being called. This dispatch handler is part of the DRIVER_OBJECT for the target driver, which is the first argument to the driver’s DriverEntry. This structure has a MajorFunction array which can be set to function pointers that will handle callbacks for various events (like the create handler being called when a process opens a handle to the device driver).

In the image above we know the first argument to DriverEntry for any driver is a pointer to the DRIVER_OBJECT structure, with this information we can follow where this variable is used to find the code that sets the function pointers for the MajorFunction array.

We can find out which MajorFunction index maps to which IRP_MJ_xxx function by looking at sample code provided by Microsoft, specifically on line 284 here.

Since we now know which array index maps to which function, we rename the functions with meaningful names as shown in the image above (e.g. we name entry 0xe ioctl_handler, as it handles DeviceIoControl messages from processes).
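As a rough illustration of that index-to-handler mapping, the sketch below lists a handful of major function codes with the values they have in the WDK header wdm.h and suggests a label for each. The helper name handler_name_for_index and the labels it returns are hypothetical, chosen only to mirror the renaming described above.

```c
#include <assert.h>
#include <stddef.h>

/* A subset of the major function codes, using the values from wdm.h. */
enum {
    IRP_MJ_CREATE           = 0x00,
    IRP_MJ_CLOSE            = 0x02,
    IRP_MJ_READ             = 0x03,
    IRP_MJ_WRITE            = 0x04,
    IRP_MJ_DEVICE_CONTROL   = 0x0e,
    IRP_MJ_MAXIMUM_FUNCTION = 0x1b
};

/* Hypothetical helper: suggest a label for the function pointer stored at
 * DriverObject->MajorFunction[index]. Returns NULL for indices this sketch
 * does not cover. */
const char *handler_name_for_index(unsigned index)
{
    assert(index <= IRP_MJ_MAXIMUM_FUNCTION);
    switch (index) {
    case IRP_MJ_CREATE:         return "create_handler";
    case IRP_MJ_CLOSE:          return "close_handler";
    case IRP_MJ_READ:           return "read_handler";
    case IRP_MJ_WRITE:          return "write_handler";
    case IRP_MJ_DEVICE_CONTROL: return "ioctl_handler";
    default:                    return NULL;
    }
}
```

Calling handler_name_for_index(0xe) yields the ioctl_handler label used in this post.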

The read & write callbacks are called when a process calls ReadFile or WriteFile on the device handle, there are other callbacks too which we won’t go through.

To start with, let’s analyze the irp_mj_create handler and see what happens when we create a handle to this device driver.

By default, this is what we see:

Firstly, we can improve decompilation by setting the correct types for a1 and a2, which we know must conform to the DRIVER_DISPATCH specification.

Doing so results in the following:

There are a few things happening in this function. Two structures shown here that are usually important are:

DeviceExtension object in the DEVICE_OBJECT structure
FsContext object in the IRP->CurrentStackLocation->FileObject structure

The DeviceExtension object is a pointer to a buffer created and managed by the driver. It is accessible to the driver via the DEVICE_OBJECT structure (and thus accessible to the driver in all DRIVER_DISPATCH callbacks). Drivers typically create and use this buffer to manage state, variables & other information the driver wants to be able to access in a variety of locations. For example, if the driver supports various functions to Open, Read, Write or Close TCP connections via IOCTLs, the driver may store its current state (e.g. whether the connection is Open or Closed) in this DeviceExtension buffer, and whenever the Close function is called, it will check the state in the DeviceExtension buffer to ensure it’s in a state that can be closed. Essentially, it’s just a buffer that the driver uses to store/retrieve information from a variety of contexts/functions.

The FsContext structure is similar and can be used as an arbitrary buffer. The main difference is lifetime: the DEVICE_OBJECT structure is created by the driver during the IoCreateDevice call, which means the DeviceExtension buffer does not get torn down or re-created when a user process opens or closes a handle to the device. The FsContext structure, on the other hand, is associated with a FILE_OBJECT structure that is created when CreateFile is called and destroyed when the handle is closed, meaning the FsContext buffer is per-handle.

From the decompiled code we see that a buffer of size 0x20 is allocated and set to be the FsContext structure. We also see that the first 64 bits of this structure are set to v5 in the code, which corresponds to the DeviceExtension pointer, meaning we have already figured out that the FsContext struct contains a pointer to the DeviceExtension as its first element.

E.g.

struct FsContext {
    PVOID pDevExt;
};
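To make the lifetime difference between the two buffers concrete, here is a small user-mode model in portable C. It is only a sketch: SimDevice, SimFileObject and the sim_* functions are hypothetical stand-ins for the kernel objects and handlers, and only the 0x20 allocation size and the first-field pointer come from the decompilation described above.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-ins; the real DEVICE_OBJECT and FILE_OBJECT are far larger. */
typedef struct { void *DeviceExtension; } SimDevice;
typedef struct { void *FsContext; } SimFileObject;

/* Models IoCreateDevice: the extension is allocated once, with the device. */
int sim_create_device(SimDevice *dev, size_t ext_size)
{
    assert(ext_size > 0);
    dev->DeviceExtension = calloc(1, ext_size);
    return dev->DeviceExtension != NULL;
}

/* Models the create handler: a fresh 0x20-byte context per opened handle,
 * whose first pointer-sized field refers back to the shared extension. */
int sim_open(SimDevice *dev, SimFileObject *file)
{
    void **ctx = calloc(1, 0x20);
    if (ctx == NULL)
        return 0;
    ctx[0] = dev->DeviceExtension;
    file->FsContext = ctx;
    return 1;
}

/* Models closing the handle: only the per-handle context is released,
 * the device extension stays alive. */
void sim_close(SimFileObject *file)
{
    free(file->FsContext);
    file->FsContext = NULL;
}
```

Opening two handles in this model gives two distinct FsContext buffers that both point at the same DeviceExtension, which is why state stored in the extension behaves like a global.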

Figuring out the rest of the elements to the FsContext and DeviceExtension structures is a simple but sometimes tedious process of looking at all the DRIVER_DISPATCH functions for the driver (like the ioctl handler) and noting down what offsets are accessed in these structs and how they’re used (e.g. if offset 0x8 in the DeviceExtension is used in a KeAcquireSpinLockRaiseToDpc call, then we know that offset is a pointer to a KSPIN_LOCK object).

Taking the time to document the structures this way pays off. It helps greatly when trying to understand the decompilation, as with some effort we can transform the IRP_MJ_CREATE handler to look like the below:

When looking at the FsContext structure for example, we can open IDA’s Local Types window and create it using C syntax, which I created below:

Note that as you figure out what each element is, you can define the elements as random junk and rename/retype them as you go (so long as you know the size of the structure, which we get easily here via the 0x20 size argument to ExAllocatePoolWithTag).
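A placeholder definition along those lines might look like the sketch below. Only the first field and the 0x20 total size come from the analysis above; the unknown_* names are hypothetical filler to be renamed and retyped as they are understood, and uint64_t is used for the pointer so the layout matches the 64-bit driver regardless of the machine compiling it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Work-in-progress layout for the 0x20-byte per-handle allocation. */
typedef struct FsContextDraft {
    uint64_t pDevExt;    /* +0x00: pointer to the DeviceExtension */
    uint64_t unknown_08; /* +0x08: not yet identified */
    uint64_t unknown_10; /* +0x10: not yet identified */
    uint64_t unknown_18; /* +0x18: not yet identified */
} FsContextDraft;

/* Compile-time checks that the draft still matches the allocation size
 * passed to ExAllocatePoolWithTag and that the offsets have not drifted. */
_Static_assert(sizeof(FsContextDraft) == 0x20, "draft must stay 0x20 bytes");
_Static_assert(offsetof(FsContextDraft, unknown_18) == 0x18, "last slot at +0x18");
```

Keeping the size check in place means a mistyped field shows up as a build error rather than as a subtly wrong decompilation.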

Now that we’ve analyzed the IRP_MJ_CREATE handler and determined there’s nothing stopping us from creating a handle, we can look into how the driver handles Read, Write & DeviceIoControl requests from user processes.

In analyzing these handlers, we see heavy usage of the FsContext and DeviceExtension buffers, including checks on whether its contents are initialized.

It turns out there are quite a few vulnerabilities in this driver that are reachable if you form your input correctly to hit their code paths. While I won’t go through all of them (some are still pending disclosure!), we will take a look at one, which is a simple user->kernel DoS.

In IOCTL 0x2A0014 we see the DeviceExtension buffer get memset to 0 to clear its contents:

This is followed by a memmove that copies 0x100 bytes from the user’s input buffer to the DeviceExtension buffer, meaning those byte offsets we copy into are user controlled (I denote this with a _uc tag at the end of the variable name):

During this IOCTL, another field in the DeviceExtension also gets set (which seems to indicate that the DeviceExtension buffer has been initialized):

This is critical to triggering the bug (which we will see next).
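As an aside, the IOCTL code 0x2A0014 used here can be split into its fields using the standard CTL_CODE bit layout (device type in bits 31..16, access in bits 15..14, function in bits 13..2, method in bits 1..0). The sketch below is a portable re-implementation of that layout; IoctlFields, decode_ioctl and encode_ioctl are hypothetical names, not part of the Windows headers.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t device_type; /* bits 31..16 */
    uint32_t access;      /* bits 15..14 */
    uint32_t function;    /* bits 13..2  */
    uint32_t method;      /* bits 1..0   */
} IoctlFields;

/* Split a 32-bit control code into its CTL_CODE fields. */
IoctlFields decode_ioctl(uint32_t code)
{
    IoctlFields f;
    f.device_type = (code >> 16) & 0xFFFF;
    f.access      = (code >> 14) & 0x3;
    f.function    = (code >> 2)  & 0xFFF;
    f.method      = code & 0x3;
    return f;
}

/* Rebuild the control code, mirroring what the CTL_CODE macro computes. */
uint32_t encode_ioctl(IoctlFields f)
{
    assert(f.device_type <= 0xFFFF && f.access <= 0x3);
    assert(f.function <= 0xFFF && f.method <= 0x3);
    return (f.device_type << 16) | (f.access << 14) | (f.function << 2) | f.method;
}
```

For 0x2A0014 this gives device type 0x2A, function 5, access 0 (FILE_ANY_ACCESS) and method 0 (METHOD_BUFFERED), which fits the handler copying from a buffered user input.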

So, the actual bug doesn’t live in the IOCTL handlers, instead it lives in the IRP_MJ_READ and IRP_MJ_WRITE handlers (note that in this case the READ and WRITE handlers are the same function, they just check the provided IRP to determine if the operation is a READ or WRITE).

In this handler, we can see a check to determine if the DeviceExtension’s some_if_field has been initialized:

After clearing this condition, the bug can be seen in sub_12840 in the following condition statement:

Here we see I denoted the unkn13 variable in the DeviceExtension buffer with _uc, which means it’s user controlled (in fact, it’s set during the memmove call we saw earlier).

From the decompilation we see that the code does a % operation on our user controllable value, this translates to a div instruction:

If you’re familiar with x86, you’ll know that a div instruction with a divisor of 0 causes a divide-by-zero exception. We can easily trigger this here by providing an input buffer filled with 0 when we call the IOCTL 0x2A0014 to set the user controllable contents in the DeviceExtension buffer, then attempting to read/write the device handle using the ReadFile or WriteFile APIs.
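The bug pattern can be modelled in user mode to show where a check is missing. The sketch below is not the driver’s code: SimDevExt, its layout, the position of the user-controlled divisor and the sim_* functions are all assumptions made for illustration. It reproduces the memset, the 0x100-byte copy and the initialized flag, then adds the zero-divisor check that the vulnerable path lacks.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define USER_COPY_LEN 0x100

/* Hypothetical stand-in for the driver's DeviceExtension. */
typedef struct {
    uint8_t  user_bytes[USER_COPY_LEN]; /* filled from the caller's input */
    uint32_t initialized;               /* set once the IOCTL has run */
} SimDevExt;

/* Models IOCTL 0x2A0014: clear the buffer, copy 0x100 user bytes, mark it ready. */
void sim_ioctl_init(SimDevExt *ext, const uint8_t *input)
{
    assert(ext != NULL && input != NULL);
    memset(ext, 0, sizeof *ext);
    memmove(ext->user_bytes, input, USER_COPY_LEN);
    ext->initialized = 1;
}

/* Models the read/write path with the missing validation added.
 * Returns 0 and stores offset % divisor on success, -1 if the request is rejected. */
int sim_read_write(const SimDevExt *ext, uint32_t offset, uint32_t *remainder)
{
    uint32_t divisor;
    if (!ext->initialized)
        return -1;
    /* Assumed location of the user-controlled field within the copied bytes. */
    memcpy(&divisor, ext->user_bytes, sizeof divisor);
    if (divisor == 0)
        return -1; /* without this check, the % below would fault */
    *remainder = offset % divisor;
    return 0;
}
```

With an all-zero input buffer the model rejects the request instead of dividing, which is the behaviour one would want from the real handler.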

In fact there are multiple ways to trigger this. As the DeviceExtension buffer is essentially a global buffer and no locking is used when reading this value, race conditions exist where one thread is calling IOCTL 0x2A0014 and another is calling the read or write handler, such that this div instruction may be hit right after the memset operation in IOCTL 0x2A0014 clears the DeviceExtension buffer to 0.

There are also multiple other locations where such race conditions would affect the code paths taken in this driver!

Overall, this driver is a good target for reverse engineering practice with Kernel drivers due to its use of not only IOCTLs, but also read & write handlers + the use of the FsContext and DeviceExtension buffers that need to be reversed to understand what the driver is doing, and how we can influence it. All the bugs found in this driver were purely from static reverse engineering as a fun exercise.

Interested in Reverse Engineering & Vulnerability Research Training?

We frequently run public sessions (or private sessions upon request) for trainings in Reverse Engineering & Vulnerability Research, see our Upcoming Trainings or Subscribe to get notified of our next public session dates.

Emulating File I/O for In-Memory Fuzzing

christopher vella

12 October 2020 at 14:12

One problem I’ve encountered during fuzzing is how to best fuzz an application that performs multiple file reads on an input file, but in a performant way (e.g. in-memory without actually touching disk). For example, say an application takes in an input file path from a user and parses it. If the application loads the entire file into a single buffer to parse, this is simple to fuzz in-memory (we can modify the buffer in-memory and resume). However, if the target does multiple reads on a file from disk, how can we fuzz it efficiently?

Of course if we’re fuzzing by replacing the file on disk for each fuzz case we can fuzz such a target, but for performance if we’re fuzzing entirely in-memory (or using a snapshot-fuzzer that doesn’t support disk-based I/O) we need to ensure each read operation the target performs on our input does not actually touch disk, but instead reads from memory.

The method I decided to implement for my fuzzing was to hook the different file IO operations (e.g. ReadFile) and implement my own custom handlers for these functions that redirect the read operations to memory instead of disk. This has multiple benefits:

We eliminate syscalls, as lots of file operations result in syscalls and my custom handler does not use syscalls, we avoid context switching into the kernel and obtain better perf
We keep track of different file operations but it all operates on a memory-mapped version of our input file, this means we can mutate the entire mem-mapped file once and guarantee all ReadFile calls will be on our mutated Memory-mapped file

The normal operation of reading a file (without using my hooks) is:

CreateFile is called on a file target
ReadFile is used on the target to read into a buffer (resulting in syscalls and disk IO)
Process parses the buffer
ReadFile is used on the target to read more from the file on disk
Process continues to parse the buffer

With our hooks, the operations instead look like:

CreateFile is called on a file target (our hook memory maps the target once entirely in-memory)
ReadFile is used on the target to read into a buffer (resulting in our custom ReadFile implementation being called via our hook, and we handle the ReadFile by returning contents from our in-memory copy of the file, resulting in no syscalls or disk IO)
Process parses the buffer
ReadFile is used on the target to read more from the file (in-memory again, just like the first ReadFile)
Process continues to parse the buffer

Process Reading a File with our Hooks (In-Memory)

This greatly simplifies mutation and eliminates syscalls for the file IO operations.
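The core of such a handler can be sketched in portable C as a cursor over an in-memory copy of the file. This is a simplified illustration and not the code from the FileHook repository: MemFile and mem_read_file are hypothetical names, and only the synchronous ReadFile behaviour is modelled (success with fewer bytes near the end of the file, and success with zero bytes read at end of file).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* In-memory stand-in for an opened file: the mapped bytes plus a file pointer. */
typedef struct {
    const uint8_t *data;
    size_t size;
    size_t pos;
} MemFile;

/* Emulates a synchronous ReadFile: copies up to to_read bytes from the current
 * position, advances the position, and reports the count through bytes_read.
 * Always returns 1 (TRUE); a read at end of file succeeds with 0 bytes. */
int mem_read_file(MemFile *f, void *buffer, size_t to_read, size_t *bytes_read)
{
    assert(f != NULL && buffer != NULL && bytes_read != NULL);
    assert(f->pos <= f->size);
    size_t remaining = f->size - f->pos;
    size_t n = to_read < remaining ? to_read : remaining;
    memcpy(buffer, f->data + f->pos, n);
    f->pos += n;
    *bytes_read = n;
    return 1;
}
```

Because every read goes through the same MemFile, mutating the data buffer once between fuzz cases (and resetting pos to 0) is enough for all subsequent reads to see the mutated input.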

The implementation wasn’t complex. MSDN has good documentation on how the APIs behave, so we can emulate them and write a test suite alongside to verify the accuracy of our emulation.

The code for this can be found on my GitHub: https://github.com/Kharos102/FileHook

Fuzzing FoxitReader 9.7’s ConvertToPDF

christopher vella

21 August 2020 at 15:12

Inspiration to create a fuzzing harness for FoxitReader’s ConvertToPDF function (targeting version 9.7) came from discovering Richard Johnson’s fuzzer for a previous version of FoxitReader.

(found here: https://www.cnblogs.com/st404/p/9384704.html).

Multiple changes have since been introduced in the way FoxitReader converts an image to a PDF, including the introduction of new vtable entries, the necessity to load in the main FoxitReader.exe binary (including fixing the IAT and modifying data sections to contain valid handles to the current process heap) and more.

The source for my version of the fuzzing harness targeting version 9.7 can be found on my GitHub: https://github.com/Kharos102/FoxitFuzz9.7

Below is a quick walkthrough of the reversing and coding performed to get this harness working.

Firstly, based on the existing work from the previous fuzzers available, we know that most of the calls for the conversion of an image to a PDF occur via vtable function calls from an object returned from ConvertToPDF_x86!CreateFXPDFConvertor. However, this could also be found manually by debugging the application, adding a breakpoint on file read accesses to the image we supply as a parameter to the conversion function, and then walking the call stack.

To start our harness, I decided to analyse how the actual FoxitReader.exe process sets up objects required for the conversion function by setting a breakpoint for the CreateFXPDFConvertor function.

Next, by stepping out and setting a breakpoint on all the vtable function pointers for the returned object, we can discover what order these functions are called in along with their parameters, as this will be necessary for us to set up the object before calling the actual conversion routine.

We know how to view the vtable because the pointer to the vtable is the first 4 bytes (32-bit) when dumping the object.
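The layout being relied on here can be shown with a small C model of an object whose first pointer-sized field is its vtable pointer. Everything in the sketch is hypothetical (Convertor, ConvertorVtbl, the sample_* functions); it does not reflect Foxit’s actual classes, and the pointer is 4 bytes only when built for a 32-bit target like the one discussed in this post.

```c
#include <assert.h>
#include <string.h>

typedef struct Convertor Convertor;

/* Table of function pointers, standing in for a C++ vtable. */
typedef struct {
    int (*first_method)(Convertor *self);
    int (*second_method)(Convertor *self);
} ConvertorVtbl;

struct Convertor {
    const ConvertorVtbl *vtbl; /* first pointer-sized field of the object */
    int state;
};

static int sample_first(Convertor *self)  { return self->state + 1; }
static int sample_second(Convertor *self) { return self->state * 2; }
static const ConvertorVtbl sample_vtbl = { sample_first, sample_second };

/* Recover the vtable address the way a memory dump would: by reading the
 * first pointer-sized bytes of the object, without knowing its type. */
const void *vtable_from_object(const void *object)
{
    const void *vtbl;
    assert(object != NULL);
    memcpy(&vtbl, object, sizeof vtbl);
    return vtbl;
}
```

Once the table address is known, each slot in it is a candidate breakpoint location, which is how the call order and parameters can be recorded.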

During this process we can notice multiple differences compared to the older versions of FoxitReader, including changes to existing function prototypes and the introduction of new vtable functions that must be called.

After executing and noting the details of execution, we hit the main conversion function from the vtable of our object. Here we can analyse the main parameter (some sort of conversion buffer structure) by viewing its memory and noting its contents.

First we see that the initial 4 bytes are a pointer to an offset within the FoxitReader.exe image.

This means our harness will have to load the FoxitReader image in-memory to also supply a valid pointer (we also have to fix its IAT and modify the image too, as we discover after testing the harness).

Then we continue noting down the buffer’s contents, including the input file path at offset +0x1624, the output file path at offset +0x182c, and more (including a version string).

Finally after the conversion the object is released and the buffer is freed.

After noting all the above we can make a harness from the information discovered and test.

During testing, certain issues were discovered and accounted for, including exceptions in the FoxitReader.exe image that was loaded into memory, caused by its imports being used. This was fixed by fixing up the IAT when the image is loaded.

Additionally, calls to HeapAlloc were occurring where the heap handle was obtained via an offset in the FoxitReader image loaded in-memory, but that handle was uninitialised. This was fixed by writing the current process heap handle into the FoxitReader image at the offset HeapAlloc was expecting.

Overall the process was not long and the resulting harness allows for fuzzing of the ConvertToPDF functionality in-memory for FoxitReader 9.7.

EDR Observations

christopher vella

20 August 2020 at 15:13

EDR Primer

EDRs generally contain the following components:

Self-Protection
Hooking Engine
Virtualization/Sandbox/Emulation
Log/Alert Generation
Network Comms

Quick Primer: Kernel Callbacks

EDRs also utilize kernel callbacks as exposed by the Windows NT kernel, including:

PsSetCreateProcessNotifyRoutine
PsSetLoadImageNotifyRoutine
PsSetThreadCreateNotifyRoutine
ObRegisterCallbacks
CmRegisterCallbacks

Exported callback routines in ntoskrnl.exe

These callbacks may be used by kernel drivers such that when an event happens (process creation, registry modifications, handle creations, etc) the kernel driver is notified (pre or post op) and may interfere with the operation or result.

A common usage of this is for EDRs to be notified of process creations and inject their own userland DLLs (usually to hook NTDLL) in the newly created processes before they execute.

Additionally EDRs may intercept handle creation events and block those that occur on their protected processes (for example, in self-protection mode they may prevent other processes from obtaining handles to their processes).

Quick Primer: Disassembling Callbacks

Callbacks can be enumerated and disassembled on Windows via Kernel Debugging (or in-kernel disassembling e.g. by compiling a kernel driver with disassembly functionality such as via Capstone).

If using KD/WinDbg, we can leverage public symbols to first disassemble the function PsSetCreateProcessNotifyRoutine with the command u nt!PsSetCreateProcessNotifyRoutine.

Disassembly of Nt!PsSetCreateProcessNotifyRoutine in Windbg

We then follow any initial JMP (depending on the version of ntoskrnl.exe) to the main implementation of the function (e.g. nt!PspSetCreateProcessNotifyRoutine).

Continue disassembling the function and look for a LEA instruction on the callback array symbol. Callbacks are stored in arrays of an undocumented EX_CALLBACK structure from which we can discover the function pointer that points to the actual callback function registered for a particular driver.

LEA instruction operating on the callback array

As shown above, the callback array used in the LEA instruction on the last line (loaded into R13) also has the symbol nt!PspCreateProcessNotifyRoutine.

Next, we dump the contents of the callback array:

Here the command dq nt!PspCreateProcessNotifyRoutine was used to dump the contents of the callback array symbol as quadwords.

We can resolve the callback function registered for each of these callback entries by changing the last hex digit of an entry from F to 8. The resulting address contains a pointer to the function registered to the callback:

Above, we chose the first entry ffff998ae70d3b8f and changed its last hex digit so that the value becomes ffff998ae70d3b88. We then disassembled the function it points to using the command u poi(ffff998ae70d3b88), discovering that this is the callback function with the symbol nt!ViCreateProcessCallback.
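The F to 8 rewrite is plain pointer arithmetic, sketched below. It rests on two assumptions about these undocumented structures that match the example above but may differ between Windows builds: the low 4 bits of each array entry are bookkeeping bits rather than address bits, and the registered function pointer is stored 8 bytes into the block the entry refers to. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Clear the low 4 bits of an array entry to get the address of the
 * callback block it refers to (assumed layout, x64). */
uint64_t callback_block_base(uint64_t entry)
{
    return entry & ~(uint64_t)0xF;
}

/* Address of the slot holding the registered function pointer, assumed to
 * sit 8 bytes into the block. This is the value passed to poi() in WinDbg. */
uint64_t callback_function_slot(uint64_t entry)
{
    uint64_t base = callback_block_base(entry);
    assert((base & 0xF) == 0);
    return base + 8;
}
```

For the entry ffff998ae70d3b8f this yields ffff998ae70d3b88, the address used with u poi(...) above; the same calculation works for entries whose low digit is not F.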

Hooking

Hooking techniques are commonly used by EDRs to intercept userland functions for API monitoring or blocking. The following demonstrates a common use for hooking where an EDR registers for process callback notifications and injects a DLL into each newly created process, this DLL then hooks ntdll.dll functions to block/alert/monitor malicious behaviour (e.g. blocking calls to NtReadVirtualMemory where the target process handle represents the lsass process).

EDRs may also leverage sandbox, emulation or virtualization to run a binary in isolation and log API usage.

Common Weaknesses

The following list represents common weaknesses identified in multiple EDR solutions.

Binary Padding

Scanning and emulation of a binary may be used to detect malicious behaviour, however many EDRs (and AVs) have size limits on the files they will analyse.

As a result, appending junk to the end of a binary until it is roughly 100 MB in size may be enough to prevent the EDR/AV from analysing it (and due to the PE32/PE32+ format, junk appended at the end of an executable will not affect its execution).

This is effective against products that heavily rely on an emulation & scanning layer to detect threats.

Unmonitored APIs

Typical APIs used for malicious activity (e.g. combinations of VirtualAllocEx, WriteProcessMemory & CreateRemoteThread) may be alerted on by EDRs for process injection.

However, performing the same or similar actions with different sets of APIs may evade EDRs and go unnoticed.

For example, in the case of dumping sensitive process memory (like that from the lsass process) EDRs may not alert on handle creation of the target process, but may instead alert when an API like MiniDumpWriteDump or ReadProcessMemory is called on the target.

However, if we clone the target process with PssCaptureSnapshot and dump the memory of the cloned lsass process instead, we may bypass such detections. This stems from the following main factors:

Simple handle creations on a target process are permitted;
Cloning lsass is permitted; and
Dumping memory of non-sensitive processes is permitted.

By cloning lsass, the cloned lsass process doesn’t get the same protections by the EDR as the original lsass process, thereby permitting dumping of the lsass clone.

This can be performed using the Windows APIs, or by using tools like ProcDump.exe with the -r flag.

Another example is DLL injection via Windows hooks (e.g. leveraging the SetWindowsHookEx API). This method of process injection does not rely on the typical Windows injection methods of opening a process, writing into the process memory and then spawning a new thread, and can bypass typical process injection detections.

Breaking Process Trees

EDRs leverage process trees for detecting malicious behaviour (e.g. alerting if word.exe spawns cmd.exe). However, we can leverage COM objects such as C08AFD90-F2A1-11D1-8455-00A0C91F3880, which exposes the ShellExecute function, to spawn arbitrary processes under the explorer.exe process, even from within VBScript running under word.exe.

There are other techniques too (e.g. leveraging RPC) that may also be applicable to break process-tree based detections.

Attacking EDRs

EDR weaknesses also include certain design flaws that make them susceptible to subversion.

For example, as shown above, userland hooking may be key to an EDR’s detection capabilities (such that without it, the product may be rendered useless).

EDRs that hook userland APIs via hooking ntdll.dll may be subverted by loading a fresh copy of ntdll.dll into the process and redirecting (via hooks) our API calls to the newly loaded (and unhooked by EDR) ntdll.

This technique alone for bypassing EDR hooks may be enough to then perform malicious actions (like lsass dumping) without any alerts or detections.

EDRs also expose a lot of attack surface due to their massive codebase (drivers, IPC, support for various file formats) that may make them susceptible to a range of 0-day vulnerabilities. As such, proper testing of these products should be a priority.