Normal view

There are new articles available, click to refresh the page.
Before yesterdayBlog - Signal Labs

Announcing Self-Paced Trainings!

30 April 2022 at 16:44

Self-paced trainings are arriving for all existing public trainings, this includes:

  • Vulnerability Research & Fuzzing

  • Reverse Engineering

  • Offensive Tool Development

  • Misc workshops

This change comes from both interest from previous students & my own preference to learn via pre-recorded content.

Features of self-paced trainings include:

  • Pre-recorded content that matches the 4-day live training versions

    • Includes all the materials you’d normally get in the 4-day live version

    • Includes a free seat on the next 4-day live version (pending seat availability)

  • Unlimited discussions via email/twitter/discord with instructor

  • Free and paid workshops / mini-trainings on various topics

    • I also take requests on workshops / mini-trainings / topics you’d like to see

Different platforms for hosting the self-paced versions have been considered, currently we’re experimenting with the Thinkific platform and are in the process of modifying & uploading all the recorded content (I recently relocated from Australia to USA — this has delayed the self-paced development a bit, but a lot of content is currently uploaded).

While the self-paced versions are being edited and uploaded, I’m offering access to it at a discounted rate (20% off!), this gets you:

  • Access to draft versions of the training content as they’re developed

  • Lifetime Access to the training once completed

Once a particular training has been finalized, the discount for it will no longer be offered.

You can find the draft self-paced training offerings (as they’re developed) here: https://signal-labs.thinkific.com/collections

(Link will be updated when training is finalized)


For any questions feel free to contact us via email at [email protected]

Happy Hacking!

Finding a Kernel 0-day in VMware vCenter Converter via Static Reverse Engineering

26 January 2022 at 22:40

I posted a poll on twitter (Christopher on Twitter: "Next blog topic?" / Twitter) to decide on what this blog post would be about, and the results indicated it should be about Kernel driver reversing.

I figured I’d make it a bit more exciting by finding a new Kernel 0-day to integrate into the blog post, and so I started thinking what driver would be a fun target.
I’ve reversed VMware drivers before, primarily ones relating to their Hypervisor, but I’ve also used their vCenter Converter tool before and wondered what attack surface that introduces when installed.

Turns out it installs a Kernel component (vstor2-x64.sys) which is interactable via low-privileged users, we can see this driver installed with the name “vstor2-mntapi20-shared” in the “Driver” directory using Sysinternals’ WinObj.exe tool.

To confirm low-privileged users can interact with this driver, we take a look at the “Device” directory.
Drivers have various ways of communicating with user-land code, one common method is for the driver to expose a device that user-land code can open a handle to (using the CreateFile APIs), we find the device with the same name, double-click it and view its security attributes:

We see in the device security properties that the “everyone” group has read & write permissions, this means low-privileged users can obtain a handle to the device and use it to communicate to the driver.

Note that the driver and device names in these directories are set in the driver’s DriverEntry when it is loaded by Windows, first the device is created using IoCreateDevice, usually followed by a symbolic link creation using IoCreateSymbolicLink to give access to user-land code.

When a user-land process wants to communicate with a device driver, it will obtain a file handle to the device. In this case the code would look like:

#define USR_DEVICE_NAME L"\\\\.\\vstor2-mntapi20-shared"

HANDLE hDevice = CreateFileW(USR_DEVICE_NAME,

GENERIC_READ | GENERIC_WRITE,

FILE_SHARE_READ | FILE_SHARE_WRITE,

NULL,

OPEN_EXISTING,

0,

NULL);

This code results in the IRP_MJ_CREATE_HANDLER dispatch handler for the driver being called, this dispatch handler is part of the DRIVER_OBJECT for the target driver, which is the first argument to the driver’s DriverEntry, this structure has a MajorFunction array which can be set to function pointers that will handle callbacks for various events (like the create handler being called when a process opens a handle to the device driver)

In the image above we know the first argument to DriverEntry for any driver is a pointer to the DRIVER_OBJECT structure, with this information we can follow where this variable is used to find the code that sets the function pointers for the MajorFunction array.

We can find out which MajorFunction index maps to which IRP_MJ_xxx function by looking at sample code provided by Microsoft, specifically on line 284 here.

Since we now know which array index maps to which function, we rename the functions with meaningful names as shown in the image above (e.g. we name entry 0xe to ioctl_handler, as it handles DeviceIoControl messages from processes.

The read & write callbacks are called when a process calls ReadFile or WriteFile on the device handle, there are other callbacks too which we won’t go through.

To start with, lets analyze the irp_mj_create handler and see what happens when we create a handle to this device driver.

By default, this is what we see:

Firstly, we can improve decompilation by setting the correct types for a1 and a2, which we know must conform to the DRIVER_DISPATCH specification.

Doing so results in the following:

There’s a few things happening in this function, two important structures shown that are usually important are:

  • DeviceExtension object in the DEVICE_OBJECT structure

  • FsContext object in the IRP->CurrentStackLocation->FileObject structure

The DeviceExtension object is a pointer to a buffer created and managed by the driver object. It is accessible to the driver via the DEVICE_OBJECT structure (and thus accessible to the driver in all DRIVER_DISPATCH callbacks. Drivers typically create and use this buffer to manage state, variables & other information the driver wants to be able to access in a variety of locations (for example, if the driver supports various functions to Open, Read, Write or Close TCP connections via IOCTLs, the driver may store its current state (e.g. whether the connection is Open or Closed) in this DeviceExtension buffer, and whenever the Close function is called, it will check the state in the DeviceExtension buffer to ensure its in a state that can be closed), essentially its just a buffer that the driver uses to store/retrieve information from a variety of contexts/functions.

The FsContext structure is similar and can be used as an arbitrary buffer, the main difference is that the DEVICE_OBJECT structure is created by the driver during the IoCreateDevice call, which means the DeviceExtension buffer does not get torn down or re-created when a user process opens or closes a handle to the device, while the FsContext structure is associated with a FILE_OBJECT structure that is created when CreateFile is called, and destroyed when the handle is closed, meaning the FsContext buffer is per-handle.

From the decompiled code we see that a buffer of 0x20 size is allocated and set to be the FsContext structure, and we also see that the first 64bits of this structure is set to v5 in the code, which corresponds to the DeviceExtension pointer, meaning we already figured out that the FsContext struct contains a pointer to the DeviceExtension as its first element.

E.g.

struct FsContext {

PVOID pDevExt;

};

Figuring out the rest of the elements to the FsContext and DeviceExtension structures is a simple but sometimes tedious process of looking at all the DRIVER_DISPATCH functions for the driver (like the ioctl handler) and noting down what offsets are accessed in these structs and how they’re used (e.g. if offset 0x8 in the DeviceExtension is used in a KeAcquireSpinLockRaiseToDpc call, then we know that offset is a pointer to a KSPIN_LOCK object).

Taking the time to documents the structures this way pays off, it helps greatly when trying to understanding the decompilation, as with some effort we can transform the IRP_MJ_CREATE handler to look like the below:

When looking at the FsContext structure for example, we can open Ida’s Local Types window and create it using C syntax, which I created below:

Note that as you figure out what each element is, you can define the elements as random junk and rename/retype them as you go (so long as you know the size of the structure, which we get easily here via the 0x20 size argument to ExAllocatePoolWithTag).

Now that we’ve analyzed the IRP_MJ_CREATE handler and determined there’s nothing stopping us from creating a handle, we can look into how the driver handles Read, Write & DeviceIOControl requests from user processes.

In analyzing these handlers, we see heavy usage of the FsContext and DeviceExtension buffers, including checks on whether its contents are initialized.

Turns out, there are quite a few vulnerabilities in this driver that are reachable if you form your input correctly to hit their code paths, while I won’t go through all of them (some are still pending disclosure!), we will take a look at one which is a simple user->kernel DoS.

In IOCTL 0x2A0014 we see the DeviceExtension buffer get memset to 0 to clear its contents:

This is followed by a memmove that copies 0x100 bytes from the user’s input buffer to the DeviceExtension buffer, meaning those byte offsets we copy into are user controlled (I denote this with a _uc tag at the end of the variable name:

During this IOCTL, another field in the DeviceExtension also gets set (which seems to indicate that the DeviceExtension buffer has been initialized):

This is critical to triggering the bug (which we will see next).

So, the actual bug doesn’t live in the IOCTL handlers, instead it lives in the IRP_MJ_READ and IRP_MJ_WRITE handlers (note that in this case the READ and WRITE handlers are the same function, they just check the provided IRP to determine if the operation is a READ or WRITE).

In this handler, we can see a check to determine if the DeviceExtension’s some_if_field has been initialized:

After clearing this condition, the bug can be seen in sub_12840 in the following condition statement:

Here we see I denoted the unkn13 variable in the DeviceExtension buffer with _uc, this means its user controlled (in fact, its set during the memmove call we saw earlier).

From the decompilation we see that the code does a % operation on our user controllable value, this translates to a div instruction:

If you’re familiar with X86, you’ll know that a div instruction on the value 0 causes a divide-by-zero exception, we can easily trigger this here by provided an input buffer filled with 0 when we call the IOCTL 0x2A0014 to set the user controllable contents in the DeviceExtension buffer, then we can trigger this code by attempting to read/write the device handle using ReadFile or WriteFile APIs.

In fact there are multiple ways to trigger this, as the DeviceExtension buffer is essentially a global buffer, and no locking is used when reading this value, there exist race conditions where one thread is calling IOCTL 0x2A0014 and another is calling the read or write handler, such that this div instruction may be hit right after the memset operation in IOCTL 0x2A0014 clears the DeviceExtension buffer to 0.

In fact, there are multiple locations such race conditions would affect the code paths taken in this driver!

Overall, this driver is a good target for reverse engineering practice with Kernel drivers due to its use of not only IOCTLs, but also read & write handlers + the use of the FsContext and DeviceExtension buffers that need to be reversed to understand what the driver is doing, and how we can influence it. All the bugs found in this driver were purely from static reverse engineering as a fun exercise.

Interested in Reverse Engineering & Vulnerability Research Training?

We frequently run public sessions (or private sessions upon request) for trainings in Reverse Engineering & Vulnerability Research, see our Upcoming Trainings or Subscribe to get notified of our next public session dates.

Emulating File I/O for In-Memory Fuzzing

12 October 2020 at 14:12

One problem I’ve encountered during fuzzing is how to best fuzz an application that performs multiple file reads on an input file, but in a performant way (e.g. in-memory without actually touching disk). For example, say an application takes in an input file path from a user and parses it, if the application loads the entire file into a single buffer to parse, this is simple to fuzz in-memory (we can modify the buffer in-memory and resume), however if the target does multiple reads on a file from disk, how can we fuzz performantly?

Of course if we’re fuzzing by replacing the file on disk for each fuzz case we can fuzz such a target, but for performance if we’re fuzzing entirely in-memory (or using a snapshot-fuzzer that doesn’t support disk-based I/O) we need to ensure each read operation the target performs on our input does not actually touch disk, but instead reads from memory.

The method I decided to implement for my fuzzing was to hook the different file IO operations (e.g. ReadFile) and implement my own custom handlers for these functions that redirects the read operations to memory instead of disk, this has multiple benefits:

  1. We eliminate syscalls, as lots of file operations result in syscalls and my custom handler does not use syscalls, we avoid context switching into the kernel and obtain better perf

  2. We keep track of different file operations but it all operates on a memory-mapped version of our input file, this means we can mutate the entire mem-mapped file once and guarantee all ReadFile calls will be on our mutated Memory-mapped file

The normal operation of reading a file (without using my hooks) is:

  1. CreateFile is called on a file target

  2. ReadFile is used on the target to read into a buffer (resulting in syscalls and disk IO)

  3. Process parses the buffer

  4. ReadFile is used on the target to read more from the file on disk

  5. Process continues to parse the buffer

Process Reading from Disk without Hooks

With our hooks, the operations instead look like:

  1. CreateFile is called on a file target (our hook memory maps the target once entirely in-memory)

  2. ReadFile is used on the target to read into a buffer (resulting in our custom ReadFile implementation to be called via our hook, and we handle the ReadFile by returning contents from our in-memory copy of the file, resulting in no syscalls or Disk IO)

  3. Process parses the buffer

  4. ReadFile is used on the target to read more from the file (in-memory again, just like the first ReadFile)

  5. Process continues to parse the buffer

Process Reading a File with our Hooks (In-Memory)

This greatly simplifies mutation and eliminates syscalls for the file IO operations.

The implementation wasn’t complex, MSDN has good documentation on how the APIs perform so we can emulate them, alongside writing a test suite to verify our emulation accuracy.

The code for this can be found on my GitHub: https://github.com/Kharos102/FileHook

Fuzzing FoxitReader 9.7’s ConvertToPDF

21 August 2020 at 15:12

Inspiration to create a fuzzing harness for FoxitReader’s ConvertToPDF function (targeting version 9.7) came from discovering Richard Johnson’s fuzzer for a previous version of FoxitReader.

(found here: https://www.cnblogs.com/st404/p/9384704.html).

Multiple changes have since been introduced in the way FoxitReader converts an image to a PDF, including the introduction of new Vtables entries, the necessity to load in the main FoxitReader.exe binary (including fixing the IAT and modifying data sections to contain valid handles to the current process heap) + more.

The source for my version of the fuzzing harness targeting version 9.7 can be found on my GitHub: https://github.com/Kharos102/FoxitFuzz9.7

Below is a quick walkthrough of the reversing and coding performed to get this harness working.

Firstly — based on the existing work from the previous fuzzers available, we know that most of the calls for the conversion of an image to a PDF occur via vtable function calls from an object returned from ConvertToPDF_x86!CreateFXPDFConvertor, however this could also be found manually by debugging the application and adding a breakpoint on file read accesses to the image we supply as a parameter to the conversion function, and then walking the call stack.

To start our harness, I decided to analyse how the actual FoxitReader.exe process sets up objects required for the conversion function by setting a breakpoint for the CreateFXPDFConvertor function.

Next, by stepping out and setting a breakpoint on all the vtable function pointers for the returned object, we can discover what order these functions are called along with their parameters as this will be necessary for us to setup the object before calling the actual conversion routine.

Dumping the Object’s VTable

We know how to view the vtable as the pointer to the vtable is the first 4-bytes (32bit) when dumping the object.

During this process we can notice multiple differences compared to the older versions of FoxitReader, including changes to existing function prototypes and the introduction of new vtable functions that require to be called.

After executing and noting the details of execution, we hit the main conversion function from the vtable of our object, here we can analyse the main parameter (some sort of conversion buffer structure) by viewing its memory and noting its contents.

First we see the initial 4-bytes are a pointer to an offset within the FoxitReader.exe image

Buffer Structure Analysis

This means our harness will have to load the FoxitReader image in-memory to also supply a valid pointer (we also have to fix its IAT and modify the image too, as we discover after testing the harness).

Then we continue noting down the buffer’s contents, including the input file path at offset +0x1624, the output file path at offset +0x182c, and more (including a version string).

Finally after the conversion the object is released and the buffer is freed.

After noting all the above we can make a harness from the information discovered and test.

During testing, certain issues where discovered and accounted for, including exceptions in FoxitReader.exe that was loaded into memory, due to imports being used, this was fixed by fixing up the process IAT when loaded.

Additionally, calls to HeapAlloc were occurring where the heap handle was obtained via an offset in the FoxitReader image loaded in-memory, however it was uninitialised, this was fixed by writing the current process heap handle into the FoxitReader image at the offset HeapAlloc was expecting.

Overall the process was not long and the resulting harness allows for fuzzing of the ConvertToPDF functionality in-memory for FoxitReader 9.7.

EDR Observations

20 August 2020 at 15:13

EDR Primer

EDRs generally contain the following components:

  • Self-Protection

  • Hooking Engine

  • Virtualization/Sandbox/Emulation

  • Log/Alert Generation

  • Network Comms

Quick Primer: Kernel Callbacks

EDRs also utilize kernel callbacks as exposed by the windows NT kernel, including:

  • PsSetCreateProcessNotifyRoutine

  • PsSetLoadImageNotifyRoutine

  • PsSetThreadCreateNotifyRoutine

  • ObRegisterCallbacks

  • CmRegisterCallbacks

Exported callback routines in ntoskrnl.exe

These callbacks may be used by kernel drivers such that when an event happens (process creation, registry modifications, handle creations, etc) the kernel driver is notified (pre or post op) and may interfere with the operation or result.

A common usage of this is for EDRs to be notified of process creations and inject their own userland DLLs (usually to hook NTDLL) in the newly created processes before they execute.

Additionally EDRs may intercept handle creation events and block those that occur on their protected processes (for example, in self-protection mode they may prevent other processes from obtaining handles to their processes).

Quick Primer: Disassembling Callbacks

Callbacks can be enumerated and disassembled on Windows via Kernel Debugging (or in-kernel disassembling e.g. by compiling a kernel driver with disassembly functionality such as via Capstone).

If using KD/Windbg, we can leverage public symbols to first disassemble the function PsSetCreateProcessNotifyRoutine with the command u nt!PsSetCreateProcessNotifyRoutine

Disassembly of Nt!PsSetCreateProcessNotifyRoutine in Windbg

We then follow any initial JMP (depending on the version of ntoskrnl.exe) to the main implementation of the function (e.g. nt!PspSetCreateProcessNotifyRoutine)

Continue disassembling the function and look for a LEA instruction on the callback array symbol. Callbacks are stored in arrays of an undocumented EX_CALLBACK structure from which we can discover the function pointer that points to the actual callback function registered for a particular driver.

LEA instruction operating on the callback array

As shown above, the callback array used in the LEA instruction on the last line (loaded into R13) also has the symbol nt!PspCreateProcessNotifyRoutine).

Next, we dump the contents of the callback array:

Dumping Contents of the Callback Array

Here the command dq nt!PspCreateProcessNotifyRoutine was used to dump the contents of the callback array symbol as quadwords.

We can resolve the callback function registered for each of these callback entries by changing the last byte of an entry from F to 8, this will contain a pointer to the function registered to the callback:

Disassembling a Callback Function

Above, we chose the first entry ffff998ae70d3b8f, then we change the last byte such that the value becomes ffff998ae70d3b88 then we disassembled it as instructions using the command u poi(ffff998ae70d3b88) discovering that this function is the callback function with the symbol nt!ViCreateProcessCallback.

Hooking

Hooking techniques are commonly used by EDRs to intercept userland functions for API monitoring or blocking. The following demonstrates a common use for hooking where an EDR registers for process callback notifications and injects a DLL into each newly created process, this DLL then hooks ntdll.dll functions to block/alert/monitor malicious behaviour (e.g. blocking calls to NtReadVirtualMemory where the target process handle represents the lsass process).

Process Injection via Callbacks

EDRs may also leverage sandbox, emulation or virtualization to run a binary in isolation and log API usage.

Common Weaknesses

The following list represents common weaknesses identified in multiple EDR solutions

Binary Padding

Scanning and emulation of a binary may be used to detect malicious behaviour, however many EDRs (and Ads) have file size limitations on the file to analyse.

As a result, by appending junk to the end of a binary until it is roughly 100mb in size may be enough to prevent the EDR/AV from analysing it (and due to the PE32/PE32+ format, junk appended at the end of an executable will not affect its execution).

This is effective against products that heavily rely on an emulation & scanning layer to detect threats.

Unmonitored APIs

Typical APIs used for malicious activity (e.g. combinations of VirtualAllocEx, WriteProcessMemory & CreateRemoteThread) may be alerted on by EDRs for process injection.

However, performing the same or similar actions with different sets of APIs may evade EDRs and go unnoticed.

For example, in the case of dumping sensitive process memory (like that from the lsass process) EDRs may not alert on handle creation of the target process, but may instead alert when an api like MiniDumpWriteDump or ReadProcessMemory is called on the target.

However, if we clone the target process with PssCaptureSnapshot and dump the memory of the cloned lsass process instead, we may bypass such detections. This stems from the following main factors:

  1. Simple handle creations on a target process are permitted;

  2. Cloning lsass is permitted; and

  3. Dumping memory of non-sensitive processes are permitted

By cloning lsass, the cloned lsass process doesn’t get the same protections by the EDR as the original lsass process, thereby permitting dumping of the lsass clone.

This can be performed using the Windows APIs, or by using tools like ProcDump.exe with the -r flag.

Another example is DLL injection via Windows hooks (e.g. leveraging SetWindowsHookEx api), this method of process injection does not rely on the typical Windows injection methods of opening a process, writing into the process memory and then spawning a new thread, and can bypass typical process injection detections.

Breaking Process Trees

EDRs leverage process trees for detecting malicious behaviour (e.g. alerting if word.exe spawns cmd.exe), however we can leverage COM objects such as C08AFD90-F2A1-11D1-8455-00A0C91F3880 that exposes the ShellExecute function to spawn arbitrary processes under the explorer.exe process, even from within VBScript running under word.exe.

There are other techniques too (e.g. leveraging RPC) that may also be applicable to break process-tree based detections.

Attacking EDRs

EDR weaknesses also include certain design flaws that make them susceptible to subversion.

For example, as shown above, userland hooking may be key to an EDR’s detection capabilities (such that without it, the product may be rendered useless).

EDRs that hook userland APIs via hooking ntdll.dll may be subverted by loading a fresh copy of ntdll.dll into the process and redirecting (via hooks) our API calls to the newly loaded (and unhooked by EDR) ntdll.

This technique along for bypassing EDR hooks may be enough to then perform malicious actions (like lsass dumping) without any alerts or detections.

EDRs also expose a lot of attack surface due to their massive codebase (drivers, IPC, support for various file formats) that may make them susceptible to a range of 0-day vulnerabilities, as such proper testing of these products should be a priority.

❌
❌