ZecOps is excited to announce the release of ZecOps for Mobile 2.0, which includes full support for Android. With this release, ZecOps has extended its best-in-class automatic digital forensics capabilities to the two most widespread and important mobile operating systems in the world, iOS and Android.
We see it in the news every day: sophisticated threat actors can bypass all existing security defenses. But attackers make mistakes, and these mistakes lead to sudden reboots, crashes, appearances in logs / OS telemetry, bugs, errors, battery loss, and other “unexplained” anomalies. ZecOps for Mobile analyzes the associated events against databases of attack techniques, common weaknesses (CWEs), and common vulnerabilities (CVEs). ZecOps’s core technology uses machine learning for insights, correlation, and identification of anomalous behavior that indicates 0-day attacks. Following a quick investigation, ZecOps produces a detailed assessment of if, when, and how a mobile device has been compromised.
World-leading governments, defense agencies, enterprises, and VIPs rely on ZecOps to automate their advanced investigations, greatly improving their threat intelligence, threat detection, APT hunting, and risk & compromise assessment capabilities. With support for Android, ZecOps can now extend this threat intelligence across an entire organization’s mobile footprint.
Supported versions:
Android 8 and above – until latest
iOS 10 and above – until latest
Supported HW Models:
All device models are supported on both Android and iOS.
ZecOps provides the most thorough operating system telemetry analysis as part of its advanced digital forensics. By focusing on the trails that hackers leave (“Attackers’ Mistakes”), ZecOps can provide sophisticated security organizations with critical information on the attackers’ tools, advanced persistent threats, and even discovery of attacks leveraging zero-day vulnerabilities.
In the past few years, XNU has had a few vulnerabilities in newly added or changed code (extra_recipe, kq double release) and in the content filter area (bug collision UAF, silently patched UAF). So it is no surprise that the combination of newly added code in a complex area (content filter) with a funny comment caught our attention.
0x1- Discovery story
Upon a closer look at the newly added xnu source of Darwin 19 you might notice a strange comment in content_filter.c:
/*
* TO DO LIST
*
* SOONER:
*
* Deal with OOB
*
* LATER:
*
* If support datagram, enqueue control and address mbufs as well
*/
Is this comment referring to OOB read/write issues? Probably not, but it won’t hurt to run a quick search for those. Using the magic tool Cmd+F to search for memcpy calls, in less than two minutes you will find the following:
0x2- The bug.
The newly updated cfil_sock_attach function is easily reached from tcp_usr_connect and tcp_usr_connectx with controlled variables:
errno_t
cfil_sock_attach(struct socket *so, struct sockaddr *local, struct sockaddr *remote, int dir) // (Part A)
{
    errno_t error = 0;
    uint32_t filter_control_unit;

    socket_lock_assert_owned(so);

    /* Limit ourselves to TCP that are not MPTCP subflows */
    if ((so->so_proto->pr_domain->dom_family != PF_INET &&
        so->so_proto->pr_domain->dom_family != PF_INET6) ||
        so->so_proto->pr_type != SOCK_STREAM ||
        so->so_proto->pr_protocol != IPPROTO_TCP ||
        (so->so_flags & SOF_MP_SUBFLOW) != 0 ||
        (so->so_flags1 & SOF1_CONTENT_FILTER_SKIP) != 0) {
        goto done;
    }

    filter_control_unit = necp_socket_get_content_filter_control_unit(so);
    if (filter_control_unit == 0) {
        goto done;
    }

    if (filter_control_unit == NECP_FILTER_UNIT_NO_FILTER) {
        goto done;
    }
    if ((filter_control_unit & NECP_MASK_USERSPACE_ONLY) != 0) {
        OSIncrementAtomic(&cfil_stats.cfs_sock_userspace_only);
        goto done;
    }
    if (cfil_active_count == 0) {
        OSIncrementAtomic(&cfil_stats.cfs_sock_attach_in_vain);
        goto done;
    }
    if (so->so_cfil != NULL) {
        OSIncrementAtomic(&cfil_stats.cfs_sock_attach_already);
        CFIL_LOG(LOG_ERR, "already attached");
    } else {
        cfil_info_alloc(so, NULL);
        if (so->so_cfil == NULL) {
            error = ENOMEM;
            OSIncrementAtomic(&cfil_stats.cfs_sock_attach_no_mem);
            goto done;
        }
        so->so_cfil->cfi_dir = dir;
    }
    if (cfil_info_attach_unit(so, filter_control_unit, so->so_cfil) == 0) {
        CFIL_LOG(LOG_ERR, "cfil_info_attach_unit(%u) failed",
            filter_control_unit);
        OSIncrementAtomic(&cfil_stats.cfs_sock_attach_failed);
        goto done;
    }
    CFIL_LOG(LOG_INFO, "so %llx filter_control_unit %u sockID %llx",
        (uint64_t)VM_KERNEL_ADDRPERM(so),
        filter_control_unit, so->so_cfil->cfi_sock_id);

    so->so_flags |= SOF_CONTENT_FILTER;
    OSIncrementAtomic(&cfil_stats.cfs_sock_attached);

    /* Hold a reference on the socket */
    so->so_usecount++;

    /*
     * Save passed addresses for attach event msg (in case resend
     * is needed.
     */
    if (remote != NULL) {
        memcpy(&so->so_cfil->cfi_so_attach_faddr, remote, remote->sa_len); // Part B
    }
    if (local != NULL) {
        memcpy(&so->so_cfil->cfi_so_attach_laddr, local, local->sa_len); // Part C
    }

    error = cfil_dispatch_attach_event(so, so->so_cfil, 0, dir);
    /* We can recover from flow control or out of memory errors */
    if (error == ENOBUFS || error == ENOMEM) {
        error = 0;
    } else if (error != 0) {
        goto done;
    }

    CFIL_INFO_VERIFY(so->so_cfil);
done:
    return error;
}
We can see that the function receives two sockaddr parameters, local and remote (Part A), which are user controlled, and then uses their sa_len members (remote in Part B, local in Part C) to copy data into cfi_so_attach_faddr and cfi_so_attach_laddr. Parts A, B, and C are all the result of new changes in XNU.
So what’s the problem? There is no check on sa_len, which can be set as high as 255 and is then used in a memcpy into a union sockaddr_in_4_6, which is only 28 bytes long, resulting in a buffer overflow.
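To make the size mismatch concrete, here is a minimal user-mode sketch of the kind of length check that is missing. It is not Apple's patch: the mock types and the helper copy_sockaddr_checked are hypothetical stand-ins, and only the sizes (a one-byte sa_len, a 28-byte destination) come from the text above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for struct sockaddr; only the sizes matter here. */
struct mock_sockaddr {
    uint8_t sa_len;        /* total length claimed by the caller, 0..255 */
    uint8_t sa_family;
    uint8_t sa_data[254];  /* padded so any sa_len stays inside the source */
};

/* Stand-in for union sockaddr_in_4_6: 28 bytes of storage. */
union mock_sockaddr_in_4_6 {
    uint8_t raw[28];
};

/*
 * Hypothetical helper: copy a caller-supplied address only if the claimed
 * length fits in the destination. Returns 0 on success, -1 if rejected.
 */
static int copy_sockaddr_checked(union mock_sockaddr_in_4_6 *dst,
                                 const struct mock_sockaddr *src)
{
    if (src->sa_len > sizeof(*dst)) {
        return -1; /* an unchecked memcpy would write past dst here */
    }
    memcpy(dst, src, src->sa_len);
    return 0;
}
```

With sa_len set to 28 the copy succeeds; with sa_len set to 255 the helper rejects the address instead of writing 227 bytes past the end of the destination.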
The PoC below is almost identical to Ian Beer’s mptcp PoC, with two changes. It has a prerequisite for reaching the vulnerable area: to trigger the vulnerability we need either an MDM-enrolled device with an NECP policy, or a socket attached to a valid filter_control_unit. One way to do the latter is to create one with cfilutil and then manually write it to kernel memory using a kernel debugger.
Here is a picture of the vulnerable part in macOS 10.15.1 compiled kernel (before the issue was reported):
Here is a picture of the vulnerable part in macOS 10.15.6 compiled kernel (after the issue was reported):
The panic call with memcpy_chk is gone along with the patch!
Did the original developer know this function was vulnerable and place it there as a placeholder until a proper patch? Your guess is as good as ours.
Also note that the call to memcpy_chk before real_mode_bootstrap_end (which is a wrapper around memcpy) is what kept this issue from being exploitable.
0x4- What can we take from this?
Read comments: they might give us valuable information
Newly added code is oftentimes buggy
Content filter code is complex and tricky
With Pangu’s recent blog post and Ian Beer’s mptcp bug, we can see that sockaddr->sa_len has already caused multiple issues and should be audited more carefully.
0x5- Attacks in the wild?
This issue is not dangerous. During our investigation of this bug, ZecOps checked its targeted threat intelligence database and saw no active attacks associated with this issue. We still advise updating to the latest version to receive all other updates.
In the previous part of the series, SMBleedingGhost Writeup Part II: Unauthenticated Memory Read – Preparing the Ground for an RCE, we described two techniques that allow us to read uninitialized memory from the pool buffers allocated by the SrvNetAllocateBuffer function of the srvnet.sys module. The first technique accomplishes that by crafting a special SMB packet and deducing information from the server’s response. The second technique, which has fewer limitations, does that by sending specially crafted compressed data and deducing information depending on whether the server drops the connection.
The next thing we had to understand was: what can be done with this reading ability? As a reminder, we began this research with a write-what-where primitive that we demonstrated in our previous research about achieving local privilege escalation. Since most of the memory layout in modern Windows versions is randomized, we need to have at least one pointer to be able to do something useful with the write-what-where primitive. Unfortunately, memory allocated with the SrvNetAllocateBuffer function is mostly used for network data such as SMB packets and doesn’t contain system pointers. We could try to read uninitialized memory left by a previous allocation that wasn’t done with SrvNetAllocateBuffer, but it would be difficult to predict where to look for a pointer in this case, especially since we can’t run code on the target computer that could help us groom the pool (unlike in the case of a local privilege escalation, for example). So we started looking for something more reliable.
SrvNetAllocateBuffer and the allocated buffer layout
As we already mentioned in our local privilege escalation research, the SrvNetAllocateBuffer function doesn’t just return a buffer with the requested size. Instead, it returns a pointer to a struct that is located at the bottom of the pool-allocated memory block, containing information about the allocated buffer. The layout of the pool-allocated memory block is the following:
While our reading technique can only read bytes from the “User buffer” region, we can use the integer overflow bug to copy parts of the SRVNET_BUFFER_HDR struct to the “User buffer” region of another buffer, which we can then read. We can do that by setting the Offset field to point at the SRVNET_BUFFER_HDR struct beyond the data we want to read. We just need to make sure that the data that is located there can be interpreted as valid compressed data, otherwise the copying won’t happen.
Hunting for pointers
Let’s take a look at the fields of the SRVNET_BUFFER_HDR struct and see whether there’s something worth reading:
#pragma pack(push, 1)
struct SRVNET_BUFFER_HDR {
    /*00*/ LIST_ENTRY ConnectionBufferList;      // (orange)
    /*10*/ WORD BufferFlags;                     // 0x01 - no transport header, 0x02 - part of a lookaside list
    /*12*/ WORD LookasideListIndex;              // 0 to 8
    /*14*/ WORD LookasideListLogicalProcessor;
    /*16*/ WORD TracingDataCount;                // 0, 1 or 2, for TracingPtr1/2, TracingUnknown1/2
    /*18*/ PBYTE UserBufferPtr;                  // (blue)
    /*20*/ DWORD UserBufferSizeAllocated;
    /*24*/ DWORD UserBufferSizeUsed;
    /*28*/ DWORD PoolAllocationSize;
    /*2C*/ BYTE unknown1[4];
    /*30*/ PBYTE PoolAllocationPtr;              // (blue)
    /*38*/ PMDL pMdl1;                           // (blue)
    /*40*/ DWORD BytesProcessed;
    /*44*/ BYTE unknown2[4];
    /*48*/ SIZE_T BytesReceived;
    /*50*/ PMDL pMdl2;                           // (blue)
    /*58*/ PVOID pSrvNetWskStruct;               // (orange)
    /*60*/ DWORD SmbFlags;
    /*64*/ PVOID TracingPtr1;                    // (orange)
    /*6C*/ SIZE_T TracingUnknown1;
    /*74*/ PVOID TracingPtr2;                    // (orange)
    /*7C*/ SIZE_T TracingUnknown2;
    /*84*/ BYTE unknown3[12];
};
#pragma pack(pop)
The colored variables are pointers. The blue-colored pointers all point inside the pool-allocated memory block, with offsets which can be calculated in advance, so it’s enough to read one of them. Having an absolute pointer to the pool-allocated memory block will surely be helpful. Regarding the orange-colored pointers:
ConnectionBufferList – A linked list of all of the received, unhandled buffers of a connection. The list head is a part of the connection object created by the SrvNetAllocateConnection function in srvnet.sys. A buffer is added to the list by the SrvNetWskReceiveComplete function. In our case, there will be only one buffer in the list, so both pointers (Flink and Blink of the LIST_ENTRY struct) will point to the list head inside the connection object.
pSrvNetWskStruct – Initially, a pointer to the connection object mentioned above. The pointer is set by the SrvNetWskReceiveEvent function, but is overridden by the SrvNetWskReceiveComplete function with the pointer to the SRVNET_BUFFER_HDR struct. Thus, reading it is not more useful than reading one of the other blue-colored pointers. By the way, if you search for “pSrvNetWskStruct“ you’ll find out that it played a role in exploiting EternalBlue.
TracingPtr1/2 – These pointers are only used when tracing is enabled, as it seems.
As you can see, the only other useful pointer for us to read is one of the pointers from the ConnectionBufferList struct. Both pointers (Blink and Flink of the LIST_ENTRY struct) point to the connection object. The object struct has been named SRVNET_RECV by EternalBlue researchers, so we’ll use this name as well.
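The offsets in the listing can be sanity-checked at compile time. The sketch below is a user-mode model of the header using fixed-width types in place of the Windows ones; the model's name and field names are our own, and the offsets asserted are simply those written in the listing's comments.

```c
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1)
/* Hypothetical user-mode mirror of SRVNET_BUFFER_HDR for a 64-bit target. */
struct srvnet_buffer_hdr_model {
    uint64_t connection_buffer_list[2]; /* LIST_ENTRY: Flink, Blink */
    uint16_t buffer_flags;
    uint16_t lookaside_list_index;
    uint16_t lookaside_list_logical_processor;
    uint16_t tracing_data_count;
    uint64_t user_buffer_ptr;
    uint32_t user_buffer_size_allocated;
    uint32_t user_buffer_size_used;
    uint32_t pool_allocation_size;
    uint8_t  unknown1[4];
    uint64_t pool_allocation_ptr;
    uint64_t p_mdl1;
    uint32_t bytes_processed;
    uint8_t  unknown2[4];
    uint64_t bytes_received;
    uint64_t p_mdl2;
    uint64_t p_srvnet_wsk_struct;
    uint32_t smb_flags;
    uint64_t tracing_ptr1;
    uint64_t tracing_unknown1;
    uint64_t tracing_ptr2;
    uint64_t tracing_unknown2;
    uint8_t  unknown3[12];
};
#pragma pack(pop)

/* Offsets copied from the comments in the listing above. */
_Static_assert(offsetof(struct srvnet_buffer_hdr_model, user_buffer_ptr) == 0x18, "UserBufferPtr");
_Static_assert(offsetof(struct srvnet_buffer_hdr_model, pool_allocation_ptr) == 0x30, "PoolAllocationPtr");
_Static_assert(offsetof(struct srvnet_buffer_hdr_model, p_mdl1) == 0x38, "pMdl1");
_Static_assert(offsetof(struct srvnet_buffer_hdr_model, p_mdl2) == 0x50, "pMdl2");
_Static_assert(offsetof(struct srvnet_buffer_hdr_model, p_srvnet_wsk_struct) == 0x58, "pSrvNetWskStruct");
_Static_assert(offsetof(struct srvnet_buffer_hdr_model, unknown3) == 0x84, "unknown3");
_Static_assert(sizeof(struct srvnet_buffer_hdr_model) == 0x90, "total size");
```

If the packing pragma is removed, the assertions after SmbFlags fail, which shows why the listing needs pack(1): TracingPtr1 sits at 0x64, which is not 8-byte aligned.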
Getting a module base address
Now that we know how to get the two pointers – a pointer to a pool-allocated memory block and a pointer to an SRVNET_RECV struct – we can freely modify the two buffers using the write-what-where primitive. There are probably several ways from this point to achieve RCE, but we had a feeling that getting a base address of a module would be the most straightforward option since there are so many things we can modify in a data section of a module. As we’ve seen, none of the pointers in a memory block allocated by SrvNetAllocateBuffer point to a module. We had hopes for the SRVNET_RECV struct, but we didn’t find pointers that point to a module there, too. On the bright side, there are several pointers to modules one additional dereference away:
At this point, we noticed that since we can override those pointers in SRVNET_RECV, we can call an arbitrary function by replacing the HandlerFunctions pointer and triggering one of the events, e.g. closing the connection so that Srv2DisconnectHandler is called. This will come in handy later, but we didn’t have any function pointers to call yet, so we continued with our attempt to get a module base address.
Unlike writing, reading those pointers is not as easy since our technique allows us to read only from the “User buffer” region. So close, yet so far. Since we can get and modify a pool-allocated memory block and an SRVNET_RECV struct, we hoped to find code that we can trigger that does a double-dereference-read followed by a double-dereference-write with two variables that we control, similar to the following:
ptr1 = *(pSrvNetRecv + offset1)
value = *ptr1
ptr2 = *(pSrvNetRecv + offset2)
*ptr2 = value
If we could find such a snippet, we would trigger it to copy the first pointer (e.g. HandlerFunctions) to the “User buffer” region, read it, then copy the second pointer (e.g. the Srv2ConnectHandler function pointer) to the “User buffer” region and read it as well, deducing the module base address from it. We searched for such a snippet for a long time, but didn’t find a good match. Finally, we settled for a sub-optimal option which nevertheless worked. Let’s take a look at the relevant part of the SrvNetFreeBuffer function (simplified):
Upon freeing the buffer, if buffer flags 0x02 (means the buffer is part of a lookaside list) and 0x01 (means the buffer has no transport header) are set, some operations are made on the two MDL objects to add the transport header before resetting the flags to zero and returning the buffer back to the lookaside list. If we set aside the meaning behind the operations on the MDL objects for a moment and look at the operations in terms of memory manipulation, we can notice that the code does a double-dereference-read followed by a double-dereference-write with two variables that we control (the two MDL pointers), which is what we were looking for. The downside is that the content that we want to read from is also modified (lines 13-16, 29), a side effect we hoped to avoid.
Given the above, here’s how we managed to read the AcceptSocket pointer:
1. Prepare buffer A from a lookaside list such that the “User buffer” region is filled with zeros. This buffer will end up holding the pointer that we’ll eventually read.
2. Prepare buffer B from a different lookaside list such that:
The pMdl1 pointer points at the address of the HandlerFunctions pointer minus 0x18, the offset of MappedSystemVa in the MDL struct.
The pMdl2 pointer points at the “User buffer” region of Buffer A.
The Flags field is set to 0x03.
We can override the SRVNET_BUFFER_HDR struct fields by decompressing them from a larger buffer using the technique described in the Observation #2 section of the previous part of the writeup.
3. When buffer B is freed, the following operations will take place:
The MDL flags will be read from the second MDL at buffer A. If the MDL_PARTIAL_HAS_BEEN_MAPPED flag is set, MmUnmapLockedPages will be called and the system will likely crash. That’s why we filled the buffer with zeros in step 1.
The HandlerFunctions pointer and the memory around it will be modified as depicted here:
The HandlerFunctions pointer and the memory around it will be read as depicted here:
+00 | __ __ __ __ __ __ __ __
+08 | __ __ __ __ __ __ __ __
+10 | __ __ __ __ __ __ __ __
+18 | ab cd ef gh ij kl mn op <-- HandlerFunctions
+20 | __ __ __ __ __ __ __ __
+28 | qr st uv wx __ __ __ __
The “User buffer” region of buffer A will be modified as depicted here: (The orange-colored bytes contain the pointer we want to read. We just need to order them properly.)
4. Read the AcceptSocket pointer from the “User buffer” region of buffer A.
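The pointer arithmetic in step 2 can be illustrated with a small model. The struct below is our own user-mode approximation of the x64 MDL header, with field names following the WDK as we understand it; the helper fake_mdl_for_target is hypothetical. The only fact taken from the text is that MappedSystemVa sits at offset 0x18.

```c
#include <stddef.h>
#include <stdint.h>

/* User-mode model of the x64 MDL header (an assumption, not the WDK type). */
struct mdl_model {
    uint64_t next;
    int16_t  size;
    int16_t  mdl_flags;
    uint16_t allocation_processor_number;
    uint16_t reserved;
    uint64_t process;
    uint64_t mapped_system_va;  /* offset 0x18 */
    uint64_t start_va;
    uint32_t byte_count;        /* offset 0x28 */
    uint32_t byte_offset;
};

/*
 * Hypothetical helper: return the fake MDL address to store in pMdl1 so
 * that the MDL's MappedSystemVa field overlays the 8 bytes at target.
 */
static uint64_t fake_mdl_for_target(uint64_t target_address)
{
    return target_address - offsetof(struct mdl_model, mapped_system_va);
}
```

In this model, the +18 and +28 rows of the dump above line up with the MappedSystemVa and ByteCount fields of the fake MDL.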
The good news: we managed to read the pointer. The bad news: we corrupted some data in the SRVNET_RECV struct. Luckily for us, the corruption doesn’t affect the system as long as nothing happens with the relevant connection. When something does happen, e.g. the connection closes, the system crashes. That’s not a problem since we’ll get RCE soon, and we can fix the corruption if we want to. We didn’t implement such a fix in our POC; it is left as an exercise for the reader.
After reading the AcceptSocket pointer, we used the same technique to read the srvnet!SrvNetWskConnDispatch pointer. We read the AcceptSocket pointer and not the HandlerFunctions pointer since the array of handler functions is shared between all connections, while the buffer pointed by AcceptSocket is not shared with other connections. Therefore, we can corrupt the latter, affecting the stability of only a single connection.
If we have a copy of the srvnet.sys file used on the target computer, we can just compute the offset of the SrvNetWskConnDispatch pointer in the module locally and subtract the offset from the pointer we read, getting the srvnet.sys module base address as a result. That’s what we did in our POC to keep things simple. One can improve it to be more general. One option that comes to mind is keeping several versions of srvnet.sys locally, and deducing the correct one by the least significant bytes of the read pointer.
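Both ideas in the paragraph above come down to a few lines of arithmetic. The following is a minimal sketch, not the POC's code: the function names are hypothetical, and it assumes only that a module base is at least page aligned, so a wrong candidate offset usually leaves non-zero low bits after the subtraction.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_MASK_4K 0xFFFull

/* Base address = leaked pointer minus the pointer's offset in the module. */
static uint64_t module_base_from_leak(uint64_t leaked_ptr, uint64_t rva)
{
    return leaked_ptr - rva;
}

/*
 * Given the offsets of the same symbol in several known builds, return the
 * index of the first build that yields a page-aligned base, or -1 if none
 * does. Two builds whose offsets share the same low 12 bits cannot be told
 * apart this way.
 */
static int pick_build_by_low_bits(uint64_t leaked_ptr,
                                  const uint64_t *candidate_rvas,
                                  size_t count)
{
    for (size_t i = 0; i < count; i++) {
        uint64_t base = module_base_from_leak(leaked_ptr, candidate_rvas[i]);
        if ((base & PAGE_MASK_4K) == 0) {
            return (int)i;
        }
    }
    return -1;
}
```

A caller would fill candidate_rvas from its local collection of srvnet.sys builds and fall back to another strategy when the function returns -1.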
Implementing arbitrary read
From the beginning of this research we had a convenient write-what-where (arbitrary write) primitive, but had nothing that allowed us to read memory. We worked hard until now to gain some memory reading abilities, and at this point we felt that we had enough tools to make our life easier and implement a convenient arbitrary read primitive. We began by exploring the possibilities of calling an arbitrary function.
Given that we have the base address of the srvnet.sys module, we can call any of the module’s functions. But what about the function’s arguments? The srv2!Srv2ReceiveHandler function is called by SrvNetCommonReceiveHandler, and the call looks like this:
The first two arguments are read from the SRVNET_RECV struct, so we can control them. We don’t have as much control over the other arguments. The x86-64 calling convention specifies that it’s the caller’s responsibility to allocate and free the stack space for the arguments, so even though an 8-argument function is intended to be called, we can replace the pointer with a function that expects any other number of arguments, and it will work.
Here are the steps we used to trigger the function call:
1. Send a specially crafted message so that the connection’s SRVNET_RECV struct pointer will be copied to a buffer we can read.
2. Send another, valid message, which will reuse the same SRVNET_RECV struct, but don’t close the connection yet. Note that when a connection is closed, the SRVNET_RECV struct is not freed. The SrvNetPrepareConnectionForReuse function is called to reset the struct so that it can be reused for the next connection.
3. Read the SRVNET_RECV struct pointer that we copied in step 1.
4. Replace the HandlerFunctions pointer and the arguments using the write-what-where primitive.
5. Send an additional message over the connection from step 2 so that the function that took the place of srv2!Srv2ReceiveHandler is called.
Now all we had to do was to find a convenient function to copy memory from one location to another, so that we can copy arbitrary memory to the pool buffer we can read from. memcpy comes to mind, and srvnet.sys does have such a function (memmove, to be precise), but this function requires a third argument, the amount of bytes to be copied, which we don’t control. Failing to find a convenient function that requires one or two arguments, we realized that we’re not limited by functions implemented in srvnet.sys, we can also call functions from srvnet’s import table by pointing HandlerFunctions at the right offset. There, we found the perfect function: RtlCopyUnicodeString.
The RtlCopyUnicodeString function gets two UNICODE_STRING pointers as arguments, and copies the content of the source string to the destination string. Unlike C strings which are NULL-terminated, strings in the kernel are defined by the UNICODE_STRING struct which holds a pointer to the string, and the string’s length in bytes. The string buffer can hold any binary data. If you peek at the implementation of RtlCopyUnicodeString, you can see that the copying is done with the memmove function, i.e. plain binary data copying. All we have to do is prepare our two UNICODE_STRING structs and call RtlCopyUnicodeString, then read the copied data:
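As a user-mode illustration of why a counted string copy works as a two-argument memory copy, here is a simplified model. The struct mirrors the three UNICODE_STRING fields under our own names, and copy_unicode_string_model is a hypothetical stand-in that only reproduces the bounded memmove; the real routine also handles details such as a NULL source and an optional terminator.

```c
#include <stdint.h>
#include <string.h>

/* Model of UNICODE_STRING: a counted buffer, not a NUL-terminated string. */
struct unicode_string_model {
    uint16_t length;          /* bytes currently in use */
    uint16_t maximum_length;  /* bytes available in buffer */
    uint8_t *buffer;          /* may hold arbitrary binary data */
};

/* Copy at most dst->maximum_length bytes from src to dst. */
static void copy_unicode_string_model(struct unicode_string_model *dst,
                                      const struct unicode_string_model *src)
{
    uint16_t n = src->length;
    if (n > dst->maximum_length) {
        n = dst->maximum_length;
    }
    memmove(dst->buffer, src->buffer, n);
    dst->length = n;
}
```

Because the byte count travels inside the structures, the caller needs to control only two pointer arguments, and embedded zero bytes in the source are copied like any other data.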
Executing shellcode
After achieving a convenient arbitrary read primitive, we moved on to the next challenge towards our goal of remote code execution: running a shellcode. We used the technique that Morten Schenk presented in his Black Hat USA 2017 talk (pages 47-51).
The idea is to write a shellcode below the KUSER_SHARED_DATA structure which is located at a constant address, the only address that is not randomized in the kernel memory layout of the recent Windows versions. Then modify the relevant page table entry, making the page executable. The base address of the page table entries in the kernel is randomized, but can be retrieved from the MiGetPteAddress function in ntoskrnl.exe. Here are the steps we used to execute our shellcode:
Use our arbitrary read primitive to get the base address of ntoskrnl.exe from srvnet’s import table.
Read the base address of the page table entries from the MiGetPteAddress function, as described in Morten’s slides.
Write the shellcode at address KUSER_SHARED_DATA + 0x800 (0xFFFFF78000000800). Note that we could also use one of the pool buffers, using KUSER_SHARED_DATA is just more convenient.
Calculate the relevant page table entry address and clear the NX bit to allow execution, as described in Morten’s slides.
Call the shellcode using our ability to call an arbitrary function.
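The page table entry calculation in the steps above can be sketched as follows. This is our own rendering of the arithmetic commonly attributed to MiGetPteAddress, not code from the POC; the function names are hypothetical, and pte_base stands for the randomized value read from the target.

```c
#include <stdint.h>

#define KUSER_SHARED_DATA 0xFFFFF78000000000ull
#define PTE_NX_BIT        (1ull << 63)

/*
 * Address of the PTE that maps va: each 4 KB page has an 8-byte entry,
 * so the index is va >> 12 and the byte offset is (va >> 12) * 8, which
 * is (va >> 9) with the low 3 bits masked off.
 */
static uint64_t pte_address(uint64_t pte_base, uint64_t va)
{
    return pte_base + ((va >> 9) & 0x7FFFFFFFF8ull);
}

/* Clear the no-execute bit (bit 63) of a PTE value. */
static uint64_t pte_clear_nx(uint64_t pte_value)
{
    return pte_value & ~PTE_NX_BIT;
}
```

As a check, with the historical non-randomized base 0xFFFFF68000000000 the entry for KUSER_SHARED_DATA + 0x800 comes out at 0xFFFFF6FBC0000000. On the versions discussed here the base is randomized, which is why it has to be read first.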
Launching a reverse shell
Technically, we achieved remote code execution, so we could stop here. But if we’re not popping calc or launching a reverse shell, the POC is not complete, so we went on to fill that gap. Since our shellcode runs in kernel mode, we can’t just run cmd.exe or calc.exe and call it a day. We needed to find a way to get our code to run in user mode. While searching for prior work on the topic we found sleepya’s shellcode, written originally for EternalBlue exploits, which is designed to do just that.
In short, here’s what the shellcode does:
Hook IA32_LSTAR MSR to lower the IRQL (Interrupt Request Level) from DISPATCH_LEVEL to PASSIVE_LEVEL. The shellcode begins execution at the DISPATCH_LEVEL IRQL which imposes several limitations. For more information see the great explanation of zerosum0x0.
Find a privileged user mode process (lsass.exe or spoolsv.exe) and queue a user mode APC in one of the alertable threads that is in waiting state.
In the APC kernel routine, allocate EXECUTE_READWRITE memory and point the APC normal (user mode) routine there. Then copy the user mode shellcode to the newly allocated memory, prepended with a stub to create a new thread.
In the APC normal routine a new thread is created, executing the user mode shellcode.
Published about three years ago, the shellcode didn’t work right away on recent Windows versions, so we had to make a couple of adjustments:
Incompatibility with the KVA Shadow mitigation. In the blog post Fixing Remote Windows Kernel Payloads to Bypass Meltdown KVA Shadow zerosum0x0 explains why the first part of the shellcode, IA32_LSTAR MSR hooking, isn’t supported when the KVA Shadow mitigation is enabled, and proposes a fix. We tried the proposed fix, but it didn’t work on newer Windows versions – zerosum0x0 targeted Windows 10 version 1809 while we were targeting versions 1903 and 1909. The right thing to do is to improve the fix or find another solution, but we just removed the IRQL lowering part. As a result, the POC can sometimes crash the system while trying to access paged memory (bug check IRQL_NOT_LESS_OR_EQUAL), but it doesn’t happen often, so we left it as is since it’s good enough for a POC.
Fixed finding the base address of ntoskrnl.exe. At first, we tried using zerosum0x0’s method – get an address of the first ISR (Interrupt Service Routine), which is located in ntoskrnl.exe, and search for a nearby PE header. The method didn’t work for us since the ISR pointer points to ntoskrnl’s INITKDBG section which is not mapped. Since we already found the ntoskrnl.exe base address, we fixed it by just passing it as an argument to the shellcode.
Fixed a problem with finding the offset of ETHREAD.ThreadListEntry. The original code looked for the current thread in the thread list of the current process. The thread won’t be found if the current thread is attached to a different process than the one it was originally created in (see KeStackAttachProcess).
Fixed the UserApcPending check in the KAPC_STATE struct for Windows 10 version RS5 and newer. Since Windows 10 version RS5, UserApcPending shares a byte with a newly added bit value, SpecialUserApcPending.
With the above fixed, we finally managed to make the shellcode work. We just needed to fill in the user mode part of the code to run, so we used MSFvenom, the Metasploit payload generator, to generate a user mode shellcode that spawns a reverse shell.
Targets with more than one logical processor
In the Observation #1 section of the previous part of the writeup we assumed that our target has only one logical processor. With this assumption, we could rely on the lookaside lists buffer reusing, knowing that we get the same buffer every time as long as the allocation size is the same. As a reminder, the lookaside lists are created upon initialization, a list for each size and logical processor, as depicted in the following table:
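Columns: allocation size. Rows: logical processor.

Logical processor | 0x1100 | 0x2100 | 0x4100 | 0x8100 | 0x10100 | 0x20100 | 0x40100 | 0x80100 | 0x100100
Processor 1       | list   | list   | list   | list   | list    | list    | list    | list    | list
Processor 2       | list   | list   | list   | list   | list    | list    | list    | list    | list
…                 | …      | …      | …      | …      | …       | …       | …       | …       | …
Processor n       | list   | list   | list   | list   | list    | list    | list    | list    | list

Each cell is a separate lookaside list.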
With more than one logical processor, things are a bit more complicated – we get the same buffer only as long as the allocation is made on the same logical processor. Our first attempt at overcoming this limitation was redundancy. When writing to one of the lookaside list buffers, write multiple times. When reading from one of the lookaside list buffers, read multiple times and choose the most common value. This approach would work if the logical processor usage was distributed evenly, but we found that it’s not the case. We tested our POC in VirtualBox, and from our observations, some logical processors are preferred over others. For a setup of 4 logical cores, here’s the distribution of handling the incoming packet in a test execution:
Logical processor   | Incoming packets handled
Logical processor 1 | 0.2%
Logical processor 2 | 0.8%
Logical processor 3 | 7.9%
Logical processor 4 | 91.1%
Here’s the distribution of handling the decompression:
Logical processor   | Decompressions executed
Logical processor 1 | 13.3%
Logical processor 2 | 5.1%
Logical processor 3 | 6.8%
Logical processor 4 | 74.8%
As you can see, in this specific case logical processor 4 did most of the work. Logical processor 1 handled only 1 out of every 500 incoming packets!
We tweaked the POC such that it sends several packets simultaneously from multiple threads to improve the logical processor usage distribution. We also added error detection, so that if the data that is read doesn’t make sense, another reading attempt is made instead of proceeding and most likely crashing the system. The changes we made were enough to make the POC work with VirtualBox targets with multiple logical processors, but from a quick test the POC doesn’t work with VMware targets or (at least some) physical computers with multiple logical processors. We didn’t try to improve the POC further to support all targets, which we believe can be achieved with a better strategy for the order of reads and writes.
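The redundancy idea described earlier in this section, reading several times and keeping the most common value, can be sketched as below. This is an illustration rather than the POC's implementation; the function name and the optional vote count are our own additions.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the most frequent value among samples (count must be > 0).
 * On a tie, the value that appears first wins. If votes is not NULL,
 * it receives the number of samples that agreed with the result.
 */
static uint64_t most_common_value(const uint64_t *samples, size_t count,
                                  size_t *votes)
{
    uint64_t best = samples[0];
    size_t best_votes = 0;

    for (size_t i = 0; i < count; i++) {
        size_t n = 0;
        for (size_t j = 0; j < count; j++) {
            if (samples[j] == samples[i]) {
                n++;
            }
        }
        if (n > best_votes) {
            best_votes = n;
            best = samples[i];
        }
    }
    if (votes != NULL) {
        *votes = best_votes;
    }
    return best;
}
```

The vote count gives a simple hook for error detection: if too few reads agree, a caller can retry instead of acting on a value that may have come from another processor's lookaside list.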
If you’d like to study the code, we suggest starting with the initial, less noisy version which was designed for a single logical processor. It can be found in a previous commit here.
ZecOps Detection
ZecOps classifies forensics logs related to this issue as #SMBGhost and #SMBleed. You can find more information on how to use ZecOps solutions for Endpoints & Servers, Mobile devices, or applications. Besides SMBleed / SMBGhost, ZecOps Crash Forensics solutions can find other, previously unknown vulnerabilities that are exploited in the wild. If you care about persistent threats, we’ll be happy to assist.
Remediation
You can remediate the impact of both issues by doing one of the following:
Apply the latest security updates (recommended)
Block port 445 / enforce host-isolation
Disable SMBv3.1.1 compression
Summary
This is the third and final part of the writeup, in which we used the findings from the previous parts to achieve RCE using SMBGhost and SMBleed. We hope you enjoyed the read. Here’s a recap of the milestones during our research on the SMB bugs:
A write-what-where primitive, demonstrated in our previous research about achieving local privilege escalation.
In our previous blog post, we demonstrated how the SMBGhost bug (CVE-2020-0796) can be exploited for local privilege escalation. A brief reminder: CVE-2020-0796, also known as “SMBGhost”, is a bug in the compression mechanism of SMBv3.1.1. The bug affects Windows 10 versions 1903 and 1909, and it was announced and patched by Microsoft about 3 months ago. In the previous blog post we mentioned that although the Microsoft Security Advisory describes the bug as a Remote Code Execution (RCE) vulnerability, there is no public POC that demonstrates RCE through this bug. This was true until chompie1337 released the first public RCE POC, based on the writeup of Ricerca Security. Our POC uses a different method, and doesn’t involve physical memory access. Instead, we use the SMBleed (CVE-2020-1206) bug to help with the exploitation.
Our previous research led to the local privilege escalation attack that we have shown in our previous writeup. SMBGhost can be used for an RCE attack and we aim to demonstrate how we achieved it in this series of blog posts. As we showed in the previous writeup, we were able to implement a remote write-what-where primitive. However, for an RCE capability we need to know where to write the arbitrary data. Since most of the memory layout in modern Windows versions is randomized, having the ability to write arbitrary data in any location is still very limiting. While searching for another capability to assist with the attack, we discovered a new bug in Microsoft’s SMB implementation. For technical details and a POC, check out our recent publication. We named it SMBleed since it allows leaking parts of memory remotely, similar to Heartbleed, just via SMB. While the concept is similar and an authenticated user can read large blocks of uninitialized data, the attack surface without authentication is more limited. Since we aimed for an unauthenticated RCE exploitation, the first thing we looked for was a way to read memory unauthenticated.
Diving into SMB
Note: The following sections describe in detail a technique that we were able to use for exploitation but dropped in favor of a different approach that worked better in our case. Still, it’s an approach that we felt is worth sharing. If you prefer to stick to what ended up in our final POC, you can just read Observation #1 and Observation #2, and then skip to the A different approach – decompression section.
The SMBleed bug allows an attacker to send a message such that its beginning is controlled by the attacker, while the rest of the message contains uninitialized data which is treated as a part of the message. For an authenticated user, there’s an easy way to exploit this using the SMB2 WRITE message to write uninitialized data to a file, and then read it with the SMB2 READ command. We started by looking for a similar technique for an unauthenticated user – a way to send a message such that a part of it can be retrieved later.
After skimming over the protocol specification and debugging a couple of sessions, we saw that a regular flow begins with the following commands that are sent by the client:
If incorrect credentials are used, the session is aborted after the second SMB2 SESSION_SETUP request.
We assume that we don’t have valid credentials, so we checked whether other commands can be sent without authentication. We found the following after some experimentation:
The first command to be sent must be SMB2 NEGOTIATE. It also must be the only SMB2 NEGOTIATE command during the session.
Since the SMB2 NEGOTIATE message is not compressed (the compression algorithm, if any, is decided during the negotiation), all that’s left is SMB2 SESSION_SETUP. So we took a closer look at the format of the SMB2 SESSION_SETUP message, hoping to find a way to get some of the data that is being sent back.
A closer look at SMB2 SESSION_SETUP
As we’ve already mentioned, a regular session that we observed sends two SMB2 SESSION_SETUP commands. At first, we checked whether one of the replies to these messages sends back some of the data. If that was the case, we could try to craft a message such that the data is left uninitialized. Unfortunately, we didn’t find such data. We couldn’t find a way to affect the first response, and the second response had an empty body and the 0xC000006D (STATUS_LOGON_FAILURE) status in the packet header (remember, we assume we don’t have valid credentials). The first SMB2 SESSION_SETUP request contains an NTLM Negotiate message, and the second SMB2 SESSION_SETUP request contains an NTLM Authenticate message. The former is rather simple, and we weren’t able to use it for something interesting, so we focused on the latter.
The NTLM Authenticate message
After studying the NTLM Authenticate message we came to the conclusion that the message’s most complex part, which is the best fit for misuse, is the NTLMv2 Response structure. It’s a variable-length byte array, mostly consisting of the NTLMv2_CLIENT_CHALLENGE structure. We noticed that if the structure doesn’t pass some of the initial checks, the 0xC000000D (STATUS_INVALID_PARAMETER) status is returned instead of 0xC000006D (STATUS_LOGON_FAILURE). Some of these checks verify the AvPairs field.
The AvPairs field is a variable-length byte array that contains a sequence of AV_PAIR structures. Each AV_PAIR structure defines an attribute/value pair. The attribute is defined by the AvId field, the AvLen field defines the value’s length in bytes, and the Value field is a variable-length byte-array that contains the value itself. An item with the attribute MsvAvEOL and a zero length marks the end of the array.
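To make the layout concrete, here is a minimal Python sketch of an AV_PAIR parser that follows the description above: a 16-bit AvId, a 16-bit AvLen, then AvLen bytes of Value, with MsvAvEOL and a zero length ending the list. The helper names (`parse_av_pairs`, `build_av_pair`) are ours and purely illustrative; this is not the code that Windows runs.

```python
import struct

MSV_AV_EOL = 0x0000  # AvId that marks the end of the list


def build_av_pair(av_id: int, value: bytes) -> bytes:
    """Serialize one AV_PAIR: AvId (uint16 LE), AvLen (uint16 LE), Value."""
    return struct.pack("<HH", av_id, len(value)) + value


def parse_av_pairs(data: bytes) -> list:
    """Parse a sequence of AV_PAIR structures into (AvId, Value) tuples."""
    pairs = []
    pos = 0
    while pos + 4 <= len(data):
        av_id, av_len = struct.unpack_from("<HH", data, pos)
        pos += 4
        if av_id == MSV_AV_EOL and av_len == 0:
            break  # terminator reached
        if pos + av_len > len(data):
            raise ValueError(f"AvLen {av_len:#x} runs past the end of the buffer")
        pairs.append((av_id, data[pos:pos + av_len]))
        pos += av_len
    return pairs


# Example: MsvAvNbComputerName, MsvAvNbDomainName, then the terminator.
blob = (
    build_av_pair(0x0001, "HOST".encode("utf-16-le"))
    + build_av_pair(0x0002, "DOMAIN".encode("utf-16-le"))
    + build_av_pair(MSV_AV_EOL, b"")
)
print(parse_av_pairs(blob))
```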
The authentication message is handled by the SsprHandleAuthenticateMessage function in the msv1_0.dll module. Among the initial checks, the function makes sure that the AvPairs array contains the following attributes: 0x0001 (MsvAvNbComputerName), 0x0002 (MsvAvNbDomainName). The value is not checked. The check itself is done by traversing the array and checking whether the requested attribute exists, and whether its length is within the struct. If the length is too large, the traversal is stopped. So practically, the MsvAvEOL item is not required for the NTLM Authenticate message to be valid.
At this point we figured that we can craft a request that can provide an answer to the following question: Given two bytes at offset x, interpreted as uint16, is the value larger than y? x and y are controlled by us. Consider the following packet:
The content of value 0x0001 (MsvAvNbComputerName) doesn’t matter, so we can use it to adjust the offset of the second value. For the second value, we only set the attribute as 0x0002 (MsvAvNbDomainName), leaving the length and the value uninitialized. We also set the size of the whole packet so that there are y bytes that follow the length field. There are two possible outcomes depending on the uninitialized value of the length field of the second value:
length <= y: In this case the check passes, since a valid 0x0002 (MsvAvNbDomainName) value is found. The server returns 0xC000006D (STATUS_LOGON_FAILURE) since the credentials are incorrect.
length > y: In this case the check fails, since the second value has an invalid length and is discarded. The server returns 0xC000000D (STATUS_INVALID_PARAMETER) for this case.
According to the server response we can deduce the answer to our question.
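The two outcomes can be reproduced offline with a small simulation. The sketch below models only what the text describes (a traversal that stops when a length runs past the structure, and the two status codes); it is not a reimplementation of SsprHandleAuthenticateMessage. The `stale_length` parameter is an assumption of the sketch: it stands in for the uninitialized bytes that, in the real scenario, are never sent by the client.

```python
import struct

STATUS_LOGON_FAILURE = 0xC000006D
STATUS_INVALID_PARAMETER = 0xC000000D


def has_attribute(av_pairs: bytes, wanted: int) -> bool:
    """Walk the array; stop as soon as a length does not fit in the buffer."""
    pos = 0
    while pos + 4 <= len(av_pairs):
        av_id, av_len = struct.unpack_from("<HH", av_pairs, pos)
        pos += 4
        if pos + av_len > len(av_pairs):
            return False  # length too large: traversal stops
        if av_id == wanted:
            return True
        pos += av_len
    return False


def server_status(av_pairs: bytes) -> int:
    """Return the status an unauthenticated client would observe (simplified)."""
    for wanted in (0x0001, 0x0002):  # MsvAvNbComputerName, MsvAvNbDomainName
        if not has_attribute(av_pairs, wanted):
            return STATUS_INVALID_PARAMETER
    return STATUS_LOGON_FAILURE  # checks pass, credentials are still wrong


def craft(stale_length: int, y: int, pad: int = 8) -> bytes:
    """Build the AvPairs bytes as the server would see them."""
    first = struct.pack("<HH", 0x0001, pad) + b"\x00" * pad  # adjusts the offset
    second = struct.pack("<HH", 0x0002, stale_length)        # simulated stale length
    return first + second + b"\x00" * y                      # y bytes follow the length


print(hex(server_status(craft(stale_length=5, y=16))))
print(hex(server_status(craft(stale_length=17, y=16))))
```

Comparing the returned status against STATUS_LOGON_FAILURE answers the question "is the value larger than y?" for the simulated length.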
So, now we can get this small piece of information, right? Not so fast. Unfortunately, the NTLM Authenticate message is limited to 0xB48 bytes, and is discarded if it’s larger than that, so y can’t exceed that size while the uninitialized uint16 length can be as large as 0xFFFF. The check is done by the SspContextGetMessage function in the msv1_0.dll module. Can we solve this problem by leaving only one of the two length bytes uninitialized? Unfortunately not, since the uint16 value is encoded as little endian, and to the best of our knowledge at this point, we can only leave the second, most significant byte uninitialized, which doesn’t help much. Unable to achieve something better within a single SMB session, we looked at what else can be done.
Observation #1: Lookaside lists
As we already mentioned in our previous research, the modules that handle SMB in the kernel (srv2.sys and srvnet.sys) use a custom allocation function, SrvNetAllocateBuffer, exported by srvnet.sys. This function uses lookaside lists for small allocations as an optimization. Lookaside lists are used for effectively reserving a set of reusable, fixed-size buffers for the driver.
The lookaside lists are created upon initialization, a list for each size and logical processor, as depicted in the following table:
| Logical processor \ Allocation size | 0x1100 | 0x2100 | 0x4100 | 0x8100 | 0x10100 | 0x20100 | 0x40100 | 0x80100 | 0x100100 |
|---|---|---|---|---|---|---|---|---|---|
| Processor 1 | list | list | list | list | list | list | list | list | list |
| Processor 2 | list | list | list | list | list | list | list | list | list |
| … | … | … | … | … | … | … | … | … | … |
| Processor n | list | list | list | list | list | list | list | list | list |

Each cell marked “list” is a separate lookaside list. To simplify our analysis, we’ll assume our target has only one logical processor (we’ll cover targets with more than one logical processor in the third part of the writeup). In this case, as long as the same number of bytes is allocated, the same lookaside list is used, and the same allocated buffer is reused again and again. We can use this implementation detail to have some control over the uninitialized data, as we’ll see soon.
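The reuse behavior can be illustrated with a toy model. The sketch below assumes a single logical processor, one LIFO free list per size class, and a simple "smallest class that fits" rule; the class name `ToyLookaside` and that selection rule are our assumptions, and the real SrvNetAllocateBuffer logic is more involved. The point is only that a released buffer comes back with its old contents intact.

```python
# Size classes from the table above: 0x1100, 0x2100, ..., 0x100100.
SIZE_CLASSES = [(0x1000 << i) + 0x100 for i in range(9)]


class ToyLookaside:
    """Toy per-size free lists for a single logical processor."""

    def __init__(self):
        self.free = {size: [] for size in SIZE_CLASSES}

    def allocate(self, size: int) -> bytearray:
        for size_class in SIZE_CLASSES:
            if size <= size_class:
                stack = self.free[size_class]
                # A reused buffer is NOT cleared, so stale bytes survive.
                # (A fresh bytearray is zeroed here; real pool memory is not.)
                return stack.pop() if stack else bytearray(size_class)
        raise ValueError("request too large for the lookaside lists")

    def release(self, buf: bytearray) -> None:
        self.free[len(buf)].append(buf)


pool = ToyLookaside()
first = pool.allocate(0x800)
first[0x10:0x14] = b"LEAK"      # data written while handling request #1
pool.release(first)

second = pool.allocate(0x900)   # different request size, same size class
print(bytes(second[0x10:0x14])) # stale bytes from request #1 are still there
```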
Observation #2: Failing the decompression
Let’s revisit what happens when a compressed packet is decompressed (refer to our previous research for more details and pseudocode):
In case CompressedData is invalid, the decompression stage fails, the copy stage is not executed, and the connection is dropped. But the decompression might fail only after extracting a part of CompressedData which is valid. This allows us to craft a request such that data of our choice will be written at an offset of our choice, like this:
Back to the NTLM Authenticate message
We can use the above observations to make our technique work by using two steps:
Send a message with invalid compressed data such that only a single zero byte is extracted. That byte will be the most significant byte of the length of the second value in the AvPairs array.
Send a message just as before, but make sure that the same lookaside list is used for the allocation, so that the zero byte will be there.
This time, this technique can answer the following question: Given a byte at offset x, is the value larger than y? As before, x and y are controlled by us.
Since we can re-use the buffer again and again by making sure the same lookaside list is used, we can repeat the steps several times while changing y, and finally deduce the byte value at a given offset.
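One way to choose the successive values of y is a binary search, which pins down a byte in exactly eight queries. The sketch below is our illustration rather than the logic of the published script: `is_greater_than` is a hypothetical callback standing in for one full round of the two-step technique, returning True when the server's answer means "value > y".

```python
from typing import Callable


def recover_byte(is_greater_than: Callable[[int], bool]) -> int:
    """Recover a byte value using only 'is the value larger than y?' answers."""
    lo, hi = 0, 255
    while lo < hi:
        mid = (lo + hi) // 2
        if is_greater_than(mid):
            lo = mid + 1   # value is in (mid, hi]
        else:
            hi = mid       # value is in [lo, mid]
    return lo


# Local stand-in for the remote oracle, for demonstration only.
secret = 0xA7
queries = []


def oracle(y: int) -> bool:
    queries.append(y)
    return secret > y


print(hex(recover_byte(oracle)), "recovered in", len(queries), "queries")
```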
Unfortunately, this technique has a limitation – the offset of the byte we can read is limited to 0xADB bytes from the beginning of the packet buffer. That’s because the offset of the NTLM Authenticate message (AUTHENTICATE_MESSAGE) is limited to 0x40 bytes after the end of the SMB2 SESSION_SETUP headers (enforced by the Smb2ValidateSessionSetup function in srv2.sys), and the size of the NTLM Authenticate message (AUTHENTICATE_MESSAGE) is limited to 0xB48 bytes, as we already mentioned.
Overcoming the offset limitation
Let’s say that we want to read a byte at offset 0x1100 (we’ll see why we want to go that far in the third part of the writeup). We can’t do it directly with our technique, but we found the following solution: since the buffers get reused from the lookaside lists, we can “lift up” the target byte via the decompression function by setting the Offset field to point beyond that byte. We just need to make sure that the data that is located there can be interpreted as valid compressed data, otherwise the copying won’t happen.
The incoming packet buffer contains 16 extra header bytes which aren’t copied over when the decompression takes place. As a result, the copied data, including the target byte, is copied to a location 16 bytes closer to the beginning of the allocated buffer. We can repeat that several times, until the target byte offset is low enough.
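As a rough estimate of the cost, the number of "lifting" rounds follows from simple arithmetic. The sketch below uses the two figures from the text (16 bytes per round and the 0xADB limit) and assumes that an offset equal to 0xADB is still readable; if the limit is exclusive, the result may be off by one.

```python
HEADER_SKIP = 16      # bytes gained per decompression pass (from the text)
MAX_READABLE = 0xADB  # highest readable offset (assumed inclusive)


def lift_rounds(target_offset: int) -> int:
    """Number of 16-byte shifts needed to bring the target within reach."""
    if target_offset <= MAX_READABLE:
        return 0
    excess = target_offset - MAX_READABLE
    return -(-excess // HEADER_SKIP)  # ceiling division


rounds = lift_rounds(0x1100)
print(rounds, hex(0x1100 - rounds * HEADER_SKIP))
```

For the 0x1100 example this works out to 99 rounds for a single byte, each requiring its own packet, which gives a feel for why this approach costs so many packets.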
Address leak POC
You can find a script that demonstrates the above technique here. Remember that we assumed that the target computer has only one logical processor, so you’ll have to configure your VM properly to get the script working. If all goes well, the script will read and print an address from the NonPagedPoolNx pool. In fact, that would be the address of one of the buffers residing in one of the lookaside lists.
A different approach – decompression
While advancing with our research, we realized that the decompressed SMB packet is not the only complex structure that can be invalid in various ways. Even before handling all of the SMB-related structures, the compressed buffer can be invalid as well. If the decompression fails, the connection is dropped, which can be detected.
Microsoft’s SMB implementation offers three compression algorithms to choose from: LZNT1, Plain LZ77 and LZ77+Huffman. We looked at LZNT1 since it’s the first in the list, and it’s rather simple – about 80 Python lines for a decompression function. Without diving too much into details, the compressed data consists of a sequence of compressed blocks, each beginning with a uint16 variable marking its length. When a length of zero is encountered, the decompression completes (similar to a NULL-terminated string, but it’s optional). Also, conveniently, a range of zero bytes represents valid compressed data. With the above, we managed to answer the same question as we did with the previous approach: Given a byte at offset x, is the value larger than y? Here, too, x and y are controlled by us.
We accomplished that by sending a valid packet which is followed by a range of bytes similar to the following (note that it’s a simplification, the actual byte values are a bit different):
There are two possible outcomes depending on the uninitialized value of the least significant byte of the length field:
length <= y: In this case the whole compressed block will consist of zero bytes, which is completely valid, and the next block’s length will be zero, completing the decompression successfully. The server will return a response.
length > y: In this case, either the first or the second compression block will contain 0xFF bytes, which will fail the decompression. The server will drop the connection.
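The two outcomes can be checked against a toy model of the simplified format described above. This is deliberately not real LZNT1 (real chunk headers also carry flag and signature bits): the sketch assumes length-prefixed blocks, treats an all-zero block as valid and anything else as a failure, and assumes a probe layout of a length field, y + 2 zero bytes, then 0xFF filler. Both function names and the exact layout are our assumptions.

```python
import struct


def toy_decompress_ok(data: bytes) -> bool:
    """Return True if the toy 'decompression' succeeds, False if it fails."""
    pos = 0
    while pos + 2 <= len(data):
        (length,) = struct.unpack_from("<H", data, pos)
        pos += 2
        if length == 0:
            return True            # zero length: decompression completes
        block = data[pos:pos + length]
        if len(block) < length or any(block):
            return False           # truncated or non-zero block: failure
        pos += length
    return True                    # the terminator is optional


def craft_probe(stale_low_byte: int, y: int, fill: int = 64) -> bytes:
    """Bytes as the server would see them; the low length byte is stale memory."""
    length_field = bytes([stale_low_byte, 0x00])
    return length_field + b"\x00" * (y + 2) + b"\xff" * fill


for stale in (9, 10, 11, 12):
    print(stale, toy_decompress_ok(craft_probe(stale, y=10)))
```

In this model a successful decompression corresponds to the server returning a response, and a failure corresponds to a dropped connection.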
Just like with the previous technique, we can use observations #1 and #2 to craft a message with an uninitialized byte in the middle of the message by using two steps:
Send a message with invalid compressed data such that only the part we need is extracted. The bytes that will be extracted are the bytes in the image above.
Send a second message, but make sure that the same lookaside list is used for the allocation, so that the bytes from step 1 will be there.
Note that the Offset value in the SMB packet header will point to the compressed data, which can be valid or not depending on the value of the uninitialized byte. The valid SMB packet will be sent uncompressed. Note also that since the Offset value is larger than the message itself, there’s an overflow in the calculation of the compressed data size, which ends up being a huge number. Usually that’s not an issue since the decompression ends quickly, either successfully or not. But sometimes the system crashes due to an out of bounds read. We didn’t try to solve this since it happens rarely, and the POC is complex enough.
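The wraparound mentioned above can be illustrated with 32-bit unsigned arithmetic. The exact expression used by the driver is an assumption here: the sketch supposes the compressed size is computed as the total message length minus a 16-byte compression header minus Offset, truncated to 32 bits.

```python
COMPRESSION_HEADER_SIZE = 16  # assumed size of the compression transform header


def compressed_size_u32(message_len: int, offset: int) -> int:
    """Remaining compressed size as a 32-bit unsigned value (assumed formula)."""
    return (message_len - COMPRESSION_HEADER_SIZE - offset) & 0xFFFFFFFF


# An Offset inside the message gives a sensible size...
print(hex(compressed_size_u32(message_len=0x100, offset=0x20)))
# ...while an Offset beyond the message wraps around to a huge number.
print(hex(compressed_size_u32(message_len=0x100, offset=0x1000)))
```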
The most notable advantage of this technique compared to the previous one is that there’s no offset limitation anymore. Even though we managed to overcome that limitation with the previous technique, doing so required sending a large number of packets, hurting performance and stability.
ZecOps Detection
ZecOps classifies forensic logs related to these issues with the tags #SMBGhost and #SMBleed. You can find more information on how to use ZecOps solutions for Endpoints & Servers, Mobile devices, or applications.
Remediation
You can remediate the impact of both issues by doing one of the following:
Applying the latest security updates (recommended)
Blocking port 445 / enforcing host isolation
Disabling SMBv3.1.1 compression
Part II – Summary
In this part, we described how we managed to read uninitialized data from the kernel pool, remotely and without authentication, by exploiting SMBGhost and SMBleed. In the third part we’ll show how it helped us achieve RCE.
While looking at the vulnerable function of SMBGhost, we discovered another vulnerability: SMBleed (CVE-2020-1206).
SMBleed allows leaking kernel memory remotely.
Combined with SMBGhost, which was patched three months ago, SMBleed allows achieving pre-auth Remote Code Execution (RCE).
POC #1: SMBleed remote kernel memory read: POC #1 Link
POC #2: Pre-Auth RCE Combining SMBleed with SMBGhost: POC #2 Link
Introduction
The SMBGhost (CVE-2020-0796) bug in the compression mechanism of SMBv3.1.1 was fixed about three months ago. In our previous writeup we explained the bug, and demonstrated a way to exploit it for local privilege escalation. As we found during our research, it’s not the only bug in the SMB decompression functionality. SMBleed happens in the same function as SMBGhost. The bug allows an attacker to read uninitialized kernel memory, as we illustrated in detail in this writeup.
The bug happens in the same function as with SMBGhost, the Srv2DecompressData function in the srv2.sys SMB server driver. Below is a simplified version of the function, with the irrelevant details omitted:
The Srv2DecompressData function receives the compressed message which is sent by the client, allocates the required amount of memory, and decompresses the data. Then, if the Offset field is not zero, it copies the data that is placed before the compressed data as is to the beginning of the allocated buffer.
The SMBGhost bug happened due to lack of integer overflow checks. It was fixed by Microsoft and even though we didn’t add it to our function to keep it simple, this time we will assume that the function checks for integer overflows and discards the message in these cases. Even with these checks in place, there’s still a serious bug. Can you spot it?
Faking OriginalCompressedSegmentSize again
Previously, we exploited SMBGhost by setting the OriginalCompressedSegmentSize field to be a huge number, causing an integer overflow followed by an out of bounds write. What if we set it to be a number which is just a little bit larger than the actual decompressed data we send? For example, if the size of our compressed data is x after decompression, and we set OriginalCompressedSegmentSize to be x + 0x1000, we’ll get the following:
The uninitialized kernel data is going to be treated as a part of our message.
If you didn’t read our previous writeup, you might think that the Srv2DecompressData function call should fail due to the check that follows the SmbCompressionDecompress call:
Specifically, in our example, you might assume that while the value of the OriginalCompressedSegmentSize field is x + 0x1000, FinalCompressedSize will be set to x in this case. In fact, FinalCompressedSize will be set to x + 0x1000 as well due to the implementation of the SmbCompressionDecompress function:
NTSTATUS SmbCompressionDecompress(
    USHORT CompressionAlgorithm,
    PUCHAR UncompressedBuffer,
    ULONG UncompressedBufferSize,
    PUCHAR CompressedBuffer,
    ULONG CompressedBufferSize,
    PULONG FinalCompressedSize)
{
    // ...
    NTSTATUS Status = RtlDecompressBufferEx2(
        ...,
        FinalUncompressedSize,
        ...);
    if (Status >= 0) {
        // On success, report the size of the buffer rather than the
        // number of bytes actually produced by the decompression.
        *FinalCompressedSize = CompressedBufferSize;
    }
    // ...
    return Status;
}
In case of a successful decompression, FinalCompressedSize is updated to hold the value of CompressedBufferSize, which is the size of the buffer. Not only did this seemingly unnecessary, deliberate update of the FinalCompressedSize value make the exploitation of SMBGhost easier, it also allowed the SMBleed bug to exist.
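The effect can be simulated in a few lines. In the toy model below, `stale_pool` stands in for leftover pool memory and the "decompression" is a plain copy; the function name and both simplifications are ours. What it shows is the consequence described above: because the reported size always equals the claimed size, the bytes after the real payload are returned as part of the message.

```python
def toy_srv2_decompress(stale_pool: bytes, payload: bytes, claimed_size: int) -> bytes:
    """Toy model: allocate by the claimed size, write only the real payload."""
    # The buffer is sized by the attacker-supplied OriginalCompressedSegmentSize
    # and, in this model, still holds whatever was in the pool before.
    buffer = bytearray(stale_pool[:claimed_size].ljust(claimed_size, b"\x00"))

    # Stand-in for decompression: only len(payload) bytes are written.
    buffer[:len(payload)] = payload

    # Mirrors the behavior described above: the reported size is the buffer
    # size, not the number of bytes produced, so this check never fails.
    final_size = claimed_size
    if final_size != claimed_size:
        raise ValueError("decompressed size mismatch")

    return bytes(buffer[:final_size])


stale = b"S" * 0x2000              # pretend leftovers from earlier allocations
payload = b"HDR" * 4               # x = 12 bytes of real data
message = toy_srv2_decompress(stale, payload, len(payload) + 0x1000)
leaked = message[len(payload):]
print(len(message), leaked[:8])
```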
Basic exploitation
The SMB message we used to demonstrate the vulnerability is the SMB2 WRITE message. The message structure contains fields such as the amount of bytes to write and flags, followed by a variable length buffer. That’s perfect for exploiting the bug, since we can craft a message such that we specify the header, but the variable length buffer contains uninitialized data. We based our POC on Microsoft’s WindowsProtocolTestSuites repository (that we also used for the first SMBGhost reproduction), introducing this small addition to the compression function:
Note that our POC requires credentials and a writable share, which are available in many scenarios, but the bug applies to every message, so it can potentially be exploited without authentication. Also note that the leaked memory is from previous allocations in the NonPagedPoolNx pool, and since we control the allocation size, we might be able to control the data that is being leaked to some degree.
Windows 10 versions 1903, 1909 and 2004 are affected. During testing, our POC crashed one of our Windows 10 1903 machines. After analyzing the crash with Neutrino, we saw that the earliest, unpatched versions of Windows 10 1903 have a null pointer dereference bug while handling valid, compressed SMB packets. Please note, we didn’t investigate further to find whether it’s possible to bypass the null pointer dereference bug and exploit the system.
Here’s a summary of the affected Windows versions with the relevant updates installed:
Windows 10 Version 2004

| Update | SMBGhost | SMBleed |
|---|---|---|
| KB4557957 | Not Vulnerable | Not Vulnerable |
| Before KB4557957 | Not Vulnerable | Vulnerable |
Windows 10 Version 1909

| Update | SMBGhost | SMBleed |
|---|---|---|
| KB4560960 | Not Vulnerable | Not Vulnerable |
| KB4551762 | Not Vulnerable | Vulnerable |
| Before KB4551762 | Vulnerable | Vulnerable |
Windows 10 Version 1903

| Update | Null Dereference Bug | SMBGhost | SMBleed |
|---|---|---|---|
| KB4560960 | Fixed | Not Vulnerable | Not Vulnerable |
| KB4551762 | Fixed | Not Vulnerable | Vulnerable |
| KB4512941 | Fixed | Vulnerable | Vulnerable |
| None of the above | Not Fixed | Vulnerable | Potentially vulnerable* |
* We haven’t tried to bypass the null dereference bug, but it may be possible through another method (for example, using SMBGhost Write-What-Where primitive)
SMBleedingGhost? Chaining SMBleed with SMBGhost for pre-auth RCE
Exploiting the SMBleed bug without authentication is less straightforward, but also possible. We were able to use it together with the SMBGhost bug to achieve RCE (Remote Code Execution). A writeup with the technical details will be published soon. For now, please see below a POC demonstrating the exploitation. This POC is released only for educational and research purposes, as well as for evaluation of security defenses. Use at your own risk. ZecOps takes no responsibility for any misuse of this POC.
ZecOps Neutrino customers detect exploitation of SMBleed & SMBGhost – no further action is required. SMBleed & SMBGhost can be detected in multiple ways, including crash dump analysis and network traffic analysis. Signatures are available to ZecOps Threat Intelligence subscribers. Feel free to reach out to us at [email protected] for more information.
Remediation
You can remediate both SMBleed and SMBGhost by doing one or more of the following things:
Windows update will solve the issues completely (recommended)
Blocking port 445 will stop lateral movements using these vulnerabilities
Enforcing host isolation
Disabling SMB 3.1.1 compression (not a recommended solution)
Shout out to Chompie, who exploited this bug with a different technique. Chompie’s POC is available here.
Further to Apple’s patch of the MailDemon vulnerability (see our blog here), ZecOps Research Team has analyzed and compared the MailDemon patches of iOS 13.4.5 beta and iOS 13.5.
Our analysis concluded that the patches are different, and that the iOS 13.4.5 beta patch was incomplete and could still be vulnerable under certain circumstances.
Since the 13.4.5 beta patch was insufficient, Apple issued a complete patch utilising a different approach which fixed this issue completely on both iOS 13.5 and iOS 12.4.7 as a special security update for older devices.
This may explain why it took about one month for a full patch to be released.
iOS 13.4.5 beta patch
The following is the heap-overflow vulnerability patch on iOS 13.4.5 beta.
The function -[MFMutableData appendBytes:length:] raises an exception if -[MFMutableData _mapMutableData] returns false.
In order to see when -[MFMutableData _mapMutableData] returns false, let’s take a look at how it is implemented:
When mmap fails, -[MFMutableData _mapMutableData] returns false but still allocates an 8-byte chunk and stores the pointer in self->bytes. This patch raises an exception before copying data into self->bytes, which only partially solves the heap overflow issue.
The patch makes sure an exception is raised inside -[MFMutableData appendBytes:length:]. However, other functions also call -[MFMutableData _mapMutableData] and interact with self->bytes, which will be an 8-byte chunk if mmap fails. These functions do not check whether mmap failed, since the patch only affects -[MFMutableData appendBytes:length:].
Following is an actual backtrace taken from MobileMail:
The pointer returned by mutableBytes is usually considered modifiable, given the following from Apple’s documentation:
This property is similar to, but different than the bytes property. The bytes property contains a pointer to a constant. You can use the bytes pointer to read the data managed by the data object, but you cannot modify that data. However, if the mutableBytes property contains a non-null pointer, this pointer points to mutable data. You can use the mutableBytes pointer to modify the data managed by the data object.
Apple’s documentation
Both -[MFMutableData mutableBytes] and -[MFMutableData bytes] return self->bytes, which points to the 8-byte chunk if mmap fails. This might lead to a heap overflow under some circumstances.
The following example shows how things could go wrong: the heap overflow still happens even though the code checks the length before calling memcpy:
size_t length = 0x30000;
void* data = malloc(length);
MFMutableData* mdata = [[MFMutableData alloc] initWithBytesNoCopy:data length:length];
size_t mdata_len = [mdata length];
char* mbytes = [mdata mutableBytes]; // mbytes could point to an 8-byte chunk
size_t new_data_len = 90;
char* new_data = malloc(new_data_len);
if (new_data_len <= mdata_len) {
    // The length check passes, yet this overflows the heap if mmap failed
    memcpy(mbytes, new_data, new_data_len);
}
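The mismatch between the reported length and the real capacity can be modeled outside of Objective-C. The Python sketch below is a toy model with names of our own choosing (`ToyMutableData`, `checked_copy_overflows`); it assumes, as described above, that a failed mmap leaves an 8-byte chunk behind while the object keeps reporting its full logical length.

```python
class ToyMutableData:
    """Toy model of the pre-13.5 behavior described above."""

    def __init__(self, length: int, mmap_fails: bool):
        self.length = length          # logical length reported to callers
        self.mmap_fails = mmap_fails
        self.backing = None

    def mutable_bytes(self) -> bytearray:
        if self.backing is None:
            # On mmap failure the fallback is an 8-byte chunk.
            capacity = 8 if self.mmap_fails else self.length
            self.backing = bytearray(capacity)
        return self.backing


def checked_copy_overflows(mdata: ToyMutableData, new_data: bytes) -> bool:
    """Return True if a length-checked copy would still write out of bounds."""
    dst = mdata.mutable_bytes()
    if len(new_data) <= mdata.length:      # the caller's check passes...
        return len(new_data) > len(dst)    # ...but the real capacity is smaller
    return False


print(checked_copy_overflows(ToyMutableData(0x30000, mmap_fails=True), bytes(90)))
print(checked_copy_overflows(ToyMutableData(0x30000, mmap_fails=False), bytes(90)))
```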
iOS 13.5 Patch
Following the iOS 13.5 patch, an exception is raised in “-[MFMutableData _mapMutableData]” right after mmap fails, and the function no longer returns the 8-byte chunk. This approach fixes the issue completely.
Summary
The iOS 13.5 patch is the correct way to patch the heap overflow vulnerability. It is important to double check security patches and verify that the patch is complete.
At ZecOps we help developers to find security weaknesses, and validate if the issue was correctly solved automatically. If you would like to find similar vulnerabilities in your applications/programs, we are now adding additional users to our CrashOps SDK beta program. If you do not own an app, and would like to inspect your phone for suspicious activity – check out ZecOps iOS DFIR solution – Gluon.
We were able to use this technique to verify that this vulnerability is exploitable. We are still working on improving the success rate.
Present two new examples of in-the-wild triggers so you can judge for yourself whether these bugs are worth an out-of-band patch
Suggestions to Apple on how to improve forensics information / logs and important questions following Apple’s response to the previous disclosure
Launching a bounty program for people who have traces of attacks with total bounties of $27,337
MailDemon appears to be even more ancient than we initially thought. There is an in-the-wild trigger for this vulnerability from 10 years ago, on an iPhone 2G running iOS 3.1.3
Following our announcement of RCE vulnerabilities discovery in the default Mail application on iOS, we have been contacted by numerous individuals who suspect they were targeted by this and related vulnerabilities in Mail.
ZecOps encourages Apple to release an out of band patch for the recently disclosed vulnerabilities and hopes that this blog will provide additional reinforcement to release patches as early as possible. In this blogpost we will show a simple way to spray the heap, whereby we were able to prove that remote exploitation of this issue is possible, and we will also provide two examples of triggers observed in the wild.
At present, we already have the following:
Remote heap-overflow in Mail application
Ability to trigger the vulnerability remotely with attacker-controlled input through an incoming mail
Ability to alter code execution
Kernel Elevation of Privileges 0day
What we don’t have:
An infoleak – but therein rests a surprise: an infoleak is not mandatory to be in Mail since an infoleak in almost any other process would be sufficient. Since dyld_shared_cache is shared through most processes, an infoleak vulnerability doesn’t necessarily have to be inside MobileMail, for example CVE-2019-8646 of iMessage can do the trick remotely as well – which opens additional attack surface (Facetime, other apps, iMessage, etc). There is a great talk by 5aelo during OffensiveCon covering similar topics.
Therefore, now we have all the requirements to exploit this bug remotely. Nonetheless, we prefer to be cautious in chaining this together because:
We have no intention of disclosing the LPE – it allows us to perform filesystem extraction / memory inspection on A12 devices and above when needed. You can read more about the problems of analyzing mobile devices at FreeTheSandbox.org
We haven’t seen exploitation in the wild for the LPE.
We will also share two examples of triggers that we have seen in the wild and let you make your own inferences and conclusions.
were you targeted by this vulnerability?
MailDemon Bounty
Lastly, we will present a bounty for those submissions that were able to demonstrate that they were attacked.
Exploiting MailDemon
As we previously hinted, MailDemon is a great candidate for exploitation because it overwrites small chunks of a MALLOC_NANO memory region, which stores a large number of Objective-C objects. Consequently, it allows attackers to manipulate an ISA pointer of the corrupted objects (allowing them to cause type confusions) or overwrite a function pointer to control the code flow of the process. This represents a viable approach of taking over the affected process.
Heap Spray & Heap Grooming Technique
In order to control the code flow, a heap spray is required to place crafted data into the memory. With the sprayed fake class containing a fake method cache of ‘dealloc’ method, we were able to control the Program Counter (PC) register after triggering the vulnerability using this method*.
The following is a partial crash log generated while testing our POC:
Exception Type: EXC_BAD_ACCESS (SIGBUS)
Exception Subtype: EXC_ARM_DA_ALIGN at 0xdeadbeefdeadbeef
VM Region Info: 0xdeadbeefdeadbeef is not in any region. Bytes after previous region: 16045690973559045872
REGION TYPE START - END [ VSIZE] PRT/MAX SHRMOD REGION DETAIL
MALLOC_NANO 0000000280000000-00000002a0000000 [512.0M] rw-/rwx SM=PRV
--->
UNUSED SPACE AT END
Thread 18 name: Dispatch queue: com.apple.CFNetwork.Connection
Thread 18 Crashed:
0 ??? 0xdeadbeefdeadbeef 0 + -2401053088876216593
1 libdispatch.dylib 0x00000001b7732338 _dispatch_lane_serial_drain$VARIANT$mp + 612
2 libdispatch.dylib 0x00000001b7732e74 _dispatch_lane_invoke$VARIANT$mp + 480
3 libdispatch.dylib 0x00000001b773410c _dispatch_workloop_invoke$VARIANT$mp + 1960
4 libdispatch.dylib 0x00000001b773b4ac _dispatch_workloop_worker_thread + 596
5 libsystem_pthread.dylib 0x00000001b796a114 _pthread_wqthread + 304
6 libsystem_pthread.dylib 0x00000001b796ccd4 start_wqthread + 4
Thread 18 crashed with ARM Thread State (64-bit):
x0: 0x0000000281606300 x1: 0x00000001e4b97b04 x2: 0x0000000000000004 x3: 0x00000001b791df30
x4: 0x00000002827e81c0 x5: 0x0000000000000000 x6: 0x0000000106e5af60 x7: 0x0000000000000940
x8: 0x00000001f14a6f68 x9: 0x00000001e4b97b04 x10: 0x0000000110000ae0 x11: 0x000000130000001f
x12: 0x0000000110000b10 x13: 0x000001a1f14b0141 x14: 0x00000000ef02b800 x15: 0x0000000000000057
x16: 0x00000001f14b0140 x17: 0xdeadbeefdeadbeef x18: 0x0000000000000000 x19: 0x0000000108e68038
x20: 0x0000000108e68000 x21: 0x0000000108e68000 x22: 0x000000016ff3f0e0 x23: 0xa3a3a3a3a3a3a3a3
x24: 0x0000000282721140 x25: 0x0000000108e68038 x26: 0x000000016ff3eac0 x27: 0x00000002827e8e80
x28: 0x000000016ff3f0e0 fp: 0x000000016ff3e870 lr: 0x00000001b6f3db9c
sp: 0x000000016ff3e400 pc: 0xdeadbeefdeadbeef cpsr: 0x60000000
The ideal primitive for heap spray in this case is a memory leak bug that can be triggered remotely, since we want the sprayed memory to stay untouched until the memory corruption is triggered. We left this as an exercise for the reader. Such a primitive could qualify for up to a $7,337 bounty from ZecOps (read more below).
Another way is using MFMutableData itself – when the size of MFMutableData is less than 0x20000 bytes it allocates memory from the heap instead of creating a file to store the content. And we can control the MFMutableData size by splitting content of the email into lines less than 0x20000 bytes since the IMAP library reads email content by lines. With this primitive we have a better chance to place payload into the address we want.
Trigger
An oversized email is capable of reproducing the vulnerability as a PoC (see details in our previous blog), but for a stable exploit we need to take a closer look at “-[MFMutableData appendBytes:length:]”:
-[MFMutableData appendBytes:length:]
{
    int old_len = [self length];
    //...
    char* bytes = self->bytes;
    if (!bytes) {
        bytes = [self _mapMutableData]; // Might be a pointer to an 8-byte heap chunk
    }
    copy_dst = bytes + old_len;
    //...
    // append_length is used for the copy, causing an OOB write past the small heap chunk
    memmove(copy_dst, append_bytes, append_length);
}
The destination address of memmove is “bytes + old_len” instead of “bytes”. So what if we accumulate too much data before triggering the vulnerability? “old_len” would end up so large that the destination address falls beyond the edge of this region, and the process crashes immediately, given that the size of the MALLOC_NANO region is 512MB.
In order to reduce the size of “padding”, we need to consume as much data as possible before triggering the vulnerability – a memory leak would be one of our candidates.
Note that the “padding” does not make the overflow address completely random: the “padding” is predictable per hardware model, since the RAM size is the same and, in our tests, mmap usually failed at the same size.
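The effect of “old_len” on the destination address can be sketched with simple arithmetic. This is a minimal model, not the allocator's real layout: the 512MB region size comes from the text, whereas `NANO_BASE`, the chunk address and both helper functions are hypothetical values chosen purely for illustration.

```python
# Region size taken from the text; the base address is a hypothetical placeholder.
NANO_REGION_SIZE = 512 * 1024 * 1024
NANO_BASE = 0x280000000


def copy_destination(bytes_ptr: int, old_len: int) -> int:
    """Mirror of `copy_dst = bytes + old_len` from the pseudocode."""
    return bytes_ptr + old_len


def lands_in_region(dst: int, base: int = NANO_BASE, size: int = NANO_REGION_SIZE) -> bool:
    """True if `dst` still falls inside the modelled MALLOC_NANO region."""
    return base <= dst < base + size


chunk = NANO_BASE + 0x1000  # hypothetical address of the 8-byte chunk

small_dst = copy_destination(chunk, 0x1000)       # little accumulated data
large_dst = copy_destination(chunk, 0x40000000)   # 1GB accumulated, more than the region size

print(hex(small_dst), lands_in_region(small_dst))
print(hex(large_dst), lands_in_region(large_dst))
```

In this model, any “old_len” of 512MB or more pushes the write outside the region no matter where the chunk sits, which is why a smaller “padding” matters for stability.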
Crash analysis
This post discusses several triggers and exploitability of the MobileMail vulnerability detected in the wild which we covered in our previous blog.
Case 1 shows that the vulnerability was triggered in the wild before it was disclosed.
Case 2 is due to memory corruption in the MALLOC_NANO region; the value of the corrupted memory is part of the sent email and is completely controlled by the sender.
Case 1
The following crash was triggered right inside the vulnerable function while the overflow was happening.
From [a] and [b] we know that the process crashed inside “memmove”, called by “-[MFMutableData appendBytes:length:]”, which means that the value of “copy_dst”, 0x4a35630e, was an invalid address in the first place.
So where did the value of the register x0 (0x4a35630e) come from? It’s much smaller than the lowest valid address.
It turns out that the process crashed after failing to mmap a file and then also failing to allocate the 8-byte chunk.
The invalid address 0x4a35630e is actually the offset, i.e. the length of the MFMutableData before the vulnerability was triggered (“old_len”). When calloc fails to allocate memory it returns NULL, so copy_dst becomes “0 + old_len (0x4a35630e)”.
In this case “old_len” is about 1.2GB, which matches the average length of our PoC, a size that is likely to cause mmap to fail and trigger the vulnerability.
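The arithmetic behind this crash address can be checked in a few lines. The value 0x4a35630e is taken from the crash described above; the variable names are only illustrative.

```python
NULL = 0                      # calloc returned NULL
old_len = 0x4a35630e          # length accumulated before the trigger (from the crash)

copy_dst = NULL + old_len     # same expression as `bytes + old_len` in the pseudocode
size_gb = old_len / 10**9     # decimal gigabytes

print(hex(copy_dst))
print(old_len, "bytes, roughly", round(size_gb, 1), "GB")
```

The destination equals the accumulated length itself, which is why the faulting address is far below any valid mapping.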
Please note that x8-x15, and x0 are fully controlled by the sender.
The crash gives us another answer to our question above: “What if we accumulate too much data before triggering the vulnerability?” The allocation of the 8-byte chunk could fail, and the process would crash while copying the payload to an invalid address. This can make reliable exploitation more difficult, as we may crash before taking over the program counter.
A Blast From The Past: Mysterious Trigger on iOS 3.1.3 in 2010!
Vulnerable version: iOS 3.1.3 on iPhone 2G
Time of crash: 22nd of October, 2010
The user “shyamsandeep”, who registered on the 12th of June 2008 and last logged in on the 16th of October 2011, had a single post in the forum, which contained this exact trigger.
This crash had r0 equal to 0x037ea000, which could be the result of the first vulnerability we disclosed in our previous blog, which was due to an ftruncate() failure. Interestingly, as we explained in the first case, it could also be the result of a failed allocation of the 8-byte chunk. However, it is not possible to determine the exact reason, since the log lacked memory region information. Nonetheless, it is certain that there have been triggers in the wild for this exploitable vulnerability since 2010.
[a]: The pointer to the object was overwritten with “0x0041004100410041”, which is “AAAA” in Unicode (UTF-16).
[b] is one of the instructions around the crashed address, which we’ve added for better understanding. The process crashed on the instruction “ldr x8, [x0]” while -[__NSDictionaryM removeAllObjects] was trying to release one of the objects.
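The reading of the overwritten pointer in [a] as text can be verified with a short sketch. It assumes the 64-bit value is laid out in memory in little-endian byte order, as on arm64, and decodes those bytes as UTF-16.

```python
value = 0x0041004100410041

# Lay the 64-bit value out as it would sit in little-endian memory.
raw = value.to_bytes(8, byteorder="little")

# Each 0x0041 code unit is the UTF-16 encoding of the letter "A".
decoded = raw.decode("utf-16-le")

print(raw.hex(), decoded)
```

Four identical 16-bit code units of 0x0041 decode to four “A” characters, consistent with attacker-supplied text ending up where an object pointer was expected.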
By reverse engineering -[__NSDictionaryM removeAllObjects], we understand that register x0 was loaded from x28(0x0000000282693330), since register x28 was never changed before the crash.
Let’s take a look at the virtual memory region information for x28 (0x0000000282693330): the overwritten object was stored in the MALLOC_NANO region, which stores small heap chunks. The heap overflow vulnerability corrupts the same region, since it overflows an 8-byte heap chunk that is also stored in MALLOC_NANO.
This crash is actually pretty close to controlling the PC, since it controls the pointer to an Objective-C object. By pointing register x0 at memory sprayed with a fake object and a fake class with a fake method cache, attackers could control the PC; this Phrack post explains the details.
Summary
It is rare to see that user-provided inputs trigger and control remote vulnerabilities.
We prove that it is possible to exploit this vulnerability using the described technique.
We have observed real world triggers with a large allocation size.
We have seen real world triggers with values that are controlled by the sender.
The emails we looked for were missing / deleted.
Success-rate can be improved. This bug had in-the-wild triggers in 2010 on an iPhone 2G device.
In our opinion, based on the above, this bug is worth an out of band patch.
How Can Apple Improve the Logs?
The lack of details in iOS logs, and the lack of options to choose the granularity of the data for both individuals and organizations, need to change for iOS to be on par with macOS, Linux, and Windows capabilities. In general, the concept of hacking into a phone in order to analyze it is completely flawed and should not be the normal way to do it.
We suggest Apple improve its error diagnostics process to help individuals, organizations, and SOCs to investigate their devices. We have a few helpful technical suggestions:
Crashes improvement: Enable viewing the memory next to each pointer / register
Crashes improvement: Show stack / heap memory / memory near registers
Add PIDs/PPIDs/UID/EUID to all applicable events
Ability to send these logs to a remote server without physically connecting the phone – we are aware of multiple cases where the logs were mysteriously deleted
Ability to perform complete digital forensics analysis of suspected iOS devices without a need to hack into the device first.
Questions for Apple
How many triggers of this heap overflow have you seen since iOS 3.1.3?
How were you able to determine within one day that all of the triggers of this bug were not malicious, and did you actually go over each event?
When are you planning to patch this vulnerability?
What are you going to do about enhancing forensics on mobile devices (see the list above)?
MailDemon Bounty
If you experienced any of the three symptoms below, use another mail application (e.g. Outlook for Desktop), and send the relevant emails (including the Email Source) to the address [email protected]. There are instructions at the bottom of this post.
Suspected emails may appear as follows:
Bounty details: We will validate whether the email contains exploit code. For the first two submissions containing Mail exploits that are verified by the ZecOps team, we will provide:
$10,000 USD bounty
One license for ZecOps Gluon (DFIR for mobile devices) for 1 year
One license for ZecOps Neutrino (DFIR for endpoints and servers) for 1 year.
We will provide an additional bounty of up to $7,337 for exploit primitive as described above.
We will determine which were the first two valid submissions according to the date they were received by our email server and whether they contain exploit code. This amounts to a total of $27,337 USD in bounties, plus licenses for ZecOps Gluon & Neutrino.
For suspicious submissions, we would also request device logs in order to determine other relevant information about potential attackers exploiting vulnerabilities in Mail and other vulnerabilities on the device.
Please note: not every email that causes the symptoms above and is shared with us will qualify for a bounty, as there could be other bugs in MobileMail/maild; we’re only looking for emails that contain an attack.
How to send the emails using Outlook:
Open Outlook from a computer and locate the relevant email