
CVE-2020-3992 & CVE-2021-21974: Pre-Auth Remote Code Execution in VMware ESXi

2 March 2021 at 16:00

Last fall, I reported two critical-rated, pre-authentication remote code execution vulnerabilities in the VMware ESXi platform. Both of them reside within the same component, the Service Location Protocol (SLP) service. In October, VMware released a patch to address one of the vulnerabilities, but it was incomplete and could be bypassed. VMware released a second patch in November completely addressing the use-after-free (UAF) portion of these bugs. The UAF vulnerability was assigned CVE-2020-3992. After that, VMware released a third patch in February completely addressing the heap overflow portion of these bugs. The heap overflow was assigned CVE-2021-21974.

This blog takes a look at both bugs and how the heap overflow could be used for code execution. Here is a quick video demonstrating the exploit in action:

Service Location Protocol (SLP) is a network service that listens on TCP and UDP port 427 on default installations of VMware ESXi. The implementation VMware uses is based on OpenSLP 1.0.1. VMware maintains its own version and has added some hardening to it.

The service parses network input without authentication and runs as root, so a vulnerability in the ESXi SLP service may lead to pre-auth remote code execution as root. This vector could also be used as a virtual machine escape, since by default a guest can access the SLP service on the host.
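For readers unfamiliar with the wire format, here is a minimal sketch of the fixed SLPv2 message header as laid out in RFC 2608. It only packs and unpacks bytes locally. The helper names `build_slp_header` and `parse_slp_header` are hypothetical, and nothing here is taken from VMware's code.

```python
import struct

SLP_VERSION = 2
SLP_FUNCT_SRVREG = 3      # Service Registration (RFC 2608)
SLP_FUNCT_DAADVERT = 8    # Directory Agent Advertisement (RFC 2608)

def build_slp_header(function_id, xid, body_len=0, lang=b"en"):
    """Pack the SLPv2 header: 14 fixed bytes followed by the language tag."""
    total_len = 14 + len(lang) + body_len
    return b"".join([
        struct.pack("!BB", SLP_VERSION, function_id),
        total_len.to_bytes(3, "big"),        # 24-bit message length
        struct.pack("!H", 0),                # flags (O, F, R) and reserved bits
        (0).to_bytes(3, "big"),              # next extension offset
        struct.pack("!HH", xid, len(lang)),  # transaction ID, language tag length
        lang,
    ])

def parse_slp_header(data):
    """Unpack the fields written by build_slp_header()."""
    version, function_id = struct.unpack("!BB", data[:2])
    total_len = int.from_bytes(data[2:5], "big")
    xid, lang_len = struct.unpack("!HH", data[10:14])
    return {
        "version": version,
        "function_id": function_id,
        "length": total_len,
        "xid": xid,
        "lang": data[14:14 + lang_len],
    }
```

The function ID in the second byte is what selects the handler on the server side, which is why the request types named in the next section matter.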

The Use-After-Free Bug (CVE-2020-3992)

This bug exists only in VMware's implementation of SLP. Here is the simplified pseudocode:

At (3), if an SLP_FUNCT_DAADVERT or SLP_FUNCT_SRVREG request is handled correctly, the allocated SLPMessage is saved into the database. However, at (4), the SLPMessage is freed even though the handled request returned without error. This leaves a dangling pointer in the database. It is possible the free at (4) was added in the course of fixing some older bugs.
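The pattern described at (3) and (4) can be illustrated with a toy model. This is a sketch of the pattern only, not VMware's code: `ToyHeap`, `process_message`, and `dangling` are hypothetical names, and integer handles stand in for heap pointers.

```python
class ToyHeap:
    """Tracks which handles are live; stands in for malloc()/free()."""

    def __init__(self):
        self._next = 1
        self.live = set()

    def alloc(self):
        handle = self._next
        self._next += 1
        self.live.add(handle)
        return handle

    def free(self, handle):
        self.live.discard(handle)


STORED_FUNCTIONS = {"SLP_FUNCT_DAADVERT", "SLP_FUNCT_SRVREG"}


def process_message(heap, database, function, clear_after_store):
    """Handle one request; clear_after_store models the later root-cause fix."""
    msg = heap.alloc()
    if function in STORED_FUNCTIONS:
        database.append(msg)        # (3) the database now references msg
        if clear_after_store:
            msg = None              # drop the local pointer once ownership moves
    if msg is not None:
        heap.free(msg)              # (4) free on the way out


def dangling(heap, database):
    """Handles still referenced by the database but no longer live."""
    return [h for h in database if h not in heap.live]
```

With `clear_after_store=False`, a single SLP_FUNCT_SRVREG request leaves one dangling entry in the database. With `clear_after_store=True`, the entry stays live.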

Bypassing the First Patch for CVE-2020-3992

The first patch (build-16850804) by VMware was interesting. VMware didn't make any changes to the vulnerable code shown above. Instead, they added logic to check the source IP address before handling the request. The logic, which is in IsAddrLocal(), allows requests from a source IP address of localhost only.

Look at the check for a few seconds, and you might notice that the service can still be reached from an IPv6 link-local address on the LAN.
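The distinction that matters here is between loopback and link-local source addresses. The sketch below is not a reconstruction of IsAddrLocal(), whose logic is not reproduced in this post. It only shows, using Python's standard `ipaddress` module, how the two address classes differ and what a strict loopback-only check looks like. The function names are hypothetical.

```python
import ipaddress


def classify(source):
    """Label a source address as loopback, link-local, or other."""
    addr = ipaddress.ip_address(source)
    if addr.is_loopback:
        return "loopback"
    if addr.is_link_local:
        return "link-local"
    return "other"


def is_loopback_only(source):
    """Strict check: accept only 127.0.0.0/8 and ::1."""
    return ipaddress.ip_address(source).is_loopback
```

An fe80::/10 address is assigned to every IPv6-enabled interface and is reachable by neighbors on the same link, so a check that treats it as "local" still exposes the service to the LAN.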

The Second Patch for CVE-2020-3992

Just over two weeks later, the second patch (build-17119627) was released. This time, they improved the source IP address check.

This change does eliminate the IPv6 vector. Additionally, they patched the root cause of the UAF bug by clearing the pointer to the SLPMessage after adding it to the database.

The Heap Overflow Bug (CVE-2021-21974)

Like the previous bug, this bug exists only in VMware's implementation of SLP. Here is the simplified pseudocode:

At (5), srvurl comes from network input, but the function does not terminate srvurl with a NULL byte before using strstr(). The out-of-bounds string search leads to a heap overflow at (6). This happened because VMware did not merge an update from the original OpenSLP project.
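To see why the missing terminator matters, consider a toy model in which a flat byte array stands in for the heap. The function below is a hypothetical helper, not the OpenSLP code. It contrasts a search bounded by the declared URL length with a strstr()-style search that stops only at the first zero byte.

```python
def find_scheme_end(memory, url_off, url_len, bounded):
    """Return the offset of ':/' relative to url_off, or -1 if not found.

    bounded=True  : search only inside the declared url_len bytes.
    bounded=False : mimic strstr(), which scans until a zero byte.
    """
    if bounded:
        limit = url_off + url_len
    else:
        nul = memory.find(b"\x00", url_off)
        limit = len(memory) if nul == -1 else nul
    pos = memory.find(b":/", url_off, limit)
    return -1 if pos == -1 else pos - url_off


# An 8-byte URL field with no ':/' in it, followed by unrelated adjacent data.
memory = bytearray(b"AAAAAAAA" + b"BBBB:/rest\x00")
url_off, url_len = 0, 8
```

In this layout the bounded search finds nothing, while the unbounded search reports a match 12 bytes in. A copy sized from that result would move 12 bytes into a destination sized for an 8-byte URL.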

The Patch for CVE-2021-21974

Six weeks later, the third patch (build-17325551) was released. It addressed the root cause of the heap overflow bug by checking the length before the memcpy at (6).

Exploitation

All Linux exploit mitigations are enabled for /bin/slpd, most notably Position Independent Executable (PIE). This makes it difficult to achieve code execution without first disclosing some addresses from memory. At first, I considered using the UAF, but I could not figure out an effective method to get a memory disclosure. Therefore, I moved my focus to the heap overflow bug instead.

Upgrading the Overflow

SLP uses struct SLPBuffer to handle events that it sends and receives. One SLPBuffer* sendbuf and one SLPBuffer* recvbuf are allocated for each SLPDSocket* connection.

The plan is to partially overwrite the start or curpos pointer in SLPBuffer and leak some memory on the next message reply. However, the sendbuf is emptied and updated before each reply. Fortunately, thanks to the select-based socket model, there is a time window during which sendbuf can survive:

  1. Fill a socket send buffer without receiving until the send buffer is full.
  2. Partially overwrite sendbuf->curpos for that socket.
  3. Start to receive from the socket. The leaked memory will be appended at the end.
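The effect of moving curpos can be sketched with a toy buffer whose fields are offsets into a flat byte array. The field names start and curpos come from the text above. The end field and the `pending_bytes` helper are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToyBuffer:
    start: int    # first byte of the buffer's data
    curpos: int   # next byte to be sent
    end: int      # one past the last byte to be sent


def pending_bytes(memory, buf):
    """Bytes a send loop would still transmit: memory[curpos:end]."""
    return bytes(memory[buf.curpos:buf.end])


# Unrelated heap data sits just below the legitimate reply.
memory = bytearray(b"SECRETPTR" + b"reply-data")
intact = ToyBuffer(start=9, curpos=9, end=19)
corrupted = ToyBuffer(start=9, curpos=0, end=19)
```

With the intact buffer, only `reply-data` is pending. Once curpos points below start, the same send loop also transmits the neighboring bytes.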

There are some additional challenges, though:

  -- Due to the use of strstr(), you cannot overflow with a NULL byte.
  -- The overflowed buffer (obuf) will be automatically freed very soon after the return of SLPParseSrvUrl().

Together, this means that the overwrite can only extend partway through the next chunk header. Otherwise, the size of the next free chunk will be set to a very large value (four non-NULL bytes), and shortly after obuf is freed, the process will abort.
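A short worked example shows why the overwrite must stop partway through the size field. It assumes a 4-byte little-endian size field, which matches the "four non-NULL bytes" mentioned above. The helper name and the sample size 0x111 are illustrative.

```python
import struct


def overwrite_prefix(size, nbytes, fill=0x41):
    """Overwrite the first nbytes of a 4-byte little-endian size field."""
    raw = bytearray(struct.pack("<I", size))
    raw[:nbytes] = bytes([fill]) * nbytes
    return struct.unpack("<I", bytes(raw))[0]


original = 0x111
one_byte = overwrite_prefix(original, 1)    # 0x141: slightly enlarged
two_bytes = overwrite_prefix(original, 2)   # 0x4141: larger, still plausible
four_bytes = overwrite_prefix(original, 4)  # 0x41414141: absurdly large
```

Because every overflowed byte must be nonzero, covering the whole field can never produce a small value, so only the low-order bytes are usable.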

The following layout overcomes these challenges:

[Figure: heap layout at steps (F1) through (F6)]

Assume that the target is sendbuf. In (F1), each chunk marked "IN USE" can be either an SLPBuffer or an SLPDSocket. A hole is prepared for obuf in (F2). After triggering the overflow in (F4), the next freed chunk is enlarged so that it overlaps the target. Next, obuf is freed in (F5). Now, you can allocate a new recvbuf from a new connection to overwrite the target in (F6). This time the overwrite can include NULL bytes.

There is an additional problem:

  -- Many malloc() functions from OpenSLP are replaced with calloc() by VMware.

The recvbuf in (F6) is also allocated from calloc(), which zero-initializes memory. This means that partial pointer overwrites are not possible when recvbuf overlaps the target. There is a trick to get around that, though: You can first set the IS_MMAPPED flag on the freed chunk in (F4). This causes calloc() to skip the zero initialization on the next allocation. This is a general method that is useful in many situations where you want to perform an overwrite on a target.
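The reason the flag helps is that glibc's calloc() treats a chunk marked as mmapped as already zero-filled by the kernel and returns it without clearing it. The toy function below models only that decision. The flag values are the ones glibc stores in the low bits of the chunk size field, while `toy_calloc` itself is a hypothetical stand-in.

```python
PREV_INUSE = 0x1
IS_MMAPPED = 0x2
NON_MAIN_ARENA = 0x4


def toy_calloc(chunk_size_field, stale):
    """Return the contents a calloc-like allocator would hand back.

    stale is whatever the chunk's user area held before allocation.
    """
    if chunk_size_field & IS_MMAPPED:
        # Assumed to come fresh from mmap(), so it is not cleared again.
        return bytes(stale)
    return bytes(len(stale))  # normal path: zero-filled


stale = b"\xde\xad\xbe\xef"
normal = toy_calloc(0x110 | PREV_INUSE, stale)
flagged = toy_calloc(0x110 | PREV_INUSE | IS_MMAPPED, stale)
```

In the flagged case the old contents survive the allocation, which is what makes a partial overwrite of an existing pointer possible.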

Putting It All Together

  1. Overwrite a connection's state (connection->state) with STREAM_WRITE_FIRST. This is necessary so that sendbuf->curpos will get reset to sendbuf->start in preparation for the memory disclosure.
  2. Partially overwrite sendbuf->start with 2 NULL bytes, where sendbuf belongs to the connection mentioned in step 1. Start receiving from the connection. This yields a memory disclosure that includes the address of sendbuf.
  3. Overwrite sendbuf->curpos from a new connection to leak the address of a recvbuf, which is allocated from mmap(). Once you have an mmapped address, it becomes possible to infer the libc base address.
  4. Overwrite recvbuf->curpos from a new connection, setting it to the address of free_hook. Start sending on the connection. You can then overwrite free_hook.
  5. Close a connection, invoking free_hook to start the ROP chain.

These steps may not be optimal.
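The arithmetic behind step 2 is worth spelling out. Zeroing the two low-order bytes of a little-endian pointer rounds it down to a 64 KiB boundary, so the reply starts at a lower address and carries extra bytes. The sketch assumes a 32-bit pointer, and the sample address is made up.

```python
import struct


def zero_low_bytes(pointer, nbytes):
    """Zero the nbytes low-order bytes of a 32-bit little-endian pointer."""
    raw = bytearray(struct.pack("<I", pointer))
    raw[:nbytes] = bytes(nbytes)
    return struct.unpack("<I", bytes(raw))[0]


def extra_leak(pointer, nbytes):
    """Number of additional bytes exposed below the original pointer."""
    return pointer - zero_low_bytes(pointer, nbytes)


start = 0x0812ABCD                  # hypothetical value of sendbuf->start
new_start = zero_low_bytes(start, 2)
```

Here new_start is 0x08120000, and everything between it and the original start (0xABCD bytes in this example) is sent ahead of the real reply.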

Privilege Level Obtained

If everything goes well, you can execute arbitrary code with root privileges on the target ESXi system. In ESXi 7, a new feature called DaemonSandboxing was prepared for SLP. It uses an AppArmor-like sandbox to isolate the SLP daemon. However, I found that this is disabled by default in my environment.

This suggests that a sandbox escape stage will be required in the future.

Conclusion

VMware ESXi is a popular infrastructure for cloud service providers and many others. Because of its popularity, these bugs may be exploited in the wild at some point. To defend against these vulnerabilities, you can either apply the relevant patches or implement the workaround. You should consider applying both to ensure your systems are adequately protected. Additionally, VMware now recommends disabling the OpenSLP service in ESXi if it is not used.

We look forward to seeing other methods to exploit these bugs as well as other ESXi vulnerabilities in general. Until then, you can find me on Twitter @_wmliang_, and follow the team for the latest in exploit techniques and security patches.


CVE-2020-8625: A Fifteen-Year-Old RCE Bug Returns in ISC BIND Server

25 February 2021 at 17:30

In October 2020, we received a submission from an anonymous researcher targeting the ISC BIND server. The discovery was based upon an earlier vulnerability, CVE-2006-5989, which affected the Apache module mod_auth_kerb and was initially found by an anonymous researcher. The ISC BIND server shared the vulnerable code within the Simple and Protected GSSAPI Negotiation Mechanism (SPNEGO) component, but ISC did not merge the patch at that time. After 15 years, ISC patched the bug in BIND and assigned it CVE-2020-8625.

This vulnerability affects BIND versions from 9.11 to 9.16. It can be triggered remotely and without authentication. It leads to a 4-byte heap overflow. This submission was close to earning a larger payout through our Targeted Incentive Program, but lacked the full exploit needed to qualify for the full award. Still, it's a great submission, and the bug is worth looking at in greater detail.

The Vulnerability

The heap overflow bug exists in function der_get_oid(), which is in lib/dns/spnego.c.

This function allocates an array buffer at (1). The variable len is used to keep track of the number of available elements remaining in the buffer. The code fills the first 2 elements at (2), but it only decreases len by 1 at (3). As a result, the loop (4) can overflow the buffer by 1 element. The type of data->components is int, so we have a 4-byte heap overflow.
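The off-by-one can be checked by counting stores. The function below is a reconstruction from the description above, not the BIND source. The handling of multi-byte components (continuation bytes with the high bit set) follows the standard base-128 OID encoding and is an assumption about the loop at (4).

```python
def components_written(encoded):
    """Count the array elements the decoding loop stores for an encoded OID."""
    length = len(encoded)
    if length < 1:
        raise ValueError("empty OID")
    written = 2      # (2) the first byte yields two components
    length -= 1      # (3) but len is decreased by only 1
    pos = 1
    while length > 0:                 # (4) one store per decoded component
        while True:
            length -= 1
            byte = encoded[pos]
            pos += 1
            if length == 0 or not (byte & 0x80):
                break
        written += 1
    return written


def overflow_elements(encoded):
    """Elements stored past the end of the array allocated at (1)."""
    capacity = len(encoded)           # (1) one element per input byte
    return max(0, components_written(encoded) - capacity)
```

When every byte after the first encodes a complete component, an n-byte input stores n + 1 elements into an n-element array. With 4-byte elements, that is the 4-byte overflow. Inputs that contain continuation bytes decode to fewer components and stay within bounds.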

The Trigger

Since the vulnerability exists within the SPNEGO component, BIND must be configured with TKEY-GSSAPI.

The dns.keytab file can be found in bin/tests/system/tsiggss/ns1/, and the example.nil.db file is generated by the script bin/tests/system/tsiggss/setup.sh.

Now the environment is ready. Upon receiving a crafted request, the vulnerability is triggered, producing the following call stack:

Exploitation

The exploitability of this bug is highly dependent on the glibc version. The following explanation is based on Ubuntu 18.04 with glibc 2.27, which enables tcache support.

First, we have to determine what is under control from this overflow bug.

  -- The size and content of the vulnerable buffer, which is allocated in der_get_oid(), are controllable. Note that the buffer is freed when the current request is done.
  -- There is a while loop in decode_MechTypeList() that executes der_get_oid() repeatedly. The loop count is controllable.

With these two points in mind, we can manipulate the heap fairly easily. To prepare the heap, we can exhaust tcache bins of any size and refill them after the request is done. Also, the refilled chunks can be contiguous in memory. This makes the memory layout quite conducive to exploitation via a buffer overflow.

Arbitrary write

At this stage, achieving an arbitrary write is straightforward by abusing the tcache freelist.

  1. Trigger a 4-byte overflow to enlarge the next free chunk size.
  2. Allocate the corrupted chunk on the next request. It will be moved to the new tcache bin when the request ends.
  3. Allocate the corrupted chunk again with the new size. The corrupted chunk overlaps the next free chunk and overwrites its freelist with an arbitrary value.
  4. Allocate from the poisoned tcache freelist. It will return an arbitrary address.
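Steps 3 and 4 rely on the fact that a tcache bin in glibc 2.27 is a plain singly linked LIFO list whose next pointer lives inside the freed chunk and is not validated. The toy model below captures just that property. `ToyTcacheBin` and `poison_demo` are hypothetical names, and integers stand in for addresses.

```python
class ToyTcacheBin:
    """A LIFO free list whose links are stored inside the freed chunks."""

    def __init__(self):
        self.head = None
        self.next_ptr = {}   # chunk address -> next pointer stored in that chunk

    def free(self, addr):
        self.next_ptr[addr] = self.head
        self.head = addr

    def malloc(self):
        addr = self.head
        self.head = self.next_ptr.pop(addr, None)
        return addr


def poison_demo(target):
    """Free one chunk, corrupt its next pointer, then allocate twice."""
    tbin = ToyTcacheBin()
    tbin.free(0x1000)
    tbin.next_ptr[0x1000] = target   # the write from the overlapping chunk
    return [tbin.malloc(), tbin.malloc()]
```

The first allocation returns the legitimate chunk. The second returns whatever value was written over the link, which is the arbitrary address described in step 4.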

Attempting to leak an address

All Linux mitigations are enabled by default for BIND. We have to deal with ASLR first, which means we will need to find a way to leak an address from memory. One possible opportunity for a leak is in the code_NegTokenArg() function. It is used for encoding response messages into a buffer, which will be sent to the client.

buf at (5) is a temporary buffer. Its initial size is 1024 bytes, which is within the range of sizes handled by tcache. outbuf at (6) is the buffer that will be sent to the client. Its size is also within range for tcache. If it is possible to apply a tcache dup attack on these two buffer sizes, the two malloc() calls at (5) and (6) will return the same address. After the free() at (7), a tcache->next pointer will be written into buf, which overlaps outbuf. This means a heap pointer will leak to the client.

Ideally, buf_len at (6) should be chosen to be large enough to avoid interfering with small tcache bins. Unfortunately, it seems the maximum value is only about 96 bytes. Due to this problem, the process does not survive and crashes very soon after the client gets the leaked heap pointer. More research is needed to find a way to continue the path to a full exploit.
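To make the size discussion concrete, the following sketch reproduces the request-to-bin arithmetic. The constants are the defaults for 64-bit glibc 2.27 and are assumptions about the target build. The function names mirror glibc's macros but are plain Python here.

```python
SIZE_SZ = 8              # sizeof(size_t) on x86-64
MALLOC_ALIGNMENT = 16
MINSIZE = 32
TCACHE_MAX_BINS = 64


def request2size(req):
    """Chunk size glibc uses for a malloc(req) request."""
    size = (req + SIZE_SZ + MALLOC_ALIGNMENT - 1) & ~(MALLOC_ALIGNMENT - 1)
    return max(size, MINSIZE)


def tcache_index(req):
    """tcache bin index for a request, or None if tcache does not handle it."""
    idx = (request2size(req) - MINSIZE + MALLOC_ALIGNMENT - 1) // MALLOC_ALIGNMENT
    return idx if idx < TCACHE_MAX_BINS else None
```

Under these assumptions, a 1024-byte request maps to chunk size 0x410 and the last tcache bin (index 63), a 96-byte request maps to chunk size 0x70 and bin 5, and anything above 1032 bytes bypasses tcache. A buffer capped at about 96 bytes therefore shares its bin with many ordinary small allocations.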

The Patch

The patched versions are BIND 9.16.12 and BIND 9.11.28. To fix BIND 9.16, ISC fixed the buffer allocation size at (1). In BIND 9.11, they applied the patch as well.

Conclusion

This bug shows how vulnerabilities can reside undetected for years, even when the software is open source and in wide use. Software maintainers need to closely monitor all of the external modules they consume to ensure they stay up to date with the latest patches. It also shows how complex this challenge can be. ISC BIND is the most popular DNS server on the internet. The scope of impact is quite large, especially since the vulnerability can be triggered remotely and without authentication. All are advised to update their DNS servers as soon as possible.

For more information about our Targeted Incentive Program, check out this blog. We hope to see more submissions for this program in the future. Until then, you can find me on Twitter @_wmliang_, and follow the team for the latest in exploit techniques and security patches.

