🔒
There are new articles available, click to refresh the page.
Before yesterdayZero Day Initiative - Blog

CVE-2021-27077: Selecting Bitmaps into Mismatched Device Contexts

28 July 2021 at 15:28

In March 2021, Microsoft released a patch to correct a vulnerability in the Windows GDI subsystem. The bug could allow an attacker to execute code with escalated privileges. This vulnerability was reported to the ZDI program by security researcher Marcin Wiązowski. The patch for CVE-2021-27077 addresses several of his bug submissions. He has graciously provided this detailed write-up of the vulnerabilities and an analysis of the patch from Microsoft.


To handle devices on which drawing can be performed (pixel-based devices), Windows provides so-called device contexts (DCs). A device context object encapsulates a device object where a device object represents either a screen or a printer. Device contexts are an abstraction layer between user-mode code and low-level device drivers. They also act as containers, referencing various graphic objects used during drawing operations, such as pens, brushes, bitmaps, palettes, regions, and paths. To establish a connection between a device context and some other graphic object, the user-mode code calls SelectObject. For example:

In this example, we create a screen-related device context and three other GDI objects: a region, a font, and a bitmap. By selecting them into the device context, we cause them to be used during the ExtTextOut call below. This call will draw text on our TestBitmap by using our TestFont, and the drawing area will be clipped by our TestRegion.

For us, the important facts about device contexts are:

1 - They can be either screen-related or printer-related.
2 - To reference the screen or printer, each device context internally keeps a handle, referred to as hdev. Internally, a device context object resides in kernel memory, and its hdev field is a pointer to another object residing in kernel memory, this one being a device object.

Bitmaps selected into device contexts

Bitmaps are represented in kernel memory as surface objects. The partial structure definition shown here includes some fields documented by Microsoft, as well as some undocumented fields that have been reconstructed:

As we can see, bitmaps can contain a hdev value. This value is initially set to NULL, but selecting the bitmap into a device context copies the hdev value from the device context to the bitmap’s hdev field.

Now let’s do an experiment:

In this example:

1) We create one printer-related device context (DCPrn) and one screen-related device context (DCScr). For the printer DC, we used the printer named “Microsoft XPS Document Writer” which is available by default, but any installed printer could be used instead. If necessary, the EnumPrinters API could be used to get the names of all installed printers.
2) We create a bitmap (Bitmap). Note that it’s a monochrome (1 bit deep) bitmap, so it’s compatible with any device context. This includes 1-bit or 8-bit printer-related DCs and 32-bit screen-related DCs.
3) We select the bitmap into the printer-related DC. This sets the bitmap’s hdev field so that it points to the printer device object in kernel memory. The variable PrevBitmap is set to the default bitmap that was created together with the printer DC.
4) We deselect our bitmap (by selecting PrevBitmap instead), so we’ll be able to select Bitmap into some other device context again. Bitmaps are specific in that sense. They can be selected into only one device context at a time, so deselecting is required.
5) Finally, we select Bitmap into a screen-related DC. This updates bitmap’s hdev field so that it no longer points to the printer device object, but rather to the screen device object.

This code isn’t malicious, but we’ll make it malicious in the next step. But first, let’s look at some changes that were introduced in Windows 2000.

User-Mode Printer Drivers (UMPD)

Historically, both screens and printers used to be handled by kernel-mode drivers. While there are only a small number of vendors that manufacture graphic cards and write corresponding driver code, the world of printers is quite a different story. Undoubtedly there were plenty of printer manufacturers who introduced numerous security risks and instabilities in their drivers. To make things safer and more stable, Windows 2000 introduced an architecture that allowed printer drivers to work in user mode instead. Both kernel-mode and user-mode printer drivers coexisted until Windows Vista. Since then, Windows has supported only user-mode drivers for printing.

Consequently, only some stubs remained in kernel code for printer handling. These stubs mainly just make callbacks to user mode. In user mode, these calls are first passed to some internal code in user32.dll, then to some more code in gdi32.dll, then to yet more internal code in gdi32full.dll, which finally passes them to user-mode printer drivers for handling:

To provide user-mode drivers with the needed functionalities, some kernel-mode APIs now also have their user-mode API counterparts. One such API is EngAssociateSurface. Display drivers use calls to the kernel-mode function win32kbase.sys!EngAssociateSurface, whereas printer drivers use the user-mode function gdi32.dll!EngAssociateSurface:

For our purposes, we can consider the user-mode EngAssociateSurface function to be an extended version of the SelectObject API. When passing a bitmap as the first parameter, we can set not only the bitmap’s hdev field but also its dhpdev and flags fields. Here are a few things to consider:

1) The hdev parameter will be a valid value obtained from an existing printer-related device context. User-mode code can obtain this value since it’s forwarded by the kernel to the user-mode printer driver. This will be explained below in greater detail. After validation, the passed hdev parameter will be copied to the bitmap’s hdev field.
2) The flHooks parameter may contain any combination of HOOK_XXX flags, documented here. These flags will be set in the bitmap’s flags field.
3) The dhpdev value is not passed as a parameter. However, the kernel maintains its internal list of hdev-dhpdev pairs, so setting the bitmap’s hdev field will also set its dhpdev field.

During device initialization, the device driver can pass a dhpdev value of the driver’s choice to the operating system, thus creating a hdev-dhpdev pair. In practice, the dhpdev value points to a block of driver-allocated memory, which will be either kernel-mode memory for display drivers or user-mode memory for printer drivers. This observation is very important for us.

Whenever executing a GDI graphics primitive (for example, ExtTextOut, which renders text to a DC), there are two code paths that could be chosen. One code path implements the primitive via a call to a corresponding driver function (supposing one exists) that knows how to perform that graphics primitive natively. The other code path uses a generic implementation of the primitive as supplied by win32k. The bitmap’s flags field, containing a combination of values from the HOOK_XXX enumeration, tells the operating system which GDI operations should be handled by the win32k subsystem and which should be directed to a corresponding device driver function. When so requested, the driver is identified by the device context’s hdev field.

Since we used ExtTextOut in our example above, let’s consider the HOOK_TEXTOUT flag. The ExtTextOut GDI function takes a device context as a parameter and passes it to the kernel-mode implementation of GDI. The kernel obtains the bitmap that is selected into this device context and checks if the bitmap’s flags field has the HOOK_TEXTOUT bit set. If this bit is not set, the kernel will pass our request to the generic win32kfull.sys!EngTextOut function to render the text. If HOOK_TEXTOUT is set, though, the kernel will forward our request to the device driver as specified by the bitmap’s hdev field. In this case, execution will be passed in kernel mode to one of three possible places (a few more are used in some specific cases, but they are not interesting for us):

Cdd.dll!DrvTextOut, which is a part of a top-level display driver, used for screen devices when only one monitor is active.
win32kfull.sys!MulTextOut, which is a part of a top-level display driver, used for multi-monitor configurations. This driver is embedded directly in the win32k subsystem.
win32kfull.sys!UMPDDrvTextOut, which is one of the printer stubs in the kernel. It uses a callback to pass the request through to user mode. This allows the request to be handled by the appropriate user-mode printer driver.

We are now ready to make our piece of code malicious. To achieve this, we’ll replace one of the SelectObject calls with a call to EngAssociateSurface call, passing HOOK_TEXTOUT as the flags parameter:

Other modifications are as follows:

1) We no longer need to call CreateCompatibleDC when creating DCPrn. This is because we no longer call SelectObject on DCPrn but rather EngAssociateSurface on DCPrn’s hdev value.
2) Because of internal EngAssociateSurface requirements, we can’t use the CreateBitmap call anymore. We must instead call CreateCompatibleBitmap on DCPrn.
3) We no longer need to deselect our bitmap from DCPrn. This is because there is no SelectObject call on DCPrn but rather EngAssociateSurface on DCPrn’s hdev value.
4) Note that we explicitly call a Unicode (wide) version of the ExtTextOut API. We also provide an ETO_IGNORELANGUAGE flag to this call, along with a very long string. This way, our ExtTextOut request will be passed directly to the kernel mode without any further user-mode handling.

Let’s look at the comments in the code snippet above. The EngAssociateSurface call sets the bitmap’s hdev field so it points to a printer device in kernel mode. As explained above, this will also affect the bitmap’s dhpdev field. As we remember, this value, in practice, points to a block of driver-allocated memory. Since printer drivers are handled in user mode, this will be some block of user-mode memory. Finally, the HOOK_TEXTOUT bit is set in the bitmap’s flags field.

The SelectObject call is then made, which ties the bitmap to DCScr and modifies the bitmap’s hdev field so it no longer points to the printer device but rather to a screen device. Other bitmap fields are not modified. In particular, SelectObject will not alter the dhpdev field.

Finally, we have the ExtTextOutW call on the screen-related DCScr. When handling this call, the kernel will see the HOOK_TEXTOUT flag set in the bitmap’s flags field, so drawing will not be handled by the win32k subsystem. Instead, our request will be passed to the device driver specified by the DCScr’s hdev value, which in our case is the display driver. This way, thanks to our manipulations, the display driver will get the bitmap containing a pointer to some user-mode memory in the bitmap’s dhpdev field while it expects to see a pointer to its own, kernel-mode memory there. We definitely found a security problem.

By preparing a block of user-mode memory appropriately, we can trick the display driver into overwriting kernel memory of our choice. This means privilege escalation is possible. But first, we must perform some interactions with a user-mode printer driver to help us fulfull the following four requirements:

Requirement 1: We must obtain the printer’s hdev value so we will be able to make the EngAssociateSurface call. This is a kernel-mode pointer.
Requirement 2: We must obtain the printer’s dhpdev pointer (note that this is a user-mode pointer), so we can prepare the memory that the display driver will access there. Even better, we can also modify this pointer at will, as explained below.
Requirement 3: Calling CreateCompatibleBitmap on DCPrn must return a bitmap that can be then selected into DCScr, so it must be either a 1-bit or a 32-bit bitmap. For this reason, we must force the user-mode printer driver to create either a 1-bit or a 32-bit device context, which will become our DCPrn.
Requirement 4: We must force the user-mode printer driver to report that it has a native text rendering function. Otherwise, calling EngAssociateSurface with HOOK_TEXTOUT will fail.

Hooking the UMPD implementation

Printer drivers are user-mode DLLs registered in the Registry. Printer drivers may implement functions that are defined in the winddi.h header file and are identified by integer constants:

To implement INDEX_DrvEnablePDEV functionality, device drivers provide a function with the following definition:

As we can see, this function gets the hdev value as a parameter and returns the dhpdev value. This means that we need to install some wrapper around this function in the printer driver to be able to obtain and modify these values.

To reach our goal, we could hook the user-mode printer driver itself, as described in detail here. However, this is quite a complex approach. Instead, we’ll try another solution that is both simpler and more universal. Instead of placing hooks in the driver itself, we will hook the user-mode code that is executed above the driver:

By injecting our own layer between user32.dll!ClientPrinterThunk and gdi32.dll!GdiPrinterThunk, we can obtain a central point at which all data passed to or returned from printer drivers can be read and modified. This can be achieved easily, since gdi32.dll!GdiPrinterThunk is a public export from the gdi32.dll module. Because user32.dll is a PE image mapped into memory, it’s enough to enumerate its import table and replace all pointers to gdi32.dll!GdiPrinterThunk with pointers to our own code. This gives us the ability to modify the incoming data, call the original gdi32.dll!GdiPrinterThunk function, and modify the returned data as needed.

The gdi32.dll!GdiPrinterThunk function, although exported by gdi32.dll, is not publicly defined. As of today, it has four parameters and can be reconstructed as:

In this call, the input buffer contains the identifier of the driver function to be called (i.e., one of the INDEX_XXX values) along with data needed for the call. The output buffer receives the results returned by the printer driver. The needed parts of the input and output buffer can be reconstructed as:

For our exploit, we need custom handling of the following INDEX_XXX requests in our GdiPrinterThunk wrapper:

1) INDEX_DrvEnablePDEV
2) INDEX_UMPDDrvDriverFn (undocumented, called from win32kfull.sys!UMPDDrvDriverFn)
3) INDEX_DrvDisablePDEV

Aside from the INDEX_XXX values defined in winddi.h, GdiPrinterThunk also receives a few undocumented commands. These have INDEX_XXX values greater than the highest documented value. Undocumented commands are never forwarded down to the lowest level (the user-mode printer driver DLL). They are designed for internal gdi32full.dll use only. The first command passed to the GdiPrinterThunk call is always a particular undocumented command, which, after increasing by 3, gives us the needed INDEX_UMPDDrvDriverFn value.

After using CreateDC to create a printer-related device context, we’ll catch many requests in our installed GdiPrinterThunk wrapper. Most of them should be just passed directly to the original gdi32.dll!GdiPrinterThunk implementation, with the three exceptions as mentioned above:

• When handling INDEX_DrvEnablePDEV, we can find the hdev value needed for the EngAssociateSurface call in the input buffer, which satisfies Requirement #1. Then, after calling the original gdi32.dll!GdiPrinterThunk function, we can find the dhpdev value in the output buffer, which points to some block of user-mode memory. Disassembling gdi32full.dll shows that this block of memory contains only two handles/pointers. At this moment, we can allocate some large (64 kB) block of user-mode memory, copy these two handles/pointers to our block, and overwrite the original dhpdev value with a pointer to our own block. By disclosing dhpdev as well as overwriting it with a new value to our liking, we have satisfied Requirement #2. Additionally, the output buffer contains a pointer to a pdi record. We should set pdi->iDitherFormat to BMF_1BPP and also ensure that pdi->flGraphicsCaps has the GCAPS_PALMANAGED flag cleared. This way, Requirement #3 will be met as well.

• When handling INDEX_UMPDDrvDriverFn, we get the DriverFn array in the output buffer. After calling the original gdi32.dll!GdiPrinterThunk function, we should set DriverFn[INDEX_DrvTextOut] to TRUE so the kernel will think that text drawing functionality is handled natively by the printer driver. Thanks to this, passing HOOK_TEXTOUT flag to our EngAssociateSurface call will succeed. This will satisfy Requirement #4.

• When handling INDEX_DrvDisablePDEV, we’ll find our own dhpdev value in the input buffer. It’s good to replace it with the original value before calling the original gdi32.dll!GdiPrinterThunk function to avoid problems with heap memory. gdi32full.dll treats this value as a pointer to its own block of heap memory and tries to release it.

Kernel Exploitation with a Fake dhpdev Block

We are now ready to pass our block of user-mode memory to the display driver. Our choice will be the multi-monitor display driver. More precisely, we will target its win32kfull.sys!MulTextOut function. This will limit our exploitation to multi-monitor machines only, but exploitation will be a bit tricky. The multi-monitor driver expects to see its own data structures in the dhpdev block, so we must craft this block of memory properly to obtain a controllable memory write. A detailed description of the needed memory contents would be quite boring and not really cutting-edge, so let’s immediately focus on the tricky part. There are some flags stored in the dhpdev block where we can set the RC_PALETTE bit. This informs the multi-monitor driver that our displays need a palette for drawing. A palette is used for dealing with 8-bit color modes. Enabling such modes for displays has not been possible since Windows 8. However, the code for handling them is still present. This allows us to use the RC_PALETTE flag and thus force the driver to create a palette object while, by preparing contents of the fake dhpdev block carefully, we can control the memory address where the palette object will be stored. In other words, we can overwrite kernel memory of our choice with a palette handle value.

Our target for overwriting will be our process token’s privileges – see here for details on how to obtain the address of our privileges in kernel memory. In short, we can treat privileges as two 64-bit bitfields located in kernel memory. One of these bitfields tells which privileges are present and another one tells which are enabled. For our purposes, it’s enough to gain the following four privileges:

1) SeCreateTokenPrivilege
2) SeSecurityPrivilege
3) SeAssignPrimaryTokenPrivilege
4) SeTcbPrivilege

To obtain these privileges, we need to set their corresponding bits in the present bitfield. Generally speaking, once a privilege is present in the token, it can be enabled by calling the AdjustTokenPrivileges API. However, not all privileges can be enabled so easily. Each process token contains not only its privileges but also a so-called integrity level. Normal processes don’t have an integrity level high enough to enable the most critical privileges just by making an API call. For this reason, we must perform our exploitation twice: once to overwrite the bitfield indicating which privileges are present and a second time to overwrite the bitfield indicating which privileges are enabled. This means we must:

1) Prepare our fake dhpdev block in such a way that the handle to the palette object will be stored in the present bitfield.
2) Call ExtTextOutW.
3) Prepare our fake dhpdev block once again so the handle to the palette object will be stored in the enabled bitfield.
4) Call ExtTextOutW again.

For successful exploitation, we need to hold only the four privileges listed above. All other privileges may be either set or cleared. Having these four privileges is enough to make a successful call to an undocumented NtCreateToken native API. This allows us to create a token with all privileges present and enabled and with the highest possible integrity level. Passing such a token to a CreateProcessAsUser call allows us to create a maximally powerful user-mode process.

Our last task is to force the kernel to create palettes with the proper handle values so that writing so those values to the privileges bitfields will set the required bits.

Controlling GDI handle values

The lowest 16 bits of each GDI handle act as an index to a so-called global GDI table. For example, a handle value may be:

0x6c0859b2

where:
1) 0x59b2 is an index to a slot in the global GDI table.
2) 0x08 is an object type. 0x08 stands for a palette object.
3) 0x6c is a counter that is incremented each time a slot is assigned for a new object.

Slots in the global GDI table are used in a LIFO (Last In, First Out) manner. We can experiment and create 3 GDI objects: Object1, Object2, and Object3. Obtained handle values may be, for example:

0x74xx25c8
0x21xx1287
0x15xx4f34

Now we can delete these objects in reverse order (Object3, Object2, and Object1) and recreate them in the original order again. The handle values will most probably still reference the same slots, and their values will now be:

0x75xx25c8
0x22xx1287
0x16xx4f34

To achieve our goal, we should:
1) Create many GDI objects of any type. Since there is a per-process limit of 10,000 GDI handles, we can create 9,500 objects. Assuming a linear distribution, 1/16 of the obtained handle values (nearly 600 of them) will have the needed bit combination.
2) Delete the objects that don’t have the needed bit combination in their lowest 16 bits.
3) Delete the objects that do have the needed bit combination in their lowest 16 bits.

The slots of the most recently deleted objects will be first to be reused again when the multi-monitor driver allocates its palette objects during exploitation. Since the most recently deleted objects have the required bit combination in their handle values, the kernel-generated palette objects will also have this same bit combination.

Due to multitasking, other processes may interfere with our manipulations and release additional slots or reuse our just-released slots before they are used by the multi-monitor driver when creating its internal palette objects. This is not a big problem. We can call PrivilegeCheck to see if our process token’s privileges have been overwritten as expected. In case of failure, we can just attempt our exploitation once again.

As a result of our exploitation:
1) The needed privileges in our process token will always be set.
2) There are some other privileges that will always be in a predictable state (set or cleared) due to their being overwritten by the 0x08 part of the handle value.
3) The remaining privileges will be overwritten randomly.

The Patch

When the vulnerability was found, it was first thought that the problem was that the multi-monitor driver doesn’t validate that the dhpdev pointer in the received bitmaps falls within the kernel-mode memory range. We should note that some other HOOK_XXX flags could also be used for exploitation, for example, HOOK_LINETO along with a LineTo call. Therefore, the validation would need to be implemented in multiple places in the multi-monitor driver.

However, under the hood, things are probably more complicated than they appear, so another solution has been implemented. Now, when a bitmap’s dhpdev field is non-NULL (which would be a consequence of making an EngAssociateSurface call), and the bitmap is also printer-compatible (one of the bitmap’s undocumented flags indicates this), selecting the bitmap into a screen-related device context is no longer possible. A newly-added win32kbase.sys!bIsSurfaceAllowedInDC function checks these conditions and causes the SelectObject call to fail. The introduced change shouldn’t cause any problems in real-life scenarios, since the algorithm used for the exploitation is highly artificial. However, this change still breaks compatibility to some extent. For this reason, the patch can be disabled by adding an undocumented registry key from an elevated command prompt and rebooting:

reg.exe add HKEY_LOCAL_MACHINE\System\CurrentControlSet\Policies /v {b7ca08da-d52e-4acb-9866-7dad281e4489} /t REG_DWORD /d 1

Another newly-added function named win32kbase.sys!UMPDAllowPrinterSurfaceInDisplayDC is now called during system initialization. It reads the registry key (if it exists) and stores the result in a global variable called win32kbase.sys!gAllowPrinterSurfaceInDisplayDC. This variable is then read by the aforementioned win32kbase.sys!bIsSurfaceAllowedInDC function, which is internally called during each SelectObject call.


Thanks again to Marcin for providing this thorough write-up. He has contributed many bugs to the ZDI program over the last couple of years, and we certainly hope to see more submissions from him in the future. Until then, follow the team for the latest in exploit techniques and security patches.

CVE-2021-27077: Selecting Bitmaps into Mismatched Device Contexts

From Pwn2Own 2021: A New Attack Surface on Microsoft Exchange - ProxyShell!

18 August 2021 at 14:54

In April 2021, Orange Tsai from DEVCORE Research Team demonstrated a remote code execution vulnerability in Microsoft Exchange during the Pwn2Own Vancouver 2021 contest. In doing so, he earned himself $200,000. Since then, he has disclosed several other bugs in Exchange and presented some of his findings at the recent Black Hat conference. Now that the bugs have been addressed by Microsoft, Orange has graciously provided this detailed write-up of the vulnerabilities he calls “ProxyShell”.


Hi, I am Orange Tsai from DEVCORE Research Team. In this article, I will introduce the exploit chain we demonstrated at the Pwn2Own 2021. It’s a pre-auth RCE on Microsoft Exchange Server and we named it ProxyShell! This article will provide additional details of the vulnerabilities. Regarding the architecture, and the new attack surface we uncovered, you can follow my talk on Black Hat USA and DEFCON or read the technical analysis in our blog.

ProxyShell consists of 3 vulnerabilities:

CVE-2021-34473 - Pre-auth Path Confusion leads to ACL Bypass
CVE-2021-34523 - Elevation of Privilege on Exchange PowerShell Backend
CVE-2021-31207 - Post-auth Arbitrary-File-Write leads to RCE

With ProxyShell, an unauthenticated attacker can execute arbitrary commands on Microsoft Exchange Server through an exposed 443 port!

CVE-2021-34473 - Pre-auth Path Confusion

The first vulnerability of ProxyShell is similar to the SSRF in ProxyLogon. It too appears when the frontend (known as Client Access Services, or CAS) is calculating the backend URL. When a client HTTP request is categorized as an Explicit Logon Request, Exchange will normalize the request URL and remove the mailbox address part before routing the request to the backend.

Explicit Login is a special feature in Exchange to make a browser embed or display a specific user’s mailbox or calendar with a single URL. To accomplish this feature, this URL must be simple and include the mailbox address to be displayed. For example:

         https://exchange/OWA/[email protected]/Default.aspx

Through our research, we found that in certain handlers such as EwsAutodiscoverProxyRequestHandler, we can specify the mailbox address via the query string. Because Exchange doesn’t conduct sufficient checks on the mailbox address, we can erase a part of the URL via the query string during the URL normalization to access an arbitrary backend URL.

HttpProxy/EwsAutodiscoverProxyRequestHandler.cs

From the above code snippet, if the URL passes the check of IsAutodiscoverV2PreviewRequest, we can specify the Explicit Logon address via the Email parameter of the query string. It’s easy because this method just performs a simple validation of the URL suffix.

The Explicit Logon address will then be passed as an argument to method RemoveExplicitLogonFromUrlAbsoluteUri, and the method just uses Substring to erase the pattern we specified.

Here we designed the following URL to abuse the normalization process of Explicit Logon URL:

         https://exchange/autodiscover/[email protected]/?&          Email=autodiscover/autodiscover.json%[email protected]

upload_478912a7f6e2273a32eb713e9bce6e25.png

This faulty URL normalization lets us access an arbitrary backend URL while running as the Exchange Server machine account. Although this bug is not as powerful as the SSRF in ProxyLogon, and we could manipulate only the path part of the URL, it’s still powerful enough for us to conduct further attacks with arbitrary backend access.

upload_904b9bf84f3227a749234404c6062591.png

CVE-2021-34523 - Exchange PowerShell Backend Elevation-of-Privilege

So far, we can access arbitrary backend URLs. The remaining part is post-exploitation. Due to the in-depth RBAC defense of Exchange (the ProtocolType in /Autodiscover is different from /Ecp), the unprivileged operation used in ProxyLogon which generates an ECP session is forbidden. So, we have to discover a new approach to exploit it. Here we focus on the feature called Exchange PowerShell Remoting!

Exchange PowerShell Remoting is a feature that lets users send mail, read mail, and even update the configuration from the command line. Exchange PowerShell Remoting is built upon WS-Management and implements numerous Cmdlets for automation. However, the authentication and authorization parts are still based on the original CAS architecture.

It should be noted that although we can access the backend of Exchange PowerShell, we still can’t interact with it correctly because there is no valid mailbox for the User NT AUTHORITY\SYSTEM. We also can’t inject the X-CommonAccessToken header to forge our identity to impersonate a different user.

So what can we do? We thoroughly examined the implementation of the Exchange PowerShell backend and found an interesting piece of code that can be used to specify the user identity via the URL.

Configuration\RemotePowershellBackendCmdletProxyModule.cs

From the code snippet, when the PowerShell backend can’t find the X-CommonAccessToken header in the current request, it will try to deserialize and restore the user identity from the parameter X-Rps-CAT of the query string. It looks like the code snippet is designed for internal Exchange PowerShell intercommunication. However, because we can access the backend directly and specify an arbitrary value in X-Rps-CAT, we have the ability to impersonate any user. We leverage this to “downgrade” ourselves from the SYSTEM account, which has no mailbox, to Exchange Admin.

And now we can execute arbitrary Exchange PowerShell commands as Exchange Admin!

CVE-2021-31207 - Post-auth Arbitrary-File-Write

The last part of the exploit chain is to find a post-auth RCE technique using Exchange PowerShell commands. It’s not difficult because we are the admin and there are hundreds of commands that could be leveraged. Here we found the command New-MailboxExportRequest, which exports a user’s mailbox to a specified path.

  New-MailboxExportRequest -Mailbox [email protected] -FilePath
  \\127.0.0.1\C$\path\to\shell.aspx

This command is useful to us, since it lets us create a file at an arbitrary path. To make things better, the exported file is a mailbox that stores the user’s mails, so we can deliver our malicious payload through SMTP. But the only problem is - it seems like the mail content is not stored in plaintext format because we can’t find our payload in the exported file :(

We found the output is in Outlook Personal Folders (PST) format. By reading the official documentation from Microsoft, we learned that the PST just uses a simple Permutative Encoding (NDB_CRYPT_PERMUTE) to encode our payload. So we can encode the payload before sending it out, and when the server tries to save and encode our payload, it turns it into the original malicious code.

The Exploit

Let’s chain everything together!

Step 1 - Malicious payload delivery

We first deliver our encoded web shell to the targeted mailbox through SMTP. If the target mail server doesn’t support sending mail from an unauthorized user, Gmail can be also used as an alternative way to deliver the malicious payload externally.

Step 2 - PowerShell session establishment

Because PowerShell is based on the WinRM protocol, and it’s not easy to implement a universal WinRM client, we use a proxy server to hijack the PowerShell connection and modify the traffic. We first rewrite the URL to the path of EwsAutodiscoverProxyRequestHandler, which will trigger the path confusion bug and let us access the PowerShell backend. Then we insert the parameter X-Rps-CAT into the query string to impersonate any user. Here we specify the SID of Exchange Admin to become the admin!

Step 3 - Malicious PowerShell command execution

In the established PowerShell session, we execute the following PowerShell commands:

  1. New-ManagementRoleAssignment to grant ourselves the Mailbox Import Export role
  2. New-MailboxExportRequest to export the mailbox containing our malicious payload to webroot, to act as our web shell

Additional notes

After ProxyLogon, Windows Defender started blocking dangerous behaviors under the webroot of Exchange Server. To spawn a shell at Pwn2Own successfully, we spent a little time bypassing Defender. We found that Defender blocks us if we call cmd.exe directly. However, if we first copy cmd.exe to a <random>.exe under webroot via Scripting.FileSystemObject and then execute it, it works and Defender is silent :P

The other side note is that if the organization is using the Exchange Server cluster, sometimes an InvalidShellID exception occurs. The reason for this problem is that you are dealing with a load balancer, so a bit of luck is needed. Try to catch the exception and send the request again to solve that ;)

The last step is to enjoy your shell! Here is a demonstration video:

The Patch

Microsoft fixed all 3 of the ProxyShell vulnerabilities via patches released in April and May, but it announced the patches and assigned the CVEs three months after. The reason Microsoft gave is that:

Msft.png

Regarding the patch of CVE-2021-31207, Microsoft didn’t fix the arbitrary file write but used a allowlist to limit the file extension to .pst, .eml, or .ost.

As for the vulnerabilities which were patched in April but assigned CVEs in July, Exchange now checks the value of IsAuthenticated to ensure all frontend requests are authenticated before generating the Kerberos ticket to access the backend.

upload_7c2d5577bfa74e8024f562bc3154f40c.png

Conclusion

Although the April patch mitigated the authentication part of this new attack surface, the CAS is still a good place for security researchers to hunt for bugs. In fact, we have uncovered a few additional bugs after the April patch. In conclusion, Exchange Server is a treasure waiting for you to find bugs. As we mentioned in our previous article, even in 2020, a hard-coded cryptography key could still be found in Exchange Server. I can assure you that Microsoft will fix more Exchange vulnerabilities in the future.

For the system administrators, since it’s an architecture problem, it’s hard to mitigate the attack surface with one single action. All you can do is keep your Exchange Server up-to-date and limit its external Internet exposure. At the very least, please apply the April Cumulative Update to prevent most of these pre-auth bugs!


Thanks again to Orange for providing this detailed analysis of his research. He has contributed many bugs to the ZDI program over the last couple of years, and we certainly hope to see more submissions from him in the future. Until then, follow the team for the latest in exploit techniques and security patches.

From Pwn2Own 2021: A New Attack Surface on Microsoft Exchange - ProxyShell!

CVE-2021-28632 & CVE-2021-39840: Bypassing Locks in Adobe Reader

21 October 2021 at 16:12

Over the past few months, Adobe has patched several remote code execution bugs in Adobe Acrobat and Reader that were reported by researcher Mark Vincent Yason (@MarkYason) through our program. Two of these bugs, in particular, CVE-2021-28632 and CVE-2021-39840, are related Use-After-Free bugs even though they were patched months apart. Mark has graciously provided this detailed write-up of these vulnerabilities and their root cause.


This blog post describes two Adobe Reader use-after-free vulnerabilities that I submitted to ZDI: One from the June 2021 patch (CVE-2021-28632) and one from the September 2021 patch (CVE-2021-39840). An interesting aspect about these two bugs is that they are related – the first bug was discovered via fuzzing and the second bug was discovered by reverse engineering and then bypassing the patch for the first bug.

CVE-2021-28632: Understanding Field Locks

One early morning while doing my routine crash analysis, one Adobe Reader crash caught my attention:

After a couple of hours minimizing and cleaning up the fuzzer-generated PDF file, the resulting simplified proof-of-concept (PoC) was as follows:

PDF portion (important parts only):

JavaScript portion:

The crash involved a use-after-free of CPDField objects. CPDField objects are internal AcroForm.api C++ objects that represent text fields, button fields, etc. in interactive forms.

In the PDF portion above, two CPDField objects are created to represent the two text fields named fieldParent and fieldChild. Note that the created objects have the type CTextField, a subclass of CPDField, which is used for text fields. To simplify the discussion, they will be referred to as CPDField objects.

An important component for triggering the bug is that fieldChild should be a descendant of fieldParent by specifying it in the /Kids key of the fieldParent PDF object dictionary (see [A] above) as documented in the PDF file format specification:

img01.jpg

Another important concept relating to the bug is that to prevent a CPDField object from being freed while it is in use, an internal property named LockFieldProp is used. Internal properties of CPDField objects are stored via a C++ map member variable.

If LockFieldProp is not zero, it means that the CPDField object is locked and can't be freed; if it is zero or is not set, it means that the CPDField object is unlocked and can be freed. Below is the visual representation of the two CPDField objects in the PoC before the field locking code (discussed later) is called: fieldParent is unlocked (LockFieldProp is 0) and is in green, and fieldChild is also unlocked (LockFieldProp is not set) and is also in green:

img02.jpg

On the JavaScript portion of the PoC, the code sets up a JavaScript callback so that when the “Format” event is triggered for fieldParent, a custom JavaScript function callback() will be executed [2]. The JavaScript code then triggers a “Format” event by setting the textSize property of fieldParent [3]. Internally, this executes the textSize property setter of JavaScript Field objects in AcroForm.api.

One of the first actions of the textSize property setter in AcroForm.api is to call the following field locking code against fieldParent:

The above code locks the CPDField object passed to it by setting its LockFieldProp property to 1 [AA].

After executing the field locking code, the lock state of fieldParent (locked: in red) and fieldChild (unlocked: in green) are as follows:

img03.jpg

Note that in the later versions of Adobe Reader, the value of LockFieldProp is a pointer to a counter instead of being set with the value 1 or 0.

Next, the textSize property setter in AcroForm.api calls the following recursive CPDField method where the use-after-free occurs:

On the first call to the above method, the this pointer points to the locked fieldParent CPDField object. Because it has no associated widget [aa], the method performs a recursive call [cc] with the this pointer pointing to each of fieldParent's children [bb].

Therefore, on the second call to the above method, the this pointer points to the fieldChild CPDField object, and since it has an associated widget (see [B] in the PDF portion of the PoC), a notification will be triggered [dd] that results in the custom JavaScript callback() function to be executed. As shown in the previous illustration, the locking code only locked fieldParent while fieldChild is left unlocked. Because fieldChild is unlocked, the removeField("fieldChild") call in the custom JavaScript callback() function (see [1] in the JavaScript portion of the PoC) succeeds in freeing the fieldChild CPDField object. This leads to the this pointer in the recursive method to become a dangling pointer after the call in [dd]. The dangling this pointer is later dereferenced resulting in the crash.

This first vulnerability was patched in June 2021 by Adobe and assigned CVE-2021-28632.

CVE-2021-39840: Reversing Patch and Bypassing Locks

I was curious to see how Adobe patched CVE-2021-28632, so after the patch was released, I decided to look at the updated AcroForm.api.

Upon reversing the updated field locking code, I noticed an addition of a call to a method that locks the passed field’s immediate descendants:

With the added code, both fieldParent and fieldChild will be locked and the PoC for the first bug will fail in freeing fieldChild:

img04.jpg

While assessing the updated code and thinking, I arrived at a thought: since the locking code only additionally locks the immediate descendants of the field, what if the field has a non-immediate descendant?... a grandchild field! I quickly modified the PoC for CVE-2021-28632 to the following:

PDF portion (important parts only):

JavaScript portion:

And then loaded the updated PoC in Adobe Reader under a debugger, hit go... and crash!

The patch was bypassed, and Adobe Reader crashed at the same location in the previously discussed recursive method where the use-after-free originally occurred.

Upon further analysis, I confirmed that the illustration below was the state of the field locks when the recursive method was called. Notice that fieldGrandChild is unlocked, and therefore, can be freed:

img05.jpg

The recursive CPDField method started with the this pointer pointing to fieldParent, and then called itself with the this pointer pointing to fieldChild, and then called itself again with the this pointer pointing to fieldGrandChild. Since fieldGrandChild has an attached widget, the JavaScript callback() function that frees fieldGrandChild was executed, effectively making the this pointer a dangling pointer.

This second vulnerability was patched in September 2021 by Adobe and assigned CVE-2021-39840.

Controlling Field Objects

Control of the freed CPDField object is straightforward via JavaScript: after the CPDField object is freed via the removeField() call, the JavaScript code can spray the heap with similarly sized data or an object to replace the contents of the freed CPDField object.

When I submitted my reports to ZDI, I included a second PoC that demonstrates full control of the CPDField object and then dereferences a controlled, virtual function table pointer:

Conclusion

Implementation of object trees, particularly those in applications where the objects can be controlled and destroyed arbitrarily, is prone to use-after-free vulnerabilities. For developers, special attention must be made to the implementation of object reference tracking and object locking. For vulnerability researchers, they represent opportunities for uncovering interesting vulnerabilities.


Thanks again to Mark for providing this thorough write-up. He has contributed many bugs to the ZDI program over the last few years, and we certainly hope to see more submissions from him in the future. Until then, follow the team for the latest in exploit techniques and security patches.

CVE-2021-28632 & CVE-2021-39840: Bypassing Locks in Adobe Reader

CVE-2022-23088: Exploiting a Heap Overflow in the FreeBSD Wi-Fi Stack

16 June 2022 at 16:38

In April of this year, FreeBSD patched a 13-year-old heap overflow in the Wi-Fi stack that could allow network-adjacent attackers to execute arbitrary code on affected installations of FreeBSD Kernel. This bug was originally reported to the ZDI program by a researcher known as m00nbsd and patched in April 2022 as FreeBSD-SA-22:07.wifi_meshid. The researcher has graciously provided this detailed write-up of the vulnerability and a proof-of-concept exploit demonstrating the bug.


Our goal is to achieve kernel remote code execution on a target FreeBSD system using a heap overflow vulnerability in the Wi-Fi stack of the FreeBSD kernel. This vulnerability has been assigned CVE-2022-23088 and affects all FreeBSD versions since 2009, along with many FreeBSD derivatives such as pfSense and OPNsense. It was patched by the FreeBSD project in April 2022.

The Vulnerability

When a system is scanning for available Wi-Fi networks, it listens for management frames that are emitted by Wi-Fi access points in its vicinity. There are several types of management frames, but we are interested in only one: the beacon management subtype. A beacon frame has the following format:

In FreeBSD, beacon frames are parsed in ieee80211_parse_beacon(). This function iterates over the sequence of options in the frame and keeps pointers to the options it will use later.

One option, IEEE80211_ELEMID_MESHID, is particularly interesting:

There is no sanity check performed on the option length (frm[1]). Later, in function sta_add(), there is a memcpy that takes this option length as size argument and a fixed-size buffer as destination:

Due to the lack of a sanity check on the option length, there is an ideal buffer overflow condition here: the attacker can control both the size of the overflow (sp->meshid[1]) as well as the contents that will be written (sp->meshid).

Therefore, when a FreeBSD system is scanning for available networks, an attacker can trigger a kernel heap overflow by sending a beacon frame that has an oversized IEEE80211_ELEMID_MESHID option.

Building a “Write-What-Where” Primitive

Let’s look at ise->se_meshid, the buffer that is overflown during the memcpy. It is defined in the ieee80211_scan_entry structure:

Here, the overflow of se_meshid allows us to overwrite the se_ies field that follows. The se_ies field is of type struct ieee80211_ies, defined as follows:

We will be interested in overwriting the last two fields: data and len.

Back in sta_add(), not long after the memcpy, there is a function call that uses the se_ies field:

This ieee80211_ies_init() function is defined as:

The data parameter here points to the beginning of the beacon options in the frame, and len is the full length of all the beacon options. In other words, the (data,len) couple describes the options buffer that is part of the frame that the attacker sent:

Figure 1 - The Options Buffer

As noted in the code snippet, the ies structure is fully attacker-controlled thanks to the buffer overflow.

Therefore, at $1:

       -- We can control len, given that this is the full size of the options buffer, and we can decide that size by just adding or removing options to our frame.
       -- We can control the contents of data, given that this is the contents of the options buffer.
       -- We can control the ies->data pointer, by overflowing into it.

We thus have a near-perfect write-what-where primitive that allows us to write almost whatever data we want at whatever kernel memory address we want just by sending one beacon frame that has an oversized MeshId option.

Constraints on the Primitive

A few constraints apply to the primitive:

  1. The FreeBSD kernel expects there to be SSID and Rates options in the frame. That means that there are 2x2=4 bytes of the options buffer that we cannot control. In addition, our oversized MeshId option has a 2-byte header, so that makes 6 bytes we cannot control. For convenience and simplicity, we will place these bytes at the beginning of the options buffer.

  2. Our oversized MeshId option must be large enough to overwrite up to the ies->data and ies->len fields, but not more. This equates to a length of 182 bytes for the MeshId option, plus its 2-byte header. To accommodate the MeshId option plus the two options mentioned above, we will use an options buffer of size 188 bytes.

  3. The ies->data and ies->len fields we overwrite are respectively the last 8 and 4 bytes within the options buffer, so they too are constrained.

  4. The ies->len field must be equal to len for the branch at $0 not to be taken.

The big picture of what the frame looks like given the aforementioned constraints:

Figure 2 - The Big Picture

Maintaining stability of the target

Let’s say we send a beacon frame that uses the primitive to overwrite an area of kernel memory. There is going to be a problem at some point when the kernel subsequently tries to free ies->data. This is because ies->data is supposed to point to a malloc-allocated buffer, which may no longer be true after our overwrite.

To maintain stability and avoid crashing the target, we can send a second, corrective beacon frame that overflows ies->data to be NULL and ies->len to be 0. After that, when the kernel attempts to free ies->data, it will see that the pointer is NULL, and won’t do anything as a result. This allows us to maintain the stability of the target and ensure our primitive doesn’t crash it.

Choosing what to write, and where to write it

Now we have a nice write-what-where primitive that we can invoke with just one beacon frame plus one corrective frame. What exactly can we do with it now?

A first thought might be to use the primitive to overwrite the instructions that the kernel executes in memory. (Un)fortunately, in FreeBSD, the kernel text segment is not writable, so we cannot use our primitive to directly overwrite kernel instructions. Furthermore, the kernel implements W^X, so no writable pages are executable.

However, the page tables that map executable pages are writable.

Injecting an implant

Here we are going to inject an implant into the target's kernel that will process "commands" that we send to it later on via subsequent Wi-Fi frames — a full kernel backdoor, if you will.

The process of injecting this implant will take four beacon frames.

This is a bit of a bumpy technical ride, so fasten your seatbelt.

Frame 1: Injecting the payload

Background: the direct map is a special kernel memory region that contiguously maps the entire physical memory of the system as writable pages, except for the physical pages of read-only text segments.

We use the primitive to write data at the physical address 0x1000 using the direct map:

Figure 3 - Frame 1

0x1000 was chosen as a convenient physical address because it is unused. The data we write is as follows:

Figure 4 - Frame 1

The implant shellcode area contains the instructions of our implant. The rest of the fields are explained below.

Frame 2: Overwriting an L3 PTE

Quick Background: x64 CPUs use page tables to map virtual addresses to physical RAM pages, in a 4-level hierarchy that goes from L4 (root) to L0 (leaf). Refer to the official AMD and Intel specifications for more details.

In this step, we use the primitive to overwrite an L3 page table entry (PTE) and make it point to the L2 PTE that we wrote at physical address 0x1000 as part of Frame 1. We crafted that L2 PTE precisely to point to our three L1 PTEs, which themselves point to two different areas: (1) two physical pages of the kernel text segment, and (2) our implant shellcode.

In other words, by overwriting an L3 PTE, we create a branch in the page tables that maps two pages of kernel text as writable and our shellcode as executable:

Figure 5 - Frame 2

At this point, our shellcode is mapped into the target’s virtual memory space and is ready to be executed. We will see below why we're mapping the two pages of the kernel text segment.

The attentive reader may be concerned about the constraints of the primitive here. Nothing to worry about: the L3 PTE space is mostly unpopulated. We just have to choose an unpopulated range where overwriting 188 bytes will not matter.

Frame 3: Patching the text segment

In order to jump into our newly mapped shellcode, we will patch the beginning of the sta_input() function in the text segment. This function is part of the Wi-Fi stack and gets called each time the kernel receives a Wi-Fi frame, so it seems like the perfect place to invoke our implant.

How can we patch it, given that the kernel text segment is not writable? By mapping as writable the text pages that contain the function. This is what we did in frames 1 and 2, with the first two L1 PTEs, that now give us a writable view of the instruction bytes of sta_input():

Figure 6 - Our writable view of sta_input()

Back on topic, we use the primitive to patch the first bytes of sta_input() via the writable view we created in frames 1 and 2, and replace these bytes with:

This simply calls our shellcode. However, the constraints of our primitive come into play:

  1. The first constraint of the primitive is that the first 6 bytes that we overwrite cannot be controlled. Overwriting memory at sta_input would not be great, as it would put 6 bytes of garbage at the beginning of the function, causing the target to crash.

However, if we look at the instruction dump of sta_input, we can see that it actually has a convenient layout:

What we can do here is overwrite memory at sta_input-6. The 6 bytes of garbage will just overwrite the unreachable NOPs of the previous sta_newstate() function, with no effect on execution.

With this trick, we don’t have to worry about these first 6 bytes, as they are effectively discarded.

  1. The second constraint of the primitive is that we are forced to overwrite exactly 182 bytes, so we cannot overwrite the first few bytes only. This is not really a problem. We can just fill the rest of the bytes with the same instruction bytes that are there in memory already.

  2. The third constraint of the primitive is that the last 12 bytes that we write are the ies->data and ies->len fields, and these don’t disassemble to valid instructions. That’s a problem because these bytes are within sta_input(). If the kernel tries to execute these bytes, it won’t be long before it crashes. To work around this, we must have corrective code in our implant. When called for the first time, our implant must correct the last 12 bytes of garbage that we wrote into sta_input(). Although slightly annoying, this is not complicated to implement.

With all that established, we’re all set. Our implant will now execute each time sta_input() is called, meaning, each time a Wi-Fi frame is received by the target!

Frame 4: Corrective frame

Finally, to maintain the stability of the target, we send a final beacon frame where we overflow ies->data to be NULL and ies->len to be 0.

This ensures that the target won’t crash.

Subsequent communication channel

With the aforementioned four frames, we have a recipe to reliably inject an implant into the target’s kernel. The implant gets called in the same context as sta_input():

Notably, the second argument, m, is the memory region containing the buffer of the frame that the kernel is currently processing. The implant can therefore inspect that buffer (via %rsi) and act depending on its contents.

In other words, we have a communication channel with the implant. We can therefore code our implant as a server backdoor and send commands to it via this channel.

Full steps

Let’s recap:

  1. A heap buffer overflow vulnerability exists in the FreeBSD kernel. An attacker can trigger this bug by sending a beacon frame with an oversized MeshId option.
  2. Using that vulnerability, we can implement a write-what-where primitive that allows us to perform one kernel memory overwrite per beacon frame we send.
  3. We can use that primitive three times to inject an implant into the target’s kernel.
  4. A fourth beacon frame is used to clean up ies->data and ies->len, to prevent a crash.
  5. Finally, our implant runs in the kernel of the target and acts as a full backdoor that can process subsequent Wi-Fi frames we send to it.

The Exploit

A full exploit can be found here. It injects a simple implant that calls printf() with the strings that the attacker subsequently sends. To use this exploit from a Linux machine that has a Wi-Fi card called wifi0, switch the card to monitor mode with:

Then, build and run the exploit for the desired target. For example, if you wish to target a pfSense 2.5.2 system:

Please exercise caution - this will exploit all vulnerable systems in your vicinity.

Once you have done this, look at the target’s tty0 console. You should see: “Hello there, I’m in the kernel”.


Thanks again to m00nbsd for providing this thorough write-up. They have contributed multiple bugs to the ZDI program over the last few years, and we certainly hope to see more submissions from them in the future. Until then, follow the team for the latest in exploit techniques and security patches.

CVE-2022-23088: Exploiting a Heap Overflow in the FreeBSD Wi-Fi Stack

  • There are no more articles
❌