Normal view

There are new articles available, click to refresh the page.
Before yesterdayThreat Research

FLARE Script Series: Automating Objective-C Code Analysis with Emulation

12 December 2018 at 17:30

This blog post is the next episode in the FireEye Labs Advanced Reverse Engineering (FLARE) team Script Series. Today, we are sharing a new IDAPython library – flare-emu – powered by IDA Pro and the Unicorn emulation framework that provides scriptable emulation features for the x86, x86_64, ARM, and ARM64 architectures to reverse engineers. Along with this library, we are also sharing an Objective-C code analysis IDAPython script that uses it. Read on to learn some creative ways that emulation can help solve your code analysis problems and how to use our new IDAPython library to save you lots of time in the process.

Why Emulation?

If you haven’t employed emulation as a means to solve a code analysis problem, then you are missing out! I will highlight some of its benefits and a few use cases in order to give you an idea of how powerful it can be. Emulation is flexible, and many emulation frameworks available today, including Unicorn, are cross-platform. With emulation, you choose which code to emulate and you control the context under which it is executed. Because the emulated code cannot access the system services of the operating system under which it is running, there is little risk of it causing damage. All of these benefits make emulation a great option for ad-hoc experimentation, problem solving, or automation.

Use Cases

  • Decoding/Decryption/Deobfuscation/Decompress – Often during malicious code analysis you will come across a function used to decode, decompress, decrypt, or deobfuscate some useful data such as strings, configuration data, or another payload. If it is a common algorithm, you may be able to identify it by sight or with a plug-in such as signsrch. Unfortunately, this is not often the case. You are then left to either opening up a debugger and instrumenting the sample to decode it for you, or transposing the function by hand into whatever programming language fits your needs at the time. These options can be time consuming and problematic depending on the complexity of the code and the sample you are analyzing. Here, emulation can often provide a preferable third option. Writing a script that emulates the function for you is akin to having the function available to you as if you wrote it or are calling it from a library. This allows you to reuse the function as many times as it’s needed, with varying inputs, without having to open a debugger. This case also applies to self-decrypting shellcode, where you can have the code decrypt itself for you.
  • Data Tracking – With emulation, you have the power to stop and inspect the emulation context at any time using an instruction hook. Pairing a disassembler with an emulator allows you to pause emulation at key instructions and inspect the contents of registers and memory. This allows you to keep tabs on interesting data as it flows through a function. This can have several applications. As previously covered in other blogs in the FLARE script series, Automating Function Argument Extraction and Automating Obfuscated String Decoding, this technique can be used to track the arguments passed to a given function throughout an entire program. Function argument tracking is one of the techniques employed by the Objective-C code analysis tool introduced later in this post. The data tracking technique could also be employed to track the this pointer in C++ code in order to markup object member references, or the return values from calls to GetProcAddress/dlsym in order to rename the variables they are stored in appropriately. There are many possibilities.

Introducing flare-emu

The FLARE team is introducing an IDAPython library, flare-emu, that marries IDA Pro’s binary analysis capabilities with Unicorn’s emulation framework to provide the user with an easy to use and flexible interface for scripting emulation tasks. flare-emu is designed to handle all the housekeeping of setting up a flexible and robust emulator for its supported architectures so that you can focus on solving your code analysis problems. It currently provides three different interfaces to serve your emulation needs, along with a slew of related helper and utility functions.

  1. emulateRange – This API is used to emulate a range of instructions, or a function, within a user-specified context. It provides options for user-defined hooks for both individual instructions and for when “call” instructions are encountered. The user can decide whether the emulator will skip over, or call into function calls. Figure 1 shows emulateRange used with both an instruction and call hook to track the return value of GetProcAddress calls and rename global variables to the name of the Windows APIs they will be pointing to. In this example, it was only set to emulate from 0x401514 to 0x40153D.  This interface provides an easy way for the user to specify values for given registers and stack arguments. If a bytestring is specified, it is written to the emulator’s memory and the pointer is written to the register or stack variable. After emulation, the user can make use of flare-emu’s utility functions to read data from the emulated memory or registers, or use the Unicorn emulation object that is returned for direct probing in case flare-emu does not expose some functionality you require.

    A small wrapper function for emulateRange, named emulateSelection, can be used to emulate the range of instructions currently highlighted in IDA Pro.


    Figure 1: emulateRange being used to track the return value of GetProcAddress

  2. iterate – This API is used to force emulation down specific branches within a function in order to reach a given target. The user can specify a list of target addresses, or the address of a function from which a list of cross-references to the function is used as the targets, along with a callback for when a target is reached. The targets will be reached, regardless of conditions during emulation that may have caused different branches to be taken. Figure 2 illustrates a set of code branches that iterate has forced to be taken in order to reach its target; the flags set by the cmp instructions are irrelevant.  Like the emulateRange API, options for user-defined hooks for both individual instructions and for when “call” instructions are encountered are provided. An example use of the iterate API is for the function argument tracking technique mentioned earlier in this post.


    Figure 2: A path of emulation determined by the iterate API in order to reach the target address

  3. emulateBytes – This API provides a way to simply emulate a blob of extraneous shellcode. The provided bytes are not added to the IDB and are simply emulated as is. This can be useful for preparing the emulation environment. For example, flare-emu itself uses this API to manipulate a Model Specific Register (MSR) for the ARM64 CPU that is not exposed by Unicorn in order to enable Vector Floating Point (VFP) instructions and register access. Figure 3 shows the code snippet that achieves this. Like with emulateRange, the Unicorn emulation object is returned for further probing by the user in case flare-emu does not expose some functionality required by the user.


    Figure 3: flare-emu using emulateBytes to enable VFP for ARM64

API Hooking

As previously stated, flare-emu is designed to make it easy for you to use emulation to solve your code analysis needs. One of the pains of emulation is in dealing with calls into library functions. While flare-emu gives you the option to simply skip over call instructions, or define your own hooks for dealing with specific functions within your call hook routine, it also comes with predefined hooks for over 80 functions! These functions include many of the common C runtime functions for string and memory manipulation that you will encounter, as well as some of their Windows API counterparts.

Examples

Figure 4 shows a few blocks of code that call a function that takes a timestamp value and converts it to a string. Figure 5 shows a simple script that uses flare-emu’s iterate API to print the arguments passed to this function for each place it is called. The script also emulates a simple XOR decode function and prints the resulting, decoded string. Figure 6 shows the resulting output of the script.


Figure 4: Calls to a timestamp conversion function


Figure 5: Simple example of flare-emu usage


Figure 6: Output of script shown in Figure 5

Here is a sample script that uses flare-emu to track return values of GetProcAddress and rename the variables they are stored in accordingly. Check out our README for more examples and help with flare-emu.

Introducing objc2_analyzer

Last year, I wrote a blog post to introduce you to reverse engineering Cocoa applications for macOS. That post included a short primer on how Objective-C methods are called under the hood, and how this adversely affects cross-references in IDA Pro and other disassemblers. An IDAPython script named objc2_xrefs_helper was also introduced in the post to help fix these cross-references issues. If you have not read that blog post, I recommend reading it before continuing on reading this post as it provides some context for what makes objc2_analyzer particularly useful. A major shortcoming of objc2_xrefs_helper was that if a selector name was ambiguous, meaning that two or more classes implement a method with the same name, the script was unable to determine which class the referenced selector belonged to at any given location in the binary and had to ignore such cases when fixing cross-references.

Now, with emulation support, this is no longer the case. objc2_analyzer uses the iterate API from flare-emu along with instruction and call hooks that perform Objective-C disassembly analysis in order to determine the id and selector being passed for every call to objc_msgSend variants in a binary. As an added bonus, it can also catch calls made to objc_msgSend variants when the function pointer is stored in a register, which is a very common pattern in Clang (the compiler used by modern versions of Xcode). IDA Pro tries to catch these itself and does a pretty good job, but it doesn’t catch them all. In addition to x86_64, support was also added for the ARM and ARM64 architectures in order to support reverse engineering iOS applications. This script supersedes the older objc2_xrefs_helper script, which has been removed from our repo. And, since the script can perform such data tracking in Objective-C code by using emulation, it can also determine whether an id is a class instance or a class object itself. Additional support has been added to track ivars being passed as ids as well. With all this information, Objective-C-style pseudocode comments are added to each call to objc_msgSend variants that represent the method call being made at each location. An example of the script’s capability is shown in Figure 7 and Figure 8.


Figure 7: Objective-C IDB snippet before running objc2_analyzer


Figure 8: Objective-C IDB snippet after running objc2_analyzer

Observe the instructions referencing selectors have been patched to instead reference the implementation function itself, for easy transition. The comments added to each call make analysis much easier. Cross-references from the implementation functions are also created to point back to the objc_msgSend calls that reference them as shown in Figure 9.


Figure 9: Cross-references added to IDB for implementation function

It should be noted that every release of IDA Pro starting with 7.0 have brought improvements to Objective-C code analysis and handling. However, at the time of writing, the latest version of IDA Pro being 7.2, there are still shortcomings that are mitigated using this tool as well as the immensely helpful comments that are added. objc2_analyzer is available, along with our other IDA Pro plugins and scripts, at our GitHub page.

Conclusion

flare-emu is a flexible tool to include in your arsenal that can be applied to a variety of code analysis problems. Several example problems were presented and solved using it in this blog post, but this is just a glimpse of its possible applications. If you haven’t given emulation a try for solving your code analysis problems, we hope you will now consider it an option. And for all, we hope you find value in using these new tools!

IDA, I Think It’s Time You And I Had a Talk: Controlling IDA Pro With Voice Control Software

3 October 2019 at 17:00

Introduction

This blog post is the next episode in the FireEye Labs Advanced Reverse Engineering (FLARE) team Script Series. Today, we are sharing something quite unusual. It is not a tool or a virtual machine distribution, nor is it a plugin or script for a popular reverse engineering tool or framework. Rather, it is a profile created for a consumer software application completely unrelated to reverse engineering or malware analysis… until now. The software is named VoiceAttack, and its purpose is to make it easy for users to control other software on their computer using voice commands. With FLARE’s new profile for VoiceAttack, users can completely control IDA Pro with their voice! Have you ever dreamed of telling IDA Pro to decompile a function or show you the strings of a binary? Well dream no more! Not only does our profile give you total control of the software, it also provides shortcuts and other cool features not previously available. It’s our hope that providing voice control for the world’s most popular disassembler will further empower users with repetitive stress injuries or disabilities to more effectively put their reverse engineering skills to use with this new accessibility option as well as helping the community at large work more efficiently.

Check out our video demonstration of some of the features of the profile to see it in action.

How Does It Work?

Voice attack is an inexpensive software application that utilizes the Windows Speech Recognition (WSR) feature to enable the creation of user-defined, voice-activated macros. The user specifies a key word or phrase, then defines one or more actions to be taken when that word or phrase is recognized. The most common types of actions to be taken include key presses, mouse movement and clicks, and clipboard manipulation. However, there are many other more advanced features available that provide a lot of flexibility to users including variables, loops, and conditionals. You can even have the computer speak to you in response to your commands! VoiceAttack requires an internet connection, but only during the registration process, after which the network adapter can be disabled or configured to a network that cannot reach the internet without issue.

To use VoiceAttack, you must first train Windows Speech Recognition to recognize your voice. Instructions on how to do so can be found here. This process only takes a few minutes at minimum, but the more time you spend training, the better the experience you will have with it.

What Does the IDA Pro Profile Provide?

FLARE’s IDA Pro profile for VoiceAttack maps every advertised keyboard shortcut in IDA Pro to a voice command. Although this is only one part of what the profile provides, many users will find this in itself very useful. When developing this profile, I was shocked to discover just how many keyboard shortcuts there really are for IDA Pro and what can be accomplished with them. Some of my favorite shortcuts are found under the View->Open Subviews and Windows menus. With this profile, I can simply say “show strings” or “show structures” or “show window x” to change the tab I am currently viewing or open a new view in a tab without having to move my mouse cursor anywhere. The next few paragraphs describe some other useful commands to make any reverse engineer’s job easier. For a more detailed description of the profile and commands available, see the Github page.

Macros

A series of voice commands can perform multi-step actions not otherwise reachable by individual keyboard shortcuts. For example, wouldn’t it be nice to have commands to toggle the visibility of opcode bytes (see Figure 1)? Currently, you have to open the Options menu, select the General menu item, input a value in the Number of opcode bytes text field, and click the OK button. Well, now you can simply say “show opcodes” or “hide opcodes” and it will be so!


Figure 1: Configuring the number of opcode bytes to show in IDA Pro's disassembly view

Defining a Unicode string in IDA Pro is a multi-step exercise, whether you navigate to the Edit->Strings menu or use the “string literals” keyboard shortcut Alt+A followed by pressing the U key as shown in Figure 2. Now you can simply say “make Unicode string” and the work is done for you.


Figure 2: String literals dialog in IDA Pro

Reversing a C++ application? The Create struct from selection action is a very helpful feature in this case, but it requires you to navigate to the Edit->Structs menu in order to use it. The voice command “create struct from selection” does this for you automatically. The “look it up” command will copy the currently highlighted token in the disassembly and search Google for it using your default browser. There are several other macros in the profile that are like this and save you a lot of time navigating menus and dialogs to perform simple actions.

Cursor Movement, Dialogs, and Navigation

The cursor movement commands allow the user to move the cursor up, down, left, or right, one or more times, in specified increments. These commands also allow for scrolling with a voice command that commences scrolling in a chosen direction, and another voice command for stopping scrolling. There are even voice commands to set the speed of the scroll to slow, medium, or fast. In the disassembly view, the cursor can also be moved per “word” on the current line of the disassembly or decompilation, or even per basic block or function.

Like many other applications, dialogs are a part of IDA Pro’s user interface. The ability to easily navigate and interact with items in a dialog with your voice is essential to a smooth user experience. Voice commands in the profile enable the user to easily click the OK or Cancel buttons, toggle checkboxes, and tab through controls in the dialog in both directions and in specified increments.

With the aid of a companion IDAPython plugin, additional navigation commands are supported. Commands that allow the user to move the cursor to the beginning or end of the current function, to the next or previous “call” instruction, to the previous or next instruction containing the highlighted token, or to a specified number of bytes forward or backwards from the current cursor position help to make voice-controlled navigation easier.

These cursor movement and navigation commands enable users to have full control of IDA Pro without the use of their hands. While this is true and an important goal for the profile, it is not practical for people who have full use of their hands to go completely hands-free. The commands that navigate the cursor in IDA Pro will never be as fast or easy as simply using the mouse to point and click somewhere on the screen. In any case, users will find themselves building up a collection of voice commands they prefer to use that will depend on personal tastes. However, enabling full voice control allows reverse engineers who do not have full use of their hands to still effectively operate IDA Pro, which we hope will be of great use to the community. Having such a capability is also useful for those who suffer from repetitive strain injury.

Input Recognition

The commands described so far give you control over IDA Pro with your voice, but there is still the matter of providing textual input for items such as function and variable names, comments, and other text input fields. VoiceAttack does provide the ability in macros to enable and disable what is called “Dictation Mode”. When in Dictation Mode, any recognized words are added to a buffer of text until Dictation Mode is disabled. Then this text can be used elsewhere in the macro. Unfortunately, this feature is not designed to recognize the kinds of technical terms one would be using in the context of reverse engineering programs. Even if it were, there is still the issue of having to format the text to be a valid function or variable name. Instead of wrestling with this feature to try to make it work for this purpose, a very large and growing collection of “input recognition” commands was created. These commands are designed to recognize common words used in the names of functions and variables, as well as full function names as found in the C runtime libraries and the Windows APIs. Once recognized, the word or function name is copied to the user’s clipboard and pasted into the text field. To avoid the inadvertent triggering of such commands during the regular operation of IDA Pro, these commands are only active when the “input mode” is enabled. This mode is enabled automatically when certain commands are activated such as “rename” or “find”, and automatically disabled when dialog commands such as “OK” or “cancel” are activated. The input mode can also be manually manipulated with the “input mode on” and “input mode off” commands.

Conclusion

Today, the FLARE team is releasing a profile for VoiceAttack and a companion IDAPython plugin that enables full voice control of IDA Pro along with many added convenience features. The profile contains over 1000 defined commands and growing. It is easy to view, edit, and add commands to this profile to customize it to suit your needs or to improve it for the community at large. The VoiceAttack software is highly affordable and enables you to create profiles for any applications or games that you use. For installation instructions and usage information, see the project’s Github page. Give it a try today!

Using Speakeasy Emulation Framework Programmatically to Unpack Malware

1 December 2020 at 20:30

Andrew Davis recently announced the public release of his new Windows emulation framework named Speakeasy. While the introductory blog post focused on using Speakeasy as an automated malware sandbox of sorts, this entry will highlight another powerful use of the framework: automated malware unpacking. I will demonstrate, with code examples, how Speakeasy can be used programmatically to:

  • Bypass unsupported Windows APIs to continue emulation and unpacking
  • Save virtual addresses of dynamically allocated code using API hooks
  • Surgically direct execution to key areas of code using code hooks
  • Dump an unpacked PE from emulator memory and fix its section headers
  • Aid in reconstruction of import tables by querying Speakeasy for symbolic information

Initial Setup

One approach to interfacing with Speakeasy is to create a subclass of Speakeasy’s Speakeasy class. Figure 1 shows a Python code snippet that sets up such a class that will be expanded in upcoming examples.

import speakeasy

class MyUnpacker(speakeasy.Speakeasy):
    def __init__(self, config=None):
        super(MyUnpacker, self).__init__(config=config)

Figure 1: Creating a Speakeasy subclass

The code in Figure 1 accepts a Speakeasy configuration dictionary that may be used to override the default configuration. Speakeasy ships with several configuration files. The Speakeasy class is a wrapper class for an underlying emulator class. The emulator class is chosen automatically when a binary is loaded based on its PE headers or is specified as shellcode. Subclassing Speakeasy makes it easy to access, extend, or modify interfaces. It also facilitates reading and writing stateful data before, during, and after emulation.

Emulating a Binary

Figure 2 shows how to load a binary into the Speakeasy emulator.

self.module = self.load_module(filename)

Figure 2: Loading the binary into the emulator

The load_module function returns a PeFile object for the provided binary on disk. It is an instance of the PeFile class defined in speakeasy/windows/common.py, which is subclassed from pefile’s PE class. Alternatively, you can provide the bytes of a binary using the data parameter rather than specifying a file name. Figure 3 shows how to emulate a loaded binary.

self.run_module(self.module)

Figure 3: Starting emulation

API Hooks

The Speakeasy framework ships with support for hundreds of Windows APIs with more being added frequently. This is accomplished via Python API handlers defined in appropriate files in the speakeasy/winenv/api directory. API hooks can be installed to have your own code executed when particular APIs are called during emulation. They can be installed for any API, regardless of whether a handler exists or not. An API hook can be used to override an existing handler and that handler can optionally be invoked from your hook. The API hooking mechanism in Speakeasy provides flexibility and control over emulation. Let’s examine a few uses of API hooking within the context of emulating unpacking code to retrieve an unpacked payload.

Bypassing Unsupported APIs

When Speakeasy encounters an unsupported Windows API call, it stops emulation and provides the name of the API function that is not supported. If the API function in question is not critical for unpacking the binary, you can add an API hook that simply returns a value that allows execution to continue. For example, a recent sample’s unpacking code contained API calls that had no effect on the unpacking process. One such API call was to GetSysColor. In order to bypass this call and allow execution to continue, an API hook may be added as shown in Figure 4.

self.add_api_hook(self.getsyscolor_hook,
                  'user32',
                  'GetSysColor',
                  argc=1
                  )

Figure 4: Adding an API hook

According to MSDN, this function takes 1 parameter and returns an RGB color value represented as a DWORD. If the calling convention for the API function you are hooking is not stdcall, you can specify the calling convention in the optional call_conv parameter. The calling convention constants are defined in the speakeasy/common/arch.py file. Because the GetSysColor return value does not impact the unpacking process, we can simply return 0. Figure 5 shows the definition of the getsyscolor_hook function specified in Figure 4.

def getsyscolor_hook(self, emu, api_name, func, params):
            return 0

Figure 5: The GetSysColor hook returns 0

If an API function requires more finessed handling, you can implement a more specific and meaningful hook that suits your needs. If your hook implementation is robust enough, you might consider contributing it to the Speakeasy project as an API handler!  

Adding an API Handler

Within the speakeasy/winenv/api directory you'll find usermode and kernelmode subdirectories that contain Python files for corresponding binary modules. These files contain the API handlers for each module. In usermode/kernel32.py, we see a handler defined for SetEnvironmentVariable as shown in Figure 6.

1: @apihook('SetEnvironmentVariable', argc=2)
2: def SetEnvironmentVariable(self, emu, argv, ctx={}):
3:     '''
4:     BOOL SetEnvironmentVariable(
5:         LPCTSTR lpName,
6:         LPCTSTR lpValue
7:         );
8:     '''
9:     lpName, lpValue = argv
10:    cw = self.get_char_width(ctx)
11:    if lpName and lpValue:
12:        name = self.read_mem_string(lpName, cw)
13:        val = self.read_mem_string(lpValue, cw)
14:        argv[0] = name
15:        argv[1] = val
16:        emu.set_env(name, val)
17:    return True

Figure 6: API handler for SetEnvironmentVariable

A handler begins with a function decorator (line 1) that defines the name of the API and the number of parameters it accepts. At the start of a handler, it is good practice to include MSDN's documented prototype as a comment (lines 3-8).

The handler's code begins by storing elements of the argv parameter in variables named after their corresponding API parameters (line 9). The handler's ctx parameter is a dictionary that contains contextual information about the API call. For API functions that end in an ‘A’ or ‘W’ (e.g., CreateFileA), the character width can be retrieved by passing the ctx parameter to the get_char_width function (line 10). This width value can then be passed to calls such as read_mem_string (lines 12 and 13), which reads the emulator’s memory at a given address and returns a string.

It is good practice to overwrite string pointer values in the argv parameter with their corresponding string values (lines 14 and 15). This enables Speakeasy to display string values instead of pointer values in its API logs. To illustrate the impact of updating argv values, examine the Speakeasy output shown in Figure 7. In the VirtualAlloc entry, the symbolic constant string PAGE_EXECUTE_READWRITE replaces the value 0x40. In the GetModuleFileNameA and CreateFileA entries, pointer values are replaced with a file path.

KERNEL32.VirtualAlloc(0x0, 0x2b400, 0x3000, "PAGE_EXECUTE_READWRITE") -> 0x7c000
KERNEL32.GetModuleFileNameA(0x0, "C:\\Windows\\system32\\sample.exe", 0x104) -> 0x58
KERNEL32.CreateFileA("C:\\Windows\\system32\\sample.exe", "GENERIC_READ", 0x1, 0x0, "OPEN_EXISTING", 0x80, 0x0) -> 0x84

Figure 7: Speakeasy API logs

Saving the Unpacked Code Address

Packed samples often use functions such as VirtualAlloc to allocate memory used to store the unpacked sample. An effective approach for capturing the location and size of the unpacked code is to first hook the memory allocation function used by the unpacking stub. Figure 8 shows an example of hooking VirtualAlloc to capture the virtual address and amount of memory being allocated by the API call.

1: def virtualalloc_hook(self, emu, api_name, func, params):
2:     '''
3:     LPVOID VirtualAlloc(
4:        LPVOID lpAddress,
5:        SIZE_T dwSize,
6:        DWORD  flAllocationType,
7:        DWORD  flProtect
8:      );
9:     '''
10:    PAGE_EXECUTE_READWRITE = 0x40
11:    lpAddress, dwSize, flAllocationType, flProtect = params
12:    rv = func(params)
13:    if lpAddress == 0 and flProtect == PAGE_EXECUTE_READWRITE:
14:        self.logger.debug("[*] unpack stub VirtualAlloc call, saving dump info")
15:        self.dump_addr = rv
16:        self.dump_size = dwSize

17:    return rv

Figure 8: VirtualAlloc hook to save memory dump information

The hook in Figure 8 calls Speakeasy’s API handler for VirtualAlloc on line 12 to allow memory to be allocated. The virtual address returned by the API handler is saved to a variable named rv. Since VirtualAlloc may be used to allocate memory not related to the unpacking process, additional checks are used on line 13 to confirm the intercepted VirtualAlloc call is the one used in the unpacking code. Based on prior analysis, we’re looking for a VirtualAlloc call that receives the lpAddress value 0 and the flProtect value PAGE_EXECUTE_READWRITE (0x40). If these arguments are present, the virtual address and specified size are stored on lines 15 and 16 so they may be used to extract the unpacked payload from memory after the unpacking code is finished. Finally, on line 17, the return value from the VirtualAlloc handler is returned by the hook.

Surgical Code Emulation Using API and Code Hooks

Speakeasy is a robust emulation framework; however, you may encounter binaries that have large sections of problematic code. For example, a sample may call many unsupported APIs or simply take far too long to emulate. An example of overcoming both challenges is described in the following scenario.

Unpacking Stubs Hiding in MFC Projects

A popular technique used to disguise malicious payloads involves hiding them inside a large, open-source MFC project. MFC is short for Microsoft Foundation Class, which is a popular library used to build Windows desktop applications. These MFC projects are often arbitrarily chosen from popular Web sites such as Code Project. While the MFC library makes it easy to create desktop applications, MFC applications are difficult to reverse engineer due to their size and complexity. They are particularly difficult to emulate due to their large initialization routine that calls many different Windows APIs. What follows is a description of my experience with writing a Python script using Speakeasy to automate unpacking of a custom packer that hides its unpacking stub within an MFC project.

Reverse engineering the packer revealed the unpacking stub is ultimately called during initialization of the CWinApp object, which occurs after initialization of the C runtime and MFC. After attempting to bypass unsupported APIs, I realized that, even if successful, emulation would take far too long to be practical. I considered skipping over the initialization code completely and jumping straight to the unpacking stub. Unfortunately, execution of the C-runtime initialization code was required in order for emulation of the unpacking stub to succeed.

My solution was to identify a location in the code that fell after the C-runtime initialization but was early in the MFC initialization routine. After examining the Speakeasy API log shown in Figure 9, such a location was easy to spot. The graphics-related API function GetDeviceCaps is invoked early in the MFC initialization routine. This was deduced based on 1) MFC is a graphics-dependent framework and 2) GetDeviceCaps is unlikely to be called during C-runtime initialization.

0x43e0a7: 'kernel32.FlsGetValue(0x0)' -> 0x4150
0x43e0e3: 'kernel32.DecodePointer(0x7049)' -> 0x7048
0x43b16a: 'KERNEL32.HeapSize(0x4130, 0x0, 0x7000)' -> 0x90
0x43e013: 'KERNEL32.TlsGetValue(0x0)' -> 0xfeee0001
0x43e02a: 'KERNEL32.TlsGetValue(0x0)' -> 0xfeee0001
0x43e02c: 'kernel32.FlsGetValue(0x0)' -> 0x4150
0x43e068: 'kernel32.EncodePointer(0x44e215)' -> 0x44e216
0x43e013: 'KERNEL32.TlsGetValue(0x0)' -> 0xfeee0001
0x43e02a: 'KERNEL32.TlsGetValue(0x0)' -> 0xfeee0001
0x43e02c: 'kernel32.FlsGetValue(0x0)' -> 0x4150
0x43e068: 'kernel32.EncodePointer(0x704c)' -> 0x704d
0x43c260: 'KERNEL32.LeaveCriticalSection(0x466f28)' -> None
0x422151: 'USER32.GetSystemMetrics(0xb)' -> 0x1
0x422158: 'USER32.GetSystemMetrics(0xc)' -> 0x1
0x42215f: 'USER32.GetSystemMetrics(0x2)' -> 0x1
0x422169: 'USER32.GetSystemMetrics(0x3)' -> 0x1
0x422184: 'GDI32.GetDeviceCaps(0x288, 0x58)' -> None

Figure 9: Identifying beginning of MFC code in Speakeasy API logs

To intercept execution at this stage I created an API hook for GetDeviceCaps as shown in Figure 10. The hook confirms the function is being called for the first time on line 2.

1: def mfc_init_hook(self, emu, api_name, func, params):
2:     if not self.trigger_hit:
3:         self.trigger_hit = True
4:         self.h_code_hook =   self.add_code_hook(self.start_unpack_func_hook)
5:         self.logger.debug("[*] MFC init api hit, starting unpack function")

Figure 10: API hook set for GetDeviceCaps

Line 4 shows the creation of a code hook using the add_code_hook function of the Speakeasy class. Code hooks allow you to specify a callback function that is called before each instruction that is emulated. Speakeasy also allows you to optionally specify an address range for which the code hook will be effective by specifying begin and end parameters.

After the code hook is added on line 4, the GetDeviceCaps hook completes and, prior to the execution of the sample's next instruction, the start_unpack_func_hook function is called. This function is shown in Figure 11.

1: def start_unpack_func_hook(self, emu, addr, size, ctx):
2:     self.h_code_hook.disable()
3:     unpack_func_va = self.module.get_rva_from_offset(self.unpack_offs) + self.module.get_base()
4:     self.set_pc(unpack_func_va)

Figure 11: Code hook that changes the instruction pointer

The code hook receives the emulator object, the address and size of the current instruction, and the context dictionary (line 1). On line 2, the code hook disables itself. Because code hooks are executed with each instruction, this slows emulation significantly. Therefore, they should be used sparingly and disabled as soon as possible. On line 3, the hook calculates the virtual address of the unpacking function. The offset used to perform this calculation was located using a regular expression. This part of the example was omitted for the sake of brevity.

The self.module attribute was previously set in the example code shown in Figure 2. It being subclassed from the PE class of pefile allows us to access useful functions such as get_rva_from_offset() on line 3. This line also includes an example of using self.module.get_base() to retrieve the module's base virtual address.

Finally, on line 4, the instruction pointer is changed using the set_pc function and emulation continues at the unpacking code. The code snippets in Figure 10 and Figure 11 allowed us to redirect execution to the unpacking code after the C-runtime initialization completed and avoid MFC initialization code.

Dumping and Fixing Unpacked PEs

Once emulation has reached the original entry point of the unpacked sample, it is time to dump the PE and fix it up. Typically, a hook would save the base address of the unpacked PE in an attribute of the class as illustrated on line 15 of Figure 8. If the unpacked PE does not contain the correct entry point in its PE headers, the true entry point may also need to be captured during emulation. Figure 12 shows an example of how to dump emulator memory to a file.

with open(self.output_path, "wb") as up:
    mm = self.get_address_map(self.dump_addr)
    up.write(self.mem_read(mm.get_base(), mm.get_size()))

Figure 12: Dumping the unpacked PE

If you are dumping a PE that has already been loaded in memory, it will not have the same layout as it does on disk due to differences in section alignment. As a result, the dumped PE's headers may need to be modified. One approach is to modify each section's PointerToRawData value to match its VirtualAddress field. Each section's SizeOfRawData value may need to be padded in order conform with the FileAlignment value specified in the PE’s optional headers. Keep in mind the resulting PE is unlikely to execute successfully. However, these efforts will allow most static analysis tools to function correctly.

The final step for repairing the dumped PE is to fix its import table. This is a complex task deserving of its own blog post and will not be discussed in detail here. However, the first step involves collecting a list of library function names and their addresses in emulator memory. If you know the GetProcAddress API is used by the unpacker stub to resolve imports for the unpacked PE, you can call the get_dyn_imports function as shown in Figure 13.

api_addresses = self.get_dyn_imports()

Figure 13: Retrieving dynamic imports

Otherwise, you can query the emulator class to retrieve its symbol information by calling the get_symbols function as shown in Figure 14.

symbols = self.get_symbols()

Figure 14: Retrieve symbol information from emulator class

This data can be used to discover the IAT of the unpacked PE and fix or reconstruct its import related tables.

Putting It All Together

Writing a Speakeasy script to unpack a malware sample can be broken down into the following steps:

  1. Reverse engineer the unpacking stub to identify: 1) where the unpacked code will reside or where its memory is allocated, 2) where execution is transferred to the unpacked code, and 3) any problematic code that may introduce issues such as unsupported APIs, slow emulation, or anti-analysis checks.
  2. If necessary, set hooks to bypass problematic code.
  3. Set a hook to identify the virtual address and, optionally, the size of the unpacked binary.
  4. Set a hook to stop emulation at, or after, execution of the original entry point of the unpacked code.
  5. Collect virtual addresses of Windows APIs and reconstruct the PE’s import table.
  6. Fix the PE’s headers (if applicable) and write the bytes to a file for further analysis.

For an example of a script that unpacks UPX samples, check out the UPX unpacking script in the Speakeasy repository.

Conclusion

The Speakeasy framework provides an easy-to-use, flexible, and powerful programming interface that enables analysts to solve complex problems such as unpacking malware. Using Speakeasy to automate these solutions allows them to be performed at scale. I hope you enjoyed this introduction to automating the Speakeasy framework and are inspired to begin using it to implement your own malware analysis solutions!

❌
❌