Normal view

There are new articles available, click to refresh the page.
Before yesterdayMalware

The DLL Search Order And Hijacking It

10 November 2021 at 07:52

If you ever used Process Monitor to track activity of a process, you might have encountered the following pattern:

Figure 1: Example of dnsapi.dll not being found in the application directory

The image above is a snippet from events captured by Process Monitor during the execution of x32dbg.exe on Windows 7. DNSAPI.DLL and IPHLPPAPI.DLL are persisted in the System directory, so you might question yourself:

Why would Windows try to search for either of these DLLs in the application directory first?

Operating Systems are very complex and so is the challenge of implementing an error-fault system to search for dependencies, like dynamic linked libraries. Today, we’ll talk about DLL Search Order and DLL Search Order Hijacking, in particular how it works and how adversaries can abuse it.

DLL Search Order

First, we have to talk about what happens when a PE File is executed on the Windows system.

The majority of native binaries you encounter on Windows are linked dynamically. Linked dynamically means that upon start of the execution, it uses information which are embedded inside the binary to locate DLLs that are essential for this process. In comparison with statically linked binaries, when linked dynamically the executable will use the libraries provided by the OS instead of having them compiled into the executable itself.

Before the dynamically linked executable can use or load these libraries, it will have to know where these dependencies are persisted on disk or if they are already in memory. This is where the DLL Search Order makes its appearance. To keep it simple, we will focus only on Windows Desktop Applications.

Pre-Checks and In-Memory Search

Before the Windows OS starts searching for the needed DLL on disk, it will first attempt to find the needed module in memory. If a DLL is already in memory, it will not loaded it again. Now this part is a little bit complicated and out of context for this blog article, we would have to define what “loaded” even means. If you are more interested in the first check, I advise you to look up the official Microsoft documentation[1].

If the memory check fails, Windows can fall back to using a list of known DLLs. if the needed library is part of that list, it will use the copy of the known DLL. The list of known DLLs are persisted in the Windows Registry.

Figure 2: List of KnownDlls on Windows 7

On-Disk Search

If the first two checks fail, the OS will have to search for the DLL on disk. Depending on the OS Settings, Windows will use a different search order. Per default, Windows enables the DLL Search Mode feature to harden the system and prevent DLL Search Order Hijacking attacks, a technique we will explain in the upcoming section.

The key to the feature is as follows:

  • HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\Session Manager\SafeDllSearchMode

Let’s take a look at the differences of the search order depending whether SafeDllSearchMode is enabled or not.

Figure 3: DLL Search Order flow

We clearly see that the current directory is prioritised if SafeDllSearchMode is disabled and this can be abused by adversaries. The art of abusing this search order flow is called DLL Search Order Hijacking.

DLL Search Order Hijacking

Adversaries can abuse the search order flow displayed above to load their own malicious DLLs instead of the legitimate ones into memory. There are many ways this technique can be used. However, it is more effective in achieving persistence on the target system then initial execution.

Let’s take a step back and revisit our example from above:

  • x32dbg.exe tries to load DNSAPI.DLL
  • DNSAPI.DLL is not in the list of known DLLs and is also not loaded into memory.
  • Since SafeDllSearchMode is enabled, it will fall back to the system directory if not found in the application directory

What would happen, if we craft and place a malicious DLL, named DNSAPI.DLL into the application directory?

We would be able to hijack the search order flow and force a legitimate application to load our malicious code into memory.

Practical Use Case

Let’s take a look at a simple practical example. Our application calls LoadLibraryA and tries to load dnsapi.dll like in our example from above. Next we craft a small DLL file, which does nothing else but create a message box in the DLLMain function. Once the DLL is loaded into memory, the main function will be triggered.

In the first run, we do not place the crafted DLL in the application directory. As expected, Windows will load dnsapi.dll from the system directory:

Next, we will now name our crafted DLL dnsapi.dll and place it in the application directory:

Whoops! I think we can all think of a couple use cases of how APT groups and malware can abuse this technique to achieve persistence on the victim’s system.

Real world examples and APTs

For the sake of keeping it simple and explaining the core principles behind this persistence technique, we’ve build a very simple use case here. Of course, the real world looks a little bit different and usually attackers have to take into account:

  • Endpoint Security solutions with behaviour based detections, preventing such attacks with signatures
  • Programmatic dependencies, which won’t allow you to just replace a DLL in an application directory and hope that it will work just fine
  • and many more

However, if you never heard about this technique, I hope I was able to create some awareness for it!

PEB: Where Magic Is Stored

22 August 2021 at 15:49

As a reverse engineer, every now and then you encounter a situation where you dive deeper into the internal structures of an operating system as usual. Be it out of simple curiosity, or because you need to understand how a binary uses specific parts of the operating system in certain ways . One of the more interesting structures in Windows is the Process Environment Block/PEB. In this article, I’d like to introduce you to this structure and talk about various use cases of how adversaries can abuse this structure for their own purposes.

Introducing PEB

The Process Environment Block is a critical structure in the Windows OS, most of its fields are not intended to be used by other than the operating system. It contains data structures that apply across a whole process and is stored in user-mode memory, which makes it accessible for the corresponding process. The structure contains valuable information about the running process, including:

  • whether the process is being debugged or not
  • which modules are loaded into memory
  • the command line used to invoke the process

All these information gives adversaries a number of possibilities to abuse it. The figure below shows the layout of the PEB structure:

typedef struct _PEB {
  BYTE                          Reserved1[2];
  BYTE                          BeingDebugged;
  BYTE                          Reserved2[1];
  PVOID                         Reserved3[2];
  PPEB_LDR_DATA                 Ldr;
  PRTL_USER_PROCESS_PARAMETERS  ProcessParameters;
  PVOID                         Reserved4[3];
  PVOID                         AtlThunkSListPtr;
  PVOID                         Reserved5;
  ULONG                         Reserved6;
  PVOID                         Reserved7;
  ULONG                         Reserved8;
  ULONG                         AtlThunkSListPtr32;
  PVOID                         Reserved9[45];
  BYTE                          Reserved10[96];
  PPS_POST_PROCESS_INIT_ROUTINE PostProcessInitRoutine;
  BYTE                          Reserved11[128];
  PVOID                         Reserved12[1];
  ULONG                         SessionId;
} PEB, *PPEB;

Now that we’ve talked a little bit about the layout and purpose of the structure, let’s take a look at a few use cases.

Reading the BeingDebugged flag

The most obvious way is to check the BeingDebugged to identify, whether a debugger is attached to the process or not. Through reading the variable directly from memory instead of using usual suspects like NtQueryInformationProcess or IsDebuggerPresent, malware can prevent noisy WINAPI calls. This makes it harder to spot this technique.

However, most debuggers already take care of this. X64dbg for example, has an option to hide the Debugger by modifying the PEB structure at start of the debugging session.

Iterating through loaded modules

Another use case, could be iterating the loaded modules and discover DLLs injected into memory with purpose to overwatch the running process. To understand how to achieve this, we need to take a look at the PPEB_LDR_DATA structure included in PEB, which is provided by the Ldr variable:

typedef struct _PEB_LDR_DATA {
  BYTE       Reserved1[8];
  PVOID      Reserved2[3];
  LIST_ENTRY InMemoryOrderModuleList;
} PEB_LDR_DATA, *PPEB_LDR_DATA;

PPEB_LDR_DATA contains the head to a doubly linked list named InMemoryOrderModuleList. Each item in this list is a structure from type LDR_DATA_TABLE_ENTRY, which contains all the information we need to iterate loaded modules. See the structure of LDR_DATA_TABLE_ENTRY below:

typedef struct _LDR_DATA_TABLE_ENTRY {
    PVOID Reserved1[2];
    LIST_ENTRY InMemoryOrderLinks;
    PVOID Reserved2[2];
    PVOID DllBase;
    PVOID EntryPoint;
    PVOID Reserved3;
    UNICODE_STRING FullDllName;
    BYTE Reserved4[8];
    PVOID Reserved5[3];
    union {
        ULONG CheckSum;
        PVOID Reserved6;
    };
    ULONG TimeDateStamp;
} LDR_DATA_TABLE_ENTRY, *PLDR_DATA_TABLE_ENTRY;

So by iterating the doubly linked list, we are able to discover the base address and full name of all modules loaded into memory of the running process. The snippet below is a small Proof of Concept. It iterates the linked list and prints the library name to stdout. I created it for the purpose of this blog article. You are free to use it, however I will also upload it to my github repo the upcoming days:

#include <Windows.h>
#include <iostream>
#include <shlwapi.h>


#define NO_STDIO_REDIRECT

typedef struct _UNICODE_STRING
{
    USHORT Length;
    USHORT MaximumLength;
    PWSTR Buffer;
} UNICODE_STRING, * PUNICODE_STRING;


typedef struct _LDR_DATA_TABLE_ENTRY_MOD {
    LIST_ENTRY InMemoryOrderLinks;
    PVOID Reserved2[2];
    PVOID DllBase;
    PVOID EntryPoint;
    PVOID Reserved3;
    UNICODE_STRING FullDllName;
    BYTE Reserved4[8];
    PVOID Reserved5[3];
    union {
        ULONG CheckSum;
        PVOID Reserved6;
    };
    ULONG TimeDateStamp;
} LDR_DATA_TABLE_ENTRY_MOD, * PLDR_DATA_TABLE_ENTRY_MOD_MOD;




int main(int argc, char** argv[]){

 
    PLDR_DATA_TABLE_ENTRY_MOD_MOD lib = NULL;
    _asm {
        xor eax, eax
        mov eax, fs:[0x30]
        mov eax, [eax + 0xC]
        mov eax, [eax + 0x14]
        mov lib, eax
    };
    printf("[+] Initialised pointer to first LDR_DATA_TABLE_ENTRY_MOD\n");
    

    // Loop as long as we don't reach the head of the linked list again
    while ( lib->FullDllName.Buffer != NULL ) {

        printf("[+] %S\n", lib->FullDllName.Buffer);
        lib = (PLDR_DATA_TABLE_ENTRY_MOD_MOD)lib->InMemoryOrderLinks.Flink;
    }
    
    printf("[+] Done!\n");



	return 0;

If you are wondering how I am able to access the PEB in the code below, you should take a look at the inline assembly in the main method, especially the instruction mov eax, fs:[0x30]. FS is a segment register, similar to GS. FS can be used to access thread-specific memory. Offset 0x30 allows you to access the linear address of the Process Environment Block.

Finally, we want to take a look at a real world example of how PEB can be abused.

How the MATA Framework abuses PEB

This use case was introduced to me while reverse engineering a Windows variant of the MATA Framework. According to Kaspersky[1], the MATA Framework is used by the Lazarus group and targets multiple platforms.

Malware authors have a high interest in obfuscation, because it increases the time needed to reverse engineer it. One way to hide API calls is to use API Hashing. I have written about Danabot’s API Hashing[2] before and how to overcome it. MATA also uses this technique.

However instead of using the WIN API calls to retrieve the address of DLLs loaded into memory, MATA abuses the Process Environment Block to fetch base addresses. Let’s take a look at how MATA for Windows achieves this:

MATA API Hashing

The input of the APIHashing method takes an integer as the only parameter, this is the hash for the corresponding API call.

Figure 1: Call to APIHash method

Right after the prologue, it retrieves a pointer to PEB by reading it from the Thread Environment Block via the segment register GS. Similar to our proof of concept above, MATA now fetches the address to the head of the linked list provided by InMemoryOrderModuleList. Each item of the linked list provides the DLL base address of the corresponding loaded module.

From there, the malware reads the e_lfanew field, which contains the offset to the file header. By adding the base address, e_lfsanew and 0x88 it jumps directly to the data directories of the corresponding PE. From the data directories, MATA accesses the exported function names in a similar way as I’ve described in my blog article about DanaBot’s API Hashing[3]. The hashing algorithm is fairly simple. Each integer representation of a character is added and the result of the addition is ROR'd by 0xD consecutively each iteration. If the final hash matches the input parameter, the address to the function is retrieved. The following figure explains the function at a high level:

High level overview of API Hashing of MATA malware

Learning from each other

That’s it with the blog article, I hope you enjoyed it! There are probably way more use cases and real world cases of how the PEB is and and can be abused. If you can think of another one, feel free to leave a comment below and share it, so that we can learn from each other!

Catching Debuggers with Section Hashing

24 January 2021 at 10:41

As a Reverse Engineer, you will always have to deal with various anti analysis measures. The amount of possibilities to hamper our work is endless. Not only you will have to deal with code obfuscation to hinder your static analysis, but also tricks to prevent you from debugging the software you want to dig deeper into. I want to present you Section Hashing today.

I will begin by explaining how software breakpoints work internally and then give you an example of a Section Hashing implementation.

Debuggers – How software breakpoints work

When you set a breakpoint in your favourite debugger at a specific instruction, the debugger software will replace it temporarily with another instruction, which causes a fault or an interrupt. On x86, this is very often the INT 3 instruction, which is the opcode 0xCC. We can examine how this looks like in RAM.

We open x32dbg.exe and debug a 32 bit PE and set a breakpoint near the entry point.

Disassembly view of debugged program

When setting a breakpoint, you will see the original instruction instead of the patched one in the debugger. However, we can examine the same memory page in RAM with ProcessHacker.

Code section in RAM during debug session

In volatile memory, the byte 33 changed to CC, which will cause the program to halt when reached. This software interrupt will then be handled by the debugger and the code will be replaced again.

Catching Breakpoints with Section Hashing

After explaining how software breakpoints work, I’ll get to the real topic of this article now. We will move to the Linux world now for this example.

A software breakpoint is actually nothing else than a code modification of the executable memory section in RAM. Once a breakpoint is set, the .text section will be modified. A very known technique to catch such breakpoints in RAM is called Section Hashing.

Authors can embed the hash of the .text section in the binary. Upon execution, they use the same algorithm to generate a new hash from the .text section. If a software breakpoint is set, the hash will differ from the embedded hash. An example implementation can look like this:

Example implementation of Section Hashing

In this case, a hash of the .text section is generated. Afterwards it is used to influence the generation of the flag. If a software breakpoint is set during execution, a wrong hash will be generated.

This is a simple example of Section Hashing. In combination with code obfuscation and other anti analysis measurements, it can be very hard to spot this technique. It is also occasionally used by commercial packers.

Defeating Section Hashing

There are multiple ways to defeat this technique, some of them could be:

  • Patching instructions
  • Using hardware breakpoints

Instead of modifying the code in Random Access Memory, in x86 hardware breakpoints use dedicated registers to halt the execution. Hardware Breakpoints are still detectable.

In Windows, the program can fetch the CONTEXT via GetThreadContext to see if the debugging registers are used. A great example on how this is implemented can be found here[1]. If you are interested in trying to defeat it by yourself, you can try to beat the Section Hashing technique by yourself at root-me.org[2].

Taming Virtual Machine Based Code Protection – 2

26 November 2020 at 14:02

In the last episode …

As you’ve probably guessed it, this is the second part of my journey to reverse engineer a virtual machine protected binary. If you haven’t read the first part[1], I encourage you to do so, because I will not repeat everything again here. While the first part dealt with explaining the virtual environment and giving an initial first look into the virtual machine’s custom instruction set, I will focus on disassembling the virtual machine code completely this time.

I might repeat some steps from the first part again, mostly because I felt that it was necessary to do so :-).

Into the battle

We already explained the environmental setup in the previous blog post and also identified the main loop, which is responsible for instruction execution.

Figure 1: Main loop responsible for instruction execution

Each iteration, an instruction is parsed and the final CALL in the left branch of figure 1 executes the instruction.

Critical functions

I covered the instruction parsing process in my last blog article a little bit. But since we are going to build a disassembler, I will explain the most important routines once again.

0x4013DF / ParseInstruction

This function is called each iteration in the loop from figure 1 and is responsible for parsing the byte codes.

Figure 2: ParseInstruction overview

Each loop, the Virtual Instruction Pointer/VIP is retrieved, pointing at the instruction to execute. Each instruction is parsed. This function is fully responsible for transforming the bytes into a further processable format. Let’s take a look at how the first three instructions are parsed:

Figure 3: Parsing instructions

If you are interested in understanding this format fully, I recommend you to jump to the disassembler code[2]. I will only cover the first instruction here.

So how do we get from 03 15 03 00 04 to the parsed format ?

The first byte is always the instruction id. 03 is the id for the PUSH instruction. The second byte is divided into its upper 6 bits and lower 2 bits, representing the instruction size and number of operands used for this instruction. The next bytes are used to represent a single operand. In the example above, the first operand config 00 03 00 00, is the configuration for USE 32 BIT OF REGISTER, SPECIFIED BY THE NEXT DWORD 04 00 00 00. The next DWORD is 04 00 00 00, which is the fourth virtual register. Now what is the fourth register here ? Let’s take a quick look at the instructions.

PUSH VR4
MOV VR4, VR7
SUB VR7, 0xB4

This looks very similar to the usual function prologue ;-). So the fourth register must be EBP!.

PUSH EBP
MOV EBP, ESP
SUB ESP, 0xB4

0x401271 / GetOpval & 0x401322 / StoreOpval

I will not cover these two functions in depth here. If you take a look at figure 3 again, you will see that I mention the operand configs. These functions are responsible for filling the operands according to these configs.

In the example above, the SUB VR7, 0xB4 instruction uses 00030000 07000000 for the first operand and 00020000 B4000000 for the second config. If you reverse engineer every single option, you will find out that the following configurations exist:

# First DWORD CONFIG
00000000 ==> LOWEST BYTE OF REG X # f.e AX
00010000 ==> SECOND LOWEST BYTE OF REG X # f.e. AH 
00020000 ==> LOWER 16 BIT OF REG X # f.e. AX
00030000 ==> 32 BIT OF REGX # f.e. EAX
01000000 ==> BYTE AT LOC
01010000 ==> BYTE AT LOC
01020000 ==> WORD AT LOC
01030000 ==> DWORD AT LOC
02000000 == BYTE FROM IMM.
02010000 ==> BYTE FROM IMM.
02020000 ==> WORD FROM IMM.
02030000 ==> DWORD FROM IMM.
# Second DWORD CONFIG, if register
00000000 ==> EAX
01000000 ==> EBX
02000000 ==> ECX
03000000 ==> EDX
04000000 ==> EBP
05000000 ==> ESI
06000000 ==> EDI
07000000 ==> ESP

Eternal Debugging

Now we can use the gained knowledge to gain an initial understanding of what is happening and to verify whether we are able to decode instructions manually.

Figure 4: Manually disassembled bytecode

If you take a look at the last instructions, you will see that there are some constants pushed into memory. If you google these constants, you will come to the conclusion that this must be the MD5 Init routine[3]. The next step is to build a disassembler.

Disassembling the code

I wrote this one in C++ and you can find the source code to it on my github page[4]. Writing this on Python would have been possible too … and probably a lot easier and faster, I chose C++ though for learning purposes. If my C++ is awful, forgive me. We all start somewhere ;-).

Figure 5: Output of decoded virtual machine bytes

Our disassembler does have some limitations though. The disassembly was complex and I believe that some memory address offsets and register sizes are wrong. Also, I did not reverse engineer all instructions. However though, that should not be a problem, because we only need to understand what is happening here on a higher level.

Identifying the algorithm

We already spotted the variables, which we also found in the MD5.c source code(f.e. 0x2381bc0). However, the actual hashing algorithm does not match the original one. Therefore it seems to be some kind of a modified version of it. Furthermore we spot a routine, which seems to be the XTEA algorithm[5].

Figure 6: Identified XTEA algorithm

Final words

So that’s basically it. I don’t know when and if I will a third part covering the serial key generator. When I started this challenge, I was only interested in learning how to disassemble custom instruction sets.

If you are interested in how others solved this challenge, I recommend you to read the tutorials from wagonono and kernelj, they both completely solved this challenge[6]. Wagonono also created a disassembler and his version is better than mine.

Deobfuscating DanaBot’s API Hashing

12 July 2020 at 13:27

You probably already guessed it from the title’s name, API Hashing is used to obfuscate a binary in order to hide API names from static analysis tools, hindering a reverse engineer to understand the malware’s functionality.
A first approach to get an idea of an executable’s functionalities is to more or less dive through the functions and look out for API calls. If, for example a CreateFileW function is called in a specific subroutine, it probably means that cross references or the routine itself implement some file handling functionalities. This won’t be possible if API Hashing is used.

Instead of calling the function directly, each API call has a corresponding checksum/hash. A hardcoded hash value might be retrieved and for each library function a checksum is computed. If the computed value matches the hash value we compare it against, we found our target.

API Hashing used by DanaBot

In this case a reverse engineer needs to choose a different path to analyse the binary or deobfuscate it. This blog article will cover how the DanaBot banking trojan implements API Hashing and possibly the easiest way on how this can be defeated. The SHA256of the binary I am dissecting here is added at the end of this blog post.

Deep diving into DanaBot

DanaBot itself is a banking trojan and has been around since atleast 2018 and was first discovered by ESET[1]. It is worth mentioning that it implements most of its functionalities in plugins, which are downloaded from the C2 server. I will focus on deobfuscating API Hashing in the first stage of DanaBot, a DLL which is dropped and persisted on the system, used to download further plugins.

Reversing the ResolvFuncHash routine

At the beginning of the function, the EAX register stores a pointer to the DOS header of the Dynamic Linked Library which, contains the function the binary wants to call. The corresponding hash of the yet unknown API function is stored in the EDX register. The routine also contains a pile of junk instructions, obfuscating the actual use case for this function.

The hash is computed solely from the function name, so the first step is to get a pointer to all function names of the target library. Each DLL contains a table with all exported functions, which are loaded into memory. This Export Directory is always the first entry in the Data Directory array. The PE file format and its headers contain enough information to reach this mentioned directory by parsing header structures:

Cycling through the PE headers to obtain the ExportDirectory and AddressOfNames

In the picture below, you can see an example of the mentioned junk instructions, as well as the critical block, which compares the computed hash with the checksum of the function we want to call. The routine iterates through all function names in the Export Directory and calculates the hash.
The loop breaks once the computed hash matches the value that is stored in the EDX register since the beginning of this routine.

Graph overview of obfuscated API Hashing function

Reversing the hashing algorithm

The hashing algorithm is fairly simple and nothing too complicated. Junk instructions and opaque predicates complicate the process of reversing this routine.

The algorithm takes the nth and the stringLength-n-1th char of the function name and stores them, as well as capitalised versions into memory, resulting in a total of 4 characters. Each one of those characters is XOR'd with the string length. Finally they are multiplied and the values ​​are added up each time the loop is run and result in the hash value.

def get_hash(funcname):
    """Calculate the hash value for function name. Return hash value as integer"""
    strlen = len(funcname)
    # if the length is even, we encounter a different behaviour
    i = 0
    hashv = 0x0
    while i < strlen:
        if i == (strlen - 1):
            ch1 = funcname[0]
        else:
            ch1 = funcname[strlen - 2 - i]
        # init first character and capitalize it
        ch = funcname[i]
        uc_ch = ch.capitalize()
        # Capitalize the second character
        uc_ch1 = ch1.capitalize()
        # Calculate all XOR values
        xor_ch = ord(ch) ^ strlen
        xor_uc_ch = ord(uc_ch) ^ strlen
        xor_ch1 = ord(ch1) ^ strlen
        xor_uc_ch1 = ord(uc_ch1) ^ strlen
        # do the multiplication and XOR again with upper case character1
        hashv += ((xor_ch * xor_ch1) * xor_uc_ch)
        hashv = hashv ^ xor_uc_ch1
        i += 1
    return hashv

A python script for calculating the hash for a given function name is also uploaded on my github page[2] and free for everyone to use. I’ve also uploaded a text file with hashes for exported functions of commonly used DLLs.

Deobfuscation by Commenting

So now that we cracked the algorithm, we want to update our disassembly to know which hash value represents which function. As I’ve already mentioned, we want to focus on simplicity. The easiest way is to compute hash values for exported functions of commonly used DLLs and write them into a file.

Generated hashes

With this file, we can write an IdaPython script to comment the library function name next to the Api Hashing call. Luckily the Api Hashing function is always called with the same pattern:

  • Move the wanted hash value into the EDX register
  • Move a DWORD into EAX register

First we retrieve all XRefs of the Api Hashing function. Each XRef will contain an address where the Api Hashing function is called at, which means that in atleast the 5 previous instructions, we will find the mentioned pattern. So we will fetch the previous instruction until we extract the wanted hash value, which is being pushed into EDX. Finally we can use this immediate to extract the corresponding api function from the hash values we have generated before and comment the function name next to the Xref address.

def add_comment(addr, hashv, api_table):
    """Write a comment at addr with the matching api function.Return True if a corresponding api hash was found."""
    # remove the "h" at the end of the string
    hashv = hex(int(hashv[:-1], 16))
    keys = api_table.keys()
    if hashv in keys:
        apifunc = api_table[hashv]
        print "Found ApiFunction = %s. Adding comment." % (apifunc,)
        idc.MakeComm(addr, apifunc)
        comment_added = True
    else:
        print "Api function for hash = %s not found" % (hashv,)
        comment_added = False
    return comment_added


def main():
    """Main"""
    f = open(
        "C:\\Users\\luffy\\Desktop\\Danabot\\05-07-2020\\Utils\\danabot_hash_table.txt", "r")
    lines = f.readlines()
    f.close()
    api_table = get_api_table(lines)
    i = 0
    ii = 0
    for xref in idautils.XrefsTo(0x2f2858):
        i += 1
        currentaddr = xref.frm
        addr_minus = currentaddr - 0x10
        while currentaddr >= addr_minus:
            currentaddr = PrevHead(currentaddr)
            is_mov = GetMnem(currentaddr) == "mov"
            if is_mov:
                dst_is_edx = GetOpnd(currentaddr, 0) == "edx"
                # needs to be edx register to match pattern
                if dst_is_edx:
                    src = GetOpnd(currentaddr, 1)
                    # immediate always ends with 'h' in IDA
                    if src.endswith("h"):
                        add_comment(xref.frm, src, api_table)
                        ii += 1
    print "Total xrefs found %d" % (i,)
    print "Total api hash functions deobfuscated %d" % (ii,)


if __name__ == '__main__':
    main()

Conclusion

As reverse engineers, we will probably continue to encounter Api Hashing in various different ways. I hope I was able to show you some quick & dirty method or give you at least some fundament on how to beat this obfuscation technique. I also hope that, the next time a blue team fellow has to analyse DanaBot, this article might become handy to him and saves him some time reverse engineering this banking trojan.

IoCs

  • Dropper = e444e98ee06dc0e26cae8aa57a0cddab7b050db22d3002bd2b0da47d4fd5d78c
  • DLL = cde01a2eeb558545c57d5c71c75e9a3b70d71ea6bbeda790a0b871fcb1b76f49

UpnP – Messing up Security since years

21 June 2020 at 08:01

UpnP is a set of networking protocols to permit network devices to discover each other’s presence on a network and establish services for various functionalities.
Too lazy to port forward yourself ? Just enable UpnP to automatically establish working configurations with devices! Dynamic device configuration like this makes our life more comfortable for sure. Sadly it also comes with many security issues.

In this blog article I am focusing on mentioning the stages of the UpnP protocol, a quick introduction to security issues regarding UpnP and how QBot abuses the UpnP protocol to exploit devices as proxy C2 servers.

UpnP in a nutshell

UpnP takes usage of common networking protocols and stacks HTTP, SOAP and XML on top of the IP protocol in order to provide a variety of functionalities for users. Without going to deep into how UpnP works in detail, the following figure is enough for the basics.

Quick explanation of existing stages in UpnP protocol

Some services a node with UpnP enabled can offer (it really depends on the device):

  • Port forwarding
  • Switching power on and off for light bulbs
  • etc.

This is very high level of course. If you are interested in everything about UpnP, I recommend you to check out Wikipedia[1] for a high level introduction or read this report that goes more into detail[2].

For the following content of this blog article, only the first three stages are really relevant.

IoT Security and UpnP

Misconfiguration

Again, while it might be very convenient for customers to have devices autoconfigure themselves, it leads to huge security risks.

Many routers have UpnP enabled by default. Think of misconfigured IoT devices that sends a command to port forward a specific port, leading to a port exposure to the internet.

It is known that many IoT devices contain awful security flaws like default credentials for telnet. If devices like this have such misconfigurations and expose its telnet port to the outside, it probably takes about 5 minutes till some script kiddie adds this device to its botnet.

Exploitation

A blog post from TrendMicro[3] previously mentioned that many devices still use very old UpnP libraries which are not up to date to current security standards. This creates a larger attack surface for attackers. The newest one being CallStranger.

source : https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-12695

It is caused by the Callback header value in the UpnP SUBSCRIBE function. This field can be controlled by an attacker and enabled a Server Side Request Forgery like vulnerability. It can be used for the following malicious cases:

  • Exfilitrate data
  • Scan networks
  • Force nodes to participate in DDoS attacks

I recommend you to visit the official domain[4] of this vulnerability, if you want gain more knowledge about this vulnerability.

UpnP abused by QBot

Security risks created by UpnP are not limited to the IoT landscape of course.

Another method to use UpnP for malicious cases is to install Proxy C2 servers on devices which have the mentioned protocol enabled, like QBot does for example. Let’s take a look at how this is done.

Diving into QBot’s UpnP proxy module

This technique was first discovered by McAfee[4] in 2017. First QBot starts scanning for devices which have UpnP enabled and is one of the following device types:

  • urn:schemas-upnp-org:device:InternetGatewayDevice:1
  • urn:schemas-upnp-org:service:WANIPConnection:1
  • urn:schemas-upnp-org:service:WANPPPConnection:1
  • upnp:rootdevice
Disassembly of strcmp calls to check for device type

If you are using INETSIM for malware analysis, you will probably realise that it does not offer any functionality to fake a SSDP or UpnP service in any way. However, we can use this python script[5] by user GrahamCobb which emulates a fake SSDP service and adjust the device description to suit our needs.

Once the devices are discovered, it sends requests for device descriptions and checks whether it deals with an internet gateway device. This can be determined by looking at the device description itself.

Capture SSDP traffic, showing the MSEARCH request and retrieval of the device description

If it is an internet gateway device, it confirms whether a connection exists by sending a GetStatusInfo followed by retrieving the external ip address of this device by sending the GetExternalIPAddress command.

Next it tries to use the AddPortMapping command to add port forwarding rules to the device.

Port forwarding command sent to fake SSDP service

Afterwards all rules are removed again and the ports which were successfully port forwarded are sent as a HTTP-POST to the C2 server.
The carrier protocol is HTTPS and the response is sent in the following form:

# destination address
https://[HARDCODED_IP]:[HARDCODED_PORT]/bot_serv

# POST DATA form, successful port forwarded ports are appended to ports
cmd=1&msg=%s&ports=

From this point on, my analysis stopped for now. However, McAfee explains that a new binary is downloaded from the contacted C2 server, which re-adds the port forwarding rules and is responsible for the C2 communication. The blog article I’ve referenced above explains the whole functionality, so I recommend you to take a look at it, if you are interested in the next steps.

Final Words

As you can see UpnP contains many security flaws and can lead to a compromised network. If you have UpnP enabled in your company’s network, I really recommend to check whether this is really needed and turn it off if it is not necessary.

So exams at university are coming up next, it will probably take some time until I can get my hands on the QBot C2 protocol or the proxy binary. I do however, want to look at these two functionalities next.

Examining Smokeloader’s Anti Hooking technique

24 May 2020 at 09:22

Hooking is a technique to intercept function calls/messages or events passed between software, or in this case malware. The technique can be used for malicious, as well as defensive cases.

Rootkits for example can hook API calls to make themselves invisible from analysis tools, while we as defenders can use hooking to gain more knowledge of malware or build detection mechanisms to protect customers.

Cybersecurity continues to be a game of cat and mouses, and while we try to build protections, blackhats will always try to bypass these protection mechanisms. Today I want to show you how SmokeLoader bypasses hooks on ntdll.dll and how Frida can be used to hook library functions.

The bypass was also already explained in a blog article from Checkpoint[1] written by Israel Gubi. It also covers a lot more than I do regarding Smokeloader, so it is definitely worth reading too.

Hooking with Frida

If you’ve read my previous blog articles about QBot, you are familiar with the process iteration and AV detection[3]. It iterates over processes and compares the process name with entries in a black list containing process names of common AV products. If one process name matches with an entry, QBot quits its execution.

Frida is a Dynamic Instrumentation Toolkit which can be used to write dynamic analysis scripts in high level languages, in this case JavaScript. If you want to know more about this technology, I advice you to read to visit this website[4] and read its documentation.

We can write a small Frida script to hook the lstrcmpiA function in order to investigate which process names are in the black list.

def main():
    """Main."""
    # argv[1] is our malware sample
    pid = frida.spawn(sys.argv[1])
    sess = frida.attach(pid)
    script = sess.create_script("""
        console.log("[+] Starting Frida script")
        var lstrcmpiA = ptr("0x76B43E8E")
        console.log("[+] Hooking lstrcmpiA at " + lstrcmpiA)
        Interceptor.attach(lstrcmpiA, {
            onEnter: function(args) {
                console.log("[+][+] Called strcmpiA");
                console.log("[+][+] Arg1Addr = " + args[0]);
                console.log("[+][+] Buffer");
                pretty_print(args[0], 0x30);
                console.log("[+][+] Arg2Addr = " + args[1]);
                console.log("[+][+] Buffer");
                pretty_print(args[1], 0x30);
            },
            onLeave: function(retval) {
                console.log("[+][+] Returned from strcmpiA")
            }
        });

        function pretty_print(addr, sz) {
            var bufptr = ptr(addr);
            var bytearr = Memory.readByteArray(bufptr, sz);
            console.log(bytearr);
        };

        """)
    script.load()
    frida.resume(pid)
    sys.stdin.read()
    sess.detach()

We attach to the malicious process and hook the lstrcmpiA function at static address. When analysing malware, we have (most of the time) the privilege to control and adjust our environment as much as we want. If you turn off ASLR and use snapshots, using Frida with static pointers is pretty convenient, because most functions will always have the same address. However, it’s also possible to calculate the addresses dynamically. lstrcmpiA has 2 arguments, which are both pointers of type LPSTR. So we just resolve the pointers, fill 0x30 bytes starting at pointer address into a ByteArray and print it.

Result of Frida Script

Smokeloader’s Anti Hooking technique

So how does Smokeloader bypass hooks? Well it can do it atleast for the ntdll.dll library. During execution Smokeloader retrieves the Temp folder path and generates a random name. If a file with the generated name already exists in the temp folder, it is deleted with DeleteFileW.

drltrace output DeleteFileW call, deleting 9A26.tmp in Temp Folder

Next the original ntdll.dll file is copied from system32 to the temp folder with the exact name it just generated. This leads to a copy of this mentioned library being placed in the temp directory.

Meta data of disguised ntdll.dll
Export functions of the disguised ntdll file

Instead of loading the real ntdll.dll file, the copy is loaded into memory by calling LdrLoadDll.

9A26.tmp as ntdll.dll

Most AV vendors, as well as analysts probably implemented their hooks on ntdll.dll, so the references to the copied ntdll.dll file will be missed.

Smokeloader continues to call functions from this copied DLL, using for example function calls like NtQueryInformationProcess to detect wether a debugger is attached to it.

Final Words

While analysing SmokeLoader at work, I stumbled across this AntiHook mechanism, which I haven’t seen before, so I wanted to share it here :-).


I’ve also only scratched on the surface of what Frida is capable of. I might work on something more complex next time.

❌
❌